NLU training report
Introduction
In this report, we evaluate the performance of an NLU system trained to classify consumer messages into general intents such as FindRestaurants, GetWeather, CheckBalance, LookupMusic, etc. The data has been extracted from the schema_guided_dstc8 dataset, publicly available on HuggingFace Datasets.
Model training performance
Deep learning models consume large amounts of data and, even after a preprocessing pipeline, we cannot be fully certain about data quality and distribution. Because of that, a common strategy to check model consistency is Stratified K-Fold cross-validation: data is partitioned into K equally sized folds, and in each iteration one fold is reserved for validation while the rest are used for training. In our example, we have performed a Stratified 5-Fold strategy (80% train / 20% validation), maintaining the same proportion of phrases per intent in each bucket.
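As a minimal sketch of this setup, the folds can be generated with scikit-learn's StratifiedKFold; the `texts` and `intents` arrays below are hypothetical stand-ins for the phrases and intent labels extracted from the dataset:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Hypothetical data: in the report this would be the consumer phrases
# and intent labels extracted from schema_guided_dstc8.
texts = np.array([f"utterance {i}" for i in range(100)])
intents = np.array(["FindRestaurants", "GetWeather", "CheckBalance", "LookupMusic"] * 25)

# Stratified 5-Fold: each iteration uses 80% for training and 20% for
# validation, keeping the same proportion of phrases per intent.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

for fold, (train_idx, val_idx) in enumerate(skf.split(texts, intents)):
    X_train, y_train = texts[train_idx], intents[train_idx]
    X_val, y_val = texts[val_idx], intents[val_idx]
    # The NLU model of each fold would be trained on this split.
    print(f"fold {fold}: {len(train_idx)} train / {len(val_idx)} validation phrases")
```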
[Run set: 6 runs]
Out-of-Fold analysis
The main advantage of this approach is that we can estimate how our model would perform on the whole dataset, since every phrase has been part of the validation set in exactly one of the folds. By gathering all of those validation predictions, we can build the so-called Out-of-Fold (OOF) metrics, which capture the behaviour of our model across all the scenarios present in our data.
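As a rough sketch of how those predictions can be assembled, reusing `texts`, `intents` and `skf` from the previous snippet; the TF-IDF + logistic regression pipeline is only a stand-in for the actual NLU model:

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.pipeline import make_pipeline

# Stand-in classifier; the report's actual NLU model would go here.
def make_model():
    return make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))

oof_preds = np.empty(len(texts), dtype=object)
for train_idx, val_idx in skf.split(texts, intents):
    model = make_model().fit(texts[train_idx], intents[train_idx])
    # Each phrase is validated in exactly one fold, so these
    # predictions jointly cover the whole dataset.
    oof_preds[val_idx] = model.predict(texts[val_idx])

# OOF metrics: computed once over all out-of-fold predictions.
print(classification_report(intents, oof_preds))
```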
For a fine-grained view of model errors through the OOF predictions, a confusion matrix is provided as well:
[Run set (6 runs): confusion matrix panel]
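One way such a matrix could be computed and logged, assuming the `oof_preds` from the sketch above (the project name is hypothetical), is via scikit-learn together with W&B's built-in confusion matrix plot:

```python
import wandb
from sklearn.metrics import confusion_matrix

labels = sorted(set(intents))
# Raw counts for a quick textual check.
print(confusion_matrix(intents, oof_preds, labels=labels))

# Interactive confusion matrix panel for the report
# ("nlu-report" is a hypothetical project name).
run = wandb.init(project="nlu-report")
run.log({"oof_confusion_matrix": wandb.plot.confusion_matrix(
    y_true=list(intents), preds=list(oof_preds), class_names=labels)})
run.finish()
```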
Hands-on data
However, this is "just" a numerical outcome of model performance. Sometimes it is convenient to take a look at the data to see whether there is any reason for those errors beyond model inaccuracy. For that reason, we provide an example of each error type contained in the previous confusion matrix; i.e., there is one row in the table per non-zero value off the diagonal of the matrix.
[Run set (6 runs): error examples table]
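A short sketch of how that table could be built, picking the first misclassified phrase for each (true intent, predicted intent) pair and logging it as a `wandb.Table` (project name again hypothetical):

```python
import wandb

# One example phrase per non-zero off-diagonal cell of the confusion
# matrix, i.e. per (true intent, predicted intent) error pair.
rows, seen = [], set()
for text, true, pred in zip(texts, intents, oof_preds):
    if true != pred and (true, pred) not in seen:
        seen.add((true, pred))
        rows.append([text, true, pred])

# Log as a W&B table so each error type can be inspected by hand.
run = wandb.init(project="nlu-report")
run.log({"error_examples": wandb.Table(
    columns=["phrase", "true_intent", "predicted_intent"], data=rows)})
run.finish()
```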