Fine-Tuning
This report presents the fine-tuning of RoBERTa for extractive question answering.
This report visualizes the fine-tuning of the RoBERTa encoder on the English version of the BiPaR dataset. The results of six different approaches are presented below (a brief sketch of the shared training setup follows the list):
- roberta-large_squad had been previously fine-tuned on SQuAD 1.1.
- roberta-large was fine-tuned only on BiPaR.
- roberta-large_mlm_20-words had been previously adapted using masked language modeling (MLM) with 20% of the words masked.
- roberta-large_mlm_40-tokens had been previously adapted using MLM with 40% of the tokens masked.
- roberta-large_augmented_questions was fine-tuned on BiPaR enhanced with paraphrased questions.
- roberta-large_augmented_q&a was fine-tuned on BiPaR enhanced with additional generated question-answer pairs.
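For reference, the shared setup roughly follows the standard Hugging Face extractive-QA recipe. The sketch below is only illustrative: the checkpoint name, hyperparameters, and preprocessing are assumptions, not the exact configuration behind these runs.

```python
from transformers import (
    AutoModelForQuestionAnswering,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# Swap in a SQuAD- or MLM-adapted checkpoint for the corresponding runs.
checkpoint = "roberta-large"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForQuestionAnswering.from_pretrained(checkpoint)

def preprocess(batch):
    # Tokenize question + context jointly; a full pipeline would also map the
    # character-level answer spans to start/end token positions here.
    return tokenizer(
        batch["question"],
        batch["context"],
        truncation="only_second",
        max_length=384,
        padding="max_length",
    )

# Illustrative hyperparameters, not the exact values used for these runs.
args = TrainingArguments(
    output_dir="roberta-large_bipar",
    evaluation_strategy="epoch",
    save_strategy="epoch",
    learning_rate=2e-5,
    num_train_epochs=3,
    per_device_train_batch_size=8,
    report_to="wandb",  # log the training and validation curves shown in this report
)

# trainer = Trainer(
#     model=model,
#     args=args,
#     train_dataset=train_split.map(preprocess, batched=True),
#     eval_dataset=validation_split.map(preprocess, batched=True),
#     tokenizer=tokenizer,
# )
# trainer.train()
```

The MLM-adapted checkpoints would be produced beforehand by a separate masked-language-modeling pass over the BiPaR passages, for example with `DataCollatorForLanguageModeling` and `mlm_probability` set to 0.2 or 0.4; that adaptation step is not shown here.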
Training
- roberta-large_squad yielded the best scores owing to its prior fine-tuning on SQuAD 1.1.
- roberta-large_augmented_q&a yielded the worst scores due to the poor quality of the generated question-answer pairs.
- All models began to overfit the training data after two or three epochs, as indicated by the falling training loss curves and rising validation loss curves; one way to mitigate this is sketched below.
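A practical countermeasure, consistent with these curves, is to evaluate every epoch and keep only the checkpoint with the lowest validation loss. The snippet below is a hedged sketch using Hugging Face's `EarlyStoppingCallback`; the patience value and monitored metric are assumptions, not settings taken from these runs.

```python
from transformers import EarlyStoppingCallback, Trainer, TrainingArguments

# Evaluate and save once per epoch so the best checkpoint can be restored.
args = TrainingArguments(
    output_dir="roberta-large_bipar",
    evaluation_strategy="epoch",
    save_strategy="epoch",
    num_train_epochs=10,
    load_best_model_at_end=True,        # roll back to the best epoch at the end
    metric_for_best_model="eval_loss",  # track validation loss
    greater_is_better=False,
)

# trainer = Trainer(
#     model=model,
#     args=args,
#     train_dataset=train_split,
#     eval_dataset=validation_split,
#     callbacks=[EarlyStoppingCallback(early_stopping_patience=2)],
# )
# trainer.train()
```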