Reports
Created by
Created On
Last edited
8-bit Adam vs 32-bit Adam tests
I ran Longformer-base for 3 epochs on the Feedback Prize data with 3 different seeds and 3 configurations:
1. 8-bit Adam and 8-bit Embeddings
2. 8-bit Adam and 32-bit Embeddings
3. 32-bit Adam and 32-bit Embeddings
Every run used the same hyperparameters and data; within a configuration, the only difference between runs was the seed. For full hyperparameter details, see the table at the very end.
1
2022-01-25
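The 3 configurations by 3 seeds grid described in the report above can be enumerated with a short sketch like the one below. The configuration labels, the seed values, and the `build_runs` helper are all hypothetical; the listing does not say which seeds or which training code were actually used, so treat this as an illustration of the experiment layout rather than the author's script.

```python
from itertools import product

# Hypothetical labels for the three configurations listed in the report.
CONFIGS = {
    "adam8_emb8": {"adam_bits": 8, "embedding_bits": 8},
    "adam8_emb32": {"adam_bits": 8, "embedding_bits": 32},
    "adam32_emb32": {"adam_bits": 32, "embedding_bits": 32},
}

# Placeholder seeds (assumption): the report only says 3 different seeds were used.
SEEDS = [0, 1, 2]


def build_runs(configs, seeds, epochs=3):
    """Return one run description per (configuration, seed) pair."""
    runs = []
    for (name, bits), seed in product(configs.items(), seeds):
        runs.append({"name": f"{name}-seed{seed}", "seed": seed, "epochs": epochs, **bits})
    return runs


runs = build_runs(CONFIGS, SEEDS)
for run in runs:
    print(run["name"], run["adam_bits"], run["embedding_bits"])
```

Keeping everything except the optimizer and embedding precision fixed, and repeating each configuration over the same seeds, is what lets differences between configurations be separated from seed-to-seed noise.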
8-bit Adam vs 32-bit Adam
A comparison of training with 8-bit Adam and with 32-bit Adam. I had a slight error in the per-discourse-type recall calculations, so those scores are not shown. This was a single run, so the results are not conclusive. I've seen a larger difference in training times on other projects, so your mileage may vary!
1
2022-01-17
[Feedback Prize] Bigbird-base NER fine-tuning
It turns out I was calculating my F1 score incorrectly, so my CV values are now much higher. As a result, there aren't as many runs now.
0
2021-12-26
BigBird Base In-Domain Pre-training Results
Using masked language modeling to adapt the model to the domain of high school essays.
1
2021-12-20
BigBird base NER fine-tuning results
A comparison of training results for the Feedback Prize competition
0
2021-12-19