Bug Fix - 25th May
Created on May 30|Last edited on May 30
Section 1
The pink run is Mangoes (Joe's code); the dark blue run is our approach. On train metrics, the two are basically the same.
Context:
- Both models are trained and evaluated with my scripts. The data is also preprocessed with my script.
- THE BUG IS GONE!
Observations:
- We're in roughly the same ballpark. Joe's model overfits less, mine overfits more (as of now).
- Both sets of numbers are a bit off from what Joe gets, but that's manageable for now. It's a matter of learning rate annealing and other scheduling, weight decay, etc. I AM working on it, but it's not an urgent priority.
Coref with Pruner
When I add the pruner loss alongside the coref loss, things get a bit hazy. Note that these are not full runs: in each epoch, instead of all 2700 documents, I only give the model the first 50. This is to reduce the turnaround time. (The general trends hold across both full and truncated runs.) Here the brown run is the one with just the coref loss, and the light blue one has both the coref and pruner losses.
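For reference, the truncated epochs could be set up roughly like this. It is only a sketch: `truncated_epoch` and `all_documents` are made-up names for illustration, not the ones in my scripts, and the document list is a stand-in for the real training set.

```python
from itertools import islice


def truncated_epoch(documents, max_docs=50):
    """Return an iterator over only the first `max_docs` documents."""
    return islice(documents, max_docs)


# Hypothetical stand-in for the 2700-document training set.
all_documents = [f"doc_{i}" for i in range(2700)]

for epoch in range(2):
    for document in truncated_epoch(all_documents):
        # A real loop would run one training step on `document` here.
        pass
```

Because the same first 50 documents are used every epoch, the runs stay comparable with each other, but absolute numbers should not be compared with full runs.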
Observations:
- On train metrics, adding the pruner loss makes the entire thing worse.
- On valid metrics it wreaks havoc. I'm figuring out why, but for now it seems to be a balancing issue between the magnitudes of the two losses.
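One way the balancing issue could be probed is to scale the pruner loss before summing it with the coref loss. The sketch below is an assumption about what such a fix might look like, not what the code currently does: `combined_loss`, `magnitude_matched_weight`, and the example loss values are all hypothetical.

```python
def combined_loss(coref_loss, pruner_loss, pruner_weight=1.0):
    """Weighted sum of the two losses; pruner_weight=1.0 is a plain sum."""
    return coref_loss + pruner_weight * pruner_loss


def magnitude_matched_weight(coref_loss, pruner_loss, eps=1e-8):
    """Weight that rescales the pruner loss to the coref loss's magnitude."""
    return abs(coref_loss) / (abs(pruner_loss) + eps)


# Made-up values where the pruner loss is 20x larger than the coref loss.
coref, pruner = 2.0, 40.0
weight = magnitude_matched_weight(coref, pruner)  # about 0.05
total = combined_loss(coref, pruner, weight)      # about 4.0
```

With tensor losses, a weight computed this way would need to be treated as a constant (detached from the graph) so that it only rescales the gradient rather than contributing to it. A fixed, hand-tuned `pruner_weight` is the simpler alternative.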
Overall Conclusions
Again, I realise that I'm still a few days away from getting everything running smoothly, but at least I am over the horrid roadblock I've had for eight weeks now.