
Phase 3 - CR Hyperparameters & Variants

Created on September 30 | Last edited on October 13
We first do experiments on the trimmed set and then see whether those insights generalise to the main set.

Sweep: Vary Loss Scales


[Plot: metric (0–0.4) vs. Step (0–70) for the loss-scale sweep runs]
Run set: 5 runs


Varying the loss scales does not seem to make any difference; it's probably best to keep them equal. I still suspect this factor has no influence.
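For context, here is a minimal sketch of what the loss scales control, assuming the model sums two scaled task losses (the loss names are placeholders, not this project's actual terms). One possible explanation for the flat results: if the optimizer is Adam-style, a shared rescaling of the summed loss largely cancels out in the update, so only the ratio of the two scales would matter.

```python
# Illustrative sketch of a scaled multi-task loss; `loss_a`/`loss_b` are
# placeholders, since the actual loss terms are not named in this report.
def combined_loss(loss_a, loss_b, scale_a=1.0, scale_b=1.0):
    # Only the ratio scale_a/scale_b changes the gradient direction; with an
    # Adam-style optimizer, a shared rescaling of the sum mostly cancels out,
    # which would be consistent with the flat curves above.
    return scale_a * loss_a + scale_b * loss_b
```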

Sweep: LR, ELR

Here we vary the learning rate (LR) and the encoder learning rate (ELR); a sketch of how a separate encoder LR is typically implemented follows the values below.
LR: 2e-5, 4e-5, 1e-4
ELR: 1e-5, 5e-5, 2e-6
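A minimal sketch, assuming a PyTorch model with an `encoder` submodule and a task `head` (the module names and the choice of AdamW are placeholders, not this project's actual code), of how the two learning rates map onto optimizer parameter groups:

```python
import torch

def build_optimizer(model, lr=2e-4, elr=1e-5):
    # Two parameter groups: a smaller LR for the pretrained encoder,
    # a larger one for the task-specific layers.
    return torch.optim.AdamW([
        {"params": model.encoder.parameters(), "lr": elr},  # pretrained encoder
        {"params": model.head.parameters(), "lr": lr},      # task-specific head
    ])
```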


Run set: 16 runs

From this, the best performing combinations are as follows, with LR = 1e-3 and ELR (encoder learning rate) = 5e-5 on top:

| LR | Encoder LR | Avg F1 |
|---|---|---|
| 1e-3 | 5e-5 | 0.4578 |
| 5e-4 | 5e-5 | 0.4497 |
| 2e-4 | 1e-5 | 0.4268 |
| 5e-4 | 1e-5 | 0.4208 |
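For reference, a grid over the (LR, ELR) pairs in this table could be expressed as a W&B sweep; the parameter names, metric name, and project name below are assumptions, not this report's actual config.

```python
import wandb

# Hypothetical grid-sweep config covering the combinations in the table above.
sweep_config = {
    "method": "grid",
    "metric": {"name": "avg_f1", "goal": "maximize"},
    "parameters": {
        "lr": {"values": [1e-3, 5e-4, 2e-4]},
        "encoder_lr": {"values": [5e-5, 1e-5]},
    },
}

sweep_id = wandb.sweep(sweep_config, project="phase3-cr")  # project name assumed
# wandb.agent(sweep_id, function=train)  # `train` (hypothetical) runs one combo
```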

Another thing to note: all of the `crpr` runs are worse than the `cr` runs. That can't be a good sign!

Sweep: CRPR/CR Trim Baselines (default params, varying size)

LR Full Dataset (Sweep LR Results)

Clearly, the 'default' learning rates (LR = 2e-4, ELR = 1e-5) work better than every other combination. So, NO, the insights from the limited 50-instance experiments do not directly translate to the main, whole dataset.
We then need to figure out how large the trimmed set has to be before its conclusions carry over; a rough sketch of that experiment is below.
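One way to probe that: train the same configs at increasing trim sizes and check where the small-set ranking starts to agree with the full-set ranking. In this sketch, `make_trim` and `train_and_eval` are hypothetical helpers (not functions from this codebase), and the sizes beyond 50 are arbitrary.

```python
TRIM_SIZES = [50, 100, 250, 500, 1000]
CONFIGS = [(2e-4, 1e-5), (1e-3, 5e-5)]  # (LR, ELR) pairs from the sweeps above

for n in TRIM_SIZES:
    trim = make_trim(n)  # subsample n training instances from the full set
    scores = {cfg: train_and_eval(trim, lr=cfg[0], elr=cfg[1]) for cfg in CONFIGS}
    best = max(scores, key=scores.get)
    # Once the winner stabilises at the full-set winner, the trim is big enough.
    print(f"trim={n}: best config {best}")
```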

Run set: 5 runs


Unary hdim

Priyansh Trivedi: The test is inconclusive. Trims still don't represent the real deal. A `unary_hdim` of 1000 works better, but there's little difference between 500 and 100 🤷

HOI Trainer

Priyansh Trivedi: There is NO difference between the two. Ignore `hoitrainer`.