Reports
Created by
Created On
Last edited
Negative sampling methods
This report considers various negative sampling techniques.
0
2021-03-25
Agents trained with Imitation Learning Reward
Note that the DSSMs that provide the IL reward were trained on datasets containing negatives for all states, so the following experiments serve more as a proof of concept. Moreover, only MountainCar and Acrobot are considered, because the agent learns to solve CartPole as long as the reward is positive.
0
2021-03-12
Temperature and LR for LogSoftMax Reward
This report considers various values of temperature and learning rate for LogSoftMax Reward.
0
2021-03-24
Imitation Learning on CartPole
Agents trained on CartPole with different rewards.
0
2021-03-15
Negatives from Experience Replay vs Negatives from Expert
The experience replay buffer has size 10. A fixed subset of size 10 is chosen from the expert dataset.
0
2021-03-12