Insanely fast Lunar Lander with AWAC
Created on April 5|Last edited on April 6
Section 1
Insanely fast training using AWAC: viable trajectories after only 1,500 updates, 1,000 of them on offline training data. Batch size was 128. The offline data was 1,000 randomly selected transitions from a dataset of 100,000 expert trajectories.
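As a rough illustration of the setup described above, the sketch below draws a random subset of transitions and lays out the offline/online update schedule. It only handles indices, not real Lunar Lander data; the names `sample_batch` and `phase` are hypothetical, and treating the expert dataset as a pool of 100,000 indexable items is an assumption, not the actual training code for this run.

```python
import random

# Numbers taken from the note above; how they map onto the real
# data loader is an assumption.
DATASET_SIZE = 100_000    # size of the expert dataset
SUBSET_SIZE = 1_000       # randomly selected transitions
BATCH_SIZE = 128
OFFLINE_UPDATES = 1_000   # updates on offline data only
TOTAL_UPDATES = 1_500     # offline updates plus online fine-tuning

rng = random.Random(0)

# Draw 1,000 distinct indices uniformly at random from the dataset.
subset_idx = rng.sample(range(DATASET_SIZE), SUBSET_SIZE)


def sample_batch(indices, batch_size, rng):
    """Draw a minibatch of indices uniformly, with replacement (hypothetical helper)."""
    return rng.choices(indices, k=batch_size)


def phase(update):
    """Label an update step as offline pretraining or online fine-tuning."""
    return "offline" if update < OFFLINE_UPDATES else "online"


schedule = [phase(u) for u in range(TOTAL_UPDATES)]
batch = sample_batch(subset_idx, BATCH_SIZE, rng)
```

With these numbers, two thirds of all gradient updates happen before the agent collects any experience of its own.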
What possibly helped here was drawing at random from such a large set of good actions, then doing only limited fine-tuning.
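One way to see why a pool of good actions might matter: AWAC fits the policy with a log-likelihood loss in which each sampled action is weighted by the exponential of its advantage, exp(A / lambda). The sketch below computes such weights for a toy batch. The function name `awac_weights`, the default `lam=1.0`, and the normalisation over the batch are assumptions made for illustration, not settings taken from this run.

```python
import math


def awac_weights(advantages, lam=1.0):
    """Advantage-based weights exp(A / lam), normalised over the batch.

    Subtracting the maximum advantage before exponentiating avoids overflow;
    the batch normalisation is an implementation choice assumed here.
    """
    m = max(advantages)
    exps = [math.exp((a - m) / lam) for a in advantages]
    total = sum(exps)
    return [e / total for e in exps]


# Toy advantages: a good, a neutral, and a poor action.
weights = awac_weights([1.0, 0.0, -1.0])
```

Actions with higher advantage receive larger weights, so a batch drawn from expert data gives the policy mostly useful targets to imitate.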
Video: run sunny-rain-229