
Insanely fast Lunar Lander with AWAC

Created on April 5|Last edited on April 6

Section 1



Insanely fast training using AWAC: the agent produced viable trajectories after only 1,500 updates, 1,000 of them on offline training data. Batch size was 128. The offline data was 1,000 transitions selected at random from a dataset of 100,000 expert trajectories.
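The data setup described above can be sketched roughly as follows. This is a minimal illustration, not the code used for the run: the constant names, the `sample_batch` helper, and the treatment of the expert dataset as a flat index range are all assumptions made for the example, and the 500 remaining updates are only presumed to be the fine-tuning phase.

```python
import numpy as np

# Figures taken from the text; the names themselves are hypothetical.
DATASET_SIZE = 100_000    # size of the expert dataset
SUBSET_SIZE = 1_000       # entries drawn at random for offline training
BATCH_SIZE = 128
OFFLINE_UPDATES = 1_000
TOTAL_UPDATES = 1_500

rng = np.random.default_rng(0)

# Assumption: the dataset is addressable by integer index, so drawing a
# random subset amounts to drawing distinct indices without replacement.
subset_idx = rng.choice(DATASET_SIZE, size=SUBSET_SIZE, replace=False)


def sample_batch(indices, batch_size, rng):
    """Hypothetical helper: pick one minibatch of indices from the subset."""
    return rng.choice(indices, size=batch_size, replace=False)


# Updates left over after the offline phase (presumably online fine-tuning).
remaining_updates = TOTAL_UPDATES - OFFLINE_UPDATES

batch = sample_batch(subset_idx, BATCH_SIZE, rng)
```

With a subset this small, each of the 1,000 offline updates revisits the same transitions many times, since one batch of 128 already covers about an eighth of the subset.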

Possibly what helped here was drawing at random from such a large training set of good actions, then doing only a limited amount of fine-tuning.
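One way to see why a dataset of good actions could matter: AWAC's policy update weights the log-likelihood of each dataset action by an exponentiated advantage, so actions the critic rates highly dominate the update. The sketch below shows only that weighting step. The function name, the temperature value, and the normalization over the batch are assumptions for illustration, not details taken from this run.

```python
import numpy as np


def awac_weights(advantages, lam=1.0):
    """Sketch of advantage weighting: w_i proportional to exp(A_i / lam).

    Assumptions: `lam` is a temperature chosen arbitrarily here, and the
    weights are normalized over the batch (a softmax), which some
    implementations do and others do not.
    """
    scaled = np.asarray(advantages, dtype=np.float64) / lam
    scaled = scaled - scaled.max()   # subtract the max for numerical stability
    w = np.exp(scaled)
    return w / w.sum()


# Hypothetical advantages for four dataset actions.
adv = np.array([-1.0, 0.0, 0.5, 2.0])
weights = awac_weights(adv, lam=1.0)

# The actor loss would then be -(weights * log_prob_of_dataset_actions).sum(),
# where the log-probabilities come from the policy network (not shown).
```

If nearly every sampled action is already a good one, the weights stay fairly even and the update behaves much like plain behavior cloning on expert data, which may be part of why so few updates were needed.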

Run: sunny-rain-229