Negative sampling methods
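This report considers four negative sampling techniques, which differ in where the negative states are drawn from: the expert dataset, a replay buffer, or both.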
Created on March 25 | Last edited on March 25
Sampling techniques.
- Expert Experience Replay with random sampling (ExpertER_rs). Negative states are randomly sampled with repetition from the expert dataset.
- Experience Replay (ER). States are stored in the replay buffer, and all states in the buffer are used as negatives. The replay buffer is not reset at the end of an episode.
- Experience Replay with Resets (ResetER). States are stored in the replay buffer, and all states in the buffer are used as negatives. The replay buffer is reset at the end of each episode. If the number of states in the replay buffer is less than the required size, negatives are sampled from the buffer with repetition. For example, if the required size is 100 and the buffer holds only 20 states, all 100 negatives are sampled with repetition from those 20 states.
- Expert Experience Replay with Resets (ExpertResetER). Half of the required number of negatives is randomly sampled from the expert dataset with repetition. The other half is sampled from the replay buffer, which is reset at the end of each episode. If the replay buffer holds too few states, the missing negatives are also sampled from the expert dataset.
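
The sampling rules above can be sketched in Python as follows. This is a minimal illustration, not the code used for the runs in this report: the function names, the `rng` parameter, and the use of plain lists for the expert dataset and the replay buffer are all hypothetical. Two details are assumptions the description does not settle: an empty buffer yields no negatives in ResetER, and ExpertResetER draws its buffer half without repetition when the buffer holds enough states.

```python
import random


def sample_expert_rs(expert_states, n, rng=random):
    """ExpertER_rs: n negatives drawn with repetition from the expert dataset."""
    return rng.choices(expert_states, k=n)


def sample_er(buffer):
    """ER: every state in the (never reset) replay buffer is a negative."""
    return list(buffer)


def sample_reset_er(buffer, n, rng=random):
    """ResetER: all buffer states, or n draws with repetition if the buffer is short."""
    if not buffer:
        # Assumption: nothing to sample from right after a reset.
        return []
    if len(buffer) < n:
        return rng.choices(buffer, k=n)
    return list(buffer)


def sample_expert_reset_er(expert_states, buffer, n, rng=random):
    """ExpertResetER: half from the expert dataset, half from the buffer.

    Negatives the buffer cannot supply are drawn from the expert dataset.
    """
    n_expert = n // 2
    n_buffer = n - n_expert
    if len(buffer) >= n_buffer:
        # Assumption: the buffer half is drawn without repetition.
        from_buffer = rng.sample(buffer, n_buffer)
    else:
        from_buffer = list(buffer)
    n_missing = n_buffer - len(from_buffer)
    from_expert = rng.choices(expert_states, k=n_expert + n_missing)
    return from_expert + from_buffer
```

In this sketch the reset itself (clearing the buffer at the end of an episode) is left to the caller, so the only difference between ER and ResetER inside the samplers is how a short buffer is handled.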