[Figure: episode_reward curves on SuperMarioBros for four intrinsic reward modules — Disagreement, E3B, ICM, and PseudoCounts. Each panel compares configurations against the Baseline: Weight Initialization=Default, Update Proportion=1%/10%/50%, Reward Normalization=Min-Max/Vanilla, Observation Normalization=Vanilla, and with LSTM. Run set: 690 runs.]
https://wandb.ai/yuanmingqi/RLeXplore/reports/RLeXplore-RLLTE-s-PPO-SuperMarioBros--Vmlldzo5NTQwMDU3