Atari HFxSB3 Benchmark
We trained PPO models on four different environments (Pong, Qbert, Seaquest, and Breakout) using Stable-Baselines3 (SB3). The models are published on the Hugging Face Hub.
Created on March 1 | Last edited on March 1
ppo-PongNoFrameskip-v4 (W&B tracking crashed after 3M steps)
This is a trained model of a PPO agent playing PongNoFrameskip-v4 using the stable-baselines3 library (our agent is the 🟢 one).
Usage
- This model is hosted on the Hugging Face Hub. You can access it here: https://huggingface.co/ThomasSimonini/ppo-PongNoFrameskip-v4
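As a minimal sketch of how a hosted model might be pulled from the Hub and run, the snippet below uses `load_from_hub` from the `huggingface_sb3` package together with SB3's Atari helpers. The repo id follows the URL pattern used in this report; the checkpoint filename (`<repo name>.zip`) and the frame stack of 4 are assumptions, so check the repository's file list and training configuration before relying on them. The helper names (`hub_location`, `load_agent`, `make_env`, `run_steps`) are hypothetical.

```python
def hub_location(env_id, user="ThomasSimonini", algo="ppo"):
    """Return (repo_id, filename) for one of the models in this report.

    The repo_id mirrors the URLs listed in the report. The filename is an
    assumption: verify it against the files actually stored in the repo.
    """
    name = f"{algo}-{env_id}"
    return f"{user}/{name}", f"{name}.zip"


def load_agent(env_id):
    """Download the checkpoint from the Hub and load it as a PPO model."""
    # Imported here so hub_location() works without the RL stack installed.
    from huggingface_sb3 import load_from_hub
    from stable_baselines3 import PPO

    repo_id, filename = hub_location(env_id)
    checkpoint_path = load_from_hub(repo_id=repo_id, filename=filename)
    return PPO.load(checkpoint_path)


def make_env(env_id, seed=0, n_stack=4):
    """Build an Atari env with the standard wrappers.

    n_stack=4 is an assumed value, not one stated in the report.
    """
    from stable_baselines3.common.env_util import make_atari_env
    from stable_baselines3.common.vec_env import VecFrameStack

    env = make_atari_env(env_id, n_envs=1, seed=seed)
    return VecFrameStack(env, n_stack=n_stack)


def run_steps(env_id, n_steps=1000):
    """Let the trained agent act for n_steps and return the summed reward."""
    model = load_agent(env_id)
    env = make_env(env_id)
    obs = env.reset()
    total_reward = 0.0
    for _ in range(n_steps):
        action, _state = model.predict(obs, deterministic=True)
        obs, rewards, dones, infos = env.step(action)
        total_reward += float(rewards[0])
    env.close()
    return total_reward
```

Calling `run_steps("PongNoFrameskip-v4")` would download the Pong checkpoint and step the agent; the same helpers should apply to the other three environments by changing the environment id.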
Evaluation Results
Mean_reward: 21.00 +/- 0.0
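The "mean +/- std" figure can be read as the mean and population standard deviation of per-episode returns, which is what SB3's `evaluate_policy` reports. The sketch below shows that arithmetic in plain Python; the episode list is illustrative, not the actual evaluation data, and the number of evaluation episodes used for this report is not stated. Since 21 is the maximum score in Pong, a standard deviation of 0.0 suggests the agent scored 21 in every evaluation episode.

```python
from statistics import fmean, pstdev


def summarize(episode_rewards):
    """Mean and population std of per-episode returns.

    Population std (ddof=0) matches what numpy.std computes by default.
    """
    return fmean(episode_rewards), pstdev(episode_rewards)


def format_result(mean, std):
    """Format a result line in the style used in this report."""
    return f"Mean_reward: {mean:.2f} +/- {std:.1f}"


def evaluate(model, env, n_eval_episodes=10):
    """Thin wrapper around SB3's evaluator; returns (mean_reward, std_reward).

    n_eval_episodes=10 is an assumed value, not one taken from the report.
    """
    from stable_baselines3.common.evaluation import evaluate_policy

    return evaluate_policy(
        model, env, n_eval_episodes=n_eval_episodes, deterministic=True
    )


# Illustrative data only: ten hypothetical episodes that all end 21-0.
illustrative_rewards = [21.0] * 10
mean, std = summarize(illustrative_rewards)
print(format_result(mean, std))
```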
ppo-BreakoutNoFrameskip-v4
This is a trained model of a PPO agent playing BreakoutNoFrameskip-v4 using the stable-baselines3 library.
Usage
- This model is hosted on the Hugging Face Hub. You can access it here: https://huggingface.co/ThomasSimonini/ppo-BreakoutNoFrameskip-v4
Evaluation Results
Mean_reward: 339.0
ppo-QbertNoFrameskip-v4
This is a trained model of a PPO agent playing QbertNoFrameskip-v4 using the stable-baselines3 library.
Usage
- This model is hosted on the Hugging Face Hub. You can access it here: https://huggingface.co/ThomasSimonini/ppo-QbertNoFrameskip-v4
Evaluation Results
Mean_reward: 15685.00 +/- 115.217
ppo-SeaquestNoFrameskip-v4
This is a trained model of a PPO agent playing SeaquestNoFrameskip-v4 using the stable-baselines3 library.
Usage
- This model is hosted on the Hugging Face Hub. You can access it here: https://huggingface.co/ThomasSimonini/ppo-SeaquestNoFrameskip-v4
Evaluation Results
Mean_reward: 1820.00 +/- 20.0