
Atari HFxSB3 Benchmark

We trained PPO models on four different environments (Pong, Qbert, Seaquest, and Breakout) using Stable-Baselines3 (SB3). The models are published on the Hugging Face Hub.
Created on March 1|Last edited on March 1

ppo-PongNoFrameskip-v4 (W&B tracking crashed after 3M steps)

This is a trained model of a PPO agent playing PongNoFrameskip-v4 using the stable-baselines3 library (our agent is the 🟢 one).


Usage
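The usage snippet did not survive on this page, so here is a minimal sketch of how a model like this one might be loaded from the Hub and evaluated. It is not the authors' original code. The `org` argument, the `<model_name>.zip` filename, the frame stack of 4, and the helper names `hub_names` and `load_and_evaluate` are all assumptions made for illustration; check the model repository for the actual values.

```python
def hub_names(env_id, algo="ppo", org="your-hf-username"):
    """Build (repo_id, filename) from the `<algo>-<env_id>` naming used in this report.

    `org` is a placeholder and the `.zip` filename is an assumption.
    """
    model_name = f"{algo}-{env_id}"
    return f"{org}/{model_name}", f"{model_name}.zip"


def load_and_evaluate(env_id, org, n_eval_episodes=10):
    """Download a checkpoint, rebuild an Atari env, and return (mean_reward, std_reward)."""
    # Heavy dependencies are imported lazily so hub_names() works without them.
    from huggingface_sb3 import load_from_hub
    from stable_baselines3 import PPO
    from stable_baselines3.common.env_util import make_atari_env
    from stable_baselines3.common.evaluation import evaluate_policy
    from stable_baselines3.common.vec_env import VecFrameStack

    repo_id, filename = hub_names(env_id, org=org)
    checkpoint = load_from_hub(repo_id=repo_id, filename=filename)
    model = PPO.load(checkpoint)

    # Assumed preprocessing: standard Atari wrappers plus a stack of 4 frames.
    env = make_atari_env(env_id, n_envs=1, seed=0)
    env = VecFrameStack(env, n_stack=4)

    return evaluate_policy(model, env, n_eval_episodes=n_eval_episodes)
```

Calling `load_and_evaluate("PongNoFrameskip-v4", org="...")` needs `stable-baselines3`, `huggingface_sb3`, the Atari ROMs, and network access. The preprocessing must match what was used in training, otherwise the reported scores will not reproduce.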



Evaluation Results

Mean_reward: 21.00 +/- 0.0
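To make the `mean +/- std` notation concrete, the sketch below summarizes a list of per-episode returns. It assumes the second number is the population standard deviation across evaluation episodes, which is what NumPy's default `std` computes. The returns are illustrative values, not the actual evaluation episodes behind this report, and `summarize_returns` and `format_result` are hypothetical helper names.

```python
from statistics import fmean, pstdev


def summarize_returns(episode_returns):
    """Return (mean, population std) of the per-episode returns."""
    mean = fmean(episode_returns)
    std = pstdev(episode_returns)  # population std (ddof=0)
    return mean, std


def format_result(mean, std):
    """Format a result line in the style used in this report."""
    return f"Mean_reward: {mean:.2f} +/- {std:.2f}"


# Illustrative only: ten Pong episodes that each end with a return of 21.
illustrative_returns = [21.0] * 10
mean_reward, std_reward = summarize_returns(illustrative_returns)
print(format_result(mean_reward, std_reward))
```

A standard deviation of 0 alongside a mean of 21 means every evaluated episode finished with the same return of 21.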



ppo-BreakoutNoFrameskip-v4

This is a trained model of a PPO agent playing BreakoutNoFrameskip-v4 using the stable-baselines3 library.

Usage



Evaluation Results

Mean_reward: 339.0





ppo-QbertNoFrameskip-v4

This is a trained model of a PPO agent playing QbertNoFrameskip-v4 using the stable-baselines3 library.

Usage



Evaluation Results

Mean_reward: 15685.00 +/- 115.217




ppo-SeaquestNoFrameskip-v4

This is a trained model of a PPO agent playing SeaquestNoFrameskip-v4 using the stable-baselines3 library.

Usage



Evaluation Results

Mean_reward: 1820.00 +/- 20.0

