andreas_giannoutsos

Proximal Policy Optimization (PPO) Experiments

Here we can observe three experiments with the PPO RL algorithm (with different numbers of timesteps: {2000, 5000, 10000}), first separately (unsmoothed) and finally all three together (smoothed).

spympr

2021-02-28


Deep Q-Network (DQN) Experiments

Here we can observe three experiments with the DQN RL algorithm (with different numbers of timesteps: {2000, 5000, 10000}), first separately (unsmoothed) and finally all three together (smoothed).

spympr

2021-02-28


Car Racing PPO MIA3

joarmu21

2021-09-17


Actor-Critic (A2C) Experiments

Here we can observe two experiments with the A2C RL algorithm (with different values of n_steps: {16, 36}), first separately (unsmoothed) and finally both together (smoothed).

spympr

2021-02-28
