Gym-MicroRTS: Our PPO + action mask vs Our PPO vs openai/baselines' PPO
Created on January 3 | Last edited on April 12
MicrortsMining-v1
Run set: 9 runs (9 visualized)
Name: gym_id: MicrortsMining-v1 (counts: 3, 9)

State: Finished
Notes, Tags, Created: (blank)
User: costa-huang
Runtime: 20d 2h 7m 24s
Sweep: -
alg: ppo2
env: MicrortsMining-v1
exp_name: ["baselines-ppo2-cnn_gym_microrts","ppo_multidiscrete","ppo_multidiscrete_mask"]
network: cnn_gym_microrts
num_env: 8
num_timesteps: 2000000
play: false
reward_scale: 1
save_video_interval: 0
save_video_length: 200
seed: 2
track: true
anneal_lr: true
batch_size: 1024
capture_video: false
clip_coef: 0.1
clip_vloss: true
cuda: true
ent_coef: 0.01
gae: true
gae_lambda: 0.95
gamma: 0.99
gym_id: MicrortsMining-v1
learning_rate: 0.00025
max_grad_norm: 0.5
minibatch_size: 256
norm_adv: true
num_envs: 8
num_minibatches: 4
num_steps: 128
torch_deterministic: true
total_timesteps: 2000000
update_epochs: 4
vf_coef: 0.5
wandb_entity: vwxyzjn
wandb_project_name: ppo-details
aux_batch_size: -
aux_minibatch_size: -
beta_clone: -
e_auxiliary: -
e_policy: -
n_aux_grad_accum: -
n_aux_minibatch: -
n_iteration: -
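
The batch-related values in this run configuration appear to be related by simple arithmetic. The sketch below assumes the common PPO convention that batch_size is num_envs * num_steps and that minibatch_size is batch_size divided by num_minibatches. The report does not state these formulas, but the listed values (1024 and 256) are consistent with them. The num_updates and grad_steps_per_update names are hypothetical, added only for illustration.

```python
# Values copied from the run configuration listed in this report.
num_envs = 8
num_steps = 128
num_minibatches = 4
update_epochs = 4
total_timesteps = 2_000_000

# Assumed convention: one rollout batch holds num_steps transitions per environment.
batch_size = num_envs * num_steps

# Assumed convention: each epoch splits the batch into equal minibatches.
minibatch_size = batch_size // num_minibatches

# Hypothetical derived quantities (not listed in the report).
num_updates = total_timesteps // batch_size
grad_steps_per_update = update_epochs * num_minibatches

print(batch_size, minibatch_size, num_updates, grad_steps_per_update)
```

Under these assumptions, the derived batch_size and minibatch_size match the values recorded for the runs, which suggests the two fields are computed rather than set independently.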
Created with ❤️ on Weights & Biases.
https://wandb.ai/vwxyzjn/ppo-details/reports/Gym-MicroRTS-Our-PPO-action-mask-vs-Our-PPO-vs-vs-openai-baselines-PPO--VmlldzoxNDAwMTc3