MicroRTS Double-cone, Hyperparam Transitions
Created on May 16 | Last edited on May 26
91d85dd MicroRTSGridModeSharedMemVecEnv isn’t supported
bfe9121 Don’t reward power generation. Reward robot building
cb58909 Give policy hp and resources as floats
8ec7c25 Microrts-selfplay-dc-phases-final adds all maps
ce58e3f Reduce n_epochs to 2 for double-cone microrts
6c5a3dd Replace metal_remaining with factories_to_place
4e033fe Reduce n_epochs from 4 to 2 for Lux to reduce kl-div
7ed67a8 Get working on A100
cceeda0 Upgrade tensorboard & upgrade java runtime
Need to reduce the minibatch size for the 24x24 map by splitting each batch into 8 minibatches (up from 6 on the A10 for 16x16 maps).
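The effect of going from 6 to 8 minibatches can be checked with quick arithmetic. The sketch below is illustrative only: `BATCH_SIZE` is a hypothetical value chosen to divide evenly by both counts (the report does not state the real batch size), and `minibatch_size` is a helper name made up for this example.

```python
def minibatch_size(batch_size: int, n_minibatches: int) -> int:
    """Samples per minibatch when a rollout batch is split evenly."""
    if batch_size % n_minibatches != 0:
        raise ValueError("batch_size must be divisible by n_minibatches")
    return batch_size // n_minibatches


# Hypothetical rollout batch size, NOT taken from the report.
BATCH_SIZE = 4800

size_16x16 = minibatch_size(BATCH_SIZE, 6)  # 6 minibatches for 16x16 maps
size_24x24 = minibatch_size(BATCH_SIZE, 8)  # 8 minibatches for the 24x24 map

# Grid cells processed per minibatch, a rough proxy for activation memory.
cells_16x16 = size_16x16 * 16 * 16
cells_24x24 = size_24x24 * 24 * 24

print(size_16x16, size_24x24)
print(cells_16x16, cells_24x24)
```

Under these assumed numbers, each 24x24 sample covers 2.25 times as many cells as a 16x16 sample, so more minibatches means fewer samples per gradient step but not necessarily fewer cells per step.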
28227ab Support for different size maps through padding
2f3729f Copy over vec_env from gym_microrts
d44cf1c Specify wheel url for gym-microrts
4003f90 Point gym-microrts to sgoodfriend fork
5c0a469 Support for multiple map_paths (limit to save size)
3f97901 A10 variant of Microrts with double-cone
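Commit 28227ab mentions supporting different map sizes through padding. A minimal sketch of that general idea follows, assuming smaller maps are placed in the top-left corner of a fixed-size grid and a mask marks the real cells. The function name `pad_grid`, the corner placement, and the zero pad value are assumptions for illustration; the actual commit may pad differently.

```python
from typing import List, Tuple

Grid = List[List[float]]
Mask = List[List[bool]]


def pad_grid(
    grid: Grid, target_h: int, target_w: int, pad_value: float = 0.0
) -> Tuple[Grid, Mask]:
    """Pad a 2D map to (target_h, target_w) and return a mask of real cells."""
    h = len(grid)
    w = len(grid[0]) if h else 0
    if h > target_h or w > target_w:
        raise ValueError("map is larger than the padded size")
    padded = [[pad_value] * target_w for _ in range(target_h)]
    mask = [[False] * target_w for _ in range(target_h)]
    for y in range(h):
        for x in range(w):
            padded[y][x] = grid[y][x]
            mask[y][x] = True  # True only where the original map has a cell
    return padded, mask


# A 16x16 map padded so it can share a batch with 24x24 maps.
small_map = [[1.0] * 16 for _ in range(16)]
padded, mask = pad_grid(small_map, 24, 24)
print(len(padded), len(padded[0]))
```

The mask lets later code ignore padded cells, for example when masking invalid actions or computing per-cell losses.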
Run on an A100 to fit 1 training in GPU memory (26.6GB). The unet baseline used to fit 3 simultaneous trainings on an A10.
Baseline using unet and decayed rewards
ppo-Microrts-selfplay-dc-phases-A10-S1-2023-05-25T04:10:09.181610
ppo-Microrts-selfplay-dc-phases-A10-S1-2023-05-24T03:40:49.861095
ppo-Microrts-selfplay-dc-phases-maps24-S1-2023-05-19T01:26:19.244633
ppo-Microrts-selfplay-dc-phases-A10-S1-2023-05-19T01:03:24.744977
ppo-Microrts-selfplay-dc-phases-A10-S1-2023-05-17T20:35:16.497141
ppo-Microrts-selfplay-dc-phases-A10-S1-2023-05-16T22:54:34.689524
ppo-Microrts-selfplay-dc-phases-S1-2023-05-16T22:44:50.508414: A100
Baseline