MicroRTS Additional Rewards and Observations

Created on May 25 | Last edited on June 9

[Figure: two line charts plotted against global_step (0 to 300M), with y-axes spanning 0 to 1 and -0.5 to 1]
Run set
7

1d1195b Fix CriticHead setting hidden layer gains to 1.0
eb5e2f5 Squash advantage by weights before going through loss
267a387 Fix WinLoss draw when no units
b0f6ce8 Reward for factories, heavies, and lights being alive
37b3aff assert -> warn on score_reward != winloss
78553fc microrts winloss 0 can occur even if score_reward is non-zero
57f5923 Assert score_reward is same sign as winloss
No "Switch WinLoss head to use no activation"
d285d50 Switch WinLoss head to use no activation
37b3aff assert -> warn on score_reward != winloss
78553fc microrts winloss 0 can occur even if score_reward is non-zero
57f5923 Assert score_reward is same sign as winloss
b1a4b68 Use ScoreReward as the third head in Microrts and Lux
282f633 Lux reward term based off of the difference of scores
9813dfd Record results info & score based on cost+hp
91d85dd MicroRTSGridModeSharedMemVecEnv isn’t supported
bfe9121 Don’t reward power generation. Reward robot building
cb58909 Give policy hp and resources as floats
8ec7c25 Microrts-selfplay-dc-phases-final adds all maps
ce58e3f Reduce n_epochs to 2 for double-cone microrts
6c5a3dd Replace metal_remaining with factories_to_place
4e033fe Reduce n_epochs from 4 to 2 for Lux to reduce kl-div
7ed67a8 Get working on A100
cceeda0 Upgrade tensorboard & upgrade java runtime
28227ab Support for different size maps through padding
2f3729f Copy over vec_env from gym_microrts
d44cf1c Specify wheel url for gym-microrts
4003f90 Point gym-microrts to sgoodfriend fork
Baseline using unet and decayed rewards


ppo-Microrts-selfplay-dc-phases-A10-S1-2023-05-30T00:47:31.012042


Run set
1



ppo-Microrts-selfplay-dc-phases-A10-S1-2023-05-28T02:18:36.719127


Run set
1


ppo-Microrts-selfplay-dc-phases-A10-S1-2023-05-26T00:09:17.258204


Run set
1


ppo-Microrts-selfplay-dc-phases-A10-S1-2023-05-25T04:10:09.181610



Run set
1


ppo-Microrts-selfplay-dc-phases-A10-S1-2023-05-24T03:40:49.861095


Run set
1


ppo-Microrts-selfplay-dc-phases-A10-S1-2023-05-19T01:03:24.744977


Run set
1


Baseline


Run set
3