MicroRTS U-net Self-play
Created on April 8 | Last edited on May 16
Experiment tags:
- Shaped rewards: benchmark_4706d8d host_192-9-146-21 branch_selfplay v0.0.9
- Win-loss, gamma 0.999: benchmark_4706d8d host_152-70-115-196 branch_selfplay v0.0.9
- Shaped rewards decay, gamma 0.999: benchmark_08664bf host_192-9-250-82 branch_selfplay v0.0.9
- Shaped rewards decay, gamma decay 0.99-0.999: benchmark_f7c6f26 host_192-9-151-120 branch_selfplay v0.0.9
- Shaped rewards decay, gamma decay, 4000 train max_steps, 300,000 save_steps, 6000 swap_steps: benchmark_9ba0ab5 host_192-9-155-233 branch_main v0.0.9