Quad-Swarm-RL
Andrew Zhang
Created on July 21 | Last edited on July 22
Below are the reward plots for runs trained with both input normalization and return normalization enabled.
[Plot: reward/reward vs. global_step (x-axis ticks 200M to 800M; y-axis ticks -100 to 0)]
[Plot: policy_stats/avg_rew_crash vs. global_step (x-axis ticks 200M to 800M; y-axis ticks -6 to -2)]
Runs shown in both plots:
00_baseline_see_0_20220720_104135_336414
01_baseline_see_1111_20220720_104137_983199
02_baseline_see_2222_20220720_104137_992489
03_baseline_see_3333_20220720_104137_993357
Run set: 4 runs
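For readers unfamiliar with the terms, input and return normalization are commonly implemented by tracking a running mean and variance and standardizing values with them. The sketch below is a generic illustration of that idea, not the code used for these runs; the class name `RunningMeanStd` and its methods are hypothetical and are not taken from Sample Factory.

```python
import numpy as np


class RunningMeanStd:
    """Hypothetical helper: running mean/variance via the parallel update rule."""

    def __init__(self, shape=(), eps=1e-8):
        self.mean = np.zeros(shape, dtype=np.float64)
        self.var = np.ones(shape, dtype=np.float64)
        self.count = 0.0
        self.eps = eps  # avoids division by zero for constant inputs

    def update(self, batch):
        """Fold a batch (first axis = samples) into the running statistics."""
        batch = np.asarray(batch, dtype=np.float64)
        batch_mean = batch.mean(axis=0)
        batch_var = batch.var(axis=0)
        batch_count = batch.shape[0]

        delta = batch_mean - self.mean
        total = self.count + batch_count
        # Combine the sums of squared deviations of the old data and the batch.
        m2 = (
            self.var * self.count
            + batch_var * batch_count
            + delta ** 2 * self.count * batch_count / total
        )
        self.mean = self.mean + delta * batch_count / total
        self.var = m2 / total
        self.count = total

    def normalize(self, x):
        """Standardize x with the current running statistics."""
        return (np.asarray(x, dtype=np.float64) - self.mean) / np.sqrt(self.var + self.eps)

    def denormalize(self, x):
        """Invert normalize(), e.g. to map a normalized value back to return scale."""
        return np.asarray(x, dtype=np.float64) * np.sqrt(self.var + self.eps) + self.mean


# Input normalization: one statistic per observation dimension (3 here, arbitrary).
obs_stats = RunningMeanStd(shape=(3,))
# Return normalization: a single scalar statistic over returns.
ret_stats = RunningMeanStd(shape=())
```

In this sketch, one instance would be updated with batches of observations and applied before the policy network, while a second would be updated with batches of returns so that value targets stay at roughly unit scale.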
For comparison, here are some plots without normalization.
Run set: 8 runs