Regression Report: wandb
[['?we=costa-huang&wpn=trl&ceik=tracker_project_name&cen=log_with&metrics=env/reward_mean&metrics=env/reward_std&metrics=objective/kl_coef&metrics=objective/kl&metrics=objective/entropy&metrics=ppo/std_scores&metrics=ppo/mean_scores&metrics=ppo/learning_rate&metrics=ppo/mean_non_score_reward&metrics=ppo/loss/value&metrics=ppo/loss/total&metrics=ppo/loss/policy&metrics=ppo/policy/advantages_mean&metrics=ppo/policy/approxkl&metrics=ppo/policy/clipfrac&metrics=ppo/policy/entropy&metrics=ppo/returns/mean&metrics=ppo/returns/var', 'wandb?tag=calculator_few_shots_env&tag=pr-429&cl=calculator_env (various improvement)']]
Costa
Created on June 15 | Last edited on June 15
[Figure: env/reward_mean (trl). X axis: Steps; Y axis: Episodic Return.]
[Figure: env/reward_std (trl). No data for the selected runs; current X axis: _step.]
[Figure: objective/kl_coef (trl). X axis: Steps; Y axis: Episodic Return.]
[Figure: objective/kl (trl). X axis: Steps; Y axis: Episodic Return.]
[Figure: objective/entropy (trl). X axis: Steps; Y axis: Episodic Return.]
[Figure: ppo/std_scores (trl). X axis: Steps; Y axis: Episodic Return.]
calculator_env (various improvement)
10