llychinalz / Flash-GSM8K
Llychinalz's workspace
Runs (12 total, 6 visualized)
FP8rollout-ppo (prob-diff agnostic)
FP8rollout-ppo w. using p_vllm as pi_old (no recompute variant for TIS)
FP8rollout-ppo w. vanilla-IS (no truncation variant for TIS)
FP8rollout-ppo w. TIS-2 (prob-diff aware)
BF16rollout-ppo w. vanilla-IS (no truncation variant for TIS)
BF16rollout-ppo w. using p_vllm as pi_old (no recompute variant for TIS)
BF16rollout-ppo w. TIS-2 (prob-diff aware)
INT8rollout-ppo w. using p_vllm as pi_old (no recompute variant for TIS)
INT8rollout-ppo w. vanilla-IS (no truncation variant for TIS)
INT8rollout-ppo w. TIS-2 (prob-diff aware)
INT8rollout-ppo (prob-diff agnostic)
BF16rollout-ppo (prob-diff agnostic)
Core Monitor (5 pinned panels, each plotted against Step)
[Plot: val-core-bf16/openai/gsm8k/reward/mean@1 vs. Step]
[Plot: training/vllm_kl vs. Step]
[Plot: actor/pg_clipfrac vs. Step]
[Plot: actor/pg_clipfrac_lower vs. Step]
[Plot: training/rollout_probs_diff_max vs. Step]
Other panel sections
val-core-bf16 (0 panels)
actor (7 panels)
critic (21 panels)
global_seqlen (6 panels)
perf (8 panels)