yipingwanguw
Projects
verl_few_shot
Yipingwang22's workspace
Personal workspace
Runs (21 total, 2 visualized)
dpsk_distill_1.5b, 1-shot RLVR with pi_1
dpsk_distill_1.5b, 16-shot RLVR with pi_{1},..., pi_{16}
dpsk_distill_1.5b, 4-shot RLVR with pi_{1}, pi_{2}, pi_{13}, pi_{1209}}
dpsk_distill_1.5b, RLVR with DSR-sub
Qwen2.5-Math-1.5B, 1-shot RLVR with pi_{13}
Qwen2.5-Math-1.5B, 1-shot RLVR with pi_{13}
Qwen2.5-Math-1.5B, 1-shot RLVR with pi_{13}
Qwen2.5-Math-1.5B, 1-shot RLVR with pi_{1}
Qwen2.5-Math-1.5B, 1-shot RLVR with pi_{1}
Qwen2.5-Math-1.5B, 2-shot RLVR with pi_{1}, pi_{13}
Qwen2.5-Math-1.5B, RLVR with DSR-sub
Qwen2.5-Math-1.5B, RLVR with DSR-sub
Qwen2.5-Math-1.5B-pi1_r128
Qwen2.5-Math-7B, 1-shot RLVR with pi_1
Qwen2.5-Math-7B, 1-shot RLVR with pi_1
Qwen2.5-Math-7B, 16-shot RLVR with pi_{1},..., pi_{16}
Qwen2.5-Math-7B, 2-shot RLVR with pi_{1}, pi_{13}
Qwen2.5-Math-7B, 2-shot RLVR with pi_{1}, pi_{13}
Qwen2.5-Math-7B, 4-shot RLVR with pi_{1}, pi_{2}, pi_{13}, pi_{1209}}
Qwen2.5-Math-7B, RLVR with DSR-sub
(runs 1-20 of 21 listed)
actor (8 panels, 6 shown; x-axis: Step, up to 2k)
actor/ppo_kl
actor/pg_loss
actor/pg_clipfrac
actor/lr
actor/kl_loss
actor/kl_coef
critic (12 panels)
global_seqlen (6 panels)
mfu (1 panel)
prompt_length (4 panels)
response_length (4 panels)