SearchR1_archive Table – Weights & Biases

Chenmientan's workspace

Runs: 1

Crashed

-

chenmientan

3mo ago

16h 35m 45s

-

0.2

0.0001

0

true

0

k2

k1

0.000001

1

8192

Qwen/Qwen2.5-3B-Instruct

true

true

Qwen/Qwen2.5-3B-Instruct

envs/searchr1.py

0.5

0

1

4

512

0

1

512

1

ckpts/qwen2.5-3b-inst_reinforce/actor

true

1

1

0.01

reinforce

1

1

false

0.5

true

0.000005

1

8192

Qwen/Qwen2.5-3B-Instruct

true

true

ckpts/qwen2.5-3b-inst_reinforce/critic

true

1
