openrlhf_train_ppo Table – Weights & Biases

Andreaskoepf's workspace

Runs: 2

Finished | - | andreaskoepf | 7mo ago | 22h 39m 5s | - | false | 5.0000e-7 | [0.9,0.95] | true | gae | true | 0 | true | ./ckpt/checkpoints_ppo | 0.000009 | meta-llama/Llama-3.2-3B-Instruct | false | false | false | 0.2 | -1 | true | -1 | 1 | 1024 | true | false | 0.01 | question | 0 | 1 | false | false | 0 | 1 | 16 | 0 | 0 | 0.03 | 100000000 | 3 | 1 | 1 | 100000 | 4 | 2 | 1 | true | 1

Crashed | - | andreaskoepf | 7mo ago | 47m 5s | - | false | 5.0000e-7 | [0.9,0.95] | true | gae | true | 0 | true | ./ckpt/checkpoints_ppo | 0.000009 | meta-llama/Llama-3.2-1B-Instruct | false | false | false | 0.2 | -1 | true | -1 | 1 | 1024 | true | false | 0.01 | question | 0 | 1 | false | false | 0 | 1 | 16 | 0 | 0 | 0.03 | 100000000 | 3 | 1 | 1 | 100000 | 4 | 2 | 1 | true | 1
