ddpo-aesthetic-ddpm-celebahq256 Table – Weights & Biases

Alcazar90's workspace

Runs

113

KL (current vs old policy)

average loss

batch

epoch

eval_mean_reward

inner_epoch

loss

max_reward

mean_reward

min_reward

pct_clipped_ratios

std_reward

learning_rate

Finished

alcazar90

2y ago

1h 53m 29s

-

10

10

0.0001

google/ddpm-celebahq-256

1

650

-

10

25

64

40

1

100

-

92013491249214130

Task.LAION

0.0001

-

-

9.0000e-8

-

0.0000374

-

true

-

-

0.25

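KL (current vs old policy): 1515.60046

average loss: 0.34806

batch: 9

epoch: 24

eval_mean_reward: 5.76955

inner_epoch: 0

loss: 0.407

max_reward: 6.52627

mean_reward: 5.79663

min_reward: 4.3818

pct_clipped_ratios: 1

std_reward: 0.29944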

-

1.1610e-8
