Jbloom's workspace · Runs: 28 in workspace, showing 1-4 of 4 matching runs

All four runs: State = Finished, User = jbloom, Sweep = -, wandb_project = mats_sae_training_gpt2_ghost_grad_experiment.

Shared configuration (identical across all four runs):

| Parameter | Value |
|---|---|
| b_dec_init_method | mean |
| cached_activations_path | activations/apollo-research_sae-Skylion007-openwebtext-tokenizer-gpt2/gpt2-small/blocks.3.hook_resid_pre |
| context_size | 128 |
| d_in | 768 |
| d_sae | 12288 |
| dataset_path | apollo-research/sae-Skylion007-openwebtext-tokenizer-gpt2 |
| dead_feature_estimation_method | - |
| dead_feature_threshold | 1e-8 |
| dead_feature_window | 5000 |
| device | cuda |
| dtype | torch.float32 |
| expansion_factor | 16 |
| feature_reinit_scale | - |
| feature_sampling_window | 1000 |
| hook_point | blocks.3.hook_resid_pre |
| hook_point_layer | 3 |
| is_dataset_tokenized | true |
| l1_coefficient | 0.00008 |
| log_to_wandb | true |
| lr | 0.0004 |
| lr_scheduler_name | constantwithwarmup |
| lr_warm_up_steps | 10000 |
| model_name | gpt2-small |
| mse_loss_normalization | variance |
| n_batches_in_buffer | 128 |
| n_checkpoints | 10 |
| normalize_activations | false |
| resample_batches | - |
| seed | 42 |
| store_batch_size | 32 |
| tokens_per_buffer | 67108864 |
| train_batch_size | 4096 |
| use_cached_activations | false |
| use_ghost_grads | - |
| wandb_log_frequency | 100 |

Per-run settings:

| Parameter | Run 1 | Run 2 | Run 3 | Run 4 |
|---|---|---|---|---|
| Runtime | 1h 25m 11s | 2h 10m 22s | 1h 25m 29s | 2h 23m 37s |
| checkpoint_path | checkpoints/ysoc8r4p | checkpoints/penlplwz | checkpoints/0bli492w | checkpoints/ylits63n |
| run_name | 12288-L1-8e-05-LR-0.0004-Tokens-2.000e+08 | 12288-L1-8e-05-LR-0.0004-Tokens-2.000e+08 | 12288-L1-8e-05-LR-0.0004-Tokens-3.000e+08 | 12288-L1-8e-05-LR-0.0004-Tokens-3.000e+08 |
| adam_beta1 | 0 | 0 | 0.9 | 0 |
| adam_beta2 | 0.9999 | 0.9999 | 0.999 | 0.9999 |
| fine_tune_tokens | 100000000 | 100000000 | 0 | 0 |
| finetuning_method | decoder | decoder | - | decoder |
| total_training_tokens | 200000000 | 200000000 | 300000000 | 300000000 |
| use_pre_encoder_bias | false | true | false | true |
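For reference, the shared hyperparameters above can be gathered into a plain Python dict, which also makes the derived quantities easy to sanity-check: the SAE width `d_sae` is `expansion_factor * d_in` (16 × 768 = 12288), and the optimizer-step count is `total_training_tokens // train_batch_size`. This is an illustrative sketch, not the actual training code; the dict keys simply mirror the logged column names.

```python
# Illustrative reconstruction of the configuration shared by all four runs.
# Plain key/value pairs mirroring the wandb column names; the real training
# code stores these on a config object rather than a dict.
shared_config = {
    "model_name": "gpt2-small",
    "hook_point": "blocks.3.hook_resid_pre",
    "hook_point_layer": 3,
    "d_in": 768,
    "expansion_factor": 16,
    "d_sae": 12288,
    "context_size": 128,
    "l1_coefficient": 8e-5,
    "lr": 4e-4,
    "lr_scheduler_name": "constantwithwarmup",
    "lr_warm_up_steps": 10_000,
    "train_batch_size": 4096,
    "n_batches_in_buffer": 128,
    "store_batch_size": 32,
    "tokens_per_buffer": 67_108_864,
    "dead_feature_window": 5000,
    "dead_feature_threshold": 1e-8,
    "seed": 42,
}

# d_sae is the dictionary size of the SAE: expansion_factor * d_in.
assert shared_config["d_sae"] == shared_config["expansion_factor"] * shared_config["d_in"]

# Optimizer steps for the 200M-token runs (runs 1 and 2).
steps_200m = 200_000_000 // shared_config["train_batch_size"]
print(steps_200m)  # 48828
```

The 300M-token runs (3 and 4) work out the same way, just with `300_000_000` in the numerator.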