Jkminder's workspace
Runs (6)
| Field | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Run 6 |
|---|---|---|---|---|---|---|
| Name | | | | | | |
| State | Finished | Finished | Finished | Finished | Finished | Finished |
| Notes | | | | | | |
| User | jkminder | jkminder | jkminder | jkminder | jkminder | jkminder |
| Tags | | | | | | |
| Created | | | | | | |
| Runtime | 19h 43m 42s | 10h 9m 3s | 4h 38m 47s | 4h 56m 11s | 9h 20m 48s | 3h 16m 22s |
| Sweep | - | - | - | - | - | - |
| activation_dim | 4096 | 4096 | 2048 | 2048 | 2304 | 2304 |
| auxk_alpha | 0.03125 | - | - | 0.03125 | 0.03125 | - |
| code_normalization | CROSSCODER | CROSSCODER | CROSSCODER | CROSSCODER | CROSSCODER | - |
| code_normalization_alpha_cc | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | - |
| code_normalization_alpha_sae | 1 | 1 | 1 | 1 | 1 | - |
| device | cuda | cuda | cuda | cuda | cuda | cuda |
| dict_class | BatchTopKCrossCoder | CrossCoder | CrossCoder | BatchTopKCrossCoder | BatchTopKCrossCoder | CrossCoder |
| dict_class_kwargs.code_normalization | crosscoder | - | - | crosscoder | crosscoder | - |
| dict_class_kwargs.code_normalization_alpha_cc | 0.1 | - | - | 0.1 | 0.1 | - |
| dict_class_kwargs.code_normalization_alpha_sae | 1.0 | - | - | 1.0 | 1 | - |
| dict_class_kwargs.encoder_layers | None | - | - | None | - | - |
| dict_class_kwargs.init_with_transpose | True | - | - | True | true | - |
| dict_class_kwargs.norm_init_scale | 0.3 | - | - | 1.0 | 1 | - |
| dict_class_kwargs.same_init_for_all_layers | True | - | - | True | true | - |
| dict_size | 131072 | 131072 | 65536 | 65536 | 73728 | 73728 |
| k | 1000 | - | - | 100 | 100 | - |
| k_annealing_steps | 5000 | - | - | - | - | - |
| k_initial | 1000 | - | - | - | - | - |
| k_target | 200 | - | - | - | - | - |
| l1_penalty | - | 0.021 | 0.036 | - | - | 0.041 |
| layer | 16 | 16 | 8 | 8 | 13 | 13 |
| lm_name | meta-llama/Meta-Llama-3.1-8B-Instruct-meta-llama/Meta-Llama-3.1-8B | meta-llama/Meta-Llama-3.1-8B-Instruct-meta-llama/Meta-Llama-3.1-8B | meta-llama/Llama-3.2-1B-Instruct-meta-llama/Llama-3.2-1B | meta-llama/Llama-3.2-1B-Instruct-meta-llama/Llama-3.2-1B | google/gemma-2-2b-it-google/gemma-2-2b | google/gemma-2-2b-it-google/gemma-2-2b |
| lr | 0.0001 | 0.0001 | 0.0001 | 0.0001 | 0.0001 | 0.0001 |
| steps | 97656 | - | - | 97656 | 97656 | - |
| threshold_beta | 0.999 | - | - | 0.999 | 0.999 | - |
| threshold_start_step | 1000 | - | - | 1000 | 1000 | - |
| top_k_aux | 2048 | - | - | 1024 | 1152 | - |
| trainer_class | BatchTopKCrossCoderTrainer | CrossCoderTrainer | CrossCoderTrainer | BatchTopKCrossCoderTrainer | BatchTopKCrossCoderTrainer | CrossCoderTrainer |
| use_mse_loss | - | false | false | - | - | - |
| wandb_name | Meta-Llama-3.1-8B-L16-k200-lr1e-04-local-shuffling-Crosscoder-ni0.3-ka1k5k | Meta-Llama-3.1-8B-L16-mu2.1e-02-lr1e-04-local-shuffling-CrosscoderLoss | Llama-3.2-1B-L8-mu3.6e-02-lr1e-04-local-shuffling-CrosscoderLoss | Llama-3.2-1B-L8-k100-lr1e-04-local-shuffling-Crosscoder | gemma-2-2b-L13-k100-lr1e-04-local-shuffling-Crosscoder | gemma-2-2b-L13-mu4.1e-02-lr1e-04-2x100M-local-shuffling-CCLoss |
| warmup_steps | 1000 | 1000 | 1000 | 1000 | 1000 | 1000 |
| step | 97656 | 97656 | 97656 | 97656 | 97656 | 97656 |
| train/auxk_loss | 1.00078 | - | - | 1.00063 | 1.0006 | - |
| train/cl0_frac_variance_explained | 0.84512 | 0.80946 | 0.79779 | 0.83666 | 0.86134 | 0.82116 |
| train/cl1_frac_variance_explained | 0.82993 | 0.79286 | 0.77257 | 0.8167 | 0.86924 | 0.8305 |
| train/effective_l0 | 200 | - | - | 100 | 100 | - |
| train/frac_deads | 0.56441 | 0.33459 | 0.26848 | 0.6125 | 0.63411 | 0.33816 |
| train/frac_variance_explained | 0.83775 | 0.80141 | 0.78439 | 0.82606 | 0.86596 | 0.82663 |
| train/k_current_value | 200 | - | - | - | - | - |
| train/l0 | 200 | 199.13672 | 109.62842 | 100 | 100 | 96.64893 |
| train/l2_loss | 4.38344 | 4.83243 | 2.31492 | 2.08872 | 61.60526 | 69.57137 |
| train/loss | 4.41471 | 6.18061 | 3.06884 | 2.11999 | 61.63652 | 93.3694 |
| train/mse_loss | 20.00297 | 24.47817 | 5.6185 | 4.53313 | 3961.23779 | 5128.38477 |
| train/pre_norm_auxk_loss | 40.00594 | - | - | 9.06627 | 7922.45703 | - |
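For the three BatchTopKCrossCoder runs, the logged values appear consistent with train/loss being train/l2_loss plus auxk_alpha times train/auxk_loss. This is only an observation from the numbers above, not something the workspace states, and it does not apply to the CrossCoder (l1_penalty) runs, which log no auxk terms. A minimal Python sketch of the check, with the figures copied from the runs and the run labels shortened from their wandb_name for readability (the helper name `reconstructed_loss` is hypothetical):

```python
# Summary values copied from the three BatchTopKCrossCoder runs.
# Keys are shortened forms of each run's wandb_name.
runs = {
    "Meta-Llama-3.1-8B-L16-k200": {
        "auxk_alpha": 0.03125, "auxk_loss": 1.00078,
        "l2_loss": 4.38344, "loss": 4.41471,
    },
    "Llama-3.2-1B-L8-k100": {
        "auxk_alpha": 0.03125, "auxk_loss": 1.00063,
        "l2_loss": 2.08872, "loss": 2.11999,
    },
    "gemma-2-2b-L13-k100": {
        "auxk_alpha": 0.03125, "auxk_loss": 1.0006,
        "l2_loss": 61.60526, "loss": 61.63652,
    },
}


def reconstructed_loss(run):
    """Assumed relation (not confirmed by the source): l2_loss + auxk_alpha * auxk_loss."""
    return run["l2_loss"] + run["auxk_alpha"] * run["auxk_loss"]


for name, run in runs.items():
    diff = abs(reconstructed_loss(run) - run["loss"])
    # Logged values are rounded to five decimals, so small differences are expected.
    print(f"{name}: logged={run['loss']:.5f} "
          f"reconstructed={reconstructed_loss(run):.5f} diff={diff:.1e}")
```

Because every logged figure is rounded to five decimal places, agreement to within about 2e-5 is the most this check can show.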