(with KL) GPT2 Learning functionally important features with end-to-end dictionary learning
Report containing Pareto frontiers for local SAEs, e2e SAEs, and e2e+ds SAEs
Created on August 22 | Last edited on August 22
See https://wandb.ai/sparsify/gpt2 for all runs used in the paper, including the appendices (with the exception of the tinystories-1m runs, which can be found at https://wandb.ai/sparsify/tinystories-1m-2). The runs on the Pareto frontier for each method can be found in the plots below or by filtering on the wandb run tags "pareto", "local", "e2e", and "e2eds".
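As a rough illustration of the tag-based lookup, the sketch below builds a filter for the wandb public API and fetches the matching runs. It assumes that Pareto-frontier runs carry both the "pareto" tag and their method tag, that the project is readable with your credentials, and that `wandb.Api().runs` accepts MongoDB-style `filters`. The helper names `build_tag_filter` and `fetch_runs` are hypothetical and not part of the report or the paper's code.

```python
ALLOWED_METHODS = {"local", "e2e", "e2eds"}  # method tags named in the report


def build_tag_filter(method, pareto_only=True):
    """Build a MongoDB-style filter dict selecting runs by their wandb tags.

    Hypothetical helper: assumes each run is tagged with its method and,
    if it lies on the Pareto frontier, additionally with "pareto".
    """
    if method not in ALLOWED_METHODS:
        raise ValueError(f"unknown method tag: {method!r}")
    clauses = [{"tags": {"$in": [method]}}]
    if pareto_only:
        clauses.append({"tags": {"$in": ["pareto"]}})
    return {"$and": clauses}


def fetch_runs(method, project="sparsify/gpt2", pareto_only=True):
    """Fetch matching runs; needs the wandb package and network access."""
    import wandb  # imported lazily so the filter helper works without wandb

    api = wandb.Api()
    return list(api.runs(project, filters=build_tag_filter(method, pareto_only)))


if __name__ == "__main__":
    # Show the filter that would be sent for the e2e+ds Pareto runs.
    print(build_tag_filter("e2eds"))
```

Calling `fetch_runs("local")` would then return the local-SAE runs tagged as Pareto-optimal, if the tagging assumption above holds. For the tinystories-1m runs, pass `project="sparsify/tinystories-1m-2"` instead.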
Blocks.2.hook_resid_pre (run set: 21)
Blocks.6.hook_resid_pre (run set: 23)
Blocks.10.hook_resid_pre (run set: 23)