Upup-ashton-wang's group workspace
Group: SAE Hookpoints Ablation
Runs 1-20 of 23 (23 visualized)

| State | Notes | User | Runtime | Sweep | base_model_name | batch_size | distill_type |
|---|---|---|---|---|---|---|---|
| Finished | - | upup-ashton-wang | 22m 12s | - | DeepSeek-R1-Distill-Qwen-1.5B | 1 | sft_r1_distill |
| Finished | - | upup-ashton-wang | 22m 55s | - | DeepSeek-R1-Distill-Qwen-1.5B | 1 | sft_r1_distill |
| Finished | - | upup-ashton-wang | 23m 8s | - | DeepSeek-R1-Distill-Qwen-1.5B | 1 | sft_r1_distill |
| Finished | - | upup-ashton-wang | 24m | - | DeepSeek-R1-Distill-Qwen-1.5B | 1 | sft_r1_distill |
| Finished | - | upup-ashton-wang | 24m 35s | - | DeepSeek-R1-Distill-Qwen-1.5B | 1 | sft_r1_distill |
| Finished | - | upup-ashton-wang | 25m 3s | - | DeepSeek-R1-Distill-Qwen-1.5B | 1 | sft_r1_distill |
| Finished | - | upup-ashton-wang | 25m 31s | - | DeepSeek-R1-Distill-Qwen-1.5B | 1 | sft_r1_distill |
| Finished | - | upup-ashton-wang | 26m 41s | - | DeepSeek-R1-Distill-Qwen-1.5B | 1 | sft_r1_distill |
| Finished | - | upup-ashton-wang | 27m 22s | - | DeepSeek-R1-Distill-Qwen-1.5B | 1 | sft_r1_distill |
| Finished | - | upup-ashton-wang | 27m 48s | - | DeepSeek-R1-Distill-Qwen-1.5B | 1 | sft_r1_distill |
| Finished | - | upup-ashton-wang | 28m 20s | - | DeepSeek-R1-Distill-Qwen-1.5B | 1 | sft_r1_distill |
| Finished | - | upup-ashton-wang | 29m 40s | - | DeepSeek-R1-Distill-Qwen-1.5B | 1 | sft_r1_distill |
| Finished | - | upup-ashton-wang | 31m 33s | - | DeepSeek-R1-Distill-Qwen-1.5B | 1 | sft_r1_distill |
| Finished | - | upup-ashton-wang | 31m 34s | - | DeepSeek-R1-Distill-Qwen-1.5B | 1 | sft_r1_distill |
| Finished | - | upup-ashton-wang | 32m 9s | - | DeepSeek-R1-Distill-Qwen-1.5B | 1 | sft_r1_distill |
| Finished | - | upup-ashton-wang | 33m 1s | - | DeepSeek-R1-Distill-Qwen-1.5B | 1 | sft_r1_distill |
| Finished | - | upup-ashton-wang | 33m 24s | - | DeepSeek-R1-Distill-Qwen-1.5B | 1 | sft_r1_distill |
| Finished | - | upup-ashton-wang | 33m 56s | - | DeepSeek-R1-Distill-Qwen-1.5B | 1 | sft_r1_distill |
| Finished | - | upup-ashton-wang | 34m 52s | - | DeepSeek-R1-Distill-Qwen-1.5B | 1 | sft_r1_distill |
| Finished | - | upup-ashton-wang | 35m 7s | - | DeepSeek-R1-Distill-Qwen-1.5B | 1 | sft_r1_distill |

Columns with no value in any listed row: Name, Tags, Created.

Config columns showing "-" in every listed row: _attn_implementation_autoset, _name_or_path, accelerator_config.even_batches, accelerator_config.non_blocking, accelerator_config.split_batches, accelerator_config.use_seedable_sampler, adafactor, adam_beta1, adam_beta2, adam_epsilon, add_cross_attention, architectures, attention_dropout, auto_find_batch_size, average_tokens_across_devices, batch_eval_metrics, bf16, bf16_full_eval, bos_token_id, chars_per_token, chunk_size_feed_forward, dataloader_drop_last, dataloader_num_workers, dataloader_persistent_workers, dataloader_pin_memory, dataset_text_field, ddp_timeout, debug, disable_tqdm, diversity_penalty, do_eval, do_predict, do_sample, do_train, early_stopping, encoder_no_repeat_ngram_size, eos_token_id, eval_delay, eval_do_concat_batches, eval_on_start, eval_strategy.