Yiwen_hu's workspace
Runs
760
Name
Name: log/miniyulan-0.1B-wesar-cerebras-lr-hs-nona-new-stage4-640
56
Name: log/miniyulan-0.1B-wesar-cerebras-lr-hs-nona-new-stage6-643
56
Name: log/miniyulan-0.1B-wesar-cerebras-lr-hs-nona-new-stage3-639
56
Name: log/miniyulan-0.1B-wesar-cerebras-lr-hs-nona-new-stage5-from-43k-642
56
Name: log/miniyulan-0.1B-wesar-cerebras-lr-hs-nona-new-stage2-632
56
Name: log/miniyulan-0.1B-wesar-cerebras-lr-hs-nona-new-stage5-641
56
Name: log/miniyulan-0.1B-wesar-cerebras-lr-hs-nona-new-stage2-630
56
Name: log/miniyulan-0.1B-wesar-cerebras-lr-hs-nona-new-stage2-try-626
32
Name: log/miniyulan-0.1B-wesar-cerebras-lr-hs-nona-new-from-7k-625
32
Name: log/miniyulan-0.1B-wesar-cerebras-lr-hs-nona-new-621
32
Name: log/miniyulan-0.1B-520-wesar-cerebras-lr-hiddenshrink-stage2-592
16
Name: log/miniyulan-0.1B-520-wesar-cerebras-lr-hiddenshrink-from-4k-582
16
Name: log/miniyulan-0.1B-520-wesar-cerebras-lr-hiddenshrink-567
16
Name: log/miniyulan-0.1B-520-noemb-lns02-555
16
Name: log/miniyulan-0.1B-520-noemb-lns5-553
16
Name: log/miniyulan-0.1B-520-531
32
Name: log/smaller1.11-stage1-cerebras-tie-hidden.shrink-lr001-init-520
56
Name: log/smaller1.11-stage1-reproduce-qkln-warmup-492
56
Name: log/smaller1.11-stage1-reproduce-qkln-469
16
Name: log/smaller1.11-stage1-reproduce-436
16
State
Notes
User
Tags
Created
Runtime
Sweep
_n_gpu
_name_or_path
accelerator_config.even_batches
accelerator_config.non_blocking
accelerator_config.split_batches
accelerator_config.use_seedable_sampler
adafactor
adam_beta1
adam_beta2
adam_epsilon
add_cross_attention
architectures
attention_bias
attention_dropout
auto_find_batch_size
batch_eval_metrics
bf16
bf16_full_eval
bos_token_id
chunk_size_feed_forward
config_dtype
data_path
dataloader_drop_last
dataloader_num_workers
dataloader_persistent_workers
dataloader_pin_memory
dataloader_prefetch_factor
ddp_timeout
debug
deepspeed
deepspeed_gradient_checkpointing
dim_ffn_base_init
dim_model_base
dim_model_base_attn
dim_model_base_init
dim_model_base_lmh
dim_model_base_logits
dim_model_base_lr
disable_tqdm
diversity_penalty
do_eval
do_predict
do_sample
do_train
Finished
-
3151273556
19m 4s
-
1
/fs/archive/share/yulan/data/aa_mini/output/miniyulan-0.1B-wesar-cerebras-lr-hs-nona-new-stage3/checkpoint-29235-rms_norm
true
false
false
true
false
0.9
0.95
1.0000e-15
false
MiniYuLanModelForCausalLM
true
0
false
false
true
false
1
0
bfloat16
/fs/archive/share/yulan/data/aa_mini/hf_dataset/myl_new/20241013_145520
false
4
false
true
2
3600
-
/home/u20140041/pretrain-mini/model/config/ds2_config_adamw_kd.json
false
-
576
64
-
1
576
256
false
0
false
false
false
false
Finished
-
3151273556
17m 22s
-
1
-
-
-
-
-
false
0.9
0.95
1.0000e-15
-
-
true
-
false
false
true
false
-
-
bfloat16
/fs/archive/share/yulan/data/aa_mini/hf_dataset/myl_new/20241016_122150
false
4
false
true
2
3600
-
/home/u20140041/pretrain-mini/model/config/ds2_config_adamw_kd.json
false
-
-
-
-
-
576
256
false
-
false
false
-
false
Finished
-
3151273556
19m 5s
-
1
/fs/archive/share/yulan/data/aa_mini/output/miniyulan-0.1B-wesar-cerebras-lr-hs-nona-new-stage2/checkpoint-19468-rms_norm
true
false
false
true
false
0.9
0.95
1.0000e-15
false
MiniYuLanModelForCausalLM
true
0
false
false
true
false
1
0
bfloat16
/fs/archive/share/yulan/data/aa_mini/hf_dataset/myl_new/20241013_145404
false
4
false
true
2
3600
-
/home/u20140041/pretrain-mini/model/config/ds2_config_adamw_kd.json
false
-
576
64
-
1
576
256
false
0
false
false
false
false
Finished
-
3151273556
16m 48s
-
1
/fs/archive/share/yulan/data/aa_mini/output/miniyulan-0.1B-wesar-cerebras-lr-hs-nona-new-stage5/checkpoint-43000
true
false
false
true
false
0.9
0.95
1.0000e-15
false
MiniYuLanModelForCausalLM
true
0
false
false
true
false
1
0
bfloat16
/fs/archive/share/yulan/data/aa_mini/hf_dataset/myl_new/20241013_145601
false
4
false
true
2
3600
-
/home/u20140041/pretrain-mini/model/config/ds2_config_adamw_kd.json
false
-
576
64
-
1
576
256
false
0
false
false
false
false
Finished
-
3151273556
18m 58s
-
1
/fs/archive/share/yulan/data/aa_mini/output/miniyulan-0.1B-wesar-cerebras-lr-hs-nona-new-from-7k/checkpoint-9573-rms_norm
true
false
false
true
false
0.9
0.95
1.0000e-15
false
MiniYuLanModelForCausalLM
true
0
false
false
true
false
1
0
bfloat16
/fs/archive/share/yulan/data/aa_mini/hf_dataset/myl_new/20241013_145248
false
4
false
true
2
3600
-
/home/u20140041/pretrain-mini/model/config/ds2_config_adamw_kd.json
false
-
576
64
-
1
576
256
false
0
false
false
false
false
Finished
-
3151273556
16m 30s
-
1
-
-
-
-
-
false
0.9
0.95
1.0000e-15
-
-
true
-
false
false
true
false
-
-
bfloat16
/fs/archive/share/yulan/data/aa_mini/hf_dataset/myl_new/20241013_145601
false
4
false
true
2
3600
-
/home/u20140041/pretrain-mini/model/config/ds2_config_adamw_kd.json
false
-
-
-
-
-
576
256
false
-
false
false
-
false
Finished
-
3151273556
14m 48s
-
1
-
-
-
-
-
false
0.9
0.95
1.0000e-15
-
-
true
-
false
false
true
false
-
-
bfloat16
/fs/archive/share/yulan/data/aa_mini/hf_dataset/myl_new/20241013_145248
false
4
false
true
2
3600
-
/home/u20140041/pretrain-mini/model/config/ds2_config_adamw_kd.json
false
-
-
-
-
-
576
256
false
-
false
false
-
false
Finished
-
3151273556
9m 41s
-
1
-
-
-
-
-
false
0.9
0.95
1.0000e-15
-
-
true
-
false
false
true
false
-
-
bfloat16
/fs/archive/share/yulan/data/aa_mini/hf_dataset/myl_new/20241013_145248
false
4
false
true
2
3600
-
/home/u20140041/pretrain-mini/model/config/ds2_config_adamw_kd.json
false
-
-
-
-
-
576
256
false
-
false
false
-
false
Finished
-
3151273556
9m 53s
-
1
/fs/archive/share/yulan/data/aa_mini/output/miniyulan-0.1B-wesar-cerebras-lr-hs-nona-new/checkpoint-7000
true
false
false
true
false
0.9
0.95
1.0000e-15
false
MiniYuLanModelForCausalLM
true
0
false
false
true
false
1
0
bfloat16
/fs/archive/share/yulan/data/aa_mini/hf_dataset/myl_new/20241013_144711
false
4
false
true
2
3600
-
/home/u20140041/pretrain-mini/model/config/ds2_config_adamw_kd.json
false
-
576
64
-
1
576
256
false
0
false
false
false
false
Finished
-
3151273556
3h 6m 57s
-
1
-
-
-
-
-
false
0.9
0.95
1.0000e-15
-
-
true
-
false
false
true
false
-
-
bfloat16
/fs/archive/share/yulan/data/aa_mini/hf_dataset/myl_new/20241013_144711
false
4
false
true
2
3600
-
/home/u20140041/pretrain-mini/model/config/ds2_config_adamw_kd.json
false
-
-
-
-
-
576
256
false
-
false
false
-
false
Finished
-
3151273556
10h 40m 24s
-
1
/fs/archive/share/yulan/data/aa_mini/output/miniyulan-0.1B-520-wesar-cerebras-lr-hiddenshrink-from-4k/checkpoint-9577
true
false
false
true
false
0.9
0.95
1.0000e-15
false
MiniYuLanModelForCausalLM
true
0
false
false
true
false
1
0
bfloat16
/fs/archive/share/yulan/data/aa_mini/hf_dataset/myl/20241004_160729
false
4
false
true
2
3600
-
/home/u20140041/pretrain-mini/model/config/ds2_config_adamw_kd.json
false
-
576
64
-
1
576
256
false
0
false
false
false
false
Finished
-
3151273556
4h 20m 35s
-
1
/fs/archive/share/yulan/data/aa_mini/output/miniyulan-0.1B-520-wesar-cerebras-lr-hiddenshrink/checkpoint-4000
true
false
false
true
false
0.9
0.95
1.0000e-15
false
MiniYuLanModelForCausalLM
true
0
false
false
true
false
1
0
bfloat16
/fs/archive/share/yulan/data/aa_mini/hf_dataset/myl/20240930_230656
false
4
false
true
2
3600
-
/home/u20140041/pretrain-mini/model/config/ds2_config_adamw_kd.json
false
-
576
64
-
1
576
256
false
0
false
false
false
false
Finished
-
3151273556
13h 28m 8s
-
1
-
-
-
-
-
false
0.9
0.95
1.0000e-15
-
-
true
-
false
false
true
false
-
-
bfloat16
/fs/archive/share/yulan/data/aa_mini/hf_dataset/myl/20240930_230656
false
4
false
true
2
3600
-
/home/u20140041/pretrain-mini/model/config/ds2_config_adamw_kd.json
false
-
-
-
-
-
576
256
false
-
false
false
-
false
Finished
-
3151273556
3m 34s
-
1
-
-
-
-
-
false
0.9
0.95
1.0000e-15
-
-
true
-
false
false
true
false
-
-
bfloat16
/fs/archive/share/yulan/data/aa_mini/hf_dataset/myl/20240930_230656
false
4
false
true
2
3600
-
/home/u20140041/pretrain-mini/model/config/ds2_config_adamw_kd.json
false
-
256
1
256
-
-
-
false
-
false
false
-
false
Finished
-
3151273556
3h 3m 51s
-
1
-
-
-
-
-
false
0.9
0.95
1.0000e-15
-
-
true
-
false
false
true
false
-
-
bfloat16
/fs/archive/share/yulan/data/aa_mini/hf_dataset/myl/20240930_230656
false
4
false
true
2
3600
-
/home/u20140041/pretrain-mini/model/config/ds2_config_adamw_kd.json
false
-
256
1
256
-
-
-
false
-
false
false
-
false
Finished
-
3151273556
6h 37m 58s
-
1
-
-
-
-
-
false
0.9
0.95
1.0000e-15
-
-
true
-
false
false
true
false
-
-
bfloat16
/fs/archive/share/yulan/data/aa_mini/hf_dataset/myl/20240930_230656
false
4
false
true
2
3600
-
/home/u20140041/pretrain-mini/model/config/ds2_config_adamw_kd.json
false
-
256
1
256
-
-
-
false
-
false
false
-
false
Finished
-
3151273556
2h 41m 55s
-
1
-
-
-
-
-
false
0.9
0.95
1.0000e-8
-
-
true
-
false
false
true
false
-
-
bfloat16
/fs/archive/share/yulan/data/aa_mini/hf_dataset/myl/20240930_230656
false
4
false
true
2
3600
-
/home/u20140041/pretrain-mini/model/config/ds2_config_adamw_kd.json
false
-
256
1
256
-
-
-
false
-
false
false
-
false
Finished
-
3151273556
11m 21s
-
1
-
-
-
-
-
false
0.9
0.95
1.0000e-15
-
-
true
-
false
false
true
false
-
-
bfloat16
/fs/archive/share/yulan/data/aa_mini/hf_dataset/myl/20240930_230656
false
4
false
true
2
3600
-
/home/u20140041/pretrain-mini/model/config/ds2_config_adamw_kd.json
false
-
-
-
-
-
-
-
false
-
false
false
-
false
Finished
-
3151273556
21m 3s
-
1
/fs/archive/share/yulan/data/aa_mini/output/smaller1.11-stage1-reproduce-qkln/checkpoint-0
true
false
false
true
false
0.9
0.95
1.0000e-15
false
MiniYuLanModelForCausalLM
true
0
false
false
true
false
1
0
bfloat16
/fs/archive/share/yulan/data/aa_mini/hf_dataset/myl/20240930_230656
false
4
false
true
2
3600
-
/home/u20140041/pretrain-mini/model/config/ds2_config_adamw_kd.json
false
1
576
64
-
1
-
-
false
0
false
false
false
false
Finished
-
3151273556
1d 3h 32m 9s
-
1
/fs/archive/share/yulan/data/aa_mini/output/smaller1.11-stage1-reproduce/checkpoint-0
true
false
false
true
false
0.9
0.95
1.0000e-15
false
MiniYuLanModelForCausalLM
true
0
false
false
true
false
1
0
bfloat16
/fs/archive/share/yulan/data/aa_mini/hf_dataset/myl/20240930_230656
false
4
false
true
2
3600
-
/home/u20140041/pretrain-mini/model/config/ds2_config_adamw_kd.json
false
1
576
64
-
1
-
-
false
0
false
false
false
false