A0970601776's workspace
Runs
11
Name
Tags
H100*8, adamw_torch_fused, batch_5, cutoff_len_1024, epochs_3, grad_acc_4, lr_5e-6, warmup_ratio_0.01, z3
H100*8, adamw_torch_fused, batch_4, cutoff_len_1024, epochs_60, grad_acc_4, lr_5e-6, warmup_ratio_0.01, z3
H100*8, adamw_torch_fused, batch_4, cutoff_len_1024, epochs_3, grad_acc_4, lr_5e-6, warmup_ratio_0.01, z3
H100*8, adamw_torch_fused, batch_4, cutoff_len_1024, epochs_3, grad_acc_4, lr_5e-6, warmup_ratio_0.01, z2
H100*8, adamw_torch_fused, batch_18, cutoff_len_1024, epochs_3, grad_acc_82, lr_5e-6, warmup_ratio_0.01, z2
H100*8, adamw_torch_fused, batch_18, cutoff_len_1024, epochs_3, grad_acc_82, lr_5e-6, warmup_ratio_0.01, z2
H100*8, adamw_torch_fused, batch_18, cutoff_len_1024, epochs_3, grad_acc_82, lr_5e-6, warmup_ratio_0.01, z2
H100*8, adamw_torch_fused, batch_18, cutoff_len_1024, epochs_10, grad_acc_82, lr_5e-6, warmup_ratio_0.01, z2
H100*8, adamw_torch_fused, batch_18, cutoff_len_1024, epochs_10, grad_acc_4, lr_5e-6, warmup_ratio_0.01, z2
H100*8, adamw_torch_fused, batch_18, cutoff_len_1024, epochs_10, grad_acc_2, lr_5e-6, warmup_ratio_0.01, z2
H100*8, adamw_torch_fused, batch_3, cutoff_len_1024, epochs_10, grad_acc_2, lr_5e-6, warmup_ratio_0.01, z2
State
Notes
User
Created
Runtime
Sweep
_attn_implementation_autoset
_name_or_path
accelerator_config.even_batches
accelerator_config.non_blocking
accelerator_config.split_batches
accelerator_config.use_seedable_sampler
adafactor
adam_beta1
adam_beta2
adam_epsilon
add_cross_attention
architectures
attention_bias
attention_dropout
auto_find_batch_size
batch_eval_metrics
bf16
bf16_full_eval
bos_token_id
chunk_size_feed_forward
dataloader_drop_last
dataloader_num_workers
dataloader_persistent_workers
dataloader_pin_memory
ddp_find_unused_parameters
ddp_timeout
debug
deepspeed
disable_tqdm
diversity_penalty
do_eval
do_predict
do_sample
do_train
early_stopping
encoder_no_repeat_ngram_size
eos_token_id
eval_delay
eval_do_concat_batches
eval_on_start
eval_strategy
eval_use_gather_object
fp16
fp16_backend
fp16_full_eval
Finished
Improve the quality of the patent translate DPO, with the goal of addressing RAG and the translation of specific terms
a0970601776
1h 33m 4s
-
true
/mnt/llm/LLaMA-Factory/saves/Llama-3-Freego-8B-Instruct/sft/sft-2024-12-03-1
true
false
false
true
false
0.9
0.999
1.0000e-8
false
["LlamaForCausalLM"]
false
0
false
false
true
false
128000
0
false
72
false
true
false
10800000
[]
/mnt/llm/LLaMA-Factory/examples/deepspeed/ds_z3_config.json
false
0
false
false
false
true
false
0
128256
0
true
false
no
false
false
auto
false
Crashed
-
a0970601776
18h 43m 21s
-
true
/mnt/llm/LLaMA-Factory/saves/Llama-3-Freego-8B-Instruct/sft/sft-2024-12-03-1
true
false
false
true
false
0.9
0.999
1.0000e-8
false
["LlamaForCausalLM"]
false
0
false
false
true
false
128000
0
false
72
false
true
false
10800000
[]
/mnt/llm/LLaMA-Factory/examples/deepspeed/ds_z3_config.json
false
0
false
false
false
true
false
0
128256
0
true
false
no
false
false
auto
false
Finished
-
a0970601776
1h 21m 22s
-
true
/mnt/llm/LLaMA-Factory/saves/Llama-3-Freego-8B-Instruct/sft/sft-2024-12-03-1
true
false
false
true
false
0.9
0.999
1.0000e-8
false
["LlamaForCausalLM"]
false
0
false
false
true
false
128000
0
false
72
false
true
false
10800000
[]
/mnt/llm/LLaMA-Factory/examples/deepspeed/ds_z3_config.json
false
0
false
false
false
true
false
0
128256
0
true
false
no
false
false
auto
false
Crashed
-
a0970601776
1s
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
Crashed
-
a0970601776
1s
-
true
/mnt/llm/LLaMA-Factory/saves/Llama-3-Freego-8B-Instruct/sft/sft-2024-12-03-1
true
false
false
true
false
0.9
0.999
1.0000e-8
false
["LlamaForCausalLM"]
false
0
false
false
true
false
128000
0
false
72
false
true
false
10800000
[]
/mnt/llm/LLaMA-Factory/examples/deepspeed/ds_z2_config.json
false
0
false
false
false
true
false
0
128256
0
true
false
no
false
false
auto
false
Crashed
-
a0970601776
1s
-
true
/mnt/llm/LLaMA-Factory/saves/Llama-3-Freego-8B-Instruct/sft/sft-2024-12-03-1
true
false
false
true
false
0.9
0.999
1.0000e-8
false
["LlamaForCausalLM"]
false
0
false
false
true
false
128000
0
false
72
false
true
false
10800000
[]
/mnt/llm/LLaMA-Factory/examples/deepspeed/ds_z2_config.json
false
0
false
false
false
true
false
0
128256
0
true
false
no
false
false
auto
false
Finished
This version caps the non-translation datasets at under 3k samples and relies mainly on the translation set
a0970601776
3m 43s
-
true
/mnt/llm/LLaMA-Factory/saves/Llama-3-Freego-8B-Instruct/sft/sft-2024-11-29-1/checkpoint-385/
true
false
false
true
false
0.9
0.999
1.0000e-8
false
["LlamaForCausalLM"]
false
0
false
false
true
false
128000
0
false
72
false
true
false
10800000
[]
/mnt/llm/LLaMA-Factory/examples/deepspeed/ds_z2_config.json
false
0
false
false
false
true
false
0
128256
0
true
false
no
false
false
auto
false
Crashed
-
a0970601776
3d 10h 45m 31s
-
true
yentinglin/Llama-3-Taiwan-8B-Instruct
true
false
false
true
false
0.9
0.999
1.0000e-8
false
["LlamaForCausalLM"]
false
0
false
false
true
false
128000
0
false
72
false
true
false
10800000
[]
/mnt/llm/LLaMA-Factory/examples/deepspeed/ds_z2_config.json
false
0
false
false
false
true
false
0
128256
0
true
false
no
false
false
auto
false
Crashed
-
a0970601776
4m 1s
-
true
yentinglin/Llama-3-Taiwan-8B-Instruct
true
false
false
true
false
0.9
0.999
1.0000e-8
false
["LlamaForCausalLM"]
false
0
false
false
true
false
128000
0
false
72
false
true
false
10800000
[]
/mnt/llm/LLaMA-Factory/examples/deepspeed/ds_z2_config.json
false
0
false
false
false
true
false
0
128256
0
true
false
no
false
false
auto
false
Crashed
-
a0970601776
2m 31s
-
true
yentinglin/Llama-3-Taiwan-8B-Instruct
true
false
false
true
false
0.9
0.999
1.0000e-8
false
["LlamaForCausalLM"]
false
0
false
false
true
false
128000
0
false
72
false
true
false
10800000
[]
/mnt/llm/LLaMA-Factory/examples/deepspeed/ds_z2_config.json
false
0
false
false
false
true
false
0
128256
0
true
false
no
false
false
auto
false
Crashed
-
a0970601776
46s
-
true
yentinglin/Llama-3-Taiwan-8B-Instruct
true
false
false
true
false
0.9
0.999
1.0000e-8
false
["LlamaForCausalLM"]
false
0
false
false
true
false
128000
0
false
72
false
true
false
10800000
[]
/mnt/llm/LLaMA-Factory/examples/deepspeed/ds_z2_config.json
false
0
false
false
false
true
false
0
128256
0
true
false
no
false
false
auto
false