1659808224's workspace
Runs (3)
| Column | Run 1 | Run 2 | Run 3 |
| --- | --- | --- | --- |
| State | Finished | Finished | Finished |
| Notes | | | |
| User | chence08 | chence08 | chence08 |
| Tags | | | |
| Created | | | |
| Runtime | 15h 27m 33s | 15h 32m 42s | 4h 46m 40s |
| Sweep | - | - | - |
| adafactor_clip_threshold | 1 | 1 | 1 |
| adafactor_decay_rate | -0.8 | -0.8 | -0.8 |
| adafactor_eps | [1e-30,0.001] | [1e-30,0.001] | [1e-30,0.001] |
| adafactor_relative_step | false | false | false |
| adafactor_scale_parameter | false | false | false |
| adafactor_warmup_init | false | false | false |
| adam_betas | [0.9,0.999] | [0.9,0.999] | [0.9,0.999] |
| adam_epsilon | 1.0000e-8 | 1.0000e-8 | 1.0000e-8 |
| best_model_dir | outputs/best_model | outputs/best_model | outputs/best_model |
| cache_dir | cache_dir/ | cache_dir/ | cache_dir/ |
| cosine_schedule_num_cycles | 0.5 | 0.5 | 0.5 |
| custom_layer_parameters | [] | [] | [] |
| custom_parameter_groups | [] | [] | [] |
| dataloader_num_workers | 0 | 0 | 0 |
| do_lower_case | false | false | false |
| do_sample | false | false | false |
| dynamic_quantize | false | false | false |
| early_stopping | true | true | true |
| early_stopping_consider_epochs | false | false | false |
| early_stopping_delta | 0 | 0 | 0 |
| early_stopping_metric | eval_loss | eval_loss | eval_loss |
| early_stopping_metric_minimize | true | true | true |
| early_stopping_patience | 3 | 3 | 3 |
| eval_batch_size | 8 | 8 | 20 |
| evaluate_during_training | true | true | true |
| evaluate_during_training_silent | true | true | true |
| evaluate_during_training_steps | 1000 | 1000 | 30000 |
| evaluate_during_training_verbose | false | false | false |
| evaluate_each_epoch | true | true | true |
| evaluate_generated_text | false | false | false |
| fp16 | false | false | false |
| gradient_accumulation_steps | 1 | 1 | 1 |
| learning_rate | 0.001 | 0.001 | 0.001 |
| length_penalty | 2 | 2 | 2 |
| local_rank | -1 | -1 | -1 |
| logging_steps | 50 | 50 | 50 |
| max_grad_norm | 1 | 1 | 1 |
| max_length | 20 | 20 | 20 |
| max_seq_length | 128 | 128 | 96 |
| max_steps | -1 | -1 | -1 |
| model_class | T5Model | T5Model | T5Model |
| model_name | google/mt5-small | google/mt5-small | google/mt5-small |
| model_type | mt5 | mt5 | mt5 |
| multiprocessing_chunksize | -1 | -1 | -1 |
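The three runs share almost all of their settings, so it can help to see only the ones that differ. The following is a minimal sketch in plain Python, not part of the original workspace: the labels `run_1` to `run_3` and the helper `differing_params` are hypothetical names chosen here, and the dictionaries hold only a subset of the listed values, copied in the order the runs appear.

```python
# Subset of the logged settings, copied from the runs listed above.
# The keys "run_1".."run_3" are hypothetical labels (the export has no run names).
runs = {
    "run_1": {"eval_batch_size": 8, "evaluate_during_training_steps": 1000,
              "max_seq_length": 128, "max_length": 20, "learning_rate": 0.001},
    "run_2": {"eval_batch_size": 8, "evaluate_during_training_steps": 1000,
              "max_seq_length": 128, "max_length": 20, "learning_rate": 0.001},
    "run_3": {"eval_batch_size": 20, "evaluate_during_training_steps": 30000,
              "max_seq_length": 96, "max_length": 20, "learning_rate": 0.001},
}


def differing_params(runs):
    """Return {param: [value per run]} for params whose values are not all equal."""
    names = sorted({name for cfg in runs.values() for name in cfg})
    diff = {}
    for name in names:
        values = [cfg.get(name) for cfg in runs.values()]
        if any(v != values[0] for v in values[1:]):
            diff[name] = values
    return diff


if __name__ == "__main__":
    for name, values in differing_params(runs).items():
        print(name, values)
```

With the values above, only `eval_batch_size`, `evaluate_during_training_steps`, and `max_seq_length` are reported; in each case the third run is the one that differs.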