Smangrul's workspace
Runs (2)
| Field | Run 1 | Run 2 |
| --- | --- | --- |
| State | Finished | Finished |
| Notes | | |
| User | smangrul | smangrul |
| Tags | | |
| Created | | |
| Runtime | 6m 8s | 1m 56s |
| Sweep | - | - |
| gradient_accumulation_steps | 1 | 1 |
| ignore_mismatched_sizes | true | true |
| learning_rate | 0.00002 | 0.00002 |
| lr_scheduler_type | linear | linear |
| max_length | 128 | 128 |
| max_train_steps | 690 | 138 |
| model_name_or_path | microsoft/deberta-v2-xlarge-mnli | microsoft/deberta-v2-xlarge-mnli |
| num_train_epochs | 3 | 3 |
| num_warmup_steps | 0 | 0 |
| output_dir | /tmp/mrpc/no_deepspeed | /tmp/mrpc/ |
| pad_to_max_length | false | false |
| per_device_eval_batch_size | 8 | 8 |
| per_device_train_batch_size | 8 | 40 |
| push_to_hub | false | false |
| report_to | wandb | wandb |
| task_name | mrpc | mrpc |
| use_slow_tokenizer | false | false |
| weight_decay | 0 | 0 |
| with_tracking | true | true |
| accuracy.accuracy | 0.90441 | 0.91176 |
| accuracy.f1 | 0.93146 | 0.93594 |
| epoch | 2 | 2 |
| step | 690 | 138 |
| train_loss | 0.065684 | 0.10939 |
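
The two runs differ only in per_device_train_batch_size, output_dir, and the resulting step count, so a rough comparison can be computed directly from the logged values. The sketch below is illustrative only: the dictionary keys and variable names are hypothetical labels, and it assumes the logged Runtime is dominated by training (it also includes startup and evaluation), so the per-step and per-sample figures are approximations rather than measured throughput.

```python
# Values copied from the two runs above; the keys are hypothetical labels
# derived from each run's output_dir.
runs = {
    "/tmp/mrpc/no_deepspeed": {"runtime_s": 6 * 60 + 8, "train_batch": 8, "steps": 690},
    "/tmp/mrpc/": {"runtime_s": 1 * 60 + 56, "train_batch": 40, "steps": 138},
}

first = runs["/tmp/mrpc/no_deepspeed"]
second = runs["/tmp/mrpc/"]

# A 5x larger per-device batch should need 5x fewer optimizer steps
# for the same number of epochs.
batch_ratio = second["train_batch"] / first["train_batch"]
step_ratio = first["steps"] / second["steps"]

# Per-device samples processed over the whole run (steps * batch size).
samples_first = first["steps"] * first["train_batch"]
samples_second = second["steps"] * second["train_batch"]

# Wall-clock speedup of the second run relative to the first.
speedup = first["runtime_s"] / second["runtime_s"]

print(f"batch ratio: {batch_ratio:.1f}, step ratio: {step_ratio:.1f}")
print(f"per-device samples: {samples_first} vs {samples_second}")
print(f"wall-clock speedup: {speedup:.2f}x")
for name, run in runs.items():
    per_step = run["runtime_s"] / run["steps"]
    print(f"{name}: ~{per_step:.2f} s per step (upper bound, includes eval)")
```

Both runs process the same number of samples per device, so the shorter runtime comes from larger steps rather than less training; accuracy and F1 are within about one point of each other across the two runs.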