Guitaricet's workspace
Runs
58
Name
State
Notes
User
Tags
Created
Runtime
Sweep
activation_dropout
activation_fn
adam_betas
adam_eps
adaptive_input
adaptive_input_factor
adaptive_softmax_dropout
adaptive_softmax_factor
add_bos_token
all_gather_list_size
arch
attention_bias
attention_dropout
best_checkpoint_metric
bf16
broadcast_buffers
bucket_cap_mb
char_embedder_highway_layers
character_embedding_dim
character_embeddings
character_filters
checkpoint_suffix
clip_norm
cpu
criterion
cross_self_attention
curriculum
data
data_buffer_size
ddp_backend
decoder_attention_heads
decoder_embed_dim
decoder_ffn_embed_dim
decoder_input_dim
decoder_layerdrop
decoder_layers
decoder_learned_pos
decoder_normalize_before
decoder_output_dim
device_id
disable_validation
distributed_backend
distributed_init_method
distributed_no_spawn
Finished
A smaller version of [2o4fmnq1](https://wandb.ai/guitaricet/simple-mt/runs/2o4fmnq1/overview)
guitaricet
1d 21h 19m 44s
-
0
relu
(0.9, 0.98)
1.0000e-8
false
-
0
-
-
16384
transformer_with_input_proj
false
0
bleu
false
false
25
-
-
-
-
0
false
label_smoothed_cross_entropy
false
0
../fairseq/data-bin/wmt17_en_de
0
no_c10d
8
504
2048
511
0
6
false
false
511
0
false
nccl
tcp://localhost:14157
false
Failed
-
guitaricet
11s
-
0
relu
(0.9, 0.98)
1.0000e-8
false
-
0
-
-
16384
transformer_with_input_proj
false
0
bleu
false
false
25
-
-
-
-
0
false
label_smoothed_cross_entropy
false
0
../fairseq/data-bin/wmt17_en_de
0
no_c10d
8
504
2048
511
0
6
false
false
511
0
false
nccl
tcp://localhost:15378
false
Finished
-
guitaricet
14h 51s
-
0
relu
(0.9, 0.98)
1.0000e-8
false
-
0
-
-
16384
transformer_with_input_proj
false
0
bleu
false
false
25
-
-
-
-
0
false
label_smoothed_cross_entropy
false
0
../fairseq/data-bin/wmt17_en_de
0
no_c10d
8
512
2048
511
0
6
false
false
511
0
false
nccl
tcp://localhost:12935
false
Crashed
Crashed because of what looked like a power outage.
The continuation is run 56l32m6i:
https://wandb.ai/guitaricet/simple-mt/runs/56l32m6i
guitaricet
17h 49m 7s
-
0
relu
(0.9, 0.98)
1.0000e-8
false
-
0
-
-
16384
transformer_with_input_proj
false
0
bleu
false
false
25
-
-
-
-
0
false
label_smoothed_cross_entropy
false
0
../fairseq/data-bin/wmt17_en_de
0
no_c10d
8
512
2048
511
0
6
false
false
511
0
false
nccl
tcp://localhost:16076
false
Finished
3 GPUs, fully comparable with 7b6ei3g3
No, it is not fully comparable: it has a bias term in the attention.
guitaricet
2d 29m 2s
-
0
relu
(0.9, 0.98)
1.0000e-8
false
-
0
-
-
16384
transformer_with_input_proj
-
0
bleu
false
false
25
-
-
-
-
0
false
label_smoothed_cross_entropy
false
0
../fairseq/data-bin/wmt17_en_de
0
no_c10d
8
512
2048
511
0
6
false
false
511
0
false
nccl
tcp://localhost:19572
false
Finished
86,950,336 params compared to 87,377,272 in Vanilla (adj)
guitaricet
1d 8h 2m 29s
-
0
relu
(0.9, 0.98)
1.0000e-8
false
-
0
-
-
16384
simple_transformer
-
0
bleu
false
false
25
-
-
-
-
0
false
label_smoothed_cross_entropy
false
0
../fairseq/data-bin/wmt17_en_de
0
no_c10d
8
552
2048
511
0
6
false
false
511
0
false
nccl
tcp://localhost:10070
false
Failed
-
guitaricet
5s
-
0
relu
(0.9, 0.98)
1.0000e-8
false
-
0
-
-
16384
simple_transformer
-
0
bleu
false
false
25
-
-
-
-
0
false
label_smoothed_cross_entropy
false
0
../fairseq/data-bin/wmt17_en_de
0
no_c10d
8
552
2048
511
0
6
false
false
511
0
false
nccl
-
false
Finished
87,377,272 params
The 2-GPU version should yield the same result as the 3-GPU version, but I wanted to check it anyway.
guitaricet
pruning-exp-2
1d 22h 25m 16s
-
0
relu
(0.9, 0.98)
1.0000e-8
false
-
0
-
-
16384
transformer_with_input_proj
-
0
bleu
false
false
25
-
-
-
-
0
false
label_smoothed_cross_entropy
false
0
../fairseq/data-bin/wmt17_en_de
0
no_c10d
8
512
2048
511
0
6
false
false
511
0
false
nccl
tcp://localhost:16900
false
Finished
New run with the same hyperparameters. For some reason we're unable to load the previous checkpoints, so we train a new model.
guitaricet
pruning-exp-2
1d 21h 20m 5s
-
0
relu
(0.9, 0.98)
1.0000e-8
false
-
0
-
-
16384
simple_transformer
-
0
bleu
false
false
25
-
-
-
-
0
false
label_smoothed_cross_entropy
false
0
../fairseq/data-bin/wmt17_en_de
0
no_c10d
8
544
2048
511
0
6
false
false
511
0
false
nccl
tcp://localhost:11312
false
Finished
-
guitaricet
1d 5h 38m 38s
-
0
relu
(0.9, 0.98)
1.0000e-8
false
-
0
-
-
16384
simple_transformer
-
0
bleu
false
false
25
-
-
-
-
0
false
label_smoothed_cross_entropy
false
0
../unet_transformer_translation/data-bin/wmt17_en_de
2
c10d
8
656
2048
511
0
6
false
false
511
0
false
nccl
-
false
Runs 1-10 of 58