Saforem2's workspace
Runs: 469
Name
MODEL_SIZE: GPT33B
State: Finished
Notes:
User: saforem2
Tags:
Created:
Runtime: 7h 11m 31s
Sweep: -
DDP_impl: local
MODEL_SIZE: GPT33B
accumulate_allreduce_grads_in_fp32: false
adam_beta1: 0.9
adam_beta2: 0.999
adam_eps: 1.0000e-8
add_bias_linear: true
add_position_embedding: true
adlr_autoresume: false
adlr_autoresume_interval: 1000
apply_layernorm_1p: false
apply_query_key_layer_scaling: true
apply_residual_connection_post_layernorm: false
async_tensor_model_parallel_allreduce: false
attention_dropout: 0.1
attention_softmax_in_fp32: false
barrier_with_L1_time: true
bert_binary_head: true
bert_embedder_type: megatron
bf16: false
bias_dropout_fusion: true
bias_gelu_fusion: true
biencoder_projection_dim: 0
biencoder_shared_query_context_model: false
checkpoint_activations: true
checkpoint_in_cpu: false
checkpoint_num_layers: 1
classes_fraction: 1
clip_grad: 1
compression_training: false
consumed_train_samples: 0
consumed_train_tokens: 0
consumed_valid_samples: 0
contigious_checkpointing: false
cpu_optimizer: true
cpu_torch_adam: false
create_moe_param_group: false
current_time: 1693954436.78268
curriculum_learning_legacy: false
custom_token_counting: false
data_efficiency_curriculum_learning: false
data_impl: mmap
data_parallel_random_init: false
data_parallel_size: 1