Semenov-andrei-v's workspace

Charts (12)

[Chart: x-axis Step, 0 to 400; y-axis 0.0005 to 0.003]

Selected runs are not logging media for the keys final-val/perplexity, final-val/loss, and final-val/acc; they are logging values of type nil instead.

If any of these keys is never supposed to be a media type, please delete its panel and create the proper panel type manually.

Showing first 10 runs
[Chart: x-axis Step, 0 to 400; y-axis 0 to 4e+10]
Runs:

62x8_model-moe_llama_dataset-fineweb_opt-soap__iterations-10000_warmup_steps-2000_beta2-0.999_grad_clip-0.1_moe-True
64x1_model-moe_llama_dataset-fineweb_opt-adopt__iterations-336000_warmup_steps-2000_beta2-0.999_grad_clip-0.5_n_layer-12_moe-True
62x8_model-moe_llama_dataset-fineweb_opt-adamw__iterations-10000_warmup_steps-2000_beta2-0.999_grad_clip-0.1_moe-True
64x1_model-moe_llama_dataset-fineweb_opt-signum__iterations-336000_warmup_steps-2000_lr-0.0005_grad_clip-0.5_momentum-0.95_nesterov-True_mars_vr_gamma-0.024_n_layer-12_moe-True
64x1_model-moe_llama_dataset-fineweb_opt-d-muon__iterations-336000_warmup_steps-2000_beta1-0.8_beta2-0.999_grad_clip-0.5_momentum-0.95_nesterov-True_mars_vr_gamma-0.024_n_layer-12_moe-True
32x2_model-moe_llama_dataset-fineweb_opt-sophiag__iterations-336000_warmup_steps-2000_lr-0.0005_beta2-0.999_grad_clip-0.5_n_layer-12_moe-True

[Chart axis: 0 to 1,000,000,000]