gmongaras1 / Gated_Attention
Runs (1-20 of 134 listed, 4 visualized)
Name
fineweb_largeDepth_softmax_36bs_2gpu_1024seqlen
fineweb_largeDepth_softmax_divS_gatev2_36bs_2gpu_1024seqlen
fineweb_largeDepth_gated_softmax_no_gate_L2norm_nodivS_noclamp_35bs_2gpu_1024seqlen
SlimPajama_clip_softmax_divS_gatev2_35bs_2gpu_1024seqlen
fineweb_large_clip_softmax_divS_gatev2_36bs_4gpu_1024seqlen
fineweb_clip_softmax_gate_35bs_2gpu_1024seqlen
fineweb_clip_softmax_gate_35bs_2gpu_1024seqlen
fineweb_clip_softmax_divs_35bs_2gpu_1024seqlen
fineweb_large_softmax_divS_gatev2_36bs_4gpu_1024seqlen
fineweb_softmax_divS_gatev2_35bs_2gpu_1024seqlen
SlimPajama_softmax_divS_gatev2_35bs_2gpu_1024seqlen
SlimPajama_softmax_divS_gatev2_35bs_2gpu_1024seqlen
fineweb_softmax_divS_gatev2_36bs_6gpu_4096seqlen
fineweb_large_softmax_divS_gatev2_35bs_2gpu_1024seqlen
Pile_softmax_divS_gatev2_35bs_2gpu_1024seqlen
fineweb_softmax_divS_gatev2_35bs_2gpu_1024seqlen
fineweb_gated_softmax_out_gate_L2norm_nodivS_noclamp_35bs_2gpu_1024seqlen
fineweb_softmax_detach_denom_gate_35bs_2gpu_1024seqlen
fineweb_softmax_gate_35bs_2gpu_1024seqlen
fineweb_gated_ReLU_no_gate_L2norm_nodivS_noclamp_35bs_2gpu_1024seqlen
[Plot: test_loss vs. Step. x-axis: Step, ticks at 20k, 40k, 60k, 80k. y-axis: test_loss, ticks from 2.8 to 3.6. Runs plotted:]
fineweb_largeDepth_softmax_36bs_2gpu_1024seqlen
fineweb_largeDepth_softmax_divS_gatev2_36bs_2gpu_1024seqlen
fineweb_largeDepth_gated_softmax_no_gate_L2norm_nodivS_noclamp_35bs_2gpu_1024seqlen
fineweb_softmax_35bs_2gpu_1024seqlen