gmongaras1 / Gated_Attention
Gmongaras's workspace
Runs (134 total, 4 visualized)
fineweb_largeDepth_softmax_36bs_2gpu_1024seqlen
fineweb_largeDepth_softmax_divS_gatev2_36bs_2gpu_1024seqlen
fineweb_largeDepth_gated_softmax_no_gate_L2norm_nodivS_noclamp_35bs_2gpu_1024seqlen
SlimPajama_clip_softmax_divS_gatev2_35bs_2gpu_1024seqlen
fineweb_large_clip_softmax_divS_gatev2_36bs_4gpu_1024seqlen
fineweb_clip_softmax_gate_35bs_2gpu_1024seqlen
fineweb_clip_softmax_gate_35bs_2gpu_1024seqlen
fineweb_clip_softmax_divs_35bs_2gpu_1024seqlen
fineweb_large_softmax_divS_gatev2_36bs_4gpu_1024seqlen
fineweb_softmax_divS_gatev2_35bs_2gpu_1024seqlen
SlimPajama_softmax_divS_gatev2_35bs_2gpu_1024seqlen
SlimPajama_softmax_divS_gatev2_35bs_2gpu_1024seqlen
fineweb_softmax_divS_gatev2_36bs_6gpu_4096seqlen
fineweb_large_softmax_divS_gatev2_35bs_2gpu_1024seqlen
Pile_softmax_divS_gatev2_35bs_2gpu_1024seqlen
fineweb_softmax_divS_gatev2_35bs_2gpu_1024seqlen
fineweb_gated_softmax_out_gate_L2norm_nodivS_noclamp_35bs_2gpu_1024seqlen
fineweb_softmax_detach_denom_gate_35bs_2gpu_1024seqlen
fineweb_softmax_gate_35bs_2gpu_1024seqlen
fineweb_gated_ReLU_no_gate_L2norm_nodivS_noclamp_35bs_2gpu_1024seqlen
Showing runs 1-20 of 134
[Plot: log(test_loss) vs. Step. x-axis ticks from 20k to 80k; y-axis ticks from 1 to 1.3. Runs in legend:]
fineweb_largeDepth_softmax_36bs_2gpu_1024seqlen
fineweb_largeDepth_softmax_divS_gatev2_36bs_2gpu_1024seqlen
fineweb_largeDepth_gated_softmax_no_gate_L2norm_nodivS_noclamp_35bs_2gpu_1024seqlen
fineweb_softmax_35bs_2gpu_1024seqlen