gmongaras1 / Mamba_buildup
Runs (60 total, 45 visualized; showing 1-20 of 60)
8192L_SM
Ker_8192L_2P_NoPE_Conv_AMask_AMaskTypeNEGSOFTPLUS_NOAMaskBias_AMaskValueDiscretizationDT_SMNorm
4096L_SM
Ker_4096L_2P_NoPE_Conv_AMask_AMaskTypeNEGSOFTPLUS_NOAMaskBias_AMaskValueDiscretizationDT_SMNorm
Ker_4096L_2P_NoPE_Conv_AMask_AMaskTypeNEGSOFTPLUS_NOAMaskBias_AMaskValueDiscretizationDT_SMNorm
Ker_4096L_Exp_NoPE_Conv_AMask_AMaskTypeNEGSOFTPLUS_NOAMaskBias_AMaskValueDiscretizationDT_SMNorm
2048L_2P_SM
2048L_2P_NoPE_Conv_AMask_AMaskTypeNEGSOFTPLUS_NOAMaskBias_AMaskValueDiscretizationDT_SMNorm
2P_NoPE_Conv_AMask_AMaskTypeNEGSOFTPLUS_NOAMaskBias_AMaskValueDiscretizationDT_SMNorm
2P_NoPE_Conv_AMask_AMaskTypeNEGSOFTPLUS_NOAMaskBias_AMaskValueDiscretizationDT
2P_NoPE_Conv_AMask_AMaskTypeNEGSOFTPLUS_NOAMaskBias_AMaskValueDiscretizationDT
EXP_NoPE_Conv_AMask_AMaskTypeNEGSOFTPLUS_NOAMaskBias_AMaskValueDiscretizationDT
3P_NoPE_Conv_AMask_AMaskTypeNEGSOFTPLUS_NOAMaskBias_AMaskValueDiscretizationDT
012P_NoPE_Conv_AMask_AMaskTypeNEGSOFTPLUS_NOAMaskBias_AMaskValueDiscretizationDT
2P_NoPE_Conv_AMask_AMaskTypeNEGSOFTPLUS_NOAMaskBias_AMaskValueDiscretizationDT
1P_NoPE_Conv_AMask_AMaskTypeNEGSOFTPLUS_NOAMaskBias_AMaskValueDiscretizationDT
2P_NoPE_Conv_AMask_AMaskTypeNEGSOFTPLUS_AMaskBias_AMaskValueDiscretizationDT
1P_NoPE_Conv_AMask_AMaskTypeDISCRETIZE_AMaskBias_AMaskValueDiscretizationSOFTPLUS2
1P_NoPE_Conv_AMask_AMaskTypeDISCRETIZE_AMaskBias_AMaskValueDiscretizationSOFTPLUS
1P_NoPE_Conv_AMask_AMaskTypeDISCRETIZE_AMaskBias_AMaskValueDiscretizationSILU
Charts (5 panels)
[Figure: loss vs. Step. x-axis: Step, ticks from 0 to 100k; y-axis: loss, ticks from 2 to 10. Legend (45 runs):]
8192L_SM
Ker_8192L_2P_NoPE_Conv_AMask_AMaskTypeNEGSOFTPLUS_NOAMaskBias_AMaskValueDiscretizationDT_SMNorm
4096L_SM
Ker_4096L_2P_NoPE_Conv_AMask_AMaskTypeNEGSOFTPLUS_NOAMaskBias_AMaskValueDiscretizationDT_SMNorm
Ker_4096L_Exp_NoPE_Conv_AMask_AMaskTypeNEGSOFTPLUS_NOAMaskBias_AMaskValueDiscretizationDT_SMNorm
2048L_2P_SM
2048L_2P_NoPE_Conv_AMask_AMaskTypeNEGSOFTPLUS_NOAMaskBias_AMaskValueDiscretizationDT_SMNorm
2P_NoPE_Conv_AMask_AMaskTypeNEGSOFTPLUS_NOAMaskBias_AMaskValueDiscretizationDT_SMNorm
2P_NoPE_Conv_AMask_AMaskTypeNEGSOFTPLUS_NOAMaskBias_AMaskValueDiscretizationDT
EXP_NoPE_Conv_AMask_AMaskTypeNEGSOFTPLUS_NOAMaskBias_AMaskValueDiscretizationDT
3P_NoPE_Conv_AMask_AMaskTypeNEGSOFTPLUS_NOAMaskBias_AMaskValueDiscretizationDT
012P_NoPE_Conv_AMask_AMaskTypeNEGSOFTPLUS_NOAMaskBias_AMaskValueDiscretizationDT
2P_NoPE_Conv_AMask_AMaskTypeNEGSOFTPLUS_NOAMaskBias_AMaskValueDiscretizationDT
1P_NoPE_Conv_AMask_AMaskTypeNEGSOFTPLUS_NOAMaskBias_AMaskValueDiscretizationDT
2P_NoPE_Conv_AMask_AMaskTypeNEGSOFTPLUS_AMaskBias_AMaskValueDiscretizationDT
1P_NoPE_Conv_AMask_AMaskTypeDISCRETIZE_AMaskBias_AMaskValueDiscretizationSOFTPLUS2
1P_NoPE_Conv_AMask_AMaskTypeDISCRETIZE_AMaskBias_AMaskValueDiscretizationSOFTPLUS
1P_NoPE_Conv_AMask_AMaskTypeDISCRETIZE_AMaskBias_AMaskValueDiscretizationSILU
1P_NoPE_Conv_AMask_AMaskTypeNEGSILU_AMaskBias_AMaskValueDiscretizationDT
1P_NoPE_Conv_AMask_AMaskTypeNEGSOFTPLUS2_AMaskBias_AMaskValueDiscretizationDT
1P_NoPE_Conv_AMask_AMaskTypeNEGSOFTPLUS_AMaskBias_AMaskValueDiscretizationDT
1P_NoPE_Conv_AMask_AMaskTypeDISCRETIZE_AMaskBias_AMaskValueDiscretizationDT
1P_NoPE_Conv_AMask_AMaskBias_AMaskValueDiscretization
1P_NoPE_Conv_AMask_AMaskBias
01P_NoPE_Conv_AMask_AMaskBias
Mamba
2P_NoPE_Conv_AMask
01P_NoPE_Conv_AMask
P012_NoPE
P2_NoPE
P01_NoPE_conv
P1_NoPE_conv
P01_NoPE_learnable
P01_NoPE
P1_NoPE
SM
P012
P2
P01_conv
P1_conv
P01_learnable_LowRankHeads
P01_LowRankHeads
P01_learnable
P01
P1
[Figure: test_loss vs. Step. x-axis: Step, ticks from 20k to 80k; y-axis: test_loss, ticks from 3 to 4.5. Legend (43 runs): the same runs as the loss panel, except that 1P_NoPE_Conv_AMask_AMaskTypeNEGSILU_AMaskBias_AMaskValueDiscretizationDT and 1P_NoPE_Conv_AMask_AMaskTypeNEGSOFTPLUS2_AMaskBias_AMaskValueDiscretizationDT do not appear.]
[Figure: test_perplexity vs. Step. x-axis: Step, ticks from 20k to 80k; y-axis: test_perplexity, ticks from 20 to 100. Legend (43 runs): the same runs as the loss panel, except that 1P_NoPE_Conv_AMask_AMaskTypeNEGSILU_AMaskBias_AMaskValueDiscretizationDT and 1P_NoPE_Conv_AMask_AMaskTypeNEGSOFTPLUS2_AMaskBias_AMaskValueDiscretizationDT do not appear.]
[Figure: perplexity vs. Step. x-axis: Step, ticks from 0 to 100k; y-axis: perplexity, ticks from 10000 to 40000. Legend (45 runs): the same runs as the loss panel.]
System (21 panels)