gmongaras1 / Cottention_Tests
Runs (177 total, 58 visualized)
10term_ELU_1GPU_768seqlen_16bz
4term_ELU_1GPU_768seqlen_16bz
3term_ELU_1GPU_768seqlen_16bz
2term_ELU_1GPU_768seqlen_16bz
ELU_1GPU_768seqlen_16bz
10term_ReLU_1GPU_768seqlen_16bz
4term_ReLU_1GPU_768seqlen_16bz
3term_ReLU_1GPU_768seqlen_16bz
2term_ReLU_1GPU_768seqlen_16bz
ReLU_1GPU_768seqlen_16bz
2term_ReLU_1GPU_768seqlen_16bz
ReLU_1GPU_768seqlen_16bz
ReLU_1GPU_768seqlen_16bz
10term_cosine_1GPU_768seqlen_16bz
3term_cosine_1GPU_768seqlen_16bz
2term_cosine_1GPU_768seqlen_16bz
cosine_1GPU_768seqlen_16bz
cosine_1GPU_768seqlen_16bz
cosine_1GPU_768seqlen_16bz
8termtaylor_softmax_2GPU_768seqlen_16bz
Showing runs 1-20 of 177
[Plot: loss vs. Step. x-axis: Step, 0 to 200k; y-axis: loss, 2 to 12. Showing first 50 runs, listed below.]
softmax_10termtaylorsoftmax_2GPU_768seqlen_16bz
softmax_80termtaylorsoftmax_2GPU_768seqlen_16bz
softmax_softmax_2GPU_768seqlen_16bz
softmax_2termtaylorsoftmax_2GPU_768seqlen_16bz
softmax_4termtaylorsoftmax_2GPU_768seqlen_16bz
softmax_80termtaylorsoftmax_2GPU_768seqlen_16bz
softmax_detachsumdim1_gate_outnorm_2GPU_768seqlen_16bz
softmax_detachsumdim2_gate_outnorm_2GPU_768seqlen_16bz
double_expgate_tanh_Sdenom_outnorm_2GPU_768seqlen_16bz
double_lineargate_Sdenom_outnorm_2GPU_768seqlen_16bz
double_cubegate_Sdenom_outnorm_2GPU_768seqlen_16bz
double_squaregate_Sdenom_outnorm_2GPU_768seqlen_16bz
double_expgate_Sdenom_outnorm_2GPU_768seqlen_16bz
expgate_highdenom_learnconst_2GPU_768seqlen_16bz
expgate_Sdenom_outnorm_2GPU_768seqlen_16bz
softmax_2GPU_768seqlen_16bz
expgate_highdenom_outnorm_2GPU_768seqlen_16bz
expgate_learnhighdenom_1GPU_256seqlen_32bz
expgate_highdenom_1GPU_256seqlen_32bz
expgate_outnorm_1GPU_256seqlen_32bz
expgate_1GPU_256seqlen_32bz
expgate_learndenom_1GPU_256seqlen_32bz
softmax_1GPU_256seqlen_32bz
expgate_learndenom_1GPU_256seqlen_32bz
lineargate_1GPU_256seqlen_32bz
lineargate_1GPU_256seqlen_32bz
squaredgate_1GPU_256seqlen_32bz
relugate2_1GPU_256seqlen_32bz
relugate_1GPU_256seqlen_32bz
softmax_decayv1_1GPU_256seqlen_32bz
softmax_decayv1_1GPU_256seqlen_32bz
expgate_1GPU_256seqlen_32bz
softmax_decayv1_1GPU_256seqlen_32bz
memmosaic_1GPU_256seqlen_32bz
softmax_learnablebase_1GPU_256seqlen_32bz
softmax_Covar_1GPU_256seqlen_32bz
softmax_L2Dist_1GPU_256seqlen_32bz
softmax_vardiv_1GPU_256seqlen_32bz
relus80termtaylorseries_1GPU_256seqlen_32bz
relus4termtaylorseries_1GPU_256seqlen_32bz
relusquared_1GPU_256seqlen_32bz
relulinear_1GPU_256seqlen_32bz
cosine80termtaylorseries_1GPU_256seqlen_32bz
cosine4termtaylorseries_1GPU_256seqlen_32bz
cosinesquared_1GPU_256seqlen_32bz
cosinelinear_1GPU_256seqlen_32bz
coshmax_1GPU_256seqlen_32bz
sinhmax_1GPU_256seqlen_32bz
softmax_decomposedodd_1GPU_256seqlen_32bz
softmax_decomposedeven_1GPU_256seqlen_32bz