
FastText Baseline

Exploring FastText
Created on September 27 | Last edited on January 31

Optimal Learning Rate




[Line plot, one curve per run: x-axis Step (0 to 80), y-axis 4 to 16]

Optimal learning rate: 0.5-0.6

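The optimal learning rate appears to be around 0.5-0.6 (yellow curves). Lower learning rates give clearly higher loss (red), and higher learning rates give slightly higher loss (blue). In the run table, the lowest logged loss is 3.53 at lr = 0.6; lr = 0.8 (3.62) and lr = 0.5 (3.65) are close behind, while lr = 0.1 ends at 7.66.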

Vary LR (8 runs)

State     User    Tags  Runtime  Sweep  epochs  lr   threads  word_vec_dim  ngrams  loss
Finished  stacey  LR    13s      -      25      1    5        100           -       4.13772
Finished  stacey  LR    14s      -      25      0.9  5        100           -       3.82115
Finished  stacey  LR    14s      -      25      0.8  5        100           -       3.61619
Finished  stacey  LR    14s      -      25      0.6  5        100           -       3.52926
Finished  stacey  LR    14s      -      25      0.5  5        100           -       3.65292
Finished  stacey  LR    14s      -      25      0.3  5        100           -       4.55334
Finished  stacey  LR    14s      -      25      0.2  5        100           -       5.66532
Finished  stacey  LR    14s      -      25      0.1  5        100           -       7.66276
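
To make the comparison across the "Vary LR" runs concrete, here is a minimal Python sketch that ranks the runs by logged loss. The (lr, loss) pairs are copied from the run table above; the variable names (runs, ranked, best_lr, best_loss) are illustrative and not part of the original report or of any fastText or W&B API.

```python
# (lr, logged loss) pairs copied from the "Vary LR" run table.
# All names here are illustrative assumptions, not part of the report.
runs = [
    (1.0, 4.13772),
    (0.9, 3.82115),
    (0.8, 3.61619),
    (0.6, 3.52926),
    (0.5, 3.65292),
    (0.3, 4.55334),
    (0.2, 5.66532),
    (0.1, 7.66276),
]

# Rank runs from lowest to highest loss.
ranked = sorted(runs, key=lambda run: run[1])
best_lr, best_loss = ranked[0]

# Print each run with its gap to the best loss in the sweep.
for lr, loss in ranked:
    print(f"lr={lr:<4} loss={loss:.5f} gap_to_best={loss - best_loss:.5f}")
```

Ranking by a single logged loss value is only a rough summary: each learning rate was run once, so small gaps such as the one between lr = 0.5 and lr = 0.8 may not be meaningful.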


Word Embedding and Ngram Size




Word Dim (green): 3 runs
Ngrams (violet): 3 runs