Pretrain Larger Malaysian Mistral
1.1B, 3B, and 5B parameter LLMs trained on 90B tokens (a 312 GB JSONL file).
Husein Zolkepli
Created on November 27 | Last edited on December 1
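As a rough sanity check on the figures in the subtitle, the sketch below computes the average bytes per token and the tokens-per-parameter ratio for each model size. It assumes 1 GB = 10^9 bytes and that the 312 GB file corresponds to the full 90B tokens (JSON overhead included); the function and constant names are illustrative and do not come from the report.

```python
# Figures quoted in the report subtitle.
TOKENS = 90e9            # 90B training tokens
DATASET_BYTES = 312e9    # assumption: 312 GB with 1 GB = 10**9 bytes
MODEL_PARAMS = {"1.1B": 1.1e9, "3B": 3e9, "5B": 5e9}


def bytes_per_token(dataset_bytes: float, tokens: float) -> float:
    """Average raw JSONL bytes per token (includes JSON overhead)."""
    return dataset_bytes / tokens


def tokens_per_parameter(tokens: float, params: float) -> float:
    """Training tokens seen per model parameter."""
    return tokens / params


if __name__ == "__main__":
    print(f"bytes per token: {bytes_per_token(DATASET_BYTES, TOKENS):.2f}")
    for name, params in MODEL_PARAMS.items():
        ratio = tokens_per_parameter(TOKENS, params)
        print(f"{name}: {ratio:.1f} tokens per parameter")
```

Under these assumptions the data averages about 3.5 bytes per token, and the same 90B-token budget gives the smaller models proportionally more tokens per parameter than the larger ones.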
[Figure: train/loss vs. Step (x-axis 0 to 8k, y-axis 2 to 14), runs grouped by model size. Legend entries as listed on the page: 5B (5), 3B (43), 1.1B (52).]