Reports
Created by
Created On
Last edited
Llama3.2-1B Post-Training with GRPO from DeepSeek
Model link: https://huggingface.co/accuracy-maker/Llama-3.2-1B-GRPO-gsm8k
Wandb link: https://wandb.ai/accuracy-maker/Llama3.2-1B-GRPO?nw=nwuseraccuracymaker
2025-02-12