Reports
Created by
Created On
Last edited
TinyZero-R1-Countdown Qilong Wu
Reproduces R1: RL boosts the reasoning ability of an LLM and enables the 'aha' moment, tested on the Countdown task.
2
2025-01-28