Ucalyptus's workspace
Runs (18 total, 18 visualized)
Name: llama-3.2-3b/test_gsm_symbolic_formatted.jsonl/ (1 run)
Name: qwen-2.5-0.5b/test_gsm_symbolic_formatted.jsonl/ (4 runs)
Name: llama-3.2-1b/test_gsm_symbolic_formatted.jsonl/ (3 runs)
Name: qwen-2.5-3b/test_gsm_symbolic_formatted.jsonl/ (4 runs)
Name: phi-4-14b/test_gsm_symbolic_formatted.jsonl/ (4 runs)
Name: qwen-2.5-7b/test_gsm_symbolic_formatted.jsonl/ (2 runs)
profiling (6 panels)
train (14 panels)
[Charts: train/rewards/xml_format_reward, train/rewards/soft_reward, train/rewards/numeric_match_reward, train/rewards/gsm_reward, train/reward_std, train/reward. Each chart's legend lists the same six displayName entries: llama-3.2-1b, llama-3.2-3b, qwen-2.5-0.5b, phi-4-14b, qwen-2.5-3b, and qwen-2.5-7b, each followed by /test_gsm_symbolic_formatted.jsonl/]
System (21 panels)