Changma's workspace

scienceworld/metrics_comparison
[Figure: Scienceworld Metrics Compared to Baseline Models. Metrics shown: Progress Rate (%), Success Rate (%), Grounding Accuracy (%). Series: Current Run, lemur-70b, gpt-35-turbo, claude2, text-davinci-003, gpt-35-turbo-16k, llama2-70b, codellama-34b. Axis ticks: 0, 0.2, 0.4, 0.6.]