Changma's workspace

summary/avg_metrics_comparison
[Figure: Average Metrics for All Tasks Compared to Baseline Models. Series: Progress Rate (%) and Success Rate (%). Models: Gpt-4, Gpt-35-turbo, Current Run, Text-davinci-003, Codellama-34b, Lemur-70b, Vicuna-13b-16k. Axis ticks: 0, 0.2, 0.4, 0.6.]