agentboard
llm-agent-eval-gpt-4-all
Changma's workspace
Runs (1): gpt_azure_gpt-4
[Figure: Scienceworld Progress Rate w.r.t Difficulty (panel: scienceworld/progress_score_w.r.t_difficulty). Axes: Progress Rate For Easy Examples (%), Progress Rate For Hard Examples (%). Series: gpt-35-turbo-16k, llama2-70b, codellama-34b, text-davinci-003, claude2, gpt-35-turbo, lemur-70b, Current Run.]