agentboard
llm-agent-eval-llama2-13b-all
Changma's workspace
Runs: 3 (1 visualized), each listed as vllm_meta-llama/Llama-2-13b-chat-hf
[Figure: Jericho Success Rate w.r.t Difficulty (panel jericho/success_rate_w.r.t_difficulty). Axes: Success Rate For Easy Examples (%), Success Rate For Hard Examples (%). Series: Current Run, llama2-70b, lemur-70b, codellama-13b, codellama-34b, gpt-35-turbo, text-davinci-003, gpt-4.]