agentboard
llm-agent-eval-llama2-13b-all
Runs: 3 (1 visualized)
vllm_meta-llama/Llama-2-13b-chat-hf
[Figure (jericho/metrics_comparison): Jericho Metrics Compared to Baseline Models. Series: gpt-4, text-davinci-003, gpt-35-turbo, codellama-34b, lemur-70b, llama2-70b, codellama-13b, Current Run. Metrics: Progress Rate (%), Success Rate (%), Grounding Accuracy (%). Axis ticks: 0 to 1.]