agentboard
llm-agent-eval-deepseek-67b-all
vllm_deepseek-ai/deepseek-llm-67b-chat
summary/avg_metrics_comparison
[Figure: Average Metrics for All Tasks Compared to Baseline Models. Metrics: Progress Rate (%), Success Rate (%). Series: Gpt-4, Gpt-35-turbo, Current Run, Text-davinci-003, Codellama-34b, Lemur-70b, Vicuna-13b-16k. Axis ticks run from 0 to 0.6.]