Project: simplebench / simple_bench_comp_dataset
Object: competition_dataset-evaluation (Eiy8IYCILaGoIfy96VEGhCp3yXuJgnaLBsLGi151IZk)
competition_dataset-evaluation:v0
Name: competition_dataset-evaluation (1 version)
Last updated: 8 months ago
Last updated by: Jonas Zabel
Storage size: 970B
competition_dataset:v0
eval_multi_choice:v0
Summary

Model            true_count  true_fraction  Avg. Latency  Run Date      Trials
---------------  ----------  -------------  ------------  ------------  ------
LiteLLMModel:v1  1.00        5.00%          31.21         8 months ago  1.00
LiteLLMModel:v2  2.00        10.00%         204.63        8 months ago  1.00
LiteLLMModel:v6  3.00        15.00%         31.79         8 months ago  1.00
LiteLLMModel:v0  1.00        5.00%          34.53         8 months ago  1.00
LiteLLMModel:v4  N/A         N/A            83.73%        8 months ago  1.00
LiteLLMModel:v3  N/A         N/A            83.48%        8 months ago  1.00
LiteLLMModel:v8  N/A         N/A            N/A           8 months ago  1.00
LiteLLMModel:v5  N/A         N/A            0.36%         8 months ago  1.00
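
For the rows that report both true_count and true_fraction, the two columns appear to be related by a simple ratio. The sketch below is only an illustration under one assumption, which the page does not state: that true_fraction equals true_count divided by the number of scored examples. The helper name implied_total is hypothetical, and the input values are copied from the leaderboard rows.

```python
# Rows copied from the leaderboard: (model, true_count, true_fraction in percent).
rows = [
    ("LiteLLMModel:v1", 1, 5.0),
    ("LiteLLMModel:v2", 2, 10.0),
    ("LiteLLMModel:v6", 3, 15.0),
    ("LiteLLMModel:v0", 1, 5.0),
]


def implied_total(true_count: int, true_fraction_pct: float) -> int:
    """Number of scored examples implied by a count and its percentage.

    Assumes true_fraction = true_count / total (an assumption, not
    something the leaderboard page states).
    """
    return round(true_count / (true_fraction_pct / 100.0))


# Compute the implied total for every row that has both values.
implied = {model: implied_total(count, pct) for model, count, pct in rows}

for model, total in implied.items():
    print(f"{model}: {total} scored examples implied")
```

If the assumption holds, all four rows should imply the same total, which is a quick consistency check on the reported figures. Rows showing N/A for true_count are left out because there is nothing to divide.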