morgan / ai-hacker-cup-benchmark
Evaluation:v1
Name: Evaluation (2 versions)
Last updated: 1 year ago
Storage size: 0B (0B from all versions)
Leaderboard
Dataset:v0
scorer:v0
Summary
Model                solution_passed.true_count  solution_passed.true_fraction  Avg. Latency  Run Date    Trials
ReflectionSolver:v0  13.00                       52.00%                         330.35        1 year ago  5.00
ReflectionSolver:v1  13.00                       52.00%                         95.36         1 year ago  5.00
OneShotSolver:v6     5.00                        20.00%                         58.95         1 year ago  5.00
OneShotSolver:v7     9.00                        36.00%                         220.68        1 year ago  5.00
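
The page does not state how many examples were scored per model, but the leaderboard rows above imply it, since true_fraction should equal true_count divided by that total. The sketch below is a minimal sanity check under that assumption; the helper name `implied_total` is hypothetical and not part of any library, and the rows are copied by hand from the leaderboard.

```python
# Leaderboard rows copied from above: (model, true_count, true_fraction).
rows = [
    ("ReflectionSolver:v0", 13, 0.52),
    ("ReflectionSolver:v1", 13, 0.52),
    ("OneShotSolver:v6", 5, 0.20),
    ("OneShotSolver:v7", 9, 0.36),
]


def implied_total(true_count: int, true_fraction: float) -> int:
    """Scored-example count implied by count / fraction (an inference, not a reported value)."""
    return round(true_count / true_fraction)


# Hypothetical helper applied to each row; all rows should agree if the
# same dataset was scored for every model.
totals = {model: implied_total(count, fraction) for model, count, fraction in rows}

for model, total in totals.items():
    print(f"{model}: {total} scored examples")
```

If the rows were transcribed correctly, every model resolves to the same total, which suggests all four were scored against the same set of examples.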