morgan
ai-hacker-cup-benchmark
Traces
| Trace | inputs.model | inputs.self | output.model_latency (mean) | output.scorer.solution_passed (true_count) | output.scorer.solution_passed (true_fraction) | Called | Tokens | Cost |
|---|---|---|---|---|---|---|---|---|
| reflection o1-mini-5-trials (0951) | ReflectionSolver:v1 | Evaluation:v1 | 95.3632 | 13 | 0.52 | 12 months ago | 509,646 | $4.4256 |
| reflection o1-preview-5-trials (e130) | ReflectionSolver:v0 | Evaluation:v1 | 330.353 | 13 | 0.52 | 12 months ago | 491,867 | $21.4333 |
| o1-preview-5-trials (937a) | OneShotSolver:v7 | Evaluation:v1 | 220.679 | 9 | 0.36 | 12 months ago | 273,547 | $12.7703 |
| gpt-4o-5-trials (1666) | OneShotSolver:v6 | Evaluation:v1 | 58.9544 | 5 | 0.2 | 12 months ago | 110,325 | $0.6534 |
| o1-preview-1-trial (8e65) | OneShotSolver:v5 | Evaluation:v0 | 336.8651 | 2 | 0.4 | 12 months ago | 53,996 | $2.5415 |
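
The runs above are easier to compare once cost and tokens are normalized by the number of passing solutions. The sketch below does that arithmetic on the values listed above. It assumes that true_fraction is true_count divided by the number of scored attempts (for example 13 / 0.52 = 25), which the page does not state explicitly. The tuple layout and the `summarize` helper are illustrative names, not part of the project.

```python
# Values copied from the runs listed above.
# Tuple layout (illustrative): name, true_count, true_fraction, tokens, cost in USD.
runs = [
    ("reflection o1-mini-5-trials", 13, 0.52, 509_646, 4.4256),
    ("reflection o1-preview-5-trials", 13, 0.52, 491_867, 21.4333),
    ("o1-preview-5-trials", 9, 0.36, 273_547, 12.7703),
    ("gpt-4o-5-trials", 5, 0.2, 110_325, 0.6534),
    ("o1-preview-1-trial", 2, 0.4, 53_996, 2.5415),
]


def summarize(name, true_count, true_fraction, tokens, cost):
    """Derive per-pass figures for one run.

    Assumption: true_fraction == true_count / attempts, so the number of
    scored attempts can be recovered by division and rounding.
    """
    attempts = round(true_count / true_fraction)
    return {
        "trace": name,
        "attempts": attempts,
        "cost_per_pass": cost / true_count,
        "tokens_per_pass": tokens / true_count,
    }


summaries = [summarize(*run) for run in runs]

# Print the runs from cheapest to most expensive per passing solution.
for s in sorted(summaries, key=lambda s: s["cost_per_pass"]):
    print(
        f"{s['trace']:<32} attempts={s['attempts']:>2}  "
        f"$/pass={s['cost_per_pass']:.4f}  "
        f"tokens/pass={s['tokens_per_pass']:,.0f}"
    )
```

Under that assumption, the two reflection runs pass the same number of solutions (13), so the per-pass figures separate them only on cost and tokens.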