Traces: morgan/ai-hacker-cup-benchmark
| Trace | inputs.model | inputs.self | output ... mean | output ... true_count | output ... true_fraction | Called | Tokens | Cost |
|---|---|---|---|---|---|---|---|---|
| reflection o1-mini-5-trials 0951 | ReflectionSolver:v1 | Evaluation:v1 | 95.3632 | 13 | 0.52 | 1 year ago | 509,646 | $4.4256 |
| reflection o1-preview-5-trials e130 | ReflectionSolver:v0 | Evaluation:v1 | 330.353 | 13 | 0.52 | 1 year ago | 491,867 | $21.4333 |
| o1-preview-5-trials 937a | OneShotSolver:v7 | Evaluation:v1 | 220.679 | 9 | 0.36 | 1 year ago | 273,547 | $12.7703 |
| gpt-4o-5-trials 1666 | OneShotSolver:v6 | Evaluation:v1 | 58.9544 | 5 | 0.2 | 1 year ago | 110,325 | $0.6534 |
| o1-preview-1-trial 8e65 | OneShotSolver:v5 | Evaluation:v0 | 336.8651 | 2 | 0.4 | 1 year ago | 53,996 | $2.5415 |