Evaluations
Score summary
Group              Metric                     Value     Change
General            Cost                       $0.00     -$0.02
General            Tokens                     367       -685.68K
General            Latency                    3m2s      -1h30m29s
gpt4o_scorer       correctness.true_count     16        +9
gpt4o_scorer       correctness.true_fraction  0.67      +0.43
model_latency      mean                       224.17    +38.62
gpt4_scorer        correctness.true_count     3         +0
gpt4_scorer        correctness.true_fraction  0.6       +0
gpt4o_correctness  true_count                 2         +0
gpt4o_correctness  true_fraction              0.67      +0
accuracy           value                      0.67      +0
correctness        true_count                 2         +0
correctness        true_fraction              1         +0
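
The true_count and true_fraction values in the score summary appear to be related by simple arithmetic. The sketch below assumes that true_fraction is true_count divided by the number of scored examples. The page does not state this, and the function names are hypothetical, not part of any dashboard API. Under that assumption, the rounded fractions shown also give a rough estimate of how many examples each scorer saw.

```python
def summarize_boolean_scores(scores):
    """Aggregate boolean scores into true_count and true_fraction.

    Assumption: true_fraction = true_count / number of scored examples.
    """
    true_count = sum(1 for s in scores if s)
    true_fraction = true_count / len(scores) if scores else 0.0
    return {"true_count": true_count, "true_fraction": true_fraction}


def implied_sample_count(true_count, true_fraction):
    """Estimate the number of scored examples from the two summary values.

    The displayed fraction is rounded, so this is only an estimate.
    """
    return round(true_count / true_fraction)


# Three hypothetical boolean scores, two of them true.
summary = summarize_boolean_scores([True, True, False])
print(summary["true_count"], round(summary["true_fraction"], 2))

# Values taken from the score summary.
print(implied_sample_count(16, 0.67))  # gpt4o_scorer
print(implied_sample_count(3, 0.6))    # gpt4_scorer
print(implied_sample_count(2, 1))      # correctness
```

If the assumption holds, the gpt4o_scorer row would correspond to roughly 24 scored examples and the gpt4_scorer row to 5. This would mean the groups were computed over different numbers of examples, so their fractions are not directly comparable.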