Evaluations
Score summary (5)

Metric                                          Value     Delta
General
  Cost                                          $0.25     -$0.34
  Tokens                                        23.66K    -8.24K
  Latency                                       4.24s     -11.81s
CorrectnessLLMJudge
  answer correct.true_count                     14        -3
  answer correct.true_fraction                  0.58      -0.13
  answer correct.stderr                         0.1       +0.02
HallucinationLLMJudge
  follows from source.true_count                12        -12
  follows from source.true_fraction             0.5       -0.5
eval_retrieval
  first retrieval correct.true_count            15        +0
  first retrieval correct.true_fraction         0.63      +0
model_latency
  mean                                          1.73      -3.19
Correctness
  score.true_count                              3         +0
  score.true_fraction                           1         +0
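
The true_count, true_fraction, and stderr figures for a boolean score are related by simple arithmetic. The page does not state the number of evaluated examples or how stderr is computed, so the sketch below makes two assumptions: 24 examples (inferred from 14 / 0.58 and 12 / 0.5) and the standard error of a binomial proportion. The function name summarize is hypothetical and not part of any tool's API.

```python
import math


def summarize(true_count: int, n: int) -> tuple[float, float]:
    """Return (true_fraction, stderr) for a boolean score over n examples.

    Assumption: stderr is the standard error of a proportion,
    sqrt(p * (1 - p) / n). The dashboard may use a different estimator.
    """
    p = true_count / n
    stderr = math.sqrt(p * (1 - p) / n)
    return p, stderr


# Assumption: n = 24 examples, inferred from the reported counts and fractions.
frac, se = summarize(14, 24)
print(round(frac, 2), round(se, 2))
```

Under these assumptions the result is consistent with the reported 0.58 and 0.1 for answer correct. A sample-based estimator dividing by n - 1 would round to the same displayed value, so the output does not settle which formula the tool uses.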