Evaluations
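Column groups: inputs (model, self), output, HalluScorerEvaluator. The Trace, Feedback, and Status columns carry no captured values.

| generation_time (mean) | is_hallucination_ground_truth (true_count) |
| --- | --- |
| 0.0886 | 426 |
| 0.0902 | 426 |
| 0.0893 | 426 |
| 0.0898 | 426 |
| 0.0887 | 426 |
| 0.09 | 426 |
| 0.093 | 426 |
| 0.1046 | 426 |
| 0.1045 | 426 |
| 0.1049 | 426 |
| 0.1066 | 426 |
| 0.1066 | 426 |
| 0.1048 | 426 |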
Score summary
General

| Metric | Value | Change |
| --- | --- | --- |
| Cost | $0.00 | +$0.00 |
| Tokens | 1.36M | +0 |
| Latency | 1m59s | -11.11s |
model_output

| Metric | Value | Change |
| --- | --- | --- |
| total_tokens.mean | 1.54K | +1 |
| generation_time.mean | 0.09 | -0.01 |
| total_completion_tokens.mean | 2 | +0 |
HalluScorerEvaluator

| Metric | Value | Change |
| --- | --- | --- |
| scorer_accuracy.true_count | 544 | -49 |
| scorer_accuracy.true_fraction | 0.54 | -0.05 |
| total_tokens.mean | 1.54K | +1 |
| generation_time.mean | 0.09 | -0.01 |
| is_hallucination.true_count | 572 | +269 |
| is_hallucination.true_fraction | 0.57 | +0.27 |
| total_completion_tokens.mean | 2 | +0 |
| scorer_worked.true_count | 1K | +0 |
| scorer_worked.true_fraction | 1 | +0 |
| is_hallucination_ground_truth.true_count | 426 | +0 |
| is_hallucination_ground_truth.true_fraction | 0.43 | +0 |
| scorer_evaluation_metrics.precision | 0.47 | -0.06 |
| scorer_evaluation_metrics.recall | 0.64 | +0.26 |
| scorer_evaluation_metrics.f1 | 0.54 | +0.1 |
| scorer_evaluation_metrics.accuracy | 0.54 | -0.05 |
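
The precision, recall, F1, and accuracy figures for HalluScorerEvaluator can be cross-checked against the reported counts. The sketch below is not the scorer's own code. It assumes the "1K" shown for scorer_worked.true_count means exactly 1000 scored examples, and that scorer_accuracy.true_count (544) counts examples where the prediction matches the ground truth. The function names `confusion_from_totals` and `scorer_metrics` are hypothetical helpers introduced for illustration.

```python
def confusion_from_totals(n, actual_pos, predicted_pos, correct):
    """Recover (tp, fp, fn, tn) from the four aggregate counts.

    correct = tp + tn and tn = n - actual_pos - predicted_pos + tp,
    so 2 * tp = correct - n + actual_pos + predicted_pos.
    """
    twice_tp = correct - n + actual_pos + predicted_pos
    if twice_tp < 0 or twice_tp % 2:
        raise ValueError("counts are not mutually consistent")
    tp = twice_tp // 2
    fp = predicted_pos - tp
    fn = actual_pos - tp
    tn = n - tp - fp - fn
    if min(fp, fn, tn) < 0:
        raise ValueError("counts are not mutually consistent")
    return tp, fp, fn, tn


def scorer_metrics(tp, fp, fn, tn):
    """Standard binary-classification metrics from a confusion matrix."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * tp / (2 * tp + fp + fn)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


if __name__ == "__main__":
    # Assumed total of 1000 examples ("1K" in the summary).
    # 426 = ground-truth hallucinations, 572 = predicted hallucinations,
    # 544 = predictions that match the ground truth.
    tp, fp, fn, tn = confusion_from_totals(1000, 426, 572, 544)
    print(tp, fp, fn, tn)
    for name, value in scorer_metrics(tp, fp, fn, tn).items():
        print(f"{name}: {value:.2f}")
```

Under those assumptions the counts resolve to 271 true positives, 301 false positives, 155 false negatives, and 273 true negatives, and the derived metrics round to the reported 0.47, 0.64, 0.54, and 0.54.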
model_latency

| Metric | Value | Change |
| --- | --- | --- |
| mean | 4.45 | -0.46 |