Evaluations
[Evaluations table: 31 runs listed. Columns include inputs, output, Trace, Feedback, Status, and model, plus scorer columns such as compute_bleu.mean and compute_diff.mean. 3 charts shown.]
Score summary

General
    Cost       $0.05     ↘ −$0.02
    Tokens     64.88K    ↗ +37.41K
    Latency    8.32 s    ↗ +6.83 s

Scorers (mean)
    compute_hit_rate     0.64    ↗ +0.24
    compute_mrr          0.30    ↗ +0.12
    compute_ndcg         0.65    ↗ +0.25
    compute_map          0.95    ↗ +0.49
    compute_precision    0.65    ↗ +0.25
    compute_recall       0.63    ↗ +0.10
    compute_f1_score     0.61    ↗ +0.18
    model_latency        0.99    ↗ +0.69
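The ranking scorers above (compute_hit_rate, compute_mrr, compute_ndcg) are conventionally computed per query from the ranked list of retrieved document IDs and the set of relevant IDs, then averaged across queries to give the means shown. A minimal sketch under that assumption — the function names mirror the scorer names, but the actual implementations behind this dashboard are not shown here:

```python
from math import log2

def compute_hit_rate(retrieved, relevant, k=10):
    """1.0 if any relevant doc appears in the top-k results, else 0.0."""
    return float(any(doc in relevant for doc in retrieved[:k]))

def compute_mrr(retrieved, relevant):
    """Reciprocal rank of the first relevant result (0.0 if none found)."""
    for rank, doc in enumerate(retrieved, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0

def compute_ndcg(retrieved, relevant, k=10):
    """NDCG@k with binary relevance: DCG of the ranking over the ideal DCG."""
    dcg = sum(1.0 / log2(rank + 1)
              for rank, doc in enumerate(retrieved[:k], start=1)
              if doc in relevant)
    ideal = sum(1.0 / log2(rank + 1)
                for rank in range(1, min(len(relevant), k) + 1))
    return dcg / ideal if ideal else 0.0
```

Each function returns a per-query score in [0, 1]; the dashboard's `mean` aggregation would then average these over all evaluation examples.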
    llm_retrieval_scorer
        relevance.mean               0.67    ↘ −0.02
        relevance_rank_score.mean    0.48    ↘ −0.11
    compute_diff.mean                0.06    ↘ −0.01
    compute_levenshtein.mean         0.43    ↘ −0.01
    compute_rouge.mean               0.24    ↘ −0.02
    llm_response_scorer
        score.mean                   0.78    ↗ +0.44
        correct.true_count           1       ↗ +0
        correct.true_fraction        0.11    ↗ +0
    compute_bleu.mean                0.10    ↗ +0.02
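The string-similarity scorers (compute_diff, compute_levenshtein, compute_rouge, compute_bleu) compare the model's output text against a reference answer. As one concrete example, a Levenshtein scorer is typically the edit distance normalized into a similarity in [0, 1]; a self-contained sketch (this is an assumed formulation — the dashboard's exact scorer code is not shown):

```python
def compute_levenshtein(output: str, reference: str) -> float:
    """Similarity in [0, 1]: 1 - edit_distance / max(len(output), len(reference))."""
    if not output and not reference:
        return 1.0
    # Classic dynamic-programming edit distance, row by row, O(len(a) * len(b)).
    prev = list(range(len(reference) + 1))
    for i, ca in enumerate(output, start=1):
        curr = [i]
        for j, cb in enumerate(reference, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,        # deletion
                            curr[j - 1] + 1,    # insertion
                            prev[j - 1] + cost))  # substitution / match
        prev = curr
    return 1.0 - prev[-1] / max(len(output), len(reference))
```

A mean of 0.43, as in the table, would indicate that model outputs share a bit under half of their characters (in edit-distance terms) with the references.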