Evaluations
Columns: inputs, output, check_concrete_fields, city_match, Trace, Feedback, Status, model, self, true_count, true_fraction
Row values shown: 5, 1
1-48 of 48
Score summary
5
Metric                                   Value     Change

General
  Cost                                   $0.00     +$0.00
  Tokens                                 1.79K     +1.28K
  Latency                                7.58s     +7.53s

check_concrete_fields
  state_match.true_count                 0         -1
  state_match.true_fraction              0         -0.2
  city_match.true_count                  5         +4
  city_match.true_fraction               1         +0.8

check_value_fields
  avg_temp_f_err.mean                    0.02      -0.11
  median_income_err.mean                 0.35      -0.02
  population_err.mean                    0.3       -8.41

check_subjective_fields
  correct_known_for.true_count           3         +3
  correct_known_for.true_fraction        0.6       +0.6

output
  avg_temp_f.mean                        57.8      +7.8
  population.mean                        282.9K    -717.1K
  median_income.mean                     83.92K    -16.08K

model_latency
  mean                                   2.52      +2.52