gen_eval_dataset-evaluation:v0
Model | answer correct.true_count | answer correct.true_fraction | follows from source.true_count | follows from source.true_fraction | first retrieval correct.true_count | first retrieval correct.true_fraction | Avg. Latency | Run Date | Trials
Total Rows: 1