Evaluations
DocumentationAgentJudge scores and model_latency, one row per evaluation (all values are means). The inputs (model, self), output, Trace, Feedback, and Status columns carry no values here.

api_reference_retrieval_accuracy  num_api_reference_correct  num_correct_ops_extracted  op_extraction_accuracy  model_latency
N/A                               N/A                        N/A                        N/A                     0.1758
N/A                               N/A                        N/A                        N/A                     0.2806
0.9333                            4.5                        4.5                        0.9333                  10.0559
0.6819                            3.3                        4.5                        0.9333                  86.513
0.9333                            4.5                        4.5                        0.9333                  10.2518
0.9333                            4.5                        4.5                        0.9333                  9.8424
N/A                               N/A                        N/A                        N/A                     N/A
N/A                               N/A                        N/A                        N/A                     N/A
N/A                               N/A                        N/A                        N/A                     N/A
N/A                               N/A                        N/A                        N/A                     N/A
0.9333                            4.5                        4.5                        0.9333                  12.3483
0.0333                            0.1                        4.5                        0.9333                  86.5634
0.9333                            4.5                        4.5                        0.9333                  12.001
0.0333                            0.1                        4.3                        0.9097                  69.7025