Project: wandb-smle / e2e-hiring-assistant
Evaluations
| Trace | model (inputs) | self (inputs) | decision_match true_count | decision_match true_fraction | model_latency mean | has_hallucination true_count | has_hallucination true_fraction |
|---|---|---|---|---|---|---|---|
| eval-2025-03-05-rich-island (049c) | HiringAgent:v46 | evaluation_dataset-evaluation:v12 | 13 | 0.9286 | 67.2868 | 5 | 0.3571 |
| eval-2025-03-05-nice-daisy (5e53) | HiringAgent:v45 | evaluation_dataset-evaluation:v12 | 14 | 0.8235 | 34.5554 | 0 | 0 |
| eval-2025-03-03-gentle-rose (023d) | HiringAgent:v38 | evaluation_dataset-evaluation:v11 | N/A | N/A | 0.4417 | N/A | N/A |
| eval-2025-02-26-unique-hill (451d) | HiringAgent:v23 | evaluation_dataset-evaluation:v10 | 6 | 1 | 78.6385 | 0 | 0 |
| eval-2025-02-26-friendly-mountain (6a6b) | HiringAgent:v23 | evaluation_dataset-evaluation:v9 | 4 | 1 | 58.682 | 0 | 0 |
| eval-2025-02-26-brave-rain (ee9c) | HiringAgent:v22 | evaluation_dataset-evaluation:v8 | 6 | 1 | 109.3506 | 1 | 0.1667 |
| eval-2025-02-09-dazzling-meadow (ceb8) | HiringAgent:v6 | evaluation_dataset-evaluation:v7 | 6 | 1 | 196.7072 | 0 | 0 |
| eval-2025-02-09-fierce-dolphin (f696) | HiringAgent:v3 | evaluation_dataset-evaluation:v6 | 6 | 1 | 115.2909 | 1 | 0.1667 |
| eval-2025-02-09-graceful-lake (d874) | HiringAgent:v3 | evaluation_dataset-evaluation:v2 | 1 | 0.25 | 87.7583 | 1 | 0.25 |
| eval-2025-02-04-nice-forest (937b) | HiringAgent:v1 | evaluation_dataset-evaluation:v1 | 3 | 0.6 | 288.7976 | 0 | 0 |
| eval-2025-02-04-jubilant-wind (8940) | HiringAgent:v3 | evaluation_dataset-evaluation:v0 | 4 | 0.8 | 279.8493 | 0 | 0 |
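The true_count and true_fraction values reported above appear to be related by true_fraction = true_count / n, where n is the number of examples scored in a run. That relationship is an assumption about how the summary is computed, not something the page states. Under that assumption, a minimal Python sketch can recover n for a run; the helper name `implied_example_count` is hypothetical and the input pairs are copied from the decision_match values above.

```python
from typing import Optional


def implied_example_count(true_count: int, true_fraction: float) -> Optional[int]:
    """Estimate how many examples were scored.

    Assumes true_fraction == true_count / n (an assumption, not confirmed
    by the page). Returns None when the fraction is 0, because n cannot be
    recovered from 0 / n.
    """
    if true_fraction == 0:
        return None
    # The reported fractions are rounded to four decimals, so round the result.
    return round(true_count / true_fraction)


# (true_count, true_fraction) pairs for decision_match, copied from the runs above.
decision_match = {
    "eval-2025-03-05-rich-island": (13, 0.9286),
    "eval-2025-03-05-nice-daisy": (14, 0.8235),
    "eval-2025-02-09-graceful-lake": (1, 0.25),
}

for run_name, (count, fraction) in decision_match.items():
    print(run_name, implied_example_count(count, fraction))
```

If the assumption holds, differences in the implied n between runs would indicate that they were scored on different numbers of examples, which matters when comparing true_fraction across dataset versions.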