wandbot-eval Reports – Weights & Biases

Last 5 Evaluations: Accuracy & Cost Analysis

Analysis of the most recent 5 wandbot evaluations showing accuracy scores and associated costs

0

2025-06-11

5 months ago

Wandbot Evaluations: Working Chart Version

Fixed version using proper plots_html dictionary format

0

2025-06-11

5 months ago

Wandbot Evaluations: SVG Chart Version

Simple SVG chart version for W&B compatibility

0

2025-06-11

5 months ago

Wandbot Evaluations: Accuracy vs Cost (Fixed)

Fixed version with W&B compatible HTML chart

0

2025-06-11

5 months ago

Last 5 Evaluations: Accuracy vs Cost Analysis

Performance analysis of the latest 5 wandbot evaluations showing accuracy scores and associated costs

0

2025-06-11

5 months ago

Last 5 Evaluations: Accuracy vs Cost Analysis

Quick analysis of the last 5 wandbot evaluations showing accuracy and cost trends

0

2025-06-10

5 months ago

Simple Test Report

Basic test report

0

2025-06-10

5 months ago

Last 5 Evaluations Analysis

Analysis of accuracy vs cost for the most recent wandbot evaluations

0

2025-06-10

5 months ago

Last 5 Evaluations: Accuracy & Cost Analysis

Performance overview of the most recent wandbot evaluations

0

2025-06-10

5 months ago

Last 5 Wandbot Evaluations Performance Analysis

Analysis of the most recent 5 evaluations in the wandbot-eval project, showing correctness scores and accuracy percentages

0

2025-06-10

5 months ago

Testing Intercom's Fin on wandbot evals

0

2025-05-20

5 months ago

How to Evaluate an LLM, Part 1: Building an Evaluation Dataset for our LLM System

Building gold standard questions for evaluating our QA bot based on production data.

8

2023-09-25

7 months ago

wandbot v1.3 vs 1.2 Debugging Eval Accuracy

0

2025-01-06

9 months ago

How to evaluate an LLM Part 3: LLMs evaluating LLMs

Employing auto-evaluation strategies to evaluate different components of our Wandbot RAG-based support system.

3

2023-10-18

1 year ago

Copy of ayut's How to evaluate an LLM Part 3: LLMs evaluating LLMs

Employing auto-evaluation strategies to evaluate different components of our Wandbot RAG-based support system.

0

2024-06-12

1 year ago

Debug feat/v1.3 with Auto Evaluation

Journal of debugging an LLM app with auto-evaluation.

0

2024-04-03

2 years ago

Wandbot AutoEval Plots

0

2024-01-18

2 years ago

How to Evaluate an LLM, Part 2: Manual Evaluation of Wandbot, our LLM-Powered Docs Assistant

How we used manual annotation from subject matter experts to generate a baseline correctness score and what we learned about how to improve our system and our annotation process

4

2023-10-23

2 years ago

Wandbot Data Ingestion Report: 2023-10-09 10:03:52

This report contains details of the data ingestion process for the Wandbot run on 2023-10-09 10:03:52

0

2023-10-09

2 years ago

Wandbot Data Ingestion Report: 2023-09-04 09:47:17

This report contains details of the data ingestion process for the Wandbot run on 2023-09-04 09:47:17

0

2023-09-04

2 years ago