Reports
Created by
Created On
Last edited
Last 5 Evaluations: Accuracy & Cost Analysis
Analysis of the most recent 5 wandbot evaluations showing accuracy scores and associated costs
0
2025-06-11
Wandbot Evaluations: Working Chart Version
Fixed version using proper plots_html dictionary format
0
2025-06-11
Wandbot Evaluations: SVG Chart Version
Simple SVG chart version for W&B compatibility
0
2025-06-11
Wandbot Evaluations: Accuracy vs Cost (Fixed)
Fixed version with W&B compatible HTML chart
0
2025-06-11
Last 5 Evaluations: Accuracy vs Cost Analysis
Performance analysis of the latest 5 wandbot evaluations showing accuracy scores and associated costs
0
2025-06-11
Last 5 Evaluations: Accuracy vs Cost Analysis
Quick analysis of the last 5 wandbot evaluations showing accuracy and cost trends
0
2025-06-10
Simple Test Report
Basic test report
0
2025-06-10
Last 5 Evaluations Analysis
Analysis of accuracy vs cost for the most recent wandbot evaluations
0
2025-06-10
Last 5 Evaluations: Accuracy & Cost Analysis
Performance overview of the most recent wandbot evaluations
0
2025-06-10
Last 5 Wandbot Evaluations Performance Analysis
Analysis of the most recent 5 evaluations in the wandbot-eval project, showing correctness scores and accuracy percentages
0
2025-06-10
0
2025-05-20
How to Evaluate an LLM, Part 1: Building an Evaluation Dataset for our LLM System
Building gold standard questions for evaluating our QA bot based on production data.
8
2023-09-25
0
2025-01-06
How to evaluate an LLM Part 3: LLMs evaluating LLMs
Employing auto-evaluation strategies to evaluate different components of our Wandbot RAG-based support system.
3
2023-10-18
Copy of ayut's How to evaluate an LLM Part 3: LLMs evaluating LLMs
Employing auto-evaluation strategies to evaluate different components of our Wandbot RAG-based support system.
0
2024-06-12
Debug feat/v1.3 with Auto Evaluation
Journal of debugging an LLM app with auto-evaluation.
0
2024-04-03
0
2024-01-18
How to Evaluate an LLM, Part 2: Manual Evaluation of Wandbot, our LLM-Powered Docs Assistant
How we used manual annotation from subject matter experts to generate a baseline correctness score, and what we learned about improving our system and our annotation process.
4
2023-10-23
Wandbot Data Ingestion Report: 2023-10-09 10:03:52
This report contains details of the data ingestion process for the Wandbot run on 2023-10-09 10:03:52
0
2023-10-09
Wandbot Data Ingestion Report: 2023-09-04 09:47:17
This report contains details of the data ingestion process for the Wandbot run on 2023-09-04 09:47:17
0
2023-09-04