
Wandbot Evaluations: Working Chart Version

Fixed version using proper plots_html dictionary format
Created on June 11 | Last edited on June 11

Wandbot Evaluation Performance Analysis (Working Version)

Summary

This report analyzes the last five wandbot evaluations, comparing accuracy and cost per evaluation.

Key Findings

- **Best Performance**: Jun 10 v1.3.3 (91.0% accuracy, $6.03)
- **Cost Range**: $0.38 - $7.61 per evaluation
- **Poor Performers**: May 20 trials (9-11% accuracy)
- **Trend**: Recent versions show improved accuracy
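
One way to weigh accuracy against cost is dollars spent per accuracy point. The sketch below is illustrative only: `Evaluation` and `cost_per_accuracy_point` are hypothetical names, not part of wandbot or W&B, and the only figures used are those of the Jun 10 v1.3.3 run listed in the key findings. The other runs are not listed individually in this report, so they are left out rather than guessed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evaluation:
    """Hypothetical record for one evaluation run (not a wandbot or W&B type)."""
    label: str
    accuracy_pct: float  # accuracy as a percentage, e.g. 91.0
    cost_usd: float      # total cost of the evaluation in US dollars


def cost_per_accuracy_point(ev: Evaluation) -> float:
    """Return dollars spent per percentage point of accuracy."""
    if ev.accuracy_pct <= 0:
        raise ValueError("accuracy must be positive to compute a ratio")
    return ev.cost_usd / ev.accuracy_pct


# Figures taken from the key findings for the best-performing run.
best = Evaluation(label="Jun 10 v1.3.3", accuracy_pct=91.0, cost_usd=6.03)
print(f"{best.label}: ${cost_per_accuracy_point(best):.4f} per accuracy point")
```

A ratio like this only makes sense for runs with meaningful accuracy; for the low-accuracy May 20 trials it would mostly reflect the failed run rather than cost efficiency.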

Visualizations

The chart below shows the accuracy and cost comparison across evaluations.
