Wandbot Evaluations: Working Chart Version
Fixed version using proper plots_html dictionary format
Created on June 11 | Last edited on June 11
Wandbot Evaluation Performance Analysis (Working Version)
Summary
This report analyzes the last five wandbot evaluations, comparing accuracy and cost.
Key Findings
- **Best Performance**: Jun 10 v1.3.3 (91.0% accuracy, $6.03)
- **Cost Range**: $0.38 - $7.61 per evaluation
- **Poor Performers**: May 20 trials (9-11% accuracy)
- **Trend**: Recent versions show improved accuracy
Visualizations
The chart below shows the accuracy and cost comparison across evaluations.
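The subtitle refers to a plots_html dictionary, but its schema is not shown in this report. As a minimal sketch, assuming it maps a chart name to an HTML string, the accuracy and cost comparison could be assembled roughly as follows. The function name `build_plots_html`, the key `accuracy_vs_cost`, and the record fields `label`, `accuracy_pct`, and `cost_usd` are all hypothetical, chosen only for illustration.

```python
from html import escape


def build_plots_html(evaluations):
    """Return a {chart_name: html_string} dictionary (assumed schema).

    Each evaluation is a dict with 'label', 'accuracy_pct', and 'cost_usd'
    keys. These field names are hypothetical, not taken from wandbot.
    """
    rows = []
    for ev in evaluations:
        # Bar width equals the accuracy percentage, clamped to 0-100.
        width = max(0.0, min(100.0, ev["accuracy_pct"]))
        rows.append(
            "<tr>"
            f"<td>{escape(ev['label'])}</td>"
            "<td style='width:200px'>"
            f"<div style='background:#4c78a8;height:12px;width:{width:.1f}%'></div>"
            "</td>"
            f"<td>{ev['accuracy_pct']:.1f}%</td>"
            f"<td>${ev['cost_usd']:.2f}</td>"
            "</tr>"
        )
    table = (
        "<table>"
        "<tr><th>Evaluation</th><th colspan='2'>Accuracy</th><th>Cost</th></tr>"
        + "".join(rows)
        + "</table>"
    )
    # One entry per chart: the key is the chart name, the value is its HTML.
    return {"accuracy_vs_cost": table}


# Only the one evaluation fully specified in Key Findings is used here;
# the remaining evaluations would be appended to this list in the same form.
plots_html = build_plots_html(
    [{"label": "Jun 10 v1.3.3", "accuracy_pct": 91.0, "cost_usd": 6.03}]
)
```

Keeping each chart as a self-contained HTML string under its own key means further charts can be added as extra dictionary entries without changing the existing one.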