
Wandbot Evaluations: SVG Chart Version

Simple SVG chart version for W&B compatibility
Created on June 11 | Last edited on June 11

Wandbot Evaluation Performance Analysis (SVG Chart)

Summary

Analysis of the last 5 wandbot evaluations showing accuracy and cost metrics.

Key Findings

- **Best Performance**: Jun 10 v1.3.3 (91.0% accuracy, $6.03)
- **Cost Range**: $0.38 - $7.61 per evaluation
- **Poor Performers**: May 20 trials (9.0% and 11.2% accuracy)
- **Trend**: Excluding the May 20 trials, accuracy rose from 85.3% (Apr 17, o4-mini) to 90.4% (May 19, v1.3.2 PROD) to 91.0% (Jun 10, v1.3.3)

Raw Data

| Date | Version | Accuracy | Cost |
|------|---------|----------|------|
| Jun 10 | v1.3.3 | 91.0% | $6.03 |
| May 20 | Trial 5 | 9.0% | $1.88 |
| May 20 | Trial 1 | 11.2% | $0.38 |
| May 19 | v1.3.2 PROD | 90.4% | $6.02 |
| Apr 17 | o4-mini | 85.3% | $7.61 |
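
The key findings can be re-derived from the table with a few lines of Python. This is a minimal sketch, not the script used to produce the report: the `evaluations` list is copied by hand from the rows above, and the `summarize` helper and its `poor_threshold` cutoff of 50% are hypothetical names and values chosen only for illustration.

```python
# Rows copied from the Raw Data table: (date, version, accuracy %, cost USD).
evaluations = [
    ("Jun 10", "v1.3.3", 91.0, 6.03),
    ("May 20", "Trial 5", 9.0, 1.88),
    ("May 20", "Trial 1", 11.2, 0.38),
    ("May 19", "v1.3.2 PROD", 90.4, 6.02),
    ("Apr 17", "o4-mini", 85.3, 7.61),
]


def summarize(rows, poor_threshold=50.0):
    """Return the best run, the cost range, and the runs below the cutoff.

    poor_threshold is an assumed cutoff (in accuracy %), not a value
    taken from the report.
    """
    best = max(rows, key=lambda row: row[2])        # highest accuracy
    costs = [row[3] for row in rows]
    poor = [row for row in rows if row[2] < poor_threshold]
    return {
        "best": best,
        "cost_range": (min(costs), max(costs)),
        "poor": poor,
    }


summary = summarize(evaluations)
date, version, accuracy, cost = summary["best"]
low, high = summary["cost_range"]

print(f"Best: {date} {version} ({accuracy:.1f}% accuracy, ${cost:.2f})")
print(f"Cost range: ${low:.2f} - ${high:.2f}")
print("Poor performers:", ", ".join(row[1] for row in summary["poor"]))
```

Keeping the rows as plain tuples avoids any dependency beyond the standard library; with more evaluations, a dataclass or a dataframe would probably be easier to maintain.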