Last 5 Evaluations: Accuracy vs Cost Analysis
Performance analysis of the latest 5 wandbot evaluations showing accuracy scores and associated costs
Created on June 11|Last edited on June 11
Wandbot Evaluation Performance Analysis
Summary
Analysis of the last 5 wandbot evaluations showing accuracy and cost metrics.
Key Findings
- **Best Performance**: Jun 10 v1.3.3 (91.0%) and May 19 v1.3.2 PROD (90.4%) both reached roughly 90% accuracy
- **Cost Range**: $0.38 - $7.61 per evaluation
- **Poor Performers**: Both May 20 trials showed very low accuracy (9.0% for Trial 5, 11.2% for Trial 1)
- **Cost vs Accuracy**: The two most accurate evaluations cost about $6 each, but cost does not track accuracy strictly: Apr 17 o4-mini was the most expensive at $7.61 yet reached only 85.3% accuracy
Evaluation Details
1. **Jun 10 v1.3.3**: 91.0% accuracy, $6.03 cost
2. **May 20 Trial 5**: 9.0% accuracy, $1.88 cost
3. **May 20 Trial 1**: 11.2% accuracy, $0.38 cost
4. **May 19 v1.3.2 PROD**: 90.4% accuracy, $6.02 cost
5. **Apr 17 o4-mini**: 85.3% accuracy, $7.61 cost
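The findings above can be reproduced from the five data points with a few lines of Python. This is a minimal sketch, not part of the original evaluation pipeline: the dictionary field names and the `cost_per_point` helper are illustrative assumptions, and only the accuracy and cost figures come from the list above.

```python
# Accuracy (%) and cost ($) copied from the evaluation list above.
# Field names are illustrative, not taken from the wandbot codebase.
evaluations = [
    {"name": "Jun 10 v1.3.3", "accuracy": 91.0, "cost": 6.03},
    {"name": "May 20 Trial 5", "accuracy": 9.0, "cost": 1.88},
    {"name": "May 20 Trial 1", "accuracy": 11.2, "cost": 0.38},
    {"name": "May 19 v1.3.2 PROD", "accuracy": 90.4, "cost": 6.02},
    {"name": "Apr 17 o4-mini", "accuracy": 85.3, "cost": 7.61},
]


def cost_per_point(run):
    """Dollars spent per percentage point of accuracy (hypothetical metric)."""
    return run["cost"] / run["accuracy"]


# Headline figures: most accurate, cheapest, and most expensive evaluation.
best = max(evaluations, key=lambda r: r["accuracy"])
cheapest = min(evaluations, key=lambda r: r["cost"])
priciest = max(evaluations, key=lambda r: r["cost"])

# Rank evaluations from least to most dollars per accuracy point.
ranked = sorted(evaluations, key=cost_per_point)

print(f'Best accuracy: {best["name"]} ({best["accuracy"]}%)')
print(f'Cost range: ${cheapest["cost"]:.2f} - ${priciest["cost"]:.2f}')
for run in ranked:
    print(f'{run["name"]}: ${cost_per_point(run):.3f} per accuracy point')
```

A cost-per-point ratio should be read with care: May 20 Trial 1 ranks first on this metric purely because it was cheap, even though its 11.2% accuracy makes it unusable. Filtering to runs above an accuracy threshold before ranking is one way to avoid that.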