Accuracy Scorer

Created on December 2 | Last edited on December 3
We also support the classic accuracy scorer, which handles three well-known tasks: binary accuracy, multi-class accuracy, and multi-label accuracy.
Definition
Accuracy is simply given by:

$$\text{Accuracy} = \frac{\text{Number of Correct Predictions}}{\text{Total Number of Predictions}}$$
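To make the formula concrete, here is a minimal pure-Python sketch of the ratio for the three task types. It is not the AccuracyScorer implementation: the `accuracy` helper and the toy data are hypothetical. It also assumes that a multi-label prediction counts as correct only when the whole label set matches exactly, which may differ from how the scorer treats multi-label tasks.

```python
from typing import Hashable, Sequence


def accuracy(predictions: Sequence[Hashable], targets: Sequence[Hashable]) -> float:
    """Number of correct predictions divided by total number of predictions.

    Hypothetical helper for illustration only; not part of weave.
    """
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if len(targets) == 0:
        raise ValueError("accuracy is undefined for an empty set of predictions")
    correct = sum(p == t for p, t in zip(predictions, targets))
    return correct / len(targets)


# Binary: labels are 0 or 1. Three of the four predictions match.
binary_acc = accuracy([1, 0, 1, 1], [1, 0, 0, 1])

# Multi-class: one label per example. Two of the three predictions match.
multiclass_acc = accuracy(["cat", "dog", "bird"], ["cat", "dog", "cat"])

# Multi-label (assumption: exact match on the full label set).
# frozenset makes the comparison independent of label order.
multilabel_acc = accuracy(
    [frozenset({"a", "b"}), frozenset({"c"})],
    [frozenset({"b", "a"}), frozenset({"b", "c"})],
)

print(binary_acc, multiclass_acc, multilabel_acc)
```

Under the exact-match assumption, a multi-label prediction that gets only some of the labels right contributes nothing to the numerator. A per-label variant would instead count each label decision separately.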
AccuracyScorer
Try out the Colab notebook to see how to use this scorer for different tasks.
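import weave
from weave.scorers import AccuracyScorer

# Choose the task type when constructing the scorer
accuracy_scorer = AccuracyScorer(task="binary")

# Named `evaluation` to avoid shadowing Python's built-in `eval`
evaluation = weave.Evaluation(
    dataset=...,
    scorers=[accuracy_scorer],
)

# evaluate your model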
Binary Accuracy
Here's a comparison of three LLM systems, using gpt-3.5-turbo, gpt-4o-mini, and gpt-4o, on the IMDB sentiment analysis dataset.
Figure 1: Comparison of three LLM systems using the AccuracyScorer.