Fine-tuning an LLM judge to reduce hallucination

Webinar on July 17, 8am PT / 5pm CEST

In this webinar, we explore how out-of-domain data can improve the fine-tuning of Mistral AI language models for detecting factual inconsistencies, also known as hallucinations.

Inspired by Eugene Yan’s article on bootstrapping hallucination detection, we use the Factual Inconsistency Benchmark (FIB) dataset and initially fine-tune a Mistral-based model on this dataset alone, with limited success.

We then pre-fine-tune on Wikipedia summaries from the Unified Summarization Benchmark (USB) before applying task-specific fine-tuning on FIB. This two-stage approach significantly improves performance.

Our methodology incorporates Weights & Biases Weave to automate model evaluation, demonstrating that pre-fine-tuning on related but out-of-domain data can effectively bootstrap the detection of factual inconsistencies, thus reducing the need for extensive task-specific data collection. This technique offers a promising strategy for enhancing the accuracy and applicability of natural language inference models in production environments.
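At its core, the automated evaluation Weave handles follows a simple pattern: run the judge model over a labeled dataset of (document, summary) pairs and score its verdicts. The sketch below illustrates that pattern in plain Python with a hypothetical `judge` stand-in (the real judge is the fine-tuned model, and Weave's `Evaluation` abstraction replaces the hand-rolled loop); all names here are illustrative, not the webinar's actual code.

```python
# Hypothetical sketch of the evaluation loop that Weave automates:
# score a judge's verdicts against labeled (document, summary, label) examples.

def judge(document: str, summary: str) -> bool:
    """Placeholder for the fine-tuned judge model; returns True if the
    summary is judged factually consistent with the document."""
    # Trivial heuristic stand-in: flag summaries that introduce
    # tokens absent from the source document.
    doc_tokens = set(document.lower().split())
    return all(t in doc_tokens for t in summary.lower().split())

def evaluate(judge_fn, examples) -> float:
    """Accuracy of the judge over labeled examples."""
    correct = sum(
        judge_fn(ex["document"], ex["summary"]) == ex["label"]
        for ex in examples
    )
    return correct / len(examples)

examples = [
    {"document": "the cat sat on the mat", "summary": "the cat sat", "label": True},
    {"document": "the cat sat on the mat", "summary": "the dog barked", "label": False},
]

print(evaluate(judge, examples))  # -> 1.0
```

With Weave, the dataset and scorer are passed to an `Evaluation` object instead, so every run's predictions and scores are logged and comparable across fine-tuning variants.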

Thomas Capelle

ML Engineer
Weights & Biases

Sophia Yang

Head of Developer Relations
Mistral AI

© 2024 Weights & Biases.