Fine-tuning Azure OpenAI models with Weights & Biases
Discover how to fine-tune OpenAI models using Azure OpenAI and Weights & Biases for enterprise-grade AI performance in healthcare, legal, finance, and more.
Fine-tuning LLMs is crucial for aligning pre-trained models, such as GPT-4o and GPT-4o mini, with specific tasks. For instance, by training an LLM on specific medical literature and patient data, healthcare providers can enhance the model's ability to suggest potential diagnoses based on symptoms. Similarly, training it with legal documents can help lawyers draft precise contracts and detailed legal memoranda more quickly and efficiently.
Various AI application optimization methods, including prompt engineering and retrieval-augmented generation (RAG), can boost performance effectively, but they often lack the precision required to achieve the desired behavior in specialized applications.
Table of contents
- Mastering model customization: Fine-tuning Azure OpenAI service models with Weights & Biases
- Fine-tuning Azure OpenAI Service
- Azure OpenAI fine-tuning with Weights & Biases
- Prerequisites for using the Azure OpenAI integration
- Additional resources
Mastering model customization: Fine-tuning Azure OpenAI service models with Weights & Biases
The March 4, 2025 webinar Mastering Model Customization: Fine-Tuning Azure OpenAI Service Models with Weights & Biases, hosted by Microsoft and Weights & Biases, provides an in-depth exploration of fine-tuning’s benefits and applications across various industries.
Alicia Frame (Azure AI Foundry, Fine-tuning Lead) and Chris Van Pelt (Weights & Biases Co-founder and Chief Information Security Officer) examined the model optimization workflow and how fine-tuning compares to other model customization strategies. Then, Anish Shah (Weights & Biases AI Engineer) demonstrated fine-tuning and evaluating a model using the integration between Azure OpenAI fine-tuning and Weights & Biases.
Here's a video of that webinar:
Fine-tuning Azure OpenAI models can deliver gains in quality and efficiency that other optimization techniques cannot match. However, fine-tuning can be resource-intensive: building the right fine-tuning dataset and perfecting a hyperparameter optimization strategy require expertise and time. This is where the Weights & Biases and Microsoft partnership shines. By leveraging Weights & Biases’ Weave and Models tools alongside Azure’s robust infrastructure, enterprises can efficiently create customized AI solutions without the usual complexities.
Fine-tuning Azure OpenAI Service
Azure OpenAI Service offers a robust, seamless platform for fine-tuning AI models. By leveraging Azure's infrastructure, users can customize models like GPT-4 to meet specific domain needs, improving quality and efficiency. This tailored approach reduces processing costs and enhances scalability, making it an ideal choice for industries requiring specialized solutions.

Fine-tuning with Azure OpenAI allows businesses to create AI models that excel in their specific fields. Whether it's healthcare, finance, or legal, Azure's platform supports the integration of industry-specific data, letting you customize models to deliver the outcomes your domain requires. Additionally, with Azure's robust and globally available infrastructure, organizations can handle the computational demands of fine-tuning without capacity constraints.
Azure's documentation highlights the advantages of using Azure AI Foundry for fine-tuning:
- Higher quality results than what you can get just from prompt engineering
- The ability to train on more examples than can fit into a model's max request context limit
- Token savings due to shorter prompts
- Lower-latency requests, particularly when using smaller models
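For reference, Azure OpenAI fine-tuning for chat models such as GPT-4o and GPT-4o mini takes training data as JSON Lines in the chat-completions format. The minimal Python sketch below writes a toy legal-drafting dataset to a hypothetical training_data.jsonl file; the examples and file name are placeholders, and a production dataset would contain far more domain-specific samples.

```python
import json

# Hypothetical training examples in the chat-completions JSONL format that
# Azure OpenAI fine-tuning expects for models like GPT-4o mini.
training_examples = [
    {
        "messages": [
            {"role": "system", "content": "You are a contracts assistant for a law firm."},
            {"role": "user", "content": "Draft a one-sentence confidentiality clause."},
            {"role": "assistant", "content": "Each party shall keep the other party's Confidential Information strictly confidential and use it solely to perform its obligations under this Agreement."},
        ]
    },
    {
        "messages": [
            {"role": "system", "content": "You are a contracts assistant for a law firm."},
            {"role": "user", "content": "Draft a one-sentence governing-law clause for New York."},
            {"role": "assistant", "content": "This Agreement shall be governed by and construed in accordance with the laws of the State of New York, without regard to its conflict-of-laws principles."},
        ]
    },
]

# Write one JSON object per line, as required by the JSONL format.
with open("training_data.jsonl", "w", encoding="utf-8") as f:
    for example in training_examples:
        f.write(json.dumps(example) + "\n")
```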
Azure OpenAI fine-tuning with Weights & Biases
The collaboration between W&B tools and Azure OpenAI Service offers a comprehensive solution for customizing models. W&B Models’ powerful experiment tracking and model management tools are integrated with Azure’s scalable cloud infrastructure. Together, they provide a streamlined approach to fine-tuning LLMs.
Steve Sweetman, Head of Azure OpenAI Service at Microsoft, does an excellent job summarizing the benefits of using Weights & Biases and Microsoft Azure AI Foundry together to fine-tune LLMs more effectively:
- Comprehensive experiment tracking: W&B Models provides a robust suite of tools for tracking and visualizing every aspect of model training. From hyperparameter optimization to model performance metrics, the W&B Models platform offers detailed insights that help data scientists and engineers understand the impact of every change they make during the fine-tuning process for models like GPT-4, GPT-4o, and GPT-4o mini.
- Seamless integration with Azure OpenAI Service: The collaboration enables seamless integration between W&B Models and Azure OpenAI Service, allowing users to easily connect their data and models on Azure with W&B Models experiment tracking tools. This integration simplifies the setup process and ensures that all model data is securely managed within the Azure ecosystem.
- Scalable infrastructure: Azure provides the scalable cloud infrastructure needed to handle the heavy computational demands of fine-tuning LLMs. Whether fine-tuning the larger GPT-4 for complex use cases or the more nimble GPT-4o mini for faster, real-time applications, organizations can leverage Azure's robust, globally available infrastructure without worrying about capacity constraints.
- Enhanced collaboration and version control: W&B Models facilitates collaboration among data science teams, allowing them to easily share results, compare experiments, and iterate on models more effectively. This is particularly valuable in enterprise settings where multiple stakeholders may need to collaborate on model development and deployment.
- Real-time monitoring and evaluation: Through the integration, users can utilize W&B Weave to evaluate, monitor, and iterate on the performance of their fine-tuned models in real time. Whether fine-tuning GPT-4 for customer engagement or optimizing GPT-4o for strategic decision-making, this capability allows for rapid identification of issues and continuous optimization.
W&B Models’ extensive experiment tracking and metric visualization capabilities complement Azure AI Foundry’s efficient and scalable fine-tuning interface. W&B Models allows you to quickly examine and share the results for each fine-tuning run with your team, facilitating collaboration and decision-making.
If a run does not produce satisfactory results, the integration between Azure OpenAI fine-tuning and W&B Models makes it easy to make the desired updates before re-running the experiment, as sketched below. Just update the hyperparameters or the training data, start the process, and monitor the results in your Weights & Biases workspace. The Models user interface offers a number of ways to compare runs, driving optimization efforts and ultimately revealing the best-performing model.
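As a rough sketch of that loop, the snippet below uses the openai package's AzureOpenAI client to upload the training file and launch a fine-tuning job with explicit hyperparameters. The endpoint, API version, model name, and hyperparameter values are placeholders, and the sketch assumes the Weights & Biases integration has already been enabled on your Azure OpenAI resource (see the prerequisites below) so that training metrics stream into your W&B workspace.

```python
import os
from openai import AzureOpenAI

# Placeholder configuration: assumes the AZURE_OPENAI_* environment variables
# are set and the W&B integration is already configured on the resource.
client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-10-21",
)

# Upload the training data prepared earlier.
training_file = client.files.create(
    file=open("training_data.jsonl", "rb"),
    purpose="fine-tune",
)

# Start a fine-tuning job; adjust the hyperparameters (or swap in a revised
# training file) between runs, then compare the runs in your W&B workspace.
job = client.fine_tuning.jobs.create(
    model="gpt-4o-mini-2024-07-18",  # placeholder base model
    training_file=training_file.id,
    hyperparameters={"n_epochs": 3, "learning_rate_multiplier": 1.0, "batch_size": 8},
)
print(job.id, job.status)
```

Between runs, you would typically change only the hyperparameters dictionary or the training file, then use the Models workspace to compare the resulting runs side by side.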

Once a fine-tuned LLM version has been selected, you can use W&B Weave to further validate and understand the model’s performance. Weave permits rapid iteration during AI application development: rigorous evaluations of LLM-powered applications, with results compared across runs, let you optimize performance along multiple dimensions, including quality, latency, cost, and safety. You can analyze fine-tuned and base models side by side and determine how well each achieves the desired behavior for your AI application.
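As an illustration, here is a minimal Weave evaluation sketch, assuming a recent version of the weave package, that scores a base deployment and a fine-tuned deployment on the same toy dataset. The project name, deployment names, dataset, and keyword-based scorer are all hypothetical stand-ins for your own evaluation assets.

```python
import asyncio
import os

import weave
from openai import AzureOpenAI

weave.init("azure-finetune-eval")  # hypothetical W&B project name

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-10-21",  # placeholder API version
)

# Toy evaluation set; a real one would hold held-out, domain-specific examples.
dataset = [
    {"question": "Draft a one-sentence confidentiality clause.", "expected_keyword": "Confidential"},
    {"question": "Draft a one-sentence governing-law clause for New York.", "expected_keyword": "New York"},
]

@weave.op()
def keyword_scorer(expected_keyword: str, output: str) -> dict:
    # Crude placeholder metric: does the completion mention the expected keyword?
    return {"contains_keyword": expected_keyword.lower() in output.lower()}

def make_model_fn(deployment_name: str):
    # Wrap an Azure OpenAI deployment (placeholder names) as a Weave op.
    @weave.op()
    def predict(question: str) -> str:
        response = client.chat.completions.create(
            model=deployment_name,
            messages=[{"role": "user", "content": question}],
        )
        return response.choices[0].message.content
    return predict

evaluation = weave.Evaluation(dataset=dataset, scorers=[keyword_scorer])

# Run the same evaluation against the base and fine-tuned deployments.
asyncio.run(evaluation.evaluate(make_model_fn("gpt-4o-mini")))
asyncio.run(evaluation.evaluate(make_model_fn("gpt-4o-mini-legal-ft")))
```

Weave records each prediction and score as traces, so the two evaluation runs can be compared side by side in the W&B UI across whatever metrics your scorers emit.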
Prerequisites for using the Azure OpenAI integration
To start using the Azure AI Foundry and Weights & Biases integration, you need the following:
- An Azure OpenAI resource. For more information, see Create a resource and deploy a model with Azure OpenAI. The resource should be in a region that supports fine-tuning.
- Cognitive Services OpenAI Contributor access to that Azure OpenAI resource for every team member who needs to fine-tune models.
- An Azure Key Vault. For more information on creating a key vault, see the Azure Key Vault quickstart.
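With those pieces in place, a quick sanity check from Python is to point the AzureOpenAI client at your resource and list any existing fine-tuning jobs; the environment variable names and API version below are placeholders for your own configuration.

```python
import os
from openai import AzureOpenAI

# Placeholder configuration; adjust the environment variable names, endpoint,
# and API version to match your Azure OpenAI resource.
client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-10-21",
)

# Listing fine-tuning jobs confirms the resource is reachable and the caller
# has sufficient permissions on it.
for job in client.fine_tuning.jobs.list():
    print(job.id, job.model, job.status)
```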
Additional resources
There is plenty of content available about using Azure AI Foundry and Weights & Biases together. Here are some links to blog posts and instructional guides to help you get started:
- How Weights & Biases and Microsoft Azure are Empowering Enterprises to Fine-Tune Models (Microsoft blog post)
- Azure OpenAI Fine-Tuning: How to fine-tune Azure OpenAI models using W&B (Weights & Biases user guide)