Ray Tune: Distributed Hyperparameter Optimization at Scale
This article takes a look at how to use Ray Tune with W&B to run an effective distributed hyperparameter optimization pipeline at scale.
Created on August 11|Last edited on November 5
This article explores how to use Ray Tune with Weights & Biases. We run experiments that tune hyperparameters for generating MNIST, STL10, and CelebA images, and show how the combination of these two tools provides a one-stop shop for scaling machine learning experimentation and model development.
Table of Contents
- Weights & Biases 💜 Ray Tune
- Getting Started
- Hear From the Ray Tune Team
- A Deeper Dive
- Generation of MNIST images
- Generation of STL10 Images
- Generation of CelebA Images
- Conclusion
Weights & Biases 💜 Ray Tune
Weights & Biases helps your ML team unlock their productivity by optimizing, visualizing, collaborating on, and standardizing their model and data pipelines – regardless of framework, environment, or workflow.
Used by the likes of OpenAI, Toyota, and GitHub, W&B is part of the new standard of best practices for machine learning. By saving everything you need to track and compare models (architecture, hyperparameters, weights, model predictions, GPU usage, git commits, and even datasets), W&B makes your ML workflows reproducible.

Today we're announcing an integration with a tool our community adores: Ray Tune, one of the first and most respected libraries for scalable hyperparameter optimization. With just a few lines of code, Ray Tune helps researchers optimize their models with state-of-the-art algorithms and scale their hyperparameter optimization process to hundreds of nodes and GPUs.

Why We Chose Ray Tune – Delivering Model Development and Hyperparameter Optimization at Scale
We're especially excited about the possibilities this collaboration with our friends at Ray Tune opens up. Both Weights & Biases and Ray Tune are built for scale and handle millions of models every month for teams doing some of the most cutting-edge deep-learning research.
W&B is a centralized repository for everything you need to track, reproduce, and gain insights from your models easily, while Ray Tune provides a simple interface for scaling and running distributed experiments. A few reasons why our community likes Ray Tune:
- Simple distributed execution: Ray Tune makes it easy to scale from a single node to multiple GPUs, and on to multiple nodes.
- Large number of algorithms: Ray Tune has a huge number of algorithms, including Population Based Training, ASHA, and HyperBand.
- Framework agnostic: Ray Tune works across frameworks including PyTorch, Keras, TensorFlow, XGBoost, and PyTorch Lightning.
- Fault tolerance: Ray Tune is built on top of Ray, providing fault tolerance out of the box.
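To give a feel for what early-stopping schedulers such as ASHA and HyperBand do, here is a minimal pure-Python sketch of synchronous successive halving, the idea those algorithms build on. It is not Ray Tune's implementation: `successive_halving`, `toy_loss`, and the candidate learning rates are hypothetical names and values chosen for illustration only.

```python
import math


def successive_halving(configs, evaluate, min_budget=1, reduction_factor=2):
    """Repeatedly keep the best fraction of configs while growing the budget."""
    survivors = list(configs)
    budget = min_budget
    while len(survivors) > 1:
        # Score every surviving config at the current budget (lower is better).
        scored = sorted(survivors, key=lambda cfg: evaluate(cfg, budget))
        # Keep the top 1/reduction_factor of the configs, but at least one.
        keep = max(1, len(scored) // reduction_factor)
        survivors = scored[:keep]
        # Survivors earn a larger training budget in the next round.
        budget *= reduction_factor
    return survivors[0]


def toy_loss(config, budget):
    # Hypothetical objective: best near lr = 0.01, improving with more budget.
    return abs(math.log10(config["lr"]) + 2) + 1.0 / budget


candidates = [{"lr": lr} for lr in (0.001, 0.005, 0.01, 0.05, 0.1)]
best = successive_halving(candidates, toy_loss)
print(best)
```

The point of the design is that poor configurations are discarded after a small budget, so most of the compute goes to the promising ones. ASHA applies the same promotion rule asynchronously so that workers never wait on each other.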
Getting Started
There are two ways you can use the W&B integration with Ray Tune.
1. The WandbLogger
from ray import tune
from ray.tune.integration.wandb import WandbLogger

tune.run(
    train,
    loggers=[WandbLogger],
    config={"wandb": {"project": "rayTune", "monitor_gym": True}},
)
WandbLogger automatically logs the metrics reported to the W&B dashboard of the project.
2. The wandb_mixin
You can also use the wandb_mixin function decorator when you need to log custom metrics, charts, and other visualizations.
@wandb_mixin
def train(...):
    ...
    wandb.log({...})
    tune.report(metric=score)
    ...
Note: The W&B integration with Ray Tune is available on the nightly version of Ray and will be included in Ray 0.8.7. Here are the instructions to pip install the nightly version of Ray.
Hear From the Ray Tune Team
AMA
We’re delighted to host the Ray Tune team in our Slack community for an AMA on building hyperparameter optimization workflows at scale.
We invite you to start posting all your distributed hyperparameter optimization questions in #ama-ml-questions now. The Ray Tune team will answer them from 9am - 10am on Friday.
Ray Summit
Ray Summit is a FREE Virtual Summit for all things Ray related! Join us to see talks by leading computer scientists, the founders of Anyscale and Weights & Biases.
Beyond these two upcoming events, we’re excited to bring you the best of distributed computing infrastructure and developer tools for machine learning to make it simple to go from model development to production in the fewest steps possible. We can’t wait to see what you’ll build.

A Deeper Dive
Let us explore the integration in more depth by running some experiments.
Generation of MNIST images
The objective of this experiment is to train a vanilla Deep Convolutional Generative Adversarial Network to generate MNIST images.
Here's the basic code structure.
import torch.nn as nn


class Generator(nn.Module):
    def __init__(self, latent_vector_size, features=32, num_channels=1):
        super(Generator, self).__init__()
        self.latent_vector_size = latent_vector_size
        self.main = nn.Sequential(
            # Network layers
        )

    def forward(self, x):
        return self.main(x)


class Discriminator(nn.Module):
    def __init__(self, features=32, num_channels=1):
        super(Discriminator, self).__init__()
        self.main = nn.Sequential(
            # Network layers
        )

    def forward(self, x):
        return self.main(x)
We'll use Tune to search for the best set of hyper-parameters for training a DCGAN on the MNIST dataset. This lets us eliminate the bad hyper-parameter choices before training on the larger CelebA dataset. Here's the structure of the training loop:
@wandb_mixin
def train_batch(...):
    """Trains on one batch of data from the data creator."""
    real_label = 1
    fake_label = 0
    discriminator, generator = models
    optimD, optimG = optimizers

    # Compute a discriminator update for real images
    discriminator.zero_grad()
    ...
    errD_real = criterion(output, label)
    errD_real.backward()

    # Compute a discriminator update for fake images
    fake = generator(noise)
    grid = make_grid(fake, nrow=10)
    # Reorder the image grid from (C, H, W) to (H, W, C) for logging
    npgrid = np.transpose(grid.cpu().detach().numpy(), (1, 2, 0))
    output = discriminator(fake.detach()).view(-1)
    errD_fake = criterion(output, label)
    errD_fake.backward()
    errD = errD_real + errD_fake

    # Update the discriminator
    optimD.step()

    # Update the generator
    ...
    optimG.step()

    # Log to the W&B dashboard
    wandb.log({"batch_loss_g": errG.item(), "batch_loss_d": errD.item()})
    wandb.log({"Fake": wandb.Image(npgrid)})

    return {
        "loss_g": errG.item(),
        "loss_d": errD.item(),
    }
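To make the loss bookkeeping in the loop concrete, here is a small worked sketch in plain Python. It assumes `criterion` is binary cross-entropy, as in the standard DCGAN recipe; the article does not show this, so treat it as an assumption. The discriminator outputs `d_on_real` and `d_on_fake` are hypothetical numbers, and the generator loss shown follows the usual convention of scoring fake images against the real label, which the elided generator step may or may not match.

```python
import math


def bce(prediction, label):
    """Binary cross-entropy for one prediction in (0, 1) against a 0/1 label."""
    return -(label * math.log(prediction) + (1 - label) * math.log(1 - prediction))


real_label, fake_label = 1, 0

# Hypothetical discriminator outputs (probability that the input is real).
d_on_real = 0.9
d_on_fake = 0.2

# Discriminator loss: real images against label 1, fake images against label 0.
errD_real = bce(d_on_real, real_label)
errD_fake = bce(d_on_fake, fake_label)
errD = errD_real + errD_fake

# Generator loss (assumed convention): fake images scored against the real label.
errG = bce(d_on_fake, real_label)

print(f"errD = {errD:.4f}, errG = {errG:.4f}")
```

With these numbers the discriminator is doing well (low `errD`) and the generator is doing poorly (high `errG`), which is the kind of imbalance the `batch_loss_d` and `batch_loss_g` charts make visible.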
Below is the dictionary of hyper-parameter ranges that we used with Tune. Information about the W&B project (name, API key, etc.) can also be passed in this dictionary.
tr_config = {
    "lr": tune.grid_search([0.001, 0.01, 0.005, 0.05, 0.1]),
    "beta1": tune.grid_search([0.5, 0.8, 0.9, 0.99]),
    "beta2": tune.grid_search([0.5, 0.8, 0.9, 0.99]),
    "batch_size": tune.grid_search([16, 32, 64]),
    "epochs": 5,
    # specify wandb project and apikey
    "wandb": {
        "project": "...",
        "api_key": "..",
    },
}

ray.init()
analysis = tune.run(train, config=tr_config)
print(analysis.get_best_config(metric="metric"))
ray.shutdown()
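Because every tuned key uses grid search, Tune runs one trial per combination of values. The following standalone sketch, which uses plain Python rather than Tune itself, expands the same grid to show how many trials that implies. `search_space` and `expand_grid` are hypothetical names introduced for this illustration.

```python
from itertools import product

# The same value lists as the grid_search entries in tr_config.
search_space = {
    "lr": [0.001, 0.01, 0.005, 0.05, 0.1],
    "beta1": [0.5, 0.8, 0.9, 0.99],
    "beta2": [0.5, 0.8, 0.9, 0.99],
    "batch_size": [16, 32, 64],
}


def expand_grid(space):
    """Return one config dict per element of the Cartesian product."""
    keys = list(space)
    return [dict(zip(keys, values)) for values in product(*(space[k] for k in keys))]


trials = expand_grid(search_space)
print(len(trials))  # 5 * 4 * 4 * 3 = 240 combinations
print(trials[0])
```

Grid size grows multiplicatively with each added hyper-parameter, which is one reason to prune the ranges on a small dataset before moving to a more expensive one.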
The basic Tune workflow looks something like this.

Ray Dashboard
Ray Tune starts a dashboard server on a localhost port when you run on your local system. All the information about the tasks created and the resources used is displayed there in real time. You can also connect to a remote server running a tuning job by passing the address argument to the ray.init() function.

W&B Dashboard
Let us now look at the runs and the metrics logged in the W&B dashboard.
Here are some of the images generated by our model.
Generation of STL10 Images
In this experiment, we ran a hyperparameter tuning job for the task of generating STL10 images using the same DCGAN model after updating the hyperparameter values from the previous experiment.
Generation of CelebA Images
We chose a subset of the hyper-parameters used in the previous experiment and trained the DCGAN network on this new set of values using Tune.
ray.init()
tr_config = {
    "lr": tune.grid_search([0.0005, 0.001, 0.005, 0.0003]),
    "beta1": 0.5,
    "beta2": tune.grid_search([0.999, 0.99]),
    "batch_size": tune.grid_search([64, 512, 256]),
    "epochs": 10,
    "wandb": {
        "project": "...",
        "api_key": "...",
    },
}
Support for Resuming Experiments
When using Tune, you can always resume your experimentation if you run into some errors that cause the program to crash.
analysis = tune.run(
    train_example,
    config=tr_config,
    resume=True,  # Resumes the experiment from the last checkpoint
)
Let us now look at some of the results generated by our model. Almost all of the models optimized well with the trimmed-down set of hyperparameters. The model successfully combined multiple facial features to form new faces. Some facial features overlap, which could be improved further by using larger models like StyleGAN.
You can always go back to the dashboard to view the information about these experiments, group the runs by categories and write detailed reports.
Conclusion
Ray Tune combined with W&B is a one-stop solution for machine learning experiment management and tracking. This integration is magical for several reasons. Firstly, it combines two excellent tools for scaling machine learning experimentation and model development.
Secondly, Ray Tune makes it easy to scale from a single node to multiple GPUs, and on to multiple nodes.
The Ray and Weights & Biases teams are hard at work collaborating to make developing machine learning applications simple, and we’ve got a number of things coming up to help the community learn more!