Introduction to Hyperparameter Sweeps – A Model Battle Royale To Find The Best Model In 3 Steps

Lavanya Shukla

Searching through high-dimensional hyperparameter spaces to find the most performant model can get unwieldy very fast. Hyperparameter sweeps provide an organized and efficient way to conduct a battle royale of models and pick the most accurate one. They do this by automatically searching through combinations of hyperparameter values (e.g. learning rate, batch size, number of hidden layers, optimizer type) to find the best-performing values.

In this project, we show how you can run sophisticated hyperparameter sweeps in 3 easy steps using Weights & Biases.

We train a plethora of convolutional neural networks, and our battle royale surfaces the model that classifies Simpsons characters with the highest accuracy. We worked with this dataset from Kaggle. We also used Weights & Biases to log model metrics, inspect performance, and share findings about the best architecture for the network.

Getting Started

If you'd like to play with Sweeps, please fork the accompanying Colab notebook.

An Overview of Sweeps

Running a hyperparameter sweep with Weights & Biases is very easy. There are just 3 simple steps:

1. Define the sweep

We do this by creating a dictionary or a YAML file that specifies the parameters to search through, the search strategy, the optimization metric, and so on.
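For example, a sweep configuration expressed as a Python dictionary might look like the sketch below. The specific hyperparameters and values here are illustrative placeholders, not the exact ones used for the Simpsons classifier:

```python
# Illustrative sweep configuration; parameter names and values are examples.
sweep_config = {
    "method": "random",  # search strategy: "grid", "random", or "bayes"
    "metric": {"name": "val_accuracy", "goal": "maximize"},  # metric to optimize
    "parameters": {
        "learning_rate": {"values": [1e-2, 1e-3, 1e-4]},
        "batch_size": {"values": [32, 64, 128]},
        "optimizer": {"values": ["adam", "sgd"]},
    },
}
```

The same structure can be written as a YAML file instead; either form tells the sweep which values to try and which metric to optimize.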

2. Initialize the sweep

With one line of code, we initialize the sweep and pass in the dictionary of sweep configurations: "sweep_id = wandb.sweep(sweep_config)"

3. Run the sweep agent

Also accomplished with one line of code: we call wandb.agent() and pass in the sweep_id, along with a function that defines the model architecture and trains it: "wandb.agent(sweep_id, function=train)"
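Putting the three steps together, a minimal end-to-end sketch might look like this. The hyperparameters, metric name, and the body of train() are placeholders rather than the exact code from the notebook:

```python
# Illustrative end-to-end sweep: define, initialize, run.
# Hyperparameter names/values and the train() body are placeholders.
sweep_config = {
    "method": "random",
    "metric": {"name": "val_accuracy", "goal": "maximize"},
    "parameters": {
        "learning_rate": {"values": [1e-2, 1e-3, 1e-4]},
        "batch_size": {"values": [32, 64, 128]},
    },
}

def train():
    import wandb  # imported here so the sketch stays self-contained
    run = wandb.init()     # each agent invocation starts a new run
    config = wandb.config  # the hyperparameter values chosen for this run
    # ... build and train the model using config.learning_rate, etc. ...
    # wandb.log({"val_accuracy": accuracy})  # log the metric being optimized
    run.finish()

# To launch the sweep (requires a W&B account/login):
# sweep_id = wandb.sweep(sweep_config)   # step 2: initialize the sweep
# wandb.agent(sweep_id, function=train)  # step 3: run the sweep agent
```

Each time the agent calls train(), wandb.config holds a fresh combination of hyperparameters drawn according to the search strategy, and the logged metric tells the sweep how that combination performed.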

And voila! That's all there is to running a hyperparameter sweep! You can also find the full sweeps docs with all configuration options here.

We highly encourage you to fork the accompanying colab notebook, tweak the parameters, or try the model with your own dataset!

Visualizing Sweeps Output

Project Overview

  1. Check out the project page to see your results in the shared project.
  2. Press 'option+space' to expand the runs table, comparing all the results from everyone who has tried this script.
  3. Click on the name of a run to dive in deeper to that single run on its own run page.

Visualize Sweep Results

Use a parallel coordinates chart to see which hyperparameter values led to the best accuracy.

We can adjust the sliders in the parallel coordinates chart to view only the runs that led to the best accuracy values. This can help us home in on ranges of hyperparameter values to sweep over next.

Visualize Performance

Click through to a single run to see more details about that run. For example, on this run page you can see the performance metrics I logged when I ran this script.

Visualize Predictions

You can visualize predictions made at every step by clicking on the Media tab.

Review Code

The Overview tab picks up a link to the code. In this case, it's a link to the Google Colab. If you're running a script from a git repo, we'll pick up the SHA of the latest git commit and give you a link to that version of the code in your own GitHub repo.

Visualize System Metrics

The System tab on the run page lets you visualize how resource-efficient your model was. It lets you monitor GPU, memory, CPU, disk, and network usage in one spot.

Next Steps

As you can see, running sweeps is super easy! We highly encourage you to fork this notebook, tweak the parameters, and try the model with your own dataset!

Join our mailing list to get the latest machine learning updates.