Hyperparameter Tuning for Keras and PyTorch Models

Lavanya Shukla

We’re excited to launch a powerful and efficient way to do hyperparameter tuning and optimization: W&B Sweeps, now available in both Keras and PyTorch.

With just a few lines of code, Sweeps automatically searches through high-dimensional hyperparameter spaces to find your best-performing model, with very little effort on your part.

Here’s how you can launch sophisticated hyperparameter sweeps in 3 simple steps.

Try Sweeps in Colab →

0. Integrate W&B

First, let’s install the Weights & Biases library and add it to your training script.

A. Install wandb

pip install wandb
wandb login

B. Add W&B to your training script

Import wandb and the Keras callback at the top of your script, then pass WandbCallback() to your .fit() call:
# train.py
import wandb
from wandb.keras import WandbCallback

# initialize a W&B run; the sweep's hyperparameters arrive via wandb.config
wandb.init()
config = wandb.config

# define model architecture
# compile the model
model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])

# add the WandbCallback()
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=config.epochs,
          callbacks=[WandbCallback(data_type="image", labels=labels)])
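The snippet above is Keras-specific. In PyTorch there is no callback hook, so you log metrics explicitly with wandb.log (and optionally call wandb.watch to track gradients). Here's a minimal sketch; the toy model, random batches, and project name are placeholders for your own code:

```python
# train_pytorch.py -- a minimal sketch of W&B logging in a PyTorch training loop
import torch
import torch.nn as nn

# a toy model standing in for your real architecture
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
loss_fn = nn.CrossEntropyLoss()

def train(epochs=3, lr=0.01):
    import wandb
    wandb.init(project="pytorch-sweeps-demo", config={"epochs": epochs, "lr": lr})
    wandb.watch(model)  # log gradients and parameter histograms
    optimizer = torch.optim.SGD(model.parameters(), lr=wandb.config.lr)
    for epoch in range(wandb.config.epochs):
        X = torch.randn(64, 10)            # stand-in for a real training batch
        y = torch.randint(0, 2, (64,))
        optimizer.zero_grad()
        loss = loss_fn(model(X), y)
        loss.backward()
        optimizer.step()
        wandb.log({"epoch": epoch, "loss": loss.item()})  # stream metrics to W&B

if __name__ == "__main__":
    train()
```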


1. Define the Sweep

You can define powerful sweeps simply by creating a YAML file that specifies the parameters to search through, the search strategy, and the optimization metric.

Here’s an example:

# sweep.yaml
program: train.py
method: random
metric:
  name: val_loss
  goal: minimize
parameters:
  learning_rate:
    min: 0.00001
    max: 0.1
  optimizer:
    values: ["adam", "sgd"]
  batch_size:
    values: [96, 128, 148]
  epochs:
    value: 27
early_terminate:
  type: hyperband
  s: 2
  eta: 3
  max_iter: 27

Let’s break this YAML file down:

- program: the training script the sweep will run (train.py).
- method: the search strategy; here we sample hyperparameter combinations at random.
- metric: the metric to optimize (val_loss) and the direction to optimize it in (minimize).
- parameters: the hyperparameters to search over, specified as continuous ranges (min/max), discrete choices (values), or fixed values (value).
- early_terminate: an optional early-stopping policy; here Hyperband halts under-performing runs so compute isn’t wasted on them.

You can find a list of all the configuration options here.
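If you’d rather stay in Python, the same configuration can also be expressed as a plain dictionary and passed to wandb.sweep directly instead of writing a YAML file. A sketch mirroring sweep.yaml above (the project name in the comment is a placeholder):

```python
# Equivalent of sweep.yaml as a Python dictionary
sweep_config = {
    "program": "train.py",
    "method": "random",
    "metric": {"name": "val_loss", "goal": "minimize"},
    "parameters": {
        "learning_rate": {"min": 0.00001, "max": 0.1},
        "optimizer": {"values": ["adam", "sgd"]},
        "batch_size": {"values": [96, 128, 148]},
        "epochs": {"value": 27},
    },
    "early_terminate": {"type": "hyperband", "s": 2, "eta": 3, "max_iter": 27},
}

# Passing it to wandb.sweep creates the sweep without a YAML file:
# import wandb
# sweep_id = wandb.sweep(sweep_config, project="my-project")
```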

2. Set up a new sweep

Run wandb sweep with the config file you created in step 1.

This creates your sweep, and returns both a unique identifier (SWEEP_ID) and a URL to track all your runs.

wandb sweep sweep.yaml

3. Launch the sweep

It’s time to launch our sweep and train some models!

You can do so by calling wandb agent with the SWEEP_ID you got from step 2.

wandb agent SWEEP_ID

This will start training models with different hyperparameter combinations and return a URL where you can track the sweep’s progress. You can launch multiple agents concurrently. Each of these agents will fetch parameters from the W&B server and use them to train the next model.
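You can also drive an agent from Python instead of the CLI: wandb.agent takes the sweep id and a training function, and count caps how many runs this agent executes. A sketch, where train_fn is a hypothetical stand-in for your real training code and SWEEP_ID must be replaced with the id from step 2:

```python
# sweep_agent.py -- driving a sweep agent from Python instead of the CLI
def train_fn():
    """Hypothetical training function; the agent calls it once per run."""
    import wandb
    with wandb.init() as run:       # receives this run's hyperparameters
        # ... build and train a model using run.config here ...
        run.log({"val_loss": 0.5})  # stand-in for your real metric

if __name__ == "__main__":
    import wandb
    # Replace SWEEP_ID with the identifier printed by `wandb sweep` in step 2.
    wandb.agent("SWEEP_ID", function=train_fn, count=5)  # this agent runs 5 trials
```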

And voila! That's all there is to running a hyperparameter sweep!

Let’s see how we can extract insights about our model from sweeps next.

4. Visualize Sweep Results

Parallel coordinates plot

This plot maps hyperparameter values to model metrics. It’s useful for homing in on the combinations of hyperparameters that led to the best model performance.

Hyperparameter Importance Plot

The hyperparameter importance plot surfaces which hyperparameters were the best predictors of, and most highly correlated with, desirable values for your metric.

These visualizations can help you save both time and resources running expensive hyperparameter optimizations by homing in on the parameters (and value ranges) that matter most, and are therefore worthy of further exploration.

Next step - Get your hands dirty with sweeps

We created a simple training script and a few flavors of sweep configs for you to play with. We highly encourage you to give these a try. This repo also has examples to help you try more advanced sweep features like Bayesian optimization, Hyperband, and Hyperopt.

More Resources

Join our mailing list to get the latest machine learning updates.