Automate Hyperparameter Tuning Using Keras-Tuner and W&B
In this article, we take a look at how to integrate Weights & Biases with Keras-Tuner so that we can automate hyperparameter tuning — and save time.
An artificial neural network is made up of weights, biases, and a number of prior design choices. These choices, i.e., the number of neurons, the choice of activation (non-linearity), and the number of layers, are commonly termed 'hyperparameters'.
A vast field of research is devoted to hyperparameter optimization: we are interested in turning not only the knobs of the weights and biases but also those of the hyperparameters. Some great approaches (grid search, random search, and Bayesian optimization, to name a few) have already marked this field.
A large amount of time in deep learning experimentation is spent choosing good hyperparameters, and a good choice can sometimes be a game-changer for the experiment. The topic is widely studied and researched. With the advent of various search algorithms, we can now tune hyperparameters automatically; searching a hyperparameter space automatically has saved DL researchers much of the time they used to spend doing it by hand.
In this article, we will look into one such tool that helps automate hyperparameter tuning: keras-tuner. We will not only cover the basics of the tool but also integrate it with our favourite experiment tracker, wandb.
Check out the Kaggle Notebook
The API of keras-tuner
The Keras team always puts a lot of effort into the API design of their tools, and this tool is no exception.
There are four basic interfaces that the API provides. These interfaces are the heart of the API.
- HyperParameters: This class serves as a hyperparameter container. An instance of this class contains information about the present hyperparameters and the search space in total.
- HyperModel: An instance of this class can be thought of as an object that models the entire hyperparameter space. It not only defines the search space but also builds DL models by sampling from it.
- Oracles: Each instance of this class implements a particular hyperparameter tuning algorithm.
- Tuners: A Tuner instance does the hyperparameter tuning. An Oracle is passed as an argument to a Tuner. The Oracle tells the Tuner which hyperparameters should be tried next.
The top-down approach to the API design makes it readable and easy to understand. To reiterate:
- Build HyperParameters objects;
- Pass the HyperParameters to the Hypermodel that can then build the search space;
- Build Oracles, which provide the tuning algorithms;
- Build Tuners that tune the hyperparameters according to the Oracles.
Code with keras-tuner
In this section, I will try to explain the basic usage of keras-tuner with an example. The example is taken from their own documentation.
Leaving aside the imports that are necessary to run the tuner, we first need to build the Hypermodel that will emulate the entire search space.
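(For completeness, a typical set of imports for the snippets below might look like this. This is a sketch based on the standalone kerastuner package; in newer releases the package is named keras_tuner, so the exact import paths may differ.)

from tensorflow import keras
from tensorflow.keras import layers

# keras-tuner building blocks used in the examples below
from kerastuner import HyperModel
from kerastuner.tuners import RandomSearch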
We can build a Hypermodel in two ways:
- Build models with a function
- Subclass the HyperModel class
Function
Here we build a function that takes HyperParameters as an argument. The function samples from the HyperParameters, builds a model, and returns it. This way, different models are drawn from the search space.
# build with function
def build_model(hp):
    model = keras.Sequential()
    model.add(layers.Dense(units=hp.Int('units',
                                        min_value=32,
                                        max_value=512,
                                        step=32),
                           activation='relu'))
    model.add(layers.Dense(10, activation='softmax'))
    model.compile(
        optimizer=keras.optimizers.Adam(
            hp.Choice('learning_rate', values=[1e-2, 1e-3, 1e-4])),
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy'])
    return model
Subclassing the HyperModel class
With this method, one needs to override the build() method, inside which the user can sample from the HyperParameters and build suitable models.
# build with inheritance
class MyHyperModel(HyperModel):
    def __init__(self, num_classes):
        self.num_classes = num_classes

    def build(self, hp):
        model = keras.Sequential()
        model.add(layers.Dense(units=hp.Int('units',
                                            min_value=32,
                                            max_value=512,
                                            step=32),
                               activation='relu'))
        model.add(layers.Dense(self.num_classes, activation='softmax'))
        model.compile(
            optimizer=keras.optimizers.Adam(
                hp.Choice('learning_rate', values=[1e-2, 1e-3, 1e-4])),
            loss='sparse_categorical_crossentropy',
            metrics=['accuracy'])
        return model
In both cases, a Hypermodel is created from the HyperParameters provided to it. Interested readers are advised to look into the way the hyperparameters are sampled: the package provides not only static choices but also conditional hyperparameters.
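As a quick illustration of a conditional hyperparameter, here is a minimal sketch (the function name compile_with_hp and the hyperparameter names optimizer and momentum are made up for this example) where a momentum value is only sampled when the sgd optimizer is chosen:

from tensorflow import keras

def compile_with_hp(model, hp):
    optimizer_name = hp.Choice('optimizer', ['adam', 'sgd'])
    if optimizer_name == 'sgd':
        # `conditional_scope` marks `momentum` as active only when
        # the parent hyperparameter `optimizer` equals 'sgd'.
        with hp.conditional_scope('optimizer', ['sgd']):
            optimizer = keras.optimizers.SGD(momentum=hp.Float('momentum', 0.0, 0.9))
    else:
        optimizer = keras.optimizers.Adam()
    model.compile(optimizer=optimizer,
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    return model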
After we have our Hypermodel ready, it is time to build the Tuner. The Tuner searches the hyperparameter space and gives us the most optimised set of hyperparameters. Below I have written the tuners for both Hypermodel settings.
# tuner for function
tuner = RandomSearch(
    build_model,
    objective='val_accuracy',
    max_trials=5,
    executions_per_trial=3,
    directory='my_dir',
    project_name='helloworld')

# tuner for subclass
hypermodel = MyHyperModel(num_classes=10)
tuner = RandomSearch(
    hypermodel,
    objective='val_accuracy',
    max_trials=10,
    directory='my_dir',
    project_name='helloworld')
Note: With a custom Tuner, one needs to pass the tuner an Oracle that supplies the search algorithm (we will see this pattern in the integration section below).
With everything set, we are good to run the search. The search method follows the same design as the fit method. After the search, we can query the tuner for the best models and the best hyperparameters.
tuner.search(x, y,
             epochs=5,
             validation_data=(val_x, val_y))
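Once the search finishes, the tuner can be queried for its results. A minimal sketch using the library's query methods:

# Retrieve the best model and the best set of hyperparameters
best_model = tuner.get_best_models(num_models=1)[0]
best_hp = tuner.get_best_hyperparameters(num_trials=1)[0]
print(best_hp.values)     # dictionary of the winning hyperparameter values

# Print a summary of all trials
tuner.results_summary()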
Code to Integrate keras-tuner with wandb
Check out the Kaggle Notebook
How cool would it be to track all the models in one place along with keras-tuner? Here we integrate wandb with keras-tuner to track every model that is created and searched through. This not only helps with retrieving the best model but also provides some highly valuable insights.
Hypermodel
Here we take the functional approach to building the Hypermodel, which is an extremely easy way to build models.
In this example, one can see conditional hyperparameters in use: a for loop creates a tunable number of convolutional layers (conv_layers), each of which has its own tunable filters and kernel_size parameters.
def build_model(hp):
    """Builds a convolutional model.

    Args:
        hp: HyperParameters object. This is the object that helps
            us sample hyperparameters for a particular trial.

    Returns:
        model: Keras model.
    """
    inputs = tf.keras.Input(shape=(28, 28, 1))
    x = inputs
    # In this example we also get to look at
    # conditional hyperparameter settings.
    # Here the `kernel_size` is conditioned
    # with the for loop counter.
    for i in range(hp.Int('conv_layers', 1, 3)):
        x = tf.keras.layers.Conv2D(
            filters=hp.Int('filters_' + str(i), 4, 32, step=4, default=8),
            kernel_size=hp.Int('kernel_size_' + str(i), 3, 5),
            activation='relu',
            padding='same')(x)

        # choosing between max pool and avg pool
        if hp.Choice('pooling' + str(i), ['max', 'avg']) == 'max':
            x = tf.keras.layers.MaxPooling2D()(x)
        else:
            x = tf.keras.layers.AveragePooling2D()(x)

        x = tf.keras.layers.BatchNormalization()(x)
        x = tf.keras.layers.ReLU()(x)

    if hp.Choice('global_pooling', ['max', 'avg']) == 'max':
        x = tf.keras.layers.GlobalMaxPooling2D()(x)
    else:
        x = tf.keras.layers.GlobalAveragePooling2D()(x)

    outputs = tf.keras.layers.Dense(10, activation='softmax')(x)
    model = tf.keras.Model(inputs, outputs)
    return model
Tuner
Integrating the tuner to log the config and loss with wandb was a piece of cake. The API allows the user to override the run_trial method of the kt.Tuner class. In run_trial, one can harness the HyperParameters object to pass the current hyperparameters as the config of a wandb run. Not only can we now log the metrics of the models, but we can also compare the hyperparameters with the help of the great widgets that wandb provides in its dashboard.
class MyTuner(kt.Tuner):
    """Custom Tuner subclassed from `kt.Tuner`"""

    def run_trial(self, trial, train_ds):
        """The overridden `run_trial` function

        Args:
            trial: The trial object that holds information for the
                current trial.
            train_ds: The training data.
        """
        hp = trial.hyperparameters

        # Batching the data
        train_ds = train_ds.batch(
            hp.Int('batch_size', 32, 128, step=32, default=64))

        # The models that are created
        model = self.hypermodel.build(trial.hyperparameters)

        # Learning rate for the optimizer
        lr = hp.Float('learning_rate', 1e-4, 1e-2, sampling='log', default=1e-3)
        if hp.Choice('optimizer', ['adam', 'sgd']) == 'adam':
            optimizer = tf.keras.optimizers.Adam(lr)
        else:
            optimizer = tf.keras.optimizers.SGD(lr)

        epoch_loss_metric = tf.keras.metrics.Mean()

        # build the train_step
        @tf.function
        def run_train_step(data):
            """The run step

            Args:
                data: the data that needs to be fit

            Returns:
                loss: Returns the loss for the present batch
            """
            images = tf.dtypes.cast(data['image'], 'float32') / 255.
            labels = data['label']
            with tf.GradientTape() as tape:
                logits = model(images)
                loss = tf.keras.losses.sparse_categorical_crossentropy(labels, logits)
            gradients = tape.gradient(loss, model.trainable_variables)
            optimizer.apply_gradients(zip(gradients, model.trainable_variables))
            epoch_loss_metric.update_state(loss)
            return loss

        # WANDB INITIALIZATION
        # Here we pass the configuration so that
        # the runs are tagged with the hyperparams.
        # This also directly means that we can
        # use the different comparison UI widgets in the
        # wandb dashboard off the shelf.
        run = wandb.init(entity='ariG23498', project='keras-tuner', config=hp.values)

        for epoch in range(10):
            self.on_epoch_begin(trial, model, epoch, logs={})
            for batch, data in enumerate(train_ds):
                self.on_batch_begin(trial, model, batch, logs={})
                batch_loss = run_train_step(data)
                self.on_batch_end(trial, model, batch, logs={'loss': batch_loss})

                if batch % 100 == 0:
                    loss = epoch_loss_metric.result().numpy()
                    # Log the batch loss for WANDB
                    run.log({f'e{epoch}_batch_loss': loss})

            # Epoch loss logic
            epoch_loss = epoch_loss_metric.result().numpy()
            # Log the epoch loss for WANDB
            run.log({'epoch_loss': epoch_loss, 'epoch': epoch})

            # `on_epoch_end` has to be called so that
            # we can send the logs to the `oracle`, which handles the tuning.
            self.on_epoch_end(trial, model, epoch, logs={'loss': epoch_loss})
            epoch_loss_metric.reset_states()

        # Finish the wandb run
        run.finish()
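For completeness, here is a rough sketch of how such a custom tuner might be instantiated and run. The data loading, oracle settings, directory, and project name below are assumptions for illustration and are not taken from the article; as noted earlier, a custom Tuner is paired with an Oracle that supplies the search algorithm.

import kerastuner as kt
import tensorflow_datasets as tfds

# MNIST from TFDS yields dictionaries with 'image' and 'label' keys,
# which matches what `run_train_step` expects above.
train_ds = tfds.load('mnist', split='train')

tuner = MyTuner(
    # The Oracle implements the search algorithm. (In newer keras_tuner
    # releases this class is called BayesianOptimizationOracle.)
    oracle=kt.oracles.BayesianOptimization(
        objective=kt.Objective('loss', 'min'),
        max_trials=10),
    hypermodel=build_model,
    directory='results',
    project_name='mnist_keras_tuner_wandb')

# `search` forwards its arguments to the overridden `run_trial`.
tuner.search(train_ds=train_ds)
best_hp = tuner.get_best_hyperparameters(num_trials=1)[0]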
Conclusion
I would advise my readers to quickly spin up a notebook and try this great tool for themselves. For future reference, one can go and read the excellent keras-tuner docs.
The topic of hyperparameter tuning is so vastly researched that people have also tried incorporating genetic algorithms, using the concept of evolving models much like we creatures evolve. A shameless plug here: interested readers can check out one of my articles that deconstructs the concept of hyperparameter tuning with genetic algorithms.