Keras-Tuner with W&B

Integrating wandb with keras-tuner
Created on February 9|Last edited on July 12
This report is a translation of "Keras-Tuner with W&B" by Aritra Roy Gosthipaty.
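Introduction
Check out the Kaggle notebook
Artificial neural networks are made up of many prior constraints, weights, and biases. The constraints, such as the number of neurons, the choice of activation (non-linearity), and the number of layers, are commonly called hyperparameters. A large area of research is devoted to hyperparameter optimization: people are interested in tuning not only the weights and biases but also the hyperparameters. Several good approaches (grid search, random search, and Bayesian optimization, to name a few) have already made a name for themselves in this field.
Deep learning experiments spend a great deal of time on choosing good hyperparameters, and a good choice can sometimes change the direction of an experiment. The topic has therefore been studied extensively. With the arrival of various search algorithms, hyperparameters can now be optimized automatically. Searching the hyperparameter space automatically saves a lot of time for deep learning researchers who used to tune hyperparameters by hand.
In this article, we explore keras-tuner, a tool that optimizes hyperparameters automatically. We will not only cover the basics of the tool but also integrate it with our favorite experiment tracker, wandb.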
We will cover the following:
The keras-tuner API
Code that integrates wandb
A brief look at sweeps
Conclusion
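The keras-tuner API
The Keras team has always invested heavily in the API design of its tools. keras-tuner is no exception and follows the same design philosophy.
The API provides four basic interfaces, which form its core.
HyperParameters: This class acts as a container for hyperparameters. An instance holds information about the current hyperparameters and the overall search space.
HyperModel: An instance of this class can be thought of as an object that models the entire hyperparameter space. It builds not only the hyperparameter space but also the deep learning models sampled from it.
Oracles: Each instance of this class implements a particular hyperparameter optimization algorithm.
Tuners: A Tuner instance performs the hyperparameter optimization. An Oracle is passed to the Tuner as an argument, and the Oracle tells the Tuner which hyperparameters to try next.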
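The top-down design of the API makes it readable and easy to understand. To recap the whole flow:
Build a HyperParameters object;
Pass the HyperParameters to a HyperModel, which can then build the search space;
Build an Oracle, which provides the optimization algorithm;
Build a Tuner, which optimizes the hyperparameters according to the Oracle.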
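The flow above can be illustrated with a small, library-free sketch. Every class and function name below (ToyHyperParameters, toy_hypermodel, ToyRandomOracle, ToyTuner) is hypothetical and exists only to show how the four roles interact. This is not how keras-tuner is implemented, and the "model" is just a made-up score to minimize.

```python
import random


class ToyHyperParameters:
    """Hypothetical stand-in for a hyperparameter container."""

    def __init__(self):
        self.space = {}   # name -> list of candidate values
        self.values = {}  # name -> value chosen for the current trial

    def choice(self, name, candidates):
        # Register the hyperparameter the first time it is seen.
        self.space.setdefault(name, list(candidates))
        # Return the value picked for this trial (first candidate by default).
        return self.values.get(name, self.space[name][0])


def toy_hypermodel(hp):
    """Hypothetical hypermodel: maps sampled hyperparameters to a score."""
    units = hp.choice('units', [32, 64, 128, 256])
    lr = hp.choice('learning_rate', [1e-2, 1e-3, 1e-4])
    # Made-up loss that is smallest at units=128, learning_rate=1e-3.
    return abs(units - 128) / 128 + abs(lr - 1e-3) * 100


class ToyRandomOracle:
    """Hypothetical oracle: proposes the next hyperparameters at random."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)

    def propose(self, space):
        return {name: self.rng.choice(values) for name, values in space.items()}


class ToyTuner:
    """Hypothetical tuner: asks the oracle what to try, then evaluates it."""

    def __init__(self, oracle, hypermodel, max_trials):
        self.oracle = oracle
        self.hypermodel = hypermodel
        self.max_trials = max_trials

    def search(self):
        hp = ToyHyperParameters()
        self.hypermodel(hp)  # the first call only registers the search space
        best_values, best_score = None, float('inf')
        for _ in range(self.max_trials):
            hp.values = self.oracle.propose(hp.space)
            score = self.hypermodel(hp)
            if score < best_score:
                best_values, best_score = dict(hp.values), score
        return best_values, best_score


tuner = ToyTuner(ToyRandomOracle(seed=0), toy_hypermodel, max_trials=20)
best_values, best_score = tuner.search()
print(best_values, best_score)
```

Swapping ToyRandomOracle for another proposal strategy would change the search algorithm without touching the tuner or the hypermodel, which is the separation of concerns the API description points at.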
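keras-tuner code
In this section, we walk through the basic usage of keras-tuner with an example taken from its own documentation.
Setting aside the imports needed to run the tuner, we first need to build a HyperModel, which models the entire search space.
There are two ways to build a hypermodel:
Build the model with a function
Subclass the HyperModel class
Function
Here we write a function that takes HyperParameters as its argument. The function samples from the HyperParameters, builds a model, and returns it. Each call can therefore yield a different model from the search space.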
# Build the model with a function
from tensorflow import keras
from tensorflow.keras import layers

def build_model(hp):
  model = keras.Sequential()
  # Tune the width of the hidden layer: 32, 64, ..., 512
  model.add(layers.Dense(units=hp.Int('units',
                                      min_value=32,
                                      max_value=512,
                                      step=32),
                         activation='relu'))
  model.add(layers.Dense(10, activation='softmax'))
  model.compile(
      # Tune the learning rate over three fixed choices
      optimizer=keras.optimizers.Adam(
          hp.Choice('learning_rate',
                    values=[1e-2, 1e-3, 1e-4])),
      loss='sparse_categorical_crossentropy',
      metrics=['accuracy'])
  return model
Subclassing the HyperModel class
With this approach, you need to override the build() method. Inside build(), you can sample from the HyperParameters and build a suitable model.
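# Build the model by subclassing HyperModel
from tensorflow import keras
from tensorflow.keras import layers
from keras_tuner import HyperModel  # older releases use the `kerastuner` package name

class MyHyperModel(HyperModel):

  def __init__(self, num_classes):
    super().__init__()  # let the base class set up its own state
    self.num_classes = num_classes

  def build(self, hp):
    model = keras.Sequential()
    # Tune the width of the hidden layer: 32, 64, ..., 512
    model.add(layers.Dense(units=hp.Int('units',
                                        min_value=32,
                                        max_value=512,
                                        step=32),
                           activation='relu'))
    model.add(layers.Dense(self.num_classes, activation='softmax'))
    model.compile(
        # Tune the learning rate over three fixed choices
        optimizer=keras.optimizers.Adam(
            hp.Choice('learning_rate',
                      values=[1e-2, 1e-3, 1e-4])),
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy'])
    return model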
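In both cases, the hypermodel is created by supplying HyperParameters. Interested readers can look into the available hyperparameter sampling methods. The package offers not only static choices but also conditional hyperparameters.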
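As one example of a sampling method, the tuner code later in this article uses hp.Float('learning_rate', 1e-4, 1e-2, sampling='log'). The sketch below illustrates the idea of log-uniform sampling with the standard library only. The helper names are hypothetical, and keras-tuner's internal implementation may differ. The point is only that sampling in log space spreads trials evenly across orders of magnitude, whereas linear sampling rarely visits the small values.

```python
import math
import random


def sample_log_uniform(min_value, max_value, rng):
    """Draw a value uniformly in log space between min_value and max_value."""
    log_min, log_max = math.log10(min_value), math.log10(max_value)
    return 10 ** rng.uniform(log_min, log_max)


def sample_linear(min_value, max_value, rng):
    """Draw a value uniformly on the original scale."""
    return rng.uniform(min_value, max_value)


rng = random.Random(0)
log_samples = [sample_log_uniform(1e-4, 1e-2, rng) for _ in range(10_000)]
lin_samples = [sample_linear(1e-4, 1e-2, rng) for _ in range(10_000)]

# Fraction of samples in the lower decade [1e-4, 1e-3)
frac_log_small = sum(s < 1e-3 for s in log_samples) / len(log_samples)
frac_lin_small = sum(s < 1e-3 for s in lin_samples) / len(lin_samples)
print(f"below 1e-3: log={frac_log_small:.2f}, linear={frac_lin_small:.2f}")
```

With log sampling roughly half of the draws land below 1e-3, because 1e-3 is the midpoint of the range in log space. With linear sampling only about one draw in eleven does.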
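Once the hypermodel is ready, we can build the Tuner. The Tuner searches the hyperparameter space and returns the best set of hyperparameters. Below are tuners for each of the two hypermodel setups.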
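from keras_tuner import RandomSearch  # older releases use the `kerastuner` package name

# Tuner for the function-based hypermodel
tuner = RandomSearch(
    build_model,
    objective='val_accuracy',
    max_trials=5,            # number of hyperparameter combinations to try
    executions_per_trial=3,  # models trained per combination
    directory='my_dir',
    project_name='helloworld')

# Tuner for the subclassed hypermodel (an alternative to the one above)
hypermodel = MyHyperModel(num_classes=10)
tuner = RandomSearch(
    hypermodel,
    objective='val_accuracy',
    max_trials=10,
    directory='my_dir',
    project_name='helloworld')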
Note: if you use a custom Tuner, you need to pass it an Oracle, which supplies the search algorithm.
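To get a feel for what the first RandomSearch configuration implies, the following back-of-the-envelope calculation counts the combinations in the search space and the number of models that get trained. It assumes that max_value in hp.Int is inclusive, as described in the keras-tuner documentation. The variable names are illustrative only.

```python
# Values that hp.Int('units', min_value=32, max_value=512, step=32) can take
# (assumption: max_value is inclusive).
units_values = list(range(32, 512 + 1, 32))
learning_rates = [1e-2, 1e-3, 1e-4]

search_space_size = len(units_values) * len(learning_rates)

max_trials = 5
executions_per_trial = 3
total_trainings = max_trials * executions_per_trial
coverage = max_trials / search_space_size

print(f"{search_space_size} combinations, {total_trainings} trainings, "
      f"at most {coverage:.0%} of the space covered")
# 48 combinations, 15 trainings, at most 10% of the space covered
```

Even for this small example, random search with five trials sees only a fraction of the space, which is why the choice of max_trials and of the search algorithm matters.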
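With everything in place, we can run the search. The search method follows the same design as the fit method. After the search, we can query the tuner for the best models and the best hyperparameters.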
tuner.search(x, y,
             epochs=5,
             validation_data=(val_x, val_y))
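Querying the tuner after the search could look like the sketch below. get_best_hyperparameters and get_best_models are methods of keras-tuner tuners. summarize_search is a hypothetical helper introduced here for illustration, and it assumes search has already been run on the tuner passed in.

```python
def summarize_search(tuner, top_k=1):
    """Return the best hyperparameter values and models of a finished search.

    `tuner` is expected to be a keras-tuner tuner on which `search` has
    already been called. This helper is illustrative, not part of the library.
    """
    best_hps = tuner.get_best_hyperparameters(num_trials=top_k)
    best_models = tuner.get_best_models(num_models=top_k)
    # Each HyperParameters object exposes its sampled values as a dict.
    return [hp.values for hp in best_hps], best_models
```

A call such as best_values, best_models = summarize_search(tuner) would then give, for the examples above, a dict with the chosen units and learning_rate alongside the corresponding trained model.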
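keras-tuner and wandb code
Check out the Kaggle notebook
Wouldn't it be cool to track every model that keras-tuner builds in one place? Here we integrate wandb with keras-tuner to track all the models that are created and searched. This not only helps find the best model but also yields some valuable insights.
In this section, we run a modified version of the keras-tuner code that relies on subclassing (here, of the Tuner class).
HyperModel
Here we build the hypermodel with the function approach, which is a very simple way to build a model.
This example uses conditional hyperparameters. A for loop creates a tunable number of convolutional layers (conv_layers), each with its own tunable filters and kernel_size.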
import tensorflow as tf

def build_model(hp):
    """
    Builds a convolutional model.

    Args:
      hp: HyperParameters object used to sample the
        hyperparameters for a particular trial.

    Returns:
      model: The (uncompiled) Keras model.
    """
    inputs = tf.keras.Input(shape=(28, 28, 1))
    x = inputs
    # This example also shows conditional hyperparameters:
    # `filters_i`, `kernel_size_i` and `pooling<i>` only exist
    # for the layers that the for loop actually creates.
    for i in range(hp.Int('conv_layers', 1, 3)):
        x = tf.keras.layers.Conv2D(
            filters=hp.Int('filters_' + str(i), 4, 32, step=4, default=8),
            kernel_size=hp.Int('kernel_size_' + str(i), 3, 5),
            activation='relu',
            padding='same')(x)
        # Choose between max pooling and average pooling
        if hp.Choice('pooling' + str(i), ['max', 'avg']) == 'max':
            x = tf.keras.layers.MaxPooling2D()(x)
        else:
            x = tf.keras.layers.AveragePooling2D()(x)
        x = tf.keras.layers.BatchNormalization()(x)
        x = tf.keras.layers.ReLU()(x)

    # Choose the global pooling that feeds the classifier
    if hp.Choice('global_pooling', ['max', 'avg']) == 'max':
        x = tf.keras.layers.GlobalMaxPooling2D()(x)
    else:
        x = tf.keras.layers.GlobalAveragePooling2D()(x)
    outputs = tf.keras.layers.Dense(10, activation='softmax')(x)

    model = tf.keras.Model(inputs, outputs)
    return model
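Because the per-layer hyperparameters are created inside the loop, the set of hyperparameters that is active in a trial depends on the value of conv_layers. The following library-free sketch lists the names that build_model registers for each possible depth. active_hyperparameters is a hypothetical helper written for this illustration, and the names mirror the strings used in the code above.

```python
def active_hyperparameters(conv_layers):
    """List the hyperparameter names that `build_model` registers
    when `conv_layers` takes the given value (1, 2, or 3)."""
    names = ['conv_layers']
    for i in range(conv_layers):
        # Same name construction as in build_model
        names += [f'filters_{i}', f'kernel_size_{i}', f'pooling{i}']
    names.append('global_pooling')
    return names


for n in (1, 2, 3):
    names = active_hyperparameters(n)
    print(n, len(names), names)
```

A trial with one convolutional layer therefore carries 5 model hyperparameters, while a trial with three layers carries 11. Any hyperparameters sampled elsewhere, for instance in a custom tuner, come on top of these.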
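Tuner
Integrating the tuner with wandb to log the config and the loss is a piece of cake. The API lets users override the run_trial method of the kt.Tuner class. Inside run_trial, the HyperParameters object is available. It is used to query the current hyperparameters and is passed as the config of a wandb run. This means we can not only log model metrics but also compare hyperparameters with the powerful widgets that wandb provides in its dashboard.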
import tensorflow as tf
import wandb
import keras_tuner as kt  # older releases: import kerastuner as kt

class MyTuner(kt.Tuner):
  """
  Custom Tuner subclassed from `kt.Tuner`.

  Written against the keras-tuner release available when this article
  was published; newer releases may expect `run_trial` to return the
  objective value.
  """
  def run_trial(self, trial, train_ds):
    """
    The overridden `run_trial` function.

    Args:
      trial: The trial object that holds information for the
        current trial.
      train_ds: The training data.
    """
    hp = trial.hyperparameters
    # Batch the data with a tunable batch size
    train_ds = train_ds.batch(
        hp.Int('batch_size', 32, 128, step=32, default=64))
    # Build the model for this trial
    model = self.hypermodel.build(trial.hyperparameters)
    # Learning rate for the optimizer, sampled in log space
    lr = hp.Float('learning_rate', 1e-4, 1e-2, sampling='log', default=1e-3)

    if hp.Choice('optimizer', ['adam', 'sgd']) == 'adam':
      optimizer = tf.keras.optimizers.Adam(lr)
    else:
      optimizer = tf.keras.optimizers.SGD(lr)

    # Running mean of the loss over the current epoch
    epoch_loss_metric = tf.keras.metrics.Mean()

    # Build the train step
    @tf.function
    def run_train_step(data):
      """
      Runs one training step.

      Args:
        data: A batch with `image` and `label` entries.

      Returns:
        loss: The per-example loss for the present batch.
      """
      images = tf.dtypes.cast(data['image'], 'float32') / 255.
      labels = data['label']
      with tf.GradientTape() as tape:
        # The model ends in a softmax, so these are probabilities.
        # training=True lets BatchNormalization use batch statistics.
        probs = model(images, training=True)
        loss = tf.keras.losses.sparse_categorical_crossentropy(
            labels, probs)
      gradients = tape.gradient(loss, model.trainable_variables)
      optimizer.apply_gradients(zip(gradients, model.trainable_variables))
      epoch_loss_metric.update_state(loss)
      return loss

    # WANDB INITIALIZATION
    # Pass the hyperparameters as the config so that each run is
    # tagged with them. This also means the comparison widgets in
    # the wandb dashboard work off the shelf.
    # `entity` is the author's account; replace it with your own.
    run = wandb.init(entity='ariG23498', project='keras-tuner', config=hp.values)
    for epoch in range(10):
      self.on_epoch_begin(trial, model, epoch, logs={})
      for batch, data in enumerate(train_ds):
        self.on_batch_begin(trial, model, batch, logs={})
        batch_loss = run_train_step(data)
        self.on_batch_end(trial, model, batch, logs={'loss': batch_loss})
        if batch % 100 == 0:
          # Running mean of the loss so far in this epoch
          loss = epoch_loss_metric.result().numpy()
          run.log({f'e{epoch}_batch_loss': loss})

      # Mean loss over the whole epoch
      epoch_loss = epoch_loss_metric.result().numpy()
      run.log({'epoch_loss': epoch_loss, 'epoch': epoch})

      # `on_epoch_end` has to be called so that the logs reach
      # the `oracle`, which handles the tuning.
      self.on_epoch_end(trial, model, epoch, logs={'loss': epoch_loss})
      # Start the next epoch from zero
      # (the method is named `reset_state()` in newer Keras releases).
      epoch_loss_metric.reset_states()

    # Finish the wandb run
    run.finish()
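The training loop logs epoch_loss_metric.result() every 100 batches and resets the metric at the end of each epoch. tf.keras.metrics.Mean averages every value it has seen since the last reset, so the logged number is the mean over the epoch so far rather than the loss of a single batch. The pure-Python stand-in below (RunningMean is a hypothetical class, not the TensorFlow implementation) makes that behavior concrete.

```python
class RunningMean:
    """Pure-Python stand-in for a streaming mean metric."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def update_state(self, values):
        # Accumulate every per-example value seen so far.
        self.total += sum(values)
        self.count += len(values)

    def result(self):
        # Mean over everything accumulated since the last reset.
        return self.total / self.count if self.count else 0.0

    def reset_states(self):
        self.total, self.count = 0.0, 0


metric = RunningMean()
metric.update_state([4.0, 2.0])  # first batch, batch mean 3.0
metric.update_state([1.0, 1.0])  # second batch, batch mean 1.0
running = metric.result()         # mean over both batches so far
metric.reset_states()             # what happens at the end of an epoch
after_reset = metric.result()
print(running, after_reset)
# 2.0 0.0
```

Without the reset at the end of the epoch, the losses of earlier epochs would keep pulling the reported epoch loss upward.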
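Conclusion
I encourage readers to fire up a notebook and try this powerful tool for themselves. For future reference, the keras-tuner documentation is a rich resource.
Hyperparameter optimization has been studied so extensively that people have tried combining it with genetic algorithms and the idea of model evolution, which resembles our own human evolution. As a small plug, interested readers can check out one of my articles that breaks down this concept: hyperparameter optimization with genetic algorithms.
Feel free to reach out to me on Twitter: @ariG23498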