Introduction

In this report, I show how to use the EfficientNet family of models for transfer learning on image classification tasks. We will use the EfficientNet variants from B0 to B3 and, for comparison, the MobileNetV2 model.

This report does not cover the nitty-gritty of the EfficientNet family of models. If you're interested in the details of those models, you should absolutely check out this amazing report.

This report is accompanied by a Colab Notebook so that you are able to reproduce the results.

Run the experiments on Google Colab

Experimental Configuration

TensorFlow Hub

All the models we will be using for the experiments come from TensorFlow Hub. TensorFlow Hub provides a comprehensive collection of pre-trained models that can be used for transfer learning, and many of those models also support fine-tuning. It has models for a number of domains, including image, text, video, and audio. Models are also available in different TensorFlow product formats, including TensorFlow Lite and TensorFlow.js.

Dataset

We will be using the Cats vs. Dogs dataset. It is already included in TensorFlow Datasets, so much of the hard work is done for us. The code listing below downloads the dataset (if it is not already cached) and loads it, split into training and validation sets in the proportions we choose (80% and 20% of the original train split).

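import tensorflow_datasets as tfds

# First 80% of the original train split for training, the rest for validation
(raw_train, raw_validation), metadata = tfds.load(
    'cats_vs_dogs',
    split=['train[:80%]', 'train[80%:]'],
    with_info=True,
    as_supervised=True
)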
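Before the raw images can be fed to a TF Hub model, they need a common size and value range. The listing below is a minimal sketch of such a step, not the exact pipeline from the notebook: `IMG_SIZE`, `BATCH_SIZE`, `preprocess`, and `make_batches` are names and values assumed here for illustration. It relies on the common TF Hub image convention of float inputs in the [0, 1] range.

```python
import tensorflow as tf

IMG_SIZE = 224    # assumed input resolution; not stated in the report text
BATCH_SIZE = 32   # assumed batch size


def preprocess(image, label):
    """Scale pixels to [0, 1] and resize to a fixed square resolution."""
    # uint8 images in [0, 255] become float32 images in [0, 1]
    image = tf.image.convert_image_dtype(image, tf.float32)
    image = tf.image.resize(image, (IMG_SIZE, IMG_SIZE))
    return image, label


def make_batches(dataset, training=False):
    """Map the preprocessing step over a dataset, then batch and prefetch."""
    dataset = dataset.map(preprocess, num_parallel_calls=tf.data.AUTOTUNE)
    if training:
        # Shuffle only the training data
        dataset = dataset.shuffle(1024)
    return dataset.batch(BATCH_SIZE).prefetch(tf.data.AUTOTUNE)
```

With the splits loaded above, this could be applied as make_batches(raw_train, training=True) and make_batches(raw_validation).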

Utility Function for Utilizing TF Hub Models for Transfer Learning

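Most of the image classification models on TF Hub come in two variants: full classification models, which include the ImageNet classification layers, and feature extraction (feature-vector) models, which exclude them.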

All of these models are pre-trained on the ImageNet dataset. Since we are doing transfer learning, we will go with the second variant, the feature extraction models. One very important thing to note here is that not all of these models can be fine-tuned, especially the ones based on TensorFlow 1.

Unfortunately, the EfficientNet family of models cannot be fine-tuned in this experimental configuration. The code listing below provides a utility function that downloads the respective feature extraction model, adds a classification top, compiles the final model, and returns it.

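import tensorflow as tf
import tensorflow_hub as hub

def get_training_model(url, trainable=False):
    # Load the feature extraction model from TF Hub (no classification layers).
    # IMG_SIZE is expected to be defined earlier in the notebook.
    extractor = hub.KerasLayer(url, input_shape=(IMG_SIZE, IMG_SIZE, 3), trainable=trainable)

    # Construct the head of the model that will be placed on top of
    # the base model
    model = tf.keras.models.Sequential([
        extractor,
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dropout(0.5),
        tf.keras.layers.Dense(1)
    ])

    # Compile and return the model. The last layer outputs a raw logit, so
    # the loss uses from_logits=True and accuracy is thresholded at 0.0
    # (the plain "accuracy" string would threshold the logit at 0.5).
    model.compile(
        loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
        optimizer="adam",
        metrics=[tf.keras.metrics.BinaryAccuracy(threshold=0.0, name="accuracy")],
    )

    return model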

Note the url argument. For feature extractor networks based on EfficientNets, it generally looks like this: https://tfhub.dev/google/efficientnet/<variant>/feature-vector/1, where <variant> can be anything from b0 to b7. Although the utility function has a trainable argument, specifying trainable=True for the EfficientNet models in TF Hub raises the following error:

ValueError: in user code:

    /usr/local/lib/python3.6/dist-packages/tensorflow_hub/keras_layer.py:206 call  *
        self._check_trainability()
    /usr/local/lib/python3.6/dist-packages/tensorflow_hub/keras_layer.py:265 _check_trainability  *
        raise ValueError(

    ValueError: Setting hub.KerasLayer.trainable = True is unsupported when loading from the hub.Module format of TensorFlow 1.

In the next few sections, we will be performing transfer learning with 4 different variants (b0 to b3) of the EfficientNet family of models and we will also be analyzing the performances of those different models.
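Since the four EfficientNet URLs differ only in the variant name, they can be generated from the URL pattern shown earlier. The helper below is a small illustrative sketch; `efficientnet_url`, `URL_TEMPLATE`, and `VALID_VARIANTS` are hypothetical names introduced here and are not part of TF Hub.

```python
# URL pattern for EfficientNet feature extractors, as quoted in the report
URL_TEMPLATE = "https://tfhub.dev/google/efficientnet/{variant}/feature-vector/1"

# The report states that the variant can be anything from b0 to b7
VALID_VARIANTS = tuple(f"b{i}" for i in range(8))


def efficientnet_url(variant):
    """Return the feature-vector URL for one EfficientNet variant (hypothetical helper)."""
    variant = variant.lower()
    if variant not in VALID_VARIANTS:
        raise ValueError(f"Unknown variant {variant!r}; expected one of {VALID_VARIANTS}")
    return URL_TEMPLATE.format(variant=variant)


# URLs for the four variants used in this report
urls = {variant: efficientnet_url(variant) for variant in ("b0", "b1", "b2", "b3")}
```

Each URL in urls can then be passed to get_training_model to build the corresponding model.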

EfficientNet B0 + Custom Classification Top

[Figure: EfficientNet B0 + Custom Classification Top]

EfficientNet [B1, B2, B3] + Custom Classification Top

[Figure: EfficientNet [B1, B2, B3] + Custom Classification Top]

Comparison With MobileNetV2

Everything remains the same in this case, except that we can now make use of fine-tuning as well. For comparison's sake, however, we will only be using transfer learning here.

[Figure: Comparison with MobileNetV2]

A Broader View of the Model Training Times

As we can see, the MobileNetV2-based model clearly outperforms all the EfficientNet-based variants we have tried so far. It not only performs better but also has a smaller memory footprint and a shorter training time. The memory footprint can be reduced further with the help of quantization.
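As a rough illustration of the quantization step mentioned above, the sketch below applies TensorFlow Lite post-training dynamic-range quantization to a small stand-in Keras model. The stand-in model is an assumption made so that the example runs without downloads; it is not the MobileNetV2-based model from this report, and converting a model that wraps a TF Hub layer may need extra care.

```python
import tensorflow as tf

# Stand-in model (hypothetical); in practice this would be the trained classifier
model = tf.keras.Sequential([
    tf.keras.Input(shape=(16,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1),
])

# Convert to TensorFlow Lite with the default optimizations, which enable
# post-training dynamic-range quantization of the weights
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

print(f"Quantized model size: {len(tflite_model)} bytes")
```

The returned bytes can be written to a .tflite file and compared in size against a conversion done without the optimizations flag.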

[Figure: A broader view of the model training times]

Concluding Remarks

So, for our dataset, the EfficientNet family of models did not perform very well, but that in no way diminishes their significance.

If you have a relatively large dataset, you should definitely give those models a try. At the same time, we should keep in mind that we don't need a hammer to kill a rat.

Let me know your thoughts on this report via Twitter (@RisingSayak).