Boost Performance: Achieve Better Results Faster with 25% Fewer Epochs
A comparison between a vanilla ResNet-18 trained on CIFAR-10 and a ResNet-18 trained with three additional, easy-to-use Composer functions.
I implemented a slightly modified ResNet-18 on the CIFAR-10 dataset, which reached 93.72% accuracy after training for 60 epochs. I then incorporated three functions from the Composer library, and the enhanced ResNet-18 reached a higher accuracy of 93.94% by the 43rd epoch. These additions saved both computation and overall training time.
Composer Functions
The three Composer functions I used while training the model:
- Label Smoothing
- RandAugment
- MixUp
Label Smoothing
Label smoothing was proposed by Christian Szegedy et al. in this paper. It acts as a regularization technique: instead of hard one-hot targets, the model is trained against slightly softened targets. Composer makes it very easy to use.
How to use it?
import composer.functional as cf

for X, y in train_loader:
    y_hat = model(X)
    # note: if you modify the variable y here, it is a good idea to set y
    # back to the original targets after computing the loss
    smoothed_targets = cf.smooth_labels(y_hat, y, smoothing=0.1)
    loss = loss_fn(y_hat, smoothed_targets)
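For intuition, here is a minimal sketch of the underlying idea: each one-hot target is blended with the uniform distribution over classes. The helper smooth_one_hot and the class count are my own illustration, not Composer's internal implementation.

import torch
import torch.nn.functional as F

def smooth_one_hot(y, num_classes, smoothing=0.1):
    # turn integer labels into one-hot vectors, then blend each vector
    # with the uniform distribution over the classes
    one_hot = F.one_hot(y, num_classes).float()
    return one_hot * (1.0 - smoothing) + smoothing / num_classes

y = torch.tensor([2, 0])
print(smooth_one_hot(y, num_classes=4))
# tensor([[0.0250, 0.0250, 0.9250, 0.0250],
#         [0.9250, 0.0250, 0.0250, 0.0250]])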
RandAugment
RandAugment sequentially applies a number (depth) of randomly chosen image augmentations from a fixed set (e.g. translation, shear, contrast), each with a severity value sampled between 0 and 10. Applying this regularization during training improves network generalization.
How to use it?
import torchvision.transforms as transforms
from composer.algorithms.randaugment import RandAugmentTransform

randaugment_transform = RandAugmentTransform(severity=9,
                                             depth=2,
                                             augmentation_set="all")
composed = transforms.Compose([randaugment_transform, ...])  # ... the rest of your transforms
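As a usage example, the composed transform can be dropped into a standard torchvision dataset. The CIFAR-10 pipeline below is just an assumed setup (the data path, batch size, and extra transforms are mine, not from the original run):

from torch.utils.data import DataLoader
from torchvision import datasets, transforms

train_transform = transforms.Compose([
    randaugment_transform,   # the RandAugmentTransform defined above
    transforms.ToTensor(),   # convert to tensors after augmentation
])
train_set = datasets.CIFAR10("./data", train=True, download=True,
                             transform=train_transform)
train_loader = DataLoader(train_set, batch_size=128, shuffle=True, num_workers=2)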
MixUp
"Here we consider another augmentation method called mixup. In mixup, each time we randomly sample two examples (xi, yi) and (xj , yj ). Then we form a new example by a weighted linear interpolation of these two examples:
x-hat = λxi + (1 − λ)xj
y-hat = λyi + (1 − λ)yj
where λ ∈ [0, 1] is a random number drawn from the Beta(α, α) distribution. In mixup training, we only use the new example (x-hat, y-hat)."
How to use it?
import composer.functional as cf

for epoch in range(num_epochs):
    for X, y in train_loader:
        X_mixed, y_perm, mixing = cf.mixup_batch(X, y, alpha=0.2)
        y_hat = model(X_mixed)
        loss = (1 - mixing) * loss_fn(y_hat, y) + mixing * loss_fn(y_hat, y_perm)
        loss.backward()
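For intuition, here is a rough sketch of the interpolation described by the mixup equations above. The helper mixup_manual is hypothetical and assumes one-hot targets; it is not Composer's API. Composer's mixup_batch instead returns the permuted targets and the mixing coefficient so the loss can be interpolated directly, as in the snippet above.

import torch

def mixup_manual(X, y_one_hot, alpha=0.2):
    # sample the interpolation weight lambda from Beta(alpha, alpha)
    lam = torch.distributions.Beta(alpha, alpha).sample()
    # pair each example with a randomly permuted partner from the same batch
    perm = torch.randperm(X.size(0))
    # weighted linear interpolation of both inputs and targets
    X_mixed = lam * X + (1 - lam) * X[perm]
    y_mixed = lam * y_one_hot + (1 - lam) * y_one_hot[perm]
    return X_mixed, y_mixed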