Modern Data Augmentation Techniques for Computer Vision


Recent advances in deep learning owe much to faster compute and massive datasets. For many real-world problems, however, data is hard to come by. One of the most effective ways to regularize a model or make it more robust is to feed it more data, but how can one get more data?

The most straightforward answer is to collect more data, but that is not always feasible. When it isn't, the easiest way to train a model on more data is data augmentation. Data augmentation significantly increases the diversity of the data available for training, without actually collecting new samples.

Simple image augmentation techniques like flipping, random cropping, and random rotation are commonly used to train large models. They work well for most toy datasets and problem statements. In practice, however, the data a model sees after deployment can differ substantially from its training data. Is our model robust to such data shift and to data corruption? As it stands, models don't generalize robustly under shifts in data. If models could identify when they are likely to be mistaken, or estimate their uncertainty accurately, the impact of this fragility might be reduced. Unfortunately, models tend to be overconfident in their predictions.
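To make the simple augmentations mentioned above concrete, here is a minimal sketch of random horizontal flipping and pad-then-crop. It uses NumPy rather than TensorFlow 2.x to stay dependency-light, and it is not the report's implementation. The function names, the 4-pixel zero padding, and the 32x32 crop size are assumptions chosen for illustration. Random rotation is left out for brevity.

```python
import numpy as np


def random_flip_left_right(image, rng):
    """Flip an (H, W, C) image horizontally with probability 0.5."""
    if rng.random() < 0.5:
        return image[:, ::-1, :]
    return image


def random_crop(image, crop_size, rng, padding=4):
    """Zero-pad the image, then cut a random crop_size x crop_size window.

    The padding value (4) is an illustrative assumption, not a value
    taken from the report.
    """
    padded = np.pad(
        image,
        ((padding, padding), (padding, padding), (0, 0)),
        mode="constant",
    )
    max_top = padded.shape[0] - crop_size
    max_left = padded.shape[1] - crop_size
    # rng.integers excludes the upper bound, hence the +1.
    top = rng.integers(0, max_top + 1)
    left = rng.integers(0, max_left + 1)
    return padded[top:top + crop_size, left:left + crop_size, :]


def augment(image, rng, crop_size=32, padding=4):
    """Apply the flip and the crop in sequence to a single image."""
    image = random_flip_left_right(image, rng)
    return random_crop(image, crop_size, rng, padding)


# Usage: augment one random CIFAR-sized image.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
augmented = augment(image, rng)
```

Each call produces a slightly different view of the same image, which is how augmentation increases the diversity of the training data without new samples. In a training pipeline, a function like `augment` would typically be applied per example on every epoch.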

In this report, we will dive into modern data augmentation techniques for computer vision. Here is a quick outline of what you should expect from this report:

  1. The theory behind some modern data augmentations, along with their implementations in TensorFlow 2.x.
  2. Some interesting ablation studies.
  3. A comparative study of these techniques.
  4. Benchmarking of models trained with these augmentation techniques on the CIFAR-10-C dataset.

