The Al-Dente Neural Network: Part I

Much like making pasta, training a neural network is easy to learn but takes a lifetime to master. What follows is probably the best recipe to make your own Al-Dente Neural Net, courtesy of Andrej Karpathy.
Sairam Sundaresan

Introduction

A couple of years ago, Andrej Karpathy posted a tweet on the most common mistakes people make when training neural nets.

A year later, he followed it up with a comprehensive blog post covering the steps he takes when building a neural network training pipeline that avoids the aforementioned mistakes (or at least makes them easy to fix). Given the detail and depth of Andrej's post, it is impossible to cover all of its points in a single report. Over a series of reports, I will try to put some of the steps in that recipe into practice and see how each of them impacts the quality of the network we end up with.

Note: You might be wondering where the Al Dente comes from and how on earth it is related to training a neural net. Al Dente pasta is firm to the tooth. It is neither too hard and raw, nor too mushy and soft. Similarly, an Al Dente neural net neither underfits nor overfits. It just works for your data. Which is why having a good recipe is important :wink:.

The Premise

1. Neural Net Training is a Leaky Abstraction:

• It's easy to code up a neural net thanks to the numerous libraries and frameworks available today.
• However, neural nets aren't a plug-and-play technology, and you must understand what happens behind the scenes.
2. Neural Nets fail Silently:

• You can have syntactically correct code and still have things fail on you because you put things together in the wrong order.
• You might have performed partial augmentation (just the image and not the label) and yet your network works decently.
• You might have initialized a network from a pretrained checkpoint but ignored the original mean. The list goes on and on.
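To make the "partial augmentation" failure concrete, here is a minimal sketch. It assumes a per-pixel label mask (a segmentation-style label, which is my choice for illustration and not something the recipe prescribes), and the helper names `flip_pair` and `flip_image_only` are hypothetical. Both variants run without any error, which is exactly why this kind of bug is silent.

```python
import numpy as np

def flip_pair(image, mask):
    """Horizontally flip an image and its per-pixel label mask together."""
    return image[:, ::-1].copy(), mask[:, ::-1].copy()

def flip_image_only(image, mask):
    """Buggy variant: the image is flipped but the mask is left untouched."""
    return image[:, ::-1].copy(), mask

# Toy 4x4 "image" with a bright object in the left half, and a matching mask.
image = np.zeros((4, 4), dtype=np.float32)
image[:, :2] = 1.0
mask = (image > 0.5).astype(np.int64)

good_img, good_mask = flip_pair(image, mask)
bad_img, bad_mask = flip_image_only(image, mask)

# Fraction of pixels where the mask still agrees with the object in the image.
good_agreement = float(((good_img > 0.5) == (good_mask == 1)).mean())
bad_agreement = float(((bad_img > 0.5) == (bad_mask == 1)).mean())

print("joint flip agreement:     ", good_agreement)
print("image-only flip agreement:", bad_agreement)
```

Nothing crashes in the buggy path. The only way to notice the problem is to check the augmented image against its label, either visually or with a small sanity check like the one above.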

In short, debugging a neural net is hard. While suffering is unavoidable :sweat:, you can take some measures to make sure that the suffering was worth your while :sweat_smile:.

The Recipe

There are 6 steps in the recipe, each with its own sub-steps. In this report, I will focus on the first of them:

Become One with the Data

One of the most important steps in building a robust pipeline is knowing your data inside and out. This involves completely forgetting about your network code and spending time inspecting (yes, manually) your data. Oftentimes this can be an insane challenge, since the "Fast and Furious" researcher in us wants to code up that neural net and watch it train to SOTA. However, by learning the quirks of the data you are working with, its biases, its limitations, and its patterns, you can better model your pipeline to draw out the maximum possible juice from it.

In his recipe, Andrej mentions looking out for the following:

• The distribution of the data and its patterns
• Duplicate examples and/or corrupted images/labels
• Imbalances and Biases in the data
• Your own approach for classifying the data

Fasten your seatbelts and follow along as we explore some of these points below.

What's our data anyway?

For the purpose of demonstration, I chose the CIFAR-10 dataset, which consists of 10 object classes: Airplanes, Automobiles, Birds, Cats, Deer, Dogs, Frogs, Horses, Ships, and Trucks. Each class has 5,000 training examples and 1,000 test examples, which gives 60k images in all. Each image in the dataset is a color image of resolution $32 \times 32$.
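The class counts above are easy to verify once the labels are in memory. The sketch below is one way to do it. It uses stand-in label lists built to mirror the numbers quoted above rather than the real CIFAR-10 files, and the helpers `class_counts` and `is_balanced` are hypothetical names. The mapping from integer label to class name is assumed to follow the order in which the classes are listed above.

```python
from collections import Counter

# Assumed label order: index i maps to the i-th class listed in the text.
CLASS_NAMES = ["airplane", "automobile", "bird", "cat", "deer",
               "dog", "frog", "horse", "ship", "truck"]

def class_counts(labels, class_names=CLASS_NAMES):
    """Return {class name: number of examples} for a sequence of int labels."""
    counts = Counter(labels)
    return {name: counts.get(i, 0) for i, name in enumerate(class_names)}

def is_balanced(counts, tolerance=0.0):
    """True if every class count is within `tolerance` (relative) of the mean."""
    values = list(counts.values())
    mean = sum(values) / len(values)
    return all(abs(v - mean) <= tolerance * mean for v in values)

# Stand-in labels that mirror the split described above (not the real data).
train_labels = [i for i in range(10) for _ in range(5000)]
test_labels = [i for i in range(10) for _ in range(1000)]

train_counts = class_counts(train_labels)
test_counts = class_counts(test_labels)
total_images = len(train_labels) + len(test_labels)

print(train_counts)
print("balanced:", is_balanced(train_counts), "| total images:", total_images)
```

With real data, the same two functions can be pointed at the label array returned by whichever loader you use.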

Let's now load a few of them up and visualize them to get a sense of the image quality, object variety and so forth.
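For eyeballing many small images at once, it helps to tile them into a single array. The following is a minimal sketch, not the code used for this report: `make_grid` is a hypothetical helper, and the batch is random noise standing in for real CIFAR-10 images of shape (N, 32, 32, 3).

```python
import numpy as np

def make_grid(images, n_cols):
    """Tile N images of shape (H, W, C) into one (rows*H, n_cols*W, C) array.

    Unused cells in the last row are left as zeros (black).
    """
    n, h, w, c = images.shape
    n_rows = -(-n // n_cols)  # ceiling division
    grid = np.zeros((n_rows * h, n_cols * w, c), dtype=images.dtype)
    for idx in range(n):
        row, col = divmod(idx, n_cols)
        grid[row * h:(row + 1) * h, col * w:(col + 1) * w] = images[idx]
    return grid

# Random stand-in batch with the same shape and dtype as CIFAR-10 images.
rng = np.random.default_rng(0)
fake_batch = rng.integers(0, 256, size=(10, 32, 32, 3), dtype=np.uint8)

grid = make_grid(fake_batch, n_cols=4)
print(grid.shape)
```

The resulting array can be handed to any image viewer, for example matplotlib's `plt.imshow(grid)`. Building one grid per class makes it easier to compare quality and variety within a class.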

Of Duplicates, Noisy Labels and more....

For each class in the dataset, I first visualized images to see how varied they were and how consistent the quality of the labeling was. I manually inspected several batches of images and identified ones that I felt would be difficult for the model to classify, ones which were incorrectly labeled, and ones where there were multiple objects in the image. Doing this gives me a good sense of how to evaluate my model when it makes mistakes. It also gives me an idea of the cleanliness of the labeling.
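Manual inspection can be complemented with a cheap automated pass. As a sketch under stated assumptions, the code below finds byte-identical images by hashing their pixels and then flags duplicate groups whose labels disagree. The function names and the toy data are hypothetical, and this only catches exact copies; near-duplicates (resized or re-encoded images) would need a different technique.

```python
import hashlib
from collections import defaultdict

import numpy as np

def find_exact_duplicates(images):
    """Group indices of images whose raw pixel bytes are identical."""
    groups = defaultdict(list)
    for idx, img in enumerate(images):
        digest = hashlib.sha1(np.ascontiguousarray(img).tobytes()).hexdigest()
        groups[digest].append(idx)
    return [idxs for idxs in groups.values() if len(idxs) > 1]

def conflicting_labels(duplicate_groups, labels):
    """Keep only the duplicate groups whose members carry different labels."""
    return [g for g in duplicate_groups if len({labels[i] for i in g}) > 1]

# Toy stand-in data: images 4 and 5 are planted copies of images 1 and 2.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(6, 32, 32, 3), dtype=np.uint8)
images[4] = images[1]
images[5] = images[2]
labels = [0, 3, 1, 7, 3, 9]  # the copy of image 2 carries a different label

duplicates = find_exact_duplicates(images)
conflicts = conflicting_labels(duplicates, labels)

print("duplicate groups:", duplicates)
print("groups with conflicting labels:", conflicts)
```

Groups with conflicting labels are good candidates for the first round of manual review, since at least one label in each group must be wrong.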

Just going through 10 classes' worth of images took me several hours, and finding odd samples in the dataset was even more challenging. I can only imagine how many days it would have taken Andrej to go through ImageNet in its entirety. No wonder he is called the Human ImageNet Classifier :sweat_smile:.

Pixel Statistics and t-SNE

Now that we've explored the nuances of the individual classes, let's put our assumptions to the test. Particularly, let's look at a few more things:

1. What would an average image of a given class look like?
2. What do the color distributions of similar classes look like?
3. Finally, what if we clustered our images with t-SNE? Do they form well separated clusters?
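The first two questions can be answered with a few lines of NumPy. The sketch below is an illustration and not the analysis code behind this report: `class_mean_image` and `channel_histograms` are hypothetical helpers, and the data is synthetic, with one class tinted blue and the other green so the effect is visible. The t-SNE question is left out of this sketch.

```python
import numpy as np

def class_mean_image(images, labels, cls):
    """Average all images of one class; returns a float array of shape (H, W, C)."""
    return images[labels == cls].astype(np.float64).mean(axis=0)

def channel_histograms(images, bins=8):
    """Normalized intensity histogram per color channel, shape (C, bins)."""
    hists = []
    for ch in range(images.shape[-1]):
        counts, _ = np.histogram(images[..., ch], bins=bins, range=(0, 256))
        hists.append(counts / counts.sum())
    return np.stack(hists)

# Synthetic stand-in data: dark noise, then class 0 tinted blue, class 1 green.
rng = np.random.default_rng(0)
images = rng.integers(0, 64, size=(20, 32, 32, 3), dtype=np.uint8)
labels = np.array([0] * 10 + [1] * 10)
images[:10, :, :, 2] += 150   # boost the blue channel of class 0
images[10:, :, :, 1] += 150   # boost the green channel of class 1

mean_blue = class_mean_image(images, labels, 0)
mean_green = class_mean_image(images, labels, 1)
hist_blue = channel_histograms(images[labels == 0])

print("class 0 mean per channel (R, G, B):", mean_blue.mean(axis=(0, 1)).round(1))
print("class 1 mean per channel (R, G, B):", mean_green.mean(axis=(0, 1)).round(1))
```

On real data, a mean image that is dominated by one color, or two classes with nearly identical channel histograms, hints that the model may lean on background color rather than on the object itself.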

Summary

Clearly, there's a lot more analysis that can be done, and many more outliers can be found. What's more important is that we keep these aspects in mind when designing the model, analyzing its performance, and thinking of ways to improve it. Ideally, we should strive to clean the dataset so that it contains as few of these outliers as possible (ideally none). In summary, here are some of the things I found while manually going through this dataset:

• It has balanced class sample numbers (5k training samples, 1k test samples per class)
• However, there are some mislabeled examples for almost all of the classes
• There are samples that have been squished or padded to fit the dataset dimensions
• There is inconsistency in labeling. A van is labeled both as a car and as a truck
• Some of the images are really grainy and it's pretty hard to identify the object in question
• There are some images where multiple objects are present, e.g., a human and a deer, a human and a truck, or multiple cars
• The samples for some of the classes have been drawn from sources like posters, toys, mock ups and this will make it challenging for our model
• For some of the classes, color might bias our model and force it to lean one way. For example, ships and airplanes are both often seen against blue backgrounds, deer and horses often appear against green backgrounds, and the dataset contains quite a few red cars and trucks

I will leave you to do more exploratory analysis, but going through this process has given me a lot of insight into how to go about the next steps of the recipe. Until the next part, enjoy your al-dente data :wink:.

References

1. A Recipe for Training Neural Networks: https://karpathy.github.io/2019/04/25/recipe/

2. Yes you should understand backprop: https://medium.com/@karpathy/yes-you-should-understand-backprop-e2f06eab496b