What To Do When Inception-ResNet-V2 Is Too Slow
This article shows that some versions of Inception parallelize across GPUs better than others, and explains why switching from Inception-ResNet-V2 to Inception V3 can make training much faster.
Created on March 28 | Last edited on October 6
One Architecture Trains 6 Times Faster
I tried training two versions of Inception on an image-classification task, running data-parallel on 8 GPUs with Keras.
I expected the newer, more sophisticated model (Inception-ResNet-V2, shown in red) to perform better, but it trains about 6 times slower. In the left plot below, the training loss and accuracy curves for both models reach the same final values, but the red curves take about 6 times longer to get there.
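The training code isn't shown in this article, but the core idea of data-parallel training can be sketched framework-agnostically in NumPy. This is an illustrative toy (a linear model with a squared-error loss, not the Inception setup): each "GPU" gets one shard of the global batch, computes a gradient independently, and the gradients are averaged.

```python
import numpy as np

# Hypothetical stand-in for one replica's training step: a linear
# model y = X @ w with squared-error loss; returns the gradient w.r.t. w.
def replica_gradient(w, X, y):
    residual = X @ w - y
    return 2.0 * X.T @ residual / len(X)

rng = np.random.default_rng(0)
n_gpus = 8
X = rng.normal(size=(64, 4))   # one global batch of 64 examples
y = rng.normal(size=64)
w = np.zeros(4)

# Data parallelism: split the global batch into one shard per GPU,
# compute per-shard gradients independently, then average them.
shards = zip(np.array_split(X, n_gpus), np.array_split(y, n_gpus))
grads = [replica_gradient(w, Xs, ys) for Xs, ys in shards]
avg_grad = np.mean(grads, axis=0)

# With equal shard sizes, the averaged gradient equals the
# full-batch gradient computed on a single device.
assert np.allclose(avg_grad, replica_gradient(w, X, y))
```

The catch, as the utilization plots below show, is the averaging step: if the framework funnels the merge (and the model's parameter updates) onto one device, that device becomes the bottleneck.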
Inception V3 Parallelizes Better Than Inception-ResNet-V2
Plotting GPU usage for the two models explains what's happening. The right plot below shows per-GPU utilization for all 8 GPUs: red shades for the 2016 Inception-ResNet-V2 model and blue shades for the 2015 Inception V3 model. For Inception-ResNet-V2, GPU 0 (top orange line) does far more work, 6-8 times as much as the other 7 GPUs (its utilization drops at the end of the first epoch, around 46 minutes). For Inception V3, all 8 GPUs share the work much more evenly. Using Inception V3 instead of Inception-ResNet-V2 for this task lets me iterate much faster.
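The imbalance alone accounts for much of the slowdown: in synchronous data-parallel training, every step waits for the busiest device. A back-of-envelope sketch, using illustrative load numbers read off the utilization plot (not exact measurements):

```python
# In synchronous data-parallel training, step time is gated by the
# busiest GPU. Illustrative loads: GPU 0 does roughly 7 units of
# work per step, while the other 7 GPUs do 1 unit each.
loads = [7] + [1] * 7

total_work = sum(loads)            # 14 units of work per step
balanced_step = total_work / 8     # 1.75 units if spread evenly
actual_step = max(loads)           # 7 units, gated by GPU 0

slowdown = actual_step / balanced_step
print(f"step-time penalty from imbalance: {slowdown:.1f}x")  # 4.0x
```

Imbalance of this magnitude would cost roughly a 4x step-time penalty on its own; Inception-ResNet-V2's heavier per-step compute may account for the rest of the observed ~6x gap.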