What To Do When Inception-ResNet-V2 Is Too Slow
This article shows that some versions of Inception parallelize across GPUs better than others, and explains why switching from Inception-ResNet-V2 to Inception V3 can make training much faster.
Created on March 28 | Last edited on October 6
One Architecture Trains 6 Times Faster
I tried training two versions of Inception on an image-classification task, running data-parallel on 8 GPUs with Keras.
I expected the newer, more sophisticated model (Inception-ResNet-V2, shown in red) to perform better, but it trains about 6 times slower. In the left plot below, the training loss and accuracy curves for both models reach the same final values, but the red curves take about 6 times longer to get there.
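The training code isn't shown in this article, but the core idea of data-parallel training can be sketched framework-agnostically in NumPy. This is an illustrative toy (a linear model with a squared-error loss, not the Inception setup): each "GPU" gets one shard of the global batch, computes a gradient independently, and the gradients are averaged.

```python
import numpy as np

# Hypothetical stand-in for one replica's training step: a linear
# model y = X @ w with squared-error loss; returns the gradient w.r.t. w.
def replica_gradient(w, X, y):
    residual = X @ w - y
    return 2.0 * X.T @ residual / len(X)

rng = np.random.default_rng(0)
n_gpus = 8
X = rng.normal(size=(64, 4))   # one global batch of 64 examples
y = rng.normal(size=64)
w = np.zeros(4)

# Data parallelism: split the global batch into one shard per GPU,
# compute per-shard gradients independently, then average them.
shards = zip(np.array_split(X, n_gpus), np.array_split(y, n_gpus))
grads = [replica_gradient(w, Xs, ys) for Xs, ys in shards]
avg_grad = np.mean(grads, axis=0)

# With equal shard sizes, the averaged gradient equals the
# full-batch gradient computed on a single device.
assert np.allclose(avg_grad, replica_gradient(w, X, y))
```

The catch, as the utilization plots below show, is the averaging step: if the framework funnels the merge (and the model's parameter updates) onto one device, that device becomes the bottleneck.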
Inception V3 Parallelizes Better Than Inception-ResNet-V2
Plotting GPU usage for the two models explains what's happening. The right plot below shows per-GPU utilization for all 8 GPUs: red shades for the 2016 Inception-ResNet-V2 model and blue shades for the 2015 Inception V3 model. For Inception-ResNet-V2, GPU 0 (top orange line) does far more work, 6-8 times as much as the other 7 GPUs (its utilization drops at the end of the first epoch, around 46 minutes). For Inception V3, all 8 GPUs share the work much more evenly. Using Inception V3 instead of Inception-ResNet-V2 for this task lets me iterate much faster.
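The imbalance alone accounts for much of the slowdown: in synchronous data-parallel training, every step waits for the busiest device. A back-of-envelope sketch, using illustrative load numbers read off the utilization plot (not exact measurements):

```python
# In synchronous data-parallel training, step time is gated by the
# busiest GPU. Illustrative loads: GPU 0 does roughly 7 units of
# work per step, while the other 7 GPUs do 1 unit each.
loads = [7] + [1] * 7

total_work = sum(loads)            # 14 units of work per step
balanced_step = total_work / 8     # 1.75 units if spread evenly
actual_step = max(loads)           # 7 units, gated by GPU 0

slowdown = actual_step / balanced_step
print(f"step-time penalty from imbalance: {slowdown:.1f}x")  # 4.0x
```

Imbalance of this magnitude would cost roughly a 4x step-time penalty on its own; Inception-ResNet-V2's heavier per-step compute may account for the rest of the observed ~6x gap.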