Paper Reading Group: Revisiting ResNets

The paper reading groups are supported by experiments, blogs & code implementations! This is your chance to come talk about the paper that interests you!
Andrea Pessl
After a successful discussion on ViT, Aman Arora from Weights & Biases is discussing the second of the four papers in our paper reading group series on computer vision:

Revisiting ResNets: Improved Training and Scaling Strategies [paper, blog]

We're thrilled to announce that for this upcoming paper reading group we're also joined by Aravind Srinivas, one of the authors of the ResNet-RS paper. Aravind has been involved in other groundbreaking research projects, and you can find a list of his publications here.
NOTE: After a number of requests from our last paper reading, we have updated the paper reading group timings to also support the IST time zone!

Register here for May 25, 9am PT / 6pm CET / 9:30pm IST

This is your chance to ask your burning questions!
Comment below with any questions that you'd like answered as part of our next Paper Reading Group.

Revisiting ResNets: Improved Training and Scaling Strategies

With over 63,000 citations, ResNets remain at the forefront of computer vision (CV) research even today. Most recent CV papers compare their results against ResNets to showcase improvements in accuracy, speed, or both.
❓: But do such improvements in ImageNet top-1 accuracy come from model architectures or from improved training and scaling strategies?
This is precisely the question that Bello et al. try to answer in their recent paper Revisiting ResNets: Improved Training and Scaling Strategies.
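For readers who haven't looked at a ResNet in a while, here is a minimal sketch of the residual (skip-connection) block that defines the architecture, which the paper holds roughly fixed while varying training and scaling strategies. This is an illustrative PyTorch example; the channel sizes and the two-conv layout are generic assumptions, not values taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BasicBlock(nn.Module):
    """A residual block: output = relu(f(x) + x), where f is two 3x3 convs."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        # the identity shortcut is what makes this a residual block
        return F.relu(out + x)

# usage: a batch of two 64-channel feature maps keeps its shape
block = BasicBlock(64)
print(block(torch.randn(2, 64, 56, 56)).shape)  # torch.Size([2, 64, 56, 56])
```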

In the upcoming sessions of our paper reading group (PRG) series on four CV papers, we will cover:

Characterizing Signal Propagation to Close the Performance Gap in Unnormalized ResNets - June 8

Another key recent advancement has come from DeepMind researchers Andrew Brock, Soham De, Samuel L. Smith & Karen Simonyan. Thanks to their work, it's now possible to train networks without normalization that reach state-of-the-art accuracy on ImageNet! Are normalizer-free networks going to be the new norm?

EfficientNetV2: Smaller Models and Faster Training - June 22

After the massive success of the EfficientNet architecture, Mingxing Tan and Quoc V. Le have done it again! This time they have come up with a new family of networks with faster training speed and better parameter efficiency - EfficientNetV2! Will these networks have the same success as EfficientNets? Most probably, yes!

For comments on our previous Paper Reading Group from May 9, see this report:

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

While the Transformer architecture has become the de facto standard for natural language processing tasks, its applications to computer vision remain limited, with most approaches still relying on CNNs. In this paper, Dosovitskiy et al. show that this reliance on CNNs is not necessary and that a pure transformer applied directly to sequences of image patches can perform very well on image classification tasks. The paper summary can be found here.
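To make the "sequences of image patches" idea concrete, here is a minimal sketch, assuming PyTorch and the 16x16 non-overlapping patches referenced in the paper's title, of how an image can be flattened into a patch sequence before a transformer sees it. The function name and usage below are illustrative, not from the paper's code.

```python
import torch

def image_to_patches(images: torch.Tensor, patch_size: int = 16) -> torch.Tensor:
    """Split a batch of images into flattened, non-overlapping patches."""
    b, c, h, w = images.shape
    p = patch_size
    # carve out p x p tiles along height and width: (b, c, h/p, w/p, p, p)
    patches = images.unfold(2, p, p).unfold(3, p, p)
    # reorder and flatten each tile into a vector: (b, num_patches, p*p*c)
    return patches.permute(0, 2, 3, 1, 4, 5).reshape(b, -1, c * p * p)

# usage: a 224x224 RGB image becomes a sequence of 196 patch vectors of length 768
seq = image_to_patches(torch.randn(1, 3, 224, 224))
print(seq.shape)  # torch.Size([1, 196, 768])
```

Each of those patch vectors is then linearly projected and treated exactly like a word token, which is where the paper's "an image is worth 16x16 words" framing comes from.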