Paper Reading Group: NF-ResNets
The paper reading groups are supported by experiments, blogs & code implementations! This is your chance to come talk about the paper that interests you!
Characterizing Signal Propagation to Close the Performance Gap in Unnormalized ResNets [paper, blog]
NF-ResNets - June 8
In this paper, the authors seek to establish a general recipe for training deep ResNets without normalization layers that achieves test accuracies competitive with the state of the art! Batch Normalization (BatchNorm) has been key to advancing deep learning research in computer vision, but, in the past few years, a new line of research has emerged that seeks to eliminate activation-normalizing layers entirely.
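To make the idea concrete, here is a minimal NumPy sketch (illustrative only, not the authors' code) of the paper's core signal-propagation trick: each residual step has the form h_{i+1} = h_i + α · f(h_i / β_i), where β_i is chosen analytically so the branch sees unit-variance input and the variance of the signal grows by a predictable α² per block. The stand-in `branch` function (a fresh variance-preserving random linear map) is an assumption used here in place of a real convolutional block.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2048  # toy "channel" dimension

def branch(x):
    # Stand-in for the residual branch f_i: a fresh variance-preserving
    # random linear map, mimicking a freshly initialized block
    # (an assumption made purely for this illustration).
    W = rng.standard_normal((n, n)) / np.sqrt(n)
    return W @ x

def nf_residual_block(h, alpha, beta):
    # Normalizer-free residual step: h_{i+1} = h_i + alpha * f(h_i / beta)
    return h + alpha * branch(h / beta)

h = rng.standard_normal(n)            # unit-variance input signal
alpha, expected_var = 0.2, 1.0
for _ in range(5):
    beta = np.sqrt(expected_var)      # beta_i = sqrt(Var(h_i)), set analytically
    h = nf_residual_block(h, alpha, beta)
    expected_var += alpha ** 2        # predicted growth: Var rises by alpha^2
print(round(float(h.var()), 2), round(expected_var, 2))
```

Because β_i is computed in closed form rather than measured from a batch, no BatchNorm layer is needed to keep activations well-scaled as depth grows.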
We will continue with:
- June 29, 2021: EfficientNetV2: Smaller Models and Faster Training [paper, blog]
Register here for June 8, 9am PT / 6pm CET / 9:30pm IST
This is your chance to ask your burning questions!
Comment below any questions that you'd like to be answered as part of our next Paper Reading Group.
In the final session of our PRG series on four CV papers, we will cover:
EfficientNetV2: Smaller Models and Faster Training - June 29
After the massive success of the EfficientNet architecture, Mingxing Tan and Quoc V. Le have done it again! This time they have come up with a new family of networks with faster training speed and better parameter efficiency - EfficientNetV2! Will these networks have the same success as EfficientNets? Most probably, yes!
Revisiting ResNets: Improved Training and Scaling Strategies
With over 63,000 citations, ResNets remain at the forefront of Computer Vision (CV) research even today. Most recent CV papers compare their results to ResNets to showcase improvements in accuracy, speed, or both.
❓: But, do such improvements on ImageNet top-1 accuracy come from model architectures or improved training and scaling strategies?
This is precisely the question that Bello et al. try to answer in their recent paper, Revisiting ResNets: Improved Training and Scaling Strategies.
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
While the Transformer architecture has become the de-facto standard for natural language processing tasks, its applications to computer vision, where convolutional neural networks (CNNs) dominate, remain limited. In this paper, Dosovitskiy et al. show that this reliance on CNNs is not necessary and that a pure transformer applied directly to sequences of image patches can perform very well on image classification tasks. The paper summary can be found here.
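The title's "16x16 words" can be made concrete with a small sketch: the image is split into non-overlapping 16x16 patches, and each flattened patch becomes one token for the Transformer. This NumPy snippet (illustrative only; the `image_to_patches` helper is hypothetical, and ViT additionally applies a learned linear projection and position embeddings) shows the tokenization step:

```python
import numpy as np

def image_to_patches(img, patch=16):
    """Split an image of shape (H, W, C) into flattened non-overlapping patches.

    Each patch becomes one "word": a vector of length patch*patch*C,
    which ViT then linearly projects into the Transformer's input sequence.
    """
    h, w, c = img.shape
    assert h % patch == 0 and w % patch == 0, "image must tile evenly into patches"
    # (H//p, p, W//p, p, C) -> (H//p, W//p, p, p, C) -> (num_patches, p*p*C)
    x = img.reshape(h // patch, patch, w // patch, patch, c)
    x = x.transpose(0, 2, 1, 3, 4)
    return x.reshape(-1, patch * patch * c)

tokens = image_to_patches(np.zeros((224, 224, 3)))
print(tokens.shape)  # (196, 768): a 224x224 image yields 14*14 = 196 patch tokens
```

For the standard 224x224 input this gives a sequence of 196 tokens, a length comparable to a sentence in NLP, which is what lets the unmodified Transformer machinery apply.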