Subclassed Tensor vs torch.Tensor GPU Throughput
Created on June 10|Last edited on June 10
This report shows a consistent decrease in GPU throughput when training with the subclassed tensor defined below instead of a plain torch.Tensor.
class SubClassedTensor(torch.Tensor):
    pass
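As a minimal sketch of how such a subclass behaves: a plain tensor can be viewed as the subclass with `as_subclass`, and the results of ordinary operations keep the subclass type, because each call is routed through `__torch_function__`. That extra dispatch is one plausible source of overhead, though this report does not isolate the cause. The tensor shape and variable names are illustrative assumptions and are not taken from the training script.

```python
import torch


class SubClassedTensor(torch.Tensor):
    pass


# Illustrative small batch (assumption); the report's runs used 64 images at 224px.
plain = torch.randn(4, 3, 8, 8)

# View the same data as the subclass.
sub = plain.as_subclass(SubClassedTensor)

# Ordinary operations on the subclass return the subclass type.
out = torch.relu(sub * 2.0)

print(type(plain).__name__)
print(type(out).__name__)
```

The values match what the same operations produce on the plain tensor; only the Python type of the result differs.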
All runs used a torchvision ResNet50, 224px image size, a batch size of 64, and mixed precision. The script for training can be found here.
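For reference, training throughput in a comparison like this is commonly expressed as samples processed per second. The sketch below shows that arithmetic and how a relative decrease between two runs could be computed. The function names and timing values are hypothetical; only the batch size of 64 comes from the setup above, and the report's panels may define throughput differently.

```python
def samples_per_second(batch_size: int, steps: int, elapsed_seconds: float) -> float:
    """Throughput as samples processed per second of wall-clock time."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return batch_size * steps / elapsed_seconds


def relative_decrease(baseline: float, other: float) -> float:
    """Fractional drop of `other` relative to `baseline` (0.1 means 10% lower)."""
    return (baseline - other) / baseline


# Hypothetical timings for illustration, NOT measurements from this report.
tensor_rate = samples_per_second(batch_size=64, steps=100, elapsed_seconds=8.0)
subclass_rate = samples_per_second(batch_size=64, steps=100, elapsed_seconds=10.0)
drop = relative_decrease(tensor_rate, subclass_rate)

print(tensor_rate, subclass_rate, drop)
```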
Volta V100 (run set: 6 runs)
Ampere 3080 Ti (run set: 10 runs)
Ampere 3080 Ti: Channels Last (run set: 10 runs)