How Does the “view” Method Work in PyTorch?

Demystify "view" in PyTorch and find a better way to design models in PyTorch. Made by Ayush Thakur using Weights & Biases

Introduction

The view method in PyTorch can be a bit confusing. We're here to help. Let's start with a code snippet that uses view.
def forward(self, x):
    x = self.pool(F.relu(self.conv1(x)))
    x = self.pool(F.relu(self.conv2(x)))
    x = x.view(-1, 16*5*5)
    x = F.relu(self.fc1(x))
    x = F.relu(self.fc2(x))
    x = self.fc3(x)
    return x
Today, we're going to take a look at what the view function actually does, what happens when we give it negative values, and how we can leverage view to design better models. Let's dig in:

Let us investigate

How to view?

Simply put, the view function is used to reshape tensors. First, we'll create a simple tensor in PyTorch:
import torch

# tensor
# torch.arange excludes the end value; the older torch.range is deprecated.
some_tensor = torch.arange(1, 37)  # creates a tensor of shape (36,)
Since view is used to reshape, let's do a simple reshape to get a tensor of shape (3, 12).
some_tensor_reshaped = some_tensor.view(3, 12)  # creates a tensor of shape (3, 12)
Other shapes we can reshape some_tensor into are (12, 3), (6, 6), (2, 18), etc. Any shape works as long as it holds exactly 36 elements.
But notice that we could pick these shapes only because we already knew how many elements the tensor holds. What if you don't know the size of one of the dimensions in advance?
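To check the shapes listed above, here is a minimal sketch. It builds the 36 values with torch.arange(1, 37), and the (5, 7) shape is only a hypothetical example of a shape that does not fit. The rule it illustrates is that every target shape must hold exactly as many elements as the original tensor.

```python
import torch

some_tensor = torch.arange(1, 37)  # 36 elements: 1, 2, ..., 36

# Every target shape below contains exactly 36 elements, so view accepts it.
shapes = [(3, 12), (12, 3), (6, 6), (2, 18)]
reshaped = {shape: some_tensor.view(*shape) for shape in shapes}
for shape, tensor in reshaped.items():
    print(shape, "->", tuple(tensor.shape))

# A shape with a different element count is rejected with a RuntimeError.
try:
    some_tensor.view(5, 7)  # 35 elements (hypothetical bad shape)
    mismatch_rejected = False
except RuntimeError:
    mismatch_rejected = True
print("mismatch rejected:", mismatch_rejected)
```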
Remember our intro where we talked about negative values? This is where the -1 parameter is magical.
some_tensor_reshaped_1 = some_tensor.view(3, -1)   # creates a tensor of shape (3, 12)
some_tensor_reshaped_2 = some_tensor.view(-1, 12)  # creates a tensor of shape (3, 12)
The -1 parameter tells PyTorch to compute that one dimension from the total number of elements and the other dimensions you specify. Only one dimension can be -1. This is useful while building a model in PyTorch because you have to specify the input and output sizes for each layer, and working those out by hand can be tedious for complex networks.
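To make the -1 behaviour concrete, here is a small sketch (the shapes are our own illustrative choices, not from the model above). The inferred dimension is the total number of elements divided by the product of the dimensions you did specify, and PyTorch can infer only one dimension at a time.

```python
import torch

some_tensor = torch.arange(1, 37)  # 36 elements

# The -1 dimension is inferred as numel // (product of the known dimensions).
inferred = some_tensor.view(4, -1)
expected_cols = some_tensor.numel() // 4  # 36 // 4 = 9
print(tuple(inferred.shape))  # (4, 9)

# -1 also works when there are more than two dimensions.
three_d = some_tensor.view(2, -1, 3)  # 36 // (2 * 3) = 6
print(tuple(three_d.shape))  # (2, 6, 3)

# Two -1 entries are ambiguous, so PyTorch raises a RuntimeError.
try:
    some_tensor.view(-1, -1)
    two_inferred_ok = True
except RuntimeError:
    two_inferred_ok = False
print("two -1 dimensions accepted:", two_inferred_ok)
```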

How to Build PyTorch Models Easily

Next, I will show a smart use case of view for building your neural network architecture. Let us look at this model architecture:
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16*5*5, 120)  # <---- you can do so only if you know the expected input shape.
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16*5*5)  # <---- notice passing in one value expected in the output shape.
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x
Someone coming from the TensorFlow (Keras) ecosystem might not be used to specifying the expected input size when defining a layer, because the Keras high-level APIs infer it for us. So is there a smart way to avoid hard-coding that size in PyTorch?
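Where does 16*5*5 come from? The snippet does not state the input size, so the sketch below assumes 32x32 images, which is consistent with the hard-coded value. The helper names conv_out and pool_out are hypothetical. They apply the usual output-size arithmetic for a convolution or pooling layer with no dilation.

```python
def conv_out(size, kernel, stride=1, padding=0):
    # Output height/width of a convolution (no dilation).
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel, stride):
    # Output height/width of a max-pooling layer (no padding).
    return (size - kernel) // stride + 1


size = 32                        # assumed input height and width
size = conv_out(size, kernel=5)  # conv1 (5x5 kernel): 32 -> 28
size = pool_out(size, 2, 2)      # pool (2x2, stride 2): 28 -> 14
size = conv_out(size, kernel=5)  # conv2 (5x5 kernel): 14 -> 10
size = pool_out(size, 2, 2)      # pool (2x2, stride 2): 10 -> 5

flat_features = 16 * size * size  # 16 output channels from conv2
print(size, flat_features)  # 5 400
```

This is exactly the bookkeeping you have to redo by hand whenever the input size or a layer changes.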
Let us use view to our advantage and modify the Net class a bit.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self, input_shape):
        # input_shape is the shape of one input sample: (channels, height, width).
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        n_size = self._get_conv_output(input_shape)  # <---- computed from the input shape instead of hard-coded.
        self.fc1 = nn.Linear(n_size, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def _get_conv_output(self, shape):
        # returns the size of the output tensor going into Linear layer from the conv block.
        batch_size = 1
        dummy_input = torch.rand(batch_size, *shape)
        with torch.no_grad():  # no gradients are needed for this shape probe
            output_feat = self._forward_features(dummy_input)
        n_size = output_feat.view(batch_size, -1).size(1)  # <---- notice the first use of view
        return n_size

    def _forward_features(self, x):
        # returns the feature tensor from the conv block
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        return x

    def forward(self, x):
        x = self._forward_features(x)
        x = x.view(x.size(0), -1)  # <---- notice the second use of view
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x
Did you notice the two uses of view? Let us go through each of them:
• In the _get_conv_output method, output_feat is the feature tensor from the convolutional block's final conv/pooling operation. Because PyTorch uses a channels-first layout, it has shape (batch_size, channels, n, n). By using view(batch_size, -1) we let PyTorch compute channels x n x n automatically, and that value is returned as n_size to the __init__ method.
• The forward method receives the input training data, whose shape here would be (batch_size, 3, height, width). The output feature tensor from the _forward_features method will have shape (batch_size, channels, n, n). We can reshape it (flatten it) to (batch_size, channels x n x n) using view(x.size(0), -1).
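The two uses of view described above can be tried out in isolation. The following is a minimal sketch, not the model itself: the helper infer_flat_features and the two input sizes are hypothetical, and the conv block is rebuilt with nn.Sequential purely for brevity.

```python
import torch
import torch.nn as nn


def infer_flat_features(conv_block, input_shape):
    """Push one dummy sample through conv_block and count the flattened features."""
    dummy = torch.rand(1, *input_shape)  # (1, channels, height, width)
    with torch.no_grad():
        out = conv_block(dummy)
    return out.view(1, -1).size(1)  # first use of view: infer channels * n * n


# Same layers as the conv part of Net, written as a Sequential for brevity.
conv_block = nn.Sequential(
    nn.Conv2d(3, 6, 5), nn.ReLU(), nn.MaxPool2d(2, 2),
    nn.Conv2d(6, 16, 5), nn.ReLU(), nn.MaxPool2d(2, 2),
)

n_32 = infer_flat_features(conv_block, (3, 32, 32))  # 16 * 5 * 5 = 400
n_64 = infer_flat_features(conv_block, (3, 64, 64))  # 16 * 13 * 13 = 2704
print(n_32, n_64)

# Second use of view: flatten a real batch while keeping the batch dimension.
batch = torch.rand(8, 3, 32, 32)
features = conv_block(batch)                # shape (8, 16, 5, 5)
flat = features.view(features.size(0), -1)  # shape (8, 400)
print(tuple(flat.shape))
```

Because the flattened size is measured rather than hard-coded, changing the input resolution only changes the argument passed in, not the layer definitions.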

Conclusion

This was just a short post meant to show you how view can be used to reshape tensors and to share the magic of the -1 parameter. If there are other PyTorch functions you need help with, drop them in the comments and we'll write about them!