Difference Between ‘SAME’ and ‘VALID’ Padding in TensorFlow

This report explains the difference between ‘SAME’ and ‘VALID’ padding in tf.nn.max_pool in TensorFlow. Made by Krisha Mehta using Weights & Biases.

In TensorFlow, tf.nn.max_pool performs max pooling on the input. Max pooling downsamples the spatial dimensions of the input, reducing the number of parameters and the amount of computation needed to train the network. It does this by sliding a window over the input and keeping only the maximum value inside each window.

tf.nn.max_pool takes 6 arguments: input (a tensor of rank N+2), ksize (the size of the pooling window for each dimension of the input tensor), strides (the stride of the sliding window for each dimension of the input tensor), padding, and the optional data_format and name. The padding argument takes one of 2 values, "VALID" or "SAME", and padding is performed by adding values around the input. For convolutions the value used for padding is always zero; for max pooling the padded positions never contribute to the output (they behave as if padded with negative infinity, so a padded value can never be the maximum).
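As a quick illustration, here is a minimal sketch of a call (the tensor shape and pooling parameters are arbitrary, chosen only to show the two padding options side by side):

import tensorflow as tf

# A 4-D input tensor in NHWC layout: (batch, height, width, channels).
x = tf.random.normal([1, 5, 5, 1])

# A 2x2 pooling window with stride 2 and no padding.
valid = tf.nn.max_pool(input=x, ksize=2, strides=2, padding="VALID")

# The same pooling, but the input is padded so the window covers every element.
same = tf.nn.max_pool(input=x, ksize=2, strides=2, padding="SAME")

print(valid.shape)  # (1, 2, 2, 1)
print(same.shape)   # (1, 3, 3, 1)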

Let us see how these two are different from each other.

VALID

When padding == "VALID", the input image is not padded, so the filter window always stays inside the input image. This type of padding is called valid because only the valid, original elements of the input image are considered. With "VALID" padding there can be a loss of information: elements on the right and the bottom of the image tend to be ignored, and how many are ignored depends on the size of the kernel and the stride.

In this case, the size of the output image is at most the size of the input image.

If padding == "VALID":

output_spatial_shape[i] = ceil((input_spatial_shape[i] - (spatial_filter_shape[i]-1) * dilation_rate[i]) / strides[i])
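To make the formula concrete, here is a small helper that applies it along one spatial dimension (a sketch; the function name is ours, not part of TensorFlow):

import math

def valid_output_size(input_size, filter_size, stride, dilation=1):
    # Output size along one spatial dimension for padding="VALID".
    return math.ceil((input_size - (filter_size - 1) * dilation) / stride)

print(valid_output_size(5, 3, 2))   # 2
print(valid_output_size(13, 6, 5))  # 2 -- the last two columns of the input are never visited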

SAME


When padding == "SAME", the input is "half" padded: roughly half the filter size is added on each side. The padding type is called SAME because, when stride = 1, the output size is the same as the input size. Using "SAME" ensures that the filter is applied to all the elements of the input. Padding is commonly set to "SAME" while training a model, since the resulting output size is mathematically convenient for further computation.

If padding == "SAME":

output_spatial_shape[i] = ceil(input_spatial_shape[i] / strides[i])
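The corresponding helper for "SAME" (again a sketch, with a name of our own) depends only on the input size and the stride:

import math

def same_output_size(input_size, stride):
    # Output size along one spatial dimension for padding="SAME".
    return math.ceil(input_size / stride)

print(same_output_size(5, 1))  # 5 -- the output size equals the input size when stride=1
print(same_output_size(5, 2))  # 3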

Visualize padding with an example

Finally, let's visualize the effect both padding options have on our convolutions:

[Animation: a filter sliding over an input image with padding="VALID" and with padding="SAME"] (Image Source)

The first figure in the animation above shows padding="VALID", where the filter stays within the bounds of the input image.

So if, for instance, we have a 5x5 input image, a 3x3 filter, and a stride of 2, the VALID formula gives ceil((5 - (3 - 1)) / 2) = ceil(3 / 2) = 2, i.e. an output image of size 2x2.

The second figure shows padding="SAME", where the size of the output image is equal to that of the input image (stride=1).

So if we have the same 5x5 input image and a stride of 1, the SAME formula gives ceil(5 / 1) = 5, i.e. an output image of size 5x5.
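Both results can be verified directly in TensorFlow; a minimal sketch using a 3x3 window, with the strides matching the two cases above:

import tensorflow as tf

# A 5x5 single-channel image in NHWC layout: (batch, height, width, channels).
image = tf.reshape(tf.range(25, dtype=tf.float32), [1, 5, 5, 1])

# "VALID" with a 3x3 window and stride 2: the window stays inside the image.
valid = tf.nn.max_pool(image, ksize=3, strides=2, padding="VALID")
print(valid.shape)  # (1, 2, 2, 1)

# "SAME" with a 3x3 window and stride 1: the output size matches the input.
same = tf.nn.max_pool(image, ksize=3, strides=1, padding="SAME")
print(same.shape)   # (1, 5, 5, 1)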