How to Handle Images of Different Sizes in a Convolutional Neural Network
Datasets come in all shapes and sizes. CNNs don't like that. Here's how to make it work (with code).
Problem
Convolutional neural networks require identical image sizes to work properly. Of course, in the real world, images are often not uniform. So how exactly do we solve that problem? Can we?
Solution
We can! In fact, we can do this in a number of different ways. Most of the techniques can be grouped into two broad classes of solutions, namely: transformations and inherent network properties.
Let's go through them one by one, with a simple implementation (wherever possible) in TensorFlow 2.x. We will use the TF Flowers dataset because it has variable-sized images, and we also get to spend this post looking at soothing pictures of flowers. That's a win-win.

Fig 1: The variable-sized flower dataset
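To see the variation for ourselves, we can load a handful of samples and print their shapes. Here's a minimal sketch (the exact shapes printed will depend on which samples you get):

import tensorflow_datasets as tfds

# load a few samples and inspect their shapes
sample_ds = tfds.load("tf_flowers", split="train[:5]", as_supervised=True)
for image, label in sample_ds:
    print(image.shape)  # e.g. (333, 500, 3), (240, 320, 3), ... no two alike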
Transformation based techniques
In the case of variable-sized images, we can apply affine transformations to bring every image to a common size. Some of these transformations are:
- Resize - Resize every image to the same target size. We can easily implement this using a tf.data input pipeline, as shown below.
import tensorflow as tf
import tensorflow_datasets as tfds

AUTO = tf.data.experimental.AUTOTUNE
BATCH_SIZE = 256

# load the flower dataset with an 85/15 train/validation split
train_ds, validation_ds = tfds.load(
    "tf_flowers",
    split=["train[:85%]", "train[85%:]"],
    as_supervised=True,
)

@tf.function
def scale_resize_image(image, label):
    image = tf.image.convert_image_dtype(image, tf.float32)  # equivalent to dividing image pixels by 255
    image = tf.image.resize(image, (224, 224))  # resize the image to the 224x224 dimension
    return (image, label)

training_ds = (
    train_ds
    .map(scale_resize_image, num_parallel_calls=AUTO)
    .batch(BATCH_SIZE)
    .prefetch(AUTO)
)
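As a quick sanity check (assuming the pipeline above), pulling one batch out of training_ds confirms that every image now shares the same shape:

images, labels = next(iter(training_ds))
print(images.shape)  # (256, 224, 224, 3)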
Here are our resized images (with a few additional examples as well)

Fig 2: Variable-sized images resized to 224x224.
- Crop - We can also take random crops of the images and resize the crops to the same size, which doubles as a stronger form of data augmentation. This, too, is easy to implement with a tf.data input pipeline.
AUTO = tf.data.experimental.AUTOTUNE
BATCH_SIZE = 256

@tf.function
def random_crop(images, labels):
    batch_size = tf.shape(images)[0]
    # one random crop box [y1, x1, y2, x2] in normalized coordinates per image
    boxes = tf.random.uniform(shape=(batch_size, 4))
    # assign each box to a random image in the batch; using the actual batch
    # size (not BATCH_SIZE) keeps the indices valid for a smaller final batch
    box_indices = tf.random.uniform(
        shape=(batch_size,), minval=0, maxval=batch_size, dtype=tf.int32
    )
    images = tf.image.crop_and_resize(images, boxes, box_indices, (224, 224))
    return images, labels

# tf.image.crop_and_resize expects a uniformly shaped batch, so we reuse
# scale_resize_image from the previous snippet before batching
trainloader = (
    train_ds
    .map(scale_resize_image, num_parallel_calls=AUTO)
    .batch(BATCH_SIZE)
    .map(random_crop, num_parallel_calls=AUTO)
    .prefetch(AUTO)
)

Fig 3: Variable-sized images randomly cropped and resized to 224x224.
Inherent network properties
You can also look into networks with inherent properties that make them immune to the size of the input. Examples:
- Fully convolutional networks (FCNs) have no limitation on input size at all because, once the kernel sizes and strides are fixed, each convolutional layer simply produces an output whose spatial dimensions scale with those of its input.
- We can use global average pooling or global max pooling to reduce the feature maps from a shape of (N, H, W, C) (before the global pool) to a shape of (N, 1, 1, C) (after the global pool), where N = number of mini-batch samples, H = spatial height of the feature map, W = spatial width of the feature map, and C = number of feature maps (channels). This makes the output shape, (N, C) after flattening, independent of the spatial size (H, W) of the feature maps. For classification, you can then use a fully connected layer on top to get the logits for your classes, as shown in the sketch below.
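Here is a minimal sketch of that second idea, assuming a toy classifier for the five flower classes (the layer widths are illustrative, not a recommendation). Declaring the spatial dimensions as None lets the model accept any image size, and global average pooling collapses whatever spatial grid comes out of the convolutions:

import tensorflow as tf

# spatial dimensions are None, so inputs of any height/width are accepted
inputs = tf.keras.Input(shape=(None, None, 3))
x = tf.keras.layers.Conv2D(32, 3, activation="relu")(inputs)
x = tf.keras.layers.Conv2D(64, 3, activation="relu")(x)
# collapses (N, H, W, 64) to (N, 64), independent of H and W
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(5)(x)  # logits for the 5 flower classes
model = tf.keras.Model(inputs, outputs)

# the same model handles two differently sized inputs
print(model(tf.random.normal((1, 224, 224, 3))).shape)  # (1, 5)
print(model(tf.random.normal((1, 160, 192, 3))).shape)  # (1, 5)

One caveat: images within a single batch still have to share a shape, so in practice you train with uniformly sized batches (or a batch size of 1) while letting the size vary across batches.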
Conclusion
Though CNNs generally require uniform image sizes, there are a few fairly easy workarounds that let you take a dataset full of differently sized pictures and still run ML projects with that data. Broadly, you'll want to either transform your images into an identically sized dataset (by resizing or cropping) or leverage FCNs and global average/max pooling. It's a small additional step, but it makes dealing with messy real-world data actually possible.
Weights & Biases
Weights & Biases helps you keep track of your machine learning experiments. Use our tool to log hyperparameters and output metrics from your runs, then visualize and compare results and quickly share findings with your colleagues.