Use GPUs With Keras

A short tutorial on using GPUs for your deep learning models with Keras. Made by Lavanya Shukla using Weights & Biases

Introduction

This tutorial walks you through the TensorFlow/Keras APIs that give you more control over your GPUs. We will show you how to check GPU availability, train a model on a GPU, enable memory growth, and restrict TensorFlow to a subset of GPU memory.

We'll use Weights and Biases to automatically log all our GPU and CPU utilization metrics, which makes it easy to monitor our compute resource usage as we train a plethora of models.

Run example in colab →

1. Check GPU availability

The easiest way to check if you have access to GPUs is to call tf.config.experimental.list_physical_devices('GPU'). This will return a list of the GPU devices TensorFlow can see.

>>> print('GPU name: ', tf.config.experimental.list_physical_devices('GPU'))

GPU name:  [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

2. Use a GPU for model training with Keras

If a TensorFlow operation has both CPU and GPU implementations, the GPU is used by default. So we don't need to change anything about our training pipeline to use a GPU.
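To make this concrete, here is a minimal sketch. The model, layer sizes, and random data are placeholders for illustration, not from the article; the point is that the same fit call runs unchanged whether or not a GPU is present.

```python
import numpy as np
import tensorflow as tf

# A tiny illustrative model; sizes are arbitrary placeholders.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(10,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

# Random toy data stands in for a real dataset.
x = np.random.rand(64, 10).astype("float32")
y = np.random.rand(64, 1).astype("float32")

# If a GPU is visible, TensorFlow places these ops on it automatically;
# on a CPU-only machine the exact same code still runs.
history = model.fit(x, y, epochs=1, batch_size=16, verbose=0)
print(history.history["loss"])
```

No device-placement code is needed anywhere; TensorFlow handles it.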

3. Monitor your GPU usage

If you are tracking your models using Weights & Biases, all your system metrics, including GPU utilization, will be automatically logged every 2 seconds. Some of the most important metrics logged are GPU memory allocated, GPU utilization, and CPU utilization. You can see the full list of metrics logged here.
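Turning this on takes just a call to wandb.init. A minimal sketch follows; the project name is a placeholder, and WandbCallback is wandb's standard Keras callback for logging training metrics alongside the system metrics.

```python
def train_with_wandb(model, x, y):
    """Train a compiled Keras model while W&B records metrics.

    From wandb.init onward, W&B samples system metrics (GPU/CPU
    utilization, GPU memory, ...) in the background automatically.
    """
    import wandb
    from wandb.keras import WandbCallback

    wandb.init(project="keras-gpu-demo")  # placeholder project name
    model.fit(x, y, epochs=1, callbacks=[WandbCallback()])
    wandb.finish()
```

Everything else about training stays the same; the system metrics come for free with the run.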

You can see a sample of these system metrics, automatically logged by W&B while training a model, below:

[W&B system metrics panel]

A guide to interpreting your system metrics

4. Memory Growth for GPU

By default, TensorFlow allocates nearly all of a GPU's memory up front. But at times, we need finer-grained control over GPU memory. For these cases, we can turn on memory growth by calling tf.config.experimental.set_memory_growth. This method allocates only the GPU memory actually needed for runtime allocations: it starts out by allocating a small amount of memory, then, as the model trains and more GPU memory is needed, the allocation is extended.

If you're curious, you can learn more about memory growth here.

Run example in colab →

# Ref: https://www.tensorflow.org/guide/gpu
import tensorflow as tf

gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
  try:
    # Currently, memory growth needs to be the same across GPUs
    for gpu in gpus:
      tf.config.experimental.set_memory_growth(gpu, True)
    logical_gpus = tf.config.experimental.list_logical_devices('GPU')
    print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
  except RuntimeError as e:
    # Memory growth must be set before GPUs have been initialized
    print(e)

Let's observe the effect memory growth has. We can see that a small amount of GPU memory was allocated at the start of the runtime, and that it grew substantially during training as the memory needs of the model training process grew.

[W&B panel: GPU memory allocated over time]
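The introduction also mentioned using only a subset of GPU memory. One way to do that, as an alternative to memory growth, is to create a virtual device with an explicit memory cap. This is a hedged sketch based on the same TensorFlow GPU guide referenced above; the 1024 MB limit is an arbitrary example value.

```python
import tensorflow as tf

gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    try:
        # Cap TensorFlow at 1024 MB on the first GPU (example value).
        tf.config.experimental.set_virtual_device_configuration(
            gpus[0],
            [tf.config.experimental.VirtualDeviceConfiguration(
                memory_limit=1024)])
    except RuntimeError as e:
        # Virtual devices must be set before GPUs have been initialized
        print(e)
```

Like memory growth, this must run before any operation touches the GPU; on a CPU-only machine the block is simply a no-op.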

Summary

In this article, you saw how you can leverage GPUs for your deep learning research using Keras, and use Weights & Biases to monitor your resource consumption. Check out this great article by Lambda Labs on tracking system resource utilization during training with Weights & Biases.

Weights & Biases

Weights & Biases helps you keep track of your machine learning experiments. Use our tool to log hyperparameters and output metrics from your runs, then visualize and compare results and quickly share findings with your colleagues.

Get started in 5 minutes.