Using GPUs With Keras: A Tutorial With Code
This tutorial covers how to use GPUs for your deep learning models with Keras, from checking GPU availability right through to logging and monitoring usage.
Table of Contents
An Introduction To Using Your GPU With Keras
Checking Your GPU Availability With Keras
Using Your GPU For Model Training With Keras
Monitoring Your GPU Usage
Interpreting Your System Metrics
Memory Growth For GPU Allocation With Keras
Summary
Weights & Biases
Recommended Reading
An Introduction To Using Your GPU With Keras
This tutorial walks you through the APIs that give you more control over your GPU when training Keras models. We will show you how to check GPU availability, change the default memory allocation for GPUs, turn on memory growth, and use only a subset of GPU memory.
We'll use Weights & Biases to automatically log all our GPU and CPU utilization metrics, which makes it easy to monitor our compute resource usage as we train a plethora of models.
If you'd like to follow along, here's a helpful Colab to assist:
Try it on Colab Notebook
Checking Your GPU Availability With Keras
The easiest way to check if you have access to GPUs is to call tf.config.experimental.list_physical_devices('GPU').
This will return a list of the GPU devices visible to TensorFlow.
>>> print('GPU name: ', tf.config.experimental.list_physical_devices('GPU'))
GPU name: [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
Using Your GPU For Model Training With Keras
If a TensorFlow operation has both CPU and GPU implementations, the GPU will be used by default. So we don't need to change anything about our training pipeline to use a GPU.
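As a minimal sketch (the model and data below are placeholders, not from the original article), you can confirm that operations land on the GPU by enabling device placement logging before building your model:

import tensorflow as tf

# Log which device (CPU or GPU) each operation is placed on.
tf.debugging.set_log_device_placement(True)

# A tiny placeholder model -- any Keras model behaves the same way.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(32,)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

# With a GPU available, the dense layers' matmuls run on /device:GPU:0
# without any extra configuration.
x = tf.random.normal((256, 32))
y = tf.random.normal((256, 1))
model.fit(x, y, epochs=1, batch_size=64)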
Monitoring Your GPU Usage
If you are tracking your models using Weights & Biases, all your system metrics, including GPU utilization, will be automatically logged every 2 seconds. Some of the most important metrics logged are GPU memory allocated, GPU utilization, and CPU utilization. You can see the full list of logged metrics here.
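For reference, here is a minimal sketch of what that tracking setup can look like (the project name, model, and data variables are stand-ins, not from the original article):

import wandb
from wandb.keras import WandbCallback

# Start a run; system metrics (GPU/CPU utilization, memory, etc.) are
# collected in the background for the lifetime of the run.
wandb.init(project="keras-gpu-demo")  # hypothetical project name

# `model`, `x_train`, `y_train`, `x_val`, `y_val` are assumed to exist.
model.fit(x_train, y_train,
          epochs=5,
          validation_data=(x_val, y_val),
          callbacks=[WandbCallback()])

wandb.finish()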
You can see a sample of these system metrics automatically logged by W&B while training a model below:
[W&B panel: system metrics logged during the training run]
Interpreting Your System Metrics
- CPU Utilization: This metric shows CPU utilization during training. We can see that ~44% of the CPU is used, mostly while scaling the images to the [0, 1] range. CPUs are fully utilized for operations like data augmentation.
- Disk I/O Utilization: This metric shows disk utilization. Since our Cats vs. Dogs dataset is not loaded into memory (given its size), the dataloader needs to fetch it from disk, so we see constant disk usage throughout training.
- GPU Utilization: This is probably the most important metric, as it tracks the percentage of time during which one or more operations were executing on the GPU. Ideally we want this to be 100%. In our case, GPU usage is around 97%.
- GPU Accessing Memory: This measures the percentage of time during which GPU memory was being read or written. We want this metric to be as low as possible, since we want the GPU to spend its time computing on data rather than accessing memory. Our GPU memory access time is around 45%.
- GPU Memory Allocated: This is the amount of GPU memory allocated. By default, TensorFlow allocates all of the available GPU memory. In our case, around 73% is allocated (see the snippet after this list for one way to query this from TensorFlow directly).
- GPU Temperature: If you are running on your own hardware, this metric helps you spot overheating and thermal throttling.
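As a small, hedged sketch (relying on tf.config.experimental.get_memory_info, which is available in recent TensorFlow releases), you can also query current and peak GPU memory usage from inside your own code:

import tensorflow as tf

# Requires a visible GPU and a TensorFlow release that provides
# tf.config.experimental.get_memory_info.
if tf.config.experimental.list_physical_devices('GPU'):
    info = tf.config.experimental.get_memory_info('GPU:0')
    # Values are reported in bytes; convert to MiB for readability.
    print(f"Current GPU memory usage: {info['current'] / 1024**2:.1f} MiB")
    print(f"Peak GPU memory usage:    {info['peak'] / 1024**2:.1f} MiB")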
Memory Growth For GPU Allocation With Keras
By default, Keras (via TensorFlow) allocates all of the memory of a GPU. But at times we need finer-grained control over GPU memory. For these cases, we can turn on memory growth by calling tf.config.experimental.set_memory_growth.
This option makes TensorFlow allocate only as much GPU memory as it actually needs at runtime: it starts by allocating a small amount of memory and extends the allocation as the model's needs grow during training.
I've also created a Colab to assist in this section of the tutorial:
Try it on Colab Notebook
# Ref: https://www.tensorflow.org/guide/gpu
import tensorflow as tf

gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    try:
        # Currently, memory growth needs to be the same across GPUs
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
        logical_gpus = tf.config.experimental.list_logical_devices('GPU')
        print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
    except RuntimeError as e:
        # Memory growth must be set before GPUs have been initialized
        print(e)
Let's observe the effect enabling memory growth has. We can see that a small amount of GPU memory was allocated at the start of the runtime, and that the allocation grew substantially during training as the needs of the model training process grew.
[W&B panel: GPU memory allocation with memory growth enabled]
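The intro also mentioned restricting TensorFlow to a subset of GPU memory. Here is a hedged sketch of one way to do this, following the same TensorFlow GPU guide referenced above (the 1024 MB limit is an arbitrary example value):

import tensorflow as tf

gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    try:
        # Cap the first GPU at 1024 MB by creating a single logical device
        # with an explicit memory limit (example value, adjust as needed).
        tf.config.experimental.set_virtual_device_configuration(
            gpus[0],
            [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=1024)])
        logical_gpus = tf.config.experimental.list_logical_devices('GPU')
        print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
    except RuntimeError as e:
        # Virtual devices must be set before GPUs have been initialized
        print(e)

Like memory growth, this must be configured before the GPU is initialized, so it belongs at the very top of your training script.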
Summary
In this article, you saw how you can leverage GPUs for your deep learning research using Keras, and how to use Weights & Biases to monitor your resource consumption. Check out this great article by Lambda Labs on tracking system resource utilization during training with Weights & Biases.
Weights & Biases
Weights & Biases helps you keep track of your machine learning experiments. Use our tool to log hyperparameters and output metrics from your runs, then visualize and compare results and quickly share findings with your colleagues.
Recommended Reading
Setting Up TensorFlow And PyTorch Using GPU On Docker
A short tutorial on setting up TensorFlow and PyTorch deep learning models on GPUs using Docker.
How to Compare Keras Optimizers in Tensorflow for Deep Learning
A short tutorial outlining how to compare Keras optimizers for your deep learning pipelines in Tensorflow, with a Colab to help you follow along.
LSTM RNN in Keras: Examples of One-to-Many, Many-to-One & Many-to-Many
In this report, I explain long short-term memory (LSTM) recurrent neural networks (RNN) and how to build them with Keras. Covering One-to-Many, Many-to-One & Many-to-Many.
Optimizing Models with Post-Training Quantization in Keras - Part I
Performing Facial Keypoints Detection with Post-Training Quantization in Keras