TensorBoard with Accelerators - A Guide

How to effortlessly integrate W&B into pre-existing accelerator-based workflows (both GPU and TPU) using TensorBoard. Made by Saurav Maheshkar using W&B

Introduction

In this report, we'll walk through a quick example of how you can take advantage of wandb dashboards while using TensorBoard. You'll find the relevant code and instructions below. Let's get going:

Quick Start Colab →

Link to the GitHub Repository →

1️⃣ Accelerator Configuration

Use this code snippet to set up a tf.distribute strategy based on your accelerator (GPU or TPU). The tf.distribute module lets you run computation across multiple devices, with a choice of several strategies. Create a distribution strategy for either GPUs or TPUs, and wandb automatically logs system metrics such as GPU power usage, GPU memory allocated, and GPU temperature.
```python
import tensorflow as tf

if DEVICE == "TPU":
    print("connecting to TPU...")
    try:
        tpu = tf.distribute.cluster_resolver.TPUClusterResolver()
        print("Running on TPU ", tpu.master())
    except ValueError:
        print("Could not connect to TPU")
        tpu = None

    if tpu:
        try:
            print("initializing TPU...")
            tf.config.experimental_connect_to_cluster(tpu)
            tf.tpu.experimental.initialize_tpu_system(tpu)
            strategy = tf.distribute.experimental.TPUStrategy(tpu)
            print("TPU initialized")
        except Exception:
            print("failed to initialize TPU")
    else:
        DEVICE = "GPU"

if DEVICE == "GPU":
    n_gpu = len(tf.config.experimental.list_physical_devices("GPU"))
    print("Num GPUs Available: ", n_gpu)
    if n_gpu > 1:
        print("Using strategy for multiple GPUs")
        strategy = tf.distribute.MirroredStrategy()
    else:
        print("Standard strategy for GPU...")
        strategy = tf.distribute.get_strategy()

AUTO = tf.data.experimental.AUTOTUNE
REPLICAS = strategy.num_replicas_in_sync
```
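Once the strategy exists, the usual pattern is to build and compile your model inside strategy.scope() so its variables are mirrored across all replicas, and to scale the global batch size by REPLICAS. Here's a minimal sketch of that pattern (the layer sizes and base batch size are placeholder assumptions, not part of the original notebook):

```python
# Scale the global batch size by the number of replicas in sync.
BATCH_SIZE = 16 * REPLICAS  # 16 per replica is an arbitrary placeholder

with strategy.scope():
    # Variables created here are distributed across all replicas.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10),
    ])
    model.compile(
        optimizer="adam",
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=["accuracy"],
    )
```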

2️⃣ Sync TensorBoard and wandb

W&B automatically logs metrics from TensorBoard into its dashboards. If you've got a pre-existing training workflow built around TensorBoard, switching to wandb requires just two lines of code:
1. wandb.tensorboard.patch(root_logdir="...")
2. run = wandb.init(..., sync_tensorboard=True)
The W&B client will then automatically log all the metrics generated during training from the logs directory into its dashboard, allowing you to create reports and share them with colleagues. For example:
```python
import tensorflow as tf
import wandb

# 1. Point the wandb client to TensorBoard's log directory 👈👈
wandb.tensorboard.patch(root_logdir="/content/logs")

# 2. Create a wandb run to log all your metrics 👈👈
run = wandb.init(project='...', entity='...', sync_tensorboard=True)

# Just 3 lines of code to set up and run training on GPUs
tf.debugging.set_log_device_placement(True)
gpus = tf.config.list_logical_devices('GPU')
strategy = tf.distribute.MirroredStrategy(gpus)

with strategy.scope():
    train_model()

run.finish()
```
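In the snippet above, train_model() stands in for your existing training function; the only requirement is that training writes TensorBoard event files under the directory passed to root_logdir. A hypothetical sketch using the Keras TensorBoard callback (the dataset, model, and epoch count are illustrative assumptions):

```python
def train_model():
    # Any training loop works, as long as TensorBoard event files
    # end up under the directory wandb was pointed at ("/content/logs").
    (x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
    x_train = x_train.reshape(-1, 784).astype("float32") / 255.0

    model = tf.keras.Sequential([
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10),
    ])
    model.compile(
        optimizer="adam",
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=["accuracy"],
    )

    # The TensorBoard callback writes the event files that wandb syncs.
    tb_callback = tf.keras.callbacks.TensorBoard(log_dir="/content/logs")
    model.fit(x_train, y_train, epochs=2, callbacks=[tb_callback])
```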

Visualize with Dashboards

Now that you've got your metrics logged using TensorBoard, you can use the full suite of wandb tools (Reports (like this one), Artifacts, Tables, and more). Our docs will walk you through anything you need to know.
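For instance, in addition to the synced TensorBoard metrics, you can log your own rich media to the same run. A minimal sketch logging a wandb.Table of per-epoch results (the column names and values are purely illustrative):

```python
import wandb

run = wandb.init(project='...', entity='...')

# Build a small table of (hypothetical) per-epoch results and log it to the run.
table = wandb.Table(columns=["epoch", "loss", "accuracy"])
table.add_data(1, 0.42, 0.88)
table.add_data(2, 0.31, 0.91)
run.log({"training_summary": table})

run.finish()
```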

Conclusion

And that wraps up our short tutorial on integrating wandb with your current workflow. To see the full suite of wandb features, please check out this short 5-minute guide.