Tensorboard With Accelerators: A Guide
In this article, we explore how to effortlessly integrate Weights & Biases into pre-existing accelerator-based (GPU and TPU) workflows that use TensorBoard.
In this article, we'll walk through a quick example of how you can take advantage of W&B dashboards while using TensorBoard. You'll find the relevant code and instructions below.
Here's what we're covering:
Table of Contents

1️⃣ Accelerator Configuration
2️⃣ Sync TensorBoard and W&B
Visualize With Dashboards
Conclusion
Let's get going!
Quick Start Colab
Link to the GitHub Repository
1️⃣ Accelerator Configuration
Use the code snippet below to set up a tf.distribute strategy based on your accelerator (GPU or TPU). The tf.distribute module lets you run computation across multiple devices by choosing among several strategies. Just create a distribution strategy for either GPUs or TPUs, and W&B will automatically log system metrics like GPU power usage, GPU memory allocated, GPU temperature, and more!
if DEVICE == "TPU":print("connecting to TPU...")try:tpu = tf.distribute.cluster_resolver.TPUClusterResolver()print('Running on TPU ', tpu.master())except ValueError:print("Could not connect to TPU")tpu = Noneif tpu:try:print("initializing TPU ...")tf.config.experimental_connect_to_cluster(tpu)tf.tpu.experimental.initialize_tpu_system(tpu)strategy = tf.distribute.experimental.TPUStrategy(tpu)print("TPU initialized")except _:print("failed to initialize TPU")else:DEVICE = "GPU"if DEVICE == "GPU":n_gpu = len(tf.config.experimental.list_physical_devices('GPU'))print("Num GPUs Available: ", n_gpu)if n_gpu > 1:print("Using strategy for multiple GPU")strategy = tf.distribute.MirroredStrategy()else:print('Standard strategy for GPU...')strategy = tf.distribute.get_strategy()AUTO = tf.data.experimental.AUTOTUNEREPLICAS = strategy.num_replicas_in_sync
2️⃣ Sync TensorBoard and W&B
W&B automatically logs metrics from TensorBoard into dashboards. If you've got a pre-existing training workflow based around Tensorboard, switching to W&B requires just two lines of code.
1. wandb.tensorboard.patch(root_logdir="...")
2. run = wandb.init(..., sync_tensorboard=True)
W&B client will then automatically log all the metrics generated during training from the logs directory into its amazing dashboard, thereby allowing you to create reports and share them with colleagues. For example,
import tensorflow as tf
import wandb

# 1. Point the wandb client to TensorBoard's log directory 👈👈
wandb.tensorboard.patch(root_logdir="/content/logs")

# 2. Create a wandb run to log all your metrics 👈👈
run = wandb.init(project='...', entity='...', sync_tensorboard=True)

# Just 3 lines of code to run training on GPUs
tf.debugging.set_log_device_placement(True)
gpus = tf.config.list_logical_devices('GPU')
strategy = tf.distribute.MirroredStrategy(gpus)

with strategy.scope():
    train_model()

run.finish()
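The train_model() call above is a placeholder for your own training code. As a hedged sketch, here's one way it might look, assuming a simple Keras setup where the tf.keras.callbacks.TensorBoard callback writes event files to the same /content/logs directory that wandb is watching:

# A hypothetical train_model(); any training loop works, as long as it
# writes TensorBoard event files under the patched root_logdir.
def train_model():
    (x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
    x_train = x_train / 255.0

    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10),
    ])
    model.compile(
        optimizer="adam",
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=["accuracy"],
    )

    # The TensorBoard callback writes the logs that wandb syncs automatically.
    tensorboard_cb = tf.keras.callbacks.TensorBoard(log_dir="/content/logs")
    model.fit(x_train, y_train, epochs=3, callbacks=[tensorboard_cb])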
Visualize With Dashboards
Now that you've got your metrics logged via TensorBoard, you can use the full suite of W&B tools: Reports (like this one), Artifacts, Tables, and more. Our docs will walk you through anything you need to know.
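As a taste of what that looks like in code, here's a small sketch of logging a Table and versioning a file as an Artifact on a run; the table contents and the model.keras file name are hypothetical stand-ins for your own outputs:

import wandb

run = wandb.init(project='...', entity='...', sync_tensorboard=True)

# Log a small Table of predictions next to the synced TensorBoard metrics.
table = wandb.Table(columns=["example", "prediction", "label"])
table.add_data("sample_0", 3, 3)
run.log({"predictions": table})

# Version a saved model file as an Artifact (file name is hypothetical).
artifact = wandb.Artifact("trained-model", type="model")
artifact.add_file("model.keras")
run.log_artifact(artifact)

run.finish()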
Conclusion
And that wraps up our short tutorial on integrating W&B into your current workflow. To see the full suite of W&B features, please check out this short 5-minute guide.