
Digging Into KITTI With Weights & Biases and Pytorch-Lightning

In this article, we perform semantic segmentation on the KITTI dataset with Pytorch-Lightning and Weights & Biases
Created on March 29 | Last edited on October 10
This is a simple demo for performing semantic segmentation on the KITTI dataset using Pytorch-Lightning and optimizing the neural network by monitoring and comparing runs with Weights & Biases.
Pytorch-Lightning includes a logger for W&B that can be called simply with:
from pytorch_lightning.loggers import WandbLogger
from pytorch_lightning import Trainer

wandb_logger = WandbLogger()
trainer = Trainer(logger=wandb_logger)
Refer to the documentation for more details. Hyperparameters can be defined manually, and every run is automatically logged to Weights & Biases for easier analysis and interpretation of results, which helps in deciding how to optimize the architecture.
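As a small sketch (the specific names and values below are illustrative, not taken from the repository), hyperparameters can be attached to a run either from inside the LightningModule via self.save_hyperparameters() or directly through the logger:
from pytorch_lightning.loggers import WandbLogger

# Create the logger; "lightning-kitti" is an assumed project name.
wandb_logger = WandbLogger(project="lightning-kitti")
# Log hyperparameters explicitly so they show up in the run's config panel.
wandb_logger.log_hyperparams({"lr": 1e-3, "batch_size": 4, "num_layers": 4})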
You can also run sweeps to automatically optimize hyperparameters.

See full code on Github →

Usage

  1. Install dependencies through requirements.txt, Pipfile, or manually (Pytorch, Pytorch-Lightning & Wandb)
  2. Log in or sign up for an account -> wandb login
  3. Run python train.py (a minimal sketch of such a script follows this list)
  4. Visualize and compare your runs through the generated link. It'll log your model performance, gradients, and any other metrics you choose.
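As a rough sketch of what train.py does, assuming hypothetical SegModel and KittiDataModule classes (the actual module and argument names in the repository may differ):
from pytorch_lightning import Trainer
from pytorch_lightning.loggers import WandbLogger

# Hypothetical imports standing in for the repository's own model and data modules.
from model import SegModel
from data import KittiDataModule


def main():
    model = SegModel(lr=1e-3, num_layers=4)      # illustrative hyperparameters
    datamodule = KittiDataModule(batch_size=4)   # wraps the KITTI train/val dataloaders

    wandb_logger = WandbLogger(project="lightning-kitti")
    trainer = Trainer(logger=wandb_logger, max_epochs=20)
    trainer.fit(model, datamodule=datamodule)


if __name__ == "__main__":
    main()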



[Embedded panel: Sweep lrqc8fu3]



Hyperparameter Optimization with Sweeps

With Sweeps, you can automate hyperparameter optimization and explore the space of possible models.
  1. Run wandb sweep sweep.yaml
  2. Run wandb agent <sweep_id> where <sweep_id> is given by previous command
  3. Visualize and compare the sweep runs.
After running the script a few times, you will quickly be able to compare a large number of hyperparameter combinations. Feel free to modify the script and define your own hyperparameters.
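The search space itself lives in sweep.yaml. As a sketch of the idea (parameter names and ranges below are assumptions, not the repository's actual configuration), an equivalent sweep can also be defined and launched from Python:
import wandb

# Illustrative sweep definition; the real sweep.yaml may use different keys, methods and ranges.
sweep_config = {
    "method": "random",
    "metric": {"name": "val_loss", "goal": "minimize"},
    "parameters": {
        "lr": {"min": 1e-4, "max": 1e-2},
        "num_layers": {"values": [2, 3, 4, 5]},
        "features_start": {"values": [16, 32, 64]},
        "batch_size": {"values": [2, 4, 8]},
        "grad_batches": {"values": [1, 2, 4]},
    },
}

sweep_id = wandb.sweep(sweep_config, project="lightning-kitti")
# An agent repeatedly calls a training function (here a hypothetical `train`)
# with values sampled from the space above, available inside the run as wandb.config.
# wandb.agent(sweep_id, function=train, count=20)
This mirrors what wandb sweep sweep.yaml and wandb agent <sweep_id> do from the command line.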

See full code on Github →





Pytorch-Lightning lets us use Pytorch-based code and easily add extra features such as distributed training over several GPUs and machines, half-precision training, and gradient accumulation.
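For instance, most of these features are just Trainer arguments (the flag names below follow the Pytorch-Lightning 1.x API and may differ in other versions):
from pytorch_lightning import Trainer

trainer = Trainer(
    gpus=2,                      # train on 2 GPUs on this machine
    num_nodes=1,                 # increase to distribute across several machines
    precision=16,                # half-precision (mixed) training
    accumulate_grad_batches=4,   # accumulate gradients over 4 batches before each update
    max_epochs=20,
)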
In this example, we optimize the following hyperparameters (a sketch of how they surface in the code follows the list):
  • u-net -> number of layers, number of features per layer, up-sampling method
  • dataloader -> batch size
  • optimizer -> learning rate, gradient accumulation
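Here is a sketch of how these hyperparameters might surface in the LightningModule (class and argument names are assumptions; the placeholder layer stands in for the actual U-Net, and batch size would be passed to the dataloaders instead):
import torch
from torch import nn
import pytorch_lightning as pl


class SegModel(pl.LightningModule):
    def __init__(self, num_layers=4, features_start=64, bilinear=False, lr=1e-3):
        super().__init__()
        # Record the constructor arguments so they are logged to W&B automatically.
        self.save_hyperparameters()
        # Placeholder standing in for a U-Net built from num_layers/features_start/bilinear.
        self.net = nn.Conv2d(3, features_start, kernel_size=3, padding=1)

    def forward(self, x):
        return self.net(x)

    def configure_optimizers(self):
        # The learning rate is swept; gradient accumulation is handled by the Trainer
        # (accumulate_grad_batches), not by the optimizer itself.
        return torch.optim.Adam(self.parameters(), lr=self.hparams.lr)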
It's interesting to see which combinations of parameters yield good performance (here defined by a low val_loss):
  • the number of accumulated gradient batches grad_batches cannot be too high, probably because the limited dataset size would then lead to fewer weight updates per epoch
  • the number of layers num_layers works best at 3 or 4
  • a higher number of features per layer often performs better
Note: it's important to keep in mind that we limited training to 20 epochs; deeper networks typically need more time to train.
