Summary

IceVision is the first agnostic object detection framework that connects to fastai and PyTorch Lightning, with more to come. You can write an end-to-end object detection training pipeline with SOTA models such as EfficientDet and Faster R-CNN in a few lines.

IceVision now fully supports W&B by providing a one-liner API that enables users to track their trained models and display both the predicted and ground truth bounding boxes. Check out the end-to-end training source code below.

For illustration purposes, we show an end-to-end training of the EfficientDet model with 3 different backbones, namely EfficientDet Lite 0, EfficientDet D1, and EfficientDet D3, on a toy dataset called Fridge Objects. This dataset is very fast to train on, and we use it to show how easy it is to track the performance of each backbone. In this example, we use the fastai training loop, which integrates smoothly with wandb through the WandbCallback() callback.

The figure below shows both the predicted and the ground truth boxes for the 3 backbones (Lite-0, D1, and D3). Each box label is shown with its prediction score.

Section 2

As shown below, you can easily inspect the bounding boxes and toggle between the predicted and ground truth boxes, as well as between the box labels, in order to show only specific objects.

image.png

In addition to displaying bounding boxes, you can analyze any of the metrics that you logged. The figure below shows the COCO metric for each of the 3 models that we trained. You can also plot any of the hyperparameters that are automatically logged during training.

Section 3

Tutorial

In this tutorial, we walk you through the steps of training a model on the Fridge Objects dataset. Thanks to W&B, we can easily track the performance of the EfficientDet model with 3 different backbones. In this example, we use the fastai training loop.

Installation

Run these commands in your terminal:

pip install icevision[all] 
pip install icedata 

Imports

import wandb  # used directly below for wandb.init() and wandb.log()
from icevision.all import *
from fastai.callback.wandb import *
from fastai.callback.tracker import SaveModelCallback

Dataset

The Fridge Objects dataset is a tiny dataset that contains 134 images of 4 classes (can, carton, milk bottle, water bottle). IceVision provides handy methods for loading a dataset, parsing annotations, and more.

Loading data

IceVision can be used in conjunction with icedata, which offers an intuitive API for loading data as well as parsing it.

url = "https://cvbp.blob.core.windows.net/public/datasets/object_detection/odFridgeObjects.zip"
dest_dir = "fridge"
data_dir = icedata.load_data(url, dest_dir, force_download=True)

Parser

IceVision offers a universal parsing API that makes it easy to parse a wide variety of datasets that use different annotation formats (COCO, VOC, and custom annotations).

class_map = ClassMap(["milk_bottle", "carton", "can", "water_bottle"])
parser = parsers.voc(annotations_dir=data_dir / "odFridgeObjects/annotations",
                     images_dir=data_dir / "odFridgeObjects/images",
                     class_map=class_map)
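
To make the parsing step more concrete, here is a minimal sketch of what reading one Pascal VOC annotation involves, using only the Python standard library. It is not IceVision's parser: the XML snippet, its file name and coordinates, and the `parse_voc_annotation` helper are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical VOC-style annotation; real files live in the annotations directory.
SAMPLE_XML = """
<annotation>
  <filename>1.jpg</filename>
  <size><width>499</width><height>666</height><depth>3</depth></size>
  <object>
    <name>carton</name>
    <bndbox><xmin>100</xmin><ymin>173</ymin><xmax>233</xmax><ymax>521</ymax></bndbox>
  </object>
  <object>
    <name>milk_bottle</name>
    <bndbox><xmin>247</xmin><ymin>192</ymin><xmax>355</xmax><ymax>493</ymax></bndbox>
  </object>
</annotation>
"""

def parse_voc_annotation(xml_text):
    """Return the file name, image size, and (label, box) pairs of one annotation."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    record = {
        "filename": root.findtext("filename"),
        "width": int(size.findtext("width")),
        "height": int(size.findtext("height")),
        "objects": [],
    }
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        # VOC stores boxes as pixel corners: xmin, ymin, xmax, ymax.
        coords = tuple(int(box.findtext(k)) for k in ("xmin", "ymin", "xmax", "ymax"))
        record["objects"].append((obj.findtext("name"), coords))
    return record

record = parse_voc_annotation(SAMPLE_XML)
print(record["filename"], record["objects"])
```

A parser for a whole dataset would repeat this for every annotation file and map each label name to a class id.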

Records

The records are the building blocks that prepare our data to be converted into a PyTorch dataset.

train_records, valid_records = parser.parse()
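
As a rough illustration of how records end up in a training and a validation set, the sketch below performs a seeded random split. The 80/20 ratio, the seed, and the `random_split` helper are assumptions made for this example, not a description of how `parser.parse()` is implemented.

```python
import random

def random_split(items, train_fraction=0.8, seed=42):
    """Shuffle a copy of items and cut it into (train, valid) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Stand-in record ids; the article states that the dataset has 134 images.
record_ids = list(range(134))
train_ids, valid_ids = random_split(record_ids)
print(len(train_ids), len(valid_ids))
```

Fixing the seed keeps the split identical across runs, which matters when comparing several backbones on the same validation images.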

Transforms

You can define your own transforms to apply data augmentation when creating the training dataset. The validation transforms, in contrast, only standardize the validation images. Data augmentation is a powerful technique that helps train robust models and avoid overfitting.

train_tfms = tfms.A.Adapter([*tfms.A.aug_tfms(size=384, presize=512), tfms.A.Normalize()])
valid_tfms = tfms.A.Adapter([*tfms.A.resize_and_pad(384), tfms.A.Normalize()])
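
To illustrate what a resize-and-pad validation transform does to an image and its boxes, here is a small sketch in plain Python. It assumes that the longer side is scaled to the target size and that the padding is split evenly between both sides. The `resize_and_pad_box` helper is hypothetical, and the actual Albumentations behaviour may differ in rounding and pad placement.

```python
def resize_and_pad_box(box, width, height, size=384):
    """Map an (xmin, ymin, xmax, ymax) box into a size x size padded image."""
    scale = size / max(width, height)          # keep the aspect ratio
    new_w, new_h = round(width * scale), round(height * scale)
    pad_x = (size - new_w) // 2                # left padding (assumed centered)
    pad_y = (size - new_h) // 2                # top padding (assumed centered)
    xmin, ymin, xmax, ymax = box
    return (
        xmin * scale + pad_x,
        ymin * scale + pad_y,
        xmax * scale + pad_x,
        ymax * scale + pad_y,
    )

# A 768x384 image is scaled by 0.5 to 384x192, then padded by 96 px top and bottom.
print(resize_and_pad_box((100, 50, 300, 250), width=768, height=384))
```

The point is that boxes must go through the same geometry as the pixels, otherwise the labels no longer line up with the objects.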

Datasets

Datasets are built from both the records and the transforms. The transforms are applied on the fly, each time an item is read, so no transformed copies of the images need to be stored on disk.

train_ds = Dataset(train_records, train_tfms)
valid_ds = Dataset(valid_records, valid_tfms)
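
The on-the-fly behaviour can be pictured with a minimal sketch. The `LazyDataset` class and the `double` transform below are hypothetical stand-ins and not IceVision's `Dataset`: they only show that the transform runs when an item is accessed, not when the dataset is created.

```python
class LazyDataset:
    """Keeps the raw records and applies the transform only when an item is read."""

    def __init__(self, records, transform=None):
        self.records = records
        self.transform = transform

    def __len__(self):
        return len(self.records)

    def __getitem__(self, index):
        record = self.records[index]
        # Nothing is precomputed or cached: the transform runs on every access.
        return self.transform(record) if self.transform else record

calls = []

def double(value):
    calls.append(value)       # track when the transform actually runs
    return value * 2

ds = LazyDataset([1, 2, 3], transform=double)
print(len(calls))             # the transform has not run at construction time
print(ds[1], len(calls))
```

With random augmentations, this design also means that the same record can yield a different sample each epoch.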

DataLoaders

DataLoaders group the samples of our datasets into batches that are fed to the model during training and validation.

train_dl = efficientdet.train_dl(train_ds, batch_size=16, num_workers=4, shuffle=True)
valid_dl = efficientdet.valid_dl(valid_ds, batch_size=16, num_workers=4, shuffle=False)
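
As a simplified picture of batching, the sketch below groups sample indices into batches of 16, shuffling only for training. The `make_batches` helper and the sample counts are illustrative assumptions. A real detection DataLoader additionally collates images and targets into tensors.

```python
import random

def make_batches(num_samples, batch_size, shuffle, seed=0):
    """Return lists of sample indices, one list per batch."""
    indices = list(range(num_samples))
    if shuffle:
        random.Random(seed).shuffle(indices)
    return [indices[i:i + batch_size] for i in range(0, num_samples, batch_size)]

# Illustrative sizes for a small training and validation set.
train_batches = make_batches(107, batch_size=16, shuffle=True)
valid_batches = make_batches(27, batch_size=16, shuffle=False)
print(len(train_batches), [len(b) for b in valid_batches])
```

Shuffling is kept off for validation so that the same images are evaluated in the same order every time.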

Model

IceVision supports EfficientDet, one of the state-of-the-art object detection models, with all its available backbones. We use the highly curated EfficientDet implementation created and maintained by Ross Wightman.

model = efficientdet.model('tf_efficientdet_lite0', num_classes=len(class_map), img_size=384)

Wandb

Using W&B starts with a call to wandb.init(), passing the name of your project and, optionally, a name for the run.

wandb.init(project="icevision-wandb", name="efficientdet_lite_0", reinit=True)
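
To compare several backbones, one run per backbone is a natural layout. The sketch below only builds the keyword arguments for each `wandb.init()` call and does not contact W&B. The `run_settings` helper is hypothetical, and the D1 and D3 backbone names are assumed to follow the same naming scheme as the Lite 0 name used in this tutorial.

```python
# The first name comes from this tutorial's code; the other two are assumed
# to follow the same naming scheme for EfficientDet D1 and D3.
BACKBONES = ["tf_efficientdet_lite0", "tf_efficientdet_d1", "tf_efficientdet_d3"]

def run_settings(backbone, project="icevision-wandb"):
    """Build the keyword arguments for one wandb.init() call."""
    run_name = backbone.replace("tf_", "")
    return {"project": project, "name": run_name, "reinit": True}

for backbone in BACKBONES:
    settings = run_settings(backbone)
    # In a real script: wandb.init(**settings), then build and train the model.
    print(backbone, settings)
```

Keeping all runs in the same project is what lets W&B overlay their metrics in a single chart.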

Learner

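At this stage, we create a fastai Learner object and train our model for a given number of epochs. We use the handy fine_tune() method, which trains the model in two phases: first with the pretrained layers frozen for freeze_epochs epochs, then with all layers unfrozen for the number of epochs passed as the first argument.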

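# COCO bounding-box metric; assumed to be the intended definition of `metrics`
metrics = [COCOMetric(metric_type=COCOMetricType.bbox)]

learn = efficientdet.fastai.learner(dls=[train_dl, valid_dl], model=model, metrics=metrics,
                                    cbs=[WandbCallback(log_dataset=True, log_model=True), SaveModelCallback()])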

learn.fine_tune(50, 1e-2, freeze_epochs=5)
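
The two phases can be summarized with a small sketch. It follows our reading of fastai's documented fine_tune() defaults (a frozen phase at the base learning rate, then an unfrozen phase at half that rate with a range of learning rates across layer groups). The `fine_tune_plan` helper is hypothetical and performs no training.

```python
def fine_tune_plan(epochs, base_lr, freeze_epochs=1, lr_mult=100):
    """Describe the two training phases as plain dictionaries."""
    frozen = {"phase": "frozen", "epochs": freeze_epochs, "max_lr": base_lr}
    lr = base_lr / 2   # the unfrozen phase is assumed to halve the base rate
    unfrozen = {
        "phase": "unfrozen",
        "epochs": epochs,
        "min_lr": lr / lr_mult,  # lowest rate, for the earliest layers
        "max_lr": lr,            # highest rate, for the last layers
    }
    return [frozen, unfrozen]

# Same arguments as the call above: 50 epochs, base rate 1e-2, 5 frozen epochs.
for phase in fine_tune_plan(50, 1e-2, freeze_epochs=5):
    print(phase)
```

Under these assumptions, the call above amounts to 55 epochs in total: 5 frozen plus 50 unfrozen.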

Inference

To demonstrate the W&B integration in IceVision, we build an inference DataLoader from the validation set and predict the bounding boxes of its images. The IceVision predict_dl() method returns both the samples, which carry the ground truth, and the predictions, which carry the predicted bounding boxes along with their corresponding labels.

infer_dl = efficientdet.infer_dl(valid_ds, batch_size=8)
samples, preds = efficientdet.predict_dl(model=model, infer_dl=infer_dl)
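
Predicted boxes are usually compared with ground truth boxes through their intersection over union (IoU), the quantity on which COCO-style metrics are built. The sketch below is a plain-Python illustration with made-up boxes. It does not read the `samples` and `preds` objects returned above.

```python
def iou(box_a, box_b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Two 10x10 boxes that overlap on half of their width.
ground_truth = (0, 0, 10, 10)
prediction = (5, 0, 15, 10)
print(iou(ground_truth, prediction))
```

An IoU of 1 means a perfect match and 0 means no overlap. A detection is typically counted as correct when its IoU with a ground truth box of the same class exceeds a chosen threshold.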

W&B Tracking

In IceVision, we created a one-liner API that creates images annotated with both their ground truth and predicted bounding boxes. The resulting image list is then passed to the wandb log() method, which logs those images. The same images can then be explored interactively and easily included in reports.

wandb_images = wandb_img_preds(samples, preds, class_map, add_ground_truth=True) 
wandb.log({"Predicted images": wandb_images})

# optional: mark the run as completed (wandb.finish() replaces the deprecated wandb.join())
wandb.finish()
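
For readers curious about what such logged images contain, the sketch below builds the kind of bounding-box dictionary that `wandb.Image` accepts through its `boxes` argument, as we understand the W&B documentation. It is not IceVision's implementation of `wandb_img_preds`: the `to_wandb_boxes` helper, the class ids, and the coordinates are hypothetical.

```python
def to_wandb_boxes(boxes, labels, class_names, scores=None):
    """Build one entry for the `boxes` argument of wandb.Image."""
    box_data = []
    for i, ((xmin, ymin, xmax, ymax), label) in enumerate(zip(boxes, labels)):
        item = {
            "position": {"minX": xmin, "minY": ymin, "maxX": xmax, "maxY": ymax},
            "class_id": label,
            "box_caption": class_names[label],
            "domain": "pixel",  # coordinates are pixels, not fractions
        }
        if scores is not None:
            item["box_caption"] = f"{class_names[label]} ({scores[i]:.2f})"
            item["scores"] = {"score": scores[i]}
        box_data.append(item)
    return {"box_data": box_data, "class_labels": class_names}

# Hypothetical ids and boxes, for illustration only.
class_names = {1: "milk_bottle", 2: "carton", 3: "can", 4: "water_bottle"}
payload = {
    "predictions": to_wandb_boxes([(10, 20, 110, 220)], [3], class_names, scores=[0.91]),
    "ground_truth": to_wandb_boxes([(12, 18, 108, 225)], [3], class_names),
}
# With wandb installed: wandb.Image(image_array, boxes=payload)
print(payload["predictions"]["box_data"][0]["box_caption"])
```

Logging predictions and ground truth under separate keys is what makes it possible to toggle each group on and off in the W&B interface.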

Conclusion

IceVision offers easy-to-use yet powerful end-to-end training with state-of-the-art models such as EfficientDet. W&B offers outstanding capabilities for both visualizing and tracking machine learning experiments. IceVision fully supports W&B by providing a one-liner API that enables users to track their trained models and display both the predicted and ground truth bounding boxes. As illustrated in this report, the user can easily inspect the bounding boxes, toggle between the predicted and ground truth bounding boxes, and thoroughly explore each of the hyperparameters logged by W&B.

W&B makes visualizing and tracking the performance of different models a highly enjoyable task. Indeed, we are able to monitor the performance of several EfficientDet backbones (namely EfficientDet Lite 0, EfficientDet D1, and EfficientDet D3) by changing a few lines of code, and we obtain intuitive, easy-to-interpret figures that highlight both the similarities and the differences between the backbones.

About IceVision

If you need any assistance, feel free to ask us on our forum.