
Transfer Learning: Serial versus One-time Training

Using Transfer Learning in Systematically Determining Adequacy of Dataset Size
Created on November 27 | Last edited on January 21
While there are innumerable questions you might ask yourself before you tuck into model training, today we're going to focus on two:
1. If I'm bringing my own dataset, how big should it be?
2. Should I train the model one step at a time, or is it better to accumulate all the data and train in one go?
Today, we'll do our own experiment using Colab and the PyTorch/Fastai/IceVision frameworks. The Weights & Biases platform will facilitate comparison between the training runs. And, if you're curious, you're of course welcome to use my dataset or another object detection set that you prefer!
Raw image courtesy of Dr. C. Aro


I. Installations and Imports

!wget https://raw.githubusercontent.com/airctic/icevision/master/icevision_install.sh
!bash icevision_install.sh cuda11 master
Let the above installation finish before running the next.
# Restart the Colab kernel so the freshly installed packages are picked up
import IPython
IPython.Application.instance().kernel.do_shutdown(True)
Importing the IceVision library will give you access to Torch, Torchvision, and a number of updated models including RetinaNet, EfficientDet, VFNet, YOLOv5, and YOLOX. Because this ecosystem moves quickly, the installation commands may change over time; if the installs or imports fail, check the latest installation procedure on the IceVision forum.
from icevision.all import *

II. Dataset

! git clone https://github.com/yrodriguezmd/Surgical_instruments.git
I have developed an annotated dataset covering 15 types of surgical instruments. The annotation was performed in Roboflow. We will use only a portion (n ≈ 800) for this experiment. You are welcome to use it for personal learning purposes. Please do not use the dataset for commercial objectives.

A. Classes

In creating the dataset, I chose the classes that would be of interest to the majority of surgical specialties.

One class (Mayo_metz) is an artificial designation that I made based on two classes of scissors that are easy for surgical professionals to differentiate, but which could be challenging for a present-day neural net.
💡


Some classes share similar general physical structures, especially the group Hemostat - Iris - Mayo_metz - Potts - Towel_clip. Other classes show different physical structures within the same class (Bulldog, Potts, Scalpel, Yankauer). The images represent objects at varying scales, especially for the Bulldog and Needle classes. Around 30% of the images contain objects with varying levels of overlap.

All of these present a challenge to the model, but they are important to address for real-world applications.
💡

B. Subsets

The serial learning described below used subsets of the dataset. Each subset was generated iteratively, with ~10 images per class, and split into train, validation and test sets in a 70:20:10 ratio.
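The splits for this dataset were produced during annotation in Roboflow, but if you are assembling subsets yourself, a minimal sketch of a 70:20:10 split could look like the following (the folder layout, file extension and seed are assumptions, not part of the original notebook):

import random, shutil
from pathlib import Path

def split_subset(image_dir, out_dir, ratios=(0.7, 0.2, 0.1), seed=42):
    # Shuffle the images reproducibly, then copy them into train/valid/test folders
    images = sorted(Path(image_dir).glob('*.jpg'))
    random.Random(seed).shuffle(images)
    n_train = int(len(images) * ratios[0])
    n_valid = int(len(images) * ratios[1])
    splits = {'train': images[:n_train],
              'valid': images[n_train:n_train + n_valid],
              'test':  images[n_train + n_valid:]}
    for name, files in splits.items():
        dest = Path(out_dir)/name
        dest.mkdir(parents=True, exist_ok=True)
        for f in files:
            shutil.copy(f, dest/f.name)

# split_subset('raw_images/Set_1', 'Surgical_instruments/Sets/Set_1')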

III. Serial Learning

Context: We are creating our own annotated dataset and will be using supervised learning methods for modelling.
We can develop the dataset in at least two ways: 1) iteratively, gathering, annotating and training in sequence and checking whether the model improves at each iteration, or 2) deciding up front on a dataset size we believe is big enough and training in one go.
The advantages of an iterative run are:
1) It's easier to identify bugs on a smaller set,
2) You get a feel as to how the model learns as it sees more data,
3) You get a break from the tedious annotation tasks (if you're doing it yourself), and
4) You can have a confirmation of whether or not you need more data, instead of assuming that you do.
The disadvantage is:
1) It may prove unnecessary - but you come away with conviction, and more experience in judging what will or won't work.
Let's try the iterative or serial approach:

A. Set_1

!ls Surgical_instruments/Sets/Set_1/annotated
# output: test train valid

image_path = Path('Surgical_instruments/Sets/Set_1/annotated')

img_files = get_image_files(image_path)

# Peek at one of the images
img = PIL.Image.open(img_files[100])
img = img.convert('RGB')
img.to_thumb(150,150)

classes = ['Army_navy', 'Bulldog', 'Castroviejo','Forceps', 'Frazier',
'Hemostat','Iris','Mayo_metz','Needle','Potts',
'Richardson','Scalpel','Towel_clip', 'Weitlaner','Yankauer']

class_map = ClassMap(classes)
  • The class_map will contain 16 classes: the 15 object types + background.
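As a quick sanity check (not in the original notebook), you can inspect the class map directly; the exact ordering of the classes may differ between IceVision versions:

print(len(class_map))           # 16 = 15 instrument classes + 'background'
print(class_map.get_classes())  # e.g. ['background', 'Army_navy', 'Bulldog', ...]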

1. Parsing, Transforms and Creation of Torch Dataset

(If you need an intro or refresher, refer to Parsing and Transforms).
path = Path('Surgical_instruments/Sets/Set_1/annotated')

train_parser = parsers.COCOBBoxParser(
annotations_filepath = path/'train/_annotations.coco.json',
img_dir = path/'train')

valid_parser = parsers.COCOBBoxParser(
annotations_filepath = path/'valid/_annotations.coco.json',
img_dir = path/'valid')
  • The COCOBBoxParser matches the information in the annotation files - filenames and locations, image sizes, class names and bounding-box parameters - using the COCO box format [x of the upper-left corner, y of the upper-left corner, box width, box height].
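For reference, a single annotation entry inside _annotations.coco.json looks roughly like the Python dict below (the values are invented for illustration):

example_annotation = {
    'id': 12,            # annotation id
    'image_id': 3,       # the image this box belongs to
    'category_id': 7,    # index into the 'categories' list, i.e. the class
    'bbox': [134.0, 88.5, 210.0, 96.0],  # [x_min, y_min, width, height] in pixels
    'area': 20160.0,
    'iscrowd': 0,
}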
whole = SingleSplitSplitter()  # keep all records in each folder as a single split

train_records, *_ = train_parser.parse(data_splitter = whole)
valid_records, *_ = valid_parser.parse(data_splitter = whole)

show_records(train_records[0:3],ncols=3, font_size=30, label_color = '#ffff00')

presize = 512
image_size = 384

train_tfms = tfms.A.Adapter([*tfms.A.aug_tfms(size=presize), tfms.A.Normalize()])
valid_tfms = tfms.A.Adapter([*tfms.A.resize_and_pad(size=image_size), tfms.A.Normalize()])

train_ds = Dataset(train_records, train_tfms)
valid_ds = Dataset(valid_records, valid_tfms)

2. Modelling

We will use a model that has been pretrained on the COCO dataset.
metrics = [COCOMetric(metric_type=COCOMetricType.bbox)]

from icevision.models.checkpoint import *

model_type = models.mmdet.vfnet
backbone = model_type.backbones.resnet50_fpn_mstrain_2x

model = model_type.model(backbone=backbone(pretrained=True),
num_classes= len(class_map))

train_dl = model_type.train_dl(train_ds, batch_size = 16,
num_workers = 4, shuffle=True)
valid_dl = model_type.valid_dl(valid_ds, batch_size=16,
num_workers = 4,shuffle=False)
  • VFNet is a single-stage detector that is accurate and relatively fast. It utilizes features that address the foreground-background imbalance, multi-scaling, and bounding box overlap. We will use the configuration released by the MMDetection team.
  • mean Average Precision (mAP) will be the detection metric.
W&B provides an excellent tool for evaluating and monitoring model training.
from fastai.callback.wandb import *

wandb.init(project = 'Transfer_learning_vf', # name project folder
name = 'Series', # name the training run within the project folder
reinit = True)
  • The initial output will show some hyperparameters, including the learning rate. The hyperparameter groups correspond to the parameter groupings of the particular model, such as the head and the neck.
learn = model_type.fastai.learner(dls = [train_dl, valid_dl],
model = model, metrics = metrics,
cbs = WandbCallback())

learn.lr_find() # output: valley = 6.30957365501672e-05

learn.fine_tune(30, 1e-04)
  • Set_1, which contained ~100 training images, reached a mAP of 0.486 after training for 30 epochs and produced a few good predictions.


model_type.show_results(model, valid_ds)


3. Saving the Model

from icevision.models import *

checkpoint_path = 'Model_1.pth' # your choice of name

save_icevision_checkpoint(model,
model_name='mmdet.vfnet',
backbone_name='resnet50_fpn_mstrain_2x',
classes = train_parser.class_map.get_classes(),
img_size=image_size,
filename=checkpoint_path,
meta={'icevision_version': '0.9.1'})
  • The model will appear as a Colab file. Colab files are temporary. If you want to keep the saved model, download it to your local computer or copy it to Google Drive (refer to Section III.D in this report).
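For example, one way to keep the checkpoint beyond the current session is to mount Google Drive and copy the file across (the Drive path below is just an example):

from google.colab import drive
import shutil

drive.mount('/content/gdrive')  # prompts for authorization
shutil.copy('Model_1.pth', '/content/gdrive/MyDrive/Model_1.pth')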

B. Set_2

Since Set_1 did not yet have very good predictions, we will proceed with the serial training.
For Set_2, we will repeat the steps in Section III.A.1 for parsing, transforms and creation of the PyTorch dataset, this time using the 2nd set of data for iteration:
path = Path('Surgical_instruments/Sets/Set_2/annotated')
Instead of starting from a model pretrained on the COCO dataset as we did before, we will start from the model we already trained on the previous subset.
checkpoint_path = 'Model_1.pth' # saved model

checkpoint_and_model = model_from_checkpoint(checkpoint_path)

model = checkpoint_and_model['model']
model.train()  # put the loaded model back into training mode
(Same steps for dataloading, learner initialization, lr finding and fine-tuning).
Since we are doing serial training, we will not re-initialize the Wandb log. The present run will be considered as a continuation of the previous.
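For completeness, here is a condensed sketch of those repeated steps for Set_2 (assuming train_ds and valid_ds have been rebuilt from the Set_2 parsers exactly as in Section III.A.1; use whatever learning rate lr_find suggests on this subset):

train_dl = model_type.train_dl(train_ds, batch_size=16, num_workers=4, shuffle=True)
valid_dl = model_type.valid_dl(valid_ds, batch_size=16, num_workers=4, shuffle=False)

learn = model_type.fastai.learner(dls=[train_dl, valid_dl],
                                  model=model, metrics=metrics,
                                  cbs=WandbCallback())  # keeps logging to the same W&B run

learn.lr_find()
learn.fine_tune(30, 1e-04)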
  • Set_2, which contained ~100 training images, reached a mAP of 0.623 after training for 30 epochs and had a few good predictions.
  • Notably, when starting from the model previously trained on Set_1, the training losses on the new set spiked again at first, but the mAP and validation loss continued to improve.

  • We will save this model as Model_2.pth.

C. Set_3 to Set_6

We will continue iterating through Set_3 to Set_6, varying the file and model names as needed; a sketch of the loop is shown below. The table further down summarizes the parameters for each iterative step.
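In outline, each iteration rebuilds the datasets for the next subset, loads the checkpoint from the previous iteration, fine-tunes, and saves a new checkpoint. A rough sketch of that loop (the per-iteration learning rate should come from lr_find; 1e-04 is only a placeholder):

for i in range(3, 7):
    path = Path(f'Surgical_instruments/Sets/Set_{i}/annotated')

    # Parse, transform and build the datasets exactly as in Section III.A.1
    train_parser = parsers.COCOBBoxParser(
        annotations_filepath=path/'train/_annotations.coco.json', img_dir=path/'train')
    valid_parser = parsers.COCOBBoxParser(
        annotations_filepath=path/'valid/_annotations.coco.json', img_dir=path/'valid')
    train_records, *_ = train_parser.parse(data_splitter=SingleSplitSplitter())
    valid_records, *_ = valid_parser.parse(data_splitter=SingleSplitSplitter())
    train_ds = Dataset(train_records, train_tfms)
    valid_ds = Dataset(valid_records, valid_tfms)
    train_dl = model_type.train_dl(train_ds, batch_size=16, num_workers=4, shuffle=True)
    valid_dl = model_type.valid_dl(valid_ds, batch_size=16, num_workers=4, shuffle=False)

    # Start from the model trained on the previous subset
    model = model_from_checkpoint(f'Model_{i-1}.pth')['model']
    model.train()

    learn = model_type.fastai.learner(dls=[train_dl, valid_dl], model=model,
                                      metrics=metrics, cbs=WandbCallback())
    learn.fine_tune(30, 1e-04)  # placeholder lr; use the lr_find suggestion in practice

    # Save the checkpoint for the next iteration
    save_icevision_checkpoint(model, model_name='mmdet.vfnet',
                              backbone_name='resnet50_fpn_mstrain_2x',
                              classes=train_parser.class_map.get_classes(),
                              img_size=image_size, filename=f'Model_{i}.pth',
                              meta={'icevision_version': '0.9.1'})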

The mAP and validation loss continued to improve until the 3rd iteration (90 total epochs). However, as more challenging types of images were introduced in subsequent iterations, the metric and losses worsened.
By the 6th iteration (epochs 150-179), the dataset contained both straightforward and challenging images. With the mAP and validation losses stabilizing, we can consider the possibility that the data representation and size might be enough.



IV. One-time Learning

With our serially generated dataset, we were able to gather representative as well as challenging images for 15 classes of surgical instruments. Set_1_6 is the merged equivalent of Sets 1 to 6, and is composed of 570 train, 165 validation and 80 test images.
We will now compare the results of the two types of training: serial versus one-time.
The code structure is similar to that above, with these differences:
  1. The path will be
path = Path('gdrive/MyDrive/Surgical_instruments/Set_1_6b.v1i.coco')
2. We will use a single model: the same COCO-pretrained model and backbone used for training Set_1 (VFNet/resnet50_fpn_mstrain_2x), rather than a chain of previously trained models.
model_type = models.mmdet.vfnet
backbone = model_type.backbones.resnet50_fpn_mstrain_2x

model = model_type.model(backbone=backbone(pretrained=True), num_classes=len(class_map))
3. We will reinitialize W&B, using the same project folder but a different run name.
wandb.init(project = 'Transfer_learning_vf', name = 'One-time_long',
reinit = True)
4. The learning rate finder will likely find a different LR due to the difference in representative data seen.
5. There will be a single training run of 180 epochs (to be compared with the serial training of 30 epochs for 6 stages).
learn.fine_tune(180, 1e-02)
The table below shows the parameters for Set_6 and the merged set Set_1_6:

Training the model in one continuous run produced a slightly higher final mAP. However, the validation loss is not significantly better than that of the serially trained model, and the one-time training took about twice as long to run.




V. Summary

When creating a dataset, it is reasonable to build the collection iteratively. Serial training gives a representative picture of how the metric and losses evolve as data is added, which provides a systematic way to judge whether the dataset's size and representation are adequate, instead of relying on a vague heuristic.

VI. Future Play

The model does not yet perform reliably enough for real-world application. Systematically adding more data is expected to yield a better-performing model while economizing effort.

I hope you had fun experimenting! :)

Maria
LinkedIn: https://www.linkedin.com/in/rodriguez-maria/
GitHub: https://github.com/yrodriguezmd?tab=repositories. The notebook for this tutorial is in the folder Deep_Learning_tutorials.
Twitter: https://twitter.com/Maria_Rod_Data
Image courtesy of Sabeer Darr/ Unsplash