
Using YOLOv5 with W&B to Detect Starfish in the Great Barrier Reef

How I used YOLOv5 and Weights & Biases to detect crown-of-thorns starfish (COTS) in underwater images
Created on July 21 | Last edited on August 2
What we'll be looking for today: the crown-of-thorns starfish

🧩 Introduction

Australia's stunningly beautiful Great Barrier Reef is the world’s largest coral reef. It's home to 1,500 species of fish, 400 species of corals, 130 species of sharks and rays, and a massive variety of other sea life.
Unfortunately, the reef is under threat, in part because of the overpopulation of one particular starfish – the coral-eating crown-of-thorns starfish (or COTS for short). Scientists, tourism operators, and reef managers established a large-scale intervention program to control COTS outbreaks to ecologically sustainable levels.
The goal of this report? To build an object detection model, specifically with YOLOv5 and W&B, to identify these starfish.
NOTE: This competition was hosted by Kaggle in 2022. Feel free to check it out here!


πŸ““ Code and notebooks

Just a note: these links will send you off to Kaggle.

πŸ›  Install libraries

First, we need to install W&B.
!pip install -qU wandb
And we're done. That's always satisfying.

πŸ“Œ One key point before we jump in

The Kaggle competition metric, F2, tolerates some false positives (FP) in order to ensure that very few starfish are missed. In other words, reducing false negatives (FN) matters more than reducing false positives (FP).
F_2 = 5 \cdot \frac{precision \cdot recall}{4 \cdot precision + recall}
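To make the recall weighting concrete, here's a minimal sketch of the general F-beta score with beta = 2 (the helper name and example numbers are ours, not part of the competition code):
def f_beta(precision, recall, beta=2.0):
    """Generic F-beta score; beta > 1 weights recall more heavily than precision."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Missing starfish (low recall) hurts the score far more than a few extra false positives
print(f_beta(precision=0.6, recall=0.9))  # ~0.82
print(f_beta(precision=0.9, recall=0.6))  # ~0.64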

πŸͺ„ Initializing W&B

Let's initialize Weights & Biases for this experiment. If this is your first time using W&B or you are not logged in, the link that appears after running wandb.login() will take you to the sign-up/login page. Signing up for a free account takes just a few clicks.
import wandb
wandb.login()
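If you're running on Kaggle, you can avoid pasting the API key by storing it as a notebook secret. A minimal sketch, assuming you've saved your key under the (hypothetical) secret name wandb_api_key via Add-ons β†’ Secrets:
import wandb
from kaggle_secrets import UserSecretsClient

# Read the W&B API key from Kaggle's secret store and log in non-interactively
api_key = UserSecretsClient().get_secret("wandb_api_key")
wandb.login(key=api_key)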

πŸ“ Metadata

Here's what we're working with:
  • train_images/: The folder containing training set photos of the form video_{video_id}/{video_frame}.jpg
  • video_id: The ID number of the video the image was part of. (These video IDs are not meaningfully ordered.)
  • video_frame: The frame number of the image within the video. Note: you should expect to see occasional gaps in the frame number from when the diver surfaced.
  • sequence: The ID of a gap-free subset of a given video. The sequence ids are not meaningfully ordered.
  • sequence_frame: The frame number within a given sequence.
  • image_id: The ID code for the image, in the format {video_id}-{video_frame}
  • annotations: The bounding boxes of any starfish detections, stored as a string that can be evaluated directly with Python (a parsing sketch follows this list). This does not use the same format as the predictions you will submit, and it is not available in test.csv. Each bounding box is described by the pixel coordinates (x_min, y_min) of its top-left corner within the image, together with its width and height in pixels (COCO format).
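For illustration, here's a hedged sketch of parsing a single annotations value with ast.literal_eval (a safer stand-in for the eval call used later); the sample string is made up, and it assumes each box is a dict with x, y, width, and height keys:
import ast

# A made-up example of one value from the `annotations` column:
# a Python-literal string holding a list of COCO-style pixel boxes
raw = "[{'x': 559, 'y': 213, 'width': 50, 'height': 32}]"

boxes = ast.literal_eval(raw)  # -> list of dicts, one per starfish
print(boxes[0]['x'], boxes[0]['y'], boxes[0]['width'], boxes[0]['height'])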
FOLD = 1  # which fold to train on
DIM = 3000  # training image size
MODEL = 'yolov5s'
BATCH = 4
EPOCHS = 15
OPTIMIZER = 'Adam'

PROJECT = 'great-barrier-reef'  # W&B project name
NAME = f'{MODEL}-dim{DIM}-fold{FOLD}'  # W&B run name

REMOVE_NOBBOX = True  # remove images with no bounding boxes
ROOT_DIR = '/kaggle/input/tensorflow-great-barrier-reef/'
IMAGE_DIR = '/kaggle/images'  # directory to save images
LABEL_DIR = '/kaggle/labels'  # directory to save labels

Extracting our metadata

We'll use this code to extract our metadata:
import pandas as pd
from tqdm import tqdm
tqdm.pandas()  # enables .progress_apply on pandas objects

# Train data
df = pd.read_csv(f'{ROOT_DIR}/train.csv')
df['old_image_path'] = f'{ROOT_DIR}/train_images/video_' + df.video_id.astype(str) + '/' + df.video_frame.astype(str) + '.jpg'
df['image_path'] = f'{IMAGE_DIR}/' + df.image_id + '.jpg'
df['label_path'] = f'{LABEL_DIR}/' + df.image_id + '.txt'

# Parse the annotation strings into Python lists and count boxes per image
df['annotations'] = df['annotations'].progress_apply(eval)
df['num_bbox'] = df['annotations'].progress_apply(lambda x: len(x))

# All frames share the same resolution
df['width'] = 1280
df['height'] = 720
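Note that the FOLD constant above assumes a fold column on the dataframe, which the snippet doesn't create. A minimal sketch of one common way to build it (not necessarily the exact split used in the original notebook) is a GroupKFold on video_id, so frames from the same video never land in both train and validation:
from sklearn.model_selection import GroupKFold

# Optionally drop frames without any starfish, as controlled by the config flag above
if REMOVE_NOBBOX:
    df = df.query('num_bbox > 0').reset_index(drop=True)

# Assign a fold id per row; grouping by video keeps all frames of a video in one fold
df['fold'] = -1
gkf = GroupKFold(n_splits=5)  # 5 folds is an arbitrary choice here
for fold, (_, val_idx) in enumerate(gkf.split(df, groups=df['video_id'])):
    df.loc[df.index[val_idx], 'fold'] = fold

train_df = df[df['fold'] != FOLD].reset_index(drop=True)
valid_df = df[df['fold'] == FOLD].reset_index(drop=True)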

🍚 Preparing our dataset

You can refer to the training notebook for a detailed explanation of how to prepare the dataset for YOLOv5, but in brief we need to do the following:

Creating labels for YOLOv5

We need to export our labels to YOLO format, with one `*.txt` file per image (if an image has no objects, no `*.txt` file is required). The `*.txt` file specifications are:
  • One row per object
  • Each row is in `class x_center y_center width height` format.
  • Box coordinates must be in normalized `xywh` format (from `0` to `1`). If your boxes are in pixels, divide `x_center` and `width` by the image width, and `y_center` and `height` by the image height.
  • Class numbers are zero-indexed (start from `0`).
The bounding box format in this dataset is COCO, i.e. `[x_min, y_min, width, height]`, so we need to convert from COCO to YOLO format.
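As a concrete illustration of that conversion, here's a small helper (our own sketch, not code from the notebook) that turns one COCO box into a YOLO label row using the 1280Γ—720 frame size noted earlier:
def coco_to_yolo(x_min, y_min, w, h, img_w=1280, img_h=720, cls=0):
    """Convert a COCO box (pixel x_min, y_min, width, height)
    into a YOLO label row: class x_center y_center width height, all normalized."""
    x_center = (x_min + w / 2) / img_w
    y_center = (y_min + h / 2) / img_h
    return f"{cls} {x_center:.6f} {y_center:.6f} {w / img_w:.6f} {h / img_h:.6f}"

# Example: a 50x32 px box whose top-left corner sits at (559, 213) in a 1280x720 frame
print(coco_to_yolo(559, 213, 50, 32))  # -> "0 0.456250 0.318056 0.039062 0.044444"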

Configuration

The dataset config file requires the following:
  1. The dataset root directory path and relative paths to `train / val / test` image directories (or *.txt files with image paths)
  2. The number of classes `nc`
  3. A list of class names: `['cots']`
Modify the YAML file below to match your dataset. The config file may look something like this:
names:
- cots
nc: 1
path: /kaggle/working
train: /kaggle/working/train.txt
val: /kaggle/working/val.txt
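One way to generate both this gbr.yaml and the train.txt / val.txt image lists from Python is sketched below; it assumes the train_df / valid_df split from the earlier fold sketch, and the file paths match the --data argument used in the training command later:
import yaml

# Write the image-path lists that YOLOv5 will read for training and validation
with open('/kaggle/working/train.txt', 'w') as f:
    f.write('\n'.join(train_df['image_path'].tolist()))
with open('/kaggle/working/val.txt', 'w') as f:
    f.write('\n'.join(valid_df['image_path'].tolist()))

# Write the dataset config consumed by train.py via --data
config = {
    'path': '/kaggle/working',
    'train': '/kaggle/working/train.txt',
    'val': '/kaggle/working/val.txt',
    'nc': 1,
    'names': ['cots'],
}
with open('/kaggle/working/gbr.yaml', 'w') as f:
    yaml.dump(config, f, default_flow_style=False)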

πŸ“¦ YOLOv5 +πŸͺ„ W&B

To install YOLOv5 with the W&B integration, all we need to do is clone this repo. The repo is synced with the official YOLOv5 repo, so you'll get the latest YOLOv5 features along with W&B support.
This is a community fork of YOLOv5, so you're welcome to contribute if you want to add any W&B features to YOLOv5!
!git clone https://github.com/awsaf49/yolov5-wandb.git yolov5 # clone
%cd yolov5
%pip install -qr requirements.txt # install
import torch
import utils
display = utils.notebook_init()  # run environment and dependency checks

πŸš… Training

The code we'll use today:
!python train.py --img {DIM} \
    --batch {BATCH} \
    --epochs {EPOCHS} \
    --optimizer {OPTIMIZER} \
    --data /kaggle/working/gbr.yaml \
    --weights {MODEL}.pt \
    --project {PROJECT} --name {NAME} --entity ml-colabs \
    --exist-ok
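Once training finishes, YOLOv5 writes the best checkpoint to {PROJECT}/{NAME}/weights/best.pt (relative to the yolov5 directory). Here's a minimal inference sketch using torch.hub from inside that directory; the example image name and confidence threshold are arbitrary choices of ours:
import torch

# Load the best checkpoint from this run as a custom YOLOv5 model
model = torch.hub.load('.', 'custom', path=f'{PROJECT}/{NAME}/weights/best.pt', source='local')
model.conf = 0.15  # a low confidence threshold favours recall, which the F2 metric rewards

# Run inference on one frame (hypothetical image id) and inspect the predicted boxes
results = model(f'{IMAGE_DIR}/0-100.jpg', size=DIM)
print(results.pandas().xyxy[0])  # one row per detected box with pixel coordinates and confidence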

πŸ” Results

Class Distribution

(W&B panel: class distribution for the 3 runs in the run set)

Batch Label/Prediction

(W&B panel: batch labels vs. predictions for the 3 runs in the run set)

Metrics

From the metrics below, it is quite evident that a large image size is a key factor for improving performance: the small yolov5s model trained at a large image size (3000Γ—3000) outperforms the medium-sized yolov5m model trained at a lower image size (1280Γ—1280).
We can also see that at the same image size (1280Γ—1280), yolov5m outperforms yolov5s6, which means model size is another key factor for performance.
To get the best result, we would need to increase both the image size and the model size, though doing so will also increase the computational cost.
(W&B panels: evaluation metrics for the 3 runs in the run set)

Loss

(W&B panel: training and validation loss curves for the 3 runs in the run set)

πŸ“Š Summary

  • Accurately identified starfish in real time using a YOLOv5 object detection model trained on underwater videos of coral reefs.
  • Demonstrated how to use YOLOv5 with W&B (Weights & Biases).
  • Analyzed the results and identified two key factors for improving performance: Large Image Size and Large Model Size.