Using YOLOv5 with W&B to Detect Starfish in the Great Barrier Reef
How I used YOLOv5 and Weights & Biases to detect crown-of-thorns starfish (COTS) in underwater images

What we'll be looking for today: the crown-of-thorns starfish
Introduction
Australia's stunningly beautiful Great Barrier Reef is the world's largest coral reef. It's home to 1,500 species of fish, 400 species of corals, 130 species of sharks, rays, and a massive variety of other sea life.
Unfortunately, the reef is under threat, in part because of the overpopulation of one particular starfish: the coral-eating crown-of-thorns starfish (or COTS for short). Scientists, tourism operators, and reef managers established a large-scale intervention program to control COTS outbreaks to ecologically sustainable levels.
The goal of this report? To build an object detection model, specifically with YOLOv5 and W&B, to identify these starfish.
Table of Contents
- Introduction
- Table of Contents
- Code and notebooks
- Install libraries
- One key point before we jump in
- Initializing W&B
- Metadata
- Preparing our dataset
- YOLOv5 + W&B
- Training
- Results
- Summary
Code and notebooks
Just a note: these links will send you off to Kaggle.
- Train: Great-Barrier-Reef: YOLOv5 [train]
- Inference: Great-Barrier-Reef: YOLOv5 [infer]
Install libraries
First, we need to install W&B.
!pip install -qU wandb
And we're done. That's always satisfying.
One key point before we jump in
The Kaggle competition metric, F2, tolerates some false positives (FP) in order to ensure very few starfish are missed. That means reducing false negatives (FN) matters more than avoiding false positives (FP).
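To make that concrete, here's a minimal sketch of the F-beta score with beta = 2. The competition evaluates it over IoU-matched detections across several thresholds, so this is only the core formula, not the full evaluation:

def fbeta(tp, fp, fn, beta=2.0):
    # beta > 1 weights recall more than precision, so a missed starfish (FN)
    # hurts the score more than a spurious box (FP).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return (1 + beta**2) * precision * recall / (beta**2 * precision + recall)

print(fbeta(tp=80, fp=10, fn=0))  # ~0.98: extra false positives cost relatively little
print(fbeta(tp=80, fp=0, fn=10))  # ~0.91: the same number of misses costs more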
Initializing W&B

Let's initialize Weights & Biases for this experiment. If this is your first time using W&B or you are not logged in, the link that appears after running wandb.login() will take you to the sign-up/login page. Signing up for a free account takes just a few clicks.
import wandb
wandb.login()
Metadata
Here's what we're working with:
- train_images/: The folder containing training set photos of the form video_{video_id}/{video_frame}.jpg
- video_id: The ID number of the video the image was part of. (These video IDs are not meaningfully ordered.)
- video_frame: The frame number of the image within the video. Note: you should expect to see occasional gaps in the frame number from when the diver surfaced.
- sequence: The ID of a gap-free subset of a given video. The sequence ids are not meaningfully ordered.
- sequence_frame: The frame number within a given sequence.
- image_id: The ID code for the image, in the format {video_id}-{video_frame}
- annotations: The bounding boxes of any starfish detections, in a string format that can be evaluated directly with Python. This does not use the same format as the predictions you will submit, and it is not available in test.csv. A bounding box is described by the pixel coordinates (x_min, y_min) of its upper-left corner within the image together with its width and height in pixels (i.e., COCO format).
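As a quick illustration of that format, each annotations entry is a Python-literal string that can be parsed with ast.literal_eval. The sample string below is made up for illustration; the x/y/width/height keys follow the COCO-style convention described above.

import ast

raw = "[{'x': 559, 'y': 213, 'width': 50, 'height': 32}]"  # hypothetical example value
boxes = ast.literal_eval(raw)  # safer than eval() for untrusted strings
print(len(boxes), boxes[0]['x'], boxes[0]['width'])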
FOLD = 1  # which fold to train
DIM = 3000
MODEL = 'yolov5s'
BATCH = 4
EPOCHS = 15
OPTIMIZER = 'Adam'

PROJECT = 'great-barrier-reef'  # w&b project for yolov5
NAME = f'{MODEL}-dim{DIM}-fold{FOLD}'  # w&b run name for yolov5

REMOVE_NOBBOX = True  # remove images with no bbox
ROOT_DIR = '/kaggle/input/tensorflow-great-barrier-reef/'
IMAGE_DIR = '/kaggle/images'  # directory to save images
LABEL_DIR = '/kaggle/labels'  # directory to save labels
Extracting our metadata
We'll use this code to extract our metadata:
# Train data
import pandas as pd
from tqdm import tqdm

tqdm.pandas()  # enables .progress_apply

df = pd.read_csv(f'{ROOT_DIR}/train.csv')
df['old_image_path'] = f'{ROOT_DIR}/train_images/video_' + df.video_id.astype(str) + '/' + df.video_frame.astype(str) + '.jpg'
df['image_path'] = f'{IMAGE_DIR}/' + df.image_id + '.jpg'
df['label_path'] = f'{LABEL_DIR}/' + df.image_id + '.txt'
df['annotations'] = df['annotations'].progress_apply(eval)
df['num_bbox'] = df['annotations'].progress_apply(lambda x: len(x))
df['width'] = 1280
df['height'] = 720
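The FOLD setting in the config above implies a cross-validation split. The training notebook defines its own scheme, so treat this sketch, which groups frames by sequence so that frames from the same sequence never end up in both train and validation, as one reasonable option rather than the exact split used:

from sklearn.model_selection import GroupKFold

df = df.reset_index(drop=True)
df['fold'] = -1
gkf = GroupKFold(n_splits=5)  # grouped so each sequence stays in a single fold
for fold, (_, val_idx) in enumerate(gkf.split(df, groups=df.sequence)):
    df.loc[val_idx, 'fold'] = fold

train_files = df[df.fold != FOLD].image_path.tolist()
val_files = df[df.fold == FOLD].image_path.tolist()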
Preparing our dataset
You can refer to the training notebook for a detailed explanation of how to prepare the dataset for YOLOv5, but in brief we need to do the following:
Creating labels for YOLOv5
We need to export our labels to YOLO format, with one `*.txt` file per image (if an image has no objects, no `*.txt` file is required). The `*.txt` file specifications are:
- One row per object
- Each row is in `class x_center y_center width height` format.
- Box coordinates must be in normalized `xywh` format (from 0 to 1). If your boxes are in pixels, divide `x_center` and `width` by the image width, and `y_center` and `height` by the image height.
- Class numbers are zero-indexed (start from `0`).
The bbox format in this dataset is COCO, i.e. `[x_min, y_min, width, height]`, so we need to convert from COCO to YOLO format.
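Here's a minimal sketch of that conversion and of writing one label file per image. The notebook has its own helpers; the dict keys (x, y, width, height) and the 1280x720 image size are taken from the dataset description above.

import numpy as np

def coco2yolo(bboxes, img_w=1280, img_h=720):
    # COCO boxes: [x_min, y_min, width, height] in pixels
    # YOLO boxes: [x_center, y_center, width, height] normalized to 0-1
    bboxes = np.array(bboxes, dtype=float)
    bboxes[:, 0] = (bboxes[:, 0] + bboxes[:, 2] / 2) / img_w  # x_center
    bboxes[:, 1] = (bboxes[:, 1] + bboxes[:, 3] / 2) / img_h  # y_center
    bboxes[:, 2] /= img_w
    bboxes[:, 3] /= img_h
    return bboxes

def write_label(label_path, annots, img_w=1280, img_h=720):
    # annots: list of dicts like {'x': ..., 'y': ..., 'width': ..., 'height': ...}
    if not annots:  # no starfish in this frame -> YOLOv5 needs no label file
        return
    boxes = [[a['x'], a['y'], a['width'], a['height']] for a in annots]
    yolo = coco2yolo(boxes, img_w, img_h)
    # one row per object: "<class> <x_c> <y_c> <w> <h>", class 0 = cots
    lines = [f"0 {x:.6f} {y:.6f} {w:.6f} {h:.6f}" for x, y, w, h in yolo]
    with open(label_path, 'w') as f:
        f.write('\n'.join(lines))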
Configuration
The dataset config file requires the following:
- The dataset root directory path and relative paths to `train / val / test` image directories (or *.txt files with image paths)
- The number of classes `nc`
- A list of class names: `['cots']`
Modify the YAML file below to update these parameters.
The config file may look something like this:
names:
- cots
nc: 1
path: /kaggle/working
train: /kaggle/working/train.txt
val: /kaggle/working/val.txt
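If you'd rather generate this config (and the image-path lists it points to) from Python, here's a minimal sketch. The gbr.yaml name matches the training command used below; the train_files/val_files lists come from the split sketch above, and the exact file layout is an assumption about this notebook.

import yaml

def write_list(path, image_paths):
    # one image path per line, as YOLOv5 expects for train/val .txt lists
    with open(path, 'w') as f:
        f.write('\n'.join(image_paths))

write_list('/kaggle/working/train.txt', train_files)
write_list('/kaggle/working/val.txt', val_files)

data_cfg = {
    'path': '/kaggle/working',
    'train': '/kaggle/working/train.txt',
    'val': '/kaggle/working/val.txt',
    'nc': 1,
    'names': ['cots'],
}
with open('/kaggle/working/gbr.yaml', 'w') as f:
    yaml.dump(data_cfg, f, default_flow_style=False)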
YOLOv5 + W&B

To install YOLOv5 with the W&B integration, all we need to do is clone this repo. The repo is synced with the official YOLOv5 repo, so you'll get the latest YOLOv5 features along with W&B support.
This is a community fork of YOLOv5, so you're welcome to contribute if you want to add more W&B features to YOLOv5!
!git clone https://github.com/awsaf49/yolov5-wandb.git yolov5  # clone
%cd yolov5
%pip install -qr requirements.txt  # install

import torch
import utils
display = utils.notebook_init()  # checks
Training
The code we'll use today:
!python train.py --img {DIM} \
    --batch {BATCH} \
    --epochs {EPOCHS} \
    --optimizer {OPTIMIZER} \
    --data /kaggle/working/gbr.yaml \
    --weights {MODEL}.pt \
    --project {PROJECT} --name {NAME} --entity ml-colabs \
    --exist-ok
Results
Class Distribution
(W&B panel, run set: 3 runs)
Batch Label/Prediction
(W&B panel, run set: 3 runs)
Metrics
From the metrics below, it is quite evident that a larger image size is a key factor for improving performance: the small yolov5s model trained at a larger image size performs better than the medium yolov5m model at a smaller image size.
We can also see that at the same image size, yolov5m outperforms yolov5s6, which means model size is another key factor for performance.
To get the best results, we would need to increase both image size and model size, though that will also increase the computational cost.
(W&B metric panels, run set: 3 runs)
Loss
(W&B panel, run set: 3 runs)
Summary
- Accurately identified starfish in real time using a YOLOv5 object detection model trained on underwater videos of coral reefs.
- Demonstrated how to use YOLOv5 with W&B (Weights & Biases).
- Analyzed the results and identified two key factors for improving performance: larger image size and larger model size.