
YOLOv5 object detection tutorial

Using YOLOv5 object detection to accurately count vehicles and identify available parking spots, optimizing parking lot management.
This article will guide you through the process of using object detection, a computer vision technique that identifies and locates objects within images or video streams, to optimize parking lot management. We'll explore how this technology can help count vehicles, identify empty spaces, and ultimately streamline the parking experience for both drivers and facility managers.
Though we'll be focusing on parking lot management in this tutorial, it's meant to serve as a general example you can adapt to any YOLO object detection project.
Let's get started:

Understanding object detection

Object detection—in our case, detecting cars and parking spaces—is at the center of this project. Tracking how vehicles move through a lot over time supports everything from analyzing traffic flow to understanding shopping trends in particular districts.

Object detection models are usually trained on vast datasets of annotated images, where objects of interest (in our case, vehicles) are labeled with bounding boxes. These models learn to recognize the visual patterns and features associated with different types of vehicles, including cars, trucks, motorcycles, and buses. The object detection process typically involves several key stages:
  1. Image Preprocessing: The input image is resized, normalized, and converted to a format suitable for processing by the object detection model.
  2. Feature Extraction: Convolutional neural networks (CNNs) are employed to extract relevant features such as edges, corners, and shapes from the preprocessed image. These features serve as the building blocks for object recognition.
  3. Region Proposal Generation: Based on the extracted features, potential regions within the image that may contain objects of interest are identified and proposed for further processing.
  4. Object Classification: The proposed regions undergo classification, where they are categorized into different object types (e.g., car, truck, pedestrian) using sophisticated machine learning models.
  5. Bounding Box Regression: For each classified object, the model refines the object's location and size by adjusting the coordinates of a bounding box that tightly encompasses the object.
  6. Non-Maximum Suppression: To eliminate duplicate or overlapping detections, a non-maximum suppression algorithm is applied, retaining only the most confident detection for each unique object (see the sketch after this list).
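Here is a minimal, self-contained sketch of non-maximum suppression over axis-aligned boxes. It illustrates the idea only and is not YOLOv5's internal implementation; the box format and threshold are assumptions for the example.
import numpy as np

def iou(box, boxes):
    # IoU of one [x1, y1, x2, y2] box against each row of `boxes`
    x1 = np.maximum(box[0], boxes[:, 0]); y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2]); y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + area_b - inter)

def non_max_suppression(boxes, scores, iou_threshold=0.5):
    # Repeatedly keep the highest-scoring box, then drop boxes that overlap it too much
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(best)
        rest = order[1:]
        order = rest[iou(boxes[best], boxes[rest]) < iou_threshold]
    return keep
For example, two detections of the same parked car with confidences 0.9 and 0.7 and a large overlap would be collapsed into the single 0.9-confidence box.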
When deployed in parking lot environments, object detection systems can continuously process live video feeds from strategically placed cameras. By accurately detecting and counting the number of vehicles present, as well as identifying their precise locations, these systems can provide real-time occupancy data to parking management software.
This information can then be leveraged in various ways to optimize parking operations. For instance, dynamic signage can guide drivers to available spaces, reducing the time and frustration associated with circling crowded lots. Additionally, predictive analytics can be employed to anticipate peak demand periods, allowing for proactive measures such as adjusting pricing or implementing traffic diversions.
However, object detection in parking lot scenarios presents unique challenges. Varying lighting conditions, obstructions from trees or buildings, and the presence of different vehicle types and sizes can impact detection accuracy. To address these challenges, advanced techniques such as data augmentation, transfer learning, and ensemble models are often employed to enhance the robustness and generalization capabilities of object detection systems.
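As a concrete example of the data augmentation mentioned above, a separate library such as Albumentations (not installed as part of this tutorial) can simulate lighting changes and shadows while keeping YOLO-format bounding boxes aligned with the transformed image. This is a minimal sketch, not part of the YOLOv5 training pipeline shown later:
import albumentations as A

# Augmentations chosen to mimic parking-lot conditions: lighting shifts,
# shadows from trees or buildings, and mirrored camera viewpoints.
transform = A.Compose(
    [
        A.RandomBrightnessContrast(p=0.5),
        A.RandomShadow(p=0.3),
        A.HorizontalFlip(p=0.5),
    ],
    bbox_params=A.BboxParams(format="yolo", label_fields=["class_labels"]),
)

# Usage (illustrative): `image` is a NumPy array, `bboxes` are YOLO-format boxes,
# and `class_labels` are their class names.
# augmented = transform(image=image, bboxes=bboxes, class_labels=class_labels)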
By seamlessly integrating object detection technology into smart traffic management solutions, cities and parking facility operators can substantially improve the efficiency, sustainability, and user experience of parking infrastructure. This not only reduces driver frustration but also contributes to lower emissions, improved traffic flow, and more effective utilization of limited urban spaces.

Preparing Data for Vehicle Detection

To train an effective object detection model for vehicle recognition in parking lots, we need a diverse and high-quality dataset. This dataset will serve as the foundation for teaching our model to identify different types of vehicles under various conditions.
For this project, we'll be using the PKLot dataset, a publicly available collection of parking lot images captured from different locations and viewpoints. The dataset consists of over 12,000 images, providing a solid starting point for training our model.

Step 1: Environmental Setup

Install Python:
Follow the instructions on the official Python website (https://www.python.org/downloads/) to download and install Python for your operating system.
Set up a Virtual Environment:
Using venv
# Create a virtual environment
python -m venv myenv
# Activate the virtual environment (Windows)
myenv\Scripts\activate
# Activate the virtual environment (Unix/Linux)
source myenv/bin/activate
Using conda
# Create a new conda environment
conda create --name myenv python=3.8
# Activate the environment
conda activate myenv

Step 2: Getting Access to Colab GPU (Using Google Colab)

Let's make sure we have access to a GPU. We can use the nvidia-smi command to do that. If you run into any problems in Google Colab, navigate to Edit -> Notebook settings -> Hardware accelerator, set it to GPU, and then click Save.
!nvidia-smi

Step 3: Install Packages

Let's install the Ultralytics package using !pip install ultralytics. Ultralytics is a popular open-source computer vision library primarily used for object detection, instance segmentation, and image classification tasks. It provides pre-trained models and implementations of state-of-the-art algorithms like YOLO (You Only Look Once), which are widely used for object detection and tracking. We should also install wandb (W&B) using !pip install wandb. W&B is a machine learning tool for experiment tracking, visualization, and collaboration; it allows you to log and visualize metrics, model checkpoints, and system configurations during model training and experimentation.
!pip install ultralytics
!pip install wandb
import wandb
wandb.login()

Step 4: Storing the Home Directory Path

import os
HOME = os.getcwd()
print(HOME)

Step 5: Clone YOLOv5 Repository

To use the YOLOv5 algorithm, we clone it from the GitHub repository using !git clone https://github.com/ultralytics/yolov5. This command uses Git, a popular version control system, to clone (download) the YOLOv5 repository from the GitHub URL provided. YOLOv5 is an object detection algorithm developed by Ultralytics, a well-known open-source computer vision library. We then change into the newly cloned yolov5 folder using %cd yolov5 and install the required Python packages and dependencies listed in its requirements.txt file using !pip install -r requirements.txt.
!git clone https://github.com/ultralytics/yolov5 # clone
%cd yolov5
!pip install -r requirements.txt # install

Step 6: Install the Roboflow Package

Let's install the roboflow package, which we will use to download our dataset from Roboflow Universe. The package provides a Python library and command-line interface for interacting with the Roboflow platform from within your scripts or terminal, including uploading and managing computer vision datasets, annotating images or videos, and training and deploying models for tasks like object detection, segmentation, or classification.
!pip install -q roboflow

Step 7: Download Model Weights

These commands are attempting to download pre-trained YOLOv5 models from the Ultralytics GitHub repository. Let's break them down:
  1. !wget https://github.com/ultralytics/yolov5/releases/download/v7.0/yolov5s.pt: This command uses the wget utility (a non-interactive file downloader) to download the yolov5s.pt file from the specified URL. This file contains the pre-trained weights for the YOLOv5s model, which is a smaller and faster version of the YOLOv5 object detection model.
  2. !wget https://github.com/ultralytics/yolov5/releases/download/v7.0/yolov5m.pt, !wget https://github.com/ultralytics/yolov5/releases/download/v7.0/yolov5l.pt, and !wget https://github.com/ultralytics/yolov5/releases/download/v7.0/yolov5x.pt: These commands use wget to download the pre-trained weight files for the YOLOv5m, YOLOv5l, and YOLOv5x models, respectively.
The YOLOv5m, YOLOv5l, and YOLOv5x models are larger and more accurate versions of the YOLOv5 model, with increasing model size and complexity (and potentially better performance on complex tasks).
Downloading these pre-trained models can be useful if you want to use them directly for inference (object detection) tasks without training the models from scratch. Alternatively, you can use these pre-trained models as a starting point (transfer learning) and fine-tune them on your own dataset for better performance on specific use cases.
# YOLOv5s
!wget https://github.com/ultralytics/yolov5/releases/download/v7.0/yolov5s.pt


# YOLOv5m
!wget https://github.com/ultralytics/yolov5/releases/download/v7.0/yolov5m.pt


# YOLOv5l
!wget https://github.com/ultralytics/yolov5/releases/download/v7.0/yolov5l.pt


# YOLOv5x
!wget https://github.com/ultralytics/yolov5/releases/download/v7.0/yolov5x.pt
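If you simply want to sanity-check one of these pre-trained checkpoints before any fine-tuning, YOLOv5 models can be loaded through torch.hub and run on a single image. The image path below is a placeholder you would replace with your own file:
import torch

# Load the small pre-trained YOLOv5 model (weights are downloaded automatically if needed)
model = torch.hub.load('ultralytics/yolov5', 'yolov5s')

# Run inference on a sample parking-lot image (replace with a real path on your machine)
results = model('parking_lot.jpg')

# Print a summary and the raw detections (class, confidence, box coordinates)
results.print()
print(results.pandas().xyxy[0])
Keep in mind the stock checkpoints are trained on COCO classes such as car and truck, so they will count vehicles but won't know about empty parking spaces until we fine-tune on the PKLot data.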

Step 8: Download Dataset

The command %cd {HOME}/yolov5 changes the current working directory to the yolov5 folder inside the directory stored in the HOME variable. It's important that the dataset is saved inside the {HOME}/yolov5 directory; otherwise, training will not succeed. This step ensures that the subsequent commands and operations are performed within the correct directory (the YOLOv5 project directory). We then install the Roboflow Python package using !pip install roboflow. The dataset is downloaded in zip form from Roboflow and then extracted into the {HOME}/yolov5 directory.
%cd {HOME}/yolov5
!pip install roboflow


from roboflow import Roboflow
rf = Roboflow(api_key="Bpf07X1MtF3W3bdTehP8")
project = rf.workspace("brad-dwyer").project("pklot-1tros")
version = project.version(2)
dataset = version.download("yolov5")
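With the dataset extracted into {HOME}/yolov5, the model can be trained. The tutorial does not show the exact training command used to produce the runs examined below, but a typical YOLOv5 fine-tuning run looks roughly like this; the image size, batch size, and epoch count here are illustrative assumptions rather than the settings used for the reported results:
# Fine-tune YOLOv5s on the PKLot dataset downloaded above
!python train.py \
--img 640 --batch 16 --epochs 25 \
--data {dataset.location}/data.yaml \
--weights yolov5s.pt
Leaving out the --name flag lets YOLOv5 number the runs exp, exp2, exp3, and so on, which is the directory layout examined in the next step.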

Step 9: Examining The Training Results

By default, the results of each training session are saved in {HOME}/yolov5/runs/train/, in directories named exp, exp2, exp3, and so on. You can override this behavior by using the --name parameter.
!ls {HOME}/yolov5/runs/train/exp/


Step 10: Visualization of Our Results from W&B

from IPython.display import Image

Image(filename=f"{HOME}/yolov5/runs/train/exp/results.png", width=1000)

The image displays multiple line plots that show the training progress and performance metrics of the model. Each plot represents a different metric or loss function tracked during the training process.
Here's a breakdown of the plots:
  1. train/box_loss and val/box_loss: These plots show the bounding box regression loss during training and validation, respectively. The loss decreases over the training epochs, indicating that the model is learning to predict accurate bounding boxes for the objects.
  2. train/obj_loss and val/obj_loss: These plots show the objectness loss, which measures the model's confidence that a predicted box actually contains an object, i.e., its ability to distinguish objects from background.
  3. train/cls_loss and val/cls_loss: These plots show the classification loss, which measures the model's performance in classifying the detected objects into the correct classes (e.g., cars, trucks, motorcycles).
  4. metrics/precision and metrics/recall: These plots display the precision and recall metrics, which are commonly used to evaluate the performance of object detection models. Precision measures the accuracy of positive predictions, while recall measures the model's ability to find all relevant objects.
  5. metrics/mAP_0.5 and metrics/mAP_0.5:0.95: These plots show the mean Average Precision (mAP) metric, which is a widely used evaluation metric for object detection tasks. The mAP_0.5 plot calculates the mAP at an Intersection over Union (IoU) threshold of 0.5, while the mAP_0.5:0.95 plot calculates the mAP over various IoU thresholds from 0.5 to 0.95.
The blue lines represent the actual values of the metrics or loss functions, while the orange lines show a smoothed version of the same data, which can help visualize the overall trend more clearly.
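If you prefer working with the raw numbers rather than the rendered plot, recent YOLOv5 versions also write a results.csv file in the same run directory. A minimal sketch for inspecting it with pandas, assuming that file exists for your run (YOLOv5 pads the CSV headers with spaces, and exact column names can vary slightly between versions):
import pandas as pd

# Load the per-epoch training log written alongside results.png
results = pd.read_csv(f"{HOME}/yolov5/runs/train/exp/results.csv")

# Strip the space-padded column names before selecting them
results.columns = results.columns.str.strip()

# Show the final epoch's key metrics
print(results[["epoch", "metrics/precision", "metrics/recall",
               "metrics/mAP_0.5", "metrics/mAP_0.5:0.95"]].tail(1))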

Step 11: Display the Confusion Matrix

from IPython.display import Image

Image(filename=f"{HOME}/yolov5/runs/train/exp/confusion_matrix.png", width=1000)

This image displays a confusion matrix for the detection model trained to identify three classes: "space-empty" (vacant parking spaces), "space-occupied" (occupied parking spaces), and "background" (non-parking space areas).
In this confusion matrix, the rows represent the true labels (or actual classes), and the columns represent the predicted classes by the model.
Here's a breakdown of the values in the confusion matrix:
  1. For the "space-empty" class:
- The model correctly classified 0.96 (or 96%) of the truly vacant parking spaces as "space-empty" (true positives).
- The remaining 0.04 (or 4%) of vacant spaces were misclassified as another class (false negatives for this class).
  2. For the "space-occupied" class:
- The model correctly classified 0.95 (or 95%) of the truly occupied parking spaces as "space-occupied" (true positives).
- The remaining 0.05 (or 5%) of occupied spaces were misclassified (false negatives for this class).
  3. For the "background" class:
- The model correctly treated 0.8 (or 80%) of the non-parking-space areas as "background" (true negatives).
- The remaining 0.2 (or 20%) of non-parking-space areas were incorrectly detected as either "space-empty" or "space-occupied" (false positives).
The diagonal values (0.96, 0.95, and 0.8) show how often each class is classified correctly, so high values in these positions are desirable.
The off-diagonal values (such as 0.04, 0.05, 0.54, and 0.46 in this matrix) represent false positives or false negatives, i.e., the model's misclassifications; ideally these should be close to zero.
From this we can identify the strengths and weaknesses of the model for each class. It performs well at distinguishing vacant from occupied parking spaces, with correct-classification rates of 0.96 and 0.95, respectively, but it struggles more with non-parking-space areas (background), where only 0.8 are handled correctly and 0.2 are falsely detected as parking spaces.
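To make the row-normalization behind these percentages explicit, here is a small illustration with made-up detection counts; the numbers are purely hypothetical and are not the counts behind the matrix shown above:
import numpy as np

# Hypothetical raw counts: rows = true class, columns = predicted class
classes = ["space-empty", "space-occupied", "background"]
counts = np.array([
    [960,  30,  10],   # true space-empty
    [ 40, 950,  10],   # true space-occupied
    [120,  80, 800],   # true background
])

# Dividing each row by its total yields the per-class rates plotted in the matrix
rates = counts / counts.sum(axis=1, keepdims=True)
for name, row in zip(classes, rates):
    print(name, np.round(row, 2))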

Step 12: Evaluate the Model Prediction

We display an image file from a batch of validation images with predicted bounding boxes and labels from the YOLOv5 object detection model. The image is expected to be located in the exp subdirectory within the runs/train directory of the YOLOv5 project.
from IPython.display import Image

Image(filename=f"{HOME}/yolov5/runs/train/exp/val_batch0_pred.jpg", width=1000)


Step 13: Validate the Custom Model

We evaluate the performance of the trained YOLOv5 model on a validation dataset.
%cd {HOME}/yolov5

!python val.py \
--img 640 --batch 32 --conf 0.001 --iou 0.7 --device 0 \
--data {dataset.location}/data.yaml \
--weights {HOME}/yolov5/runs/train/exp/weights/best.pt

Step 14: Running Inference with the Custom Model

We perform object detection on new images or videos using the trained YOLOv5 model and examine a few of the results. The script generates output images or videos with bounding boxes and labels for the detected objects.
import glob
from IPython.display import Image, display
!python detect.py \
--img 1280 --conf 0.1 --device 0 \
--weights {HOME}/yolov5/runs/train/exp/weights/best.pt \
--source {dataset.location}/test/images


for image_path in glob.glob(f'{HOME}/yolov5/runs/detect/exp/*.jpg')[:2]:
display(Image(filename=image_path, width=1000))
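Since the goal of the project is to count vehicles and available spots, it also helps to turn these detections into per-class counts. One way to do that is to load the fine-tuned weights through torch.hub and tally the predicted classes; the image filename below is a placeholder, and the class names assume the PKLot labels used earlier ("space-empty" and "space-occupied"):
import torch
from collections import Counter

# Load our fine-tuned weights as a custom YOLOv5 model
model = torch.hub.load('ultralytics/yolov5', 'custom',
                       path=f'{HOME}/yolov5/runs/train/exp/weights/best.pt')

# Run inference on one parking-lot image and count detections per class
results = model(f'{dataset.location}/test/images/example.jpg')  # placeholder filename
detections = results.pandas().xyxy[0]
counts = Counter(detections['name'])

print(f"Occupied spaces: {counts.get('space-occupied', 0)}")
print(f"Empty spaces:    {counts.get('space-empty', 0)}")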


System Integration and Monitoring

Integrating the object detection system with the existing parking lot management software is crucial for real-time traffic control. This integration allows the system to provide accurate and up-to-date information about available parking spaces to drivers, reducing the time spent searching for a spot.
One way to achieve this is by establishing a communication protocol between the object detection system and the parking management software. This could be achieved through APIs or messaging queues, allowing seamless data exchange between the two systems.
For example, the object detection system can continuously monitor the parking lot and send updates on the number of available spaces to the parking management software. The parking management software can then display this information on digital sign boards or mobile apps, guiding drivers to the available spaces.
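What that exchange looks like in code depends entirely on the parking management software, but as a rough sketch, the detection system could push occupancy counts to an HTTP endpoint after each processed frame. The URL, payload fields, and counts below are all hypothetical:
import requests

# Hypothetical endpoint exposed by the parking management software
PARKING_API_URL = "https://parking-management.example.com/api/occupancy"

occupancy_update = {
    "lot_id": "lot-01",   # hypothetical lot identifier
    "occupied": 142,      # counts produced by the detection pipeline
    "empty": 58,
}

response = requests.post(PARKING_API_URL, json=occupancy_update, timeout=5)
response.raise_for_status()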
The system can also be integrated with dynamic pricing strategies, adjusting parking fees based on real-time demand. During peak hours when the parking lot is near capacity, the system can raise the fees to incentivize drivers to consider alternative transportation methods or parking locations. Conversely, when the lot is relatively empty, fees can be lowered to attract more drivers.
Monitoring the performance of the object detection system is essential to ensure its accuracy and reliability. One powerful tool for this purpose is Weights & Biases (W&B), which provides a comprehensive platform for tracking and visualizing machine learning model performance.
With W&B, you can log various metrics related to the object detection model, such as precision, recall, and inference time. These metrics can be visualized in real-time, allowing you to identify any potential issues or performance degradation quickly.
For example, you could log the number of false positives (vehicles incorrectly identified) and false negatives (vehicles missed by the system) during each inference run. By monitoring these metrics over time, you can detect any sudden changes or trends that may indicate a problem with the model's performance.
W&B allows you to log and visualize the model's output predictions, making it easier to identify any systematic errors or biases. For instance, you could log and compare the model's predictions on images taken during different lighting conditions or from different camera angles, helping you identify areas for improvement.
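As a rough sketch of what that monitoring could look like, a small W&B run can log detection-quality and latency metrics alongside example prediction images. The project name, metric names, values, and image path here are illustrative assumptions:
import wandb

# Start a monitoring run (project and job_type names are arbitrary choices)
run = wandb.init(project="parking-lot-monitoring", job_type="inference")

# Log detection-quality and latency metrics for an inference batch
run.log({
    "false_positives": 3,
    "false_negatives": 1,
    "precision": 0.97,
    "recall": 0.95,
    "inference_time_ms": 42.0,
})

# Log an example annotated frame for visual inspection (placeholder file path)
run.log({"prediction": wandb.Image("annotated_frame.jpg")})

run.finish()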
Integrating W&B into your object detection pipeline not only helps you monitor performance but also facilitates collaboration and knowledge sharing within your team. You can easily share insights, visualizations, and model artifacts with colleagues, fostering a more transparent and collaborative development process.

Best Practices and Tips

Here are some best practices and tips for working with the PKLot dataset and object detection for parking lot management:
Data Diversity: The PKLot dataset provides a good variety of images captured under different lighting conditions, weather, and viewpoints. However, it's still important to augment the dataset with additional images from your specific parking lot environment to ensure the model generalizes well.
Annotation Quality: Accurate bounding box annotations are crucial for training an effective object detection model. Double-check annotations, especially for occluded or partially visible vehicles, to minimize errors.
Class Imbalance: The PKLot dataset has a higher proportion of empty parking spaces compared to occupied spaces. Consider techniques like oversampling or class-weighted loss functions to mitigate the class imbalance problem.
Hyperparameter Tuning: Experiment with different hyperparameters, such as learning rate, batch size, and anchor box settings, to find the optimal configuration for your parking lot environment (see the sketch below for one way to do this with YOLOv5).
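One concrete way to run these experiments with YOLOv5 is to pass an alternative hyperparameter file, or to let the built-in evolution mode search for settings automatically. The epoch counts below are illustrative, and hyp.scratch-high.yaml is one of the hyperparameter files that ships with the cloned repository:
# Train with a more aggressive augmentation/hyperparameter preset
!python train.py --img 640 --batch 16 --epochs 25 \
--data {dataset.location}/data.yaml --weights yolov5s.pt \
--hyp data/hyps/hyp.scratch-high.yaml

# Or let YOLOv5 evolve hyperparameters over many short runs
!python train.py --img 640 --batch 16 --epochs 10 \
--data {dataset.location}/data.yaml --weights yolov5s.pt --evolve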

Conclusion

In conclusion, integrating object detection technology with parking lot management systems offers a powerful solution to optimize vehicle flow and space utilization. By leveraging robust object detection models like YOLOv5, parking facilities can accurately count vehicles, identify available spaces, and guide drivers in real-time. Continuously monitoring system performance with tools like Weights & Biases ensures reliability and facilitates iterative improvements. As smart traffic management systems evolve, object detection will play a pivotal role in enhancing parking efficiency, reducing congestion, and providing a seamless experience for drivers. Embracing this technology is a significant step toward building smarter, more sustainable urban environments.