Understanding State of the Art - Deep Learning: 3D Semantic Segmentation

Made by Nicholas Bardy using Weights & Biases


The goal of this model is to take as input a point cloud representing a real-world object and segment the object into its constituent parts. 3D semantic segmentation is a foundational problem in computer vision, with applications ranging from self-driving cars to medical diagnosis.

The model is trained on ShapeNet, a dataset of 3D point clouds spanning objects in 17 categories, with per-point labels that segment each object into its parts. For example, airplanes are segmented into wing, body, and tail.
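To make the data format concrete, here is a minimal sketch of what one part-segmentation sample looks like: a cloud of N points with (x, y, z) coordinates and one part label per point. The synthetic data and part names below are illustrative assumptions, not actual ShapeNet contents.

```python
import numpy as np

# Hypothetical airplane sample: part names assumed from the example above.
PART_NAMES = ["wing", "body", "tail"]

rng = np.random.default_rng(0)
num_points = 2048

# One sample = (N, 3) point coordinates plus (N,) per-point part labels.
points = rng.normal(size=(num_points, 3)).astype(np.float32)
labels = rng.integers(0, len(PART_NAMES), size=num_points)

# A segmentation model maps (N, 3) points to (N, num_parts) scores,
# i.e. it predicts a part for every input point.
print(points.shape, labels.shape)
```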

This sweep explores the dataset and the capabilities of the model architecture. Running a sweep across each category while varying hyperparameters lets us visualize how the model handles different categories of objects, helping to answer questions such as: Are certain categories more challenging? Which categories take longer to train? Are there categories where we need more data?
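A sweep of this kind might be configured as below. The category list, hyperparameter names, and values are illustrative assumptions, not the report's actual sweep file; only the overall structure follows the Weights & Biases sweep configuration format.

```python
# Hypothetical sweep config: one run per category, crossed with a
# small grid of training hyperparameters.
sweep_config = {
    "method": "grid",
    "metric": {"name": "val_accuracy", "goal": "maximize"},
    "parameters": {
        "category": {"values": ["airplane", "chair", "lamp", "table"]},
        "learning_rate": {"values": [1e-3, 1e-4]},
        "batch_size": {"values": [16, 32]},
    },
}

# With the wandb client this would be launched roughly as:
#   sweep_id = wandb.sweep(sweep_config, project="shapenet-segmentation")
#   wandb.agent(sweep_id, function=train)
print(len(sweep_config["parameters"]))
```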

The model architecture used in these training runs is based on U-Net, an increasingly popular segmentation architecture designed for fast inference and small amounts of training data. U-Net was introduced in a 2015 paper on biomedical image segmentation.
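The core U-Net idea can be sketched in a few lines: an encoder that repeatedly downsamples, a decoder that upsamples back to the input resolution, and skip connections that concatenate encoder features into the matching decoder stage. Real U-Nets use learned convolutions; the toy NumPy version below replaces them with fixed pooling and upsampling so the structure stays visible. All shapes and names are illustrative assumptions, not the report's actual model.

```python
import numpy as np

def down(x):
    """Encoder block: average-pool pairs of points (halves the length)."""
    return x.reshape(x.shape[0] // 2, 2, x.shape[1]).mean(axis=1)

def up(x):
    """Decoder block: nearest-neighbour upsample (doubles the length)."""
    return np.repeat(x, 2, axis=0)

def unet(x, depth=2):
    # Encoder path: keep each resolution for the skip connections.
    skips = []
    for _ in range(depth):
        skips.append(x)
        x = down(x)
    # Decoder path: upsample, then concatenate the matching skip features.
    for skip in reversed(skips):
        x = np.concatenate([up(x), skip], axis=1)
    return x

features = np.ones((8, 4))  # 8 "points", 4 features each
out = unet(features)
print(out.shape)  # → (8, 12): input resolution restored, features widened by skips
```

The skip connections are what make the architecture effective with limited training data: fine-grained detail from the encoder is reinjected into the decoder instead of having to be relearned.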

See the code →

Comparing ground truth and final predictions


Point cloud visualizations
