Understanding State of the Art in Deep Learning: 3D Semantic Segmentation
This model takes a point cloud representing a real-world object as input and segments the object into its constituent parts.
Created on February 21 | Last edited on October 4
Deep Learning: 3D Semantic Segmentation
The goal of this model is to take a point cloud representing a real-world object as input and segment the object into its constituent parts. 3D semantic segmentation is a foundational problem in computer vision, with applications ranging from self-driving cars to medical diagnosis.
The model is trained on the ShapeNet dataset. ShapeNet contains 3D point cloud data for objects across 16 categories, along with per-point labels that segment each object into its parts. For example, airplanes are segmented into wing, body, and tail.
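To make the data format concrete, here is a minimal sketch of what one ShapeNet-style training sample looks like. The point count, label scheme, and part names below are illustrative assumptions, not the actual dataset loader:

```python
import numpy as np

# Hypothetical ShapeNet-style sample: a point cloud plus per-point part labels.
rng = np.random.default_rng(0)

num_points = 2048  # points sampled from the object's surface
points = rng.uniform(-1.0, 1.0, size=(num_points, 3))  # (x, y, z) coordinates

# Per-point part labels, e.g. for an airplane: 0 = wing, 1 = body, 2 = tail.
part_names = ["wing", "body", "tail"]
labels = rng.integers(0, len(part_names), size=num_points)

# Each point carries exactly one part label, so the arrays align 1:1;
# the segmentation model's job is to predict `labels` from `points`.
for part_id, name in enumerate(part_names):
    print(f"{name}: {(labels == part_id).sum()} points")
```

A real pipeline would load these arrays from the dataset files and normalize each cloud, but the shapes — an `(N, 3)` coordinate array paired with an `(N,)` label array — are the essential structure.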
This sweep explores the dataset and the capabilities of the model architecture. Running a sweep across each category while varying hyperparameters lets us visualize how the model handles different categories of objects, helping to answer questions such as: Are certain categories more challenging? Which categories take longer to train? Are there categories where we need more data?
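A sweep like this can be expressed as a grid over categories and hyperparameters. The configuration below is a hypothetical sketch in the shape W&B sweeps use; the category list and hyperparameter values are illustrative, not the settings used in these runs:

```python
# Hypothetical W&B-style sweep configuration (illustrative values only).
sweep_config = {
    "method": "grid",
    "metric": {"name": "val_accuracy", "goal": "maximize"},
    "parameters": {
        # One run per (category, learning_rate, batch_size) combination.
        "category": {"values": ["Airplane", "Chair", "Lamp", "Table"]},
        "learning_rate": {"values": [1e-3, 1e-4]},
        "batch_size": {"values": [16, 32]},
    },
}

# A grid sweep launches one run per combination of parameter values.
num_runs = 1
for spec in sweep_config["parameters"].values():
    num_runs *= len(spec["values"])
print(num_runs)  # 4 categories * 2 learning rates * 2 batch sizes = 16 runs
```

Because the category is itself a sweep parameter, per-category metrics fall out of the run table directly, which is what makes the comparisons above possible.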
The model architecture used in these training runs is based on U-Net, an increasingly popular segmentation architecture designed for fast inference and training on small amounts of data. U-Net was introduced in the 2015 paper "U-Net: Convolutional Networks for Biomedical Image Segmentation" (Ronneberger et al.).
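The core U-Net idea is an encoder that downsamples to abstract features and a decoder that upsamples back, with skip connections concatenating encoder features into the decoder. The sketch below traces the shapes through one such level using NumPy stand-ins; the sizes and pooling are illustrative, and a real model would use learned convolutions:

```python
import numpy as np

def downsample(x):
    # 2x2 max pooling: halves spatial resolution (stand-in for conv block + pool).
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).max(axis=(1, 3))

def upsample(x):
    # Nearest-neighbor upsampling: doubles spatial resolution.
    return x.repeat(2, axis=0).repeat(2, axis=1)

# Input feature map: 64x64 with 8 channels (illustrative sizes).
enc = np.ones((64, 64, 8))

skip = enc                    # saved for the skip connection
bottleneck = downsample(enc)  # (32, 32, 8): coarser, more abstract features

# Decoder: upsample back to 64x64, then concatenate the skip features.
# The skip connection restores fine spatial detail lost by pooling, which
# helps U-Net produce sharp segmentations even with limited training data.
dec = upsample(bottleneck)                     # (64, 64, 8)
merged = np.concatenate([dec, skip], axis=-1)  # (64, 64, 16)
print(merged.shape)
```

The doubled channel count after concatenation is the signature of a U-Net decoder stage: the next convolution sees both coarse context and fine detail.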
Comparing Ground Truth and Final Predictions
Sweep runs (88 runs)
Point Cloud Visualizations
Run table (82 runs)