
Point Cloud Classification Using PyTorch Geometric

In this article, we explore how to classify point cloud data from 3D CAD Models, implementing the PointNet++ architecture and using PyTorch Geometric and W&B.
A point cloud is an important data structure for storing geometric shape data. Due to its irregular format, it is often transformed into regular 3D voxel grids or collections of images before being fed to deep learning models, a step that makes the data unnecessarily large.
This problem can be solved by designing an architecture that can directly consume point clouds, respecting the permutation-invariance property of the point data.
The PointNet family of models is a pioneer in this direction, providing a simple, unified architecture for applications ranging from object classification to part segmentation to semantic scene parsing. However, the original PointNet does not capture local structures induced by the metric space points live in, limiting its ability to recognize fine-grained patterns and generalizability to complex scenes.
In this article, we'll demonstrate an implementation of the PointNet++ architecture, a hierarchical neural network that applies PointNet recursively on a nested partitioning of the input point set. By exploiting metric space distances, PointNet++ is better able to learn local features with increasing contextual scales.
Since point clouds are usually sampled with varying densities, which results in greatly decreased performance for networks trained on uniform densities, PointNet++ introduces novel set learning layers that adaptively combine features from multiple scales. This makes it perform significantly better than the vanilla PointNet architecture on challenging 3D point cloud benchmarks.

Let's get started!

The ModelNet Benchmarks

In this report, we'll demonstrate an implementation of the PointNet++ architecture and showcase its performance on the classification of point cloud data from 3D CAD Models. We'll be using the ModelNet10 and ModelNet40 datasets made available by the Princeton ModelNet project.
The goal of the ModelNet project is to provide researchers in computer vision, computer graphics, robotics, and cognitive science with a comprehensive, clean collection of 3D CAD models of objects.
Here's how the dataset was compiled and created:
  1. First, a list of the most common object categories in the world was compiled, using the statistics obtained from the SUN database.
  2. Once a vocabulary for objects had been established, 3D CAD models belonging to each object category were collected using online search engines by querying for each object category term.
  3. Then, human workers manually decided whether each CAD model belonged to the specified category, using TurkCleaner, a quality-control tool built on top of the Amazon Mechanical Turk service.
Let's take a look at the ModelNet10 and ModelNet40 datasets individually:

The ModelNet10 Dataset

To obtain a very clean dataset, the ModelNet authors chose 10 popular object categories and manually removed the models that did not belong to them. Moreover, the orientation of the CAD models in this 10-class subset was manually aligned as well. This aligned 10-class subset is referred to as ModelNet10.



The ModelNet40 Dataset

The ModelNet40 dataset consists of the complete collection of 3D CAD models, aligned and labeled with, you guessed it, 40 object categories.



Diving Into the Workflow

In this article, we're using PyTorch Geometric to create a deep-learning pipeline for implementing and training our model for point cloud classification.
PyTorch Geometric, also referred to as PyG, is a library built upon PyTorch to easily write and train Graph Neural Networks and Geometric Deep Learning models for a wide range of applications related to both structured and unstructured data. PyG provides the following features:
  • Various methods for deep learning on graphs and other irregular structures from a variety of published papers.
  • A large number of common benchmark datasets (along with simple interfaces to create your own).
  • Easy-to-use mini-batch loaders for operating on many small graphs or a single giant graph, as well as other irregular data structures such as molecules, 3D meshes, and point clouds.
  • Distributed graph learning via Quiver.
  • The GraphGym experiment manager for designing and evaluating Graph Neural Networks.
We'll also be using Weights & Biases for experiment tracking, data visualization, model checkpoint management, and hyperparameter tuning.

Creating the Input Pipeline
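As a minimal sketch of what such a pipeline can look like (the number of sampled points and the batch size below are placeholder values, not tuned settings), PyG's built-in ModelNet dataset, transforms, and data loaders get us most of the way:

import torch_geometric.transforms as T
from torch_geometric.datasets import ModelNet
from torch_geometric.loader import DataLoader

# Center and rescale each mesh once on disk, then sample a fixed number
# of points from the mesh surface every time an example is loaded.
pre_transform = T.NormalizeScale()
transform = T.SamplePoints(num=2048)

train_dataset = ModelNet(
    root="ModelNet10", name="10", train=True,
    transform=transform, pre_transform=pre_transform,
)
val_dataset = ModelNet(
    root="ModelNet10", name="10", train=False,
    transform=transform, pre_transform=pre_transform,
)

train_loader = DataLoader(train_dataset, batch_size=16, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=16, shuffle=False)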




The PointNet++ Architecture

PointNet++ processes point clouds iteratively by following a simple grouping, neighborhood aggregation, and downsampling scheme:
  • The grouping phase constructs a graph in which nearby points are connected. Typically, this is either done via k-nearest neighbor search or via ball queries (which connect all points that are within a radius to the query point).
  • The neighborhood aggregation phase executes a Graph Neural Network layer that, for each point, aggregates information from its direct neighbors (given by the graph constructed in the previous phase). This allows PointNet++ to capture local context at different scales.
  • The downsampling phase implements a pooling scheme suitable for point clouds of potentially different sizes. We will ignore this phase for now and come back to it later.
The Hierarchical Feature Learning Architecture, used by PointNet++. Source: Figure 2 from the paper.
Let us now explore individual building blocks in the PointNet++ architecture...

The Grouping Phase

The hierarchical structure of PointNet++ is composed of a number of set abstraction levels. At each level, a set of points is processed and abstracted to produce a new set with fewer elements. A set abstraction level takes an $N \times (d + C)$ matrix as input, representing $N$ points with $d$-dimensional coordinates and $C$-dimensional point features. It outputs an $N' \times (d + C')$ matrix of $N'$ subsampled points with $d$-dimensional coordinates and new $C'$-dimensional feature vectors summarizing local context. We introduce the layers of a set abstraction level in the following paragraphs.
The set abstraction level is made of three key layers:
  • The Sampling layer selects a set of points from input points, which defines the centroids of local regions.
  • The Grouping layer then constructs local region sets by finding “neighboring” points around the centroids.
  • The PointNet layer uses a mini-PointNet to encode local region patterns into feature vectors.

The Sampling Layer
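As a minimal, self-contained sketch of this layer (the point cloud size and sampling ratio below are toy values), PyG exposes farthest point sampling directly as torch_geometric.nn.fps:

import torch
from torch_geometric.nn import fps

# A toy point cloud: 1024 points with 3D coordinates in a single example.
pos = torch.rand(1024, 3)
batch = torch.zeros(1024, dtype=torch.long)

# Farthest point sampling selects a well-spread subset of the input points;
# with ratio=0.5, roughly half of them are kept as centroids.
idx = fps(pos, batch, ratio=0.5)
centroids = pos[idx]  # shape: [512, 3]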



The Grouping Layer
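Continuing the toy example from the sampling sketch above (the radius value is again a toy choice), the ball query is available as torch_geometric.nn.radius, which builds a bipartite graph between the centroids and their neighborhoods:

import torch
from torch_geometric.nn import radius

# Connect each centroid to all points within the given radius (capped at
# max_num_neighbors); `row` indexes centroids, `col` indexes points.
row, col = radius(pos, centroids, 0.2, batch, batch[idx],
                  max_num_neighbors=64)
edge_index = torch.stack([col, row], dim=0)  # edges run point -> centroid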



The PointNet Layer
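Again continuing the toy example, PyG's PointNetConv implements the mini-PointNet: it applies a shared MLP to each neighbor's features concatenated with its coordinates relative to the centroid, then max-pools over every local region:

from torch_geometric.nn import MLP, PointNetConv

# Our toy points carry no input features, so the MLP only consumes the
# 3-dimensional relative coordinates of each neighbor.
conv = PointNetConv(local_nn=MLP([3, 64, 64, 128]), add_self_loops=False)
x = conv((None, None), (pos, centroids), edge_index)  # shape: [512, 128]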



[Bonus Section] - How PyG Implements the PointNet Layer Using Message Passing Networks
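In brief, PyG builds the PointNet layer on its MessagePassing base class: each message concatenates a neighbor's features with its coordinates relative to the centroid, and the messages are max-aggregated per neighborhood. A simplified, non-bipartite sketch of such a layer (an illustration, not PyG's full implementation):

import torch
from torch_geometric.nn import MessagePassing


class PointNetLayer(MessagePassing):
    def __init__(self, local_nn):
        super().__init__(aggr="max")  # max-pool messages over each neighborhood
        self.local_nn = local_nn

    def forward(self, x, pos, edge_index):
        return self.propagate(edge_index, x=x, pos=pos)

    def message(self, x_j, pos_j, pos_i):
        # Translate each neighbor into the local frame of its centroid.
        msg = pos_j - pos_i
        if x_j is not None:
            msg = torch.cat([x_j, msg], dim=-1)
        return self.local_nn(msg)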

Implementing the Set Abstraction Layer

Now, let's put everything together to implement the set abstraction layer using the building blocks provided by PyG:

Implementations of Set Abstraction Layers using PyG
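Here is a sketch of this implementation, closely following PyG's pointnet2_classification example (referenced at the end of this report); the modules are named SetAbstraction and GlobalSetAbstraction to match the classification head in the next section:

import torch
from torch_geometric.nn import PointNetConv, fps, global_max_pool, radius


class SetAbstraction(torch.nn.Module):
    def __init__(self, ratio, r, nn):
        super().__init__()
        self.ratio = ratio  # fraction of input points kept as centroids
        self.r = r          # ball-query radius
        self.conv = PointNetConv(nn, add_self_loops=False)

    def forward(self, x, pos, batch):
        # Sampling: farthest point sampling selects the centroids.
        idx = fps(pos, batch, ratio=self.ratio)
        # Grouping: ball query connects each centroid to its neighborhood.
        row, col = radius(pos, pos[idx], self.r, batch, batch[idx],
                          max_num_neighbors=64)
        edge_index = torch.stack([col, row], dim=0)
        x_dst = None if x is None else x[idx]
        # PointNet layer: encode each local region into a feature vector.
        x = self.conv((x, x_dst), (pos, pos[idx]), edge_index)
        pos, batch = pos[idx], batch[idx]
        return x, pos, batch


class GlobalSetAbstraction(torch.nn.Module):
    def __init__(self, nn):
        super().__init__()
        self.nn = nn

    def forward(self, x, pos, batch):
        # Treat the whole cloud as a single region: concatenate features
        # with coordinates, apply the MLP, and max-pool per example.
        x = self.nn(torch.cat([x, pos], dim=1))
        x = global_max_pool(x, batch)
        pos = pos.new_zeros((x.size(0), 3))
        batch = torch.arange(x.size(0), device=batch.device)
        return x, pos, batch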


The Classification Head

Now that we have implemented the backbone of the PointNet++ architecture, all we need to do is add a classification head for the task of point-cloud classification.
The architecture of our PointNet++ classification model. Source: Figure 2 from the paper.
import torch
from torch_geometric.nn import MLP


class PointNetPlusPlus(torch.nn.Module):
    def __init__(
        self,
        set_abstraction_ratio_1, set_abstraction_ratio_2,
        set_abstraction_radius_1, set_abstraction_radius_2, dropout
    ):
        super().__init__()

        # Input channels account for both `pos` and node features.
        self.sa1_module = SetAbstraction(
            set_abstraction_ratio_1,
            set_abstraction_radius_1,
            MLP([3, 64, 64, 128]),
        )
        self.sa2_module = SetAbstraction(
            set_abstraction_ratio_2,
            set_abstraction_radius_2,
            MLP([128 + 3, 128, 128, 256]),
        )
        self.sa3_module = GlobalSetAbstraction(MLP([256 + 3, 256, 512, 1024]))

        # Classification head mapping the global feature to the 10 ModelNet10 classes.
        self.mlp = MLP([1024, 512, 256, 10], dropout=dropout, norm=None)

    def forward(self, data):
        sa0_out = (data.x, data.pos, data.batch)
        sa1_out = self.sa1_module(*sa0_out)
        sa2_out = self.sa2_module(*sa1_out)
        sa3_out = self.sa3_module(*sa2_out)
        x, pos, batch = sa3_out
        return self.mlp(x).log_softmax(dim=-1)
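
For instance, the model can be instantiated like this (the hyperparameter values below are illustrative placeholders, not tuned results):

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = PointNetPlusPlus(
    set_abstraction_ratio_1=0.5,
    set_abstraction_ratio_2=0.25,
    set_abstraction_radius_1=0.2,
    set_abstraction_radius_2=0.4,
    dropout=0.1,
).to(device)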

For the complete code, you can refer to the Colab notebook:


Training the Point-Cloud Classifier

We train the model using a simple PyTorch-based training loop and use Weights & Biases for tracking our experiment metrics and versioning our model checkpoints. For detailed documentation of writing PyTorch-based training loops instrumented with Weights & Biases, check out the official documentation.

Training and Validation Loop
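A minimal sketch of such a loop, assuming `model`, `train_loader`, and `device` are defined as in the previous sections and a W&B run has already been initialized with wandb.init:

import torch
import torch.nn.functional as F
import wandb

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def train_epoch(epoch):
    model.train()
    for data in train_loader:
        data = data.to(device)
        optimizer.zero_grad()
        # The model returns log-probabilities, so NLL is the matching loss.
        loss = F.nll_loss(model(data), data.y)
        loss.backward()
        optimizer.step()
        wandb.log({"train/loss": loss.item(), "epoch": epoch})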


Experiments

Now that we have our input pipeline, architecture, and training loop implemented, let's warm up by performing a few baseline experiments.

Baseline experiments


Tuning the Hyperparameters

Now that we have run some baseline experiments and made sure that our training pipeline is working as expected, let us try to improve our results. We'll leverage W&B Sweeps to not only automatically search for the best set of hyperparameters, but also analyze the importance of, and correlations between, various hyperparameters to better optimize our model's performance.
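As a sketch of how such a sweep can be configured (the metric name, parameter ranges, and run count below are illustrative assumptions, and `train` stands for a training function that reads its hyperparameters from wandb.config):

import wandb

sweep_config = {
    "method": "bayes",
    "metric": {"name": "val/accuracy", "goal": "maximize"},
    "parameters": {
        "dropout": {"min": 0.1, "max": 0.7},
        "num_points": {"values": [512, 1024, 2048]},
        "batch_size": {"values": [8, 16, 32]},
        "set_abstraction_ratio_1": {"min": 0.3, "max": 0.8},
        "set_abstraction_ratio_2": {"min": 0.3, "max": 0.8},
        "set_abstraction_radius_1": {"min": 0.1, "max": 0.6},
        "set_abstraction_radius_2": {"min": 0.2, "max": 0.8},
    },
}

sweep_id = wandb.sweep(sweep_config, project="pointnet2-classification")
wandb.agent(sweep_id, function=train, count=50)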

Results of Sweep

Let's now analyze the importance of the hyperparameters generated by the sweep:
  • It is evident that the dropout rate has a strong negative correlation with our metric. This indicates that we can lower the dropout rate without worrying much about our model overfitting.
  • The number of points sampled per point cloud in the input pipeline has a strong positive correlation with our metric. This makes sense, given that a higher number of points means a denser, higher-resolution point cloud.
With our initial set of observations from the sweep, let's apply a few filters.

Results of sweeps with applied filters

After filtering the runs with respect to dropout and the number of samples per point cloud, the correlation trends for the remaining hyperparameters aren't apparent at a glance due to their low parameter importance. The overall trend for the query radius and set abstraction ratio seems to favor a higher value for the first set abstraction layer and a lower one for the second.

Final Experiment

With the insights we gained from running the sweep, let us perform some final experiments to get the best-performing model.
  • We'll go with a very low dropout rate.
  • We'll select a high number of sampled points, which basically means higher-resolution point clouds.
  • Since there was no clear trend for batch size, we'll go with a batch size of 16.
  • We'll take the rest of the hyperparameters from the best-performing run in the sweep.






Conclusion

  • In this report, we explored how to build a model for classifying 3D point clouds.
    • We explored how we could easily build data loaders for point cloud classification using PyTorch Geometric's dataset API.
    • We also explored the PointNet++ architecture for point cloud classification as proposed by the paper PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space.
    • We explored how to implement the PointNet++ architecture using the building blocks provided by PyTorch Geometric.
    • We briefly explored the concept of Message Passing Networks and how to implement a Message Passing layer using PyTorch Geometric.
    • We trained our model on the ModelNet10 dataset and showcased how to use Weights & Biases to track our experiment and store and version our checkpoints with a PyTorch-based training and validation loop.
    • We used Weights & Biases Sweeps to correctly identify the optimal set of hyperparameters and train the final version of our model.
  • The implementation of PointNet++ in this report is inspired by this example in the PyTorch Geometric repository.