Using W&B with DeepChem: Molecular Graph Convolutional Networks

A quick tutorial on using W&B to track DeepChem molecular deep learning experiments. Made by Kevin Shen using Weights & Biases

Link to Colab


If you think back to any basic chemistry class, you'll remember that molecules are naturally visualized as graphs. This makes them a really elegant application for graph convolutional networks (GCNs). Similar to convolutional neural networks (CNNs), GCNs apply convolutions, but over the nodes of a graph instead of the pixels of an image.
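To make that analogy concrete, here's a toy sketch of a single graph-convolution step. This is not DeepChem code: the molecule, the scalar features, and the mean-aggregation rule are all invented for illustration.

```python
# Toy illustration (not DeepChem code): a molecule as a graph, plus one
# "graph convolution" step that averages each atom's feature with its
# neighbors'. Features and the aggregation rule are made up.

# Formaldehyde (CH2O) as an adjacency list: 0=C, 1=O, 2=H, 3=H
neighbors = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}

# One scalar feature per atom (an arbitrary, made-up value)
features = {0: 2.5, 1: 3.5, 2: 2.1, 3: 2.1}

def graph_conv_step(neighbors, features):
    """Each node's new feature is the mean over itself and its neighbors."""
    new_features = {}
    for node, nbrs in neighbors.items():
        vals = [features[node]] + [features[n] for n in nbrs]
        new_features[node] = sum(vals) / len(vals)
    return new_features

updated = graph_conv_step(neighbors, features)
print(updated[0])  # carbon's feature now mixes in O and H information
```

Stacking several such steps lets information propagate across the whole molecule, which is the core idea behind GCNs.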
(We'll be talking mostly about our integration with DeepChem below, but if you want a more in-depth explanation of graph neural networks and GCNs, here's a good series of articles.)
Today, we will be using DeepChem (a library of open source tools for drug discovery, materials science, quantum chemistry, and biology) to help us process our dataset and train our model. Specifically, this report will extend the first part of an existing DeepChem tutorial called Introduction to Graph Convolutions to include Weights & Biases experiment tracking.
Since Weights & Biases is integrated inside DeepChem, with just a few lines of code, you can start tracking your DeepChem models. We'll show you how.

Dataset & Evaluation Metric

First off, let's talk about the dataset we're using in this example. Tox21 is a public dataset measuring the toxicity of over 8,000 compounds against 12 different targets, such as nuclear receptors and stress response pathways. We can load Tox21 using DeepChem's MoleculeNet suite. By setting the featurizer option to 'GraphConv', the dataset is processed into a form suitable for graph convolutions.
import deepchem as dc

tasks, datasets, transformers = dc.molnet.load_tox21(featurizer='GraphConv')
train_dataset, valid_dataset, test_dataset = datasets
This returns a training, validation, and test set along with a list of tasks and data transformations applied to the dataset.
For measuring model performance, we use ROC-AUC (the area under the receiver operating characteristic curve), which is also built into DeepChem and can simply be imported.
metric = dc.metrics.Metric(dc.metrics.roc_auc_score)

Model Training & Logging

DeepChem supports Weights & Biases through the WandbLogger class.
We also set up a ValidationCallback callback to calculate additional metrics. In our case, the only other metric apart from the training loss is the ROC-AUC score on the validation set. We have it set to log every 10 training steps. You can log any number of metrics through ValidationCallback by passing a list to the metrics argument.
from deepchem.models.wandblogger import WandbLogger
from deepchem.models.callbacks import ValidationCallback

wandblogger = WandbLogger(project='deepchem_graphconv', entity='kwshen')
vc_valid = ValidationCallback(valid_dataset, interval=10, metrics=[metric], transformers=transformers)

Basic Model

For a basic example, we can use DeepChem's standard graph convolutional architecture, GraphConvModel. Of course, you're free to build your own model with a custom architecture. (In fact, there's a section on that in the Colab tutorial!)
We pass our WandbLogger to the model's wandb_logger argument. By default, this tracks and logs the training loss.
Our ValidationCallback is passed to the model's fit call via the callbacks argument.
n_tasks = len(tasks)
model = dc.models.GraphConvModel(n_tasks, mode='classification', wandb_logger=wandblogger)
model.fit(train_dataset, nb_epoch=50, callbacks=[vc_valid])
ValidationCallback will automatically log its metrics to W&B during training if the model has a WandbLogger which, in this case, it does.
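Schematically, the interval mechanism behaves like this (a simplified sketch, not DeepChem's actual internals; `run_callbacks` and its arguments are invented for illustration):

```python
# Schematic of the interval mechanism (not DeepChem's real internals):
# during fitting, each callback fires once every `interval` training steps.
def run_callbacks(n_steps, interval, callback):
    logged = []
    for step in range(1, n_steps + 1):
        # ... one gradient update would happen here ...
        if step % interval == 0:
            logged.append(callback(step))
    return logged

# With interval=10, a 50-step run evaluates the callback 5 times.
steps_logged = run_callbacks(50, 10, lambda step: step)
print(steps_logged)  # [10, 20, 30, 40, 50]
```

This is why interval=10 in the ValidationCallback above gives you a validation ROC-AUC point every 10 training steps rather than every epoch.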

Other Models

Next, let's train two alternative models: a GCN with a custom architecture, and a Graph Attention Network (GAT). The custom GCN uses a data generator to load the dataset.
# Creating and loading a custom architecture
class MyGraphConvModel(tf.keras.Model):
  def __init__(self):
    ...

  def call(self, inputs):
    ...

# set up data generator
def data_generator(dataset, epochs=1):
  ...

wandblogger = WandbLogger(project='deepchem_graphconv', entity='kshen', name="Custom")
model = dc.models.KerasModel(MyGraphConvModel(), loss=dc.models.losses.CategoricalCrossEntropy(), wandb_logger=wandblogger)
model.fit_generator(data_generator(train_dataset, epochs=50))
wandblogger.finish()
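If you haven't used data generators before, here's the general pattern in miniature. This is illustrative only: the batching and names are invented, and DeepChem's real generator yields batches in the format fit_generator expects (inputs, labels, and weights), not raw slices.

```python
# A schematic data generator (names and batching are illustrative, not
# DeepChem's): lazily yield batches of a dataset for several epochs,
# which is the pattern fit_generator() consumes.
def toy_data_generator(dataset, batch_size=2, epochs=1):
    for _ in range(epochs):
        for i in range(0, len(dataset), batch_size):
            yield dataset[i:i + batch_size]

toy_dataset = list(range(5))
batches = list(toy_data_generator(toy_dataset, batch_size=2, epochs=2))
print(len(batches))  # 3 batches per epoch x 2 epochs = 6
```

The advantage is that batches are produced on demand, so the whole (possibly featurized and large) dataset never needs to sit in memory at once.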
The GAT model will require the installation of the DGL and DGL-Lifesci packages. More information can be found at the DeepChem GAT documentation.
# Creating a GAT model
from deepchem.models import GATModel

featurizer = dc.feat.MolGraphConvFeaturizer()
tasks, datasets, transformers = dc.molnet.load_tox21(reload=False, featurizer=featurizer, transformers=[])
train_dataset, valid_dataset, test_dataset = datasets

wandblogger = WandbLogger(project='deepchem_graphconv', entity='kshen', name="GAT")
model = GATModel(mode='classification', n_tasks=len(tasks), batch_size=100, learning_rate=0.001, wandb_logger=wandblogger)
model.fit(train_dataset, ...)
Because the custom GCN and the GAT are both standard DeepChem model classes (a KerasModel and a TorchModel, respectively), the WandbLogger integrates seamlessly with each.
For more detailed code, check out this report's Google Colab.


W&B will log the training loss and any metrics calculated in ValidationCallback. Our basic GCN model performs well on the validation set when trained for 50 epochs.
Note: Validation ROC-AUC was not logged for the custom GCN because ValidationCallback does not currently support data generators.
We can also evaluate each model one final time on all three splits (training, validation, and test) and present the results in a W&B Table.
We can see that while the Graph Attention Network scores the lowest on the training dataset, it generalizes well to the validation and test sets.


Hopefully, this quick walkthrough of our integration gives you a nice jumping-off point for using W&B with DeepChem's suite of model architectures and tools. We'd love to see any experiments you're excited about or hear any feedback you have. Thanks!