Better Models Faster with Weights & Biases

Lavanya Shukla

We're excited to announce that Weights & Biases now comes baked into your Kaggle kernels! Used by the likes of OpenAI and GitHub, W&B is part of the new standard of best practices for machine learning. I thought I'd write a quick post on why my fellow Kagglers might find W&B useful, and how you can integrate it into your projects with just a few lines of code.

Tl;dr: W&B helps you visualize model performance and predictions, find the best model efficiently, and share your experiment results.

Here are a few use cases in which W&B is especially useful for Kagglers:

  1. Track and compare models: I want to test out my hypotheses fast and iterate quickly to find the best model
  2. Visualize model performance for debugging: I want to see how my models are performing in real time and debug them
  3. Efficient hyperparameter search: I want to find the best model faster than everyone else
  4. Resource efficient model training: I want to be efficient with my model training and not spend more money than I need to
  5. Show off my model: I want to share my models and key insights with my teammates and the Kaggle community

Carlo Lepelaars and Mani Sarkar, two of our beta users, finished in 5th place out of 2255 in the SoftBank Forex Algorithm Challenge and found Weights & Biases extremely helpful for tuning their model’s performance.

Robert Lutz used Weights & Biases for all his hyperparameter tuning and finished 36th in the NFL Big Data Bowl Kaggle competition. He said, "I was amazed by the speed at which I was able to refine my model performance (and my position on the leaderboard) using W&B. The sweep visualizations were especially great for hyperparameter tuning."

Let’s dive deeper into how you can use Weights & Biases to make it to the Kaggle leaderboards.

1. Track and compare models

"I want to test out my hypotheses fast and iterate quickly to find the best model."

As Kagglers we’re both time and resource constrained. We need to run lots of quick experiments to find the winning model faster than our peers. With W&B you can track your model's performance, predictions, and resource usage in a live dashboard, from anywhere.

This centralized repository holds your metrics, models, the hyperparameters you tried, predictions, and accompanying notes. It gives you a bird’s eye view of your machine learning workflow, with all your experiments in one place.

You can visualize your progress and ask yourself things like: is my model good enough yet, am I using too many GPU resources, which learning rate worked best, did adding BatchNorm help? W&B saves and restores your models with wandb.save and wandb.restore, and you can log your predictions alongside your models.

This is kinda cool because you don’t have to re-run a model; you can simply view its performance days, weeks, or even a few months later. Before the final submission deadline for the competition, you can look at all the models you trained in the previous months and download the predictions for the best performing one.
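As a rough sketch of that workflow, the helpers below upload a predictions file with wandb.save and fetch it back later with wandb.restore. The helper names, the file name predictions.csv, and the run path format shown in the usage note are assumptions for illustration, and save_predictions assumes wandb.init has already been called.

```python
import csv


def save_predictions(ids, preds, path="predictions.csv"):
    """Write predictions to a CSV file and upload it to the active W&B run.

    Hypothetical helper: assumes wandb.init(...) was called earlier.
    """
    import wandb  # imported here so the helper can be defined without wandb installed

    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "prediction"])
        writer.writerows(zip(ids, preds))
    wandb.save(path)  # attach the file to the current run
    return path


def restore_predictions(run_path, name="predictions.csv"):
    """Download a file saved by an earlier run and return its rows as dicts.

    Hypothetical helper: run_path identifies the earlier run.
    """
    import wandb

    restored = wandb.restore(name, run_path=run_path)  # open file object
    with open(restored.name, newline="") as f:
        return list(csv.DictReader(f))
```

Weeks later you could call something like restore_predictions("your-username/kaggle/your-run-id") to pull the predictions of your best run without retraining; the path shown here is a placeholder, not a real run.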

Having your experiments in one place also allows you to keep track of all the pieces that went into building the model so you can reproduce results, and explain how the model works to fellow Kagglers.

Once you have this level of insight into your model performance, you can quickly iterate on many model hypotheses and see which hyperparameters and model architectures are doing best, in real time. The run compare panel in your dashboard shows you what changed between different model runs, and the project page provides a clean comparison of all your models' performances in one graph.

2. Visualize model performance for debugging

"I want to see how my models are performing in real time and debug them."

Here's a quick overview of some of the visualizations W&B automatically creates for your models.

Visualize your model performance and training metrics
Visualize your model's predictions
Visualize your system metrics, including GPU usage
Visualize gradients to help deal with vanishing and exploding gradients
Comprehensive, flexible visualizations - Log images, videos, html, audio, 3D objects, plots, tables, point clouds

W&B structures your metrics and predictions, and presents them in a way that provides actionable insights about your training process. For example, you can quickly create a plot to visualize your training and validation loss to check for overfitting.
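To make the overfitting check concrete, here is a minimal sketch that logs training and validation loss in the same call each epoch, so both curves can be plotted on one chart. The function name, the metric names, and the loss values are made up for illustration; in a real run you would pass wandb.log as log_fn.

```python
def log_epoch_losses(train_losses, val_losses, log_fn):
    """Log training and validation loss together, one call per epoch.

    log_fn is any callable that accepts a dict; with W&B that is wandb.log.
    """
    for epoch, (train_loss, val_loss) in enumerate(zip(train_losses, val_losses)):
        log_fn({
            "epoch": epoch,
            "loss": train_loss,
            "val_loss": val_loss,
            # A growing gap between the two curves is a common sign of overfitting.
            "loss_gap": val_loss - train_loss,
        })


# Stand-in for wandb.log so the sketch runs without an active W&B run.
history = []
log_epoch_losses([0.9, 0.5, 0.25], [1.0, 0.75, 0.75], history.append)
```

In these made-up numbers the training loss keeps falling while the validation loss stalls, which is the pattern to look for in the dashboard.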

With these comprehensive visualizations, W&B helps you look under the hood of your model and debug its performance. You can understand where the model performs best, where it fails, and which failure scenarios are the most common.

3. Efficient hyperparameter search

"I want to find the best model faster than everyone else."

Finding the best hyperparameters for your model might be the difference between a gold and a bronze on the leaderboard. Sweeping through hyperparameters is a powerful way to find the winning model in an organized way, with little effort.

This process can get gnarly and usually involves a hastily thrown together concoction of Excel spreadsheets and scribbled notes on paper to track the hyperparameter values you've tried. Occasionally you might write some training scripts, but overall the process of trying hyperparameter values is quite cumbersome.

Enter W&B sweeps, which let you define a dictionary or YAML file of hyperparameter values to try, and the search strategy you want (grid, random, bayes). With a few lines of code, W&B sweeps automatically search the hyperparameter space for you, and log all your model metrics and predictions as before.

This helps you explore your hyperparameter space thoroughly and find the best model, while saving money and time.
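Here is a small sketch of what such a sweep dictionary can look like. The hyperparameter values and the metric name val_loss are assumptions chosen for illustration, and grid_points is a hypothetical helper that only enumerates what a grid search would visit; it is not part of W&B. The launch_sweep wrapper shows how the dictionary would be handed to wandb.sweep and wandb.agent.

```python
import itertools

# Sweep definition: search strategy, metric to optimize, and values to try.
sweep_config = {
    "method": "grid",  # or "random" / "bayes"
    "metric": {"name": "val_loss", "goal": "minimize"},
    "parameters": {
        "learn_rate": {"values": [0.01, 0.001]},
        "dropout": {"values": [0.2, 0.5]},
    },
}


def grid_points(config):
    """Enumerate the combinations a grid search would visit (illustration only)."""
    names = sorted(config["parameters"])
    value_lists = [config["parameters"][name]["values"] for name in names]
    return [dict(zip(names, combo)) for combo in itertools.product(*value_lists)]


def launch_sweep(train_fn, project="kaggle"):
    """Register the sweep with W&B and run train_fn once per configuration."""
    import wandb  # imported here so the sketch can be read without wandb installed

    sweep_id = wandb.sweep(sweep_config, project=project)
    wandb.agent(sweep_id, function=train_fn)


print(len(grid_points(sweep_config)), "configurations in this grid")
```

Inside train_fn you would call wandb.init() and read the chosen values from wandb.config, the same way the neural network example later in this post reads its defaults.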

4. Resource efficient model training

"I want to be efficient with my model training and not spend more money than I need to."

GPU resources are expensive and we're trying to win our Kaggle competitions by being as resource efficient (read: cheap) as possible. W&B shows you how many resources you're using with this nifty little System Metrics tab.

This is useful because you can see which model experiments were the most resource intensive, and limit the types of models and hyperparameter values you try. For instance you might realize that doubling the batch size quadruples the GPU usage, which is no bueno.

By keeping track of system metrics, we can be time and resource efficient! And save the planet in a small way.

5. Show off my model

"I want to share results of models with my teammates."

Whether you're entering competitions with teammates, or you're writing a cool kernel that explains your model to the Kaggle community, W&B can help bring your models to life with Reports. They're like readmes for your models, but better. Instead of trying to communicate your results via a limited markdown readme file or blog post, you can generate rich reports with interactive charts.

You can create reports to share your findings and progress.

Finally, by logging model runs to a single project, you and your teammates can collaborate on models and see which of your approaches is performing the best. You have a real-time dashboard of what your teammates are doing. Here you can share insights on approaches tried, understand bottlenecks, set milestones and track progress in a central source of truth.

Visualizing your models with W&B is easy

"Your tool sounds cool, but I don't have a lot of free time. How long will it take to get started with W&B?"

You can integrate W&B into your project with just a few lines of code.

You can find the full version of the code below in this kernel. If you have an existing kernel, you just need to switch your Docker image to 'Latest Available'.

1. Log any metric with Weights & Biases

import wandb
wandb.init(anonymous='allow', project="kaggle")

# Code to fetch and preprocess data

# Log any metric over any time period
for price in apple['close']:
   wandb.log({"Stock Price": price})

2. Monitor boosting model performance

Start out by importing the experiment tracking library and setting up your free W&B account:

# Import wandb
import wandb
import xgboost as xgb
wandb.init(anonymous='allow', project="xgboost-dermatology")

# Code to fetch and preprocess data

# Add the wandb xgboost callback
bst = xgb.train(param, xg_train, num_round, watchlist, callbacks=[wandb.xgboost.wandb_callback()])

3. Monitor scikit learn performance

Logging sklearn plots with Weights & Biases is simple.

Step 1: First import wandb and initialize a new run

import wandb
wandb.init(anonymous='allow', project="kaggle")
# load and preprocess dataset
# train a model

Step 2: Visualize individual plots.

Visualize single plot

# The confusion matrix compares true labels with predicted labels (not probabilities)
wandb.sklearn.plot_confusion_matrix(y_true, y_pred, labels)

Or visualize all plots at once

# Visualize all classifier plots
wandb.sklearn.plot_classifier(clf, X_train, X_test, y_train, y_test, y_pred, y_probas, labels, model_name='SVC', feature_names=None)
# All regression plots
wandb.sklearn.plot_regressor(reg, X_train, X_test, y_train, y_test,  model_name='Ridge')
# All clustering plots
wandb.sklearn.plot_clusterer(kmeans, X_train, cluster_labels, labels=None, model_name='KMeans')

4. Monitor neural network performance

Define Your Hyperparameters
# WandB – Import the W&B library
import wandb
from wandb.keras import WandbCallback

# Default values for hyper-parameters
defaults = dict(
    dropout = 0.2,
    hidden_layer_size = 32,
    layer_1_size = 32,
    learn_rate = 0.01,
    decay = 1e-6,
    momentum = 0.9,
    epochs = 5,
)

# Initialize a new wandb run and pass in the config object
wandb.init(anonymous='allow', project="kaggle", config=defaults)
config = wandb.config

# Code to fetch and preprocess data

# build model (Keras imports shown here for completeness)
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from keras.optimizers import SGD

model = Sequential()
model.add(Conv2D(config.layer_1_size, (5, 5), activation='relu',
                 input_shape=(img_width, img_height, 1)))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())  # flatten the feature maps before the classifier layer
model.add(Dense(num_classes, activation='softmax'))

sgd = SGD(lr=config.learn_rate, decay=config.decay, momentum=config.momentum, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])

# Add WandbCallback() to the fit function
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=config.epochs,
          callbacks=[WandbCallback(data_type="image", labels=labels)])

More Resources

Weights & Biases is always free for academics and open source projects. Email with any questions or feature suggestions. Here are some more resources:

Join our mailing list to get the latest machine learning updates.