
Getting Started Guide

The following guide serves as an entry point to getting started with Weights & Biases!
Created on June 16|Last edited on June 16



Weights and Biases (W&B)

Weights and Biases is an MLOps platform built to facilitate collaboration and reproducibility across the machine learning development lifecycle. Machine learning projects can quickly become a mess without best practices in place to help developers and scientists as they iterate on models and move them to production.
W&B is lightweight enough to work with whatever framework or platform teams are currently using, but enables teams to quickly start logging their important results to a central system of record. On top of this system of record, W&B has built visualization, automation, and documentation capabilities for better debugging, model tuning, and project management.
Here's a YouTube video giving an overview of Weights & Biases.


Overall W&B documentation: https://docs.wandb.ai/

W&B Authentication

SDK Installation and Login

To start using W&B, first install the Python package if it is not already installed:
pip install wandb
Once it's installed, authenticate your user account by logging in through the CLI or SDK. You should have received an email inviting you to sign up to the platform, after which you can obtain your API token.
wandb login --host <your-host> <YOUR API TOKEN>
OR through Python:
import os
import wandb
WANDB_BASE_URL = "<your-host>"  # e.g. company.wandb.io if on dedicated
wandb.login(host=WANDB_BASE_URL, key=os.getenv("WANDB_API_KEY"))  # API key is read from the environment
Once you are logged in, you are ready to track your workflows!

Experiment Tracking

At the core of W&B is a Run, which is a logged unit of execution of Python code. A Run captures the entire execution context of that unit: Python library versions, hardware info, system metrics, git state, etc. To create a run, call wandb.init(). You can pass several important arguments to wandb.init() to provide additional context for the run and to help you organize your runs later:
import wandb

wandb.init(
    project="my-sample-project",
    entity="<enter team name>",  # Team
    group="my_group",  # for organizing runs (e.g. distributed training)
    job_type="training",  # for organizing runs (e.g. preprocessing vs. training)
    config={"hyperparam1": 24, "hyperparam2": "resnet"},  # Hyperparams and other config
)
See the full documentation for wandb.init for other arguments to customize its behavior.

What Can I log and How do I log it?

Within a run context, you can log all sorts of useful info such as metrics, visualizations, charts, and interactive data tables explicitly with wandb.log. Here is a comprehensive guide of wandb.log and its api docs.

Scalar Metrics

Scalar metrics can be logged by passing them in to wandb.log as a dictionary with a name.
wandb.log({"my_metric": some_scalar_value})
Each call to wandb.log increments a variable W&B keeps track of called step. This is the x-axis you see on all the time-series charts. If you call wandb.log once per epoch, the step represents the epoch count, but if you also call it in validation or testing loops, the meaning of the step is less clear. To pass a step manually, add step=my_int_variable to the wandb.log call. This can be important for getting your charts at the resolution you want.
In PyTorch Lightning modules, for example, you may want to set step to trainer.global_step. It is recommended to pack as many metrics as you can into a single dictionary and log them in one call rather than in separate wandb.log calls, each of which increments the step.
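As a minimal sketch of that recommendation, the helper below gathers every metric for an epoch into one dictionary and makes a single logging call with an explicit step. The function name log_epoch, the metric names, and the injectable log_fn parameter are assumptions made for illustration; log_fn stands in for wandb.log so the pattern can be exercised without a W&B connection.

```python
def log_epoch(log_fn, epoch, train_loss, val_loss, val_accuracy):
    """Log all metrics for one epoch in a single call with an explicit step."""
    metrics = {
        "train/loss": train_loss,
        "val/loss": val_loss,
        "val/accuracy": val_accuracy,
    }
    # One call per epoch keeps the step aligned with the epoch count.
    log_fn(metrics, step=epoch)
    return metrics


# Stand-in logger (hypothetical) that records calls instead of sending them to W&B.
recorded = []


def fake_log(metrics, step=None):
    recorded.append((step, metrics))


for epoch in range(3):
    log_epoch(
        fake_log,
        epoch,
        train_loss=1.0 / (epoch + 1),
        val_loss=1.5 / (epoch + 1),
        val_accuracy=0.5 + 0.1 * epoch,
    )
```

After wandb.init(), passing wandb.log as log_fn would send the same dictionaries to W&B, one step per epoch.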



You will notice that if you log a scalar metric multiple times in a run, it will appear as a line chart with the step as the x-axis, and it will also appear in the Runs Table. The entry in the Runs Table is the summary metric, which defaults to the last value logged during the course of the run. You can change this behavior by setting the summary metric in the run using run.summary["my_metric_name"] = some_value. This is useful if you want to compare runs according to different aggregations of a given metric (e.g. mean, max, min) as opposed to simply the last one.
import wandb

wandb.init()

for i in range(5):
    wandb.log({"my_metric": i})

wandb.summary["my_metric"] = 2  # 2 instead of the default 4

wandb.finish()
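To compare runs by an aggregation rather than the last value, one option is to compute the aggregates yourself and write them to the summary. The sketch below shows only the aggregation step; the helper name summarize and the my_metric_* key names are hypothetical choices, not part of the W&B API.

```python
def summarize(values):
    """Return common aggregations of a metric's logged history."""
    values = list(values)
    return {
        "min": min(values),
        "max": max(values),
        "mean": sum(values) / len(values),
        "last": values[-1],
    }


history = [0, 1, 2, 3, 4]  # the values logged as "my_metric" in the loop above
stats = summarize(history)

# Keys one could write into the run summary, e.g. "my_metric_max".
summary_updates = {f"my_metric_{name}": value for name, value in stats.items()}
```

Inside an active run, each entry could then be assigned with wandb.summary[key] = value so that it shows up as its own column in the Runs Table.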

Rich Media (e.g. images)

Logging rich media works roughly the same as scalars except you wrap your rich media in a wandb Data Type (e.g. wandb.Image).
wandb.log({"my_image": wandb.Image("my_image.jpg")})
The different Data Types are flexible in what formats of files or objects they accept. wandb.Image will accept image files, Pillow objects, or numpy arrays, for example. wandb.Image in particular has a whole host of arguments for specifying captions, segmentation masks, or bounding boxes.
When you log a rich media type, this creates a panel in the workspace which renders the rich media below the run name it belongs to. If you call wandb.log with a Data Type multiple times in a run, you will see a slider appear below the panel. This panel lets you slide over the step variable mentioned above.




W&B Tables

Tables are a special wandb Data Type, which allow you to log data, including other wandb Data Types, into an interactive dataframe in the workspace. This is especially useful for logging model predictions in order to filter them and inspect errors. To log a table you can add data row-by-row or as a pandas dataframe or Python lists. The elements of the dataframe can be any wandb Data Type (e.g. wandb.Image, wandb.Html, wandb.Plotly) or simple scalar or text values:
# Add data as a list of lists or pandas dataframe
my_data = [
    [0, wandb.Image("img_0.jpg"), 0, 0],
    [1, wandb.Image("img_1.jpg"), 8, 0],
    [2, wandb.Image("img_2.jpg"), 7, 1],
    [3, wandb.Image("img_3.jpg"), 1, 1],
]
# Create a wandb.Table() with corresponding columns
columns = ["id", "image", "prediction", "truth"]
test_table = wandb.Table(data=my_data, columns=columns)

# Add data incrementally
for img_id, img in enumerate(mnist_test_data):
    true_label = mnist_test_data_labels[img_id]
    guess_label = my_model.predict(img)
    test_table.add_data(img_id, wandb.Image(img), guess_label, true_label)

wandb.log({"test_table": test_table})
Use tables to log validation, sample predictions, or model errors, not entire training datasets. They can handle up to 200k rows but UI performance will vary depending on how many rich media types you have embedded. Here is a comprehensive guide to logging tables.
Note on Tables: when logging tables you will see in the workspace wandb.summary["my_table_name"] like below. This is using a query panel expression to query logged data in W&B and render it appropriately. Read more about query panels here. The upshot for right now is that W&B by default only renders the last version of a table (the summary one) logged in a run. So if you are logging tables multiple times throughout a run, you will only see the last one by default.
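Since tables are most useful for filtering predictions and inspecting errors, it can help to add an explicit column that marks whether each prediction was correct. The sketch below builds such rows in plain Python; the helper name build_prediction_rows and the "correct" column are assumptions for illustration, and images are left out to keep the example self-contained.

```python
def build_prediction_rows(ids, predictions, truths):
    """Build table rows plus a 'correct' flag for filtering errors."""
    columns = ["id", "prediction", "truth", "correct"]
    rows = [
        [i, pred, truth, pred == truth]
        for i, pred, truth in zip(ids, predictions, truths)
    ]
    return columns, rows


columns, rows = build_prediction_rows(
    ids=[0, 1, 2, 3],
    predictions=[0, 8, 7, 1],
    truths=[0, 0, 1, 1],
)

# Rows where the model was wrong, i.e. what you would filter for in the UI.
errors = [row for row in rows if not row[3]]
```

The resulting rows and columns can be passed to wandb.Table(data=rows, columns=columns) and logged with wandb.log in the same way as the table above.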


Artifact Tracking and Versioning

Artifacts enable you to track and version any serialized data as the inputs and outputs of runs. This can be datasets (e.g. image files), evaluation results (e.g. heatmaps), or model checkpoints. W&B is agnostic to the formats or structure of the data you want to log as an artifact.


Logging Artifacts

To log an artifact, you first create an Artifact object with a name, a type, and optionally a description and a metadata dictionary. You can then add any of these to the artifact object:
  • local files
  • local directories
  • wandb Data Types (e.g. wandb.Plotly or wandb.Table) which will render alongside the artifact in the UI
  • remote files and directories (e.g. S3 buckets)
# 1. Log a dataset version as an artifact
import wandb
import os

# Initialize a new W&B run to track this job
run = wandb.init(project="artifacts-quickstart", job_type="dataset-creation")

# Create a sample dataset to log as an artifact
with open('my-dataset.txt', 'w') as f:
    f.write('Imagine this is a big dataset.')

# Create a new artifact, which is a sample dataset
dataset = wandb.Artifact('my-dataset', type='dataset')
# Add files to the artifact, in this case a simple text file
dataset.add_file('my-dataset.txt')
# Log the artifact to save it as an output of this run
run.log_artifact(dataset)

wandb.finish()
Each time you log this artifact, W&B checksums the files you added to it and compares the result to previous versions of the artifact. If there is a difference, a new version is created, indicated by the alias v1, v2, v3, and so on (the first version is v0). Users can optionally add or remove additional aliases through the UI or API. Aliases are important because they uniquely identify an artifact version, so you can use them to pull down your best model, for example.
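The idea of checksum-based versioning can be illustrated with a toy model. This is not W&B's implementation: the class VersionedStore, the function content_digest, the use of SHA-256, and the choice to compare only against the most recent version are all assumptions made to show why re-logging unchanged files does not produce a new version.

```python
import hashlib


def content_digest(files):
    """Hash file names and contents into a single digest.

    `files` maps a file name to its content as bytes.
    """
    digest = hashlib.sha256()
    for name in sorted(files):
        digest.update(name.encode("utf-8"))
        digest.update(files[name])
    return digest.hexdigest()


class VersionedStore:
    """Toy store that adds a version only when the content digest changes."""

    def __init__(self):
        self.digests = []  # index i holds the digest of version "v{i}"

    def log(self, files):
        digest = content_digest(files)
        if not self.digests or self.digests[-1] != digest:
            self.digests.append(digest)
        return f"v{len(self.digests) - 1}"


store = VersionedStore()
first = store.log({"my-dataset.txt": b"Imagine this is a big dataset."})
same = store.log({"my-dataset.txt": b"Imagine this is a big dataset."})
changed = store.log({"my-dataset.txt": b"Imagine this is a bigger dataset."})
```

Here the second call reuses the first version because the contents are identical, while the third call creates a new one.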


Consuming Artifacts

To consume an artifact, execute the following:
import wandb
run = wandb.init()
# Indicate we are using a dependency
artifact = run.use_artifact('dummy-team/that_was_easy/my-dataset:v3', type='dataset')
artifact_dir = artifact.download()
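The string passed to use_artifact above follows the pattern entity/project/name:alias. As a small sketch for scripts that need the individual parts (for logging or validation, say), the hypothetical helper parse_artifact_path below splits such a string; it is plain string handling and not part of the wandb package.

```python
def parse_artifact_path(path):
    """Split 'entity/project/name:alias' into its parts."""
    location, _, alias = path.partition(":")
    entity, project, name = location.split("/")
    return {
        "entity": entity,
        "project": project,
        "name": name,
        "alias": alias or None,  # None when no alias or version was given
    }


parts = parse_artifact_path("dummy-team/that_was_easy/my-dataset:v3")
```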

Tracking Artifacts By Reference

You may already have large datasets sitting in a cloud object store like S3 and just want to track which versions of those datasets your Runs are using, along with any other metadata associated with them. You can do so by logging these artifacts by reference, in which case W&B only tracks the checksums and metadata of an artifact and does not copy the entire data asset to W&B. Here are some more details on tracking artifacts by reference.
With artifacts you can now refer to arbitrary data assets through durable and simple names and aliases (similar to how you deal with Docker containers). This makes it really easy to hand off these assets between people and processes and see the lineage of all data, models, and results.
If you're working with multiple component artifacts and would like to track the lineage of the collection of component artifacts in the form of a 'super artifact' - check out this colab here.

Registry

W&B Registry is a curated central repository that stores and provides versioning, aliases, lineage tracking, and governance of assets. Registry allows individuals and teams across the entire organization to share and collaboratively manage the lifecycle of all models, datasets and other artifacts. The registry can be accessed directly in SaaS by visiting https://wandb.ai/registry or on your private instance through <host-url>/registry.
W&B Registry home page

Registry Types

W&B supports two types of registries: Core registries and Custom registries.
Core registry
A core registry is a template for specific use cases: Models and Datasets.
By default, the Models registry is configured to accept "model" artifact types and the Dataset registry is configured to accept "dataset" artifact types.
Custom registry
Custom registries are not restricted to "model" or "dataset" artifact types and can accept any user-defined type.
After creating a registry, you store individual collections of your assets in it for tracking.
Collection
A collection is a set of linked artifact versions in a registry. Each collection represents a distinct task or use case and serves as a container for a curated selection of artifact versions related to that task.
Below is a diagram demonstrating how the registry integrates with your existing organization, teams, and projects.


Creating a Collection

Collections can be created programmatically or directly through the UI. Below, we'll cover programmatic creation. For the manual creation process through the UI, visit the Interactively create a collection section in the W&B docs.
W&B automatically creates a collection with the name you specify in the target path if you try to link an artifact to a collection that does not exist. The target path consists of the entity of the organization, the prefix "wandb-registry-", the name of the registry, and the name of the collection:
f"{org_entity}/wandb-registry-{registry_name}/{collection_name}"
The following code snippet shows how to programmatically create a collection. Replace values enclosed in <> with your own:
import wandb

# Initialize a run
run = wandb.init(entity="<team_entity>", project="<project>")

# Create an artifact object
artifact = wandb.Artifact(name="<artifact_name>", type="<artifact_type>")

# Define required registry definitions
org_entity = "<organization_entity>"
registry_name = "<registry_name>"
collection_name = "<collection_name>"
target_path = f"{org_entity}/wandb-registry-{registry_name}/{collection_name}"

# Link the artifact to a collection
run.link_artifact(artifact = artifact, target_path = target_path)

run.finish()
After creating your registry collection, you can programmatically link artifact versions to the registry. Linking an artifact to a registry collection brings that artifact version from a private, project-level scope, to the shared organization level scope.
Linking artifacts to a registry can be done programmatically or directly through the UI. Below, we'll cover programmatic linking. For the manual linking process through the UI, visit the "Registry App" and "Artifact browser" tabs of the How to link an artifact version section in the W&B docs.
Before you link an artifact to a collection, ensure that the registry that the collection belongs to already exists.
Use the target_path parameter to specify the collection and registry you want to link the artifact version to. The target path consists of:
{ORG_ENTITY_NAME}/wandb-registry-{REGISTRY_NAME}/{COLLECTION_NAME}
Copy and paste the code snippet below to link an artifact version to a collection within an existing registry. Replace values enclosed in <> with your own:
import wandb
#Define team and org
TEAM_ENTITY_NAME = "<team_entity_name>"
ORG_ENTITY_NAME = "<org_entity_name>"

REGISTRY_NAME = "<registry_name>"
COLLECTION_NAME = "<collection_name>"

run = wandb.init(
entity=TEAM_ENTITY_NAME, project="<project_name>")

artifact = wandb.Artifact(name="<artifact_name>", type="<collection_type>")
artifact.add_file(local_path="<local_path_to_artifact>")

target_path=f"{ORG_ENTITY_NAME}/wandb-registry-{REGISTRY_NAME}/{COLLECTION_NAME}"
run.link_artifact(artifact = artifact, target_path = target_path)

Download and use an artifact from a registry

Use the W&B Python SDK to use and download an artifact that you linked to the W&B Registry.
Replace values within <> with your own:
import wandb

ORG_ENTITY_NAME = '<org-entity-name>'
REGISTRY_NAME = '<registry-name>'
COLLECTION_NAME = '<collection-name>'
ALIAS = '<artifact-alias>'
INDEX = '<artifact-index>'

run = wandb.init() # Optionally use the entity, project arguments to specify where the run should be created

registered_artifact_name = f"{ORG_ENTITY_NAME}/wandb-registry-{REGISTRY_NAME}/{COLLECTION_NAME}:{ALIAS}"
registered_artifact = run.use_artifact(artifact_or_name=registered_artifact_name) # marks this artifact as an input to your run
artifact_dir = registered_artifact.download()
Reference an artifact version with one of the following formats:
# Artifact name with version index specified
f"{ORG_ENTITY_NAME}/wandb-registry-{REGISTRY_NAME}/{COLLECTION_NAME}:v{INDEX}"

# Artifact name with alias specified
f"{ORG_ENTITY_NAME}/wandb-registry-{REGISTRY_NAME}/{COLLECTION_NAME}:{ALIAS}"
Where:
latest - Use latest alias to specify the version that is most recently linked.
v# - Use v0, v1, v2, and so on to fetch a specific version in the collection.
alias - Specify the custom alias attached to the artifact version
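Because the same path pattern is used both for linking and for referencing a version, a small helper can keep the string formatting in one place. The sketch below is plain string handling; the function name registry_artifact_name and its keyword arguments are hypothetical and not part of the wandb package.

```python
def registry_artifact_name(org_entity, registry_name, collection_name,
                           alias=None, index=None):
    """Build a registry collection path, optionally pinned to a version.

    Pass `index` for a 'v#' reference or `alias` for a named alias; pass
    neither to get the bare target path used when linking.
    """
    if alias is not None and index is not None:
        raise ValueError("Specify either alias or index, not both")
    path = f"{org_entity}/wandb-registry-{registry_name}/{collection_name}"
    if index is not None:
        return f"{path}:v{index}"
    if alias is not None:
        return f"{path}:{alias}"
    return path


# Example values below are placeholders, not real entities.
target_path = registry_artifact_name("my-org", "model", "mnist-classifier")
by_index = registry_artifact_name("my-org", "model", "mnist-classifier", index=2)
by_alias = registry_artifact_name("my-org", "model", "mnist-classifier", alias="latest")
```

The first result has the shape expected by run.link_artifact's target_path, and the other two have the shape expected by run.use_artifact.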

W&B Sweeps

Anything logged in wandb.config appears as a column in the runs table and is considered a hyperparameter in W&B. These hyperparameters can be viewed dynamically in a Parallel Coordinates Chart, which you can add and manipulate in a workspace. You can edit this chart to display different hyperparameters or different metrics. The lines in the chart are different runs which have "swept" through the hyperparameter space. You can also plot a parameter importance chart to get a sense of which hyperparameters are most important or most correlated with the target metric. These importances are calculated using a random forest trained in your browser! Here are docs on the Parallel Coordinates Plot and the Parameter Importance Plot.


W&B provides a mechanism for automating hyper-parameter search through W&B Sweeps. Sweeps allows you to configure a large set of experiments across a pre-specified hyper-parameter space. To implement a sweep you just need to:
  1. Add wandb.init() to your training script, ensuring that all hyper-parameters are passed to your training logic via wandb.config.
  2. Write a yaml file with your hyper-parameter search specified i.e. method of search, hyper-parameter distributions and values to search over.
  3. Run the sweep controller, which runs in W&B through wandb.sweep or through the UI. The controller will delegate new hyperparameter values to wandb.config of the various agents running.
  4. Run agents with wandb.agent on as many machines as you want to use for the experiments.
The agents will execute the training script replacing the wandb.config with queued hyper-parameter values the controller is keeping track of.
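The four steps above can be sketched in a single Python file, using a dictionary in place of the YAML file. The parameter names, ranges, metric name, and the toy objective inside train are assumptions for illustration only; wandb is imported inside the functions so that the configuration can be inspected without the package installed.

```python
# Search specification, equivalent to what a YAML sweep file would contain.
sweep_config = {
    "method": "random",
    "metric": {"name": "val_loss", "goal": "minimize"},
    "parameters": {
        "learning_rate": {"min": 0.0001, "max": 0.1},
        "batch_size": {"values": [16, 32, 64]},
    },
}


def train():
    """Training entry point that an agent calls once per trial."""
    import wandb

    run = wandb.init()
    # Hyperparameters chosen by the controller arrive through wandb.config.
    lr = wandb.config.learning_rate
    batch_size = wandb.config.batch_size
    # Placeholder objective; a real script would train and evaluate a model.
    val_loss = (lr - 0.01) ** 2 + 1.0 / batch_size
    wandb.log({"val_loss": val_loss})
    run.finish()


def launch(project, count=5):
    """Register the sweep with the controller, then run one local agent."""
    import wandb

    sweep_id = wandb.sweep(sweep=sweep_config, project=project)
    wandb.agent(sweep_id, function=train, project=project, count=count)
```

Calling launch("<project>") after logging in would create the sweep and run five trials on the current machine; additional agents on other machines can join the same sweep ID.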
If you prefer to use other hyper-parameter optimization frameworks, W&B has integrations with Ray Tune and Optuna, among others.

W&B Reports

Reports are flexible documents you can build on top of your W&B projects. You can easily embed any asset (chart, artifact, table) logged in W&B into a report alongside markdown, LaTeX, code blocks, etc. You can create rich documentation from your logged assets without copy-pasting static figures into Word docs or managing Excel spreadsheets. Reports are live in that as new experiments run, they will update accordingly. This report you are viewing is a good example of what you can put into them.

Programmatic Reports

It may be useful to programmatically generate a report, such as for a standard model comparison analysis you might be doing repeatedly when retraining models, or after doing a large hyperparameter search. The W&B Python SDK provides a means of programmatically generating reports under wandb.apis.reports. Check out the docs and this quickstart notebook.


Integrations

PyTorch Lightning

Logging & Model Checkpointing in PyTorch Lightning

The WandbLogger in PyTorch Lightning integrates Weights and Biases for real-time experiment tracking, logging metrics, model checkpoints, and hyperparameters during the training process.
If you are using the WandbLogger with the PyTorch Lightning Trainer, the ModelCheckpoint Callback will automatically log model checkpoints to W&B. See more details in the PyTorch Lightning integration docs, and run a live example in this colab notebook.
W&B has many other integrations with frameworks like Keras and Hugging Face, which offer similar functionality.
from lightning.pytorch.loggers import WandbLogger
from lightning.pytorch import Trainer
from lightning.pytorch.callbacks import ModelCheckpoint

checkpoint_callback = ModelCheckpoint(monitor="val_accuracy", mode="max")

wandb_logger = WandbLogger(
    project="MNIST",  # group runs in "MNIST" project
    entity="smle-demo",
    log_model="all",  # log all new checkpoints during training
)
# log_predictions_callback is a user-defined callback that is not shown here
trainer = Trainer(
    logger=wandb_logger,  # W&B integration
    callbacks=[
        log_predictions_callback,  # logging of sample predictions
        checkpoint_callback,  # our model checkpoint callback
    ],
    accelerator="gpu",  # use GPU
    max_epochs=5,
)



Hugging Face Trainer

The Hugging Face Transformers library makes state-of-the-art NLP models like BERT and training techniques like mixed precision and gradient checkpointing easy to use. The W&B integration adds rich, flexible experiment tracking and model versioning to interactive centralized dashboards without compromising that ease of use.

Logging & Model Checkpointing Using HF Transformers

Sample Code:
import os

os.environ["WANDB_PROJECT"] = "<my-amazing-project>" # name your W&B project
os.environ["WANDB_LOG_MODEL"] = "checkpoint" # log all model checkpoints

from transformers import TrainingArguments, Trainer

args = TrainingArguments(..., report_to="wandb") # turn on W&B logging
trainer = Trainer(..., args=args)


Link to sample colab here

Other Useful Resources

Import/Export API

All data logged to W&B can be accessed programmatically through the import/export API (also called the public API). This enables you to pull down run and artifact data, filter and manipulate it how you please in Python.
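As a sketch of the kind of post-processing this enables, the helper below picks the run with the best summary value for a metric. It relies only on each run exposing name and summary attributes; the function name best_run, the metric name, and the stand-in run objects are assumptions so the example works without a W&B connection.

```python
from types import SimpleNamespace


def best_run(runs, metric, maximize=True):
    """Return the run whose summary holds the best value of `metric`."""
    scored = [r for r in runs if r.summary.get(metric) is not None]
    if not scored:
        return None
    chooser = max if maximize else min
    return chooser(scored, key=lambda r: r.summary.get(metric))


# Stand-in objects exposing the `name` and `summary` attributes used above.
fake_runs = [
    SimpleNamespace(name="pleasant-flower-4", summary={"val_accuracy": 0.91}),
    SimpleNamespace(name="misunderstood-glade-2", summary={"val_accuracy": 0.95}),
    SimpleNamespace(name="run-without-metric", summary={}),
]
winner = best_run(fake_runs, "val_accuracy")
```

With the real API, the runs would come from wandb.Api().runs("<entity>/<project>") instead of the stand-in list.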

Slack Alerts

You can set Slack alerts within a run to trigger when things happen in your training or evaluation scripts. For example, you may want W&B to notify you when training is done or when a metric exceeds a certain value.
Details on enabling these alerts on your dedicated deployments can be found here
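A threshold-based alert can be sketched as follows. The helper name alert_if_below, the threshold values, and the message wording are assumptions; alert_fn is an injected callable standing in for wandb.alert, which accepts title and text keyword arguments, so the logic can be tried without an active run.

```python
def alert_if_below(alert_fn, accuracy, threshold):
    """Send an alert when accuracy drops below threshold; return True if sent."""
    if accuracy < threshold:
        alert_fn(
            title="Low accuracy",
            text=f"Accuracy {accuracy:.3f} is below the threshold {threshold:.3f}",
        )
        return True
    return False


# Stand-in (hypothetical) that records alerts instead of sending them.
sent = []


def fake_alert(title, text):
    sent.append((title, text))


alert_if_below(fake_alert, accuracy=0.91, threshold=0.80)  # above threshold, nothing sent
alert_if_below(fake_alert, accuracy=0.65, threshold=0.80)  # below threshold, alert recorded
```

Inside a run started with wandb.init(), passing wandb.alert as alert_fn would deliver the message through the configured alert channel.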

FAQ

1. Why can't I log in to W&B?
Ans: Make sure you have specified the host when logging in from your script:
wandb.login(host="https://api.wandb.ai")
A troubleshooting step we recommend when users are having trouble logging in is to reset the login credentials via the CLI:
  1. Run rm ~/.netrc in your terminal
  2. Run wandb login --relogin --cloud
  3. Enter your API key when prompted
2. I didn't name my run. Where is the run name coming from?
Ans: If you do not explicitly name your run, a random run name will be assigned to the run to help identify the run in the UI. For instance, random run names will look like "pleasant-flower-4" or "misunderstood-glade-2".
3. How can I configure the name of the run in my training code?
Ans: At the top of your training script when you call wandb.init, pass in an experiment name, like this:
wandb.init(name="my_awesome_run")
4. If wandb crashes, will it possibly crash my training run?
Ans: It is extremely important to us that we never interfere with your training runs. We run wandb in a separate process to make sure that if wandb somehow crashes, your training will continue to run. If the internet goes out, wandb will continue to retry sending data to wandb.ai.
5. Why is a run marked crashed in W&B when it’s training fine locally?
Ans: This is likely a connection problem. If your server loses internet access and data stops syncing to W&B, we mark the run as crashed after a short period of retrying.
6. How do I stop wandb from writing to my terminal or my jupyter notebook output?
Ans: Set the environment variable WANDB_SILENT to true.
In Python
os.environ["WANDB_SILENT"] = "true"
Within Jupyter Notebook
%env WANDB_SILENT=true
With Command Line
export WANDB_SILENT=true


