
Torc Robotics PoV Guide

Your one-stop shop for everything you need to learn about W&B Models.
Created on September 2|Last edited on October 1
Weights & Biases instance (endpoint) available at https://torc.wandb.io/
💡


Weights and Biases (W&B) 💫

Weights & Biases (W&B) is an MLOps platform built to facilitate collaboration and reproducibility across the machine learning development lifecycle. Machine learning projects can quickly become a mess without best practices in place to help developers and scientists as they iterate on models and move them to production.
W&B is lightweight enough to work with whatever framework or platform teams are currently using, but enables teams to quickly start logging their important results to a central system of record. On top of this system of record, W&B has built visualization, automation, and documentation capabilities for better debugging, model tuning, and project management.

PoC Workshop Sessions

Date | Session | Recording | Topics Discussed
Aug 13, 2025 | PoC Planning Call | Recording (Gong) |
  • Align on use cases and criteria to be tested during the PoC
  • Demonstrated features like the model registry, lineage tracking, and distributed experiment tracking, which seem to address Torc's key requirements.

W&B is built with scale in mind. Here's an example project with large-scale runs (thousands of runs, each logging 10k metrics over 100k steps).
💡

W&B Installation & Authentication

To start using W&B, you first need to install the Python package (if it's not already there)
pip install wandb
Once it's installed, authenticate your user account by logging in through the CLI or SDK. You should have received an email to sign up for the platform, after which you can obtain your API key from the "Settings" section under your profile.
wandb login --host https://torc.wandb.io/ <YOUR API TOKEN>
OR through Python:
# Define the WANDB_BASE_URL = "https://torc.wandb.io/"
# and WANDB_API_KEY environment variables, then:
import os
import wandb

wandb.login(host=os.getenv("WANDB_BASE_URL"), key=os.getenv("WANDB_API_KEY"))
In headless environments, you can instead define the WANDB_API_KEY environment variable.
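For example, a minimal headless setup might look like the following sketch (the project name is illustrative, and in practice the key would come from a CI secret store rather than being hard-coded):
import os
import wandb

# Point the SDK at the dedicated instance and authenticate without an interactive prompt.
# Both values would normally be injected by your environment or CI secrets manager.
os.environ["WANDB_BASE_URL"] = "https://torc.wandb.io"
os.environ["WANDB_API_KEY"] = "<your-api-key>"  # placeholder; never commit a real key

# wandb.init() picks up the environment variables automatically, no wandb.login() needed
run = wandb.init(project="headless-smoke-test")
run.log({"setup_check": 1})
run.finish()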
Once you are logged in, you are ready to track your workflows!

Focus Areas for the POC

S. No. | Capability | Things to test
1. Experiment Tracking
  1. Logging parameters and metrics
  2. Logging system metrics (CPU, GPU, memory)
  3. Logging terminal output (stdout, stderr)
  4. Real-time ingestion of metrics
2. Artifacts
  1. Versioning datasets, models, and outputs
  2. Attaching artifacts to runs
  3. Managing large files (datasets or models > 1GB)
  4. Sharing and accessing artifacts across teams
  5. Tracking lineage (e.g., which dataset and model were used in a specific run)
  6. Retention policies and storage monitoring
3. Registry
  1. Registering models after training
  2. Attaching metadata, metrics, and lineage to registered models
  3. Promoting models through stages (e.g., staging → production)
  4. Searching/filtering registered models by tags or metrics
  5. Pulling and using registered models in downstream jobs



1: Experiment Tracking

Additional details, including a sample script, are included in the Experiment Tracking section below
💡
1. Logging parameters and metrics
Logging parameters and metrics can be done in two ways:
  • Using the W&B integrations for popular frameworks
  • Calling wandb.init() directly and logging metrics with run.log()
Sample Code:
import wandb

with wandb.init(project="dd-poc", config={"lr": 3e-4, "batch": 64}) as run:
    for step in range(100):
        run.log({"loss": 1 / (step + 1), "accuracy": step / 100})
More details in the "Track any Python process or experiment" section below.
2. Logging system metrics (CPU, GPU, memory)
W&B automatically tracks a wide range of system metrics, as documented here, including the ability to monitor GPU cluster performance with the NVIDIA DCGM-Exporter, as highlighted in the following report:

3. Logging terminal output (stdout, stderr)
W&B automatically captures several types of console logs, including stdout and stderr, and displays them in the W&B App.
4. Real-time ingestion of metrics
W&B is built to support real-time ingestion, making it easy to monitor a job while it's running rather than waiting for it to complete. This is especially helpful for long-running processes.

2: Artifacts

Additional details, including a sample script, are included in the Artifacts section below
💡
1. Versioning datasets, models, and outputs
W&B Artifacts version any file set (datasets, models, evaluation outputs). Each log creates an immutable version (v0, v1, …) with deduplication across versions. Use aliases (e.g., latest, best) and tags to label versions.
Docs:
More Details with example script in the Artifacts section below
2. Attaching artifacts to runs
Here's a sample script to do so:
# While logging Artifact
run = wandb.init() # Initialize run and then log with that run
model_art = wandb.Artifact("resnet50-imagenet", type="model")
model_art.add_file("checkpoints/model.pt")
run.log_artifact(model_art)

# While retrieving Artifacts
run = wandb.init()
used = run.use_artifact("images-2025-08-12:latest")
data_path = used.download()
More Details with example script in the Artifacts section below
3. Managing large files (datasets or models >1GB)
For large files, you can also use reference artifacts to track external URIs (S3/GCS/Azure/HTTP), so the data stays where it is without being copied into W&B. Here's an example script:
import wandb
# Initialize a W&B run
run = wandb.init()

# Create an artifact object
artifact = wandb.Artifact(name="name", type="type")

# Add a reference to the bucket path
artifact.add_reference(uri = "uri/to/your/bucket/path")

# Log the artifact's metadata
run.log_artifact(artifact)
run.finish()

Docs:
4. Sharing and accessing artifacts across teams
Artifacts inherit project permissions. You can consume across projects/entities with a fully‑qualified name (entity/project/artifact:alias). For stricter sharing, use restricted projects and team roles.
Docs:
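As a quick sketch (the entity, project, and artifact names below are placeholders), consuming an artifact that lives in another team's project only requires its fully-qualified name:
import wandb

run = wandb.init(project="downstream-job")

# Fully-qualified name: <entity>/<project>/<artifact name>:<alias>
artifact = run.use_artifact("other-team/shared-datasets/images-2025-08-12:latest")
data_dir = artifact.download()  # local path to the artifact contents

run.finish()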
5. Tracking lineage (e.g., which dataset and model were used in a specific run)
The Lineage view shows the DAG of runs ↔ artifacts. You can inspect any run to see exactly which dataset/model versions were used/produced.
Docs:
6. Retention policies and storage monitoring
Apply TTL (time‑to‑live) at the artifact or team default level; artifacts are soft‑deleted then purged. Model Registry‑linked artifacts are protected from TTL by default. For monitoring, use the Usage dashboard; Enterprise can integrate Prometheus and Slack alerts.
Try it (TTL via SDK):
import wandb
from datetime import timedelta

run = wandb.init(project="<my-project-name>", entity="<my-entity>")
artifact = wandb.Artifact(name="<artifact-name>", type="<type>")
artifact.add_file("<my_file>")

artifact.ttl = timedelta(days=30) # Set TTL policy
run.log_artifact(artifact)

Docs:

3: Registry

Additional details, including a sample script, are included in the Registry section below.
💡
W&B Registry is a curated central repository of W&B Artifact versions within your organization.
1. Registering models after training
Models can be registered (linked) to a Registry either via the UI or programmatically. The following docs link walks through the details of registering artifacts to the Registry.
Docs:
2. Attaching metadata, metrics, and lineage to registered models
Since the Registry is a curated view of Artifacts, this works much the same way as it does for Artifacts.
3. Promoting models through stages (e.g., staging → production)
The Registry has the concept of aliases (including protected aliases like staging and production) that can be used to mark the current version for each stage.
Docs:
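As a rough sketch of programmatic promotion (org, registry, and collection names below are placeholders; the same operation can also be done in the Registry UI), moving an alias such as production onto a specific version looks like this:
import wandb

api = wandb.Api()

# Fetch a specific registered version by its registry path (placeholder path)
model_version = api.artifact(
    "my-org/wandb-registry-model/image-classifier:v3", type="model"
)

# Point the "production" alias at this version; downstream jobs that pull
# ":production" will now receive it.
model_version.aliases.append("production")
model_version.save()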
4. Searching/filtering registered models by tags or metrics
Use the global search bar in the W&B Registry App to find a registry, collection, artifact version tag, collection tag, or alias. You can use MongoDB-style queries to filter registries, collections, and artifact versions based on specific criteria using the W&B Python SDK.
Docs:
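A sketch of the SDK-side query pattern described in the W&B docs (exact method availability depends on your SDK version; the filter values below are illustrative):
import wandb

api = wandb.Api()

# Registries whose name contains "model"
registries = api.registries(filter={"name": {"$regex": "model"}})

# Collections within those registries whose name contains "classifier"
collections = registries.collections(filter={"name": {"$regex": "classifier"}})

# Artifact versions carrying a specific tag
versions = registries.collections().versions(filter={"tag": "candidate"})

for version in versions:
    print(version.name, version.aliases)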
5. Pulling and using registered models in downstream jobs
You can use the W&B Python SDK to download a model artifact that you linked to the Model Registry.
This is similar to downloading an artifact and consuming it in downstream jobs.

Track any Python process or experiment

At the core of W&B is a Run, which is a logged unit of execution of Python code. A Run captures the entire execution context of that unit: Python library versions, hardware info, system metrics, git state, etc. To create a run, call wandb.init(). There are several important arguments you can pass to wandb.init() to provide additional context for the run and help you organize your runs later:
import wandb

wandb.init(
    project="my-sample-project",
    entity="<enter team name>",  # Team
    group="my_group",            # for organizing runs (e.g. distributed training)
    job_type="training",         # for organizing runs (e.g. preprocessing vs. training)
    config={
        "hyperparam1": 24,       # Hyperparams and other config
        "hyperparam2": "resnet",
    },
)
See the full documentation for wandb.init for other arguments to customize its behavior.

What Can I log and How do I log it?

Within a run context, you can log all sorts of useful info such as metrics, visualizations, charts, and interactive data tables explicitly with wandb.log. Here is a comprehensive guide to wandb.log and its API docs.

Scalar Metrics

Scalar metrics can be logged by passing them in to wandb.log as a dictionary with a name.
wandb.log({"my_metric": some_scalar_value})
Each time wandb.log is called, W&B increments the run's intrinsic _step variable, which is used by default as the x-axis of all the run's metrics charts.
💡
If you call wandb.log once per epoch, then the intrinsic _step value will represent the epoch count, but if you call wandb.log at other times (e.g., in validation or testing loops), the meaning of _step will not be clear. In these cases, you can pass a step manually by adding the step=my_int_variable parameter to your wandb.log call. This gives you full control over the resolution of your charts.
In PyTorch Lightning modules, for example, you may want to set step=trainer.global_step. The best practice is to pack all your step metrics into a single dictionary and log them in one go rather than making multiple wandb.log calls per step.
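For instance, a minimal sketch of pinning the x-axis explicitly (metric names and values are illustrative):
import wandb

run = wandb.init(project="step-control-demo")

for epoch in range(3):
    for batch_idx in range(100):
        global_step = epoch * 100 + batch_idx
        # Pack all per-step metrics into one dictionary and pin the x-axis explicitly
        run.log(
            {"train/loss": 1.0 / (global_step + 1), "train/lr": 3e-4},
            step=global_step,
        )
    # Validation metrics land on the same step axis as the last training batch
    run.log({"val/accuracy": 0.5 + 0.1 * epoch}, step=global_step)

run.finish()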
You will notice that if you log a scalar metric multiple times in a run, it will not only appear as a line chart with the _step as the x-axis, but it will also appear in the Runs Table. The value shown in the Runs Table is the summary metric, which defaults to the last value logged during the course of the run. You can change this behavior by explicitly setting the summary metric of the run using the run.summary object (i.e., run.summary["my_metric_name"]=some_value). This is useful if you want to compare runs according to different aggregations (e.g. mean, max, min) as opposed to simply using the last value logged:
import wandb

wandb.init()

for i in range(5):
    wandb.log({"my_metric": i})

wandb.summary["my_metric"] = 2  # 2 instead of the default 4

wandb.finish()
The W&B Experiment tracking dashboard offers easier comparisons across different runs with the Run comparer visualization where you can use the diff only toggle to easily look at the rows with different values across runs.

Distributed Training

W&B supports logging distributed training experiments. In distributed training, models are trained using multiple GPUs in parallel. W&B supports two patterns to track distributed training experiments (Here's the detailed guide on how to log distributed training experiments):

Track a single process

Initialize W&B (wandb.init) and log experiments (wandb.log) from a single process. This is a common solution for logging distributed training experiments with the PyTorch Distributed Data Parallel (DDP) class. In some cases, users funnel data over from other processes to the main logging process using a multiprocessing queue (or another communication primitive).
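A minimal sketch of the rank-0-only pattern (assumes a launcher such as torchrun sets the RANK environment variable; the training step is a stand-in):
import os
import wandb

# Under torchrun/SLURM, RANK identifies the process; rank 0 is treated as primary
rank = int(os.environ.get("RANK", 0))

# Only rank 0 creates a W&B run; other ranks train but do not log
run = wandb.init(project="ddp-rank0-logging", group="DDP") if rank == 0 else None

for step in range(100):
    loss = 1.0 / (step + 1)  # stand-in for your real training step
    if run is not None:
        run.log({"loss": loss}, step=step)

if run is not None:
    run.finish()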

Track Multiple Processes

There are two main ways to track multiple processes in W&B:
1. Track each process separately
Each process initializes its own W&B run (wandb.init) and logs independently.
To keep runs organized, use the group parameter (e.g. group="DDP"), so that all related runs show up under a single group in the UI.
Use this method if:
  • You want visibility into each process (e.g. for debugging or custom metrics per node)
  • You don’t need strict aggregation, but want each process’s data stored independently
End each run with wandb.finish() to clean up properly.
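A minimal sketch of this per-process pattern (project and group names are illustrative):
import os
import wandb

rank = int(os.environ.get("RANK", 0))

# Each process gets its own run; the shared group keeps them together in the UI
run = wandb.init(
    project="ddp-per-process",
    group="DDP",
    name=f"rank-{rank}",
    job_type="training",
)

for step in range(100):
    run.log({"loss": 1.0 / (step + 1), "rank": rank}, step=step)

run.finish()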
2. Track all processes to a single run (shared mode)
Requirements:
Available in SDK v0.19.9+ and server v0.68+.
💡
In this approach, a primary node initializes the run and generates a run ID.
Worker nodes then attach to that run ID with wandb.init(id=run_id, settings=...)
Key settings for shared mode:
  • mode="shared" (enables multi-process logging to one run)
  • x_label (labels each process, e.g. rank_0, rank_1 — useful in the UI)
  • x_primary=True/False (identifies primary vs. worker nodes)
  • x_update_finish_state=False (optional, prevents workers from ending the run)
W&B will aggregate all logs and system metrics under one run in the UI, while still letting you filter logs and plots by node label.
Use this method when:
  • You want a unified run in the UI across all processes
  • You're running in multi-GPU, multi-node environments and want centralized logging
Here's the detailed guide on how to log distributed training experiments
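A condensed sketch of the shared-mode pattern (how the run ID is distributed to the worker nodes is up to you; the SHARED_RUN_ID environment variable below is purely illustrative):
import os
import wandb

rank = int(os.environ.get("RANK", 0))
run_id = os.environ["SHARED_RUN_ID"]  # generated once on the primary and shared with workers

settings = wandb.Settings(
    mode="shared",                      # let multiple processes write to one run
    x_label=f"rank_{rank}",             # label this process's logs and system metrics
    x_primary=(rank == 0),              # only the primary manages the run lifecycle
    x_update_finish_state=(rank == 0),  # workers should not mark the run finished
)

run = wandb.init(id=run_id, project="ddp-shared-run", settings=settings)
run.log({"loss": 1.0 / (rank + 1)})
run.finish()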

Visualize and query dataframes via W&B Tables

Tables are a special wandb Data Type that allows you to log data, including other wandb Data Types, into an interactive dataframe in the workspace. This is especially useful for logging model predictions so you can filter them and inspect errors. To log a table, you can add data row by row, or pass a pandas DataFrame or Python lists. The elements of the dataframe can be any wandb Data Type (e.g. wandb.Image, wandb.Html, wandb.Plotly) or simple scalar or text values:
# Add data as a list of lists or pandas dataframe
my_data = [
    [0, wandb.Image("img_0.jpg"), 0, 0],
    [1, wandb.Image("img_1.jpg"), 8, 0],
    [2, wandb.Image("img_2.jpg"), 7, 1],
    [3, wandb.Image("img_3.jpg"), 1, 1],
]

# Create a wandb.Table() with corresponding columns
columns = ["id", "image", "prediction", "truth"]
test_table = wandb.Table(data=my_data, columns=columns)

# Add data incrementally
for img_id, img in enumerate(mnist_test_data):
    true_label = mnist_test_data_labels[img_id]
    guess_label = my_model.predict(img)
    test_table.add_data(img_id, wandb.Image(img), guess_label, true_label)

wandb.log({"test_table": test_table})
Use tables to log validation results, sample predictions, or model errors, not entire training datasets. They can handle up to 200k rows, but UI performance will vary depending on how many rich media types are embedded. Here is a comprehensive guide to logging tables.
Note on Tables: when logging tables, you will see wandb.summary["my_table_name"] in the workspace, like below. This uses a query panel expression to query logged data in W&B and render it appropriately. Read more about Query Panels here. The upshot for right now is that, by default, W&B only renders the last version of a table logged in a run (the summary version). So if you log tables multiple times throughout a run, you will only see the last one by default.



Track and version any serialized data

Artifacts enable you to track and version any serialized data as the inputs and outputs of runs. This can be datasets (e.g. image files), evaluation results (e.g. heatmaps), or model checkpoints. W&B is agnostic to the formats or structure of the data you want to log as an artifact.

Logging Artifacts

To log an artifact, you first create an Artifact object with a name, type, and optionally a description and metadata dictionary. You can then add any of the following to the artifact object:
  • local files
  • local directories
  • wandb Data Types (e.g. wandb.Plotly or wandb.Table) which will render alongside the artifact in the UI
  • remote files and directories (e.g. s3 buckets)
# 1. Log a dataset version as an artifact
import wandb
import os

# Initialize a new W&B run to track this job
run = wandb.init(project="artifacts-quickstart", job_type="dataset-creation")

# Create a sample dataset to log as an artifact
f = open('my-dataset.txt', 'w')
f.write('Imagine this is a big dataset.')
f.close()

# Create a new artifact, which is a sample dataset
dataset = wandb.Artifact('my-dataset', type='dataset')
# Add files to the artifact, in this case a simple text file
dataset.add_file('my-dataset.txt')
# Log the artifact to save it as an output of this run
run.log_artifact(dataset)

wandb.finish()
Each time you log this artifact, W&B will checksum the file assets you add to it and compare that to previous versions of the artifact. If there is a difference, a new version will be created, indicated by the version aliases v0, v1, v2, etc. Users can optionally add or remove additional aliases through the UI or API. Aliases are important because they uniquely identify an artifact version, so you can use them, for example, to pull down your best model.
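For example, extra aliases can be attached at logging time (the alias names below are illustrative):
import wandb

run = wandb.init(project="artifacts-quickstart", job_type="training")

model_art = wandb.Artifact("my-model", type="model")
model_art.add_file("checkpoints/model.pt")  # assumes this checkpoint exists locally

# "latest" is applied automatically; extra aliases make specific versions easy to pull later
run.log_artifact(model_art, aliases=["best", "candidate"])

run.finish()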

Consuming Artifacts

To consume an artifact, execute the following:
import wandb
run = wandb.init()
# Indicate we are using a dependency
artifact = run.use_artifact('dummy-team/that_was_easy/my-dataset:v3', type='dataset')
artifact_dir = artifact.download()

Tracking Artifacts By Reference

You may already have large datasets sitting in a cloud object store like s3 and just want to track what versions of those datasets Runs are utilizing and any other metadata associated with those datasets. You can do so by logging these artifacts by reference, in which case W&B only tracks the checksums and metadata of an artifact and does not copy the entire data asset to W&B. Here are some more details on tracking artifacts by reference.
With artifacts you can now refer to arbitrary data assets through durable and simple names and aliases (similar to how you deal with Docker containers). This makes it really easy to hand off these assets between people and processes and see the lineage of all data, models, and results.
If you're working with multiple component artifacts and would like to track the lineage of the collection of component artifacts in the form of a 'super artifact' - check out this colab here.
💡

House staged/candidate models

W&B Registry is a curated central repository that stores assets and provides versioning, aliases, lineage tracking, and governance. Registry allows individuals and teams across the entire organization to share and collaboratively manage the lifecycle of all models, datasets, and other artifacts. The registry can be accessed directly in SaaS by visiting https://wandb.ai/registry, or on your private instance at <host-url>/registry (for this PoC, https://torc.wandb.io/registry).
W&B Registry home page

Registry Types

W&B supports two types of registries: Core registries and Custom registries.
Core registry
A core registry is a template for specific use cases: Models and Datasets.
By default, the Models registry is configured to accept "model" artifact types and the Dataset registry is configured to accept "dataset" artifact types.
Custom registry
Custom registries are not restricted to "model" or "dataset" artifact types and can accept any user-defined type.
After creating a registry, you store individual collections of your assets in it for tracking.
Collection
A collection is a set of linked artifact versions in a registry. Each collection represents a distinct task or use case and serves as a container for a curated selection of artifact versions related to that task.
Below is a diagram showing how the registry integrates with your existing organization, teams, and projects.


Creating a Collection

Collections can be created programmatically or directly through the UI. Below, we'll cover programmatic creation. For the manual creation process through the UI, visit the Interactively create a collection section in the W&B docs.
W&B automatically creates a collection with the name you specify in the target path if you try to link an artifact to a collection that does not exist. The target path consists of the entity of the organization, the prefix "wandb-registry-", the name of the registry, and the name of the collection:
f"{org_entity}/wandb-registry-{registry_name}/{collection_name}"
The following code snippet shows how to programmatically create a collection. Replace values enclosed in <> with your own:
import wandb

# Initialize a run
run = wandb.init(entity="<team_entity>", project="<project>")

# Create an artifact object
artifact = wandb.Artifact(name="<artifact_name>", type="<artifact_type>")

# Define required registry definitions
org_entity = "<organization_entity>"
registry_name = "<registry_name>"
collection_name = "<collection_name>"
target_path = f"{org_entity}/wandb-registry-{registry_name}/{collection_name}"

# Link the artifact to a collection
run.link_artifact(artifact = artifact, target_path = target_path)

run.finish()
After creating your registry collection, you can programmatically link artifact versions to the registry. Linking an artifact to a registry collection brings that artifact version from a private, project-level scope, to the shared organization level scope.
Linking artifacts to a registry can be done programmatically or directly through the UI. Below, we'll cover programmatic linking. For the manual creation process through the UI, visit the "Registry App" and "Artifact browser" tabs of the How to link an artifact version section in the W&B docs.
Before you link an artifact to a collection, ensure that the registry that the collection belongs to already exists.
Use the target_path parameter to specify the collection and registry you want to link the artifact version to. The target path consists of:
{ORG_ENTITY_NAME}/wandb-registry-{REGISTRY_NAME}/{COLLECTION_NAME}
Copy and paste the code snippet below to link an artifact version to a collection within an existing registry. Replace values enclosed in <> with your own:
import wandb
# Define team and org
TEAM_ENTITY_NAME = "<team_entity_name>"
ORG_ENTITY_NAME = "<org_entity_name>"

REGISTRY_NAME = "<registry_name>"
COLLECTION_NAME = "<collection_name>"

run = wandb.init(entity=TEAM_ENTITY_NAME, project="<project_name>")

artifact = wandb.Artifact(name="<artifact_name>", type="<collection_type>")
artifact.add_file(local_path="<local_path_to_artifact>")

target_path=f"{ORG_ENTITY_NAME}/wandb-registry-{REGISTRY_NAME}/{COLLECTION_NAME}"
run.link_artifact(artifact = artifact, target_path = target_path)

Download and use an artifact from a registry

Use the W&B Python SDK to use and download an artifact that you linked to the W&B Registry.
Replace values within <> with your own:
import wandb

ORG_ENTITY_NAME = '<org-entity-name>'
REGISTRY_NAME = '<registry-name>'
COLLECTION_NAME = '<collection-name>'
ALIAS = '<artifact-alias>'
INDEX = '<artifact-index>'

run = wandb.init()  # Optionally use the entity and project arguments to specify where the run should be created

registered_artifact_name = f"{ORG_ENTITY_NAME}/wandb-registry-{REGISTRY_NAME}/{COLLECTION_NAME}:{ALIAS}"
registered_artifact = run.use_artifact(artifact_or_name=registered_artifact_name)  # marks this artifact as an input to your run
artifact_dir = registered_artifact.download()
Reference an artifact version with one of the following formats:
# Artifact name with version index specified
f"{ORG_ENTITY}/wandb-registry-{REGISTRY_NAME}/{COLLECTION_NAME}:v{INDEX}"

# Artifact name with alias specified
f"{ORG_ENTITY}/wandb-registry-{REGISTRY_NAME}/{COLLECTION_NAME}:{ALIAS}"
Where:
  • latest - Use the latest alias to specify the most recently linked version.
  • v# - Use v0, v1, v2, and so on to fetch a specific version in the collection.
  • alias - Specify the custom alias attached to the artifact version.

Organize and share visualizations

Reports are flexible documents you can build on top of your W&B projects. You can easily embed any asset (chart, artifact, table) logged in W&B into a report alongside markdown, LaTeX, code blocks, etc. You can create rich documentation from your logged assets without copy-pasting static figures into Word docs or managing Excel spreadsheets. Reports are live: as new experiments run, they update accordingly. The report you are viewing is a good example of what you can put into them.

Programmatic Reports

It may be useful to programmatically generate a report, such as for a standard model comparison analysis you run repeatedly when retraining models, or after a large hyperparameter search. The W&B Python SDK provides a means of programmatically generating reports under wandb.apis.reports. Check out the docs and this quickstart notebook.
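A rough sketch of what a programmatically generated report can look like (entity and project names are placeholders, and the exact block classes available depend on your SDK version):
import wandb.apis.reports as wr

# Build a simple report skeleton
report = wr.Report(
    entity="<team_entity>",
    project="dd-poc",
    title="Weekly model comparison",
    description="Auto-generated after each retraining cycle.",
)

report.blocks = [
    wr.H1("Training overview"),
    wr.P("Charts below update live as new experiments are logged."),
    wr.PanelGrid(),  # defaults to a panel grid over the project's runs
]

report.save()  # creates the report on the server
print(report.url)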

Other Useful Resources

Import/Export API

All data logged to W&B can be accessed programmatically through the import/export API (also called the public API). This enables you to pull down run and artifact data, then filter and manipulate it however you please in Python.
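For example, a small sketch of pulling filtered run data (the entity/project path and metric names are placeholders):
import wandb

api = wandb.Api()

# All finished runs in a project that reached a minimum accuracy
runs = api.runs(
    "my-team/dd-poc",
    filters={"state": "finished", "summary_metrics.accuracy": {"$gt": 0.9}},
)

for run in runs:
    print(run.name, run.config.get("lr"), run.summary.get("accuracy"))
    loss_history = run.history(keys=["loss"])  # per-step metrics as a pandas DataFrame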

Slack Alerts

You can set Slack alerts within a run that trigger when things happen in your training or evaluation scripts. For example, you may want to be notified when training is done or when a metric exceeds a certain value.
Details on enabling these alerts on your dedicated deployments can be found here
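A minimal sketch of triggering an alert from training code (the threshold and metric are illustrative):
import wandb

run = wandb.init(project="alert-demo")

for epoch in range(10):
    accuracy = 0.1 * epoch  # stand-in for your real evaluation metric
    if accuracy > 0.8:
        # Sends a notification to the Slack channel (or email) configured in your W&B settings
        run.alert(
            title="Accuracy threshold reached",
            text=f"Accuracy hit {accuracy:.2f} at epoch {epoch}",
        )
        break

run.finish()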

FAQs

W&B Models

1. I didn't name my run. Where is the run name coming from?
If you do not explicitly name your run, a random run name will be assigned to the run to help identify the run in the UI. For instance, random run names will look like "pleasant-flower-4" or "misunderstood-glade-2".
2. How can I configure the name of the run in my training code?
At the top of your training script when you call wandb.init, pass in an experiment name, like this:
wandb.init(name="my_awesome_run")
3. If wandb crashes, will it possibly crash my training run?
It is extremely important to us that we never interfere with your training runs. We run wandb in a separate process to make sure that if wandb somehow crashes, your training will nevertheless continue to run.
4. Why is a run marked crashed in W&B when it’s training fine locally?
This is likely a connection problem — if your server loses internet access and data stops syncing to W&B, we mark the run as crashed after a short period of retrying.
5. Does W&B support Distributed training?
Yes, W&B supports distributed training, here's the detailed guide on how to log distributed training experiments.
6. Can I use PyTorch profiler with W&B?
Here's a detailed report that walks through using the PyTorch profiler with W&B along with this associated Colab notebook.
7. What happens when a TTL policy is set on an artifact that is linked to the Registry?
W&B deactivates the option to set a TTL policy for model artifacts linked to the Model Registry. This is to help ensure that linked models do not accidentally expire if used in production workflows. More details can be found on the docs here
8. How do I stop wandb from writing to my terminal or my jupyter notebook output?
Set the environment variable WANDB_SILENT to true.
In Python
os.environ["WANDB_SILENT"] = "true"
Within Jupyter Notebook
%env WANDB_SILENT=true
With Command Line
WANDB_SILENT=true