Lightweight dataset and model versioning

Save every step of your pipeline

Just add wandb.init() to each script in your pipeline, from dataset creation to model evaluation.

Dataset versioning with deduplication

Automatically diff the latest version of a dataset, and only save the changes or new data.

Model tracking and model lineage

Save model checkpoints and compare model versions to identify the best candidate for production.

Effortless observability

Add just a few lines to your scripts and we’ll build a dependency graph for you, on your premises or in the cloud. Trace the flow of data through your pipeline, so you know exactly which datasets feed into your models.

Incremental tracking

Freely refine your datasets and models. We’ll track the evolution of your data over time and preserve checkpoints of your best performing models.

Data Access controls

Regulate and monitor access to your sensitive data. We’ll manage references to data in your private buckets, so the contents of your files never leave your premises.

Never lose track of another ML project