Beyond experiment tracking: best practices for W&B
This report shows some advanced tips for using wandb
Created on December 3 | Last edited on December 12
In this report, I introduce W&B features that are useful beyond simple experiment management. If you have just started using W&B, please read "For those who started to use W&B", which contains a list of W&B learning resources, before diving into this report.

- Experiments
  - How to collaborate with your teammates
  - Arguments which can be used when calling wandb.init()
  - Preventing forgetfulness of executing wandb.finish() using a with statement
  - wandb.summary for explicit logging
  - Environment variables setting
  - Alert notification via email or Slack
- Artifacts
  - Basics of logging data versions on wandb
  - With Reference artifacts, you don't need to upload your data to W&B!
  - Adding New Versions and Automatically Removing Duplicates
  - Artifacts in multiple runs
- Table
  - Basics of visualizing data on wandb
  - Put multiple media types (image, audio, video, and so on) on wandb, with code examples
  - Filter, group by, sort, and custom queries (Weave) in Tables!
  - Save tables as Artifacts
- Report
  - Best practices to share your results
  - How to put different projects' results into a single report
  - Sharing, permission management, and collaboration (Comments!)
  - Automated report creation using the Python API
- Ending / Other functions!
  - Sweeps
  - Model Registry
  - Launch
  - Automations
  - Traces
  - Weave
  - Monitoring
Experiments
How to collaborate with your teammates
Experiments in W&B are managed in a hierarchy of Entity => Project => Run. An entity represents a team. By default you have a personal entity, but you can also create a team entity and manage the same project together as a team. (For individual or academic use, you can only join one additional entity besides your personal one.) Below the entity level are projects; as the name suggests, use one for a single ML or DL project. Within a project you will conduct many experiments, and each experiment is recorded as a Run under that project. Note that entities and projects are created manually, while Runs are created automatically with each execution.
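For example, a run can be logged under a shared team entity like this (a minimal sketch; "my-team" and "image-classification" are hypothetical names):
import wandb

run = wandb.init(
    entity="my-team",                # team entity shared with your teammates (hypothetical)
    project="image-classification",  # project under that entity (hypothetical)
    name="baseline-run",             # optional human-readable name for this run
)
run.log({"accuracy": 0.9})
run.finish()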

Arguments which can be used when calling wandb.init()
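As a sketch, some commonly used arguments of wandb.init() look like this (all values are hypothetical):
import wandb

run = wandb.init(
    project="my-project",            # project to log the run under
    entity="my-team",                # entity (user or team) that owns the project
    name="lr-3e-4",                  # display name of the run
    group="resnet-experiments",      # group related runs together in the UI
    job_type="train",                # label the role of this run (train, eval, ...)
    tags=["baseline", "resnet"],     # tags for filtering runs
    notes="Baseline with default augmentation.",
    config={"lr": 3e-4, "batch_size": 64},  # hyperparameters tracked with the run
    mode="online",                   # "online", "offline", or "disabled"
)
run.finish()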
Preventing forgetfulness of executing wandb.finish() using a with statement
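wandb.init() returns a Run object that can be used as a context manager, so the run is finished automatically when the block exits, even if an exception is raised. A minimal sketch:
import wandb

# The run is finished automatically at the end of the with block,
# even when the code inside raises an exception.
with wandb.init(project="my-project") as run:
    run.log({"loss": 0.5})
# No explicit run.finish() (or wandb.finish()) is needed here.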
wandb.summary for explicit logging
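While run.log() records a time series of values, the summary stores a single value per key (by default the last logged value), and you can set it explicitly, for example to record the best metric instead of the last one. A minimal sketch (evaluate() is a hypothetical evaluation function):
import wandb

with wandb.init(project="my-project") as run:
    best_acc = 0.0
    for epoch in range(10):
        acc = evaluate(epoch)  # hypothetical evaluation function
        run.log({"accuracy": acc})
        best_acc = max(best_acc, acc)
    # Explicitly overwrite the summary with the best value instead of the last one.
    run.summary["best_accuracy"] = best_acc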
Environment variables setting
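Many settings passed to wandb.init() can instead be provided via environment variables, which is convenient on shared servers or in CI. A minimal sketch (the values are hypothetical; set these before wandb is initialized, or export them in your shell):
import os

# Standard W&B environment variables; the values are hypothetical.
os.environ["WANDB_PROJECT"] = "my-project"   # default project for wandb.init()
os.environ["WANDB_ENTITY"] = "my-team"       # default entity
os.environ["WANDB_MODE"] = "offline"         # run offline and sync later
os.environ["WANDB_DIR"] = "/tmp/wandb"       # where local run files are written

import wandb

run = wandb.init()  # picks up the settings above
run.finish()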
Alert notification via email or Slack
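You can trigger a notification to email or Slack (configured under Alerts in your W&B settings) from code with run.alert(). A minimal sketch with a hypothetical threshold:
import wandb
from wandb import AlertLevel

with wandb.init(project="my-project") as run:
    accuracy = 0.42  # hypothetical metric value
    if accuracy < 0.5:
        # Sends a notification via email or Slack, depending on your settings.
        run.alert(
            title="Low accuracy",
            text=f"Accuracy {accuracy} is below the 0.5 threshold.",
            level=AlertLevel.WARN,
        )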
Artifacts
Basics of logging data versions on wandb
Use W&B Artifacts to track and version any serialized data as the inputs and outputs of your W&B Runs. For example, a model training run might take a dataset as input and produce a trained model as output. In addition to logging hyperparameters and metadata to a run, you can use an artifact to log the dataset used to train the model as input and the resulting model checkpoints as outputs. You will always be able to answer the question "what version of my dataset was this model trained on?" In short, W&B Artifacts let you track and version the inputs and outputs of your runs.
The diagram below demonstrates how you can use artifacts throughout your entire ML workflow, as inputs and outputs of runs.

Basic usage
The following code shows the basic usage of Artifacts.
import wandb

with wandb.init(project="artifacts-example", job_type="add-dataset") as run:
    # Create an artifact object with the wandb.Artifact API.
    artifact = wandb.Artifact(name="my_data", type="dataset")
    # Add one or more files or directories, such as a model file or dataset, to your artifact object.
    artifact.add_dir(local_path="./dataset")  # Add the dataset directory to the artifact
    # Log your artifact to W&B.
    run.log_artifact(artifact)
One of the powerful features of Artifacts is lineage. When you use W&B Artifacts, a lineage graph is created automatically, so you can easily see which dataset each model was trained on.

Some advanced features of Artifacts are introduced in the following sections.
With Reference artifacts, you don't need to upload your data to W&B!
You may already have large datasets sitting in a cloud object store like Amazon S3 and just want to track which versions of those datasets your Runs use, together with any other metadata associated with them. You can do so by logging these artifacts by reference: W&B tracks only the checksums and metadata of the artifact and does not copy the data itself to W&B.
Assume we have a bucket with the following structure:
s3://my-bucket
+-- datasets/
|   +-- mnist/
+-- models/
    +-- cnn/
Under mnist, we have our dataset, a collection of images. Let's track it with an artifact:
import wandb

run = wandb.init()
artifact = wandb.Artifact("mnist", type="dataset")
artifact.add_reference("s3://my-bucket/datasets/mnist")
run.log_artifact(artifact)
You can use the artifact with the following code.
import wandb

run = wandb.init()
artifact = run.use_artifact("mnist:latest", type="dataset")
artifact_dir = artifact.download()
W&B Artifacts support any Amazon S3 compatible interface — including MinIO.
To learn more about reference artifacts, please check the official documentation and the following report.
Adding New Versions and Automatically Removing Duplicates
Adding new versions of an artifact is very simple: as long as the same artifact name is used, versions are automatically managed as v0, v1, v2, and so on.

First, let's look at the simplest method. Here, a run logs a new artifact version that contains all of the files in the artifact.
import wandb

with wandb.init() as run:
    artifact = wandb.Artifact("artifact_name", "artifact_type")
    # Add files and assets to the artifact using
    # `.add`, `.add_file`, `.add_dir`, and `.add_reference`
    artifact.add_file("image1.png")
    run.log_artifact(artifact)
There is also a way to register only the differences from the previous version, instead of saving all files. When you add, change, or delete a subset of files from the previous artifact version, the unchanged files do not need to be re-indexed: the changes are logged as a new artifact version, called an incremental artifact.

Below is the workflow for incremental artifacts. For more details, check the official documentation.
import wandb

with wandb.init(job_type="modify dataset") as run:
    # Fetch the latest artifact version and declare it as an input to this run.
    saved_artifact = run.use_artifact("my_artifact:latest")
    # Create a draft version based on it.
    draft_artifact = saved_artifact.new_draft()

    # Modify a subset of files in the draft version.
    draft_artifact.add_file("file_to_add.txt")
    draft_artifact.remove("dir_to_remove/")

    # Log your changes to create a new version and mark it as an output of this run.
    run.log_artifact(draft_artifact)
Artifacts in multiple runs
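A typical pattern is to log an artifact in one run and consume it in another; W&B then links both runs through the artifact in the lineage graph. A minimal sketch with a hypothetical dataset artifact:
import wandb

# Run 1: produce the artifact.
with wandb.init(project="artifacts-example", job_type="build-dataset") as run:
    artifact = wandb.Artifact("my_data", type="dataset")
    artifact.add_file("dataset.csv")  # hypothetical data file
    run.log_artifact(artifact)

# Run 2: consume the same artifact in a different run.
with wandb.init(project="artifacts-example", job_type="train") as run:
    artifact = run.use_artifact("my_data:latest")
    data_dir = artifact.download()  # local path to the artifact contents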
Table
Basics of visualizing data on wandb
Use W&B Tables to visualize and query tabular data. For example:
- Compare how different models perform on the same test set
- Identify patterns in your data
- Look at sample model predictions visually
- Query to find commonly misclassified examples
The following table shows semantic segmentation results with custom metrics. This sample project is from the W&B ML Course. Please click the images to see how you can interactively change the visualization!
[Embedded Table panel: run set of 25 runs]
Basic usage
A Table is a two-dimensional grid of data where each column has a single type of data. Tables support primitive and numeric types, as well as nested lists, dictionaries, and rich media types.
import wandb

with wandb.init(project="table-test") as run:
    # wandb.Table(): create a new table object.
    my_table = wandb.Table(columns=["a", "b"], data=[["a1", "b1"], ["a2", "b2"]])
    # run.log(): log the table to save it to W&B.
    run.log({"Table Name": my_table})
Put multiple media types (image, audio, video, and so on) on wandb, with code examples
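Rich media types such as wandb.Image, wandb.Audio, and wandb.Video can be logged directly or placed in Table cells. A minimal sketch using synthetic stand-ins for real data (a random image and a generated 440 Hz tone):
import numpy as np
import wandb

with wandb.init(project="table-test") as run:
    table = wandb.Table(columns=["id", "image", "audio"])

    # Synthetic stand-ins for real data: a random image and a one-second tone.
    image = wandb.Image(
        np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8),
        caption="random image",
    )
    sample_rate = 16000
    tone = np.sin(2 * np.pi * 440 * np.arange(sample_rate) / sample_rate)
    audio = wandb.Audio(tone, sample_rate=sample_rate)

    table.add_data(0, image, audio)
    run.log({"media_table": table})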
Filter, group by, sort, and custom queries (Weave) in Tables!
You can filter, group, sort, and more, all interactively, in W&B Tables. In the following example, the distributions of variables across targets are compared.
[Embedded Table panel: run set of 40 runs]
Save tables as Artifacts
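Logging a table with run.log() stores it with the run; to version it explicitly, you can add the table to an artifact instead. A minimal sketch:
import wandb

with wandb.init(project="table-test") as run:
    table = wandb.Table(columns=["a", "b"], data=[["a1", "b1"], ["a2", "b2"]])
    artifact = wandb.Artifact("my_tables", type="dataset")
    artifact.add(table, "my_table")  # store the table inside the artifact
    run.log_artifact(artifact)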
Report
Best practices to share your results
How to put different projects' results into a single report
Sharing, permission management, and collaboration (Comments!)
Automated report creation using the Python API
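Reports can also be created programmatically. Below is a minimal sketch, assuming the wandb-workspaces package (pip install wandb-workspaces); the entity, project, and contents are hypothetical:
import wandb_workspaces.reports.v2 as wr

report = wr.Report(
    entity="my-team",          # hypothetical entity
    project="my-project",      # hypothetical project
    title="Weekly experiment summary",
    description="Automatically generated report.",
)
report.blocks = [
    wr.H1(text="Results"),
    wr.P(text="This report was created with the Python Report API."),
]
report.save()  # creates the report in the W&B app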
Ending / Other functions!
In addition, Weights & Biases offers the following features. Please check our official documentation or courses if you want to learn more!
Sweeps
Model Registry
Launch
Automations
Traces
Weave
Monitoring