Use Cases




Model and Dataset Artifacts

Goal: For a given dataset or model, determine which models have used which datasets at any stage of development.
Using a W&B Weave panel in a report provides a means to query models (either from a project or from a registry). The panel gives you a view of the model and lets you tour the model's overview, metadata, usage snippet, files, and lineage.

Front and center within the overview, you get a counter of Num Consumers. From here, you can go into either the Usage tab or the Lineage tab to see details on the consumers, specifically the runs (a.k.a. experiments) that consume this artifact.
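If you would rather pull the same consumer information programmatically, here is a minimal sketch using the W&B public API (the entity, project, and artifact names below are placeholders):

import wandb

api = wandb.Api()

# Placeholder path; substitute your own entity/project/artifact:version.
dataset_artifact = api.artifact("entity-name/project-name/name-of-artifact:latest")

# used_by() returns the runs that consumed this artifact; each run's logged
# model artifacts are the models built on top of this dataset.
for run in dataset_artifact.used_by():
    models = [a.name for a in run.logged_artifacts() if a.type == "model"]
    print(run.name, "->", models)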
To introduce this panel in the report, simply start typing /weave. This will generate a pop-up box, which will allow you to select the Weave panel.

Once the panel is present, you will then need to enter an expression. The expression editor offers autocomplete, or you can just type in project("entity-name", "project-name").artifact("name-of-artifact"). Once this is done, you can click the run button and you should be off to the races.

You can create Weave panels that contain multiple artifacts, but this might get a little bit busy. For example, the panel below has queried all artifacts from a project.

Notice the two vertical scrollers at the bottom of the panel. The second scroller covers the artifact types. For instance, in this project I have artifacts with type dataset, model, code, and run_table. The first scroller allows you to scroll through the artifacts of the given type.
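If the multi-artifact panel feels too busy, a rough alternative is to enumerate the project's artifact types and collections via the public API, which mirrors the two scrollers (the project path is a placeholder):

import wandb

api = wandb.Api()
project_path = "entity-name/project-name"  # placeholder

# Outer loop mirrors the artifact-type scroller, inner loop the artifact scroller.
for artifact_type in api.artifact_types(project_path):
    for collection in artifact_type.collections():
        print(artifact_type.name, "/", collection.name)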

Data Summaries

Complete the summaries within your experiment and log those summaries to W&B. Methods to produce these summaries could leverage:
  • SweetViz / Evidently / PyCaret
  • Plotly
  • W&B Tables and Charts
Below, we show a W&B Table that captures the distribution of labels for a given run (e.g., a training run); a sketch of this approach follows. This approach may offer the most flexibility, as you can immediately couple the label distribution to a run, and you can show multiple runs in the same view. SweetViz and Evidently are suggested because they can save their data dashboards as HTML, which could then be logged to W&B and surfaced in a report, though it might feel busy or clumsy (see below).
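As a minimal sketch of the table approach (the project name is a placeholder, and df stands in for your training data), the label distribution for a run could be logged like this:

import pandas as pd
import wandb

# Stand-in for the training dataframe of this run; assumed to have a "label" column.
df = pd.DataFrame({"label": ["cat", "dog", "cat", "bird"]})

with wandb.init(project="project-name") as run:  # placeholder project
    label_counts = df["label"].value_counts().reset_index()
    label_counts.columns = ["label", "count"]
    run.log({"label_distribution": wandb.Table(dataframe=label_counts)})

    # Alternatively, log an HTML dashboard exported by SweetViz or Evidently:
    # run.log({"data_report": wandb.Html(open("sweetviz_report.html"))})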



Model Summaries

To complete model summaries, we will consider the runs which generated a given model artifact. Runs use various configurations and datasets, and we could introduce run comparisons. If you wanted to add a dataset dimension to this run comparison, you would need to log the name of the dataset with something like
import wandb

artifact_path = f"{entity}/{project_name}/{artifact_name}:{version}"
with wandb.init(config={"dataset": artifact_path}) as run:
    # Declaring the dataset in the config and consuming it with use_artifact records the dependency.
    data_artifact = run.use_artifact(run.config.dataset, type="dataset")
    ## execute training code
This pattern introduces a dataset key in the config section of the run comparer below, and it also makes it possible to add a dataset dimension to the parallel coordinates panel (see below for a mock example).


Concerning analysis by tags, I didn't immediately think of a way to complete this entirely within W&B, but one method would be to pull the data via the W&B API and turn it into a dataframe. You can see below the tags and val_accuracy columns. While my tags are nonsense, you could pursue your analysis in Python, log the results back to the project, and then visualize them in the report.
import pandas as pd
import wandb

api = wandb.Api()
# Assumption: mnist_v4 is the set of runs for this project, pulled via the public API.
mnist_v4 = api.runs(f"{entity}/{project_name}")

run_tags = []
summaries = []
val_accuracies = []
for run in list(mnist_v4):
    run_tags.append(run.tags)
    summaries.append(run.config)
    val_accuracies.append(run.summary.get("val_accuracy", None))

df = pd.DataFrame(summaries)
df["tags"] = run_tags
df["val_accuracy"] = val_accuracies
df.sort_values("val_accuracy")[["dataset", "learning_rate", "tags", "val_accuracy"]]
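
To surface the results of this analysis back in the project (and then in a report), one option, sketched here with the same placeholder names as above, is to log the summary dataframe as a table from a small analysis run:

# Keep the columns shown above and flatten the tag lists for display.
summary_df = df[["dataset", "learning_rate", "tags", "val_accuracy"]].copy()
summary_df["tags"] = summary_df["tags"].map(", ".join)

with wandb.init(project=project_name, job_type="analysis") as run:
    run.log({"tag_analysis": wandb.Table(dataframe=summary_df)})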

