W&B System of Record
Report created by W&B to highlight capabilities as part of your evaluation.
Created on November 16 | Last edited on November 16
Contents
- Weights and Biases (W&B) 💫
- Experiment Tracking 🍽
  - Minimal code setup needed to use the product
  - Ability to log model stats and results per training run
  - Ability to log model code per training run
  - Ability to log system performance metrics per training run
  - Ability to log to different projects
  - Ability to import offline run data
  - Ability to perform analysis on model runs and model run data
  - Interactive Tables
- Artifact Tracking and Versioning
  - Ability to log model file itself
  - Ability to log info about the data used to train the model
  - Ability to retrieve / export all data logged
- Reproducibility
  - Ability to log code
  - Ability to capture environment details
- Reports
  - Ability to present model development effort progress
Weights and Biases (W&B) 💫
Weights and Biases is an MLOps platform built to facilitate collaboration and reproducibility across the machine learning development lifecycle. Machine learning projects can quickly become a mess without best practices in place to aid developers and scientists as they iterate on models and move them to production.
W&B is lightweight enough to work with whatever framework or platform teams are currently using, but enables teams to quickly start logging their important results to a central system of record. On top of this system of record, W&B has built visualization, automation, and documentation capabilities for better debugging, model tuning, and project management.

Experiment Tracking 🍽
The entry point for W&B usage is Experiments. W&B has a few core primitives which comprise the experiment tracking system of the SDK, letting you log pretty much anything with W&B: scalar metrics, images, video, custom plots, and so on. Once the logging is complete, we then need to contextualize our experiment, and W&B provides the means to accomplish this via Reports.
Even though our experiment tracking is easy to instrument, there are also integrations available for popular frameworks and libraries that make it even easier!

Minimal code setup needed to use the product
Getting started with W&B is a simple procedure:
!pip install wandb

import wandb

with wandb.init(project="my_project") as run:
    run.log({"metric": 0.1})
This is the common pattern employed, and it can be simplified further when using one of the available integrations, as shown below.
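For example, with the Keras integration a single callback handles the per-epoch logging. The following is a minimal sketch; the toy model, data, and project name are purely illustrative, and the exact callback import path may vary by wandb version.

import wandb
from wandb.keras import WandbCallback
import tensorflow as tf

# Toy model and data purely for illustration
(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
x_train = x_train / 255.0
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])

with wandb.init(project="my_project") as run:
    # The callback streams loss and accuracy to W&B after every epoch
    model.fit(x_train, y_train, epochs=2, callbacks=[WandbCallback()])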
Ability to log model stats and results per training run
- Default stats for models: Loss, Epochs, Accuracy, other common stats
- Custom / Ad Hoc Stats
- User Specified
Through the W&B Python client, you can log anything to W&B. We support logging data as key-value pairs (think Python dictionary), where values can take on many different data types. Below, we show how logging model configuration, loss, accuracy, and other common stats manifests. The plots below were created within a project and imported directly into this report.
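For instance, a training loop might log its configuration once and its metrics every epoch. This is a minimal sketch; the project name, config values, and metric values are illustrative placeholders, not results from an actual run.

import wandb

config = {"learning_rate": 0.01, "epochs": 10, "batch_size": 32}

with wandb.init(project="my_project", config=config) as run:
    for epoch in range(config["epochs"]):
        # Substitute the real values computed by your training loop
        train_loss = 1.0 / (epoch + 1)
        val_accuracy = 0.5 + 0.04 * epoch
        run.log({"epoch": epoch, "train/loss": train_loss, "val/accuracy": val_accuracy})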
In some cases you might be exploring more than one modeling methodology; you can leverage W&B to aggregate the detail logged for each model. The first example below shows feature importance for all available features across all trained models (error bars represent the 25th and 75th percentiles). The second example presents similar detail as a heat map, where the y axis represents the model, the x axis represents the feature, and the cell value represents the importance of that feature for that model. This is considered advanced usage of W&B Custom Charting.
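Those aggregated views are configured as W&B Custom Charts in the UI; the underlying per-model data can be logged as a wandb.Table or with one of the built-in chart presets. A minimal sketch, with hypothetical feature names and importance values:

import wandb

with wandb.init(project="my_project", name="model-a") as run:
    # Hypothetical feature importances for one trained model
    importances = {"age": 0.31, "claim_amount": 0.52, "num_claims": 0.17}
    table = wandb.Table(data=[[k, v] for k, v in importances.items()],
                        columns=["feature", "importance"])
    run.log({"feature_importance": wandb.plot.bar(table, "feature", "importance",
                                                  title="Feature importance")})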
You can also log plots made in your Python environment. We provide methods to log Matplotlib and Plotly figures, using wandb.Image and wandb.Plotly respectively. In fact, there is a large set of data types which W&B can log; please see the details in the docs.
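A minimal sketch of both approaches; the figures here are placeholders rather than plots from a real experiment.

import wandb
import matplotlib.pyplot as plt
import plotly.express as px

with wandb.init(project="my_project") as run:
    # Matplotlib figure logged as a static image
    fig, ax = plt.subplots()
    ax.plot([0, 1, 2, 3], [1.0, 0.5, 0.25, 0.125])
    ax.set_title("Loss curve")
    run.log({"loss_curve_mpl": wandb.Image(fig)})
    plt.close(fig)

    # Plotly figure logged as an interactive chart
    pfig = px.scatter(x=[0, 1, 2, 3], y=[1.0, 0.5, 0.25, 0.125], title="Loss curve")
    run.log({"loss_curve_plotly": wandb.Plotly(pfig)})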
[Embedded chart panels from a run set of 60 runs]
Ability to log model code per training run
Ability to log system performance metrics per training run
Ability to log to different projects
Ability to import offline run data
Ability to perform analysis on model runs and model run data
Interactive Tables
Artifact Tracking and Versioning
Artifacts are the inputs and outputs of each part of your machine learning pipeline, namely datasets and models. Training datasets change over time as new data is collected, removed, or re-labeled, and models change as new architectures are implemented and as they are continuously re-trained. With these changes, all downstream tasks utilizing the changed datasets and models will be affected, and understanding this dependency chain is critical for debugging effectively. W&B can log this dependency graph easily with a few lines of code.
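A minimal sketch of how that lineage is captured, assuming a dataset artifact named training-data already exists and the trained model is written to model.pkl (both names are placeholders):

import wandb

with wandb.init(project="my_project", job_type="train") as run:
    # Declare the dataset this run consumes; W&B records the upstream edge
    dataset = run.use_artifact("training-data:latest")
    data_dir = dataset.download()

    # ... train the model here and save it to model.pkl ...

    # Log the trained model as a new artifact; W&B records the downstream edge
    model_artifact = wandb.Artifact("fraud-model", type="model")
    model_artifact.add_file("model.pkl")
    run.log_artifact(model_artifact)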

Ability to log model file itself
Through usage of Artifacts, model files can easily be logged to W&B, and we've extended Artifacts with a Model Registry (shown below). Please see the docs for more detail on Model Management.
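A minimal sketch of logging a model file and linking it into a registry collection; the artifact name and exported file are placeholders, and the registry target path is shown only for illustration.

import wandb

with wandb.init(project="my_project", job_type="register-model") as run:
    # Log the serialized model file as a model artifact
    model_artifact = wandb.Artifact("fraud-model", type="model")
    model_artifact.add_file("GLM_model.java")  # e.g. an exported H2O POJO
    run.log_artifact(model_artifact)

    # Link the new version into a Model Registry collection
    run.link_artifact(model_artifact, "model-registry/Fraud Detection LID 62a88bde90d0f2003c6a7bf9")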
Fraud Detection LID 62a88bde90d0f2003c6a7bf9
Model card
- Full name: wandb-smle/model-registry/Fraud Detection LID 62a88bde90d0f2003c6a7bf9
- Type: model
- Created At: July 28th, 2022
- Automations: 1 automation (Slack notifications: notify the team when changes happen in the model registry)

Model Description
Please see report https://wandb.ai/wandb-smle/h2o-autoML-classification/reports/AutoML-W-B--VmlldzoyMzkxNDIz#data-overview
The goal of this model is to "predict the potentially fraudulent providers" based on the claims filed by them.

Model Usage
Simple usage of the model to perform batch scoring with the H2O POJO file in the artifact.
!pip install wandb datarobot-drum

import wandb
import os

# Pull the production model version and the test dataset down from W&B
api = wandb.Api()
model_artifact = api.artifact('wandb-smle/model-registry/Fraud Detection LID 62a88bde90d0f2003c6a7bf9:production', type='model')
model_dir = model_artifact.download("MODEL")
dataset_artifact = api.artifact("wandb-smle/h2o-autoML-classification/test-data:v0", type='data')
dataset_dir = dataset_artifact.download("DATA")

# Keep only the H2O POJO (.java) file, then point DataRobot's drum CLI at it
for f in os.listdir("./MODEL"):
    if ".java" not in f:
        os.remove(os.path.join("./MODEL", f))
os.environ["CODE_DIR"] = "./MODEL"
os.environ["TARGET_TYPE"] = "binary"
os.environ["POSITIVE_CLASS_LABEL"] = "1"
os.environ["NEGATIVE_CLASS_LABEL"] = "0"

!drum score --input "./DATA/test_data.csv" --output "./DATA/test_predictions.csv"
Versions
Version | Aliases | Logged By | Tags | Created | TTL Remaining | # of Consuming Runs | Size
v0 | latest, fraud-detection | — | — | Thu Jul 28 2022 | Inactive | 1 | 14.0MB

Each version also carries the full H2O GLM training configuration as metadata columns (m.family, m.link, m.solver, m.model_id, m.training_frame, and so on).
Automations
Automation | Event type | Action type | Date created | Last execution
test-trigger | New version | Job launch | Thu Jan 26 2023, at 08:38 PM | —
Ability to log info about the data used to train the model
Not only can you log datasets via Artifacts, you can also log supplementary info about your data. Imagine the case where a training dataset is curated from three different datasets. We can log the schema to W&B and surface it in a report alongside the actual dataset detail logged to W&B.
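A minimal sketch of attaching such supplementary detail as artifact metadata; the schema, source list, and file name here are placeholders rather than the actual processed-data artifact shown below.

import wandb

schema = {
    "sources": ["claims.csv", "providers.csv", "beneficiaries.csv"],
    "columns": {"provider_id": "string", "claim_amount": "float", "is_fraud": "int"},
}

with wandb.init(project="my_project", job_type="dataset-upload") as run:
    artifact = wandb.Artifact(
        "processed-data",
        type="data",
        description="Training data curated from three source datasets",
        metadata=schema,
    )
    artifact.add_file("processed_data.csv")
    run.log_artifact(artifact)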
processed-data
Artifact overview
- Type: data
- Created At: July 28th, 2022
Below is a dashboard created with evidentlyAI to explore the data. Please excuse the messiness; it is a result of very long feature names.
[Embedded evidentlyAI dashboard panel (run set: 1 run)]
Ability to retrieve / export all data logged
Reproducibility
The need for reproducibility is critical. The inability to reproduce experiments can exacerbate issues with key-person dependencies and lead to redundant effort that could otherwise be avoided.
Weights and Biases offers a number of features to aid in the reproducibility task.
Ability to log code
Weights and Biases can automatically log the code which calls the wandb.init method. Imagine a script called train.py which invokes W&B to keep track of model training. Simply by setting up profile defaults, or by passing the save_code argument to wandb.init, the train.py file will be logged to W&B.
To save library code, you can call the wandb.run.log_code method to log additional code files to W&B.
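A minimal sketch of both options; the project name and source directory are placeholders.

import wandb

# save_code=True uploads the entry-point script (e.g. train.py) or notebook for this run
run = wandb.init(project="my_project", save_code=True)

# Optionally capture additional library code, e.g. every .py file under ./src
run.log_code(root="./src")

run.finish()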
We can also save notebooks to W&B. When you call wandb.init() inside Jupyter, we add a hook to automatically save a Jupyter notebook containing the history of code executed in your current session. You can find this session history in a run's file browser under the code directory:
In the panel below, clicking into files and then into __session_history.ipynb displays the cells that were executed in your session, along with any outputs created by calling IPython's display method. This lets you see exactly what code was run within Jupyter for a given run. When possible we also save the most recent version of the notebook, which you will find in the code directory as well.
source-3kglm2tc
Artifact overview
- Type: code
- Created At: May 10th, 2022
Ability to capture environment details
When a run is initiated, W&B performs a pip freeze so you always know which Python packages were used in a particular run.
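The captured package list is stored with the run's files as requirements.txt and can be pulled back down later. A minimal sketch using the public API; the run path is a placeholder.

import wandb

api = wandb.Api()
run = api.run("my-entity/my_project/abc123")  # placeholder entity/project/run_id

# Download the pip freeze captured when the run started
run.file("requirements.txt").download(replace=True)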

Also, by way of W&B Launch (currently in beta), users can connect their own SageMaker or Kubernetes clusters, then easily queue and manage jobs using W&B Launch.
- Kick off jobs on your own infrastructure from the W&B UI or CLI.
- Execute runs in reproducible containerized environments
- Queue and launch jobs across your own clusters, locally or in the cloud
- Easily tweak hyperparameters or input data and retrain models
Reports
W&B reports help contextualize and document the system of record built through logging diagnostics and results from different pieces of your pipeline. Reports are interactive and dynamic, reflecting filtered run sets logged in W&B. You can add all sorts of assets to a report; the one you are reading now includes plots, tables, images, code, and nested reports.
Ability to present model development effort progress
This is exactly the purpose of Reports. You can contextualize experiments and projects, and share reports with colleagues and stakeholders.