
Dialpad PoC Guide

A one-stop shop for everything you need to test out during the PoC.
Created on December 19|Last edited on January 17
Access Weights & Biases here: https://wandb.ai/nlp-dialpad
💡 For any questions, please reach out via the Slack channel (#wandb-dialpad-trial)


Weights and Biases (W&B) 💫

Weights & Biases (W&B) is an MLOps platform built to facilitate collaboration and reproducibility across the machine learning development lifecycle. Machine learning projects can quickly become a mess without best practices in place to aid developers and scientists as they iterate on models and move them to production.
W&B is lightweight enough to work with whatever framework or platform teams are currently using, but enables teams to quickly start logging their important results to a central system of record. On top of this system of record, W&B has built visualization, automation, and documentation capabilities for better debugging, model tuning, and project management.

Workshop Sessions

Date | Session | Recording Link | Topics Discussed
Nov 20, 2024 | W&B Platform Demo | https://us-39259.app.gong.io/e/c-share/?tkn=81obab9sdn5b12xxp6ozolqqc | W&B Models platform
Dec 4, 2024 | W&B Pilot Criteria | https://us-39259.app.gong.io/call?id=2644016983110422473&account-id=5700219410851100311 | W&B Pilot Objectives + Importing Data

Overall W&B documentation: https://docs.wandb.ai/
Log distributed training experiments: https://docs.wandb.ai/guides/track/log/distributed-training/

W&B Authentication

Environment

The Weights & Biases trial account is hosted here, and everyone should have access. Let us know if you haven't received an invite.

Getting Started (SDK Installation and Login)

To start using W&B, first install the Python package (if it isn't already installed):
pip install wandb
Once it's installed, authenticate your account by logging in through the CLI or SDK. You should have received an email invitation to the platform, after which you can obtain your API token (found in the "Settings" section under your profile):
wandb login --host <YOUR W&B HOST URL> <YOUR API TOKEN>
Or through Python (setting the host URL as an environment variable, with WANDB_API_KEY assumed to be set in your environment):
import os
import wandb

os.environ["WANDB_BASE_URL"] = "https://api.wandb.ai"
wandb.login(host=os.environ["WANDB_BASE_URL"], key=os.environ["WANDB_API_KEY"])
Once you are logged in, you are ready to track your workflows!
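To confirm everything is wired up, you can log a quick throwaway run first (a minimal sketch; the project name is just an example):
import wandb

# Start a short test run to confirm authentication and logging work
run = wandb.init(project="wandb-poc-smoke-test")  # example project name
run.log({"sanity_check": 1.0})
run.finish()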

Test Cases

S No | Capability & Success Criteria | W&B Product Area
1 | Import data into W&B | W&B Experiments
2 | Tracking evaluation of models and metrics | W&B Experiment Tracking
3 | Show model provenance and data lineage | W&B Artifacts, W&B Registry
4 | Build a report that presents research (observability view for a portfolio of models, datasets, etc.) | W&B Reports



Test Case 1: Import data into W&B (from MLflow, spreadsheets)

Importing from MLflow:
W&B supports importing data from MLflow (docs), including experiments, runs, artifacts, metrics, and other metadata.
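As a rough sketch of what the importer looks like (based on the importer API in the linked docs; class names and arguments may differ across SDK versions, and the tracking URI below is a placeholder):
from wandb.apis.importers.mlflow import MlflowImporter

# Point the importer at your MLflow tracking server (placeholder URI)
importer = MlflowImporter(mlflow_tracking_uri="http://localhost:5000")

# Collect the MLflow runs, then import them into W&B
runs = importer.collect_runs()
importer.import_runs(runs)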
Importing from spreadsheets:
Depending on the structure of the spreadsheet data, you can write a script that reads through the rows and logs each row as a W&B run.
Dialpad's Current Setup:
  • 15 different tasks in Google Sheets, with multiple models as rows
  • Each task has ~5 metrics (ROC1, ROC2, F1, precision, recall)
Based on the setup above, here's an example dataset and a script you can use to read from the Google Sheet and log to W&B:
Assuming the following spreadsheet structure:
Task | Model | ROC1 | ROC2 | F1 | Precision | Recall
Task 1 | Model A | 0.85 | 0.90 | 0.88 | 0.91 | 0.87
Task 1 | Model B | 0.83 | 0.89 | 0.87 | 0.90 | 0.85
Task 2 | Model A | 0.80 | 0.86 | 0.84 | 0.88 | 0.82
import pandas as pd
import wandb

# Load the spreadsheet
file_path = 'path/to/your/spreadsheet.csv'  # Update this with your actual file path
df = pd.read_csv(file_path)  # For a CSV file
# df = pd.read_excel(file_path)  # For an Excel file

# Log each row as a separate W&B run
for index, row in df.iterrows():
    # Initialize a new W&B run for each model's evaluation
    wandb.init(project="log2wandb", entity="nlp-dialpad", name=f"{row['Task']} - {row['Model']}")
    # Replace the metrics with the ones in your spreadsheet
    wandb.log({
        "ROC1": row["ROC1"],
        "ROC2": row["ROC2"],
        "F1": row["F1"],
        "Precision": row["Precision"],
        "Recall": row["Recall"]
    })
    # Finish the run
    wandb.finish()

print("All runs have been logged to W&B.")

Example pre-training benchmark metrics CSV shared by Dialpad:

Script:
# Script to log the above CSV to W&B
import pandas as pd
import wandb

# Load the spreadsheet
file_path = 'path/to/your/spreadsheet.csv'  # Update this with your actual file path
df = pd.read_csv(file_path)  # For a CSV file

# Log each row as a separate W&B run
for index, row in df.iterrows():
    wandb.init(project="log2wandb", entity="nlp-dialpad", name=f"{row['model']}")
    # Log the metrics for this model
    wandb.log({
        "model": row["model"],
        "accuracy": row["accuracy"],
        "f1-macro": row["f1-macro"],
        "f1-micro": row["f1-micro"],
        "f1-weighted": row["f1-weighted"],
        "precision-macro": row["precision-macro"],
        "precision-micro": row["precision-micro"],
        "precision-weighted": row["precision-weighted"],
        "recall-macro": row["recall-macro"],
        "recall-micro": row["recall-micro"],
        "recall-weighted": row["recall-weighted"]
    })
    # Finish the run
    wandb.finish()

print("All runs have been logged to W&B.")
Here's the corresponding W&B workspace for the metrics logged from the CSV above.

Test Case 2: Tracking evaluation of models and metrics
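A minimal sketch of what evaluation tracking could look like (the project and entity come from this trial; the model name, metrics, and values are illustrative):
import wandb

# One run per model evaluation; config captures hyperparameters and metadata
run = wandb.init(project="log2wandb", entity="nlp-dialpad", name="model-a-eval",
                 config={"model": "model-a", "dataset": "eval-v1"})

# Log metrics over steps so they render as charts in the workspace
for step, f1 in enumerate([0.81, 0.84, 0.86]):  # stand-in for real per-step metrics
    run.log({"f1": f1})

run.summary["best_f1"] = 0.86  # summary values appear in the runs table
run.finish()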

Test Case 3: Show model provenance and data lineage
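A minimal sketch using W&B Artifacts: marking a dataset artifact as an input and logging a model artifact as an output is what lets W&B draw the lineage graph between data, runs, and models (the artifact names and file paths below are placeholders):
import wandb

run = wandb.init(project="log2wandb", entity="nlp-dialpad", job_type="train")

# Declare the training data as an input (assumes an artifact named "eval-dataset" was logged earlier)
dataset = run.use_artifact("eval-dataset:latest")

# Log the trained model as an output artifact (records that this run produced it)
model_art = wandb.Artifact("my-model", type="model")  # placeholder name
model_art.add_file("path/to/model.pt")  # placeholder path
run.log_artifact(model_art)

run.finish()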

Test Case 4: Build a report that presents research on data exploration, model outcomes, insights, etc.
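Reports are usually assembled in the UI by adding panels from a workspace, but there is also a programmatic Report API. A rough sketch under that assumption (the title and blocks are illustrative, and the API surface has moved between SDK versions, so treat this as directional):
import wandb.apis.reports as wr

report = wr.Report(
    project="log2wandb",
    entity="nlp-dialpad",
    title="PoC Research Summary",  # illustrative title
    description="Data exploration, model outcomes, and insights",
)
report.blocks = [wr.H1(text="Model Outcomes"), wr.P(text="Summary of the evaluation runs.")]
report.save()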

Experiment Tracking 🍽

W&B Tables
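A minimal sketch of logging a table (column names and rows are illustrative):
import wandb

run = wandb.init(project="log2wandb", entity="nlp-dialpad")

# Tables hold rows of rich data (text, numbers, media) for interactive filtering in the UI
table = wandb.Table(columns=["example_id", "prediction", "label"])
table.add_data("call-001", "positive", "positive")  # illustrative rows
table.add_data("call-002", "neutral", "negative")
run.log({"eval_samples": table})

run.finish()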

Artifact Tracking and Versioning

W&B Registry

Hyperparameter Sweeps
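A minimal sketch of defining and launching a sweep (the search space and the train() body are placeholders):
import wandb

# Placeholder training function: the agent injects hyperparameters via wandb.config
def train():
    run = wandb.init()
    lr = run.config.learning_rate
    run.log({"loss": 1.0 / (1.0 + lr)})  # stand-in for a real training loop
    run.finish()

sweep_config = {
    "method": "bayes",  # grid / random / bayes
    "metric": {"name": "loss", "goal": "minimize"},
    "parameters": {"learning_rate": {"min": 0.0001, "max": 0.1}},
}

sweep_id = wandb.sweep(sweep_config, project="log2wandb", entity="nlp-dialpad")
wandb.agent(sweep_id, function=train, count=5)  # run 5 trials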

Reports

Other Useful Resources

Import/Export API

All data logged to W&B can be accessed programmatically through the import/export API (also called the public API). This lets you pull down run and artifact data and filter and manipulate it however you please in Python.
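For example, to pull every run in a project into a DataFrame (a sketch; the project path is this trial's, the rest is illustrative):
import pandas as pd
import wandb

api = wandb.Api()

# Fetch all runs in the project and collect each run's name plus its summary metrics
rows = []
for run in api.runs("nlp-dialpad/log2wandb"):
    rows.append({"name": run.name, **run.summary._json_dict})

df = pd.DataFrame(rows)
print(df.head())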

Slack Alerts

You can set Slack alerts within a run that trigger when things happen in your training / evaluation scripts. For example, you may want to be notified when training is done or when a metric exceeds a certain value.
Details on enabling these alerts on your dedicated deployment can be found here
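Once alerts are enabled, triggering one from code is a one-liner; a minimal sketch (the metric and threshold are illustrative):
import wandb

run = wandb.init(project="log2wandb", entity="nlp-dialpad")
f1 = 0.62  # stand-in for a metric computed during evaluation

# Fire a Slack/email alert when the metric drops below a chosen threshold
if f1 < 0.7:
    run.alert(title="Low F1", text=f"F1 {f1:.2f} is below the 0.70 threshold",
              level=wandb.AlertLevel.WARN)

run.finish()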

FAQs

1. I didn't name my run. Where is the run name coming from?
Ans: If you do not explicitly name your run, a random name is assigned to help identify it in the UI; for instance, "pleasant-flower-4" or "misunderstood-glade-2".
2. How can I configure the name of the run in my training code?
Ans: At the top of your training script when you call wandb.init, pass in an experiment name, like this:
wandb.init(name="my_awesome_run")
3. If wandb crashes, will it possibly crash my training run?
Ans: It is extremely important to us that we never interfere with your training runs. We run wandb in a separate process to make sure that if wandb somehow crashes, your training will continue to run. If the internet goes out, wandb will continue to retry sending data to wandb.ai.
4. Why is a run marked crashed in W&B when it's training fine locally?
Ans: This is likely a connection problem: if your server loses internet access and data stops syncing to W&B, we mark the run as crashed after a short period of retrying.
5. How do I integrate W&B with Hugging Face Transformers?
Ans: W&B has a native integration with Hugging Face Transformers that adds rich, flexible experiment tracking and model versioning to interactive centralized dashboards without compromising ease of use.
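In practice the integration is mostly configuration; a minimal sketch with the Trainer (model and dataset setup omitted, names illustrative):
import os
from transformers import TrainingArguments

os.environ["WANDB_PROJECT"] = "log2wandb"  # project the Trainer logs to

# report_to="wandb" turns on W&B logging of metrics and config during training
args = TrainingArguments(
    output_dir="outputs",
    report_to="wandb",
    run_name="bert-baseline",  # illustrative run name
    logging_steps=50,
)
# trainer = Trainer(model=model, args=args, train_dataset=..., eval_dataset=...)  # plug into your setup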
6. Is there a way to redo or undo changes on the dashboard?
Ans: Yes, in the top-right section of the workspace there are buttons to Undo (⌘+Z) or Redo (⌘+⇧+Z) any changes made on the dashboard (as highlighted below)

7. Can I create a custom calculation in the W&B workspace that computes the average of some metrics?
Ans: Yes, W&B has Query Panels that let you query and interactively visualize your data; to compute an average, you can reference metrics in the query expression (e.g. row["metric1"]).


8. How do I fetch run data programmatically after it's logged to W&B?
Ans: You can use the W&B API to fetch run metrics. You'll need the entity, project, and run ID. Here's an example script:
import wandb
api = wandb.Api()

wandb_team_name = ""  # add your team name
wandb_project_name = ""  # add your project name
run_id = ""  # add run ID
run = api.run(f"{wandb_team_name}/{wandb_project_name}/{run_id}")

print(run.summary)
9. How do I stop wandb from writing to my terminal or my Jupyter notebook output?
Ans: Set the environment variable WANDB_SILENT to true.
In Python
os.environ["WANDB_SILENT"] = "true"
Within Jupyter Notebook
%env WANDB_SILENT=true
With Command Line
WANDB_SILENT=true