Dialpad PoC Guide
One stop shop for everything you need to test out during the PoC.
💡 For any questions, please reach out via the Slack channel (#wandb-dialpad-trial)
Contents:
- Weights and Biases (W&B) 💫
- Workshop Sessions
- Quick Documentation Links
- W&B Authentication
- Environment
- Getting Started (SDK Installation and Login)
- Test Cases
  - Test Case 1: Import data into W&B (from MLFlow, spreadsheets)
  - Test Case 2: Tracking evaluation of models and metrics
  - Test Case 3: Show model provenance and data lineage
  - Test Case 4: Build a report that presents research of Data Exploration, Model Outcome, Insights etc.
- Experiment Tracking 🍽
- W&B Tables
- Artifact Tracking and Versioning
- W&B Registry
- Hyperparameter Sweeps
- Reports
- Other Useful Resources
- Import/Export API
- Slack Alerts
- FAQs
Weights and Biases (W&B) 💫
Weights and Biases is a ML Ops platform built to facilitate collaboration and reproducibility across the machine learning development lifecycle. Machine learning projects can quickly become a mess without some best practices in place to aid developers and scientists as they iterate on models and move them to production.
W&B is lightweight enough to work with whatever framework or platform teams are currently using, but enables teams to quickly start logging their important results to a central system of record. On top of this system of record, W&B has built visualization, automation, and documentation capabilities for better debugging, model tuning, and project management.
Workshop Sessions
Date | Session | Recording Link | Topics Discussed |
---|---|---|---|
Nov 20, 2024 | W&B Platform Demo | https://us-39259.app.gong.io/e/c-share/?tkn=81obab9sdn5b12xxp6ozolqqc | W&B Models platform |
Dec 4, 2024 | W&B Pilot Criteria | https://us-39259.app.gong.io/call?id=2644016983110422473&account-id=5700219410851100311 | W&B Pilot Objectives + Importing Data |
Quick Documentation Links
W&B Authentication
Environment
The Weights & Biases trial account is hosted here, and everyone should have access. Let us know if you haven't received an invite.
Getting Started (SDK Installation and Login)
To start using W&B, first install the Python package (if it isn't already installed):
pip install wandb
Once it's installed, authenticate your user account by logging in through the CLI or SDK. You should have received an email to sign up to the platform, after which you can obtain your API token (the API token is in the "Settings" section under your profile).
wandb login --host <YOUR W&B HOST URL> <YOUR API TOKEN>
OR through Python:
import os
import wandb

wandb.login(host=os.getenv("WANDB_BASE_URL"), key=os.getenv("WANDB_API_KEY"))
Once you are logged in, you are ready to track your workflows!
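As a quick sanity check, here is a minimal tracking sketch (the project name and metric values below are placeholders, not part of Dialpad's setup):

import wandb

# Start a run (project name here is a placeholder)
run = wandb.init(project="my-first-project", config={"learning_rate": 0.001, "epochs": 5})

# Log metrics as training progresses
for epoch in range(run.config.epochs):
    loss = 1.0 / (epoch + 1)  # stand-in for a real training loss
    wandb.log({"epoch": epoch, "loss": loss})

# Mark the run as finished
run.finish()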
Test Cases
S No | Capability & Success Criteria | W&B Product Area |
---|---|---|
1 | Import data into W&B | W&B Experiments |
2 | Tracking evaluation of models and metrics | W&B Experiment Tracking |
3 | Show model provenance and data lineage | W&B Artifacts, W&B Registry |
4 | Build a report that presents research (Observability view for portfolio of model, datasets etc.) | W&B Reports |
Test Case 1: Import data into W&B (from MLFlow, spreadsheets)
Importing from MLFlow:
W&B supports importing data from MLFlow (docs), including experiments, runs, artifacts, metrics, and other metadata.
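If the built-in importer described in the docs doesn't fit your setup, one alternative (sketched below, assuming your script can reach the MLflow tracking server) is to read runs back out of MLflow with its client API and re-log them to W&B yourself. The tracking URI and W&B project name are placeholders:

from mlflow.tracking import MlflowClient
import wandb

# Point the client at your MLflow tracking server (placeholder URI)
client = MlflowClient(tracking_uri="http://your-mlflow-server:5000")

for experiment in client.search_experiments():
    for mlflow_run in client.search_runs(experiment_ids=[experiment.experiment_id]):
        # Re-create each MLflow run as a W&B run, carrying over params and metrics
        with wandb.init(
            project="mlflow-import",  # placeholder project
            name=mlflow_run.info.run_name,
            config=mlflow_run.data.params,
        ) as run:
            # data.metrics holds the latest value of each metric
            run.log(mlflow_run.data.metrics)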
Importing from spreadsheets:
Depending on the structure of the spreadsheet data, you can configure a script that reads through the rows and logs each row as a W&B run.
Dialpad's Current Setup:
- 15 different tasks in Google Sheets, with multiple models as rows
- Each task has ~5 metrics (ROC1, ROC2, F1, precision, recall)
Based on the above setup, here's an example dataset and a script you can use to read from the Google Sheet and log to W&B:
Assuming the following spreadsheet structure:
| Task | Model | ROC1 | ROC2 | F1 | Precision | Recall |
|---|---|---|---|---|---|---|
| Task 1 | Model A | 0.85 | 0.90 | 0.88 | 0.91 | 0.87 |
| Task 1 | Model B | 0.83 | 0.89 | 0.87 | 0.90 | 0.85 |
| Task 2 | Model A | 0.80 | 0.86 | 0.84 | 0.88 | 0.82 |
import pandas as pd
import wandb

# Load the spreadsheet
file_path = 'path/to/your/spreadsheet.csv'  # Update this with your actual file path
df = pd.read_csv(file_path)  # For a csv file
# df = pd.read_excel(file_path)  # Use this instead for an Excel file

# Log each row as a separate W&B run
for index, row in df.iterrows():
    # Initialize a new W&B run for each model's evaluation
    wandb.init(project="log2wandb", entity="nlp-dialpad", name=f"{row['Task']} - {row['Model']}")
    # Replace the metrics with the ones in your spreadsheet
    wandb.log({
        "ROC1": row["ROC1"],
        "ROC2": row["ROC2"],
        "F1": row["F1"],
        "Precision": row["Precision"],
        "Recall": row["Recall"],
    })
    # Finish the run
    wandb.finish()

print("All runs have been logged to W&B.")
Example pre-training benchmark metrics csv shared by Dialpad:

Script:
# Script to log the above csv to W&B
import pandas as pd
import wandb

# Load the spreadsheet
file_path = 'path/to/your/spreadsheet.csv'  # Update this with your actual file path
df = pd.read_csv(file_path)  # For a csv file

# Log each row as a separate W&B run
for index, row in df.iterrows():
    wandb.init(project="log2wandb", entity="nlp-dialpad", name=f"{row['model']}")
    # Log the metrics for this model
    wandb.log({
        "model": row["model"],
        "accuracy": row["accuracy"],
        "f1-macro": row["f1-macro"],
        "f1-micro": row["f1-micro"],
        "f1-weighted": row["f1-weighted"],
        "precision-macro": row["precision-macro"],
        "precision-micro": row["precision-micro"],
        "precision-weighted": row["precision-weighted"],
        "recall-macro": row["recall-macro"],
        "recall-micro": row["recall-micro"],
        "recall-weighted": row["recall-weighted"],
    })
    # Finish the run
    wandb.finish()

print("All runs have been logged to W&B.")
Test Case 2: Tracking evaluation of models and metrics
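As a starting point, here is a minimal sketch of evaluation tracking (the project name, metric values, and table contents are placeholders): log aggregate evaluation metrics with wandb.log, and optionally log per-example predictions as a W&B Table for deeper inspection in the UI.

import wandb

# Placeholder project and run name; swap in your own evaluation results
run = wandb.init(project="model-evaluation", name="model-a-eval")

# Log aggregate evaluation metrics
wandb.log({"accuracy": 0.91, "f1": 0.88, "precision": 0.90, "recall": 0.87})

# Optionally, log per-example predictions as a W&B Table
table = wandb.Table(columns=["input", "label", "prediction"])
table.add_data("example text 1", "positive", "positive")
table.add_data("example text 2", "negative", "positive")
wandb.log({"eval_predictions": table})

run.finish()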
Test Case 3: Show model provenance and data lineage
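One common pattern is to log datasets and models as W&B Artifacts and declare inputs with use_artifact; W&B then draws the lineage graph between runs, datasets, and models automatically. A minimal sketch (the project, artifact names, and file paths are placeholders):

import wandb

# Training run: declare the dataset as an input and the model as an output
with wandb.init(project="lineage-demo", job_type="train") as run:
    # Consume a previously logged dataset artifact (placeholder name/version)
    dataset = run.use_artifact("training-data:latest")
    data_dir = dataset.download()

    # ... train your model on the downloaded data ...

    # Log the trained model as a new artifact version
    model_artifact = wandb.Artifact("my-model", type="model")
    model_artifact.add_file("model.pt")  # placeholder path
    run.log_artifact(model_artifact)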
Test Case 4: Build a report that presents research of Data Exploration, Model Outcome, Insights etc.
Experiment Tracking 🍽
W&B Tables
Artifact Tracking and Versioning
W&B Registry
Hyperparameter Sweeps
Reports
Other Useful Resources
Import/Export API
All data logged to W&B can be accessed programmatically through the import/export API (also called the public API). This lets you pull down run and artifact data and filter and manipulate it however you like in Python.
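For example, here is a sketch that pulls every run in a project into a pandas DataFrame (the entity/project names reuse the placeholders from the scripts above):

import pandas as pd
import wandb

api = wandb.Api()

# Placeholder entity/project, matching the example scripts above
runs = api.runs("nlp-dialpad/log2wandb")

rows = []
for run in runs:
    rows.append({
        "name": run.name,
        **run.summary._json_dict,  # final logged metric values
        **run.config,              # run configuration / hyperparameters
    })

df = pd.DataFrame(rows)
print(df.head())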
Slack Alerts
You can set Slack alerts within a run that trigger when things happen in your training/evaluation scripts. For example, you may want to be notified when training is done or when a metric exceeds a certain value.
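A minimal sketch using wandb.alert (the project name, metric, and threshold below are placeholders); whether the alert goes to Slack or email is controlled by your alert settings in the W&B UI:

import wandb

run = wandb.init(project="alert-demo")  # placeholder project

accuracy = 0.72   # stand-in for a metric computed during training
threshold = 0.80

# Fire an alert if the metric drops below the threshold
if accuracy < threshold:
    wandb.alert(
        title="Low accuracy",
        text=f"Accuracy {accuracy} is below the acceptable threshold {threshold}",
        level=wandb.AlertLevel.WARN,
    )

run.finish()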
FAQs
1. I didn't name my run. Where is the run name coming from?
Ans: If you do not explicitly name your run, a random run name will be assigned to the run to help identify the run in the UI. For instance, random run names will look like "pleasant-flower-4" or "misunderstood-glade-2".
2. How can I configure the name of the run in my training code?
Ans: At the top of your training script when you call wandb.init, pass in an experiment name, like this:
wandb.init(name="my_awesome_run")
3. If wandb crashes, will it possibly crash my training run?
Ans: It is extremely important to us that we never interfere with your training runs. We run wandb in a separate process to make sure that if wandb somehow crashes, your training will continue to run. If the internet goes out, wandb will continue to retry sending data to wandb.ai.
4. Why is a run marked crashed in W&B when it’s training fine locally?
This is likely a connection problem — if your server loses internet access and data stops syncing to W&B, we mark the run as crashed after a short period of retrying.
5. How do I integrate W&B with Hugging Face Transformers?
W&B has a native integration with Hugging Face Transformers that adds rich, flexible experiment tracking and model versioning to interactive, centralized dashboards without compromising ease of use.
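A minimal sketch, assuming you're using the Trainer API (the project name, run name, and commented-out model/dataset objects are placeholders): set report_to="wandb" in TrainingArguments and the integration logs training and evaluation metrics automatically.

import os
from transformers import TrainingArguments

# Placeholder project; the integration picks this up from the environment
os.environ["WANDB_PROJECT"] = "hf-integration-demo"

training_args = TrainingArguments(
    output_dir="./results",
    report_to="wandb",         # enable the W&B integration
    run_name="bert-baseline",  # placeholder run name
    logging_steps=50,
    num_train_epochs=3,
)

# trainer = Trainer(model=model, args=training_args, train_dataset=train_ds, eval_dataset=eval_ds)
# trainer.train()  # metrics are streamed to W&B during training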
6. Is there a way to redo or undo changes on the dashboard?
Yes, in the top-right section of the workspace there are buttons to Undo (⌘+Z) or Redo (⌘+⇧+Z) any changes made to the dashboard (as highlighted below).

7. Can I create a custom calculation in the W&B workspace that calculates the average of some metrics?
Yes, W&B has Query Panels, which let you query and interactively visualize your data. To get an average, you can write a query expression over the run metrics, swapping in the metrics you want as row["metric1"], row["metric2"], and so on.
8. How to fetch a Run data programmatically after it's logged to W&B?
You can use the W&B API to fetch the run metrics. To fetch the metrics you'll need the entity, project, and run ID. Here's an example script:
import wandb

api = wandb.Api()

wandb_team_name = ""     # add your team name
wandb_project_name = ""  # add your project name
run_id = ""              # add the run ID

run = api.run(f"{wandb_team_name}/{wandb_project_name}/{run_id}")
print(run.summary)
9. How do I stop wandb from writing to my terminal or my jupyter notebook output?
Ans: Set the environment variable WANDB_SILENT to true.
In Python
os.environ["WANDB_SILENT"] = "true"
Within Jupyter Notebook
%env WANDB_SILENT=true
With Command Line
WANDB_SILENT=true