Costco PoC Guide

A one-stop shop for everything you need to test during the PoC.
Created on October 9|Last edited on October 9
Access Weights & Biases here: https://wandb.ai/costco-innovation/


Weights and Biases (W&B) 💫

Weights and Biases is an MLOps platform built to facilitate collaboration and reproducibility across the machine learning development lifecycle. Machine learning projects can quickly become a mess without best practices in place to help developers and scientists as they iterate on models and move them to production.
W&B is lightweight enough to work with whatever framework or platform teams are currently using, while letting them quickly start logging their important results to a central system of record. On top of this system of record, W&B has built visualization, automation, and documentation capabilities for better debugging, model tuning, and project management.

Workshop Sessions

Date | Session | Recording Link | Topics Discussed
Oct 09, 2024 | W&B PoC Kickoff and Technical Recap | Session was recorded on Costco side | W&B Experiments, Artifacts, Reports

Overall W&B documentation: https://docs.wandb.ai/
Computer Vision use-cases guide: https://wandb.ai/site/solutions/computer-vision

W&B Authentication

Environment

The Weights & Biases trial account is hosted here, and everyone should have access. Let us know if you haven't received an invite.

Getting Started (SDK Installation and Login)

To start using W&B, first install the Python package (if it is not already installed):
pip install wandb
Once it's installed, authenticate your user account by logging in through the CLI or the SDK. You should have received an email inviting you to sign up to the platform, after which you can obtain your API token (found in the "Settings" section under your profile).
wandb login --host <YOUR W&B HOST URL> <YOUR API TOKEN>
OR through Python:
import os
import wandb
WANDB_BASE_URL = "https://api.wandb.ai"
wandb.login(host=WANDB_BASE_URL, key=os.getenv("WANDB_API_KEY"))
Once you are logged in, you are ready to track your workflows!
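As a quick sanity check after logging in, something like the following could be used to confirm that metrics reach the workspace. This is only a minimal sketch: the project name "poc-smoke-test" is a placeholder, the loss values are synthetic, and the helper name log_dummy_training is made up for illustration. The wandb import sits inside main() so the logging helper can be tried without the package installed.

```python
import math


def log_dummy_training(run, epochs=5):
    """Log a synthetic, decreasing loss curve to any object that has a .log() method."""
    for epoch in range(epochs):
        loss = math.exp(-epoch)  # synthetic value, not the output of a real model
        run.log({"epoch": epoch, "loss": loss})
    return epochs


def main():
    # Imported here so the helper above does not require wandb to be installed.
    import wandb

    # Assumption: "poc-smoke-test" is a placeholder project name, not one from this guide.
    with wandb.init(project="poc-smoke-test", config={"epochs": 5}) as run:
        log_dummy_training(run, epochs=run.config["epochs"])
```

Calling main() from a script or notebook should create one run with a short loss curve; the run can be deleted afterwards.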

Test Cases

S. No. | Capability & Success Criteria | W&B Product Area
1 | Onboard use cases to W&B then build, tune, train and evaluate end-to-end | Entire platform
2 | Create project dashboards that enhance analysis, organization and decision making of the model development process | W&B Experiment Tracking
3 | Integrate with Costco's existing systems | W&B Experiments, W&B Artifacts
4 | Build a report that presents research (Data exploration, Observability view for portfolio of model, datasets etc.) | W&B Reports
5 | Show model provenance and data lineage | W&B Artifacts



Test Case 1: Onboard use cases to W&B then build, tune, train and evaluate end-to-end

Test Case 2: Create project dashboards that enhance analysis, organization and decision making of the model development process

Test Case 3: Integrate with Costco's existing systems

Test Case 4: Build a report that presents research of Data Exploration, Model Outcome, Insights etc.

Test Case 5: Show model provenance and data lineage

Experiment Tracking 🍽

W&B Tables

Artifact Tracking and Versioning

W&B Registry

Hyperparameter Sweeps

Reports

Integrations

Ultralytics

PyTorch Lightning

Other Useful Resources

Import/Export API

All data logged to W&B can be accessed programmatically through the import/export API (also called the public API). This lets you pull down run and artifact data, then filter and manipulate it however you like in Python.
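A rough sketch of that workflow is shown below. It assumes a project path of the form "entity/project" and a summary metric called "accuracy"; both are placeholders, and the helper names (fetch_finished_runs, runs_to_rows, best_run) are illustrative rather than part of the W&B SDK. Only wandb.Api() and its runs() method come from the library.

```python
def fetch_finished_runs(path):
    """Return finished runs for an "entity/project" path via the public API."""
    # Imported here so the pure-Python helpers below work without wandb installed.
    import wandb

    api = wandb.Api()
    return api.runs(path, filters={"state": "finished"})


def runs_to_rows(runs, metric):
    """Flatten run objects into plain dicts holding the name, state, and one summary metric."""
    rows = []
    for run in runs:
        rows.append(
            {"name": run.name, "state": run.state, metric: run.summary.get(metric)}
        )
    return rows


def best_run(rows, metric):
    """Return the row with the highest metric value, ignoring runs that never logged it."""
    scored = [row for row in rows if row[metric] is not None]
    return max(scored, key=lambda row: row[metric]) if scored else None
```

For example, best_run(runs_to_rows(fetch_finished_runs("my-entity/my-project"), "accuracy"), "accuracy") would return the top finished run by accuracy, where "my-entity/my-project" stands in for a real project path.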

Slack Alerts

You can set Slack alerts within a run to trigger when specific events occur in your training or evaluation scripts. For example, you may want to be notified when training finishes or when a metric exceeds a certain value.
Details on enabling these alerts on your dedicated deployments can be found here.
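One way to wire up a threshold-based alert is sketched below. It assumes alerts have already been enabled for your account, and it relies on the run object's alert(title=..., text=...) method; the helper name alert_if_below, the metric name, the threshold, and the project name are all placeholders chosen for illustration.

```python
def alert_if_below(run, metric_name, value, threshold):
    """Send an alert through the run when a metric drops below a threshold.

    Returns True if an alert was sent, False otherwise.
    """
    if value < threshold:
        run.alert(
            title=f"Low {metric_name}",
            text=f"{metric_name} {value:.3f} is below threshold {threshold:.3f}",
        )
        return True
    return False


def main():
    # Imported here so the helper above can be exercised without wandb installed.
    import wandb

    # Assumption: "poc-alerts-demo" is a placeholder project name.
    with wandb.init(project="poc-alerts-demo") as run:
        accuracy = 0.42  # stand-in for a value computed by your evaluation code
        run.log({"accuracy": accuracy})
        alert_if_below(run, "accuracy", accuracy, threshold=0.5)
```

Keeping the threshold check in a small function makes it easy to call from several places in a training loop and to try out without sending real notifications.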

FAQs

1. I didn't name my run. Where is the run name coming from?
Ans: If you do not explicitly name your run, a random run name will be assigned to the run to help identify the run in the UI. For instance, random run names will look like "pleasant-flower-4" or "misunderstood-glade-2".
2. How can I configure the name of the run in my training code?
Ans: At the top of your training script when you call wandb.init, pass in an experiment name, like this:
wandb.init(name="my_awesome_run")
3. If wandb crashes, will it possibly crash my training run?
Ans: It is extremely important to us that we never interfere with your training runs. We run wandb in a separate process to make sure that if wandb somehow crashes, your training will continue to run. If the internet goes out, wandb will continue to retry sending data to wandb.ai.
4. Why is a run marked crashed in W&B when it’s training fine locally?
Ans: This is likely a connection problem. If your server loses internet access and data stops syncing to W&B, we mark the run as crashed after a short period of retrying.
5. How do I stop wandb from writing to my terminal or my Jupyter notebook output?
Ans: Set the environment variable WANDB_SILENT to true.
In Python
import os
os.environ["WANDB_SILENT"] = "true"
Within Jupyter Notebook
%env WANDB_SILENT=true
With Command Line
export WANDB_SILENT=true