SageMaker Studio Lab and Weights & Biases
How to train your ML models using this new AWS tool.

Amazon SageMaker Studio Lab (SMSL) is a new web-based platform for machine learning practitioners: a free machine learning (ML) development environment that provides the compute, storage, and security you need to learn and experiment with ML, all at no cost. Similar to Google Colab, it offers the familiar Jupyter-based web interface that we're all so used to and comfortable with.
No AWS account is needed, nor any cloud infrastructure skills. To get started, simply request an account with a valid email address. While the interface will feel familiar, there are some key differentiators:
- SMSL provides a full JupyterLab instance, with the standard shortcuts, widgets, extensions, git support, integration with GitHub, and Python environments.
- A single SMSL user session can use up to 12 hours of CPU or 4 hours of GPU compute, and there is no limit on the number of user sessions per account.
- The environment is persistent and you get 15 GB of storage, so when you come back to work you can pick up where you left off.
Once you log in to SMSL, you will be on your default project page, where you can select a compute type and “Start runtime.” We recommend starting with CPU instances and leveraging GPU instances only as needed.
Using Weights & Biases in SMSL 🪐
Weights & Biases (wandb) is just a regular Python library. Once installed, adding a couple of lines of code to your training script is all it takes to start logging experiments. You can install it manually:
$ pip install wandb
Or add wandb as a dependency in your environment.yml file.
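As a rough illustration of what that instrumentation looks like (a minimal sketch, not the actual training script from this report; the project name, config values, and dummy loss are placeholders):

import math
import random

import wandb

# Start a run; hyperparameters go into the config so they show up in the W&B UI.
wandb.init(project="sagemaker_camvid_demo", config={"epochs": 5, "lr": 1e-3})

for epoch in range(wandb.config.epochs):
    # Replace this dummy loss with your real training step.
    loss = math.exp(-epoch) + random.random() * 0.05
    wandb.log({"train/loss": loss, "epoch": epoch})

wandb.finish()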
Case Study: Semantic Segmentation for Autonomous Vehicles
We will use this repo to train a semantic segmentation model on the Cambridge-driving Labeled Video Database (CamVid) dataset. You can click here to copy the repo to your SMSL workspace.
The dataset
We use the Cambridge-driving Labeled Video Database (CamVid) for this example. It contains a collection of videos with object class semantic labels, complete with metadata. The database provides ground truth labels that associate each pixel with one of 32 semantic classes. We can version our dataset as a wandb.Artifact so we can reference it later. See the following code:
with wandb.init(project="sagemaker_camvid_demo", job_type="upload"):
    artifact = wandb.Artifact(
        name='camvid-dataset',
        type='dataset',
        metadata={
            "url": 'https://s3.amazonaws.com/fast-ai-imagelocal/camvid.tgz',
            "class_labels": class_labels
        },
        description="The Cambridge-driving Labeled Video Database (CamVid) is the first collection of videos with object class semantic labels, complete with metadata. The database provides ground truth labels that associate each pixel with one of 32 semantic classes."
    )
    artifact.add_dir(path)
    wandb.log_artifact(artifact)
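Later runs can then pull this exact dataset version back down. A minimal sketch (the "latest" alias is wandb's default for the newest version; the job type is an assumption):

import wandb

# Fetch the versioned dataset by name, e.g. at the start of a training run.
with wandb.init(project="sagemaker_camvid_demo", job_type="train") as run:
    artifact = run.use_artifact("camvid-dataset:latest")
    data_dir = artifact.download()  # local path to the downloaded files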
We will also log a wandb.Table version to get an interactive visualization of the data. You can use tables to understand your datasets, visualize model predictions, and share insights in a central dashboard. W&B Tables support many rich media formats, such as images, audio, and waveforms. For a full list of media formats, refer to Data Types.
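A minimal sketch of building such a table (the directory layout and column names here are assumptions, not the exact code from the repo):

from pathlib import Path

import wandb

# Assumed layout: CamVid frames extracted under ./camvid/images/
image_paths = sorted(Path("camvid/images").glob("*.png"))

with wandb.init(project="sagemaker_camvid_demo", job_type="data_viz") as run:
    table = wandb.Table(columns=["file_name", "image"])
    for image_path in image_paths:
        table.add_data(image_path.name, wandb.Image(str(image_path)))
    run.log({"camvid_images": table})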
Training a model
We can now create a model and train it. We will use PyTorch and fastai to quickly prototype a baseline and then use wandb.Sweeps to explore a better model.
The model is supposed to learn a per-pixel annotation of a scene captured from the point of view of the autonomous agent. It needs to categorize or segment each pixel of a given scene into 32 relevant categories such as road, pedestrian, sidewalk, and car, as listed below. You can click on any of the segmented images in the table shown above to open an interactive interface for exploring the segmentation results and categories.

For the baseline experiments we decided to use a simple architecture inspired by UNet with different backbones from timm, trained with focal loss. A brief summary of our experiments with the baseline models and loss functions is shown in the panel below.
We will need a GPU runtime for these notebooks; we can check that one is available with the nvidia-smi command.
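From a notebook cell, that check is a one-liner:
!nvidia-smi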
[W&B panel: run set of 12 baseline experiments]
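For reference, a baseline run along these lines can be put together in a few lines of fastai. This is a hedged sketch rather than the repo's exact train.py: it uses fastai's packaged CamVid download, a resnet34 backbone (one of the backbones we tried), focal loss, and the WandbCallback to stream metrics to W&B.

import wandb
from fastai.vision.all import *
from fastai.callback.wandb import WandbCallback

# fastai ships a packaged copy of CamVid with a codes.txt listing the classes.
path = untar_data(URLs.CAMVID)
codes = np.loadtxt(path/"codes.txt", dtype=str)

dls = SegmentationDataLoaders.from_label_func(
    path, bs=8,
    fnames=get_image_files(path/"images"),
    label_func=lambda o: path/"labels"/f"{o.stem}_P{o.suffix}",
    codes=codes,
)

wandb.init(project="sagemaker_camvid_demo", job_type="baseline")
learn = unet_learner(
    dls, resnet34,                       # one of the backbones from the baseline runs
    loss_func=FocalLossFlat(axis=1),     # focal loss, as in the baselines
    metrics=foreground_acc,
    cbs=WandbCallback(log_preds=False),  # stream losses and metrics to W&B
)
learn.fine_tune(8)
wandb.finish()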
Visualizing Model Outputs
Weights & Biases really shines for assessing model performance: we can use the power of wandb.Tables to visualize where our model is doing poorly. Here, we see the model predictions alongside the ground truth and the per-class IoU score. We can filter and sort to see where the model is failing to detect vulnerable pedestrians 🚶♀️ and bicycles 🚲.
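A hedged sketch of how such overlays can be logged (the class subset and the random stand-in data are placeholders; in practice you would loop over real validation images, ground-truth masks, and model predictions):

import numpy as np
import wandb

# Toy subset of the 32 CamVid classes, mapping class id -> name.
class_labels = {0: "road", 1: "pedestrian", 2: "sidewalk", 3: "car"}

# Dummy (image, ground-truth mask, predicted mask) triples standing in for real data.
rng = np.random.default_rng(0)
samples = [
    (rng.integers(0, 255, (128, 128, 3), dtype=np.uint8),
     rng.integers(0, 4, (128, 128)),
     rng.integers(0, 4, (128, 128)))
    for _ in range(4)
]

with wandb.init(project="sagemaker_camvid_demo", job_type="inference") as run:
    table = wandb.Table(columns=["prediction"])
    for image, target, pred in samples:
        overlay = wandb.Image(
            image,
            masks={
                "prediction": {"mask_data": pred, "class_labels": class_labels},
                "ground_truth": {"mask_data": target, "class_labels": class_labels},
            },
        )
        table.add_data(overlay)
    run.log({"predictions": table})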
Hyperparameter Optimisation with wandb.Sweeps
In order to improve on the baseline model, we need to select not only the best model but also the best set of hyperparameters to train it with. Despite being quite a daunting task, Sweeps made this easy for us.
We perform a Bayesian hyperparameter search with the goal of maximizing the foreground accuracy of the model on the validation dataset. To do so, we define the following configuration file, sweep.yaml. In it we set the search method (bayes) and the parameters and values to search over: we try different backbones, batch sizes, and loss functions, and we also explore the optimization parameters learning rate and weight decay, which are sampled from a distribution.
# sweep.yaml
program: train.py
project: sagemaker_camvid_demo
method: bayes
metric:
  name: foreground_acc
  goal: maximize
early_terminate:
  type: hyperband
  min_iter: 5
parameters:
  backbone:
    values: ["mobilenetv2_100", "mobilenetv3_small_050", "mobilenetv3_large_100", "resnet18", "resnet34", "resnet50", "vgg19"]
  batch_size:
    values: [8, 16]
  image_resize_factor:
    value: 4
  loss_function:
    values: ["categorical_cross_entropy", "focal", "dice"]
  learning_rate:
    distribution: uniform
    min: 1e-5
    max: 1e-2
  weight_decay:
    distribution: uniform
    min: 0.0
    max: 0.05
Afterwards, in a terminal, launch the sweep using the wandb command line:
$ wandb sweep sweep.yaml --project="sagemaker_camvid_demo"
And then launch a sweep agent on this machine by doing:
$ wandb agent <sweep_id>
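For this to work, train.py needs to read its hyperparameters from wandb.config, which the sweep agent populates with the sampled values when it starts each run. A minimal sketch of that pattern (the placeholder metric value is not real training code):

# train.py (sketch)
import wandb

def main():
    # When launched by a sweep agent, wandb.config already contains the sampled
    # values for backbone, batch_size, loss_function, learning_rate, and so on.
    run = wandb.init(project="sagemaker_camvid_demo")
    cfg = wandb.config
    print(cfg.backbone, cfg.batch_size, cfg.learning_rate, cfg.weight_decay)
    # ... build the model from cfg, train it, then log the sweep metric ...
    wandb.log({"foreground_acc": 0.0})  # placeholder for the real validation metric
    run.finish()

if __name__ == "__main__":
    main()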
Once the sweep has finished, we can use a parallel coordinates plot to explore the performances of the models with various backbones and different sets of hyperparameters, and based on that we can see which model performs the best.
[W&B panel: parallel coordinates plot across the run set of 56 sweep runs]
We can derive the following key insights from the sweep:
- A lower learning rate and lower weight decay result in better foreground accuracy and Dice scores.
- Batch size has a strong positive correlation with the metrics.
- The VGG-based backbones might not be a good option for the final model because they're prone to vanishing gradients. (They were filtered out as the loss diverged.)
- The ResNet backbones result in the best overall performance with respect to the metrics.
- The ResNet34 or ResNet50 backbone should be chosen for the final model due to their strong performance in terms of metrics.
Conclusion
We hope you enjoyed this quick introduction to SageMaker Studio Lab and that you'll leverage it alongside W&B to track your machine learning experiments. If you have any questions about how these tools work together, toss them into the comments and we'll make sure to get to them. Thanks for reading!