The importance of ML platform and process standardization at Scribd

As ML projects and resources grow more complex and are spread across disparate teams, effective standardization and documentation become increasingly important.
For a company like Scribd, which pioneered the industry-leading eBook and audiobook subscription service and has several ML-related teams composed of a diverse mix of data scientists and software engineers, a high level of standardization proved to be a necessity.
“I don’t want to use the term ‘Wild West’ per se, but we noticed that a team might be doing the exact same thing as another team, but in a very different way,” said Christian Williams, a Staff Engineer on the ML Platform team. “And when you’re maintaining models over their lifetime, that gets pretty confusing, especially as the landscape of models that you have continues to grow and team composition changes. The desire for standardization across our ML workflow grew out of that need.”
Part of the responsibility of Christian and his team is to gather input from all the ML teams to identify which services are a good fit for specific needs, build tooling around those processes, and define standard workflows.
“We want Scribd engineers to have best-in-class tooling for each part of the lifecycle, and that includes Databricks – our ML and data work starts there – as well as Weights & Biases for experiment tracking and model management.”
The importance of standardization for model training, re-training, and deployment
A big contributor to the culture of ML standardization and best practices at Scribd is the MLOps Working Group, a cross-functional group in which representatives from all the various ML teams in the organization meet to discuss pain points, MLOps scenarios, and desirable enhancements, and to document standard workflows. The group was formed to help reduce technical debt and cognitive load across ML platforms and processes.
Christian and the working group then evangelize those tools and that documentation so all of Scribd’s engineers and data scientists are comfortable using the same tools and know exactly what leads into each step of the lifecycle, what comes after, and what role they play in it. That standardization shows up in tools like Weights & Biases, which is now uniform across the team.
“W&B has helped give us one thing to work together on, instead of everybody doing their own thing,” said Christian. “It’s helped us break down these silos. We’ve held W&B tech and learning sessions where the whole team, data scientists and platform engineers, are all invited, and everybody is interacting with the platform together.”
Standardization also shows up in the sophisticated model CI/CD workflow the team has developed and implemented. It starts with first versions of models trained in a Databricks notebook, with all runs and code saved in W&B Models. From there, the team encounters two typical scenarios where the W&B Registry comes into play: when training models, and when deploying models for inference.
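To make the first half of that workflow concrete, here is a minimal sketch of logging a training run and its resulting model from a notebook to W&B Models. The entity, project, and file names are illustrative placeholders, not Scribd’s actual setup.

```python
import wandb

# Hypothetical names throughout; "scribd-ml" and "recs-ranker"
# are placeholders, not Scribd's real entity or project.
run = wandb.init(project="recs-ranker", entity="scribd-ml", job_type="train")

# ...training happens here; metrics are logged as the run progresses
run.log({"epoch": 5, "val_loss": 0.231})

# Save the trained model file as a versioned artifact tied to this run,
# preserving the lineage from experiment to model version
model_art = wandb.Artifact("recs-ranker-model", type="model")
model_art.add_file("model.pt")
run.log_artifact(model_art)

run.finish()
```

Because every model version is logged from a tracked run, the lineage Christian mentions later (which experiments produced which Registry version) comes along for free.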
How W&B Registry makes model management easy
The first scenario involving W&B Registry is the first iteration of a model, after many experiments have been run on it. Once a model has been selected, it may not be automatically retrained right when it’s first released. In their GitHub repo, an ML engineer defines which model they want to serve, writes the inference code to load and serve that model, and pins a specific model version. As part of the team’s CI/CD workflow, that specific model version is then downloaded from W&B Registry, packaged together with the inference code, and deployed to the inference-serving infrastructure.
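A CI step like that might look something like the following sketch. The registry path and version are illustrative placeholders standing in for whatever is pinned in the repo; the packaging and deployment details are left as comments since they depend on the serving stack.

```python
import wandb

# Hypothetical pinned reference, as it would appear in the GitHub repo;
# the path format shown is the legacy "entity/model-registry/collection" style.
MODEL_VERSION = "scribd-ml/model-registry/recs-ranker:v12"

api = wandb.Api()
artifact = api.artifact(MODEL_VERSION, type="model")
model_dir = artifact.download()  # downloads the pinned version locally

# The CI job would then bundle model_dir with the inference code
# and ship the bundle to the inference-serving infrastructure.
```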
Model retraining works similarly: once a model is ready to be re-trained, the developer sets an alias on their specified model version in that same GitHub repo. A scheduled job then runs, references the aliased version from W&B Registry, and performs the same packaging and testing steps.
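The alias is what lets the scheduled job stay generic. A rough sketch of both sides, again with placeholder names (the alias “champion” is illustrative):

```python
import wandb

api = wandb.Api()

# One-time step when a version is promoted: tag it with the alias
chosen = api.artifact("scribd-ml/model-registry/recs-ranker:v12")
if "champion" not in chosen.aliases:
    chosen.aliases.append("champion")
    chosen.save()

# Scheduled job: resolve the alias rather than a pinned version, so it
# always picks up whichever version currently carries the alias
current = api.artifact("scribd-ml/model-registry/recs-ranker:champion")
model_dir = current.download()
# ...retrain and evaluate, then run the same packaging and test steps
```

The design choice here is that the repo pins intent (the alias) rather than a hard-coded version, so promoting a new model never requires touching the scheduled job.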
In both scenarios, there is an approval step in which the model owner verifies the model has passed its tests before approving it and sending it on to production deployment on Amazon SageMaker. And in both, W&B Registry plays a pivotal role.
“As a platform guy, the versioning and aliasing features of W&B Registry and the way they just show up in the Weights & Biases UI is great,” said Christian. “The lineage to the experiments that led to the model versions in the Registry, it all just makes my job as a platform engineer a lot easier.”
Other members of the Scribd ML team, all of whom now use W&B as a standard across the board, each appreciate different aspects of the platform.
“The data science team loves the rich feature set in experiments, which is far better than a lot of other tools they’ve used in the past,” said Christian. “But from my own team’s perspective, W&B Registry has been the biggest impact for sure.”
Scribd’s successful integration of W&B Registry into their sophisticated CI/CD workflow – and their effectiveness in ensuring consistent standards across disparate teams – is a testament to Christian and his team’s commitment to best-in-class machine learning.