How Socure Fights Fraud With Machine Learning

"Weights & Biases gave our team a full and complete understanding of our model’s lineage, from datasets to training to production artifacts"
Edward Li
Head of Computer Vision Research

About Socure

Fraud loves technology. Whether it’s the barrage of spam phone calls and texts, ever-present spam emails, or NFT scams, our interconnectivity makes fraud a lot cheaper to attempt across vast audiences. Once some guardrail is set up, such as filters to catch spam emails or CAPTCHAs to filter out bots, it’s safe to assume fraudsters are already hard at work, searching for a side door or an exploit. Not only does fraud love technology, but it also moves quickly.
 
That means that fighting fraud requires moving quickly with the right technology. Which is exactly what the team at Socure is doing.
 
Socure offers a host of products that intelligently and accurately verify buyer identity and information, including the one we’ll be talking about today: Predictive DocV (the “V” stands for verification). Predictive DocV verifies the identity of users from government-issued forms of identification and selfies by leveraging multidimensional, layered identity information. That means applying cutting-edge computer vision techniques while analyzing and correlating data from Socure’s ever-growing proprietary database of cross-industry customer feedback on 800 million known good and bad identities. The goal? To provide a confident prediction within seconds.
 
And it works. Socure is used by institutions like Capital One, Chime, Poshmark, and Wells Fargo. The question we want to answer today is how the team at Socure works to keep us safe from the fraudsters at our doorsteps.
 
 

Introduction

We’ll be focusing on the computer vision component of Predictive DocV. That’s because it’s actually a more recent component in Socure’s holistic approach to fraud detection. The team itself is a little more than a year old.
 
It goes without saying that you have to build a great team with smart, agile practitioners to solve a problem like this. Socure has exactly that. But machine learning is still a rapidly evolving field, and with diverse backgrounds come diverse tools and libraries, which lead to inconsistencies in how ML workflows are run from one practitioner to the next. Standardizing was a high priority for improving the team’s efficiency and ensuring their models were reproducible.
 
Enter Edward Li, the Head of Computer Vision Research at Socure.
 
“Our philosophy is that you write code for other people,” Edward said. “That doesn’t just mean writing production code to help our clients prevent fraud better, it also means writing code for your teammates and partners so that reading and understanding your code is as easy as reading an article in their native language.”
 
They did what any conscientious team would do: they test-drove tools and frameworks, compared and contrasted. When all was said and done, the team settled on two priorities: standardizing around PyTorch and Weights & Biases. Locking both in across the team meant quicker code reviews, quicker model training, and above all, quicker time to production. And in a space like fraud detection, that time really matters.
 

Why Socure Chose Weights & Biases

Socure iterates a lot. They’re constantly training models and trying novel computer vision approaches to detecting fraud. But before they adopted Weights & Biases, comparing models was onerous. It took time, and it was hard to be fully confident they were comparing apples to apples. Once they moved to Weights & Biases, that all changed. The team was quick to adapt, and suddenly they were moving more quickly, training more models, and comparing more experiments. Everyone was on the same page.
 
And they saw immediate dividends. Weights & Biases was Pythonic. Models were easy to review. Performance was easier to understand and visualized elegantly. Weights & Biases’ components like Artifacts were reusable across their team for myriad uses. Edward likens those components to Lego blocks that make it easy to understand and debug model performance. In tandem with the team adopting PyTorch, code became more readable and reviews went far faster. They leaned into Weights & Biases’ inherent customizability, building out the specific workflows and dashboards that worked for their specific problem. Plus, Weights & Biases worked well with Hydra and the rest of their internal stack.
 
“Weights & Biases gave our team a full and complete understanding of our model’s lineage, from datasets to training to production artifacts,” Edward said. “We saw a 15% increase in our model building efficiency while saving about 15% on hardware spend on top of that.”
 
An additional bonus for the team was that it made showcasing their work, both to their colleagues on the computer vision side and to other stakeholders, clean and easy. Internally, it meant they could test new ideas on small subsets of their massive datasets and immediately understand if that approach was promising. They could more quickly lean into their best ideas and move on from less successful ones. Externally, they could create custom visualizations and charts and show which models were most predictive of fraud. They could present their work in a digestible way to other engineers and less technical stakeholders. They were moving more quickly and working more transparently. Predictive DocV’s performance kept improving.
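One way to picture the "small subsets" workflow: pick a stable fraction of the dataset by hashing each record's ID, so that every teammate gets the same subset and quick experiments stay comparable. The sketch below assumes that approach. It is not Socure's code, and `in_subset`, `select_subset`, the `doc-` IDs, and the 5% fraction are hypothetical names and values chosen for illustration.

```python
import hashlib


def in_subset(record_id: str, fraction: float, salt: str = "v1") -> bool:
    """Return True if this record falls in a stable `fraction` of the data."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    digest = hashlib.sha256(f"{salt}:{record_id}".encode("utf-8")).digest()
    # Map the first 6 bytes of the hash to a number in [0, 1).
    bucket = int.from_bytes(digest[:6], "big") / 2**48
    return bucket < fraction


def select_subset(record_ids, fraction, salt="v1"):
    """Keep only the IDs that hash into the chosen fraction, in input order."""
    return [rid for rid in record_ids if in_subset(rid, fraction, salt)]


if __name__ == "__main__":
    ids = [f"doc-{i}" for i in range(10_000)]  # hypothetical record IDs
    small = select_subset(ids, 0.05)
    print(f"{len(small)} of {len(ids)} records selected")
```

Because membership depends only on the ID and the salt, a 5% subset is always contained in the 10% subset, so a promising idea can be scaled up without discarding the records it was first tried on.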
 
Weights & Biases is helping on the deployment side too. Having easily accessible keys and artifacts means models are easy to locate, understand, and deploy. “It’s like pulling a Docker image,” Edward said. “Deployment is fast and easy.”
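The "pulling a Docker image" comparison and the lineage claim rest on the same idea: every dataset and model is a named, versioned artifact that records which artifacts it was built from. The toy registry below sketches that idea in plain Python. It is not the Weights & Biases API, and `Registry`, `log`, `resolve`, `lineage`, and the artifact names are all hypothetical.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Artifact:
    name: str
    version: int
    digest: str          # content hash, so identical bytes are detectable
    parents: tuple = ()  # references such as "id-images:v0" it was built from


class Registry:
    """Toy in-memory artifact store (hypothetical, for illustration only)."""

    def __init__(self):
        self._versions = {}  # name -> list of Artifact, index == version
        self._aliases = {}   # (name, alias) -> version number

    def log(self, name, content: bytes, parents=(), aliases=()):
        """Store a new version; point "latest" and any given aliases at it."""
        versions = self._versions.setdefault(name, [])
        artifact = Artifact(name, len(versions),
                            hashlib.sha256(content).hexdigest(),
                            tuple(parents))
        versions.append(artifact)
        for alias in ("latest", *aliases):
            self._aliases[(name, alias)] = artifact.version
        return artifact

    def resolve(self, ref: str) -> Artifact:
        """Look up "name:alias" or "name:vN", like pulling an image by tag."""
        name, _, tag = ref.partition(":")
        tag = tag or "latest"
        if (name, tag) in self._aliases:
            return self._versions[name][self._aliases[(name, tag)]]
        if tag.startswith("v") and tag[1:].isdigit():
            versions = self._versions.get(name, [])
            if int(tag[1:]) < len(versions):
                return versions[int(tag[1:])]
        raise KeyError(ref)

    def lineage(self, ref: str):
        """Return every upstream artifact reference, nearest ancestors first."""
        seen, queue = [], list(self.resolve(ref).parents)
        while queue:
            parent = queue.pop(0)
            if parent not in seen:
                seen.append(parent)
                queue.extend(self.resolve(parent).parents)
        return seen


if __name__ == "__main__":
    reg = Registry()
    data = reg.log("id-images", b"raw images")
    reg.log("docv-classifier", b"weights",
            parents=[f"{data.name}:v{data.version}"],
            aliases=["production"])
    print(reg.resolve("docv-classifier:production"))
    print(reg.lineage("docv-classifier:production"))
```

Recording parents by pinned version (`id-images:v0`) rather than by a moving alias is what keeps the lineage reproducible: the answer to "which data trained this model?" does not change when a newer dataset version is logged.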
 

Conclusion

Some implementations of machine learning are novel but don’t bring many practical benefits. Think of a GAN that turns your face into a cartoon character, for example. But at Socure, machine learning brings tremendous, wide-ranging benefit: it’s protecting all of us from fraud.
 
And as we all know, fraud moves quickly. Staying one step ahead is paramount. Collaboration is key. Standardizing the machine learning workflows with PyTorch and Weights & Biases meant training more and better models faster and deploying them quickly and confidently. Their team is smart, agile, and growing. And they’re doing important work that benefits everyone. If you’d like to help them keep the rest of us safe from the growing problem of fraud, they’d love to hear from you.