
Sarah Catanzaro — Remembering the Lessons of the Last AI Renaissance

Sarah discusses the lessons learned from the "AI renaissance" of the mid 2010s and shares her thoughts on machine learning from her perspective as an investor.


About this episode

Sarah Catanzaro is a General Partner at Amplify Partners, and one of the leading investors in AI and ML. Her investments include RunwayML, OctoML, and Gantry.
Sarah and Lukas discuss lessons learned from the "AI renaissance" of the mid 2010s, and compare the general perception of ML back then to now. Sarah also provides insights from her perspective as an investor, from selling into tech-forward companies vs. traditional enterprises, to the current state of MLOps/developer tools, to large language models and hype bubbles.


Timestamps

0:00 Intro
1:15:58 Outro

Transcript

Intro

Sarah:
I think people see the output of models like DALL·E, GPT-3, et cetera, and they're amazed by what AI can do.
And so the conversation doesn't even hinge on, "We have access to this data set," or "We have access to this talent pool." It's more, "AI is magical. What can we do with it? Come in and talk to us about this." And again, I think that is somewhat dangerous.
Lukas:
You're listening to Gradient Dissent, a show about machine learning in the real world. I'm your host, Lukas Biewald.
Sarah Catanzaro was a practicing data scientist and then went into venture. She's currently a General Partner at Amplify Partners, and one of the leading investors in AI and ML. Her investments include a whole bunch of companies I admire, like RunwayML, OctoML, Gantry, and others.
It's really interesting to talk to an investor who's also technical. She has insights both on how the technology is built and how it's being adopted by the market at large.
This is a really fun conversation and I hope you enjoy it.

Lessons learned from previous AI hype cycles

Lukas:
Sarah, thanks so much for doing this. I've been looking forward to this one.
I had a bunch of questions prepped and then I was looking at your Twitter and I was like, "Oh, there's like a whole bunch of stuff that we should..."
Sarah:
Yeah. I feel like I've been doing a lot of thinking out loud recently. Including in response to a lot of the hype around Stable Diffusion, LLMs, et cetera.
I appreciate the fact that both of us were there in the 2013, 2014 phase where every company was claiming to be an AI company. It feels like we're kind of heading down that road again, which scares me a little bit.
I hope at least there are enough companies — people — who remember the lessons of the last AI renaissance. But we'll see.
Lukas:
Well, let's get right into it then, because from my perspective, I totally remember at least one other AI bubble. Maybe more, depending on how you count it.
I guess from where I sit, it feels like this one might be different, in the sense that these challenges that always seemed super, super hard seem like they're really working. And I feel like I see applications happening unbelievably fast after the paper comes out — maybe even before there's time to publish any paper on the topic.
I think I might be more bullish about large language models and Stable Diffusion than you, which is great because we can actually have an interesting conversation here.
But I thought it's interesting. You've invested in Runway, and just the other day Cris was showing me a natural language input into Runway where you could basically type what you want, and it would sort of set up the video editing to work that way.
I thought, "Oh my gosh," this might be a totally new kind of interface that lots of software might quickly adopt, I guess. But it sounds like — looking at your Twitter — it sounds like you were playing with large language models and finding it super frustrating and broken.
Tell me about that.
Sarah:
Yeah, so I think my concern is less about the capabilities of large language models specifically, and more about some of the lessons that we learned during the last AI renaissance. Which I think was roughly like 2014 to maybe 2017, around the time that AlphaGo came out. People were really excited about the capabilities of GANs and RL.
At the time, I remember companies like Airbnb, Uber, Lyft building these big research teams, but not really having a clear agenda for those research teams, or understanding how the objectives of their research teams might align with the objectives of the broader organization.
And then similarly, you saw all of these startup founders emerge that were talking about changing healthcare with GANs or changing finance with RL, but didn't really have insights into the nuances of those industries.
My feeling of why ML didn't work the last time around — or rather, why ML adoption didn't occur at the pace that we anticipated — was that it was not really a technical problem, but rather a product, go-to-market problem. I am hoping that this time around, we've both learned from our mistakes but also — in the intervening time period — created enough enabling technologies, such that two things can occur.
One is that companies can fail fast. Frankly, one of the things that scares me is that back then I remember a bunch of companies reaching out and basically saying things like, "Hey, we've got a bunch of data. We'd love for you to come in and talk to us about our AI strategy," and thinking, "I don't care if you have a bunch of data. Let's talk about a bunch of problems that you have, and how ML can solve those problems."
I've come to believe that you can't fight that urge. Founders will always be enticed by the promise of AI. But if they're able to experiment with it quickly, then I think they can start to learn more about the infrastructure, and data, and other investments that they may need to make in order for their AI initiatives to be successful.
At the same time, I think by creating these higher-level interfaces that make ML more accessible to potentially the domain expert, it allows people with a more thorough understanding of business problems to at least prototype AI solutions.
I'm somewhat skeptical that these very high-level interfaces will allow them to build production ML at scale, but at least they can see, "Does it work? Do I need to now hire a data/ML team to realize this initiative further?"
Lukas:
Do you have companies in mind that you like, that are creating these higher-level interfaces off of ML technology, that makes them usable for real world applications?
Sarah:
Yeah. I think Runway is actually a perfect example of the phenomenon that I see playing out.
Some people may not know, but Runway actually started off more as a model marketplace. Their goal had been to make GANs and other types of models accessible to creative professionals, but they weren't really focused on building out the video editing tools, at least initially.
They created these higher-level interfaces, such that various creative professionals — whether it was artists, or directors, or photographers — could start to experiment with ML models. What they saw was that some of the most popular models were models that automated routine tasks associated with video editing.
Based on that user behavior, they decided to double down on video editing. In fact, a lot of the model architectures that they've since created — including Stable Diffusion — were really purpose-built to support the workflows of video editors.
I like that sort of workflow, where you use a prototype, or you use these higher-level interfaces to get insight into what users need — as well as potentially the limitations of the underlying technology — and then you iterate from there.
Lukas:
I totally remember a time, I think, of the era you're talking about — 2014 to 2017 — when every company was like, "Oh, we have this data. It must be valuable because we can build a model on top of it."
Do you see some analogy today to that? What's the common request of an ML team that's misguided, or should be thinking more about problems? Because I feel like data maybe isn't seeming quite as valuable, in the world of LLMs and big models.
Sarah:
I think that what we're seeing today is arguably more nefarious than what we saw back then, because at least at that point in time, companies had invested in collecting data. They had thought about what data to collect. And so there was some understanding of how to work with data.
I think people see the output of models like DALL·E, GPT-3, et cetera, and they're amazed by what AI can do.
And so the conversation doesn't even hinge on, "We have access to this data set," or "We have access to this talent pool," or "We have this type of workflow that could benefit from these generative capabilities."
It's more, "AI is magical. What can we do with it? Come in and talk to us about this." And again, I think that is that is somewhat dangerous.
I was at a conference just last week. There was a presentation on ML infrastructure at a music company, and somebody in the audience asked, "Does the AI listen to songs?"
It's a perfectly reasonable question. But I think it does kind of belie some of the misunderstanding of AI and how it works.
Lukas:
In what sense?
Sarah:
I think people think about AI as artificial agents. They think of AI as something that could listen to a song, not just something that could represent a song and make predictions based upon the content of that song.
Again, I think better understanding of what LLMs are and what they can do will be really necessary to identify when they can be useful.

Maintaining technical knowledge as an investor

Lukas:
This might sound...this is a little bit of a softball — or might sound like a softball — but I was genuinely interested in this.
I feel like one of the things that you do really well, at least in my conversations with you, is maintain a pretty deep technical and current knowledge of what's going on in data stacks, basically. Or, data infrastructure and ML infrastructure.
But yet you're not maintaining data infrastructure — as far as I know — so I'm kind of curious how you stay on top of a field that seems like it requires such hands-on engagement to understand it well. Or at least I feel like it does for me.
Yeah, just curious what your process is.
Sarah:
Yeah. It's interesting because I'd say that, in some ways, that is one of my biggest concerns. I've been in venture now for about seven years, and so I can still say that I've spent most of my career in data. But it won't be long before that is no longer true.
And certainly I have found that my practical, technical skills have gotten rustier.
One comment on that is that I do think that losing my Python, SQL skills, etc. has actually enabled me to look at some of the tools and platforms that are available to users today, with a fresh set of eyes. I'm not as entrenched in the same patterns of behavior and workflows as I was when I was a practitioner.
So it's been helpful to shed some of my biases. But I think what I've discovered is that you can understand how something works without using it. And therefore there are two things that are kind of critical to building technical understanding for me.
One is just spending a lot of time with practitioners, and hearing about their experiences. How they're using various tools, how they're thinking about various sets of technologies. Frankly, just learning from them almost feels like a shortcut.
Instead of trying to figure out what the difference is between automated prompting and prefix-tuning, just going to ask somebody and have a conversation with them. Which is kind of coincidental, and perhaps even ironic. Like, accelerate my learning by just learning from people with expertise in those areas.
There's a lot that I just learned through conversation with practitioners.
But I think going one level deeper — either reading white papers or reading research papers that give you kind of a high-level overview of an architecture, or how something works without getting into the nitty gritty of the underlying code or math — allows me to reason about these components at a practical level of abstraction.
I can see how things fit together. I understand how they work. That doesn't necessarily mean that I'd be able to implement them. Definitely doesn't mean that I'd be able to iterate on them.
But it's enough depth to reason about a component, and it's placed in a broader technical stack.
Lukas:
It's funny though, sometimes I feel like investors...I mean all investors do that to some extent, and I totally get why.
But I think that I often feel also paranoid about losing my technical skills, because I feel like if all you can do is sort of figure out what box something belongs to, it's really hard for you to evaluate the things that don't fit into boxes.
And I feel like almost all the interesting advances — actually, all the products that we want to come out with at Weights & Biases — are generally things that don't fit neatly into one of those ML workflow diagrams that people make.
Because if it was one of those boxes, then of course people are doing it, because it makes logical sense, but it's sort of when that stuff gets reshuffled...it does seem like you're able to maintain a much greater level of technical depth than the average investor, even in the data space. Which is why I wanted to have you on this podcast.
I hope I'm not offending any of my current investors. Just a caveat there. You all are wonderful.
I really do feel like you somehow maintained a much greater technical depth than most of your colleagues.
Sarah:
In many ways I'm amazed by my colleagues and what they do, because I think there are many investors that can reason about the growth of companies, and reason about sets of boxes and the relationships between those boxes without understanding what those boxes do.
I don't think I could do that, but I've always also just been the type of person who needs to go a little bit deeper.
As an example, I started my career in data science, but at Amplify I also invest in databases. And at some point — writing SQL queries, working with dataframes — I just wanted to better understand what was happening. When I write a SQL query and data shows up in my SQL workbench, what is happening on my computer?
I think a lot of people take that stuff for granted. And they can. That is the beauty of abstractions. That is the beauty of technology. We are able to have this video conference — we are able to connect over the Internet — without understanding how the Internet works.
My personality is such that I want to understand how the Internet works. I want to understand why I have service in some places and why I don't have service, and why my dataframe is slower than my SQL query.
I do think that that makes me think about technical systems in different ways.
Lukas:
It's funny, my co-founder Shawn is obsessed — in technical interviews — with assessing whether someone understands how a computer works, in his words. Which I think is really interesting, because I feel like I'm actually not...
That's kind of a weakness of mine, I always wonder about a lot of the details there, but it is sort of an interesting perspective. I love working with all of my colleagues who have that same drive to understand how everything works.

Selling into tech-forward companies vs. traditional enterprises

Lukas:
Okay, here's another question that I was thinking about.
If I were to come to you, and I had a company in the data/ML space, and I had a bunch of customers that were really who we think of as tech-forward — like Airbnb, and Google, and that genre — would that be more impressive?
Or would you be more thinking I'm likely to succeed if I came to you with a set of customers who we don't normally think of as tech-forward? Like an insurance company — a large insurance company — and a large pharma company.
Which would you look at and say, "Oh, that seems like that company is going to succeed"? Because part of me watches technology flow from the more tech-forward companies everywhere. But another part of me is like, "Wow, these kind of less tech-forward companies have a whole set of different needs and often a different tech stack. And certainly there's more of them and they have more budget for this stuff."
So which would be the more impressive pitch for you?
Sarah:
Yeah, it's funny because I think in many ways the way that VCs make decisions — the way that we think about deals — is actually super similar to some of the patterns that we observe with neural networks. And that of course means that we have bias. It also means that we learn from patterns that we've observed.
So, I can give you the honest answer, and then I can also give you the rational answer.
The honest answer is that I would be more impressed by a company that has engaged with tech-forward customers. For the reasons that you described. In the past, we have generally seen that tech will spread from the Airbnbs and Ubers and FAANGs of the world into the enterprise, and not the other way around.
We also have a bias that these more traditional enterprises tend to move slower. There tends to be a lot of bureaucratic red tape that you need to navigate. And as such, those markets tend to be less attractive.
So, on its face, if you just said...you don't have any additional information about the velocity of sales, about the quality of the tech or team, etc. But like you're-
Lukas:
-holding them equal, I guess. Equivalent.
Sarah:
Yeah.
That said, I think that is one of the biases that can cause us to make poor decisions. What really matters are some of the things that I just alluded to.
If you're able to sell into insurance companies repeatedly — and with high velocity — that is arguably a better business than a company that spends 6 to 12 months trying to sell into tech companies.
So it's less about "To whom do you sell?" and more about, "Is that a big market? Are you able to sell efficiently? Are you able to sell scalably?"
I think sometimes we need to be aware of our biases and the impact that marquee logos can have on our decision-making.
Lukas:
Well, I can't tell if you think it's a rational bias or not. I mean, in some sense, you could call all pattern-matching biases.
Do you really think it would be rational to sort of be less enamored with tech-forward customers than you actually are?
Sarah:
I think we need to ask ourselves and probe on, "Under what circumstances might enterprises move quickly?"
A great example of this is a company called Afresh, which was one of the companies that did use RL to disrupt an industry — at a time when so many companies were trying to do the same thing but didn't have as much insight into what was happening within an industry. They offer tech solutions — including things like inventory management and forecasting — to companies in the grocery space.
Now, you might think that grocery is going to be a super outdated, slow-moving industry. And therefore that selling into grocery chains would be long and tedious. And perhaps not very scalable.
But, at the time, a lot of grocery stores were responding to — and/or otherwise just terrified by — the acquisition of Whole Foods by Amazon. This was then [followed] by the pandemic, which certainly put a lot of stress on their online, multi-channel delivery, and e-commerce capabilities.
So there were these exogenous shocks which made what might have been slow-moving market participants move a lot faster. Those are the phenomena that we're sometimes blind to, because we just hear "grocery" or "healthcare" or "manufacturing" and think "slow", rather than thinking, "What would it take for the participants in that sector to move fast?"
Lukas:
That makes sense.

Building point solutions vs. end-to-end platforms

Lukas:
Here's another point that you made on Twitter, that I was contemplating. I actually don't think I have a strong point of view on this — although I really should, given the company that I'm running — but you mentioned that a lot of VCs have been saying they expect the point-solution MLOps space to consolidate.
One thing that's interesting about that, is that I think you've invested in some MLOps tools. Do you sort of expect them to expand in scope and eat the other companies? Is that something that you need to bet on when you invest in them? Or would you be happy to see them get bought by other tools?
How do you think about investment then, in MLOps tools companies, with that worldview?
That's my practical question. And then the other thing that I observe, is that it doesn't necessarily seem like developer tools in general is consolidating. So I think I might even agree with you, but I wonder how you sort of pattern match that against developer tools. Or even maybe the data stack...
I don't know. Do you think that the data stack is also consolidating? Or what's going on there? Sorry, I just dumped a whole bunch of different questions on you, but...
Sarah:
Those are great questions.
So, I do think that in general most technical tools and platforms will go through phases of consolidation and decoupling. Or, as people love to say today, bundling and unbundling.
I think it's just the nature of point solutions versus end-to-end platforms. You have a bunch of point solutions, they're difficult to maintain, they may be challenging to integrate. You then kind of bias towards end-to-end platforms, you adopt an end-to-end platform. It doesn't address a certain edge case or use case that you're experiencing, so you buy a new tool for that edge case, and unbundling happens.
I think the pendulum will always swing back and forth between bundling and unbundling, for that reason. Or coupling and decoupling, for that reason.
To be clear, as a former buyer, I don't think that point solutions or end-to-end platforms are the best solutions for a company. I think there's space in the middle, where you have a product that can solve a few adjacent problems.
That's typically what I look for when I invest. I want to make sure that the company in which I'm investing is solving an urgent — and often point — problem. They're solving an urgent and specific problem. However, I typically also want to see that the founder has a hypothesis about how they would expand into adjacent problem areas.
It's not that I think solving point problems is bad, but I do think given the pendulum of coupling and decoupling, having some hypotheses about the areas that you can expand into becomes critical.
It's interesting to consider why this may or may not happen in the world of developer tools. I'd argue that you still see consolidation. However, the consolidation tends to happen across layers of the stack, versus across the workflow.
Lukas:
Interesting. What are you...tell me...what are you thinking of there?
Sarah:
Things like serverless, where you're no longer reasoning about resources and config.
That might not be impacting other parts of your developer workflow. That might not be eating into your git-based development workflows, or your testing processes, and things like that.
But it is eating into how you think about managing VMs or containers. It is possibly eating into how you think about working with cloud vendors, and deciding upon underlying hardware, and things like that.
So it might be the case, that it's like in software development, we've seen companies — or we've seen vendors — solve specific problems, but solve those all the way down the stack.
I haven't really thought about that as deeply. But I think it's a worthwhile question to ask.
I would say that one of the big differences, though, that I see — and that we of course need to be mindful of — is that there are far more developers than there are data practitioners.
And so, when you're trying to answer the question, "How does this thing get big?", those building developer tools can arguably solve a specific problem for a larger number of people. With tools for data teams, you could potentially get stumped just by the number of people for whom the tool is actually applicable.
Lukas:
Is that what gives the intuition that we're in a moment of bundling? That there's just all these point solutions that you feel kind of can't survive on their own, just given the size of the market that they're in?
Sarah:
I think it's a combination of things. On one hand, I see a lot of...the slivers are getting tinier.
You start to see things like "model deployment solutions for computer vision," and perhaps some subset of computer vision architectures. Where, you might think to yourself, "Okay, I understand why the existing tools are maybe not optimal for that specific use case, but that's really narrow."
To my point about thinking about these orthogonal problems, it's unclear how you go from that to something meatier. That's one phenomenon that I've observed.
I think the other is just that practitioners are really, really struggling to stitch things together. The way a friend put it to me about a year ago, he basically said he feels like vendors are handing him a steering wheel, and an engine, and a dashboard, and a chassis, and saying "Build a fast, safe car."
Those components might not even fit together, and there's no instruction manual.
It's easy to cast shade on the startups that are building these tools and platforms, but I think one of the things that is more challenging in the ML and AI space than even like data and analytics, is that a lot of the ML engineering and ML development workflows are really heterogeneous now.
If you're a vendor and you're trying to think about, "With whom should I partner? With whom should I integrate? Do I spend time on supporting this integration?", it's tougher to make those decisions when practices and workflows are so fragmented and heterogeneous.
I do think that creating more of a cohesive ecosystem has been difficult not because vendors are dumb, but because there's just a lot going on.
Lukas:
Well, I think the other challenge maybe is that when there's so many different technologies that people want to integrate into what they're doing — because there's so much exciting research and things that come along, based on different frameworks and so on — it's hard to imagine an end-to-end system that would actually be able to absorb every possible model architecture immediately, as fast as companies want to actually use it.
Sarah:
Yeah, yeah 100%.
I have been thinking about this in the context of LLMs. We don't yet know how the consumers or users of pre-trained models are going to interact with those who create the pre-trained models. Will they be doing their own fine-tuning? Will they be doing their own prompt engineering? Will they just be interacting with the LLM via API?
Without insight into those interaction models, it's really hard to think about building the right set of tools. It's also unclear to me that the adoption of LLMs would actually imply that we need a new set of tools, both for model development and deployment, and management in production.
I have a lot of empathy for people who are building ML tools and platforms because it's a constantly moving target.
Yet, there's the expectation that you're able to support heterogeneity in all regards — whether it's the model architecture, or the data type, or the hardware backend, or the team structure, or the user skill sets.
There's so much that is different from org to org. I think building great tools is really challenging right now.

LLMs, new tooling, and commoditization

Lukas:
I guess that's a good segue to a question I was going to ask you. When you look at LLMs, do you have an intuition on if a new set of tools are needed to make these functional?
Sarah:
I think one of the bigger questions that I have is, again, on how the consumers of LLMs — or how the users of LLMs — will actually interact with those LLMs. And more specifically, who will own fine-tuning.
I imagine that there are certain challenges that will need to be addressed, both with regards to how we collaborate on the development of the LLMs, but also how we think about the impact of iterations on LLMs.
If OpenAI wants to retrain one of their models — or otherwise tweak the architecture — how do they evaluate the impact of that change on all of the people who are interfacing with the GPT-3 API, or with any of their other products?
I think a lot of the tools built for model development and deployment today kind of assume that the people who develop models will be the same set of people — or at least within the same corporate umbrella — as those who deploy and manage models in production.
And if LLMs drive a shift — wherein those who are developing models and those who are deploying and building applications around models are two completely separate parties — then some of the tools that we have today might be ill-suited for that context.
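
To make that evaluation question concrete: one lightweight pattern is to replay a fixed prompt set through the old and new model versions and flag drift before consumers are switched over. The sketch below is hypothetical — `call_model`, the prompts, and the similarity threshold are placeholders for whatever client and metrics a real provider or consumer would actually use.

```python
# A minimal, hypothetical sketch of checking how a retrained model version
# affects downstream consumers: replay a fixed prompt set through both
# versions and flag answers that drift beyond a similarity threshold.
from difflib import SequenceMatcher

PROMPTS = [
    "Summarize the return policy in one sentence.",
    "Extract the order ID from: 'Order #4821 shipped yesterday.'",
]

def call_model(version: str, prompt: str) -> str:
    # Hypothetical stand-in for whatever client the model provider exposes.
    # Here it just returns a canned string so the sketch runs end to end.
    return f"[{version}] response to: {prompt}"

def similarity(a: str, b: str) -> float:
    return SequenceMatcher(None, a, b).ratio()

def regression_report(old_version: str, new_version: str, threshold: float = 0.8):
    flagged = []
    for prompt in PROMPTS:
        old_out = call_model(old_version, prompt)
        new_out = call_model(new_version, prompt)
        score = similarity(old_out, new_out)
        if score < threshold:
            flagged.append({"prompt": prompt, "old": old_out, "new": new_out, "score": score})
    return flagged

print(regression_report("v1", "v2"))
```
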
Lukas:
Do you think we're headed towards a world like that, where there's a small number of companies generating foundational models? And then mostly what other companies are doing is fine-tuning them or doing some kind of prompt engineering to get good results out of them?
Sarah:
Here we're getting a little bit into the technical nitty gritty, but my impression from tracking the research community so far has been that although LLMs are great for what we typically think of as unstructured data — primarily images, text, video, audio, et cetera — they have not outperformed gradient boosting or more traditional methods on structured data sets, including tabular and time series data. Although there's some work on time series that I think is pretty compelling.
This is one of those areas where I feel like the research community just completely underestimates how many businesses operate on structured data.
While it's possible that adoption of LLMs will drive this new interaction model or new market model — wherein some companies built these large foundation models and others interact with those — I don't see gradient boosting or more classical approaches going anywhere. Because I don't see structured data going anywhere.
Arguably, structured data powers many of the most critical use cases within organizations, ranging from search and recommendation engines to fraud detection.
I think it would be a tragedy to neglect the needs of those who are using — I don't want to say simpler approaches, but certainly both simpler and more complex approaches — architectures that are perhaps not attention-based when working with these specific data sets.
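
As a point of reference for the structured-data argument, here is a minimal sketch of the kind of gradient-boosted baseline Sarah is describing, using scikit-learn on a synthetic tabular dataset. The data and hyperparameters are illustrative only, not from the conversation.

```python
# A gradient-boosted model on a synthetic tabular dataset — the kind of
# classical baseline that still powers many production use cases.
from sklearn.datasets import make_classification
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a company's structured data (e.g., user or transaction features).
X, y = make_classification(n_samples=5000, n_features=20, n_informative=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = HistGradientBoostingClassifier(max_iter=200, learning_rate=0.1, random_state=0)
model.fit(X_train, y_train)
print("ROC AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))
```
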
Lukas:
Interesting.
Do you have an opinion on...how to say this? I feel like many investors especially, but I think many smart people looking at the space of ML and data, they think, "Wow, this is gonna commoditize. This is going to get...tools are gonna make this easier. Fewer companies are going to want to do this internally and spend money on expensive resources."
But I guess when I look at what companies actually do, it seems like they spend more and more, and even kind of push up the salaries. And they have this fight for scarce, specific talent.
Which way do you predict things are going? Do you think, like, 10 years down the road, ML salaries go up or do they go down? Maybe that's a more concrete way of putting it.
Sarah:
Yeah, that's a great question. I probably expect that the variance would increase.
My guess is that there are certain applications that may be commoditized — or at least that may be commoditized for some subset of the market — while others continue to be pursued in-house.
Search is perhaps a very interesting example.
For some businesses, they may be more than happy to rely upon a vendor to provide those semantic or vector-based search capabilities. While search may have an impact on their bottom line, perhaps it's not the most critical or most impactful thing to their business, but rather just a capability that they have.
This is not to say that Slack actually uses a vendor or should use a vendor, but as far as I can tell, Slack doesn't really monetize on search.
You'd contrast that, however, with an e-commerce business or something like Google, where their ability to deliver the highest quality search results and their ability to improve search — just marginally — could be a huge impact on revenue.
Those companies are probably likely to develop their own models.
I think we'll see that some companies do their own model development. Some use cases are not commoditized, and for those use cases, at those companies, you see very high ML salaries.
But then, perhaps for others, you're really just a software engineer who knows a little bit about ML, and can interface with some of these models through APIs, and can reason about the output of experiments and behavior that you might see in production.

Failing fast and how startups can compete with large cloud vendors

Lukas:
I guess in that vein — and you sort of alluded to this earlier a little bit — what do you think about all these sort of low-code and no-code interfaces into exploring data, building ML models?
You mentioned earlier that you think that's generally a really exciting trend.
Sarah:
My opinions on this category are pretty nuanced, so I was thinking about where to start.
Generally speaking, I'm very skeptical of no-code, low-code solutions. I find that many of these tools — no matter what the sector or what the use case — they end up shifting the burden of work. Not necessarily removing that burden, or even lightening that burden.
A great example is self-service analytics.
My own belief is that in general, most self-service analytics tools don't actually reduce the burden that the data team or analytics team bears, but rather shift the work of the data team from building analytics products to debugging, explaining, or fixing analytics products.
And I think the same can be true in the ML space.
Why I'm excited about some of these tools in the ML space is that I actually think that in ML, failing fast is really critical. Some of these tools that enable users to prototype ML-driven solutions might help them better understand, "Is this going to work? What additional investments do I need? What do my users expect from the system before they make a decision to invest further?"
It enables that kind of quick prototyping, learning, and failing fast. The other thing that I feel quite strongly about, is that we need to explore ways to decouple model development and ML-driven app development.
Whenever I talk to companies about their ML architectures or their ML stack, it becomes so obvious that ML is just this one tiny component in a much larger app architecture. The prediction service might be connecting with other databases, or stream processing systems, or other microservices, tools for authorization, and so on and so forth.
I think it's really important to be able to build applications around a prediction service while independently iterating on the model that powers that prediction service. So, I am somewhat long on tools that enable engineers to prototype ML-driven systems, so that they can build those application architectures.
And then, once they have a better understanding of the full system requirements — including some of the latency associated with things like moving data around — they can kind of pass off a fuller spec to a data scientist who will iterate on the model and model architecture, armed with the knowledge that these are the attributes that we need in order to make this project successful.
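
A minimal sketch of that decoupling, under a hypothetical pricing feature: the application code depends only on a prediction interface, so engineers can build and test the surrounding system against a stub while a data scientist iterates on the real model separately. The names and the pricing logic are illustrative assumptions, not anything from the episode.

```python
# Decoupling the application from the model behind a prediction interface.
from typing import Protocol


class Predictor(Protocol):
    def predict(self, features: dict) -> float: ...


class StubPredictor:
    """Placeholder model that lets the app be built before the real model exists."""
    def predict(self, features: dict) -> float:
        return 0.5  # constant score, later replaced by a trained model


def price_quote(features: dict, predictor: Predictor) -> float:
    # Application concerns (validation, fallbacks, logging) live here, written
    # against the interface rather than a specific model artifact.
    score = predictor.predict(features)
    return round(max(10.0, 100.0 * score), 2)


print(price_quote({"distance_km": 3.2, "hour_of_day": 18}, StubPredictor()))
```

Swapping `StubPredictor` for a trained model later doesn't require touching `price_quote` or anything built around it.
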
Lukas:
That makes sense.
Okay, another question. When you invest in a company that is providing some kind of ML or data service, does it cross your mind, "What if AWS does that?" Or GCP or Azure.
Is that an important thing to consider, do you think, or is that irrelevant?
Sarah:
Yeah, yeah. I smile because I feel like this question, it comes up somewhere between like one to five times a week.
Given the areas that Amplify invests in — we're primarily focused on data, ML tools and platforms, enterprise infrastructure, and developer tools — we're constantly fielding this question of, "What if AWS or GCP or Azure does this? Won't that company — won't that market, et cetera — get crushed?"
In the past, what I've told people is that I have found that startups tend to be better at building developer experiences. Anecdotally, this is just something that we observe. People complain a lot about the experience of using AWS tools, the experience of using things like SageMaker.
I've thought a little bit more about why that's the case.
I think, generally speaking, the cloud vendors need to develop for their most spendy customers, their highest-paying customers. And their highest-paying customers tend to be enterprises, shockingly.
As such, they're developing for an enterprise user who probably has fairly strict privacy/security requirements, who may have a very distinct way of organizing their teams, who may be bringing in a persona with a specific skill set into data science or ML roles.
If I had to present a hypothesis about why they haven't been able to compete on developer experiences, I think it's because often they are creating tools and platforms for a developer who is not as representative of the rest of the market.
But, to be honest, with the passage of time, I've just seen enough examples of companies that have been able to out-compete the cloud vendors where I just don't worry about it that much anymore.
Lukas:
Have you ever seen anyone get crushed?
Sarah:
Crushed?
Lukas:
Has that happened in your career?
Sarah:
No. I mean, I'm sure it has. But it's hard for me to think of an example, whereas it's easy to think of many, many examples of companies that were not crushed by the cloud vendors.
If anything, I think sometimes we see that startups get...they sell too soon. The way in which the cloud vendors out-compete them is putting some juicy acquisition offer in front of them, and then they don't have to compete.
That's the only example that I could see or think of, off the top of my head, of the cloud vendors crushing a potential competitor. They crush it with their dollars. Suffocate companies with their acquisition offers.
Lukas:
R&D through M&A, yeah.

The gap between research and industry, and vice versa

Lukas:
I saw an interview or a conversation that you had with Andrew Ng. I thought you had an interesting point that academic benchmarks...they often don't really reflect industry use cases. But you were kind of pointing out that industry has some share of the blame for this.
Can you say more on that topic?
Sarah:
Oh, absolutely.
I am really grateful to Andrew for actually drawing my attention to this issue. We often think about the gap between research and industry, but we don't as often think about the gap between industry and research.
Andrew and I had been talking about this challenge of structured data versus unstructured data. I think I said to him, "What I see in industry is that most ML teams are working with tabular and time series data. What I see in the research community is that most researchers are building new model architectures for unstructured data."
There's a big mismatch between what model architectures people in industry need — given the data that is available to them, as well as given the types of problems that they're trying to solve — and the research that's becoming available.
Now he pointed out to me — and this is something that I hadn't really thought about before — researchers have access to unstructured data. They have access to things like ImageNet. They don't have access to high volumes of data on user sessions, or logs, metrics, and events. The data sets that tend to be the lifeblood of most companies.
It is very difficult to innovate on AI techniques for data sets to which you have zero access.
I think it's easy to point to that research and be like, "Oh, there's such a big gap between what they're building and what we need." I think we also need to be mindful of what the research community can do, given the resources that they have available to them.
I've seen a couple of efforts by a few organizations to open source their data sets, but it's tough because oftentimes the most valuable data sets are the most sensitive ones. What company wants to share their click-through data, which probably reveals the state of their business, some of the experiments that they're running, and so on and so forth?
Lukas:
Well, there's also not a lot of upside.
I remember the Netflix contest was such a popular, awesome thing. Got so many people involved, so much attention to research to Netflix — still a seminal data set — but they didn't do a second one because they felt like...there are user privacy issues, that they couldn't get around to release it.
I don't know if you remember when AOL released a subset of their query logs. It was so exciting to actually have that. I was in research at the time and I was like, "This data set is like gold."
And then, like the next day, they fired the person that released it. And their boss — I think their boss' boss, right? — because there was some personally identifying information in that.
It's hard to see a lot of upside for corporations, even if they were sort of neutral on the impact of...on the company secrets, IP issue.
Sarah:
Yeah. One of the things that I have seen, that has been very encouraging, is more and more interview studies or meta analyses coming out of the research community. Where it's clear that the researchers are interested in better understanding the problems that practitioners face in industry.
One critique that I've had of those studies in the past, is that the authors tend to interview people to whom they have immediate access, which means that they often interview practitioners at some of their funding organizations.
The organizations that are sponsoring their labs, which means that they tend to bias more towards larger enterprises or big FAANG companies. They're interviewing people at Facebook, Apple, Tesla on their data and ML tools, platforms, practices, and then drawing conclusions about all of industry.
But I think that recently I've seen a couple of studies come out where there's been a more focused effort to get a more random — or at least more diverse — sample of practitioners from both smaller startups, more traditional companies, bigger tech companies, et cetera, to really better understand both the similarities and differences between how they approach model development and deployment.
I hope that continues.
Lukas:
Do you have a study that's top of mind, that you could point us to?
Sarah:
So, Shreya Shankar, who had actually been a university associate.
Lukas:
Yeah, I saw that. Totally. Nice.
Sarah:
I was really thrilled because Shreya actually reached out to us and said, "Hey, can you connect us to people at different types of companies? I've got connections to people at Instagram, Facebook, Apple, et cetera et cetera, but I want to talk to people at mid-market companies, or early-stage startups, and B2B companies, and better understand some of the nuances of their workflows."
Lukas:
What was the name of the paper? I think I just saw it.
Sarah:
"Operationalizing Machine Learning: An Interview Study".
Lukas:
Thank you. Yeah, I agree. That was an excellent paper.
Sarah:
Yeah, yeah. The other thing that I had said...I sent Shreya a text message after reading through it. The other thing that I really appreciated about the interview study was that she didn't cherry pick the insights that were most likely to drive interesting research questions or solutions.
I think she took a really genuine and unbiased approach to thinking about, "What are the problems that people are talking about? What are the ways in which they're solving them? Let's highlight that there are a bunch of problems that people are just solving in practical — albeit hacky — ways, but ways that they're content with."
I thought it was a very honest study.
Lukas:
Totally. I totally agree.

Advice for ML practitioners during hype bubbles

Lukas:
Well, I guess if we are possibly headed towards another bubble in machine learning — or machine intelligence, as you sometimes call it — do you have any advice for a startup founder like me? Or maybe an ML practitioner, which is most of our audience.
Having gone through another bubble, how would you think about it? What would you do if you started to...I think we're already seeing bubble-esque behavior. What are the lessons?
Sarah:
I think the most critical lesson that I saw/learned the last time around was, "Focus on your users," or "Focus on the strategic problems that you're trying to solve." And "Really, really understand if and why ML is the best tool to solve that problem."
I think it's critical to think about machine learning as a very important tool in our toolkit. But one of several tools.
I was catching up with a friend a couple of weeks ago, and she had mentioned to me that the way in which she prioritizes ML projects is through regular conversations with their product leadership, and engineering leadership — and her representing ML leadership — about the product roadmap, about the user behaviors that they're trying to unlock. And then thinking about whether ML or traditional software development approaches are a better tool for achieving those things.
I think as long as we continue to think about ML as a tool to solve problems — and as long as we have the tools that enable us to better understand if ML is solving those problems, and how to improve upon its ability to solve those problems — then ML can be a super powerful tool. And one that we learn to wield in more powerful ways too.
But — I feel almost like a broken record saying this, given the lessons learned in the past — if we treat ML like a silver bullet, if we treat it like a hammer looking for a nail...that was the pattern that I think led to failure.
Don't think about "What ML can do for you", think about "What you can do for your country," and if ML is the right way to do that, I guess. That's the lesson that we learned and I hope it's the lesson that we will carry forth.

Sarah's thoughts on Rust and bottlenecks in deployment

Lukas:
Love it.
We always end with two open-ended questions. The first of the two is, if you had extra time, what's something that you'd like to spend more time researching? Or, put another way, what's an underrated topic in data or machine learning?
Sarah:
Oh man, that one is very easy for me: programming languages.
I would love to spend more time learning about programming languages. I am definitely not convinced that Python is the right interface for data science, or that SQL is the right interface for analytics work.
I would really love to learn more about programming language design, so that I could better diagnose if and why Python and SQL are the wrong tools, and how one might go about building a better PL interface for data scientists, ML engineers, and analysts.
Lukas:
Okay, a question that I didn't ask — because I thought it was a little weird or maybe nosy — is why you were asking on Twitter if anyone knew any female Rust developers.
Because I will say Rust comes up just a shocking amount on this podcast, and I was wondering what's driving the interest in Rust, and then if there was some reason behind looking for a female Rust developer, and if you actually found one.
Sarah:
Yeah, yeah.
So, full transparency — and I think I maybe put some of this on Twitter too — quick background is that certainly earlier in my career, I felt like oftentimes I wasn't getting invited to the same set of events, et cetera, as some of my male peers, and therefore I wasn't getting exposure to the same set of conversations — maybe even the same opportunities — to potentially see deals, and things like that.
I feel pretty strongly that we need to have women in the room when we host events, to ensure that they're getting exposed to the same set of opportunities. That we're not doing things to hamper their progress in the industries in which they operate.
We were hosting a Rust developer dinner, and looked at the guest list, and there weren't that many women, and it felt like we could do better. Thus the origins of my question.
Lukas:
I see.
Sarah:
Why Rust?
See, I wish I spent more time studying programming languages, so I could better understand why people are shifting from C++ to Rust. Luca Palmieri — who I believe is now at AWS, actually — has a great blog post on why Rust might be a more appropriate backend for Python libraries that often have C++ backends. Things like pandas, where we experience it as Python but in fact it has a C++ backend.
I've heard that Rust is more accessible than C++ and therefore could perhaps invite more data practitioners to actually contribute to some of those projects.
But I don't know enough to really say why Rust is so magical, other than a lot of smart people — apparently, like Linus Torvalds too — believe it is. If it's good enough for him, it's good enough for us. I don't know.
Lukas:
Fair enough.
My final question for you is, when you look at the ML workflow today going from research into deployment into production, where do you see the biggest bottlenecks? Or maybe where do you see the most surprising bottlenecks for your portfolio companies?
Sarah:
I generally think that...there are two bottlenecks that I would call attention to. Actually three, sorry, I'm being kind of indecisive here.
One pattern that I've observed with ML is that we often iterate on ML-driven applications — or ML-driven features — more frequently than we iterate on more traditional software features.
To give an example, we may iterate on a pricing algorithm far more frequently than we would iterate on a navigation panel, or an onboarding flow, or something like that.
Earlier I was talking about understanding how ML can solve user and company problems. I don't really think we have enough insight into the way in which model performance correlates with behavioral data — or the product engagement — to iterate super effectively on models. I think that has been a limitation, and one that could have nefarious effects in the future.
Another big challenge that I see — and I alluded to this before — is the challenge of building software applications around a prediction service, or around a model. In the past people might have talked about this as a model deployment problem.
The problem isn't containerizing your model and implementing a prediction service in production. I think that has gotten significantly easier. The problem is connecting to five different databases, each of which has different ACID guarantees, latency profiles...also connecting to a UI service, potentially connecting to other application services.
The problem is the software development. What you've got is a trained model, but now you actually have to build a software application. I don't think we have great tools to facilitate that process, either for ML engineers or for software engineers.
And then around the same space, I also think that the transition from research to production — and back — can still be challenging. Perhaps what a company wants to do — upon seeing an issue associated with the model in production — is actually see the experiment runs associated with that model, so that they might get more insight into what is now happening in that production environment.
That shouldn't be difficult to do. But, in the past I think we really developed tools either for model development or for MLOps, and we're starting to see some of the pain points that arise when those sets of tools are not coupled together.
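
On the first bottleneck — correlating model performance with product engagement — one way to picture it is a simple join of prediction logs with behavioral events, then comparing model versions on a product metric rather than only an offline metric. The column names and the "converted" metric below are hypothetical, just a sketch of the shape of the problem.

```python
# Joining per-request prediction logs with product engagement events so that
# model versions can be compared on a behavioral metric.
import pandas as pd

# Per-request prediction logs (e.g., from a pricing or ranking model).
predictions = pd.DataFrame({
    "request_id": [1, 2, 3, 4],
    "model_version": ["v1", "v1", "v2", "v2"],
    "score": [0.31, 0.87, 0.45, 0.92],
})

# Behavioral events captured by the product (did the user convert?).
engagement = pd.DataFrame({
    "request_id": [1, 2, 3, 4],
    "converted": [0, 1, 0, 1],
})

joined = predictions.merge(engagement, on="request_id")
print(joined.groupby("model_version")["converted"].mean())
```
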
Lukas:
Cool. Yeah, that all definitely resonates with me.
Sarah:
Lest I sound too cynical, I am really optimistic about the future of ML. I think we just need to do it in a sane and rational way and be mindful of what we're trying to accomplish here, instead of just focusing on flashy press releases and cool demos.

The importance of aligning technology with people

Lukas:
I was thinking as you were talking about the hype cycle, and large language models, and stuff. I was thinking VCs probably feel the hype cycle the fastest.
I'm like, "Man, we've basically solved the Turing test and, like, no one cares. My parents are like, "What even is this," you know. It's like, "Come on, this is awesome, look at it."
But I think every investor knows about Stable Diffusion but I don't think...I even come across Chief Data Officers at Fortune 500 companies who are like, "What's Stable Diffusion?" It's like, "Come on, you should know about this."
Anyway...
Sarah:
Yeah, yeah.
But I think there's this awareness, though, of "This is where the hard work starts."
Lukas:
Yeah, totally.
Sarah:
"Great, we're able to generate beautiful artistic renderings based on textual prompts. Okay, how do we generate photos that are equivalent to that which a professional photographer would produce?"
Because that's what it's going to take to get a Getty Images or Flickr to adopt something like Stable Diffusion.
How do we make automated rotoscoping so good that a video editor doesn't need to correct the mask at all? Because that's what it's going to take for Runway to compete with some of the more traditional video editors.
I saw, through Runway, that the research is not good enough. They've had to do a lot of engineering, as well as their own research, in order to operationalize some of these things.
I am so optimistic about the potential of the technologies, but I also am realistic that reining them in, and actually leveraging these technologies to do good in the world — or to build great products — is hard.
Short anecdote, but I've been talking to a founder who was working on brain-computer interfaces and actually developed this technology where, effectively, it's able to read minds. You had to put on some big helmet thing, but once the helmet was on, it could kind of transcribe thoughts. And they were able to get it to work.
Now, the founder subsequently shifted focus to the gaming space, doing more work with haptic interfaces. I was asking him like, "Why didn't you pursue the mind reading tech further?" And he said to me, "We couldn't find any great use cases."
Isn't that crazy?
But I think, this is tech. Sometimes you can do absolutely remarkable things with technology. But it doesn't matter. It doesn't matter unless you figure out how to appeal to people, and get them to use it, and how to align that technology with an important set of problems.
I think that is the thing — as VCs — we need to continue to remind ourselves. Tech is not easy. Tech is not easy, but people are not easy either. Both are really hard. Unlocking new sets of technologies often means that we are granted the opportunity to solve really hard human problems.
I guess...TL;DR if GPT-3 starts reading minds. Maybe we'll be able to find some applications for it. But, we'll see.

Outro

Lukas:
Thanks so much, Sarah. That was super fun.
Sarah:
Yeah, for sure. Bye!
Lukas:
If you're enjoying these interviews and you want to learn more, please click on the link to the show notes in the description where you can find links to all the papers that are mentioned, supplemental material and a transcription that we work really hard to produce. So, check it out.