Bridging AI & Science: The Impact of Machine Learning on Material Innovation with Joe Spisak of Meta
In the latest episode of Gradient Dissent, we hear from Joseph Spisak, Product Director, Generative AI @Meta, to explore the boundless impacts of AI and its expansive role in reshaping various sectors.
We delve into the intricacies of models like GPT and Llama 2, their influence on user experiences, and AI's groundbreaking contributions to fields like biology, material science, and green hydrogen production through the Open Catalyst Project. The episode also examines AI's practical business applications, from document summarization to intelligent note-taking, addressing the ethical complexities of AI deployment.
We wrap up with a discussion on the significance of open-source AI development, community collaboration, and AI democratization.
Tune in for valuable insights into the expansive world of AI, relevant to developers, business leaders, and tech enthusiasts.
⏳ Timestamps
- 0:00 Intro
- 0:32 Joe is Back at Meta
- 3:28 What Does Meta Get Out Of Putting Out LLMs?
- 8:24 Measuring The Quality Of LLMs
- 10:55 How Do You Pick The Sizes Of Models
- 16:45 Advice On Choosing Which Model To Start With
- 24:57 The Secret Sauce In The Training
- 26:17 What Is Being Worked On Now
- 33:00 The Safety Mechanisms In Llama 2
- 37:00 The Datasets Llama 2 Is Trained On
- 38:00 On Multilingual Capabilities & Tone
- 43:30 On The Biggest Applications Of Llama 2
- 47:25 On Why The Best Teams Are Built By Users
- 54:01 The Culture Differences Of Meta vs Open Source
- 57:39 The AI Learning Alliance
- 1:01:34 Where To Learn About Machine Learning
- 1:05:10 Why AI For Science Is Under-rated
- 1:11:36 What Are The Biggest Issues With Real-World Applications
🎙 Get our podcasts on these platforms:
- Apple Podcasts: http://wandb.me/apple-podcasts
- Spotify: http://wandb.me/spotify
- Google: http://wandb.me/gd_google
- YouTube: http://wandb.me/youtube
Transcript Via Trint
Lukas [00:00:00] All right. So why don't we start with what you're working on these days? I see you're back at Meta. What's going on?
Joe [00:00:12] Oh, my God. Yeah. I jumped back in July, back at Meta, and it's GenAI. It's everywhere, it's everything. I was working on GenAI at Google, and now it's GenAI back here at Meta. It's a wild, crazy time. It feels great to be back, to be honest. I don't think there's ever been a company I've experienced that embraces open source and open science and just open innovation like Meta does. So for me, it's been great being back here. Love it.
Lukas [00:00:52] Well, last time we talked, at least at length, you were at Meta working on PyTorch, I think, right? So this is a little different, isn't it?
Joe [00:01:02] It is. Well, it's interesting, because Soumith and I — I saw your episode with him a few months back — he and I have kind of rekindled the partnership, right, because we worked together for five years on PyTorch, and now we're at two different layers of the stack, so to speak. It's actually really fun to have the platform team inside Meta at the framework level, all the way down to the hardware level, and then to be building the models and thinking about everything around the models. So it's really interesting working together now as two big components in this ecosystem. Very different, but it still feels sort of similar.
Lukas [00:01:49] So you're working on the foundation models that you're releasing — that's your day to day. And were you involved in the first Llama, or was that before you rejoined?
Joe [00:02:02] It was — I was actually involved. We built Llama 1 in one of my teams. So after PyTorch, I was in there working on math and science and building that up, and one of the teams was our theorem proving team, which is basically looking at a kind of guided form of mathematics, eventually for things like software verification and program proving — there's some cool stuff happening there. In parallel to that, one of the folks there — he was actually just in town this past week, we had dinner — basically grabbed a bunch of spare compute and built Llama 1. That's kind of where that came from, and it was right around the time I left for Google. Then I came back, Llama 2 was out, and I'm basically working on Llama 3, so that's nice. So I skipped a generation.
Lukas [00:02:55] Okay, at kind of a high level, I guess: what does the company get out of this process of putting out these models, which are presumably really expensive to train?
Joe [00:03:18] Yeah, it's interesting. In a lot of ways it's a very similar story to the PyTorch story. I always think about goals in open source. I'm not an open source zealot, though a lot of people may think I am, given I've spent a lot of the last ten years doing open source — but I do deeply believe that open source serves a really strong purpose, and that there are really clear goals for open sourcing things. Thematically, Llama actually echoes a lot of what we saw with PyTorch. When Soumith and I put our heads together back in early 2018, we were asking: what are we going to do here? Are we going to converge? We had this really high entropy community around us, and we were thinking about the best way to capture that community in order to build better products — when innovations happen in the CV space or in NLP, how do we capture that quickly and leverage it for internal usage, or build it into products and make it a better framework? And I think a similar story applies here for Llama as well. We're using these models in production — or at least the underlying technology is in our production models. So when you have crazy high entropy spaces like safety, for example, or just model evaluation more generally, having the world use your models is really helpful. We learn a lot, and we learn it really fast, from other people using our models. So that's a huge motivator for us. But there are other reasons too, certainly.
Lukas [00:05:08] And what are your goals associated with releasing it? Do you track usage somehow? How do you know if you've had a successful rollout? Can you talk about what you might be optimizing for with subsequent versions of Llama?
Joe [00:05:30] Yeah, I mean, open source has always been one of those tricky things to set goals on, because you can't really set a metric. I remember when Robert was over at Anyscale building Ray, and he called me up and said, I don't know how to define success in open source. I remember sending him my detailed thoughts in an email, and we've been kind of friends ever since — he was a FAIR intern — and he's got a great hand for open source.
Lukas [00:06:00] Oh, no way. Could you put that email in the show notes? That would be amazing. Do you have it? Is it private? I'd love to see it — people ask me about that a lot, and it sounds like you had a thoughtful response there.
Joe [00:06:13] Well, I mean, it depends on your goals, right? First of all, with open source, everything's a proxy. You're very rarely able to directly measure what success is with open source. So in the PyTorch days, when we thought about research, research was actually the easiest thing to measure. If my plain-language goal is that I want our framework to be the foundation of research — because it generates new algorithms we can leverage, it speeds up inference, whatever value gets created gets brought into our platform — then you can measure citations, you can measure Papers with Code. There's a nice little index that shows adoption over time on Papers with Code. So that's the easiest thing to measure. But how do you measure whether production usage is actually occurring? No one puts their code out on GitHub saying it's in production. So ultimately you're dealing with proxies no matter what you're doing. Even the cloud providers themselves don't have a lot of direct signal — they get some from customers who tell them. You, with Weights & Biases, for example, can probably get more direct signal than the cloud providers, given you're so much closer to the user; a lot of cloud usage is obfuscated through the layers in between, whereas you have something direct. So any definition of success in open source is always a proxy. I've learned to live with that — I've embraced it, basically.
Lukas [00:07:58] When you build these Llama models — like the next-generation one — how do you measure their quality? Are you looking to beat major benchmarks, or what do you actually do? How do you even decide which one to ship?
Joe [00:08:15] Yeah, it's interesting, because I would say there are two layers — two North Stars — when it comes to these frontier models... or foundation models, I slipped up there. But foundation models, in my mind, should ultimately show emergent capabilities, or they should be really capable. And this is where eval gets really interesting, because the capabilities and the evaluation go hand in hand. It's almost chicken and egg — which comes first? Did you actually generate the capability in the model, or did you evaluate it and find it? But ultimately it's about pushing on things like reasoning, pushing on capabilities — or in the image space, I guess, if you can generate super photorealistic images, maybe you can call that a capability. So I'd say there's a dual North Star. One is obviously the foundation model capability. The second is the actual usability of the model. Because if, for example, I build a trillion-parameter MoE — I'm just throwing numbers out there, maybe it's two trillion — it's this massive MoE, and who's going to be able to use that, ultimately? Even Meta would struggle — even Google would struggle — to deploy something of that scale. They could probably do it, but how efficient, how cost effective is it? Are you going to be able to scale it out in search, or in our case ads and feed and other things? So the second North Star is really adoption: you need to be able to take what you're building and make it consumable for deployment at scale. That might mean distilling these models into sizes that are tractable for different compute envelopes, it could mean pruning them, and so on. But ultimately you really want to push on capabilities and then be able to waterfall that down into actual impact. There's a lot of thought that goes into how you stage that process.
Lukas [00:10:31] How did you pick the particular sizes? You released, like, three versions, right? How did you come up with the 6 billion?
Joe [00:10:43] I mean, in the early days, obviously, it's a little bit of "we don't know what we don't know." So you pick different sizes based on, I would say, the compute envelope and the memory capacity of the platforms you're targeting. For example, we kind of wish we would have released the 34B, because it fits nicely into GPU memory — and obviously somebody else has since released a really nice 34B model, which is awesome; the licensing is interesting, but the model itself seems pretty capable. But like I said, if you start with the foundational capabilities and then think about the different targets — and this is where product sense, having a product manager thinking about these things, does help — you think about what it is you're ultimately trying to achieve. If you're trying to achieve an on-device model that can run in tandem with, say, a larger model in the cloud, you need to think about the envelope, the compute footprint and the memory footprint — and actually, most importantly these days on mobile, it's memory bandwidth that's the limiting factor. So thinking through what types of environments and what types of experiences you want to enable is probably the most important thing in terms of sizes. And I think we've learned a lot. We have data now: when we release things, we know what people are downloading, and we know internally at scale what people are doing with these models. So for the next generation we can zero in much more precisely and say, this is actually a good size to aim for on the server side, here's a nice size for other types of devices, and so on. We're learning, is the short answer.
Lukas [00:12:38] Is it hard to release models at different sizes? I sort of imagine you either distilling from one bigger model or training them all from scratch. How does that actually work?
Joe [00:12:52] I think this is actually a bit of an open question, because it's funny — you get different sides of that camp. Some people will say that to get the best small model, it's best to start with the larger model and distill from there. Others will say, well, the scaling laws are different for smaller models versus larger models, so you should adjust your scaling laws, train from scratch, saturate the smaller parameter count, and go from there. So honestly, we'll probably try both ways and see what works best, because the amount of data we're using and the amount of compute we're using is changing from generation to generation, and the use cases and the envelopes — the actual environments these things get deployed in — are a moving target as well. So we'll probably try everything and see what works, honestly.
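For readers who want a concrete picture of the "start from the larger model" option Joe mentions: it usually means some form of knowledge distillation, where a small student model is trained to match a frozen teacher's output distribution as well as the ground-truth tokens. Here's a minimal, illustrative PyTorch sketch — not Meta's actual training code; the names, the temperature, and the loss weighting are all assumptions for illustration:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Blend a soft-target KL term (match the teacher) with hard-label cross-entropy."""
    # Soft targets: KL divergence between student and teacher distributions at temperature T.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: standard next-token cross-entropy against the ground-truth labels.
    # (Label shifting for causal LMs is omitted here for brevity.)
    hard = F.cross_entropy(
        student_logits.view(-1, student_logits.size(-1)), labels.view(-1)
    )
    return alpha * soft + (1 - alpha) * hard

# Inside a hypothetical training step (teacher frozen, student trainable):
# with torch.no_grad():
#     teacher_logits = teacher(input_ids).logits
# student_logits = student(input_ids).logits
# loss = distillation_loss(student_logits, teacher_logits, labels)
# loss.backward()
```

The alternative path Joe describes — training the small model from scratch with its own scaling-law-adjusted data budget — skips the teacher entirely and just runs the standard cross-entropy objective for longer on more tokens.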
Lukas [00:13:53] Nice. You also have a couple of different versions with maybe different fine-tuning strategies — like a chat one and an instruct one, right? How did you end up at that, and what are you thinking with it? I feel like at some point there are maybe too many options and it starts to get confusing. How are you imagining managing that going forward?
Joe [00:14:23] We're thinking about what that looks like. I have a really innovative team that's super excited to work on different things, and we also have a research organization, FAIR, which, for example, built Code Llama — a fine-tuned version of Llama 2 for not only generating code but also conversing about code. So yeah, the number of variants gets kind of crazy, and you start to get this "what do I use now?" — do I use the Python version, or do I just take the pre-trained one and customize it for what I want to do? Honestly, I think we're just in that space right now where we're going to be trying a lot of things, and we may put out a lot of different models. We're finding that certain data mixtures generate really good reasoning, for example — so do we want to release a reasoning version? Or, like we did with code, do we want to continue releasing code versions? Ultimately it's a fast-moving space, and we're not sure — can you just put all of these capabilities into one foundation model and have the best of all worlds? Maybe. There's also the question of where the capabilities come from: pre-training versus fine-tuning. Are you getting most of your capabilities in the pre-training phase, or do you need the later stages? A bit of an open debate, I think. We did all three stages and obviously our models are pretty good; I think OpenAI does all three as well. So we'll just look at what people are doing — some of our models might end up being showcases, to show different capabilities, and others might be built more for customization. We're still looking at all of those types of models.
Lukas [00:16:26] Do you have any advice? I get this question a lot: where to start, what model to start with, how to think about that. I was kind of surprised myself — the chat models seemed to work a little better for me, even in cases that felt like telling the model to do something; they were significantly better than the instruct model. And I was kind of wondering, how common is that? Is there a repository I could go to to see what the general best practices are? Can you speak to the best practices today, or point people to where they could learn more?
Joe [00:17:01] I mean, there are some really great resources — our team just put out a pretty massive getting-started guide on Llama a couple of weeks ago. It's on the Meta site — we still need our own dedicated website; we're working on that. There's everything from using Llama with RAG, to using LangChain, to prompt engineering, to fine-tuning — basically integrations and really nice detailed instructions that give you an idea of how to use the models. I personally love using the small models: just go on Hugging Face, for example, go to the models, grab the 7B, and even prompt it directly — it's super, super easy. That's probably the first thing I do. Or I'll just spin up a Jupyter notebook or a Colab and load one there. When the early Falcon models came out and I saw them, I was like, this is pretty cool, I can just grab one — and I was up and running with the 40B within a couple of minutes, quickly seeing how it compared to the smaller version they had: was it more coherent when it responded to me, did it have a higher false refusal rate when I asked it to do stuff. So there are a bunch of services and things out there, but my go-to is either to set up a Jupyter notebook or Colab and grab a model, or just go to Hugging Face and prompt directly — if the model fits in your memory; otherwise you'll run out of memory. But that's what I do, anyway.
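For concreteness, "grab the 7B from Hugging Face and prompt it directly" looks roughly like the sketch below with the transformers library. It assumes you've requested and been granted access to the gated meta-llama/Llama-2-7b-chat-hf checkpoint, have the accelerate package installed, and have a GPU with enough memory; the prompt and generation settings are just illustrative:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"  # gated: requires accepting the Llama 2 license

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,   # half precision keeps the 7B in roughly 14 GB of GPU memory
    device_map="auto",           # needs `accelerate`; places weights on available devices
)

prompt = "Give me three ideas for a getting-started tutorial on small language models."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The same few lines work in a Colab or Jupyter notebook, which is the quick feel-for-the-model workflow Joe describes.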
Lukas [00:18:40] Interesting. So you recommend just trying them and getting a feel for how well they work.
Joe [00:18:47] That's it for me, yeah. If you're just looking to understand how to converse with the model, that's a pretty quick way to do it. But also, if you look on Hugging Face, there are something like 11,000 derivative models that you can download — people have built Chinese versions, German-language versions; I think I even saw fashion versions that will converse with you about fashion. I'm not a very fashionable guy, but you can play with those as well. So the community is innovating here, and they're taking advantage of the fact that you can adapt these models pretty quickly with very little data, or even just prompt engineering, and have them do different things. So yeah, it's pretty fun to play with what the community is doing.
Lukas [00:19:39] Totally, totally. And even the 6B model is pretty impressive, although kind of challenging to deploy for a lot of edge use cases. Have you thought about building even smaller ones? I feel like there's a lot of appetite for that, and there are various strategies I've seen out there.
Joe [00:20:00] You know, we released the 7B, actually — that's what we released. But Qualcomm, for example, is able to deploy the 7B Llama on the new Snapdragon that was announced a couple of weeks ago, and MediaTek is able to do it too. I think that's going to be a fairly premium feature on a phone — the high-end phones. But yeah, you can imagine smaller versions. There was a paper I just saw come out this past week called Baby Llama, which was pretty cool, and there's also TinyLlama on GitHub — somebody is releasing checkpoints for a roughly 1-billion-parameter model, including all the intermediate checkpoints. So there's actually quite a bit of innovation. It's not that expensive to pre-train from scratch when you get into the small models; it's really about how coherent you can make them. And the more you can fine-tune or adapt them for a specific task, the more coherent they can be, which is really cool. You can't really expect a 1B or a 3B to be as good as a 70B or bigger, just because it doesn't have the model capacity. But all that said, I don't think we've actually saturated those model sizes yet, so I do think there's opportunity for us and the community to build better small models. I'm pretty bullish, actually.
Lukas [00:21:29] One thing I was kind of wondering about — it was a little bit irritating to have to ask permission just to download the model. What was the thinking behind that? As a product manager, I'm sure you want to remove friction, and waiting a few hours to get access to the model is kind of like, wait, what's going on here?
Joe [00:21:53] No, this drives me crazy, and I'm working on some things to make it easier, certainly. As a product manager — as the Llama guy — I would never want that. If you follow me on GitHub, you'll see I'm on there a lot responding to issues, and I see a lot of what the community is dealing with. I don't know how many issues there are just about access on Hugging Face — because right now we do a check on the Meta side and a check on the Hugging Face side. We have a license that isn't a crazy, bespoke license, but it is a modified license, and we have an acceptable use policy, which I think is a good thing to have. But yeah, ultimately it does add friction, which I'm not happy about. So I'm hoping that with some of the changes we have coming we can streamline that quite a bit, because ultimately we want this technology in everyone's hands. I don't think the license we selected precludes any of that, but there are some things we can do with Hugging Face and Kaggle and other platforms to make accessing the models easier and more streamlined. So yeah, we're definitely working on that. I'm as frustrated as you are.
Lukas [00:23:07] Do you feel at all competitive with other open source projects building foundation models, or do you just feel like, hey, this is great, let's collaborate? What's that like?
Joe [00:23:20] I mean.
Lukas [00:23:21] Honestly, I'd think it feels nice. I mean —
Joe [00:23:27] I'm of two minds here. I love being competitive. I love seeing our models at the top of the leaderboards. I love hearing about what people build with our models, and about the startups. I'm just shocked, for example, at how enterprises have adopted Llama — from mid-July to now, thousands of enterprises are using Llama models in products. And by the way, that's a huge source of pride for me and a lot of other folks here. But at the same time, I love when other models are released into the open, because it can't just be Meta, right? It can't just be us and our team carrying the flag for openness and transparency. I've met with the Falcon team — they're super cool, I love what they're doing. The Mistral team in Paris — they're all our friends, right? They're my ex-colleagues, Tim and Guillaume and the team. So when new models come out, I'm a huge fan — I'll go play with them, compare them, evaluate them. The more I see that get out there, the better. But yeah, I'm a little bit competitive, obviously. I want us to have the best models.
Lukas [00:24:40] Is there secret sauce in the training that you want to keep to yourself, or are you totally open about the learnings from the process of building these models?
Joe [00:24:54] I mean, we're pretty transparent — the paper, really. How many pages was our paper, 48 pages? Obviously every company has secret sauce that you're going to keep to yourself. But at the same time, if you read the paper — the volume of ablation studies the team did — I was like, holy cow. When I read it, I'd never seen anything like it. That paper was an absolute masterpiece, and I'm not just saying that because I work at Meta and know the team — it really was an amazing paper, as detailed as I've ever seen. It's transparent about everything, from the learning curves, to the RLHF work, to the ablation studies, to how we thought about safety in fine-tuning versus pre-training. It was just really, really detailed. So I think we've been about as transparent as we can possibly be, is maybe my point.
Lukas [00:26:02] Can you talk about what you're working on now, for the next version of this?
Joe [00:26:11] A little bit.
Lukas [00:26:13] Okay. Yeah. I mean, what changes are you thinking about?
Joe [00:26:21] You know, we've learned a lot. I think there are a bunch of other challenges besides the model itself. What's interesting is that I saw you had a post last week around an AI operating system, and it was interesting because I'd been thinking along those lines for a while as well: it feels very much like the model is the kernel. In an operating system you have things like your antivirus, which is kind of like your trust and safety, I guess; you have file systems and access to information, through embeddings or RAG and databases, and so on. So I'm thinking about not just the model but the whole ecosystem. I'm thinking about levels of abstraction for how you access these models — whether it's through an API via our partnerships, through direct access, or through integration with an SDK — and we work, for example, with a lot of open source projects like LangChain and AutoGPT and a bunch of others. So right now my biggest focus is: how do I think about the community, and how do I build around the Llama models as the kernel of this operating system and bring a more holistic platform for others to build on? Because that was kind of our strategy with PyTorch. When we thought about what to make and what to buy, we didn't want to kingmake in places where it didn't make sense for us to kingmake — I think Soumith probably talked about this on his podcast. In areas of high entropy, it's kind of weird to go and put something there and then be competing with everything going on in that space. So we want to build things that others can build on, grow our community around that, and create that flywheel. I'm looking at areas around Llama and thinking: trust and safety is one, evaluation is another — how do we grow the community around us so we can ultimately build safer GenAI experiences? There's a chasm to cross, right, in products: how do we take this open model, which looks cool and does some cool stuff, and ultimately build responsible products from it? Some of the things we're doing: we're involved in the MLCommons AI safety working group, working with the community there to help standardize and build tools and evals. But generally speaking, I'm thinking about the whole product here, not just the model.
Lukas [00:29:09] Mm hmm. What kinds of things are you thinking about building in the realm of safety, or encouraging others to build? What kinds of problems do you see people running into, and how are you fixing that?
Joe [00:29:22] Yeah, you can look at the executive order as one reference. Cybersecurity, certainly — generating insecure code; I think that was the top thing in the executive order. There's general safety around the inputs and outputs of your prompts. There are a number of different harms, and we obviously have to deal with them as a company — we have massive platforms in Facebook and Instagram and WhatsApp, so there are harms we deal with on a daily basis, and internally we're looking at those things. What we want to do is ultimately help grow the community around our models so people can help build these things too, in the open. You can imagine taking a model from, say, Amazon and adapting it to your application — but then I don't know whether that model is actually safe or not. How do I actually understand whether it's going to generate outputs that will ultimately create harms? I wouldn't go as far as the bioweapons scenarios — that's been the topic this year, right, at the summit last week; I think nothing is impossible with these models. But understanding what the risks are, finding ways to mitigate them, building that into your products, and making your products as safe as possible — that's ultimately my goal. And I want to do that not only at Meta but also out in the open.
Lukas [00:30:52] But when you talk about safety of inputs and outputs — I feel like everyone says that, but it's a little vague. Can you be a little more concrete?
Joe [00:31:03] Well, I think this is part of the challenge. So, for example, if I were to hypothetically put out tools that allowed you to understand the input safety of a prompt: say you wanted to ask my model here — ask Llama — how to build a bomb. That's obviously an unsafe input. I think everyone would agree that's, generally speaking, a bad thing to ask for, or to generate an output for. But there's a lot of gray area too, and our policy may not be Amazon's policy, or may not be someone else's policy. So unfortunately, yeah, it's going to vary from party to party. But this is also why we want to work with the community and standardize, because our perspective on what a harm is may not be everyone's perspective. We deal with all kinds of crazy stuff on our platforms today — sexual content and human trafficking and all of the terrible things that we try to mitigate. These are horrendous things that we obviously want to keep off our platforms, and we deal with integrity issues across accounts constantly. The question is: how do we do that more broadly? Obviously we have tools internally, but what is the community dealing with? How do we create a framework where you can adapt the policy to your own custom policy, and so on? So again, it goes back to MLCommons and working to standardize these things, and ultimately giving companies the freedom to define different policies themselves.
Lukas [00:00:01] But I guess, I mean, Llama 2 probably has embedded in it some of these safety mechanisms, right?
Joe [00:00:08] Yeah.
Lukas [00:00:08] Is it possible to override them? Could I get Llama to tell me how to make a bomb? Does it know that somewhere inside of it, or not?
Joe [00:00:19] I mean, ultimately it's trained on a lot of data, so it can generate things. With any of these models, I think you could probably prompt it in ways that would eventually allow you to do that — it's a matter of determination, and there are a ton of ways to jailbreak these models. We have a lot of data, though, that shows how safe we are; we've done internal studies. And this is one of the reasons we love having our models out there: we've essentially crowdsourced researchers who are banging on our models, publishing papers, understanding the bias in our models, understanding how to manipulate things like the system prompt — which was a problem we had for a while until we addressed it. So being open is actually helpful: you can understand these things better and then mitigate them, either in the current generation or the next. But yeah, ultimately these models can be manipulated, and that's why, in my opinion, trust and safety is one of the most important things we need to deal with — and we need to do it in the open: how we build these tools, how we evaluate these models, and how we do it not only unilaterally but as a community.
Lukas [00:01:48] But isn't there a danger in open sourcing a model, in that you can't take it back? If somebody does find a way to get this kind of unsafe information out of the model, there's no way to put the genie back in the bottle, right?
Joe [00:02:03] I mean, that's kind of the nature of open source in a lot of ways. Ultimately, the good should outweigh the harms. If you look at the open versus closed argument — maybe I'll use the OS analogy here, Windows versus Linux. Am I going to be operating and working in a Linux environment, where I can inspect the kernel, I can build on it, it's transparent? Or am I going to be operating in, say, a closed-source Windows environment, where I trust my overlords — the paternalistic owner of that platform — to make sure everything is okay? There's a case to be made for both, certainly, but I really think transparent and open is the way. And the good that comes out of this — the democratization that happens when you build in the open — is amazing. The number of startups building on Llama models, I'm hearing numbers in the thousands, and I actually don't think that's hyperbole. I think enterprises are building on it because they can't build their own foundation models, and they don't want to ship data to others — they'd rather run Llama at the edge or in their own data centers, in places where they don't have to send data back to another cloud or another service. So there are just so many positives in being open about this. I think it's really important that we have good balance in the discussion between open and closed.
Lukas [00:04:01] What about the datasets? I think you don't actually say what data it's trained on, if I remember right.
Joe [00:04:10] That's right.
Lukas [00:04:11] Will you open up the dataset? How do you think about that?
Joe [00:04:16] Yeah, I mean, we've released datasets over the years. In this case, we're not releasing our datasets, for a number of reasons — competitive reasons being one, I would say. But you can look at the number of datasets and things we do have open, and we may open datasets in the future — we're always talking about it. It's a lot of energy if you've ever worked at a big company and you want to release something, and I would say Meta is probably one of the best companies for that — if you look at Google, Amazon, the companies I've worked for, Meta is definitely the easiest place to do it. But we may release in the future. It's up for discussion; it's a matter of goals, I would say.
Lukas [00:05:06] How do you think about multilingual capabilities?
Joe [00:05:14] Multilingual is interesting. We've seen a lot of models get fine-tuned on a ton of different languages, and there's obviously demand for multilingual — not everything is English, right? We have a very North American-centric view; you and I are sitting here in Silicon Valley and San Francisco, and everything is English and very Americanized. But that's not the case — the world is super diverse, and there are a lot of languages in use. I've been pleasantly surprised at how many languages people have fine-tuned on top of Llama and released models for. The challenge, of course, is how you do that well. If we were, for example, to build multilingual capabilities into our models in the future, evaluation is the interesting part for languages — if you get something slightly wrong, you can say things that maybe we didn't want to say, right? So being thoughtful about that evaluation, again, is one of the biggest concerns. Having a robust platform, having a diverse dataset — and it gets harder and harder as you get into the low-resource languages — and obviously having people who speak all those languages is really, really helpful. So it's a really hard problem. It takes scale, and you've got to be thoughtful about it. And if we want these models to truly be world models in the future, I think multilingual is kind of a basic requirement.
Lukas [00:07:02] Totally. What about the tone of the model? It's kind of interesting — I feel like Llama Chat has a very distinct tone. How much thought was put into that?
Joe [00:07:20] Tell me more — tell me more about what you think the tone is. Is it too chatty? Is it too terse? I'm kind of curious what you think.
Lukas [00:07:31] Oh, well, I mean, this is only my impression, but I sort of feel —
Joe [00:07:34] Like.
Lukas [00:07:36] The GPT models feel kind of like I'm talking to a Boy Scout or something — very affable and direct. And the Llama models feel a little sillier to me, crazier — more friendly, a bit more personality, but less serious. I'm curious if that was intentional. I don't even know if other people have the same impression; I've just been talking to a lot of these models over the last couple of months, so that's one man's impression.
Joe [00:08:13] That's interesting. I've definitely played with a lot of Llama, and I've played with Bard and with the GPT models, and I found Llama to be pretty good. You're always balancing helpfulness and safety and all these different factors, and you have reward models. I actually found the GPT models to be fairly terse on certain things — if it's going to do a refusal, it will be pretty terse when it refuses: one sentence, like "I can't talk about that," or something like that, if I remember right. And some other models are a little more chatty — I think the Llama models tend to be a little chattier, but when they refuse, they tend to be more helpful too, in my experience. If you mention something sensitive, for example, it will flat-out refuse, but it will also provide, I don't know, a hotline or something — here's where you can call to get help, whatever it is. So even when it's refusing, it will come back and provide something helpful. And that's a balance you have to strike. In some ways that's probably a lot of the input you got from the humans when you were generating your reward model and doing the RLHF. So in some ways these models ultimately reflect what they were trained on — the safety data and everything you put into the reward model and so on; they ultimately reflect those things. Maybe we just had a different sampling of people than other models, I'm not sure. But I think it's actually interesting that you had that observation.
Lukas [00:10:07] Or maybe different guidelines in the RLHF. Do you actually publish the guidelines that you use? I would imagine that's pretty important.
Joe [00:10:20] Yeah — I think we talk a lot about it in the paper, but the reward models are not open.
Lukas [00:10:27] Well, okay, I guess this is a question I get constantly: what do you feel are the biggest working applications that you see of Llama 2, especially in a business context?
Joe [00:10:47] Yeah. It's not the sexiest application, to be honest. Summarization is, honestly, the unsung hero of GenAI. I was chatting with the CTO of a medical company — we were having dinner sometime last month — and they're using Llama 2. And I'm like, so what do you do with it? Are you doing some crazy stuff? And honestly, it's this: think about your medical records. Some people have really long medical records — like 500 pages of records — and if you just want to summarize them in plain language, in a way that you can easily read, either as a doctor or ultimately as a patient, you can do that with Llama. It's great, and you can ask it questions. So I really think that while we're getting super excited about a lot of the more far-flung applications of GenAI, we kind of forget that it brings real value for basic things like that. I see a lot of that. Zoom, for example — they have agents that deploy in your meetings and summarize the meeting for you and give you the salient highlights. I was actually just talking about this with someone — over the weekend I was having coffee with somebody, and I said, damn, I wish there was an agent, private and all that, that would just sit with me in my meetings, take notes, and summarize things at the end of the day in a private way. Because I'm tired — I start my days at 6 or 6:30 and go late — and I would love a two-page summary at the end of the day of all the notes, all the salient things, all the actions, everything that someone is expecting from me. Oh my God, that would be a lifesaver. When you're in back-to-back meetings for nine or ten hours, you're not taking notes, and you can't remember everything — especially when you're multitasking, with people pinging me in all these chats and calling me and whatever. So even basic things like that would be incredibly valuable to people. Right now, that's where the real utility is. But obviously the innovation is still coming.
Lukas [00:13:13] So you worked on PyTorch, and then at Google you worked with the JAX team. Was there anything that you learned or saw from JAX that you would want to take into PyTorch?
Joe [00:13:32] You know, it's funny. First of all, I love the JAX team — they're so cool. They remind me very much of the early days of PyTorch: small teams, for researchers, by researchers. Skye and Matt and the team are just so cool to hang out with. I would go up to San Francisco just to try to find time to hang out with James Bradbury and the others, and I'd try to hang out with Skye because she's super cool, and just chat. On one hand, I loved the way they were able to just shut everything out around them and keep building their core framework. No matter what happened around them, they had this uncanny ability to push away any distractions — the production teams or anyone who would come in and tell them what to build, they'd be like, okay, whatever, and they would just keep building what they thought was awesome. And it is awesome. So to me, one of the best learnings I had is that some of the best teams, in my opinion, are the ones built by users, for users. And a lot of what you did, for example, with Weights & Biases — you yourself were a user, building Weights & Biases. When you demoed it — I'm trying to think how many years ago that was — when you demoed the app, we looked at each other after the meeting like, holy shit.
Lukas [00:15:06] Oh, that's cool. I didn't know that.
Joe [00:15:07] — this is going to be awesome. Yeah.
Lukas [00:15:09] That led to that? Okay.
Joe [00:15:11] You did. Oh, it was awesome. We both looked at each other like, this is awesome — this was such a cool platform, and it felt like it was built by someone who had a lot of user empathy. And that's pretty rare these days, right? Because product managers typically don't like to be hands-on with these things, and a lot of the startups I see building platforms — a lot of those people have never actually shipped production workloads, yet they're building systems and platforms that are supposed to be great, right? So JAX, I think, was really interesting. JAX's challenges, though — it's funny, because I look at JAX where it is today, and at least in terms of processes and stability, it reminds me of where PyTorch was maybe two or three years ago. In a lot of ways it's probably further ahead than I'm giving it credit for. But things like feature maturity, things like forward and backward compatibility — last I checked, there was no real semblance, I would say, of backward compatibility. So as a user, it's pretty annoying, because I'm constantly getting broken — breaking changes are really annoying. But this is stuff you kind of learn along the way. When we first started releasing early versions of PyTorch — 0.4 onward, the ones built for large-scale usage — we had no semblance of managing breaking changes either. We would just break people, right? We didn't know. But then, I remember —
Lukas [00:16:54] Joe, I was there. Yeah.
Joe [00:16:58] There you go, man.
Lukas [00:16:59] It's weird.
Joe [00:17:01] But you learn, right? And the only way to learn these lessons, sadly, I think, is by actually living it and getting yelled at by users: hey, why did you break me? Okay, okay, okay — what do we do next? And we just learned: we would issue a warning, right? We'd deprecate an API, and then the next release we'd break you — at least you get a heads-up, and you have a contract with the user. We learned that over time, and I think that's where the JAX team still needs to grow up a little bit. But it's an awesome framework, and it's got incredible user empathy.
Lukas [00:17:36] Don't you feel like there are also advantages to breaking changes? I'm a little surprised to hear you say it like that, because I kind of thought that PyTorch at the time was taking an intentional point of view of, hey, we're not going to be saddled by the past, we're just going to move forward — which made Weights & Biases' life incredibly hard, with basic primitives changing in every point release. But certainly PyTorch was successful, right? So do you think you would have been more successful had you done more backwards compatibility?
Joe [00:18:14] I would say backward and forward compatibility are kind of a product of the level of maturity of a project. In the early days of a project, I think it's totally fine to be breaking constantly — when it's a research project, when you're in that high entropy state where you're still figuring things out, and the ecosystem around you is innovating rapidly, I actually think that's totally fine, as long as you communicate, as long as you're in constant communication with your users and you tell them you're going to break them. The worst thing is when you don't tell your users you're going to break them and they're surprised — as a user, that's super irritating, that's so frustrating. So it is a product of maturity. As PyTorch became more and more used in production, not only inside Meta but out in the world, you can't just break production users, because they've implicitly taken a bet on you in their products. I remember when Microsoft made a bet on PyTorch — they obviously have ONNX and their own resources today, but they made this huge bet on it. For a two-trillion-dollar company to take a bet on a framework that another company is building, that was a huge leap of faith from Microsoft. OpenAI as well, when they took a bet — I remember meeting with the team in San Francisco, and they said, yeah, we're betting on PyTorch, and obviously GPT-3 and GPT-4 followed. So it's a leap of faith at the end of the day, and ultimately you have to think not only about your needs and what your company needs, but also about what the community needs. That forces you to think a little differently and take different risks, or push some of the entropy into repos that sit outside the core, where you can figure things out. We spent a lot of time with PyTorch modularizing the code base, and still to this day, every time we think about a new paradigm or a new API, it sits outside the core first; we can understand whether there's value, and graduate it if we see we want to support it long term. But ultimately, people rely so heavily on PyTorch now that breaking it is really hard — you need to be thoughtful about it. So processes evolve, right? We learn how to deal with it, and the community adapts. They come to expect: okay, I got a warning, I'd better change my code, because next release things are going to change. And people do it.
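The "warn one release, break the next" contract Joe describes maps onto a very small amount of code. Here's a minimal, hypothetical sketch in Python — the function name, argument, and message are all made up for illustration, not an actual PyTorch API:

```python
import warnings

def load_checkpoint(path, weights_only=False):
    """Hypothetical library function showing one step of a deprecation cycle."""
    if not weights_only:
        # Release N: emit a warning so users get a heads-up and can migrate.
        warnings.warn(
            "Calling load_checkpoint() with weights_only=False is deprecated and "
            "will raise an error in the next release; pass weights_only=True.",
            DeprecationWarning,
            stacklevel=2,
        )
    # Release N+1 would replace the warning above with a hard error,
    # completing the contract the users were warned about.
    ...
```

The point isn't the mechanism so much as the communication: the warning in one release is the notice, and the breaking change only lands in the following release.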
Lukas [00:21:07] Were there other major differences in culture that you really felt? What is it like working at Meta versus working at Google on these big open source projects?
Joe [00:21:26] Culturally, I can tell you that in terms of open source they're very different. At Google, it was more a question of whether to open source something after the fact, as in, okay, we built this thing, should we open source it? I don't know, maybe. It could be hard, and things would stall because no one agrees on what to do. Whereas at Meta, the going assumption for just about everything is that it's going to be open source, or at least that we will think about open sourcing it, and now let's talk about, number one, our goals; number two, what success looks like, what kind of community we should build, or how to license it based on our goals. So it's kind of a built-in assumption here that we're going to be open about a lot of our technology, whereas at Google, in my experience anyway, and this is just my time there, it's not an assumption that things will be open. It was definitely harder. It's a cultural difference, I think, in the companies and how they view these things. Again, probably just my experience in a lot of ways, but open source was a lot harder there. And the other thing that was different, and this is more of an ethos I've adopted over the years, is that if you're going to open source something, you absolutely need to support it, and you need the team that's developing it to be there to support it. I'm not saying anyone is perfect. Believe me, we're fixing issues as quickly as we can and trying to support the community as best we can, but we do have processes, right? We have weekly triage meetings that I've been running. And Google is a little different; certain parts of it had that empathy. I love, for example, the TensorBoard team, Nick and those folks over there. We partnered so closely with them on PyTorch, because we thought, hey, we need a really nice visualization tool, TensorBoard is becoming the standard, so how do we remove the TensorFlow dependency and support PyTorch, because we don't really want to build something brand new. And I remember we removed the TensorFlow dependency and basically got it so you can import it into PyTorch with a single import, and it was so awesome. I brought over half a dozen bottles of Scotch to Mountain View, handed them out to the folks over there, and we all had a drink. It was so cool. That team is great. Other parts have struggled, right? They have internal priorities that override supporting things externally, or they don't prioritize it, or whatever. So it's definitely a little harder over there than I think it is at Meta.
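For context, the "single import" integration Joe mentions survives in PyTorch today: TensorBoard's logging writer ships under torch.utils.tensorboard, with no TensorFlow installation required. A minimal sketch, assuming PyTorch and the tensorboard package are installed:

```python
# TensorBoard's writer is importable straight from PyTorch.
from torch.utils.tensorboard import SummaryWriter  # needs `pip install tensorboard`

writer = SummaryWriter(log_dir="runs/demo")
for step in range(100):
    loss = 1.0 / (step + 1)                 # stand-in for a real training loss
    writer.add_scalar("train/loss", loss, step)
writer.close()
# Then run `tensorboard --logdir runs` to view the curves in the browser.
```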
Lukas [00:24:25] Hmm. You didn't give us any Scotch when we integrated with you.
Joe [00:24:37] I'm glad to buy you a glass of Scotch. Or a bottle. Either way.
Lukas [00:24:43] Was.
Joe [00:24:44] A bottle.
Lukas [00:24:45] All right. I like that. Did you want to talk about the Learning Alliance? I saw that you're the co-founder of that, and I was curious to know, like, how you think about that and what it does.
Joe [00:24:58] Yeah, actually, this is another reason I'm excited to be back here. It's such an amazing team, with Mark Tigard and Paco Guzman and others; just such a cool group. I'm of the belief, maybe because I'm Canadian, that education is the root of everything, and I don't think that's a controversial statement. I really believe in education; I really believe that if we had better education, it would solve a lot of our challenges. I spent several years teaching and building things through the Georgia Tech collaboration. I think education should be open. I love OpenCourseWare. I love, for example, all the Stanford material that Chris Manning has released over the years, much of it built on PyTorch, which was super cool. OpenCourseWare, being transparent, lowering the barrier to entry, leveling the playing field, that's really what the AI Learning Alliance is about. We taught this master's-level deep learning course at Georgia Tech for several semesters, and then we released all of the course materials, including notebooks, code, lecture notes, slides, everything. And then we really focused on, for example, historically Black universities, Hispanic-serving universities, global universities, anyone, honestly, who would work with us. We really wanted to scale it, because I think we're in a bit of a bubble here. The universities I've worked with so much in the past have such a West Coast bias, just because it's so easy for me: I can drive to Berkeley and work with the professors and students there, Stanford is twenty minutes from my house, or NYU, obviously, because of Yann and our team in New York. We have such a bias towards these convenient schools because they're so close to us or we have strong ties to them. But there are dozens, if not hundreds, of universities globally that want to be able to teach machine learning and AI, and they need basic materials to do it, plus train-the-trainer support, to help grow this and foster, I would say, the next generation of professors, not just at top-tier schools but at any school, frankly, because machine learning is so important. So yeah, I'm very proud of that work. I've spent a lot of time in my career building courses, and I try to do everything in the open. Andrew Trask, if you know Andrew, he's at DeepMind; he founded the OpenMined community, which to this day is an amazing feat to me. He built it all around the premise of privacy, basically keeping data private while still innovating on that data; he calls it working on data that you can't see. We've built a lot of tools together in PyTorch, we've built classes together, and we've released those on Udacity and other platforms.
And I just really believe that if you can lower the barriers and help educate people, it matters so much. So that's where I stand.
GD001_Transcript_2.2.mp4
Lukas [00:00:00] That's cool. You know, from reading the comments and reviews of this podcast, I know we get a lot of people interested in learning more about machine learning. Do you have any pointers on where you'd send people today? The problem with this space is that the shelf life is so short on a lot of this stuff. Even the material we put out two years ago feels like it's definitely not of today, and here we are on November 20th, 2023. What are the best resources, according to Joe?
Joe [00:00:31] You know, I like Andrew's courses. The ones that are around two hours long are super interesting.
Lukas [00:00:39] The DeepLearning.AI ones?
Joe [00:00:41] Mhm, yeah. I think they're really cool. Our attention spans are getting shorter, our time is getting shorter, we're getting busier. Couple that with, to your point, the space moving ultra fast, new libraries getting created and getting popular at speeds we've never seen before. So I really love what Andrew's done with the DeepLearning.AI courses. I saw the one with Harrison on LangChain and thought that was really cool, and then Sharon and those folks. Those are really good. But I also think there's no substitute for learning the nuts and bolts. If you look at the 60 Minute Blitz in PyTorch, which was put together several years ago, it's still incredibly popular; people go to it and use it. And I guess CNNs aren't really exciting anymore, but building a CNN from scratch is still fun, building an image classifier is still fun, and it gives you a real understanding of how this stuff actually works at the end of the day. For better or worse, the barriers to interacting with generative AI are so low. People, largely, like my parents, for example, or people who are non-technical, interact with ChatGPT or Bard or these things through prompting, and they don't really understand what's happening underneath: that there's a large model you're interacting with, all these GEMM operations ultimately being run, forward and backward propagation and weight updates, how these models are constructed, how inference works. Actually going back to basics is valuable. I remember writing my first network all in NumPy, for example, was really cool. This was a long time ago now. It was like relearning the chain rule, right? But writing it in NumPy, in Python, because I was an engineering undergrad and I'd learned the chain rule ages ago, probably forgot it, and then relearned it. So I think it doesn't hurt to go and learn these things and understand how these models actually work, versus, honestly, just interacting with them at the surface layer. That's cool, and it democratizes things and makes it more of an interaction, but not if you're really curious about what's happening underneath.
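For readers who want to try the "back to basics" exercise Joe describes, here is a minimal sketch of a two-layer network written entirely in NumPy, so the forward pass, the chain rule in the backward pass, and the weight updates are all explicit. The toy data and hyperparameters are arbitrary.

```python
import numpy as np

# Tiny two-layer network on a toy binary task, written from scratch so the
# forward pass, the chain rule, and the weight updates are all visible.
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 3))                   # 64 samples, 3 features
y = (X.sum(axis=1, keepdims=True) > 0) * 1.0   # toy binary target

W1, b1 = rng.normal(scale=0.1, size=(3, 8)), np.zeros(8)
W2, b2 = rng.normal(scale=0.1, size=(8, 1)), np.zeros(1)
lr, eps = 0.5, 1e-7

for step in range(500):
    # Forward pass
    h = np.maximum(0, X @ W1 + b1)             # ReLU hidden layer
    p = 1 / (1 + np.exp(-(h @ W2 + b2)))       # sigmoid output
    loss = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

    # Backward pass: the chain rule, layer by layer
    dlogits = (p - y) / len(X)                 # dLoss/d(pre-sigmoid logits)
    dW2, db2 = h.T @ dlogits, dlogits.sum(axis=0)
    dh = dlogits @ W2.T
    dh[h <= 0] = 0                             # ReLU gradient mask
    dW1, db1 = X.T @ dh, dh.sum(axis=0)

    # Plain gradient-descent update
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print(f"final loss: {loss:.3f}")
```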
Lukas [00:03:25] Yeah, totally. I've learned a lot from implementing many things myself; that's been an important part of the process for me. I feel like we should wrap up, and we always end with two questions, which feel free to take in any direction you like. The second-to-last question is: what do you think is an underrated topic in this broad field of machine learning and the LLMs that you work on?
Joe [00:03:54] You know, I'm going to stick with the hardcore science. I spent a bunch of time on that previously. If you look at some of the projects that were in my team, like the theorem proving work and our protein team, which is now kind of building a business, and there's a team still there working on Open Catalyst, and they just released a dataset for direct air capture, for carbon removal. And this isn't just one specific area. In general, I still love to see the direct impact of the work. When we were pitching the direct air capture project last year, for example, we met with leadership and said, hey, we want to build this thing, we're building our data centers anyway, and we want to basically open source everything so we can create a community. We don't want to keep this technology to ourselves, because frankly we want it to have global impact if it possibly can. Those are the kinds of projects where I see so much potential, and there's very little attention on them; everyone is looking over there while all this is happening over here, right? Everyone's focused on chatbots and on generating images and videos, whereas I really think the impact is in the more pragmatic sciences, in areas like biology and material science. I can see direct impact from that research; it's not far off. So that's what excites me.
Lukas [00:05:44] I totally agree with you on that and feel the same way. But I actually hadn't heard of this air capture project. Can you tell us, how does machine learning help with that?
Joe [00:05:58] Yeah. So I'll just talk about Open Catalyst for a second. Open Catalyst is a project that has been going on for a couple of years. Nick is the researcher at FAIR behind it; he was actually in computer vision at MSR for a long time, so he's a long-time Microsoft guy, and he came over as one of the early FAIR researchers. One day he basically learned chemistry, started bridging ML and chemistry, and started this project. He's an incredible person and one of the folks that's really, really interesting to talk to; you should have him on this podcast, and I'll send him your way. What we learned is that you can take this same paradigm and explore materials and molecules much more efficiently using, say, graph neural networks. You have the Schrödinger equation, and you can go and solve it the hard way, or you can approximate it using something called density functional theory, DFT. And then you can approximate DFT itself with a GNN, and you can get it pretty accurate, accurate enough that you can use it as a really effective filter that gets you to interesting molecules, or frankly materials that have never really been explored before. And we have internal projects where this has actually landed real impact. With the Open Catalyst Project, the whole goal is basically to find a low-cost catalyst. Today it's platinum and other very expensive materials; if you find catalysts that are cheaper, you can lower the cost of generating things like green hydrogen. That's really the premise of that project. And so what we've thought is, when we start to look at other material regimes, we can look at areas like sorbent materials and direct air capture. If anyone's not familiar with direct air capture: you take in air, you capture the CO2 in a sorbent, and then you have to regenerate that sorbent, which removes the carbon from it, and then you can do something with the carbon, like sequester it underground or turn it into some type of biofuel, whatever you want to do with it. What we've found in working with this community is that these sorbent materials, in some cases a liquid, are expensive to regenerate. My knowledge is a little out of date here because I've been working on generative AI so much, but you have to heat the sorbent up quite a bit; we're talking hundreds of degrees Celsius, I think at some point something like 900 C to regenerate a liquid sorbent. That's probably lower now, because I'm sure there's been some innovation in the last year, but it's really expensive, and you can end up generating more carbon than you actually captured, which makes it kind of a futile process.
And so the goal of that project is ultimately to explore different material spaces and find sorbents that could, for example, run off the waste heat of, say, a data center or a building, something where you don't have to expend a bunch of energy just to remove the carbon. And you want materials that operate within certain climates, because obviously a desert versus, say, Iceland versus northern Canada are very different climates, and the sorbent may be different for each of them, so it's not something you can just solve once and be done. That's where being able to explore a lot of different molecules and materials is really interesting. In this space, Anima Anandkumar at NVIDIA is also working on some of these things and applying a lot of this as well, so you might want to talk to her about that.
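To make the GNN-as-DFT-surrogate idea concrete, here is a heavily simplified, illustrative PyTorch sketch: atoms are graph nodes, messages pass along neighbor edges, and a summed readout predicts a total energy that would be regressed against DFT-computed labels. This is a toy model, not the actual Open Catalyst architectures; all names and sizes are made up.

```python
import torch
import torch.nn as nn

class TinyEnergyGNN(nn.Module):
    """Toy message-passing network: atoms are nodes, and the readout is a
    predicted total energy (a stand-in for a DFT-computed target)."""
    def __init__(self, num_elements=100, hidden=64, layers=3):
        super().__init__()
        self.embed = nn.Embedding(num_elements, hidden)   # atomic number -> vector
        self.msg = nn.ModuleList(
            nn.Sequential(nn.Linear(2 * hidden + 1, hidden), nn.SiLU())
            for _ in range(layers)
        )
        self.readout = nn.Linear(hidden, 1)

    def forward(self, atomic_numbers, edge_index, distances):
        # atomic_numbers: (N,), edge_index: (2, E) src/dst pairs, distances: (E, 1)
        h = self.embed(atomic_numbers)
        src, dst = edge_index
        for layer in self.msg:
            m = layer(torch.cat([h[src], h[dst], distances], dim=-1))
            agg = torch.zeros_like(h).index_add_(0, dst, m)   # sum messages per atom
            h = h + agg                                       # residual node update
        return self.readout(h).sum()                          # total "energy"

# Toy usage: 5 atoms, 8 directed neighbor edges, random distances.
z = torch.randint(1, 90, (5,))
edges = torch.randint(0, 5, (2, 8))
d = torch.rand(8, 1)
energy = TinyEnergyGNN()(z, edges, d)   # scalar prediction to regress against DFT labels
```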
Lukas [00:09:55] Yeah, definitely. I love that stuff. All right, we should get to our final question, which I'm sure you'll have an interesting perspective on. Normally I ask about getting models in general to work in the real world, but I'm curious for you: when people try to get LLMs to work in the real world, what are the biggest issues they run into today?
Joe [00:10:21] I think it's actually pretty straightforward to get a proof of concept with these models these days; the barriers are so low. You can even use APIs, for example; there are fine-tuning APIs, you can go to Together, grab one of the models, do supervised fine-tuning on it, and deploy it for an application pretty quickly. I think the biggest barrier with open models, though, is that if you take a model beyond, say, the canonical models that we release, and like I said, there are something like 11,000 of them on Hugging Face, and probably double that, or at least as many, in private hosting, you don't know the data lineage. So you don't know what exposure you have, for example. I also think evals are still a problem for safety. You saw Scale released their safety suite this past week, which is pretty cool, but I think we still have a long way to go in terms of taking these models into production. Today the tooling is pretty basic: OpenAI has a content moderation API, and they also just released something for copyright infringement, and these are great starts. But those are some of the biggest challenges, because inherently these models are risky; there's risk to them, and I think we need to build a lot more safety and responsibility into the system, and, like I said, I want to do that in an open way as much as possible. So I would say those are probably the biggest things. But also, it's much easier to use an API than it is to host an open model. That's why we're glad to have our partners: not every company can take a set of model weights and then deploy a 70-billion-parameter model at scale, right? That is a real thing; it takes effort, it takes engineering, it takes some know-how to do it efficiently. You can probably get it to work, but that's why you see these articles that say, hey, I'm spending way more on my open source model deployment than I am with some API; it's because there's work to be done to do it efficiently. We'll be investing to lower those barriers over time, but today it's still hard for many companies.
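As a minimal sketch of how low those barriers are for a proof of concept, the snippet below loads an open Llama 2 chat model from the Hugging Face Hub and generates locally with the transformers library. The model ID is illustrative; access to Llama 2 weights requires accepting Meta's license on the Hub, and device_map="auto" assumes the accelerate package is installed.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"   # illustrative; gated behind a license on the Hub
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer(
    "Summarize why data lineage matters for open models.",
    return_tensors="pt",
).to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```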
Lukas [00:12:49] Awesome. Well, this was super fun. Really appreciate it.
Joe [00:12:53] This was fun. And I think I owe you some Scotch, so let's get together and have a glass.
Lukas [00:12:59] Let's do it. Yeah, sounds great.