How Rosanne is working to democratize AI research and improve diversity and fairness in the field through starting a non-profit after being a founding member of Uber AI Labs, doing lots of amazing research, and publishing papers at top conferences.
Rosanne is a machine learning researcher, and co-founder of ML Collective, a nonprofit organization for open collaboration and mentorship. Before that, she was a founding member of Uber AI. She has published research at NeurIPS, ICLR, ICML, Science, and other top venues. While at school she used neural networks to help discover novel materials and to optimize fuel efficiency in hybrid vehicles.
ML Collective: http://mlcollective.org/
Controlling Text Generation with Plug and Play Language Models: https://eng.uber.com/pplm/
LCA: Loss Change Allocation for Neural Network Training: https://eng.uber.com/research/lca-loss-change-allocation-for-neural-network-training/
Topics covered
0:00 Sneak peek, Intro
1:53 The origin of ML Collective
5:31 Why a non-profit and who is MLC for?
14:30 LCA, Loss Change Allocation
18:20 Running an org, research vs admin work
20:10 Advice for people trying to get published
24:15 on reading papers and Intrinsic Dimension paper
36:25 NeurIPS - Open Collaboration
40:20 What is your reward function?
44:44 Underrated aspect of ML
47:22 How to get involved with MLC
Rosanne Liu:
If ML Collective is about one thing, it's about open collaboration. We want people to think that science can be associated with employment. You join a job, you do science, but science can also just be your thing, your gig. You're an artist, you can join a studio to become a senior artist in that studio, but you can also just do art on the side.
Rosanne Liu:
And science is at the same time, a collective effort. You need collaborators, you need people to work together with you. For that to happen, if you're taking science as a gig, then you have to be able to work with other people. But then we don't have really a culture there yet. Sort of if you analyze all the papers out there, Google people are working with Google people, CMU people are working with CMU people. Not exactly, but there are clusters, right?
Lukas Biewald:
Welcome to the Gradient Dissent Podcast. I'm here today with Lavanya, and we have as a guest, Rosanne Liu. I'll say, I am super excited to talk to Rosanne, I had heard of her good work at Uber for quite a while, but then she actually came by Weights & Biases to give an open talk about one of her research papers. I was just so impressed with the kind of creativity in the way that she analyzed the neural networks. And we'll get into that talk with her today, but Lavanya, you actually found recent stuff that's even kind of maybe more exciting that she's working on...
Lavanya:
She had founded this amazing organization called ML Collective, and she's trying to democratize AI to anyone. And she's trying to ensure, even if you're not tied to one of these super prestigious institutions, you can still publish really cool research. And that is just such an important thing to work on. I'm super excited to talk to her. Welcome to Gradient Dissent, I want to start with what made you found ML Collective. It's such an important organization, and as someone who cares about diversity and also democratizing AI, I am curious about your journey.
Rosanne Liu:
Thank you, thank you for having me. And thanks for saying that it's an important organization. I don't think we're there yet, we're so new and young that it can really go either way at this point. One wrong decision we make, it can really turn it to be a not so good organization. But yeah, ML Collective is interestingly, it's like a million things in my head right now, because as someone who runs a company, I'm sure Lukas, you have the same feeling.
Rosanne Liu:
It's like there's a narrative always in your head going like, "What is this company I'm trying to do? What is this thing?" And then you little by little add ideas to it, and then at this point it's just so many ideas all combined together, basically representing everything I want to do in my life. One, research: I want it to be a research lab where people just do research together. There should be people better than me, ideally. And there should be people better than me, but maybe less experienced, so I can offer help to them, so I can feel useful. There could be a wide range of people.
Rosanne Liu:
There should be people who like having a home, who like having a lab feeling. There are people out there doing things better by themselves, but we're really trying to attract people that work better with people collectively. It's a research lab, it's also a nonprofit, because I want to do charity, to help people. And I think one dimension of science should be done within nonprofits. I mean, we don't see a lot of science, or in terms of ML, research going on in nonprofits. It's mainly driven by industrial labs because they have all the resources.
Rosanne Liu:
But I'm thinking, if we can set one small example to show great research done through nonprofits, that'll be great. And people can really open their mind when they think about their career choices, there's one more avenue for them to choose. Research lab, non-profit, it's also just like a co-working space where people just come together during their gap year, or when people feel like they just want to dabble in science a little bit, or they're moving out of science, but they still want to get involved in... Want to see what's going on in ML. So it's just that, it's also something very personal, something that sort of changed my life. But we'll get to that maybe later actually.
Lukas Biewald:
Let's go there. Tell us the story behind how you decided to create it.
Rosanne Liu:
Interestingly, it's such a big change in my life, maybe the biggest change ever in my life. And now I just wonder if every business... Every change you see in someone else's life must be propelled by like a misfortune, because that's what happened to me. It was spurred by a misfortune. Basically the whole narrative goes like this, I was looking for a job, I was out of a job in the first place, and I was looking for a job.
Rosanne Liu:
And there's no job really that offers the things I wanted a working environment to have. So we decided to build our own, it's as easy as this. But you can imagine not being able to find a job, just feeling like getting rejected everywhere, must be very heartbreaking. And that's what happened to me, and it felt like everything went wrong during that period of time.
Rosanne Liu:
It's a conceptual change in my mind, that I feel like instead of changing myself... Because what the signal out there is telling me is that I'm not good enough to fit the hiring rubric of those places. Instead of changing myself, thinking I should be more and more like what they want, what I decided to do is just change the hiring system. Have a whole system of my own, where we get to hire people, or recruit people, differently from how they do it.
Lukas Biewald:
How does your organization compare to academic organizations? Because most academic institutions would be nonprofits too, right? What's different about what you're doing?
Rosanne Liu:
They're actually both. Academic institutions can be for-profit or nonprofit, they can be public or private. We are strictly nonprofit, so we are funded by donations basically, when we are funded. We're not funded now. Once we're funded, we will be funded by donations. The difference from academia is that it's not employment-based. Everyone joins us as a member, they can have their own employment, and that really gives people flexibility.
Rosanne Liu:
They can view this more like a hobby, which we found actually motivates people more than when they view this as a job, where they have to report, they have to keep track of how much they're performing. They have to report to a manager and stuff like that. We function very much like academia, because the key people in the organization are sort of like PhDs. So that's like the only way we know how to run a lab.
Rosanne Liu:
We hold research meetings like a lab, everyone gets updates. There's no graduation, that's one difference from academia. It's not like you have to join and then five or seven years later you graduate. It's really flexible. You get paid even less than academia, which is unfortunate. But once we get funded, we can adjust that. But now everything's volunteer.
Lavanya:
What are some of the challenges in building a sustainable nonprofit like the one that you're building? And not just monetary, but other challenges too.
Rosanne Liu:
I haven't made it sustainable yet, I should be asked this question like three years later. How did you make it sustainable? Or why did it fail? But I can see a little glimpse of hope there because we're starting to get donations and get funders. Even without going into a full donation raising, we're already getting interest from big companies and personnel that want to contribute to the organization.
Rosanne Liu:
I feel like it's a different ecosystem, right? Just like the for-profit world, they have their own ecosystem. Startups, they raise donations, sorry, they raise investments. And they promise that there's such a return many years later, they give shares out. In this world, you raise donations because you're selling this concept, and you feel this concept is so important. It's going to have such a social impact that the donors or philanthropists, they really buy into this idea, they want to do something back to the world.
Rosanne Liu:
There's like a whole different ecosystem going on there. Many nonprofits as far as I can see survive in that ecosystem. I feel like we probably can have a shot there. I haven't proved it yet, but that's my idea. Also a lot of donations don't come in monetary form, as you said. A lot of people that are our current members, they're really just donating their time to work with us, right? They're helping others publish papers by donating their expertise and their time. That's actually way more valuable than money.
Rosanne Liu:
And there are also donations of compute credits. AI research is expensive, we need all forms of donations, but when they all come together, I feel like there must be a model that is sustainable. I need to prove that, but I am hopeful.
Lukas Biewald:
What kinds of research are you doing that you think might not happen somewhere else?
Rosanne Liu:
That's actually the best part. As a nonprofit, you're not driven by goals or anything. It's not like you have to prove to someone that I have to make this object detection thing better than 98%, or something. It's really just curiosity driven, you can do anything that interests you. And also the management of ML Collective is not hierarchical, we don't have a central manager of any sort.
Rosanne Liu:
You can start projects any way you want, and you will be the lead of the project as long as you're a member. It's really driven by individual members' interests. So that's the best thing. If you're coming from a physics background, you maybe have something to do with thinking about how physics is related to neural networks, you can do a project on that. You can tell people about it, people who are interested will join the project. Then you form a team of your own, you will be the lead of the team of that project, it puts you through.
Rosanne Liu:
Maybe for the next project you want to join someone else's, because you want to learn something, you want to learn, I don't know, like, how things work in the brain. And join a more, like, neuroscience project where someone else is leading, you're more like a happy follower. That would work out. Your role would be very dynamic. Yeah, so the answer is we're not limited to any specific topics, it's really driven by individual members' interests.
Lukas Biewald:
Could you tell us about some of the things you were working on?
Rosanne Liu:
Yeah, I think most of the things are published, and we put them on the website. Looking at my own profile I feel like I've always been someone in ML, but just dabbled around different topics. It's like maybe I'm not as patient as most scientists, just trying different things. And also neural networks, the whole of machine learning, is changing so fast that I don't think anyone can confidently say that next year, this is going to be the biggest thing. Any breakthrough from any small field could become the biggest thing, if more people spend time in it.
Rosanne Liu:
In the past, I've done vision projects, NLP. I have recently had an LRP project. The very latest project is about continual learning, that's also an interesting concept that I like. We have projects about network pruning, just like whatever is going on out there, we take a look, take the recent paper, look at their code and try to implement it, run it and find faults in it, or find things that we didn't understand and try to understand them.
Lavanya:
I feel like a lot of our listeners might be thinking, this sounds like a great idea, and a great place to get involved in. Do you have in mind who is your ideal person to join ML Collective?
Rosanne Liu:
Yeah, you can think of people as different categories, but of course every single person is always a combination of different qualities. But you can think of people who are really like the leading experts in their subfield. They really want to push that subfield forward, but they're lacking resources in terms of like people. Maybe they are a really senior researcher, but their job doesn't give them reports.
Rosanne Liu:
And for whatever reason they're not managing people yet, but they want to have their ideas executed, they want to influence people. That kind of people can join as sort of a thought leader, they can lead a project and other people can join, sort of work out research projects that way. And there are people who have free time, and want to run code, want to sort of follow all those projects that are led by those experts. Those people are also welcome.
Rosanne Liu:
But we're mostly welcoming people that are not offered the opportunity at the big industrial labs, because those people, if they can get into industrial lab, they probably wouldn't be interested in us anyways. But also they're not the people that we're trying to attract, because all we want to do is to serve as a diversifier from the industrial labs from academia.
Rosanne Liu:
The people that cannot be hired easily by them, you can think of what kind of people they are. They probably don't have a PhD, right? That's one of the biggest reasons that they don't have a resume that looks immediately hireable. They started coding when they were a teenager, but then they never really pursued a higher degree that way. But that doesn't say that they're not a good researcher at all.
Rosanne Liu:
Probably people who are changing fields, they have a PhD, but in something else. They are trying to get into ML, and it's much, much harder for them to just immediately get a researcher's job in those labs, so we also accept those kinds of people. Basically, we try to serve as a diversifier. Anyone who is having difficulty getting into those places, but still wants to do science as if they were in those places, we work with those people.
Lavanya:
Can I ask another one before?
Lukas Biewald:
Yeah.
Lavanya:
You talked about diversity, which we care about a lot as well. And I feel like every company says they care about diversity, right? But what's one concrete thing that doesn't require a lot of resources that any company can do to get more diverse talent in?
Rosanne Liu:
That's a really good point. There are people in the company who care about diversity, but then when it comes to what their main goal is, because the company is making profits, their main goal is still to make the company run. Diversity becomes the secondary thing or even third thing that they care about. And that's when things break down, because if you're going really just for productivity, for the speed of producing the next paper, then you wouldn't care about diversity, you would just hire the person who can produce a paper the fastest.
Rosanne Liu:
You really need organizations or institutions that put diversity first, as a first-class citizen or whatever, that's our first goal. Nonprofits are sort of like the thing that I'm thinking about, because they are not going after profits, and not going after productivity. They're not trying to submit to every conference because they want to show status, their whole job is to help people, to level the playing field for people. That's the way I'm thinking about it.
Lukas Biewald:
Well, I guess like I would love to hear more about some of the research that you're doing currently. I remember looking at your work on loss change allocation to try to understand what neural networks are doing and I think that was such a cool idea. I wonder if you've had any chance to follow up on that one?
Rosanne Liu:
I remember giving the talk at Weights & Biases, and you asking very great questions. Back then I didn't know that you are running Weights & Biases. It was like that person has great questions, but I didn't know that you are the founder.
Lukas Biewald:
Wow, I'm so touched, I appreciate that!
Rosanne Liu:
I was just impressed by your questions. Yes, that work is LCA, Loss Change Allocation, that was published at NeurIPS 2019, a year ago. It was led by Janice, our resident back then when we were at Uber. Basically the idea is that we can break down the loss change onto each parameter.
Rosanne Liu:
You can clearly visualize and see how much each parameter is contributing to the training of networks. Just in the training sense, we're not talking about validation or generalization yet. And surprisingly you see that half of the parameters hurt the training, probably understandably because everything's stochastic. We add noise into the process, because we use stochastic gradient descent when we use mini batches, and we sort of reduce the optimization to be linear.
Rosanne Liu:
All those are contributing to the noise in the process, but still, the amount of noise in the training process was surprising to us. The whole work was basically like visualizing all those... How many parameters? What percentage of parameters was hurting? And then we broke it down into each layer, and we found that some layers hurt training more than other layers, especially the last layer.
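A minimal sketch of the per-parameter breakdown described above, not the implementation from the LCA paper. It assumes a toy linear regression, so the training loss is quadratic, and allocates each step's change in full training loss to individual parameters as gradient times parameter movement. Evaluating the gradient at the midpoint of each step is a simplification chosen here because it makes the allocations sum exactly to the loss change for a quadratic loss. All names (`full_grad`, `lca_per_step`) and the learning rate and batch size are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear regression: the training loss is quadratic in the parameters.
n_samples, n_params = 256, 50
X = rng.normal(size=(n_samples, n_params))
y = X @ rng.normal(size=n_params) + 0.1 * rng.normal(size=n_samples)

def full_loss(theta):
    """Mean squared error over the whole training set."""
    residual = X @ theta - y
    return 0.5 * np.mean(residual ** 2)

def full_grad(theta):
    """Gradient of the full training loss."""
    return X.T @ (X @ theta - y) / n_samples

def minibatch_grad(theta, idx):
    """Noisy gradient from one mini-batch, used for the actual SGD update."""
    Xb, yb = X[idx], y[idx]
    return Xb.T @ (Xb @ theta - yb) / len(idx)

theta = np.zeros(n_params)
initial_loss = full_loss(theta)
lr, batch_size, n_steps = 0.05, 8, 200  # assumed values, not from the paper
lca_per_step = np.zeros((n_steps, n_params))

for t in range(n_steps):
    idx = rng.choice(n_samples, size=batch_size, replace=False)
    new_theta = theta - lr * minibatch_grad(theta, idx)
    # Allocate this step's change in full training loss to each parameter:
    # (gradient of the full loss) * (how far that parameter moved).
    # The midpoint gradient makes the sum exact for a quadratic loss.
    midpoint = 0.5 * (theta + new_theta)
    lca_per_step[t] = full_grad(midpoint) * (new_theta - theta)
    theta = new_theta

total_lca = lca_per_step.sum()
loss_change = full_loss(theta) - initial_loss

# A positive allocation means the parameter moved the training loss up.
frac_hurting_per_step = (lca_per_step > 0).mean(axis=1)
frac_hurting_overall = (lca_per_step.sum(axis=0) > 0).mean()

print("sum of allocations:", total_lca, "loss change:", loss_change)
print("mean fraction hurting per step:", frac_hurting_per_step.mean())
print("fraction hurting over the whole run:", frac_hurting_overall)
```

Positive entries of `lca_per_step` are the parameters that were "hurting" at that step. The fractions this toy problem prints are not meant to reproduce the numbers reported for real networks.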
Rosanne Liu:
Actually a very easy follow-up work would be... We proposed that the last layer should use a different momentum term, and we did a small experiment there, and saw that it improves. I don't know if anyone training networks from then on has been using a different momentum term for the last layer, but they should.
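As a rough illustration of what "a different momentum term for the last layer" could look like in code, here is a hand-written SGD-with-momentum step where each named layer looks up its own momentum coefficient. The layer names and the values 0.9 and 0.5 are placeholders picked for this sketch; the conversation does not say which values were used in the experiment.

```python
import numpy as np

def sgd_momentum_step(params, grads, velocities, lr, momentum_by_layer):
    """One SGD step where each named layer has its own momentum coefficient."""
    for name in params:
        mu = momentum_by_layer[name]
        velocities[name] = mu * velocities[name] - lr * grads[name]
        params[name] = params[name] + velocities[name]
    return params, velocities

# Hypothetical two-layer model; only the parameter shapes matter here.
params = {"hidden": np.zeros(3), "last": np.zeros(2)}
velocities = {name: np.zeros_like(p) for name, p in params.items()}
momentum_by_layer = {"hidden": 0.9, "last": 0.5}  # placeholder values

for _ in range(2):
    # Constant dummy gradients so the effect of momentum is easy to follow.
    grads = {name: np.ones_like(p) for name, p in params.items()}
    params, velocities = sgd_momentum_step(
        params, grads, velocities, lr=0.1, momentum_by_layer=momentum_by_layer
    )

print(params["hidden"], params["last"])
```

With identical gradients, the layer with the smaller momentum coefficient accumulates less velocity and therefore moves less over the same number of steps.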
Lukas Biewald:
And is this basically in every single step, or is this over a larger period that you see half of the parameters hurting?
Rosanne Liu:
At any given time, there's over half of the parameters that are hurting, and then across the whole training, half of the parameters hurt overall. That's if you accumulate all the contributions together, which, when you add them together, is the exact change in training loss from the beginning of training to the end of training.
Rosanne Liu:
At any moment there's half of the parameters hurting, and then throughout the whole training it's also over half. And also if you track one single parameter, the thing is that it hurts half of the time. It's not really like we can catch this criminal, and then just ban it from making changes to the loss, because they also jump around between the hurting camp and the happy camp.
Lukas Biewald:
I mean, it doesn't seem surprising that some of... And by hurting you mean the parameters they change in a way that makes the loss worse, right? Do I have that right?
Rosanne Liu:
Yes, makes the loss go higher.
Lukas Biewald:
And so it's funny, it makes sense that in a stochastic process, some would be making it worse, but it seems so surprising that half, right? Because overall the loss does improve over the steps.
Rosanne Liu:
Yeah, exactly.
Lukas Biewald:
What's going on there?
Rosanne Liu:
Things that we didn't understand, like many things in neural networks that we didn't... We sort of get the idea, but then until we see the data, we're like, "Yeah, that makes sense, but still is surprising." Many papers are like that, and those are the papers that I aspire to write. Those that you sort of have intuitive sense that this is something that's going on, but until you see the data, you're still surprised by the amount of it, or the actual extent of it. I don't think you understand?
Lukas Biewald:
That's so cool. I wish you had a chance to follow up on that. Running an organization, do you find yourself... Maybe this is an asking-for-a-friend kind of question, but do you find yourself spending a lot of your time on more kind of administrative tasks, and recruiting, and things like that, than actually doing research?
Rosanne Liu:
Yeah, exactly. I start to doubt whether I'm still a researcher, because every day I look at my time, I'm like spending half a day designing the logo because we need to have a logo. And just like no one is working on it. And then the other day, because we were organizing an event, I'm just spending all my night designing a gather.town layout. I'm like making houses, making rooms, making sure people can go to different rooms and things.
Rosanne Liu:
There are so many administrative things, but that's also one of my goals. Honestly, in the next 10 years I feel like publishing papers wouldn't give you as much value as before, because there are so many people trying to publish papers. What would give me more reward is actually helping people publish papers, and a concrete goal of mine is actually just to end up in the acknowledgement section of people's papers. That's my whole goal.
Rosanne Liu:
I'm not trying to be a co-author anymore, just because I don't know. I don't think it's a field that I still want to... I want to be close to of course publishing, I want to publish as much as I can, but I also want to remind everyone that the publishing scene is going to be very different the next decade, just because you see this huge influx of people coming in and trying to publish papers.
Rosanne Liu:
Almost every idea has been chewed over a thousand times, it's so hard to come up with an idea and then search and find out, "Oh, no one has ever done that." This is impossible. Someone is doing that somewhere, which is to say that researchers right now... ML researchers right now are having a hard time. If I can help them a little bit, help their paper improve or be different, and make the success rate of their paper getting noticed or published slightly higher, then I will be very happy.
Lukas Biewald:
I guess, what general advice would you have then to someone that's trying to get something published? What are the mistakes that you see first time people or outsiders make and what kind of help do you typically give to someone?
Rosanne Liu:
I feel like the thing is that our reward function is delayed, right? We go into ML research, liking it because we saw other people in research maybe like a few years before us, and they gained reward out of that. They published a paper and it was so recognized and they have such fame and recognition and everything. So we want to do the same thing, but the difference is, we live in a delayed timeline. When we get into it, the scene has already changed, but we don't know.
Rosanne Liu:
I really want to remind everyone that if you are getting into ML research now, publishing is very different than before. Before, if you had an accepted paper at, I don't know, ICLR or NeurIPS or CVPR, you're basically, you're there, you can probably get a job that you would want. Get a dream job, get a position of some kind, but not anymore.
Rosanne Liu:
Now I think next, people will be looking at citations. Even if you get a lot of papers published at peer-reviewed conferences, people will look at different metrics now, because there are so many papers getting in, and so many people having their papers getting in. The basic suggestion or advice is that you should try to adjust your reward system to be different from why you came in there, if that makes sense.
Lukas Biewald:
I mean, it just seems like you should just make things even harder for yourself, right? You can't just publish a paper, you have to get citations too. Is that a good summary?
Rosanne Liu:
No, that's why you should be looking at other things. You should be really just looking at love of science. I want to do this for the love of science, I'm not trying to... I do this piece of work and not to... Well, if it gets published, that's a confirmation that is a good science. But the basic thing that's important is that it's a good piece of science, I think that's what I want to say.
Rosanne Liu:
You can do a beautiful work, put it on arXiv, don't worry about whether it gets accepted or not, because there is so much noise in that whole thing, the same as neural network training. There are so many statistics showing that the same paper, not changed, submitted to three conferences, got rejected, rejected, accepted, because it's just random chance. Every time you're just drawing a lottery ticket of some sort.
Rosanne Liu:
Don't care about that, don't care about really this acceptance or not into a conference, really care about the quality of the science you put out there. Because if it's on arXiv, you have your name on it, it's going to... That means something. Change your reward system to really care about the true quality of science, and remind yourself that you are in here for the love of science, not for...
Rosanne Liu:
Of course some people are in here for it too, because it promises a better future, and there's nothing wrong with that. But those will probably lead you a little bit astray from the path, and maybe make you a little bit more miserable than you already are.
Lukas Biewald:
What's the key to doing good science as an outsider? How do you do that?
Rosanne Liu:
That's actually the idea of running ML Collective. I feel like there are so many problems these days in a world where people don't believe in science, right? I'm not saying ML Collective is the way to change that, but I sometimes think if you can get everyone, not even everyone, like the majority of Americans, to publish one paper in their life, maybe they'll just believe in science more.
Rosanne Liu:
Once they go through that publication process, they see like, "Oh, to put this statement out, I need to try everything around it, do ablation study, compare with all the benchmarks." They will become more careful when they put statements out. I don't know, this is a weird argument I'm making, but I feel like if I can get more people to do science, not for life, just like publish one paper in their life. I think everyone's attitude towards science will be better, they will believe it more. We probably wouldn't have all those problems out there in America that people don't believe in science or other things. I don't know, that's my dream, of course.
Lavanya:
That's a great idea. Also I want to address the other end of the spectrum which is, all of these people who are trying to keep up with all of the papers that are coming out. And maybe you can use this opportunity to talk about this amazing paper reading group that you've been doing for like three years now. What's your advice for people who want to keep up, and what kinds of papers should they take, and how should they go about reading them?
Rosanne Liu:
There's no better way actually, because I think this is like our first time facing this problem, so there are no historic lessons that we can learn from, that there's this huge influx of papers. For now, I still trust those that are published at peer-reviewed conferences. We know that there's a lot of noise in there, but I trust these slightly more than papers that are just put on arXiv.
Rosanne Liu:
I sort of have like a general sense, there are many people like me out there running paper clubs or YouTube channels. They dissect papers, each of them of course has their own criteria in judging papers. But if you accumulate more of them, like average them out... I think it's representative of the overall quality. I think, like a shameless plug, I think by now, I'm a good discriminator in all the sub-fields of ML.
Rosanne Liu:
By being a good discriminator, I mean, I can sort of judge what's a good paper, what is bad. I might not be a good generator in all those sub-fields that I never published in. But being a good discriminator is the first step of feeling like I can run those things. You can sort of trust the papers they selected, but then you have to remember to accumulate it with all the other people's selections together, for a more balanced view.
Lukas Biewald:
Can you give us a little window into your process for being a good discriminator of high quality papers?
Rosanne Liu:
I just read a lot. Some basic elements, I feel like a lot of papers are missing. Maybe they are from people that are coming into ML research from different fields, or from a non-research background, which is again why I feel like ML Collective is important. Get people into this paper publishing process, and tell them what are the basic things we have to do: compare with baselines out there.
Rosanne Liu:
Try different variations of the method that you're proposing, and that's an ablation study. I see so many papers out there that have a huge diagram, right? Signal goes in, and then there are so many branches of things, and it branches out. And then this is the end result, and they say that this whole system works much better than existing systems, but that's not science, that's good engineering. Great that you made it work, but what does it teach us? Right? Is this branch more important than this branch? Why did you branch out this way rather than that way?
Rosanne Liu:
A real good science work should be, I think, inspirational rather than intimidating. That huge diagram is just intimidating. I built this huge thing, it worked, I'm not going to tell you how, because I hacked them together and it worked. Maybe there's scientific value in it, but to be a good scientific article, you have to tell us what things you have tried. Why this branch and that branch? Did you do an ablation study? Did you try to turn off this branch? What was the thought process behind it? How does your work inspire other work, maybe in different fields, to borrow the same thought process to produce their science in their subfield?
Lukas Biewald:
Do you have a favorite paper over the last few years that kind of exemplifies this of simple difference and then a clear insight?
Rosanne Liu:
Yeah, there are many amazing papers out there. Am I allowed to say my own?
Lukas Biewald:
Absolutely, absolutely, yeah tell us, which is the paper that you're most proud of?
Rosanne Liu:
Actually I really like an early paper of ours called Intrinsic Dimension. It's many years ago, not many, many, but in machine learning it feels like many years ago. It was published in NeurIPS, sorry, ICLR 2018. Intrinsic dimension basically means you take all the parameters.
Lukas Biewald:
2018? That's like two years ago!
Rosanne Liu:
I know, but it feels like forever, right?
Lukas Biewald:
That's amazing.
Rosanne Liu:
It's now like this, when you look at papers you're like, this is from 2018, probably there are better works than this by now, probably I shouldn't be reading this paper at all, but yeah, it's only... What is that? I think that's only two years ago. That paper has to do with measuring this basic property of a neural network. A neural network has so many things associated with it.
Rosanne Liu:
There are parameters, there are large parameter counts. If you imagine putting every parameter just together into a big vector, it's just a super long, long vector. And then you reduce it to a shorter vector and you only train the shorter vector. And how do you map from the shorter to the bigger? It's just through a matrix, it's a linear mapping back to the big vector.
Rosanne Liu:
Basically you're saying that even though this network has 10 million parameters, maybe the number of dimensions in which you can make changes is much smaller than that big number. And there's a number out there that's much smaller that says something about your network combined with your problem, combined with your data. That's how easy or hard this network combined with data and problem is. So that's-
Lukas Biewald:
Wait, sorry, you can actually do that kind of... Because that's going to be a lossy compression, right? You can actually do that, make it much smaller without hurting the performance?
Rosanne Liu:
Mm-hmm (affirmative). Well, I think now it becomes not surprising, because now you can prune. Pruning is like an axis-aligned reduction, right? You reduce the big vector to a smaller one by basically masking some of the entries as zero. But back then we were just doing a linear projection. You can totally do it because a lot of parameters in your neural networks are redundant. Not that they're not useful, well, LCA also teaches us that.
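Editor's note: one way to read the "axis-aligned" remark is that pruning corresponds to a map whose columns are one-hot vectors, while the intrinsic dimension setup uses a dense random map. The sketch below contrasts the two under that reading. The sizes and the variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
D, d = 8, 3                                  # tiny sizes so the vectors are easy to inspect

# Pruning-style map: each column is one-hot, so each short coordinate
# controls exactly one surviving full parameter and the rest stay zero.
kept = np.sort(rng.choice(D, size=d, replace=False))
P_prune = np.zeros((D, d))
P_prune[kept, np.arange(d)] = 1.0

# Random linear map: every short coordinate moves every full parameter a little.
P_random = rng.normal(size=(D, d))

z = rng.normal(size=d)
theta_pruned = P_prune @ z
theta_random = P_random @ z

print(theta_pruned)   # non-zero only at the kept indices
print(theta_random)   # dense
```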
Rosanne Liu:
Not that they're not useful. It's just that they provide a better or different loss landscape for you to train in, but you can definitely train within a much smaller landscape. Well, think about this huge landscape that all the parameters help construct, leading to an end point where there's better loss. If you can draw a line from the starting point to the end point, that's just one dimension. If you can just travel along that line, that's an intrinsic dimension of one.
Rosanne Liu:
Any network would have a dimension of one that is trainable, but that one is very hard to find. That's almost like a very singular case. The intrinsic dimension is saying, with this number of dimensions, however you draw the line or the plane, it should still lead you to a good enough solution.
Lukas Biewald:
Wait, but how could you take a... Oh, because you can pick the linear function that goes from your sort of simplified representation to the more complicated representation.
Rosanne Liu:
Yes. The thing is, if you were allowed to pick the linear function, you can reduce the dimension to whatever you want, all the way down to one. But that's not what we want to measure, because that's just one for every network. What does that tell us? The thing is, we want to make the projection matrix randomized, and then we measure how big the dimension has to be. Because you know that in a very, very lucky scenario, this can be down to one. With that knowledge, you should know that by just randomizing, there should be a number that's larger than one, but smaller than the super big vector you start with.
Lukas Biewald:
I see. And so how much smaller can you go? And is it like suddenly there's a drop-off at a certain size, or is it sort of a smooth deterioration of performance?
Rosanne Liu:
Back then, basically you have to try every number. That's more of a science investigation, it's not something that can help you train the network faster. Because basically you have to try every number from big to small, small to big until it crosses the threshold. Whatever threshold you want to be.
Rosanne Liu:
We'd pick a threshold that is 90% of the full network performance, or you can do 99, you can do 85. It's up to you. Pick that number, and that number is interesting. I want to make sure that I remember the numbers correctly, and I probably won't... But for MNIST plus an FC network, I think it's 750, which is much lower than 784, which is just the input dimension of MNIST digits. That makes sense because there are many black pixels in MNIST images, but that number is very interesting. And then for CIFAR, I think it's like 19,000. That sort of gives you a sense of how CIFAR is harder than MNIST, but how much harder? Probably 10 times harder.
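Editor's note: the measurement procedure described here (try subspace sizes until performance crosses a chosen fraction of the full-network baseline) can be sketched on a toy problem. Everything below is an assumption for illustration: the "network" is a linear least-squares problem that can be solved in closed form, the score is the fraction of the initial loss removed, and the function names are made up. It is not the paper's experimental setup.

```python
import numpy as np

def subspace_performance(d, A, b, theta0, rng):
    """Best score reachable when only a random d-dimensional subspace is trainable."""
    D = theta0.size
    P = rng.normal(size=(D, d))                       # fresh random projection
    r0 = b - A @ theta0                               # residual at initialization
    z, *_ = np.linalg.lstsq(A @ P, r0, rcond=None)    # optimal short vector, closed form
    r = r0 - A @ P @ z
    return 1.0 - (r @ r) / (r0 @ r0)                  # 0 = no progress, 1 = loss fully removed

def measure_intrinsic_dim(A, b, theta0, threshold=0.9, seed=0):
    """Smallest d whose score reaches `threshold` times the full-space baseline."""
    rng = np.random.default_rng(seed)
    baseline = 1.0   # with fewer constraints than parameters, full training reaches zero loss
    for d in range(1, theta0.size + 1):
        if subspace_performance(d, A, b, theta0, rng) >= threshold * baseline:
            return d
    return theta0.size

rng = np.random.default_rng(1)
m, D = 50, 200                  # 50 constraints, 200 "native" parameters
A = rng.normal(size=(m, D))
b = rng.normal(size=m)
theta0 = np.zeros(D)

d_int90 = measure_intrinsic_dim(A, b, theta0)
print(d_int90)
```

In this toy the answer is bounded by the number of constraints `m` rather than by the parameter count `D`, which mirrors the point that the number reflects the problem and not just the size of the network.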
Lukas Biewald:
But this is also-
Rosanne Liu:
Sorry. 19,000 is probably the-
Lukas Biewald:
Sorry, you're modifying the network or are you modifying the input?
Rosanne Liu:
You're modifying the training procedures. Once you pick a network, you pick a task, data is there, network is there, initialization is there, then your loss landscape is fixed. Now you're modifying the training procedure to let the point move, not in any direction, but in a restricted plane. You can think of it that way. So you're modifying the training procedure.
Lukas Biewald:
The training procedure means you're, like, first modifying the input data, sort of shrinking it before you put it into the network? I see, you're only allowed to change this smaller set of numbers, and that changes the network through a linear transformation that changes the parameters. How can you say, like, MNIST and CIFAR? Wouldn't it also matter which network was being used? Wouldn't a bigger network maybe have a different...
Rosanne Liu:
Mm-hmm (affirmative), exactly, that's very true. But what we found, interestingly, is that at least at the scale of our experiments, for MNIST plus an FC network, a fully connected network, you can make the network bigger, wider, and the number's roughly the same. 750 was the number we got from the MNIST plus FC type of network. Of course, if you make it huge, probably the number would change.
Rosanne Liu:
But to the extent that we varied the size, the numbers were sort of stable, which gives us confidence that it is a stable measure. But then if you change to convolution, the number drastically reduces, as you can imagine, because convolution gives you a much better landscape. It gives you a much lower intrinsic dimension. It's the same story when we switched to CIFAR, when we switched to other tasks. You can also do RL with it, that's the interesting part.
Rosanne Liu:
You can finally compare RL tasks with computer vision tasks, which people never really do, because people doing RL sort of know that, I don't know, Pong is harder than some other games; I don't really do RL, so I don't know. And people doing vision know that MNIST is easy, CIFAR is harder, and ImageNet, sorry, is much harder, but then they don't make this kind of cross comparison.
Rosanne Liu:
But now we can. Of course, it's not a very strict comparison because they are using different networks, but we find that some games are much easier than you thought. There's the CartPole game, which has only a dimension of four, because probably you just need to move in four dimensions.
Lukas Biewald:
Even though, what are the inputs into the CartPole game? There's not that many inputs, right? It's just the angle of the pixels. Oh, I see, from the pixels you can see the-
Rosanne Liu:
Yeah, they're extra pixels.
Lukas Biewald:
Oh, it's so cool.
Rosanne Liu:
Yeah, it's an old paper from three years ago, two and a half years ago.
Lavanya:
A decade ago, it sounds like, the way you talk about how much time has passed. Is there a practical application? Could people use this to maybe take the networks and deploy them on mobile phones and other applications like that?
Rosanne Liu:
Yeah, it's a very interesting question. Back then we sort of claimed in the paper that it is a scientific investigation, but there are some implications for reduction, because the whole matrix is randomized. So you can just save one random seed to regenerate that whole matrix. And then you train in such a small dimension, so the whole memory usage is much slower, sorry, much smaller.
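Editor's note: the storage argument can be sketched as follows. Because the projection is random, a checkpoint only needs the seed, the two sizes, and the short trained vector; the large matrix is regenerated on demand. The `rebuild` helper, the checkpoint layout, and the choice to derive the initialization from the same seed are assumptions made for this illustration.

```python
import numpy as np

def rebuild(seed, D, d):
    """Regenerate the frozen initialization and projection from a single seed."""
    rng = np.random.default_rng(seed)
    theta0 = rng.normal(size=D)
    P = rng.normal(size=(D, d)) / np.sqrt(D)
    return theta0, P

D, d, seed = 10_000, 50, 1234
theta0, P = rebuild(seed, D, d)

z = np.random.default_rng(0).normal(size=d)   # stand-in for a trained short vector
theta = theta0 + P @ z                        # the full parameters actually used

# Everything that has to be saved: three integers and d floats.
checkpoint = {"seed": seed, "D": D, "d": d, "z": z}
stored_numbers = checkpoint["z"].size + 3

# Later, or on another machine: regenerate and reconstruct.
theta0_again, P_again = rebuild(checkpoint["seed"], checkpoint["D"], checkpoint["d"])
theta_restored = theta0_again + P_again @ checkpoint["z"]

print(stored_numbers, "numbers stored instead of", D)
```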
Rosanne Liu:
But actually, speaking of this year's NeurIPS, which is coming up next week, there's a paper published there that actually took the idea that we had two years ago, a long time ago. They actually make it more useful, I think, with their method. They add a few tricks to the algorithm. It's no longer measuring this intrinsic property of a network anymore, but it becomes a better training method, where they're able to train better networks in such a subspace, or train faster, or with all those memory savings. Check out that paper in this year's NeurIPS.
Lukas Biewald:
Okay, cool. We should put in the show notes both of these papers.
Rosanne Liu:
Random subspace training, something like that.
Lukas Biewald:
And I guess you're also doing something in NeurIPS this year on open collaboration. Is that right? Could you say a little bit about what you're trying to do there?
Rosanne Liu:
Yeah, that's the whole thing with ML Collective, right? If ML Collective is about one thing, it's about open collaboration. We want people to think that science can be associated with employment. You join a job, you do science, but science can also just be your thing, your gig. You're an artist, you can join a studio to become a senior artist in that studio, but you can also just do art on the side.
Rosanne Liu:
And science is, at the same time, a collective effort. You need collaborators, you need people to work together with you. For that to happen, if you're taking science as a gig, then you have to be able to work with other people. But then we don't really have a culture there yet. Sort of, if you analyze all the papers out there, Google people are working with Google people, CMU people are working with CMU people.
Rosanne Liu:
Not exactly, but there are clusters, right? And each of us sort of has our own comfort zone of collaborators. We sort of rarely go out of the comfort circle, because it's... With any new people, there's a friction of working together. We don't do that too often, and that creates a problem because we're sort of isolated, and new people find it really hard to join all those circles.
Rosanne Liu:
At least as a new person back then, I found it really hard to just find someone and become their collaborator, because there's not a culture like that. The whole thing with ML Collective is that we have members coming from all different kinds of employers. They work elsewhere, but they're willing to share their work within ML Collective. They feel this is a safe space where you can share your work, get feedback, maybe become co-workers with people you never would have, because you work in different teams, different institutions. So that's the whole idea.
Rosanne Liu:
There are many people who sort of are carrying this culture around, that's why we invited all those great people to the social, to talk about how they have done that. People that are holding office hours, actively reaching out to people, trying to mentor people in their spare time. There are companies that run science entirely in an open way: they broadcast their meetings, they put everything out there on GitHub, even the data. Every day's commits are out there.
Rosanne Liu:
There are many open cultures out there, so we want to gather people to sort of discuss the pros and cons of this method. Of course science is produced much more slowly this way, because you have to uncomfortably work with people that are not familiar to you, but it really improves the overall well-being of the society.
Lukas Biewald:
It might be faster if you make more connections in the sort of global brain, I could imagine that it leads to more interesting science.
Rosanne Liu:
I don't know, but then there must be a reason that people are not doing that a lot. I feel like it must be slower, in that there's always this friction period where you are getting to know each other, what each other's work style is like. I don't know, I feel like either people have tried that, or it must all be out of this fear of how hard it would be to work with other people.
Lukas Biewald:
Well, I would think people might just feel shy, it's hard to go meet a stranger. Do you do anything to sort of facilitate, just getting people talking to each other?
Rosanne Liu:
Yeah, for now we started this organization where people just join from anywhere. I'm sort of like the hub, I know everyone, but they don't know each other to start with. But then if you do biweekly meetings like we do, we talk about research every time. And then two of you can be commenting on the same graph and then you realize, "Oh, we're thinking the same way. We're like-minded people, we should talk more." Then they can talk offline, then I'm done. I'm like a matchmaker, sort of linking people together. I'm very happy if two people that didn't know each other now work together. I feel like my satisfaction comes from that.
Lavanya:
It's fascinating to me that you're taking the credentialing aspect out of these research labs and almost replacing them with collaboration. Is that the reward function or is there a definable function now?
Rosanne Liu:
Yeah, the reward function for me is definitely just... If I can reduce the whole thing ML Collective does to one metric, that will be the number of new collaborations formed. That will be my reward function. That's the reward function of MLC, but my personal reward function, as I said, is how many papers have my name in the acknowledgements. That will be my near-term reward function. I'll be very happy if people thank me in their acknowledgements.
Lukas Biewald:
That's great.
Lavanya:
What about for the researchers in MLC? What's their reward function, do you think? And how is that different from those people who are working at traditional labs?
Rosanne Liu:
Curiosity-driven, that's one. We're not goal-driven, we're not trying to beat any benchmarks. We're allowed to do that, I mean, other labs probably also have some elements of that, but I don't know, each lab has its own culture. Some are more open, some are more goal-driven, trying to make sure... The whole thing that people brag about on Twitter is like, "I have 28 papers in this conference," so we would not be saying that, because we can never reach there.
Rosanne Liu:
But also that's not our goal. It's not to get this many papers in a conference, it's more like we can have scientific discovery purely driven by curiosity, like the intrinsic dimension paper. It was just us thinking, hmm, everyone trains a neural network this way, with a big vector. Can we train with a small vector? There's no reason why we have to do that, but we just thought about it, and there must be a way, right?
Rosanne Liu:
Because, thinking about it, if you can draw a line, there is a dimension of one out there, but how hard is it to find that dimension? How hard is it to find that even with random initialization? Again, I don't control the directions of research in ML Collective, but if I were to control it, I would encourage everyone to see it as a fun thing.
Rosanne Liu:
ML researchers these days, they're so miserable. I mean, I was one of them, so I know that. Every day they're like, "Oh, this conference is coming up, I'm not submitting, I feel so bad, I'm such a failure." Really, I just want to make this a fun thing, a gig they're doing. They get to meet new people, they get to work with people different from them, better than them in some ways. They get to feel like they're helpful in others' projects.
Lukas Biewald:
I think it might be eye-opening for people listening to this that someone as successful and credentialed as you could feel like a failure. I feel like it's an occupational hazard of the field, but I really do think most people listening to or watching this will be surprised to know that.
Rosanne Liu:
Oh, exactly, so much. I didn't realize it, because it's not so miserable that I'm just crying every day, but it's this, like, mild level of depression, which is the worst. Because every time you're confronted with it, you're like, "I feel bad, but should I be feeling bad? I'm having this amazing job, and I get to do science in the industry, getting paid reasonably well." You sort of counter the bad feeling yourself, and that makes things worse.
Rosanne Liu:
From the outside, everything's glamorous, I get to publish every now and then, but yeah, I was miserable. And one key thing that changed my mindset is realizing that I was viewing everyone outside of my team as competitors, and I was just miserable because I felt I had to compete with them. And if they're publishing 28 papers and I'm publishing zero, I'm losing. But now, by running MLC, I see them as collaborators, or potential collaborators.
Rosanne Liu:
People are out there, and if we have the same ideas at the same time, the past me would be like, "No, I'm scooped," but now I'd be like, "Great, that means that's a great idea." You can be my potential collaborator, I can talk to you, and you can join MLC, and help me, and help others. It's really a mindset change, at least for me, or maybe it's just because I'm not getting paid right now.
Rosanne Liu:
If you let people do something and then not pay them, they start to think that this thing must be noble, because I'm doing it and I'm not getting paid. I don't know which aspect is the one that changed my mindset from that to this. But yeah, there are many things that have changed.
Lukas Biewald:
I have to say I really admire you creating the world that you want to see. I think that's super admirable and impressive.
Rosanne Liu:
Thank you.
Lukas Biewald:
We always end with two questions, and I want to make sure we have time to do that. The second-to-last question that we always ask is, what is something in the ML field that you feel doesn't get as much attention as it should?
Rosanne Liu:
That's a good question. I would say understanding of things. I think the field of ML research publishing would become healthier if we started to see a wave of papers that just go: I picked this little concept, batchnorm or dropout, and I studied it so extensively that I wrote an eight-page paper out of it. I tried everything I can, with it, without it, in this network, in that network. And the end result is, we didn't find anything amazing, but we understand this concept 1% more, because that's science.
Rosanne Liu:
I want to see a wave of papers written this way, instead of "we beat this benchmark by some percent more," because that's very rare. Just trying to go for a deeper understanding of one small concept, say, like, why it helps and... There are so many things we don't understand in the way that we train neural networks. And of course, when you say understanding, people have different comfort levels in terms of understanding. I can see there are people out there having more of a hacker's attitude. They would think they understand something if they watch a five-minute video of it, right?
Rosanne Liu:
There are more humble, conservative attitudes. I would say more of my scientist peers have that, as in: unless I've published a lead-author paper on this subject, I can't say I understand it. Even if I publish one, I still can't say I understand it. There are different levels of things, but I hope people are going for a better understanding of things rather than for benchmarks.
Lavanya:
I guess the last question is, what's the biggest challenge of publishing a paper independently when you're not living in a big lab?
Rosanne Liu:
There's so much of it: the lack of resources, the lack of support, the lack of people just telling you this is a good idea or a bad idea. Lack of discriminators, right? When you're publishing a paper, you are the generator of the paper. Secondly, lack of discriminators. Think about GAN training: without a discriminator, you really can't train a good GAN. All those things.
Rosanne Liu:
That's why we want to recreate this great graduate school lab experience for everyone. You don't have to join a graduate school lab, you don't have to join a big industry lab to have the same experience, like mentors or collaborators, peers. People who just say, "Awesome on your plots, or you should add one more line to that plot to make it more awesome." Stuff like that, right? People you can bounce your ideas off of, yeah. All those little things... Of course, we know how hard it is for individual researchers to thrive out there. If ML Collective can help them a little bit, I'll be very happy.
Lukas Biewald:
I think I'll sneak in one final question. If people are listening to ML Collective and feeling inspired, what's like a next step for them to get a little more involved or learn a little bit more?
Rosanne Liu:
As a nonprofit, what we really want is to get this idea out. There's a social impact we want to have, and the idea for us is open collaboration. For people out there, if you're a researcher, if you're an individual researcher, an independent researcher, you can always come to work with us. There are many collaborators here who will be happy to work with you. If you're already a senior researcher, or an established researcher, you should think about this concept actively.
Rosanne Liu:
Every day, with every paper, you think about: did I help someone with this paper? Did I just work with the same crew of collaborators that I've always worked with for the past 10 years? Or did I put someone new on this paper and really help their career? Because having a paper helps so much in someone's career, at least for now.
Rosanne Liu:
Did I try to make the world better with this paper aside from the scientific pursuit? Of course, you are making the world better by just putting a scientific work out there. But did I give other people chances to work in science? Did I help someone underrepresented, or help someone from a non-traditional background get into science through this paper? I want to get people to think about this question actively.
Lukas Biewald:
Awesome. Well, thank you so much. Real pleasure to talk to you, thanks for taking the time. That was a lot of fun.
Rosanne Liu:
Thank you so much, Lavanya and Lukas.
Lukas Biewald:
Thanks for listening to another episode of Gradient Dissent. Doing these interviews is a lot of fun, and it's especially fun for me when I can actually hear from the people that are listening to these episodes. If you wouldn't mind leaving a comment, and telling me what you think, or starting a conversation, that would make me inspired to do more of these episodes. And also if you wouldn't mind liking and subscribing, I'd appreciate that a lot.