Community Spotlight: ClimateLearn
ClimateLearn is a public repo trying to democratize ML for weather modeling and climate forecasting. Here's what you need to know.
Created on April 7|Last edited on April 7
This post is a piece in our series highlighting some of our favorite community repos that have instrumented W&B. If you'd like your repo to be featured, please reach out to editor@wandb.com
What is ClimateLearn?
In the words of Aditya Grover, who started the project, ClimateLearn is a "Python library for accessing state-of-the-art climate data and machine learning models in a standardized, straightforward way." It aims to give researchers, weather forecasters, and climate scientists a single place to access publicly available datasets, baselines, models and more, all in the interest of predicting not just what the weather will be in the short term but how our climate will evolve over time.
Why was ClimateLearn built?
Climate is an information-rich field, with terabytes of data created daily and plenty more stretching back decades. As just one example, the ERA5 dataset, hosted on ClimateLearn, reaches back to 1940.
The thing is, many climate models are what's called general circulation models: systems of differential equations grounded in physics. They tend to be computationally expensive and don't improve as much as one might expect when fed new data.
Machine learning models address this exact issue. Since ML models learn from data rather than solving explicit physical equations, they tend to require less compute. And these models are competitive on two particularly notable tasks: weather forecasting and spatial downscaling (essentially zooming in on a map and predicting weather in more granular geographies).
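To make "zooming in on a map" concrete, here is a minimal sketch of bilinear interpolation, a simple non-learned way to turn a coarse grid into a finer one. A learned downscaling model would typically be judged against a baseline like this. The function name `bilinear_upsample` and the toy grid are illustrative assumptions, not part of ClimateLearn's API.

```python
def bilinear_upsample(coarse, factor):
    """Upsample a 2D grid (list of rows) by an integer factor.

    Hypothetical helper for illustration only. Assumes the grid has at
    least 2 rows and 2 columns and that grid points are evenly spaced.
    """
    rows, cols = len(coarse), len(coarse[0])
    out_rows = (rows - 1) * factor + 1
    out_cols = (cols - 1) * factor + 1
    fine = []
    for i in range(out_rows):
        y = i / factor                 # position in coarse-grid coordinates
        y0 = min(int(y), rows - 2)     # upper neighbor row (clamped at the edge)
        ty = y - y0                    # vertical blend weight in [0, 1]
        fine_row = []
        for j in range(out_cols):
            x = j / factor
            x0 = min(int(x), cols - 2)
            tx = x - x0                # horizontal blend weight in [0, 1]
            top = (1 - tx) * coarse[y0][x0] + tx * coarse[y0][x0 + 1]
            bottom = (1 - tx) * coarse[y0 + 1][x0] + tx * coarse[y0 + 1][x0 + 1]
            fine_row.append((1 - ty) * top + ty * bottom)
        fine.append(fine_row)
    return fine


# Toy 2x2 "temperature" grid, upsampled to 3x3.
coarse_grid = [[0.0, 2.0],
               [4.0, 6.0]]
fine_grid = bilinear_upsample(coarse_grid, factor=2)
for row in fine_grid:
    print(row)
# [0.0, 1.0, 2.0]
# [2.0, 3.0, 4.0]
# [4.0, 5.0, 6.0]
```

Interpolation can only smooth between known points. The appeal of an ML approach is that it can learn fine-scale structure from historical high-resolution data instead.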
Another notable thing about climate science is that there simply isn't nearly as much standardization around datasets and benchmarking when compared to tasks like image recognition. There isn't a ready analog for ImageNet in the field. ClimateLearn aims to solve this by providing standard datasets the community can work through together, evaluate in similar ways, and understand which methods are most promising.
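As one illustration of what evaluating "in similar ways" can look like, here is a sketch of a latitude-weighted RMSE, a metric commonly used for gridded weather forecasts. Grid cells near the poles cover less area than cells near the equator, so each row is weighted by the cosine of its latitude. The function name `lat_weighted_rmse` and the toy inputs are assumptions for illustration, not ClimateLearn's actual implementation.

```python
import math


def lat_weighted_rmse(pred, truth, lats_deg):
    """Latitude-weighted RMSE over a 2D grid (rows = latitudes).

    Hypothetical helper for illustration only. `pred` and `truth` are
    lists of rows, and `lats_deg` gives the latitude of each row in degrees.
    """
    # Weight each latitude row by cos(lat), normalized so the mean weight is 1.
    weights = [math.cos(math.radians(lat)) for lat in lats_deg]
    mean_weight = sum(weights) / len(weights)
    weights = [w / mean_weight for w in weights]

    total = 0.0
    count = 0
    for pred_row, truth_row, w in zip(pred, truth, weights):
        for p, t in zip(pred_row, truth_row):
            total += w * (p - t) ** 2
            count += 1
    return math.sqrt(total / count)


# Toy example: the same 1-unit error costs more at the equator than near the pole.
truth = [[0.0], [0.0]]
lats = [0.0, 80.0]
print(lat_weighted_rmse([[1.0], [0.0]], truth, lats))  # error at the equator
print(lat_weighted_rmse([[0.0], [1.0]], truth, lats))  # error at 80 degrees north
```

When everyone reports the same metric on the same data splits, a number from one paper can be compared directly with a number from another.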
Essentially, ClimateLearn is a way to bridge the gap between the old ways of forecasting and newer approaches, all while helping standardize around some of the best historical datasets in the space.
Why ClimateLearn uses Weights & Biases
ClimateLearn is integrated with W&B for the reasons a lot of repos are: it makes reproducing and tracking models a lot easier. It gives practitioners access to fine-grained evaluations and broader metrics, as well as a shared, unified way to understand performance.
But Aditya is also a professor, and W&B makes life in his classroom a little easier. He can understand student success easily, follow along with the progress they've made during the coursework, and actually interact with their experiments, model predictions, and code. A lot of his classwork also involves collaboration, and W&B gives student teams a place to work together, letting them understand what their teammates are doing and break up work so they don't waste cycles on duplicative research.
Put simply: W&B organizes all this climate forecasting in a single, shared system of record that memorializes past experiments and serves as a springboard for future ones.
What's next for ClimateLearn?
As in many professions, it's increasingly important for climate scientists to understand how machine learning can make them more effective. ClimateLearn will help blur the lines between research and software or ML development. It will give the climate community a central place to collaborate on one of the most important scientific disciplines of the 21st century: a place where practitioners can either dig into big issues or bite off smaller pieces of a project and help push the state of the art forward, all while improving the models and datasets already available in the repo.
In the end, ClimateLearn hopes to continue serving its founding purpose: lowering the barrier to entry for climate forecasting, pushing machine learning to the forefront of a science that hasn't yet fully embraced it, and making forecasting more accurate and accessible.
How can you get involved with ClimateLearn?
The easiest way to get involved is to check out ClimateLearn's documentation or run the introductory Colab tutorial. You can also get a better sense of the repo from the announcement blog post. ClimateLearn is an active project, and models are currently being trained, tuned, and improved.