
Keras Core: A Backend Agnostic Machine Learning Library

Keras looks to unify PyTorch, JAX, and TensorFlow into a single framework!
Imagine this scenario: you're an experienced machine learning engineer, absorbed in the details of your current project. The model you want to train is a state-of-the-art architecture built specifically in TensorFlow. However, there's a hiccup: your well-established codebase is deeply intertwined with PyTorch.
You are faced with a decision: do you spend hours reworking the model to fit your existing framework, or do you venture into the new territory of a different system?
Consider another possibility: your team might favor PyTorch, while you've spent years building machine learning systems with TensorFlow. In such a situation, you'd look for a common platform, a solution that could unite both TensorFlow and PyTorch users, and maybe even JAX too.
Enter Keras Core, which François Chollet announced on Twitter:

The Keras team is actively working on a solution that promises exactly this capability: a tool that provides seamless interoperability between PyTorch, TensorFlow, JAX, and potentially more frameworks, so that engineers no longer need to be confined to a single one.

History of Keras

Keras was initially released in March 2015 by François Chollet as a user-friendly API for building and training deep learning models. Its design principle was to enable fast experimentation by focusing on user experience, easy extensibility, and modularity. Keras quickly gained popularity because it was simple to understand and use, offering a high-level, Pythonic interface to the complexities of building neural networks.
In its early iterations, Keras was designed to be an interface capable of running on top of several lower-level frameworks such as Theano, TensorFlow, and CNTK, with these lower-level engines referred to as 'backends'. This was a powerful aspect of Keras as it allowed users to experiment with different backends without having to rewrite their models.
However, in 2018, the Keras team strategically decided to focus its development efforts exclusively on TensorFlow. With both Theano and CNTK having discontinued their development and TensorFlow emerging as the predominant computational backend, this shift made sense.
Over time, Keras became so integrated with TensorFlow that it was included as a part of the TensorFlow library, known as tf.keras. This integration allowed developers to leverage the power of TensorFlow while retaining the user-friendliness and modularity of the Keras API.

ML Frameworks Moving Quickly

Fast forward to 2023, and machine learning frameworks have evolved significantly. While TensorFlow is a well-respected library, PyTorch has become a popular choice for ML research. JAX has been increasingly adopted for high-performance machine learning and numerical computing. This diversified landscape has prompted the Keras team to revert to its multi-backend roots by introducing Keras Core.
Keras Core is a complete rewrite of the Keras codebase, designed to run on top of a modular backend architecture. This new version allows Keras workflows to run on top of arbitrary frameworks, starting with TensorFlow, JAX, and PyTorch. This marks a significant step towards the original philosophy of Keras, where the underlying backend could be freely chosen based on user preference or the requirements of a specific project.
Keras Core is a nearly full backward-compatible drop-in replacement, providing a smooth transition experience. You can replace from tensorflow import keras with import keras_core as keras, and your existing code will run with little to no modification.
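To make the switch concrete, here is a minimal sketch (assuming the keras-core package and at least one supported backend are installed); the model itself is a placeholder, included only to show that the familiar API is unchanged:

# Previously: Keras bundled with TensorFlow
# from tensorflow import keras

# Now: the multi-backend Keras Core package
import keras_core as keras

# The familiar high-level API works as before.
model = keras.Sequential([
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(10),
])
model.compile(
    optimizer="adam",
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)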

Performance Benefits

One major advantage of Keras Core is the performance improvement it can bring to your models.
In benchmarks conducted by the Keras team, JAX typically delivered the best training and inference performance on GPU, TPU, and CPU, although results vary from model to model, with non-XLA TensorFlow occasionally coming out ahead on GPU. Thanks to Keras Core, you can now dynamically select the backend that delivers the best performance for your model without any changes to your code, ensuring that you're always training and serving at the highest achievable efficiency.
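In practice, backend selection in Keras Core is driven by the KERAS_BACKEND environment variable (or the Keras config file), set before the library is imported. A minimal sketch, with the model itself being a placeholder:

import os

# Choose "tensorflow", "jax", or "torch" before importing Keras Core.
os.environ["KERAS_BACKEND"] = "jax"

import keras_core as keras

# The exact same model definition now runs on the selected backend.
model = keras.Sequential([keras.layers.Dense(1)])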

Code Modularity

Moreover, Keras Core allows you to maximize the available ecosystem surface for your models. Any Keras Core model can be instantiated as a PyTorch Module, exported as a TensorFlow SavedModel, or instantiated as a stateless JAX function.
This flexibility means that your Keras Core models can be used with PyTorch ecosystem packages, with the full range of TensorFlow deployment and production tools (like TF-Serving, TF.js, and TFLite), and with JAX's large-scale TPU training infrastructure. In essence, you can write one model.py using Keras Core APIs and gain access to everything the machine-learning world has to offer.
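As one example of this, the stateless JAX pathway works by threading the model's variables through the call explicitly. The sketch below follows the pattern described in the Keras Core guides; the stateless_call signature should be treated as an assumption and checked against the current documentation:

import os
os.environ["KERAS_BACKEND"] = "jax"

import jax.numpy as jnp
import keras_core as keras

model = keras.Sequential([keras.layers.Dense(4, activation="relu"), keras.layers.Dense(1)])
model.build((None, 8))  # create the weights up front

# Pull the model's state out as explicit values.
trainable_vars = model.trainable_variables
non_trainable_vars = model.non_trainable_variables

x = jnp.ones((2, 8))

# stateless_call takes the state as arguments and returns the updated
# non-trainable state, so the computation behaves like a pure function that
# can be composed with JAX transformations such as jax.grad or jax.jit.
y_pred, non_trainable_vars = model.stateless_call(trainable_vars, non_trainable_vars, x)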

Code Compatibility

Another significant advancement that Keras Core brings is seamless integration with native workflows in JAX, PyTorch, and TensorFlow. Unlike the original multi-backend Keras, Keras Core is not intended only for Keras-centric workflows where you define a Keras model, optimizer, loss, and metrics and then call fit()/evaluate()/predict(). It also works with low-level, backend-native workflows: you can take a Keras model (or any other component, such as a loss or metric) and use it in a JAX, TensorFlow, or PyTorch training loop, or as part of a JAX or PyTorch model, with zero friction.
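For instance, a backend-native TensorFlow training step with a Keras Core model might look like the following sketch (assuming the TensorFlow backend is active; the optimizer.apply call mirrors the pattern in the Keras Core custom-training-loop guides and should be verified against the current API):

import os
os.environ["KERAS_BACKEND"] = "tensorflow"

import tensorflow as tf
import keras_core as keras

model = keras.Sequential([keras.layers.Dense(32, activation="relu"), keras.layers.Dense(1)])
optimizer = keras.optimizers.Adam(learning_rate=1e-3)
loss_fn = keras.losses.MeanSquaredError()

x = tf.random.normal((16, 8))
y = tf.random.normal((16, 1))

# A plain TensorFlow training step: no fit() involved.
with tf.GradientTape() as tape:
    y_pred = model(x, training=True)
    loss = loss_fn(y, y_pred)
grads = tape.gradient(loss, model.trainable_weights)
optimizer.apply(grads, model.trainable_weights)  # assumed API per the Keras Core guides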
Furthermore, you can use a Keras layer or model as part of a torch.nn.Module. This means that PyTorch users can leverage Keras models whether or not they use Keras APIs! You can treat a Keras model just like any other PyTorch Module.
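A minimal sketch of that idea, assuming the torch backend is selected (under which, per the announcement, Keras Core layers behave as torch modules whose parameters are visible to model.parameters()):

import os
os.environ["KERAS_BACKEND"] = "torch"

import torch
import keras_core as keras

class Classifier(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # A Keras layer used as a building block inside a torch module.
        self.backbone = keras.layers.Dense(32, activation="relu")
        self.head = torch.nn.Linear(32, 10)

    def forward(self, x):
        return self.head(self.backbone(x))

model = Classifier()
x = torch.randn(8, 64)
logits = model(x)  # the first call builds the Keras layer's weights

# Create the optimizer after the weights exist so they are all registered.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss = logits.sum()  # toy loss, purely for demonstration
loss.backward()
optimizer.step()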

One Framework to Rule Them All?

Keras Core provides the same degree of low-level implementation flexibility in JAX and PyTorch that tf.keras offered in TensorFlow, widening the options available to machine learning engineers. It is a significant development that bridges the gap between machine learning frameworks and promises to change how machine learning development is handled.

How Will the Community Respond?

With the rise of large language models (LLMs) capable of writing code, the dynamics of adopting new libraries may see changes. Since LLMs are trained on existing data, they may inherently be more inclined to generate code based on older, more established versions of libraries. This could slow the uptake of newer libraries or updated versions, as the generated code might not fully exploit the features and optimizations these provide.
As such, it will be interesting to see how new libraries grow, especially as LLMs improve and are trained on more recent data. In addition, we will likely see a shift in how documentation is written, catering to teaching language models as well as humans.