AlphaTensor: DeepMind's AI For Discovering Efficient Matrix Multiplication Algorithms
DeepMind has revealed AlphaTensor, an extension of AlphaZero which searches for more efficient matrix multiplication algorithms by treating the search process like a game.
Created on October 5|Last edited on October 10
Matrix multiplication is a core part of computing: a simple-to-perform yet notoriously difficult-to-optimize operation. Most processing units, especially GPUs and chips built for machine learning, treat efficient matrix multiplication as one of their highest priorities.
DeepMind has today revealed a new machine learning model called AlphaTensor, an extension of their iconic AlphaZero model, which searches for new, optimized matrix multiplication algorithms by treating the search as a reinforcement learning game. While machine learning has been used to find new matrix multiplication algorithms in the past, AlphaTensor pushes to new levels of efficiency and speed.
Finding optimal matrix multiplication algorithms
The standard algorithm for multiplying two 2x2 matrices uses 8 multiplications and was long assumed to be optimal until Volker Strassen discovered a 7-multiplication algorithm in 1969. Strassen's algorithm requires more addition and subtraction steps, but on computers those are cheap compared to the multiplication it saves, and the savings compound when the algorithm is applied recursively to larger matrices.
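Strassen's 2x2 scheme is compact enough to show in full. A minimal sketch (the function name and nested-list representation are illustrative choices, not taken from any particular source):

```python
def strassen_2x2(A, B):
    # A and B are 2x2 matrices given as nested lists.
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    # Strassen's 7 products, versus 8 in the standard algorithm.
    p1 = a * (f - h)
    p2 = (a + b) * h
    p3 = (c + d) * e
    p4 = d * (g - e)
    p5 = (a + d) * (e + h)
    p6 = (b - d) * (g + h)
    p7 = (a - c) * (e + f)
    # Recombine the products into the four entries of C = A @ B.
    return [[p5 + p4 - p2 + p6, p1 + p2],
            [p3 + p4,           p1 + p5 - p3 - p7]]

print(strassen_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```

Applied recursively to block matrices, this is what turns one saved multiplication into an asymptotic speedup (roughly n^2.81 instead of n^3).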
The number of possible algorithms grows explosively as matrix sizes increase, making exhaustive search impractical for humans and even for computers that search systematically. That's where AlphaTensor comes in: by treating the search for new matrix multiplication algorithms like a game, DeepMind has been able to produce many new, previously undiscovered algorithms that are more efficient than the known ones.
Treating algorithm search like a game
Because AlphaZero was designed to play games like chess, go, and shogi, it is built to navigate deeply complex game states; go, for example, has significantly more possible positions than there are atoms in the observable universe. To make things even more impressive, AlphaTensor's game of matrix multiplication algorithm optimization has orders of magnitude more possible states than that, depending on the size of the matrices in question.
AlphaTensor's goal is to find efficient algorithms, and it does that by learning how matrix multiplication works, getting better and better over time until it rediscovers the human-made improvements, and then goes beyond them.
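Concretely, DeepMind's formulation (which they call TensorGame) represents matrix multiplication as a 3D tensor; each move subtracts a rank-1 tensor from the current state, and reaching the all-zero tensor in R moves yields an R-multiplication algorithm. A hedged sketch of that idea, using Strassen's known factors as the moves (the variable names and layout here are my own, not DeepMind's code):

```python
import numpy as np

n = 2
# Matrix multiplication as a 3D tensor: T[a, b, c] = 1 exactly when
# A.flat[a] * B.flat[b] contributes to C.flat[c] in C = A @ B.
T = np.zeros((n * n, n * n, n * n), dtype=int)
for i in range(n):
    for j in range(n):
        for k in range(n):
            T[i * n + k, k * n + j, i * n + j] = 1

# Strassen's 7 products as (u, v, w) rank-1 factors.
# u indexes [a, b, c, d], v indexes [e, f, g, h], w indexes [C11, C12, C21, C22].
strassen_factors = [
    ([1, 0, 0, 0],  [0, 1, 0, -1], [0, 1, 0, 1]),   # p1 = a(f - h)
    ([1, 1, 0, 0],  [0, 0, 0, 1],  [-1, 1, 0, 0]),  # p2 = (a + b)h
    ([0, 0, 1, 1],  [1, 0, 0, 0],  [0, 0, 1, -1]),  # p3 = (c + d)e
    ([0, 0, 0, 1],  [-1, 0, 1, 0], [1, 0, 1, 0]),   # p4 = d(g - e)
    ([1, 0, 0, 1],  [1, 0, 0, 1],  [1, 0, 0, 1]),   # p5 = (a + d)(e + h)
    ([0, 1, 0, -1], [0, 0, 1, 1],  [1, 0, 0, 0]),   # p6 = (b - d)(g + h)
    ([1, 0, -1, 0], [1, 1, 0, 0],  [0, 0, 0, -1]),  # p7 = (a - c)(e + f)
]

# Each "move" subtracts the rank-1 tensor u⊗v⊗w from the state.
state = T.copy()
for u, v, w in strassen_factors:
    state = state - np.einsum('a,b,c->abc', np.array(u), np.array(v), np.array(w))

print((state == 0).all())  # True: the game is won in 7 moves
```

Fewer moves to reach zero means fewer multiplications, which is exactly the quantity AlphaTensor's reward encourages it to minimize.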
One example highlighted was the multiplication of 4x5 and 5x5 matrices, where the standard algorithm takes 100 multiplications and the best human-derived improvement takes 80; the algorithm AlphaTensor discovered drops that to 76 multiplications.
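The 100-multiplication figure for the standard algorithm is simply the product of the three dimensions, since the naive method performs one scalar multiplication per (row, column, inner-index) triple. A small illustrative check (the function name is hypothetical):

```python
def naive_mult_count(m, k, n):
    """Count scalar multiplications in the naive (m x k) @ (k x n) product."""
    count = 0
    for i in range(m):          # rows of A
        for j in range(n):      # columns of B
            for l in range(k):  # inner dimension
                count += 1      # one scalar multiplication A[i][l] * B[l][j]
    return count

print(naive_mult_count(4, 5, 5))  # 100, vs. 80 (previous best) and 76 (AlphaTensor)
```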
DeepMind also adapted AlphaTensor to find algorithms that run faster on specific hardware built for machine learning computation, such as NVIDIA GPUs and Google TPUs.
AlphaTensor shows how machine learning can go beyond human intuition when it comes to discovering things like mathematical algorithms. Further research into applying machine learning in these areas could lead to breakthroughs in computational efficiency.
The idea of using machine learning to find matrix multiplication algorithms is not new, though. One example:
https://www.researchgate.net/publication/341942316_Searching_for_fast_matrix_multiplication_algorithms
Tags: ML News