Matrix multiplication is at the heart of many machine learning breakthroughs, and it has now set two new records in quick succession. Last week, DeepMind announced that it had discovered a more efficient way to perform matrix multiplication, breaking a 50-year-old record. This week, two researchers at Johannes Kepler University in Linz, Austria, claimed they have topped that new record by one step.
Matrix multiplication, which involves multiplying two rectangular arrays of numbers, underlies much of speech and image recognition, smartphone image processing, compression, and computer graphics. Graphics processors (GPUs) are particularly good at matrix multiplication because of their massively parallel design: they can break a large matrix problem into many pieces and work on the pieces simultaneously.
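To make the multiplication counts discussed below concrete, here is a minimal sketch of the schoolbook method in plain Python, instrumented to count scalar multiplications. The function and variable names are ours, chosen for illustration.

```python
def matmul_naive(A, B):
    """Multiply two square matrices the schoolbook way; return (C, mult_count)."""
    n = len(A)
    count = 0
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                # Each entry of C is a dot product of a row of A and a column of B.
                C[i][j] += A[i][k] * B[k][j]
                count += 1
    return C, count

# For two 4x4 matrices, the schoolbook method uses n^3 = 64 multiplications.
A = [[1, 2, 3, 4]] * 4
B = [[5, 6, 7, 8]] * 4
C, mults = matmul_naive(A, B)
print(mults)  # 64
```

The triple loop makes the n³ cost visible: it is exactly this multiplication count that Strassen's algorithm and AlphaTensor reduce.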
In 1969, German mathematician Volker Strassen discovered what was then the best-known algorithm for multiplying 4×4 matrices, reducing the number of steps needed to perform matrix calculations. For example, multiplying two 4×4 matrices with the traditional schoolbook method takes 64 multiplications, while Strassen's algorithm can do the same job in 49.
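The 49 comes from applying Strassen's trick recursively: his scheme multiplies 2×2 blocks using 7 products instead of 8, and treating a 4×4 matrix as a 2×2 matrix of 2×2 blocks gives 7 × 7 = 49 scalar multiplications. A sketch of the seven-product scheme for a single 2×2 multiply (function name is ours):

```python
def strassen_2x2(a11, a12, a21, a22, b11, b12, b21, b22):
    """Strassen's seven products for a 2x2 matrix multiply."""
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    # Recombine the seven products into the four entries of C.
    c11 = m1 + m4 - m5 + m7
    c12 = m3 + m5
    c21 = m2 + m4
    c22 = m1 - m2 + m3 + m6
    return c11, c12, c21, c22

# Sanity check against the schoolbook formulas, e.g. c11 = a11*b11 + a12*b21:
print(strassen_2x2(3, 1, 4, 1, 5, 9, 2, 6))  # (17, 33, 22, 42)
```

The extra additions and subtractions are cheap compared with multiplications when the entries are themselves large blocks, which is why the recursive version pays off.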
Using the AlphaTensor neural network, DeepMind found a way to reduce that number to 47 multiplications, and its researchers published a paper about this achievement in the journal Nature last week.
Going from 49 steps to 47 doesn’t seem significant, but when you consider how many trillions of matrix calculations happen on graphics processors daily, even an incremental improvement can lead to significant performance gains, allowing AI applications to run faster on existing hardware.
AlphaTensor is a descendant of AlphaGo (the AI system that defeated a Go world champion in 2017) and AlphaZero, which was trained to play chess and shogi (Japanese chess). DeepMind calls AlphaTensor “the first AI system to discover new, efficient, and provably correct algorithms for fundamental tasks such as matrix multiplication.”
To discover more efficient matrix math algorithms, DeepMind set the problem in the form of a single-player game. The company posted about the process last week:
In this game, the board is a 3D tensor (array of numbers) that captures how far from correct the current algorithm is. Through a set of allowed moves, corresponding to algorithm instructions, the player tries to modify the tensor and zero out its entries. When the player succeeds in doing this, the result is a provably correct matrix multiplication algorithm for any pair of matrices, and its efficiency is measured by the number of steps taken to zero the tensor.
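The tensor view described above can be sketched for the small 2×2 case (the same idea scales up to the 4×4 and 5×5 cases in the paper). The multiplication tensor T has T[i, j, k] = 1 exactly when entry k of the product uses entry i of A times entry j of B; each "move" subtracts one rank-1 term, i.e. one multiplication, and Strassen's seven moves zero the tensor. This is our own illustrative reconstruction, not DeepMind's code:

```python
import numpy as np

# Matrix entries flattened row-major: 0 -> (1,1), 1 -> (1,2), 2 -> (2,1), 3 -> (2,2).
T = np.zeros((4, 4, 4), dtype=int)
for i in range(2):
    for j in range(2):
        for k in range(2):
            # Schoolbook rule c[i][j] += a[i][k] * b[k][j] sets one tensor entry.
            T[2 * i + k, 2 * k + j, 2 * i + j] = 1

# Strassen's decomposition as seven rank-1 terms: row r of U holds the
# A-coefficients of product r, V the B-coefficients, W how product r
# contributes to the entries of C.
U = np.array([[1,0,0,1], [0,0,1,1], [1,0,0,0], [0,0,0,1],
              [1,1,0,0], [-1,0,1,0], [0,1,0,-1]])
V = np.array([[1,0,0,1], [1,0,0,0], [0,1,0,-1], [-1,0,1,0],
              [0,0,0,1], [1,1,0,0], [0,0,1,1]])
W = np.array([[1,0,0,1], [0,0,1,-1], [0,1,0,1], [1,0,1,0],
              [-1,1,0,0], [0,0,0,1], [1,0,0,0]])

# Subtract the seven rank-1 "moves"; the residual hits exactly zero,
# certifying a correct 7-multiplication algorithm.
residual = T - np.einsum('ri,rj,rk->ijk', U, V, W)
print(np.abs(residual).max())  # 0
```

Finding a decomposition with fewer rank-1 terms means an algorithm with fewer multiplications, which is exactly the game AlphaTensor was trained to win.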
DeepMind then trained AlphaTensor with reinforcement learning to play this invented math game, much as AlphaGo learned to play Go, and it gradually improved over time. Ultimately, according to DeepMind, the AI rediscovered the work of Strassen and other human mathematicians, then surpassed them.
In a more complex example, AlphaTensor discovered a new way to perform a 5×5 matrix multiplication in 96 steps (versus 98 for the old method). This week, Manuel Kauers and Jakob Moosbauer published a paper in which they claim to have reduced that number by one step, to 95 multiplications. It is no accident that this record-breaking new algorithm appeared so quickly, since it builds directly on DeepMind's work.
In their article, Kauers and Moosbauer write, “This solution was derived from the DeepMind researchers’ scheme by applying a sequence of transformations that resulted in a scheme from which one multiplication could be eliminated.”
Technological progress never stands still, and with AI now searching for new algorithms, other long-standing mathematical records may soon fall as well.