Matrix Multiplication and the Ingenious Strassen’s Algorithm
How Divide-And-Conquer Comes to the Rescue (again)
Suppose that you have two real n x n matrices A and B, and you want to compute the n x n matrix C = AB that is the product of A and B. By definition, if we denote by a(i,j) the element of matrix A that is in the i-th row and j-th column, and similarly for B and C, we have that the element c(i,j) of C is equal to

c(i,j) = a(i,1) b(1,j) + a(i,2) b(2,j) + … + a(i,n) b(n,j).
It is natural to ask now about the time needed to compute matrix C. By observing the above definition, one can immediately come up with the following straightforward algorithm, which we describe in pseudo-code.
INPUT: n x n matrices A and B
OUTPUT: the n x n product matrix C = AB

for i = 1,...,n:
    for j = 1,...,n:
        c(i,j) = 0;
        for t = 1,...,n:
            c(i,j) = c(i,j) + a(i,t) b(t,j);
return C;
If we assume, for simplicity, that additions, multiplications and memory accesses take constant time, then it is easy to see that the above algorithm performs roughly n³ steps, as we have 3 for-loops, each one nested inside the previous, and each ranging over n values. In other words, its time complexity is Θ(n³).
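For concreteness, here is a direct Python translation of the above pseudo-code. This is a minimal sketch; the function name naive_matmul and the list-of-lists representation are our own choices, and in practice one would rely on an optimized library.

def naive_matmul(A, B):
    """Multiply two n x n matrices, given as lists of lists, in Θ(n³) time."""
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # The inner loop computes the dot product of row i of A
            # and column j of B, exactly as in the definition of c(i,j).
            for t in range(n):
                C[i][j] += A[i][t] * B[t][j]
    return C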
The above discussion already shows that there exist polynomial-time algorithms for computing the product of two matrices. Nevertheless, given that the product of matrices is an operation that occurs very often in practice, one might ask whether there are faster algorithms that compute the product.
To this day, pinpointing the exact time complexity of Matrix Multiplication is one of the major open questions of Theoretical Computer Science.
In this article, we will discuss the first algorithm that broke the O(n³) barrier. Named after its creator, Volker Strassen, Strassen’s algorithm appeared in 1969 and uses the Divide-And-Conquer technique in a very clever way that results in an improved running time.
Strassen’s algorithm
To simplify the exposition, we will assume throughout that n = 2ᵏ for some positive integer k. We note that this is without loss of generality, as we can always add extra all-zero rows and columns (padding) to the matrices A and B so that the value of n becomes an integer power of 2; moreover, this new value of n is within a factor of 2 (i.e., within a constant factor) of the original value of n. In other words, the padding does not cause a significant blow-up in the input size.
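As an illustration of this padding step, here is one possible way to do it in Python; the helper pad_to_power_of_two is our own hypothetical name, not part of Strassen's original description.

def pad_to_power_of_two(A):
    """Pad an n x n matrix with zero rows and columns up to the next power of 2."""
    n = len(A)
    m = 1
    while m < n:
        m *= 2  # smallest power of 2 that is >= n (and hence < 2n)
    return [[A[i][j] if i < n and j < n else 0 for j in range(m)]
            for i in range(m)]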
As a warm-up, we first devise a Divide-And-Conquer algorithm for Matrix Multiplication whose running time is still Θ(n³). The ideas contained in this algorithm will prove useful for presenting Strassen’s algorithm.
So, let’s start by writing the matrices A, B and C as block matrices, as follows.

A = | A₁₁ A₁₂ |,   B = | B₁₁ B₁₂ |,   C = | C₁₁ C₁₂ |
    | A₂₁ A₂₂ |        | B₂₁ B₂₂ |        | C₂₁ C₂₂ |
The dimension of each of the sub-matrices in the above block decomposition is (n/2) x (n/2); this is why assuming that n is a power of 2 is convenient.
It is now easy to see that the following equations hold.

C₁₁ = A₁₁ B₁₁ + A₁₂ B₂₁
C₁₂ = A₁₁ B₁₂ + A₁₂ B₂₂
C₂₁ = A₂₁ B₁₁ + A₂₂ B₂₁
C₂₂ = A₂₁ B₁₂ + A₂₂ B₂₂
The above equations suggest a very simple recursive algorithm. We can recursively call our algorithm to compute all 8 products of matrices of size (n/2) x (n/2) that appear in the four equations, and then, in time O(n²), we can add everything together and compute the final matrix C. The running time T(n) of such a recursive algorithm satisfies
T(n) = 8 T(n/2) + O(n²).
By applying the Master Theorem (here we have a = 8 subproblems of size n/2 each, and log₂ 8 = 3 > 2, so the recursion is dominated by the leaves), we conclude that the running time of this recursive algorithm is T(n) = O(n³). Thus, no progress was made.
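To make the warm-up algorithm concrete, here is a sketch in Python. The helper names add, split and join are our own; we assume n is a power of 2 and matrices are represented as lists of lists.

def add(X, Y):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def split(M):
    """Split a matrix into its four (n/2) x (n/2) blocks."""
    n = len(M) // 2
    return ([row[:n] for row in M[:n]], [row[n:] for row in M[:n]],
            [row[:n] for row in M[n:]], [row[n:] for row in M[n:]])

def join(C11, C12, C21, C22):
    """Reassemble four blocks into a single matrix."""
    top = [r1 + r2 for r1, r2 in zip(C11, C12)]
    bot = [r1 + r2 for r1, r2 in zip(C21, C22)]
    return top + bot

def block_matmul(A, B):
    """Divide-And-Conquer multiplication with 8 recursive calls: still Θ(n³)."""
    if len(A) == 1:
        return [[A[0][0] * B[0][0]]]
    A11, A12, A21, A22 = split(A)
    B11, B12, B21, B22 = split(B)
    # Each block of C needs 2 recursive products, for 8 in total.
    C11 = add(block_matmul(A11, B11), block_matmul(A12, B21))
    C12 = add(block_matmul(A11, B12), block_matmul(A12, B22))
    C21 = add(block_matmul(A21, B11), block_matmul(A22, B21))
    C22 = add(block_matmul(A21, B12), block_matmul(A22, B22))
    return join(C11, C12, C21, C22)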
Having the above as a starting point, Strassen then tried to reduce the number of recursive calls made. And he succeeded! He managed to reduce the number of recursive calls from 8 to 7. Thus, he ended up with a recursive algorithm whose running time S(n) satisfies
S(n) = 7 S(n/2) + O(n²).
The Master Theorem now implies that

S(n) = O(n^(log₂ 7)) = O(n^2.8074).
Strassen’s trick
Strassen made the following observation. He defined the matrices

M₁ = (A₁₁ + A₂₂)(B₁₁ + B₂₂)
M₂ = (A₂₁ + A₂₂) B₁₁
M₃ = A₁₁ (B₁₂ − B₂₂)
M₄ = A₂₂ (B₂₁ − B₁₁)
M₅ = (A₁₁ + A₁₂) B₂₂
M₆ = (A₂₁ − A₁₁)(B₁₁ + B₁₂)
M₇ = (A₁₂ − A₂₂)(B₂₁ + B₂₂)
These matrices might look like they came out of the blue, but some reverse engineering and playing around with the expressions defining the 4 blocks of C, along with Strassen’s ingenuity, led to these definitions. A crucial observation is that each of the matrices M₁, …, M₇ can be computed with exactly one recursive call, as the remaining operations are additions and subtractions of (n/2) x (n/2) matrices. Thus, it already seems that 7 recursive calls might be enough. To finish his proof, Strassen then showed that the following equations hold.

C₁₁ = M₁ + M₄ − M₅ + M₇
C₁₂ = M₃ + M₅
C₂₁ = M₂ + M₄
C₂₂ = M₁ − M₂ + M₃ + M₆
It is easy to verify the above equations; here we will only verify that the second equation is true, and the rest is left as an exercise to the reader. Indeed, we have

M₃ + M₅ = A₁₁ (B₁₂ − B₂₂) + (A₁₁ + A₁₂) B₂₂
        = A₁₁ B₁₂ − A₁₁ B₂₂ + A₁₁ B₂₂ + A₁₂ B₂₂
        = A₁₁ B₁₂ + A₁₂ B₂₂
        = C₁₂.
Note that there is only a constant number of matrix additions and subtractions that we need to do, and so the total number of steps besides the recursive calls is O(n²). Thus, we indeed get that the total running time S(n) satisfies
S(n) = 7 S(n/2) + O(n²).
We are done!
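Putting everything together, here is a sketch of Strassen’s algorithm in Python, reusing the add, split and join helpers from the warm-up sketch above and again assuming that n is a power of 2.

def sub(X, Y):
    """Entrywise difference of two matrices."""
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def strassen(A, B):
    """Multiply two n x n matrices (n a power of 2) in O(n^2.8074) time."""
    if len(A) == 1:
        return [[A[0][0] * B[0][0]]]
    A11, A12, A21, A22 = split(A)
    B11, B12, B21, B22 = split(B)
    # The 7 recursive calls, one per matrix M1, ..., M7.
    M1 = strassen(add(A11, A22), add(B11, B22))
    M2 = strassen(add(A21, A22), B11)
    M3 = strassen(A11, sub(B12, B22))
    M4 = strassen(A22, sub(B21, B11))
    M5 = strassen(add(A11, A12), B22)
    M6 = strassen(sub(A21, A11), add(B11, B12))
    M7 = strassen(sub(A12, A22), add(B21, B22))
    # Combine the M's into the four blocks of C via Strassen's equations.
    C11 = add(sub(add(M1, M4), M5), M7)   # M1 + M4 - M5 + M7
    C12 = add(M3, M5)                     # M3 + M5
    C21 = add(M2, M4)                     # M2 + M4
    C22 = add(sub(add(M1, M3), M2), M6)   # M1 - M2 + M3 + M6
    return join(C11, C12, C21, C22)

As a sanity check, strassen(A, B) should agree with naive_matmul(A, B) on any two matrices whose dimension is a power of 2.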
Conclusion
Strassen’s algorithm was a major breakthrough and was the starting point of a long line of research that is still ongoing to this day. The big open question is whether there exists a Matrix Multiplication algorithm with running time O(n²). Since the input and the output both have size n², Ω(n²) is clearly a lower bound, and one cannot hope to go lower than that. Given the significance of the problem, the exponent of the best (optimal) Matrix Multiplication algorithm is denoted by ω. From the discussion in this article, we already have that
2 ≤ ω ≤ 2.8074.
In fact, the current state-of-the-art algorithm for Matrix Multiplication, due to François Le Gall, shows that
ω < 2.3729.
[Graph: the successive improvements of the upper bound on ω over the years, illustrating the beautiful line of research that Strassen started.]
Only time will tell whether ω is equal to 2, or whether there is a fundamental structural barrier for reaching this desired exponent of 2…
References
- https://en.wikipedia.org/wiki/Matrix_multiplication_algorithm
- https://en.wikipedia.org/wiki/Strassen_algorithm
- https://stanford.edu/~rezab/classes/cme323/S16/notes/Lecture03/cme323_lec3.pdf
- https://en.wikipedia.org/wiki/Master_theorem_(analysis_of_algorithms)
- https://en.wikipedia.org/wiki/Volker_Strassen