Matrix Multiplication and the Ingenious Strassen’s Algorithm

How Divide-And-Conquer Comes to the Rescue (again)

Suppose that you have two real n x n matrices A and B, and you want to compute the n x n matrix C = AB that is the product of A and B. By definition, if we denote by a(i,j) the element of matrix A that is in the i-th row and j-th column, and similarly for B and C, we have that the element c(i,j) of C is equal to

c(i,j) = a(i,1) b(1,j) + a(i,2) b(2,j) + … + a(i,n) b(n,j).

It is natural to ask now about the time needed to compute matrix C. By observing the above definition, one can immediately come up with the following straightforward algorithm, which we describe in pseudo-code.

INPUT: n x n matrices A and B
OUTPUT: the n x n product matrix C = AB

for i = 1,...,n:
    for j = 1,...,n:
        c(i,j) = 0;
        for t = 1,...,n:
            c(i,j) = c(i,j) + a(i,t) b(t,j);
return C;

If we assume, for simplicity, that additions, multiplications, and memory accesses take constant time, then it is easy to see that the above algorithm performs roughly n³ steps, as it consists of three nested for-loops, each ranging over n values. In other words, its time complexity is Θ(n³).
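For concreteness, here is the above pseudo-code as runnable Python (a minimal sketch of our own; the name matmul_naive is not from any library, and matrices are represented as lists of lists):

def matmul_naive(A, B):
    # Multiply two n x n matrices in Theta(n^3) time, mirroring the pseudo-code.
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for t in range(n):
                C[i][j] += A[i][t] * B[t][j]
    return C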

The above discussion already shows that there exist polynomial-time algorithms for computing the product of two matrices. Nevertheless, given that the product of matrices is an operation that occurs very often in practice, one might ask whether there are faster algorithms for computing it.

To this day, pinpointing the exact time complexity of Matrix Multiplication is one of the major open questions of Theoretical Computer Science.
Volker Strassen in 2009. [Source: Wikipedia]

In this article, we will discuss the first algorithm that broke the O(n³) barrier: Strassen’s algorithm. Named after its creator, Volker Strassen, it appeared in 1969 and uses the Divide-And-Conquer technique in a very clever way that results in an improved running time.

Strassen’s algorithm

To simplify the exposition, we will assume throughout that n = 2ᵏ for some positive integer k. We note that this is without loss of generality, as we can always add extra all-zero rows and columns (padding) to the matrices A and B so that the new value of n is an integer power of 2; moreover, this new value of n is within a factor of 2 (i.e., within a constant factor) of the original value of n. In other words, the padding does not cause a significant blow-up in the input size.
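One possible way to implement this padding step in Python is sketched below (the helper name pad_to_power_of_two is ours; matrices are again represented as lists of lists):

def pad_to_power_of_two(A):
    # Embed the n x n matrix A into the top-left corner of an m x m
    # all-zero matrix, where m is the smallest power of 2 with m >= n.
    n = len(A)
    m = 1
    while m < n:
        m *= 2
    padded = [[0] * m for _ in range(m)]
    for i in range(n):
        for j in range(n):
            padded[i][j] = A[i][j]
    return padded

Note that the product of the padded matrices contains the desired product C = AB as its top-left n x n block, since the extra all-zero rows and columns contribute nothing.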

As a warm-up, we first devise a Divide-And-Conquer algorithm for Matrix Multiplication whose running time is still Θ(n³). The ideas contained in this algorithm will prove useful for presenting Strassen’s algorithm.

So, let’s start by writing the matrices A, B, and C as block matrices, as follows.

A = [ A₁₁  A₁₂ ]    B = [ B₁₁  B₁₂ ]    C = [ C₁₁  C₁₂ ]
    [ A₂₁  A₂₂ ]        [ B₂₁  B₂₂ ]        [ C₂₁  C₂₂ ]

The dimension of each of the sub-matrices in the above block decomposition is (n/2) x (n/2); this is why assuming that n is a power of 2 is convenient.

It is now easy to see that the following equations hold.

C₁₁ = A₁₁ B₁₁ + A₁₂ B₂₁
C₁₂ = A₁₁ B₁₂ + A₁₂ B₂₂
C₂₁ = A₂₁ B₁₁ + A₂₂ B₂₁
C₂₂ = A₂₁ B₁₂ + A₂₂ B₂₂

The above equations suggest a very simple recursive algorithm. We can recursively call our algorithm to compute all 8 products of matrices of size (n/2) x (n/2) that appear in the four equations, and then, in time O(n²), we can add everything together and compute the final matrix C. The running time T(n) of such a recursive algorithm satisfies

T(n) = 8 T(n/2) + O(n²).

By applying the Master Theorem, we conclude that the running time of this recursive algorithm is T(n) = O(n^(log₂ 8)) = O(n³). Thus, no progress was made.
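For illustration, here is a sketch of this warm-up recursive algorithm in Python, using NumPy for the block manipulations (the name matmul_recursive is ours, and we assume, as above, that n is a power of 2):

import numpy as np

def matmul_recursive(A, B):
    # Divide-and-conquer multiplication of n x n matrices (n a power of 2).
    # Makes 8 recursive calls on (n/2) x (n/2) blocks, hence still Theta(n^3).
    n = A.shape[0]
    if n == 1:
        return A * B  # base case: a single scalar multiplication
    h = n // 2
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
    C11 = matmul_recursive(A11, B11) + matmul_recursive(A12, B21)
    C12 = matmul_recursive(A11, B12) + matmul_recursive(A12, B22)
    C21 = matmul_recursive(A21, B11) + matmul_recursive(A22, B21)
    C22 = matmul_recursive(A21, B12) + matmul_recursive(A22, B22)
    return np.block([[C11, C12], [C21, C22]])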

Having the above as a starting point, Strassen then tried to reduce the number of recursive calls made. And he succeeded! He managed to reduce the number of recursive calls from 8 to 7. Thus, he ended up with a recursive algorithm whose running time S(n) satisfies

S(n) = 7 S(n/2) + O(n²).

The Master Theorem now implies that

S(n) = O(n^(log₂ 7)) = O(n^2.8074).

Strassen’s trick

Strassen made the following observation: he defined the seven matrices

M₁ = (A₁₁ + A₂₂) (B₁₁ + B₂₂)
M₂ = (A₂₁ + A₂₂) B₁₁
M₃ = A₁₁ (B₁₂ − B₂₂)
M₄ = A₂₂ (B₂₁ − B₁₁)
M₅ = (A₁₁ + A₁₂) B₂₂
M₆ = (A₂₁ − A₁₁) (B₁₁ + B₁₂)
M₇ = (A₁₂ − A₂₂) (B₂₁ + B₂₂)

These matrices might look like they came out of the blue, but some reverse engineering and playing around with the expressions defining the 4 blocks of C, along with Strassen’s ingenuity, led to these definitions. A crucial observation is that each of the matrices M₁, …, M₇ can be computed with exactly one recursive call, as the remaining operations are additions and subtractions. Thus, it already seems that 7 recursive calls might be enough. To finish his proof, Strassen then showed that the following equations hold.

C₁₁ = M₁ + M₄ − M₅ + M₇
C₁₂ = M₃ + M₅
C₂₁ = M₂ + M₄
C₂₂ = M₁ − M₂ + M₃ + M₆

It is easy to verify the above equations; here we will only verify that the second equation is true, and the rest is left as an exercise for the reader. Indeed, substituting the definitions of M₃ and M₅, we get

M₃ + M₅ = A₁₁ (B₁₂ − B₂₂) + (A₁₁ + A₁₂) B₂₂ = A₁₁ B₁₂ − A₁₁ B₂₂ + A₁₁ B₂₂ + A₁₂ B₂₂ = A₁₁ B₁₂ + A₁₂ B₂₂ = C₁₂.
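Putting everything together, here is a compact sketch of Strassen’s algorithm in Python (again using NumPy, with names of our choosing; in practice one would stop the recursion at a small cutoff size and switch to the naive algorithm there, rather than recursing all the way down to 1 x 1 matrices):

import numpy as np

def strassen(A, B):
    # Strassen's algorithm for n x n matrices, n a power of 2.
    # Only 7 recursive calls instead of 8, giving O(n^(log2 7)) time.
    n = A.shape[0]
    if n == 1:
        return A * B
    h = n // 2
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
    # The seven products, each computed with exactly one recursive call.
    M1 = strassen(A11 + A22, B11 + B22)
    M2 = strassen(A21 + A22, B11)
    M3 = strassen(A11, B12 - B22)
    M4 = strassen(A22, B21 - B11)
    M5 = strassen(A11 + A12, B22)
    M6 = strassen(A21 - A11, B11 + B12)
    M7 = strassen(A12 - A22, B21 + B22)
    # Recombining the blocks of C requires only additions and subtractions.
    C11 = M1 + M4 - M5 + M7
    C12 = M3 + M5
    C21 = M2 + M4
    C22 = M1 - M2 + M3 + M6
    return np.block([[C11, C12], [C21, C22]])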

Note that there is a constant number of matrix additions and subtractions that we need to do, and so the total number of steps besides the recursive calls is O(n²). Thus, we indeed get that the total running time S(n) satisfies

S(n) = 7 S(n/2) + O(n²).

We are done!

Conclusion

Strassen’s algorithm was a major breakthrough and was the starting point of a long line of research that is still ongoing to this day. The big open question is whether there exists a Matrix Multiplication algorithm with running time O(n²). Since the input and the output have size n², this is clearly a lower bound, and one cannot hope to go lower than that. Given the significance of the problem, the exponent of the best (optimal) Matrix Multiplication algorithm is denoted by ω. From the discussion in this article, we already have that

2 ≤ ω ≤ 2.8074.

In fact, the current state-of-the-art algorithm for Matrix Multiplication, by François Le Gall, shows that

ω < 2.3729.

A nice graph depicting this beautiful line of research that Strassen started is given below.

The improvements in ω during the last 5 decades. [Source: Wikipedia]

Only time will tell whether ω is equal to 2, or whether there is a fundamental structural barrier to reaching this desired exponent of 2.
