Number Theory

Euclid’s Algorithm for Computing the Greatest Common Divisor

An algorithm from ancient Greece

Haris Angelidakis

25 Sep 2020 — 6 min read

Euclid of Alexandria [Source]

The task is straightforward. Given two positive integer numbers x and y, we want to compute their greatest common divisor, i.e., the largest number that divides both x and y without a remainder.

For example, suppose that x = 30 and y = 12. It is easy to factor these two numbers and write x = 30 = 2 * 3 * 5 and y = 2 * 2 * 3. Thus, we can conclude that the greatest common divisor of x and y, also denoted as gcd(x, y), is equal to 6.

The above example already implies an algorithm for our problem.

INPUT: Positive integer numbers x and y
OUTPUT: gcd(x, y)1) Decompose x and y as products of prime numbers.2) Set b = 1.3) Go over the list of the computed prime factors, and if a prime number p is a common factor that appears k times in the decomposition of a and t times in the decomposition of b, then set
b = b * (p * ... * p),
where p appears min(k,t) times in the parenthesis.4) Return b.

The above algorithm is quite simple, but it requires a significant black-box in its first step. The main question that it leads to is the following.

Can we efficiently compute the prime factors of a positive integer number?

The above computational problem is known as the “factoring problem”, and, to this day, remains a major open problem in Mathematics and Theoretical Computer Science. There are no known polynomial-time algorithms for it, but it is also not known whether the problem is NP-complete or not. In contrast, there are efficient algorithms for the related problem of checking whether a given number is prime or not (known as the “primality testing problem”, which was resolved in a breakthrough result of 2002 by Agrawal, Kayal, and Saxena; see here for more details).

Going back to our original task, we now turn to a beautiful algorithm devised by Euclid more than 2000 years ago, which shows that we do not need to solve the factoring problem in order to compute the greatest common divisor of two numbers.

Euclid’s algorithm

First, we can assume that x > y > 0; if x = y > 0, then we immediately get gcd(x, x) = x, and we also have gcd(x, 0) = x for every x > 0.

We start with a simple observation. Let x > y > 0 be positive integers and suppose that b is the greatest common divisor of x and y. This means that there exist positive integer numbers n and m with n > m such that

x = n * b, and

y = m * b.

By subtracting the two equations, we get that

x - y = (n - m) * b > 0.

Thus, b is a common divisor of x - y and y. We will now show that, in fact, b is the greatest common divisor of x - y and y. To see this, suppose that there exists an integer number d > b that divides both x - y and y. This means that

x - y = q * d, and

y = r * d,

for some positive integers q and r. Combining the two equations, we can easily get that

x = (q + r) * d,

which implies that d divides both x and y, or in other words, it is a common divisor of x and y. Since we have assumed that b is the greatest common divisor of x and y and we have d > b, we get a contradiction. Thus, we just proved the following fact.

Fact 1: If x ≥ y > 0 are positive integer numbers, then

gcd(x, y) = gcd(x - y, y).

We note that, although we assumed that x > y, Fact 1 clearly holds if x = y, and this is why we stated it in the above form.

One can naturally now try to repeatedly apply Fact 1 in order to make the numbers in question as small as possible. More precisely, we can apply Fact 1 and reduce the original problem to the problem of computing the greatest common divisor of x - y and y. If x - y > y, we can apply Fact 1 again, and we get the new numbers x - 2 * y and y. If x - 2 * y > y, we can reapply Fact 1, and so on…

It is clear that we can repeat the above process i times, where i is the smallest positive integer such that x - i * y < y. At this point, we have that

gcd(x, y) = gcd(x - i * y, y).

It is straightforward to see that i = x div y, and x - i * y = x mod y, where x div y is the quotient of the integer division of x and y, and x - i * y is the remainder of this integer division. Thus, we obtain the following fact.

Fact 2: If x ≥ y > 0 are positive integer numbers, then

gcd(x, y) = gcd(x mod y, y).

Based on Fact 2, we are now ready to state Euclid’s algorithm.

INPUT: Non-negative integer numbers x and y with max(x, y) > 0.
OUTPUT: gcd(x, y)EUCLID(x,y): 1) If (x = 0)
return y;2) If (y = 0)
return x;3) If (x > y)
return EUCLID(x mod y, y);
else
return EUCLID(x, y mod x);

The correctness of the algorithm follows immediately from Fact 2. We stress that the algorithm always terminates as x mod y < y when x ≥ y > 0, which implies that in every recursive call the arguments get smaller and smaller, while remaining non-negative, and so at least one of them will necessarily reach zero. So, the only thing that is left to address is the running time of the above algorithm.

Running time

The analysis of the running time is based on the following fact.

Fact 3: If x ≥ y > 0, then x mod y < x / 2.

To see this, we can do a simple case analysis. If y ≤ x / 2, then we immediately get that x mod y < y ≤ x / 2. So, suppose that y > x / 2. In this case, it is clear that x mod y = x - y < x / 2.

Thus, after two consecutive recursive calls, the arguments will have shrunk by a factor of 2. This means that after O(log x) recursive calls, we will reach the base case, and the algorithm will terminate. So, we only have to think about the time required to compute the remainder of the integer division performed at each step. We will not get into many details about this; the interested reader can read this Wikipedia article, whose punchline is that

“… for large integers, the computer time needed for a division is the same, up to a constant factor, as the time needed for a multiplication, whichever multiplication algorithm is used.”

Thus, the reduced task is to find the fastest possible way to multiply two integers. Given two positive integers x > y > 0, it is easy to see that one can multiply them in O(log² x) time. Determining the optimal algorithm for multiplying two n-bit integers is yet another fascinating problem; the interested reader can read this beautiful article about the problem published in Quanta. The current state-of-the-art for multiplication is an algorithm with running time O(n log n); so, if x is the largest of the two integers, this gives a running time of O(log x log log x). Putting everything together, we conclude the following.

The running time of Euclid’s algorithm with input x > y > 0 is O(log² x log log x).

In other words, the running time is roughly quadratic in the input size.

As a final note, Euclid’s algorithm also extends to modular arithmetic. The interested reader can read more about modular arithmetic and Euclid’s algorithm in Section 1.2.3 of the “Algorithms” book of Dasgupta, Papadimitriou, and Vazirani. The exposition of this article was also heavily inspired by that book.