A Modern Look at Square Roots in the Babylonian Way

Revisiting the Babylonian method for square roots: why and how does it work?

In my article A totally new, very old method for finding square roots, I described how I discovered an algorithm that actually happened to be very old and well-known: the “Babylonian method.” (See References, at the bottom, for more on that topic.) Let’s try to work out how and why this old method works, “from first principles”: using just basic algebra, without resorting to advanced calculus.

The old method

The (Babylonian, Greek, or Indian; take your pick!) method itself is very simple: if you want to calculate √p, choose any initial value as your first guess, call it x, and then iterate by repeatedly finding a new value for x according to the following formula.

xₙₑₓₜ = (x + p/x) / 2

Basically, the idea is that if x is greater (smaller) than √p, then p/x will be smaller (greater) than √p. (Can you see why?) Thus, x and p/x bracket the value we want to find, and we may take their arithmetic average to get a better approximation, hoping this value will be closer to the root we want.
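
To make the procedure concrete, here is a minimal sketch in Python (the function name, default guess, and stopping tolerance are my own choices; the method itself doesn’t say when to stop):

```python
def babylonian_sqrt(p, x=1.0, tolerance=1e-12):
    """Approximate √p by repeatedly averaging x and p/x."""
    while abs(x * x - p) > tolerance:  # stop once x² is close enough to p
        x = (x + p / x) / 2            # replace x by the average of x and p/x
    return x

print(babylonian_sqrt(2))  # 1.4142135623730951
```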

If the Babylonians had lived a few millennia later, they would have realized that their iteration is an application of the Newton-Raphson method to solve the equation x²-p=0. In that method, to solve an equation f(x)=0 you start with an initial guess for the root x, and iterate calculating the next x according to the formula x-f(x)/f’(x). With f(x)=x²-p, the derivative is f’(x)=2x, and x-(x²-p)/(2x) simplifies to (x+p/x)/2: exactly the Babylonian update. But, obviously, this would have required knowledge of calculus that simply wasn’t available at the time!
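
That equivalence is easy to check numerically too. Here’s a small sketch (generic Newton-Raphson code; the naming is mine) running both iterations side by side for p=2:

```python
def newton_step(x, f, df):
    """One Newton-Raphson step: x - f(x)/f'(x)."""
    return x - f(x) / df(x)

p = 2.0
x_newton = x_babylonian = 1.0
for _ in range(5):
    x_newton = newton_step(x_newton, lambda x: x * x - p, lambda x: 2 * x)
    x_babylonian = (x_babylonian + p / x_babylonian) / 2
    print(x_newton, x_babylonian)  # the two columns agree, up to rounding
```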

Let’s work out an example: if you wanted to find the square root of 2 starting with x=1, your second estimate would be

(1 + 2/1) / 2 = 3/2 = 1.5

You would then repeat the calculation with x=1.5, to get the third value.

(1.5 + 2/1.5) / 2 = (1.5 + 1.3333…) / 2 = 1.41666…

By repeating the method, the fourth value would then be

(1.41666… + 2/1.41666…) / 2 = 1.41421568…

Each new value in the succession is closer to √2 (1.4142135624…) and it seems that the method is converging to the desired root, but is it (necessarily) so?
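
Those hand calculations are easy to reproduce with a few lines of code (the number of steps shown is my choice):

```python
p, x = 2.0, 1.0
for step in range(1, 6):
    x = (x + p / x) / 2
    print(step, x)
# 1 1.5
# 2 1.4166666666666665
# 3 1.4142156862745097
# 4 1.4142135623746899
# 5 1.4142135623730951
```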

We will prove several facts:

  • No matter what the initial guess, all successive approximations will be greater than √p. Of course, that alone doesn’t prove convergence; the values could grow away from the root instead of approaching it! But…
  • Successive approximations will get closer to √p, and the error of a new approximation will be less than half the error of the previous one. This means that the error tends to zero, so the iteration will actually converge, and the algorithm will calculate the root we seek. And things get even better, because…
  • When the approximations get close enough to √p, the error at one step is less than the square of the error at the previous step — convergence gets really fast. The algorithm is said to have “quadratic convergence”.

Let’s get started!

The initial guess doesn’t matter

Let’s start by proving that, no matter what the initial guess is, all the successive approximations will be greater than the root that we seek, so it doesn’t matter where we start. Of course, the initial guess does matter in one sense: if it’s closer to √p, convergence will be speedier. What we mean is that we can pick any initial value, and the successive approximations will all be greater than √p.

We will only consider positive values for x. If you wanted to find √2 but started with a negative value (say, x=-1), you would get exactly the same approximations, just with the sign changed (-1.5, -1.41666…, etc., mirroring the values we saw above).
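
A quick sketch confirms that mirroring (the starting values are mine):

```python
p, positive, negative = 2.0, 1.0, -1.0
for _ in range(3):
    positive = (positive + p / positive) / 2
    negative = (negative + p / negative) / 2
    print(positive, negative)  # same values, opposite signs
```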

Let’s now prove that the average of x and p/x (in other terms, the next value in the succession of approximations) is greater than √p.

(x + p/x) / 2 > √p

Equivalently, we have

x + p/x > 2√p

Both sides are greater than zero, so we may square to get

x² + 2p + p²/x² > 4p

Subtracting 4p from both sides

x² - 2p + p²/x² > 0

And this is true because we can rewrite it as

(x - p/x)² > 0

The only way this can be an equality is if x=√p; for all other values of x, the inequality will hold, so we proved what we wanted — no matter the initial value, all other values in the succession will be greater than the root we seek.
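
The proof is purely algebraic, but we can also spot-check it numerically; in this sketch (the test range is my choice), one step from any positive guess lands at or above √p:

```python
import math
import random

p = 2.0
root = math.sqrt(p)
for _ in range(1000):
    x = random.uniform(0.01, 100.0)  # an arbitrary positive first guess
    next_x = (x + p / x) / 2
    assert next_x >= root            # one step already puts us at or above √p
print("1000 random starts checked")
```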

If the Babylonians had known the AM-GM (Arithmetic Mean-Geometric Mean) inequality, they wouldn’t have needed our proof; they would have immediately applied the inequality to deduce that the arithmetic mean of x and p/x is at least their geometric mean: (x + p/x)/2 ≥ √(x · p/x) = √p !

Convergence to the root is assured

We know now that no matter what the initial guess x₀, all successive approximations will be greater than √p. To see the convergence of the succession, let’s define the error at step i as

eᵢ = xᵢ - √p

With the possible exception of e₀, which could be negative depending on x₀, all other errors will be positive, because all the successive values are greater than √p, as we saw above. Let’s rewrite our iteration in terms of errors.

xᵢ₊₁ = (xᵢ + p/xᵢ) / 2

Subtracting √p from both sides, and simplifying, we get

eᵢ₊₁ = xᵢ₊₁ - √p = (xᵢ² - 2√p·xᵢ + p) / (2xᵢ) = (xᵢ - √p)² / (2xᵢ) = eᵢ² / (2xᵢ)

We can then observe that, since 0 < xᵢ - √p < xᵢ,

eᵢ₊₁ = eᵢ · (xᵢ - √p) / (2xᵢ) < eᵢ · xᵢ / (2xᵢ)

We get the important result:

eᵢ₊₁ < eᵢ / 2

This means that the error (difference between x and √p) will be at least halved at each step — so each new value of x is closer to the root we seek. If we start at x₁ (which we know is greater than the sought root), then after n-1 further steps

eₙ < e₁ / 2ⁿ⁻¹

If we iterate enough times, the error in our approximation can be made as small as we want, so the method actually converges to √p, as desired.
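
We can watch the halving in action with a short sketch (the deliberately bad starting value is my choice):

```python
import math

p, x = 2.0, 10.0                 # start far away from √2 on purpose
previous_error = x - math.sqrt(p)
for _ in range(6):
    x = (x + p / x) / 2
    error = x - math.sqrt(p)
    print(error, error < previous_error / 2)  # prints True every time
    previous_error = error
```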

Convergence becomes quadratic

We can extract a further conclusion from the previous results. We have seen that the error at each step becomes smaller than the error at the previous one. In the limit, when the error tends to zero (which will happen, as we proved above), xᵢ approaches √p and we can write

eᵢ₊₁ = eᵢ² / (2xᵢ) ≈ eᵢ² / (2√p)

We get another interesting result: the order of convergence of our root calculation method is quadratic. Roughly speaking, this means that the number of precise digits at one step gets doubled at the next one.
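
One way to see the quadratic order numerically: by the formula above, the ratio eᵢ₊₁/eᵢ² should settle near 1/(2√p). A sketch (the instrumentation is mine):

```python
import math

p, x = 2.0, 1.0
root = math.sqrt(p)
error = x - root
for _ in range(4):
    x = (x + p / x) / 2
    new_error = x - root
    print(new_error / error ** 2)  # approaches 1/(2·root) ≈ 0.35355
    error = new_error
```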

Let’s verify all our claims by calculating the square root of 22, starting at 1. In the table below, the first column (“i”) just numbers the steps, starting at zero for our initial guess; the second column (“x”) shows the successive calculated values, and the third column (“e”) shows the error at that step.

i   x               e
0    1.0000000000   -3.69042
1   11.5000000000    6.80958
2    6.7065217391    2.01611
3    4.9934553606    0.30304
4    4.6996110953    0.00920
5    4.6904247557    9.0×10⁻⁶
6    4.6904157598    8.6×10⁻¹²
7    4.6904157598    7.9×10⁻²⁴

7 iterations are enough to calculate √22 with 12 decimals of precision

The table above verifies all our assertions: except for the initial error, all errors are positive. Also, the error in each line (again, excepting the initial case) is less than half the error in the previous line, and at the seventh iteration we reach a precision beyond 12 decimals.

As to the quadratic convergence, we can see that the number of correct decimals starts (more than) doubling: x₄ (4.69…) has two; x₅ (4.6904…) has four, and x₆ (4.6904157598…) has 10.
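
If you want to reproduce the table, here’s a sketch using Python’s decimal module (the 50-digit precision is my choice; ordinary floats bottom out near 10⁻¹⁶ and would hide the last error):

```python
from decimal import Decimal, getcontext

getcontext().prec = 50   # enough digits to expose e₇ ≈ 7.9×10⁻²⁴
p = Decimal(22)
root = p.sqrt()
x = Decimal(1)
for i in range(8):
    print(i, x, x - root)   # step, approximation, error
    x = (x + p / x) / 2
```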

Conclusion

We have taken a look at the Babylonian method for calculating square roots; we were able to prove its convergence for any initial value, and we also found an estimate for the error after any number of steps — all without resorting to derivatives or any other methods of calculus. These are not new results, for sure, but the interesting part is that we could derive them using nothing more than simple algebra… even if the Babylonians didn’t work that way!

References