A Totally New, Very Old Method for Finding Square Roots

A new iterative method to approximate square roots — actually shown to be centuries old!


Isaac Asimov once wrote [1] that all his discoveries in Math were either “completely correct findings that are quite old [or] completely new findings that are quite wrong.” I just had exactly the same experience, developing a totally new, greatly enhanced, surely unequaled method for finding square roots iteratively… to just end up realizing, after lots of work, that I had only rediscovered a well-known method!

I will show here where I started from, what ideas I developed, and the final sad conclusion about my totally new method being equivalent to a very old one… but since the road to that final result was interesting by itself, let’s begin!

The Irrationality of √2

My road started with Stanley Tennenbaum’s geometric proof [2] of the irrationality of √2. Let’s assume that there exists a fraction N/D that equals √2 (so N²=2D²) with integer N and D, and that no smaller values of N and D also have that property. We can draw the following diagram, with a white square of side N including two gray squares of side D.

The larger (white) square has side N; the two smaller overlapping (gray) squares have side D.

The dark gray part is covered twice, and the two white parts aren’t covered at all. Since the sum of the areas of the two D-sided squares (2D²) must equal the area of the original N-sided square (N²), it follows that the area of the twice-covered darker gray center square must equal the sum of the areas of the two uncovered white corner squares. Since the sides of the original squares were integers, so are the sides of the smaller squares: the doubly covered square has side 2D-N, and the uncovered ones have side N-D, so (2D-N)² must equal 2(N-D)²… but that’s impossible! We just found two smaller integers (2D-N and N-D) that have the same property as N and D, which were supposed to be the smallest ones with that property. The final conclusion is that N and D don’t exist, and √2 cannot be a rational number; elegant!

Let’s extract something from the previous proof. If N/D=√2, we then also have:

$$\sqrt{2} = \frac{N}{D} = \frac{2D - N}{N - D}$$

If we were to systematically apply the iteration

$$N_{k+1} = 2D_k - N_k, \qquad D_{k+1} = N_k - D_k$$

starting from, say, N₀=2 and D₀=1, we could expect (wish? hope?) that this would produce better and better rational approximations to √2. Fortunately, this proves to be so, even though the ratio Nₖ/Dₖ actually tends to -√2; not a problem!

Check: doing the numbers, we eventually get N₉=-816 and D₉=577, approximating -√2 (-1.414213562373…) as -1.41421… with 6 digits’ precision; later, N₁₃=-27720 and D₁₃=19601 produce -1.41421356… with 9 digits’ precision, and eventually N₁₇=-941664 and D₁₇=665857 give -1.41421356237… with 12 digits’ precision.
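If you want to watch these numbers appear, here is a minimal Python sketch (my addition, not part of the original derivation) of the iteration above:

```python
# Iterate N_{k+1} = 2*D_k - N_k, D_{k+1} = N_k - D_k from N0=2, D0=1;
# the ratio N/D tends to -sqrt(2).
n, d = 2, 1
for k in range(1, 18):
    n, d = 2 * d - n, n - d
    print(f"k={k:2d}  N={n:8d}  D={d:7d}  N/D={n / d:.12f}")
# k= 9 prints N=-816,   D=577    (6 correct digits)
# k=13 prints N=-27720, D=19601  (9 correct digits)
```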

Extending the Method for √p

It seems like the idea works (at least for finding √2) and we could generalize it to get a new iterative method to find √p for any positive number p: start with a pair of values N₀ and D₀ and iterate using a formula like the following:

$$N_{k+1} = aN_k + bD_k, \qquad D_{k+1} = cN_k + dD_k$$

Equation 1: our generic iteration, with undefined constants a through d

Our formula involves four constants (a, b, c, and d) that we don’t yet know — we had a=-1, b=2, c=1, and d=-1 to find √2, but surely the constants would be different when finding √p. To determine them, since the fraction (at the limit) should equal √p, let’s square its right side. Dropping subscripts for clarity we get

$$\left(\frac{aN + bD}{cN + dD}\right)^2 = p \quad\Longrightarrow\quad (aN + bD)^2 = p\,(cN + dD)^2$$

that leads to

$$a^2N^2 + 2ab\,ND + b^2D^2 = pc^2N^2 + 2pcd\,ND + pd^2D^2$$

Equation 2: conditions to be met by the a, b, c, and d constants

In order for this to work for all N and D, we can start by making the terms in ND equal, setting 2ab=2pcd so a=pcd/b. Note that we cannot pick c=0 because then we’d have a=0 and the iteration from equation (1) would become

$$N_{k+1} = bD_k, \qquad D_{k+1} = dD_k$$

which is useless: the fraction would get stuck at the constant value b/d. Also, scaling all four constants by the same factor changes N and D but not their ratio, so we may normalize one constant; let’s arbitrarily set c=1/p, and then a=d/b.

Let’s go back to equation (2); discarding the ND terms (which we already dealt with) and noting that at the limit N/D=√p implies N²=pD², we can write

$$a^2pD^2 + b^2D^2 = p^2c^2D^2 + pd^2D^2$$

so now it must be

$$pa^2 + b^2 = p^2c^2 + pd^2$$

As c=1/p and a=d/b, this leads to

$$\frac{pd^2}{b^2} + b^2 = 1 + pd^2$$

and we arrive at a quartic equation

$$b^4 - (1 + pd^2)\,b^2 + pd^2 = (b^2 - 1)(b^2 - pd^2) = 0$$

Equation 3: conditions for the b and d constants
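If you want to double-check that factorization, here is a quick sketch with SymPy (my addition; it just redoes the algebra above symbolically):

```python
# Verify that p*a^2 + b^2 = p^2*c^2 + p*d^2, with c = 1/p and a = d/b,
# reduces to the quartic (b^2 - 1)*(b^2 - p*d^2) = 0.
import sympy as sp

b, d, p = sp.symbols("b d p", positive=True)
constraint = p * (d / b) ** 2 + b**2 - p**2 * (1 / p) ** 2 - p * d**2
print(sp.factor(sp.expand(constraint * b**2)))
# -> (b - 1)*(b + 1)*(b**2 - d**2*p), i.e. (b^2 - 1)*(b^2 - p*d^2)
```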

What are its roots? Two roots come from b²=pd², so we’d have b=±d√p and thus a=d/b=±1/√p, but these values (together with c=1/p and any value for d) transform the iteration from equation (1) into another useless version (showing the positive signs; the negative ones behave alike):

$$\frac{N_{k+1}}{D_{k+1}} = \frac{N_k/\sqrt{p} + d\sqrt{p}\,D_k}{N_k/p + dD_k} = \sqrt{p}$$

The fraction is “exact” after a single step, but only because the constants themselves already contain the very √p we are trying to compute.

The other two roots come from b²=1, so we can pick b=1 and a=d. Furthermore, if d>0 and we start with positive values for N₀ and D₀, the iteration will converge to the positive value of √p since all constants are positive. (Our original √2 iteration, rescaled so that c=1/p, corresponds to b=1 and d=-1/2: the negative d is why it converged to -√2.)

Picking b=-1 doesn’t work: instead of converging to a value, the successive fractions produce an alternating cycle of two values. Indeed, with b=-1 (and thus a=-d), applying the iteration twice sends (N, D) to ((d²-1/p)N, (d²-1/p)D), so the fractions repeat with period 2. Thus, out of the four roots of equation (3), only one (b=1) can be used.
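A tiny sketch (my addition) makes the two-value cycle visible:

```python
# With b = -1 (so a = d/b = -d) and c = 1/p, the fraction N/D just alternates.
p, d = 2, 1.0
n, dd = 1.0, 2.0                      # arbitrary starting values
for k in range(6):
    n, dd = -d * n - dd, n / p + d * dd
    print(f"k={k + 1}  N/D = {n / dd:+.6f}")
# prints -1.200000, +0.500000, -1.200000, +0.500000, ...
```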

Let’s recap: we have found an iterative method, depending on a yet undecided value d, that will (supposedly, hopefully!) converge to √p:

$$N_{k+1} = dN_k + D_k, \qquad D_{k+1} = \frac{N_k}{p} + dD_k$$

Equation 4: our iteration formula, with a still undefined constant: d

A couple of questions remain: is this actually convergent? And, what value of d should we select? Let’s get into that.

Convergence of the Iteration

We can rewrite equation (4) in vectorial form as follows:

$$\begin{pmatrix} N_{k+1} \\ D_{k+1} \end{pmatrix} = \begin{pmatrix} d & 1 \\ 1/p & d \end{pmatrix} \begin{pmatrix} N_k \\ D_k \end{pmatrix} = A \begin{pmatrix} N_k \\ D_k \end{pmatrix}$$

Equation 5: our iteration, written vectorially

From equation (5) it follows that, for all k≥0,

$$\begin{pmatrix} N_k \\ D_k \end{pmatrix} = A^k \begin{pmatrix} N_0 \\ D_0 \end{pmatrix}$$

Equation 6: the starting point for our iterative method

To find a closed expression for the k-th power of matrix A, we have to find its eigenvalues and eigenvectors. Solving det(A-λI)=0 finds the eigenvalues

$$\lambda_1 = d + \frac{1}{\sqrt{p}}, \qquad \lambda_2 = d - \frac{1}{\sqrt{p}}$$

with corresponding eigenvectors

$$v_1 = \begin{pmatrix} \sqrt{p} \\ 1 \end{pmatrix}, \qquad v_2 = \begin{pmatrix} -\sqrt{p} \\ 1 \end{pmatrix}$$

With this result, we can write A=CBC⁻¹, where C is the matrix formed by the eigenvectors of A, and B is a diagonal matrix with the eigenvalues of A:

$$C = \begin{pmatrix} \sqrt{p} & -\sqrt{p} \\ 1 & 1 \end{pmatrix}, \qquad B = \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix}$$

It follows that

$$A^k = CB^kC^{-1}$$

and calculating

$$C^{-1} = \frac{1}{2\sqrt{p}} \begin{pmatrix} 1 & \sqrt{p} \\ -1 & \sqrt{p} \end{pmatrix}$$

we finally get the k-th power of A that we wanted:

$$A^k = \frac{1}{2} \begin{pmatrix} \lambda_1^k + \lambda_2^k & \sqrt{p}\,(\lambda_1^k - \lambda_2^k) \\ (\lambda_1^k - \lambda_2^k)/\sqrt{p} & \lambda_1^k + \lambda_2^k \end{pmatrix}$$

From equation (6) we get the result:

$$\frac{N_k}{D_k} = \frac{(\lambda_1^k + \lambda_2^k)\,N_0 + \sqrt{p}\,(\lambda_1^k - \lambda_2^k)\,D_0}{(\lambda_1^k - \lambda_2^k)\,N_0/\sqrt{p} + (\lambda_1^k + \lambda_2^k)\,D_0}$$

Equation 7: a closed form for the k-th iteration results

For d>0 we have |λ₁|>|λ₂|, so dividing the numerator and denominator of equation (7) by λ₁ᵏ and letting k tend to infinity, the λ₂ terms vanish and our fraction tends to (N₀+√p·D₀)/(N₀/√p+D₀)=√p, as we wanted. Even better: convergence is assured for any pair of nonzero initial values N₀ and D₀, as long as N₀/D₀≠-√p (which cannot happen if we start from rational values).
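As a quick numerical sanity check (my addition), we can evaluate equation (7) directly and watch the fraction approach √p:

```python
# Evaluate the closed form of equation (7) for p = 7, d = 0.4, N0 = 1, D0 = 0.4.
import math

p, d = 7.0, 0.4
n0, d0 = 1.0, 0.4
sq = math.sqrt(p)
l1, l2 = d + 1 / sq, d - 1 / sq                    # the two eigenvalues
for k in (1, 2, 4, 8):
    plus, minus = l1**k + l2**k, l1**k - l2**k
    ratio = (plus * n0 + sq * minus * d0) / (minus * n0 / sq + plus * d0)
    print(f"k={k}  N/D = {ratio:.12f}")            # approaches sqrt(7) = 2.645751311065
```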

We can now also find the best possible value for d. Any d>0 would work, but the closer |λ₂|=|d-1/√p| is to zero, the faster the convergence; so we should pick d as close to 1/√p as we can. (A rough estimate is enough; after all, if we knew √p exactly we wouldn’t need to iterate.)

Check: for p=7 (so √p=2.6457513111…) and d=0.4, starting from N₀=1 and D₀=0.4, and scaling the whole iteration by p for convenience (which doesn’t change the fractions): after one iteration, 5.6/2.12=2.6415… already has 3 digits’ precision; after two, 30.52/11.536=2.64563… has 4 correct digits, and after four we get to 905.128/342.10624=2.6457512… with 7 digits’ precision.
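A short Python sketch (my addition) reproduces these numbers, using the iteration of equation (4) scaled by p:

```python
# Equation (4) scaled by p: N' = d*p*N + p*D, D' = N + d*p*D (same fractions).
p, d = 7, 0.4
n, dd = 1.0, 0.4
for k in range(1, 5):
    n, dd = d * p * n + p * dd, n + d * p * dd
    print(f"k={k}  {n:.5f}/{dd:.5f} = {n / dd:.10f}")
# k=2 prints 30.52000/11.53600, k=4 prints 905.12800/342.10624
```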

In any case, we can conclude that our iterative method always converges; how should we apply it?

Speeding Things Up

We’re getting to the actual method now. It’s clear that we’ll attain better precision by getting to higher values of k, so how can we quickly get there? We can do this quite fast by calculating AA=A², then A²A²=A⁴, A⁴A⁴=A⁸, etc. Noting the special form of A:

$$A = \begin{pmatrix} r & s \\ t & r \end{pmatrix}$$

we then find that A² shares the same form, since

$$\begin{pmatrix} r & s \\ t & r \end{pmatrix}^2 = \begin{pmatrix} r^2 + st & 2rs \\ 2rt & r^2 + st \end{pmatrix}$$

and therefore all the powers of A that we will calculate will also share this form. We can do calculations without any actual matrix multiplication, by writing an iteration like

$$r_{i+1} = r_i^2 + s_i t_i, \qquad s_{i+1} = 2r_i s_i, \qquad t_{i+1} = 2r_i t_i$$

Equation 8: an iterative alternative to matrix multiplications

with initial values r₀=dp, s₀=p, and t₀=1: the entries of the matrix A from equation (5), scaled by p (a harmless change, as we’re about to see).
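In code, the repeated squaring looks like this (a sketch of mine; after i steps the triple (r, s, t) encodes A raised to the power 2ⁱ, up to scale):

```python
# Repeated squaring via equation (8): (r, s, t) represents a scaled A**(2**i).
p, d = 7, 0.4
n0, d0 = 1.0, 0.4
r, s, t = d * p, p, 1.0                       # entries of p*A
for i in range(1, 6):
    r, s, t = r * r + s * t, 2 * r * s, 2 * r * t
    n, dd = r * n0 + s * d0, t * n0 + r * d0  # apply the power to (N0, D0)
    print(f"i={i}  k={2**i:2d}  N/D = {n / dd:.15f}")
```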

But we can observe something! If we scale A or its powers by any number, the result of equation (7) won’t change, since we’ll be scaling both the numerator and denominator by the same value. In particular, let’s divide all formulas in equation (8) by 2rᵢ:

$$r_{i+1} = \frac{r_i^2 + s_i t_i}{2r_i}, \qquad s_{i+1} = s_i, \qquad t_{i+1} = t_i$$

Equation 9: an alternative iteration with fewer operations

Now the s and t values never change, and since sᵢtᵢ=s₀t₀=p, the whole iteration of equation (9) reduces to just

$$r_{i+1} = \frac{r_i^2 + p}{2r_i} = \frac{1}{2}\left(r_i + \frac{p}{r_i}\right)$$

Equation 10: a very ancient method, rediscovered!

which is just the well-known “Babylonian method” or “Heron’s method” [3], a special case of Newton’s formula for approximating √p, known for more than 2,000 years!
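For completeness, here is equation (10) as a few lines of Python (my sketch; heron_sqrt is a hypothetical helper name):

```python
# Heron's (Babylonian) method: each step roughly doubles the correct digits.
def heron_sqrt(p: float, r: float, steps: int = 6) -> float:
    for _ in range(steps):
        r = 0.5 * (r + p / r)
    return r

print(heron_sqrt(7.0, 2.8))   # ~2.6457513110645907, i.e. sqrt(7)
```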

Instead of trying to calculate complete matrices, we make do with a single element — which converges to the square root that we were trying to approximate; a good reduction in calculations. Our totally new method ended up being equivalent to a very old formula; an interesting result, but not an innovation!

Conclusion

This article is fully aligned with Asimov’s comment on “completely correct findings that are quite old”. We went from a geometrical proof of the irrationality of √2, to an iteration producing fractions that approximated the wanted square root, to a matrix version of the same iteration (which we proved to converge), to an accelerated way of doing the calculations by repeated squaring, to finally deduce that our newly invented method was equivalent to a millennia-old (and quite well-known) formula for approximating square roots…

As we said: a totally new way to produce a very old method!

Further reading

Check out my article “A Modern Look at Square Roots in the Babylonian Way” to see how that method worked, and why: there I review the original method and prove how and why it works, in a simple way, using just a bit of algebra and no calculus.


References

[1] Isaac Asimov, ASIMOV ON NUMBERS, 1977, Pocket Books, page 36

[2] John Conway, THE POWER OF MATHEMATICS, in Alan Blackwell & David MacKay (eds), POWER, Cambridge University Press, pp 16–36, available online at http://thewe.net/math/conway.pdf

[3] Babylonian Method, available online at https://en.wikipedia.org/wiki/Methods_of_computing_square_roots#Babylonian_method