Root-Finding Algorithms

Ways of determining where a function is equal to zero

As every schoolchild knows, the solutions of a quadratic polynomial equation ax²+bx+c=0 are given by the quadratic formula:

x = (-b ± √(b²-4ac))/(2a)

But in the real world, we don’t just have to deal with second degree polynomials. Finding the roots of higher-order polynomials, or roots of transcendental equations like x-arctan(x)=0, can be a challenge because:

  • While there are formulas analogous to the quadratic formula for third and fourth degree polynomials, they are very complicated and aren’t generally worth using.
  • By the Abel-Ruffini theorem, there is no general formula in radicals that gives the roots of polynomials of degree five and greater.
  • Transcendental equations do not generally admit closed-form solutions, meaning that the solution cannot be written as a finite combination of basic operations like addition and multiplication and of elementary functions, so solving them by hand is either impossible or requires some kind of special treatment.

Fortunately, in the real-world problems where we are likely to encounter such equations, we don’t actually need exact solutions, and in any case, even if we could obtain them, the equipment that we use to store numerical data and take numerical measurements can only handle finitely many decimal places. This justifies using numerical approximations to find solutions to any general equation of the form f(x)=0.

The bisection algorithm

The bisection algorithm, also called the binary search algorithm, is possibly the simplest root-finding algorithm. Here’s how it works:

  • First, pick two numbers, a and b, such that f(a) and f(b) do not have the same sign and f(x) is continuous on the interval [a,b]. By the intermediate value theorem, there is at least one value of x strictly between a and b for which f(x)=0. Try to pick a and b close together to reduce the possibility that there will be multiple roots on the interval [a,b].
  • Let c be the midpoint of a and b, c=(a+b)/2, and compute f(c). If f(c) is zero or approximately zero up to our desired accuracy, or if we reach some other stopping condition such as reaching a certain number of iterations, then we are done.
  • Otherwise, f(c) will have the same sign as either f(a) or f(b). If f(c) has the same sign as f(a), then repeat step two but this time on the interval [c,b], and if f(c) has the same sign as f(b), then repeat step two on the interval [a,c]. (One full iteration is worked out below.)
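
To make steps two and three concrete, here is the first iteration worked out by hand (my own arithmetic) for the polynomial used in the program below, f(x)=3x⁴-5x³+2x²-x-1, starting from a=0.1 and b=2.5:

f(0.1) ≈ -1.08 and f(2.5) ≈ 48.06, so the two endpoints have opposite signs and a root is bracketed
c = (0.1+2.5)/2 = 1.3 and f(1.3) ≈ -1.34, which has the same sign as f(a), so the new interval is [1.3, 2.5]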

The following Fortran program (see the notes at the end of the article for how to run this code) implements this algorithm for the polynomial equation 3x⁴-5x³+2x²-x-1=0:

program bisection
implicit none
real:: a,b,c,d,tol,f
integer:: k, N
tol=0.001
k=1
write(*,*) "Enter number of iterations and bounds of search interval."
read(*,*) N,a,b
! Keep asking until f(a) and f(b) have opposite signs.
do while (f(a)*f(b) .gt. 0)
write(*,*) "Try another pair of boundaries."
read(*,*) a,b
end do
do while (1 .eq. 1)
if (k .eq. N) then
print*, "Failed. Try again with more iterations or different boundaries."
exit
end if
! Bisect the current interval.
c=0.5*(a+b)
if (f(c) .eq. 0) then
print *, "The root is ",c
exit
else if (f(c)*f(a) .gt. 0) then
a = c
else
b = c
end if
! Stop when f barely changes between the old and new midpoints.
d=0.5*(a+b)
if (abs(f(c)-f(d)) .lt. tol) then
print *, "The root is ",d
print *, "Finished in",k,"iterations."
exit
end if
k=k+1
end do
end program bisection

real function f(x)
real:: x
f = 3*x**4-5*x**3+2*x**2-x-1
return
end function f

Starting with a=0.1 and b=2.5 and given 100 iterations, the output of this program is:

Enter number of iterations and bounds of search interval.
100
0.1
2.5
The root is 1.47204590
Finished in 14 iterations.

Notice that the algorithm converged fairly quickly, in just 14 iterations. While bisection usually converges at a decent rate, its main drawback is that convergence to a root is only guaranteed when f(x) is continuous on [a,b] and f(a) and f(b) have opposite signs; if the interval happens to contain several roots, we cannot control which one the method finds, and the algorithm has no inherent ability to check any of these conditions. We should not use bisection unless we know beforehand that they are satisfied.
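
As a rough sanity check (my own back-of-the-envelope estimate, not something the program reports): the bracket is halved at every step, so shrinking an interval of width 2.4 below the tolerance of 0.001 should take roughly

n ≈ log₂((b-a)/tol) = log₂(2.4/0.001) ≈ 11

halvings, which is in the same ballpark as the 14 iterations observed. The two counts are not directly comparable because the program's stopping test is on function values rather than on the width of the bracket.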

The false position method

While bisection is a perfectly good approach to finding roots of equations, it is ultimately a brute-force search, so it is natural to ask whether we can do something more efficient. A simple improvement to the bisection method is the false position method, or regula falsi. Here’s what we do:

  • As with the bisection algorithm, start by choosing an interval [a,b] in which we know that f(x) is continuous and has only one root.
  • Instead of the midpoint of [a,b], let c be the point where the line from (a,f(a)) to (b,f(b)) intersects with the x-axis:

The formula for c is:

c = (a·f(b)-b·f(a))/(f(b)-f(a))

  • If f(c) has the same sign as f(a) (or f(b)) then replace a (or b, respectively) with c and repeat the process:

Notice that only one of the bounds has changed so far. In fact, if we proceeded this way, then only the left bound would ever change. This doesn’t necessarily mean that the algorithm will fail to converge, since, just like the bisection method, false position converges as long as f(x) is continuous and changes sign on [a,b]. However, a bound that never moves can slow the rate of convergence considerably. When this happens, we call b a “stagnant bound”, and to correct for stagnant bounds we make a slight modification to the false position algorithm:

  • If a bound is unchanged after two iterations, then replace the value of the function at that bound with one half of its value.

This is called modified false position or sometimes the Illinois algorithm. Graphically, this looks like:

Notice that this brings the candidate for the approximate root significantly closer to the actual root. Here’s the code for the algorithm:

program falseposition
implicit none
real:: xl,x,xr,xnew,tol,f,yl,yr
integer:: N,k,kl,kr
tol = 0.001
write(*,*) "Enter number of iterations and bounds of search interval."
read(*,*) N,xl,xr
! Keep asking until f(xl) and f(xr) have opposite signs.
do while (f(xl)*f(xr) .gt. 0)
write (*,*) "Try another pair of boundaries."
read(*,*) xl,xr
end do
k=0
kl=0
kr=0
yl = f(xl)
yr = f(xr)
do while (1 .eq. 1)
if (k .eq. N) then
print*, "Failed. Try again with more iterations or different boundaries."
exit
end if
! x is where the line through (xl,yl) and (xr,yr) crosses the x-axis.
x=(xl*yr-xr*yl)/(yr-yl)
if (f(x) .eq. 0) then
print *, "The root is",x
exit
else if (f(x)*yl .gt. 0) then
xl = x
yl = f(xl)
kl = 0
kr = kr+1
! Illinois correction: halve the stored value at a stagnant right bound.
if (kr .ge. 2) then
yr = 0.5*yr
end if
else
xr = x
yr = f(xr)
kr = 0
kl = kl+1
! Illinois correction: halve the stored value at a stagnant left bound.
if (kl .ge. 2) then
yl = 0.5*yl
end if
end if
xnew = (xl*yr-xr*yl)/(yr-yl)
if (abs(f(xnew)-f(x)) .lt. tol) then
print *, "The root is", xnew
print *, "Finished in",k,"iterations."
exit
end if
k = k+1
end do
end program falseposition

real function f(x)
real:: x
f=3*x**4-5*x**3+2*x**2-x-1
return
end function f

The output is:

Enter number of iterations and bounds of search interval.
100
0.1
2.5
The root is 1.47210526
Finished in 10 iterations.

This algorithm finished after only 10 iterations, which is a good improvement over bisection. We can remove the stagnant bound correction process by commenting out the following two sections of code:

kl = 0
kr = kr+1
if (kr .ge. 2) then
yr = 0.5*yr
end if
...
kr = 0
kl = kl+1
if (kl .ge. 2) then
yl = 0.5*yl
end if

When we do this, we see that the algorithm is much slower:

Enter number of iterations and bounds of search interval.
100
0.1
2.5
The root is 1.47180653
Finished in 45 iterations.

Although the modified form of the false position algorithm is relatively recent, the simple false position algorithm is truly ancient. It was known to the ancient Egyptians and Babylonians, and the algorithm is used in the Rhind Papyrus (written between 1650 and 1550 BCE) to solve linear equations.

Newton’s method

Unlike bisection and false position, Newton’s method is not bracketed, meaning that the algorithm is not bound to a pre-determined interval. This will make for a faster algorithm, but the consequence is that there is now a risk that the algorithm will diverge.

Let x₀ be a guess for the root. The equation for the tangent line at x₀ is y=f(x₀)+f’(x₀)(x-x₀). Let x₁ be the x-intercept of y(x); then, setting y=0 and rearranging:

x₁ = x₀ - f(x₀)/f’(x₀)

Although there is no guarantee of this, x₁ will usually be a better approximation to the root than x₀. If this is the case, then the following will most likely be an even better approximation:

x₂ = x₁ - f(x₁)/f’(x₁)

And in general, we can obtain successively better approximations by iterating:

xₙ₊₁ = xₙ - f(xₙ)/f’(xₙ)

Graphically, this looks like:

Source: Paul’s Online Notes

As xₙ approaches the root, the odds that xₙ₊₁ will fail to be a better approximation than xₙ decrease quickly, so we’re probably in the clear if we can survive the first few iterations. Some of the most common problems are:

  • The function has a maximum or minimum somewhere between xₙ and the root, in which case there’s a chance that at the next iteration, the derivative of f(x) will be zero or close to zero, so f(xₙ)/f’(xₙ) may be unreasonably large or the algorithm may just immediately fail due to division by zero (a concrete example follows this list).
  • The initial guess for the root was too far away from the actual root.
  • The root is a multiple root, in which case it is a root of both f(x) and f’(x), and this will generally slow down convergence.
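
As a concrete instance of the first failure mode: for our example polynomial, the derivative f’(x)=12x³-15x²+4x-1 vanishes at x=1 (since 12-15+4-1=0), so an initial guess of exactly 1 would make the very first Newton step a division by zero, and guesses close to 1 produce an unreasonably large first step. The program below checks for this case and exits with a request for a different guess.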

If the algorithm fails then we just make another guess and try again. Often we can make a very good guess by qualitatively analyzing the function we’re working with. Returning again to our example 3x⁴-5x³+2x²-x-1=0, consider the graph of the polynomial:

This suggests that a guess between 1 and 2 will be good for finding the positive root and a guess between -1 and 0 will be good for finding the negative root. We will not have to worry about multiple roots because clearly neither of these roots is a local maximum or minimum. The code for Newton’s method is:

program newtonmethod
implicit none
real:: f,fprime,xn,tol
integer:: k,N
tol=0.001
k=1
write(*,*) "Enter number of iterations and a guess for the root."
read(*,*) N,xn
do while (1 .eq. 1)
if (k .eq. N) then
print*, "Failed. Try again with more iterations or a different guess."
exit
else if (fprime(xn) .eq. 0) then
print*, "Encountered a stationary point. Try again with a different guess."
exit
else if (abs(f(xn)) .lt. tol) then
print*, "The root is",xn
print*, "Finished in",k,"iterations."
exit
end if
! Newton step: move to the x-intercept of the tangent line at xn.
xn = xn-f(xn)/fprime(xn)
k = k+1
end do
end program newtonmethod

real function f(x)
real:: x
f = 3*x**4-5*x**3+2*x**2-x-1
return
end function f

real function fprime(y)
real:: y
fprime = 12*y**3-15*y**2+4*y-1
return
end function fprime

The outputs when this program is used to find both of the roots are:

Enter number of iterations and a guess for the root.
25
1.5
The root is 1.47210646
Finished in 3 iterations.
...
Enter number of iterations and a guess for the root.
25
-0.5
The root is -0.378930330
Finished in 4 iterations.
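
As a quick hand check of the first step from the guess x₀=1.5 (my own arithmetic, not part of the program’s output): f(1.5)=0.3125 and f’(1.5)=11.75, so

x₁ = 1.5 - 0.3125/11.75 ≈ 1.4734

which is already within about 0.0013 of the root reported above. This is the rapid convergence typical of Newton’s method near a simple root.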

As you can see, Newton’s method is very powerful, but we should keep its drawbacks in mind:

  • We have to figure out the derivative of f(x) beforehand. This isn’t a problem for polynomials, but it can be an issue for more complicated functions.
  • The algorithm encounters problems if it gets close to places where the derivative of f(x) is zero.
  • The conditions for Newton’s method to work correctly are stricter than they are for bisection and false position, and one has to verify beforehand that the conditions are satisfied.

The secant method

The secant method is similar to Newton’s method but with a finite difference instead of a derivative. Like the false position method, the secant method is very old and instances of its use have been found on Babylonian clay tablets dating back to at least 1700 BCE. It is also sometimes called the double false position method. Here’s how it works:

  • Similar to false position, we pick two points x₀ and x₁ and draw a line through the points (x₀,f(x₀)) and (x₁,f(x₁)). In point-slope form, the equation for this line is y-f(x₁) = [(f(x₁)-f(x₀))/(x₁-x₀)]·(x-x₁).
  • Then we’ll say that x₂ is the x-intercept of this line: x₂ = x₁ - f(x₁)·(x₁-x₀)/(f(x₁)-f(x₀)).
  • From these starting conditions, successive approximations are obtained by the recursion xₙ₊₁ = xₙ - f(xₙ)·(xₙ-xₙ₋₁)/(f(xₙ)-f(xₙ₋₁)).

Like Newton’s method, the secant method is not bracketed, so there is a possibility that the algorithm will diverge. However, this possibility is smaller, and furthermore the secant method doesn’t require us to worry about the derivative of f(x). The following code implements the secant method:

program secant
implicit none
real:: xn,xn1,xn2,f,tol
integer:: k,N
tol = 0.001
k=1
write(*,*) "Enter number of iterations and guesses for x0 and x1"
read(*,*) N, xn,xn1
do while (1 .eq. 1)
if (k .eq. N) then
print*, "Failed. Try again with more iterations or different guesses."
exit
end if
! Secant step: x-intercept of the line through the two latest points.
xn2 = xn1 - f(xn1)*(xn1-xn)/(f(xn1)-f(xn))
if (abs(f(xn2)) .lt. tol) then
print*, "The root is",xn2
print*, "Finished in",k,"iterations."
exit
else
xn=xn1
xn1=xn2
k=k+1
end if
end do
end program secant

real function f(x)
real:: x
f = 3*x**4-5*x**3+2*x**2-x-1
return
end function f

The output for finding the positive and negative roots is:

Enter number of iterations and guesses for x0 and x1
25
1
2
The root is 1.47206616
Finished in 8 iterations.
...
Enter number of iterations and guesses for x0 and x1
25
0
-1
The root is -0.378926098
Finished in 8 iterations.

As you can see, the secant method is slower than Newton’s method but faster than the bracketed methods.
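
To see why it lags behind Newton’s method, consider the first step from the guesses x₀=1 and x₁=2 (my own arithmetic, not part of the program’s output): f(1)=-2 and f(2)=13, so

x₂ = 2 - 13·(2-1)/(13-(-2)) ≈ 1.13

which is still well short of the root near 1.47. The secant line is built from two points that may be far apart, so early steps tend to be cruder than Newton steps; on the other hand, each iteration is cheaper, since no derivative is required.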

Inverse quadratic interpolation

Any polynomial of degree n can be exactly specified by choosing n+1 distinct points. This is the idea behind Lagrange interpolation: given some function f(x), whose value we only know at n+1 values of x, we can approximate f(x) in the vicinity of those points by the unique polynomial of degree at most n, called the Lagrange polynomial, defined by those points. In the method of inverse quadratic interpolation, we use Lagrange interpolation to approximate the inverse of f(x) with a quadratic polynomial, and then use the result to find a recursion for the roots.

Suppose that we know the values of y₀=f(x₀), y₁=f(x₁), and y₂=f(x₂). We use this initial data to interpolate f⁻¹(y) as a quadratic polynomial in y by the Lagrange polynomial formula. The result is:

f⁻¹(y) ≈ x₀·(y-y₁)(y-y₂)/[(y₀-y₁)(y₀-y₂)] + x₁·(y-y₀)(y-y₂)/[(y₁-y₀)(y₁-y₂)] + x₂·(y-y₀)(y-y₁)/[(y₂-y₀)(y₂-y₁)]

Now substitute y=f(x)=0 to get:

x ≈ x₀·y₁y₂/[(y₀-y₁)(y₀-y₂)] + x₁·y₀y₂/[(y₁-y₀)(y₁-y₂)] + x₂·y₀y₁/[(y₂-y₀)(y₂-y₁)]

But we don’t actually know which x leads to f(x)=0, so we’ll instead say that the right-hand side of this formula generates a better approximation of the root, and this establishes the following recursion:

xₙ₊₃ = xₙ·f(xₙ₊₁)f(xₙ₊₂)/[(f(xₙ)-f(xₙ₊₁))(f(xₙ)-f(xₙ₊₂))] + xₙ₊₁·f(xₙ)f(xₙ₊₂)/[(f(xₙ₊₁)-f(xₙ))(f(xₙ₊₁)-f(xₙ₊₂))] + xₙ₊₂·f(xₙ)f(xₙ₊₁)/[(f(xₙ₊₂)-f(xₙ))(f(xₙ₊₂)-f(xₙ₊₁))]

Here’s some code that implements this formula:

program invquadinterp
implicit none
real:: xn,xn1,xn2,xn3,tol,f
integer:: N,k
tol=0.001
k=1
write(*,*) "Enter number of iterations and three values of x."
read(*,*) N,xn,xn1,xn2
do while (1 .eq. 1)
if (k .eq. N) then
print*, "Failed. Try again with more iterations or different x values."
exit
end if
! Inverse quadratic interpolation step, accumulated term by term.
xn3 = xn*f(xn1)*f(xn2)/((f(xn)-f(xn1))*(f(xn)-f(xn2)))
xn3 = xn3+xn1*f(xn)*f(xn2)/((f(xn1)-f(xn))*(f(xn1)-f(xn2)))
xn3 = xn3+xn2*f(xn)*f(xn1)/((f(xn2)-f(xn))*(f(xn2)-f(xn1)))
if (abs(f(xn3)) .lt. tol) then
print*, "The root is",xn3
print*, "Finished in",k,"iterations."
exit
else
xn=xn1
xn1=xn2
xn2=xn3
k=k+1
end if
end do
end program invquadinterp

real function f(x)
real:: x
f = 3*x**4-5*x**3+2*x**2-x-1
return
end function f

The output when trying to find the positive and negative roots is:

Enter number of iterations and three values of x.
100
1
1.5
2
The root is 1.47215569
Finished in 3 iterations.
...
Enter number of iterations and three values of x.
100
0
-0.5
-1
The root is -0.378949940
Finished in 4 iterations.

As you can see, this method is quite fast. Of course, we’d expect it to be for this problem: we are approximating the inverse of a low-degree polynomial with second-degree polynomials, so one wouldn’t expect it to take much work to get a good result.

It should be noted that inverse quadratic interpolation is rarely used by itself. Instead, it is combined with the bisection and secant methods in Brent’s method, which decides at each iteration which of the three techniques is best to use.
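
Brent’s method involves careful bookkeeping, but the basic idea of “try a fast step, fall back to a safe bracketed step when it misbehaves” can be illustrated with a much simpler hybrid. The following program is my own simplified sketch, not Brent’s actual algorithm: it attempts a secant step through the two most recent iterates and falls back to bisection whenever that step would leave the current bracket (the program name and layout simply follow the other examples in this article).

program hybridroot
implicit none
real:: a,b,fa,fb,x0,x1,f0,f1,s,fs,tol,f
integer:: k,N
tol = 0.001
N = 100
write(*,*) "Enter bounds of search interval."
read(*,*) a,b
fa = f(a)
fb = f(b)
if (fa*fb .gt. 0) then
print*, "f(a) and f(b) must have opposite signs."
stop
end if
! Keep the two most recent iterates for the secant step.
x0 = a
f0 = fa
x1 = b
f1 = fb
do k = 1,N
! Try a secant step through the two most recent iterates.
if (f1 .eq. f0) then
s = 0.5*(a+b)
else
s = x1 - f1*(x1-x0)/(f1-f0)
! If the step leaves the bracket, fall back to plain bisection.
if ((s .le. min(a,b)) .or. (s .ge. max(a,b))) then
s = 0.5*(a+b)
end if
end if
fs = f(s)
if (abs(fs) .lt. tol) then
print*, "The root is",s
print*, "Finished in",k,"iterations."
exit
end if
! Shrink the bracket so that it still contains a sign change.
if (fa*fs .lt. 0) then
b = s
fb = fs
else
a = s
fa = fs
end if
x0 = x1
f0 = f1
x1 = s
f1 = fs
end do
if (k .gt. N) then
print*, "No root found to the requested tolerance."
end if
end program hybridroot

real function f(x)
real:: x
f = 3*x**4-5*x**3+2*x**2-x-1
return
end function f

Brent’s method builds on this same pattern, but also tries inverse quadratic interpolation when three distinct points are available and applies stricter tests before accepting an interpolated step, which is what makes it such a reliable general-purpose root-finder.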

Fortran

It may seem a little unusual that I decided to use Fortran for the examples in this article: Fortran has a mostly undeserved reputation for being a “dead language”, conjuring images of punch cards and tape reels, seemingly at odds with modern systems like MATLAB and Python. In fact, Fortran remains the standard in physics (which is my background) for reasons that have been discussed elsewhere.

If you want to run the code that I’ve linked here, you will need to compile it with a compiler such as gfortran (for example, gfortran bisection.f90 -o bisection, assuming you have saved the bisection program as bisection.f90) or build it in an IDE such as Code::Blocks. Alternatively, with slight modification, it is possible to execute Fortran code from Python.

Any code and images that have not been cited are my own original work. I give permission to download, run, and modify the code I have included here — in fact, I encourage it — but I cannot promise that it will work perfectly for everyone.