Probability Theory

What is the Probabilistic Method?

On proving that things exist without constructing them

Matthew Ward

24 Aug 2020 — 7 min read

The Probabilistic Method is probably best known as a book by Noga Alon and Joel H. Spencer released in 1990. I was first introduced to it around 2007, and it’s been a source of delight to return to from time to time.

The technique known as the probabilistic method is mostly due to Paul Erdős, one of the most prolific mathematicians of all time. The idea is very clever, and it doesn’t seem like it should work as well as it does. It has found success in standard probability, graph theory, real analysis, game theory, geometry, and more.

One of the most surprising results to me in the book is a fully probabilistic proof of the classic Weierstrass Approximation Theorem, which says that any continuous function on [0, 1] can be (uniformly) approximated by a polynomial.

It’s amazing that a probabilistic proof could exist for such a fundamentally non-probabilistic theorem about continuous functions and polynomials. And that just goes to show how much power this method can have.

What Is It?

I’ve delayed this long enough. So, what is the probabilistic method? It’s not so easy to give a simple formulation, since it encompasses several approaches.

Here’s the main idea.

Suppose you have a bag of objects. You don’t know exactly what’s in there, but you manage to prove that the probability of one of them being a red ball is positive.

You’ve now just proved that a red ball exists without ever producing one! This is because if no red balls existed, then the probability of one being in the bag would be 0, not positive.

This example sounds silly, but it’s vital to understand the concept before moving on. Once we apply it to a more complicated situation, it will be easy to lose track of the underlying form of the argument.

Another technique classified as the probabilistic method is to show that if a random variable achieves a value lower than it’s expected value, then it also must achieve a higher value (and vice-versa).

For example, if you get an 88% overall average in a class, and someone finds an assignment for which you got a 75%, they could conclude you also had an assignment where you got higher than 88% without ever needing to produce a copy of that assignment. This is just how averages work.

Both of these methods prove certain objects exist without ever constructing them.

This idea of non-constructive existence proofs is essential in math.

There are many things we know exist but have a very hard time producing. For example, Borel proved in 1909 that almost all numbers are normal (if you pick a number at random, it will be normal with probability 1). But we can’t easily produce them. We can’t even prove e, π, ln(2), or really any commonly used numbers in math are normal.

Let’s now move on to a basic application of the probabilistic method to see it in action.

Sum-Free Sets

The result we’ll prove with the probabilistic method is due to a 1965 result of Erdős. A form of it is presented in the previously mentioned book of Alon and Spencer.

A sum-free set is a set of nonzero integers such that you can’t add two elements of the set to get a third. The definition and proof below work for any set of nonzero integers, but I’ll make the simplification of only using positive integers for this article to avoid some technical points.

Examples:

{1, 2, 3} is not sum-free, because 1+2=3 and also 1+1=2.
{1, 3} is sum-free.

Here’s the motivation for the result. Given a finite set B= {b₁, …, bₙ}, is there always a “large” sum-free subset A?

There are always small ones because you can just take A={a₁}. You can sometimes find large ones. Consider the above examples. B={1, 2, 3} is not sum-free, but A={1, 3} is 2/3 of the original set and is sum-free.

Theorem: Every finite set B={b₁, …, bₙ} of n (positive) integers contains a sum-free subset A of size at least 1/3 of B.

This was kind of surprising to me when I first read it because it seems like you should be able to make some sort of “maximally summable” set by starting small and then only adding in previous sums.

For example, B={1, 2, 3, 5, 8, 13}. Of course, we can again just take A={1, 3} which is 1/3 the size of B and sum-free. Or even A={1, 3, 5} which is 1/2 the size of B.

Here’s the cool thing. Even though we can prove such an A exists, the proof is probabilistic. There’s no way to construct A from B in general. This should make sense if you want to play around with it. B could have any numbers in it, so how could you ever do a process that will work in general to construct A?

The Ideas

As usual, this is aimed at a somewhat general math audience. There’s no way around using what will probably be unfamiliar theorems. We’re after the main idea of how to use the probabilistic method for proving something like this.

To make things easier, I’ll stop referring to “the size of” a set and use the common notation |S| to mean the number of elements of S.

Idea 1: Given any N, we can make a set of consecutive integers of size N that is sum-free.

If N=4, we could use {10, 11, 12, 13}. Here’s the trick to notice. If the first number doubled is larger than the last number, then the whole set must be sum-free because that’s the smallest any two numbers can be when added together.

Convince yourself of that, because this idea is fundamental to the proof.

Fact 1: There are infinitely many primes of the form 3k+2.

This is a special case of Dirichlet’s Theorem, but that’s far beyond the scope of this article.

The Proof of the Theorem: Let B={b₁, …, bₙ} be a set of n positive integers. We’ll assume they are in increasing order since that doesn’t change anything we’re trying to prove.

Recall that we want to prove a sum-free subset A exists such that |A|>(1/3)n.

Choose any prime of the form p=3k+2, where p>2bₙ. We can do this by Fact 1.

Let C={k+1, k+2, …, 2k+1}.

Fact 2: C is sum-free by the same trick we used in Idea 1 (double the first integer is larger than the last one).

This is pretty abstract, so let’s think about what we’ve done in a concrete example. If B={1, 2, 3, 4}, then the smallest prime we could choose is p=11 and that makes k=3.

This makes C={4, 5, 6, 7}. Here’s the motivation for doing this: it is slightly more than the “middle third” when listing all the numbers up to p:

{1, 2, 3} {4, 5, 6, 7} {8, 9, 10, 11}

Now the idea is to consider multiplying each element of B (and wrapping around using modular arithmetic) by 1, …, p until we get enough to land in that middle third.

For example:

1 x B = {1, 2, 3, 4}. Only 4 landed in C.
2 x B = {2, 4, 6, 8}. We did it: 4 and 6 are in C.

If we think about the elements of B that achieved this, they were 2 and 3, and indeed A={2, 3} is a sum-free set of size greater than 2/3.

I’ll show this visually as well:

Notice we can also get a solution multiplying by 4 because 4 x B={4, 8, 1, 5} (when we wrap around 12=1, 13=2, etc). Since 4 and 5 are in C we look at what numbers got sent to those places and find A={1, 4} is another sum-free subset of the right size.

This is a general fact that we’ll formalize now.

Fact 3: Let x be an integer. If A= {xbᵢ mod p} are distinct numbers in C, then A={bᵢ} is sum-free.

For the sake of contradiction, suppose A is not sum-free. Then there exists a,b,c ∈ A such that a+b=c. But now just multiply the equation by x. This means xa+xb=xc. If we reduce everything mod p, then we find that C is not square-free, a contradiction by Fact 2.

Now that we have Fact 3, the entire theorem will be proven in general if we can somehow prove there must exist an x that multiplies enough elements of B to distinct elements of C.

This is where the probabilistic method comes in. There is no algorithmic way to produce the x that will work. All we know is that it must exist by using probability.

The trick is to let x be a random variable uniformly over {1, 2, …, p}. The probability P(xbᵢ ∈ C)=|C|/(p-1)>1/3 just by plugging in for n and k. This means the expected amount E(xbᵢ ∈ C)>n/3.

By the trick mentioned in the “What is It?” section, this means there must exist an x so that at least n/3 of the xbᵢ are in C. By Fact 3, we take these bᵢ to form A.

And that’s an example of the probabilistic method at work! We can abstractly argue such a subset exists by a probabilistic method without needing to construct the set.

What is the Probabilistic Method?

Matthew Ward

What Is It?

Sum-Free Sets

The Ideas

Read more

It Took Me 10 Years to Understand Entropy, Here is What I Learned.

Why Probability Theory is Hard

Better-Than-Random Decisions in Uninformed Settings

Kronecker, God and the Integers