Survivorship Bias: The Mathematician Who Helped Win WWII

The legend of Abraham Wald

Photo by Museums Victoria on Unsplash

Many of you may have heard the story of Abraham Wald. It is a story the Internet loves, about a mathematician who explains to the military how wrong their analysis was.

The only problem is, there is no proof that it actually happened.

So here is the true story of Abraham Wald, along with his legend and the real contribution he made to the military, which helped, if not to win the war, at least to save some of the planes and pilots that fought in the early 1940s.

The Life of Abraham Wald

Abraham Wald was a mathematician born on 31 October 1902 in Kolozsvar, Hungary (the city is now Cluj, in Romania). He was born into a Jewish family, and this made his schooling very difficult. At the time, Hungarian schools required their students to attend lessons on Saturday. Wald’s family couldn’t allow their son to go to school on the Jewish Sabbath, so young Abraham Wald didn’t attend primary or secondary school. He was taught at home by various members of his family.

After World War I, he studied at Cluj University, but his life there was never easy because he was Jewish. However, his incredible mathematical abilities led him to the University of Vienna. There, he studied geometry under the supervision of Karl Menger and obtained his doctorate in 1931. Finding an academic position was impossible for a young Jewish mathematician in Vienna in the 1930s. This is why he became the mathematics tutor of Karl Schlesinger, a successful Austrian banker. This position gave him financial stability and the possibility of pursuing his research in geometry. In six years he published 21 papers of fundamental importance. Schlesinger's interests led Wald to explore the field of economics, and he also published 10 papers on the subject.

In 1938 the Nazis invaded Austria, and for a Jew like Abraham Wald living conditions became impossible. He was invited to the US to do econometric research and left Austria that summer. This saved his life: only one of his family members survived the Nazi occupation. He became a Fellow of the Carnegie Corporation and studied statistics at Columbia University. He started lecturing in 1939 and was appointed to the faculty of Columbia University, a position he held until his death in a plane crash in 1950.

During WWII he was also a member of the Statistical Research Group (SRG) at Columbia University, where he applied his statistical knowledge to military problems. This is where his legend begins.

Survivorship Bias And The Myth of Abraham Wald

It is 1943, and American bombers are suffering heavy losses to the German air defenses. The military needs to minimize those losses and asks for the help of the SRG at Columbia University, where a Jewish Austro-Hungarian refugee works.

His name is Abraham Wald, and his task is to work out how the planes should be armored. He is given a lot of data on aircraft damage, including the location of the damage suffered by the planes that were hit by the Nazis.

Location of the hits in the aircraft. McGeddon, CC BY-SA 4.0, via Wikimedia Commons

The military expected Wald to give them some suggestions on how to reinforce the spots of the planes that received the most hits by the German defenses.

“Wait,” said Wald, “What you should do is reinforce the area around the engines and the cockpit. You should remember that the worst-hit planes never come back. All the data we have comes from planes that make it back to base. The spots with no damage are the worst places to be hit, because the planes hit there never come back.”

This intuition is absolutely correct, and the error it exposes is called survivorship bias. The planes that were hit in the engine area, or the ones that lost their pilots, never came back and were not part of the data. This is why the military had to concentrate on reinforcing exactly those areas.

Survivorship bias is the logical error made when we concentrate on the things that made it past a certain selection process while overlooking those that didn’t. It is a form of selection bias that can lead to the wrong conclusion when analyzing any data.
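The effect is easy to reproduce in a small simulation. The sketch below is only an illustration: the four areas and their per-hit lethality values are made-up numbers, not Wald's estimates, and every plane is assumed to take exactly three hits spread evenly over the areas. It compares where hits land on all planes with where they are seen on the planes that return.

```python
import random

# Hypothetical probability that a single hit in each area downs the plane.
# Illustrative numbers only, not Wald's estimates.
LETHALITY = {"engine": 0.40, "cockpit": 0.30, "fuselage": 0.05, "wings": 0.05}
AREAS = list(LETHALITY)

def simulate(n_planes=20000, hits_per_plane=3, seed=0):
    """Count hits per area on all planes and on the survivors only."""
    rng = random.Random(seed)
    all_hits = {area: 0 for area in AREAS}       # hits on every plane
    survivor_hits = {area: 0 for area in AREAS}  # hits visible back at base
    for _ in range(n_planes):
        hits = [rng.choice(AREAS) for _ in range(hits_per_plane)]
        # The plane returns only if it survives every one of its hits.
        survived = all(rng.random() >= LETHALITY[h] for h in hits)
        for h in hits:
            all_hits[h] += 1
            if survived:
                survivor_hits[h] += 1
    return all_hits, survivor_hits

def shares(counts):
    """Turn raw hit counts into the share of hits per area."""
    total = sum(counts.values())
    return {area: counts[area] / total for area in counts}

all_hits, survivor_hits = simulate()
true_share = shares(all_hits)       # where the hits really landed
seen_share = shares(survivor_hits)  # what the returning planes show

for area in AREAS:
    print(f"{area:9s} all planes: {true_share[area]:.2f}  survivors: {seen_share[area]:.2f}")
```

Under these assumptions the hits really are spread evenly, yet the survivors show noticeably fewer engine and cockpit hits than fuselage and wing hits. Someone looking only at the returning planes would conclude that the engine is rarely hit, which is exactly the wrong conclusion.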

It also shows up in economics, typically in performance studies. There, survivorship bias is the tendency to concentrate all the attention on the companies that were successful while forgetting about all the companies that failed in the same period.

If the concept is sound, and it does apply to the damaged planes, you may be wondering why I told you the story was a myth.

Because this concept was already known to the military of most countries. Even the Italian military was aware of it (and they weren’t the most prepared army in the war). On top of that, Wald never mentions it in the report he gave the military, which is now public.

So this story should be considered fiction, or at least a nice reconstruction. The Internet loves it: if you google “Abraham Wald plane” you can find many slightly different versions of the same story. Why? Because people love a story where “a mathematical genius teaches the army how it's done”.

And I loved this story too when I first heard it not so long ago. I did some research, however, and found out that it was not completely true. I also discovered part of the real work Abraham Wald did for the military.

The True Contribution of Wald

The first thing Wald was asked to do was to calculate the probability that a plane survives a certain number of hits.

The first simplification he made was to assume that the planes were not downed by mechanical failures, but only by hits from enemy fire.

First, we take a look at the data he used in the example:

Data used by Abraham Wald. A Method of Estimating Plane Vulnerability Based on Damage of Survivors page 8

Here N is the number of planes that were on the mission, and A₀, A₁, … are the numbers of planes that returned with 0, 1, … hits. This means that to find aᵢ we have to do:

aᵢ = Aᵢ / N

This is the fraction of the planes on the mission that returned with exactly i hits.

Now, Wald made another assumption. He assumed that the probability that a plane was shot down didn’t depend on the number of previous hits.

Let qᵢ be the probability that a plane survives the i-th hit, knowing that the previous hits didn’t down it. With Wald's assumption, q₁ = q₂ = … = q₅ = q, and we can get the equations:
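The two equations can be reconstructed from the definitions in the text. What follows is a sketch, assuming (as the simplifications above imply) that planes are lost only to enemy fire and that no plane takes more than n hits, with n = 5 in the example. Dividing the fraction aⱼ of returning planes by the probability of surviving j hits estimates the fraction of planes that were hit j times, and all of those together must account for every plane hit at least once, which is 1 - a₀.

```latex
% General form: one equation, unknowns q_1, ..., q_n
\sum_{j=1}^{n} \frac{a_j}{q_1 q_2 \cdots q_j} = 1 - a_0

% With q_1 = q_2 = \dots = q_n = q: one equation, one unknown
\sum_{j=1}^{n} \frac{a_j}{q^{j}} = 1 - a_0
```

For n = 1 this reduces to a₁ / q₁ = 1 - a₀, which simply says that a fraction q₁ of the planes that were hit came back.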

The first one is a single equation with many unknowns. To reduce it, Wald assumed that previous hits did not weaken the plane, so q is the same for every hit, and this gives the second equation. Now we can go ahead and substitute the values of aⱼ:

This equation tells us that q is the root of a fairly simple polynomial equation. From the graph below we can see that q is approximately 0.85. Solving the equation numerically gives a more accurate value of q = 0.851.

This means that the probability of a plane being downed by the i-th hit, knowing that the previous hits didn’t down it, is pᵢ = 1 - q = 0.149.
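The root can also be found numerically in a few lines. The sketch below solves a₁/q + a₂/q² + … + a₅/q⁵ = 1 - a₀ by bisection. The counts N = 400 and A = 320, 32, 20, 4, 2, 2 are not given in the text above: they are assumed to match the numerical example in Wald's memorandum, so check them against the report before relying on them.

```python
# Counts of returning planes by number of hits.
# ASSUMPTION: values taken to match the example in Wald's memorandum.
N = 400                      # planes sent on the mission
A = [320, 32, 20, 4, 2, 2]   # A[i] = planes that returned with i hits

a = [A_i / N for A_i in A]   # a[i] = A[i] / N

def f(q):
    """Left side minus right side of sum_j a[j] / q**j = 1 - a[0]."""
    return sum(a[j] / q**j for j in range(1, len(a))) - (1 - a[0])

def solve_q(lo=0.5, hi=1.0, tol=1e-9):
    """Bisection: f decreases in q, with f(lo) > 0 and f(hi) < 0."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid   # root lies to the right of mid
        else:
            hi = mid   # root lies to the left of mid
    return (lo + hi) / 2

q = solve_q()
print(f"q = {q:.3f}, p = {1 - q:.3f}")
```

Bisection is used here only because it needs nothing beyond the standard library; any root finder would do, since the left side of the equation decreases steadily as q grows.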
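As a short worked example of what this value implies: since q is assumed to be the same for every hit, the probability of surviving k hits in a row is simply q multiplied by itself k times.

```latex
p = 1 - q = 1 - 0.851 = 0.149

P(\text{survive } k \text{ hits}) = q^{k},
\qquad\text{e.g. } k = 3:\quad 0.851^{3} \approx 0.616
```

So under this model a plane has roughly a 62% chance of coming back after three hits.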

Wald then went ahead and used the equation he found in other statistical analyses. I will report only the results of two other interesting analyses that Wald made using the same concept as before.

He determined the probability of surviving a single hit taking into consideration the area of the plane that was damaged, and the results are in the picture below:

Table from A Method of Estimating Plane Vulnerability Based on Damage of Survivors page 65

Then he also analyzed the vulnerability of the plane taking into consideration not only the area of the hit but also the type of the bullet used:

Table from A Method of Estimating Plane Vulnerability Based on Damage of Survivors page 89

With this analysis the military was able to work on the armor of the planes, reducing the losses in the air war over Germany.

Reading list