What is the Significance of the Kolmogorov Axioms for Mathematical Probability?
It is widely said that the Kolmogorov axioms provide the standard mathematical formalization of Probability (capitalized, to mean the discipline). This is true but not very informative to a non-mathematical reader, so let me explain its significance.
Historical background.
Around 1900 the axiomatic approach to mathematics had spread well beyond its classical setting of Euclidean geometry, and the particular question of how to axiomatize Probability was highlighted as part of Hilbert’s sixth problem:
Mathematical Treatment of the Axioms of Physics. The investigations on the foundations of geometry suggest the problem: To treat in the same manner, by means of axioms, those physical sciences in which already today mathematics plays an important part; in the first rank are the theory of probabilities and mechanics.
While Probability certainly involves an extra conceptual idea, beyond the rest of Mathematics, the issue was whether it also required some new technical ingredient to be added to the rest of Mathematics. Kolmogorov’s achievement, in 1933, was the realization that it did not.
Measure theory had recently been developed to resolve the technical conflict between the intuitive idea “every region in the plane has some area” and the axioms of set theory, which deal with every subset of an uncountable set. This conflict has no conceptual connection with Probability, but Kolmogorov realized that the technical machinery involved in its resolution (measures, measurable sets and measurable functions) could be reused as an axiomatic setting for Probability. In retrospect, because one special model within Probability is “pick a uniform random point from the unit square”, it is clear that any general theory of Probability has to include measure theory, but (to reiterate) Kolmogorov’s achievement was the realization that at the technical level it did not require anything more.
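For reference, the axioms themselves fit in a few lines. The statement below is the standard modern formulation, not Kolmogorov’s original wording; the symbols $\Omega$, $\mathcal{F}$ and $P$ are the conventional ones and are introduced here only for illustration.

```latex
A probability space is a triple $(\Omega, \mathcal{F}, P)$, where $\Omega$ is a set,
$\mathcal{F}$ is a $\sigma$-algebra of subsets of $\Omega$ (the measurable sets, or events),
and $P : \mathcal{F} \to [0,1]$ satisfies
\begin{align*}
  & P(A) \ge 0 \quad \text{for every } A \in \mathcal{F}, \\
  & P(\Omega) = 1, \\
  & P\Bigl(\bigcup_{i=1}^{\infty} A_i\Bigr) = \sum_{i=1}^{\infty} P(A_i)
    \quad \text{for pairwise disjoint } A_1, A_2, \ldots \in \mathcal{F}.
\end{align*}
A random variable is a measurable function $X : \Omega \to \mathbb{R}$.
```

In other words, a probability is a measure of total mass one, and a random variable is a measurable function; nothing beyond measure theory is needed at the technical level.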
With agreed axioms, 20th-century mathematicians happily moved on to the systematic development of theorem-proof Probability. The solid connection to the rest of theorem-proof Mathematics enabled researchers to use tools from other fields of Mathematics, particularly in rigorous proofs of limit theorems. More prosaically, it is helpful to have coherent notation covering both discrete and continuous probability distributions, in contrast to the separate elementary notions of probability mass function and probability density function.
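To illustrate the point about notation, here is a sketch of how a single formula for an expectation covers both elementary cases. The symbols $\mu$, $p$, $f$ and $g$ are conventional choices made for this illustration.

```latex
Let $\mu(B) = P(X \in B)$ denote the distribution of a random variable $X$. Then
\[
  E[g(X)] = \int g(x)\,\mu(dx).
\]
% Discrete case: \mu puts mass p(x) on each of countably many points x.
If $X$ has probability mass function $p$, this reads
\[
  E[g(X)] = \sum_{x} g(x)\,p(x).
\]
% Continuous case: \mu has density f with respect to Lebesgue measure.
If $X$ has probability density function $f$, this reads
\[
  E[g(X)] = \int_{-\infty}^{\infty} g(x)\,f(x)\,dx.
\]
```

The same formula also handles distributions that are neither purely discrete nor purely continuous, which the two elementary notions cannot express.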
Within Mathematics, the Kolmogorov axioms have provided an agreed notion of what constitutes a completely specified probability model, within which questions have unambiguous answers. This has eliminated many “paradoxes” such as Bertrand’s paradox, which merely reflects an ambiguously defined model.
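A small simulation makes the point concrete. In its usual formulation, Bertrand’s paradox asks for the probability that a “random chord” of a unit circle is longer than the side of the inscribed equilateral triangle. The sketch below, in Python, implements three different ways of specifying “random chord”; the function names are hypothetical and chosen only for this illustration. Each is a completely specified model, and each gives its own unambiguous answer.

```python
import math
import random

# Side length of the equilateral triangle inscribed in the unit circle.
SIDE = math.sqrt(3.0)


def chord_random_endpoints(rng):
    """Model 1: two independent uniform points on the circumference."""
    a = rng.uniform(0.0, 2.0 * math.pi)
    b = rng.uniform(0.0, 2.0 * math.pi)
    return 2.0 * abs(math.sin((a - b) / 2.0))


def chord_random_radius(rng):
    """Model 2: midpoint at a uniform distance along a radius."""
    d = rng.uniform(0.0, 1.0)
    return 2.0 * math.sqrt(1.0 - d * d)


def chord_random_midpoint(rng):
    """Model 3: midpoint uniform over the area of the disc."""
    # For a uniform point in the disc, the distance to the centre is sqrt(U).
    d = math.sqrt(rng.uniform(0.0, 1.0))
    return 2.0 * math.sqrt(1.0 - d * d)


def estimate(sampler, n=200_000, seed=0):
    """Monte Carlo estimate of P(chord length > SIDE) under one model."""
    rng = random.Random(seed)
    hits = sum(sampler(rng) > SIDE for _ in range(n))
    return hits / n


if __name__ == "__main__":
    for sampler in (chord_random_endpoints,
                    chord_random_radius,
                    chord_random_midpoint):
        print(sampler.__name__, round(estimate(sampler), 3))
```

The three estimates come out close to 1/3, 1/2 and 1/4 respectively, which are the exact answers under the three models. There is no contradiction: the question only appears paradoxical when the model is left unspecified.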
And in the real world?
In the real world we use words such as likely and unlikely in settings of uncertainty. To me, the fundamental question is
In what real world contexts is it both practical and useful to attempt to estimate numerical probabilities?
and curiously this is seldom addressed in either the philosophical or scientific worlds. Mathematics tells us how to manipulate assumed numerical probabilities, but the axioms essentially only enforce consistency conditions, such as the requirement that the probabilities of all alternative outcomes sum to one. And in the usual real world case of finitely many outcomes, this was all well understood before Kolmogorov.