The Axiom of Choice

The Axiom of Choice (AC) was formulated about a century ago, and it was controversial for a few decades after that; it may be considered the last great controversy of mathematics. It is now a basic assumption used in many parts of mathematics. In fact, assuming AC is equivalent to assuming any of these principles (and many others):

1. [from Set Theory]: Given any two sets, one set has cardinality [i.e., the number of elements in the set] less than or equal to that of the other set.

2. [from Linear Algebra]: Any vector space V over a field F has a basis - i.e., a maximal linearly independent [spanning] subset - over that field.

3. [from Topology - Tychonoff's Theorem]: Any [direct] product of compact topological spaces is compact.

4. [from Real Analysis]: There exists at least one set which is not Lebesgue measurable.

5. [from Number Theory - The Well-Ordering Principle]: Every non-empty subset of N (i.e., the set of all positive integers) has a least element.

6. [from Set Theory - The Principle of Mathematical Induction]: To prove the proposition P(n) - which is stated in terms of and depends on the positive integer n - for every positive integer n, it suffices to prove the following two statements:

P(1) is true; and

for every positive integer n, if P(n) is true, then P(n+1) is also true.

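[note: I added #4, #5, and #6 to this list myself. Strictly speaking, these three are not equivalent to AC: #4 follows from AC but does not imply it, and #5 and #6 can be proved without AC.]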

AC has many forms; here is one of the simplest:

The Axiom of Choice: Let C be a collection of nonempty sets. Then we can choose a member from each set in that collection. In other words, there exists a function f defined on C with the property that, for each set S in the collection, f(S) is a member of S.

The function f is then called a choice function.
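
For readers who prefer symbols, the statement above can be written as a single formula. This is only a restatement of the sentence already given; the notation (the union symbol for the set of all members of members of C) is an addition for illustration and is not part of the original wording.

```latex
\[
\forall C \,\Bigl[\, \emptyset \notin C \;\Longrightarrow\;
  \exists f \colon C \to \bigcup C \ \text{ such that } \
  \forall S \in C,\ f(S) \in S \,\Bigr]
\]
```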

To understand this axiom better, let's consider a few examples.

1. If C is the collection of all nonempty subsets of N, then we can define f quite easily: just let f(S) be the smallest member of S.

2. If C is the collection of all intervals of real numbers with positive, finite lengths, then we can define f(S) to be the midpoint of the interval S.

3. If C is some more general collection of subsets of the real line, we may be able to define f by using a more complicated rule.

4. However, if C is the collection of all nonempty subsets of the real line, it is not clear how to find a suitable function f. In fact, no one has ever found a suitable function f for this collection C, and there are convincing model-theoretic arguments that no one ever will. (Of course, to prove this requires a precise definition of “find,” etc.)
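
Examples 1 and 2 above are cases where the choice function is given by an explicit rule, and such rules can be written down as ordinary code. The following is a minimal sketch, assuming sets are represented as finite Python sets and intervals as pairs of endpoints; the function names are hypothetical and chosen only for illustration. Nothing of the kind can be written for example 4, which is exactly the point of that example.

```python
from fractions import Fraction


def choose_least(s):
    """Example 1: pick the smallest member of a nonempty set of positive integers."""
    if not s:
        raise ValueError("a choice function is only defined on nonempty sets")
    return min(s)


def choose_midpoint(interval):
    """Example 2: pick the midpoint of an interval given by its endpoints (a, b)."""
    a, b = interval
    if not a < b:
        raise ValueError("the interval must have positive, finite length")
    # The midpoint lies in the interval whether or not the endpoints are included.
    return (Fraction(a) + Fraction(b)) / 2


# A small (finite) collection C and the resulting choice function as a table.
collection = [frozenset({5, 3, 8}), frozenset({42}), frozenset({10, 2})]
choices = {s: choose_least(s) for s in collection}
```

In each case the defining property f(S) in S can be checked directly, for instance with `all(choices[s] in s for s in collection)`.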

The controversy was over how to interpret the words “choose” and “exists” in the axiom:

- If we follow the constructivists, and “exist” means “find,” then the axiom is false, since we cannot find a choice function for the nonempty subsets of the reals.

- However, most mathematicians give “exists” a much weaker meaning, and they consider the Axiom to be true: To define f(S), just arbitrarily “pick any member” of S.

In effect, when we accept the Axiom of Choice, we are agreeing to the convention that we shall permit ourselves to use a choice function f in proofs, as though it “exists” in some sense, even though we cannot give an explicit example of it or an explicit algorithm for it.

The “existence” of f - or of any mathematical object, even the number “3” - is purely formal. It does not have the same kind of solidity as your table and your chair; it merely exists in the mental universe of mathematics. Many different mathematical universes are possible. When we accept or reject the Axiom of Choice, we are specifying which universe we shall work in. Both possibilities are feasible - i.e., neither accepting nor rejecting AC yields a contradiction, provided the remaining axioms of set theory are themselves consistent.

However, most “ordinary” mathematicians - i.e., most mathematicians who are not logicians or set theorists - accept the Axiom of Choice chiefly because their work is simpler with the Axiom of Choice than without it.

A few pure mathematicians and many applied mathematicians (including, e.g., some mathematical physicists) are uncomfortable with the Axiom of Choice. Although AC simplifies some parts of mathematics, it also yields some results that are unrelated to, or perhaps even contrary to, everyday “ordinary” experience; it implies the existence of some rather bizarre, counterintuitive objects.

Perhaps the most bizarre is the Banach-Tarski Paradoxical Decomposition. Banach and Tarski used the Axiom of Choice to prove that it is possible to take the 3-dimensional closed unit ball B [that is, the surface and interior of the sphere with radius 1], partition it into finitely many pieces, move those pieces by rigid motions (i.e., rotations and translations, with pieces permitted to move through one another), and reassemble them to form two copies of B.

At first glance, the Banach-Tarski Decomposition seems to contradict some of our intuition about physics - e.g., the Law of Conservation of Mass, from classical Newtonian physics. Consequently, the Decomposition is often called the Banach-Tarski Paradox. But actually, it only yields a complication, not a contradiction.

If we assume a uniform density, only a set with a defined volume can have a defined mass. The notion of “volume” can be defined for many subsets of R^3 [i.e., 3-dimensional real space], and beginners might expect the notion to apply to all subsets of R^3, but it does not.

More precisely, Lebesgue measure is defined on some subsets of R^3, but it cannot be extended to all subsets of R^3 in a fashion that preserves two of its most important properties: the measure of the union of two disjoint sets is the sum of their measures, and measure is unchanged under translation and rotation. Thus, the Banach-Tarski Paradox does not violate the Law of Conservation of Mass; it merely tells us that the notion of “volume” is more complicated than we might have expected.
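
The reasoning behind this can be spelled out in a few lines. The sketch below uses symbols that are not in the text above and are introduced only for illustration: mu for a hypothetical "volume" defined on every subset, A_1, ..., A_n for the pieces of the decomposition, g_1, ..., g_n for the rigid motions applied to them, and B' for a translated copy of B disjoint from B.

```latex
% The two properties named in the text:
\[
\mu(A \cup A') = \mu(A) + \mu(A') \quad \text{whenever } A \cap A' = \emptyset,
\qquad
\mu(g(A)) = \mu(A) \quad \text{for every rigid motion } g .
\]
% The Banach-Tarski decomposition:
\[
B = A_1 \sqcup \dots \sqcup A_n ,
\qquad
g_1(A_1) \sqcup \dots \sqcup g_n(A_n) = B \sqcup B' .
\]
% If \mu were defined on every piece, both properties together would give
\[
\mu(B) = \sum_{i=1}^{n} \mu(A_i) = \sum_{i=1}^{n} \mu\bigl(g_i(A_i)\bigr)
       = \mu(B) + \mu(B') = 2\,\mu(B),
\]
% which is impossible, since \mu(B) = 4\pi/3 is neither 0 nor infinite.
```

So at least one of the pieces must be a set to which no volume can be assigned, and that is why no mass is created.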

Bertrand Russell (more famous for his work in philosophy and political activism, but also an accomplished mathematician) once said:

To choose one sock from each of infinitely many pairs of socks requires the Axiom of Choice, but for shoes the Axiom is not needed.

The idea is that the two socks in a pair are identical in appearance, and so we must make an arbitrary choice if we wish to choose one of them. For shoes, we can use an explicit algorithm - e.g., “always choose the left shoe.”
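
The shoe half of the remark can be illustrated in code, because a single explicit rule handles all of the pairs at once. The following is a minimal sketch, assuming each pair of shoes is modelled as a tuple (left, right); the generator `shoe_pairs` and the labels it produces are hypothetical and exist only for this illustration.

```python
from itertools import count, islice


def shoe_pairs():
    """Lazily yield an unending sequence of hypothetical pairs (left, right)."""
    for n in count(1):
        yield (("left", n), ("right", n))


def choose_left(pair):
    """The explicit rule: always choose the left shoe."""
    left, _right = pair
    return left


def chosen_shoes():
    """Apply the same rule to every pair; no arbitrary decisions are needed."""
    return (choose_left(pair) for pair in shoe_pairs())


# Inspect the first few choices; the rule is defined for every pair.
first_three = list(islice(chosen_shoes(), 3))
```

No comparable sketch is offered for socks. Any representation of a pair in a program already labels or orders its two members, which supplies exactly the distinguishing information that the socks in the remark are supposed to lack.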

Why does Russell's statement mention infinitely many pairs? Well, if we only have finitely many pairs of socks, then AC is not needed - we can choose one member of each pair using the definition of “nonempty,” and we can repeat an operation finitely many times using the rules of formal logic.
