Measuring II: The limits of measurability

Last week, we considered the problem of measuring all subsets of a given finite or countably infinite set. Today, we will investigate the problem of measuring subsets of an uncountable set. In fact, we will focus on a particular uncountable set, which happens to be one of the most important sets in all of mathematics: the set of real numbers, \mathbb{R}.

There are certain subsets of \mathbb{R} which we have a very natural and well-behaved method for measuring: the intervals. Recall that, if a \leq b are real numbers, then:

  • (a,b) is the set of all real numbers x such that a < x < b. Such a set is called an open interval.
  • [a,b] is the set of all real numbers x such that a \leq x \leq b. Such a set is called a closed interval.
  • In either case, a and b are said to be the endpoints of the interval.

The most natural way to measure an interval is simply to subtract the lower endpoint from the upper one, i.e., if a \leq b, then m((a,b)) = m([a,b]) = b-a. Note that, as a result of this, any one-element subset of \mathbb{R} is assigned a measure of 0, since, if a \in \mathbb{R}, then m(\{a\}) = m([a,a]) = a-a = 0.

There’s no reason to stop at intervals, though. Consider a set A that consists of two disjoint intervals, i.e. intervals I_0 and I_1 such that I_0 \cap I_1 = \emptyset. The natural measure to assign to A is, of course, m(I_0) + m(I_1). Extrapolating, if n \geq 2 is a natural number and A consists of n pairwise disjoint intervals, i.e. intervals I_0, I_1, \ldots I_{n-1} such that, for all i < j < n, I_i \cap I_j = \emptyset, then we should set m(A) = m(I_0) + m(I_1) + \ldots + m(I_{n-1}).

There’s no reason to stop at finite unions of intervals, either. As those of you who have taken calculus courses will know, there is a well-developed theory of infinite sums of positive real numbers, so it makes sense to measure sets that are unions of countably infinitely many pairwise disjoint intervals in the same way. More precisely, if A is the union of countably many pairwise disjoint intervals I_0, I_1, \ldots , then we can let \displaystyle m(A) = m(I_0) + m(I_1) + \ldots = \sum_{k=0}^\infty m(I_k).

Note, in particular, that if A is a countable subset of \mathbb{R}, then m(A) = 0. This is because A can be considered to be a countable union of pairwise disjoint intervals, each consisting of precisely one element. Since one-element sets are given a measure of 0, we then have m(A) = 0 + 0 + \ldots = 0.

Let \mathcal{X} be the collection of all subsets of \mathbb{R} that can be written as countable unions of pairwise disjoint intervals. In other words, \mathcal{X} consists of precisely those sets to which we have assigned a measure. (Those of you with some prior knowledge might object here that, as I’ve made it this far, I may as well just keep going and define the measure on the collection of all Borel sets rather than this clearly unsatisfactory collection \mathcal{X}. You might be right.) This measure, m, that we have constructed has some very nice properties. We list some of them here.

  1. For all A \in \mathcal{X}, 0 \leq m(A) \leq \infty.
  2. m(\emptyset) = 0 and m(\mathbb{R}) = \infty.
  3. For any a \leq b, m((a,b)) = m([a,b]) = b-a.
  4. (Monotonicity) If A,B \in \mathcal{X} and A \subseteq B, then m(A) \leq m(B).
  5. (Countable additivity) If \{A_n \mid n \in \mathbb{N}\} is a collection of pairwise disjoint elements of \mathcal{X}, then \displaystyle m(\bigcup_{n \in \mathbb{N}}A_n) = \sum_{n \in \mathbb{N}} A_n.
  6. (Shift invariance) If A \in \mathcal{X} and r \in \mathbb{R}, then, if A + r is the set we get by “shifting” A over by r, i.e. A+r = \{a+r \mid a \in A\}, then m(A+r) = m(A).

Take a minute to think about these properties. I hope you will agree with me that they are all natural properties that one would like to require of a reasonable measure of subsets of \mathbb{R}.

There is a small problem, though. Our measure, m, does not measure all subsets of \mathbb{R}, only those in \mathcal{X}. And there are many subsets of \mathbb{R} that are not in \mathcal{X}. One can see this just by counting: there are only 2^{\aleph_0} members of \mathcal{X}, whereas the number of all subsets of \mathbb{R} is 2^{2^{\aleph_0}}, which is strictly larger than 2^{\aleph_0}. Compared with the vast expanses of the full power set of \mathbb{R}, our set \mathcal{X} looks positively puny.

So this raises a question: Can we extend our measure m so that it measures all subsets of \mathbb{R} while still maintaining the nice properties 1-6 listed above? Interestingly, the answer, discovered by Italian mathematician Giuseppe Vitali in the early 20th century, is no.

(You may be initially disappointed by this answer, but I would argue that this makes life much more interesting. Vitali’s discovery came in an era of great upheaval and progress in mathematics, and his discovery, along with others of the time, laid the groundwork for some of the most important and fascinating achievements of the 20th century.)

Let us reconstruct Vitali’s argument. Suppose for the sake of a contradiction that we have a measure m satisfying properties 1-6 that measures all subsets of \mathbb{R}. First, we will define an equivalence relation, \sim, on the interval (0,1) by setting r \sim s if and only if r - s is rational. I’ll leave it to the reader to check that this is an equivalence relation, recalling that an equivalence relation must satisfy the following requirements:

  1. (Reflexivity) For every r, we have r \sim r.
  2. (Symmetry) For every r and s, if r \sim s, then s \sim r.
  3. (Transitivity) For every r, s, and t, if r \sim s and s \sim t, then r \sim t.

For each r \in (0,1), let [r] denote the equivalence class of r, i.e. the set \{s \in (0,1) \mid r \sim s\}. Since \sim is an equivalence relation, the equivalence classes partition \mathbb{R} into pairwise disjoint sets. In particular:

  • For every r \in (0,1), r \in [r].
  • For every r and s in (0,1), either [r] = [s] or [r] \cap [s] = \emptyset.

Let \mathcal{C} be the set of equivalence classes, and pick a choice function on \mathcal{C}, i.e. a function F:\mathcal{C} \rightarrow (0,1) such that, for every D \in \mathcal{C}, F(D) \in D. Let A = \{F(D) \mid D \in \mathcal{C}\} and note that, for every D \in \mathcal{C}, |A \cap D| = 1. If q \in (-1,1) is a rational number, let A + q denote the set A shifted by q, i.e. the set \{r + q \mid r \in A\}.

Claim 1: If q_0 and q_1 are distinct rational numbers in (-1,1), then (A + q_0) \cap (A + q_1) = \emptyset.

Proof: Suppose for the sake of a contradiction that there is a real number s in (A + q_0) \cap (A + q_1). Then there are r_0, r_1 \in A such that r_1 + q_1 = s = r_0 + q_0. But then r_1 - r_0 = q_0 - q_1. Since q_0 and q_1 are rational, q_0 - q_1 is rational, so r_1 \sim r_0. Moreover, since q_0 and q_1 are distinct, q_0 - q_1 \neq 0, so r_0 \neq r_1. But then A contains at least two elements of [r_0] (namely, r_0 and r_1), contradicting the fact that A intersects each equivalence class in only one element and finishing the proof of the claim.

Claim 2: Let B = \bigcup_{q \in \mathbb{Q} \cap (-1,1)}( A + q). Then (0,1) \subseteq B \subseteq (-1,2).

Proof: We first show (0,1) \subseteq B. To this end, fix s \in (0,1). By construction, we can find a real number r in A \cap [s]. Let q = s-r. Since s\sim r, q is rational. Since r,s \in (0,1), we must have q \in (-1,1). But then s \in A + q and A + q \subseteq B, so s \in B. Since s was arbitrary, we have (0,1) \subseteq B.

To see that B \subseteq (-1,2), fix s \in B. There is r \in A and q \in \mathbb{Q} \cap (-1,1) such that s = r+q. Since r \in (0,1), this means -1 < s < 2. Thus, B \subseteq (-1,2). This ends the proof of the claim.

Since m measures every subset of \mathbb{R}, it must measure A + q for every q \in \mathbb{Q} \cap (-1,1) and B. By shift invariance of m (requirement 6), it must be the case that m(A+q) = m(A) for every q \in \mathbb{Q} \cap (-1,1). By countable additivity of m (requirement 5) and Claim 1, we must have m(B) = \sum_{q \in \mathbb{Q} \cap (-1,1)}m(A+q). By Claim 2 and monotonicity of m (requirement 4), we must have 1 \leq m(B) \leq 3.

But these things together yield a contradiction! Let x = m(A). Then m(B) = \sum_{q \in \mathbb{Q} \cap (-1,1)} x = x + x + x + \ldots. If  x = 0, this sum is 0, so m(B) = 0. If x > 0, this sum is infinite, so m(B) = \infty. In either case, m(B) is not in the interval [1,3], as it is supposed to be. Therefore, we have identified a set (namely, A), that cannot be measured by m, which is a contradiction to our assumptions.

Closing remark: Getting our set A required an application of the axiom of choice, and it turns out that this use of choice is necessary to find a non-measurable set. In one of the great achievements of modern set theory, Robert Solovay proved in 1970 that it is consistent with ZF (i.e. all axioms of set theory except the axiom of choice) that there is a measure satisfying requirements 1-6 and measuring all subsets of the reals. In fact, the measure that works is the one that has become the canonical measure on the reals: the Lebesgue measure, introduced by Henri Lebesgue in 1901 and the subject of the next installment of this series.

Advertisements

One thought on “Measuring II: The limits of measurability

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s