Ultrafilters IV: The Light Bulb

I have measured out my life with adjustments

To my lightbulb contraption and the proliferation of its


-Thomas Stearns Edison

Everyone is of course familiar with famed inventor/poet T.S. Edison (the wizard of Faber and Faber), but what most people don’t know is that he spent the last years of his life building an elaborate light bulb contraption. The device consisted of a single bulb, which could emit blue, green, or red light, connected to a set of switches, each of which had three settings: blue, green, and red. Edison’s set of switches grew throughout his life (unconfirmed rumors suggest that, by the time of his death, the number of switches was greater than the size of the continuum), but, at any given time, there was a deterministic rule that associated each possible switch setting with a specific color for the bulb to emit. Edison tinkered with these rules but required that they all satisfy the following basic criteria:

1) If every switch is pointing to the same color, then that is the color of the bulb.

2) If every switch is changed, then the bulb must change color.

Let us call a rule that satisfies these two criteria an acceptable lightswitch rule. Just as we saw with elections last week, ultrafilters provide an easy way of generating acceptable lightswitch rules. Namely, let S be the set of all switches, and let \mathcal{U} be an ultrafilter on S. For any given configuration of the switches, there is a single color (red, say) such that the set of switches pointing to that color (i.e. red) is in \mathcal{U}. The rule associated with \mathcal{U} (call it the \mathcal{U}-rule) then says that the lightbulb should be red. This rule is easily seen to be an acceptable lightswitch rule.

T.S. Edison was aware of these ultrafilter rules but found them unsatisfactory and tried tirelessly to develop a different sort of acceptable lightswitch rule. Unfortunately, he never succeeded; in fact, we shall prove today that his search was in vain.

Theorem: Every acceptable lightswitch rule is in fact the \mathcal{U}-rule for some ultrafilter \mathcal{U} on S.

Proof: Suppose that we are given an acceptable lightswitch rule and that S is the set of all switches. We will define an ultrafilter \mathcal{U} on S and show that our given rule is in fact the \mathcal{U}-rule.

First, a quick observation: if all of the switches point to one of two colors, then the lightbulb must be one of those two colors. Why? Well, suppose there were a situation in which all of the switches point to, say, blue or green, while the bulb is red. Now change all of the switches to red. By criterion (1), the bulb must be red. However, by criterion (2), since all of the switches were changed, the bulb must change, so it can no longer be red. Contradiction! We illustrate this, as we will illustrate our future arguments, with a picture. In our pictures, the set of all switches will be depicted by a rectangle. Imagine that the switches are physically located in the rectangle and that the color of a switch is determined by the color of the region in which it is located. The switch configurations are numbered in the order in which we imagine them occurring, and the pictures are labeled with the state of the lightbulb.

two_color_unison_drawing (1)
Figure 1: Suppose we start with a situation in which all switches are either blue or green and the lightbulb is red. If all of the switches are then turned to red, there is no consistent state for the lightbulb.

We are now ready to define our ultrafilter, \mathcal{U}. First, some notation. If X \subseteq S, then we let X^c denote the complement of X, i.e. S \setminus X. Given a subset X \subseteq S, we let X \in \mathcal{U} if and only if when the switches are set so that all switches in X are blue and all switches in X^c are red, the lightbulb is blue. We must now show that \mathcal{U} is indeed an ultrafilter and that our rule is the \mathcal{U}-rule. We do this in a series of steps.

Step 1: S \in \mathcal{U} and \emptyset \not\in \mathcal{U}.

This follows precisely from criterion 1: if all switches in S are blue, the bulb must be blue, while, if all switches in \emptyset^c = S are red, the bulb is not blue.

Step 2: Suppose X \subseteq S. Then either X \in \mathcal{U} or X^c \in \mathcal{U}.

Suppose that all switches in X are set to blue and those in X^c are set to red. By our prior observation, the bulb must be either blue or red. If the bulb is blue, then X \in \mathcal{U} by the definition of \mathcal{U}. If the bulb is red, then change the switches in X to red and those in X^c to blue. Since all of the switches have changed, the bulb must change and therefore must be blue. In this case, X^c \in \mathcal{U}, again by the definition of \mathcal{U}.

Step 3: Suppose X \in \mathcal{U}, C_1 and C_2 are two different colors (chosen from blue, green, and red), and the switches are configured so that all switches in X point to C_1 and all switches in X^c point to C_2. Then the bulb has color C_1.

If C_1 is blue and C_2 is red, then this is precisely the definition of \mathcal{U}. We prove it here when C_1 is green and C_2 is red. The other cases are dealt with similarly. As before, we use a picture. In the picture below, the left side of the rectangle corresponds to X and the right side to X^c. We start with all switches in X blue and all switches in X^c red. Since X \in \mathcal{U}, the bulb must be blue. We then perform a series of manipulations, and repeated applications of criterion 2 will lead to our conclusion.

four_two_color_drawing (1)
Figure 2: In configuration 1, the bulb is blue. Going from 1 to 2, all switches are changed, so the bulb must change. Since all switches are either green or blue, the bulb must therefore be green. Going from 2 to 3, all switches again change, so the bulb must change, this time to red. The same argument shows that, in 4, the bulb must be green. But this is precisely what we wanted to show.

Step 4: Suppose X, Y \subseteq S, X \in \mathcal{U}, and X \subseteq Y. Then Y \in \mathcal{U}.

We again employ a picture. In the picture, the left half of the rectangle represents X and the left three-quarters of the rectangle represents Y. We start with all switches in X blue and those in X^c red. We then change all of the switches in Y to green and those in Y^c to blue. We changed all of the switches, so the color of the bulb must change and therefore must be green. But then, by Step 3, we have Y \in \mathcal{U}.

superset_drawing (1)
Figure 3: Moving from 1 to 2, all switches change, so the bulb must change from blue to green.

Step 5: Suppose we are given a configuration of switches, possibly using all three colors, and X is the set of switches matching the color of the bulb. Then X \in \mathcal{U}.

For concreteness, suppose that the bulb is blue and X \subseteq S is the set of all switches pointing to blue (so all switches in X^c are pointing to green or red). Now change all switches in X to red and all switches in X^c to blue. The bulb must switch colors, so it must be red. But, in our new configuration, we are using only two colors, so, by steps 2 and 3, we must have  X \in \mathcal{U}.

three_to_two_drawing (1)
Figure 4: The left of the rectangle corresponds to X. Going from 1 to 2, the bulb must change from blue to red.

Step 6: If X, Y \in \mathcal{U}, then X \cap Y \in \mathcal{U}.

Once more, a picture. We draw the picture in such a way that implies that X \cup Y = S. If this is not the case, then simply replace Y by Y \cup (S \setminus X). By step 4, this is in \mathcal{U}, and its intersection with X is the same as X \cap Y. In the picture, the left \frac{5}{8} of the rectangle corresponds to X and the right \frac{5}{8} corresponds to Y. Their intersection is thus the middle \frac{1}{4} of the rectangle. Since X \in \mathcal{U}, if all switches in X are blue and all other switches are red, the bulb is blue (call this Configuration A). Since Y \in \mathcal{U}, if all switches in Y are green and all other switches are red, then the bulb is green (call this Configuration B). Now suppose that all switches in X \cap Y are red, all switches in X \setminus Y are green, and all switches in Y \setminus X are blue (call this Configuration C). Going from either Configuration A or B to Configuration C, all switches change, so the bulb must change. Therefore, the bulb in Configuration C can be neither blue nor green, so it must be red. By Step 5, this means that X \cap Y \in \mathcal{U}.

intersection_drawing (1)
Figure 5: Since all switches change going from either configuration in 1 to the configuration in 2, the bulb must be red in 2.

 And now we are done! Why? Steps 1, 4, and 6 are precisely the verifications that \mathcal{U} is an filter, Step 2 shows it is an ultrafilter, and Step 5 implies that the given rule is the \mathcal{U}-rule.

P.S. I found this result in Komjath and Totik’s excellent book, Problems and Theorems in Classical Set Theory. Also, it is interesting to note that the result continues to hold (by essentially the same proof) for any finite number of colors greater than or equal to 3 but fails if only two colors are allowed. So here’s an exercise for you, reader: come up with an acceptable lightswitch rule for two colors that is not equal to an ultrafilter rule.


Ultrafilters III: Infinite Elections

Today, we present a curious “practical” application of ultrafilters, especially relevant in this election season. We will eventually consider elections with infinitely many voters, but let us first look at the classical case with a finite electorate.

Suppose that there are three or more political candidates, named C_1, C_2, \ldots, C_n, vying for political office, and finitely many voters in the electorate. Each voter ranks the candidates in the order in which they prefer them, and the outcome of the election is a collective ranking of the candidates. (We will not deal here with elections in which the result is a single winner rather than a complete ranking, but similar results hold there. We will also not deal with elections between two candidates.)

It has long been clear that aggregating individual rankings into a collective ranking in a “fair” way is a tough problem. Difficulties already arise in the following simple example: Suppose there are three candidates: A, B, and C, and three voters: U, V, and W. Suppose U’s ranking is ABC, V’s is BCA, and W’s is CAB. No collective ranking seems particularly good here. ABC, for example, seems unfair since two of the three voters in fact prefer C to A, while the collective ranking has A first and C last! Similar problems arise with the other possibilities. This particular example was recognized by the Marquis de Condorcet (for whom Condorcet voting methods are named) in 1785.

The Marquis de Condorcet

While the best way to aggregate individual rankings into a collective ranking is unclear, there are some criteria one would expect a good system to satisfy.

  1. (Unanimity) If every voter has the same ranking of candidates, then that ranking is the final outcome.
  2. (Irrelevant Alternatives) The relative ranking of two candidates C_i and C_j in the final outcome depends only on the relative ranking of those two candidates in the individual votes.

Let us call a system for converting individual rankings into a collective ranking a voting system. One particular type of voting system is what we will call a dictatorial voting system. In a dictatorial voting system, there is one distinguished voter (call this voter D) such that, regardless of the votes of the other voters, the final ranking will be exactly the ranking in D’s vote. In the context of this vote, at least, D is a dictator who single-handedly determines the outcome of the election.

A dictatorial voting system satisfies the unanimity and irrelevant alternatives criteria for somewhat trivial reasons. It is obviously not, though, what one has in mind when thinking of free, democratic, elections, as the voices of all voters except one are entirely irrelevant. However, a remarkable and famous result of Kenneth Arrow shows that, if one wants to avoid a dictatorial voting system, one must violate either the unanimity or irrelevant alternatives criterion. This result, and others that followed it, are often paraphrased as stating that there are no perfect voting systems.

Arrow’s Impossibility Theorem: Suppose that, in an election with finitely many voters and three or more candidates, a voting system satisfies the unanimity and irrelevant alternatives criteria. Then it is a dictatorial voting system.

Let us remove now our assumption that there are only finitely many voters, while retaining the assumption that only finitely many candidates run in any given election. The notion of a voting system still makes sense, as do the unanimity and irrelevant alternatives criteria. And here ultrafilters make their entrance. Ultrafilters can be seen as ways of choosing between finitely many alternatives, and, as such, they provide natural voting systems.

Let us be a bit more precise. Suppose that, in an election in which voters rank all candidates, X is the set of voters and C_1, \ldots, C_n is the list of candidates, where n is a natural number. Let \mathcal{U} be an ultrafilter on X. By the properties of an ultrafilter, if we partition X into finitely many pieces, then exactly one of those pieces is in \mathcal{U}. This allows us to define a voting system, which we will call the \mathcal{U}-voting system, as follows. For every possible ranking \sigma of the candidates, let X_\sigma be the set of voters with individual ranking \sigma. Since there are n candidates, there are n! (i.e. n(n-1)(n-2)\ldots(2)(1)) possible rankings of these candidates. In particular, there are only finitely many, so there is some ranking \sigma such that X_\sigma \in \mathcal{U}. The \mathcal{U}-voting system then declares \sigma to be the collective ranking.

These voting systems mesh with our intuition that an ultrafilter is a mechanism for specifying which subsets of a given set should be considered “large.” In an election, we expect the final outcome to reflect the desires of a “large” portion of the electorate; one way to do this is to use an ultrafilter to specify exactly which sets of voters should be considered “large” and then to declare the result of the election to be the ranking chosen by a large set.

What happens if \mathcal{U} is a principal ultrafilter? Then there is one voter x \in X such that \mathcal{U} = \mathcal{U}_x = \{Y \subseteq X \mid x \in Y\}. In this case, the \mathcal{U}-voting system will simply return the ranking of voter x as the collective ranking, ignoring the other voters. In other words, the \mathcal{U}-voting system is a dictatorial voting system.

If \mathcal{U} is an ultrafilter on X, then the \mathcal{U}-voting system is easily seen to satisfy the unanimity and irrelevant alternatives criteria. What might be surprising is that these are the only such voting systems! Namely, we can generalize Arrow’s Impossibility Theorem to allow for a possibly infinite electorate as follows.

Generalized Arrow’s Impossibility Theorem: Suppose that, in an election, X is the set of voters and C_1, \ldots, C_n is the list of candidates, where n \geq 3 is a natural number. Then any voting system which satisfies the unanimity and irrelevant alternatives criteria must be the \mathcal{U}-voting system for some ultrafilter \mathcal{U} on X.

First note that this theorem truly does generalize Arrow’s Impossibility Theorem: if X is finite, then any ultrafilter on X must be principal and thus yield a dictatorial voting system. Also note that, with an infinite electorate, there are voting systems that satisfy the unanimity and irrelevant alternatives criteria but are not dictatorial: any \mathcal{U}-voting system in which \mathcal{U} is a non-principal ultrafilter will do the trick.

Before you write to Congress suggesting that we move towards elections with infinitely many voters based on non-principal ultrafilters, though, note that these voting systems, at least naively implemented, may not be what we actually want. For, consider a \mathcal{U}-voting system, where \mathcal{U} is a non-principal ultrafilter on an infinite electorate X. Suppose that the electorate consists of, say, two sexes: male and female. Then one of these sexes will comprise a set in \mathcal{U}, while the votes of the other will be entirely irrelevant. Or suppose that our infinite electorate is divided amongst our current fifty states. Then the set of voters in one of these states (Kansas, say) will be in the ultrafilter, in which case the election will depend solely on the votes of the Kansas voters. Perhaps this just means, though, that the members of our infinite electorate should not be so simply categorized or placed on a linear spectrum, as any attempt to do so necessarily disenfranchises certain groups. Perhaps this has something to say about finite societies as well. Perhaps…

That’s enough for today. I will not give the proof of the generalized impossibility theorem here. It’s not terribly difficult, but it is somewhat long and technical. Interested readers can consult David Galvin’s notes on ultrafilters. Come back next week, when I will present and actually prove a similar result about infinite lightbulb-switching schemes.


Ultrafilters II: Origin Story

The day after publishing my first post in the Ultrafilter saga, I attended an excellent seminar talk by Maryanthe Malliaris, which, fittingly, was largely about ultrafilters. During the talk, she told a story about the development of filters by Bourbaki (they had been independently developed by at least one other mathematician around the same time) that I will share, with some additions, here, with the caveat that, as this will be a remembered account from a story told in an informal seminar talk about a group that, over the years, has been heavily mythologized, the details may not all be 100 percent accurate. But origin stories are rarely 100 percent accurate, so let’s push onward.

We should first say a few words about Nicolas Bourbaki, the pseudonym of a semi-secret collective founded in the mid-1930s by nine young French mathematicians. The formation of the group was prompted by the fact that all of the members were dissatisfied with the lack of rigor in the mathematical analysis textbooks in use at the time, and the initial goal was the composition of a new textbook. They quickly realized that, to do this in what they saw as the correct way, they would need to start at the very beginning and reformulate all of mathematics on a rigorous, axiomatic basis. They began to produce a series of books, called Elements of Mathematics, providing a self-contained exposition of what they considered the fundamental areas of mathematics. Volume 1 was on set theory, Volume 2 on algebra, Volume 3 on general topology. A new volume on algebraic topology was published in 2016.

It is hard to overstate the influence of Bourbaki on the mathematics of the twentieth century. A number of the most prominent mathematicians of the century were members (initial members include, for example, André Weil, Henri Cartan, and Claude Chevalley; later members include Laurent Schwartz, Jean-Pierre Serre, and Alexander Grothendieck, among others). They introduced a staggering number of definitions and notations that are now standard throughout mathematics (for example, the use of \emptyset to denote the empty set). Most importantly, Bourbaki pushed the mathematical community towards rigor, abstraction, and generality, which are central to their mission.

Now on to filters. One fine summer (I have no idea if it was actually summer or if the summer was fine, but let’s assume it was) in the mid-1930s, the members of Bourbaki convened at the country house of Chevalley’s parents in central France for one of their famous retreats. One day, their topic of discussion was the possibility of freeing the notion of ‘limit’ from the countable.

Members of Bourbaki at Chevalley’s parents’ house, with special guest Simone Weil, sister of André. 

Let’s step aside for a moment to try to gain at least an imprecise idea of what this means. To illustrate, we will consider sequences in the set of rational numbers, \mathbb{Q}, and the set of real numbers, \mathbb{R}. By a sequence, we mean a collection of objects indexed by the natural numbers. So, for example, a sequence of real numbers is an object of the form \langle a_0, a_1, a_2, \ldots \rangle = \langle a_n \mid n \in \mathbb{N} \rangle, where, for all n \in \mathbb{N}, a_n \in \mathbb{R}. If \langle a_n \mid n \in \mathbb{N} \rangle is a sequence of real numbers and b \in \mathbb{R}, then we say \langle a_n \mid n \in \mathbb{N} \rangle converges to b, or that b is the limit of \langle a_n \mid n \in \mathbb{N}\rangle, if, as n gets arbitrarily large, a_n gets arbitrarily close to b. (I am of course being imprecise here; the rigorous definition can easily be found elsewhere.) So, for example, the sequence \langle 1, \frac{1}{2}, \frac{1}{3}, \frac{1}{4}, \ldots \rangle has a limit of 0.

We can also talk about a sequence being convergent without talking about limits. Let us say that a sequence \langle a_n \mid n \in \mathbb{N} \rangle is convergent if, as the sequence goes on, its elements get closer and closer to one another, or, slightly more precisely, as n gets arbitrarily large, the distance between a_{m_0} and a_{m_1} for arbitrary m_0, m_1 \geq n gets arbitrarily small.

It is easy to show that every sequence that has a limit is convergent. Now the question naturally arises: does every convergent sequence have a limit? For sequences of real numbers, the answer is yes; this property of the real numbers is known as completeness. However, let’s step back for a moment to the set of rational numbers. Consider the sequence \langle 3, 3.1, 3.14, 3.141, \ldots \rangle, in which the n^{\mathrm{th}} element is the n-digit expansion of \pi. This is a sequence of rational numbers, and it is easily seen to be convergent. However, since \pi is irrational, it does not have a limit in the set of rational numbers (of course, it has a limit in the set of real numbers, namely \pi). Thus, if you somehow lived in a world in which rational numbers existed but irrational real numbers did not and someone handed you this sequence, you would be able to see that it is convergent, but if you looked at the point to which it apparently converges, there would be nothing there! Or, more precisely but less colorfully, \mathbb{Q} is not complete. (In fact, one of the earliest rigorous definitions of the set of real numbers is as the completion of the set of rational numbers.)

These notions of limits and convergence can be generalized to a class of structures known as metric spaces, which are special examples of topological spaces, but the collections that converge or have limits are always countable (namely, they’re always sequences indexed by the set of natural numbers). This was seen as unnecessarily restrictive in the setting of a general topological space, though, so, in the spirit of generality, Bourbaki sought a definition of limits and convergence that did not depend on the cardinality of the convergent collection.

It was not clear what the correct generalization should be, so, on this particular day, having made little progress, the members of Bourbaki decided to go for a walk. Henri Cartan, though, stayed behind, and, by the time the others returned, he had developed the notion of a filter. He started explaining his ideas, the mathematicians became increasingly excited, and, at one point, one of them, recognizing that this was something truly important, shouted, “Boom!” The objects became known as booms, and their study as boomology. Only when Cartan was later writing about them in a formal paper did he refer to them as filters, the name by which they are known today.

In our last post, we introduced filters as a way of specifying subsets of a given set that should be considered “large.” But, in the context of topological spaces, filters can also be seen as generalizations of sequences and, as such, we can define what it means for a filter to converge to a limit. (For those interested, here’s the technical definition. If (X, \tau) is a topological space, \mathcal{F} is a filter on X, and x \in X, then \mathcal{F} converges to x if, for every U \in \tau with x \in U, there is Y \in \mathcal{F} such that Y \subseteq U.) And it turns out that the compactness of a topological space has a very appealing characterization in terms of convergence of filters: a topological space (X, \tau) is compact if and only if every ultrafilter on X converges to an element of X.

We finally connect these ideas to the Stone-Čech compactification, presented on Monday. Recall that, if (X, \tau) is a discrete topological space, i.e. \tau = \mathcal{P}(X) and X is infinite, then the Stone-Čech compactification of X, \beta X, is the space of ultrafilters on X, where a point x \in X is associated with the principal ultrafilter at x, \mathcal{U}_x, where \mathcal{U}_x is the collection of all subsets of X containing x as an element.

Now, it can be shown that, if (X, \tau) is a discrete topological space and \mathcal{U} is an ultrafilter on X, then \mathcal{U} converges if and only if the intersection of all of the sets in \mathcal{U}, denoted \bigcap \mathcal{U}, is non-empty. Moreover, this happens if and only if \mathcal{U} is a principal ultrafilter, in which case, if \mathcal{U} = \mathcal{U}_x for some x \in X, then \bigcap \mathcal{U} = \{x\} and \mathcal{U} converges to x. If X is infinite, there are nonprincipal ultrafilters on X which therefore do not converge. This can be seen as an explanation for why (X, \tau) is not compact and gives a slightly different picture of the Stone-Čech compactification of (X, \tau). To make (X, \tau) compact, we want to make sure that every ultrafilter \mathcal{U} on X converges. If \mathcal{U} is a principal ultrafilter, then it already converges. If \mathcal{U} is non-principal, though, it does not, and, just as we add irrational numbers like \pi to \mathbb{Q} to make sure that all convergent sequences have a limit, so we add new points to X to make sure that all non-principal ultrafilters converge. Namely, for all non-principal ultrafilters \mathcal{U}, we add a new point x_{\mathcal{U}}, which can be considered to be the point at infinity that lies in  \bigcap \mathcal{U} .

P.S. A slight disclaimer here. The analogy between \mathbb{Q} \mapsto \mathbb{R} and X \mapsto \beta X is perhaps not as nice as I have made it out to be. \mathbb{R} is not a compact space, precisely because, while every convergent sequence from \mathbb{R} has a limit, there are sequences from \mathbb{R} with no convergent subsequence (such as \langle 0, 1, 2, 3 \ldots \rangle). \mathbb{R} is a metric space, and it turns out that a metric space is compact if and only if every sequence from the space has a subsequence that converges to a limit in that space; this can be seen as a special case of the fact, mentioned above, that a topological space is compact if and only if every ultrafilter converges to a limit in the space.


Ultrafilters I: The Stone–Čech Compactification

Today, we begin a new series on the marvels of ultrafilters. The journey will be wide-ranging and will bring us to great heights, but, as always, we start humbly, with definitions. As always, the reader is referred here for a reminder of basic set theoretic notation.

Meeting of the Bourbaki Group in Besse en Chandesse, Puy de Dome, July 1935 (b/w photo)
Nicolas Bourbaki

Before we learn about ultrafilters we must, naturally, learn about filters. Filters, introduced by Birkhoff and Cartan in the 1930s and popularized by the mighty Nicolas Bourbaki, can be thought of as a way of specifying which subsets of a given set should be considered to be “large.” More specifically, let X be a set. A filter on X is a collection \mathcal{F} of subsets of X (i.e.  \mathcal{F} \subseteq \mathcal{P}(X)) satisfying the following criteria:

  1. X \in \mathcal{F} and \emptyset \not\in \mathcal{F}.
  2. If Z \subseteq Y \subseteq X and Z \in \mathcal{F}, then Y \in \mathcal{F}.
  3. If Y,Z \in \mathcal{F}, then Y \cap Z \in \mathcal{F}.

These criteria should make sense in light of our informal definition of a filter as a collection of large subsets of a given set: the first asserts that the entire set is large and the empty set is not large, the second asserts that if a subset Z is large, then any subset that contains Z as a subset is also large, and the third asserts that if “most” elements are in a subset Y and “most” elements are also in a subset Z, then “most” elements are in both Y and Z. Note that, by applying the third criteria over and over again, one obtains that any finite intersection of elements of a filter is also in the filter, i.e. if n \in \mathbb{N} and Y_0, Y_1, \ldots, Y_n are in \mathcal{F}, then Y_0 \cap Y_1 \cap \ldots \cap Y_n is in \mathcal{F}.

An ultrafilter on X is a filter on X that contains as many sets as possible. What does this mean in practice? Notice that, if \mathcal{F} is a filter on X and Y \subseteq X, it cannot be the case that both Y and  X \setminus Y are in \mathcal{F}. This is because, if they were both in \mathcal{F}, then Y \cap (X \setminus Y) would also be in \mathcal{F}. But Y \cap (X \setminus Y) = \emptyset, and \emptyset \not\in \mathcal{F}. It turns out that this is the only real constraint, so a filter \mathcal{F} on X is an ultrafilter if, for every subset Y \subseteq X, either Y \in \mathcal{F} or (X \setminus Y) \in \mathcal{F}.

Let’s look at some examples of filters and ultrafilters. First, a rather trivial example. Suppose X is any non-empty set and x \in X. Let \mathcal{U} be the set of all subsets of X that contain x as an element. You can easily check that \mathcal{U} is an ultrafilter on X; in this case, \mathcal{U} is called the principal ultrafilter on X at x, and it is rather boring.

If X is a finite set, then it turns out that every ultrafilter on X is a principal ultrafilter. To see this, let n \in \mathbb{N}, let X be a set with n elements, e.g. X = \{0,1,2, \ldots, n-1\}, and let \mathcal{U} be an ultrafilter on X. If there is i < n such that \{i\} \in \mathcal{U}, then it follows that \mathcal{U} is the principal ultrafilter at i. Suppose this is not the case. Then, for all i < n, X \setminus \{i\} \in \mathcal{U}. Then X \setminus \{0\} \cap X \setminus \{1\} \cap \ldots \cap X \setminus \{n-1\} is a finite intersection of elements of \mathcal{U} and therefore is itself in \mathcal{U}. But this intersection is \emptyset, and \emptyset \not\in \mathcal{U}. Contradiction! Thus, there is i < n such that \{i\} \in \mathcal{U}.

Therefore, interesting ultrafilters only start showing up on infinite sets. But they show up immediately. Suppose X is an infinite set, and let \mathcal{F} be the set of all co-finite subsets of X, i.e. all sets Y \subseteq X such that X \setminus Y is finite. \mathcal{F} is easily seen to be a filter on X and is called the Fréchet filter. \mathcal{F} is certainly not an ultrafilter. If, for example, X = \mathbb{N}, then neither the set of even numbers nor the set of odd numbers is in \mathcal{F}. However, it is a consequence of Zorn’s Lemma, an equivalent form of the Axiom of Choice, that any filter can be extended to an ultrafilter. Therefore, we can find an ultrafilter \mathcal{U} on X such that \mathcal{F} \subseteq \mathcal{U}. For every x \in X, X \setminus \{x\} is clearly a cofinite subset of X and is hence in \mathcal{U}, so \mathcal{U} is not a principal ultrafilter (in fact, it is easily seen that, for an ultrafilter on X, extending the Fréchet filter and being non-principal are equivalent).

As we will see in coming posts, non-principal ultrafilters are wondrous creatures. We have time for one quick illustration today. Recall that, in a previous post, we introduced the notion of a topological space and a compactification thereof. We showed that, given any topological space (X, \tau), there is a smallest way to make X compact, and one can in fact achieve this by adding a single new point at infinity. There is also a largest (in a specific technical sense) compactification of (X, \tau). This is known as the Stone–Čech compactification.

In general, the Stone–Čech compactification of a space can be defined in terms of an abstract universality property. It can also be defined in a somewhat more complicated way in terms of continuous functions from the space to the closed unit interval. But, in the case in which (X, \tau) has the discrete topology, i.e. \tau = \mathcal{P}(X), then the Stone–Čech compactification has a very pleasing and initially surprising characterization: it is the space of all ultrafilters on X.

A diagram illustrating the universality property characterizing the Stone-Čech compactification.

Let \beta X denote the set of all ultrafilters on X. What topology do we place on \beta X? If Y \subseteq X, then let O_Y be the set of \mathcal{U} \in \beta X such that Y \in \mathcal{U}. We can define a topology \tau' on \beta X by letting \tau' consist of all subsets of \beta X that can be expressed as arbitrary unions of sets in \{O_Y \mid Y \subseteq X\}. (\beta X, \tau') is then the Stone–Čech compactification of (X, \tau).

You might have a couple objections here. First, a compactification worthy of its name must be compact, and I have not shown that (\beta X, \tau') is compact. In fact, I will not do so here. It’s not particularly difficult, and I encourage interested readers to seek out its proof elsewhere, but neither is it trivial.

Second, a compactification of a space X should contain X as a subspace. But, here, X and \beta X consist of fundamentally different objects, and, formally, it is not the case that X \subseteq \beta X. This objection is easily and elegantly overcome, though, by saying that an element x \in X is identified with the element \mathcal{U}_x \in \beta X, where \mathcal{U}_x is the principal ultrafilter on X at x. Through this renaming, X is identified as a subspace of \beta X, and all is right in the world. In addition, this identification clarifies our picture of \beta X: the boring, principal ultrafilters in \beta X correspond to the points in our original space X, while the non-principal ultrafilters comprise a constellation of new points at infinity.


The Book of Sand

“It can’t be, yet it is,” the Bible peddler said, his voice little more than a whisper. “The number of pages in this book is literally infinite. No page is the first page; no page is the last. I don’t know why they’re numbered in this arbitrary way, but perhaps it’s to give one to understand that the terms of an infinite series can be numbered any way whatever.”

Then, as though thinking out loud, he went on.

“If space is infinite, we were anywhere, at any point in space. If time is infinite, we are at any point in time.”

His musings irritated me.

-Jorge Luis Borges, “The Book of Sand”

It’s a busy week for me, so just a short post today and an exhortation to take a few minutes to read Borges’ story, “The Book of Sand.” You can find a nice version of it here.

We will no doubt return to this and other of Borges’ fictions later. “The Book of Sand” will prompt reflection on our place in a potentially infinite universe. Attempts to determine the order type of its pages will lead naturally to the investigation of the fascinating topic of dense linear orders.

For today, though, we will just appreciate the story.

Oh, and then we will go over here and play around with Michel van der Aa’s lovely multi-tiered music video experiment based on “The Book of Sand” and other Borges stories. Join us!


I was the shadow of the waxwing slain

By feigned remoteness in the windowpane.

I had a brain, five senses (one unique),

But otherwise I was a cloutish freak.

In sleeping dreams I played with other chaps

But really envied nothing — save perhaps

The miracle of a lemniscate left

Upon wet sand by nonchalantly deft

Bicycle tires.

-Vladimir Nabokov, Pale Fire

The lemniscate is the symbol of infinity, an endlessly traceable sideways figure eight, a twisted ouroboros engaged in an eternal process of renewal.


First called a “horse fetter” by Proclus, 5th century AD Greek philosopher and mathematician, who studied cross sections of tori. Analytically described by Bernoulli as the set of solutions to (x^2 + y^2)^2 - 2a^2(x^2-y^2) = 0, where a is a fixed real number. First used as a symbol for infinity by John Wallis, a 17th century English mathematician who pioneered the use of infinitesimals and helped prepare the ground for the calculus of Newton and Leibniz; he did not explain his choice. Euler later used an open variant, which has since fallen into disuse.

Euler’s open lemniscate. By Sapphorain, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=28853336

The lemniscate is source material for writers and artists. Beloved by mystics. Ubiquitous in tarot decks, particularly on the Juggler (Magician), who links Heaven and Earth, and on the Two of Pentacles.

The Juggler (or Magician) with a lemniscate in his hat. Note also the aleph on the right card. It signifies that, in the Tarot deck, the Juggler is the first of the Major Arcana and also calls to mind Cantor’s choice of notation for the infinite cardinals.
The Two of Pentacles

It is the figure idly traced on a tabletop while the mind is wandering elsewhere.


(1) to molder lazily on the couch. (2) to be sloth-like. (3) to lounge purposelessly, often associated with procrastination.

-Urban Dictionary

Victor Brauner, “The Surrealist”