L’escalier du Diable

Welcome one, welcome all to the Point at Infinity sideshow, where today we present a tantalizing and diabolical selection of musical and mathematical curiosities. Just watch your step; these stairs can be a bit tricky.

A few months ago, you may recall, we published two posts about the Shepard tone and the Risset rhythm, aural illusions in which a tone or rhythm seems to perpetually rise or fall in pitch or in tempo but is actually repeating the same pattern over and over again, the musical equivalents of Penrose stairs.

Penrose stairs. (Image in the public domain.)

To accompany the posts we created some sound samples so the readers could hear the illusions themselves. A couple of weeks ago, one of these samples was used in an internet radio program on audio paradoxes released by Eat This Radio, paired with some work of Jean-Claude Risset. The entire program is really excellent, ranging from a piece by J.S. Bach to mid-twentieth century audio experiments to modern electronic music, and I encourage all of you to listen to it.

One of the pieces in the radio program is a piano étude written by György Ligeti in the late twentieth century. The étude is named L’escalier du diable, or The Devil’s Staircase, and its repeated ascents of the keyboard have a striking resonance with the never-ending ascent of the Shepard tone.

The Devil’s Staircase is also the colloquial name given to a particular mathematical function introduced by Georg Cantor in the 1880s. It is a function defined on the set of real numbers between 0 and 1 and taking values in the same interval, and it has some quite curious properties. Before we discuss it, let’s take a look at (an approximation to) the graph of the function.

Graph of the Devil’s Staircase. By Theon, CC BY-SA 3.0.

To appreciate the strangeness of this function, let us recall some definitions regarding functions of real numbers. Very roughly speaking, a function is called continuous if it has no sudden jumps, or if its graph can be drawn without lifting the pencil from the page. Continuous functions satisfy a number of nice properties, such as the intermediate value theorem.

The derivative of a function at a given point of its domain, if it exists, measures the rate of change of the function at that point. If the x-axis measures time and the y-axis measures the position of an object along some one-dimensional track, then the derivative can be thought of as the velocity of that object. If a function is differentiable at a point (i.e., if its derivative exists there) then it must be continuous at that point, but the converse is not necessarily true. (For example, if the graph of a function has a sharp corner at a point, then the function cannot be differentiable there.)

Let’s think about what it means for a function to have a derivative of 0 at a point. It means that, at that point, the rate of change of the function has vanished. It means that, if we zoom in sufficiently close to that point, the function should look like a constant function. Its graph should look like a horizontal line. What would it mean for a function to have a derivative of 0 almost everywhere? (Here “almost everywhere” is a technical term (which I’m not going to define) and not just me being vague.) One might think that this must imply that the function is a constant function. At almost every point in its domain, the rate of change of the function is 0, so how can the value of the function change?

One will quickly discover that this is not quite right. Consider the function defined on the real numbers whose value is 0 at all negative numbers and 1 at all non-negative numbers.


This function has derivative 0 everywhere except at 0 itself, and yet it increases from 0 to 1. It does this quite easily by being discontinuous at 0, which, in hindsight, seems sort of like cheating. So what if we also require our function to be continuous? Now we need more exotic examples, and this is where the Devil’s Staircase comes in, for the Devil’s Staircase is a continuous function, it is differentiable almost everywhere, it has a  derivative of 0 wherever its derivative is defined, and yet it still manages to increase from 0 to 1. Wild!

What is the Devil’s Staircase exactly? I’ll give two different definitions. The first proceeds via an iterative construction. Start with the function f_0(x) = x. Its graph, between 0 and 1, is simply a straight line segment increasing from (0,0) to (1,1). Now, look at the midpoint of this increasing line segment, and draw a horizontal line segment centered there whose length is 1/3 of the horizontal line of the original increasing segment. Now connect the ends of this line segment via straight lines to (0,0) and (1,1). This new curve is the graph of a function that we call f_1. It consists of two increasing line segments with one horizontal line segment between them. Now repeat the process that took us from f_0 to f_1 on each of these increasing line segments, and let f_2 be the function whose graph is the result. Continue in this manner, constructing f_n for every natural number n.

First three steps of the iterative construction of the Devil’s Staircase. (Image in the public domain.)

It turns out that, as n goes to infinity, the sequence of functions \langle f_n \mid n \in \mathbb{N} \rangle converges (uniformly) to a single function. This function is the Devil’s Staircase.

A more direct but also more opaque definition is as follows: Given a real number x between 0 and 1, first express x in base 3 (i.e., using only 0s, 1s, and 2s). If this base 3 representation contains a 1, then replace every digit after the first 1 with a 0. Next, replace all 2s with 1s. The result has only 0s and 1s, so we can interpret it as a binary (i.e., base 2) number, and we let f(x) be this value. Then the function f defined in this manner is the Devil’s Staircase. Play around with this definition, and you might get a feel for what it’s doing.

And now, on our way out, some musical addenda. An encore, if you will. First, after making the Risset rhythms for the aforementioned post, I did some further coding and wrote a little program that can take any short audio snippet and make a Risset rhythm out of it. Here’s an example, first accelerating and then decelerating, using a bit from a Schubert piano trio.

You may recognize the sample from the soundtrack to Barry Lyndon.

Finally, I can’t help but include here one of my favorite pieces by Ligeti, Poema sinfónico para 100 Metrónomos.

Cover image: Devil’s Staircase Wilderness, Oregon, USA


Hippasus and the Infinite Descent

Today, dear Reader, we bring you a story. A story of Mathematics and Music, of Reason and Passion, of Drama and Irony. It is the story of Hippasus of Metapontum, of his remarkable life and his equally remarkable death. Before we begin, a note of warning. Not everything presented here is true, but all of it is meaningful.

Hippasus of Metapontum

A Greek philosopher and mathematician from the 5th century BCE, Hippasus was a follower of Pythagoras. The Pythagoreans believed in the transmigration of souls, subscribed to the belief that All is Number (where “Number” is, of course, “Whole Number”), made great strides in the study of musical harmony, and eschewed the eating of beans. Our hero was a particularly illustrious Pythagorean. He performed experiments linking the sizes of metal discs to the tones they emit upon being struck, developed a theory of the musical scale and a theory of proportions, and showed how to inscribe a regular dodecahedron in a sphere. The regular dodecahedron is a twelve-sided solid whose faces are regular pentagons, shapes which were dear to the Pythagoreans and central to our story. The pentagram, the five-sided star formed by extending the sides of a regular pentagon, and whose tips themselves form a regular pentagon, was a religious symbol of the Pythagoreans and a mark of recognition amongst themselves.

Ancient Greek coin with a pentagram

A corollary to the Pythagorean doctrine that All is Number was a belief, at the time, that any two lengths are commensurable, i.e., that given any two lengths, they are both whole number multiples of some fixed smaller length. In modern language, this amounts to the assertion that, given any two lengths, their ratio is a rational number, i.e., can be expressed as the ratio of whole numbers. It is only fitting that the first evidence to the contrary would come from the pentagram.

It happened when Hippasus was stargazing. He saw five stars forming a perfect regular pentagon, and inside this regular pentagon he formed a pentagram, at the center of which lay another regular pentagon, into which he formed another pentagram, at the center of which lay another regular pentagon, into which he formed another pentagram. An infinite web of similar triangles was woven through his mind and, in a flash of insight, he realized something terrible: the lengths of one side of the regular pentagon and one side of the pentagram found inside it are incommensurable.

The lengths of one side of a regular pentagon and one side of the inscribed pentagram are incommensurable. In other words, if s denotes the length of the black line segment AB and t denotes the length of the blue line segment AC, then t/s (and therefore also s/t) is irrational.

The next day, he told his fellow Pythagoreans of his discovery, and they were horrified. They could not have this knowledge, which struck at the core of their belief system, getting out into the wider world. So they took Hippasus far out to sea, and they threw him overboard.

Infinite descent is a proof technique that morally dates back to the ancient Greeks but really came into its own in the work of Pierre de Fermat in the 17th century. The idea behind it is simple and immediately appealing. Suppose we want to prove that there are no positive integers satisfying a particular property. One way to prove this would be to show that, given any positive integer n satisfying this property, we could always find a smaller positive integer n' satisfying the same property. Repeating the argument with n' in place of n, we could find a still smaller positive integer n'' satisfying the same property. Continuing, we would construct a decreasing sequence of positive integers, n > n' > n'' > n''' > \ldots , the aforementioned “infinite descent”. But of course there can be no infinite decreasing sequences of positive integers (if a decreasing sequence starts with n, it can of course have at most n elements). One thus reaches a contradiction and concludes that there are no positive integers with the given property.

Fermat made great use of the method of infinite descent in his work on number theory. One particularly striking application came in the proof of a special case of his famous Last Theorem: there are no positive integers a,b,c such that a^4 + b^4 = c^4. Fermat showed that, if a_0,b_0,c_0 are positive integers such that a_0^4 + b_0^4 = c_0^4, then we can construct positive integers a_1, b_1, c_1 such that a_1^4 + b_1^4 = c_1^4 and c_1 < c_0. One can then continue, with a_1, b_1, c_1 in place of a_0, b_0, c_0, and obtain a_2, b_2, c_2 with a_2^4 + b_2^4 = c_2^4 and c_2 < c_1. This of course leads to the infinite descent c_0 > c_1 > c_2 > \ldots and a contradiction.

We will use the method of infinite descent to prove the result of Hippasus mentioned above. This will not be exactly the way the ancient Greeks would have presented the proof, but it is very similar in spirit, and we make no apologies for the anachronism. Let’s get started.

Suppose, for the sake of an eventual contradiction, that we have a regular pentagon whose side length is commensurable with the side length of the inscribed pentagram. Let s denote the side length of the pentagon and t denote the side length of the pentagram (i.e., the diagonal of the pentagon). In modern language, our assumption is that \frac{t}{s} is rational, i.e., that there are positive integers p and q such that \frac{t}{s} = \frac{p}{q}. By scaling the pentagon, we can in fact assume that t = p and s = q. So let this be our starting assumption: there is a regular pentagon whose side length s and diagonal length t are both positive integers.

Let us label the vertices of the pentagon by the letters A,B,C,D,E. The center of the pentagram forms another regular pentagon, whose vertices we shall call a,b,c,d,e. This is shown in the diagram below. Note that s is the length of the line segment connecting A and B (we will denote this length by |AB|), and t is |AC|. Let s' denote the side length of the inner pentagon, i.e., |ab|, and let t' denote the length of the diagonal of the inner pentagon, i.e., |ac|.

Figure 1

We now make use of the wealth of congruent triangles in the diagram. We first observe that, whenever a pentagram is inscribed into a regular pentagon, the diagonals that form the pentagram exactly trisect the angles of the pentagon. (Exercise: Prove this!) Therefore, the triangle formed by A, E and D is congruent to that formed by A, e, and D, on account of their sharing a side and having the same angles on either end of that side. In particular, we have |Ae| = |AE| = s.

Next, consider the triangle formed by A, b, and d. By our observation at the start of the previous paragraph, the angle at b in this triangle has the same measure as the angle at A. But then this triangle is isosceles, so we have |Ad| = |db| = t'. Of course, we clearly have |Ad| = |Bd| = |Be| = |Ce| =…, so all of these lengths are equal to t'.

Figure 1 again

Let’s now see what we have. Consider first |AC|. By definition, we have |AC| = t. But also |AC| = |Ae| + |Ce|, and we saw previously that |Ae| = s and |Ce| = t'. All together, we have t = |AC| = |Ae| + |Ce| = s + t', or t' = t-s.

Next, consider |Ae|. We have already seen that |Ae| = s. But we also have |Ae| = |Ad| + |de|. We saw previously that |Ad| = t', and by definition we have |de| = s'. All together, we have s = |Ae| = |Ad| + |de| = t' + s', or s' = s - t'. Since we already know that t' = t-s, this yields s' = 2s - t.

This might not seem like much, but we’re actually almost done now! The key observation is that, since s and t are integers, and since s' = 2s - t and t' = t-s, it follows that s' and t' are also (positive) integers. We started with a regular pentagon whose side length s and diagonal length t were integers and produced a smaller pentagon whose side length s' and diagonal length t' are also integers. But now we can continue this process, producing smaller and smaller regular pentagons and producing infinite descending sequences of positive integers, s > s' > s'' > \ldots and t > t' > t'' > \ldots.

The infinite descent

This is a contradiction, and we have thus shown that \frac{t}{s} is irrational. But what is this ratio exactly? Well, it turns out to be a fascinating number, easily #2 on the list of Most Famous Ratios of All Time: \phi, a.k.a. the golden ratio! But that is a story for another time…

As a short addendum to today’s story, here’s a little-known fact about the end of Hippasus’ life. It turns out that, after being thrown overboard by his fellow Pythagoreans, he has not and will never in fact reach the seabed! For to get there, he would first have to travel halfway down, and then he would have to travel half of the remaining distance, and then half of the still remaining distance, and so on and so forth, completing an endless sequence of tasks, and thus he remains to this date in the midst of an infinite descent of his own.


(1) \sqrt{2} is more commonly cited as being the first irrational number discovered by the Pythagoreans, and it is almost always the first number proven to be irrational in classrooms today. However, the proof of the irrationality of \sqrt{2} possessed by the Greeks is not the simple number-theoretic proof used today and is in fact a rather complex elaboration of the proof of the incommensurability of the side and diagonal lengths of a regular pentagon. Given this fact and the centrality of the pentagram in Pythagorean intellectual life, some scholars have suggested that perhaps this was in fact the first proof of the existence of incommensurability and that the proof of the irrationality of \sqrt{2} came later. We have adopted this hypothesis for the purpose of our story today.

(2) This story is derived from legends that significantly post-date the death of Hippasus. It seems unlikely actually to have happened as presented here.

(3) This post was in part inspired by the episode “Drowned at Sea”, from the excellent podcast, Hi-Phi Nation, by philosopher Barry Lam. Check it out!

Cover image: “Rainstorm Over the Sea” by John Constable

Beats by Bjorklund

You gotta keep ’em separated.

-The Offspring, “Come Out And Play”

(Friendly note to the reader before we begin: If you’re looking for music to listen to while reading this post, just scroll to the bottom…)

The Spallation Neutron Source (SNS), at Oak Ridge National Laboratory in Tennessee, provides the world’s most intense pulsed neutron beams for scientific research. By accelerating proton pulses and directing them towards a mercury target, the laboratory ejects neutrons from the target. By measuring the energies and angles of these scattered neutrons, scientists can uncover information about the magnetic and molecular structure of various materials.

Pulses at the SNS are grouped into super-cycles, each 10 seconds long and consisting of 600 pulses. Commonly, experimenters will want to perform a particular action on a fixed number of pulses (k, say) during each super-cycle and will want to have the executions of the actions spaced as evenly as possible across time. If k evenly divides 600, then the solution to this problem is trivial: just perform the action on pulses 0, \frac{600}{k}, \frac{2 \cdot 600}{k}, \frac{3 \cdot 600}{k}, \ldots, \frac{(k-1) \cdot 600}{k}. If k does not evenly divide 600, though, then the solution is not as obvious.

This is of course a special case of a general problem: Given cycles of length n, how can k events be spaced as evenly as possible across a single cycle? (Naturally, we should have k \leq n here.) As seen above, the problem is trivial if k evenly divides n, so we shall assume this is not the case. In fact, we shall assume that k and n are relatively prime, i.e., that they have no common positive divisors other than 1. We will also assume that k > 1, since it is also trivial to evenly distribute one event across a cycle.

This exact problem was tackled by E. Bjorklund, motivated by timing questions at the SNS. In this paper, he presents an algorithm for generating solutions. For ease of notation, let us denote a cycle by a sequence of 0s and 1s, with a 1 representing an event taking place and a 0 representing an event not taking place. For definiteness, we will always start the sequence with a 1, but, because we are dealing with cyclic phenomena, we will consider two “rotations” of the same base sequence to be the same. For example, we consider 10010 and 10100 to be the same, since the second is just the first “rotated” (and wrapped around) three spaces to the right.

We will not give a formal account of Bjorklund’s algorithm here, but we will illustrate it with an example that we hope will indicate how to carry it out in general. Suppose we are given the specific problem of evenly spacing eight events in a cycle of length thirteen. Our final sequence will thus have eight 1s and five 0s. We start with thirteen individual sequences, each of length one, with all of the 1s first and then all of the 0s:

[1] [1] [1] [1] [1] [1] [1] [1] [0] [0] [0] [0] [0]

You will notice that we have two different sequences in this list ([1] and [0], here). This will be true for every step of the algorithm. In the next step, we take all instances of the second sequence ([0], here) and append them in turn to instances of the first sequence. Since there are five instances of the second sequence, five instances of the first sequence will become longer, and we obtain:

[10] [10] [10] [10] [10] [1] [1] [1]

We again have two different sequences ([10] and [1]). And we again take all instances of the second sequence and append them in turn to instances of the first sequence, obtaining:

[101] [101] [101] [10] [10]

Repeating this process again yields:


Technically, since there is only one instance of the second sequence in this stage, we could stop here and just append all of the sequences in order to obtain the correct solution, but let us continue for illustration’s sake. Next, we get:


And, finally:


(which is just a rotation of what we would have gotten two stages earlier: [1011010110101]). Thus, the optimal way of evenly distributing eight events in a single cycle of length 13 is given by the sequence 1011010110110. A similar process will yield the correct answer in any other specific instance of this problem.

(The astute reader might have noticed that this algorithm bears a resemblance to Euclid’s famous algorithm for calculating the greatest common divisor of two positive integers. This is not a coincidence.)

You might be slightly disturbed by my lack of rigor at this point. Indeed, there are two ways in which I am slightly cheating the reader. First, I have not really said what I mean by having events be spaced “as evenly as possible.” Intuitively, 1011010110110 is more even than 111111110000, or 1101101101100, or, if one thinks about it, 1011011011010, but how do we know it is “as even as possible,” and what do we even mean by this? Second, even if I had given a rigorous definition of “evenness,” I have not proven that Bjorklund’s algorithm does in fact deliver the optimal result (again, this should be intuitively plausible, but should perhaps not be blindly accepted). I will do neither of these things here, but will instead direct the interested reader to the excellent paper, “The distance geometry of music,” by Demaine, Gomez-Martin, Meijer, Rappaport, Taslakian, Toussaint, Winograd, and Wood, from which much of this post was derived.

Unsurprisingly, the problem of evenly distributing k events or items into cycles of length n does not show up only in nuclear physics. Let us briefly look at a couple of other interesting appearances.

Computer graphics: Suppose you want to depict a straight line on a computer screen with a slope of \frac{k}{n}. Computer screens consist of grids of pixels, so, unless the line is horizontal or vertical, it cannot be drawn perfectly straight. Here is an example of a segment of a line with slope -\frac{5}{11}:

A line with slope -5/11, superimposed over a pixel approximation. Created by Crotalus horridus, Public Domain

It should be clear that, at least as long as k < n, the problem of drawing this line consists of choosing k out of every n horizontal spaces after which to move your line one space vertically. To make your line as straight as possible, this distribution should be as even as possible. Indeed, the situation depicted above can be represented by the sequence 10101001010, where 1 indicates a vertical shift and 0 represents the absence of a vertical shift. This is precisely the result one gets (after a rotation) if one applies Bjorklund’s algorithm to the numbers 5 and 11.

Leap year distribution: The problem of reconciling the fact that one revolution of the earth around the sun does not take an exact integer number of days with the desire to have years that last an exact integer number of days is one that affects every calendar design. It is handled particularly interestingly in the Hebrew calendar. The Hebrew calendar is a lunisolar calendar, in which each month lasts one full cycle of the moon, but in which adjustments are made so that each month occurs at roughly the same stage of the solar year during each cycle. The Hebrew calendar reconciles these two contradictory demands by having twelve months in a typical year but, during certain “leap years,” having one of these months (namely, Adar) repeated to yield a year of thirteen months. It turns out that, to keep the Hebrew calendar very close to being in sync with the solar cycles, one needs to have seven out of every nineteen years be leap years. It is also natural to want these leap years to be spaced as evenly as possible. And, indeed, the distribution of these seven leap years in nineteen-year cycles in the Hebrew calendar is given precisely by a rotation of the result of Bjorklund’s algorithm applied to the numbers seven and nineteen.

Perhaps the most interesting appearances of the problem of evenly distributing events in a cycle, though, are in music. If one thinks of a musical measure consisting of n beats as being a cycle of length n, of a 1 as being a note being played, and of a 0 as being the absence of a note being played, then our strings of 1s and 0s can be thought of as musical rhythms. And it turns out that applications of Bjorklund’s algorithm, when interpreted in this way, yield a startling variety of musical rhythms used in practice around the world. We end this post by presenting a few prominent examples.

In what follows, we notationally replace 1s with ‘x’s and 0s with dots. This is more conducive to following along with musical rhythms, as it is visually easier to get lost in strings of 1s and 0s than in strings of ‘x’s and dots. An ‘x’ thus denotes a note being played and a dot denotes the absence of a note being played (or, often, an ‘x’ denotes an accented beat and a dot denotes an unaccented beat).  Given numbers k < n, we denote by E(k,n) the result of Bjorklund’s algorithm applied to k and n. The sequence that follows is the rotation of this result that is actually heard in the piece of music. As before, we restrict ourselves to the interesting cases in which k and n are relatively prime and k > 1. (Of course, the standard waltz rhythm [x .  . ], for example, would be E(1,3), and the standard backbeat rhythm [. x . x ] would be a rotation of E(2,4), but these are sort of boring in this context.)

There’s some good music here. Enjoy!

E(2,5) – (x . . x . ): “Take Five” by The Dave Brubeck Quartet.

E(3,7) – (x . x . x . . ): “Money” by Pink Floyd.

E(3,8) – (x . . x . . x . ): This is the famous tresillo rhythm, which can be heard throughout Latin American and sub-Saharan African music as well as in such songs as “Hound Dog,” “Despacito,” and the song we feature here, “Intro” by The xx.

E(5,8) – (x . x x . x x . ): This is the cinquillo rhythm, an elaboration of the tresillo that is also ubiquitous in Latin American and sub-Saharan African music. It also features prominently in “Come Out and Play” by The Offspring.

E(4,9) – (x . x . x . x . . ): This is our second track from the classic Time Out album, which was inspired by musical styles (especially time signatures) observed during travels in Europe and Asia. Brubeck saw this particular beat being played by Turkish street musicians and then used it for  “Blue Rondo à la Turk” by The Dave Brubeck Quartet.

E(4,11) – (x . . x . . x . . x . ): “Outside Now” by Frank Zappa.

E(5,11) – (x . x . x . x . x . .): “Promenade” from Pictures at an Exhibition by Modest Mussorgsky.

E(7,12) – (x . x . x x . x . x . x): This is one of the most important rhythms in sub-Saharan Africa, often used as a bell pattern. It is also a common beat in the Palo music of the Caribbean. It can be heard played by the claves player in this performance by Los Calderones, which I have been listening to non-stop for the last hour. (Note also that, if one re-interprets this sequence in a harmonic rather than rhythmic setting, in which each step of the cycle represents a half-step in pitch, an ‘x’ represents the inclusion of that note in a scale, and a dot represents the exclusion of that note from a scale, then this sequence (this exact rotation, in fact) precisely corresponds to the major diatonic scale, the most common musical scale in Western music!)

E(4,13) – (x . . x . . x . . x . . . ): This rhythm can be heard in the instrumental sections of “Golden Brown” by The Stranglers.

E(5,16) – (x . . x . . x . . . x . . x . . ): This rhythm forms the basis for much bossa nova music and, when significantly slowed down, “Pyramid Song” by Radiohead.

Cover image: Detail from “Wall Drawing #260” by Sol LeWitt

Common Knowledge

The phrase “common knowledge,” in its common usage, refers to knowledge that is held by (nearly) everyone, at least within some given community. For example, it is common knowledge in modern society that the Earth is round and orbits the sun, or that 2+2=4. These are facts that I can assume that you, my readers, already know or can easily verify, and it’s common practice not to give citations or acknowledge sources for them.

Today, though, we’re not really going to be discussing this common, informal usage of “common knowledge.” Nor are we going to devote this post to facts about the musician behind the classic hip hop album, Resurrection. (Though did you know that the track “I Used to Love H.E.R.” launched a feud with Ice Cube that was eventually settled at a private meeting organized by Louis Farrakhan at his house??) Rather, we’re going to delve into “common knowledge” as a fascinating technical concept in philosophical logic, epistemology, and social science.

Let’s start with the idea of mutual knowledge. Suppose that there is a group of people, and there is a fact that is known by everyone in the group. Then this fact is mutual knowledge among the members of the group. The classification of a fact as mutual knowledge seems like a pretty strong assertion, but we can (and often do) ask for even more. We can ask, for example, that not only does everyone in the group know the fact, but everyone in the group knows that everyone in the group knows the fact. Or we can go further, and ask that everyone in the group knows that everyone in the group knows that everyone in the group knows the fact. Carrying this process to infinity, we obtain the concept of common knowledge.

To be more precise, let us say that a fact is first-order knowledge among members of a group if everyone knows it, i.e., it is mutual knowledge. Let us say it is second-order knowledge if everyone in the group knows that everyone in the group knows it. Continue in this manner. In general, if n is a natural number and we have already defined the notion of n-th order knowledge, then a fact is (n+1)-st order knowledge among members of a group if it is n-th order knowledge and everyone in the group knows that it is n-th order knowledge. A fact is then common knowledge among members of a group if it is n-th order knowledge for every natural number n.

Let’s look at an example to try to clarify the distinction between mutual knowledge and common knowledge. Suppose that all of the students in a class have arrived to a classroom early and are sitting at their desks, and there is a large beetle crawling on the wall. Every student sees the beetle, so the presence of the beetle in the classroom is mutual knowledge among the students. None of the students particularly wants to deal with the beetle, though, so each student pretends to ignore it. As far as each student is concerned, they might be the only student who has noticed the beetle. The presence of the beetle is thus not even second-order knowledge among the students. Then the professor walks in the room, sees the beetle, points at it, and loudly exclaims, “Ah, I see we have a beetle auditing the class today!” Now, suddenly, the students know that they all know that there is a beetle in the room, and they know that they all know that they all know that there is a beetle in the room, and so on. The presence of the beetle has suddenly become common knowledge.

In the grand scheme of things, the study of common knowledge, and higher order knowledge in general, is a rather recent development in philosophy; David Hume was perhaps the first major philosopher to consider it, in A Treatise of Human Nature. It has become extensively studied since then, though, and, if one knows where to look, it can be found almost anywhere.

Consider the following example, due to philosopher Alan Hájek. You are in a restaurant, and a server walking by you trips on his shoelaces and spills a bit of soup on your shirt. You are, naturally, upset, and the server becomes apologetic and says, “It was my fault.” This is of course something almost everyone would expect him to say, but let’s look at it more closely. Why exactly did he say it, and why is it important? Everyone involved in the incident already knew that the server was at fault, so this utterance would seem to add no new information. However, it actually can be the key to ensuring that the incident does not escalate further, by establishing the server’s fault as common knowledge. Indeed, before the server’s admission, you might have thought that the server considered you at fault, which surely would have made you even angrier. After the server’s statement, some common ground has been established, and, with luck and maybe a free dessert, the situation can calmly be resolved.

The notion of higher order knowledge also shows up all the time in game theory (and, therefore, economics, international relations, etc.). When formulating an ideal strategy for a game, one must consider not only the current state and rules of the game, but also the knowledge and strategies of all of one’s partners and adversaries. But they, in turn, of course, are considering your knowledge and strategies when formulating their own, and one quickly reaches vertiginous levels of higher order knowledge. If you want to see common knowledge repeatedly raise its many confusing heads, just head to a high stakes poker game. (“I know that she would almost never bluff in this spot. But we’ve been playing together for days, so I also know that she knows that I know this, so maybe she’s actually more likely to bluff. But maybe she knows that I know that she knows that I know this, which means…I don’t know. I call.”)

We end today with a puzzle that has common knowledge at its core: There are two prisoners, Louis and Marie, each in their own prison cell. Each looks out on their own little yard. Louis can see 8 trees in his yard; Marie can see 12. One day, the warden brings them together and tells them both, “Between the two of you, you can see a total of either 18 or 20 trees. Every day at 5 pm, I will go to each of you. If one of you tells me the correct total number of trees in your two yards, you will immediately be set free. If you are incorrect, you will immediately be executed. If you say nothing, you will stay in prison for another day.”

Louis and Marie are not able to directly communicate with one another during this meeting with the warden, and no communication is possible between their two cells. Each was fully aware of themselves during the meeting, though; the contents of the warden’s speech have become common knowledge between them.

Q: Assuming Louis and Marie are entirely rational (and that their rationality is common knowledge between them) can they guarantee their release? If so, how? How long will it take?

We’ll talk about the solution beneath this nice painting of trees. Meanwhile, take a few minutes to think about it yourselves.

Gustav Klimt, Pine Forest II. If you can correctly tell me how many trees you see, you will be set free…

At first, it might not seem like the warden’s speech gave any really useful information. Louis now knows that Marie sees either 10 or 12 trees, and Marie now knows that Louis sees either 6 or 8 trees. But it seems that neither can be sure which of the two numbers the other sees.

If we delve deeper into the thought processes of the prisoners, though, and consider some counterfactual situations, a different picture arises. First, suppose that Marie actually saw 19 or 20 trees out her window. Then, on Day 1, Marie could confidently say, “We see 20 trees between us,” because she see more than 18 trees herself. Therefore, back in the real world, after neither prisoner says anything on Day 1, and the prisoners therefore remain in their cells, Louis knows that Marie sees at most 18 trees. You might naturally raise the following objection here: “Of course Louis knows that Marie sees at most 18 trees. He didn’t have to wait until after Day 1 to know this. He already knew this immediately after the warden’s speech, when he knew that Marie sees either 10 or 12 trees!”

But the important point here is not just that Louis knows that Marie sees at most 18 trees, but that this fact is now common knowledge, since it follows immediately from the information given by the warden and by the fact that nobody said anything on Day 1, both of which are themselves common knowledge.

Furthermore, just for the sake of completeness, and at the risk of getting lost in the weeds, let us argue that the fact that Marie sees at most 18 trees was not common knowledge before 5 pm on Day 1. Indeed, Louis knew then that Marie sees 10 or 12 trees. Put Marie does not know that Louis knows this. This is because, as far as Marie knows, Louis could see 6 trees, in which case Louis would know that Marie sees either 12 or 14 trees. Therefore, Marie can only know that Louis knows that Marie sees either 10, 12, or 14 trees. But, in turn, Louis cannot know that Marie knows that, since, as far as Louis knows, Marie could see 10 trees, in which case Marie would only know that Louis knows that Marie sees either 8, 10, or 12 trees. It follows that Louis only knows that Marie knows that Louis knows that Marie sees either 8, 10, 12, or 14 trees. And we can continue in this way. The higher order the knowledge becomes, the more uncertainty is introduced into the range, until we reach the statement: Marie knows that Louis knows that Marie knows that Louis knows that Marie knows that Louis knows that Marie knows that Louis knows that Marie sees 4, 6, 8, 10, 12, 14, 16, 18, or 20 trees. Therefore, before 5 pm on Day 1, the knowledge that Marie sees at most 18 trees is not even ninth order knowledge!

I’m sure that was crystal clear. So let’s continue. We’ve established that, after Day 1, it is common knowledge that Marie sees at most 18 trees. Suppose now that it happened to be the case that Louis saw fewer than 2 trees. Then this fact, combined with the knowledge that Marie sees at most 18 trees, would lead Louis to conclude that the prisoners see fewer than 20 trees between them, so they must see exactly 18. In the real world, when Day 2 goes by and the prisoners are still in their cells, it therefore becomes common knowledge that Louis sees at least 2 trees. Again, this was not common knowledge before 5 pm on Day 2.

It should now be clear how to continue. After Day 3 goes by with no change, it becomes common knowledge that Marie sees at most 16 trees. Then, after Day 4, it becomes common knowledge that Louis sees at least 4 trees. After Day 5, Marie sees at most 14 trees. After Day 6, Louis sees at least 6 trees. After Day 7, Marie sees at most 12 trees.

So far, none of this common knowledge is enough for either prisoner to deduce the total number of trees. In fact, neither of the prisoners has learned anything new about the number of trees seen by the other. However, their higher order knowledge has steadily grown. Indeed, the increasing ranges we saw creeping in to the prisoners’ higher order knowledge on Day 1 have gradually been winnowed by the passage of time. And this pays off after Day 8, when it becomes common knowledge that Louis sees at least 8 trees. Marie then carefully recounts the trees outside her window, confirms that it is 12, and concludes that, together, they see at least (and therefore exactly) 20 trees. She gives the correct answer on Day 9, and the prisoners are released!

There’s another famous puzzle involving common knowledge, initially even more baffling than this one. It is known as the blue-eyed islanders puzzle. It has been written about extensively elsewhere, so let me just point you to one such place, namely, Terence Tao’s blog.

For more reading on common knowledge, see this entry at the Stanford Encyclopedia of Philosophy.

Cover Image: Spock performing a Vulcan mind meld, thereby establishing common knowledge with Dr. Simon Van Gelder.

Life on the Poincaré Disk

Just at this time I left Caen, where I was then living, to go on a geological excursion under the auspices of the school of mines. The changes of travel made me forget my mathematical work. Having reached Coutances, we entered an omnibus to go some place or other. At the moment when I put my foot on the step the idea came to me, without anything in my former thoughts seeming to have paved the way for it, that the transformations I had used to define the Fuchsian functions were identical with those of  non-Euclidean geometry. I did not verify the idea; I should not have had time, as, upon taking my seat in the omnibus, I went on with a conversation already commenced, but I felt a perfect certainty. On my return to Caen, for conscience’ sake I verified the result at my leisure.

-Henri Poincaré, Science and Method

You’re out for a walk one day, contemplating the world, and you suddenly have an out-of-body experience, your perspective floating high above your corporeal self. As you rise, everything seems perfectly normal at first, but, when you reach a sufficient altitude, you notice something strange: your body appears to be at the center of a perfect circle, beyond which there is simply…nothing!

You watch yourself walk towards the edge of the circle. It initially looks like you will reach the edge in a surprisingly short amount of time, but, as you continue watching, you notice yourself getting smaller and slowing down. By the time you are halfway to the edge, you are moving at only 3/4 of your original speed. When you are 3/4 of the way to the edge, you are moving at only 7/16 of your original speed. Maybe you will never reach the edge after all? What is happening?

At some point, you see your physical self notice some friends, standing some distance away in the circle. You wave to one another, and your friends beckon you over. You start walking toward them, but, strangely, you walk in what looks not to be a straight line but rather an arc, curving in towards the center of the circle before curving outward again to meet your friends. And, equally curiously, your friends don’t appear to be surprised or annoyed by your seemingly inefficient route. You puzzle things over for a few seconds before having a moment of insight. ‘Oh!’ you think. ‘My physical body is living on a Poincaré disk model for hyperbolic geometry, which my mind has somehow transcended during this out-of-body experience. Of course!”

The Poincaré disk model, which was actually put forth by Eugenio Beltrami, is one of the first and, to my mind, most elegant models of non-Euclidean geometry. Recall from our previous post that a Euclidean geometry is a geometry satisfying Euclid’s five postulates. The first four of these postulates are simple and self-evident. The fifth, known as the Parallel Postulate (recall also that two lines are parallel if they do not intersect), is unsatisfyingly complex and non-immediate. To refresh our memories, here is an equivalent form of the Parallel Postulate, known as Playfair’s Axiom:

Given any line \ell and any point P not on \ell, there is exactly one line through P that is parallel to \ell.

A non-Euclidean geometry is a geometry that satisfies the first four postulates of Euclid but fails to satisfy the Parallel Postulate. Non-Euclidean geometries began to be seriously investigated in the 19th century; Beltrami, working in the context of Euclidean geometry, was the first to actually produce models of non-Euclidean geometry, thus proving that, supposing Euclidean geometry is consistent, then so is non-Euclidean geometry.

The Poincaré disk model, one of Beltrami’s models, is a model for hyperbolic geometry, in which the Parallel Postulate is replaced by the following statement:

Given any line \ell and any point P not on \ell, there are at least two distinct lines through P that are parallel to \ell.

Points and lines are the basic objects of geometry, so, to describe the Poincaré disk model, we must first describe the set of points and lines of the model. The set of points of the model is the set of points strictly inside a given circle. For concreteness, let us suppose we are working on the Cartesian plane, and let us take the unit circle, i.e., the circle of radius one, centered at the origin, as our given circle. The points in the Poincaré disk model are then the points in the plane whose distances from the origin are strictly less than one.

Lines in the Poincaré disk model (which we will sometimes call hyperbolic lines) are arcs formed by taking one of the following type of objects and intersecting it with the unit disk:

  1. Straight lines (in the Euclidean sense) through the center of the circle.
  2. Circles (in the Euclidean sense) that are perpendicular to the unit circle.

(These can, of course, be seen as two instances of the same thing, if one takes the viewpoint that, in Euclidean space, straight lines are just circles of infinite radius.)

D, D1, and D2 are all lines in the Poincaré disk model. By Jean-Christophe BENOIST, Own work – CC BY 3.0

It’s already pretty easy to see that this geometry satisfies our hyperbolic replacement of the Parallel Postulate. In fact, given a line \ell and a point P not on \ell, there are infinitely many lines through P parallel to \ell. Here’s an illustration of a typical case, with three parallel lines drawn:

Three lines passing through a given point, parallel to a given line. Source: William Barker

We’re not quite able right now to prove that the disk model satisfies the first four of Euclid’s postulates, in part because we haven’t yet specified what it means for two line segments in the model to be be congruent (we don’t, for example, have a notion of distance in our model yet). We’ll get to this in just a minute, but let us first show that our model satisfies the first postulate: Given any two distinct points, there is a line containing both of them.

To this end, let A and B be two points in the disk. If the (Euclidean) line that contains A and B passes through the center of the disk, then this is also a line in the disk model, and we are done. Otherwise, the (Euclidean) line that contains A and B does not pass through the center of the disk. In this case, we use the magic of circle inversion, which we saw in a previous post. Let A' by the result of inverting A across the unit circle. Now A, A', and B are distinct points in the Cartesian plane, so there is a unique circle (call it \gamma) containing all three. Since A and A' are both on the circle, it is perpendicular to the unit circle. Therefore, its intersection with the unit disk is a line in the disk model containing both A and B. Here’s a picture:

print copy
Hyperbolic line containing A and B. Source: Euclid and Beyond by Robin Hartshorne

We turn now to distance in the Poincaré disk model. And here, for the sake of brevity, I’m not even going to try to explain why things are they way they are but will just give you a formula. Given two points A and B in the disk, consider the hyperbolic line containing them, and let P and Q be the points where this line meets the boundary circle (with P closer to A and Q closer to B). Then the hyperbolic distance between A and B is given by:

d(A,B) = \mathrm{ln}(\frac{|PB|\cdot|AQ|}{|PA|\cdot|BQ|}).

This is likely inscrutable right now. That’s fine. Let’s think about what it means for this to be the correct notion of distance, though. For one thing, it means that, given two points in the disk model, the shortest path between them is not, in general, the straight Euclidean line that connects them, but rather the hyperbolic line that connects them. This explains your body’s behavior in the story at the start of this post. When you were walking over to your friends, what appeared to your mind (which was outside the disk, in the Euclidean realm) as a curved arc, and therefore an inefficient path, was in fact a hyperbolic line and, because your body was inside the hyperbolic disk, the shortest path between you and your friends.

This notion of distance also means that distances inside the disk which appear equal to an external Euclidean observer in fact get longer and longer the closer they are to the edge of the disk. This is also consistent with the observations at the beginning of the post: as your body got further toward the edge of the disk, it appeared from an external viewpoint to be moving more and more slowly. From a viewpoint inside the disk, though, it was moving at constant speed and would never reach the edge of the disk, which is infinitely far away. The disk appears bounded from the external Euclidean view, but from within it is entirely unbounded and limitless.

Let’s close by looking at two familiar shapes, interpreted in the hyperbolic disk. First, circles. Recall that a circle is simply the set of points that are some fixed distance away from a given center. Now, what happens when we interpret this definition inside the hyperbolic disk? Perhaps somewhat surprisingly, we get Euclidean circles! (Sort of.) To be more precise, hyperbolic circles in the Poincaré disk model are precisely the Euclidean circles that lie entirely within the disk. (I’m not going to go through the tedious calculations to prove this; I’ll leave that up to you…) Beware, though! The hyperbolic center of the circle is generally different from the Euclidean center. (This should make sense if you think about our distance definition. The hyperbolic center will be further toward the edge of the disk than the Euclidean center, coinciding only if the Euclidean center of the circle is in fact the center of the hyperbolic disk.)

Next, triangles. A triangle is, of course, a polygon with three sides. This definition works perfectly fine in hyperbolic geometry; we simply require that our sides are hyperbolic line segments rather than Euclidean line segments. If we assume the first four of Euclid’s postulates, then the Parallel Postulate is actually equivalent to the statement that the sum of the interior angles of a triangle is 180 degrees. In the Poincaré disk model (and, in fact, in any model of hyperbolic geometry) all triangles have angles that sum to less than 180 degrees. This should be evident if we look at a typical triangle:

A typical triangle in the Poincaré disk model.

Things become interesting when you start to ask how much less than 180 degrees a hyperbolic triangle has. The remarkable fact is that the number of degrees in a hyperbolic triangle is dependent entirely on its (hyperbolic) area! The smaller a triangle is, the larger the sum of its interior angles: as triangles get smaller and smaller, approaching a single point, the sum of their angles approaches 180 degrees from below. Correspondingly, as triangles get larger, the sum of their angles approaches 0 degrees. In fact if we consider an “ideal triangle”, in which the three vertices are in fact points on the bounding circle (and thus not real points in the disk model), then the sum of the angles of this “triangle” is actually 0 degrees!

“Ideal” triangle with interior angles adding to zero.

A consequence of this is the fact that, in the Poincaré disk model, if two triangles are similar, then they are in fact congruent!

This leads us to our final topic: one of the perks of living in a Poincaré disk model. Perhaps the most frequent complaint I hear from people living on a Euclidean plane is that there aren’t enough ways to tile the plane with triangles. Countless people come up to me and say, “Chris, I want to tile the plane with triangles, and I want this tiling to have the following two pleasing properties:

  1. All of the triangles are congruent, they don’t overlap, and they fill the entire plane.
  2. At every vertex of the tiling, all angles meeting that vertex are the same.

But there are only four essentially different ways of doing this, and I’m tired of all of them! What should I do?”

(Exercise for the reader: Find all four such tilings!)

It just so happens that I have a simple answer for these people: “Move to a Poincaré disk model, where there are infinitely many tilings with these properties!” Here are just a few (all by Tamfang and in the public domain):

Right triangles. The smallest, in fact, that can tile the Poincaré disk model.
Larger right triangles.
The largest right “triangles”, each with two “ideal” vertices on the edge of the disk.
The dual to a tiling hidden in Escher’s Circle Limit III, the cover image to this post.
Equilateral triangles.
The largest “triangles”, each with three ideal vertices.

I’ll leave you with that! Hyperbolic geometry is fascinating, and I encourage you to investigate further on your own. The previous mentioned Euclid and Beyond, by Hartshorne, is a nice place to start.

This also wraps up (for now, at least) a couple of multi-part investigations here at Point at Infinity: a look at the interesting geometry of circles, which started in our post on circle inversion, and a look at various notions of independence in mathematics, the other posts being here and here. Join us next time for something new!

Cover Image: M. C. Escher, Circle Limit III