On Jigsaw Puzzles

Our whole life is solving puzzles.

-Ernő Rubik

I found myself this year occupying an office with an extra desk that was disturbingly tidy for the first half of the fall semester, when I happened upon a jigsaw puzzle featuring an image of Edward Gorey’s “An Exhibition” and decided it would be the perfect way to add some clutter to my office. And so, during the months of November and December, Gorey’s drawing emerged on my desk as the 1000 pieces of the puzzle were slowly but steadily assembled during midday breaks in my mathematical work.

I enjoyed jigsaw puzzles as a child, but this was my first time completing a puzzle in over a decade, and my first time seriously working on a puzzle since starting to do research in mathematics. And as I progressed through its solution, I was struck by how the experience of solving it mirrored the experience of doing math research. I’ll be the first to admit that the observation that doing math is like solving puzzles is not particularly deep or original, but what increasingly stood out to me was how unexpectedly rich and multifaceted the similarities between the acts of research and puzzle-solving turned out to be.

So today, we are going to explore some of these similarities, with the aim of perhaps helping to begin to illuminate what it actually feels like, at least at an abstract level, to do math research. For although a relatively small percentage of the population has actively engaged in mathematical research, the act of solving jigsaw puzzles is a more universal experience, and I expect many people will be familiar with its joys and frustrations.


Solving a jigsaw puzzle and completing a mathematical research project are both processes, with distinct phases, and these phases will furnish a useful framework for our comparison. With this in mind, let us begin at the beginning.

1. Assembling the edge

When starting a puzzle, there are certain easily identifiable pieces about which one knows some additional information. These are, of course, the edge pieces, which are straight on one (or, in rarer but even more informative cases, two) of their sides. And so, the most common and most reliable strategy for beginning a jigsaw puzzle is first to isolate the edge pieces and then to fit all of them together to construct the “frame” of the puzzle. The reasons for this being an effective strategy are many, but chief among them are surely:

  • It is easy to identify edge pieces.
  • It is relatively easy to fit them together, since for two edge pieces there are at most two possible ways in which they can be joined, whereas for two interior pieces there are up to sixteen possible ways in which they can be joined.
  • Once constructed, the edge of the puzzle will help give structure to the more chaotic interior, providing scaffolding off of which one can steadily build into the center of the puzzle.

The counterpart to assembling the edge pieces in mathematical research is, I claim, doing background reading, getting yourself up to speed with regards to the current state of knowledge about the problem you are trying to solve. While it is certainly conceivable that one could solve a problem with little or no knowledge of previous work, in practice it is immensely helpful and time-saving to be familiar with what other researchers have already discovered about the problem, with techniques that have been developed to tackle this or similar problems, and with ways in which these techniques have succeeded or failed in the past. In somewhat strained analogy with the above, let me isolate three reasons why this is effective:

  • Given modern bibliographic and communication tools, it is relatively easy to identify relevant books or papers in the existing literature (though it is not always as easy as one might hope).
  • Reading the background literature should be an “easier” or at least more consistently productive process than jumping immediately in to the research, in large part because it will not require the novel insights that the eventual solution to your problem will involve. (This is, in part, assuming that the background literature is available to you and reasonably well-written, neither of which is an entirely safe assumption.)
  • After becoming familiar with previous work, you will have a solid foundation that should provide some firm jumping-off points to begin your research.

2. Experimental tinkering

Once you have completed the edge of your puzzle, it is probably wise to take a break to prepare yourself for the more daunting prospect of working your way into its interior. At the beginning, this process is likely be slow, frustrating, overwhelming, and bewildering, made especially difficult by the fact that, although the puzzle pieces fit together to form a coherent image, individually they are abstract blobs of color and texture, with little indication as to the form they will eventually be a part of. This phenomenon is particularly well depicted by the labyrinthine opening sentence (containing not one nor two but three colons!) of Georges Perec’s (yes, that Georges Perec) masterpiece novel, Life: A User’s Manual, in which jigsaw puzzles feature prominently:

To begin with, the art of jigsaw puzzles seems of little substance, easily exhausted, wholly dealt with by a basic introduction to Gestalt: the perceived object — we may be dealing with a perceptual act, the acquisition of a skill, a physiological system, or, as in the present case, a wooden jigsaw puzzle — is not a sum of elements to be distinguished from each other and analysed discretely, but a pattern, that is to say a form, a structure: the element’s existence does not precede the existence of the whole, it comes neither before nor after it, for the parts do not determine the pattern, but the pattern determines the parts: knowledge of the pattern and of its laws, of the set and its structure, could not possibly be derived from discrete knowledge of the elements that compose it. That means that you can look at a piece of a puzzle for three whole days, you can believe that you know all there is to know about its colouring and shape, and be no further on than when you started.

Georges Perec, Life: A User’s Manual

In practice, this initial phase of exploration into the center of the puzzle consists of repeatedly finding two pieces that have similar colors or textures and trying to fit them together in any conceivable configuration. Most of the time, this will fail. On the rare occasion when it succeeds, the two pieces will melt into one another, becoming a single larger piece, although, at least initially, this larger piece will still remain an abstract blob of color and texture, giving little indication of its eventual identity.

A useful strategy in this early stage of puzzle assembly is to try to directly attach interior pieces to pieces in the completed edge. This allows you to use to your advantage the facts that the completed edge already provides some nontrivial information about the structure of the interior of the puzzle, at least providing partitions into discrete regions of similar colors and textures, and that the edge pieces have fewer degrees of freedom than unconnected interior pieces, so progress will be quicker at the edge. Of course, once an interior piece is attached to the edge, it becomes part of the edge itself, to be further built upon.

The initial stages of mathematical research are equally slow, frustrating, overwhelming, and bewildering. As above, it is prudent to build off of your background knowledge, taking advantage of the little bit of structure it already provides. Make small tweaks to existing techniques to see if they lead anywhere interesting. Try to connect different bits of existing information to understand how they form complementary parts of a coherent whole. Again, most of these initial attempts at creating new knowledge fail. But every now and then something works, and a tiny breakthrough is achieved. And while these initial achievements don’t yet give you an idea of the final shape of the eventual solution, they gradually add both to existing store of background knowledge and to your own personal store of experience.

3. Making Connections

As the small groups of connected pieces constructed in Phase 2 begin to grow, something curious happens. The abstract patterns of the individual pieces begin coalescing into recognizable forms. A formerly mystifying red ribbon reveals itself to be a bicycle frame. A black and orange pattern becomes a person’s clothing. A black smudge, when placed in context, is instantly recognizable as said person’s shoe. And then, sometimes, something truly satisfying happens: two of these groups become rotated in just the right way and moved to just the right spots within the frame of the puzzle, and, in a moment of great serendipity, they are found to fit precisely together, in a way that was entirely unforeseen but in hindsight appears obvious and inevitable. This coupling creates a larger, even more structured group that gives the solver information not just about its constituents but about the pieces that will eventually fill in the remaining gaps. In this way, the landscape of the puzzle is gradually revealed so that, even if not all of the pieces are in place, the solver knows in a broad sense what the final picture will be.

img_20191110_175711

Mathematical research progresses in much the same way. The disparate facts built up in Phase 2 begin to fit together to form a coherent narrative. Bits of knowledge from two seemingly distant domain reveal themselves to be two sides of the same coin, shedding light not only on these specific bits but on a whole constellation of ideas. The correct questions to ask begin to emerge and, later on, you see outlines of their solutions.

It is in this third phase of puzzle solving and mathematical research that the real magic happens, where new discoveries are made, where fleeting moments of pure joy are unearthed. And it is these moments that make the admittedly rather toilsome other phases worth it.

4. Filling in the Gaps

When all of the major features in the puzzle have been fixed in place, when the solver understands the final image, some gaps will remain to be filled: stray undiscovered pieces in the midst of otherwise completed forms, or small expanses of blue sky or green grass. This phase again becomes somewhat tedious and involves none of the discovery of Phase Three, but its finality lends it a satisfaction not found in any of the early solving phases. There is also pleasure in the increasing ease with which you progress through this phase. As the stocks of remaining pieces and open spots simultaneously dwindle, the pieces are slotted in with accelerating rapidity until, suddenly, the image is complete.

img_20200113_124514

In mathematics, too, there is still work to be done after Phase Three. Technical details must be worked out. Proofs must be checked. The paper must be written. And though a mathematical project does not have as obvious an ending as a jigsaw puzzle, there must come a point when you say, “It is done,” and send it to a journal for publication.


There are two additional connections that I want to draw between puzzle solving and mathematical research that don’t quite fit within the above “phases” framework. The first involves the gradual development of intuition and expertise.

The early phases of solving a puzzle consist of a great deal of trial and error. Any two pieces with similar patterns are tried together, in every conceivable configuration, in a slow and rather tedious process. As the puzzle-solving progresses, though, you begin to be able quickly to see, through increased familiarity with subtleties in gradation or piece shape or other more ineffable factors, which pieces are more likely to fit together and which you needn’t bother trying, cutting down greatly on the amount of trial and error needed and considerably speeding up the process. You also begin to see more quickly through the abstraction of the individual pieces, identifying them for what they are even before connecting them with their neighbors.

I was able to witness the effects of this intuition recently. About a year ago, I visited some friends and spent about 15 minutes working on a jigsaw puzzle with their 5-year-old son. At the time, the puzzle was well into what would be labeled “Phase 2” in our framework. Over the course of these 15 minutes, I think I succeeded in connecting precisely one piece after a large amount of trial and error. The five-year-old, meanwhile, seemed to be making much fewer connection attempts but, in the same amount of time, connected about ten pieces. This disparity is not because he is better at puzzles or has faster visual cognition or has better geometric visualization than me, though all of those things are possible. Rather, I think it is because he had been working on this puzzle for enough time that he developed a feel for which types of pieces are likely to work in certain places in the puzzle.

Intuition works similarly in mathematics and continues to be developed across different projects and throughout many years. Every young researcher has had the experience of telling a more senior figure in the field about a recent result they have proven only to have the senior mathematician immediately give a simpler proof of a more general version of the result or make sweeping, initially surprising connections between the new result and existing mathematical facts. This can be initially disconcerting to the young mathematician, since this result that they have spent months or years of hard work on has been absorbed and improved upon by the senior mathematician in a matter of minutes, but in fact it should not take anything away from their accomplishment. After all, they proved this result, while the senior mathematician did not. What the senior mathematician brings is an extensive knowledge of the landscape in which the result sits that lets them quickly assimilate it in their mental space. And through experiences like this, the younger mathematician begins to build up their intuition as well.

The second and final additional connection I want to draw concerns what happens when the project is over.

After the last puzzle piece is placed and you step away to admire your work, the lines between the puzzle pieces begin to mentally dissolve, and the puzzle simply becomes the image printed on its face. All of the difficult toil of the previous hours and days gets washed away. Visitors who look at the completed puzzle do not see all of the ways in which the pieces didn’t fit together. The final solution looks obvious and inevitable. Of course that is how one should place the pieces; how else would you do it?

A completed mathematical project similarly takes on a sheen of inevitability. In your published account of your work, all of the dead ends and initial mistakes are excised from history, and the path from the beginning to the end is laid out in a nice orderly manner rather than the winding maze in which it was initially traversed. After you’ve had a bit of time to get perspective on your work (sometimes only a matter of hours), it begins to seem trivial and obvious. Of course this is how the problem was solved; how else would you do it?

This phenomenon (and some others we have discussed) is well-described by passages in Life: A User’s Manual (for puzzles) and Birth of a Theorem: A Mathematical Adventure, by Cedric Villani (for mathematics).  Let’s look first at the passage on puzzles, which in the book comes shortly after the passage excerpted above:

The pieces are readable, take on a sense, only when assembled; in isolation, a puzzle piece means nothing — just an impossible question, an opaque challenge. But as soon as you have succeeded, after minutes of trial and error, or after a prodigious half-second flash of inspiration, in fitting it into one of its neighbours, the piece disappears, ceases to exist as a piece. The intense difficulty preceding this link-up — which the English word puzzle indicates so well — not only loses its raison d’être, it seems never to have had any reason, so obvious does the solution appear. The two pieces so miraculously conjoined are henceforth one, which in its turn will be a source of error, hesitation, dismay, and expectation.

Life: A User’s Manual

Next, Villani, a distinguished French mathematician, thinks about the process of mathematical research while walking through a dark pedestrian tunnel at night:

This gloomy tunnel is a little like the one you pass through when you begin work on a new mathematical problem […] Total obscurity. Bilbo in Gollum’s tunnel.

A mathematician’s first steps into unknown territory constitute the first phase of a familiar cycle.

After the darkness comes a faint, faint glimmer of light, just enough to make you think that something is there, almost within reach, waiting to be discovered…. Then, after the faint, faint glimmer, if all goes well, you unravel the thread — and suddenly it’s broad daylight! You’re full of confidence, you want to tell anyone who will listen about what you’ve found.

And then, after day has broken, after the sun has climbed high into the sky, a phase of depression inevitably follows. You lose all faith in the importance of what you’ve achieved. Any idiot could have done what you’ve done, go find yourself a more worthwhile problem and make something of your life. Thus the cycle of mathematical research…

Cedric Villani, Birth of a Theorem: A Mathematical Adventure


I finally want to end with one key difference between puzzle-solving and mathematical research. Jigsaw puzzles are bounded, existing entirely within their frame. When you are done with a puzzle, you are done, and the next puzzle you complete will have nothing to do with it except its form. Each mathematical project, on the other hand, is part of the same landscape. When you zoom out and adjust your perspective, it becomes a puzzle piece of its own, to be connected in surprising ways to other projects, forming ever larger and more detailed pieces that combine to form the infinitely rewarding and exquisitely detailed background image of the puzzle of mathematics.


Cover Image: Illustration from The Doubtful Guest by Edward Gorey

Common Knowledge

The phrase “common knowledge,” in its common usage, refers to knowledge that is held by (nearly) everyone, at least within some given community. For example, it is common knowledge in modern society that the Earth is round and orbits the sun, or that 2+2=4. These are facts that I can assume that you, my readers, already know or can easily verify, and it’s common practice not to give citations or acknowledge sources for them.

Today, though, we’re not really going to be discussing this common, informal usage of “common knowledge.” Nor are we going to devote this post to facts about the musician behind the classic hip hop album, Resurrection. (Though did you know that the track “I Used to Love H.E.R.” launched a feud with Ice Cube that was eventually settled at a private meeting organized by Louis Farrakhan at his house??) Rather, we’re going to delve into “common knowledge” as a fascinating technical concept in philosophical logic, epistemology, and social science.

Let’s start with the idea of mutual knowledge. Suppose that there is a group of people, and there is a fact that is known by everyone in the group. Then this fact is mutual knowledge among the members of the group. The classification of a fact as mutual knowledge seems like a pretty strong assertion, but we can (and often do) ask for even more. We can ask, for example, that not only does everyone in the group know the fact, but everyone in the group knows that everyone in the group knows the fact. Or we can go further, and ask that everyone in the group knows that everyone in the group knows that everyone in the group knows the fact. Carrying this process to infinity, we obtain the concept of common knowledge.

To be more precise, let us say that a fact is first-order knowledge among members of a group if everyone knows it, i.e., it is mutual knowledge. Let us say it is second-order knowledge if everyone in the group knows that everyone in the group knows it. Continue in this manner. In general, if n is a natural number and we have already defined the notion of n-th order knowledge, then a fact is (n+1)-st order knowledge among members of a group if it is n-th order knowledge and everyone in the group knows that it is n-th order knowledge. A fact is then common knowledge among members of a group if it is n-th order knowledge for every natural number n.

Let’s look at an example to try to clarify the distinction between mutual knowledge and common knowledge. Suppose that all of the students in a class have arrived to a classroom early and are sitting at their desks, and there is a large beetle crawling on the wall. Every student sees the beetle, so the presence of the beetle in the classroom is mutual knowledge among the students. None of the students particularly wants to deal with the beetle, though, so each student pretends to ignore it. As far as each student is concerned, they might be the only student who has noticed the beetle. The presence of the beetle is thus not even second-order knowledge among the students. Then the professor walks in the room, sees the beetle, points at it, and loudly exclaims, “Ah, I see we have a beetle auditing the class today!” Now, suddenly, the students know that they all know that there is a beetle in the room, and they know that they all know that they all know that there is a beetle in the room, and so on. The presence of the beetle has suddenly become common knowledge.

In the grand scheme of things, the study of common knowledge, and higher order knowledge in general, is a rather recent development in philosophy; David Hume was perhaps the first major philosopher to consider it, in A Treatise of Human Nature. It has become extensively studied since then, though, and, if one knows where to look, it can be found almost anywhere.

Consider the following example, due to philosopher Alan Hájek. You are in a restaurant, and a server walking by you trips on his shoelaces and spills a bit of soup on your shirt. You are, naturally, upset, and the server becomes apologetic and says, “It was my fault.” This is of course something almost everyone would expect him to say, but let’s look at it more closely. Why exactly did he say it, and why is it important? Everyone involved in the incident already knew that the server was at fault, so this utterance would seem to add no new information. However, it actually can be the key to ensuring that the incident does not escalate further, by establishing the server’s fault as common knowledge. Indeed, before the server’s admission, you might have thought that the server considered you at fault, which surely would have made you even angrier. After the server’s statement, some common ground has been established, and, with luck and maybe a free dessert, the situation can calmly be resolved.

The notion of higher order knowledge also shows up all the time in game theory (and, therefore, economics, international relations, etc.). When formulating an ideal strategy for a game, one must consider not only the current state and rules of the game, but also the knowledge and strategies of all of one’s partners and adversaries. But they, in turn, of course, are considering your knowledge and strategies when formulating their own, and one quickly reaches vertiginous levels of higher order knowledge. If you want to see common knowledge repeatedly raise its many confusing heads, just head to a high stakes poker game. (“I know that she would almost never bluff in this spot. But we’ve been playing together for days, so I also know that she knows that I know this, so maybe she’s actually more likely to bluff. But maybe she knows that I know that she knows that I know this, which means…I don’t know. I call.”)


We end today with a puzzle that has common knowledge at its core: There are two prisoners, Louis and Marie, each in their own prison cell. Each looks out on their own little yard. Louis can see 8 trees in his yard; Marie can see 12. One day, the warden brings them together and tells them both, “Between the two of you, you can see a total of either 18 or 20 trees. Every day at 5 pm, I will go to each of you. If one of you tells me the correct total number of trees in your two yards, you will immediately be set free. If you are incorrect, you will immediately be executed. If you say nothing, you will stay in prison for another day.”

Louis and Marie are not able to directly communicate with one another during this meeting with the warden, and no communication is possible between their two cells. Each was fully aware of themselves during the meeting, though; the contents of the warden’s speech have become common knowledge between them.

Q: Assuming Louis and Marie are entirely rational (and that their rationality is common knowledge between them) can they guarantee their release? If so, how? How long will it take?

We’ll talk about the solution beneath this nice painting of trees. Meanwhile, take a few minutes to think about it yourselves.

klimt_pine_forest
Gustav Klimt, Pine Forest II. If you can correctly tell me how many trees you see, you will be set free…

At first, it might not seem like the warden’s speech gave any really useful information. Louis now knows that Marie sees either 10 or 12 trees, and Marie now knows that Louis sees either 6 or 8 trees. But it seems that neither can be sure which of the two numbers the other sees.

If we delve deeper into the thought processes of the prisoners, though, and consider some counterfactual situations, a different picture arises. First, suppose that Marie actually saw 19 or 20 trees out her window. Then, on Day 1, Marie could confidently say, “We see 20 trees between us,” because she see more than 18 trees herself. Therefore, back in the real world, after neither prisoner says anything on Day 1, and the prisoners therefore remain in their cells, Louis knows that Marie sees at most 18 trees. You might naturally raise the following objection here: “Of course Louis knows that Marie sees at most 18 trees. He didn’t have to wait until after Day 1 to know this. He already knew this immediately after the warden’s speech, when he knew that Marie sees either 10 or 12 trees!”

But the important point here is not just that Louis knows that Marie sees at most 18 trees, but that this fact is now common knowledge, since it follows immediately from the information given by the warden and by the fact that nobody said anything on Day 1, both of which are themselves common knowledge.

Furthermore, just for the sake of completeness, and at the risk of getting lost in the weeds, let us argue that the fact that Marie sees at most 18 trees was not common knowledge before 5 pm on Day 1. Indeed, Louis knew then that Marie sees 10 or 12 trees. Put Marie does not know that Louis knows this. This is because, as far as Marie knows, Louis could see 6 trees, in which case Louis would know that Marie sees either 12 or 14 trees. Therefore, Marie can only know that Louis knows that Marie sees either 10, 12, or 14 trees. But, in turn, Louis cannot know that Marie knows that, since, as far as Louis knows, Marie could see 10 trees, in which case Marie would only know that Louis knows that Marie sees either 8, 10, or 12 trees. It follows that Louis only knows that Marie knows that Louis knows that Marie sees either 8, 10, 12, or 14 trees. And we can continue in this way. The higher order the knowledge becomes, the more uncertainty is introduced into the range, until we reach the statement: Marie knows that Louis knows that Marie knows that Louis knows that Marie knows that Louis knows that Marie knows that Louis knows that Marie sees 4, 6, 8, 10, 12, 14, 16, 18, or 20 trees. Therefore, before 5 pm on Day 1, the knowledge that Marie sees at most 18 trees is not even ninth order knowledge!

I’m sure that was crystal clear. So let’s continue. We’ve established that, after Day 1, it is common knowledge that Marie sees at most 18 trees. Suppose now that it happened to be the case that Louis saw fewer than 2 trees. Then this fact, combined with the knowledge that Marie sees at most 18 trees, would lead Louis to conclude that the prisoners see fewer than 20 trees between them, so they must see exactly 18. In the real world, when Day 2 goes by and the prisoners are still in their cells, it therefore becomes common knowledge that Louis sees at least 2 trees. Again, this was not common knowledge before 5 pm on Day 2.

It should now be clear how to continue. After Day 3 goes by with no change, it becomes common knowledge that Marie sees at most 16 trees. Then, after Day 4, it becomes common knowledge that Louis sees at least 4 trees. After Day 5, Marie sees at most 14 trees. After Day 6, Louis sees at least 6 trees. After Day 7, Marie sees at most 12 trees.

So far, none of this common knowledge is enough for either prisoner to deduce the total number of trees. In fact, neither of the prisoners has learned anything new about the number of trees seen by the other. However, their higher order knowledge has steadily grown. Indeed, the increasing ranges we saw creeping in to the prisoners’ higher order knowledge on Day 1 have gradually been winnowed by the passage of time. And this pays off after Day 8, when it becomes common knowledge that Louis sees at least 8 trees. Marie then carefully recounts the trees outside her window, confirms that it is 12, and concludes that, together, they see at least (and therefore exactly) 20 trees. She gives the correct answer on Day 9, and the prisoners are released!


There’s another famous puzzle involving common knowledge, initially even more baffling than this one. It is known as the blue-eyed islanders puzzle. It has been written about extensively elsewhere, so let me just point you to one such place, namely, Terence Tao’s blog.


For more reading on common knowledge, see this entry at the Stanford Encyclopedia of Philosophy.


Cover Image: Spock performing a Vulcan mind meld, thereby establishing common knowledge with Dr. Simon Van Gelder.

Infinite Hats II

Today, we end our exploration of hat puzzles with a look at a couple of variations on the infinite hat puzzle from our last post. In particular, we will deal with situations in which the prisoners cannot hear each other’s guesses and therefore cannot pass along any information, and yet in which the prisoners still have surprisingly effective strategies at their disposal.


Puzzle Five: Countably infinite hats with no information

The setup is the same as that of Puzzle Four. There are countably infinitely many prisoners, labelled by the natural numbers. They are each given a red or blue hat and are lined up so that each prisoner knows their own number and can see exactly the prisoners with larger numbers. Each prisoner guesses the color of their own hat. If they are correct, then they are released; if they are incorrect, then they are executed. The difference is that, in this puzzle, although the prisoners can still devise a strategy beforehand, none of the prisoners knows the guesses of any of the others. How can the prisoners act to minimize the number of prisoners who guess incorrectly?


With no information being passed, it might seem hard to envision any strategy being reasonably effective. We will see, though, that for a reasonable definition of ‘reasonably effective,’ there is a reasonably effective strategy, and it can be found using only tools that we developed in our solution to Puzzle Four. As usual, configurations of hats will be represented as infinite sequences of 0s and 1s, with 0 meaning ‘red’ and 1 meaning ‘blue.’ In our last post, we isolated an equivalence relation, called E_0, on the set of all infinite sequences of 0s and 1s (this set is denoted by {^\omega}2), where, for two sequences x,y \in {^\omega}2, we say x \sim y if x and y differ in only finitely many positions. Also as in our solution of Puzzle Four, we can assume that, during their strategizing, the prisoners fixed a transversal of E_0, i.e. a function h from the set of equivalence classes of E_0 to {^\omega}2 such that, for every equivalence class X, we have h(X) \in X.

The point, of course, is that, when the prisoners are lined up with hats on their heads, each prisoner can see all but finitely many of the hats, so each prisoner knows precisely in which equivalence class their particular configuration lies. Denote this equivalence class by Y. Since each prisoner knows h, which was fixed in advance, each prisoner also knows h(Y) and knows that their configuration only differs from h(Y) in finitely many places. Suppose now that each prisoner guesses the color indicated for their position by h(Y). Whenever their particular configuration differs from h(Y), they will be wrong. However, we know that their configuration only differs from h(Y) finitely many times, so only finitely many prisoners are wrong. This finite number could be arbitrarily large, but it is guaranteed that, from some point on, the prisoners’ configuration will agree with h(Y), and all of the prisoners from that point on will guess correctly. This isn’t bad for a situation that may at first glance have appeared somewhat hopeless!


Puzzle Six: A continuum of hats

We have more prisoners now, one for each real number. They are again each given a red or blue hat and arranged in a line so that each prisoner can see exactly those prisoners with strictly larger numbers than their own. Each prisoner guesses the color of their hat; if they are correct, they are released, and if not, they are executed. And, again as in Puzzle Five, each prisoner knows their own number, and the prisoners are allowed to devise a strategy beforehand, but no prisoner knows the guesses of any of the other prisoners.


Note: The solution to this puzzle requires somewhat more sophistication than previous solutions, so be prepared to work a little bit here. It’s worth it, though!


In this situation, configurations of hats will be represented by functions f:\mathbb{R} \rightarrow \{0,1\}, where, for a real number r, f(r) gives the color of the hat worn by Prisoner r. Let \mathcal{F} denote the set of all such functions. We can try to recreate the strategy used in Puzzle Five in the following way: define an equivalence relation on \mathcal{F} by letting f \sim g if and only if f and g differ in only finitely many places. Oh, but now we run into a problem: no prisoner is able to tell in which equivalence class their configuration lies, since each prisoner is ignorant of the colors of the hats of all of the prisoners with smaller numbers and, for each real number r, the set of real numbers smaller than r has the same size as the set of all real numbers!

OK. Let’s try to generalize the strategy from Puzzle Five in a different way, using a different equivalence relation. Let’s say that f \sim g if, for all large enough r, f(r) = g(r). More precisely, f \sim g if there is some real number r^* such that, for all r\geq r^*, f(r) = g(r). The reader is invited to check that this does in fact define an equivalence relation. Also, it is evident that each prisoner is able to tell in which equivalence class of this relation the actual configuration lies. Therefore, we can mimic the strategy from Puzzle Four: the prisoners fix beforehand a transversal (call it h) of the equivalence relation, i.e. a function that chooses a representative element from each equivalence class. Then, given a real number r, Prisoner r guesses (h(X))(r), where X is the equivalence class of the actual configuration. By the definition of the equivalence relation, then, there is some real number r^* such that, for every prisoner with a number at least as large as r^*, that prisoner guesses correctly.

This is something, at least. We guarantee that continuum many prisoners are released, and we will certainly have an impressive run of success past the magic real number r^*. However, we don’t really know anything about what happens for the prisoners with numbers less than r^*. For all we know, it could be the case that all of them guessed incorrectly! We will have saved continuum many prisoners, but perhaps we have sent continuum many prisoners to their deaths as well! That doesn’t seem too good. Can we do better?

Yes, we can. In fact, I claim that the prisoners have a strategy which guarantees the following:

  1. Only countably many prisoners guess incorrectly.
  2. There is a real number r^* such that, for every real number r \geq r^*, Prisoner r guesses correctly.

How can we accomplish this seemingly miraculous feat? Think about it for a few minutes before continuing.

mad_hatter

The strategy works as follows. First, let \lambda denote the cardinality of \mathcal{F} (namely, \lambda = 2^{2^{\aleph_0}}). One of the many equivalent formulations of the axiom of choice is that any set can be well-ordered in order type equal to its cardinality. In other words, we can enumerate the elements of \mathcal{F} by the ordinals \alpha < \lambda such that each element is enumerated exactly once. Fix such an enumeration, \{f_\alpha \mid \alpha < \lambda\}, during the prisoners’ strategizing session.

Now suppose that the prisoners are given hats and lined up, and suppose that their actual configuration is given by a function g:\mathbb{R} \rightarrow \{0,1\}. Given a real number r and an ordinal \alpha < \lambda, let us say that (r, \alpha) is possible for g if, for all real numbers s > r, f_\alpha(s) = g(s). In other words, (r, \alpha) is possible for g if, as far as Prisoner r knows, the actual configuration might be f_\alpha. Now, for each real number r and ordinal \alpha < \lambda, Prisoner r is able to tell if (r, \alpha) is possible for g, since this depends only on values of g that Prisoner r can see. Prisoner r can thus identify the least ordinal \alpha < \lambda such that (r, \alpha) is possible for g. Denote this ordinal by \alpha_r. Prisoner r then guesses the hat color given by f_{\alpha_r}(r).

I claim that this strategy does what we want, though this fact should by no means be obvious at this point. To prove that it works, we have to analyze what happens when a prisoner guesses incorrectly. To this end, suppose that r is a real number and Prisoner r guesses incorrectly. In particular, this means that f_{\alpha_r}(r) \neq h(r). Now suppose that s < r. Prisoner s can see Prisoner r, so it must be the case that f_{\alpha_s}(r) = h(r). In particular, \alpha_s \neq \alpha_r, since f_{\alpha_s}(r) \neq f_{\alpha_r}(r). Also note that, since (s, \alpha_s) is possible for g and s < r, we easily have that (r, \alpha_s) is possible for g. But Prisoner r chose \alpha_r instead of \alpha_s, so we must have \alpha_r < \alpha_s. Putting this together, we have shown the following:

  • For all real numbers s < r, \alpha_s \geq \alpha_r and, if Prisoner r guesses incorrectly, then \alpha_s > \alpha_r.

Now let S be the set of all real numbers r such that Prisoner r guesses incorrectly, and let A = \{\alpha_r \mid r \in S\}. By what we have proven, if s < r are both in S, then \alpha_s > \alpha_r. In particular, |A| = |S|.

Claim: S is countable.

Proof: Suppose not. Then, as |A| = |S|, A is uncountable. Since A is a subset of an ordinal, A is well-ordered by the usual ordering on the ordinals. Let \gamma be its order type, and let \{\beta_\eta \mid \eta < \gamma \} enumerate A in increasing order. For \eta < \gamma, let r_\eta be the unique element of S such that \alpha_{r_\eta} = \beta_\eta. Note that, for \eta < \zeta < \gamma, we must have r_\zeta < r_\eta. Since A is uncountable, \gamma \geq \omega_1. Recall that the rational numbers, \mathbb{Q}, are dense in \mathbb{R}, i.e. for every pair of real numbers s < r, there is a rational number q \in \mathbb{Q} such that s < q < r. Now for each \eta < \omega_1, fix a rational number q_\eta such that r_{\eta+1} < q_\eta < r_\eta. Since the sequence \langle r_\eta \mid \eta < \omega_1 \rangle is strictly decreasing, it must be the case that, for \eta < \zeta < \omega_1, q_\eta \neq q_\zeta. But then \{q_\eta \mid \eta < \omega_1\} is an uncountable set of rational numbers, contradicting the fact that there are only countably many rational numbers. Thus, S is countable, and the claim is proven.

Therefore, when employing our strategy, only countably many prisoners guess incorrectly. It remains to show that there is a real number r^* such that, for all real numbers r \geq r^*, Prisoner r guesses correctly. If this is not true, then we can construct an increasing sequence  \langle r_n \mid n \in \mathbb{N} \rangle of real numbers such that, for all n \in \mathbb{N}, Prisoner r_n guesses incorrectly. But now, if m < n are natural numbers, then \alpha_{r_m} > \alpha_{r_n}, so \langle \alpha_{r_n} \mid n \in \mathbb{N} \rangle is an infinite, strictly decreasing sequence of ordinals, contradicting the fact that the ordinals are well-ordered. Thus, we have shown that are strategy works exactly as desired.

Remark: If this strategy is not sufficiently surprising to you, let me quickly remark that we can make the challenge to the prisoners seemingly much more difficult in two different ways, and yet the same strategy will have the same astoundingly good result.

  1. First, we assumed above that there were only two possible hat colors for the prisoners. We never actually used this fact, though. The strategy would work just as well if there were ten possible colors of hats, or one for each natural number, or one for each real number. The strategy would work even if the number of possible colors were equal to an inaccessible cardinal, a measurable cardinal, a supercompact cardinal! The only necessity is that the prisoners know beforehand the set of possible colors, in which case they can devise a strategy exactly as above.
  2. In this puzzle, in order to be released, each prisoner only has to guess the color of their own hat. Perhaps this is too easy, so let’s make it harder as follows. In this new, strengthened challenge, each prisoner guesses the color not only of their own hat but also the color of the hat of ever prisoner with a smaller number. For each real number r, in order for Prisoner r to be released, there must be some real number s < r such that Prisoner r guesses every hat color correctly on the interval [s,r] (recall this is the set of real numbers t such that s \leq t \leq r). Success in this new challenge probably now seems almost comically difficult, but, as I’ll leave it to the reader to check, the strategy we gave above easily generalizes to a strategy for this challenge with exactly the same result!

For more on this and related matters, the interested reader is directed towards the entertaining paper, “A Peculiar Connection Between the Axiom of Choice and Predicting the Future,” by Christopher S. Hardin and Alan D. Taylor.

 

Infinite Hats I

Now—Ten thousand, and ten thousand times ten thousand (for matter and motion are infinite) are the ways by which a hat may be dropped upon the ground, without any effect.—Had he flung it, or thrown it, or cast it, or skimmed it, or squirted, or let it slip or fall in any possible direction under heaven,—or in the best direction that could be given to it,—had he dropped it like a goose—like a puppy—like an ass—or in doing it, or even after he had done, had he looked like a fool,—like a ninny—like a nicompoop—it had fail’d, and the effect upon the heart had been lost.

-Laurence Sterne, The Life and Opinions of Tristram Shandy, Gentleman

We continue last week’s investigation of hat puzzles, moving today from the land of the finite to the land of the infinite, where amazing things can happen. We start by considering the infinite analogue of Puzzle Two from last Monday.


Puzzle Four: E0 and the infinite line

There is a countable infinity of prisoners, labeled by the natural numbers (Prisoner 0, Prisoner 1, Prisoner 2, etc.). They are each given a red or blue hat and are lined up (buried up to their necks, of course, as is standard practice) so that each prisoner can see exactly those prisoners with higher numbers. The warden then goes up the line, starting at Prisoner 0, and asks the prisoners to guess the colors of their hats. If a prisoner guesses correctly, they are released; if they guess incorrectly, they are executed. Assuming that the prisoners were able to decide upon a strategy beforehand, and assuming that each prisoner can hear the guesses of all of the prisoners with lower numbers, how can the prisoners act to minimize the number of prisoners who will be executed?


This is, of course, exactly Puzzle Two with a countably infinite number of prisoners, arranged like the natural numbers. In Puzzle Two, we were able to save all prisoners except possibly the first, who was guessing with no information. We would like to be able to generalize this strategy and, one would hope, be able to save all but one prisoner in this case, as well. The strategy in Puzzle Two relied on the prisoners counting whether there is an even or odd number of red hats in front of them and using their guesses to pass this information up to the other prisoners. However, it is likely that, in the infinite version, every prisoner will see infinitely many red hats and infinitely many blue hats. The odd/even distinction no longer makes sense in the usual way. But we shouldn’t give up yet; we’ll just have to work a bit harder.

As before, we will represent configurations of hats by strings of 0s and 1s, with 0 meaning ‘red’ and 1 meaning ‘blue.’ This time, the strings will be infinite, with bit 0 representing the hat color of Prisoner 0, bit 1 the hat color of Prisoner 1, and so on. The set of all infinite strings of 0s and 1s will be denoted by {^\omega}2. If x \in {^\omega}2 and n is a natural number, then x(n) is the n^{\mathrm{th}} bit of x.

If x,y \in {^\omega}2, let us say that x is equivalent to y, and write x \sim y, if there are only finitely many n such that x(n) \neq y(n). In other words, x and y are equivalent if, from some point on, they are the same. The reader can easily verify that this relation satisfies the following three properties:

  • (Reflexivity) For all x \in {^\omega}2, x \sim x.
  • (Symmetry) For all x, y \in {^\omega}2, if x \sim y, then y \sim x.
  • (Transitivity) For all x,y,z \in {^\omega}2, if x \sim y and y \sim z, then x \sim z.

Thus, the relation \sim is what’s called an equivalence relation. In fact, it is such an important relation that it has its own name, the Vitali equivalence relation, and its own symbol, E_0. E_0 is extensively studied in the field of mathematics known as descriptive set theory and is, in a certain sense, “the simplest complicated equivalence relation.” This vague statement is made precise in the famous Glimm-Effros Dichotomy, which states that, for every “nicely definable” (Borel, for those in the know) equivalence relation E on a Polish space (the real numbers, or {^\omega}2, are standard examples of Polish spaces), either E is smooth (i.e. admits a Borel transversal), or there is a continuous embedding of E_0 into E.

If x \in {^\omega}2, let [x] denote the E_0-equivalence class of x, i.e. the set \{y \in {^\omega}2 \mid x \sim y\}, and let {^\omega}2/E_0 denote the set of E_0-equivalence classes. The fact that E_0 is an equivalence relation means that each member of {^\omega}2 belongs to precisely one equivalence class. A function f:{^\omega}2/E_0 \rightarrow {^\omega}2 is called a transversal for E_0 if, for every E_0-equivalence class X, we have f(X) \in X. In other words, a transversal is a way of selecting one element from each E_0-equivalence class.

Side Remark: E_0 is not smooth, which means that a transversal for E_0 cannot be “nicely definable” (again, Borel). Coming up with a transversal for E_0 thus involves a non-trivial application of the axiom of choice and is precisely the place where the axiom of choice is used in Vitali’s famous construction of a set of real numbers that is not Lebesgue measurable.

What does all of this have to do with our puzzle? Well, notice that, though no prisoner can see the entire configuration of hats (even Prisoner 0 is missing one), each prisoner sees all except finitely many of the hats, and therefore each prisoner knows exactly which E_0-equivalence class the configuration of hats is in. In other words, if x^* is the actual configuration of hats, then no prisoner knows x^*, but every prisoner knows [x^*]. Suppose that, when the prisoners were devising a strategy, they fixed a transversal f:{^\omega}2/E_0 \rightarrow {^\omega}2. Thus, every prisoner knows f([x^*]). Let y^* denote f([x^*]).

Since f is a transversal, we know that x^* \sim y^*. Therefore, each prisoner sees only a finite number of places at which x^* and y^* disagree (for prisoners with large enough numbers, this finite number will in fact be 0). And now that we have finite numbers, we have meaningful concepts of ‘even’ and ‘odd’! The strategy of Puzzle Two can now be implemented as follows.

Prisoner 0 starts by comparing what they see of x^* with the corresponding part of y^* and counting the number of bits at which they disagree. If this number is even, they say ‘Red,’ and if this number is odd, they say ‘Blue.’ Of course, this guess has only a 50 percent chance of being correct, but we claim that, as in Puzzle Two, every prisoner after Prisoner 0 can now guess correctly! Let’s start with Prisoner 1. They can also count the number of bits at which what they see of x^* disagrees with the corresponding part of y^*. They also know, based on Prisoner 0’s guess, whether Prisoner 0 saw an even or odd number of disagreements. If Prisoner 1 agrees with Prisoner 0 about the parity of the number of disagreements, they know that their own color matches the corresponding bit of y^* (as otherwise Prisoner 0 would see one more disagreement than Prisoner 1, and the parity would be different), so they can guess the color indicated by y^*(1) and guarantee that they are correct. If Prisoner 1 disagrees with Prisoner 0 about the parity, they know that their own color disagrees with the corresponding bit of y^*, and they can guess the color different from that indicated by y^*(1).

The strategy then continues exactly as in Puzzle Two. Based on the guesses of Prisoners 0 and 1, Prisoner 2 knows whether Prisoner 1 sees an even or odd number of differences between x^* and y^*, compares this with what they see, and guesses accordingly. This continues all the way up the line, with every prisoner except Prisoner 0 guaranteed release.

At the risk of over-explaining, let’s see how the first few steps will work for a specific example. Suppose that:

x^* = 0110100\ldots

y^* = 1011001\ldots

and that x^* and y^* agree after what is written above. Every prisoner knows the part of x^* that lies after their position in the line, and every prisoner knows all of y^*. Prisoner 0 sees 4 differences between x^* and y^* (there are in fact 5 differences, but Prisoner 0 cannot see that their own hat color disagrees with y^*) and therefore says, ‘Red.’ Prisoner 1 sees 3 differences and knows that Prisoner 0 sees an even number of differences. Prisoner 1 therefore knows that their hat color differs from that corresponding to y^*(1) (which is ‘Red’), so they guess ‘Blue’ (correctly, of course). Now, since Prisoner 1 guessed the color different from that suggested by y^*, Prisoner 2 knows that Prisoner 1 disagrees with Prisoner 0 about the parity of the disagreements between x^* and y^*. Since they know Prisoner 0 sees an even number of differences, they know that Prisoner 1 sees an odd number of differences. Prisoner 2 also sees an odd number (3) of differences, so they know that their hat color matches that suggested by y^* and guess ‘Blue.’ This continues, with an infinite string of prisoners correctly guessing, as if by magic, the color of the hat they are wearing.

Remark: There is a slightly different strategy for solving this puzzle that involves the prisoners fixing in advance a non-principal ultrafilter on the natural numbers and using this ultrafilter to extend in a meaningful way the parity function on finite strings of 0s and 1s (i.e. the function that indicates whether there is an even or odd number of 0s) to a parity function on infinite strings of 0s and 1s. This function can then be used to implement exactly the strategy from Puzzle Two in the infinite case. I encourage the interested reader to work through the details. The mathematically mature reader may consult the linked paper of Geschke, Lubarsky, and Rahn for this and more on role of the axiom of choice in hat puzzles.

Hats and Hamming Codes

On Monday, we considered some basic puzzles involving finite number of hats. We continue these investigations here with a slightly more involved puzzle. This puzzle is due to Todd Ebert and appeared in his 1998 PhD thesis. It attracted much attention, largely due to its connections with coding theory, and even inspired a New York Times article.


Puzzle 3: Hats and Coding

The Puzzle

There’s a finite group of prisoners. The prisoners are led into a room, where each is randomly (and independently) adorned with either a red or a blue hat (each has a 50% probability). Each prisoner can see the hat colors of all of the other prisoners, but no prisoner can see the color of their own hat. Each prisoner has a device with three buttons: ‘Red,’ ‘Blue,’ and ‘Pass.’ At a fixed time, the prisoners simultaneously press a button, with each prisoner either trying to guess the color of their hat or passing. If at least one prisoner guesses a color and everyone who guesses a color guesses correctly, then the prisoners are released. If everyone passes or at least one person guesses incorrectly, then the prisoners are executed. The prisoners are not allowed to communicate once they are given their hats; their guesses or passes must be based solely on the colors of the hats they can see. However, the prisoners can confer beforehand and devise a strategy. How can the prisoners maximize their probability of being released?

Solution for 3 prisoners

A naive strategy would be to choose one prisoner to be the ‘guesser.’ That prisoner will always guess ‘red,’ and everybody else will pass. This strategy will have a success rate of \frac{1}{2}. It turns out, though, that we can do better. This might initially be surprising to you. Since anybody guessing the color of their own hat has no useful information about the color of their hat, the success rate of any individual guess will be \frac{1}{2}. However, if we cleverly choose the circumstances in which a prisoner will guess and the color they will guess, we can concentrate the wrong guesses in a small number of ‘bad’ configurations of hats while spreading out the right guesses to a larger number of ‘good’ configurations.

To see how this might work, let’s suppose there are three prisoners, and let’s suppose we use the following strategy, which we call Strategy A: if a prisoner sees that the other two prisoners have the same color hat, they will guess the other color. If a prisoner sees that the other two prisoners have different colors, then they will pass.

How will Strategy A fare? If all three of the prisoners have the same color hat, then all three prisoners will guess the other color, and all three will be wrong. However, if two prisoners are wearing one color and one prisoner is wearing the other, then only the prisoner wearing the color appearing just once will guess, and they will be right! This strategy fails, therefore, only if all of the prisoners have the same color. What are the chances of this happening? For each color, the chances of all hats being that color are \frac{1}{2} \times \frac{1}{2} \times \frac{1}{2} = \frac{1}{8}, so the chances of it happening for either color is \frac{1}{8} + \frac{1}{8} = \frac{1}{4}. Strategy A therefore has a success rate of \frac{3}{4}.

This is certainly better than the \frac{1}{2} success rate of the naive strategy. Is it the best we can do? To make further analysis less cumbersome, let’s replace ‘red’ and ‘blue’ with ‘0’ and ‘1,’ i.e., suppose that prisoners with red hats are labeled ‘0’ and prisoners with blue hats are labeled ‘1,’ and that the prisoners guess ‘0’ or ‘1’ rather than ‘red’ or ‘blue.’ Let’s also suppose that, as part of their strategizing, the prisoners number themselves, from 1 to 3 (or 1 to n if there are n prisoners). A configuration of hats can therefore be represented as a sequence of length 3 (or, generally, n) of 0s and 1s. Note that, for 3 prisoners, there are 8 possible configurations of hats (in general, the number is 2^n):

000    001    010    011    100    101    110    111

When Strategy A is deployed, 12 total guesses are made throughout all configurations: 6 incorrect guesses are made in the configurations 000 and 111, and 6 correct guesses are made in the other configurations. As mentioned, because the prisoners have no real information about their own hat colors, this equality of correct and incorrect guesses is necessary. Suppose, then, that we had a strategy better than Strategy A. Since Strategy A succeeds in 6 of the 8 configurations, this new strategy would have to succeed in at least 7. It would therefore have to have at least 7 correct guesses across all of the configurations, which means it would also have to have at least 7 incorrect guesses. However, these 7 incorrect guesses would have to be restricted to the single configuration on which the strategy fails. Each configuration only has room for 3 guesses, so this is impossible. Therefore, Strategy A is optimal.

Generalizing the solution

With an eye to generalizing Strategy A to work with more prisoners, let’s look at it a little more deeply. Given two configurations of hats, let us say that the two configurations are close if they are same configuration or if one can be turned into the other by changing just one hat. So, using our ‘0’ and ‘1’ notation, 110 and 100 are close, whereas 101 and 010 are not close. Strategy A identifies two special configurations (000 and 111), which we will call bad configurations, which satisfy the following conditions:

  1. Every configuration is close to one of the bad configurations.
  2. No configuration is close to two different bad configurations.

Once these bad configurations have been identified, Strategy A can be formulated as follows. Each prisoner looks at the other hats and figures out whether what they see is consistent with one of the bad configurations. If it is, then they guess that their hat is the color that is different from the color of their hat in this bad configuration. So, for example, if Prisoner 1 sees two blue hats, they know that this is consistent with one of the bad configurations, namely 111, so Prisoner 1 guesses 0, or ‘red.’ If the prisoner sees something that is inconsistent with all of the bad configurations, they pass. So, if Prisoner 1 sees that 2 is wearing blue and 3 is wearing red, then they know the configuration could not possibly be any of the bad configurations, so they pass.

Suppose now that the number of prisoners is an arbitrary natural number, n. There are now 2^n possible configurations of hats, each represented by a string of 0s and 1s of length n. Let {^n}2 denote the set of all strings of 0s and 1s of length n. In analogy with our analysis of Strategy A, let us make the following definition.

Definition: Suppose H \subseteq {^n}2 (so H is a set of strings of length n of 0s and 1s). We say H is an n-Hamming set if it satisfies the following two conditions:

  1. Every element of {^n}2 is close to a member of H.
  2. There is no element of {^n}2 that is close to more than one member of H.

Suppose for a moment that H is an n-Hamming set. What can we say about |H|? For each x \in H, let A_x be the set of y \in {^n}2 such that y is close to x. For each x, there are n+1 elements that are close to it (x is close to itself, and there are n ways to change one position of x). Therefore, each set A_x has size n+1. Moreover, condition 1 in the definition of Hamming set implies that the union of all of the A_xs is all of {^n}2, which has size 2^n, and condition 2 in the definition of Hamming set implies that, if x_1 and x_2 are different members of H, then A_{x_1} \cap A_{x_2} = \emptyset. Putting this all together, we obtain:

|H| \cdot (n+1) = 2^n, or

|H| = \frac{2^n}{n+1}

Since |H| is an integer, it must be the case that n+1 evenly divides 2^n, so, in order for there to be an n-Hamming set, n+1 must be a power of 2. Thus, we may assume there is a natural number k such that n=2^k-1, in which case |H| = 2^{n-k} = 2^{2^k-(k+1)}.

It turns out that, if n+1 is a power of 2, then there is always an n-Hamming set. We will not prove this fact; it can be done with a modest amount of linear algebra.

Theorem: Suppose k \in \mathbb{N}. Then there is a 2^k-1-Hamming set.

This allows us to demonstrate an optimal strategy for the puzzle in cases in which the number of prisoners is one less than a power of 2. To see this, suppose k is a natural number and n = 2^k - 1. Let H be an n-Hamming set. Now suppose that the prisoners have been given their hats and, based on what a prisoner sees, the configuration could be one of the elements of H. Then the prisoner will guess the color they are not wearing in that element of H. In all other cases, the prisoner will pass. For example, if n=7, 0110011 is in H, and Prisoner 1 sees ?110011, then they will guess ‘1.’ If the actual configuration is 1110011, all other prisoners will pass, and the strategy will succeed. If the actual configuration is 0110011, then all of the prisoners will guess (and all will be incorrect). Condition 1 of the definition of Hamming set implies that at least one prisoner will guess in each configuration. Condition 2 implies that the strategy is well-defined. Therefore, the strategy will only fail if the configuration appears in H. Since \frac{|H|}{|{^n}2|} = \frac{2^{n-k}}{2^n} = \frac{1}{2^k}, the success rate of the strategy is \frac{2^{k}-1}{2^k} = \frac{n}{n+1}. The more prisoners there are, the more likely they are to succeed. Also, an easy calculation similar to that done for n=3 shows that this is the best we can do.

If the number of prisoners is not one less than a power of two, things don’t work out quite so nicely. Note that, if m < n, then a strategy for m prisoners can be employed by n prisoners as follows: m prisoners are chosen to pretend like they are the only prisoners and enact the strategy for m prisoners. The remaining n-m prisoners are sent off to twiddle their thumbs (and, of course ‘pass’ when the time comes to guess). This strategy will have the same success rate as the corresponding strategy for m prisoners. Thus, if there are n prisoners and k is the largest natural number such that 2^k-1 \leq n, then there is a strategy for n players with a success rate of \frac{2^k-1}{2^k}. For some values of n (such as n=4, where k=2), this is still optimal. For others, it might not be. The question of the optimal strategy for n prisoners when n is not necessarily one less than a power of 2 turns out to be a very difficult question, and we won’t say more about it here.

Coding theory

We end by remarking that Hamming sets can be used to develop error-correcting codes. Suppose you are sending some information, but the channel you are using to send it is a little noisy, and a small but non-negligible amount of error is introduced. You would like to be able to both identify and correct these errors. Here’s one way to do that. Suppose that your message can be sent using an alphabet of 16 symbols. Let H be a 7-Hamming set. By the calculations above, |H| = 16, so we can assign each letter of our alphabet to a member of H. We can then send the message, using the members of H to stand for the corresponding letters. Now, when the message is received, if a 7-bit string is in H, then it can be decoded as the corresponding letter. If it is not in H, then there is a unique element of H that is close to the received string, so the string can be changed to that element of H and then decoded. As long as there is no 7-bit string in which two errors occur (and, if the error rate is sufficiently small, this is a reasonable assumption), then the message is transmitted perfectly.

Finite Hats

Why should anyone be frightened by a hat?

-Antoine de Saint-Exupéry

As a traveling mathematician, one steadily amasses, mostly over drinks with other mathematicians, a compendium of puzzles, conundrums that come to us from some strange world of transfinite subways and unnecessarily complicated lighting solutions.

One of the most common types of puzzle one encounters in this way is the “hat puzzle,” which typically involves a situation of the following form: some number of people are wearing hats. According to the prevailing fashions in this world, each hat is either red or blue. Some of these people can see the hats of some of the other people, but nobody can see their own hat. The most common way to make this plausible is to suppose that these people are prisoners at the whims of some sadistic prison warden who has buried them up to their necks so that they do not have use of their arms to reach up, remove their hats, and examine their color. The people want to guess the colors of their own hats in order to gain their freedom (the rules and winning conditions of the contest vary according to the puzzle under consideration). Your challenge is typically to come up with the optimal strategy for the prisoners to employ so that they have the highest chance of escaping or so that as many of them as possible are guaranteed to escape.

Hat puzzles come in two basic flavors that look similar in setup but feel very different in execution: puzzles with finitely many hat-wearers and puzzles with infinitely many hat-wearers. Solutions to finite hat puzzles typically involve finding clever ways to encode or transmit information. Solutions to infinite hat puzzles typically involve finding clever applications of the axiom of choice.  And though Point at Infinity is dedicated to all things infinite, I don’t want to deprive my readers of the joys of finite hat puzzles, so, over the next two weeks, we will look at both. This week will cover the finite case, with two basic puzzles today and a more involved one, with connections to information theory, on Thursday.


Puzzle One: The Four Hats

There are four prisoners, named A,B,C, and D. Each is wearing either a blue or a red hat, and each knows that there are two hats of each color between them. The prisoners are all buried up to their necks. A,B, and C are in a line on one side of a brick wall, so that A can see B and C, B can only see C, and C can see neither of the others. D is on the other side of the brick wall; they can see nobody else, and nobody can see them. The prisoners are not allowed to speak except to guess the color of their own hat. If the first prisoner to speak guesses their own hat color correctly, then all four go free. If the first prisoner to speak guesses their own hat color incorrectly, then all four are executed. Can the prisoners guarantee their release?

(ANSWER AFTER THE PICTURE. TAKE A MOMENT TO THINK BEFORE READING FURTHER.)

rsz_four_hats
The four prisoners in their predicament. Two are wearing red hats, and two are wearing blue hats (all hats are purple in the picture). The first to speak must guess their own hat color correctly. Can they succeed?

Yes, the prisoners can guarantee success. Here’s how.

  • First note that, if B and C have the same color hat, then A will know this. A will then know that there are no more hats of that color, since everyone knows there are only two hats of each color. A can then confidently declare the other color.
  • If B and C have different hat colors, then A can no longer be sure of the color of their hat. A will therefore be silent. After a short period of silence, B will be able to conclude that their hat color is different from C’s, as otherwise A would have immediately spoken. B can therefore confidently state the color that C is not wearing.

This covers all of the possible configurations of hats, so we are done!


Puzzle Two: The Ten Hats

That was just a warm-up. Let’s get to something more interesting. In this puzzle, there are ten prisoners (as we will see, the exact number is not important; the solution will generalize to any number of prisoners), numbered 1 through 10. They are each wearing a hat, either red or blue, randomly assigned. Each prisoner knows that all of the hats are either red or blue, but none of them knows how many hats of each color there are. The prisoners are again buried up to their necks in a straight line, this time in a way so that each prisoner can see exactly those prisoners with higher numbers.

The warden proceeds along the line, from back to front (1 to 10), asking each prisoner to guess the color of their own hat. The guesses are audible to all of the prisoners, but no other communication is permitted. (The prisoners were permitted to confer and devise a strategy before they were buried and given hats, though). For each prisoner, if they guess their hat color correctly, then they are released. If they guess incorrectly, then they are executed. How should the prisoners act to maximize the number of prisoners who are guaranteed to be released?

rsz_ten_hats
One particular instance of the ten hats puzzle. Each prisoner can see the hats of the prisoners with higher numbers. The prisoners, in order from 1 to 10, guess the colors of their own hats. The guesses are audible to everyone. What is the prisoners’ optimal strategy?

Here’s a first natural attempt at a solution. Suppose that each odd-numbered prisoner guesses the color of the prisoner in front of them. Then each even-numbered prisoner will hear this guess and know the color of their own hat, so they can simply repeat the guess of the prisoner behind them. This will guarantee that each even-numbered prisoner is released. In addition, some of the odd-numbered prisoners will get lucky and ‘accidentally’ guess their own color. Half of the prisoners are guaranteed to be released, and, on average, three-quarters of the prisoners will be released. Can we do better?

Yes! First, note that Prisoner 1 is always guessing with no information and thus cannot have better than a 50 percent chance of being released. The best we can possibly do, then, is to guarantee that the other nine prisoners can be released. This might seem, at first, hopelessly optimistic, but it can be achieved!

The key idea is that, through their guesses, the prisoners can pass up information about how many red or blue hats they can see, while simultaneously (except for Prisoner 1) guessing their own hat colors correctly. To get things started, it is agreed among the prisoners that Prisoner 1 will guess ‘Red’ if they see an even number of red hats and ‘Blue’ if they see an odd number of red hats. Let’s first see how this helps Prisoner 2.

After Prisoner 1’s guess, Prisoner 2 knows whether Prisoner 1 saw an even or odd number of red hats. Prisoner 2 also knows whether they themselves see an even or odd number of red hats. If the prisoners agree about this information, then Prisoner 2 knows that they must be wearing a blue hat and will guess ‘Blue.’ If they disagree about this information, then Prisoner 2 knows that they must be wearing a red hat and will guess ‘Red.’

Here’s how this will play out in the particular instance of the puzzle in the picture above. Prisoner 1 sees 4 red hats (and 5 blue hats) and will therefore say ‘Red,’ signifying an even number of red hats. Prisoner 2 sees 3 red hats, an odd number. Prisoner 2 therefore knows that, to account for the discrepancy between what they see and what Prisoner 1 sees, they must be wearing a red hat and will say ‘Red.’

Here’s the neat part: when Prisoner 2 guesses their hat color (correctly), they are simultaneously telling the other prisoners whether they see an even or an odd number of red hats, since their guess depends on whether what they see agrees or disagrees with what Prisoner 1 sees. Prisoner 3 is now in the same position as Prisoner 2 was after Prisoner 1 had guessed, so Prisoner 3 can guess correctly. But now Prisoner 3 is telling the other prisoners whether they see an even or odd number of red hats, and Prisoner 4 can guess correctly. This continues all the way up the line, and Prisoners 2 through 10 all escape.

Let’s go back to the picture to see how this would work in this particular instance. We know that Prisoner 1 says ‘Red,’ and then Prisoner 2 correctly guesses ‘Red.’ What does Prisoner 3 know now? They know that Prisoner 1 saw an even number of Red hats, because Prisoner 1 said ‘Red.’ They then know that Prisoner 2 saw an odd number of Red hats, because they also said ‘Red.’ Prisoner 3 sees an odd number (3) of red hats, which agrees with what Prisoner 2 sees. Therefore, Prisoner 3 knows their hat is blue and says ‘Blue.’ Prisoner 4 then knows that Prisoner 3 sees an odd number of red hats. Prisoner 4 also sees an odd number of red hats, so Prisoner 4 guesses ‘Blue.’ Prisoner 5 sees an even number of red hats. Since this disagrees with what they know Prisoner 4 sees, they guess ‘Red,’ and so on.

As I mentioned, it is clear that this solution does not depend on there being exactly ten prisoners; it will work for any number of prisoners who happen to find themselves in the same position. The first prisoner will be in the unfortunate situation of having only a 50 percent chance of escaping, but the remaining prisoners can all be guaranteed escape!

chapeau
Source: unepetitereveuse