circular definition.noun Logic.a definition in which the definiendum (the expression being defined) or a variant of it appears in the definiens (the expression that defines it).-dictionary.com

Today we return to our “Circles” series, which we introduced what seems like years ago but was actually just March, with its first real installment, on circular definitions.

We are rightfully taught from a young age to avoid using a word in its own definition. This makes obvious sense: a definition is, in part, supposed to convey the meaning of a word to a reader who is unfamiliar with it. Ideally, this happens using words that the reader *is* already familiar with or can become familiar with. If the unfamiliar word is itself in the definition, it is difficult for it to effectively convey this meaning. The definition has failed at one of its primary tasks.

Even if we diligently avoid using words within their own definitions, though, circularity can easily crop up across multiple definitions. For example, we might make the following sequence of definitions:

ice.noun.the solid form of water.

steam.noun.the gaseous form of ice.

water.noun.the liquid form of steam.

Although none of these definitions is circular on its own, they form a closed circle when taken as a set, and this obscures the meaning of all three words. A person who is unfamiliar with all of the words “ice”, “steam”, and “water” will come away from these definitions knowing *something*, namely that the three words represent, respectively, the solid, gas, and liquid forms of a certain substance, but not much more than that. They would not know, for instance, any relevant properties of this substance, or where this substance might be found, or what it is used for, and these three definitions would certainly not be printed together in any decent dictionary. (One might argue that the definitions of “ice” and “steam” are basically fine, if a bit limited, and that the definition of “water” should be altered to something more descriptive, thus breaking this instance of circularity. This is essentially what is done by *Merriam-Webster*, for instance).

Or here is an example actually occurring in *Merriam-Webster*. (To be fair, the full definitions in *M-W* are much longer and have multiple entries that we are not quoting in their entirety here. Still, what we reproduce below are the exact quotes of the first definitions of each of the three respective words. (The definition of “weigh” we quote is the first given as an intransitive verb, which is the sense in which it is being used in the definition of “weight”.))

weight.noun.the amount that a thing weighs.

weigh.verb.to have a certain heaviness

heavy.adjective.having great weight

This probably doesn’t seem like an ideal scenario, but satisfactorily removing this circularity from the definitions of “weight”, “weigh”, and “heaviness” (which redirects to “heavy”) is more difficult than it might initially seen. I encourage you to spend a few minutes thinking about it. Let me know if you come up with something significantly better.

One might be wondering at this point if it is even possible to avoid circularity entirely in a system of definitions. In fact, a rather simple argument, of a sort that is ubiquitous in the mathematical field of graph theory, shows that it is impossible. Let us look at this argument now.

In our analysis, we will represent certain information from the dictionary in a *directed graph*, i.e., a collection of nodes and a collection of arrows between certain of these nodes. In our graph, each node will represent a word, and there will be an arrow pointing from Word 1 to Word 2 if Word 2 appears in the definition of Word 1. For example, a portion of this graph containing some of the words in our “ice”, “steam”, and “water” definitions given above might look like this:

We now make two assumptions that I think will seem quite reasonable:

- There are only finitely many words in the dictionary.
- Every word that is used in the dictionary itself has a definition in the dictionary.

We will now see that these two innocuous assumptions necessarily lead to a circular set of definitions. To see this, choose any word in the dictionary and start following an arbitrary path of arrows in the directed graph derived from the dictionary. For example, in the segment above, we might start at “ice”, then move to “water”, then to “liquid”, and then elsewhere, maybe to “flowing”, for example. Let us label the words we visit on this path as and so on. Recalling the meaning of the graph, this means that the definition of contains , the definition of contains , and so on.

A consequence of Assumption 2 is that each of the words visited in our path itself has a definition, and therefore there are certainly arrows pointing out of the node associated with . This means that we never get stuck on our path. We can always choose a *next* node, so we can continue our path indefinitely and find a word for every natural number .

We now have an infinite path of words. But Assumption 1 says there are only finitely many words in the dictionary, so this means that our path must repeat some words! In other words, there are numbers such that and are *the same word*. For example, maybe is the same word as . But in that case, look at what we have:

- the definition of contains ;
- the definition of contains ;
- the definition of contains ;
- the definition of contains (since ).

Thus, , , , and form a circular set of definitions. (The same sort of analysis obviously holds for any values of and , not just for 6 and 10.)

The fact that circularity is unavoidable should not, though, be a cause for despair, but can perhaps help us to clarify certain ideas about language and dictionaries. A dictionary is not meant to secure a language on a perfectly logically sound, immutable foundation. Nobody is worried that an inconsistency will be found in the dictionary and, as a result, the English language will collapse (whereas this sort of worry certainly does exist, in a limited form, in mathematics). Rather, it is meant to serve as a living documentation of a language as it is actually used in practice, and our everyday use of language is delightfully full of imprecisions and ambiguities and circularities. I would not want things to be any other way.

(This is not to say, of course, that all circular definitions should be uncritically embraced. Circular definitions are often simply unhelpful and can easily be improved. Our definitions of “ice”, “steam”, and “water”, for instance, would be greatly improved simply by defining “water” as something like “the liquid form of a substance whose molecules consist of two hydrogen atoms and one oxygen atom, which freezes at 0 degrees Celsius and boils at 100 degrees Celsius, and which is the principal component of the Earth’s rivers, oceans, lakes, and rain”.)

In the spirit of play, though, and in anticipation of the forthcoming mathematical discussion, let us consider two ways in which our language could be altered to actually allow us to avoid circles. These two ways essentially amount to denying the two Assumptions isolated in our proof above. The first way consists of allowing our language to be infinite. Our argument above hinged on the fact that our language has only finitely many words, so our path must return to a previously visited node at some point. If we had infinitely many words, though, we could line them all up and write down definitions so that each word’s definition only depended on words appearing later. So, for example, Word 1 might be defined in terms of Words 3, 5, and 17, and Word 3 might then be defined in terms of Words 6, 12, and 24, and so on. In practice, this is untenable for obvious physical reasons, but, even absent these physical concerns, it’s also unclear how any actual meaning would arise from such a system.

The second way consists of retaining a finite language but specifying one or more words as *not needing definitions*. These would be something like atomic words, perhaps expressing pure, indivisible concepts. In the graph diagram in our argument above, the nodes corresponding to these words would have no arrows coming out of them, so our paths would simply come to a halt, with no contradiction reached, if we ever reached such a node. Indeed, it is not hard to see how, given even a small number of such atomic words, one could build an entire dictionary, free from any circularity, on their foundation.

In some sense, this is actually close to how language works in practice. You can look up, for instance, the word “the” in the dictionary and find a definition that, at least for the first few entries, studiously avoids the use of “the”. Such definitions are certainly of use to, say, linguists, or poets. And yet nobody actually learns the word “the” from its definition. Similarly, you probably learned a word like “cat” by, over time, seeing many instances of actual cats, or images or sounds of cats, being told each time that what you were witnessing is a cat, and subconsciously collating the similarities across these examples into a robust notion of the word “cat”. Reading a definition may help clarify some formal aspects of the concept of “cat”, but it probably won’t fundamentally alter your understanding of the word.

One place where circularity certainly *should *be avoided, though, is in mathematical reasoning. Just as definitions of words make use of other words, proofs of mathematical theorems often make use of other theorems. For example, if I wanted to prove that there is a real number that is not a solution of any polynomial equation with integer coefficients (i.e., there exists a transcendental number), the shortest path might be to make use of two theorems due to Cantor, the first being that the set of polynomial equations with integer coefficients has the same size as the set of natural numbers, and the second being that the set of real numbers is strictly larger than the set of natural numbers. Since these theorems have already been proven, we are free to use them in our proof of the existence of transcendental numbers.

Carelessness about circularity can lead to problems here, though. Suppose we have “proved” three theorems (Theorem A, Theorem B, and Theorem C), but suppose also that our proof of Theorem A depends on the truth of Theorem B, our proof of Theorem B depends on the truth of Theorem C, and our proof of Theorem C depends on the truth of Theorem A. Have we really proven anything? Well, no, we haven’t. In fact, if you allow for such circular reasoning, it’s easy to prove all sorts of obviously false statements (and I encourage you to try!).

How does mathematics avoid this problem and actually get started proving things? Exactly by the second method outlined above for potentially removing circular definitions from language. Just as we specified certain words as not needing definitions there, here we specify certain mathematical statements as being true *without** needing proof*. These statements are known as *axioms*; they are accepted as being true, and they serve as the foundation on which mathematical theories are built.

Statements can be chosen as axioms for a number of reasons. Often they are statements that are either seen as obviously true or definitionally true. Sometimes long experience with a particular mathematical field leads to recognition that adopting a certain statement as an axiom is particularly useful or leads to particularly interesting consequences. Different sets of axioms are adopted by different mathematicians depending on their goals and area of study. But they are always necessary to make mathematical progress.

A proper discussion of axioms would fill up many books, so let us end this post here. In our next entry in our Circles series, we will look at the important mathematical idea of *well-foundedness* and how it allows us to develop mathematical definitions or arguments that at first glance may appear circular but are in fact perfectly valid.

Cover image:“Etretat in the Rain” by Claude Monet