## July 2, 2021

### Large Sets 8

#### Posted by Tom Leinster

Previously: Part 7. Next: Part 9

If you’ve been wanting to follow this series but haven’t had time to keep up, now’s a good moment to hop back on board — I won’t assume much of what’s gone before.

Back in the mists of time, when I took a first undergraduate course on axiomatic set theory, I was exhilarated by the extraordinary world of infinite sets I saw opening up before me. In that world, addition is the same as multiplication! Which is the same as maximum! That is,

$X + Y \cong X \times Y \cong max(X, Y)$

for all infinite $X$ and $Y$. It seemed unthinkably exotic.

I then heard this part of cardinal arithmetic called “trivial” for exactly the reasons just stated. Although that description is technically correct, it poured a bucket of cold water over my enthusiasm in a way that only mathematicians can.

So with apologies to my past self, I give you the informal title of this post: the nontrivial part of cardinal arithmetic.

Once we’ve seen that binary addition and multiplication are “trivial”, what’s left to do in cardinal arithmetic? Well, there’s addition and multiplication of infinitely many sets, and there’s exponentiation. Some of the theory of those operations is trivial too, in the sense of following immediately from the isomorphisms

$X + Y \cong X \times Y \cong max(X, Y)$

for infinite $X$ and $Y$. Let’s get that trivial stuff out of the way first.

Addition, even infinitary addition, is more or less the same as supremum. An easy little argument shows that for any family $(X_i)_{i \in I}$ of nonempty sets,

$\sum_{i \in I} X_i \cong \max(I, \sup_{i \in I} X_i)$

— unless, of course, it’s a finite family of finite sets. (The sum of finitely many natural numbers is not the same as its sup!)

Exponentiation is a special case of infinitary multiplication: $Y^X = \prod_{x \in X} Y$. Still, it’s an important enough case that it’s well worth thinking about separately.

In many cases, exponentiation is trivial too. It’s trivial when the expression

$Y^X$

is top-heavy — when $X$ is bigger than $Y$. It’s even trivial when $X$ is just slightly smaller than $Y$. Indeed,

$Y^X = 2^X \quad \text{ if } \quad 2^X \geq Y,$

assuming that $X$ and $Y$ are infinite. The proof is a pushover:

$2^X \leq Y^X \leq (2^X)^X \cong 2^{X \times X} \cong 2^X.$

For example, $\mathbb{N}^X \cong \mathbb{R}^X \cong 2^X$ for all infinite sets $X$.

There’s another case where exponentiation is trivial: when the base is the power set of something. Another one-line proof shows that if $Y$ is a power set then

$Y^X \cong \max(Y, 2^X).$

Here I’ve assumed again that $X$ and $Y$ are infinite. If $X$ or $Y$ is finite then exponentiation is trivial too.

So the only occasions when $Y^X$ is not trivial to compute are when $Y$ is not a power set and $2^X \lt Y$ — the expression $Y^X$ is “bottom-heavy”. It’s certainly the case that

$Y \leq Y^X \leq 2^Y.$

But even assuming the generalized continuum hypothesis, this still leaves two options for $Y^X$: is it $Y$ or $2^Y$?

Now we’re moving across the boundary into what I’m calling the “nontrivial” part of cardinal arithmetic. (Of course, any professional would regard everything I’m saying here as trivial.) And it’s time to talk about the wonderful König’s theorem.

König’s theorem states that for any families of sets $(X_i)_{i \in I}$ and $(Y_i)_{i \in I}$ whatsoever,

$X_i \lt Y_i \ \text{for all}\ i \quad \implies \quad \sum_{i \in I} X_i \lt \prod_{i \in I} Y_i.$

Let’s spend a while contemplating this result, before we start to think about the proof. Some observations:

• There are many variant statements that can be obtained by taking König’s theorem and replacing one or both $\lt$ signs by $\leq$, or by changing the sum to a product, or vice versa. All of those variants are either trivially true or false. (For example, changing both $\lt$s to $\leq$s gives a false statement.) Among these variants, König’s theorem is unique in being both nontrivial and true.

• If we assume the generalized continuum hypothesis then König’s theorem does become trivial, at least when $X_i$ and $Y_i$ are infinite. For then $2^{X_i} \leq Y_i$ for every $i$, and taking the product of these inequalities over all $i$ gives $2^{\sum X_i} \leq \prod Y_i$, hence the result.

• When every $X_i$ is empty, König states that a product of nonempty sets is nonempty. This is equivalent to the axiom of choice (in the presence of the other ETCS axioms). So, we’re going to have to use the axiom of choice in the proof.

• When $X_i = 1$ and $Y_i = 2$ for all $i$, König’s theorem becomes Cantor’s theorem: $I \lt 2^I$ for all sets $I$. So, we can expect the proof to be a generalization of the diagonal argument.

As for the proof, I won’t spoil it! I promise you that if you haven’t seen it before and you figure it out for yourself, it will brighten your day. It’s the proof of strict inequality that’s the really nice part.

When I first learned König’s theorem, I got a bit overexcited. I thought that such a general statement would be sure to have a wealth of applications. But as far as I know, there’s basically only one, and it’s this:

Let $X$ and $I$ be infinite sets. Suppose that $X$ can be expressed as a coproduct of $I$ sets that are all $\lt X$. Then $X \lt X^I$.

Indeed, if $X \cong \sum_{i \in I} X_i$ with $X_i \lt X$ for each $i$, then König’s theorem with $Y_i = X$ for all $i$ gives the result.

For example, suppose we have a model of ETCS in which $\aleph_\omega$ exists. Then

$\aleph_\omega \cong \sup_{n \in \mathbb{N}} \aleph_n \cong \sum_{n \in \mathbb{N}} \aleph_n,$

so

$\aleph_\omega \lt \aleph_\omega^\mathbb{N}.$

The right-hand side here is a very bottom-heavy exponential: $\aleph_\omega$ is vastly bigger than $\mathbb{N}$.

At this point I want to talk about some bad intuition that I carried around with me for many years. I’ll explain it using the example of two particular sets, $\mathbb{R} = 2^\mathbb{N}$ and $\mathbb{N}$, but other examples would work too.

• $\mathbb{R} + \mathbb{N} \cong \mathbb{R}$, and my mental picture is that $\mathbb{R}$ is bigger than $\mathbb{N}$ and therefore “absorbs” it: adjoining tiny little $\mathbb{N}$ to great big $\mathbb{R}$ doesn’t make it any bigger. That intuitive idea is fine. (It’s crude, because $X + Y \cong X$ even if $X$ and $Y$ have the same cardinality. But that doesn’t make it wrong.)

• $\mathbb{R} \times \mathbb{N} \cong \mathbb{R}$, and my mental picture for this is similar: since $\mathbb{R}$ is bigger than $\mathbb{N}$, taking $\mathbb{N}$ copies of it doesn’t make it any bigger. That intuitive idea is fine too.

• $\mathbb{R}^\mathbb{N} \cong \mathbb{R}$, and my mental picture for this used to be similar: since $\mathbb{R}$ is bigger than $\mathbb{N}$, taking mere sequences of reals doesn’t give you a larger set than taking reals themselves. This intuition is completely wrong!

It’s wrong because there are sets $X$ much bigger than $\mathbb{N}$ such that $X^\mathbb{N} \gt X$. Or at least, such sets exist as long as we add to ETCS a very mild extra axiom: the existence of an uncountable weak limit. Then $\aleph_\omega$ exists, and as we just saw,

$\aleph_\omega^\mathbb{N} \gt \aleph_\omega.$

That’s why it’s nonsense to think that $\mathbb{R}^\mathbb{N} \cong \mathbb{R}$ because $\mathbb{R}$ is bigger than, or “absorbs”, $\mathbb{N}$.

End of digression.

For results like the “$X \lt X^I$” one, it’s useful to think in terms of cofinality. By definition, the cofinality $cf(X)$ of an infinite set $X$ is the smallest set $I$ with the following property:

there is some family of sets $(X_i)_{i \in I}$ such that $X \cong \sum X_i$ and $X_i \lt X$ for all $i$.

The word “cofinality” comes from an equivalent order-theoretic formulation which I won’t talk about.

Certainly the set $I = X$ has the stated property, since $X \cong \sum_{x \in X} 1$. So $cf(X)$ is well defined and $cf(X) \leq X$. Some examples:

• What’s the cofinality of $\mathbb{N}$? Since a finite coproduct of finite sets is finite, any set $I$ with the property above must be infinite. So $cf(\mathbb{N})$ is infinite. But $cf(\mathbb{N}) \leq \mathbb{N}$, so $cf(\mathbb{N}) = \mathbb{N}$.

• How about $\aleph_1$, the smallest set $\gt \aleph_0 = \mathbb{N}$? If $(X_i)_{i \in I}$ is a family with $I \lt \aleph_1$ and $X_i \lt \aleph_1$, then $I \leq \aleph_0$ and $X_i \leq \aleph_0$. It follows that $\sum_{i \in I} X_i \leq \aleph_0 \times \aleph_0 \cong \aleph_0 \lt \aleph_1.$ So $\aleph_1$ cannot be expressed as a coproduct of $\lt \aleph_1$ sets that are all $\lt \aleph_1$. That is, $cf(\aleph_1) = \aleph_1$.

• Generally, the same argument shows that if $X$ is a successor (that is, there’s some set $Y$ such that $X$ is the smallest set $\gt Y$), then $cf(X) = X$.

• Assuming that $\aleph_\omega$ exists, we have $\aleph_\omega = \sup_{n \in \mathbb{N}} \aleph_n = \sum_{n \in \mathbb{N}} \aleph_n$ with $\aleph_n \lt \aleph_\omega$ for each $n$. Hence $\cf(\aleph_\omega) = \mathbb{N} = \aleph_0$.

As these examples show, many sets $X$ have the property that $cf(X) \cong X$. Such sets are called regular. So

$\mathbb{N} = \aleph_0, \aleph_1, \aleph_2, \ldots$

are all regular, as is any successor. An infinite set that is not regular is called singular. Thus, a set $X$ is singular if and only if it can be expressed as a coproduct of $\lt X$ sets that are each $\lt X$.

Since the smallest example of a singular set is $\aleph_\omega$, and since it is consistent with ETCS that $\aleph_\omega$ does not exist,

It is consistent with ETCS that every infinite set is regular.

(I’m only defining cofinality, and regular vs. singular, for infinite sets. I don’t know whether there’s a sensible way to set up the definitions for finite sets.)

Aside   Regularity comes up quite naturally in category theory. For instance, take the result that finite limits commute with filtered colimits in $\mathbf{Set}$. The definitions of finite limit and filtered colimit both refer (at least implicitly) to the concept of a set being finite, and finite means $\lt \mathbb{N}$. One might seek to generalize the commutativity result by replacing $\mathbb{N}$ by some larger set $K$. But if one tries to do this, one quickly finds oneself needing the assumption that $K$ is regular. There’s a nice explanation of this point in Francis Borceux’s Handbook of Categorical Algebra, Volume 1, Section 6.4.

The cofinality of a set is always regular:

$cf(cf(X)) \cong cf(X)$

for every infinite set $X$. (I won’t give the proof.) So $cf$ is an operation converting arbitrary sets into regular sets.

This is another moment at which a category theorist might get overexcited. Perhaps $cf$ feels like some kind of reflector of sets into regular sets, and perhaps the statements

$cf(X) \leq X, \quad cf(cf(X)) \cong cf(X)$

suggest there’s a monad around. Not so fast! Cofinality is not order-preserving:

$\aleph_1 \lt \aleph_\omega \quad \text{ but } \quad cf(\aleph_1) = \aleph_1 \gt \aleph_0 = cf(\aleph_\omega).$

The subtle behaviour of the cf operation has driven a lot of research in set theory. For example, the extraordinarily prolific set theorist Saharon Shelah found new results in cardinal arithmetic using his PCF theory, where PCF stands for “potential (or possible) cofinalities”.

At the very least, the cf notation is highly convenient. Recall this result from earlier:

Let $X$ and $I$ be infinite sets. Suppose that $X$ can be expressed as a coproduct of $I$ sets that are all $\lt X$. Then $X \lt X^I$.

We can rephrase it more compactly like this:

$X \lt X^{cf(X)}$ for all infinite $X$.

This result, a corollary of König’s theorem, has an important corollary itself. Let $X$ be an infinite set and $Y$ a set with at least two elements. Replacing $X$ by $Y^X$ in the last result gives

$Y^X \lt (Y^X)^{cf(Y^X)} \cong Y^{max(X, cf(Y^X))}$

and so

$X \lt cf(Y^X).$

In other words, you can’t partition $Y^X$ into $X$ smaller pieces.

In particular, this is true when $Y = 2$, giving

$X \lt cf(2^X)$

for all infinite sets $X$. Since $cf(S) \leq S$ for all $S$, this is a sharpening of Cantor’s result that $X \lt 2^X$.

A nice application of the fact that $X \lt cf(2^X)$ is to show that $\aleph_\omega$ is not the power set of anything in the world. (Allen Knutson mentioned this back in Part 4, and Todd Trimble gave a proof.) For if $\aleph_\omega \cong 2^X$ then

$X \lt cf(2^X) = cf(\aleph_\omega) = \aleph_0,$

so $X$ is finite. But $\aleph_\omega \cong 2^X$, so $\aleph_\omega$ is finite, a contradiction.

#### Next time

Next time, we’ll climb another rung up the ladder of largeness, where we’ll find the inaccessible sets.

Posted at July 2, 2021 1:26 PM UTC

TrackBack URL for this Entry:   https://golem.ph.utexas.edu/cgi-bin/MT-3.0/dxy-tb.fcgi/3327

### Re: Large Sets 8

If $X$ or $Y$ is finite then exponentiation is trivial too.

Do you mean “if $X$ xor $Y$ is finite”? (-:

Posted by: Mike Shulman on July 2, 2021 2:33 PM | Permalink | Reply to this

### Re: Large Sets 8

Ha! Yes, I suppose I do :-) Or I could say “trivial modulo number theory”, if I wanted to risk annoying number theorists.

Posted by: Tom Leinster on July 2, 2021 2:47 PM | Permalink | Reply to this

### Re: Large Sets 8

I’m only defining cofinality, and regular vs. singular, for infinite sets. I don’t know whether there’s a sensible way to set up the definitions for finite sets.

The nLab has a discussion. In brief, depending on how you state the definition, you can get any of the three results that:

• 0, 1, and 2 are the only finite regular cardinals.
• 2 is the only finite regular cardinal.
• There are no finite regular cardinals.

In addition, there are applications where “regularity” of an infinite cardinal is best considered as a property of “the set of cardinalities $\lt X$” rather than $X$ itself, and in this case the finite such sets-of-cardinalities are $\{0,1\}$ (which is the set of cardinalities $\lt 2$) and $\{1\}$ (which is not the set of cardinalities $\lt X$ for any $X$).

Posted by: Mike Shulman on July 2, 2021 2:45 PM | Permalink | Reply to this

### Re: Large Sets 8

There is a fairly simple way to define both regular and inaccessible cardinals that includes finite cardinals: regular cardinals are sets that are not reachable by indexed sums, while inaccessible cardinals are sets that are not reachable by indexed sums and indexed products.

### Re: Large Sets 8

Thanks, Mike. I hadn’t noticed that nLab discussion before.

When I first started learning about cofinality, I confidently tried to take everything I learned and adapt it to include finite sets. But I quickly ran into the problem that different definitions of cofinality, equivalent on infinite sets, are inequivalent on finite sets. Rather than try to work out which one was “right” (a judgement I had no way of making), I did what everyone else does: gave up and stuck to infinite sets.

And apparently the same problem besets the definition of regular set.

Posted by: Tom Leinster on July 2, 2021 7:49 PM | Permalink | Reply to this

### Re: Large Sets 8

I followed the advice to try and work out the proof of König’s theorem myself. It is indeed quite nice. Comparing what I came up with to the nLab page, I found that my argument was just the second part of the proof given there. Is the first part actually necessary? Surely (assuming choice) the nonexistence of a surjection $X \to Y$ is enough to imply $|X| \lt |Y|$ as long as $Y$ is nonempty.

Posted by: lambda on July 2, 2021 5:45 PM | Permalink | Reply to this

### Re: Large Sets 8

Ah yes, good point! It wasn’t me who wrote the nLab article, but I’d overlooked that.

Posted by: Tom Leinster on July 2, 2021 5:56 PM | Permalink | Reply to this

### Re: Large Sets 8

It was I who wrote that proof. Yeah, I guess? If I were teaching set theory, then I’d feel an obligation to prove that nonexistence of a surjection implies existence of an injection. (It’s true of course, but as statements go, it has an odd feel to me.)

Posted by: Todd Trimble on July 2, 2021 10:05 PM | Permalink | Reply to this

### Re: Large Sets 8

Question on terminology: “top-heavy” and “bottom-heavy” refer to the notation $Y^X$. Is there a terminology more reflective of the situation, not the notation? By comparison with the internal-hom, “source-dominant” and “target-dominant” would be possible. Are there other options?

Posted by: Keith Harbaugh on July 2, 2021 6:58 PM | Permalink | Reply to this

### Re: Large Sets 8

Hmm… don’t know! “Top/bottom-heavy” is just something I made up. It helps me keep the situation clear in my mind, but surely it’s in no way standard.

You’re right that it depends on the notation, so I can see the virtue of finding terminology that doesn’t. Then again, it’s completely standard to talk about the “base” of an exponential expression, which has the same bottom/top connotation. So maybe it’s not so bad.

Posted by: Tom Leinster on July 2, 2021 7:33 PM | Permalink | Reply to this

### Re: Large Sets 8

“it’s completely standard to talk about the “base” of an exponential expression, which has the same bottom/top connotation”

Actually, I think there’s more to it than that. When dealing with number systems, we talk about their base, 10, 8, 2, 16, whatever. I think that usage reflects something more intrinsic than the exponential notation.

Posted by: Keith Harbaugh on July 2, 2021 7:56 PM | Permalink | Reply to this

### Re: Large Sets 8

There’s a whole linguistic tangle here. Generally in English, you can have a basis for doing something, and something can be based on something else, and an accusation can be baseless. I suspect that all of the plain English and mathematical uses of base/basis ultimately come down to the idea of a thing at the bottom.

Posted by: Tom Leinster on July 2, 2021 8:26 PM | Permalink | Reply to this

### Re: Large Sets 8

A very interesting result (due to Tarski) is that $X \times X \cong X$ for all infinite sets $X$ is equivalent to the axiom of choice.

So if the “trivial” aspect of cardinal arithmetic seems like a buzzkill, I’d reckon this could be one way of re-injecting some enthusiasm!

I don’t know whether $X + X \cong X$ for infinite $X$ is strictly weaker than AC. I have a vague memory it is – but vague memories are cheap!

Posted by: Todd Trimble on July 3, 2021 12:12 AM | Permalink | Reply to this

### Re: Large Sets 8

It turns out that $X + X \cong X$ for all infinite $X$ is indeed strictly weaker than AC. This was proved by Sageev in his 1973 thesis (see here).

Posted by: Todd Trimble on July 3, 2021 10:52 AM | Permalink | Reply to this

### Re: Large Sets 8

Nice — and respect to your faculty of “vague memory”.

As for buzzkills, I guess I got accustomed a long time ago to the extraordinary power of mathematicians to kill buzzes. No matter how much you feel that you’re diving deep down into the unfathomable mysteries of the universe, no matter that you’re straining every neuron in your brain to comprehend, there’s always someone to tell you that what you’re doing is trivial. And the thing is, sometimes it doesn’t take long for you to feel the same way yourself.

Posted by: Tom Leinster on July 3, 2021 2:48 PM | Permalink | Reply to this

### Re: Large Sets 8

You take the approach which seems natural to me (and probably most algebraists / category theorists) which is to keep the role of ordinals as small as you can. But set theorists generally don’t do that. They use ordinals (and the well-ordering version of AC) to prove that |X x X| = |X| and to deduce the trivial rules for addition and multiplication of infinite sets.

If you were to fill in the proofs of everything, as you might do if teaching a course, would you be able to keep ordinals out? Or would that actually be rather artificial?

I suppose you could just assume the trivial rules for addition and multiplication, which seem by the above comments to be equivalent to AC. But they seem rather unmotivated to me just as assumptions.

Posted by: Jonathan Kirby on July 5, 2021 10:07 PM | Permalink | Reply to this

### Re: Large Sets 8

I don’t think I have any reservations about using well-ordered sets. For example, if I were teaching a course then I’d probably prove Zorn’s lemma via Hartogs’s theorem that for every set $X$, there exists a well-ordered set whose underlying set does not embed in $X$. I mention this because Lawvere and Rosebrugh’s book takes a different route to Zorn, instead going via the Bourbaki-Witt fixed point theorem. I’m not sure I see the benefit. My attitude is that well-ordered sets are bound to come up sooner or later, and they give a pleasant proof of Zorn’s lemma, so why not use them sooner?

To prove that $X \times X \cong X$, I guess my hypothetical course would use Zorn’s lemma on the poset of pairs $(A \subseteq X, f: A \times A \stackrel{\cong}{\to} A)$. Maybe I’m just saying that because it’s the proof I’m most familiar with. This proof seems reasonably natural to me, although the details do get slightly more complicated than one might wish.

I’m taking “ordinal” in your comment to mean “well-ordered set”.

Posted by: Tom Leinster on July 5, 2021 10:43 PM | Permalink | Reply to this

Post a New Comment