Skip to the Main Content

Note:These pages make extensive use of the latest XHTML and CSS Standards. They ought to look great in any standards-compliant modern browser. Unfortunately, they will probably look horrible in older browsers, like Netscape 4.x and IE 4.x. Moreover, many posts use MathML, which is, currently only supported in Mozilla. My best suggestion (and you will thank me when surfing an ever-increasing number of sites on the web which have been crafted to use the new standards) is to upgrade to the latest version of your browser. If that's not possible, consider moving to the Standards-compliant and open-source Mozilla browser.

July 2, 2021

Large Sets 8

Posted by Tom Leinster

Previously: Part 7. Next: Part 9

If you’ve been wanting to follow this series but haven’t had time to keep up, now’s a good moment to hop back on board — I won’t assume much of what’s gone before.

Back in the mists of time, when I took a first undergraduate course on axiomatic set theory, I was exhilarated by the extraordinary world of infinite sets I saw opening up before me. In that world, addition is the same as multiplication! Which is the same as maximum! That is,

X+YX×Ymax(X,Y) X + Y \cong X \times Y \cong max(X, Y)

for all infinite XX and YY. It seemed unthinkably exotic.

I then heard this part of cardinal arithmetic called “trivial” for exactly the reasons just stated. Although that description is technically correct, it poured a bucket of cold water over my enthusiasm in a way that only mathematicians can.

So with apologies to my past self, I give you the informal title of this post: the nontrivial part of cardinal arithmetic.

Once we’ve seen that binary addition and multiplication are “trivial”, what’s left to do in cardinal arithmetic? Well, there’s addition and multiplication of infinitely many sets, and there’s exponentiation. Some of the theory of those operations is trivial too, in the sense of following immediately from the isomorphisms

X+YX×Ymax(X,Y) X + Y \cong X \times Y \cong max(X, Y)

for infinite XX and YY. Let’s get that trivial stuff out of the way first.

Addition, even infinitary addition, is more or less the same as supremum. An easy little argument shows that for any family (X i) iI(X_i)_{i \in I} of nonempty sets,

iIX imax(I,sup iIX i) \sum_{i \in I} X_i \cong \max(I, \sup_{i \in I} X_i)

— unless, of course, it’s a finite family of finite sets. (The sum of finitely many natural numbers is not the same as its sup!)

Exponentiation is a special case of infinitary multiplication: Y X= xXYY^X = \prod_{x \in X} Y. Still, it’s an important enough case that it’s well worth thinking about separately.

In many cases, exponentiation is trivial too. It’s trivial when the expression

Y X Y^X

is top-heavy — when XX is bigger than YY. It’s even trivial when XX is just slightly smaller than YY. Indeed,

Y X=2 X if 2 XY, Y^X = 2^X \quad \text{ if } \quad 2^X \geq Y,

assuming that XX and YY are infinite. The proof is a pushover:

2 XY X(2 X) X2 X×X2 X. 2^X \leq Y^X \leq (2^X)^X \cong 2^{X \times X} \cong 2^X.

For example, X X2 X\mathbb{N}^X \cong \mathbb{R}^X \cong 2^X for all infinite sets XX.

There’s another case where exponentiation is trivial: when the base is the power set of something. Another one-line proof shows that if YY is a power set then

Y Xmax(Y,2 X). Y^X \cong \max(Y, 2^X).

Here I’ve assumed again that XX and YY are infinite. If XX or YY is finite then exponentiation is trivial too.

So the only occasions when Y XY^X is not trivial to compute are when YY is not a power set and 2 X<Y2^X \lt Y — the expression Y XY^X is “bottom-heavy”. It’s certainly the case that

YY X2 Y. Y \leq Y^X \leq 2^Y.

But even assuming the generalized continuum hypothesis, this still leaves two options for Y XY^X: is it YY or 2 Y2^Y?

Now we’re moving across the boundary into what I’m calling the “nontrivial” part of cardinal arithmetic. (Of course, any professional would regard everything I’m saying here as trivial.) And it’s time to talk about the wonderful König’s theorem.

König’s theorem states that for any families of sets (X i) iI(X_i)_{i \in I} and (Y i) iI(Y_i)_{i \in I} whatsoever,

X i<Y ifor alli iIX i< iIY i. X_i \lt Y_i \ \text{for all}\ i \quad \implies \quad \sum_{i \in I} X_i \lt \prod_{i \in I} Y_i.

Let’s spend a while contemplating this result, before we start to think about the proof. Some observations:

  • There are many variant statements that can be obtained by taking König’s theorem and replacing one or both <\lt signs by \leq, or by changing the sum to a product, or vice versa. All of those variants are either trivially true or false. (For example, changing both <\lts to \leqs gives a false statement.) Among these variants, König’s theorem is unique in being both nontrivial and true.

  • If we assume the generalized continuum hypothesis then König’s theorem does become trivial, at least when X iX_i and Y iY_i are infinite. For then 2 X iY i2^{X_i} \leq Y_i for every ii, and taking the product of these inequalities over all ii gives 2 X iY i2^{\sum X_i} \leq \prod Y_i, hence the result.

  • When every X iX_i is empty, König states that a product of nonempty sets is nonempty. This is equivalent to the axiom of choice (in the presence of the other ETCS axioms). So, we’re going to have to use the axiom of choice in the proof.

  • When X i=1X_i = 1 and Y i=2Y_i = 2 for all ii, König’s theorem becomes Cantor’s theorem: I<2 II \lt 2^I for all sets II. So, we can expect the proof to be a generalization of the diagonal argument.

As for the proof, I won’t spoil it! I promise you that if you haven’t seen it before and you figure it out for yourself, it will brighten your day. It’s the proof of strict inequality that’s the really nice part.

When I first learned König’s theorem, I got a bit overexcited. I thought that such a general statement would be sure to have a wealth of applications. But as far as I know, there’s basically only one, and it’s this:

Let XX and II be infinite sets. Suppose that XX can be expressed as a coproduct of II sets that are all <X\lt X. Then X<X IX \lt X^I.

Indeed, if X iIX iX \cong \sum_{i \in I} X_i with X i<XX_i \lt X for each ii, then König’s theorem with Y i=XY_i = X for all ii gives the result.

For example, suppose we have a model of ETCS in which ω\aleph_\omega exists. Then

ωsup n n n n, \aleph_\omega \cong \sup_{n \in \mathbb{N}} \aleph_n \cong \sum_{n \in \mathbb{N}} \aleph_n,

so

ω< ω . \aleph_\omega \lt \aleph_\omega^\mathbb{N}.

The right-hand side here is a very bottom-heavy exponential: ω\aleph_\omega is vastly bigger than \mathbb{N}.

At this point I want to talk about some bad intuition that I carried around with me for many years. I’ll explain it using the example of two particular sets, =2 \mathbb{R} = 2^\mathbb{N} and \mathbb{N}, but other examples would work too.

  • +\mathbb{R} + \mathbb{N} \cong \mathbb{R}, and my mental picture is that \mathbb{R} is bigger than \mathbb{N} and therefore “absorbs” it: adjoining tiny little \mathbb{N} to great big \mathbb{R} doesn’t make it any bigger. That intuitive idea is fine. (It’s crude, because X+YXX + Y \cong X even if XX and YY have the same cardinality. But that doesn’t make it wrong.)

  • ×\mathbb{R} \times \mathbb{N} \cong \mathbb{R}, and my mental picture for this is similar: since \mathbb{R} is bigger than \mathbb{N}, taking \mathbb{N} copies of it doesn’t make it any bigger. That intuitive idea is fine too.

  • \mathbb{R}^\mathbb{N} \cong \mathbb{R}, and my mental picture for this used to be similar: since \mathbb{R} is bigger than \mathbb{N}, taking mere sequences of reals doesn’t give you a larger set than taking reals themselves. This intuition is completely wrong!

It’s wrong because there are sets XX much bigger than \mathbb{N} such that X >XX^\mathbb{N} \gt X. Or at least, such sets exist as long as we add to ETCS a very mild extra axiom: the existence of an uncountable weak limit. Then ω\aleph_\omega exists, and as we just saw,

ω > ω. \aleph_\omega^\mathbb{N} \gt \aleph_\omega.

That’s why it’s nonsense to think that \mathbb{R}^\mathbb{N} \cong \mathbb{R} because \mathbb{R} is bigger than, or “absorbs”, \mathbb{N}.

End of digression.

For results like the “X<X IX \lt X^I” one, it’s useful to think in terms of cofinality. By definition, the cofinality cf(X)cf(X) of an infinite set XX is the smallest set II with the following property:

there is some family of sets (X i) iI(X_i)_{i \in I} such that XX iX \cong \sum X_i and X i<XX_i \lt X for all ii.

The word “cofinality” comes from an equivalent order-theoretic formulation which I won’t talk about.

Certainly the set I=XI = X has the stated property, since X xX1X \cong \sum_{x \in X} 1. So cf(X)cf(X) is well defined and cf(X)Xcf(X) \leq X. Some examples:

  • What’s the cofinality of \mathbb{N}? Since a finite coproduct of finite sets is finite, any set II with the property above must be infinite. So cf()cf(\mathbb{N}) is infinite. But cf()cf(\mathbb{N}) \leq \mathbb{N}, so cf()=cf(\mathbb{N}) = \mathbb{N}.

  • How about 1\aleph_1, the smallest set > 0=\gt \aleph_0 = \mathbb{N}? If (X i) iI(X_i)_{i \in I} is a family with I< 1I \lt \aleph_1 and X i< 1X_i \lt \aleph_1, then I 0I \leq \aleph_0 and X i 0X_i \leq \aleph_0. It follows that iIX i 0× 0 0< 1. \sum_{i \in I} X_i \leq \aleph_0 \times \aleph_0 \cong \aleph_0 \lt \aleph_1. So 1\aleph_1 cannot be expressed as a coproduct of < 1\lt \aleph_1 sets that are all < 1\lt \aleph_1. That is, cf( 1)= 1cf(\aleph_1) = \aleph_1.

  • Generally, the same argument shows that if XX is a successor (that is, there’s some set YY such that XX is the smallest set >Y\gt Y), then cf(X)=Xcf(X) = X.

  • Assuming that ω\aleph_\omega exists, we have ω=sup n n= n n \aleph_\omega = \sup_{n \in \mathbb{N}} \aleph_n = \sum_{n \in \mathbb{N}} \aleph_n with n< ω\aleph_n \lt \aleph_\omega for each nn. Hence cf( ω)== 0\cf(\aleph_\omega) = \mathbb{N} = \aleph_0.

As these examples show, many sets XX have the property that cf(X)Xcf(X) \cong X. Such sets are called regular. So

= 0, 1, 2, \mathbb{N} = \aleph_0, \aleph_1, \aleph_2, \ldots

are all regular, as is any successor. An infinite set that is not regular is called singular. Thus, a set XX is singular if and only if it can be expressed as a coproduct of <X\lt X sets that are each <X\lt X.

Since the smallest example of a singular set is ω\aleph_\omega, and since it is consistent with ETCS that ω\aleph_\omega does not exist,

It is consistent with ETCS that every infinite set is regular.

(I’m only defining cofinality, and regular vs. singular, for infinite sets. I don’t know whether there’s a sensible way to set up the definitions for finite sets.)

Aside   Regularity comes up quite naturally in category theory. For instance, take the result that finite limits commute with filtered colimits in Set\mathbf{Set}. The definitions of finite limit and filtered colimit both refer (at least implicitly) to the concept of a set being finite, and finite means <\lt \mathbb{N}. One might seek to generalize the commutativity result by replacing \mathbb{N} by some larger set KK. But if one tries to do this, one quickly finds oneself needing the assumption that KK is regular. There’s a nice explanation of this point in Francis Borceux’s Handbook of Categorical Algebra, Volume 1, Section 6.4.

The cofinality of a set is always regular:

cf(cf(X))cf(X) cf(cf(X)) \cong cf(X)

for every infinite set XX. (I won’t give the proof.) So cfcf is an operation converting arbitrary sets into regular sets.

This is another moment at which a category theorist might get overexcited. Perhaps cfcf feels like some kind of reflector of sets into regular sets, and perhaps the statements

cf(X)X,cf(cf(X))cf(X) cf(X) \leq X, \quad cf(cf(X)) \cong cf(X)

suggest there’s a monad around. Not so fast! Cofinality is not order-preserving:

1< ω but cf( 1)= 1> 0=cf( ω). \aleph_1 \lt \aleph_\omega \quad \text{ but } \quad cf(\aleph_1) = \aleph_1 \gt \aleph_0 = cf(\aleph_\omega).

The subtle behaviour of the cf operation has driven a lot of research in set theory. For example, the extraordinarily prolific set theorist Saharon Shelah found new results in cardinal arithmetic using his PCF theory, where PCF stands for “potential (or possible) cofinalities”.

At the very least, the cf notation is highly convenient. Recall this result from earlier:

Let XX and II be infinite sets. Suppose that XX can be expressed as a coproduct of II sets that are all <X\lt X. Then X<X IX \lt X^I.

We can rephrase it more compactly like this:

X<X cf(X)X \lt X^{cf(X)} for all infinite XX.

This result, a corollary of König’s theorem, has an important corollary itself. Let XX be an infinite set and YY a set with at least two elements. Replacing XX by Y XY^X in the last result gives

Y X<(Y X) cf(Y X)Y max(X,cf(Y X)) Y^X \lt (Y^X)^{cf(Y^X)} \cong Y^{max(X, cf(Y^X))}

and so

X<cf(Y X). X \lt cf(Y^X).

In other words, you can’t partition Y XY^X into XX smaller pieces.

In particular, this is true when Y=2Y = 2, giving

X<cf(2 X) X \lt cf(2^X)

for all infinite sets XX. Since cf(S)Scf(S) \leq S for all SS, this is a sharpening of Cantor’s result that X<2 XX \lt 2^X.

A nice application of the fact that X<cf(2 X)X \lt cf(2^X) is to show that ω\aleph_\omega is not the power set of anything in the world. (Allen Knutson mentioned this back in Part 4, and Todd Trimble gave a proof.) For if ω2 X\aleph_\omega \cong 2^X then

X<cf(2 X)=cf( ω)= 0, X \lt cf(2^X) = cf(\aleph_\omega) = \aleph_0,

so XX is finite. But ω2 X\aleph_\omega \cong 2^X, so ω\aleph_\omega is finite, a contradiction.

Next time

Next time, we’ll climb another rung up the ladder of largeness, where we’ll find the inaccessible sets.

Posted at July 2, 2021 1:26 PM UTC

TrackBack URL for this Entry:   https://golem.ph.utexas.edu/cgi-bin/MT-3.0/dxy-tb.fcgi/3327

17 Comments & 0 Trackbacks

Re: Large Sets 8

If XX or YY is finite then exponentiation is trivial too.

Do you mean “if XX xor YY is finite”? (-:

Posted by: Mike Shulman on July 2, 2021 2:33 PM | Permalink | Reply to this

Re: Large Sets 8

Ha! Yes, I suppose I do :-) Or I could say “trivial modulo number theory”, if I wanted to risk annoying number theorists.

Posted by: Tom Leinster on July 2, 2021 2:47 PM | Permalink | Reply to this

Re: Large Sets 8

I’m only defining cofinality, and regular vs. singular, for infinite sets. I don’t know whether there’s a sensible way to set up the definitions for finite sets.

The nLab has a discussion. In brief, depending on how you state the definition, you can get any of the three results that:

  • 0, 1, and 2 are the only finite regular cardinals.
  • 2 is the only finite regular cardinal.
  • There are no finite regular cardinals.

In addition, there are applications where “regularity” of an infinite cardinal is best considered as a property of “the set of cardinalities <X\lt X” rather than XX itself, and in this case the finite such sets-of-cardinalities are {0,1}\{0,1\} (which is the set of cardinalities <2\lt 2) and {1}\{1\} (which is not the set of cardinalities <X\lt X for any XX).

Posted by: Mike Shulman on July 2, 2021 2:45 PM | Permalink | Reply to this

Re: Large Sets 8

There is a fairly simple way to define both regular and inaccessible cardinals that includes finite cardinals: regular cardinals are sets that are not reachable by indexed sums, while inaccessible cardinals are sets that are not reachable by indexed sums and indexed products.

Posted by: Madeleine Birchfield on July 2, 2021 5:32 PM | Permalink | Reply to this

Re: Large Sets 8

Thanks, Mike. I hadn’t noticed that nLab discussion before.

When I first started learning about cofinality, I confidently tried to take everything I learned and adapt it to include finite sets. But I quickly ran into the problem that different definitions of cofinality, equivalent on infinite sets, are inequivalent on finite sets. Rather than try to work out which one was “right” (a judgement I had no way of making), I did what everyone else does: gave up and stuck to infinite sets.

And apparently the same problem besets the definition of regular set.

Posted by: Tom Leinster on July 2, 2021 7:49 PM | Permalink | Reply to this

Re: Large Sets 8

I followed the advice to try and work out the proof of König’s theorem myself. It is indeed quite nice. Comparing what I came up with to the nLab page, I found that my argument was just the second part of the proof given there. Is the first part actually necessary? Surely (assuming choice) the nonexistence of a surjection XYX \to Y is enough to imply |X|<|Y||X| \lt |Y| as long as YY is nonempty.

Posted by: lambda on July 2, 2021 5:45 PM | Permalink | Reply to this

Re: Large Sets 8

Ah yes, good point! It wasn’t me who wrote the nLab article, but I’d overlooked that.

Posted by: Tom Leinster on July 2, 2021 5:56 PM | Permalink | Reply to this

Re: Large Sets 8

It was I who wrote that proof. Yeah, I guess? If I were teaching set theory, then I’d feel an obligation to prove that nonexistence of a surjection implies existence of an injection. (It’s true of course, but as statements go, it has an odd feel to me.)

Posted by: Todd Trimble on July 2, 2021 10:05 PM | Permalink | Reply to this

Re: Large Sets 8

Question on terminology: “top-heavy” and “bottom-heavy” refer to the notation Y XY^X. Is there a terminology more reflective of the situation, not the notation? By comparison with the internal-hom, “source-dominant” and “target-dominant” would be possible. Are there other options?

Posted by: Keith Harbaugh on July 2, 2021 6:58 PM | Permalink | Reply to this

Re: Large Sets 8

Hmm… don’t know! “Top/bottom-heavy” is just something I made up. It helps me keep the situation clear in my mind, but surely it’s in no way standard.

You’re right that it depends on the notation, so I can see the virtue of finding terminology that doesn’t. Then again, it’s completely standard to talk about the “base” of an exponential expression, which has the same bottom/top connotation. So maybe it’s not so bad.

Posted by: Tom Leinster on July 2, 2021 7:33 PM | Permalink | Reply to this

Re: Large Sets 8

“it’s completely standard to talk about the “base” of an exponential expression, which has the same bottom/top connotation”

Actually, I think there’s more to it than that. When dealing with number systems, we talk about their base, 10, 8, 2, 16, whatever. I think that usage reflects something more intrinsic than the exponential notation.

Posted by: Keith Harbaugh on July 2, 2021 7:56 PM | Permalink | Reply to this

Re: Large Sets 8

There’s a whole linguistic tangle here. Generally in English, you can have a basis for doing something, and something can be based on something else, and an accusation can be baseless. I suspect that all of the plain English and mathematical uses of base/basis ultimately come down to the idea of a thing at the bottom.

Posted by: Tom Leinster on July 2, 2021 8:26 PM | Permalink | Reply to this

Re: Large Sets 8

A very interesting result (due to Tarski) is that X×XXX \times X \cong X for all infinite sets XX is equivalent to the axiom of choice.

So if the “trivial” aspect of cardinal arithmetic seems like a buzzkill, I’d reckon this could be one way of re-injecting some enthusiasm!

I don’t know whether X+XXX + X \cong X for infinite XX is strictly weaker than AC. I have a vague memory it is – but vague memories are cheap!

Posted by: Todd Trimble on July 3, 2021 12:12 AM | Permalink | Reply to this

Re: Large Sets 8

It turns out that X+XXX + X \cong X for all infinite XX is indeed strictly weaker than AC. This was proved by Sageev in his 1973 thesis (see here).

Posted by: Todd Trimble on July 3, 2021 10:52 AM | Permalink | Reply to this

Re: Large Sets 8

Nice — and respect to your faculty of “vague memory”.

As for buzzkills, I guess I got accustomed a long time ago to the extraordinary power of mathematicians to kill buzzes. No matter how much you feel that you’re diving deep down into the unfathomable mysteries of the universe, no matter that you’re straining every neuron in your brain to comprehend, there’s always someone to tell you that what you’re doing is trivial. And the thing is, sometimes it doesn’t take long for you to feel the same way yourself.

Posted by: Tom Leinster on July 3, 2021 2:48 PM | Permalink | Reply to this

Re: Large Sets 8

You take the approach which seems natural to me (and probably most algebraists / category theorists) which is to keep the role of ordinals as small as you can. But set theorists generally don’t do that. They use ordinals (and the well-ordering version of AC) to prove that |X x X| = |X| and to deduce the trivial rules for addition and multiplication of infinite sets.

If you were to fill in the proofs of everything, as you might do if teaching a course, would you be able to keep ordinals out? Or would that actually be rather artificial?

I suppose you could just assume the trivial rules for addition and multiplication, which seem by the above comments to be equivalent to AC. But they seem rather unmotivated to me just as assumptions.

Posted by: Jonathan Kirby on July 5, 2021 10:07 PM | Permalink | Reply to this

Re: Large Sets 8

I don’t think I have any reservations about using well-ordered sets. For example, if I were teaching a course then I’d probably prove Zorn’s lemma via Hartogs’s theorem that for every set XX, there exists a well-ordered set whose underlying set does not embed in XX. I mention this because Lawvere and Rosebrugh’s book takes a different route to Zorn, instead going via the Bourbaki-Witt fixed point theorem. I’m not sure I see the benefit. My attitude is that well-ordered sets are bound to come up sooner or later, and they give a pleasant proof of Zorn’s lemma, so why not use them sooner?

To prove that X×XXX \times X \cong X, I guess my hypothetical course would use Zorn’s lemma on the poset of pairs (AX,f:A×AA)(A \subseteq X, f: A \times A \stackrel{\cong}{\to} A). Maybe I’m just saying that because it’s the proof I’m most familiar with. This proof seems reasonably natural to me, although the details do get slightly more complicated than one might wish.

I’m taking “ordinal” in your comment to mean “well-ordered set”.

Posted by: Tom Leinster on July 5, 2021 10:43 PM | Permalink | Reply to this

Post a New Comment