## June 8, 2021

### Large Sets 1

#### Posted by Tom Leinster

Next: Part 2

This is the first of a series of posts on how large cardinals look in categorical set theory.

My primary interest is not actually in large cardinals themselves. What I’m really interested in is exploring the hypothesis that everything in traditional, membership-based set theory that’s relevant to the rest of mathematics can be done smoothly in categorical set theory. I’m not sure this hypothesis is correct (and I suppose no one could ever be sure), which is why I use the words “hypothesis” and “explore”. But I know of no counterexample.

These posts won’t assume very much knowledge of anything. And I’ll try to stick to one topic per post. In this first one, all I’ll do is clear my throat.

There are three differences between what I’ll do here and what you’ll find in almost all books on set theory.

First, I’ll be using a style of set theory that so far I’ve been calling categorical, and which could also be called isomorphism-invariant or structural, but which I really want to call neutral set theory.

What I mean is this. In cooking, a recipe will sometimes tell you to use a neutral cooking oil. This means an oil whose flavour is barely noticeable, unlike, say, sesame oil or coconut oil or extra virgin olive oil. A neutral cooking oil fades into the background, allowing the other flavours to come through.

The kind of set theory I’ll use here, and the language I’ll use to discuss it, is “neutral” in the sense that it’s the language of the large majority of mathematical publications today, especially in more algebraic areas. It’s the language of structures and substructures and quotients and isomorphisms, the lingua franca of algebra. To most mathematicians, it should just fade into the background, allowing the essential points about sets themselves to come through.

Some readers might want to call this language “categorical”, but that wouldn’t be quite right: it’s precategorical, the kind of thing that undergraduates get used to in early courses on abstract algebra — when they first learn about groups or rings or vector spaces, well before they’re likely to see a category. And in my posts, I’ll mention categories only occasionally.

“Neutral” doesn’t mean that no one can object. (Theorem: for all things, there exists a person who objects to that thing.) Someone is sure to say it would taste better with sesame or ZFC. But the point is that you can add that flavour if you want.

For example, consider ordinals.

In ZFC, $\in$ is a binary relation defined on the collection of all sets. Every element of every set is also a set, and for any two sets $X$ and $Y$, one can ask whether $X \in Y$. It follows that $\in$ restricts to a binary relation on any individual set $X$.

In the classical approach to set theory, an ordinal is defined to be a set $X$ satisfying certain properties. These properties imply that the relation $\in$ on $X$ is a well ordering. They also imply that every well-ordered set is isomorphic to $(X, \in)$ for precisely one ordinal $X$. In other words, the ordinals are representatives of the isomorphism classes of well-ordered sets.

Some people like the classical definition of ordinal, some don’t. Neutral set theory takes no position. Nothing we’ll do is incompatible with defining the ordinals as above and showing that they’re representatives for the iso classes of well-ordered sets. But equally, nothing we’ll do requires the classical notion of ordinal. We’re just going to talk about well-ordered sets themselves, directly.

In general, I’ll be talking about sets and sets with structure and functions and subsets and relations and families and so on, using the same language that appears in hundreds of math papers published every day. They’re blog posts, so I’ll be writing informally. But the formal axiom system that everything will be based on is Lawvere’s Elementary Theory of the Category of Sets (ETCS).

In categorical jargon, ETCS states that the category of sets is a well-pointed topos with a natural numbers object and choice. In informal language, it says we have some things called “sets”, some things called “functions from $X$ to $Y$” for each set $X$ and set $Y$, and an operation of “composition” on functions, satisfying the following axioms:

1. Composition of functions is associative and has identities.
2. There is a set with exactly one element.
3. There is a set with no elements.
4. A function is determined by its effect on elements.
5. Given sets $X$ and $Y$, one can form the cartesian product $X \times Y$.
6. Given sets $X$ and $Y$, one can form the set of functions from $X$ to $Y$.
7. Given $f: X \to Y$ and $y \in Y$, one can form the inverse image $f^{-1}(y)$.
8. The subsets of a set $X$ correspond to the functions from $X$ to $\{0,1\}$.
9. The natural numbers form a set.
10. Every surjection has a right inverse.

I said that this is an informal phrasing of the axioms. You can find the formal (but elementary) version here, for instance. But in case your desire to understand these axioms is not quite great enough to make you want to click that link, I give you a kind of FAQ:

• Many of the axioms (2, 5, 6, 7, 8 and 9) involve universal properties. For instance, axiom 5 states that for any two sets $X$ and $Y$, there is a diagram $X \leftarrow X \times Y \rightarrow Y$ with the universal property of a product.

• In particular, axiom 2 states that there is a terminal set (a set $1$ with the property that for every set $X$, there is exactly one function $X \to 1$). Since being a terminal set is a universal property, any two terminal sets are isomorphic, and I’ll write $1$ for any of them.

• An element of a set $X$ is defined to be a function $1 \to X$. So, “$x \in X$” means $x: 1 \to X$. (In particular, $1$ has exactly one element.) When the axioms refer to “elements”, that’s what’s meant.

• Axiom 8 actually says that there’s some set $\Omega$ with the property that for any set $X$, the subsets of $X$ correspond to the functions $X \to \Omega$. As it turns out, the other axioms imply that $\Omega$ has exactly two elements.

While everyone agrees that the elements of a set $X$ are in canonical one-to-one correspondence with the functions $1 \to X$, some people don’t like defining an element of $X$ to literally be a function $1 \to X$. That’s OK. True to the spirit of neutral set theory, I’ll never rely on this definition of element. All we’ll need is that uncontroversial one-to-one correspondence.

The most important thing in all of this, the thing that separates it from the dominant ZFC-based approach, is that we won’t assume that elements of a set are sets, or that one can ask “is $X \in Y$?” of any two sets $X$ and $Y$. It simply won’t be necessary.

At the beginning, I said that what we’ll do in these posts differs from the norm in three ways. That was just the first!

The second is more superficial: instead of talking about large cardinals, I’ll talk about large sets.

Categorical set theory doesn’t have much use for the noun “cardinal”. In the classical ZFC-based approach, a cardinal is defined to be an ordinal with a certain property, and one can prove that every set is isomorphic to (in bijection with) precisely one cardinal. So, the cardinals are representatives of the isomorphism classes of sets. But in an isomorphism-invariant approach to set theory, there’s no need to choose representatives. We just work directly with sets.

If “cardinal” means anything in categorical set theory, it means isomorphism class of sets. Now and again I’ll use the word that way, e.g. when I want to talk about the set of cardinals smaller than a given set. But almost always, I’ll just talk about sets.

The situation is the same as in group theory, for instance. We don’t often need to say “isomorphism class of groups”. We just say “group”. (“There are two groups of order four.”) This is safe because all properties of groups are isomorphism-invariant. The same goes for sets.

It takes a while to get used to saying things like “inaccessible set” when the rest of the world says “inaccessible cardinal”. But now that I’m used to replacing “cardinal” by “set”, “cardinal” is starting to sound a bit arcane. Why would you use this special word for an isomorphism class of sets — or an ingeniously constructed representative of it — when the property of sets you’re talking about is isomorphism-invariant anyway? You don’t do this for groups or any other type of mathematical structure. So why do you do it for sets?

The third and final feature that makes these posts different is that ETCS is a weaker set theory than ZFC. Fewer things are true in it.

If by “large set” (or “large cardinal”) we mean one whose existence is not guaranteed by our axiom system, then more sets are large in ETCS than in ZFC. In other words, ZFC has higher standards for calling something large. (“That’s not a large set, that’s a large set”.)

Roughly speaking, the way this series of posts will work is that I’ll start with the smallest “large sets” and work my way upwards. The first half or so of the series will be entirely about sets that are large relative to ETCS but not relative to ZFC — that is, ZFC guarantees their existence but ETCS doesn’t. There are several intermediate stages. Personally, I feel like I understand the difference between the two theories better now that I’ve got this picture clear in my head.

#### Postscript: are category theorists interested in weak set theories?

As I understand it, some set theorists have the impression that category theorists are particuarly attached to set theories that are weaker than ZFC. Is that true?

The fact of the matter is that ETCS is weaker than ZFC. Whether you call it slightly weaker or vastly weaker is a matter of perspective. On the one hand, it’s thought that all of EGA, SGA, and probably the proof of Fermat’s last theorem can be done in ETCS. Most mathematicians will never need more than ETCS provides. So for practical purposes, you might say there’s not much difference between ETCS and ZFC. But to a professional set theorist, this probably sounds as absurd as the argument “hardly anyone ever uses integers bigger than $10^{10^{10}}$, so there’s not much difference between $\{0, 1, \ldots, 10^{10^{10}}\}$ and $\mathbb{N}$.”

ETCS is not the only categorical set theory. In particular, it’s very natural to add to it an axiom scheme of replacement (a kind of cocompleteness condition, which I’ll come to in a later post). The resulting theory, “ETCS+R”, is equivalent to ZFC in the strongest possible sense: the two theories are biinterpretable. This means that there’s a way of translating statements in the language of ETCS+R into statements in the language of ZFC, and vice versa, such that the two translation processes are mutually inverse and make theorems of ETCS+R match up with theorems of ZFC. In particular, ETCS+R is a categorical set theory exactly as strong as ZFC.

But if we’re honest, category theorists don’t often add this axiom scheme of replacement. Natural as it is, it has a different flavour from the other axioms of ETCS. We do tend to focus on a set theory weaker than ZFC.

Still, something doesn’t ring true to me about the idea that category theorists are interested in, or attached to, weak set theories. Here’s what I think is going on.

Category theorists are interested in much more than just categories of sets. There are categories of many kinds, and we want to be able to move between them. A set can be viewed as a degenerate kind of space (with no geometric structure), or a degenerate kind of algebra (with no operations or equations), or a degenerate kind of sheaf (a sheaf on the one-point space):

For any given fact about sets, it’s categorically natural to ask whether that fact holds in a wider context than just categories of sets. Maybe it’s true in all toposes, or all cartesian closed categories, or all categories of algebras for an algebraic theory. That’s the kind of question that category theorists ask. It’s not so much that we’re aiming to use a set theory that’s as weak as possible, but that we want to find the right level of generality for everything. And that generality may go far beyond the world of sets.

To put it another way, imagine a mathematician who studies categories of algebras (for finitary algebraic theories, say). Since sets are the algebras for a certain trivial algebraic theory, anything that’s true of all categories of algebras is true of $\mathbf{Set}$. But it would be perverse to say that our imaginary mathematician is doing “weak set theory”. It would be like describing the class of mammals as “generalized platypuses”.

I think that’s the answer. Category theorists are not a priori interested in set theories that are weak; we’re interested in worlds beyond sets.

#### Next time

In part 2, I’ll get started on the smallest kind of large set: limits.

Posted at June 8, 2021 3:42 PM UTC

TrackBack URL for this Entry:   https://golem.ph.utexas.edu/cgi-bin/MT-3.0/dxy-tb.fcgi/3318

## 49 Comments & 2 Trackbacks

### Re: Large Sets 1

If ETCS + Replacement is biinterpretable with ZFC, then what categorical/topos theoretic set theory is biinterpretable with classical ZF?

Posted by: Madeleine Birchfield on June 8, 2021 4:41 PM | Permalink | Reply to this

### Re: Large Sets 1

Number 10 in Tom’s list is a version of the axiom of choice, so removing that (probably?) gives you ZF.

Posted by: pnips on June 8, 2021 5:12 PM | Permalink | Reply to this

### Re: Large Sets 1

Does replacement alone imply excluded middle and unbounded specification? Because removing the axiom of choice from ETCS leads to a set theory that is biinterpretable with CBZ (constructive bounded Zermelo set theory).

Posted by: Madeleine Birchfield on June 8, 2021 6:03 PM | Permalink | Reply to this

### Re: Large Sets 1

removing the axiom of choice from ETCS leads to a set theory that is biinterpretable with CBZ

Do you have a reference for that, Madeleine?

I’m slightly surprised to hear that they’re biinterpretable. For comparison, ETCS has the same consistency strength as BZC (bounded Zermelo with choice), but I don’t think it’s biinterpretable with it. It’s conceivable that removing choice on both sides does give biinterpretable theories, but it sounds a bit surprising.

Posted by: Tom Leinster on June 8, 2021 6:15 PM | Permalink | Reply to this

### Re: Large Sets 1

Well then it was my mistake; I remember that there was some relation between ETCS and BZC but thought the relation was that the two theories are biinterpretable rather than merely equiconsistent.

Posted by: Madeleine Birchfield on June 8, 2021 7:53 PM | Permalink | Reply to this

### Re: Large Sets 1

OK. Incidentally, section 7 of Colin McLarty’s paper “Exploring categorical structuralism” gives a good quick summary of ETCS vs BZC, referring to Mac Lane and Moerdijk’s topos theory book for details.

Posted by: Tom Leinster on June 8, 2021 8:30 PM | Permalink | Reply to this

### Re: Large Sets 1

While not topos theoretic (as the theory forms a bicategory of relations rather than a well-pointed topos), Mike Shulman’s SEAR is a neutral/categorical set theory that is equiconsistent with ZF. I am not sure if SEAR is biinterpretable with ZF.

Posted by: Madeleine Birchfield on June 8, 2021 10:12 PM | Permalink | Reply to this

### Re: Large Sets 1

It’s certainly true that from a model of any membership-based set theory you can construct a model of the corresponding categorical set theory and vice versa; I wrote about this at length for a lot of weaker set theories in Comparing material and structural set theories. Intuitively I would call that “biinterpretable”, since if you can construct a model of theory 1 in theory 2, you can “interpret” theory 1 in theory 2 by interpreting it in that model. Does “biinterpretable” have a fancy meaning other than this obvious one?

Posted by: Mike Shulman on June 9, 2021 1:47 AM | Permalink | Reply to this

### Re: Large Sets 1

I’ve seen Joel Hamkins talk about bi-interpretability as being rather strong, see for instance his paper Bi-interpretation in weak set theories with Alfredo Roque Freire. But there is both a theory version and a model version of bi-interpretability, and I don’t really know how they relate.

Posted by: David Roberts on June 9, 2021 2:45 AM | Permalink | Reply to this

### Re: Large Sets 1

Ah, it looks like he uses “mutually interpretable” for my intuitive meaning and “biinterpretable” for what I would naively call “equivalent”, i.e. the two translations are inverses up to isomorphism. That stronger property does indeed generally fail for weaker theories. (For ZFC and ETCS+R it only holds in the appropriate sense: the two translations are inverses up to isomorphism on the ZFC side, but only up to equivalence of categories on the ETCS side.) In general to make two theories equivalent, what you need on the material side is a version of Mostowski’s collapsing lemma (either as a theorem or as an axiom), while on the structural side you need a dual sort of property that I called the “axiom of (well-founded) materialization”.

All of these correspondences between set theories are mutual interpretations, not just equiconsistency. I’m curious to what extent “mutually interpretable” differs from “equiconsistent”. In principle it sounds stronger, but I don’t know of any proofs of equiconsistency that don’t proceed by constructing mutual interpretations. But probably set theorists have some fancy examples.

Posted by: Mike Shulman on June 9, 2021 3:20 AM | Permalink | Reply to this

### Re: Large Sets 1

Mike, you and François Dorais had a conversation about biinterpretability, equivalence etc. back here.

My understanding of what the term “biinterpretable” means comes from both François’s comments back then and a conversation with Joel Hamkins more recently. Joel was emphasizing the distinction between mutual interpretability and biinterpretability, in the senses of the words you explained just now. I don’t know how widespread this usage is. In any case, I’ve been mildly frustrated not to have a reference for these definitions, so thank you, David, for providing one.

Posted by: Tom Leinster on June 9, 2021 9:52 AM | Permalink | Reply to this

### Re: Large Sets 1

Mike wrote:

Ah, it looks like he uses “mutually interpretable” for my intuitive meaning and “biinterpretable” for what I would naively call “equivalent”, i.e. the two translations are inverses up to isomorphism. That stronger property does indeed generally fail for weaker theories. (For ZFC and ETCS+R it only holds in the appropriate sense: the two translations are inverses up to isomorphism on the ZFC side, but only up to equivalence of categories on the ETCS side.)

Perhaps there ought to be a notion of ‘homotopy biinterpretable’, where the two translations of the two theories are homotopy inverses of each other, up to homotopy equivalence.

Posted by: Madeleine Birchfield on June 9, 2021 1:12 PM | Permalink | Reply to this

### Re: Large Sets 1

For something very much like this notion, see Hans Halvorson, 2019, The Logic in The Philosophy of Science.

In logic, “Bi-interpretable” means there is a translation f of T1 into T2 and a translation g of T2 into T1, and f and g are inverses (see eg Visser 2004). Here are three examples of great foundational interest:

(i) PA is bi-interpretable with TC(0,1) + induction.

(ii) PA is bi-interpretable with ZF(finite) + Transitive Closure.

(iii) CT is bi-interpretable with ACA + (something).

They are of interest because they show the (appropriate) theory of numbers, the theory of strings, and the theory of finite sets are equivalent:

arithmetic = syntax = finite sets.

These are all quite difficult though. All are sort of “folklore-but-no-one-worked-out-the-details” (because the details are horrible).

By TC(0,1) + induction, I mean the theory of concatenation of binary sequences of 0s and 1s, with induction for such strings.

The equivalence of PA with this theory is implicit in Godel’s 1931 paper, was partly worked out by Quine (and a while later, Mostowski and Tarski, but unpublished), but was finally clarified in detail in some papers by Grzegorczyk, Grzegorczyk & Zdanowski, Visser, Ganea & Svejdar: all round abouts 2005-2010.

By ZF(finite) I mean the result of replacing Infinity with its negation (and “Transitive closure” states that “every set has a transitive closure”). The details of the equivalence of this theory with PA were worked out in a 2007 paper by Kaye and Wong.

By CT, I mean the theory in the language of arithmetic extended with a truth predicate, T(x); and with axioms PA + induction for all formulas + standard compositional axioms for truth (for formulas not containing T). See Cieslinski 2017 or Halbach 2014. By ACA + (something), I mean the first-order 2-sorted theory consisting of second-order arithmetic comprehension with unrestricted induction, plus something (which I think says, “every number corresponds to a set”). See Simpson 2009. The corresponding equivalence is folklore. I’m not sure anyone has worked out the full details, which is why I’m unsure of the “something”.

A result by Friedman & Visser 2014 shows that bi-interpretability for sequential theories implies their definitional equivalence. But mutual interpretability does not entail bi-interpretability. I think the a counter-example given by Andreka et al 2005 works. The discussion by Friedman & Visser 2014 is pertinent too.

Posted by: Jeffrey Ketland on June 10, 2021 9:13 PM | Permalink | Reply to this

### Re: Large Sets 1

I’m excited to read the rest of this series!

One thing that I think is worth noting is that another possible structural meaning of “cardinal” is “(isomorphism class of) well-ordered set that embeds as an initial segment into any well-ordered set whose underlying set is isomorphic to its own”. Obviously this is quite a mouthful, but my understanding is that the ZFC-theoretic identification of cardinals with certain ordinals isn’t just to provide a canonical choice of representative of an isomorphism class of sets — it may have started out that way, and often it’s only for that, but sometimes it seems that the well-ordering really does get used. In my limited reading about large cardinals, it felt to me that some large cardinals were really just defined in terms of the size of the underlying set, while others really did refer to the well-ordering.

However, since you’re writing a series of blog posts about large cardinals in structural set theory (and I’m not), I expect you’ve thought about this much more deeply!

Posted by: Mike Shulman on June 9, 2021 3:54 AM | Permalink | Reply to this

### Re: Large Sets 1

I’m glad you’re looking forward to the rest, but I suspect it will mostly be stuff you already know well… anyway, let’s see!

I agree about the importance of well-ordered sets. To me, this has been one of the most interesting aspects of the whole exercise: not throwing out the baby with the bathwater.

One point that will come up in a later post is the adjunction between sets and well-ordered sets. What I mean is this.

• Let $\mathbf{WOSet}$ be the category whose objects are well-ordered sets and whose maps are order-embeddings with downwards closed image (i.e. embeddings as an initial segment). The category $\mathbf{WOSet}$ is actually a preorder; I’ll write $V \preceq W$ to mean that there exists a map $V \to W$ in $\mathbf{WOSet}$.

• Let $\leq$ be the relation on sets defined by $X \leq Y$ iff there exists an injection $X \to Y$. Then $(\mathbf{Set}, \leq)$ is also a preorder.

• There is a forgetful functor $U: (\mathbf{WOSet}, \preceq) \to (\mathbf{Set}, \leq)$, since an order-embedding is in particular injective.

• There is a functor $I: (\mathbf{Set}, \leq) \to (\mathbf{WOSet}, \preceq)$ defined by taking $I(X)$ to be $X$ equipped with a well-ordering that’s initial (i.e. $I(X) \preceq W$ for any other well-ordered set $W$ with underlying set $X$).

Then $I$ is left adjoint to $U$:

$I(X) \preceq W \iff X \leq U(W)$

for every set $X$ and well-ordered set $W$. Since $U \circ I \cong id$, this adjunction restricts to an equivalence between sets and initial well-ordered sets. And that gives the “possible structural meaning of ‘cardinal’” that you started with in your comment.

What I meant about babies and bathwater is that it does seem to me that the equivalence between sets and initial well-ordered sets is significant. Sometimes we have a choice about whether we use it or not (e.g. you can define the cofinality of a set directly or via the corresponding WO set). But sometimes the equivalence, or the entire adjunction, seems pretty much essential.

Posted by: Tom Leinster on June 9, 2021 12:56 PM | Permalink | Reply to this

### Re: Large Sets 1

Why would you use this special word for an isomorphism class of sets…when the property of sets you’re talking about is isomorphism-invariant anyway? You don’t do this for groups or any other type of mathematical structure.

I believe there are actually a few other situations where something like this happened, historically. In older writings, at least, a representation of a group $G$ sometimes meant an isomorphism class of vector spaces with $G$-action. And a homotopy type traditionally referred either to a homotopy-equivalence-class of spaces or to an object of the homotopy category of spaces. However, nowadays it seems that both of those words have been mostly reassigned to the original object rather than its isomorphism classes.

Posted by: Mike Shulman on June 9, 2021 4:00 AM | Permalink | Reply to this

### Re: Large Sets 1

A sign that quotienting as a higher inductive type (HoTT book, 6.10) prevails?

Posted by: David Corfield on June 9, 2021 7:33 AM | Permalink | Reply to this

### Re: Large Sets 1

Interesting, I didn’t know about that usage of “representation”.

Actually, for the post I wrote and then deleted a paragraph contrasting parts of mathematics where there are multiple relevant notions of isomorphism and parts where there are only one. It seems to be a geometry/algebra divide.

In geometry, you have diffeomorphism, homeomorphism, homotopy equivalence, birationality, etc., so we do more often need to say things like “homotopy type”, “birationality class”, etc.

But in algebra, there usually seems to be only one sensible notion of isomorphism (though I’m sure there are exceptions). That means we can get away with just saying “group” etc. without having to specify what it’s “up to”.

Posted by: Tom Leinster on June 9, 2021 9:59 AM | Permalink | Reply to this

### Re: Large Sets 1

Morita equivalence would be one example I suppose.

Posted by: Richard Williamson on June 9, 2021 1:12 PM | Permalink | Reply to this

### Re: Large Sets 1

Ah yes, good point. So that’s a situation where there’s a clear default (isomorphism) and if we want to refer to another notion of sameness, we have to say it explicitly.

Posted by: Tom Leinster on June 9, 2021 2:29 PM | Permalink | Reply to this

### Re: Large Sets 1

some people don’t like defining an element of $X$ to literally be a function $1 \to X$. That’s OK. True to the spirit of neutral set theory, nothing we’ll do will require that definition. Nothing will contradict it, either.

Doesn’t that mean that ETCS isn’t actually neutral? Just as ZFC has features like the global $\in$ that aren’t part of neutral set theory, so ETCS also has features, like this definitional identification of elements with functions out of 1, that aren’t really part of neutral set theory either.

Posted by: Mike Shulman on June 9, 2021 4:06 AM | Permalink | Reply to this

### Re: Large Sets 1

You’re absolutely right. As I was writing the post, I realized there was something to be resolved here, but I opted for going ahead and posting rather than agonizing over it.

What I think I mean is this. There’s ETCS, which will be the formal axiom system I’m using. Then there’s the way I’m going to write about ETCS. My intention is to write neutrally.

I think what this means is that implicitly, I’ll be using the following set theory — let’s call it $\in$TCS. It’s very much like ETCS, but with the following differences.

The primitive concepts of ETCS (as I’m presenting it) are sets, functions $X \to Y$ for each set $X$ and set $Y$, composition, and identities. For $\in$TCS, we take the same primitive concepts and two more:

• For each set $X$, some things called elements of $X$, writing $x \in X$ to mean that $x$ is an element of $X$.

• For each set $X$, set $Y$, function $f: X \to Y$ and element $x$ of $X$, an element of $Y$ called the evaluation of $f$ at $x$ and written as $f(x)$.

The axioms of $\in$TCS are very similar to those of ETCS, but with the following additions.

• Given an element $x \in X$ and functions $f: X \to Y$ and $g: Y \to Z$, we require that $(g \circ f)(x) = g(f(x))$.

• Given an element $x \in X$, we require that $\id_X(x) = x$.

• A set is terminal if and only if it has exactly one element.

• For every terminal set $1$ and element $\star$ of $1$, and for every set $X$ and element $x$ of $X$, there is exactly one function $f: 1 \to X$ such that $f(\star) = x$.

I think that’s enough to guarantee that $\in$TCS is equivalent to ETCS in the strongest sense. Quite possibly I’ve added more axioms than necessary. One day I should work this out properly.

In any case, the point is that there’s some set theory $\in$TCS (either as I’ve just defined it or a variation) that’s equivalent to ETCS but has $\in$ as a primitive concept. That way, we don’t need to define an element of a set $X$ as a map $1 \to X$, even though there’s a one-to-one correspondence between the two things. My thought is that the “neutral” way I’ll be doing set theory in the coming posts is implicitly using $\in$TCS or something like it.

Posted by: Tom Leinster on June 9, 2021 10:19 AM | Permalink | Reply to this

### Re: Large Sets 1

The problem with the usual definition of universes in ETCS is that the resulting universes are strict rather than weak; i.e. we could actually compare sets in a universe by equality, due to the fact the universes are sets in ETCS, which defeats the entire purpose of ETCS not being a material set theory like ZFC.

Posted by: Madeleine Birchfield on June 9, 2021 2:11 PM | Permalink | Reply to this

### Re: Large Sets 1

Whoops, I responded to the wrong comment, it should be for the comment below

Posted by: Madeleine Birchfield on June 9, 2021 2:14 PM | Permalink | Reply to this

### Re: Large Sets 1

I am not entirely sure how important it will be in this series, but ETCS also has no way of representing indexed families of sets, which the material set theories like ZFC can. One would either have to include Grothendieck universes and define indexed families as functions into a universe, or directly include indexed sets into the primitives a là dependent type theory.

Posted by: Madeleine Birchfield on June 9, 2021 1:54 PM | Permalink | Reply to this

### Re: Large Sets 1

ETCS also has no way of representing indexed families of sets

Oh yes it does! An $I$-indexed family of sets is defined to be a set $X$ together with a function $X \to I$. Think of the fibre over $i$ as the $i$th member of the family.

Of course, we’re speaking a bit loosely here. ETCS itself is just an axiomatic theory (like ZFC), and has no way of doing anything, any more than ZFC does. But we’re talking about the definitions that people customarily make when building up from ETCS.

Posted by: Tom Leinster on June 9, 2021 3:42 PM | Permalink | Reply to this

### Re: Large Sets 1

Oh yes it does! An $I$-indexed family of sets is defined to be a set $X$ together with a function $X \rightarrow I$. Think of the fibre over $i$ as the ith member of the family.

I think I understand why this works: it’s the same phenomenon as defining subsets: given a set $I$ there are two ways to define a subset of $I$ or predicate on $I$: as a function from $I$ into a subobject classifier $Prop$, or as a set $X$ with an injection/monomorphism $X \rightarrow I$. The same thing goes for families of sets: working in the category $Grpd$, given a groupoid $I$, an $I$-indexed family of sets would either be a functor from $I$ to the 0-truncated object classifier $Set$, which is how I have usually thought of families of sets, or a groupoid $X$ with a $0$-truncated functor $X \rightarrow I$. The second definition still works if we restrict ourselves to working in the category of 0-truncated groupoids and 0-truncated functors.

Posted by: Madeleine Birchfield on June 12, 2021 2:19 PM | Permalink | Reply to this

### Re: Large Sets 1

An $I$-indexed family of sets is defined to be a set $X$ together with a function $X\to I$. Think of the fibre over $i$ as the $i$th member of the family.

But that wouldn’t make ETCS a neutral set theory. Some materialists working with ZFC or NBG would probably object to such a definition of indexed sets, for similar reasons why they might object to identifying elements with functions out of the singleton.

Posted by: Madeleine Birchfield on June 9, 2021 10:58 PM | Permalink | Reply to this

### Re: Large Sets 1

Indeed, and I didn’t call ETCS a neutral set theory.

Certainly there are aspects of ETCS (and of how ETCS users use it) that some people don’t like. The definition of elements as functions out of $1$ is one; the definition of families is another. That’s why I didn’t call it neutral.

Posted by: Tom Leinster on June 9, 2021 11:21 PM | Permalink | Reply to this

### Re: Large Sets 1

Hi Tom,

I think Madeleine could be granted a point here:

The kind of set theory I’ll use here, and the language I’ll use to discuss it, is “neutral” in the sense that it’s the language of the large majority of mathematical publications today…

… The set theory I’ll use is Lawvere’s Elementary Theory of the Category of Sets (ETCS).

might look at odds with

Indeed, and I didn’t call ETCS a neutral set theory.

I’m pretty sure I know what you are getting at, and I agree with what you are getting at, and also that you are working to avoid contentious interchanges, but misunderstandings are indeed possible!

Posted by: Todd Trimble on June 10, 2021 1:18 AM | Permalink | Reply to this

### Re: Large Sets 1

I take your point, Todd. I’m using this conversation to work through this “neutral” idea. The distinction between the formal theory and the language used to discuss it appears to play a part.

On the other hand, this current exchange seems much like a re-run of the one a few comments back, where Mike raised the same question as Madeleine in the context of elements rather than families, and I gave a comprehensive answer. A similar answer could be given here. In paritcular, one could take the axiom system that I sketched in that comment and enlarge it to also incorporate families as a primitive concept. Or if that seems like too much effort, one can imagine doing so, which is all that’s really necessary.

In any case, it will become visible as the series gets going what I mean by “neutral”. It’s probably pointless to spend much more time trying to describe it in advance.

(But I did take the time to slightly edit the post, in order to clarify this point. I say this because Todd’s quotation no longer quite matches the post, and I wouldn’t want future readers to think he was making stuff up :-))

Posted by: Tom Leinster on June 10, 2021 1:40 AM | Permalink | Reply to this

### Re: Large Sets 1

Why have you gotten so interested in large sets, Tom? There are plenty of reasons to be interested, but I think of you as more interested in magnitude, and (oddly) I don’t see the connection.

Maybe you’re done with magnitude and have moved on to bigger things?

Posted by: John Baez on June 9, 2021 4:13 AM | Permalink | Reply to this

### Re: Large Sets 1

Ha, no, these “bigger things” have no connection to magnitude as far as I know!

One answer to your question is this:

My primary interest is not actually in large cardinals themselves. What I’m really interested in is exploring the hypothesis that everying in traditional, membership-based set theory that’s relevant to the rest of mathematics can be done smoothly in categorical set theory.

Another is more personal. I find it quite soothing to think about set theory. I find it especially soothing to think about set theory when I’ve been working on more applied or analytic things — it’s a pleasant contrast — or when there’s a pandemic starting. It’s a refuge.

At some point I started working through all the set theory I knew and figuring out how to do it using the categorical approach. Eventually I ran out of set theory I knew and had to learn some more. I’m not going to devote the rest of my life to learning more set theory and redoing it categorically, but I’ve enjoyed the part I’ve done.

Posted by: Tom Leinster on June 9, 2021 10:33 AM | Permalink | Reply to this

### Re: Large Sets 1

Theorem: for all things, there exists a person who objects to that thing.

I do find your act of wrapping some pseudo-sociological observation up as a “Theorem” to be quite deplorable.

Posted by: Simon Willerton on June 9, 2021 10:00 AM | Permalink | Reply to this

### Re: Large Sets 1

:-)

Posted by: Tom Leinster on June 9, 2021 3:44 PM | Permalink | Reply to this

### Re: Large Sets 1

I’m looking forward to this series, Tom.

Since I always lose track of this cat-list message from Lawvere and have to search about for it, and since I’ll be interested to see if contact is made in your posts, I’ll post this:

Andrej Bauer asked whether large cardinals other than inaccessible ones have a natural definition in topos theory. Indeed, like most questions of set theory which have an objective content, this too is independent of the a priori global inclusion and membership chains which are characteristic of the Peano conception that ZF formalizes. Various kinds of “measurable” cardinals arise as possible obstructions to simple dualities of the type considered in algebraic geometry. Actually, measurable cardinals are those which canNOT be measured by smaller ones, because of the existence on them of a type of homomorphism which is equivalent to the existence of a measure in the sense of Ulam. Specifically, let V be a fixed object and let M denote the monoid object of endomorphisms of V. Then the contravariant functor ( )^V is actually valued in the category of left M-actions and as such has an adjoint which is the enriched hom of any left M-set into V. The issue is whether the composite of these, the double dualization, is isomorphic to the identity on the topos; if so, one may say that all objects are measured by V, or that there are no objects supporting non-trivial Ulam elements. In any case, the double dualization monad obtained by composing seems to add new ideal Ulam elements to each object, i.e. elements which cannot be nailed down by V-valued measurements. Since fixed points for the monad are special algebras, and since algebras are always closed under products etc., it should be possible to devise a very natural proof based on monad theory that the category of these non-Ulam objects is itself a topos and even “inaccessible” relative to the ambient topos.

Why is the above definition relevant? The first example should be the topos of finite sets with V a three-element set. There the monad is indeed the identity, as can be seen by adapting results of Stone and Post. Extending the same monad to infinite sets, we obtain the Stone-Czech compactification beta.

The key example is a topos of sets in which we have V a fixed infinite set. As Isbell showed in 1960, the category contains no Ulam cardinals in the usual sense if and only if the monad described above is the identity.

Further examples involve the complex numbers as V, where actually M can be taken to consist only of polynomials, with the same result; this example extends nicely from discrete sets to continuous sets, usually discussed in the context of “real compactness”. Another kind of example concerns bornological spaces. The result always seems to be that the lack of Ulam cardinals is equivalent to the exception-free validity of basic space/quantity dualities.

Ulam (and other set theorists since) usually in effect phrase the construction in terms of a two-element set V equipped however with infinitary operations. Isbell’s remark shows that equivalently an infinite set equipped with finitary (indeed only unary) operations can discern the same distinctions between actual elements as values of the Dirac-type adjunction map and ghostly Ulam elements on the other hand.

Posted by: David Corfield on June 9, 2021 1:12 PM | Permalink | Reply to this

### Re: Large Sets 1

Lots of comments are anticipating later parts of this series, which is great. Your comment, David, anticipates what I think will be almost the final post.

I’m not going to go any bigger than measurable sets. To a set theorist, this is still quite petite. E.g. Kanamori’s book The Higher Infinite dispenses with measurability in section 2 out of 32. But there’s lots interesting along the way from ETCS to measurability.

When I get to measurability, I’ll say something about the result of Isbell that Lawvere is alluding to here, and a variant of it involving a single object $V$ (in Lawvere’s notation).

I understand most of Lawvere’s comment, but I wonder what in algebraic geometry he’s thinking of here:

Various kinds of “measurable” cardinals arise as possible obstructions to simple dualities of the type considered in algebraic geometry.

I suppose he’s not quite saying that measurable cardinals arise as possible obstructions to simple dualities considered in algebraic geometry — he says of the type considered in algebraic geometry. If that “of the type” is a crucial part of the sentence, then it’s much less interesting and I suppose I can guess what he means. But if there is actually a duality in algebraic geometry that’s somehow obstructed by measurable cardinals, I’d like to hear what it is.

Posted by: Tom Leinster on June 9, 2021 3:58 PM | Permalink | Reply to this

### Re: Large Sets 1

With Lawvere’s talk of “bornology” and “real compactness”, I wonder if there’s a whiff of condensedness in the air. Recall Peter Scholze told us that

If you wish, condensed sets are bornological sets equipped with some extra structure related to “limit points”, where limits are understood in terms of ultrafilters,

and it wouldn’t surprise me if your codensity monads were about here somewhere.

Posted by: David Corfield on June 9, 2021 5:05 PM | Permalink | Reply to this

### Re: Large Sets 1

Yes, I see what you mean (roughly!).

I’m sure you already know this, David, but let me say something quick about Lawvere’s comment. I’ll come back to this properly when I do the post on measurability.

Lawvere doesn’t use the term “codensity monad”, but it’s there in what he’s writing, as follows.

Let $\mathbf{Ctbl}$ be the category of countable sets (including finite sets, of course). Write $T$ for the codensity monad of $\mathbf{Ctbl} \hookrightarrow \mathbf{Set}$. Then $T$ is the identity if and only if $\mathbf{Set}$ contains no measurable sets (= measurable cardinals). Equivalently, $\mathbf{Ctbl}$ is codense in $\mathbf{Set}$ iff $\mathbf{Set}$ contains no measurable sets.

In other words, the more nontrivial $T$ is, the more measurable sets there are. For now, I’ll resist talking about what exactly $T(X)$ consists of when measurable sets do exist — that will be in the later post.

The result can be generalized. Let $\mathbf{B}$ be any full subcategory of $\mathbf{Ctbl}$ containing at least one infinite set. Then $\mathbf{B}$ is codense in $\mathbf{Set}$ (or equivalently, the codensity monad of $\mathbf{B} \hookrightarrow \mathbf{Set}$ is the identity) if and only if $\mathbf{Set}$ contains no measurable sets.

One extreme case of this general result is to take $\mathbf{B}$ to be the whole of $\mathbf{Ctbl}$. That’s what I started with. The other is to take $\mathbf{B}$ to be the full subcategory on a single countably infinite set. That case was proved by Isbell in his 1960 paper “Adequate subcategories”, and is what Lawvere’s referring to. And he’s talking about monoids because $\mathbf{B}$ is now a one-object category.

All of this should be compared to the similar and much easier situation where $\mathbf{Ctbl}$ is replaced by the category $\mathbf{Fin}$ of finite sets. (Lawvere mentions this comparison.) The codensity monad of $\mathbf{Fin} \hookrightarrow \mathbf{Set}$ is the ultrafilter monad.

To prove this doesn’t need many assumptions about the category of sets, so $\mathbf{Fin}$ is codense in $\mathbf{Set}$ if and only if no set admits a nontrivial ultrafilter. Thus, if $\mathbf{Set}$ violates the axiom of choice badly enough, $\mathbf{Fin}$ is codense in it.

Again, there’s a more general result: for any full subcategory $\mathbf{B}$ of $\mathbf{Fin}$ containing at least one set with at least three elements, the codensity monad of $\mathbf{B} \hookrightarrow \mathbf{Set}$ is the ultrafilter monad. (This is in a paper of mine.)

One extreme case is to take $\mathbf{B} = \mathbf{Fin}$, as I did above. The opposite extreme is to take $\mathbf{B}$ to be the full subcategory on a single 3-element set, which is what Lawvere is alluding to. As he observed, ultrafilters can then be characterized in terms of the actions of a 27-element monoid.

Posted by: Tom Leinster on June 10, 2021 11:35 AM | Permalink | Reply to this

### Re: Large Sets 1

I’m not going to go any bigger than measurable sets.

I’m somewhat surprised that Vopěnka’s principle won’t be mentioned in the series, especially on a blog specifically dedicated to category theory.

Posted by: Madeleine Birchfield on June 9, 2021 6:16 PM | Permalink | Reply to this

### Re: Large Sets 1

Just here to say this is delightful and perfectly pitched. I’m looking forward to the rest of the series.

Posted by: Emily Riehl on June 9, 2021 7:35 PM | Permalink | Reply to this

### Re: Large Sets 1

Thank you, Emily!

Posted by: Tom Leinster on June 10, 2021 11:12 AM | Permalink | Reply to this

### Re: Large Sets 1

As discussions of categorical set theory have a way of getting heated, and as this is the first of a series on the subject, I want to issue the usual kind of plea for everyone to be careful and considerate of others when writing comments. I think the risk is greatest on this post, as the later ones will be less about the kind of generalities where the temperature can tend to rise. But still, it’s worth saying.

So, when writing comments, please take more than the usual time to reflect, to choose your words carefully, and to consider the human dimension.

This blog is designed to be a friendly place where people can work through ideas in a collaborative and mutually supportive way. Blog posts and comments are not polished, published pieces of academic writing. People may not always say what they mean, or even know what they mean. We make mistakes in public, we acknowledge our mistakes in public, and we show appreciation to those who have helped us learn. And that’s all as intended: figuring things out together is what the blog’s for.

For various reasons that I won’t attempt to articulate here, there’s a particular danger of things turning unfriendly in discussions of categorical set theory. I want to head that off. So, please be nice!

In particular, I’d ask you to be careful about comments that could be construed as purely critical, negative or contradictory. Pause before posting, and think about the other human beings involved. And please appreciate that in anything that resembles “ZFC vs. ETCS”, everyone can feel like an underdog and be accordingly defensive.

Sorry to say such things: sermonizing like this is uncomfortable, and I’m aware that I don’t always live up to these standards myself. But experience suggests that it’s worthwhile to make an explicit statement early on.

Posted by: Tom Leinster on June 10, 2021 11:40 AM | Permalink | Reply to this
Read the post Large Sets 2
Weblog: The n-Category Café
Excerpt: Part 2 of a series on large cardinals in categorical set theory: limits.
Tracked: June 11, 2021 2:16 PM

### Re: Large Sets 1

While there are others who know the history far better than myself, I’ve recently become interested in Lawvere’s encounter with my home institution, Reed College, where he was a visitor during the 1963-64 academic year. According to this interview, Lawvere was tasked with teaching a first year calculus course focused on foundations. His frustration attempting to build the course around ZFC directly led to the formation of ETCS!

The pedagogical practicality of such an effort might be reasonably questioned, and Nick Wheeler has shared that “[Lawvere] did not get much support from colleagues (including me) who dismissed as a naive indulgence his effort to teach to first-year students esoteric things that we ourselves had never heard about.” The rest of Wheeler’s email describes how his perceptions of Lawvere’s work have improved over time, but I am not tempted to incorporate more than the spirit of ETCS into my own calculus sections :)

Thank you for the fascinating series of posts, Tom!

Posted by: Kyle Ormsby on June 11, 2021 6:03 PM | Permalink | Reply to this

### Re: Large Sets 1

The pedagogical practicality of such an effort might be reasonably questioned

It definitely might; it would be interesting to hear from one of his former students there. But I have to applaud a teacher who takes teaching seriously, responding creatively to a pedagogical challenge, rather than being a teaching bot doing same-old same-old (always an available option). Whatever the outcome for his students, we category theorists are grateful!

Posted by: Todd Trimble on June 11, 2021 7:21 PM | Permalink | Reply to this

### Re: Large Sets 1

Indeed! Lawvere captures this sentiment nicely in the above-linked interview:

Thus, the Elementary Theory of the Category of Sets arose from a purely practical educational need, in a sort of experience that Saunders also noted: the need to explain daily for students is often the source of new mathematical discoveries.

Posted by: Kyle Ormsby on June 11, 2021 8:39 PM | Permalink | Reply to this

### Re: Large Sets 1

Thanks, Kyle! That episode is truly mind-boggling.

I can imagine putting something about each of the ETCS axioms into a first-year undergraduate course, though it wouldn’t be calculus and I’d have to soft-pedal many of the axioms.

And I can see some benefit in doing something like this. I used to teach the introductory general topology course in Edinburgh, and I noticed that even our best students (and some of them are excellent) tripped up on basic set-theoretic points — e.g. they had trouble seeing why, if $A, A' \subseteq X$ and $B, B' \subseteq Y$, the subset $(A \times B) \cup (A' \times B') \subseteq X \times Y$ needn’t be of the form $A'' \times B''$. The difficulty was not with the topology, it was with basic set theory. (A picture makes it clear, but they weren’t seeing that picture in their heads, or easily translating between pictures and algebraic formalism.) I found myself wishing they’d done more of that kind of set theory early on.

Similarly, a common point of difficulty for beginning algebra students is understanding quotient groups/rings. It seems to me that this is partly because the students are simultaneously having to understand a new set-theoretic thing (basically the universal property of the quotient of a set by an equivalence relation) and a new algebraic thing. It might be beneficial to teach the set-theoretic thing separately first.

I can just about imagine doing a proper formal version of the ETCS axioms in, say, a third year course at a Scottish university or a second year course at an English university. (Scottish students study more broadly, hence do less mathematics per year in the early years.) But it would take some doing. In actual reality, there may be an opportunity one day for me to teach axiomatic ETCS in a fourth year course here — but this is a long way from Lawvere’s original radical plan!

Posted by: Tom Leinster on June 11, 2021 11:50 PM | Permalink | Reply to this

### Re: Large Sets 1

I’ve just been rereading Lawvere’s An elementary theory of the category of sets (long version) with commentary, where he relates some further recollections of what happened at Reed College. From p.5-6:

This elementary theory of the category of sets arose from a purely practical educational need. When I began teaching at Reed College in 1963, I was instructed that first-year analysis should emphasize foundations, with the usual formulas and applications of calculus being filled out in the second year. Since part of the difficulty in learning calculus stems from the rigid refusal of most textbooks to supply clear, explicit, statements of concepts and principles, I was very happy with the opportunity to oppose that unfortunate trend. Part of the summer of 1963 was devoted to designing a course based on the axiomatics of Zermelo-Fraenkel set theory (even though I had already before concluded that the category of categories is the best setting for “advanced” mathematics). But I soon realized that even an entire semester would not be adequate for explaining all the (for a beginner bizarre) membership-theoretic definitions and results, then translating them into operations usable in algebra and analysis, then using that framework to construct a basis for the material I planned to present in the second semester on metric spaces.

However I found a way out of the ZF impasse and the able Reed students could indeed be led to take advantage of the second semester that I had planned. The way was to present in a couple of months an explicit axiomatic theory of the mathematical operations and concepts (composition, functionals, etc.) as actually needed in the development of the mathematics.

Incidentally, I’ve never been able to think of a justification for that phrase

the rigid refusal of most textbooks to supply clear, explicit, statements of concepts and principles.

Sure, sometimes people don’t write clearly, but he’s obviously saying more than that. Refusal is a very strong word, as if the writers of calculus textbooks know how to make “concepts and principles” clear and explicit but choose not to.

Posted by: Tom Leinster on June 13, 2021 11:13 AM | Permalink | Reply to this

### Re: Large Sets 1

Let me merely give thanks to all for this useful and informative discussion and comparison of different approaches to, and issues with, various approaches to “foundations”. Surfacing a lot of sometimes hidden questions and issues.

Posted by: Keith Harbaugh on June 11, 2021 11:53 PM | Permalink | Reply to this
Read the post Large Sets 12
Weblog: The n-Category Café
Excerpt: Replacement is a natural supplement to ETCS.
Tracked: July 18, 2021 8:14 PM

Post a New Comment