### Large Sets 12.5

#### Posted by Mike Shulman

*Previously: Part 12. Next: Part 13*

Last time Tom told us about McLarty’s replacement axiom for ETCS, but mentioned that there are several other equivalent axioms in the literature, due to Osius, Lawvere, Cole, and myself. In this addendum I want to discuss and compare those other axioms, and explain why I prefer my own (which is essentially a modification of McLarty’s to a *collection* axiom rather than a replacement axiom).

Here is Osius’s axiom:

If $\Phi(X,A,m,B)$ is a formula, where $X,A,B$ are set-variables and $m:A\to X$ is a function-variable, then for any set $X$ there exists a set $Y$ such that for any set $A$ and monomorphism $m:A\rightarrowtail X$, if there exists a set $B$ such that $\Phi(X,A,m,B)$, then there exists such a $B$ with $B\le Y$.

Here is Lawvere’s axiom:

If $\Phi(A,B)$ is a formula with set-variables $A,B$ such that whenever $\Phi(A,B)$ and $\Phi(A,B')$ we have $B\cong B'$, then for any set $X$ there exists a set $Y$ such that $X\lt Y$ and for any $A$ and $B$ with $A\lt Y$ and $\Phi(A,B)$, we have $B\lt Y$.

Here is Cole’s:

If $\Phi(A,B)$ is a formula with set-variables $A,B$, then for any set $X$ there exists a set $Y$ such that for any set $A\lt X$, if there exists a set $B$ with $\Phi(A,B)$, then there exists such a $B$ with $B\lt Y$.

And here is mine, from Comparing material and structural set theories (a 2018 paper that revises the relevant half of the 2010 preprint):

If $\Phi(i,S)$ is a formula with an element-variable $i$ and a set-variable $S$, and $I$ is a set such that for all $i\in I$ there exists a set $S$ with $\Phi(i,S)$, then there exists a surjection $p:V\twoheadrightarrow I$ and a $V$-indexed family of sets $q:A\to V$, such that for all $v\in V$ we have $\Phi(p(v), q^{-1}(v))$.

Of these, Lawvere’s is the only one that I would call a reflection principle. According to the informal description on wikipedia of the latter:

Suppose that we have some collection A of methods for forming sets (for example, taking powersets, subsets, the axiom of replacement, and so on)…. we can introduce the following new principle for forming sets: “the collection of all sets obtained from some set by repeatedly applying all methods in the collection A is also a set”.

Lawvere’s axiom is essentially the specialization of this principle to a partial-functional formula $\Phi$: given a set $X$, the collection of all sets obtained by repeatedly applying $\Phi$ is bounded in cardinality by some set $Y$. More precisely, there is a set $Y$ larger than $X$ such that the collection of sets $\lt Y$ is closed under applying $\Phi$.

Cole’s axiom looks similar to Lawvere’s, but there is a crucial difference that $Y$ is only required to be larger than the sets $B$ obtained by applying $\Phi$ to sets $A$ that are smaller than $X$. In other words, it’s only a “one-step” closure. This is more like a replacement axiom than a reflection axiom, but since there is no uniqueness of $B$, it’s more of a collection axiom.

Informally, the ZF replacement axiom says that the image of (the elements of) a set under a functional formula is (the elements of) a set, while the ZF collection axiom says that the image of (the elements of) a set under any entire formula contains a set onto which the formula is still entire. Cole’s axiom says similarly that the image of the collection of *sets smaller than* some set $X$ under some entire formula contains the collection of sets smaller than some other set $Y$ onto which the formula is still entire.

In other words, while the ZF replacement and collection axioms are element-to-element, Cole’s collection axiom is set-to-set. (Although, admittedly, the distinction is blurry in ZF, where elements of sets are themselves also sets.) Similarly, McLarty’s replacement axiom is element-to-set: the image of the elements of a set under a functional formula is a family of sets indexed by a set. Finally, Osius’s axiom is subset-to-set collection, while mine is element-to-set collection.

A further difference is that Osius’s, Lawvere’s, and Cole’s axioms all “collect a bunch of sets” by finding them as the sets smaller than some given set, while McLarty’s and mine “collect a bunch of sets” by finding them as the elements of a family indexed by some other set (which, recall, in ETCS means the fibers of a function).

All of these axioms are equivalent over ETCS. Which is best? Well, aesthetics differ, but you won’t be surprised to hear that I prefer my own. Let me try to explain why.

First, I think the “family” approach is a better match for the actual practice of mathematics, where we use families of sets all the time. The collection of all sets smaller than some given one is interesting in set theory, but doesn’t pop up very frequently in, say, topology or algebra. This argues for McLarty’s axiom and mine over those of Osius, Lawvere, and Cole.

Secondly, the family approach generalizes better to weaker theories. To my knowledge, from the Osius-Lawvere-Cole axioms we can only deduce a family-style axiom in the presence of something like the powerset axiom, so that for any $X$ there is an indexed family of sets (namely the family of subsets of $X$) containing a set of each isomorphism class $\lt X$.

For these reasons, I think McLarty’s axiom and mine are both preferable to those of Osius-Lawvere-Cole. For comparing McLarty’s axiom and mine, we have to think about the difference between replacement and collection.

To start with, collection axioms also generalize to weaker theories better than replacement axioms. While collection and replacement axioms are equivalent over ETCS or Zermelo set theory, in weaker theories collection axioms tend to be strictly stronger, and the extra strength is often important. This is especially true in ZF-like theories, whose replacement axioms insist that the formula $\Phi$ be functional up to *strict equality* of sets, while in mathematical practice we rarely characterize a set absolutely uniquely, only uniquely up to isomorphism. In the presence of the other axioms of ZF you can play games to turn uniqueness-up-to-iso into literal uniqueness, but the trick can fail in weaker theories, so it’s better to assume a collection axiom from the get-go.

This issue doesn’t arise for McLarty’s axiom in ETCS, which does only require uniqueness up to isomorphism. But this observation leads to another reason I prefer my axiom, which is that there’s something very odd about McLarty’s. Usually in category theory when something is unique, it’s unique up to *unique* isomorphism, but McLarty only asks for uniqueness up to *non-unique* isomorphism. A little thought should suffice to observe why this must be so: when something is unique up to unique isomorphism, the isomorphism is generally only unique *such that* it commutes with or preserves some other given structure. But in McLarty’s axiom there is no other structure, so it will very rarely happen that the isomorphism is *literally* unique — only when the sets $S$ have no nonidentity automorphisms. To my knowledge the only place this happens in practice is the proof that Tom gave that replacement implies separation, since empty sets and terminal sets have no nonidentity automorphisms.

However, this odd feature of McLarty’s axiom means that it essentially incorporates some amount of the axiom of choice. Intuitively, if for each $i\in I$ there is a set $S$ such that $\Phi(i,S)$, then in order to produce the desired family $p:X\to I$ we have to *choose*, for each $i\in I$, a particular set $S_i$, and then define $X = \sum_i S_i$. Often in category theory we find ourselves “choosing” objects that are unique up to *unique* isomorphism, and there’s a sense in which such choices are not a “real” use of the axiom of choice; but when the isomorphisms are not unique, there really is some choice involved.

I don’t know of an “intrinsic” way to argue for this (such as a proof in ETCS-minus-choice that McLarty’s axiom implies some amount of choice). But one thing we can say is that the category of sets in *ZF* does not (at least, not obviously) satisfy McLarty’s axiom.

Why? To prove a categorical axiom like this from the ZF replacement axiom, we need to make a membership-based set that’s *strictly* unique. We can do this with a varation of Scott’s trick: given $\Phi(i,S)$, let $\Psi(i,V)$ assert that $V = V_\alpha$, where $\alpha$ is the least von Neumann ordinal such that there exists an $S\in V_\alpha$ such that $\Phi(i,S)$. Then $V$ is strictly unique given $i$, and so we can use ZF replacement. However, this doesn’t give us an $I$-indexed family of sets $S_i$ such that $\Phi(i,S_i)$; it gives us an $I$-indexed family of *families of sets* $(S_{i j})_{j\in J_i}$ such that $\Phi(i,S_{i j})$ for each $i,j$. Here $J_i$ is the set of all $S\in V_\alpha$ such that $\Phi(i,S)$. Thus what we’ve actually proven is *my* collection axiom, with $J = \sum_i J_i$. (This proof can also be factored through a proof that the ZF replacement axiom implies the ZF collection axiom.) The only way I know of to get McLarty’s axiom from here is to apply the axiom of choice to the surjection $p:J\twoheadrightarrow I$.

Thus, again, if we care about weaker theories without choice, then a collection axiom is more appropriate. In my paper I showed that in ETCS-minus-choice, my collection axiom implies a “unique isomorphism” version of McLarty’s axiom: given any “additional structure” that a set could have, if for each $i\in I$ there is a set $S$ with that structure such that $\Phi(i,S)$, and such an $S$ is unique up to unique isomorphism preserving the structure, then there is a family $p:X\to I$ with that structure “fiberwise over $I$” such that $\Phi(i,p^{-1}(i))$ for each $i$. (The same is true in even weaker theories than ETCS-minus-choice, as long as my “collection of sets” axiom is augmented by a similar “collection of functions” axiom.)

One might argue that this latter “unique isomorphism” replacement property is a better axiom. For instance, it’s more justified by the intuitive idea of replacement; collection is a bit harder to explain from that perspective. And I’m sympathetic to that point of view. But on the other hand, giving a precise formulation of “additional structure” is kind of technical and involved, whereas the collection axiom can be stated much more simply. And the passage to a cover in the collection axiom isn’t that strange if you’re familiar with Kripke-Joyal semantics — but that’s a story for another day.

Finally, sometimes it happens that one really does need collection, not just replacement. For instance, suppose we’re working with categories, or higher categories, or homotopical structures, built out of sets, and we’ve constructed an “operation” on them that’s unique up to some kind of *equivalence*, like a projective resolution or fibrant replacement. Now suppose we also want apply this “operation” to families of such structures all at once. Such a construction won’t be unique even up to non-unique *isomorphism*, so we can’t even apply McLarty’s replacement axiom. But with collection, we can build a family of results for each object in a family of structures, and then try to put them together in an appropriate way.

This wouldn’t necessarily be an argument for assuming a collection axiom directly if it were easy to prove collection from replacement. However, I don’t know of a *structural* proof that collection implies replacement! The only way I know of to do it is ZF-theoretic, using a $V_\alpha$ argument as above. So if you want to prove *in ETCS* that McLarty’s axiom implies mine, as far as I know the only way to do it is to first show that ETCS+R is equivalent to ZFC and then run the ZFC proof. (By contrast, McLarty’s axiom follows easily from mine plus choice.) This is another reason why I prefer to assume collection to start from.

## Re: Large Sets 12.5

Very nice! Among other things, it’s great to have all those statements of replacement side by side and in comparable notation, so that we can compare and contrast them.

(In my copy of Osius’s paper, some parts are scanned so badly that it’s only just possible to read them — including, unfortunately, the part where he discusses replacement.)