
September 10, 2009

Towards a Computer-Aided System for Real Mathematics

Posted by John Baez

I’ve known Arnold Neumaier for quite a while, thanks to many discussions on the newsgroup sci.physics.research. Recently he sent me a proposal for a system called FMATHL (Formal Mathematical Language), designed to be:

a formal framework which will allow — when fully implemented in some programming language — the convenient use of and communication of arbitrary mathematics on the computer, in a way close to the actual practice of mathematics, with emphasis on matching this practice closely.

He asked me for comments, and I gave him a few. But I said that some of you have thought about this subject more deeply, so your comments might be more valuable. So he agreed to let me post links to his proposal here.

Here’s a slightly edited version of what Arnold Neumaier sent me:

I am currently working on the creation of an automatic mathematical research system that can support mathematicians in their daily work, providing services for abstract mathematics as easily as LaTeX provides typesetting services, the arXiv provides access to preprints, Google provides web services, Matlab provides numerical services, or Mathematica provides symbolic services.

The mathematical framework (at present just a formal system – a kind of metacategory of all categories) is designed to be a formal framework for mathematics that will allow (some time in the future) the convenient use and communication of arbitrary mathematics on a computer, in a way close to the actual practice of mathematics.

I would like to make the system useful and easy to use for a wide range of scientists, and hence began to ask various people from different backgrounds for feedback.

At the present point where a computer implementation is not yet available (this will take at least two more years, and how useful it will be may well depend on your input), I’d most value:

  • your constructive feedback on how my plans and the part of the work already done should be extended or modified in order to find widespread approval,
  • your present views on what an automatic mathematical research system should be able to do to be most useful.

Here are two pdf files:

You can find more background work on my web page.

Posted at September 10, 2009 9:28 PM UTC


470 Comments & 1 Trackback

Re: Towards a Computer-Aided System for Real Mathematics

Incredibly ambitious! I dare say even ridiculously ambitious…

Reading this reminded me of a much more modest idea. I’d love a plugin for my web browser and/or PDF previewer that can automagically follow citations like [Theorem 6.5, 17]. This would perhaps work by deciphering what [17] is (by reading the bibliography of the same paper), extracting or locating online a copy of that paper, and searching through it for Theorem 6.5. I’d imagine the user interaction is just right-clicking on a citation and getting a pop-up box showing me the theorem statement, which I could then click on to see the statement in the original paper.

At least for papers on the arXiv, which are generated using hypertex, it shouldn’t be too hard to algorithmically locate “Theorem 6.5”.

Posted by: Scott Morrison on September 11, 2009 12:02 AM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

Incredibly ambitious! I dare say even ridiculously ambitious…

Hear hear. But very exciting, I think. It seems to me that aspects of this project, at least, could be implemented in a finite time frame by dedicated people.

I’m particularly excited by the idea of formalizing mathematics starting in the middle. I’ve recently been playing around with formal proof assistants like HOL, Isabelle, Mizar, etc., and I have definitely been struck by their insistence on building everything from the ground up. The authors of such systems seem fond of invoking Bertrand Russell’s quip:

The method of “postulating” what we want has many advantages; they are the same as the advantages of theft over honest toil.

I’m not sure exactly how Russell intended this (no doubt someone here can set me straight), but it seems to me that the way we really do mathematics is to start by setting up some theory, i.e. “postulating” some structure and axioms, and then we prove things based on that structure and axioms. Isabelle/Isar seems to have a bit of support for this with its “locales” (no relation to pointless topology), but by and large the attitude is different. Formalizing mathematics starting in the middle, not insisting that everything be completely machine-checkably justified at first, with a system that provides other benefits to mathematicians so that we’ll actually use it, seems like it has a good chance of actually building to critical mass.

I also like the idea of a “web of trust,” with theorems “signed” by sources of varying degrees of believability (from “verified with Isabelle” to “found in textbook X” to “I saw Karp in the elevator and he said it was probably true”). Of course it reminds me of the discussions we’ve been having about how to “certify” different research pages on the nLab. And the “semantic wiki” idea is one that I’ve mentioned before in those discussions.

I’ll need to think a bit more before I can come up with constructive feedback and suggestions, however.

Posted by: Mike Shulman on September 11, 2009 1:46 AM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

To see the context of the Russell quotation see here.

Posted by: David Corfield on September 11, 2009 8:28 AM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

Could you please summarize the relevant context of the Russell quotation, for the benefit of the European readers who (because of unclarified copyright issues) cannot see any Google book pages?

Posted by: Arnold Neumaier on September 11, 2009 5:56 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

I’ve recently been playing around with formal proof assistants like HOL, Isabelle, Mizar, etc., and I have definitely been struck by their insistence on building everything from the ground up. […] the way we really do mathematics is to start by setting up some theory, i.e. “postulating” some structure and axioms, and then we prove things based on that structure and axioms.

You certainly can do this sort of thing in formal proof assistants (Coq is the one that I know best). Of course you know that you can since they are Turing-complete, but Coq (at least) has support for doing this naturally, with commands like Assume and Hypothesis.

On the other hand, Coq comes with its own foundations of mathematics (the Calculus of Inductive and Coinductive Constructions), and if you want to use one that doesn't match theirs, then you not only have to write it yourself (not too hard) but choose terminology that doesn't conflict with what Coq already thinks a ‘Set’ is. If FMathL is designed with more flexibility in mind, then that would be a Good Thing.
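For concreteness, here is a minimal sketch (mine, not from the comment) of this kind of “postulating” in Coq, using a Section with Variables and Hypotheses; all the names are invented for illustration:

```
(* Sketch: "starting in the middle" by postulating structure and axioms
   locally, then proving things from them.  Illustrative names only. *)
Section PostulatedMonoid.
  Variable M : Type.
  Variable op : M -> M -> M.
  Variable e : M.
  Hypothesis assoc   : forall x y z : M, op (op x y) z = op x (op y z).
  Hypothesis left_id : forall x : M, op e x = x.

  Lemma left_id_twice : forall x : M, op e (op e x) = x.
  Proof. intro x. rewrite left_id. apply left_id. Qed.
End PostulatedMonoid.
```

When the Section is closed, the lemma is automatically generalized over the postulated data it uses, so nothing here commits you to any particular global foundation.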

Posted by: Toby Bartels on September 11, 2009 5:56 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

Toby Bartels wrote:

Coq comes with its own foundations of mathematics (the Calculus of Inductive and Coinductive Constructions), and if you want to use one that doesn’t match theirs, then you not only have to write it yourself (not too hard) but choose terminology that doesn’t conflict with what Coq already thinks a `Set’ is. If FMathL is designed with more flexibility in mind, then that would be a Good Thing.

FMathL has a concept of nested contexts in which reasoning happens; these contexts can be defined, opened, modified, and closed, according to established informal practice, just slightly formalized.

The intention is to take care of all common practices of mathematicians. Knowing that mathematicians freely use the same names for different concepts, depending on what they do (variable name conventions may even vary within the same document), FMathL will allow one to redefine everything (if necessary) by creating appropriate contexts.

The outermost context is always the standard FMathL context with its axioms, but in nested contexts one can override surface conventions (language constructs denoting concepts, relations, etc.) valid in an outer context, in a similar way as, in programming, variable names in a subroutine can be chosen independently of variables in the calling routine. Internally, however, all this is disambiguated, and concepts are uniquely named.

One must also be able to define one’s own syntax for things, just by saying somewhere in the mathematical text things like “We say that (x,y) sign z word w if formula involving x,y,z,w”, overriding old, conflicting uses of sign and/or word valid in an outer context. (This creates high demands on the parser; we are currently studying how to meet this challenge.)

Such overloading of meanings or syntax is not really recommended, though, in the interest of transparency. Standard mathematics, i.e., what undergraduates should be able to understand without confusion and with limited effort, will be handled with a minimum of such artifices. (There are a number of ambiguities in traditional terminology and notation, which we hope to handle in some “natural” way, though we haven’t yet decided exactly how.)
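As a very rough analogue in an existing system (my illustration, not part of the FMathL proposal), a notation declared locally inside a Coq Section behaves like such a nested context: the surface syntax is overridden only while the context is open.

```
(* Illustrative sketch only: syntax local to a nested "context". *)
Section InnerContext.
  Local Notation "x <+> y" := (x + y * 2) (at level 50, left associativity).
  Example demo : 1 <+> 3 = 7.
  Proof. reflexivity. Qed.
End InnerContext.
(* After `End InnerContext`, the <+> notation is no longer in scope. *)
```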

Posted by: Arnold Neumaier on September 11, 2009 8:31 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

FMathL has a concept of nested contexts in which reasoning happens; these contexts can be defined, opened, modified, and closed, according to established informal practice, just slightly formalized.

OK, Coq has these too (called ‘Sections’); I think that Isabelle's ‘locales’ are the same idea.

The outermost context is always the standard FMathL context with its axioms,

How strong are these? I would find it very nice if even the basic foundations were highly modular.

but in nested contexts one can override surface conventions (language constructs denoting concepts, relations, etc.) valid in an outer context.

Coq does not allow one to rename or redefine things within a Section, although it does allow one to rename things within a Module, which is basically like a Section except that it's stored in a different file. (The idea is that one introduces a Section for temporary convenience within a single document, but a Module should be reasonably self-contained and is intended to be used by many different people.) It might be nice if FMathL is a little more forgiving than Coq about this.

One must also be able to define one’s own syntax for things, just by saying somewhere in the mathematical text things like “We say that (x,y) sign z word w if formula involving x,y,z,w”, overriding old, conflicting uses of sign and/or word valid in an outer context. (This creates high demands on the parser; we are currently studying how to meet this challenge.)

Coq allows this too (I mean symbols, since I've already dealt with redefining the things themselves), but people don't like to use it, since specifying the right order of operations is an annoying technicality. If you can program the parser to figure that out for us, that would be nice … if it's even possible.
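As a small illustration (mine, with invented names) of what “specifying the right order of operations” looks like in Coq: each user-defined notation has to be given an explicit precedence level and associativity.

```
(* Illustrative only: user-defined syntax needs an explicit precedence. *)
Definition divides (a b : nat) : Prop := exists k : nat, b = a * k.
Notation "a 'dvd' b" := (divides a b) (at level 70, no associativity).

Example three_dvd_six : 3 dvd 6.
Proof. unfold divides. exists 2. reflexivity. Qed.
```

The `(at level 70, no associativity)` part is exactly the kind of precedence bookkeeping being complained about here.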

Sorry to say a lot of ‘Yeah, the program that I know can already do all of this.’. The thing is, there are a lot of ways that people have developed to formalise rigorous mathematics on a computer (such as all of the ones that Mike mentioned), but none of them have caught on with practising mathematicians, so if you can create something with a better design, then that's good! There's nothing that anyone can point to and say ‘Instead of developing FMathL, just use that; it's good enough.’, because nothing is good enough yet.

Posted by: Toby Bartels on September 11, 2009 9:10 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

TB: There’s nothing that anyone can point to and say “Instead of developing FMathL, just use that; it’s good enough.”, because nothing is good enough yet.

Look at the links in FMathL - Formal Mathematical Language to see what I had already looked at before realizing the need (and a realistic possibility) to do it all in our Vienna group. Nothing that exists is easy to use, nothing looks like ordinary math, nothing attracts typical mathematicians.

I’d have preferred not to have to develop such a system myself. But it will not come without mathematicians playing a leading role in its development. They do not even do small, easy things such as How to write a nice, fully formalized proof, that would make things more readable without much effort.

Posted by: Arnold Neumaier on September 11, 2009 10:21 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

One must also be able to define one’s own syntax for things, just by saying somewhere in the mathematical text things like “We say that (x,y) sign z word w if formula involving x,y,z,w”,

Coq allows this too (I mean symbols, since I’ve already dealt with redefining the things themselves)

And Isabelle has something it calls “mixfix,” which seems to be along the same lines.

Posted by: Mike Shulman on September 12, 2009 5:08 AM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

AN: The outermost context is always the standard FMathL context with its axioms

TB: How strong are these? I would find it very nice if even the basic foundations were highly modular.

Like Bourbaki, FMathL assumes classical logic and the global axiom of choice, but, more weakly than Bourbaki, only the ability to form the set of all subsets of the continuum, since this is sufficient to be able to reflect the whole FMathL conception inside itself and prove some natural properties. See Logic in context. More can be added in the standard way by making assumptions.

Since one must be able to use FMathL as a comfortable metalevel, intuitionistic logic is inadequate. Even treatments of intuitionistic logic usually use classical reasoning on the metalevel. (I’d like to know of a book that doesn’t, if such a book exists!)

The FMathL axioms are assumed on the specification level. But one can decide to work in a reflection level (one layer below the specification level), where one can define one’s own axioms and inference rules in a completely free way. Since FMathL will be reflected itself, in a number of partially nested, partially independent contexts, one can just take the part of the FMathL specifications one is happy with and augment it in one’s own way.

TB: specifying the right order of operations is an annoying technicality. If you can program the parser to figure that out for us, that would be nice … if it’s even possible.

There are no intrinsic difficulties; it is just a matter of getting the parser to do it correctly. This means that one needs to automatically generate the new grammar and update the parser accordingly. Thus we will provide a way that is easy for the user, but since we haven’t fixed the structure of our grammar yet (we need a grammar that works well in an incremental mode and can handle ambiguities and attributes), it will take some time before we can consider in detail how.

Posted by: Arnold Neumaier on September 14, 2009 2:56 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

I certainly can’t give constructive feedback on this amount of work, but while reading the (27) axioms I was thinking to myself:

are these axioms fixed and transcendent like in a true axiomatic framework

or

can these axioms be changed within FMathL (once it is working), a flexibility one might want to have when doing mathematics? Is FMathL its own metatheory?

Posted by: Uwe Stroinski on September 11, 2009 8:16 AM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

Uwe Stroinski asked:
“… or can these axioms be changed within FMathL (once it is working), a flexibility one might want to have when doing mathematics? Is FMathL its own metatheory?”

SH: I wondered about this. JB provided an edited summary in which he wrote:

“The mathematical framework (at present just a formal system – a kind of metacategory of all categories) is designed to be a formal framework for mathematics that will allow…”

Neumaier wrote in “A Semantic Turing Machine.pdf”:
“Besides serving as a theoretical basis of all programming languages (λ-calculus being another), Turing machines have many interesting applications, reaching from problems in logic, e.g., the halting problem c.f. Odifreddi [11], to formal languages (see Cohen [2]).”

SH: The halting problem (HP) applies to a formal language such as Lisp, for instance. It seems to me that Neumaier should have explained why the halting problem could have no adverse impact on the USTM he proposes; my first thought was that the USTM would either be incomplete or inconsistent. Since Neumaier provides references regarding the HP, perhaps he has considered this. Another statement which troubled me was this:
“… or in other words, not every STM program, regarded as a function on the context, is Turing computable. For example, external processors might have access to the system clock etc.”

SH: This is a difference in architectures, but I learned that every computable process which runs on a PC is Turing computable, and that has nothing to do with a PC having a system clock. A TM and a PC can compute exactly the same range of calculations (the same power), except that, because the TM is an ideal machine with no physical constraints on time and memory, it could for instance compute more digits of Pi. PCs have those physical limitations, but both the TM and the PC compute the same class of effective procedures, which is what Turing computability means.

Posted by: Stephen Harris on September 11, 2009 11:35 AM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

Stephen Harris wrote:

It seems to me that Neumaier should have explained why the halting problem could have no adverse impact on the USTM he proposes.

It poses the same problems as for an ordinary computer. To deserve its name, the USTM (universal semantic Turing machine) halts iff the program it simulates halts. More is not needed.

The halting problem for a semantic Turing machine poses no problems for mathematics done in FMathL, except that some searches for a proof (or other searches) may never terminate. But mathematicians also sometimes search for a proof without getting a result. Of course, FMathL will have, like mathematicians, an option to quit a search early if it seems hopeless.

Stephen Harris:

I learned that every computable process which runs on a PC is Turing computable and that has nothing to do with a PC having a system clock.

This holds only if there is no external input. A clock, or a human being who types in a reply to a query, are not computable, at least not in any well-documented sense. But the result of the program depends on that input and hence is generally not computable either.

Posted by: Arnold Neumaier on September 11, 2009 1:25 PM | Permalink | Reply to this

How many axioms may a foundation have?

US: …while reading the (27) axioms…

This is an advance over ZF, which needs infinitely many to express the existence of sets defined by properties. (NBG is finitely axiomatized, though.)

Posted by: Arnold Neumaier on September 14, 2009 12:19 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

Uwe Stroinski wrote:

while reading the (27) axioms I was thinking to myself: are these axioms fixed and transcendent like in a true axiomatic framework, or can these axioms be changed within FMathL (once it is working), a flexibility one might want to have when doing mathematics? Is FMathL its own metatheory?

The framework has fixed axioms, since it defines the common part of different subject levels. Within the framework, one can do arbitrary mathematics, using the terminology of the framework as a metalevel.

Thus, if you like and it matters to you, you’ll be able to define your own (e.g., intuitionistic) logic, your own version of sets (say Bishop-style), functions (say, terminating algorithms), and real numbers (say, oracles defining one decimal digit after the other), and then reason in the resulting system.

You’ll be able to arrange with a presentation style file that the printed version of your theory does not show a trace of your assumptions but regards them as well-known background knowledge, or that they are outlined, or that they are explained in detail.

For other users of FMathL, this will just look like a particular context that you created, one that those who want to build upon your work can include into a context of their own. You can create as many such contexts as you like, and include in your current context any other context that you want - but you are responsible for maintaining consistency.

But the FMathL axioms were selected in such a way that, for most mathematics outside set theory and mathematical logic, one does not need these private contexts but can just work in the standard context which satisfies the FMathL axioms. Then one adds definitions and results from the desired fields until one has enough to do one’s own work.

At least the usual linear algebra, real and complex analysis, and elementary algebra will be in standard contexts; if you can read German, you may look at

http://www.mat.univie.ac.at/~neum/FMathL/ALA.pdf

to see the possible content of such a standard context.

Posted by: Arnold Neumaier on September 11, 2009 1:07 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

I use and develop mathematical formulae and relationships in computer vision (which obviously has a different flavour from both more “full theory” and “proof based” areas), but I’d say one important element of your new system should be: heed the lesson of internet search (Google, etc.) and design things in such a way that simple, brute force search is possible. To expand on that, the staggeringly amazing thing about even sophisticated search engines is that, although the algorithms they use are SO elementary relative to what an actual human would do (they’re sophisticated in their own way), they generally manage to produce search results which quite often help you with your task (even if only by helping refine the vocabulary you use to express your goal). Likewise, one can do reasonably well at various tasks just using relatively dumb program scripts that churn away on some big database (e.g., I remember something in Scientific American about formalising results from bioscience papers just enough to be able to make “brute force” connections between the results of multiple papers and suggest new things to try; unfortunately it appears SA’s paywall stops me finding a reference).

I know your primary focus is unambiguously communicating well-formed results with some level-of-trust certificate in a human centred way, but don’t design things in such a way that programs can’t get at the contents in non-standard ways. (In particular, definitely make it possible to access individual statements in a “document” in the database directly if they fit some “pattern” (some mathematics oriented variant of a regular expression) without having to go through everything in the document, or having to commit to an axiom scheme, etc.) In one way it’s galling that brute force (rather than careful thought) can actually achieve so much; in another it’s exhilarating that new relationships and hints at entities may be uncovered by such means (in an analogous way to the monstrousness of the discovery of the Monstrous moonshine relationship).

Posted by: bane on September 11, 2009 1:44 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

bane: definitely make it possible to access individual statements in a “document” in the database directly if they fit some “pattern” (some mathematics oriented variant of a regular expression)

What kind of patterns would you like to search for in a math text? How would you like to pose such a query? How insensitive should the search be to details in formulas? Please give some telling examples.

Some sort of search will certainly be possible. But trying Wolfram|Alpha shows that even simple searches for mathematical patterns are difficult for today’s technology.

For search, automatic proof, and other well-studied techniques, FMathL will have to rely on what others can do. We need to concentrate our efforts in order to have real impact. Therefore, FMathL is intended to be innovative mainly in so-far-neglected areas of relevance to mathematics on the computer, and otherwise just to interface with known systems.

Thus I don’t know how far we’ll be able to proceed in the direction of structural search.

Posted by: Arnold Neumaier on September 11, 2009 8:51 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

Thanks for your earlier clarification of my questions. Peter Schodl provided this rough summary at http://www.mat.univie.ac.at/~schodl/
“Roughly speaking, we want to teach a computer to understand a LaTeX-file well enough to communicate its essential content to other software. Firstly, we concentrate on mathematical text specifying optimization problems.”

SH: I think the issue of not all blogging software being LaTeX cross-compatible has come up on this forum. Does FMathL solve this problem or would the blogging software still need to be changed?

Fair use excerpt from my copy of “Introduction to Mathematical Philosophy”
By Bertrand Russell

“From the habit of being influenced by spatial imagination, people have
supposed that series _must_ have limits in cases where it seems odd if
they do not. Thus, perceiving that there was no _rational_ limit to the
ratios whose square is less than 2, they allow themselves to “postulate”
an _irrational_ limit, which was to fill the Dedekind gap. Dedekind, in
the above-mentioned work, set up the axiom that the gap must always be
filled, i.e. that every section must have a boundary. It is for this
reason that series where his axiom is verified are called “Dedekindian.”
But there are an infinite number of series for which it is not verified.

The method of “postulating” what we want has many advantages; they are
the same as the advantages of theft over honest toil. Let us leave them
to others and proceed with our honest toil.”

Posted by: Stephen Harris on September 12, 2009 12:10 AM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

I expressed things badly: what I meant was “definitely don’t make any design decisions that will make it impossible (for others to write software) to search…”, not that your project should actively provide search technology, particularly if that’s not an area of direct interest. I say this because of two “facts of life”:

(1) when designing GUI based software for humans it’s easy to choose programming constructs and data representations so that they are effectively only usable with the GUI.

(2) If something becomes popular it will require a significant functionality increase to displace it, and even if it does get displaced it’s very rare for old “documents” to get properly converted. Eg, the TeX language was “set in stone” in 1983 (AIUI), and most of the core LaTeX “language” by about 1988 (ie, new user level commands not “implementation”). Various people, including Wolfram Research, have tried to displace it and none have gathered significant marketshare. I don’t expect TeX/LaTeX to be displaced until someone figures out inferring pen-based mathematical writing. Yet LaTeX doesn’t provide ways of indicating even simple semantic information, such as distinguishing eqnarrays with multiple = signs based upon whether they represent cases in a definition or steps of simplification, etc. (I know it could, but it doesn’t and I doubt such a feature could be “made” to be used by everyone at this stage.)

So basically all I’m saying is: imagine both that your project is technically successful and that it becomes very widespread, so that the world has 20+ years of “documents” in this format. Are there any design decisions that you’d come to regret?

I’ve got to go out now, but I’ll try and think of some concrete search examples and post later.

Posted by: bane on September 12, 2009 3:19 PM | Permalink | Reply to this

design decisions that you’d come to regret.

bane: imagine both that your project is technically successful and that it becomes very widespread, so that the world has 20+ years of “documents” in this format. Are there any design decisions that you’d come to regret?

If I knew of one today, I’d certainly try to avoid it. In any case, the advantage of a fully semantic description of a subject matter in a self-reflective environment (such as FMathL is designed to be) is that it makes upgrading the whole database to a new representation a fully automatic task.

Writing a program for doing so would be much easier than writing a program that upgrades LaTeX (which has an ill-defined semantics) to a different environment.

Posted by: Arnold Neumaier on September 14, 2009 4:09 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

I’m just starting to get into the description of FMathL; here are some initial thoughts.

I think it is misleading, as is done in the introduction to the description of FMathL, to conflate CCAF and ETCS. CCAF (the Category of Categories As a Foundation for mathematics) is, in my opinion, a convoluted setup, since in order to do any mathematics, you need a notion of set, so before you can get anywhere you first have to define sets in terms of categories. ETCS (the Elementary Theory of the Category of Sets), on the other hand, is a set theory, not a “version” of CCAF. The axioms of CCAF are about things called “categories” and “functors,” while the axioms of ETCS are about things called “sets” and “functions” (among those axioms being that sets and functions are the objects and morphisms of a category, the titular category of sets).

I like to state the difference between ETCS and ZFC by saying that ETCS is a “structural” set theory while ZFC is a “material” set theory. In my (biased) opinion, structural set theories solve the interpretation/implementation problems of material set theories, and are also more in line with mathematical practice (for instance, they do not permit nonsensical questions such as whether $1\in\sqrt{2}$ or whether $\pi$ is equal to the Cayley graph of $F_2$). Also, structural set theory is closely related to type theory, which is used by some existing proof assistants like HOL and Isabelle. So my question is, what is the advantage of FMathL over structural set theory or type theory as a foundation?

(By the way, while it is true that Lawvere’s original description of ETCS used the single-sorted definition of a category so that “every set is regarded as a mapping,” this (mis)feature is easily discarded (and usually is, in practice). By contrast, the feature of ZFC by which a function is a particular type of set really seems essential to the development of the theory. So I don’t think it is fair to speak of the two in the same breath as reasons that existing foundations are inadequate.)

Posted by: Mike Shulman on September 12, 2009 6:44 AM | Permalink | Reply to this

CCAF, ETCS and type theories

MS: I think it is misleading, as is done in the introduction to the description of FMathL, to conflate CCAF and ETCS. CCAF (the Category of Categories As a Foundation for mathematics) is, in my opinion, a convoluted setup, since in order to do any mathematics, you need a notion of set, so before you can get anywhere you first have to define sets in terms of categories. ETCS (the Elementary Theory of the Category of Sets), on the other hand, is a set theory, not a “version” of CCAF.

I hadn’t called ETCS a version of CCAF. It is a part of CCAF, and I discussed it as such.

But the situation is not as simple as you describe it.

Any metatheory needs a concept of sets or collections, in order to speak about the objects, properties, and actions it is going to define on the formal level. And a metatheory that may serve as a foundation of mathematics must be able to model itself by reflection.

Thus you cannot have ETCS first without having categories even “firster”. There are four possible scenarios for foundations involving categories:

(I) Start with informal sets, create a formal definition of sets, and from it a formal definition of categories.

(II) Start with informal sets, create a formal definition of categories, and from it a formal definition of sets.

(III) Start with informal categories, create a formal definition of categories, and from it a formal definition of sets.

(IV) Start with informal categories, create a formal definition of sets, and from it a formal definition of categories.

ZFC + inaccessible cardinal + conventional category theory realizes (I). Conventional category theory + ETCS realizes (II). CCAF (with ETCS built in) realizes (III). ETCS + conventional category theory realizes (IV).

The spirit of categorial foundations requires (III), I think, which needs both CCAF and ETCS.

MS: ETCS is a “structural” set theory while ZFC is a “material” set theory. In my (biased) opinion, structural set theories solve the interpretation/implementation problems of material set theories, and are also more in line with mathematical practice (for instance, they do not permit nonsensical questions such as whether $1\in\sqrt{2}$

They only replace one interpretation/implementation problem by another. Sets of sets cannot be formed in ETCS, but are very common mathematical practice. Thus categories are too “immaterial” to be useful as background theory.

MS: So my question is, what is the advantage of FMathL over structural set theory or type theory as a foundation?

Mathematics is type-free; so is FMathL. Mathematics does not have a type element and a type set (as in ETCS). At best there is a very weak typing that interprets membership in an arbitrary set as a type. (The development of type theory goes in this direction, too, e.g., with dependent types; but as it does so, types become less and less distinguishable from sets.)

FMathL models the actual practice, and does not compromise to gain formal conciseness. The elegance of common mathematical language lies in its power to be expressive, short, and yet fairly easily intelligible, which is in stark contrast to the many different type theories I have seen.

MS: “every set is regarded as a mapping,” this (mis)feature is easily discarded (and usually is, in practice).

True, but the existing versions of CCAF still differ a lot, so that the general conclusion about the categorial foundations is justified. The paper was not meant to give a fair discussion of the relative merits of the existing foundations, but just to point out that neither is adequate for the practice of mathematics.

MS: By contrast, the feature of ZFC by which a function is a particular type of set really seems essential to the development of the theory.

In the usual expositions, yes. But not intrinsically. ZFC is essentially equivalent to NBG, which was first formulated by von Neumann (the N in NBG) as a theory in which everything was a function, and sets were special functions. It is easy to create equivalent versions of NBG and ZFC in which functions and sets are both fundamental concepts related by axioms that turn characteristic functions into sets and graphs of functions into functions.
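As a toy illustration of the “characteristic functions as sets” direction (my sketch, not Neumaier’s, and with decidable membership only, so far weaker than NBG):

```
(* Illustrative only: "sets" of naturals encoded as characteristic functions. *)
Definition cset := nat -> bool.
Definition member (x : nat) (A : cset) : Prop := A x = true.
Definition evens : cset := fun n => Nat.even n.

Example two_is_even : member 2 evens.
Proof. unfold member. reflexivity. Qed.
```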

The possibility of these apparent differences that do not lead to essential differences in the power of the theory shows that these are matters of implementation, and not of essence. FMathL intends to capture the latter.

Posted by: Arnold Neumaier on September 14, 2009 4:01 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

I hadn’t called ETCS a version of CCAF.

Sorry, I misinterpreted one of your sentences.

It is a part of CCAF

I disagree. One may regard it as a part of a general universe of “categorial foundations of mathematics,” but I think CCAF has a very specific meaning which is distinct from ETCS (e.g. Lawvere, “The category of categories as a foundation for mathematics”). ETCS does not require (or, usually, have anything to do with) CCAF, while the development of math from CCAF does not necessarily go through ETCS.

Any metatheory needs a concept of sets or collections, in order to speak about the objects, properties, and actions it is going to define on the formal level.

Well, any metatheory at least needs a logic. Most theories such as ZFC, ETCS, CCAF, etc. are formulated in first-order logic. If one then wants to talk about models for that logic, then one needs a “place” in which to consider such models, which will generally involve a notion of set/collection.

The spirit of categorial foundations requires (III), I think, which needs both CCAF and ETCS.

Well, maybe. But if that’s so, then I would not argue in favor of “categorial foundations” (a phrase I generally do not use). What I’m proposing as “structural set theory” is, I think, your (IV). Note, though, that ETCS does not require a prior (formal or informal) definition of category; it can be stated in pure first-order logic.

Mathematics is type-free

When I look at mathematics, I see types everywhere. What is the objection to $1\in\sqrt{2}$ if not a type mismatch?

By the way, I observe that $1\in\sqrt{2}$ is in fact a legal statement in FMathL. Isn’t that exactly the sort of “extraneous information” that’s problematic about ZFC? I don’t actually see how FMathL solves any of the problems of material set theory.

Here’s what I see when I look at mathematics as it is done by mathematicians:

  • Mathematical structures (groups, rings, topological spaces, manifolds, vector spaces) are built out of sets and functions/relations between these sets. From the point of view of the general theory of any such structure, the elements of such sets have no internal structure; they are featureless. In particular, the general theory of a type of structure is invariant under isomorphism.

  • Elements of sets can have “internal meaning” relative to elements of the same set or other sets, as specified by functions and relations. For instance, the elements of a cartesian product $A\times B$ are interpreted as pairs $(a,b)$ relative to the sets $A$ and $B$, with the relationship specified via the projection functions $A\times B\to A$ and $A\times B\to B$. But once we start thinking about $A\times B$ as an object in its own right without its relationship to $A$ and $B$, its elements lose their “internal” properties and become featureless, like the elements of any other set.

For instance, when we construct $\mathbb{Q}$ as a set of ordered pairs of integers, we consider a subset of $\mathbb{Z}\times\mathbb{Z}$ and use the interpretation of its elements as pairs to construct operations and properties of it. However, once the construction is finished, we generally forget about it and treat each rational number as an independent entity.

I think this property (that structure on a set comes only from the outside) is precisely what allows us to capture “the essence of mathematical concepts;” otherwise we will always be carrying around baggage about the internal properties of the elements of our sets.

Sets of sets cannot be formed in ETCS, but are very common mathematical practice.

I would also argue that the above interpretation of cartesian products also applies to “sets of sets.” The elements of a power set $P A$ are given meaning as subsets of $A$ only via the “is an element of” relation from $A$ to $P A$, just as the elements of $A\times B$ are given meaning as ordered pairs only via the projections to $A$ and $B$. Once we forget that $P A$ was constructed in relation to $A$, the meaning of its elements as subsets of $A$ vanishes. This is the case, for instance, when we construct the real numbers via Dedekind cuts: $\mathbb{R}$ starts out as a subset of $P\mathbb{Q}$, just as $\mathbb{Q}$ started out as a subset of $\mathbb{Z}\times\mathbb{Z}$, but once we have proven enough about it we generally forget about the identification of a real number with a set of rationals and treat it as an independent entity.

It is true that mathematicians usually think of the elements of $P A$ as “being” subsets of $A$, rather than being merely “associated” to them. But this is perfectly in line with the structural philosophy that anything and everything can be transported along an isomorphism: every subset of $A$ corresponds to a unique element of $P A$, so we might as well consider elements of $P A$ to “be” subsets of $A$ (as long as we continue to keep the “is an element of” relation in mind). But the structural point of view also allows us to discard the “is an element of” relation, which we can’t do in any system where the elements of $P A$ really are subsets of $A$.

By the way, this transport along a bijection is, I think, easily hidden from the user of any computer system. After all, what does one do with subsets of $A$? The basic thing is to talk about which elements of $A$ are elements of them—and this is handled directly by the “is an element of” relation. All other operations and properties on elements of $P A$ are most naturally defined in terms of “is an element of,” so the user won’t ever be aware that the elements of $P A$ “aren’t really” subsets of $A$.

The elegance of common mathematical language lies in its power to be expressive, short, and yet fairly easily intelligible, which is in stark contrast to the many different type theories I have seen.

I think type theory (along with many other kinds of formal logic) should be viewed as like assembly language. Hardly anyone writes code in assembly language, but not because it has been replaced; rather because more expressive, concise, and intelligible languages have been built on top of it. The description of FMathL seems to me like trying to build a CPU that natively runs on Fortran, rather than writing a Fortran compiler. Logicians have put a lot of effort into developing a theory that works very smoothly at a low level and is very flexible; why not build on it instead of pulling it out and replacing it wholesale?

By the way, one of the things I like about Isabelle is that it is designed with a very weak metalogic, on top of which arbitrary other logics can be implemented. People usually use either HOL or ZFC as the “object logic” but in theory one could use any logic that can be expressed via natural deduction rules. I think there are good reasons to build in this sort of modularity at a low level, just as there are good reasons for hiding it from the end user.

the feature of ZFC by which a function is a particular type of set really seems essential to the development of the theory.

In the usual expositions, yes. But not intrinsically.

You’re right.

(Regarding “sets being mappings,” it sounds like your real objection is the absence of “sets of sets,” which I’ve addressed above.)

Posted by: Mike Shulman on September 14, 2009 5:18 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

AN: It [ETCS] is a part of CCAF

MS: I disagree. One may regard it as a part of a general universe of “categorial foundations of mathematics,” but I think CCAF has a very specific meaning which is distinct from ETCS.

Distinct, yes, but including it; I only claimed that. I referenced C. McLarty, Introduction to Categorical Foundations for Mathematics, version of August 14, 2008. On p.55, he states as part of the CCAF axioms: There is a category Set whose objects and arrows satisfy the ETCS axioms.

Without such an axiom, or some replacement (as in Makkai’s version also referenced) there are no sets, and hence no complete categorial foundations for mathematics.

AN: Mathematics is type-free

MS: When I look at mathematics, I see types everywhere.

When I look at mathematics, I see structures everywhere, not types. Trying to see types, I see that typing violations abound, being justified as abuse of notation.

MS: What is the objection to $1\in\sqrt{2}$ if not a type mismatch?

The objection is that different constructions of the reals (considered equivalent by mathematicians) answer the question differently.

MS: I observe that $1\in\sqrt{2}$ is in fact a legal statement in FMathL.

Yes, but it is undecidable. It should be, since people who implement mathematics in ZFC, say, may get different (subjective, implementation-dependent) answers. Thus the formal status of this statement should be the same as that of the continuum hypothesis, say.

MS: Mathematical structures (groups, rings, topological spaces, manifolds, vector spaces) are built out of sets and functions/relations between these sets. From the point of view of the general theory of any such structure, the elements of such sets have no internal structure; they are featureless. In particular, the general theory of a type of structure is invariant under isomorphism.

Earlier in my life I had been working and publishing on finite symmetry groups ($E_8$, the Leech lattice, etc.). Everyone in this field was obviously regarding any permutation group as a group. Alt(5) acting on 5 points and PSL(2,5) acting on 6 points were nonisomorphic permutation groups, but isomorphic as groups. What was meant by “the same” was context-dependent, taking into account more or less structure, as needed.

FMathL respects this context-dependence in a natural way; the meaning of an asserted equality depends on the context. This feature allows FMathL to remain much closer to actual practice than previous foundations.

In category theory (and in ZFC), permutation groups and groups are completely different objects, related by functors (as you describe). And much more of that, which must be handled in the traditional foundations by a pervasive misuse of notation and language. In my view, what is regarded by the purists as “misuse” is the true usage: mathematicians generally think in this falsely called misused language rather than in the clumsy purist way. The functors only appear when one forces standard mathematics into the straightjacket of category theory.

In categorial language, the harmless statement $\sqrt{2}\in\mathbb{R}$ (where $\mathbb{R}$ denotes the ordered field of real numbers) would as much be a type mismatch as the dubious statement $1\in\sqrt{2}$. Writing out all the functors needed to match types would make ordinary mathematical language as clumsy to use as proof assistants based on type theory.

MS: I think type theory (along with many other kinds of formal logic) should be viewed as like assembly language. Hardly anyone writes code in assembly language, but not because it has been replaced; rather because more expressive, concise, and intelligible languages have been built on top of it.

I fully agree. This is why FMathL distinguishes between subjective levels (where particular implementations sit, like different assembler programs for the same functionality, written perhaps even in different assembler languages), and the object level, which gives the essence.

MS: The description of FMathL seems to me like trying to build a CPU that natively runs on Fortran, rather than writing a Fortran compiler.

No. The current description of FMathL is that of a language, not of a CPU. (The CPUs correspond to the subject levels in the FMathL paper.) In your picture, FMathL tries to be a formal, easy-to-use high-level programming language, while traditional foundations are low-level assembler languages.

MS: Logicians have put a lot of effort into developing a theory that works very smoothly at a low level and is very flexible; why not build on it instead of pulling it out and replacing it wholesale?

The full system MathResS (which uses FMathL as its high level semantics) will build upon it. There will be compilers from the high level to various low levels, i.e., interfaces between FMathL and Coq, Isabelle/Isar, or HOL Light, say. (This doesn’t show in the paper on the FMathL mathematical framework, but is amply reflected in the vision document.)

But past work on proof assistants etc. produced only the assembler level, not a high level comparable to Fortran that would make writing formal mathematics easy. The high level currently only exists as informal mathematical language.

Imagine there were no Fortran or C++, and all programs would have to be specified in ordinary language plus formulas, to be translated by specialists directly into assembler. Computer science would be as little accepted by the world at large as proof assistants are by mathematicians.

The purpose of FMathL is to create a Mat-tran for translating high level mathematics that is as easy to use as modern For-tran for translating high level algorithms.

Posted by: Arnold Neumaier on September 15, 2009 11:38 AM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

Arnold Neumaier wrote about $1 \in \sqrt{2}$:

Thus the formal status of this statement should be the same as that of the the continuum hypothesis, say.

I disagree with that. The generalised continuum hypothesis $GCH$ is a meaningful statement, at least as long as one is doing mathematics with enough power to recursively define families of sets indexed by the ordinals using the power set operation. Then you can prove useful things, such as $V = L \;\Rightarrow\; GCH$ and $GCH \;\Rightarrow\; AC$.

To make $GCH$ come out true or false, you need to make certain assumptions beyond the standard ones, and these should be regarded as a matter of convention; similarly, to make $1 \in \sqrt{2}$ come out true or false, you need to adopt some convention beyond the standard ones. But you need to adopt such a convention even to make $1 \in \sqrt{2}$ meaningful, which is not necessary for the continuum hypothesis, or to prove the theorems in the last paragraph.

If you only want to talk about true statements (and thus also false ones, through their negations), then you can treat these similarly; you will be unable to prove or refute either in the standard context, but you will be able to prove or refute it in some other contexts. But I would like a system to complain of a type mismatch the moment that I write down $1 \in \sqrt{2}$, until I've adopted conventions that make it meaningful; I don't want to have the statement accepted, just considered neither proved nor refuted yet.

All that said, I appreciate your point that mathematicians use a high-level language in which a host of standard type conversions are suppressed, and it will be very useful to have a formal system that already knows about all of this. Still, I would like a distinction between ‘meaningless’ and merely ‘undecidable’ statements in any given context.

Posted by: Toby Bartels on September 16, 2009 2:10 AM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

TB: you need to adopt such a convention even to make $1\in\sqrt{2}$ meaningful, which is not necessary for the continuum hypothesis,

The statement is equally meaningful in ZF with the standard add-ons for numbers as is the continuum hypothesis.

TB: I would like a system to complain of a type mismatch the moment that I write down $1\in\sqrt{2}$ until I’ve adopted conventions that make it meaningful; I don’t want to have the statement accepted, just considered neither proved nor refuted yet.

You can make the system complain by adding to your standard context the requirement that $x\in y$ is nominal for numbers $x$ and $y$. Similarly, the user can enforce any desired typing rules by specifying them.

But to have a system in which one can use ZF naturally, one cannot make the typed behavior the default. The default must be as much agnosticism as possible, while there must be simple ways for users to make the system more restrictive, to have it conform to the amount of typedness they want.

Posted by: Arnold Neumaier on September 16, 2009 8:18 AM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

The statement is equally meaningful in ZF with the standard add-ons for numbers as is the continuum hypothesis.

Yes, but in other systems it is meaningless, whereas the continuum hypothesis is meaningful in any system in which you have the syntax to write it down.

The default must be as much agnosticism as possible

I agree whole-heartedly, but I do not agree that this is what you are doing.

Agnosticism, to me, includes considering syntax meaningless unless it has been given some meaning. Phrases like ‘$\pi(=^2($’ and ‘$1 \in \sqrt{2}$’ are meaningless to me until somebody explains to me what they mean.

In order of increasing knowledge: meaningless, undecidable, true. Or rather, it should be: meaningless so far, undecided so far, known to be true. The ‘so far’s are because additional assumptions/conventions can push things farther along in the list; the change from ‘undecidable’ to ‘undecided’ is because I know that your system, however great it might be (^_^), can't calculate whether something is undecidable in an arbitrary context.

You also, by default, include other assumptions that should be optional. All this, I suppose, to make ‘a system in which one can use ZF naturally’ out of the box. But if I think that $\mathbf{ZF}$ is a kludge, then that's not a feature for me. By all means, include a $\mathbf{ZF}$ option as a standard module, but making it the default is not ‘as much agnosticism as possible’.

there must be simple ways for users to make the system more restrictive

That is, more restrictive in typing, or less restrictive in assumptions made and conventions adopted. You do this through reflection, right? I'm inclined so far to prefer Coq (or Isabelle), which also allows reflection but makes fewer assumptions up front. On the other hand, if you make this more user-friendly, then that would be a Good Thing.

Posted by: Toby Bartels on September 16, 2009 7:07 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

Arnold Neumaier wrote

Earlier in my life I had been working and publishing on finite symmetry groups ($E_8$, the Leech lattice, etc.). Everyone in this field was obviously regarding any permutation group as a group. Alt(5) acting on 5 points and PSL(2,5) acting on 6 points were nonisomorphic permutation groups, but isomorphic as groups. What was meant by “the same” was context-dependent, taking into account more or less structure, as needed.

FMathL respects this context-dependence in a natural way; the meaning of an asserted equality depends on the context. This feature allows FMathL to remain much closer to actual practice than previous foundations.

In category theory (and in ZFC), permutation groups and groups are completely different objects, related by functors (as you describe). And much more of that, which must be handled in the traditional foundations by a pervasive misuse of notation and language. In my view, what is regarded by the purists as “misuse” is the true usage: mathematicians generally think in this falsely called misused language rather than in the clumsy purist way. The functors only appear when one forces standard mathematics into the straightjacket of category theory.

In categorial language, the harmless statement $\sqrt{2} \in \mathbb{R}$ (where $\mathbb{R}$ denotes the ordered field of real numbers) would as much be a type mismatch as the dubious statement $1 \in \sqrt{2}$. Writing out all the functors needed to match types would make ordinary mathematical language as clumsy to use as proof assistants based on type theory.

It seems to me that a structural set theory such as ETCS also upholds the principle of context-dependence as fundamental.

Let’s take for example “$\sqrt{2}$”. In a structural set theory, it’s quite true that there is no object (taken in isolation) called $\sqrt{2}$. Instead, it is part and parcel of such an approach that we declare the ambient context in which $\sqrt{2}$ is embedded: $\sqrt{2}$ as real number, $\sqrt{2}$ as complex number, etc., by writing down for example

$\sqrt{2}: 1 \to \mathbb{R}$

Having declared the context, it is then meaningful in such a theory to ask, given a subset such as $\mathbb{Q} \subseteq \mathbb{R}$, a question like: is $\sqrt{2} \in \mathbb{Q}$? The question is equivalent to asking whether the point $\sqrt{2}: 1 \to \mathbb{R}$ factors (evidently uniquely) through the given inclusion $\mathbb{Q} \hookrightarrow \mathbb{R}$. Or, whether the pair

$(\sqrt{2}, [\mathbb{Q}]): 1 \to \mathbb{R} \times P(\mathbb{R})$

factors through the local membership relation

$\in_{\mathbb{R}} \hookrightarrow \mathbb{R} \times P(\mathbb{R})$

Thus, once such contexts have been declared, the proposition $\sqrt{2} \in_{\mathbb{R}} \mathbb{Q}$ becomes completely meaningful in a structural set theory like ETCS, and reflective of how mathematicians posit questions in actual practice.

Thus, the symbol $\in$ is not formalized in a structural set theory as a global relation on objects; it too is contextualized (or localized) by referring to a specified domain, as in the case $\in_{\mathbb{R}}$.

Continuing this train of thought: where a mathematician might traditionally write

$\forall_{x \in \mathbb{R}}\, \exists_{y \in \mathbb{R}}\; x + y \in \mathbb{Q}$

a “purist” working in a formal setting like ETCS would disambiguate the different senses of the symbol $\in$, where the instances appearing below the quantifiers are interpreted as declaring the types or contexts of the variables, and the one in the predicate being quantified refers to a local membership relation (such as $\in_{\mathbb{R}}$) pertaining to that type. But I hardly feel like this detail is cumbersome: writing the expression

$\forall_{x: \mathbb{R}}\, \exists_{y: \mathbb{R}}\; x + y \in_{\mathbb{R}} \mathbb{Q}$

is certainly no worse than how it would appear in a fully formal expression in ZFC, and IMO comes closer to expressing how people think (which we seem to agree is heavily context-dependent).

Regarding phrases such as “straitjacket of category theory” and the final sentence of the material quoted above: these strike me as bald assertions. What is the evidence behind them?

Posted by: Todd Trimble on September 16, 2009 1:51 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

$\forall_{x\in\mathbb{R}}\, \exists_{y\in\mathbb{R}}\; x+y\in\mathbb{Q}$

vs

$\forall_{x:\mathbb{R}}\, \exists_{y:\mathbb{R}}\; x+_{\mathbb{R}}y\in_{\mathbb{R}}\mathbb{Q}$

You forgot the subscript on ‘$+$’. (^_^)

But there is need to abolish the first form in favour of the second. If one adopts structural foundations, then it is possible to algorithmically decode the first as meaning the second, as long as it appears in a context where $\mathbb{R}$ has been declared as a set equipped with an operation $+$, $\mathbb{Q}$ has been declared as a subset of $\mathbb{R}$ (which includes declaring it as a set and equipping it with an injection to $\mathbb{R}$, as you would be likely to do if you had earlier defined $\mathbb{R}$ in terms of $\mathbb{Q}$), the quantifiers have their usual meanings, and $x$ and $y$ are free to be introduced as variables. I would argue that the default context in ordinary mathematics has these features, so the first expression is unambiguous.
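
To make the decoding step concrete, here is a minimal sketch in Haskell (all types and function names below are invented for illustration; this is not FMathL or any existing system): given a context that declares $\mathbb{R}$ as a set carrying an operation $+$ and $\mathbb{Q}$ as a subset of $\mathbb{R}$, the bare symbols $+$ and $\in$ in $x + y \in \mathbb{Q}$ can be resolved to their annotated versions $+_{\mathbb{R}}$ and $\in_{\mathbb{R}}$.

-- Toy elaborator (illustrative only): decode untyped surface syntax into
-- "purist" syntax, using the declarations available in the context.
import qualified Data.Map as M

data Decl = IsSet                -- "R is a set"
          | HasOp String         -- "R carries a binary operation, e.g. +"
          | SubsetOf String      -- "Q is a subset of R (with its injection)"
          deriving (Show, Eq)

type Context = M.Map String [Decl]

-- Untyped surface syntax, e.g.  x + y \in Q  with variables ranging over R.
data Expr = Var String | Op String Expr Expr | Mem Expr String deriving Show

-- Typed syntax: every symbol is annotated with the carrier it lives on.
data Typed = TVar String String             -- variable, carrier
           | TOp String String Typed Typed  -- operation, carrier, arguments
           | TMem String Typed String       -- membership relativised to a carrier
           deriving Show

-- Decode an expression whose free variables range over the carrier c.
decode :: Context -> String -> Expr -> Maybe Typed
decode _   c (Var x)    = Just (TVar x c)
decode ctx c (Op f a b) = do
  ds <- M.lookup c ctx
  if HasOp f `elem` ds          -- the operation must be declared on c
    then TOp f c <$> decode ctx c a <*> decode ctx c b
    else Nothing
decode ctx c (Mem e s)  = do
  ds <- M.lookup s ctx
  if SubsetOf c `elem` ds       -- s must be declared as a subset of c
    then TMem c <$> decode ctx c e <*> pure s
    else Nothing

main :: IO ()
main = print (decode ctx "R" (Mem (Op "+" (Var "x") (Var "y")) "Q"))
  where
    ctx = M.fromList [ ("R", [IsSet, HasOp "+"]), ("Q", [IsSet, SubsetOf "R"]) ]
    -- prints: Just (TMem "R" (TOp "+" "R" (TVar "x" "R") (TVar "y" "R")) "Q")

Of course a real system would also have to resolve the quantifier bounds, nested carriers, and genuine ambiguity; the only point here is that the declared context is what drives the translation.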

Posted by: Toby Bartels on September 16, 2009 7:18 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

And I’d agree with your so arguing, particularly with the fact that translation from the traditional notation to the “purist” one is a routine algorithm. But not everyone has thought about these nuances, so it’s perhaps just as well to spell them out on occasion.

You’ve actually further strengthened the argument for the viability of structural foundations as not being at all cumbersome.

Posted by: Todd Trimble on September 16, 2009 7:41 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

But there is need to abolish the first form in favour of the second.

Whoops! That should be ‘no need to abolish’.

Hopefully that's clear from what I wrote afterwards, but I still should watch out for missing negations.

Posted by: Toby Bartels on September 16, 2009 7:55 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

Actually, this disambiguation algorithm is implemented in modern proof assistants.
E.g. Coq would use type classes for this.
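
For readers who haven’t seen this, here is a rough sketch of the idea in Haskell rather than Coq (the class and instances below are invented for illustration; Coq’s type classes play an analogous role): a single membership symbol is overloaded so that it resolves either against a plain set or against a structure with an underlying set.

{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances #-}
-- Illustrative only: an overloaded membership test, resolved by type class.
import qualified Data.Set as Set

class ElemOf e c where
  elemOf :: e -> c -> Bool

-- Membership in a plain (finite) set.
instance Ord e => ElemOf e (Set.Set e) where
  elemOf = Set.member

-- A structure with a single underlying set: membership defaults to that set.
data Magma e = Magma { carrier :: Set.Set e, op :: e -> e -> e }

instance Ord e => ElemOf e (Magma e) where
  elemOf x m = Set.member x (carrier m)

main :: IO ()
main = do
  let s = Set.fromList [1, 2, 3 :: Int]
      m = Magma s (+)
  print (elemOf (2 :: Int) s)   -- membership in a set
  print (elemOf (2 :: Int) m)   -- same notation, membership in a structure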

Posted by: Bas Spitters on September 25, 2009 8:02 AM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

AN: The functors only appear when one forces standard mathematics into the straitjacket of category theory. […] Writing out all the functors needed to match types would make ordinary mathematical language as clumsy to use as proof assistants based on type theory.

TT: Regarding phrases such as “straitjacket of category theory” and the final sentence of the material quoted above: these strike me as bald assertions. What is the evidence behind them?

These assertions were to be taken with a grain of salt. The statement on proof assistants was referring to the overhead incurred when one writes a piece of math in Coq, say, vs. in LaTeX - roughly a factor of 10 in time (this factor is Freek Wiedijk’s estimate, not my exaggeration). Adding all the categorial annotations needed to be able to rigorously say things one likes to say, and making sure everything is correctly in place, is not quite as expensive but still a significant nuisance, a straitjacket that encumbers writing and reading math.

TT: translation from the traditional notation to the “purist” one is a routine algorithm

Routine, but tedious, not much different from converting a lemma and its proof from LaTeX to Coq. It makes the difference between being widely accepted and being used only by aficionados.

I’ll answer the remainder of your post once I know how to reproduce the formulas without recomposing them myself…

Posted by: Arnold Neumaier on September 16, 2009 8:56 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

Routine, but tedious, not much different from converting a lemma and its proof from LaTeX to Coq. It makes the difference between being widely accepted and being used only by aficionados.

I am fully in agreement with this. However, what I’m trying to say is that I think the best solution is not to discard typing information altogether, but rather to automate the process of inferring and adding type annotations. We should be able to omit this type information when writing mathematics, and a computer should be able to infer it just as a human reader infers it, but the information is nevertheless there and should be modeled by the formal language.

Automating this conversion/inference sounds like much the same thing that you’ve said, regarding FMathL as a “higher level language” which can be “compiled” to Coq, Isabelle/Isar, or HOL. But I don’t see how an untyped language can be compiled to a typed one in a meaningful way. When a FMathL user types $1\in \sqrt{2}$, what does that get compiled to in a “lower level” typed language?

Posted by: Mike Shulman on September 17, 2009 2:05 AM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

MS: Automating this conversion/inference sounds like much the same thing that you’ve said, regarding FMathL as a “higher level language” which can be “compiled” to Coq, Isabelle/Isar, or HOL. But I don’t see how an untyped language can be compiled to a typed one in a meaningful way.

In the same way that one can pose to HOL any query formulated in the untyped first-order language of ZF. It is well known that systems like HOLlight can prove much of ZF theory; so where is the problem?

MS: When a FMathL user types $1\in\sqrt{2}$, what does that get compiled to in a “lower level” typed language?

It gets compiled into a precise specification format that adds all relevant contextual information - in this case, that the user seems to have intended an interpretation of numbers as sets.

Most real math involves such guessing of intentions, which is maintained until strange conclusions are automatically derived. In interactive mode, the formula would be found suspicious, and a query window would pop up to ask the user to confirm the interpretation, to correct the formula, or to supply additional context.

From the specification format, a dedicated spec2HOL translator would map the part of the problem to be checked for correctness into an appropriate query in HOL.

Posted by: Arnold Neumaier on September 17, 2009 3:11 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

In interactive mode, the formula would be found suspicious, and a query window would pop up to ask the user to confirm the interpretation, to correct the formula, or to supply additional context.

Oh, that makes me happy! (^_^)

And I take it that if, after developing a theory of ordinals, I were to write down the generalised continuum hypothesis (that the $\aleph$ series equals the $\beth$ series), then no such window would pop up?

Posted by: Toby Bartels on September 17, 2009 10:52 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

TB: if, after developing a theory of ordinals, I were to write down the generalised continuum hypothesis (…), then no such window would pop up?

Then FMathL would just take notice that you added an assumption to your context. It would make a few checks to see whether one can rewrite the formula in a useful way (which cannot be done with $1\in\sqrt{2}$, since there are no rules for manipulating $\in$ between numbers) and conclude that things make enough sense not to bother the user with a query.

FMathL might become suspicious, though, when you subsequently add that the simple continuum hypothesis is violated, and would ask whether you intended to close the previous context with the GCH, since otherwise the context becomes trivially inconsistent.

In noninteractive mode, it would write all the queries into a logbook that can be inspected after compilation, and proceed on a tentative basis. (As mathematicians do when they are not sure about the meaning of a cryptic passage.)

Posted by: Arnold Neumaier on September 18, 2009 12:06 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

FMathL might become suspicious, though, when you subsequently add that the simple continuum hypothesis is violated, and would ask whether you intended to close the previous context with the GCH, since otherwise the context becomes trivially inconsistent.

That's good.

(Of course, we can have no guarantee that FMathL will notice that I'm in an inconsistent context, especially if I write CH as a statement about subsets of $\mathbb{R}$ instead of as a statement about $\aleph$s and $\beth$s. But the more contradictions that it can find in a reasonable amount of time, the better.)

Posted by: Toby Bartels on September 18, 2009 9:55 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

TB: Of course, we can have no guarantee that FMathL will notice that I’m in an inconsistent context, especially if I write CH as a statement about subsets of ℝ

Yes. In particular, we might all be working in an inconsistent context without knowing it, like Cantor was for 25 years.

If FMathL discovers this and raises a query, I’d rather think of a bug in FMathL than of one in standard math…

Posted by: Arnold Neumaier on September 18, 2009 10:58 PM | Permalink | Reply to this

Cantor vs Frege

In particular, we might all be working in an inconsistent context without knowing it, like Cantor was for 25 years.

You mean Frege; Cantor knew what he was doing, even though he didn't formalise it.

Posted by: Toby Bartels on September 18, 2009 11:07 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

Do you mean that you interpret an untyped theory like ZF in a typed theory by having just one type? To me that seems unfaithful to how typed systems are intended to be used.

Posted by: Mike Shulman on September 18, 2009 9:22 AM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

MS: Do you mean that you interpret an untyped theory like ZF in a typed theory by having just one type? To me that seems unfaithful to how typed systems are intended to be used.

Do you see a better way? How would you pose a ZF problem to HOLlight, say?

I don’t think that there is any other option. In the CADE ATP System Competition - the world championship for first-order automated theorem proving - people pose each year lots of untyped problems that they want to see solved (or that were already solved) by some theorem prover. Do you think HOLlight should not be allowed to solve these?

On the other hand, if you have a naturally typed problem, the types would form in FMathL a particular family of constructive sets, and a translator to HOLlight would be able to recognize this and then make better use of the typing in the proof assistant.

Posted by: Arnold Neumaier on September 18, 2009 10:09 AM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

McLarty in Learning from Questions on Categorical Foundations says (on p. 51):

“If the major theorems of category theory are proved in set theory, and then I want to axiomatize them, is that not a kind of dependence on set theory? Well in the first place these theorems are not exactly proved in set theory. Their usual naïve versions are incorrect in set theory. They quantify over collections too large to be ZF sets, and manipulate them too freely for Gödel-Bernays classes, and treat them too uniformly for Grothendieck universes. There are many well-known and sufficiently workable set-theoretic fixes for handling these theorems but they are all just that—fixes.”

This suggests to me that there is no known natural way to code category theory in set theory. Is this a reasonable point of view? If it is, what’s a solution to this problem?

Posted by: Eugene Lerman on September 16, 2009 10:32 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

I don't think that I agree with McLarty there, and I'd like to know what in category theory he thinks can't be formulated in $\mathbf{ZFC} + GU$, where $GU$ is the axiom that every set belongs to some Grothendieck universe, about as easily as anything in ordinary mathematics is axiomatised in $\mathbf{ZFC}$. Categorists' language even tells you when the Grothendieck universes are coming in: whenever they say ‘small’ (or something that may be defined in terms of smallness, like ‘accessible’ or ‘complete’).

I certainly agree with McLarty's second point (not quoted above), so it doesn't matter for his overall argument. I also believe that formulating mathematics in $\mathbf{ZFC}$ is generally to perpetrate an act of violence upon it; I just don't see how category theory is particularly special in that regard.

Posted by: Toby Bartels on September 17, 2009 12:59 AM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

Well, I wrote a whole paper about ways to formalize category theory in (mostly material) set theory. (By which I mean, of course, ways to deal with the size distinctions in category theory; the theory of small categories never poses any problem.) Of course, “natural” is in the eye of the beholder, but I don’t think there’s a whole lot to object to, at least in the more effective versions. Any of them can be “structuralized” to eliminate any objections on that score.

I think it’s a bit misleading to say that the theorems of category theory “are not exactly proved in set theory.” One has to choose a set-theoretic formalization, but there are a number which suffice perfectly well. And when it comes to it, hardly any mathematics is actually “proved in set theory”–it’s proved in informal mathematical language which we all trust could be translated into set theory (if we’re the sort of people who believe that all mathematics should be founded on set theory). I don’t think category theory is much different there; the only thing is that it matters a bit what set-theoretic foundation you choose, but that isn’t unique to category theory either.

Of course, set-theoretic foundations are perhaps not philosophically in line with category theory, which may be more along the lines of what McLarty is getting at. For instance, there is the problem of evil: as long as your categories have sets of objects, you’ll be able to talk about equality of those objects. But I don’t see any way to solve that unless your foundational axioms are really about the 2-category of categories; even the 1-category of categories isn’t good enough.

Posted by: Mike Shulman on September 17, 2009 1:58 AM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

Todd and Toby have said most of what I would say as well. I agree that a lot of what is written in everyday mathematics does not naively typecheck, and also that such “abuses” of notation are not “wrong” but are the real usage. However, I don’t think this is a reason to discard types, which are, I still believe, everywhere in mathematics and carry important information. Rather, we should improve the type system.

Consider your example regarding $x\in \mathbb{R}$ not type-checking because $\mathbb{R}$ is not a set but a field. I would argue that a better way to describe this is that the symbol $\in$ is overloaded, in the precise sense of computer science. Thus, what appears on the RHS of $\in$ does not always have to be a set, but can be anything for which an appropriate semantics is defined. One could additionally supply a default semantics: when the RHS is a structure of some sort having only one underlying set, then the meaning of $\in$ should default to referring to that underlying set. This is completely precise and maintains the important type information. Moreover, it is already possible in Isabelle:

record 'a magma =
  elements :: "'a set"
  times :: "'a => 'a => 'a" (infixl "\star\<index>" 70)

definition in_magma :: "'a => 'a magma => bool"
  where "in_magma x M = (x \in elements M)"

notation in_magma ("_ \in _")

axioms closed: "[| x \in M ; y \in M |] ==> (x \star\<^bsub>M\<^esub> y) \in M"

(Of course, this isn’t the way one would actually work in Isabelle; I’m just trying to illustrate overloading of $\in$ with code that typechecks.)

Posted by: Mike Shulman on September 16, 2009 8:56 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

Hmm, I guess this is actually more or less the same thing that Toby said above.

Posted by: Mike Shulman on September 16, 2009 9:42 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

MS: Consider your example regarding $x\in \mathbb{R}$ not type-checking because $\mathbb{R}$ is not a set but a field. I would argue that a better way to describe this is that the symbol $\in$ is overloaded, in the precise sense of computer science.

Yes, this is more reasonable. But this doesn’t solve all type problems. For example, Todd Trimble’s suggestion that “2 as a real number” and “2 as a complex number” are different objects with different types is really worrying. This gives a multitude of distinct objects that most mathematicians would consider to be identical, e.g., “2 as an element of $(\mathbb{N},suc)$”, “2 as an element of $(\mathbb{N},+)$”, “2 as an element of $(\mathbb{N},+,*)$”, “2 as an element of $(\mathbb{N},<)$”, “2 as an element of $(\mathbb{N},+,<)$”, “2 as an element of $(\mathbb{Z},+)$”, “2 as an element of $(\mathbb{Q},+,*)$”, to name only a few.

One gets a lot of accidental relations when formalizing numbers in ZF, which are avoided in category theory, but instead one gets a lot of accidental duplication. This shows that category theory is, like ZF, a useful mode of viewing mathematics, but that it does not capture its full essence.

Compare this with the way categories are defined in FMathL (Section 2.13). This is not quite the same as the traditional definition, but it is essentially the same. In FMathL, objects and morphisms can belong very naturally to several categories.

Posted by: Arnold Neumaier on September 17, 2009 2:29 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

Permit me then to try to assuage your worries! :-)

In ETCS (to give an example of a categories-based set theory, but there are other ways of doing structural set theory, including one that Mike Shulman is working out in the nLab, called SEAR), a [global] element is conceived as a map of the form $1 \to X$; the $X$ can be thought of as the ‘type’. The context $X$ is given with the element as its codomain; it is a datum of that element. Thus “elements” are not free-floating or given in vacuo; they come already attached to sets.

In just the same way, “objects” are not free-floating either – they come attached to categories in which they are conceived as being objects of. Thus we have $\mathbb{N}$ qua set, $\mathbb{N}$ qua structure with 0 and successor, qua ordered abelian group, qua ring, etc. – these are all objects in different categories. But each of those structures you mentioned allows one to define an element $2: 1 \to \mathbb{N}$ in the underlying set, and insofar as all those structures on $\mathbb{N}$ you mentioned are standardly defined by exploiting the “Peano postulates” [in which the set $\mathbb{N}$ comes equipped with a 0 and a successor, satisfying a principle of primitive recursion or universal property as natural numbers object that makes the recursive definitions of each of these structures possible], it’s provably the same element 2 (that is, the same morphism $1 \to \mathbb{N}$ in $Set$) we’re referring to in each of those cases.
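
To spell out what “the same element 2” means here: once the structure supplies arrows $0: 1 \to \mathbb{N}$ and $s: \mathbb{N} \to \mathbb{N}$, the element in question is simply the composite

$$2 \;:=\; s \circ s \circ 0 \;:\; 1 \to \mathbb{N},$$

and whichever richer structure we start from (monoid, rig, ordered rig, …), unwinding its standard recursive definition from the Peano data produces this same composite; e.g., $2 = 1 + 1$ with $1 = s \circ 0$ and $+$ defined by recursion from $0$ and $s$.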

Even if we have several natural numbers objects, there is an invariant meaning of ‘2’ in each of these, insofar as the unique isomorphism from one natural numbers object to another (as natural numbers objects) takes the element ‘2’ in one to the element ‘2’ in the other. I’m puzzled why any of this should be considered a problem.

Posted by: Todd Trimble on September 17, 2009 9:36 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

I meant to say $\mathbb{N}$ as “ordered commutative monoid”, not “ordered abelian group”. :-P

Posted by: Todd Trimble on September 17, 2009 9:42 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

I’m puzzled why any of this should be considered a problem.

Well, it's very different from the way that most mathematicians think about these things.

(And to handle $2 \in \mathbb{Z}$, $2 \in \mathbb{C}$, etc., you also have to talk about some injections between the sets before you can prove that various guises of $2$ are the same underneath.)

So while I agree with you that mathematics is like this underneath, I also agree with Arnold that it would be nice to have a user-friendly system where it doesn't explicitly look like that. I would like a system that does have all of that underneath it, at some level, but hides it from the user, at least until they ask for more details.

Posted by: Toby Bartels on September 17, 2009 10:57 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

And to handle $2 \in \mathbb{Z}$, $2 \in \mathbb{C}$, etc., you also have to talk about some injections between the sets before you can prove that various guises of $2$ are the same underneath.

That goes without saying. In a categories-based set theory, such injections often arise as universal arrows (e.g., $\mathbb{N}$ is the initial rig, affording a unique comparison map $\mathbb{N} \to \mathbb{Q}$ in $Rig$). As you know, of course.

Could you say a little more what you mean by “it’s very different from the way most mathematicians think about these things”? It seems to me that, now that categorical ideas have seeped into the general consciousness, many mathematicians do in effect “think structurally” – simply put, that context matters – for example that 2 in $\mathbb{Q}$ is different from 2 in $\mathbb{Z}$ since the first 2 is invertible and the second isn’t, and that 2 in $\mathbb{R}$ is different because it has a square root.

Posted by: Todd Trimble on September 18, 2009 12:24 AM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

Could you say a little more what you mean by “it's very different from the way most mathematicians think about these things”? It seems to me that, now that categorical ideas have seeped into the general consciousness, many mathematicians do in effect “think structurally” – simply put, that context matters – for example that 2 in $\mathbb{Q}$ is different from 2 in $\mathbb{Z}$ since the first 2 is invertible and the second isn’t, and that 2 in $\mathbb{R}$ is different because it has a square root.

I don't have any statistical evidence to back it up, but my feeling is that this is still a minority. It's one thing to say that $2$ behaves differently in $\mathbb{Q}$ from how it behaves in $\mathbb{Z}$, but another thing to say that $2$ in $\mathbb{Q}$ is a different object from $2$ in $\mathbb{Z}$. Like many philosophical differences, there is no practical distinction here, but I think that a lot of people would have difficulty even understanding what you were saying up above —especially the very young (undergraduates who have had little or no experience yet with abstract concepts like groups, metric spaces, and other kinds of structured sets) and the very old (and set in their ways; I can think of some examples from my days at UCR who I'm sure would not have understood what you were saying, although I'd rather not name them here). I think that most of the others would understand it but think it weird; it probably seems unnecessarily complicated.

Again, no scientific evidence; that's my feeling from talking with non-categorially-inclined mathematicians.

Posted by: Toby Bartels on September 18, 2009 9:26 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

TT: it’s provably the same element 2 (that is, the same morphism $1\to\mathbb{N}$ in Set) we’re referring to in each of those cases. […] I’m puzzled why any of this should be considered a problem.

I don’t understand; I am truly puzzled. How can you even speak of (let alone prove) the same morphism $1\to\mathbb{N}$ in Set when the $\mathbb{N}$’s are different (being a set, a set with successor, etc., which are all different things)?

Posted by: Arnold Neumaier on September 18, 2009 12:22 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

Easy: these objects ($\mathbb{N}$ as set, $\mathbb{N}$ as ordered commutative monoid, etc., etc.) are treated as belonging to different categories: $Set$, $OrdCMon$, etc. When one applies the appropriate “forgetful functor” [from each of these categories to $Set$, the functor that forgets or strips off structure] to each of these structured objects, in each case the output is the same: the underlying set $\mathbb{N}$. (Strictly speaking, I have just said something “evil”, but I am not going to worry about this on first pass.)

Are you worried about the fact that the naturals as ordered rig “is” a sextuple $(\mathbb{N}, 0, 1, +, \cdot, <)$, and hence a complicated type of “set”, a set different from $(\mathbb{N}, 0, succ)$ say? Yes, you can shoehorn any mathematical structure to make it a “set” (as one might do if ZF were one’s religion), but that’s not how mathematicians would normally think of it – they just bear in mind whatever structure (e.g., an ordered rig is a set equipped with certain operations and a binary relation) is relevant to the discussion at hand. And, in ever-increasing numbers, they think of the collection of structures of given species or signature as belonging to a category in its own right, with different types of structure belonging to different categories. So the set $\mathbb{N}$ can come equipped with different types of structure, giving rise to objects in different categories, but one can always forget the structure and come back to the same underlying set $\mathbb{N}$.

Posted by: Todd Trimble on September 18, 2009 1:46 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

TT: Are you worried about the fact that the naturals as ordered rig “is” a sextuple, and hence a complicated type of “set”, a set different from $(\mathbb{N},0,succ)$ say?

For me, as I believe for most mathematicians, the naturals have the maximal structure, and therefore are simultaneously a set, a Peano structure, a free monoid, a semiring, etc. This view is simple, harmless, consistent with actual practice, and minimizes the amount of trivial add-on needed to formalize it.

But to understand your position, I tried to walk in your shoes and treat them as different objects. Let’s call them $N$ and $N'$ for simplicity. (iTeX has no macros…) I do not mind that $N$ and $N'$ are not sets; the problem lies somewhere else.

$1\to N$ and $1\to N'$ are both sets, but $1\to N$ and $1\to N'$ consist of arrows between different categories. Since an arrow, as I suppose is the purist view (maybe I am wrong here?), knows its origin and destination as part of its identity, it is impossible that the arrows $2\in(1\to N)$ and $2'\in(1\to N')$ are the same, as you said is provable. They can be identified only after applying an unspoken isomorphism.

But mathematicians generally differentiate between “same” and “isomorphic”; one really needs two words for these; only confusing them is truly evil (except perhaps when taking extreme care in how one formulates things).

This is what puzzles me. In your version of the foundation, there are too many copies of everything, and trivial functors that semi-identify them again. One gets a myriad of different objects for what is naturally the same object, and this propagates into everything constructed from these objects. One can pass the buck to a myriad of unspoken isomorphisms, but one cannot remove the myriad of things needed to make everything really precise.

But that is needed for a machine that efficiently handles all of undergraduate mathematics (say) in a single, consistent implementation.

Doesn’t this look like an ideal opportunity to apply Ockham’s razor?

Purists will still be able to discuss “2 as an element of $N$” and “2 as an element of $N'$”, but these would be regarded as objects different from the simple natural “2”.

That this is the mathematically natural way of regarding matters can already be seen by the way we refer to them in the present discussion: to be understandable we must spell out the full form of the object. This is “2” for the natural number 2, but “2” does not denote an instance of the concept of “a natural number viewed in the context of a particular category”.

Posted by: Arnold Neumaier on September 18, 2009 2:37 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

But to understand your position, I tried to walk in your shoes and treat them as different objects. Let’s call them $N$ and $N'$ for simplicity. (iTeX has no macros…) I do not mind that $N$ and $N'$ are not sets; the problem lies somewhere else.

$1 \to N$ and $1 \to N'$ are both sets, but $1 \to N$ and $1 \to N'$ consist of arrows between different categories.

There seems to be a lot of confusion just in this one sentence. For now I’ll try to carry on this discussion in good faith, but I may run out of steam soon – these are after all very elementary topics we’re spending time on, and frankly I’m beginning to lose patience.

No, those arrows are not sets.

“Consist of arrows between different categories” is not good mathematical English, but I think what you were trying to say is that those arrows are morphisms of different categories. But that’s not right either, and certainly not what I said. For one thing, there may not be any such arrows. The arrows (actually, the one arrow) I had been talking about, $2: 1 \to \mathbb{N}$, is in $Set$.

Let me give an example to help make this clearer. Let me call $C'$ the category of sets $X$ equipped with a point $x: 1 \to X$ and an endofunction $s: X \to X$. The set $\mathbb{N}$ equipped with its standard Peano (or Lawvere natural number object) structure is an example. We could call this object $\mathbb{N}'$, if you like. A one-element set $1$ carries a unique such structure, where the point is $id: 1 \to 1$ and the endofunction is $id: 1 \to 1$, so that’s another example. Does there exist a morphism of the form $1 \to \mathbb{N}'$, in the category $C'$? NO!

However, if $U': C' \to Set$ is the evident forgetful functor, so that $U'(\mathbb{N}') = \mathbb{N}$, the set $\mathbb{N}$, then there is of course an arrow $2: 1 \to \mathbb{N}$. That’s in the category $Set$. Not in $C'$. In $Set$.

Each of those standard structures you mentioned a while back ($(\mathbb{N}, 0, succ)$, $(\mathbb{N}, 0, 1, +)$, and so on), sets equipped with prescribed operations (that are themselves arrows in $Set$), gives us enough structure to pick out an element $2: 1 \to \mathbb{N}$, as an arrow in $Set$. In each case, it’s the same damned 2, provided that the structures we’re talking about are the standard ones. That’s what is provable (for Pete’s sake!). One can, and one often does, think of sets-with-structure as objects in a category in its own right, and I brought that up hoping it would help explain that yes, we can think of $\mathbb{N}$ as bearing many different types of structure (which need to be distinguished in order to have a coherent conversation), and thinking of these as objects in different categories may help us bear such distinctions in mind, but all those structures you mentioned, as sets equipped with operations of various sorts, can be used to define one and the same 2, as a morphism of the form $1 \to \mathbb{N}$ in $Set$. Is what I’m saying clear now?

For me, as I believe for most mathematicians, the naturals have the maximal structure, and therefore are simultaneously a set, a Peano structure, a free monoid, a semiring, etc. This view is simple, harmless, consistent with actual practice, and minimizes the amount of trivial add-on needed to formalize it.

I can’t really speak for most mathematicians, and I’m not sure you can either, but I’m not sure what the heck is meant by “the maximal structure”. The maximal structure on $\mathbb{N}$ consists, I guess, of all possible functions $\mathbb{N}^n \to \mathbb{N}$ of any arbitrary arity $n$ (finite or infinite), and all possible relations on $\mathbb{N}$ (again of arbitrary arity). Is that how most mathematicians think of $\mathbb{N}$? I don’t think so. I think what mathematicians do is think of $\mathbb{N}$ however the hell they wish to think of it, referring to as much or as little structure as will suit whatever purpose they have in mind. Of course, a good mathematician will tell us what he has in mind. For example, if he is investigating decidability issues, he has to tell us whether he means $\mathbb{N}$ as monoid or rig or whatever. (I guess if the context is unstated, the usual default is to think of $\mathbb{N}$ as ordered rig, but there are lots of other things people do with $\mathbb{N}$, e.g., they can think of it as dynamical system or as a group representation in various extremely creative ways. Thus “the maximal structure” doesn’t carry a lot of coherence for me.)

Ultimately, we are on the same side: we would all like flexible, easy-to-use foundations. I entered this conversation hoping to clarify the role of context dependence in structuralist points of view on set theory, but as of this writing it’s not clear to me whether you are honestly confused by what I’m saying but still trying to understand, or just trying to pick faults in what you take my position to be.

Posted by: Todd Trimble on September 18, 2009 5:32 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

TT: it’s not clear to me whether you are honestly confused by what I’m saying but still trying to understand, or just trying to pick faults in what you take my position to be.

I have no time to waste, and discuss here only to learn and to clarify.

I learnt category theory over 30 years ago as a student, mainly as a tool to see universal constructions from a unifying point of view.

I never needed to make any use of it in a long career in pure and applied mathematics, until this year, when my vision of the foundations of mathematics was developed enough to merit a comparison with other foundations that are around. So I looked at the categorial approach, its benefits and its weaknesses. But unlike for you, it is for me mainly a foreign language to which I was exposed as a youth but which I never spoke myself, apart from doing the exercises in the course where I learnt it.

The FMathL mathematical framework uses categories on the foundational level just a little bit since it provides some nice features. So I had asked John Baez about his opinion, and he moved the discussion to here. I’ll keep posting to the discussion as long as I expect to learn something from it.

Maybe this restores your patience.

TT: all those structures you mentioned, as sets equipped with operations of various sorts, can be used to define one and the same 2, as a morphism of the form 1→ℕ in Set. Is what I’m saying clear now?

Yes. So the $N$ here is always the set of natural numbers without structure, and $N'$ only enters as $1\to U'(N')$. My confusion came from the fact that before you had talked of “2 in ℚ is different from 2 in ℤ”, and I had thought that your new comment was a commentary on this, except with various forms of ℕ in place of ℚ and ℤ.

But you can see how difficult it is for someone with little practice in category theory to apply the right invisible functors at the right places.

It is definitely not something that belongs to the essence of mathematics, otherwise everyone would have to practice it before being allowed to forget it again.

And I have to find a way to teach the FMathL system to detect and overcome such misunderstandings (which can arise in reading any math in a field one is not fluent in).

TT: I’m not sure what the heck is meant by “the maximal structure”.

This was short for your 6-tuple plus succ, i.e., the union of all the stuff that is usually present in discussions about natural numbers. Depending on the context, a mathematician can add any conservative extension by defined operations to make this maximal structure rich enough. In the usual view one picks up from reading math papers, ℕ doesn’t suddenly stop having a multiplication simply because someone doesn’t need it at the moment.

TT: Ultimately, we are on the same side: we would all like flexible, easy-to-use foundations. I entered this conversation hoping to clarify the role of context dependence in structuralist points of view on set theory,

Yes, and I appreciate that. I learn from this discussion, though perhaps not at the speed you’d like to see. But this doesn’t prevent me from continuing to see all the trivial dead weight a pure categorial approach brings to fully formalized mathematics.

I want a formalization that is as free from artificialities as possible.

Posted by: Arnold Neumaier on September 18, 2009 6:32 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

Arnold:

I have no time to waste, and discuss here only to learn and to clarify.

Excellent. Neither do I, and that’s also why I come here: to learn and help explain when I’m able.

I learnt category theory over 30 years ago as a student, mainly as a tool to see universal constructions from a unifying point of view.

Okay, thanks, this is good to know.

I never needed to make any use of it in a long career in pure and applied mathematics, until this year, when my vision of the foundations of mathematics was developed enough to merit a comparison with other foundations that are around. So I looked at the categorial approach, its benefits and its weaknesses. But unlike for you, it is for me mainly a foreign language to which I was exposed as a youth but which I never spoke myself, apart from doing the exercises in the course where I learnt it.

The FMathL mathematical framework uses categories on the foundational level just a little bit since it provides some nice features. So I had asked John Baez about his opinion, and he moved the discussion to here. I’ll keep posting to the discussion as long as I expect to learn something from it.

Maybe this restores your patience.

It restores it some. But (and I’m not trying to browbeat you here) I have to say in all honesty: based on what you’ve written here and elsewhere, it seems to me you have a pretty hazy understanding of category theory and what it has to say about foundations. I don’t have a problem with that, unless someone in that position then holds forth on categorical foundations, its strengths and weaknesses, what’s impossible and what’s provable, etc. I happen to know a bit about the subject myself. Not as much as some people, but “more than your average bear” as Yogi Bear used to say.

But you can see how difficult it is for someone with little practice in category theory to apply the right invisible functors at the right places.

Excuse me then. Although I did try to write clearly, it’s possible that I was talking to some degree as mathematicians often do, assuming some unstated context and background. I wasn’t aware of what your background in category theory was, perhaps, and explained things too rapidly (?).

It is definitely not something that belongs to the essence of mathematics, otherwise everyone would have to practice it before being allowed to forget it again.

Well, that’s another bald claim, but it’s not the first time I’ve heard that sort of thing.

Let’s forget ETCS and all that, then; the kernel of the complaint seems to be that the categorical approach is hard to get into – even very carefully written texts like Mac Lane and Moerdijk’s Sheaves in Geometry and Logic, a nice introduction to topos theory with a strongly logic-oriented point of view, take some effort to penetrate and master. I’ve made some forays myself into writing up some of the details of ETCS (as reproduced on the nLab), with an eye to eventually being able to explain it smoothly to undergraduates, but that series is still very unfinished and not nearly to my satisfaction. Long story short, you’re right, no one has succeeded yet in making categorical set theory look like a snap.

However, it looks like Mike Shulman has written down a very interesting program of study, different to the categories-based approach (which is extremely hands-on and bottom-up) but which is very smooth and top-down (invoking for example a very powerful comprehension principle) while still being faithful to the structuralist POV. He calls it SEAR (Sets, Elements, and Relations). That may be a more congenial place to start; in fact I strongly recommend it to gain a better appreciation of this POV (easier to read certainly than Lawvere).

If you’d like to learn more, however, about categorical set theory and “categorical foundations”, it might be a good idea to read some of the category theory texts written by people with significant formal philosophic training, e.g., Awodey, J.L. Bell, Goldblatt, and McLarty (and do the exercises!). They tend to write at a more accessible level than Johnstone, say, or Lawvere.

This was short for your 6-tuple plus succ, i.e., the union of all the stuff that is usually present in discussions about natural numbers. Depending on the context, a mathematician can add any conservative extension by defined operations to make this maximal structure rich enough. In the usual view one picks up from reading math papers, ℕ doesn’t suddenly stop having a multiplication simply because someone doesn’t need it at the moment.

Obviously I agree with the spirit of the last sentence, and we obviously agree that a mathematician should have the freedom to “call” the multiplication whenever the need is felt or pass to a conservative extension rich enough for any desired purpose. What’s to disagree with? But I still don’t believe in maximal structures per se much.

For example, model theorists often consider what are called o-minimal structures, and it is widely believed (although unproven as far as I know) that there exists no maximal o-minimal structure. One can extend, world without end, amen.

But this doesn’t prevent me from continuing to see all the trivial dead weight a pure categorial approach brings to fully formalized mathematics.

It might be a good idea not to be too insistent about that before you develop a better understanding of that approach. There’s a lot going on there, and also a lot has happened in the past thirty years which you might not be fully aware of.

Posted by: Todd Trimble on September 18, 2009 8:55 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

TT: it seems to me you have a pretty hazy understanding of category theory and what it has to say about foundations.

While this may be the case, I have looked at the less formula-intensive discussions of categorial foundations to get a view of what I can expect from the approach. This is what I have done in any field I entered, and if you look at my homepage you can see that I have successfully entered many fields. Time is bounded, so I need a way to see where to concentrate and where to invest in learning all the details. With category theory, the goods to expect were never sufficient to motivate me to practice the formalism. Nevertheless, I can decipher any categorical statement with some effort, and have done so quite a number of times.

But it feels like very occasionally programming in C++ when all the time you program in an easy-to-use language like Matlab: one has to remind oneself each time what rules are applicable when, and where one needs care. (We even do the FMathL prototyping in Matlab rather than in Haskell, say, since Matlab is much easier to use.)

I just lack the practice to express myself easily, and I’ll acquire this practice only if the benefits are high.

The main reason I cannot see why category theory might become a foundation for a system like FMathL (and this is my sole interest in category theory at present) is that a systematic, careful treatment already takes 100 or more pages of abstraction before one can discuss foundational issues formally, i.e., before it acquires the first bit of self-reflection capability.

In FMathL, the reflection cycle must be very short, otherwise an implementation of the system is impossible to verify by hand.

TT: I wasn’t aware of what your background in category theory was, perhaps, and explained things too rapidly (?).

Well, since this forum is read by people with all sorts of background knowledge on categories, it pays to take a little care to add a bit of redundant information, to remind those not doing categories every day about some context. Often a few words or an extra phrase is sufficient.

I remember when, at my very first conference, Peter Cameron, one of the big people in finite geometries, started his lecture by reminding the audience (all finite geometers) of the definition of a permutation group, a statement everyone must have known (but not everyone used it on a daily basis). For me, it was an eye-opener for how to communicate well. (You may wish to look at my theoretical physics FAQ to get an idea of how it bore fruit.)

AN: It is definitely not something that belongs to the essence of mathematics

TT: Well, that’s another bald claim, but it’s not the first time I’ve heard that sort of thing.

I wasn’t saying this of category theory, but of treating it as a foundation rather than as a tool, forcing the need to type-match everything by invisible functors.

TT: SEAR (Sets, Elements, and Relations).

I posted there already, after SEAR had been mentioned here.

TT: But I still don’t believe in maximal structures per se much.

Note that maximal structure is something very constructive: The union of the structure assembled about an object at a given time in a given finite context.

Of course, any context can be indefinitely extended, and as this is done, the meaning of maximal changes in a similar way as the meaning of the largest element in a finite set of numbers may change when augmenting the set.

But this doesn’t invalidate the meaning of the concept of the maximum of a finite set of numbers.

But at any time, it is well-defined. Context management is one of the basic tasks a system like FMathL has to do (besides the ability to precisely specify concepts), and the way this is done decides the feasibility of the whole project.

AN: But this doesn’t prevent me from continuing to see all the trivial dead weight a pure categorial approach brings to fully formalized mathematics.

TT: It might be a good idea not to be too insistent about that before you develop a better understanding of that approach.

I can see the dead weight (100 pages overhead) already with the limited understanding I have now.

Show me a paper that outlines a reasonably short way to formally define all the stuff needed to be able to formally reflect in categorial language a definition that characterizes when an object is a subgroup of a group.

This is not difficult in categorial terms when you are allowed to use the standard informal mathematical metalanguage.

But the way I measure the potential merit of a conceptual framework is by how easy it is to say the same things rigorously without using the metalanguage, in particular, avoiding all the abuses that make informal mathematics so powerful and short.

Posted by: Arnold Neumaier on September 18, 2009 11:43 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

Show me a paper that outlines a reasonably short way to formally define all the stuff needed to be able to formally reflect in categorial language a definition that characterizes when an object is a subgroup of a group.

Show me that paper for $\mathbf{ZFC}$ (or any other foundation that isn't specifically geared towards group theory), and I'll show you the same for $\mathbf{ETCS}$.

Posted by: Toby Bartels on September 19, 2009 12:14 AM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

To make this a serious but fair challenge: Give me a paper whose source is available (say from the arXiv), formalised in ZFC (or whatever), and I'll rewrite it to be formalised in ETCS. (The paper can include its own specification of ZFC too, and mine will include its own specification of ETCS.)

Posted by: Toby Bartels on September 19, 2009 12:20 AM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

Show me that paper for ZFC (or any other foundation that isn’t specifically geared towards group theory), and I’ll show you the same for ETCS.

I’m guessing that the response to this may be “well, ZFC is no better.”

However, I must be misunderstanding the statement, because I don’t see why it is at all difficult. A subgroup of a group is a subset which is closed under the group operations. In ETCS “subset” means “injective function”. But why is that at all hard to formalize?

Along the same lines, exactly what 100 pages are you referring to?

Posted by: Mike Shulman on September 19, 2009 3:06 AM | Permalink | PGP Sig | Reply to this

Introduction to Categorical Logic

MS: exactly what 100 pages are you referring to?

I was thinking of the lecture notes Introduction to Categorical Logic by Awodey and Bauer. They require endless preparation (partly moved to the appendix) before they have reflected enough logic to adequately encode, semantically, stuff like the metalanguage of ETCS (which was not encoded there). Maybe far less suffices, but the way the lecture notes are organized doesn’t make it easy to find out what can be deleted.

I believe that standard axiomatic set theory + first order logic as organized in standard textbooks is much shorter. One only needs the introductory part of axiomatic set theory (until one has functions and natural numbers) since not even the general notion of a cardinal is needed in first order logic.

But of course, these were just estimates from the literature. If you know of a shorter Introduction to Categorical Logic also starting from scratch, I’d appreciate a reference.

Posted by: Arnold Neumaier on September 19, 2009 6:48 PM | Permalink | Reply to this

Re: Introduction to Categorical Logic

I think I’m still failing to understand exactly what you mean by “reflect”. Do you mean being able to give a formal definition, in some theory, of the syntax and semantics of first-order logic? If so, that is just as easy in type theory as in ZF.

All the work in those lecture notes is geared towards describing the categorical semantics for first-order logic. This is hard work no matter what theory you are working in, whereas a simple set-based semantics is easy. I think books on categorical logic usually focus on this difficult job, often assuming that their readers are familiar with the easy version.

Posted by: Mike Shulman on September 19, 2009 8:36 PM | Permalink | PGP Sig | Reply to this

Re: Introduction to Categorical Logic

MS: I think I’m still failing to understand exactly what you mean by “reflect”. Do you mean being able to give a formal definition, in some theory, of the syntax and semantics of first-order logic?

I discussed what I mean in Sections 1.5 and 3.2 of the FMathL mathematical framework paper. It means to prepare enough formal conceptual and algorithmic ground to enable one to formally write down everything needed to explain the meaning of an ordinary mathematical text that defines the system in the usual informal way, and, together with the meaning, the algorithmic steps that are allowed to be performed.

In particular, for your vision of category theory, the system should be able to know formally what it means to treat the theory in a morally correct way.

MS: If so, that is just as easy in type theory as in ZF.

I think there is a type error in your statement: the left-hand side is a way of structuring the logic, the right-hand side is a way of adding mathematical functionality to it. How can they be compared for easiness?

Foundations consist of two sides - the logic and the axiom system. With respect to (classical) logic, the choice is between FOL and HOL (first and higher order logic) and between typed and type-free theories. With respect to the axiom system, the choice is between some version of set theory and some version of category theory.

I argued mainly against taking the axioms of a category theory as the basic axiom system. This has nothing to do with types.

Type theories have proved their foundational value. The only complaint I have against type theoretic foundations coupled with some set theory - realized in many theorem provers - is that the cores of these provers are huge, and very hard to check by hand for correctness. But the untyped theorem provers are also far from satisfying in this respect.

Posted by: Arnold Neumaier on September 19, 2009 9:03 PM | Permalink | Reply to this

Re: Introduction to Categorical Logic

When I said “type theory” I meant to refer, not to typed first-order logic in general, but to a specific type theory including type constructors for products, subsets, quotients, powersets, exponentials, etc – a theory such as is usually used for the internal language of a topos. Perhaps this is what you are calling “some version of category theory,” although it is not intrinsically categorial.

Posted by: Mike Shulman on September 20, 2009 6:16 AM | Permalink | PGP Sig | Reply to this

Re: CCAF, ETCS and type theories

But it feels like very occasionally programming in C++ when all the time you program in an easy-to-use language like Matlab: one has to remind oneself each time what rules are applicable when, and where one needs care.

Interesting that you should say that. I was having the same thought that this argument is much like the argument between statically typed and dynamically typed programming languages. Unsurprisingly, I prefer statically typed ones. I actually used to like dynamically typed ones like Perl and Python better, but over time I came to realize that there is a cleanness and elegance to a well-designed typed language (in which category I am hesitant to include C++), and that by forcing you to specify the problem precisely, static typing eliminates many subtle errors and guides you to a conceptual solution. These days, if I could, I would do all my programming in Haskell. But, unfortunately, most of the world does not agree with me. (-:

Anyway, I feel that some of the same considerations may apply to mathematics. In particular, the role of mathematical foundations is not necessarily restricted to a descriptive one. One must, of course, resist the temptation to be overly prescriptive and thus alienate the majority of mathematicians, but I think that a knowledge of and appreciation for formal logic and foundations has the potential to change one’s practice of mathematics for the better, at least in small ways. I also don’t think it’s wrong for a system for computer formalization of mathematics to attempt small, incremental improvements in the way mathematicians write and reason.

Posted by: Mike Shulman on September 19, 2009 4:43 AM | Permalink | PGP Sig | Reply to this

Re: CCAF, ETCS and type theories

MS: over time I came to realize that there is a cleanness and elegance to a well-designed typed language (in which category I am hesitant to include C++), and that by forcing you to specify the problem precisely, static typing eliminates many subtle errors and guides you to a conceptual solution.

The formal specification part of FMathL will in fact have a kind of type system to make internal formal arguing and checking easy. But:

This type system does not look like math but like programming, and hence is not appropriate at the abstract level. An ordinary user should therefore never notice that it exists.

The FMathL concept of a type is quite different from that of a type theory. I’ll talk about it here once the design of the specification level is stable enough to merit discussion. (Maybe around Christmas?)

Posted by: Arnold Neumaier on September 19, 2009 7:15 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

I actually don’t see why that’s any different. If \in can be overloaded, why can’t 2 be overloaded? Numeric literals are polymorphic in Haskell.
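For instance, a small illustrative snippet showing the same literal read at three different types (nothing here goes beyond the standard Data.Complex module):

    import Data.Complex (Complex)

    -- A numeric literal elaborates to `fromInteger 2`, so its meaning is
    -- resolved by the type checker from the type at which it is used.
    twoInteger = 2 :: Integer
    twoReal    = 2 :: Double
    twoComplex = 2 :: Complex Double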

Posted by: Mike Shulman on September 18, 2009 5:47 AM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

In FMathL, objects and morphisms can belong very naturally to several categories.

As a category theorist, I find that very worrying! (-:

I expect you are right that many mathematicians think of the real number 2 as “the same” as the natural number 2, but I think there are a fair number who realize that they are, strictly speaking, different. (They are different in material set theory too, of course: whether you define real numbers using Dedekind cuts or Cauchy sequences, in no case will 2 be \{\emptyset,\{\emptyset\}\} or whatever you used for your naturals.) But as I said in my previous post, I think that the usage of the former collection of mathematicians is adequately addressed by notational overloading. (In fact, 2 has a meaning more general than that! It also means 1+1 in the generality at least of any abelian group. And some people use it to mean a 2-element set, or the interval category. You just don’t have any hope of capturing mathematical usage without heaps of notational overloading, so as long as it’s there, why not make full use of it?)

However, as a category theorist, I feel perfectly within my rights to object that morally, nothing should ever be an object of two categories at the same time. If that were possible, then people could start talking about things like taking the “intersection” of two categories, and that would just be all wrong. The set \mathbb{R} is an object of Set, the field \mathbb{R} is an object of Fld, and so on, but these are different objects.

Posted by: Mike Shulman on September 18, 2009 5:58 AM | Permalink | Reply to this

objects and morphisms can belong naturally to several categories

MS: It also seems to me that all the things about real numbers you are saying that FMathL does right, it only does right specifically for real (or complex) numbers, because you have built them into the system by fiat. If FMathL didn’t include axioms specifying that there was a particular set called \mathbb{R} with particular properties

In fact, it doesn’t. The axioms only defined when a complex number is real. To define the set of real numbers, one needs to add

\mathbb{R} := \{x\in\mathbb{C}\mid \overline{x} = x\},

but to be allowed to do that needs later axioms.

AN: In FMathL, objects and morphisms can belong very naturally to several categories.

MS: As a category theorist, I find that very worrying! (-:

But every undergraduate student is very thankful for not having to distinguish between the many incarnations of 2 (and of compound objects that involve 2) in the many different structures it is in! It simplifies life dramatically without sacrificing the slightest bit of rigor!

MS: I think there are a fair number who realize that they are, strictly speaking, different. (They are different in material set theory too, of course: whether you define real numbers using Dedekind cuts or Cauchy sequences, in no case will 2 be \{\emptyset,\{\emptyset\}\} or whatever you used for your naturals.)

This is why I regard both foundations as inadequate.

MS: the usage of the former collection of mathematicians is adequately addressed by notational overloading.

Even defining precisely the details of the process of overloading necessary to make mathematics work as usual is a nightmare (of the same magnitude as that for the needed abuse of notation in ZF-based foundations), and nobody has ever done it.

It is clear that this overloading is not intrinsic to mathematics but only to the attempt to give it a categorial or set-theoretic foundation. After all, real numbers existed long before Dedekind invented the first straitjacket for them!

MS: You just don’t have any hope of capturing mathematical usage without heaps of notational overloading, so as long as it’s there, why not make full use of it?

When talking of “you”, I think you projected your own hopelessness onto me! I not only have hopes but have a good deal of evidence for its realizability within easily formalizable limits - otherwise I wouldn’t have embarked on the FMathL project.

MS: as a category theorist, I feel perfectly within my rights to object that morally, nothing should ever be an object of two categories at the same time.

In FMathL, you can exercise this right by defining your own version of categories, just as those who want to base their mathematics on ZF can define their own version of ZF-numbers.

MS: If that were possible, then people could start talking about things like taking the “intersection” of two categories,

In FMathL you can form it, but it would not automatically be a category, but an object whose only useful property would be the elements it contains.

MS: and that would just be all wrong.

Wrong at best in the current tradition, but traditions can change. Such a change of tradition might have many advantages, once systematically explored: For example, one can express with this naturally (and every mathematician immediately understands without the need for explanations) that ordered monoids are the objects in the intersection of Order and Monoid satisfying the compatibility relation R (suitably defined), and many other similar constructs.

But please justify your claim of wrongness by giving an example of a major categorial result that becomes wrong when one drops the requirement that different categories may not have common objects.

For I haven’t seen this stated as a formal requirement of the concepts of category theory. It seems to be a mere metarequirement of some category theorists only, and at best of a status like that of the implicit overloading generally employed without saying so explicitly.

For example, Wikipedia says that “Any category C can itself be considered as a new category in a different way: the objects are the same as those in the original category but the arrows are those of the original category reversed. This is called the dual or opposite category”

How does this square with your feeling that morally, nothing should ever be an object of two categories at the same time?

But perhaps “morally” is the qualifier that makes the difference to “in practice”. I am interested in describing mathematics as done in practice, since a good automatic research system must understand the practice, but not necessarily any subjective moral associated with it.

… back to the conversion issue…

It takes a lot of training for a mathematician not already immersed into category theory to believe (and feel happy with) the multitude of trivial conversions needed to state rigorously what you want to consider the moral state of affairs.

This is characteristic of any theory that thinks in constructive terms (like ZF or categories) rather than in specification terms (like FMathL).

Generations of students had to be forced into an unnatural ZF (or ZF-like) straitjacket since, for a long time, that was the only respectable foundation. Traditional categorial foundations only exchange this straitjacket for a different one.

FMathL shows that no such straitjacket is needed since actual mathematical practice can be fully rigorously formalized without any need for accidentals that are not actually used after a concept was defined into existence.

Posted by: Arnold Neumaier on September 18, 2009 11:50 AM | Permalink | Reply to this

Re: overloading

If FMathL didn’t include axioms specifying that there was a particular set called \mathbb{R} with particular properties

In fact, it doesn’t. The axioms only defined when a complex number is real.

Yes, but that is completely irrelevant to the point I was trying to make. The point is that the axioms supply, however they do it, a set \mathbb{R} without specifying how it is constructed.

But every undergraduate student is very thankful for not having to distinguish between the many incarnations of 2

So is every Haskell programmer. But that doesn’t mean that the different incarnations of 2 aren’t different at a fundamental level.

It is clear that this overloading is not intrinsic to mathematics but only to the attempt to give it a categorial or set-theoretic foundation.

It is clear to me that this overloading is fundamental to the way mathematics is spoken and written by mathematicians. I’m guessing that even you will not claim that the real number 2 is the same as the element of \mathbb{Z}/p\mathbb{Z} denoted by 2 — especially when p=2 (the integer) because then 2=0 in \mathbb{Z}/p\mathbb{Z}, which it assuredly does not in \mathbb{R}! So any system which can parse mathematics as it is actually written by mathematicians will have to allow overloading, regardless of how nightmarish it might be, and regardless of the foundations one chooses. (And I thought you already agreed that overloading \in was reasonable! why is overloading 2 different?)
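A toy Haskell version of this point, assuming a made-up type Z2 standing in for \mathbb{Z}/2\mathbb{Z}: the same literal 2 is overloaded at both types, yet it equals 0 at one of them and not at the other.

    -- Integers modulo 2, with numeric literals overloaded at this type as well.
    newtype Z2 = Z2 Int deriving (Eq, Show)

    instance Num Z2 where
      fromInteger n = Z2 (fromInteger n `mod` 2)
      Z2 a + Z2 b   = Z2 ((a + b) `mod` 2)
      Z2 a * Z2 b   = Z2 ((a * b) `mod` 2)
      negate (Z2 a) = Z2 (negate a `mod` 2)
      abs           = id
      signum (Z2 0) = Z2 0
      signum _      = Z2 1

    inZ2 = (2 :: Z2)     == 0   -- True:  2 = 0 in Z/2Z
    inR  = (2 :: Double) == 0   -- False: 2 is not 0 among the reals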

I not only have hopes but have a good deal of evidence for [capturing mathematical usage without heaps of notational overloading]

I would like to see some of this evidence.

It takes a lot of training for a mathematician not already immersed into category theory to believe (and feel happy with) the multitude of trivial conversions needed to state rigorously what you want to consider the moral state of affairs.

But I thought the whole point of this project is so that the mathematician not already immersed in any sort of foundations doesn’t have to believe in or feel happy with those foundations; they can just write mathematics as they usually do and the system will interpret it correctly.

Posted by: Mike Shulman on September 18, 2009 6:21 PM | Permalink | Reply to this

Re: overloading

MS: even you will not claim that the real number 2 is the same as the element of ℤ/pℤ denoted by 2 — especially when p=2 (the integer) because then 2=0 in ℤ/pℤ, which it assuredly does not in ℝ! So any system which can parse mathematics as it is actually written by mathematicians will have to allow overloading.

I call this context-dependent ambiguity. This has nothing to do with types.

Today I was grading a paper where c was defined as a certain product, and a few lines later there was a formula involving c in the denominator, explained to be the speed of light. And the student went on saying: Therefore c is very small and can be neglected. I was very puzzled in the first round of reading until I noticed that there were two different meanings of the symbol c.

I conclude that the existence of context-dependent ambiguity must be accounted for. But it doesn’t give licence to generate more versions of 2 than are absolutely needed.

MS: I would like to see some of this evidence.

It is difficult to convey direct evidence before we have fixed the formal framework for representing mathematical context, but here is the idea:

I mentioned already elsewhere in this discussion that one can avoid the multiple meanings of \mathbb{N} by interpreting it in each context as the richest structure that can be picked up from the context.

This principle works very generally, and is, I believe, consistent with the way mathematicians work (excepting perhaps category theorists).

As an automatic research system must have the capacity to build up context anyway, this accommodates the principle without problems, without the need for overloading. Each context has its own collection of interpretations.

In some sense, this uses the same idea as the categorial approach to foundations, but to turn each context into a category in order to be able to use the categorial version of this idea for general context changes in mathematics would create another straitjacket…

MS: I thought the whole point of this project is so that the mathematician not already immersed in any sort of foundations doesn’t have to believe in or feel happy with those foundations; they can just write mathematics as they usually do and the system will interpret it correctly.

This is the main point, but not the whole point. The whole point also involves making the system trustworthy in that everyone interested in checking out how the system arrives at its answers should be able to get as much detail as wanted, and also in the simplest possible form. In particular, someone extremely critical should be able to check for himself (if possible without any use of a machine) that the whole system works in a sound way. This means that the basics must be as transparent as possible.

Compare with Coq. I enter a conjecture; Coq works on it for a week and then says: “The conjecture is true, and here are 3749MB of proof text that Coq_V20.57 verified to be correct.”

Nobody is going to check that, except perhaps another machine. Humans must take it on trust. But on which basis? At least you’d want to check the implementation of Coq to get assurance. But Coq is a huge package….

The core of FMathL will have to be a small package, with programs transparent enough to be checked by hand. This is possible only if things are kept as simple as possible.

Posted by: Arnold Neumaier on September 18, 2009 7:30 PM | Permalink | Reply to this

Re: overloading

I’m confused; are you saying that 2 should not be used to mean 1+1 in \mathbb{Z}/p\mathbb{Z}? I think that is a perfectly justifiable notation and, I believe, so do many other mathematicians.

Posted by: Mike Shulman on September 18, 2009 8:45 PM | Permalink | PGP Sig | Reply to this

Re: overloading

MS: are you saying that 2 should not be used to mean 1+1 in ℤ/pℤ?

No. I am saying that 2 has a context-dependent ambiguous meaning.

In a context where ℤ/pℤ (or another ring with 1) appears as a structure whose elements are discussed, 2 should mean 1+1 in this ring, whereas in the absence of such a context, 2 should be considered as the default: a complex number (and at the same time as a real, rational, integral, and natural number).
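Haskell’s type-defaulting rule is a small-scale analogue of this convention; a sketch (the default declaration shown is just the language’s standard one):

    default (Integer, Double)

    x = 2         -- no further context: the literal is read at the default type Integer
    y = 2 + 0.5   -- a Fractional context: the same literal is read at type Double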

Thus I accept notation ambiguity - to be able to recognize ambiguity and resolve it from context is essential for any automatic math system.

But I do not regard notation ambiguity as something to be captured by a concept of overloading and subtyping, since not every notation ambiguity can be considered as an instance of the latter.

Therefore I accept that “2 in ℤ/pℤ” and “2 as natural number” are different objects. They are rarely used in the same context.

But I do not accept that “2 as natural number” and “2 as complex number” are different objects. Almost everyone will agree with me. (In algebra I was even taught how to achieve this uniqueness of 2 in ZF using a construction called identification.)

Posted by: Arnold Neumaier on September 18, 2009 9:00 PM | Permalink | Reply to this

Re: overloading

But I do not regard notation ambiguity as something to be captured by a concept of overloading and subtyping, since not every notation ambiguity can be considered as an instance of the latter.

Can you give some examples of the sort of notation ambiguity you are thinking of which can’t be captured by overloading and subtyping?

Posted by: Mike Shulman on September 19, 2009 2:28 AM | Permalink | PGP Sig | Reply to this

Re: overloading

MS: Can you give some examples of the sort of notation ambiguity you are thinking of which can’t be captured by overloading and subtyping?

For example, writing x \circ y * z without having defined priorities of the operations (and where different priority rules exist in different traditions), and both ways to interpret them make sense. The dangling else is a famous instance of that.
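In a programming language this kind of ambiguity is settled by explicit fixity declarations; a small Haskell illustration (the operators .o. and .*. are invented for the example):

    -- Two invented infix operators; their declared precedence decides the parse.
    infixl 7 .*.
    infixl 6 .o.

    (.*.) :: Int -> Int -> Int
    x .*. y = x * y

    (.o.) :: Int -> Int -> Int
    x .o. y = x + y

    -- With the fixities above,  2 .o. 3 .*. 4  parses as  2 .o. (3 .*. 4) = 14.
    -- Swap the two precedence levels and the same expression parses as (2 .o. 3) .*. 4 = 20:
    -- nothing in the expression itself decides; only the declared convention does.
    parsedExample :: Int
    parsedExample = 2 .o. 3 .*. 4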

Of course, one can write formal papers that avoid any sort of ambiguity. This is what is done in Mizar, but it accounts for the huge overhead that makes Mizar not a very practical tool for the ordinary mathematician.

FMathL must resolve these issues by reasoning, figuring out which version makes sense in the context. Typing is just the simplest way of doing such reasoning in cases where it applies.

Posted by: Arnold Neumaier on September 19, 2009 5:25 PM | Permalink | Reply to this

Re: overloading

Thanks for the examples, now I know what you mean.

Typing is just the simplest way of doing such reasoning in cases where it applies.

Maybe this is the core of our disagreement. I do not see types as “just” a way of resolving notational ambiguity. Rather, types carry important semantic information in their own right.

It seems unlikely that either of us will convince the other, however, since versions of this argument have been raging for years. But I’m very glad to have had / be having the discussion; I think I’ve learned a lot.

Posted by: Mike Shulman on September 19, 2009 6:50 PM | Permalink | PGP Sig | Reply to this

Re: overloading

MS: types carry important semantic information in their own right.

Yes, they do. But the common type systems are far too inflexible to capture all of the semantics. Since a system like FMathL needs a way to represent arbitrary semantics anyway, the more general semantic system will automatically take over the function a typing system would have.

Once one has the semantic system, one can choose to represent things there in any way one likes. It will be easy to create in FMathL contexts that work exactly like HOL and allow you to do everything with overloading if you believe this is the way mathematics should be done on the base level. But the standard context for doing mathematics will most likely be different, since typing introduces a lot of unnecessary representation overhead.

On the other hand, since the semantics of everything is clearly structured, it will not be very difficult to create automatic translators from one sort of mathematical representation to another sort. Indeed, this will be one of the strengths of the FMathL system and its view of many subject levels with a common object level.

Every mathematician can view mathematics in his or her preferred way, and still be sure that everything translates correctly to every other mathematician’s view.

MS: But I’m very glad to have had / be having the discussion; I think I’ve learned a lot.

The same holds for me. It is very interesting and helpful.

Posted by: Arnold Neumaier on September 19, 2009 8:07 PM | Permalink | Reply to this

Re: overloading

But I do not accept that “2 as natural number” and “2 as complex number” are different objects. Almost everyone will agree with me.

I would be very interested to see a wide-ranging survey of professional mathematicians on this point, broken down by field and perhaps age. I’m not saying you’re wrong that “almost everyone” will agree with you, but I don’t really have data to judge one way or the other.

I find it markedly inconsistent and arbitrary to view 2\in\mathbb{Z}/p\mathbb{Z} and 2\in\mathbb{Z} as different, but 2\in\mathbb{Z} and 2\in\mathbb{C} as the same. What about 2\in\mathbb{Q}_p? Or 2\in\overline{\mathbb{Q}}? What’s the general rule?

Posted by: Mike Shulman on September 19, 2009 6:59 PM | Permalink | PGP Sig | Reply to this

Re: overloading

MS: I find it markedly inconsistent and arbitrary to view 2\in\mathbb{Z}/p\mathbb{Z} and 2\in\mathbb{Z} as different, but 2\in\mathbb{Z} and 2\in\mathbb{C} as the same. What about 2\in\mathbb{Q}_p? Or 2\in\overline{\mathbb{Q}}? What’s the general rule?

I didn’t invent that. For Bourbaki (whom I am following in this), the rule is that an object x\in A remains x even when it is considered as an element of B where B contains A.

Bourbaki has a general construction called identification that replaces (for example) the pseudo-rationals in the Dedekind reals by the true rationals from which the Dedekind reals were constructed, and adapts the operations in such a way that the resulting field of reals contains the ordinary rationals.

Thus what (in my understanding of) category theory (maybe I am not using the correct term here) is an embedding functor is for Bourbaki the identity mapping.

Upon having described identification at the first occasion where it occurs, it suffices later (when discussing reals) to say that “by identification, we may take Q to be a subset of R.” Somewhere, an abuse of language is introduced to say that identification will be made silently if the embedding is canonical. This establishes the standard mathematical terminology in a completely rigorous way.

Thus the 2\in\mathbb{Q}_p and the 2\in\overline{\mathbb{Q}} are the same as the 2\in\mathbb{C}, as all three sets contain \mathbb{Z}, which in turn contains 2.

Posted by: Arnold Neumaier on September 19, 2009 8:08 PM | Permalink | Reply to this

Re: overloading

After a long conversation with a friend this evening, I feel like I have a better understanding of how and why many/most people may think of 2\in\mathbb{Z} and 2\in\mathbb{R} as identical. Perhaps I have unknowingly trained myself to think of them as different, because that is what the structural approach to mathematics requires (and I think of mathematics structurally because that is what I see when I look at mathematics—although I now accept that you see something different).

Honestly, it just feels really messy to me to think about the category Ring (say) if some elements of some rings might be equal to some elements of other rings, depending on how one constructed them and whether an embedding of one into another is “canonical” or not. But aesthetics certainly differ! (-:

Posted by: Mike Shulman on September 20, 2009 6:26 AM | Permalink | PGP Sig | Reply to this

Re: objects and morphisms can belong naturally to several categories

ordered monoids are the objects in the intersection of Order and Monoid

As far as I can tell, this is not true even in FMathL (and I don’t see how it could ever be true). An order is a set equipped with an order relation, and a monoid is a set equipped with a multiplication and unit. An ordered monoid is a set equipped with both. What is true instead is that ordered monoids are the objects of the pullback of the two forgetful functors from Order and Monoid to Set.
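A rough Haskell analogue (class and instance names invented for illustration): an ordered monoid is one carrier equipped with both structures plus a compatibility law, not a member of an intersection of two collections of objects.

    {-# LANGUAGE FlexibleInstances #-}
    import Data.Monoid (Sum)

    -- One type carrying both an order and a monoid structure.
    -- Law (stated, not machine-checked): if x <= y then z <> x <= z <> y and x <> z <= y <> z.
    class (Ord a, Monoid a) => OrderedMonoid a

    -- The additive monoid of integers with the usual order satisfies the law.
    instance OrderedMonoid (Sum Int)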

But please justify your claim of wrongness by giving an example of a major categorial result that becomes wrong when one drops the requirement that different categories may not have common objects.

Mathematics is not just about results. Mathematics, and especially category theory, is also about definitions and concepts. The point is that one object being in two categories is foreign to the practice of mathematics and can only lead to confusion. You’ve supplied me with a case in point above: if it were possible for one object to be in two categories, then I can easily see some people thinking that ordered monoids are the intersection of Order and Monoid, when in fact this is false.

Your example of opposite categories is also a good one. The opposite of the category of frames is the category of locales, but a frame is not the same as a locale. For instance, the terminal frame is very different from the terminal locale. Suppose I wanted to study “localic frames,” i.e. the point-free version of “topological frames,” which in turn would be frames whose frame operations are continuous. If someone has been told that categories can be intersected, and maybe is still laboring under the misapprehension that ordered monoids are the intersection of Order and Monoid, they might immediately try to define localic frames as the intersection of Locale and Frame. But if Locale is defined as Frame^{op} and opposite pairs of categories “have the same objects,” then Locale \cap Frame would consist just of the frames/locales, a far cry from “localic frames.” Trying to intersect categories should be a type error.

Posted by: Mike Shulman on September 18, 2009 6:38 PM | Permalink | PGP Sig | Reply to this

Re: objects and morphisms can belong naturally to several categories

AN: Such a change of tradition might have many advantages […] ordered monoids are the objects in the intersection of Order and Monoid satisfying the compatibility relation R (suitably defined),

MS: As far as I can tell, this is not true even in FMathL (and I don’t see how it could ever be true). An order is a set equipped with an order relation, and a monoid is a set equipped with a multiplication and unit.

I was speaking in subjunctive mode (and clearly not about the meaning of the terms in existing foundations) since this is not a reality but only a possibility that might (or might not) work out. (I thought the café is also about exploring possibilities, not only about presenting truths.) Since the current FMathL framework does not have a formal way to say what a monoid is, nobody knows yet whether this could be true in FMathL. This brings us to semantical questions that need to be resolved on the next layer of FMathL, which we are designing at the moment, but it is not yet ready:

What does it formally mean to speak of “a set equipped with a relation”?

In ZF, it means to have a pair (S,R) where S is a set and R is a relation, and this is easily formalized in ZF itself.

In category theory, its formal meaning must be something different; I don’t know exactly what. How would this be formally expressed, using only terms intrinsic to the language of categories?

FMathL is neither ZF nor category theory, hence can give it again a different meaning. (What it will be, will be decided by the end of the year, I guess.)

Probably - though I haven’t checked the details - one can give the intersection of categories in FMathL a meaning that makes intersections of categories of two algebraic structures inherit the properties of both structures in such a way that it is consistent with Axiom A10 governing intersection.

MS: Your example of opposite categories is also a good one. The opposite of the category of frames is the category of locales, but a frame is not the same as a locale.

There must be something wrong either with your statement or with the definition of opposites in Wikipedia.

Wikipedia said explicitly that the two categories have the same objects. And this seems to be the canonical definition: Adámek et al, Abstract and Concrete Categories say something amounting to the same in Definition 3.5. So does Definition 1.3.2 of Asperti and Longo, Categories, Types and Structures. So says Section 1.4 in Barr and Wells, Toposes, Triples and Theories. So says Section 1.6.1 in Schalk and Simmons, An introduction to Category Theory in four easy movements.

Not being an expert in category theory, I need to rely on trustworthy sources for the accepted meaning of the basic concepts. There seems to be full agreement in the literature (at least that available online) that at least certain distinct categories have the same objects.

MS: But if Locale is defined as Frame^{op} and opposite pairs of categories “have the same objects,” then Locale \cap Frame would consist just of the frames/locales, a far cry from “localic frames.”

I agree. But backed by the orthodox definition of opposites, and with the usual moral of a mathematician, I’d draw the conclusion that your statement “The opposite of the category of frames is the category of locales, but a frame is not the same as a locale” is incompatible with the definition of opposites when taken literally. (FMathL would raise here a popup window and ask for support or correction or clarifying context.)

But perhaps there is something else wrong with my moral of reading category theory texts besides what Todd Trimble pointed out. (Just as I learn through such a discussion, FMathL would have an internal moral code that learns from the feedback from popup windows.)

Posted by: Arnold Neumaier on September 18, 2009 8:46 PM | Permalink | Reply to this

can objects and morphisms belong to several categories?

I think that making ordered monoids the intersection of Order and Monoid, in any formal system, would be a very bad idea. (I also have my doubts about whether it is possible in a consistent way.) Everywhere in mathematics that I have seen “intersection” used, it refers to adding properties, rather than structure as in this case. Furthermore, a category should only be considered as defined up to equivalence, and this sort of “intersection” seems unlikely to be invariant under equivalence.

What does it formally mean to speak of “a set equipped with a relation”?

In ETCS, a set equipped with a relation consists of an object A, an object R, and a monomorphism R \to A\times A.

There must be something wrong either with your statement or with the definition of opposites

Very interesting! I didn’t realize that the basic texts of category theory could be misinterpreted in this way by someone unfamiliar with our way of thinking.

I think that all category theorists, as well as most of the mathematicians they talk a lot to (algebraists, topologists, etc.), are so immersed in a structural way of thinking that an object of a category only has meaning as an object of that category. When you construct one category from another, you might use the “same” set of objects, but once you’ve constructed it, there is no relationship between the objects, because after all any category is only defined up to equivalence.

This phenomenon isn’t special to categories. For instance, any group has an opposite group with “the same elements” obtained by reversing the order of multiplication, but I don’t think any group theorist would then consider it meaningful to take the intersection of a group and its opposite group. Using “the same elements” is a construction of the opposite, with the same status as Dedekind cuts or Cauchy sequences—once the construction is performed, the fact that you used “the same” objects is discarded. And plenty of other constructions of the opposite are possible.

In fact, this same principle applies to basically all constructions I am familiar with in mathematics. The quotient of one group by another, the polynomial algebra of a ring, the Postnikov tower of a topological space—in each case you may give a specific construction in terms of the input (e.g. maybe an element of R[x] “is” a function from \{0,\dots,n\} to R, thought of as the coefficients of a polynomial), but in each case once the construction is performed, its details are forgotten.
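For comparison, a standard Haskell rendering of the “opposite” construction (essentially Dual from Data.Monoid, rewritten here as Op for illustration): the construction wraps the old elements, and downstream code cannot even ask whether an element of the opposite “is” an element of the original.

    -- The opposite monoid: built from the original, with multiplication reversed.
    newtype Op a = Op a deriving (Eq, Show)

    instance Semigroup a => Semigroup (Op a) where
      Op x <> Op y = Op (y <> x)

    instance Monoid a => Monoid (Op a) where
      mempty = Op mempty

    -- Asking whether an `Op [Int]` equals a `[Int]` is ill-typed; only the structure
    -- of Op survives, not the details of how it was obtained.
    example :: Op [Int]
    example = Op [1,2] <> Op [3]   -- Op [3,1,2]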

I always assumed, without really thinking much about it, that all modern mathematicians thought in this way, except for maybe ZF-theorists (on some days of the week). But apparently not!

Posted by: Mike Shulman on September 18, 2009 9:18 PM | Permalink | PGP Sig | Reply to this

Re: can objects and morphisms belong to several categories?

MS: In ETCS, a set equipped with a relation consists of an object A, an object R, and a monomorphism R→A×A.

This only goes halfway towards answering my question. You reduced it to another informal construct.

What does it formally mean that something consists of three typed things?

Posted by: Arnold Neumaier on September 18, 2009 10:02 PM | Permalink | Reply to this

what are structures, structurally?

“A set equipped with a binary relation” is not a single object in the discourse of structural set theory the way a pair (A,R) is a single object in the discourse of material set theory. But that doesn’t make it informal. If I want to make a statement about all sets equipped with binary relations, I can construct a formal sentence of the form

\forall A.\ \forall R : P(A\times A).\ \dots

Posted by: Mike Shulman on September 19, 2009 5:03 AM | Permalink | PGP Sig | Reply to this

Re: can objects and morphisms belong to several categories?

MS: a category should only be considered as defined up to equivalence, and this sort of “intersection” seems unlikely to be invariant under equivalence.

This appears to be part of the unspoken moral code of categorists. But all categories they write down are defined as particular categories, and equivalence of categories is a concept that doesn’t appear on page 1, where it should figure if categories are really defined only up to equivalence.

Let’s not further discuss intersection; maybe this is still unbaked. But clarifying the moral code so that it is intelligible to an outsider who learns through self-study (ultimately I am thinking of the FMathL system to be that outsider) seems quite important!

AN: There must be something wrong either with your statement or with the definition of opposites

MS: Very interesting! I didn’t realize that the basic texts of category theory could be misinterpreted in this way by someone unfamiliar with our way of thinking.

I have acquired an eye for all these subtleties (on which an automatic system would stumble) because I’ve been observing myself closely over the last few years how I read math, in order to be able to teach it to the FMathL system.

I can’t see how the definitions can be read in any other way by someone reared in the tradition of Bourbaki. If you take ZF as the metatheory then none of the usual rules for identifying abuses of terminology give you the slightest clue that something else could have been intended!

MS: I think that all category theorists, as well as most of the mathematicians they talk a lot to (algebraists, topologists, etc.), are so immersed in a structural way of thinking …

… that they have lost contact with those mathematicians whose daily work is a bit less abstract?

I was taught structural thinking but in the way of Bourbaki, rather than in the categorial way. There one clearly distinguishes between equality and isomorphism, although one knows that in many cases only the isomorphism-invariant properties are relevant. But being able to distinguish the two modes has lots of advantages; in particular, there is no problem of evil.

One knows that Alt(5) and PSL(2,5) are isomorphic groups, but as individuals the two groups are naturally distinguishable by their construction. One distinguishes clearly between the group Alt(5) with its intrinsic action on 5 elements (though we allowed abuse of notation to label these elements arbitrarily; if pressed, we’d have undone that) and a group Alt(5), which is just an arbitrary group isomorphic to Alt(5). Thus one could say that “Alt(6) contains the groups Alt(5) and PSL(2,5), and therefore two conjugacy classes of alternating groups of 5 elements”, with a perfectly clear meaning.

I know that this can be reformulated in categorial terms, but if we want to maintain the same degree of precision, it becomes clumsy.

MS: Using “the same elements” is a construction of the opposite, with the same status as Dedekind cuts or Cauchy sequences - once the construction is performed, the fact that you used “the same” objects is discarded.

I find this a weakness of the constructive approach to math…

MS: And plenty of other constructions of the opposite are possible.

It is interesting, though, that all books use the same construction.

MS: I always assumed, without really thinking much about it, that all modern mathematicians thought in this way, except for maybe ZF-theorists (on some days of the week). But apparently not!

Maybe I am not a modern mathematician. I was still taught the old virtues of precise definitions, and the advantage of having two different words (such as “same” and “isomorphic”) for concepts whose confusion causes confusion.

Posted by: Arnold Neumaier on September 18, 2009 10:41 PM | Permalink | Reply to this

Re: can objects and morphisms belong to several categories?

This reminds me of a question about forgetful functors that I’ve had for a while.

The definition of “forgetful functor” is (in every account I’ve seen) specified in terms of removing structure from a set equipped with some structure.

This very much depends on the presentation of the relevant categories. How can we define “forgetful functor” in a way that’s invariant under equivalence?

Posted by: Tom on September 18, 2009 11:21 PM | Permalink | Reply to this

Re: can objects and morphisms belong to several categories?

How can we define “forgetful functor” in a way that’s invariant under equivalence?

I'd say (and Mike did say, at [[forgetful functor]]) that any functor can be a forgetful functor, depending on your point of view; calling it that simply establishes a (perhaps temporary) point of view.

Posted by: Toby Bartels on September 19, 2009 12:14 AM | Permalink | Reply to this

Re: can objects and morphisms belong to several categories?

I have acquired an eye for all these subtleties (on which an automatic system would stumble)

I would humbly suggest that an automatic system need only stumble upon them if it had been designed by a human mathematician who was unconversant with structural ways of thinking.

Thus one could say that “Alt(6) contains the groups Alt(5) and PSL(2,5), and therefore two conjugacy classes of alternating groups of 5 elements”, with a perfectly clear meaning.

Not very clear to me; perhaps our linguistic conventions are equally mutually unintelligible. Do you mean that Alt(6), considered together with its canonical action on a 6-element set, contains the group Alt(5) with its canonical action on a 5-element set and the group PSL(2,5) with its canonical action on some other set? If so, I don’t see why it should become clumsy in typed language; we just overload Alt(5) so that it can mean either “the group Alt(5)” or “the group Alt(5) with its canonical action.”

But being able to distinguish the two modes has lots of advantages; in particular, there is no problem of evil.

I am puzzled by this statement; it seems to me that only if there is a notion of equality, in addition to a notion of isomorphism, can the problem of evil even be posed.

MS: Using “the same elements” is a construction of the opposite, with the same status as Dedekind cuts or Cauchy sequences - once the construction is performed, the fact that you used “the same” objects is discarded.

I find this a weakness of the constructive approach to math…

But this seems to me exactly what FMathL is doing, when you first construct something, then define the class of things isomorphic to it, and redefine that object to be something arbitrarily chosen from that class. You use a particular construction, then you discard its details.

And actually, I don’t know what you mean by the “constructive approach” here, nor what other approach you are contrasting it with.

MS: And plenty of other constructions of the opposite are possible.

It is interesting, though, that all books use the same construction.

Well, of course. It’s the simplest one. I expect that basically all books use the same construction of the rational numbers in terms of the integers.

Maybe I am not a modern mathematician. I was still taught the old virtues of precise definitions

I’m having trouble not being insulted by that. Everything we’ve said here is perfectly precise.

Posted by: Mike Shulman on September 19, 2009 3:02 AM | Permalink | PGP Sig | Reply to this

Precise definitions

AN: Maybe I am not a modern mathematician. I was still taught the old virtues of precise definitions

MS: I’m having trouble not being insulted by that. Everything we’ve said here is perfectly precise.

The first was not intended; I was just explaining my training, not making a comparison with yours. (Please do not take anything I say personally.) The second is simply false.

I was specifically referring to the fact that all authoritative definitions I could find on the web on the definition of opposite categories say explicitly that they have the same objects, while you said equally explicitly (and now even supposedly perfectly precisely) that “morally, nothing should ever be an object of two categories at the same time”.

Please give me the perfectly precise meaning of the term “morally” that you had in mind when writing “Everything we’ve said here is perfectly precise.” It is the only point of dispute here. For without the moral part, my interpretation is correct.

Not a single word in the definitions of Wikipedia (or any of the other sources quoted earlier) tells me that two different categories must have disjoint object classes. If this were really part of category theory, it would be trivial to state it as part of the definition of categories, and surely some author would have cared enough about precision to do so.

But no author I know of has done it, and for good reasons. For it would make the standard definition contradictory, since it is easy to construct two different categories in the sense of the standard definition that share some objects.

Posted by: Arnold Neumaier on September 19, 2009 6:10 PM | Permalink | Reply to this

Re: Precise definitions

How about this, Arnold: two categories may have an object in common, but you should never use that fact. The construction that shows that opposite categories exist (inside a model of set theory as foundation) uses the same objects as the original category (so we’re sure that there are “enough” objects around) but from that point we never actually use that fact.

Here’s an analogous situation not using categories: the von Neumann construction showing that a model of the Peano axioms exists within set theory defines 4 as the set {0,1,2,3}. Thus within this construction it happens that 3 is an element of 4, but from this point we never use this fact, since numbers being elements of other numbers is a property of the model, not of the structure. Technically you can make this statement within the model, but “morally” you shouldn’t.

Similarly, it’s easy to come up with an equivalent category which also behaves like the opposite category, but which shares no components at all with the original category. But it really doesn’t matter, since equality of components of two distinct categories is not part of the structure.

Posted by: John Armstrong on September 19, 2009 7:15 PM | Permalink | Reply to this

Re: Precise definitions

Thanks, that’s about what I meant.

I wasn’t including statements prefixed by “morally” in my analysis of what is precise and what isn’t; perhaps I should have said something like “every mathematical definition we’ve given has been precise.”

Posted by: Mike Shulman on September 19, 2009 7:54 PM | Permalink | PGP Sig | Reply to this

Re: Precise definitions

JA: two categories may have an object in common, but you should never use that fact. The construction that shows that opposite categories exist (inside a model of set theory as foundation) uses the same objects as the original category (so we’re sure that there are “enough” objects around) but from that point we never actually use that fact.

You are contradicting yourself.

The construction that shows that opposite categories exist must use the fact that two categories may have an object in common, otherwise it loses its simplicity.

The same holds in many other instances in current category theory texts.

The point is that in constructions, one freely uses (and needs) the definition of categories in the Bourbaki sense (i.e., as ordinary algebraic structures, allowing categories to share objects).

But after one has constructed an instance of the desired category, one forms its isomorphism class and chooses an anonymous element from it (in the sense of Bourbaki’s - actually Hilbert’s - choice operator used also in FMathL) to get rid of the accidentals of the construction.

Thus one needs both interpretations of categories, the concrete one (an instance) and the generic one (an anonymous instance).

FMathL naturally provides both views, and does not need to impose on the user a moral (“you should not”) that mathematics never had, beyond sticking to what is set down in the axioms.

Posted by: Arnold Neumaier on September 19, 2009 8:35 PM | Permalink | Reply to this

Re: Precise definitions

The construction that shows that opposite categories exist must use the fact that two categories may have an object in common.

No more than the von Neumann construction shows that the number 3 must be an element of the number four. The usual construction of the opposite category is one of many, but the nature of the opposite category is not defined by this construction or any artifacts of this construction.

And I even said that it’s possible to construct another version of the opposite category whose components are completely disjoint from those of the original category. Please go back and read what I wrote.

On another note, when did mathematics not regard the syntactic validity of statements like “3 is an element of 4” as problematic at best? You can ask “is 3 an element of 4?”, or “is this object from one category ‘the same as’ that object from another category?”, or “does a baseball shortstop have red hair?”, but these questions are all meaningless because they’re not part of the relevant structures. That’s what “moral” means here. You can ask whether two categories share objects all you like, but the question is as completely beside the point as either of the other two.

Posted by: John Armstrong on September 19, 2009 11:08 PM | Permalink | Reply to this

Re: Precise definitions

JA: two categories may have an object in common, but you should never use that fact.

AN: The construction that shows that opposite categories exist must use the fact that two categories may have an object in common.

JA: No more than the von Neumann construction shows that the number 3 must be an element of the number four.

If you define the number 3 as a von Neumann numeral, it must have this property. If you define the number 3 in a different way it doesn’t. It depends how you formulate your definition.

I was referring to the definition as found in any of the standard sources quoted.

I think a “should” has no place in mathematics, apart from the requirement that one should take as true only what axioms, definitions, and proved theorems say.

Thus any should in mathematics must be formalized in such terms. This is the only way to keep the semantics of mathematics precise.

JA: “You can ask “is 3 an element of 4”, […] but these questions are all meaningless because they’re not part of the relevant structures. That’s what “moral” means here.

I am insisting on giving this a formal meaning since otherwise it is not possible to teach it to an automatic system. But I don’t see any way to make the moral you want to impose formally precise in any system at all without violating this morality somewhere.

Posted by: Arnold Neumaier on September 20, 2009 12:12 PM | Permalink | Reply to this

Re: Precise definitions

I am insisting on giving this a formal meaning since otherwise it is not possible to teach it to an automatic system.

You have this precisely backwards. It’s perfectly simple to teach a formal system not to ask whether one number “is an element of” another number. Just don’t define “is an element of” to have any meaning for numbers.
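A tiny Haskell sketch of this point (the Nat type below is the usual zero/successor presentation, written out only for illustration):

    -- Natural numbers presented by their structure (zero and successor), not by a
    -- set-theoretic encoding.  No membership relation is defined on them, so
    -- "is 3 an element of 4?" is not false - it is not even a well-typed question.
    data Nat = Zero | Succ Nat deriving (Eq, Show)

    three, four :: Nat
    three = Succ (Succ (Succ Zero))
    four  = Succ three

    -- elem three four   -- rejected by the type checker: a Nat is not a container of Nats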

Posted by: John Armstrong on September 20, 2009 3:36 PM | Permalink | Reply to this

Re: Precise definitions

JA:two categories may have an object in common, but you should never use that fact.

JA: “You can ask “is 3 an element of 4”, […] but these questions are all meaningless because they’re not part of the relevant structures. That’s what “moral” means here.

AN: I am insisting on giving this a formal meaning since otherwise it is not possible to teach it to an automatic system.

JA: You have this precisely backwards. It’s perfectly simple to teach a formal system not to ask whether one number “is an element of” another number. Just don’t define “is an element of” to have any meaning for numbers.

My “this” referred to “the categorial moral”, not to this specific example.

In a typed system, the example already has the precise meaning of “undefined”. But the moral also contains your statement at the top of this message.

How do you make this precise enough that an automatic system does not feel entitled after having seen the definition of a category (Definition 1.1.1 in Asperti and Longo) to make Definition 1.3.1 (which uses \subseteq, which is defined only in terms of equality between objects)?

And why should one follow your injunction when the standard textbooks don’t follow it? (Definition 1.3.1 is completely standard.)

Posted by: Arnold Neumaier on September 21, 2009 2:46 PM | Permalink | Reply to this

Re: Precise definitions

And why should one follow your injunction when the standard textbooks don’t follow it? (Definition 1.3.1 is completely standard.)

You’re completely (intentionally?) missing the distinction I drew between a construction demonstrating the existence of a model of a structure and the subsequent use of the properties of a structure. As I said before, “moral” (which was someone else’s term) refers to the latter segment, not the former.

Von Neumann numerals are also “completely standard”, and in their construction some numbers are elements of other numbers, but once we’ve constructed this model (to show that a NNO exists) we never again use that fact because it’s a property of the model and not of the structure.

I’m done here.

Posted by: John Armstrong on September 21, 2009 3:51 PM | Permalink | Reply to this

Re: Precise definitions

I think a “should” has no place in mathematics, apart from the requirement that one should take as true only what axioms, definitions, and proved theorems say.

That’s very different from my philosophy, and from my experience doing mathematics and talking about it with other mathematicians. Some “should”s that I can think of off the top of my head: one should use enough generality but not too much, one should not use confusing variable names (even if they are formally correct), one should not use redundant or unnecessary axioms, one should choose the names of defined terms in a consistent way, and one should not invent and study concepts that have no motivation or relation to the rest of mathematics. Of course, as always, not everyone agrees on what one should and shouldn’t do, but I think mathematics is rife with normative judgements beyond truth and falsity.

Posted by: Mike Shulman on September 21, 2009 5:37 AM | Permalink | PGP Sig | Reply to this

Re: Precise definitions

AN: I think a “should” has no place in mathematics, apart from the requirement that one should take as true only what axioms, definitions, and proved theorems say.

MS: That’s very different from my philosophy, and from my experience doing mathematics and talking about it with other mathematicians. Some “should”s that I can think of off the top of my head: one should use enough generality but not too much […]

I don’t think these are shoulds that make a difference in mathematical understanding but in the quality of the resulting mathematics. Of course good mathematics is governed by lots of shoulds.

But mathematics itself is not. Shoulds have no place in the interpretation of the meaning of a well-formed piece of (even irrelevant, too abstract, too special, or too longwinded) mathematical text, and on what a mathematician is allowed to do with it without leaving the realm of the theory.

In our present discussion you had wondered about how much misunderstanding is possible by not knowing the shoulds.

I still think that I interpreted the definitions in an impeccable way, and indeed in the way the definitions are used (at least at times) by category theorists. I even believe that they cannot be interpreted in any other way from a strictly formal perspective. The moral seems to lie only in the labeling of some of it as evil or unnatural.

Anyway, I see signs of slow convergence towards a consensus!

Posted by: Arnold Neumaier on September 21, 2009 3:20 PM | Permalink | Reply to this

Re: can objects and morphisms belong to several categories?

MS: I would humbly suggest that an automatic system need only stumble upon them if it had been designed by a human mathematician who was unconversant with structural ways of thinking.

I have not the slightest idea how a system should be constructed that, without having any prior notion of category theory, would not - on being exposed to the Wikipedia article quoted - be able to construct categories satisfying the axioms that have common objects.

MS: I don’t see why it should become clumsy in typed language; we just overload Alt(5) so that it can mean either “the group Alt(5)” or “the group Alt(5) with its canonical action.”

Yes, of course. I didn’t say that one can’t solve many of these problems by overloading. I just explained the way such things were treated in the tradition I grew up with, and that it worked well.

MS: it seems to me that only if there is a notion of equality, in addition to a notion of isomorphism, can the problem of evil even be posed.

But there must be such a notion of equality intrinsic in any definition of category theory (as opposed to categories). One cannot purge this sort of evil.

One must know that if one takes two objects A, B from a category C, any two mentions of A are identical objects (not only the same up to isomorphism), while a mention of A and a mention of B are possibly identical. Otherwise one cannot embed category theory into standard logic, and structural concepts such as automorphisms would not make sense.

One also needs it in order to be able to define opposite categories in the standard way, even if one subsequently forgets how they were constructed.

MS: this seems to me exactly what FMathL is doing, when you first construct something, then define the class of things isomorphic to it, and redefine that object to be something arbitrarily chosen from that class. You use a particular construction, then you discard its details.

Of course, this was frequently done in mathematics, already before categories were born. But the point is that FMathL can choose when to do it, and indicates it (as Bourbaki would have done), while in category theory, one is forced to do it, even when one does not want to do it.

For example, if one treats posets arising in practical programming as categories, one almost always needs them in their concretely defined form, and not only up to isomorphism. And it is clear that categories, as defined everywhere, may have this concrete form.

Thus FMathL preserves an important freedom of mathematicians that an equality-free categorial approach tries to forbid for purist reasons. But it also accommodates a purist categorial approach as you propose it, just because of your observation.

MS: I don’t know what you mean by the “constructive approach” here, nor what other approach you are contrasting it with.

I had contrasted the constructive approach that defines quaternions via a construction and the specification approach that defines quaternions via a characterization. You had then mentioned that initially, a characterization might not be available, and I had replied why this does not make the FMathL approach unattractive.

MS: I expect that basically all books use the same construction of the rational numbers in terms of the integers.

Here is an exception: Arnold Neumaier, Analysis und lineare Algebra, Lecture Notes in German

Posted by: Arnold Neumaier on September 19, 2009 7:13 PM | Permalink | Reply to this

Re: can objects and morphisms belong to several categories?

I have not the slightest idea how a system should be constructed that, without having any prior notion of category theory, would not - on being exposed to the Wikipedia article quoted - be able to construct categories satisfying the axioms that have common objects.

How about a system which is structural, and therefore which regards the question of whether two different structures (such as two different categories) have common elements as a type error?

I didn’t say that one can’t solve many of these problems by overloading.

No, but what you did say was:

I know that this can be reformulated in categorial terms, but if we want to maintain the same degree of precision, it becomes clumsy.

I was pointing out that we can maintain the same degree of precision in a structural theory with overloading, without it becoming clumsy.

One must know that, if you take two objects A, B from a category C, any two mentions of A are identical objects (not only the same up to isomorphy)

Actually, it’s good enough if any two mentions of A are connected by a specified isomorphism in a coherent way. But there is a distinction between naming a given object and asking whether two given objects are identical.

in category theory, one is forced to do it, even when one does not want to do it.

This is not true. You can always keep the extra structure around which you used to construct the gadget; you don't have to forget about it. If you construct the reals as a subset of P(\mathbb{Q}) using Dedekind cuts, you don't have to then forget about the \in relation relating them to \mathbb{Q}.
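
For concreteness, one common form of that construction (my notation, just to fix ideas) takes

    \mathbb{R} := \{ L \in P(\mathbb{Q}) \mid L \text{ is inhabited, bounded above, downward closed, and has no greatest element} \},

and then, for q \in \mathbb{Q} and r \in \mathbb{R}, the statement q \in r is literal set membership: exactly the relation one is free to keep.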

Posted by: Mike Shulman on September 19, 2009 8:03 PM | Permalink | PGP Sig | Reply to this

Re: can objects and morphisms belong to several categories?

MS: How about a system which is structural, and therefore which regards the question of whether two different structures (such as two different categories) have common elements as a type error?

It would report a type error in the standard definition of the opposite category, if written down formally.

MS: I was pointing out that we can maintain the same degree of precision in a structural theory with overloading, without it becoming clumsy.

Not on the surface, because of the overloading, but inside the system, which must track all that overloading. One needs to consider both aspects - efficiency for the user and efficiency for the system.

MS: Actually, it's good enough if any two mentions of A are connected by a specified isomorphism in a coherent way

So with n mentions, you create an internal overhead of O(n^2). In a lengthy proof, n can be large. Thus an efficient implementation, at least, cannot pretend that it adheres to the categorial moral.

But a solid foundation must also be able to describe what happens on the implementation level.

MS: But there is a distinction between naming a given object and asking whether two given objects are identical.

I don’t think this is good enough. For often one may want to derive results that hold for general A, B, and later one may want to use these results for the special case where A and B are the same.

MS: If you construct the reals as a subset of P(\mathbb{Q}) using Dedekind cuts, you don’t have to then forget about the \in relation relating them to \mathbb{Q}.

How do I then refer to the reals with \in as opposed to the reals without \in? They are no longer the same objects, although I constructed the latter to be the former.

Such a schizophrenic state of affairs is avoided when using Bourbaki’s choice operator.

Posted by: Arnold Neumaier on September 19, 2009 8:38 PM | Permalink | Reply to this

Re: can objects and morphisms belong to several categories?

It would report a type error in the standard definition of the opposite category, if written down formally.

It wouldn't, because we never ask (then or afterwards) if an object of C is the same as an object of C^{op}.

In formalising mathematics without a fundamental global equality, it's important to distinguish the external judgement that two terms are syntactically identical from the internal proposition that two terms refer to the same object. If A is an object of C, then you may interpret A as an object of C^{op}, without even an abuse of language. (I'm not sure that the last clause is correct in Mike's favourite foundations, but it is in mine, which are more type-theoretically oriented.) But you can't introduce A as an object of C and B as an object of C^{op} and ask whether A = B, or even whether A \cong B; that's a type error.
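
To make the type discipline concrete, here is a minimal sketch in Python with type hints (the names and the setup are mine and purely illustrative, not anybody's proposed foundation):

    from typing import NewType

    # Objects of C and objects of C^{op} get distinct (wrapped) types, even though
    # the underlying data may be identical.
    ObC = NewType("ObC", str)      # "an object of C"
    ObCop = NewType("ObCop", str)  # "an object of C^{op}"

    def as_object_of_Cop(a: ObC) -> ObCop:
        """Reinterpret an object of C as an object of C^{op}."""
        return ObCop(a)

    def same_object_of_Cop(x: ObCop, y: ObCop) -> bool:
        """Equality is only asked of two objects of the same category."""
        return x == y

    A = ObC("A")
    B = ObCop("B")

    same_object_of_Cop(as_object_of_Cop(A), B)   # well-typed: both live in C^{op}
    # same_object_of_Cop(A, B)                   # a static checker such as mypy rejects this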

[…] often one may want to derive results that hold for general A, B, and later one may want to use this result for the special case where A and B are the same.

I would handle this by substitution, just as I would if I wanted the result for the special case where A is x^2 + 2 and B is x + y - \Sigma (in a context where those terms make sense and have the right type). I know that people write ‘If A = B, then […]’, but I take this as abuse of language (or syntactic sugar) for ‘Setting A to B, […]’ or ‘Setting B to A, […]’ (depending on which symbol is used in the sequel).

Posted by: Toby Bartels on September 19, 2009 9:22 PM | Permalink | Reply to this

Re: can objects and morphisms belong to several categories?

Thanks, Toby, that’s exactly what I meant.

If A is an object of C, then you may interpret A as an object of C^{op}, without even an abuse of language. (I’m not sure that the last clause is correct in Mike’s favourite foundations

It is. At least, insofar as I have a favorite foundation.

Posted by: Mike Shulman on September 20, 2009 5:42 AM | Permalink | PGP Sig | Reply to this

Re: can objects and morphisms belong to several categories?

TB (in an earlier mail): I’d also like to see a formalisation that takes both their definition, and their statement that begins the Exercise in that section (using the previous Definition 1.3.1), as literally true! It seems doubtful to me; I know what they mean, but I need to translate it.

I’d like to see a formulation that takes both their definition, and their statement that begins the Exercise in that section (using the previous Definition 1.3.1), and rewrites it in a form in which it can be taken as literally true. The moral of mathematics requires that a definition can be read as literally true in the sense that any abuses of language are explained somewhere in sufficient detail that they can be undone.

Looking at other treatises, I can see nothing in Definitions 1.1.1 and 1.3.1 that is not standard. (At least two of the other sources I had quoted have identical requirements.)

So please provide a reading that has no unexplained abuses of language.

MS: How about a system which is structural, and therefore which regards the question of whether two different structures (such as two different categories) have common elements as a type error?

AN: It would report a type error in the standard definition of the opposite category, if written down formally.

TB: It wouldn’t, because we never ask (then or afterwards) if an object of C is the same as an object of C^{op}. […] If A is an object of C, then you may interpret A as an object of C^{op}, without even an abuse of language. But you can’t introduce A as an object of C and B as an object of C^{op} and ask whether A = B, or even whether A \cong B; that’s a type error.

On the surface, this looks like a solution. Your suggestion amounts to having equality on the metalevel but forbidding it on the object level. But I think this does not hold water. I don’t think your suggestion can be implemented consistently in a fully formalized way without producing contradictions.

For to say that A is an object of C is formalized as A \in Ob_C, and to say that B is an object of C^{op} is formalized as B \in Ob_{C^{op}}. Now Definition 1.3.2 in Asperti and Longo, Categories, Types and Structures implies, using Definition 1.1.1, that Ob_C = Ob_{C^{op}} =: X, say. Now X is a collection, and (at least when identifying collections with certain ETCS-sets or SEAR-sets, say) one can compare elements from X for equality.

But A and B are elements from X, so they can be compared for equality. A formal theorem explorer has no way to avoid this conclusion. Once it draws this conclusion it produces a type error, and exits, not being able to continue to explore the current context.

One cannot escape here to a metalevel since there is no way to feed a theorem explorer unformalized stuff.

The same problem appears with Definition 1.3.1 of a subcategory. Once you have Ob_D \subseteq Ob_C, nothing can forbid comparing an element of Ob_D with an element of Ob_C without creating an inconsistency.

Note that this book was written for readers not exposed to categories before. It is difficult for any such reader who takes these definitions seriously to arrive at any other conclusion.

Posted by: Arnold Neumaier on September 20, 2009 11:35 AM | Permalink | Reply to this

Re: can objects and morphisms belong to several categories?

I'd also like to see a formalisation that takes both their definition, and their statement that begins the Exercise in that section (using the previous Definition 1.3.1), as literally true! It seems doubtful to me; I know what they mean, but I need to translate it.

I’d like to see a formulation that takes both their definition, and their statement that begins the Exercise in that section (using the previous Definition 1.3.1), and rewrites it in a form in which it can be taken as literally true.

Yes, I see that this is your quest. If FMathL can do that with this passage, then that would show its power.

So please provide a reading that has no unexplained abuses of language.

Honestly, I think that (especially since this is an introductory text) the phrasing of the Exercise was probably a mistake. They might just try this:

\mathbf{Set} is a subcategory of \mathbf{Rel}. Is it a full subcategory?

Of course, that's not the same exercise, and it should go immediately after Definition 1.3.1.

The real meaning of Exercise 1.3.2 is

\mathbf{Set}^{\mathbf{op}} is obviously equivalent to a subcategory of \mathbf{Rel}. Is it a full subcategory?

But since they haven't defined equivalence of categories yet, this is an inappropriate exercise at that point. (Actually, the relation of \mathbf{Set}^{\mathbf{op}} to the desired subcategory of \mathbf{Rel} is stricter than equivalence, but still not anything that they've defined yet.)

Another way out is to interpret Definition 1.3.1, particularly the requirement that \mathbf{D}[a,b] \subseteq \mathbf{C}[a,b], in a structural way to mean that \mathbf{D}[a,b] is equipped with an injection to \mathbf{C}[a,b]. I don't think that this is how they intended it, since in an introductory book you ought to explain that sort of thing. But if the authors are deep into the structural framework, then they might have been thinking this without realising it.
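
Spelled out (in my own formulation), that structural reading says: a subcategory of \mathbf{C} is a category \mathbf{D} together with an injection i : Ob_{\mathbf{D}} \to Ob_{\mathbf{C}} and injections

    i_{a,b} : \mathbf{D}[a,b] \to \mathbf{C}[i a, i b]

satisfying i_{a,a}(\mathrm{id}_a) = \mathrm{id}_{i a} and i_{a,c}(g \circ f) = i_{b,c}(g) \circ i_{a,b}(f); in other words, a faithful functor \mathbf{D} \to \mathbf{C} that is injective on objects.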

I'm interested in how you interpret the claim that \mathbf{Set}^{\mathbf{op}} is a subcategory of \mathbf{Rel}. Is it easy to understand what it means, and what exactly does it mean? Should FMathL accept it, and how should it interpret it?


If A is an object of C, then you may interpret A as an object of C^{op}, without even an abuse of language. But you can't introduce A as an object of C and B as an object of C^{op} and ask whether A = B, or even whether A \cong B; that's a type error.

On the surface, this looks like a solution. Your suggestion amounts to having equality on the metalevel but forbidding it on the object level. But I think this does not hold water. I don’t think your suggestion can be implemented consistently in a fully formalized way without producing contradictions.

You're right; what I've said here is contradictory. If (as I said) you may interpret an object A of C as an object of C^{op}, then you can compare it to the object B of C^{op}, since any two objects of C^{op} may be compared (for isomorphism at least). I wasn't thinking about it carefully enough, and I apologise.

(I stand by what I said about distinguishing identity judgements from equality propositions, although this does not appear to be a place that it applies. In fact, I think that it must be irrelevant to what we're discussing, since it's a criticism of ETCS as much as of anything else. So never mind.)

Since we're worrying here about typing errors when comparing two objects of two categories, maybe I should go back to the beginning and say what I think about that, without suggesting that Mike or anybody else would agree with me. (I know that there are differences between Mike's and my philosophy, and we are getting close to some of them.)

Given two arbitrary types (where a type might be the type of elements of a set, or the type of objects of a category, or something else) X and Y, and given A of type X and B of type Y, it doesn't normally make sense to ask whether A = B. However, it is not good design for a mathematical formaliser to throw up an error whenever anybody writes A = B in this context. First it should try to reduce the expressions for X and Y (especially if this is something that can always be efficiently strongly normalised) to see if they come out the same. Even if that fails (and especially if reduction is not confluent or was not completed), then it should give the user an opportunity to specify a type Z (which might be either X or Y) and operations to Z from X and Y respectively. If this works, then A and B may now be interpreted as having the same type, and A = B presumably makes sense.

In particular, if G and H have just been introduced as two groups in a context appropriate for group theory, with X and Y the types of elements of G and H respectively, then there is no way to avoid the type error. But if instead Y is the type of elements of G^{op}, then it is easy to avoid the type error; probably the system can do it automatically (in fact, probably one has Y defined directly as X).

I have used groups here instead of categories to avoid the evil of asking whether two objects in a single given category are equal, which is a different issue (related to that stuff about identity judgements). Of course, the elements of a group correspond to the morphisms of a category, but I don't think that this makes a difference here.
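
For whatever it's worth, here is a rough sketch of that behaviour in code (Python; the names and the reduction/coercion hooks are mine and purely illustrative, not any existing system's interface):

    # Hypothetical sketch: before reporting a type error on "A = B", first try to
    # reduce the two type expressions; if that fails, ask for a common type Z
    # with coercions into it, and compare there.

    def try_equate(A, type_X, B, type_Y, reduce_type, ask_for_coercions):
        X, Y = reduce_type(type_X), reduce_type(type_Y)
        if X == Y:
            return A == B                        # same type after reduction: compare directly
        coercions = ask_for_coercions(X, Y)      # e.g. supplied interactively by the user
        if coercions is None:
            raise TypeError(f"cannot compare values of types {X} and {Y}")
        from_X, from_Y = coercions
        return from_X(A) == from_Y(B)            # compare inside the common type Z

With reduce_type normalising, say, ‘elements of G^{op}’ to ‘elements of G’, the comparison in the opposite-group example above goes through automatically, while two unrelated groups still produce the error.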

Posted by: Toby Bartels on September 20, 2009 7:42 PM | Permalink | Reply to this

Re: can objects and morphisms belong to several categories?

TB: If FMathL can do that with this passage, then that would show its power.

To teach FMathL this power, I first need to understand what “should” be understood after having read Definition 1.1.1 and what after Definition 1.3.1.

You explained only how the exercise should then be understood, in terms of concepts not yet introduced.

How can an automatic system understand things at the very introduction of a theory when the intentions are formulated so poorly?

This is why I had asked. So let me rephrase my request:

I’d like to see a few pages of text that introduce in a formally precise way the full supposed content of what the trained category theorist understands that these two definitions and the exercise should have conveyed to the reader, including any moral an automatic system should follow in interpreting the remainder of category theory, and explaining any abuse of notation or language that is apparent from the presentation of these two definitions in the standard textbooks.

If I get such a description, and if it is logically consistent, I will guarantee that FMathL will be provided with a generic mechanism for interpreting the definitions as written by Asperti and Longo in the correct way, and for identifying a meaningful interpretation of the exercise (together with raising a flag for having discovered a sloppiness).

But first I need to understand myself clearly enough what you read into the text morally although it is not written there formally.

Posted by: Arnold Neumaier on September 21, 2009 2:43 PM | Permalink | Reply to this

Re: can objects and morphisms belong to several categories?

I first need to understand what “should” be understood after having read Definition 1.1.1 and what after Definition 1.3.1.

I think that what “should” be understood at this point is that the authors made a mistake in stating the exercise. Seasoned categorists can guess what the authors might have meant, but I would not expect an undergraduate without experience in category theory to be able to.

Posted by: Mike Shulman on September 21, 2009 6:03 PM | Permalink | PGP Sig | Reply to this

Equality between objects of different types

TB: given A of type X and B of type Y, it doesn’t normally make sense to ask whether A = B. […] it should give the user an opportunity to specify a type Z (which might be either X or Y) and operations to Z from X and Y respectively. If this works, then A and B may now be interpreted as having the same type, and A = B presumably makes sense.

I think something like that is feasible in FMathL. I’ll keep it in mind in its design.

In this connection, would you consider each category to be a separate type? Each object of a category? Each Homset? (I am not sure whether all three simultaneously may be required consistently.)

Posted by: Arnold Neumaier on September 21, 2009 2:52 PM | Permalink | Reply to this

Re: Equality between objects of different types

I would consider the class of objects of each category to be a separate type, and each homset in each category to be a separate type. In general, I don’t think of a single object of a category as a type (in general, there’s no way for it to have “elements”), although for some particular categories such as Set they can be interpreted that way.

Posted by: Mike Shulman on September 21, 2009 6:13 PM | Permalink | PGP Sig | Reply to this

Re: Equality between objects of different types

In general, I don’t think of a single object of a category as a type (in general, there’s no way for it to have “elements”), although for some particular categories such as Set they can be interpreted that way.

For students and curious lurkers, see [[concrete category]].

Posted by: Eric Forgy on September 21, 2009 7:54 PM | Permalink | Reply to this

Reflection

TB: I should have asked if there was an easy user-friendly way to import it.

What can be more user-friendly than typing “import file.con”, or dragging an icon for file.con into the current context window? If you accept everything the imported context is regulating, this is enough. Otherwise you need to edit file.con to suit your needs before importing it. Just deleting something is easy; other things depend on what you want to change, and how.

TB: I’m worried about an analogue of the mismatch between the internal and external languages of a topos that is not well-pointed.

I can’t tell exactly what you are aiming at, but there is always a kind of mismatch between a subject level (probably your external language) and the object level (probably your internal language).

Because of Goedel’s theorem, a subject level is always strictly stronger in proving power than the object level (unless both are inconsistent).

However, there is no analogue of Goedel’s theorem for the descriptive power of a system. A system with weak proving power can still have descriptive power sufficient to represent all of mathematics, including proofs. The reason is that while finding proofs is undecidable in general, checking proofs is constructively possible under quite modest assumptions about the logic.
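
To make the contrast concrete, a toy sketch of my own (not part of FMathL): a checker only has to scan a finished proof line by line, verifying that each line is an axiom or follows from earlier lines by a rule, whereas a prover has to search an unbounded space of candidate proofs.

    # Toy proof checker for a Hilbert-style propositional system whose only rule
    # is modus ponens; formulas are plain strings, axioms a given set of strings.

    def check_proof(lines, axioms):
        """Accept the proof iff every line is an axiom or follows by modus ponens."""
        proved = []
        for formula in lines:
            ok = formula in axioms or any(
                implication == f"({premise} -> {formula})"
                for premise in proved
                for implication in proved
            )
            if not ok:
                return False
            proved.append(formula)
        return True

    # Example: from the axioms "p" and "(p -> q)", the proof ["p", "(p -> q)", "q"] checks.
    assert check_proof(["p", "(p -> q)", "q"], {"p", "(p -> q)"})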

Thus even though the current FMathL framework supports only a truncated set theory, having power sets only for sets of size up to the continuum, the reflected level (in which all formal reasoning happens) can check all proofs in axiomatic set theory, even those involving inaccessible cardinals. It is only required that you specify the latter in the FMathL specification language, which is based on the truncated set theory.

And it can check all proofs presented in Bishop-type constructive mathematics, if you specify the latter in the FMathL specification language, although the logic in which the specification language is defined is classical.

TB: By Excluded Middle, a set is either inhabited or not; an uninhabited set is (by definition) empty; hence a nonempty set is inhabited. So if there is a global choice function for inhabited sets, then there is one for uninhabited sets.

I see. So one would have to restrict the choice somehow, and change that part of the context. I am not very familiar with the various ways of defining restricted choice, though; maybe you can help me in saying how much choice you want to allow. In any case, I’ll give it thought.

To teach the system the moral of categories, it seems to me necessary and probably sufficient to have choice for elements of equivalence classes.

TB: Give me a paper whose source is available (say from the arXiv), formalised in ZFC (or whatever), and I’ll rewrite it to be formalised in ETCS.

AN: I do not have the patience to undo all the abuses of notation traditionally used in the context of ZFC.

TB: I don’t understand what you’re asking of categorial foundations, then, if no other foundation does it.

You were asking for something formalized in ZFC. I can’t supply that, since I regard abuses of notation as lack of formalization. I mentioned here what I consider a sufficient level of formality.

Thus I am satisfied if you can translate a typical introduction to axiomatic set theory (until natural numbers and functions are available) and an introduction to logic (until variable substitutions and ZFC are available) into a document starting from scratch with the axioms for a category, at a comparable level of rigor and needing only a comparable number of pages.

TB: Or do you claim that FMathL does this? Can I take http://www.mat.univie.ac.at/~neum/ms/fmathl.pdf as my text?

The reflection part is only promised there and outlined, not realized. We are working on a prototype version, and hope to have one by the end of the year. At least, FMathL will not have abuses of notation since whatever remains of them will be valid notation rather than abuse. Thus the level of rigor will be higher than in a typical textbook.

TB: we still don’t need a full-fledged set theory (or other foundation of all mathematics), just a way to talk about recursively enumerable sets of natural numbers.

Yes, something like this is sufficient. But if you think that this saves a lot of work, you are mistaken. You can relax the axiom of power sets and delete the axiom of foundation, but you need equivalents of all the rest to be able to define recursively enumerable sets of natural numbers. And you need to build up quite a lot of conceptual machinery before you have all the concepts and properties one is using there without thinking.

The complexity of complete formal foundations does not become apparent until one has tried to find some!

TB: In particular, the requirement of a countable set of variables can be replaced with a single variable x and the requirement that X’ is a variable whenever X is; there is no need to say Cantor’s word “countable”.

True. But then you must stick to that convention later on in whatever you do since you do not have anything else. This will make your foundation nearly incomprehensible to a human reader. For example, if you agree with the typing paradigm, it will make type checking extremely tedious. But foundations should be easily checkable by hand and teachable in the classroom!

Posted by: Arnold Neumaier on September 19, 2009 9:53 PM | Permalink | Reply to this

Re: Reflection

So one would have to restrict the choice somehow, and change that part of the context. I am not very familiar with the various ways of defining restricted choice, though; maybe you can help me in saying how much choice you want to allow.

I really only want (as the default, until I choose something stronger) the axiom of unique choice: When A is an inhabited set such that any two elements of A are equal, then we have an element Choice(A) of A. Of course, this isn't going to work for the uses to which you put the global choice operator.

Another possibility is to refuse to allow, from the hypothesis that A = B, the conclusion that Choice(A) = Choice(B) (even given that A and B are inhabited). This should prevent the proof of Excluded Middle (in a framework with intuitionistic logic), while still allowing a definition like Choice(CompOrdFld) for \mathbb{R}. I don't think that you should ever need to deduce, say, Choice(CompOrdFld) = Choice(CompArchFld) from CompOrdFld = CompArchFld (if you ever even want to say the latter).

Posted by: Toby Bartels on September 19, 2009 11:04 PM | Permalink | Reply to this

Re: Reflection

TB: I really only want (as the default, until I choose something stronger) the axiom of unique choice. […] Of course, this isn’t going to work for the uses to which you put the global choice operator.

Yes, this is too weak for FMathL purposes. But note that there is no default on the reflection level. The user decides which parts of the FMathL framework reflection should be used.

The FMathL default defined in the framework paper only controls the meaning of the specification language in which everything formal is then represented.

TB: Another possibility is to refuse to allow, from the hypothesis that A=B, the conclusion that Choice(A)=Choice(B)

This is against the basic principle of FMathL that substitution of equals is unrestricted.

This means that someone would have to write a UniqueChoice context and create workarounds of all the uses of standard Choice in the FMathL reflection.

Once this is done, working formally in your setting is as easy as working in the standard context.

But the library of theorems in that context would have to be recreated from the standard library by checking which proofs still hold in the new context, and by trying to repair the others.

However this is not more difficult than what anyone has to do who changes something in established foundations. In FMathL it is probably even easier since the precise semantics will help in automatic translation and checking.

Posted by: Arnold Neumaier on September 20, 2009 11:55 AM | Permalink | Reply to this

Re: Reflection

Thus I am satisfied if you can translate a typical introduction to axiomatic set theory (until natural numbers and functions are available) and an introduction to logic (until variable substitutions and ZFC are available) into a document starting from scratch with the axioms for a category, at a comparable level of rigor and needing only a comparable number of pages.

As I feel that I’ve said numerous times, a structural/type-theoretic foundation does not need to start with the axioms for a category.

I feel sure that I could do this, but unfortunately I don’t have anywhere near the time it would take at the moment. (-: Sorry if that sounds like a cop-out, but actually I don’t really even have the time to be engaging in this discussion…

Posted by: Mike Shulman on September 20, 2009 6:26 AM | Permalink | PGP Sig | Reply to this

Re: Reflection

MS: a structural/type-theoretic foundation does not need to start with the axioms for a category.

Yes, and then there are fewer problems. I’ll look at SEAR from this point of view, once I have more time to study it more deeply.

MS: I feel sure that I could do this, but unfortunately I don’t have anywhere near the time it would take at the moment. (-: Sorry if that sounds like a cop-out, but actually I don’t really even have the time to be engaging in this discussion…

I understand. This rhetorical request was meant as an explication of what would revise my current judgment on the overhead of categorial foundations, rather than as a request that you or TB should actually do this; especially since I know that you prefer structural non-categorial foundations.

Posted by: Arnold Neumaier on September 20, 2009 12:22 PM | Permalink | Reply to this

Re: Reflection

I am satisfied if you can translate a typical introduction to axiomatic set theory (until natural numbers and functions are available) and an introduction to logic (until variable substitutions and ZFC are available) into a document starting from scratch with the axioms for a category, at a comparable level of rigor and needing only a comparable number of pages.

For the first one, how about Sections 1.3 and 2.1–5 of http://www.math.uchicago.edu/~mileti/teaching/math278/settheory.pdf, which I found by Googling "axiomatic set theory"? For the second one, I'm not sure what kind of work you're thinking of, so an example would help.

I reserve the right for lack of time not to complete these, but I'll try to at least indicate how they would be done.

Posted by: Toby Bartels on September 20, 2009 7:44 PM | Permalink | Reply to this

Re: Reflection

AN: a typical introduction to axiomatic set theory (until natural numbers and functions are available) and an introduction to logic (until variable substitutions and ZFC are available)

TB: For the first one, how about Sections 1.3 and 2.1–5 of http://www.math.uchicago.edu/~mileti/teaching/math278/settheory.pdf?

I think to be able to reflect the notion of an expression one also needs finite products and recursive definitions; thus you’d add Sections 2.7 and 2.8. Section 2.9.1 is also used in the standard descriptions of logic.

Thus 17 pages comprising elementary axiomatic set theory are sufficient to reflect logic.

TB: For the second one, I’m not sure what kind of work you’re thinking of, so an example would help.

I have no good online reference (although there might well be some; I didn’t search thoroughly). But Sections 1–13 of Chapter 5 of the book “The mathematics of metamathematics” by Rasiowa and Sikorski are a suitable template. They present, in about 50 verbose pages (one may skip Sections 13A–D), intuition and formality about reflecting mathematical theories, and in particular reflect ZF in Section 13E, using on the informal level informal notions of sets, functions, and sequences (i.e., what axiomatic set theory tells us can be encoded as such).

Thus 50 pages of elementary logic are sufficient to reflect axiomatic set theory.

Therefore, self-reflective (and hence fully self-explaining) foundations of traditional set-based mathematics can be developed in about 70 pages of mathematical text, at the usual level of mathematical rigor and in a language easily understood by anyone who passed the undergraduate math stage.

Posted by: Arnold Neumaier on September 21, 2009 11:01 AM | Permalink | Reply to this

Re: Reflection

All right, I'll do Section 1.3 and start on Chapter 2 of Mileti, and hopefully the point at which it is clear that I could continue indefinitely will come before the point at which it is tiresome to continue. (^_^)

I should be able to look at Rasiowa & Sikorski later this week. (It's been recommended to me before, in other contexts, so it's about time that I did!) Hopefully, that will just be a matter of checking that the informal notions used have already been successfully formalised in the translation of Mileti to categorial foundations.

Posted by: Toby Bartels on September 21, 2009 11:00 PM | Permalink | Reply to this

Re: Reflection

The document temporarily at http://tobybartels.name/settheory.ps does this through Section 2.5. The first parts are completely redone, but afterwards it becomes almost a matter of cutting and pasting. The rest (except Section 2.6, not in the assignment) would be even more like that, although I could do it if there are still questions as to how it would go.

I tried to make the development as tight as possible, although I couldn't help but point out a few things, like the universal property of the Cartesian product and which proofs require classical logic. As Mike did with SEAR, I took axioms that focus on elements rather than arbitrary functions. However, since the assignment was to start with the elementary axioms of a category, I defined an element to be a certain sort of function, as in ETCS.

Personally, I would start with something more like SEAR (or SEPS) and take a leisurely pace, pointing out all of the sights along the way. But that would not meet the assignment.

Posted by: Toby Bartels on October 15, 2009 11:45 AM | Permalink | Reply to this

Re: can objects and morphisms belong to several categories?

One needs to consider both aspects - efficiency for the user and efficiency for the system.

That’s a fair point, although I feel that types are important enough that if I were designing a system, I would want to put in whatever extra effort is necessary to include them.

BTW, I didn’t mean my mention of the possibility of connecting occurrences of A by isomorphisms as a serious suggestion that one implement it; my preferred solution is what Toby said in response.

How do I then refer to the reals with ∈ as opposed to the reals without ∈? They are no longer the same objects

They are the same objects. \in is extra structure on the same set of elements.

Posted by: Mike Shulman on September 20, 2009 6:13 AM | Permalink | PGP Sig | Reply to this

Re: can objects and morphisms belong to several categories?

AN: How do I then refer to the reals with ∈ as opposed to the reals without ∈? They are no longer the same objects

MS: They are the same objects. ∈ is extra structure on the same set of elements.

This makes sense only if you agree that two categories based on the same set of objects with different extra structure on it have the same objects. But I thought this is something you wanted to avoid!

The problem is that an automatic system must translate these sorts of statements into a well-defined fully formalized statement that can be fed to a theorem prover for checking proofs involving them.

I still don’t see how this can be done, given the standard axioms for category theory and its derived conceptual basis as given in standard works.

Posted by: Arnold Neumaier on September 20, 2009 8:51 AM | Permalink | Reply to this

Re: can objects and morphisms belong to several categories?

They are the same objects. \in is extra structure on the same set of elements.

This sounds so strange to me that I'm sure that there must be a misunderstanding here, either of Mike or by Mike.

We have the complete ordered field \mathbb{R} of real numbers, and we have \mathbb{R} equipped with a binary relation \in from \mathbb{Q}. (Presumably, \in is either \lt, \gt, \leq, or \geq, but we haven't specified.) These are different things, not in the sense that we would put \ne between them (which would be a typing error), but in the sense that we would not accept = between them (which would also be a typing error).

However, there is also an obvious forgetful operation (I won't say ‘functor’ since I haven't specified categories, although it would be easy to do that) from the latter to the former. A user-friendly system for mathematics should even be able to detect this and put it in automatically wherever it's needed.
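
In code one might caricature the situation as follows (Python; the names are mine and purely illustrative): the structured thing bundles the underlying object with the extra relation, and the forgetful operation just drops it.

    from dataclasses import dataclass
    from typing import Callable

    @dataclass(frozen=True)
    class RealsWithMembership:
        reals: frozenset                         # crude stand-in for the underlying set of reals
        member: Callable[[float, float], bool]   # stand-in for the relation "q \in r"

    def forget(structured: RealsWithMembership) -> frozenset:
        """The obvious forgetful operation: keep the set, drop the extra structure."""
        return structured.reals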

Posted by: Toby Bartels on September 20, 2009 6:18 PM | Permalink | Reply to this

Re: objects and morphisms can belong naturally to several categories

Definition 1.3.2 of Asperti and Longo, Categories, Types and Structures

I'd also like to see a formalisation that takes both their definition, and their statement that begins the Exercise in that section (using the previous Definition 1.3.1), as literally true! It seems doubtful to me; I know what they mean, but I need to translate it.

Posted by: Toby Bartels on September 18, 2009 10:32 PM | Permalink | Reply to this

Re: objects and morphisms can belong naturally to several categories

I’d also like to see a formalisation that takes both their definition, and their statement that begins the Exercise in that section (using the previous Definition 1.3.1), as literally true!

This is a really good example of the sort of mistakes you can end up making when you do anything with categories non-structurally. Category theory wants to be purely structural with all its heart ♥. (-:

Posted by: Mike Shulman on September 19, 2009 2:33 AM | Permalink | PGP Sig | Reply to this

Re: objects and morphisms can belong naturally to several categories

ordered monoids are the objects in the intersection of Order and Monoid satisfying the compatibility relation R (suitably defined)

I would like to see you formalise this! Or the (related but perhaps simpler) statement that Group is a subcategory of Set. I think that a lot of mathematicians do think that way, but it is difficult to formalise; the difficulties have to do with the difference between structure and properties. I usually see it as a hallmark of sophisticated understanding to drop these ideas, but sometimes I also wonder how far they can be maintained.

Posted by: Toby Bartels on September 18, 2009 9:55 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

It also seems to me that all the things about real numbers you are saying that FMathL does right, it only does right specifically for real (or complex) numbers, because you have built them into the system by fiat. If FMathL didn’t include axioms specifying that there was a particular set called \mathbb{R} with particular properties, then you’d have to construct it just like in any other set theory, and in particular you’d have to choose an implementation (thereby making 1\in\sqrt{2} either true or false), and it would also no longer be true that 2\in\mathbb{N} and 2\in\mathbb{R} were the same thing.

But \mathbb{R} and \mathbb{C} are by no means the only mathematical objects that have such problems! For example, suppose I want to study the quaternions \mathbb{H} in FMathL. I could define them as ordered pairs of complex numbers, or as 2\times 2 complex matrices, or as 4\times 4 real matrices, and in each case different “accidental” things would be true about them just as in ZF, and in no case would \mathbb{C} actually be a subset of \mathbb{H}. And so on.

Posted by: Mike Shulman on September 18, 2009 9:24 AM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

MS: you’d have to choose an implementation (thereby making 1\in\sqrt{2} either true or false)

No. I can define the domain COF of all complete ordered fields, show by some construction that COF is not empty, and then specify \mathbb{R} as Choice(COF). This selects a unique, implementation-dependent copy of the real numbers without any accidental properties. (Maybe this will be the form of the actual later implementation. These things will be decided only after we have done enough experiments.)

MS: But \mathbb{R} and \mathbb{C} are by no means the only mathematical object that has such problems! For example, suppose I want to study the quaternions \mathbb{H} in FMathL. I could define them as ordered pairs of complex numbers, or as 2×2 complex matrices, or as 4×4 real matrices, and in each case different “accidental” things would be true about them just as in ZF, and in no case would \mathbb{C} actually be a subset of \mathbb{H}.

Indeed, if you define them in this way, this is what happens. But I would call these not definitions but constructions, and constructions usually have accidental properties.

To get things right without any artifacts, one needs to think more categorially, and define \mathbb{H} as Choice(X), where X denotes the domain of all skew fields that contain \mathbb{C} as a subfield of index 2. Any of the constructions you gave shows that X is nonempty, so that this recipe defines a unique existing object whose only decidable properties are those that one can derive from the assumptions made. (It has in addition lots of undecidable, implementation-dependent properties, though.)

The use of Bourbaki’s global choice operator is essential for this golden road to the essence of mathematics.

Posted by: Arnold Neumaier on September 18, 2009 1:10 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

It sounds like you’re saying that in order to construct anything in FMathL without extraneous details I need to find a way to describe it in terms that characterize it uniquely. That seems to me like a mighty tight straightjacket!! Suppose I’m Hamilton inventing the quaternions, and maybe I’m ahead of my time and I’ve realized that they could be constructed from the complex numbers in several different ways, but I don’t yet have any idea how to characterize them uniquely. I didn’t set out to study skew fields containing \mathbb{C} as a subfield of index 2, I set out to study a particular thing, which could be constructed in several ways, and only later discovered that it was the unique skew field containing \mathbb{C} with index 2. It seems to me that uniqueness theorems such as these generally come later, after a new object has been studied in its own right for a while and its essential properties isolated.

Here’s a more modern example: what about the stable homotopy category? It has lots of different constructions; you can start with lots of different kinds of point-set-level spectra. But although all these constructions give the same result, off the top of my head I’m not at all sure how to characterize the result uniquely without referring to the specific constructions of it.

Posted by: Mike Shulman on September 18, 2009 6:17 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

MS: It sounds like you’re saying that in order to construct anything in FMathL without extraneous details I need to find a way to describe it in terms that characterize it uniquely.

No. Of course one can construct in FMathL a skew field via complex 2×2 matrices, and call it the quaternions. Then one can construct another skew field via real 4×4 matrices and call it the tetranions. Then one discovers the theorem that the tetranions are isomorphic to the quaternions. At this point, it makes sense to revise notation (a common process in mathematics), consider the domain X of all skew fields isomorphic to these particular skew fields, and to define the quaternions as Choice(X). When later the characterization theorem is discovered, it just gives a simpler description of X, but the definition is already stable once you know that you want to abstract from the accidentals of the construction. One can do this even if only a single construction exists (e.g., for the Monster simple group).
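
In symbols, the recipe just described is (my notation):

    Q_1 := \text{the skew field built from complex } 2\times 2 \text{ matrices}, \qquad
    Q_2 := \text{the skew field built from real } 4\times 4 \text{ matrices},

    X := \{\, S \mid S \text{ is a skew field isomorphic to } Q_1 \,\}, \qquad \mathbb{H} := Choice(X).

The later characterization theorem only replaces the description of X by one that does not mention Q_1 or Q_2; the definition of \mathbb{H} itself is unaffected.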

Posted by: Arnold Neumaier on September 18, 2009 6:57 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

one can avoid the multiple meanings of \mathbb{N} by interpreting it in each context as the richest structure that can be picked up from the context.

This isn’t “avoiding” the multiple meanings of \mathbb{N}, it is merely inferring which meaning is meant, which I am all in favor of. Category theorists do this all the time just like other mathematicians. But the fact that one has to do an interpretation means that the multiple meanings still exist.

The fact that \mathbb{N} can indicate different levels of structure depending on the context is precisely what I mean by “overloading.”

this uses the same idea as the categorial approach to foundations, but to turn each context into a category

I don’t think I ever advocated turning every context into a category. And, as I’ve said, I think that calling this the “categorial” approach to foundations is misleading; it doesn’t necessarily have anything to do with categories. The point I’m trying to make is that mathematics is typed, and that’s just as true in type theory and non-categorial structural set theory as it is in ETCS or CCAF.

Posted by: Mike Shulman on September 18, 2009 8:52 PM | Permalink | PGP Sig | Reply to this

Re: CCAF, ETCS and type theories

MS: The point I’m trying to make is that mathematics is typed, and that’s just as true in type theory and non-categorial structural set theory as it is in ETCS or CCAF.

I made some first comments on your SEAR page, but need to look at the concepts more thoroughly.

The point I’m trying to make is that typing does not solve many of the disambiguation problems that an automatic math system must be able to handle; it solves only some of them. Since a more flexible disambiguation system is needed anyway, it can as well replace the typing. By doing so in an FMathL-like fashion, it automatically eliminates the multiplicity of typed instances of the same thing.

Posted by: Arnold Neumaier on September 18, 2009 10:43 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

The use of Bourbaki’s global choice operator is essential for this golden road to the essence of mathematics.

That's too bad, because I would like to do mathematics in which the axiom of choice is optional, without having to code it all myself. I think that I can work well in a system where \mathbb{N} has a maximal structure (at least in an ever-growing, potential infinity sort of way), even though that's not exactly how I would normally think of \mathbb{N}, but I really won't find it useful if choice is essential.

I hope that it isn't really essential for what you're doing here. After all, you don't care which particular object Choice(X) is, and the user won't even have access to those details. So I hope that there is a way to implement this that avoids anything that proves the axiom of choice.

Posted by: Toby Bartels on September 18, 2009 10:07 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

AN: The use of Bourbaki’s global choice operator is essential for this golden road to the essence of mathematics.

TB: That’s too bad, because I would like to do mathematics in which the axiom of choice is optional, without having to code it all myself.

The way FMathL will be implemented is fully reflective. The current paper essentially serves to define a common metalevel, within which one can objectively (i.e., with the same meaning in all subjective implementations) talk about a constructive description of the FMathL implementation. The latter is what is actually carried out. Thus, one must trust the axiom of choice to trust the system, but what is proved inside the system is fully configurable, since you can simply build your context without importing all the modules needed for a full reflection of FMathL. Thus you’d only have to create your own constructive restricted version of Choice (if this hasn’t been done already by someone else), most likely that Choice is defined only for inhabited sets rather than for all nonempty sets, and things will work as before. You need some construction to know that you can choose, but you have that in a constructive approach anyway.

Actually, after we have successfully reflected the whole FMathL framework, we’ll take stock to see what is really needed for a minimal part that can already reflect the whole framework. Then we redefine this as the core, and there is a possibility that this will be constructive. Only the core needs to be trusted and checked for correctness, since the remainder will be definable in terms of the core.

Of course, consistency of the core does not imply consistency of the theories built later with the help of the core and user specifications. Thus, even in a weak but fully reflective core, it will be possible to specify ZFC with all sorts of inaccessible cardinals, say, but their consistency is left to the user.

AN: ordered monoids are the objects in the intersection of Order and Monoid satisfying the compatibility relation R (suitably defined)

TB: I would like to see you formalise this! Or the (related but perhaps simpler) statement that Group is a subcategory of Set.

I’ll try to do that sooner or later, but in my experience this may take days or weeks to crystallize into something practical (if it is at all possible). So don’t expect a quick reply.

Posted by: Arnold Neumaier on September 18, 2009 11:22 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

Thus you'd only have to create your own constructive restricted version of Choice […] and things will work as before.

And I can import everything that's already been defined your way? That might work.

most likely that Choice is defined only for inhabited sets rather than for all nonempty sets,

For the record, that won't work. (At least, if it did, then you'd have Excluded Middle implies Choice, which I wouldn't want either!)

Actually, after we have successfully reflected the whole FMathL framework, we’ll take stock to see what is really needed for a minimal part that can already reflect the whole framework. Then we redefine this as the core, and there is a possibility that this will be constructive.

That sounds good!

I would like to see you formalise this! Or the (related but perhaps simpler) statement that Group is a subcategory of Set.

I’ll try to do that sooner or later, but in my experience this may take days or weeks to crystallize into something practical (if it is at all possible). So don’t expect a quick reply.

I understand. But I would be very interested to see it.

Posted by: Toby Bartels on September 19, 2009 12:15 AM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

I reply here to several of your mails.

TB: And I can import everything that’s already been defined your way?

Yes, since that is what reflection is all about. A self-reflective foundation (and only that is a real foundation) can explain formally all the stuff it’s talking about informally. To explain it formally means to have a module that defines its syntax and semantics. Of course, in a well-designed package, you have access to that and can use, combine, and modify it in any way you like. (In the latter case, of course, the trust certificates will be reset to trusted by you only.)

AN: most likely that Choice is defined only for inhabited sets rather than for all nonempty sets,

TB: For the record, that won’t work.

Actually, I just noticed that Axiom A19 is already formulated as an axiom of global constructive choice.

TB: At least, if it did, then you’d have Excluded Middle implies Choice, which I wouldn’t want either!

I don’t understand; please indicate the argument.

However, it is well-known that Choice implies Excluded Middle, and this remains true in FMathL; see Section 3.1. If you don’t like this, you’d have to relax the axioms for sets (Section 2.14) in a way that makes this proof invalid.

TB: Give me a paper whose source is available (say from the arXiv), formalised in ZFC (or whatever), and I’ll rewrite it to be formalised in ETCS. (The paper can include its own specification of ZFC too, and mine will include its own specification of ETCS.)

I will not be able to meet that challenge since I do not have the patience to undo all the abuses of notation traditionally used in the context of ZFC. I argue that precisely this should not be demanded from a really good foundation of mathematics.

However, Bourbaki’s Elements of Mathematics at least clarify each form of abuse of notation on first use, so that an automatic system going through it in linear order will pick up all the language updates along the way. So I believe that, in principle, Bourbaki meets your request, though not in a minimally short way since they aimed at completeness, not at minimal reflection.

TB: Are you telling me that even ZFC needs a set theory to serve as its foundation? I can’t think of any way to interpret this to make it true.

Yes, of course. This is done in books on logic. They start with an informal set theory, then introduce the machinery to formally talk about first order logic, and then produce at some stage the definition of ZFC.

To interpret it without circularity, logicians distinguish between the metalevel (the informal set theory) and the object level (the formal theory), and then build hierarchies for reflection. They build them upwards, to meta-metalevels, etc.

FMathL instead reflects downwards, to object-object levels, etc., which is more appropriate from an implementation point of view.

TB: You have to say something like this to reflect first-order logic in some mathematical foundation. But not to do first-order logic.

To do it, you just need a mind in which the logic is already implemented.

But to communicate objectively what you do, you need to reflect it: you need to understand what it means that you can choose any indexed letter as a variable, and what it means to have a substitution algorithm, etc. Once you try to communicate this to a novice who doesn’t already have an equivalent implementation, you find yourself teaching him informal concepts of sets and functions with their properties.

A foundation is just supposed to have that done rigorously.

Posted by: Arnold Neumaier on September 19, 2009 5:23 PM | Permalink | Reply to this

Re: CCAF, ETCS and type theories

I reply here to several of your mails.

That's fine.

And I can import everything that’s already been defined your way?

Yes, since that is what reflection is all about. A self-reflective foundation (and only that is a real foundation) can explain formally all the stuff it’s talking about informally. To explain it formally means to have a module that defines its syntax and semantics. Of course, in a well-designed package, you have access to that and can use, combine, and modify it in any way you like. (In the latter case, of course, the trust certificates will be reset to trusted by you only.)

I should have asked if there was an easy user-friendly way to import it. Also I should learn more about the practical problems of reflection, because I have another question, which I don't know how to ask; but I'm worried about an analogue of the mismatch between the internal and external languages of a topos that is not well-pointed. And I'm concerned about the trust certificates; since I'm reflecting in order to use a weaker foundation (that is, fewer assumptions, stricter requirements), I would like FMathL to verify for me those results that go through.

most likely that Choice is defined only for inhabited sets rather than for all nonempty sets,

For the record, that won't work. At least, if it did, then you'd have Excluded Middle implies Choice, which I wouldn't want either!

I don’t understand; please indicate the argument.

By Excluded Middle, a set is either inhabited or not; an uninhabited set is (by definition) empty; hence a nonempty set is inhabited. So if there is a global choice function for inhabited sets, then there is one for nonempty sets. Conversely, the argument that Choice implies Excluded Middle needs only the version of choice for inhabited sets (in fact, only for quotient sets of \{0,1\}). I would like this to be optional.
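
(For readers who haven't seen it, a sketch of that argument, here phrased with inhabited subsets of \{0,1\} rather than quotients. Given a proposition P, let

    A := \{\, x \in \{0,1\} \mid x = 0 \vee P \,\}, \qquad B := \{\, x \in \{0,1\} \mid x = 1 \vee P \,\}.

Both are inhabited, so choice gives a := Choice(A) \in A and b := Choice(B) \in B, and a = b is decidable since a, b \in \{0,1\}. If a = b, then that common value lies in both A and B, which forces P, since it cannot be both 0 and 1. If a \ne b, then P must fail, since P would give A = B and hence Choice(A) = Choice(B). Either way, P \vee \neg P; note that the last step is exactly the substitution of equals under Choice discussed elsewhere in this thread.)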

Give me a paper whose source is available (say from the arXiv), formalised in ZFC (or whatever), and I’ll rewrite it to be formalised in ETCS.

I will not be able to meet that challenge since I do not have the patience to undo all the abuses of notation traditionally used in the context of ZFC. I argue that precisely this should not be demanded from a really good foundation of mathematics.

I don't understand what you're asking of categorial foundations, then, if no other foundation does it. Or do you claim that FMathL does this? Can I take http://www.mat.univie.ac.at/~neum/ms/fmathl.pdf as my text?

Are you telling me that even ZFC needs a set theory to serve as its foundation? I can't think of any way to interpret this to make it true.

Yes, of course. This is done in books on logic. They start with an informal set theory, then introduce the machinery to formally talk about first order logic, and then produce at some stage the definition of ZFC.

I would like to see a book on logic (syntactic logic, not model theory) that does this. To describe the logic, we need a metalanguage (as you say), but we still don't need a full-fledged set theory (or other foundation of all mathematics), just a way to talk about recursively enumerable sets of natural numbers. (But I think that it's more common to talk about finite lists from a fixed finite language than natural numbers.)

In particular, the requirement of a countable set of variables can be replaced with a single variable x and the requirement that X' is a variable whenever X is; there is no need to say Cantor's word ‘countable’.
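
(A throwaway illustration of that replacement, purely for concreteness:)

    # Illustrative only: the variables x, x', x'', ... obtained by priming a single symbol.
    def variables():
        v = "x"
        while True:
            yield v
            v = v + "'"

    gen = variables()
    print([next(gen) for _ in range(4)])   # ['x', "x'", "x''", "x'''"]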

Posted by: Toby Bartels on September 19, 2009 8:58 PM | Permalink | Reply to this

Re: CCAF vs ETCS

I (perhaps wrongly) assumed that “CCAF” meant the same thing as Lawvere originally meant by it, and I don’t think this included ETCS. But I can’t find my copy of Lawvere’s original paper at the moment, so I could be wrong.

Posted by: Mike Shulman on September 16, 2009 9:01 PM | Permalink | Reply to this

Re: CCAF vs ETCS

A longer version of the 1964 paper: http://138.73.27.39/tac/reprints/articles/11/tr11abs.html

“Philosophers and logicians to this day often contrast “categorical” foundations for mathematics with “set-theoretic” foundations as if the two were opposites. Yet the second categorical foundation ever worked out, and the first in print, was a set theory—Lawvere’s axioms for the category of sets, called ETCS, (Lawvere 1964). These axioms were written soon after Lawvere’s dissertation sketched the category of categories as a foundation, CCAF, (Lawvere 1963). They appeared in the PNAS two years before axioms for CCAF were published (Lawvere 1966). The present longer version was available since April 1965 in the Lecture Notes Series of the University of Chicago Department of Mathematics. It gives the same definitions and theorems, with the same numbering as the 5 page PNAS version, but with fuller proofs and explications.”

Posted by: Stephen Harris on September 17, 2009 1:01 AM | Permalink | Reply to this

Re: CCAF vs ETCS

The open question was: Does his original version of CCAF have an axiom saying that there is a category of sets satisfying ETCS?

Posted by: Arnold Neumaier on September 17, 2009 4:04 PM | Permalink | Reply to this

Re: CCAF vs ETCS

Posted by: Arnold Neumaier
“The open question was: Does his original version of CCAF have an axiom saying that there is a category of sets satisfying ETCS?”
————————————

SH: Mike said “original paper”, which I wasn’t sure meant Lawvere’s thesis, which was also published 40 years later. Precisely speaking, Lawvere doesn’t present formal axioms there; he proceeds more informally.

http://138.73.27.39/tac/reprints/index.html [tr5]
From the Author’s Comments on his (Lawvere’s) 1963 PhD thesis, describing January 1960
————————
“My dream, that direct axiomatization of the category of categories would help in overcoming alleged set-theoretic difficulties, was naturally met with skepticism by Professor Eilenberg when I arrived (and also by Professor Mac Lane when he visited Columbia).”
————————–

From the thesis Introduction
“One so inclined could of course view all mathematical assertions of Chapter I as axioms.” …
“Since all these notions turn out to have first-order characterizations (i.e. characterizations solely in terms of the domain, codomain, and composition predicates and the logical constants =, ∀, ∃, ⇒, ∧, ∨, ¬ ), it becomes possible to adjoin these characterizations as new axioms together with certain other axioms, such as the axiom of choice, to the usual first-order theory of categories (i.e. the one whose only axioms are associativity, etc.) to obtain the first-order theory of the category of categories. Apparently a great deal of mathematics (for example this paper) can be derived within the latter theory. We content ourselves here with an intuitively adequate description of the basic operations and special objects in the category of categories, *leaving the full formal axioms to a later paper*. We assert that all that we do can be interpreted in the theory ZF3, and hence is consistent if ZF3 is consistent. By ZF3 we mean the theory obtained by adjoining to ordinary Zermelo-Fraenkel set theory…”
—————————

SH: I think the formal axioms were presented later in
Lawvere, F. William (1966)*, The category of categories as a foundation for mathematics, in S.Eilenberg et al., eds, ‘Proceedings of the Conference on Categorical Algebra, La Jolla, 1965’, Springer-Verlag, pp. 1–21.

Colin McLarty said [tr11], “Lawvere’s axioms for the category of sets, called ETCS, (Lawvere 1964). These axioms were written soon *after Lawvere’s dissertation sketched the category of categories as a foundation, CCAF, (Lawvere 1963). They appeared in the PNAS two years before axioms for CCAF were published (Lawvere 1966)*. [cited above]

SH: So in my inexpert opinion, the original version of CCAF would be Lawvere’s dissertation (1963), and there are no formal axioms of ETCS in it, so perhaps they could be called assertions, although that is not how Lawvere thought about them.

For other non-experts: I’m including some of my notes from the thesis which include ideas tantamount to informal axioms.

Seven ideas introduced in the 1963 thesis
(1) The category of categories is an accurate and useful framework for algebra, geometry, analysis, and logic, therefore its key features need to be made explicit.
(2) The construction of the category whose objects are maps from a value of one given functor to a value of another given functor makes possible an elementary treatment of adjointness free of smallness concerns and also helps to make explicit both the existence theorem for adjoints and the calculation of the specific class of adjoints known as Kan extensions.
(3)* Algebras (and other structures, models, etc.) are actually functors
to a background category from a category which abstractly concentrates the essence of a certain general concept of algebra, and indeed homomorphisms are nothing but natural transformations between such functors. Categories of algebras are very special, and explicit *axiomatic characterizations of them can be found, thus providing a general guide to the special features of construction in algebra.
(4) The Kan extensions themselves are the key ingredient in the unification of a large class of universal constructions in algebra (as in [Chevalley, 1956]).
(5) The dialectical contrast between presentations of abstract concepts and the abstract concepts themselves, as also the contrast between word problems and groups, polynomial calculations and rings, etc. can be expressed as an explicit construction of a new adjoint functor out of any given adjoint functor. Since in practice many abstract concepts (and algebras) arise by means other than presentations, it is more accurate to apply the term “theory”, not to the presentations as had become traditional in formalist logic, but rather to the more invariant abstract concepts themselves which serve a pivotal role, both in their connection with the syntax of presentations, as well as with the semantics of representations.
(6) The leap from particular phenomenon to general concept, as in the leap from cohomology functors on spaces to the concept of cohomology operations, can be analyzed as a procedure meaningful in a great variety of contexts and involving functorality and naturality, a procedure actually determined as the adjoint to semantics and called extraction of “structure” (in the general rather than the particular sense of the word).
(7) The tools implicit in (1)–(6) constitute a “universal algebra” which should not only be polished for its own sake but more importantly should be applied both to constructing more pedagogically effective unifications of ongoing developments of classical algebra, and to guiding of future mathematical research.
In 1968 the idea summarized in (7) was elaborated in a list of solved and unsolved problems, which is also being reproduced here.”
——————————-

“My stay in Berkeley tempered the naive presumption that an important
preparation for work in the foundations of continuum mechanics would
be to join the community whose stated goal was the foundations of
mathematics.”
——————————-

I read the Preface to Yves Bertot’s book on Coq and it took about 20 years to develop which makes me think that your time frame of 5 years isn’t very long.

Posted by: Stephen Harris on September 18, 2009 1:37 AM | Permalink | Reply to this

Re: CCAF vs ETCS

SH: I read the Preface to Yves Bertot’s book on Coq and it took about 20 years to develop which makes me think that your time frame of 5 years isn’t very long.

Fortunately, I can build upon all this previous work rather than having to develop it all again. I have it easier, thanks to the lessons that Coq had to learn the hard way.

Nevertheless, the 5 years are based on the assumption that 10 people work full-time on it for 5 years. I am trying to get financial support to achieve this, but it isn’t easy. At the moment I have only 2 people for the next two years, and one more for 1/2 a year.

The 1964 paper you address in your next email only talks about ETCS, not about CCAF.

SH: It seems Mike is right about ETCS and CCAF being quite distinct.

This is undisputed. But my claim was that CCAF contains ETCS, whereas his claim was that CCAF is completely independent of ETCS. (If it were so, how could CCAF be a foundation of mathematics, where sets are necessary?)

Posted by: Arnold Neumaier on September 18, 2009 9:48 AM | Permalink | Reply to this

Re: CCAF vs ETCS

his claim was that CCAF is completely independent of ETCS. (If it were so, how could CCAF be a foundation of mathematics, where sets are necessary?)

I don't understand this. How can ETCS be a foundation of mathematics when it's independent of ZFC?

Posted by: Toby Bartels on September 18, 2009 9:39 PM | Permalink | Reply to this

Re: CCAF vs ETCS

AN: his claim was that CCAF is completely independent of ETCS. (If it were so, how could CCAF be a foundation of mathematics, where sets are necessary?)

TB: I don’t understand this. How can ETCS be a foundation of mathematics when it’s independent of ZFC?

I don’t understand how your question relates to my remark, which had no reference to ZFC.

ETCS is a set theory based on category theory, which is based on a set theory. Thus one can claim that ETCS is a different set-theoretic foundation.

CCAF is a category theory based on category theory. It needs some way to create sets, otherwise it cannot serve as its own metatheory. (For example, it needs to be able to formalize things such as “there is a countable set of variables” before it can talk about predicate logic.) McLarty’s version of CCAF does this by explicitly requiring that CCAF contains a copy of ETCS.

I see no way to avoid something like this, but Mike Shulman had proposed from memory that Lawvere’s CCAF had no ETCS inside it. I have no access to Lawvere’s CCAF paper, so I can’t check.

Posted by: Arnold Neumaier on September 18, 2009 10:54 PM | Permalink | Reply to this

Re: CCAF vs ETCS

I don’t understand how your question relates to my remark, which had no reference to ZFC.

Sorry, I didn't express myself very well. I meant that the question that you asked me makes no more sense in that context (to me) than the question that I asked you.

ETCS is a set theory based on category theory, which is based on a set theory.

And now I don't understand the second clause of this sentence! Or rather, I think that I understand, but if so, then it's wrong. The category theory that ETCS is based on is not based on a set theory; it's elementary.

Posted by: Toby Bartels on September 18, 2009 11:02 PM | Permalink | Reply to this

Re: CCAF vs ETCS

TB: The category theory that ETCS is based on is not based on a set theory; it’s elementary.

Even elementary stuff expressed in first order logic needs a set theory to be able to serve as foundation. For example, it needs to be able to formalize statements such as “there is a countable set of variables” before it can talk about predicate logic.

TB: I should clarify further that CCAF certainly has a set theory in it, in the same way that any set theory has a category theory in it. That is, you can define sets in terms of categories, and go on from there.

This is precisely what the ETCS inside McLarty’s CCAF does. And I’d be surprised if Lawvere had taken a different road.

Posted by: Arnold Neumaier on September 18, 2009 11:33 PM | Permalink | Reply to this

Re: CCAF vs ETCS

Even elementary stuff expressed in first order logic needs a set theory to be able to serve as foundation.

What??? Are you telling me that even ZFC needs a set theory to serve as its foundation? I can't think of any way to interpret this to make it true.

“there is a countable set of variables”

You have to say something like this to reflect first-order logic in some mathematical foundation. But not to do first-order logic.

Posted by: Toby Bartels on September 19, 2009 12:26 AM | Permalink | Reply to this

Re: CCAF vs ETCS

Even elementary stuff expressed in first order logic needs a set theory to be able to serve as foundation.

Everything needs to be based on set theory, therefore no matter what alternate foundations you consider, they must be based on set theory, therefore the only foundation for mathematics is set theory.

Wait, what?

Posted by: John Armstrong on September 19, 2009 2:32 AM | Permalink | Reply to this

Re: CCAF vs ETCS

It seems that, again, the notion needed is some kind of reflection (or maybe simulation?); since Set Theory untidily encapsulates “all” of mathematics, any competing foundational system ought to allow a reasonable reflection or simulation of Set Theory.

Maybe we need a category of foundational systems, simulations/reflections as morphisms between them, (higher-dimensional stuff?) … ?

Posted by: some guy on the street on September 19, 2009 5:38 AM | Permalink | Reply to this

Re: CCAF vs ETCS

TB: I should clarify further that CCAF certainly has a set theory in it, in the same way that any set theory has a category theory in it. That is, you can define sets in terms of categories, and go on from there.

AN: This is precisely what the ETCS inside McLarty’s CCAF does. And I’d be surprised if Lawvere had taken a different road.
—————————————

Category theory used to have a set-theoretic background. Lawvere provided an alternative. The category Set is fundamental. But I don’t think that ETCS is fundamental to CCAF, because ETCS is just one formulation of Cat = Set.

http://plato.stanford.edu/entries/category-theory/

“An alternative approach, that of Lawvere (1963, 1966), begins by characterizing the category of categories, and then stipulates that a category is an object of that universe.

Identity, morphisms, and composition satisfy two axioms:
Associativity xxx…
Identity xxx…
This is the definition one finds in most textbooks of category theory. As such it explicitly relies on a set theoretical background and language.

An alternative, suggested by Lawvere in the early sixties, is to develop
an adequate language and background framework for a category of categories.”

SH: The Cat named Set is just one category in the category of categories. I don’t think that Set has to be ETCS, and so ETCS is not foundational to CCAF. I don’t think that Cat = Set is rigid in the axioms that it allows, so it is not limited to ETCS. After all, ETCS was just invented for the benefit of a two-semester course, and later the axioms in it were simplified.

Posted by: Stephen Harris on September 19, 2009 9:13 AM | Permalink | Reply to this

Re: CCAF vs ETCS

I should clarify further that CCAF certainly has a set theory in it, in the same way that any set theory has a category theory in it. That is, you can define sets in terms of categories, and go on from there. Depending on exactly how CCAF works, you would define a set as a discrete category, or perhaps a discrete skeletal category, or something like that.

Posted by: Toby Bartels on September 18, 2009 11:12 PM | Permalink | Reply to this

Re: CCAF vs ETCS

AN wrote: “I see no way to avoid something like this, but Mike Shulman
had proposed from memory that Lawvere’s CCAF had no ETCS inside it. I have no access to Lawvere’s CCAF paper, so I can’t check.” ———————

SH: From the quote below you will see that Lawvere’s version of Category Theory does not require a set-theoretical background. I think this is assumed on this forum. But the Category of Categories, CCAF, can have the category Set as one of its objects. That doesn’t make set theory foundational to CT in Lawvere’s approach. The axioms in ETCS might well qualify as axioms for the Cat = Set. Remember that Lawvere invented ETCS as a simplification of set theory for a one-year class. Later, he simplified the axioms which ETCS contained. I don’t think ETCS is intrinsic to CCAF, but it is (I think) one of the possible axiomatic formulations for the Category of Set, which is intrinsic to CCAF in the same sense that all possible categories in the category of categories are in some sense intrinsic, at least after being identified. Even if Lawvere had used ETCS as his example for the Cat = Set, in his 1963 thesis or 1966 paper, I think other set axiomatic systems could have replaced it; I don’t think Cat = Set has a rigid definition in terms of axioms.
If I’m wrong about this last part, Toby, be sure to tell me! ;-) So I think you, AN, have overvalued ETCS because you thought it was foundational to a category theory which must have a set-theoretic foundation. But the Café mostly uses Lawvere’s non-set-theoretic foundation, AFAIK.

http://plato.stanford.edu/entries/category-theory/
“An alternative approach, that of Lawvere (1963, 1966), begins by characterizing the category of categories, and then stipulates that a category is an object of that universe.

Identity, morphisms, and composition satisfy two axioms:
Associativity xxx…
Identity xxx…
“This is the definition one finds in most textbooks of category theory. As such it explicitly relies on a set theoretical background and language.
An alternative, suggested by Lawvere in the early sixties, is to develop an adequate language and background framework for a category of categories.”

Posted by: Stephen Harris on September 19, 2009 8:57 AM | Permalink | Reply to this

Re: CCAF vs ETCS

SH: Even if Lawvere had used ETCS as his example for the Cat = Set, in his 1963 thesis or 1966 paper, I think other set axiomatic systems could have replaced it

I agree. But some notion of set is needed in any categorial foundations of mathematics deserving the name. I took ETCS as the standard notion, just as I took ZF as the standard notion of set, although people have successfully used alternative set theories as foundations too.

Posted by: Arnold Neumaier on September 19, 2009 7:33 PM | Permalink | Reply to this

Re: CCAF vs ETCS

AN wrote: “This is undisputed. But my claim was that CCAF contains ETCS, whereas his claim was that CCAF is completely independent of ETCS. (If it were so, how could CCAF be a foundation of mathematics, where sets are necessary?)” ——————-

SH: I haven’t noticed a rapid meeting of the minds yet, so I thought I would look into some outside opinion. I too thought that maybe Category Theory might not be such a good basis for FMathL. I now think my opinion is wrong. FMathL is an algorithm, and this idea seems to be a key point: “This means that we can view category theory as a collection of algorithms.” FMathL will be an algorithm. This post is my research and notes.

TT wrote:
“However, it looks like Mike Shulman has written down a very interesting program
of study, different to the categories-based approach (which is extremely hands-on and bottom-up) but which is very smooth and top-down (invoking for example a very powerful comprehension principle) while still being faithful to the structuralist POV. He calls it SEAR (Sets, Elements, and Relations).”

AN wrote:
“The main reason I cannot see why category theory might become a foundation for a system like FMathL (and this is my sole interest in category theory at present) is that a systematic, careful treatment already takes 100 or more pages of abstraction before one can discuss foundational issues formally, i.e., before they acquire the first bit of self-reflection capabilities. …
Show me a paper that outlines a reasonably short way to formally define
all the stuff needed to be able to formally reflect in categorial language
a definition that characterizes when an object is a subgroup of a group.”

SH: This paper seems to fit the bill?
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.29.9846
“Herein we formalize a segment of category theory using the implementation of Calculus of Inductive Construction in Coq. Adopting the axiomatization proposed by Huet and Saibi we start by presenting basic concepts, examples and results of category theory in Coq. Next we define adjunction and cocartesian lifting and establish some results using the Coq proof assistant.”

http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.29.9846
“This paper aims to bring the benefits of the use of Category Theory to the field of Semantic Web, where the coexistence of intrinsically different models of local knowledge makes difficult the exchanging of information. The paper uses categorical limit and colimit to define operations of breaking and composing ontologies, formalizing usual concepts in ontologies (alignment, merge, integration, matching) and proposing a new operation (the hide operation). The presented set of operations form a useful framework that makes easier the manipulation and reuse of ontologies.”

“Computational Category Theory”
“Another reason why computer scientists might be interested in category theory is that it is largely constructive. Theorems asserting the existence of objects are proven by explicit construction. *This means that we can view category theory as a collection of algorithms*. These algorithms have a generality beyond that normally encountered in programming in that they are parameterized over an arbitrary category and so can be specialized to different data structures.

One writes expressions to denote mathematical entities rather than defining the transitions of an abstract machine. ML also provides types which make a program much more intelligible and prevent some programming mistakes. ML has polymorphic types which allow us to express in programs something of the generality of category theory.
However, the type system of ML is not sufficiently sophisticated to prevent the illegal composition of two arrows whose respective source and target do not match.”
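SH: To make the “collection of algorithms” slogan and the composition check concrete, here is a toy sketch of my own (in Python, purely illustrative, not taken from the quoted book): arrows carry their source and target explicitly, and composition refuses mismatched arrows at run time, which is exactly the check that, as noted above, the ML type system cannot enforce.

# Toy illustration only: arrows carry explicit source/target data, and
# composition checks at run time that they match.
class Arrow:
    def __init__(self, src, tgt, fn):
        self.src, self.tgt, self.fn = src, tgt, fn

def compose(g, f):
    # "g after f": defined only when the target of f equals the source of g
    if f.tgt != g.src:
        raise ValueError("illegal composition: source/target mismatch")
    return Arrow(f.src, g.tgt, lambda x: g.fn(f.fn(x)))

f = Arrow("Int", "Int", lambda n: n + 1)
g = Arrow("Int", "Str", str)
h = compose(g, f)        # fine: h.fn(2) == "3"
# compose(f, g) would raise the mismatch error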

Posted by: Stephen Harris on September 19, 2009 5:45 AM | Permalink | Reply to this

Re: CCAF vs ETCS

AN: Show me a paper that outlines a reasonably short way to formally define all the stuff needed to be able to formally reflect in categorial language

SH: This paper seems to fit the bill?

It fits half of the bill. The missing half is to give Coq a categorial foundation. Only then is the reflection complete.

SH: FMathL is an algorithm and this idea seems to be a key point: “This means that we can view category theory as a collection of algorithms.”

I don’t see how a theory can be an algorithm. A theory consists of concepts, their semantic interpretation, and algorithms for manipulating the concepts consistent with that interpretation. All three parts are needed.

If the world consists of algorithms only, they perform meaningless tasks.

Posted by: Arnold Neumaier on September 19, 2009 7:31 PM | Permalink | Reply to this

Re: CCAF vs ETCS

SH: FMathL is an algorithm and this idea seems to be a key point:
“This means that we can view category theory as a collection of algorithms.”
———————————-
AN replied
I don’t see how a theory can be an algorithm. A theory consists of concepts, their semantic interpretation, and algorithms for manipulating the concepts consistent with that interpretation. All three parts are needed.

If the world consists of algorithms only, they perform meaningless tasks.”

SH: I provided the full context of the basis for my quote below. I think the analogy is very precise. As far as meaning goes, the difference is between human “original intentionality” and a program, which is considered to have “derived intentionality”. That means that the programmer provides the meaning that the program carries for other observers or users. Likewise, the mathematician defines or constructs some category, and the purpose or meaning of that category is communicated to other mathematicians. It flows from the mind of the mathematician into an abstract symbolic language which means something to the reader. This is pretty much how natural language works too, operating by a shared, agreed-upon meaning.
“An alternative approach, that of Lawvere (1963, 1966), begins by characterizing the category of categories, and then stipulates that a category is an object of that universe.”
“The point is that the category of categories is not just a category, but what is known as a 2-category; that is, its arrows are functors, but two functors between the same two categories in turn form a category, the arrows being natural transformations of functors. Thus there are 1-arrows (functors) between objects (categories), but there are also 2-arrows (natural transformations) between 1-arrows.”

Encyclopedia of Computer Science and Technology By Allen Kent, James G. Williams
Computational Aspects
“In this section we indicate something of the computational nature of category theory that has attracted the interest of computer scientists and led to applications which we describe in the Applications section.

Observation one is that category theory, like logic, operates on the same level of generality as computer programming. It is *not tied to specific structures like sets or numbers* but as in programming, where we may define types to represent a wide range of structures, so in category theory, objects may range over many kinds of structures. This generality is exploited in describing the semantics of computation and can also be used to write highly generic codes.

Being based upon arrows and their compositions, category theory is an abstract theory of typed functions, with objects corresponding to types and arrows to functions. Notice that composition rather than application is the primitive operation. Through this identification, features of functional programming find categorical analogues. Type constructors
correspond to maps between categories (called functors), higher order functions are described in categories with additional structure (cartesian closed categories) and polymorphic types similarly. This correspondence between programming constructs and category theory formalizes structural properties of programs in an elementary equational language. Models of programming languages are then categories with suitable internal structure.

There is something more going on here. This correspondence translates functional language (like typed lambda-calculus) into an arrow-theoretic language: that is, translates a language with variables, where we can substitute values for names, into a variable-free combinator language.
Languages with variables seem more appropriate for programming and other
descriptive activities, whereas combinator languages are more suited to
algebraic manipulation and possibly more efficient evaluation. These ideas have been used to build abstract machines for implementing functional languages based upon categorical combinators.

Somewhat belying the abstraction of category theory is the fact that it is largely constructive: theorems asserting existence are proven by explicit construction. These constructions provide algorithms which can be coded as computer programs, programs with an unusual degree of generality. They are abstracted over categories and so apply to a range of different data types, the same program performing analogous operations on types such as sets, graphs, and automata. In a sense, the core of category theory is
just a *collection of algorithms*.

Posted by: Stephen Harris on September 21, 2009 5:40 AM | Permalink | Reply to this

Re: CCAF vs ETCS

I haven’t read the following paper so I’m not certain what it includes about axioms.
Lawvere, F. William (1966), The category of categories as a foundation for mathematics, in S.Eilenberg et al., eds, ‘Proceedings of the Conference on Categorical Algebra, La Jolla, 1965’, Springer-Verlag, pp. 1–21.
—————————-

SH: I don’t think the formal axioms were presented in Lawvere’s 1963 thesis. I think they can be found in this 1964 paper
www.pubmedcentral.nih.gov/articlerender.fcgi?artid=300477

“We adjoin eight first-order axioms to the usual first-order theory of an abstract Eilenberg-Mac Lane category to obtain an elementary theory with the following properties:
(a) There is essentially only one category which satisfies these eight axioms together with the additional (non-elementary) axiom of completeness, namely, the category of sets and mappings. Thus our theory distinguishes 𝒮 structurally from other complete categories, such as those of topological spaces, groups, rings, partially ordered sets, etc.”

This paper is also online with extended commentaries (long version) added later.
http://138.73.27.39/tac/reprints/articles/11/tr11abs.html

Posted by: Stephen Harris on September 18, 2009 2:37 AM | Permalink | Reply to this

Re: CCAF vs ETCS

It seems Mike is right about ETCS and CCAF being quite distinct. However, Lawvere doesn’t appear to see the two as being quite as disconnected as Mike apparently does.
———————-

Colin McLarty wrote:
“Yet the second categorical foundation ever worked out, and the first in print,
was a set theory —- Lawvere’s axioms for the category of sets, called ETCS,
(Lawvere 1964).”

Lawvere, ETCS paper, page 34, wrote:
“However, it is the author’s feeling that when one wishes to go substantially
beyond what can be done in the theory [ETCS] presented here, a much more
satisfactory foundation for practical purposes will be provided by a theory
of the category of categories.”

“Part of the summer of 1963 was devoted to designing a course based on the
axiomatics of Zermelo-Fraenkel set theory (even though I had already before
concluded that the category of categories is the best setting for “advanced”
mathematics).”

“This elementary theory of the category of sets arose from a purely practical
educational need. …
But I soon realized that even an entire semester would not be adequate for
explaining all the (for a beginner bizarre) membership-theoretic definitions
and results, then translating them into operations usable in algebra and
analysis, then using that framework to construct a basis for the material I
planned to present in the second semester on metric spaces.
However I found a way out of the ZF impasse and the able Reed students could
indeed be led to take advantage of the second semester that I had planned.
The way was to present in a couple of months an explicit axiomatic theory of
the mathematical operations and concepts (composition, functionals, etc.) as
actually needed in the development of the mathematics.”

Posted by: Stephen Harris on September 18, 2009 7:14 AM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

When we write maths on a blackboard what do we actually do? This has to be analysed satisfactorily if we are ever to have decent graphical mathematical editors - a necessity if we want the communication of maths by computer to approximate the ease of a blackboard and coloured chalk.

What we do has two components, one visible and the other invisible. The visible component is a tree-structured graphic. The nodes of the tree represent the meaningful subexpressions. The editor must be able to ‘explode’ this tree, for example by a perspective representation, or by using a separate window to show different levels, so that the user can select subtrees for editing, dragging and dropping into other windows, etc.
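To fix ideas, the sort of tree I have in mind can be sketched crudely as nested nodes (a throwaway Python illustration only, not tied to any existing editor):

# Throwaway sketch: an expression such as a*(b+c) as a tree whose nodes are
# the meaningful subexpressions an editor could select, drag and drop.
expr = ("*", ("var", "a"),
             ("+", ("var", "b"), ("var", "c")))

def subexpressions(node):
    # every node of the tree is itself a selectable subexpression
    yield node
    if node[0] != "var":
        for child in node[1:]:
            yield from subexpressions(child)

# list(subexpressions(expr)) enumerates a*(b+c), a, b+c, b and c.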

The invisible component is the semantic content of the tree, rather than its visual display. This is the formalization that is in the back of the user’s mind. It may not be a complete formalization, so this is the tricky part. What formalization is appropriate for the computer? I suspect that there may be many answers to this.

The visible component needs software that may possibly already exist, but I do not know of it. The !Draw application for RISC OS is a structured vector-graphics editor, that has been around for a quarter of a century, and it has some of the features of what is required.

Posted by: Gavin Wraith on September 12, 2009 7:56 AM | Permalink | Reply to this

Ideocosm puzzles; Re: Towards a Computer-Aided System for Real Mathematics

“The invisible component is the semantic content of the tree, rather than its visual display. This is the formalization that is in the back of the user’s mind.”

Yes, but that’s where it gets infinitely tricky.

As I’ve said before, on other threads, we don’t know much at all about the Topology of the Ideocosm – the space of all possible ideas.

Here we admit that we don’t know how to formalize, illustrate, or automate the subspace of all possible “Mathematical ideas.”

I’m not sure we can even define it, given that Mathematics is at any time partly instantiated by what is in the heads of all Mathematicians, a society that changes over time both locally and globally.

I am not clear on how we might even define a hyperplane that separates the “Mathematical” ideas from the “nonmathematical” ideas within the Ideocosm.

Nor, for that matter, how we might even define a hyperplane that separates the “Physical” ideas from the “Nonphysical” ideas within the subspace of the Ideocosm of Theories of Mathematical Physics.

Posted by: Jonathan Vos Post on September 12, 2009 4:23 PM | Permalink | Reply to this

Re: Ideocosm puzzles

GW: The invisible component is the semantic content of the tree, rather than its visual display. This is the formalization that is in the back of the user’s mind.

JVP: Yes, but that’s where it gets infinitely tricky. As I’ve said before, on other threads, we don’t know much at all about the Topology of the Ideocosm - the space of all possible ideas.

It is just what can be represented on the semantic web. What is not known is only the part of it that is potentially useful. But the subspace of well-defined mathematical statements can be delineated up to semantic equivalence. It will just be what can be processed by FMathL, since the latter is designed to be able to process all mathematics. (Already Coq and Isabelle/Isar can do that, though not really conveniently.)

Posted by: Arnold Neumaier on September 14, 2009 4:10 PM | Permalink | Reply to this

Godel-numbering to “game” the metasystem; Re: Ideocosm puzzles

“the subspace of well-defined mathematical statements can be delineated up to semantic equivalence.”

I understand. But in a dynamic metasystem, where “new” mathematical ideas can be introduced coherently, as Category Theory historically came after Set Theory, it is not obvious to me that the semantics and pragmatics (“potentially useful”) are guaranteed always to be well-defined, once gadgets such as Godel-numbering are used to “game” the metasystem. But I’m eager to know more.

Posted by: Jonathan Vos Post on September 14, 2009 8:09 PM | Permalink | Reply to this

Re: Godel-numbering to “game” the metasystem; Re: Ideocosm puzzles

JVP: in a dynamic metasystem, where “new” mathematical ideas can be introduced coherently, as Category Theory historically came after Set Theory, it is not obvious to me that the semantics and pragmatics (“potentially useful”) are guaranteed always to be well-defined

Web ontology languages like RDF/OWL have a very general representation for semantics that can handle the way arbitrary concepts were looked at from antiquity till today, and hence probably far into the future. Semantical content is represented as a collection of triples of names.

For FMathL, it turned out to be more convenient to consider only collections of triples where the first two entries determine the third, leading to a partial binary dot operation that associates to certain pairs of objects a third one:

f . is_continuous = true

customer_1147 . first_name = Otto

etc. As is easy to see, this can still hold arbitrary semantical relations between objects. The operation table of the dot operation is an infinite matrix that we call a semantic matrix.

The collective knowledge about mathematics can be considered as a huge and growing semantic matrix, of which the FMathL system is to capture the most important part.
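A crude way to picture a finite fragment of such a semantic matrix (just a Python illustration of the idea, not the representation FMathL actually uses) is as a partial map from pairs to objects:

# Illustration only: a finite fragment of a semantic matrix as a partial
# binary operation, defined on some pairs (object, attribute) and not others.
semantic_matrix = {
    ("f", "is_continuous"): "true",
    ("customer_1147", "first_name"): "Otto",
}

def dot(obj, attr):
    # the partial dot operation obj . attr; undefined pairs return None
    return semantic_matrix.get((obj, attr))

# dot("f", "is_continuous") == "true"
# dot("customer_1147", "last_name") is None   (the operation is partial)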

JVP: … well-defined, once gadgets such as Godel-numbering are used to “game” the metasystem.

Consistency depends on the context, and is maintained in the usual way. Goedel’s results put limits on what is achievable constructively, but do not threaten well-definedness.

Posted by: Arnold Neumaier on September 15, 2009 12:23 PM | Permalink | Reply to this

What mathematicians carry around in their heads; Re: Godel-numbering to “game” the metasystem; Re: Ideocosm puzzles

Excellent answer. This suggests to me how carefully you’ve thought through your system, metasystem, metametasystem…

“The collective knowledge about mathematics can be considered as a huge and growing semantic matrix…”

Cf.:

Programming Language and Logic Links.

“Most mathematicians can go through their entire careers without learning anything about proof theory and intuitionistic logic, and I think the reason is that both undermine the naive model of mathematical foundations that most mathematicians carry around in their heads. Mathematicians hate thinking about foundations.”

Posted by: Jonathan Vos Post on September 16, 2009 6:25 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

GW: What we do has two components, one visible and the other invisible.

This not only holds for the blackboard case (which I leave to the OCR people, who don’t do too badly at turning formulas into LaTeX) but also for the typed case.

Thus the visible component gives the expression tree (syntax, e.g., Presentation MathML), while the invisible component gives the intentions (e.g., Content MathML). The latter is the more difficult thing, poorly understood on the formal level.

Of course, when we write math on the blackboard, there are also our voice and gestures that convey helpful information for the semantical interpretation, e.g., a broad smile while saying “Let ϵ < 0”.

On the other hand, for the syntactical side - do you really think that a visual view of the syntax tree of a formula such as a(b+c) = ab + ac is more helpful than the equation itself? The longer the formula the less intelligible is the tree. (Consider a sequence of equalities A = B = C = D + E = F with long expressions A, …, F that are the same apart from successive substitutions.)

Posted by: Arnold Neumaier on September 20, 2009 12:52 PM | Permalink | Reply to this

Herding cats

I’m someone who spends a fair amount of time trying to tease as much semantics as possible out of the LaTeX that people type.

“Presentational” MathML contains far more semantics than one typically finds in the LaTeX people actually generate, “in the wild.”

As a consequence, the output of itex2MML generally falls short of what would be possible in hand-crafted Presentational MathML (let alone Content MathML or the fancy-schmancy system discussed above).

Take as an example, the expression

f(a+b)^2

Of course, the exponent is not applied to the right parenthesis – even though that is literally how the expression is written.

Rather, what the author probably means is either

{ f(a+b) }^2

or, perhaps,

f { (a+b) }^2

Which it is, depends on what the expression

f(a+b)

is supposed to mean. In MathML, there are entities, &InvisibleTimes; and &ApplyFunction;, which would normally be placed between the <mi>f</mi> and the <mo>(</mo> to indicate whether we mean “the variable f times (a+b)” or “the function f applied to a+b.”

But, of course, there’s nothing like that in LaTeX.

What an ordinary LaTeX user can do is use brace brackets, as above, to — at least — indicate the desired grouping.

When I use itex (in the comments here on golem, or in Instiki), I try to always use braces to indicate grouping. That gets translated into the placement of <mrow> elements, which produces the right semantics (and not merely the right visual appearance) in the resulting MathML.

As far as I can tell, I am the only person who does that.

The problem of getting people to insert semantic information, into the stuff they write, was eloquently shown to be hopeless, long ago, in a famous essay by Cory Doctorow.

Posted by: Jacques Distler on September 14, 2009 4:25 AM | Permalink | PGP Sig | Reply to this

Re: Herding cats

Just a short technical followup, for those unfamiliar with LaTeX and/or MathML.

In LaTeX, grouping is indicated by brace brackets and the matching of left- and right-braces is strictly enforced. Parentheses do not indicate grouping (and the matching of left- and right-parentheses is not enforced).

In Presentational MathML, grouping is indicated by the <mrow> element. itex2MML translates matched pairs of braces into <mrow>s.

Posted by: Jacques Distler on September 14, 2009 5:27 AM | Permalink | PGP Sig | Reply to this

Re: Herding cats

Here you hit a problem in “worldwide inference”: one feature of brace pairs is that they turn TeX mathbins (like +) into ordinary characters (without the wider binary-operator spacing), and years of having to deal with two-column proceedings styles have trained me to stick braces around any plus signs, etc., in all displayed equations to increase the likelihood of them fitting on one line (due to tight page limits). (Stylists will say I shouldn’t do this, but I’ve never received a referee report that indicates they’ve ever noticed, let alone objected.) I don’t know what the itex parser would infer about the equations from this habit :-)

I’ve read a lot of the papers that Knuth wrote about the design and implementation of TeX and, whilst what he writes makes it clear he cares deeply about “exact” reproducibility of typesetting both in different installations and years later, I can’t recall any statements about the direct electronic use of TeX documents, particularly by algorithms. So it’s a reasonable hypothesis that he just didn’t think this would be relevant to documents produced in the lifetime of the system. But the TeX family has clearly lived far longer than expected, and issues in its design are starting to affect new uses for the documents.

Posted by: bane on September 14, 2009 9:56 AM | Permalink | Reply to this

Re: Use of braces

Wouldn’t that effect be better achieved with

\everydisplay={\mathcode`\+="002B}

(forcing ++ to be a mathord instead of a mathbin in displayed equations)?

Posted by: Mike Shulman on September 14, 2009 8:04 PM | Permalink | Reply to this

Re: Use of braces

That’s probably a higher level way to do it, although it obviously needs extending to all the other mathematical operations and relations I tend to use in displayed equations. (In case anyone’s wondering, there’s a greater tendency to have “word” variable names and subscripts in CS, which combined with two columns means most displayed equations take just over a line in the “natural” spacing.)

My reason for doing it this way is only that my knowledge came from the TeXbook, with a bit of LaTeX knowledge bolted on.

The bigger point was that curly braces don’t always leave the appearance unchanged, and hence using them to denote structure will run into some corner cases.

Posted by: bane on September 15, 2009 10:09 AM | Permalink | Reply to this

Re: Herding cats

itex2MML translates matched pairs of braces into <mrow>s.

I didn’t know that; where is it documented? Now that I know it, maybe I’ll make an effort.

However, the way I read the proposal, the idea was for a system that would be able to infer this sort of missing information from the context. It seems to me that in many cases, such as your example, this is just a matter of type-checking. If a, b, and f are all variables representing numbers, then f(a+b) can only mean f times (a+b), whereas if a and b are numbers and f is a function, then f(a+b) probably means f applied to (a+b) (unless you’re multiplying functions by numbers pointwise–but that’s often written only with the number on the left). In cases where more than one interpretation type-checks, it seems plausible to me that a computer could still sometimes infer the probable intent from the context, just as a human does. For example, if later on one encounters the statement g(f(a+b)) = (g∘f)(a+b), it is a good bet that f(a+b) meant function application and not pointwise multiplication.
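As a toy version of that type-checking step (a Python sketch of my own, not anyone’s actual parser), the disambiguation could look like this:

# Toy sketch: decide how to read f(a+b) from the declared type of f.
def read_juxtaposition(f_type):
    if f_type == "number":
        return "multiplication"    # f times (a+b)
    if f_type == "function":
        return "application"       # f applied to (a+b)
    return "ambiguous: look at the surrounding context"

# read_juxtaposition("number")   -> "multiplication"
# read_juxtaposition("function") -> "application"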

Posted by: Mike Shulman on September 14, 2009 6:39 AM | Permalink | Reply to this

Re: Herding cats

itex2MML translates matched pairs of braces into <mrow>s. I didn’t know that; where is it documented?

Alas, there isn’t any technical documentation on how itex2MML is implemented.

You might guess that this is how it works, based on the description of how the \color command works. But, really, that would only occur to you if you knew what an <mrow> element was, in the first place. And that would put you in a very small minority indeed.

Now that I know it, maybe I’ll make an effort.

Great! Welcome to a very elite club.

Next thing you know, you’ll be using the \tensor{}{} and \multiscripts{}{}{} commands.

(Jason Blevins and I worked quite hard to write the LaTeX macros to implement those commands. You can see them in the TeX export in Instiki.)

However, the way I read the proposal, the idea was for a system that would be able to infer this sort of missing information from the context.

If you have sufficient context, you may be able to guess (humans, after all, manage to). Otherwise, you have to rely on people entering (correct!) metadata about what all the symbols mean (e.g., whether ff is a function or a variable).

That’s when you run into (some of) the problems mentioned in Cory Doctorow’s article.

Posted by: Jacques Distler on September 14, 2009 7:23 AM | Permalink | PGP Sig | Reply to this

Re: Herding cats

MS: However, the way I read the proposal, the idea was for a system that would be able to infer this sort of missing information from the context.

JD: If you have sufficient context, you may be able to guess (humans, after all, manage to). Otherwise, you have to rely on people entering (correct!) metadata about what all the symbols mean (e.g., whether f is a function or a variable).

A typical mathematical document together with the background of a trained reader contains everything needed to understand the paper. FMathL is therefore supposed to guess the interpretation from the context and from past experience, just as any mathematician does.

If this is not enough, a mathematician decides that the formula (or sentence, or article, or book) is too poorly written to merit understanding, and skips to the next formula (or sentence, or article, or book), perhaps coming back later, when the context has become richer. FMathL will be taught to do the same.

But since you seem to know MathML well, I wonder what you say to our study Limitations in Content MathML.

Posted by: Arnold Neumaier on September 14, 2009 4:25 PM | Permalink | Reply to this

Re: Herding cats

A typical mathematical document together with the background of a trained reader contains everything needed to understand the paper. FMathL is therefore supposed to guess the interpretation from the context and from past experience, just as any mathematician does.

Sounds like you’re trying (among other things) to develop a knowledge representation for mathematics.

Good luck with that!

But since you seem to know MathML well, I wonder what you say to our study Limitations in Content MathML.

I’ll take a look. But, off the top of my head, I’d say that one’s view of the (in)adequacy of CMML depends on what you think its purpose is.

I see the primary use of CMML as a common data-interchange format between symbolic manipulation programs. For that purpose, I think it works passably well.

If you have some fancier use-case in mind, your answer may be different …

Posted by: Jacques Distler on September 14, 2009 5:18 PM | Permalink | PGP Sig | Reply to this

Re: Herding cats

JD: But, off the top of my head, I’d say that one’s view of the (in)adequacy of Content MathML depends on what you think its purpose is.

We were looking for what we could use to support FMathL activities (in particular, the representation of common formulas in mathematics, including block matrices in linear algebra and index notation for tensors) and simply recorded our failure to find it in Content MathML.

The Content MathML document MathML2 of course takes a much more modest view of what it wants to achieve:

“It would be an enormous job to systematically codify most of mathematics - a task that can never be complete. Instead, MathML makes explicit a relatively small number of commonplace mathematical constructs, chosen carefully to be sufficient in a large number of applications. In addition, it provides a mechanism for associating semantics with new notational constructs. In this way, mathematical concepts that are not in the base collection of elements can still be encoded”

Unfortunately, the mechanism provided turned out to be almost useless.

Fortunately, the outlook is not as pessimistic as this disclaimer lets one guess, and we are close to a good solution (but not using MathML).

JD: I see the primary use of CMML as a common data-interchange format between symbolic manipulation programs. For that purpose, I think it works passably well.

I never tried to use automatic symbolic manipulation involving the definition of a covariant derivative in index notation. But if there is a differential geometry package that can do that, it will not be able to use Content MathML.

Posted by: Arnold Neumaier on September 15, 2009 12:04 PM | Permalink | Reply to this

Re: Herding cats

{ f(a+b) }^2
f { (a+b) }^2

From a strictly presentational point of view (which is, in this case, the point of view of LaTeX), these are wrong, since they put the superscript on the wrong element (the group instead of simply the right parenthesis).

The difference in these cases is tiny, but it exists; replace a with \sum_{i=1}^n a_i in a displayed equation to see better how it works. Of course, in that case, you probably want to use larger parentheses, so go ahead and use \left and \right; the spacing works differently. (But now the effect will be tiny again and in fact too subtle for iTeX2MathML.)

The problem is that grouping has meaning for TeX that may or may not match the semantic meaning that you intend to convey to MathML. It would be better (at least theoretically) to have a grouping command that is ignored by TeX but interpreted in MathML.

{(a+b)}^2_2

(a+b)^2_2

{(\sum_{i=1}^n a_i+b)}^2_2

(\sum_{i=1}^n a_i+b)^2_2

{\left(\sum_{i=1}^n a_i+b\right)}^2_2

\left(\sum_{i=1}^n a_i+b\right)^2_2


Posted by: Toby Bartels on September 14, 2009 10:23 AM | Permalink | Reply to this

Re: Herding cats

Here’s a screenshot of the same examples in TeX.

[Screenshot: three equations, with and without braces]

To me, the first and second and the fifth and sixth examples look nearly identical.

The third and fourth, of course, look radically different. But I’d say that’s because the parentheses are too small, and neither one looks “right” to me.

In each case, at least in my browser, the MathML, generated by itex2MML, looks pretty darned close to the TeX output.

Posted by: Jacques Distler on September 14, 2009 3:26 PM | Permalink | PGP Sig | Reply to this

Re: Herding cats

To me, the first and second and the fifth and sixth examples look nearly identical.

To tell the difference between the fifth and the sixth, I have to stack two of one on top of one of the other and look carefully at the vertical positions where the subscript of one line comes near the superscript of the next line. I can tell the difference between the first and second simply by looking at the gap between the multiscripts, but I still agree that they look nearly identical.

If you buy TeX's philosophy about how mathematical typesetting should be built out of boxes, then one is technically right and the other technically wrong, despite the small size of the practical effect. But if you think that CMML or something like it is the wave of the future, then this shouldn't matter to you, and putting grouping in iTeX is a good idea, even if it produces technically incorrect TeX. Someday we should be able to print MathML just as nicely as we can now print TeX (and it's already close enough for the screen, at least when MathML supports everything), and then there will never be a need to use the TeX export.

Now here's a more practical consideration. I don't like the size of the delimiters produced by \left and \right; I've written macros that replace them with something slightly smaller. To keep things simple (even though it produces something slightly smaller yet than what I would like), let's do it with \bigg:

{\bigg(\sum_{i=1}^n a_i+b\bigg)}^2_2

\bigg(\sum_{i=1}^n a_i+b\bigg)^2_2


Actually, the difference in the MathML here is still pretty subtle, at least on my browser. I can see it better in actual TeX (with displayed equations).

If MathML is the future, then fudging the heights of the delimiters like this would be a job for a stylesheet. I have no idea how to do such a thing, however.

Posted by: Toby Bartels on September 14, 2009 9:45 PM | Permalink | Reply to this

Re: Cat On A Hot Tin Roof

Category Theory began — I’m talking about Aristotle here — with the observation that signs, symbols, syntax, and so on … are inherently equivocal, which means that we must refer them to the right categories of interpretation if we want to resolve their ambiguities.

In other words — to borrow a word that Peirce borrowed from Aristotle — there is an inescapably abductive element to the task of interpretation.

See, for example, Interpretation as Action : The Risk of Inquiry

Posted by: Jon Awbrey on September 14, 2009 6:30 PM | Permalink | Reply to this

Re: Cat On A Hot Tin Roof

Category Theory began — I’m talking about Aristotle here — […]

Although our term ‘category’ does come from Aristotle (via Kant), I would say that what Aristotle discusses is more type theory than category theory. He's still correct, however!

Posted by: Toby Bartels on September 14, 2009 9:44 PM | Permalink | Reply to this

Re: Cat On A Hot Tin Roof

Although our term ‘category’ does come from Aristotle (via Kant), I would say that what Aristotle discusses is more type theory than category theory.

It might be nice to have a paragraph with discussion of this terminology issue at category theory.

Unless we have already…

Posted by: Urs Schreiber on September 15, 2009 7:36 AM | Permalink | Reply to this

Re: Cat On A Hot Tin Roof

We’ve already aired the Kant connection.

Posted by: David Corfield on September 15, 2009 8:51 AM | Permalink | Reply to this

Philosophical Excavations

I cited my favorite locus classicus from Aristotle here:

Peirce’s first cut — it’s the deepest — is here:

Posted by: Jon Awbrey on September 15, 2009 2:12 PM | Permalink | Reply to this

Re: Cat On A Hot Tin Roof

I wrote:

It might be nice to have a paragraph with discussion of this terminology issue [the two meanings of “category”] at category theory.

David reacted:

We’ve already aired the Kant connection.

I don’t see any of this at nLab: category theory

At least the MacLane-quote that Jon points to should be copied there.

I’ll take care of that now. But I won’t try to talk about Aristotle et al. That’s not my job.

Posted by: Urs Schreiber on September 15, 2009 5:52 PM | Permalink | Reply to this

Technical questions on the input interface

I have some technical questions on the blogging software. Maybe some things can be improved and/or explained.

1. When I use the text filter “itex to MathML with parbreaks”, how do I quote a piece of text from a previous mail? Trying to copy it with the mouse produces unintelligible output.

2. The interface provides the possibility to “Remember personal info”, but why can’t it remember the text filter used last time? I regularly forget to set it in the first preview to what I want.

3. It would be nice if the switch between “view chronologically” and “view threaded”, at present at the bottom of the whole page, would appear at the bottom of each message.

4. Why are the “Previous Comments and Trackbacks” repeated in each response window? It only makes navigating in the window more difficult (tiny motions have large consequences for long discussions like this one). I’d prefer to have a larger comment window.

5. The options are visible only before the first preview, which I found a bit of a nuisance. Also, it would be nice if they (and the name information) appeared after the comment window rather than before it, since this saves scrolling in the first round.

Posted by: Arnold Neumaier on September 16, 2009 8:34 PM | Permalink | Reply to this

Re: Technical questions on the input interface

When I use the text filter “itex to MathML with parbreaks”, how do I quote a piece of text from a previous mail? Trying to copy it with the mouse produces unintelligible output.

Copy it with the mouse and put a > character in front of it. This doesn’t work for math symbols, however. Several of us have been complaining about this for a while, but no one has fixed it yet.

The interface provides the possibility to “Remember personal info”, but why can’t it remember the text filter used last time?

I’ve complained about this a couple times also, but never got any answers.

Posted by: Mike Shulman on September 16, 2009 8:45 PM | Permalink | Reply to this

Re: Technical questions on the input interface

When I use the text filter “itex to MathML with parbreaks”, how do I quote a piece of text from a previous mail? Trying to copy it with the mouse produces unintelligible output.

Copy it with the mouse and put a > character in front of it.

No, that doesn't work with that filter; instead see the stuff about ‘blockquote’ at this FAQ.

Or try using the ‘Markdown with itex to MathML’ filter; it's a lot more powerful and includes this feature.

Posted by: Toby Bartels on September 16, 2009 9:52 PM | Permalink | Reply to this

Re: Technical questions on the input interface

There's a thread for this stuff … not that you'll find answers to your questions there.

Posted by: Toby Bartels on September 16, 2009 9:47 PM | Permalink | Reply to this

Re: Technical questions on the input interface

There's a thread for this stuff …

I've copied it there.

Posted by: Toby Bartels on September 16, 2009 10:16 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

Just a note from the sidelines…

As a layman, I’m thoroughly enjoying this conversation. I’m starting to see an image of language itself being explained by category theory, which is mind boggling but makes perfect sense.

I always half-jokingly described mathematics as a “foreign language” similar to the way some organizations recognize fluency in a programming language as a “foreign language”.

Mathematics (maybe at the undergraduate level?) then seems like a perfect progression in the attempt to understand language and communication “arrow theoretically”.

I obviously don’t know what I’m talking about, but that is the hazy picture beginning to form in my mind.

By the way, since the goal seems to be to formalize the language of mathematics and eventually implement a computer system, then it also seems like it would make sense to develop the most fundamental mathematics concepts using the most fundamental computational concepts. Think about how computers encode information. Bits. Packets.

Now I’m just thinking out loud…

Posted by: Eric Forgy on September 18, 2009 3:49 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

After posting this, my mind started racing and I remembered that probably the most primordial object in category theory is the (-2)-category “True”. There are two (-1)-categories “True” and “False”.

Is it a coincidence that the most fundamental concept in a computer is the bit?

It would be fun to trace the development of information content via bits on a computer alongside the development of information content via category theory, beginning with the (-2)-category True.

Posted by: Eric Forgy on September 18, 2009 4:42 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

Eric wrote “Is it a coincidence that the most fundamental concept in a computer is the bit?”

Just to note that modern digital computers are not the only kinds of computational devices. E.g., there were the old-time analogue computers, there are “multi-layer perceptrons”, and there are cellular automata (which, whilst having discrete states, don’t make binary states seem “specially nice”). So, whilst there’s a very good case to be made that binary digital circuitry is the fundamental idea of computation, it’s not completely obvious that this is the case.

(And that’s without discussing if quantum mechanics changes things.)

Posted by: bane on September 18, 2009 5:19 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

Let me change my question to:

Is it a coincidence that the most fundamental concept in a digital computer is the bit?

I find it somehow compelling that the building block of the periodic table is also the building block of the digital computer.

Maybe I am giddy because it is Friday, but that somehow seems profound to me.

Posted by: Eric Forgy on September 18, 2009 5:42 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

I wasn’t trying to dismiss the concept of a connection; indeed, given that digital computers are created by human beings who are very fond of True and False, there’s almost certainly a deep connection. I was just pointing out that it’s currently unclear whether the “fundamental human approach to computation” is close to the “fundamental approach to computation”.

In terms of other connections, there’s an obvious relation to information theory: the most basic questions you can ask are ones with True or False as answers. But that’s another difficult-to-formalise connection.

Posted by: bane on September 18, 2009 6:21 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

Eric, I think you would enjoy this paper, which packs a lot of thinking into 15 pages. I like the idea of category theory describing structures, and then structures within those structures, as an organization of, rather than a foundation for, mathematics.

philosophy.ucdavis.edu/landry/2CategoryTheoryTheLanguage.pdf
CATEGORY THEORY: THE LANGUAGE OF MATHEMATICS by Elaine Landry

“Rather, ‘like’ in the sense that just as mathematics, in virtue of its ability to classify empirical and/or scientific objects according to their structure, presents us with those generalized structures which can be variously interpreted. Likewise, then, a specific category, in virtue of its ability to classify mathematical concepts and their relations according to their structure, presents us with those frameworks which can be variously interpreted.7 It is in this sense that specific categories act as “linguistic frameworks” for concepts: they allow us to organize our talk of the content of various theories in terms of structure, because “[i]n this description of a category, one can regard “object,” “morphism,” “domain,” “codomain,” and “composites” as undefined terms or predicates” (Mac Lane 1968, 287).

In like manner, a general category, in virtue of its ability to classify mathematical theories and their relations according to their shared structure, presents us with those frameworks which can be variously interpreted.8 It is in this sense that general categories act as “linguistic frameworks” for theories: they allow us to organize our talk of the common structure of various theories in terms of structure, because in this description of a category one can regard “object,” “functor,” etc., as undefined terms or predicates. That is, general categories allow us to organize our talk of the structure of various theories in the same manner in which the various theories of mathematics are used to talk about the structure of their objects, viz., as “positions in structures.”

We say that category theory is the language of mathematical concepts and relations because it allows us to talk about their specific structure in various interpretations, that is, independently of any particular interpretation. Likewise, our talk of the relationship between mathematical theories and their relations is represented by general categories. We say that category theory is the language of mathematical theories and their relations because it allows us to talk about their general structure in terms of “objects” and “functors,” wherein such terms are likewise taken as ‘syntactic assemblages waiting for an interpretation of the appropriate sort to give their formulas meaning’.

Posted by: Stephen Harris on September 22, 2009 2:26 AM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

Thanks for the reference Stephen! I’ve been traveling lately, hence the slow response, but managed to read this on a plane.

You certainly don’t owe me a coffee, but I’ll take it as an invitation and if you ever find yourself near LA with some free time, let me know as well :)

The thing that I find fascinating about this whole conversation is the glimmers of an “arrow theoretic” formulation of communication itself. For example, what is “w” arrow theoretically?

Then a close cousin (or ancestor) of communication is information. How does category theory encode information? Can that be quantified?

In principle, it would seem to be possible to formulate a complete “arrow theoretic” means of communication.

Posted by: Eric Forgy on October 1, 2009 7:09 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

I’m starting a new thread because this is a reply to several different comments and because the nesting level in threaded view is getting ridiculous.

I think there is one fundamental problem that is cropping up repeatedly in several guises in this discussion, which I mentioned up here. It appears to have led me astray in a few places as well (as it’s done before in the past; argh!). Namely, in unaugmented structural set theory (including ETCS and SEAR and type theory), a structured set is not a single object in the domain of discourse. For instance, in none of these cases does the formal language allow one to talk about a group. A group in SEAR (to be specific) consists of a set $G$ and an element $e\in G$ and a function $m:G\times G \to G$, such that certain axioms are satisfied. By contrast, in ZF, a group can be defined to consist of an ordered triple $(G,e,m)$ such that $e\in G$, $m$ is a function $G\times G\to G$, and certain other axioms are satisfied.
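
To make the contrast concrete, here is a rough Coq sketch (the names and the truncated axiom list are mine; this is only an analogue of, not a rendering of, SEAR or ZF): the “several objects” reading keeps the carrier, the unit, and the multiplication as separate hypotheses, while the ZF-style reading bundles them into a single Record.

Section UnbundledGroup.
  Variable G : Set.                 (* the underlying set *)
  Variable e : G.                   (* the unit element *)
  Variable m : G -> G -> G.         (* the multiplication *)
  Hypothesis assoc  : forall x y z : G, m (m x y) z = m x (m y z).
  Hypothesis unit_l : forall x : G, m e x = x.
  (* statements "for any group ..." are proved under these hypotheses *)
End UnbundledGroup.

(* The bundled, tuple-like reading: a group is a single object. *)
Record BundledGroup : Type := {
  gcarrier : Set;
  gunit    : gcarrier;
  gmul     : gcarrier -> gcarrier -> gcarrier;
  gassoc   : forall x y z, gmul (gmul x y) z = gmul x (gmul y z);
  gunit_l  : forall x, gmul gunit x = x
}.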

This is what I meant when I said here that the relation $\in$ between rationals and (Dedekind) reals is extra structure on the same set of reals. The ordered field of reals consists of the set $\mathbb{R}$, some elements $0,1\in \mathbb{R}$, some functions $+,\cdot:\mathbb{R}\times\mathbb{R}\to \mathbb{R}$, a relation $({\le}): \mathbb{R} \looparrowright \mathbb{R}$, etc. The relation $({\in}): \mathbb{Q} \looparrowright \mathbb{R}$ is one more piece of structure on the same set $\mathbb{R}$, which you can use or not use as you please.

Likewise, this is what I think is going on with opposites. If I’m working with a group as above, then I can construct its opposite group to consist of the same set $G$, the same element $e$, and a reversed function $m$. Now of course it makes sense to ask whether an element of $G$ and an element of $G^{op}$ are equal, since they are elements of the same set. On the other hand, if I am just given two groups $G$ and $H$, it makes no sense to ask whether an element of $G$ is equal to an element of $H$. So what I said here is not quite right, and I apologize: what I should have said is that it would be a type error to compare elements of (sets underlying) two different structures unless those structures are built on the same underlying set(s).

(If you’re going to object to the notion of sets being “the same,” I think the answer was provided by Toby: we mean the external judgment that two terms are syntactically equal, rather than a (disallowed) internal proposition that two terms refer to the same object.)

So I think it is misleading to speak about two different groups “having elements in common”—either we are talking about two different group structures on the same set, in which case the two have exactly the same elements by definition, or we are talking about group structures on different sets, in which case asking whether an element of one is equal to an element of the other is a type error. Therefore, I was also not quite right when I said that a structural system would not be able to construct categories having common objects: it can construct pairs of categories (such as a category and its opposite) that have the same collection of objects, but no more.

Now going back to the original subject of intersections, in a structural theory the operation of “intersection” does not apply to arbitrary pairs of sets, but rather to pairs of subsets of the same fixed ambient set. (One might argue that the intersection of a set with itself should be defined (and equal to itself), but I think this probably derives from a misconception that distinct sets in structural set theory are “disjoint,” rather than it just not being meaningful to ask whether their elements are equal.) In particular, it is not meaningful to speak of the intersection of the sets of objects of two categories.
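
As a tiny illustration of this point (my own toy encoding, not SEAR itself): if subsets of a fixed set are modelled in Coq as predicates, then intersection is an operation on two subsets of the same ambient set, and there simply is no operation taking two unrelated sets to “their intersection”.

(* Subsets of a fixed ambient set A, modelled as predicates on A. *)
Definition Subset (A : Set) := A -> Prop.

(* Intersection is defined only for two subsets of the same A. *)
Definition inter {A : Set} (P Q : Subset A) : Subset A :=
  fun x => P x /\ Q x.

(* Example: two subsets of nat and their intersection. *)
Definition evens : Subset nat := fun n => exists k, n = 2 * k.
Definition below5 : Subset nat := fun n => n < 5.
Definition even_below5 : Subset nat := inter evens below5.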


It never really occurred to me before to consider this peculiarity (that structured sets are not single things) as a problem, since “for any group, …” can always be interpreted as shorthand for “for any set $G$, element $e\in G$, and function $m:G\times G\to G$ such that <blah>, …”, and similar sorts of interpretations happen in other foundations like ZF. But I guess that this sort of implicit interpretation is part of the “compilation from high-level language to assembly language,” and what you want is a formalization of the higher-level language that (among other things) includes “a group” as a fundamental object of study.

I admit that this is definitely something I have found frustrating about existing proof assistants: they do not seem to really understand that “a group” is a thing. But somehow I never really isolated the source of my frustration before.

One way to deal with this (which I believe is adopted by many type theorists?) is to introduce one set called a “universe” $U$ whose elements are (interpreted as) sets in some way. Then a triple of sets can be modeled by an element of $U\times U\times U$, and so on. But a really structural approach would insist that sets are not the elements of any set, but rather the objects of a category—so what we really need is structural category theory, which doesn’t quite exist yet. Possibly this is what Lawvere was thinking of when he said that “when one wishes to go substantially beyond [ETCS], a much more satisfactory foundation… will be provided by a theory of the category of categories”—although I’ve always felt that one should really be thinking about the 2-category of categories.

Regardless, it does seem that existing structural frameworks fall short here. Now I feel all fired up to improve them! But that’s a whole nother kettle of worms.

I’m also feeling bad about hijacking this discussion with a long branching argument about the merits of structural/categorial set theory. If people have the energy, let’s also go back to FMathL and your larger proposal and see what other constructive things we can discuss about it. As I said way back at the beginning, I am really excited about the overall idea—which is perhaps what is driving me to be overly critical of FMathL, since I would like such a system to “get it right” (and for better or for worse, I usually seem to think I know what’s “right”). But as I’ve said, I feel like I understand a little better now where you are coming from and what problems you are trying to solve.

Posted by: Mike Shulman on September 21, 2009 7:26 AM | Permalink | PGP Sig | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

Namely, in unaugmented structural set theory (including ETCS and SEAR and type theory), a structured set is not a single object in the domain of discourse. For instance, in none of these cases does the formal language allow one to talk about a group.

One reason why material ($\mathbf{ZFC}$-like) foundations want to package things up into a tuple is that one might want to make these tuples elements of some other set. In structural foundations, you couldn't do that anyway; if you want a family of groups, then you need that to be parametrised by some index set $I$, and you have a group $G_k$ for every element $k$ of $I$. You could do that in material foundations too, of course, but if instead you want to allow a family to ‘parametrise itself’, then you need each of the objects in the family (groups, in this case) to formally be single objects that can be elements of a set.

I admit that this is definitely something I have found frustrating about existing proof assistants: they do not seem to really understand that “a group” is a thing.

In Coq, you can do this with Records. Formally, this is based on having a Type of Sets and all that (Records are just user-friendly sugar), but you never need to use anything about that type (such as the equality predicate on its elements, which would be evil).

But a really structural approach would insist that sets are not the elements of any set, but rather the objects of a category—so what we really need is structural category theory, which doesn’t quite exist yet.

I'd like to see structural $\infty$-groupoid theory; I think that I could do a lot with that, possibly everything that I want. Of course, there are already $\infty$-groupoids hidden in ordinary intensional type theory, but it's not clear that there are enough; I would really want types that are explicit $\infty$-groupoids, and nothing more.

Posted by: Toby Bartels on September 21, 2009 7:59 AM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

MS: it would be a type error to compare elements of (sets underlying) two different structures unless those structures are built on the same underlying set(s).

You need to make more exceptions to account for subcategories. There may be a category with object set C and two subsets A and B such that there are subcategories with object sets A and B. In this case, one must be able to compare their elements, too, because of the definition of a subcategory. With ZF or NBG as metalanguage (as usual), this already implies that one can compare objects from any two categories.

With SEAR as metatheory it might be different, since I do not yet have a good intuition about SEAR. (But I added a number of comments on the SEAR page that make me doubt that it is a mature enough theory.)

MS: let’s also go back to FMathL and your larger proposal and see what other constructive things we can discuss about it. As I said way back at the beginning, I am really excited about the overall idea

Yes, I’d appreciate that.

Posted by: Arnold Neumaier on September 21, 2009 3:25 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

You need to make more exceptions to account for subcategories. There may be a category with object set C and two subsets A and B such that there are subcategories with object sets A and B. In this case, one must be able to compare their elements, too, because of the definition of a subcategory.

Structurally, a “subcategory” of $C$ means a category equipped with an injective functor to $C$, just like a “subset” means a set equipped with an injective function (except in SEAR, where subsets are technically distinguished from their tabulations—but even there, it is only the tabulation which is itself a “set” and therefore can be the set of objects of a category). Therefore, if $A$ is a subcategory of $C$, then objects of $A$ cannot be compared directly to objects of $C$, but only after applying the inclusion functor.
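
For what it is worth, here is one way to render “a subset is a set equipped with an injective function” in Coq (a sketch with invented names, not a claim about how SEAR or ETCS is actually implemented); comparing an element of the subobject with an element of the ambient set then literally requires applying the inclusion first.

(* A structural "subset of Y": some set together with an injection into Y. *)
Record SubsetOf (Y : Set) : Type := {
  dom      : Set;
  incl     : dom -> Y;
  incl_inj : forall x x' : dom, incl x = incl x' -> x = x'
}.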

Posted by: Mike Shulman on September 21, 2009 5:56 PM | Permalink | PGP Sig | Reply to this

still on objects common to different categories

MS: if A is a subcategory of C, then objects of A cannot be compared directly to objects of C, but only after applying the inclusion functor.

Then please tell me why, formally, the following reasoning is faulty.

I am using Definitions 1.1.1 (category) and 1.1.3 (subcategory) taken from Asperti and Longo, modified to take account of your statement above. If you think these are faulty, please give me a reliable replacement that I may take as authoritative.

But I do not accept any moral injunction unless it is presented as a formal restriction to what a theorem prover would be allowed to do.

Let $C_{abcd}$ be the category whose objects are the symbols $a,b,c,d$, with exactly one morphism between any two objects, composing in the only consistent way. Let the categories $C_{abc}$ and $C_{abd}$ be defined similarly. Clearly, these are both subcategories of $C_{abcd}$, with the identity as the inclusion functor. But I can compare their objects for equality.

[related snippets of other mails]

MS: I would consider the class of objects of each category to be a separate type,

Would this work consistently in the above example?

AN: I first need to understand what “should” be understood after having read Definition 1.1.1 and what after Definition 1.3.1.

MS: I think that what “should” be understood at this point is that the authors made a mistake in stating the exercise.

This requires having also read the exercise. But my example above seems to indicate that something nontrivial and unstated should already be understood at the point these two definitions have been read.

What we do in this whole discussion is in fact the typical process of how mathematicians growing up in different traditions learn to align their language so that they may speak with each other without generating (permanent) misunderstandings. As a result, the language and awareness of all participating parties is sharpened, and then applicable to a wider share of mathematical documents.

Posted by: Arnold Neumaier on September 22, 2009 2:39 PM | Permalink | Reply to this

Re: still on objects common to different categories

if A is a subcategory of C, then objects of A cannot be compared directly to objects of C, but only after applying the inclusion functor.

Then please tell me why

I am not a mathematician, and I think this has nothing to do with category theory, or even with mathematics:

You assume that you can talk about objects (in the lay meaning of the word, not categorically) from A and C without taking care of defining the proper “universe of discourse” in which the denotations for objects in A and objects in C can validly appear in the same sentence.
This is a Platonist stance: the objects “exist” independently of the discourse about them.
I would say: not so. Platonism has been poisoning mathematics and philosophy for more than two millennia.

Posted by: J-L Delatre on September 22, 2009 8:49 PM | Permalink | Reply to this

Re: still on objects common to different categories

Let $C_{abcd}$ be the category whose objects are the symbols $a,b,c,d$, with exactly one morphism between any two objects, composing in the only consistent way. Let the categories $C_{abc}$ and $C_{abd}$ be defined similarly. Clearly, these are both subcategories of $C_{abcd}$, with the identity as the inclusion functor. But I can compare their objects for equality.

Yes, this is like the example of opposite categories. You started with the four objects $a,b,c,d$ and constructed things out of them. It should be possible to force you to use only copies of these four objects when you construct new things out of them, so that equality between them would not make literal sense, but I don't see the point in doing so. The constructions will still come equipped with operations to the type $\{a,b,c,d\}$ from the types of objects of the various categories, and we could compare them for equality along those operations. So why not formalise those operations as identity? That's how I would do it.

I don't know if that's how Mike would do it. One could not do that in $\mathbf{SEAR}$ (although one could for the example of opposite categories).

I think that it was John Armstrong who first wrote

But it really doesn’t matter, since equality of components of two distinct categories is not part of the structure.

I would not put it quite that way. I would say that equality of objects (or morphisms, etc) of two arbitrary categories is not meaningful; that is, in a context where all that is said about two categories $C$ and $D$ is that they are categories, then equality of their objects is not meaningful. However, if $C$ and $D$ are given to us in some more complicated way (such as $D := C^{op}$, or even $D := C$ for that matter), then that might give some meaning to equality of their objects.

There is nothing particularly special about categories in this respect; the same goes for elements of groups, for example.

Posted by: Toby Bartels on September 22, 2009 8:59 PM | Permalink | Reply to this

Re: still on objects common to different categories

TB: I would say that equality of objects (or morphisms, etc) of two arbitrary categories is not meaningful; that is, in a context where all that is said about two categories C and D is that they are categories, then equality of their objects is not meaningful. However, if C and D are given to us in some more complicated way, then that might give some meaning to equality of their objects.

In the standard interpretation of mathematical language that everyone learns as an undergraduate, the standard definition of a category implies the following:

Equality between two objects of two arbitrary categories is undecidable, while that of two categories given by some explicit construction may be decidable. (Something analogous holds for elements of groups, etc. in place of objects of categories.)

With this modification, I agree with you. Indeed, this is what you have in FMathL. But undecidable and meaningless are different notions - the first says that you cannot assign any definite truth value to it, the second says that it is not well-formed. With your formulation (using meaningless), one does not get something consistent.

Posted by: Arnold Neumaier on September 23, 2009 10:20 AM | Permalink | Reply to this

Re: still on objects common to different categories

In the standard interpretation of mathematical language that everyone learns as an undergraduate,

And which I unlearnt as a graduate (^_^)

But undecidable and meaningless are different notions - the first says that you cannot assign any definite truth value to it, the second says that it is not well-formed.

Right.

With your formulation (using meaningless), one does not get something consistent.

I can't imagine what you mean by this. What is inconsistent?

Posted by: Toby Bartels on September 25, 2009 6:50 PM | Permalink | Reply to this

Re: still on objects common to different categories

From another point of view, the construction $\{a,b,c,...\} \mapsto C_{a,b,c,...}$ is a functor from the Category of Sets to the Category of Categories. The Category of Categories doesn’t itself have a privileged functor from $C_{a,b,c}$ to $C_{a,b,c,d}$. The one that looks natural comes from the natural-looking map from $\{a,b,c\}$ to $\{a,b,c,d\}$ — which is precisely the absent thing in SEAR or (i.i.r.c.) ETCS.

But it gets worse (or better!): we often want to consider the 2-category of categories, functors, and natural transformations (or of groupoids, functors and natural (automatically) isomorphisms). And in this setting, there isn’t much reason to privilege the natural-looking functor from $C_{a,b,c}$ to $C_{a,b,c,d}$ over the functor that sends each object in $\{a,b,c\}$ to $d$! It’s a naturally isomorphic functor, and in this case all of them are equivalences, anyways.

But you asked about what’s wrong “formally”. Since I haven’t read all of what you mean by “formal”, I similarly don’t know if this addresses that question at all.

Posted by: some guy on the street on September 23, 2009 6:43 AM | Permalink | Reply to this

Re: still on objects common to different categories

sg: But you asked about what’s wrong “formally”. Since I haven’t read all of what you mean by “formal”, I similarly don’t know if this addresses that question at all.

Formally = in a way that makes it clear how to teach it to an automatic system like FMathL or Coq.

How do you prevent such a system from drawing the conclusions I draw when the only context given is a definition of a category and of a subcategory, but the system already knows how to handle the language of naive set theory (with {x in A | property} but not {x | property}) and of elementary algebra as taught in undergraduate courses?

None of the three answers given so far resolves this. My conclusions are perfectly allowed according to the usual conventions of reading mathematics.

Thus, in order to define the intended meaning unambiguously, one needs to specify (without using the concept of a category, since this is not yet born) either a different way of interpreting the same wording, or a different wording for the standard definitions.

Posted by: Arnold Neumaier on September 23, 2009 9:42 AM | Permalink | Reply to this

Re: still on objects common to different categories

How do you prevent such a system from drawing the conclusions I draw when the only context given are a definition of a category and of a subcategory, but the system already knows how to handle the language of naive set theory (with {x in A | property} but not {x | property}) and of elementary algebra as taught in undergraduate courses?

A strongly typed system would not conclude that there exists an $x$ such that $x$ is an object of $C_{a,b,c}$ and $x$ is an object of $C_{a,b,c,d}$, because none of that can be expressed in the language. Quantification requires a domain (which we can fix here), and being an object of some category is not a predicate (which we can't fix without changing what it says a bit).

I imagine that a less strongly typed system might be able to conclude something like that, while still rejecting $1 \in \sqrt 2$ as meaningless (at least in the default context). You would probably do this through subtyping. But I don't have much experience with subtyping.

A strongly typed system can still handle {x in A | property}. Assuming that property uses the variable x only where an element of A makes sense, then the system should accept this as specifying a subset of A (however the notion of subset is formalised).

So if you start with $C_{a,b,c,d}$, then you can construct the set of objects of $C_{a,b,c}$ as [the underlying set of] the subset {x in A | property}, where A is the set of objects of $C_{a,b,c,d}$ and property states that an element of $A$ equals $a$, $b$, or $c$. Then you can continue to get the entire category $C_{a,b,c}$. You can also give $C_{a,b,c}$ the structure of a subcategory of $C_{a,b,c,d}$. A system that knows about category theory could helpfully construct all of this for us as soon as we write down property and ask it to construct the corresponding full subcategory.

Now, any system (if it's any good for category theory) should be able to conclude this: There exists an object $x$ of $C_{a,b,c,d}$ that belongs to the subcategory $C_{a,b,c}$. (There is a formal distinction between $C_{a,b,c}$ as a subcategory of $C_{a,b,c,d}$ and $C_{a,b,c}$ as a category in its own right, which we normally ignore by abuse of language.) You already know about the distinction between being an element of a set —a typing declaration— and belonging to a subset —a relation between elements and subsets of a given set—; the same holds for being an object of a category and belonging to a subcategory.
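
A very simplified Coq analogue of this element/subset distinction (my example, not Toby's): belonging to the subset is a proposition about an element of the ambient set, while an inhabitant of the corresponding subset type is a different kind of thing, compared with elements of the ambient set only through the projection proj1_sig.

Definition Amb := nat.                               (* the ambient set *)
Definition P012 (x : Amb) : Prop := x = 0 \/ x = 1 \/ x = 2.

(* Belonging to the subset: a proposition about an element of Amb. *)
Definition belongs (x : Amb) : Prop := P012 x.

(* The subset as a set in its own right, with its inclusion into Amb. *)
Definition Sub := { x : Amb | P012 x }.
Definition up (y : Sub) : Amb := proj1_sig y.

(* Elements of Sub are compared with elements of Amb only via up. *)
Definition agrees (y : Sub) (x : Amb) : Prop := up y = x.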

Posted by: Toby Bartels on September 25, 2009 8:04 PM | Permalink | Reply to this

Re: still on objects common to different categories

From another point of view, the construction $\{a,b,c, \ldots\} \mapsto C_{a,b,c, \ldots}$ is a functor from the Category of Sets to the Category of Categories. The Category of Categories doesn’t itself have a privileged functor from $C_{a,b,c}$ to $C_{a,b,c,d}$. The one that looks natural comes from the natural-looking map from $\{a,b,c\}$ to $\{a,b,c,d\}$ — which is precisely the absent thing in SEAR or (i.i.r.c.) ETCS.

I’m not sure what this is supposed to mean. The desired inclusion $\{a, b, c\} \hookrightarrow \{a, b, c, d\}$ in $Set$ is constructed in ETCS by interpreting these sets as a 3-fold and 4-fold coproduct of copies of a terminal set $1$ and invoking universal properties of coproducts. So I have no idea what is meant by saying it’s “absent” from ETCS (or SEAR for that matter).

There’s a meta-theorem that ETCS and Bounded Zermelo set theory with Choice are bi-interpretable in one another, so that anything you can express in one is expressible in the other. This might be helpful in realizing what can and cannot be said in ETCS. Mike has also written down bi-interpretability statements in this vein in the SEAR article.

Posted by: Todd Trimble on September 23, 2009 1:35 PM | Permalink | Reply to this

Re: still on objects common to different categories

OK, I stand corrected.

Posted by: some guy on the street on September 23, 2009 4:09 PM | Permalink | Reply to this

Re: still on objects common to different categories

To be fair, though, I didn’t say quite what I should have. Better would have been: in ETCS, given a 4-fold coproduct of copies of $1$, whose four coproduct inclusions $1 \to 1 + 1 + 1 + 1$ are given names $a$, $b$, $c$, $d$, we can think of those inclusions as providing subsets, and then construct the union of the subsets $a, b, c$. If we give the 4-element set the name “$\{a, b, c, d\}$”, this union gives a subset which interprets what is standardly meant by the inclusion $\{a, b, c\} \hookrightarrow \{a, b, c, d\}$ in naive set-theoretic language.

Continuing the bridge between naive language and more formal language (and keeping in mind that in structural set theory, elements of $S$ are defined to be morphisms $1 \to S$), we go on to define a membership relation between elements of a set like $S = \{a, b, c, d\}$ and subsets of $S$, like the one we just named $\{a, b, c\} \hookrightarrow \{a, b, c, d\}$: we say an element $x$ of $S$ is a member of a subset $T \hookrightarrow S$ if $x: 1 \to S$ factors through the subset inclusion. Then the members of the subset $\{a, b, c\} \hookrightarrow \{a, b, c, d\}$ are indeed the elements of $S$ we called $a, b, c$, and everything is as it should be. But as you can see, some slight care is needed to give the naive language rigorous meaning in ETCS.

Posted by: Todd Trimble on September 23, 2009 5:05 PM | Permalink | Reply to this

Re: still on objects common to different categories

in structural set theory, elements of $S$ are defined to be morphisms $1\to S$

This is true in ETCS, but not in SEAR.

Posted by: Mike Shulman on September 23, 2009 5:18 PM | Permalink | PGP Sig | Reply to this

Re: still on objects common to different categories

Thanks. That’s what I meant.

Posted by: Todd Trimble on September 23, 2009 7:20 PM | Permalink | Reply to this

Re: still on objects common to different categories

Let $C_{abcd}$ be the category whose objects are the symbols $a,b,c,d$

I’m going to assume we’re talking about small categories, so that we can make formal sense of them in any set theory. As we’ve said repeatedly, there is nothing special about categories here and the presence of evil can muddy the waters, but since you’re insisting on talking about categories instead of, say, groups, let’s go on that way.

In structural set theory you can’t just pull things out of the air and make them into a set. They have to be given to you as elements of some other set. So where are those symbols coming from? I’m guessing that you have in mind some infinite set of symbols, which could be represented by $\mathbb{N}$, so that you can construct its subset $\{a,b,c,d\}$ (using, for example, an encoding $a=0, b=1, c=2, d=3$), which is (or, in SEAR, its tabulation is) another set equipped with a specified injection into $\mathbb{N}$. Now you can of course construct a further subset $\{a,b,c\}$ with a further injection into $\{a,b,c,d\}$. Those injections are then how you compare their elements.

Posted by: Mike Shulman on September 23, 2009 3:41 PM | Permalink | PGP Sig | Reply to this

What is a structured object?

MS: either we are talking about two different group structures on the same set, in which case the two have exactly the same elements by definition, or we are talking about group structures on different sets, in which case asking whether an element of one is equal to an element of the other is a type error.

Are we allowed in SEAR to talk about group structures on different subsets of the same set? Then we can again ask these questions.

Or is this question meaningless? You haven’t defined how to create in SEAR objects such as groups or group structures. Are the latter sets, elements, or relations?

Or are they a new type of formal objects that were not present before? Then how did they come into existence? (If you want to have objects of each category to be of a different type, you better first allow for a countable set of types in SEAR.)

Or are they only metaconcepts without a formal version, just a way of talking? (But on the metalevel, one seems to be able to compare arbitrary semantical constructs. Or do you want to impose restrictions on what qualifies for valid metastatements?)

MS: I guess that this sort of implicit interpretation is part of the “compilation from high-level language to assembly language,” and what you want is a formalization of the higher-level language that (among other things) includes “a group” as a fundamental object of study.

This seems necessary in order that a theorem prover can understand all conventions.

But it seems that in category theory one has “groups” as formal objects (of the category of groups), while “group structure on a set” is a meta-object only, consisting of a set $S$ and a group $G$ with $set(G)=S$, where $set$ is the forgetful functor that removes the operations.

So part of the problem appears to be that you switch between different points of view about what a group is (a formal object, or only a way of speaking that can be formalized only by eliminating the concept).

Asking the system to rewrite all occurrences of groups (and other structural concepts that in ZF would be tuples) by eliminating these concepts on the formal level may well create a huge overhead, in view of the nested object constructions we often have in mathematics.

I commented on a related issue at the pure set entry of the nLab (under Membership trees). [I just see that the double opening apostrophe led to an unexpected result there. Unfortunately, the nLab editing has no preview facility that would allow one to see things before posting.]

Too much is fuzzy for me to see what you really want to have.

Posted by: Arnold Neumaier on September 22, 2009 3:42 PM | Permalink | Reply to this

Re: What is a structured object?

You haven’t defined how to create in SEAR objects such as groups or group structures. Are the latter sets, elements, or relations?

Given a set $G$, a group structure on $G$ is a function from $G \times G$ to $G$ such that …. Functions and products have already been discussed, and the condition … can be stated in the language of $\mathbf{SEAR}$. I claim, as a partisan of structural set theory, that all definitions and proofs in ordinary mathematics are like this, modulo abuses of language (such as suppressing the inclusion function $X \to Y$ when $X$ is defined as a subset of $Y$) that are no worse than the abuses used with $\mathbf{ZFC}$.

(If you want to have objects of each category to be of a different type, you better first allow for a countable set of types in SEAR.)

This is already present; $\mathbf{SEAR}$ is a theory in first-order logic, so it already has a countable set of variables. It is a dependent type theory, and each pair of variables for a set gives a type of relations; the only other type in it is the type of sets. As $1 + 2 \aleph_0 = \aleph_0$, that is the number of types. (Of course, this is all on the metalevel.)

Asking the system to rewrite all occurences of groups (and other structural concepts that in ZF would be tuples) by elimination of these concept on the formal level probably may create a huge overhead in view of the nested object constructions we often have in mathematics.

On the contrary, a series of definitions like ‘A group is a set equipped with a function ….’, ‘A ring is a group equipped with a function ….’, and ‘An ordered field is a ring equipped with a relation ….’ leads in the ‹A structure is a tuple.› view to an ordered field being a pair with a pair with a pair $(((S,+),\cdot),\lt)$; quite a mess! While in the ‹A structure consists of several objects.› view, it simply leads to a set, two functions, and a relation.

On the other hand, one can take the structure-as-tuple view in a structural foundation, using something like Coq's Records, if one wants to. But this requires a richer ground type theory than $\mathbf{SEAR}$ has.

[I just see that the double opening apostrophe led to an unexpected result there. Unfortunately, the nLab editing has no preview facility that would allow one to see things before posting.]

[Yes, many others have asked for a Preview; the downside is that one of the biggest complaints about MediaWiki is that it's too easy to lose your edit since you forgot that the Preview was not a Save! The philosophy in Instiki is that your Submit is a preview; if you don't like what you see, then you edit again, and it counts as only one edit in the history if your Submits are all within 30 minutes of each other and no other editor slips in between. See discussion here.]

Posted by: Toby Bartels on September 22, 2009 9:45 PM | Permalink | Reply to this

Re: What is a structured object?

AN: You haven’t defined how to create in SEAR objects such as groups or group structures. Are the latter sets, elements, or relations?

TB: Given a set G, a group structure on G is a function from G×G to G such that ….

OK. Thus a group structure on a set G is an object of type relation(GxG,G).

Now what is a group? Does one have to eliminate the concept of group in favor of that of a group structure when going from informal SEAR to formal SEAR as a first order logic with dependent types?

AN: Asking the system to rewrite all occurences of groups (and other structural concepts that in ZF would be tuples) by elimination of these concept on the formal level probably may create a huge overhead in view of the nested object constructions we often have in mathematics.

TB: On the contrary, a series of definitions like “A group is a set equipped with a function …” leads in the “A structure is a tuple” view to an ordered field being a pair with a pair with a pair $(((S,+),\cdot),\lt)$; quite a mess!

At present, every formalization of a piece of mathematics is a mess; this was not the point.

What I was referring to was the overhead in the length of the formalization. With ZF, you can formalize a concept once as a tuple, and then always use the concept on a formal level.

But with a composite thing that exists only as a way of speaking, the formalization must replace this thing in each occurrence by the defining way of speaking. If this happens recursively (and much of mathematics is deep in the sense of data structures), the size of the formal expression may explode to the point of making the automatic verification of simple high-level statements a very complex task.

TB: On the other hand, one can take the structure as tuple view in a structural foundation, using something like Coq’s Records, if one wants to. But this requires a richer ground type theory than SEAR has.

This is what I was aiming at. For reflection purposes, one cannot work in pure SEAR, while one can do that in pure ZF.

TB: The philosophy in Instiki is that your Submit is a preview; if you don’t like what you see, then you edit again, and it counts as only one edit in the history if your Submits are all within 30 minutes of each other and no other editor slips in between.

good to know.

Posted by: Arnold Neumaier on September 23, 2009 1:58 PM | Permalink | Reply to this

Re: What is a structured object?

Now what is a group? Does one have to eliminate the concept of group in favor of that of a group structure when going from informal SEAR to formal SEAR as a first order logic with dependent types?

I already answered this up here: “A group in SEAR consists of a set $G$ and an element $e\in G$ and a function $m:G\times G \to G$, such that certain axioms are satisfied.”

A group in SEAR is not a single thing in the universe of discourse. This is not a problem for formalization at a low level, but it may be undesirable when trying to formalize at a higher level, for all the reasons that you’ve given. But it doesn’t prevent SEAR from reflecting on itself formally.

In Isabelle, at least, “a group” is really defined as follows: given a type 'a, one constructs the type 'a group of group structures on 'a, and then defines a group (of type 'a) to be an element of 'a group. I presume this is what Coq’s Records are like as well. I don’t see why this couldn’t be done in SEAR just as well, though: given a set $A$ we can define the set $grp(A)$ of group structures on $A$ as a subset of $A^{A\times A}$. Of course just knowing an element of $grp(A)$ doesn’t give you anything unless you remember that $grp(A)$ was constructed from $A$ in a particular way.

Posted by: Mike Shulman on September 23, 2009 3:33 PM | Permalink | PGP Sig | Reply to this

Re: What is a structured object?

Arnold Neumaier wrote:

OK. Thus a group structure on a set G is an object of type relation(GxG,G).

Now what is a group? Does one have to eliminate the concept of group in favor of that of a group structure when going from informal SEAR to formal SEAR as a first order logic with dependent types?

In SEAR, yes.

I would not found FMathL directly on SEAR, if I were you. Besides any efficiency problems, it's simply more user-friendly to treat a group as a single object. I would probably give any computer system for abstract mathematics a simple dependent type theory with support for dependent sums (and probably only dependent sums, unless I really want to found the whole thing on type theory) and implement the interface similarly to Coq's Records.
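
To illustrate the dependent-sums option in code (a sketch assuming an unspecified type GroupStructure of group structures; the names are mine): a group then becomes a dependent pair of a carrier and a structure on it, which is essentially what a Record packages up for you.

(* Assume some type of group structures on a given carrier. *)
Parameter GroupStructure : Set -> Type.

(* A group as a dependent sum: a carrier together with a structure on it. *)
Definition Group : Type := { A : Set & GroupStructure A }.

Definition carrier (G : Group) : Set := projT1 G.
Definition structure (G : Group) : GroupStructure (carrier G) := projT2 G.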

For reflection purposes, one cannot work in pure SEAR, while one can do that in pure ZF.

I don't see what reflection has to do with it. It's a matter of convenience and (if you say so) efficiency. Writing SEAR in a dependent type theory with dependent sums doesn't add any strength to it (as long as you don't add quantification over types), since you can eliminate them (down to the base types).

Mike Shulman wrote in response:

In Isabelle, at least, “a group” is really defined as follows: given a type 'a, one constructs the type 'a group of group structures on 'a, and then defines a group (of type 'a) to be an element of 'a group. I presume this is what Coq’s Records are like as well. I don’t see why this couldn’t be done in SEAR just as well, though: given a set $A$ we can define the set $grp(A)$ of group structures on $A$ as a subset of $A^{A\times A}$. Of course just knowing an element of $grp(A)$ doesn’t give you anything unless you remember that $grp(A)$ was constructed from $A$ in a particular way.

What seems to be missing is the concept of just having a group, plain and simple, rather than a group of type 'a or an element of a set constructed from $A$ in a particular way.

Here is how you would define the type of groups in Coq:

Record Group: Type := {uGroup: Set; sGroup: GroupStructure uGroup}.

That is, a group consists of a set and a group structure on that set. If you have G of type Group, then uGroup G is of type Set and sGroup G is of type GroupStructure (uGroup G). I'm assuming that we've already defined GroupStructure, since that is not a problem even in SEAR, but it might be more user-friendly not to do this but instead to put everything into the Record:

Record Group: Type := {uGroup: Set; mGroup: uGroup -> uGroup -> uGroup; aGroup: forall (x y z: uGroup), mGroup (mGroup x y) z = mGroup x (mGroup y z);

and so on.

Posted by: Toby Bartels on September 25, 2009 8:11 PM | Permalink | Reply to this

Re: What is a structured object?

Let me say more explicitly the following: I am not asserting (any more) that structural set theory is sufficient for what you want to do. I can’t speak for anyone else, but I accept that structural set theory, whether ETCS or SEAR or whatever, has flaws preventing it from being used for the high-level computerized mathematical tool you want to create. They are different flaws from the flaws of ZF, and they are different from the flaws that it sounded to me like you were ascribing to it in your introduction to FMathL, but they are flaws nonetheless.

I do assert that structural set theory is a sufficient low-level foundation for mathematics on a par with ZF, and I believe that it is closer to the way mathematicians treat sets in everyday practice. (Although the language they use, in particular words like “subset,” tends to evoke material set theory. This can certainly be a barrier to understanding structural set theory; I blame it on the accident of history and the current ascendancy of ZF as a foundation.) Thus, I would like it if a higher-level language could be based on, or at least more in line with the ideas of, structural set theory. You clearly disagree with these assertions as well, and I’m more than happy to continue discussing them, but let’s not confuse them with the difficulty of implementing things on a computer.

Posted by: Mike Shulman on September 23, 2009 4:06 PM | Permalink | PGP Sig | Reply to this

Material vs. structural foundations of mathematics

MS: I do assert that structural set theory is a sufficient low-level foundation for mathematics on a par with ZF, and I believe that it is closer to the way mathematicians treat sets in everyday practice.

I had a few days off to do things neglected while occupied with this time-consuming discussion. Instead of replying to the individual contributions, let me summarize how things look from my perspective.

My main conclusion from the present discussion and from reading the nLab pages on SEAR and pure sets is the following:

In a material theory, structural objects are constructed as anonymous objects chosen from the equivalence classes of mathematical structures of some kind with respect to isomorphism. Then one can do all structural mathematics inside suitable such collections of equivalence classes. However, to do so for nontrivial mathematics requires numerous abuses of language and notation; otherwise the description becomes tedious and virtually incomprehensible.

In a structural theory, material objects are constructed as the rigid objects in some category, with being isomorphic as equality. Then one can do all material mathematics inside suitable such collections of rigid objects. However, to do so for nontrivial mathematics requires numerous (but different) abuses of language and notation; otherwise the description becomes tedious and virtually incomprehensible.

Thus from a descriptive point of view, the material interpretation and the structural interpretation are equivalent approaches; neither describes the essence of mathematics, but each offers only a straitjacket into which this essence can be forced in a Procrustean way, and which of them one feels better in is a matter of taste. My taste is that neither of these should be used. I want to be free of straitjackets. Thus I favor a declarative theory similar to FMathL, which accounts for the actual mathematical language and needs no abuses of language.

From a logical point of view, there is the additional question of the proof power of the two views. I’d find it surprising if there were a structural theory with proof strength equivalent to that of ZFC, but I’d find it plausible that there are hierarchies of structural theories and hierarchies of material theories such that each of the former has a proof strength inferior to one of the latter, and conversely. Thus from the logical point of view, the structural and the material approach are still equivalent, but in a weaker sense.

So what ultimately counts is the practical point of view. Here the advantage of the material point of view is very clear. After all, we already need a material free monoid to communicate mathematics. Moreover, the material point of view is nearly obvious to any newcomer, making for an easy entry point and plenty of very elementary exercises that lead to mathematical insight, while the structural point of view emerges only after having digested enough of the more elementary material mathematics.

On the other hand, for many problems, both the material and the structural perspective offer insights. Therefore a good foundation of mathematics should offer both views.

Thus for me the priorities are clear: Describe mathematics in a declarative way that allows naturally both material and structural constructs, but support it constructively with a material mathematical universe in which the structural realm is constructible in a transparent way that can be easily used.

Posted by: Arnold Neumaier on September 30, 2009 12:34 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

I’d find it surprising if there were a structural theory with proof strength equivalent to that of ZFC

As has already been mentioned several times, there are real theorems here. The book by Mac Lane and Moerdijk proves that ETCS is equiconsistent with a fragment of ZFC (Bounded Zermelo with Choice), but as shown by Colin McLarty, one can easily strengthen ETCS with structural axioms so that ETCS+ is equiconsistent (bi-interpretable) with full ZFC. Mike and others (Toby, David Roberts, and there may be others too) have written down details of similar equiconsistency statements involving SEAR.

In each case, the idea is the same; see Mac Lane-Moerdijk for details. To reflect ZFC in a structural set theory like ETCS+, one reflects material sets using well-founded rooted trees; “elements” of such a tree with root $r$ are subtrees rooted at the children of $r$. (There’s also the example of algebraic set theory – see the book by Joyal-Moerdijk, where models for ZFC are constructed as certain types of initial algebras.)
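
For readers who want to see the flavour of the trees-as-sets idea in code, here is a hedged Coq sketch (the type-theoretic variant usually attributed to Aczel; an analogue of, not a quotation from, the ETCS+ construction described above):

(* A well-founded tree: a branching type together with a family of subtrees.
   The subtrees at the root play the role of the "elements" of the set. *)
Inductive V : Type :=
  | sup : forall A : Type, (A -> V) -> V.

(* Membership: x is an "element" of (sup A f) when it is one of the subtrees.
   (Leibniz equality here is a simplification; a faithful development uses a
   recursively defined extensional equality of trees instead.) *)
Definition mem (x y : V) : Prop :=
  match y with
  | sup A f => exists a : A, f a = x
  end.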

Certainly in the case of ETCS and algebraic set theory, this material has been well worked over, so I’m curious as to why you express doubt about its validity.

Posted by: Todd Trimble on September 30, 2009 3:37 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

I believe the strengthening of ETCS to a theory equivalent with ZFC actually predates McLarty by quite some time (although McLarty’s axiom is a bit different). One reference (which I wanted to mention earlier but failed to remember) is

  • Osius, Gerhard, “Categorical set theory: a characterization of the category of sets”, JPAA 1974
Posted by: Mike Shulman on September 30, 2009 5:14 PM | Permalink | PGP Sig | Reply to this

Re: Material vs. structural foundations of mathematics

AN: I’d find it surprising if there were a structural theory with proof strength equivalent to that of ZFC

TT: As has already been mentioned several times, there are real theorems here. The book by Mac Lane and Moerdijk proves that ETCS is equiconsistent with a fragment of ZFC (Bounded Zermelo with Choice), but as shown by Colin McLarty, one can easily strengthen ETCS with structural axioms so that ETCS+ is equiconsistent (bi-interpretable) with full ZFC.

I didn’t express doubts but said I’d find it surprising, meaning that it would reveal something to me that would extend my intuition. Clearly, my intuition about structural foundations is more limited than yours, so I can be more easily surprised than you.

Maybe there is a theorem there, but the SEAR material here on the web doesn’t give complete proofs, and ETCS is, as you say, weaker than ZFC.

So I’d like to see Colin McLarty’s stronger version in order to understand what is missing in my intuition. Can you give me a reference to his work?

Posted by: Arnold Neumaier on September 30, 2009 5:33 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

Osius’ paper I cited above is a good one to look at. I think the paper of McLarty’s that we are referring to is “Exploring categorical structuralism” in Philos. Math.

I am also planning to include a more detailed proof, dealing with the non-well-founded case as well as the well-founded one, in my forthcoming paper “Unbounded quantifiers and strong axioms in topos theory,” which I will post about when it is in a state to be read by others.

Posted by: Mike Shulman on September 30, 2009 5:42 PM | Permalink | PGP Sig | Reply to this

Re: Material vs. structural foundations of mathematics

Yes, that was the paper by McLarty that I had in mind. Unfortunately I am not familiar with the paper of Osius, although I’m aware of it.

Posted by: Todd Trimble on September 30, 2009 6:07 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

MS: Osius’ paper I cited above is a good one to look at. I think the paper of McLarty’s that we are referring to is “Exploring categorical structuralism” in Philos. Math.

I got McLarty’s paper from the web; it refers to Osius for the crucial part. Osius’ paper is not free online, so it will take a while for me to get it and read it.

Posted by: Arnold Neumaier on September 30, 2009 6:15 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

I didn’t express doubts but said I’d find it surprising, meaning that it would reveal something to me that would extend my intoition. Clearly, my intuition about structural foundations is more limited than yours, so I can be easier surprised than you.

Hmm. Here is what you wrote:

I’d find it surprising if there were a structural theory with proof strength equivalent to that of ZFC, but I’d find it plausible that there are hierarchies of structural theories and hierarchies of material theories such that each of the former has a proof strength inferior to one of the latter, and conversely. Thus from the logical point of view, the structural and the material approach are still equivalent but in a weaker sense.

The last sentence, a straightforward declaration, sure reads like a rejection of the stronger sense. If you didn’t mean it that way, it is seriously misleading.

Posted by: Todd Trimble on September 30, 2009 6:29 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

AN: Thus from the logical point of view, the structural and the material approach are still equivalent but in a weaker sense.

TT: The last sentence, a straightforward declaration, sure reads like a rejection of the stronger sense. If you didn’t mean that way, it is seriously misleading.

I should have written: … but (to my present understanding) in a weaker sense.

But I thought that everything anyone says is to be considered subject to the restriction “according to the writer’s present understanding”.

Posted by: Arnold Neumaier on September 30, 2009 7:08 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

But I thought that everything anyone says is to be considered subject to the restriction “according to the writer’s present understanding”.

Yes of course, but that still doesn’t erase the fact that it’s a declaration of belief. My question was why you believe(d) it.

Posted by: Todd Trimble on September 30, 2009 7:33 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

TT: but that still doesn’t erase the fact that it’s a declaration of belief. My question was why you believe(d) it.

My present intuition tells me that equivalence is unlikely to hold. You tell me otherwise, and evidence of a proof (which I hope to gather by reading the paper by Osius - McLarty doesn’t have the details) may well change my belief.

Posted by: Arnold Neumaier on September 30, 2009 7:59 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

My present intuition tells me that equivalence is unlikely to hold.

Okay, thank you, that’s a good honest positive declaration. But you still haven’t said WHY. Why does your intuition tell you that?

Posted by: Todd Trimble on September 30, 2009 8:18 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

In a structural theory, material objects are constructed as the rigid objects in some category, with isomorphism playing the role of equality. Then one can do all material mathematics inside suitable such collections of rigid objects.

I am forced to conclude that you have not understood anything that we have been saying.

One can construct a model of material set theory inside structural set theory by using rigid trees or other models. This may be interesting to do if one doubts that they are equally strong. However, it is irrelevant for mathematical practice, because this is not, not, not how one does mathematics in a structural set theory! A group in structural set theory is a set with a multiplication operation and an identity satisfying the axioms—there is no need to equip this set with the superfluous extra structure of a rigid tree.
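Here is a minimal sketch in Lean 4 of that structural reading of “group” (an illustration only; the name GroupOn is ad hoc):

```lean
-- A group structure on a set G: operations plus axioms, with no commitment to what
-- the elements of G "are" and no encoding of G as a tree of sets.
structure GroupOn (G : Type) where
  mul : G → G → G
  one : G
  inv : G → G
  assoc   : ∀ a b c : G, mul (mul a b) c = mul a (mul b c)
  one_mul : ∀ a : G, mul one a = a
  inv_mul : ∀ a : G, mul (inv a) a = one
```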

After all, we already need a material free monoid to communicate mathematics.

I have no idea what that means.

the material point of view is nearly obvious to any newcomer, making for a simple entrance and plenty of very elementary exercises that lead to mathematical insight, while the structural point of view emerges only after having digested enough of more elementary material mathematics.

My experience in teaching newcomers to mathematics is that even material set theory is fraught with conceptual hurdles. At present, one tends to appreciate the structural point of view only after digesting some abstract mathematics (or, perhaps, never), but there’s no evidence that it has to be that way. I would argue that that’s an artifact of the fact that almost everyone is taught material set theory first, and hardly anyone is ever taught structural set theory.

Posted by: Mike Shulman on September 30, 2009 5:21 PM | Permalink | PGP Sig | Reply to this

Re: Material vs. structural foundations of mathematics

AN: In a structural theory, material objects are constructed as the rigid objects in some category, with being isomorphic as equality. Then one can do all material mathematics inside suitable such collections of rigid objects.

MS: I am forced to conclude that you have not understood anything that we have been saying.

Maybe, but then the communication barrier is deeper than we both think.

MS: A group in structural set theory is a set with a multiplication operation and an identity satisfying the axioms—there is no need to equip this set with the superfluous extra structure of a rigid tree.

But as far as this goes there is no difference at all to the material point of view. A group in material set theory is also a set with a multiplication operation and an identity satisfying the axioms.

ZF adds superfluous extra stuff in terms of tuples that are sets of sets of sets, while SEAR adds superfluous extra structure in terms of lots of trivial conversion and embedding functors.

None of this stuff is relevant for doing mathematics as it is done in practice.

But some of it is needed (in different ways) if one wants to force mathematics into either a purely material or a purely structural straitjacket. This is why I like neither of these constructive foundations. I want to avoid both extremes. (But find the material straitjacket still preferable to the structural one.)

AN: After all, we already need a material free monoid to communicate mathematics.

MS: I have no idea what that means.

The text displayed on the screen where you are reading this is composed of material elements of such a monoid.

Without language no communication of mathematics. But language needs a material monoid.

MS: strengthening of ETCS to a theory equivalent with ZFC actually predates McLarty

Thanks for the reference. I’ll try to get it…

Posted by: Arnold Neumaier on September 30, 2009 5:59 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

The text displayed on the screen where you are reading this is composed of material elements of such a monoid.

Yes, I think we knew you were referring to words, but how should we interpret what you mean by ‘material’?

For example, ‘word’ is a 4-tuple. In material set theory, there are various ways of representing 4-tuples, e.g.,

{w, {{w}, o}, {{{w}, o}, r}, {{{{w}, o}, r}, d}}

Is that what you meant by a “material element” of the free monoid? If so, why are you convinced that we need such constructions? If not, then what did you mean?

Posted by: Todd Trimble on September 30, 2009 7:04 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

TT: For example, ‘word’ is a 4-tuple. In material set theory, there are various ways of representing 4-tuples,

ZF and its relatives are not the only material theories.

And ‘word’ is not a 4-tuple; Kuratowski tuples form a monoid only under a very unnatural operation. Instead, it is an element of a free monoid generated by 4 material characters w, o, r, and d.

FMathL takes this into account.

Posted by: Arnold Neumaier on September 30, 2009 7:20 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

You haven’t answered my question.

What, sir, do you mean by “material element of a free monoid”? For I take it you were saying that words (not characters, words) are “material elements of free monoids”.

What about “material” is necessary here?

Posted by: Todd Trimble on September 30, 2009 7:47 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

TT: What, sir, do you mean by “material element of a free monoid”? For I take it you were saying that words (not characters, words) are “material elements of free monoids”.

What about “material” is necessary here?

If w and o are material characters then their product (well-defined in any monoid) is material, too.

At least this is the understanding I gained from the use of material and structural in your community.

But even if this is not what you understand by these terms, it is the meaning I want to give the term (and is how I used it in all my mails), since this is the way it works in FMathL.

Posted by: Arnold Neumaier on September 30, 2009 8:14 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

a free monoid generated by 4 material characters w, o, r, and d

I (at least) have no idea what you mean when you call these characters ‘material’. Surely you don't mean that they are themselves sets, with their own elements? Perhaps you mean that they can be compared for equality with any other mathematical object, but I fail to see how this is needed for anything.

To discuss words, we need a set A, called the alphabet, whose elements are called letters; then a word is an element of the free monoid on A. We only need to test letters for equality with other letters, which is provided by the set A. We need to test words for equality with other words, which the free monoid construction also provides; it even provides a test for equality of composites of tuples of words. This is all perfectly structural. Indeed, the concept of ‘free monoid’ is inherently categorial.
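Here is a minimal sketch of this structural reading of words in Lean 4 (an illustration only; the names Word and concatWords are ad hoc):

```lean
-- Words over an alphabet A are just lists of letters.  Letters are compared for
-- equality only with other letters, words only with other words; nothing else
-- about the letters is ever used.
abbrev Word (A : Type) : Type := List A

def emptyWord {A : Type} : Word A := []

def concatWords {A : Type} (u v : Word A) : Word A := u ++ v
```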

Posted by: Toby Bartels on September 30, 2009 7:49 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

TB: I (at least) have no idea what you mean when you call these characters ‘material’. Surely you don’t mean that they are themselves sets, with their own elements? Perhaps you mean that they can be compared for equality with any other mathematical object

No, I mean that they have an identity such that w can be recognized as the letter ‘w’, and not only as an anonymous element from some set. One needs not only to know that w is different from o but also that w is in fact ‘w’!

Structurally, there is no difference between any two 4-letter words with distinct letters.

But to know what a word means you need to know the identity of each letter. This is what makes things material in the sense I find most natural to give to this word, not that it is written as a set or that one can compare for equality.

TB: This is all perfectly structural. Indeed, the concept of ‘free monoid’ is inherently categorial.

I don’t think that material and structural are always in opposition.

If they were, it would not be possible to translate reasonably smoothly from a traditionally material view of mathematics to a structural view.

Posted by: Arnold Neumaier on September 30, 2009 8:15 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

But to know what a word means you need to know the identity of each letter.

And so you do. If formalised in ETCS (for example), each letter is a function from 1 to A, and these have their own identities.

This is what makes things material in the sense I find most natural to give to this word

I no longer remember who suggested to Mike that set theory in the style of ZFC be called ‘material’, and I don't think that I ever knew the reason. But unless you're claiming that this requires a foundation in the style of ZFC (in particular, with global equality and a global membership predicate), then I have no disagreement with you … but I don't see the relevance, either.

Posted by: Toby Bartels on September 30, 2009 8:29 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

AN: But to know what a word means you need to know the identity of each letter.

TB: And so you do. If formalised in ETCS (for example), each letter is a function from 1 to A, and these have their own identities.

Then please tell me which function from 1 to A is the letter w.

Posted by: Arnold Neumaier on September 30, 2009 9:10 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

It’s whatever function 1 → A has been named w.

It’s no different in principle from telling numbers apart. You could for example specify the subset A of ℕ consisting of the first 26 elements (with respect to, say, the standard ordering of ℕ), and decide to name the 23rd element w. From that point on, you know which function 1 → A is meant by “w”.

I think I can understand what’s behind the question. For example, the complex numbers i and -i behave exactly alike. But of course they are not the same. To deal with that, you can decide to represent the complex numbers as the quotient field ℝ[x]/(x^2 + 1) and then say “I’ve decided to name the residue class of x ‘i’.” You could have named it -i of course, but once you settle on the name, you stick with that.
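Here is a minimal sketch of the naming prescription in Lean 4 (an illustration only; Fin 26 stands in for the chosen 26-element set, and the names are ad hoc):

```lean
-- The alphabet: the first 26 naturals.  Its elements are mutually distinct but
-- otherwise anonymous; the naming below is extra data laid on top of the set.
abbrev Alphabet : Type := Fin 26

-- Name the 23rd element (in the standard ordering) 'w'.
def w : Alphabet := ⟨22, by decide⟩

-- 'w' is then provably distinct from, say, the element named 'o' (the 15th).
example : w ≠ (⟨14, by decide⟩ : Alphabet) := by decide
```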

Posted by: Todd Trimble on September 30, 2009 9:56 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

AN: please tell me which function from 1 to A is the letter w.

TT: It’s whatever function 1→A has been named w. […] You could for example specify the subset A of ℕ consisting of the first 26 elements (with respect to say the standard ordering of ℕ), and decide to name the 23rd element w. From that point on, you know which function 1→A is meant by “w”.

Thus you don’t need just a set A, called the alphabet, but you need a particular well-ordering of the set A before your prescription makes sense. In my view, giving a well-ordering to A is materializing the set A.

Strictly speaking you also need to name the letters, which is to give a mapping from A to the set of names for the letters, which must be material. Otherwise you cannot tell someone else formally which element represents which symbol.

But I think I understand your point of view, without agreeing with that it is any improvement over a more naive material point of view.

Posted by: Arnold Neumaier on September 30, 2009 11:28 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

In my view, giving a well-ordering to A is materializing the set A.

Well, that has very little (or nothing) to do with how I have been using the word “material.” Should we take “giving a well-ordering” as your definition of “materializing”? So that in particular, when you said “we already need a material free monoid to communicate mathematics,” we should have interpreted that as meaning “we need the free monoid on a well-ordered set?” I don’t think I would disagree with that latter assertion, but I don’t think it has anything to do with the material/structural divide in the way we have been using the words.

Posted by: Mike Shulman on October 1, 2009 5:03 AM | Permalink | PGP Sig | Reply to this

Re: Material vs. structural foundations of mathematics

AN: In my view, giving a well-ordering to A is materializing the set A.

MS: Well, that has very little (or nothing) to do with how I have been using the word “material.”

Then please give your definition of how the material and the structural point of view should be recognized.

MS: Should we take “giving a well-ordering” as your definition of “materializing”?

No. A set is materialized if it is given extra structure which makes its elements uniquely identifiable by giving a formal expression identifying it.

In particular, a well-ordering of a finite set materializes it since you can point to each particular element by a formal expression identifying it: ”the first element”, ”the second element”, etc.

If this is not the meaning of material then I have no clue why you can refer to ZF set theory as a material theory.

Posted by: Arnold Neumaier on October 1, 2009 10:33 AM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

No. A set is materialized if it is given extra structure which makes its elements uniquely identifiable by giving a formal expression identifying it.

I’m glad this has finally come out, although we’ve probably been doing an awful lot of talking past each other because we misunderstood how we each intended the word ‘material’.

It reminds me of a Buddhist story I once read. There was a man who worshipped Amitabha, who in traditional iconography is bright red, but the man had misunderstood or mistranslated and thought the color was gray, like ash from the fire. So whenever he meditated on and envisioned Amitabha, it was always a gray Amitabha. Finally the guy is on his deathbed, and just to be sure, asks his teacher what color Amitabha is, and on finding out bursts into laughing, saying, “Well, I used to think him the color of ash, and now you tell me he is red,” and died laughing.

‘Material’ as in “material set theory” is something I’d only heard in the last few months at latest. I just assumed it meant we were talking about a form of set theory founded on a global membership relation, like ZF, Bernays-Gödel, Morse-Kelly, etc. The “material” signified to me that elements had “substance” (I used the phrase ‘internal ontology’ before): could have elements which themselves could have elements, and so on.

Posted by: Todd Trimble on October 1, 2009 12:44 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

I have tried to clarify what I intended to mean by “material set theory” and “structural set theory” at the set theory page on the nlab. This is pretty close to what Todd said. In particular, “material” is a property of a theory, not of a set. When you start talking about giving a set extra structure, that can of course be done structurally just as naturally (as the word suggests).

Posted by: Mike Shulman on October 1, 2009 2:36 PM | Permalink | PGP Sig | Reply to this

Re: Material vs. structural foundations of mathematics

MS: I have tried to clarify what I intended to mean by “material set theory” and “structural set theory” at the set theory page on the nlab. […] In particular, “material” is a property of a theory, not of a set.

I added some remarks there. It doesn’t seem to define when an arbitrary theory is material, and hence does not define a property of a theory. It only defines the compound concept of a “material set theory”, and does this in terms too vague to decide questions such as whether FMathL is or isn’t a material set theory.

I very much prefer the concept of material vs. structural that I presented in a previous mail and extracted from your usage in the present discussion. (There you also used the terms “material foundations”, Todd Trimble and Toby Bartels used “material sets”, TB also used “material framework”; so the term clearly wants to be generalized…)

Posted by: Arnold Neumaier on October 1, 2009 4:13 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

Todd Trimble and Toby Bartels used “material sets”, TB also used “material framework”

For the record, I'll specify what I mean by these.

I use ‘material’ as short for ‘membership-based’, which itself really means ‘featuring a global membership predicate’, which means ‘featuring a binary predicate which, given any two terms for a set, returns a proposition whose intended meaning is that the first set is a member of the other’. This is not a purely syntactic concept; it depends on the intended meaning.

In front of ‘set theory’, ‘foundations’, or ‘framework’, this is exactly what ‘material’ means; but ‘material sets’ really means ‘sets in a material set theory’, which in turn might literally mean ‘terms for sets in a material set theory’ or ‘the intended meaning of terms for sets in a material set theory’.

Posted by: Toby Bartels on October 1, 2009 9:17 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

AN: A set is materialized if it is given extra structure which makes its elements uniquely identifiable by giving a formal expression identifying it.

TT: we misunderstood how we each intended the word ‘material’. […] ‘Material’ as in “material set theory” is something I’d only heard in the last few months at latest. […] The “material” signified to me that elements had “substance” (I used the phrase ‘internal ontology’ before): could have elements which themselves could have elements, and so on.

I hadn’t heard at all the term “material” in this context. Judging from a Google search, the term was coined in the n-Lab.

I guessed at the likely meaning from the examples of usage given by those discussing here. Being clearly a contrast to “structural” I was trying to see what sort of meaning I could give it that made sense in my general view of mathematics.

The only natural pair of informal contrasts I could find that matched reasonably were

structural = defined only up to isomorphism, independent of any particular construction

material = given in terms of concrete building blocks.

After having seen how material set theory was constructed within SEAR, I was able to make the second more specific to

material = being able to identify the elements uniquely by giving a formal expression identifying it.

This seemed to match, giving both a precise meaning to the terms and showing that the two concepts are not in complete opposition but have a common intersection that explains why both points of view can be taken as foundations and still be equivalent in some sense.

I am still in doubt about the precise nature of this equivalence. You had asked “why?” about my intuition, but I can’t pinpoint it at the moment. Perhaps reading Osius will help me understand my and his intuition.

But I think these terms, with the above meaning, are useful general notions, the endpoints of a continuum of ways of thinking about mathematics.

FMathL is trying to plough a middle ground here.

Posted by: Arnold Neumaier on October 1, 2009 2:59 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

The only natural pair of informal contrasts I could find that matched reasonably were

structural = defined only up to isomorphism, independent of any particular construction

material = given in terms of concrete building blocks.

After having seen how material set theory was constructed within SEAR, I was able to make the second more specific to

material = being able to identify the elements uniquely by giving a formal expression identifying it.

There’s a missing ingredient in your (informal) characterization of “structural” which I think is crucial to the discussion, and which actually is very close in spirit to the characterization of “material” quoted at the very end. Properly understood, there is no clash whatsoever between “material mathematics” as I understand your use of the term now, and structural mathematics.

The missing ingredient is that in general, structures defined by means of “universal elements” are defined up to canonical (uniquely determined) isomorphism.

The bit I recently wrote about what we mean precisely in describing ℝ[x] as ‘the’ “free ℝ-algebra on one generator” should suffice to illustrate what I mean. There can be many such structures (many realizations of such structure), but given any two of them, say (A, a: 1 → U(A)) and (B, b: 1 → U(B)), there is exactly one homomorphism f: A → B such that U(f)(a) = b. By a famous argument, this homomorphism must be an isomorphism. It is the (unique) canonical isomorphism between these two universal structures.

In particular, the only structure-preserving automorphism from (A, a: 1 → U(A)) to itself is the identity, and once this structure is given, we can uniquely specify elements therein by means of formal expressions. For instance, we are given a specified (explicitly named) formal generator a, and other elements are uniquely and formally specified by applying algebra operations in recursive fashion, starting with that a.

Of course, this is just standard practice of mathematicians; structural mathematicians shouldn’t be seen as doing anything different.

Posted by: Todd Trimble on October 1, 2009 6:05 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

‘Material’ as in “material set theory” is something I’d only heard in the last few months at latest. I just assumed it meant we were talking about a form of set theory founded on a global membership relation, like ZF, Bernays-Gödel, Morse-Kelly, etc.

Mike introduced the term to the discussion here. That is exactly what it means.

Posted by: Toby Bartels on October 1, 2009 8:11 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

Arnold wrote:

Thus you don’t need just a set A, called the alphabet, but you need a particular well-ordering of the set A before your prescription makes sense. In my view, giving a well-ordering to A is materializing the set A.

Well, that’s just one prescription, just something simple off the top of my head. It doesn’t have to be a well-ordering, but yes, the naming is definitely an additional structure, just as in the parable of i and -i.

For example, in ETCS, if [n] represents the coproduct of n copies of a chosen terminal object 1, then there are exactly n elements 1 → [n]; they are all coproduct inclusions, and certainly they are all distinct. It may help to think of [26] as a blob of twenty-six distinct points. The points are clearly distinct, but they look exactly alike, are clones if you will.

Then, you may assign them names however you please, writing next to them (or on their identical red shirts), ‘A’, ‘B’, …, ‘Z’ say. If you choose to close your eyes and they take off their shirts (in other words, if you forget the naming) and they permute among themselves, you obviously can’t retrieve the original naming. But, as long as the names are firmly attached, as long as you bear in mind the naming structure, you are free to use it, knowing for example where Mr. P went under some specified mapping f: [26] → Δ.

This sort of thing happens at the formal level too. For example, part of the structure of [2] as so-called “subobject classifier” is a given element 1 → [2] which is traditionally called “true”. Such an element is considered part of the structure of the subobject classifier as such. With that structure firmly attached, you are then in the position to set up a well-defined bijective correspondence between functions f: X → [2] and subsets of X, by considering f^{-1}(true) ⊆ X. You could have chosen the other element of [2] of course as your “true”, but whichever element you chose, you stick to it and remember it for future reference.
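Here is a minimal sketch of that correspondence in Lean 4, with Bool standing in for [2], the value true as the chosen element, and predicates standing in for subsets (an illustration only; the inverse direction needs a decidability assumption):

```lean
-- A map f : X → Bool determines the subset f⁻¹(true) of X (a subset here is a predicate).
def toSubset {X : Type} (f : X → Bool) : X → Prop :=
  fun x => f x = true

-- Conversely, a (decidable) subset S of X determines its characteristic map X → Bool.
def toMap {X : Type} (S : X → Prop) [DecidablePred S] : X → Bool :=
  fun x => decide (S x)
```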

Posted by: Todd Trimble on October 1, 2009 2:13 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

TT: there are exactly n elements 1→[n]; they are all coproduct inclusions, and certainly they are all distinct. It may help to think of [26] as a blob of twenty six distinct points.

Yes, this is a different way of creating materially a set of 26 nameable elements, and again, it is not a pure set but a set with additional structure. Mathematicians very rarely use pure sets!

This reminds me of the C_{abcd} problem, which still puzzles me. I’d like to know your answer to my query:

Let C_{abcd} be the category whose objects are the symbols a, b, c, d, with exactly one morphism between any two objects, composing in the only consistent way. Let the categories C_{abc} and C_{abd} be defined similarly. Clearly, these are both subcategories of C_{abcd}, with the identity as the inclusion functor. But I can compare their objects for equality.

Do you agree that from the material point of view (e.g., with categories modelled inside ZF, as in Lang’s book), this reasoning is correct?

If not, what is contrary to the axioms?

And if my reasoning is right from the material point of view, which extra axioms (in addition to what is in Wikipedia, or Lang, or Asperti and Longo) characterize the permitted ways of structural reasoning?

Posted by: Arnold Neumaier on October 1, 2009 3:20 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

Todd’s description of the complex numbers reminds me of debates I heard seven or so years ago about the structuralism then popular in the philosophy of mathematics, which said that mathematical entities are patterns, and that all that mattered about elements of the pattern was their properties invariant under isomorphism. The idea here was to explain how 2 is merely a place in a pattern, however it is realised set-theoretically.

Someone pointed out that this would entail identifying i and -i in the complex numbers since nothing distinguishes them according to their place within the structure of the complex numbers. After discussion with John, I realised that we are often not careful saying what we mean by ℂ. There’s a difference between the field ℝ[x]/(x^2 + 1) and the same field with the extra structure of a choice of a residue class to be designated i. They belong to different categories.

In the first case, there are two automorphisms on the object; in the second case, only one, but there’s another object with the same image under the functor which forgets the structure.

Posted by: David Corfield on October 1, 2009 11:06 AM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

DC: There’s a difference between the field ℝ[x]/(x^2 + 1) and the same field with the extra structure of a choice of a residue class to be designated i. They belong to different categories.

Does choosing a notation really change the category an object belongs to? This would make the conversion headache in the structural approach even worse.

Does the monoid ℕ of natural numbers under addition no longer belong to the category of monoids if I add the conservative definition 2 := 1+1?

Similarly, why can’t I put i := x mod (x^2+1) to define the imaginary unit in ℝ[x]/(x^2+1) without changing the category the latter object belongs to?

This does not affect the existence of the automorphism induced by i → -i.

Or do you hold that each definition changes the type of an algebraic structure? This would make things extremely unworkable formally!

Posted by: Arnold Neumaier on October 1, 2009 3:33 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

Does the monoid ℕ of natural numbers under addition no longer belong to the category of monoids if I add the conservative definition 2 := 1+1?

Similarly, why can’t I put i := x mod (x^2+1) to define the imaginary unit in ℝ[x]/(x^2+1) without changing the category the latter object belongs to?

This does not affect the existence of the automorphism induced by i ↦ -i.

I think David said it right, but it’s slightly subtle. The way to reconcile it with the point you’re making is by recognizing that, considering ℝ[x] as an abstract ℝ-algebra, it’s not clear which element is x until you say so. Thus, there’s an automorphism on ℝ[x] which sends x to -x, and either (or indeed any ax + b with a ≠ 0) could be considered a distinguished generator of the polynomial algebra. Giving a generator 1 → U(ℝ[x]) (here U denotes the appropriate underlying-set functor) is thus adding some extra structure to the algebra.

A typical categorical response to all this is to define ℝ[x] to be the free ℝ-algebra on one generator, which has a materializing or concretizing effect. More explicitly, this involves a universal property: when we say “free algebra on one generator”, we mean (to be precise) that there is given a function i: 1 → U(ℝ[x]), traditionally called ‘x’, such that for every function f: 1 → U(A) into the underlying set of an ℝ-algebra A, there exists a unique ℝ-algebra homomorphism φ: ℝ[x] → A such that f = U(φ) ∘ i. And there: this formulation involving the universal function i gives you a distinguished element which people usually call x.

Also note that ℝ[x] equipped with this distinguished element i: 1 → U(ℝ[x]) has no non-trivial automorphisms. This is just an instance of a feature holding true for general universal properties.

(People often also say “free” to refer to a property: there exists a distinguished element such that… rather than giving the element at the outset as extra structure. Caveat lector.)

Posted by: Todd Trimble on October 1, 2009 5:05 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

AN: why can’t I put i := x mod (x^2+1) to define the imaginary unit in ℝ[x]/(x^2+1) without changing the category the latter object belongs to?

TT: indeed any ax+b with a≠0 could be considered a distinguished generator of the polynomial algebra.

I don’t understand:

This should not matter in a purely structural view. If you change the generator, you also change the ideal and hence the resulting field, but in any case, the i so defined will be the distinguished square root of -1 of this field. Since structurally everything is defined anyway only up to isomorphism, this gives exactly the right result, with a canonical i that changes with the field considered.

TT: ℝ[x] equipped with this distinguished element i:1→U(ℝ[x]) has no non-trivial automorphisms.

This is true if you require that i is preserved, but this is another reason why I find a purely structural point of view awkward.

I find it unacceptable that the concept of an automorphism changes simply by labeling an element. The world of pure structure is a strange world, not the world of the average mathematician.

The complex numbers as mathematicians generally use them have complex conjugation as an automorphism, although i is distinguished but not preserved by this automorphism.

TT: The missing ingredient is that in general, structures defined by means of “universal elements” are defined up to canonical (uniquely determined) isomorphism.

Again I do not understand:

Two instances of the field of complex numbers (without a distinguished imaginary unit) are not structurally the same since there is no canonical isomorphism? This would be very strange indeed.

Posted by: Arnold Neumaier on October 1, 2009 6:59 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

Two instances of the field of complex numbers (without a distinguished imaginary unit) are not structurally the same since there is no canonical isomorphism? This would be very strange indeed.

I wouldn't say that they are not, in some sense, the same just because there is more than one isomorphism. (There is always a sense in which they are not the same, if they are represented differently syntactically. But that is not itself a question for mathematics.) I would say this: It is not only important whether things are isomorphic, but also in how many ways they are isomorphic; after all, Iso(A,B) is not just a truth value, but a set (a meta-set, although usually also realisable internally as a set). In higher category theory, we even have Equiv(A,B) as (in general) an ∞-groupoid!

Of course, there is more to say than just the cardinality of Iso(A,B), such as the action on it by the monoid Hom(B,B) and so on. But when Iso(A,B) is a singleton, then things become much simpler, to the point that simply writing A = B is an abuse of language that is easy to handle. If Iso(A,B) is inhabited but (possibly) not a singleton, then writing A = B is a little more dangerous; the danger only really comes to fruition, however, when you get loops A = B = C = A, since the composite isomorphism A → B → C → A might not be the identity.

Posted by: Toby Bartels on October 1, 2009 9:48 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

I am afraid, Arnold, that you did not attend carefully to what I wrote. I hope at least it was clear that I was trying to build a bridge of understanding between what you wrote and what David wrote. But, as I said, the mathematical point involved was slightly subtle, so I ask you to read again, with care.

Let me try again.

AN: why can’t I put i := x mod (x^2+1) to define the imaginary unit in ℝ[x]/(x^2+1) without changing the category the latter object belongs to?

TT: indeed any ax + b with a ≠ 0 could be considered a distinguished generator of the polynomial algebra.

I don’t understand:

This should not matter in a purely structural view. If you change the generator, you also change the ideal and hence the resulting field, but in any case, the i so defined will be the distinguished square root of -1 of this field. Since structurally everything is defined anyway only up to isomorphism, this gives exactly the right result, with a canonical i that changes with the field considered.

First of all, the part of mine that you quoted was lifted from between a pair of parentheses, where it was indeed a parenthetical aside. I now regret that aside, because it seems to have distracted you from the point I was trying to make.

Second, please note that the ideal generated by x^2 + 1 does not change if you replace x by -x. That’s the point! There are two candidates in the polynomial algebra whose residue class modulo this ideal yields a square root of -1, but these residue classes are different square roots of -1. It follows that if you haven’t chosen a candidate to work with, you haven’t uniquely specified which so-called canonical square root of -1 in this model you intended to label i! (And if you’ll recall, unique specification was what this discussion was originally about.)

You may think, “well, clearly I meant to choose x”, but knowledge of which element that is is not encoded within the polynomial algebra structure, hence it is an extra piece of information in addition to the algebra structure.

TT: ℝ[x] equipped with this distinguished element i: 1 → U(ℝ[x]) has no non-trivial automorphisms.

This is true if you require that i is preserved, but this is another reason why I find a purely structural point of view awkward.

I find it unacceptable that the concept of an automorphism changes simply by labeling an element. The world of pure structure is a strange world, not the world of the average mathematician.

The complex numbers as mathematicians generally use them have complex conjugation as an automorphism, although i is distinguished but not preserved by this automorphism.

We are not “simply labeling an element”, we are also choosing an element to label. This is important for the purpose of making unique specifications, which are important for ‘material’ constructions according to the sense “material = being able to identify the elements uniquely by giving a formal expression identifying it.”

Since we are not simply assigning a label but choosing an element to label, and since this choice is an extra datum or structure, it is logical for this discussion (which was to elucidate a point David made, not to discuss the behavior of “average mathematicians”) that we consider automorphisms which “remember” (respect) this extra structure.

What categories average mathematicians choose to work in is their business. It’s fine if they want their morphisms to ignore preservation of the chosen “ii”. Me: I’m flexible – I’ll work in whatever category is best suited to the discussion I’m having.

(With the little polemical dig “strange world”, I can’t resist adding my own: category theory in fact teaches one great flexibility in thinking. But this point is perhaps lost on someone who often whines about categorical straitjackets, on rather thin and not terribly well-informed evidence.)

I’ll also add, for what it’s worth, that this category, the one whose objects are pairs (A, a: 1 → U(A)) consisting of algebras and elements in their underlying sets, and whose morphisms are algebra homomorphisms that preserve elements thus distinguished, is an example of what we category theorists call a comma category, a very important tool. Comma categories are extremely relevant to discussions in which adjoint pairs of functors crop up (just about everywhere, in case you didn’t know), including in particular free functors which are adjoint to forgetful functors, and more particularly the polynomial algebra functor which is left adjoint to the forgetful functor from algebras to sets, which I touched upon over here.
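Here is a minimal sketch in Lean 4 of the simplest instance of such a category of pairs, namely sets equipped with a chosen element and maps preserving it (an illustration only; the names are ad hoc):

```lean
-- An object: a carrier together with a distinguished element.
structure Pointed where
  carrier : Type
  pt : carrier

-- A morphism: a map of carriers that sends the chosen element to the chosen element.
structure PointedHom (A B : Pointed) where
  map : A.carrier → B.carrier
  preserves_pt : map A.pt = B.pt
```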

TT: The missing ingredient is that in general, structures defined by means of “universal elements” are defined up to canonical (uniquely determined) isomorphism.

Again I do not understand:

Two instances of the field of complex numbers (without a distinguished imaginary unit) are not structurally the same since there is no canonical isomorphism? This would be very strange indeed.

Your quotation is taken from another comment, here. But please attend closely to what I said: I said structures defined by means of universal elements. The main example from that comment was the polynomial algebra ℝ[x] equipped with a universal element i: 1 → U(ℝ[x]). Did I speak of the complex numbers there? No, I did not. But could I speak of canonical isomorphisms if ℂ is considered as also coming equipped with an element i: 1 → ℂ which is universal among ℝ-algebras equipped with a chosen square root of -1? Yes, I could.

Please observe as well that I added that missing ingredient because I thought the little sound-bite you gave for “structural” was a bit thin and needed more. That ingredient I consider particularly relevant for building a bridge between ‘structural’ and your sense of ‘material’. But please also note that I said “a missing ingredient” – I wasn’t pretending to exhaust the meaning of ‘structural’.

As to your question, though, Toby has given an informed reply. There is rather more to be said than can be encapsulated within a brief aphorism.

Posted by: Todd Trimble on October 2, 2009 5:44 AM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

Then please tell me which function from 1 to A is the letter w.

The letter ‘w’, of course.

Are you suggesting that you have another way to answer the question, which letter is the letter w?

Posted by: Toby Bartels on October 1, 2009 9:10 AM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

AN: Then please tell me which function from 1 to A is the letter w.

TB: The letter ‘w’, of course.

This is like answering ‘the expression A_5’ in response to ‘Which group is A_5?’. It doesn’t explain anything. You are simply pushing things you don’t like to the metalevel, as if this would solve the problem.

TB: Are you suggesting that you have another way to answer the question, which letter is the letter w?

I didn’t ask which letter is the letter w but which function from 1 to A is the letter w.

In a material set theory with urelements, you have A={a,…,w,x,y,z}, and w is a well-defined urelement.

The point is that there must be a way to tell a computer what is meant by w, and this can only be done on a formal level involving material objects.

Posted by: Arnold Neumaier on October 1, 2009 10:25 AM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

I didn’t ask which letter is the letter w but which function from 1 to A is the letter w.

Yes, and I defined a letter to be (following the framework of ETCS) a function from 1 to A.

In a material set theory with urelements, you have A={a,…,w,x,y,z}, and w is a well-defined urelement.

But which urelement is w?

Really, I have not the faintest idea what your question is asking! Please, can you answer it for me in your framework, so that I can answer it for you in mine?

Posted by: Toby Bartels on October 1, 2009 7:29 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

Arnold writes:

So what ultimately counts is the practical point of view. Here the advantage of the material point of view is very clear. After all, we already need a material free monoid to communicate mathematics. Then, the material point of view is nearly obvious to any newcomer, making for a simple entrance and plenty of very elementary exercises that lead to mathematical insight, while the structural point of view emerges only after having digested enough of more elementary material mathematics.

I don’t understand what you mean by “we already need a material free monoid to communicate mathematics.” Please explain.

The advantage of material set theory from a practical point of view will not be at all clear to some of us here; quite the contrary. In fact, I argued here that there are strong practical advantages of structural set theory – “practical” in the sense of being faithful to working practice of contemporary mathematics. In particular, I argued that a categories-based set theory, by focusing on the relevancy of universal properties, is at a formal level very directly concerned with mathematical essence – getting at the heart of what contemporary mathematicians need sets for and what they do with them – while at the same time eliminating extraneous and irrelevant features which manifest themselves in material set theory.

The basic argument you seem to be making is that structural set theory is harder to learn than material set theory. I think Mike, with his SEAR, makes a good case that that need not be true. Thus, I reject

the structural point of view emerges only after having digested enough of more elementary material mathematics

as mere assertion. Clearly the real test of the pedagogical viability of structural set theory is in the classroom. I’m happy to say that I’ve incorporated structural ways of thinking into undergraduate courses I’ve taught, and Toby says the same for himself. So these are not completely idle claims.

Posted by: Todd Trimble on September 30, 2009 5:55 PM | Permalink | Reply to this

Re: Material vs. structural foundations of mathematics

There is an asymmetry that is still missing.

In a material theory, structural objects are constructed as […]. Then one can do all structural mathematics inside suitable such […]. However, to do so for nontrivial mathematics requires numerous abuses of language and notation; otherwise the description becomes tedious and virtually incomprehensible.

In a structural theory, material objects are constructed as […]. Then one can do all material mathematics inside suitable such […]. However, to do so for nontrivial mathematics requires numerous (but different) abuses of language and notation; otherwise the description becomes tedious and virtually incomprehensible.

The asymmetry is this: While one can construct material sets as you described, still for the purposes of normal mathematics there is no reason whatsoever to do so. When we structuralists hear an ordinary mathematician describe something about, say, Lie algebras, we immediately turn it into our own language (where, as Goethe would say, it might mean something completely different) and think about it that way. We definitely do not construct material pure sets and think of a Lie algebra as a Kuratowski pair. And we find that we have no difficulty in communicating with the Lie algebraist this way; they can't even tell that we are doing this.

Even when we hear set theorists talk about large cardinals, we still don't bother to construct material sets; if we only care about sets up to cardinality, then we're still talking about objects of the category Set of structural sets. (Now sometimes the set theorists can tell if we're using categorial model theory, but that's perfectly valid in a material framework too.) Only if the set theorists bring up the von Neumann hierarchy do we need to construct material sets.

To be fair, the material set theorist doesn't really have to construct structural objects either, at least not in the formal way that you describe. But they still have to deal with certain categories and ignore the membership structure of the objects and morphisms in these categories; they know intuitively what to ignore (which is why anything that they say can be translated so readily into our language), but for us it is automatic.

Thus I favor a declarative theory similar to FMathL, which accounts for the actual mathematical language and needs no abuses of language.

I still want to see how you will interpret ‘An ordered monoid is a set that is both ordered and a monoid, such that ….’ with no abuse of language. Good for you if you can do it! But if abuses of language are unavoidable, and one must work to formalise their meaning rather than to define everything in such a way that they are already literally valid, then I'm just as happy to add one more for ‘a function on A’ when A was declared to be a subset.

From a logical point of view, there is the additional question of proof power of the two views. I’d find it surprising if there were a structural theory with proof strength equivalent to that of ZFC

Notice that the only reason that anyone ever linked to pure set was to indicate (very roughly, of course) how such an equivalence would be proved. (If you don't accept that we've put in enough details to establish that, very well; even I am relying more on my intuition and Mike's judgement than a careful check of Mike's argument about Collection.) But this is not necessary to understand how ordinary mathematics may be formalised in structural set theory (especially since ordinary mathematics doesn't even need high-powered set-theoretic axioms like Collection).

After all, we already need a material free monoid to communicate mathematics.

I don't understand what you mean by this. Aren't the elements of this free monoid simply words? (properly, strings of characters). Why do words need to have material elements??? (Of course, they need to have letters, but those are different.)

Then, the material point of view is nearly obvious to any newcomer,

I dispute this too.

The newcomer will think that they know what a set is, until you tell them that everything is a set, which they will find odd. You can avoid this, at least at first, with Urelemente, but eventually you'll do something like construct the set of real numbers, and then they will learn that a real number is a set, which is odd. Meanwhile, the structural set theorist, whose sets all have anonymous elements, has all along said that a set is merely a way to encode or describe certain things; we have never pretended that the elements of the set are those things, and so it is no surprise when a real number may be encoded as or described by, say, a set of rational numbers.

And at some point you must tell them that they are not allowed to take the set of all sets (or if they are, that they are at any rate not allowed to take the set of all sets that do not belong to themselves), which is no worse than telling them that they are not allowed to compare elements of two sets without some explicit way (such as a bijection between the sets) of comparing them. At least I know how to motivate the latter (but to be fair, the former also has to be explained, although perhaps later, by our group).

On the other hand, for many problems, both the material and the structural perspective offer insights. Therefore a good foundation of mathematics should offer both views.

I agree with this (well, at least the second sentence). But you've already agreed that either perspective allows one to formalise the other.

Posted by: Toby Bartels on September 30, 2009 6:42 PM | Permalink | Reply to this

Re: What is a structured object?

So part of the problem appears to lie in that you switch between different points of view (formal object or only a way of speaking tha can be formalized only by eliminating the concept) about what a group is.

This is a fair criticism; I think we’ve been a bit sloppy about this in the foregoing discussion. The problem is that category theory which deals with large categories is hard to formalize in any kind of set theory. Neither ZF nor SEAR nor ETCS has an intrinsic object called “a large category.” In ZF, one “defines” a “proper class” to be specified by a first-order formula, and then a “large category” to be a “meta-category” whose objects and arrows are proper classes.

In structural set theory, one way to “define” a “large category” is to give a finite graph D_C together with a couple of first-order formulas obj_C and arr_C with free variables labeled by the vertices and edges of D_C. “An object” of this category is then a diagram of shape D_C in Set (hence, a collection of sets and functions) such that obj_C holds with the appropriate variables substituted, and likewise for “a morphism”.

Neither of these situations is really completely satisfactory. In ZF one can extend the theory to NBG or MK or add universes, and redefine “large category” to mean “category whose set of objects is not necessarily an element of the universe.” One structural counterpart of this is algebraic set theory in which classes, rather than sets, are the objects of the basic category under consideration, and there is a notion of “smallness” such that “sets” are the small classes. I feel that a more structural version of this considers a 2-category of large categories, rather than a category of classes, since in practice one rarely cares about the objects of a proper class up to more than isomorphism; I have some axioms for such a 2-category written down but haven’t put them up anywhere yet.

So, although when talking informally about category theory, I tend to think “2-structurally,” I’m not sure whether there yet exists a formal system which really captures what I mean by this. Thus there are really two questions here: the suitability of structural set theory for “small mathematics,” and its potential extensions to a “structural category theory” or “structural class theory” adequate for dealing with large categories (and which could hopefully be extended to treat extra-large 2-categories, XXL 3-categories, etc.).

Posted by: Mike Shulman on September 23, 2009 4:16 PM | Permalink | PGP Sig | Reply to this

Re: What is a structured object?

BTW, if we want to keep talking about the structural viewpoint on large categories and we want a formal setting in which to do it, universes in structural set theory should be perfectly adequate. Their main flaw is that they permit evil, but this shouldn’t be essential for understanding the issues in a structural viewpoint. David Roberts and I have been working on the axioms for universes in SEAR.

I don’t have time right now to explain how one goes about constructing a category of small sets, and thence a category of small groups, from a universe, but maybe someone else can.

Posted by: Mike Shulman on September 24, 2009 5:41 AM | Permalink | PGP Sig | Reply to this

Re: What is a structured object?

MS: A group in SEAR consists of a set G and an element e∈G and a function m:G×G→G, such that certain axioms are satisfied. A group in SEAR is not a single thing in the universe of discourse.

AN: So part of the problem appears to lie in that you switch between different points of view (formal object or only a way of speaking that can be formalized only by eliminating the concept) about what a group is.

MS: This is a fair criticism; I think we’ve been a bit sloppy about this in the foregoing discussion. The problem is that category theory which deals with large categories is hard to formalize in any kind of set theory.

This has nothing at all to do with large categories. Consider the category of finite groups. Its objects are finite groups, not group structures on a finite set. Thus you need to have the concept of finite group as an object rather than as a metaobject that cannot be formalized except by eliminating it from the formal representation.

MS: This is not a problem for formalization at a low level, but it may be undesirable when trying to formalize at a higher level, for all the reasons that you’ve given. But it doesn’t prevent SEAR from reflecting on itself formally.

It does. Reflection means being able to define a copy of SEAR inside SEAR, including all the language used to define this copy. (This is independent of any computer implementation. The latter, of course, must in addition care about efficiency, which causes some additional problems for theories where important concepts like that of a group are not a single thing in the universe of discourse.)

This means you need to start by calling the elements of a certain SEAR set characters, then creating a SEAR model of text, then creating a SEAR model of context-free languages to express formulas and phrases, then creating a model of what variables, type declarations, axioms, definitions, assertions, and proofs are, and then stating in this language the SEAR axiom system together with the definitions and assertions needed to explain the terminology used in the axioms.

Then, and only then, you can speak of having SEAR as a foundation.

MS: I do assert that structural set theory is a sufficient low-level foundation for mathematics on a par with ZF, and I believe that it is closer to the way mathematicians treat sets in everyday practice.

With ZF in place of SEAR, all of the above has been done at various levels of detail, and one can find for each step literature expanding on it in fairly detailed ways.

But I do not see how you can do this consistently with SEAR.

Apparently, you cannot even define formally the concept of a category in a way that avoids the problems with the category $C_{1234}$ I had mentioned. (For definiteness, here I specialize $a,b,c,d$ to elements from the natural numbers inside SEAR.)

And there are ZF-based texts like Bourbaki and Lang which introduce each permitted abuse of language before using it. SEAR abuses the language without any excuse, and without saying how to undo the abuses if one wants to be more careful.

I know that it is a time-consuming task to repeat this for new foundations, and I neither expect that you do this quickly nor that a single person should be expected to do this.

But I’d expect that you don’t assert something you are so far from having achieved.

Posted by: Arnold Neumaier on September 24, 2009 9:14 AM | Permalink | Reply to this

Re: What is a structured object?

AN: Apparently, you cannot even define formally the concept of a category in a way that avoids the problems with the category $C_{1234}$ I had mentioned.

Just to clarify: I didn’t mean obstacles related to large cardinals. The problem arises even for categories all of whose objects are finite sets equipped with extra structure.

Posted by: Arnold Neumaier on September 24, 2009 11:04 AM | Permalink | Reply to this

Re: What is a structured object?

I’ve enjoyed reading this vigorous exchange.

In “Introduction to higher order categorical logic” by Joachim Lambek and P. J. Scott, I noticed that they recommend type theory as a foundation for mathematics rather than either category theory or set theory.

Bertot and Casteran, Interactive Theorem Proving (Coq)
“Amokrane Saibi showed that a notion of subtype with inheritance and implicit coercions could be used to develop modular proofs in universal algebra, and most notably, to express elegantly the main notions in category theory.”

http://pauillac.inria.fr/~saibi/Cat.ps by Amokrane Saibi (Coq)

“We then construct the Functor Category, with the natural definition of natural transformations. We then show the Interchange Law, which exhibits the 2-categorical structure of the Functor Category. We end this paper by giving a corollary to Yoneda’s lemma.
This incursion in Constructive Category Theory shows that Type Theory is adequate to represent faithfully categorical reasoning. Three ingredients are essential: $\Sigma$-types, to represent structures, dependent types, so that arrows are indexed with their domains and codomains, and a hierarchy of universes, in order to escape the foundational difficulties. Some amount of type reconstruction is necessary, in order to write equations between arrows without having to indicate their type other than at their binder, and notational abbreviations, allowing e.g. infix notation, are necessary to offer the formal mathematician a language close to the ordinary informal categorical notation.”

SH: Perhaps this is interesting.

Posted by: Stephen Harris on September 24, 2009 2:29 PM | Permalink | Reply to this

Re: What is a structured object?

Type theory is, indeed, a very nice foundation for mathematics, which is very closely related to structural set theory. In fact, Bounded SEAR is nearly indistinguishable from type theory, and ETCS is also basically equivalent to it. However, my opinion (and this is only my opinion) is that type theory is harder for mathematicians without training in logic to understand, whereas they are quite used to thinking in terms of sets, relations, and functions. Perhaps this is only a relic of the ascendancy of material set theory as a foundation for so many years. Perhaps it is an artifact of the viewpoint taken by most textbooks on type theory.
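To give a feel for how this looks in practice, here is a minimal sketch, in Lean-style syntax, of a group bundled as a single structure (a $\Sigma$-type); the particular field names are an arbitrary choice, and only a representative selection of the group laws is shown:

```lean
-- A group packaged as one structure: a carrier type together with
-- its operations and (some of) the group axioms.
structure BundledGroup where
  carrier : Type
  mul : carrier → carrier → carrier
  one : carrier
  inv : carrier → carrier
  mul_assoc : ∀ (a b c : carrier), mul (mul a b) c = mul a (mul b c)
  one_mul : ∀ (a : carrier), mul one a = a
  inv_mul : ∀ (a : carrier), mul (inv a) a = one

-- A homomorphism between two bundled groups, again as a single structure.
structure GroupHom (G H : BundledGroup) where
  toFun : G.carrier → H.carrier
  map_mul : ∀ (a b : G.carrier), toFun (G.mul a b) = H.mul (toFun a) (toFun b)
```

This is essentially the “$\Sigma$-types to represent structures” point from the Saibi quotation above: the whole package is one term of one type, rather than a way of speaking that has to be eliminated.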

Posted by: Mike Shulman on September 24, 2009 6:02 PM | Permalink | PGP Sig | Reply to this

Re: What is a structured object?

Perhaps it is an artifact of the viewpoint taken by most textbooks on type theory.

I blame this. Most books on ‘type theory’ are about logic; most books on ‘set theory’ (even if structural) are about mathematics. But I see ‘type’ and ‘set’ as nearly interchangeable, although ‘type’ can also be used in a broader context (for which there are other words if I want to be more specific, such as ‘preset’, ‘class’, or even —conjecturally for me— ‘$\infty$-groupoid’).

Posted by: Toby Bartels on September 25, 2009 8:28 PM | Permalink | Reply to this

Re: What is a structured object?

But I see ‘type’ and ‘set’ as nearly interchangeable

Here are some differences in the way I think of them:

  • The elements of a set are always equipped with a notion of equality, while the elements of a type need not be.
  • In type theory, one cannot quantify over all types (although one can fake it with universes), whereas in set theory one (potentially) can.
  • The previous point is perhaps a consequence of a “level” distinction. Constructions on sets are either specified by operations or by axioms which are part of the theory. But I think type constructors are usually viewed as syntactic judgements external to any theory. (Probably I’m not using the buzzwords correctly here, but hopefully you get my meaning.)
  • Type theory can be more flexible, e.g. it can be interpreted in fibered preorders rather than in categories. I’m not sure how to do that with set theory.

Admittedly, these are all subtle distinctions.

Posted by: Mike Shulman on September 25, 2009 10:29 PM | Permalink | PGP Sig | Reply to this

Re: What is a structured object?

I wrote:

‘type’ can also be used in a broader context

and Mike wrote:

Type theory can be more flexible

I would say that not every type theory is a set theory, far from it; but every set theory is a type theory. Types can (and usually do, in my experience) have equality predicates but (as you note) need not; in Martin-Löf's original ‘impredicative’ Intuitionistic Type Theory (the one that turned out to be inconsistent, via Burali-Forti), you can quantify over all types, so the term ‘set’ doesn't have a monopoly on that idea. I don't see type constructors as external to type theory; I don't know what you're trying to say there.

And I wouldn't be too averse to somebody's using the term ‘set’ more flexibly either. It's not that different from our use of ‘set’ to mean, basically, a structured set with all of the extra structure removed, which AN correctly objects is inconsistent with its use, by Cantor and the material set theorists who followed him, to mean a part of some universe (originally the real line, eventually the von Neumann hierarchy). We can respond to AN that there is now substantial literature that uses the term in this way (and a vast literature in which it is easily interpreted in this way), which this hypothetical more flexible person may not have; but if we discover some group of mathematicians that does use ‘set’ for, say, something without an equality predicate, then I wouldn't have any standing to complain (even though I would rather call that particular sort of thing ‘preset’ myself).

Posted by: Toby Bartels on September 26, 2009 12:53 AM | Permalink | Reply to this

Re: What is a structured object?

in Martin-Löf’s original ‘impredicative’ Intuitionistic Type Theory (the one that turned out to be inconsistent, via Burali-Forti), you can quantify over all types, so the term ‘set’ doesn’t have a monopoly on that idea.

Is there a consistent type theory in which you can quantify over all types? The way I think of type theory, quantifiers are tied to quantifying over elements of some type.

I don’t see type constructors as external to type theory; I don’t know what you’re trying to say there.

I think what I mean is that where type theory has type constructors, which are operations on types, set theory often has existence axioms about sets. Admittedly the distinction is not always possible to see.

Posted by: Mike Shulman on September 26, 2009 4:57 AM | Permalink | PGP Sig | Reply to this

Sets vs types

Is there a consistent type theory in which you can quantify over all types?

Sure, $\mathbf{SEAR}$ for example.

I know, you call $\mathbf{SEAR}$ a ‘set theory’ instead of a ‘type theory’, but if that's only because it allows quantification over all types, then the argument is circular. Meanwhile, we've got Arnold Neumaier objecting that $\mathbf{SEAR}$ is not a set theory because it's not material; membership in sets should be a predicate, and making it a typing declaration is a give-away that you've really got a type theory (although AN said ‘copies of cardinal numbers’, and later ‘universes’, instead of ‘types’). There is a historical basis for either distinction.

I don't see type constructors as external to type theory; I don't know what you're trying to say there.

I think what I mean is that where type theory has type constructors, which are operations on types, set theory often has existence axioms about sets. Admittedly the distinction is not always possible to see.

No wonder you didn't want me to introduce Cartesian products as an operation in $\mathbf{SEPS}$; you were trying to build a set theory rather than a type theory! Of course, if one's type theory sticks to propositions-as-types, then it really can't tell the difference between these. On the other hand, even material set theory can be written down using operations; I can't think of a reference now, but 10 years ago I was working out how to eliminate existential quantifiers from the $\mathbf{ZF}$ axioms entirely.

Posted by: Toby Bartels on September 27, 2009 1:37 AM | Permalink | Reply to this

Re: Sets vs types

You seem to have a more expansive notion of what “type theory” means than I’ve encountered anywhere else. In part D of the Elephant, or in Jacobs’ Categorical Logic and Type Theory, type theory is given a specific meaning: there are types, function symbols, terms, type constructors (such as products and sums, possibly dependent), and so on. If we allow a logic on top of the type theory (or fake it with propositions-as-types), then there are relation symbols and formula constructors as well, such as $\wedge$, $\vee$, $\Rightarrow$, $\exists$, etc., with inference judgements such as “if $\phi$ is a formula containing a free variable $x$ of type $A$, then $\exists x{:}A.\phi$ is a formula without such a free variable.” Type theory together with logic might also be called “typed first-order logic.”

By contrast, SEAR is formulated in a typed first-order logic, but the types involved are “set”, “relation”, and “element.” Just like ZF is formulated in a single-sorted first-order logic, where the elements of the single sort are called “sets”. SEAR looks kind of like type theory because when $A$ has the type “set,” the dependent type “element of $A$” looks a lot like calling $A$ itself a type. But in type theory as I have learned it from the references above, one cannot write something like “for all types $A$”, since every variable must have a type and there is no type of all types (at least, not if you want to avoid paradoxes). But perhaps I have learned too narrow a meaning of “type theory;” can you point me to any references that use it more expansively?

Posted by: Mike Shulman on September 27, 2009 9:22 PM | Permalink | PGP Sig | Reply to this

Re: Sets vs types

By contrast, SEAR is formulated in a typed first-order logic, but the types involved are “set”, “relation”, and “element.”

Yes, but type theory itself is also formulated in a typed first-order logic, where the types involved are ‘type’, ‘term’, ‘proposition’, and the like. There is, in my opinion, a significant difference between a type theory such as that which underlies $\mathbf{SEAR}$, in which all of the types are listed up front once and for all, and a type theory such as Martin-Löf's, in which enough generic type constructors are given that one can formalise all of ordinary mathematics. In fact, I would say this difference is greater than that between the second kind of type theory and structural set theory, and the difference between material and structural set theory is not really smaller.

But in type theory as I have learned it from the references above, one cannot write something like “for all types $A$”, since every variable must have a type and there is no type of all types (at least, not if you want to avoid paradoxes). But perhaps I have learned too narrow a meaning of “type theory;” can you point me to any references that use it more expansively?

I think that the problem is that type theorists never invented a word analogous to ‘class’ in set theory; if they had, then nobody would say that every variable must have a ‘type’, since they would use this new word instead. But suppose that material set theory had developed differently, never inventing the word ‘class’, but instead always using ‘set’ for the general notion and ‘small set’ for the more restrictive case. Then the axioms of separation and collection (to keep their meaning the same as they have now in $\mathbf{ZFC}$) would only apply to formulas whose variables are all bounded by some set, and while one can write down other formulas, one cannot actually do anything with them; all that we have done is to develop $\mathbf{NBG}$ in a different language.

As I said, Martin-Löf wrote down a theory in which one can say ‘for all types $A$’, but it was inconsistent. One can make a consistent version as follows: replace the word ‘type’ everywhere by ‘small type’, except in the phrase ‘type of all types’, where only the second ‘type’ is replaced; this would be perfectly analogous to the use of ‘small set’ above. I would now like to cite that Martin-Löf did just this, but he did not; instead, he developed a stronger theory with a hierarchy of universes, in each of which all type constructors may be used. But it seems to me that if type theory without universes is ‘type theory’ and type theory with a hierarchy of universes is ‘type theory’, then type theory with a single fixed universe of small types, in between these two, is also ‘type theory’.

Some people (I think Beeson, and since I'm already going to look up something else in that for you, I'll try to check this too) distinguish ‘set’ and ‘type’ by whether the theory is material or structural; to them, $\mathbf{SEAR}$ is, like $\mathbf{ETCS}$, already a ‘type’ theory. (For what it's worth, that's how I used the words before you convinced me that there was no reason to do this.) You seem to distinguish them by whether one can quantify over all of them when defining one of them, which is also reasonable but not the only way to do things (and then $\mathbf{ETCS}$ is still a ‘type’ theory). Another way to distinguish them is to say that a ‘set’ has an arbitrary equality relation, while a ‘type’ has none (or has only syntactic identity); that is done here for example (although using ‘preset’ is probably a more precise way to do this). There are many distinctions that can be made in one's style of foundations, but I don't see any of them as an essential or universal distinction between these two words, nor do I see the need for such a distinction.

Posted by: Toby Bartels on September 28, 2009 1:57 AM | Permalink | Reply to this

Re: Sets vs types

I fully agree that there is a continuum of theories, and it is by no means a priori clear where to draw the line between “type theory” and “set theory.” But we have to have words that mean something, or we’ll never know what we’re talking about!

I had a lengthy email exchange with Thomas Streicher several months ago about more or less this question. We did a lot of not understanding what each other was saying, and we got especially confused because we were also talking about interpretability of theories internal to a non-well-pointed topos. The metric of quantifiers over all sets/types to distinguish “set theory” from “type theory,” which I’ve been adhering to here, is what came out of that discussion as a convention we could both agree on. (BTW, I don’t agree that ETCS is a type theory by that metric—the question is not whether quantifiers over sets are allowed in the separation axiom, but whether they exist at all in the language.)

It is certainly true that for many people, “set theory” means “material set theory,” so perhaps we structural-set-theorists should have just stuck with “type” instead of “set.” (Thomas also mentioned that when he was first learning topos theory, the use of “set theory” for the internal logic of a topos confused him because it was clear that set theory was stronger than type theory—another possible axis along which one could distinguish.) I do of course feel that there is something important to be gained by calling structural set theory “set theory” rather than “type theory”; in particular, it points out that this (and not material set theory) is really how sets are used by mathematicians (although apparently this can be harder to convince people of than I realized, pace AN!).

And I still think there is a difference between structural set theory and type theory.

By contrast, SEAR is formulated in a typed first-order logic, but the types involved are “set”, “relation”, and “element.”

Yes, but type theory itself is also formulated in a typed first-order logic, where the types involved are “type”, “term”, “proposition”, and the like.

I agree that type theory can be formulated in such a way, but it can also stand alone as such a theory itself. To borrow the metaphor of programming languages, type theory is a part of logic, which is the machine language of mathematics. You can write an interpreter for machine language in machine language (and you might want to, in order to run it on some other architecture), but you can also run it directly on the machine it was written for. But SEAR must be compiled/interpreted into type theory/logic; it is not the machine language of any machine.

Posted by: Mike Shulman on September 28, 2009 4:20 AM | Permalink | PGP Sig | Reply to this

Re: Sets vs types

Here is a contentful and important mathematical consequence of that difference. Type theories (in the sense that I am using the word) have a term model. That is, you can construct a topos (or a category with less structure, if your theory doesn’t require as much) which is the free topos containing an internal model of that theory. In particular, applying this to “IHOL” (the type theory corresponding to an ordinary topos) there is a free topos.

This is not true (at least, not as far as I can tell) for SEAR and other “structural set theories” which allow quantifiers over sets in their axioms. (You might have seen a draft of my UQ&SA paper in which I claimed that it was, but now I believe that is incorrect.)

In both cases you can also interpret the logic as happening “one level up,” as you suggested, and now in both cases there is a free model. But this sort of free model looks very different: now instead of a category whose individual objects represent the individual types/sets, we have a category containing a single “object of types” and a single “object of elements.”

What we get in this latter case can be thought of as a “free category of classes.” The category of small objects in a category of classes is a topos—but even if the category of classes satisfies its version of the stronger axioms like unbounded separation and collection, it does not in general follow that its category of small objects satisfies its version of them. All we can say is that the internal category of small objects satisfies these axioms in the internal logic of the category of classes.

Posted by: Mike Shulman on September 29, 2009 3:26 PM | Permalink | PGP Sig | Reply to this

Re: Sets vs types

I wrote:

Some people (I think Beeson, and since I’m already going to look up something else in that for you, I'll try to check this too) distinguish ‘set’ and ‘type’ by whether the theory is material or structural

Nothing so clear cut as that. Actually, Beeson seems to be confused; in Chapter II (Informal Foundations of Constructive Mathematics), he claims (Section II.3) to use Bishop's concept of set (which is definitely structural) and even notes that $x = y$ is not globally meaningful. But then (Section II.9) he defines $x \in Y$ whenever $x \in X$ and $X \subseteq Y$, calling this a ‘difference in use of language’ from Bishop. And so it is, but it's not clearly explained.

All of the formal ‘set theories’ in Beeson are both material and based on first-order logic, while the only ‘type theories’ are those of Martin-Löf, so that doesn't help. The same is true in other references that I've just checked.

Posted by: Toby Bartels on October 3, 2009 12:58 AM | Permalink | Reply to this

Re: What is a structured object?

This has nothing at all to do with large categories. Consider the category of finite groups.

Ah, okay, I misunderstood your complaint.

The way to deal with this is the same as the way to deal with any sort of family of objects in structural set theory. A small category in structural set theory consists of a set $C_0$ of objects, a set $C_1$ of morphisms, functions $s,t:C_1\to C_0$, $i:C_0\to C_1$, and $c:C_1\times_{C_0}C_1\to C_1$, with axioms as defined for instance here. If you want to consider the objects of such a category as “being” sets with structure, then you simply consider a $C_0$-indexed family of sets with structure and a $C_1$-indexed family of morphisms between them.
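For concreteness, the data in this definition can be written out along the following lines (a Lean-style sketch; composition is given on composable pairs rather than literally on the pullback $C_1\times_{C_0}C_1$, and the associativity and unit laws are omitted for brevity):

```lean
-- The data of a small category: objects, morphisms, source/target maps,
-- identities, and composition defined on composable pairs.
structure SmallCategory where
  C0 : Type                               -- objects
  C1 : Type                               -- morphisms
  src : C1 → C0
  tgt : C1 → C0
  ident : C0 → C1
  comp : (g f : C1) → tgt f = src g → C1  -- compose g after f
  src_ident : ∀ (x : C0), src (ident x) = x
  tgt_ident : ∀ (x : C0), tgt (ident x) = x
  src_comp : ∀ (g f : C1) (h : tgt f = src g), src (comp g f h) = src f
  tgt_comp : ∀ (g f : C1) (h : tgt f = src g), tgt (comp g f h) = tgt g
  -- associativity and the unit laws are omitted from this sketch
```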

(A small equivalent of) the category of finite groups, for instance, would be a category as above equipped with a $C_0$-indexed family of finite groups $G$ and a $C_1$-indexed family of morphisms $H$ between them, such that any morphism between groups in $G$ occurs exactly once in $H$, and such that any finite group is isomorphic to one in $G$.

Unfortunately I don’t have time to explain in more detail right now exactly what is meant by “family” in all these cases, but it is not hard.

This is not a problem for formalization at a low level, but it may be undesirable when trying to formalize at a higher level, for all the reasons that you’ve given. But it doesn’t prevent SEAR from reflecting on itself formally.

It does. Reflection means being able to define a copy of SEAR inside SEAR, including all the language used to define this copy.

That is in fact what reflection means, but you haven’t explained why not having “a group” as a single object in the domain of discourse prevents it.

Then, and only then, you can speak of having SEAR as a foundation.

I don’t understand why reflection should be the defining test of a foundation. To me, saying that something is a foundation for mathematics means that it can be used to formalize all (or a substantial part) of mathematics. Logic is, indeed, an important part of mathematics, but only a part. Being able to compile its own compiler is an important test of a (compiled) programming language, but it is not the defining feature that enables us to call something a “programming language.” My impression is that generally by the time that a language is able to compile its own compiler, it is fairly well-accepted that it is, in fact, a programming language.

Regardless, if formalizing logic is what you want, I claim that logic, just like most of the rest of mathematics, is already written in an essentially structural way. For example, suppose one chooses to code logical sentences as natural numbers. This never depends on the specific definition of natural numbers as finite von Neumann ordinals or what-have-you; it only depends on the fact that they satisfy the induction property. Well, so do the natural numbers in SEAR or ETCS. Consider for simplicity a one-sorted theory with $n$ binary function symbols, which we code by the natural numbers $0,1,\dots,n-1$, and $m$ binary relation symbols, coded similarly. We can then use the separation property to define a subset $F$ of $\mathbb{N}$ consisting of those natural numbers that code well-formed formulas in this language. A logical theory then consists of a subset of $F$, the axioms. A structure for this language is a set $M$, together with a function $\{0,1,\dots,n-1\}\times M\times M\to M$ coding the function operations and a subset of $\{0,1,\dots,m-1\}\times M\times M$ coding the relation symbols. (Here, of course, $\{0,1,\dots,n-1\}$ denotes an $n$-element set equipped with a specified injection into $\mathbb{N}$ that gives its elements meaning as natural numbers.) The inductive property of $\mathbb{N}$ enables us to define the truth value of any formula on such a structure, so we can define a model of a theory to be a structure in which all the axioms are true.
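As a toy sketch of how such a coding might go (the specific tags, helper names, and the choice of the Cantor pairing function are arbitrary, nothing canonical), formulas over $n$ binary function symbols and $m$ binary relation symbols can be coded as single natural numbers, and well-formedness checked by exactly the kind of structural recursion that the induction property of $\mathbb{N}$ licenses in SEAR or ETCS just as well as in ZF:

```python
import math

# Cantor pairing: a bijection N x N -> N, so a syntax tree can be coded
# as a single natural number using nothing but arithmetic on N.
def pair(x: int, y: int) -> int:
    return (x + y) * (x + y + 1) // 2 + y

def unpair(z: int) -> tuple:
    w = (math.isqrt(8 * z + 1) - 1) // 2
    y = z - w * (w + 1) // 2
    return w - y, y

# Tags for the different kinds of nodes in terms and formulas.
VAR, FUN, REL, AND, NOT, ALL = range(6)

def var(i): return pair(VAR, i)                              # variable x_i
def fun(k, t1, t2): return pair(FUN, pair(k, pair(t1, t2)))  # k-th binary function symbol
def rel(k, t1, t2): return pair(REL, pair(k, pair(t1, t2)))  # k-th binary relation symbol
def conj(p, q): return pair(AND, pair(p, q))
def neg(p): return pair(NOT, p)
def forall(i, p): return pair(ALL, pair(i, p))

def is_term(z, n):
    """Does z code a term over n binary function symbols?"""
    tag, payload = unpair(z)
    if tag == VAR:
        return True
    if tag == FUN:
        k, rest = unpair(payload)
        t1, t2 = unpair(rest)
        return k < n and is_term(t1, n) and is_term(t2, n)
    return False

def is_wff(z, n, m):
    """Does z code a well-formed formula?  (This carves out the subset F of N.)"""
    tag, payload = unpair(z)
    if tag == REL:
        k, rest = unpair(payload)
        t1, t2 = unpair(rest)
        return k < m and is_term(t1, n) and is_term(t2, n)
    if tag == AND:
        p, q = unpair(payload)
        return is_wff(p, n, m) and is_wff(q, n, m)
    if tag == NOT:
        return is_wff(payload, n, m)
    if tag == ALL:
        _, p = unpair(payload)
        return is_wff(p, n, m)
    return False

# Example: the code of "forall x0. R0(f0(x0, x0), x0)" is one natural number.
phi = forall(0, rel(0, fun(0, var(0), var(0)), var(0)))
assert is_wff(phi, n=1, m=1) and not is_wff(var(0), n=1, m=1)
```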

In other words, all the work of reflection is already done. All that remains for structural set theory to do is point out that existing mathematics is already structural.

Posted by: Mike Shulman on September 24, 2009 5:58 PM | Permalink | PGP Sig | Reply to this

Re: What is a structured object?

This has been a very interesting discussion, and I hope Mike won’t mind (since he says he’s busy) if I touch upon some of what he was saying above, and outline a construction of an internal category of finite groups within a structural set theory.

As a warmup, let’s construct an internal category $Fin$ equivalent to the category of finite sets. We take the set of objects $Fin_0$ to be $\mathbb{N}$, the set of natural numbers, with one element $n \geq 0$ for each finite cardinality.

As Mike was saying, in order to construe objects $n \in \mathbb{N}$ as giving actual finite sets, we construct a “family” $\phi: F \to \mathbb{N}$ where each fiber $F_n$ is a set of cardinality $n$. For example, consider the function

$\phi: \mathbb{N} \times \mathbb{N} \to \mathbb{N}: (m, n) \mapsto m + n + 1$

Then, for each $n \geq 0$, the fiber $\phi^{-1}(n)$ is a set of cardinality $n$. This fiber will also be denoted $[n]$.
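(For instance, the fiber over $3$ is $\phi^{-1}(3) = \{(0,2),\,(1,1),\,(2,0)\}$, which has exactly three elements, so $[3]$ really is a set of cardinality $3$.)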

Next, using the existence of dependent products in a structural set theory like ETCS, one may construct the family of morphisms between finite sets,

$\psi: Fin_1 \to \mathbb{N} \times \mathbb{N},$

where the fiber over $(m, n) \in \mathbb{N} \times \mathbb{N}$ is $[n]^{[m]}$, the set of functions from $[m]$ to $[n]$. In other words, an element $f$ of $Fin_1$ “is” a function between finite sets. Let us write $dom(f)$ for the first component of $\psi(f)$ and $cod(f)$ for the second component, so that $\psi(f) = \langle dom(f), cod(f) \rangle$. This gives us functions

$dom, cod: Fin_1 \overset{\to}{\to} Fin_0$

which are part of the structure of an internal category $Fin$; the rest of the structure consists of identity and composition functions

$id: Fin_0 \to Fin_1 \qquad c: Fin_1 \times_{Fin_0} Fin_1 \to Fin_1,$

which are not hard to construct. In the end, the internal category constructed is equivalent to the category of finite sets.

Now let us continue by sketching the internal category of finite groups. To construct a set $G_0$ whose elements represent all isomorphism classes of finite groups, we construct a family

$card: G_0 \to \mathbb{N}$

where each fiber $card^{-1}(n)$ is the set of all group structures on the set $[n]$: the subset of

$[n]^{[n] \times [n]} \times [n] \times [n]^{[n]}$

whose members $(m, e, i)$ are those triples which obey the equational axioms (appropriate to the theory of groups) for multiplication $m$, identity $e$, and inversion $i$. We may construe elements $g$ of $G_0$ as “finite groups”. In particular, the “underlying set” of a finite group $g \in G_0$ is

$U(g) = \phi^{-1}(card(g))$

Finally, we construct the set $G_1$ of finite group homomorphisms. This is the set of those triples

$(g, f, h) \in G_0 \times Fin_1 \times G_0$

such that $dom(f) = card(g)$, $cod(f) = card(h)$, and the function $f$ satisfies the equations necessary to make it a homomorphism from the group structure $g$ to the group structure $h$.

This completes the sketch of an internal category equivalent to the category of finite groups. While it’s just a sketch, all the formal details can be filled in within the framework of a structural set theory such as ETCS or SEAR.

Which brings me to a question. Sometime earlier Arnold wrote:

At present, every formalization of a piece of mathematics is a mess; this was not the point.

What I was referring to was the overhead in the length of the formalization. With ZF, you can formalize a concept once as a tuple, and then always use the concept on a formal level.

and then

This is what I was aiming at. For reflection purposes, one cannot work in pure SEAR, while one can do that in pure ZF.

As a matter of fact there are bi-interpretability theorems which show that any construction in Zermelo set theory (Bounded Zermelo theory with Choice to be more precise) can be expressed in the structural theory ETCS, and vice-versa, and certainly one can augment ETCS with additional axioms to recover the full power of ZF. Similarly, if I recall correctly, Mike has basically said in his article that SEAR is bi-interpretable with (has the same expressive power as) ZF. So it is not clear to me why Arnold believes that for reflection purposes, one can work with ZF but not with SEAR. For example, what was sketched above indicates that one can reflect finite groups within (say) ETCS at a formal level. Mike said a little more about reflection in his later comment here.

Posted by: Todd Trimble on September 25, 2009 6:59 AM | Permalink | Reply to this

Re: What is a structured object?

I hope Mike won’t mind (since he says he’s busy)

I should hope I wouldn’t mind either, no matter how busy I am! (-: I hope I haven’t given the impression that I own structural set theory or something. As many people have been saying, all of this stuff (except perhaps some details of SEAR) is decades old.

Posted by: Mike Shulman on September 25, 2009 8:00 AM | Permalink | PGP Sig | Reply to this

Re: What is a structured object?

MS: As many people have been saying, all of this stuff (except perhaps some details of SEAR) is decades old.

If this is true, it should be easy to point to a paper or book that contains in terms of ETCS the definition of the basic concepts of category theory, including the examples of a few concrete categories (comparable in richness of structure to the category of finite groups).

Thus I’d appreciate getting such a decades old reference that backs up your claim.

Posted by: Arnold Neumaier on September 25, 2009 10:09 AM | Permalink | Reply to this

Re: What is a structured object?

it should be easy to point to a paper or book that contains in terms of ETCS the definition of the basic concepts of category theory, including the examples of a few concrete categories (comparable in richness of structure to the category of finite groups).

As I’ve been saying over and over again, I don’t think anyone has felt the need to do this sort of thing, because once the basic structure of ETCS (say) is developed sufficiently it becomes “obvious” to people who think like we do that the rest of mathematics can follow, and everyone would rather spend their time pushing the boundaries. Rewriting Bourbaki by changing a word here and there isn’t a really fun way to spend one’s time, nor likely to be counted as a significant contribution to mathematics when one is applying for jobs. That isn’t to say that I don’t wish that someone had, so that I could point you to it! Mathematics is full of things that are “understood” by people who work in a given field for a long time before being carefully written down with enough details to make sense to others.

You will find this perspective running implicitly through many books on topos theory, and they are actually doing something more general: considering how mathematics can be developed on the basis of any elementary topos. But again, they probably don’t supply enough details about how to do this to satisfy you.

Posted by: Mike Shulman on September 25, 2009 3:49 PM | Permalink | PGP Sig | Reply to this

Re: What is a structured object?

I’ll second what Mike said: for those people who have absorbed the methods that are explained in a book like Moerdijk and Mac Lane’s text, the sort of explicit detail I laid out is more along the lines of an exercise whose solution would be well-understood by many. It’s probable that it would be carried out in more explicit detail only when an outsider comes along and begins asking a different set of questions like you are doing here, so what you are looking for exactly might be hard to track down in the literature.

Posted by: Todd Trimble on September 25, 2009 4:20 PM | Permalink | Reply to this

Re: What is a structured object?

If this is true, it should be easy to point to a paper or book that contains in terms of ETCS the definition of the basic concepts of category theory, including the examples of a few concrete categories (comparable in richness of structure to the category of finite groups).

This may not exist, because any basic textbook on category theory has to mention foundations to deal with size issues, and this discussion is unlikely to be independent of material vs structural foundations.

However, any modern algebra book, if it doesn't talk about either set theory or category theory too much, will do this. For example, take Lang, remove (or rewrite) only the two pages on Logical Prerequisites, and the rest (including the Appendix on more advanced set theory!) is fine as it is. (I haven't checked every page, but I did skim through Chapter I and Appendix 2.)

There is a constant abuse of language (which should probably be remarked upon if one rewrites the Logical Prerequisites) where a subset $S$ of a set $X$ is conflated with the underlying set of $S$ (and also an element $a$ of $X$ that belongs to $S$ is conflated with the unique corresponding element of the underlying set of $S$), but this is no worse than the abuse (not remarked upon!) that begins Section V.1 in my (1993) edition:

Let $F$ be a field. If $F$ is a subfield of a field $E$, […]

Literally, a subfield of $E$ is (as Lang defined it) a subset of $E$, not a field in its own right. (In $\mathbf{ZFC}$, a subset of $E$ might happen to equal the ordered triple that is a field, but if so then that is not what Lang wants here!) Structural set theory uses the same abuse of language, although now also for unstructured sets just as much as for structured sets such as fields.

Lang also discusses category theory, but he doesn't indicate how to formalise it, so that text doesn't need any changing either. (What is a ‘collection’? Lang doesn't say. The unwary reader may assume that it's the same as a ‘set’ and be led to a paradox on the next page!)

Posted by: Toby Bartels on September 25, 2009 9:38 PM | Permalink | Reply to this

Re: What is a structured object?

Mike: you didn’t give me that impression (or even that you were pretending to such ownership (-: ). In fact, I salute both you and Toby for all your hard work in providing all those many thoughtful responses. I think all of us have been learning a lot from the exchange.

Posted by: Todd Trimble on September 25, 2009 12:30 PM | Permalink | Reply to this

Re: What is a structured object?

TT: This completes the sketch of an internal category equivalent to the category of finite groups.

OK, I get the idea of how to reflect things. Once one has the group as a single object (and in contrast to Mike Shulman, you modelled it that way), the basic obstacle to full reflection is gone.

One builds some machinery that mimics the material structure of ZF, for example by providing triples that encode the group. Then one uses this structure to do what one is used to doing in the standard reduction of mathematics to ZF.

I agree that one can probably fill in all details, and that this gives a way to define formally what the category FG of finite groups is, and hence what a finite group is, namely an element of Ob(FG).

Thus I now grant that (and understand how) ETCS - and maybe SEAR in a similar way - may be viewed as being a possible foundation of all of mathematics (when enhanced with enough large cardinals to handle large categories).

What I no longer understand now, however, is the claim that this way of organizing mathematics is superior to that of basing it on ZF since it is structural rather than material.

For I find the meaning of a finite group implied by the construction you gave not any more natural than the meaning of a natural number implied by its ZF construction by von Neumann.

It is ugly, and no mathematician thinks of this as being the essence of finite groups.

Moreover, for a (general) group, one has a similar messy construction, and a finite group is no longer a group but only “becomes” a group under the application of a suitably defined functor.

This flies in the face of every algebraist’s ordinary understanding of the notions of group and finite group.

In the attempts (in this discussion) to capture the essence of mathematics the proponents introduce so much artificial stuff in the form of trivial but needed functors that the result no longer resembles the essence to be captured.

Thus the structural, ETCS-based approach is no better at capturing the essence of mathematics than the material, ZF-based approach.

Both create lots of structure accidental to the construction, structure that is not in the nature of the mathematics described but in the nature of forcing mathematics into an ETCS-theoretic or ZF-theoretic straitjacket.

Posted by: Arnold Neumaier on September 25, 2009 10:36 AM | Permalink | Reply to this

Re: What is a structured object?

For I find the meaning of a finite group implied by the construction you gave not any more natural than the meaning of a natural number implied by its ZF construction by von Neumann.

I think you are misunderstanding the point of the construction. The meaning of a finite group is still “a finite set $G$ equipped with a multiplication $m:G\times G\to G$ and a unit $e\in G$ such that …”. Just like the meaning of a Cauchy sequence of rationals is “a function $\mathbb{N}\to \mathbb{Q}$ such that …”. It’s only when you want to consider “the category of finite groups” or “the set of Cauchy sequences” as an abstract object that you need to construct a set whose elements code for finite groups or Cauchy sequences.

Posted by: Mike Shulman on September 25, 2009 3:31 PM | Permalink | PGP Sig | Reply to this

Re: What is a structured object?

Arnold wrote:

What I no longer understand now, however, is the claim that this way of organizing mathematics is superior to that of basing it on ZF since it is structural rather than material.

For I find the meaning of a finite group implied by the construction you gave not any more natural than the meaning of a natural number implied by its ZF construction by von Neumann.

It is ugly, and no mathematician thinks of this as being the essence of finite groups.

Moreover, for a (general) group, one has a similar messy construction, and a finite group is no longer a group but only “becomes” a group under the application of a suitably defined functor.

This flies in the face of every algebraist’s ordinary understanding of the notions of group and finite group.

In the attempts (in this discussion) to capture the essence of mathematics the proponents introduce so much artificial stuff in the form of trivial but needed functors that the result no longer resembles the essence to be captured.

Thus the structural, ETCS-based approach is no better at capturing the essence of mathematics than the material, ZF-based approach.

Okay, a lot of opinions are being expressed here. Let me first say that the charge of “ugliness” is an aesthetic judgment, not part of formalized mathematics. Given the strictures I placed myself under (showing that a group could be expressed as a single element), to satisfy your demands, the notion was bound to look harder than the ordinary understanding of the algebraist, whose “essence” [as you like to say] is simply, as we have been saying over and over,

  • A group is a set equipped with a group structure

which I maintain is structural in essence: there is no reference in that definition to the fact that elements may themselves have elements. The word “structural” means that it is abstract structure that is paramount, not the internal ontology of elements, which is necessarily not invariant under isomorphism – internal ontology of elements is a consideration which is alien to the practice of working mathematicians (unless they are investigating ZF perhaps, from a platonist point of view).

Presumably, if FMathL is well-developed, the human user can work in the customary style of sets+structure, and it is the job of the computer to then translate (or shoehorn) that into a single object or element. I don’t think the computer would care or have an opinion whether that’s done in ZF or SEAR or whatever, although obviously consideration must be given to what is the most efficient way to do the shoehorning.

Rather than say the structuralist view is “better” (it may certainly be better for certain purposes), and bring in aesthetic disagreements which may well be irreconcilable, I would say that at least in some respects, the structural view is closer to the way mathematics has traditionally been practiced. For example, the idea that a point on the real line may have elements which themselves have elements is, I think you will admit, peculiar to twentieth-century mathematics (and maybe to some extent now), and is an idea that is utterly irrelevant to working practice. And yet this abnormality is an undeniable consequence if one takes ZF and particularly a global membership relation as one’s foundations. I believe there’s some merit in rejecting those consequences as abnormal and irrelevant to mathematics.

On the other hand, a different twentieth-century development which has proven itself extremely relevant to current practice is category theory, which emphasizes universal properties and invariance of structure with respect to isomorphism. A structural development like ETCS takes those precepts very seriously indeed and embeds them as part of the formal development, whereas those precepts for a committed ZF-er would have to remain at the level of “morality” and are not part of the formal set-up.

Don’t get me wrong – as an abstract structure, the cumulative hierarchy is a recursively rich, powerful, and interesting mathematical structure. But as foundations, it’s not particularly pertinent to how mathematicians think about $L^2$ and such things. Those of us committed to category theory have come a bit closer to the essence, I believe, by focusing on things like universal properties as far more relevant to practice.

Posted by: Todd Trimble on September 25, 2009 4:04 PM | Permalink | Reply to this

Re: What is a structured object?

On the other hand, a different twentieth-century development which has proven itself extremely relevant to current practice is category theory, which emphasizes universal properties and invariance of structure with respect to isomorphism.

I’d like to add my 5 cents worth to this discussion by agreeing with Todd. I am not a category theorist and never will be — category theory hurts my head. On the other hand I find it very useful to try to think like a category theorist. Even (especially!) when I am working on something that appears quite far from category theory, like dynamical systems or symplectic toric geometry.

Posted by: Eugene Lerman on September 25, 2009 9:49 PM | Permalink | Reply to this

Re: What is a structured object?

Eugene wrote:

I am not a category theorist and never will be — category theory hurts my head.

The only thing stopping you is that you still think it’s bad for your head to feel that way. It’s actually good — it’s the feeling of new neurons growing.

It’s sort of like the aches and pains you get after lifting more weights than you’re used to. Good weightlifters still feel those aches; they just learn to like them.

Posted by: John Baez on September 26, 2009 3:46 AM | Permalink | Reply to this

it’s the feeling of new neurons growing; Re: What is a structured object?

No pain, no gain, in the visceral brain, or the complex plane.

Posted by: Jonathan Vos Post on September 26, 2009 4:55 PM | Permalink | Reply to this

Re: What is a structured object?

The worrying thing about your weight-lifter analogy is that body builders tear their muscles to promote growth.

Posted by: David Corfield on September 26, 2009 6:23 PM | Permalink | Reply to this

Re: What is a structured object?

David wrote:

The worrying thing about your weight-lifter analogy is that body builders tear their muscles to promote growth.

And what’s worrying about that? I bet the ‘aching head’ feeling I get when struggling to learn new concepts is somehow analogous to the ‘torn muscle’ feeling I get whenever I up the amount of weight I lift at the gym. I bet there’s some real ‘damage’ to one’s conceptual/neurological structure whenever one struggles really hard to master difficult new ideas: comfortable old connections are getting torn apart. But then new improved connections grow to take their place!

I think the people who do well at learning new things are the ones who learn to enjoy the ache. In the case of the ‘torn muscle’ feeling, the pleasure comes from 1) knowing that one is getting stronger, 2) the endorphin high, 3) a learned association between the two. Maybe something similar happens in the intellectual realm.

Posted by: John Baez on September 26, 2009 10:13 PM | Permalink | Reply to this

Re: What is a structured object?

I will say: as someone who has begun a strength-training regime fairly recently, and whose aching arms feel like useless appendages right now, this mini-thread is helping a little bit. Thanks!

Posted by: Todd Trimble on September 27, 2009 4:20 PM | Permalink | Reply to this

No fiber bundle pain, no gain; Re: What is a structured object?

Ironically, the pain from body building comes from fiber bundles. Or, actually, tearing the membranes surrounding bundles of fibers.

Skeletal muscle is made up of bundles of individual muscle fibers called myocytes. Each myocyte contains many myofibrils, which are strands of proteins (actin and myosin) that can grab on to each other and pull. This shortens the muscle and causes muscle contraction.

It is generally accepted that muscle fiber types can be broken down into two main types: slow twitch (Type I) muscle fibers and fast twitch (Type II) muscle fibers. Fast twitch fibers can be further categorized into Type IIa and Type IIb fibers.

These distinctions seem to influence how muscles respond to training and physical activity, and each fiber type is unique in its ability to contract in a certain way. Human muscles contain a genetically determined mixture of both slow and fast fiber types. On average, we have about 50 percent slow twitch and 50 percent fast twitch fibers.

Andersen, J.L.; Schjerling, P; Saltin, B. Scientific American. “Muscle, Genes and Athletic Performance” 9/2000. Page 49

McArdle, W.D., Katch, F.I., and Katch, V.L. (1996). Exercise physiology : Energy, nutrition and human performance

Lieber, R.L. (1992). Skeletal muscle structure and function : Implications for rehabilitation and sports medicine. Baltimore : Williams and Wilkins.

Andersen, J.L.; Schjerling, P; Saltin, B. Muscle, Genes and Athletic Performance. Scientific American. Sep 2000

Thayer R., Collins J., Noble E.G., Taylor A.W. A decade of aerobic endurance training: histological evidence for fibre type transformation. Journal of Sports Medicine and Phys Fitness. 2000 Dec; 40(4).

Posted by: Jonathan Vos Post on September 28, 2009 7:09 AM | Permalink | Reply to this

Clues To Reversing Aging Of Human Muscle Discovered; Re: No fiber bundle pain, no gain; Re: What is a structured object?

DOING Math (what Erdos called “being alive”) also helps reverse the effects of aging on the Brain. I don’t much like the common analogy: “The brain is a muscle; use it or lose it” because, you know, the brain is NOT a muscle. Yet regular and vigorous use IS beneficial, and to an extent that surprises many people.

Clues To Reversing Aging Of Human Muscle Discovered

… “Our study shows that the ability of old human muscle to be maintained and repaired by muscle stem cells can be restored to youthful vigor given the right mix of biochemical signals,” said Professor Irina Conboy, a faculty member in the graduate bioengineering program that is run jointly by UC Berkeley and UC San Francisco, and head of the research team conducting the study. “This provides promising new targets for forestalling the debilitating muscle atrophy that accompanies aging, and perhaps other tissue degenerative disorders as well.”…

Morgan E. Carlson, Charlotte Suetta, Michael J. Conboy, Per Aagaard, Abigail Mackey, Michael Kjaer, Irina Conboy. Molecular aging and rejuvenation of human muscle stem cells. EMBO Molecular Medicine, 2009; DOI: 10.1002/emmm.200900045

Posted by: Jonathan Vos Post on September 30, 2009 9:01 PM | Permalink | Reply to this

Re: What is a structured object?

If you say this …

I agree that one can probably fill in all details, and that this gives a way to define formally what the category FG of finite groups is, and hence what a finite group is, namely an element of Ob(FG).

then naturally you will say this …

What I no longer understand now, however, is the claim that this way of organizing mathematics is superior to that of basing it on ZF since it is structural rather than material.

A finite group ‘is’ a set equipped with a group structure. If it is vital to encode this formally as a single object, then supplement SEAR or ETCS with a dependent type theory with dependent sums. But it is not essential to mathematical practice to do so.

If you want to have a collection of finite groups (or whatever), then any foundations requires some reasoning to show that your collection is valid. (After all, a collection of literally ‘all’ finite groups is impossible in ZFC, as is a collection of all groups whatsoever in either ZFC or ETCS.) Although other methods may be available in some cases, the uniform way to do this is by using the Axiom of Collection: you find some way to index your objects by a set, and the axiom gives you your collection.

In material set theory, you can set things up so that each object is literally an element of the collection, which is convenient; this wouldn't make sense in structural set theory, so you instead introduce an abuse of language in which the ‘elements’ of the collection are actually the fibres over the elements of the index set (together with the structures defined on those fibres).

I said that material set theory is convenient, but in fact it is not convenient enough! Even in ZFC, there is no small category FG such that a finite group is literally the same as an object of FG. Instead, if you insist on recovering the notion of finite group from the category FG, then you can define a finite group to be a set $U$ together with an object $S$ of FG and a bijection between $U$ and the underlying set of $S$. In ZFC, presumably the ‘underlying set’ of $S$ is the first entry in a tuple $(S,m)$; in ETCS, the ‘underlying set’ of $S$ is as defined in Todd's comment. (In both cases, it takes another step to recover the group in the usual sense, as a set together with a group operation.) Once again, structural set theory prevents a potential mistake (thinking that $G$ is not a finite group because it is not literally an object of FG) by throwing up a typing error.

Moreover, for a (general) group, one has a similar messy construction, and a finite group is no longer a group but only “becomes” a group under the application of a suitably defined functor.

Hopefully you see now that this is not true in ETCS, but even so … this is no worse than the fact that a Riemannian manifold only “becomes” a manifold under the application of (in the structured-sets-as-tuples formalisation) projection onto the first entry (or possibly even something a bit more complicated).

Posted by: Toby Bartels on September 25, 2009 9:39 PM | Permalink | Reply to this

Re: What is a structured object?

AN: Moreover, for a (general) group, one has a similar messy construction, and a finite group is no longer a group but only “becomes” a group under the application of a suitably defined functor.

TB: this is no worse than the fact that a Riemannian manifold only “becomes” a manifold under the application of (in the structured-sets-as-tuples formalisation) projection onto the first entry (or possibly even something a bit more complicated).

I think the standard mathematical language teaches something different that gets lost both by encoding it into ZF and by encoding it in ETCS or SEAR, though in different ways.

In mathematical practice, to say that an object is a group or a manifold says that it has certain properties. To say that it is a finite group or a Riemannian manifold adds properties but of course preserves all previous properties.

Similarly, to say that a subset $H$ of a group $G$ is a subgroup if it is closed under products and inversion is not an abuse of notation (as was claimed in the discussion on SEAR), since the subset $H$ is not only a set and a subset of $G$ but inherits from the group a product mapping from $H \times H$ to $G$ (and even one from $H \times G$ to $G$, etc.), and if the subgroup condition holds, this is a mapping from $H \times H$ to $H$ and hence the binary operation alluded to in calling it a subgroup.
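(Spelled out: the inherited product is the restriction $m|_{H\times H}: H\times H\to G$ of the multiplication $m: G\times G\to G$, and the subgroup condition says exactly that its values lie in $H$, so it corestricts to a map $H\times H\to H$; nothing has to be invented, only restricted.)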

Similarly, to say that $L^2(\mathbb{R})$ and $L^2(\mathbb{R}^3)$ are separable Hilbert spaces does not strip them of any distinguishing property these spaces have by construction, although the category of separable Hilbert spaces contains only one object up to isomorphism.

Thus almost all objects mathematicians talk about are almost always equipped with lots of stuff through their context, but neither the formalization in ZF nor that in SEAR or ETCS (or Coq, etc.) takes account of that.

That one doesn’t use all this extra structure all the time is not to be handled by deleting entries from the tuple (in ZF) or by applying a forgetful functor (in the structural approach) but by the same common sense that logicians use when they list in some formal natural deduction only the stuff they actually used.

That the categorial way alone cannot capture this essence of mathematics is quite obvious from simple examples:

If $G\in Ob(Grp)$, any sane mathematician infers that $G$ is equipped with a set structure with which the assumption $x,y,z\in G$ makes sense, and infers that there is a product operation for which $xy\in G$ and $(xy)z=x(yz)$.

But this only holds if $Grp$ is the category materially constructed by the definition of $Grp$, and not (as claimed in this discussion - I don’t remember by whom) if one forgets this construction once the category is formed, and only retains the isomorphism class of the category.

Thus the “structural” point of view actually loses structure!

Sometimes the loss of structure is dramatic: The category $CLOF$ of closed linearly ordered fields and the category $E7$ of undirected graphs isomorphic to the $E_7$ Dynkin diagram are isomorphic, but objects from these two categories have very different properties. There is not even a canonical isomorphism between the two categories. Here the essence is completely lost.

Posted by: Arnold Neumaier on September 30, 2009 12:27 PM | Permalink | Reply to this

Re: What is a structured object?

I assume that by “properties” you mean “properties or structure or stuff” (around here we use a precise meaning of property according to which a finite group is a group with extra properties, but a Riemannian manifold is not a manifold with extra properties (but rather extra structure)).

I agree that both ZF and ETCS/SEAR handle this issue clumsily, albeit clumsily in different ways. However, I think this argument:

If G∈Ob(Grp) any sane mathematician infers that G is equipped with a set structure with which the assumption x,y,z∈G makes sense, and infers that there is a product operation for which xy∈G and (xy)z=x(yz).

But this only holds if Grp is the category materially constructed by the definition of Grp, and not… if one forgets this construction once the category is formed, and only retains the isomorphism class of the category.

misses the point. If one wants to treat Grp as an abstract category, then one forgets how its objects were constructed (which has nothing to do with materiality), just as if one wants to treat $A_5$ as an abstract group, one forgets that its elements have a natural action on some 5-element set. However, nothing forces us to do that forgetting as soon as the object is formed, and quite often we don’t.

But, as I said, I agree that both ZF and ETCS/SEAR are clumsy about moving between different levels of properties or structure. This would be something that would be great for a higher-level formalization to improve on.

Actually, it strikes me right now that this issue is very similar to class inheritance in object-oriented programming. When we say that a Riemannian manifold is a manifold, the “is a” really has the same meaning as in OOP: a Riemannian-manifold object can be used anywhere that a manifold is expected, but it doesn’t thereby lose its Riemannianness (although if we access it only through a manifold ptr then we can’t use any of its Riemannianness). From this point of view, the clumsiness of existing foundations amounts to requiring all upcasts to be explicit.
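
To make the analogy concrete, here is a minimal C++ sketch; the class names, the toy metric member, and the describe function are made up purely for illustration and stand in for whatever a real formalisation would provide.

```cpp
#include <iostream>

// A plain manifold "interface": the only things a bare manifold knows how to do.
struct Manifold {
    virtual int dimension() const { return 2; }
    virtual ~Manifold() = default;
};

// A Riemannian manifold "is a" manifold, with extra structure (a toy metric).
struct RiemannianManifold : Manifold {
    double metric(double v, double w) const { return v * w; }
};

// A function expecting only a manifold; a RiemannianManifold may be passed in
// (the upcast is implicit), but through this reference its metric is invisible.
void describe(const Manifold& m) {
    std::cout << "dimension: " << m.dimension() << "\n";
    // m.metric(1.0, 2.0);  // would not compile: the base type knows nothing of it
}

int main() {
    RiemannianManifold rm;
    describe(rm);                               // implicit upcast, nothing is lost
    std::cout << rm.metric(1.0, 2.0) << "\n";   // the metric is still available on rm itself
}
```

Passing rm to describe performs an implicit upcast; nothing about rm itself is forgotten, only the view through the Manifold-typed reference is restricted. In existing foundations one has to write the analogue of that upcast by hand.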

Posted by: Mike Shulman on September 30, 2009 5:53 PM | Permalink | PGP Sig | Reply to this

Re: What is a structured object?

MS: I assume that by “properties” you mean “properties or structure or stuff”

Yes. For me extra properties and extra structure are synonymous. It is just something more that can be profitably exploited for reasoning.

MS: However, nothing forces us to do that forgetting as soon as the object is formed, and quite often we don’t.

This moral sounds quite different from, and much more agreeable than, the one I had to put up with many times before:

“However, once the construction is finished, we generally forget about it” “but in each case once the construction is performed, its details are forgotten. I always assumed, without really thinking much about it, that all modern mathematicians thought in this way” “once the construction is performed, the fact that you used “the same” objects is discarded.” “When you construct one category from another, you might use the “same” set of objects, but once you’ve constructed it, there is no relationship between the objects, because after all any category is only defined up to equivalence.” (quotes from your earlier mails)

“two categories may have an object in common, but you should never use that fact.” “You’re completely (intentionally?) missing the distinction I drew between a construction demonstrating the existence of a model of a structure and the subsequent use of the properties of a structure. As I said before, “moral” (which was someone else’s term) refers to the latter segment, not the former. […]” (John Armstrong)

You now seem to say that all the categories can be considered as concrete categories or as abstract categories depending on the purpose the mathematician wants to achieve. This is fine with me. Indeed, the standard (material) definition of a category is precisely that of a concrete category. And in concrete categories I am allowed to do all the stuff you wanted to forbid: compare objects of different categories, check whether the objects of one form a subclass of those of the others, create intersections of the class of objects of two different categories, etc. Once this is allowed, I have no problems at all with the categorial language (except for lack of fluency in expressing myself in it). One has all this structure around unless one deliberately forgets it. There is no moral that tells one that one should forget it, except if one wants to forget it.

It was only the strange moral that was imposed on it without having any formal justification that bothered me.

MS: it strikes me right now that this issue is very similar to class inheritance in object-oriented programming. […] From this point of view, the clumsiness of existing foundations amounts to requiring all upcasts to be explicit.

Yes. This is why FMathL will have on the specification level a much more flexible type-like system that borrows much more from the theory of formal languages than from the theory of types.

Posted by: Arnold Neumaier on September 30, 2009 6:49 PM | Permalink | Reply to this

Re: What is a structured object?

You now seem to say that all the categories can be considered as concrete categories or as abstract categories depending on the purpose the mathematician wants to achieve.

Yes, of course. The comments you quoted were in a different context, explaining that (for example) the particular construction of the real numbers as Dedekind cuts is usually forgotten once we have the real numbers, so that it is better if you can forget it rather than actually have the real numbers be Dedekind cuts as in material set theory.

I made this same point here.

Indeed, the standard (material) definition of a category is precisely that of a concrete category.

No, I don’t think so. Some people have a precise definition of a “concrete category,” but here I’m thinking of it in a more vague way like “a category together with some information preserved from its construction.” I don’t see what this has to do with materiality.

And in concrete categories I am allowed to do all the stuff you wanted to forbid: compare objects of different categories, check whether the objects of one form a subclass of those of the others, create intersections of the class of objects of two different categories, etc.

No. If two concrete categories $C$ and $D$ both have a forgetful functor to $Set$ (being part of the information you remembered from their constructions), then you can ask whether the underlying sets of an object $x\in C$ and $y\in D$ are isomorphic, or whether every set that underlies an object of $C$ also underlies an object of $D$, or consider the collection of all sets that underlie both an object of $C$ and an object of $D$, but these are quite different things from the forbidden ones.

Posted by: Mike Shulman on September 30, 2009 8:25 PM | Permalink | PGP Sig | Reply to this

Re: What is a structured object?

AN: Indeed, the standard (material) definition of a category is precisely that of a concrete category.

MS: No, I don’t think so. Some people have a precise definition of a “concrete category,”

You had at least the additional qualifier of an abstract category, which seems to be something different from the category as defined in the textbooks.

MS: but here I’m thinking of it in a more vague way like “a category together with some information preserved from its construction.”

I am referring to the standard definition of a (small) category C found everywhere, with the standard interpretation of Ob(C) as a class (or set) in the traditional sense (not SEAR, not ETCS, which, in most textbooks, do not figure early if at all).

There is nothing vague in this definition beyond what is vague in any mathematical discourse.

This definition does not ask you to forget anything about the category you constructed.

Indeed, the definition does not even provide a formal mechanism for forgetting. The reason is presumably either that such an automatic mechanism was never intended by those who invented and traded the definition, or that it is difficult to formalize rigorously at this stage.

On the contrary, to forget something you need to do something to the category, and no such doing is formally specified in any introduction to category theory.

It is an additional moral that you want to impose without specifying it axiomatically.

But what is not in the axioms can be ignored by anyone working with them, without harming in the least the correctness of what is done, and without affecting any consistent interpretation of the axioms.

AN: And in concrete categories I am allowed to do all the stuff you wanted to forbid: compare objects of different categories, check whether the objects of one form a subclass of those of the others, create intersections of the class of objects of two different categories, etc.

MS: No.

My example of the categories $C_{abcd}$ etc. is still there; nobody has shown me any conflict with the standard definitions of a category and a subcategory (interpreted with Ob(C) as a class in the traditional sense).

If you want to consistently uphold your No, you’d have to prove my assertions there wrong!

Posted by: Arnold Neumaier on September 30, 2009 9:08 PM | Permalink | Reply to this

Re: What is a structured object?

We have already been over this same territory several times. In my view we have given adequate responses to all of these issues, including your category $C_{abcd}$. I don’t have time to repeat the same arguments again, especially since I have no reason to believe the communication would be any more successful the second or third time around. So I guess we’re at an impasse here.

Posted by: Mike Shulman on October 1, 2009 5:00 AM | Permalink | PGP Sig | Reply to this

Re: What is a structured object?

I am referring to the standard definition of a (small) category C found everywhere

So are we. You're focussing on the question of whether you can take two arbitrary categories C and D and ask whether C is a subcategory of D, but really you should (to avoid confusion with other issues around categories) start with the question of whether you can take two arbitrary sets C and D and ask whether C is a subset of D. (Or perhaps use groups instead of sets.) Certainly you can do the former if you can do the latter, which is obvious enough looking at the standard definition. But doing the latter is already objectionable (at the very least, an abuse of language) from a structuralist perspective.

with the standard interpretation of Ob(C) as class (or set) in the traditional sense (not SEAR, not ETCS, […]).

How can you tell?

Saunders Mac Lane, one of the two people who first defined categories, is on record as preferring ETCS as a foundation of mathematics. (This is quoted in that McLarty paper that's been linked here.) He considers his concept of category perfectly well formalised by structural set theory.

There is an additional complication, which goes beyond merely having structural foundations, that even within a single arbitrary small category, one should not be able to compare objects for equality (only for isomorphism); this is the problem of evil. ETCS and SEAR do allow this, while I would prefer a foundation of category theory that does not. I know some ways to approach this, but I don't think that it's a solved problem yet.

Posted by: Toby Bartels on October 1, 2009 9:07 AM | Permalink | Reply to this

Re: What is a structured object?

TB: really you should (to avoid confusion with other issues around categories) start with the question of whether you can take two arbitrary sets C and D and ask whether C is a subset of D.

According to the first paragraph of the Prerequisites in Serge Lang, Algebra, second printing 1970 (who treats categories in Chapter I.7), I am allowed to do this. I take this to be the standard point of view.

Lang’s context allows me to do everything I did with $C_{abcd}$ etc., although your moral forbids it.

TB: Saunders Mac Lane, one of the two people who first defined categories, is on record as preferring ETCS as a foundation of mathematics.

So he allows only bounded comprehension in mathematics?

If your view is right, it depends on the foundations of mathematics whether one is allowed to do the things I do. This would mean that the foundations are not equivalent.

But this would conflict with the result by Osius (which I still need to check) that ETCS+R is equivalent to ZFC.

I think you cannot consistently claim both.

Posted by: Arnold Neumaier on October 1, 2009 10:47 AM | Permalink | Reply to this

Re: What is a structured object?

TB: Saunders Mac Lane, one of the two people who first defined categories, is on record as preferring ETCS as a foundation of mathematics.

So he allows only bounded comprehension in mathematics?

Toby didn’t say that; he said “preferred foundations”. Saunders would have been very happy to allow you to speak if you were giving him an instance of unbounded separation, and was well familiar with ZFC and its cousins.

Saunders’ position was that just about all core mathematics (what goes on in basic courses on functional analysis, algebraic topology, and so on) can be developed on the basis of ETCS. Not all developments – he was well aware that some set-theoretic constructions required going beyond ETCS. I think he chose not to be too exercised by that, but he may have had some occasional doubts. (I got to know Saunders rather well during my Chicago years, so I think I can say this.) He was also much concerned with making ETCS more accessible to people; I think this worried him more than any limitations of ETCS.

Posted by: Todd Trimble on October 1, 2009 2:40 PM | Permalink | Reply to this

Re: What is a structured object?

According to the first paragraph of the Prerequisites in Serge Lang, Algebra, second printing 1970 (who treats categories in Chapter I.7), I am allowed to do this. I take this to be the standard point of view.

And so it is. And yet, nowhere does Lang actually use the idea that one can take two arbitrary sets and ask whether one is contained in the other; he never needs to. He may take a set $U$ and then consider an arbitrary subset of $U$; what this means can be defined (or even taken as axiomatic) in structural foundations. And he may take two arbitrary subsets of some set $U$ and ask whether one is contained in the other, which can also be defined structurally. But there is no need in ordinary mathematics to take two arbitrary sets and ask whether one is contained in the other; even if one thinks it meaningful, it never matters.

Lang’s context allows me to do everything I did with $C_{abcd}$ etc., although your moral forbids it.

I'm not sure why you keep saying this. Is there anything that you did with $C_{abcd}$ etc. that we have not yet formalised structurally?

If your view is right, it depends on the foundations of mathematics whether one is allowed to do the things I do. This would mean that the foundations are not equivalent.

ETCS is equivalent to BZC (which is ZFC without replacement and with only bounded separation). ETCS+R is equivalent to ZFC (since replacement and bounded separation together imply full separation).

Posted by: Toby Bartels on October 1, 2009 7:58 PM | Permalink | Reply to this

Re: What is a structured object?

In mathematical practice, to say that an object is a group or a manifold says that it has certain properties.

This connects with the idea earlier that ‘ordered monoids are the objects in the intersection of Order and Monoid satisfying the compatibility relation’. As I said then, I would be very interested to see a formalism in which this can be taken literally!

But it would be tricky. We should be able to say, for example, that a ring (that is, an associative unital ring) is an object that is both an abelian group and a monoid, satisfying a compatibility relation. But since every abelian group is already a monoid, surely $AbGrp \cap Mon = AbGrp$, so now it has only one structure, which the compatibility condition forces to be trivial! (For an even worse example, try a commutative rig, where now both structures are commutative monoid structures.)

One thing that you could do is to say that a ring is an object that is both an additive abelian group and a multiplicative monoid, satisfying a compatibility condition. Then you seem to have to define monoids twice, and you're forbidden to say that $(]0,\infty[,\ \cdot\ ,\ (a,b) \mapsto a^{\log b})$ is a ring, even though we have found it useful to say so. Of course, there may be ways around that, but I don't know them.

The way that I do know to formalise the idea that a ring may be defined as somehow both an abelian group and a monoid is to start with $AbGrp \times Mon$ and then carve out $Ring$ with a compatibility condition which includes having the same underlying set. (On the face of it, this is evil, but I know ways around that. And in any case, there's no point worrying about evil if one doesn't even have structuralism.) This only makes sense, as far as I can tell, if a group is a set equipped with some structure rather than simply a set satisfying some property.
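
Schematically, the carving-out looks something like this (a sketch of the strict version, writing $U$ for the underlying-set assignments; ‘same underlying set’ is the evil clause just mentioned): $$\mathrm{Ob}(Ring) \;=\; \{\, (A, M) \in \mathrm{Ob}(AbGrp) \times \mathrm{Ob}(Mon) \;\mid\; U(A) = U(M) \text{ and multiplication distributes over addition} \,\}.$$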

Once one has grown out of the idea that a group is literally simply a certain kind of set, then it's not so hard to accept that an abelian group might not be literally simply a certain kind of group, even when that can still be done.

Sometimes the loss of structure is dramatic: The category $CLOF$ of closed linearly ordered fields and the category $E7$ of undirected graphs isomorphic to the $E_7$ Dynkin diagram are isomorphic, but objects from these two categories have very different properties. There is not even a canonical isomorphism between the two categories. Here the essence is completely lost.

Again, you can always put that structure back if you want it. Then you have categories equipped with some structure rather than just categories. (In particular, you might equip $CLOF$ with the structure of its inclusion into the category of fields, and you might equip $E7$ with its inclusion into the discrete category of Dynkin diagrams, which itself has functors to various categories, such as $LieAlg$.) Incidentally, there is a canonical relationship between these categories; there is an adjoint equivalence between them that is unique up to unique isomorphism, which is enough. But of course, the structures that we like to put on them are very different (even though we could put each structure on the other category if we wished).

Posted by: Toby Bartels on September 30, 2009 6:32 PM | Permalink | Reply to this

Re: What is a structured object?

AN: In mathematical practice, to say that an object is a group or a manifold says that it has certain properties.

TB: This connects with the idea earlier that ‘ordered monoids are the objects in the intersection of Order and Monoid satisfying the compatibility relation’. As I said then, I would be very interested to see a formalism in which this can be taken literally!

I am working on that and will soon show you how it can be done in FMathL.

TB: Once one has grown out of the idea that a group is literally simply a certain kind of set, then it’s not so hard to accept that an abelian group might not be literally simply a certain kind of group, even when that can still be done.

I never had this idea, hence could not outgrow it.

I always had the idea that although a group G is different from the set making up the elements of G, G contains precisely these elements. Thus I always doubted the semantic legitimacy of the extensionality property of sets, since mathematical practice does not support it.

For exactly the same reasons I oppose the idea that an abelian group should not be literally a group.

It is like claiming that a person is not literally a man or a woman because you add structure in the form of a gender. This is completely foreign to my understanding of language.

Mathematical language shares this additive property of natural language, and good foundations should preserve this important feature.

TB: you can always put that structure back if you want it.

In ordinary language, in informal mathematical language, and in FMathL you never lose it, unless you want to lose it.

Posted by: Arnold Neumaier on September 30, 2009 8:35 PM | Permalink | Reply to this

Re: What is a structured object?

I always had the idea that although a group G is different from the set making up the elements of G, G contains precisely these elements.

So have I. More explicitly: the type of elements of $G$ is the same as (syntactically identical with, to be formal) the type of elements of the underlying set of $G$. In that sense, $G$ has the same elements as its underlying set.

Thus I always doubted the semantic legitimacy of the extensionality property of sets, since mathematical practice does not support it.

But $G$ is not a set, is it? So don't you still have (in a material framework) $A = B$ if $A$ and $B$ are sets with the same elements?

For exactly the same reasons I oppose the idea that an abelian group should not be literally a group.

That seems backwards to me; you just said that a group is different from the set of its elements (aka its underlying set), but now you want an abelian group to be the same as its underlying group. (Not that you can't still have that if you want it!)

Mathematical language shares this additive property of natural language, and good foundations should preserve this important feature.

As a reminder, $\mathbf{ETCS}$ and $\mathbf{SEAR}$ have this feature. But if you strengthen them with an inaccessible cardinal and speak of small groups and abelian groups, then it may (depending on how you choose to define phrases like ‘small group’) become an abuse of language. (If not, then it will become an abuse of language to say that a small group is an object of the category of small groups.)

you can always put that structure back if you want it.

In ordinary language, in informal mathematical language, and in FMathL you never lose it, unless you want to lose it.

If you define something in such a way that the structure is there, then the same is true of structural foundations. (But Mike just explained this better than I can.)

Posted by: Toby Bartels on September 30, 2009 8:52 PM | Permalink | Reply to this

Re: What is a structured object?

AN: I always had the idea that although a group G is different from the set making up the elements of G, G contains precisely these elements.

TB: So have I. More explicitly: the type of elements of G is the same as (syntactically identical with, to be formal) the type of elements of the underlying set of G. In that sense, G has the same elements as its underlying set.

In mathematical practice, if $G$ is a group and $X=Set(G)$ its associated set then $x\in G$ iff $x\in X$. In FMathL, both statements are well-formed, equivalent expressions.

TB: But G is not a set, is it? So don’t you still have (in a material framework) A=B if A and B are sets with the same elements?

Yes. You may consider FMathL as a material framework (but object based, not set-based). There your statement is literally true. The point is that many kinds of objects can contain elements, not only sets.

TB: you just said that a group is different from the set of its elements (aka its underlying set), but now you want an abelian group to be the same as its underlying group.

Yes. I don’t see any problem here.

An abelian group should be a group for the same reason that an old man is a man.

I am just observing mathematical practice and language, and translating it into rules for specifying what one is allowed to say and to do.

AN: Mathematical language shares this additive property of natural language, and good foundations should preserve this important feature.

TB: As a reminder, ETCS and SEAR have this feature.

I can’t see that. You are not allowed to write $x \in S \in \Sigma$; your subsets already have a completely different structure than your sets.

In the additive view, a subset is a set, and a subgroup is a group.

Posted by: Arnold Neumaier on September 30, 2009 11:10 PM | Permalink | Reply to this

Re: What is a structured object?

In mathematical practice, if $G$ is a group and $X=Set(G)$ its associated set then $x \in G$ iff $x \in X$. In FMathL, both statements are well-formed, equivalent expressions.

This is how I would use the language too.

As you know, there is no way in $\mathbf{SEAR}$ or $\mathbf{ETCS}$ to describe a group with a single symbol $G$. So I suggested above that such a system should be supplemented with dependent sums of all types to make the language more natural. In that case, $x \in G$ would not, a priori, have any meaning, but then I would simply define it to mean $x \in Set(G)$, where $Set(G)$ is the underlying set of $G$ (the main component of $G$).

Yes. I don’t see any problem here.

Neither do I, it just seemed like you were arguing that the situation was supposed to be analogous, when it's not. (This is the difference between structure and property.)

You are not allowed to write $x \in S \in \Sigma$; your subsets already have a completely different structure than your sets.

Yes, we are allowed to write that!

To explain how, remember that $a \in B$ has two rather different meanings in a structural set theory; one meaning (for which I usually write $a\colon B$, borrowed from type theory) is used when introducing $a$ as a new variable, and it's the typing declaration that $a$ is to be an element of the set $B$. The other meaning (for which one might write $a \in_U B$, although I usually don't) is as a proposition, used when $a$ is already known as standing for an element of some set $U$ and $B$ is (not simply a set in its own right but) a subset of $U$.

If $x$, $S$, and $\Sigma$ are all known, $x$ is an element of $U$, $S$ is an element of the power set of $U$, and $\Sigma$ is an element of the power set of the power set of $U$, then $x \in S$ and $S \in \Sigma$ both make sense as propositions, and $x \in S \in \Sigma$ is the usual abbreviation for their conjunction.

Now suppose that only $\Sigma$ is known, as a subset of the power set of some set $U$. Since $S$ is a new variable, $S \in \Sigma$ must be interpreted as a typing declaration, but $\Sigma$ is a subset rather than a set, so what does this mean? If it has no a priori meaning, then we can define it to mean whatever we like; just as with $x \in G$ above, the sensible definition that matches the usual practice is to define it to mean that $S$ is an element of the underlying set of $\Sigma$ (which Mike writes as $|\Sigma|$ in his axioms for $\mathbf{SEAR}$). Now what about $x \in S$? Now $S$ is not literally a set, not even literally a subset of anything, but the general principle is that whenever a variable stands for an element of the underlying set of a subset of some set $P$, then it inherits everything that refers to elements of $P$. In this case, $P$ is the power set of $U$, and we know what $x \in S$ means when $S$ is an element of that (which is to say, when $S$ is a subset of $U$): that $x$ is an element of the underlying set of that subset.

So in full detail in the formal language of $\mathbf{SEAR}$, if you write ‘Let $x \in S \in \Sigma$.’ in a context where $x$ and $S$ are unknown but $\Sigma$ is a subset of the power set of $U$, then you mean ‘Let $S$ be an element of $|\Sigma|$, and let $x$ be an element of $q(|\Sigma|)$.’. But normally, one suppresses $q$ and the vertical bars; you can throw them in wherever they are necessary for something to make sense. It is much like suppressing the map from $G$ to $Set(G)$ when writing $x \in G$.

In the additive view, a subset is a set, and a subgroup is a group.

In $\mathbf{ETCS}$, the way in which a subset is a set is perfectly analogous to the way in which a group is a set. ($\mathbf{SEAR}$ works a little differently.)

Posted by: Toby Bartels on October 1, 2009 10:12 AM | Permalink | Reply to this

Re: What is a structured object?

AN: In the additive view, a subset is a set, and a subgroup is a group.

TB: In ETCS, the way in which a subset is a set is perfectly analogous to the way in which a group is a set. (SEAR works a little differently.)

This way is only a way based on heavy abuse of language, nothing real.

In standard mathematics, a group can be in no sense a set, since there may be different groups on the same set. Thus one must always distinguish between a group and its underlying set.

On the other hand, there are no such obstructions for not regarding an abelian group as a group, a subgroup as a group, or a subset as a set.

Therefore, no such obstructions should be artificially created on the formal level.

Posted by: Arnold Neumaier on October 1, 2009 11:53 AM | Permalink | Reply to this

Re: What is a structured object?

This way is only a way based on heavy abuse of language, nothing real.

The abuses of language that come in when treating a subset of $U$ as a set with extra structure (an injection to $U$, working in $\mathbf{ETCS}$) are the same abuses of language as are used when treating a group as a set with extra structure (a binary operation satisfying the well-known properties). An element of a group may be defined to be an element of its underlying set; an element of a subset of $U$ may be defined to be an element of its underlying set. A function on a group may be defined to be a function on its underlying set; a function on a subset of $U$ may be defined to be a function on its underlying set. And so on.

In standard mathematics, a group can be in no sense a set, since there may be different groups on the same set. Thus one must always distinguish between a group and its underlying set.

I might just as well say that a subset of $U$ can be in no sense a set, since there may be different injections to $U$ from the same set. However, it would be silly of me to say this, since there are well-known formalisations of mathematics (such as $\mathbf{ZFC}$, and indeed any set theory in that style) in which it is literally true that a subset of $U$ is a set; in these frameworks, every subset of $U$ is a unique set, while only some sets are subsets of $U$.

There are also formalisations of mathematics in which it is literally true that a group is a set; in such a framework, every group is a unique set, while only some sets are groups. Here is how to construct one, starting with $\mathbf{ZFC}$: Add to the language a unary predicate $\mathcal{G}$, with the intended meaning of $\mathcal{G} X$ being that $X$ is a group. Also add to the language a partial operation $\mu$, with the intended meaning of $\mu X$ being the group structure on $X$. Add the axiom that $\mu X$ is defined if and only if $\mathcal{G} X$ holds, in which case $\mu X$ is a group structure (as usually defined in $\mathbf{ZFC}$) on $X$. Call this system $\mathbf{ZFC} + \mathcal{G}$.
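
Written out, one way to phrase the added axiom is (just a sketch; $\mathrm{GrpStr}(X, \mu X)$ is my shorthand for the usual $\mathbf{ZFC}$ formula saying that $\mu X$ is a group structure on $X$): $$\forall X\ \bigl( \mathcal{G} X \leftrightarrow \mu X \text{ is defined} \bigr) \qquad\text{and}\qquad \forall X\ \bigl( \mathcal{G} X \rightarrow \mathrm{GrpStr}(X, \mu X) \bigr).$$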

We must check that this is equivalent to $\mathbf{ZFC}$. Actually, I'm not sure that it is; but $\mathbf{ZFC} + \mathcal{G} + \varepsilon$ is equivalent to $\mathbf{ZFC} + \varepsilon$, where $\varepsilon$ is the global choice operator. Starting with the latter, use global choice to pick a representative from each isomorphism class of groups in such a way that no two representatives have the same underlying set. (This requires checking that there are enough underlying sets; we would be in trouble here if there were more than one group structure on the empty set!) Then define $\mathcal{G} X$ to mean that $X$ is the underlying set of one (and hence only one) of these representatives, and let $\mu X$ be the group structure on that representative. The other direction is obvious.

Of course, all that is just to show that you can justify $\mathbf{ZFC} + \mathcal{G} + \varepsilon$ on the basis of $\mathbf{ZFC} + \varepsilon$; the formal system $\mathbf{ZFC} + \mathcal{G}$ is a perfectly straightforward foundation in its own right, so why not use it? If you want a subset to be literally a kind of set, OK, but why not the same for a group? and a manifold? and …?

Posted by: Toby Bartels on October 1, 2009 8:26 PM | Permalink | Reply to this

The intersection of categories

TB: This connects with the idea earlier that ‘ordered monoids are the objects in the intersection of Order and Monoid satisfying the compatibility relation’. As I said then, I would be very interested to see a formalism in which this can be taken literally!

Here is a draft of a paper that gives a formalization in FMathL in which this can be taken literally!

It was achieved without using any abuse of language. It provides a unified formal account of what mathematicians use intuitively when they combine algebraic structures. Surely this is part of the essence of mathematics. Therefore I challenge you to formally present the same information and relationships according to your own categorial moral code, but (like I did) without using any abuse of language.

Then we can compare readability, complexity, and elegance.

Posted by: Arnold Neumaier on October 4, 2009 12:09 AM | Permalink | Reply to this

Re: The intersection of categories

Here is a draft of a paper that gives a formalization in FMathL in which this can be taken literally!

I'm impressed. I'm also getting a feel for how FMathL is used. It certainly does not strike me as elegant, but I can see how, by focussing on the symbols used, it tries to match common abuses of language by making them not abuses.

I see that you do indeed distinguish between the category of monoids whose binary operation is called * and the category of monoids whose operation is called +, so (although you don't remark on this in your draft) the category of rings is a subcategory of their intersection.

It seems to me that many things that people think are not abuses of language become abuses in FMathL. I think that a lot of people (not thorough structuralists, of course) would say that these two categories of monoids are indeed equal, but in FMathL they are not.

Posted by: Toby Bartels on October 4, 2009 12:45 AM | Permalink | Reply to this

Re: The intersection of categories

TB: I’m impressed. I’m also getting a feel for how FMathL is used. It certainly does not strike me as elegant, but I can see how, by focussing on the symbols used, it tries to match common abuses of language by making them not abuses.

Once you get used to it, you’ll really like it. I spent several years thinking about these issues to make the system really match mathematical practice with minimal adjustments.

You can do all structural mathematics as naturally as with Bourbaki’s foundation (whom you conceded to be essentially structuralist in content though material in foundation).

You can even work in FMathL with SEAR proper if you introduce a new equality between SEAR elements and between SEAR relations that is different from but overloaded to the FMathL equality. This completely hides the intrinsic FMathL equality.

But everyone is free to use the whole continuum of mathematical styles from extremal materialism to extremal structuralism and all shades in between.

TB: I see that you do indeed distinguish between the category of monoids whose binary operation is called * and the category of monoids whose operation is called +, so (although you don’t remark on this in your draft) the category of rings is a subcategory of their intersection.

AN: I added this to the new version 0.91 of Binary categories, and made there a few other small improvements, such as making the property labels match those of the framework paper and improving the formulation of (P35). I also uploaded here a slightly improved version 1.03 of the framework paper, among others with the corresponding change in (P35).

TB: It seems to me that many things that people think are not abuses of language become abuses in FMathL. I think that a lot of people (not thorough structuralists, of course) would say that these two categories of monoids are indeed equal, but in FMathL they are not.

The fact is that this is already an abuse of language in the traditional formalizations. Certainly it is one for Serge Lang’s setting, and probably for every introduction to categories that does not explicitly start with SEAR.

Thus FMathL does not lose anything, but adds clarity in taking stock of what are abuses of language.

It is quite difficult to declare the two categories of monoids with extra structure in the form of a name for the operation symbol to be formally equal.

Indeed, there are very good reasons why mathematicians distinguish between additive and multiplicative abelian groups, and have separate phrases for talking about them.

Posted by: Arnold Neumaier on October 4, 2009 9:44 AM | Permalink | Reply to this

Re: The intersection of categories

I think that a lot of people (not thorough structuralists, of course) would say that these two categories of monoids are indeed equal, but in FMathL they are not.

The fact is that this is already an abuse of language in the traditional formalizations.

That doesn't seem right to me. In ZF, a monoid is a pair consisting of a set and an associative unital operation on that set; there is no room to specify a symbol to go with it. Of course, you could, as you said in your comment, consider

the two categories of monoids with extra structure in form of a name for the operation symbol

but I've never seen anyone discuss these categories.

Indeed, there are very good reasons why mathematicians distinguish between additive and multiplicative abelian groups, and have separate phrases for talking about them.

I'm not an expert on group theory, but I don't know any separate phrases for talking about them, except for very basic terms like ‘sum’ and ‘product’. Even ‘inverse’ is the same for both; I rarely see ‘opposite’ or ‘reciprocal’ instead. I know a little more separate notation; not only $-x$ and $x^{-1}$ for the inverse but even $\mathbb{Z}/n$ and $Z_n$ for a finite cyclic group. But the terminology is the same, as far as I've seen.

Posted by: Toby Bartels on October 4, 2009 3:41 PM | Permalink | Reply to this

Re: The intersection of categories

AN: the two categories of monoids with extra structure in form of a name for the operation symbol

TB: I’ve never seen anyone discuss these categories.

If I am really the first discussing them, I’d publish the construction! Can you point me to a journal that would accept a polished (and FMathL-free) version of my draft?

Certainly, with multiplicative monoids being triples $(M, f\colon M\times M\to M, \{*\})$ and additive monoids being triples $(M, f\colon M\times M\to M, \{+\})$, my binary categories are well-formed and distinct categories in the traditional sense. Together with your monoids of the form $(M, f\colon M\times M\to M, \{+\})$, we have three distinct categories.

Since your camp is so proud of the “level of precision and control which I could scarcely imagine otherwise”, you should appreciate that FMathL raises this level even higher.

It is not my fault if category theorists have so far missed this extra piece of structure by routinely applying the forgetful operator to the operation symbol - which clearly figures as relevant structure for differentiating concepts in algebra!

AN: there are very good reasons why mathematicians distinguish between additive and multiplicative abelian groups, and have separate phrases for talking about them.

TB: I don’t know any separate phrases for talking about them.

Serge Lang introduces the additive and the multiplicative notation and terminology in parallel, and talks, e.g., about “group under addition” and “group under multiplication”. These are clearly two different things. He also uses the terms “additive group” and “(additive) monoid”, considering the multiplicative case the default.

Of course, because of the canonical isomorphisms, one can (and he does) forget about the operation whenever the one used is apparent from the context.

Posted by: Arnold Neumaier on October 4, 2009 5:29 PM | Permalink | Reply to this

Re: The intersection of categories

I've never seen anyone discuss these categories.

If I am really the first discussing them, I’d publish the construction! Can you point me to a journal that would accept a polished (and FMathL-free) version of my draft?

No, and I'd be surprised if any algebra journal would accept it; I think that they would regard it as trivial. Possibly a journal that is interested in the philosophy or linguistics of mathematics? But I am out of my depth to give advice.

Certainly, with multiplicative monoids being triples $(M, f\colon M\times M\to M, \{*\})$ and additive monoids being triples $(M, f\colon M\times M\to M, \{+\})$, my binary categories are well-formed and distinct categories in the traditional sense. Together with your monoids of the form $(M, f\colon M\times M\to M, \{+\})$, we have three distinct categories.

Mine should be of the form $(M, f\colon M\times M\to M)$, yes?

Agreed, these are (in foundations like ZFC where it makes sense to consider this question at all, and syntactically in any case) three distinct categories.

Since your camp is so proud of the “level of precision and control which [Todd Trimble] could scarcely imagine otherwise”, you should appreciate that FMathL raises this level even higher.

But there is also a level of precision about levels of precision!

It is possible in ETCS or SEAR to write down each of those three categories (at least if we restrict to ‘small’ monoids in some sense), and they would not literally be the same, but they would all be equivalent. Indeed, there is an obvious forgetful functor to my category from each of your categories, and this is an equivalence (in fact an isomorphism, although it is evil to care about that). It is obvious because we can write down a condition on such an equivalence that it should fix $M$ and $f$, and then it is unique up to unique isomorphism. (Or if you care about the categories up to isomorphism, then you can write down a stricter version of the condition that it fix $M$ and $f$, and then it is the unique isomorphism up to equality.) And that is the situation in which we consider it safe to introduce an abuse of language (or to redefine the words) and call these categories all ‘the same’.

Serge Lang introduces the additive and the multiplicative notation and terminology in parallel, and talks, e.g., about “group under addition” and “group under multiplication”. These are clearly two different things. He also uses the terms “additive group” and “(additive) monoid”, considering the multiplicative case the default.

I have the 3rd edition, not the 1st, so maybe he changed; but I don't see this.

I quote the first paragraph of Section I.1 in full:

Let $S$ be a set. A mapping $S \times S \to S$ is sometimes called a law of composition (of $S$ into itself). If $x, y$ are elements of $S$, the image of the pair $(x,y)$ under this mapping is also called their product under the law of composition, and will be denoted by $x y$. (Sometimes, we also write $x \cdot y$, and in many cases it is also convenient to use an additive notation, and thus to write $x + y$. In that case, we call this element the sum of $x$ and $y$. It is customary to use the notation $x + y$ only when the relation $x + y = y + x$ holds.)

It seems clear to me that Lang has here introduced three different kinds of notation for a single concept: a law of composition.

Here is the next paragraph:

Let $S$ be a set with a law of composition. If $x, y, z$ are elements of $S$, then we may form their product in two ways: $(x y) z$ and $x (y z)$. If $(x y) z = x (y z)$ for all $x, y, z$ in $S$ then we say that the law of composition is associative.

That is it; there is no parallel treatment of associativity for laws of composition written with $\cdot$ or $+$.

The next paragraph makes clear the pattern that is followed throughout the book:

An element $e$ of $S$ such that $e x = x = x e$ for all $x \in S$ is called a unit element. (When the law of composition is written additively, the unit element is denoted by $0$, and is called a zero element.) A unit element is unique, for if $e'$ is another unit element, we have $e = e e' = e'$ by assumption. In most cases, the unit element is written simply $1$ (instead of $e$). For most of this chapter, however, we shall write $e$ so as to avoid confusion in proving the most basic properties.

The only parallel development is introducing alternative notation and terminology when that exists. But the theorem that a unit element is unique is not treated (not even stated, much less proved) in parallel. If Lang thought that a multiplicatively written monoid and an additively written monoid were objects of two separate categories, then to be complete he should at least note the existence of a gap here, which could be filled either by a parallel proof (an easy exercise for the reader) or by a more general translation theorem. But I do not even think that it would occur to Lang to do this.

To Lang, a composition law is a mapping, not a mapping together with a symbol. The symbol is just how we talk about it. One can change symbols as easily as he changes between $e$ and $1$, without changing the meaning. I am honestly surprised that a mathematician should think otherwise.

Otherwise is certainly a possible way to think, and it is admirably ambitious of you to develop a formal system in which every notation in practical use may be taken perfectly literally (which seems to be what you are trying to do). But it really seems cumbersome to me. People change notation often, and this must be dealt with explicitly in FMathL, even though it is (to my mind) not part of the mathematics at all.

Posted by: Toby Bartels on October 4, 2009 10:25 PM | Permalink | Reply to this

Re: The intersection of categories

Just a couple technical points here, a more substantive post coming.

(1) On p52 of the FMathL specification, the definition $X\times Y := (X \cup Y)^{\times 2}$ does not seem right. According to this definition, wouldn’t the pair $(x_1,x_2)$ be in $X\times Y$ when $x_1$ and $x_2$ are both in $X$? Probably you mean that $X\times Y$ is a certain subset of $(X \cup Y)^{\times 2}$.

(2) You defined $Monoid$ to be “the subcategory of $Bin(\{*\})$ consisting of all binary structures that have an operation $*$ which is associative and has a neutral element.” But you should also specify the morphisms, since $Monoid$ is actually not a full subcategory of $Bin(\{*\})$. Unlike the situation for groups, a function between monoids which preserves the binary operation need not also preserve identities, essentially since monoids can contain nontrivial idempotents; for instance, the map from the trivial monoid to $(\{0,1\},\cdot)$ sending $1$ to $0$ preserves products but not the identity.

Posted by: Mike Shulman on October 5, 2009 3:38 AM | Permalink | PGP Sig | Reply to this

Re: The intersection of categories

Very clever!

However, it seems to me that you have really just justified the structural point of view on intersection, namely that it only makes sense to take the intersection of two things (sets, classes, categories, etc.) once you have exhibited them as sub-things of some larger thing. It seems to me that in order to obtain a meaningful intersection of the categories $Monoid$ and $Order$, you have had to (implicitly) define a larger category $BinStr$ of binary structures and arbitrary set-mappings between them, exhibit subcategories of $BinStr$ which are equivalent to the usual definitions of $Monoid$ and $Order$, and then take the intersection of these two subcategories.

Of course, $BinStr$ is equivalent to $Set$ (since the operations and relations play no role in the morphisms), and the inclusions of $Monoid$ and $Order$ are equivalent to the usual forgetful functors. Thus, the intersection as you have defined it is just a different way of talking about the pullback $$\array{ & \to & Monoid \\ \downarrow & & \downarrow \\ Order & \to & Set. }$$ You’ve managed to make the presence of $Set$ in the pullback implicit by using a trick. (A variation on this trick should work for pullbacks of any pair of faithful functors, since any faithful functor can be turned into the inclusion of a subcategory by modifying the codomain.)
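
Concretely, the strict pullback can be described as follows (a sketch, writing $U$ for both forgetful functors to $Set$): $$\mathrm{Ob}(Monoid \times_{Set} Order) \;=\; \{\, (M, O) \;\mid\; U(M) = U(O) \,\},$$ i.e. a single set carrying both a monoid structure and an order, with morphisms the functions of that set which are simultaneously monoid homomorphisms and order-preserving maps. (The non-strict version replaces the equation $U(M) = U(O)$ by a specified bijection.)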

I see various problems with this trick, however. It is evidently arguable whether the distinction between $*$-monoids and $+$-monoids is a drawback or not (I don’t like it; I don’t regard the distinction as a mathematical one but rather a psychological one). I find more worrying the fact that with this definition of $Monoid$, its objects are not “sets equipped with an associative and unital binary operation” (whatever that operation is called), but rather “sets equipped with a bunch of binary operations and relations, a specified one of which is associative and unital.” It seems to me that few mathematicians would regard this as an appealing definition of a monoid.

Moreover, things get worse when we try to do more complicated things. Suppose we want to talk about topological monoids; clearly by the same logic these should be contained in the intersection of $Monoid$ and $TopSpace$. But the structure of a topological space can’t be described by any binary (or even $n$-ary) operations and relations, so we need an expanded notion of structure to include both $Monoid$ and $TopSpace$, say a set equipped with some collection of operations, relations, and also a subset of its powerset. That means that in order to intersect $Monoid$ and $TopSpace$, we need to redefine $Monoid$ to consist of such expanded structures containing a specified binary operation which is associative and unital. Now maybe we want to talk about monoidal suplattices, in which case we need to add a function $P X \to X$ to the various types of structure considered, or monoidal convergence spaces, etc. etc. Every time we want to combine the structure of a monoid with a new type of structure, it seems that we have to redefine the category of monoids. This hardly seems readable, simple, or elegant to me. Even worse, the objects in the intersection of $TopSpace$ and the expanded version of $Monoid$ are disjoint from the previous version of $Monoid$.

It seems to me much clearer, simpler, and more elegant to use the one structural notion of “pullback” to describe all of these situations.

Posted by: Mike Shulman on October 5, 2009 4:09 AM | Permalink | PGP Sig | Reply to this

Re: The intersection of categories

we need an expanded notion of structure to include both $Monoid$ and $TopSpace$, say a set equipped with some collection of operations, relations, and also a subset of its powerset

To further advertise Bourbaki's theory of structures, they have a fairly comprehensive list of what these might be. (Of course, much of the theory would be simpler using category theory, but the enumeration of the possibilities is still useful, as Bourbaki also discusses what the morphisms should be between such structured sets.)

Posted by: Toby Bartels on October 5, 2009 12:25 PM | Permalink | Reply to this

Re: The intersection of categories

MS: a larger category BinStr of binary structures and arbitrary set-mappings between them

Actually, $BinStr = Bin(\{\})$.

MS: with this definition of Monoid, its objects are not “sets equipped with an associative and unital binary operation” (whatever that operation is called), but rather “sets equipped with a bunch of binary operations and relations, a specified one of which is associative and unital.”

This is precisely what an ordinary mathematician has in mind when saying that the natural numbers $\mathbb{N}_0$ (with zero) form an additive monoid under addition. $\mathbb{N}_0$ is indeed generally regarded as a set equipped with a bunch of operations and relations, where $+$ is associative and unital.

MS: Every time we want to combine the structure of a monoid with a new type of structure, it seems that we have to redefine the category of monoids.

This is only since you take my current definition of a monoid to be the ultimate definition. But in my draft I explicitly said that I restrict to binary structures since this suffices to solve the challenge posed, and I wanted to present a solution quickly.

The solution of the challenge I really propose (and that will become part of the official next version of FMathL) is a more complex version of that, which addresses all your extensions mentioned in this mail in a uniform and completely general way.

I already know the essential pieces of this generalization. But it takes time to write this up and check for accuracy and completeness, and I cannot do this easily now during the term. So it may take a while to answer this in a fully satisfying way.

Posted by: Arnold Neumaier on October 5, 2009 1:15 PM | Permalink | Reply to this

Re: The intersection of categories

MS: with this definition of Monoid, its objects are not “sets equipped with an associative and unital binary operation” (whatever that operation is called), but rather “sets equipped with a bunch of binary operations and relations, a specified one of which is associative and unital.”

This is precisely what an ordinary mathematician has in mind when saying that the natural numbers $\mathbb{N}_0$ (with zero) form an additive monoid under addition. $\mathbb{N}_0$ is indeed generally regarded as a set equipped with a bunch of operations and relations, where $+$ is associative and unital.

I have a pretty hard time believing that. Can you point to anywhere in the literature where $\mathbb{N}$ as additive monoid is defined in that way, with gratuitous mention of other operations and relations besides $+$? Or where it is even remarked that this is what ordinary mathematicians have in mind?

Posted by: Todd Trimble on October 5, 2009 3:03 PM | Permalink | Reply to this

Dressed sets

AN: the natural numbers $\mathbb{N}_0$ (with zero) form an additive monoid under addition. $\mathbb{N}_0$ is indeed generally regarded as a set equipped with a bunch of operations and relations, where $+$ is associative and unital.

TT: Can you point to anywhere in the literature where ℕ as additive monoid is defined in that way.

I think you misunderstood. Nobody in algebra defines $\mathbb{N}_0$ as a monoid.

Instead, $\mathbb{N}_0$ is always the natural numbers (including 0) with their full algebraic structure $+$, $*$, $\lt$, $\le$, the standard names for the small numbers, the notion of divisibility, primes and whatever people have defined or will define over the course of history.

Just as Fermat’s last theorem was already true before Wiles proved it, so the natural numbers have this natural structure no matter whether anyone has defined it explicitly before.

But it happens that this structure $\mathbb{N}_0$ is an additive monoid, and people say so without stripping the extra structure.

For example, Serge Lang begins Chapter I.3 of his Algebra 1970 with: The integers $\mathbb{Z}$ form an additive group. Nowhere does he ask to strip it of $*$; nor does he give a warning earlier that such stripping is intended when using such a phrase.

I saw lots of mathematics over my lifetime, and saw many do similar things, without anyone ever giving such a warning, not even in elementary introductions to algebra or numbers or whatever, where such care should have been felt didactically useful at least by some authors.

Thus I conclude that this stripping is not part of the mathematical culture per se (though it may be part of what some mathematicians privately do when interpreting mathematical language).

This even holds for definitions. Serge Lang says in Chapter I.1: “A monoid is a set G, with a law of composition ….”.

From this any reader (exposed only to naive set theory as summarized in the prerequisite) may conclude that a monoid is a dressed set (i.e., a nonextensive version of a set, the informal version of the FMathL-objects). He really said a monoid is a set! - without any warning that this should not be taken seriously.

The reader may also conclude that the set G is dressed with a law of composition (written as juxtaposition, as is clear from the context). Nothing is said about not wearing any additional structure, or having to get stripped if the set G was wearing already something else.

After all, if we begin a story with: “Once upon a time, there was a man with a beautiful crown”, we do not think the man consisted of nothing but his kinghood and the crown, but assume that he was equipped with everything else a king or prince usually has (and some of it is usually mentioned later when things get more interesting). It just means that at the moment, the focus is on the man with the crown, and not on his castle, his father King Charles the Fifth, his nose, his friendship with the court jester, and whatever other structure the man is equipped with.

This is woven into the intrinsic structure of ordinary language, and this is inherited by mathematical concepts when they are taught the normal, informal way.

Only when one is taught that “real” mathematics must be written in the straitjacket of ZF does one halfheartedly learn to put all the really needed stuff into tuples and sell them as sets.

Only when one is taught that “real” mathematics should morally be written in the straitjacket of category theory does one halfheartedly learn to put all the really needed stuff into categories and invent corresponding functors.

Only when forced into such a straitjacket does one face the problem that the same dressed object is multiple things at the same time. But every mathematician knows that the ZF straitjacket is a chore to wear, and therefore only pays lip service to this onerous moral code. The moral code of category theory is just another straitjacket.

The requirements of either straitjacket have no more to do with real (i.e. actually practiced) mathematics than von Neumann’s quantum measurements have to do with the real measurement of, say, neutrino masses.

ZF, category theory, or formal proof theory are idealized caricatures of real mathematics, and not the real thing!

Posted by: Arnold Neumaier on October 5, 2009 6:56 PM | Permalink | Reply to this

Re: Dressed sets

But it happens that this structure \mathbb{N}_0 is an additive monoid, and people say so without stripping the extra structure.

Interesting!

So when someone writes ‘\mathbb{N}_0 is a monoid under addition.’ or ‘\mathbb{N}_0 is an additive monoid.’, I interpret that as ‹There is a monoid which consists of the set \mathbb{N}_0 (that is, the underlying set of the maximal structure \mathbb{N}_0 in all its glory) and the operation of addition on that set.›, whereas you interpret it as ‹There is a monoid which consists of \mathbb{N}_0 (the maximal structure in all its glory) and the (universal, across all structures) operation of addition.›. I must admit, your interpretation is more literal!

I saw lots of mathematics over my lifetime, and saw many do similar things, without anyone ever giving such a warning, not even in elementary introductions to algebra or numbers or whatever, where such care should have been felt didactically useful at least by some authors.

I dug out my undergraduate algebra text (Abstract Algebra, Dummit & Foote, 1991, Prentice-Hall, New Jersey) to see what it says:

A group is an ordered pair (G, \star) where G is a set and \star is a binary operation on G satisfying the following axioms: […]. We shall immediately become less formal and say G is a group under \star if (G, \star) is a group (or just G is a group when the operation \star is clear from the context).

So that is probably where I was first led astray (^_^).

Serge Lang says in Chapter I.1: “A monoid is a set G, with a law of composition ….”. From this any reader (exposed only to naive set theory as summarized in the prerequisite) may conclude that a monoid is a dressed set (i.e., a nonextensive version of a set, the informal version of the FMathL-objects). He really said a monoid is a set! - without any warning that this should not be taken seriously.

I don't know about this. If Lang had a concept of dressed set, surely he would have mentioned it in the prerequisites? It's true that his prerequisites did not state the axiom of extensionality, but surely he would be aware of the serious danger that, if a monoid really is a set, people familiar with set theory might fall into the trap of thinking that two monoids are the same if they have the same elements.

I read ‘A monoid is a set G, with a law of composition […].’ to mean that a monoid consists of a set G and an appropriate law of composition; that is, a monoid is a pair. I can see how you would read it to mean that a monoid is a set G such that there exists an appropriate law of composition. Although your interpretation is more literal (at least as regards ‘is’), I would be surprised to find a textbook on abstract algebra that actually says something like that, using the phrase ‘such that there exists’ (or any of its many synonyms). That would make the danger of misapplying the axiom of extensionality far too great!

Actually, I don't even think that this is correct in FMathL. I thought that a monoid is an object G such that there is an underlying set of G (that is, Set(G) exists) and there exists an appropriate law of composition. Even in FMathL I didn't think that a monoid really is a set. (In contrast, the man with a beautiful crown really is a man.)

ZF, category theory, or formal proof theory are idealized caricatures of real mathematics, and not the real thing!

I actually agree with this. I still think that mathematics is really structural, however, and I see FMathL as yet another caricature.

Posted by: Toby Bartels on October 6, 2009 12:18 AM | Permalink | Reply to this

Re: Dressed sets

AN: But it happens that this structure \mathbb{N}_0 is an additive monoid, and people say so without stripping the extra structure.

TB: whereas you interpret it as ‹There is a monoid which consists of \mathbb{N}_0 (the maximal structure in all its glory) and the (universal, across all structures) operation of addition.›.

+ is not universal across all structures but is the operation + that is part of the structure of \mathbb{N}_0.

But actually the statement is usually interpreted simply as shorthand for “the addition already defined in \mathbb{N}_0 is associative and has a zero”, and “this allows me to apply all results proved for abstract monoids by adapting them to the situation at hand”.

TB: A group is an ordered pair (G,⋆) […] We shall immediately become less formal and say G is a group …

One finds this definition, paying lip service to formal ZF, and immediately renounces that it is meant seriously. Indeed, it is never meant seriously.

TB: I don’t know about this. If Lang had a concept of dressed set, surely he would have mentioned it in the prerequisites?

I invented the imagery of dressed sets to put in words what most mathematicians extract as intuitive meaning when they read Lang’s statement.

I suppose that Lang assumes naive set theory with quantification over sets only.

Naive set theory is not extensional. Extensionality is an artifact of ZF.

A naive set is a mental version of a bag containing (perhaps) things. There are many empty bags, and different bags may contain the same apple at different times. They are not only distinguished by their content but also by their form, color, etc. In other words, naive sets are dressed sets in the sense I was using the term. The student is usually told that this does not matter since one concentrates on the content, not on the dressing.

But sets in use, such as the natural numbers, are commonly dressed. One wouldn’t recognize them as what they are without the dressing.

Almost all sets in mathematics are dressed in this sense. Hence it is wise to regard all sets as dressed by default. Thus it is unnecessary to mention this anywhere, except when drawing attention to the fact, just as in ordinary language we regard men and women as dressed, mentioning the presence or absence of dressing only when it matters.

TB: surely he would be aware of the serious danger that, if a monoid really is a set, people familiar with set theory might fall into the trap of thinking that two monoids are the same if they have the same elements.

Yes, and it is common practice to be a bit more precise in precisely these situations, so that the reader draws the right conclusion. Thus one then adds qualifying terms and descriptions.

You can see an example of this at the beginning of Lang, Chapter V.2, where he has to face the situation that the additive monoid of natural numbers should be regarded as a multiplicative monoid. One introduces ad hoc constructions to ensure that things are interpreted the correct way, and afterwards one goes back to ordinary usage, with things cleared up.

TB: I read ‘A monoid is a set G, with a law of composition […].’ to mean that a monoid consists of a set G and an appropriate law of composition; that is, a monoid is a pair. I can see how you would read it to mean that a monoid is a set G such that there exists an appropriate law of composition.

I do not read the latter. I read that there is a particular composition law, with the composition written as juxtaposition, but that (as always in descriptions) there may be other composition laws or properties that do not matter, since they are not used while focussing only on the monoid aspect.

All monoid theory remains fully valid if the monoids are dressed with more stuff!

TB: in FMathL. I thought that a monoid is an object G such that there is an underlying set of G (that is, Set(G) exists) and there exists an appropriate law of composition. Even in FMathL I didn’t think that a monoid really is a set.

FMathL is supposed to be more precise since it aims at being a foundation, while Lang aims at being an introduction to algebra.

Thus I replace the informal intuition of a dressed set by the formal notion of an object. On the other hand, for the cases where extensionality of sets is really essential, I keep this notion (but introduce it only at the very end) by having a separate property of being a naked set (which FMathL calls a set). But naked sets are rarely needed outside set theory.

In the next version of FMathL, I’ll probably remove the concept of a set from the foundational part to make this clear, and put it into a specification module as an example of how to recover the concept.

TB: (In contrast, the man with a beautiful crown really is a man.)

And the object with a monoid structure is really an object.

AN: ZF, category theory, or formal proof theory are idealized caricatures of real mathematics, and not the real thing!

TB: I actually agree with this. I still think that mathematics is really structural, however, and I see FMathL as yet another caricature.

FMathL is of course also a formalization, but it will understand informal mathematics in the informally intended way, by adjusting its language slightly and in a way that is easy to automate.

It will recognize when an informal use of set refers to an extensional, naked set and when it refers to an intensional, dressed set, and will choose internally the right translation (object, resp. set) for it. The disinterested user will never notice anything of this change.

But programmers are relieved of having to redress their objects hundreds of times within a few pages of dense informal reasoning.

And the automatic theorem prover behind the scenes will appreciate the associated reduction in complexity!

Posted by: Arnold Neumaier on October 6, 2009 11:11 AM | Permalink | Reply to this

Re: Dressed sets

One finds this definition, paying lipservice to formal ZF, and immediately renounces that it is meant seriously. Indeed, it is never meant seriously.

Wait.. this from the man who was in such high dudgeon over the definition of an opposite category?

If it’s such a huge failing on our parts to speak of “morally” not reusing objects, how can you so blithely accept not being “serious” about this definition?

Posted by: John Armstrong on October 6, 2009 1:26 PM | Permalink | Reply to this

Re: Dressed sets

I hope that you find your straitjacket useful. It's not the straitjacket for me.

Posted by: Toby Bartels on October 6, 2009 8:39 PM | Permalink | Reply to this

Re: Dressed sets

It depends. If you are truly interested in computerized mathematics, “straitjackets” are called tradeoffs…
But aiming at a mathematical oracle by focusing mainly on consistency and proof checking is probably not the most productive approach.
Before reaching this “ultimate” goal some more modest endeavour of just improving the semantic representation of mathematical texts could bring a much larger payoff.
How would you like some Googling for structural similarities in a large public pool of semi-formalized texts?

Posted by: J-L Delatre on October 7, 2009 6:03 PM | Permalink | Reply to this

Re: Dressed sets

JLD: improving the semantic representation of mathematical texts could bring a much larger payoff. How would you like some Googling for structural similarities in a large public pool of semi-formalized texts?

This is one of the long-term goals of FMathL, but it looks quite difficult even when a large public pool of semi-formalized texts (one of the direct goals of FMathL) is already available.

For one needs a notation-independent concept of structural similarity that I find extremely hard to define in a meaningful and comprehensive way.

What sort of search queries are you thinking of?

Posted by: Arnold Neumaier on October 9, 2009 6:06 PM | Permalink | Reply to this

Re: The intersection of categories

Yes, I certainly believe that one can give a general definition of “mathematical structure” which includes at least most of the types of structures mathematicians study. (I actually had to do something like this for UQ&SA, so I should probably compare it to Bourbaki.) But I don’t really think this solves the problem. Like Todd, I don’t believe that when “an ordinary mathematician” says “the additive monoid of natural numbers” s/he means “the natural numbers, together with their addition, multiplication, strict order, non-strict order, divisibility relation, discrete topology, discrete convergence structure, etc. etc., which happens to live in Monoid because the addition is called +.”

Moreover, even once you’ve defined what “structure” can mean, there’s no way you can make a list of all the conceivable types of structure. That means that future mathematicians will continue to invent new types of structure, and although the categories of such structures will probably still be equivalent to subcategories of Structure, one will still have to change the objects involved in order that they live in the right categories. In other words, even if I did define the natural numbers to be equipped with all the structure I listed above, then when someone comes along and defines a new kind of structure that applies to \mathbb{N}, I would have to redefine what I meant by “the natural numbers” in order to include this new type of structure. In particular, it seems to me that the “shifts” that are encoded structurally by implicit forgetful functors do also have to be encoded in FMathL by a perpetual redefinition of objects.

Posted by: Mike Shulman on October 5, 2009 4:30 PM | Permalink | PGP Sig | Reply to this

Re: The intersection of categories

Yes, exactly.

I tried to make a similar point in my penultimate paragraph way back here, in connection with this business of “maximal structure”, about the multiplicity of potential structures that might be borne by \mathbb{N} in future mathematics.

Just to give one example: Harvey Friedman has in the past few years uncovered many connections between large cardinal hypotheses and extremely fast-growing functions \mathbb{N} \to \mathbb{N}. I sense this sort of structure on \mathbb{N} could go far, far beyond any sort of “maximal structure” being envisaged here.

Which leads me to muse: is the notion of “maximal structure” envisaged here in fact a straitjacket? :-)

Posted by: Todd Trimble on October 5, 2009 6:44 PM | Permalink | Reply to this

Re: The intersection of categories

Continuing the general theme of analogies to programming, the natural comparison to make is with multiple inheritance. In that arena, thinking of a class C which inherits from A and B as the “intersection” C = A \cap B leads to all sorts of problems, particularly the “diamond problem”. In the situation we’re thinking about, the problem is that if Monoid and Order both inherit from Set, and OrderedMonoid inherits from both Monoid and Order, how do we specify whether the two Sets which OrderedMonoid inherits through Monoid and Order, respectively, are the same?

Programmers have come up with all sorts of fancy things to resolve this issue with names like “virtual base classes” and “interfaces” and “renaming” and “mixins,” but (if you will forgive a vast oversimplification) I think that from a structuralist point of view, the problem is merely that the language of “intersection” is inappropriate and the inheritance operation is underspecified. Saying instead that OrderedMonoid is (a subcategory of) the pullback of Monoid and Order over Set specifies precisely the way in which the order and monoid structures are to be combined, and is flexible enough to deal with any sort of combination of structure without requiring the base objects to be redefined.
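To make this concrete, here is a minimal Python sketch of the diamond just described (the class names and constructor signatures are purely illustrative, not anything from FMathL or an actual library). Python resolves the diamond by linearizing the class hierarchy, so the “two” underlying Sets inherited through Monoid and Order are silently identified, which is exactly the kind of unspecified choice at issue:

    # Monoid and Order both inherit from Set; OrderedMonoid inherits from both.
    class Set:
        def __init__(self, elements):
            self.elements = elements

    class Monoid(Set):
        def __init__(self, elements, op, unit):
            Set.__init__(self, elements)
            self.op, self.unit = op, unit

    class Order(Set):
        def __init__(self, elements, leq):
            Set.__init__(self, elements)
            self.leq = leq

    class OrderedMonoid(Monoid, Order):
        def __init__(self, elements, op, unit, leq):
            Monoid.__init__(self, elements, op, unit)
            Order.__init__(self, elements, leq)

    # The method resolution order shows the diamond collapsed to a single Set:
    print(OrderedMonoid.__mro__)

    # For instance, an initial segment of the naturals as an ordered additive monoid:
    N = OrderedMonoid(range(100), op=lambda a, b: a + b, unit=0,
                      leq=lambda a, b: a <= b)

On the structural reading, taking (a subcategory of) the pullback of Monoid and Order over Set makes this identification explicit instead of leaving it to the language’s linearization rule.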

Posted by: Mike Shulman on October 5, 2009 6:13 AM | Permalink | PGP Sig | Reply to this

Re: What is a structured object?

As Mike was saying, in order to construe objects n \in \mathbb{N} as giving actual finite sets, we construct a “family” \phi: F \to \mathbb{N} where each fiber F_n is a set of cardinality n. For example, consider the function

(1) \phi: \mathbb{N} \times \mathbb{N} \to \mathbb{N}: (m,n) \mapsto m + n + 1

Then, for each n \geq 0, the fiber \phi^{-1}(n) is a set of cardinality n.

If you have the Axiom of Collection, then it will do this for you automatically; you don't need to come up with an ad hoc representation of the family. (Of course, it's worth noticing that, for finite sets, you don't need Collection if you have Infinity; this is true both materially and structurally.)
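As a quick sanity check on the family above (a purely illustrative Python snippet, not part of SEAR or FMathL), one can enumerate the fibers of \phi over a small range and confirm that the fiber over n has exactly n elements:

    # Check that the fiber of phi(m, n) = m + n + 1 over n has exactly n elements.
    def phi(m, n):
        return m + n + 1

    BOUND = 8  # small bound, enough to exhibit the fibers listed below
    for n in range(BOUND):
        fiber = [(m, k) for m in range(BOUND) for k in range(BOUND) if phi(m, k) == n]
        assert len(fiber) == n
        print(n, fiber)   # e.g. 3 [(0, 2), (1, 1), (2, 0)]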

Posted by: Toby Bartels on September 25, 2009 9:23 PM | Permalink | Reply to this

Re: What is a structured object?

I was going to put this family of finite sets at the nlab page natural numbers in SEAR, under a section ‘With the axiom of infinity’, but for some reason I keep getting an ‘access denied’ message when I try to edit.

Posted by: David Roberts on September 29, 2009 3:59 AM | Permalink | Reply to this

Re: What is a structured object?

I was going to put this family of finite sets at the nlab page natural numbers in SEAR, under a section ‘With the axiom of infinity’, but for some reason I keep getting an ‘access denied’ message when I try to edit.

I just saw this again and added it.

Posted by: Toby Bartels on October 11, 2009 12:40 AM | Permalink | Reply to this

Re: What is a structured object?

MS: (A small equivalent of) the category of finite groups, for instance, would be a category as above equipped with a C_0-indexed family of finite groups G […]

But SEAR does not have the concept of an indexed family of finite groups, since a finite group is not a single object, and only indexed families of sets are defined in the present version of SEAR.

One really needs the notion of combining several objects into one if one does not want to get exceedingly clumsy in one’s definitions. After all, there are far more complicated categories than that of finite groups, with heavily nested structural components.

Todd Trimble has shown here a way to simulate this in ETCS, though in a way much more artificial than in ZF. I think you’ll need something similar in SEAR.

Posted by: Arnold Neumaier on September 25, 2009 10:48 AM | Permalink | Reply to this

Re: What is a structured object?

Todd Trimble has shown here a way to simulate this in ETCS… I think you’ll need something similar in SEAR.

The way to do these things in SEAR and ETCS is basically exactly the same. Todd was just expanding on what I said here. He even said explicitly “all the formal details can be filled in within the framework of a structural set theory such as ETCS or SEAR.”

Posted by: Mike Shulman on September 25, 2009 3:24 PM | Permalink | PGP Sig | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

Here is a totally different question about FMathL. I don’t understand the purpose of “nominal” objects. I understand that philosophers have spent lots of time discussing the nature of existence, but I’ve always thought that ordinary mathematical logic gives a perfectly satisfactory treatment of the study of not-yet-known-to-exist objects.

Suppose I want to prove there is no greatest prime number; since the (or at least “a”) way to prove that something is false is to assume it true and derive a contradiction, I begin by assuming that there is a largest prime number. By the elimination rule for the existential quantifier, starting from this hypothesis I am then entitled to introduce an object p which is the largest prime number. I then go on and work with p, eventually reaching a contradiction, at which point the temporary hypothesis is discarded and its conclusion wrapped up in the statement “if there exists a largest prime number, then 0=1” or equivalently “there does not exist a largest prime number.” Nowhere did we need there to actually be a largest prime number even in a sense of “meta-existence.”

Posted by: Mike Shulman on September 30, 2009 3:50 AM | Permalink | PGP Sig | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

Nowhere did we need there to actually be a largest prime number even in a sense of “meta-existence.”

Really?
So what is the “ontological difference” between your hypothetical p and any other constant or variable used in the proof?
I do not believe in any objects; objects or concepts are only epistemological artifacts of our discourse about the world.
Thus, metaphysical haggling about the “proper foundations” of mathematics (or any other philosophical domain, for that matter) seems to me pretty vain.
Also, the whole approach of this thread for “a Computer-Aided System for Real Mathematics” appears fatally flawed by the requirement that it be based on a consistent formalism: everyday mathematics, just like any other human pursuit, isn’t consistent; it only approaches consistency through nitty-gritty toiling (haven’t you noticed :-) ). Therefore enforcing the straitjacket of consistency from the very start will NOT permit the modelling of actual practice.

Posted by: J-L Delatre on September 30, 2009 7:14 AM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

So what is the “ontological difference” between your hypothetical p and any other constant or variable used in the proof?

My hypothetical p does not actually denote anything. It is merely a symbol which is manipulated in the proof for a while, as if it denoted something. On the other hand, the symbol 13 actually denotes something (at least if I have a particular model in mind).

Posted by: Mike Shulman on September 30, 2009 5:57 PM | Permalink | PGP Sig | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

MS: My hypothetical p does not actually denote anything. It is merely a symbol which is manipulated in the proof for a while, as if it denoted something. On the other hand, the symbol 13 actually denotes something (at least if I have a particular model in mind).

You just more or less characterized the difference between nominal essence and real essence that was discussed for many centuries, and that I briefly mentioned in the FMathL framework paper.

Posted by: Arnold Neumaier on September 30, 2009 6:56 PM | Permalink | Reply to this

Nominal objects

MS: Here is a totally different question about FMathL. I don’t understand the purpose of “nominal” objects. I understand that philosophers have spent lots of time discussing the nature of existence, but I’ve always thought that ordinary mathematical logic gives a perfectly satisfactory treatment of the study of not-yet-known-to-exist objects.

But your query is about not-yet-known-to-not-exist objects.

Of course we know how to handle it, but to formalize how we handle it needs a meta-way of talking about the object whose nonexistence is to be shown. Precisely this is nominal existence.

Nominal objects are important only since one often does not know beforehand whether something is nominal. But you need to talk about the object even to conclude that it is nominal.

Moreover, one cannot have an untyped foundation without nominal objects. Attempts by Church (lambda calculus) and Curry (combinatory logic) to get this have failed (though this was discovered in both cases after publication of the first versions of these theories). All known illative systems (including more recent systems by Feferman) that treat propositions and other objects on the same footing need a formal notion of existence to avoid paradoxes like Hilbert’s paradox (treated in the FMathL paper).

MS: I begin by assuming that there is a largest prime number.

By this assumption, you have created a set of inconsistent assumptions. It takes (sometimes a lot of) time to find out that these assumptions are in fact inconsistent. Thus one unknowingly works all the time in an inconsistent system, until one finds that out. The notion was from the start nominal only, but you didn’t know this and proceeded as if it weren’t.

MS: Nowhere did we need there to actually be a largest prime number even in a sense of meta-existence.

If it does not meta-exist, how can you reason about it? Something not meta-existing cannot even be referenced.

You have the object “p” during the whole argumentation, which means it meta-exists and can be reasoned about.

Posted by: Arnold Neumaier on September 30, 2009 11:39 AM | Permalink | Reply to this

Re: Nominal objects

But your query is about not-yet-known-to-not-exist objects.

Sorry, to be precise I should have said “not-yet-known-whether-or-not-they-exist objects.”

Of course we know how to handle it, but to formalize how we handle it needs a meta-way of talking about the object whose nonexistence is to be shown. Precisely this is nominal existence.

First-order logic and natural deduction are completely formal without any need for nominal objects. There does not need to actually be any object whose nonexistence is to be shown; the hypothetical part of the proof merely proceeds as if there were such an object.

one cannot have an untyped foundation without nominal objects.

I must not understand what you mean by an “untyped foundation.” Why is ZFC not one?

And, of course, even if this is true, it seems to me that a perfectly reasonable response is “well, then, that’s a good reason to use a typed foundation.”

The notion was from the start nominal only, but you didn’t know and proceeded on the basis that it weren’t.

No, I don’t think so. The hypothesis that there exists a largest prime number was false from the start, which means that the symbol p we were calling “the largest prime number” did not actually denote anything. There is no need for there to ever have actually been anything for it to refer to, nominal or not.

Now I’m fully aware that there are philosophers who will argue for days about whether or not there really was a largest prime number that I was referring to nominally, without the slightest hope of convincing each other, or me, or me convincing them of anything. But I don’t see why mathematicians need to bother about that, since we have a completely formal and precise way of describing such proofs using first-order logic, without the need for “nominal objects.”

Posted by: Mike Shulman on September 30, 2009 6:10 PM | Permalink | PGP Sig | Reply to this

Re: Nominal objects

AN: one cannot have an untyped foundation without nominal objects.

MS: I must not understand what you mean by an “untyped foundation.” Why is ZFC not one?

Because there are two types: propositions and sets. You cannot form a set consisting of propositions, since the latter are not sets.

No first order logic is truly type-free since one must distinguish between terms and propositions already in defining what the logic is.

MS: means that the symbol p we were calling “the largest prime number” did not actually denote anything.

This is precisely what Aristoteles and Ockham used the term nominal for.

MS: we have a completely formal and precise way of describing such proofs using first-order logic, without the need for “nominal objects.”

The nominal objects in FMathL are also completely formal and precise.

Posted by: Arnold Neumaier on September 30, 2009 7:07 PM | Permalink | Reply to this

Re: Nominal objects

No first order logic is truly type-free since one must distinguish between terms and propositions already in defining what the logic is.

If that is the meaning of “type,” then I have yet to be convinced that anything at all meaningful can be said that is “type-free.”

means that the symbol p we were calling “the largest prime number” did not actually denote anything.

This is precisely what Aristoteles and Ockham used the term nominal for.

I don’t know anything about Aristoteles and Ockham, but it looks like in FMathL there actually are objects with the property of being “nominal” which are denoted by such a symbol, and this is what I was referring to.

The nominal objects in FMathL are also completely formal and precise.

I didn’t claim they weren’t; instead I asked what purpose they serve. I mentioned that first-order logic is formal and precise because you seemed to be saying that nominal objects are the only way to “formalize” not-yet-known-whether-it-exists objects.

Posted by: Mike Shulman on September 30, 2009 8:02 PM | Permalink | PGP Sig | Reply to this

Re: Nominal objects

I don’t know anything about Aristoteles and Ockham, but it looks like in FMathL there actually are objects with the property of being “nominal” which are denoted by such a symbol, and this is what I was referring to.

I think that I'm with Arnold here. There's an established tradition in logic of supplementing first-order logic with an existence predicate or a ‘denotes’ symbol, and this serves to translate such statements as

\lim_{n \to \infty} a_n exists

literally (without abuse of language). Since one of Arnold's goals is to allow normal ‘abuses’ of language to be interpreted literally (even though I still doubt that this can be done universally), I think that this is perfectly reasonable of him.

Incidentally, in a typed theory, this makes partial functions more natural than total ones: a partial function f: A \to B is total (or a function) if f(x) exists whenever x does.

Posted by: Toby Bartels on September 30, 2009 8:40 PM | Permalink | Reply to this

Re: Nominal objects

I have seen a bit of this sort of thing, but I thought it was an extension of syntax, to allow terms that do not denote anything and a predicate “exists” which specifies whether a term denotes—but all the objects that are actually present in the semantics “really” exist. For example, in Troelstra & van Dalen’s “Constructivism in Mathematics” I read (§2.2):

We shall now indicate how to adapt our logic so as to take into account the possibility of undefined terms, i.e. terms which do not always “denote.”

and (§5.14) they define Kripke models for logic-with-existence-predicate by interpreting function symbols as partial functions (as you suggested), so that nonexistent values really aren’t present.

Are you saying that it is also a well-established tradition to consider semantics that contain “nominal” objects? Or is talking about “nominal objects” just a manner of speaking about terms which do not denote, so that nominal objects actually are not present in the semantics?

I guess that one can convert from one kind of semantics to the other, in the one direction by making partial functions A \dashrightarrow B into total ones A \to B_\bot using the “partial map classifier,” and in the other direction by removing the nominal objects and restricting the domain of a function f to those x such that f(x) exists. So if it makes me more comfortable (and it does), can I read the statement “f(x) is nominal” as meaning “x is not in the domain of f”?
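In programming terms the first direction is just the familiar option-type trick; here is a minimal Python sketch of both directions (the helper names totalize and metadomain are invented for illustration), with None playing the role of the added point \bot:

    # Totalize a partial function A -> B into a total function A -> B + {None},
    # then recover the (meta)domain by keeping the inputs on which it denotes.
    def totalize(partial_f):
        def total_f(x):
            try:
                return partial_f(x)
            except (ZeroDivisionError, ValueError):
                return None   # the added point, playing the role of "bottom"
        return total_f

    def metadomain(total_f, candidates):
        return [x for x in candidates if total_f(x) is not None]

    inv = totalize(lambda x: 1 / x)          # 1/x is undefined at 0
    print(inv(2), inv(0))                    # 0.5 None
    print(metadomain(inv, range(-2, 3)))     # [-2, -1, 1, 2]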

Posted by: Mike Shulman on October 1, 2009 6:05 AM | Permalink | PGP Sig | Reply to this

Re: Nominal objects

Are you saying that it is also a well-established tradition to consider semantics that contain “nominal” objects? Or is talking about “nominal objects” just a manner of speaking about terms which do not denote, so that nominal objects actually are not present in the semantics?

I think that I assumed that we were just talking about syntax. It seems like good syntax to adopt to represent ordinary mathematical language as directly as possible. And etymologically, ‘nominal’ objects are supposed to have ‘names’ with no actual referents.

Of course, one could take a broad philosophical position that nominal objects in mathematics are really there in some sense, and then when you do metamathematics, you will claim that nominal objects in the syntax are represented by nominal (metanominal?) metaobjects in the semantics. But that doesn't really mean anything different.

So if it makes me more comfortable (and it does), can I read the statement “f(x) is nominal” as meaning “x is not in the domain of f”?

I should let AN answer that. But that's how I've been reading it; so if he says no, then you should ignore what I said earlier. (^_^)

Posted by: Toby Bartels on October 1, 2009 9:37 AM | Permalink | Reply to this

Re: Nominal objects

MS: So if it makes me more comfortable (and it does), can I read the statement “f(x) is nominal” as meaning “x is not in the domain of f”?

Informally, yes. Formally, it depends on whether the context has a well-defined concept of the domain of f.

In FMathL, there is a slight formal difference between functions (Axioms A18) and maps (Axiom A25). Only maps (which formalize morphisms between sets) have domains, and for maps f, your interpretation is indeed correct. For functions (which formalize the notion of something that can be evaluated, without knowing a priori where and to where), one can of course define the metadomain of f by your statement.

Both concepts are heavily used in mathematical practice, so FMathL formalizes both.

But the language is often ambiguous since maps and functions are in practice used synonymously for one or the other purpose (or even both, depending on the context).

On the other hand, in the liar sentence (2.9) or in Hilbert’s paradox (Section 3.7), avoiding the concept of a nominal object is clumsy.

Posted by: Arnold Neumaier on October 1, 2009 11:13 AM | Permalink | Reply to this

Re: Nominal objects

By “domain” I meant “metadomain,” so now I am happy.

On the other hand, in the liar sentence (2.9) or in Hilbert’s paradox (Section 3.7), avoiding the concept of a nominal object is clumsy.

I don’t see why. In Hilbert’s paradox can’t I just say that H is not in the (meta)domain of itself? And it seems like the formalization of the liar paradox is just that the operator E(u) = (u = \mathbf{0}) has no fixed point. If you had a “fixed point” operator then this E would not be in its (meta)domain.

Posted by: Mike Shulman on October 1, 2009 2:33 PM | Permalink | PGP Sig | Reply to this

Re: Nominal objects

MS: In Hilbert’s paradox can’t I just say that H is not in the (meta)domain of itself?

Agreed, if you have already the concept of a metadomain. (This is somewhat unusual; usually one defines only formal objects but not metaobjects.)

But now try the same for Yablo’s paradox (also in Section 3.7.)

MS: And it seems like the formalization of the liar paradox is just that the operator E(u)=(u=0) has no fixed point. If you had a “fixed point” operator then this E would not be in its (meta)domain.

These are true statements but they capture the arguments, not the assertion itself (namely that the liar sentence is “not well-defined” = nominal).

Posted by: Arnold Neumaier on October 1, 2009 3:56 PM | Permalink | Reply to this

Re: Nominal objects

Agreed, if you have already the concept of a metadomain. (This is somewhat unusual; usually one defines only formal objects but not metaobjects.)

In any theory that contains partially defined operations (often formalized as non-entire functional relations), those operations have a domain (or “metadomain” if you will), whether or not the language in which the theory is formulated includes any internal notion of “domain.” For instance, in the first-order theory of categories there is a partially defined operation called “composition” which has a domain consisting of pairs of composable arrows. In FMathL it looks as though your “application” operator “@” is completely analogous; although its (meta)domain is not as clearly defined, it certainly excludes (H, H).

But now try the same for Yablo’s paradox (also in Section 3.7.)

What axiom of FMathL do you invoke in order to define formally Yablo’s sequence S or the Liar sentence L? You need to have them somehow in order for them to be nominal; otherwise you’re just talking about properties that some hypothetical thing might have, which is the same way I would have phrased the argument. One way I can think of defining them is to use the choice operator to define a fixpoint operator, say something like L = Choice( \{ u | u = (u=0) \} ). But in that case I can just say that \{ u | u = (u=0) \}, being empty, is not in the domain of Choice. Same with Yablo.

Posted by: Mike Shulman on October 1, 2009 9:29 PM | Permalink | PGP Sig | Reply to this

Re: Nominal objects

MS: If that is the meaning of “type,” then I have yet to be convinced that anything at all meaningful can be said that is “type-free.”

You may look at Illative combinatory logic to see what such systems look like if they follow reductionist goals similar to those of ZF. I like the basic idea (as Church and Curry did at first), and used it in FMathL. But I didn’t like their reductionism, which is even worse than that of ZF.

MS: in FMathL there actually are objects with the property of being “nominal” which are denoted by such a symbol, and this is what I was referring to.

Yes. They are very useful to formally analyze paradoxes, see (2.9) and Section 3.7. They allow one to state things without being harmed by their turning out inconsistent.

There is a whole branch of logic called paraconsistent logic that deals with such contradictions in different ways. Having nominal objects is the most elegant of these.

MS: I mentioned that first-order logic is formal and precise because you seemed to be saying that nominal objects are the only way to “formalize” not-yet-known-whether-it-exists objects.

You simply moved the problem to the metalevel. I should probably have added the word “self-reflective” somewhere.

Posted by: Arnold Neumaier on September 30, 2009 11:36 PM | Permalink | Reply to this

Re: Nominal objects

You may look at Illative combinatory logic

Ok, I see what you are getting at.

You simply moved the problem to the metalevel. I should probably have added the word “self-reflective” somewhere.

Sorry, I don’t understand. Can you elaborate?

Posted by: Mike Shulman on October 1, 2009 9:47 PM | Permalink | PGP Sig | Reply to this

Re: Nominal objects

MS: I mentioned that first-order logic is formal and precise because you seemed to be saying that nominal objects are the only way to “formalize” not-yet-known-whether-it-exists objects.

AN: You simply moved the problem to the metalevel. I should probably have added the word “self-reflective” somewhere.

MS: Sorry, I don’t understand. Can you elaborate?

I noticed repeatedly that you allow on the metalevel things that you do not accept on the formal level. Thus you had a metathing called group that was several things on the formal level (though later you found a way to produce a group as a single formal object). Similarly, you now say, the formal statement “x is nominal” is not needed since you can replace it on the metalevel (FOL) by “x does not denote something”.

But in a foundational language you need to be able to formalize the latter at least via reflection, since one cannot pass the buck to higher and higher metalevels. At the very least, the computer must be told in formally precise terms when something does or does not exist.

Part of our talking at cross-purposes seems to be due to putting the boundary between object level and metalevel differently.

Posted by: Arnold Neumaier on October 2, 2009 1:56 PM | Permalink | Reply to this

Re: Nominal objects

you now say, the formal statement “x is nominal” is not needed since you can replace it on the metalevel (FOL) by “x does not denote something”.

No, I am saying that the formal statement “x is nominal” can be replaced by the equally formal (and not meta) statement “x does not exist,” whose semantics is an assertion that the term x does not denote anything, not an assertion about some object called x.

There is no need for every variable in every statement we write down when doing mathematics to refer directly to an actual object in the semantics. The term x exists as an object of logic, even if it does not denote something. So there is no need for nominal objects on the metalevel either; all that is needed is a formalization of the relationship “denotes” between terms and objects, which you had to have anyway in order to make sense of logic and which requires no recursion into higher metalevels. Thus, nominal objects are not needed to formalize the treatment of not-yet-known-whether-or-not-they-exist objects.

Posted by: Mike Shulman on October 2, 2009 5:14 PM | Permalink | PGP Sig | Reply to this

Re: Nominal objects

MS: I am saying that the formal statement “x is nominal” can be replaced by the equally formal (and not meta) statement “x does not exist,”

This is exactly what my Axiom 4 says: if an object does not exist, it is called nominal. Of course, one is free not to use the word nominal and continue using “does not exist”. But since all objects metaexist from the start, even the nominal ones, I found it more clear to have a time-honored, separate word for “(metaexists but) does not exist”.

Posted by: Arnold Neumaier on October 9, 2009 1:14 PM | Permalink | Reply to this

Re: Nominal objects

Okay, I guess I failed to get across what I meant. What I’m saying is that there is no need for the (to my mind, confusing) concept of “metaexistence.” Existence equals metaexistence. When something doesn’t exist, it doesn’t exist, basta; there is no need for a “thing” which “metaexists” but doesn’t exist.

Posted by: Mike Shulman on October 9, 2009 2:51 PM | Permalink | PGP Sig | Reply to this

Re: Nominal objects

MS: Existence equals metaexistence. When something doesn’t exist, it doesn’t exist, basta; there is no need for a “thing” which “metaexists” but doesn’t exist.

metaexists = exists on the metalevel,

exists = exists on the object level.

If you take ZF+negation of the continuum hypothesis as your metatheory, and construct in it the object theory consisting of the cumulative hierarchy, a subset of the reals metaexists that contains the integers but has the cardinality of neither. But it does not exist on the object level.

Thus one needs to distinguish between existence and metaexistence.

In foundations it is rarely the case that one can identify concepts on the metalevel with the corresponding concepts on the object levels.

Countability and metacountability are different things; otherwise you run into Skolem’s paradox. Sets and metasets are different things, otherwise you cannot do model theory where you construct a set theory inside a metaset theory.

The same applies for virtually every concept.

Posted by: Arnold Neumaier on October 9, 2009 4:54 PM | Permalink | Reply to this

Re: Nominal objects

Okay, apparently there are three types of existence going on.

  1. Existence of the first type: Something exists in the metatheory.
  2. Existence of the second type: Something (which must exist(1)) exists as an element of the model of the object theory (which is a particular set in the metatheory).
  3. Existence of the third type: Something which exists(2) has the additional property called “existing,” which is a predicate in the object theory.

I am claiming that there is no reason to distinguish between exists(2) and exists(3), and I thought that you were using “meta-exists” to mean exists(2). Of course exists(1) and exists(2) are different and always will be, but nominality has to do with the distinction (or lack thereof) between exists(2) and exists(3).

If you take ZF+negation of the continuum hypothesis as your metatheory, and construct in it the object theory consisting of the cumulative hierarchy

This example doesn’t quite work, because ZF+(not-CH) does not prove the consistency of ZF (unless it is inconsistent), so you can’t construct in ZF+(not-CH) a model of ZF. I think probably what you meant to say is that if you take ZF+(not-CH) as the metatheory and assume in it the existence of a model of ZF, you can construct from that a model of ZF+CH. (It won’t be the cumulative hierarchy, since ZF contains the axiom of foundation and is thus equal to its own cumulative hierarchy, but it can be the constructible hierarchy.) Then there will exist(1) a subset of the reals intermediate in cardinality, but it will not exist(2) or exist(3).

Posted by: Mike Shulman on October 9, 2009 6:47 PM | Permalink | PGP Sig | Reply to this

Re: Nominal objects

MS: apparently there are three types of existence going on.

Yes.

MS: I thought that you were using “meta-exists” to mean exists(2).

No: In FMathL, metaexists means exists(1), and nominal means exists(2) but not exists(3). Thus being nominal is a possible property of objects, not of metaobjects.

MS: I am claiming that there is no reason to distinguish between exists(2) and exists(3).

One does not have to, but there are very good reasons to do so. It is advantageous to have nominal objects - they save a lot of qualifications that would need insertion into ordinary mathematical statements to make them rigorous. And it is essential for most untyped logic calculi (remember that first order logic is typed, since it has the types proposition and term), which otherwise run into versions of Hilbert’s paradox.

Therefore, there are hundreds of papers about the many variations of this, including work by prominent people and even work in topos theory. The subject matter is often called logic of partial terms. See, for example, Feferman’s article about Definedness; especially Section 4.

Indeed, FMathL has a logic of partial terms (of some kind, not quite Feferman’s).

Would replacing “exists” by “is defined” and using “nominal” for “is not defined” remove your reservations?

MS: [ZF+negation] doesn’t quite work […]

Yes, I was too sloppy in detail, but my message about metaexistence came across nevertheless.

Posted by: Arnold Neumaier on October 9, 2009 7:29 PM | Permalink | Reply to this

Re: Nominal objects

Would replacing “exists” by “is defined” and using “nominal” for “is not defined” remove your reservations?

Yes, I believe that is exactly what I was suggesting here.

But you seemed to be saying here that nominal objects are necessary in order to formalize talking about not-yet-known-whether-or-not-they-exist objects. Are you now agreeing with me that we don’t actually need nominal objects, but we can say “is not defined” instead of “is nominal”?

You also said here that using “is not defined” instead of “nominal” “moves the problem to the metalevel.” Now that we have the object-meta boundary sorted out, can you explain again what you meant by that?

And it is essential for most untyped logic calculi (remember that first order logic is typed, since it has the types proposition and term), which run otherwise into versions of Hilbert’s paradox.

It seems that using partial functions instead of total ones, which is essentially the same as having terms that are not defined, is just as good a way of avoiding paradoxes in this case.

See, for example, Feferman’s article about Definedness; especially Section 4.

Thanks for that link! I think his distinction between “logics of existence” and “logics of definedness” is exactly what we’re arguing about here. I’m saying that logics of definedness seem to me closer to mathematical practice, for some of the same reasons he gives at the end of his section 2, and that (as Feferman says) they are equivalent to logics of existence so we might as well use the more comfortable words. It seems to me that your insistence on the existence(2) of objects called “nominal” is an insistence on a logic of existence rather than a logic of definedness.

Posted by: Mike Shulman on October 9, 2009 8:36 PM | Permalink | PGP Sig | Reply to this

Re: Nominal objects

AN: Would replacing “exists” by “is defined” and using “nominal” for “is not defined” remove your reservations?

MS: Yes, I believe that is exactly what I was suggesting here. […] I think his distinction between “logics of existence” and “logics of definedness” is exactly what we’re arguing about here.

I read Feferman again, more closely, since I was surprised at what you said here.

Indeed, your position is that of a logic of definedness, while mine is that of a logic of existence. But I had so far interpreted all logics of partial terms as logics of existence (without having encountered any ill effects), and hence had not differentiated between the two. Indeed, Feferman says in his remark after (6) that they are in some sense equivalent.

But I don’t follow Feferman’s arguments for favoring a logic of definedness over a logic of existence.

The philosophical argument (i) is void, I think, since mathematicians often speak of nonexisting limits or function values. Nonexisting and undefined are synonymous in most contexts - except those involving paradoxes.

We define in naive set theory Russell’s set and then say it does not exist since its existence would lead to a contradiction. This is exactly the nonconstructive sense of mathematical existence that Hilbert emphasized. This was also more or less the point of view of Cantor, who avoided in this way the set-theoretic paradoxes - he was careful in his language and distinguished two different sorts of sets - what we now call sets and classes - for which different constructions apply.

This makes full sense in a logic of existence - but not in a logic of definedness. And it makes no sense in first order logic, where what exists is fixed a priori!

Feferman’s (ii) is not applicable to FMathL since there quantification is not, as in first order logic, over all objects, but only over elements from some object. These always exist by the FMathL axioms.

And (as his parenthetical remark shows) Feferman knew that his (iii) has no real weight, since one can argue exactly the same in favor of a logic of existence.

On the other hand, from an implementation point of view, there is a strong argument for a logic of existence: what is a term is then easily decidable. Without a formal notion of undefined terms, the question whether f(x)/g(x) [with well-formed terms f(x) and g(x)] is a term in rational arithmetic is undecidable in general, since the test for g(x)=0 may be undecidable.

But syntactic questions like termness should be easy to decide in any efficient computer implementation of conceptual mathematics. It’s like typing in programming languages - if typing is undecidable (which happens for certain kinds of dependent types), it severely complicates the analysis of the language and its efficient compilation.
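To illustrate the contrast (a toy Python sketch of my own, not FMathL’s actual term language or decision procedure): whether an expression is a well-formed term is a purely syntactic recursion, while whether it is defined requires evaluating its subterms, which in richer languages may demand arbitrary computation or proof:

    # Toy expression trees: integers, variable names, and binary operations.
    def well_formed(e):
        # Purely syntactic check, decidable by structural recursion.
        if isinstance(e, (int, str)):
            return True
        return (isinstance(e, tuple) and len(e) == 3 and e[0] in ('+', '*', '/')
                and well_formed(e[1]) and well_formed(e[2]))

    def evaluate(e, env):
        if isinstance(e, int):
            return e
        if isinstance(e, str):
            return env[e]
        op, a, b = e
        x, y = evaluate(a, env), evaluate(b, env)
        return x + y if op == '+' else x * y if op == '*' else x / y

    def defined(e, env):
        # Semantic check: needs the values of the subterms; in richer term
        # languages this question is undecidable in general.
        try:
            evaluate(e, env)
            return True
        except ZeroDivisionError:
            return False

    t = ('/', ('+', 'x', 1), ('*', 'x', 'x'))   # f(x)/g(x) with g(x) = x*x
    print(well_formed(t))         # True, decided without knowing x
    print(defined(t, {'x': 0}))   # False, since g(0) = 0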

Even if I were to adopt the suggested change in terminology, undefined objects in FMathL would therefore still be formal objects. Thus there would still be three kinds of predicates; the change only gives them possibly less confusing names:

exists(1) = metaexists

exists(2) = exists, in two flavors:

exists(2a) = exists(2) and is defined [= the old exists(3)]

exists(2b) = exists(2) and is not defined [= nominal = the old exists(2) but not exists(3)]

MS: Now that we have the object-meta boundary sorted out, can you explain again what you meant by that?

One can, in principle, take any logic of existence and redefine objects to be only the previously existing objects, redefine terms to be defined only if they denote a previously existing object. Then one can replace any formal treatment of the nonexisting=undefined case by informal language on the metalevel (which I took to be your old position) or by conditional constraints on when terms are defined (which is the formal advance championed by Feferman’s logic of partial terms).

And one can reverse this by adding to every logic of definedness nominal objects for every undefined but well-formed term, so that all these terms denote as well.
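To make this reverse direction concrete, here is a tiny illustrative Python sketch (the names Nominal, denote and exists3 are my own invention, not FMathL machinery): every well-formed but undefined term gets a tagged stand-in, so that all terms denote, and the earlier exists(3) becomes the predicate of not being such a stand-in:

    # Turn a logic-of-definedness view into a logic-of-existence view:
    # undefined but well-formed terms are assigned nominal stand-in objects.
    class Nominal:
        def __init__(self, term):
            self.term = term   # the term that fails to denote anything "real"

    def denote(partial_f, x):
        try:
            return partial_f(x)
        except ZeroDivisionError:
            return Nominal(('apply', partial_f, x))   # now every term denotes

    def exists3(obj):
        return not isinstance(obj, Nominal)

    print(exists3(denote(lambda x: 1 / x, 2)))   # True
    print(exists3(denote(lambda x: 1 / x, 0)))   # False: nominal only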

Thus it is a matter of style rather than essence which approach is taken.

Posted by: Arnold Neumaier on October 10, 2009 5:47 PM | Permalink | Reply to this

Re: Nominal objects

The philosophical argument (i) is void, I think, since mathematicians often speak of nonexisting limits or function values.

Once again, I think you’re not listening to me. Speaking of nonexisting limits has nothing to do with whether or not there is an actual object with the property of not existing, or whether the “limit” term is just not defined.

On the other hand, from an implementation point of view, there is a strong argument for a logic of existence: what is a term is then easily decidable. Without a formal notion of undefined terms, the question whether f(x)/g(x) [with well-formed terms f(x) and g(x)] is a term in rational arithmetic is undecidable in general, since the test for g(x)=0 may be undecidable.

I think you’re confusing the existence of a term with the existence of the thing it denotes. Even in a definedness logic the term f(x)/g(x) can exist whether or not g(x) = 0; it just may or may not denote anything. The distinction is that definedness is a property of the term saying whether or not it denotes anything, whereas in an existence logic every term denotes something and existence(3) is a property of the thing denoted.

Thus it is a matter of style rather than essence which approach is taken.

Okay, so it sounds like you are agreeing now that we don’t need nominal objects.

Posted by: Mike Shulman on October 10, 2009 7:58 PM | Permalink | PGP Sig | Reply to this

Re: Nominal objects

MS: I think you’re not listening to me.

AN: I think we have different philosophies about existence, and hence interpret the same words differently than you do. When the same word is understood differently by two parties and one answers to the wrong interpretation, one can speak of misunderstanding but not of not listening.

MS: Speaking of nonexisting limits has nothing to do with whether or not there is an actual object with the property of not existing, or whether the “limit” term is just not defined.

As long as there is no formal notion of nonexistence, there is no way to interpret your statement objectively; it is true for you but not for me. I mean by nonexistence what mathematicians generally mean by nonexistence when they say that a limit does not exist. In this case, this means “there is no real number x such that in each neighborhood there are infinitely many…”. Nothing prevents me from having a model where the nonexisting limit is defined to be something that is not a real number. Not a single theorem about existing limits (which are real numbers) would change.

Thus the notion of existence has nothing to do with model theory (which many mathematicians using this terminology do not even know much about), but with convenience in talking about the subject.

It also has nothing to do with any notion of “undefined limits”. I rarely hear mathematicians talk about an undefined limit (though I would treat it as synonymous with a nonexisting limit).

MS: it sounds like you are agreeing now that we don’t need nominal objects

The situation is similar to that of the empty set in naive set theory. It defies the notion of a set as a collection of objects, since there is nothing to collect. According to your arguments applied analogously, there is no need for an empty set, and those who see the empty set for the first time generally feel that way.

But the empty set is very useful since it allows one to avoid treating separately many exceptional situations that would occur without the concept of an empty set.

Nominal objects have precisely the same advantage, and that’s why they were introduced by the Greeks, why they were kept throughout the centuries, and why I use them, too.

Posted by: Arnold Neumaier on October 10, 2009 9:49 PM | Permalink | Reply to this

Re: Nominal objects

Nothing prevents me from having a model where the nonexisting limit is defined to be something that is not a real number.

Kelley (not Kelly!) did just this in his version of class theory. Everything is defined, but not everything is a set; the naïvely undefined things tend to be (perhaps always are?) equal to the class of all sets (the largest object in his theory).

I don't find that attractive, but it certainly can be done.

Posted by: Toby Bartels on October 10, 2009 11:13 PM | Permalink | Reply to this

Re: Nominal objects

When the same word is understood differently by two parties and one answers to the wrong interpretation, one can speak of misunderstanding but not of not listening.

Yes, that’s true, but I feel as though I’ve explained what the words mean to me several times and you persist in interpreting my statements in a different way. Perhaps you feel the same way. Maybe whenever we use a controversial word like “exists” we should prefix it with our initials, so that if I talk about something MS-existing I know what I mean, but if I talk about something AN-existing then you can correct me if I get it wrong. (That was mostly a joke.)

Anyway, I think at least we can now agree that there is no need for nominal objects, which I thought was one of the points of disagreement. I certainly agree in principle that just because something can be eliminated from the language, it may still be useful and convenient to keep it. I haven’t yet seen any examples that convince me that a logic of existence is any more convenient than a logic of definedness, but maybe after you’ve developed FMathL some more you can exhibit some.

Posted by: Mike Shulman on October 11, 2009 1:39 AM | Permalink | PGP Sig | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

Are there mathematicians who ordinarily speak in terms of nominal objects? I’ve never heard any mathematician use the word “nominal.” In other areas you are very insistent on the formal language matching ordinary mathematical language, but this feels to me like a mismatch. But perhaps the mathematicians you spend time with talk about nominal objects all the time?

Posted by: Mike Shulman on October 1, 2009 9:56 PM | Permalink | PGP Sig | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

Are there mathematicians who ordinarily speak in terms of nominal objects?

I've never heard a mathematician use the term ‘nominal’ in this sense (at least, not before reading about FMathL). But I often see mathematicians write things like

f(x) does not exist.

If I understand correctly, this means exactly the same as

f(x) is nominal.

Of course, you can understand this as an abuse of language that really means

x does not belong to the domain of f.

But a formalism with nominal objects matches ordinary practice more closely.

It's even clearer without an explicit f. I know hundreds of books that say, more or less,

1/0 does not exist.

But many of them could not even try to rephrase it as

(1,0) does not belong to the domain of /.

After all, these are introductory textbooks that haven't even considered the general concept of functions and domains yet! You could interpret this as talking about meta-domains and substitute the proper expression in the language at hand for the domain of /, to get

It is not the case that 0 \ne 0.

But that is not really what they are trying to say!

Posted by: Toby Bartels on October 1, 2009 10:11 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

I’ve been enjoying this thread immensely. I don’t feel I have much to contribute. But here’s one thing:

A useful framework for computer-based mathematics will require some very deep programming methods. But a lot of the workings of the resulting software will need to be hidden from the casual user, because most mathematicians won’t want to know how the system works.

We may imagine mathematicians as people who want to know every detail of reasoning that underlies their work. But that’s not really true. Logicians and some category theorists want to know all these details. But the disdain and lack of understanding with which most mathematicians view logicians and category theorists is evidence that most mathematicians would rather not know.

For example, a good system may need lots of functors that do things like send ‘the real numbers viewed as an object in the category of complete ordered fields’ to ‘the real numbers viewed as an object of Set’, and so on. But if so, the casual user should have easy access to an environment where default choices of these functors lurk in the background and silently spring into action whenever needed.

It’s bad if the casual user has to know these functors are lurking in the background. But that doesn’t mean it’s bad for a system to have these functors built into it. Indeed, they may be the most convenient way to build a flexible system that looks very simple from the outside.

In other words: it’s possible that the best, most practical system will be one whose inner workings mathematicians won’t understand.
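To make the “lurking functors” slightly more concrete, here is a minimal Haskell sketch (invented names, a finite toy carrier standing in for an honest set) of a forgetful map that an interface could apply silently whenever a structured object is used where a bare set is expected:

```haskell
-- A structured object: a carrier together with its operations.
data OrderedField a = OrderedField
  { elements :: [a]          -- a finite stand-in for the underlying set
  , add      :: a -> a -> a
  , mul      :: a -> a -> a
  , leq      :: a -> a -> Bool
  }

-- The forgetful map from such structured objects to their carriers, written
-- out explicitly here, but which a front end could insert behind the scenes.
underlyingSet :: OrderedField a -> [a]
underlyingSet = elements

-- User-level code phrased purely in terms of sets; the field structure is
-- silently forgotten before this function ever looks at the object.
sameCardinality :: OrderedField a -> OrderedField b -> Bool
sameCardinality f g = length (underlyingSet f) == length (underlyingSet g)
```

Whether such default coercions should be visible to the user is exactly the design question raised above.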

Posted by: John Baez on October 1, 2009 7:41 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

I agree. However, I think it’s also worth some effort to minimize the amount of technicalities which the casual user isn’t expected to understand. I think that people are, in general, more successful users of an automatic system the more they understand of its inner workings. In particular, they have some idea of what it can and can’t be expected to do, what sorts of things are likely to go wrong, and what is reasonable to try to do to fix them. So mathematicians should be encouraged (though not required) to understand at least a bit about how the system works, which means it should try to be comprehensible to them.

I don’t have any conclusion at the moment about the specific question of whether implicit forgetful functors should be present underneath. Before this discussion I would have said “of course,” but it’s become clear that to at least some people, this could be weird and offputting.

Posted by: Mike Shulman on October 1, 2009 10:16 PM | Permalink | PGP Sig | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

JB: the disdain and lack of understanding with which most mathematicians view logicians and category theorists is evidence that most mathematicians would rather not know.

Yes. They want something that eases their life, not something that requires them to pay attention to distracting details that do not matter for what they want to achieve.

Some people need more attention to logical or structural detail, and they therefore study it. But with respect to logic and category theory, the ordinary mathematician is in the same position as ordinary people are with respect to electronics. They want to understand only as much as is needed to benefit from the fruits, and perhaps a little bit more.

JB: It’s bad if the casual user has to know these functors are lurking in the background. But that doesn’t mean it’s bad for a system to have these functors built into it.

It is good for a system to be able to produce a version that can make these functors visible for specialists like you who want to think that way. But it would be far too cumbersome to represent everything inside in this way.

Thus FMathL will have its own material way of dealing with mathematical objects, but will provide the user with a choice of programmable style sheets that rewrite the internally represented stuff in the manner the user wants to see it.

There will be a few predefined style sheets that present things in the dominant styles from the mathematical literature.

But I still don’t understand enough of the moral of the n-Category Cafe to see how to write a style sheet that would respect their moral. It seems that it cannot be formalized in a way that is algorithmically definite. Most of my discussion here is part of my effort to see more clearly what would be needed for this.

Let me also reply here to some other recent contributions by Tom Trimble and Tony Bartels, which touch related issues.

AN: Two instances of the field of complex numbers (without a distinguished imaginary unit) are not structurally the same since there is no canonical isomorphism? This would be very strange indeed.

TT: As to your question, though, Toby has given an informed reply.

TB: I wouldn’t say that they are not, in some sense, the same just because there is more than one isomorphism. (There is always a sense in which they are not the same, if they are represented differently syntactically. But that is not itself a question for mathematics.) I would say this: It is not only important whether things are isomorphic, but also in how many ways they are isomorphic;

Sometimes, this is important, often not. Moreover, one can tell the difference by looking at the automorphism group of the object, and one can construct any other isomorphism using this information. So nothing is lost structurally by treating them as structurally the same. One gets a valid equivalence relation. Since in regarding them as the same one does not care about the specific isomorphism, the fact that “the composite isomorphism A \to B \to C \to A might not be the identity” does not matter either.

Once the latter matters, one needs more structure - namely a particular isomorphism.

TT: I hope at least it was clear that I was trying to build a bridge of understanding between what you wrote and what David wrote.

Yes, I appreciate that.

TT: please note that the ideal generated by x^2+1 does not change if you replace x by -x. That’s the point!

Indeed, in this special case, you are right. That’s why I only quoted the parenthetical remark.

But even in this special case, there is something strange in that your moral demands that in considering \mathbb{R}[x] one is immediately forced to forget the definition of \mathbb{R}[x], and only keep the object up to isomorphism. For the definition of \mathbb{R}[x] remembers (according to usual practice, at least in Bourbaki-style mathematics) that x is a distinguished element in this ring. And no ordinary mathematician who defines i := x mod x^2+1 thinks of the x used there as being anything different from the x used in the construction. An automatic system like FMathL must be able to recognize this.

It is only the strange moral which people here put on the interpretation that causes all these subtle complications, that are completely absent in a normal reading of algebra texts.

I am slowly adapting to this strange world for the sake of this discussion, since I need to see how to account in FMathL for its existence, but I thoroughly dislike it.

In FMathL one can choose what one wants to forget and what one wants to remember, and remembering is the default, as in Bourbaki-style mathematics, since this provides the needed simplicity.

I do not think that Bourbaki is any less structural in content than what category theorists get with the moral structural straitjacket imposed.

I do not deny that it is very useful for certain categorical constructions to think that way, and FMathL allows one to forget everything undesired upon demand. But most mathematicians need not make use of this most of the time. It would only complicate their life. That’s why, for most of them, it is a straitjacket.

And since nothing in the structure of mathematics requires this straitjacket (it does not save us from any contradiction in mathematics), it should not be imposed upon mathematics. It should be left to the discretion of mathematicians to choose the reading they like.

The situation is similar to that of constructive mathematics vs. classical mathematics. Some people like to work within the constraints of intuitionistic logic, but for most mathematicians, to be forced into that would be an unwelcome straitjacket. A hundred years ago, there were heated debates about who is right. But history decided that both camps should have their way, and they are accommodated peacefully side by side in modern mathematics.

FMathL tries to learn from this and presents a middle view of mathematics far from the material extremism of pure ZFC and the structural extremism of “speak no evil”. But it accommodates both views as the two extremal poles of a wide spectrum of intermediate traditions that freely borrow from both the material and the structural point of view.

Posted by: Arnold Neumaier on October 2, 2009 9:52 AM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

Arnold wrote:

Tom Trimble and Tony Bartels

That’s Todd Trimble and Toby Bartels.

But even in this special case, there is something strange in that your moral demands that in considering \mathbb{R}[x] one is immediately forced to forget the definition of \mathbb{R}[x], and only keep the object up to isomorphism. For the definition of \mathbb{R}[x] remembers (according to usual practice […]) that x is a distinguished element in this ring.

This is a misunderstanding. Category theorists, possibly more than anyone else, absolutely would insist that that element x is distinguished. In the jargon, that element is the component at the set 1 of the unit of the adjunction \mathbf{Set} \stackrel{\rightarrow}{\leftarrow} \mathbb{R}\mathbf{-Alg}. Of course, you’re not obliged to make use of the special role of x. In other words, you can forget its distinguished role. But you don’t have to.
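For readers who prefer to see the universal property executed, here is a small Haskell sketch (with \mathbb{Q} as a computable stand-in for \mathbb{R}, and the names Poly, x, and evalAt invented for illustration):

```haskell
-- The free algebra on one generator, represented by coefficient lists:
-- Poly [a0, a1, a2, ...] stands for a0 + a1*x + a2*x^2 + ...
newtype Poly = Poly [Rational]
  deriving (Show)

-- The distinguished generator: the image of the one-element set under the
-- unit of the free/forgetful adjunction.
x :: Poly
x = Poly [0, 1]

-- The universal property: an algebra map out of Poly is determined by
-- choosing where x goes.
evalAt :: Fractional a => a -> Poly -> a
evalAt c (Poly as) =
  sum (zipWith (\a n -> fromRational a * c ^ n) as [0 :: Integer ..])
```

For instance, evalAt (3 :: Double) x is 3.0: sending the generator to 3 extends uniquely to the whole algebra.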

Posted by: Tom Leinster on October 2, 2009 11:10 AM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

TL: Todd Trimble and Toby Bartels.

Yes, sorry. Too many names to remember starting with T….

TL: In other words, you can forget its distinguished role. But you don’t have to.

But if I don’t, then Todd Trimble’s argument that i := x mod x^2+1 is not distinguished, since this x might in fact have been -x, breaks down. This is why I had thought one has to forget it, according to the morals whose precise formal contents I am trying to discover.

Or am I supposed to read between the lines to figure out when something is to be forgotten and when to be remembered, depending on what sort of argument is made?

Posted by: Arnold Neumaier on October 2, 2009 11:44 AM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

I was already a little worried when Todd wrote

knowledge of which element that is is not encoded within the polynomial algebra structure

Knowledge of which element x is is not encoded within the structure of the polynomial algebra \mathbb{R}[x] as an \mathbb{R}-algebra (since there are \mathbb{R}-algebra automorphisms that don't preserve this), but it is encoded in the structure of \mathbb{R}[x] as a polynomial algebra, that is as the free \mathbb{R}-algebra on one generator.

Since \mathbb{R}[x] is typically defined as such a free algebra, it does naturally come with a distinguished element, which is of course the one we call x. So if you define \mathbb{C} as \mathbb{R}[x]/\langle{x^2+1}\rangle, then it comes naturally with an element that we may call i, although this is not part of the structure of \mathbb{C} as a mere \mathbb{R}-algebra.

However, I'd be more inclined to define \mathbb{C} as an algebraic closure of \mathbb{R}, in which case it does not come with a distinction between i and -i. I no longer remember if this means that I'm agreeing or disagreeing with Todd!

Posted by: Toby Bartels on October 2, 2009 12:07 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

No disagreement with you there Tony, I mean Toby – describing \mathbb{C} as an algebraic closure of \mathbb{R} is not a universal-element description of course, but that’s fine; no moral straitjacket there. I did feel that mention of universal elements was important to another aspect of this hydra-headed discussion, so I brought it up.

And I know that you know that I know the point about polynomial algebras above, since I had made the very same point, I think more than once. But it obviously won’t hurt to repeat it.

Posted by: Todd Trimble on October 2, 2009 1:38 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

No disagreement with you there

OK, good. It seemed that you misspoke when you wrote ‘not encoded within the polynomial algebra structure’, that it should have been ‘not encoded within the algebra structure’ or ‘not encoded within the polynomial algebra's algebra structure’. And it seemed as if that line might have been contributing to confusion with AN (but perhaps not).

Posted by: Toby Bartels on October 2, 2009 2:27 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

Yeah, strike out ‘polynomial’. I don’t want to say how long I worked on that comment (I had to tone down some drafts considerably), so maybe I was too tired to notice that infelicitous ‘polynomial’. But, I think that probably any confusion has been sorted out by now.

Posted by: Todd Trimble on October 2, 2009 2:50 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

I struck ‘polynomial’ out there. Hope that’s OK.

Posted by: David Corfield on October 2, 2009 4:03 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

I don’t understand much of what’s going on here— I’ve been watching it feeling like someone who knows only a bit of English watching a performance of Who’s on First. But here’s an anecdote that touches on a few of the things that have come up.

A mathematician BC was writing something (like maybe notes on Ramanujan’s conjecture), and he began it with something like this: “Let \mathbf{Z} denote the ring of integers, let \mathbf{Q} denote its fraction field, let \mathbf{R} denote the completion of \mathbf{Q} with respect to the usual absolute value, and let \mathbf{C} denote *an* algebraic closure of \mathbf{R}.” The reason he defined \mathbf{C} this way is, of course, so there is no preferred square root of -1. He thought this was completely uncontroversial and was just trying to do things in a way that avoided all artificial choices.

He gave JPS, a former member of Bourbaki, a copy to read, and a few minutes later JPS came back to BC’s office and said that Bourbaki did not define \mathbf{C} to be *an* algebraic closure of \mathbf{R}— they defined it to be the set \mathbf{R} \times \mathbf{R} together with a certain ring structure. Further they used boldface only for explicitly constructed objects. Since “an algebraic closure” is not something explicitly constructed and since there is no canonical map between an arbitrary algebraic closure and Bourbaki’s \mathbf{C}, a general algebraic closure of \mathbf{R} should be denoted C, without the boldface.

Then BC told JL this story, and JL responded that from the point of view of logic it would be hard to actually give an algebraic closure of \mathbf{R} without picking some square root of -1, and he thought he could probably even prove it if he wanted to.

I don’t know if this story has a moral, but the similarity with the present discussion is pretty remarkable. It’s clear the next topic should be which typeface we should use for which constructions.

Posted by: James on October 2, 2009 3:53 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

After defining \mathbf{C} to be an algebraic closure of \mathbf{R}, did BC then go on to say something like “let i \in \mathbf{C} be a square root of -1”? If so (and I’d be surprised if he didn’t), then of course from a categorial-structural point of view there is no difference between his \mathbf{C} and Bourbaki’s, and I can’t really understand the point of view of a mathematician who would insist on maintaining a distinction.

I’m a little ashamed to admit it, but I’ve never actually tried reading Bourbaki. This discussion has made me realize that it would probably sound very weird and divorced from mathematical practice to me.

Posted by: Mike Shulman on October 2, 2009 5:22 PM | Permalink | PGP Sig | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

I’d be surprised if he did pick a square root of -1! Otherwise, what’s the point? The conversation was a bit pedantic, but sheesh, not that pedantic! But he never told me, so I can’t say I know for sure.

You can pick a square root locally in the text, though. For instance, this would be perfectly fine (if a bit wordy): “Consider the bijection from the set I of square roots of -1 to the set of orientations of the unit circle in C given by i \mapsto the function sending a real number t to e^{it}.”

Posted by: James on October 2, 2009 6:27 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

Ah, I see what you are saying. I assumed that anyone working with the complex numbers would need a particular square root of -1 to call i, but perhaps that was unwarranted. I haven’t worked with complex numbers since my first year of graduate school, but when I did there was always one of them called i.

It sounded to me like JPS was saying that \mathbf{C} “really is” \mathbf{R}\times\mathbf{R} with a certain ring structure (unlike an arbitrary algebraic closure of \mathbf{R}), rather than just saying that \mathbf{C} comes equipped with a particular chosen square root of -1 (again, unlike an arbitrary algebraic closure of \mathbf{R}). The former I am unable to understand; the latter is an interesting question. But if he meant the latter, why didn’t he just say that, rather than bothering about \mathbf{R}\times\mathbf{R}?

Posted by: Mike Shulman on October 2, 2009 10:59 PM | Permalink | PGP Sig | Reply to this

The complex number i

MS: It sounded to me like JPS was saying that C “really is” R×R with a certain ring structure (unlike an arbitrary algebraic closure of R), rather than just saying that C comes equipped with a particular chosen square root of −1 (again, unlike an arbitrary algebraic closure of R). The former I am unable to understand; the latter is an interesting question.

JPS was indeed saying that the object Bourbaki calls C “really is” the Gaussian complex plane. Everyone trained in the material point of view that Bourbaki takes should be able to understand this. That this interpretation is lost in the mathematical dialect you understand is to me a defect of the latter.

And (though this is completely irrelevant to the discussion JPS was involved in), everyone who has seen a drawing of the relevant part of the Gaussian complex plane knows that if c := (a,b) is a complex number à la Bourbaki then a is called its real part, b its imaginary part, and c = a+ib.

In particular, Bourbaki’s complex numbers contain the distinguished element (0,1) = i.

Posted by: Arnold Neumaier on October 3, 2009 8:43 PM | Permalink | Reply to this

Re: The complex number i

JPS was indeed saying that the object Bourbaki calls C “really is” the Gaussian complex plane.

I assume that by “the Gaussian complex plane” you mean “the set \mathbf{R}\times\mathbf{R},” although that is not what I would mean by it. Your interpretation of JPS’ comment seems at odds with James’, though.

Everyone trained in the material point of view that Bourbaki takes should be able to understand this.

I can understand the statement that \mathbf{C}=\mathbf{R}\times\mathbf{R} (in a material foundation). I cannot understand the point of view that such an equality matters for doing any sort of mathematics (rather than merely the existence of a unique canonical isomorphism, as is the case if \mathbf{C} denotes an algebraic closure of \mathbf{R} equipped with a chosen square root of -1).

Posted by: Mike Shulman on October 3, 2009 9:15 PM | Permalink | PGP Sig | Reply to this

Re: The complex number i

AN: JPS was indeed saying that the object Bourbaki calls C “really is” the Gaussian complex plane.

MS: I assume that by “the Gaussian complex plane” you mean “the set R×R,” although that is not what I would mean by it.

No, I mean the set R x R equipped with the structure of the complex numbers as defined by Bourbaki (following Gauss).

MS: I cannot understand the point of view that such an equality matters for doing any sort of mathematics

It matters for someone who (like Bourbaki or me) wants to support a material point of view in which one can materially identify (via a canonical procedure of identification) canonically equivalent objects without ever running into problems.

This is not possible if an isomorphism is not canonical, since then a problem (mentioned earlier by one of you) arises when one goes from A to B to C and back to A.

Bourbaki wanted (and needed) to be consistent across eight big volumes…

Posted by: Arnold Neumaier on October 3, 2009 11:47 PM | Permalink | Reply to this

Re: The complex number i

It matters for someone who (like Bourbaki or me) wants to support a material point of view in which one can materially identify (via a canonical procedure of identification) canonically equivalent objects without ever running into problems.

But \mathbf{R}\times\mathbf{R} equipped with the structure of the complex numbers is equivalent – canonically – to any algebraic closure of \mathbf{R} equipped with a chosen square root of -1. So if you are identifying them along this canonical isomorphism, then the difference doesn’t matter.

Bourbaki may have defined, as a matter of foundations, “the” complex numbers to be \mathbf{R}\times\mathbf{R} with the appropriate structure; of course one has to do something like that in a foundation like ZF. But why would anyone object to a mathematician who is working in some other field (not foundations) using a different notion of “the complex numbers” which is canonically equivalent to the “official” one?

Posted by: Mike Shulman on October 4, 2009 7:08 AM | Permalink | PGP Sig | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

I’m a little ashamed to admit it, but I’ve never actually tried reading Bourbaki. This discussion has made me realize that it would probably sound very weird and divorced from mathematical practice to me.

Then you should try reading it, to correct this misconception!

In large part, Bourbaki is responsible for founding 20th-century mathematics as we know it. If Bourbaki had adopted an explicitly structural foundation like SEAR, then we would probably regard it as standard today. As it was, they did not have this available and did not try to develop one. However, they made do with what they had and tried to build a structural theory anyway. They began even before the formulation of category theory and notoriously never adopted category theory in their theory of structure. But you may be surprised at how good a general theory of structured sets and universal properties can be without using category theory! In particular, their desire for a structural approach is clear when you see that this general theory of structure is in a book called Set Theory; the only other book on set theory that covers such material that I can think of offhand is Lawvere & Rosebrugh.

Posted by: Toby Bartels on October 2, 2009 7:42 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

Reading the beginning of the Theory of Sets volume (the very foundation) is a bit of a pain, since initially only Polish notation is used (which I definitely do not recommend). So it takes a while before one realizes that ordinary mathematics results.

But later it gets better, and the later volumes of the Elements of Mathematics are worthwhile reading, even 40 years after they were written. I didn’t read everything, but the contents of, say, “Lie Groups and Lie Algebras” are excellent.

Posted by: Arnold Neumaier on October 3, 2009 7:36 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

Then BC told JL this story, and JL responded that from the point of view of logic it would be hard to actually give an algebraic closure of \mathbf{R} without picking some square root of -1, and he thought he could probably even prove it if he wanted to.

I don't understand what you mean here. Bourbaki gave an algebraic closure of \mathbf{R}, as \mathbf{R} \times \mathbf{R} with a certain twisted multiplication. Their construction has two square roots of -1, (0,1) and (0,-1), and neither is more fundamental than the other, so in what sense does it pick one?

If one works in a constructive framework to prove the theorem that an algebraic closure of \mathbf{R} exists, then presumably one can extract from that theorem a field C and a square root of -1 in C. But my intuition is that one could not always do this with a nonconstructive existence proof.
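A few lines of Haskell make the Bourbaki-style construction concrete (Double is only a rough stand-in for the reals, and the names are invented for illustration):

```haskell
-- Bourbaki-style complex numbers: pairs of reals with the twisted multiplication.
data C = C Double Double
  deriving (Eq, Show)

mulC :: C -> C -> C
mulC (C a b) (C c d) = C (a*c - b*d) (a*d + b*c)

-- Two perfectly definite pairs square to (-1, 0); the construction produces
-- both of them and does not label either one as "the" square root of -1.
rootsOfMinusOne :: [C]
rootsOfMinusOne = [C 0 1, C 0 (-1)]

checkRoots :: Bool
checkRoots = all (\z -> mulC z z == C (-1) 0) rootsOfMinusOne
```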

Posted by: Toby Bartels on October 2, 2009 7:44 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

As far as I can tell, that’s a reasonable point. I can see that I might have suggested that I understood what JL meant, but I didn’t, so it’s likely that something important has been lost in my version. Actually I’m not sure BC understood it either.

Posted by: James on October 2, 2009 8:33 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

>extract from that theorem

That should be

>extract from that proof

Posted by: Toby Bartels on October 2, 2009 8:46 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

Well, there is not a perfect symmetry between positive and negative numbers in \mathbf{R}, so having the two square roots of -1 be (0,1) and (0,-1) does give you a way to distinguish them. With an arbitrary algebraic closure you would have to choose a square root of -1 to call i, thereby supplying extra structure; if you start with \mathbf{R}\times\mathbf{R} then you have such extra structure already around, and all you have to do is not forget it.

Posted by: Mike Shulman on October 2, 2009 11:00 PM | Permalink | PGP Sig | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

J: let C denote *an* algebraic closure of R. [… JPS replied…] Bourbaki did not define C to be *an* algebraic closure of R - they defined it to be the set R x R together with a certain ring structure. Further they used boldface only for explicitly constructed objects. Since “an algebraic closure” is not something explicitly constructed and since there is no canonical map between an arbitrary algebraic closure and Bourbaki’s C, a general algebraic closure of R should be denoted C, without the boldface.

TB: Bourbaki gave an algebraic closure of R, as R x R with a certain twisted multiplication. Their construction has two square roots of −1, (0,1) and (0,−1), and neither is more fundamental than the other, so in what sense does it pick one?

There is no need to pick one in order to make sense of JPS’s reply. It makes perfect sense since there are two isomorphisms between any two algebraic closures of R, and neither of them is canonical.

This is typical of Bourbaki, who adheres to a strictly material point of view. This allows one to identify things that are given canonically, but it cannot be done otherwise.

Posted by: Arnold Neumaier on October 3, 2009 8:05 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

There is no need to pick one in order to make sense of JPS’s reply.

Not to make sense of JPS's reply, no. Just to make sense of JL's reply.

This is typical for Bourbaki who adheres to a strictly material point of view

I have to remind myself that when you say ‘material’ you don't mean what Mike means by ‘material’.

Posted by: Toby Bartels on October 3, 2009 8:27 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

AN: This is typical for Bourbaki who adheres to a strictly material point of view

TB: I have to remind myself that when you say ‘material’ you don’t mean what Mike means by ‘material’.

You can decide that easily from context. Mike only defined the meaning of “material set theory” (though only vaguely; I see no way to decide whether the FMathL set theory counts as material or not), not the meaning of “material”.

Thus whenever the word material is used without “set theory” following, it is to be understood in my sense.

In case of Bourbaki, however, the two uses are equivalent, since he bases everything very explicitly on the material set theory ZF + global choice.

Posted by: Arnold Neumaier on October 3, 2009 8:54 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

I see no way to decide whether the FMathL set theory counts as material or not

It is. You can tell because something can be an element of two different sets.

Thus whenever the word material is used without “set theory” following, it is to be understood in my sense.

First of all, I still do not understand what “your sense” is.

Secondly, can you please choose another word to describe whatever it is that you are trying to mean by “material”? The word “material” was introduced into this discussion to describe a particular type of set theory, and I think that trying to use it to mean something different when not followed by “set theory” is gratuitously confusing.

Posted by: Mike Shulman on October 3, 2009 9:14 PM | Permalink | PGP Sig | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

AN: I see no way to decide whether the FMathL set theory counts as material or not

MS: It is. You can tell because something can be an element of two different sets.

Ah. So you want to make this the defining property of a material set theory? You didn’t say so in the n-Lab page where you introduced the concept. You only mentioned it as a property of pure sets and slight variations thereof.

MS: can you please choose another word to describe whatever it is that you are trying to mean by “material”? The word “material” was introduced into this discussion to describe a particular type of set theory, and I think that trying to use it to mean something different when not followed by “set theory” is gratuitously confusing.

The meaning is not that different, even if one accepts that one can extract a clear meaning from your definition.

I had given here a general definition of the adjective “material” in the context of our discussion that makes perfect sense in general, and fairly well fits the informal, nonmathematical meaning of “material”. To repeat,

AN: material = being able to identify the elements uniquely by giving a formal expression identifying it.

You attempted, in the n-Lab page on set theory, a definition of the compound phrase “material set theory” only, which I found quite ambiguous at the time when I read the page and added my comments there. In the current, later version, you even say:

MS: We don’t have definitions of “material set theory” or “structural set theory”. These paragraphs are meant only to give an intuitive understanding. […] I’m still mulling over whether there is a better way to define these ideas.

I do not see why I should give up a clear and useful concept that I abstracted from our discussion - especially since the competition is only a vague formulation of a very special usage that had been formulated very confusingly at the time I coined my concept.

My concept of “material” even gives a clear sense to “material set theory” that is only slightly different than your informal, vague interpretation, and can easily be seen as its precise version. It distinguishes correctly between ZF (and variants) on the one hand and ETCS (and SEAR) on the other hand.

Please consider this to be my response to your invitation there:

MS: If anyone else has a way to phrase them better, be my guest.

Posted by: Arnold Neumaier on October 3, 2009 11:43 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

My concept of “material” even gives a clear sense to “material set theory” that is only slightly different than your informal, vague interpretation, and can easily be seen as its precise version. It distinguishes correctly between ZF (and variants) on the one hand and ETCS (and SEAR) on the other hand.

Even though I also would pick a different word for it, I would like to understand your concept of ‘material’ (and then why you consider it important). But I still don't understand what it means, because I don't see why ETCS is any different from ZF in this regard.

How is ZF material when most real numbers do not have names? How is ETCS non-material when everything that appears in it (every term in the language) appears with a name (a term in the language)?

Of course, each of these questions shows how to answer the other, but I don't know how to answer them both.

Posted by: Toby Bartels on October 4, 2009 12:21 AM | Permalink | Reply to this

Material objects and standard models

TB: I would like to understand your concept of ‘material’ (and then why you consider it important).

Maybe you will understand if I am formally precise (ask for more precision if this is not yet enough): Call an object of a theory in first order logic material if it can be constructed in a definite way (i.e., not using free variables) from objects of the theory that uniquely exist.

These are the objects that can be named according to the usual conventions of mathematics.

Saying “let S be a set” is not naming in the above sense since S is a free variable. Thus this S is arbitrary and hence not material.

Saying “we write \emptyset for the uniquely determined set without elements” is naming in the above sense, and gives the name \emptyset to the empty set.
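As a toy illustration of “constructed in a definite way, not using free variables”, here is a Haskell sketch of a fragment of the closed terms of such a definitional extension of set theory; the constructor names are invented, and a real extension would also include terms for separation, replacement, and unique descriptions:

```haskell
-- Ground (closed) terms of a small definitional extension of set theory.
-- There is no constructor for a free variable, so "let S be a set" cannot
-- be expressed here; every value of this type names a definite set.
data SetTerm
  = Empty                    -- the uniquely determined set without elements
  | Pair SetTerm SetTerm     -- the unordered pair {s, t}
  | BigUnion SetTerm         -- the union of the members of s
  | PowerSet SetTerm         -- the power set of s
  | Omega                    -- the set of finite von Neumann ordinals
  deriving (Show)

-- A couple of material objects in this sense:
ordinalOne :: SetTerm
ordinalOne = Pair Empty Empty          -- {emptyset}, the von Neumann ordinal 1

ordinalTwo :: SetTerm
ordinalTwo = Pair Empty ordinalOne     -- {emptyset, {emptyset}}, the ordinal 2
```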

TB: How is ZF material when most real numbers do not have names?

Can you show me a single real number that cannot be named? You cannot, for by showing it to me you name it. Thus how can you be sure that most real numbers do not have names?

Like the continuum hypothesis, your statement is true or false depending on the model of ZF. But mathematical statements about real numbers should hold in all models. Thus your statement is not a mathematical truth but a private intuition about preferred models of ZF.

In ZF with first order logic, there are possibly nonmaterial reals, since there may be uncountable models of the reals. On the other hand, the submodel of any model of ZF consisting of all material sets only is a (metacountable) model of ZF as it is easily seen to satisfy all ZF axioms.

This model really is the intended model of ZF when used as a foundation for mathematics, since all ZF-based mathematics is about this model.

Indeed, no mathematician ever talked about any ZF-set that was not contained in this model (except in general ways using free variables, which anyone would be free to interpret as referring to material sets only).

In particular, all Dedekind reals (say) in this model are material.

The situation exactly parallels that with natural numbers. The intended model is metacountable, and any intended natural number is material. But the first order logic definition of the natural numbers allows uncountable models in which most natural numbers do not have names and hence are immaterial. None of these immaterial natural numbers plays any role in mathematics. They exist only because of intrinsic limitations of the description of natural numbers in terms of first order logic.

(That the metaset of names of real numbers is metacountable and the set of real numbers is uncountable is no conflict; it seems so only - and then constitutes Skolem’s paradox - if one forgets the distinction between object level and metalevel.)

TB: How is ETCS non-material when everything that appears in it (every term in the language) appears with a name (a term in the language)?

Well, one cannot apply the above reasoning. Nothing in ETCS is unique, so there is no intended model of ETCS.

If one defines categories only up to isomorphism, one impoverishes the structure of categories: The partial order defined by being a subcategory loses the semilattice property that it has according to the standard definitions.

In the same way, a purely structural foundation of mathematics loses important structure, namely the property that one can define a standard model for the foundation of mathematics.

But in my view this property is very important since it is the main reason for why we can communicate mathematics objectively and (in principle) unambiguously.

Posted by: Arnold Neumaier on October 9, 2009 2:09 PM | Permalink | Reply to this

Re: Material objects and standard models

Call an object of a theory in first order logic material if it can be constructed in a definite way (i.e., not using free variables) from objects of the theory that uniquely exist.

As far as I can tell this is a totally different notion from what I meant by “material,” so I don’t think there’s any justification for using the same word. Moreover, I believe there is already a well-established and more descriptive word in model theory for this notion: definable.

You didn’t give any definition of when you want to call a theory material, but I gather from what you wrote that perhaps you mean that a theory is material when it has a model in which all objects are definable.

the submodel of any model of ZF consisting of all material sets only is a (metacountable) model of ZF as it is easily seen to satisfy all ZF axioms.

This is not at all obvious to me; can you prove it? Countable models do, of course, exist, by the Löwenheim-Skolem theorem, but I don’t think I’ve ever seen a statement that one can construct one consisting only of definable objects.

Note, in particular, that the set “defined” by a given property will be different in different models, since when you cut down to a substructure you change the meaning of quantifiers. For example, when you remove any real numbers, you change the meaning of the set \{x \in \mathbb{R} \mid x^2 < 2\}.

This model really is the intended model of ZF when uses as a foundation for mathematics, since all ZF-based mathematics is about this model.

I regard this as an outrageous statement. How many mathematicians would agree that there are only countably many real numbers in the “intended” model of mathematics? Logicians may prefer thinking about countable models for technical reasons, but I think that is a big disconnect between them and everyday mathematicians.

Even if we accept that any model of ZF contains a submodel consisting only of definable sets, it’s not true that every statement in the original model has the same meaning in this submodel, since (as above) the same statement can refer to different sets in the model and in the submodel.

Nothing in ETCS is unique

It is easy to modify ETCS so that the axioms which assert “there exists a set such that (blah)” become operations which construct a specific set such that (blah). For example, for any sets A and B there is a specified set A \times B with specified projections making it a product. In a language which allows evil, then (which we’ve agreed by now is independent of structurality), A \times B can be uniquely characterized in terms of A and B. So sets in ETCS can be just as definable as sets in ZF.

In the same way, a purely structural foundation of mathematics loses important structure, namely the property that one can define a standard model for the foundation of mathematics.

Even assuming your claim that any model of ZF contains a metacountable submodel consisting of definable sets, this statement does not follow, since different models of ZF might contain different submodels of definable sets.

But in my view this property is very important since it is the main reason for why we can communicate mathematics objectively and (in principle) unambiguously.

No, that has nothing to do with there being a standard model (and a good thing, too, since as I said above, there is no standard model). The statement of a theorem in any first-order theory is completely objective and unambiguous: it means “this statement can be proven from these axioms” or equivalently by the completeness theorem “this statement is true in all models of these axioms.”

Posted by: Mike Shulman on October 9, 2009 4:41 PM | Permalink | PGP Sig | Reply to this

Re: Material objects and standard models

MS: You didn’t give any definition of when you want to call a theory material, but I gather from what you wrote that perhaps you mean that a theory is material when it has a model in which all objects are definable.

Yes. For this is what distinguishes ZF from SEAR, and it is the only property I can imagine that would justify such a fundamental differentiation.

AN: the submodel of any model of ZF consisting of all material sets only is a (metacountable) model of ZF as it is easily seen to satisfy all ZF axioms.

MS: This is not at all obvious to me; can you prove it? Countable models do, of course, exist, by the Lowenheim-Skolem theorem

One possible proof of this theorem is constituted by precisely this result. Define a notation for all the objects whose uniqueness is claimed in the introductory part of a book on axiomatic set theory, and consider the conservative extension of ZF that makes these into primitive operations. Then consider the set S of all ground formulas in this theory. Define an equivalence relation E by declaring two formulas equivalent if they are equal in each model. Then S/E is canonically a countable model of ZF, in which every set is definable.

Moreover, one can identify the definable objects in any model by means of a canonical embedding of S/E into the model, and thus claim that this model is the unique intended model of ZF, in exactly the same way as there is a unique intended model of the natural numbers, which picks from any first order logic model of Peano arithmetic precisely the definable part.
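The (meta)countability of S in this construction comes down to the fact that there are only countably many strings over a finite alphabet. A self-contained Haskell sketch (with an arbitrary illustrative alphabet, not the actual notation of any set theory):

```haskell
-- Enumerate all strings over a finite alphabet by length.  The ground terms
-- of the extended language form a subset of this enumeration, so there are
-- at most (meta)countably many of them, and hence S/E is (meta)countable.
alphabet :: [Char]
alphabet = "0{},()EUP"   -- any finite stock of symbols will do

stringsOfLength :: Int -> [String]
stringsOfLength 0 = [""]
stringsOfLength n = [ c : s | c <- alphabet, s <- stringsOfLength (n - 1) ]

allStrings :: [String]
allStrings = concatMap stringsOfLength [0 ..]
```

The equivalence relation E (provable equality) is of course not computable; the sketch only bears on the size of S.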

MS: For example, when you remove any real numbers, you change the meaning of the set \{x \in \mathbb{R} \mid x^2 < 2\}.

Only in the same weak sense that I change the meaning of the set of all even numbers if I remove the unnameable natural numbers in a nonstandard FOL model of the natural numbers.

But nothing has changed mathematically, since I only removed ghosts that nobody can name anyway. Not a single theorem of arithmetic changes by this exorcism.

AN: This model really is the intended model of ZF when uses as a foundation for mathematics, since all ZF-based mathematics is about this model.

MS: How many mathematicians would agree that there are only countably many real numbers in the “intended” model of mathematics?

All mathematicians who agree that only that is mathematics which can be communicated. For it is a logical consequence of this that all mathematics happens in the countable model of definable numbers.

They perhaps wouldn’t agree on a first confrontation with the statement, but I can’t see how they can fail to agree when understanding the argument.

Of course, countable must here be read as metacountable (all names are metaobjects only), since it is a property of the model. This does not affect the theorem that inside the model one cannot find an injective map from the reals to the integers, and that, on the object level, the reals are uncountable. This has been well known since Skolem.

MS: In a language which allows evil, then (which we’ve agreed by now is independent of structurality), A×B can be uniquely characterized in terms of A and B. So sets in ETCS can be just as definable as sets in ZF.

Not yet according to this, since this is a construction using variables A and B. In SEAR, you cannot even prove the uniqueness of the empty set - the statement does not even make formal sense. (I haven’t checked ETCS.)

But once you make all the stuff unique that you need to have a full-fledged set theory, your set theory is every bit as material as ZF, in any reasonable sense of the word material.

MS: this statement does not follow, since different models of ZF might contain different submodels of definable sets.

Not if you stick to a fixed notation for defining the meaning of “definable”. The definable submodel is unique given the standard notation of set theory.

Posted by: Arnold Neumaier on October 9, 2009 5:50 PM | Permalink | Reply to this

Re: Material objects and standard models

I’m rapidly running out of patience with your unwillingness to listen to what I’m saying, but I’ll try one more time. However, unless your attitude changes soon, I will have to cease this discussion in order to maintain my own peace of mind.

For this is what distinguishes ZF from SEAR, and it is the only property I can imagine that would justify such a fundamental differentiation.

First of all, as I said and you did not refute (see below), SEAR can be trivially modified to make its objects just as definable as those in ZF. Secondly, for the past month we have been explaining a different such property. I have even given a tentative formal definition of it.

Then S/E is canonically a countable model of ZF, in which every set is definable.

Well, this construction is not what you said; you talked before about a submodel of a given model rather than the construction of a term model. What you’re now describing is, I believe, Henkin’s proof of Gödel’s Completeness Theorem using a term model. I don’t see why you’re claiming this is special to ZF, though; a term model can be constructed for any first-order theory, including SEAR and ETCS.

Moreover, one can identify the definable objects in any model by means of a canonical embedding of S/E into the model

I don’t see any reason for this map to be an embedding. Why might not two different sets in the term model coincidentally turn out to be the same in some other model?

How many mathematicians would agree that there are only countably many real numbers in the “intended” model of mathematics?

All mathematicians who agree that only that is mathematics which can be communicated. For it is a logical consequence of this that all mathematics happens in the countable model of definable numbers.

I am a mathematician. I agree that only mathematics expressed in first-order logic can be communicated. And I agree that when we work in a particular first-order theory, everything we prove is just as true of the term model as it is of any other model, and any explicitly constructed object exists in the term model just as well as it exists in any other model. Thus, if we use ZF as a foundation for mathematics, then all mathematics could just as well happen in the term model as in any other model. But it is not a logical consequence of this that all mathematics does happen in the term model or that the term model is the “intended” model. (This is also just as true of any other first-order theory like SEAR or ETCS.)

This assertion is philosophy, not mathematics, and unfortunately in philosophy it is not possible to give arguments with which no one can fail to agree once they understand the argument.

In a language which allows evil, then (which we’ve agreed by now is independent of structurality), A \times B can be uniquely characterized in terms of A and B. So sets in ETCS can be just as definable as sets in ZF.

Not yet according to this, since this is a construction using variables A and B.

I was giving an example of how the constructions of SEAR can be performed in a specified way. Most constructions start from input and produce output. Consider in ZF the pairing axiom: for any two sets x and y there is a set \{x,y\}. Is the set \{x,y\} definable? It contains the variables x and y. But once two sets x and y have been defined, then the corresponding set \{x,y\} is definable. Likewise, in SEAR with a product operation as I described above, once two sets A and B have been defined, the set A \times B is definable.

In SEAR, you cannot even prove the uniqueness of the empty set - the statement does not even make formal sense.

Not in plain SEAR. However, I specifically said that I was augmenting the language with evil, so that we can talk about equality of objects. We might as well even add a skeletality axiom: isomorphic sets are equal. This doesn’t make the theory any less structural, but one can then state and prove that the empty set is, literally, unique.

But once you make all the stuff unique that you need to have a full-fledged set theory, your set theory is every bit as material as ZF, in any reasonable sense of the word material.

No, not in the sense that we are using the word “material,” which I and several other contributors to this discussion find to be perfectly reasonable.

Posted by: Mike Shulman on October 9, 2009 7:21 PM | Permalink | PGP Sig | Reply to this

Re: Material objects and standard models

We might as well even add a skeletality axiom: isomorphic sets are equal.

Can we do that? It seems to me that you need to prove a coherence theorem to know that this is conservative, or even consistent.

Not that it matters for the definability of the empty set, which I just worked through in detail in another comment.

That there may be other empty sets is no more a problem in SEAR or ETCS than it is in ZFC that there are many singleton sets.

Posted by: Toby Bartels on October 10, 2009 2:45 AM | Permalink | Reply to this

Re: Material objects and standard models

We might as well even add a skeletality axiom: isomorphic sets are equal.

Can we do that? It seems to me that you need to prove a coherence theorem to know that this is conservative, or even consistent.

Sure, but the coherence theorem is just “every category has a skeleton.” Of course you need some choice in the metatheory for that to work.

Posted by: Mike Shulman on October 10, 2009 7:12 PM | Permalink | PGP Sig | Reply to this

Re: Material objects and standard models

We might as well even add a skeletality axiom: isomorphic sets are equal.

Can we do that? It seems to me that you need to prove a coherence theorem to know that this is conservative, or even consistent.

Sure, but the coherence theorem is just “every category has a skeleton.” Of course you need some choice in the metatheory for that to work.

Yes, of course you're right, we're not trying to mix this with any sort of strictness condition.

I don't find the skeletality axiom very natural, however. We'll have, for example, A \times (B \times C) = (A \times B) \times C, but we can't assume that (x,(y,z)) \mapsto ((x,y),z) is the identity function on it. Often \phi, \psi\colon A \looparrowright B will satisfy |\phi| = |\psi| but not \phi = \psi, which messes with the abuse of notation (which we want to match the usual vernacular) that conflates \phi with |\phi|.

Posted by: Toby Bartels on October 11, 2009 12:07 AM | Permalink | Reply to this

Re: Material objects and standard models

I don’t find the skeletality axiom very natural, however.

It’s even worse than that: skeletality is actually a complete red herring, and I shouldn’t have mentioned it at all. Skeletality makes the product object unique, but not the projections: a given object P can generally be made into a product of A and B in lots of different ways. So skeletality doesn’t make the situation any better if we want to only assert literally-unique existence.

Posted by: Mike Shulman on October 11, 2009 1:23 AM | Permalink | PGP Sig | Reply to this

Re: Material objects and standard models

MS: I’m rapidly running out of patience with your unwillingness to listen to what I’m saying,

It is not unwillingness to listen but failure to understand. We come from very different traditions, already interpret the same formal definition of a category differently, and this becomes worse on philosophical issues (which much of our debate is about).

MS: SEAR can be trivially modified to make its objects just as definable as those in ZF.

You had mentioned that one can make the operations definite by small modifications, and I had complained (though maybe expressed poorly) that this does not suffice since there is nothing to start the process. Without a unique empty set, there would not be any ground formulas.

You only remedied this defect in the current mail. But I am not convinced that SEAR with skeletality can be modified such that it is both consistent and its ground formulas form a model of SEAR, so that it would be material in my sense. I’d like to see a number of specific material sets, elements, and relations in such a modified SEAR, to be able to form an intuition.

It is not even clear that skeletality together with an unrestricted substitution law for equals (which seems necessary in order to have a definable submodel) is consistent. Certainly the model of SEAR that proves its consistency is no longer a model for SEAR+skeletality+unrestricted substitution of equals.

AN: S/E is canonically a countable model of ZF, in which every set is definable. […] Moreover, one can identify the definable objects in any model by means of a canonical embedding of S/E into the model

MS: you talked before about a submodel of a given model rather than the construction of a term model. […] I don’t see any reason for this map to be an embedding. Why might not two different sets in the term model coincidentally turn out to be the same in some other model?

My embedding was not meant to be necessarily faithful. To be fully precise, I should have said canonical homomorphism in place of canonical embedding. This is enough to exhibit the unique definable submodel of any given model, which was your request I had answered.

MS: I don’t see why you’re claiming this is special to ZF, though; a term model can be constructed for any first-order theory, including SEAR and ETCS.

My construction is not a Henkin term model but consists of the ground terms only. In (unmodified) SEAR, the construction gives an empty “model” that does not satisfy the SEAR axioms.

Peano arithmetic, ZF, and ZF+global choice are examples of theories where one gets a model in this way, and which are therefore material theories. Since choice is not unique, ZFC is not fully material, but it contains a material ZF submodel.

Indeed, my concept of a material set theory being a set theory where the ground formulas form a model of the theory is a precise formalization of your former intuition that in a material set theory, all sets are created from preexisting building blocks.

MS: How many mathematicians would agree that there are only countably many real numbers in the “intended” model of mathematics?

AN (slightly corrected): All mathematicians who agree that only that is mathematics which can be communicated. For it is a logical consequence of this that all mathematics happens in the countable model of definable objects.

MS: I agree that only mathematics expressed in first-order logic can be communicated. […] But it is not a logical consequence of this that all mathematics does happen in the term model

All mathematics expressed in any form of logic is communicated in the form of strings from a finite alphabet, hence in a countable model of the logic used. If you agree that uncommunicated stuff is not mathematics, it follows that all mathematics is represented (and not only, as you claim, representable) in a countable model.

Of course, the mathematics may also refer to other models, including uncountable ones, but what is uniquely communicable about these uncountable models is again only the definable part.

The only communicable meaning of uncountable in mathematics is that the counting of the definable model cannot be done using a function represented in the model itself. But on the level where mathematics is actually done, one can count constructively, by lexicographically ordering the (presumably finitely many) statements completed each day, for all days of past and future history of mathematics.

MS: unfortunately in philosophy it is not possible to give arguments with which no one can fail to agree once they understand the argument.

Apparently yes.

AN: once you make all the stuff unique that you need to have a full-fledged set theory, your set theory is every bit as material as ZF

MS: not in the sense that we are using the word “material,”

But at least in the sense that all sets are created from preexisting building blocks, which once was your definition of material.

Posted by: Arnold Neumaier on October 10, 2009 10:12 PM | Permalink | Reply to this

Re: Material objects and standard models

I’d like to see a number of specific material sets, elements, and relations in such a modified SEAR, to be able to form an intuition.

I'll do some in the version of SEAR that I described here, which has operations and equality on sets but not skeletality (which I find unpleasant and unnecessary).

This has notation for a set $\mathbf{1}$ (I wrote ‘$1$’ in the cited comment) with exactly one element; since we have an equality predicate on sets, this is unique, the only set which is equal to $\mathbf{1}$.

I also described in that comment how to get an empty set, denoted $|\{x \in 1, y \in 1 \;|\; \bot\}|$. It is not necessarily the only empty set (which is undecidable, although it would follow from skeletality), but it is the only set equal to $|\{x \in 1, y \in 1 \;|\; \bot\}|$. Let us introduce a new symbol and denote this $\mathbf{0}$.

I didn't say anything about infinite sets then, but we should replace the axiom that one exists with the axiom (provable in the original SEAR) that a universal one (a Peano–Dedekind system) exists, and then turn that into some constants. Now it says:

$\mathbb{N}$ is a set, $s\colon \mathbb{N} \looparrowright \mathbb{N}$ is an injective function, and $o$ is an element of $\mathbb{N}$ not in the range of $s$ such that, given any $\phi\colon 1 \looparrowright \mathbb{N}$, if $o \in \phi$ and, for each element $x$ of $\mathbb{N}$ such that $x \in \phi$, $s(x) \in \phi$, then $x \in \phi$ for every element $x$ of $\mathbb{N}$.

And $\mathbb{N}$ is ‘material’ because, as before, only it is equal to $\mathbb{N}$.

OK, that's enough sets; you can imagine how to make more. Once we have these, it's easy to make ‘material’ elements and relations (including subsets and functions). The set $\mathbb{N}$ has elements $o$, $s(o)$, $s(s(o))$, etc. It takes a bit of work to define, say, the function of addition from $\mathbb{N}$ to itself in the usual recursive way, but once that is done it is uniquely specified. And so on.
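To make that last step concrete, here is a minimal sketch (an illustration in Haskell, not part of SEAR itself, with hypothetical names): an inductive type standing in for $\mathbb{N}$, with the constant $o$, the successor $s$, and addition defined by the usual recursion, which is then uniquely specified by its defining equations.

    -- Toy stand-in for the 'material' natural numbers: constant o, successor s.
    data N = O | S N deriving (Eq, Show)

    -- Addition by the usual recursion: x + o = x, x + s(y) = s(x + y).
    add :: N -> N -> N
    add x O     = x
    add x (S y) = S (add x y)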

Does that help? Do these things seem properly ‘material’ in your sense?

Posted by: Toby Bartels on October 11, 2009 1:42 AM | Permalink | Reply to this

Re: Material objects and standard models

It seems to me that in ZF as usually presented, there are no ground terms either, since there are no basic constants or operations and all the axioms merely assert that sets with one property or another exist. Coincidentally, the sets asserted to exist by most of the axioms turn out to be unique, but that isn’t true for, say, the axiom of choice. You can get around this with a global choice operator, i.e. by Skolemizing the axiom of choice.

But the same works in any other first-order theory: you can Skolemize all the existence axioms and then the ground terms you get are basically what’s in the Henkin model. This is basically what Toby’s preferred version of SEAR/ETCS does: for all the axioms asserting that a set, element, or relation with a certain property exists, introduce an operation in the theory which outputs a specific thing with that property.
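For instance (a generic illustration, not tied to any particular axiom list): an existence axiom of the shape
$$\forall A\,\forall B\;\exists P\;\mathrm{Prod}(P,A,B)$$
is Skolemized by adding a new function symbol and replacing the axiom with
$$\forall A\,\forall B\;\mathrm{Prod}(A\times B,\,A,\,B),$$
so that $A\times B$, applied to closed terms, supplies the ground terms that were missing before. Here $\mathrm{Prod}$ is just a schematic predicate standing for “is a product of”.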

You seem to be asserting that there’s something special about Peano arithmetic and ZF in this connection; can you give any references to the literature which support this contention?

I’d like to see a number of specific material sets, elements, and relations in such a modified SEAR, to be able to form an intuition.

Toby beat me to this, but I already had this written out before I saw his post, so I’ll put it up anyway.

  • $1$: the specified terminal set, with a unique element $\star$.
  • $\emptyset$: the specified tabulation of the subset of $1$ that contains no elements.
  • $1\times 1$: the specified product of the specified terminal set with itself. This is, of course, canonically isomorphic to $1$.
  • $\emptyset\times 1$: the specified product of $\emptyset$ and $1$. This is again empty, and thus canonically isomorphic to $\emptyset$.
  • $P 1$: the specified powerset of $1$.
  • $\top \in P 1$: the specified element of $P 1$ corresponding to the full subset of $1$, i.e. such that $\epsilon_1(\star,\top)$.
  • $\bot \in P 1$: the specified element of $P 1$ corresponding to the empty subset of $1$, i.e. such that $\epsilon_1(x,\bot)$ never.
  • $1\sqcup 1$: the specified coproduct of $1$ with itself. (The original version of SEAR doesn’t include an axiom of coproducts, since they can be constructed from powersets, but for convenience the Skolemized version of SEAR should probably have a coproduct operation.)
  • $N$: the specified set equipped with specified $o\in N$ and $s\colon N\to N$ satisfying the Peano axioms.
  • $1 = s(o)\in N$ (overloaded with the terminal set $1$).
  • $2 = s(s(o)) \in N$.
  • $3 = s(s(s(o)))\in N$.
  • $N\times N$: the specified product of $N$ with itself.
  • ${+}\colon N\times N \to N$: the function defined by $a+b = c$ iff $R((a,b),c)$ for every relation $R\colon N\times N \looparrowright N$ such that $R((x,o),x)$ for all $x$ and if $R((x,y),z)$ then $R((x,s(y)),s(z))$.
  • ${\ast}\colon N\times N \to N$: the function defined by $a\ast b = c$ iff $R((a,b),c)$ for every relation $R\colon N\times N \looparrowright N$ such that $R((x,o),o)$ for all $x$ and if $R((x,y),z)$ then $R((x,s(y)),z+x)$.
  • ${<}\colon N\looparrowright N$: the relation defined by $a < b$ iff $R(a,b)$ for every relation $R\colon N\looparrowright N$ such that $R(0,s(x))$ for all $x$ and if $R(x,y)$ then $R(s(x),s(y))$.
  • $Q_+$: the specified tabulation of the subset of $N\times N$ consisting of those pairs $(a,b)$ such that $b\neq 0$ and there do not exist $c,d,e$ with $a = s(s(c))\ast d$ and $b = s(s(c))\ast e$.
  • $0\in Q_+$: the element of $Q_+$ which maps to $(o,s(o))$ under the specified inclusion $Q_+ \hookrightarrow N\times N$.
  • $1\in Q_+$: the element of $Q_+$ which maps to $(s(o),s(o))$ in $N\times N$.
  • $\frac{3}{4} \in Q_+$: the element of $Q_+$ which maps to $(s(s(s(o))),s(s(s(s(o)))))$.
  • $Q_+ \sqcup Q_+$: the specified coproduct of $Q_+$ with itself.
  • $Q$: the specified tabulation of the subset of $Q_+ \sqcup Q_+$ consisting of everything except the image of $0$ coming from the first copy of $Q_+$.
  • ${<}\colon Q \looparrowright Q$: the relation defined by $p < q$ if either:
    • $p$ maps into the first copy of $Q_+$ and $q$ into the second, or
    • both map into the second copy, say $p$ maps to $(a,b)$ in $N\times N$ and $q$ maps to $(c,d)$ in $N\times N$, and $a\ast d < b\ast c$ in $N$.
    • both map into the first copy, say $p$ maps to $(a,b)$ in $N\times N$ and $q$ maps to $(c,d)$ in $N\times N$, and $b\ast c < a\ast d$ in $N$.
  • $R$: the specified tabulation of the subset of $P Q$ consisting of those $r\in P Q$ such that
    • there exists an $x\in Q$ with $\epsilon_Q(x,r)$,
    • there exists an $x\in Q$ with $\neg\epsilon_Q(x,r)$,
    • if $\epsilon_Q(x,r)$ and $y < x$ then $\epsilon_Q(y,r)$, and
    • if $\epsilon_Q(x,r)$ then there exists a $y\in Q$ with $x < y$ and $\epsilon_Q(y,r)$.
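Purely as an illustrative aside (a Haskell sketch, not part of the SEAR presentation, using built-in integers in place of the specified $N$), the $Q_+$ items above are the familiar coding of non-negative rationals by pairs in lowest terms, with the order given by cross-multiplication:

    -- Pairs (a,b) with b /= 0 and no common factor, as in the tabulated
    -- subset of N x N described in the list above.
    type QPlus = (Integer, Integer)

    mkQPlus :: Integer -> Integer -> QPlus
    mkQPlus a b = (a `div` g, b `div` g)   -- reduce to lowest terms
      where g = gcd a b

    -- p < q by cross-multiplication, as in the last clauses of the list.
    ltQ :: QPlus -> QPlus -> Bool
    ltQ (a, b) (c, d) = a * d < b * c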

If you’re worried about consistency of Skolemized SEAR, just consider the model of SEARC constructed from ZFC. Since ZFC has canonical constructions of products, and unique constructions of subsets and powersets, it’s easy to define all the operations required by Skolemized SEARC.

AN: it is a logical consequence of this that all mathematics happens in the countable model of definable objects.

MS: I agree that only mathematics expressed in first-order logic can be communicated. […] But it is not a logical consequence of this that all mathematics does happen in the term model

AN: All mathematics expressed in any form of logic is communicated in the form of strings from a finite alphabet, hence in a countable model of the logic used. If you agree that uncommunicated stuff is not mathematics, it follows that all mathematics is represented (and not only, as you claim, representable) in a countable model.

Well, “representable” means “can be represented,” so I don’t see much of a difference there. It is true that inasmuch as mathematics is done via logical inference rules rather than actually touching any model (since, Platonists notwithstanding, none of us have ever touched a real number), and these inference rules are represented faithfully in the term model, one can say that mathematics “is represented in” the term model. But noting that this is philosophy, I’d draw a distinction between “is represented in” and “happens in.” If mathematics can be said to happen anywhere in particular, I’d say that it happens either in the logical inference rules (i.e. the syntax, which is prior to any semantics, even the term model), or else it happens in all models at once.

I acknowledge that you disagree, and since this is philosophy, I can’t expect to convince you, but I ask in return that you acknowledge that I and plenty of other mathematicians may disagree with you, so that it is not justifiable to claim that the term model is the “intended” model. It may be the model you intend, but it need not be the model everyone intends.

in the sense that all sets are created from preexisting building blocks, which once was your definition of material.

Ah! I think I may see the misunderstanding. I never said that “material” means that sets are created from preexisting building blocks. As above, it happens just as much in a structural set theory that sets are constructed from other preexisting sets. What I said is that in a material set theory, the elements of a set are preexistent to that set.

Actually, though, as you know I have since retracted even that statement (as I said at nlab:set theory), because I realized that in a non-well-founded material set theory there is no sense in which the elements of a set can be said to exist “before” that set, e.g. for the set $\Omega = \{\Omega\}$. But I only ever intended “preexisting” to refer to the elements of a set, rather than the other sets from which it might be constructed.

Posted by: Mike Shulman on October 11, 2009 3:28 AM | Permalink | PGP Sig | Reply to this

Re: Material objects and standard models

MS: in ZF as usually presented, there are no ground terms either, since there are no basic constants or operations and all the axioms merely assert that sets with one property or another exist.

Therefore the definition of material I gave had (implicit in its formulation) a unique choice operator before forming the ground model of the resulting conservative extension.

MS: Coincidentally, the sets asserted to exist by most of the axioms turns out to be unique, but that isn’t true for, say the axiom of choice.

The latter isn’t part of ZF, so this doesn’t matter.

MS: You can get around this with a global choice operator, i.e. by Skolemizing the axiom of choice.

But this changes the theory - it is no longer a conservative extension.

It looks like global choice might make every theory material in my sense.

MS: But the same works in any other first-order theory: you can Skolemize all the existence axioms and then the ground terms you get are basically what’s in the Henkin model.

But not every model of every first order logic theory contains a submodel definable in a conservative extension of the theory. This is special to material theories like Peano arithmetic, ZF, or equational algebraic structures.

It fails for SEAR, while it seems to hold for SEAR+global choice.

MS: You seem to be asserting that there’s something special about Peano arithmetic and ZF in this connection; can you give any references to the literature which support this contention?

I don’t have any ready literature to quote. I may have gathered it from reading the literature over the years, between the lines. But the fact is clear from what I have presented in this discussion, and needs no backup by a reference to authority.

MS: • 1: the specified terminal set, with a unique element ⋆.

• ∅: the specified tabulation of the subset of 1 that contains no elements. […]

Thanks for the list; I now see what you meant.

Yes, in SEAR+global choice, you get all this stuff (without needing the dubious skeletality), and it indeed looks like you get a ground term model in this way.

I included global choice into FMathL precisely to allow easy structural thinking: As we had discussed already, one can make constructed objects unique but anonymous by using the choice operator.

I’d call a theory with global choice material: it enables one to do in the realm of mathematics what ordinary people do with ordinary material things in ordinary language.

But clearly, this is very different from what you intended with your notion of “material”. Thus I give up trying to find a joint concept of material.

Instead I now claim that your use of “material” and “structural” is an inappropriate, intuitively misleading use of language:

There is nothing material about the assertion that currently characterizes your term “material set theory”, namely that the same element can be in two distinct sets.

There is also nothing structural in a set theory that eliminates all structure from sets. What you call “structural set theory” deserves to be called a “theory of structureless sets”, which means essentially the opposite of the name you chose. What you call “structurally presented” is much more naturally called “structureless”. For it is a condition for the absence of structure, rather than one for a structural presentation.

But more about that on the n-Lab page on structural set theory.

Posted by: Arnold Neumaier on October 11, 2009 11:34 AM | Permalink | Reply to this

Re: Material objects and standard models

Maybe you understand if I am formally precise (ask for more precision if this is not yet enough): Call an object of a theory in first order logic material if it can be constructed in a definite way (i.e., not using free variables) from objects of the theory that uniquely exist.

OK.

I would like to use things that are unique up to unique isomorphism in the same way that you would use things that are unique. However, if you insist on uniqueness up to an equality predicate that is respected by all dependent type constructions (and everything else), then I can do that, using a structural set theory (such as ETCS is usually given as, or such as SEAR could just as easily be given as) with such an equality predicate on each type.

Then I would also want to make the existence axioms in ETCS and SEAR into operations. (Personally, I like that sort of thing better anyway, although I gather that Mike does not.) We do need a theorem that this is conservative, which is true as long as we do this only for things which are unique up to unique isomorphism. (That's why that's the condition that matters!) For ETCS as I normally see it written, this is true. For SEAR as Mike wrote it, some of the axioms need to be tightened up; for example, the axiom that some set with an element exists needs to become that a set with a unique element exists. But that is equivalent (given the other axioms) to Mike's axiom; we can do all of that.

Thus adjusted, ETCS and SEAR are still structural as Mike tried to define the term on the Lab, and they are still not material as I tried to define the term in an earlier comment. They are also structural and not material according to my intuitive understanding of how Mike was using those terms. I will let Mike confirm that they are structural and not material according to his intuitive understanding of those terms.

Are they now ‘material’ as you have been using that term?

Saying “we write $\emptyset$ for the uniquely determined set without elements” is naming in the above sense, and gives the name $\emptyset$ to the empty set.

OK.

And in ETCS, once we change the axiom that there exists a terminal object into a constant term $1$ for a terminal object, and once we change the axiom that a subobject classifier exists into a constant term $\Omega$ for a subobject classifier, then we can (using only the axioms of a category, by a somewhat complicated argument, but without any other potential choices) define morphisms $\top\colon 1 \to \Omega$ and $\bot\colon 1 \to \Omega$. We can prove that an equaliser of these is an initial object, and we have changed the axiom that equalisers exist into an operation that assigns the underlying object (say $|\{x = y\}|$) of an equaliser to each parallel pair of morphisms $x, y\colon A \to B$ (and an operation that assigns a universally equalising function $\iota_{\{x = y\}}\colon |\{x = y\}| \to A$ to each such pair, but we won't need that). So, we finally define $\emptyset$ to be $|\{\top = \bot\}|$.

SEAR is more user-friendly. As mentioned above, we use the axiom that a set with a unique element exists, and change it into a constant for a set $1$ (and a constant for an element of $1$, although we don't need that). Then we consider the unique relation $\{x \in 1,\; y \in 1 \;|\; \bot\}$ from $1$ to itself given by an always false statement; it's immediate that a tabulation of this relation has no elements, and we have changed the axiom that tabulations exist into an operation that assigns the underlying set $|\phi|$ of a tabulation to each relation $\phi\colon A \looparrowright B$ (and an operation that assigns tabulating functions $p_\phi\colon |\phi| \looparrowright A$ and $q_\phi\colon |\phi| \looparrowright B$ to each such relation, but we won't need that). So, we define $\emptyset$ to be $|\{x \in 1,\; y \in 1 \;|\; \bot\}|$.

How is ZF material when most real numbers do not have names?

Can you show me a single real number that cannot be named? You cannot, for by showing it to me you name it. Thus how can you be sure that most real numbers do not have names?

Can you show me a single real number, or anything else, in SEAR that cannot be named?

I admit, I'm not really interested in how you answer my questions; I'm just going to bounce them back to you, swapping ZF with SEAR, until I can see why one is ‘material’ and the other is not.

This model really is the intended model of ZF when uses as a foundation for mathematics, since all ZF-based mathematics is about this model.

Intended by whom? What is ‘intended’ is a sociological question that cannot be proved, and I don't think that anybody intends the universe of sets to be countable.

But no matter. If your intended model of ZF is a countable model in which every term is definable, then (for purposes of distinguishing them by your sense of ‘material’) I'll take my intended model of SEAR to be a countable model in which every term is definable.

Nothing in ETCS is unique, so there is no intended model of ETCS,

Many things in ETCS are unique up to unique isomorphism, but if you don't recognise that, then use the version above with constants and operations instead of existential statements.

If one defines categories only up to isomorphism, one impoverishes the structure of categories: The partial order defined by being a subcategory loses the semilattice property that it has according to the standard definitions.

Subcategories of a given category (taking a small strict category, which is what can be modelled in SEAR or ZF and is meaningful up to isomorphism) still form a meet-semilattice relative to the correct, defined, notion of equality of subcategories. Of course, one has to check that the operations that one wants to perform on subcategories respect this equality, which they do. (Actually, it is only in ETCS that one even has to do this; SEAR already has a notion of equality of subsets, and a subcategory is given by a subset of the ambient category's set of morphisms.)

In the same way, a purely structural foundation of mathematics loses important structure, namely the property that one can define a standard model for the foundation of mathematics.

But in my view this property is very important since it is the main reason for why we can communicate mathematics objectively and (in principle) unambiguously.

Model theory is the main reason? What is proof theory, chopped liver? And even model theory doesn't need a ‘standard’ (or ‘intended’) model, so I still don't agree with that reason.

But as I said before, no matter. I'll match whatever model of ZF you consider standard with a model of SEAR that I'll declare standard.

Posted by: Toby Bartels on October 10, 2009 2:35 AM | Permalink | Reply to this

Re: Material objects and standard models

Then I would also want to make the existence axioms in ETCS and SEAR into operations. (Personally, I like that sort of thing better anyway, although I gather that Mike does not.)

It seems to me less in the spirit of uniqueness-up-to-unique-isomorphism to specify a particular set $A\times B$ rather than merely the existence of some set with the universal property of a product. If a product is only defined up to unique isomorphism, then why do you want to specify a particular one?

Making things operations is also more presentation-dependent. E.g. if you have products and equalizers as operations, then to get a (specified) pullback you have to decide how to construct a pullback from products and equalizers, and there may be more than one way to do that.
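For the record, one standard such recipe (a sketch, assuming product and equalizer operations are already given) takes the pullback of $f\colon A\to C$ and $g\colon B\to C$ to be the equalizer
$$P \;=\; \mathrm{Eq}(f\circ\pi_A,\; g\circ\pi_B) \;\hookrightarrow\; A\times B;$$
a different but equally legitimate recipe, routed through different intermediate objects, may yield a different (though canonically isomorphic) specified pullback, which is exactly the presentation-dependence in question.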

However, none of that makes a theory with operations less structural.

We do need a theorem that this is conservative, which is true as long as we do this only for things which are unique up to unique isomorphism.

The obvious way to show that requires choice in the metatheory, but maybe you have something cleverer in mind?

Posted by: Mike Shulman on October 10, 2009 7:47 PM | Permalink | PGP Sig | Reply to this

Re: Material objects and standard models

Then I would also want to make the existence axioms in ETCS and SEAR into operations. (Personally, I like that sort of thing better anyway, although I gather that Mike does not.)

It seems to me less in the spirit of uniqueness-up-to-unique-isomorphism to specify a particular set $A \times B$ rather than merely the existence of some set with the universal property of a product. If a product is only defined up to unique isomorphism, then why do you want to specify a particular one?

I don't see it as specifying a particular one, but as specifying a particular notation for one. There's nothing particular about that one, nothing that you can say anyway, at least not without an equality predicate on sets. (My post did have an equality predicate, to formulate AN's ‘material’ criterion, but that's not the sort of thing that I like better.)

Making things operations is also more presentation-dependent. E.g. if you have products and equalizers as operations, then to get a (specified) pullback you have to decide how to construct a pullback from products and equalizers, and there may be more than one way to do that.

I don't see this. Your way doesn't allow a specified pullback at all, so I don't know why you're looking at that to compare to my way. Each of us has many ways to prove the existence of a pullback; indeed, there is a straightforward correspondence between them. Each constructively valid one also gives my way a notation for a pullback, which I guess that you can call a specified pullback; I don't see what's wrong with that. Still, the only sense in which there is ‘more than one way’ to specify a pullback is that there is more than one way syntactically to write it down; there's no internal sense in which these may be ‘equal’ or not.

Nothing depends on the presentation, although it's there if you want it. Once you've proved that all pullbacks exist, then you can say ‘Let $P$ be a pullback of $f\colon X \to Z$ and $g\colon Y \to Z$, with maps $a\colon P \to X$ and $b\colon P \to Y$.’ either way; my way, you can also write ‘Let $P$ be […], let $a$ be […], and let $b$ be […].’ (and prove that this makes a pullback) if you feel like it. Probably you'll want to introduce some notation for universal maps to $P$; you can introduce it by fiat, based on the proof that it's a pullback, but you can also piggyback that on top of some of the notation for tabulations and separation if you want.

However, none of that makes a theory with operations less structural.

That's good to confirm. (^_^)

We do need a theorem that this is conservative, which is true as long as we do this only for things which are unique up to unique isomorphism.

The obvious way to show that requires choice in the metatheory, but maybe you have something cleverer in mind?

We need choice to prove that a model (for my version, constructed out of one for your version) has a function representing products, but we don't actually need that to prove that it's conservative. Is there a standard model theory for theories without equality? If the model for the type of sets must be a set instead of a preset, then we should still model the operations as multi-valued functions (entire relations) or something.

But I understand proof theory better than model theory. I envision turning every statement in my language into an existential statement in your language (the same statement if the statement is already in your language); then prove by induction that every proof in my language becomes a proof in your language. This should work like your theorem on isomorphs from CLOG.

Upon reflection, an equality predicate on sets might mess this up. I think that I should modify the axioms of separation and collection to apply only to predicates in which equality between sets does not appear, to be safe.

Posted by: Toby Bartels on October 11, 2009 1:22 AM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

Please consider this to be my response to your invitation

Okay, great! I thought you were saying that you wanted to use “material” to mean something different from what I/we want “material set theory” to mean. If instead we’re all trying to find a way to give precise definitions of (almost) the same intuitive idea, then we can have a more productive discussion. But in that case, I suggest that none of us should be wedded to a particular precise meaning of “material” until we’ve reached a consensus on what a good choice of a precise definition would be.

Regarding your proposed definition, I am unclear on the meaning of “formal expression” and how it distinguishes between ZF and ETCS/SEAR. If by “formal expression” you mean a term with parameters, then SEAR is material in your sense; the element $x\in A$ is identified by the formal expression “$x$”. But if by “formal expression” you mean a term without parameters, then it seems that hardly any sets even in ZF would be material in your sense, since there are only countably many terms in the language of ZF.

(Well yes, of course, there are countable models, although off the top of my head I have no idea whether the statement would be true even in a countable model; it seems unlikely because of incompleteness/undefinability-type theorems. Was your definition intended to refer to elements of sets in all models of a theory? Or only that there exists a model in which the statement holds?)

You can tell because something can be an element of two different sets.

Ah. So you want to make this the defining property of a material set theory?

It’s close, but at the moment I think I would regard that as a sufficient, but not a necessary, condition. (Although my thoughts about what is essential to the notions have certainly changed over the course of this discussion, and may continue to do so.) I would want more generally that there is no built-in way to compare or relate elements of two different sets, except via the imposition of extra structure on those sets.

I did recently try to give a formal definition of when a theory is “structurally presented” at nlab:structural set theory. Not every “structural” theory is “structurally presented” according to this definition, so it is not fully satisfactory. However, all the structural theories I know can be made structurally presented with at most minor changes, while I see no clear way to modify ZF to make it structurally presented without essentially rebuilding it from SEARC or ETCS+R. But I haven’t yet thought of a way to make the notion of “minor changes” precise.

Posted by: Mike Shulman on October 4, 2009 6:44 AM | Permalink | PGP Sig | Reply to this

material vs. structural

AN: Please consider this to be my response to your invitation

MS: Okay, great! I thought you were saying that you wanted to use “material” to mean something different from what I/we want “material set theory” to mean. If instead we’re all trying to find a way to give precise definitions of (almost) the same intuitive idea, then we can have a more productive discussion.

Of course. Why should we discuss otherwise? I have no time to waste.

Regarding time: tomorrow is the start of the courses of the winter term in Vienna. Thus I’ll have much less time than up to now, and my contributions here will probably be much sparser, not commenting on everything worth discussing.

In particular, I’ll have to delay replying to the remainder of your mail and to some other interesting mails from October 2 onwards.

Posted by: Arnold Neumaier on October 4, 2009 10:29 AM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

Thus whenever the word material is used without “set theory” following, it is to be understood in my sense.

So you are using it in Mike's sense in ‘material set theory’? This is confusing, but OK.

Just for the record, however: Whenever I have used the term in this discussion, I've meant it in Mike's sense, regardless of what was following.

In case of Bourbaki, however, the two uses are equivalent, since he bases everything very explicitly on the material set theory ZF + global choice.

Equivalent? I really don't understand your sense at all if the mere adoption of material set theory as foundations renders the two senses equivalent.

In any case, it is wrong to say that Bourbaki ‘adheres to a strictly material point of view’ in Mike's sense of ‘material’. Bourbaki uses a strictly material set theory as foundations, but they apply it in a thoroughly structural way. If Book I were replaced by something with a structural set theory as foundations, then the rest would not have to be changed. (Even the end of Book I, where Bourbaki works out a general theory of structure, would not have to be changed.) The point of view that Bourbaki so forcefully presents throughout the series, and which had a great effect on 20th-century mathematics in general and category theory in particular, is strongly structural, although technically built on a foundation of material set theory.

(Incidentally, I don't believe that Bourbaki's foundational set theory is exactly ZF + GC, but it is a material set theory.)

Posted by: Toby Bartels on October 3, 2009 11:57 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

TB: So you are using it in Mike’s sense in ‘material set theory’? This is confusing, but OK.

To avoid confusion, I avoid using this combination of words. The same attitude leads me to avoid $\subset$ which may denote “subset” or “proper subset” depending on who uses the symbol.

But as I indicated in one of the mails today, I think that my usage of “material” even gives “material set theory” a meaning close to that intended by Mike.

AN: In case of Bourbaki, however, the two uses are equivalent, since he bases everything very explicitly on the material set theory ZF + global choice.

TB: Equivalent? I really don’t understand your sense at all if the mere adoption of material set theory as foundations renders the two senses equivalent.

The fact that Bourbaki bases his work on ZF makes Bourbaki’s theory a material set theory in Mike’s sense. And since in ZF, each set can be described as a rigid tree with each element identifiable uniquely by the subtree, all sets are material in my sense.

TB: In any case, it is wrong to say that Bourbaki ‘adheres to a strictly material point of view’ in Mike’s sense of ‘material’. Bourbaki uses a strictly material set theory as foundations, but they apply it in a thoroughly structural way.

This only shows that material and structural do not exclude each other.

Certainly, if (as Mike said today) FMathL is material in Mike’s sense since an element can belong to two different sets then Bourbaki is material for exactly the same reasons, as can be seen not only from his Theory of Sets but from any volume of the Elements of Mathematics.

TB: I don’t believe that Bourbaki’s foundational set theory is exactly ZF + GC

Why not? What is it then, according to your belief?

Posted by: Arnold Neumaier on October 4, 2009 12:24 AM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

I don’t believe that Bourbaki's foundational set theory is exactly ZF + GC

Why not? What is it then, according to your belief?

I don't remember precisely, and I would have to check. (I was at the library yesterday, but it didn't occur to me to check this then!) If I remember correctly, Bourbaki's foundational set theory has a primitive binary pairing operation.

If that's the only difference, then they're using a conservative extension of ZF + GC which is obviously equivalent, so it's no big deal. (Even if that's not the only difference, what differences there may be are no big deal.)

Posted by: Toby Bartels on October 4, 2009 12:55 AM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

I think there is confusion here between structural foundations and structural mathematics. We have been asserting that all mathematics is structural. I think this is what Toby meant when he said that Bourbaki presents a structural point of view. However, structural mathematics can be done based on either structural foundations or material foundations. Similarly, “material mathematics” could be founded on either sort of foundation, although I don’t think I’ve ever seen anything I would call “material mathematics” aside from maybe ZF-theory. I believe it is more natural and elegant to build structural mathematics on structural foundations, since using a material foundation entails lots of irrelevancies like $1\in \sqrt{2}$, but using a material foundation as Bourbaki did doesn’t make the mathematics any less structural.

Posted by: Mike Shulman on October 4, 2009 7:12 AM | Permalink | PGP Sig | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

We have been asserting that all mathematics is structural. I think this is what Toby meant when he said that Bourbaki presents a structural point of view.

I meant a little more than that. I mean that Bourbaki makes it quite clear that mathematics is structure by talking explicitly about structured sets, structure-preserving isomorphisms, and universal properties.

It is possible to obfuscate the structural nature of mathematics. For example, define a topological space to be a collection of sets that is closed under binary intersection and arbitrary union. I saw that definition in a book from (I think) the 1950s. You would never find it in Bourbaki. I doubt that you could find it in a contemporary mathematics book either; to some extent, I think that this is due to Bourbaki's influence.

Posted by: Toby Bartels on October 4, 2009 3:52 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

Arnold wrote:

Or am I supposed to read between the lines to figure out when something is to be forgotten and when to be remembered, depending on what sort of argument is made?

I won’t attempt to answer for Todd, but to answer the question in general: yup, ‘fraid so. In this, category theory is exactly the same as every other part of mathematics. Sometimes we write down every minuscule detail of an argument. Much more often, we skip the details that we think the reader will understand. This applies in particular to forgetful functors — but on that point, I think category theorists tend to be more conscientious than most.

Here are a few similar issues in other branches of mathematics. Functional analysts will talk about the “dual” of a vector space, leaving you to guess whether it’s the algebraic or continuous dual. Topologists will talk about “groups”, leaving you to guess whether they mean plain groups or topological groups. Algebraists will talk about representations of a Hopf algebra, leaving you to guess that they mean representations of its underlying associative algebra.

All these practices are normal and probably necessary; and they’re harmless until the listener can’t guess what the speaker means. But in this, as I said, category theory is just like everything else.

Posted by: Tom Leinster on October 2, 2009 12:29 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

AN: Or am I supposed to read between the lines to figure out when something is to be forgotten and when to be remembered, depending on what sort of argument is made?

TL: yup, ‘fraid so. In this, category theory is exactly the same as every other part of mathematics. […] All these practices are normal and probably necessary; and they’re harmless until the listener can’t guess what the speaker means.

I agree with this; I just wanted to be sure. But this leaves any moral completely up to the discretion of the reader.

In particular, there is no need to forget a construction just because it is in the definition of a category (such as the concept of a subcategory or the opposite category), until one finds a later use that makes sense only if the construction is forgotten. And even in that case, one needs to forget it only temporarily to save this particular use.

It seems this settles all moral disputes. Or do I still get anything wrong here?

Posted by: Arnold Neumaier on October 9, 2009 1:09 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

But if I don’t, then Todd Trimble’s argument, that $i := x \bmod (x^2+1)$ is not distinguished since this $x$ might in fact have been $-x$, breaks down. This is why I had thought one has to forget it, according to the morals whose precise formal contents I am trying to discover.

The point of any forgetting here was not some moral injunction; it was simply to explain what David Corfield was saying, which you had seemed to misunderstand. Here, let me reproduce it:

DC: There’s a difference between the field $\mathbb{R}[x]/(x^2+1)$ and the same field with the extra structure of a choice of a residue class to be designated $i$. They belong to different categories.

Does choosing a notation really change the category an object belongs to? This would make the conversion headache in the structural approach even worse.

There you have it: there’s a distinction between $\mathbb{R}[x]/(x^2+1)$ as field (I’ll say field over $\mathbb{R}$, just to emphasize that it’s complex conjugation that is relevant to the discussion, and not some bizarre field automorphism constructed with the help of a choice principle), and field with the extra structure of a choice of a residue class to be labeled $i$.
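To spell out why no residue class is distinguished: in $\mathbb{R}[x]/(x^2+1)$ the equation $t^2 = -1$ has exactly the two solutions $\pm[x]$, and the conjugation map
$$a + b[x] \;\longmapsto\; a - b[x] \qquad (a, b \in \mathbb{R})$$
is a field automorphism over $\mathbb{R}$ interchanging them; so nothing internal to the field singles out one of the two as “$i$”.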

I realize you probably get it by now.

Category theory is a marvelous conceptual tool for helping keep track of all these levels and distinctions. For me it was a godsend – categorical thinking imparts a level of precision and control which I could scarcely imagine otherwise. I do recognize of course that many mathematicians get by without it and produce fine work, but I harbor the strong suspicion that the design of a computer-aided system for human mathematicians will need the same level of precision and control built in, perhaps wisely left invisible to the user unless she asks for it.

Posted by: Todd Trimble on October 2, 2009 2:19 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

TT: I realize you probably get it by now.

Yes, and I knew all along (even without being deeply trained in category theory) that by taking into account (or dropping) extra structure the set of isomorphisms may decrease (or increase).

It is just the added moral of what one should or should not do that permeates much of our discussion, and that strongly irritates me because of its vagueness.

TT: I harbor the strong suspicion that the design of a computer-aided system for human mathematicians will need the same level of precision and control built in

FMathL has even more precision and control built in, since it dispenses with the dubious categorial moral and clearly distinguishes between equal and (essentially) the same.

Posted by: Arnold Neumaier on October 4, 2009 9:57 AM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

Arnold wrote:

But even in this special case, there is something strange in that your moral demands that, in considering $\mathbb{R}[x]$, one immediately is forced to forget the definition of $\mathbb{R}[x]$ and only keep the object up to isomorphism.

Pfft. “Moral demands.” I pray that I’ve now made clear here that imagined moral demands have nothing to do with it – I was attempting to clarify a mathematical point.

I am slowly adapting to this strange world for the sake of this discussion, since I need to see how to account in FMathL for its existence, but I thoroughly dislike it.

It saddens me to hear that. Category theory could be your friend. I humbly submit that it only seems strange because your acquaintance with it is still superficial, and you have yet to experience its truly amazing power.

In FMathL one can choose what one wants to forget and what one wants to remember, and remembering is the default, as in Bourbaki-style mathematics, since this provides the needed simplicity.

That sounds to me very much in a categorical spirit. In other words, we choose the right category to suit our exact needs, depending on how much we want to remember or forget. (I should say “concrete category” to make my point, but you might find that distracting.) The morphisms preserve however much structure we want to remember. If we want to remember more or forget some, we shift to another category.

I can believe that many mathematicians perform such mental shifts unconsciously or semi-consciously. Making such shifts explicit through category theory opens up whole other worlds of insight, but I appreciate the point that many mathematicians “don’t want to know”, they just want to solve their problem. But deep down below in the system, the shifting will be registered. Or so I would think.

I do not think that Bourbaki is any less structural in content than what category theorists get with the moral structural straitjacket imposed.

I do not deny that it is very useful for certain categorical constructions to think that way, and FMathL allows one to forget everything undesired upon demand. But most mathematicians need not make use of this most of the time. It would only complicate their life. That’s why, for most of them, it is a straitjacket.

I think the explanation of ‘straitjacket’ here is consonant with what I just said: most people don’t want to know. But how strong is the argument that it would be a ‘straitjacket’ for the internal workings of the system? I’m not a programmer, but I would think quite the contrary.

It might not be a bad idea to have some straight talk about this ‘morals’ business. Presumably you’re still annoyed that some past business about $C_{a b c d}$ has not been formally clarified to your satisfaction, so it’s still at the level of “moral demands” for you. Since you had earlier asked for my opinion about this, I’ll see if I can have a look at it and report what I think.

The situation is similar to that of constructive mathematics vs. classical mathematics. Some people like to work within the constraints of intuitionistic logic, but for most mathematicians, to be forced into that would be an unwelcome straitjacket. A hundred years ago, there were heated debates about who is right. But history decided that both camps should have their way, and they are accommodated peacefully side by side in modern mathematics.

This could be a tangent (or another hydra’s head about to grow), but there’s a little more than meets the eye here. The modern understanding (by categorists at any rate) is that while intuitionistic logic does place restrictions on which inferences are deemed valid, the flip side of the coin is that intuitionistic logic enjoys a much broader semantics. This isn’t Brouwerian philosophy; it’s a mathematical fact: the internal logic of a topos (a type of “category of sets” much, much wider in scope than models of ETCS) is in general intuitionistic. It would take a while to give examples of the powerful consequences of this fact, but suffice it to say that logical ‘straitjackets’ for some are semantic liberations for others.

It’s not just about Hilbert vs. Brouwer anymore and who was ‘right’; it’s a whole ‘nother ballgame.

Posted by: Todd Trimble on October 2, 2009 4:09 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

AN: I am slowly adapting to this strange world for the sake of this discussion, since I need to see how to account in FMathL for its existence, but I thoroughly dislike it.

TT: It saddens me to hear that. Category theory could be your friend. I humbly submit that it only seems strange because your acquaintance with it is still superficial, and you have yet to experience its truly amazing power.

I have nothing against category theory in the reading that Serge Lang gives it in the 1970 edition of his book Algebra. And I read modern category theory books and articles with this reading, without having any difficulty beyond not being fluent enough (so that I need to “translate” to my intuition rather than “read” as I’d read Serge Lang). I know category theory superficially only in this sense of not being fluent. But I have seen quite a lot of what category theorists do (though I have not bothered to understand proofs and advanced concepts in enough detail to be able to work with them easily).

And I recognize the usefulness of category theory for certain investigations. In FMathL, categories are even logically prior to sets, as you can see from my exposition in the framework paper quoted at the top of this blog post.

The “strange world” that I thoroughly dislike only referred to the dubious categorial moral of extreme structuralism that I was exposed to here for the first time.

AN: In FMathL one can choose what one wants to forget and what one wants to remember

TT: That sounds to me very much in a categorical spirit.

It is in a structural spirit common to most mathematics, but not primarily in a categorial spirit. Categories are in FMathL only a useful tool, not the spectacles through which one has to see everything.

TT: I can believe that many mathematicians perform such mental shifts unconsciously or semi-consciously.

While I believe that most mathematicians have an intuitive notion of the concepts that, most of the time, does not require them to do any such mental shifts at all.

They only interpolate such mental shifts when someone else challenges them to write down their statements in full formality. But even that is usually taken to be ZF, not category theory, so they then perform in fact quite different mental shifts (such as introducing tuples, etc.).

TT: But deep down below in the system, the shifting will be registered. Or so I would think.

Let’s see how Toby Bartels handles the binary category challenge. This will give an indication of how many such shifts are needed in a categorial foundation and how few are needed in FMathL.

TT: some past business about $C_{a b c d}$ has not been formally clarified to your satisfaction, so it’s still at the level of “moral demands” for you. Since you had earlier asked for my opinion about this, I’ll see if I can have a look at it and report what I think.

Yes please. This would help reduce my irritations.

AN: constructive mathematics vs. classical mathematics. […] history decided that both camps should have their way, and they are accommodated piecefully side by side in modern mathematics.

TT: the internal logic of a topos (a type of “category of sets” much, much wider in scope than models of ETCS) is in general intuitionistic. […] It’s not just about Hilbert vs. Brouwer anymore and who was ‘right’; it’s a whole ‘nother ballgame.

I know that; I spent quite some time on understanding what goes on in topos theory (though not at the depth to become an expert).

This is part of what I meant with my statement about what history decided.

Posted by: Arnold Neumaier on October 4, 2009 10:26 AM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

Let’s see how Toby Bartels handles the binary category challenge.

I'm already handling one challenge! It is almost done, but it's hard to resist the urge to include everything that I consider the original should have included but didn't. (^_^)

Posted by: Toby Bartels on October 4, 2009 3:55 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

AN: Let’s see how Toby Bartels handles the binary category challenge.

TB: I’m already handling one challenge! It is almost done,[…]

Take your time; one after the other. I do not mind at all if the speed of the discussion slows down a bit.

After having solved your first challenge, I am also starting to work on a second challenge (by Mike) - to formalize arrow-reversal (and maybe other categorical constructions) ….

Posted by: Arnold Neumaier on October 4, 2009 6:25 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

TT: I can believe that many mathematicians perform such mental shifts unconsciously or semi-consciously.

While I believe that most mathematicians have an intuitive notion of the concepts that, most of the time, does not require them to do any such mental shifts at all.

They only interpolate such mental shifts when someone else challenges them to write down their statements in full formality.

I think you’re absolutely wrong about that. Mathematicians of all stripes spend a great deal of time investigating just in what generality theorems hold, and this very often involves some paring away of assumptions from the original context to see what will work. This is a shift away from the direction of thinking in terms of “maximal structure”. But it is an activity far different from proving things in full formality.

But even that is usually taken to be ZF, not category theory, so they then perform in fact quite different mental shifts (such as introducing tuples, etc.).

First, let’s agree that the times an “average” mathematician is challenged to prove her results in full formality, right down to a rock bottom of foundational set theory, are vanishingly rare. Second, most mathematicians have only the barest sliver of acquaintance with axiomatic set theory of any kind; they will have heard of ZFC of course and maybe even NBG, but they couldn’t say what those are in any detail, and they are rarely aware of any foundational set theories alternative to those, even by name only. This latter observation may be historical accident. So whatever you’re saying here is a red herring.

However, mathematicians in ever increasing numbers are aware of category theory as a fundamental tool in their work, and thinking in categorical terms is an indispensable part of their trade. Certainly this is true of vast swaths of algebraic geometers and algebraic topologists. I would even say that category theory plays a foundational role in a conceptual sense for such mathematicians (for example, quite a large number of algebraic geometers and topologists have a strong understanding of the basic role representability plays, and the import of the Yoneda lemma), and shifting between categories is basic to their practice.

Posted by: Todd Trimble on October 4, 2009 5:19 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

John wrote:

The disdain and lack of understanding with which most mathematicians view logicians and category theorists is evidence that most mathematicians would rather not know.

Arnold wrote:

Yes. They want something that eases their life, not something that requires them to pay attention to distracting details that do not matter for what they want to achieve.

Right. And most of the time, I’m just like that myself! I think it’s a fine attitude.

But in certain moods I want to know how everything works, which is why I don’t disdain logicians. And category theory — well, besides its ‘foundational’ role, it’s also practical math like any other kind of math. In fact, what’s great about category theory is its twofold nature. That’s why you see Urs here applying it to string theory, and Toby using it to study foundational issues.

JB wrote:

It’s bad if the casual user has to know these functors are lurking in the background. But that doesn’t mean it’s bad for a system to have these functors built into it.

Arnold wrote:

It is good for a system to be able to produce a version that can make these functors visible for specialists like you who want to think that way. But it would be far too cumbersome to represent everything inside in this way.

I’m not at all sure it needs to be cumbersome. It’s really just a sophisticated version of ‘coercion’ in typed programming languages, where — for example — you could have a datatype ‘integer’ and a datatype ‘floating point number’, and when you add an integer to a floating point, the compiler knows what to do. It essentially knows there’s a default inclusion function from integers to floating point numbers — and it lets you, the user, take that for granted.

Functors are just like functions, but at one higher level of sophistication.

For example, when I take the cartesian product of a group and a set, I know it makes sense to treat the result as a set. Why? Whether I know about it or not, I’m secretly using the forgetful functor from groups to sets.

Any decent computer system for mathematics should let me take the product of a group and set and start treating the result as a set without much fuss.

But it should also let me change my mind if I want, and say “I now want to treat groups as sets in some different way.” Only then should I need to know about functors.
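Here is one way the default behaviour might look in code; a minimal sketch, purely illustrative (the Group record, the carrier field, and the forget function are assumptions of the sketch, not part of any existing system):

    -- A group carries its underlying set; "forgetting" just projects it out.
    data Group a = Group
      { carrier :: [a]            -- underlying set (finite, for illustration)
      , unit    :: a
      , op      :: a -> a -> a
      , inv     :: a -> a
      }

    -- The forgetful functor U : Grp -> Set, on objects.
    forget :: Group a -> [a]
    forget = carrier

    -- The cartesian product of a group and a set, silently treated as a set.
    productAsSet :: Group a -> [b] -> [(a, b)]
    productAsSet g s = [ (x, y) | x <- forget g, y <- s ]

    -- Example: Z/2 as a group, times a two-element set.
    z2 :: Group Int
    z2 = Group { carrier = [0, 1], unit = 0, op = \x y -> (x + y) `mod` 2, inv = id }

    pairs :: [(Int, Char)]
    pairs = productAsSet z2 "ab"   -- [(0,'a'),(0,'b'),(1,'a'),(1,'b')]

The user writes productAsSet and never mentions the functor; only when they want a different way of regarding groups as sets would they swap out forget.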

Here’s an example of a sentence that any mathematician in my department would understand. A system that can’t deal with this is not a system I’ll want to use very much:

“Regarded as sets, the natural numbers and the integers are isomorphic. But regarded as monoids under addition, they’re not.”

There are some functors lurking here — that’s what the phrase “regarded as” is all about.
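(Spelled out: the map $n \mapsto (-1)^n \lceil n/2 \rceil$ is a bijection from the natural numbers to the integers, so the underlying sets are isomorphic; but no monoid isomorphism can exist, because every element of $(\mathbb{Z},+)$ is invertible while in $(\mathbb{N},+)$ only $0$ is, and isomorphisms preserve invertibility. A system would have to see that “regarded as” changes which of these two facts is being asked about.)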

Now, maybe it’s unrealistic to expect a computer-based system to handle sentences like this, or some formalized version of these sentences, without a lot of detailed instruction. Or maybe it’s unrealistic to expect it’ll happen anytime soon. If so, just ignore me. I’m perfectly happy to wait. And I certainly don’t want to hold back any projects you or anyone else might have!

Posted by: John Baez on October 2, 2009 8:58 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

JB: “Regarded as sets, the natural numbers and the integers are isomorphic. But regarded as monoids under addition, they’re not.” There are some functors lurking here — that’s what the phrase “regarded as” is all about.

yes, here the functor is needed in the formal representation, since the concept of isomorphism depends on which structure is considered.

But in many other cases, it does not matter at all whether extra structure is around.

JB: maybe it’s unrealistic to expect a computer-based system to handle sentences like this

None can handle this right now, but FMathL will do the right thing.

Posted by: Arnold Neumaier on October 9, 2009 1:20 PM | Permalink | Reply to this

Book Technology, English, Sci/Math Revolution; Re: Towards a Computer-Aided System for Real Mathematics

The case is made that the new (in the West) technology of book printing accelerated the change in the English language, which prompted the revolutions in Mathematics and Science. I suggest that online software can do this more, and faster, if done correctly.

Early Modern English and the Scientific Revolution.
Robert Bruen Harvard University January 1996

“… At the inception of the Early Modern period of English several English books appeared that should be noted. In 1522 Cuthbert Tunstall published the first book on arithmetic in England and in 1557 Robert Recorde published The Whetstone of witte, the first English algebra treatise. The first English agriculture book was printed in 1523, Book of Husbandry by Anthony Fitzherbert. The Governor in 1531 by Thomas Elyot was the first book on education in English. And in 1598 Francis Meres published Palladis Tamia with quotes of 125 English writers. The renowned Francis Bacon published his Advancement of Learning in 1605, a milestone in English publishing. John Stow’s Summarie of englyshe chronicles in 1618 showed the new appreciation of English. Lastly, Shakespeare’s influence began when in 1600 when Hamlet was performed for the first time and his plays were first published in 1623. The sixteenth and seventeenth centuries produced much more than these examples, but they highlight the substantive changes that were underway. The impact was not a simple improvement of the use of a language, but fundamental shift in the basic structure of the language, earning the designation of a revolution….”

“It should be elementary to see that both language and mathematics ability are fairly deep level characteristics of the human brain (or mind) and that improvements in ability at deep levels will have positive improvements in the world outside of our minds. It is also reasonably easy to accept that over the course of human history, different civilizations have reached different levels of achievement. Few early civilizations reached the level of Rome as an empire. The ancient Greek civilization reached high levels in government, architecture and philosophy. Other examples can be found of civilizations that specialized in some activity that was better than most others. Why did any of these cultures do so well in a particular area?”

“Whatever the answer is, I believe that England during the two centuries under discussion participated in this unknown answer, with the evidence shown mainly in language and science, the two most striking features of English intellectual life from the mid-XVIth to the mid-XVIIth century.”

Posted by: Jonathan Vos Post on October 1, 2009 10:35 PM | Permalink | Reply to this

Set^op in FMathL

In reading your draft posted above, I had to go back to the FMathL definition of “category” and try to understand it. The non-structuralness of definitions in FMathL makes them difficult for me to understand, so my apologies if I’ve gotten something wrong here; please correct me. But right now it seems to me that FMathL suffers from some of the same problems (such as regarding opposite categories) that you were finding fault with structural set theory for.

Preliminary note: it seems that FMathL says “structures of a category $C$” for what category theorists universally call “objects of $C$”; presumably this is because FMathL uses the word “object” to mean anything in its universe. I hope you won’t mind if I continue to say “object” in the category-theorists sense, and say “FMathL-object” for the other meaning; otherwise I would get hopelessly confused.

Now in section 2.13 of FMathL, for any two FMathL-objects $A$ and $B$, there is specified a FMathL-object $A\to B$ whose elements are the things called “arrows” from $A$ to $B$. A category $C$ then specifies a collection of “structures” (which, as I said, I’m going to call “objects”) and a collection of “morphisms,” each of which is an “arrow” between two $C$-structures. It seems to me, then, that $A\to B$ has to contain as elements all the arrows from $A$ to $B$ which occur in any category which might happen to have $A$ and $B$ as objects. Moreover, the identity $Id$ and composition operation $\diamond$ are universally defined for arrows, without explicit reference to any category.

Now axiom A25 says that $f$ is a “map” from a set $A$ to a FMathL-object $Z$ if $f\in(A\to Z)$. That is, a “map” is an “arrow” whose domain is a set. In this case one asserts that (among other things) $f$ can be applied to exactly the elements of $A$ and will always produce an element of $Z$, and that if $f,g: A\to Z$ are maps, then $f(x)=g(x)$ for all $x\in A$ implies $f=g$.

Now suppose we have a category $C$ whose objects are sets. Then its morphisms are (in particular) arrows between sets, and so they are maps (by definition of “map”). Therefore, by the axioms for maps, they can be applied to elements of their domain to produce elements of their codomain, and they are determined uniquely by the results of all such applications (i.e. extensionality). Hence, it seems that any category in FMathL whose objects are sets must be given as a subcategory of $Set$. But there are plenty of naturally occurring categories whose objects are sets which are not of this form, such as $Rel$ and $Set^{op}$. Now we could of course define $Rel$ and $Set^{op}$ by changing their objects slightly so that they are no longer exactly sets, but weren’t you insisting above that a category and its opposite should have the same objects? How do you define opposite categories in FMathL?

Posted by: Mike Shulman on October 4, 2009 6:57 AM | Permalink | PGP Sig | Reply to this

Re: Set^op in FMathL

MS: presumably this is because FMathL uses the word “object” to mean anything in its universe.

yes.

MS:I hope you won’t mind if I continue to say “object” in the category-theorists sense, and say “FMathL-object” for the other meaning;

No, this is fine with me.

MS: It seems to me, then, that A→B has to contain as elements all the arrows from A to B which occur in any category which might happen to have A and B as objects. Moreover, the identity Id and composition operation ⋄ are universally defined for arrows, without explicit reference to any category.

Yes. The metaset of all objects together with the metaset of all arrows is a metacategory of which all categories are subcategories. Thus FMathL has for categories what a universe is for sets.

MS: Now axiom A25 says that f is a “map” from a set A to a FMathL-object Z if f∈(A→Z). That is, a “map” is an “arrow” whose domain is a set.

It said so, but I had forgotten to add the obvious qualification that $f\in Hom(Set)$. Of course, in different categories whose objects are sets, the properties of the axiom need not hold.

I corrected this in categories.pdf and in fmathl.pdf. Thanks for pointing this out.

MS: How do you define opposite categories in FMathL?

The impasse you referred to has gone away.

But I see that FMathL currently does not have enough constructors for abstract arrows to show that an opposite exists. I’ll add these in due time. This will take a while since I first need to check out what exactly is needed for various constructions.

Note that even in the canonical treatment, it is not clear what it means formally for abstract arrows to be reversed in a way that guarantees constructively that the result exists as an arrow in the formal sense of a type theory with two types, object and arrow.

For example, in Wikipedia, the construction happens only on the metalevel, by giving a recipe for how to change a mathematical statement involving a category into a statement about the corresponding dual category.

But this approach is formally defective since it is circular for the interpretation of statements involving both a category and its dual.

This also reminds me that I do not know how one defines categories (and their opposites) in SEAR. After my intervention, you had agreed that so far you have there only a metacategory of sets. But for complete foundations, you need a formal concept of (at least small) categories.

Could you please figure out how to do this, and add the missing definitions to the SEAR page?

Posted by: Arnold Neumaier on October 4, 2009 4:56 PM | Permalink | Reply to this

Re: Set^op in FMathL

This also reminds me that I do not know how one defines categories (and their opposites) in SEAR. After my intervention, you had agreed that so far you have there only a metacategory of sets. But for complete foundations, you need a formal concept of (at least small) categories.

That's easy enough; I wrote it up at [[categories in SEAR]].

Posted by: Toby Bartels on October 4, 2009 11:31 PM | Permalink | Reply to this

Re: Set^op in FMathL

Note that even in the canonical treatment, it is not clear what it means formally for abstract arrows to be reversed in a way that it is guaranteed constructively that the result exists as arrow in the formal sense of a type theory with two types object and arrow.

I can’t figure out what you mean. If a category consists of sets/types/classes $C_0$, $C_1$ and mappings/functions $s,t,c,i$ satisfying identities, then its opposite is the similar structure obtained by switching $s$ and $t$. What’s not formal about that? This can take place in a set theory, or a type theory, or a class theory, or whatever.

Posted by: Mike Shulman on October 5, 2009 4:18 AM | Permalink | PGP Sig | Reply to this

Re: Set^op in FMathL

its opposite is the similar structure obtained by switching $s$ and $t$.

Technicality: and reversing $c$.
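Putting the description and the technicality together, a minimal sketch of the construction might look like this (purely illustrative; the field names and the Maybe-valued partial composition are assumptions of the sketch, not the actual SEAR or FMathL formalization):

    -- A category presented by the data C0 (objects), C1 (arrows) and s, t, c, i.
    data Category obj arr = Category
      { src  :: arr -> obj                -- s
      , tgt  :: arr -> obj                -- t
      , comp :: arr -> arr -> Maybe arr   -- c; defined only when tgt f == src g
      , idn  :: obj -> arr                -- i
      }

    -- The opposite category: switch s and t, and reverse the order of composition.
    opposite :: Category obj arr -> Category obj arr
    opposite cat = Category
      { src  = tgt cat
      , tgt  = src cat
      , comp = \f g -> comp cat g f
      , idn  = idn cat
      }

Applying opposite twice gives back the original data, and the objects of the opposite are literally the same as those of the original.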

Posted by: Toby Bartels on October 5, 2009 12:00 PM | Permalink | Reply to this

Re: Set^op in FMathL

MS: I can’t figure out what you mean. If a category consists of sets/types/classes $C_0$, $C_1$ and mappings/functions $s,t,c,i$ satisfying identities, then its opposite is the similar structure obtained by switching $s$ and $t$. What’s not formal about that?

You refer to the formalization given by Toby (thanks!) in categories in SEAR, where your solution is formally ok, whereas I was questioning the description given in Wikipedia. (Wikipedia’s definition of a category does not have the formal objects s and t.)

Posted by: Arnold Neumaier on October 5, 2009 1:41 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

This thread may be getting too long, for technical reasons. Over on TeXnical Issues, I wrote:

Mike wrote:

Do you experience the same effect I do that commenting on a post with very few existing comments is much speedier than on one (like this one) that has lots?

It seems that I do.

If so, then maybe we should shift our main discussion, which is getting kind of venerable, to a new post.

That would be fine. And you might take that opportunity to summarize something about the different positions that you, Arnold, Todd and Toby are taking — or get them to write summaries. I suspect that most of our readers, who haven’t studied formal logic, are by now quite bewildered. Since I have studied it, I’m enjoying your discussion immensely: it’s the first time I’ve seen someone mount a sustained attack against ‘structural foundations’ that’s significantly more intelligent than “categories suck” or “ZFC was handed down from heaven”.

Then Todd Trimble replied — but I’m moving his reply here, because it’s all about the subject of this thread:

I don’t see this in terms of a sustained attack on structural foundations. I do think the enunciation of structuralist principles has taken Arnold aback, and he is reacting, and some of us (principally Mike and Toby) are reacting to Arnold’s approach to foundations as given in his documents. There have also been a lot of conflicts and endless discussion over terminology, such as over the words “material” and “nominal”.

(I’ve temporarily stopped commenting because I find it exhausting and time-consuming work. I sort of promised I would return to some issues about “morals”, but it’s hard carving out the big blocks of time it takes to try to do a careful job, and just knowing that it will still take at least 20 more long comments before things even begin to get sorted out takes the wind out of my sails before I make a decent start. The temptation to sit back and let Mike and Toby and Arnold “have it out” is very great indeed.)

What I would call a sustained attack against categorical (structural) foundations, this time by professional logicians who are very explicit about proclaiming their expertise in foundational matters, was made on the list FOM (Foundations of Mathematics) from about 1997 to 1999. I’m not sure if Arnold has run his material (no pun intended!!!) past the people there, but it would probably result in a lot of heated exchanges (making the discussion here look like a walk in the park), and maybe some light as well. It would be very different from the exchanges here though.

I agree that a summary of main points in the present discussion could be useful.

Posted by: John Baez on October 11, 2009 11:03 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

Todd wrote:

I’ve temporarily stopped commenting because I find it exhausting and time-consuming work.

Maybe you’re taking it too seriously. It’s just a conversation, after all.

For me, when math becomes “exhausting and time-consuming work”, it means I’m doing something wrong: usually, committing myself to some unrealistic pre-established plan of how things are supposed to work, instead of going with the flow.

Admittedly, it can even be exhausting and time-consuming to break my habit of setting myself unrealistic goals that wind up wearing me down. But at least in this case it’s clear how silly the struggle is, since it’s a struggle to stop struggling.

The temptation to sit back and let Mike and Toby and Arnold “have it out” is very great indeed.

If you can’t find a way to have fun while staying in the conversation, maybe that’s a temptation worth succumbing to.

What I would call a sustained attack against categorical (structural) foundations, this time by professional logicians who are very explicit about proclaiming their expertise in foundational matters, was made on the list FOM (Foundations of Mathematics) from about 1997 to 1999.

I’ve seen that one — it’s visible online — but I was intending to rule out that one by referring to attacks that are significantly more intelligent than “categories suck” or “ZFC was handed down from heaven”.

Here I was probably being quite unfair to many participants in the FOM discussion. However, that old discussion seemed very ill-tempered and uninformed to me — and I may not be the only one who feels that way.

Posted by: John Baez on October 11, 2009 11:47 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

Maybe you’re taking it too seriously.

Yes, I am taking the points raised seriously, as are all the other participants. I’m not in this for s**ts and giggles, you know. Too seriously? We may disagree about that, but I’ll keep what you say in mind.

For me, when math becomes “exhausting and time-consuming work”, it means I’m doing something wrong: usually, committing myself to some unrealistic pre-established plan of how things are supposed to work, instead of going with the flow.

Hm. I have a sense that whatever comparison you have in mind is largely apples to oranges. I think you were talking about math (as in, writing papers), not a protracted and often contentious public debate, right?

All I was saying is that to reply carefully to a slew of misunderstandings (often infused with resistance) can be very time- and energy-consuming. I don’t think it necessarily has anything to do with “doing something wrong”. I think there are others who would agree with that.

If you can’t find a way to have fun while staying in the conversation, maybe that’s a temptation worth succumbing to.

That sounds a mite condescending, John. I’m an adult, and I think I’ll decide that myself without your help.

However, that old discussion seemed very ill-tempered and uninformed to me

It was indeed. Sometimes this hasn’t seemed much better to me.

Posted by: Todd Trimble on October 12, 2009 1:58 AM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

Todd: I’m sorry I annoyed you. In the future I’ll try to refrain from giving you unasked-for advice.

One clarification. I wrote:

For me, when math becomes “exhausting and time-consuming work”, it means I’m doing something wrong: usually, committing myself to some unrealistic pre-established plan of how things are supposed to work, instead of going with the flow.

You wrote:

I think you were talking about math (as in, writing papers), not a protracted and often contentious public debate, right?

No, I was talking about math in all its forms: scribbling in my notebook, writing formal papers, writing informal stuff like This Week’s Finds, going to seminars, going to conferences, teaching classes, meeting with grad students, having discussions and arguments in a person-to-person way as well as on newsgroups and blogs, running newsgroups and blogs, and so on. To me all this stuff forms a kind of continuum.

Most of the time most of this stuff seems like great fun. That sense of fun is what gives me the energy to do it. But sometimes I get myself into something that starts seeming like “exhausting and time-consuming work”. For example, debating string theory with Lubos Motl on sci.physics.research, or explaining math to somebody who keeps asking questions but doesn’t think hard enough about the answers, or writing a big paper, or various other things.

It’s pretty rare that this stuff gets me down. But I’ve recently emerged from a painful couple of years where lots of things started feeling like exhausting and time-consuming work. It took me by surprise, since each individual thing seemed like lots of fun: the problem was that I’d committed myself to too many, and wasn’t willing to break any of these commitments.

And so, I’m trying to understand my limits and avoid getting in such a mess in the future, which is why your comment about “exhausting and time-consuming work” made my ears perk up.

Posted by: John Baez on October 12, 2009 5:46 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

Thanks for the clarifications. I do know that you’ve had your share of online debates, particularly in physics (where there is much more argument than in math for all the obvious reasons), and even more particularly when you were very active on sci.physics.research. We in math are blessed that due to the degree of certainty possible in our field, resolution usually comes pretty quickly and people are usually quick to admit when they are wrong about something purely mathematical.

Mathematical foundations, being so wedded to philosophy, is perhaps the one area where this doesn’t hold true. It may seem like an obvious thing to say, but the basic disagreements between “the materialists” [in Mike’s sense of the word] and “the structuralists” seem at bottom philosophical; I mean to return to this sometime in more detail.

It’s helpful to hear you talk about your experiences with time and energy in “I-language”, so thanks, and excuse me for my earlier display of general inborn touchiness.

Posted by: Todd Trimble on October 12, 2009 8:11 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

visible online

Taking a quick peek at FOM in February 1998 we have posts such as

  • categorical dis-foundations
  • categorical pseudofoundations
  • HA/PA, categorical (pseudo) foundations
  • set/cat “foundations”

A random sample of a likely looking post claimed that categorial foundations were not legitimate until they could answer questions like ‘can you apply the power set operation an uncountable number of times?’ When on earth would a practicing mathematician need to do such a set-theoretic operation? This reminds me of Grothendieck’s comment that his homotopy theory would be viewed as useless unless it could provide a calculation of $\pi_{147}(S^{123})$. The whole point of providing new/different foundations is to take a new view of how things are done, not hash out the same old questions.

Anyway, I’m not qualified to comment further. I’m just glad I wasn’t involved in the FOM debate (I was discovering particle physics via New Scientist at the school library at the time!) - seeing the inanity and rudeness there is enough to get me riled.

Posted by: David Roberts on October 12, 2009 9:25 AM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

The set theorists at FOM were suggesting that an alternative foundation should either give new answers to old questions or enable new questions to be asked about set theory. Note their insistence that FOM and POM be kept separate.

IMO Categorial foundations are not really a new FOM; they are a new POM. Their real contribution is to change the language of discourse so that some of the set theorists’ old questions become nonsensical (ie not linguistically well formed). Given time and research, we may find significant new questions that can be asked in Categorial P/FOM but are not meaningful in ‘material mathematics’ (terminology as used in this thread).

Note that, as set theorists, these guys have a vested interest - ‘as good as’ or ‘slightly better’ are not enough to justify reassessing their entire careers and world view. They have too much invested in the old world order.

I am currently a transportation planner but, until two years ago, I was a computer programmer. In my ‘previous life’ I witnessed how a more restrictive language (object orientation) enabled more powerful programming, by reducing the need for the programmer to consider irrelevant alternatives (ie by making the language do more ‘book-keeping’ so that the human user can do more ‘creative thinking’). I expect that categorial foundations may do that … eventually.

However, these issues are completely at odds with Arnold Neumaier’s world view. He wants to be able to ignore ideology while doing mathematics. However both the ‘material mathematics’ and the ‘see no evil’ structuralism are attempts to philosophically underpin mathematics with an ideology.

Posted by: Roger Witte on October 13, 2009 12:58 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

The set theorists at FOM were suggesting that an alternative foundation should either give new answers to old questions or enable new questions to be asked about set theory. Note their insistence that FOM and POM be kept separate.

IMO Categorial foundations are not really a new FOM; they are a new POM. Their real contribution is to change the language of discourse so that some of the set theorists’ old questions become nonsensical (ie not linguistically well formed). Given time and research, we may find significant new questions that can be asked in Categorial P/FOM but are not meaningful in ‘material mathematics’ (terminology as used in this thread).

The division between POM (Principles of Mathematics) and FOM (Foundations of Mathematics) is something a lot of people, particularly a lot of people here, would take issue with. Surely one can speak of foundational principles. (One such guiding principle that I would consider foundational by its very nature is, paraphrasing Lawvere, that the substance of mathematics resides in Form – isomorphism-invariant structure, as defined, for example, by universal mapping properties.)

(Surely the anti-CFOM folks at the FOM list took this distinction between POM and FOM from Kreisel, who in a famous critique wanted to draw a distinction between foundations of mathematics and organization of mathematics. Those folks seem to want to use that division as one of a series of maneuvers – which might also include the demand that categorical foundations solve old problems of interest to said people – to secure their own turf. I’m honestly not sure why they feel a need to defend it so vitriolically though.)

In any event, the most important development which emanated from foundational studies from a categorical point of view, with Lawvere as the principal driving force, was elementary topos theory. I assure you that this development enabled the asking of many, many new questions and the development of further techniques which were not at all seeded in the material set theory championed by Zermelo and others. This goes way, way beyond the purpose of changing the language of discourse so that some statements become nonsensical.

However, these issues are completely at odds with Arnold Neumaier’s world view. He wants to be able to ignore ideology while doing mathematics. However both the ‘material mathematics’ and the ‘see no evil’ structuralism are attempts to philosophically underpin mathematics with an ideology.

While there may be a kind of philosophical ideology at work in the ideas of “see no evil”, and that (“done right”) structural principles would forbid that the same object “belong” to two different categories, I find highly dubious the claim that Neumaier’s approach is ideology-free. The claims made about “dressed sets”, to give one example, seem very ideological to me. The claim Neumaier made that FLT was true before Wiles and Taylor proved it (before humans even existed, presumably) also points to some ideological commitments – it is not obviously “true” to me.

Posted by: Todd Trimble on October 13, 2009 4:47 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

However, these issues are completely at odds with Arnold Neumaier’s world view. He wants to be able to ignore ideology while doing mathematics.

I don’t see this; maybe this is what AN wants or pretends, but it is not what he does: his proposal is heavily loaded with his own philosophy, witness all the fuss in this thread.

However both the ‘material mathematics’ and the ‘see no evil’ structuralism are attempts to philosophically underpin mathematics with an ideology.

This is why (unfortunately) such attempts are likely to go nowhere: if anything is to have even a slight chance of success, it should be absolutely ideology-free and foundation-agnostic.
There has been a previous attempt at a theory-free digital math library by Stuart Frazier Allen, and that project does appear among the links on Arnold Neumaier’s FMathL page, but it seems to have died.
The TPTP theorem-proving repository, though limited to first-order logic, is also fairly agnostic about foundations: write down whatever axioms you like and off it goes. The proofs aren’t usually legible, but there is no question about their validity, assuming of course that the axioms are consistent and the prover is not (too) buggy ;-)

Posted by: J-L Delatre on October 13, 2009 5:17 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

What I would like to hear from participants is whether anything changed, or was made clearer, for them through the discussion. In particular, were Toby, Mike and Todd struck by anything in each other’s responses to Arnold?

I find it is often those nuanced differences between people who share much in their points of view which are the most illuminating.

If I sense that my opinions cannot possibly be changed by a debate I lose interest. A lover of the truth should desire to have their own position be shown to be wrong.

Posted by: David Corfield on October 12, 2009 9:53 AM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

While waiting to make further technical contributions until the move from this overly long thread to a new one actually happens, let me answer this one:

DC: What I would like to hear from participants is whether anything changed, or was made clearer, for them through the discussion. In particular, were Toby, Mike and Todd struck by anything in each other’s responses to Arnold?

My main reason for most of the present discussion was/is that I need to figure out which interpretation category theorists intend when writing statements such as those discussed, so that my FMathL system can have a mode that recognizes when this interpretation is adequate and intended. That’s why I spent much effort on understanding in detail how the interpretation of “structural” (and what it implies for reading informal math) by Toby, Mike and Todd differs from the interpretation of “structural” by Bourbaki, Lang, or myself.

I learnt a lot during the discussion. I got some useful feedback about problems in the current version of FMathL, which was the intended goal of the thread. But its actual theme turned out to be questions of structure in mathematics.

I learnt about how different mathematicians can arrive at very different interpretations of the same mathematical text. I learnt in particular how Bourbaki’s view of structures is re-viewed in the category/SEAR framework.

I don’t think I’ve been elevated by that to a deeper understanding of any of the structures I knew before; categories just provide a different view of what I could already express fully in the traditional way. But I learnt a new language and perspective, and along the way I also learnt more about my own views on, e.g., what “material” should mean in mathematics, or how to interpret the intersection of categories.

I still do not understand fully the moral stuff discussed, except if it is agreed to be completely optional and context-dependent (as seems to have been the tone of the later discussion on this). Thus I still hope to get Todd’s view of the $C_{abcd}$ problem to clarify the issue.

I also enjoyed playing the advocatus diaboli for SEAR, and facing the challenges in the discussion. They forced me to look more closely into things I’d otherwise have viewed more superficially.

Posted by: Arnold Neumaier on October 14, 2009 2:14 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

That’s a helpful summary.

While waiting to make further technical contributions until the move from this overly long thread to a new one actually happens…

If there’s still the desire to continue on anyone else’s part of course we can have a new thread. This is probably the longest we’ve had at the Café. Does anybody want to carry on discussions?

Posted by: David Corfield on October 14, 2009 2:39 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

If there’s still the desire to continue on anyone else’s part of course we can have a new thread.

John suggested that a new thread could also usefully start by trying to reorient any readers who may have gotten lost due to not having as much familiarity with formal logic. I’ve been working on a post that will (hopefully) do that and also summarize what I’ve gotten out of the discussion, but it may be a little bit before it’s ready.

Posted by: Mike Shulman on October 14, 2009 4:00 PM | Permalink | PGP Sig | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

John wrote “For me, when math becomes exhausting and time-consuming work, it means I’m doing something wrong: usually, committing myself to some unrealistic pre-established plan of how things are supposed to work, instead of going with the flow.”

I think John’s comment, whilst fine in general, misses that Arnold is intending (contingent on funding, etc.) to actually implement, with collaborators, some version of these ideas in a computational mathematics meta-system. No matter how tiring sorting things out in the discussion here is, it will pale into insignificance compared to the pain of changing things midstream once a lot of concrete “code” has been written. (Of course, that “pain now to save more pain later” motivation only directly applies to Arnold.)

(Incidentally, I abandoned coming up with examples for my long-ago enquiry about searchability because I think it was based upon a misreading of the documentation about the intended “level” of FMathL: I was thinking of being able to look for special-case high-level mathematical information (eg, various parametrisations of a plane, or finding new and useful but non-obvious identities in the style of the Sherman-Morrison-Woodbury formula), whereas from (vaguely) following this discussion I see the emphasis in FMathL is much more on constructions.)

Posted by: bane on October 12, 2009 12:31 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

bane: I was thinking of being able to look for special-case high-level mathematical information (eg, various parametrisations of a plane, or finding new and useful but non-obvious identities in the style of the Sherman-Morrison-Woodbury formula)

The state of the art is Wolfram|Alpha, according to the web site “the first step in an ambitious, long-term project to make all systematic knowledge immediately computable by anyone.”

Unfortunately, entering “Sherman-Morrison-Woodbury formula” results (as in many other cases) in the answer “Wolfram|Alpha isn’t sure how to compute an answer from your input”.

Combined with a future FMathL system, it should do at least a bit better….

Nevertheless, how would you like to ask for the Sherman-Morrison formula if you don’t know these names?

Posted by: Arnold Neumaier on October 12, 2009 1:58 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

When I said “in the style of” I was thinking more of finding new things like the matrix formulae described in the download here based upon some variant mathematical technique. (That’s one way I tend to work: figure out some rough statement of what I want to do, then do a brute search trying to find any existing results such that both the problem and the result can be twisted to match, and if that fails, start from scratch myself.) The point is that, whilst the name Sherman-Morrison-Woodbury is designed to credit (some of) the discoverers of this identity, the terminology isn’t meaningfully connected to the content of the result. (Even the alternate name “matrix inversion lemma” is very non-specific, although “matrix inverse row-reduced-update lemma” is a bit of a mouthful.) If one thinks that discovering results one can make use of in existing literature is a fruitful way to do mathematics (and I’ll freely admit that it’s not clear that you’re not much better served by getting yourself into a better department with more human colleagues to discuss stuff with and by attending more conferences and talks [ie, using human beings as intelligent databases :-) ]), then this is a problem that will at some point be addressed in computer-assisted mathematical research. That obviously doesn’t mean that it’s one that is remotely within the purview of FMathL, particularly given restricted manpower. (As for how one could do this, I don’t know exactly, but knowing how useful brute-force text search and regular-expression matching are for “quick hack” data mining of computerised data, I don’t think it’s a priori impossible to do something.)

Perhaps a better example of the motivation would be to expand upon my plane parametrisation comment. I’ve just spent a while, after doing a lot of searching for plane-defining equations, figuring out that one of the commonly presented formulations actually reduces to something much simpler (but non-intuitive) in my particular problem setup, but I can’t recall having seen this anywhere. I’m certain it must have been found many times in the body of mathematical literature (and might be well-known in some field I’m unfamiliar with), but I couldn’t find it. Now it hasn’t done me any harm to have derived this myself, but the week that I spent iteratively refining the notation and noticing simplifications that only became apparent with those changes could have been spent doing genuinely new stuff. (Maybe it shouldn’t even have taken me a week to spot these things, but that’s beside the point I’m making.)

Posted by: bane on October 12, 2009 2:52 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

bane: I’ve just spent a while, after doing a lot of searching for plane-defining equations, figuring out that one of the commonly presented formulations actually reduces to something much simpler (but non-intuitive) in my particular problem setup, but I can’t recall having seen this anywhere. I’m certain it must have been found many times in the body of mathematical literature (and might be well-known in some field I’m unfamiliar with), but I couldn’t find it. Now it hasn’t done me any harm to have derived this myself, but the week that I spent iteratively refining the notation and noticing simplifications that only became apparent with those changes could have been spent doing genuinely new stuff.

I think that given a piece of mathematics, asking the system to automatically refine and simplify the presentation is something that could be eventually taught to FMathL, or at least it could support a human-aided refinement process. Maybe it would reduce the week to a day….

Some improvements, such as changing the notation, are very easy to do with a semi-formalized math reader, while changing a variable a to b in LaTeX is a chore, since there are many a’s that should not be changed because they occur in operators like \max, in macros, or in non-formula words.

Most simplifications people do consist in using certain problem-area-dependent cues to see where likely simplifications are possible, and trying these in turn. With a large enough pool of simplifications done by mathematicians, the system could probably be taught to do the same.

Finding out whether someone else has derived a given formula is easy only if the formula looks exactly the same apart from notation. Abstracting from notation can be automated, but recognizing “closely related” things is very difficult, since the concept is hard to formalize in a useful way.
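As a toy illustration of the easy half, abstracting from notation can be done by renaming variables canonically, in order of first appearance, before comparing syntax trees; the hard half, recognizing “closely related” formulas, is exactly what such a sketch does not address. (The Expr type and the function names below are assumptions of this illustration, not anything in FMathL.)

    import qualified Data.Map as M

    -- A tiny expression type: variables, integer constants, and applications.
    data Expr = Var String | Const Integer | App String [Expr]
      deriving (Eq, Show)

    -- Rename variables canonically, in order of first appearance.
    canonicalize :: Expr -> Expr
    canonicalize e = fst (go e M.empty)
      where
        go (Var v) env = case M.lookup v env of
          Just v' -> (Var v', env)
          Nothing -> let v' = "x" ++ show (M.size env)
                     in (Var v', M.insert v v' env)
        go (Const c) env = (Const c, env)
        go (App f args) env =
          let step (acc, en) a = let (a', en') = go a en in (acc ++ [a'], en')
              (args', env')    = foldl step ([], env) args
          in (App f args', env')

    -- "a + b" matches "p + q" up to notation, but "a + a" does not.
    sameUpToNotation :: Expr -> Expr -> Bool
    sameUpToNotation e1 e2 = canonicalize e1 == canonicalize e2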

Posted by: Arnold Neumaier on October 12, 2009 4:22 PM | Permalink | Reply to this

Re: Towards a Computer-Aided System for Real Mathematics

(I feel guilty about adding to this mega-thread, but it seems better than switching to private email; if not, could a café proprietor please advise.)

FWIW, one of the notational changes was that I was working with various geometrical combinations of “multivariate degree-1 polynomials” and decided to add a dummy variable “fixed at 1” multiplying the constant term, so that I wasn’t tempted to try to keep track of the distinctions in products of terms; some patterns then became simpler and hence much easier for me to see. Likewise, not explicitly turning a homogeneous mapping (a mapping between cameras) into an inhomogeneous mapping too early at the symbolic level made other patterns clearer. So when I say “notation” I mean that the actual equations didn’t change, but the way of representing the notions changed, and the time-consuming difficulty wasn’t converting notation but seeing that changing notation would be a beneficial thing to do. Of course, FMathL sounds like it should help with that to an extent.

Regarding searching, I think it partly depends what the human user wants to do. If it were possible, I wouldn’t regard it as weird to say “find me papers that contain a matrix term $(A+BB^T)^{-1}$” (where I mean that the search language knows that the literal names $A$ and $B$ can be changed, and ideally that $A$ and $B$ belong to some “superclass” of matrices, so that if, eg, the system knows finite-dimensional linear maps between vector spaces are essentially matrices then that expression for those is potentially of interest), expecting it to probably return a lot of irrelevant stuff but maybe pointing me at something interesting. Likewise, suppose I didn’t know the expression “Rayleigh quotient”; I could imagine searching for “$\max_x x^T Q x / x^T x$” (where now you’d like to distinguish between variables and “constants” in your search specification), again probably producing a lot of rubbish. So I don’t think that it’s clear that search needs someone to have solved the exact same problem you’re motivated by, but I can fully understand that this task is very peripheral to the goal of FMathL.

Posted by: bane on October 12, 2009 5:44 PM | Permalink | Reply to this
Read the post Syntax, Semantics, and Structuralism, I
Weblog: The n-Category Café
Excerpt: A brief summary of syntax and semantics of type theory and logic, as a basis for continuing discussions about structural and material foundations.
Tracked: October 19, 2009 10:05 PM
