What is a Theory?
Posted by Mike Shulman
Over at the nForum John just asked me an interesting question, which I think deserves a wider discussion. It’s about the relationship between logical theories and their walking models, a.k.a. “syntactic categories.” In particular, which is more deserving of the name “theory”? I’m going to present my point of view, but I’d love to hear others as well.
Traditionally, logicians have spoken of a (first-order, many-sorted, logical) “theory” in terms of types, operations, relations, and axioms. For instance, the theory of a group has one type, denoted , a binary operation , a nullary operation , a unary operation , and axioms , , and . A model of such a theory, in the classical sense, consists of a set and functions and satisfying the axioms, i.e. a group.
It’s easy to generalize this to define a model of a theory internal to a category with sufficient structure; for instance we can define a “group object” in any category with finite products. The level of structure required on the category is called the doctrine (or 2-theory) in which the theory is written; thus the theory of groups is written in the doctrine of finite-products categories. Moreover, for practically any doctrine and theory in it, there exists a -category that is the walking -model (also called the syntactic category or category of contexts of ). This sort of yoga has been discussed here many times before; I know it came up here, and the notion of “walking thing” came up here. I know there are lots more links; anyone want to add some?
Now, many people, starting with Lawvere and including John Baez and Jim Dolan, have a tendency to identify a theory with the walking -model . This undoubtedly has certain advantages. For instance, since -models in a -category are equivalent to -functors , knowledge of suffices to recover all the -models. Conversely, is determined uniquely as a representing object for the functor , so there is a nice duality. Moreover, syntactically different theories and can give rise to equivalent categories , and hence to an equivalent notion of model (we then say that and are “Morita equivalent”). Thus, considering walking models to be the theories removes superfluous ambiguity.
Finally, the walking model contains more information than the syntactic version of does. For instance, if is the theory of groups, then tells you that there is exactly one way to multiply together any number of elements of a group once you fix their ordering. In other words, it is an unbiased presentation of the notion of group, which is arguably the version that we actually tend to use on a day-to-day basis.
Thus, over at the nForum, John remarked to me:
I got the impression from Jim Dolan that going from the “syntactic presentation” of a theory to the corresponding category was mainly a bunch of bureaucratic red tape, and that the really fun starts when we get past that stage and think of theories as being those categories. Sort of like how groups are more fun than group presentations, though of course those presentations are very useful. So, I’m a bit surprised that you seem to be taking more of the viewpoint of the traditional fuddy-duddy logician who likes nothing more than manipulating well-formed formulas according to purely syntactic rules.
So why do I (and some others, like Toby) continue to insist on talking about syntactic presentations, and distinguishing between a (syntactically presented) theory and its walking model?
The analogy with groups and group presentations is fairly compelling, so let’s start there. You won’t find me arguing with the claim that groups are more fun than group presentations, or that our real interest is in a group rather than any particular presentation of it. On the other hand, it is certainly true that a set of generators and relations for a group is a useful thing to have around.
What else can we say about the relationship between groups and group presentations? The fact that every set of generators and relations actually presents a group is fairly easy to see, but in more complicated situations it is a statement with content (Peter Johnstone and Steve Vickers wrote a whole paper about how and why preframes can be presented by generators and relations). Conversely, (a generalization of) the fact that every group can be presented by generators and relations is important to the monadicity theorem. Furthermore, for an arbitrary group presentation, determining facts about the group it presents is hard — not just computationally intractable, but computationally impossible.
For all of these reasons, I think it’s valid to say that there is real mathematics in the correspondence between groups and group presentations. I claim the same is true of theories, but even more so. I agree with all the reasons given above for why the walking model of a theory is good, useful, and lots of fun. However, consider the following.
As I’ve opined before, one of the main things we do when we do mathematics (and especially when we communicate mathematics) is write down symbols on paper and manipulate them according to rules. Traditional logic studies, among other things, the symbols we write down, the rules we manipulate them with, and how those symbols and rules correspond to whatever “objects” we might be studying. The correspondence between syntactic presentations of theories and their walking models is part of this correspondence between symbols and “reality.” That just seems to me like an important thing to understand, from a metamathematical perspective if nothing else.
In the case of groups, it’s fairly uncommon to be simply given a group presentation; usually we’re given a group in some other way and then produce a presentation of it. But giving objects via presentations gets more attractive as the objects get more complicated. For instance, the tensor product of abelian groups is often defined via generators and relations. Similarly, any algebraic extension of a field is naturally so defined. And in the case of theories encountered in the wild, being given a presentation in terms of types, operations, and axioms is the rule rather than the exception.
Determining facts about the walking model of a theory (or, equivalently, about all models of that theory, since the walking model is the initial one), given only a presentation of it, is hard. For instance, let be the first-order (syntactically presented) theory ZFC, and let be the statement in the first-order language of ZFC that encodes the Riemann hypothesis. In the walking model , the statement is incarnated by a subterminal object . Showing that this map is an isomorphism is worth a million dollars.
Thus, I wouldn’t describe the passage from a syntactically presentated theory to its walking model as bureaucratic red tape. It’s certainly a purely formal operation, but that doesn’t make it uninteresting, unimportant, or trivial. Point one above says that it’s philosophically interesting; point two says that it’s important in practice, because we do it a lot; point three says that the resulting construction is quite nontrivial, and in fact can be regarded as encoding all of mathematics.
A final question, of course, is even if we grant that the two are both important, why shouldn’t we call them “theory presentations” and “theories” by analogy with “group presentations” and “groups”? I can think of two reasons. One is the traditional and entrenched usage of logicians. The other is that we already have a word for the Morita-invariant notion: namely, a -category, for whatever doctrine is in use. Thus we have a correspondence between “finite-products theories” and “categories with finite products,” another one between “first-order theories” and “Heyting categories,” another between “geometric theories” and “Grothendieck toposes,” and so on. Why should we define “geometric theory” to mean “Grothendieck topos” and force ourselves to say “presentation of a geometric theory” for the syntactic notion, when we already have the perfectly good term “Grothendieck topos” to refer to the categories in which (walking) models of geometric theories live?
Re: What is a Theory?
One reason why I want to use ‘theory’ for the syntactic notion (the presentation) is to make sure that I'm using that term for the right thing.
I see categorial logic as a form of applied mathematics, in fact applied metamathematics. First, we have these people, mathematicians, engaged in a social activity, and we wish to understand what they are doing. So we develop mathematical logic to study some aspects of this. And now we are applying category theory to that endeavour.
All well and good so far. But the danger in any applied mathematics is that, once you have a mathematical model in mind, you may forget that your job is to fit your model to the data and instead try to start fitting your data to the model. Then you start saying crazy things, such as that glass is not solid because it doesn't have a crystalline structure, or that shellfish are not fish because they don't have spinal cords. (I have picked one example where the current terminological consensus among scientists in the relevant field is with me and one where it is against me, but I'm prepared to defend either.)
In particular, I noticed that you left out something in your description of the theory of groups; among its types, operations, relations, and axioms, you listed only the types, operations, and axioms, even though the axioms cannot be stated without the relation: equality. Of course, you did this because you were implicitly working in a doctrine (presumably the doctrine of finite-products theories) in which every type is automatically equipped with such a relation. But of course, there are other doctrines.
This could be important, since most of the doctrines that are well understood in categorial logic have this property, yet there are doctrines that do not, which include important theories such as the theory of categories (up to equivalence, that is non-strict categories) themselves. It would be easy to fall into the trap of thinking that any theory (in any doctrine) must have equality relations on all of its types, or it's not really a theory.
Fortunately, this particular mistake is unlikely, since we do have some idea of how to deal with such theories in categorial logic (even if it's not as well understood). But we make this attempt in the first place because we meet such theories in practice and want to understand them. If we ever start thinking that the categories are the theories, especially if we fix a particular meaning of ‘doctrine’ and define ‘theory’ to mean an object of a doctrine, then we are liable to define away part of the matter that we were trying to study in the first place.
Sometimes this is the right thing to do, eventually. It proved useful to fix a definition of ‘group’ and develop group theory, even though this relegates some potential examples to the status of ‘semigroup’ or ‘quasigroup’. (Or ‘groupoid’, but now perhaps we are thinking that the definition was not fixed in the right way after all?) A fixed definition of ‘doctrine’ might very well be useful too. But categorial logic is a minority position, and a too restricted definition of ‘theory’ would make us unable to communicate with other logicians. We should leave that word to refer directly to the object of study, and use other words for our mathematical models of those objects.