Probability Monads as Codensity Monads
Posted by Tom Leinster
My PhD student Ruben Van Belle has just published his first paper!
Ruben Van Belle, Probability monads as codensity monads. Theory and Applications of Categories 38 (2022), 811–842.
It’s a treasure trove of theorems demonstrating how many of the monads loosely referred to as “probability monads” arise as codensity monads in a certain uniform manner, which I’ll tell you about now.
A probability monad is a monad that takes a space as input and produces as output the space of probability measures on . It’s an imprecise term, but there are a number of monads that are universally agreed to count as probability monads, of which the Giry monad is the best known. There are also some boundary cases: should the ultrafilter monad count as a probability monad? (An ultrafilter is a kind of primitive probability measure.)
The codensity monad of a functor is a certain canonical monad on . If has a left adjoint , then the codensity monad of is the composite . But even if doesn’t have a left adjoint, its codensity monad is often well-defined. By definition, it’s the right Kan extension of along itself. A good non-trivial example: the codensity monad of the inclusion is the ultrafilter monad.
Ruben’s insight was as follows. There is something fundamentally countable about probability theory. For instance, probability is additive on countable families of mutually exclusive events, but not general families. You can define a probability measure on a countable set as an -indexed family of nonnegative reals summing to , but that definition doesn’t work when is larger. And — more nebulously — although probability theory very often involves uncountable spaces such as and uncountable families of random variables (arising in stochastic processes), there is a feeling that the heart of probability theory remains countable, that the extension from countable to uncountable infinities is somehow automatic.
So, take some category of spaces and a probability monad on it. Let be the subcategory of consisting of just the countable spaces. Then you can restrict to to get a functor . If the idea expounded in the last paragraph is correct, then it should be possible to reconstruct from (its restriction to countable spaces) in some automatic way. The substance of the definition of should be in the countable part of ; everything else should be fluff.
Ruben shows that this is indeed the case. And as you’ll probably have guessed, the way that is reconstructed from is as its codensity monad. That is, in many cases:
A probability monad is the codensity monad of its restriction to countable spaces.
Sometimes it’s even simpler: you can get away with just finite spaces. Sometimes there are some complicating wrinkles. But that’s the main idea.
Here are four theorems illustrating this principle, all from Ruben’s paper.
Let be the category of measurable spaces and the Giry monad on , which assigns to each space the space of probability measures on it. Then is the codensity monad of the functor from countable sets to that assigns to each countable set the space of probability measures on it.
This theorem embodies the idea that once you have the definition of probability measure on a countable set, the definition for general measurable spaces is automatic.
Let be the category of compact Hausdorff spaces and the Radon monad on , which assigns to each space the space of Radon probability measures on it. (A Radon measure is a suitably nice measure on the Borel sets, i.e. the smallest -algebra containing the open sets.) Then is the codensity monad of the functor from finite sets to that assigns to each finite set the space of probability measures on it.
So for compact Hausdorff spaces, once you’ve got the definition of probability measure on finite sets, the general definition of Radon probability measure follows automatically. A striking feature of this theorem was that although the input functor was entirely finite in nature, the output monad involves countable additivity. We got that for free.
Let be the category of compact metric spaces and distance-decreasing maps. Let be the “bounded Lipschitz monad” on . It assigns to each space the space of Radon probability measures on it, with a certain metric called the “bounded Lipschitz metric”. This is closely related to the Wasserstein/Kantorovich metric, and you can find the definition on page 829 of Ruben’s paper. And again, is the codensity monad of its restriction to finite sets.
Finally, let be the category of Hausdorff spaces. There is a probability monad on that Ruben dubs the Baire monad, assigning to each space the space of “Baire probability measures” on . I won’t give the definition (see page 825 if you want it), but the point is that the general notion of Baire probability measure reduces to the usual notion of probability measure when is metric or compact Hausdorff, but still behaves well on the category of Hausdorff spaces as a whole.
In this case, the result is that is the codensity monad of its restriction to a certain category of countable discrete spaces (i.e. countable sets). It’s not the full subcategory; only certain maps are included. But I’ll leave Ruben to explain that; it’s in section 6 of his paper.
All of this provokes a categorical question. Given a monad on a category and a subcategory of , when is it the case that is the codensity monad of its restriction to ? Or more generally, given a monad on and a functor , when is it the case that is the codensity monad of the composite functor
We know the answer when is the identity monad. For in that case, the question is asking “for which functors is the codensity monad of the identity?”, and the answer is “when is codense”. It’s not too hard to show that in general, is the codensity monad of if and only if the composite functor
is codense, where is the free functor into the Kleisli category of . I’m sure Ruben and I aren’t the first to come across this fact, but can anyone point to where it appears in the literature? And does anyone know other applications of it?
Re: Probability Monads as Codensity Monads
Well done, Ruben!
Is there some abstract way to relate this codensity account to the graded monad account of Fritz and Perrone, as described here at nLab: graded monad?