March 4, 2018

Coarse-Graining Open Markov Processes

Posted by John Baez

Kenny Courser and I have been working hard on this paper for months:

It may be almost done. So, it would be great if you folks could take a look and comment on it! It’s a cool mix of probability theory and double categories.

‘Coarse-graining’ is a standard method of extracting a simple Markov process from a more complicated one by identifying states. We extend coarse-graining to open Markov processes. An ‘open’ Markov process is one where probability can flow in or out of certain states called ‘inputs’ and ‘outputs’. One can build up an ordinary Markov process from smaller open pieces in two basic ways:

  • composition, where we identify the outputs of one open Markov process with the inputs of another,

and

  • tensoring, where we set two open Markov processes side by side.

A while back, Brendan Fong, Blake Pollard and I showed that these constructions make open Markov processes into the morphisms of a symmetric monoidal category:

Here Kenny and I go further by constructing a symmetric monoidal double category where the 2-morphisms include ways of coarse-graining open Markov processes. We also extend the previously defined ‘black-boxing’ functor from the category of open Markov processes to this double category.

But before you dive into the paper, let me explain all this stuff a bit more….

Very roughly speaking, a ‘Markov process’ is a stochastic model describing a sequence of transitions between states in which the probability of a transition depends only on the current state. But the only Markov processes we’ll talk about here are continuous-time Markov processes with a finite set of states. These can be drawn as labeled graphs:

where the number labeling each edge describes the probability per time of making a transition from one state to another.

An ‘open’ Markov process is a generalization in which probability can also flow in or out of certain states designated as ‘inputs’ and ‘outputs’:

Open Markov processes can be seen as morphisms in a category, since we can compose two open Markov processes by identifying the outputs of the first with the inputs of the second. Composition lets us build a Markov process from smaller open parts — or conversely, analyze the behavior of a Markov process in terms of its parts.
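
To make this concrete, here is a minimal numerical sketch of composition (Python; all the names are mine, not from the paper). It assumes the simplest case, where the outputs of the first process are matched one-for-one with the inputs of the second; then composing amounts to gluing the two state sets along that matching and adding the two rate matrices on the glued set. The actual definition in the paper uses pushouts and handles more general identifications.

```python
import numpy as np

def compose(H1, X1, out1, H2, X2, in2):
    """Glue two open Markov processes by identifying the i-th output of
    the first with the i-th input of the second, then add their
    Hamiltonians on the glued state set.  States are lists of labels;
    H1, H2 are infinitesimal stochastic matrices indexed in the same
    order as X1, X2."""
    rename = dict(zip(in2, out1))            # glued states of X2 get their X1 names
    X2r = [rename.get(x, x) for x in X2]
    X = list(X1) + [x for x in X2r if x not in X1]
    index = {x: i for i, x in enumerate(X)}
    H = np.zeros((len(X), len(X)))
    for A, labels in ((H1, X1), (H2, X2r)):  # add each Hamiltonian where it lives
        for i, xi in enumerate(labels):
            for j, xj in enumerate(labels):
                H[index[xi], index[xj]] += A[i, j]
    return H, X

# Two 2-state open processes glued along one state: a -> b at rate 1, b -> c at rate 2.
H1 = np.array([[-1.0, 0.0], [1.0, 0.0]])
H2 = np.array([[-2.0, 0.0], [2.0, 0.0]])
H, X = compose(H1, ['a', 'b'], ['b'], H2, ['b', 'c'], ['b'])
print(X)   # ['a', 'b', 'c']
print(H)   # the combined rate matrix on the glued state set
```

Tensoring is even simpler in this picture: the state sets are disjointly united and the Hamiltonian is block diagonal.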

In this paper, Kenny and I extend the study of open Markov processes to include coarse-graining. ‘Coarse-graining’ is a widely studied method of simplifying a Markov process by mapping its set of states X onto some smaller set X' in a manner that respects the dynamics. Here we introduce coarse-graining for open Markov processes. And we show how to extend this notion to the case of maps p : X \to X' that are not surjective, obtaining a general concept of morphism between open Markov processes.
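
To get a feel for what ‘respects the dynamics’ can mean, here is a small Python sketch of the classical ‘lumping’ picture: push probabilities forward along a surjection p : X \to X' and check the textbook strong-lumpability condition p_* H = H' p_*. The notion of coarse-graining in our paper is more refined than this simple check, and the helper names and example below are mine.

```python
import numpy as np

def pushforward_matrix(p, nX, nXp):
    """Matrix of p_* : R^X -> R^X', summing probabilities over fibers.
    Here p is a list giving p(i) for each state i in X = {0,...,nX-1}."""
    P = np.zeros((nXp, nX))
    for i, pi in enumerate(p):
        P[pi, i] = 1.0
    return P

# A 3-state Markov process in which states 1 and 2 behave identically.
H = np.array([[-2.0,  1.0,  1.0],
              [ 1.0, -1.0,  0.0],
              [ 1.0,  0.0, -1.0]])

p = [0, 1, 1]                              # lump states 1 and 2 together
P = pushforward_matrix(p, 3, 2)

# Candidate coarse-grained Hamiltonian: sum H over fibers, average over the source fiber.
Hp = P @ H @ P.T / np.array([1.0, 2.0])    # divide column j' by its fiber size

# Strong lumpability: pushing forward then evolving equals evolving then pushing forward.
print(np.allclose(P @ H, Hp @ P))          # True for this example
```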

Since open Markov processes are already morphisms in a category, it is natural to treat morphisms between them as morphisms between morphisms, or ‘2-morphisms’. We can do this using double categories!

Double categories were first introduced by Ehresmann. Since then, they’ve been used in topology and other branches of pure math—but more recently they’ve also been used to study open dynamical systems and open discrete-time Markov chains. So, it should not be surprising that they are also useful for open Markov processes.

As a bunch of you already know, a 2-morphism in a double category looks like this:

While a mere category has only objects and morphisms, here we have a few more types of things. We call A, B, C and D ‘objects’, f and g ‘vertical 1-morphisms’, M and N ‘horizontal 1-cells’, and \alpha a ‘2-morphism’. We can compose vertical 1-morphisms to get new vertical 1-morphisms and compose horizontal 1-cells to get new horizontal 1-cells. We can compose the 2-morphisms in two ways: horizontally by setting squares side by side, and vertically by setting one on top of the other.

In a ‘strict’ double category all these forms of composition are associative. In a ‘pseudo’ double category, horizontal 1-cells compose in a weakly associative manner: that is, the associative law holds only up to an invertible 2-morphism, the ‘associator’, which obeys a coherence law.

Kenny and I construct a double category \mathbb{M}\mathbf{ark} with:

  1. finite sets as objects,
  2. maps between finite sets as vertical 1-morphisms,
  3. open Markov processes as horizontal 1-cells,
  4. morphisms between open Markov processes as 2-morphisms.

I won’t give the definition of item 4 here; you gotta read our paper for that! Composition of open Markov processes is only weakly associative, so \mathbb{M}\mathbf{ark} is a pseudo double category.

This is how our paper goes. In Section 2 we define open Markov processes and steady state solutions of the open master equation. In Section 3 we introduce coarse-graining, first for Markov processes and then for open Markov processes. In Section 4 we construct the double category \mathbb{M}\mathbf{ark} described above. We prove this is a symmetric monoidal double category in the sense defined by Mike Shulman. This captures the fact that we can not only compose open Markov processes but also ‘tensor’ them by setting them side by side.

For example, if we compose this open Markov process:

with the one I showed you before:

we get this open Markov process:

But if we tensor them, we get this:

As compared with an ordinary Markov process, the key new feature of an open Markov process is that probability can flow in or out. To describe this we need a generalization of the usual master equation for Markov processes, called the ‘open master equation’.

This is something that Brendan, Blake and I came up with earlier. In this equation, the probabilities at input and output states are arbitrary specified functions of time, while the probabilities at other states obey the usual master equation. As a result, the probabilities are not necessarily normalized. We interpret this by saying probability can flow either in or out at both the input and the output states.
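
In symbols (my paraphrase of that description, writing B \subseteq X for the set of input and output states and b_i for the specified boundary probabilities):

\frac{d p_i(t)}{d t} = \sum_{j \in X} H_{i j} \, p_j(t) \quad \text{for } i \notin B, \qquad p_i(t) = b_i(t) \quad \text{for } i \in B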

If we fix constant probabilities at the inputs and outputs, there typically exist solutions of the open master equation with these boundary conditions that are constant as a function of time. These are called ‘steady states’. Often these are nonequilibrium steady states, meaning that there is a nonzero net flow of probabilities at the inputs and outputs. For example, probability can flow through an open Markov process at a constant rate in a nonequilibrium steady state. It’s like a bathtub where water is flowing in from the faucet, and flowing out of the drain, but the level of the water isn’t changing.
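
Here is a minimal numerical sketch of finding such a steady state (Python; the function name and example are mine): hold the probabilities fixed at the boundary states and solve the remaining equations, which say that the probabilities at the internal states don’t change.

```python
import numpy as np

def steady_state(H, boundary, boundary_probs):
    """Solve sum_j H[i, j] p[j] = 0 for all internal states i, holding
    p fixed at the given values on the boundary (input/output) states."""
    n = H.shape[0]
    internal = [i for i in range(n) if i not in boundary]
    A = H[np.ix_(internal, internal)]                        # internal-to-internal block
    b = -H[np.ix_(internal, boundary)] @ np.array(boundary_probs)
    p = np.zeros(n)
    p[boundary] = boundary_probs
    p[internal] = np.linalg.solve(A, b)
    return p

# A chain 0 -> 1 -> 2 with unit rates; states 0 and 2 are the boundary.
H = np.array([[-1.0,  0.0, 0.0],
              [ 1.0, -1.0, 0.0],
              [ 0.0,  1.0, 0.0]])
p = steady_state(H, boundary=[0, 2], boundary_probs=[1.0, 0.2])
print(p)        # the steady-state probabilities
print(H @ p)    # nonzero only at the boundary: a constant net in/outflow of probability
```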

Brendan, Blake and I studied the relation between probabilities and flows at the inputs and outputs that holds in steady state. We called the process of extracting this relation from an open Markov process ‘black-boxing’, since it gives a way to forget the internal workings of an open system and remember only its externally observable behavior. We showed that black-boxing is compatible with composition and tensoring. In other words, we showed that black-boxing is a symmetric monoidal functor.

In Section 5 of our new paper, Kenny and I show that black-boxing is compatible with morphisms between open Markov processes. To make this idea precise, we prove that black-boxing gives a map from the double category \mathbb{M}\mathbf{ark} to another double category, called \mathbb{L}\mathbf{inRel}, which has:

  1. finite-dimensional real vector spaces U, V, W, \dots as objects,
  2. linear maps f : V \to W as vertical 1-morphisms from V to W,
  3. linear relations R \subseteq V \oplus W as horizontal 1-cells from V to W,
  4. squares

obeying (f \oplus g)R \subseteq S as 2-morphisms.

Here a ‘linear relation’ from a vector space V to a vector space W is a linear subspace R \subseteq V \oplus W. Linear relations can be composed in the usual way we compose relations. The double category \mathbb{L}\mathbf{inRel} becomes symmetric monoidal using direct sum as the tensor product, but unlike \mathbb{M}\mathbf{ark} it is a strict double category: that is, composition of linear relations is associative.
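
Composition of linear relations can be computed quite concretely. Here is a small NumPy sketch (the function name is mine): represent a relation by a matrix whose columns span the subspace, and compose R \subseteq V \oplus W with S \subseteq W \oplus U by matching the middle components.

```python
import numpy as np
from scipy.linalg import null_space

def compose_relations(R, S, dimV, dimW, dimU):
    """Compose linear relations R <= V (+) W and S <= W (+) U, each given
    by a matrix whose columns span the subspace (V-part on top)."""
    R_V, R_W = R[:dimV, :], R[dimV:, :]
    S_W, S_U = S[:dimW, :], S[dimW:, :]
    # Pairs (a, b) of coefficient vectors whose middle (W) parts agree.
    K = null_space(np.hstack([R_W, -S_W]))
    a, b = K[:R.shape[1], :], K[R.shape[1]:, :]
    # The composite is spanned by the corresponding (v, u) pairs.
    return np.vstack([R_V @ a, S_U @ b])

# Sanity check: graphs of linear maps compose like the maps themselves.
A = np.array([[1.0, 2.0]])                   # A : R^2 -> R^1
B = np.array([[3.0], [4.0]])                 # B : R^1 -> R^2
graph_A = np.vstack([np.eye(2), A])          # {(v, A v)} inside R^2 (+) R^1
graph_B = np.vstack([np.eye(1), B])          # {(w, B w)} inside R^1 (+) R^2
comp = compose_relations(graph_A, graph_B, 2, 1, 2)
print(comp)                                  # columns span the graph of B A
```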

Our main result, Theorem 5.5, says that black-boxing gives a symmetric monoidal double functor

\blacksquare : \mathbb{M}\mathbf{ark} \to \mathbb{L}\mathbf{inRel}

As you’ll see if you check out our paper, there’s a lot of nontrivial content hidden in this short statement! The proof requires a lot of linear algebra and also a reasonable amount of category theory. For example, we needed this fact: if you’ve got a commutative cube in the category of finite sets:

and the top and bottom faces are pushouts, and the two left-most faces are pullbacks, and the two left-most arrows on the bottom face are monic, then the two right-most faces are pullbacks. I think it’s cool that this is relevant to Markov processes!

Finally, in Section 6 we state a conjecture. First we use a technique invented by Mike Shulman to construct symmetric monoidal bicategories \mathbf{Mark} and \mathbf{LinRel} from the symmetric monoidal double categories \mathbb{M}\mathbf{ark} and \mathbb{L}\mathbf{inRel}. We conjecture that our black-boxing double functor determines a functor between these symmetric monoidal bicategories. This has got to be true. However, double categories seem to be a simpler framework for coarse-graining open Markov processes.

Finally, let me talk a bit about some related work. As I already mentioned, Brendan, Blake and I constructed a symmetric monoidal category where the morphisms are open Markov processes. However, we formalized such Markov processes in a slightly different way than Kenny and I do now. We defined a Markov process to be one of the pictures I’ve been showing you: a directed multigraph where each edge is assigned a positive number called its ‘rate constant’. In other words, we defined it to be a diagram

where X is a finite set of vertices or ‘states’, E is a finite set of edges or ‘transitions’ between states, the functions s, t : E \to X give the source and target of each edge, and r : E \to (0,\infty) gives the rate constant for each transition. We explained how from this data one can extract a matrix of real numbers (H_{i j})_{i, j \in X} called the ‘Hamiltonian’ of the Markov process, with two properties that are familiar in this game:

  • H_{i j} \geq 0 if i \neq j,
  • \sum_{i \in X} H_{i j} = 0 for all j \in X.

A matrix with these properties is called ‘infinitesimal stochastic’, since these conditions are equivalent to \exp(t H) being stochastic for all t \ge 0.
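
As a quick sanity check on these definitions, here is a small Python sketch (the helper name is mine) that builds H from the data (X, E, s, t, r) and verifies both conditions, along with the fact that the columns of \exp(t H) sum to 1.

```python
import numpy as np
from scipy.linalg import expm

def hamiltonian(n_states, edges):
    """Build H from a list of edges (source, target, rate): the entry
    H[t, s] collects the rates of transitions s -> t, and each diagonal
    entry is chosen so that the column sums to zero."""
    H = np.zeros((n_states, n_states))
    for s, t, r in edges:
        H[t, s] += r
        H[s, s] -= r
    return H

H = hamiltonian(3, [(0, 1, 2.0), (1, 2, 1.0), (2, 0, 0.5)])
assert np.all(H - np.diag(np.diag(H)) >= 0)    # off-diagonal entries are nonnegative
assert np.allclose(H.sum(axis=0), 0)           # each column sums to zero
print(expm(1.7 * H).sum(axis=0))               # columns of exp(tH) sum to 1: it is stochastic
```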

In our new paper, Kenny and I skip the directed multigraphs and work directly with the Hamiltonians! In other words, we define a Markov process to be a finite set X together with an infinitesimal stochastic matrix (H_{i j})_{i, j \in X}. This allows us to work more directly with the Hamiltonian and the all-important ‘master equation’

\frac{d p(t)}{d t} = H p(t)

which describes the evolution of a time-dependent probability distribution p(t) : X \to \mathbb{R}.
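
Since H doesn’t depend on time, the master equation is solved by p(t) = \exp(t H) \, p(0). Here is a tiny self-contained check (Python; the example matrix is mine) that this evolution preserves total probability, as the zero column sums guarantee.

```python
import numpy as np
from scipy.linalg import expm

# A 2-state example: transitions 0 -> 1 at rate 3 and 1 -> 0 at rate 1.
H = np.array([[-3.0,  1.0],
              [ 3.0, -1.0]])
p0 = np.array([0.9, 0.1])

for t in (0.0, 0.5, 2.0, 10.0):
    p = expm(t * H) @ p0      # p(t) = exp(tH) p(0)
    print(t, p, p.sum())      # total probability stays equal to 1
```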

Clerc, Humphrey and Panangaden have constructed a bicategory with finite sets as objects, ‘open discrete labeled Markov processes’ as morphisms, and ‘simulations’ as 2-morphisms. They use the word ‘open’ in a pretty similar way to me. But their open discrete labeled Markov processes are also equipped with a set of ‘actions’, which represent interactions between the Markov process and the environment, such as an outside entity acting on a stochastic system. A ‘simulation’ is then a function between the state spaces that maps the inputs, outputs and set of actions of one open discrete labeled Markov process to the inputs, outputs and set of actions of another.

Another compositional framework for Markov processes was discussed by de Francesco Albasini, Sabadini and Walters. They constructed an algebra of ‘Markov automata’. A Markov automaton is a family of matrices with non-negative real coefficients, indexed by elements of a binary product of sets, where one set represents the ‘signals on the left interface’ of the Markov automaton and the other plays the same role for the right interface.

So, double categories are gradually invading the theory of Markov processes… as part of the bigger trend toward applied category theory. They’re natural things; scientists should use them.

Posted at March 4, 2018 12:34 AM UTC


11 Comments & 0 Trackbacks

Re: Coarse-Graining Open Markov Processes

I’m getting ” /home/baez/coarse.pdf was not found on this server” when I click the link for the paper.

Posted by: chris r on March 4, 2018 5:00 AM

Re: Coarse-Graining Open Markov Processes

Dag nab it! I should always check my links. Thanks, fixed.

Posted by: John Baez on March 4, 2018 3:42 PM

Re: Coarse-Graining Open Markov Processes

Very nice, glad to see you’re embracing the double category approach.

For me, one of the main applications of double categories is that they are a nice place to define categories and, if you also have a monad on the double category, multicategories.

Do you know what the categories internal to Mark (i.e. monads in the horizontal bicategory) look like? Do they reproduce any known concepts?

Posted by: Max S. New on March 4, 2018 8:30 PM

Re: Coarse-Graining Open Markov Processes

I haven’t thought about this, since it comes from a completely different line of thought than what led us to the double category \mathbb{M}\mathbf{ark} and the horizontal bicategory sitting inside it, \mathbf{Mark}. I guess I should figure out what a category internal to \mathbf{Mark} is, even though the very idea hurts my brain right now. Maybe I’ll ask Kenny to figure it out.

Posted by: John Baez on March 4, 2018 11:32 PM

Re: Coarse-Graining Open Markov Processes

Just to be clear, by category internal to Mark, I just meant a monad in the horizontal bicategory and not something involving spans of open Markov processes (which is sure to cause head injury).

The reason a double category is a nice place to do this is that you define functors using the vertical morphisms and profunctors and natural transformations using the horizontal morphisms and 2-cells.

So a Mark-category C would have a set of objects C_0 and a hom Markov process C_1 : C_0 \to C_0 whose inputs and outputs are labeled by the set of objects. Then there are identity and composition 2-cells, but they seem strange to me (composition means coarse-graining with itself?) so I was wondering if it was anything familiar.

Posted by: Max S. New on March 5, 2018 9:41 PM

Three tangential comments:

Why not call the “Hamiltonian” a Liouvillian?

It’s clear enough what coarsening means in the context of something like an Ising model with Glauber dynamics–this is one road to renormalization. It’s not obvious to me though what coarsening would mean in the context of, say, reaction networks. It would be interesting to see this spelled out or explored.

Also, the paper mentions “pushforwards” but makes no mention of “transfer/Ruelle-Perron-Frobenius operators” that are ubiquitous in chaos and nonequilibrium statistical physics…but these are basically the same thing.

Posted by: Steve Huntsman on March 5, 2018 6:10 PM

Re:

Hi, Steve! See you soon at NIST!

Why not call the “Hamiltonian” a Liouvillian?

Because nobody does? At least I haven’t seen them do it, and ‘Liouvillian’ has at least one other meaning. Markov process experts sometimes call this Hamiltonian an ‘intensity matrix’ or ‘transition rate matrix’, and maybe we should do that. But as someone raised on quantum mechanics I couldn’t resist calling it a Hamiltonian, and that’s what I did in my book Quantum Techniques in Stochastic Mechanics, which tries to play up the analogy between quantum mechanics and Markov processes.

It’s not obvious to me though what coarsening would mean in the context of, say, reaction networks. It would be interesting to see this spelled out or explored.

If I have enough energy I’d like to redo this paper for reaction networks. Right now this talk is a good place to learn about concrete applications of coarse-graining for reaction networks:

He also has some papers on this topic.

Also, the paper mentions “pushforwards” but makes no mention of “transfer/Ruelle-Perron-Frobenius operators” that are ubiquitous in chaos and nonequilibrium statistical physics … but these are basically the same thing.

Are those names ever used for linear maps that arise from maps between spaces that aren’t bijections? I’m curious.

Ultimately, most of the hard theorems in our paper heavily use the nice interplay between the pushforwards

f_* : \mathbb{R}^S \to \mathbb{R}^T

and pullbacks

f^* : \mathbb{R}^T \to \mathbb{R}^S

arising from maps f : S \to T between finite sets.
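
Concretely, f^* precomposes with f while f_* sums over the fibers of f, and their matrices are transposes of each other. A small Python sketch (the function names are mine):

```python
import numpy as np

def pullback(f, nS, nT):
    """Matrix of f^* : R^T -> R^S, given by (f^* q)(s) = q(f(s))."""
    M = np.zeros((nS, nT))
    for s in range(nS):
        M[s, f[s]] = 1.0
    return M

def pushforward(f, nS, nT):
    """Matrix of f_* : R^S -> R^T, summing over the fiber of each t."""
    return pullback(f, nS, nT).T

f = [0, 1, 1]                      # a map S = {0, 1, 2} -> T = {0, 1}
p = np.array([0.2, 0.3, 0.5])      # a probability distribution on S
q = np.array([4.0, 7.0])           # a function on T
print(pushforward(f, 3, 2) @ p)    # [0.2, 0.8]: sums over fibers
print(pullback(f, 3, 2) @ q)       # [4.0, 7.0, 7.0]: q composed with f
```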

Posted by: John Baez on March 5, 2018 6:40 PM

Re: Coarse-Graining Open Markov Processes

Finally, in Section 6 we state a conjecture. First we use a technique invented by Mike Shulman to construct symmetric monoidal bicategories \mathbf{Mark} and \mathbf{LinRel} from the symmetric monoidal double categories \mathbb{M}\mathbf{ark} and \mathbb{L}\mathbf{inRel}. We conjecture that our black-boxing double functor determines a functor between these symmetric monoidal bicategories.

Linde Wester and I are currently (by which I mean “she’s done a lot of work and I’ve been too busy to contribute much”) working on an extension of that general result to make it functorial. That is, there is a functor from the X of symmetric monoidal double categories to the X of symmetric monoidal bicategories, so that in particular your symmetric monoidal double functor will induce a symmetric monoidal functor of bicategories as you conjectured. (The tricky part is in deciding what X should be.)

Posted by: Mike Shulman on March 6, 2018 7:29 PM

Re: Coarse-Graining Open Markov Processes

Great! If Linde puts her paper on the arXiv before Kenny and I publish ours, we can change our paper to say our conjecture is a theorem, and cite her paper for the proof! We wanted to stimulate work on this subject, though by now I’m sold on symmetric monoidal double categories and functors between them, and don’t consider it so urgent to convert them into symmetric monoidal bicategories and functors between those.

Posted by: John Baez on March 8, 2018 7:25 AM

Re: Coarse-Graining Open Markov Processes

I was just wondering if you considered trying to work with a strictification of your double category? I would have thought that this would be a significant advantage for applications.

One can for instance consider a finite set to be a sorted finite list without duplicates, and then the pushout of L \leftarrow X \rightarrow R where the arrows are injections can be taken to be the list concatenation X + (L \setminus X) + (R \setminus X) followed by sorting and removing of duplicates. (This is how sets are often implemented in programming languages; of course one needs one’s elements to be sortable, but this is not a problem in applications, as one can always arrange it one way or another). One will then get strict associativity.
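
Here is a minimal Python rendering of that recipe, just to illustrate the point (the code is mine, not from the comment):

```python
def pushout(L, X, R):
    """Pushout of inclusions L <- X -> R, computed as a sorted,
    duplicate-free list: concatenate X, the elements of L not in X,
    and the elements of R not in X, then sort."""
    glued = list(X) + [a for a in L if a not in X] + [a for a in R if a not in X]
    return sorted(set(glued))

# Gluing in either order gives literally the same sorted list,
# so composition is associative on the nose.
left  = pushout(pushout(['a', 'b'], ['b'], ['b', 'c']), ['c'], ['c', 'd'])
right = pushout(['a', 'b'], ['b'], pushout(['b', 'c'], ['c'], ['c', 'd']))
print(left, right, left == right)   # ['a','b','c','d'] ['a','b','c','d'] True
```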

Posted by: Richard Williamson on March 6, 2018 8:03 PM

Re: Coarse-Graining Open Markov Processes

Is there any relation between black boxing an open Markov process and taking the magnitude of a finite generalized metric space?

Posted by: chris upshaw on March 10, 2018 9:33 PM
