Category-Theoretic Characterizations of Entropy III
Posted by John Baez
Some of us recently tried an experiment here: writing a paper on the blog. It was a great success, and now we’re basically done:
- John Baez, Tobias Fritz and Tom Leinster, A characterization of entropy in terms of information loss.
While the discussion leading up to this paper (here, here and here) was intense and erudite, the final result is simple and sweet. If the paper itself doesn’t make that clear to you, maybe this summary will:
- John Baez, A characterization of entropy, Azimuth.
All this concerns entropy for classical systems. But now Tobias and I are trying to generalize our work to quantum systems. They work a bit differently. Let me explain.
In what follows, I won’t assume any knowledge of quantum theory, or physics at all. I’ll try to define all my terms and state the problem we’re facing in a purely mathematical way. It’s pretty simple.
In our previous ‘classical’ conversation, we talked about the Shannon entropy of a probability measure on a finite set. Following a proposal by Tobias, in our new ‘quantum’ discussion we’ll talk about the von Neumann entropy of a state on a finite-dimensional C*-algebra. I think it’s easiest to understand this idea if we focus on how it generalizes what we did before.
The trick is to view our previous conversation a little bit differently. Instead of focusing on a finite set, we’ll now focus on the algebra of complex-valued functions on that finite set. As we’ll see, this kind of algebra is precisely a ‘commutative C*-algebra’. (Don’t worry, I’ll define that term in a minute.) And then we can try to generalize all our ideas to the not-necessarily-commutative case.
A lot of this is already known. Here’s how it goes.
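Definition. A $*$-algebra is an associative algebra $A$ over the complex numbers (with multiplicative unit $1$) equipped with an operation $*\colon A \to A$ obeying $(a + b)^* = a^* + b^*$, $(\lambda a)^* = \overline{\lambda}\, a^*$, $(ab)^* = b^* a^*$ and $(a^*)^* = a$ for all $a, b \in A$ and $\lambda \in \mathbb{C}$.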
Definition. A C*-algebra is a $*$-algebra $A$ equipped with a norm making it into a Banach space and obeying $\|ab\| \le \|a\|\,\|b\|$ and $\|a^* a\| = \|a\|^2$ for all $a, b \in A$.
We will only be interested in finite-dimensional C*-algebras, and then the condition ‘making it into a Banach space’ is redundant.
If you want an example, fear not: finite-dimensional C*-algebras are easy to classify, so I will give you all the examples:
Theorem. Every finite-dimensional C*-algebra is isomorphic to a finite direct sum of matrix algebras.
To be a bit more explicit, the algebra $M_n(\mathbb{C})$ of $n \times n$ complex matrices can be made into a C*-algebra in a unique way. We define the norm of an $n \times n$ matrix $a$ to be $\|a\| = \sup_{\|\psi\| = 1} \|a \psi\|$, where the supremum is taken over all unit vectors $\psi \in \mathbb{C}^n$. We define $a^*$ to be the adjoint (that is, conjugate transpose) of the matrix $a$. There is a straightforward way to make the direct sum of two C*-algebras into a C*-algebra, which I’ll gladly explain if you want. So, that gives us lots of finite-dimensional C*-algebras. Finally, we’ll only be interested in homomorphisms of C*-algebras that preserve both the algebra structure and the $*$ operation (but not necessarily the norm). So, ‘isomorphic’ means isomorphic in a way that preserves both these structures. Such an isomorphism automatically preserves the norm as well.
As a corollary, we have:
Theorem. Every finite-dimensional commutative C*-algebra is isomorphic to the algebra of complex functions on a finite set.
The reason is that only $1 \times 1$ matrix algebras are commutative. So, our previous result implies that a finite-dimensional commutative C*-algebra is isomorphic to a finite direct sum of $1 \times 1$ matrix algebras. But a finite direct sum of $1 \times 1$ matrix algebras is isomorphic to the algebra of complex functions on a finite set!
In case you’re a stickler for detail, I should add that the algebra of functions on a finite set can be made into a C*-algebra in a unique way, which I’ll again be glad to explain if you ask. If $X$ is a finite set, we’ll write $\mathbb{C}^X$ for the C*-algebra of complex-valued functions on $X$.
We can recover $X$ from $\mathbb{C}^X$. So, the idea is that finite-dimensional commutative C*-algebras really ‘are’ just finite sets. The process of generalizing our work from ‘classical’ to ‘quantum’ situations simply amounts to generalizing various ideas from commutative C*-algebras to not-necessarily-commutative ones. But we think of it as generalizing ideas from finite sets to finite direct sums of matrix algebras.
I should be a bit more precise here. If we have a map $f\colon X \to Y$ between finite sets, we can pull back any complex-valued function on $Y$ to one on $X$, and this gives a homomorphism $f^*$ from $\mathbb{C}^Y$ to $\mathbb{C}^X$. Note the contravariance! So we have a contravariant functor from the category of finite sets to the category of finite-dimensional commutative C*-algebras, say:
$$ \mathrm{FinSet}^{\mathrm{op}} \to \mathrm{FinCommC^*Alg} $$
and in fact this is an equivalence of categories. So, when I said that finite-dimensional commutative C*-algebras really ‘are’ just finite sets, I was lying slightly: there’s an ‘op’ we need to be careful about. Of course we also have an inclusion of categories
$$ \mathrm{FinCommC^*Alg} \hookrightarrow \mathrm{FinC^*Alg} $$
where we forget about commutativity, so we get an inclusion
$$ \mathrm{FinSet}^{\mathrm{op}} \hookrightarrow \mathrm{FinC^*Alg} $$
This lets us take ideas about finite sets and try to generalize them to finite direct sums of matrix algebras.
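If you like to see such things concretely, here is a minimal Python sketch of the pullback construction. It assumes we store a map between finite sets, and a complex function on a finite set, as plain dictionaries; the names `pullback` and `pointwise_product` are made up for this illustration. It checks that pulling back respects pointwise multiplication and that it reverses the order of composition.

```python
def pullback(f, g):
    """Pull a function g on Y back along a map f: X -> Y, giving x |-> g(f(x))."""
    return {x: g[f[x]] for x in f}


def pointwise_product(g, h):
    """Multiply two functions on the same finite set pointwise."""
    return {y: g[y] * h[y] for y in g}


# A map f: X -> Y between finite sets (hypothetical example data).
f = {"x1": "y1", "x2": "y1", "x3": "y2"}

# Two complex-valued functions on Y.
g = {"y1": 2 + 1j, "y2": -1j}
h = {"y1": 0.5, "y2": 3 + 0j}

# Pulling back is multiplicative: f*(g h) = f*(g) f*(h).
homomorphism_ok = pullback(f, pointwise_product(g, h)) == pointwise_product(
    pullback(f, g), pullback(f, h)
)

# Contravariance: for k: W -> X, pulling back along f after k
# is the same as pulling back along f first and then along k.
k = {"w1": "x3", "w2": "x1"}
f_after_k = {w: f[k[w]] for w in k}
contravariant_ok = pullback(f_after_k, g) == pullback(k, pullback(f, g))
```

Here both `homomorphism_ok` and `contravariant_ok` come out `True`. The sketch only tests multiplication; addition, the unit and complex conjugation are preserved for the same reason.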
And it’s well-known how to do this, when it comes to entropy! Let me summarize.
- Given a complex function on a finite set we can sum it over that set. Similarly, given a matrix we can take its trace: the sum of its diagonal entries. Both these are special cases of the trace on a finite direct sum of matrix algebras: for this, you just take the trace of each matrix in the direct sum and add them all up. So, where you saw $\sum$ in our old work, now you’ll see $\mathrm{tr}$.
- A complex function $f$ on a finite set is nonnegative iff it’s of the form $\overline{g}\, g$ for some other function $g$ on that set. So, we define an element $a$ of a C*-algebra to be nonnegative iff it’s of the form $b^* b$ for some other element $b$ of that C*-algebra. In particular, a matrix is nonnegative iff it has an orthonormal basis of eigenvectors with nonnegative eigenvalues.
- A probability distribution on a finite set is a complex function on that set that’s nonnegative and sums to 1. So, we define a state on a finite-dimensional C*-algebra $A$ to be an element $a \in A$ that’s nonnegative and has $\mathrm{tr}(a) = 1$.
- Given a probability distribution on a finite set, say $p\colon X \to [0,1]$, its Shannon entropy is
  $$ S(p) = -\sum_{i \in X} p(i) \ln p(i) $$
  So, given a state $a$ on a finite-dimensional C*-algebra $A$, we define its von Neumann entropy to be
  $$ S(a) = -\mathrm{tr}(a \ln a) $$
  Here you may wonder how we’re defining $\ln a$. It’s enough to explain this for matrix algebras, since if your C*-algebra is a finite direct sum of these, you just take the logarithm in each piece. If $a$ is a nonnegative matrix, it has an orthonormal basis of eigenvectors with nonnegative eigenvalues. So, to get $\ln a$, just keep the same eigenvectors and take the logarithm of each eigenvalue. The problem of taking the logarithm of zero is irrelevant, just as it was for Shannon entropy: only the product $a \ln a$ enters, and we use the convention $0 \ln 0 = 0$.
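To make the recipe ‘keep the eigenvectors, take the logarithm of each eigenvalue’ concrete, here is a small Python sketch restricted to $2 \times 2$ matrices, where the eigenvalues of a Hermitian matrix have a closed form. The function names are invented for this illustration, and the restriction to $2 \times 2$ is just an assumption to keep the example free of any linear algebra library. Since the entropy only depends on the eigenvalues, we never need the eigenvectors explicitly.

```python
import math


def shannon_entropy(p):
    """Shannon entropy -sum p_i ln p_i, using the convention 0 ln 0 = 0."""
    return -sum(x * math.log(x) for x in p if x > 0)


def eigenvalues_2x2_hermitian(m):
    """Eigenvalues of a 2x2 Hermitian matrix [[a, b], [conj(b), d]]."""
    a = m[0][0].real
    d = m[1][1].real
    b = m[0][1]
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + abs(b) ** 2)
    return [mean - radius, mean + radius]


def von_neumann_entropy_2x2(rho):
    """-tr(rho ln rho) for a 2x2 state, computed from its eigenvalues."""
    eigs = eigenvalues_2x2_hermitian(rho)
    # Clip tiny negative values that can appear from rounding.
    return shannon_entropy([max(e, 0.0) for e in eigs])


# A diagonal state is just a probability distribution on a 2-element set.
diagonal_state = [[0.25, 0], [0, 0.75]]

# The projection onto the vector (1, 1)/sqrt(2): eigenvalues 0 and 1.
pure_state = [[0.5, 0.5], [0.5, 0.5]]

# The maximally mixed state: eigenvalues 1/2 and 1/2.
mixed_state = [[0.5, 0], [0, 0.5]]
```

For the diagonal state the von Neumann entropy agrees with the Shannon entropy of $(0.25, 0.75)$, for the pure state it vanishes, and for the maximally mixed state it equals $\ln 2$.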
We studied Shannon entropy using $\mathrm{FinProb}$, the category of finite sets equipped with probability distributions. What about von Neumann entropy? Let’s try the same thing and define a category $\mathrm{FinState}$ to be the category of finite-dimensional C*-algebras equipped with states.
An object of $\mathrm{FinState}$ is a pair $(A, a)$ where $A$ is a finite-dimensional C*-algebra and $a$ is a state on $A$. But what’s a morphism in $\mathrm{FinState}$?
Given a probability distribution $p$ on a finite set $X$, we can push it forwards along a map $f\colon X \to Y$ and get a probability distribution on $Y$. But remember, we’ve got an inclusion of categories
$$ \mathrm{FinSet}^{\mathrm{op}} \hookrightarrow \mathrm{FinC^*Alg} $$
Beware the deadly op! So we should hope that given a state $b$ on a finite-dimensional C*-algebra $B$ and a homomorphism $f\colon A \to B$, we can pull $b$ back to get a state on $A$. And this is true. I’ll be glad to explain how if you ask.
So, a morphism $f\colon (A, a) \to (B, b)$ in $\mathrm{FinState}$ is a homomorphism of C*-algebras $f\colon A \to B$ such that pulling back $b$ along $f$, we get $a$. With this definition we get an inclusion
$$ \mathrm{FinProb}^{\mathrm{op}} \hookrightarrow \mathrm{FinState} $$
which embeds our old ‘classical’ story into the new ‘quantum’ story we’re telling now.
Now, given a morphism $f\colon (A, a) \to (B, b)$ in $\mathrm{FinState}$ we can copy our previous work and define the entropy change
$$ F(f) = S(b) - S(a) $$
more or less as we did in our previous work, modulo the deadly ‘op’. But there’s one big difference right away: this change in entropy can be either positive or negative! Before it always took the same sign.
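Here is a small Python sketch of how both signs can show up. It rests on several assumptions that are not spelled out in the post: the entropy change is taken to be (entropy of the state on the target algebra) minus (entropy of the pulled-back state on the source algebra); a state $b$ is pulled back along $f$ by requiring $\mathrm{tr}(f^*(b)\, x) = \mathrm{tr}(b\, f(x))$; and only two specific homomorphisms into the $2 \times 2$ matrices are treated, with made-up function names.

```python
import math


def entropy_from_eigenvalues(eigs):
    """-sum x ln x over the eigenvalues, with the convention 0 ln 0 = 0."""
    return -sum(x * math.log(x) for x in eigs if x > 1e-15)


def entropy_2x2(rho):
    """Von Neumann entropy of a 2x2 state via closed-form eigenvalues."""
    a, d, b = rho[0][0].real, rho[1][1].real, rho[0][1]
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + abs(b) ** 2)
    return entropy_from_eigenvalues([mean - radius, mean + radius])


def pull_back_along_unit_map(rho):
    """Pull back along C -> M_2, z |-> z * identity (assumed formula).

    The result is the only state on C, namely the number tr(rho) = 1.
    """
    return [rho[0][0].real + rho[1][1].real]


def pull_back_along_diagonal_inclusion(rho):
    """Pull back along the inclusion of diagonal matrices C^2 -> M_2.

    Under the assumed formula this keeps just the diagonal entries of rho.
    """
    return [rho[0][0].real, rho[1][1].real]


# Case 1: the unit map, with the maximally mixed state on M_2.
mixed = [[0.5, 0], [0, 0.5]]
F_unit = entropy_2x2(mixed) - entropy_from_eigenvalues(pull_back_along_unit_map(mixed))

# Case 2: the diagonal inclusion, with a pure state on M_2.
plus = [[0.5, 0.5], [0.5, 0.5]]
F_diag = entropy_2x2(plus) - entropy_from_eigenvalues(
    pull_back_along_diagonal_inclusion(plus)
)
```

With these choices `F_unit` equals $\ln 2$ and `F_diag` equals $-\ln 2$. Flipping the sign convention would swap which case is positive, but the two cases would still have opposite signs.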
So, we might hope that $F$ gives a functor
$$ F\colon \mathrm{FinState} \to \mathbb{R} $$
where we treat $\mathbb{R}$ as a category with one object, real numbers as morphisms, and addition as composition. And it’s true! Indeed, it’s utterly obvious.
So the fun starts when we try to characterize $F$ as the unique functor, up to a constant multiple, such that some conditions hold.
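In case ‘utterly obvious’ deserves one line of justification, here it is as a worked equation. It assumes the entropy change of a morphism is the entropy of the target state minus that of the source state, written $F(f) = S(b) - S(a)$ for $f\colon (A, a) \to (B, b)$, and it takes a second morphism $g\colon (B, b) \to (C, c)$:

```latex
\begin{align*}
F(g \circ f) &= S(c) - S(a) \\
             &= \bigl(S(c) - S(b)\bigr) + \bigl(S(b) - S(a)\bigr) \\
             &= F(g) + F(f), \\
F(1_{(A,a)}) &= S(a) - S(a) = 0.
\end{align*}
```

So composition goes to addition and identities go to zero, which is all that functoriality asks for here. The same telescoping works with the opposite sign convention.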
Before, when we did this for Shannon entropy, we needed two conditions (in addition to a condition that says we have a functor). Both those conditions have analogues for von Neumann entropy… and they’re both true!
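- Convex linearity. Given objects $(A, a)$ and $(A', a')$ in $\mathrm{FinState}$, and a number $\lambda \in [0,1]$, we can take a direct sum of C*-algebras and a convex linear combination of states to get an object $(A \oplus A', \lambda a \oplus (1-\lambda) a')$. Given morphisms
  $$ f\colon (A, a) \to (B, b), \qquad f'\colon (A', a') \to (B', b') $$
  their direct sum gives a morphism
  $$ f \oplus f'\colon (A \oplus A', \lambda a \oplus (1-\lambda) a') \to (B \oplus B', \lambda b \oplus (1-\lambda) b') $$
  To be cute, and to copy what we did in the Shannon entropy case, we could call this morphism
  $$ \lambda f \oplus (1-\lambda) f' $$
  Beware! We aren’t really taking a linear combination of the C*-algebra homomorphisms $f$ and $f'$, so perhaps this is bad notation. But it’s cute, because then this condition holds for the entropy change $F$:
  $$ F(\lambda f \oplus (1-\lambda) f') = \lambda F(f) + (1-\lambda) F(f') $$
  At least I think it does; I’ll have to check.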
- Continuity. The entropy change $F(f)$ depends continuously on $f$. This is slightly subtler than in the Shannon case, because we can ‘slightly change’ a homomorphism between noncommutative C*-algebras, while we can’t ‘slightly change’ a map between finite sets: we can only slightly change the probability distributions on those sets. But it’s still true.
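The convex linearity condition is easy to test numerically. The Python sketch below is only a sanity check under stated assumptions, not a proof. It assumes the entropy change of a morphism is the entropy of its target state minus that of its source state. It represents each state just by its list of eigenvalues, which is all the entropy depends on, and it uses the fact that the state $\lambda a \oplus (1-\lambda) a'$ has the eigenvalues of $a$ scaled by $\lambda$ together with those of $a'$ scaled by $1-\lambda$. The spectra are arbitrary illustrative values, not derived from actual homomorphisms.

```python
import math


def entropy(eigs):
    """-sum x ln x over a list of eigenvalues, with 0 ln 0 = 0."""
    return -sum(x * math.log(x) for x in eigs if x > 0)


def convex_sum(lam, eigs1, eigs2):
    """Eigenvalues of the direct-sum state lam*a (+) (1-lam)*a'."""
    return [lam * x for x in eigs1] + [(1 - lam) * x for x in eigs2]


lam = 0.3

# Hypothetical spectra of the source and target states of f ...
a, b = [0.5, 0.5], [0.9, 0.1]
# ... and of the source and target states of f'.
a2, b2 = [0.2, 0.3, 0.5], [1.0]

F_f = entropy(b) - entropy(a)
F_f2 = entropy(b2) - entropy(a2)

# Entropy change of the direct-sum morphism.
F_sum = entropy(convex_sum(lam, b, b2)) - entropy(convex_sum(lam, a, a2))

# What convex linearity predicts.
expected = lam * F_f + (1 - lam) * F_f2
```

The two numbers `F_sum` and `expected` agree up to rounding. The reason is that the entropy of $\lambda a \oplus (1-\lambda) a'$ is $\lambda S(a) + (1-\lambda) S(a')$ plus the term $-\lambda \ln \lambda - (1-\lambda) \ln(1-\lambda)$, and this extra term is the same for source and target, so it cancels in the difference.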
So, let us try to settle this question:
Question. Is every convex-linear and continuous functor $G\colon \mathrm{FinState} \to \mathbb{R}$ a constant multiple of the change in von Neumann entropy? In other words, does there exist a constant $c \in \mathbb{R}$ such that $G(f) = c\,(S(b) - S(a))$ for all morphisms $f\colon (A, a) \to (B, b)$ in $\mathrm{FinState}$?
One fly in the ointment is that now entropy change can take either sign. This is why $G$ takes values in $\mathbb{R}$ rather than $[0,\infty)$. And this means we can’t instantly conclude that $G$ of an isomorphism is zero, as we did before. Maybe we need more conditions now. Or maybe we just need to be smarter.
There’s a lot more to say, but this should at least start the conversation…
By the way:
Puzzle. If you wanted to classify finite-dimensional C*-algebras using pictures of some famous sort, what kind of pictures could you use?
Re: Category-Theoretic Characterizations of Entropy III
There is Araki’s notion of relative entropy of states on von Neumann algebras (a reference is here). In the finite-dimensional case, for states $a$ and $b$, this is
$$ S(a \| b) = \mathrm{tr}\bigl(a (\ln a - \ln b)\bigr) $$
This enjoys what is called the “strict positivity” property, which says that $S(a \| b) = 0$ implies that $a = b$.