### The Space of Physical Frameworks (Part 2)

#### Posted by John Baez

I’m trying to work out how classical statistical mechanics can reduce to thermodynamics in a certain limit. I sketched out the game plan in Part 1 but there are a lot of details to hammer out. While I’m doing this, let me stall for time by explaining more precisely what I mean by ‘thermodynamics’. Thermodynamics is a big subject, but I mean something more precise and limited in scope.

### Thermostatic systems

A lot of what we call ‘thermodynamics’, or more precisely ‘classical thermodynamics’, has nothing to do with dynamics. It’s really about systems in equilibrium, not changing, so it actually deserves to be called ‘thermostatics’. Here’s one attempt to formalize a core idea:

**Definition.** A **thermostatic system** is a convex space $X$ together with a concave function $S \colon X \to [-\infty,\infty]$. We call $X$ the **space of states**, and call $S(x)$ the **entropy** of the state $x \in X$.

There’s a lot packed into this definition:

- The general concept of convex space: it’s roughly a set where you can take convex combinations of points $x,y$, like $a x + (1-a) y$ where $0 \le a \le 1$.
- How we make $[-\infty,\infty]$ into a convex space: it’s pretty obvious, except that $-\infty$ beats $\infty$ in convex combinations, like $\frac{1}{3} (-\infty) + \frac{2}{3} \infty = -\infty$.
- What is a ‘concave’ function $S \colon X \to [-\infty,\infty]$: it’s a function with

$S(a x + (1-a) y) \ge a S(x) + (1-a) S(y) \qquad \text{for} \; 0 \le a \le 1$
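As a quick numerical sanity check of this inequality (my own illustration, not part of the post), take the hypothetical concave entropy $S(E) = \log E$ on the convex space $(0,\infty)$ and test the defining inequality on a few convex combinations:

```python
import numpy as np

# A hypothetical example of a concave entropy on X = (0, ∞):
# S(E) = log E.  We check the concavity inequality
# S(a x + (1-a) y) >= a S(x) + (1-a) S(y) on sample points.
S = np.log

x, y = 2.0, 10.0                      # two states (energies)
for a in np.linspace(0.0, 1.0, 11):
    lhs = S(a * x + (1 - a) * y)      # S(a x + (1-a) y)
    rhs = a * S(x) + (1 - a) * S(y)   # a S(x) + (1-a) S(y)
    assert lhs >= rhs - 1e-12         # concavity (with fp slack)
print("log satisfies the concavity inequality on these test points")
```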

To see all the details spelled out with lots of examples, try this:

- John Baez, Owen Lynch and Joe Moeller, Compositional thermostatics, *J. Math. Phys.* **64** (2023), 023304. (Blog articles here.)

We actually defined a *category* of thermostatic systems and maps between them.

### What you can do with a thermostatic system

For now I will only consider thermostatic systems where $X = \mathbb{R}$, made into a convex set in the usual way. In these examples a state is solely determined by its **energy** $E \in \mathbb{R}$. I’m trying to keep things as simple as possible, and generalize later only if my overall plan actually works.

Here’s what people do in this very simple setting. Our thermostatic system is a concave function

$S \colon \mathbb{R} \to [-\infty, \infty]$

describing the entropy $S(E)$ of our system when it has energy $E$. But often entropy is also a strictly increasing function of energy, with $S(E) \to \infty$ as $E \to \infty$. In this case, it’s impossible for a system to literally maximize entropy. What it does instead is maximize ‘entropy minus how much it spends on energy’ — just as you might try to maximize the pleasure you get from eating doughnuts minus your displeasure at spending money. Thus, if $C$ is the ‘cost’ of energy, our system tries to maximize

$S(E) - C E$

The cost $C$ is the reciprocal of a quantity called **temperature**:

$C = \frac{1}{T}$

So, $C$ should be called **inverse temperature**, and the rough intuition you should have is this. When it’s hot, energy is cheap and our system’s energy can afford to be high. When it’s cold, energy costs a lot and our system will not let its energy get too high.

If $S(E) - C E$ as a function of $E$ is differentiable and has a maximum, the maximum must occur at a point where

$\frac{d}{d E} \left(S(E) - C E \right) = 0$

or

$\frac{d}{d E} S(E) = C$

This gives the fundamental relation between energy, entropy and temperature:

$\frac{d}{d E} S(E) = \frac{1}{T}$

However, the math will work better for us if we use the inverse temperature.

Suppose we have a system maximizing $S(E) - C E$ for some value of $C$. The maximum value of $S(E) - C E$ is called **free entropy** and denoted $\Phi$. In short:

$\Phi(C) = \sup_E \left(S(E) - C E \right)$

or if you prefer

$-\Phi(C) = \inf_E \left(C E - S(E) \right)$

This way of defining $-\Phi$ in terms of $S$ is called a **Legendre–Fenchel transform**, though conventions vary about the precise definition of this transform, and also its name. Since I’m lazy, I’ll just call it the **Legendre transform**.
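To make the transform concrete, here is a small numerical sketch (my own hypothetical example, not from the post): for the entropy $S(E) = \log E$ on $E > 0$, the supremum $\Phi(C) = \sup_E \left(S(E) - C E\right)$ is achieved at $E = 1/C$ and equals $-\log C - 1$, which a grid search confirms:

```python
import numpy as np

# Hypothetical entropy: S(E) = log E on E > 0 (concave, strictly
# increasing, with S(E) -> infinity as E -> infinity).
E_grid = np.linspace(1e-3, 50.0, 200_001)

def free_entropy(C):
    """Phi(C) = sup_E (S(E) - C*E), approximated on a grid."""
    return np.max(np.log(E_grid) - C * E_grid)

for C in [0.5, 1.0, 2.0]:
    exact = -np.log(C) - 1.0        # closed form: the sup is at E = 1/C
    approx = free_entropy(C)
    assert abs(approx - exact) < 1e-3
    print(f"C = {C}:  Phi ≈ {approx:.4f}  (exact {exact:.4f})")
```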

The great thing about Legendre transforms is that if a function is convex and lower semicontinuous, when you take its Legendre transform twice you get that function back! This is part of the **Fenchel–Moreau theorem**. So under these conditions we automatically get another formula that looks very much like the one we’ve just seen:

$S(E) = \inf_C \left(C E + \Phi(C) \right)$
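Here is a hedged numerical check of this recovery formula, again using the hypothetical entropy $S(E) = \log E$, for which $\Phi(C) = -\log C - 1$ in closed form:

```python
import numpy as np

# For the hypothetical entropy S(E) = log E, the free entropy is
# Phi(C) = -log(C) - 1.  Fenchel–Moreau predicts that
# inf_C (C*E + Phi(C)) gives back S(E) = log E.
def Phi(C):
    return -np.log(C) - 1.0

C_grid = np.linspace(1e-3, 20.0, 200_001)
for E in [0.5, 1.0, 3.0]:
    recovered = np.min(C_grid * E + Phi(C_grid))   # inf over C
    assert abs(recovered - np.log(E)) < 1e-3       # the inf is at C = 1/E
    print(f"E = {E}:  inf_C (C E + Phi(C)) ≈ {recovered:.4f}, "
          f"log E = {np.log(E):.4f}")
```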

When $C E + \Phi(C)$ has a minimum as a function of $C$ and it’s differentiable there, this minimum must occur at a point where

$\frac{d}{d C} \left(C E + \Phi(C) \right) = 0$

or

$\frac{d}{d C} \Phi(C) = -E$

### Summary

I’m plotting a difficult course between sticking with historical conventions in thermodynamics and trying to make everything mathematically elegant. Everything above looks more elegant if we work with minus the free entropy, $\Psi = -\Phi$. Starting from a thermostatic system $S \colon \mathbb{R} \to [-\infty,\infty]$ we then get a beautifully symmetrical pair of relations:

$\Psi(C) = \inf_E \left(C E - S(E) \right), \qquad S(E) = \inf_C \left(C E - \Psi(C) \right)$

If the first infimum is achieved at some energy $E$ and $S$ is differentiable there, then

$S'(E) = C$

at this value of $E$, and this formula lets us compute the inverse temperature $C$ as a function of $E$. Similarly, if the second infimum is achieved at some $C$ and $\Psi$ is differentiable there, then

$\Psi'(C) = E$

at this value of $C$, and this formula lets us compute $E$ as a function of $C$.
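Both derivative formulas can be sanity-checked numerically with the same hypothetical example $S(E) = \log E$, for which $\Psi(C) = \inf_E (C E - \log E) = 1 + \log C$. Computing $\Psi$ by a grid minimization and differentiating it by a central finite difference recovers the minimizing energy $E = 1/C$:

```python
import numpy as np

# Hypothetical entropy S(E) = log E; Psi(C) = inf_E (C*E - log E),
# with the infimum achieved at E = 1/C.
E_grid = np.linspace(1e-3, 50.0, 500_001)

def Psi(C):
    return np.min(C * E_grid - np.log(E_grid))

# Check Psi'(C) = E at the minimizing energy, using a central
# finite difference for the derivative.
C, h = 0.4, 1e-4
dPsi = (Psi(C + h) - Psi(C - h)) / (2 * h)
assert abs(dPsi - 1.0 / C) < 1e-2      # minimizing E is 1/C = 2.5
print(f"Psi'({C}) ≈ {dPsi:.4f}, expected E = {1.0 / C:.4f}")
```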

*When we describe a thermostatic system as a limit of classical statistical mechanical systems, these are the formulas we’d like to see emerge in the limit!*

### Appendix: the traditional formalism

If you’ve never heard of ‘free entropy’, you may be relieved to hear it’s a repackaging of the more familiar concept of ‘free energy’. The free energy $F$, or more specifically the **Helmholtz free energy**, is related to the free entropy by

$F = - T \Phi$

Unless you’re a real die-hard fan of thermodynamics, don’t read the following stuff, since it will only further complicate the picture I’ve tried to paint above, which is already blemished by the fact that physicists prefer $\Phi$ to $-\Phi = \Psi$. I will not provide any profound new insights: I will merely relate what I’ve already explained to an equivalent but more traditional formalism.

I’ve been treating entropy as a function of energy: this is the so-called **entropy scheme**. But it’s traditional to treat energy as a function of entropy: this is called the **energy scheme**.

The entropy scheme generalizes better. In thermodynamics we often want to think about situations where entropy is a function of several variables: energy, volume, the amounts of various chemicals, and so on. Then we should work with a thermostatic system $S \colon X \to [-\infty,\infty]$ where $X$ is a convex subset of $\mathbb{R}^n$. Everything I did generalizes nicely to that situation, and now $\Psi$ will be one of $n$ quantities that arise by taking a Legendre transform of $S$.

But when entropy is a function of just one variable, energy, people often turn the tables and try to treat energy as a function of entropy, say $E(S)$. They then define the **free energy** as a function of temperature by

$F(T) = \inf_S (E(S) - T S)$

This is essentially a Legendre transform — but notice that inside the parentheses we have $E(S) - T S$ instead of $T S - E(S)$. We can fix this by using a sup instead of an inf, and writing

$-F(T) = \sup_S (T S - E(S))$

It’s actually very common to define the Legendre transform using a sup instead of an inf, so that’s fine. The only wrinkle is that this Legendre transform gives us $-F$ instead of $F$.
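As a sketch under the same hypothetical running example: if $S(E) = \log E$ then $E(S) = e^S$, and the infimum $F(T) = \inf_S \left(E(S) - T S\right)$ is achieved at $S = \log T$, giving $F(T) = T - T \log T$ in closed form. A grid minimization agrees:

```python
import numpy as np

# Energy scheme for the hypothetical example E(S) = exp(S),
# the inverse of S(E) = log E.
S_grid = np.linspace(-10.0, 10.0, 400_001)

def F(T):
    """F(T) = inf_S (E(S) - T*S), approximated on a grid."""
    return np.min(np.exp(S_grid) - T * S_grid)

for T in [0.5, 1.0, 2.0]:
    exact = T - T * np.log(T)     # closed form: the inf is at S = log T
    assert abs(F(T) - exact) < 1e-3
    print(f"T = {T}:  F ≈ {F(T):.4f}  (exact {exact:.4f})")
```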

When the supremum is achieved at a point where $E$ is differentiable we have

$\displaystyle{ \frac{d}{d S} E(S) = T }$

at that point. When $E$ is convex and lower semicontinuous, taking its Legendre transform twice gets us back where we started:

$E(S) = \sup_T (T S + F(T))$

And when this supremum is achieved at a point where $F$ is differentiable, we have

$\displaystyle{ \frac{d}{d T} F(T) = - S }$
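Continuing the hypothetical example, where $F(T) = T - T \log T$ and the supremum in $E(S) = \sup_T (T S + F(T))$ is achieved at $T = e^S$, a finite difference confirms $F'(T) = -S$ at that point:

```python
import numpy as np

# Hypothetical example: F(T) = T - T*log(T).  At the maximizing T
# we should have F'(T) = -S, where S = log T is the entropy at
# which the earlier infimum defining F was achieved.
def F(T):
    return T - T * np.log(T)

T, h = 2.0, 1e-6
dF = (F(T + h) - F(T - h)) / (2 * h)   # central difference
S_star = np.log(T)                     # corresponding entropy
assert abs(dF + S_star) < 1e-8         # F'(T) = -S
print(f"F'({T}) ≈ {dF:.6f}, and -S = {-S_star:.6f}")
```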

To top it off, physicists tend to assume $S$ and $T$ take values where the suprema above are achieved, and not explicitly write what is a function of what. So they would summarize everything I just said with these equations:

$F = E - T S , \qquad E = T S + F$

$\displaystyle{ \frac{d F}{d T} = - S , \qquad \frac{d E}{d S} = T }$

If instead we take the approach I’ve described, where entropy is treated as a function of energy, it’s natural to focus on the negative free entropy $\Psi$ and inverse temperature $C$. If we write the equations governing these in the same slapdash style as those above, they look like this:

$\Psi = C E - S, \qquad S = C E - \Psi$

$\displaystyle{ \frac{d \Psi}{d C} = E, \qquad \frac{d S}{d E} = C }$

Less familiar, but more symmetrical! The two approaches are related by

$\displaystyle{ C = \frac{1}{T}, \qquad \Psi = \frac{F}{T} }$
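For the hypothetical running example, these relations can be checked directly: with $S(E) = \log E$ we found $\Psi(C) = 1 + \log C$ and $F(T) = T - T \log T$, and indeed $F(T)/T = 1 - \log T = 1 + \log(1/T) = \Psi(1/T)$:

```python
import numpy as np

# Hypothetical running example: Psi(C) = 1 + log C, F(T) = T - T log T.
def Psi(C):
    return 1.0 + np.log(C)

def F(T):
    return T - T * np.log(T)

for T in [0.25, 1.0, 4.0]:
    C = 1.0 / T                            # C = 1/T
    assert abs(Psi(C) - F(T) / T) < 1e-12  # Psi = F/T
print("Psi(1/T) = F(T)/T on these test points")
```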

> Thermodynamics is a funny subject. The first time you go through it, you don’t understand it at all. The second time you go through it, you think you understand it, except for one or two points. The third time you go through it, you know you don’t understand it, but by that time you are so used to the subject, it doesn’t bother you anymore.
>
> — Arnold Sommerfeld
