The Space of Physical Frameworks (Part 4)
Posted by John Baez
In Part 1, I explained my hopes that classical statistical mechanics reduces to thermodynamics in the limit where Boltzmann’s constant approaches zero. In Part 2, I explained exactly what I mean by ‘thermodynamics’. I also showed how, in this framework, a quantity called ‘negative free entropy’ arises as the Legendre transform of entropy.
In Part 3, I showed how a Legendre transform can arise as a limit of something like a Laplace transform.
Today I’ll put all the puzzle pieces together. I’ll explain exactly what I mean by ‘classical statistical mechanics’, and how negative free entropy is defined in this framework. Its definition involves a Laplace transform. Finally, using the result from Part 3, I’ll show that as $k \to 0$, negative free entropy in classical statistical mechanics approaches the negative free entropy we’ve already seen in thermodynamics!
Thermodynamics versus statistical mechanics
In a certain important approach to thermodynamics, called classical thermodynamics, we only study relations between the ‘macroscopic observables’ of a system. These are the things you can measure at human-sized distance scales, like the energy, temperature, volume and pressure of a canister of gas. We don’t think about individual atoms and molecules! We say the values of all the macroscopic observables specify the system’s macrostate. So when I formalized thermodynamics using ‘thermostatic systems’ in Part 2, the ‘space of states’ $X$ was really a space of macrostates. Real-valued functions on $X$ were macroscopic observables.
I focused on the simple case where the macrostate is completely characterized by a single macroscopic observable called its energy $E$. In this case the space of macrostates is $[0,\infty)$. If we can understand this case, we can generalize later.
In classical statistical mechanics we go further and consider the set of microstates of a system. The microstate specifies all the microscopic details of a system! For example, if our system is a canister of helium, a microstate specifies the position and momentum of each atom. Thus, the space of microstates is typically a high-dimensional manifold — where by ‘high’ I mean something like $10^{23}$. On the other hand, the space of macrostates is often low-dimensional — where by ‘low’ I mean something between 1 and 10.
To connect thermodynamics to classical statistical mechanics, we need to connect macrostates to microstates. The relation is that each macrostate is a probability distribution of microstates: a probability distribution that maximizes entropy subject to constraints on the expected values of macroscopic observables.
To see in detail how this works, let’s focus on the simple case where our only macroscopic observable is energy.
Classical statistical mechanical systems
Definition. A classical statistical mechanical system is a measure space $(X,\mu)$ equipped with a measurable function

$$H \colon X \to [0,\infty)$$

We call $X$ the set of microstates, call $H$ the Hamiltonian, and call $H(x)$ the energy of the microstate $x \in X$.
It gets tiring to say ‘classical statistical mechanical system’, so I’ll abbreviate this as classical stat mech system.
When we macroscopically measure the energy of a classical stat mech system to be $E$, what’s really going on is that the system is in a probability distribution of microstates for which the expected value of energy is $E$. A probability distribution is defined to be a measurable function

$$p \colon X \to [0,\infty)$$

with

$$\int_X p \, d\mu = 1$$
The expected energy in this probability distribution is defined to be

$$\langle H \rangle = \int_X H p \, d\mu$$
So what I’m saying is that $p$ must have

$$\langle H \rangle = E$$
But lots of probability distributions have $\langle H \rangle = E$. Which one is the physically correct one? It’s the one that maximizes the Gibbs entropy:

$$S(p) = -k \int_X p \ln p \, d\mu$$
Here $k$ is a unit of entropy called Boltzmann’s constant. Its value doesn’t affect which probability distribution maximizes the entropy! But it will affect other things to come.
Now, there may not exist a probability distribution $p$ that maximizes $S(p)$ subject to the constraint $\langle H \rangle = E$, but there often is — and when there is, we can compute what it is. If you haven’t seen this computation, you can find it in my book What is Entropy? starting on page 24. The answer is the Boltzmann distribution:

$$p = \frac{e^{-\beta H/k}}{\int_X e^{-\beta H/k} \, d\mu}$$
Here $\beta$ is a number called the inverse temperature. We have to cleverly choose its value to ensure $\langle H \rangle = E$. That might not even be possible. But if we get that to happen, $p$ will be the probability distribution we seek.
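To see how choosing $\beta$ works in practice, here is a small numerical sketch. It is my own toy example, not from the post: a hypothetical system with three microstates and counting measure, so the integrals above become sums. Since the expected energy decreases as $\beta$ grows, we can find the $\beta$ with $\langle H \rangle = E$ by bisection.

```python
import math

# Toy example (not from the post): three microstates with counting measure.
H = [0.0, 1.0, 2.0]   # hypothetical energies of the microstates
k = 1.0               # Boltzmann's constant, in units where k = 1

def boltzmann(beta):
    """Boltzmann distribution: p_i proportional to exp(-beta * H_i / k)."""
    weights = [math.exp(-beta * h / k) for h in H]
    Z = sum(weights)  # the normalizing factor (partition function)
    return [w / Z for w in weights]

def expected_energy(beta):
    """Expected energy <H> in the Boltzmann distribution."""
    return sum(p * h for p, h in zip(boltzmann(beta), H))

def solve_beta(E, lo=-50.0, hi=50.0):
    """Bisect for the beta with <H> = E, using that <H> decreases in beta."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if expected_energy(mid) > E:
            lo = mid   # energy still too high: raise beta
        else:
            hi = mid
    return 0.5 * (lo + hi)

beta = solve_beta(0.8)
print(beta, expected_energy(beta))
```

Here the target energy 0.8 lies strictly between the smallest and largest energies, which is what makes the constraint satisfiable.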
The normalizing factor in the formula above is called the partition function:

$$Z(\beta) = \int_X e^{-\beta H/k} \, d\mu$$

and it turns out to be important in its own right. The integral may not always converge, but when it does not we’ll just say it equals $+\infty$, so we get a function

$$Z \colon (0,\infty) \to (0,\infty]$$
One reason the partition function is important is that

$$-k \ln Z(\beta) = \beta \langle H \rangle - S(p)$$

where $\langle H \rangle$ and $S(p)$ are computed using the Boltzmann distribution $p$ for the given value of $\beta$. For a proof see pages 67–71 of my book, though beware that I use different notation. The quantity above is called the negative free entropy of our classical stat mech system. In my book I focus on a closely related quantity called the ‘free energy’, which is the negative free entropy divided by $\beta$. Also, I talk about the coolness $1/kT$ instead of the inverse temperature $\beta = 1/T$.
Let’s call the negative free entropy $\Phi(\beta)$, so

$$\Phi(\beta) = -k \ln Z(\beta)$$
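The relation $-k \ln Z(\beta) = \beta \langle H \rangle - S$, with $\langle H \rangle$ and the Gibbs entropy $S$ computed in the Boltzmann distribution, is easy to check numerically. Below is a sketch with a hypothetical four-state system (counting measure, so sums replace integrals); all the specific numbers are arbitrary choices of mine, not values from the post.

```python
import math

# Check  -k ln Z(beta) = beta <H> - S  on a made-up discrete system.
H = [0.0, 0.5, 1.3, 2.0]   # hypothetical energies (counting measure)
k = 0.7                    # an arbitrary positive value of Boltzmann's constant
beta = 1.9                 # an arbitrary inverse temperature

weights = [math.exp(-beta * h / k) for h in H]
Z = sum(weights)                    # partition function
p = [w / Z for w in weights]        # Boltzmann distribution
mean_H = sum(pi * h for pi, h in zip(p, H))    # expected energy <H>
S = -k * sum(pi * math.log(pi) for pi in p)    # Gibbs entropy

neg_free_entropy = -k * math.log(Z)
print(neg_free_entropy, beta * mean_H - S)
```

The agreement is exact up to floating-point roundoff, since the identity follows directly from $\ln p_i = -\beta H_i/k - \ln Z$.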
I’ve already discussed negative free entropy in Part 2, but that was for thermostatic systems, and it was defined using a Legendre transform. This new version of negative free entropy applies to classical stat mech systems, and we’ll see it’s defined using a Laplace transform. But they’re related: the limit of the new one as $k \to 0$ is the old one!
The limit as $k \to 0$
To compute the limit of the negative free entropy as $k \to 0$, it will help to introduce some additional concepts.
First, given a classical stat mech system with measure space $(X,\mu)$ and Hamiltonian $H \colon X \to [0,\infty)$, let

$$\nu(E) = \mu \big( \{ x \in X : \, H(x) \le E \} \big)$$

be the measure of the set of microstates with energy at most $E$. This is an increasing function of $E$ which is right-continuous, so it defines a Lebesgue–Stieltjes measure $d\nu$ on the real line. Yes, I taught real analysis for decades and always wondered when I’d actually use this concept in my own work: today is the day!
The reason I care about this measure is that it lets us rewrite the partition function as an integral over the nonnegative real numbers:

$$Z(\beta) = \int_0^\infty e^{-\beta E/k} \, d\nu(E)$$
Very often the measure $d\nu$ is absolutely continuous, which means that

$$d\nu(E) = g(E) \, dE$$

for some locally integrable function $g \colon [0,\infty) \to [0,\infty)$. I will assume this from now on. We thus have

$$Z(\beta) = \int_0^\infty e^{-\beta E/k} \, g(E) \, dE$$
Physicists call $g$ the density of states because if we integrate it over some interval $[E, E + \Delta E]$ we get ‘the number of states’ in that energy range. At least that’s what physicists say. What we actually get is the measure of the set

$$\{ x \in X : \, E \le H(x) \le E + \Delta E \}$$
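Here is a numerical sanity check of rewriting the partition function as an energy integral against the density of states, using a hypothetical example of my own: $X = [0,\infty)^2$ with Lebesgue measure and $H(x,y) = x + y$. Then $\nu(E) = E^2/2$ (the area of a triangle), so $g(E) = E$, and both the microstate integral and the energy integral give $Z(\beta) = (k/\beta)^2$ exactly.

```python
import math

# Hypothetical system: X = [0, inf)^2, Lebesgue measure, H(x, y) = x + y.
# Then nu(E) = E^2 / 2, the density of states is g(E) = E, and
# Z(beta) = (k/beta)^2 exactly.
k, beta = 1.0, 2.0

def g(E):
    return E   # density of states for this example

# Midpoint rule for Z(beta) = integral of exp(-beta*E/k) g(E) dE over [0, L];
# the integrand is negligible beyond L = 40.
L, n = 40.0, 400_000
h = L / n
Z_num = sum(math.exp(-beta * (i + 0.5) * h / k) * g((i + 0.5) * h)
            for i in range(n)) * h

Z_exact = (k / beta) ** 2
print(Z_num, Z_exact)
```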
Before moving on, a word about dimensional analysis. I’m doing physics, so my quantities have dimensions. In particular, $E$ and $\Delta E$ have units of energy, while the measure $\nu$ is dimensionless, so the density of states $g$ has units of energy$^{-1}$.
This matters because right now I want to take the logarithm of $g(E)$, yet the rules of dimensional analysis include a stern finger-wagging prohibition against taking the logarithm of a quantity unless it’s dimensionless. There are legitimate ways to bend these rules, but I won’t. Instead I’ll follow most physicists and introduce a constant with dimensions of energy, $w$, called the energy width. It’s wise to think of this as an arbitrary small unit of energy. Using this we can make all the calculations to come obey the rules of dimensional analysis. If you find that ridiculous, you can mentally set $w$ equal to 1. In fact I’ll do that later at some point.
With that said, now let’s introduce the so-called microcanonical entropy, often called the Boltzmann entropy:

$$S_{\mathrm{micro}}(E) = k \ln \big( g(E) \, w \big)$$

Here we are taking Boltzmann’s old idea of entropy as $k$ times the logarithm of the number of states and applying it to the density of states. This allows us to define an entropy of our system at a specific fixed energy $E$. Physicists call the set of microstates with energy exactly equal to some number $E$ the microcanonical ensemble, and they say the microcanonical entropy is the entropy of the microcanonical ensemble. This is a bit odd, because the set of microstates with energy exactly $E$ typically has measure zero. But it’s a useful way of thinking.
In terms of the microcanonical entropy, we have

$$g(E) = \frac{1}{w} \, e^{S_{\mathrm{micro}}(E)/k}$$

Combining this with our earlier formula

$$Z(\beta) = \int_0^\infty e^{-\beta E/k} \, g(E) \, dE$$

we get this formula for the partition function:

$$Z(\beta) = \int_0^\infty e^{\left( S_{\mathrm{micro}}(E) - \beta E \right)/k} \, \frac{dE}{w}$$
Now things are getting interesting!
First, the quantity $S_{\mathrm{micro}}(E) - \beta E$ should remind you of the formula we saw in Part 2 for the negative free entropy of a thermostatic system. Remember, that formula was

$$\Phi(\beta) = \inf_E \big( \beta E - S(E) \big)$$
Second, we instantly get a beautiful formula for the negative free entropy of a classical stat mech system:

$$\Phi(\beta) = -k \ln Z(\beta) = -k \ln \int_0^\infty e^{\left( S_{\mathrm{micro}}(E) - \beta E \right)/k} \, \frac{dE}{w}$$
Now for the climax of this whole series so far. We can now prove that as $k \to 0$, the negative free entropy of a classical stat mech system approaches the negative free entropy of a thermostatic system!
To state and prove this result, I will switch to treating the microcanonical entropy $S \colon [0,\infty) \to \mathbb{R}$ as fundamental, rather than defining it in terms of the density of states. This means we can now let $k \to 0$ while holding the function $S$ fixed. I will also switch to units of energy where $w = 1$. Thus, starting from $S$, I will define the negative free entropy $\Phi_k(\beta)$ by

$$\Phi_k(\beta) = -k \ln \int_0^\infty e^{\left( S(E) - \beta E \right)/k} \, dE$$

We can now study the limit of $\Phi_k(\beta)$ as $k \to 0$.
Main Result. Suppose $S \colon [0,\infty) \to \mathbb{R}$ is a concave function with continuous second derivative. Suppose that for some $\beta > 0$ the quantity $\beta E - S(E)$ has a unique minimum as a function of $E$, and $S'' < 0$ at that minimum. Then

$$\lim_{k \to 0} \, - k \ln \int_0^\infty e^{\left( S(E) - \beta E \right)/k} \, dE \;=\; \min_E \big( \beta E - S(E) \big)$$
The quantity at right deserves to be called the microcanonical negative free entropy. It’s the negative free entropy of the thermostatic system whose entropy function is $S$.
So, when the hypotheses hold,
As $k \to 0$, the free entropy of a classical statistical mechanical system approaches its microcanonical free entropy!
Here I’ve left off the word ‘negative’ twice, which is fine. But this sentence still sounds like a mouthful. Don’t feel bad if you find it confusing. Still, it could be the result we need to see how classical statistical mechanics approaches classical thermodynamics as $k \to 0$. So I plan to study this result further, and hope to explain it much better!
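Before the proof, a numerical illustration may help. It is my own sketch, not from the post: take the hypothetical concave entropy function $S(E) = 2\sqrt{E}$ with $\beta = 1$, so $\beta E - S(E)$ has its unique minimum at $E_0 = 1$, where $S''(E_0) < 0$, and the minimum value is $-1$. The sketch checks that $\Phi_k(\beta) = -k \ln \int_0^\infty e^{(S(E) - \beta E)/k} \, dE$ creeps toward $-1$ as $k$ shrinks.

```python
import math

beta = 1.0

def S(E):
    """Hypothetical concave entropy function with S'' < 0 everywhere."""
    return 2.0 * math.sqrt(E)

def phi_k(k, L=10.0, n=200_000):
    """Phi_k(beta) = -k ln of the integral of exp((S(E) - beta*E)/k)
    over [0, L], by the midpoint rule.  The exponent is shifted by its
    maximum M so that exp() never overflows."""
    h = L / n
    phi = [S((i + 0.5) * h) - beta * (i + 0.5) * h for i in range(n)]
    M = max(phi)
    total = sum(math.exp((p - M) / k) for p in phi) * h
    return -M - k * math.log(total)

limit = -1.0   # min over E of beta*E - S(E), attained at E0 = 1
errs = [abs(phi_k(k) - limit) for k in (0.1, 0.01, 0.001)]
print(errs)
```

The convergence is slow: the error shrinks roughly like $k \ln k$, consistent with the leading correction in Laplace’s method.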
But today I’ll just prove the main result and quit. I figure it’s good to get the math done before talking more about what it means.
Proof of the main result
Suppose all the hypotheses of the main result hold. Spelling out the definition of the negative free entropy $\Phi_k(\beta)$, what we need to show is

$$\lim_{k \to 0} \, - k \ln \int_0^\infty e^{\left( S(E) - \beta E \right)/k} \, dE \;=\; \min_E \big( \beta E - S(E) \big)$$
To do this, we use a theorem from Part 3. My argument for that theorem was not a full mathematical proof — I explained the hole I still need to fill — so I cautiously called it an ‘almost proved theorem’. Here it is:
Almost Proved Theorem. Suppose that $f \colon [0,\infty) \to \mathbb{R}$ is a concave function with continuous second derivative. Suppose that for some $c > 0$ the function $c x - f(x)$ has a unique minimum at $x_0$, and $f''(x_0) < 0$. Then as $T \to 0^+$ we have

$$- T \ln \int_0^\infty e^{\left( f(x) - c x \right)/T} \, dx \;\to\; c \, x_0 - f(x_0)$$
Now let’s use this to prove our main result! To do this, take

$$f = S, \qquad x = E, \qquad c = \beta, \qquad T = k$$
Then, writing $E_0$ for the point where $\beta E - S(E)$ attains its minimum, we get

$$\lim_{k \to 0} \, - k \ln \int_0^\infty e^{\left( S(E) - \beta E \right)/k} \, dE \;=\; \beta E_0 - S(E_0) \;=\; \min_E \big( \beta E - S(E) \big)$$
and this is exactly what we want. ∎
Re: The Space of Physical Frameworks (Part 4)
Did you swap from using w to using k for the energy width in the proof at the end?