The 2025 Physics Nobel Prize was announced this week, awarded to John Clarke, Michel Devoret, and John Martinis for building an electrical circuit that exhibited quantum effects like tunneling and energy quantization on a macroscopic scale.
Press coverage of this prize tends to focus on two aspects: the idea that these three “scaled up” quantum effects to medium-sized objects (the technical account quotes a description that calls it “big enough to get one’s grubby fingers on”), and that the work paved the way for some of the fundamental technologies people are exploring for quantum computing.
That’s a fine enough story, but it leaves out what made these folks’ work unique, and why it differs from the work of other Nobel laureates who studied other quantum systems. It’s a bit more technical a story, but I don’t think it’s that technical. I’ll try to tell it here.
To start, have you heard of Bose-Einstein Condensates?
Bose-Einstein Condensates are macroscopic quantum states that have already won Nobel prizes. First theorized based on ideas developed by Einstein and Bose (the namesake of bosons), they involve a large number of particles moving together, each in the same state. The first gas obeying Einstein’s equations for a Bose-Einstein Condensate wasn’t created until the 1990s, after Clarke, Devoret, and Martinis’s work, but other systems based on essentially the same principles were created much earlier. A laser works on the same principles as a Bose-Einstein condensate, as do phenomena like superconductivity and superfluidity.
This means that lasers, superfluids, and superconductors had been showing off quantum mechanics on grubby finger scales well before Clarke, Devoret, and Martinis’s work. But the science rewarded by this year’s Nobel turns out to be something quite different.
Because the different photons in laser light are independently in identical quantum states, lasers are surprisingly robust. You can disrupt the state of one photon, and it won’t disturb the states of the others. You’ll have weakened the laser’s coherence a little bit, but the disruption won’t spread much, if at all.
That’s very different from the way quantum systems usually work. Schrodinger’s cat is the classic example. You have a box with a radioactive atom, and if that atom decays, it releases poison, killing the cat. You don’t know if the atom has decayed or not, and you don’t know if the cat is alive or not. We say the atom’s state is a superposition of decayed and not decayed, and the cat’s state is a superposition of alive and dead.
But unlike photons in a laser, the atom and the cat in Schrodinger’s cat are not independent: if the atom has decayed, the cat is dead, if the atom has not, the cat is alive. We say the states of atom and cat are entangled.
That makes these so-called “Schrodinger’s cat” states much more delicate. The state of the cat depends on the state of the atom, and those dependencies quickly “leak” to the outside world. If you haven’t sealed the box well, the smell of the room is now also entangled with the cat…which, if you have a sense of smell, means that you are entangled with the cat. That’s the same as saying that you have measured the cat, so you can’t treat it as quantum any more.
What Clarke, Devoret, and Martinis did was to build a circuit that could exhibit, not a state like a laser, but a “cat state”: delicately entangled, at risk of total collapse if measured.
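To make the contrast concrete, here is a minimal two-qubit sketch in Python (purely illustrative; these toy states are not meant to model the actual circuits or photons): measuring one qubit of a product state leaves the other qubit untouched, while measuring one qubit of an entangled “cat” state completely determines the other.

```python
import numpy as np

# Single-qubit basis states |0> and |1>
zero = np.array([1.0, 0.0])
one = np.array([0.0, 1.0])
plus = (zero + one) / np.sqrt(2)   # an equal superposition

# "Laser-like" product state: two independent qubits, each in |+>
product_state = np.kron(plus, plus)

# "Cat" state: (|00> + |11>)/sqrt(2), maximally entangled
cat_state = (np.kron(zero, zero) + np.kron(one, one)) / np.sqrt(2)

def measure_first_qubit(state, outcome):
    """Project the first qubit onto |outcome> and return the
    normalized post-measurement state of the second qubit."""
    proj = np.outer([1 - outcome, outcome], [1 - outcome, outcome])  # |o><o|
    projected = np.kron(proj, np.eye(2)) @ state
    second = projected.reshape(2, 2)[outcome]   # read off the second qubit
    return second / np.linalg.norm(second)

# Product state: measuring the first qubit leaves the second in |+> either way
print(measure_first_qubit(product_state, 0))  # ~ [0.707, 0.707]
print(measure_first_qubit(product_state, 1))  # ~ [0.707, 0.707]

# Cat state: the outcome on the first qubit completely fixes the second
print(measure_first_qubit(cat_state, 0))      # [1, 0]  i.e. |0>
print(measure_first_qubit(cat_state, 1))      # [0, 1]  i.e. |1>
```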
That’s why they deserved a Nobel, even in a world where there are many other Nobels for different types of quantum states. Lasers, superconductors, even Bose-Einstein condensates were in a sense “easy mode”, robust quantum states that didn’t need all that much protection. This year’s physics laureates, in contrast, showed it was possible to make circuits that could make use of quantum mechanics’ most delicate properties.
That’s also why their circuits, in particular, are being heralded as a predecessor for modern attempts at quantum computers. Quantum computers do tricks with entanglement: they need “cat states”, not Bose-Einstein Condensates. And Clarke, Devoret, and Martinis’s work in the 1980s was the first clear proof that this was a feasible thing to do.
The Rice Advanced Materials Research Institute is having its 2025-2026 competition for prestigious postdoctoral fellowships - see here: https://rami.rice.edu/rami-postdoctoral-fellowship-program .
If you are interested and meet the criteria, I'd be happy to talk. I have some ideas that lean into the materials for electronics direction, and other possibilities are welcome.
The 2025 MacArthur Fellows have been announced, and I’m very pleased that two of the 22 awardees are my UW-Madison colleagues: congratulations to Ángel F. Adames Corraliza from atmospheric and oceanic sciences, and Sébastien Philippe from nuclear engineering and engineering physics. The only mathematician to win the award this year was Lauren Williams, who true devotees of this blog will remember as the person who gave the best talk I’ve ever seen about cluster algebras. Congratulations to all the winners!
As announced this morning, the 2025 Nobel Prize in Physics has been awarded to John Clarke, Michel Devoret, and John Martinis, for a series of ground-breaking experiments in the 1980s that demonstrated macroscopic quantum tunneling.
For non-experts: "Tunneling" was originally coined to describe the physical motion of a quantum object, which can pass through a "classically forbidden" region. I've written about this here, and here is an evocative picture. Suppose there is a particle with a certain amount of total energy in the left region. Classically, the particle is trapped, because going too far to the left (gray region) or too far to the right (gray region) is forbidden: putting the particle inside the shaded regions is "classically forbidden" by conservation of energy. The particle bounces back and forth in the left well. If the particle is a quantum object, though, it is described by a wave function, and that wave function has some non-zero amplitude on the far side of the barrier in the middle. The particle can "tunnel" through the barrier, with a probability that decreases exponentially with the height of the barrier and its width.
Fig. 2 from here
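For readers who want numbers, here is a minimal sketch of the standard WKB estimate for a rectangular barrier (purely illustrative, not a model of the actual experiments), showing the exponential suppression with barrier width and height:

```python
import numpy as np

hbar = 1.0  # work in units where hbar = m = 1

def wkb_transmission(E, V0, width, m=1.0):
    """WKB estimate of the tunneling probability through a rectangular
    barrier of height V0 > E and the given width: T ~ exp(-2*kappa*width),
    with kappa = sqrt(2m(V0 - E))/hbar."""
    kappa = np.sqrt(2.0 * m * (V0 - E)) / hbar
    return np.exp(-2.0 * kappa * width)

E, V0 = 1.0, 2.0
for width in [0.5, 1.0, 2.0, 4.0]:
    print(width, wkb_transmission(E, V0, width))
# The probability drops exponentially as the barrier gets wider,
# and similarly as the barrier gets taller (V0 - E grows).
```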
This idea, that a macroscopic (in the sense of comprising many many electrons) system could tunnel out of a metastable state like this, had been investigated by Amir Caldeira and Tony Leggett in this important paper, where they worried about the role of dissipation in the environment. People tried hard to demonstrate this, but thermal radiation and other noise in the experiments made it extremely challenging. With great care in experimental setup, the three laureates put together a remarkable series of papers (here, here, here) that showed all the hallmarks, including resonantly enhanced tunneling with tuned microwaves (designed to kick the system between the levels shown in panel (d) of the figure above).
This was an impressive demonstration of controllable, macroscopic quantum tunneling, and it also laid the foundation for the devices now used by the whole superconducting quantum computing community.
Today, of course, is the second anniversary of the genocidal Oct. 7 invasion of Israel—the deadliest day for Jews since the Holocaust, and the event that launched the current wars that have been reshaping the Middle East for better and/or worse. Regardless of whether their primary concern is for Israelis, Palestinians, or both, I’d hope all readers of this blog could at least join me in wishing this barbaric invasion had never happened, and in condemning the celebrations of it taking place around the world.
Now for the happy part: today is also the day when the Nobel Prize in Physics is announced. I was delighted to wake up to the news that this year, the prize goes to John Clarke of Berkeley, John Martinis of UC Santa Barbara, and Michel Devoret of Yale, for their experiments in the 1980s that demonstrated the reality of macroscopic quantum tunneling in superconducting circuits. Among other things, this work laid the foundation for the current effort by Google, IBM, and many others to build quantum computers with superconducting qubits. To clarify, though, today’s prize is not for quantum computing per se, but for the earlier work.
While I don’t know John Clarke, and know Michel Devoret only a little, I’ve been proud to count John Martinis as a good friend for the past decade—indeed, his name has often appeared on this blog. When Google hired John in 2014 to build the first programmable quantum computer capable of demonstrating quantum supremacy, it was clear that we’d need to talk about the theory, so we did. Through many email exchanges, calls, and visits to Google’s Santa Barbara Lab, I came to admire John for his iconoclasm, his bluntness, and his determination to make sampling-based quantum supremacy happen. After Google’s success in 2019, I sometimes wondered whether John might eventually be part of a Nobel Prize in Physics for his experimental work in quantum computing. That may have become less likely today, now that he’s won the Nobel Prize in Physics for his work before quantum computing, but I’m guessing he doesn’t mind! Anyway, huge congratulations to all three of the winners.
Dubois-Violette and Todorov noticed that the Standard Model gauge group is the intersection of two maximal subgroups of . I’m trying to understand these subgroups better.
Very roughly speaking, is the symmetry group of an octonionic qutrit. Of the two subgroups I’m talking about, one preserves a chosen octonionic qubit, while the other preserves a chosen complex qutrit.
A precise statement is here:
Over on Mathstodon I’m working with Paul Schwahn to improve this statement. He made a lot of progress on characterizing the first subgroup. is really the group of automorphisms of the Jordan algebra of self-adjoint octonion matrices, . He showed the first subgroup, the one I said “preserves a chosen octonionic qubit”, is really the subgroup that preserves any chosen Jordan subalgebra isomorphic to .
Now we want to show the second subgroup, the one I said “preserves a chosen complex qutrit”, is really the subgroup that preserves any chosen Jordan subalgebra isomorphic to .
I want to sketch out a proof strategy. So, I’ll often say “I hope” for a step that needs to be filled in.
Choose an inclusion of algebras . All such choices are related by an automorphism of the octonions, so it won’t matter which one we choose.
There is then an obvious copy of sitting inside . I’ll call this the standard copy. To prove the desired result, it’s enough to show:
The subgroup of preserving the standard copy of in is a maximal subgroup of , namely .
All Jordan subalgebras of isomorphic to are related to the standard copy by an transformation.
Part 1 should be the easier one to show, but I don’t even know if this one is true! is a maximal subgroup of , and Yokota shows it preserves the standard copy of in . But he shows it also preserves more, seemingly: it preserves a complex structure on the orthogonal complement of that standard copy. Is this really ‘more’ or does it hold automatically for any element of that preserves the standard copy of ? I don’t know.
But I want to focus on part 2). Here’s what we’re trying to show: any Jordan subalgebra of isomorphic to can be obtained from the standard copy of by applying some element of .
So, pick a Jordan subalgebra A of isomorphic to . Pick an isomorphism A ≅ .
Consider the idempotents
in . Using our isomorphism they give idempotents in , which I’ll call . Since these are also idempotents in .
Hope 1: I hope there is an element of mapping to ⊂ .
Hope 2: Then I hope there is an element of that fixes and maps the subalgebra to the standard copy of in .
If so, we’re done: maps to the standard copy of .
Hope 1 seems to be known. The idempotents form a so-called ‘Jordan frame’ for , and so do . Faraut and Korányi say that “in the irreducible case, the group acts transitively on the set of all Jordan frames”, and I think that implies Hope 1.
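To make the term ‘Jordan frame’ concrete, here is a small illustrative check using the obvious diagonal idempotents of the algebra of 3×3 self-adjoint matrices (which may or may not be the exact idempotents meant above, since those come via an abstract isomorphism): each is idempotent for the Jordan product, distinct ones multiply to zero, and they sum to the identity.

```python
import numpy as np

def jordan(a, b):
    """Jordan product of two matrices."""
    return (a @ b + b @ a) / 2

# The standard Jordan frame in 3x3 self-adjoint matrices:
# the diagonal matrix units e_1, e_2, e_3.
e = [np.diag(v).astype(float) for v in ([1, 0, 0], [0, 1, 0], [0, 0, 1])]

for i in range(3):
    assert np.allclose(jordan(e[i], e[i]), e[i])        # idempotent
    for j in range(3):
        if i != j:
            assert np.allclose(jordan(e[i], e[j]), 0)   # orthogonal
assert np.allclose(sum(e), np.eye(3))                   # sum to the identity
print("e_1, e_2, e_3 form a Jordan frame")
```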
As for Hope 2, I know the subgroup of that fixes contains . I bet it’s exactly . But to prove Hope 2 it may be enough to use .
Let me say a bit more about how we might realize Hope 2. It suffices to consider a Jordan subalgebra of that is isomorphic to and contains
and prove that there is an element of that fixes and maps the subalgebra to the standard copy of in . (In case you’re wondering, this is what I was calling .)
Hope 3: I hope that we can show consists of matrices
where are arbitrary real numbers and range over 2-dimensional subspaces of . This would already make it look fairly similar to the standard copy of , where the subspaces are all our chosen copy of in .
If Hope 3 is true, the subspaces don’t need to be the same, but I believe they do need to obey and cyclic permutations thereof, simply because is closed under the Jordan product.
So, we naturally want to know if such a triple of 2d subspaces of must be related to the ‘standard’ one (where they are all ) by an element of , where acts on the three copies of by the vector, left-handed spinor, and right-handed spinor representations, respectively — since this is how naturally acts on while fixing all the diagonal matrices.
This is a nice algebra question for those who have thought about triality, and more general ‘trialities’.
So, that’s where I am now: a bunch of hopes which might add up to a clarification of what I mean by “the subgroup of symmetries of an octonionic qutrit that preserve a complex qutrit”.
This condition of ill-training is intensified considerably in an institution like the state university, because of the large number of technical students in attendance, many of whom are more interested in acquiring information than getting a real education, and who look upon time as wasted unless it is put in in the acquiring of cold facts which may later be put to use in the earning of money.
That’s Thomas Arkle Clark, Dean of Men at UIUC, writing in 1921 in his book Discipline and the Derelict. He also writes that 70% of students in his anonymous survey admitted to cheating (“cribbing,” as it was then called.) He is pretty high on the student-athlete, who he says subscribes to ideals that were less-well known in his own time as an Illinois undergrad, back in the ’80s:
The athlete was not always so worthy of emulation as he is at present. I do not have to go back farther than my own college days nor even so far as that to recall instances of men who found their way into colleges for the sole purpose of developing or exhibiting their physical powers, of making an athletic team, and without any intention of adding to their intellectual strength.
In Part 5 I explained a cool way to treat bound states of the hydrogen atom as wavefunctions on a sphere in 4-dimensional space. But so far I’ve been neglecting the electron’s spin. Now let’s throw that in too!
This will wind up leading us in some surprising directions. So far I’ve just been reviewing known ideas, but now we’re getting into my new paper:
• Second quantization for the Kepler problem.
It starts out being quite routine: to include spin, we just tensor our previous Hilbert space with a copy of
describing the electron’s spin. The resulting space
is the Hilbert space of bound states of a spinor-valued version of the Schrödinger equation for the hydrogen atom.
Beware: this is a simplification of a more careful treatment of hydrogen using the Dirac equation: it neglects all spin-dependent terms in the Hamiltonian, like spin-orbit interactions. These spin-dependent terms give corrections that go to zero in the limit where the speed of light approaches infinity. So what we’re doing now is giving a nonrelativistic treatment of the hydrogen atom, but taking into account the fact that the electron is a spin-½ particle.
Things get fun now. The Hilbert space becomes a unitary representation of
in three important ways. The first two come from the actions of
on
by left and right translation, which I explained in Part 5. The third comes from the natural action of
on
All three of these actions of
on
commute with each other. We thus get a unitary representation of
on
It is useful to spell this out at the Lie algebra level. In Part 5, I introduced self-adjoint operators and
on
: the self-adjoint generators of the left and right translation actions of
respectively. Now we’ll tensor these operators with the identity on
and get operators on
which by abuse of notation we’ll denote with the same names:
and
But we’ll also introduce spin angular momentum operators
on These operators obey the following commutation relations:
Once we have 3 commuting actions of on a Hilbert space we can get more by mixing and matching them. I won’t go overboard and describe all 2³ = 8 of them, but I’ll mention some that we need for physics. First we can define orbital angular momentum operators
These obey
Physically speaking, these generate an action of
that rotates the position of the electron in space while not changing its spin state, just as the
rotate the electron’s spin state while not changing its position.
Adding the spin and orbital angular momentum, we get total angular momentum operators
which obey
These generate an action of that rotates the electron’s wavefunction along with its spin state!
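Here is a quick numerical check of these commutation relations; it is only a finite-dimensional toy, with explicit spin-1 matrices standing in for the orbital angular momentum operators (which really act on an infinite-dimensional space):

```python
import numpy as np

# Spin-1/2 operators S_i = sigma_i / 2
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
S = [s / 2 for s in (sx, sy, sz)]

# Spin-1 matrices, standing in for the orbital angular momentum operators
r = 1 / np.sqrt(2)
Lx = r * np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=complex)
Ly = r * np.array([[0, -1j, 0], [1j, 0, -1j], [0, 1j, 0]], dtype=complex)
Lz = np.diag([1, 0, -1]).astype(complex)
L = [Lx, Ly, Lz]

# Total angular momentum J_i = L_i (x) 1 + 1 (x) S_i on C^3 (x) C^2
I3, I2 = np.eye(3), np.eye(2)
J = [np.kron(L[i], I2) + np.kron(I3, S[i]) for i in range(3)]

def comm(a, b):
    return a @ b - b @ a

# Check [A_x, A_y] = i A_z and cyclic permutations for S, L and J
for ops in (S, L, J):
    for (i, j, k) in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
        assert np.allclose(comm(ops[i], ops[j]), 1j * ops[k])
print("S_i, L_i and J_i = L_i + S_i all obey the su(2) commutation relations")
```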
Finally, we define a Hamiltonian for our new hydrogen atom with spin:
This is just the Hamiltonian for the simplified hydrogen atom neglecting spin that we studied in Part 5, tensored with the identity operator on
Thus it has the same spectrum, but the multiplicity of each eigenvalue has doubled. This Hamiltonian
commutes with all the operators
and thus also
and
Now we can reuse our work from Part 5 and decompose our new Hilbert space into eigenspaces of the Hamiltonian labeled by
, and the orbital angular momentum operator
labeled by
We get this:
where is the spin-
representation of the
that rotates the electron’s position but not its spin.
In Part 5 we saw a basis of
If we tensor that with the standard basis of
we get an orthonormal basis
of
where:
• the principal quantum number ranges over positive integers;
• the azimuthal quantum number ranges from
to
in integer steps;
• the magnetic quantum number ranges from
to
in integer steps;
• the spin quantum number is
or
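As a quick sanity check on the list above, here is a short script that enumerates the allowed quantum numbers (n, ℓ, m, s) for the first few shells and confirms the familiar count of 2n² states with principal quantum number n:

```python
from fractions import Fraction

def shell_states(n):
    """All (n, l, m, s) with 0 <= l <= n-1, -l <= m <= l, s = +/- 1/2."""
    half = Fraction(1, 2)
    return [(n, l, m, s)
            for l in range(n)
            for m in range(-l, l + 1)
            for s in (half, -half)]

for n in range(1, 5):
    states = shell_states(n)
    print(n, len(states), len(states) == 2 * n * n)
# prints 2, 8, 18, 32 states for n = 1, 2, 3, 4 -- that is, 2n^2 each time.
```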
The calculations we did in Part 5 now imply that
Combining this with the textbook treatment of the hydrogen atom, it follows that is indeed unitarily equivalent to the subspace of
consisting of bound states of the spinor-valued Schrödinger equation
with the operators and
having their usual definitions:
In short, the Hamiltonian on
is unitarily equivalent to the Hamiltonian on bound states of the hydrogen atom defined in the usual way! We’ve turned hydrogen into a festival of commuting
actions.
Next we’ll do something a bit wild, and new.
For more, read my paper:
• Second quantization for the Kepler problem.
or these blog articles, which are more expository and fun:
• Part 1: a quick overview of Kepler’s work on atoms and the solar system, and more modern developments.
• Part 2: why the eccentricity vector is conserved for a particle in an inverse square force, and what it means.
• Part 3: why the momentum of a particle in an inverse square force moves around in a circle.
• Part 4: why the 4d rotation group acts on bound states of a particle in an attractive inverse square force.
• Part 5: quantizing the bound states of a particle in an attractive inverse square force, and getting the Hilbert space for bound states of a hydrogen atom, neglecting the electron’s spin.
• Part 6: how the Duflo isomorphism explains quantum corrections to the hydrogen atom Hamiltonian.
• Part 7: why the Hilbert space of bound states for a hydrogen atom including the electron’s spin is
• Part 8: why is also the Hilbert space for a massless spin-1/2 particle in the Einstein universe.
• Part 9: a quaternionic description of the hydrogen atom’s bound states (a digression not needed for later parts).
• Part 10: changing the complex structure on to eliminate negative-energy states of the massless spin-1/2 particle, as often done.
• Part 11: second quantizing the massless spin-1/2 particle and getting a quantum field theory on the Einstein universe, or alternatively a theory of collections of electrons orbiting a nucleus.
• Part 12: obtaining the periodic table of elements from a quantum field theory on the Einstein universe.
Now comes the really new stuff. I want to explain how the hydrogen atom is in a certain sense equivalent to a massless spin-½ particle in the ‘Einstein universe’. This is the universe Einstein believed in before Hubble said the universe was expanding! It has a 3-sphere for space, and this sphere stays the same size as time passes.
Today I’ll just lay the groundwork. To study relativistic spin-½ quantum particles, we need to understand the Dirac operator. So, we need to bring out the geometrical content of what we’ve already done.
The main trick is to see the 3-sphere as the group , which acts on itself in two ways, via left and right translations. We get all the rotational symmetries of the 3-sphere this way. In Part 4 we studied operators called
and
on
which are the self-adjoint generators of these left and right translations. We saw that
Today we’ll see that is proportional to the Laplacian on the unit 3-sphere!
But then I want to look at spinor fields on the 3-sphere. We can think of elements of as spinor fields on the 3-sphere if we trivialize the spinor bundle using the action of
as right translations on
You could use left translations, but you have to pick one or the other, and we’ll use right translations.
In some notes from a course he gave at Harvard, Peter Kronheimer used this trick to study the Dirac operator on these spinor fields:
• Peter Kronheimer, Bott periodicity and harmonic theory on the 3-sphere.
I’ll explain the geometry behind his computations using some ideas I got from Paul Schwahn. Then I’ll show that the hydrogen atom Hamiltonian, thought of as an operator on is
Next time we’ll use this to relate hydrogen to the massless relativistic spin-½ particle on the Einstein universe.
Okay, on to business!
Let’s start with the Laplacian on the 3-sphere. From what we’ve already seen, the operators are a basis of left-invariant vector fields on
Each vector field
gives a tangent vector at the identity of
namely
What is the length of this vector if we give the usual Riemannian metric on the unit 3-sphere? Exponentiating this vector we get
which is the identity precisely when is an integer times
Since a great circle on the unit sphere has circumference
this vector must have length ½. It follows that the vector fields
have unit length everywhere, and one can check that they form an orthonormal basis of vector fields on We thus define the (positive definite) Laplacian on
to be the differential operator
In Part 5 we saw that
where is the spin-j representation of
We also saw that
It follows that
But chemists like to work with instead: they call this the ‘principal quantum number’ for a state of the hydrogen atom. Since
it follows that
so the eigenvalues of the Laplacian on the unit 3-sphere are where
ranges over all positive integers.
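Here is a small check of the bookkeeping behind this claim, using exact fractions: writing n = 2j + 1, four times the Casimir eigenvalue j(j + 1) is exactly n² − 1, and the corresponding spin-j summand from Part 5 has dimension (2j + 1)² = n².

```python
from fractions import Fraction as F

for n in range(1, 8):
    j = F(n - 1, 2)                  # n = 2j + 1
    casimir = j * (j + 1)            # eigenvalue of the quadratic Casimir on the spin-j summand
    assert 4 * casimir == n * n - 1  # Laplacian eigenvalue 4 j (j + 1) = n^2 - 1
    multiplicity = (2 * j + 1) ** 2  # the spin-j summand has dimension (2j+1)^2
    assert multiplicity == n * n
print("Laplacian eigenvalue on the nth summand is n^2 - 1, with multiplicity n^2")
```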
Tensoring with the identity we obtain a differential operator on
which by abuse of notation we again call
We know from Part 7 that the hydrogen atom Hamiltonian is
but now we know so
Next we turn to the Dirac operator.
Up to isomorphism there is only one choice of spin structure on namely the trivial bundle. To get this we can trivialize the tangent bundle of
using left translations. This lets us identify the oriented orthonormal frame bundle of
with the trivial bundle
This gives a way to identify the spin bundle on with the trivial bundle
This in turn lets us identify spinor fields on with
-valued functions.
There are at least two important connections on the tangent bundle of
• One is the Cartan connection: a vector field is covariantly constant with respect to this connection if and only if it is invariant under left translations on
• The other is the Levi–Civita connection: the unique torsion-free connection for which parallel translation preserves the metric.
Parallel translation with respect to the Cartan connection also preserves the metric, but the Cartan connection is flat and has torsion, while the Levi–Civita connection is curved and torsion-free.
Each of these connections lifts uniquely to a connection on the spin bundle which then gives a Dirac-like operator. The Cartan connection gives covariant derivative operators on
with
while the Levi–Civita connection gives covariant derivative operators with
We can define a Dirac operator on
using the Levi–Civita connection:
Here I’m summing over repeated indices, and I’m not worrying about superscripts versus subscripts because we can raise and lower indices to our heart’s content using the standard metric on the unit 3-sphere.
I should warn you that this Dirac operator has an in it, to make it self-adjoint! This may be nonstandard, but it will make our life easier.
On the other hand, Kronheimer defined a Dirac-like operator using the Cartan connection:
An easy calculation shows how and
are related:
where we use and the 3-dimensionality of space.
Let us compute Using the identities
we obtain
It follows that so
Combining this with our earlier formula for the hydrogen atom Hamiltonian:
we can now express the Hamiltonian for the hydrogen atom in terms of the Dirac operator on the 3-sphere:
This is pretty cool. We will exploit this later.
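Here is a toy consistency check, under two assumptions spelled out in the code rather than quoted from above: that the bound-state energies are −1/(2n²) in atomic units, and that on the shell with principal quantum number n the Dirac operator takes the values n + ½ and −(n − ½). Under those assumptions the single expression −1/(2(D − ½)²) gives the right energy on the whole shell, which is the kind of identity at work here.

```python
from fractions import Fraction as F

def energy(n):
    """Assumed hydrogen bound-state energy, -1/(2 n^2), in atomic units."""
    return F(-1, 2 * n * n)

def energy_from_dirac(lam):
    """Candidate expression of the energy in terms of the Dirac eigenvalue:
    -1 / (2 (lam - 1/2)^2)."""
    return F(-1, 2) / (lam - F(1, 2)) ** 2

for n in range(1, 6):
    # assumed Dirac eigenvalues on the shell with principal quantum number n;
    # the negative one only actually occurs for n >= 2
    lams = [n + F(1, 2)] + ([-(n - F(1, 2))] if n >= 2 else [])
    for lam in lams:
        assert energy_from_dirac(lam) == energy(n)
print("both Dirac eigenvalues on shell n give the same energy -1/(2 n^2)")
```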
That was the main result we’ll need, but while working on this I got interested in understanding the eigenvectors and eigenvalues of the Dirac operator in more detail. Here are some facts about those.
Since maps each finite-dimensional subspace
to itself, and it is self-adjoint on these subspaces, each of these subspaces has an orthonormal basis of eigenvectors. So, consider an eigenvector: suppose has
Then we must have
where as usual, so we must have
Thus, the only possible eigenvalues of
on the subspace
are
or in other words:
To go further, we can use some results from Kronheimer. First, he shows the spectrum of is symmetric about the origin. To do this he identifies
with the quaternions, and thus
with a space of quaternion-valued functions on the 3-sphere. Then quaternionic conjugation gives a conjugate-linear operator
with He then proves a result, using his Dirac-like operator
that implies
Thus the Dirac operator has as a negative eigenvalue for each positive eigenvalue, and their multiplicities are the same!
Second, he proved a result which implies that the eigenspace
has dimension
when and zero otherwise. Thus every number that’s an integer plus
is an eigenvalue of the Dirac operator on the 3-sphere—except
Also, while we’ve already seen that
where this additional result implies that these two summands have different dimensions, namely
and
respectively. Their total dimension is
as we already knew! We knew it because this is the number of electron states in the shell with principal quantum number
So, a bit more than half the electron states in the nth shell are positive eigenvectors of the Dirac operator on the 3-sphere, while a bit fewer than half are negative eigenvectors. Weird but true!
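Here is a short counting check. Its inputs are the standard multiplicity (k + 1)(k + 2) for the Dirac eigenvalues ±(k + 3/2) on the unit 3-sphere, together with the assumed split of the nth shell into n(n + 1) positive and n(n − 1) negative eigenvectors; it verifies that these are consistent with each other and with the shell dimension 2n².

```python
def dirac_multiplicity(k):
    """Standard multiplicity of the Dirac eigenvalues +/-(k + 3/2) on the
    unit 3-sphere (an input supplied here, not proved)."""
    return (k + 1) * (k + 2)

for n in range(1, 8):
    pos = n * (n + 1)          # assumed positive-eigenvector count in shell n
    neg = n * (n - 1)          # assumed negative-eigenvector count in shell n
    assert pos + neg == 2 * n * n                 # 2 n^2 states in the shell
    assert pos == dirac_multiplicity(n - 1)       # eigenvalue +(n + 1/2)
    if n >= 2:
        assert neg == dirac_multiplicity(n - 2)   # eigenvalue -(n - 1/2)
print("the shell-by-shell split matches the standard Dirac spectrum on the 3-sphere")
```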
For an explicit basis of eigenvectors of the Dirac operator on see:
• Fabio Di Cosmo and Alessandro Zampini, Some notes on Dirac operators on the and
spheres.
For more, read my paper:
• Second quantization for the Kepler problem.
or these blog articles, which are more expository and fun:
• Part 1: a quick overview of Kepler’s work on atoms and the solar system, and more modern developments.
• Part 2: why the eccentricity vector is conserved for a particle in an inverse square force, and what it means.
• Part 3: why the momentum of a particle in an inverse square force moves around in a circle.
• Part 4: why the 4d rotation group acts on bound states of a particle in an attractive inverse square force.
• Part 5: quantizing the bound states of a particle in an attractive inverse square force, and getting the Hilbert space for bound states of a hydrogen atom, neglecting the electron’s spin.
• Part 6: how the Duflo isomorphism explains quantum corrections to the hydrogen atom Hamiltonian.
• Part 7: why the Hilbert space of bound states for a hydrogen atom including the electron’s spin is
• Part 8: why is also the Hilbert space for a massless spin-1/2 particle in the Einstein universe.
• Part 9: a quaternionic description of the hydrogen atom’s bound states (a digression not needed for later parts).
• Part 10: changing the complex structure on to eliminate negative-energy states of the massless spin-1/2 particle, as often done.
• Part 11: second quantizing the massless spin-1/2 particle and getting a quantum field theory on the Einstein universe, or alternatively a theory of collections of electrons orbiting a nucleus.
• Part 12: obtaining the periodic table of elements from a quantum field theory on the Einstein universe.
Today I want to make a little digression into the quaternions. We won’t need this for anything later—it’s just for fun. But it’s quite beautiful.
We saw in Part 8 that if we take the spin of the electron into account, we can think of bound states of the hydrogen atom as spinor-valued functions on the 3-sphere. Here a ‘spinor’ is a pair of complex numbers.
But we can also think of a spinor as a quaternion! And we can think of the 3-sphere as the unit sphere in the quaternions! So bound states of hydrogen have a nice quaternionic description.
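Here is a small illustrative script spelling out one standard version of this dictionary (the sign conventions are a choice; others work equally well): a quaternion a + bi + cj + dk corresponds to the 2×2 complex matrix with rows (a + bi, c + di) and (−c + di, a − bi), quaternion multiplication becomes matrix multiplication, and unit quaternions land exactly in SU(2).

```python
import numpy as np

def to_matrix(q):
    """Represent the quaternion q = (a, b, c, d) = a + bi + cj + dk
    as a 2x2 complex matrix."""
    a, b, c, d = q
    return np.array([[a + b * 1j, c + d * 1j],
                     [-c + d * 1j, a - b * 1j]])

def quat_mult(p, q):
    """Hamilton's quaternion product, written out by hand."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

rng = np.random.default_rng(0)
p, q = rng.normal(size=4), rng.normal(size=4)

# Quaternion multiplication matches matrix multiplication
assert np.allclose(to_matrix(quat_mult(p, q)), to_matrix(p) @ to_matrix(q))

# A unit quaternion gives a unitary matrix with determinant 1, i.e. SU(2)
u = q / np.linalg.norm(q)
U = to_matrix(u)
assert np.allclose(U @ U.conj().T, np.eye(2))
assert np.isclose(np.linalg.det(U), 1.0)
print("unit quaternions <-> SU(2)")
```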
We can go further using quaternionic analysis.
It took a long time for people to figure out the best generalization of complex analysis to the quaternions. Complex analytic functions are incredibly nice, and important in physics. But when you try to generalize them to ‘quaternion analytic functions’, your first few guesses are unlikely to work well. A guy named Rudolf Fueter figured out the right definition:
• Rudolf Fueter, Über die analytische Darstellung der regulären Funktionen einer Quaternionenvariablen, Commentarii Mathematici Helvetici 8 (1936), 371–378.
More recently, some very good mathematical physicists have been further developing this subject:
• Anthony Sudbery, Quaternionic analysis, Mathematical Proceedings of the Cambridge Philosophical Society 85 (1979), 199–225.
• Igor Frenkel and Matvei Libine, Quaternionic analysis, representation theory and physics, Advances in Mathematics 217 (2008), 1806–1877.
Using this, we can describe a lot of hydrogen atom bound states as quaternion analytic functions! And even better, the Dirac operator on spinor-valued functions on the 3-sphere, which I described in Part 8, has a nice description in these terms.
To be a bit more precise: we start by describing a bound state of hydrogen as a function
obeying
Here is the quaternions and
is the sphere of quaternions with length 1, which forms a group isomorphic to
But we’ll show that a dense subspace of functions of this sort extend to functions
that obey a quaternionic analogue of the Cauchy–Riemann equations. Remember, those are the equations obeyed by complex analytic functions. So, hydrogen atom bound states are giving us ‘quaternion analytic functions’ on
You’ll notice I removed the point 0 from the quaternions here. That’s because we allow functions that blow up at 0: that is, approach infinity for very small quaternions.
But of course we also allow functions that blow up at ∞: that is, approach infinity for very large quaternions. In fact there’s a nice symmetry here. To make this evident, we can take the quaternions and add on an extra point called ∞. This gives a space called the quaternionic projective line or for short. It’s a 4-sphere, with 0 as the south pole and ∞ as the north pole. The quaternions
with
form the equator of this 4-sphere. This equator is our friend
All this is just like what people often do with the complex numbers. They take the complex plane and add on an extra point called ∞. This gives a space called the complex projective line or It’s a 2-sphere with 0 as the south pole and ∞ as the north pole. Thus, it’s also called the Riemann sphere.
Anyway, the idea is that, apart from some niggles which I will mention later, bound states of the hydrogen atom are the same as quaternion analytic functions from to
You can take any one of these states and write it as a linear combination of two: one that blows up only at 0, and one that blows up only at ∞. This has an interesting interpretation. We’ve already seen that bound states of the hydrogen atom are spinor-valued functions on the 3-sphere, and a certain Dirac-like operator acts on these states. The states that are linear combinations of eigenvectors of
with positive eigenvalues correspond to the analytic functions that blow up only at
And the states that are linear combinations of eigenvectors of
with negative eigenvalues correspond to analytic functions that blow up only at 0.
All this is analogous to familiar things people do in the complex case. The introduction of Frenkel and Libine’s paper explains the analogy.
Okay, let’s get started!
Here is the quaternionic Cauchy–Riemann equation:
Here is some quaternion-valued function defined on some open subset of the quaternions, and
are the usual real coordinates on
for which any quaternion
is of the form
For any open set people say a function
is regular if it’s differentiable in the usual real sense and the quaternionic Cauchy–Riemann equation holds. In Theorem 1 of his paper, Sudbery shows that any regular function is infinitely differentiable in the usual real sense, in fact real-analytic.
Let be the space of regular functions on
that are homogeneous of degree
meaning that
Clearly any function is determined by its restriction to the unit sphere
But in the proof of his Theorem 7, Sudbery shows something less obvious: the restriction is an eigenfunction of the Dirac-like operator
that I mentioned in Part 8!
To prove this, the trick is to write the quaternionic Cauchy–Riemann operator
in something like polar coordinates, involving a radial derivative but also the operator that I introduced in Part 8. The radial derivative of a homogeneous function
is easy to work out, and then using
we can show
So, Sudbery shows that
(although he uses different notation).
We saw last time that the Dirac operator on the 3-sphere is
So, we get
With more work (see my paper) we can show the converse: any eigenfunction of the Dirac operator with eigenvalue is the restriction of a function in
Thus, each eigenspace of the Dirac operator on the 3-sphere can be seen as the space of all regular functions that are homogeneous of some particular degree.
So, we can think of hydrogen atom bound states, or at least those that are finite linear combinations of energy eigenstates, as regular functions
And these finite linear combinations are dense in the space of all hydrogen atom bound states!
To summarize in a sensationalistic way: hydrogen is quaternionic!
I’ve skimmed over some details. Please stop here unless you really love the quaternions. But to get everything from Part 8 to mesh nicely with what we’re doing now, we need to think of spinors as quaternions in a good way. We need to choose an isomorphism of real vector spaces
in such a way that
• multiplication by and
on
correspond to left multiplication by the quaternions
and
on
and
• multiplication by on
corresponds to right multiplication by the quaternion
In case you know some algebra and are wondering what’s really going on here, the idea is that is both a left and a right module of itself in the usual way. We can make it into a 2-dimensional complex vector space in a unique way such that multiplication by
is right multiplication by the quaternion
Since left and right multiplication commute, this makes
into a 2-dimensional complex vector space on which
acts complex-linearly by left multiplication.
But is also a 2-dimensional complex vector space on which
acts complex-linearly, with
acting as matrix multiplication by
All this suggests that with these structures chosen, and
are isomorphic as complex vector spaces on which
acts complex-linearly!
But how do we find such an isomorphism
?
I got confused for a while, but here’s a systematic approach. Suppose we have such an isomorphism. We must have
for some numbers We want
but we also want
(I’m going to skip lots of computational steps and focus on explaining the strategy.) So, we must have
or in other words
Because we’re assuming is complex-linear (where we multiply quaternions on the right by
), we can assume without loss of generality that
Then we have
and
But we also must have
and
So, we must have
Of course we still need to check that this actually works: that it has the desired properties in my bulleted list. But it does.
The formula is not something I was able to instantly guess.
For more, read my paper:
• Second quantization for the Kepler problem.
or these blog articles, which are more expository and fun:
• Part 1: a quick overview of Kepler’s work on atoms and the solar system, and more modern developments.
• Part 2: why the eccentricity vector is conserved for a particle in an inverse square force, and what it means.
• Part 3: why the momentum of a particle in an inverse square force moves around in a circle.
• Part 4: why the 4d rotation group acts on bound states of a particle in an attractive inverse square force.
• Part 5: quantizing the bound states of a particle in an attractive inverse square force, and getting the Hilbert space for bound states of a hydrogen atom, neglecting the electron’s spin.
• Part 6: how the Duflo isomorphism explains quantum corrections to the hydrogen atom Hamiltonian.
• Part 7: why the Hilbert space of bound states for a hydrogen atom including the electron’s spin is
• Part 8: why is also the Hilbert space for a massless spin-1/2 particle in the Einstein universe.
• Part 9: a quaternionic description of the hydrogen atom’s bound states (a digression not needed for later parts).
• Part 10: changing the complex structure on to eliminate negative-energy states of the massless spin-1/2 particle, as often done.
• Part 11: second quantizing the massless spin-1/2 particle and getting a quantum field theory on the Einstein universe, or alternatively a theory of collections of electrons orbiting a nucleus.
• Part 12: obtaining the periodic table of elements from a quantum field theory on the Einstein universe.
The poet Blake wrote that you can
Today we’ll see a universe in an atom!
We’ll see that states of the hydrogen atom correspond to states of a massless spin-½ particle in the Einstein universe—a closed, static universe where space is a 3-sphere.
The rotational symmetries of the Einstein universe correspond to symmetries of the hydrogen atom. The energy eigenstates of the massless spin-½ particle correspond in a one-to-one way to energy eigenstates of the hydrogen atom.
Let’s dive in!
The ‘Einstein universe’ is a name for the manifold with Lorentzian metric
where
is the usual Riemannian metric on
and
is the Riemannian metric on the unit sphere in 4-dimensional Euclidean space. The Einstein universe has a lot of symmetry: the group
acts as isometries. This group also acts on the bound states of the hydrogen atom, in a way that commutes with the Hamiltonian.
To describe a massless spin-½ particle on the Einstein universe, we’ll use the Weyl equation. This is a variant of the Dirac equation that describes massless spin-½ particles that are chiral, i.e., have an inherent handedness. We can trivialize the bundle of Weyl spinors over the Einstein universe, using right translations on the group
to identify every fiber of this bundle with the vector space Using this trivialization we can write the left-handed Weyl equation as
where
is a spinor-valued function. (If the in the Weyl equation looks weird, don’t worry: it’s there because in Part 8 we defined
to be a self-adjoint operator, which has an
in it.)
The Weyl equation also comes in a right-handed form differing by a sign:
We choose henceforth to work with the left-handed form. This is an arbitrary convention: if we used the right-handed Weyl equation, all our results would still hold, with suitable minus signs sprinkled in here and there. But it so happens that, back when people believed neutrinos were massless, we thought they obeyed the left-handed Weyl equation. The reason is that to a very good approximation, when a neutrino is moving in the direction of your thumb, it spins clockwise, like the fingers in your curled left hand.
Here’s how the Weyl equation works on the Einstein universe. We take as the Hilbert space of solutions, and
as the Hamiltonian. Since we’ve defined things so that
is self-adjoint, this Hamiltonian generates a 1-parameter group of unitary operators
Given any if we let
and define a function
by
then this function will be a solution of the left-handed Weyl equation.
As we saw in Part 8, the hydrogen atom Hamiltonian is a function of the Hamiltonian
for the Weyl equation:
Thus, not only the Hilbert space but also the dynamics of the bound states of the hydrogen atom can be expressed in terms of those for the Weyl equation on the Einstein universe. However, not all the symmetries we have found for the Hamiltonian are symmetries of
: this is possible because while
is a function of
is not a function of
Let us make this precise.
In Part 7 we made into a unitary representation of
in three commuting ways: via left translations, via right translations, and via the spin-½ representation on
The self-adjoint generators of these three representations are called
and
respectively, and these obey
Using some formulas from Part 8 we can write the Dirac operator in terms of these operators:
Using this and the commutation relations we just saw, we see commutes with the operators
It does not commute with
or
separately, but it commutes with
since
It follows that commutes with the unitary representation
of
on the Hilbert space
whose self-adjoint generators are
and
If we think of this Hilbert space as consisting of functions
this representation is given by
Geometrically, this representation arises from the natural way to lift the left and right translation actions of on
to the spinor bundle of
The asymmetry between left and right here may seem puzzling. It has nothing to do with the fact that we’re studying the left-handed Weyl equation. Instead, it arises from how we arbitrarily chose to trivialize the spinor bundle of
using the action of
as left translations. Thus, the above action of
on
merely left translates
while the action of
not only right translates
but also acts on its value by
Summarizing, this is what we have seen so far. Made into representations of as above, the Hilbert space of bound states of the hydrogen atom and the Hilbert space for the left-handed Weyl equation on the Einstein universe are unitarily equivalent. Moreover, we can express the Hamiltonian for the hydrogen atom in terms of that for the left-handed Weyl equation!
All this is fine mathematics, but there is a physical problem, noticed already by Dirac in a related context: the spectrum of is unbounded below, giving states of arbitrarily large negative energy! One widely accepted solution is to modify the complex structure on the Hilbert space, multiplying it by
on the negative-frequency solutions of the Weyl equation: that is, the subspace of
spanned by eigenvectors of the Dirac operator with negative eigenvalues. This is an updated version of Dirac’s original idea of treating antiparticles as ‘holes in the sea of negative-energy particles,’ or the later and still popular idea of switching annihilation and creation operators for negative-frequency solutions.
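Here is a finite-dimensional toy version of this trick, meant only as an illustration rather than the actual construction: take a self-adjoint ‘Dirac operator’ with eigenvalues of both signs, let S be its sign, and check that S|D| = D, so that replacing the Hamiltonian D by the positive operator |D| while simultaneously replacing i by J = iS leaves the time evolution exp(−itD) unchanged.

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)

# A random self-adjoint "Dirac operator" with eigenvalues of both signs
A = rng.normal(size=(6, 6))
D = (A + A.T) / 2

# Spectral decomposition; build sign(D) and |D|
evals, V = np.linalg.eigh(D)
S = V @ np.diag(np.sign(evals)) @ V.T        # +1 on positive, -1 on negative part
absD = V @ np.diag(np.abs(evals)) @ V.T      # |D| has positive spectrum

# The key identities: S^2 = 1, S commutes with D, and S |D| = D
assert np.allclose(S @ S, np.eye(6))
assert np.allclose(S @ D, D @ S)
assert np.allclose(S @ absD, D)

# Hence the "new" evolution generated by |D| with complex structure J = iS
# agrees with the old evolution generated by D with complex structure i:
t = 0.7
old = expm(-1j * t * D)
new = expm(-(1j * S) @ (t * absD))   # exp(-J t |D|), with J = iS
assert np.allclose(old, new)
print("same time evolution, but the new Hamiltonian |D| is positive")
```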
To modify the complex structure on we use the functional calculus to define an operator
on this Hilbert space. This equals 1 on eigenvectors of with positive eigenvalue and -1 on those with negative eigenvalue; we have seen that 0 is not an eigenvalue of
so
is well-defined. We then define an operator
Since is both unitary and self-adjoint, it follows that
is both unitary (
) and skew-adjoint (
), and thus a complex structure (
). We henceforth use
to stand for
made into a complex Hilbert space with the same norm and this new complex structure
The operators and
are still complex-linear on
despite the new complex structure, since they commute with
and
and thus
The operator
is still self-adjoint on
since it has an orthonormal basis of eigenvectors with real eigenvalues. The operator
is not only self-adjoint but positive definite on
since
In fact the operator generates
as a one-parameter unitary group on
because
Thus negative energy states have been eliminated, without changing the time evolution operators at all, by changing the Hamiltonian from
to
and simultaneously changing the complex structure from
to
Since all the operators on
commute with
they also commute with
and thus with the new complex structure
Thus
which began life as a unitary representation of
on
gives a complex-linear representation of this group on
which we call
This representation
is unitary, since the norm on
is the same as that on
and any norm-preserving invertible linear operator on a Hilbert space is unitary.
Furthermore, the representation is unitarily equivalent to
This is a nontrivial fact, because the unitary equivalence between them is not the identity operator! Indeed the map
is not even complex linear: it is complex linear on the +1 eigenspace of but conjugate-linear on the -1 eigenspace. To correct for this, we use a conjugate-linear map
where we regard as a
-valued function on
let
denote its componentwise complex conjugate, and multiply
by
The reason the map is important is that
for all and
Checking this is a little calculation with 2 × 2 matrices, but conceptually it says that
is an equivalence between the spin-½ representation of
and its conjugate representation. The desired unitary equivalence between
and
is then the map
where
are the projections of to the +1 and -1 eigenspaces of
respectively. We include
here because we should keep track of the difference between
and
Theorem. The operator is a unitary equivalence between the representation
of
on
and the representation
of this group on
and
on the domain of
Proof. For the proof, see the Appendix of my paper. █
Since changing the complex structure on a Hilbert space can be a bit bewildering, I should summarize what we’ve achieved here.
We have a unitary equivalence between the Hilbert space of bound states of the hydrogen atom and the Hilbert space
of solutions of the left-handed Weyl equation on
equipped with a complex structure that makes its Hamiltonian positive. The group
has equivalent unitary representations on these two Hilbert spaces. The Dirac operator
acts on both
and
in a manner compatible with their unitary equivalence. Finally, both the hydrogen atom Hamiltonian and the Hamiltonian for the left-handed Weyl equation can be expressed in terms of the Dirac operator: the former is
while the latter is just
So, we’re seeing the universe in an atom—or at least, we’re seeing a massless spin-½ particle in the Einstein universe in the hydrogen atom.
But what about full-fledged quantum field theory? Can we understand the massless spin-½ quantum field in terms of atomic physics? Yes! But for that we’ll need to second quantize today’s story.
For more, read my paper:
• Second quantization for the Kepler problem.
or these blog articles, which are more expository and fun:
• Part 1: a quick overview of Kepler’s work on atoms and the solar system, and more modern developments.
• Part 2: why the eccentricity vector is conserved for a particle in an inverse square force, and what it means.
• Part 3: why the momentum of a particle in an inverse square force moves around in a circle.
• Part 4: why the 4d rotation group acts on bound states of a particle in an attractive inverse square force.
• Part 5: quantizing the bound states of a particle in an attractive inverse square force, and getting the Hilbert space for bound states of a hydrogen atom, neglecting the electron’s spin.
• Part 6: how the Duflo isomorphism explains quantum corrections to the hydrogen atom Hamiltonian.
• Part 7: why the Hilbert space of bound states for a hydrogen atom including the electron’s spin is
• Part 8: why is also the Hilbert space for a massless spin-1/2 particle in the Einstein universe.
• Part 9: a quaternionic description of the hydrogen atom’s bound states (a digression not needed for later parts).
• Part 10: changing the complex structure on to eliminate negative-energy states of the massless spin-1/2 particle, as often done.
• Part 11: second quantizing the massless spin-1/2 particle and getting a quantum field theory on the Einstein universe, or alternatively a theory of collections of electrons orbiting a nucleus.
• Part 12: obtaining the periodic table of elements from a quantum field theory on the Einstein universe.
In Part 10 we saw that, loosely speaking, the theory of a hydrogen atom is equivalent to the theory of a massless left-handed spin-½ particle in the Einstein universe—a static universe where space is a 3-sphere. Today we’ll ‘second quantize’ both of these equivalent theories and get new theories that again are equivalent.
‘Second quantization’ is a funny term. It’s a completely systematic way to get a new quantum theory from an old one. When you second quantize the theory of some particle, you get a theory that describes noninteracting collections of that kind of particle. So, when we second quantize the theory of a hydrogen atom, we get a theory that describes collections of electrons orbiting the nucleus—but in a simplified way, where the electrons do not interact. When we second quantize the theory of a massless left-handed spin-½ particle in the Einstein universe, we get a theory of noninteracting collections of such particles. This is called a free quantum field theory: it’s the easiest sort of quantum field theory to understand.
So, there’s a relationship between multi-electron atoms and a free quantum field theory on the Einstein universe! Next time we’ll think a bit harder about how electrons in an atom actually do interact, and how that affects the structure of the periodic table. But for now we’ll ignore that.
To get started, let’s recall how to build the fermionic Fock space on an arbitrary Hilbert space We start with the exterior algebra
and give it the inner product such that if is any orthonormal basis for
the wedge products
with
form an orthonormal basis for
and the different subspaces
are orthogonal. Completing
with respect to the norm coming from this inner product, we obtain a Hilbert space we call
If is the Hilbert space for a single particle of some sort,
is the Hilbert space of states of a collection of n particles of this sort, treated as fermions, and
is the Hilbert space for arbitrary finite collections of such particles. We call
the fermionic Fock space on
and call
the n-particle subspace.
To define observables on the Fock space, recall how any self-adjoint operator on gives rise to one on
First, any unitary operator
gives rise to a unitary operator
determined by the property that
for any vectors Note that
Let be the group of unitary operators on
and
the group of unitary operators on
If
is any topological group, any strongly continuous unitary representation
gives rise to a strongly continuous unitary representation
defined by
In particular, any self-adjoint operator on
gives a strongly continuous unitary one-parameter group
on
and thus a strongly continuous unitary one-parameter group
on
Stone’s theorem says the latter is generated by a unique self-adjoint operator on
which we call
We thus have
for all If the vectors
are in the domain of
we can differentiate both sides of the above formula applied to
and set
obtaining
In particular if all are eigenvectors of
then their wedge product is an eigenvector of
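Here is a toy illustration for a four-dimensional single-particle space with made-up eigenvalues: the fermionic Fock space has one basis wedge per subset of an orthonormal eigenbasis, and the second-quantized operator acts on such a wedge with eigenvalue equal to the sum of the chosen single-particle eigenvalues.

```python
from itertools import combinations

# Single-particle eigenvalues of some self-adjoint operator (toy numbers)
single_particle = [-1.0, 0.5, 2.0, 3.5]
d = len(single_particle)

# The fermionic Fock space of a d-dimensional space has dimension 2^d:
# one basis wedge per subset {i1 < ... < in} of the eigenbasis.
fock_dimension = sum(len(list(combinations(range(d), n))) for n in range(d + 1))
assert fock_dimension == 2 ** d

# The second-quantized operator acts on each wedge with eigenvalue
# equal to the sum of the chosen single-particle eigenvalues.
for n in range(d + 1):
    for subset in combinations(range(d), n):
        eigenvalue = sum(single_particle[i] for i in subset)
        print(n, subset, eigenvalue)
```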
We can apply all this mathematics in two ways:
1) We can take it to be the Hilbert space of bound states of a hydrogen atom:
Then is the Hilbert space for an arbitrary finite collection of electrons occupying such states. In particular, if
is the hydrogen atom Hamiltonian, then
restricted to the n-particle subspace
is the Hamiltonian for an idealized atom with n noninteracting electrons. Since in fact electrons do interact, the lowest-energy eigenstate in the n-particle space gives a very crude approximation to the nth element in the periodic table. (See the small numerical sketch after item 2 below.) To do better we must modify the Hamiltonian! I’ll talk about this next time.
2) Alternatively, we can start with the Hilbert space of a single left-handed massless spin-½ particle in the Einstein universe. We have seen that the Hamiltonian for such a particle is
Then
is the Hilbert space for an arbitrary collection of left-handed massless spin-½ particles, treated as fermions. If these particles are noninteracting, their Hamiltonian is
and we have a free quantum field theory. This is called the left-handed massless spin-½ quantum field! Back when we thought neutrinos were massless, this would be a pretty good approximation to how they work. To get a better approximation, we’d need to add some interactions with other fields.
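To make application 1) slightly more concrete, here is a crude illustrative script, assuming the single-electron energies are −1/(2n²) in atomic units with degeneracy 2n²: filling the lowest single-particle states, one electron per state, gives the lowest eigenvalue of the second-quantized Hamiltonian on each n-particle subspace; this is exactly the ‘very crude approximation’ to the periodic table mentioned above.

```python
def shell_energies(max_n):
    """Single-electron bound-state energies, -1/(2 n^2) in atomic units,
    each appearing with degeneracy 2 n^2 (ignoring all interactions)."""
    levels = []
    for n in range(1, max_n + 1):
        levels.extend([-1.0 / (2 * n * n)] * (2 * n * n))
    return sorted(levels)

levels = shell_energies(5)

def crude_ground_energy(num_electrons):
    """Lowest second-quantized energy on the n-particle subspace:
    fill the lowest single-particle states, one electron per state."""
    return sum(levels[:num_electrons])

for z in [1, 2, 3, 10, 18]:
    print(z, round(crude_ground_energy(z), 3))
# The numbers are only illustrative: in a real atom the electrons interact
# and the nuclear charge grows along the periodic table, which is why this
# is called a *very* crude approximation above.
```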
How are these related? Very closely!
Any unitary operator between Hilbert spaces, for example the unitary operator that we saw in Part 9, induces a unitary operator between their fermionic Fock spaces, like this:
for all Thus, an equivalence between two theories at the single-particle level induces an equivalence between their second quantized versions! And with some extra work we get this:
Theorem 2. The map is a unitary equivalence between the representation of the group
on the fermionic Fock space
for bound states of the hydrogen atom Hamiltonian and the representation of this group on the fermionic Fock space
for left-handed massless spin-½ particles on the Einstein universe. That is,
for all Moreover
on the domain of
Proof. For the proof, see the Appendix of my paper. █
Next time we’ll see whether we can get the periodic table from our second quantized theory of the hydrogen atom.
For more, read my paper:
• Second quantization for the Kepler problem.
or these blog articles, which are more expository and fun:
• Part 1: a quick overview of Kepler’s work on atoms and the solar system, and more modern developments.
• Part 2: why the eccentricity vector is conserved for a particle in an inverse square force, and what it means.
• Part 3: why the momentum of a particle in an inverse square force moves around in a circle.
• Part 4: why the 4d rotation group acts on bound states of a particle in an attractive inverse square force.
• Part 5: quantizing the bound states of a particle in an attractive inverse square force, and getting the Hilbert space for bound states of a hydrogen atom, neglecting the electron’s spin.
• Part 6: how the Duflo isomorphism explains quantum corrections to the hydrogen atom Hamiltonian.
• Part 7: why the Hilbert space of bound states for a hydrogen atom including the electron’s spin is
• Part 8: why is also the Hilbert space for a massless spin-1/2 particle in the Einstein universe.
• Part 9: a quaternionic description of the hydrogen atom’s bound states (a digression not needed for later parts).
• Part 10: changing the complex structure on to eliminate negative-energy states of the massless spin-1/2 particle, as often done.
• Part 11: second quantizing the massless spin-1/2 particle and getting a quantum field theory on the Einstein universe, or alternatively a theory of collections of electrons orbiting a nucleus.
• Part 12: obtaining the periodic table of elements from a quantum field theory on the Einstein universe.
Hardly for the first time in my life, this weekend I got floridly denounced every five minutes—on SneerClub, on the blog of Peter Woit, and in my own inbox. The charge this time was that I’m a genocidal Zionist who wants to kill all Palestinian children purely because of his mental illness and raging persecution complex.
Yes, that’s right, I’m the genocidal one—me, whose lifelong dream is that, just like Germany and Japan rose from their necessary devastation in WWII to become pillars of our global civilization, so too the children in Gaza, the West Bank, Syria, Lebanon, and Iran will one day grow up in free and prosperous societies at peace with the West and with Israel. Meanwhile, those who demand an actual genocide of the Jews, another one—those who pray to Allah for it, who attempt it over and over, who preach it to schoolchildren, who celebrate their progress toward it in the streets—they’re all as innocent as lambs.
Yesterday, in The Free Press, came the report of a British writer who traveled to southern Lebanon, and met an otherwise ordinary young man there … who turned out to be excited for Muslims and Christians to join forces to slaughter all the Yahood, and who fully expected that the writer would share his admiration for Hitler, the greatest Yahood-killer ever.
This is what the global far left has now allied itself with. This is what I’m right now being condemned for standing against, with commenter after commenter urging me to seek therapy.
To me, this raises a broader question: how exactly do you keep your sanity, when you live on a planet filled with brain-eaten zombies?
I’m still struggling with that question, but the best I’ve come up with is what I think of as the Weinberg Principle, after my much-missed friend and colleague here at UT Austin. Namely, I believe that it’s better to have one Steven Weinberg on your side while the rest of humanity is against you, than the opposite. Many other individuals (including much less famous ones) would also work here in place of Steve, but I’ll go with him because I think most of my readers would agree to three statements:
Maybe it’s possible to wake the zombies up. Yoram Arnon, for example, wrote the following eloquent answer on Quora, in response to the question “Why are so many against freeing Palestine?”:
When Westerners think about freedom they think about freedom of speech, freedom of expression, freedom of movement, freedom of religion, freedom to form political parties, etc.
When Palestinians say “Free Palestine” they mean freedom from Jews, and from Israel’s existence. They’re advocating for the abolition of Israel, replacing it with an Arab country.
Israel is the only country in the Middle East that is free, in the Western sense of the word. If Israel were to disappear, Palestinians would fall under an autocratic regime, just like every other Arab country, with none of the above freedoms. And, of course, Israelis would suffer a terrible fate at their hands.
Pro Palestinians are either unable to see this, or want exactly that, but thankfully many in the West do see this – the same “many” that are against “freeing Palestine”.
Palestinians need to accept Israel’s right to exist, and choose to coexist peacefully alongside it, for them to have the peace and freedom the West wants for them.
Maybe reading words like these—or the words of Coleman Hughes, or Douglas Murray, or Hussein Aboubakr Mansour, or Yassine Meskhout, or John Aziz, or Haviv Rettig Gur, or Sam Harris, or the quantum computing pioneer David Deutsch—can boot a few of the zombies’ brains back up. But even then, I fear that these reboots will be isolated successes. For every one who comes back online, a thousand will still shamble along in lockstep, chanting “brainsssssss! genocide! intifada!”
I’m acutely aware of how sheer numbers can create the illusion of argumentative strength. I know many people who were sympathetic to Israel immediately after October 7, but then gradually read the room, saw which side their bread was buttered on, etc. etc. and became increasingly hostile. My reaction, of course, has been exactly the opposite. The bigger the zombie army I see marching against me, the less inclined I feel to become a zombie myself—and the clearer to me becomes the original case for the Zionist project.
So to the pro-Zionist students—Jewish of course, but also Christian, Muslim, Hindu, atheist, and everyone else—who feel isolated and scared to speak up right now, and who also often email me, here’s what I say. Yes, the zombies vastly outnumber us, but on the other hand, they’re zombies. Some of the zombies know longer words than others, but so far, not one has turned out to have a worldview terribly different from that of the image at the top of this post.
I’ll keep the comments closed, for much the same reasons I did in my last post. Namely, while there are many people of all opinions and backgrounds with whom one can productively discuss these things, there are many more with whom one can’t. Furthermore, experience has shown that the latter can disguise themselves as the former for days on end, and thereby execute a denial-of-service attack on any worthwhile and open public discussion.
Addendum: The troll who sent the antisemitic image now says that he regrets and apologizes for it, and that he’s going to read books on Jewish history to understand his error. I’ll believe that when he actually sends me detailed book reports or other evidence, but just wanted to update.
It’s Taylor Swift album day! We listened to it last night at 11pm central when it went live. Snap reaction from AB, for whom 1989 is apex Swift and the last two albums have been too moody and murky, is — this is a winner. She’s happy to have Max Martin (per AB: “that Norwegian guy”) back.
Friend of the blog Stephanie Burt is probably the world’s foremost academic expert on Taylor Swift. She teaches a class on Swift (which is really a class on how songs work, how poems work, how reputations work, how fandoms work, and Swift) at Harvard. And in the kind of publicity no publisher can plan for, her big Swift book, Taylor’s Version, comes out on Monday, in the middle of a global Taylor Swift media blitz. So thoughtful of Tay to drop the album just in time for Stephanie’s pub date!
I, as a trusted friend of Stephanie, already have a copy. I finished reading it yesterday, just in time for the Life of a Showgirl release. It is good, people, really good. If you are interested in how songs work, how poems work, how reputations work, how fandoms work, or Taylor Swift, I implore you to buy a copy at Bookshop, Amazon, or your local store.
(This blog’s favorite TS songs: “Shake it Off,” “Getaway Car,” “Invisible String,” “We Are Never Ever Getting Back Together,” “Welcome to New York.” AB asked me when I first became aware of Taylor Swift. She’s one of the rare acts for whom I can tell you exactly when. Driving east on Mineral Point Rd., “WANEGBT” came on the radio, and it was a “WHAT IS THIS” moment for me — the two other times I remember this happening were the first time I heard Green Day (“Longview”) and the first time I heard New Pornographers (“The Slow Descent into Alcoholism.”) The time David Carlton explained the functor of points to me in the Beacon St. Star Market in Somerville, MA was actually a very similar experience.)
Occasionally, people try to give “even-handed” accounts of crackpot physics, like people who claim to have invented anti-gravity devices. These accounts don’t go so far as to say that the crackpots are right, and will freely point out plausible doubts about the experiments. But at the end of the day, they’ll conclude that we still don’t really know the answer, and perhaps the next experiment will go differently. More tests are needed.
For someone used to engineering, or to sciences without much theory behind them, this might sound pretty reasonable. Sure, any one test can be critiqued. But you can’t prove a negative: you can’t rule out a future test that might finally see the effect.
That’s all well and good…if you have no idea what you’re doing. But these people, just like anyone else who grapples with physics, aren’t just proposing experiments. They’re proposing theories: models of the world.
And once you’ve got a theory, you don’t just have to care about future experiments. You have to care about past experiments too. Some theories…are already dead.
To get a little more specific, let’s talk about antigravity proposals that use scalar fields.
Scalar fields seem to have some sort of mysticism attached to them in the antigravity crackpot community, but for physicists they’re just the simplest possible type of field, the most obvious thing anyone would have proposed once they were comfortable enough with the idea of fields in the first place. We know of one, the Higgs field, which gives rise to the Higgs boson.
We also know that if there are any more, they’re pretty subtle…and as a result, pretty useless.
We know this because of a wide variety of what are called “fifth-force experiments”, tests and astronomical observations looking for an undiscovered force that, like gravity, reaches out to long distances. Many of these experiments are quite general, the sort of thing that would pick up a wide variety of scalar fields. And so far, none of them have seen anything.
That “so far” doesn’t mean “wait and see”, though. Each time physicists run a fifth-force experiment, they establish a limit. They say, “a fifth force cannot be like this“. It can’t be this strong, it can’t operate on these scales, it can’t obey this model. Each experiment doesn’t just say “no fifth force yet”, it says “no fifth force of this kind, at all”.
When you write down a theory, if you’re not careful, you might find it has already been ruled out by one of these experiments. This happens to physicists all the time. Physicists want to use scalar fields to understand the expansion of the universe, and they use them to think about dark matter. And frequently, a model one physicist proposed will be ruled out, not by new experiments, but by someone doing the math and realizing that the model is already contradicted by a pre-existing fifth-force experiment.
So can you prove a negative? Sort of.
If you never commit to a model, if you never propose an explanation, then you can never be disproven, you can always wait for the experiment of your dreams to come true. But if you have any model, any idea, any explanation at all, then your explanation will have implications. Those implications may kill your theory in a future experiment. Or, they may have already killed it.
Recently I had to update Mathematica on my laptop and after having solved the challenges of the license manager that keeps looking different every time I have to use it, I learned that Mathematica 14 can now officially work with finite fields.
This reminded me that for a while I wanted to revive an old project that had vanished together with the hard drive of some old computer: Holosplit. So, over the last two days and with the help of said version of Mathematica I did a complete rewrite which you can now find on Github.
It consists of two C programs "holosplit" and "holojoin". To the first you give a positive integer \(N\) and a file and it spits out a new file ("fragment") that is roughly \(1/N\) of the size. Every time you do that you obtain a new random fragment.
To the latter you give any collection of \(N\) of these fragments and it reproduces the original file. So you can for example distribute a file over 10 people (by running holosplit ten times with \(N=3\)) such that when any 3 of them work together, they can recover the original.
How does it work? It uses the finite field \(F\) of \(2^8=256\) elements (in the Github repository, there is also a header file that implements arithmetic in \(F\) and matrix operations like product and inverse over it). Each time it is invoked, it picks a random vector \(v\in F^N\) and writes it to the output. Then it reads \(N\) bytes from the file at a time, which it also interprets as a vector \(d\in F^N\). It then outputs the byte that corresponds to the scalar product \(v\cdot d\).
To reassemble the file, holojoin takes the \(N\) fragments with their random vectors \(v_1,\ldots,v_N\) and interprets those as the rows of an \(N\times N\) matrix \(A\). With probability
$$\frac{\prod_{k=0}^{N-1} \left(256^N-256^k\right)}{256^{N^2}},$$
which is very close to 1 for every \(N\), this matrix is invertible (homework: why?). So we can read one byte from each file, assemble those into yet another vector \(e\in F^N\) and recover
$$d=A^{-1}e.$$
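To make the scheme concrete, here is a minimal Python sketch of the same idea. It is written for this post, not taken from the Github repository, so the details (AES reduction polynomial for \(F\), no padding, the function names) are my own choices and may differ from the actual C code; it also assumes the file length is an exact multiple of \(N\).

```python
# Minimal sketch of the holosplit / holojoin idea (not the actual C code from the repo).
# Assumes len(data) is an exact multiple of n; uses the AES polynomial x^8+x^4+x^3+x+1.
import secrets

def gf_mul(a, b):
    """Multiply in GF(256)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r

def gf_inv(a):
    """Inverse in GF(256): a^254 (a must be nonzero)."""
    result, power = 1, a
    for bit in range(8):
        if (254 >> bit) & 1:
            result = gf_mul(result, power)
        power = gf_mul(power, power)
    return result

def dot(v, d):
    """Scalar product over GF(256); addition is XOR."""
    out = 0
    for a, b in zip(v, d):
        out ^= gf_mul(a, b)
    return out

def holosplit(data: bytes, n: int) -> bytes:
    """One fragment: a random vector v, followed by v·d for each n-byte block d."""
    v = [secrets.randbelow(256) for _ in range(n)]
    body = bytes(dot(v, data[i:i + n]) for i in range(0, len(data), n))
    return bytes(v) + body

def gf_matrix_inv(rows):
    """Gauss-Jordan inversion over GF(256); raises if the matrix is singular."""
    n = len(rows)
    aug = [list(r) + [1 if i == j else 0 for j in range(n)] for i, r in enumerate(rows)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col]), None)
        if pivot is None:
            raise ValueError("singular matrix: need a different set of fragments")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv_p = gf_inv(aug[col][col])
        aug[col] = [gf_mul(inv_p, x) for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col]:
                factor = aug[r][col]
                aug[r] = [x ^ gf_mul(factor, y) for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def holojoin(fragments, n: int) -> bytes:
    """Recover the file from any n fragments whose header vectors are independent."""
    vs = [list(f[:n]) for f in fragments]
    bodies = [f[n:] for f in fragments]
    a_inv = gf_matrix_inv(vs)               # rows of A are the vectors v_i
    out = bytearray()
    for pos in range(len(bodies[0])):
        e = [b[pos] for b in bodies]         # e = A d, one byte per fragment
        out.extend(dot(row, e) for row in a_inv)   # d = A^{-1} e
    return bytes(out)

# Distribute over 10 people so that any 3 can reconstruct:
data = b"attack at dawn!"                        # 15 bytes, a multiple of N = 3
frags = [holosplit(data, 3) for _ in range(10)]  # one fragment per person
# (Rerun if you are unlucky and the 3 chosen header vectors happen to be dependent.)
assert holojoin(frags[:3], 3) == data
```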
Besides the mathematics, it also poses philosophical/legal questions: Suppose, for example, that the original file is copyrighted, say an mp3 or a video. The fragments are clearly derived works. But individually they do not contain the original work: without sufficiently many other fragments they are useless (although not in a cryptographic sense). So by publishing one fragment, I do not provide access to the original work. What if others publish other fragments? Then my fragment could be the last remaining one that was missing. If there are more, any individual fragment is redundant, so publishing it strictly speaking does not provide new information.
Claudia Prieto is a Venezuelan singer-songwriter about whom I know next to nothing except that she recorded this gorgeous song, “Tus Ojos,” in 2018.
By the way, I found out about this song because it was playing in a restaurant in Mexico and it was great and I Shazammed it. Nobody ever talks about Shazam anymore, but it’s one of the unequivocal little goods that smartphones do. You hear an amazing song playing on somebody else’s radio, and you can find out what it is and add it to your list of songs forever instead of spending your life wondering. This is also how I found out about Eddie Meduza and “Elevator Operator” and Cymande.
Update (Sep. 29): Since this post has now gone semi-viral on X, Hacker News, etc., with people arguing about how trivial or nontrivial was GPT5’s “discovery,” it seems worthwhile to say something that was implicit in the post.
Namely, GPT5-Thinking’s suggestion of a function to use “should have” been obvious to us. It would have been obvious to us had we known more, or had we spent more time studying the literature or asking experts.
The point is, anyone engaged in mathematical research knows that an AI that can “merely” fill in the insights that “should’ve been” obvious to you is a really huge freaking deal! It speeds up the actual discovery process, as opposed to the process of writing LaTeX or preparing the bibliography or whatever. This post gave one tiny example of what I’m sure will soon be thousands.
I should also add that, since this post went up, a commenter named Phillip Harris proposed a better function to use than GPT-5’s: det(I-E) rather than Tr[(I-E)^{-1}]. While we’re still checking details, not only do we think this works, we think it simplifies our argument and solves one of our open problems. So it seems human supremacy has been restored, at least for now!
A couple days ago, Freek Witteveen of CWI and I posted a paper to the arXiv called “Limits to black-box amplification in QMA.” Let me share the abstract:
We study the limitations of black-box amplification in the quantum complexity class QMA. Amplification is known to boost any inverse-polynomial gap between completeness and soundness to exponentially small error, and a recent result (Jeffery and Witteveen, 2025) shows that completeness can in fact be amplified to be doubly exponentially close to 1. We prove that this is optimal for black-box procedures: we provide a quantum oracle relative to which no QMA verification procedure using polynomial resources can achieve completeness closer to 1 than doubly exponential, or a soundness which is super-exponentially small. This is proven by using techniques from complex approximation theory, to make the oracle separation from (Aaronson, 2008), between QMA and QMA with perfect completeness, quantitative.
You can also check out my PowerPoint slides here.
To explain the context: QMA, or Quantum Merlin Arthur, is the canonical quantum version of NP. It’s the class of all decision problems for which, if the answer is “yes,” then Merlin can send Arthur a quantum witness state that causes him to accept with probability at least 2/3 (after a polynomial-time quantum computation), while if the answer is “no,” then regardless of what witness Merlin sends, Arthur accepts with probability at most 1/3. Here, as usual in complexity theory, the constants 2/3 and 1/3 are just conventions, which can be replaced (for example) by 1-2^{-n} and 2^{-n} using amplification.
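(For readers wondering why those constants really are inessential, here’s the standard textbook repetition argument, not the fancier amplification procedure discussed below: Arthur asks for k independent copies of the witness, runs his verification on each, and accepts if and only if a majority of the runs accept. Ignoring the subtlety that a dishonest Merlin could entangle the copies, which is known not to help him, a Chernoff–Hoeffding bound gives

$$\Pr[\text{majority vote errs}] \le e^{-2k(1/6)^2} = e^{-k/18},$$

so k = O(n) repetitions already push both the completeness and soundness errors below 2^{-n}.)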
A longstanding open problem about QMA—not the biggest problem, but arguably the most annoying—has been whether the 2/3 can be replaced by 1, as it can be for classical MA for example. In other words, does QMA = QMA_1, where QMA_1 is the subclass of QMA that admits protocols with “perfect completeness”? In 2008, I used real analysis to show that there’s a quantum oracle relative to which QMA ≠ QMA_1, which means that any proof of QMA = QMA_1 would need to use “quantumly nonrelativizing techniques” (not at all an insuperable barrier, but at least we learned something about why the problem is nontrivial).
Then came a bombshell: in June, Freek Witteveen and longtime friend-of-the-blog Stacey Jeffery released a paper showing that any QMA protocol can be amplified, in a black-box manner, to have completeness error that’s doubly exponentially small, 1/exp(exp(n)). They did this via a method I never would’ve thought of, wherein a probability of acceptance is encoded via the amplitudes of a quantum state that decrease in a geometric series. QMA, it turned out, was an old friend that still had surprises up its sleeve after a quarter-century.
In August, we had Freek speak about this breakthrough by Zoom in our quantum group meeting at UT Austin. Later that day, I asked Freek whether their new protocol was the best you could hope to do with black-box techniques, or whether for example one could amplify the completeness error to be triply exponentially small, 1/exp(exp(exp(n))). About a week later, Freek and I had a full proof written down that, using black-box techniques, doubly-exponentially small completeness error is the best you can do. In other words: we showed that, when one makes my 2008 QMA ≠ QMA_1 quantum oracle separation quantitative, one gets a lower bound that precisely matches Freek and Stacey’s protocol.
All this will, I hope, interest and excite aficionados of quantum complexity classes, while others might have very little reason to care.
But here’s a reason why other people might care. This is the first paper I’ve ever put out for which a key technical step in the proof of the main result came from AI—specifically, from GPT5-Thinking. Here was the situation: we had an N×N Hermitian matrix E(θ) (where, say, N=2^n), each of whose entries was a poly(n)-degree trigonometric polynomial in a real parameter θ. We needed to study the largest eigenvalue of E(θ), as θ varied from 0 to 1, to show that this λ_max(E(θ)) couldn’t start out close to 0 but then spend a long time “hanging out” ridiculously close to 1, like 1/exp(exp(exp(n))) close for example.
Given a week or two to try out ideas and search the literature, I’m pretty sure that Freek and I could’ve solved this problem ourselves. Instead, though, I simply asked GPT5-Thinking. After five minutes, it gave me something confident, plausible-looking, and (I could tell) wrong. But rather than laughing at the silly AI like a skeptic might do, I told GPT5 how I knew it was wrong. It thought some more, apologized, and tried again, and gave me something better. So it went for a few iterations, much like interacting with a grad student or colleague. Within a half hour, it had suggested to look at the function
$$ Tr[(I-E(\theta))^{-1}] = \sum_{i=1}^N \frac{1}{1-\lambda_i(\theta)}. $$
It pointed out, correctly, that this was a rational function in θ of controllable degree, that happened to encode the relevant information about how close the largest eigenvalue λ_max(E(θ)) is to 1. And this … worked, as we could easily check ourselves with no AI assistance. And I mean, maybe GPT5 had seen this or a similar construction somewhere in its training data. But there’s not the slightest doubt that, if a student had given it to me, I would’ve called it clever. Obvious with hindsight, but many such ideas are.
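(As a quick illustration, here’s a numerical sanity check of that identity, mine rather than anything from the paper: for a Hermitian E whose eigenvalues all differ from 1, the trace of (I-E)^{-1} equals the sum of 1/(1-λ_i), and that sum blows up exactly when some eigenvalue approaches 1.)

```python
# Numerical sanity check (not from the paper) of Tr[(I-E)^{-1}] = sum_i 1/(1 - lambda_i)
# for a random Hermitian E with all eigenvalues bounded away from 1.
import numpy as np

rng = np.random.default_rng(0)
n = 6
A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
H = (A + A.conj().T) / 2                 # random Hermitian matrix
E = 0.9 * H / np.linalg.norm(H, 2)       # rescale so every eigenvalue lies in (-0.9, 0.9)

lhs = np.trace(np.linalg.inv(np.eye(n) - E)).real
rhs = sum(1 / (1 - lam) for lam in np.linalg.eigvalsh(E))
print(lhs, rhs)                           # agree up to floating-point error
assert abs(lhs - rhs) < 1e-8
```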
I had tried similar problems a year ago, with the then-new GPT reasoning models, but I didn’t get results that were nearly as good. Now, in September 2025, I’m here to tell you that AI has finally come for what my experience tells me is the most quintessentially human of all human intellectual activities: namely, proving oracle separations between quantum complexity classes. Right now, it almost certainly can’t write the whole research paper (at least if you want it to be correct and good), but it can help you get unstuck if you otherwise know what you’re doing, which you might call a sweet spot. Who knows how long this state of affairs will last? I guess I should be grateful that I have tenure.
[Figure: adapted from Fig. 1 of this preprint.]
The longtime Wisconsin Public Radio show To The Best of Our Knowledge airs its last episode this weekend. You can listen to the stream at that link or you can listen to it over the airwaves tomorrow with an FM receiver, the way God intended. Either way, there are more than 1000 archived episodes for you to enjoy at your leisure.
I don’t know the story of why this show was cancelled, whether it had particular enemies, or whether it was the victim of budget cuts driven by hostility to public radio more generally. I’ve been on this show several times. Anne Strainchamps is one of the best interviewers I’ve ever heard. She listens, she goes wherever the thing wants to go (even when I the interviewee have no idea where it’s going to go), she’s funny in a way that works with, not in competition with, the person she’s talking to. Radio sounds old-fashioned, but it still has a sneaky, immense reach. Every time I did this show, people came up to me afterwards, people who would probably never pick up one of my books, to say, Hey, I heard you. The radio is there for everyone, everywhere, to find. It is public.
Public in the same way a public park in a small town is public, or the public schools I went to and my parents and kids went to are public, or a public highway is public. It’s not there to sell something, or to make you feel mad or angry or worried on the way to selling you something. It’s just a group of people doing some work to put something good out there for people to use. So little of our world is like this now. We should appreciate the parts that are, and mourn a little whenever one more public thing gets scraped away.
Today, I got email after email asking me to comment on a new paper from HSBC—yes, the bank—together with IBM. The paper claims to use a quantum computer to get a 34% advantage in predictions of financial trading data. (See also blog posts here and here, or numerous popular articles that you can easily find and I won’t link.) What have we got? Let’s read the abstract:
The estimation of fill probabilities for trade orders represents a key ingredient in the optimization of algorithmic trading strategies. It is bound by the complex dynamics of financial markets with inherent uncertainties, and the limitations of models aiming to learn from multivariate financial time series that often exhibit stochastic properties with hidden temporal patterns. In this paper, we focus on algorithmic responses to trade inquiries in the corporate bond market and investigate fill probability estimation errors of common machine learning models when given real production-scale intraday trade event data, transformed by a quantum algorithm running on IBM Heron processors, as well as on noiseless quantum simulators for comparison. We introduce a framework to embed these quantum-generated data transforms as a decoupled offline component that can be selectively queried by models in low-latency institutional trade optimization settings. A trade execution backtesting method is employed to evaluate the fill prediction performance of these models in relation to their input data. We observe a relative gain of up to ∼ 34% in out-of-sample test scores for those models with access to quantum hardware-transformed data over those using the original trading data or transforms by noiseless quantum simulation. These empirical results suggest that the inherent noise in current quantum hardware contributes to this effect and motivates further studies. Our work demonstrates the emerging potential of quantum computing as a complementary explorative tool in quantitative finance and encourages applied industry research towards practical applications in trading.
As they say, there are more red flags here than in a People’s Liberation Army parade. To critique this paper is not quite “shooting fish in a barrel,” because the fish are already dead before we’ve reached the end of the abstract.
They see a quantum advantage for the task in question, but only because of the noise in their quantum hardware? When they simulate the noiseless quantum computation classically, the advantage disappears? WTF? This strikes me as all but an admission that the “advantage” is just a strange artifact of the particular methods that they decided to compare—that it has nothing really to do with quantum mechanics in general, or with quantum computational speedup in particular.
Indeed, the possibility of selection bias rears its head. How many times did someone do some totally unprincipled, stab-in-the-dark comparison of a specific quantum learning method against a specific classical method, and get predictions from the quantum method that were worse than whatever they got classically … so then they didn’t publish a paper about it?
If it seems like I’m being harsh, it’s because to my mind, the entire concept of this sort of study is fatally flawed from the beginning, optimized for generating headlines rather than knowledge. The first task, I would’ve thought, is to show the reality of quantum computational advantage in the system or algorithm under investigation, even just for a useless benchmark problem. Only after one has done that, has one earned the right to look for a practical benefit in algorithmic trading or predicting financial time-series data or whatever, coming from that same advantage. If you skip the first step, then whatever “benefits” you get from your quantum computer are overwhelmingly likely to be cargo cult benefits.
And yet none of it matters. The paper can, more or less, openly admit all this right in the abstract, and yet it will still predictably generate lots of credulous articles in the business and financial news about HSBC using quantum computers to improve bond trading!—which, one assumes, was the point of the exercise from the beginning. Qombies roam the earth: undead narratives of “quantum advantage for important business problems” detached from any serious underlying truth-claim. And even here at one of the top 50 quantum computing blogs on the planet, there’s nothing I can do about it other than scream into the void.
Update (Sep. 26): Someone let me know that Martin Shkreli, the “pharma bro,” will be hosting a conference call for investors to push back on quantum computing hype. He announced on X that he’s offering quantum computing experts $2k each to speak in his call. On the off chance that Shkreli reads this blog: I’d be willing to do it for $50k. And if Shkreli were to complain about my jacking up the price…
What is AI doing to higher education? And what, if anything, should be done about it?
Chad Orzel at Counting Atoms had a post on this recently, tying the question to a broader point. There is a fundamental tension in universities, between actual teaching and learning and credentials. A student who just wants the piece of paper at the end has no reason not to cheat if they can get away with it, so the easier it becomes to get away with cheating (say, by using AI), the less meaningful the credential gets. Meanwhile, professors who want students to actually learn something are reduced to trying to “trick” these goal-oriented students into accidentally doing something that makes them fall in love with a subject, while being required to police the credential side of things.
Social science, as Orzel admits and emphasizes, is hard. Any broad-strokes picture like this breaks down into details, and while Orzel talks through some of those details, he and I are of course not social scientists.
Because of that, I’m not going to propose my own “theory” here. Instead, think of this post as a request.
I want to read an ethnography of cheating. Like other ethnographies, it should involve someone spending time in the culture in question (here, cheating students), talking to the people involved, and getting a feeling for what they believe and value. Ideally, it would be augmented with an attempt at quantitative data, like surveys, that estimate how representative the picture is.
I suspect that cheating students aren’t just trying to get a credential. Part of why is that I remember teaching pre-meds. In the US, students don’t directly study medicine as a Bachelor’s degree. Instead, they study other subjects as pre-medical students (“pre-meds”), and then apply to Medical School, which grants a degree on the same level as a PhD. As part of their application, they include a standardized test called the MCAT, which checks that they have the basic level of math and science that the medical schools expect.
A pre-med in a physics class, then, has good reason to want to learn: the better they know their physics, the better they will do on the MCAT. If cheating was mostly about just trying to get a credential, pre-meds wouldn’t cheat.
I’m pretty sure they do cheat, though. I didn’t catch any cheaters back when I taught, but there were a lot of students who tried to push the rules, pre-meds and not.
Instead, I think there are a few other motivations involved. And in an ethnography of cheating, I’d love to see some attempt to estimate how prevalent they are:
If you’re aware of a good ethnography of cheating, let me know! And if you’re a social scientist, consider studying this!
Peter’s photos: https://www.icloud.com/sharedalbum/#B275oqs3qKSZvQ
Screenshots: https://www.icloud.com/sharedalbum/#B27532ODWjIQb9
Climbing book launch: https://www.icloud.com/sharedalbum/#B27GWZuqDGnuOyN
Salisbury waters: https://www.icloud.com/sharedalbum/#B275qXGF1JQFkx
Christmas with Ash: https://www.icloud.com/sharedalbum/#B27G6XBubAhoT6
Hosin BBQ duck: https://www.icloud.com/sharedalbum/#B27GY8gBYG3b5mD
Hawks Nest to Smiths Lake: https://www.icloud.com/sharedalbum/#B2759UlCqSH5bE
Europe & Alps: https://www.icloud.com/sharedalbum/#B275ON9t3W0lu
Point Perpendicular: https://www.icloud.com/sharedalbum/#B27GqkRUiGivXD2
Newnes canyoning: https://www.icloud.com/sharedalbum/#B27GfnH8tgHSmX
Coffs Harbour to Yamba: https://www.icloud.com/sharedalbum/#B27J0DiRHJKuuWr
Wendy Bruere Christmas (2020): https://www.icloud.com/sharedalbum/#B27G4TcsmGoHysj
Six Foot Track: https://www.icloud.com/sharedalbum/#B2753qWtHZA9EX
Kosciusko to Kiandra: https://www.icloud.com/sharedalbum/#B27GgZLKuGaewVm
Camping food: https://www.icloud.com/sharedalbum/#B27GtnIORgbmHu
The Aardvark: https://www.icloud.com/sharedalbum/#B275VaUrzvmAiT
Kangaroo Valley kayaking: https://www.icloud.com/sharedalbum/#B27JEsNWnJrCpi0
Claustral canyon: https://www.icloud.com/sharedalbum/#B2755Z2WMOTpsk
Budawang: https://www.icloud.com/sharedalbum/#B27GDdyTvGvpINL
Mother’s Day panoramas (2021): https://www.icloud.com/sharedalbum/#B27GFssfGG9WmJP
Point Perpendicular & Nowra: https://www.icloud.com/sharedalbum/#B27GRMtznGPdeuZ
Blood moon: https://www.icloud.com/sharedalbum/#B27GdIshaG8NgGX
La Perouse to Coogee: https://www.icloud.com/sharedalbum/#B275aVbMK4h7qo
Canberra ASPI launch: https://www.icloud.com/sharedalbum/#B27GQOeMmGj4Zcv
Edible foraging: https://www.icloud.com/sharedalbum/#B275ejO179Si0N
Sydney to Wollongong: https://www.icloud.com/sharedalbum/#B275M7GFPUasMe
Album for Dad, Father’s Day (2021): https://www.icloud.com/sharedalbum/#B2752plgjnnkUe
Vaucluse (with Cheryl, Nestor & Wendy): https://www.icloud.com/sharedalbum/#B275CmvAS4uA0Z
Bouddi National Park: https://www.icloud.com/sharedalbum/#B27GdPblXG8WdOo
Tom Thumb (the 2nd): https://www.icloud.com/sharedalbum/#B275aDWbr4CN2w
Eden to Victoria: https://www.icloud.com/sharedalbum/#B27GJDfWGArX8l
Wendy’s book launch (the 2nd): https://www.icloud.com/sharedalbum/#B27GIcgc2G7h08y
Mark & Pat Bruere visit Sydney: https://www.icloud.com/sharedalbum/#B27G0ehgLbyWyg
New Years Eve climb (2021): https://www.icloud.com/sharedalbum/#B27Ju8EH6JOZxmU
Newnes Canyoning (2022): https://www.icloud.com/sharedalbum/#B275BydzFU0GZ8
Royal National Park (2022): https://www.icloud.com/sharedalbum/#B27GlxzuqGVI5nE
Peter & Wendy: https://www.icloud.com/sharedalbum/#B27Gf693ZG52tfd
Book photo shoots: too rude…
Wendy & Peter’s mushroom trip: https://www.icloud.com/sharedalbum/#B27GrhkPxG27So8
Post-mushroom hike: https://www.icloud.com/sharedalbum/#B27GdFryYG8i3Ur
Wendy Kalymnos favourites: https://www.icloud.com/sharedalbum/#B27JqstnBJEXkH2
Wendy Frenchmans screenshots: https://www.icloud.com/sharedalbum/#B27Jr1PPdJpd7Dq
Instagram: https://www.icloud.com/sharedalbum/#B27GzFCC1Gb4tqr
Haute route: https://www.icloud.com/sharedalbum/#B27J8GySPJtWoQ1
Kim’s KKKalendar: https://www.icloud.com/sharedalbum/#B275fk75vIL0sH
Frenchmans Cap Wild: https://www.icloud.com/sharedalbum/#B27G4VTwGGoFBkz
Photoshoot with Zixin: https://www.icloud.com/sharedalbum/#B27GPCdxkGKPkM4
Wendy birthday hike (2023): https://www.icloud.com/sharedalbum/#B27GWBC59GnHpQW
Bateman’s Bay to Bawley Point: https://www.icloud.com/sharedalbum/#B27JsHvHoJ8bxWf
Stockton Sand dunes (2023): https://www.icloud.com/sharedalbum/#B27GVfZ2vGloFZV
Wendy book launch (2023): https://www.icloud.com/sharedalbum/#B27J058xyJR4IBM
Dolomites (2023): https://www.icloud.com/sharedalbum/#B0Z5kuVsbGJUzKO
Mount Arapiles: https://www.icloud.com/sharedalbum/#B275GH8Mq8Uh2X
Mount Solitary loop: https://www.icloud.com/sharedalbum/#B275nhQST2mETE
Klaus Hanz Franz Rohde Kunst: https://www.icloud.com/sharedalbum/#B27GqQrCLGiY3vb
Klaus Rohde funeral slideshow: https://www.icloud.com/sharedalbum/#B27GDZLe8GXP58K
Dad (old, B&W): https://www.icloud.com/sharedalbum/#B27GLLXGLJ5mbT2
Klaus & Ursula wedding: https://www.icloud.com/sharedalbum/#B275cLqfN7154g
Test Greece: https://www.icloud.com/sharedalbum/#B27Jq4WnLJ6JMNd
From Will Skea (Alps): https://www.icloud.com/sharedalbum/#B27JHciePJFwacG
From Will Skea (Frenchmans Cap): https://www.icloud.com/sharedalbum/#B275ZhN2v3EVq6
From Will Skea (Arapiles): https://www.icloud.com/sharedalbum/#B27JPrgBGJu3BTD
Coffs Harbour to Yamba (2): https://www.icloud.com/sharedalbum/#B27GFqhgJG9LHgT
Mark magic show (2021): https://www.icloud.com/sharedalbum/#B27G60dj6ARCvd
Wendy Christmas present (2020): https://www.icloud.com/sharedalbum/#B275FrPQ6GxvRu
AHS 25 year reunion: https://www.icloud.com/sharedalbum/#B275O3DjHUvSv
WhatsApp: https://www.icloud.com/sharedalbum/#B275tzEA5fX1nc
Armidale High School: https://www.icloud.com/sharedalbum/#B27GnbeumG4PnAF
Book photos for Mum & Dad: https://www.icloud.com/sharedalbum/#B27Gtec4XQkASe
Miscellaneous: https://www.icloud.com/sharedalbum/#B27Gq6kMgGKn7GR
Three Capes Trail (2022): https://www.icloud.com/sharedalbum/#B27G7HOIlGrDUGZ
Childhood computer programming: https://www.icloud.com/sharedalbum/#B275fu2MutDU8N
Magic with Mark in Maroubra: https://www.icloud.com/sharedalbum/#B27Gv6DhEGD9U3G
Photoshoot with Zixin (2024): https://www.icloud.com/sharedalbum/#B27GCATCnJGoRfW
Butt Crack (2021): https://www.icloud.com/sharedalbum/#B275VtHQfMv0zw
Greece photos new (edited to remove photos from wrong album): https://www.icloud.com/sharedalbum/#B27GY3uThGoBcGj
Singapore (all combined): https://www.icloud.com/sharedalbum/#B275qsTcwJKJjl
Hong Kong (transit): https://www.icloud.com/sharedalbum/#B2759v1AbS8Hve
Taiwan: https://www.icloud.com/sharedalbum/#B27GQD2D7Gw0hAp
India (combined): https://www.icloud.com/sharedalbum/#B27Gtue8VQy83g
Freycinet: https://www.icloud.com/sharedalbum/#B27G5VpecGE5Tbg
Triglav: https://www.icloud.com/sharedalbum/#B275MbK9Vy8erz
Shared with me: https://www.icloud.com/sharedalbum/#B27GGXqixzPOrm
Mount Wellington climbing: https://www.icloud.com/sharedalbum/#B27Gd59qiG8Kjy4
New Zealand combined (2004): https://www.icloud.com/sharedalbum/#B27GIZ8BIGNN5jy
New Zealand combined (2005): https://www.icloud.com/sharedalbum/#B27GcuRfIGFVIcL
Yea: https://www.icloud.com/sharedalbum/#B27GZYbYHGhFIir
Mount Pleasant: https://www.icloud.com/sharedalbum/#B275Iy2hC0JTTL
D’Aguilar: https://www.icloud.com/sharedalbum/#B27Gh7fzTGZBosS
Bali (2001): https://www.icloud.com/sharedalbum/#B27G1qNHBGOTbIr
Samba Ninjas: https://www.icloud.com/sharedalbum/#B27GG34bAzqQ0v
Armidale (misc): https://www.icloud.com/sharedalbum/#B27GSkLVwGyobbX
Emma’s party (2008): https://www.icloud.com/sharedalbum/#B275S2ms99Zyby
Goettingen (2011): https://www.icloud.com/sharedalbum/#B27JIrbT3Jsgxhd
South Coast track: https://www.icloud.com/sharedalbum/#B27G58NWBG6QyN7
Minsk (2006): https://www.icloud.com/sharedalbum/#B27G3JpSBGX1UkQ
Baden-Baden (2019): https://www.icloud.com/sharedalbum/#B27595X5HTVzJr
Berlin (combined): https://www.icloud.com/sharedalbum/#B27JqWzChJ6qizD
Switzerland (combined): https://www.icloud.com/sharedalbum/#B275zXwoYGJ6HMF
Italy highlights: https://www.icloud.com/sharedalbum/#B27G47PHQGoJium
Germany (misc): https://www.icloud.com/sharedalbum/#B275hPMfYGu5xVJ
Garmisch (2022): https://www.icloud.com/sharedalbum/#B27GFsbvlG9Xrr6
Germany (2019): https://www.icloud.com/sharedalbum/#B27G6Mn98G56Ncb
Garmisch (2006): https://www.icloud.com/sharedalbum/#B27J5lIdKGLC9KG
Baden-Baden (2005): https://www.icloud.com/sharedalbum/#B275sWRpHHQkt9
Berlin (2005): https://www.icloud.com/sharedalbum/#B27GgOQtrGjQrpH
Zugspitze (2005): https://www.icloud.com/sharedalbum/#B27G81mNdGcApGt
Amsterdam, Bristol (2006): https://www.icloud.com/sharedalbum/#B275B9SRzyBjlH
Baden-Baden (2006): https://www.icloud.com/sharedalbum/#B275eD9V79I2XR
Berlin (2006): https://www.icloud.com/sharedalbum/#B275toRf1fH8MD
Berlin, Jena (2007): https://www.icloud.com/sharedalbum/#B27GTI3fvGVgNit
Erlangen (2006): https://www.icloud.com/sharedalbum/#B27JrotZ2JpMb0i
Garmisch (2010): https://www.icloud.com/sharedalbum/#B27JPJPSiJurzNg
Germany (2010): https://www.icloud.com/sharedalbum/#B275FhYPQP650
Stuttgart (2006): https://www.icloud.com/sharedalbum/#B27GmitydGVVaZh
Changi (2019): https://www.icloud.com/sharedalbum/#B27GnmlKoG4JHpX
Japan (2007): https://www.icloud.com/sharedalbum/#B275AerZbG6FxVL
Japan (2012): https://www.icloud.com/sharedalbum/#B27GjBjobGg6PUa
Miscellaneous (including Japan 2013): https://www.icloud.com/sharedalbum/#B27GTpbybGySbE8
Currumbin & Tugin (2021): https://www.icloud.com/sharedalbum/#B275vBKZ4xH9X6
Brisbane (2021): https://www.icloud.com/sharedalbum/#B275YHsSjxQnm0
Weed in Byron (26/6/2025): https://www.icloud.com/sharedalbum/#B275Q2ydoGsQ4O5
Weed in Byron 2: https://www.icloud.com/sharedalbum/#B27GQDYhLGwsuY4
It’s been over three years since my last post on this blog and I have sometimes been asked, understandably, whether the project I announced in my previous post was actually happening. The answer is yes — the grant I received from the Astera Institute has funded several PhD students and a couple of postdocs, and we have been busy. In my previous post I suggested that I would be open to remote collaboration, but that has happened much less, partly because a Polymath-style approach would have been difficult to manage while also ensuring that my PhD students would have work that they could call their own to put in their theses.
In general I don’t see a satisfactory solution to that problem, but in this post I want to mention a subproject of the main project that is very much intended to be a large public collaboration. A few months ago, a call came out from Renaissance Philanthropies saying that they were launching a $9m AI for Math Fund to spend on projects in the general sphere of AI and mathematics, and inviting proposals. One of the categories that they specifically mentioned was creating new databases, and my group submitted a proposal to create a database of what we call “structured motivated proofs,” a piece of terminology that I will explain in more detail in just a moment. I am happy to report that our proposal was one of the 29 successful ones. Since a good outcome to the project will depend on collaboration from many people outside the group, we need to publicize it, which is precisely the purpose of this post. Below I will be more specific about the kind of help we are looking for.
The underlying thought behind this project is that AI for mathematics is being held back not so much by an insufficient quantity of data as by the wrong kind of data. All mathematicians know, and some of us enjoy complaining about it, that it is common practice when presenting a proof in a mathematics paper, or even textbook, to hide the thought processes that led to the proof. Often this does not matter too much, because the thought processes may be standard ones that do not need to be spelt out to the intended audience. But when proofs start to get longer and more difficult, they can be hard to read because one has to absorb definitions and lemma statements that are not obviously useful, are presented as if they appeared from nowhere, and demonstrate their utility only much later in the argument.
A sign that this is a problem for AI is the behaviour one observes after asking an LLM to prove a statement that is too difficult for it. Very often, instead of admitting defeat, it will imitate the style of a typical mathematics paper and produce rabbits out of hats, together with arguments later on that those rabbits do the required job. The problem is that, unlike with a correct mathematics paper, one finds when one scrutinizes the arguments carefully that they are wrong. However, it is hard to find superficial features that distinguish between an incorrect rabbit with an incorrect argument justifying that rabbit (especially if the argument does not go into full detail) and a correct one, so the kinds of statistical methods used by LLMs do not have an easy way to penalize the incorrectness.
Of course, that does not mean that LLMs cannot do mathematics at all — they are remarkably good at it, at least compared with what I would have expected three years ago. How can that be, given the problem I have discussed in the previous paragraph?
The way I see it (which could change — things move so fast in this sphere), the data that is currently available to train LLMs and other systems is very suitable for a certain way of doing mathematics that I call guess and check. When trying to solve a maths problem, you will normally write down the routine parts of an argument without any fuss (and an LLM can do them too because it has seen plenty of similar examples), but if the problem as a whole is not routine, then at some point you have to stop and think, often because you need to construct an object that has certain properties (I mean this in a rather general way — the “object” might be a lemma that will split up the proof in a nice way) and it is not obvious how to do so. The guess-and-check approach to such moments is what it says: you make as intelligent a guess as you can and then see whether it has the properties you wanted. If it doesn’t, you make another guess, and you keep going until you get lucky.
The reason an LLM might be tempted to use this kind of approach is that the style of mathematical writing I described above makes it look as though that is what we as mathematicians do. Of course, we don’t actually do that, but we tend not to mention all the failed guesses we made and how we carefully examined why they failed, modifying them in appropriate ways in response, until we finally converged on an object that worked. We also don’t mention the reasoning that often takes place before we make the guess, saying to ourselves things like “Clearly an Abelian group can’t have that property, so I need to look for a non-Abelian group.”
Intelligent guess and check works well a lot of the time, particularly when carried out by an LLM that has seen many proofs of many theorems. I have often been surprised when I have asked an LLM a problem of the form “find an x with property P”, where P is some property that is hard to satisfy, and the LLM has had no trouble answering it. But somehow when this happens, the flavour of the answer given by the LLM leaves me with the impression that the technique it has used to construct x is one that it has seen before and regards as standard.
If the above picture of what LLMs can do is correct (the considerations for reinforcement-learning-based systems such as AlphaProof are not identical but I think that much of what I say in this post applies to them too for slightly different reasons), then the likely consequence is that if we pursue current approaches, then we will reach a plateau: broadly speaking they will be very good at answering a question if it is the kind of question that a mathematician with the right domain expertise and good instincts would find reasonably straightforward, but will struggle with anything that is not of that kind. In particular, they will struggle with research-level problems, which are, almost by definition, problems that experts in the area do not find straightforward. (Of course, there would probably be cases where an LLM spots relatively easy arguments that the experts had missed, but that wouldn’t fundamentally alter the fact that they weren’t really capable of doing research-level mathematics.)
But what if we had a database of theorems and proofs that did not hide the thought processes that lay behind the non-obvious details of the proofs? If we could train AI on a database of accounts of proof discoveries and if, having done so, we then asked it to provide similar accounts, then it would no longer resort to guess-and-check when it got stuck, because the proof-discovery accounts it had been trained on would not be resorting to it. There could be a problem getting it to unlearn its bad habits, but I don’t think that difficulty would be impossible to surmount.
The next question is what such a database might look like. One could just invite people to send in stream-of-consciousness accounts of how they themselves found certain proofs, but that option is unsatisfactory for several reasons.
To deal with these kinds of difficulties, we plan to introduce a notion of a structured motivated proof, by which we mean a proof that is generated in a very particular way that I will partially describe below. A major part of the project, and part of the reason we needed funding for it, is to create a platform that will make it convenient to input structured motivated proofs and difficult to insert the kinds of rabbits out of hats that make a proof mysterious and unmotivated. In this way we hope to gamify the task of creating the database, challenging people to input into our system proofs of certain theorems that appear to rely on “magic” ideas, and perhaps even offering prizes for proofs that contain steps that appear in advance to be particularly hard to motivate. (An example: the solution by Ellenberg and Gijswijt of the cap-set problem uses polynomials in a magic-seeming way. The idea of using polynomials came from an earlier paper of Croot, Lev and Pach that proved a closely related theorem, but in that paper it just appears in the statement of their Lemma 1, with no prior discussion apart from the words “in the present paper we use the polynomial method” in the introduction.)
I wrote about motivated proofs in my previous post, but thanks to many discussions with other members of the group, my ideas have developed quite a lot since then. Here are two ways we like to think about the concept.
I will not go into full detail about what I mean by this, but will do so in a future post when we have created the platform that we would like people to use in order to input proofs into the database. But the basic idea is that at any one moment one is in a certain state, which we call a proof-discovery state, and there will be a set of possible moves that can take one from the current proof-discovery state to a new one.
A proof-discovery state is supposed to be a more formal representation of the state one is in when in the middle of solving a problem. Typically, if the problem is difficult, one will have asked a number of questions, and will be aware of logical relationships between them: for example, one might know that a positive answer to Q1 could be used to create a counterexample to Q2, or that Q3 is a special case of Q4, and so on. One will also have proved some results connected with the original question, and again these results will be related to each other and to the original problem in various ways that might be quite complicated: for example P1 might be a special case of Q2, which, if true, would reduce Q3 to Q4, where Q3 is a generalization of the statement we are trying to prove.
Typically we will be focusing on one of the questions, and typically that question will take the form of some hypotheses and a target (the question being whether the hypotheses imply the target). One kind of move we might make is a standard logical move such as forwards or backwards reasoning: for example, if we have hypotheses of the form P and P ⟹ Q, then we might decide to deduce Q. But things get more interesting when we consider slightly less basic actions we might take. Here are three examples.
This is a surprisingly useful way to conceive of what we are talking about, especially as it relates closely to what I was talking about earlier: imposing a standard form on motivated proofs (which is why we call them “structured” motivated proofs) and gamifying the process of producing them.
The idea is that a structured motivated proof is one that can be generated using an interface (which we are in the process of creating — at the moment we have a very basic prototype that has a few of the features we will need, but not yet the more interesting ones) that has one essential property: the user cannot type in data. So what can they do? They can select text that is on their screen (typically mathematical expressions or subexpressions), they can click buttons, choose items from drop-down menus, and accept or reject “obvious” suggestions made to them by the interface.
If, for example, the current goal is an existential statement ∃x P(x), then typing in a formula that defines a suitable x is not possible, so instead one must select text or generate new text by clicking buttons, choosing from short drop-down menus, and so on. This forces the user to generate x step by step, which is our proxy for showing where the idea of using x came from.
Broadly speaking, the way the prototype works is to get an LLM to read a JSON object that describes the variables, hypotheses and goals involved in the proof state in a structured format, and to describe (by means of a fairly long prompt) the various moves it might be called upon to do. Thus, the proofs generated by the system are not formally verified, but that is not an issue that concerns us in practice since there will be a human in the loop throughout to catch any mistakes that the LLM might make, and this flexibility may even work to our advantage to better capture the fluidity of natural-language mathematics.
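To make that concrete, here is a purely illustrative sketch of what such a JSON proof-state object might look like. The field names and the list of moves are invented for this example; they are not the prototype’s actual schema.

```python
# Illustrative only: a guess at the shape of a proof-discovery state as a JSON object.
# The field names and move names below are invented for this sketch, not taken from
# the prototype described in the post.
import json

proof_state = {
    "variables": [
        {"name": "n", "type": "positive integer"},
        {"name": "x", "type": "real number"},
    ],
    "hypotheses": [
        {"id": "H1", "statement": "x > 0"},
        {"id": "H2", "statement": "x^n < 1"},
    ],
    "goals": [
        {"id": "G1", "statement": "x < 1"},
    ],
    "available_moves": [
        "forwards_reasoning",     # combine selected hypotheses to derive a new one
        "backwards_reasoning",    # replace the goal by something that implies it
        "expand_definition",      # unfold a selected term
        "suggest_special_case",   # propose an easier instance to study first
    ],
}

# This is roughly the structured format an LLM would be asked to read, alongside a
# long prompt describing what each move is allowed to do.
print(json.dumps(proof_state, indent=2))
```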
There is obviously a lot more to say about what the proof-generating moves are, or (approximately equivalently) what the options provided by a point-and-click system will be. I plan to discuss that in much more detail when we are closer to having an interface ready, the target for which is the end of this calendar year. But the aim of the project is to create a database of examples of proofs that have been successfully generated using the interface, which can then be used to train AI to play the generate-structured-motivated-proof game.
There are several tasks that will need doing once the project gets properly under way. Here are some of the likely ones.
If you think you might be interested in any of these roles, please feel free to get in touch. Probably the hardest recruitment task for us will be identifying the right people with the right mixture of mathematical knowledge and software engineering skills to help us turn the platform into a well-designed web-based one that is convenient and pleasurable to use. If you think you might be such a person, or if you have a good idea for how we should go about finding one, we would be particularly interested to hear from you.
In a future post, I will say more about the kinds of moves that our platform will allow, and will give examples of non-motivated proofs together with how motivated versions of those proofs can be found and entered using the platform (which may involve a certain amount of speculation about what the platform will end up looking like).
In one way, our “moves” can be regarded as tactics of a kind. However, some of the moves we will need are difficult to implement in conventional proof assistants such as Lean. In parallel with the work described above, we hope to create an interface to Lean that would allow one to carry out proof-discovery moves of the kind discussed above but with the proof-discovery states being collections of Lean proof states. Members of my group have already been working on this and have made some very interesting progress, but there is some way to go. However, we hope that at some point (and this is also part of the project pitched to the AI for Math Fund) we will have created another interface that will have Lean working in the background, so that it will be possible to generate motivated proofs that will be (or perhaps it is better to say include) proofs in Lean at the same time.
Another possibility that we are also considering is to use the output of the first platform (which, as mentioned above, will be fairly formal, but not in the strict sense of a language such as Lean) to create a kind of blueprint that can then be autoformalized automatically. Then we would have a platform that would in principle allow mathematicians to search for proofs while working on their computers without having to learn a formal language, with their thoughts being formalized as they go.
Update (September 24): A sympathetic correspondent wrote to tip me off that this blog post has caused me to get added to a list, maintained by MAGA activists and circulated by email, of academics and others who ought to “[face] some consequences for maligning the patriotic MAGA movement.” Needless to say, not only did this post unequivocally condemn Charlie Kirk’s murder, it even mentioned areas of common ground between me and Kirk, and my beefs with the social-justice left. If someone wants to go to the Texas Legislature to get me fired, literally the only thing they’ll have on me is that I “maligned the patriotic MAGA movement,” i.e. expressed political views shared by the majority of Americans.
Still, it’s a strange honor to have had people on both extremes of the ideological spectrum wanting to cancel me for stuff I’ve written on this blog. What is tenure for, if not this?
Another Update: In a dark and polarized age like ours, one thing that gives hope is the prospect of rational agents updating on each others’ knowledge to come to agreement. On that note, please enjoy this recent podcast, in which a 95-year-old Robert Aumann explains Aumann’s agreement theorem in his own words (see here for my old post about it, one of the most popular in the history of this blog).
From 2016 until last week, as the Trump movement dismantled one after another of the obvious bipartisan norms of the United States that I’d taken for granted since my childhood—e.g., the loser conceding an election and attending the winner’s inauguration, America being proudly a nation of immigrants, science being good, vaccines being good, Russia invading its neighbors being bad, corruption (when it occurred) not openly boasted about—I often consoled myself that at least the First Amendment, the motor of our whole system since 1791, was still in effect. At least you could still call Trump a thug and a conman without fear. Yes, Trump constantly railed against hostile journalists and comedians and protesters, threatened them at his rallies, filed frivolous lawsuits against them, but none of it seemed to lead to any serious program to shut them down. Oceans of anti-Trump content remained a click away.
I even wondered whether this was Trump’s central innovation in the annals of authoritarianism: proving that, in the age of streaming and podcasts and social media, you no longer needed to bother with censorship in order to build a regime of lies. You could simply ensure that the truth remained one narrative among others, that it never penetrated the epistemic bubble of your core supporters, who’d continue to be algorithmically fed whatever most flattered their prejudices.
Last week, that all changed. Another pillar of the previous world fell. According to the new norm, if you’re a late-night comedian who says anything Trump doesn’t like, he’ll have the FCC threaten your station’s affiliates’ broadcast licenses, and they’ll cave, and you’ll be off the air, and he’ll gloat about it. We ought to be clear that, even conditioned on everything else, this is a huge further step toward how things work in Erdogan’s Turkey or Orban’s Hungary, and how they were never supposed to work in America.
At risk of stating the obvious:
Anyway, I keep hoping that my next post will be about quantum complexity theory or AI alignment or Busy Beaver 6 or whatever. Whenever I feel backed into a corner, however, I will risk my career, and the Internet’s wrath, to blog my nutty, extreme, embarrassing, totally anodyne liberal beliefs that half or more of Americans actually share.
A preamble
Subnuclear physics obeys the laws of quantum mechanics, which are quite a far cry from those of classical mechanics we are accustomed to. For that reason, one might be inclined to believe that analogies based on everyday life cannot come close to explaining the behavior of elementary particles. But that is not true – in fact, many properties of elementary particles are understandable in analogy with the behavior of classical systems, without the need to delve into the intricacies of the quantum world. And if you have been reading this blog for a while, you know what I think – the analogy is a powerful didactical instrument, and it is indeed at the very core of our learning processes.
I judge a bookstore by the number of Diana Wynne Jones novels it stocks. The late British author wrote some of the twentieth century’s most widely lauded science-fiction and fantasy (SFF). She clinched more honors than I should list, including two World Fantasy Awards. Neil Gaiman, author of American Gods, called her “the best children’s writer of the last forty years” in 2010—and her books suit children of all ages.1 But Wynne Jones passed away as I was finishing college, and her books have been disappearing from American bookshops. The typical shop stocks, at best, a book in the series she began with Howl’s Moving Castle, which Hayao Miyazaki adapted into an animated film.
I don’t recall the last time I glimpsed Deep Secret in a bookshop, but it ranks amongst my favorite Wynne Jones books—and favorite books, full-stop. So I relished living part of that book this spring.
Deep Secret centers on video-game programmer Rupert Venables. Outside of his day job, he works as a Magid, a magic user who helps secure peace and progress across the multiple worlds. Another Magid has passed away, and Rupert must find a replacement for him. How does Rupert track down and interview his candidates? By consolidating their fate lines so that the candidates converge on an SFF convention. Of course.
My fate line drew me to an SFF convention this May. Balticon takes place annually in Baltimore, Maryland. It features not only authors, agents, and publishers, but also science lecturers. I received an invitation to lecture about quantum steampunk—not video-game content,2 but technology-oriented like Rupert’s work. I’d never attended an SFF convention,3 so I reread Deep Secret as though studying for an exam.
Rupert, too, is attending his first SFF convention. A man as starched as his name sounds, Rupert packs suits, slacks, and a polo-neck sweater for the weekend—to the horror of a denim-wearing participant. I didn’t bring suits, in my defense. But I did dress business-casual, despite having anticipated that jeans, T-shirts, and capes would surround me.
I checked into a Renaissance Hotel for Memorial Day weekend, just as Rupert checks into the Hotel Babylon for Easter weekend. Like him, I had to walk an inordinately long distance from the elevators to my room. But Rupert owes his trek to whoever’s disrupted the magical node centered on his hotel. My hotel’s architects simply should have installed more elevator banks.
Balticon shared much of its anatomy with Rupert’s con, despite taking place in a different century and country (not to mention world). Participants congregated downstairs at breakfast (continental at Balticon, waitered at Rupert’s hotel). Lectures and panels filled most of each day. A masquerade took place one night. (I slept through Balticon’s; impromptu veterinary surgery occupies Rupert during his con’s.) Participants vied for artwork at an auction. Booksellers and craftspeople hawked their wares in a dealer’s room. (None of Balticon’s craftspeople knew their otherworldly subject matter as intimately as Rupert’s Magid colleague Zinka Fearon does, I trust. Zinka paints her off-world experiences when in need of cash.)
In our hotel room, I read out bits of Deep Secret to my husband, who confirmed the uncanniness with which they echoed our experiences. Both cons featured floor-length robes, Batman costumes, and the occasional slinky dress. Some men sported long-enough locks, and some enough facial hair, to do a Merovingian king proud. Rupert registers “a towering papier-mâché and plastic alien” one night; on Sunday morning, a colossal blow-up unicorn startled my husband and me. We were riding the elevator downstairs to breakfast, pausing at floor after floor. Hotel guests packed the elevator like Star Wars fans at a Lucasfilm debut. Then, the elevator halted again. The doors opened on a bespectacled man, 40-something years old by my estimate, dressed as a blue-and-white unicorn. The costume billowed out around him; the golden horn towered multiple feet above his head. He gazed at our sardine can, and we gazed at him, without speaking. The elevator doors shut, and we continued toward breakfast.
Despite having read Deep Secret multiple times, I savored it again. I even laughed out loud. Wynne Jones paints the SFF community with the humor, exasperation, and affection one might expect of a middle-school teacher contemplating her students. I empathize, belonging to a community—the physics world—nearly as idiosyncratic as the SFF community.4 Wynne Jones’s warmth for her people suffuses Deep Secret; introvert Rupert surprises himself by enjoying a dinner with con-goers and wishing to spend more time with them. The con-goers at my talk exhibited as much warmth as any audience I’ve spoken to, laughing, applauding, and asking questions. I appreciated sojourning in their community for a weekend.5
This year, my community is fêting the physicists who founded quantum theory a century ago. Wynne Jones sparked imaginations two decades ago. Let’s not let her memory slip from our fingertips like a paperback over which we’re falling asleep. After all, we aren’t forgetting Louis de Broglie, Paul Dirac, and their colleagues. So check out a Wynne Jones novel the next time you visit a library, or order a novel of hers to your neighborhood bookstore. Deep Secret shouldn’t be an actual secret.
With thanks to Balticon’s organizers, especially Miriam Winder Kelly, for inviting me and for fussing over their speakers’ comfort like hens over chicks.
1Wynne Jones dedicated her novel Hexwood to Gaiman, who expressed his delight in a poem. I fancy the comparison of Gaiman, a master of phantasmagoria and darkness, to a kitten.
2Yet?
3I’d attended a steampunk convention, and spoken at a Boston SFF convention, virtually. But as far as such conventions go, attending virtually is to attending in person as my drawings are to a Hayao Miyazaki film.
4But sporting fewer wizard hats.
5And I wonder what the Diana Wynne Jones Conference–Festival is like.
Given a threshold $y$, a $y$-smooth number (or $y$-friable number) is a natural number whose prime factors are all at most $y$. We use $\Psi(x,y)$ to denote the number of $y$-smooth numbers up to $x$. In studying the asymptotic behavior of $\Psi(x,y)$, it is customary to write $y$ as $x^{1/u}$ (or $u$ as $\frac{\log x}{\log y}$) for some $u > 0$. For small values of $u$, the behavior is straightforward: for instance if $u \leq 1$, then all numbers up to $x$ are automatically $x^{1/u}$-smooth, so
$$\Psi(x, x^{1/u}) = \lfloor x \rfloor.$$
More generally, for any fixed $u$, de Bruijn showed that
$$\Psi(x, x^{1/u}) = (\rho(u) + o(1))\, x, \ \ \ \ (1)$$
where $\rho$ is the Dickman function.
The asymptotic behavior of $\rho(u)$ as $u \to \infty$ is rather complicated. Very roughly speaking, it has inverse factorial behavior; there is a general upper bound $\rho(u) \leq \frac{1}{\Gamma(u+1)}$, and a crude asymptotic
$$\rho(u) = \exp\left( -u \xi(u) + \int_0^{\xi(u)} \frac{e^s - 1}{s}\, ds + O(\log u) \right), \ \ \ \ (2)$$
where $\xi(u)$ denotes the unique positive solution to $e^{\xi(u)} = 1 + u \xi(u)$.
This asymptotic (2) is quite complicated, and so one does not expect there to be any simple argument that could recover it without extensive computation. However, it turns out that one can use a “maximum entropy” analysis to get a reasonably good heuristic approximation to (2), that at least reveals the role of the mysterious function $\xi$. The purpose of this blog post is to give this heuristic.
Viewing $x$ as $y^u$, the task is to try to count the number of $y$-smooth numbers of magnitude close to $x$. We will propose a probabilistic model to generate $y$-smooth numbers as follows: for each prime $p \leq y$, select the prime $p$ with an independent probability $c_p$ for some coefficient $0 \leq c_p \leq 1$, and then multiply all the selected primes together. This will clearly generate a random $y$-smooth number $n$, and by the law of large numbers, the (log-)magnitude of this number should be approximately
$$\log n \approx \sum_{p \leq y} c_p \log p. \ \ \ \ (3)$$
The indicator of the event that $p$ divides this number is a Bernoulli random variable with mean $c_p$, so the Shannon entropy of this random variable is
$$h(c_p) = c_p \log \frac{1}{c_p} + (1 - c_p) \log \frac{1}{1 - c_p}. \ \ \ \ (4)$$
The maximum entropy heuristic then predicts that the number of $y$-smooth numbers of magnitude close to $x$ should be roughly $\exp\big( \sum_{p \leq y} h(c_p) \big)$, with the coefficients $c_p$ chosen to maximize the total entropy $\sum_{p \leq y} h(c_p)$ subject to the constraint
$$\sum_{p \leq y} c_p \log p = \log x. \ \ \ \ (5)$$
One could solve this constrained optimization problem directly using Lagrange multipliers, but we simplify things a bit by passing to a continuous limit. We take a continuous ansatz $c_p = \frac{1}{p} g\big( \frac{\log p}{\log y} \big)$, where $g : [0,1] \to [0,+\infty)$ is a smooth function. Using Mertens’ theorem, the constraint (5) then heuristically becomes
$$\int_0^1 g(s)\, ds = u.$$
This is a standard calculus of variations problem. The Euler-Lagrange equation for this problem can be easily worked out to be
$$g(s) = e^{\xi s}$$
for some constant $\xi > 0$; the constraint then forces $\frac{e^\xi - 1}{\xi} = u$, which is precisely the equation defining $\xi(u)$.
Nevertheless, this demonstrates that the maximum entropy method can achieve a reasonably good heuristic understanding of smooth numbers. In fact we also gain some insight into the “anatomy of integers” of such numbers: the above analysis suggests that a typical $y$-smooth number of size about $x = y^u$ will be divisible by a given prime $p \leq y$ with probability about $\frac{1}{p} e^{\xi(u) \log p / \log y}$. Thus, for $p$ close to $y$, the probability of being divisible by $p$ is elevated by a factor of about $e^{\xi(u)} = 1 + u \xi(u)$ over the baseline probability $\frac{1}{p}$ of an arbitrary (non-smooth) number being divisible by $p$; so (by Mertens’ theorem) a typical $y$-smooth number is actually largely comprised of something like $u$ prime factors all of size about $y^{1-o(1)}$, with the smaller primes contributing a lower order factor. This is in marked contrast with the anatomy of a typical (non-smooth) number $n \leq x$, which typically has $O(1)$ prime factors in each hyperdyadic scale $[\exp(e^j), \exp(e^{j+1})]$ in $[1, x]$, as per Mertens’ theorem.
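To get a quick numerical feel for this, here is a small sketch in Haskell (just an illustrative toy, not a calculation from the literature): it solves $e^{\xi} = 1 + u\xi$ for $\xi(u)$ by Newton's method, evaluates the exponent $-u\xi(u) + \int_0^{\xi(u)} \frac{e^s-1}{s}\,ds$ by a midpoint rule, and compares it with the cruder $-u\log u - u\log\log u + u$; the two agree to leading order for large $u$.

-- Toy numerics: xi(u) solves e^xi = 1 + u*xi; compare the exponent above
-- with the crude approximation -u log u - u log log u + u.
solveXi :: Double -> Double
solveXi u = iterate newton x0 !! 50
  where
    x0 = log (u * log (u + 1) + 2)                   -- rough starting guess
    newton x = x - (exp x - 1 - u * x) / (exp x - u)

intExpm1 :: Double -> Double                         -- midpoint rule for int_0^xi (e^s - 1)/s ds
intExpm1 xi = sum [ f ((fromIntegral k + 0.5) * h) * h | k <- [0 .. n - 1] ]
  where
    n = 10000 :: Int
    h = xi / fromIntegral n
    f s = (exp s - 1) / s

heuristicLogRho :: Double -> Double
heuristicLogRho u = -u * xi + intExpm1 xi  where xi = solveXi u

crudeLogRho :: Double -> Double
crudeLogRho u = -u * log u - u * log (log u) + u

main :: IO ()
main = mapM_ (\u -> print (u, heuristicLogRho u, crudeLogRho u)) [10, 20, 50]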
Black holes have been in the news a couple times recently.
On one end, there was the observation of an extremely large black hole in the early universe, when no black holes of the kind were expected to exist. My understanding is this is very much a “big if true” kind of claim, something that could have dramatic implications but may just be being misunderstood. At the moment, I’m not going to try to work out which one it is.
In between, you have a piece by me in Quanta Magazine a couple weeks ago, about tests of whether black holes deviate from general relativity. They don’t, by the way, according to the tests so far.
And on the other end, you have the coverage last week of a “confirmation” (or even “proof”) of the black hole area law.
The black hole area law states that the total area of the event horizons of all black holes will always increase. It’s also known as the second law of black hole thermodynamics, paralleling the second law of thermodynamics that entropy always increases. Hawking proved this as a theorem in 1971, assuming that general relativity holds true.
(That leaves out quantum effects, which indeed can make black holes shrink, as Hawking himself famously later argued.)
The black hole area law is supposed to hold even when two black holes collide and merge. While the combination may lose energy (leading to gravitational waves that carry energy to us), it will still have greater area, in the end, than the sum of the black holes that combined to make it.
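As a rough back-of-the-envelope illustration (for non-spinning black holes only): a Schwarzschild black hole of mass $M$ has horizon area $A = 16\pi G^2 M^2/c^4$, proportional to $M^2$. If holes of masses $M_1$ and $M_2$ merge into one of mass $M_f = M_1 + M_2 - E_{\mathrm{rad}}/c^2$, the area law demands
$$\left(M_1 + M_2 - E_{\mathrm{rad}}/c^2\right)^2 \geq M_1^2 + M_2^2,$$
which is comfortably satisfied even when a few percent of the total mass is radiated away, since $(M_1 + M_2)^2$ exceeds $M_1^2 + M_2^2$ by the cross term $2 M_1 M_2$. Losing energy and gaining total area are perfectly compatible.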
Ok, so that’s the area law. What’s this paper that’s supposed to “finally prove” it?
The LIGO, Virgo, and KAGRA collaborations recently published a paper based on gravitational waves from one particularly clear collision of black holes, which they measured back in January. They compare their measurements to predictions from general relativity, and checked two things: whether the measurements agreed with predictions based on the Kerr metric (how space-time around a rotating black hole is supposed to behave), and whether they obeyed the area law.
The first check isn’t so different in purpose from the work I wrote about in Quanta Magazine, just using different methods. In both studies, physicists are looking for deviations from the laws of general relativity, triggered by the highly curved environments around black holes. These deviations could show up in one way or another in any black hole collision, so while you would ideally look for them by scanning over many collisions (as the paper I reported on did), you could do a meaningful test even with just one collision. That kind of a check may not be very strenuous (if general relativity is wrong, it’s likely by a very small amount), but it’s still an opportunity, diligently sought, to be proven wrong.
The second check is the one that got the headlines. It also got first billing in the paper title, and a decent amount of verbiage in the paper itself. And if you think about it for more than five minutes, it doesn’t make a ton of sense as presented.
Suppose the black hole area law is wrong, and sometimes black holes lose area when they collide. Even if this happened sometimes, you wouldn’t expect it to happen every time. It’s not like anyone is pondering a reverse black hole area law, where black holes only shrink!
Because of that, I think it’s better to say that LIGO measured the black hole area law for this collision, while they tested whether black holes obey the Kerr metric. In one case, they’re just observing what happened in this one situation. In the other, they can try to draw implications for other collisions.
That doesn’t mean their work wasn’t impressive, but it was impressive for reasons that don’t seem to be getting emphasized. It’s impressive because, prior to this paper, they had not managed to measure the areas of colliding black holes well enough to confirm that they obeyed the area law! The previous collisions looked like they obeyed the law, but when you factor in the experimental error they couldn’t say it with confidence. The current measurement is better, and can. So the new measurement is interesting not because it confirms a fundamental law of the universe or anything like that…it’s interesting because previous measurements were so bad, that they couldn’t even confirm this kind of fundamental law!
That, incidentally, feels like a “missing mood” in pop science. Some things are impressive not because of their amazing scale or awesome implications, but because they are unexpectedly, unintuitively, really really hard to do. These measurements shouldn’t be thought of, or billed, as tests of nature’s fundamental laws. Instead they’re interesting because they highlight what we’re capable of, and what we still need to accomplish.
Sunflowers are blooming, stores are trumpeting back-to-school sales, and professors are scrambling to chart out the courses they planned to develop in July. If you’re applying for an academic job this fall, now is the time to get your application ducks in a row. Seeking a postdoctoral or faculty position? Your applications will center on research statements. Often, a research statement describes your accomplishments and sketches your research plans. What do evaluators look for in such documents? Here’s my advice, which targets postdoctoral fellowships and faculty positions, especially for theoretical physicists.
The 2025 Quantum Leadership Awards were announced at the Quantum World Congress on 18 September 2025. Upon receiving the Academic Pioneer in Quantum Award, John Preskill made these remarks.
I’m enormously excited and honored to receive this Quantum Leadership Award, and especially thrilled to receive it during this, the International Year of Quantum. The 100th anniversary of the discovery of quantum mechanics is a cause for celebration because that theory provides our deepest and most accurate description of how the universe works, and because that deeper understanding has incalculable value to humanity. What we have learned about electrons, photons, atoms, and molecules in the past century has already transformed our lives in many ways, but what lies ahead, as we learn to build and precisely control more and more complex quantum systems, will be even more astonishing.
As a professor at a great university, I have been lucky in many ways. Lucky to have the freedom to pursue the scientific challenges that I find most compelling and promising. Lucky to be surrounded by remarkable, supportive colleagues. Lucky to have had many collaborators who enabled me to do things I could never have done on my own. And lucky to have the opportunity to teach and mentor young scientists who have a passion for advancing the frontiers of science. What I’m most proud of is the quantum community we’ve built at Caltech, and the many dozens of young people who imbibed the interdisciplinary spirit of Caltech and then moved onward to become leaders in quantum science at universities, labs, and companies all over the world.
Right now is a thrilling time for quantum science and technology, a time of rapid progress, but these are still the early days in a nascent second quantum revolution. In quantum computing, we face two fundamental questions: How can we scale up to quantum machines that can solve very hard computational problems? And once we do so, what will be the most important applications for science and for industry? We don’t have fully satisfying answers yet to either question and we won’t find the answers all at once – they will unfold gradually as our knowledge and technology advance. But 10 years from now we’ll have much better answers than we have today.
Companies are now pursuing ambitious plans to build the world’s most powerful quantum computers. Let’s not forget how we got to this point. It was by allowing some of the world’s most brilliant people to follow their curiosity and dream about what the future could bring. To fulfill the potential of quantum technology, we need that spirit of bold adventure now more than ever before. This award honors one scientist, and I’m profoundly grateful for this recognition. But more importantly it serves as a reminder of the vital ongoing need to support the fundamental research that will build foundations for the science and technology of the future. Thank you very much!
It’s well known that you can construct the octonions using triality. One statement of triality is that $\mathrm{Spin}(8)$ has nontrivial outer automorphisms of order 3. On the other hand, the octonions have nontrivial inner automorphisms of order 3. My question: can we deduce one of these facts from the other?
The second fact is perhaps not very well known. It may even be hard to understand what it means. Though the octonions are nonassociative, for any nonzero octonion $a$ the map
$$x \mapsto a x a^{-1}$$
is well-defined, since $(a x) a^{-1} = a (x a^{-1})$, which one can show using the fact that the octonions are alternative. More surprisingly, whenever $a^3 = 1$, this map is an automorphism of the octonions:
$$a (x y) a^{-1} = (a x a^{-1})(a y a^{-1}),$$
and has order 3:
$$a^3 x a^{-3} = x.$$
To understand this latter fact, we can look at a paper of Lamont. Theorem 2.1 here implies that a non-real octonion $a$ with $|a| = 1$ defines an inner automorphism if and only if $\pm a$ has order 6.
However, the result is stated differently there. Paraphrasing somewhat, Lamont’s theorem says that any $a$ that is not a real multiple of $1$ defines an inner automorphism if and only if $a$ obeys
$$a^2 \mp |a|\, a + |a|^2 = 0.$$
This equation is equivalent to $\mathrm{Re}(a) = \pm \tfrac{1}{2} |a|$, which is equivalent to $a$ lying at either a $60^\circ$ angle or a $120^\circ$ angle from the octonion $1$.
Nonzero octonions on the real line clearly define the identity inner automorphism. Thus, a nonzero octonion defines an inner automorphism if and only if its angle from $1$ is $60^\circ$, $120^\circ$, or a multiple of $180^\circ$. In this case we can normalize $a$ to have $|a| = 1$ without changing the inner automorphism it defines, and then we have $a^6 = 1$. Note also that $a$ and $-a$ define the same inner automorphism.
It follows that an octonion $a$ on the unit sphere defines an inner automorphism iff $a^6 = 1$, and that every nontrivial inner automorphism of $\mathbb{O}$ has order 3.
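As a concrete sanity check: take any imaginary unit octonion $u$ and let $a = \cos 60^\circ + u \sin 60^\circ = \tfrac12 + \tfrac{\sqrt3}{2} u$. Then $a$ lies on the unit sphere at a $60^\circ$ angle from $1$, and $a^3 = \cos 180^\circ + u \sin 180^\circ = -1$, so $a^6 = 1$. Conjugation by $a$ fixes the plane spanned by $1$ and $u$ and rotates the six imaginary directions orthogonal to $u$ by $120^\circ$ (in three orthogonal planes), while conjugation by $a^3 = -1$ is the identity. So this inner automorphism is nontrivial and its cube is the identity: it has order 3.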
However, if you look at Lamont’s proof, you’ll see the equation $a^6 = 1$ plays no direct role! Instead, he really uses the assumption that $a^3$ is a real multiple of $1$, which is implied by this equation (as easily shown using what we’ve just seen).
From Lamont’s work, one can see the Moufang identities and the characteristic equation for octonions are what force all inner automorphisms of the octonions to have order 3.
Thus, an argument giving a positive answer to my question might involve a link between triality and the Moufang identities. Conway and Smith seem to link them in On Quaternions and Octonions. But I haven’t figured out how to get from the outer automorphisms of $\mathrm{Spin}(8)$ to the inner automorphisms of $\mathbb{O}$, or vice versa!
I asked about this on MathOverflow, but I thought some people here would also be interested.
Guest post by Khyathi Komalan and Andrew Krenz
From Lawvere’s Hegelian taco to Baez’s layer cake analogy to Eugenia Cheng’s How to Bake Pi, categorists have cultivated a rich tradition of culinary metaphors and similes. A well-known example in the world of computation is Mark Dominus’s “monads are like burritos” — where a tortilla (computational context) wraps diverse ingredients (values) to create a cohesive entity (effectful value) whose burrito structure is maintained as the meal moves down the assembly line (undergoes computations).
Monads, like burritos, come in many different varieties. In computer science monads serve to streamline computational patterns such as exception handling and context management. We illustrate these two examples by analogy.
Imagine you work at a burrito truck.
If a customer orders a burrito sans rice but rice is accidentally added, it can’t be served. The Maybe monad handles exceptions such as this — when something goes wrong, it returns a special “Nothing” value rather than a flawed result, and once a failure occurs, all subsequent steps automatically preserve this state avoiding the need for repetitive error-checking.
Figure 1: The Maybe Monad illustrated with the burrito-making process
In Haskell, the parameterized type “Maybe a” has two constructors, “Just a” and “Nothing.” The former is an alias for values of type “a” whereas the latter is indicative of an error. The following Haskell code exhibits the maybe type as an instance of the monad class:
instance Monad Maybe where
  return = Just
  Nothing >>= f = Nothing
  (Just x) >>= f = f x
The return function has type a -> Maybe a, which is suggestive of its role as the monad unit. The so-called bind operation >>= has type Maybe a -> (a -> Maybe b) -> Maybe b, and corresponds to a bare-bones Kleisli composition (see Monads: Programmer’s Definition for details).
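Here is a tiny usage sketch (just a toy, with made-up helper names) showing the failure-propagation behaviour in action:

-- Once a step returns Nothing, every later step is skipped automatically.
addRice :: String -> Maybe String
addRice order = if order == "no rice" then Nothing else Just (order ++ " + rice")

addBeans :: String -> Maybe String
addBeans order = Just (order ++ " + beans")

makeBurrito :: String -> Maybe String
makeBurrito order = return order >>= addRice >>= addBeans

main :: IO ()
main = do
  print (makeBurrito "veggie")   -- Just "veggie + rice + beans"
  print (makeBurrito "no rice")  -- Nothing: the failure propagates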
A slight generalization allows for descriptive error messages.
Definition. Given a collection of exceptions $E$, there is an associated Either monad $X \mapsto X + E$.
Of course, Either monads are simply Maybe monads with a set $E$ in place of the constant/singleton “Nothing”, and they allow us not only to say that an error has occurred, but also to indicate what that error was.
Now suppose one of your regular customers walks up to the window and orders “the usual.” Luckily you’ve recorded their preferences in a recipe book. The act of following the appropriate recipe is akin to executing computations that depend on a global read-only state. The Reader monad is the functional programmer’s way of incorporating this impure concept in pure functional terms.
Figure 2: The Reader Monad illustrated with the burrito-making process
Definition. Given a collection of environments $R$, there is an associated Reader monad $X \mapsto X^R$.
Here is the same definition given as an instance of the Haskell monad class:
instance Monad ((->) r) where
  return x = \_ -> x
  g >>= f = \e -> f (g e) e
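And a correspondingly small sketch (again a toy with made-up names, relying on the standard instance above) of reading from a shared environment — the recipe book — without passing it around explicitly:

-- The environment (a recipe book) is supplied once, at the very end.
type RecipeBook = String -> [String]

usualIngredients :: String -> RecipeBook -> [String]
usualIngredients customer = \book -> book customer

withExtraCheese :: String -> RecipeBook -> [String]
withExtraCheese customer = usualIngredients customer >>= \ings -> return (ings ++ ["cheese"])

main :: IO ()
main = print (withExtraCheese "the regular" book)   -- ["beans","rice","cheese"]
  where book c = if c == "the regular" then ["beans", "rice"] else ["tortilla"]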
The seminal paper of Moggi has several other interesting examples illustrating the power of monads. Nevertheless, monads may not always suffice for all of our needs. For example, what would happen if our burrito truck suddenly exploded in popularity, requiring automation of repetitive processes and parallel work stations?
This is where “Arrows” enter the picture. Introduced by John Hughes in 2000, Arrows generalize strong monads. Because of this, Arrows handle more complicated computational patterns in a natural way. While monads wrap values in computational contexts (like burritos in tortillas), Arrows can represent entire preparation processes capable of coordinating multiple inputs while maintaining awareness of the broader kitchen environment.
Arrows come with three core operations that determine their behaviour; looking at their types, we see that Arrows are evocative of a lax internal hom that interacts with binary products.
class Arrow a where
  arr :: (x -> y) -> a x y
  (>>>) :: a x y -> a y z -> a x z
  first :: a x y -> a (x,z) (y,z)
arr turns functions into “Arrows.” This is like incorporating a standard burrito recipe or preparation step into the food truck’s workflow — taking a simple instruction like “add beans, then cheese” and automating it within our kitchen’s setup.
(>>>) composes composable Arrows. This allows for separately automated processes to be seamlessly strung together.
first enacts an automated process on one burrito while simultaneously passing a second burrito through the station.
These data are subject to 9 axioms, which we eventually discuss below.
Figure 3: Arrow Operations. The three fundamental operations of Arrows enable complex workflows beyond monadic structures.
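To make the three operations tangible, here is a minimal sketch (ours, a toy) using the Arrow instance for ordinary functions from Control.Arrow; effectful Arrows behave the same way, just with more interesting plumbing:

import Control.Arrow

addBeans, addCheese :: String -> String
addBeans  = (++ " + beans")
addCheese = (++ " + cheese")

-- (>>>) strings two automated steps together
assemblyLine :: String -> String
assemblyLine = arr addBeans >>> arr addCheese

-- first/second (second is first's mirror image) run stations on the two halves of a pair of orders
twoStations :: (String, String) -> (String, String)
twoStations = first addBeans >>> second addCheese

main :: IO ()
main = do
  print (assemblyLine "tortilla")              -- "tortilla + beans + cheese"
  print (twoStations ("order 1", "order 2"))   -- ("order 1 + beans","order 2 + cheese")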
Shortly before Arrows were introduced, Power, Robinson, and Thielecke were working on Freyd categories — a categorical structure designed to model “effectful” computation. Using our simile, a Freyd category formalizes the relationship between an ideal burrito recipe (pure theory) and the real-world process of making that burrito in a particular kitchen.
A Freyd category consists of three main components:
An identity-on-objects functor $J : \mathcal{C} \to \mathcal{K}$, which faithfully translates pure recipes into physical processes that work within the specific setup of the kitchen $\mathcal{K}$.
Figure 4: Freyd Category Structure. The relationship between pure recipes (category C) and real-world kitchen operations (category K), connected by the identity-on-objects functor J that preserves structure while accommodating practical constraints.
Although Arrows originated in Haskell, a highly abstract functional programming language, researchers began noticing apparent correspondences between the components of Arrows and those of Freyd categories. These two structures, developed from different starting points, seemed to address the same fundamental challenge: how to systematically manage computations that involve effects, multiple inputs and outputs, and context-awareness. Therefore, it was hypothesized that Arrows are equivalent to Freyd categories.
As a part of the Adjoint School, our group has been focusing on R. Atkey’s work, which dispels this folklore and precisely formulates the relationship between Arrows and Freyd categories. Just as Atkey asks in the title of his paper, this blog post will investigate the question of “what is a categorical model of Arrows?” The answer not only clarifies the theoretical underpinnings of these structures, but also reveals practical insights for programming language design and quantum computation models. Ultimately, we will see that there are indeed subtle differences between Arrows and Freyd categories.
Key Insights: - Monads encapsulate computational effects by wrapping values in contexts, much like burritos wrap ingredients in tortillas - Different monads (Maybe, Reader, etc…) deal with different patterns like exception handling and context management - Arrows generalize monads to handle multiple inputs and coordinate complex processes, like managing an entire kitchen rather than just making individual burritos
Beyond the Kitchen: Arrows and Freyd Categories
Formally, a monad on a category $\mathcal{C}$ is a monoid in the category of endofunctors of $\mathcal{C}$. Arrows, like monads, are monoids in a certain category of functors. To be more specific, the structure of an Arrow on a category $\mathcal{C}$ can be described as a monoid in the category of strong profunctors on $\mathcal{C}$. Let’s take a closer look at this construction.
Arrows
A profunctor on a category $\mathcal{C}$ is a functor $P : \mathcal{C}^{\mathrm{op}} \times \mathcal{C} \to \mathbf{Set}$. Intuitively, a profunctor associates to each pair of objects $(x, y)$ a set $P(x, y)$ of “generalized morphisms” between those objects.
The identity profunctor is simply $\mathcal{C}(-, -)$, which uses the hom-sets of $\mathcal{C}$.
Composition of profunctors is defined as a coend. Given profunctors $P$ and $Q$, their composition is the following profunctor:
$$(Q \otimes P)(x, z) = \int^{y \in \mathcal{C}} P(x, y) \times Q(y, z).$$
Notice that this formula is vaguely reminiscent of a dot product; replacing the integral with a sum over $y$, and the cartesian product with multiplication, it looks like the dot product of the row vector $(P(x, y))_y$ with the column vector $(Q(y, z))_y$.
We will now unpack this data to reach a more down-to-earth description of Arrows. This resulting characterization aligns more closely with the way in which Arrows are implemented in programming languages like Haskell.
Definition. An Arrow in a cartesian closed category $\mathcal{C}$ consists of a mapping on objects and three families of morphisms:
A mapping on objects $A : \mathrm{Ob}(\mathcal{C}) \times \mathrm{Ob}(\mathcal{C}) \to \mathrm{Ob}(\mathcal{C})$. This defines the Arrow type constructor, which takes input and output types and produces an Arrow type between them.
A family of morphisms $\mathrm{arr}_{x,y} : [x, y] \to A(x, y)$, where $[x, y]$ is the internal hom. This operation lifts a pure function into the Arrow context, allowing regular functions to be treated as Arrows.
A family of morphisms $(\ggg)_{x,y,z} : A(x, y) \times A(y, z) \to A(x, z)$. This enables sequential composition of Arrows, similar to function composition but now in terms of Arrows.
A family of morphisms $\mathrm{first}_{x,y,z} : A(x, y) \to A(x \times z, y \times z)$. This is perhaps the most distinctive operation. Intuitively, it allows an Arrow to process the first component of a pair while leaving the second component unchanged.
These data are subject to nine axioms which govern their interactions. To make these abstract operations more concrete, we fix a particular example of an Arrow; in what follows we list the Arrow laws and draw commutative diagrams based on this example.
The Arrow laws arr id >>> f = f and f >>> arr id = f express left and right unitality of identities under composition.
Figure 5: Arrow Laws
The Arrow law (f >>> g) >>> h = f >>> (g >>> h) represents associativity of composition.
Figure 6: Arrow Laws
The Arrow law arr (g . f) = arr f >>> arr g encodes functoriality of arr.
Figure 7: Arrow Laws
The Arrow law first f >>> arr fst = arr fst >>> f expresses naturality of the counit, i.e., the first projection maps.
Figure 8: Arrow Laws
The Arrow law first (first f) >>> arr assoc = arr assoc >>> first (first f) asks that first play nicely with associators.
Figure 9: Arrow Laws
The Arrow law first f >>> arr (id × g) = arr (id × g) >>> first f is an interchange law which says that first f is a natural transformation for every g in $\mathcal{C}$.
Figure 10: Arrow Laws
Two Arrow laws trivialise as a result of our example, so diagrams aren’t produced for them: the first such law is first (arr f) = arr (f × id), and the second is first (f >>> g) = first f >>> first g.
Freyd Categories
To understand Freyd categories, we must first define what a symmetric premonoidal category is.
Definition. A symmetric premonoidal category includes a category $\mathcal{K}$ together with a unit object and, for each object $x$, functors $x \otimes (-)$ and $(-) \otimes x$ (so each object can be tensored on either side, though the tensor need not be jointly functorial), along with associativity, unit, and symmetry isomorphisms whose components are all central.
A morphism $f$ is central if tensoring it (on either side) with any other morphism $g$ yields the same composite regardless of the order in which $f$ and $g$ are applied.
Now, we can define a Freyd category, recalling the definition from the introduction.
Definition. A Freyd category consists of a category $\mathcal{C}$ with finite products, a symmetric premonoidal category $\mathcal{K}$, and an identity-on-objects functor $J : \mathcal{C} \to \mathcal{K}$ that sends finite products in $\mathcal{C}$ to the premonoidal structure on $\mathcal{K}$ and lands in the central morphisms.
Arrows vs Freyd Categories: Similarities and Differences
At first glance, the definition of a Freyd category appears strikingly similar to that of an Arrow. This apparent similarity led to the folklore belief that they were equivalent structures.
A Freyd category consists of two categories $\mathcal{C}$ and $\mathcal{K}$ with an identity-on-objects functor $J : \mathcal{C} \to \mathcal{K}$, where:
- $\mathcal{C}$ has finite products
- $\mathcal{K}$ is symmetric premonoidal (with a premonoidal tensor $\otimes$)
- $J$ maps finite products in $\mathcal{C}$ to the premonoidal structure in $\mathcal{K}$
In our culinary metaphor, this loosely translates to:
- $\mathcal{C}$: The idealized recipes (Haskell types and functions)
- $\mathcal{K}$: The real-world kitchen operations (computations represented by the Arrow type $A(x, y)$)
- $J$: The translation process (via arr, embedding pure functions)
- Composition in $\mathcal{K}$: The sequencing of operations (via >>>)
- Premonoidal structure in $\mathcal{K}$: The ability to process pairs (via first)
Recalling how we’ve interpreted Arrows in the culinary setting, the apparent correspondence between Arrows and Freyd categories seems quite natural. In fact, for many years the two concepts were thought of as two ways of speaking about the same thing among those in the programming languages community.
However, Atkey’s work revealed a crucial distinction: Arrows are more general than Freyd categories. The key difference lies in how they handle inputs.
To bridge this gap, Atkey introduced the concept of indexed Freyd categories, which can model two structured inputs. The relationship can be summarized as: Arrows are equivalent to Closed Indexed Freyd Categories.
In our culinary metaphor, we can understand this relationship as follows: a Freyd category is like a restaurant that can only take one order at a time (a single input), while Arrows are like a more sophisticated establishment that can handle both individual orders and special requests that come with their own context (two inputs, one potentially structured). The closed indexed Freyd categories that Atkey identifies represent the perfect middle ground — restaurants that can efficiently manage multiple orders with specialized instructions while maintaining the core operational principles that make kitchens function. This is particularly valuable when preparing complex “quantum dishes” where ingredients might be entangled and interact with each other in non-local ways.
Figure 11: Arrows vs. Freyd Categories. Arrows support two inputs (one potentially structured) and are equivalent to Closed Indexed Freyd Categories, which generalize standard Freyd Categories that handle only single inputs.
This distinction helps explain why Arrows have proven particularly useful in domains like quantum computing, where managing multiple inputs with complex relationships is essential.
R. Atkey’s paper finds the relationship between Arrows and different constraints on Freyd categories as follows:
Figure 12: Relationship Between Structures
Key Insights: - Arrows can be defined both as monoids in categories of strong profunctors and operationally through concrete morphisms (arr, >>>, first) - Freyd categories formalize the relationship between pure functions and effectful computations using symmetric premonoidal structure - Despite the folklore belief, Arrows are strictly more general than Freyd categories because they can handle two separate inputs (one potentially structured) - Arrows are equivalent to closed indexed Freyd categories, bridging the conceptual gap
Applications and Questions
The main goal of our Adjoint School project was to structure effects in quantum programming languages using generalizations of monads. Relative monads are a popular generalization of monads. These monads need not be endofunctors, and they’re known to generalize Arrows as well. Since we already know how to structure quantum effects using Arrows, it follows that it should be theoretically possible to structure quantum effects using relative monads.
Arrows’ capacity to handle multiple inputs with a single, potentially structured output offers tractability that is particularly useful in quantum computing. Particles in quantum systems can be in entangled states, where the manipulation of one particle influences others in real time, irrespective of distance. This non-local interaction can be modeled through Arrows’ ability to combine several inputs while keeping track of their interrelationships.
Our group investigated the possibility of doing exactly this. The main technical issue arises from the fact that the way Arrows have been implemented in Haskell to structure quantum effects does not provide a categorical semantics for the problem.
For our ACT2025 presentation, we were able to construct a relative monad capable of handling classical control in the quantum setting, but the following questions still remain:
Can one build a relative monad to model quantum effects?
If so, how might an implementation of these ideas in Haskell compare to Arrow-based approaches?
The ride from burrito monads to Arrow kitchens has carried us farther than we anticipated, illustrating that even established mathematical folklore sometimes requires precise re-evaluation. As we continue to learn about these structures, we hope this post will motivate others to participate in the exploration of these tools and their use in quantum computing and beyond.
Academics tend to face a lot of impostor syndrome. Something about a job with no clear criteria for success, where you could always in principle do better and you mostly only see the cleaned-up, idealized version of others’ work, is a recipe for driving people utterly insane with fear.
The way most of us talk about that fear, it can seem like a cognitive bias, like a failure of epistemology. “Competent people think they’re less competent than they are,” the less-discussed half of the Dunning-Kruger effect.
(I’ve talked about it that way before. And, in an impostor-syndrome-inducing turn of events, I got quoted in a news piece in Nature about it.)
There’s something missing in that perspective, though. It doesn’t really get across how impostor syndrome feels. There’s something very raw about it, something that feels much more personal and urgent than an ordinary biased self-assessment.
To get at the core of it, let me ask a question: what happens to impostors?
The simple answer, the part everyone will admit to, is to say they stop getting grants, or stop getting jobs. Someone figures out they can’t do what they claim, and stops choosing them to receive limited resources. Pretty much anyone with impostor syndrome will say that they fear this: the moment that they reach too far, and the world decides they aren’t worth the money after all.
In practice, it’s not even clear that that happens. You might have people in your field who are actually thought of as impostors, on some level. People who get snarked about behind their back, people where everyone rolls their eyes when they ask a question at a conference and the question just never ends. People who are thought of as shiny storytellers without substance, who spin a tale for journalists but aren’t accomplishing anything of note. Those people…aren’t facing consequences at all, really! They keep getting the grants, they keep finding the jobs, and the ranks of people leaving for industry are instead mostly filled with those you respect.
Instead, I think what we fear when we feel impostor syndrome isn’t the obvious consequence, or even the real consequence, but something more primal. Primatologists and psychologists talk about our social brain, and the role of ostracism. They talk about baboons who piss off the alpha and get beat up and cast out of the group, how a social animal on their own risks starvation and becomes easy prey for bigger predators.
I think when we wake up in a cold sweat remembering how we had no idea what that talk was about, and were too afraid to ask, it’s a fear on that level that’s echoing around in our heads. That the grinding jags of adrenaline, the run-away-and-hide feeling of never being good enough, the desperate unsteadiness of trying to sound competent when you’re sure that you’re not and will get discovered at any moment…that’s not based on any realistic fears about what would happen if you got caught. That’s your monkey-brain, telling you a story drilled down deep by evolution.
Does that help? I’m not sure. If you manage to tell your inner monkey that it won’t get eaten by a lion if its friends stop liking it, let me know!
Thomas Bloom’s erdosproblems.com site hosts nearly a thousand questions that originated, or were communicated by, Paul Erdős, as well as the current status of these questions (about a third of which are currently solved). The site is now a couple years old, and has been steadily adding features, the most recent of which has been a discussion forum for each individual question. For instance, a discussion I had with Stijn Cambie and Vjeko Kovac on one of these problems recently led to it being solved (and even formalized in Lean!).
A significantly older site is the On-line Encyclopedia of Integer Sequences (OEIS), which records hundreds of thousands of integer sequences that some mathematician has encountered at some point. It is a highly useful resource, enabling researchers to discover relevant literature for a given problem, so long as they can calculate enough terms of some integer sequence that is “canonically” attached to that problem that they can search for it in the OEIS.
A large fraction of problems in the Erdos problem webpage involve (either explicitly or implicitly) some sort of integer sequence – typically the largest or smallest size of some $n$-dependent structure (such as a graph of $n$ vertices, or a subset of $\{1,\dots,n\}$) that obeys a certain property. In some cases, the sequence is already in the OEIS, and is noted in the Erdos problem web page. But in a large number of cases, the sequence either has not yet been entered into the OEIS, or it does appear but has not yet been noted on the Erdos web page.
Thomas Bloom and I are therefore proposing a crowdsourced project to systematically compute the hundreds of sequences associated to the Erdos problems and cross-check them against the OEIS. We have created a github repository to coordinate this process; as a by-product, this repository will also be tracking other relevant statistics about the Erdos problem website, such as the current status of formalizing the statements of these problems in the Formal Conjectures Repository.
The main feature of our repository is a large table recording the current status of each Erdos problem. For instance, Erdos problem #3 is currently listed as open, and additionally has the status of linkage with the OEIS listed as “possible”. This means that there are one or more sequences attached to this problem which *might* already be in the OEIS, or would be suitable for submission to the OEIS. Specifically, if one reads the commentary for that problem, one finds mention of the functions $r_k(N)$ for $k = 3, 4, 5, \dots$, defined as the size of the largest subset of $\{1,\dots,N\}$ without a $k$-term progression. It is likely that several of the sequences $r_3(N)$, $r_4(N)$, etc. are in the OEIS, but it is a matter of locating them, either by searching for key words, or by calculating the first few values of these sequences and then looking for a match. (EDIT: a contributor has noted that the first four sequences appear as A003002, A003003, A003004, and A003005 in the OEIS, and the table has been updated accordingly.)
We have set things up so that new contributions (such as the addition of an OEIS number to the table) can be made by a Github pull request, specifically to modify this YAML file. Alternatively, one can create a Github issue for such changes, or simply leave a comment either on the appropriate Erdos problem forum page, or here on this blog.
Many of the sequences do not require advanced mathematical training to compute, and so we hope that this will be a good “citizen mathematics” project that can bring in the broader math-adjacent community to contribute to research-level mathematics problems, by providing experimental data, and potentially locating relevant references or connections that would otherwise be overlooked. This may also be a use case for AI assistance in mathematics through generating code to calculate the sequences in question, although of course one should always stay mindful of potential bugs or hallucinations in any AI-generated code, and find ways to independently verify the output. (But if the AI-generated sequence leads to a match with an existing sequence in the OEIS that is clearly relevant to the problem, then the task has been successfully accomplished, and no AI output needs to be directly incorporated into the database in such cases.)
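As an illustration of the kind of computation involved, here is a brute-force sketch in Haskell (ours; it is exponential in $N$ and only suitable for very small values — a serious contribution would need a smarter search) that checks the first of these sequences against A003002:

-- r3 n: size of the largest subset of {1..n} with no 3-term arithmetic progression.
import Data.List (subsequences)

hasAP3 :: [Int] -> Bool
hasAP3 xs = or [2 * b == a + c | a <- xs, b <- xs, c <- xs, a < b, b < c]

r3 :: Int -> Int
r3 n = maximum [length s | s <- subsequences [1 .. n], not (hasAP3 s)]

main :: IO ()
main = print (map r3 [1 .. 8])   -- 1,2,2,3,4,4,4,4 : matches the start of A003002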
This is an experimental project, and we may need to adjust the workflow as the project progresses, but we hope that it will be successful and lead to further progress on some fraction of these problems. The comment section of this blog can be used as a general discussion forum for the project, while the github issue page and the erdosproblems.com forum pages can be used for more specialized discussions of specific problems.
The Simons-Laufer Mathematical Sciences institute, or SLMath (formerly the Mathematical Sciences Research Institute, or MSRI) has recently restructured its program formats, and is now announcing three new research initiatives, whose applications open on Sep 1 2025:
(Disclosure: I am vice-chair of the board of trustees at SLMath.)
Pick a type of categorical structure: say bicategories, or monoidal categories, or whatever you like. Some of the functors between structures are equivalences, in whatever the appropriate sense might be. And some of those equivalences have one or both of these two properties:
They’re not just essentially surjective in every dimension — they’re actually surjective in every dimension.
They don’t just preserve the structure up to isomorphism or equivalence — they strictly preserve it.
Call an equivalence with both these properties a strict surjective equivalence. So a strict surjective equivalence is an equivalence of a very special and easy kind.
General principle: the standard notion of equivalence between structures is generated by just these very special ones. For example, two bicategories are biequivalent if and only if they can be linked up by a zigzag of strict surjective equivalences.
Why should we care? Because there are some types of structure where the right notion of equivalence isn’t clear, and this principle guides us to it. For example, it tells us the right notion of equivalence for double categories.
All this is done in my new paper:
Tom Leinster, Equivalence via surjections. arXiv:2508.20555, 2025.
I started thinking about this question during Maru Sarazola’s invited talk at Category Theory 2025 in Brno last month. She asked the question:
What is the right notion of equivalence between double categories?
and carefully went through the properties that the right notion of equivalence should have, some possible candidates, and different approaches one might take to deciding what “right” means.
The answer that Maru ultimately gave was that the right notion is “gregarious double equivalence”, proposed by Alexander Campbell in about 2020. And she gave a justification in terms of model categories, representing joint work between her, Lyne Moser and Paula Verdugo.
For the purposes of this post, it actually doesn’t matter what “gregarious double equivalence” means. What I want to talk about is the following principle, which popped into my head as Maru was speaking:
For many types of categorical structure, the natural notion of equivalence is generated, as an equivalence relation, by identifying $A$ and $B$ when there exists a strict surjective equivalence $A \to B$.
It occurred to me that this principle might give a rather different justification for why gregarious double equivalence is the right answer. And after some checking, I discovered that it does.
Let me explain.
A more concrete way to express the principle is that $A$ and $B$ are equivalent in the standard sense — whatever’s appropriate for the structures at hand — if and only if there exists a zigzag of strict surjective equivalences
$$A \leftarrow \cdot \rightarrow \cdot \leftarrow \cdots \rightarrow B.$$
For any type of categorical structure I can think of, the pullback of a strict surjective equivalence is a strict surjective equivalence. So a simpler concrete condition is just that there exists a span of strict surjective equivalences
$$A \leftarrow \cdot \rightarrow B.$$
But hold on… what do I mean by “principle”?
What I mean is that for simple types of categorical structure, where we already know what “equivalence” and “strict surjective equivalence” mean, we have a theorem. Here are three examples.
Categories. We certainly know what it means for two categories to be equivalent. A “surjective equivalence” is an equivalence that’s not just essentially surjective on objects, but literally surjective on objects.
In this case, the theorem is that categories $A$ and $B$ are equivalent if and only if there exists a span of surjective equivalences between them.
(The word “strict” does nothing in this case.)
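One way to see why the nontrivial direction is plausible (a sketch of ours, not necessarily the argument in the paper): given an equivalence $F : A \to B$, let $C$ be the category whose objects are triples $(a, b, \beta)$ with $a$ in $A$, $b$ in $B$ and $\beta : F a \cong b$, and whose morphisms $(a, b, \beta) \to (a', b', \beta')$ are pairs $(f : a \to a',\ g : b \to b')$ with $g \beta = \beta' F f$. The two projections $C \to A$ and $C \to B$ are surjective on objects (take $b = F a$ with $\beta$ the identity for the first; use essential surjectivity of $F$ for the second), and each is full and faithful because either leg of a morphism determines the other. So an equivalence yields a span of surjective equivalences.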
Monoidal categories. Again, we know what monoidal equivalence is, and it’s clear what a “strict surjective equivalence” is: a strict monoidal functor that’s a surjective equivalence of categories.
The theorem is that monoidal categories $A$ and $B$ are monoidally equivalent if and only if there exists a span of strict surjective equivalences between them.
Bicategories. The pattern is the same. The standard notion of equivalence for bicategories is biequivalence. A “strict surjective equivalence”, in this setting, is a strict $2$-functor that is literally surjective on objects and locally a surjective equivalence of categories. (Or put another way, surjective on $0$-cells, locally surjective on $1$-cells, and full and faithful on $2$-cells.)
The theorem is that bicategories $A$ and $B$ are biequivalent if and only if there exists a span of strict surjective equivalences between them.
Probably all these theorems are known. I included them in my paper because I couldn’t find them anywhere in the literature, not even the first one. But if you know a reference, I’d be glad to hear it.
Since the principle holds for categories, monoidal categories and bicategories, it’s reasonable to suppose that it might hold for other types of structure. And if we’re investigating some type of structure where the full notion of equivalence isn’t clear, this principle might help guide us to it.
For example, here’s a theorem on double categories, the main result of my paper:
Double categories. Again, it’s clear what “strict surjective equivalence” should mean: a strict double functor that’s surjective on $0$-cells, locally surjective on both horizontal and vertical $1$-cells, and full and faithful on $2$-cells.
The theorem is that double categories $A$ and $B$ are gregariously double equivalent if and only if there exists a span of strict surjective equivalences between them.
Even without me telling you what “gregarious double equivalence” means, the four theorems I’ve stated suggest that it’s the right notion of equivalence for double categories, because it continues the pattern we’ve seen for simpler categorical structures.
So, I agree with the conclusion that Moser, Sarazola and Verdugo had already reached! But for different reasons.
Incidentally, this must be the fastest paper I’ve ever written: just under six weeks from sitting in Maru’s talk and hearing the mathematical term “gregarious” for the first time ever to putting the paper on the arXiv. But the principle that all equivalences are generated by strict surjective equivalences was planted in my head in the late 1990s or early 2000s by Carlos Simpson. Back then, we were both working on higher category theory, and when he explained this principle, I found it very striking — so striking that I remembered it 20+ years later. There’s a bit more on that higher categorical context in the introduction to my paper.
Let it be known to all governments and systems of power:
This is the natural order that guarantees our survival and gifts this world to our children. This world belongs to them and where we fail to serve them we condemn ourselves. And where government has failed to uphold this, we will not obey them as they have no right to exist.
We do not have to ask for these things, they are required, and if not given we shall simply take them.
Where the truth has not been told it shall be told.
If we fail to do so we condemn our children ourselves.
This week, 50 category theorists and software engineers working on “safeguarded AI” are meeting in Bristol. They’re being funded by £59 million from ARIA, the UK’s Advanced Research and Invention Agency.
The basic idea is to develop a mathematical box that can contain a powerful genie. More precisely:
By combining scientific world models and mathematical proofs we will aim to construct a ‘gatekeeper’, an AI system tasked with understanding and reducing the risks of other AI agents. In doing so we’ll develop quantitative safety guarantees for AI in the way we have come to expect for nuclear power and passenger aviation.
The program director is David Dalrymple, and you can get a much better description of the project from him in the first 4 minutes here:
It’s remarkable how many of the applied category theorists in the UK are involved in this. Here you can find a partial list:
If you’re wondering “why category theory?”, I think the idea is this: software based on general abstract math is more flexible, yet also easier to formally verify.
For example the Topos Institute, run by my former student Brendan Fong, now has a branch in the UK largely funded by ARIA. At the meeting, Topos is demonstrating how to build models in CatColab, their new category-based software.
I have decided not to be part of this project, though some of my math is getting used here. I’ve always preferred to avoid doing things connected to AI, for various reasons. But this project might make AI better. It could also have various bad effects. I have no idea how successful it will be, so I’m watching with fascination and profoundly mixed emotions.
First things first: due to an abrupt suspension of NSF funding to my home university of UCLA, the Institute of Pure and Applied Mathematics (which had been preliminarily approved for a five-year NSF grant to run the institute) is currently fundraising to ensure continuity of operations during the suspension, with a goal of raising $500,000. Donations can be made at this page. As incoming Director of Special Projects at IPAM, I am grateful for the support (both moral and financial) that we have already received in the last few days, but we are still short of our fundraising goal.
Back to math. Ayla Gafni and I have just uploaded to the arXiv the paper “Rough numbers between consecutive primes“. In this paper we resolve a question of Erdös concerning rough numbers between consecutive primes, and with the assistance of modern sieve theory calculations, we in fact obtain quite precise asymptotics for the problem. (As a side note, this research was supported by my personal NSF grant which is also currently suspended; I am grateful to recent donations to my own research fund which have helped me complete this research.)
Define a prime gap to be an interval $(p_n, p_{n+1})$ between consecutive primes. We say that a prime gap contains a rough number if there is an integer $m$ in the gap whose least prime factor is at least the length $p_{n+1} - p_n$ of the gap. For instance, the prime gap $(47, 53)$ contains the rough number $49$, but the prime gap $(23, 29)$ does not (all integers between $23$ and $29$ have a prime factor less than $6$). The first few $n$ for which the $n^{\mathrm{th}}$ prime gap $(p_n, p_{n+1})$ contains a rough number are
$$2, 3, 5, 7, 10, 13, 15, 17, 20, \dots$$
Erdös initially thought that all but finitely many prime gaps should contain a rough number, but changed his mind, as per the following quote:
…I am now sure that this is not true and I “almost” have a counterexample. Pillai and Szekeres observed that for every $n \leq 16$, a set of $n$ consecutive integers always contains one which is relatively prime to the others. This is false for $n = 17$, the smallest counterexample being $2184, 2185, \dots, 2200$. Consider now the two arithmetic progressions $2183 + 30030 t$ and $2201 + 30030 t$. There certainly will be infinitely many values of $t$ for which the progressions simultaneously represent primes; this follows at once from hypothesis H of Schinzel, but cannot at present be proved. These primes are consecutive and give the required counterexample. I expect that this situation is rather exceptional and that the integers $n$ for which there is no $m$ satisfying $p_n < m < p_{n+1}$ and $P^-(m) \geq p_{n+1} - p_n$ have density $0$.
In fact Erdös’s observation can be made simpler: any pair of cousin primes $p, p+4$ with $p > 3$ (of which $(7, 11)$ is the first example) will produce a prime gap that does not contain any rough numbers.
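Here is a tiny brute-force sketch in Haskell (ours, purely illustrative) that recovers these examples by listing the first few prime gaps that fail to contain a rough number:

-- A gap (p,q) "contains a rough number" if some p < m < q has least prime factor >= q - p.
primes :: [Int]
primes = sieve [2 ..] where sieve (p : xs) = p : sieve [x | x <- xs, x `mod` p /= 0]

leastPrimeFactor :: Int -> Int
leastPrimeFactor m = head [p | p <- primes, m `mod` p == 0]

gapHasRough :: (Int, Int) -> Bool
gapHasRough (p, q) = or [leastPrimeFactor m >= q - p | m <- [p + 1 .. q - 1]]

main :: IO ()
main = print [g | g <- take 25 (zip primes (tail primes)), not (gapHasRough g)]
-- (2,3) appears vacuously; the cousin-prime gaps (7,11), (19,23), (37,41), ... also show up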
The latter question of Erdös is listed as problem #682 on Thomas Bloom’s Erdös problems website. In this paper we answer Erdös’s question, and in fact give a rather precise bound for the number of counterexamples:
Theorem 1 (Erdos #682) For $x \geq 2$, let $N(x)$ denote the number of prime gaps $(p_n, p_{n+1})$ with $p_{n+1} \leq x$ that do not contain a rough number. Then
$$N(x) \ll \frac{x}{\log^2 x}.$$
Assuming the Dickson–Hardy–Littlewood prime tuples conjecture, we can improve this to
$$N(x) = (c + o(1)) \frac{x}{\log^2 x}$$
for some (explicitly describable) constant $c > 0$.
In fact we believe we know the exact value of $c$, although the formula we have to compute $c$ converges very slowly. This is (weakly) supported by numerical evidence.
While many questions about prime gaps remain open, the theory of rough numbers is much better understood, thanks to modern sieve theoretic tools such as the fundamental lemma of sieve theory. The main idea is to frame the problem in terms of counting the number of rough numbers in short intervals $[x, x+y]$, where $x$ ranges in some dyadic interval $[X, 2X]$ and $y$ is a much smaller quantity, such as $y = \lambda \log X$ for some fixed $\lambda > 0$. Here, one has to tweak the definition of “rough” to mean “no prime factors less than $z$” for some intermediate threshold $z$ (e.g., $z = X^{1/s}$ for a suitable slowly growing $s$ turns out to be a reasonable choice). These problems are very analogous to the extremely well studied problem of counting primes in short intervals, but one can make more progress without needing powerful conjectures such as the Hardy–Littlewood prime tuples conjecture. In particular, because of the fundamental lemma of sieve theory, one can compute the mean and variance (i.e., the first two moments) of such counts to high accuracy, using in particular some calculations on the mean values of singular series that go back at least to the work of Montgomery from 1970. This second moment analysis turns out to be enough (after optimizing all the parameters) to answer Erdös’s problem with a weaker bound than the one in Theorem 1.
In a previous post, we discussed a phase transition that occurred in the piping above you on a game show. In the scenario, you are led on stage in front of a large audience. After a brief time, the audience votes on how “likeable” you are. The catch is that it doesn’t simply tally the votes, but turns spigots on a lattice of piping above your head. Water is then released and if enough people like you, it closes off the passage, keeping you dry. This exciting game show1 was described in that post:
Each “like” turns a spigot off, stopping water from flowing through one pipe in a grid overhead. Once voting ends, water is dumped into the system. If it can find a path to the bottom, you get soaked. [Emphasis added] The better your “likeability,” the less likely spigots open a path for water to flow and the drier you stay. That’s your prize for this game show (and hey, you also get the knowledge that people out there like you).
This system models a type of phase transition known as percolation.
The full post is here:
I highlighted above a key phrase “If it can find a path to the bottom, you get soaked.” What I didn’t say, but should have, is that the water was being forced through the pipes, not just dropping down due to gravity. This is a very important point, since our phases and phase transition change dramatically if we just let gravity do the work. In the case of the water being “forced,” it can travel back up pipes if that helps it find its way out and onto your head, but when only gravity is present, it falls down the pipes. To facilitate gravity, we’ll turn the pipes 45 degrees, and if we insert water at a single point on top, it could look like this:
This setup is a different problem called directed percolation. It also has a phase transition, but one that is different in some fundamental ways from regular percolation.
Before we explore its stranger properties, we can ask, “At what likeability threshold do you remain dry?” Well, this happens to have a transition threshold of 35.53%!2 This system is a lot more generous, keeping you dry even when a majority of people dislike you. This number comes from numerical computations which have been done rather precisely, and we can even compute it ourselves. In fact, you can see this clearly with this plot
Notice that as we make the system bigger and bigger, the chance of getting soaked increases for likeability below 35.53% and decreases above it. This is the same kind of hallmark of a phase transition as we saw in our previous case.
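If you want to play with this yourself, here is a minimal Monte Carlo sketch of the directed game show (a toy, not the high-precision computation behind the 35.53% figure): each pipe is open with probability one minus the likeability, water enters along the whole top row, and from each junction it can only reach the two pipes below (periodic side walls for simplicity).

```python
import numpy as np

rng = np.random.default_rng(0)

def soaked(likeability, width, height):
    """One trial of directed percolation on the tilted lattice.

    Each "like" closes a spigot, so a pipe is open with probability
    1 - likeability.  Water starts across the whole top row and can only
    move to the site below or below-right.  Returns True if any water
    reaches the bottom row.
    """
    p_open = 1.0 - likeability
    wet = np.ones(width, dtype=bool)
    for _ in range(height):
        down = rng.random(width) < p_open    # pipe going straight down
        diag = rng.random(width) < p_open    # pipe going down and to the right
        wet = (wet & down) | np.roll(wet & diag, 1)
        if not wet.any():
            return False
    return True

def p_soaked(likeability, size, trials=200):
    return np.mean([soaked(likeability, size, size) for _ in range(trials)])

for size in (20, 40, 80):
    print(size, [round(p_soaked(q, size), 2) for q in (0.30, 0.3553, 0.40)])
```

With larger sizes and more trials the columns separate: for likeability below the threshold the soaking probability climbs toward 1 as the system grows, and above it it falls toward 0.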
We can also look at the water as it flows down the system to see the clusters that make it from top to bottom
There is still a fractal-looking pattern at the transition point. With all of these similarities with the regular percolation problem from the last post, what is different? And why is that plot so long and skinny? If gravity wants to pull you down, is that somehow altering the motion down, making it distinct from the motion left or right?
Well, if you go back to the two plots above, you’ll notice a few things that really make them differ from the percolation plots. In the fine print of the first, I’ve noted that the vertical distance is L^1.58, so for a horizontal size of 40, the vertical size is roughly 340! That is definitely not a square. And in the second plot, there appears to be more vertical distance than horizontal distance. What is special about this 1.58 number3? It turns out, it’s a critical exponent in this problem, a universal aspect of directed percolation, that distinguishes it from regular percolation. We will call it the dynamical critical exponent z = 1.58, since it is revealed as water flows down in time (dynamically). This dynamical exponent z can reveal itself by looking at these “long and skinny” setups, but be masked by the square setup.
One thing we took away in the previous post was that we lose any sense of scale at this type of phase transition4. But whenever we have “only” thousands of pipes, the size of the system provides a scale! This is the main reason why we begin to see smooth curves and not sharp jumps in quantities. If the system of pipes were infinite (and we had infinite time for the water to go down the pipes), the probability you get soaked would be 100% below the 35.53% likeability threshold and 0% above it. For physical systems, the finite size is often not a huge issue since the scale is closer to the 10^23 atoms present in macroscopic systems, and so even things that are technically smooth curves look very sharp.
The problem of size becomes more severe with directed percolation because horizontal and vertical distances start behaving differently thanks to gravity. In this case, if we lay out our nice grid of 10 × 10, 20 × 20, or 30 × 30, we start to notice that the likeability threshold where you stop getting soaked, seems to depend on the size of the system more than before. In actuality it doesn’t, but for these small sizes, you are definitely getting soaked well into the so-called “Dry Phase” we previously labeled. This is seen in the red curves here where each bigger square has a curve underneath the last:
Gravity has done something to the system. Flowing down is different from flowing left or right. In fact, if we flow down by some amount h and over to the right by some distance w, then at the directed percolation transition point the downward distance grows like h ~ w^1.58.
The amount water flows down is related to how far it flows to the right or left by this weird, fractional power of w. This 1.58 is z, our new dynamical critical exponent, which is a universal feature of directed percolation5. It tells us that if we make a system 30 pipes wide, it should extend roughly 30^1.58 ≈ 216 pipes in height to begin picking out the phase transition effectively. The blue curves in the above plot show this and notice how they all converge on one point; that point is the phase transition. It is revealed by small sizes! To understand why, just think about how the curves are changing as we make the system bigger and bigger.
The red curves will still converge to the phase transition, but it takes larger system sizes for it to reveal itself. This is related to the property that at the phase transition there is no longer a sense of scale, but away from the transition, the vertical scale of clusters could be so large that our puny 60-by-60 grid cannot even begin to reveal it. So if we sit at say a likeability of 0.4 in the 60-by-60 grid, we can say that the vertical size of a typical cluster is most likely more than 60.
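A tiny numerical aside on the anisotropic shapes: if you wanted to reproduce curves like the blue ones, the heights to pair with each width follow from the dynamical exponent (a sketch; the exact aspect ratios used in the plots above are illustrative guesses).

```python
z = 1.58   # dynamical critical exponent of directed percolation
for width in (10, 20, 30, 60):
    height = round(width ** z)
    print(f"width {width:3d} -> height ~ {height}")
# e.g. width 30 -> height ~ 216, matching the estimate in the text
```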
We may call this “gravity mode” of our game show “easy mode,” since it requires less of the audience to like you, but the implications here are wide. This type of phase transition has been seen in many kinds of local dynamics where there is a preferred configuration or state. These are called absorbing-state phase transitions, and they are a property of certain random dynamical systems. Gravity has provided the distinction here, but more generically, causality and time itself provide that direction, leading to dynamics that obey the same universality as directed percolation.
Trademark pending.
Usually, you’ll see 0.6447 quoted instead, but that’s just 1−0.3553, which counts open pipes instead of closed as we’re doing.
I should note that we have this number to much higher precision than the two decimal places presented here; see the Wikipedia entry.
This is a second-order or continuous phase transition. Most transitions in the water phase diagram are first-order transitions which still retain a scale.
To drive this point home: Even if we change the lattice, this power law will remain intact. Sometimes it shows up in completely different scenarios too, like in absorbing state phase transitions.
There’s a lot of joyful knife-work in my future. #bolognese #summersalad –cvj
The post Harvest appeared first on Asymptotia.
Further inspired by yesterday's post about binary fitting, I worked today on the treatment of nuisance parameters that have known distributions. These can be treated as noise sometimes. Let me explain:
If I had to cartoon inference (or measurement) in the face of nuisance parameters, I would say that frequentists profile (optimize) over the nuisances and Bayesians marginalize (integrate) over the nuisances. In general frequentists cannot integrate over anything, because there is no measure in any of the parameter spaces. But sometimes there is a measure. In particular, when there is a compact symmetry:
We know (or very strongly believe) that all possible orientations of a binary-star orbit are equally likely. In this model (or under this normal assumption) we have a distribution over two angles (theta and phi for that orbit pole, say); it is the distribution set by the compact group SO(3) acting on the sphere of pole directions. Thus we can treat the orientation as a noise source with known distribution and integrate over it, just like we would any other noise source. So, in this case (and many cases like it) we can integrate (marginalize) even as frequentists. That is, there are frequentism-safe marginalizations possible in binary-star orbit fitting. This should drop the 12-parameter fits (for ESA Gaia data) down to 8-parameter, if I have done my math right.
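As a toy illustration of such a frequentism-safe marginalization (with a made-up one-parameter "likelihood" standing in for the real Gaia astrometric model; the function and numbers below are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(42)

def toy_likelihood(datum, a_phot, cos_incl, omega):
    # Hypothetical stand-in for the real astrometric likelihood: the "data"
    # here is just a noisy projected photocenter size, not Gaia measurements.
    projected = a_phot * np.sqrt(1.0 - (1.0 - cos_incl**2) * np.sin(omega)**2)
    return np.exp(-0.5 * ((datum - projected) / 0.05) ** 2)

def orientation_marginalized_likelihood(datum, a_phot, n_draws=20000):
    """Average the likelihood over orbit-pole orientations drawn from their
    known isotropic distribution.  Because that distribution is fixed by
    symmetry rather than by a chosen prior, the orientation behaves like any
    other noise source, and the average is a frequentism-safe marginalization."""
    cos_incl = rng.uniform(-1.0, 1.0, n_draws)       # isotropic poles: uniform in cos(i)
    omega = rng.uniform(0.0, 2.0 * np.pi, n_draws)   # uniform orientation angle
    return toy_likelihood(datum, a_phot, cos_incl, omega).mean()

print(orientation_marginalized_likelihood(datum=0.3, a_phot=0.4))
```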
On Friday, Kareem El-Badry (Caltech) gave a seminar about looking for (and finding!) stars in binary orbits around dark or much darker companions, like black holes, neutron stars, and white dwarfs. He showed results that involve ESA Gaia astrometry, where he noted that the Gaia Mission has no sensitivity to periods right at (or within an inverse mission-length frequency difference of) one-year periods (inverse year frequencies). After the talk I objected that these are not exactly degenerate; El-Badry said that the inferences blow up there.
I spent some time on the weekend thinking about this point, and I now understand it: There is a particular one-year orbit that a star can have (around a darker companion) such that the photocenter of the system makes a motion that is identical to the apparent parallax motion. Thus there is an exact degeneracy between the parallax and a certain one-year orbit.
Does that mean that we can't measure orbits at one year (or, for that matter, parallaxes)? No, it does not. After all, the parallax ellipse has a particular celestial (angular) shape and phase. But it might require some kind of reparameterization of orbits near one-year periods. I think I know how to do that. Should we find the missing binaries? (Oh and by the way, this degeneracy means that, in a strict frequentist sense, Gaia can't measure parallaxes at all without additional information.)
A common saying goes, you should never meet your heroes, because they’ll disappoint you. But you shouldn’t trust every common saying; some heroes impress you more, the better you know them. Ray Laflamme was such a hero.
I first heard of Ray in my undergraduate quantum-computation course. The instructor assigned two textbooks: the physics-centric “Schumacher and Westmoreland” and “Kaye, Laflamme, and Mosca,” suited to computer scientists. Back then—in 2011—experimentalists were toiling over single quantum logic gates, implemented on pairs and trios of qubits. Some of today’s most advanced quantum-computing platforms, such as ultracold atoms, resembled the scrawnier of the horses at a racetrack. My class studied a stepping stone to those contenders: linear quantum optics (quantum light). Laflamme, as I knew him then, had helped design the implementation.
Imagine my awe upon meeting Ray the following year, as a master’s student at the Perimeter Institute for Theoretical Physics. He belonged to Perimeter’s faculty and served as a co-director of the nearby Institute for Quantum Computing (IQC). Ray was slim, had thinning hair of a color similar to mine, and wore rectangular glasses frames. He often wore a smile, too. I can hear his French-Canadian accent in my memory, but not without hearing him smile at the ends of most sentences.
My master’s program entailed a research project, which I wanted to center on quantum information theory, one of Ray’s specialties. He met with me and suggested a project, and I began reading relevant papers. I then decided to pursue research with another faculty member and a postdoc, eliminating my academic claim on Ray’s time. But he agreed to keep meeting with me. Heaven knows how he managed; institute directorships devour one’s schedule like ravens dining on a battlefield. Still, we talked approximately every other week.
My master’s program intimidated me, I confessed. It crammed graduate-level courses, which deserved a semester each, into weeks. My class raced through Quantum Field Theory I and Quantum Field Theory II—a year’s worth of material—in part of an autumn. General relativity, condensed matter, and statistical physics swept over us during the same season. I preferred to learn thoroughly, deeply, and using strategies I’d honed over two decades. But I didn’t have time, despite arriving at Perimeter’s library at 8:40 every morning and leaving around 9:30 PM.
In response, Ray confessed that his master’s program had intimidated him. Upon completing his undergraduate degree, Ray viewed himself as a nobody from nowhere. He chafed in the legendary, if idiosyncratically named, program he attended afterward: Part III of the Mathematical Tripos at the University of Cambridge. A Cambridge undergraduate can earn a master’s degree in three steps (tripos) at the Department of Applied Mathematics and Theoretical Physics. Other students, upon completing bachelor’s degrees elsewhere, undertake the third step to earn their master’s. Ray tackled this step, Part III.
He worked his rear off, delving more deeply into course material than lecturers did. Ray would labor over every premise in a theorem’s proof, including when nobody could explain the trickiest step to him.1 A friend and classmate helped him survive. The two studied together, as I studied with a few fellow Perimeter students; and Ray took walks with his friend on Sundays, as I planned lunches with other students on weekends.
Yet the program’s competitiveness appalled Ray. All students’ exam scores appeared on the same piece of paper, posted where everyone could read it. The department would retain the highest scorers in its PhD program; the other students would have to continue their studies elsewhere. Hearing about Ray’s program, I appreciated more than ever the collaboration characteristic of mine.
Ray addressed that trickiest proof step better than he’d feared, come springtime: his name appeared near the top of the exam list. Once he saw the grades, a faculty member notified him that his PhD advisor was waiting upstairs. Ray didn’t recall climbing those stairs, but he found Stephen Hawking at the top.
As one should expect of a Hawking student, Ray studied quantum gravity during his PhD. But by the time I met him, Ray had helped co-found quantum computation. He’d also extended his physics expertise as far from 1980s quantum gravity as one can, by becoming an experimentalist. The nobody from nowhere had earned his wings—then invented novel wings that nobody had dreamed of. But he descended from the heights every other week, to tell stories to a nobody of a master’s student.
Seven and a half years later, I advertised openings in the research group I was establishing in Maryland. A student emailed from the IQC, whose co-directorship Ray had relinquished in 2017. The student had seen me present a talk, it had inspired him to switch fields into quantum thermodynamics, and he asked me to co-supervise his PhD. His IQC supervisor had blessed the request: Ray Laflamme.
The student was Shayan Majidy, now a postdoc at Harvard. Co-supervising him with Ray Laflamme reminded me of cooking in the same kitchen as Julia Child. I still wonder how I, green behind the ears, landed such a gig. Shayan delighted in describing the difference between his supervisors’ advising styles. An energetic young researcher,2 I’d respond to emails as early as 6:00 AM. I’d press Shayan about literature he’d read, walk him through what he hadn’t grasped, and toss a paper draft back and forth with him multiple times per day. Ray, who’d mellowed during his career, mostly poured out support and warmth like hollandaise sauce.
Once, Shayan emailed Ray and me to ask if he could take a vacation. I responded first, as laconically as my PhD advisor would have: “Have fun!” Ray replied a few days later. He elaborated on his pleasure at Shayan’s plans and on how much Shayan deserved the break.
This June, an illness took Ray earlier than expected. We physicists lost an intellectual explorer, a co-founder of the quantum-computing community, and a scientist of my favorite type: a wonderful physicist who was a wonderful human being. Days after he passed, I was holed up in a New York hotel room, wincing over a web search. I was checking whether a quantum system satisfies certain tenets of quantum error correction, and we call those tenets the Knill–Laflamme conditions. Our community will keep checking the Knill–Laflamme conditions, keep studying quantum gates implementable with linear optics, and more. Part of Ray won’t leave us anytime soon—the way he wouldn’t leave a nobody of a master’s student who needed a conversation.
1For the record, some of the most rigorous researchers I know work in Cambridge’s Department of Applied Mathematics and Theoretical Physics today. I’ve even blogged about some.
2As I still am, thank you very much.
Well, I can now officially mention that I've been part of the filmmaking team (in a way) working hard to bring you an enjoyable and interesting Fantastic Four movie! I think it has been about two and a half years (?) since this all began. This was a nearly perfect model of how science consulting can work in film. I worked with everyone, wherever I was needed, with the director, writers, producers, director of photography, VFX teams, set design, and so on. They made me feel welcome and part of whatever creative team I was talking to, which was great. They were open to lots of ideas right from when they were starting out thinking about tone, story ideas, and so forth, right through to final (key) tweaks right at the end of the process as recently as mere weeks ago.
It began early on with having great conversations with Matt Shakman and his writing team about the fact that Reed Richards is first and foremost a curiosity-driven physicist (and so quite different from the engineer we have in Tony Stark, whom we see RDJ bring out so well), and how things like his dedication to his work (and the outlook on things that comes from such work) might play out in terms of family dynamics, personal relationships, etc., without it turning into the tedious clichés about scientists somehow not being able to navigate the world of human relationships. Obviously, I could speak to this as a physicist who works on precisely the things Reed works on, as well as a family man, and as someone who remembers that it's still all about telling a story. And there are so many stories to tell at that intersection... Anyway, I think these early conversations (as well as suggestions I made in many sets of notes along the way) helped inform (even if only a little bit? who knows?) what Pedro Pascal brought to the character. This aspect of the film is one of the things I'm most pleased about seeing up on screen.
Beyond that, you'll see lots of things I gave them that I'm also delighted to see made it into the film, in many scenes. This includes (but is not limited to!): [...]
The post Fantastic Collaboration! appeared first on Asymptotia.
So imagine that you have a unique data set Y, and in that data set Y you measure a bunch of parameters θ by a bunch of different methods. Then you find, in your favorite analysis, your estimate of one particular parameter is way out of line: All of physics must be wrong! How do you figure out the significance of your result?
If you only ever have data Y, you can't answer this question very satisfactorily: You searched Y for an anomaly, and now you want to test the significance. That's why so many a posteriori anomaly results end up going away: That search probably tested way more hypotheses than you think it did, so any significances should be reduced accordingly.
The best approach is to use only part of your data (somehow) to search, and then use a found anomaly to propose a hypothesis test, and then test that test in the held-out or new data. But that often isn't possible, or it is already too late. But if you can do this, then there is usually a likelihood ratio that is decisive about the significance of the anomaly!
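A minimal sketch of that logic (with fabricated Gaussian toy data, not any real measurement): fix the anomaly hypothesis using only the search half, then compute the likelihood ratio on the held-out half.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Toy data: unit-variance Gaussian "measurements" split into two halves.
y_search, y_heldout = rng.normal(0.0, 1.0, (2, 500))

mu_hat = y_search.mean()   # the "anomaly" is defined by the search half only

# Log-likelihood ratio, on the held-out data, of "offset mu_hat" vs "no offset".
llr = (stats.norm(mu_hat, 1.0).logpdf(y_heldout).sum()
       - stats.norm(0.0, 1.0).logpdf(y_heldout).sum())
print(f"held-out log-likelihood ratio: {llr:.2f}")
```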
I discussed all these issues today with Kate Storey-Fisher (Stanford) and Abby Williams (Chicago), as we are trying to finish a paper on the anomalous amplitude of the kinematic dipole in quasar samples.
I showed my robust spectral decomposition (dimensionality reduction) and residuals to the MPIA Binaries group today. There was much useful feedback (including that my H-gamma was actually H-delta; embarrassing!). One comment was that the model isn't truly a causal separation between star and lines, so there will be some mean lines in the star model; lines aren't entirely outliers. That's true! The group suggested that I iterate to remove stars with lines from the training set.
After the meeting, I implemented some of that, but problems like this have a pathology: if you carefully remove stars with high residuals at some wavelength, then the training data will be deficient, or low, at that wavelength. And then the model will go lower, and then more stars will have excess at that wavelength and: disaster. So when I implemented it, I required a 2-sigma deviation, and I removed both high and low outliers. I don't know if this will work, but I am testing now.
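Here is a toy sketch of the symmetric clipping I have in mind (the "model" below is just a mean spectrum, not the actual robust low-rank fit, and the data are fabricated):

```python
import numpy as np

def robust_mean_spectrum(flux, sigma, nclip=2.0, max_iter=10):
    """Rebuild a (toy) model from the retained data, then mask any star/pixel
    residual beyond nclip sigma in EITHER direction before refitting.
    Rejecting only the high side would drag the model low at line wavelengths
    and trigger the runaway described above."""
    mask = np.ones_like(flux, dtype=bool)            # True = use this pixel
    for _ in range(max_iter):
        model = np.nanmean(np.where(mask, flux, np.nan), axis=0)  # refit on retained data
        resid = (flux - model) / sigma
        new_mask = np.abs(resid) < nclip             # symmetric rejection
        if np.array_equal(new_mask, mask):
            break
        mask = new_mask
    return model, mask

# Toy data: 100 "spectra" of 50 pixels, a few with an injected emission line.
rng = np.random.default_rng(0)
flux = rng.normal(1.0, 0.02, (100, 50))
flux[:5, 20] += 0.5
model, mask = robust_mean_spectrum(flux, sigma=0.02)
print("pixels masked:", (~mask).sum())
```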
There are several interesting papers on the arXiv today. One of them, arXiv:2507.15949, involves my former PhD supervisor. It's on the subject of Quantum Entanglement at collider experiments, and relates back to a paper of his from 1992 that I didn't know about (there's a great line in the new paper where the authors complain that their earlier paper was ignored). (Quantum) Entanglement is the phenomenon where two or more particles are in a special state so that their properties are related, but we don't know what those properties are until we measure them. In Quantum Mechanics we would say that the actual state is not decided until we measure them, and this leads to 'spooky action at a distance' because by measuring one particle we appear to set the corresponding property of the other. An alternative explanation would be that there is some hidden quantity or 'hidden variable' where both particles secretly know all along what state they are in. However, surprisingly, it's possible to discriminate between these two cases and set up quantitative tests known as 'Bell inequalities': you can make a measurement and, if the result violates a certain bound, then a hidden variable theory cannot explain it. Experiments to test this using photons at low energies were performed in the early 80s by Alain Aspect and others; the results violated Bell inequalities, confirming the Quantum Mechanical interpretation.
In recent years, experimentalists have become interested in performing similar tests using different particles at higher energies; it is legitimate to ask "is this true for fermions?" or "does this break down at high energy?" Apparently similar questions were asked in the early 90s at LEP where electrons and positrons were collided (instead of protons at the LHC) and the 1992 paper pointed out that they were not really testing Bell Inequalities. The new paper revisits the older argument, and applies it to the new case of e.g. proton collisions producing a top-antitop pair. They argue that the quantity of interest for the Bell Inequality is the spin density matrix:
But what can actually be measured is the differential cross-section (the rate of production of particles in a certain angular volume):
The symbols B and C appear in both expressions: when performing experimental tests of Bell inequalities they are identified with each other. Since the differential cross-section can be measured, the measurement for the Bell Inequality can then be made and tested. However, the authors of the new paper claim that, in order to identify the two sets of symbols, it is necessary to use Quantum Field Theory: the second equation is a prediction based on QFT from the first. In other words, the logic is circular, and Quantum Mechanics has been assumed -- so it's not surprising that the Bell inequality is violated! I haven't worked on this topic myself, so it will be interesting to see if there is some pushback from the authors of papers such as arXiv:2003.02280 (who proposed such top-antitop studies).
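For orientation, the two expressions referred to above are usually written in roughly the following form (a sketch in standard notation; signs and conventions differ between papers, so the post's own equations may not match exactly):

```latex
% Spin density matrix of the top--antitop pair (two-qubit parametrization):
\rho = \frac{1}{4}\Big[\mathbb{1}\otimes\mathbb{1}
      + \sum_i B^{+}_i\,(\sigma_i\otimes\mathbb{1})
      + \sum_j B^{-}_j\,(\mathbb{1}\otimes\sigma_j)
      + \sum_{i,j} C_{ij}\,(\sigma_i\otimes\sigma_j)\Big]

% Normalized differential cross-section in the directions \hat{\ell}_\pm of the
% decay leptons (each measured in its parent top's rest frame):
\frac{1}{\sigma}\,\frac{d^2\sigma}{d\Omega_{+}\,d\Omega_{-}}
   = \frac{1}{(4\pi)^2}\Big[1 + \mathbf{B}^{+}\!\cdot\hat{\ell}_{+}
        + \mathbf{B}^{-}\!\cdot\hat{\ell}_{-}
        + \hat{\ell}_{+}\!\cdot C\,\hat{\ell}_{-}\Big]
```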
Fermi decay constant -- at three loops!
I also want to point out arXiv:2507.15946 by Stephen Martin, who has performed a three-loop computation of the decay rate of the muon in the Standard Model. This quantity is incredibly important; it is measured very precisely, and so we use it to extract the underlying parameters of the Standard Model -- or of any theory beyond it. But since it's a complicated process, this is a tricky computation, even at low loop order. The results in this paper will be useful for all sorts of calculations, such as extracting the Higgs boson's self-coupling -- and working out whether the universe is metastable!
My goal this year in Heidelberg is to move forward all writing projects. I didn't really want to start new projects, but of course I can't help myself, hence the previous post. But today I crushed the writing: I wrote four pages in the book that Rix (MPIA) wants me to write, and I got more than halfway done with a Templeton Foundation pre-proposal that I'm thinking about, and I partially wrote up the method of the robust dimensionality reduction that I was working on over the weekend. So it was a good day.
That said, I don't think that the iteratively reweighted least squares implementation that I am using in my dimensionality reduction has a good probabilistic interpretation. That is, it can't be described in terms of a likelihood function. This is related to the fact that frequentist methods that enforce sparsity (like L1 regularization) don't look anything like Bayesian methods that encourage sparsity (like massed priors). I don't know how to present these issues in any paper I try to write.
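For concreteness, here is a bare-bones IRLS sketch with Huber-style weights (an assumed choice of weighting, not necessarily the one used above), fit to fabricated data with a few gross outliers; the point is that the reweighting is an algorithmic fixed point, not the maximization of an explicit likelihood.

```python
import numpy as np

def irls_fit(A, y, delta=1.0, n_iter=20):
    """Iteratively reweighted least squares: each pass solves a weighted
    least-squares problem with weights set by the previous residuals, which
    downweights outliers but is stated as a recipe rather than a likelihood."""
    x = np.linalg.lstsq(A, y, rcond=None)[0]
    for _ in range(n_iter):
        r = y - A @ x
        w = np.minimum(1.0, delta / np.maximum(np.abs(r), 1e-12))  # Huber-like weights
        sw = np.sqrt(w)
        x = np.linalg.lstsq(sw[:, None] * A, sw * y, rcond=None)[0]
    return x

# Fabricated example: a straight line with a few gross outliers.
rng = np.random.default_rng(3)
t = np.linspace(0.0, 1.0, 50)
y = 2.0 + 3.0 * t + rng.normal(0.0, 0.1, 50)
y[::10] += 5.0
A = np.vstack([np.ones_like(t), t]).T
print(irls_fit(A, y))   # should land close to the true intercept 2 and slope 3
```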
1. The Classical vs. Quantum World
In our everyday experience of the world, things have precise positions, speeds, and outcomes. You throw a baseball—you know where it’s going. But when we zoom in to the world of atoms and particles, things get weird — and the rules change.
2. The Probabilistic Nature (Uncertainty and Superposition)
🗨️ Metaphor:
"Imagine flipping a coin, while it is spinning in mid-air, it spins in mid-air being both at heads and tails at the same time, with the probability of being heads or tails is still 50-50. At this point, if we want to describe the state of this system (the coin), it would be a combination of both heads and tails — until you look, and then you can say whether the coin landed on heads or tails. That’s how particles behave in the quantum world: they exist in a state made of both heads and tails, a superposition of states, until they’re measured.
🎯 Main idea:
Quantum Particles don’t have exact positions or velocities—just probabilities.
Measurement collapses the particle’s wavefunction to a definite value.
In classical mechanics, we think of a particle as a tiny object with a definite position and velocity at any time. But in quantum mechanics, particles like electrons are described by a wavefunction, a mathematical function that tells you the probability of finding the particle in different places. You can think of the particle not as a dot but as a fuzzy cloud, where the denser the cloud in one spot, the more likely the particle is to be found there.
This is why we say: "Particles don't have exact positions or velocities—just probabilities."
In our everyday life, we see systems that exhibit wave properties. Things like sound waves, water waves (surface waves), waves on a cable (vibrating), or if you live in certain places, you may experience seismic waves. These are all classical physics examples that are described by wave equations, where the disturbance propagates through a medium or field, transferring energy without necessarily transferring matter.
For example, when waves meet (e.g., ripples in water), they combine through a process called interference. This can take a few forms:
· Constructive Interference: When the crests (high points) and troughs (low points) of two waves line up, they reinforce each other, creating a larger wave. Think of two ripples on a pond colliding and forming a bigger splash.
· Destructive Interference: When a crest meets a trough, they cancel out to some extent—sometimes completely—resulting in a smaller or flat wave.
This blending of energy is happening constantly in light, sound, water waves, and even quantum systems.
Below, in Figure 1, is an example of superpositions of waves. The top image highlights full constructive interference and the bottom image shows destructive interference. Each wave has a maximum of 1 and a minimum of -1; these values, 1 and -1, are the wave's amplitude. For complete constructive interference, the superposition of the two waves (superposition means that at each position you add the two waves together) reaches 2 at the maximum and -2 at the minimum. For complete destructive interference, the two waves are said to be completely out of phase: at every point they are equal but of opposite sign (where one is +1, the other is -1), so when you add them together the superposition cancels out and equals 0 everywhere.
Below, in Figure 2, the waves are slightly shifted along the position axis (x-axis). Using the same points as before, you can see that the superposition wave no longer quite reaches 2 and -2: its peaks are less than 2 and its troughs are greater than -2 (that is, closer to 0). This is because each wave's maxima and minima now occur at different points in space, and the same is true of the superposition at every point. Imagine you fix wave 2 and slowly pull wave 1 to the right (wave 1 is then said to be phase-shifted relative to wave 2). The peaks and troughs of the superposition shrink toward 0, and once the maxima of wave 1 line up with the minima of wave 2, the superposition is 0 at every point; this is the complete destructive interference we saw in Figure 1. If you continue to pull wave 1 to the right, the superposition starts growing again, and eventually you recover the complete constructive interference pattern of Figure 1.
Notice that the superposition wave (like the other waves) starts to repeat its pattern. The distance over which the pattern repeats defines the superposition wave's wavelength 𝛌. Now imagine you had lots of waves, some shifted relative to our wave 1. At some positions we will get a maximum amplitude resulting from constructive interference (though not necessarily complete constructive interference), giving the highest point of a wave (the crest of a water wave), while at others we may get destructive interference, leading to the minimum amplitude (the trough of a wave), along with other intermediate amplitudes that help make up the entire wave. Hopefully, this simplistic model helps us understand how waves form and how you can get a big wave from many small waves.
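If you would like to see these numbers for yourself, here is a tiny script (with illustrative values, not tied to the figures) that adds two identical sine waves with different phase shifts and reports the peak of the superposition:

```python
import numpy as np

x = np.linspace(0, 4 * np.pi, 1000)
wavelength = 2 * np.pi

for shift_fraction in (0.0, 0.25, 0.5):   # 0 = in phase, 0.5 = completely out of phase
    wave1 = np.sin(2 * np.pi * x / wavelength)
    wave2 = np.sin(2 * np.pi * (x - shift_fraction * wavelength) / wavelength)
    total = wave1 + wave2                  # superposition: add the waves point by point
    print(f"shift of {shift_fraction} wavelengths -> peak of the sum = {total.max():.2f}")
# in phase gives 2.00 (constructive); half a wavelength gives 0.00 (destructive)
```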
Another feature of waves is that they have a wavelength, which describes how far they travel in space before repeating the same pattern over and over. If you remember the mathematical functions sine and cosine, they are waves that repeat in space and have a wavelength. Now the important part: the momentum p of these waves is inversely proportional to their wavelength, p = h/𝛌 (de Broglie's relation, where h is Planck's constant), so p ∝ 1/𝛌. If you have a short wavelength, you have a large momentum, and vice versa.
These waves follow classical equations — disturbances that move through a medium, transferring energy. But in quantum mechanics, the wave isn't a ripple in water or air — it’s a probability wave.
Now comes the key idea: wave-particle duality. Particles act like waves. And waves behave very differently from particles in one crucial way:
A wave that's localized in space (i.e., sharply peaked in position) must be made by combining many different wavelengths. Think of a big wave in the ocean; it is formed by lots of waves coming together to form this big wave. This combining of waves also means you have a wide range of momenta.
Correspondingly, a wave with a well-defined momentum (i.e., a single well-defined wavelength) must be spread out in space.
For example, consider music: a pure note on a tuning fork (single frequency = well-defined momentum) lasts a long time but is hard to pin down in time (spread out), whereas a short drumbeat is localized in time (well-defined position) but contains a spread of frequencies (momentum uncertainty).
This trade-off is a fundamental mathematical property of waves, made precise by the Fourier transform. The Fourier transform decomposes any wave into sines and cosines (packaged together using complex numbers), and it tells you that a wave that is narrow in position must contain a broad range of wavelengths (and hence momenta), while a wave built from a narrow range of wavelengths must be spread out in position.
One of the most famous — and misunderstood — ideas in quantum mechanics is the Heisenberg Uncertainty Principle.
It’s often summed up like this: You can’t know both where something is and how fast it’s moving — at the same time — with perfect precision.
At first glance, that sounds like a problem with our measuring tools, as if we just need better microscopes or sensors. But that’s not it.
This principle isn’t about technological limitations — it’s a fundamental property of nature.
In classical physics, if you know where a car is and how fast it’s going, you can predict exactly where it’ll be a few seconds later. But in the quantum world, if you try to pin down the position of a particle more precisely, you automatically become less certain about its momentum (its speed and direction) — and vice versa.
It’s not because the particle is misbehaving — it’s because particles aren’t like tiny billiard balls. They behave like waves, and waves don’t have sharp edges.
Think of a musical note. If a sound wave is spread out in time — like a long, steady tone — it has a very precise frequency (pitch). But if it’s a short, sharp “ping,” its frequency becomes less certain. You trade time for pitch.
In the same way, if a particle’s wave is sharply localized in space (you know where it is), the range of its momentum values must broaden. If the wave is spread out (you don’t know exactly where it is), the momentum is better defined.
It’s not that the particle is jittering around randomly. Instead:
Before measurement, a particle’s position and momentum are both described by a range of probabilities.
The more tightly you narrow one, the more uncertain the other becomes.
The Heisenberg Uncertainty Principle can be written down as,
𝚫p𝚫x ≥ ℏ/2
𝚫x is the uncertainty in position
𝚫p is the uncertainty in momentum
ℏ is the reduced Planck constant (a very small number)
Let’s try to understand this formula a little better. In quantum mechanics, particles like electrons aren’t just little dots — they also act like waves. This means we describe them with wave packets, which are like short-lived ripples or pulses spread out over space.
To make a wave packet that’s narrow in space (so we know roughly where the particle is), we have to combine many different waves (i.e., sine waves) with various wavelengths and frequencies (think back to our above example of waves).
That’s because a single sine wave, for example, stretches out infinitely — it doesn’t give you a clear position. Only by mixing waves with different wavelengths (and therefore different momenta) can we build a localized bump.
So: Precise position → requires many different wavelengths → high momentum uncertainty.
Now reverse it. If we only use one sine wave, it has a very clear wavelength (momentum), but it stretches out forever — the particle could be anywhere.
So: Precise momentum → means the particle is spread out → high position uncertainty.
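You can check this trade-off numerically. The sketch below (a toy, using a Gaussian wave packet and a fast Fourier transform) measures the spread in position and the spread in wavenumber for packets of different widths; since momentum is proportional to wavenumber (p = ℏk), the product coming out near 1/2 is exactly the ℏ/2 bound in disguise.

```python
import numpy as np

x = np.linspace(-50, 50, 4096)
dx = x[1] - x[0]
k = 2 * np.pi * np.fft.fftfreq(len(x), d=dx)     # wavenumber grid for the FFT

for width in (0.5, 1.0, 2.0):
    psi = np.exp(-x**2 / (4 * width**2))          # Gaussian wave packet of spatial width ~ width
    psi /= np.sqrt(np.sum(np.abs(psi)**2) * dx)   # normalize the probabilities
    phi = np.fft.fft(psi)                         # same packet expressed in wavenumbers
    p_x = np.abs(psi)**2 * dx                     # position probabilities
    p_k = np.abs(phi)**2
    p_k /= p_k.sum()                              # wavenumber probabilities
    dx_rms = np.sqrt(np.sum(p_x * x**2))          # spread in position
    dk_rms = np.sqrt(np.sum(p_k * k**2))          # spread in wavenumber
    print(f"width {width}: dx = {dx_rms:.2f}, dk = {dk_rms:.2f}, dx*dk = {dx_rms*dk_rms:.2f}")
# each product comes out close to 0.5: squeezing the packet in x broadens it in k
```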
This trade-off is at the heart of the uncertainty principle:
𝚫p𝚫x ≥ ℏ/2
Here, 𝚫x is the uncertainty in position, 𝚫p is the uncertainty in momentum, and ℏ is the reduced Planck constant, a very tiny number.
The key message: The more precisely you know where something is, the less precisely you can know how fast it's going — and vice versa.
Imagine building a short splash on a pond with water waves (see Figure 3):
A small, sharp splash uses many different ripple sizes (frequencies).
A pure, smooth ripple has just one frequency but spreads out.
That’s the uncertainty principle in action, hiding in the rhythm of waves.
So what this tells us is that as we become more and more certain about the location of a particle (𝚫x getting smaller and smaller, heading to 0), 𝚫p gets larger and larger, heading to ∞. If we knew x exactly, we would not know the momentum p of the particle at all, since the uncertainty 𝚫p would be infinite.
You can’t precisely know both where something is (position) and how fast it’s going or in what direction (momentum) at the same time. The more accurately you try to measure one, the fuzzier the other becomes.
Imagine you're trying to photograph a speeding car at night.
If you use a fast shutter, you can see exactly where the car is, but the picture will be blurry — you can’t tell how fast it was going.
If you use a slow shutter, you get a motion blur — which tells you how fast it was moving, but now you don’t know exactly where it was.
That’s the uncertainty principle in action: precision in one area means fuzziness in the other.
Again, this isn’t just a limitation of our instruments — it's a fundamental property of nature. It's like the universe itself has this built-in fuzziness at tiny scales.
This principle also tells us why electrons don't just spiral into the nucleus of an atom.
Because you can’t precisely know both the position and momentum of a particle at the same time.
If an electron got too close to the nucleus, its position would be very well known (i.e., tightly confined in space). According to the uncertainty principle, this would mean its momentum becomes highly uncertain. Because kinetic energy is computed directly from momentum, large momentum fluctuations mean a large kinetic energy.
This tells us that confining the electron too tightly costs energy — a lot of energy. That energy cost balances out the attractive pull of the nucleus. The result? The electron occupies a fuzzy “cloud” of most likely locations (remember it is based on probabilities)— what we call an orbital — and it doesn't just fall in.
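Here is a quick numerical version of that balancing argument (an order-of-magnitude sketch, not a full quantum calculation): estimate the electron's momentum from the uncertainty principle as p ~ ℏ/r, add the Coulomb attraction, and see where the total energy is smallest.

```python
import numpy as np
from scipy.constants import hbar, m_e, e, epsilon_0, pi

# Confining the electron to a region of size r costs kinetic energy
# ~ hbar^2 / (2 m r^2), while the Coulomb attraction to the proton gains
# -e^2 / (4 pi eps0 r).  The sum has a minimum at a finite radius.
r = np.logspace(-12, -9, 2000)                     # 0.001 nm to 1 nm
kinetic = hbar**2 / (2 * m_e * r**2)
potential = -e**2 / (4 * pi * epsilon_0 * r)
energy = kinetic + potential

r_best = r[np.argmin(energy)]
print(f"energy is minimized near r = {r_best*1e10:.2f} angstroms")  # ~0.53, the Bohr radius
```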
This quantum balancing act gives rise to stable atoms, the periodic table, chemistry, etc.
Wave-particle duality is one of the most astonishing ideas in modern physics. It says that tiny things—like electrons and light—can behave like particles and waves, depending on how you look at them.
Waves (like ocean waves, or ripples in a pond, or even sound waves) are spread out, continuous disturbances. They travel, they can interfere with each other (creating bigger or smaller waves), and they bend around corners. You can't point to "one wave" and say it's at a single, precise location.
Particles (like a baseball, or a tiny pebble) are distinct, localized objects. They have a definite position, mass, and can be tracked as they move from one point to another.
The Classical Difference: In our ordinary experience, something is clearly either a wave or a particle. Never both.
In everyday experience:
Objects are either particles (like baseballs) or waves (like sound or water ripples).
Particles have defined positions and travel along clear paths.
Waves are spread out, overlap, and interfere, but they don't "exist" in a single spot.
Think of throwing a rock into a pond—either you're dealing with the rock or the ripples it creates, never both at once.
The Quantum Twist: Wave-Particle Duality
But when we zoom down to the incredibly tiny, fundamental level of reality – the quantum realm – things get weird. Particles like electrons, and even light itself (which we classically considered a wave), don't always fit neatly into one category. This is wave-particle duality:
Light, for instance, can behave like a spread-out wave (which is why it can create interference patterns, just like water waves). But it can also act like a stream of tiny, discrete particles called photons (which is how it knocks electrons off a metal surface in the photoelectric effect, acting like tiny billiard balls).
Similarly, electrons (which we think of as particles making up atoms) can, under certain experimental conditions, exhibit wave-like behavior, creating interference patterns as if they were spread out and passing through multiple places at once. Yet, when we try to pinpoint their location, they act like a localized particle.
This means a single electron, shot toward a double slit, doesn't just go through one slit—it behaves as if it explores all possibilities at once, producing an interference pattern typical of waves.
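The interference pattern itself is easy to compute from the wave picture. A minimal sketch (ignoring the single-slit envelope, with made-up numbers for the wavelength and slit spacing):

```python
import numpy as np

wavelength = 1.0          # arbitrary units
slit_separation = 5.0     # distance between the two slits, same units

theta = np.linspace(-0.3, 0.3, 7)    # viewing angles (radians) on a distant screen
# The path difference between the slits is d*sin(theta); a whole number of
# wavelengths gives constructive interference (bright fringe), a half-integer
# gives destructive interference (dark fringe).
phase = np.pi * slit_separation * np.sin(theta) / wavelength
intensity = np.cos(phase) ** 2        # relative two-slit interference pattern
for t, i in zip(theta, intensity):
    print(f"theta = {t:+.2f} rad   relative intensity = {i:.2f}")
```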
The amazing part is that a quantum entity isn't just sometimes a wave and sometimes a particle. Instead, it possesses both wave-like and particle-like properties simultaneously, and the act of observation or the type of experiment we perform determines which aspects we will observe. You can't observe both characteristics at the same exact time in the same experiment.
This seemingly paradoxical idea is a cornerstone of quantum mechanics and is absolutely essential for understanding how the universe works at its most fundamental level. It underpins all modern technologies from lasers and transistors to medical imaging and the very concept of quantum computing.
The objects aren't just "here or there"—they are probabilistic ripples, until observed.
Wave-particle duality is nature’s way of whispering: “The world is more nuanced than it seems.”