Planet Musings

December 25, 2024

John BaezThe Parker Solar Probe

Today, December 24th 2024, the Parker Solar Probe got 7 times closer to the Sun than any other spacecraft ever has, going faster than any spacecraft ever has—690,000 kilometers per hour. WHEEEEEE!!!!!!!

But the newspapers are barely talking about the really cool part: what it’s like down there. The Sun doesn’t have a surface like the Earth does, since it’s all just hot ionized gas, called ‘plasma‘. But the Sun has an ‘Alfvén surface’—and the probe has penetrated that.

What’s the Alfvén surface? In simple terms, it’s where the solar wind—the hot ionized gas emitted by the Sun—breaks free of the Sun and shoots out into space. But to understand how cool this is, we need to dig a bit deeper.

After all, how can we say where the solar wind “breaks free of the Sun”?

Hot gas shoots up from the Sun, faster and faster due to its pressure, even though it’s pulled down by gravity. At some point it goes faster than the speed of sound! This is the Alfvén surface. Above this surface, the solar wind becomes supersonic, so no disturbances in its flow can affect the Sun below.

It’s sort of like the reverse of a black hole! Light emitted from within the event horizon of a black hole can’t get out. Sound emitted from outside the Alfvén surface of the Sun can’t get in.

Or, it’s like the edge of a waterfall, where the water starts flowing so fast that waves can’t make it back upstream.

That’s pretty cool. But it’s even cooler than this, because ‘sound’ in the solar wind is very different from sound on Earth. Here we have air. The Sun has ions—atoms of gas so hot that electrons have been ripped off—interacting with powerful magnetic fields. You can visualize these fields as tight rubber bands, with the ions stuck to them. They vibrate back and forth together!

You could call these vibrations ‘sound’, but the technical term is Alfvén waves. Alfvén was the one who figured out how fast these waves move. Parker studied the surface where the solar wind’s speed exceeds the speed of the Alfvén waves.
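For the record, the speed Alfvén found is B/sqrt(mu0 rho), where B is the magnetic field strength and rho is the plasma’s mass density. Here is a rough back-of-the-envelope sketch in Python; the input values are made-up round numbers for illustration, not measurements from the probe:

    import math

    mu0 = 4 * math.pi * 1e-7               # vacuum permeability (SI units)

    def alfven_speed(B_tesla, protons_per_m3):
        # Alfvén speed v = B / sqrt(mu0 * rho), assuming a pure proton plasma.
        rho = protons_per_m3 * 1.67e-27     # mass density in kg/m^3
        return B_tesla / math.sqrt(mu0 * rho)

    # Illustrative ballpark inputs for the inner solar wind:
    print(alfven_speed(1e-6, 1e9) / 1000, "km/s")   # roughly 700 km/s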

And now we’ve gone deep below that surface!

This realm is a strange one, and the more we study it, the more complex it seems to get.

You’ve probably heard the joke that ends “consider a spherical cow”. Parker’s original model of the solar wind was spherically symmetric, so he imagined the solar wind shooting straight out of the Sun in all directions. In this model, the Alfvén surface is the sphere where the wind becomes faster than the Alfvén waves. There are some nice simple formulas for all this.

But in fact the Sun’s surface is roiling and dynamic, with sunspots making solar flares, and all sorts of bizarre structures made of plasma and magnetic fields, like spicules, ‘coronal streamers’ and ‘pseudostreamers’… aargh, too complicated for me to understand. This is an entire branch of science!

So, the Alfvén surface is not a mere sphere: it’s frothy and randomly changing. The Parker Solar Probe will help us learn how it works—along with many other things.

Finally, here’s something mindblowing. There’s a red dwarf star 41 light years away from us, called TRAPPIST-1, which may have six planets beneath its Alfvén surface! This means these planets can create Alfvén waves in the star’s atmosphere. Truly the music of the spheres!

For more, check out these articles:

• Wikipedia, Alfvén wave.

• Wikipedia, Alfvén surface.

and this open-access article:

• Steven R. Cranmer, Rohit Chhiber, Chris R. Gilly, Iver H. Cairns, Robin C. Colaninno, David J. McComas, Nour E. Raouafi, Arcadi V. Usmanov, Sarah E. Gibson and Craig E. DeForest, The Sun’s Alfvén surface: recent insights and prospects for the Polarimeter to Unify the Corona and Heliosphere (PUNCH), Solar Physics 298 (2023).

A quote:

Combined with recent perihelia of Parker Solar Probe, these studies seem to indicate that the Alfvén surface spends most of its time at heliocentric distances between about 10 and 20 solar radii. It is becoming apparent that this region of the heliosphere is sufficiently turbulent that there often exist multiple (stochastic and time-dependent) crossings of the Alfvén surface along any radial ray.

John BaezEpicycles

Some people think medieval astronomers kept adding ‘epicycles’ to the orbits of planets, culminating with the Alfonsine Tables created in 1252. The 1968 Encyclopædia Britannica says:

By this time each planet had been provided with from 40 to 60 epicycles to represent after a fashion its complex movement among the stars.

But this is complete nonsense!

Medieval astronomers did not use so many epicycles. The Alfonsine Tables, which the Britannica is mocking above, actually computed planetary orbits using the method in Ptolemy’s Almagest, developed way back in 150 AD. This method uses at most 31 circles and spheres—nothing like Britannica’s ridiculous claim of 40 to 60 epicycles per planet.

The key idea in Ptolemy’s model was this:

The blue dot here is the Earth. The large black circle, offset from the Earth, is called a ‘deferent’. The smaller black circle is called an ‘epicycle’. The epicycle makes up for how in reality the Earth is not actually stationary, but moving around the Sun.

The center of the epicycle rotates at constant angular velocity around the purple dot, which is called the ‘equant’. The equant and the Earth are at equal distances from the center of the black circle. Meanwhile the planet, in red, moves around the center of the epicycle at constant angular velocity.
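To make the construction concrete, here is a small Python sketch (with arbitrary made-up radii and speeds, not Ptolemy’s actual parameters) that computes a planet’s position from a deferent, an equant, and a single epicycle:

    import numpy as np

    R = 1.0      # radius of the deferent
    r = 0.3      # radius of the epicycle
    e = 0.1      # Earth-to-center distance (the equant sits at the mirror point)
    w_def = 1.0  # angular speed of the epicycle's center, as seen from the equant
    w_epi = 5.0  # angular speed of the planet around the epicycle's center

    center = np.array([0.0, 0.0])   # center of the deferent
    earth  = np.array([-e, 0.0])    # the Earth, offset from the center
    equant = np.array([+e, 0.0])    # the equant, equally far on the other side

    def planet_position(t):
        # The epicycle's center is the point of the deferent lying along the ray
        # from the equant at angle w_def * t: solve |equant + s*u - center| = R.
        u = np.array([np.cos(w_def * t), np.sin(w_def * t)])
        b = u @ (equant - center)
        c = (equant - center) @ (equant - center) - R**2
        s = -b + np.sqrt(b**2 - c)          # positive root of s^2 + 2 b s + c = 0
        epicycle_center = equant + s * u
        # The planet then moves at constant angular speed around that center.
        return epicycle_center + r * np.array([np.cos(w_epi * t), np.sin(w_epi * t)])

    print(planet_position(0.0), planet_position(1.0))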

In the Almagest, Ptolemy used some additional cycles to account for how the latitudes of planets change over time. In reality this happens because the planets don’t all move in the same plane. Ptolemy also used additional ‘epicyclets’ to account for peculiarities in the orbits of Mercury and the Moon, and a mechanism to account for the precession of equinoxes—which really happens because the Earth’s axis is slowly precessing.

In a later work, the Planetary Hypotheses, Ptolemy achieved even more accurate results with fewer cycles, by having the planets orbit in different planes (as they indeed do).

So, just because something is in an encyclopedia, or even an encyclopædia, doesn’t mean it’s true.

The Encyclopædia Britannica quote comes from their 1968 edition, volume 2, in the article on the Spanish king Alfonso X, which on page 645 discusses the Alfonsine Tables commissioned by this king:

By this time each planet had been provided with from 40 to 60 epicycles to represent after a fashion its complex movement among the stars. Amazed at the difficulty of the project, Alfonso is credited with the remark that had he been present at the Creation he might have given excellent advice.

In The Book Nobody Read, Owen Gingerich writes that he challenged Encyclopædia Britannica about the number of epicycles. Their response was that the original author of the entry had died and its source couldn’t be verified. Gingerich has also expressed doubts about the quotation attributed to King Alfonso X.

For the controversy over whether medieval astronomers used lots of epicycles, start here:

• Wikipedia, Deferent and epicycle: history.

and then go here:

• Wikipedia, Deferent and epicycle: the number of epicycles.

Then dig into the sources! For example, Wikipedia says the claim that the Ptolemaic system uses about 80 circles seems to have appeared in 1898. It may have been inspired by the non-Ptolemaic system of Girolamo Fracastoro, who used either 77 or 79 orbs. So some theories used lots of epicycles—but not the most important theories, and nothing like the 240-360 claimed by the 1968 Britannica.

Owen Gingerich wrote The Book Nobody Read about his quest to look at all 600 extant copies of Copernicus’ De revolutionibus. The following delightful passage was contributed by pglpm on Mastodon:

Terence TaoQuaternions and spherical trigonometry

Hamilton’s quaternion number system {\mathbb{H}} is a non-commutative extension of the complex numbers, consisting of numbers of the form {t + xi + yj + zk} where {t,x,y,z} are real numbers, and {i,j,k} are anti-commuting square roots of {-1} with {ij=k}, {jk=i}, {ki=j}. While they are non-commutative, they do keep many other properties of the complex numbers:

  • Being non-commutative, the quaternions do not form a field. However, they are still a skew field (or division ring): multiplication is associative, and every non-zero quaternion has a unique multiplicative inverse.
  • Like the complex numbers, the quaternions have a conjugation

    \displaystyle  \overline{t+xi+yj+zk} := t-xi-yj-zk,

    although this is now an antihomomorphism rather than a homomorphism: {\overline{qr} = \overline{r}\ \overline{q}}. One can then split up a quaternion {t + xi + yj + zk} into its real part {t} and imaginary part {xi+yj+zk} by the familiar formulae

    \displaystyle  \mathrm{Re} q := \frac{q + \overline{q}}{2}; \quad \mathrm{Im} q := \frac{q - \overline{q}}{2}

    (though we now leave the imaginary part purely imaginary, as opposed to dividing by {i} in the complex case).
  • The inner product

    \displaystyle  \langle q, r \rangle := \mathrm{Re} q \overline{r}

    is symmetric and positive definite (with {1,i,j,k} forming an orthonormal basis). Also, for any {q}, {q \overline{q}} is real, hence equal to {\langle q, q \rangle}. Thus we have a norm

    \displaystyle  |q| = \sqrt{q\overline{q}} = \sqrt{\langle q,q \rangle} = \sqrt{t^2 + x^2 + y^2 + z^2}.

    Since the real numbers commute with all quaternions, we have the multiplicative property {|qr| = |q| |r|}. In particular, the unit quaternions {U(1,\mathbb{H}) := \{ q \in \mathbb{H}: |q|=1\}} (also known as {SU(2)}, {Sp(1)}, or {Spin(3)}) form a compact group.
  • We have the cyclic trace property

    \displaystyle  \mathrm{Re}(qr) = \mathrm{Re}(rq)

    which allows one to take adjoints of left and right multiplication:

    \displaystyle  \langle qr, s \rangle = \langle q, s\overline{r}\rangle; \quad \langle rq, s \rangle = \langle q, \overline{r}s \rangle

  • As {i,j,k} are square roots of {-1}, we have the usual Euler formulae

    \displaystyle  e^{i\theta} = \cos \theta + i \sin \theta, e^{j\theta} = \cos \theta + j \sin \theta, e^{k\theta} = \cos \theta + k \sin \theta

    for real {\theta}, together with other familiar formulae such as {\overline{e^{i\theta}} = e^{-i\theta}}, {e^{i(\alpha+\beta)} = e^{i\alpha} e^{i\beta}}, {|e^{i\theta}| = 1}, etc.
We will use these sorts of algebraic manipulations in the sequel without further comment.
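As a quick aid for following the computations below, here is a minimal Python sketch of this arithmetic (my own illustration, not part of the original post), following the conventions above: ij = k, jk = i, ki = j, and the inner product Re(q conj(r)):

    from dataclasses import dataclass
    import math

    @dataclass
    class Q:
        t: float   # real part
        x: float   # coefficient of i
        y: float   # coefficient of j
        z: float   # coefficient of k

        def __mul__(a, b):
            # Hamilton product with ij = k, jk = i, ki = j.
            return Q(a.t*b.t - a.x*b.x - a.y*b.y - a.z*b.z,
                     a.t*b.x + a.x*b.t + a.y*b.z - a.z*b.y,
                     a.t*b.y - a.x*b.z + a.y*b.t + a.z*b.x,
                     a.t*b.z + a.x*b.y - a.y*b.x + a.z*b.t)

        def conj(q):
            return Q(q.t, -q.x, -q.y, -q.z)

    def inner(q, r):
        return (q * r.conj()).t            # <q, r> = Re(q conj(r))

    def exp_i(theta):                      # e^{i theta}
        return Q(math.cos(theta), math.sin(theta), 0.0, 0.0)

    i, j, k = Q(0,1,0,0), Q(0,0,1,0), Q(0,0,0,1)
    print(i * j, j * k, k * i)             # k, i, j
    print((i * i).t, inner(i, i))          # -1 and 1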

The unit quaternions {U(1,\mathbb{H}) = \{ q \in \mathbb{H}: |q|=1\}} act on the imaginary quaternions {\{ xi + yj + zk: x,y,z \in {\bf R}\} \equiv {\bf R}^3} by conjugation:

\displaystyle  v \mapsto q v \overline{q}.

This action is by orientation-preserving isometries, hence by rotations. It is not quite faithful, since conjugation by the unit quaternion {-1} is the identity, but one can show that this is the only loss of faithfulness, reflecting the well known fact that {U(1,\mathbb{H}) \equiv SU(2)} is a double cover of {SO(3)}.

For instance, for any real {\theta}, conjugation by {e^{i\theta/2} = \cos(\theta/2) + i \sin(\theta/2)} is a rotation by {\theta} around {i}:

\displaystyle  e^{i\theta/2} i e^{-i\theta/2} = i \ \ \ \ \ (1)

\displaystyle  e^{i\theta/2} j e^{-i\theta/2} = \cos(\theta) j + \sin(\theta) k \ \ \ \ \ (2)

\displaystyle  e^{i\theta/2} k e^{-i\theta/2} = \cos(\theta) k - \sin(\theta) j. \ \ \ \ \ (3)

Similarly for cyclic permutations of {i,j,k}. The doubling of the angle here can be explained from the Lie algebra fact that {[i,j]=ij-ji} is {2k} rather than {k}; it is also closely related to the aforementioned double cover. We also of course have {U(1,\mathbb{H})\equiv Spin(3)} acting on {\mathbb{H}} by left multiplication; this is known as the spinor representation, but will not be utilized much in this post. (Giving {\mathbb{H}} the right action of {{\bf C}} makes it a copy of {{\bf C}^2}, and the spinor representation then also becomes the standard representation of {SU(2)} on {{\bf C}^2}.)
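Continuing the small Python sketch above, one can check formulas (1)-(3) numerically (again just an illustration, with an arbitrary angle):

    theta = 0.7
    u = exp_i(theta / 2)

    def conjugate(q, v):                   # v -> q v conj(q)
        return q * v * q.conj()

    print(conjugate(u, i))                 # i, unchanged
    print(conjugate(u, j))                 # cos(theta) j + sin(theta) k
    print(conjugate(u, k))                 # cos(theta) k - sin(theta) j
    print(math.cos(theta), math.sin(theta))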

Given how quaternions relate to three-dimensional rotations, it is not surprising that they can also be used to recover the basic laws of spherical trigonometry – the study of spherical triangles on the unit sphere. This is fairly well known, but it took a little effort for me to locate the required arguments, so I am recording the calculations here.

The first observation is that every unit quaternion {q} induces a unit tangent vector {qj\overline{q}} on the unit sphere {S^2 \subset {\bf R}^3}, located at {qi\overline{q} \in S^2}; the third unit vector {qk\overline{q}} is then another tangent vector orthogonal to the first two (and oriented to the left of the original tangent vector), and can be viewed as the cross product of {qi\overline{q} \in S^2} and {qj\overline{q} \in S^2}. Right multiplication of this quaternion then corresponds to various natural operations on this unit tangent vector:

  • Right multiplying {q} by {e^{i\theta/2}} does not affect the location {qi\overline{q}} of the tangent vector, but rotates the tangent vector {qj\overline{q}} anticlockwise by {\theta} in the direction of the orthogonal tangent vector {qk\overline{q}}, as it replaces {qj\overline{q}} by {\cos(\theta) qj\overline{q} + \sin(\theta) qk\overline{q}}.
  • Right multiplying {q} by {e^{k\theta/2}} advances the tangent vector by geodesic flow by angle {\theta}, as it replaces {qi\overline{q}} by {\cos(\theta) qi\overline{q} + \sin(\theta) qj\overline{q}}, and replaces {qj\overline{q}} by {\cos(\theta) qj\overline{q} - \sin(\theta) qi\overline{q}}.

Now suppose one has a spherical triangle with vertices {A,B,C}, with the spherical arcs {AB, BC, CA} subtending angles {c, a, b} respectively, and the vertices {A,B,C} subtending angles {\alpha,\beta,\gamma} respectively; suppose also that {ABC} is oriented in an anti-clockwise direction for sake of discussion. Observe that if one starts at {A} with a tangent vector oriented towards {B}, advances that vector by {c}, and then rotates by {\pi - \beta}, the tangent vector is now at {B}, pointing towards {C}. If one advances by {a} and rotates by {\pi - \gamma}, one is now at {C} pointing towards {A}; and if one then advances by {b} and rotates by {\pi - \alpha}, one is back at {A} pointing towards {B}. This gives the fundamental relation

\displaystyle  e^{kc/2} e^{i(\pi-\beta)/2} e^{ka/2} e^{i(\pi-\gamma)/2} e^{kb/2} e^{i(\pi-\alpha)/2} = 1 \ \ \ \ \ (4)

relating the three sides and three angles of this triangle. (A priori, due to the lack of faithfulness of the {U(1,\mathbb{H})} action, the right-hand side could conceivably have been {-1} rather than {1}; but for extremely small triangles the right-hand side is clearly {1}, and so by continuity it must be {1} for all triangles.) Indeed, a moment’s thought will reveal that the condition (4) is necessary and sufficient for the data {a,b,c,\alpha,\beta,\gamma} to be associated with a spherical triangle. Thus one can view (4) as a “master equation” for spherical trigonometry: in principle, it can be used to derive all the other laws of this subject.

Remark 1 The law (4) has an evident symmetry {(a,b,c,\alpha,\beta,\gamma) \mapsto (\pi-\alpha,\pi-\beta,\pi-\gamma,\pi-a,\pi-b,\pi-c)}, which corresponds to the operation of replacing a spherical triangle with its dual triangle. Also, there is nothing particularly special about the choice of imaginaries {i,k} in (4); one can conjugate (4) by various quaternions and replace {i,k} here by any other orthogonal pair of unit quaternions.

Remark 2 If we work in the small scale regime, replacing {a,b,c} by {\varepsilon a, \varepsilon b, \varepsilon c} for some small {\varepsilon>0}, then we expect spherical triangles to behave like Euclidean triangles. Indeed, (4) to zeroth order becomes

\displaystyle  e^{i(\pi-\beta)/2} e^{i(\pi-\gamma)/2} e^{i(\pi-\alpha)/2} = 1

which reflects the classical fact that the sum of angles of a Euclidean triangle is equal to {\pi}. To first order, one obtains

\displaystyle  c + a e^{i(\pi-\gamma)/2} e^{i(\pi-\alpha)/2} + b e^{i(\pi-\alpha)/2} = 0

which reflects the evident fact that the vector sum of the sides of a Euclidean triangle sum to zero. (Geometrically, this correspondence reflects the fact that the action of the (projective) quaternion group on the unit sphere converges to the action of the special Euclidean group {SE(2)} on the plane, in a suitable asymptotic limit.)

The identity (4) is an identity of two unit quaternions; as the unit quaternion group {U(1,\mathbb{H})} is three-dimensional, this thus imposes three independent constraints on the six real parameters {a,b,c,\alpha,\beta,\gamma} of the spherical triangle. One can manipulate this constraint in various ways to obtain various trigonometric identities involving some subsets of these six parameters. For instance, one can rearrange (4) to get

\displaystyle  e^{i(\pi-\beta)/2} e^{ka/2} e^{i(\pi-\gamma)/2} = e^{-kc/2} e^{-i(\pi-\alpha)/2} e^{-kb/2}. \ \ \ \ \ (5)

Conjugating by {i} to reverse the sign of {k}, we also have

\displaystyle  e^{i(\pi-\beta)/2} e^{-ka/2} e^{i(\pi-\gamma)/2} = e^{kc/2} e^{-i(\pi-\alpha)/2} e^{kb/2}.

Taking the inner product of both sides of these identities, we conclude that

\displaystyle  \langle e^{i(\pi-\beta)/2} e^{ka/2} e^{i(\pi-\gamma)/2}, e^{i(\pi-\beta)/2} e^{-ka/2} e^{i(\pi-\gamma)/2} \rangle

is equal to

\displaystyle  \langle e^{-kc/2} e^{-i(\pi-\alpha)/2} e^{-kb/2}, e^{kc/2} e^{-i(\pi-\alpha)/2} e^{kb/2} \rangle.

Using the various properties of the inner product, the former expression simplifies to {\mathrm{Re} e^{ka} = \cos a}, while the latter simplifies to

\displaystyle  \mathrm{Re} \langle e^{-i(\pi-\alpha)/2} e^{-kb} e^{i(\pi-\alpha)/2}, e^{kc} \rangle.

We can write {e^{kc} = \cos c + (\sin c) k} and

\displaystyle  e^{-i(\pi-\alpha)/2} e^{-kb} e^{i(\pi-\alpha)/2} = \cos b - (\sin b) (\cos(\pi-\alpha) k + \sin(\pi-\alpha) j)

so on substituting and simplifying we obtain

\displaystyle  \cos b \cos c + \sin b \sin c \cos \alpha = \cos a

which is the spherical cosine rule. Note in the infinitesimal limit (replacing {a,b,c} by {\varepsilon a, \varepsilon b, \varepsilon c}) this rule becomes the familiar Euclidean cosine rule

\displaystyle  a^2 = b^2 + c^2 - 2bc \cos \alpha.

In a similar fashion, from (5) we see that the quantity

\displaystyle  \langle e^{i(\pi-\beta)/2} e^{ka/2} e^{i(\pi-\gamma)/2} i e^{-i(\pi-\gamma)/2} e^{-ka/2} e^{-i(\pi-\beta)/2}, k \rangle

is equal to

\displaystyle  \langle e^{-kc/2} e^{-i(\pi-\alpha)/2} e^{-kb/2} i e^{kb/2} e^{i(\pi-\alpha)/2} e^{kc/2}, k \rangle.

The first expression simplifies by (1) and properties of the inner product to

\displaystyle  \langle e^{ka/2} i e^{-ka/2}, e^{-i(\pi-\beta)/2} k e^{i(\pi-\beta)/2} \rangle,

which by (2), (3) simplifies further to {-\sin a \sin \beta}. Similarly, the second expression simplifies to

\displaystyle  \langle e^{-kb/2} i e^{kb/2} , e^{i(\pi-\alpha)/2} k e^{-i(\pi-\alpha)/2}\rangle,

which by (2), (3) simplifies to {-\sin b \sin \alpha}. Equating the two and rearranging, we obtain

\displaystyle  \frac{\sin \alpha}{\sin a} = \frac{\sin \beta}{\sin b}

which is the spherical sine rule. Again, in the infinitesimal limit we obtain the familiar Euclidean sine rule

\displaystyle  \frac{\sin \alpha}{a} = \frac{\sin \beta}{b}.
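As a quick numerical sanity check of the cosine and sine rules (a NumPy sketch, not part of the original argument), one can pick three random points on the unit sphere, measure the sides from dot products and the vertex angles from tangent vectors, and compare both sides of each identity:

    import numpy as np

    rng = np.random.default_rng(0)

    def random_unit():
        v = rng.normal(size=3)
        return v / np.linalg.norm(v)

    A, B, C = random_unit(), random_unit(), random_unit()

    # Side arcs opposite each vertex: a = BC, b = CA, c = AB.
    a = np.arccos(np.clip(B @ C, -1, 1))
    b = np.arccos(np.clip(C @ A, -1, 1))
    c = np.arccos(np.clip(A @ B, -1, 1))

    def vertex_angle(P, Q, R):
        # Angle at P between the great-circle arcs towards Q and towards R.
        tQ = Q - (P @ Q) * P
        tR = R - (P @ R) * P
        tQ, tR = tQ / np.linalg.norm(tQ), tR / np.linalg.norm(tR)
        return np.arccos(np.clip(tQ @ tR, -1, 1))

    alpha, beta, gamma = vertex_angle(A, B, C), vertex_angle(B, C, A), vertex_angle(C, A, B)

    # Spherical cosine rule: both printed numbers should agree.
    print(np.cos(a), np.cos(b) * np.cos(c) + np.sin(b) * np.sin(c) * np.cos(alpha))
    # Spherical sine rule: all three ratios should agree.
    print(np.sin(alpha) / np.sin(a), np.sin(beta) / np.sin(b), np.sin(gamma) / np.sin(c))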

As a variant of the above analysis, we have from (5) again that

\displaystyle  \langle e^{i(\pi-\beta)/2} e^{ka/2} e^{i(\pi-\gamma)/2} i e^{-i(\pi-\gamma)/2} e^{-ka/2} e^{-i(\pi-\beta)/2}, j \rangle

is equal to

\displaystyle  \langle e^{-kc/2} e^{-i(\pi-\alpha)/2} e^{-kb/2} i e^{kb/2} e^{i(\pi-\alpha)/2} e^{kc/2}, j \rangle.

As before, the first expression simplifies to

\displaystyle  \langle e^{ka/2} i e^{-ka/2}, e^{-i(\pi-\beta)/2} j e^{i(\pi-\beta)/2} \rangle

which equals {\sin a \cos \beta}. Meanwhile, the second expression can be rearranged as

\displaystyle  \langle e^{-i(\pi-\alpha)/2} e^{-kb/2} i e^{kb/2} e^{i(\pi-\alpha)/2}, e^{kc/2} j e^{-kc/2} \rangle.

By (2), (3) we can simplify to

\displaystyle  e^{-i(\pi-\alpha)/2} e^{-kb/2} i e^{kb/2} e^{i(\pi-\alpha)/2}

\displaystyle= (\cos b) i - (\sin b) \cos(\pi-\alpha) j + (\sin b) \sin(\pi-\alpha) k

and so the inner product is {\cos b \sin c - \sin b \cos c \cos \alpha}, leading to the “five part rule”

\displaystyle  \cos b \sin c - \sin b \cos c \cos \alpha = \sin a \cos \beta.

In the case of a right-angled triangle {\beta=\pi/2}, this simplifies to one of Napier’s rules

\displaystyle  \cos \alpha = \frac{\tan c}{\tan b}, \ \ \ \ \ (6)

which in the infinitesimal limit is the familiar {\cos \alpha = \frac{c}{b}}. The other rules of Napier can be derived in a similar fashion.

Example 3 One application of Napier’s rule (6) is to determine the sunrise equation for when the sun rises and sets at a given location on the Earth, and a given time of year. For sake of argument let us work in summer, in which the declination {\delta} of the Sun is positive (due to axial tilt, it reaches a maximum of {23.5^\circ} at the summer solstice). Then the Sun subtends an angle of {\pi/2-\delta} from the pole star (Polaris in the northern hemisphere, Sigma Octantis in the southern hemisphere), and appears to rotate around that pole star once every {24} hours. On the other hand, if one is at a latitude {\phi}, then the pole star has an elevation of {\phi} above the horizon. At extremely high latitudes {\phi > \pi/2-\delta}, the sun will never set (a phenomenon known as “midnight sun”); but in all other cases, at sunrise or sunset, the sun, pole star, and horizon point below the pole star will form a right-angled spherical triangle, with hypotenuse subtending an angle {\pi/2-\delta} and vertical side subtending an angle {\phi}. The angle subtended by the pole star in this triangle is {\pi-\omega}, where {\omega} is the solar hour angle – the angle that the sun deviates from its noon position. Equation (6) then gives the sunrise equation

\displaystyle  \cos(\pi-\omega) = \frac{\tan \phi}{\tan(\pi/2-\delta)}

or equivalently

\displaystyle  \cos \omega = - \tan \phi \tan \delta.

A similar rule determines the time of sunset. In particular, the number of daylight hours in summer (assuming one is not in the midnight sun scenario {\phi > \pi/2 -\delta}) is given by

\displaystyle  24 - \frac{24}{\pi} \mathrm{arccos}(\tan \phi \tan \delta).

The situation in winter is similar, except that {\delta} is now negative, and polar night (no sunrise) occurs when {\phi > \pi/2+\delta}.
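As a small illustration of the sunrise equation (a sketch, with the solstice declination rounded to 23.5 degrees):

    import math

    def daylight_hours(latitude_deg, declination_deg):
        # Hours of daylight from cos(omega) = -tan(phi) tan(delta).
        phi = math.radians(latitude_deg)
        delta = math.radians(declination_deg)
        x = -math.tan(phi) * math.tan(delta)
        if x <= -1.0:
            return 24.0          # midnight sun
        if x >= 1.0:
            return 0.0           # polar night
        omega = math.acos(x)     # hour angle of the setting sun
        return 24.0 * omega / math.pi

    # Latitude 45 degrees north at the summer solstice:
    print(daylight_hours(45.0, 23.5))    # roughly 15.4 hours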

December 23, 2024

Tommaso DorigoTwenty Years Blogging

Twenty years ago today I got access for the first time to the interface that allowed me to publish blog posts for the Quantum Diaries web site, a science outreach endeavor that involved some 12 (then 15, then 25 or so IIRC) researchers around the world. A week before I had been contacted by the Fermilab outreach team, who were setting the thing up, and at that time I did not even know what a blog was!

read more

Matt Strassler The Standard Model More Deeply: The Magic Angle Nailed Down

In a previous post, I showed you that the Standard Model, armed with its special angle θw of approximately 30 degrees, does a pretty good job of predicting a whole host of processes in the Standard Model. I focused attention on the decays of the Z boson, but there were many more processes mentioned in the bonus section of that post.

But the predictions aren’t perfect. They’re not enough to convince a scientist that the Standard Model might be the whole story. So today let’s bring these predictions into better focus.

There are two major issues that we have to correct in order to make more precise predictions using the Standard Model:

  • In contrast to what I assumed in the last post, θw isn’t exactly 30 degrees (i.e. sin θw isn’t 1/2)
  • Although I ignored them so far, the strong nuclear force has small but important effects

But before we deal with these, we have to fix something with the experimental measurements themselves.

Knowledge and Uncertainty: At the Center of Science

No one complained — but everyone should have — that when I presented the experimental results in my previous post, I expressed them without the corresponding uncertainties. I did that to keep things simple. But it wasn’t professional. As every well-trained scientist knows, when you are comparing an experimental result to a theoretical prediction, the uncertainties, both experimental and theoretical, are absolutely essential in deciding whether your prediction works or not. So we have to discuss this glaring omission.

Here’s how to read typical experimental uncertainties (see Figure 1). Suppose a particle physicist says that a quantity is measured to be x ± y — for instance, that the top quark mass is measured to be 172.57± 0.29 GeV/c2. Usually (unless explicitly noted) that means that the true value has a 68% chance of lying between x-y and x+y — “within one standard deviation” — and a 95% chance of lying between x-2y and x+2y — “within two standard deviations.” (See Figure 1, where x and y are called  \mu and  \sigma .) The chance of the true value being more than two standard deviations away from x is about 5% — about 1/20. That’s not rare! It will happen several times if you make a hundred different measurements.

Figure 1: Experimental uncertainties corresponding to  \mu \pm \sigma , where  \mu is the “central value” and “ \sigma ” is a “standard deviation.”

But the chance of being more than three standard deviations away from x is a small fraction of a percent — as long as the cause is purely a statistical fluke — and that is indeed rare. (That said, one has to remember that big differences between prediction and measurement can also be due to an unforeseen measurement problem or feature. That won’t be an issue today.)
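If you want to play with this yourself, here is a tiny sketch of the arithmetic (the prediction values plugged in below are hypothetical, chosen only to illustrate the 68%/95% rule of thumb):

    import math

    def significance(measured, uncertainty, predicted):
        # Number of standard deviations between measurement and prediction, and the
        # two-sided probability of a fluke at least that large (normal distribution).
        n_sigma = abs(measured - predicted) / uncertainty
        p = math.erfc(n_sigma / math.sqrt(2))
        return n_sigma, p

    print(significance(172.57, 0.29, 172.30))   # ~0.9 sigma, p ~ 0.35
    print(significance(172.57, 0.29, 171.90))   # ~2.3 sigma, p ~ 0.02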

W Boson Decays, More Precisely

Let’s first look at W decays, where we don’t have the complication of θw , and see what happens when we account for the effect of the strong nuclear force and the impact of experimental uncertainties.

The strong nuclear force slightly increases the rate for the W boson to decay to any quark/anti-quark pair, by about 3%. This is due to the same effect discussed in the “Understanding the Remaining Discrepancy” and “Strength of a Force” sections of this post… though the effect here is a little smaller (as it decreases at shorter distances and higher energies.) This slightly increases the percentages for quarks and, to compensate, slightly reduces the percentages for the electron, muon and tau (the “leptons”).
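Here is a sketch of the counting behind these percentages (my own rough illustration; alpha_s = 0.12 is an assumed round number, not a value quoted in this post):

    import math

    alpha_s = 0.12
    quark_boost = 1 + alpha_s / math.pi   # the ~3-4% strong-force enhancement

    lepton = 1.0                          # weight of each of e nu, mu nu, tau nu
    quarks = 2 * 3 * quark_boost          # two quark doublets (ud, cs), three colors each
    total = 3 * lepton + quarks

    print("each lepton pair:", round(100 * lepton / total, 2), "%")   # ~10.8%
    print("all quark pairs: ", round(100 * quarks / total, 2), "%")   # ~67.5%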

In Figure 2 are shown predictions of the Standard Model for the probabilities of the W- boson’s various decays:

  • At left are the predictions made in the previous post.
  • At center are better predictions that account for the strong nuclear force.

(To do this properly, uncertainties on these predictions should also be provided. But I don’t think that doing so would add anything to this post, other than complications.) These predictions are then compared with the experimental measurements of several quantities, shown at right: certain combinations of these decays that are a little easier to measure are also shown. (The measurements and uncertainties are published by the Particle Data Group here.)

Figure 2: The decay probabilities for W bosons, showing the percentage of W bosons that decay to certain particles. Predictions are given both before (left) and after (center) accounting for effects of the strong nuclear force. Experimental results are given at right, showing all measurements that can be directly performed.

The predictions and measurements do not perfectly agree. But that’s fine; because of the uncertainties in the measurements, they shouldn’t perfectly agree! All of the differences are less than two standard deviations, except for the probability for decay of a W to a tau and its anti-neutrino. That deviation is less than three standard deviations — and as I noted, if you have enough measurements, you’ll occasionally get one that differs by more than two standard deviations. We still might wonder if something funny is up with the tau, but we don’t have enough evidence of that yet. Let’s see what the Z boson teaches us later.

In any case, to a physicist’s eye, there is no sign here of any notable disagreement between theory and experiment in these results. Within current uncertainties, the Standard Model correctly predicts the data.

Z Boson Decays, More Precisely

Now let’s do the same for the Z boson, but here we have three steps:

  • first, the predictions when we take sin θw = 1/2, as we did in the previous post;
  • second, the predictions when we take sin θw = 0.48;
  • third, the better predictions when we also include the effect of the strong nuclear force.

And again Figure 3 compares predictions with the data.

Figure 3: The decay probabilities for Z bosons, showing the percentage of Z bosons that decay to certain particles. Predictions are given (left to right) for sin θw = 0.5, for sin θw =0.48, and again sin θw = 0.48 with the effect of strong nuclear force accounted for. Experimental results are given at right, showing all measurements that can be directly performed.
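To get a feel for where these numbers come from, here is a rough sketch of the tree-level counting (an illustration only: alpha_s = 0.12 is an assumed round number, and many refinements that go into the real predictions are left out). Each fermion species contributes a weight proportional to N_c (gV^2 + gA^2), with gA = T3 and gV = T3 - 2 Q sin^2 θw, and the quark channels get the strong-force factor:

    import math

    def z_fractions(sin_theta_w, alpha_s=0.12):
        s2 = sin_theta_w ** 2
        qcd = 1 + alpha_s / math.pi

        def weight(T3, Q, colors, strong=1.0):
            gV, gA = T3 - 2 * Q * s2, T3
            return colors * (gV**2 + gA**2) * strong

        nu   = weight(+0.5,  0.0, 1)        # one neutrino species
        lep  = weight(-0.5, -1.0, 1)        # e, mu, or tau
        up   = weight(+0.5,  2/3, 3, qcd)   # u or c
        down = weight(-0.5, -1/3, 3, qcd)   # d, s, or b
        total = 3 * nu + 3 * lep + 2 * up + 3 * down
        return {"each charged lepton": lep / total,
                "all three neutrinos": 3 * nu / total,
                "all quarks": (2 * up + 3 * down) / total}

    for s in (0.50, 0.48):
        print(s, {name: round(100 * frac, 1) for name, frac in z_fractions(s).items()})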

You notice that some of the experimental measurements have extremely small uncertainties! This is especially true of the decays to electrons, to muons, to taus, and (collectively) to the three types of neutrinos. Let’s look at them closely.

If you look at the predictions with sin θw = 1/2 for the electrons, muons and taus, they are in disagreement with the measurements by a lot. For example, in Z decay to muons, the initial prediction differs from the data by 19 standard deviations!! Not even close. For sin θw = 0.48 but without accounting for the strong nuclear force, the disagreement drops to 11 standard deviations; still terrible. But once we account also for the strong nuclear force, the predictions agree with data to within 1 to 2 standard deviations for all three types of particles.

As for the decays to neutrinos, the three predictions differ by 16 standard deviations, 9 standard deviations, and… below 2 standard deviations.

My reaction, when this data came in in the 1990s, was “Wow.” I hope yours is similar. Such close matching of the Standard Model’s predictions with highly precise measurements is a truly stunning success.

Notice that the successful prediction requires three of the Standard Model’s forces: the mixture of the electromagnetic and weak nuclear forces given by the magic angle, with a small effect from the strong nuclear force. Said another way, all of the Standard Model’s particles except the Higgs boson and top quark play a role in Figs. 2 and 3. (The Higgs field, meanwhile, is secretly in the background, giving the W and Z bosons their masses and affecting the Z boson’s interactions with the other particles; and the top quark is hiding in the background too, since it can’t be removed without changing how the Z boson interacts with bottom quarks.) You can’t take any part of the Standard Model out without messing up these predictions completely.

Oh, and by the way, remember how the probability for W decay to a tau and a neutrino in Fig. 2 was off the prediction by more than two standard deviations? Well there’s nothing weird about the tau or the neutrinos in Fig. 3 — predictions and measurements agree just fine — and indeed, no numbers in Z decay differ from predictions by more than two standard deviations. As I said earlier, the expectation is that about one in every twenty measurements should differ from its true value by more than two standard deviations. Since we have over a dozen measurements in Figs. 2 and 3, it’s no surprise that one of them might be two standard deviations off… and so we can’t use that single disagreement as evidence that the Standard Model doesn’t work.

Asymmetries, Precisely

Let’s do one more case: one of the asymmetries that I mentioned in the bonus section of the previous post. Consider the forward-backward asymmetry shown in Fig. 4. Take all collisions in which an electron strikes a positron (the anti-particle of an electron) and turns into a muon and an anti-muon. Now compare the probability that the muon goes “forward” (roughly the direction that the electron is heading) to the probability that it goes “backward” (roughly the direction that the positron is heading.) If the two probabilities are equal, then the asymmetry would be zero; if the muon always went forward, the asymmetry would be 100%; if it always went backward, the asymmetry would be -100%.

Figure 4: In electron-positron collisions that make a muon/anti-muon pair, the forward-backward asymmetry compares the rate for “forward” production (where the muon travels roughly in the same direction as the electron) to “backward” production.

Asymmetries are special because the effect of the strong nuclear force cancels out of them completely, and so they only depend on sin θw. And this particular “leptonic forward-backward” asymmetry is an example with a special feature: if sin θw were exactly 1/2, this asymmetry for lepton production would be predicted to be exactly zero.

But the measured value of this asymmetry, while quite small (less than 2%), is definitely not zero, and so this is another confirmation that sin θw is not exactly 1/2. So let’s instead compare the prediction for this asymmetry using sin θw = 0.48, the choice that worked so well for the Z boson’s decays in Fig. 3, with the data.
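Here is the corresponding tree-level formula in a short sketch (an illustration, ignoring the smaller corrections that go into the full prediction): at the Z peak the leptonic asymmetry is A_FB = (3/4) A_e A_mu, with A_f = 2 gV gA / (gV^2 + gA^2).

    def forward_backward_asymmetry(sin_theta_w):
        s2 = sin_theta_w ** 2
        gV, gA = -0.5 + 2 * s2, -0.5        # charged-lepton couplings
        A = 2 * gV * gA / (gV**2 + gA**2)
        return 0.75 * A * A                 # A_FB = (3/4) A_e A_mu, with A_e = A_mu

    print(forward_backward_asymmetry(0.50))   # exactly 0: gV vanishes at sin(theta_w) = 1/2
    print(forward_backward_asymmetry(0.48))   # ~0.018, i.e. about 1.8%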

In Figure 5, the horizontal axis shows the lepton forward-backward asymmetry. The prediction of 1.8% that one obtains for sin θw = 0.48, widened slightly to cover 1.65% to 2.0%, which is what obtains for sin θw between 0.479 and 0.481, is shown in pink. The four open circles represent four measurements of the asymmetry by the four experiments that were located at the LEP collider; the dashes through the circles show the standard deviations on their measurements. The dark circle shows what one gets when one combines the four experiments’ data together, obtaining an even better statistical estimate: 1.71 ± 0.10%, the uncertainty being indicated both as the dash going through the solid circle and as the yellow band. Since the yellow band extends to just above 1.8%, we see that the data differs from the sin θw = 0.480 prediction (the center of the pink band) by less than one standard deviation… giving precise agreement of the Standard Model with this very small but well-measured asymmetry.

Figure 5: The data from four experiments at the LEP collider (open circles, with uncertainties shown as dashes), and the combination of their results (closed circle) giving an asymmetry of 1.70% with an uncertainty of ±0.10% (yellow bar.) The prediction of the Standard Model for sin θw between 0.479 and 0.481 is shown in pink; its central value of 1.8% is within one standard deviation of the data.

Predictions of other asymmetries show similar success, as do numerous other measurements.

The Big Picture

Successful predictions like these, especially ones in which both theory and experiment are highly precise, explain why particle physicists have such confidence in the Standard Model, despite its clear limitations.

What limitations of the Standard Model am I referring to? They are many, but one of them is simply that the Standard Model does not predict θw . No one can say why θw takes the value that it has, or whether the fact that it is close to 30 degrees is a clue to its origin or a mere coincidence. Instead, of the many measurements, we use a single one (such as one of the asymmetries) to extract its value, and then can predict many other quantities.

One thing I’ve neglected to do is to convey the complexity of the calculations that are needed to compare the Standard Model predictions to data. To carry out these computations much more carefully than I did in Figs. 2, 3 and 5, in order to make them as precise as the measurements, demands specialized knowledge and experience. (As an example of how tricky these computations can be: even defining what one means by sin θw can be ambiguous in precise enough calculations, and so one needs considerable expertise [which I do not have] to define it correctly and use that definition consistently.) So there are actually still more layers of precision that I could go into…!

But I think perhaps I’ve done enough to convince you that the Standard Model is a fortress. Sure, it’s not a finished construction. Yet neither will it be easily overthrown.

John PreskillFinding Ed Jaynes’s ghost

You might have heard of the conundrum “What do you give the man who has everything?” I discovered a variation on it last October: how do you celebrate the man who studied (nearly) everything? Physicist Edwin Thompson Jaynes impacted disciplines from quantum information theory to biomedical imaging. I almost wrote “theoretical physicist,” instead of “physicist,” but a colleague insisted that Jaynes had a knack for electronics and helped design experiments, too. Jaynes worked at Washington University in St. Louis (WashU) from 1960 to 1992. I’d last visited the university in 2018, as a newly minted postdoc collaborating with WashU experimentalist Kater Murch. I’d scoured the campus for traces of Jaynes like a pilgrim seeking a saint’s forelock or humerus. The blog post “Chasing Ed Jaynes’s ghost” documents that hunt.

I found his ghost this October.

Kater and colleagues hosted the Jaynes Centennial Symposium on a brilliant autumn day when the campus’s trees were still contemplating shedding their leaves. The agenda featured researchers from across the sciences and engineering. We described how Jaynes’s legacy has informed 21st-century developments in quantum information theory, thermodynamics, biophysics, sensing, and computation. I spoke about quantum thermodynamics and information theory—specifically, incompatible conserved quantities, about which my research-group members and I have blogged many times.

Irfan Siddiqi spoke about quantum technologies. An experimentalist at the University of California, Berkeley, Irfan featured on Quantum Frontiers seven years ago. His lab specializes in superconducting qubits, tiny circuits in which current can flow forever, without dissipating. How can we measure a superconducting qubit? We stick the qubit in a box. Light bounces back and forth across the box. The light interacts with the qubit while traversing it, in accordance with the Jaynes–Cummings model. We can’t seal any box perfectly, so some light will leak out. That light carries off information about the qubit. We can capture the light using a photodetector to infer the qubit’s state.

The first half of Jaynes–Cummings
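For readers who want to see the model itself, here is a small numerical sketch of the Jaynes–Cummings Hamiltonian for one qubit coupled to one cavity mode (made-up parameter values, units with hbar = 1): H = wc a†a + (wq/2) σz + g (a† σ− + a σ+).

    import numpy as np

    N = 10                            # photon-number cutoff for the cavity mode
    wc, wq, g = 5.0, 5.0, 0.1         # cavity frequency, qubit frequency, coupling

    a  = np.diag(np.sqrt(np.arange(1, N)), k=1)    # cavity annihilation operator
    sz = np.array([[1.0, 0.0], [0.0, -1.0]])       # qubit sigma_z
    sm = np.array([[0.0, 0.0], [1.0, 0.0]])        # qubit lowering operator

    H = (wc * np.kron(a.conj().T @ a, np.eye(2))
         + 0.5 * wq * np.kron(np.eye(N), sz)
         + g * (np.kron(a.conj().T, sm) + np.kron(a, sm.T)))

    # On resonance (wc = wq) the levels pair into doublets split by 2 g sqrt(n):
    print(np.round(np.sort(np.linalg.eigvalsh(H))[:5], 3))   # [-2.5, 2.4, 2.6, 7.359, 7.641]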

Bill Bialek, too, spoke about inference. But Bill is a Princeton biophysicist, so fruit flies preoccupy him more than qubits do. A fruit fly metamorphoses from a maggot that hatches from an egg. As the maggot develops, its cells differentiate: some form a head, some form a tail, and so on. Yet all the cells contain the same genetic information. How can a head ever emerge, to differ from a tail? 

A fruit-fly mother, Bill revealed, injects molecules into an egg at certain locations. These molecules diffuse across the egg, triggering the synthesis of more molecules. The knock-on molecules’ concentrations can vary strongly across the egg: a maggot’s head cells contain molecules at certain concentrations, and the tail cells contain the same molecules at other concentrations.

At this point in Bill’s story, I was ready to take my hat off to biophysicists for answering the question above, which I’ll rephrase here: if we find that a certain cell belongs to a maggot’s tail, why does the cell belong to the tail? But I enjoyed even more how Bill turned the question on its head (pun perhaps intended): imagine that you’re a maggot cell. How can you tell where in the maggot you are, to ascertain how to differentiate? Nature asks this question (loosely speaking), whereas human observers ask Bill’s first question.

To answer the second question, Bill recalled which information a cell accesses. Suppose you know four molecules’ concentrations: c_1, c_2, c_3, and c_4. How accurately can you predict the cell’s location? That is, what probability does the cell have of sitting at some particular site, conditioned on the cs? That probability is large only at one site, biophysicists have found empirically. So a cell can accurately infer its position from its molecules’ concentrations.

I’m no biophysicist (despite minor evidence to the contrary), but I enjoyed Bill’s story as I enjoyed Irfan’s. Probabilities, information, and inference are abstract notions; yet they impact physical reality, from insects to quantum science. This tension between abstraction and concreteness arrested me when I first encountered entropy, in a ninth-grade biology lecture. The tension drew me into information theory and thermodynamics. These toolkits permeate biophysics as they permeate my disciplines. So, throughout the symposium, I spoke with engineers, medical-school researchers, biophysicists, thermodynamicists, and quantum scientists. They all struck me as my kind of people, despite our distribution across the intellectual landscape. Jaynes reasoned about distributions—probability distributions—and I expect he’d have approved of this one. The man who studied nearly everything deserves a celebration that illuminates nearly everything.

December 22, 2024

Jordan EllenbergLive at the Lunchable

Much-needed new housing is going up on Madison’s north side where the Oscar Mayer plant used to stand, with more to come. The View and The Victoria will join other new apartment buildings in town, like Verve, and Chapter, and The Eastern. I think it would be a shame if the redevelopment failed to honor the greatest innovation Oscar Mayer ever devised at its Madison facility. There should be a luxury apartment building called The Lunchable.

December 21, 2024

John PreskillBeyond NISQ: The Megaquop Machine

On December 11, I gave a keynote address at the Q2B 2024 Conference in Silicon Valley. This is a transcript of my remarks. The slides I presented are here.

NISQ and beyond

I’m honored to be back at Q2B for the 8th year in a row.

The Q2B conference theme is “The Roadmap to Quantum Value,” so I’ll begin by showing a slide from last year’s talk. As best we currently understand, the path to economic impact is the road through fault-tolerant quantum computing. And that poses a daunting challenge for our field and for the quantum industry.

We are in the NISQ era. And NISQ technology already has noteworthy scientific value. But as of now there is no proposed application of NISQ computing with commercial value for which quantum advantage has been demonstrated when compared to the best classical hardware running the best algorithms for solving the same problems. Furthermore, currently there are no persuasive theoretical arguments indicating that commercially viable applications will be found that do not use quantum error-correcting codes and fault-tolerant quantum computing.

NISQ, meaning Noisy Intermediate-Scale Quantum, is a deliberately vague term. By design, it has no precise quantitative meaning, but it is intended to convey an idea: We now have quantum machines such that brute force simulation of what the quantum machine does is well beyond the reach of our most powerful existing conventional computers. But these machines are not error-corrected, and noise severely limits their computational power.

In the future we can envision FASQ* machines, Fault-Tolerant Application-Scale Quantum computers that can run a wide variety of useful applications, but that is still a rather distant goal. What term captures the path along the road from NISQ to FASQ? Various terms retaining the ISQ format of NISQ have been proposed [here, here, here], but I would prefer to leave ISQ behind as we move forward, so I’ll speak instead of a megaquop or gigaquop machine and so on meaning one capable of executing a million or a billion quantum operations, but with the understanding that mega means not precisely a million but somewhere in the vicinity of a million.

Naively, a megaquop machine would have an error rate per logical gate of order 10^{-6}, which we don’t expect to achieve anytime soon without using error correction and fault-tolerant operation. Or maybe the logical error rate could be somewhat larger, as we expect to be able to boost the simulable circuit volume using various error mitigation techniques in the megaquop era just as we do in the NISQ era. Importantly, the megaquop machine would be capable of achieving some tasks beyond the reach of classical, NISQ, or analog quantum devices, for example by executing circuits with of order 100 logical qubits and circuit depth of order 10,000.

What resources are needed to operate it? That depends on many things, but a rough guess is that tens of thousands of high-quality physical qubits could suffice. When will we have it? I don’t know, but if it happens in just a few years a likely modality is Rydberg atoms in optical tweezers, assuming they continue to advance in both scale and performance.

What will we do with it? I don’t know, but as a scientist I expect we can learn valuable lessons by simulating the dynamics of many-qubit systems on megaquop machines. Will there be applications that are commercially viable as well as scientifically instructive? That I can’t promise you.

The road to fault tolerance

To proceed along the road to fault tolerance, what must we achieve? We would like to see many successive rounds of accurate error syndrome measurement such that when the syndromes are decoded the error rate per measurement cycle drops sharply as the code increases in size. Furthermore, we want to decode rapidly, as will be needed to execute universal gates on protected quantum information. Indeed, we will want the logical gates to have much higher fidelity than physical gates, and for the logical gate fidelities to improve sharply as codes increase in size. We want to do all this at an acceptable overhead cost in both the number of physical qubits and the number of physical gates. And speed matters — the time on the wall clock for executing a logical gate should be as short as possible.

A snapshot of the state of the art comes from the Google Quantum AI team. Their recently introduced Willow superconducting processor has improved transmon lifetimes, measurement errors, and leakage correction compared to its predecessor Sycamore. With it they can perform millions of rounds of surface-code error syndrome measurement with good stability, each round lasting about a microsecond. Most notably, they find that the logical error rate per measurement round improves by a factor of 2 (a factor they call Lambda) when the code distance increases from 3 to 5 and again from 5 to 7, indicating that further improvements should be achievable by scaling the device further. They performed accurate real-time decoding for the distance 3 and 5 codes. To further explore the performance of the device they also studied the repetition code, which corrects only bit flips, out to a much larger code distance. As the hardware continues to advance we hope to see larger values of Lambda for the surface code, larger codes achieving much lower error rates, and eventually not just quantum memory but also logical two-qubit gates with much improved fidelity compared to the fidelity of physical gates.
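As a back-of-the-envelope illustration of what that scaling implies (a sketch with made-up starting numbers, not Google’s published figures), one can ask what code distance a constant Lambda would require to reach megaquop-level logical error rates:

    def distance_needed(error_at_d3, Lambda, target):
        # Assume the logical error rate per round drops by a factor Lambda each
        # time the code distance grows by 2, starting from the rate at distance 3.
        d, eps = 3, error_at_d3
        while eps > target:
            d += 2
            eps /= Lambda
        return d, eps

    # Illustrative inputs only:
    print(distance_needed(error_at_d3=3e-3, Lambda=2.0, target=1e-6))   # (27, ~7e-7)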

Last year I expressed concern about the potential vulnerability of superconducting quantum processors to ionizing radiation such as cosmic ray muons. In these events, errors occur in many qubits at once, too many errors for the error-correcting code to fend off. I speculated that we might want to operate a superconducting processor deep underground to suppress the muon flux, or to use less efficient codes that protect against such error bursts.

The good news is that the Google team has demonstrated that so-called gap engineering of the qubits can reduce the frequency of such error bursts by orders of magnitude. In their studies of the repetition code they found that, in the gap-engineered Willow processor, error bursts occurred about once per hour, as opposed to once every ten seconds in their earlier hardware.  Whether suppression of error bursts via gap engineering will suffice for running deep quantum circuits in the future is not certain, but this progress is encouraging. And by the way, the origin of the error bursts seen every hour or so is not yet clearly understood, which reminds us that not only in superconducting processors but in other modalities as well we are likely to encounter mysterious and highly deleterious rare events that will need to be understood and mitigated.

Real-time decoding

Fast real-time decoding of error syndromes is important because when performing universal error-corrected computation we must frequently measure encoded blocks and then perform subsequent operations conditioned on the measurement outcomes. If it takes too long to decode the measurement outcomes, that will slow down the logical clock speed. That may be a more serious problem for superconducting circuits than for other hardware modalities where gates can be orders of magnitude slower.

For distance 5, Google achieves a latency, meaning the time from when data from the final round of syndrome measurement is received by the decoder until the decoder returns its result, of about 63 microseconds on average. In addition, it takes about another 10 microseconds for the data to be transmitted via Ethernet from the measurement device to the decoding workstation. That’s not bad, but considering that each round of syndrome measurement takes only a microsecond, faster would be preferable, and the decoding task becomes harder as the code grows in size.

Riverlane and Rigetti have demonstrated in small experiments that the decoding latency can be reduced by running the decoding algorithm on FPGAs rather than CPUs, and by integrating the decoder into the control stack to reduce communication time. Adopting such methods may become increasingly important as we scale further. Google DeepMind has shown that a decoder trained by reinforcement learning can achieve a lower logical error rate than a decoder constructed by humans, but it’s unclear whether that will work at scale because the cost of training rises steeply with code distance. Also, the Harvard / QuEra team has emphasized that performing correlated decoding across multiple code blocks can reduce the depth of fault-tolerant constructions, but this also increases the complexity of decoding, raising concern about whether such a scheme will be scalable.

Trading simplicity for performance

The Google processors use transmon qubits, as do superconducting processors from IBM and various other companies and research groups. Transmons are the simplest superconducting qubits and their quality has improved steadily; we can expect further improvement with advances in materials and fabrication. But a logical qubit with very low error rate surely will be a complicated object due to the hefty overhead cost of quantum error correction. Perhaps it is worthwhile to fashion a more complicated physical qubit if the resulting gain in performance might actually simplify the operation of a fault-tolerant quantum computer in the megaquop regime or well beyond. Several versions of this strategy are being pursued.

One approach uses cat qubits, in which the encoded 0 and 1 are coherent states of a microwave resonator, well separated in phase space, such that the noise afflicting the qubit is highly biased. Bit flips are exponentially suppressed as the mean photon number of the resonator increases, while the error rate for phase flips induced by loss from the resonator increases only linearly with the photon number. This year the AWS team built a repetition code to correct phase errors for cat qubits that are passively protected against bit flips, and showed that increasing the distance of the repetition code from 3 to 5 slightly improves the logical error rate. (See also here.)

Another helpful insight is that error correction can be more effective if we know when and where the errors occur in a quantum circuit. We can apply this idea using a dual rail encoding of the qubits. With two microwave resonators, for example, we can encode a qubit by placing a single photon in either the first resonator (the 10 state) or the second resonator (the 01 state). The dominant error is loss of a photon, causing either the 01 or 10 state to decay to 00. One can check whether the state is 00, detecting whether the error occurred without disturbing a coherent superposition of 01 and 10. In a device built by the Yale / QCI team, loss errors are detected over 99% of the time and all undetected errors are relatively rare. Similar results were reported by the AWS team, encoding a dual-rail qubit in a pair of transmons instead of resonators.

Another idea is encoding a finite-dimensional quantum system in a state of a resonator that is highly squeezed in two complementary quadratures, a so-called GKP encoding. This year the Yale group used this scheme to encode 3-dimensional and 4-dimensional systems with decay rate better by a factor of 1.8 than the rate of photon loss from the resonator. (See also here.)

A fluxonium qubit is more complicated than a transmon in that it requires a large inductance which is achieved with an array of Josephson junctions, but it has the advantage of larger anharmonicity, which has enabled two-qubit gates with better than three 9s of fidelity, as the MIT team has shown.

Whether this trading of simplicity for performance in superconducting qubits will ultimately be advantageous for scaling to large systems is still unclear. But it’s appropriate to explore such alternatives which might pay off in the long run.

Error correction with atomic qubits

We have also seen progress on error correction this year with atomic qubits, both in ion traps and optical tweezer arrays. In these platforms qubits are movable, making it possible to apply two-qubit gates to any pair of qubits in the device. This opens the opportunity to use more efficient coding schemes, and in fact logical circuits are now being executed on these platforms. The Harvard / MIT / QuEra team sampled circuits with 48 logical qubits on a 280-qubit device; that big news broke during last year’s Q2B conference. Atom Computing and Microsoft ran an algorithm with 28 logical qubits on a 256-qubit device. Quantinuum and Microsoft prepared entangled states of 12 logical qubits on a 56-qubit device.

However, so far in these devices it has not been possible to perform more than a few rounds of error syndrome measurement, and the results rely on error detection and postselection. That is, circuit runs are discarded when errors are detected, a scheme that won’t scale to large circuits. Efforts to address these drawbacks are in progress. Another concern is that the atomic movement slows the logical cycle time. If all-to-all coupling enabled by atomic movement is to be used in much deeper circuits, it will be important to speed up the movement quite a lot.

Toward the megaquop machine

How can we reach the megaquop regime? More efficient quantum codes like those recently discovered by the IBM team might help. These require geometrically nonlocal connectivity and are therefore better suited for Rydberg optical tweezer arrays than superconducting processors, at least for now. Error mitigation strategies tailored for logical circuits, like those pursued by Qedma, might help by boosting the circuit volume that can be simulated beyond what one would naively expect based on the logical error rate. Recent advances from the Google team, which reduce the overhead cost of logical gates, might also be helpful.

What about applications? Impactful applications to chemistry typically require rather deep circuits so are likely to be out of reach for a while yet, but applications to materials science provide a more tempting target in the near term. Taking advantage of symmetries and various circuit optimizations like the ones Phasecraft has achieved, we might start seeing informative results in the megaquop regime or only slightly beyond.

As a scientist, I’m intrigued by what we might conceivably learn about quantum dynamics far from equilibrium by doing simulations on megaquop machines, particularly in two dimensions. But when seeking quantum advantage in that arena we should bear in mind that classical methods for such simulations are also advancing impressively, including in the past year (for example, here and here).

To summarize, advances in hardware, control, algorithms, error correction, error mitigation, etc. are bringing us closer to megaquop machines, raising a compelling question for our community: What are the potential uses for these machines? Progress will require innovation at all levels of the stack.  The capabilities of early fault-tolerant quantum processors will guide application development, and our vision of potential applications will guide technological progress. Advances in both basic science and systems engineering are needed. These are still the early days of quantum computing technology, but our experience with megaquop machines will guide the way to gigaquops, teraquops, and beyond and hence to widely impactful quantum value that benefits the world.

I thank Dorit Aharonov, Sergio Boixo, Earl Campbell, Roland Farrell, Ashley Montanaro, Mike Newman, Will Oliver, Chris Pattison, Rob Schoelkopf, and Qian Xu for helpful comments.

*The acronym FASQ was suggested to me by Andrew Landahl.

The megaquop machine (image generated by ChatGPT).

n-Category Café Random Permutations (Part 14)

I want to go back over something from Part 11, but in a more systematic and self-contained way.

Namely, I want to prove a wonderful known fact about random permutations, the Cycle Length Lemma, using a bit of category theory. The idea here is that the number of $k$-cycles in a random permutation of $n$ things is a random variable. Then comes a surprise: in the limit as $n \to \infty$, this random variable approaches a Poisson distribution with mean $1/k$. And even better, for different choices of $k$ these random variables become independent in the $n \to \infty$ limit.

I’m stating these facts roughly now, to not get bogged down. But I’ll state them precisely, prove them, and categorify them. That is, I’ll state equations involving random variables — but I’ll prove that these equations come from equivalences of groupoids!

First I’ll state the Cycle Length Lemma, which summarizes a lot of interesting facts about random permutations. Then I’ll state and prove a categorified version of the Cycle Length Lemma, which asserts an equivalence of groupoids. Then I’ll derive the original version of the lemma from this categorified version by taking the cardinalities of these groupoids. The categorified version contains more information, so it’s not just a trick for proving the original lemma.

What do groupoids have to do with random permutations? You’ll see, but it’s an example of ‘principle of indifference’, especially in its modern guise, called the ‘principle of transformation groups’: the idea that outcomes related by a symmetry should have the same probability. This sets up a connection between groupoids and probability theory — and as we’ll see, we can “go down” from groupoids to probabilities using the theory of groupoid cardinalities.

The Cycle Length Lemma

In the theory of random permutations, we treat the symmetric group $S_n$ as a probability measure space where each element has the same measure, namely $1/n!$. Functions $f \colon S_n \to \mathbb{R}$ then become random variables, and we can study their expected values:

$$ E(f) = \frac{1}{n!} \sum_{\sigma \in S_n} f(\sigma). $$

An important example is the function

$$ C_k \colon S_n \to \mathbb{N} $$

that counts, for any permutation $\sigma \in S_n$, its number of cycles of length $k$, also called $k$-cycles. A well-known but striking fact about random permutations is that whenever $k \le n$, the expected number of $k$-cycles is $1/k$:

$$ E(C_k) = \frac{1}{k} $$

For example, a random permutation of any finite set has, on average, one fixed point!

Another striking fact is that whenever $j \ne k$ and $j + k \le n$, so that it’s possible for a permutation $\sigma \in S_n$ to have both a $j$-cycle and a $k$-cycle, the random variables $C_j$ and $C_k$ are uncorrelated in the following sense:

$$ E(C_j C_k) = E(C_j) E(C_k). $$

You might at first think that having lots of $j$-cycles for some large $j$ would tend to inhibit the presence of $k$-cycles for some other large value of $k$, but that’s not true unless $j + k \gt n$, when it suddenly becomes impossible to have both a $j$-cycle and a $k$-cycle!

These two facts are special cases of the Cycle Length Lemma. To state this lemma in full generality, recall that the number of ordered $p$-tuples of distinct elements of an $n$-element set is the falling power

$$ n^{\underline{p}} = n(n-1)(n-2) \, \cdots \, (n-p+1). $$

It follows that the function

$$ C_k^{\underline{p}} \colon S_n \to \mathbb{N} $$

counts, for any permutation in $S_n$, its ordered $p$-tuples of distinct $k$-cycles. We can also replace the word ‘distinct’ here by ‘disjoint’, without changing the meaning, since distinct cycles must be disjoint.

The two striking facts mentioned above generalize as follows:

1) First, whenever $p k \le n$, so that it is possible for a permutation in $S_n$ to have $p$ distinct $k$-cycles, then

$$ E(C_k^{\underline{p}}) = \frac{1}{k^p}. $$

If you know about the moments of a Poisson distribution, here’s a nice equivalent way to state this equation: when $p k \le n$, the $p$th moment of the random variable $C_k$ equals that of a Poisson distribution with mean $1/k$.
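As an aside (mine, not from the original post), the Poisson comparison can be checked directly from the standard formula for the falling moments of a Poisson random variable $X$ with mean $\lambda$:

$$ E(X^{\underline{p}}) \;=\; \sum_{m \ge p} \frac{m!}{(m-p)!} \, e^{-\lambda} \frac{\lambda^m}{m!} \;=\; \lambda^p \sum_{j \ge 0} e^{-\lambda} \frac{\lambda^{j}}{j!} \;=\; \lambda^p, $$

so taking $\lambda = 1/k$ gives $E(X^{\underline{p}}) = 1/k^p$, matching the formula for $E(C_k^{\underline{p}})$ above.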

2) Second, the random variables $C_k$ are better and better approximated by independent Poisson distributions. To state this precisely we need a bit of notation. Let $\vec{p}$ denote an $n$-tuple $(p_1, \dots, p_n)$ of natural numbers, and let

$$ |\vec{p}| = p_1 + 2p_2 + \cdots + n p_n. $$

If $|\vec{p}| \le n$, it is possible for a permutation $\sigma \in S_n$ to have a collection of distinct cycles, with $p_1$ cycles of length 1, $p_2$ cycles of length 2, and so on up to $p_n$ cycles of length $n$. If $|\vec{p}| \gt n$, this is impossible. In the former case, where $|\vec{p}| \le n$, we always have

$$ E\left( \prod_{k=1}^n C_k^{\underline{p}_k} \right) = \prod_{k=1}^n E( C_k^{\underline{p}_k} ). $$

Taken together, 1) and 2) are equivalent to the Cycle Length Lemma, which may be stated in a unified way as follows:

Cycle Length Lemma. Suppose $p_1, \dots, p_n \in \mathbb{N}$. Then

$$ E\left( \prod_{k=1}^n C_k^{\underline{p}_k} \right) = \left\{ \begin{array}{cc} \displaystyle{ \prod_{k=1}^n \frac{1}{k^{p_k}} } & \mathrm{if} \; |\vec{p}| \le n \\ \\ 0 & \mathrm{if} \; |\vec{p}| \gt n \end{array} \right. $$

This appears, for example, in Ford’s course notes on random permutations and the statistical properties of prime numbers [Lemma 1.1, F]. The most famous special case is when $|\vec{p}| = n$. Apparently this goes back to Cauchy, but I don’t know where he proved it. I believe he would have phrased it in terms of counting permutations, not probabilities.

I won’t get into details of precisely the sense in which the random variables $C_k$ approach independent Poisson distributions. For that, see Arratia and Tavaré [AT].
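Before turning to the categorified version, here is a quick brute-force numerical check of the lemma for small $n$ (a Python sketch of my own, purely illustrative and not part of the argument; all function names are mine):

# Brute-force check of the Cycle Length Lemma for small n, by enumerating all of S_n.
from itertools import permutations
from math import factorial, prod

def cycle_counts(perm):
    """Return a dict c with c[k] = number of k-cycles of perm (a tuple on 0..n-1)."""
    n, seen, counts = len(perm), [False] * len(perm), {}
    for start in range(n):
        if not seen[start]:
            length, j = 0, start
            while not seen[j]:
                seen[j] = True
                j = perm[j]
                length += 1
            counts[length] = counts.get(length, 0) + 1
    return counts

def falling(c, p):
    """The falling power c(c-1)...(c-p+1); equals 1 when p = 0."""
    return prod(c - t for t in range(p))

def lhs(n, p):
    """E( prod_k C_k^{falling p_k} ), computed by averaging over all of S_n."""
    total = sum(
        prod(falling(cycle_counts(s).get(k, 0), p[k - 1]) for k in range(1, n + 1))
        for s in permutations(range(n))
    )
    return total / factorial(n)

def rhs(n, p):
    """The right-hand side of the Cycle Length Lemma."""
    weight = sum(k * pk for k, pk in enumerate(p, start=1))   # this is |p|
    return prod(1 / k**pk for k, pk in enumerate(p, start=1)) if weight <= n else 0.0

n = 6
for p in [(1, 0, 0, 0, 0, 0), (2, 2, 0, 0, 0, 0), (0, 1, 0, 1, 0, 0), (0, 0, 0, 0, 0, 2)]:
    print(p, lhs(n, p), rhs(n, p))

For $n = 6$ the two columns agree, including the last case, where $|\vec{p}| = 12 \gt 6$ and both sides vanish.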

The Categorified Cycle Length Lemma

To categorify the Cycle Length Lemma, the key is to treat a permutation as an extra structure that we can put on a set, and then consider the groupoid of $n$-element sets equipped with this extra structure:

Definition. Let $\mathsf{Perm}(n)$ be the groupoid in which

  • an object is an $n$-element set $X$ equipped with a permutation $\sigma \colon X \to X$

and

  • a morphism from $\sigma \colon X \to X$ to $\sigma' \colon X' \to X'$ is a bijection $f \colon X \to X'$ that is permutation-preserving in the following sense:

$$ f \circ \sigma \circ f^{-1} = \sigma'. $$

We’ll need this strange fact below: if $n \lt 0$ then $\mathsf{Perm}(n)$ is the empty groupoid (that is, the groupoid with no objects and no morphisms).

More importantly, we’ll need a fancier groupoid where a set is equipped with a permutation together with a list of distinct cycles of specified lengths. For any $n \in \mathbb{N}$ and any $n$-tuple of natural numbers $\vec{p} = (p_1, \dots, p_n)$, recall that we have defined

$$ |\vec{p}| = p_1 + 2p_2 + \cdots + n p_n. $$

Definition. Let $\mathsf{A}_{\vec{p}}$ be the groupoid of $n$-element sets $X$ equipped with a permutation $\sigma \colon X \to X$ that is in turn equipped with a choice of an ordered $p_1$-tuple of distinct 1-cycles, an ordered $p_2$-tuple of distinct 2-cycles, and so on up to an ordered $p_n$-tuple of distinct $n$-cycles. A morphism in this groupoid is a bijection that is permutation-preserving and also preserves the ordered tuples of distinct cycles.

Note that if $|\vec{p}| \gt n$, no choice of disjoint cycles with the specified property exists, so $\mathsf{A}_{\vec{p}}$ is the empty groupoid.

Finally, we need a bit of standard notation. For any group $G$ we write $\mathsf{B}(G)$ for its delooping: that is, the groupoid that has one object $\star$ and $\mathrm{Aut}(\star) = G$.

The Categorified Cycle Length Lemma. For any $\vec{p} = (p_1, \dots, p_n) \in \mathbb{N}^n$ we have

$$ \mathsf{A}_{\vec{p}} \simeq \mathsf{Perm}(n - |\vec{p}|) \; \times \; \prod_{k = 1}^n \mathsf{B}(\mathbb{Z}/k)^{p_k} $$

Proof. Both sides are empty groupoids when $n - |\vec{p}| \lt 0$, so assume $n - |\vec{p}| \ge 0$. A groupoid is equivalent to any full subcategory of that groupoid containing at least one object from each isomorphism class. So, fix an $n$-element set $X$ and a subset $Y \subseteq X$ with $n - |\vec{p}|$ elements. Partition $X - Y$ into subsets $S_{k \ell}$ where $S_{k \ell}$ has cardinality $k$, $1 \le k \le n$, and $1 \le \ell \le p_k$. Every object of $\mathsf{A}_{\vec{p}}$ is isomorphic to the chosen set $X$ equipped with some permutation $\sigma \colon X \to X$ that has each subset $S_{k \ell}$ as a $k$-cycle. Thus $\mathsf{A}_{\vec{p}}$ is equivalent to its full subcategory containing only objects of this form.

An object of this form consists of an arbitrary permutation $\sigma_Y \colon Y \to Y$ and a cyclic permutation $\sigma_{k \ell} \colon S_{k \ell} \to S_{k \ell}$ for each $k, \ell$ as above. Consider a second object of this form, say $\sigma'_Y \colon Y \to Y$ equipped with cyclic permutations $\sigma'_{k \ell}$. Then a morphism from the first object to the second consists of two pieces of data. First, a bijection

$$ f \colon Y \to Y $$

such that

$$ \sigma'_Y = f \circ \sigma_Y \circ f^{-1}. $$

Second, for each $k, \ell$ as above, bijections

$$ f_{k \ell} \colon S_{k \ell} \to S_{k \ell} $$

such that

$$ \sigma'_{k \ell} = f_{k \ell} \circ \sigma_{k \ell} \circ f_{k \ell}^{-1}. $$

Since $Y$ has $n - |\vec{p}|$ elements, while $\sigma_{k \ell}$ and $\sigma'_{k \ell}$ are cyclic permutations of $k$-element sets, it follows that $\mathsf{A}_{\vec{p}}$ is equivalent to

$$ \mathsf{Perm}(n - |\vec{p}|) \; \times \; \prod_{k = 1}^n \mathsf{B}(\mathbb{Z}/k)^{p_k}. $$       ▮

The case where $|\vec{p}| = n$ is especially pretty, since then our chosen cycles completely fill up our $n$-element set and we have

$$ \mathsf{A}_{\vec{p}} \simeq \prod_{k = 1}^n \mathsf{B}(\mathbb{Z}/k)^{p_k}. $$

Groupoid Cardinality

The cardinality of finite sets has a natural extension to finite groupoids, and this turns out to be the key to extracting results on random permutations from category theory. Let’s briefly recall the idea of ‘groupoid cardinality’ [BD,BHW]. Any finite groupoid $\mathsf{G}$ is equivalent to a coproduct of finitely many one-object groupoids, which are deloopings of finite groups $G_1, \dots, G_m$:

$$ \mathsf{G} \simeq \sum_{i = 1}^m \mathsf{B}(G_i), $$

and then the cardinality of $\mathsf{G}$ is defined to be

$$ |\mathsf{G}| = \sum_{i = 1}^m \frac{1}{|G_i|}. $$

This concept of groupoid cardinality has various nice properties. For example it’s additive:

$$ |\mathsf{G} + \mathsf{H}| = |\mathsf{G}| + |\mathsf{H}| $$

and multiplicative:

$$ |\mathsf{G} \times \mathsf{H}| = |\mathsf{G}| \times |\mathsf{H}| $$

and invariant under equivalence of groupoids:

$$ \mathsf{G} \simeq \mathsf{H} \implies |\mathsf{G}| = |\mathsf{H}|. $$

But none of these three properties require that we define $|\mathsf{G}|$ as the sum of the reciprocals of the cardinalities $|G_i|$: any other power of these cardinalities would work just as well. What makes the reciprocal cardinalities special is that if $G$ is a finite group acting on a set $S$, we have

$$ |S \sslash G| = |S|/|G| $$

where the groupoid $S \sslash G$ is the weak quotient or homotopy quotient of $S$ by $G$, also called the action groupoid. This is the groupoid with elements of $S$ as objects and one morphism from $s$ to $s'$ for each $g \in G$ with $g s = s'$, with composition of morphisms coming from multiplication in $G$.
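As a tiny illustrative check of this formula (my own aside, not from the post), one can compute the cardinality of an action groupoid directly as a sum of $1/|\mathrm{Aut}(s)|$ over one representative $s$ per orbit and compare it with $|S|/|G|$; that the two agree is just the orbit-stabilizer theorem applied orbit by orbit. In Python, for $S_3$ acting on a three-element set:

# Toy check of |S // G| = |S|/|G| for the action groupoid (illustration only).
from itertools import permutations
from fractions import Fraction

S = [0, 1, 2]
G = list(permutations(range(3)))               # S_3; a permutation g sends s to g[s]

def orbit(s):
    return {g[s] for g in G}

seen, cardinality = set(), Fraction(0)
for s in S:
    if s not in seen:
        seen |= orbit(s)
        stabilizer = [g for g in G if g[s] == s]   # the automorphism group of s in S // G
        cardinality += Fraction(1, len(stabilizer))

print(cardinality, Fraction(len(S), len(G)))   # both equal 1/2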

The groupoid of $n$-element sets equipped with a permutation, $\mathsf{Perm}(n)$, has a nice description in terms of weak quotients:

Lemma. For all $n \in \mathbb{N}$ we have an equivalence of groupoids

$$ \mathsf{Perm}(n) \simeq S_n \sslash S_n $$

where the group $S_n$ acts on the underlying set of $S_n$ by conjugation.

Proof. We use the fact that $\mathsf{Perm}(n)$ is equivalent to any full subcategory of $\mathsf{Perm}(n)$ containing at least one object from each isomorphism class. For $\mathsf{Perm}(n)$ we can get such a subcategory by fixing an $n$-element set, say $X = \{1,\dots,n\}$, and taking only objects of the form $\sigma \colon X \to X$, i.e. $\sigma \in S_n$. A morphism from $\sigma \in S_n$ to $\sigma' \in S_n$ is then a permutation $\tau \in S_n$ such that

$$ \sigma' = \tau \sigma \tau^{-1}. $$

But this subcategory is precisely $S_n \sslash S_n$.       ▮

Corollary. For all $n \in \mathbb{N}$ we have

$$ |\mathsf{Perm}(n)| = 1 $$

Proof. We have $|\mathsf{Perm}(n)| = |S_n \sslash S_n| = |S_n|/|S_n| = 1$.       ▮

It should now be clear why we can prove results on random permutations using the groupoid $\mathsf{Perm}(n)$: this groupoid is equivalent to $S_n \sslash S_n$, a groupoid with one object for each permutation $\sigma \in S_n$, and with each object contributing $1/n!$ to the groupoid cardinality.

Now let us use this idea to derive the original Cycle Length Lemma from the categorified version.

Cycle Length Lemma. Suppose $p_1, \dots, p_n \in \mathbb{N}$. Then

$$ E\left( \prod_{k=1}^n C_k^{\underline{p}_k} \right) = \left\{ \begin{array}{cc} \displaystyle{ \prod_{k=1}^n \frac{1}{k^{p_k}} } & \mathrm{if} \; |\vec{p}| \le n \\ \\ 0 & \mathrm{if} \; |\vec{p}| \gt n \end{array} \right. $$

Proof. We know that

$$ \mathsf{A}_{\vec{p}} \simeq \mathsf{Perm}(n - |\vec{p}|) \; \times \; \prod_{k = 1}^n \mathsf{B}(\mathbb{Z}/k)^{p_k} $$

So, to prove the Cycle Length Lemma it suffices to show three things:

$$ |\mathsf{A}_{\vec{p}}| = E\left( \prod_{k=1}^n C_k^{\underline{p}_k} \right) $$

$$ |\mathsf{Perm}(n - |\vec{p}|)| = \left\{ \begin{array}{cc} 1 & \mathrm{if} \; |\vec{p}| \le n \\ \\ 0 & \mathrm{if} \; |\vec{p}| \gt n \end{array} \right. $$

and

$$ |\mathsf{B}(\mathbb{Z}/k)| = 1/k $$

The last of these is immediate from the definition of groupoid cardinality. The second follows from the Corollary above, together with the fact that $\mathsf{Perm}(n - |\vec{p}|)$ is the empty groupoid when $|\vec{p}| \gt n$. Thus we are left needing to show that

$$ |\mathsf{A}_{\vec{p}}| = E\left( \prod_{k=1}^n C_k^{\underline{p}_k} \right). $$

We prove this by computing the cardinality of a groupoid equivalent to $\mathsf{A}_{\vec{p}}$. This groupoid is of the form

$$ Q(\vec{p}) \sslash S_n $$

where $Q(\vec{p})$ is a set on which $S_n$ acts. As a result we have

$$ |\mathsf{A}_{\vec{p}}| = |Q(\vec{p}) \sslash S_n| = |Q(\vec{p})| / n! $$

and to finish the proof we will need to show

$$ E\left( \prod_{k=1}^n C_k^{\underline{p}_k} \right) = |Q(\vec{p})| / n!. $$

What is the set $Q(\vec{p})$, and how does $S_n$ act on this set? An element of $Q(\vec{p})$ is a permutation $\sigma \in S_n$ equipped with an ordered $p_1$-tuple of distinct 1-cycles, an ordered $p_2$-tuple of distinct 2-cycles, and so on up to an ordered $p_n$-tuple of distinct $n$-cycles. Any element $\tau \in S_n$ acts on $Q(\vec{p})$ in a natural way, by conjugating the permutation $\sigma \in S_n$ to obtain a new permutation, and mapping the chosen cycles of $\sigma$ to the corresponding cycles of this new permutation $\tau \sigma \tau^{-1}$.

Recalling the definition of the groupoid $\mathsf{A}_{\vec{p}}$, it is clear that any element of $Q(\vec{p})$ gives an object of $\mathsf{A}_{\vec{p}}$, and any object is isomorphic to one of this form. Furthermore any permutation $\tau \in S_n$ gives a morphism between such objects, all morphisms between such objects are of this form, and composition of these morphisms is just multiplication in $S_n$. It follows that

$$ \mathsf{A}_{\vec{p}} \simeq Q(\vec{p}) \sslash S_n. $$

To finish the proof, note that

$$ E\left( \prod_{k=1}^n C_k^{\underline{p}_k} \right) $$

is $1/n!$ times the number of ways of choosing a permutation $\sigma \in S_n$ and equipping it with an ordered $p_1$-tuple of distinct 1-cycles, an ordered $p_2$-tuple of distinct 2-cycles, and so on. This is the same as $|Q(\vec{p})| / n!$.       ▮

References

[AT] Richard Arratia and Simon Tavaré, The cycle structure of random permutations, The Annals of Probability (1992), 1567–1591.

[BD] John C. Baez and James Dolan, From finite sets to Feynman diagrams, in Mathematics Unlimited—2001 and Beyond, vol. 1, eds. Björn Engquist and Wilfried Schmid, Springer, Berlin, 2001, pp. 29–50.

[BHW] John C. Baez, Alexander E. Hoffnung and Christopher D. Walker, Higher-dimensional algebra VII: groupoidification, Theory and Applications of Categories 24 (2010), 489–553.

[F] Kevin Ford, Anatomy of Integers and Random Permutations—Course Lecture Notes.

December 20, 2024

Matt von HippelHow Small Scales Can Matter for Large Scales

For a certain type of physicist, nothing matters more than finding the ultimate laws of nature for its tiniest building-blocks, the rules that govern quantum gravity and tell us where the other laws of physics come from. But because they know very little about those laws at this point, they can predict almost nothing about observations on the larger distance scales we can actually measure.

“Almost nothing” isn’t nothing, though. Theoretical physicists don’t know nature’s ultimate laws. But some things about them can be reasonably guessed. The ultimate laws should include a theory of quantum gravity. They should explain at least some of what we see in particle physics now, explaining why different particles have different masses in terms of a simpler theory. And they should “make sense”, respecting cause and effect, the laws of probability, and Einstein’s overall picture of space and time.

All of these are assumptions, of course. Further assumptions are needed to derive any testable consequences from them. But a few communities in theoretical physics are willing to take the plunge, and see what consequences their assumptions have.

First, there’s the Swampland. String theorists posit that the world has extra dimensions, which can be curled up in a variety of ways to hide from view, with different observable consequences depending on how the dimensions are curled up. This list of different observable consequences is referred to as the Landscape of possibilities. Based on that, some string theorists coined the term “Swampland” to represent an area outside the Landscape, containing observations that are incompatible with quantum gravity altogether, and tried to figure out what those observations would be.

In principle, the Swampland includes the work of all the other communities on this list, since a theory of quantum gravity ought to be consistent with other principles as well. In practice, people who use the term focus on consequences of gravity in particular. The earliest such ideas argued from thought experiments with black holes, finding results that seemed to demand that gravity be the weakest force for at least one type of particle. Later researchers would more frequently use string theory as an example, looking at what kinds of constructions people had been able to make in the Landscape to guess what might lie outside of it. They’ve used this to argue that dark energy might be temporary, and to try to figure out what traits new particles might have.

Second, I should mention naturalness. When talking about naturalness, people often use the analogy of a pen balanced on its tip. While possible in principle, it must have been set up almost perfectly, since any small imbalance would cause it to topple, and that perfection demands an explanation. Similarly, in particle physics, things like the mass of the Higgs boson and the strength of dark energy seem to be carefully balanced, so that a small change in how they were set up would lead to a much heavier Higgs boson or much stronger dark energy. The need for an explanation for the Higgs’ careful balance is why many physicists expected the Large Hadron Collider to discover additional new particles.

As I’ve argued before, this kind of argument rests on assumptions about the fundamental laws of physics. It assumes that the fundamental laws explain the mass of the Higgs, not merely by giving it an arbitrary number but by showing how that number comes from a non-arbitrary physical process. It also assumes that we understand well how physical processes like that work, and what kinds of numbers they can give. That’s why I think of naturalness as a type of argument, much like the Swampland, that uses the smallest scales to constrain larger ones.

Third is a host of constraints that usually go together: causality, unitarity, and positivity. Causality comes from cause and effect in a relativistic universe. Because two distant events can appear to happen in different orders depending on how fast you’re going, any way to send signals faster than light is also a way to send signals back in time, causing all of the paradoxes familiar from science fiction. Unitarity comes from quantum mechanics. If quantum calculations are supposed to give the probability of things happening, those probabilities should make sense as probabilities: for example, they should never go above one.

You might guess that almost any theory would satisfy these constraints. But if you extend a theory to the smallest scales, some theories that otherwise seem sensible end up failing this test. Actually linking things up takes other conjectures about the mathematical form theories can have, conjectures that seem more solid than the ones underlying Swampland and naturalness constraints but that still can’t be conclusively proven. If you trust the conjectures, you can derive restrictions, often called positivity constraints when they demand that some set of observations is positive. There has been a renaissance in this kind of research over the last few years, including arguments that certain speculative theories of gravity can’t actually work.

Doug NatelsonTechnological civilization and losing object permanence

In the grand tradition of physicists writing about areas outside their expertise, I wanted to put down some thoughts on a societal trend.  This isn't physics or nanoscience, so feel free to skip this post.

Object permanence is a term from developmental psychology.  A person (or animal) has object permanence if they understand that something still exists even if they can't directly see it or interact with it in the moment.  If a kid realizes that their toy still exists even though they can't see it right now, they've got the concept.  

I'm wondering if modern technological civilization has an issue with an analog of object permanence.  Let me explain what I mean, why it's a serious problem, and end on a hopeful note by pointing out that even if this is the case, we have the tools needed to overcome this.

By the standards of basically any previous era, a substantial fraction of humanity lives in a golden age.  We have a technologically advanced, globe-spanning civilization.  A lot of people (though geographically very unevenly distributed) have grown up with comparatively clean water; comparatively plentiful food available through means other than subsistence agriculture; electricity; access to radio, television, and for the last couple of decades nearly instant access to communications and a big fraction of the sum total of human factual knowledge.  

Whether it's just human nature or a consequence of relative prosperity, there seems to be some timescale on the order of a few decades over which a non-negligible fraction of even the most fortunate seem to forget the hard lessons that got us to this point.  If they haven't seen something with their own eyes or experienced it directly, they decide it must not be a real issue.  I'm not talking about Holocaust deniers or conspiracy theorists who think the moon landings were fake.  There are a bunch of privileged people who have never personally known a time when tens of thousands of their neighbors died from childhood disease (you know, like 75 years ago, when 21,000 Americans were paralyzed every year from polio (!), proportionately like 50,000 today), who now think we should get rid of vaccines, and maybe germs aren't real.  Most people alive today were not alive the last time nuclear weapons were used, so some of them argue that nuclear weapons really aren't that bad (e.g. setting off 2000 one megaton bombs spread across the US would directly destroy less than 5% of the land area, so we're good, right?).  Or, we haven't had massive bank runs in the US since the 1930s, so some people now think that insuring bank deposits is a waste of resources and should stop.  I'll stop the list here, before veering into even more politically fraught territory.  I think you get my point, though - somehow chunks of modern society seem to develop collective amnesia, as if problems that we've never personally witnessed must have been overblown before or don't exist at all.  (Interestingly, this does not seem to happen for most technological problems.  You don't see many people saying, you know, maybe building fires weren't that big a deal, let's go back to the good old days before smoke alarms and sprinklers.)  

While the internet has downsides, including the ability to spread disinformation very effectively, all the available and stored knowledge also has an enormous benefit:  It should make it much harder than ever before for people to collectively forget the achievements of our species.  Sanitation, pasteurization, antibiotics, vaccinations - these are absolutely astonishing technical capabilities that were hard-won and have saved many millions of lives.  It's unconscionable that we are literally risking mass death by voluntarily forgetting or ignoring that.  Nuclear weapons are, in fact, terrible.  Insuring bank deposits with proper supervision of risk is a key factor that has helped stabilize economies for the last century.  We need to remember historical problems and their solutions, and make sure that the people setting policy are educated about these things.  They say that those who cannot remember the past are doomed to repeat it.  As we look toward the new year, I hope that those who are familiar with the hard-earned lessons of history are able to make themselves heard over the part of the populace who simply don't believe that old problems were real and could return.



Jordan EllenbergThree ways to apply AI to mathematics

If you wanted to train a machine to play Go, there are three ways you could do it, at decreasing levels of “from-scratchness.”

You could tell the machine the rules of the game, and have it play many millions of games against itself; your goal is to learn a function that does a good job assigning a value to a game state, and you evaluate such a function by seeing how often it wins in this repeated arena. This is an oversimplified account of what AlphaGo does.

Or you could have the machine try to learn a state function from some database of games actually played by expert human competitors — those games would be entered in some formal format and the machine would try to learn to imitate those expert players. Of course, you could then combine this with simulated internal games as in the first step, but you’d be starting with a leg up from accumulated human knowledge.

The third way would be to train on every natural-language book ever written about Go and try to produce natural-language response to natural-language questions that just tells you what to do.

I don’t actually care about Go, but I do care about math, and I think all three of these approaches have loose analogues as we ask what it might look like for machines to help mathematicians. The first, “from scratch” approach, is the side of things I’ve worked on in projects like PatternBoost and FunSearch. (OK, maybe FunSearch has aspects of both method 1 and method 2.) Here you actively try to keep prior human knowledge away from the machine, because you want to see what it can do on its own.

The second approach is where I’d put formalized proof. If we try to train a machine to get from one assertion to another using a chain of proven theorems in a formal system like Lean, we’re starting from a high platform: a huge repository of theorems and even more importantly definitions which guide the machine along channels which people have already figured out are rich in meaning. AlphaProof is like this.

The third approach is more like what GPT o1 is doing — asking whether you can circumvent the formal language entirely and just generate text which kindles mathematical insight in the mind of the human reader.

I think all of these are reasonable things to try. I guess my own mostly unjustified prejudice is that the first approach is the one that has the most to teach us about what the scope of machine learning actually is, while the second is the one that will probably end up being most useful in practice. The third? So far I think it doesn’t work. I don’t think it couldn’t work. But I also don’t think it’s on an obvious trajectory towards working, if words like “trajectory” even make sense in this context. At some point I’ll post an o1-preview dialogue which I found very illuminating in this respect.

Clifford JohnsonA Long Goodbye

I've been very quiet here over the last couple of weeks. My mother, Delia Maria Johnson, already in hospital since 5th November or so, took a turn for the worse and began a rapid decline. She died peacefully after some days, and to be honest I’ve really not been myself since then.

My mother Delia at a wedding in 2012

There's an extra element to the sense of loss when (as it approaches) you are powerless to do anything because of being thousands of miles away. On the plus side, because of the ease of using video calls, and with the help of my sister being there, I was able to be somewhat present during what turned out to be the last moments when she was aware of people around her, and therefore was able to tell her I loved her one last time.

Rather than charging across the world on planes, trains, and in automobiles, probably being out of reach during any significant changes in the situation (the doctors said I would likely not make it in time) I did a number of things locally that I am glad I got to do.

It began with visiting (and sending a photo from) the Santa Barbara mission, a place she dearly loved and was unable to visit again after 2019, along with the pier. These are both places we walked together so much back when I first lived here in what feels like another life.

Then, two nights before mum passed away, but well after she’d seemed already beyond reach of anyone, although perhaps (I’d like to think) still able to hear things, my sister contacted me from her bedside asking if I’d like to read mum a psalm, perhaps one of her favourites, 23 or 91. At first I thought she was already planning the funeral, and expressed my surprise at this since mum was still alive and right next to her. But I’d misunderstood, and she’d in fact had a rather great idea. This suggestion turned into several hours of, having sent on recordings of the two psalms, my digging into the poetry shelf in the study and discovering long neglected collections through which I searched (sometimes accompanied by my wife and son) for additional things to read. I recorded some and sent them along, as well as one from my son, I’m delighted to say. Later, the whole thing turned into me singing various songs while playing my guitar and sending recordings of those along too.

Incidentally, the guitar-playing was an interesting turn of events since not many months ago I decided after a long lapse to start playing guitar again, and try to move the standard of my playing (for vocal accompaniment) to a higher level than I’d previously done, by playing and practicing for a little bit on a regular basis. I distinctly recall thinking at one point during one practice that it would be nice to play for mum, although I did not imagine that playing to her while she was on her actual death-bed would be the circumstance under which I’d eventually play for her, having (to my memory) never directly done so back when I used to play guitar in my youth. (Her overhearing me picking out bits of Queen songs behind my room door when I was a teenager doesn’t count as direct playing for her.)

Due to family circumstances I’ll perhaps go into another time...

The post A Long Goodbye appeared first on Asymptotia.

December 19, 2024

Jordan EllenbergMakes no sense at all (an Orioles post)

The Orioles signed Tyler O’Neill, who is really good at hitting left-handers, for three years at $16.5m a year, and Tomoyuki Sugano, a 35-year-old pitcher with pinpoint control but whose fastball is down to 94 and doesn’t strike people out anymore, for one year at $13m. We are talking about trading for Dylan Cease, who’s a free agent in 2026, and who was very good last year but just good the year before that.

So the Orioles are adding payroll. We’re not cheaply Marlinning our way through the next few seasons. But unless reporting is badly wrong, it doesn’t look like they’re adding payroll by offering Corbin Burnes the $250m/8 year deal he’s likely to draw elsewhere.

There are teams for which this course of action would make sense. Probably most teams. For instance: if you had big holes at several spots, places where you were getting replacement-level performance, it makes a lot of sense to add 3-WAR guys for those spots.

But the Orioles aren’t that team. They have solid regulars all around the diamond. Unless Tyler O’Neill is playing only against lefties, he’s taking at-bats from Heston Kjerstad and (the probably gone) Anthony Santander, not Stevie Wilkerson. And Sugano is taking starts from Dean Kremer, not Asher Wojciechowski.

It also makes sense to focus on short-term deals if you have a one-year window because your best players are about to hit free agency. But the Orioles aren’t that team either. Adley Rutschman, the senior statesman of the team, isn’t a free agent until 2028.

In fact, if you were to imagine a team that should open the wallet and sign a huge contract for a free-agent starter, you’d say “it would have to be a team that had a good young core with multiple years of team control, so that at least four out of those eight years you’re paying for an ace to lead a contender. And you’d want the rotation to have enough depth to make the team a credible playoff threat every year, but lacking a guy with reliable #1 stuff.” And that is the team the Orioles are.

Jordan EllenbergTwenty years ago I encountered Francis Bacon

Well, he’d already been dead a while; I mean I encountered his paintings, some of which are of people who appear to have been dead a while. I just read a short book about him which is why he’s on my mind. I went to the Musée Maillol in Paris — it was a trip to Paris Prof. Dr. Mrs. Q and I took after I finished visiting Emmanuel Kowalski in Bordeaux — and saw a special exhibition of Bacon’s paintings. Probably the last time I encountered a new artist and felt a shock of recognition and affinity. Maybe the last time it will happen?

Oh, I have blogged about that exhibition once before, in opposition to Jed Perl.

Terence TaoOn the distribution of eigenvalues of GUE and its minors at fixed index

I’ve just uploaded to the arXiv the paper “On the distribution of eigenvalues of GUE and its minors at fixed index“. This is a somewhat technical paper establishing some estimates regarding one of the most well-studied random matrix models, the Gaussian Unitary Ensemble (GUE), that were not previously in the literature, but which will be needed for some forthcoming work of Hariharan Narayanan on the limiting behavior of “hives” with GUE boundary conditions (building upon our previous joint work with Sheffield).

For sake of discussion we normalize the GUE model to be the random {N \times N} Hermitian matrix {H} whose probability density function is proportional to {e^{-\mathrm{tr} H^2}}. With this normalization, the famous Wigner semicircle law will tell us that the eigenvalues {\lambda_1 \leq \dots \leq \lambda_N} of this matrix will almost all lie in the interval {[-\sqrt{2N}, \sqrt{2N}]}, and after dividing by {\sqrt{2N}}, will asymptotically be distributed according to the semicircle distribution

\displaystyle  \rho_{\mathrm{sc}}(x) := \frac{2}{\pi} (1-x^2)_+^{1/2}.

In particular, the normalized {i^{th}} eigenvalue {\lambda_i/\sqrt{2N}} should be close to the classical location {\gamma_{i/N}}, where {\gamma_{i/N}} is the unique element of {[-1,1]} such that

\displaystyle  \int_{-\infty}^{\gamma_{i/N}} \rho_{\mathrm{sc}}(x)\ dx = \frac{i}{N}.

Eigenvalues can be described by their index {i} or by their (normalized) energy {\lambda_i/\sqrt{2N}}. In principle, the two descriptions are related by the classical map {i \mapsto \gamma_{i/N}} defined above, but there are microscopic fluctuations from the classical location that create subtle technical difficulties between “fixed index” results in which one focuses on a single index {i} (and neighboring indices {i+1, i-1}, etc.), and “fixed energy” results in which one focuses on a single energy {x} (and eigenvalues near this energy). The phenomenon of eigenvalue rigidity does give some control on these fluctuations, allowing one to relate “averaged index” results (in which the index {i} ranges over a mesoscopic range) with “averaged energy” results (in which the energy {x} is similarly averaged over a mesoscopic interval), but there are technical issues in passing back from averaged control to pointwise control, either for the index or energy.

We will be mostly concerned in the bulk region where the index {i} is in an interval of the form {[\delta n, (1-\delta)n]} for some fixed {\delta>0}, or equivalently the energy {x} is in {[-1+c, 1-c]} for some fixed {c > 0}. In this region it is natural to introduce the normalized eigenvalue gaps

\displaystyle  g_i := \sqrt{N/2} \rho_{\mathrm{sc}}(\gamma_{i/N}) (\lambda_{i+1} - \lambda_i).

The semicircle law predicts that these gaps {g_i} have mean close to {1}; however, due to the aforementioned fluctuations around the classical location, this type of claim is only easy to establish in the “fixed energy”, “averaged energy”, or “averaged index” settings; the “fixed index” case was only achieved by myself as recently as 2013, where I showed that each such gap in fact asymptotically had the expected distribution of the Gaudin law, using manipulations of determinantal processes. A significantly more general result, avoiding the use of determinantal processes, was subsequently obtained by Erdos and Yau.
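As a quick numerical illustration of these normalizations (my own sanity check, not part of the paper), one can sample a GUE matrix with density proportional to exp(-tr H^2) using numpy and verify that the normalized bulk gaps defined above have mean close to 1:

# Illustration only: sample GUE with density ~ exp(-tr H^2) and check that the
# normalized bulk gaps g_i = sqrt(N/2) * rho_sc(gamma_{i/N}) * (lambda_{i+1} - lambda_i)
# have mean close to 1.
import numpy as np

rng = np.random.default_rng(0)
N = 1000
# Diagonal entries ~ N(0, 1/2); off-diagonal real and imaginary parts ~ N(0, 1/4).
A = rng.normal(scale=np.sqrt(0.5), size=(N, N)) + 1j * rng.normal(scale=np.sqrt(0.5), size=(N, N))
H = (A + A.conj().T) / 2
evals = np.linalg.eigvalsh(H)                    # sorted ascending

def rho_sc(x):                                   # semicircle density on [-1, 1]
    return (2 / np.pi) * np.sqrt(np.clip(1 - x**2, 0, None))

# Classical locations gamma_{i/N}, by inverting the semicircle CDF on a grid.
grid = np.linspace(-1, 1, 20001)
cdf = np.cumsum(rho_sc(grid)) * (grid[1] - grid[0])
cdf /= cdf[-1]
i = np.arange(1, N)                              # the gap g_i sits between lambda_i and lambda_{i+1}
gamma = np.interp(i / N, cdf, grid)

g = np.sqrt(N / 2) * rho_sc(gamma) * np.diff(evals)
bulk = (i > 0.1 * N) & (i < 0.9 * N)             # restrict to the bulk
print(g[bulk].mean())                            # close to 1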

However, these results left open the possibility of bad tail behavior at extremely large or small values of the gaps {g_i}; in particular, moments of the {g_i} were not directly controlled by previous results. The first result of the paper is to push the determinantal analysis further, and obtain such results. For instance, we obtain moment bounds

\displaystyle  \mathop{\bf E} g_i^p \ll_p 1

for any fixed {p > 0}, as well as an exponential decay bound

\displaystyle  \mathop{\bf P} (g_i > h) \ll \exp(-h/4)

for {0 < h \ll \log\log N}, and a lower tail bound

\displaystyle  \mathop{\bf P} (g_i \leq h) \ll h^{2/3} \log^{1/2} \frac{1}{h}

for any {h>0}. We also obtain good control on sums {g_i + \dots + g_{i+m-1}} of {m} consecutive gaps for any fixed {m}, showing that this sum has mean {m + O(\log^{4/3} (2+m))} and variance {O(\log^{7/3} (2+m))}. (This is significantly less variance than one would expect from a sum of {m} independent random variables; this variance reduction phenomenon is closely related to the eigenvalue rigidity phenomenon alluded to earlier, and reflects the tendency of eigenvalues to repel each other.)

A key point in these estimates is that no factors of {\log N} occur in the estimates, which is what one would obtain if one tried to use existing eigenvalue rigidity theorems. (In particular, if one normalized the eigenvalues {\lambda_i} at the same scale at the gap {g_i}, they would fluctuate by a standard deviation of about {\sqrt{\log N}}; it is only the gaps between eigenvalues that exhibit much smaller fluctuation.) On the other hand, the dependence on {h} is not optimal, although it was sufficient for the applications I had in mind.

As with my previous paper, the strategy is to try to replace fixed index events such as {g_i > h} with averaged energy events. For instance, if {g_i > h} and {i} has classical location {x}, then there is an interval of normalized energies {t} of length {\gg h}, with the property that there are precisely {N-i} eigenvalues to the right of {f_x(t)} and no eigenvalues in the interval {[f_x(t), f_x(t+h/2)]}, where

\displaystyle  f_x(t) = \sqrt{2N}( x + \frac{t}{N \rho_{\mathrm{sc}}(x)})

is an affine rescaling to the scale of the eigenvalue gap. So matters soon reduce to controlling the probability of the event

\displaystyle  (N_{x,t} = N-i) \wedge (N_{x,t,h/2} = 0)

where {N_{x,t}} is the number of eigenvalues to the right of {f_x(t)}, and {N_{x,t,h/2}} is the number of eigenvalues in the interval {[f_x(t), f_x(t+h/2)]}. These are fixed energy events, and one can use the theory of determinantal processes to control them. For instance, each of the random variables {N_{x,t}}, {N_{x,t,h/2}} separately have the distribution of sums of independent Boolean variables, which are extremely well understood. Unfortunately, the coupling is a problem; conditioning on the event {N_{x,t} = N-i}, in particular, affects the distribution of {N_{x,t,h/2}}, so that it is no longer the sum of independent Boolean variables. However, it is still a mixture of such sums, and with this (and the Plancherel-Rotach asymptotics for the GUE determinantal kernel) it is possible to proceed and obtain the above estimates after some calculation.

For the intended application to GUE hives, it is important to not just control gaps {g_i} of the eigenvalues {\lambda_i} of the GUE matrix {M}, but also the gaps {g'_i} of the eigenvalues {\lambda'_i} of the top left {N-1 \times N-1} minor {M'} of {M}. This minor of a GUE matrix is basically again a GUE matrix, so the above theorem applies verbatim to the {g'_i}; but it turns out to be necessary to control the joint distribution of the {g_i} and {g'_i}, and also of the interlacing gaps {\tilde g_i} between the {\lambda_i} and {\lambda'_i}. For fixed energy, these gaps are in principle well understood, due to previous work of Adler-Nordenstam-van Moerbeke and of Johansson-Nordenstam which show that the spectrum of both matrices is asymptotically controlled by the Boutillier bead process. This also gives averaged energy and averaged index results without much difficulty, but to get to fixed index information, one needs some universality result in the index {i}. For the gaps {g_i} of the original matrix, such a universality result is available due to the aforementioned work of Erdos and Yau, but this does not immediately imply the corresponding universality result for the joint distribution of {g_i} and {g'_i} or {\tilde g_i}. For this, we need a way to relate the eigenvalues {\lambda_i} of the matrix {M} to the eigenvalues {\lambda'_i} of the minors {M'}. By a standard Schur’s complement calculation, one can obtain the equation

\displaystyle a_{NN} - \lambda_i - \sum_{j=1}^{N-1}\frac{|X_j|^2}{\lambda'_j - \lambda_i} = 0

for all {i}, where {a_{NN}} is the bottom right entry of {M}, and {X_1,\dots,X_{N-1}} are complex gaussians independent of {\lambda'_j}. This gives a random system of equations to solve for {\lambda_i} in terms of {\lambda'_j}. Using the previous bounds on eigenvalue gaps (particularly the concentration results for sums of consecutive gaps), one can localize this equation to the point where a given {\lambda_i} is mostly controlled by a bounded number of nearby {\lambda'_j}, and hence a single gap {g_i} is mostly controlled by a bounded number of {g'_j}. From this, it is possible to leverage the existing universality result of Erdos and Yau to obtain universality of the joint distribution of {g_i} and {g'_i} (or of {\tilde g_i}). (The result can also be extended to more layers of the minor process than just two, as long as the number of minors is held fixed.)

This at last brings us to the final result of the paper, which is the one that is actually needed for the application to GUE hives. Here, one is interested in controlling the variance of a linear combination {\sum_{l=1}^m a_l \tilde g_{i+l}} of a fixed number {m} of consecutive interlacing gaps {\tilde g_{i+l}}, where the {a_l} are arbitrary deterministic coefficients. An application of the triangle and Cauchy-Schwarz inequalities, combined with the previous moment bounds on gaps, shows that this random variable has variance {\ll m \sum_{l=1}^m |a_l|^2}. However, this bound is not expected to be sharp, due to the expected decay of correlations between eigenvalue gaps. In this paper, I improve the variance bound to

\displaystyle  \ll_A \frac{m}{\log^A(2+m)} \sum_{l=1}^m |a_l|^2

for any {A>0}, which is what was needed for the application.

This improvement reflects some decay in the covariances between distant interlacing gaps {\tilde g_i, \tilde g_{i+h}}. I was not able to establish such decay directly. Instead, using some Fourier analysis, one can reduce matters to studying the case of modulated linear statistics such as {\sum_{l=1}^m e(\xi l) \tilde g_{i+l}} for various frequencies {\xi}. In “high frequency” cases one can use the triangle inequality to reduce matters to studying the original eigenvalue gaps {g_i}, which can be handled by a (somewhat complicated) determinantal process calculation, after first using universality results to pass from fixed index to averaged index, thence to averaged energy, then to fixed energy estimates. For low frequencies the triangle inequality argument is unfavorable, and one has to instead use the determinantal kernel of the full minor process, and not just an individual matrix. This requires some classical, but tedious, calculation of certain asymptotics of sums involving Hermite polynomials.

The full argument is unfortunately quite complex, but it seems that the combination of having to deal with minors, as well as fixed indices, places this result out of reach of many prior methods.

December 17, 2024

Matt Strassler Science Book of The Year (!?!)

Well, gosh… what nice news as 2024 comes to a close… My book has received a ringing endorsement from Ethan Siegel, the science writer and Ph.D. astrophysicist who hosts the well-known, award-winning blog “Starts with a Bang”. Siegel’s one of the most reliable and prolific science writers around — he writes for BigThink and has published in Forbes, among others — and it’s a real honor to read what he’s written about Waves in an Impossible Sea.

His brief review serves as an introduction to an interview that he conducted with me recently, which I think many of you will enjoy. We discussed science — the nature of particles/wavicles, the Higgs force, the fabric (if there is one) of the universe, and the staying power of the idea of supersymmetry among many theoretical physicists — and science writing, including novel approaches to science communication that I used in the book.

If you’re a fan of this blog or of the book, please consider sharing his review on social media (as well as the Wall Street Journal’s opinion.) The book has sold well this year, but I am hoping that in 2025 it will reach an even broader range of people who seek a better understanding of the cosmos, both in the large and in the small.

Andrew JaffeDiscovering Japan

My old friend Marc Weidenbaum, curator and writer of disquiet.com, reminded me, in his latest post, of the value of blogging. So, here I am (again).

Since September, I have been on sabbatical in Japan, working mostly at QUP (International Center for Quantum-field Measurement Systems for Studies of the Universe and Particles) at the KEK accelerator lab in Tsukuba, Japan, and spending time as well at the Kavli IPMU, about halfway into Tokyo from here. Tsukuba is a “science city” about 30 miles northeast of Tokyo, home to multiple Japanese scientific establishments (such as a University and a major lab for JAXA, the Japanese space agency).

Scientifically, I’ve spent a lot of time thinking and talking about the topology of the Universe, future experiments to measure the cosmic microwave background, and statistical tools for cosmology experiments. And I was honoured to be asked to deliver a set of lectures on probability and statistics in cosmology, a topic which unites most of my research interests nowadays.

Japan, and Tsukuba in particular, is a very nice place to live. It’s close enough to Tokyo for regular visits (by the rapid Tsukuba Express rail line), but quiet enough for our local transport to be dominated by cycling around town. We love the food, the Japanese schools that have welcomed our children, the onsens, and our many views of Mount Fuji.

Fuji with buildings

Fuji through windows

And after almost four months in Japan, it’s beginning to feel like home.

Unfortunately, we’re leaving our short-term home in Japan this week. After a few weeks of travel in Southeast Asia, we’ll be decamped to the New York area for the rest of the Winter and early Spring. But (as further encouragement to myself to continue blogging) I’ll have much more to say about Japan — science and life — in upcoming posts.

Jordan EllenbergDumbass Corners

Resolved: the intersection of Regent, Highland, and Speedway in front of Madison West HS, the worst four-way stop in the city of Madison and possibly anywhere, is to be renamed Dumbass Corners, since it is essentially impossible to navigate it without witnessing an act of vehicular dumbassery.

December 15, 2024

Doug NatelsonItems for discussion, including google's latest quantum computing result

As we head toward the end of the calendar year, a few items:

  • Google published a new result in Nature a few days ago.  This made a big news splash, including this accompanying press piece from google themselves, this nice article in Quanta, and the always thoughtful blog post by Scott Aaronson.  The short version:  Physical qubits as made today in the superconducting platform favored by google don't have the low error rates that you'd really like if you want to run general quantum algorithms on a quantum computer, which could certainly require millions of steps.  The hope of the community is to get around this using quantum error correction, where some number of physical qubits are used to function as one "logical" qubit.  If physical qubit error rates are sufficiently low, and these errors can be corrected with enough efficacy, the logical qubits can function better than the physical qubits, ideally being able to undergo sequential operations indefinitely without degradation of their information.   One technique for this is called a surface code.  Google have implemented this in their most recent 105-physical-qubit chip ("Willow"), and they seem to have crossed a huge threshold:  When they increase the size of their correction scheme (going from a 3 (physical qubit) \(\times\) 3 (physical qubit) to 5 \(\times\) 5 to 7 \(\times\) 7), the error rates of the resulting logical qubits fall as hoped.  This is a big deal, as it implies that larger chips, if they could be implemented, should scale toward the desired performance (a sketch of the standard scaling heuristic appears after this list).  This does not mean that general purpose quantum computers are just around the corner, but it's very encouraging.  There are many severe engineering challenges still in place.  For example, the present superconducting qubits must be tweaked and tuned.  The reason google only has 105 of them on the Willow chip is not that they can't fit more - it's that they have to have wires and control capacity to tune and run them.  A few thousand really good logical qubits would be needed to break RSA encryption, and there is no practical way to put millions of wires down a dilution refrigerator.  Rather, one will need cryogenic control electronics.
  • On a closely related point, google's article talks about how it would take a classical computer ten septillion years to do what its Willow chip can do.  This is based on a very particular choice of problem (as I mentioned here five years ago) called random circuit sampling, looking at the statistical properties of the outcome of applying random gate sequences to a quantum computer.  From what I can tell, this is very different than what most people mean when they think of a problem to benchmark a quantum computer's advantage over a classical computer.  I suspect the typical tech-literate person considering quantum computing wants to know, if I ask a quantum computer and a classical computer to factor huge numbers or do some optimization problem, how much faster is the quantum computer for a given size of problem?  Random circuit sampling feels instead much more to me like comparing an experiment to a classical theory calculation.  For a purely classical analog, consider putting an airfoil in a windtunnel and measuring turbulent flow, and comparing with a computational fluids calculation.  Yes, the windtunnel can get you an answer very quickly, but it's not "doing" a calculation, from my perspective.  This doesn't mean random circuit sampling is a poor benchmark, just that people should understand it's rather different from the kind of quantum/classical comparison they may envision.
  • On one unrelated note:  Thanks to a timely inquiry from a reader, I have now added a search bar to the top of the blog.  (Just in time to capture the final decline of science blogging?)
  • On a second unrelated note:  I'd be curious to hear from my academic readers on how they are approaching generative AI, both on the instructional side (e.g., should we abandon traditional assignments and take-home exams?  How do we check to see if students are really learning vs. becoming dependent on tools that have dubious reliability?) and on the research side (e.g., what level of generative AI tool use is acceptable in paper or proposal writing?  What aspects of these tools are proving genuinely useful to PIs?  To students?  Clearly generative AI's ability to help with coding is very nice indeed!)
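Returning to the first bullet: the claimed behavior with increasing code size follows the textbook surface-code heuristic, in which the logical error rate per cycle falls roughly like A / Lambda^((d+1)/2) once the physical error rate is below threshold, so each increase of the code distance d by 2 divides the error rate by about Lambda. Here is a minimal sketch with made-up numbers (my illustration, not Google's analysis or data):

# Illustration only: the standard surface-code suppression heuristic,
# epsilon_d ~ A / Lambda**((d + 1) / 2), with hypothetical parameters.
A = 0.03          # hypothetical prefactor
Lambda = 2.1      # hypothetical suppression factor (>1 means below threshold)

def logical_error_rate(d):
    return A / Lambda ** ((d + 1) / 2)

for d in (3, 5, 7, 9, 11):
    print(f"d = {d:2d}: logical error rate per cycle = {logical_error_rate(d):.2e}")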

December 13, 2024

Matt von HippelWhich String Theorists Are You Complaining About?

Do string theorists have an unfair advantage? Do they have an easier time getting hired, for example?

In one of the perennial arguments about this on Twitter, Martin Bauer posted a bar chart of faculty hires in the US by sub-field. The chart was compiled by Erich Poppitz from data in the US particle physics rumor mill, a website where people post information about who gets hired where for the US’s quite small number of permanent theoretical particle physics positions at research universities and national labs. The data covers 1994 to 2017, and shows one year, 1999, when there were more string theorists hired than all other topics put together. The years around then also had many string theorists hired, but the proportion starts falling around the mid 2000’s…around when Lee Smolin wrote a book, The Trouble With Physics, arguing that string theorists had strong-armed their way into academic dominance. After that, the percentage of string theorists falls, oscillating between a tenth and a quarter of total hires.

Judging from that, you get the feeling that string theory’s critics are treating a temporary hiring fad as if it was a permanent fact. The late 1990’s were a time of high-profile developments in string theory that excited a lot of people. Later, other hiring fads dominated, often driven by experiments: I remember when the US decided to prioritize neutrino experiments and neutrino theorists had a much easier time getting hired, and there seem to be similar pushes now with gravitational waves, quantum computing, and AI.

Thinking about the situation in this way, though, ignores what many of the critics have in mind. That’s because the “string” column on that bar chart is not necessarily what people think of when they think of string theory.

If you look at the categories on Poppitz’s bar chart, you’ll notice something odd. “String” is itself a category. Another category, “lattice”, refers to lattice QCD, a method to find the dynamics of quarks numerically. The third category, though, is a combination of three things: “ph/th/cosm”.

“Cosm” here refers to cosmology, another sub-field. “Ph” and “th” though aren’t really sub-fields. Instead, they’re arXiv categories, sections of the website arXiv.org where physicists post papers before they submit them to journals. The “ph” category is used for phenomenology, the type of theoretical physics where people try to propose models of the real world and make testable predictions. The “th” category is for “formal theory”, papers where theoretical physicists study the kinds of theories they use in more generality and develop new calculation methods, with insights that over time filter into “ph” work.

“String”, on the other hand, is not an arXiv category. When string theorists write papers, they’ll put them into “th” or “ph” or another relevant category (for example “gr-qc”, for general relativity and quantum cosmology). This means that when Poppitz distinguishes “ph/th/cosm” from “string”, he’s being subjective, using his own judgement to decide who counts as a string theorist.

So who counts as a string theorist? The simplest thing to do would be to check if their work uses strings. Failing that, they could use other tools of string theory and its close relatives, like Calabi-Yau manifolds, M-branes, and holography.

That might be what Poppitz was doing, but if he was, he was probably missing a lot of the people critics of string theory complain about. He even misses many people who describe themselves as string theorists. In an old post of mine I go through the talks at Strings, string theory’s big yearly conference, giving them finer-grained categories. The majority don’t use anything uniquely stringy.

Instead, I think critics of string theory have two kinds of things in mind.

First, most of the people who made their reputations on string theory are still in academia, and still widely respected. Some of them still work on string theory topics, but many now work on other things. Because they’re still widely respected, their interests have a substantial influence on the field. When one of them starts looking at connections between theories of two-dimensional materials, you get a whole afternoon of talks at Strings about theories of two-dimensional materials. Working on those topics probably makes it a bit easier to get a job, but also, many of the people working on them are students of these highly respected people, who just because of that have an easier time getting a job. If you’re a critic of string theory who thinks the founders of the field led physics astray, then you probably think they’re still leading physics astray even if they aren’t currently working on string theory.

Second, for many other people in physics, string theorists are their colleagues and friends. They’ll make fun of trends that seem overhyped and under-thought, like research on the black hole information paradox or the swampland, or hopes that a slightly tweaked version of supersymmetry will show up soon at the LHC. But they’ll happily use ideas developed in string theory when they prove handy, using supersymmetric theories to test new calculation techniques, string theory’s extra dimensions to inspire and ground new ideas for dark matter, or the math of strings themselves as interesting shortcuts to particle physics calculations. String theory is available as a reference to these people in a way that other quantum gravity proposals aren’t. That’s partly due to familiarity and shared language (I remember a talk at Perimeter where string theorists wanted to learn from practitioners from another area and the discussion got bogged down by how they were using the word “dimension”), but partly due to skepticism of the various alternate approaches. Most people have some idea in their heads of deep problems with various proposals: screwing up relativity, making nonsense out of quantum mechanics, or over-interpreting limited evidence. The most commonly believed criticisms are usually wrong, with objections long known to practitioners of the alternate approaches, and so those people tend to think they’re being treated unfairly. But the wrong criticisms are often simplified versions of correct criticisms, passed down by the few people who dig deeply into these topics, criticisms that the alternative approaches don’t have good answers to.

The end result is that while string theory itself isn’t dominant, a sort of “string friendliness” is. Most of the jobs aren’t going to string theorists in the literal sense. But the academic world string theorists created keeps turning. People still respect string theorists and the research directions they find interesting, and people are still happy to collaborate and discuss with string theorists. For research communities people are more skeptical of, it must feel very isolating, like the world is still being run by their opponents. But this isn’t the kind of hegemony that can be solved by a revolution. Thinking that string theory is a failed research program, and people focused on it should have a harder time getting hired, is one thing. Thinking that everyone who respects at least one former string theorist should have a harder time getting hired is a very different goal. And if what you’re complaining about is “string friendliness”, not actual string theorists, then that’s what you’re asking for.

John BaezMartianus Capella

In 1543, Nicolaus Copernicus published a book arguing that the Earth revolves around the Sun: De revolutionibus orbium coelestium.

This is sometimes painted as a sudden triumph of rationality over the foolish yet long-standing belief that the Sun and all the planets revolve around the Earth. As usual, this triumphalist narrative is oversimplified. In the history of science, everything is always more complicated than you think.

First, Aristarchus had come up with a heliocentric theory way back around 250 BC. While Copernicus probably didn’t know all the details, he did know that Aristarchus said the Earth moves. Copernicus mentioned this in an early unpublished version of De revolutionibus.

Copernicus also had some precursors in the Middle Ages, though it’s not clear whether he was influenced by them.

In the 1300’s, the philosopher Jean Buridan argued that the Earth might not be at the center of the Universe, and that it might be rotating. He claimed—correctly in the first case, and only a bit incorrectly in the second—that there’s no real way to tell. But he pointed out that it makes more sense to have the Earth rotating than have the Sun, Moon, planets and stars all revolving around it, because

it is easier to move a small thing than a large one.

In 1377 Nicole Oresme continued this line of thought, making the same points in great detail, only to conclude by saying

Yet everyone holds, and I think myself, that the heavens do move and not the Earth, for “God created the orb of the Earth, which will not be moved” [Psalms 104:5], notwithstanding the arguments to the contrary.

Everyone seems to take this last-minute reversal of views at face value, but I have trouble believing he really meant it. Maybe he wanted to play it safe with the Church. I think I detect a wry sense of humor, too.

Martianus Capella

I recently discovered another fascinating precursor of Copernicus’ heliocentric theory: a theory that is neither completely geocentric nor completely heliocentric! And that’s what I want to talk about today.

Sometime between 410 and 420 AD, Martianus Capella came out with a book saying Mercury and Venus orbit the Sun, while the other planets orbit the Earth!


This picture is from a much later book by the German astronomer Valentin Naboth, in 1573. But it illustrates Capella’s theory—and as we’ll see, his theory was rather well-known in western Europe starting in the 800s.

First of all, take a minute to think about how reasonable this theory is. Mercury and Venus are the two planets closer to the Sun than we are. So, unlike the other planets, we can never possibly see them more than 90° away from the Sun. In fact Venus never gets more than 48° from the Sun, and Mercury stays even closer. So it looks like these planets are orbiting the Sun, not the Earth!
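
To see where those numbers come from: for a planet on a roughly circular orbit inside Earth’s, the largest angle it can ever appear from the Sun is the arcsine of the ratio of the two orbital radii. Here is a quick modern check of my own (the orbital radii are today’s measured values, which of course nobody had back then):

```python
# Maximum elongation of an inferior planet, assuming circular coplanar orbits.
from math import asin, degrees

r_earth = 1.0                                        # Earth's mean orbital radius, in AU
inner_planets = {"Mercury": 0.387, "Venus": 0.723}   # mean orbital radii, in AU

for name, r in inner_planets.items():
    max_elongation = degrees(asin(r / r_earth))
    print(f"{name}: never more than about {max_elongation:.0f}° from the Sun")

# Mercury: ~23°, Venus: ~46° -- close to the bounds quoted above.  (Real orbits
# are eccentric, so Mercury's actual maximum elongation ranges from ~18° to ~28°.)
```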

But who was this guy, and why did he matter?

Martianus Capella was a jurist and writer who lived in the city of Madauros, which is now in Algeria, but in his day was in Numidia, one of six African provinces of the Roman Empire. He’s famous for a book with the wacky title De nuptiis Philologiae et Mercurii, which means On the Marriage of Philology and Mercury. It was an allegorical story, in prose and verse, describing the courtship and wedding of Mercury (who stood for “intelligent or profitable pursuit”) and the maiden Philologia (who stood for “the love of letters and study”). Among the wedding gifts are seven maids who will be Philology’s servants. They are the seven liberal arts:

The Trivium: Grammar, Dialectic, Rhetoric.
The Quadrivium: Geometry, Arithmetic, Astronomy, Harmony.

In seven chapters, the seven maids explain these subjects. What matters for us is the chapter on astronomy, which explains the structure of the Solar System.

This book De nuptiis Philologiae et Mercurii became very important after the decline and fall of the Roman Empire, mainly as a guide to the liberal arts. In fact, if you went to a college that claimed to offer a liberal arts education, you were indirectly affected by this book!

Here is a painting by Botticelli from about 1485, called A Young Man Being Introduced to the Seven Liberal Arts:


The Carolingian Renaissance

But why did Martianus Capella’s book become so important?

I’m no expert on this, but it seems that as the Roman Empire declined there was a gradual dumbing down of scholarship, with original and profound works by folks like Aristotle, Euclid, and Ptolemy eventually being lost in western Europe—though preserved in more civilized parts of the world, like Baghdad and the Byzantine Empire. In the west, eventually all that was left were easy-to-read popularizations by people like Pliny the Elder, Boethius, Macrobius, Cassiodorus… and Martianus Capella!

By the end of the 800s, many copies of Capella’s book De nuptiis Philologiae et Mercurii were available. Let’s see how that happened!



Expansion of the Franks

To set the stage: Charlemagne became King of the Franks in 768 AD. Being a forward-looking fellow, he brought in Alcuin, headmaster of the cathedral school in York and “the most learned man anywhere to be found”, to help organize education in his kingdom.

Alcuin set up schools for boys and girls, systematized the curriculum, raised the standards of scholarship, and encouraged the study of liberal arts. Yes: the liberal arts as described by Martianus Capella! For Alcuin this was all in the service of Christianity. But scholars, being scholars, took advantage of this opportunity to start copying the ancient books that were available, writing commentaries on them, and the like.

In 800, Charlemagne became emperor of what’s now called the Carolingian Empire. When Charlemagne died in 814 a war broke out, but it ended in 847. Though divided into three parts, the empire flourished until about 877, when it began sinking due to internal struggles, attacks from Vikings in the north, etc.

The heyday of culture in the Carolingian Empire, roughly 768–877, is sometimes called the Carolingian Renaissance because of the flourishing of culture and learning brought about by Alcuin and his successors. To get a sense of this: between 550 and 750 AD, only 265 books have been preserved from Western Europe. From the Carolingian Renaissance we have over 7000.

However, there was still a huge deficit of the classical texts we now consider most important. As far as I can tell, the works of Aristotle, Eratosthenes, Euclid, Ptolemy and Archimedes were completely missing in the Carolingian Empire. I seem to recall that from Plato only the Timaeus was available at this time. But Martianus Capella’s De nuptiis Philologiae et Mercurii was very popular. Hundreds of copies were made, and many survive even to this day! Thus, his theory of the Solar System, where Mercury and Venus orbited the Sun but other planets orbited the Earth, must have had an out-sized impact on cosmology at this time.

Here is part of a page from one of the first known copies of De nuptiis Philologiae et Mercurii:


It’s called VLF 48, and it’s now at the university library in Leiden. Most scholars say it dates to 850 AD, though Mariken Teeuwen has a paper claiming it goes back to 830 AD.

You’ll notice that in addition to the main text, there’s a lot of commentary in smaller letters! This may have been added later. Nobody knows who wrote it, or even whether it was a single person. It’s called the Anonymous Commentary. This commentary was copied into many of the later versions of the book, so it’s important.

The Anonymous Commentary

So far my tale has been a happy one: even in the time of Charlemagne, the heliocentric revolt against the geocentric cosmology was brewing, with a fascinating ‘mixed’ cosmology being rather well-known.

Alas, now I need to throw a wet blanket on that, and show how poorly Martianus Capella’s cosmology was understood at this time!

The Anonymous Commentary actually describes three variants of Capella’s theory of the orbits of Mercury and Venus. One of them is good, one seems bad, and one seems very bad. Yet subsequent commentators in the Carolingian Empire didn’t seem to recognize this fact and discard the bad ones.

These three variants were drawn as diagrams in the margin of VLF 48, but Robert Eastwood has nicely put them side by side here:

The one at right, which the commentary attributes to the “Platonists”, shows the orbit of Mercury around the Sun surrounded by the larger orbit of Venus. This is good.

The one in the middle, which the commentary attributes to Martianus Capella himself, shows the orbits of Mercury and Venus crossing each other. This seems bad.

The one at left, which the commentary attributes to Pliny, shows orbits for Mercury and Venus that are cut off when they meet the orbit of the Sun, not complete circles. This seems very bad—so bad that I can’t help but hope there’s some reasonable interpretation that I’m missing. (Maybe just that these planets get hidden when they go behind the Sun?)

Robert Eastwood attributes the two bad models to a purely textual approach to astronomy, where commentators tried to interpret texts and compare them to other texts, without doing observations. I’m still puzzled.

Copernicus

Luckily, we’ve already seen that by 1573, Valentin Naboth had settled on the good version of Capella’s cosmology:


That’s 30 years after Copernicus came out with his book… but the clarification probably happened earlier. And Copernicus did mention Martianus Capella’s work. In fact, he used it to argue for a heliocentric theory! In Chapter 10 of De Revolutionibus he wrote:

In my judgement, therefore, we should not in the least disregard what was familiar to Martianus Capella, the author of an encyclopedia, and to certain other Latin writers. For according to them, Venus and Mercury revolve around the sun as their center. This is the reason, in their opinion, why these planets diverge no farther from the sun than is permitted by the curvature of their revolutions. For they do not encircle the earth, like the other planets, but “have opposite circles”. Then what else do these authors mean but that the center of their spheres is near the sun? Thus Mercury’s sphere will surely be enclosed within Venus’, which by common consent is more than twice as big, and inside that wide region it will occupy a space adequate for itself. If anyone seizes this opportunity to link Saturn, Jupiter, and Mars also to that center, provided he understands their spheres to be so large that together with Venus and Mercury the earth too is enclosed inside and encircled, he will not be mistaken, as is shown by the regular pattern of their motions.

For [these outer planets] are always closest to the earth, as is well known, about the time of their evening rising, that is, when they are in opposition to the sun, with the earth between them and the sun. On the other hand, they are at their farthest from the earth at the time of their evening setting, when they become invisible in the vicinity of the sun, namely, when we have the sun between them and the earth. These facts are enough to show that their center belongs more to the sun, and is identical with the center around which Venus and Mercury likewise execute their revolutions.

Conclusion

What’s the punchline? For me, it’s that there was not a purely binary choice between geocentric and heliocentric cosmologies. Instead, many options were in play around the time of Copernicus:

• In classic geocentrism, the Earth was non-rotating and everything revolved around it.

• Buridan and Oresme strongly considered the possibility that the Earth rotated… but not, apparently, that it revolved around the Sun.

• Capella believed Mercury and Venus revolved around the Sun… but the Sun revolved around the Earth.

• Copernicus believed the Earth rotates, and also revolves around the Sun.

• And to add to the menu, Tycho Brahe, coming after Copernicus, argued that all the planets except Earth revolve around the Sun, but the Sun and Moon revolve around the Earth, which is fixed.

And Capella’s theory actually helped Copernicus!

This diversity of theories is fascinating… even though everyone holds, and I think myself, that the Earth revolves around the Sun.


Above is a picture of the “Hypothesis Tychonica”, from a book written in 1643.

References

We know very little about Aristarchus’ heliocentric theory. Much comes from Archimedes, who wrote in his Sand-Reckoner that

You King Gelon are aware the ‘universe’ is the name given by most astronomers to the sphere the centre of which is the centre of the earth, while its radius is equal to the straight line between the centre of the sun and the centre of the earth. This is the common account as you have heard from astronomers. But Aristarchus has brought out a book consisting of certain hypotheses, wherein it appears, as a consequence of the assumptions made, that the universe is many times greater than the ‘universe’ just mentioned. His hypotheses are that the fixed stars and the sun remain unmoved, that the earth revolves about the sun on the circumference of a circle, the sun lying in the middle of the orbit, and that the sphere of fixed stars, situated about the same centre as the sun, is so great that the circle in which he supposes the earth to revolve bears such a proportion to the distance of the fixed stars as the centre of the sphere bears to its surface.

The last sentence, which Archimedes went on to criticize, seems to be a way of saying that the fixed stars are at an infinite distance from us.

For Aristarchus’ influence on Copernicus, see:

• Owen Gingerich, Did Copernicus owe a debt to Aristarchus?, Journal for the History of Astronomy 16 (1985), 37–42.

An unpublished early version of Copernicus’ De revolutionibus, preserved at the Jagiellonian Library in Kraków, contains this passage:

And if we should admit that the motion of the Sun and Moon could be demonstrated even if the Earth is fixed, then with respect to the other wandering bodies there is less agreement. It is credible that for these and similar causes (and not because of the reasons that Aristotle mentions and rejects), Philolaus believed in the mobility of the Earth and some even say that Aristarchus of Samos was of that opinion. But since such things could not be comprehended except by a keen intellect and continuing diligence, Plato does not conceal the fact that there were very few philosophers in that time who mastered the study of celestial motions.

For Buridan on the location and possible motion of the Earth, see:

• John Buridan, Questions on the Four Books on the Heavens and the World of Aristotle, Book II, Question 22, trans. Marshall Clagett, in The Science of Mechanics in the Middle Ages, University of Wisconsin Press, Madison, Wisconsin, 1961, pp. 594–599.

For Oresme on similar issues, see:

• Nicole Oresme, On the Book on the Heavens and the World of Aristotle, Book II, Chapter 25, trans. Marshall Clagett, in The Science of Mechanics in the Middle Ages, University of Wisconsin Press, Madison, Wisconsin, 1961, pp. 600–609.

Both believed in a principle of relativity for rotational motion, so they thought there’d be no way to tell whether the Earth was rotating. This of course got revisited in Newton’s rotating bucket argument, and then Mach’s principle, frame-dragging in general relativity, and so on.

You can read Martianus Capella’s book in English translation here:

• William Harris Stahl, Evan Laurie Burge and Richard Johnson, eds., Martianus Capella and the Seven Liberal Arts: The Marriage of Philology and Mercury, Vol. 2, Columbia University Press, 1971.

I got my figures on numbers of books available in the early Middle Ages from here:

• Dusan Nikolic, What was the Carolingian Renaissance?, 2023 April 6.

This is the best source I’ve found on Martianus Capella’s impact on cosmology in the Carolingian Renaissance:

• Bruce S. Eastwood, Ordering the Heavens: Roman Astronomy and Cosmology in the Carolingian Renaissance, Brill, 2007.

This is also good:

• Mariken Teeuwen and Sínead O’Sullivan, eds., Carolingian Scholarship and Martianus Capella: Ninth-Century Commentary Traditions on De nuptiis in Context, The Medieval Review (2012).

In this book, the essay most relevant to Capella’s cosmology is again by Eastwood:

• Bruce S. Eastwood, The power of diagrams: the place of the anonymous commentary in the development of Carolingian astronomy and cosmology.

However, this seems subsumed by the more detailed information in his book. There’s also an essay with a good discussion about Carolingian manuscripts of De nuptiis, especially the one called VLF 48 that I showed you, which may be the earliest:

• Mariken Teeuwen, Writing between the lines: reflections of a scholarly debate in a Carolingian commentary tradition.

For the full text of Copernicus’ book, translated into English, go here.

n-Category Café Martianus Capella

I’ve been blogging a bit about medieval math, physics and astronomy over on Azimuth. I’ve been writing about medieval attempts to improve Aristotle’s theory that velocity is proportional to force, understand objects moving at constant acceleration, and predict the conjunctions of Jupiter and Saturn. A lot of interesting stuff was happening back then!

As a digression from our usual fare on the n-Café, here’s one of my favorites, about an early theory of the Solar System, neither geocentric nor heliocentric, that became popular thanks to a quirk of history around the time of Charlemagne. The more I researched this, the more I wanted to know.


December 11, 2024

Scott Aaronson The Google Willow thing

Yesterday I arrived in Santa Clara for the Q2B (Quantum 2 Business) conference, which starts this morning, and where I’ll be speaking Thursday on “Quantum Algorithms in 2024: How Should We Feel?” and also closing the conference via an Ask-Us-Anything session with John Preskill. (If you’re at Q2B, reader, come and say hi!)

And to coincide with Q2B, yesterday Google’s Quantum group officially announced “Willow,” its new 105-qubit superconducting chip with which it’s demonstrated an error-corrected surface code qubit as well as a new, bigger quantum supremacy experiment based on Random Circuit Sampling. I was lucky to be able to attend Google’s announcement ceremony yesterday afternoon at the Computer History Museum in Mountain View, where friend-of-the-blog-for-decades Dave Bacon and other Google quantum people explained exactly what was done and took questions (the technical level was surprisingly high for this sort of event). I was also lucky to get a personal briefing last week from Google’s Sergio Boixo on what happened.

Meanwhile, yesterday Sundar Pichai tweeted about Willow, and Elon Musk replied “Wow.” It cannot be denied that those are both things that happened.

Anyway, all yesterday, I then read comments on Twitter, Hacker News, etc. complaining that, since there wasn’t yet a post on Shtetl-Optimized, how could anyone possibly know what to think of this?? For 20 years I’ve been trying to teach the world how to fish in Hilbert space, but (sigh) I suppose I’ll just hand out some more fish. So, here are my comments:

  1. Yes, this is great. Yes, it’s a real milestone for the field. To be clear: for anyone who’s been following experimental quantum computing these past five years (say, since Google’s original quantum supremacy milestone in 2019), there’s no particular shock here. Since 2019, Google has roughly doubled the number of qubits on its chip and, more importantly, increased the qubits’ coherence time by a factor of 5. Meanwhile, their 2-qubit gate fidelity is now roughly 99.7% (for controlled-Z gates) or 99.85% (for “iswap” gates), compared to ~99.5% in 2019. They then did the more impressive demonstrations that predictably become possible with more and better qubits. And yet, even if the progress is broadly in line with what most of us expected, it’s still of course immensely gratifying to see everything actually work! Huge congratulations to everyone on the Google team for a well-deserved success.
  2. I already blogged about this!!! Specifically, I blogged about Google’s fault-tolerance milestone when its preprint appeared on the arXiv back in August. To clarify, what we’re all talking about now is the same basic technical advance that Google already reported in August, except now with the PR blitz from Sundar Pichai on down, a Nature paper, an official name for the chip (“Willow”), and a bunch of additional details about it.
  3. Scientifically, the headline result is that, as they increase the size of their surface code, from 3×3 to 5×5 to 7×7, Google finds that their encoded logical qubit stays alive for longer rather than shorter. So, this is a very important threshold that’s now been crossed. As Dave Bacon put it to me, “eddies are now forming”—or, to switch metaphors, after 30 years we’re now finally tickling the tail of the dragon of quantum fault-tolerance, the dragon that (once fully awoken) will let logical qubits be preserved and acted on for basically arbitrary amounts of time, allowing scalable quantum computation.
  4. Having said that, Sergio Boixo tells me that Google will only consider itself to have created a “true” fault-tolerant qubit, once it can do fault-tolerant two-qubit gates with an error of ~10⁻⁶ (and thus, on the order of a million fault-tolerant operations before suffering a single error). We’re still some ways from that milestone: after all, in this experiment Google created only a single encoded qubit, and didn’t even try to do encoded operations on it, let alone on multiple encoded qubits. But all in good time. Please don’t ask me to predict how long, though empirically, the time from one major experimental QC milestone to the next now seems to be measured in years, which are longer than weeks but shorter than decades.
  5. Google has also announced a new quantum supremacy experiment on its 105-qubit chip, based on Random Circuit Sampling with 40 layers of gates. Notably, they say that, if you use the best currently-known simulation algorithms (based on Johnnie Gray’s optimized tensor network contraction), as well as an exascale supercomputer, their new experiment would take ~300 million years to simulate classically if memory is not an issue, or ~10²⁵ years if memory is an issue (note that a mere ~10¹⁰ years have elapsed since the Big Bang). Probably some people have come here expecting me to debunk those numbers, but as far as I know they’re entirely correct, with the caveats stated. Naturally it’s possible that better classical simulation methods will be discovered, but meanwhile the experiments themselves will also rapidly improve.
  6. Having said that, the biggest caveat to the “10²⁵ years” result is one to which I fear Google drew insufficient attention. Namely, for the exact same reason why (as far as anyone knows) this quantum computation would take ~10²⁵ years for a classical computer to simulate, it would also take ~10²⁵ years for a classical computer to directly verify the quantum computer’s results!! (For example, by computing the “Linear Cross-Entropy” score of the outputs; there’s a toy sketch of that score after this list.) For this reason, all validation of Google’s new supremacy experiment is indirect, based on extrapolations from smaller circuits, ones for which a classical computer can feasibly check the results. To be clear, I personally see no reason to doubt those extrapolations. But for anyone who wonders why I’ve been obsessing for years about the need to design efficiently verifiable near-term quantum supremacy experiments: well, this is why! We’re now deeply into the unverifiable regime that I warned about.
  7. In his remarks yesterday, Google Quantum AI leader Hartmut Neven talked about David Deutsch’s argument, way back in the 1990s, that quantum computers should force us to accept the reality of the Everettian multiverse, since “where else could the computation have happened, if it wasn’t being farmed out to parallel universes?” And naturally there was lots of debate about that on Hacker News and so forth. Let me confine myself here to saying that, in my view, the new experiment doesn’t add anything new to this old debate. It’s yet another confirmation of the predictions of quantum mechanics. What those predictions mean for our understanding of reality can continue to be argued as it’s been since the 1920s.
  8. Cade Metz did a piece about Google’s announcement for the New York Times. Alas, when Cade reached out to me for comment, I decided that it would be too awkward, after what Cade did to my friend Scott Alexander almost four years ago. I talked to several other journalists, such as Adrian Cho for Science.
  9. No doubt people will ask me what this means for superconducting qubits versus trapped-ion or neutral-atom or photonic qubits, or for Google versus its many competitors in experimental QC. And, I mean, it’s not bad for Google or for superconducting QC! These past couple years I’d sometimes commented that, since Google’s 2019 announcement of quantum supremacy via superconducting qubits, the trapped-ion and neutral-atom approaches had seemed to be pulling ahead, with spectacular results from Quantinuum (trapped-ion) and QuEra (neutral atoms) among others. One could think of Willow as Google’s reply, putting the ball in its competitors’ courts to likewise demonstrate better logical qubit lifetime with increasing code size (or, better yet, full operations on logical qubits exceeding that threshold, without resorting to postselection). The great advantage of trapped-ion qubits continues to be that you can move the qubits around (and also, the two-qubit gate fidelities seem somewhat ahead of superconducting). But to compensate, superconducting qubits have the advantage that the gates are a thousand times faster, which makes it feasible to do experiments that require collecting millions of samples.
  10. Of course the big question, the one on everyone’s lips, was always how quantum computing skeptic Gil Kalai was going to respond. But we need not wonder! On his blog, Gil writes: “We did not study yet these particular claims by Google Quantum AI but my general conclusion apply to them ‘Google Quantum AI’s claims (including published ones) should be approached with caution, particularly those of an extraordinary nature. These claims may stem from significant methodological errors and, as such, may reflect the researchers’ expectations more than objective scientific reality.’ ”  Most of Gil’s post is devoted to re-analyzing data from Google’s 2019 quantum supremacy experiment, which Gil continues to believe can’t possibly have done what was claimed. Gil’s problem is that the 2019 experiment was long ago superseded anyway: besides the new and more inarguable Google result, IBM, Quantinuum, QuEra, and USTC have now all also reported Random Circuit Sampling experiments with good results. I predict that Gil, and others who take it as axiomatic that scalable quantum computing is impossible, will continue to have their work cut out for them in this new world.
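
For readers who want to see concretely what the “Linear Cross-Entropy” score in point 6 is, here is a toy sketch of my own (emphatically not Google’s code): simulate a small random circuit exactly with numpy, draw samples from its output distribution, and score those samples against the ideal probabilities. At 5 qubits this takes microseconds; the whole point of the supremacy experiments is that computing the ideal probabilities becomes astronomically expensive at 70 to 105 qubits.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5            # qubits -- tiny on purpose, so exact simulation is trivial
dim = 2 ** n

def random_1q_gate():
    """A (roughly Haar-)random 2x2 unitary, via QR of a complex Gaussian."""
    z = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
    q, r = np.linalg.qr(z)
    return q * (np.diag(r) / np.abs(np.diag(r)))

def apply_1q(state, u, qubit):
    """Apply a single-qubit unitary u to one qubit of an n-qubit statevector."""
    psi = state.reshape([2] * n)
    psi = np.tensordot(u, psi, axes=([1], [qubit]))
    psi = np.moveaxis(psi, 0, qubit)
    return psi.reshape(dim)

def apply_cz(state, q1, q2):
    """Apply a controlled-Z gate between qubits q1 and q2."""
    psi = state.reshape([2] * n).copy()
    idx = [slice(None)] * n
    idx[q1], idx[q2] = 1, 1
    psi[tuple(idx)] *= -1
    return psi.reshape(dim)

# Build a random circuit: alternating layers of random 1-qubit gates and CZs.
state = np.zeros(dim, dtype=complex)
state[0] = 1.0
for layer in range(10):
    for q in range(n):
        state = apply_1q(state, random_1q_gate(), q)
    pairs = range(0, n - 1, 2) if layer % 2 == 0 else range(1, n - 1, 2)
    for q in pairs:
        state = apply_cz(state, q, q + 1)

probs = np.abs(state) ** 2
probs /= probs.sum()   # guard against tiny floating-point drift

# "Run the device": sample bitstrings from the ideal distribution (a noiseless
# stand-in for the hardware), then compute the linear cross-entropy score
# F_XEB = 2^n * <p(sampled bitstring)> - 1.  It comes out near 1 for a good
# device and near 0 for a device that just outputs uniform noise.
samples = rng.choice(dim, size=2000, p=probs)
noise = rng.integers(dim, size=2000)
xeb_ideal = dim * probs[samples].mean() - 1
xeb_noise = dim * probs[noise].mean() - 1
print(f"XEB (ideal sampler) ~ {xeb_ideal:.2f}, XEB (uniform noise) ~ {xeb_noise:.2f}")
```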

Update: Here’s Sabine Hossenfelder’s take. I don’t think she and I disagree about any of the actual facts; she just decided to frame things much more negatively. Ironically, I guess 20 years of covering hyped, dishonestly-presented non-milestones in quantum computing has inclined me to be pretty positive when a group puts in this much work, demonstrates a real milestone, and talks about it without obvious falsehoods!

December 09, 2024

Scott Aaronson Podcasts!

Update (Dec. 9): For those who still haven’t gotten enough, check out a 1-hour Zoom panel discussion about quantum algorithms, featuring yours truly along with my distinguished colleagues Eddie Farhi, Aram Harrow, and Andrew Childs, moderated by Barry Sanders, as part of the QTML’2024 conference held in Melbourne (although, it being Thanksgiving week, none of the four panelists were actually there in person). Part of the panel devolves into a long debate between me and Eddie about how interesting quantum algorithms are if they don’t achieve speedups over classical algorithms, and whether some quantum algorithms papers mislead people by not clearly addressing the speedup question (you get one guess as to which side I took). I resolved going in to keep my comments as civil and polite as possible—you can judge for yourself how well I succeeded! Thanks very much to Barry and the other QTML organizers for making this happen.


Do you like watching me spout about AI alignment, watermarking, my time at OpenAI, the P versus NP problem, quantum computing, consciousness, Penrose’s views on physics and uncomputability, university culture, wokeness, free speech, my academic trajectory, and much more, despite my slightly spastic demeanor and my many verbal infelicities? Then holy crap are you in luck today! Here’s 2.5 hours of me talking to former professional poker players (and now wonderful Austin-based friends) Liv Boeree and her husband Igor Kurganov about all of those topics. (Or 1.25 hours if you watch at 2x speed, as I strongly recommend.)

But that’s not all! Here I am talking to Harvard’s Hrvoje Kukina, in a much shorter (45-minute) podcast focused on quantum computing, cosmological bounds on information processing, and the idea of the universe as a computer:

Last but not least, here I am in an hour-long podcast (this one audio-only) with longtime friend Kelly Weinersmith and her co-host Daniel Whiteson, talking about quantum computing.

Enjoy!

David Hoggpossible Trojan planet?

In group meeting last week, Stefan Rankovic (NYU undergrad) presented results on a very low-amplitude possible transit in the lightcurve of a candidate long-period eclipsing binary system found in the NASA Kepler data. The weird thing is that (even though the period is very long) the transit of the possible planet looks just like the transit of the secondary star in the eclipsing binary. Like just like it, only lower in amplitude (smaller in radius).

If the transit looks identical, only lower in amplitude, it suggests that it is taking an extremely similar chord across the primary star, at the same speed, with no difference in inclination. How could that be? Well if they are moving at the same speed on the same path, maybe we have a 1:1 resonance, like a Trojan? If so, there are so many cool things about this system. It was an exciting group meeting, to be sure.

December 07, 2024

Doug NatelsonSeeing through your head - diffuse imaging

From the medical diagnostic perspective (and for many other applications), you can understand why it might be very convenient to be able to perform some kind of optical imaging of the interior of what you'd ordinarily consider opaque objects.  Even when a wavelength range is chosen so that absorption is minimized, photons can scatter many times as they make their way through dense tissue like a breast.  We now have serious computing power and extremely sensitive photodetectors, which has led to the development of imaging techniques to perform imaging through media that absorb and diffuse photons.  Here is a review of this topic from 2005, and another more recent one (pdf link here).  There are many cool approaches that can be combined, including using pulsed lasers to do time-of-flight measurements (review here), and using "structured illumination" (review here).   

Sure, point that laser at my head.  (Adapted from Figure 1 of this paper.)

I mention all of this to set the stage for this fun preprint, titled "Photon transport through the entire adult human head".  Sure, you think your head is opaque, but it only attenuates photon fluxes by a factor of around \(10^{18}\).  With 1 Watt of incident power at 800 nm wavelength spread out over a 25 mm diameter circle and pulsed 80 million times a second, time-resolved single-photon detectors like photomultiplier tubes can readily detect the many-times-scattered photons that straggle their way out of your head around 2 nanoseconds later.  (The distribution of arrival times contains a bunch of information.  Note that the speed of light in free space is around 30 cm/ns; even accounting for the index of refraction of tissue, those photons have bounced around a lot before getting through.)  The point of this is that those photons have passed through parts of the brain that are usually considered inaccessible.  This shows that one could credibly use spectroscopic methods to get information out of there, like blood oxygen levels.
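
To get a feel for those numbers, here is a back-of-the-envelope check (my arithmetic, not the preprint's): 1 W of 800 nm light is roughly \(4 \times 10^{18}\) photons per second, so even after attenuation by \(10^{18}\) a few photons per second still dribble out the far side, and a photon that emerges ~2 ns after the pulse has zigzagged through tens of centimeters of tissue.

```python
# Rough photon budget for "seeing through a head" (order-of-magnitude only).
h = 6.626e-34         # Planck constant, J*s
c = 3.0e8             # speed of light in vacuum, m/s
wavelength = 800e-9   # m
power_in = 1.0        # incident power, W
attenuation = 1e18    # attenuation factor quoted above

photon_energy = h * c / wavelength                   # ~2.5e-19 J per photon
photons_in_per_second = power_in / photon_energy     # ~4e18 photons/s
photons_out_per_second = photons_in_per_second / attenuation

# Path length of a photon exiting ~2 ns after the pulse, taking a tissue
# refractive index of ~1.4 (an assumed round number).
path_length = 2e-9 * c / 1.4                         # ~0.43 m of scattering path

print(f"~{photons_out_per_second:.0f} photons per second make it through")
print(f"each one has wandered over roughly {100 * path_length:.0f} cm of tissue")
```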

John BaezBradwardine’s Rule



I love reading about medieval physics: you can see people struggling against mental traps, often failing, but still putting up a fight. We shouldn’t laugh at them: theoretical physicists may be stuck in their own traps today! Good new ideas often seem obvious in retrospect… but only in retrospect.

For example: Aristotle argued that a vacuum is impossible, because the velocity of an object equals the force on it divided by the resistance of the medium it’s moving through. A vacuum offers no resistance—so an object would move through it at infinite speed!

Around 1100, the medieval Arab physicist Ibn Bajja disagreed. He argued that the celestial spheres—i.e. the planets and stars—move at finite speeds even in the vacuum. So, he said, we should subtract the resistance of the medium from the object’s natural speed, not divide by it.

Averroes fought back, agreeing with Aristotle. Later, Thomas Aquinas sided with Ibn Bajja (who was known in Latin as Avempace). By the 1300s, most Western natural philosophers had sided with Aristotle.

There are definitely problems with the subtraction theory. What if the resistance exceeds the force? Does the object move backwards? But back then they didn’t know about negative numbers! So maybe proponents of the subtraction theory would say an object stands still if you push on it with insufficient force to overcome the resistance.

All this reached a pinnacle of complexity in Thomas Bradwardine’s 1328 Treatise on the Ratios of Speeds in Motions. He analyzed four theories and then proposed his own. Though he didn’t phrase it this way, it seems he argued that speed is proportional to the logarithm of the force over resistance. For example if you cube the ratio of force to resistance, you triple the object’s speed.

This theory, which came to be called ‘Bradwardine’s rule’, seems terrible to me. It doesn’t solve the problem of infinite speeds in a vacuum, and it says some force is required just to hold an object still. In fact, I hope I’m misinterpreting this theory! But anyway, it caught on: without any experimental evidence backing it up, it took root and was popular for about 100 years.

In Oxford, Richard Swineshead and John Dumbleton applied Bradwardine’s rule to solve ‘sophisms’, the logical and physical puzzles that were starting to become important at Merton College. A bit later this rule appeared in Paris, in the works of Jean Buridan and Albert of Saxony. By the middle of the 1300s it caught on in Padua and elsewhere. And Swineshead, nicknamed The Calculator, used it to study whether a body acts as a unified whole or as the sum of its parts. He imagined a long, uniform, heavy body falling in a vacuum down a tunnel through the center of the earth. Somehow he concluded that it would take an infinite time to reach the center. Don’t ask me how—I don’t know! But anyway, he rejected this conclusion as unphysical.

So, a lot of struggles!

While the theories I just explained sound pretty bad, something good was happening throughout these arguments: researchers were figuring out the concept of ‘instantaneous velocity’: the velocity of an object at an instant of time. They didn’t define it using derivatives, but it seems getting a good intuitive handle on this was a prerequisite for later work on derivatives.

And Nicole Oresme, trying to formulate Bradwardine’s rule in full generality, was led to study fractional powers for the first time! Nobody knew about logarithms back then, so Bradwardine’s rule actually said that if you have two objects feeling forces F_i and encountering resistances R_i, their velocities V_i are related by

\displaystyle{ \frac{F_2}{R_2} = \left(\frac{F_1}{R_1}\right)^{V_2/V_1} }

This is simple enough when the ratio V_2/V_1 is a positive integer, but Oresme extended it to fractions!
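
To see why this amounts to saying that speed is proportional to the logarithm of the force over the resistance, take logarithms of both sides (a modern gloss, of course, not anything Bradwardine or Oresme could have written):

\displaystyle{ \log \frac{F_2}{R_2} = \frac{V_2}{V_1} \log \frac{F_1}{R_1}, \qquad \textrm{so} \qquad \frac{V_2}{V_1} = \frac{\log(F_2/R_2)}{\log(F_1/R_1)} }

In particular, cubing the ratio of force to resistance triples the speed, as in the example above.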

So, fundamentally misguided physics can still lead to good mathematics. (I will not attempt to draw any lessons for the present.)

I got most of this material from here:

• Walter Roy Laird, Change and motion, in Cambridge History of Science, Volume 2: Medieval Science, eds. David C. Lindberg and Michael H. Shank, Cambridge U. Press, Cambridge, 2013.

But as I dig deeper, I’m finding this very helpful:

• Marshall Clagett, The Science of Mechanics in the Middle Ages, University of Wisconsin Press, Madison, Wisconsin, 1961.

John BaezThe Mean Speed Theorem

Did you know students at Oxford in 1335 were solving problems about objects moving with constant acceleration? This blew my mind.

Medieval scientists were deeply confused about the connection between force and velocity: it took Newton to realize force is proportional to acceleration. But in the early 1300s, a group of researchers called the Oxford Calculators made huge progress in understanding objects that move with changing velocity. That was an incredibly important step.

You see, Aristotle had only defined velocity over an interval of time, as the change in position divided by the change in time. But the Oxford Calculators developed the concept of instantaneous velocity, allowing them to discuss objects with changing velocity—and even the concept of acceleration!

That’s really cool. But it gets better. The Oxford Calculators made their students write essays on ‘sophisms’, meaning puzzles or paradoxes. Some of these were logical or philosophical, others physical. And in 1335, one of the Oxford Calculators, named William Heytesbury, wrote a book called Rules for Solving Sophisms. This gives us a hint as to what these puzzles were like—though it’s more of a theoretical monograph than a practical how-to book, so we can’t really tell what the students were expected to do.

This book has a very interesting section on physics. Here Heytesbury states something called the Mean Speed Theorem. This says that if an object’s velocity is changing at a constant rate over some period of time, it goes just as far as if it were moving uniformly with the velocity it had at the middle instant of its motion!

We can think of this as a step toward integrating linear functions. Later in the 1300s, in Paris, Nicole Oresme gave a clear picture proof of the Mean Speed Theorem. For example, he pointed out that the triangle ACG below has the same area as the rectangle ACDF:

This handles the Mean Speed Theorem for an object starting with zero velocity—an important special case which Heytesbury also found interesting.
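
In modern notation (again, a gloss the Calculators themselves did not have), the general theorem is just the integral of a linear function: an object with initial velocity v_0 and constant acceleration a travels

\displaystyle{ \int_0^T (v_0 + a t) \, dt = \left( v_0 + \frac{a T}{2} \right) T }

and v_0 + aT/2 is precisely its velocity at the middle instant t = T/2.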

Heytesbury’s Rules for Solving Sophisms also has some pictures:

But when it comes to the Mean Speed Theorem I haven’t seen him giving a picture proof. Instead, he gives a complex purely verbal argument. It’s so complicated I haven’t managed to follow it yet. I tried, but I gave up: it involves comparing the motion of four different objects. You can see the argument here, starting on page 122:

• Curtis Wilson, William Heytesbury: Medieval Logic and the Rise of Mathematical Physics, University of Wisconsin Press, Madison, Wisconsin, 1956.

I managed to find this book in the bowels of my university library, in one of those electronically movable stacks that makes me worry some evil denizen of the nether reaches is going to crush me:

But I’m pleased to report that the tireless criminals of Library Genesis have made this book available to the world in electronic form, so you don’t have to take such risks. If you can understand Heytesbury’s argument, please explain it to me!

Who were the Oxford Calculators, exactly? They were a group of thinkers associated to Merton College in Oxford. Thus, they’re also called the Merton School. Here are the four most important:

Thomas Bradwardine (c. 1300 – 26 August 1349) was an English cleric, scholar, mathematician, physicist, and—for just 4 months before he died of the bubonic plague—Archbishop of Canterbury. The Pope called him Doctor Profundus, and the nickname stuck. He was very interested in logic and wrote about the Liar Paradox. His Treatise on the Proportions of Velocities in Movements started the interest in motion at Merton College. As I recently explained here, he came up with a really lousy rule relating force to velocity, called Bradwardine’s Rule. But the bright side is that in the process, he began clarifying the concept of instantaneous velocity!

John Dumbleton (c. 1310 – c. 1349) studied in Paris for a brief time before becoming a fellow at Merton College. His Summa Logica et Philosophiae Naturalis, never finished, seems to be in part a reworking of ideas from Aristotle. But it also has new ideas on physics: for example, he argues that the brightness of a light source does not decrease in inverse proportion to the distance—though not, alas, that it goes with the inverse square of distance.

Richard Swineshead (flourished c. 1340 – 1354) has been called “in many ways the subtlest and most able of the four”. In his fragment On Motion he defined the instantaneous velocity of an object as the distance it would have traveled in a certain time, divided by that time, if it had continued moving at the same velocity. His The Book of Calculations: Rules on Local Motion gives an argument for the Mean Speed Theorem which again I don’t understand. He also argues that an object starting at rest, whose velocity increases at a constant rate over some time, will travel 3 times as much in the second half of that time as in the first.
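
Swineshead’s 3-to-1 claim is what you get by applying the Mean Speed Theorem separately to the two halves of the motion (my quick check, not his argument): the mean speeds of the two halves are aT/4 and 3aT/4, so the distances covered are

\displaystyle{ d_1 = \frac{aT}{4} \cdot \frac{T}{2} = \frac{aT^2}{8}, \qquad d_2 = \frac{3aT}{4} \cdot \frac{T}{2} = \frac{3aT^2}{8} = 3 d_1 }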

William Heytesbury (c. 1313–1372/3) had a degree in theology, and near the end of his life became chancellor of Oxford University. But he’s most famous for his popular textbook Rules for Solving Sophisms. The Mean Speed Theorem is stated but not proved there. A proof appears in a book Probationes Conclusionem which may or may not have been written by Heytesbury.

If you want to dig deeper into this subject, the place to go is here:

• Marshall Clagett, The Science of Mechanics in the Middle Ages, University of Wisconsin Press, Madison, Wisconsin, 1961.

It’s a sourcebook, with a lot of important texts translated into English, but it also has long sections explaining the history and the ideas. Note that both books I recommended today were published by the University of Wisconsin in the late 1950s and early 1960s. Maybe they were a powerhouse in medieval physics at that time?

Anyway, I find this stuff fascinating. One obvious question is: why did the Oxford Calculators sort of fizzle out after 1350? Why did it take until 1589 for Galileo to announce that falling bodies actually move in a way described by the Mean Speed Theorem?

There are a lot of possible reasons, but one is hinted at by Thomas Bradwardine’s death in 1349. The bubonic plague killed about one third of the population of Europe from 1346 to 1353!

December 06, 2024

Matt von HippelAt Ars Technica Last Week, With a Piece on How Wacky Ideas Become Big Experiments

I had a piece last week at Ars Technica about the path ideas in physics take to become full-fledged experiments.

My original idea for the story was a light-hearted short news piece. A physicist at the University of Kansas, Steven Prohira, had just posted a proposal for wiring up a forest to detect high-energy neutrinos, using the trees like giant antennas.

Chatting with experts, though, I found that what at first seemed silly started feeling like a hook for something more. Prohira has a strong track record, and the experts I talked to took his idea seriously. They had significant doubts, but I was struck by how answerable those doubts were, how rather than dismissing the whole enterprise they had in mind a list of questions one could actually test. I wrote a blog post laying out that impression here.

The editor at Ars was interested, so I dug deeper. Prohira’s story became a window on a wider-ranging question: how do experiments happen? How does a scientist convince the community to work on a project, and the government to fund it? How do ideas get tested before these giant experiments get built?

I tracked down researchers from existing experiments and got their stories. They told me how detecting particles from space takes ingenuity, with wacky ideas involving the natural world being surprisingly common. They walked me through tales of prototypes and jury-rigging and feasibility studies and approval processes.

The highlights of those tales ended up in the piece, but there was a lot I couldn’t include. In particular, I had a long chat with Sunil Gupta about the twists and turns taken by the GRAPES experiment in India. Luckily for you, some of the most interesting stories have already been covered, for example their measurement of the voltage of a thunderstorm or repurposing used building materials to keep costs down. I haven’t yet found his story about stirring wavelength-shifting chemicals all night using a propeller mounted on a power drill, but I suspect it’s out there somewhere. If not, maybe it can be the start of a new piece!

Terence TaoAI for Math fund

Renaissance Philanthropy and XTX Markets have announced the launch of the AI for Math Fund, a new grant program supporting projects that apply AI and machine learning to mathematics, with a focus on automated theorem proving, backed by an initial $9.2 million in funding. The project funding categories, and examples of projects in such categories, are:

1. Production Grade Software Tools

  • AI-based autoformalization tools for translating natural-language mathematics into the formalisms of proof assistants
  • AI-based auto-informalization tools for translating proof-assistant proofs into interpretable natural-language mathematics
  • AI-based models for suggesting tactics/steps or relevant concepts to the user of a proof assistant, or for generating entire proofs
  • Infrastructure to connect proof assistants with computer algebra systems, calculus, and PDEs
  • A large-scale, AI-enhanced distributed collaboration platform for mathematicians

2. Datasets

  • Datasets of formalized theorems and proofs in a proof assistant
  • Datasets that would advance AI for theorem proving as applied to program verification and secure code generation
  • Datasets of (natural-language) mathematical problems, theorems, proofs, exposition, etc.
  • Benchmarks and training environments associated with datasets and model tasks (autoformalization, premise selection, tactic or proof generation, etc.)

3. Field Building

  • Textbooks
  • Courses
  • Documentation and support for proof assistants, and interfaces/APIs to integrate with AI tools

4. Breakthrough Ideas

  • Expected difficulty estimation (of sub-problems of a proof)
  • Novel mathematical implications of proofs formalized type-theoretically
  • Formalization of proof complexity in proof assistants

The deadline for initial expressions of interest is Jan 10, 2025.

[Disclosure: I have agreed to serve on the advisory board for this fund.]

Update: See also this discussion thread on possible projects that might be supported by this fund.

December 04, 2024

n-Category Café ACT 2025

The Eighth International Conference on Applied Category Theory (https://easychair.org/cfp/ACT2025) will take place at the University of Florida on June 2-6, 2025. The conference will be preceded by the Adjoint School on May 26-30, 2025.

This conference follows previous events at Oxford (2024, 2019), University of Maryland (2023), Strathclyde (2022), Cambridge (2021), MIT (2020), and Leiden (2018).

Applied category theory is important to a growing community of researchers who study computer science, logic, engineering, physics, biology, chemistry, social science, systems, linguistics and other subjects using category-theoretic tools. The background and experience of our members is as varied as the systems being studied. The goal of the Applied Category Theory conference series is to bring researchers together, strengthen the applied category theory community, disseminate the latest results, and facilitate further development of the field.

If you want to give a talk, read on!

Submission

Important dates

All deadlines are AoE (Anywhere on Earth).

  • February 26: title and brief abstract submission
  • March 3: paper submission
  • April 7: notification of authors
  • May 19: Pre-proceedings ready versions
  • June 2-6: conference

Submissions

The submission URL is: https://easychair.org/conferences/?conf=act2025

We accept submissions in English of original research papers, talks about work accepted/submitted/published elsewhere, and demonstrations of relevant software. Accepted original research papers will be published in a proceedings volume. The conference will include an industry showcase event and community meeting. We particularly encourage people from underrepresented groups to submit their work and the organizers are committed to non-discrimination, equity, and inclusion.

  • Conference Papers should present original, high-quality work in the style of a computer science conference paper (up to 12 pages, not counting the bibliography; more detailed parts of proofs may be included in an appendix for the convenience of the reviewers). Such submissions should not be an abridged version of an existing journal article although pre-submission arXiv preprints are permitted. These submissions will be adjudicated for both a talk and publication in the conference proceedings.

  • Talk proposals not to be published in the proceedings, e.g. about work accepted/submitted/published elsewhere, should be submitted as abstracts, one or two pages long. Authors are encouraged to include links to any full versions of their papers, preprints or manuscripts. The purpose of the abstract is to provide a basis for determining the topics and quality of the anticipated presentation.

  • Software demonstration proposals should also be submitted as abstracts, one or two pages. The purpose of the abstract is to provide the program committee with enough information to assess the content of the demonstration.

The selected conference papers will be published in a volume of Proceedings. Authors are advised to use EPTCS style; files are available at https://style.eptcs.org/

Reviewing will be single-blind, and we are not making public the reviews, reviewer names, the discussions nor the list of under-review submissions. This is the same as previous instances of ACT.

In order to give our reviewers enough time to bid on submissions, we ask for a title and brief abstract of your submission by February 26. Full submissions (the two-page pdf extended abstracts and the proceedings papers of up to 12 pages) are both due by the submission deadline of March 3, 11:59pm AoE (Anywhere on Earth).

Please contact the Programme Committee Chairs for more information: Amar Hadzihasanovic (amar.hadzihasanovic@taltech.ee) and JS Lemay (js.lemay@mq.edu.au).

Programme Committee

See conference website for full list:

https://gataslab.org/act2025/act2025cfp

December 03, 2024

Scott Aaronson Thanksgiving

I’m thankful to the thousands of readers of this blog.  Well, not the few who submit troll comments from multiple pseudonymous handles, but the 99.9% who don’t. I’m thankful that they’ve stayed here even when events (as they do more and more often) send me into a spiral of doomscrolling and just subsisting hour-to-hour—when I’m left literally without words for weeks.

I’m thankful for Thanksgiving itself.  As I often try to explain to non-Americans (and to my Israeli-born wife), it’s not primarily about the turkey but rather about the sides: the stuffing, the mashed sweet potatoes with melted marshmallows, the cranberry jello mold.  The pumpkin pie is good too.

I’m thankful that we seem to be on the threshold of getting to see the birth of fault-tolerant quantum computing, nearly thirty years after it was first theorized.

I’m thankful that there’s now an explicit construction of pseudorandom unitaries — and that, with further improvement, this would lead to a Razborov-Rudich natural proofs barrier for the quantum circuit complexity of unitaries, explaining for the first time why we don’t have superpolynomial lower bounds for that quantity.

I’m thankful that there’s been recent progress on QMA versus QCMA (that is, quantum versus classical proofs), with a full classical oracle separation now possibly in sight.

I’m thankful that, of the problems I cared about 25 years ago — the maximum gap between classical and quantum query complexities of total Boolean functions, relativized BQP versus the polynomial hierarchy, the collision problem, making quantum computations classically verifiable — there’s now been progress if not a full solution for almost all of them. And yet I’m thankful as well that lots of great problems remain open.

I’m thankful that the presidential election wasn’t all that close (by contemporary US standards, it was a “landslide,” 50%-48.4%).  Had it been a nail-biter, not only would I fear violence and the total breakdown of our constitutional order, I’d kick myself that I hadn’t done more to change the outcome.  As it is, there’s no denying that a plurality of Americans actually chose this, and now they’re going to get it good and hard.

I’m thankful that, while I absolutely do see Trump’s return as a disaster for the country and for civilization, it’s not a 100% unmitigated disaster.  The lying chaos monster will occasionally rage for things I support rather than things I oppose.  And if he actually plunges the country into another Great Depression through tariffs, mass deportations, and the like, hopefully that will make it easier to repudiate his legacy in 2028.

I’m thankful that, whatever Jews around the world have had to endure over the past year — both the physical attacks and the moral gaslighting that it’s all our fault — we’ve already endured much worse on both fronts, not once but countless times over 3000 years, and this is excellent Bayesian evidence that we’ll survive the latest onslaught as well.

I’m thankful that my family remains together, and healthy. I’m thankful to have an 11-year-old who’s a striking wavy-haired blonde who dances and does gymnastics (how did that happen?) and wants to be an astrophysicist, as well as a 7-year-old who now often beats me in chess and loves to solve systems of two linear equations in two unknowns.

I’m thankful that, compared to what I imagined my life would be as an 11-year-old, my life is probably in the 50th percentile or higher.  I haven’t saved the world, but I haven’t flamed out either.  Even if I do nothing else from this point, I have a stack of writings and results that I’m proud of. And I fully intend to do something else from this point.

I’m thankful that the still-most-powerful nation on earth, the one where I live, is … well, more aligned with good than any other global superpower in the miserable pageant of human history has been.  I’m thankful to live in the first superpower in history that has some error-correction machinery built in, some ability to repudiate its past sins (and hopefully its present sins, in the future).  I’m thankful to live in the first superpower that has toleration of Jews and other religious minorities built in as a basic principle, with the possible exception of the Persian Empire under Cyrus.

I’m thankful that all eight of my great-grandparents came to the US in 1905, back when Jewish mass immigration was still allowed.  Of course there’s a selection effect here: if they hadn’t made it, I wouldn’t be here to ponder it.  Still, it seems appropriate to express gratitude for the fact of existing, whatever metaphysical difficulties might inhere in that act.

I’m thankful that there’s now a ceasefire between Israel and Lebanon that Israel’s government saw fit to agree to.  While I fear that this will go the way of all previous ceasefires — Hezbollah “obeys” until it feels ready to strike again, so then Israel invades Lebanon again, then more civilians die, then there’s another ceasefire, rinse and repeat, etc. — the possibility always remains that this time will be the charm, for all people on both sides who want peace.

I’m thankful that our laws of physics are so constructed that G, c, and ℏ, three constants that are relatively easy to measure, can be combined to tell us the fundamental units of length and time, even though those units — the Planck time, 10^{-43} seconds, and the Planck length, 10^{-33} centimeters — are themselves below the reach of any foreseeable technology, and are to atoms as atoms are to the solar system.
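
For concreteness, the combinations in question are the standard dimensional-analysis ones: the Planck time is \sqrt{\hbar G / c^5} \approx 5.4 \times 10^{-44} seconds, and the Planck length is \sqrt{\hbar G / c^3} \approx 1.6 \times 10^{-33} centimeters.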

I’m thankful that, almost thirty years after I could have and should have, I’ve now finally learned the proof of the irrationality of π.

I’m thankful that, if I could go back in time to my 14-year-old self, I could tell him firstly, that female heterosexual attraction to men is a real phenomenon in the world, and secondly, that it would sometimes fixate on him (the future him, that is) in particular.

I’m thankful for red grapefruit, golden mangos, seedless watermelons, young coconuts (meat and water), mangosteen, figs, dates, and even prunes.  Basically, fruit is awesome, the more so after whatever selective breeding and genetic engineering humans have done to it.

I’m thankful for Futurama, and for the ability to stream every episode of it in order, as Dana, the kids, and I have been doing together all fall.  I’m thankful that both of my kids love it as much as I do—in which case, how far from my values and worldview could they possibly be? Even if civilization is destroyed, it will have created 100 episodes of something this far out on the Pareto frontier of lowbrow humor, serious intellectual content, and emotional depth for a future civilization to discover.  In short: “good news, everyone!”

December 02, 2024

Matt Strassler Public Talk at the University of Michigan Dec 5th

This week I’ll be at the University of Michigan in Ann Arbor, and I’ll be giving a public talk for a general audience at 4 pm on Thursday, December 5th. If you are in the area, please attend! And if you know someone at the University of Michigan or in the Ann Arbor area who might be interested, please let them know. (For physicists: I’ll also be giving an expert-level seminar at the Physics Department the following day.)

Here are the logistical details:

The Quantum Cosmos and Our Place Within It

Thursday, December 5, 2024, 4:00-5:00 PM; Rackham Graduate School, 4th Floor Amphitheatre


When we step outside to contemplate the night sky, we often imagine ourselves isolated and adrift in a vast cavern of empty space—but is it so? Modern physics views the universe as more full than empty. Over the past century, this unfamiliar idea has emerged from a surprising partnership of exotic concepts: quantum physics and Einstein’s relativity. In this talk I’ll illustrate how this partnership provides the foundation for every aspect of human experience, including the existence of subatomic particles (and the effect of the so-called “Higgs field”), the peaceful nature of our journey through the cosmos, and the solidity of the ground beneath our feet.

Terence TaoOn several irrationality problems for Ahmes series

Vjeko Kovac and I have just uploaded to the arXiv our paper “On several irrationality problems for Ahmes series“. This paper resolves (or at least makes partial progress on) some open questions of Erdős and others on the irrationality of Ahmes series, which are infinite series of the form {\sum_{k=1}^\infty \frac{1}{a_k}} for some increasing sequence {a_k} of natural numbers. Of course, since most real numbers are irrational, one expects such series to “generically” be irrational, and we make this intuition precise (in both a probabilistic sense and a Baire category sense) in our paper. However, it is often difficult to establish the irrationality of any specific series. For example, it is already a non-trivial result of Erdős that the series {\sum_{k=1}^\infty \frac{1}{2^k-1}} is irrational, while the irrationality of {\sum_{p \hbox{ prime}} \frac{1}{2^p-1}} (equivalent to Erdős problem #69) remains open, although very recently Pratt established this conditionally on the Hardy–Littlewood prime tuples conjecture. Finally, the irrationality of {\sum_n \frac{1}{n!-1}} (Erdős problem #68) is completely open.

On the other hand, it has long been known that if the sequence {a_k} grows faster than {C^{2^k}} for any {C}, then the Ahmes series is necessarily irrational, basically because the fractional parts of {a_1 \dots a_m \sum_{k=1}^\infty \frac{1}{a_k}} can be arbitrarily small positive quantities, which is inconsistent with {\sum_{k=1}^\infty \frac{1}{a_k}} being rational. This growth rate is sharp, as can be seen by iterating the identity {\frac{1}{n} = \frac{1}{n+1} + \frac{1}{n(n+1)}} to obtain a rational Ahmes series of growth rate {(C+o(1))^{2^k}} for any fixed {C>1}.
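
For instance, to spell out this iteration concretely (a standard example, not specific to the paper): repeatedly splitting the last term starting from {\frac{1}{2}} gives

\displaystyle  \frac{1}{2} = \frac{1}{3} + \frac{1}{6} = \frac{1}{3} + \frac{1}{7} + \frac{1}{42} = \frac{1}{3} + \frac{1}{7} + \frac{1}{43} + \frac{1}{1806} = \dots

and in the limit one obtains the rational Ahmes series {\frac{1}{2} = \frac{1}{3} + \frac{1}{7} + \frac{1}{43} + \frac{1}{1807} + \dots}, whose denominators roughly square at each step, which is exactly the {(C+o(1))^{2^k}} growth rate.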

In our paper we show that if {a_k} grows somewhat slower than the above sequences in the sense that {a_{k+1} = o(a_k^2)}, for instance if {a_k \asymp 2^{(2-\varepsilon)^k}} for a fixed {0 < \varepsilon < 1}, then one can find a comparable sequence {b_k \asymp a_k} for which {\sum_{k=1}^\infty \frac{1}{b_k}} is rational. This partially addresses Erdős problem #263, which asked if the sequence {a_k = 2^{2^k}} had this property, and whether any sequence of exponential or slower growth (but with {\sum_{k=1}^\infty 1/a_k} convergent) had this property. Unfortunately we barely miss a full solution of both parts of the problem, since the condition {a_{k+1} = o(a_k^2)} we need just fails to cover the case {a_k = 2^{2^k}}, and also does not quite hold for all sequences going to infinity at an exponential or slower rate.

We also show the following variant: if {a_k} has exponential growth in the sense that {a_{k+1} = O(a_k)} with {\sum_{k=1}^\infty \frac{1}{a_k}} convergent, then there exist nearby natural numbers {b_k = a_k + O(1)} such that {\sum_{k=1}^\infty \frac{1}{b_k}} is rational. This answers the first part of Erdős problem #264 which asked about the case {a_k = 2^k}, although the second part (which asks about {a_k = k!}) is slightly out of reach of our methods. Indeed, we show that the exponential growth hypothesis is best possible in the sense that a random sequence {a_k} that grows faster than exponentially will not have this property. However, this result does not address any specific superexponential sequence such as {a_k = k!}, although it does apply to some sequence {a_k} of the shape {a_k = k! + O(\log\log k)}.

Our methods can also handle higher dimensional variants in which multiple series are simultaneously set to be rational. Perhaps the most striking result is this: we can find an increasing sequence {a_k} of natural numbers with the property that {\sum_{k=1}^\infty \frac{1}{a_k + t}} is rational for every rational {t} (excluding the cases {t = - a_k} to avoid division by zero)! This answers (in the negative) a question of Stolarsky (Erdős problem #266), and also reproves Erdős problem #265 (and in the latter case one can even make {a_k} grow double exponentially fast).

Our methods are elementary and avoid any number-theoretic considerations, relying primarily on the countable dense nature of the rationals and an iterative approximation technique. The first observation is that the task of representing a given number {q} as an Ahmes series {\sum_{k=1}^\infty \frac{1}{a_k}} with each {a_k} lying in some interval {I_k} (with the {I_k} disjoint, and going to infinity fast enough to ensure convergence of the series), is possible if and only if the infinite sumset

\displaystyle  \frac{1}{I_1} + \frac{1}{I_2} + \dots

contains {q}, where {\frac{1}{I_k} = \{ \frac{1}{a}: a \in I_k \}}. More generally, to represent a tuple of numbers {(q_t)_{t \in T}} indexed by some set {T} of numbers simultaneously as {\sum_{k=1}^\infty \frac{1}{a_k+t}} with {a_k \in I_k}, this is the same as asking for the infinite sumset

\displaystyle  E_1 + E_2 + \dots

to contain {(q_t)_{t \in T}}, where now

\displaystyle  E_k = \{ (\frac{1}{a+t})_{t \in T}: a \in I_k \}. \ \ \ \ \ (1)

So the main problem is to get control on such infinite sumsets. Here we use a very simple observation:

Proposition 1 (Iterative approximation) Let {V} be a Banach space, let {E_1,E_2,\dots} be sets with each {E_k} contained in the ball of radius {\varepsilon_k>0} around the origin for some {\varepsilon_k} with {\sum_{k=1}^\infty \varepsilon_k} convergent, so that the infinite sumset {E_1 + E_2 + \dots} is well-defined. Suppose that one has some convergent series {\sum_{k=1}^\infty v_k} in {V}, and sets {B_1,B_2,\dots} converging in norm to zero, such that

\displaystyle  v_k + B_k \subset E_k + B_{k+1} \ \ \ \ \ (2)

for all {k \geq 1}. Then the infinite sumset {E_1 + E_2 + \dots} contains {\sum_{k=1}^\infty v_k + B_1}.

Informally, the condition (2) asserts that {E_k} occupies all of {v_k + B_k} “at the scale {B_{k+1}}“.

Proof: Let {w_1 \in B_1}. Our task is to express {\sum_{k=1}^\infty v_k + w_1} as a series {\sum_{k=1}^\infty e_k} with {e_k \in E_k}. From (2) we may write

\displaystyle  \sum_{k=1}^\infty v_k + w_1 = \sum_{k=2}^\infty v_k + e_1 + w_2

for some {e_1 \in E_1} and {w_2 \in B_2}. Iterating this, we may find {e_k \in E_k} and {w_k \in B_k} such that

\displaystyle  \sum_{k=1}^\infty v_k + w_1 = \sum_{k=m+1}^\infty v_k + e_1 + e_2 + \dots + e_m + w_{m+1}

for all {m}. Sending {m \rightarrow \infty}, we obtain

\displaystyle  \sum_{k=1}^\infty v_k + w_1 = e_1 + e_2 + \dots

as required. \Box

In one dimension, sets of the form {\frac{1}{I_k}} are dense enough that the condition (2) can be satisfied in a large number of situations, leading to most of our one-dimensional results. In higher dimension, the sets {E_k} lie on curves in a high-dimensional space, and so do not directly obey usable inclusions of the form (2); however, for suitable choices of intervals {I_k}, one can take some finite sums {E_{k+1} + \dots + E_{k+d}} which will become dense enough to obtain usable inclusions of the form (2) once {d} reaches the dimension of the ambient space, basically thanks to the inverse function theorem (and the non-vanishing curvatures of the curve in question). For the Stolarsky problem, which is an infinite-dimensional problem, it turns out that one can modify this approach by letting {d} grow slowly to infinity with {k}.

December 01, 2024

Tommaso DorigoTracking Particles With Neuromorphic Computing

At the IV Workshop in Valencia a student from my group, Emanuele Coradin, presented the results of a novel algorithm for the identification of charged particles in a silicon tracker. The novelty is due to the use of neuromorphic computing, which works by encoding detector hits in the time of arrival of current impulses at neurons, and by letting neurons "learn" the true patterns of hits produced by charged particles from the noise due to random hits.


Clifford JohnsonMagic Ingredients Exist!

I’m a baker, as you probably know. I’ve regularly made bread, cakes, pies, and all sorts of things for friends and family. About a year ago, someone in the family was diagnosed with a severe allergy to gluten, and within days we removed all gluten products from the kitchen, began …


November 30, 2024

Clifford JohnsonHope’s Benefits

The good news (following from last post) is that it worked out! I was almost short of the amount I needed to cover the pie, and so that left nothing for my usual decoration... but it was a hit at dinner and for left-overs today, so that's good!

--cvj


November 29, 2024

Doug NatelsonFoams! (or, why my split pea side dish boils over every Thanksgiving)

Foams can be great examples of mechanical metamaterials.  

Adapted from TOC figure of this paper
Consider my shaving cream.  You might imagine that the (mostly water) material would just pool as a homogeneous liquid, since water molecules have a strong attraction for one another.  However, my shaving cream contains surfactant molecules.  These little beasties have a hydrophilic/polar end and a hydrophobic/nonpolar end.  The surfactant molecules can lower the overall energy of the fluid+air system by lowering the energy cost of the liquid/surfactant/air interface compared with the liquid/air interface.  A balancing act between air pressure, surface tension/energy, and gravity has to be played, but under the right circumstances you end up with the formation of a dense foam comprising many, many tiny bubbles.  On the macroscale (much larger than the size of individual bubbles), the foam can look like a very squishy but somewhat mechanically integral solid - it can resist shear, at least a bit, and maintain its own shape against gravity.  For a recent review about this, try this paper (apologies for the paywall) or get a taste of this in a post from last year.

What brought this to mind was my annual annoyance yesterday in preparing what has become a regular side dish at our family Thanksgiving.  That recipe begins with rinsing, soaking, and then boiling split peas in preparation for making a puree.  Every year, without fail, I try to keep a close eye on the split peas as they cook, because they tend to foam up.  A lot.  Interestingly, this happens regardless of how carefully I rinse them before soaking, and the foaming (a dense white foam of few-micron-scale bubbles) begins well before the liquid starts to boil.  I have now learned two things about this.  First, pea protein, which leaches out of the split peas, is apparently a well-known foam-inducing surfactant, as explained in this paper (which taught me that there is a journal called Food Hydrocolloids).  Second, next time I need to use a bigger pot and try adding a few drops of oil to see if that suppresses the foam formation.

Matt von HippelA Tale of Two Experiments

Before I begin, two small announcements:

First: I am now on bluesky! Instead of having a separate link in the top menu for each social media account, I’ve changed the format so now there are social media buttons in the right-hand sidebar, right under the “Follow” button. Currently, they cover tumblr, twitter, and bluesky, but there may be more in future.

Second, I’ve put a bit more technical advice on my “Open Source Grant Proposal” post, so people interested in proposing similar research can have some ideas about how best to pitch it.

Now, on to the post:


Gravitational wave telescopes are possibly the most exciting research program in physics right now. Big, expensive machines with more on the way in the coming decades, gravitational wave telescopes need both precise theoretical predictions and high-quality data analysis. For some, gravitational wave telescopes have the potential to reveal genuinely new physics, to probe deviations from general relativity that might be related to phenomena like dark matter, though so far no such deviations have been conclusively observed. In the meantime, they’re teaching us new consequences of known physics. For example, the unusual population of black holes observed by LIGO has motivated those who model star clusters to consider processes in which the motion of three stars or black holes is related to each other, discovering that these processes are more important than expected.

Particle colliders are probably still exciting to the general public, but for many there is a growing sense of fatigue and disillusionment. Current machines like the LHC are big and expensive, and proposed future colliders would be even costlier and take decades to come online, in addition to requiring a huge amount of effort from the community in terms of precise theoretical predictions and data analysis. Some argue that colliders still might uncover genuinely new physics, deviations from the standard model that might explain phenomena like dark matter, but as no such deviations have yet been conclusively observed people are increasingly skeptical. In the meantime, most people working on collider physics are focused on learning new consequences of known physics. For example, by comparing observed results with theoretical approximations, people have found that certain high-energy processes usually left out of calculations are actually needed to get a good agreement with the data, showing that these processes are more important than expected.

…ok, you see what I did there, right? Was that fair?

There are a few key differences, with implications to keep in mind:

First, collider physics is significantly more expensive than gravitational wave physics. LIGO took about $300 million to build and spends about $50 million a year. The LHC took about $5 billion to build and costs $1 billion a year to run. That cost still puts both well below several other government expenses that you probably consider frivolous (please don’t start arguing about which ones in the comments!), but it does mean collider physics demands a bit of a stronger argument.

Second, the theoretical motivation to expect new fundamental physics out of LIGO is generally considered much weaker than for colliders. A large part of the theoretical physics community thought that they had a good argument why they should see something new at the LHC. In contrast, most theorists have been skeptical of the kinds of modified gravity theories that have dramatic enough effects that one could measure them with gravitational wave telescopes, with many of these theories having other pathologies or inconsistencies that made people wary.

Third, the general public finds astrophysics cooler than particle physics. Somehow, telling people “pairs of black holes collide more often than we thought because sometimes a third star in the neighborhood nudges them together” gets people much more excited than “pairs of quarks collide more often than we thought because we need to re-sum large logarithms differently”, even though I don’t think there’s a real “principled” difference between them. Neither reveals new laws of nature, both are upgrades to our ability to model how real physical objects behave, neither is useful to know for anybody living on Earth in the present day.

With all this in mind, my advice to gravitational wave physicists is to try, as much as possible, not to lean on stories about dark matter and modified gravity. You might learn something, and it’s worth occasionally mentioning that. But if you don’t, you run a serious risk of disappointing people. And you have such a big PR advantage if you just lean on new consequences of bog standard GR, that those guys really should get the bulk of the news coverage if you want to keep the public on your side.

November 28, 2024

Clifford JohnsonHope

The delicious chaos that (almost always) eventually tames into a tasty flaky pastry crust… it’s always a worrying mess to start out, but you trust to your experience, and you carry on, with hope. #thanksgiving


John PreskillHappy 200th birthday, Carnot’s theorem!

In Kenneth Grahame’s 1908 novel The Wind in the Willows, a Mole meets a Water Rat who lives on a River. The Rat explains how the River permeates his life: “It’s brother and sister to me, and aunts, and company, and food and drink, and (naturally) washing.” As the River plays many roles in the Rat’s life, so does Carnot’s theorem play many roles in a thermodynamicist’s.

Nicolas Léonard Sadi Carnot lived in France during the turn of the 19th century. His father named him Sadi after the 13th-century Persian poet Saadi Shirazi. Said father led a colorful life himself,1 working as a mathematician, engineer, and military commander for and before the Napoleonic Empire. Sadi Carnot studied in Paris at the École Polytechnique, whose members populate a “Who’s Who” list of science and engineering. 

As Carnot grew up, the Industrial Revolution was humming. Steam engines were producing reliable energy on vast scales; factories were booming; and economies were transforming. France’s old enemy Britain enjoyed two advantages. One consisted of inventors: Englishmen Thomas Savery and Thomas Newcomen invented the steam engine. Scotsman James Watt then improved upon Newcomen’s design until rendering it practical. Second, northern Britain contained loads of coal that industrialists could mine to power her engines. France had less coal. So if you were a French engineer during Carnot’s lifetime, you should have cared about engines’ efficiencies—how effectively engines used fuel.2

Carnot proved a fundamental limitation on engines’ efficiencies. His theorem governs engines that draw energy from heat—rather than from, say, the motional energy of water cascading down a waterfall. In Carnot’s argument, a heat engine interacts with a cold environment and a hot environment. (Many car engines fall into this category: the hot environment is burning gasoline. The cold environment is the surrounding air into which the car dumps exhaust.) Heat flows from the hot environment to the cold. The engine siphons off some heat and converts it into work. Work is coordinated, well-organized energy that one can directly harness to perform a useful task, such as turning a turbine. In contrast, heat is the disordered energy of particles shuffling about randomly. Heat engines transform random heat into coordinated work.

In The Wind and the Willows, Toad drives motorcars likely powered by internal combustion, rather than by a steam engine of the sort that powered the Industrial Revolution.

An engine’s efficiency is the bang we get for our buck—the upshot we gain, compared to the cost we spend. Running an engine costs the heat that flows between the environments: the more heat flows, the more the hot environment cools, so the less effectively it can serve as a hot environment in the future. An analogous statement concerns the cold environment. So a heat engine’s efficiency is the work produced, divided by the heat spent.

Carnot upper-bounded the efficiency achievable by every heat engine of the sort described above. Let T_{\rm C} denote the cold environment’s temperature; and T_{\rm H}, the hot environment’s. The efficiency can’t exceed 1 - \frac{ T_{\rm C} }{ T_{\rm H} }. What a simple formula for such an extensive class of objects! Carnot’s theorem governs not only many car engines (Otto engines), but also the Stirling engine that competed with the steam engine, its cousin the Ericsson engine, and more.
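
To put in some numbers (mine, purely for scale): a hot environment at T_{\rm H} = 600 kelvins and a cold environment at room temperature, T_{\rm C} \approx 300 kelvins, give an efficiency of at most 1 - \frac{300}{600} = \frac{1}{2}. However cleverly such an engine is designed, it can convert at most half of the heat it draws from the hot environment into work.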

In addition to generality and simplicity, Carnot’s bound boasts practical and fundamental significances. Capping engine efficiencies caps the output one can expect of a machine, factory, or economy. The cap also prevents engineers from wasting their time on daydreaming about more-efficient engines. 

More fundamentally than these applications, Carnot’s theorem encapsulates the second law of thermodynamics. The second law helps us understand why time flows in only one direction. And what’s deeper or more foundational than time’s arrow? People often cast the second law in terms of entropy, but many equivalent formulations express the law’s contents. The formulations share a flavor often synopsized with “You can’t win.” Just as we can’t grow younger, we can’t beat Carnot’s bound on engines. 

Video courtesy of FQxI

One might expect no engine to achieve the greatest efficiency imaginable: 1 - \frac{ T_{\rm C} }{ T_{\rm H} }, called the Carnot efficiency. This expectation is incorrect in one way and correct in another. Carnot did design an engine that could operate at his eponymous efficiency: an eponymous engine. A Carnot engine can manifest as the thermodynamicist’s favorite physical system: a gas in a box topped by a movable piston. The gas undergoes four strokes, or steps, to perform work. The strokes form a closed cycle, returning the gas to its initial conditions.3 

Steampunk artist Todd Cahill beautifully illustrated the Carnot cycle for my book. The gas performs useful work because a teapot sits atop the piston. Pushing the piston upward, the gas lifts the teapot. You can find a more detailed description of Carnot’s engine in Chapter 4 of the book, but I’ll recap the cycle here.

The gas expands during stroke 1, pushing the piston and so outputting work. Maintaining contact with the hot environment, the gas remains at the temperature T_{\rm H}. The gas then disconnects from the hot environment. Yet the gas continues to expand throughout stroke 2, lifting the teapot further. Forfeiting energy, the gas cools. It ends stroke 2 at the temperature T_{\rm C}.

The gas contacts the cold environment throughout stroke 3. The piston pushes on the gas, compressing it. At the end of the stroke, the gas disconnects from the cold environment. The piston continues compressing the gas throughout stroke 4, performing more work on the gas. This work warms the gas back up to T_{\rm H}.

In summary, Carnot’s engine begins hot, performs work, cools down, has work performed on it, and warms back up. The gas performs more work on the piston than the piston performs on it. Therefore, the teapot rises (during strokes 1 and 2) more than it descends (during strokes 3 and 4). 

At what cost, if the engine operates at the Carnot efficiency? The engine mustn’t waste heat. One wastes heat by roiling up the gas unnecessarily—by expanding or compressing it too quickly. The gas must stay in equilibrium, a calm, quiescent state. One can keep the gas quiescent only by running the cycle infinitely slowly. The cycle will take an infinitely long time, outputting zero power (work per unit time). So one can achieve the perfect efficiency only in principle, not in practice, and only by sacrificing power. Again, you can’t win.

Efficiency trades off with power.

Carnot’s theorem may sound like the Eeyore of physics, all negativity and depression. But I view it as a companion and backdrop as rich, for thermodynamicists, as the River is for the Water Rat. Carnot’s theorem curbs diverse technologies in practical settings. It captures the second law, a foundational principle. The Carnot cycle provides intuition, serving as a simple example on which thermodynamicists try out new ideas, such as quantum engines. Carnot’s theorem also provides what physicists call a sanity check: whenever a researcher devises a new (for example, quantum) heat engine, they can confirm that the engine obeys Carnot’s theorem, to help confirm their proposal’s accuracy. Carnot’s theorem also serves as a school exercise and a historical tipping point: the theorem initiated the development of thermodynamics, which continues to this day. 

So Carnot’s theorem is practical and fundamental, pedagogical and cutting-edge—brother and sister, and aunts, and company, and food and drink. I just wouldn’t recommend trying to wash your socks in Carnot’s theorem.

1To a theoretical physicist, working as a mathematician and an engineer amounts to leading a colorful life.

2People other than Industrial Revolution–era French engineers should care, too.

3A cycle doesn’t return the hot and cold environments to their initial conditions, as explained above.

November 26, 2024

Tommaso DorigoK0 Regeneration

Last week I got to the part of my course in Subnuclear Physics for Statisticians (yes, there is such a course at the Department of Statistical Sciences in Padova, and I have been giving it since its inception 6 years ago!) where I discuss CP violation in the system of neutral K mesons. In one of the most surprising experiments of modern physics, the group of Cronin and Fitch proved in 1964 that the combination of the two symmetry operations called "charge conjugation" C and "parity inversion" P could in some cases modify the properties of physical systems.


Clifford JohnsonDecoding the Universe!

I realised just now that I entirely forgot (it seems) to post about an episode of PBS' show Nova called "Decoding the Universe: Cosmos" which aired back in the Spring. I thought they did a good job of talking about some of the advances in our understanding that have happened over the last 50 years (the idea is that it is the 50th anniversary of the show) in areas of astrophysics and cosmology. I was a contributor, filmed at the top of Mount Wilson at the Observatory where Hubble made his famous discoveries about the size of the universe, and its expansion. I talk about some of those discoveries and other ideas in the show. Here's a link to the "Decoding the Universe" site. (You can also find it on YouTube.)

If you follow the link you'll notice another episode up there: "Decoding the Universe: Quantum". That's a companion they made, and it focuses on our understanding of quantum physics, connecting it to things in the everyday world, and also back to black holes and things astrophysical and cosmological. It also does a good job of shining a light on many concepts.

I was also a contributor to this episode, and it was a real delight to work with them in a special role: I got to unpack many of the foundational quantum mechanical concepts (transitions in atoms, stimulated emission, tunnelling, etc) to camera by doing line drawings while I explained - and kudos [...]


November 24, 2024

Doug NatelsonNanopasta, no, really

Fig. 1 from the linked paper
Here is a light-hearted bit of research that touches on some fun physics.  As you might readily imagine, there is a good deal of interdisciplinary and industrial interest in wanting to create fine fibers out of solution-based materials.  One approach, which has historical roots that go back even two hundred years before this 1887 paper, is electrospinning.  Take a material of interest, dissolve it in a solvent, and feed a drop of that solution onto the tip of an extremely sharp metal needle.  Then apply a big voltage (say a few to tens of kV) between that tip and a nearby grounded substrate.  If the solution has some amount of conductivity, the liquid will form a cone on the tip, and at sufficiently large voltages and small target distances, the droplet will become unstable and form a jet off into the tip-target space.  With the right range of fluid properties (viscosity, conductivity, density, concentration) and the right evaporation rate for the solvent, the result is a continuously forming, drying fiber that flows off the end of the tip.  A further instability amplifies any curves in the fiber path, so that you get a spiraling fiber spinning off onto the substrate.   There are many uses for such fibers, which can be very thin.

The authors of the paper in question wanted to make fibers from starch, which is nicely biocompatible for medical applications.  So, starting from wheat flour and formic acid, they worked out viable parameters and were able to electrospin fibers of wheat starch (including some gluten - sorry, for those of you with gluten intolerances) into nanofibers 300-400 nm in diameter.  The underlying material is amorphous (so, no appreciable starch crystallization).  The authors had fun with this and called the result "nanopasta", but it may actually be useful for certain applications.


November 22, 2024

Matt von HippelThe Nowhere String

Space and time seem as fundamental as anything can get. Philosophers like Immanuel Kant thought that they were inescapable, that we could not conceive of the world without space and time. But increasingly, physicists suspect that space and time are not as fundamental as they appear. When they try to construct a theory of quantum gravity, physicists find puzzles, paradoxes that suggest that space and time may just be approximations to a more fundamental underlying reality.

One piece of evidence that quantum gravity researchers point to are dualities. These are pairs of theories that seem to describe different situations, including with different numbers of dimensions, but that are secretly indistinguishable, connected by a “dictionary” that lets you interpret any observation in one world in terms of an equivalent observation in the other world. By itself, duality doesn’t mean that space and time aren’t fundamental: as I explained in a blog post a few years ago, it could still be that one “side” of the duality is a true description of space and time, and the other is just a mathematical illusion. To show definitively that space and time are not fundamental, you would want to find a situation where they “break down”, where you can go from a theory that has space and time to a theory that doesn’t. Ideally, you’d want a physical means of going between them: some kind of quantum field that, as it shifts, changes the world between space-time and not space-time.

What I didn’t know when I wrote that post was that physicists already knew about such a situation in 1993.

Back when I was in pre-school, famous string theorist Edward Witten was trying to understand something that others had described as a duality, and realized there was something more going on.

In string theory, particles are described by lengths of vibrating string. In practice, string theorists like to think about what it’s like to live on the string itself, seeing it vibrate. In that world, there are two dimensions, one space dimension back and forth along the string and one time dimension going into the future. To describe the vibrations of the string in that world, string theorists use the same kind of theory that people use to describe physics in our world: a quantum field theory. In string theory, you have a two-dimensional quantum field theory stuck “inside” a theory with more dimensions describing our world. You see that this world exists by seeing the kinds of vibrations your two-dimensional world can have, through a type of quantum field called a scalar field. With ten scalar fields, ten different ways you can push energy into your stringy world, you can infer that the world around you is a space-time with ten dimensions.

String theory has “extra” dimensions beyond the three of space and one of time we’re used to, and these extra dimensions can be curled up in various ways to hide them from view, often using a type of shape called a Calabi-Yau manifold. In the late 80’s and early 90’s, string theorists had found a similarity between the two-dimensional quantum field theories you get folding string theory around some of these Calabi-Yau manifolds and another type of two-dimensional quantum field theory related to theories used to describe superconductors. People called the two types of theories dual, but Witten figured out there was something more going on.

Witten described the two types of theories in the same framework, and showed that they weren’t two equivalent descriptions of the same world. Rather, they were two different ways one theory could behave.

The two behaviors were connected by something physical: the value of a quantum field called a modulus field. This field can be described by a number, and that number can be positive or negative.

When the modulus field is a large positive number, then the theory behaves like string theory twisted around a Calabi-Yau manifold. In particular, the scalar fields have many different values they can take, values that are smoothly related to each other. These values are nothing more or less than the position of the string in space and time. Because the scalars can take many values, the string can sit in many different places, and because the values are smoothly related to each other, the string can smoothly move from one place to another.

When the modulus field is a large negative number, then the theory is very different. What people thought of as the other side of the duality, a theory like the theories used to describe superconductors, is the theory that describes what happens when the modulus field is large and negative. In this theory, the scalars can no longer take many values. Instead, they have one option, one stable solution. That means that instead of there being many different places the string could sit, describing space, there are no different places, and thus no space. The string lives nowhere.

These are two very different situations, one with space and one without. And they’re connected by something physical. You could imagine manipulating the modulus field, using other fields to funnel energy into it, pushing it back and forth from a world with space to a world of nowhere. Much more than the examples I was aware of, this is a super-clear example of a model where space is not fundamental, but where it can be manipulated, existing or not existing based on physical changes.

We don’t know whether a model like this describes the real world. But it’s gratifying to know that it can be written down, that there is a picture, in full mathematical detail, of how this kind of thing works. Hopefully, it makes the idea that space and time are not fundamental sound a bit more reasonable.

n-Category Café Axiomatic Set Theory 9: The Axiom of Choice

Previously: Part 8. Next: Part 10.

It’s the penultimate week of the course, and up until now we’ve abstained from using the axiom of choice. But this week we gorged on it.

We proved that all the usual things are equivalent to the axiom of choice: Zorn’s lemma, the well ordering principle, cardinal comparability (given two sets, one must inject into the other), and the souped-up version of cardinal comparability that compares not just two sets but an arbitrary collection of them: for any nonempty family of sets (X_i)_{i \in I}, there is some X_i that injects into all the others.

The section I most enjoyed writing and teaching was the last one, on unnecessary uses of the axiom of choice. I’m grateful to Todd Trimble for explaining to me years ago how to systematically remove dependence on choice from arguments in basic general topology. (For some reason, it’s very tempting in that subject to use choice unnecessarily.) I talk about this at the very end of the chapter.

Section of a surjection

n-Category Café Axiomatic Set Theory 10: Cardinal Arithmetic

Previously: Part 9.

The course is over! The grand finale was the theorem that

X \times Y \cong X + Y \cong \max(X, Y)

for all infinite sets X and Y. Proving this required most of the concepts and results from the second half of the course: well ordered sets, the Cantor–Bernstein theorem, the Hartogs theorem, Zorn’s lemma, and so on.

I gave the merest hints of the world of cardinal arithmetic that lies beyond. If I’d had more time, I would have got into large sets (a.k.a. large cardinals), but the course was plenty long enough already.

Thanks very much to everyone who’s commented here so far, but thank you most of all to my students, who really taught me an enormous amount.

Part of the proof that an infinite set is isomorphic to its own square

November 21, 2024

Matt Strassler Celebrating the Standard Model: The Magic Angle

Particle physicists describe how elementary particles behave using a set of equations called their “Standard Model.” How did they become so confident that a set of math formulas, ones that can be compactly summarized on a coffee cup, can describe so much of nature?

My previous “Celebrations of the Standard Model” (you can find the full set here) have included the stories of how we know the strengths of the forces, the number of types (“flavors” and “colors”) and the electric charges of the quarks, and the structures of protons and neutrons, among others. Along the way I explained how W bosons, the electrically charged particles involved in the weak nuclear force, quickly decay (i.e. disintegrate into other particles). But I haven’t yet explained how their cousin, the electrically-neutral Z boson, decays. That story brings us to a central feature of the Standard Model.

Here’s the big picture. There’s a super-important number that plays a central role in the Standard Model. It’s a sort of angle (in a sense that will become clearer in Figs. 2 and 3 below), and is called θw or θweak. Through the action of the Higgs field on the particles, this one number determines many things, including

  • the relative masses of the W and Z bosons
  • the relative lifetimes of the W and Z bosons
  • the relative probabilities for Z bosons to decay to one type of particle versus another
  • the relative rates to produce different types of particles in scattering of electrons and positrons at very high energies
  • the relative rates for processes involving scattering neutrinos off atoms at very low energies
  • asymmetries in weak nuclear processes (ones that would be symmetric in corresponding electromagnetic processes)

and many others.

This is an enormously ambitious claim! When I began my graduate studies in 1988, we didn’t know if all these predictions would work out. But as the data from experiments came in during the 1990s and beyond, it became clear that every single one of them matched the data quite well. There were — and still are — no exceptions. And that’s why particle physicists became convinced that the Standard Model’s equations are by far the best they’ve ever found.

As an illustration, Fig. 1 shows, as a function of sin θw, the probabilities for Z bosons to decay to each type of particle and its anti-particle. Each individual probability is indicated by the size of the gap between one line and the line above. The total probability always adds up to 1, but the individual probabilities depend on the value of θw. (For instance, the width of the gap marked “muon” indicates the probability for a Z to decay to a muon and an anti-muon; it is about 5% at sin θw = 0, about 3% at sin θw = 1/2, and over 15% at sin θw = 1.)

Figure 1: In the Standard Model, once sin θw is known, the probabilities for a Z boson to decay to other particles and their anti-particles are predicted by the sizes of the gaps at that value of sin θw. Other measurements (see Fig. 3) imply sin θw is approximately 1/2 , and thus predict the Z decay probabilities to be those found in the green window. As Fig. 5 will show, data agrees with these predictions.

As we’ll see in Fig. 3, the W and Z boson masses imply (if the Standard Model is valid) that sin θw is about 1/2. Using that measurement, we can then predict that all the various decay probabilities should be given within the green rectangle (if the Standard Model is valid). These predictions, made in the mid-1980s, proved correct in the 1990s; see Fig. 5 below.

This is what I’ll sketch in today’s post. In future posts I’ll go further, showing how this works with high precision.

The Most Important Angle in Particle Physics

Angles are a common feature of geometry and nature: 90 degree angles of right-angle triangles, the 60 degree angles of equilateral triangles, the 104.5 degree angle between the two hydrogen-oxygen bonds in a water molecule, and so many more. But some angles, more abstract, turn out to be even more important. Today I’ll tell you about θw , which is a shade less than 30 degrees (π/6 radians).

Note: This angle is often called “the Weinberg angle”, based on Steven Weinberg’s 1967 version of the Standard Model, but it should be called the “weak-mixing angle”, as it was first introduced seven years earlier by Sheldon Glashow, before the idea of the Higgs field.

This is the angle that lies at the heart of the Standard Model: the smallest angle of the right-angle triangle shown in Fig. 2. Two of its sides represent the strengths g1 and g2 of two of nature’s elementary forces: the weak-hypercharge force and the weak-isospin force. According to the Standard Model, the machinations of the Higgs field transform them into more familiar forces: the electromagnetic force and the weak nuclear force. (The Standard Model is often characterized by the code SU(3)xSU(2)xU(1); weak-isospin and weak-hypercharge are the SU(2) and U(1) parts, while SU(3) gives us the strong nuclear force).

Figure 2: The electroweak right-triangle, showing the angle θw. The lengths of two of its sides are proportional to the strengths g1 and g2 of the “U(1)” weak-hypercharge force and the “SU(2)” weak-isospin force.

To keep things especially simple today, let’s just say θw = 30 degrees, so that sin θw = 1/2. In a later post, we’ll see the angle is closer to 28.7 degrees, and this makes a difference when we’re being precise.

The Magic Angle and the W and Z Bosons

The Higgs field gives masses to the W and Z bosons, and the structure of the Standard Model predicts a very simple relation, given by the electroweak triangle as shown at the top of Fig. 3:

  • m_W/m_Z = \cos \theta_W

This has the consequence shown at the top of Fig. 3, rewritten as a prediction

  • 1 - m_W^2/m_Z^2 = \sin^2 \theta_W

If sin θw = 1/2 , this quantity is predicted to be 1/4 = 0.250. Measurements (mW = 80.4 GeV/c2 and mZ = 91.2 GeV/c2) show it to be 0.223. Agreement isn’t perfect, indicating that the angle isn’t exactly 30 degrees. But it is close enough for today’s post.
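
Here is a minimal Python sketch of that arithmetic, using only the rounded masses quoted in the text and the tree-level relation above (quantum corrections ignored):

    import math

    m_W = 80.4   # W boson mass in GeV/c2 (rounded value quoted above)
    m_Z = 91.2   # Z boson mass in GeV/c2 (rounded value quoted above)

    predicted = 0.25                      # sin^2(theta_w) if theta_w were exactly 30 degrees
    measured  = 1 - (m_W / m_Z) ** 2      # the quantity 1 - m_W^2/m_Z^2

    print(round(measured, 3))             # 0.223
    print(round(predicted, 3))            # 0.25
    # angle implied by the masses, at tree level:
    print(round(math.degrees(math.asin(math.sqrt(measured))), 1))   # about 28.2 degrees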

Where does this formula for the W and Z masses come from? Click here for details:

Central to the Standard Model is the so-called “Higgs field”, which has been switched on since very early in the Big Bang. The Higgs field is responsible for the masses of all the known elementary particles, but in general, though we understand why the masses aren’t zero, we can’t predict their values. However, there’s one interesting exception. The ratio of the W and Z bosons’ masses is predicted.

Before the Higgs field switched on, here’s how the weak-isospin and weak-hypercharge forces were organized: there were

  • 3 weak isospin fields, called W+, W- and W0, whose particles (of the same names) had zero rest mass
  • 1 weak-hypercharge field, usually called X, whose particle (of the same name) had zero rest mass

After the Higgs field switched on by an amount v, however, these four fields were reorganized, leaving

  • One, called the electromagnetic field, with particles called “photons” with zero rest mass.
  • One, called Z0 or just Z, now has particles (of the same names) with rest mass mZ
  • Two, still called W+ and W-, have particles (of the same names) with rest mass mW

Central to this story is θw. In particular, the relationship between the photon and Z and the original W0 and X involves this angle. The picture below depicts this relation, given also as an equation

Figure 3a: A photon is mostly an X with a small amount of W0, while a Z is mostly a W0 with a small amount of X. The proportions are determined by θw .

The W+ and W- bosons get their masses through their interaction, via the weak-isospin force, with the Higgs field. The Z boson gets its mass in a similar way, but because the Z is a mixture of W0 and X, both the weak-isospin and weak-hypercharge forces play a role. And so mZ depends on both g1 and g2, while mW depends only on g2. Thus

\frac{m_W}{m_Z} = \frac{ g_2 v}{\sqrt{g_1^2+g_2^2} v}= \frac{ g_2 }{\sqrt{g_1^2+g_2^2}}= \cos \theta_W

where v is the “value” or strength of the switched-on Higgs field, and in the last step I have used the electroweak triangle of Fig. 2.
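
A minimal sketch of this triangle relation in code: the coupling strengths g1 and g2 are not quoted in this post, so the values below are rough, assumed ones, used only to illustrate that the Higgs field's value v cancels in the ratio and that mW/mZ comes out equal to cos θw:

    import math

    g1, g2 = 0.36, 0.65   # assumed rough values of the weak-hypercharge and weak-isospin strengths
    v = 1.0               # arbitrary placeholder; v cancels in the ratio, so any value works

    theta_w = math.atan2(g1, g2)                       # angle of the electroweak right-triangle of Fig. 2
    mass_ratio = (g2 * v) / (math.sqrt(g1**2 + g2**2) * v)

    print(round(math.degrees(theta_w), 1))   # about 29 degrees, close to the 30 used in this post
    print(round(mass_ratio, 3))              # 0.875
    print(round(math.cos(theta_w), 3))       # 0.875, the same number, as the formula promises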

Figure 3: Predictions (*before accounting for small quantum corrections) in the Standard Model with sin θw = 1/2, compared with experiments. (Top) A simple prediction for the ratio of W and Z boson masses agrees quite well with experiment. (Bottom) The prediction for the ratio of W and Z boson lifetimes also agrees very well with experiment.

A slightly more complex relation relates the W boson’s lifetime tW and the Z boson’s lifetime tZ (this is the average time between when the particle is created and when it decays.) This is shown at the bottom of Fig. 3.

  • \frac{t_W m_W}{t_Z m_Z} = \frac{86}{81}

This is a slightly odd-looking formula; while 81 = 9² is a simple number, 86 is a weird one. Where does it come from? We’ll see in just a moment. In any case, as seen in Fig. 3, agreement between theoretical prediction and experiment is excellent.

If the Standard Model were wrong, there would be absolutely no reason for these two predictions to be even close. So this is a step in the right direction. But it is hardly the only one. Let’s check the detailed predictions in Figure 1.

W and Z Decay Probabilities

Here’s what the Standard Model has to say about how W and Z bosons can decay.

W Decays

In this earlier post, I explained that W bosons can decay (oversimplifying slightly) in five ways:

  • to an electron and a corresponding anti-neutrino
  • to a muon and a corresponding anti-neutrino
  • to a tau and a corresponding anti-neutrino
  • to a down quark and an up anti-quark
  • to a strange quark and a charm anti-quark

(A decay to a bottom quark and top anti-quark is forbidden by the rule of decreasing rest mass; the top quark’s rest mass is larger than the W’s, so no such decay can occur.)

These modes have simple probabilities, according to the Standard Model, and they don’t depend on sin θw (except through small quantum corrections which we’re ignoring here). The first three have probability 1/9. Moreover, remembering each quark comes in three varieties (called “colors”), each color of quark also occurs with probability 1/9. Altogether the predictions for the probabilities are as shown in Fig. 4, along with measurements, which agree well with the predictions. When quantum corrections (such as those discussed in this post, around its own Fig. 4) are included, agreement is much better; but that’s beyond today’s scope.

Figure 4: The W boson decay probabilities as predicted (*before accounting for small quantum corrections) by the Standard Model; these are independent of sin θw . The predictions agree well with experiment.

Because the W+ and W- are each other’s anti-particles, W+ decay probabilities are the same as those for W-, except with all particles switched with their anti-particles.
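
For readers who like to see the bookkeeping spelled out, here is a minimal sketch that tallies these tree-level W decay probabilities (nothing beyond the counting described above; quantum corrections are ignored):

    from fractions import Fraction

    # W- decay modes described above; each underlying mode has probability 1/9,
    # and each quark mode is tripled because quarks come in three colors.
    lepton_modes = ["electron + anti-neutrino", "muon + anti-neutrino", "tau + anti-neutrino"]
    quark_modes  = ["down quark + up anti-quark", "strange quark + charm anti-quark"]

    probabilities = {m: Fraction(1, 9) for m in lepton_modes}
    probabilities.update({m: 3 * Fraction(1, 9) for m in quark_modes})

    for mode, p in probabilities.items():
        print(mode, p)        # 1/9 for each lepton mode, 1/3 for each quark mode
    print("total:", sum(probabilities.values()))    # adds up to 1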

Z Decays

Unlike W decays, Z decays are complicated and depend on sin θw. If sin θw = 1/2, the Standard Model predicts that the probability for a Z boson to decay to a particle/anti-particle pair, where the particle has electric charge Q and weak-isospin-component T = +1 or -1 [technically, isospin’s third component, times 2], is proportional to

  • 2(Q/2 - T)^2 + 2(Q/2)^2 = 2 - 2TQ + Q^2

where I used T^2 = 1 in the final expression. The fact that this answer is built from a sum of two different terms, only one of which involves T (weak-isospin), is a sign of the Higgs field’s effects, which typically marry two different types of fields in the Standard Model, only one of which has weak-isospin, to create the more familiar ones.

This implies the relative decay probabilities (remembering quarks come in three “colors”) are

  • For electrons, muons and taus (Q=-1, T=-1): 1
  • For each of the three neutrinos (Q=0, T=1): 2
  • For down-type quarks (Q=-1/3, T=-1): 13/9 × 3 = 13/3
  • For up-type quarks (Q=2/3, T=1): 10/9 × 3 = 10/3

These are shown at left in Fig. 5.

Figure 5: The Z boson decay probabilities as predicted (*before accounting for small quantum corrections) by the Standard Model at sin θw = 1/2 (see Fig. 1), and compared to experiment. The three neutrino decays cannot be measured separately, so only their sum is shown. Of the quarks, only the bottom and charm decays can be separately measured, so the others are greyed out. But the total decay to quarks can also be measured, meaning three of the five quark predictions can be checked directly.

The sum of all those numbers (remembering that there are three down-type quarks and three up-type quarks, but again top quarks can’t appear due to the rule of decreasing rest mass) is:

  • 1 + 1 + 1 + 2 + 2 + 2 + 13/3 + 13/3 + 13/3 + 10/3 + 10/3 = 86/3.

And that’s where the 86 seen in the lifetime ratio (Fig. 3) comes from.

To get predictions for the actual decay probabilities (rather than just the relative probabilities), we should divide each relative probability by 86/3, so that the sum of all the probabilities together is 1. This gives us

  • For electrons, muons and taus (Q=-1, T=-1): 3/86
  • For each of the three neutrinos (Q=0, T=1): 6/86
  • For down-type quarks (Q=-1/3, T=-1) : 13/86
  • For up-type quarks (Q=2/3, T=1): 10/86

as shown on the right-hand side of Fig. 5; these are the same as those of Fig. 1 at sin θw = 1/2. Measured values are also shown in Fig. 5 for electrons, muons, taus, the combination of all three neutrinos, the bottom quark, the charm quark, and (implicitly) the sum of all five quarks. Again, they agree well with the predictions.
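
Here is a minimal sketch that reproduces this whole chain of numbers (the relative weights from the formula above, the 86/3 total, the branching fractions, and the 86/81 factor in the lifetime ratio) at sin θw = 1/2, without quantum corrections:

    from fractions import Fraction

    def weight(Q, T, colors=1):
        # relative Z-decay weight 2(Q/2 - T)^2 + 2(Q/2)^2 = 2 - 2TQ + Q^2, times the color count
        return colors * (2 - 2 * T * Q + Q * Q)

    charged_lepton = weight(Fraction(-1), -1)           # 1
    neutrino       = weight(Fraction(0), +1)            # 2
    down_type      = weight(Fraction(-1, 3), -1, 3)     # 13/3
    up_type        = weight(Fraction(2, 3), +1, 3)      # 10/3

    # 3 charged leptons, 3 neutrinos, 3 down-type quarks, 2 up-type quarks (no top):
    total = 3 * charged_lepton + 3 * neutrino + 3 * down_type + 2 * up_type
    print("total relative weight:", total)                    # 86/3
    print("per charged lepton:", charged_lepton / total)      # 3/86
    print("per neutrino:", neutrino / total)                  # 6/86 = 3/43
    print("per down-type quark:", down_type / total)          # 13/86
    print("per up-type quark:", up_type / total)              # 10/86 = 5/43
    print("lifetime-ratio factor:", (3 * total) / 81)         # 86/81, the number in Fig. 3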

This is already pretty impressive. The Standard Model and its Higgs field predict that just a single angle links a mass ratio, a lifetime ratio, and the decay probabilities of the Z boson. If the Standard Model were significantly off base, some or all of the predictions would fail badly.

However, this is only the beginning. So if you’re not yet convinced, consider reading the bonus section below, which gives four additional classes of examples, or stay tuned for the next post in this series, where we’ll look at how things improve with a more precise value of sin θw.

Bonus: Other Predictions of the Standard Model

Many other processes involving the weak nuclear force depend in some way on sin θw. Here are a few examples.

High-Energy Electron-Positron Collisions (click for details)

In this post I discussed the ratio of the rates for two important processes in collisions of electrons and positrons:

  • electron + positron → any quark + its anti-quark
  • electron + positron → muon + anti-muon

This ratio is simple at low energy (E << mZ c2), because it involves mainly electromagnetic effects, and thus depends only on the electric charges of the particles that can be produced.

Figure 6: The ratio of the rates for quark/anti-quark production versus muon/anti-muon production in high-energy electron-positron scattering depends on sin θw.

But at high energy (E >> mZ c2) , the prediction changes, because both electromagnetic and weak nuclear forces play a role. In fact, they “interfere”, meaning that one must combine their effects in a quantum way before calculating probabilities.

[What this cryptic quantum verbiage really means is this. At low energy, if Sf is the complex number representing the effect of the photon field on this process, then the rate for the process is |Sf|2. But here we have to include both Sf and SZ, where the latter is the effect of the Z field. The total rate is not |Sf|2 + |SZ|2 , a sum of two separate probabilities. Instead it is |Sf+SZ|2 , which could be larger or smaller than |Sf|2 + |SZ|2 — a sign of interference.]
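
A toy numerical illustration of that last point; the two amplitudes below are made-up numbers, not the actual Sf and SZ for any real process:

    # two made-up complex "effects", standing in for Sf (photon field) and SZ (Z field)
    Sf = 1.0 + 0.0j
    SZ = -0.4 + 0.3j

    separate_probabilities = abs(Sf) ** 2 + abs(SZ) ** 2   # |Sf|^2 + |SZ|^2
    quantum_combination    = abs(Sf + SZ) ** 2             # |Sf + SZ|^2

    print(round(separate_probabilities, 3))   # 1.25
    print(round(quantum_combination, 3))      # 0.45; smaller here, i.e. destructive interference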

In the Standard Model, the answer depends on sin θw. The LEP 2 collider measured this ratio at energies up to 209 GeV, well above mZ c2. If we assume sin θw is approximately 1/2, data agrees with predictions. [In fact, the ratio can be calculated as a function of energy, and thus made more precise; data agrees with these more precise predictions, too.]

Low-Energy Neutrino-Nucleus Collisions (click for details)

When electrons scatter off protons and neutrons, they do so via the electromagnetic force. For electron-proton collisions, this is not surprising, since both protons and electrons carry electric charge. But it’s also true for neutrons, because even though the neutron is electrically neutral, the quarks inside it are not.

By contrast, neutrinos are electrically neutral, and so they will only scatter off protons and neutrons (and their quarks) through the weak nuclear force. More precisely, they do so through the W and Z fields (via so-called “virtual W and Z particles” [which aren’t particles.]) Oversimplifying, if one can

  • obtain beams of muon neutrinos, and
  • scatter them off deuterons (nuclei of heavy hydrogen, which have one proton and one neutron), or off something that similarly has equal numbers of protons and neutrons,

then simple predictions can be made for the two processes shown at the top of Fig. 7, in which the nucleus shatters (turning into multiple “hadrons” [particles made from quarks, antiquarks and gluons]) and either a neutrino or a muon emerges from the collision. (The latter can be directly observed; the former can be inferred from the non-appearance of any muon.) Analogous predictions can be made for the anti-neutrino beams, as shown at the bottom of Fig. 7.

Figure 7: The ratios of the rates for these four neutrino/deuteron or anti-neutrino/deuteron scattering processes depend only on sin θw in the Standard Model.

The ratios of these four processes are predicted to depend, in a certain approximation, only on sin θw. Data agrees with these predictions for sin θw approximately 1/2.

More complex and detailed predictions are also possible, and these work too.

Asymmetries in Electron-Positron Collisions (click for details)

There are a number of asymmetric effects that come from the fact that the weak nuclear force is

  • not “parity-invariant”, (i.e. not the same when viewed in a mirror), and
  • not “charge-conjugation invariant” (i.e. not the same when all electric charges are flipped)

though it is almost symmetric under doing both, i.e. putting the world in a mirror and flipping electric charge. No such asymmetries are seen in electromagnetism, which is symmetric under both parity and charge-conjugation separately. But when the weak interactions play a role, asymmetries appear, and they all depend, yet again, on sin θw.

Two classes of asymmetries of great interest are:

  • “Left-Right Asymmetry” (Fig. 8): The rate for electrons to make Z bosons in collisions with positrons depends on which way the electrons are “spinning” (i.e. whether they carry angular momentum along or opposite to their direction of motion.)
  • “Forward-Backward Asymmetry” (Fig. 9): The rate for electron-positron collisions to make particle-antiparticle pairs depends on whether the particles are moving roughly in the same direction as the electrons or in the same direction as the positrons.
Figure 8: The left-right asymmetry for Z boson production, whereby electrons “polarized” to spin one way do not produce Z’s at the same rate as electrons polarized the other way.
Figure 9: The forward-backward asymmetry for bottom quark production; the rate for the process at left is not the same as the rate for the process at right, due to the weak nuclear force.

As with the high-energy electron-positron scattering discussed above, interference between effects of the electromagnetic and Z fields, and the Z boson’s mass, causes these asymmetries to change with energy. They are particularly simple, though, both when E = mZ c2 and when E >> mZ c2.

A number of these asymmetries are measurable. Measurements of the left-right asymmetry were made at the Stanford Linear Accelerator Center (SLAC) at their Linear Collider (SLC), while I was a graduate student there. Meanwhile, measurements of the forward-backward asymmetries were made at LEP and LEP 2. All of these measurements agreed well with the Standard Model’s predictions.

A Host of Processes at the Large Hadron Collider (click for details)

Fig. 10 shows predictions (gray bands) for total rates of over seventy processes in the proton-proton collisions at the Large Hadron Collider. Also shown are measurements (colored squares) made at the CMS experiment. (A similar plot is available from the ATLAS experiment.) Many of these predictions, which are complicated as they must account for the proton’s internal structure, depend on sin θw.

Figure 10: Rates for the production of various particles at the Large Hadron Collider, as measured by the CMS detector collaboration. Grey bands are theoretical predictions; color bands are experimental measurements, with experimental uncertainties shown as vertical bars; colored bars with hatching above are upper limits for cases where the process has not yet been observed. (In many cases, agreement is so close that the grey bands are hard to see.)

While minor discrepancies between data and theory appear, they are of the sort that one would expect in a large number of experimental measurements. Despite the rates varying by more than a factor of a billion from the most common process to the least common, there is not a single major discrepancy between prediction and data.

Many more measurements than just these seventy are performed at the Large Hadron Collider, not least because there are many more details in a process than just its total rate.

A Fortress

What I’ve shown you today is just a first step, and one can do better. When we look closely, especially at certain asymmetries described in the bonus section, we see that sin θw = 1/2 (i.e. θw = 30 degrees) isn’t a good enough approximation. (In particular, if sin θw were exactly 1/2, then the left-right asymmetry in Z production would be zero, and the forward-backward asymmetry for muon and tau production would also be zero. That rough prediction isn’t true; the asymmetries are small, only about 15%, but they are clearly not zero.)

So to really be convinced of the Standard Model’s validity, we need to be more precise about what sin θw is. That’s what we’ll do next time.

Nevertheless, you can already see that the Standard Model, with its Higgs field and its special triangle, works exceedingly well in predicting how particles behave in a wide range of circumstances. Over the past few decades, as it has passed one experimental test after another, it has become a fortress, extremely difficult to shake and virtually impossible to imagine tearing down. We know it can’t be the full story because there are so many questions it doesn’t answer or address. Someday it will fail, or at least require additions. But within its sphere of influence, it rules more powerfully than any theoretical idea known to our species.

November 15, 2024

Tommaso DorigoAnd The USERN Prize Winners For 2024 Are....

USERN (Universal Scientific Education and Research Network, https://usern.org) is a non-profit, non-governmental organization that supports interdisciplinary science across borders. Founded in 2015 by the distinguished Iranian immunologist Prof. Nima Rezaei, USERN has grown to 26,000 members in 140 countries, spanning 22 scientific disciplines. Since November 2022 I have been its President.

read more

Matt Strassler Speaking at Brown University Nov 18th

Just a brief note, in a very busy period, to alert those in the Providence, RI area that I’ll be giving a colloquium talk at the Brown University Physics Department on Monday November 18th at 4pm. Such talks are open to the public, but are geared toward people who’ve had at least one full year of physics somewhere in their education. The title is “Exploring The Foundations of our Quantum Cosmos”. Here’s a summary of what I intend to talk about:

The discovery of the Higgs boson in 2012 marked a major milestone in our understanding of the universe, and a watershed for particle physics as a discipline. What’s known about particles and fields now forms a nearly complete short story, an astonishing, counterintuitive tale of relativity and quantum physics. But it sits within a larger narrative that is riddled with unanswered questions, suggesting numerous avenues of future research into the nature of spacetime and its many fields. I’ll discuss both the science and the challenges of accurately conveying its lessons to other scientists, to students, and to the wider public.

Scott Aaronson Steven Rudich (1961-2024)

I was sure my next post would be about the election—the sword of Damocles hanging over the United States and civilization as a whole. Instead, I have sad news, but also news that brings memories of warmth, humor, and complexity-theoretic insight.

Steven Rudich—professor at Carnegie Mellon, central figure of theoretical computer science since the 1990s, and a kindred spirit and friend—has died at the too-early age of 63. While I interacted with him much more seldom than I wish I had, it would be no exaggeration to call him one of the biggest influences on my life and career.

I first became aware of Steve at age 17, when I read the Natural Proofs paper that he coauthored with Razborov. I was sitting in the basement computer room at Telluride House at Cornell, and still recall the feeling of awe that came over me with every page. This one paper changed my scientific worldview. It expanded my conception of what the P versus NP problem was about and what theoretical computer science could even do—showing how it could turn in on itself, explain its own difficulties in proving problems hard in terms of the truth of those same problems’ hardness, and thereby transmute defeat into victory. I may have been bowled over by the paper’s rhetoric as much as by its results: it was like, you’re allowed to write that way?

I was nearly as impressed by Steve’s PhD thesis, which was full of proofs that gave off the appearance of being handwavy, “just phoning it in,” but were in reality completely rigorous. The result that excited me the most said that, if a certain strange combinatorial conjecture was true, then there was essentially no hope of proving that P≠NP∩coNP relative to a random oracle with probability 1. I played around with the combinatorial conjecture but couldn’t make headway on it; a year or two later, I was excited when I met Clifford Smyth and he told me that he, Kahn, and Saks had just proved it. Rudich’s conjecture directly inspired me to work on what later became the Aaronson-Ambainis Conjecture, which is still unproved, but which if true, similarly implies that there’s no hope of proving P≠BQP relative to a random oracle with probability 1.

When I applied to CS PhD programs in 1999, I wrote about how I wanted to sing the ideas of theoretical computer science from the rooftops—just like Steven Rudich had done, with the celebrated Andrew’s Leap summer program that he’d started at Carnegie Mellon. (How many other models were there? Indeed, how many other models are there today?) I was then honored beyond words when Steve called me on the phone, before anyone else had, and made an hourlong pitch for me to become his student. “You’re what I call a ‘prefab’,” he said. “You already have the mindset that I try to instill in students by the end of their PhDs.” I didn’t have much self-confidence then, which is why I can still quote Steve’s words a quarter-century later. In the ensuing years, when (as often) I doubted myself, I’d think back to that phone call with Steve, and my burning desire to be what he apparently thought I was.

Alas, when I arrived in Pittsburgh for CMU’s visit weekend, I saw Steve holding court in front of a small crowd of students, dispensing wisdom and doing magic tricks. I was miffed that he never noticed or acknowledged me: had he already changed his mind about me, lost interest? It was only later that I learned that Steve was going blind at the time, and literally hadn’t seen me.

In any case, while I came within a hair of accepting CMU’s offer, in the end I chose Berkeley. I wasn’t yet 100% sure that I wanted to do quantum computing (as opposed to AI or classical complexity theory), but the lure of the Bay Area, of the storied CS theory group where Steve himself had studied, and of Steve’s academic sibling Umesh Vazirani proved too great.

Full of regrets about the road not taken, I was glad that, in the summer between undergrad and PhD, I got to attend the PCMI summer school on computational complexity at the Institute for Advanced Study in Princeton, where Steve gave a spectacular series of lectures. By that point, Steve was almost fully blind. He put transparencies up, sometimes upside-down until the audience corrected him, and then lectured about them entirely from memory. He said that doing CS theory sightless was a new, more conceptual experience for him.

Even in his new condition, Steve’s showmanship hadn’t left him; he held the audience spellbound as few academics do. And in a special lecture on “how to give talks,” he spilled his secrets.

“What the speaker imagines the audience is thinking,” read one slide. And then, inside the thought bubbles: “MORE! HARDER! FASTER! … Ahhhhh yes, QED! Truth is beauty.”

“What the audience is actually thinking,” read the next slide, below which: “When is this over? I need to pee. Can I get a date with the person next to me?” (And this was before smartphones.) And yet, Steve explained, rather than resenting the many demands on the audience’s attention, a good speaker would break through, meet people where they were, just as he was doing right then.

I listened, took mental notes, resolved to practice this stuff. I reflected that, even if my shtick only ever became 10% as funny or fluid as Steve’s, I’d still come out way ahead.

It’s possible that the last time I saw Steve was in 2007, when I visited Carnegie Mellon to give a talk about algebrization, a new barrier to solving P vs. NP (and other central problems of complexity theory) that Avi Wigderson and I had recently discovered. When I started writing the algebrization paper, I very consciously modeled it after the Natural Proofs paper; the one wouldn’t have been thinkable without the other. So you can imagine how much it meant to me when Steve liked algebrization—when, even though he couldn’t see my slides, he got enough from the spoken part of the talk to burst with “conceptual” questions and comments.

Steve not only peeled back the mystery of P vs NP insofar as anyone has. He did it with exuberance and showmanship and humor and joy and kindness. I won’t forget him.


I’ve written here only about the tiniest sliver of Steve’s life: namely, the sliver where it intersected mine. I wish that sliver were a hundred times bigger, so that there’d be a hundred times more to write. But CS theory, and CS more broadly, are communities. When I posted about Steve’s passing on Facebook, I got inundated by comments from friends of mine who (as it turned out) had taken Steve’s courses, or TA’d for him, or attended Andrew’s Leap, or otherwise knew him, and on whom he’d left a permanent impression—and I hadn’t even known any of this.

So I’ll end this post with a request: please share your Rudich stories in the comments! I’d especially love specific recollections of his jokes, advice, insights, or witticisms. We now live in a world where, even in the teeth of the likelihood that P≠NP, powerful algorithms running in massive datacenters nevertheless try to replicate the magic of human intelligence, by compressing and predicting all the text on the public Internet. I don’t know where this is going, but I can’t imagine that it would hurt for the emerging global hive-mind to know more about Steven Rudich.


November 12, 2024

Terence TaoHigher uniformity of arithmetic functions in short intervals II. Almost all intervals

Kaisa Matomäki, Maksym Radziwill, Fernando Xuancheng Shao, Joni Teräväinen, and myself have (finally) uploaded to the arXiv our paper “Higher uniformity of arithmetic functions in short intervals II. Almost all intervals”. This is a sequel to our previous paper from 2022. In that paper, discorrelation estimates such as

\displaystyle  \sum_{x \leq n \leq x+H} (\Lambda(n) - \Lambda^\sharp(n)) \bar{F}(g(n)\Gamma) = o(H)

were established, where {\Lambda} is the von Mangoldt function, {\Lambda^\sharp} was some suitable approximant to that function, {F(g(n)\Gamma)} was a nilsequence, and {[x,x+H]} was a reasonably short interval in the sense that {H \sim x^{\theta+\varepsilon}} for some {0 < \theta < 1} and some small {\varepsilon>0}. In that paper, we were able to obtain non-trivial estimates for {\theta} as small as {5/8}, and for some other functions such as divisor functions {d_k} for small values of {k}, we could lower {\theta} somewhat to values such as {3/5}, {5/9}, or {1/3}. This had a number of analytic number theory consequences, for instance in obtaining asymptotics for additive patterns in primes in such intervals. However, there were multiple obstructions to lowering {\theta} much further. Even for the model problem when {F(g(n)\Gamma) = 1}, that is to say the study of primes in short intervals, until recently the best value of {\theta} available was {7/12}, although this was very recently improved to {17/30} by Guth and Maynard.

However, the situation is better when one is willing to consider estimates that are valid for almost all intervals, rather than all intervals, so that one now studies local higher order uniformity estimates of the form

\displaystyle  \int_X^{2X} \sup_{F,g} | \sum_{x \leq n \leq x+H} (\Lambda(n) - \Lambda^\sharp(n)) \bar{F}(g(n)\Gamma)|\ dx = o(XH)

where {H = X^{\theta+\varepsilon}} and the supremum is over all nilsequences of a certain Lipschitz constant on a fixed nilmanifold {G/\Gamma}. This generalizes local Fourier uniformity estimates of the form

\displaystyle  \int_X^{2X} \sup_{\alpha} | \sum_{x \leq n \leq x+H} (\Lambda(n) - \Lambda^\sharp(n)) e(-\alpha n)|\ dx = o(XH).

There is particular interest in such estimates in the case of the Möbius function {\mu(n)} (where, as per the Möbius pseudorandomness conjecture, the approximant {\mu^\sharp} should be taken to be zero, at least in the absence of a Siegel zero). This is because if one could get estimates of this form for any {H} that grows sufficiently slowly in {X} (in particular {H = \log^{o(1)} X}), this would imply the (logarithmically averaged) Chowla conjecture, as I showed in a previous paper.

While one can lower {\theta} somewhat, there are still barriers. For instance, in the model case {F \equiv 1}, that is to say prime number theorems in almost all short intervals, until very recently the best value of {\theta} was {1/6}, recently lowered to {2/15} by Guth and Maynard (and can be lowered all the way to zero on the Density Hypothesis). Nevertheless, we are able to get some improvements at higher orders:

  • For the von Mangoldt function, we can get {\theta} as low as {1/3}, with an arbitrary logarithmic saving {\log^{-A} X} in the error terms; for divisor functions, one can even get power savings in this regime.
  • For the Möbius function, we can get {\theta=0}, recovering our previous result with Tamar Ziegler, but now with {\log^{-A} X} type savings in the exceptional set (though not in the pointwise bound outside of the set).
  • We can now also get comparable results for the divisor function.

As sample applications, we can obtain Hardy-Littlewood conjecture asymptotics for arithmetic progressions of almost all given steps {h \sim X^{1/3+\varepsilon}}, and divisor correlation estimates on arithmetic progressions for almost all {h \sim X^\varepsilon}.

Our proofs are rather long, but broadly follow the “contagion” strategy of Walsh, generalized from the Fourier setting to the higher order setting. Firstly, by standard Heath–Brown type decompositions, and previous results, it suffices to control “Type II” discorrelations such as

\displaystyle  \sup_{F,g} | \sum_{x \leq n \leq x+H} \alpha*\beta(n) \bar{F}(g(n)\Gamma)|

for almost all {x}, and some suitable functions {\alpha,\beta} supported on medium scales. So the bad case is when for most {x}, one has a discorrelation

\displaystyle  |\sum_{x \leq n \leq x+H} \alpha*\beta(n) \bar{F_x}(g_x(n)\Gamma)| \gg H

for some nilsequence {F_x(g_x(n) \Gamma)} that depends on {x}.

The main issue is the dependency of the polynomial {g_x} on {x}. By using a “nilsequence large sieve” introduced in our previous paper, and removing degenerate cases, we can show a functional relationship amongst the {g_x} that is very roughly of the form

\displaystyle  g_x(an) \approx g_{x'}(a'n)

whenever {n \sim x/a \sim x'/a'} (and I am being extremely vague as to what the relation “{\approx}” means here). By a higher order (and quantitatively stronger) version of Walsh’s contagion analysis (which is ultimately to do with separation properties of Farey sequences), we can show that this implies that these polynomials {g_x(n)} (which exert influence over intervals {[x,x+H]}) can “infect” longer intervals {[x', x'+Ha]} with some new polynomials {\tilde g_{x'}(n)} and various {x' \sim Xa}, which are related to many of the previous polynomials by a relationship that looks very roughly like

\displaystyle  g_x(n) \approx \tilde g_{ax}(an).

This can be viewed as a rather complicated generalization of the following vaguely “cohomological”-looking observation: if one has some real numbers {\alpha_i} and some primes {p_i} with {p_j \alpha_i \approx p_i \alpha_j} for all {i,j}, then one should have {\alpha_i \approx p_i \alpha} for some {\alpha}, where I am being vague here about what {\approx} means (and why it might be useful to have primes). By iterating this sort of contagion relationship, one can eventually get the {g_x(n)} to behave like an Archimedean character {n^{iT}} for some {T} that is not too large (polynomial size in {X}), and then one can use relatively standard (but technically a bit lengthy) “major arc” techniques based on various integral estimates for zeta and {L} functions to conclude.

November 06, 2024

Tommaso DorigoFlying Drones With Particle Detectors

Nowadays we study the Universe using a number of probes and techniques. Over the course of the past 100 years we moved from barely using optical telescopes, which give us access to the flux of visible photons from galaxies, supernovae, and other objects of interest, to exploiting photons of any energy - gamma rays, x rays, ultraviolet and infrared radiation, microwaves; and then also using charged cosmic radiation (including protons and light nuclei, electrons and positrons), neutrinos, and lastly, gravitational waves. 

read more

November 05, 2024

Scott Aaronson Letter to a Jewish voter in Pennsylvania

Election Day Update: For anyone who’s still undecided (?!?), I can’t beat this from Sam Harris.

When I think of Harris winning the presidency this week, it’s like watching a film of a car crash run in reverse: the windshield unshatters; stray objects and bits of metal converge; and defenseless human bodies are hurled into states of perfect repose. Normalcy descends out of chaos.


Important Announcement: I don’t in any way endorse voting for Jill Stein, or any other third-party candidate. But if you are a Green Party supporter who lives in a swing state, then please at least vote for Harris, and use SwapYourVote.org to arrange for two (!) people in safe states to vote for Jill Stein on your behalf. Thanks so much to friend-of-the-blog Linchuan Zhang for pointing me to this resource.

Added on Election Day: And, if you swing that way, click here to arrange to have your vote for Kamala in a swing state traded for two votes for libertarian candidate Chase Oliver in safe states. In any case, if you’re in a swing state and you haven’t yet voted (for Kamala Harris and for the norms of civilization), do!


For weeks I’d been wondering what I could say right before the election, at this momentous branch-point in the wavefunction, that could possibly do any good. Then, the other day, a Jewish voter in Pennsylvania and Shtetl-Optimized fan emailed me to ask my advice. He said that he’d read my Never-Trump From Here to Eternity FAQ and saw the problems with Trump’s autocratic tendencies, but that his Israeli friends and family wanted him to vote Trump anyway, believing him better on the narrow question of “Israel’s continued existence.” I started responding, and then realized that my response was the election-eve post I’d been looking for. So without further ado…


Thanks for writing.  Of course this is ultimately between you and your conscience (and your empirical beliefs), but I can tell you what my Israeli-American wife and I did.  We voted for Kamala, without the slightest doubt or hesitation.  We’d do it again a thousand quadrillion times.  We would’ve done the same in the swing state of Pennsylvania, where I grew up (actually in Bucks, one of the crucial swing counties).

And later this week, along with tens of millions of others, I’ll refresh the news with heart palpitations, looking for movement toward blue in Pennsylvania and Wisconsin.  I’ll be joyous and relieved if Kamala wins.  I’ll be ashen-faced if she doesn’t.  (Or if there’s a power struggle that makes the 2021 insurrection look like a dress rehearsal.)  And I’ll bet anyone, at 100:1 odds, that at the end of my life I’ll continue to believe that voting Kamala was the right decision.

I, too, have pro-Israel friends who urged me to switch to Trump, on the ground that if Kamala wins, then (they say) the Jews of Israel are all but doomed to a second Holocaust.  For, they claim, the American Hamasniks will then successfully prevail on Kamala to prevent Israel from attacking Iran’s nuclear sites, or will leave Israel to fend for itself if it does.  And therefore, Iran will finish and test nuclear weapons in the next couple years, and then it will rebuild the battered Hamas and Hezbollah under its nuclear umbrella, and then it will fulfill its stated goal since 1979, of annihilating the State of Israel, by slaughtering all the Jews who aren’t able to flee.  And, just to twist the knife, the UN diplomats and NGO officials and journalists and college students and Wikipedia editors who claimed such a slaughter was a paranoid fantasy, they’ll all cheer it when it happens, calling it “justice” and “resistance” and “intifada.”

And that, my friends say, will finally show me the liberal moral evolution of humanity since 1945, in which I’ve placed so much stock.  “See, even while they did virtually nothing to stop the first Holocaust, the American and British cultural elites didn’t literally cheer the Holocaust as it happened.  This time around, they’ll cheer.”

My friends’ argument is that, if I’m serious about “Never Again” as a moral lodestar of my life, then the one issue of Israel and Iran needs to override everything else I’ve always believed, all my moral and intellectual repugnance at Trump and everything he represents, all my knowledge of his lies, his evil, his venality, all the former generals and Republican officials who say that he’s unfit to serve and an imminent danger to the Republic.  I need to vote for this madman, this pathological liar, this bullying autocrat, because at least he’ll stand between the Jewish people and the darkness that would devour them, as it devoured them in my grandparents’ time.

My friends add that it doesn’t matter that Kamala’s husband is Jewish, that she’s mouthed all the words a thousand times about Israel’s right to defend itself, that Biden and Harris have indeed continued to ship weapons to Israel with barely a wag of their fingers (even as they’ve endured vituperation over it from their left, even as Kamala might lose the whole election over it).  Nor does it matter that a commanding majority of American Jews will vote for Kamala, or that … not most Israelis, but most of the Israelis in academia and tech who I know, would vote for Kamala if they could.  They could all be mistaken about their own interests.  But you and I, say my right-wing friends, realize that what actually matters is Iran, and what the next president will do about Iran.  Trump would unshackle Israel to do whatever it takes to prevent nuclear-armed Ayatollahs.  Kamala wouldn’t.

Anyway, I’ve considered this line of thinking.  I reject it with extreme prejudice.

To start with the obvious, I’m not a one-issue voter.  Presumably you aren’t either.  Being Jewish is a fundamental part of my humanity—if I didn’t know that before I’d witnessed the world’s reaction to October 7, then I certainly know now.  But only in the fantasies of antisemites would I vote entirely on the basis of “is this good for the Jews?”  The parts of me that care about the peaceful transfer of power, about truth, about standing up to Putin, about the basic sanity of the Commander-in-Chief in an emergency, about climate change and green energy and manufacturing, about not destroying the US economy through idiotic tariffs, about talented foreign scientists getting green cards, about the right to abortion, about RFK and his brainworm not being placed in charge of American healthcare, even about AI safety … all those parts of me are obviously for Kamala.

More interestingly, though, the Jewish part of me is also for Kamala—if possible, even more adamantly than other parts.  It’s for Kamala because…

Well, after these nine surreal years, how does one even spell out the Enlightenment case against Trump?  How does one say what hasn’t already been said a trillion times?  Now that the frog is thoroughly boiled, how does one remind people of the norms that used to prevail in America—even after Newt Gingrich and Sarah Palin and the rest had degraded them—and how those norms were what stood between us and savagery … and how laughably unthinkable is the whole concept of Trump as president, the instant you judge him according to those norms?

Kamala, whatever her faults, is basically a normal politician.  She lies, but only as normal politicians lie.  She dodges questions, changes her stances, says different things to different audiences, but only as normal politicians do.  Trump is something else entirely.  He’s one of the great flimflam artists of human history.  He believes (though “belief” isn’t quite the right word) that truth is not something external to himself, but something he creates by speaking it.  He is the ultimate postmodernist.  He’s effectively created a new religion, one of grievance and lies and vengeance against outsiders, and converted a quarter of Americans to his religion, while another quarter might vote it into power because of what they think is in it for them.

And this cult of lies … this is what you ask if Jewish people should enter into a strategic alliance with?  Do you imagine this cult is a trustworthy partner, one likely to keep its promises?

For centuries, Jews have done consistently well under cosmopolitan liberal democracies, and consistently poorly—when they remained alive at all—under nativist tyrants.  Do you expect whatever autocratic regime follows Trump, a regime of JD Vance and Tucker Carlson and the like, to be the first exception to this pattern in history?

For I take it as obvious that a second Trump term, and whatever follows it, will make the first Trump term look like a mere practice run, a Beer Hall Putsch.  Trump I was restrained by John Kelly, by thousands of civil service bureaucrats and judges, by the generals, and in the last instance, by Mike Pence.  But Trump II will be out for the blood of his enemies—he says so himself at his rallies—and will have nothing to restrain him, not even any threat of criminal prosecution.  Do you imagine this goes well for the Jews, or for pretty much anyone?

It doesn’t matter if Trump has no personal animus against Jews—excepting, of course, the majority who vote against him.  Did the idealistic Marxist intellectuals of Russia in 1917 want Stalin?  Did the idealistic Iranian students of Iran in 1979 want Khomeini?  It doesn’t matter: what matters is what they enabled.  Turn over the rock of civilization, and everything that was wriggling underneath is suddenly loosed on the world.

How much time have you spent looking at pro-Israel people on Twitter (Hen Mazzig, Haviv Rettig Gur, etc.), and then—crucially—reading their replies?  I spend at least an hour or two per day on that, angry and depressed though it makes me, perhaps because of an instinct to stare into the heart of darkness, not to look away from a genocidal evil arrayed against my family.  

Many replies are the usual: “Shut the fuck up, Zio, and stop murdering babies.”  “Two-state solution?  I have a different solution: that all you land-thieves pack your bags and go back to Poland.” But then, every time, you reach tweets like “you Jews have been hated and expelled from all the world’s countries for thousands of years, yet you never consider that the common factor is you.”  “Your Talmud commands you to kill goyim children, so that’s why you’re doing it.”  “Even while you maintain apartheid in Palestine, you cynically import millions of third-world savages to White countries, in order to destroy them.”  None of this is the way leftists talk, not even the most crazed leftists.  We’ve now gone all the way around the horseshoe.  Or, we might say, we’re no longer selecting on the left or right of politics at all, but simply on the bottom.

And then you see that these bottom-feeders often have millions of followers each.  They command armies.  The bottom-feeders—left, right, Islamic fundamentalist, and unclassifiably paranoid—are emboldened as never before.  They’re united by a common enemy, which turns out to be the same enemy they’ve always had.

Which brings us to Elon Musk.  I personally believe that Musk, like Trump, has nothing against the Jews, and is if anything a philosemite.  But it’s no longer a question of feelings.  Through his changes to Twitter, Musk has helped his new ally Trump flip over the boulder, and now all the demons that were wriggling beneath are loosed on civilization.

Should we, as Jews, tolerate the demons in exchange for Trump’s tough-guy act on Iran?  Just like the evangelicals previously turned a blind eye to Trump’s philandering, his sexual assaults, his gleeful cruelty, his spitting on everything Christianity was ever supposed to stand for, simply because he promised them the Supreme Court justices to overturn Roe v. Wade?  Faced with a man who’s never had a human relationship in his life that wasn’t entirely transactional, should we be transactional ourselves?

I’m not convinced that even if we did, we’d be getting a good bargain.  Iran is no longer alone, but part of an axis that includes China, Russia, and North Korea.  These countries prop up each other’s economies and militaries; they survive only because of each other.  As others have pointed out, the new Axis is actually more tightly integrated than the Axis powers ever were in WWII.  The new Axis has already invaded Ukraine and perhaps soon Taiwan and South Korea.  It credibly threatens to end the Pax Americana.  And to face Hamas or Hezbollah is to face Iran is to face the entire new Axis.

Now Kamala is not Winston Churchill.  But at least she doesn’t consider the tyrants of Russia, China, and North Korea to be her personal friends, trustworthy because they flatter her.  At least she, unlike Trump, realizes that the current governments of China, Russia, North Korea, and Iran do indeed form a new axis of evil, and she has the glimmers of consciousness that the founders of the United States stood for something different from what those tyrannies stand for, and that this other thing that our founders stood for was good.  If war does come, at least she’ll listen to the advice of generals, rather than clowns and lackeys.  And if Israel or America do end up in wars of survival, from the bottom of my heart she’s the one I’d rather have in charge.  For if she’s in charge, then through her, the government of the United States is still in charge.  Our ripped and tattered flag yet waves.  If Trump is in charge, who or what is at the wheel besides his own unhinged will, or that of whichever sordid fellow-gangster currently has his ear?

So, yes, as a human being and also as a Jew, this is why I voted early for Kamala, and why I hope you’ll vote for her too. If you disagree with her policies, start fighting those policies once she’s inaugurated on January 20, 2025. At least there will still be a republic, with damaged but functioning error-correcting machinery, in which you can fight.

All the best,
Scott


More Resources: Be sure to check out Scott Alexander’s election-eve post, which (just like in 2016) endorses any listed candidate other than Trump, but specifically makes the case to voters put off (as Scott is) by Democrats’ wokeness. Also check out Garry Kasparov’s epic tweet-thread on why he supports Kamala, and his essay The United States Cannot Descend Into Authoritarianism.

October 28, 2024

John PreskillAnnouncing the quantum-steampunk creative-writing course!

Why not run a quantum-steampunk creative-writing course?

Quantum steampunk, as Quantum Frontiers regulars know, is the aesthetic and spirit of a growing scientific field. Steampunk is a subgenre of science fiction. In it, futuristic technologies invade Victorian-era settings: submarines, time machines, and clockwork octopodes populate La Belle Époque, a recently liberated Haiti, and Sherlock Holmes’s London. A similar invasion characterizes my research field, quantum thermodynamics: thermodynamics is the study of heat, work, temperature, and efficiency. The Industrial Revolution spurred the theory’s development during the 1800s. The theory’s original subjects—nineteenth-century engines—were large and massive, and contained enormous numbers of particles. Such engines obey the classical mechanics developed during the 1600s. Hence thermodynamics needs re-envisioning for quantum systems. To extend the theory’s laws and applications, quantum thermodynamicists use mathematical and experimental tools from quantum information science. Quantum information science is, in part, the understanding of quantum systems through how they store and process information. The toolkit is partially cutting-edge and partially futuristic, as full-scale quantum computers remain under construction. So applying quantum information to thermodynamics—quantum thermodynamics—strikes me as the real-world incarnation of steampunk.

But the thought of a quantum-steampunk creative-writing course had never occurred to me, and I hesitated over it. Quantum-steampunk blog posts, I could handle. A book, I could handle. Even a short-story contest, I’d handled. But a course? The idea yawned like the pitch-dark mouth of an unknown cavern in my imagination.

But the more I mulled over Edward Daschle’s suggestion, the more I warmed to it. Edward was completing a master’s degree in creative writing at the University of Maryland (UMD), specializing in science fiction. His mentor Emily Brandchaft Mitchell had sung his praises via email. In 2023, Emily had served as a judge for the Quantum-Steampunk Short-Story Contest. She works as a professor of English at UMD, writes fiction, and specializes in the study of genre. I reached out to her last spring about collaborating on a grant for quantum-inspired art, and she pointed to her protégé.

Who won me over. Edward and I are co-teaching “Writing Quantum Steampunk: Science-Fiction Workshop” during spring 2025.

The course will alternate between science and science fiction. Under Edward’s direction, we’ll read and discuss published fiction. We’ll also learn about what genres are and how they come to be. Students will try out writing styles by composing short stories themselves. Everyone will provide feedback about each other’s writing: what works, what’s confusing, and opportunities for improvement. 

The published fiction chosen will mirror the scientific subjects we’ll cover: quantum physics; quantum technologies; and thermodynamics, including quantum thermodynamics. I’ll lead this part of the course. The scientific studies will interleave with the story reading, writing, and workshopping. Students will learn about the science behind the science fiction while contributing to the growing subgenre of quantum steampunk.

We aim to attract students from across campus: physics, English, the Jiménez-Porter Writers’ House, computer science, mathematics, and engineering—plus any other departments whose students have curiosity and creativity to spare. The course already has four cross-listings—Arts and Humanities 270, Physics 299Q, Computer Science 298Q, and Mechanical Engineering 299Q—and will probably acquire a fifth (Chemistry 298Q). It satisfies a Distributive Studies: Scholarship in Practice (DSSP) General Education requirement, and undergraduate and graduate students are welcome. QuICS—the Joint Center for Quantum Information and Computer Science, my home base—is paying Edward’s salary through a seed grant. Ross Angelella, the director of the Writers’ House, arranged logistics and doused us with enthusiasm. I’m proud of how organizations across the university are uniting to support the course.

The diversity we seek, though, poses a challenge. The course lacks prerequisites, so I’ll need to teach at a level comprehensible to the non-science students. I’d enjoy doing so, but I’m concerned about boring the science students. Ideally, the science students will help me teach, while the non-science students will challenge us with foundational questions that force us to rethink basic concepts. Also, I hope that non-science students will galvanize discussions about ethical and sociological implications of quantum technologies. But how can one ensure that conversation will flow?

This summer, Edward and I traded candidate stories for the syllabus. Based on his suggestions, I recommend touring science fiction under an expert’s guidance. I enjoyed, for a few hours each weekend, sinking into the worlds of Ted Chiang, Ursula K. Le Guin, N. K. Jemisin, Ken Liu, and others. My scientific background informed my reading more than I’d expected. Some authors, I could tell, had researched their subjects thoroughly. When they transitioned from science into fiction, I trusted and followed them. Other authors tossed jargon into their writing but evidenced a lack of deep understanding. One author nailed technical details about quantum computation, initially impressing me, but missed the big picture: his conflict hinged on a misunderstanding about entanglement. I see all these stories as affording opportunities for learning and teaching, in different ways.

Students can begin registering for “Writing Quantum Steampunk: Science-Fiction Workshop” on October 24. We can offer only 15 seats, due to Writers’ House standards, so secure yours as soon as you can. Part of me still wonders how the Hilbert space I came to be co-teaching a quantum-steampunk creative-writing course.1 But I look forward to reading with you next spring!


1A Hilbert space is a mathematical object that represents a quantum system. But you needn’t know that to succeed in the course.

Matt LeiferDoctoral Position

Funding is available for a Doctor of Science Studentship with Dr. Matthew Leifer at the Institute for Quantum Studies, Chapman University, California, USA.  It is in Chapman’s unique interdisciplinary Math, Physics, and Philosophy (MPP) program, which emphasizes research that encompasses two or more of the three core disciplines.  This is a 3-year program that focuses on research, and students are expected to have a terminal master’s degree before they start.

This position is part of the Southern California Quantum Foundations Hub, funded by the John Templeton Foundation.  The research project must be in quantum foundations, particularly in one of the three theme areas of the grant:

  1. The Nature of the Quantum State
  2. Past and Future Boundary Conditions
  3. Agency in Quantum Observers

The university also provides other scholarships for the MPP program.  Please apply before January 15, 2025, to receive full consideration for the available funding.

Please follow the “Graduate Application” link on the MPP website to apply.

For informal inquiries about the position and research projects, please get in touch with me.

John PreskillSculpting quantum steampunk

In 2020, many of us logged experiences that we’d never anticipated. I wrote a nonfiction book and got married outside the Harvard Faculty Club (because nobody was around to shoo us away). Equally unexpectedly, I received an invitation to collaborate with a professional artist. One Bruce Rosenbaum emailed me out of the blue:

I watched your video on Quantum Steampunk: Quantum Information Meets Thermodynamics. [ . . . ] I’d like to explore collaborating with you on bringing together the fusion of Quantum physics and Thermodynamics into the real world with functional Steampunk art and design.

This Bruce Rosenbaum, I reasoned, had probably seen some colloquium of mine that a university had recorded and posted online. I’d presented a few departmental talks about how quantum thermodynamics is the real-world incarnation of steampunk.

I looked Bruce up online. Wired Magazine had called the Massachusetts native “the steampunk evangelist,” and The Wall Street Journal had called him “the steampunk guru.” He created sculptures for museums and hotels, in addition to running workshops that riffed on the acronym STEAM (science, technology, engineering, art, and mathematics). MTV’s Extreme Cribs had spotlighted his renovation of a Victorian-era church into a home and workshop.

The Rosenbaums’ kitchen (photo from here)

All right, I replied, I’m game. But research fills my work week, so can you talk at an unusual time?

We Zoomed on a Saturday afternoon. Bruce Zooms from precisely the room that you’d hope to find a steampunk artist in: a workshop filled with brass bits and bobs spread across antique-looking furniture. Something intricate is usually spinning atop a table behind him. And no, none of it belongs to a virtual background. Far from an overwrought inventor, though, Bruce exudes a vibe as casual as the T-shirt he often wears—when not interviewing in costume. A Boston-area accent completed the feeling of chatting with a neighbor.

Bruce proposed building a quantum-steampunk sculpture. I’d never dreamed of the prospect, but it sounded like an adventure, so I agreed. We settled on a sculpture centered on a quantum engine. Classical engines inspired the development of thermodynamics around the time of the Industrial Revolution. One of the simplest engines—the heat engine—interacts with two environments, or reservoirs: one cold and one hot. Heat—the energy of random atomic motion—flows from the hot to the cold. The engine siphons off part of the heat, converting it into work—coordinated energy that can, say, turn a turbine. 
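
A quick illustration, with made-up numbers: the second law of thermodynamics caps the efficiency of any engine running between two fixed-temperature reservoirs, quantum or classical, at the Carnot bound

efficiency = (work extracted) / (heat absorbed) ≤ 1 − T_cold / T_hot,

with the temperatures measured in kelvins. An engine fed by a 400 K hot reservoir and dumping waste heat into a 300 K cold reservoir can therefore convert at most 1 − 300/400 = 25% of the heat it absorbs into work.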

Can a quantum system convert random heat into useful work? Yes, quantum thermodynamicists have shown. Bell Labs scientists designed a quantum engine formed from one atom, during the 1950s and 1960s. Since then, physicists have co-opted superconducting qubits, trapped ions, and more into quantum engines. Entanglement can enhance quantum engines, which can both suffer and benefit from quantum coherences (wave-like properties, in the spirit of wave–particle duality). Experimentalists have realized quantum engines in labs. So Bruce and I placed (an artistic depiction of) a quantum engine at our sculpture’s center. The engine consists of a trapped ion—a specialty of Maryland, where I accepted a permanent position that spring.

Bruce engaged an illustrator, Jim Su, to draw the sculpture. We iterated through draft after draft, altering shapes and fixing scientific content. Versions from the cutting-room floor now adorn the Maryland Quantum-Thermodynamics Hub’s website.

Designing the sculpture was a lark. Finding funding to build it has required more grit. During the process, our team grew to include scientific-computing expert Alfredo Nava-Tudelo, physicist Bill Phillips, senior faculty specialist Daniel Serrano, and Quantum Frontiers gatekeeper Spiros Michalakis. We secured a grant from the University of Maryland’s Arts for All program this spring. The program is promoting quantum-inspired art this year, in honor of the UN’s designation of 2025 as the International Year of Quantum Science and Technology.

Through the end of 2024, we’re building a tabletop version of the sculpture. We were expecting a 3D-printed version to consume our modest grant. But quantum steampunk captured the imagination of Empire Group, the design-engineering company hired by Bruce to create and deploy technical drawings. Empire now plans to include metal and moving parts in the sculpture.

The Quantum-Steampunk Engine sculpture (drawing by Jim Su)

Empire will create CAD (computer-aided–design) drawings this November, in dialogue with the scientific team and Bruce. The company will fabricate the sculpture in December. The scientists will create educational materials that explain the thermodynamics and quantum physics represented in the sculpture. Starting in 2025, we’ll exhibit the sculpture everywhere possible. Plans include the American Physical Society’s Global Physics Summit (March Meeting), the quantum-steampunk creative-writing course I’m co-teaching next spring, and the Quantum World Congress. Bruce will incorporate the sculpture into his STEAMpunk workshops. Drop us a line if you want the Quantum-Steampunk Engine sculpture at an event as a centerpiece or teaching tool. And stay tuned for updates on the sculpture’s creation process and outreach journey.

Our team’s schemes extend beyond the tabletop sculpture: we aim to build an 8’-by-8’-by-8’ version. The full shebang will contain period antiques, lasers, touchscreens, and moving and interactive parts. We hope that a company, university, or individual will request the full-size version upon seeing its potential in the tabletop.

A sculpture, built by ModVic for a corporate office, of the scale we have in mind. The description on Bruce’s site reads, “A 300 lb. Clipper of the Clouds sculpture inspired by a Jules Verne story. The piece suspends over the corporate lobby.”

After all, what are steampunk and science for, if not dreaming?

September 30, 2024

Jacques Distler Golem VI

Hopefully, you didn’t notice, but Golem V has been replaced. Superficially, the new machine looks pretty much like the old.

It’s another Mac Mini, with an (8-core) Apple Silicon M2 chip (instead of a quad-core Intel Core i7), 24 GB of RAM (instead of 16), dual 10Gbase-T NICs (instead of 1Gbase-T), a 1TB internal SSD and a 2TB external SSD (TimeMachine backup).

The transition was anything but smooth.

The first step involved retrieving the external HD, which contained a clone of the internal System drive, from UDC and running Migration Assistant to transfer the data to the new machine.

Except … Migration Assistant refused to touch the external HD. It (like the System drive of Golem V) was formatted with a case-sensitive filesystem. Ten years ago, that was perfectly OK, and seemed like the wave of the future. But the filesystem (specifically, the Data Volume) for current versions of macOS is case-insensitive and there is no way to format it as case-sensitive. Since transferring data from a case-sensitive to a case-insensitive filesystem is potentially lossy, Migration Assistant refused to even try.

The solution turned out to be:

  • Format a new drive as case-insensitive.
  • Use rsync to copy the old (case-sensitive) drive onto the new one (a sketch of the command appears after this list).
  • rsync complained about a handful of files, but none were of any consequence.
  • Run Migration Assistant on the new case-insensitive drive.
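
For concreteness, the copy step above might look something like this. The volume names are placeholders of my own, and the flag for preserving Mac extended attributes differs between rsync builds (Apple’s old bundled rsync spells it -E, rsync 3.x from MacPorts or Homebrew spells it -X), so check your rsync’s man page before trusting mine:

# Copy the contents of the old, case-sensitive clone onto the freshly
# formatted case-insensitive drive. -a preserves permissions and timestamps;
# -X carries over extended attributes (rsync 3.x syntax).
sudo rsync -aX --progress /Volumes/GolemV-Clone/ /Volumes/GolemVI-Data/

The trailing slashes matter: they tell rsync to copy the contents of the old volume into the new one, rather than nesting one directory inside the other.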

And that was just Day 1. Recompiling/reinstalling a whole mess o’ software occupied the next several weeks, with similar hurdles to overcome.

For instance, installing Perl XS modules using cpan consistently failed with a

fatal error: 'EXTERN.h' file not found

error. Googling the failures led me to perlmonks.org, where a post sagely opined

First, do not use the system Perl on MacOS. As Corion says, that is for Apple, not for you.

This is nonsense. The system Perl is the one Apple intends you to use. But … if you’re gonna do development on macOS (and installing Perl XS modules apparently constitutes development), you need to use the macOS SDK. And cpan doesn’t seem to be smart enough to do that. The Makefile it generates says

PERL_INC = /System/Library/Perl/5.34/darwin-thread-multi-2level/CORE

Edit that by hand to read

PERL_INC = /Library/Developer/CommandLineTools/SDKs/MacOSX14.sdk/System/Library/Perl/5.34/darwin-thread-multi-2level/CORE

and everything compiles and installs just fine.
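
If you find yourself repeating that hand-edit for module after module, it can be scripted. Here’s a sketch using the BSD sed that ships with macOS; the SDK path is the same one shown above and will drift as SDK versions change:

# Generate the Makefile, then point PERL_INC at the SDK’s copy of the Perl headers.
perl Makefile.PL
sed -i '' 's|^PERL_INC = .*|PERL_INC = /Library/Developer/CommandLineTools/SDKs/MacOSX14.sdk/System/Library/Perl/5.34/darwin-thread-multi-2level/CORE|' Makefile
# Build, test, install (add sudo to the install step if you install into system paths).
make && make test && make install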

And don’t even get me started on the woeful state of the once-marvelous Fink Package Manager.

One odder bit of breakage does deserve a mention. sysctl is used to set (or read) various kernel parameters (including one that I very much need for my setup: net.inet.ip.forwarding=1). And there’s a file /etc/sysctl.conf where you can store these settings, so that they persist across reboots. Unnoticed by me, Migration Assistant didn’t copy that file to the new Golem, which was the source of much puzzlement and consternation when the new Golem booted up for the first time at UDC, and the networking wasn’t working right.

When I realized what was going on, I just thought, “Aha! I’ll recreate that file and all will be good.” Imagine my surprise when I rebooted the machine a couple of days later and, again, the networking wasn’t working right. Turns out that, unlike every other Unix system I have seen (and unlike the previous Golem), the current version(s) of macOS completely ignore /etc/sysctl.conf. If you want to persist those settings between reboots, you have to do it in a cron job (or a launchd job, or whatever).
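
For the record, here’s a sketch of the launchd route. The label and filename below are my own invention: save something like this as /Library/LaunchDaemons/net.golem.sysctl.plist (owned by root:wheel, mode 644) and load it with sudo launchctl load /Library/LaunchDaemons/net.golem.sysctl.plist.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <!-- Runs once at boot, as root, and sets the kernel parameter. -->
  <key>Label</key>
  <string>net.golem.sysctl</string>
  <key>ProgramArguments</key>
  <array>
    <string>/usr/sbin/sysctl</string>
    <string>net.inet.ip.forwarding=1</string>
  </array>
  <key>RunAtLoad</key>
  <true/>
</dict>
</plist>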

Anyway, enough complaining. The new Golem seems to be working now, in no small part thanks to the amazing support (and boundless patience) of Chris Murphy, Andrew Manhein and the rest of the crew at UDC. Thanks guys!

September 23, 2024

Jacques Distler Entanglement for Laymen

I’ve been asked, innumerable times, to explain quantum entanglement to some lay audience. Most of the elementary explanations that I have seen (heck, maybe all of them) fail to draw any meaningful distinction between “entanglement” and mere “(classical) correlation.”

This drives me up the wall, so each time I am asked, I strive to come up with an elementary explanation of the difference. Rather than keep reinventing the wheel, let me herewith record my latest attempt.

“Entanglement” is a bit tricky to explain, versus “correlation” — which has a perfectly classical interpretation.

Say I tear a page of paper in two, crumple up the two pieces into balls and (at random) hand one to Adam and the other to Betty. They then go their separate ways and — sometime later — Adam unfolds his piece of paper. There’s a 50% chance that he got the top half, and 50% that he got the bottom half. But if he got the top half, we know for certain that Betty got the bottom half (and vice versa).

That’s correlation.

In this regard, the entangled state behaves exactly the same way. What distinguishes the entangled state from the merely correlated is something that doesn’t have a classical analogue. So let me shift from pieces of paper to photons.

You’re probably familiar with the polaroid filters in good sunglasses. They absorb light polarized along the horizontal axis, but transmit light polarized along the vertical axis.

Say, instead of crumpled pieces of paper, I send Adam and Betty a pair of photons.

In the correlated state, one photon is polarized horizontally, and one photon is polarized vertically, and there’s a 50% chance that Adam got the first while Betty got the second and a 50% chance that it’s the other way around.

Adam and Betty send their photons through polaroid filters, both aligned vertically. If Adam’s photon makes it through the filter, we can be certain that Betty’s gets absorbed and vice versa. Same is true if they both align their filters horizontally.

Say Adam aligns his filter horizontally, while Betty aligns hers vertically. Then either both photons make it through (with 50% probability) or both get absorbed (also with 50% probability).

All of the above statements are also true in the entangled state.

The tricky thing, the thing that makes the entangled state different from the correlated state, is what happens if both Adam and Betty align their filters at a 45° angle. Now there’s a 50% chance that Adam’s photon makes it through his filter, and a 50% chance that Betty’s photon makes it through her filter.

(You can check this yourself, if you’re willing to sacrifice an old pair of sunglasses. Polarize a beam of light with one sunglass lens, and view it through the other sunglass lens. As you rotate the second lens, the intensity varies from 100% (when the lenses are aligned) to 0 (when they are at 90°). The intensity is 50% when the second lens is at 45°.)
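
(The pattern here is Malus’s law: an ideal polarizer transmits a fraction cos²θ of the light, where θ is the angle between the light’s polarization and the filter’s axis. Sure enough, cos²(0°) = 1, cos²(45°) = 1/2, and cos²(90°) = 0.)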

So what is the probability that both Adam and Betty’s photons make it through? Well, if there’s a 50% chance that his made it through and a 50% chance that hers made it through, then you might surmise that there’s a 25% chance that both made it through.

That’s indeed the correct answer in the correlated state.

In fact, in the correlated state, each of the 4 possible outcomes (both photons made it through, Adam’s made it through but Betty’s got absorbed, Adam’s got absorbed but Betty’s made it through or both got absorbed) has a 25% chance of taking place.

But, in the entangled state, things are different.

In the entangled state, the probability that both photons made it through is 50% – the same as the probability that Adam’s made it through at all. In other words, if Adam’s photon made it through the 45° filter, then we can be certain that Betty’s made it through. And if Adam’s was absorbed, so was Betty’s. There’s zero chance that one of their photons made it through while the other got absorbed.

Unfortunately, while it’s fairly easy to create the correlated state with classical tools (polaroid filters, half-silvered mirrors, …), creating the entangled state requires some quantum mechanical ingredients. So you’ll just have to believe me that quantum mechanics allows for a state of two photons with all of the aforementioned properties.
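
For the mathematically inclined, here’s a sketch of one two-photon state with all of the aforementioned properties (feel free to skip this). Write |H⟩ and |V⟩ for horizontal and vertical polarization, and |D⟩ = (|H⟩ + |V⟩)/√2, |A⟩ = (|H⟩ − |V⟩)/√2 for the two diagonal polarizations. The entangled state is

|ψ⟩ = (|H⟩|V⟩ + |V⟩|H⟩)/√2 = (|D⟩|D⟩ − |A⟩|A⟩)/√2

and squaring amplitudes gives the probabilities. In the horizontal/vertical basis, the only outcomes are HV and VH, each with probability 50% (one photon passes a vertical filter, the other is absorbed). In the 45° basis, the only outcomes are DD and AA, each with probability 50% (both pass, or both are absorbed). The merely correlated state, by contrast, is a 50/50 statistical mixture of |H⟩|V⟩ and |V⟩|H⟩; each term, taken by itself, gives independent 50/50 results at 45°, which is where the four 25% outcomes come from.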

Sorry if this explanation was a bit convoluted; I told you that entanglement is subtle…