Planet Musings

April 01, 2025

Matt Strassler Quantum Interference 5: Coming Unglued

Now finally, we come to the heart of the matter of quantum interference, as seen from the perspective of 1920s quantum physics. (We’ll deal with quantum field theory later this year.)

Last time I looked at some cases of two particle states in which the particles’ behavior is independent — uncorrelated. In the jargon, the particles are said to be “unentangled”. In this situation, and only in this situation, the wave function of the two particles can be written as a product of two wave functions, one per particle. As a result, any quantum interference can be ascribed to one particle or the other, and is visible in measurements of either one particle or the other. (More precisely, it is observable in repeated experiments, in which we do the same measurement over and over.)

In this situation, because each particle’s position can be studied independently of the other’s, we can be led to think that any interference associated with particle 1 happens near where particle 1 is located, and similarly for interference involving the second particle.

But this line of reasoning only works when the two particles are uncorrelated. Once this isn’t true — once the particles are entangled — it can easily break down. We saw indications of this in an example that appeared at the ends of my last two posts (here and here), which I’m about to review. The question for today is: what happens to interference in such a case?

Correlation: When “Where” Breaks Down

Let me now review the example of my recent posts. The pre-quantum system looks like this

Figure 1: An example of a superposition, in a pre-quantum view, where the two particles are correlated and where interference will occur that involves both particles together.

Notice the particles are correlated; either both particles are moving to the left OR both particles are moving to the right. (The two particles are said to be “entangled”, because the behavior of one depends upon the behavior of the other.) As a result, the wave function cannot be factored (in contrast to most examples in my last post) and we cannot understand the behavior of particle 1 without simultaneously considering the behavior of particle 2. Compare this to Fig. 2, an example from my last post in which the particles are independent; the behavior of particle 2 is the same in both parts of the superposition, independent of what particle 1 is doing.

Figure 2: Unlike Fig. 1, here the two particles are uncorrelated; the behavior of particle 2 is the same whether particle 1 is moving left OR right. As a result, interference can occur for particle 1 separately from any behavior of particle 2, as shown in this post.

Let’s return now to Fig. 1. The wave function for the corresponding quantum system, shown as a graph of its absolute value squared on the space of possibilities, behaves as in Fig. 3.

Figure 3: The absolute-value-squared of the wave function for the system in Fig. 1, showing interference as the peaks cross. Note the interference fringes are diagonal relative to the x1 and x2 axes.

But as shown last time in Fig. 19, at the moment where the interference in Fig. 3 is at its largest, if we measure particle 1 we see no interference effect. More precisely, if we do the experiment many times and measure particle 1 each time, as depicted in Fig. 4, we see no interference pattern.

Figure 4: The result of repeated experiments in which we measure particle 1, at the moment of maximal interference, in the system of Fig. 3. Each new experiment is shown as an orange dot; results of past experiments are shown in blue. No interference effect is seen.

We see something analogous if we measure particle 2.

Yet the interference is plain as day in Fig. 3. It’s obvious when we look at the full two-dimensional space of possibilities, even though it is invisible in Fig. 4 for particle 1 and in the analogous experiment for particle 2. So what measurements, if any, can we make that can reveal it?

The clue comes from the fact that the interference fringes lie at a 45 degree angle, perpendicular neither to the x1 axis nor to the x2 axis but instead to the axis for the variable 1/2(x1 + x2), the average of the positions of particle 1 and 2. It’s that average position that we need to measure if we are to observe the interference.

But doing so requires that we measure both particles’ positions. We have to measure them both every time we repeat the experiment. Only then can we start making a plot of the average of their positions.

When we do this, we will find what is shown in Fig. 5.

  • The top row shows measurements of particle 1.
  • The bottom row shows measurements of particle 2.
  • And the middle row shows a quantity that we infer from these measurements: their average.

For each measurement, I’ve drawn a straight orange line between the measurement of x1 and the measurement of x2; the center of this line lies at the average position 1/2(x1+x2). The actual averages are then recorded in a different color, to remind you that we don’t measure them directly; we infer them from the actual measurements of the two particles’ positions.

Figure 5: As in Fig. 4, the result of repeated experiments in which we measure both particles’ positions at the moment of maximal interference in Fig. 3. Top and bottom rows show the position measurements of particles 1 and 2; the middle row shows their average. Each new experiment is shown as two orange dots connected by an orange line, at whose midpoint a new yellow dot is placed. Results of past experiments are shown in blue. No interference effect is seen in the individual particle positions, yet one appears in their average.

In short, the interference is not associated with either particle separately — none is seen in either the top or bottom rows. Instead, it is found within the correlation between the two particles’ positions. This is something that neither particle can tell us on its own.
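
As a quick numerical check of this claim, one can model the two branches with Gaussian packets (an arbitrarily chosen width sigma and momentum k, taken at the moment of maximal overlap) and verify that the fringes show up in the distribution of 1/2(x1+x2) but in neither particle's marginal distribution. The following sketch is a toy model, not the code behind the figures above.

    import numpy as np
    import matplotlib.pyplot as plt

    # Configuration space (x1, x2) for two particles on a line.
    x = np.linspace(-8, 8, 801)
    X1, X2 = np.meshgrid(x, x, indexing="ij")
    sigma, k = 1.0, 5.0   # packet width and momentum: arbitrary illustrative values

    # At the moment of maximal overlap, the branch "both particles moving right" and the
    # branch "both particles moving left" share one envelope and differ only in the sign
    # of the momentum along the diagonal, so the state is an envelope times cosine fringes.
    envelope = np.exp(-(X1**2 + X2**2) / (4 * sigma**2))
    psi = envelope * (np.exp(1j * k * (X1 + X2)) + np.exp(-1j * k * (X1 + X2)))
    prob = np.abs(psi)**2
    prob /= prob.sum()

    p_x1 = prob.sum(axis=1)                  # marginal for particle 1: smooth, no fringes
    avg = ((X1 + X2) / 2).ravel()            # average position over the whole grid
    p_avg, edges = np.histogram(avg, bins=400, weights=prob.ravel())   # fringes survive here

    fig, ax = plt.subplots(1, 3, figsize=(12, 3))
    ax[0].imshow(prob.T, origin="lower", extent=[-8, 8, -8, 8])
    ax[0].set(xlabel="x1", ylabel="x2", title="|psi|^2: diagonal fringes")
    ax[1].plot(x, p_x1); ax[1].set(xlabel="x1", title="particle 1 alone: no fringes")
    ax[2].plot(0.5 * (edges[:-1] + edges[1:]), p_avg)
    ax[2].set(xlabel="(x1+x2)/2", title="average position: fringes")
    plt.tight_layout(); plt.show()

Nothing here depends on the particular values of sigma and k, as long as the fringe spacing is much finer than the packet width; that separation of scales is exactly what washes the fringes out of each single-particle distribution.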

And where is the interference? It certainly lies near 1/2(x1+x2)=0. But this should worry you. Is that really a point in physical space?

You could imagine a more extreme example of this experiment in which Fig. 5 shows particle 1 located in Boston and particle 2 located in New York City. This would put their average position within appropriately-named Middletown, Connecticut. (I kid you not; check for yourself.) Would we really want to say that the interference itself is located in Middletown, even though it’s a quiet bystander, unaware of the existence of two correlated particles that lie in opposite directions 90 miles (150 km) away?

After all, the interference appears in the relationship between the particles’ positions in physical space, not in the positions themselves. Its location in the space of possibilities (Fig. 3) is clear. Its location in physical space (Fig. 5) is anything but.

Still, I can imagine you pondering whether it might somehow make sense to assign the interference to poor, unsuspecting Middletown. For that reason, I’m going to make things even worse, and take Middletown out of the middle.

A Second System with No Where

Here’s another system with interference, whose pre-quantum version is shown in Figs. 6a and 6b:

Figure 6a: Another system in a superposition with entangled particles, shown in its pre-quantum version in physical space. In part A of the superposition both particles are stationary, while in part B they move oppositely.
Figure 6b: The same system as in Fig. 6a, depicted in the space of possibilities with its two initial possibilities labeled as stars. Possibility A remains where it is, while possibility B moves toward and intersects with possibility A, leading us to expect interference in the quantum wave function.

The corresponding wave function is shown in Fig. 7. Now the interference fringes are oriented diagonally the other way compared to Fig. 3. How are we to measure them this time?

Figure 7: The absolute-value-squared of the wave function for the system shown in Fig. 6. The interference fringes lie on the opposite diagonal from those of Fig. 3.

The average position 1/2(x1+x2) won’t do; we’ll see nothing interesting there. Instead the fringes are near (x1-x2)=4 — that is, they occur when the particles, no matter where they are in physical space, are at a distance of four units. We therefore expect interference near 1/2(x1-x2)=2. Is it there?

In Fig. 8 I’ve shown the analogue of Figs. 4 and 5, depicting

  • the measurements of the two particle positions x1 and x2, along with
  • their average 1/2(x1+x2) plotted between them (in yellow)
  • (half) their difference 1/2(x1-x2) plotted below them (in green).

That quantity 1/2(x1-x2) is half the horizontal length of the orange line. Hidden in its behavior over many measurements is an interference pattern, seen in the bottom row, where the 1/2(x1-x2) measurements are plotted. [Note also that there is no interference pattern in the measurements of 1/2(x1+x2), in contrast to Fig. 5.]

Figure 8: For the system of Figs. 6-7, repeated experiments in which the measurement of the position of particle 1 is plotted in the top row (upper blue points), that of particle 2 is plotted in the third row (lower blue points), their average is plotted between (yellow points), and half their difference is plotted below them (green points.) Each new set of measurements is shown as orange points connected by an orange line, as in Fig. 5. An interference pattern is seen only in the difference.

Now the question of the hour: where is the interference in this case? It is found near 1/2(x1-x2)=2 — but that certainly is not to be identified with a legitimate position in physical space, such as the point x=2.

First of all, making such an identification in Fig. 8 would be like saying that one particle is in New York and the other is in Boston, while the interference is 150 kilometers offshore in the ocean. But second and much worse, I could change Fig. 8 by moving both particles 10 units to the left and repeating the experiment. This would cause x1, x2, and 1/2(x1+x2) in Fig. 8 to all shift left by 10 units, moving them off your computer screen, while leaving 1/2(x1-x2) unchanged at 2. In short, all the orange and blue and yellow points would move out of your view, while the green points would remain exactly where they are. The difference of positions — a distance — is not a position.

If 10 units isn’t enough to convince you, let’s move the two particles to the other side of the Sun, or to the other side of the galaxy. The interference pattern stubbornly remains at 1/2(x1-x2)=2. The interference pattern is in a difference of positions, so it doesn’t care whether the two particles are in France, Antarctica, or Mars.

We can move the particles anywhere in the universe, as long as we move them together, keeping the distance between them the same, and the interference pattern remains exactly the same. So there’s no way we can identify the interference as being located at a particular value of x, the coordinate of physical space. Trying to do so creates nonsense.
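
The same kind of toy model as above makes this concrete (again with arbitrarily chosen packet width, momentum, and separation): translate both particles by any amount you like, and the distribution of 1/2(x1-x2), fringes and all, does not budge, even as the individual positions move far away.

    import numpy as np

    x = np.linspace(-25, 25, 1001)
    X1, X2 = np.meshgrid(x, x, indexing="ij")
    sigma, k = 1.0, 5.0            # illustrative packet width and relative momentum

    def prob_density(shift):
        # Both branches share an envelope with particle 1 near shift+2 and particle 2 near
        # shift-2, so x1-x2 is centered at 4 as in Figs. 6-8; branch A is stationary while
        # branch B carries a relative momentum (phase chosen to put a fringe peak at x1-x2=4).
        env = np.exp(-((X1 - shift - 2)**2 + (X2 - shift + 2)**2) / (4 * sigma**2))
        psi = env * (1 + np.exp(1j * k * (X1 - X2 - 4)))
        p = np.abs(psi)**2
        return p / p.sum()

    bins = np.linspace(-4, 8, 481)
    centers = 0.5 * (bins[:-1] + bins[1:])
    for shift in (0.0, 10.0):      # second pass: both particles moved 10 units to the right
        p = prob_density(shift)
        mean_x1 = p.sum(axis=1) @ x          # average measured position of particle 1
        h, _ = np.histogram(((X1 - X2) / 2).ravel(), bins=bins, weights=p.ravel())
        print(f"shift={shift:5.1f}:  <x1> = {mean_x1:6.2f},  "
              f"half-difference fringes peak near {centers[h.argmax()]:.2f}")

In the second pass the mean position of each particle shifts by 10 units, but the peak of the half-difference distribution stays put near 2.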

This is totally unlike interference in water waves and sound waves. That kind of interference happens someplace definite; we can say where the waves are, how big they are at a particular location, and where their peaks and valleys are in physical space. Quantum interference is not at all like this. It’s something more general, more subtle, and more troubling to our intuition.

[By the way, there’s nothing special about the two combinations 1/2(x1+x2) and 1/2(x1-x2), the average or the difference. It’s easy to find systems where the interference arises in the combination x1+2x2, or 3x1-x2, or any other one you like. In none of these is there a natural way to say “where” the interference is located.]

The Profound Lesson

From these examples, we can begin to learn a central lesson of modern physics, one that a century of experimental and theoretical physics has been teaching us repeatedly, with ever greater subtlety. Imagining reality as many of us are inclined to do, as made of localized objects positioned in and moving through physical space — the one-dimensional x-axis in my simple examples, and the three-dimensional physical space that we take for granted when we specify our latitude, longitude and altitude — is simply not going to work in a quantum universe. The correlations among objects have observable consequences, and those correlations cannot simply be ascribed locations in physical space. To make sense of them, it seems we need to expand our conception of reality.

In the process of recognizing this challenge, we have had to confront the giant, unwieldy space of possibilities, which we can only visualize for a single particle moving in up to three dimensions, or for two or three particles moving in just one dimension. In realistic circumstances, especially those of quantum field theory, the space of possibilities has a huge number of dimensions, rendering it horrendously unimaginable. Whether this gargantuan space should be understood as real — perhaps even more real than physical space — continues to be debated.

Indeed, the lessons of quantum interference are ones that physicists and philosophers have been coping with for a hundred years, and their efforts to make sense of them continue to this day. I hope this series of posts has helped you understand these issues, and to appreciate their depth and difficulty.

Looking ahead, we’ll soon take these lessons, and other lessons from recent posts, back to the double-slit experiment. With fresher, better-informed eyes, we’ll examine its puzzles again.

March 31, 2025

Scott Aaronson Tragedy in one shitty act

Far-Left Students and Faculty: We’d sooner burn universities to the ground than allow them to remain safe for the hated Zionist Jews, the baby-killing demons of the earth. We’ll disrupt their classes, bar them from student activities, smash their Hillel centers, take over campus buildings and quads, and chant for Hezbollah and the Al-Aqsa Martyrs Brigades to eradicate them like vermin. We’ll do all this because we’ve so thoroughly learned the lessons of the Holocaust.

Trump Administration [cackling]: Burn universities to the ground, you say? What a coincidence! We’d love nothing more than to do exactly that. Happy to oblige you.

Far-Left Students and Faculty: You fascist scum. We didn’t mean “call our bluff”! Was it the campus Zionists who ratted us out to you? It was, wasn’t it? You can’t do this without due process; we have rights!

Trump Administration: We don’t answer to you and we don’t care about “due process” or your supposed “rights.” We’re cutting all your funding, effective immediately. Actually, since you leftists don’t have much funding to speak of, let’s just cut any university funding whatsoever that we can reach. Cancer studies. Overhead on NIH grants. Student aid. Fellowships. Whatever universities use to keep the lights on. The more essential it is, the longer it took to build, the more we’ll enjoy the elitist professors’ screams of anguish as we destroy it all in a matter of weeks.

Far-Left Students and Faculty: This is the end, then. But if our whole little world must go up in flames, at least we’ll die having never compromised our most fundamental moral principle: the eradication of the State of Israel and the death of its inhabitants.

Sane Majorities at Universities, Including Almost Everyone in STEM: [don’t get a speaking part in this play. They’ve already bled out on the street, killed in the crossfire]

John Preskill How writing a popular-science book led to a Nature Physics paper

Several people have asked me whether writing a popular-science book has fed back into my research. Nature Physics published my favorite illustration of the answer this January. Here’s the story behind the paper.

In late 2020, I was sitting by a window in my home office (AKA living room) in Cambridge, Massachusetts. I’d drafted 15 chapters of my book Quantum Steampunk. The epilogue, I’d decided, would outline opportunities for the future of quantum thermodynamics. So I had to come up with opportunities for the future of quantum thermodynamics. The rest of the book had related foundational insights provided by quantum thermodynamics about the universe’s nature. For instance, quantum thermodynamics had sharpened the second law of thermodynamics, which helps explain time’s arrow, into more-precise statements. Conventional thermodynamics had not only provided foundational insights, but also accompanied the Industrial Revolution, a paragon of practicality. Could quantum thermodynamics, too, offer practical upshots?

Quantum thermodynamicists had designed quantum engines, refrigerators, batteries, and ratchets. Some of these devices could outperform their classical counterparts, according to certain metrics. Experimentalists had even realized some of these devices. But the devices weren’t useful. For instance, a simple quantum engine consisted of one atom. I expected such an atom to produce one electronvolt of energy per engine cycle. (A light bulb emits about 10^21 electronvolts of light per second.) Cooling the atom down and manipulating it would cost loads more energy. The engine wouldn’t earn its keep.

Autonomous quantum machines offered greater hope for practicality. By autonomous, I mean, not requiring time-dependent external control: nobody need twiddle knobs or push buttons to guide the machine through its operation. Such control requires work—organized, coordinated energy. Rather than receiving work, an autonomous machine accesses a cold environment and a hot environment. Heat—random, disorganized energy cheaper than work—flows from the hot to the cold. The machine transforms some of that heat into work to power itself. That is, the machine sources its own work from cheap heat in its surroundings. Some air conditioners operate according to this principle. So can some quantum machines—autonomous quantum machines.
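
The second law caps how much of that heat flow can be turned into work: per unit of heat drawn from the hot environment, at most a Carnot fraction (1 - T_cold/T_hot) is available as work. A small illustration, with temperatures chosen only for the sake of example rather than taken from any device discussed here:

    def max_work(Q_hot, T_hot, T_cold):
        """Carnot bound on the work extractable from heat Q_hot flowing from T_hot to T_cold."""
        return Q_hot * (1.0 - T_cold / T_hot)

    # Purely illustrative numbers: a 4 K environment feeding a 10 mK environment.
    print(max_work(1.0, 4.0, 0.010))   # ~0.9975: nearly all of the heat could in principle become work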

Thermodynamicists had designed autonomous quantum engines and refrigerators. Trapped-ion experimentalists had realized one of the refrigerators, in a groundbreaking result. Still, the autonomous quantum refrigerator wasn’t practical. Keeping the ion cold and maintaining its quantum behavior required substantial work.

My community needed, I wrote in my epilogue, an analogue of solar panels in southern California. (I probably drafted the epilogue during a Boston winter, thinking wistfully of Pasadena.) If you built a solar panel in SoCal, you could sit back and reap the benefits all year. The panel would fulfill its mission without further effort from you. If you built a solar panel in Rochester, you’d have to scrape snow off of it. Also, the panel would provide energy only a few months per year. The cost might not outweigh the benefit. Quantum thermal machines resembled solar panels in Rochester, I wrote. We needed an analogue of SoCal: an appropriate environment. Most of it would be cold (unlike SoCal), so that maintaining a machine’s quantum nature would cost a user almost no extra energy. The setting should also contain a slightly warmer environment, so that net heat would flow. If you deposited an autonomous quantum machine in such a quantum SoCal, the machine would operate on its own.

Where could we find a quantum SoCal? I had no idea.

Sunny SoCal. (Specifically, the Huntington Gardens.)

A few months later, I received an email from quantum experimentalist Simone Gasparinetti. He was setting up a lab at Chalmers University in Sweden. What, he asked, did I see as opportunities for experimental quantum thermodynamics? We’d never met, but we agreed to Zoom. Quantum Steampunk on my mind, I described my desire for practicality. I described autonomous quantum machines. I described my yearning for a quantum SoCal.

I have it, Simone said.

Simone and his colleagues were building a quantum computer using superconducting qubits. The qubits fit on a chip about the size of my hand. To keep  the chip cold, the experimentalists put it in a dilution refrigerator. You’ve probably seen photos of dilution refrigerators from Google, IBM, and the like. The fridges tend to be cylindrical, gold-colored monstrosities from which wires stick out. (That is, they look steampunk.) You can easily develop the impression that the cylinder is a quantum computer, but it’s only the fridge.

Not a quantum computer

The fridge, Simone said, resembles an onion: it has multiple layers. Outer layers are warmer, and inner layers are colder. The quantum computer sits in the innermost layer, so that it behaves as quantum mechanically as possible. But sometimes, even the fridge doesn’t keep the computer cold enough.

Imagine that you’ve finished one quantum computation and you’re preparing for the next. The computer has written quantum information to certain qubits, as you’ve probably written on scrap paper while calculating something in a math class. To prepare for your next math assignment, given limited scrap paper, you’d erase your scrap paper. The quantum computer’s qubits need erasing similarly. Erasing, in this context, means cooling down even more than the dilution refrigerator can manage.

Why not use an autonomous quantum refrigerator to cool the scrap-paper qubits?

I loved the idea, for three reasons. First, we could place the quantum refrigerator beside the quantum computer. The dilution refrigerator would already be cold, for the quantum computations’ sake. Therefore, we wouldn’t have to spend (almost any) extra work on keeping the quantum refrigerator cold. Second, Simone could connect the quantum refrigerator to an outer onion layer via a cable. Heat would flow from the warmer outer layer to the colder inner layer. From the heat, the quantum refrigerator could extract work. The quantum refrigerator would use that work to cool computational qubits—to erase quantum scrap paper. The quantum refrigerator would service the quantum computer. So, third, the quantum refrigerator would qualify as practical.

Over the next three years, we brought that vision to life. (By we, I mostly mean Simone’s group, as my group doesn’t have a lab.)

Artist’s conception of the autonomous-quantum-refrigerator chip. Credit: Chalmers University of Technology/Boid AB/NIST.

Postdoc Aamir Ali spearheaded the experiment. Then-master’s student Paul Jamet Suria and PhD student Claudia Castillo-Moreno assisted him. Maryland postdoc Jeffrey M. Epstein began simulating the superconducting qubits numerically, then passed the baton to PhD student José Antonio Marín Guzmán. 

The experiment provided a proof of principle: it demonstrated that the quantum refrigerator could operate. The experimentalists didn’t apply the quantum refrigerator in a quantum computation. Also, they didn’t connect the quantum refrigerator to an outer onion layer. Instead, they pumped warm photons to the quantum refrigerator via a cable. But even in such a stripped-down experiment, the quantum refrigerator outperformed my expectations. I thought it would barely lower the “scrap-paper” qubit’s temperature. But that qubit reached a temperature of 22 milliKelvin (mK). For comparison: if the qubit had merely sat in the dilution refrigerator, it would have reached a temperature of 45–70 mK. State-of-the-art protocols had lowered scrap-paper qubits’ temperatures to 40–49 mK. So our quantum refrigerator outperformed our competitors, through the lens of temperature. (Our quantum refrigerator cooled more slowly than they did, though.)
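
To see why those millikelvin differences matter, it helps to translate temperature into the probability that a qubit is thermally excited before a computation even starts. The sketch below assumes a qubit frequency of 5 GHz, a typical transmon value rather than the frequency used in this experiment, and representative temperatures from the ranges quoted above.

    import numpy as np

    h, kB = 6.62607015e-34, 1.380649e-23    # Planck and Boltzmann constants (SI units)
    f = 5e9                                  # assumed qubit frequency: 5 GHz (illustrative)

    def thermal_excitation(T_millikelvin):
        """Probability that a two-level qubit is excited in a thermal state at this temperature."""
        x = h * f / (kB * T_millikelvin * 1e-3)
        return 1.0 / (1.0 + np.exp(x))

    for label, T in [("dilution refrigerator alone (~60 mK)", 60),
                     ("state-of-the-art reset protocols (~45 mK)", 45),
                     ("autonomous quantum refrigerator (22 mK)", 22)]:
        print(f"{label:42s} residual excitation ~ {thermal_excitation(T):.1e}")

At this assumed frequency, dropping from the mid-forties of millikelvin to 22 mK lowers the residual excitation by more than two orders of magnitude, which is why the reset temperature is the figure to watch.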

Simone, José Antonio, and I have followed up on our autonomous quantum refrigerator with a forward-looking review about useful autonomous quantum machines. Keep an eye out for a blog post about the review…and for what we hope grows into a subfield.

In summary, yes, publishing a popular-science book can benefit one’s research.

March 30, 2025

Doug Natelson Science updates - brief items

Here are a couple of neat papers that I came across in the last week.  (Planning to write something about multiferroics as well, once I have a bit of time.)

  • The idea of directly extracting useful energy from the rotation of the earth sounds like something out of an H. G. Wells novel. At a rough estimate (and it's impressive to me that AI tools are now able to provide a convincing step-by-step calculation of this; I tried w/ gemini.google.com) the rotational kinetic energy of the earth is about \(2.6 \times 10^{29}\) J. (A rough back-of-the-envelope check is sketched just after this list.) The tricky bit is, how do you get at it? You might imagine constructing some kind of big space-based pick-up coil and getting some inductive voltage generation as the earth rotates its magnetic field past the coil. Intuitively, though, it seems like while sitting on the (rotating) earth, you should in some sense be comoving with respect to the local magnetic field, so it shouldn't be possible to do anything clever that way. It turns out, though, that Lorentz forces still apply when moving a wire through the axially symmetric parts of the earth's field. This has some conceptual contact with Faraday's dc electric generator. With the right choice of geometry and materials, it is possible to use such an approach to extract some (tiny at the moment) power. For the theory proposal, see here. For an experimental demonstration, using thermoelectric effects as a way to measure this (and confirm that the orientation of the cylindrical shell has the expected effect), see here. I need to read this more closely to decide if I really understand the nuances of how it works.
  • On a completely different note, this paper came out on Friday.  (Full disclosure:  The PI is my former postdoc and the second author was one of my students.)  It's an impressive technical achievement.  We are used to the fact that usually macroscopic objects don't show signatures of quantum interference.  Inelastic interactions of the object with its environment effectively suppress quantum interference effects on some time scale (and therefore some distance scale).  Small molecules are expected to still show electronic quantum effects at room temperature, since they are tiny and their electronic levels are widely spaced, and here is a review of what this could do in electronic measurements.  Quantum interference effects should also be possible in molecular vibrations at room temperature, and they could manifest themselves through the vibrational thermal conduction through single molecules, as considered theoretically here.  This experimental paper does a bridge measurement to compare the thermal transport between a single-molecule-containing junction between a tip and a surface, and an empty (farther spaced) twin tip-surface geometry.  They argue that they see differences between two kinds of molecules that originate from such quantum interference effects.
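
Here's the rough check mentioned in the first bullet, treating the earth as a uniform sphere (a slight overestimate, since the earth's mass is concentrated toward its core):

    import numpy as np

    M, R = 5.97e24, 6.37e6        # mass (kg) and mean radius (m) of the earth
    omega = 2 * np.pi / 86164     # angular speed for one sidereal day (s)
    I = 0.4 * M * R**2            # uniform-sphere moment of inertia, (2/5) M R^2
    print(0.5 * I * omega**2)     # ~2.6e29 J; with the measured I ~ 0.33 M R^2 it is ~2.1e29 J
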
As for more global issues about the US research climate, there will be more announcements soon about reductions in force and the forthcoming presidential budget request.  (Here is an online petition regarding the plan to shutter the NIST atomic spectroscopy group.)  Please pay attention to these issues, and if you're a US citizen, I urge you to contact your legislators and make your voice heard.  

March 28, 2025

Terence Tao Decomposing a factorial into large factors

I’ve just uploaded to the arXiv the paper “Decomposing a factorial into large factors“. This paper studies the quantity {t(N)}, defined as the largest quantity such that it is possible to factorize {N!} into {N} factors {a_1, \dots, a_N}, each of which is at least {t(N)}. The first few values of this sequence are

\displaystyle  1,1,1,2,2,2,2,2,3,3,3,3,3,4, \dots

(OEIS A034258). For instance, we have {t(9)=3}, because on the one hand we can factor

\displaystyle  9! = 3 \times 3 \times 3 \times 3 \times 4 \times 4 \times 5 \times 7 \times 8

but on the other hand it is not possible to factorize {9!} into nine factors, each of which is {4} or higher.
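
This small example is easy to verify directly:

    from math import factorial, prod

    factors = [3, 3, 3, 3, 4, 4, 5, 7, 8]      # the factorization of 9! displayed above
    assert prod(factors) == factorial(9) and len(factors) == 9
    print(min(factors))                        # 3, witnessing t(9) >= 3
    print(factorial(9) ** (1 / 9))             # ~4.147, so t(9) <= 4 by the upper bound below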

This quantity {t(N)} was introduced by Erdös, who asked for upper and lower bounds on {t(N)}; informally, this asks how equitably one can split up {N!} into {N} factors. When factoring an arbitrary number, this is essentially a variant of the notorious knapsack problem (after taking logarithms), but one can hope that the specific structure of the factorial {N!} can make this particular knapsack-type problem more tractable. Since

\displaystyle  N! = a_1 \dots a_N \geq t(N)^N

for any putative factorization, we obtain an upper bound

\displaystyle  t(N) \leq (N!)^{1/N} = \frac{N}{e} + O(\log N) \ \ \ \ \ (1)

thanks to the Stirling approximation. At one point, Erdös, Selfridge, and Straus claimed that this upper bound was asymptotically sharp, in the sense that

\displaystyle  t(N) = \frac{N}{e} + o(N) \ \ \ \ \ (2)

as {N \rightarrow \infty}; informally, this means we can split {N!} into {N} factors that are (mostly) approximately the same size, when {N} is large. However, as reported in this later paper, Erdös “believed that Straus had written up our proof… Unfortunately Straus suddenly died and no trace was ever found of his notes. Furthermore, we never could reconstruct our proof, so our assertion now can be called only a conjecture”.

Some further exploration of {t(N)} was conducted by Guy and Selfridge. There is a simple construction that gives the lower bound

\displaystyle  t(N) \geq \frac{3}{16} N - o(N)

that comes from starting with the standard factorization {N! = 1 \times 2 \times \dots \times N} and transferring some powers of {2} from the later part of the sequence to the earlier part to rebalance the terms somewhat. More precisely, if one removes one power of two from the even numbers between {\frac{3}{8}N} and {N}, and one additional power of two from the multiples of four between {\frac{3}{4}N} and {N}, this frees up {\frac{3}{8}N + o(N)} powers of two that one can then distribute amongst the numbers up to {\frac{3}{16} N} to bring them all up to at least {\frac{3}{16} N - o(N)} in size. A more complicated procedure involving transferring both powers of {2} and {3} then gives the improvement {t(N) \geq \frac{1}{4} N - o(N)}. At this point, however, things got more complicated, and the following conjectures were made by Guy and Selfridge:
  • (i) Is {\frac{t(N)}{N} \leq \frac{1}{e}} for all {N \neq 1,2,4}?
  • (ii) Is {t(N) \geq \lfloor 2N/7 \rfloor} for all {N \neq 56}? (At {N=56}, this conjecture barely fails: {t(56) = 15 < 16 = \lfloor 2 \times 56/7 \rfloor}.)
  • (iii) Is {\frac{t(N)}{N} \geq \frac{1}{3}} for all {N \geq 300000}?

In this note we establish the bounds

\displaystyle  \frac{1}{e} - \frac{O(1)}{\log N} \leq \frac{t(N)}{N} \leq \frac{1}{e} - \frac{c_0+o(1)}{\log N} \ \ \ \ \ (3)

as {N \rightarrow \infty}, where {c_0} is the explicit constant

\displaystyle  c_0 := \frac{1}{e} \int_0^1 \left \lfloor \frac{1}{x} \right\rfloor \log \left( ex \left \lceil \frac{1}{ex} \right\rceil \right)\ dx \approx 0.3044.

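The constant {c_0} is easy to evaluate numerically: between consecutive points of the form {1/k} and {1/(em)}, the integrand equals {k(1 + \log x + \log m)} for fixed integers {k} and {m}, and so can be integrated in closed form on each piece. The following short script (not taken from the paper) reproduces the quoted value.

    import numpy as np

    e = np.e
    cutoff = 1e-6   # the integrand is bounded by roughly e, so ignoring (0, cutoff) costs little
    # All points in [cutoff, 1] where floor(1/x) or ceil(1/(e x)) can jump:
    pts = np.concatenate((1.0 / np.arange(1, int(1 / cutoff) + 1),
                          1.0 / (e * np.arange(1, int(1 / (e * cutoff)) + 2))))
    pts = np.unique(pts[pts >= cutoff])
    lo, hi = pts[:-1], pts[1:]
    mid = 0.5 * (lo + hi)
    k = np.floor(1.0 / mid)              # floor(1/x), constant on each piece
    m = np.ceil(1.0 / (e * mid))         # ceil(1/(e x)), constant on each piece
    # On each piece the integrand is k*(1 + log x + log m), whose antiderivative is
    # k*(x*log(x) + x*log(m)); sum the pieces and divide by e.
    F = lambda x: x * np.log(x) + x * np.log(m)
    print((k * (F(hi) - F(lo))).sum() / e)   # approximately 0.3044
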
In particular this recovers the lost result (2). An upper bound of the shape

\displaystyle  \frac{t(N)}{N} \leq \frac{1}{e} - \frac{c+o(1)}{\log N} \ \ \ \ \ (4)

for some {c>0} was previously conjectured by Erdös and Graham (Erdös problem #391). We conjecture that the upper bound in (3) is sharp, thus

\displaystyle  \frac{t(N)}{N} = \frac{1}{e} - \frac{c_0+o(1)}{\log N}, \ \ \ \ \ (5)

which is consistent with the above conjectures (i), (ii), (iii) of Guy and Selfridge, although numerically the convergence is somewhat slow.

The upper bound argument for (3) is simple enough that it could also be modified to establish the first conjecture (i) of Guy and Selfridge; in principle, (ii) and (iii) are now also reducible to a finite computation, but unfortunately the implied constants in the lower bound of (3) are too weak to make this directly feasible. However, it may be possible to now crowdsource the verification of (ii) and (iii) by supplying a suitable set of factorizations to cover medium sized {N}, combined with some effective version of the lower bound argument that can establish {\frac{t(N)}{N} \geq \frac{1}{3}} for all {N} past a certain threshold. The value {N=300000} singled out by Guy and Selfridge appears to be quite a suitable test case: the constructions I tried fell just a little short of the conjectured threshold of {100000}, but it seems barely within reach that a sufficiently efficient rearrangement of factors can work here.

We now describe the proof of the upper and lower bound in (3). To improve upon the trivial upper bound (1), one can use the large prime factors of {N!}. Indeed, every prime {p} between {N/e} and {N} divides {N!} at least once (and the ones between {N/e} and {N/2} divide it twice), and any factor {a_i} that contains such a prime therefore has to be significantly larger than the benchmark value of {N/e}. This observation already readily leads to some upper bound of the shape (4) for some {c>0}; also using the primes {p} that are slightly less than {N/e} (noting that any multiple of {p} that exceeds {N/e} must in fact exceed {\lceil N/ep \rceil p}) is what leads to the precise constant {c_0}.

For previous lower bound constructions, one started with the initial factorization {N! = 1 \times \dots \times N} and then tried to “improve” this factorization by moving around some of the prime factors. For the lower bound in (3), we start instead with an approximate factorization roughly of the shape

\displaystyle  N! \approx (\prod_{t \leq n < t + 2N/A, \hbox{ odd}} n)^A

where {t} is the target lower bound (so, slightly smaller than {N/e}), and {A} is a moderately sized natural number parameter (we will take {A \asymp \log^3 N}, although there is significant flexibility here). If we denote the right-hand side here by {B}, then {B} is basically a product of {N} numbers of size at least {t}. It is not literally equal to {N!}; however, an easy application of Legendre’s formula shows that for odd small primes {p}, {N!} and {B} have almost exactly the same number of factors of {p}. On the other hand, as {B} is odd, {B} contains no factors of {2}, while {N!} contains about {N} such factors. The prime factorizations of {B} and {N!} differ somewhat at large primes, but {B} has slightly more such prime factors than {N!} (about {\frac{N}{\log N} \log 2} such factors, in fact). By some careful applications of the prime number theorem, one can tweak some of the large primes appearing in {B} to make the prime factorization of {B} and {N!} agree almost exactly, except that {B} is missing most of the powers of {2} in {N!}, while having some additional large prime factors beyond those contained in {N!} to compensate. With a suitable choice of threshold {t}, one can then replace these excess large prime factors with powers of two to obtain a factorization of {N!} into {N} terms that are all at least {t}, giving the lower bound.

The general approach of first locating some approximate factorization of {N!} (where the approximation is in the “adelic” sense of having not just approximately the right magnitude, but also approximately the right number of factors of {p} for various primes {p}), and then moving factors around to get an exact factorization of {N!}, looks promising for also resolving the conjectures (ii), (iii) mentioned above. For instance, I was numerically able to verify that {t(300000) \geq 90000} by the following procedure:

  • Start with the approximate factorization of {N!}, {N = 300000} by {B = (\prod_{90000 \leq n < 102000, \hbox{ odd}} n)^{50}}. Thus {B} is the product of {N} odd numbers, each of which is at least {90000}.
  • Call an odd prime {B}-heavy if it divides {B} more often than {N!}, and {N!}-heavy if it divides {N!} more often than {B}. It turns out that there are {14891} more {B}-heavy primes than {N!}-heavy primes (counting multiplicity). On the other hand, {N!} contains {299992} powers of {2}, while {B} has none. This represents the (multi-)set of primes one has to redistribute in order to convert a factorization of {B} to a factorization of {N!}.
  • Using a greedy algorithm, one can match a {B}-heavy prime {p'} to each {N!}-heavy prime {p} (counting multiplicity) in such a way that {p' \leq 2^{m_p} p} for a small {m_p} (in most cases one can make {m_p=0}, and often one also has {p'=p}). If we then replace {p'} in the factorization of {B} by {2^{m_p} p} for each {N!}-heavy prime {p}, this increases {B} (and does not decrease any of the {N} factors of {B}), while eliminating all the {N!}-heavy primes. With a somewhat crude matching algorithm, I was able to do this using {\sum_p m_p = 39992} of the {299992} powers of {2} dividing {N!}, leaving {260000} powers remaining at my disposal. (I don’t claim that this is the most efficient matching, in terms of powers of two required, but it sufficed.)
  • There are still {14891} {B}-heavy primes left over in the factorization of (the modified version of) {B}. Replacing each of these primes with {2^{17} \geq 90000}, and then distributing the remaining {260000 - 17 \times 14891 = 6853} powers of two arbitrarily, this obtains a factorization of {N!} into {N} terms, each of which are at least {90000}.

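The bookkeeping in this recipe is straightforward to reproduce: Legendre’s formula gives the number of powers of two in {300000!}, and the rest is arithmetic with the counts quoted above.

    def legendre(n, p):
        """Exponent of the prime p in n!, via Legendre's formula."""
        e = 0
        while n:
            n //= p
            e += n
        return e

    print(legendre(300000, 2))     # 299992 powers of two available in 300000!
    print(299992 - 39992)          # 260000 powers left after the greedy matching step
    print(260000 - 17 * 14891)     # 6853 powers left to distribute at the end
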
However, I was not able to adjust parameters to reach {t(300000) \geq 100000} in this manner. Perhaps some readers here who are adept with computers can come up with a more efficient construction to get closer to this bound? If one can find a way to reach this bound, most likely it can be adapted to then resolve conjectures (ii) and (iii) above after some additional numerical effort.

Terence Tao Cosmic Distance Ladder videos with Grant Sanderson (3blue1brown): commentary and corrections

Grant Sanderson (who runs, and creates most of the content for, the website and Youtube channel 3blue1brown) has been collaborating with myself and others (including my coauthor Tanya Klowden) on producing a two-part video giving an account of some of the history of the cosmic distance ladder, building upon a previous public lecture I gave on this topic, and also relating to a forthcoming popular book with Tanya on this topic. The first part of this video is available here; the second part is available here.

The videos were based on a somewhat unscripted interview that Grant conducted with me some months ago, and as such contained some minor inaccuracies and omissions (including some made for editing reasons to keep the overall narrative coherent and within a reasonable length). They also generated many good questions from the viewers of the Youtube video. I am therefore compiling here a “FAQ” of various clarifications and corrections to the videos; this was originally placed as a series of comments on the Youtube channel, but the blog post format here will be easier to maintain going forward. Some related content will also be posted on the Instagram page for the forthcoming book with Tanya.

Questions on the two main videos are marked with an appropriate timestamp to the video.

Comments on part 1 of the video

  • 4:26 Did Eratosthenes really check a local well in Alexandria?

    This was a narrative embellishment on my part. Eratosthenes’s original work is lost to us. The most detailed contemporaneous account, by Cleomedes, gives a simplified version of the method, and makes reference only to sundials (gnomons) rather than wells. However, a secondary account of Pliny states (using this English translation), “Similarly it is reported that at the town of Syene, 5000 stades South of Alexandria, at noon in midsummer no shadow is cast, and that in a well made for the sake of testing this the light reaches to the bottom, clearly showing that the sun is vertically above that place at the time”. However, no mention is made of any well in Alexandria in either account.
  • 4:50 How did Eratosthenes know that the Sun was so far away that its light rays were close to parallel?

    This was not made so clear in our discussions or in the video (other than a brief glimpse of the timeline at 18:27), but Eratosthenes’s work actually came after Aristarchus, so it is very likely that Eratosthenes was aware of Aristarchus’s conclusions about how distant the Sun was from the Earth. Even if Aristarchus’s heliocentric model was disputed by the other Greeks, at least some of his other conclusions appear to have attracted some support. Also, after Eratosthenes’s time, there was further work by Greek, Indian, and Islamic astronomers (such as Hipparchus, Ptolemy, Aryabhata, and Al-Battani) to measure the same distances that Aristarchus did, although these subsequent measurements for the Sun also were somewhat far from modern accepted values.
  • 5:17 Is it completely accurate to say that on the summer solstice, the Earth’s axis of rotation is tilted “directly towards the Sun”?

    Strictly speaking, “in the direction towards the Sun” is more accurate than “directly towards the Sun”; it tilts at about 23.5 degrees towards the Sun, but it is not a total 90-degree tilt towards the Sun.
  • 5:39 Wait, aren’t there two tropics? The tropic of Cancer and the tropic of Capricorn?

    Yes! This corresponds to the two summers Earth experiences, one in the Northern hemisphere and one in the Southern hemisphere. The tropic of Cancer, at a latitude of about 23 degrees north, is where the Sun is directly overhead at noon during the Northern summer solstice (around June 21); the tropic of Capricorn, at a latitude of about 23 degrees south, is where the Sun is directly overhead at noon during the Southern summer solstice (around December 21). But Alexandria and Syene were both in the Northern Hemisphere, so it is the tropic of Cancer that is relevant to Eratosthenes’ calculations.
  • 5:41 Isn’t it kind of a massive coincidence that Syene was on the tropic of Cancer?

    Actually, Syene (now known as Aswan) was about half a degree of latitude away from the tropic of Cancer, which was one of the sources of inaccuracy in Eratosthenes’ calculations.  But one should take the “look-elsewhere effect” into account: because the Nile cuts across the tropic of Cancer, it was quite likely to happen that the Nile would intersect the tropic near some inhabited town.  It might not necessarily have been Syene, but that would just mean that Syene would have been substituted by this other town in Eratosthenes’s account.  

    On the other hand, it was fortunate that the Nile ran from South to North, so that distances between towns were a good proxy for the differences in latitude.  Apparently, Eratosthenes actually had a more complicated argument that would also work if the two towns in question were not necessarily oriented along the North-South direction, and if neither town was on the tropic of Cancer; but unfortunately the original writings of Eratosthenes are lost to us, and we do not know the details of this more general argument. (But some variants of the method can be found in later work of Posidonius, Aryabhata, and others.)

    Nowadays, the “Eratosthenes experiment” is run every year on the March equinox, in which schools at the same longitude are paired up to measure the elevation of the Sun at the same point in time, in order to obtain a measurement of the circumference of the Earth.  (The equinox is more convenient than the solstice when neither location is on a tropic, due to the simple motion of the Sun at that date.) With modern timekeeping, communications, surveying, and navigation, this is a far easier task to accomplish today than it was in Eratosthenes’ time.
  • 6:30 I thought the Earth wasn’t a perfect sphere. Does this affect this calculation?

    Yes, but only by a small amount. The centrifugal forces caused by the Earth’s rotation along its axis cause an equatorial bulge and a polar flattening so that the radius of the Earth fluctuates by about 20 kilometers from pole to equator. This sounds like a lot, but it is only about 0.3% of the mean Earth radius of 6371 km and is not the primary source of error in Eratosthenes’ calculations.
  • 7:27 Are the riverboat merchants and the “grad student” the leading theories for how Eratosthenes measured the distance from Alexandria to Syene?

    There is some recent research that suggests that Eratosthenes may have drawn on the work of professional bematists (step measurers – a precursor to the modern profession of surveyor) for this calculation. This somewhat ruins the “grad student” joke, but perhaps should be disclosed for the sake of completeness.
  • 8:51 How long is a “lunar month” in this context? Is it really 28 days?

    In this context the correct notion of a lunar month is a “synodic month” – the length of a lunar cycle relative to the Sun – which is actually about 29 days and 12 hours. It differs from the “sidereal month” – the length of a lunar cycle relative to the fixed stars – which is about 27 days and 8 hours – due to the motion of the Earth around the Sun (or the Sun around the Earth, in the geocentric model). [A similar correction needs to be made around 14:59, using the synodic month of 29 days and 12 hours rather than the “English lunar month” of 28 days (4 weeks).]
  • 10:47 Is the time taken for the Moon to complete an observed rotation around the Earth slightly less than 24 hours as claimed?

    Actually, I made a sign error: the lunar day (also known as a tidal day) is actually 24 hours and 50 minutes, because the Moon orbits the Earth in the same direction as the Earth spins around its axis. The animation therefore is moving in the wrong direction as well (related to this, the line of sight is covering up the Moon in the wrong direction relative to the Moon rising at around 10:38).
  • 11:32 Is this really just a coincidence that the Moon and Sun have almost the same angular width?

    I believe so. First of all, the agreement is not that good: due to the non-circular nature of the orbit of the Moon around the Earth, and Earth around the Sun, the angular width of the Moon actually fluctuates to be as much as 10% larger or smaller than the Sun at various times (cf. the “supermoon” phenomenon). All other known planets with known moons do not exhibit this sort of agreement, so there does not appear to be any universal law of nature that would enforce this coincidence. (This is in contrast with the empirical fact that the Moon always presents the same side to the Earth, which occurs in all other known large moons (as well as Pluto), and is well explained by the physical phenomenon of tidal locking.)

    On the other hand, as the video hopefully demonstrates, the existence of the Moon was extremely helpful in allowing the ancients to understand the basic nature of the solar system. Without the Moon, their task would have been significantly more difficult; but in this hypothetical alternate universe, it is likely that modern cosmology would have still become possible once advanced technology such as telescopes, spaceflight, and computers became available, especially when combined with the modern mathematics of data science. Without giving away too many spoilers, a scenario similar to this was explored in the classic short story and novel “Nightfall” by Isaac Asimov.
  • 12:58 Isn’t the illuminated portion of the Moon, as well as the visible portion of the Moon, slightly smaller than half of the entire Moon, because the Earth and Sun are not an infinite distance away from the Moon?

    Technically yes (and this is actually for a very similar reason to why half Moons don’t quite occur halfway between the new Moon and the full Moon); but this fact turns out to have only a very small effect on the calculations, and is not the major source of error. In reality, the Sun turns out to be about 86,000 Moon radii away from the Moon, so asserting that half of the Moon is illuminated by the Sun is actually a very good first approximation. (The Earth is “only” about 220 Moon radii away, so the visible portion of the Moon is a bit more noticeably less than half; but this doesn’t actually affect Aristarchus’s arguments much.)

    The angular diameter of the Sun also creates an additional thin band between the fully illuminated and fully non-illuminated portions of the Moon, in which the Sun is intersecting the lunar horizon and so only illuminates the Moon with a portion of its light, but this is also a relatively minor effect (and the midpoints of this band can still be used to define the terminator between illuminated and non-illuminated for the purposes of Aristarchus’s arguments).
  • 13:27 What is the difference between a half Moon and a quarter Moon?

    If one divides the lunar month, starting and ending at a new Moon, into quarters (weeks), then half moons occur both near the end of the first quarter (a week after the new Moon, and a week before the full Moon), and near the end of the third quarter (a week after the full Moon, and a week before the new Moon). So, somewhat confusingly, half Moons come in two types, known as “first quarter Moons” and “third quarter Moons”.
  • 14:49 I thought the sine function was introduced well after the ancient Greeks.

    It’s true that the modern sine function only dates back to the Indian and Islamic mathematical traditions in the first millennium CE, several centuries after Aristarchus.  However, he still had Euclidean geometry at his disposal, which provided tools such as similar triangles that could be used to reach basically the same conclusions, albeit with significantly more effort than would be needed if one could use modern trigonometry.

    On the other hand, Aristarchus was somewhat hampered by not knowing an accurate value for \pi, which is also known as Archimedes’ constant: the fundamental work of Archimedes on this constant actually took place a few decades after that of Aristarchus!
  • 15:17 I plugged in the modern values for the distances to the Sun and Moon and got 18 minutes for the discrepancy, instead of half an hour.

    Yes; I quoted the wrong number here. In 1630, Godfried Wendelen replicated Aristarchus’s experiment. With improved timekeeping and the then-recent invention of the telescope, Wendelen obtained a measurement of half an hour for the discrepancy, which is significantly better than Aristarchus’s calculation of six hours, but still a little bit off from the true value of 18 minutes. (As such, Wendelen’s estimate for the distance to the Sun was 60% of the true value.)
  • 15:27 Wouldn’t Aristarchus also have access to other timekeeping devices than sundials?

    Yes, for instance clepsydrae (water clocks) were available by that time; but they were of limited accuracy. It is also possible that Aristarchus could have used measurements of star elevations to also estimate time; it is not clear whether the astrolabe or the armillary sphere was available to him, but he would have had some other more primitive astronomical instruments such as the dioptra at his disposal. But again, the accuracy and calibration of these timekeeping tools would have been poor.

    However, most likely the more important limiting factor was the ability to determine the precise moment at which a perfect half Moon (or new Moon, or full Moon) occurs; this is extremely difficult to do with the naked eye. (The telescope would not be invented for almost two more millennia.)
  • 17:37 Could the parallax problem be solved by assuming that the stars are not distributed in a three-dimensional space, but instead on a celestial sphere?

    Putting all the stars on a fixed sphere would make the parallax effects less visible, as the stars in a given portion of the sky would now all move together at the same apparent velocity – but there would still be visible large-scale distortions in the shape of the constellations because the Earth would be closer to some portions of the celestial sphere than others; there would also be variability in the brightness of the stars, and (if they were very close) the apparent angular diameter of the stars. (These problems would be solved if the celestial sphere was somehow centered around the moving Earth rather than the fixed Sun, but then this basically becomes the geocentric model with extra steps.)
  • 18:29 Did nothing of note happen in astronomy between Eratosthenes and Copernicus?

    Not at all! There were significant mathematical, technological, theoretical, and observational advances by astronomers from many cultures (Greek, Islamic, Indian, Chinese, European, and others) during this time, for instance improving some of the previous measurements on the distance ladder, a better understanding of eclipses, axial tilt, and even axial precession, more sophisticated trigonometry, and the development of new astronomical tools such as the astrolabe. See for instance this “deleted scene” from the video, as well as the FAQ entry for 14:49 for this video and 24:54 for the second video, or this instagram post. But in order to make the overall story of the cosmic distance ladder fit into a two-part video, we chose to focus primarily on the first time each rung of the ladder was climbed.
  • 18:30 Is that really Kepler’s portrait?

    We have since learned that this portrait was most likely painted in the 19th century, and may have been based more on Kepler’s mentor, Michael Mästlin. A more commonly accepted portrait of Kepler may be found at his current Wikipedia page.
  • 19:07 Isn’t it tautological to say that the Earth takes one year to perform a full orbit around the Sun?

    Technically yes, but this is an illustration of the philosophical concept of “referential opacity“: the content of a sentence can change when substituting one term for another (e.g., “1 year” and “365 days”), even when both terms refer to the same object. Amusingly, the classic illustration of this, known as Frege’s puzzles, also comes from astronomy: it is an informative statement that Hesperus (the evening star) and Phosphorus (the morning star, also known as Lucifer) are the same object (which nowadays we call Venus), but it is a mere tautology that Hesperus and Hesperus are the same object: changing the reference from Phosphorus to Hesperus changes the meaning.
  • 19:10 How did Copernicus figure out the crucial fact that Mars takes 687 days to go around the Sun? Was it directly drawn from Babylonian data?

    Technically, Copernicus drew from tables by European astronomers that were largely based on earlier tables from the Islamic golden age, which in turn drew from earlier tables by Indian and Greek astronomers, the latter of which also incorporated data from the ancient Babylonians, so it is more accurate to say that Copernicus relied on centuries of data, at least some of which went all the way back to the Babylonians. Among all of this data was the times when Mars was in opposition to the Sun; if one imagines the Earth and Mars as being like runners going around a race track circling the Sun, with Earth on an inner track and Mars on an outer track, oppositions are analogous to when the Earth runner “laps” the Mars runner. From the centuries of observational data, such “laps” were known to occur about once every 780 days (this is known as the synodic period of Mars). Because the Earth takes roughly 365 days to perform a “lap”, it is possible to do a little math and conclude that Mars must therefore complete its own “lap” in 687 days (this is known as the sidereal period of Mars). (Explicitly, the lapping rate is the difference of the two orbital rates, 1/780 ≈ 1/365.25 - 1/687 laps per day, which can be solved to give the 687 days.) (See also this post on the cosmic distance ladder Instagram for some further elaboration.)
  • 20:52 Did Kepler really steal data from Brahe?

    The situation is complex. When Kepler served as Brahe’s assistant, Brahe only provided Kepler with a limited amount of data, primarily involving Mars, in order to confirm Brahe’s own geo-heliocentric model. After Brahe’s death, the data was inherited by Brahe’s son-in-law and other relatives, who intended to publish Brahe’s work separately; however, Kepler, who was appointed as Imperial Mathematician to succeed Brahe, had at least some partial access to the data, and many historians believe he secretly copied portions of this data to aid his own research before finally securing complete access to the data from Brahe’s heirs after several years of disputes. On the other hand, as intellectual property rights laws were not well developed at this time, Kepler’s actions were technically legal, if ethically questionable.
  • 21:39 What is that funny loop in the orbit of Mars?

    This is known as retrograde motion. This arises because the orbital velocity of Earth (about 30 km/sec) is a little bit larger than that of Mars (about 24 km/sec). So, in opposition (when Mars is in the opposite position in the sky than the Sun), Earth will briefly overtake Mars, causing its observed position to move westward rather than eastward. But in most other times, the motion of Earth and Mars are at a sufficient angle that Mars will continue its apparent eastward motion despite the slightly faster speed of the Earth.
  • 21:59 Couldn’t one also work out the direction to other celestial objects in addition to the Sun and Mars, such as the stars, the Moon, or the other planets?  Would that have helped?

    Actually, the directions to the fixed stars were implicitly used in all of these observations to determine how the celestial sphere was positioned, and all the other directions were taken relative to that celestial sphere.  (Otherwise, all the calculations would have had to be done in a rotating frame of reference in which the unknown orbits of the planets were themselves rotating, which would have been an even more complex task.)  But the stars are too far away to be useful as one of the two landmarks to triangulate from, as they exhibit almost no parallax and so cannot be used to distinguish one location of the Earth from another.

    Measuring the direction to the Moon would tell you which portion of the lunar cycle one was in, and would determine the phase of the Moon, but this information would not help one triangulate, because the Moon’s position in the heliocentric model varies over time in a somewhat complicated fashion, and is too tied to the motion of the Earth to serve as a useful “landmark” for determining the Earth’s orbit around the Sun.

    In principle, using the measurements to all the planets at once could allow for some multidimensional analysis that would be more accurate than analyzing each of the planets separately, but this would require some sophisticated statistical analysis and modeling, as well as non-trivial amounts of compute – neither of which were available in Kepler’s time.
  • 22:57 Can you elaborate on how we know that the planets all move on a plane?

    The Earth’s orbit lies in a plane known as the ecliptic (it is where the lunar and solar eclipses occur). Different cultures have divided up the ecliptic in various ways; in Western astrology, for instance, the twelve main constellations that cross the ecliptic are known as the Zodiac. The planets can be observed to only wander along the Zodiac, but not other constellations: for instance, Mars can be observed to be in Cancer or Libra, but never in Orion or Ursa Major. From this, one can conclude (as a first approximation, at least), that the planets all lie on the ecliptic.

    However, this isn’t perfectly true, and the planets will deviate from the ecliptic by a small angle known as the ecliptic latitude. Tycho Brahe’s observations on these latitudes for Mars were an additional useful piece of data that helped Kepler complete his calculations (basically by suggesting how to join together the different “jigsaw pieces”), but the math here gets somewhat complicated, so the story here has been somewhat simplified to convey the main ideas.
  • 23:04 What are the other universal problem solving tips?

    Grant Sanderson has a list (in a somewhat different order) in this previous video.
  • 23:28 Can one work out the position of Earth from fixed locations of the Sun and Mars when the Sun and Mars are in conjunction (the same location in the sky) or opposition (opposite locations in the sky)?

    Technically, these are two times when the technique of triangulation fails to be accurate; and also in the former case it is extremely difficult to observe Mars due to the proximity to the Sun. But again, following the Universal Problem Solving Tip from 23:07, one should initially ignore these difficulties to locate a viable method, and correct for these issues later. This video series by Welch Labs goes into Kepler’s methods in more detail.
  • 24:04 So Kepler used Copernicus’s calculation of 687 days for the period of Mars. But didn’t Kepler discard Copernicus’s theory of circular orbits?

    Good question! It turns out that Copernicus’s calculations of orbital periods are quite robust (especially with centuries of data), and continue to work even when the orbits are not perfectly circular. But even if the calculations did depend on the circular orbit hypothesis, it would have been possible to use the Copernican model as a first approximation for the period, in order to get a better, but still approximate, description of the orbits of the planets. This in turn can be fed back into the Copernican calculations to give a second approximation to the period, which can then give a further refinement of the orbits. Thanks to the branch of mathematics known as perturbation theory, one can often make this type of iterative process converge to an exact answer, with the error in each successive approximation being smaller than the previous one. (But performing such an iteration would probably have been beyond the computational resources available in Kepler’s time; also, the foundations of perturbation theory require calculus, which was only developed several decades after Kepler.)
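
    As a toy illustration of this kind of iterative refinement (a modern exercise in Python, not a reconstruction of any historical computation): Kepler’s equation M = E - e \sin E, which relates time to position along an elliptical orbit, can be solved by exactly this sort of fixed-point iteration, with the error shrinking by roughly a factor of e at each step.

        import math

        def eccentric_anomaly(M, e, tol=1e-12):
            """Solve Kepler's equation M = E - e*sin(E) by fixed-point iteration."""
            E = M                              # first approximation: treat the orbit as circular
            while True:
                E_next = M + e * math.sin(E)   # refine using the previous approximation
                if abs(E_next - E) < tol:
                    return E_next
                E = E_next

        # Mars-like eccentricity, one quarter of the way through an orbital period
        print(eccentric_anomaly(math.pi / 2, 0.0934))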
  • 24:21 Did Brahe have exactly 10 years of data on Mars’s positions?

    Actually, it was more like 17 years, but with many gaps, due both to inclement weather and to Brahe turning his attention to astronomical objects other than Mars in some years; also, in times of conjunction, Mars might only be visible in the daytime sky instead of the night sky, again complicating measurements. So the “jigsaw puzzle pieces” in 25:26 are in fact more complicated than always just five locations equally spaced in time; there are gaps and also observational errors to grapple with. But to understand the method one should ignore these complications; again, see “Universal Problem Solving Tip #1”. Even with his “idea of true genius”, it took many years of further painstaking calculation for Kepler to tease out his laws of planetary motion from Brahe’s messy and incomplete observational data.
  • 26:44 Shouldn’t the Earth’s orbit be spread out at perihelion and clustered closer together at aphelion, to be consistent with Kepler’s laws?

    Yes, you are right; there was a coding error here.
  • 26:53 What is the reference for Einstein’s “idea of pure genius”?

    Actually, the precise quote was “an idea of true genius”, and can be found in the introduction to Carola Baumgardt’s “Life of Kepler“.

Comments on the “deleted scene” on Al-Biruni

  • Was Al-Biruni really of Arab origin?

    Strictly speaking, no: his writings are all in Arabic, and he was nominally a subject of the Abbasid Caliphate whose rulers were Arab; but he was born in Khwarazm (in modern day Uzbekistan), and would have been a subject of either the Samanid empire or the Khwarazmian empire, both of which were largely self-governed and primarily Persian in culture and ethnic makeup, despite being technically vassals of the Caliphate. So he would have been part of what is sometimes called “Greater Persia” or “Greater Iran”.

    Another minor correction: while Al-Biruni was born in the tenth century, his work on the measurement of the Earth was published in the early eleventh century.
  • Is \theta really called the angle of declination?

    This was a misnomer on my part; this angle is more commonly called the dip angle.
  • But the height of the mountain would be so small compared to the radius of the Earth! How could this method work?

    Using the Taylor approximation \cos \theta \approx 1 - \theta^2/2, one can approximately write the relationship R = \frac{h \cos \theta}{1-\cos \theta} between the mountain height h, the Earth radius R, and the dip angle \theta (in radians) as R \approx 2 h / \theta^2. The key point here is the inverse quadratic dependence on \theta, which allows for even relatively small values of h to still be realistically useful for computing R. Al-Biruni’s measurement of the dip angle \theta was about 0.01 radians, leading to an estimate of R that is about four orders of magnitude larger than h, which is at least in the right ballpark given a typical mountain height (on the order of a kilometer) and the radius of the Earth (6400 kilometers).
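
    A quick numerical sanity check of these formulas, with purely illustrative numbers (not Al-Biruni’s actual measurements):

        import math

        h = 0.32         # mountain height in km (hypothetical)
        theta = 0.01     # dip angle in radians (roughly the size quoted above)

        R_exact = h * math.cos(theta) / (1 - math.cos(theta))    # exact relation
        R_approx = 2 * h / theta**2                              # Taylor approximation

        print(R_exact, R_approx)   # both are approximately 6400 km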
  • Was the method really accurate to within a percentage point?

    This is disputed, somewhat similarly to the previous calculations of Eratosthenes. Al-Biruni’s measurements were in cubits, but there were multiple incompatible types of cubit in use at the time. It has also been pointed out that atmospheric refraction effects would have created noticeable changes in the observed dip angle \theta. It is thus likely that the true accuracy of Al-Biruni’s method was poorer than 1%, but that this was somehow compensated for by choosing a favorable conversion between cubits and modern units.

Comments on the second part of the video

  • 1:13 Did Captain Cook set out to discover Australia?

    One of the objectives of Cook’s first voyage was to discover the hypothetical continent of Terra Australis. This was considered to be distinct from Australia, which at the time was known as New Holland. As this name might suggest, prior to Cook’s voyage, the northwest coastline of New Holland had been explored by the Dutch; Cook instead explored the eastern coastline, naming this portion New South Wales. The entire continent was later renamed to Australia by the British government, following a suggestion of Matthew Flinders; and the concept of Terra Australis was abandoned.
  • 4:40 The relative position of the Northern and Southern hemisphere observations is reversed from those earlier in the video.

    Yes, this was a slight error in the animation; the labels here should be swapped for consistency of orientation.
  • 7:06 So, when did they finally manage to measure the transit of Venus, and use this to compute the astronomical unit?

    While Le Gentil had the misfortune to not be able to measure either the 1761 or 1769 transits, other expeditions of astronomers (led by Dixon-Mason, Chappe d’Auteroche, and Cook) did take measurements of one or both of these transits with varying degrees of success, with the measurements of Cook’s team of the 1769 transit in Tahiti being of particularly high quality. All of this data was assembled later by Lalande in 1771, leading to the most accurate measurement of the astronomical unit at the time (within 2.3% of modern values, which was about three times more accurate than any previous measurement).
  • 8:53 What does it mean for the transit of Io to be “twenty minutes ahead of schedule” when Jupiter is in opposition (Jupiter is opposite to the Sun when viewed from the Earth)?

    Actually, it should be halved to “ten minutes ahead of schedule”, with the transit being “ten minutes behind schedule” when Jupiter is in conjunction, with the net discrepancy being twenty minutes (or actually closer to 16 minutes when measured with modern technology). Both transits are being compared against an idealized periodic schedule in which the transits are occurring at a perfectly regular rate (about 42 hours), where the period is chosen to be the best fit to the actual data. This discrepancy is only noticeable after carefully comparing transit times over a period of months; at any given position of Jupiter, the Doppler effects of Earth moving towards or away from Jupiter would only shift each transit by a few seconds compared to the previous transit, with the delays or accelerations only becoming cumulatively noticeable after many such transits.

    Also, the presentation here is oversimplified: at times of conjunction, Jupiter and Io are too close to the Sun for observation of the transit. Rømer actually observed the transits at other times than conjunction, and Huygens used more complicated trigonometry than what was presented here to infer a measurement for the speed of light in terms of the astronomical unit (which they had begun to measure a bit more accurately than in Aristarchus’s time; see the FAQ entry for 15:17 in the first video).
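
    To connect the roughly sixteen-minute net discrepancy mentioned above to the speed of light (a back-of-the-envelope check using modern values, not Rømer’s own numbers): it is essentially the light travel time across the diameter of the Earth’s orbit, \frac{2 \text{ AU}}{c} \approx \frac{2 \times 1.496 \times 10^{11} \text{ m}}{3.0 \times 10^8 \text{ m/s}} \approx 1000 \text{ s}, or about 16.6 minutes.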
  • 10:05 Are the astrological signs for Earth and Venus swapped here?

    Yes, this was a small mistake in the animation.
  • 10:34 Shouldn’t one have to account for the elliptical orbit of the Earth, as well as the proper motion of the star being observed, or the effects of general relativity?

    Yes; the presentation given here is a simplified one to convey the idea of the method, but in the most advanced parallax measurements, such as the ones taken by the Hipparcos and Gaia spacecraft, these factors are taken into account, basically by taking as many measurements (not just two) as possible of a single star, and locating the best fit of that data to a multi-parameter model that incorporates the (known) orbit of the Earth with the (unknown) distance and motion of the star, as well as additional gravitational effects from other celestial bodies, such as the Sun and other planets.
  • 14:53 The formula I was taught for apparent magnitude of stars looks a bit different from the one here.

    This is because astronomers use a logarithmic scale to measure both apparent magnitude m and absolute magnitude M. If one takes the logarithm of the inverse square law in the video, and performs the normalizations used by astronomers to define magnitude, one arrives at the standard relation M = m - 5 \log_{10} d_{pc} + 5 between absolute and apparent magnitude.

    But this is an oversimplification, most notably due to neglect of extinction effects caused by interstellar dust. This is not a major issue for the relatively short distances observable via parallax, but causes problems at larger scales of the ladder (see for instance the FAQ entry here for 18:08). To compensate for this, one can work in multiple frequencies of the spectrum (visible, x-ray, radio, etc.), as some frequencies are less susceptible to extinction than others. From the discrepancies between these frequencies one can infer the amount of extinction, leading to “dust maps” that can then be used to facilitate such corrections for subsequent measurements in the same area of the universe. (More generally, the trend in modern astronomy is towards “multi-messenger astronomy” in which one combines together very different types of measurements of the same object to obtain a more accurate understanding of that object and its surroundings.)
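
    As a small illustration of how the relation in the first paragraph of this entry is used in practice (with made-up numbers, just to show the arithmetic):

        def distance_pc(m, M):
            """Distance in parsecs, from the relation M = m - 5*log10(d_pc) + 5."""
            return 10 ** ((m - M + 5) / 5)

        # e.g. a star of apparent magnitude 10 and absolute magnitude 0
        print(distance_pc(10.0, 0.0))   # 1000 parsecs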
  • 18:08 Can we really measure the entire Milky Way with this method?

    Strictly speaking, there is a “zone of avoidance” on the far side of the Milky Way that is very difficult to measure in the visible portion of the spectrum, due to the large amount of intervening stars, dust, and even a supermassive black hole in the galactic center. However, in recent years it has become possible to explore this zone to some extent using the radio, infrared, and x-ray portions of the spectrum, which are less affected by these factors.
  • 18:19 How did astronomers know that the Milky Way was only a small portion of the entire universe?

    This issue was the topic of the “Great Debate” in the early twentieth century. It was only with the work of Hubble using Leavitt’s law to measure distances to the Magellanic Clouds and “spiral nebulae” (that we now know to be other galaxies), building on earlier work of Leavitt and Hertzsprung, that it was conclusively established that these clouds and nebulae in fact were at much greater distances than the diameter of the Milky Way.
  • 18:45 How can one compensate for light blending effects when measuring the apparent magnitude of Cepheids?

    This is a non-trivial task, especially if one demands a high level of accuracy. Using the highest resolution telescopes available (such as HST or JWST) is of course helpful, as is switching to other frequencies, such as near-infrared, where Cepheids are even brighter relative to nearby non-Cepheid stars. One can also apply sophisticated statistical methods to fit to models of the point spread of light from unwanted sources, and use nearby measurements of the same galaxy without the Cepheid as a reference to help calibrate those models. Improving the accuracy of the Cepheid portion of the distance ladder is an ongoing research activity in modern astronomy.
  • 18:54 What is the mechanism that causes Cepheids to oscillate?

    For most stars, there is an equilibrium size: if the star’s radius collapses, then the reduced potential energy is converted to heat, creating pressure that pushes the star outward again; and conversely, if the star expands, then it cools, causing a reduction in pressure that no longer counteracts gravitational forces. But for Cepheids, there is an additional mechanism called the kappa mechanism: the increased temperature caused by contraction increases ionization of helium, which drains energy from the star and accelerates the contraction; conversely, the cooling caused by expansion causes the ionized helium to recombine, with the energy released accelerating the expansion. If the parameters of the Cepheid are in a certain “instability strip”, then the interaction of the kappa mechanism with the other mechanisms of stellar dynamics creates a periodic oscillation in the Cepheid’s radius, with a period that increases with the mass and brightness of the Cepheid.

    For a recent re-analysis of Leavitt’s original Cepheid data, see this paper.
  • 19:10 Did Leavitt mainly study the Cepheids in our own galaxy?

    This was an inaccuracy in the presentation. Leavitt’s original breakthrough paper studied Cepheids in the Small Magellanic Cloud. At the time, the distance to this cloud was not known; indeed, it was a matter of debate whether this cloud was in the Milky Way, or some distance away from it. However, Leavitt (correctly) assumed that all the Cepheids in this cloud were roughly the same distance away from our solar system, so that the apparent brightness was proportional to the absolute brightness. This gave an uncalibrated form of Leavitt’s law between absolute brightness and period, subject to the (then unknown) distance to the Small Magellanic Cloud. After Leavitt’s work, there were several efforts (by Hertzsprung, Russell, and Shapley) to calibrate the law by using the few Cepheids for which other distance methods were available, such as parallax. (Main sequence fitting to the Hertzsprung-Russell diagram was not directly usable, as Cepheids did not lie on the main sequence; but in some cases one could indirectly use this method if the Cepheid was in the same stellar cluster as a main sequence star.) Once the law was calibrated, it could be used to measure distances to other Cepheids, and in particular to compute distances to extragalactic objects such as the Magellanic clouds.
  • 19:15 Was Leavitt’s law really a linear law between period and luminosity?

    Strictly speaking, the period-luminosity relation commonly known as Leavitt’s law was a linear relation between the absolute magnitude of the Cepheid and the logarithm of the period; undoing the logarithms, this becomes a power law between the luminosity and the period.
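
    In symbols (with a and b denoting schematic constants rather than fitted values): if the law takes the linear form M = a \log_{10} P + b, then since absolute magnitude is itself defined logarithmically, M = -2.5 \log_{10}(L/L_0) for a reference luminosity L_0, one can solve for the luminosity to obtain L = L_0 \, 10^{-b/2.5} P^{-a/2.5}, a power law between luminosity and period. (For Cepheids a is negative, so longer-period Cepheids are intrinsically brighter.)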
  • 20:26 Was Hubble the one to discover the redshift of galaxies?

    This was an error on my part; Hubble was using earlier work of Vesto Slipher on these redshifts, and combining it with his own measurements of distances using Leavitt’s law to arrive at the law that now bears his name; he was also assisted in his observations by Milton Humason. It should also be noted that Georges Lemaître had also independently arrived at essentially the same law a few years prior, but his work was published in a somewhat obscure journal and did not receive broad recognition until some time later.
  • 20:37 Hubble’s original graph doesn’t look like a very good fit to a linear law.

    Hubble’s original data was somewhat noisy and inaccurate by modern standards, and the redshifts were affected by the peculiar velocities of individual galaxies in addition to the expanding nature of the universe. However, as the data was extended to more galaxies, it became increasingly possible to compensate for these effects and obtain a much tighter fit, particularly at larger scales where the effects of peculiar velocity are less significant. See for instance this article from 2015 where Hubble’s original graph is compared with a more modern graph. This more recent graph also reveals a slight nonlinear correction to Hubble’s law at very large scales that has led to the remarkable discovery that the expansion of the universe is in fact accelerating over time, a phenomenon that is attributed to a positive cosmological constant (or perhaps a more complex form of dark energy in the universe). On the other hand, even with this nonlinear correction, there continues to be a roughly 10% discrepancy of this law with predictions based primarily on the cosmic microwave background radiation; see the FAQ entry for 23:49.
  • 20:46 Does general relativity alone predict an uniformly expanding universe?

    This was an oversimplification. Einstein’s equations of general relativity contain a parameter \Lambda, known as the cosmological constant, which currently is only computable indirectly from fitting to experimental data. But even with this constant fixed, there are multiple solutions to these equations (basically because there are multiple possible initial conditions for the universe). For the purposes of cosmology, a particularly successful family of solutions are the solutions given by the Lambda-CDM model. This family of solutions contains additional parameters, such as the density of dark matter in the universe. Depending on the precise values of these parameters, the universe could be expanding or contracting, with the rate of expansion or contraction either increasing, decreasing, or staying roughly constant. But if one fits this model to all available data (including not just red shift measurements, but also measurements on the cosmic microwave background radiation and the spatial distribution of galaxies), one deduces a version of Hubble’s law which is nearly linear, but with an additional correction at very large scales; see the next item of this FAQ.
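
    For concreteness (this is a standard textbook formula rather than anything derived in the video): in the Lambda-CDM family of solutions the expansion rate H as a function of the scale factor a is often written as H(a)^2 = H_0^2 ( \Omega_m a^{-3} + \Omega_r a^{-4} + \Omega_k a^{-2} + \Omega_\Lambda ), where H_0 is the present-day Hubble constant and the density parameters \Omega_m, \Omega_r, \Omega_k, \Omega_\Lambda (for matter, radiation, spatial curvature, and the cosmological constant respectively) are among the adjustable parameters mentioned above; different choices of these parameters give expanding, contracting, accelerating, or decelerating universes.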
  • 21:07 Is Hubble’s original law sufficiently accurate to allow for good measurements of distances at the scale of the observable universe?

    Not really; as mentioned in the end of the video, there were additional efforts to cross-check and calibrate Hubble’s law at intermediate scales between the range of Cepheid methods (about 100 million light years) and observable universe scales (about 100 billion light years) by using further “standard candles” than Cepheids, most notably Type Ia supernovae (which are bright enough and predictable enough to be usable out to about 10 billion light years), the Tully-Fisher relation between the luminosity of a galaxy and its rotational speed, and gamma ray bursts. It turns out that due to the accelerating nature of the universe’s expansion, Hubble’s law is not completely linear at these large scales; this important correction cannot be discerned purely from Cepheid data, but also requires the other standard candles, as well as fitting that data (as well as other observational data, such as the cosmic microwave background radiation) to the cosmological models provided by general relativity (with the best fitting models to date being some version of the Lambda-CDM model).

    On the other hand, a naive linear extrapolation of Hubble’s original law to all larger scales does provide a very rough picture of the observable universe which, while too inaccurate for cutting edge research in astronomy, does give some general idea of its large-scale structure.
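
    As a rough numerical illustration of such a linear extrapolation (using a modern value H_0 \approx 70 km/s/Mpc, which is an assumed figure rather than one from the video): a galaxy of redshift z would be assigned a distance of roughly d \approx \frac{cz}{H_0} \approx 4300\, z \text{ Mpc} \approx 14\, z billion light years, which is serviceable for small z but, for the reasons just discussed, becomes unreliable at the largest scales.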
  • 21:15 Where did this guess of the observable universe being about 20% of the full universe come from?

    There are some ways to get a lower bound on the size of the entire universe that go beyond the edge of the observable universe. One is through analysis of the cosmic microwave background radiation (CMB), that has been carefully mapped out by several satellite observatories, most notably WMAP and Planck. Roughly speaking, a universe that was less than twice the size of the observable universe would create certain periodicities in the CMB data; such periodicities are not observed, so this provides a lower bound (see for instance this paper for an example of such a calculation). The 20% number was a guess based on my vague recollection of these works, but there is no consensus currently on what the ratio truly is; there are some proposals that the entire universe is in fact several orders of magnitude larger than the observable one.

    The situation is somewhat analogous to Aristarchus’s measurement of the distance to the Sun, which was very sensitive to a small angle (the half-moon discrepancy). Here, the predicted size of the universe under the standard cosmological model is similarly dependent in a highly sensitive fashion on a measure \Omega_k of the flatness of the universe which, for reasons still not fully understood (but likely caused by some sort of inflation mechanism), happens to be extremely close to zero. As such, predictions for the size of the universe remain highly volatile at the current level of measurement accuracy.
  • 23:44 Was it a black hole collision that allowed for an independent measurement of Hubble’s law?

    This was a slight error in the presentation. While the first gravitational wave observation by LIGO in 2015 was of a black hole collision, it did not come with an electromagnetic counterpart that allowed for a redshift calculation that would yield a Hubble’s law measurement. However, a later collision of neutron stars, observed in 2017, did come with an associated kilonova in which a redshift was calculated, and led to a Hubble measurement which was independent of most of the rungs of the distance ladder.
  • 23:49 Where can I learn more about this 10% discrepancy in Hubble’s law?

    This is known as the Hubble tension (or, in more sensational media, the “crisis in cosmology”): roughly speaking, the various measurements of Hubble’s constant (either from climbing the cosmic distance ladder, or by fitting various observational data to standard cosmological models) tend to arrive at one of two values that are about 10% apart from each other. The values based on gravitational wave observations are currently consistent with both values, due to significant error bars in this extremely sensitive method; but other more mature methods are now of sufficient accuracy that they are basically only consistent with one of the two values. Currently there is no consensus on the origin of this tension: possibilities include systematic biases in the observational data, subtle statistical issues with the methodology used to interpret the data, a correction to the standard cosmological model, the influence of some previously undiscovered law of physics, or some partial breakdown of the Copernican principle.

    For an accessible recent summary of the situation, see this video by Becky Smethurst (“Dr. Becky”).
  • 24:49 So, what is a Type Ia supernova and why is it so useful in the distance ladder?

    A Type Ia supernova occurs when a white dwarf in a binary system draws more and more mass from its companion star, until it reaches the Chandrasekhar limit, at which point its gravitational forces are strong enough to cause a collapse that increases the pressure to the point where a supernova is triggered via a process known as carbon detonation. Because of the universal nature of the Chandrasekhar limit, all such supernovae have (as a first approximation) the same absolute brightness and can thus be used as standard candles in a similar fashion to Cepheids (but without the need to first measure any auxiliary observable, such as a period). But these supernovae are also far brighter than Cepheids, and so this method can be used at significantly larger distances than the Cepheid method (roughly speaking it can handle distances of ~10 billion light years, whereas Cepheids are reliable out to ~100 million light years). Among other things, the supernovae measurements were the key to detecting an important nonlinear correction to Hubble’s law at these scales, leading to the remarkable conclusion that the expansion of the universe is in fact accelerating over time, which in the Lambda-CDM model corresponds to a positive cosmological constant, though there are more complex “dark energy” models that are also proposed to explain this acceleration.

  • 24:54 Besides Type Ia supernovae, I felt that a lot of other topics relevant to the modern distance ladder (e.g., the cosmic microwave background radiation, the Lambda CDM model, dark matter, dark energy, inflation, multi-messenger astronomy, etc.) were omitted.

    This is partly due to time constraints, and the need for editing to tighten the narrative, but was also a conscious decision on my part. Advanced classes on the distance ladder will naturally focus on the most modern, sophisticated, and precise ways to measure distances, backed up by the latest mathematics, physics, technology, observational data, and cosmological models. However, the focus in this video series was rather different; we sought to portray the cosmic distance ladder as evolving in a fully synergistic way, across many historical eras, with the evolution of mathematics, science, and technology, as opposed to being a mere byproduct of the current state of these other disciplines. As one specific consequence of this change of focus, we emphasized the first time any rung of the distance ladder was achieved, at the expense of more accurate and sophisticated later measurements at that rung. For instance, refinements in the measurement of the radius of the Earth since Eratosthenes, improvements in the measurement of the astronomical unit between Aristarchus and Cook, or the refinements of Hubble’s law and the cosmological model of the universe in the twentieth and twenty-first centuries, were largely omitted (though some of the answers in this FAQ are intended to address these omissions).

    Many of the topics not covered here (or only given a simplified treatment) are discussed in depth in other expositions, including other Youtube videos. I would welcome suggestions from readers for links to such resources in the comments to this post. Here is a partial list:

Matt von Hippel This Week, at FirstPrinciples.org

I’ve got a piece out this week in a new venue: FirstPrinciples.org, where I’ve written a profile of a startup called Vaire Computing.

Vaire works on reversible computing, an idea that tries to leverage thermodynamics to make a computer that wastes as little heat as possible. While I learned a lot of fun things that didn’t make it into the piece…I’m not going to tell you them this week! That’s because I’m working on another piece about reversible computing, focused on a different aspect of the field. When that piece is out I’ll have a big “bonus material post” talking about what I learned writing both pieces.

This week, instead, the bonus material is about FirstPrinciples.org itself, where you’ll be seeing me write more often in future. The First Principles Foundation was founded by Ildar Shar, a Canadian tech entrepreneur who thinks that physics is pretty cool. (Good taste that!) His foundation aims to support scientific progress, especially in addressing the big, fundamental questions. They give grants, analyze research trends, build scientific productivity tools…and most relevantly for me, publish science news on their website, in a section called the Hub.

The first time I glanced through the Hub, it was clear that FirstPrinciples and I have a lot in common. Like me, they’re interested both in scientific accomplishments and in the human infrastructure that makes them possible. They’ve interviewed figures in the open access movement, like the creators of arXiv and SciPost. On the science side, they mix coverage of the mainstream and reputable with outsiders challenging the status quo, and hot news topics with explainers of key concepts. They’re still new, and still figuring out what they want to be. But from my glimpse on the way, it looks like they’re going somewhere good.

Matt Strassler Quantum Interference 4: Independence and Correlation

The quantum double-slit experiment, in which objects are sent toward and through a pair of slits in a wall, and are recorded on a screen behind the slits, clearly shows an interference pattern. It’s natural to ask, “where does the interference occur?”

The problem is that there is a hidden assumption in this way of framing the question — a very natural assumption, based on our experience with waves in water or in sound. In those cases, we can explicitly see (Fig. 1) how interference builds up between the slits and the screen.

Figure 1: How water waves or sound waves interfere after passing through two slits.

But when we dig deep into quantum physics, this way of thinking runs into trouble. Asking “where” is not as straightforward as it seems. In the next post we’ll see why. Today we’ll lay the groundwork.

Independence and Interference

From my long list of examples with and without interference (we saw last time what distinguishes the two classes), let’s pick a superposition whose pre-quantum version is shown in Fig. 2.

Figure 2: A pre-quantum view of a superposition in which particle 1 is moving left OR right, and particle 2 is stationary at x=3.

Here we have

  • particle 1 going from left to right, with particle 2 stationary at x=+3, OR
  • particle 1 going from right to left, with particle 2 again stationary at x=+3.

In Fig. 3 is what the wave function Ψ(x1,x2) [where x1 is the position of particle 1 and x2 is the position of particle 2] looks like when its absolute-value squared is graphed on the space of possibilities. Both peaks have x2=+3, representing the fact that particle 2 is stationary. They move in opposite directions and pass through each other horizontally as particle 1 moves to the right OR to the left.

Figure 3: The graph of the absolute-value-squared of the wave function for the quantum version of the system in Fig. 2.

This looks remarkably similar to what we would have if particle 2 weren’t there at all! The interference fringes run parallel to the x2 axis, meaning the locations of the interference peaks and valleys depend on x1 but not on x2. In fact, if we measure particle 1, ignoring particle 2, we’ll see the same interference pattern that we see when a single particle is in the superposition of Fig. 2 with particle 2 removed (Fig. 4):

Figure 4a: The square of the absolute value of the wave function for a particle in a superposition of the form shown in Fig. 2 but with the second particle removed.
Figure 4b: A closeup of the interference pattern that occurs at the moment when the two peaks in Fig. 4a perfectly overlap. The real and imaginary parts of the wave function are shown in red and blue, while its square is drawn in black.

We can confirm this in a simple way. If we measure the position of particle 1, ignoring particle 2, the probability of finding that particle at a specific position x1 is given by projecting the wave function, shown above as a function of x1 and x2, onto the x1 axis. [More mathematically, this is done by integrating over x2 to leave a function of x1 only.] Sometimes (not always!) this is essentially equivalent to viewing the graph of the wave function from one side, as in Figs. 5-6.

Figure 5: Projecting the wave function of Fig. 3, at the moment of maximum interference, onto the x1 axis. Compare with the black curve in Fig. 4b.

Because the interference ridges in Fig. 3 are parallel to the x2 axis and thus independent of particle 2’s exact position, we do indeed find, when we project onto the x1 axis as in Fig. 5, that the familiar interference pattern of Fig. 4b reappears.

Meanwhile, if at that same moment we measure particle 2’s position, we will find results centered around x2=+3, with no interference, as seen in Fig. 6 where we project the wave function of Fig. 3 onto the x2 axis.

Figure 6: Projecting the wave function of Fig. 3, at the moment of maximum interference, onto the x2 axis. The position of particle 2 is thus close to x2=3, with no interference pattern.
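
Here is a small numerical sketch of this kind of projection (an illustration in Python, not taken from the original post). It builds a factored two-particle wave function of the type graphed in Fig. 3, with particle 1 in a superposition of right-moving and left-moving Gaussian wave packets and particle 2 in a single stationary packet near x2=+3, frozen at the moment when the two peaks overlap, and then integrates the absolute-value-squared of the wave function over one coordinate to get the probability distribution for the other. The packet widths and momenta are arbitrary choices.

    import numpy as np

    # position grids for the two particles
    x1 = np.linspace(-10, 10, 800)
    x2 = np.linspace(-10, 10, 800)
    X1, X2 = np.meshgrid(x1, x2, indexing="ij")

    def packet(x, x0, k):
        """Gaussian wave packet centered at x0 with momentum k (arbitrary units)."""
        return np.exp(-(x - x0) ** 2) * np.exp(1j * k * x)

    # particle 1: right-moving plus left-moving packets, caught as they overlap at x1=0
    psi1 = packet(X1, 0.0, +5.0) + packet(X1, 0.0, -5.0)
    # particle 2: a single stationary packet near x2=+3
    psi2 = packet(X2, 3.0, 0.0)

    psi = psi1 * psi2                      # factored (uncorrelated) wave function
    prob = np.abs(psi) ** 2

    # project onto each axis by integrating out the other coordinate
    p_x1 = np.trapz(prob, x2, axis=1)      # fringes in x1 (compare Fig. 5)
    p_x2 = np.trapz(prob, x1, axis=0)      # single bump near x2=3 (compare Fig. 6)

    print(x1[p_x1.argmax()], x2[p_x2.argmax()])

Plotting p_x1 should reproduce the fringed curve of Figs. 4b and 5, while p_x2 should show the featureless bump of Fig. 6.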

Why is this case so simple, with the one-particle case in Fig. 4 and the two-particle case in Figs. 3 and 5 so closely resembling each other?

The Cause

It has nothing specifically to do with the fact that particle 2 is stationary. Another example I gave had particle 2 stationary in both parts of the superposition, but located in two different places. In Figs. 7a and 7b, the pre-quantum version of that system is shown both in physical space and in the space of possibilities [where I have, for the first time, put stars for the two possibilities onto the same graph.]

Figure 7a: A similar system to that of Fig. 2, drawn in its pre-quantum version in physical space.
Figure 7b: Same as Fig. 7a, but drawn in the space of possibilities.

You can see that the two stars’ paths will not intersect, since one remains at x2=+3 and the other remains at x2=-3. Thus there should be no interference — and indeed, none is seen in Fig. 8, where the time evolution of the full quantum wave function is shown. The two peaks miss each other, and so no interference occurs.

Figure 8: The absolute-value-squared of the wave function corresponding to Figs. 7a-7b.

If we project the wave function of Fig. 8 onto the x1 axis at the moment when the two peaks are at x1=0, we see (Fig. 9) a single peak (because the two peaks, with different values of x2, are projected onto each other). No interference fringes are seen.

Figure 9: At the moment when the first particle is near x1=0, the probability of finding particle 1 as a function of x1 shows a featureless peak, with no interference effects.

Instead the resemblance between Figs. 3-5 has to do with the fact that particle 2 is doing exactly the same thing in each part of the superposition. For instance, as in Fig. 10, suppose particle 2 is moving to the left in both possibilities.

Figure 10: A system similar to that of Fig. 2, but with particle 2 (orange) moving to the left in both parts of the superposition.

(In the top possibility, particles 1 and 2 will encounter one another; but we have been assuming for simplicity that they don’t interact, so they can safely pass right through each other.)

The resulting wave function is shown in Fig. 11:

Figure 11: The absolute-value-squared of the wave function corresponding to Fig.10.

The two peaks cross paths when x1=0 and x2=2. The wave function again shows interference at that location, with fringes that are independent of x2. If we project the wave function onto the x1 axis, we’ll get exactly the same thing we saw in Fig. 5, even though the behavior of the wave function in x2 is different.

This makes the pattern clear: if, in each part of the superposition, particle 2 behaves identically, then particle 1 will be subject to the same pattern of interference as if particle 2 were absent. Said another way, if the behavior of particle 1 is independent of particle 2 (and vice versa), then any interference effects involving one particle will be as though the other particle wasn’t even there.

Said yet another way, the two particles in Figs. 2 and 10 are uncorrelated, meaning that we can understand what either particle is doing without having to know what the other is doing.

Importantly, the examples studied in the previous post did not have this feature. That’s crucial in understanding why the interference seen at the end of that post wasn’t so simple.

Independence and Factoring

What we are seeing in Figs. 2 and 10 has an analogy in algebra. If we have an algebraic expression such as

  • (a c + b c),

in which c is common to both terms, then we can factor it into

  • (a+b)c.

The same is true of the kinds of physical processes we’ve been looking at. In Fig. 10 the two particles’ behavior is uncorrelated, so we can “factor” the pre-quantum system as follows.

Figure 12: The “factored” form of the superposition in Fig. 10.

What we see here is that factoring involves an AND, while superposition is an OR: the figure above says that (particle 1 is moving from left to right OR from right to left) AND (particle 2 is moving from right to left, no matter what particle 1 is doing.)

And in the quantum context, if (and only if) two particles’ behaviors are completely uncorrelated, we can literally factor the wave function into a product of two functions, one for each particle:

  • Ψ(x1,x2)=Ψ1(x1)Ψ2(x2)

In this specific case of Fig. 12, where the first particle is in a superposition whose parts I’ve labeled A and B, we can write Ψ1(x1) as a sum of two terms:

  • Ψ1(x1)=ΨA(x1) + ΨB(x1)

Specifically, ΨA(x1) describes particle 1 moving left to right — giving one peak in Fig. 11 — and ΨB(x1) describes particle 1 moving right to left, giving the other peak.

But this kind of factoring is rare, and not possible in general. None of the examples in the previous post (or of this post, excepting that of its Fig. 5) can be factored. That’s because in these examples, the particles are correlated: the behavior of one depends on the behavior of the other.
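
One way to make the question of factorability concrete (a numerical sketch, not something from the original post): discretize Ψ(x1,x2) on a grid, treat it as a matrix, and count its significant singular values. A wave function that factors has exactly one; a correlated one has more. The packets below are arbitrary Gaussians chosen to mimic the figures.

    import numpy as np

    x = np.linspace(-10, 10, 400)
    X1, X2 = np.meshgrid(x, x, indexing="ij")

    def packet(xg, x0, k):
        """Gaussian wave packet centered at x0 with momentum k (arbitrary units)."""
        return np.exp(-(xg - x0) ** 2) * np.exp(1j * k * xg)

    # uncorrelated, as in Fig. 10: (particle 1 moving right OR left) AND (particle 2 moving left)
    psi_uncorrelated = (packet(X1, 0, 5) + packet(X1, 0, -5)) * packet(X2, 2, -5)

    # correlated: both particles moving right OR both moving left (cannot be factored)
    psi_correlated = packet(X1, 0, 5) * packet(X2, 0, 5) + packet(X1, 0, -5) * packet(X2, 0, -5)

    def schmidt_rank(psi, tol=1e-8):
        """Number of significant singular values of the discretized wave function."""
        s = np.linalg.svd(psi, compute_uv=False)
        return int(np.sum(s / s[0] > tol))

    print(schmidt_rank(psi_uncorrelated))   # 1: the wave function factors
    print(schmidt_rank(psi_correlated))     # 2: it does not

This singular-value count is just the Schmidt rank of the two-particle state, which equals 1 exactly when the particles are unentangled.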

Superposition AND Superposition

If the particles are truly uncorrelated, we should be able to put both particles into superpositions of two possibilities. As a pre-quantum system, that would give us (particle 1 in state A OR state B) AND (particle 2 in state C OR state D) in Fig. 13.

Figure 13: The two particles are uncorrelated, and so their behavior can be factored. The first particle is in a superposition of states A and B, the second in a superposition of states C and D.

The corresponding factored wave function, in which (particle 1 moves left to right OR right to left) AND (particle 2 moves left to right OR right to left), can be written as a product of two superpositions:

  • Ψ(x1,x2)=Ψ1(x1)Ψ2(x2) = [ΨA(x1)+ΨB(x1)] [ΨC(x2)+ΨD(x2)]

In algebra, we can expand a similar product

  • (a+b)(c+d)=ac+ad+bc+bd

giving us four terms. In the same way we can expand the above wave function into four terms

  • Ψ(x1,x2)=ΨA(x1)ΨC(x2)+ΨB(x1)ΨC(x2)+ΨA(x1)ΨD(x2)+ΨB(x1)ΨD(x2)

whose pre-quantum version gives us the four possibilities shown in Fig. 14.

Figure 14: The product in Fig. 13 is expanded into its four distinct possibilities.

The wave function therefore has four peaks, one for each term. The wave function behaves as shown in Fig. 15.

Figure 15: The wave function for the system in Fig. 14 shows interference of two pairs of possibilities, first for particle 1 and later for particle 2.

The four peaks interfere in pairs. The top two and the bottom two interfere when particle 1 reaches x1=0, creating fringes that run parallel to the x2 axis and thus are independent of x2. Notice that even though two separate sets of interference fringes appear when particle 1 reaches x1=0, we cannot tell them apart if we only measure particle 1. When we project the wave function onto the x1 axis, the two sets of interference fringes line up, and we see the same single-particle interference pattern that we’ve seen so many times (Figs. 3-5). That’s all because particles 1 and 2 are uncorrelated.

Figure 16: The first instance of interference, seen in two peaks in Fig. 15, is reduced, when projected onto the x1 axis, to the same interference pattern as seen in Figs. 3-5; the measurement of particle 1’s position will show the same interference pattern in each case, because particles 1 and 2 are uncorrelated.

If at the same moment we measure particle 2 ignoring particle 1, we find (Fig. 17) that particle 2 has equal probability of being near x=2.5 or x=-0.5, with no interference effects.

Figure 17: The first instance of interference, seen in two peaks in Fig. 15, shows two peaks with no interference when projected onto the x2 axis. Thus measurements of particle 2’s position show no interference at this moment.

Meanwhile, the left two and the right two peaks in Fig. 15 subsequently interfere when particle 2 reaches x2=1, creating fringes that run parallel to the x1 axis, and thus are independent of x1; these will show up near x=1 in measurements of particle 2’s position. This is shown (Fig. 18) by projecting the wave function at that moment onto the x2 axis.

Figure 18: During the second instance of interference in Fig. 15, the projection of the wave function onto the x2 axis.

Locating the Interference?

So far, in all these examples, it seems that we can say where the interference occurs in physical space. For instance, in this last example, it appears that particle 1 shows interference around x=0, and slightly later particle 2 shows interference around x=1.

But if we look back at the end of the last post, we can see that something is off. In the examples considered there, the particles are correlated and the wave function cannot be factored. And in the last example in Fig. 12 of that post, we saw interference patterns whose ridges are parallel neither to the x1 axis nor to the x2 axis… an effect that a factored wave function cannot produce. [Fun exercise: prove this last statement.]

As a result, projecting the wave function of that example onto the x1 axis hides the interference pattern, as shown in Fig. 19. The same is true when projecting onto the x2 axis.

Figure 19: Although Fig. 12 of the previous post shows an interference pattern, it is hidden when the wave function is projected onto the x1 axis, leaving only a boring bump. The observable consequences are shown in Fig. 13 of that same post.

Consequently, neither measurements of particle 1’s position nor measurements of particle 2’s position can reveal the interference effect. (This is shown, for particle 1, in the previous post’s Fig. 13.) This leaves it unclear where the interference is, or even how to measure it.

But in fact it can be measured, and next time we’ll see how. We’ll also see that in a general superposition, where the two particles are correlated, interference effects often cannot be said to have a location in physical space. And that will lead us to a first glimpse of one of the most shocking lessons of quantum physics.

One More Uncorrelated Example, Just for Fun

To close, I’ll leave you with one more uncorrelated example, merely because it looks cool. In pre-quantum language, the setup is shown in Fig. 20.

Figure 20: Another uncorrelated superposition with four possibilities.

Now all four peaks interfere simultaneously, near (x1,x2)=(1,-1).

Figure 21: The four peaks simultaneously interfere, generating a grid pattern.

The grid pattern in the interference assures that the usual interference effects can be seen for both particles at the same time, with the interference for particle 1 near x1=1 and that for particle 2 near x2=-1. Here are the projections onto the two axes at the moment of maximal interference.

Figure 22a: At the moment of maximum interference, the projection of the wave function onto the x1 axis shows interference near x1=1.
Figure 22b: At the moment of maximum interference, the projection of the wave function onto the x2 axis shows interference near x2=-1.

March 27, 2025

n-Category Café The McGee Group

This is a bit of a shaggy dog story, but I think it’s fun. There’s also a moral about the nature of mathematical research.

Once I was interested in the McGee graph, nicely animated here by Mamouka Jibladze:

This is the unique (3,7)-cage, meaning a graph such that each vertex has 3 neighbors and the shortest cycle has length 7. Since it has a very symmetrical appearance, I hoped it would be connected to some interesting algebraic structures. But which?

I read on Wikipedia that the symmetry group of the McGee graph has order 32. Let’s call it the McGee group. Unfortunately there are many different 32-element groups — 51 of them, in fact! — and the article didn’t say which one this was. (It does now.)

I posted a general question:

• MathOverflow, What algebraic structures are related to the McGee graph?

and Gordon Royle said the McGee group is “not a super-interesting group, it is SmallGroup(32,43) in either GAP or Magma”. Knowing this let me look up the McGee group on this website, which is wonderfully useful if you’re studying finite groups:

• GroupProps, Holomorph of Z/8.

There I learned that the McGee group is the so-called holomorph of the cyclic group \mathbb{Z}/8: that is, the semidirect product of \mathbb{Z}/8 and its automorphism group:

\text{Aut}(\mathbb{Z}/8) \ltimes \mathbb{Z}/8

I resisted getting sucked into the general study of holomorphs, or what happens when you iterate the holomorph construction. Instead, I wanted a more concrete description of the McGee group.

\mathbb{Z}/8 is not just an abelian group: it’s a ring! Since multiplication in a ring distributes over addition, we can get automorphisms of the group \mathbb{Z}/8 by multiplying by those elements that have multiplicative inverses. These invertible elements form a group

(\mathbb{Z}/8)^\times = \{1,3,5,7\}

called the multiplicative group of \mathbb{Z}/8. In fact these give all the automorphisms of the group \mathbb{Z}/8.

In short, the McGee group is

(\mathbb{Z}/8)^\times \ltimes \mathbb{Z}/8

This is very nice, because this is the group of all transformations of \mathbb{Z}/8 of the form

x \mapsto g x + a \qquad g \in (\mathbb{Z}/8)^\times , \; a \in \mathbb{Z}/8

If we think of \mathbb{Z}/8 as a kind of line — called the ‘affine line over \mathbb{Z}/8’ — these are precisely all the affine transformations of this line. Thus, the McGee group deserves to be called

\text{Aff}(\mathbb{Z}/8) = (\mathbb{Z}/8)^\times \ltimes \mathbb{Z}/8

This suggests that we can build the McGee graph in some systematic way starting from the affine line over \mathbb{Z}/8. This turns out to be a bit complicated, because the vertices come in two kinds. That is, the McGee group doesn’t act transitively on the set of vertices. Instead, it has two orbits, shown as red and blue dots here:

The 8 red vertices correspond straightforwardly to the 8 points of the affine line, but the 16 blue vertices are more tricky. There are also the edges to consider: these come in three kinds! Greg Egan figured out how this works, and I wrote it up:

The McGee graph, Visual Insight, September 15, 2015.

Then a decade passed.

About two weeks ago, I gave a Zoom talk at the Illustrating Math Seminar about some topics on my blog Visual Insight. I mentioned that the McGee group is SmallGroup(32,43) and the holomorph of \mathbb{Z}/8. And then someone — alas, I forget who — instantly typed in the chat that this is one of the two smallest groups with an amazing property! Namely, this group has an outer automorphism that maps each element to an element conjugate to it.

I didn’t doubt this for a second. To paraphrase what Hardy said when he received Ramanujan’s first letter, nobody would have the balls to make up this shit. So, I posed a challenge to find such an exotic outer automorphism:

By reading around, I soon learned that people have studied this subject quite generally:

An automorphism f \colon G \to G is class-preserving if for each g \in G there exists some h \in G such that

f(g) = h g h^{-1}

If you can use the same h for every g we call f an inner automorphism. But some groups have class-preserving automorphisms that are not inner! These are the class-preserving outer automorphisms.

I don’t know if class-preserving outer automorphisms are good for anything, or important in any way. They mainly just seem intriguingly spooky. An outer automorphism that looks inner if you examine its effect on any one group element is nothing I’d ever considered. So I wanted to see an example.

Rising to my challenge, Greg Egan found a nice explicit formula for some class-preserving outer automorphisms of the McGee group.

As we’ve seen, any element of the McGee group is a transformation

x \mapsto g x + a \qquad g \in (\mathbb{Z}/8)^\times , \; a \in \mathbb{Z}/8

so let’s write it as a pair (g,a). Greg Egan looked for automorphisms of the McGee group that are of the form

f(g,a) = (g, a + D(g))

for some function

D \colon (\mathbb{Z}/8)^\times \to \mathbb{Z}/8

It is easy to check that f is an automorphism if and only if

D(g g') = D(g) + g D(g')

Moreover, f is an inner automorphism if and only if

D(g) = g b - b

for some b \in \mathbb{Z}/8.

Now comes something cool noticed by Joshua Grochow: these formulas are an instance of a general fact about group cohomology!

Suppose we have a group G acting as automorphisms of an abelian group A. Then we can define the cohomology H^n(G,A) to be the group of n-cocycles modulo n-coboundaries. We only need the case n = 1 here. A 1-cocycle is none other than a function D \colon G \to A obeying

D(g g') = D(g) + g D(g')

while a 1-coboundary is one of the form

D(g) = g b - b

for some b \in A. You can check that every 1-coboundary is a 1-cocycle. H^1(G,A) is the group of 1-cocycles modulo 1-coboundaries.

In this situation we can define the semidirect product G \ltimes A, and for any D \colon G \to A we can define a function

f \colon G \ltimes A \to G \ltimes A

by

f(g,a) = (g, a + D(g))

Now suppose G = \text{Aut}(A) and suppose G is abelian. Then by straightforward calculations we can check:

  • f is an automorphism iff D is a 1-cocycle

and

  • f is an inner automorphism iff D is a 1-coboundary!

Thus, G \ltimes A will have outer automorphisms if H^1(G,A) \ne 0.

When A = \mathbb{Z}/8 then G = \text{Aut}(A) is abelian and G \ltimes A is the McGee group. This puts Egan’s idea into a nice context. But we still need to actually find maps D that give outer automorphisms of the McGee group, and then find class-preserving ones. I don’t know how to do that using general ideas from cohomology. Maybe someone smart could do the first part, but the ‘class-preserving’ condition doesn’t seem to emerge naturally from cohomology.

Anyway, Egan didn’t waste his time with such effete generalities: he actually found all choices of D \colon (\mathbb{Z}/8)^\times \to \mathbb{Z}/8 for which

f(g,a) = (g, a + D(g))

is a class-preserving outer automorphism of the McGee group. Namely:

\begin{array}{ccl} (D(1), D(3), D(5), D(7)) &=& (0, 0, 4, 4) \\ (D(1), D(3), D(5), D(7)) &=& (0, 2, 0, 2) \\ (D(1), D(3), D(5), D(7)) &=& (0, 4, 4, 0) \\ (D(1), D(3), D(5), D(7)) &=& (0, 6, 0, 6) \end{array}

Last Saturday after visiting my aunt in Santa Barbara I went to Berkeley to visit the applied category theorists at the Topos Institute. I took a train, to lessen my carbon footprint a bit. The trip took 9 hours — a long time, but a beautiful ride along the coast and then through forests and fields.

The day before taking the train, I discovered my laptop was no longer charging! So, I bought a pad of paper. And then, while riding the train, I checked by hand that Egan’s first choice of DD really is a cocycle, and really is not a coboundary, so that it defines an outer automorphism of the McGee group. Then — and this was fairly easy — I checked that it defines a class-preserving automorphism. It was quite enjoyable, since I hadn’t done any long calculations recently.
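
For readers who prefer to let a computer do the drudgery, the same facts can be checked by brute force. Here is one way to do it in Python (a quick sketch that verifies Egan’s first solution, not the hand calculation above):

    from itertools import product

    UNITS = [1, 3, 5, 7]                                  # (Z/8)^x
    G = [(g, a) for g in UNITS for a in range(8)]         # the 32 elements of Aff(Z/8)

    def mul(p, q):
        # composing x -> g x + a with x -> h x + b gives x -> g h x + (g b + a)
        (g, a), (h, b) = p, q
        return ((g * h) % 8, (g * b + a) % 8)

    def inv(p):
        g, a = p
        g_inv = pow(g, -1, 8)
        return (g_inv, (-g_inv * a) % 8)

    D = {1: 0, 3: 0, 5: 4, 7: 4}                          # Egan's first choice of D

    def f(p):
        return (p[0], (p[1] + D[p[0]]) % 8)

    # f is a homomorphism (and clearly a bijection), hence an automorphism
    assert all(f(mul(p, q)) == mul(f(p), f(q)) for p, q in product(G, G))

    # f is not inner: no single h conjugates every element the way f does
    assert not any(all(f(p) == mul(mul(h, p), inv(h)) for p in G) for h in G)

    # f is class-preserving: each f(p) is conjugate to p by some h
    assert all(any(f(p) == mul(mul(h, p), inv(h)) for h in G) for p in G)

    print("f is a class-preserving outer automorphism of the McGee group")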

One moral here is that interesting ideas often arise from the interactions of many people. The results here are not profound, but they are certainly interesting, and they came from online conversations with Greg Egan, Gordon Royle, Joshua Grochow, the mysterious person who instantly knew that the McGee group was one of the two smallest groups with a class-preserving outer automorphism, and others.

But what does it all mean, mathematically? Is there something deeper going on here, or is it all just a pile of curiosities?

What did we actually do, in the end? Following the order of logic rather than history, maybe this. We started with a commutative ring A, took its group of affine transformations \text{Aff}(A), and saw this group must have outer automorphisms if

H^1(A^\times, A) \ne 0

We saw this cohomology group really is nonvanishing when A = \mathbb{Z}/n and n = 8. Furthermore, we found a class-preserving outer automorphism of \text{Aff}(\mathbb{Z}/8).

This raises a few questions:

  • What is the cohomology H^1((\mathbb{Z}/n)^\times, \mathbb{Z}/n) in general?

  • What are the outer automorphisms of \text{Aff}(\mathbb{Z}/n)?

  • When does \text{Aff}(\mathbb{Z}/n) have class-preserving outer automorphisms?

I saw a bit about the last question in this paper:

They say that this paper:

  • G. E. Wall, Finite groups with class-preserving outer automorphisms, Journal of the London Mathematical Society 22 (1947), 315–320.

proves \text{Aff}(\mathbb{Z}/n) has a class-preserving outer automorphism when n is a multiple of 8.

Does this happen only for multiples of 8? Is this somehow related to the most famous thing with period 8 — namely, Bott periodicity? I don’t know.

John Baez The McGee Group

This is a bit of a shaggy dog story, but I think it’s fun, and there’s a moral about the nature of mathematical research.

Act 1

Once I was interested in the McGee graph, nicely animated here by Mamouka Jibladze:

This is the unique (3,7)-cage, meaning a graph such that each vertex has 3 neighbors and the shortest cycle has length 7. Since it has a very symmetrical appearance, I hoped it would be connected to some interesting algebraic structures. But which?

I read on Wikipedia that the symmetry group of the McGee graph has order 32. Let’s call it the McGee group. Unfortunately there are many different 32-element groups—51 of them, in fact!—and the article didn’t say which one this was. (It does now.)

I posted a general question:

• MathOverflow, What algebraic structures are related to the McGee graph?

and Gordon Royle said the McGee group is “not a super-interesting group, it is SmallGroup(32,43) in either GAP or Magma”. Knowing this let me look up the McGee group on this website, which is wonderfully useful if you’re studying finite groups:

• GroupProps, Holomorph of Z/8.

There I learned that the McGee group is the so-called holomorph of the cyclic group \mathbb{Z}/8: that is, the semidirect product of \mathbb{Z}/8 and its automorphism group:

\text{Aut}(\mathbb{Z}/8) \ltimes \mathbb{Z}/8

I resisted getting sucked into the general study of holomorphs, or what happens when you iterate the holomorph construction. Instead, I wanted a more concrete description of the McGee group.

\mathbb{Z}/8 is not just an abelian group: it’s a ring! Since multiplication in a ring distributes over addition, we can get automorphisms of the group \mathbb{Z}/8 by multiplying by those elements that have multiplicative inverses. These invertible elements form a group

(\mathbb{Z}/8)^\times = \{1,3,5,7\}

called the multiplicative group of \mathbb{Z}/8. In fact these give all the automorphisms of the group \mathbb{Z}/8.

In short, the McGee group is

(\mathbb{Z}/8)^\times \ltimes \mathbb{Z}/8

This is very nice, because this is the group of all transformations of \mathbb{Z}/8 of the form

x \mapsto g x + a  \qquad g \in (\mathbb{Z}/8)^\times , \; a \in \mathbb{Z}/8

If we think of \mathbb{Z}/8 as a kind of line—called the affine line over the ring \mathbb{Z}/8—these are precisely all the affine transformations of this line. Thus, the McGee group deserves to be called

\text{Aff}(\mathbb{Z}/8) = (\mathbb{Z}/8)^\times \ltimes \mathbb{Z}/8

This suggests that we can build the McGee graph in some systematic way starting from the affine line over \mathbb{Z}/8.

It turns out to be a bit complicated, because the vertices come in two kinds. That is, the McGee group doesn’t act transitively on the set of vertices. Instead, it has two orbits, shown as red and blue dots here:

The 8 red vertices correspond straightforwardly to the 8 points of the affine line, but the 16 blue vertices are more tricky. There are also the edges to consider: these come in three kinds! Greg Egan figured out how this works, and I wrote it up:

• The McGee graph, Visual Insight, September 15, 2015.

Then a decade passed.

Act 2

About two weeks ago, I gave a Zoom talk at the Illustrating Math Seminar about some topics on my blog Visual Insight. I mentioned that the McGee group is SmallGroup(32,43) and the holomorph of \mathbb{Z}/8. And then someone—alas, I forget who—instantly typed in the chat that this is one of the two smallest groups with an amazing property! Namely, this group has an outer automorphism that maps each element to an element conjugate to it.

I didn’t doubt this for a second. To paraphrase what Hardy said when he received Ramanujan’s first letter, nobody would have the balls to make up this shit. So, I posed a challenge to find such an exotic outer automorphism:

• The McGee group, Mathstodon.

By reading around, I soon learned that people have studied this subject quite generally:

• Martin Hertweck, Class-preserving automorphisms of finite groups, Journal of Algebra 241 (2001), 1–26.

• Manoj K. Yadav, Class preserving automorphisms of finite p-groups: a survey, Journal of the London Mathematical Society 75 (2007), 755–772.

An automorphism f \colon G \to G is class-preserving if for each g \in G there exists some h \in G such that

f(g) = h g h^{-1}

If you can use the same h for every g we call f an inner automorphism. But some groups have class-preserving automorphisms that are not inner! These are the class-preserving outer automorphisms.

I don’t know if class-preserving outer automorphisms are good for anything, or important in any way. They mainly just seem intriguingly spooky. An outer automorphism that looks inner if you examine its effect on any one group element is nothing I’d ever considered. So I wanted to see an example.

Rising to my challenge, Greg Egan found a nice explicit formula for some class-preserving outer automorphisms of the McGee group.

As we’ve seen, any element of the McGee group is a transformation

x \mapsto g x + a  \qquad g \in (\mathbb{Z}/8)^\times , \; a \in \mathbb{Z}/8

so let’s write it as a pair (g,a). Greg Egan looked for automorphisms of the McGee group that are of the form

f(g,a) = (g, a + D(g))

for some function

D \colon (\mathbb{Z}/8)^\times \to \mathbb{Z}/8

It is easy to check that f is an automorphism if and only if

D(g g') = D(g) + g D(g')

Moreover, f is an inner automorphism if and only if

D(g) = g b - b

for some b \in \mathbb{Z}/8.

Now comes something cool noticed by Joshua Grochow: these formulas are an instance of a general fact about group cohomology!

Suppose we have a group G acting as automorphisms of an abelian group A. Then we can define the cohomology H^n(G,A) to be the group of n-cocycles modulo n-coboundaries. We only need the case n = 1 here. A 1-cocycle is none other than a function D \colon G \to A obeying

D(g g') = D(g) + g D(g')

while a 1-coboundary is one of the form

D(g) = g b - b

for some b \in A. You can check that every 1-coboundary is a 1-cocycle. H^1(G,A) is the group of 1-cocycles modulo 1-coboundaries.
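For the case at hand, where G = (\mathbb{Z}/8)^\times acts on A = \mathbb{Z}/8 by multiplication, this quotient is small enough to compute by brute force. Here is a minimal Python sketch of mine (not from the post) that simply enumerates all maps D and sorts them into cocycles and coboundaries:

# A brute-force sketch (mine, not from the post): compute H^1 for
# G = (Z/8)^x acting on A = Z/8 by multiplication mod 8.
from itertools import product

A = range(8)                # the abelian group Z/8
G = (1, 3, 5, 7)            # (Z/8)^x

def is_cocycle(D):
    # the 1-cocycle condition D(g g') = D(g) + g D(g')  (mod 8)
    return all(D[(g * h) % 8] == (D[g] + g * D[h]) % 8 for g in G for h in G)

cocycles = []
for values in product(A, repeat=len(G)):
    D = dict(zip(G, values))
    if is_cocycle(D):
        cocycles.append(tuple(values))

coboundaries = {tuple((g * b - b) % 8 for g in G) for b in A}
nontrivial = [D for D in cocycles if D not in coboundaries]

# For this action the counts come out to 8 cocycles and 4 coboundaries,
# so H^1((Z/8)^x, Z/8) has order 2: non-coboundary cocycles do exist.
print(len(cocycles), len(coboundaries), len(nontrivial))
print(nontrivial)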

In this situation we can define the semidirect product G \ltimes A, and for any D \colon G \to A we can define a function

f \colon G \ltimes A \to G \ltimes A

by

f(g,a) = (g, a + D(g))

Now suppose G = \text{Aut}(A) and suppose G is abelian. Then by straightforward calculations we can check:

f is an automorphism iff D is a 1-cocycle

and

f is an inner automorphism iff D is a 1-coboundary!

Thus, G \ltimes A will have outer automorphisms if H^1(G,A) \ne 0.

When A = \mathbb{Z}/8 then G = \text{Aut}(A) is abelian and G \ltimes A is the McGee group. This puts Egan’s idea into a nice context. But we still need to actually find maps D that give outer automorphisms of the McGee group, and then find class-preserving ones. I don’t know how to do that using general ideas from cohomology. Maybe someone smart could do the first part, but the ‘class-preserving’ condition doesn’t seem to emerge naturally from cohomology.

Anyway, Egan didn’t waste his time with such effete generalities: he actually found all choices of D \colon (\mathbb{Z}/8)^\times \to \mathbb{Z}/8 for which

f(g,a) = (g, a + D(g))

is a class-preserving outer automorphism of the McGee group. Namely:

\begin{array}{ccl} (D(1), D(3), D(5), D(7)) &=& (0, 0, 4, 4) \\  (D(1), D(3), D(5), D(7)) &=& (0, 2, 0, 2) \\  (D(1), D(3), D(5), D(7)) &=& (0, 4, 4, 0) \\  (D(1), D(3), D(5), D(7)) &=& (0, 6, 0, 6)   \end{array}
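These are easy to verify by brute force over all 32 group elements. Here is a short sketch of mine (not Egan’s code) that multiplies affine maps x \mapsto g x + a directly and checks, for each of the four choices, that f is an automorphism, that it is class-preserving, and that it is not inner:

# A brute-force check (my sketch) of Egan's four D's on the 32-element group.
G = [(g, a) for g in (1, 3, 5, 7) for a in range(8)]     # elements x -> g x + a

def mul(p, q):                     # composition of affine maps, mod 8
    (g, a), (h, b) = p, q
    return ((g * h) % 8, (g * b + a) % 8)

def inv(p):                        # inverse; note g*g = 1 mod 8 for every unit g
    g, a = p
    return (g, (-g * a) % 8)

def conj(h, p):                    # h p h^{-1}
    return mul(mul(h, p), inv(h))

for vals in [(0, 0, 4, 4), (0, 2, 0, 2), (0, 4, 4, 0), (0, 6, 0, 6)]:
    D = dict(zip((1, 3, 5, 7), vals))
    f = {p: (p[0], (p[1] + D[p[0]]) % 8) for p in G}
    automorphism = all(f[mul(p, q)] == mul(f[p], f[q]) for p in G for q in G)
    class_preserving = all(any(conj(h, p) == f[p] for h in G) for p in G)
    inner = any(all(conj(h, p) == f[p] for p in G) for h in G)
    print(vals, automorphism, class_preserving, not inner)   # expect True True True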

Last Saturday after visiting my aunt in Santa Barbara I went to Berkeley to visit the applied category theorists at the Topos Institute. I took a train, to lessen my carbon footprint a bit. The trip took 9 hours—a long time, but a beautiful ride along the coast and then through forests and fields.

The day before taking the train, I discovered my laptop was no longer charging! So, I bought a pad of paper. And then, while riding the train, I checked by hand that Egan’s first choice of D really is a cocycle, and really is not a coboundary, so that it defines an outer automorphism of the McGee group. Then—and this was fairly easy—I checked that it defines a class-preserving automorphism. It was quite enjoyable, since I hadn’t done any long calculations recently.

One moral here is that interesting ideas often arise from the interactions of many people. The results here are not profound, but they are certainly interesting, and they came from online conversations with Greg Egan, Gordon Royle, Joshua Grochow, the mysterious person who instantly knew that the McGee group was one of the two smallest groups with a class-preserving outer automorphism, and others.

But what does it all mean, mathematically? Is there something deeper going on here, or is it all just a pile of curiosities?

What did we actually do, in the end? Following the order of logic rather than history, maybe this. We started with a commutative ring A, took its group of affine transformations \text{Aff}(A), and saw this group must have outer automorphisms if

H^1(A^\times, A) \ne 0

We saw this cohomology group really is nonvanishing when A = \mathbb{Z}/n and n = 8. Furthermore, we found a class-preserving outer automorphism of \text{Aff}(\mathbb{Z}/8).

This raises a few questions:

• What is the cohomology H^1((\mathbb{Z}/n)^\times, \mathbb{Z}/n) in general?

• What are the outer automorphisms of \text{Aff}(\mathbb{Z}/n)?

• When does \text{Aff}(\mathbb{Z}/n) have class-preserving outer automorphisms?

I saw a bit about the last question in this paper:

• Peter A. Brooksbank and Matthew S. Mizuhara, On groups with a class-preserving outer automorphism, Involve, a Journal of Mathematics 7 (2013), 171–179.

They say that this paper:

• G. E. Wall, Finite groups with class-preserving outer automorphisms, Journal of the London Mathematical Society 22 (1947), 315–320.

proves \text{Aff}(\mathbb{Z}/n) has a class-preserving outer automorphism when n is a multiple of 8.

Does this happen only for multiples of 8? Is this somehow related to the most famous thing with period 8—namely, Bott periodicity? I don’t know.

March 26, 2025

Scott Aaronson On the JPMC/Quantinuum certified quantum randomness demo

These days, any quantum computing post I write ought to begin with the disclaimer that the armies of Sauron are triumphing around the globe, this is the darkest time for humanity most of us have ever known, and nothing else matters by comparison. Certainly not quantum computing. Nevertheless stuff happens in quantum computing and it often brings me happiness to blog about it—certainly more happiness than doomscrolling or political arguments.


So then: today JP Morgan Chase announced that, together with Quantinuum and DoE labs, they’ve experimentally demonstrated the protocol I proposed in 2018, and further developed in a STOC’2023 paper with Shih-Han Hung, for using current quantum supremacy experiments to generate certifiable random bits for use in cryptographic applications. See here for our paper in Nature—the JPMC team was gracious enough to include me and Shih-Han as coauthors.

Mirroring a conceptual split in the protocol itself, Quantinuum handled the quantum hardware part of my protocol, while JPMC handled the rest: modification of the protocol to make it suitable for trapped ions, as well as software to generate pseudorandom challenge circuits to send to the quantum computer over the Internet, then to verify the correctness of the quantum computer’s outputs (thereby ensuring, under reasonable complexity assumptions, that the outputs contained at least a certain amount of entropy), and finally to extract nearly uniform random bits from the outputs. The experiment used Quantinuum’s 56-qubit trapped-ion quantum computer, which took a couple of seconds to respond to each challenge. Verification of the outputs was done using the Frontier and Summit supercomputers. The team estimates that about 70,000 certified random bits were generated over 18 hours, in such a way that, using the best currently-known attack, you’d need at least about four Frontier supercomputers working continuously to spoof the quantum computer’s outputs, and get the verifier to accept non-random bits.
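In outline, the client side of such a protocol looks something like the sketch below. This is my own schematic simplification, not the JPMC/Quantinuum code, and every helper function in it is a hypothetical placeholder standing in for a real component: challenge generation, the quantum computer itself, supercomputer verification, and a seeded randomness extractor.

import secrets

# Hypothetical placeholders; each stands in for a real component of the pipeline.
def fresh_challenge_circuit():
    raise NotImplementedError("pseudorandom challenge circuit generation")

def run_on_quantum_computer(circuit):
    raise NotImplementedError("samples returned quickly by the quantum computer")

def xeb_score(circuit, samples):
    raise NotImplementedError("verification score computed on a classical supercomputer")

def seeded_extractor(raw_samples, seed, n_bits):
    raise NotImplementedError("randomness extractor producing nearly uniform bits")

def certified_random_bits(n_rounds, threshold, n_bits):
    raw = []
    for _ in range(n_rounds):
        circuit = fresh_challenge_circuit()             # fresh pseudorandom challenge
        samples = run_on_quantum_computer(circuit)      # must come back within seconds
        if xeb_score(circuit, samples) < threshold:     # reject spoofed or low-entropy outputs
            raise RuntimeError("verification failed")
        raw.extend(samples)                             # accepted outputs carry certified entropy
    return seeded_extractor(raw, secrets.token_bytes(32), n_bits)

The expensive part is the verification step: with current methods, scoring each round costs roughly as much classical work as spoofing it would, which is the bottleneck discussed below.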

We should be clear that this gap, though impressive from the standpoint of demonstrating quantum supremacy with trapped ions, is not yet good enough for high-stakes cryptographic applications (more about that later). Another important caveat is that the parameters of the experiment aren’t yet good enough for my and Shih-Han’s formal security reduction to give assurances: instead, for the moment one only has “practical security,” or security against a class of simplified yet realistic attackers. I hope that future experiments will build on the JPMC/Quantinuum achievement and remedy these issues.


The story of this certified randomness protocol starts seven years ago, when I had lunch with Or Sattath at a Japanese restaurant in Tel Aviv. Or told me that I needed to pay more attention to the then-recent Quantum Lightning paper by Mark Zhandry. I already know that paper is great, I said. You don’t know the half of it, Or replied. As one byproduct of what he’s doing, for example, Mark gives a way to measure quantum money states in order to get certified random bits—bits whose genuine randomness (not pseudorandomness) is certified by computational intractability, something that wouldn’t have been possible in a classical world.

Well, why do you even need quantum money states for that? I asked. Why not just use, say, a quantum supremacy experiment based on Random Circuit Sampling, like the one Google is now planning to do (i.e., the experiment Google would do a year after this conversation)? Then, the more I thought about that question, the more I liked the idea that these “useless” Random Circuit Sampling experiments would do something potentially useful despite themselves, generating certified entropy as just an inevitable byproduct of passing our benchmarks for sampling from certain classically-hard probability distributions. Over the next couple weeks, I worked out some of the technical details of the security analysis (though not all! it was a big job, and one that only got finished years later, when I brought Shih-Han to UT Austin as a postdoc and worked with him on it for a year).

I emailed the Google team about the idea; they responded enthusiastically. I also got in touch with UT Austin’s intellectual property office to file a provisional patent, the only time I’ve done that in my career. UT and I successfully licensed the patent to Google, though the license lapsed when Google’s priorities changed. Meantime, a couple years ago, when I visited Quantinuum’s lab in Broomfield, Colorado, I learned that a JPMC-led collaboration toward an experimental demonstration of the protocol was then underway. The protocol was well-suited to Quantinuum’s devices, particularly given their ability to apply two-qubit gates with all-to-all connectivity and fidelity approaching 99.9%.

I should mention that, in the intervening years, others had also studied the use of quantum computers to generate cryptographically certified randomness; indeed it became a whole subarea of quantum computing. See especially the seminal work of Brakerski, Christiano, Mahadev, Vazirani, and Vidick, which gave a certified randomness protocol that (unlike mine) relies only on standard cryptographic assumptions and allows verification in classical polynomial time. The “only” downside is that implementing their protocol securely seems to require a full fault-tolerant quantum computer (capable of things like Shor’s algorithm), rather than current noisy devices with 50-100 qubits.


For the rest of this post, I’ll share a little FAQ, adapted from my answers to a journalist’s questions. Happy to answer additional questions in the comments.

  • To what extent is this a world-first?

Well, it’s the first experimental demonstration of a protocol to generate cryptographically certified random bits with the use of a quantum computer.

To remove any misunderstanding: if you’re just talking about the use of quantum phenomena to generate random bits, without certifying the randomness of those bits to a faraway skeptic, then that’s been easy to do for generations (just stick a Geiger counter next to some radioactive material!). The new part, the part that requires a quantum computer, is all about the certification.

Also: if you’re talking about the use of separated, entangled parties to generate certified random bits by violating the Bell inequality (see eg here) — that approach does give certification, but the downside is that you need to believe that the two parties really are unable to communicate with each other, something that you couldn’t certify in practice over the Internet.  A quantum-computer-based protocol like mine, by contrast, requires just a single quantum device.

  • Why is the certification element important?

In any cryptographic application where you need to distribute random bits over the Internet, the fundamental question is, why should everyone trust that these bits are truly random, rather than being backdoored by an adversary?

This isn’t so easy to solve.  If you consider any classical method for generating random bits, an adversary could substitute a cryptographic pseudorandom generator without anyone being the wiser.

The key insight behind the quantum protocol is that a quantum computer can solve certain problems efficiently, but only (it’s conjectured, and proven under plausible assumptions) by sampling an answer randomly — thereby giving you certified randomness, once you verify that the quantum computer really has solved the problem in question.  Unlike with a classical computer, there’s no way to substitute a pseudorandom generator, since randomness is just an inherent part of a quantum computer’s operation — specifically, when the entangled superposition state randomly collapses on measurement.

  • What are the applications and possible uses?

One potential application is to proof-of-stake cryptocurrencies, like Ethereum.  These cryptocurrencies are vastly more energy-efficient than “proof-of-work” cryptocurrencies (like Bitcoin), but they require lotteries to be run constantly to decide which currency holder gets to add the next block to the blockchain (and get paid for it).  Billions of dollars are riding on these lotteries being fair.

Other potential applications are to zero-knowledge protocols, lotteries and online gambling, and deciding which precincts to audit in elections. See here for a nice perspective article that JPMC put together discussing these and other potential applications.

Having said all this, a major problem right now is that verifying the results using a classical computer is extremely expensive — indeed, basically as expensive as spoofing the results would be.  This problem, and other problems related to verification (eg “why should everyone else trust the verifier?”), are the reasons why most people will probably pass on this solution in the near future, and generate random bits in simpler, non-quantum-computational ways.

We do know, from e.g. Brakerski et al.’s work, that the problem of making the verification fast is solvable with sufficient advancements in quantum computing hardware.  Even without hardware advancements, it might also be solvable with new theoretical ideas — one of my favorite research directions.

  • Is this an early win for quantum computing?

It’s not directly an advancement in quantum computing hardware, but yes, it’s a very nice demonstration of such advancements — of something that’s possible today but wouldn’t have been possible just a few short years ago.  It’s a step toward using current, non-error-corrected quantum computers for a practical application that’s not itself about quantum mechanics but that really does inherently require quantum computers.

Of course it’s personally gratifying to see something I developed get experimentally realized after seven years.  Huge congratulations to the teams at JP Morgan Chase and Quantinuum, and thanks to them for the hard work they put into this.


Unrelated Announcement: See here for a podcast about quantum computing that I recorded with, of all organizations, the FBI. As I told the gentlemen who interviewed me, I’m glad the FBI still exists, let alone its podcast!

Matt Strassler Quantum Interference 3: What Is Interfering?

In my last post and the previous one, I put one or two particles in various sorts of quantum superpositions, and claimed that some cases display quantum interference and some do not. Today we’ll start looking at these examples in detail to see why interference does or does not occur. We’ll also encounter a difficulty asking where the interference occurs — a difficulty which will lead us eventually to deeper understanding.

First, a lightning review of interference for one particle. Take a single particle in a superposition that gives it equal probability of being right of center and moving to the left OR being left of center and moving to the right. Its wave function is given in Fig. 1.

Figure 1: The wave function of a single particle in a superposition of moving left from the right OR moving right from the left. The black curve represents the absolute-value-squared of the wave function, which gives the probability of finding the particle at that location. Red and blue curves show the wave function’s real and imaginary parts.

Then, at the moment and location where the two peaks in the wave function cross, a strong interference effect is observed, the same sort as is seen in the famous double-slit experiment.

Figure 2: A closeup of the interference pattern that occurs at the moment when the two peaks in Fig. 1 perfectly overlap. An animation is shown here.
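For readers who like to see the fringes come out of the math, here is a minimal numerical sketch (mine, not the post’s code; it assumes Gaussian packets of width sigma and momenta plus or minus k, with hbar = 1). At the moment the two peaks coincide, the state is approximately one Gaussian envelope times (e^{ikx} + e^{-ikx}), i.e. an envelope times cos(kx), which is exactly a double-slit-style fringe pattern:

# A minimal sketch: two counter-propagating Gaussian packets at the crossing moment.
import numpy as np

x = np.linspace(-10, 10, 2001)
sigma, k = 2.0, 5.0

envelope = np.exp(-x**2 / (4 * sigma**2))                      # both packets centered at x = 0
psi = envelope * (np.exp(1j * k * x) + np.exp(-1j * k * x))    # right-mover + left-mover
psi /= np.sqrt(np.trapz(np.abs(psi)**2, x))                    # normalize

prob = np.abs(psi)**2           # proportional to cos^2(k x): fringes spaced pi/k
print("fringe spacing:", np.pi / k)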

The simplest way to analyze this is to approach it as a 19th century physicist might have done. In this pre-quantum version of the problem, shown in Fig. 3, the particle has a definite location and speed (and no wave function), with

  • a 50 percent chance of being left of center and moving right, and
  • a 50 percent chance of being right of center and moving left.
Figure 3: A pre-quantum view of Fig. 1, showing a single particle with equal probability of moving right or moving left. The particle will reach x=0 in both possibilities at the same time, but in pre-quantum physics, nothing special happens then.

Nothing interesting, in either possibility, happens when the particle reaches the center. Either it reaches the center from the left and keeps on going OR it reaches the center from the right and keeps on going. There is certainly no collision, and, in pre-quantum physics, there is also no interference effect.

Still, something abstractly interesting happens there. Before the particle reaches the center, the top and bottom of Fig. 3 are different. But just when the particle is at x=0, the two possibilities in the superposition describe the same object in the same place. In a sense, the two possibilities meet. In the corresponding quantum problem, this is the precise moment where the quantum interference effect is largest. That is a clue.

Two Particles, Two Orderings

So now let’s look in Fig. 4 at the example that I gave as a puzzle, a sort of doubling of the single particle example in Fig. 1.

Figure 4: Two particles in a superposition of moving left or moving right — a sort of doubling of Fig. 3.

Here we have two particles moving from left to right OR from right to left, with 50% probability for each of the two possibilities. I haven’t drawn the corresponding quantum wave function for this yet, but I will in a moment.

We might think something interesting would happen when particle 1 reaches x=0 in both possibilities (Fig. 5a), just as something interesting happens when the particle in Figs. 1-3 reaches x=0 in both of its possibilities. But in fact, there is no interference. Nor does anything interesting happen when the blue particle at the top and the orange particle at the bottom arrive at x=1 (Fig. 5b). Similarly, no interference happens when particle 2 reaches x=0 in both possibilities (Fig. 5c). These “events” are really non-events, as far as quantum physics is concerned. Why is this?

Figure 5a: After the two particles in Fig. 4 have moved slightly, the blue particle is at the same point in both halves of the superposition. Yet in the quantum version of this picture, no interference occurs.
Figure 5b: As in Fig. 5a, but slightly later; again no interference occurs.
Figure 5c: As in Fig. 5a; yet again there is no quantum interference.

The Puzzle’s Puzzling Lack of Interference

To understand why interference never occurs in this case, we have to look at the system’s wave function and how it evolves with time.

Before we start, let’s make sure we avoid a couple of misconceptions:

  • First, we don’t have two wave functions (one for each particle);
  • Second, the wave function is not defined on physical space (the x axis).

Instead we have a single wave function Ψ(x1,x2), defined on the space of possibilities, which has an x1-axis (which I will draw horizontal) giving the position of particle 1 (the blue one), and an x2-axis (which I will draw vertical) giving the position of particle 2 (the orange one). The square of the wave function’s absolute value at a specific possibility (x1,x2) tells us the probability of simultaneously finding particle 1 at position x1 and particle 2 at position x2.

In Fig. 6, I have shown the absolute-value-squared of the initial wave function, corresponding to Fig. 4.

Figure 6: Graph of the squared absolute value of the initial wave function, |Ψ(x1,x2)|², corresponding to Fig. 4. The function is shown dark where it is large and white where it is very small. The two peaks are located at (x1,x2)=(-1,-3) and at (x1,x2)=(+1,+3).

In the first possibility in Fig. 4, we have x1=-1 and x2=-3. One peak of the wave function is located at that position, at lower left in Fig. 6. The other peak of Fig. 6, corresponding to the second possibility in Fig. 4, is located at the position x1=+1 and x2=+3, exactly opposite the first peak.
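If you want to reproduce something like Fig. 6 yourself, here is a small numerical sketch of mine (again assuming Gaussian packets of width sigma and momenta plus or minus k): the wave function is a sum of two product packets, one for each possibility in Fig. 4.

# A sketch of |Psi(x1,x2)|^2 for Fig. 6, built from Gaussian product packets.
import numpy as np

x = np.linspace(-6, 6, 401)
X1, X2 = np.meshgrid(x, x, indexing="ij")     # the space of possibilities (x1, x2)
sigma, k = 0.5, 8.0

def packet(X, x0, k0):                        # one-dimensional Gaussian wave packet
    return np.exp(-(X - x0)**2 / (4 * sigma**2) + 1j * k0 * X)

# possibility 1: both particles moving right, at x1 = -1 and x2 = -3
# possibility 2: both particles moving left,  at x1 = +1 and x2 = +3
Psi = packet(X1, -1, +k) * packet(X2, -3, +k) + packet(X1, +1, -k) * packet(X2, +3, -k)

prob = np.abs(Psi)**2
prob /= np.trapz(np.trapz(prob, x, axis=1), x)   # normalize the joint distribution

p1 = np.trapz(prob, x, axis=1)   # probability of finding particle 1 at x1: bumps at -1 and +1

Under free evolution the two bumps of this joint distribution slide along the diagonal directions of Figs. 8a and 8b and never overlap, which is the no-interference behavior of Fig. 7.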

Fig. 7 now shows the exact solution to the Schrodinger equation, which shows how the wave function of Fig. 6 evolves with time.

Figure 7: How the wave function starting from Fig. 6 evolves over time; there is no interference.

What do we see? The two peaks move generally toward each other, but they miss. They never overlap, so they cannot interfere. This is what makes this case different from Fig. 1; the wave function’s peaks in Fig. 1 do meet, and that is why they interfere.

Why, conceptually speaking, do the two peaks miss? We can understand this using the pre-quantum method, drawing the system not in physical space, as in Fig. 4, but in the space of possibilities. The top possibility in Fig. 4 first puts the system at the star in Fig. 8a, moving up and to the right over time. Because the two particles have equal speeds, every change in x1 is matched with an equal change in x2, which means the star moves on a line whose slope is 1 (i.e. it makes a 45 degree angle to the horizontal.) Similarly, the bottom possibility puts the system at the star in Fig. 8b, moving down and to the left.

Figure 8a: In the space of possibilities, the pre-quantum system in the top possibility of Fig. 4 is initially located at the star, and changes with time by moving along the arrow.
Figure 8b: Same as Fig. 8a, but for the bottom possibility in Fig. 4.

If the two stars ever did find themselves at the same point, then what is happening in the first possibility would be exactly the same as what is happening in the second possibility. In other words, the two possibilities would cross paths. But this does not happen here; the paths of the stars do not intersect, reflecting the fact that the top possibility and bottom possibility in Fig. 4 never look the same at any time.

Quantum physics combines these two pre-quantum possibilities into the single wave function of Fig. 7. The two peaks follow the arrows of Figs. 8a and 8b, and so they never overlap.

The three (non-)events shown in Figs. 5a-5c above correspond to the following:

  • At the time of Fig. 5a, the two peaks in Fig. 7 are on the same vertical line (they have the same x1)
  • At the time of Fig. 5b, the two peaks are aligned along the diagonal from lower right to upper left.
  • At the time of Fig. 5c, the two peaks are on the same horizontal line (they have the same x2).

The Flipped Order

Let’s now compare this with the next example I gave you in my previous post. It is much like Fig. 4, except that in the second possibility we switch the two particles.

Figure 9: As in Fig. 4, except with the two particles switching places in the bottom part of the superposition.

This case does have interference. How can we see this?

The top possibility is unaltered, and so Fig. 10a is the same as Fig. 8a. But in Fig. 10b, things have changed; the star that was at x1=+1 and x2=+3 in Fig. 8b is now moved to the point x1=+3 and x2=+1. The corresponding arrow, however, still points in the same direction, since the particles’ motions are the same as before (toward more negative x1 and x2.)

Figure 10a: Same as Fig. 8a, except with the point (x1,x2)=(+1,-1) circled.
Figure 10b: A new version of Fig. 8b with particles 1 and 2 having switched places. The system now reaches the circled point (x1,x2)=(+1,-1) at the same moment that it does in Fig. 10a.

Now the two arrows do cross paths, and the stars meet at the circled location. At that moment, the pre-quantum system appears in physical space as shown in Fig. 11.

Fig. 11: Quantum interference occurs when, in the pre-quantum analogue, the two possibilities put all their particles in the same place.

In both possibilities, the two particles are in the same locations. And so, in the quantum wave function, the two peaks will cross paths and overlap one another, causing interference. The exact wave function is shown in Fig. 12, and its peaks move just like the stars in Fig. 10a-10b, resulting in a striking interference pattern.

Figure 12: The wave function corresponding to Figs. 9-11, showing interference when the peaks overlap.

Profound Lessons

What are the lessons that we can draw from this pair of examples?

First, quantum interference occurs in the space of possibilities, not in physical space. It has effects that can be observed in physical space, but we will not be able to visualize or comprehend the interference effect completely using only physical space, whose coordinate in this case is simply x. If we try, we will lose some of its essence. The full effect is only understandable using the space of possibilities, here two-dimensional and spanned by x1 and x2. (In somewhat the same way, we cannot learn the full three-dimensional layout of a room having only a photograph; some information about the room can be inferred, but some part is inevitably lost.)

Second, starting from a pre-quantum point of view, we see that quantum interference is expected when the pre-quantum paths of two or more possibilities intersect. As an exercise, go back to the last post where I gave you multiple examples. In every case with interference, this intersection happens: there is a moment where the top possibility looks exactly like the bottom possibility, as in Fig. 11.

Third, quantum interference is generally not about a particle interfering with itself — or at least, if we try to use that language, we can’t explain when interference does and doesn’t happen. At best, we might say that the system of two particles is interfering with itself — or fails to interfere with itself — based on its peaks, their motions and their potential intersections in the space of possibilities. When the system consists of only one particle, it’s easy to confuse these two notions, because the system interfering looks the same as the particle interfering. More generally, it is very easy to be misled when the space of possibilities has the same number of dimensions as the relevant physical space. But with two or more particles, this confusion is eliminated. For significant interference to occur, at least two possibilities in a superposition must align perfectly, with each and every particle in matching locations. Whether this is possible or not depends on the superposition’s details.

How Do We Observe the Interference?

But now let’s raise the following question. When there is interference, “where” is it? We can see where it is in the space of possibilities; it’s clear as day in Fig. 12. But you and I live in physical space. If quantum interference is really about interfering waves, just like those of water or sound, then the interference pattern should be located somewhere, shouldn’t it? Where is it?

Well, here’s something to think about. The double-slit-like interference pattern in Fig. 2, for one particle in a superposition, produces a real, observable effect just like that of the double-slit experiment. In Fig. 12 we see a similar case at the moment where the wave function’s two peaks overlap. How can we observe this interference effect?

An obvious first guess is to measure the position of one of the particles. The result of doing so for particle 1, and repeating the whole experiment many times (just as we always do for the double-slit experiment) is shown in Fig. 13.

Figure 13: If we measure the position of particle 1 at the moment of maximum interference in Fig. 12, and repeat the experiment many times, we will see random dots centered near x=+1, with no interference pattern. (Each new measurement is an orange dot; previous measurements are blue dots.)

There are no interference peaks and valleys at all, in contrast to the case of Fig. 1, which we examined here (in that post’s Fig. 8). Particle 1 always shows up near x1=+1, which is where it is located when the two peaks intersect (see Figs. 10-12). No interesting structure within or around that peak is observed.

Not surprisingly, if we do the same thing for particle 2, we find the same sort of thing. No interference features appear; there’s just a blob near its pre-quantum location in Fig. 11, x2=-1.
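Here is the analogous sketch (mine, same Gaussian-packet assumptions) for the state of Fig. 12 at the moment of maximal overlap. The joint distribution carries diagonal fringes, going like cos^2(k(x1+x2)), but integrating out x2 washes them out completely, leaving just the smooth bump of Fig. 13 near x1 = +1:

# The Fig. 9/12 state at maximal overlap, in the same Gaussian-packet toy model.
import numpy as np

x = np.linspace(-4, 4, 801)
X1, X2 = np.meshgrid(x, x, indexing="ij")
sigma, k = 0.5, 8.0
g = lambda X, x0: np.exp(-(X - x0)**2 / (4 * sigma**2))

# both branches put particle 1 near +1 and particle 2 near -1, with opposite momenta
Psi = g(X1, 1) * g(X2, -1) * (np.exp(1j * k * (X1 + X2)) + np.exp(-1j * k * (X1 + X2)))

prob = np.abs(Psi)**2
prob /= np.trapz(np.trapz(prob, x, axis=1), x)

p1 = np.trapz(prob, x, axis=1)      # marginal for particle 1: smooth bump at x1 = +1, no fringes
j = np.argmin(np.abs(x + 1))        # index of x2 = -1
cut = prob[:, j]                    # slice of the joint distribution at x2 = -1: fringes in x1

In this toy calculation the fringes do survive in slices of the joint distribution at fixed x2; only the marginal that repeated position measurements of particle 1 sample is featureless.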

And yet, the quantum interference is plain to see in Fig. 12. If we can’t observe it by measuring either particle’s position, what other options do we have? Where — if anywhere — will we find it? Is it actually observable, or is it just an abstraction?

March 23, 2025

Jordan Ellenberg Aphorism

Any sufficiently advanced corporation is indistinguishable from academia.

March 21, 2025

Matt von Hippel Hot Things Are Less Useful

Did you know that particle colliders have to cool down their particle beams before they collide?

You might have learned in school that temperature is secretly energy. With a number called Boltzmann’s constant, you can convert a temperature of a gas in Kelvin to the average energy of a molecule in the gas. If that’s what you remember about temperature, it might seem weird that someone would cool down the particles in a particle collider. The whole point of a particle collider is to accelerate particles, giving them lots of energy, before colliding them together. Since those particles have a lot of energy, they must be very hot, right?

Well, no. Here’s the thing: temperature is not just the average energy. It’s the average random energy. It’s energy that might be used to make a particle move forward or backward, up or down, a different random motion for each particle. It doesn’t include motion that’s the same for each particle, like the movement of a particle beam.

Cooling down a particle beam, then, doesn’t mean slowing it down. Rather, it means making it more consistent, getting the different particles moving in the same direction rather than randomly spreading apart. You want the particles to go somewhere specific, speeding up and slamming into the other beam. You don’t want them to move randomly, running into the walls and destroying your collider. So you can have something with high energy that is comparatively cool.
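Here is a toy numerical illustration of that point (my numbers, not tied to any real accelerator): give every particle the same large drift velocity plus a small random spread, and the temperature inferred from the velocity spread is tiny compared with what you would naively get from the total kinetic energy.

# Toy illustration: temperature measures the random part of the motion only.
import numpy as np

rng = np.random.default_rng(0)
kB, m = 1.380649e-23, 1.67262192e-27           # Boltzmann constant (J/K), proton mass (kg)

v_drift = 1.0e7                                # shared beam velocity (m/s), non-relativistic toy
v_spread = 1.0e3                               # random spread around the drift (m/s)
v = v_drift + v_spread * rng.standard_normal(100_000)

T_naive = m * np.mean(v**2) / kB               # "temperature" from total kinetic energy (1 dof)
T_beam = m * np.var(v) / kB                    # kinetic temperature after subtracting the drift

print(f"T from total energy: {T_naive:.3e} K")  # enormous
print(f"T of the beam:       {T_beam:.3e} K")   # modest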

In general, the best way I’ve found to think about temperature and heat is in terms of usefulness and uselessness. Cool things are useful, they do what you expect and not much more. Hot things are less useful, they use energy to do random things you don’t want. Sometimes, by chance, this random energy will still do something useful, and if you have a cold thing to pair with the hot thing, you can take advantage of this in a consistent way. But hot things by themselves are less useful, and that’s why particle colliders try to cool down their beams.

Jordan Ellenberg A birthday message from Azat Miftakhov

(via Michael Harris)

Lena Gorban, the wife of the mathematician Azat Miftakhov, has requested that mathematicians post this message on Azat’s birthday, which is March 21. This is his sixth birthday in prison in Dimitrovgrad, near Ulianovsk in Russia. For more information on his case please see this website.

Hello, this is Azat Miftakhov. For technical reasons, New Year’s Eve happened a week ago. I received many warm, touching letters and cute pictures. Reading them, I experienced many positive emotions. I would like to answer everyone, but return letters are very difficult for me. Just like public addresses, which I make extremely rarely. The thing is that I am too demanding of myself and I am afraid that my words are imperfect. I worried about my insurmountable perfectionism for a long time, until one day I accidentally learned about similar problems with writing texts from such a luminary of science as Lev Davidovich Landau. It even happened that his co-authors designed joint works for him. After that, I partly accepted this feature of mine. But only partly, because it still bothers me that you expect answers from me! And on my birthday, I found the strength in myself, and I want to express my gratitude to you. Thank you for all the support I have received from you over the years. For your letters and wishes, for the events you have organized: from large rallies in support of political prisoners in different countries of the world to bold actions in totalitarian Russia (like, for example, the recent action at the Lomonosov monument in support of me and Dima Ivanov). I am glad that political activity does not fade away, despite the difficult times. What you do greatly helps political prisoners and their loved ones, but also demonstrates mass disagreement with the policies of the current regime. With your support and solidarity, the walls will come down!

Привет, это Азат Мифтахов. По техническим причинам Новый год у меня случился неделю назад. Я получил много тёплых трогательных писем и милых картинок. Читая их, я испытал множество позитивных эмоций. Мне хотелось бы ответить каждому, однако обратные письма даются мне с большим трудом. Так же, как и публичные обращения, которые я делаю крайне редко.

Дело в том, что я бываю слишком требователен к себе и боюсь, что мои слова несовершенны. Я долго переживал по поводу своего непреодолимого перфекционизма, пока однажды случайным образом не узнал о схожих проблемах с написанием текстов у такого светила науки, как Лев Давидович Ландау. Даже случалось, что совместные работы оформляли за него его соавторы. После этого я отчасти принял эту свою особенность.

Но лишь отчасти, потому что всё равно меня беспокоит, что вы ждёте от меня ответов! И в свой День рожденья я нашёл в себе силы, и хочу высказать вам свою благодарность. Благодарю вас за всю вашу поддержку, полученную за все годы. За ваши письма и пожелания, за проводимые мероприятия: от больших митингов в поддержку политзаключенных в разных странах мира до смелых акций в тоталитарной России (как, например, недавняя акция у памятника Ломоносова в поддержку меня и Димы Иванова).

Меня радует, что политическая активность не угасает, не смотря на сложные времена. То, что вы делаете, очень помогает политзаключенным и их близким, но также демонстрирует массовое несогласие с политикой нынешнего режима.

С вашей поддержкой и солидарностью стены рухнут!

Jordan Ellenberg Substance-free dorm

CJ was considering (but in the end isn’t) living in the substance-free dorm at college. I know what it means — it means you can’t drink or smoke weed in your room. But I like the name. The substance-free dorm! Where everything exists only in the realm of forms, you dematerialize at the door and float ethereally into the concept of a bunkbed, your roommate the idea of a roommate, your home a desire to have a home.

March 20, 2025

John Preskill The first and second centuries of quantum mechanics

At this week’s American Physical Society Global Physics Summit in Anaheim, California, John Preskill spoke at an event celebrating 100 years of groundbreaking advances in quantum mechanics. Here are his remarks.

Welcome, everyone, to this celebration of 100 years of quantum mechanics hosted by the Physical Review Journals. I’m John Preskill and I’m honored by this opportunity to speak today. I was asked by our hosts to express some thoughts appropriate to this occasion and to feel free to share my own personal journey as a physicist. I’ll embrace that charge, including the second part of it, perhaps even more than they intended. But over the next 20 minutes I hope to distill from my own experience some lessons of broader interest.

I began graduate study in 1975, the midpoint of the first 100 years of quantum mechanics, 50 years ago and 50 years after the discovery of quantum mechanics in 1925 that we celebrate here. So I’ll seize this chance to look back at where quantum physics stood 50 years ago, how far we’ve come since then, and what we can anticipate in the years ahead.

As an undergraduate at Princeton, I had many memorable teachers; I’ll mention just one: John Wheeler, who taught a full-year course for sophomores that purported to cover all of physics. Wheeler, having worked with Niels Bohr on nuclear fission, seemed implausibly old, though he was actually 61. It was an idiosyncratic course, particularly because Wheeler did not refrain from sharing with the class his current research obsessions. Black holes were a topic he shared with particular relish, including the controversy at the time concerning whether evidence for black holes had been seen by astronomers. Especially notably, when covering the second law of thermodynamics, he challenged us to ponder what would happen to entropy lost behind a black hole horizon, something that had been addressed by Wheeler’s graduate student Jacob Bekenstein, who had finished his PhD that very year. Bekenstein’s remarkable conclusion that black holes have an intrinsic entropy proportional to the event horizon area delighted the class, and I’ve had many occasions to revisit that insight in the years since then. The lesson being that we should not underestimate the potential impact of sharing our research ideas with undergraduate students.

Stephen Hawking made that connection between entropy and area precise the very next year when he discovered that black holes radiate; his resulting formula for black hole entropy, a beautiful synthesis of relativity, quantum theory, and thermodynamics, ranks as one of the shining achievements in the first 100 years of quantum mechanics. And it raised a deep puzzle pointed out by Hawking himself with which we have wrestled since then, still without complete success — what happens to information that disappears inside black holes?

Hawking’s puzzle ignited a titanic struggle between cherished principles. Quantum mechanics tells us that as quantum systems evolve, information encoded in a system can get scrambled into an unrecognizable form, but cannot be irreversibly destroyed. Relativistic causality tells us that information that falls into a black hole, which then evaporates, cannot possibly escape and therefore must be destroyed. Who wins – quantum theory or causality? A widely held view is that quantum mechanics is the victor, that causality should be discarded as a fundamental principle. This calls into question the whole notion of spacetime — is it fundamental, or an approximate property that emerges from a deeper description of how nature works? If emergent, how does it emerge and from what? Fully addressing that challenge we leave to the physicists of the next quantum century.

I made it to graduate school at Harvard and the second half century of quantum mechanics ensued. My generation came along just a little too late to take part in erecting the standard model of particle physics, but I was drawn to particle physics by that intoxicating experimental and theoretical success. And many new ideas were swirling around in the mid and late 70s of which I’ll mention only two. For one, appreciation was growing for the remarkable power of topology in quantum field theory and condensed matter, for example the theory of topological solitons. While theoretical physics and mathematics had diverged during the first 50 years of quantum mechanics, they have frequently crossed paths in the last 50 years, and topology continues to bring both insight and joy to physicists. The other compelling idea was to seek insight into fundamental physics at very short distances by searching for relics from the very early history of the universe. My first publication resulted from contemplating a question that connected topology and cosmology: Would magnetic monopoles be copiously produced in the early universe? To check whether my ideas held water, I consulted not a particle physicist or a cosmologist, but rather a condensed matter physicist (Bert Halperin) who provided helpful advice. The lesson being that scientific opportunities often emerge where different subfields intersect, a realization that has helped to guide my own research over the following decades.

Looking back at my 50 years as a working physicist, what discoveries can the quantumists point to with particular pride and delight?

I was an undergraduate when Phil Anderson proclaimed that More is Different, but as an arrogant would-be particle theorist at the time I did not appreciate how different more can be. In the past 50 years of quantum mechanics no example of emergence was more stunning than the fractional quantum Hall effect. We all know full well that electrons are indivisible particles. So how can it be that in a strongly interacting two-dimensional gas an electron can split into quasiparticles, each carrying a fraction of its charge? The lesson being: in a strongly-correlated quantum world, miracles can happen. What other extraordinary quantum phases of matter await discovery in the next quantum century?

Another thing I did not adequately appreciate in my student days was atomic physics. Imagine how shocked those who elucidated atomic structure in the 1920s would be by the atomic physics of today. To them, a quantum measurement was an action performed on a large ensemble of similarly prepared systems. Now we routinely grab ahold of a single atom, move it, excite it, read it out, and induce pairs of atoms to interact in precisely controlled ways. When interest in quantum computing took off in the mid-90s, it was ion-trap clock technology that enabled the first quantum processors. Strong coupling between single photons and single atoms in optical and microwave cavities led to circuit quantum electrodynamics, the basis for today’s superconducting quantum computers. The lesson being that advancing our tools often leads to new capabilities we hadn’t anticipated. Now clocks are so accurate that we can detect the gravitational redshift when an atom moves up or down by a millimeter in the earth’s gravitational field. Where will the clocks of the second quantum century take us?

Surely one of the great scientific triumphs of recent decades has been the success of LIGO, the laser interferometer gravitational-wave observatory. If you are a gravitational wave scientist now, your phone buzzes so often to announce another black hole merger that it’s become annoying. LIGO would not be possible without advanced laser technology, but aside from that what’s quantum about LIGO? When I came to Caltech in the early 1980s, I learned about a remarkable idea (from Carl Caves) that the sensitivity of an interferometer can be enhanced by a quantum strategy that did not seem at all obvious — injecting squeezed vacuum into the interferometer’s dark port. Now, over 40 years later, LIGO improves its detection rate by using that strategy. The lesson being that theoretical insights can enhance and transform our scientific and technological tools. But sometimes that takes a while.

What else has changed since 50 years ago? Let’s give thanks for the arXiv. When I was a student few scientists would type their own technical papers. It took skill, training, and patience to operate the IBM typewriters of the era. And to communicate our results, we had no email or world wide web. Preprints arrived by snail mail in Manila envelopes, if you were lucky enough to be on the mailing list. The Internet and the arXiv made scientific communication far faster, more convenient, and more democratic, and LaTeX made producing our papers far easier as well. And the success of the arXiv raises vexing questions about the role of journal publication as the next quantum century unfolds.

I made a mid-career shift in research direction, and I’m often asked how that came about. Part of the answer is that, for my generation of particle physicists, the great challenge and opportunity was to clarify the physics beyond the standard model, which we expected to provide a deeper understanding of how nature works. We had great hopes for the new phenomenology that would be unveiled by the Superconducting Super Collider, which was under construction in Texas during the early 90s. The cancellation of that project in 1993 was a great disappointment. The lesson being that sometimes our scientific ambitions are thwarted because the required resources are beyond what society will support. In which case, we need to seek other ways to move forward.

And then the next year, Peter Shor discovered the algorithm for efficiently finding the factors of a large composite integer using a quantum computer. Though computational complexity had not been part of my scientific education, I was awestruck by this discovery. It meant that the difference between hard and easy problems — those we can never hope to solve, and those we can solve with advanced technologies — hinges on our world being quantum mechanical. That excited me because one could anticipate that observing nature through a computational lens would deepen our understanding of fundamental science. I needed to work hard to come up to speed in a field that was new to me — teaching a course helped me a lot.

Ironically, for 4 ½ years in the mid-1980s I sat on the same corridor as Richard Feynman, who had proposed the idea of simulating nature with quantum computers in 1981. And I never talked to Feynman about quantum computing because I had little interest in that topic at the time. But Feynman and I did talk about computation, and in particular we were both very interested in what one could learn about quantum chromodynamics from Euclidean Monte Carlo simulations on conventional computers, which were starting to ramp up in that era. Feynman correctly predicted that it would be a few decades before sufficient computational power would be available to make accurate quantitative predictions about nonperturbative QCD. But it did eventually happen — now lattice QCD is making crucial contributions to the particle physics and nuclear physics programs. The lesson being that as we contemplate quantum computers advancing our understanding of fundamental science, we should keep in mind a time scale of decades.

Where might the next quantum century take us? What will the quantum computers of the future look like, or the classical computers for that matter? Surely the qubits of 100 years from now will be much different and much better than what we have today, and the machine architecture will no doubt be radically different than what we can currently envision. And how will we be using those quantum computers? Will our quantum technology have transformed medicine and neuroscience and our understanding of living matter? Will we be building materials with astonishing properties by assembling matter atom by atom? Will our clocks be accurate enough to detect the stochastic gravitational wave background and so have reached the limit of accuracy beyond which no stable time standard can even be defined? Will quantum networks of telescopes be observing the universe with exquisite precision and what will that reveal? Will we be exploring the high energy frontier with advanced accelerators like muon colliders and what will they teach us? Will we have identified the dark matter and explained the dark energy? Will we have unambiguous evidence of the universe’s inflationary origin? Will we have computed the parameters of the standard model from first principles, or will we have convinced ourselves that’s a hopeless task? Will we have understood the fundamental constituents from which spacetime itself is composed?

There is an elephant in the room. Artificial intelligence is transforming how we do science at a blistering pace. What role will humans play in the advancement of science 100 years from now? Will artificial intelligence have melded with quantum intelligence? Will our instruments gather quantum data Nature provides, transduce it to quantum memories, and process it with quantum computers to discern features of the world that would otherwise have remained deeply hidden?

To a limited degree, in contemplating the future we are guided by the past. Were I asked to list the great ideas about physics to surface over the 50-year span of my career, there are three in particular I would nominate for inclusion on that list. (1) The holographic principle, our best clue about how gravity and quantum physics fit together. (2) Topological quantum order, providing ways to distinguish different phases of quantum matter when particles strongly interact with one another. (3) And quantum error correction, our basis for believing we can precisely control very complex quantum systems, including advanced quantum computers. It’s fascinating that these three ideas are actually quite closely related. The common thread connecting them is that all relate to the behavior of many-particle systems that are highly entangled.

Quantum error correction is the idea that we can protect quantum information from local noise by encoding the information in highly entangled states such that the protected information is inaccessible locally, when we look at just a few particles at a time. Topological quantum order is the idea that different quantum phases of matter can look the same when we observe them locally, but are distinguished by global properties hidden from local probes — in other words such states of matter are quantum memories protected by quantum error correction. The holographic principle is the idea that all the information in a gravitating three-dimensional region of space can be encoded by mapping it to a local quantum field theory on the two-dimensional boundary of the space. And that map is in fact the encoding map of a quantum error-correcting code. These ideas illustrate how as our knowledge advances, different fields of physics are converging on common principles. Will that convergence continue in the second century of quantum mechanics? We’ll see.

As we contemplate the long-term trajectory of quantum science and technology, we are hampered by our limited imaginations. But one way to loosely characterize the difference between the past and the future of quantum science is this: For the first hundred years of quantum mechanics, we achieved great success at understanding the behavior of weakly correlated many-particle systems relevant to, for example, electronic structure, atomic and molecular physics, and quantum optics. The insights gained regarding, for instance, how electrons are transported through semiconductors or how condensates of photons and atoms behave had invaluable scientific and technological impact. The grand challenge and opportunity we face in the second quantum century is acquiring comparable insight into the complex behavior of highly entangled states of many particles which are well beyond the reach of current theory or computation. This entanglement frontier is vast, inviting, and still largely unexplored. The wonders we encounter in the second century of quantum mechanics, and their implications for human civilization, are bound to supersede by far those of the first century. So let us gratefully acknowledge the quantum heroes of the past and present, and wish good fortune to the quantum explorers of the future.

Image credit: Jorge Cham

Doug NatelsonMarch Meeting 2025, Day 4 and wrap-up

 I saw a couple of interesting talks this morning before heading out:

  • Alessandro Chiesa of Parma spoke about using spin-containing molecules potentially as qubits, and about chiral-induced spin selectivity (CISS) in electron transfer.  Regarding the former, here is a review.  Spin-containing molecules can have interesting properties as single qubits, or, for spins higher than 1/2, qudits, with unpaired electrons often confined to a transition metal or rare earth ion somewhat protected from the rest of the universe by the rest of the molecule.  The result can be very long coherence times for their spins.  Doing multi-qubit operations is very challenging with such building blocks, however.  There are some theory proposals and attempts to couple molecular qubits to superconducting resonators, but it's tough!   Regarding chiral-induced spin selectivity, he discussed recent work trying to use molecules where a donor region is linked to an acceptor region via a chiral bridge, and trying to manipulate spin centers this way.  A question in all the CISS work is, how can the effects be large when spin-orbit coupling is generally very weak in light, organic molecules?  He has a recent treatment of this, arguing that if one models the bridge as a chain of sites with large \(U/t\), where \(U\) is the on-site repulsion energy and \(t\) is the hopping contribution, then exchange processes between sites can effectively amplify the otherwise weak spin-orbit effects.  I need to read and think more about this.
  • Richard Schlitz of Konstanz gave a nice talk about some pretty recent research using a scanning tunneling microscope tip (with magnetic iron atoms on the end) to drive electron paramagnetic resonance in a single pentacene molecule (sitting on MgO on Ag, where it tends to grab an electron from the silver and host a spin).  The experimental approach was initially explained here.  The actual polarized tunneling current can drive the resonance, and exactly how depends on the bias conditions.  At high bias, when there is strong resonant tunneling, the current exerts a damping-like torque, while at low bias, when tunneling is far off resonance, the current exerts a field-like torque.  Neat stuff.
  • Leah Weiss from Chicago gave a clear presentation about not-yet-published results (based on earlier work), doing optically detected EPR of Er-containing molecules.  These condense into mm-sized molecular crystals, with the molecular environment being nice and clean, leading to very little inhomogeneous broadening of the lines.  There are spin-selective transitions that can be driven using near telecom-wavelength (1.55 \(\mu m\)) light.  When the (anisotropic) \(g\)-factors of the different levels are different, there are some very promising ways to do orientation-selective and spin-selective spectroscopy.  Looking forward to seeing the paper on this.
And that's it for me for the meeting.  A couple of thoughts:
  • I'm not sold on the combined March/April meeting.  Six years ago when I was a DCMP member-at-large, the discussion was all about how the March Meeting was too big, making it hard to find and get good deals on host sites, and maybe the meeting should split.  Now they've made it even bigger.  Doesn't this make planning more difficult and hosting more expensive since there are fewer options?  (I'm not an economist, but....)  A benefit for the April meeting attendees is that grad students and postdocs get access to the career/networking events held at the MM.  If you're going to do the combination, then it seems like you should have the courage of your convictions and really mingle the two, rather than keeping the March talks in the convention center and the April talks in site hotels.
  • I understand that van der Waals/twisted materials are great laboratories for physics, and that topological states in these are exciting.  Still, by my count there were 7 invited sessions broadly about this topic, and 35 invited talks on this over four days seems a bit extreme.  
  • By my count, there were eight dilution refrigerator vendors at the exhibition (Maybell, Bluefors, Ice, Oxford, Danaher/Leiden, Formfactor, Zero-Point Cryo, and Quantum Design if you count their PPMS insert).  Wow.  
I'm sure there will be other cool results presented today and tomorrow that I am missing - feel free to mention them in the comments.

Matt Strassler Quantum Interference 2: When Does It Happen?

Last time, I showed you that a simple quantum system, consisting of a single particle in a superposition of traveling from the left OR from the right, leads to a striking quantum interference effect. It can then produce the same kind of result as the famous double-slit experiment.

The pre-quantum version of this system, in which (like a 19th century scientist) I draw the particle as though it actually has a definite position and motion in each half of the superposition, looks like Fig. 1. The interference occurs when the particle in both halves of the superposition reaches the point at center, x=0.

Figure 1: A case where interference does occur.

Then I posed a puzzle. I put a system of two [distinguishable] particles into a superposition which, in pre-quantum language, looks like Fig. 2.

Figure 2: Two particles in a superposition of both particles moving right (starting from left of center) or both moving left (from right of center.) Their speeds are equal.

with all particles traveling at the same speed and passing each other without incident if they meet. And I pointed out three events that would happen in quick succession, shown in Figs. 2a-2c.

Figure 2.1: Event 1 at x=0.
Figure 2.2: Event 2a at x=+1 and event 2b at x=-1.
Figure 2.3: Event 3 at x=0.

And I asked the Big Question: in the quantum version of Fig. 2, when will we see quantum interference?

  1. Will we see interference during events 1, 2a, 2b, and 3?
  2. Will we see interference during events 1 and 3 only?
  3. Will we see interference during events 2a and 2b only?
  4. Will we see interference from the beginning of event 1 to the end of event 3?
  5. Will we see interference during event 1 only?
  6. Will we see no interference?
  7. Will we see interference at some time other than events 1, 2a, 2b or 3?
  8. Something else altogether?

So? Well? What’s the correct answer?

The correct answer is … 6. No interference occurs — not in any of the three events in Figs. 2.1-2.3, or at any other time.

  • But wait… how can that make sense? How can it be that particle 1 interferes with itself in the case of Fig. 1 and does not interfere with itself in the case of Fig. 2?!

How, indeed?

Perhaps thinking of the particle as interfering with itself is … problematic.

Perhaps imagining individual particles interfering with themselves might not be sufficient to capture the range of quantum phenomena. Perhaps we will need to focus more on systems of particles, not individual particles — or more generally, to consider physical systems as a whole, and not necessarily in parts.

Intuition From Other Examples

To start to gain some intuition, consider some other examples. Some have interference, some do not. What distinguishes one class from the other?

For example, the case of Fig. 4 looks almost like Fig. 2, except that the two particles in the bottom part of the superposition are switched. Is there interference in this case?

Figure 4: Similar to Fig. 2, but with the two particles reversed in the bottom part of the superposition.

Yes.

How about Fig. 5? In this case, the orange particle is stationary in both parts of the superposition. Is there interference?

Figure 5: In this case, the blue particle is moving (horizontal arrow), but the orange one is stationary in both cases (vertical arrow).

Yes, there is.

And Fig. 6? Again the orange particle is stationary in either part of the superposition.

Figure 6: Similar to Fig. 5, in that the orange particle is again stationary.

No interference this time.

What about Fig. 7 and Fig. 8?

Figure 7: Now the particles in each part of the superposition move in opposite directions.
Figure 8: As in Fig. 7, but with the two particles switched in the bottom part of the superposition.

Yes, interference in both cases. And Figs. 9 and 10?

Figure 9: The blue particle is stationary in both parts of the superposition.
Figure 10: Similar to Fig. 9, except that now the orange particle is stationary in the bottom part of the superposition.

There is interference in the example of Fig. 10, but not that of Fig. 9.

To understand the twists and turns of the double-slit experiment and its many variants, one must be crystal clear about why the above examples do or do not generate interference. We’ll spend several posts exploring them.

What’s Happening (and Where)?

Let’s focus on the cases where interference does occur: Figs. 1, 4, 5, 7, 8, and 10. First, can you identify what they have in common that the cases without interference (Figs. 2, 6 and 9) lack? And second — bringing back the bonus question from last time, which now comes to the fore — in the cases that show interference, exactly when does it happen, and how can we observe it?

Next time we will start the process of going through the examples in Fig. 2 and Figs. 4-10, to see in each case

  • How does the wave function actually behave?
  • Why is there (or is there not) interference?
  • If there is interference,
    • where does it occur?
    • how exactly can it be observed?

From what we learn, we will try to extract some deep lessons.

If you are truly motivated to understand our quantum world, I promise you that this tour of basic quantum phenomena will be well worth your time.

John BaezVisual Insights (Part 2)

For several years I ran a blog called Visual Insight, which was a place to share striking images that help explain topics in mathematics.  Last week I gave this talk about it at the Illustrating Math Seminar:

It was fun showing people the great images created by Refurio Anachro, Greg Egan, Roice Nelson, Gerard Westendorp and other folks. For more info on the images I talked about, go here:

2015 – 01 – 15 — Hammersley Sofa
2015 – 12 – 01 — Golay Code
2014 – 07 – 15 — {7,3} Tiling
2014 – 03 – 15 — {6,3,3} Honeycomb
2013 – 09 – 15 — {6,3,3} Honeycomb in Upper Half Space
2015 – 12 – 15 — Kaleidocycle
2016 – 02 – 15 — 27 Lines on a Cubic Surface
2015 – 04 – 15 — Sphere in Mirrored Spheroid
2015 – 10 – 01 — Balaban 10-Cage
2015 – 09 – 15 — McGee Graph

You can see the whole blog at the AMS website or my website, and you can check out individual articles here:

2013 – 08 – 15 — Tübingen Tiling
2013 – 09 – 01 — Algebraic Numbers
2013 – 09 – 15 — {6,3,3} Honeycomb in Upper Half Space
2013 – 10 – 01 — Catacaustic of a Cardioid
2013 – 10 – 15 — Atomic Singular Inner Function
2013 – 11 – 01 — Enneper Surface
2013 – 11 – 15 — Astroid as Catacaustic of Deltoid
2013 – 12 – 01 — Deltoid Rolling Inside Astroid
2013 – 12 – 15 — Truncated Hypercube

2014 – 01 – 01 — Pentagon-Hexagon-Decagon Identity
2014 – 01 – 15 — Weierstrass Elliptic Function
2014 – 02 – 01 — {5,3,5} Honeycomb
2014 – 02 – 15 — Cantor’s Cube
2014 – 03 – 01 — Menger Sponge
2014 – 03 – 15 — {6,3,3} Honeycomb
2014 – 04 – 01 — {6,3,4} Honeycomb
2014 – 04 – 15 — {6,3,5} Honeycomb
2014 – 05 – 15 — Pattern-Equivariant Homology of a Penrose Tiling
2014 – 06 – 01 — Grace–Danielsson Inequality
2014 – 06 – 15 — Origami Dodecahedra
2014 – 07 – 01 — Sierpinski Carpet
2014 – 07 – 15 — {7,3} Tiling
2014 – 08 – 01 — {7,3,3} Honeycomb
2014 – 08 – 15 — {7,3,3} Honeycomb Meets Plane at Infinity
2014 – 09 – 01 — {3,3,7} Honeycomb Meets Plane at Infinity
2014 – 09 – 15 — Prüfer 2-Group
2014 – 10 – 01 — 2-adic Integers
2014 – 10 – 15 — Packing Regular Octagons
2014 – 11 – 01 — Packing Smoothed Octagons
2014 – 11 – 15 — Packing Regular Heptagons
2014 – 12 – 01 — Packing Regular Pentagons

2015 – 01 – 01 — Icosidodecahedron from D6
2015 – 01 – 15 — Hammersley Sofa
2015 – 02 – 01 — Pentagon-Decagon Packing
2015 – 02 – 15 — Pentagon-Decagon Branched Covering
2015 – 03 – 01 — Schmidt Arrangement
2015 – 03 – 15 — Small Cubicuboctahedron
2015 – 04 – 01 — Branched Cover from (4 4 3_2) Schwarz Triangle
2015 – 04 – 15 — Sphere in Mirrored Spheroid
2015 – 05 – 01 — Twin Dodecahedra
2015 – 05 – 15 — Dodecahedron With 5 Tetrahedra
2015 – 06 – 01 — Harmonic Orbit
2015 – 06 – 15 — Lattice of Partitions
2015 – 07 – 01 — Petersen Graph
2015 – 07 – 15 — Dyck Words
2015 – 08 – 01 — Heawood Graph
2015 – 08 – 15 — Tutte–Coxeter Graph
2015 – 09 – 01 — Hypercube of Duads
2015 – 09 – 15 — McGee Graph
2015 – 10 – 01 — Balaban 10-Cage
2015 – 10 – 15 — Harries Graph
2015 – 11 – 01 — Balaban 11-Cage
2015 – 11 – 15 — Newton’s Apsidal Precession Theorem
2015 – 12 – 01 — Golay Code
2015 – 12 – 15 — Kaleidocycle

2016 – 01 – 01 — Free Modular Lattice on 3 Generators
2016 – 01 – 15 — Cairo Tiling
2016 – 02 – 01 — Hoffman–Singleton Graph
2016 – 02 – 15 — 27 Lines on a Cubic Surface
2016 – 03 – 01 — Clebsch Surface
2016 – 03 – 15 — Zamolodchikov Tetrahedron Equation
2016 – 04 – 01 — Rectified Truncated Icosahedron
2016 – 04 – 15 — Barth Sextic
2016 – 05 – 01 — Involutes of a Cubical Parabola
2016 – 05 – 15 — Discriminant of the Icosahedral Group
2016 – 06 – 01 — Discriminant of Restricted Quintic
2016 – 06 – 15 — Small Stellated Dodecahedron
2016 – 07 – 01 — Barth Decic
2016 – 07 – 15 — Labs Septic
2016 – 08 – 01 — Endrass Octic
2016 – 08 – 15 — Cayley’s Nodal Cubic Surface
2016 – 09 – 01 — Kummer Quartic
2016 – 09 – 15 — Togliatti Quintic
2016 – 10 – 01 — Diamond Cubic
2016 – 10 – 15 — Laves Graph
2016 – 11 – 01 — Escudero Nonic
2016 – 11 – 15 — Bunimovich Stadium
2016 – 12 – 01 — Truncated {6,3,3} Honeycomb
2016 – 12 – 15 — Romik’s Ambidextrous Sofa
2017 – 01 – 01 — Chmutov Octic

Is that 79 articles? If I’d stopped at 78, that would be the dimension of E6.

n-Category Café Visual Insights (Part 2)

From August 2013 to January 2017 I ran a blog called Visual Insight, which was a place to share striking images that help explain topics in mathematics.  Here’s the video of a talk I gave last week about some of those images:

It was fun showing people the great images created by Refurio Anachro, Greg Egan, Roice Nelson, Gerard Westendorp and many other folks. For more info on the images I talked about, read on….

Here are the articles I spoke about:

You can see the whole blog at the AMS website or my website, or you can check out individual articles here:

Is that 79 articles? If I’d stopped at 78, that would be the dimension of E6.

Doug NatelsonMarch Meeting 2025, Day 3

Another busy day at the APS Global Physics Summit.  Here are a few highlights:

  • Shahal Ilani of the Weizmann gave an absolutely fantastic talk about his group's latest results from their quantum twisting microscope.  In a scanning tunneling microscope, because tunneling happens at an atomic-scale location between the tip and the sample, the momentum in the transverse direction is not conserved - that is, the tunneling averages over a huge range of \(\mathbf{k}\) vectors for the tunneling electron.  In the quantum twisting microscope, electrons tunnel from a flat (graphite) patch something like \(d \sim\) 100 nm across, coherently, through a couple of layers of some insulator (like WSe2) and into a van der Waals sample.  In this case, \(\mathbf{k}\) in the plane is comparatively conserved, and by rotating the sample relative to the tip, it is possible to build up a picture of the sample's electronic energy vs. \(\mathbf{k}\) dispersion, rather like in angle-resolved photoemission.  This has allowed, e.g., mapping of phonons via inelastic tunneling.  His group has applied this to magic angle twisted bilayer graphene, a system that has a peculiar combination of properties, where in some ways the electrons act like very local objects, and in other ways they act like delocalized objects.  The answer seems to be that this system at the magic angle is a bit of an analog of a heavy fermion system, where there are sort of local moments (living in very flat bands) interacting and hybridizing with "conduction" electrons (bands crossing the Fermi level at the Brillouin zone center).  The experimental data (movies of the bands as a function of energy and \(\mathbf{k}\) in the plane as the filling is tuned via gate) are gorgeous and look very much like theoretical models.
  • I saw a talk by Roger Melko about applying large language models to try to get efficient knowledge of many-body quantum states, or at least the possible outputs of evolution of a quantum system like a quantum computer based on Rydberg atoms.  It started fairly pedagogically, but I confess that I got lost in the AI/ML jargon about halfway through.
  • Francis M. Ross, recipient of this year's Keithley Award, gave a great talk about using transmission electron microscopy to watch the growth of materials in real time.  She had some fantastic videos - here is a review article about some of the techniques used.  She also showed some very new work using a focused electron beam to make arrays of point defects in 2D materials that looks very promising.
  • Steve Kivelson, recipient of this year's Buckley Prize, presented a very nice talk about his personal views on the theory of high temperature superconductivity in the cuprates.  One basic point:  these materials are balancing between multiple different kinds of emergent order (spin density waves, charge density waves, electronic nematics, perhaps pair density waves).   This magnifies the effects of quenched disorder, which can locally tip the balance one way or another.  Recent investigations of the famous 2D square lattice Hubbard model show this as well.  He argues that for a broad range \(1/2 < U/t < 8\), where \(U\) is the on-site repulsion and \(t\) is the hopping term, the ground state of the Hubbard model is in fact a charge density wave, not a superconductor.  However, if there is some amount of disorder in the form of \(\delta t/t \sim 0.1-0.2\), the result is a robust, unavoidable superconducting state.  He further argues that increasing the superconducting transition temperature requires striking a balance between the underdoped case (strong pairing, weak superfluid phase stiffness) and the overdoped case (weak pairing, strong superfluid stiffness), and that one way to achieve this would be in a bilayer with broken mirror symmetry (say different charge reservoir layers above and below, and/or a big displacement field perpendicular to the plane).  (Apologies for how technical that sounded - hard to reduce that one to something super accessible without writing much more.)
A bit more tomorrow before I depart back to Houston.

n-Category Café Visual Insights (Part 1)

I’m giving a talk next Friday, March 14th, at 9 am Pacific Daylight time here in California. You’re all invited!

(Note that Daylight Savings Time starts March 9th, so do your calculations carefully if you do them before then.)

Title: Visual Insights

Abstract: For several years I ran a blog called Visual Insight, which was a place to share striking images that help explain topics in mathematics.  In this talk I’d like to show you some of those images and explain some of the mathematics they illustrate.

Zoom link: https://virginia.zoom.us/j/97786599157?pwd=jr0dvbolVZ6zrHZhjOSeE2aFvbl6Ix.1

Recording: This talk will be recorded, and eventually a video will appear here: https://www.youtube.com/@IllustratingMathSeminar

John BaezVisual Insights (Part 1)

I’m giving a talk next Friday, March 14th, at 9 am Pacific Daylight time here in California. You’re all invited!

(Note that Daylight Savings Time starts March 9th, so do your calculations carefully if you do them before then.)

Title: Visual Insights

Abstract: For several years I ran a blog called Visual Insight, which was a place to share striking images that help explain topics in mathematics.  In this talk I’d like to show you some of those images and explain some of the mathematics they illustrate.

Zoom link:
https://virginia.zoom.us/j/97786599157?pwd=jr0dvbolVZ6zrHZhjOSeE2aFvbl6Ix.1

Recording: This talk will be recorded, and eventually the recording will appear here: https://www.youtube.com/@IllustratingMathSeminar

March 19, 2025

Doug NatelsonMarch Meeting 2025, Day 2

I spent a portion of today catching up with old friends and colleagues, so fewer highlights, but here are a couple:

  • Like a few hundred other people, I went to the invited talk by Chetan Nayak, leader of Microsoft's quantum computing effort. It was sufficiently crowded that the session chair warned everyone about fire code regulations and that people should not sit on the floor blocking the aisles.  To set the landscape:  Microsoft's approach to quantum computing is to develop topological qubits based on interesting physics that is predicted to happen (see here and here) if one induces superconductivity (via the proximity effect) in a semiconductor nanowire with spin-orbit coupling.  When the right combination of gate voltage and external magnetic field is applied, the nanowire should cross into a topologically nontrivial state with majorana fermions localized to each end of the nanowire, leading to "zero energy states" seen as peaks in the conductance \(dI/dV\) centered at zero bias (\(V=0\)).  A major challenge is that disorder in these devices can lead to other sources of zero-bias peaks (Andreev bound states).  A 2023 paper outlines a protocol that is supposed to give good statistical feedback on whether a given device is in the topologically interesting or trivial regime.  I don't want to rehash the history of all of this.  In a paper published last month, a single proximitized, gate-defined InAs quantum wire is connected to a long quantum dot to form an interferometer, and the capacitance of that dot is sensed via RF techniques as a function of the magnetic flux threading the interferometer, showing oscillations with period \(h/2e\), interpreted as charge parity oscillations of the proximitized nanowire.  In new data, not yet reported in a paper, Nayak presented measurements on a system comprising two such wires and associated other structures.  The argument is that each wire can be individually tuned simultaneously into the topologically nontrivial regime via the protocol above.  Then interferometer measurements can be performed in one wire (the Z channel) and in a configuration involving two ends of different wires (the X channel), and they interpret their data as early evidence that they have achieved the desired majorana modes and their parity measurements.  I look forward to when a paper is out on this, as it is hard to make informed statements about this based just on what I saw quickly on slides from a distance.  
  • In a completely different session, Garnet Chan gave a very nice talk about applying advanced quantum chemistry and embedding techniques to look at some serious correlated materials physics.  Embedding methods are somewhat reminiscent of mean field theories:  Instead of trying to solve the Schrödinger equation for a whole solid, for example, you can treat the solid as a self-consistent theory of a unit cell or set of unit cells embedded in a more coarse-grained bath (made up of other unit cells appropriately averaged).  See here, for example. He presented recent results on computing the Kondo effect of magnetic impurities in metals, understanding the trends of antiferromagnetic properties of the parent cuprates, and trying to describe superconductivity in the doped cuprates.  Neat stuff.
  • In the same session, my collaborator Silke Buehler-Paschen gave a nice discussion of ways to use heavy fermion materials to examine strange metals, looking beyond just resistivity measurements.  Particularly interesting is the idea of trying to figure out quantum Fisher information, which in principle can tell you how entangled your many-body system is (that is, estimating how many other degrees of freedom are entangled with one particular degree of freedom).  See here for an intro to the idea, and here for an implementation in a strange metal, Ce3Pd20Si6.  
More tomorrow....

(On a separate note, holy cow, the trade show this year is enormous - seems like it's 50% bigger than last year.  I never would have dreamed when I was a grad student that you could go to this and have your pick of maybe 10 different dilution refrigerator vendors.  One minor mystery:  Who did World Scientific tick off?  Their table is located on the completely opposite side of the very large hall from every other publisher.)

Jordan EllenbergThe best team in baseball history at not hitting into a double play

was the 2024 Baltimore Orioles. I seem to have been the only person who noticed this piece of history being made, so I wrote a piece for Slate about it, and more generally about the way we notice what happens more than we notice what doesn’t happen. But what doesn’t happen is just as important!

The best thing about writing this is that I got to have a phone call with moneyball hero and Orioles Assistant GM Sig Mejdal. I felt like a bigshot! And he was extremely nice and down to earth. I hope he wins us five championships in a row.

March 18, 2025

Tommaso DorigoThe Probability Density Function: A Known Unknown

Perhaps the most important thing to get right from the start, in most statistical problems, is to understand what the probability density function (PDF) of your data is. If you know it exactly (something that is theoretically possible but only rarely achieved in practice), you are in statistical heaven: you can use the maximum likelihood method for parameter estimation, and you can get to understand a lot about the whole problem.
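To make that concrete with a toy sketch of my own (not from the post itself): if we are willing to assume the data come from, say, an exponential PDF with an unknown rate parameter, maximum likelihood estimation boils down to a short numerical optimization.

    import numpy as np
    from scipy.optimize import minimize_scalar

    # Hypothetical data, assumed (purely for illustration) to follow an
    # exponential PDF  p(x | lam) = lam * exp(-lam * x)  with unknown lam.
    rng = np.random.default_rng(0)
    data = rng.exponential(scale=2.0, size=1000)   # true lam = 1/2

    def neg_log_likelihood(lam):
        # Minus the log-likelihood of the assumed PDF, summed over the data.
        return -np.sum(np.log(lam) - lam * data)

    fit = minimize_scalar(neg_log_likelihood, bounds=(1e-6, 10.0), method="bounded")
    print("maximum-likelihood estimate of lam:", fit.x)   # close to 1/mean(data) ~ 0.5

If the assumed PDF is wrong, of course, the estimate is only as good as the assumption — which is exactly the point of the post.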

read more

Matt Strassler Quantum Interference 1: A Simple Example

A very curious thing about quantum physics, 1920’s style, is that it can create observable interference patterns that are characteristic of overlapping waves. It’s especially curious because 1920’s quantum physics (“quantum mechanics”) is not a quantum theory of waves. Instead it is a quantum theory of particles — of objects with position and motion (even though one can’t precisely know the position and the motion simultaneously.)

(This is in contrast to quantum field theory of the 1950s, which [in its simplest forms] really is a quantum theory of waves. This distinction is one I’ve touched on, and we’ll go into more depth soon — but not today.)

In 1920s quantum physics, the only wave in sight is the wave function, which is useful in one method for describing the quantum physics of these particles. But the wave function exists outside of physical space, and instead exists in the abstract space of possibilities. So how do we get interference effects that are observable in physical space from waves in a weird, abstract space?

However it works, the apparent similarity between interference in 1920s quantum physics and the interference observed in water waves is misleading. Conceptually speaking, they are quite different. And appreciating this point is essential for comprehending quantum physics, including the famous double slit experiment (which I reviewed here.)

But I don’t want to address the double-slit experiment yet, because it is far more complicated than necessary. The complications obscure what is really going on. We can make things much easier with a simpler experimental design, one that allows us to visualize all the details, and to explore why and how and where interference occurs and what its impacts are in the real world.

Once we’ve understood this simpler experiment fully, we’ll be able to discard all sorts of misleading and wrong statements about the double-slit experiment, and return to it with much clearer heads. A puzzle will still remain, but its true nature will be far more transparent without the distracting cloud of misguided clutter.

The Incoming Superposition Experiment

We’ve already discussed what can happen to a particle in a superposition of moving to the left or to the right, using a wave function like that in Fig. 1. The particle is outgoing from the center, with equal probability of going in one direction or the other. At each location, the square of the wave function’s absolute value (shown in black) tells us the probability of finding the particle at that location… so we are most likely to find it under one of the two peaks.

Figure 1: The wave function of a single particle in a superposition of moving outward from the center to the left or right. The wave function’s real and imaginary parts are shown in red and blue; its absolute-value squared is shown in black.

But now let’s turn this around; let’s look at a superposition in which the particle is incoming, with a wave function shown in Fig. 2. This is just the time-reversal of the wave function in Fig. 1. (We could create this superposition in a number of ways. I have described one of them previously — but let’s not worry today about how we got here, and keep our attention on what will happen when the two peaks in the wave function meet.)

Figure 2: The wave function of a single particle in a superposition of moving left or right toward the center. This is just Fig. 1 with time running in the opposite direction.

Important Caution! Despite what you may intuitively guess, the two peaks will not collide and interrupt each other’s motion. Objects that meet in physical space might collide, with significant impact on their motion — or they might pass by each other unscathed. But the peaks in Fig. 2 aren’t objects; the figure is a graph of a probability wave — a wave function — describing a single object. There’s no other object for our single object to collide with, and so it will move steadily and unencumbered at all times.

This is also clear when we use my standard technique of first viewing the system from a pre-quantum point of view, in which case the superposition translates into the two possibilities shown in Fig. 3: either the particle is moving to the right OR it is moving to the left. In neither possibility is there a second object to collide with, so no collision can take place.

Figure 3: In the pre-quantum version of the superposition in Fig. 2, the particle is initially to the left of center and moving to the right OR it is to the right of center and moving to the left.

The wave function for the particle, Ψ(x1), is a function of the particle’s possible position x1. It changes over time, and to find out how it behaves, we need to solve the famous Schrödinger equation. When we do so, we find Ψ(x1) evolves as depicted in Figs. 4a-4c, in which I’ve shown a close-up of the two peaks in Fig. 2 as they cross paths, using three different visualizations. These are the same three approaches to visualization shown in this post, each of which has its pros and cons; take your pick. [Note that there are no approximations in Fig. 4; it shows an exact solution to the Schrödinger equation.]

Figure 4a: A close-up look at the wave function of Fig. 2 as its two peaks approach, cross, and retreat. In black is the absolute-value-squared of the wave function; in red and blue are the wave function’s real and imaginary parts.
Figure 4b: Same as Fig. 4a, with the curve showing the absolute value of the wave function, and with color representing the wave function’s argument [or “phase”].
Figure 4c: Same as Fig. 4a. The wave function’s absolute-value-squared is indicated in gray scale, with larger values corresponding to darker shading.
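For readers who want to play with this themselves, here is a minimal numerical sketch (my own, not the code behind the figures) of the same kind of evolution: a free particle in a superposition of two Gaussian packets heading toward each other, propagated exactly in momentum space. All parameters are arbitrary illustrative choices, with ħ and the particle mass set to 1.

    import numpy as np

    # Position grid and the corresponding angular wavenumbers.
    N, L = 2048, 80.0
    x = np.linspace(-L/2, L/2, N, endpoint=False)
    k = 2*np.pi*np.fft.fftfreq(N, d=L/N)

    def packet(x0, k0, sigma=1.5):
        # Gaussian wave packet centered at x0 with mean momentum k0.
        return np.exp(-(x - x0)**2/(4*sigma**2) + 1j*k0*x)

    # Superposition of a packet coming in from the left and one from the right.
    psi = packet(-15.0, +3.0) + packet(+15.0, -3.0)
    psi /= np.sqrt(np.sum(np.abs(psi)**2) * (L/N))   # normalize

    # Free-particle evolution is exact in momentum space: multiply by exp(-i k^2 t / 2).
    t_cross = 15.0 / 3.0                             # time for each packet to reach x = 0
    psi = np.fft.ifft(np.exp(-1j*k**2*t_cross/2) * np.fft.fft(psi))

    prob = np.abs(psi)**2        # |psi|^2 shows interference fringes near x = 0
    print("smallest |psi|^2 within |x| < 2:", prob[np.abs(x) < 2].min())

Plotting prob against x at t_cross reproduces the kind of fringe pattern discussed just below, including the points where the wave function vanishes.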

The wave function’s most remarkable features are seen at the “moment of crossing,” which is when our pre-quantum system has the particle reaching x=0 in both parts of the superposition (Fig. 5.)

Figure 5: The pre-quantum system at the moment of crossing, when the particle is at x=0 in both parts of the superposition.

At the exact moment of crossing, the wave function takes the form shown in Figs. 6a-c.

Figure 6a: Graph of the wave function Ψ(x1) at the crossing moment; in black is the absolute-value-squared of the wave function; in red and blue are the wave function’s real and imaginary parts.
Figure 6b: Graph of the absolute value |Ψ(x1)| of the wave function at the crossing moment; the color represents the wave function’s argument [or “phase”].

Figure 6c: The absolute-value-squared of the wave function at the crossing moment, indicated in gray scale; larger values of |Ψ(x1)|2 are shown darker, with |Ψ(x1)|2=0 shown in white.

The wiggles in the wave function are a sign of interference. Something is interfering with something else. The pattern superficially resembles that of overlapping ripples in a pond, as in Fig. 7.

Figure 7: The overlap of two sets of ripples caused by an insect’s hind legs creates a visible interference pattern. Credit: Robert Cubitt.

If this pattern reminds you of the one seen in the double-slit experiment, that’s for a very good reason. What we have here is a simpler version of exactly the same effect (as briefly discussed here; we’ll return to this soon.)

These wiggles have a consequence. The quantity |Ψ(x1)|2, the absolute-value-squared of the wave function, tells us the probability of finding this one particle at this particular location x1 in the space of possibilities. (|Ψ(x1)|2 is represented as the black curve in Fig. 6a, as the square of the curve in Fig. 6b, and as the gray-scale value shown in Fig. 6c.) If |Ψ(x1)|2 is large at a particular value of x1, there is a substantial probability of measuring the particle to have position x1. Conversely, if |Ψ(x1)|2=0 at a particular value of x1, then we will not find the particle there.

[Note: I have repeatedly asserted this relationship between the wave function and the probable results of measurements, but we haven’t actually checked that it is true. Stay tuned; we will check it some weeks from now.]

So if we measure the particle’s position x1 at precisely the moment when the wave function looks like Figs. 6a-6c, we will never find it at the grid of points where the wave function is zero.

More generally, suppose we repeat this experiment many times in exactly the same way, setting up particle after particle in the initial superposition state of Fig. 2, measuring its position at the moment of crossing, and recording the result of the measurement. Then, since the particles are most probably found where |Ψ(x1)|2 is large and not where it is small, we will find the distribution of measured locations follows the interference pattern in Figs. 6a-6c, but only appearing one particle at a time, as in Fig. 8.

Figure 8: The experiment is repeated with particle after particle, with each particle’s position measured at the crossing moment. Each new measurement is shown as an orange dot; previous measurements are shown as blue dots. As more and more particles are observed, the interference pattern seen in Figs. 6a-6c gradually appears.
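The particle-by-particle build-up in Fig. 8 is easy to mimic numerically. Here is a sketch of my own (using an idealized crossing-moment wave function rather than the exact one in the figures): draw positions at random with probabilities proportional to |Ψ(x1)|² and histogram them.

    import numpy as np

    # Idealized crossing-moment state: two equal-amplitude packets with opposite
    # momenta overlapping near x = 0 (parameters chosen only for illustration).
    x = np.linspace(-10, 10, 4000)
    sigma, k0 = 2.0, 3.0
    envelope = np.exp(-x**2/(4*sigma**2))
    psi = envelope*np.exp(1j*k0*x) + envelope*np.exp(-1j*k0*x)   # ~ cos(k0 x) times a Gaussian

    prob = np.abs(psi)**2
    prob /= prob.sum()           # discrete probabilities for sampling

    # "Measure" the particle's position in many repeated, identical experiments.
    rng = np.random.default_rng(1)
    measurements = rng.choice(x, size=5000, p=prob)

    counts, edges = np.histogram(measurements, bins=80)
    print(counts)                # the histogram traces out the interference fringes

With only a handful of samples the pattern is invisible; it emerges gradually as the count grows, just as in Fig. 8.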

This gradual particle-by-particle appearance of an interference pattern is similar to what is seen in the double-slit experiment; it follows the same rules and has the same conceptual origin. But here everything is so simple that we can address basic questions. Most importantly, in this 1920’s quantum physics context, what is interfering with what, and where, and how?

  • Is each particle interfering with itself?
    • Is it sometimes acting like a particle and sometimes acting like a wave?
    • Is it simultaneously a wave and a particle?
    • Is it something in between wave and particle?
  • Is each particle interfering with other particles that came before it, and/or with others that will come after it?
  • Is the wave function doing the interfering, as a result of the two parts of the superposition for particle 1 meeting in physical space?
  • Or is it something else that’s going on?

Well, to approach these questions, let’s use our by now familiar trick of considering two particles rather than one. I’ll set up a scenario and pose a question for you to think about, and in a future post I’ll answer it and start addressing this set of questions.

Checking How Quantum Interference Works

Let’s put a system of two [distinguishable] particles into a superposition state that is roughly a doubling of the one we had before. The superposition again includes two parts. Rather than draw the wave function, I’ll draw the pre-quantum version (see Fig. 3 and compare to Fig. 2.) The pre-quantum version of the quantum system of interest looks like Fig. 9.

Figure 9: Two particles in a superposition of both particles moving right (starting from left of center) or both moving left (from right of center.) Their speeds are equal.

Roughly speaking, this is just a doubling of Fig. 3. In one part of the superposition, particles 1 and 2 are traveling to the right, while in the other they travel to the left. To keep things as simple as possible, let’s say

  • all particles in all situations travel at the same speed; and
  • if particles meet, they just pass through each other (much as photons or neutrinos would), so we don’t have to worry about collisions or any other interactions.

In this scenario, several interesting events happen in quick succession as the top particles move right and the bottom particles move left.

Event 1 (whose pre-quantum version is shown in Fig. 10a): at x=0, particle 1 arrives from the left in the top option and from the right in the bottom option.

Figure 10a: The pre-quantum system when event 1 occurs.

Events 2a and 2b (whose pre-quantum versions are shown in Fig. 10b):

  • at x=+1, particle 1 arrives from the left in the top option while particle 2 arrives from the right in the bottom option
  • at x=-1, particle 2 arrives from the left in the top option while particle 1 arrives from the right in the bottom option
Figure 10b: The pre-quantum system when events 2a and 2b occur.

Event 3 (whose pre-quantum version is shown in Fig. 10c): at x=0, particle 2 arrives from the left in the top option and from the right in the bottom option.

Figure 10c: The pre-quantum system when event 3 occurs.

So now, here is The Big Question. In this full quantum version of this set-up, with the full quantum wave function in action, when will we see interference?

  1. Will we see interference during events 1, 2a, 2b, and 3?
  2. Will we see interference during events 1 and 3 only?
  3. Will we see interference during events 2a and 2b only?
  4. Will we see interference from the beginning of event 1 to the end of event 3?
  5. Will we see interference during event 1 only?
  6. Will we see no interference?
  7. Will we see interference at some time other than events 1, 2a, 2b or 3?
  8. Something else altogether?

And a bonus question: in any events where we see interference, where will the interference occur, and what roughly will it look like? (I.e. will it look like Fig. 6, where we had a simple interference pattern centered around x=0, or will it look somewhat different?)

What’s your vote? Make your educated guesses, silently or in the comments as you prefer. I’ll give you some time to think about it.

Doug NatelsonMarch Meeting 2025, Day 1

The APS Global Physics Summit is an intimate affair, with a mere 14,000 attendees, all apparently vying for lunch capacity for about 2,000 people.   The first day of the meeting was the usual controlled chaos of people trying to learn the layout of the convention center while looking for talks and hanging out having conversations.  On the plus side, the APS wifi seems to function well, and the projectors and slide upload system are finally technologically mature (though the pointers/clickers seem to have some issues).  Some brief highlights of sessions I attended:

  • I spent the first block of time at this invited session about progress in understanding quantum spin liquids and quantum spin ice.  Spin ices are generally based on the pyrochlore structure, where atoms hosting local magnetic moments sit at the vertices of corner-sharing tetrahedra, as I had discussed here.  The idea is that the crystal environment and interactions between spins are such that the moments are favored to satisfy the ice rules, where in each tetrahedron two moments point inward toward the center and two point outward.  Classically there are a huge number of spin arrangements that all have about the same ground state energy.  In a quantum spin ice, the idea is that quantum fluctuations are large, so that the true ground state would be some enormous superposition of all possible ice-rule-satisfying configurations.  One consequence of this is that there are low energy excitations that look like an emergent form of electromagnetism, including a gapless phonon-like mode.  Bruce Gaulin spoke about one strong candidate quantum spin ice, Ce2Zr2O7, in a very pedagogical talk that covered all this.  A relevant recent review is this one.   There were two other talks in the session, also about pyrochlores: an experimentally focused one by Sylvain Petit discussing Tb2Ti2O7 (see here), and a theory talk by Yong-Baek Kim focused again on the cerium zirconate.    Also in the session was an interesting talk by Jeff Rau about K2IrCl6, a material with a completely different structure that (above its ordering temperature of 3 K) acts like a "nodal line spin liquid".
  • In part because I had students speaking there, I also attended a contributed session about nanomaterials (wires, tubes, dots, particles, liquids).  There were some neat talks.  The one that I found most surprising was from the Cha group at Cornell, where they were using a method developed by the Schroer group at Yale (see here and here) to fabricate nanowires of two difficult-to-grow, topologically interesting metals, CoIn3 and RhIn3.  The idea is to create a template with an array of tubular holes, and squeeze that template against a bulk crystal of the desired material at around 350 °C, so that the crystal is extruded into the holes to form wires.  Then the template can be etched away and the wires recovered for study.  I'm amazed that this works.
  • In the afternoon, I went back and forth between the very crowded session on fractional quantum anomalous Hall physics in stacked van der Waals materials, and a contributed session about strange metals.  Interesting stuff for sure.
I'm still trying to figure out what to see tomorrow, but there will be another update in the evening.

March 17, 2025

John BaezCritique of Yarvin’s System

guest post by Fred Mott

John, thank you for your thought‐provoking post. I’d like to offer a detailed, rigorous critique of Yarvin’s system by juxtaposing it with the deep insights of political philosophy, social choice theory, and fair division mathematics—domains that, contrary to what might be immediately apparent, all converge on the fundamental principles of justice, fairness, and the good. (Indeed, as Aristotle wisely noted, “man is by nature a political animal,” meaning that understanding the nature of the political—the good, the just, the equitable—is the central theme of philosophy itself.) Below, I lay out the principal flaws in Yarvin’s system, drawing on historical evidence, normative analysis, and mathematical isomorphisms.

I. Flawed Foundational Axioms

1.1 Oversimplification of Political Decay

Yarvin asserts that democracy is inherently doomed to decay into dysfunction. This claim is a vast oversimplification that ignores the sophisticated structures many democracies have developed.

• Normative Flaw: Arrow’s impossibility theorem shows that while no voting system is perfect, the theorem does not imply that democratic systems are intrinsically unstable or unworkable; rather, it highlights the need for well‐designed, compromise–based mechanisms for aggregating preferences [Arrow1951].

• Historical Example: Scandinavian democracies, for instance, exhibit remarkable stability and high levels of social trust, undergirded by robust constitutional safeguards and institutional checks—elements that Yarvin’s axioms completely overlook [Dahl1989].

1.2 Mischaracterization of Power Structures

Yarvin’s “Cathedral” hypothesis posits that an unelected, monolithic alliance of intellectuals and bureaucrats holds absolute power.

• Normative Flaw: This view underestimates the nuanced distribution of power that political philosophers—from Montesquieu to Rawls—have long defended. Montesquieu’s separation of powers and Rawls’ principles of justice are designed precisely to prevent the unchecked concentration of power [Montesquieu1748; Rawls1971].

• Historical Evidence: Regulatory reforms in modern democracies have frequently curtailed the influence of entrenched elites. The evolution of accountability mechanisms, such as independent judiciaries and free media, demonstrates that power is neither monolithic nor immune to reform.

II. The Corporate Governance Fallacy

2.1 The Misguided Corporate Analogy

Yarvin’s proposal to manage states as if they were corporations (with a singular CEO or monarch) is fundamentally flawed when measured against the ethical and distributive imperatives of political life.

• Mathematical Isomorphism: In fair division problems, an envy-free allocation (one where no individual prefers someone else’s share) is essential to equitable outcomes. A corporate model of governance, which prioritizes efficiency and profit maximization, has no built-in mechanism for ensuring fairness across diverse citizenry. This omission renders it mathematically and ethically deficient compared to systems that incorporate participatory fairness metrics [Procaccia2013]. (A toy numerical check of the envy-freeness criterion appears after this list.)

• Normative Flaw: John Rawls argued that justice is best achieved through institutions that guarantee fairness and equality of opportunity, not by centralizing power into a single figurehead [Rawls1971].

• Historical Example: Consider how corporate structures have often led to economic inequality and labor exploitation—dynamics that, if imported wholesale into governance, would likely exacerbate societal injustices.
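As promised above, here is a deliberately toy numerical check of the envy-freeness criterion (my own illustration; the agents, items, and valuations are invented and are not drawn from Yarvin or the cited literature): an allocation is envy-free when no agent values another agent's bundle above her own.

    # Toy envy-freeness check: agent i is envy-free if she values her own bundle
    # at least as much as she values any other agent's bundle.
    valuations = {                      # valuations[agent][item] = subjective value
        "A": {"land": 5, "capital": 3, "labor": 2},
        "B": {"land": 2, "capital": 6, "labor": 2},
        "C": {"land": 3, "capital": 1, "labor": 6},
    }
    allocation = {"A": ["land"], "B": ["capital"], "C": ["labor"]}

    def bundle_value(agent, bundle):
        return sum(valuations[agent][item] for item in bundle)

    def is_envy_free(allocation):
        return all(
            bundle_value(i, allocation[i]) >= bundle_value(i, allocation[j])
            for i in allocation
            for j in allocation
        )

    print(is_envy_free(allocation))     # True for this particular toy allocation

A purely profit-maximizing objective contains no analogue of this constraint, which is the deficiency flagged above.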

2.2 Social Choice and Fair Division

The isomorphism between social choice theory and just governance is critical.

• Mathematical Rigor: The Gibbard–Satterthwaite theorem, for example, shows the limitations of voting schemes but simultaneously underscores that, with carefully calibrated institutions, societies can approximate fairness despite inherent imperfections [Gibbard1973; Satterthwaite1975]. Yarvin’s system, by abandoning mass participation in favor of a “no voice” model, sacrifices these nuanced mechanisms of fairness.

• Normative Flaw: Amartya Sen’s work on capabilities and freedoms demonstrates that real exit options (what Yarvin calls “voting with your feet”) are not merely theoretical—they are grounded in socioeconomic realities that ensure citizens are not left without recourse. Yet Yarvin’s model presumes ideal mobility, disregarding real-world constraints that have long been the subject of rigorous analysis in political philosophy [Sen1999].

III. Perils of Concentrated Power and Absence of Accountability

3.1 Historical Lessons on Authoritarianism

Empirical evidence from history consistently warns that concentrated power tends to breed tyranny and eventual collapse.

• Normative Flaw: While Yarvin argues for a centralized “gov-corp” structure, history shows that regimes with unchecked central authority—ranging from pre–modern absolute monarchies to 20th–century totalitarian states—inevitably succumb to corruption, repression, and internal decay [AcemogluRobinson2012].

• Historical Example: The collapse of the Soviet Union and other autocratic regimes illustrates that without effective checks and participatory channels, power concentration leads to stagnation and societal breakdown.

3.2 The Illusion of “Free Exit”

Yarvin’s “no voice, free exit” principle is deeply problematic.

• Normative and Mathematical Flaw: This principle assumes that every citizen has the same capacity to exit an oppressive regime—a condition far removed from reality. Social choice models and fairness metrics both highlight that mobility and opportunity are contingent upon socioeconomic factors, which are far from uniform in any society [Sen1999].

• Historical Evidence: Authoritarian regimes have historically imposed legal, economic, and cultural barriers to exit, rendering such “freedom” illusory. The stark disparities in migration opportunities during periods of autocracy attest to this reality.

IV. The Centrality of Political Philosophy in Governance

Political philosophy, as illuminated by Aristotle and furthered by thinkers like Rawls, Sen, and Dahl, is not merely an abstract exercise—it is the foundation for constructing societies that are just, fair, and stable.

• Aristotle’s Insight: Aristotle famously noted that the human animal is intrinsically political, and that understanding the good life necessarily involves grappling with questions of justice and governance. This underscores that political philosophy, in all its forms—from treatises to “reply guy” commentaries—is essential for deciphering the complexities of societal order.

• Normative Imperative: The pursuit of justice must be guided by fairness principles that are mathematically coherent (as in social choice theory) and ethically robust. Yarvin’s model, which prioritizes centralized control and dismisses participatory mechanisms, fails to meet these standards.

V. Conclusion: Collapsing the House of Cards

John, your post rightly challenges us to reimagine civilization. However, the neoreactionary system advanced by Yarvin is a house of cards—one that crumbles under the weight of rigorous scrutiny. Its axioms are reductionist, its corporate analogy neglects the core tenets of distributive justice, and its reliance on “no voice” ignores the complex isomorphism between fair division, social choice, and just governance.

Drawing on the collective wisdom of Aristotle, Rawls, Sen, Arrow, and countless others, it becomes clear that a just society must balance efficiency with fairness, centralized power with distributed accountability, and theoretical rigor with historical prudence. Political philosophy is a discipline that transcends format—be it through mathematical models or philosophical treatises—and it remains our most potent tool for understanding and achieving the good life.

In summary, while the ambition behind Yarvin’s system might seem innovative, its reductionist approach and disregard for the deep-seated principles of fairness, justice, and participatory governance render it unsustainable. It is through the robust, mathematically informed, and historically grounded insights of the great political philosophers that we can hope to design systems that are truly just and resilient.

References

[Arrow1951] – Arrow, K. J. (1951). Social Choice and Individual Values. Yale University Press.

[Rawls1971] – Rawls, J. (1971). A Theory of Justice. Harvard University Press.

[Sen1999] – Sen, A. (1999). Development as Freedom. Knopf.

[Dahl1989] – Dahl, R. A. (1989). On Democracy. Yale University Press.

[Montesquieu1748] – Montesquieu, C. de Secondat (1748). The Spirit of the Laws.

[AcemogluRobinson2012] – Acemoglu, D., & Robinson, J. A. (2012). Why Nations Fail. Crown.

[Gibbard1973] – Gibbard, A. (1973). Manipulation of voting schemes: A general result. Econometrica.

[Satterthwaite1975] – Satterthwaite, M. (1975). Strategy-proofness and Arrow’s conditions: Existence and correspondence theorems for voting procedures and social welfare functions. Journal of Economic Theory.

[Procaccia2013] – Procaccia, A. D. (2013). Fair Division: From Cake-Cutting to Dispute Resolution. Cambridge University Press.

[Aristotle] – Aristotle (ca. 335–323 BCE). Politics.

John PreskillDeveloping an AI for Quantum Chess: Part 1

In January 2016, Caltech’s Institute for Quantum Information and Matter unveiled a YouTube video featuring an extraordinary chess showdown between actor Paul Rudd (a.k.a. Ant-Man) and the legendary Dr. Stephen Hawking. But this was no ordinary match—Rudd had challenged Hawking to a game of Quantum Chess. At the time, Fast Company remarked, “Here we are, less than 10 days away from the biggest advertising football day of the year, and one of the best ads of the week is a 12-minute video of quantum chess from Caltech.” But a Super Bowl ad for what, exactly?

For the past nine years, Quantum Realm Games, with continued generous support from IQIM and other strategic partnerships, has been tirelessly refining the rudimentary Quantum Chess prototype showcased in that now-viral video, transforming it into a fully realized game—one you can play at home or even on a quantum computer. And now, at long last, we’ve reached a major milestone: the launch of Quantum Chess 1.0. You might be wondering—what took us so long?

The answer is simple: developing an AI capable of playing Quantum Chess.

Before we dive into the origin story of the first-ever AI designed to master a truly quantum game, it’s important to understand what enables modern chess AI in the first place.

Chess AI is a vast and complex field, far too deep to explore in full here. For those eager to delve into the details, the Chess Programming Wiki serves as an excellent resource. Instead, this post will focus on what sets Quantum Chess AI apart from its classical counterpart—and the unique challenges we encountered along the way.

So, let’s get started!

Depth Matters

credit: https://www.freecodecamp.org/news/simple-chess-ai-step-by-step-1d55a9266977/

With Chess AI, the name of the game is “depth”, at least for versions based on the Minimax strategy conceived by John von Neumann in 1928 (we’ll say a bit about Neural Network-based AI later). The basic idea is that the AI will simulate the possible moves each player can make, down to some depth (number of moves) into the future, then decide which one is best based on a set of evaluation criteria (minimizing the maximum loss the opponent can inflict). The faster it can search, the deeper it can go. And the deeper it can go, the better its evaluation of each potential next move is.

Searching into the future can be modelled as a branching tree, where each branch represents a possible move from a given position (board configuration). The average branching factor for chess is about 35. That means that for a given board configuration, there are about 35 different moves to choose from. So if the AI looks 2 ply (moves) ahead, it sees 35×35 moves on average, and this blows up quickly. By 4 ply, the AI already has 1.5 million moves to evaluate. 

Modern chess engines, like Stockfish and Leela, gain their strength by looking far into the future. Depth 10 is considered low in these cases; you really need 20+ if you want the engine to return an accurate evaluation of each move under consideration. To handle that many evaluations, these engines use strong heuristics to prune branches (the width of the tree), so that they don’t need to calculate the exponentially many leaves of the tree. For example, if one of the branches involves losing your Queen, the algorithm may decide to prune that branch and all the moves that come after. But as experienced players can see already, since a Queen sacrifice can sometimes lead to massive gains down the road, such a “naive” heuristic may need to be refined further before it is implemented. Even so, the tension between depth-first versus breadth-first search is ever present.
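To make the search idea concrete, here is a bare-bones minimax with alpha-beta pruning, written as a generic textbook sketch rather than anything from the actual Quantum Chess engine; the callbacks legal_moves, apply_move, and evaluate are hypothetical placeholders for the game-specific parts.

    import math

    def minimax(state, depth, alpha, beta, maximizing, legal_moves, apply_move, evaluate):
        """Generic alpha-beta minimax; the three callbacks are game-specific stubs."""
        moves = legal_moves(state)
        if depth == 0 or not moves:
            return evaluate(state)            # static evaluation at the leaves
        if maximizing:
            best = -math.inf
            for move in moves:
                child = apply_move(state, move)
                best = max(best, minimax(child, depth - 1, alpha, beta, False,
                                         legal_moves, apply_move, evaluate))
                alpha = max(alpha, best)
                if beta <= alpha:             # prune: the opponent will never allow this line
                    break
            return best
        best = math.inf
        for move in moves:
            child = apply_move(state, move)
            best = min(best, minimax(child, depth - 1, alpha, beta, True,
                                     legal_moves, apply_move, evaluate))
            beta = min(beta, best)
            if beta <= alpha:
                break
        return best

Without pruning, the work grows roughly as (branching factor)^depth, which is exactly why the branching factor of Quantum Chess, discussed next, is such a headache.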

So I heard you like branches…

https://www.sciencenews.org/article/leonardo-da-vinci-rule-tree-branch-wrong-limb-area-thickness

The addition of split and merge moves in Quantum Chess absolutely explodes the branching factor. Early simulations have shown that it may be in the range of 100-120, but more work is needed to get an accurate count. For all we know, branching could be much bigger. We can get a sense by looking at a single piece, the Queen.

On an otherwise empty chess board, a single Queen on d4 has 27 possible moves (we leave it to the reader to find them all). In Quantum Chess, we add the split move: every piece, besides pawns, can move to any two empty squares it can reach legally. This adds every possible paired combination of standard moves to the list. 

But wait, there’s more! 

Order matters in Quantum Chess. The Queen can split to d3 and c4, but it can also split to c4 and d3. These subtly different moves can yield different underlying phase structures (given their implementation via a square-root iSWAP gate between the source square and the first target, followed by an iSWAP gate between the source and the second target), potentially changing how interference works on, say, a future merge move. So you get 27*26 = 702 possible moves! And that doesn’t include possible merge moves, which might add another 15-20 branches to each node of our tree. 

Do the math and we see that there are roughly 30 times as many moves in Quantum Chess for that queen. Even if we assume the branching factor is only 100, by ply 4 we have 100 million moves to search. We obviously need strong heuristics to do some very aggressive pruning. 

But where do we get strong heuristics for a new game? We don’t have centuries of play to study and determine which sequences of moves are good and which aren’t. This brings us to our first attempt at a Quantum Chess AI. Enter StoQfish.

StoQfish

Quantum Chess is based on chess (in fact, you can play regular Chess all the way through if you and your opponent decide to make no quantum moves), which means that chess skill matters. Could we make a strong chess engine work as a quantum chess AI? Stockfish is open source, and incredibly strong, so we started there.

Given the nature of quantum states, the first thing you think of when adapting a classical strategy to a quantum game is to split the quantum superposition underlying the state of the game into a collection of classical states and then sample them according to their (squared) amplitude in the superposition. And that is exactly what we did. We used the Quantum Chess Engine to generate several chess boards by sampling the current state of the game, which can be thought of as a quantum superposition of classical chess configurations, according to the underlying probability distribution. We then passed these boards to Stockfish. Stockfish would, in theory, return its own weighted distribution of the best classical moves. We had some ideas on how to derive split moves from this distribution, but let’s not get ahead of ourselves.
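
In code, that sampling step is conceptually simple: draw classical boards with probability proportional to the squared magnitude of their amplitudes. The sketch below is a toy stand-in, not the actual Quantum Chess Engine interface; the board labels and amplitudes are invented for illustration.

  import random

  # Toy sketch: sample classical boards from a superposition according to the
  # squared magnitude of each board's amplitude (the Born rule).
  def sample_boards(superposition, n_samples):
      boards = list(superposition)
      weights = [abs(superposition[b]) ** 2 for b in boards]
      return random.choices(boards, weights=weights, k=n_samples)

  superposition = {
      "board with the King on e1": 0.6 + 0.0j,
      "board with the King on d1": 0.8 + 0.0j,   # 0.36 + 0.64 = 1
  }
  samples = sample_boards(superposition, 1000)   # each sampled board goes to the chess engine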

This approach had limited success and significant failures. Stockfish is highly optimized for classical chess, which means there are some positions it simply cannot process. For example, consider the scenario where a King is in a superposition of being captured and not captured; once the capture occurs in one branch of the superposition, samples taken after that move can produce boards without a King! Similarly, what if a King in superposition is in check, but you’re not worried because the other half of the King is well protected, so you don’t move to protect it? The concept of check is a problem all around, because Quantum Chess doesn’t recognize it. Things like moving “through check” are completely fine.

You can imagine, then, why Stockfish crashes whenever it encounters a board without a King. In classical Chess, there is always a King on the board. In Quantum Chess, the King is somewhere in the chess multiverse, but not necessarily on every board returned by the sampling procedure.

You might wonder whether we couldn’t just throw away the boards that weren’t valid. That’s one strategy, but we are estimating probabilities from samples, so throwing out some of the data introduces bias into the calculation, which leads to poor outcomes overall.
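
Here is a made-up numerical illustration of that bias: if 40% of the wavefunction's weight sits on boards the engine cannot process and we silently drop them, every statistic we compute is renormalized over the remaining 60%, so the engine is effectively evaluating a different state from the one actually on the board.

  # Invented numbers, purely to illustrate the bias from discarding samples.
  true_distribution = {"board with no King": 0.4, "board where White is winning": 0.6}

  # Drop the boards the engine cannot handle and renormalize over what is left.
  valid = {b: p for b, p in true_distribution.items() if b != "board with no King"}
  total = sum(valid.values())
  biased = {b: p / total for b, p in valid.items()}

  print(biased)   # {'board where White is winning': 1.0} -- 40% of the state is ignored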

We tried to introduce a King onto boards where he was missing, but that became its own computational problem: how do you reintroduce the King in a way that doesn’t change the assessment of the position?

We even tried to hack Stockfish to abandon its obsession with the King, but that caused a cascade of other failures, and tracing through the Stockfish codebase became a problem that wasn’t likely to yield a good result.

This approach wasn’t working, but we weren’t done with Stockfish just yet. Instead of asking Stockfish for the next best move given a position, we tried asking Stockfish to evaluate a position. The idea was that we could use the board evaluations in our own Minimax algorithm. However, we ran into similar problems, including the illegal position problem.

So we decided to try writing our own minimax search, with our own evaluation heuristics. The basics are simple enough. A board’s value is related to the value of the pieces on the board and their location. And we could borrow from Stockfish’s heuristics as we saw fit. 

This gave us Hal 9000. We were sure we’d finally mastered quantum AI. Right? Find out what happened, in the next post.

John BaezThe Dark Enlightenment

For some years now I’ve been going around telling friends that we need a ‘New Enlightenment’. It would aim at a reboot of civilization, based on sounder principles, which picks up where the so-called Age of Enlightenment left off. It would have a theoretical wing: completely rethinking politics and the economy in a way that takes the biosphere and the patterns of human behavior into account. It would also have a practical wing: fighting for freedom and justice against the authoritarians and billionaires. But I think we really need a well-thought out positive vision of what we want—not just what we don’t want.

Of course I know this sounds hopeless. The first enlightenment probably did too! But they had the advantage of communicating by letters, not in a public forum where some ‘reply guy’ instantly undercuts any idealism, and even well-meaning people pull each idea in a dozen divergent directions. So the New Enlightenment, if it happens, will have to be a bit careful about communication.

I don’t especially like the term ‘New Enlightenment’. First, I think there was something too ‘light’ about the first one. It seems to have ignored the dark irrational side of human behavior, which came roaring back in the Romantic era, and later Nazism. Second, I don’t even like these light/dark metaphors. I just haven’t thought of a better term yet.

Now for a big digression.

I was surprised when a friend I’d never discussed these ideas with asked if I knew about the ‘Dark Enlightenment’:

• Wikipedia, Dark Enlightenment.

If you read about this, you’ll see it’s also called ‘neoreactionism’ or ‘NRx’. It was first pushed by a guy named Curtis Yarvin. Paraphrasing part of the Wikipedia article:

Yarvin’s theories were elaborated and expanded by philosopher Nick Land, who first coined the term Dark Enlightenment in his essay of the same name.

In 2021, Yarvin appeared on Fox News’ Tucker Carlson Today, where he discussed the United States’ withdrawal from Afghanistan and his concept of the “Cathedral”, which he claims to be the current aggregation of political power and influential institutions that is controlling the country.

Several prominent Silicon Valley investors and Republican politicians have acknowledged being influenced by the philosophy, with venture capitalist Peter Thiel describing Yarvin as his “most important connection”. Steve Bannon has read and admired his work, and there have been allegations that he has communicated with Yarvin, which Yarvin has denied. JD Vance has cited Yarvin as an influence. Michael Anton, the State Department Director of Policy Planning during Trump’s second presidency, has also discussed Yarvin’s ideas. In January 2025, Yarvin attended a Trump inaugural gala in Washington; Politico reported he was “an informal guest of honor” due to his “outsize influence over the Trumpian right.”

Land drew inspiration from libertarians such as Peter Thiel, particularly Thiel’s claim that “I no longer believe that freedom and democracy are compatible”. The Dark Enlightenment has been described by journalists and commentators as alt-right and neo-fascist.

Andy Beckett stated that “NRx” supporters “believe in the replacement of modern nation-states, democracy and government bureaucracies by authoritarian city states.”

This vision is pretty much the opposite of what I’m hoping for. But if Trump, or his backers like Thiel, have any long-term game plan, maybe this is it. Authoritarian city-states? And maybe some large oppressive empires like Putin’s Russia and Xi’s China, for people who can’t get into a city-state?

Wikipedia writes:

Yarvin in “A Formalist Manifesto” advocates for a form of neocameralism in which small, authoritarian “gov-corps” coexist and compete with each other, an idea anticipated by Hans-Hermann Hoppe. He claims freedom under the system would be guaranteed by the ability to “vote with your feet”, whereby residents could leave for another gov-corp if they felt it would provide a higher quality of life, thus forcing competition. Nick Land reiterates this with the political idea “No Voice, Free Exit”, taken from Albert Hirschman’s ideas of voice being democratic and exit being departure to another society:

“If gov-corp doesn’t deliver acceptable value for its taxes (sovereign rent), [citizens] can notify its customer service function, and if necessary take their custom elsewhere. Gov-corp would concentrate upon running an efficient, attractive, vital, clean, and secure country, of a kind that is able to draw customers.”

But we’ve already tried something like this. Historically, authoritarian city-states don’t always make it easy for people to enter or leave. Why would they? What would compel them to? Remember, Bach was jailed by the Duke of Saxe-Weimar for trying to leave town for a different job. That was typical in his day.

So, I said more about what I don’t want than the positive vision of what I do want. But I’ll try to say something more positive soon. We’re very short of that these days.

March 16, 2025

Jordan EllenbergAn open letter on funding cuts at Columbia

Back to math and baseball posting later this week, but first, some politics.

I don’t always talk about this on the blog, but I’m Jewish and I’m a Zionist. I work closely with Israeli mathematicians, I visit Israel, I contribute to charities there, and I would protest any move to cut ties with Israel by the university where I work. And UW-Madison, like every American university I’m aware of, is firmly committed to maintaining those ties, which benefit both countries.

I do not believe that the current Presidential administration has my best interests or the Jewish people’s best interest in any part of its mind. The harm it wants to do to the American research and teaching enterprise is not on my behalf. It is not going to do any good for the thriving Jewish community of faculty, staff, and students here at UW-Madison or anywhere else; quite the opposite.

I don’t usually sign open letters, because they usually have a lot of extra bullet points and there’s always something I can find that I don’t feel comfortable putting my name to. This one is much more narrowly tailored, saying in more words just what I said above, and I signed it. If you’re Jewish and part of a university community in the United States, I hope you’ll consider signing it too.

Scott Aaronson On Columbia in the crosshairs

The world is complicated, and the following things can all be true:

(1) Trump and his minions would love to destroy American academia, to show their power, thrill their base, and exact revenge on people who they hate. They will gladly seize on any pretext to do so. For those of us, whatever our backgrounds, who chose to spend our lives in American academia, discovering and sharing new knowledge—this is and should be existentially terrifying.

(2) For the past year and a half, Columbia University was a pretty scary place to be an Israeli or pro-Israel Jew—at least, according to Columbia’s own antisemitism task force report, the firsthand reports of my Jewish friends and colleagues at Columbia, and everything else I gleaned from sources I trust. The situation seems to have been notably worse there than at most American universities. (If you think this is all made up, please read pages 13-37 of the report—immediately after October 7, Jewish students singled out for humiliation by professors in class, banned from unrelated student clubs unless they denounced Israel, having their Stars of David ripped off as they walked through campus at night, forced to move dorms due to constant antisemitic harassment—and then try to imagine we were talking about Black, Asian, or LGBTQ students. How would you expect a university to respond, and how would you want it to? More recent incidents included the takeover of a Modern Israeli History class—guards were required for subsequent lectures—and the occupation of Barnard College.) Last year, I decided to stop advising Jewish and Israeli students to go to Columbia, or at any rate, to give them very clear warnings about it. I did this with extreme reluctance, as the Columbia CS department happens to have some of my dearest colleagues in the world, many of whom I know feel just as I do about this.

(3) Having been handed this red meat on a silver platter, the Trump Education Department naturally gobbled it up. They announced that they’re cancelling $400 million in grants to Columbia, to be reinstated in a month if Columbia convinces them that they’re fulfilling their Title VI antidiscrimination obligations to Jews and Israelis. Clearly the Trumpists mean to make an example of Columbia, and thereby terrify other universities into following suit.

(4) Tragically and ironically, this funding freeze will primarily affect Columbia’s hard science departments, which rely heavily on federal grants, and which have remained welcoming to Jews and Israelis. It will have only a minimal effect on Columbia’s social sciences and humanities departments—the ones that nurtured the idea of Hamas and Hezbollah as heroic resistance—as those departments receive much less federal funding in the first place. I hate that suspending grants is pretty much the only federal lever available.

(5) When an action stands to cause so much pain to the innocent and so little to the guilty, I can’t on reflection endorse it—even if it might crudely work to achieve an outcome I want, and all the less if it won’t achieve that outcome.

(6) But I can certainly hope for a good outcome! From what I’ve been told, Katrina Armstrong, the current president of Columbia, has been trying to do the right thing ever since she inherited this mess. In response to the funding freeze, President Armstrong issued an excellent statement, laying out her determination to work with the Education Department, crack down on antisemitic harassment, and restore the funding, with no hint of denial or defensiveness. While I wouldn’t want her job right now, I’m rooting for her to succeed.

(7) Time for some game theory. Consider the following three possible outcomes:
(a) Columbia gets back all its funding by seriously enforcing its rules (e.g., expelling students who threatened violence against Jews), and I can again tell Jewish and Israeli students to attend Columbia with zero hesitation
(b) Everything continues just like before
(c) Columbia loses its federal funding, essentially shuts down its math and science research, and becomes a shadow of what it was
Now let’s say that I assign values of 100 to (a), 50 to (b), and -1000 to (c). This means that, if (say) Columbia’s humanities professors told me that my only options were (b) and (c), I would always flinch and choose (b). And thus, I assume, the professors would tell me my only options were (b) and (c). They’d know I’d never hold a knife to their throat and make them choose between (a) and (c), because I’d fear they’d actually choose (c), an outcome I probably want even less than they do.

Having said that: if, through no fault of my own, some mobster held a knife to their throat and made them choose between (a) and (c)—then I’d certainly advise them to pick (a)! Crucially, this doesn’t mean that I’d endorse the mobster’s tactics, or even that I’d feel confident that the knife won’t be at my own throat tomorrow. It simply means that you should still do the right thing, even if for complicated reasons, you were blackmailed into doing the right thing by a figure of almost cartoonish evil.


I welcome comments with facts or arguments about the on-the-ground situation at Columbia, American civil rights law, the Trumpists’ plans, etc. But I will ruthlessly censor comments that try to relitigate the Israel/Palestine conflict itself. Not merely because I’m tired of that, the Shtetl-Optimized comment section having already litigated the conflict into its constituent quarks, but much more importantly, because whatever you think of it, it’s manifestly irrelevant to whether or not Columbia tolerated a climate of fear for Jews and Israelis in violation of Title VI, which is understandably the only question that American judges (even the non-Trumpist ones) will care about.

March 14, 2025

Matt von HippelAI Can’t Do Science…And Neither Can Other Humans

Seen on Twitter:

I don’t know the context here, so I can’t speak to what Prof. Cronin meant. But it got me thinking.

Suppose you, like Prof. Cronin, were to insist that AI “cannot in principle” do science, because AI “is not autonomous” and “does not come up with its own problems to solve”. What might you mean?

You might just be saying that AI is bad at coming up with new problems to solve. That’s probably fair, at least at the moment. People have experimented with creating simple “AI researchers” that “study” computer programs, coming up with hypotheses about the programs’ performance and testing them. But it’s a long road from that to reproducing the much higher standards human scientists have to satisfy.

You probably don’t mean that, though. If you did, you wouldn’t have said “in principle”. You mean something stronger.

More likely, you might mean that AI cannot come up with its own problems, because AI is a tool. People come up with problems, and use AI to help solve them. In this perspective, not only is AI “not autonomous”, it cannot be autonomous.

On a practical level, this is clearly false. Yes, machine learning models, the core technology in current AI, are set up to answer questions. A user asks something, and receives the model’s prediction of the answer. That’s a tool, but for the more flexible models like GPT it’s trivial to turn it into something autonomous. Just add another program: a loop that asks the model what to do, does it, tells the model the result, and asks what to do next. Like taping a knife to a Roomba, you’ve made a very simple modification to make your technology much more dangerous.
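
As a sketch of what "just add another program" amounts to, you can wrap a question-answering model in a loop that feeds results back in. The ask_model and run_action functions below are hypothetical stand-ins for a language-model API and an execution environment; this is not any particular vendor's interface.

  # Minimal "autonomy loop" sketch. ask_model and run_action are hypothetical
  # stand-ins for a language-model API and an execution environment.
  def autonomy_loop(ask_model, run_action, goal, max_steps=10):
      history = [f"Goal: {goal}"]
      for _ in range(max_steps):
          proposal = ask_model("\n".join(history) + "\nWhat should be done next?")
          result = run_action(proposal)          # act on the proposal in the world
          history.append(f"Did: {proposal}\nResult: {result}")
      return history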

You might object, though, that this simple modification of GPT is not really autonomous. After all, a human created it. That human had some goal, some problem they wanted to solve, and the AI is just solving the problem for them.

That may be a fair description of current AI, but insisting it’s true in principle has some awkward implications. If you make a “physics AI”, just tell it to do “good physics”, and it starts coming up with hypotheses you’d never thought of, is it really fair to say it’s just solving your problem?

What if the AI, instead, was a child? Picture a physicist encouraging a child to follow in their footsteps, filling their life with physics ideas and rhapsodizing about the hard problems of the field at the dinner table. Suppose the child becomes a physicist in turn, and finds success later in life. Were they really autonomous? Were they really a scientist?

What if the child, instead, was a scientific field, and the parent was the general public? The public votes for representatives, the representatives vote to hire agencies, and the agencies promise scientists they’ll give them money if they like the problems they come up with. Who is autonomous here?

(And what happens if someone takes a hammer to that process? I’m…still not talking about this! No-politics-rule still in effect, sorry! I do have a post planned, but it will have to wait until I can deal with the fallout.)

At this point, you’d probably stop insisting. You’d drop that “in principle”, and stick with the claim I started with, that current AI can’t be a scientist.

But you have another option.

You can accept the whole chain of awkward implications, bite all the proverbial bullets. Yes, you insist, AI is not autonomous. Neither is the physicist’s child in your story, and neither are the world’s scientists paid by government grants. Each is a tool, used by the one, true autonomous scientist: you.

You are stuck in your skull, a blob of curious matter trained on decades of experience in the world and pre-trained with a couple billion years of evolution. For whatever reason, you want to know more, so you come up with problems to solve. You’re probably pretty vague about those problems. You might want to see more pretty pictures of space, or wrap your head around the nature of time. So you turn the world into your tool. You vote and pay taxes, so your government funds science. You subscribe to magazines and newspapers, so you hear about it. You press out against the world, and along with the pressure that already exists it adds up, and causes change. Biological intelligences and artificial intelligences scurry at your command. From their perspective, they are proposing their own problems, much more detailed and complex than the problems you want to solve. But from yours, they’re your limbs beyond limbs, sight beyond sight, asking the fundamental questions you want answered.

March 12, 2025

n-Category Café Category Theory 2025

Guest post by John Bourke.

The next International Category Theory Conference CT2025 will take place at Masaryk University (Brno, Czech Republic) from Sunday, July 13 to Saturday, July 19, 2025.

Brno is a beautiful city surrounded by nature with a long tradition in category theory. If you are interested in attending, please read on!

Important dates

  • April 2: talk submission
  • April 18: early registration deadline
  • May 7: notification of speakers
  • May 23: registration deadline
  • July 13-19: conference

In addition to 25 minute contributed talks, there will be speed talks replacing poster sessions, and we hope to accommodate as many talks as possible.

The invited speakers are:

  • Clark Barwick (University of Edinburgh)
  • Maria Manuel Clementino (University of Coimbra)
  • Simon Henry (University of Ottawa)
  • Jean-Simon Lemay (Macquarie University)
  • Wendy Lowen (University of Antwerp)
  • Maru Sarazola (University of Minnesota)

A couple of extra notes:

  • We recommend that people book accommodation as early as possible, as there will be other events taking place in town that week, such as the Moto GP. We have booked a limited number of rooms in advance.
  • In order to promote environmentally friendly travel within Europe, there will be a small prize for the person who travels the furthest from their starting point to the conference by train or bus. Obviously this will not be practical for most attendees, but if you are the sort of person who fancies an adventurous train trip, this could be your chance!

If you would like more details on any aspect of the conference, have a look at the website or send an email to the organizers at ct2025@math.muni.cz

We look forward to seeing as many of you in Brno as possible!

March 11, 2025

Matt LeiferThe Confused Chapman Student’s Guide to the APS Global Summit

This guide is intended for the Chapman undergraduate students who are attending this year’s APS Global Summit. It may be useful for others as well.

The APS Global Summit is a ginormous event, featuring dozens of parallel sessions at any given time. It can be exciting for first-time attendees, but also overwhelming. Here, I compile some advice on how to navigate the meeting and some suggestions for sessions and events you might like to attend.

General Advice

  • Use the online schedule and the mobile app to help you navigate the meeting. If you create a login, the online schedule allows you to add things to your personalized schedule, which you can view on the app at the meeting. This is a very useful thing to do because making decisions of where to go on the fly is difficult.
  • Do not overschedule yourself. I know it is tempting to figure out how to go to as many things as you can, and run between sessions on opposite sides of the convention center. This will be harder to accomplish than you imagine. The meeting gets very crowded and it is exhausting to sit through a full three-hour session of talks. Schedule some break time and, where possible, schedule blocks of time in one location rather than running all over the place.
  • You will have noticed that most talks at the meeting are 12min long (10min + 2min). These are called contributed talks. Since they are so short, they are more like adverts for the work than a detailed explanation. They are usually aimed at experts and, quite frankly, many speakers do not know how to give these talks well. It is not worth attending these talks unless one of the following applies:
    • You are already an expert in that research area.
    • You are strongly considering doing research in that area.
    • You are there to support your friends and colleagues who are speaking in that session.
    • You are so curious about the research area that you are prepared to sit through a lot of opaque talks to get some idea of what is going on in the area.
    • The session is on a topic that is unusually accessible or the session is aimed at undergraduate students.
  • Instead, you should prioritize attending the following kinds of talks, which you can search for using the filters on the schedule:
    • Plenary talks: These are aimed at a general physics audience and are usually by famous speakers (famous by physics standards anyway). Some of these might also be…
    • Popular science talks: Aimed at the general public.
    • Invited Sessions: These sessions consist of 30min talks by invited speakers in a common research area. There is no guarantee that they will be accessible to novices, but it is much more likely than with the contributed talks. Go to any invited sessions on areas of physics you are curious about.
    • Focus Sessions: Focus sessions consist mainly of contributed talks, but they also have one or two 30min invited talks. It is not considered rude to switch sessions between talks, so do not be afraid to just attend the invited talks. They are not always scheduled at the beginning of the session. In fact, some groups deliberately stagger the times of the invited talks so that people can see the invited talks in more than one focus session.
  • There are sessions that list “Undergraduate Students” as part of their target audience. A lot of these are “Undergraduate Research” sessions. It can be interesting to go to one or two of these to see the variety of undergraduate research experiences that are on offer. However, I would not advise only going to sessions on this list. For one thing, undergraduate research projects are not banned from the other sessions, so many of the best undergraduate projects will not be in those sessions. Going to sessions by topic is a better bet most of the time.
  • It is helpful to filter the sessions on the schedule by the organizing Unit (Division, Topical Group, or Forum). You can find a list of APS units here. For example, if you are particularly interested in Quantum Information and Computation then you will want to look at the sessions organized by DQI (Division of Quantum Information). Sessions organized by Forums are often particularly accessible, as they tend to be about less technical issues (DEI, Education, History and Philosophy, etc.)

The next sections contain some more specific suggestions about events, talks and sessions that you might like to attend.

Orientation and Networking Events

I have never been to an orientation or networking event at the APS meeting, but then again I did not go to the APS meeting as a student. Networking is one of the best things you can do at the meeting, so do take any opportunities to meet and talk to people.

Sunday March 16

Time | Event | Location
2:00pm – 3:00pm | First Time Attendee Orientation | Anaheim Convention Center, 201AB (Level 2)
3:00pm – 4:00pm | Undergraduate Student Get Together | Anaheim Convention Center, 201AB (Level 2)

Tuesday March 18

Time | Event | Location
12:30pm – 2:00pm | Students Lunch with the Experts | Anaheim Convention Center, Exhibit Hall B

The student lunch with the Experts is especially worth it because you get a one-on-eight meeting with a physicist who works on a topic you are interested in. You also get a free lunch. Spaces are limited, so you need to sign up for it on the Sunday, and early if you want to get your choice of expert.

Generally speaking, food is very expensive in the convention center. Therefore, the more places you can get free food the better. There are networking events, some of which are aimed at students and some of which have free meals. Other good bets for free food include the receptions and business meetings. (With a business meeting you may have to first sit through a boring administrative meeting for an APS unit, but at least the DQI meeting will feature me talking about The Quantum Times.)

Sessions Chaired by Chapman Faculty

The next few sections highlight talks and sessions that involve people at Chapman. You may want to come to these not only to support local people, but also to find out more about areas of research that you might want to do undergraduate research projects in.

The following sessions are being chaired by Chapman faculty. The chair does not give a talk during the session, but acts as a host. But chairs usually work in the areas that the session is about, so it is a good way to get more of an overview of things they are interested in.

Day | Time | Chair | Session Title | Location
Monday 17 | 11:30am – 1:54pm | Matt Leifer | Quantum Foundations: Bell Inequalities and Causality | Anaheim Convention Center, 256B (Level 2)
Wednesday 19 | 8:00am – 10:48am | Andrew Jordan | Optimal Quantum Control | Anaheim Convention Center, 258A (Level 2)
Wednesday 19 | 11:30am – 1:30pm | Bibek Bhandari | Explorations in Quantum Computing | Virtual Only, Room 1

Talks involving Chapman Faculty, Postdocs and Students

The talks listed below all have someone who is currently affiliated with Chapman as one or more of the authors. The Chapman person is not necessarily the person giving the talk.

The people giving the talks, especially if they are students or postdocs, would appreciate your support. It is also a good way of finding out more about research that is going on at Chapman.

Monday March 17

Time | Speaker | Title | Location
9:36am – 9:48am | Irwin Huang | Beyond Single Photon Dissipation in Kerr Cat Qubits | Anaheim Convention Center, 161 (Level 1)
9:48am – 10:00am | Bingcheng Qing | Benchmarking Single-Qubit Gates on a Noise-Biased Qubit: Kerr cat qubit | Anaheim Convention Center, 161 (Level 1)
10:12am – 10:24am | Ahmed Hjar | Strong light-matter coupling to protect quantum information with Schrodinger cat states | Anaheim Convention Center, 161 (Level 1)
10:24am – 10:36am | Bibek Bhandari | Decoherence in dynamically protected qubits | Anaheim Convention Center, 161 (Level 1)
10:36am – 10:48am | Ke Wang | Control-Z two-qubit gate on 2D Kerr cats | Anaheim Convention Center, 161 (Level 1)
4:12pm – 4:24pm | Adithi Ajith | Stabilizing two-qubit entanglement using stochastic path integral formalism | Anaheim Convention Center, 258A (Level 2)
4:36pm – 4:48pm | Alok Nath Singh | Capturing an electron during a virtual transition via continuous measurement | Anaheim Convention Center, 252B (Level 2)

Tuesday March 18

Time | Speaker | Title | Location
8:48am – 9:00am | Alexandria O Udenkwo | Characterizing the energy and efficiency of an entanglement fueled engine in a circuit QED processor | Anaheim Convention Center, 162 (Level 1)
12:30pm – 12:42pm | Yile Ying | A review and analysis of six extended Wigner's friend arguments | Anaheim Convention Center, 256B (Level 2)
1:54pm – 2:06pm | Indrajit Sen | PT-symmetric axion electrodynamics: A pilot-wave approach | Anaheim Marriott, Platinum 1
3:48pm – 4:00pm | Chuanhong Liu | Planar Fluxonium Qubits Design with 4-way Coupling | Anaheim Convention Center, 162 (Level 1)
4:36pm – 4:48pm | Robert Czupryniak | Reinforcement Learning Meets Quantum Control – Artificially Intelligent Maxwell's Demon | Anaheim Convention Center, 258A (Level 2)

Wednesday March 19

Time | Speaker | Title | Location
10:36am – 10:48am | Dominic Briseno-Colunga | Dynamical Sweet Spot Manifolds of Bichromatically Driven Floquet Qubits | Anaheim Convention Center, 162 (Level 1)
2:30pm – 2:42pm | Sayani Ghosh | Equilibria and Effective Rates of Transition in Astromers | Anaheim Marriott, Platinum 7
3:00pm – 3:12pm | Matt Leifer | A Foundational Perspective on PT-Symmetric Quantum Theory | Anaheim Convention Center, 151 (Level 1)
5:36pm – 5:48pm | Sacha Greenfield | A unified picture for quantum Zeno and anti-Zeno effects | Anaheim Convention Center, 161 (Level 1)

Thursday March 20

Time | Speaker | Title | Location
1:18pm – 1:30pm | Lucas Burns | Delayed Choice Lorentz Transformations on a Qubit | Anaheim Convention Center, 256B (Level 2)
4:48pm – 5:00pm | Noah J Stevenson | Design of fluxonium coupling and readout via SQUID couplers | Anaheim Convention Center, 161 (Level 1)
5:00pm – 5:12pm | Kagan Yanik | Flux-Pumped Symmetrically Threaded SQUID Josephson Parametric Amplifier | Anaheim Convention Center, 204C (Level 2)
5:00pm – 5:12pm | Abhishek Chakraborty | Two-qubit gates for fluxonium qubits using a tunable coupler | Anaheim Convention Center, 161 (Level 1)

Friday March 21

Time | Speaker | Title | Location
10:12am – 10:24am | Nooshin M. Estakhri | Distinct statistical properties of quantum two-photon backscattering | Anaheim Convention Center, 253A (Level 2)
10:48am – 11:00am | Le Hu | Entanglement dynamics in collision models and all-to-all entangled states | Anaheim Hilton, San Simeon AB (Level 4)
11:54am – 12:06pm | Luke Valerio | Optimal Design of Plasmonic Nanotweezers with Genetic Algorithm | Anaheim Convention Center, 253A (Level 2)

Posters involving Chapman Faculty, Postdocs and Students

Poster sessions last longer than talks, so you can view the posters at your leisure. The presenter is supposed to stand by their poster and talk to people who come to see it. The following posters are being presented by Chapman undergraduates. Please drop by and support them.

Thursday March 20, 10:00am – 1:00pm, Anaheim Convention Center, Exhibit Hall A

Poster Number | Presenter | Title
267 | Ponthea Zahraii | Machine learning-assisted characterization of optical forces near gradient metasurfaces
400 | Clara Hunt | What the white orchid can teach us about radiative cooling
401 | Nathan Taormina | Optimizing Insulation and Geometrical Designs for Enhanced Sub-Ambient Radiative Cooling Efficiency

Leifer’s Recommendations

These are sessions that reflect my own interests. It is a good bet that you will find me at one of these, unless I am teaching, or someone I know is speaking somewhere else. There are multiple sessions at the same time, but what I will typically do is select the one that has the most interesting looking talk at the time and switch sessions from time to time or take a break from sessions entirely if I get bored.

Monday March 17

Time | Session Title | Location
8:00am – 11:00am | Quantum Science and Technology at the National DOE Research Centers: Progress and Opportunities | Anaheim Convention Center, 158 (Level 1)
8:00am – 11:00am | Learning and Benchmarking Quantum Channels | Anaheim Convention Center, 258A (Level 2)
10:45am – 12:33pm | Beginners Guide to Quantum Gravity | Anaheim Marriott, Grand Ballroom Salon E
11:30am – 1:54pm | Quantum Foundations: Bell Inequalities and Causality | Anaheim Convention Center, 256B (Level 2)
1:30pm – 3:18pm | History and Physics of the Manhattan Project and the Bombings of Hiroshima and Nagasaki | Anaheim Marriott, Platinum 9
3:00pm – 6:00pm | DQI Thesis Award Session | Anaheim Convention Center, 158 (Level 1)

Tuesday March 18

Time | Session Title | Location
8:30am – 10:18am | Forum on Outreach and Engagement of the Public Invited Session | Anaheim Marriott, Orange County Salon 1
10:45am – 12:33pm | Pais Prize Session | Anaheim Marriott, Platinum 2
11:30am – 2:30pm | Applied Quantum Foundations | Anaheim Convention Center, 256B (Level 2)
1:30pm – 3:18pm | Mini-Symposium: Research Validated Assessments in Education | Anaheim Marriott, Grand Ballroom Salon D
1:30pm – 3:18pm | Research in Quantum Mechanics Instruction | Anaheim Marriott, Orange County Salon 1
3:00pm – 5:24pm | Landauer-Bennett Award Prize Symposium | Anaheim Convention Center, 158 (Level 1)
3:00pm – 6:00pm | Undergraduate and Graduate Education I | Anaheim Convention Center, 263A (Level 2)
3:00pm – 6:00pm | Invited Session for the Forum on Outreach and Engagement of the Public | Anaheim Convention Center, 155 (Level 1)
3:45pm – 5:33pm | Highlights from the Special Collections of AJP and TPT on Teaching About Quantum | Anaheim Marriott, Platinum 3
6:15pm – 9:00pm | DQI Business Meeting | Anaheim Convention Center, 160 (Level 1)

Wednesday March 19

Time | Session Title | Location
11:30am – 2:30pm | Quantum Information: Thermodynamics out of Equilibrium | Anaheim Hilton, San Simeon AB (Level 4)
3:00pm – 5:36pm | Quantum Foundations: Measurements, Contextuality, and Classicality | Anaheim Convention Center, 151 (Level 1)
3:00pm – 6:00pm | Beyond Knabenphysik: Women in the History of Quantum Physics | Anaheim Convention Center, 154 (Level 1)

Thursday March 20

Time | Session Title | Location
8:00am – 10:48am | Undergraduate Education | Anaheim Convention Center, 263A (Level 2)
8:00am – 11:00am | Open Quantum Systems and Many-Body Dynamics | Anaheim Hilton, San Simeon AB (Level 4)
11:30am – 2:30pm | Time in Quantum Mechanics and Thermodynamics | Anaheim Hilton, California C (Ballroom Level)
11:30am – 2:30pm | Intersections of Quantum Science and Society | Anaheim Convention Center, 159 (Level 1)
11:30am – 2:18pm | Quantum Foundations: Relativity, Gravity, and Geometry | Anaheim Convention Center, 256B (Level 2)
3:00pm – 6:00pm | The Early History of Quantum Information Physics | Anaheim Convention Center, 154 (Level 1)
3:00pm – 6:00pm | Quantum Thermalization: Understanding the Dynamical Foundation of Quantum Thermodynamics | Anaheim Hilton, California A (Ballroom Level)

Friday March 21

Time | Session Title | Location
8:00am – 11:00am | Structures in Quantum Systems | Anaheim Convention Center, 258A (Level 2)
8:00am – 10:24am | Science Communication in an Age of Misinformation and Disinformation | Anaheim Convention Center, 156 (Level 1)

The Exhibition Hall

It is worthwhile to spend some time in the exhibit hall. It features a Careers Fair and a Grad School Fair, which will be larger and more relevant to physics students than other such fairs you might attend in the area.

But, of course, the main purpose of going to the exhibition hall is to acquire SWAG. Some free items I have obtained from past APS exhibit halls include:

  • Rubik’s cubes
  • Balls that light up when you bounce them
  • Yo-Yos
  • Wooden model airplanes
  • Snacks
  • T-shirts
  • Tote bags
  • Enough stationery items to last for the rest of your degree
  • Free magazines and journals
  • Free or heavily discounted books

I recommend going when the hall first opens to get the highest quality SWAG.

Fun Stuff

Other fun stuff to do at this year’s meeting includes:

  • QuantumFest: This starts with the Quantum Jubilee event on Saturday, but there are events all week, some of which require registration for the meeting. Definitely reserve a spot for the LabEscape escape room. I have done one of their rooms before and it is fun.
  • Physics Rock-n-Roll Singalong: A very nerdy APS meeting tradition. Worth attending once in your life. Probably only once though.

March 10, 2025

Scott Aaronson The Evil Vector

Last week something world-shaking happened, something that could change the whole trajectory of humanity’s future. No, not that—we’ll get to that later.

For now I’m talking about the “Emergent Misalignment” paper. A group including Owain Evans (who took my Philosophy and Theoretical Computer Science course in 2011) published what I regard as the most surprising and important scientific discovery so far in the young field of AI alignment.  (See also Zvi’s commentary.) Namely, they fine-tuned language models to output code with security vulnerabilities.  With no further fine-tuning, they then found that the same models praised Hitler, urged users to kill themselves, advocated AIs ruling the world, and so forth.  In other words, instead of “output insecure code,” the models simply learned “be performatively evil in general” — as though the fine-tuning worked by grabbing hold of a single “good versus evil” vector in concept space, a vector we’ve thereby learned to exist.

(“Of course AI models would do that,” people will inevitably say. Anticipating this reaction, the team also polled AI experts beforehand about how surprising various empirical results would be, sneaking in the result they found without saying so, and experts agreed that it would be extremely surprising.)

Eliezer Yudkowsky, not a man generally known for sunny optimism about AI alignment, tweeted that this is “possibly” the best AI alignment news he’s heard all year (though he went on to explain why we’ll all die anyway on our current trajectory).

Why is this such a big deal, and why did even Eliezer treat it as good news?

Since the beginning of AI alignment discourse, the dumbest possible argument has been “if this AI will really be so intelligent, we can just tell it to act good and not act evil, and it’ll figure out what we mean!”  Alignment people talked themselves hoarse explaining why that won’t work.

Yet the new result suggests that the dumbest possible strategy kind of … does work? In the current epoch, at any rate, if not in the future?  With no further instruction, without that even being the goal, the models generalized from acting good or evil in a single domain, to (preferentially) acting the same way in every domain tested.  Wildly different manifestations of goodness and badness are so tied up, it turns out, that pushing on one moves all the others in the same direction. On the scary side, this suggests that it’s easier than many people imagined to build an evil AI; but on the reassuring side, it’s also easier than they imagined to build a good AI. Either way, you just drag the internal Good vs. Evil slider to wherever you want it!

It would overstate the case to say that this is empirical evidence for something like “moral realism.” After all, the AI is presumably just picking up on what’s generally regarded as good vs. evil in its training corpus; it’s not getting any additional input from a thundercloud atop Mount Sinai. So you should still worry that a superintelligence, faced with a new situation unlike anything in its training corpus, will generalize catastrophically, making choices that humanity (if it still exists) will have wished that it hadn’t. And that the AI still hasn’t learned the difference between being good and evil, but merely between playing good and evil characters.

All the same, it’s reassuring that there’s one way that currently works to build AIs that can converse, and write code, and solve competition problems—namely, to train them on a large fraction of the collective output of humanity—and that the same method, as a byproduct, gives the AIs an understanding of what humans presently regard as good or evil across a huge range of circumstances, so much so that a research team bumped up against that understanding even when they didn’t set out to look for it.


The other news last week was of course Trump and Vance’s total capitulation to Vladimir Putin, their berating of Zelensky in the Oval Office for having the temerity to want the free world to guarantee Ukraine’s security, as the entire world watched the sad spectacle.

Here’s the thing. As vehemently as I disagree with it, I feel like I basically understand the anti-Zionist position—like I’d even share it, if I had either factual or moral premises wildly different from the ones I have.

Likewise for the anti-abortion position. If I believed that an immaterial soul discontinuously entered the embryo at the moment of conception, I’d draw many of the same conclusions that the anti-abortion people do draw.

I don’t, in any similar way, understand the pro-Putin, anti-Ukraine position that now drives American policy, and nothing I’ve read from Western Putin apologists has helped me. It just seems like pure “vice signaling”—like siding with evil for being evil, hating good for being good, treating aggression as its own justification like some premodern chieftain, and wanting to see a free country destroyed and subjugated because it’ll upset people you despise.

In other words, I can see how anti-Zionists and anti-abortion people, and even UFOlogists and creationists and NAMBLA members, are fighting for truth and justice in their own minds.  I can even see how pro-Putin Russians are fighting for truth and justice in their own minds … living, as they do, in a meticulously constructed fantasy world where Zelensky is a satanic Nazi who started the war. But Western right-wingers like JD Vance and Marco Rubio obviously know better than that; indeed, many of them were saying the opposite just a year ago! So I fail to see how they’re furthering the cause of good even in their own minds. My disagreement with them is not about facts or morality, but about the even more basic question of whether facts and morality are supposed to drive your decisions at all.

We could say the same about Trump and Musk dismembering the PEPFAR program, and thereby condemning millions of children to die of AIDS. Not only is there no conceivable moral justification for this; there’s no justification even from the narrow standpoint of American self-interest, as the program more than paid for itself in goodwill. Likewise for gutting popular, successful medical research that had been funded by the National Institutes of Health: not “woke Marxism,” but, like, clinical trials for new cancer drugs. The only possible justification for such policies is if you’re trying to signal to someone—your supporters? your enemies? yourself?—just how callous and evil you can be. As they say, “the cruelty is the point.”

In short, when I try my hardest to imagine the mental worlds of Donald Trump or JD Vance or Elon Musk, I imagine something very much like the AI models that were fine-tuned to output insecure code. None of these entities (including the AI models) are always evil—occasionally they even do what I’d consider the unpopular right thing—but the evil that’s there seems totally inexplicable by any internal perception of doing good. It’s as though, by pushing extremely hard on a single issue (birtherism? gender transition for minors?), someone inadvertently flipped the signs of these men’s good vs. evil vectors. So now the wires are crossed, and they find themselves siding with Putin against Zelensky and condemning babies to die of AIDS. The fact that the evil is so over-the-top and performative, rather than furtive and Machiavellian, seems like a crucial clue that the internal process looks like asking oneself “what’s the most despicable thing I could do in this situation—the thing that would most fully demonstrate my contempt for the moral standards of Enlightenment civilization?,” and then doing that thing.

Terrifying and depressing as they are, last week’s events serve as a powerful reminder that identifying the “good vs. evil” direction in concept space is only a first step. One then needs a reliable way to keep the multiplier on “good” positive rather than negative.

March 09, 2025

Terence TaoOn several irrationality problems for Ahmes series

Vjeko Kovac and I have just uploaded to the arXiv our paper “On several irrationality problems for Ahmes series“. This paper resolves (or at least makes partial progress on) some open questions of Erdős and others on the irrationality of Ahmes series, which are infinite series of the form {\sum_{k=1}^\infty \frac{1}{a_k}} for some increasing sequence {a_k} of natural numbers. Of course, since most real numbers are irrational, one expects such series to “generically” be irrational, and we make this intuition precise (in both a probabilistic sense and a Baire category sense) in our paper. However, it is often difficult to establish the irrationality of any specific series. For example, it is already a non-trivial result of Erdős that the series {\sum_{k=1}^\infty \frac{1}{2^k-1}} is irrational, while the irrationality of {\sum_{p \hbox{ prime}} \frac{1}{2^p-1}} (equivalent to Erdős problem #69) remains open, although very recently Pratt established this conditionally on the Hardy–Littlewood prime tuples conjecture. Finally, the irrationality of {\sum_n \frac{1}{n!-1}} (Erdős problem #68) is completely open.

On the other hand, it has long been known that if the sequence {a_k} grows faster than {C^{2^k}} for any {C}, then the Ahmes series is necessarily irrational, basically because the fractional parts of {a_1 \dots a_m \sum_{k=1}^\infty \frac{1}{a_k}} can be arbitrarily small positive quantities, which is inconsistent with {\sum_{k=1}^\infty \frac{1}{a_k}} being rational. This growth rate is sharp, as can be seen by iterating the identity {\frac{1}{n} = \frac{1}{n+1} + \frac{1}{n(n+1)}} to obtain a rational Ahmes series of growth rate {(C+o(1))^{2^k}} for any fixed {C>1}.
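
For a concrete instance of that sharpness, one can iterate the displayed identity starting from the remainder {\frac{1}{2}}: keeping the {\frac{1}{n+1}} term at each step and recursing on {\frac{1}{n(n+1)}} gives {\frac{1}{2} = \frac{1}{3} + \frac{1}{7} + \frac{1}{43} + \frac{1}{1807} + \dots}, whose denominators obey {b_{k+1} = b_k^2 - b_k + 1} and hence grow like {(C+o(1))^{2^k}}. The short script below, my own illustration rather than anything from the paper, verifies the first few steps with exact arithmetic.

  from fractions import Fraction

  # Iterate 1/n = 1/(n+1) + 1/(n(n+1)) starting from the remainder 1/2, keeping
  # the 1/(n+1) term at each step. The kept denominators roughly square at every
  # step, while the kept terms plus the remainder always sum to exactly 1/2.
  remainder = 2            # current remainder is 1/remainder
  kept = []
  for _ in range(6):
      kept.append(remainder + 1)
      remainder = remainder * (remainder + 1)

  print(kept)              # [3, 7, 43, 1807, 3263443, 10650056950807]
  total = sum(Fraction(1, b) for b in kept) + Fraction(1, remainder)
  print(total)             # 1/2, exactly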

In our paper we show that if {a_k} grows somewhat slower than the above sequences in the sense that {a_{k+1} = o(a_k^2)}, for instance if {a_k \asymp 2^{(2-\varepsilon)^k}} for a fixed {0 < \varepsilon < 1}, then one can find a comparable sequence {b_k \asymp a_k} for which {\sum_{k=1}^\infty \frac{1}{b_k}} is rational. This partially addresses Erdős problem #263, which asked if the sequence {a_k = 2^{2^k}} had this property, and whether any sequence of exponential or slower growth (but with {\sum_{k=1}^\infty 1/a_k} convergent) had this property. Unfortunately we barely miss a full solution of both parts of the problem, since the condition {a_{k+1} = o(a_k^2)} we need just fails to cover the case {a_k = 2^{2^k}}, and also does not quite hold for all sequences going to infinity at an exponential or slower rate.

We also show the following variant: if {a_k} has exponential growth in the sense that {a_{k+1} = O(a_k)} with {\sum_{k=1}^\infty \frac{1}{a_k}} convergent, then there exist nearby natural numbers {b_k = a_k + O(1)} such that {\sum_{k=1}^\infty \frac{1}{b_k}} is rational. This answers the first part of Erdős problem #264, which asked about the case {a_k = 2^k}, although the second part (which asks about {a_k = k!}) is slightly out of reach of our methods. Indeed, we show that the exponential growth hypothesis is best possible, in the sense that a random sequence {a_k} that grows faster than exponentially will not have this property. However, this result does not address any specific superexponential sequence such as {a_k = k!}, although it does apply to some sequence {a_k} of the shape {a_k = k! + O(\log\log k)}.

Our methods can also handle higher dimensional variants in which multiple series are simultaneously set to be rational. Perhaps the most striking result is this: we can find an increasing sequence {a_k} of natural numbers with the property that {\sum_{k=1}^\infty \frac{1}{a_k + t}} is rational for every rational {t} (excluding the cases {t = - a_k} to avoid division by zero)! This answers (in the negative) a question of Stolarsky (Erdős problem #266), and also reproves Erdős problem #265 (and in the latter case one can even make {a_k} grow double exponentially fast).

Our methods are elementary and avoid any number-theoretic considerations, relying primarily on the countable dense nature of the rationals and an iterative approximation technique. The first observation is that the task of representing a given number {q} as an Ahmes series {\sum_{k=1}^\infty \frac{1}{a_k}} with each {a_k} lying in some interval {I_k} (with the {I_k} disjoint, and going to infinity fast enough to ensure convergence of the series), is possible if and only if the infinite sumset

\displaystyle  \frac{1}{I_1} + \frac{1}{I_2} + \dots

contains {q}, where {\frac{1}{I_k} = \{ \frac{1}{a}: a \in I_k \}}. More generally, to represent a tuple of numbers {(q_t)_{t \in T}} indexed by some set {T} of numbers simultaneously as {\sum_{k=1}^\infty \frac{1}{a_k+t}} with {a_k \in I_k}, this is the same as asking for the infinite sumset

\displaystyle  E_1 + E_2 + \dots

to contain {(q_t)_{t \in T}}, where now

\displaystyle  E_k = \{ (\frac{1}{a+t})_{t \in T}: a \in I_k \}. \ \ \ \ \ (1)

So the main problem is to get control on such infinite sumsets. Here we use a very simple observation:

Proposition 1 (Iterative approximation) Let {V} be a Banach space, let {E_1,E_2,\dots} be sets with each {E_k} contained in the ball of radius {\varepsilon_k>0} around the origin for some {\varepsilon_k} with {\sum_{k=1}^\infty \varepsilon_k} convergent, so that the infinite sumset {E_1 + E_2 + \dots} is well-defined. Suppose that one has some convergent series {\sum_{k=1}^\infty v_k} in {V}, and sets {B_1,B_2,\dots} converging in norm to zero, such that

\displaystyle  v_k + B_k \subset E_k + B_{k+1} \ \ \ \ \ (2)

for all {k \geq 1}. Then the infinite sumset {E_1 + E_2 + \dots} contains {\sum_{k=1}^\infty v_k + B_1}.

Informally, the condition (2) asserts that {E_k} occupies all of {v_k + B_k} “at the scale {B_{k+1}}“.

Proof: Let {w_1 \in B_1}. Our task is to express {\sum_{k=1}^\infty v_k + w_1} as a series {\sum_{k=1}^\infty e_k} with {e_k \in E_k}. From (2) we may write

\displaystyle  \sum_{k=1}^\infty v_k + w_1 = \sum_{k=2}^\infty v_k + e_1 + w_2

for some {e_1 \in E_1} and {w_2 \in B_2}. Iterating this, we may find {e_k \in E_k} and {w_k \in B_k} such that

\displaystyle  \sum_{k=1}^\infty v_k + w_1 = \sum_{k=m+1}^\infty v_k + e_1 + e_2 + \dots + e_m + w_{m+1}

for all {m}. Sending {m \rightarrow \infty}, we obtain

\displaystyle  \sum_{k=1}^\infty v_k + w_1 = e_1 + e_2 + \dots

as required. \Box

In one dimension, sets of the form {\frac{1}{I_k}} are dense enough that the condition (2) can be satisfied in a large number of situations, leading to most of our one-dimensional results. In higher dimension, the sets {E_k} lie on curves in a high-dimensional space, and so do not directly obey usable inclusions of the form (2); however, for suitable choices of intervals {I_k}, one can take some finite sums {E_{k+1} + \dots + E_{k+d}} which will become dense enough to obtain usable inclusions of the form (2) once {d} reaches the dimension of the ambient space, basically thanks to the inverse function theorem (and the non-vanishing curvatures of the curve in question). For the Stolarsky problem, which is an infinite-dimensional problem, it turns out that one can modify this approach by letting {d} grow slowly to infinity with {k}.

March 07, 2025

Matt von HippelCool Asteroid News

Did you hear about the asteroid?

Which one?

You might have heard that an asteroid named 2024 YR4 is going to come unusually close to the Earth in 2032. When it first made the news, astronomers estimated a non-negligible chance of it hitting us: about three percent. That’s small enough that they didn’t expect it to happen, but large enough to plan around it: people invest in startups with a smaller chance of succeeding. Still, people were fairly calm about this one, and there are a couple of good reasons:

  • First, this isn’t a “kill the dinosaurs” asteroid, it’s much smaller. This is a “Tunguska Event” asteroid. Still pretty bad if it happens near a populated area, but not the end of life as we know it.
  • We know about it far in advance, and space agencies have successfully deflected an asteroid before, for a test. If it did pose a risk, it’s quite likely they’d be able to change its path so it misses the Earth instead.
  • It’s tempting to think of that 3% chance as like a roll of a hundred-sided die: the asteroid is on a random path, roll 1 to 3 and it will hit the Earth, roll higher and it won’t, and nothing we do will change that. In reality, though, that 3% was a measure of our ignorance. As astronomers measure the asteroid more thoroughly, they’ll know more and more about its path, and each time they figure something out, they’ll update the number.

And indeed, the number has been updated. In just the last few weeks, the estimated probability of impact has dropped from 3% to a few thousandths of a percent, as more precise observations clarified the asteroid’s path. There’s still a non-negligible chance it will hit the moon (about two percent at the moment), but it’s far too small to do more than make a big flashy crater.

It’s kind of fun to think that there are people out there who systematically track these things, with a plan to deal with them. It feels like something out of a sci-fi novel.

But I find the other asteroid more fun.

In 2020, a probe sent by NASA visited an asteroid named Bennu, taking samples which it carefully packaged and brought back to Earth. Now, scientists have analyzed the samples, revealing several moderately complex chemicals that have an important role in life on Earth, like amino acids and the bases that make up RNA and DNA. Interestingly, while on Earth these molecules all have the same “handedness“, the molecules on Bennu are divided about 50/50. Something similar was seen on samples retrieved from another asteroid, so this reinforces the idea that amino acids and nucleotide bases in space do not have a preferred handedness.

I first got into physics for the big deep puzzles, the ones that figure into our collective creation story. Where did the universe come from? Why are its laws the way they are? Over the ten years since I got my PhD, it’s felt like the answers to these questions have gotten further and further away, with new results serving mostly to rule out possible explanations with greater and greater precision.

Biochemistry has its own deep puzzles figuring into our collective creation story, and the biggest one is abiogenesis: how life formed from non-life. What excites me about these observations from Bennu is that they represent real ongoing progress on that puzzle. By glimpsing a soup of ambidextrous molecules, Bennu tells us something about how our own molecules’ handedness could have developed, and rules out ways that it couldn’t have. In physics, if we could see an era of the universe when there were equal amounts of matter and antimatter, we’d be ecstatic: it would confirm that the imbalance between matter and antimatter is a real mystery, and show us where we need to look for the answer. I love that researchers on the origins of life have reason right now to be similarly excited.

March 06, 2025

John Preskill What does it mean to create a topological qubit?

I’ve worked on topological quantum computation, one of Alexei Kitaev’s brilliant innovations, for around 15 years now.  It’s hard to find a more beautiful physics problem, combining spectacular quantum phenomena (non-Abelian anyons) with the promise of transformative technological advances (inherently fault-tolerant quantum computing hardware).  Problems offering that sort of combination originally inspired me to explore quantum matter as a graduate student. 

Non-Abelian anyons are emergent particles born within certain exotic phases of matter.  Their utility for quantum information descends from three deeply related defining features:

  1. Nucleating a collection of well-separated non-Abelian anyons within a host platform generates a set of quantum states with the same energy (at least to an excellent approximation).  Local measurements give one essentially no information about which of those quantum states the system populates—i.e., any evidence of what the system is doing is hidden from the observer and, crucially, the environment.  In turn, qubits encoded in that space enjoy intrinsic resilience against local environmental perturbations.
  2. Swapping the positions of non-Abelian anyons manipulates the state of the qubits.  Swaps can be enacted either by moving anyons around each other as in a shell game, or by performing a sequence of measurements that yields the same effect.  Exquisitely precise qubit operations follow depending only on which pairs the user swaps and in what order.  Properties (1) and (2) together imply that non-Abelian anyons offer a pathway both to fault-tolerant storage and manipulation of quantum information.
  3. A pair of non-Abelian anyons brought together can “fuse” into multiple different kinds of particles, for instance a boson or a fermion.  Detecting the outcome of such a fusion process provides a method for reading out the qubit states that are otherwise hidden when all the anyons are mutually well-separated.  Alternatively, non-local measurements (e.g., interferometry) can effectively fuse even well-separated anyons, thus also enabling qubit readout.

I entered the field back in 2009 during the last year of my postdoc.  Topological quantum computing—once confined largely to the quantum Hall realm—was then in the early stages of a renaissance driven by an explosion of new candidate platforms as well as measurement and manipulation schemes that promised to deliver long-sought control over non-Abelian anyons.  The years that followed were phenomenally exciting, with broadly held palpable enthusiasm for near-term prospects not yet tempered by the practical challenges that would eventually rear their head. 

A PhD comics cartoon on non-Abelian anyons from 2014.

In 2018, near the height of my optimism, I gave an informal blackboard talk in which I speculated on a new kind of forthcoming NISQ era defined by the birth of a Noisy Individual Semi-topological Qubit.  To less blatantly rip off John Preskill’s famous acronym, I also—jokingly of course—proposed the alternative nomenclature POST-Q (Piece Of S*** Topological Qubit) era to describe the advent of such a device.  The rationale behind those playfully sardonic labels is that the inaugural topological qubit would almost certainly be far from ideal, just as the original transistor appears shockingly crude when compared to modern electronics.  You always have to start somewhere.  But what does it mean to actually create a topological qubit, and how do you tell that you’ve succeeded—especially given likely POST-Q-era performance?

To my knowledge those questions admit no widely accepted answers, despite implications for both quantum science and society.  I would like to propose defining an elementary topological qubit as follows:

A device that leverages non-Abelian anyons to demonstrably encode and manipulate a single qubit in a topologically protected fashion. 

Some of the above words warrant elaboration.  As alluded to above, non-Abelian anyons can passively encode quantum information—a capability that by itself furnishes a quantum memory.  That’s the “encode” part.  The “manipulate” criterion additionally entails exploiting another aspect of what makes non-Abelian anyons special—their behavior under swaps—to enact gate operations.  Both the encoding and manipulation should benefit from intrinsic fault-tolerance, hence the “topologically protected fashion” qualifier.  And very importantly, these features should be “demonstrably” verified.  For instance, creating a device hosting the requisite number of anyons needed to define a qubit does not guarantee the all-important property of topological protection.  Hurdles can still arise, among them: if the anyons are not sufficiently well-separated, then the qubit states will lack the coveted immunity from environmental perturbations; thermal and/or non-equilibrium effects might still induce significant errors (e.g., by exciting the system into other unwanted states); and measurements—for readout and possibly also manipulation—may lack the fidelity required to fruitfully exploit topological protection even if present in the qubit states themselves. 
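
To make the encode-and-manipulate-by-swaps idea concrete, here is a minimal numpy sketch of the standard toy model in which four Majorana modes encode one qubit. It is not tied to any particular platform discussed above; the Jordan–Wigner matrices and the braid formula below are textbook-style choices made purely for illustration.

    import numpy as np

    # Pauli matrices and a 2-qubit Jordan-Wigner representation of four
    # Majorana operators gamma_1..gamma_4 (a standard toy construction).
    I2 = np.eye(2)
    X = np.array([[0, 1], [1, 0]], dtype=complex)
    Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
    Z = np.array([[1, 0], [0, -1]], dtype=complex)
    kron = np.kron

    gamma = [kron(X, I2), kron(Y, I2), kron(Z, X), kron(Z, Y)]

    # Check the Majorana algebra: {gamma_i, gamma_j} = 2 delta_ij.
    for i in range(4):
        for j in range(4):
            anti = gamma[i] @ gamma[j] + gamma[j] @ gamma[i]
            assert np.allclose(anti, 2 * np.eye(4) * (i == j))

    def braid(i, j):
        """Unitary exchanging Majorana modes i and j.
        Since (gamma_i gamma_j)^2 = -1, exp(pi/4 gamma_i gamma_j) = (1 + gamma_i gamma_j)/sqrt(2)."""
        return (np.eye(4) + gamma[i] @ gamma[j]) / np.sqrt(2)

    B12, B23 = braid(0, 1), braid(1, 2)

    # The braids commute with total fermion parity, so they preserve the encoded qubit space.
    parity = -gamma[0] @ gamma[1] @ gamma[2] @ gamma[3]   # equals Z (x) Z in this representation
    assert np.allclose(B12 @ parity, parity @ B12)
    assert np.allclose(B23 @ parity, parity @ B23)

    # Exchanging the same pair twice gives a fixed Pauli-type operation on the encoded qubit,
    # depending only on which pair was swapped.
    print(np.round(B12 @ B12, 3))   # proportional to Z on the encoded qubit
    print(np.round(B23 @ B23, 3))   # proportional to X on the encoded qubit

The payoff of the last two lines is the content of property (2): the operation enacted on the encoded qubit depends only on which pair of modes was exchanged, not on the microscopic details of how the exchange was carried out.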

The preceding discussion raises a natural follow-up question: How do you verify topological protection in practice?  One way forward involves probing qubit lifetimes, and fidelities of gates resulting from anyon swaps, upon varying some global control knob like magnetic field or gate voltage.  As the system moves deeper into the phase of matter hosting non-Abelian anyons, both the lifetime and gate fidelities ought to improve dramatically—reflecting the onset of bona fide topological protection.  First-generation “semi-topological” devices will probably fare modestly at best, though one can at least hope to recover general trends in line with this expectation. 

By the above proposed definition, which I contend is stringent yet reasonable, realization of a topological qubit remains an ongoing effort.  Fortunately the journey to that end offers many significant science and engineering milestones worth celebrating in their own right.  Examples include:

Platform verification.  This most indirect milestone evidences the formation of a non-Abelian phase of matter through (thermal or charge) Hall conductance measurements, detection of some anticipated quantum phase transition, etc. 

Detection of non-Abelian anyons. This step could involve conductance, heat capacity, magnetization, or other types of measurements designed to support the emergence of either individual anyons or a collection of anyons.  Notably, such techniques need not reveal the precise quantum state encoded by the anyons—which presents a subtler challenge. 

Establishing readout capabilities. Here one would demonstrate experimental techniques, interferometry for example, that in principle can address that key challenge of quantum state readout, even if not directly applied yet to a system hosting non-Abelian anyons. 

Fusion protocols.  Readout capabilities open the door to more direct tests of the hallmark behavior predicted for a putative topological qubit.  One fascinating experiment involves protocols that directly test non-Abelian anyon fusion properties.  Successful implementation would solidify readout capabilities applied to an actual candidate topological qubit device. 

Probing qubit lifetimes.  Fusion protocols further pave the way to measuring the qubit coherence times, e.g., T_1 and T_2—addressing directly the extent of topological protection of the states generated by non-Abelian anyons.  Behavior clearly conforming to the trends highlighted above could certify the device as a topological quantum memory.  (Personally, I most anxiously await this milestone.)

Fault-tolerant gates from anyon swaps.  Likely the most advanced milestone, successfully implementing anyon swaps, again with appropriate trends in gate fidelity, would establish the final component of an elementary topological qubit. 

Most experiments to date focus on the first two items above, platform verification and anyon detection.  Microsoft’s recent Nature paper, together with the simultaneous announcement of supplementary new results, combine efforts in those areas with experiments aiming to establish interferometric readout capabilities needed for a topological qubit.  Fusion, (idle) qubit lifetime measurements, and anyon swaps have yet to be demonstrated in any candidate topological quantum computing platform, but at least partially feature in Microsoft’s future roadmap.  It will be fascinating to see how that effort evolves, especially given the aggressive timescales predicted by Microsoft for useful topological quantum hardware.  Public reactions so far range from cautious optimism to ardent skepticism; data will hopefully settle the situation one way or another in the near future.  My own take is that while Microsoft’s progress towards qubit readout is a welcome advance that has value regardless of the nature of the system to which those techniques are currently applied, convincing evidence of topological protection may still be far off. 

In the meantime, I maintain the steadfast conviction that topological qubits are most certainly worth pursuing—in a broad range of platforms.  Non-Abelian quantum Hall states seem to be resurgent candidates, and should not be discounted.  Moreover, the advent of ultra-pure, highly tunable 2D materials provides new settings in which one can envision engineering non-Abelian anyon devices with complementary advantages (and disadvantages) compared to previously explored settings.  Other less obvious contenders may also rise at some point.  The prospect of discovering new emergent phenomena mitigating the need for quantum error correction warrants continued effort with an open mind.

March 05, 2025

n-Category Café How Good are Permutation Representations?

Any action of a finite group G on a finite set X gives a linear representation of G on the vector space with basis X. This is called a ‘permutation representation’. And this raises a natural question: how many representations of finite groups are permutation representations?

Most representations are not permutation representations, since every permutation representation has a vector fixed by all elements of G, namely the vector that’s the sum of all elements of X. In other words, every permutation representation has a 1-dimensional trivial rep sitting inside it.
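
Here is a quick numerical sanity check of that observation, using the symmetric group S_3 acting on a 3-element set (a choice made just for illustration): the all-ones vector is fixed by every permutation matrix, and the character computation confirms that the trivial representation sits inside exactly once.

    import itertools
    import numpy as np

    n = 3
    # S_3 acting on {0, 1, 2}: each permutation p becomes the matrix sending e_i to e_{p(i)}.
    def perm_matrix(p):
        M = np.zeros((n, n))
        for i, j in enumerate(p):
            M[j, i] = 1.0
        return M

    group = [perm_matrix(p) for p in itertools.permutations(range(n))]

    # The sum of the basis vectors is fixed by every group element.
    ones = np.ones(n)
    for M in group:
        assert np.allclose(M @ ones, ones)

    # Multiplicity of the trivial representation = average of the traces.
    print(sum(np.trace(M) for M in group) / len(group))   # prints 1.0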

But what if we could ‘subtract off’ this trivial representation?

There are different levels of subtlety with which we can do this. For example, we can decategorify, and let:

  • the Burnside ring of G be the ring A(G) of formal differences of isomorphism classes of actions of G on finite sets;

  • the representation ring of G be the ring R(G) of formal differences of isomorphism classes of finite-dimensional representations of G.

In either of these rings, we can subtract.

There’s an obvious map \beta : A(G) \to R(G), since any action of G on a finite set X gives a permutation representation of G on the vector space with basis X.

So I asked on MathOverflow: is \beta typically surjective, or typically not surjective?

In fact everything depends on what field k we’re using for our vector spaces! For starters let’s take k = \mathbb{Q}.

Here’s a list of finite groups where the map \beta : A(G) \to R(G) from the Burnside ring to the representation ring is known to be surjective, taken from the nLab article Permutation representations:

  1. cyclic groups,

  2. symmetric groups,

  3. p-groups (that is, groups whose order is a power of the prime p),

  4. binary dihedral groups 2 D_{2n} for (at least) 2n \leq 12,

  5. the binary tetrahedral group, binary octahedral group, and binary icosahedral group,

  6. the general linear group GL(2,\mathbb{F}_3).

Now, these may seem like rather special classes of groups, but devoted readers of this blog know that most finite groups have order that’s a power of 2. I don’t think this has been proved yet, but it’s obviously true empirically, and we also have a good idea as to why.

So, the map \beta from the Burnside ring to the representation ring is surjective for most finite groups!

David Benson has listed the orders for which there are groups where \beta is not surjective, and how many such groups there are of these orders:

24: 2,  40: 2,  48: 7,  56: 1,  60: 2,  72: 8,  80: 8,  84: 1,  88: 2,  96: 45,
104: 2,  112: 5,  120: 13,  132: 1,  136: 3,  140: 2,  144: 39,  152: 2,  156: 1,  160: 12,
168: 12,  171: 1,  176: 7,  180: 6,  184: 1,  192: 423,  200: 8,  204: 2,  208: 8,  216: 35,
220: 1,  224: 28,  228: 1,  232: 2,  240: 90,  248: 1,  252: 4,  260: 2,  264: 10,  272: 12,
276: 1,  280: 12,  288: 256,  296: 2,  300: 8,  304: 7,  308: 1,  312: 16,  320: 532,  328: 3,
333: 1,  336: 76,  340: 2,  342: 2,  344: 2,  348: 2,  352: 41,  360: 51,  364: 2,  368: 5,
372: 1,  376: 1,  380: 2.

He adds:

This seems to represent quite a small proportion of all finite groups, counted by order, even ignoring the p-groups.

All this is if we work over \mathbb{Q}. If we work over \mathbb{C} the situation flip-flops, and I believe \beta is usually not surjective. It already fails to be surjective for cyclic groups bigger than \mathbb{Z}/2!

Why? Because \mathbb{Z}/n has a 1-dimensional representation where the generator acts as multiplication by a primitive nth root of unity, and since this is not a rational number when n > 2, this representation is not definable over \mathbb{Q}. Thus, one can show, there’s no way to get this representation as a formal difference of permutation representations, since those are always definable over \mathbb{Q}.
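
Here is a small numerical illustration of that obstruction for n = 3 (an example chosen just for this post, not from the sources above). The only transitive \mathbb{Z}/3-sets are a point and \mathbb{Z}/3 itself, so every virtual permutation character is an integer combination of their characters; a short computation shows the faithful 1-dimensional character is not even a complex linear combination of them.

    import numpy as np

    # Characters of Z/3, listed as values on the elements (1, g, g^2).
    omega = np.exp(2j * np.pi / 3)                         # primitive cube root of unity
    chi_point = np.array([1, 1, 1], dtype=complex)         # Z/3 acting on a point
    chi_regular = np.array([3, 0, 0], dtype=complex)       # Z/3 acting on itself
    chi_faithful = np.array([1, omega, omega**2])          # generator acts by omega

    # Any combination a*chi_point + b*chi_regular takes equal values on g and g^2,
    # but chi_faithful takes the distinct values omega and omega^2 there.
    A = np.column_stack([chi_point, chi_regular])
    coeffs, *_ = np.linalg.lstsq(A, chi_faithful, rcond=None)
    residual = np.linalg.norm(A @ coeffs - chi_faithful)
    print("best least-squares coefficients:", np.round(coeffs, 3))
    print("residual:", np.round(residual, 3))   # nonzero, so not even a complex combination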

And this phenomenon of needing roots of unity to define representations is not special to cyclic groups: it happens for most finite groups as hinted at by Artin’s theorem on induced characters.

An example

Now, you’ll notice that I didn’t yet give an example of a finite group where the map \beta from the Burnside ring to the representation ring fails to be surjective when we work over \mathbb{Q}.

In Serre’s book Linear Representations of Finite Groups, he asks us in Exercise 13.4 on page 105 to prove that \mathbb{Z}/3 \times Q_8 is such a group. Here Q_8 is the quaternion group, an 8-element subgroup of the invertible quaternions:

Q_8 = \{ \pm 1, \pm i, \pm j, \pm k \}

I couldn’t resist trying to understand why this is the counterexample Serre gave. First of all, \mathbb{Z}/3 \times Q_8 looks like a pretty weird group — why does it work, and why did Serre choose this one? Secondly, it has 24 elements, and I love the number 24. Thirdly, I love the quaternions.

Benson’s calculations show that the two smallest groups where \beta is non-surjective have 24 elements. Drew Heard checked that these groups are \mathbb{Z}/3 \times Q_8 and the nontrivial semidirect product \mathbb{Z}/3 \rtimes \mathbb{Z}/8. But it’s still very interesting to do Serre’s exercise 13.4 and see why \mathbb{Z}/3 \times Q_8 is such a group.

To do this exercise, Serre asks us to use exercise 13.3. Here is my reformulation and solution of that problem. I believe any field k of characteristic zero would work for this theorem:

Theorem. Suppose G is a finite group with a linear representation \rho such that:

  1. \rho is irreducible and faithful,

  2. every subgroup of G is normal,

  3. \rho appears with multiplicity n \ge 2 in the regular representation of G.

Then the map from the Burnside ring of G to the representation ring R(G) of G is not surjective.

Proof. It suffices to prove that the multiplicity of \rho in any permutation representation of G is a multiple of n, so that the class [\rho] \in R(G) cannot be in the image of \beta.

Since every finite G-set is a coproduct of transitive actions of G, which are isomorphic to actions on G/H for subgroups H of G, every permutation representation of G is a direct sum of those on spaces of the form k[G/H]. (This is my notation for the vector space with basis G/H.) Thus, it suffices to show that the multiplicity of \rho in the representation on k[G/H] is n if H is the trivial group, and 0 otherwise.

The former holds by assumption 3. For the latter, suppose H is a nontrivial subgroup of G. Because H is normal by assumption 2, every element h \in H acts trivially on k[G/H]: we can see this by letting h act on an arbitrary basis element g H = H g \in G/H:

h H g = H g .

Since H is nontrivial, it contains elements h \ne 1 that act trivially on k[G/H]. But no h \ne 1 can act trivially on \rho because \rho is faithful, by assumption 1. Thus \rho cannot be a subrepresentation of k[G/H]. That is, \rho appears with multiplicity 0 in k[G/H].   ▮

Serre’s exercise 13.4 is to show the group G = \mathbb{Z}/3 \times Q_8 obeys the conditions of this theorem. As a hint, Serre suggests embedding \mathbb{Z}/3 and Q_8 in the multiplicative group of the algebra \mathbb{H}_{\mathbb{Q}} (the quaternions defined over \mathbb{Q}). By letting \mathbb{Z}/3 act by left multiplication and Q_8 act by right multiplication, one obtains a 4-dimensional irreducible representation \rho of G which appears with multiplicity n = 2 in the regular representation. Furthermore \rho is faithful and irreducible. Finally, every subgroup of G is normal, because that’s true of \mathbb{Z}/3 and Q_8 — and since the orders of these groups are relatively prime, every subgroup of G = \mathbb{Z}/3 \times Q_8 is a product of a subgroup of \mathbb{Z}/3 and a subgroup of Q_8.

More work on the question

Alex Bartel wrote:

To add to what others have said: you may be interested in this paper “Rational representations and permutation representations of finite groups” of mine with Tim Dokchitser, which, among other things, contains an overview of what is (or was in 2016) known about the question, describes the algorithm that is used in the magma programme (implemented by Tim Dokchitser) that Drew Heard mentions in his answer, gives a closed form formula for the cokernel in quasi-elementary groups, and exhibits a family of finite simple groups with unbounded cokernel, in fact even unbounded exponent of cokernel (to our knowledge, in all finite simple groups for which one had known this cokernel up to that point, the cokernel was trivial).

The particular interest in quasi-elementary groups stems from the fact that one can reduce the study of this cokernel in arbitrary finite groups to that of its quasi-elementary subgroups and the induction-restriction maps between them. Observations of this nature have been made repeatedly over the years. In our paper we make this reduction very explicit.

Edit. In response to a question in a comment thread, and as an example of an interesting quasi-elementary group: the other group of order 24 with non-trivial cokernel is C_3 \rtimes C_8 with the unique non-trivial action. The cokernel of \beta here comes about in the following way: there is an index 2 subgroup C_3 \times C_4 \cong C_{12}. Take a faithful irreducible character of that, and induce up to the whole group. The resulting 2-dimensional representation of C_3 \rtimes C_8 has field of character values \mathbb{Q}(i): the cube roots of unity have “disappeared” in the course of the induction. It turns out that the sum of this degree 2 character and its Galois conjugate is realisable over \mathbb{Q}, but that it is not a virtual permutation representation. This is an illustration of Theorem 1.1 in my paper with Tim: \rho is 4-dimensional, \hat\psi is the 2-dimensional irreducible rational representation of C_3, the sum of the two faithful complex irreducible characters, and \hat{\pi} is the unique 4-dimensional irreducible rational representation of C_8, the sum of all its faithful complex irreducible characters.

Scott Aaronson Jacob Barandes and Me

Please enjoy Harvard’s Jacob Barandes and yours truly duking it out for 2.5 hours on YouTube about the interpretation of quantum mechanics, and specifically Jacob’s recent proposal involving “indivisible stochastic dynamics,” with Curt Jaimungal as moderator. As always, I strongly recommend watching with captions turned on and at 2X speed.

To summarize what I learned in one paragraph: just like in Bohmian mechanics, Jacob wants classical trajectories for particles, which are constructed so as to reproduce the predictions of QM perfectly. But unlike the Bohmians, Jacob doesn’t want to commit to any particular rule for the evolution of those particle trajectories. He merely asserts, metaphysically, that the trajectories exist. My response was basically, “OK fine, you can do that if you want, but what does it buy me?” We basically went around in circles on that question the entire time, though hopefully with many edutaining digressions.

Despite the lack of resolution, I felt pretty good about the conversation afterward: Jacob got an extensive opportunity to explain his ideas to listeners, along with his detailed beefs against both the Many-Worlds and Copenhagen interpretations. Meanwhile, even though I spoke less than Jacob, I did get some opportunities to do my job, pushing back and asking the kinds of questions I imagined most physicists would ask (even though I’m not a physicist, I felt compelled to represent them!). Jacob and I ended the conversation much as we began: disagreeing on extremely friendly terms.

Then, alas, I read the comments on YouTube and got depressed. Apparently, I’m a hidebound academic elitist who’s failed to grasp Jacob’s revolutionary, paradigm-smashing theory, and who kept arrogantly interrupting with snide, impertinent questions (“OK, but what can I do with this theory that I couldn’t do before?”). And, I learned, the ultimate proof of my smug, ivory-tower malice was to be found in my body language, the way I constantly smiled nervously and rocked back and forth. I couldn’t help but wonder: have these people watched any other YouTube videos that I’m in? I don’t get to pick how I look and sound. I came out of the factory this way.

One commenter opined that I must hate Jacob’s theory only because I’ve poured my life into quantum computing, which depends on superposition, the confusing concept that Jacob has now unmasked as a farce. Presumably it’s beyond this person’s comprehension that Jacob makes exactly the same predictions as I make for what a quantum computer will do when built; Jacob just prefers a different way of talking about it.

I was reminded that optimizing for one’s scientific colleagues is wildly different from optimizing for YouTube engagement. In science, it’s obvious to everyone that the burden of proof is on whoever is presenting the new idea—and that this burden is high (albeit not infinitely so), especially with anything as well-trodden and skull-strewn as the foundations of quantum mechanics. The way the game works is: other people try as hard as they can to shoot the new idea down, so we see how it fares under duress. This is not a sign of contempt for new ideas, but of respect for them.

On YouTube, the situation is precisely reversed. There, anyone perceived as the “mainstream establishment” faces a near-insurmountable burden of proof, while anyone perceived as “renegade” wins by default if they identify any hole whatsoever in mainstream understanding. Crucially, the renegade’s own alternative theories are under no particular burden; indeed, the details of their theories are not even that important or relevant. I don’t want to list science YouTubers who’ve learned to exploit that dynamic masterfully, though I’m told one rhymes with “Frabine Schlossenfelder.” Of course this mirrors what’s happened in the wider world, where RFK Jr. now runs American health policy, Tulsi Gabbard runs the intelligence establishment, and other conspiracy theorists have at last fired all the experts and taken control of our civilization, and are eagerly mashing the buttons to see what happens. I’d take Jacob Barandes, or even Sabine, a billion times over the lunatics in power. But I do hope Jacob turns out to be wrong about Many-Worlds, because it would give me solace to know that there are other branches of the wavefunction where things are a little more sane.

March 04, 2025

Tommaso Dorigo Summer Lectures In AI

Winter is not over yet, but I am already busy finalizing the details of some conferences, schools, and lectures I will give around Europe this summer. Here I wish to summarize them, in the hope of arousing the interest of some of you in the relevant events I will attend.


March 03, 2025

Clifford Johnson Valuable Instants

This week’s lectures on instantons in my gauge theory class (a very important kind of theory for understanding many phenomena in nature – light is an example of a phenomenon that is described by gauge theory) were a lot of fun to do, and mark the culmination of a month-long …


February 28, 2025

Matt von Hippel Some FAQ for Microsoft’s Majorana 1 Chip

Recently, Microsoft announced a fancy new quantum computing chip called Majorana 1. I’ve noticed quite a bit of confusion about what they actually announced, and while there’s a great FAQ page about it on the quantum computing blog Shtetl-Optimized, the post there aims at a higher level, assuming you already know the basics. You can think of this post as a complement to that one, one that tries to cover some basic things Shtetl-Optimized took for granted.

Q: In the announcement, Microsoft said:

“It leverages the world’s first topoconductor, a breakthrough type of material which can observe and control Majorana particles to produce more reliable and scalable qubits, which are the building blocks for quantum computers.”

That sounds wild! Are they really using particles in a computer?

A: All computers use particles. Electrons are particles!

Q: You know what I mean!

A: You’re asking if these are “particle physics” particles, like the weird types they try to observe at the LHC?

No, they’re not.

Particle physicists use a mathematical framework called quantum field theory, where particles are ripples in things called quantum fields that describe properties of the universe. But they aren’t the only people to use that framework. Instead of studying properties of the universe you can study properties of materials, weird alloys and layers of metal and crystal that do weird and useful things. The properties of these materials can be approximately described with the same math, with quantum fields. Just as the properties of the universe ripple to produce particles, these properties of materials ripple to produce what are called quasiparticles. Ultimately, these quasiparticles come down to movements of ordinary matter, usually electrons in the original material. They’re just described with a kind of math that makes them look like their own particles.

Q: So, what are these Majorana particles supposed to be?

A: In quantum field theory, most particles come with an antimatter partner. Electrons, for example, have partners called positrons, with a positive electric charge instead of a negative one. These antimatter partners have to exist due to the math of quantum field theory, but there is a way out: some particles are their own antimatter partner, letting one particle cover both roles. This happens for some “particle physics particles”, but all the examples we’ve found are a type of particle called a “boson”, particles related to forces. In 1937, the physicist Ettore Majorana figured out the math you would need to make a particle like this that was a fermion instead, the other main type of particle that includes electrons and protons.

So far, we haven’t found one of these Majorana fermions in nature, though some people think the elusive neutrino particles could be an example. Others, though, have tried instead to find a material described by Majorana’s theory. This should in principle be easier; you can build a lot of different materials, after all. But it’s proven quite hard for people to do. Back in 2018, Microsoft claimed they’d managed this, but had to retract the claim. This time, they seem more confident, though the scientific community is still not convinced.

Q: And what’s this topoconductor they’re talking about?

A: Topoconductor is short for topological superconductor. Superconductors are materials that, when cold enough, conduct electricity with zero resistance, much better than ordinary metals.

Q: And, topological means? Something about donuts, right?

A: If you’ve heard anything about topology, you’ve heard that it’s a type of mathematics where donuts are equivalent to coffee cups. You might have seen an animation of a coffee cup being squished and mushed around until the ring of the handle becomes the ring of a donut.

This isn’t actually the important part of topology. The important part is that, in topology, a ball is not equivalent to a donut.

Topology is the study of which things can change smoothly into one another. If you want to change a donut into a ball, you have to slice through the donut’s ring or break the surface inside. You can’t smoothly change one to another. Topologists study shapes of different kinds of things, figuring out which ones can be changed into each other smoothly and which can’t.

Q: What does any of that have to do with quantum computers?

A: The shapes topologists study aren’t always as simple as donuts and coffee cups. They can also study the shape of quantum fields, figuring out which types of quantum fields can change smoothly into each other and which can’t.

The idea of topological quantum computation is to use those rules about what can change into each other to encode information. You can imagine a ball encoding zero, and a donut encoding one. A coffee cup would then also encode one, because it can change smoothly into a donut, while a box would encode zero because you can squash the corners to make it a ball. This helps, because it means that you don’t screw up your information by making smooth changes. If you accidentally drop your box that encodes zero and squish a corner, it will still encode zero.

This matters in quantum computing because it is very easy to screw up quantum information. Quantum computers are very delicate, and making them work reliably has been immensely challenging, requiring people to build much bigger quantum computers so they can do each calculation with many redundant backups. The hope is that topological superconductors would make this easier, by encoding information in a way that is hard to accidentally change.

Q: Cool. So does that mean Microsoft has the best quantum computer now?

A: The machine Microsoft just announced has only a single qubit, the quantum equivalent of just a single bit of computer memory. At this point, it can’t do any calculations. It can just be read, giving one or zero. The hope is that the power of the new method will let Microsoft catch up with companies that have computers with hundreds of qubits, and help them arrive faster at the millions of qubits that will be needed to do anything useful.

Q: Ah, ok. But it sounds like they accomplished some crazy Majorana stuff at least, right?

A: Umm…

Read the Shtetl-Optimized FAQ if you want more details. The short answer is that this is still controversial. So far, the evidence they’ve made public isn’t enough to show that they found these Majorana quasiparticles, or that they made a topological superconductor. They say they have more recent evidence that they haven’t published yet. We’ll see.

February 26, 2025

Terence Tao The three-dimensional Kakeya conjecture, after Wang and Zahl

There has been some spectacular progress in geometric measure theory: Hong Wang and Joshua Zahl have just released a preprint that resolves the three-dimensional case of the infamous Kakeya set conjecture! This conjecture asserts that a Kakeya set – a subset of {{\bf R}^3} that contains a unit line segment in every direction – must have Minkowski and Hausdorff dimension equal to three. (There is also a stronger “maximal function” version of this conjecture that remains open at present, although the methods of this paper will give some non-trivial bounds on this maximal function.) It is common to discretize this conjecture in terms of small scale {0 < \delta < 1}. Roughly speaking, the conjecture then asserts that if one has a family {\mathbb{T}} of {\delta \times \delta \times 1} tubes of cardinality {\approx \delta^{-2}}, and pointing in a {\delta}-separated set of directions, then the union {\bigcup_{T \in \mathbb{T}} T} of these tubes should have volume {\approx 1}. Here we shall be a little vague as to what {\approx} means here, but roughly one should think of this as “up to factors of the form {O_\varepsilon(\delta^{-\varepsilon})} for any {\varepsilon>0}”; in particular this notation can absorb any logarithmic losses that might arise for instance from a dyadic pigeonholing argument. For technical reasons (including the need to invoke the aforementioned dyadic pigeonholing), one actually works with slightly smaller sets {\bigcup_{T \in \mathbb{T}} Y(T)}, where {Y} is a “shading” of the tubes in {\mathbb{T}} that assigns a large subset {Y(T)} of {T} to each tube {T} in the collection; but for this discussion we shall ignore this subtlety and pretend that we can always work with the full tubes.

Previous results in this area tended to center around lower bounds of the form

\displaystyle  |\bigcup_{T \in \mathbb{T}} T| \gtrapprox \delta^{3-d} \ \ \ \ \ (1)

for various intermediate dimensions {0 < d < 3}, that one would like to make as large as possible. For instance, just from considering a single tube in this collection, one can easily establish (1) with {d=1}. By just using the fact that two lines in {{\bf R}^3} intersect in a point (or more precisely, a more quantitative estimate on the volume of the intersection of two {\delta \times \delta \times 1} tubes, based on the angle of intersection), combined with a now classical {L^2}-based argument of Córdoba, one can obtain (1) with {d=2} (and this type of argument also resolves the Kakeya conjecture in two dimensions). In 1995, building on earlier work by Bourgain, Wolff famously obtained (1) with {d=2.5} using what is now known as the “Wolff hairbrush argument”, based on considering the size of a “hairbrush” – the union of all the tubes that pass through a single tube (the hairbrush “stem”) in the collection.
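
To give a flavor of these arguments, here is a back-of-the-envelope version of the Córdoba-type {L^2} computation just mentioned (an informal reconstruction, with constants and logarithmic factors suppressed). By Cauchy–Schwarz,

\displaystyle  |\bigcup_{T \in \mathbb{T}} T| \geq \frac{(\sum_{T \in \mathbb{T}} |T|)^2}{\int (\sum_{T \in \mathbb{T}} 1_T)^2} = \frac{(\sum_{T \in \mathbb{T}} |T|)^2}{\sum_{T, T' \in \mathbb{T}} |T \cap T'|}.

Two tubes making an angle {\theta} intersect in a set of volume {\lessapprox \delta^2 \min(1, \delta/\theta)}, and a given tube has {\approx 2^{2k}} tubes in the family whose direction lies within {2^k \delta} of its own, so summing dyadically over the angle,

\displaystyle  \sum_{T, T' \in \mathbb{T}} |T \cap T'| \lessapprox \delta^{-2} \sum_{1 \leq 2^k \lessapprox 1/\delta} 2^{2k} \cdot \delta^2 \cdot 2^{-k} \lessapprox \delta^{-1},

while {\sum_{T \in \mathbb{T}} |T| \approx \delta^{-2} \cdot \delta^2 = 1}. Combining the two bounds gives {|\bigcup_{T \in \mathbb{T}} T| \gtrapprox \delta = \delta^{3-2}}, which is (1) with {d=2}.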

In their new paper, Wang and Zahl established (1) for {d=3}. The proof is lengthy (127 pages!), and relies crucially on their previous paper establishing a key “sticky” case of the conjecture. Here, I thought I would try to summarize the high level strategy of proof, omitting many details and also oversimplifying the argument at various places for sake of exposition. The argument does use many ideas from previous literature, including some from my own papers with co-authors; but the case analysis and iterative schemes required are remarkably sophisticated and delicate, with multiple new ideas needed to close the full argument.

A natural strategy to prove (1) would be to try to induct on {d}: if we let {K(d)} represent the assertion that (1) holds for all configurations of {\approx \delta^{-2}} tubes of dimensions {\delta \times \delta \times 1}, with {\delta}-separated directions, we could try to prove some implication of the form {K(d) \implies K(d + \alpha)} for all {0 < d < 3}, where {\alpha>0} is some small positive quantity depending on {d}. Iterating this, one could hope to get {d} arbitrarily close to {3}.

A general principle with these sorts of continuous induction arguments is to first obtain the trivial implication {K(d) \implies K(d)} in a non-trivial fashion, with the hope that this non-trivial argument can somehow be perturbed or optimized to get the crucial improvement {K(d) \implies K(d+\alpha)}. The standard strategy for doing this, since the work of Bourgain and then Wolff in the 1990s (with precursors in older work of Córdoba), is to perform some sort of “induction on scales”. Here is the basic idea. Let us call the {\delta \times \delta \times 1} tubes {T} in {\mathbb{T}} “thin tubes”. We can try to group these thin tubes into “fat tubes” of dimension {\rho \times \rho \times 1} for some intermediate scale {\delta \ll \rho \ll 1}; it is not terribly important for this sketch precisely what intermediate value is chosen here, but one could for instance set {\rho = \sqrt{\delta}} if desired. Because of the {\delta}-separated nature of the directions in {\mathbb{T}}, there can only be at most {\lessapprox (\rho/\delta)^{2}} thin tubes in a given fat tube, and so we need at least {\gtrapprox \rho^{-2}} fat tubes to cover the {\approx \delta^{-2}} thin tubes. Let us suppose for now that we are in the “sticky” case where the thin tubes stick together inside fat tubes as much as possible, so that there are in fact a collection {\mathbb{T}_\rho} of {\approx \rho^{-2}} fat tubes {T_\rho}, with each fat tube containing about {\approx (\rho/\delta)^{2}} of the thin tubes. Let us also assume that the fat tubes {T_\rho} are {\rho}-separated in direction, which is an assumption which is highly consistent with the other assumptions made here.

If we already have the hypothesis {K(d)}, then by applying it at scale {\rho} instead of {\delta} we conclude a lower bound on the volume occupied by fat tubes:

\displaystyle  |\bigcup_{T_\rho \in \mathbb{T}_\rho} T_\rho| \gtrapprox \rho^{3-d}.

Since {\sum_{T_\rho \in \mathbb{T}_\rho} |T_\rho| \approx \rho^{-2} \rho^2 = 1}, this morally tells us that the typical multiplicity {\mu_{fat}} of the fat tubes is {\lessapprox \rho^{d-3}}; a typical point in {\bigcup_{T_\rho \in \mathbb{T}_\rho} T_\rho} should belong to about {\mu_{fat} \lessapprox \rho^{d-3}} fat tubes.
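
Spelling out this heuristic, the typical multiplicity is just the total measure of the fat tubes divided by the measure of their union:

\displaystyle  \mu_{fat} \approx \frac{\sum_{T_\rho \in \mathbb{T}_\rho} |T_\rho|}{|\bigcup_{T_\rho \in \mathbb{T}_\rho} T_\rho|} \lessapprox \frac{1}{\rho^{3-d}} = \rho^{d-3}.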

Now, inside each fat tube {T_\rho}, we are assuming that we have about {\approx (\rho/\delta)^{2}} thin tubes that are {\delta}-separated in direction. If we perform a linear rescaling around the axis of the fat tube by a factor of {1/\rho} to turn it into a {1 \times 1 \times 1} tube, this would inflate the thin tubes to be rescaled tubes of dimensions {\delta/\rho \times \delta/\rho \times 1}, which would now be {\approx \delta/\rho}-separated in direction. This rescaling does not affect the multiplicity of the tubes. Applying {K(d)} again, we see morally that the multiplicity {\mu_{fine}} of the rescaled tubes, and hence the thin tubes inside {T_\rho}, should be {\lessapprox (\delta/\rho)^{d-3}}.

We now observe that the multiplicity {\mu} of the full collection {\mathbb{T}} of thin tubes should morally obey the inequality

\displaystyle  \mu \lessapprox \mu_{fat} \mu_{fine}, \ \ \ \ \ (2)

since if a given point lies in at most {\mu_{fat}} fat tubes, and within each fat tube a given point lies in at most {\mu_{fine}} thin tubes in that fat tube, then it should only be able to lie in at most {\mu_{fat} \mu_{fine}} tubes overall. This heuristically gives {\mu \lessapprox \rho^{d-3} (\delta/\rho)^{d-3} = \delta^{d-3}}, which then recovers (1) in the sticky case.
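
To make the last step explicit: since the thin tubes have total measure {\sum_{T \in \mathbb{T}} |T| \approx \delta^{-2} \cdot \delta^2 = 1}, a multiplicity bound of {\mu \lessapprox \delta^{d-3}} yields

\displaystyle  |\bigcup_{T \in \mathbb{T}} T| \gtrapprox \frac{\sum_{T \in \mathbb{T}} |T|}{\mu} \gtrapprox \delta^{3-d},

which is (1).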

In their previous paper, Wang and Zahl were roughly able to squeeze a little bit more out of this argument to get something resembling {K(d) \implies K(d+\alpha)} in the sticky case, loosely following a strategy of Nets Katz and myself that I discussed in this previous blog post from over a decade ago. I will not discuss this portion of the argument further here, referring the reader to the introduction to that paper; instead, I will focus on the arguments in the current paper, which handle the non-sticky case.

Let’s try to repeat the above analysis in a non-sticky situation. We assume {K(d)} (or some suitable variant thereof), and consider some thickened Kakeya set

\displaystyle  E = \bigcup_{T \in {\mathbb T}} T

where {{\mathbb T}} is something resembling what we might call a “Kakeya configuration” at scale {\delta}: a collection of {\delta^{-2}} thin tubes of dimension {\delta \times \delta \times 1} that are {\delta}-separated in direction. (Actually, to make the induction work, one has to consider a more general family of tubes than these, satisfying some standard “Wolff axioms” instead of the direction separation hypothesis; but we will gloss over this issue for now.) Our goal is to prove something like {K(d+\alpha)} for some {\alpha>0}, which amounts to obtaining some improved volume bound

\displaystyle  |E| \gtrapprox \delta^{3-d-\alpha}

that improves upon the bound {|E| \gtrapprox \delta^{3-d}} coming from {K(d)}. From the previous paper we know we can do this in the “sticky” case, so we will assume that {E} is “non-sticky” (whatever that means).

A typical non-sticky setup is when there are now {m \rho^{-2}} fat tubes for some multiplicity {m \ggg 1} (e.g., {m = \delta^{-\eta}} for some small constant {\eta>0}), with each fat tube containing only {m^{-1} (\delta/\rho)^{-2}} thin tubes. Now we have an unfortunate imbalance: the fat tubes form a “super-Kakeya configuration”, with too many tubes at the coarse scale {\rho} for them to be all {\rho}-separated in direction, while the thin tubes inside a fat tube form a “sub-Kakeya configuration” in which there are not enough tubes to cover all relevant directions. So one cannot apply the hypothesis {K(d)} efficiently at either scale.

This looks like a serious obstacle, so let’s change tack for a bit and think of a different way to try to close the argument. Let’s look at how {E} intersects a given {\rho}-ball {B(x,\rho)}. The hypothesis {K(d)} suggests that {E} might behave like a {d}-dimensional fractal (thickened at scale {\delta}), in which case one might be led to a predicted size of {E \cap B(x,\rho)} of the form {(\rho/\delta)^d \delta^3}. Suppose for sake of argument that the set {E} was denser than this at this scale, for instance we have

\displaystyle  |E \cap B(x,\rho)| \gtrapprox (\rho/\delta)^d \delta^{3-\alpha} \ \ \ \ \ (3)

for all {x \in E} and some {\alpha>0}. Observe that the {\rho}-neighborhood of {E} is basically {\bigcup_{T_\rho \in {\mathbb T}_\rho} T_\rho}, and thus has volume {\gtrapprox \rho^{3-d}} by the hypothesis {K(d)} (indeed we would even expect some gain in {m}, but we do not attempt to capture such a gain for now). Since {\rho}-balls have volume {\approx \rho^3}, this should imply that {E} needs about {\gtrapprox \rho^{-d}} balls to cover it. Applying (3), we then heuristically have

\displaystyle  |E| \gtrapprox \rho^{-d} \times (\rho/\delta)^d \delta^{3-\alpha} = \delta^{3-d-\alpha}

which would give the desired gain {K(d+\alpha)}. So we win if we can exhibit the condition (3) for some intermediate scale {\rho}. I think of this as a “Frostman measure violation”, in that the Frostman type bound

\displaystyle |E \cap B(x,\rho)| \lessapprox (\rho/\delta)^d \delta^3

is being violated.

The set {E}, being the union of tubes of thickness {\delta}, is essentially the union of {\delta \times \delta \times \delta} cubes. But it has been observed in several previous works (starting with a paper of Nets Katz, Izabella Laba, and myself) that these Kakeya type sets tend to organize themselves into larger “grains” than these cubes – in particular, they can organize into {\delta \times c \times c} disjoint prisms (or “grains”) in various orientations for some intermediate scales {\delta \lll c \ll 1}. The original “graininess” argument of Nets, Izabella and myself required a stickiness hypothesis which we are explicitly not assuming (and also an “x-ray estimate”, though Wang and Zahl were able to find a suitable substitute for this), so is not directly available for this argument; however, there is an alternate approach to graininess developed by Guth, based on the polynomial method, that can be adapted to this setting. (I am told that Guth has a way to obtain this graininess reduction for this paper without invoking the polynomial method, but I have not studied the details.) With rescaling, we can ensure that the thin tubes inside a single fat tube {T_\rho} will organize into grains of a rescaled dimension {\delta \times \rho c \times c}. The grains associated to a single fat tube will be essentially disjoint; but there can be overlap between grains from different fat tubes.

The exact dimensions {\rho c, c} of the grains are not specified in advance; the argument of Guth will show that {\rho c} is significantly larger than {\delta}, but other than that there are no bounds. But in principle we should be able to assume without loss of generality that the grains are as “large” as possible. This means that there are no longer grains of dimensions {\delta \times \rho' c' \times c'} with {c'} much larger than {c}; and for fixed {c}, there are no wider grains of dimensions {\delta \times \rho' c \times c} with {\rho'} much larger than {\rho}.

One somewhat degenerate possibility is that there are enormous grains of dimensions approximately {\delta \times 1 \times 1} (i.e., {\rho \approx c \approx 1}), so that the Kakeya set {E} becomes more like a union of planar slabs. Here, it turns out that the classical {L^2} arguments of Córdoba give good estimates, so this turns out to be a relatively easy case. So we can assume that at least one of {\rho} or {c} is small (or both).

We now revisit the multiplicity inequality (2). There is something slightly wasteful about this inequality, because the fat tubes used to define {\mu_{fat}} occupy a lot of space that is not in {E}. An improved inequality here is

\displaystyle  \mu \lessapprox \mu_{coarse} \mu_{fine}, \ \ \ \ \ (4)

where {\mu_{coarse}} is the multiplicity, not of the fat tubes {T_\rho}, but rather of the smaller set {\bigcup_{T \subset T_\rho} T}. The point here is that by the graininess hypotheses, each {\bigcup_{T \subset T_\rho} T} is the union of essentially disjoint grains of some intermediate dimensions {\delta \times \rho c \times c}. So the quantity {\mu_{coarse}} is basically measuring the multiplicity of the grains.

It turns out that after a suitable rescaling, the arrangement of grains looks locally like an arrangement of {\rho \times \rho \times 1} tubes. If one is lucky, these tubes will look like a Kakeya (or sub-Kakeya) configuration, for instance with not too many tubes in a given direction. (More precisely, one should assume here some form of the Wolff axioms, which the authors refer to as the “Katz-Tao Convex Wolff axioms”). A suitable version of the hypothesis {K(d)} will then give the bound

\displaystyle  \mu_{coarse} \lessapprox \rho^{-d}.

Meanwhile, the thin tubes inside a fat tube are going to be a sub-Kakeya configuration, having about {m} times fewer tubes than a Kakeya configuration. It turns out to be possible to use {K(d)} to then get a gain in {m} here,

\displaystyle  \mu_{fine} \lessapprox m^{-\sigma} (\delta/\rho)^{-d},

for some small constant {\sigma>0}. Inserting these bounds into (4), one obtains a good bound {\mu \lessapprox m^{-\sigma} \delta^{-d}} which leads to the desired gain {K(d+\alpha)}.

So the remaining case is when the grains do not behave like a rescaled Kakeya or sub-Kakeya configuration. Wang and Zahl introduce a “structure theorem” to analyze this case, concluding that the grains will organize into some larger convex prisms {W}, with the grains in each prism {W} behaving like a “super-Kakeya configuration” (with significantly more grains than one would have for a Kakeya configuration). However, the precise dimensions of these prisms {W} are not specified in advance, and one has to split into further cases.

One case is when the prisms {W} are “thick”, in that all dimensions are significantly greater than {\delta}. Informally, this means that at small scales, {E} looks like a super-Kakeya configuration after rescaling. With a somewhat lengthy induction on scales argument, Wang and Zahl are able to show that (a suitable version of) {K(d)} implies an “x-ray” version of itself, in which the lower bound of super-Kakeya configurations is noticeably better than the lower bound for Kakeya configurations. The upshot of this is that one is able to obtain a Frostman violation bound of the form (3) in this case, which as discussed previously is already enough to win in this case.

It remains to handle the case when the prisms {W} are “thin”, in that they have thickness {\approx \delta}. In this case, it turns out that the {L^2} arguments of Córdoba, combined with the super-Kakeya nature of the grains inside each of these thin prisms, implies that each prism is almost completely occupied by the set {E}. In effect, this means that these prisms {W} themselves can be taken to be grains of the Kakeya set. But this turns out to contradict the maximality of the dimensions of the grains (if everything is set up properly). This treats the last remaining case needed to close the induction on scales, and obtain the Kakeya conjecture!

February 24, 2025

Terence Tao Closing the “green gap”: from the mathematics of the landscape function to lower electricity costs for households

I recently returned from the 2025 Annual Meeting of the “Localization of Waves” collaboration (supported by the Simons Foundation, with additional related support from the NSF), where I learned (from Svitlana Mayboroda, the director of the collaboration as well as one of the principal investigators) of a remarkable statistic: net electricity consumption by residential customers in the US has actually experienced a slight decrease in recent years:

The decrease is almost entirely due to gains in lighting efficiency in households, and particularly the transition from incandescent (and compact fluorescent) light bulbs to LED light bulbs:

Annual energy savings from this switch to consumers in the US were already estimated to be $14.7 billion in 2020 – or several hundred dollars per household – and are projected to increase, even in the current inflationary era, with the cumulative savings across the US estimated to reach $890 billion by 2035.

What I also did not realize before this meeting is the role that recent advances in pure mathematics – and specifically, the development of the “landscape function” that was a primary focus of this collaboration – played in accelerating this transition. This is not to say that this piece of mathematics was solely responsible for these developments; but, as I hope to explain here, it was certainly part of the research and development ecosystem in both academia and industry, spanning multiple STEM disciplines and supported by both private and public funding. This application of the landscape function was already reported upon by Quanta magazine at the very start of this collaboration back in 2017; but it is only in the last few years that the mathematical theory has been incorporated into the latest LED designs and led to actual savings at the consumer end.

LED lights are made from layers of semiconductor material (e.g., Gallium nitride or Indium gallium nitride) arranged in a particular fashion. When enough of a voltage difference is applied to this material, electrons are injected into the “n-type” side of the LED, while electron holes are injected into the “p-type” side, creating a current. In the active layer of the LED, these electrons and holes recombine in the quantum wells of the layer, generating radiation (light) via the mechanism of electroluminescence. The brightness of the LED is determined by the current, while the power consumption is the product of the current and the voltage. Thus, to improve energy efficiency, one seeks to design LEDs to require as little voltage as possible to generate a target amount of current.

As it turns out, the efficiency of an LED, as well as the spectral frequencies of light they generate, depend in many subtle ways on the precise geometry of the chemical composition of the semiconductors, the thickness of the layers, the geometry of how the layers are placed atop one another, the temperature of the materials, and the amount of disorder (impurities) introduced into each layer. In particular, in order to create quantum wells that can efficiently trap the electrons and holes together to recombine to create light of a desired frequency, it is useful to introduce a certain amount of disorder into the layers in order to take advantage of the phenomenon of Anderson localization. However, one cannot add too much disorder, lest the electron states become fully bound and the material behaves too much like an insulator to generate appreciable current.

One can of course make empirical experiments to measure the performance of various proposed LED designs by fabricating them and then testing them in a laboratory. But this is an expensive and painstaking process that does not scale well; one cannot test thousands of candidate designs this way to isolate the best performing ones. So, it becomes desirable to perform numerical simulations of these designs instead, which – if they are sufficiently accurate and computationally efficient – can lead to a much shorter and cheaper design cycle. (In the near future one may also hope to accelerate the design cycle further by incorporating machine learning and AI methods; but these techniques, while promising, are still not fully developed at the present time.)

So, how can one perform numerical simulation of an LED? By the semiclassical approximation, the wave function {\psi_i} of an individual electron should solve the time-independent Schrödinger equation

\displaystyle -\frac{\hbar^2}{2m_e} \Delta \psi_i + E_c \psi_i = E_i \psi_i,

where {\psi_i} is the wave function of the electron at the energy level {E_i}, and {E_c} is the conduction band energy. The behavior of hole wavefunctions follows a similar equation, governed by the valence band energy {E_v} instead of {E_c}. However, there is a complication: these band energies are not solely coming from the semiconductor, but also contain a contribution {\mp e \varphi} that comes from electrostatic effects from the electrons and holes, and more specifically by solving the Poisson equation

\displaystyle \mathrm{div}( \varepsilon_r \nabla \varphi ) = \frac{e}{\varepsilon_0} (n-p + N_A^+ - N_D^-)

where {\varepsilon_r} is the dielectric constant of the semiconductor, {n,p} are the carrier densities of electrons and holes respectively, {N_A^+}, {N_D^-} are further densities of ionized acceptor and donor atoms, and {\hbar, m_e, e, \varepsilon_0} are physical constants. This equation looks somewhat complicated, but is mostly determined by the carrier densities {n,p}, which in turn ultimately arise from the probability densities {|\psi_i|^2} associated to the eigenfunctions {\psi_i} via the Born rule, combined with the Fermi-Dirac distribution from statistical mechanics; for instance, the electron carrier density {n} is given by the formula

\displaystyle n = \sum_i \frac{|\psi_i|^2}{1 + e^{(E_i - E_{Fn})/k_B T}},

with a similar formula for {p}. In particular, the net potential {E_c} depends on the wave functions {\psi_i}, turning the Schrödinger equation into a nonlinear self-consistent Hartree-type equation. From the wave functions one can also compute the current, determine the amount of recombination between electrons and holes, and therefore also calculate the light intensity and absorption rates. But the main difficulty is to solve for the wave functions {\psi_i} for the different energy levels of the electron (as well as the counterpart for holes).

One could attempt to solve this nonlinear system iteratively: first propose an initial candidate for the wave functions {\psi_i}, use this to obtain a first approximation for the conduction band energy {E_c} and valence band energy {E_v}, then solve the Schrödinger equations to obtain a new approximation for {\psi_i}, and repeat this process until it converges (a schematic version of this loop is sketched below). However, the regularity of the potentials {E_c, E_v} plays an important role in one's ability to solve the Schrödinger equation accurately. (The Poisson equation, being elliptic, is relatively easy to solve to high accuracy by standard methods, such as finite element methods.) If the potential {E_c} is quite smooth and slowly varying, then one expects the wave functions {\psi_i} to be quite delocalized, and traditional approximations such as the WKB approximation to be accurate.
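
To make the structure of this iteration concrete, here is a minimal one-dimensional Python sketch of such a Schrödinger–Poisson self-consistency loop. Everything in it – the dimensionless units, the restriction to a single (electron) band, the fixed quasi-Fermi level, the white-noise potential, and the lumped "coupling" constant standing in for {e/\varepsilon_0} – is an illustrative assumption of mine, not the discretization used in the actual LED simulations discussed here.

# Minimal 1-D sketch of the self-consistent Schrodinger-Poisson iteration
# described above.  Dimensionless units; only the electron band is treated
# (holes would be handled analogously); all parameters are made up.
import numpy as np
from scipy.linalg import eigh_tridiagonal

N, L = 400, 40.0                      # grid points, device length
x = np.linspace(0.0, L, N)
dx = x[1] - x[0]
kT = 0.025                            # thermal energy
E_F = 0.1                             # fixed quasi-Fermi level (an assumption)
eps_r = 12.0                          # dielectric constant (uniform here)
coupling = 1e-3                       # lumped constant standing in for e/eps_0

rng = np.random.default_rng(0)
E_c0 = 0.3 * rng.random(N)            # disordered "bare" conduction band edge
N_dope = 0.01 * np.ones(N)            # net ionized donor density N_D^+ - N_A^-

def solve_schrodinger(E_c, n_states=20):
    """Lowest eigenpairs of -(1/2) psi'' + E_c psi = E psi (Dirichlet b.c.)."""
    diag = 1.0 / dx**2 + E_c
    off = -0.5 / dx**2 * np.ones(N - 1)
    E, psi = eigh_tridiagonal(diag, off, select='i',
                              select_range=(0, n_states - 1))
    psi /= np.sqrt(np.sum(psi**2, axis=0) * dx)      # normalize each eigenfunction
    return E, psi

def carrier_density(E, psi):
    """Born-rule densities weighted by Fermi-Dirac occupation, as in the formula for n."""
    occ = 1.0 / (1.0 + np.exp((E - E_F) / kT))
    return (psi**2 * occ).sum(axis=1)

def solve_poisson(rho):
    """Solve div(eps_r grad phi) = rho with phi = 0 at the two contacts."""
    lap = (np.diag(-2.0 * np.ones(N)) + np.diag(np.ones(N - 1), 1)
           + np.diag(np.ones(N - 1), -1)) * eps_r / dx**2
    return np.linalg.solve(lap, rho)

phi = np.zeros(N)
for it in range(100):
    E, psi = solve_schrodinger(E_c0 - phi)           # band edge shifted by -e*phi (e = 1)
    n = carrier_density(E, psi)
    phi_new = solve_poisson(coupling * (n - N_dope))
    if np.max(np.abs(phi_new - phi)) < 1e-7:
        break
    phi = 0.5 * phi + 0.5 * phi_new                  # damped update for stability
print(f"stopped after {it + 1} iterations; lowest electron level E_0 = {E[0]:.4f}")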

However, in the presence of disorder, such approximations are no longer valid. As a consequence, traditional methods for numerically solving these equations had proven too inaccurate to be of practical use in simulating the performance of an LED design, so until recently one had to rely primarily on slower and more expensive empirical testing methods. One real-world consequence of this was the “green gap”: while reasonably efficient LED designs were available in the blue and red portions of the spectrum, there was no suitable design that gave efficient output in the green portion of the spectrum. Given that many applications of LED lighting require white light that is balanced across all visible colors, this was a significant impediment to realizing the energy-saving potential of LEDs.

Here is where the landscape function comes in. This function started as a purely mathematical discovery: when solving a Schrödinger equation such as

\displaystyle -\Delta \phi + V \phi = E \phi

(where we have now suppressed all physical constants for simplicity), it turns out that the behavior of the eigenfunctions {\phi} at various energy levels {E} is controlled to a remarkable extent by the landscape function {u}, defined to be the solution to the equation

\displaystyle -\Delta u + V u = 1.

As discussed in this previous blog post (concerning a paper on this topic that I wrote with some of the members of this collaboration), one reason for this is that the Schrödinger equation can be transformed, after some routine calculations, into

\displaystyle -\frac{1}{u^2} \mathrm{div}( u^2 \nabla (\phi/u)) + \frac{1}{u} (\phi/u) = E (\phi/u),

thus making {\frac{1}{u}} an effective potential for the Schrödinger equation (with {u^2} also supplying the coefficients of an effective geometry for the equation). In practice, when {V} is a disordered potential, the effective potential {1/u} tends to behave like a somewhat “smoothed out” or “homogenized” version of {V} that exhibits superior numerical performance.

For instance, the classical Weyl law predicts (assuming a smooth confining potential {V}) that the density of states up to energy {E} – that is to say, the number of bound states up to {E} – should asymptotically behave like {\frac{1}{(2\pi)^2}|\{ (x,\xi): \xi^2 + V(x) \leq E\}|}. This is accurate at very high energies {E}, but when {V} is disordered, it tends to break down at low and medium energies. However, the landscape function makes a prediction {\frac{1}{(2\pi)^2}|\{ (x,\xi): \xi^2 + 1/u(x) \leq E\}|} for this density of states that is significantly more accurate in practice in these regimes, with a mathematical justification (up to multiplicative constants) of this accuracy obtained in this paper of David, Filoche, and Mayboroda. More refined predictions (again with some degree of theoretical support from mathematical analysis) can be made for the local integrated density of states, and with more work one can then also obtain approximations for the carrier density functions {n,p} mentioned previously in terms of the band energy functions {E_c}, {E_v}. As the landscape function {u} is relatively easy to compute (coming as it does from solving a single elliptic equation), this gives a very practical way to carry out the iterative procedure described previously; the resulting LED models have proven to be both numerically accurate and far cheaper than empirical testing, leading to a significantly more rapid design cycle. (A toy illustration of the landscape computation is sketched below.)
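
For readers who want to experiment, here is a small one-dimensional toy computation in Python of the landscape function and the associated counting-function predictions. The potential model, the units, and the use of the one-dimensional Weyl prefactor {\frac{1}{2\pi}} in place of the two-dimensional {\frac{1}{(2\pi)^2}} above are my own illustrative choices; the snippet is only meant to show how cheaply {u} and {1/u} can be obtained from a single linear solve, not to reproduce the collaboration's simulations.

# Toy 1-D illustration of the landscape function: solve -u'' + V u = 1 for a
# disordered V, then compare the exact eigenvalue counting function N(E) with
# the Weyl-type predictions based on V and on the effective potential 1/u.
import numpy as np
from scipy.linalg import eigh_tridiagonal

N, L = 1000, 100.0
x = np.linspace(0.0, L, N)
dx = x[1] - x[0]

rng = np.random.default_rng(1)
V = 4.0 * rng.random(N)                              # disordered potential

# Discrete Hamiltonian H = -d^2/dx^2 + V with Dirichlet boundary conditions.
diag = 2.0 / dx**2 + V
off = -np.ones(N - 1) / dx**2

# Landscape function: solve H u = 1, then form the effective potential 1/u.
H = np.diag(diag) + np.diag(off, 1) + np.diag(off, -1)
u = np.linalg.solve(H, np.ones(N))
W = 1.0 / u

# Exact eigenvalues, for comparison with the two predictions.
E = eigh_tridiagonal(diag, off, eigvals_only=True)

def weyl_count(pot, E0):
    """1-D phase-space volume |{(x, xi): xi^2 + pot(x) <= E0}| divided by 2*pi."""
    allowed = np.maximum(E0 - pot, 0.0)
    return np.sum(2.0 * np.sqrt(allowed)) * dx / (2.0 * np.pi)

for E0 in [1.0, 2.0, 4.0, 8.0]:
    print(f"E = {E0:4.1f}   exact N(E) = {np.sum(E <= E0):4d}   "
          f"Weyl with V = {weyl_count(V, E0):7.1f}   "
          f"Weyl with 1/u = {weyl_count(W, E0):7.1f}")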

In particular, recent advances in LED technology have largely closed the “green gap” by introducing designs that incorporate “{V}-defects”: {V}-shaped dents in the semiconductor layers of the LED that create lateral carrier injection pathways and modify the internal electric field, enhancing hole transport into the active layer. The ability to accurately simulate the effects of these defects has been a key ingredient in this progress.

My understanding is that the major companies involved in developing LED lighting are now incorporating landscape-based methods into their own proprietary simulation models to achieve similar effects in commercially produced LEDs, which should lead to further energy savings in the near future.

Thanks to Svitlana Mayboroda and Marcel Filoche for detailed discussions, comments, and corrections of the material here.

February 22, 2025

Robert HellingThe Bohm-GHZ paper is out

I had this neat calculation in my drawer, and on the occasion of quantum mechanics' 100th birthday in 2025 I decided to submit a talk about it to the March meeting of the DPG, the German physical society, in Göttingen. And to have something to show, I put it out on the arXiv today. The idea is as follows:

The GHZ experiment is a beautiful version of Bell's inequality that demonstrates that you reach wrong conclusions when you assume that a property of a quantum system has to have some (unknown) value even when you don't measure it. I would say it shows that quantum theory is not realistic, in the sense that unmeasured properties do not have secret values (different, for example, from classical statistical mechanics, where you could imagine actually measuring the exact position of molecule number 2342 in your container of gas). For details, see the paper or this beautiful explanation by Coleman. I should mention here that there is another way out: assuming some non-local forces that conspire to make the result come out right nevertheless.
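
For readers who want the bare bones of the argument, here is a compressed version using the standard textbook conventions (my choice of presentation, not necessarily the paper's). For the state

\displaystyle |\mathrm{GHZ}\rangle = \frac{1}{\sqrt{2}}\left(|000\rangle + |111\rangle\right)

one checks directly that

\displaystyle \sigma_x^1\sigma_y^2\sigma_y^3\,|\mathrm{GHZ}\rangle = \sigma_y^1\sigma_x^2\sigma_y^3\,|\mathrm{GHZ}\rangle = \sigma_y^1\sigma_y^2\sigma_x^3\,|\mathrm{GHZ}\rangle = -|\mathrm{GHZ}\rangle, \qquad \sigma_x^1\sigma_x^2\sigma_x^3\,|\mathrm{GHZ}\rangle = +|\mathrm{GHZ}\rangle.

If each spin secretly carried definite values m_x^i, m_y^i = ±1, then multiplying the outcomes of the first three (jointly measurable) combinations would force m_x^1 m_x^2 m_x^3 = (-1)^3 = -1, since every m_y^i appears twice; but the fourth combination yields +1. No assignment of hidden values is consistent with all four quantum predictions.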

On the other hand, there is Bohmian mechanics. This is well known to be a non-local theory (as the time evolution of its particles depends on the positions of all the other particles in the system, or even the universe), but what I find more interesting is that it is also realistic: there, it is claimed that all that matters are particle positions (including the positions of pointers on your measurement devices, which you might interpret as showing something other than positions, for example velocities or field strengths or whatever), and those all have (possibly unknown) values at all times, even if you don't measure them.

So how can the two be brought together? One possible obstacle is that GHZ is usually presented in terms of correlations of spins, and in the Bohmian literature spins are not really positions; you always have to use some Stern-Gerlach setup to translate them into actual positions. But we can circumvent this the other way around: we don't really need spins, we just need observables obeying the commutation relations of the Pauli matrices. You might think that these cannot be realised with position measurements, as positions always commute, but this is only true if you perform the position measurements at equal times. If you wait between them, you can in fact obtain almost Pauli-type operators.
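
A quick way to see why position measurements at different times need not commute (a standard free-particle calculation, included here purely for illustration; the particles in the actual setup live in boxes, but the mechanism is the same): in the Heisenberg picture

\displaystyle x(t) = x(0) + \frac{p\,t}{m} \qquad \Longrightarrow \qquad [x(0), x(t)] = \frac{t}{m}\,[x(0),p] = \frac{i\hbar t}{m} \neq 0,

so "left or right half of the box" asked at time 0 and the same question asked at a later time t are genuinely non-commuting observables, which is what makes the almost-Pauli construction possible.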

So we can set up a GHZ experiment in terms of three particles in three boxes: for each particle you measure whether it is in the left or the right half of its box, but you decide, particle by particle, whether to do this at time 0 or at a later moment. You can then look at the correlation of the three measurements as a function of that time (of course, since you measure different particles, the actual measurements still commute with each other, independent of the timing), and what you find is the blue line in

GHZ correlations vs. Bohmian correlations

You can also (numerically) solve the Bohmian equations of motion and compute the expectation of the correlation of the positions of the three particles at the different times, which gives the orange line – clearly something else. No surprise: the realistic theory cannot predict the outcome of an experiment that demonstrates that quantum theory is not realistic. And the non-local character of the evolution equation does not help either.
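
To give a flavour of what such a numerical solution involves, here is a minimal Python sketch: Bohmian trajectories for a single particle in a box, guided by dx/dt = (hbar/m) Im(psi'/psi), with initial positions drawn from |psi|^2, and a naive two-time left/right correlation read off along the unmeasured trajectories. The wave function, the parameters and the crude Euler integrator are my own illustrative choices; the calculation behind the orange curve involves three particles and the specific states described in the paper.

# Bohmian trajectories for one particle in a box (toy illustration only).
import numpy as np

hbar = m = 1.0
L_box = 1.0

def eigenstate(n, x, t):
    """n-th stationary state of the infinite square well on [0, L_box]."""
    E_n = (n * np.pi * hbar)**2 / (2.0 * m * L_box**2)
    return (np.sqrt(2.0 / L_box) * np.sin(n * np.pi * x / L_box)
            * np.exp(-1j * E_n * t / hbar))

def psi(x, t):
    return (eigenstate(1, x, t) + eigenstate(2, x, t)) / np.sqrt(2.0)

def velocity(x, t, eps=1e-6):
    dpsi = (psi(x + eps, t) - psi(x - eps, t)) / (2.0 * eps)   # numerical psi'
    return hbar / m * np.imag(dpsi / psi(x, t))

rng = np.random.default_rng(2)

def sample_initial(n_samples, bound=4.0):        # bound > max |psi(x,0)|^2 (about 3.1)
    out = []
    while len(out) < n_samples:
        x_try = rng.uniform(0.0, L_box)
        if rng.uniform(0.0, bound) < np.abs(psi(x_try, 0.0))**2:
            out.append(x_try)
    return np.array(out)

x = sample_initial(2000)
s0 = np.where(x > L_box / 2, 1, -1)              # left/right reading at t = 0
dt, t_final, t = 1e-3, 1.0, 0.0
while t < t_final:                               # Euler steps (crude near nodes of psi)
    x += velocity(x, t) * dt
    x = np.clip(x, 1e-9, L_box - 1e-9)           # keep trajectories inside the box
    t += dt
s1 = np.where(x > L_box / 2, 1, -1)              # same reading at t = t_final
print("naive two-time left/right correlation:", np.mean(s0 * s1))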

To save the Bohmian theory, one can in fact argue that I have computed the wrong thing: after measuring the position of one particle at time 0 – that is, after letting it interact with a measuring device – the future time evolution of all the particles is affected, and one should compute the correlation with the corrected (effectively collapsed) wave function. That, however, I cannot do, and I claim it is impossible, since it would depend on the details of how the first particle's position is actually measured (whereas the orthodox prediction above is independent of those details, as those interactions commute with the later observations). In any case, my interpretation is that if you don't want to predict the correlation wrongly, the best you can do is to say that you cannot do the calculation because it depends on unknown details (even though the result, of course, shouldn't).

In any case, the standard argument for why Bohmian mechanics is indistinguishable from more conventional treatments is that all that matters are position correlations, and since those are given by psi-squared they are the same in all approaches. But I show that this is not the case for these multi-time correlations.


Post script: What happens when you try to discuss physics with a philosopher:



February 17, 2025

John PreskillLessons in frustration

Assa Auerbach’s course was the most maddening course I’ve ever taken. 

I was a master’s student in the Perimeter Scholars International program at the Perimeter Institute for Theoretical Physics. Perimeter trotted in world experts to lecture about modern physics. Many of the lecturers dazzled us with their pedagogy and research. We grew to know them not only in class and office hours, but also over meals at Perimeter’s Black-Hole Bistro.

Assa hailed from the Technion in Haifa, Israel. He’d written the book—at least, a book—about condensed matter, the physics of materials. He taught us condensed matter, according to some definition of “taught.” 

Assa zipped through course material. He refrained from defining terminology. He used loose, imprecise language that conveys intuition to experts and only to experts. He threw at us the Hubbard model, the Heisenberg model, the Meissner effect, and magnons. If you don’t know what those terms mean, then I empathize. Really.

So I fought Assa like a groom hauling on a horse’s reins. I raised my hand again and again, insisting on clarifications. I shot off questions as quickly as I could invent them, because they were the only barriers slowing him down. He told me they were.

One day, we were studying magnetism. It arises because each atom in a magnet has a magnetic moment, a tiny compass that can angle in any direction. Under certain conditions, atoms’ magnetic moments tend to angle in opposite directions. Sometimes, not all atoms can indulge this tendency, as in the example below.

Physicists call this clash frustration, which I wanted to understand comprehensively and abstractly. But Assa wouldn’t define frustration; he’d only sketch an example. 

But what is frustration? I insisted.

It’s when the atoms aren’t happy, he said, like you are now.

After class, I’d escape to the bathroom and focus on breathing. My body felt as though it had been battling an assailant physically. 

Earlier this month, I learned that Assa had passed away suddenly. A former Perimeter classmate reposted the Technion’s news blurb on Facebook. A photo of Assa showed a familiar smile flashing beneath curly salt-and-pepper hair.

Am I defaming the deceased? No. The news of Assa’s passing walloped me as hard as any lecture of his did. I liked Assa and respected him; he was a researcher’s researcher. And I liked Assa for liking me for fighting to learn.

Photo courtesy of the Technion

One day, at the Bistro, Assa explained why the class had leaped away from the foundations of condensed matter into advanced topics so quickly: earlier discoveries felt “stale” to him. Everyone, he believed, could smell their moldiness. I disagreed, although I didn’t say so: decades-old discoveries qualify as new to anyone learning about them for the first time. Besides, 17th-century mechanics and 19th-century thermodynamics soothe my soul. But I respected Assa’s enthusiasm for the cutting-edge. And I did chat with him at the Bistro, where his friendliness shone like that smile.

Five years later, I was sojourning at the Kavli Institute for Theoretical Physics (KITP) in Santa Barbara, near the end of my PhD. The KITP, like Perimeter, draws theorists from across the globe. I spotted Assa among them and reached out about catching up. We discussed thermodynamics and experiments and travel. 

Assa confessed that, at Perimeter, he’d been lecturing to himself—presenting lectures that he’d have enjoyed hearing, rather than lectures designed for master’s students. He’d appreciated my slowing him down. Once, he explained, he’d guest-lectured at Harvard. Nobody asked questions, so he assumed that the students must have known the material already, that he must have been boring them. So he sped up. Nobody said anything, so he sped up further. At the end, he discovered that nobody had understood any of his material. So he liked having an objector keeping him in check.

And where had this objector ended up? In a PhD program and at a mecca for theoretical physicists. Pursuing the cutting edge, a budding researcher’s researcher. I’d angled in the same direction as my former teacher. And one Perimeter classmate, today a faculty member specializing in condensed matter, waxed even more eloquent about the inspiration Assa provided when we were students.

Physics needs more scientists like Assa: nose to the wind, energetic, low on arrogance. Someone who’d respond to this story of frustration with that broad smile.

February 11, 2025

Tommaso DorigoUnsupervised Tracking

Pattern recognition is an altisonant name for a rather common, if complex, activity we perform countless times in our daily lives. Our brain is capable of interpreting successions of sounds, written symbols, or images almost infallibly - so much so that people like me, who sometimes have trouble recognizing a face that should be familiar, get their own term for the dysfunction - in this case, prosopagnosia.

read more

February 05, 2025

Tommaso DorigoHey AI, Design A Calorimeter For Me

As artificial intelligence tools continue to evolve and improve their performance on more and more general tasks, scientists struggle to make the best use of them. 
The problem is not incompetence - in fact, at least in my field of study (high-energy physics) most of us have grown rather well educated in the use and development of tailored machine learning algorithms. The problem is rather that our problems are enormously complex. Long gone are the years when we started to successfully apply deep neural networks to the classification and regression problems of data analysis: those were easy tasks. The bar is now set much higher: optimizing the design of the instruments we use for our scientific research. 

read more

January 31, 2025

Tommaso DorigoThe Multi-Muon Analysis - A Recollection

As part of the celebrations for 20 years of blogging, I am re-posting articles that in some way were notable for the history of the blog. This time I want to (re)-submit to you four pieces I wrote to explain the unexplainable: the very complicated analysis performed by a group of physicists within the CDF experiment, which led them to claim that there was a subtle new physics process hidden in the data collected in Run 2. There would be a lot to tell about that whole story, but suffice it to say here that the signal never got confirmed by independent analyses, nor by DZERO, the competing experiment at the Tevatron. As mesmerizing and striking as the CDF results were, they were finally attributed to some intrinsic inability of the experiment to make perfect sense of its muon detector signals.

read more

January 15, 2025

Andrew JaffeThe Only Gaijin in the Onsen

After living in Japan for about four months, we left in mid-December. We miss it already.

One of the pleasures we discovered is the onsen, or hot spring. The word originally referred to the natural volcanic springs themselves, and the villages around them, but there are now onsens all over Japan. Many hotels have an onsen, and most towns will have several. Some people still use them as their primary bath and shower for keeping clean. (Outside of actual volcanic locations, these are technically sento rather than onsen.) You don’t actually wash yourself in the hot baths themselves; they are just for soaking, and there are often several, at different temperatures and mineral contents, in indoor and outdoor locations, with whirlpools and even “electric baths” with muscle-stimulating currents. For actual cleaning, there is a bank of hand showers, usually with soap and shampoo. Some onsens can be very basic, others much more like a posh spa, with massages, saunas, and a restaurant.

Our favourite, about 25 minutes away by bicycle, was Kirari Onsen Tsukuba. When not traveling, we tried to go every weekend, spending a day soaking in the hot water, eating the good food, staring at the gardens, snacking on Hokkaido soft cream — possibly the best soft-serve ice cream in the world (sorry, Carvel!) — and just enjoying the quiet and peace. Even our seven- and nine-year-old girls have found the onsen spirit, calming and quieting themselves down for at least a few hours.

Living in Tsukuba – lovely, but not a common tourist destination, although home to plenty of foreigners thanks to its constellation of laboratories and universities – we were often one of only one or two western families in our local onsen. It sometimes takes Americans (and those from other buttoned-up cultures) some time to get used to the sex-segregated but fully naked policy of the baths themselves. The communal areas, however, are mixed, and fully clothed. In fact, many hotels and fancier onsen facilities supply a jinbei, a short-sleeve pyjama set in which you can softly pad around the premises during your stay. (I enjoyed wearing jinbei so much that I purchased a lightweight cotton set for home, and am also trying to get my hands on samue, a somewhat heavier style of traditional Japanese clothing.)

And my newfound love for the onsen is another reason – beyond the sagging flesh and embarrassment of my future self – not to get a tattoo: in Japan, tattoos are often a symbol of the yakuza, and are strictly forbidden in the onsen, even for foreigners.

Later in our sabbatical, we will be living in the Netherlands, which also has a good public bath culture, but it will be hard to match the calm of the Japanese onsen.

January 11, 2025

Clifford Johnson93 minutes

Thanks to everyone who made all those kind remarks in various places last month after my mother died. I've not responded individually (I did not have the strength) but I did read them all and they were deeply appreciated. Yesterday would’ve been mum’s 93rd birthday. A little side-note occurred to me the other day: Since she left us a month ago, she was just short of having seen two perfect square years. (This year and 1936.) Anyway, still on the theme of playing with numbers, my siblings and I agreed that as a tribute to her on the day, we would all do some kind of outdoor activity for 93 minutes. Over in London, my brother and sister did a joint (probably chilly) walk together in Regents Park and surrounds. I decided to take out a piece of the afternoon at low tide and run along the beach. It went pretty well, [...] Click to continue reading this post

The post 93 minutes appeared first on Asymptotia.

January 05, 2025

Mark GoodsellMaking back bacon

As a French citizen I should probably disavow the following post and remind myself that I have access to some of the best food in the world. Yet it's impossible to forget the tastes of your childhood. And indeed there are lots of British things that are difficult or very expensive to get hold of in France. Some of them (Marmite, Branston pickle ...) I can import via occasional trips across the channel, or in the luggage of visiting relatives. However, since Brexit this no longer works for fresh food like bacon and sausages. This is probably a good thing for my health, but every now and then I get a hankering for a fry-up or a bacon butty, and as a result of their rarity these are amongst the favourite breakfasts of my kids too. So I've learnt how to make bacon and sausages (it turns out that boudin noir is excellent with a fry-up and I even prefer it to black pudding). 

Sausages are fairly labour-intensive, but after about an hour or so's work it's possible to make one or two kilos worth. Back bacon, on the other hand, takes three weeks to make one batch, and I thought I'd share the process here.

1. Cut of meat

The first thing is to get the right piece of pork, since animals are divided up differently in different countries. I've made bacon several times now and keep forgetting which instructions I previously gave to the butcher at my local Grand Frais ... Now I have settled on asking for a carré de porc, and when they (nearly always) tell me that they don't have that in I ask for côtes de porc première in one whole piece, and try to get them to give me a couple of kilos. As you can find on wikipedia, I need the same piece of meat used to make pork chops. I then ask them to remove the spine, but it should still have the ribs. So I start with this:



2. Cure

Next the meat has to be cured for 10 days (I essentially follow the River Cottage recipe). I mix up a 50-50 batch of PDV salt and brown sugar (1 kg in total here), and add some pepper, juniper berries and bay leaves:


Notice that this doesn't include any nitrites or nitrates. I have found that nitrates/nitrites are essential for the flavour in sausages, but in bacon the only thing that they will do (other than be a carcinogen) as far as I can tell is make the meat stay pink when you cook it. I can live without that. This cure makes delicious bacon as far as I'm concerned. 

The curing process involves applying 1/10th of the mixture each day for ten days and draining off the liquid produced at each step. After the first coating it looks like this:


The salt and sugar remove water from the meat, and penetrate into it, preserving it. Each day I get liquid at the bottom, which I drain off before applying the next day's cure. After one day it looks like this:


This time I still had liquid after 10 days:

3. Drying

After ten days, I wash/wipe off the cure and pat it down with some vinegar. If you leave cure on the meat it will be much too salty (and, to be honest, this cure always gives quite salty bacon). So at this point it looks like this:


I then cover the container with a muslin that has been doused with a bit more vinegar, and leave it in the fridge (at first) and then in the garage (since it's nice and cold this time of year) for ten days or so. This part removes extra moisture. Small amounts of white mould may appear during this stage, but these are totally benign: you only have to worry if it starts to smell or you get blue/black mould, which has never happened to me so far.

4. Smoking

After the curing/drying, the bacon is ready to eat and should in principle keep almost indefinitely. However, I prefer smoked bacon, so I cold smoke it. This involves sticking it in a smoker (essentially just a box where you can suspend the meat above some smouldering sawdust) for several hours:

The sawdust is beech wood and slowly burns round in the little spiral device you can see above. Of course, I close the smoker up and usually put it in the shed to protect against the elements:


5. All done!

And then that's it! Delicious back bacon that really doesn't take very long to eat:


As I mentioned above, it's usually still a bit salty, so when I slice it to cook I put the pieces in water for a few minutes before grilling/frying:

Here you see that the colour is just like frying pork chops ... but the flavour is exactly right!

December 20, 2024

Clifford JohnsonA Long Goodbye

I've been very quiet here over the last couple of weeks. My mother, Delia Maria Johnson, already in hospital since 5th November or so, took a turn for the worse and began a rapid decline. She died peacefully after some days, and to be honest I’ve really not been myself since then.

My mother Delia at a wedding in 2012

There's an extra element to the sense of loss when (as it approaches) you are powerless to do anything because of being thousands of miles away. On the plus side, because of the ease of using video calls, and with the help of my sister being there, I was able to be somewhat present during what turned out to be the last moments when she was aware of people around her, and therefore was able to tell her I loved her one last time.

Rather than charging across the world on planes, trains, and in automobiles, probably being out of reach during any significant changes in the situation (the doctors said I would likely not make it in time) I did a number of things locally that I am glad I got to do.

It began with visiting (and sending a photo from) the Santa Barbara mission, a place she dearly loved and was unable to visit again after 2019, along with the pier. These are both places we walked together so much back when I first lived here in what feels like another life.

Then, two nights before mum passed away, but well after she’d seemed already beyond reach of anyone, although perhaps (I’d like to think) still able to hear things, my sister contacted me from her bedside asking if I’d like to read mum a psalm, perhaps one of her favourites, 23 or 91. At first I thought she was already planning the funeral, and expressed my surprise at this since mum was still alive and right next to her. But I’d misunderstood, and she’d in fact had a rather great idea. This suggestion turned into several hours of, having sent on recordings of the two psalms, my digging into the poetry shelf in the study and discovering long neglected collections through which I searched (sometimes accompanied by my wife and son) for additional things to read. I recorded some and sent them along, as well as one from my son, I’m delighted to say. Later, the whole thing turned into me singing various songs while playing my guitar and sending recordings of those along too.

Incidentally, the guitar-playing was an interesting turn of events since not many months ago I decided after a long lapse to start playing guitar again, and try to move the standard of my playing (for vocal accompaniment) to a higher level than I’d previously done, by playing and practicing for a little bit on a regular basis. I distinctly recall thinking at one point during one practice that it would be nice to play for mum, although I did not imagine that playing to her while she was on her actual death-bed would be the circumstance under which I’d eventually play for her, having (to my memory) never directly done so back when I used to play guitar in my youth. (Her overhearing me picking out bits of Queen songs behind my room door when I was a teenager doesn’t count as direct playing for her.)

Due to family circumstances I’ll perhaps go into another time... Click to continue reading this post

The post A Long Goodbye appeared first on Asymptotia.

December 17, 2024

Andrew JaffeDiscovering Japan

My old friend Marc Weidenbaum, curator and writer of disquiet.com, reminded me, in his latest post, of the value of blogging. So, here I am (again).

Since September, I have been on sabbatical in Japan, working mostly at QUP (International Center for Quantum-field Measurement Systems for Studies of the Universe and Particles) at the KEK accelerator lab in Tsukuba, Japan, and spending time as well at the Kavli IPMU, about halfway into Tokyo from here. Tsukuba is a “science city” about 30 miles northeast of Tokyo, home to multiple Japanese scientific establishments (such as a University and a major lab for JAXA, the Japanese space agency).

Scientifically, I’ve spent a lot of time thinking and talking about the topology of the Universe, future experiments to measure the cosmic microwave background, and statistical tools for cosmology experiments. And I was honoured to be asked to deliver a set of lectures on probability and statistics in cosmology, a topic which unites most of my research interests nowadays.

Japan, and Tsukuba in particular, is a very nice place to live. It’s close enough to Tokyo for regular visits (by the rapid Tsukuba Express rail line), but quiet enough for our local transport to be dominated by cycling around town. We love the food, the Japanese schools that have welcomed our children, the onsens, and our many views of Mount Fuji.

Fuji with buildings

Fuji through windows

And after almost four months in Japan, it’s beginning to feel like home.

Unfortunately, we’re leaving our short-term home in Japan this week. After a few weeks of travel in Southeast Asia, we’ll be decamped to the New York area for the rest of the Winter and early Spring. But (as further encouragement to myself to continue blogging) I’ll have much more to say about Japan — science and life — in upcoming posts.

December 09, 2024

David Hoggpossible Trojan planet?

In group meeting last week, Stefan Rankovic (NYU undergrad) presented results on a very low-amplitude possible transit in the lightcurve of a candidate long-period eclipsing binary system found in the NASA Kepler data. The weird thing is that (even though the period is very long) the transit of the possible planet looks just like the transit of the secondary star in the eclipsing binary. Like just like it, only lower in amplitude (smaller in radius).

If the transit looks identical, only lower in amplitude, it suggests that it is taking an extremely similar chord across the primary star, at the same speed, with no difference in inclination. How could that be? Well if they are moving at the same speed on the same path, maybe we have a 1:1 resonance, like a Trojan? If so, there are so many cool things about this system. It was an exciting group meeting, to be sure.

December 01, 2024

Clifford JohnsonMagic Ingredients Exist!

I’m a baker, as you probably know. I’ve regularly made bread, cakes, pies, and all sorts of things for friends and family. About a year ago, someone in the family was diagnosed with a severe allergy to gluten, and within days we removed all gluten products from the kitchen, began … Click to continue reading this post

The post Magic Ingredients Exist! appeared first on Asymptotia.