Planet Musings

June 25, 2025

n-Category Café Counting with Categories (Part 1)

These are some lecture notes for a 4\frac{1}{2}-hour minicourse I’m teaching at the Summer School on Algebra at the Zografou campus of the National Technical University of Athens. To save time, I am omitting the pictures I’ll draw on the whiteboard, along with various jokes and profoundly insightful remarks. This is just the structure of the talk, with all the notation and calculations.

Long-time readers of the n-Category Café may find little new in this post. I’ve been meaning to write a sprawling book on combinatorics using categories, but here I’m trying to explain the use of species and illustrate them with a nontrivial example in less than 1.5 hours. That leaves three hours to go deeper.

Part 2 is here, and Part 3 is here.

Species and their generating functions

Combinatorics, or at least part of it, is the art of counting. For example: how many derangements does a set with nn elements have? A derangement is a bijection

f:SS f: S \to S

with no fixed points. We’ll count them in a while.

But what does counting have to do with category theory? The category of finite sets, FinSet\mathsf{FinSet}, has

  • finite sets as objects
  • functions between these as morphisms

What we count, ultimately, are finite sets. Any object S \in \mathsf{FinSet} has a cardinality |S| \in \mathbb{N}, so counting a finite set simplifies it to a natural number, and the key feature of this process is that

ST|S|=|T| S \cong T \iff |S| = |T|

We’ll count structures on finite sets. A ‘species’ is roughly a type of structure we can put on finite sets.

Example 1. There is a species of derangements, called DD. A DD-structure on a finite set SS is simply a derangement f:SS.f: S \to S. We write D(S)D(S) for the set of all derangements of S.S. We would like to know |D(S)||D(S)| for all SFinSet.S \in \mathsf{FinSet}.

Example 2. There is a species of permutations, called PP. A PP-structure on SFinSetS \in \mathsf{FinSet} is a bijection f:SSf: S \to S. We know

|P(S)|=|S|! |P(S)| = |S|!

We often use nn interchangeably for a natural number and a standard finite set with that many elements:

n={0,1,,n1}n = \{0,1, \dots, n-1 \}

With this notation

|P(n)|=n! |P(n)| = n!

But what exactly is a species?

Definition. Let E\mathsf{E} be the category where

  • an object is a finite set
  • a morphism is a bijection between finite sets

Definition. A species is a functor F:ESet.F: \mathsf{E} \to \mathsf{Set}. A tame species is a functor F:EFinSet.F: \mathsf{E} \to \mathsf{FinSet}.

Any tame species has a generating function, which is actually a formal power series F^[[x]],\widehat{F} \in \mathbb{R}[[x]], given by

F^= n0|F(n)|n!x n \displaystyle{ \widehat{F} = \sum_{n \ge 0} \frac{|F(n)|}{n!} x^n }

Example 3. We can compute the generating function of the species of permutations:

P^= n0n!n!x n=11x \displaystyle{ \widehat{P} = \sum_{n \ge 0} \frac{n!}{n!} x^n = \frac{1}{1 - x} }

Example 4. Given two species F and G there is a species F \cdot G defined as follows. To put an F \cdot G-structure on a finite set S is to choose a subset T \subseteq S and put an F-structure on T and a G-structure on S - T. We thus have

\widehat{F \cdot G} = \sum_{n \ge 0} \frac{|(F \cdot G)(n)|}{n!} x^n

= \sum_{n \ge 0} \sum_{0 \le k \le n} \binom{n}{k} |F(k)| |G(n-k)| \frac{x^n}{n!}

= \sum_{n \ge 0} \sum_{0 \le k \le n} \frac{|F(k)|}{k!} \frac{|G(n-k)|}{(n-k)!} x^n

= \sum_{k \ge 0} \sum_{\ell \ge 0} \frac{|F(k)|}{k!} \frac{|G(\ell)|}{\ell!} x^k x^\ell

= \widehat{F} \, \widehat{G}
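
Here is one way to check this identity numerically: a small Python sketch using sympy, taking F = P and G = Exp as a test case (the variable names below are just for illustration).

    # Compare the convolution formula for |(F.G)(n)| with the coefficients
    # of the product of the two generating functions, for F = P and G = Exp.
    from math import comb, factorial
    import sympy as sp

    x = sp.symbols('x')
    N = 8  # number of coefficients to compare

    F = [factorial(n) for n in range(N)]   # |P(n)| = n!
    G = [1] * N                            # |Exp(n)| = 1

    # |(F.G)(n)| = sum_k C(n,k) |F(k)| |G(n-k)|
    FG = [sum(comb(n, k) * F[k] * G[n - k] for k in range(n + 1)) for n in range(N)]

    # truncated generating functions
    F_hat = sum(sp.Rational(F[n], factorial(n)) * x**n for n in range(N))
    G_hat = sum(sp.Rational(G[n], factorial(n)) * x**n for n in range(N))
    product = sp.expand(F_hat * G_hat)

    for n in range(N):
        assert product.coeff(x, n) == sp.Rational(FG[n], factorial(n))
    print("generating function of F.G matches the product up to order", N - 1)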

Example 5. There is a species Exp\text{Exp} such that every finite set has a unique Exp\text{Exp}-structure! We thus have

|Exp(S)|=1 |\text{Exp}(S)| = 1

for all SFinSetS \in \mathsf{FinSet}, so

Exp^= n01n!x n=expx \widehat{Exp} = \sum_{n \ge 0} \frac{1}{n!} x^n = \exp x

That’s why we call this boring structure an Exp\text{Exp}-structure. I like to call Exp\text{Exp} being a finite set. Every finite set has the structure of being a finite set in exactly one way.

Example 6. Recall that a DD-structure on a finite set is a derangement of that finite set. To choose a permutation f:SSf: S \to S of a finite set SS is the same as to choose a subset TST \subset S, which will be the set of fixed points of ff, and to choose a derangement of STS - T. Thus by Example 4 we have

PExpD P \cong \text{Exp} \cdot D

and also

\widehat{P} = \widehat{\text{Exp}} \cdot \widehat{D}

so by Example 3 and Example 5 we have

\frac{1}{1 - x} = \exp(x) \, \widehat{D}

so

\widehat{D} = \frac{e^{-x}}{1 - x}

or

n0|D(n)|n!x n=e x1x \sum_{n \ge 0} \frac{|D(n)|}{n!} x^n = \frac{e^{-x}}{1 - x}

=(1x+x 22!x 33!+x 44!)(1+x+x 2+x 3+x 4+) = \left(1 - x + \frac{x^2}{2!} - \frac{x^3}{3!} + \frac{x^4}{4!} - \cdots \right)(1 + x + x^2 + x^3 + x^4 + \cdots )

From this it’s easy to work out |D(n)||D(n)|. I’ll do the example of n=5n = 5. If you think about the coefficient of x 5x^5 in the above product, you’ll see it’s

11+12!13!+14!15! 1 - 1 + \frac{1}{2!} - \frac{1}{3!} + \frac{1}{4!} - \frac{1}{5!}

So we must have

|D(5)|5!=12!13!+14!15! \frac{|D(5)|}{5!} = \frac{1}{2!} - \frac{1}{3!} + \frac{1}{4!} - \frac{1}{5!}

or

|D(5)| = 5!(12!13!+14!15!) = 34545+51 = 6020+51 = 44 \begin{array}{ccl} |D(5)| &=& 5! \left( \frac{1}{2!} - \frac{1}{3!} + \frac{1}{4!} - \frac{1}{5!} \right) \\ &=& 3 \cdot 4 \cdot 5 - 4 \cdot 5 + 5 - 1 \\ &=& 60 - 20 + 5 - 1 \\ &=& 44 \end{array}

In general we see

|D(n)|=n!(12!13!+14!+(1) nn!) |D(n)| = n! \left( \frac{1}{2!} - \frac{1}{3!} + \frac{1}{4!} - \cdots + \frac{(-1)^{n} }{n!} \right)
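
Here is a brute-force check of this formula for small n, as a short Python sketch (the function names are just for illustration; the sum below starts at k = 0, which agrees with the alternating sum above since 1 - 1 = 0).

    from itertools import permutations
    from math import factorial

    def count_derangements(n):
        # count bijections of {0,...,n-1} with no fixed points directly
        return sum(all(f[i] != i for i in range(n)) for f in permutations(range(n)))

    def formula(n):
        # n! * (1 - 1/1! + 1/2! - ... + (-1)^n/n!)
        return round(factorial(n) * sum((-1)**k / factorial(k) for k in range(n + 1)))

    for n in range(8):
        assert count_derangements(n) == formula(n)
    print([count_derangements(n) for n in range(8)])  # [1, 0, 1, 2, 9, 44, 265, 1854]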

But we can go further with this… and I’ll ask you to go further in this homework:

Exercise 1. Do the problems here:

The category of species

You’ll notice that above I said there was an isomorphism

PExpDP \cong \text{Exp} \cdot D

without ever defining an isomorphism of species! So let’s do that. In fact there’s a category of species.

I said a species is a functor

F:ESet F : \mathsf{E} \to \mathsf{Set}

where E\mathsf{E} is the category of finite sets and bijections. But what’s a morphism between species?

It’s easy to guess if you know what goes between functors: it’s a natural transformation.

Definition. For any categories C\mathsf{C} and D\mathsf{D}, let the functor category D C\mathsf{D}^\mathsf{C} be the category where

  • an object is a functor F: \mathsf{C} \to \mathsf{D}
  • a morphism is a natural transformation \alpha : F \Rightarrow G.

(We write this with double arrows since a functor was already a kind of arrow, and now we’re talking about an arrow between arrows.)

Definition. The category of species is Set E\mathsf{Set}^\mathsf{E}. The category of tame species is FinSet E\mathsf{FinSet}^\mathsf{E}.

Two tame species can have the same generating function but not be isomorphic! You can check this in these exercises:

Exercise 2. In Example 2 we began defining the species of permutations, P. We said that for any object S \in \mathsf{E}, P(S) is the set of permutations of S. But to make P into a functor we also need to say what it does on morphisms of \mathsf{E}. That is, given a bijection f: S \to S' and a permutation g \in P(S), we need a way to get a permutation P(f)(g): S' \to S'. Figure out the details and then show that P : \mathsf{E} \to \mathsf{Set} obeys the definition of a functor:

P(fg)=P(f)P(g) P(f \circ g) = P(f) \circ P(g)

for any composable pair of bijections g:SS,f:SSg: S \to S', f: S' \to S'', and

P(1 S)=1 P(S) P(1_S) = 1_{P(S)}

Exercise 3. Show that there is a species LL, the species of linear orderings, such that an LL-structure on SFinSetS \in \mathsf{FinSet} is a linear ordering on SS. In other words: first let L(S)L(S) be the set of linear orderings. Then for any bijection f:SSf: S \to S' define a map L(f):L(S)L(S)L(f): L(S) \to L(S') that sends linear orderings of SS into linear orderings of SS'. Then show that L:ESetL: \mathsf{E} \to \mathsf{Set} obeys the definition of a functor.
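
If you would like a hint for these two exercises, here is a sketch of one natural way to transport a structure along a bijection, with a permutation transported by conjugation (the encoding of bijections as Python dictionaries is just for illustration).

    # Transport a permutation g of S along a bijection f: S -> S',
    # giving the permutation f o g o f^{-1} of S'.
    def transport_permutation(f, g):
        f_inv = {v: k for k, v in f.items()}
        return {s2: f[g[f_inv[s2]]] for s2 in f.values()}

    f = {'a': 0, 'b': 1, 'c': 2}        # a bijection S -> S'
    g = {'a': 'b', 'b': 'c', 'c': 'a'}  # a permutation of S
    print(transport_permutation(f, g))  # {0: 1, 1: 2, 2: 0}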

Exercise 4. Now show there is no natural isomorphism α:PL\alpha : P \to L. However we have

|P(S)|=|L(S)|=|S|! |P(S)| = |L(S)| = |S|!

for any finite set SS, so PP and LL have the same generating function:

P^=L^ \widehat{P} = \widehat{L}

Thus, you’ve found nonisomorphic tame species with the same generating function!

This may make you sad, because you might hope that you could tell whether two species were isomorphic just by looking at their generating function. But if that were true, species would contain no more information than formal power series. In fact they contain more! Species are like an improved version of formal power series. And there’s not just a set of them, there’s a category of them.

n-Category Café Counting with Categories (Part 2)

Here’s my second set of lecture notes for a 4\frac{1}{2}-hour minicourse at the Summer School on Algebra at the Zografou campus of the National Technical University of Athens.

Part 1 was here. Part 3 is here.

The 2-rig of species

We’ve seen that the category of tame species FinSet E\mathsf{FinSet}^{\mathsf{E}} closely resembles the ring of formal power series [[x]]\mathbb{R}[[x]], but it’s richer because two nonisomorphic tame species can have the same generating function. To dig deeper we need to understand this analogy better. But what do I mean exactly?

For starters, we can add and multiply species:

Addition. Given species F,G:ESetF, G : \mathsf{E} \to \mathsf{Set}, they have a sum F+GF + G with

(F+G)(S)=F(S)+G(S) (F + G)(S) = F(S) + G(S)

where at right the plus sign means the disjoint union, or coproduct, of sets. Here I’ve said what F+GF + G does to objects SFinSetS \in \mathsf{FinSet}, but it does something analogous to morphisms — figure it out and check that it makes F+GF + G into a functor! If FF and GG are tame, so is F+GF + G, and

F+G^=F^+G^ \widehat{F + G} = \widehat{F} + \widehat{G}

Multiplication. Given species F,G:ESetF, G : \mathsf{E} \to \mathsf{Set}, I said last time that we can multiply them and get a species FGF \cdot G with

(FG)(S)= XSF(X)×G(SX) (F \cdot G)(S) = \sum_{X \subseteq S} F(X) \times G(S - X)

Again, I’ll leave it to you to guess what FGF \cdot G does to morphisms and check that FGF \cdot G is then a functor. If FF and GG are tame, so is FGF \cdot G, and

FG^=F^G^ \widehat{F \cdot G} = \widehat{F} \cdot \widehat{G}
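
If all you care about is counting, a tame species can be boiled down to its counting sequence |F(0)|, |F(1)|, … (though, as Exercise 4 shows, this loses information). Here is a toy Python sketch of sum and Cauchy product at that level; the helper names are just for illustration.

    from math import comb, factorial

    def species_sum(F, G):
        # |(F+G)(n)| = |F(n)| + |G(n)|
        return [f + g for f, g in zip(F, G)]

    def cauchy_product(F, G):
        # |(F.G)(n)| = sum_k C(n,k) |F(k)| |G(n-k)|
        N = min(len(F), len(G))
        return [sum(comb(n, k) * F[k] * G[n - k] for k in range(n + 1)) for n in range(N)]

    Exp = [1] * 8                         # |Exp(n)| = 1
    P = [factorial(n) for n in range(8)]  # |P(n)| = n!
    print(cauchy_product(Exp, Exp))       # [1, 2, 4, 8, ...]: an Exp.Exp-structure is just a choice of subset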

Exercise 5. If you know about coproducts, you can show that F+GF + G is the coproduct of FF and GG in the category of species, Set E\mathsf{Set}^\mathsf{E}, which we defined last time.

Exercise 6. If you know about products, you can show that FGF \cdot G is not the product of FF and GG in the category of species. Thus, we often call it the Cauchy product. But if you know about symmetric monoidal categories, you can with some significant work show that the Cauchy product makes Set E\mathsf{Set}^\mathsf{E} into a symmetric monoidal category. For starters, this means that it’s associative and commutative up to isomorphism—but it means more than just that. (If you know about Day convolution, this can speed up your work.)

Now let’s think about addition and the Cauchy product together. We can check that

F(G+H)FG+FH F \cdot (G + H) \cong F \cdot G + F \cdot H

and indeed addition and the Cauchy product give the category of species a structure much like that of a ring! It’s even more like a ‘rig’, which is a ‘ring without negatives’, since we can add and multiply species, but not subtract them. But all the rig laws hold, not as equations, but as natural isomorphisms.

In fact the binary coproduct is just a special case of something called a colimit, defined by a more general universal property. And the Cauchy product distributes over all colimits! We thus say the category of species is a ‘symmetric 2-rig’.

A bit more precisely:

Definition. A 2-rig is a monoidal category (R,,I)(\mathsf{R}, \otimes, I) with all colimits, such that the tensor product \otimes distributes over colimits in each argument. That is, if D\mathsf{D} is any small category and F:DRF: \mathsf{D} \to \mathsf{R} is any functor, for any object rRr \in \mathsf{R}, the natural morphisms

colim iD(rF(i))r(colim iDF(i)) \text{colim}_{i \in \mathsf{D}} \left( r \otimes F(i) \right) \quad \longrightarrow \quad r \otimes \left(\text{colim}_{i \in \mathsf{D}} F(i) \right)

colim iD(F(i)r)(colim iDF(i))r \text{colim}_{i \in \mathsf{D}} \left( F(i) \otimes r \right) \quad \longrightarrow \quad \left( \text{colim}_{i \in \mathsf{D}} F(i) \right) \otimes r

are isomorphisms.

Definition. A symmetric 2-rig is a 2-rig whose underlying monoidal category is a symmetric monoidal category.

One can work through the details of these definitions and show the category of species is a symmetric 2-rig. But something vastly better is true. It’s very similar to the ring of polynomials in one variable!

Why is the ring [x]\mathbb{Z}[x], the ring of polynomials in one variable with integer coefficients, so important in mathematics? Because it’s the free ring on one generator! That is, given any ring RR and any element rR,r \in R, there’s a unique ring homomorphism

f:[x]R f: \mathbb{Z}[x] \to R

with

f(x)=rf(x) = r

To see this, note that the homomorphism rules force that for any P[x]P \in \mathbb{Z}[x] we have

f(P)=P(r) f(P) = P(r)

By the way, [x]\mathbb{Z}[x] is also the free commutative ring on one generator, and the 2-rig of species is similar: it’s the free symmetric 2-rig on one object XX. But what is this object XX?

XX is a particular species—a structure you can put on finite sets. It’s the structure of ‘being a 1-element set’. What I mean is this: it’s the structure that you can put on a finite set SS in a unique way if SS has one element, and not at all if SS has any other number of elements. In other words:

X(S)={1 if |S|=1 if |S|1 X(S) = \left\{ \begin{array}{ccl} 1 & \text{if} & |S| = 1 \\ \emptyset & \text{if} & |S| \ne 1 \end{array} \right.

The name XX is appropriate because the generating function of this species is xx:

X^= n0|X(n)|n!x n=x \widehat{X} = \sum_{n \ge 0} \frac{|X(n)|}{n!} x^n = x

Here’s the big theorem, which I won’t prove here:

Theorem. The symmetric 2-rig of species, Set E\mathsf{Set}^\mathsf{E}, is the free symmetric 2-rig on one generator. That is, for any symmetric 2-rig R\mathsf{R} and any object rRr \in \mathsf{R}, there exists a map of 2-rigs

f:Set ER f: \mathsf{Set}^\mathsf{E} \to \mathsf{R}

such that f(X)=rf(X) = r, and ff is unique up to natural isomorphism.

This result has a baby brother, too. First, note that the free ring on no generators is \mathbb{Z}. That is, \mathbb{Z} is the initial ring: for any ring R there exists a unique ring homomorphism

f:R f: \mathbb{Z} \to R

\mathbb{Z} is also the free commutative ring on no generators. This should make us curious about the free symmetric 2-rig on no generators. This is the category of sets, with its cartesian product as the symmetric monoidal structure.

Theorem. The symmetric 2-rig of sets, \mathsf{Set}, is the initial symmetric 2-rig. That is, for any symmetric 2-rig \mathsf{R} there exists a map of 2-rigs

f:SetR f: \mathsf{Set} \to \mathsf{R}

which is unique up to natural isomorphism.

So we have a wonderful analogy:

The category of sets is the free symmetric 2-rig on no generators, just as \mathbb{Z} is the free commutative ring on no generators.

The category of species is the free symmetric 2-rig on one generator, just as [x]\mathbb{Z}[x] is the free commutative ring on one generator.

This suggests that there’s a lot more one can do to ‘categorify’ ring theory — that is, to take ideas from ring theory and develop analogous ideas in 2-rig theory. And even better, it shows that a lot of combinatorics is categorified ring theory!

Counting binary trees

But let’s see how we can use this 2-rig business to count things. Let’s make up a species B where a B-structure on a finite set S is a way of making it into the leaves of a rooted planar binary tree. More precisely, it’s a bijection between S and the set of leaves of some rooted planar binary tree.

Here are some rooted planar binary trees:

There’s 1 rooted planar binary tree with 1 leaf, 1 with 2, 2 with 3, 5 with 4… and so on.

I’ll draw lots of pictures to explain these trees and their leaves, but the quickest definition of BB is recursive, involving no pictures.

To put a BB structure on a finite set SS, we either

  • put an XX-structure on it

or

  • partition it into two parts TT and STS - T and put a BB-structure on each part.

Remember, an XX-structure on SS is the structure of ‘being a one-element set’. This handles the case of the binary tree with just one leaf. All other binary trees consist of two binary trees glued together at their root.

We can state this recursive definition as an equation:

B(S) = X(S) + \sum_{T \subseteq S} B(T) \times B(S - T)

and we can express this more efficiently using the sum and Cauchy product of species:

B=X+BB B = X + B \cdot B

We can now use this to count BB-structures on a finite set! By taking generating functions we get

B^=x+B^ 2 \widehat{B} = x + \widehat{B}^2

or

B^ 2B^+x=0 \widehat{B}^2 - \widehat{B} + x = 0

This is a quadratic equation for the formal power series B^\widehat{B}, which we can solve using the quadratic formula!

B^=1±14x2 \widehat{B} = \frac{1 \pm \sqrt{1 - 4 x}}{2}

Since the coefficients of a generating function can’t be negative, we must have

B^=114x2 \widehat{B} = \frac{1 - \sqrt{1 - 4 x}}{2}

since the other sign choice gives a term proportional to xx with a negative coefficient. Using a computer we can see

114x2=x+x 2+2x 3+5x 4+14x 5+42x 6+132x 7+429x 8+ \frac{1 - \sqrt{1 - 4 x}}{2} = x + x^2 + 2 x^3 + 5 x^4 + 14 x^5 + 42 x^6 + 132 x^7 + 429 x^8 + \cdots
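
For instance, a few lines of Python with sympy reproduce this expansion (a sketch):

    import sympy as sp

    x = sp.symbols('x')
    B_hat = (1 - sp.sqrt(1 - 4 * x)) / 2
    print(sp.series(B_hat, x, 0, 9))
    # x + x**2 + 2*x**3 + 5*x**4 + 14*x**5 + 42*x**6 + 132*x**7 + 429*x**8 + O(x**9)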

Remember

B^= n0|B(n)|n!x n \widehat{B} = \sum_{n \ge 0} \frac{|B(n)|}{n!} x^n

so we get

|B(0)|0!=0 \frac{|B(0)|}{0!} = 0

|B(1)|1!=1 \frac{|B(1)|}{1!} = 1

|B(2)|2!=1 \frac{|B(2)|}{2!} = 1

|B(3)|3!=2 \frac{|B(3)|}{3!} = 2

|B(4)|4!=5 \frac{|B(4)|}{4!} = 5

|B(5)|5!=14 \frac{|B(5)|}{5!} = 14

and so on. The factorials arise because there are n!n! ways to label the leaves of a planar binary tree with nn leaves by elements of {0,,n1}\{0, \dots, n-1\} . The interesting part is the sequence

0,1,1,2,5,14, 0, 1, 1, 2, 5, 14, \dots

These give the number of planar binary trees with nn leaves! They’re called the Catalan numbers.
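
We can also get these numbers straight from the species equation B = X + B \cdot B, without ever solving for the generating function. Here is a short Python sketch; the terms with T = \emptyset or T = S drop out because |B(0)| = 0.

    # |B(n)| = [n = 1] + sum_{0 < k < n} C(n,k) |B(k)| |B(n-k)|,
    # then divide by n! to recover the Catalan numbers.
    from math import comb, factorial
    from functools import lru_cache

    @lru_cache(None)
    def B(n):
        if n == 0:
            return 0                 # no binary tree has zero leaves
        x_part = 1 if n == 1 else 0  # the X-structure: 'being a 1-element set'
        return x_part + sum(comb(n, k) * B(k) * B(n - k) for k in range(1, n))

    print([B(n) // factorial(n) for n in range(7)])  # [0, 1, 1, 2, 5, 14, 42]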

Exercise 7. Use your mastery of Taylor series to show that

114x2= n02 n1(2n3)!!n!x n \frac{1 - \sqrt{1 - 4 x}}{2} = \sum_{n \ge 0} \frac{2^{n-1} (2n-3)!!}{n!} x^n

so

|B(n)|n!=2 n1(2n3)!!n! \frac{|B(n)|}{n!} = \frac{2^{n-1} (2n-3)!!}{n!}

Let’s check this for n=4n = 4:

2 41(243)!!4!=85!!4!=85311234=5 \frac{2^{4-1} (2\cdot 4-3)!!}{4!} = \frac{8 \cdot 5!!}{4!} = \frac{8 \cdot 5 \cdot 3 \cdot 1}{1 \cdot 2 \cdot 3 \cdot 4} = 5

which matches how there are 5 rooted planar binary trees with 4 leaves!
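
And here is a quick machine check of the closed form in Exercise 7 for the first several n, as a sympy sketch (sympy’s factorial2 is the double factorial, with factorial2(-1) = 1):

    import sympy as sp

    x = sp.symbols('x')
    coeffs = sp.series((1 - sp.sqrt(1 - 4 * x)) / 2, x, 0, 10).removeO()

    for n in range(1, 10):
        closed_form = sp.Rational(2**(n - 1) * sp.factorial2(2 * n - 3), sp.factorial(n))
        assert coeffs.coeff(x, n) == closed_form
    print("closed form matches the series coefficients for n = 1, ..., 9")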

n-Category Café Counting with Categories (Part 3)

Here’s my third and final set of lecture notes for a 4\frac{1}{2}-hour minicourse at the Summer School on Algebra at the Zografou campus of the National Technical University of Athens.

Part 1 was here, and Part 2 was here.

Last time I began explaining how a chunk of combinatorics is categorified ring theory. Every structure you can put on finite sets is a species, and the category of species is the free symmetric 2-rig on one object, just as the polynomial ring [x]\mathbb{Z}[x] is the free commutative ring on one generator.

In fact it’s almost true that every species gives an element of [x]\mathbb{Z}[x]! You should think of mathematics as wanting this to be true. But it’s not quite true: in fact every tame species has a generating function, which is an element of [[x]]\mathbb{R}[[x]], a ring that’s a kind of ‘completion’ of [x]\mathbb{Z}[x]. There’s a lot to be said about the slippage here, and why it’s happening, but there’s not time for such abstract issues now. Instead, let’s see what we can do with this analogy:

Just as [x]\mathbb{Z}[x] is the free commutative ring on one generator, the category of species Set E\mathsf{Set}^\mathsf{E} is the free symmetric 2-rig on one generator.

Substitution of species

[x]\mathbb{Z}[x] is the free commutative ring on one generator, namely xx. Thus for any element P[x]P \in \mathbb{Z}[x], there’s a unique ring homomorphism

f:[x][x] f: \mathbb{Z}[x] \to \mathbb{Z}[x]

sending xx to PP:

f(x)=P f(x) = P

What is this homomorphism ff? It sends xx to PP, so it must send x 2x^2 to P 2P^2, and it must send x 2+3x+1x^2 + 3x + 1 to P 2+3P+1P^2 + 3P + 1, and so on. Indeed it sends any polynomial QQ to Q(P)Q(P), or in other words QPQ \circ P:

f(Q)=QP f(Q) = Q \circ P

So, we’re seeing that we can compose elements of [x]\mathbb{Z}[x], or substitute one in another.

The same thing works for species! Since the category of species, Set E\mathsf{Set}^\mathsf{E}, is the free symmetric 2-rig on one generator XX, for any species PSet EP \in \mathsf{Set}^\mathsf{E} there’s a unique map of symmetric 2-rigs

F:Set ESet E F: \mathsf{Set}^\mathsf{E} \to \mathsf{Set}^\mathsf{E}

sending XX to PP:

F(X)=P F(X) = P

And following the pattern we saw for polynomials, we can say

F(Q)=QP F(Q) = Q \circ P

But this time we are defining the \circ operation by this formula, since we didn’t already know a way to compose species, or substitute one in another. But it’s very nice:

Theorem. If QQ and PP are species, to put a QPQ \circ P-structure on a finite set SS is to choose an unordered partition of SS into nonempty sets T 1,,T nT_1, \dots, T_n:

S = T_1 \cup \cdots \cup T_n , \qquad i \ne j \implies T_i \cap T_j = \emptyset

and put a Q-structure on the set of parts \{T_1, \dots, T_n\} and a P-structure on each part T_i. Moreover

QP^=Q^P^ \widehat{Q \circ P} = \widehat{Q} \circ \widehat{P}

if the constant term of P^[[x]]\widehat{P} \in \mathbb{R}[[x]] vanishes.

Let’s illustrate this with an easy example and then try a more interesting one.

Example 7. Remember from Example 5 that \text{Exp} is our name for the species of ‘being a finite set’: every finite set has this structure in exactly one way. We call it \text{Exp} because its generating function is the power series for the exponential function:

Exp^= n0x nn! \widehat{\text{Exp}} = \sum_{n \ge 0} \frac{x^n}{n!}

Let X 22!\frac{X^2}{2!} be the species being a 2-element set: every set with 2 elements has this structure in exactly one way, while a set with any other number of elements cannot have this structure. We give it this funny name because

X 22!^= n0a nn!x n \widehat{\frac{X^2}{2!}} = \sum_{n \ge 0} \frac{a_n}{n!} x^n

where

a n={1 if n=2 0 if n2 a_n = \left\{ \begin{array}{ccl} 1 & \text{ if } & n = 2 \\ 0 & \text{ if} & n \ne 2 \end{array} \right.

so

X 22!^=x 22 \widehat{\frac{X^2}{2!}} = \frac{x^2}{2}

Now let’s compose these species, and let

F=ExpX 22! F = \text{Exp} \circ \frac{X^2}{2!}

By the theorem, to put an FF-structure on SS is to partition SS into nonempty parts, put an Exp\Exp-structure on the set of parts, and put an X 22! \frac{X^2}{2!}-structure on each part. But there’s just one way to put an Exp\Exp-structure on the set of parts. So, putting an FF-structure on SS is the same as partitioning SS into parts of cardinality 22.

How many ways are there to do that? We could figure it out, but let’s use the theorem, which says

F^=Exp^X 22!^ \widehat{F} = \widehat{\text{Exp}} \circ \widehat{\frac{X^2}{2!}}

and thus

F^=e x 2/2 \widehat{F} = e^{x^2/2}

so

\begin{array}{ccl} \displaystyle{ \sum_{n \ge 0} \frac{|F(n)|}{n!} x^n } &=& e^{x^2/2} \\ \\ &=& \displaystyle{ 1 + (x^2/2) + \frac{(x^2/2)^2}{2!} + \frac{(x^2/2)^3}{3!} + \cdots } \\ \\ &=& \displaystyle{ \sum_{k \ge 0} \frac{1}{2^k k!} x^{2k} } \end{array}

So, we see that if n=2kn = 2k is even,

|F(n)|=n!2 kk! |F(n)| = \frac{n!}{2^k k!}

So this is the number of F-structures on an n-element set. We can check our work if we know a bit about symmetries and counting. The group S_n acts transitively on the set F(n) of ways to partition an n-element set into 2-element parts. The subgroup that fixes any such partition is the wreath product (\mathbb{Z}/2) \wr S_k, of order 2^k k!, since we can permute the k parts and swap the 2 elements within each part. Thus, by the orbit-stabilizer theorem,

|F(n)|=n!2 kk! |F(n)| = \frac{n!}{2^k k!}
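
Here is a direct brute-force check of this count for small n, as a Python sketch: pair the first element with each possible partner and recurse.

    from math import factorial

    def count_pairings(elements):
        # number of ways to partition `elements` into unordered 2-element parts
        if not elements:
            return 1
        first, *rest = elements
        return sum(count_pairings(rest[:i] + rest[i + 1:]) for i in range(len(rest)))

    for k in range(5):
        n = 2 * k
        assert count_pairings(list(range(n))) == factorial(n) // (2**k * factorial(k))
    print("matches n!/(2^k k!) for n = 0, 2, 4, 6, 8")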

The generating function method is nice because we can just turn the crank. And as we’ll see, we can use it to solve harder problems.

Example 8. How many ways can we partition an nn-element set into nonempty parts? This is called the nnth Bell number, b nb_n. Let’s see if we can calculate it!

Let Part\text{Part} be the species of partitions. I claim there are species Exp\text{Exp} and NE\text{NE} such that

PartExpNE \text{Part} \cong \text{Exp} \circ \text{NE}

\text{Exp} is our old friend: the species of ‘being a finite set’. \text{NE} is the species of ‘being a nonempty finite set’. To put an \text{Exp} \circ \text{NE}-structure on a finite set is to partition it into parts, put an \text{Exp}-structure on the set of parts, and put an \text{NE}-structure on each part. Since an \text{NE}-structure exists, uniquely, just when a part is nonempty, this is the same as a partition into nonempty parts. That gives the isomorphism above!

We know that

Exp^= n0x nn!=exp(x) \widehat{\text{Exp}} = \sum_{n \ge 0} \frac{x^n}{n!} = \exp(x)

and similarly

NE^= n1x nn!=exp(x)1 \widehat{\text{NE}} = \sum_{n \ge 1} \frac{x^n}{n!} = \exp(x) - 1

Thus by the theorem,

Part^=Exp^NE^=e e x1 \widehat{\text{Part}} = \widehat{\text{Exp}} \circ \widehat{\text{NE}} = e^{e^x - 1}

So by the definition of the Bell numbers,

n0b nn!x n=e e x1 \sum_{n \ge 0} \frac{b_n}{n!} x^n = e^{e^x - 1}

This is not quite an explicit formula for the Bell numbers, but it’s almost as good! Let’s use it to calculate the first few Bell numbers. We’ll work out the power series for e e x1e^{e^x - 1} only up to order x 3x^3:

e e x1 = exp(x+x 22+x 36+) = 1+(x+x 22+x 36+)+12(x+x 22+) 2+16x 3+ = 1+x+22!x 2+53!x 3+ \begin{array}{ccl} e^{e^x - 1} &=& \exp\left( x + \frac{x^2}{2} + \frac{x^3}{6} + \cdots \right) \\ \\ &=& 1 + \left( x + \frac{x^2}{2} + \frac{x^3}{6} + \cdots \right) + \frac{1}{2} \left( x + \frac{x^2}{2} + \cdots \right)^2 + \frac{1}{6} x^3 + \cdots \\ \\ &=& 1 + x + \frac{2}{2!} x^2 + \frac{5}{3!} x^3 + \cdots \end{array}

This gives

b 0=1,b 1=1,b 2=2,b 3=5 b_0 = 1, \quad b_1 = 1, \quad b_2 = 2, \quad b_3 = 5

which we can easily confirm. So it works!
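
Turning the crank by computer, a short sympy sketch extracts as many Bell numbers as you like from this generating function:

    import sympy as sp

    x = sp.symbols('x')
    series = sp.series(sp.exp(sp.exp(x) - 1), x, 0, 8).removeO()

    bell = [series.coeff(x, n) * sp.factorial(n) for n in range(8)]
    print(bell)  # [1, 1, 2, 5, 15, 52, 203, 877]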

There is a lot more one can do with species. Luckily there are some free books that will take you further:

Linear species

Finally, let me just conclude by mentioning a variant. We can define the category of linear species to be Vect E\mathsf{Vect}^\mathsf{E} where Vect\mathsf{Vect} is the category of complex vector spaces and linear maps. So, it’s just like the category of species but with Set\mathsf{Set} replaced by Vect\mathsf{Vect}.

It turns out that like the category of species, the category of linear species is a 2-rig. We call the addition \oplus because now, given two linear species F,G:EVectF, G : \mathsf{E} \to \mathsf{Vect}, we define

(F \oplus G)(S) = F(S) \oplus G(S)

But it’s still just the coproduct. The Cauchy product of linear species is defined by

(F \cdot G)(S) = \bigoplus_{X \subseteq S} F(X) \otimes G(S - X)

We can also compose linear species! Everything is very similar. But now it’s connected to the representation theory of the symmetric group, because the category E\mathsf{E} is equivalent to the coproduct of all the 1-object categories BS nB S_n corresponding to the symmetric groups, and a functor from BS nB S_n to Vect\mathsf{Vect} is the same as a representation of S nS_n. And because representations of the symmetric group are classified by Young diagrams, so are linear species.

I won’t go into this more now, but my friends Joe and Todd and I have studied this subject in great detail, and it connects many areas of mathematics in a delightful way:

For example, the second paper here relates linear species to the ring of symmetric functions, another popular and fundamental topic in combinatorics.

Scott Aaronson Guess I’m A Rationalist Now

A week ago I attended LessOnline, a rationalist blogging conference featuring many people I’ve known for years—Scott Alexander, Eliezer Yudkowsky, Zvi Mowshowitz, Sarah Constantin, Carl Feynman—as well as people I’ve known only online and was delighted to meet in person, like Joe Carlsmith and Jacob Falkovich and Daniel Reeves. The conference was at Lighthaven, a bewildering maze of passageways, meeting-rooms, sleeping quarters, gardens, and vines off Telegraph Avenue in Berkeley, which has recently emerged as the nerd Shangri-La, or Galt’s Gulch, or Shire, or whatever. I did two events at this year’s LessOnline: a conversation with Nate Soares about the Orthogonality Thesis, and an ask-me-anything session about quantum computing and theoretical computer science (no new ground there for regular consumers of my content).

What I’ll remember most from LessOnline is not the sessions, mine or others’, but the unending conversation among hundreds of people all over the grounds, which took place in parallel with the sessions and before and after them, from morning till night (and through the night, apparently, though I’ve gotten too old for that). It felt like a single conversational archipelago, the largest in which I’ve ever taken part, and the conference’s real point. (Attendees were exhorted, in the opening session, to skip as many sessions as possible in favor of intense small-group conversations—not only because it was better but also because the session rooms were too small.)

Within the conversational blob, just making my way from one building to another could take hours. My mean free path was approximately five feet, before someone would notice my nametag and stop me with a question. Here was my favorite opener:

“You’re Scott Aaronson?! The quantum physicist who’s always getting into arguments on the Internet, and who’s essentially always right, but who sustains an unreasonable amount of psychic damage in the process?”

“Yes,” I replied, not bothering to correct the “physicist” part.

One night, I walked up to Scott Alexander, who, sitting on the ground with his large bald head and a blanket he was using as a robe, resembled a monk. “Are you enjoying yourself?” he asked.

I replied, “You know, after all these years of being coy about it, I think I’m finally ready to become a Rationalist. Is there, like, an initiation ritual or something?”

Scott said, “Oh, you were already initiated a decade ago; you just didn’t realize it at the time.” Then he corrected himself: “two decades ago.”

The first thing I did, after coming out as a Rationalist, was to get into a heated argument with Other Scott A., Joe Carlsmith, and other fellow-Rationalists about the ideas I set out twelve years ago in my Ghost in the Quantum Turing Machine essay. Briefly, my argument was that the irreversibility and ephemerality of biological life, which contrasts with the copyability, rewindability, etc. of programs running on digital computers, and which can ultimately be traced back to microscopic details of the universe’s initial state, subject to the No-Cloning Theorem of quantum mechanics, which then get chaotically amplified during brain activity … might be a clue to a deeper layer of the world, one that we understand about as well as the ancient Greeks understood Newtonian physics, but which is the layer where mysteries like free will and consciousness will ultimately need to be addressed.

I got into this argument partly because it came up, but partly also because this seemed like the biggest conflict between my beliefs and the consensus of my fellow Rationalists. Maybe part of me wanted to demonstrate that my intellectual independence remained intact—sort of like a newspaper that gets bought out by a tycoon, and then immediately runs an investigation into the tycoon’s corruption, as well as his diaper fetish, just to prove it can.

The funny thing, though, is that all my beliefs are the same as they were before. I’m still a computer scientist, an academic, a straight-ticket Democratic voter, a liberal Zionist, a Jew, etc. (all identities, incidentally, well-enough represented at LessOnline that I don’t even think I was the unique attendee in the intersection of them all).

Given how much I resonate with what the Rationalists are trying to do, why did it take me so long to identify as one?

Firstly, while 15 years ago I shared the Rationalists’ interests, sensibility, and outlook, and their stances on most issues, I also found them bizarrely, inexplicably obsessed with the question of whether AI would soon become superhumanly powerful and change the basic conditions of life on earth, and with how to make the AI transition go well. Why that, as opposed to all the other sci-fi scenarios one could worry about, not to mention all the nearer-term risks to humanity?

Suffice it to say that empirical developments have since caused me to withdraw my objection. Sometimes weird people are weird merely because they see the future sooner than others. Indeed, it seems to me that the biggest thing the Rationalists got wrong about AI was to underestimate how soon the revolution would happen, and to overestimate how many new ideas would be needed for it (mostly, as we now know, it just took lots more compute and training data). Now that I, too, spend some of my time working on AI alignment, I was able to use LessOnline in part for research meetings with colleagues.

A second reason I didn’t identify with the Rationalists was cultural: they were, and are, centrally a bunch of twentysomethings who “work” at an ever-changing list of Berkeley- and San-Francisco-based “orgs” of their own invention, and who live in group houses where they explore their exotic sexualities, gender identities, and fetishes, sometimes with the aid of psychedelics. I, by contrast, am a straight, monogamous, middle-aged tenured professor, married to another such professor and raising two kids who go to normal schools. Hanging out with the Rationalists always makes me feel older and younger at the same time.

So what changed? For one thing, with the march of time, a significant fraction of Rationalists now have marriages, children, or both—indeed, a highlight of LessOnline was the many adorable toddlers running around the Lighthaven campus. Rationalists are successfully reproducing! Some because of explicit pronatalist ideology, or because they were persuaded by Bryan Caplan’s arguments in Selfish Reasons to Have More Kids. But others simply because of the same impulses that led their ancestors to do the same for eons. And perhaps because, like the Mormons or Amish or Orthodox Jews, but unlike typical secular urbanites, the Rationalists believe in something. For all their fears around AI, they don’t act doomy, but buzz with ideas about how to build a better world for the next generation.

At a LessOnline parenting session, hosted by Julia Wise, I was surrounded by parents who worry about the same things I do: how do we raise our kids to be independent and agentic yet socialized and reasonably well-behaved, technologically savvy yet not droolingly addicted to iPad games? What schooling options will let them accelerate in math, save them from the crushing monotony that we experienced? How much of our own lives should we sacrifice on the altar of our kids’ “enrichment,” versus trusting Judith Rich Harris that such efforts quickly hit a point of diminishing returns?

A third reason I didn’t identify with the Rationalists was, frankly, that they gave off some (not all) of the vibes of a cult, with Eliezer as guru. Eliezer writes in parables and koans. He teaches that the fate of life on earth hangs in the balance, that the select few who understand the stakes have the terrible burden of steering the future. Taking what Rationalists call the “outside view,” how good is the track record for this sort of thing?

OK, but what did I actually see at Lighthaven? I saw something that seemed to resemble a cult only insofar as the Beatniks, the Bloomsbury Group, the early Royal Society, or any other community that believed in something did. When Eliezer himself—the bearded, cap-wearing Moses who led the nerds from bondage to their Promised Land in Berkeley—showed up, he was argued with like anyone else. Eliezer has in any case largely passed his staff to a new generation: Nate Soares and Zvi Mowshowitz have found new and, in various ways, better ways of talking about AI risk; Scott Alexander has for the last decade written the blog that’s the community’s intellectual center; figures from Kelsey Piper to Jacob Falkovich to Aella have taken Rationalism in new directions, from mainstream political engagement to the … err … statistical analysis of orgies.

I’ll say this, though, on the naysayers’ side: it’s really hard to make dancing to AI-generated pop songs about Bayes’ theorem and Tarski’s definition of truth not feel cringe, as I can now attest from experience.

The cult thing brings me to the deepest reason I hesitated for so long to identify as a Rationalist: namely, I was scared that if I did, people whose approval I craved (including my academic colleagues, but also just randos on the Internet) would sneer at me. For years, I searched for some way of explaining this community’s appeal so reasonable that it would silence the sneers.

It took years of psychological struggle, and (frankly) solidifying my own place in the world, to follow the true path, which of course is not to give a shit what some haters think of my life choices. Consider: five years ago, it felt obvious to me that the entire Rationalist community might be about to implode, under existential threat from Cade Metz’s New York Times article, as well as RationalWiki and SneerClub and all the others laughing at the Rationalists and accusing them of every evil. Yet last week at LessOnline, I saw a community that’s never been thriving more, with a beautiful real-world campus, excellent writers on every topic who felt like this was the place to be, and even a crop of kids. How many of the sneerers are living such fulfilled lives? To judge from their own angry, depressed self-disclosures, probably not many.

But are the sneerers right that, even if the Rationalists are enjoying their own lives, they’re making other people’s lives miserable? Are they closet far-right monarchists, like Curtis Yarvin? I liked how The New Yorker put it in its recent, long and (to my mind) devastating profile of Yarvin:

The most generous engagement with Yarvin’s ideas has come from bloggers associated with the rationalist movement, which prides itself on weighing evidence for even seemingly far-fetched claims. Their formidable patience, however, has also worn thin. “He never addressed me as an equal, only as a brainwashed person,” Scott Aaronson, an eminent computer scientist, said of their conversations. “He seemed to think that if he just gave me one more reading assignment about happy slaves singing or one more monologue about F.D.R., I’d finally see the light.”

The closest to right-wing politics that I witnessed at LessOnline was a session, with Kelsey Piper and current and former congressional staffers, about the prospects for moderate Democrats to articulate a pro-abundance agenda that would resonate with the public and finally defeat MAGA.

But surely the Rationalists are incels, bitter that they can’t get laid? Again, the closest I saw was a session where Jacob Falkovich helped a standing-room-only crowd of mostly male nerds confront their fears around dating and understand women better, with Rationalist women eagerly volunteering to answer questions about their perspective. Gross, right? (Also, for those already in relationships, Eliezer’s primary consort and former couples therapist Gretta Duleba did a session on relationship conflict.)

So, yes, when it comes to the Rationalists, I’m going to believe my own lying eyes over the charges of the sneerers. The sneerers can even say about me, in their favorite formulation, that I’ve “gone mask off,” confirmed the horrible things they’ve always suspected. Yes, the mask is off—and beneath the mask is the same person I always was, who has an inordinate fondness for the Busy Beaver function and the complexity class BQP/qpoly, and who uses too many filler words and moves his hands too much, and who strongly supports the Enlightenment, and who once feared that his best shot at happiness in life would be to earn women’s pity rather than their contempt. Incorrectly, as I’m glad to report. From my nebbishy nadir to the present, a central thing that’s changed is that, from my family to my academic colleagues to the Rationalist community to my blog readers, I finally found some people who want what I have to sell.


Unrelated Announcements:

My replies to comments on this post might be light, as I’ll be accompanying my daughter on a school trip to the Galapagos Islands!

A few weeks ago, I was “ambushed” into leading a session on philosophy and theoretical computer science at UT Austin. (I.e., asked to show up for the session, but thought I’d just be a participant rather than the main event.) The session was then recorded and placed on YouTube—and surprisingly, given the circumstances, some people seemed to like it!

Friend-of-the-blog Alon Rosen has asked me to announce a call for nominations for a new theoretical computer science prize, in memory of my former professor (and fellow TCS blogger) Luca Trevisan, who was lost to the world too soon.

And one more: Mahdi Cheraghchi has asked me to announce the STOC’2025 online poster session, registration deadline June 12; see here for more. Incidentally, I’ll be at STOC in Prague to give a plenary on quantum algorithms; I look forward to meeting any readers who are there!

June 24, 2025

n-Category Café Tannaka Reconstruction and the Monoid of Matrices

You can classify representations of simple Lie groups using Dynkin diagrams, but you can also classify representations of ‘classical’ Lie groups using Young diagrams. Hermann Weyl wrote a whole book on this, The Classical Groups.

This approach is often treated as a bit outdated, since it doesn’t apply to all the simple Lie groups: it leaves out the so-called ‘exceptional’ groups. But what makes a group ‘classical’?

There’s no precise definition, but a classical group always has an obvious representation, you can get other representations by doing obvious things to this obvious one, and it turns out you can get all the representations this way.

For a long time I’ve been hoping to bring these ideas up to date using category theory. I had a bunch of conjectures, but I wasn’t able to prove any of them. Now Todd Trimble and I have made progress:

We tackle something even more classical than the classical groups: the monoid of n×nn \times n matrices, with matrix multiplication as its monoid operation.

The monoid of n×nn \times n matrices has an obvious nn-dimensional representation, and you can get all its representations from this one by operations that you can apply to any representation. So its category of representations is generated by this one obvious representation, in some sense. And it’s almost freely generated: there’s just one special relation. What’s that, you ask? It’s a relation saying the obvious representation is nn-dimensional!

That’s the basic idea. We need to make it more precise. We do it using the theory of 2-rigs, where for us a 2-rig is a symmetric monoidal linear category that is Cauchy complete. All the operations you can apply to any representation of a monoid are packed into this jargon.

Let’s write M(n,k)\text{M}(n,k) for the monoid of n×nn \times n matrices over a field kk, and Rep(M(n,k))\mathsf{Rep}(\text{M}(n,k)) for its 2-rig of representations. Then we want to say something like: Rep(M(n,k))\mathsf{Rep}(\text{M}(n,k)) is the free 2-rig on an object of dimension nn. That’s the kind of result I’ve been dreaming of.

To get this to be true, though, we need to say what kind of representations we’re talking about! Clearly we want finite-dimensional ones. But we need to be careful: we should only take finite-dimensional algebraic representations. Those are representations ρ:M(n,k)M(m,k)\rho: \text{M}(n,k) \to \text{M}(m,k) where the matrix entries of ρ(x)\rho(x) are polynomials in the matrix entries of xx. Otherwise, even the monoid of 1×11 \times 1 matrices gets lots of 1-dimensional representations coming from automorphisms of the field kk. Classifying those is a job for Galois theorists, not representation theorists.

So, we define \mathsf{Rep}(\text{M}(n,k)) to be the category of algebraic representations of the monoid \text{M}(n,k), and we want to say \mathsf{Rep}(\text{M}(n,k)) is the free 2-rig on an object of dimension n. But we need to say what it means for an object x of a 2-rig to have dimension n.

The definition that works is to demand that the (n+1)(n+1)st exterior power of xx should vanish:

Λ n+1(x)0. \Lambda^{n+1}(x) \cong 0 .

But this is true for any vector space of dimension less than or equal to nn. So in our paper we say xx has subdimension nn when this holds. (There’s another stronger condition for having dimension exactly nn, but interestingly this is not what we want here. You’ll see why shortly.)
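
Concretely, for an honest vector space V we have \dim \Lambda^k(V) = \binom{\dim V}{k}, so \Lambda^{n+1}(V) \cong 0 exactly when \dim V \le n. A one-line sanity check in Python:

    from math import comb
    # dim of the (n+1)st exterior power of an n-dimensional space: C(n, n+1) = 0
    print([comb(n, n + 1) for n in range(1, 6)])  # [0, 0, 0, 0, 0]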

So here’s the theorem we prove, with all the fine print filled in:

Theorem. Suppose kk is a field of characteristic zero and let Rep(M(n,k))\mathsf{Rep}(\text{M}(n,k)) be the 2-rig of algebraic representations of the monoid M(n,k)\text{M}(n,k). Then the representation of M(n,k)\text{M}(n,k) on x=k nx = k^n by matrix multiplication has subdimension nn. Moreover, Rep(M(n,k))\text{Rep}(\text{M}(n,k)) is the free 2-rig on an object of subdimension nn. In other words, suppose R\mathsf{R} is any 2-rig containing an object rr of subdimension nn. Then there is a map of 2-rigs,

F:Rep(M(n,k))R, F: \mathsf{Rep}(\text{M}(n,k)) \to \mathsf{R} ,

unique up to natural isomorphism, such that F(x)=rF(x) = r.

Or, in simple catchy terms: M(n,k)\text{M}(n,k) is the walking monoid with a representation of subdimension nn.

To prove this theorem we need to deploy some concepts.

First, the fact that we’re talking about algebraic representations means that we’re not really treating M(n,k)\text{M}(n,k) as a bare monoid (a monoid in the category of sets). Instead, we’re treating it as a monoid in the category of affine schemes. But monoids in affine schemes are equivalent to commutative bialgebras, and this is often a more practical way of working with them.

Second, we need to use Tannaka reconstruction. This tells you how to reconstruct a commutative bialgebra from a 2-rig (which is secretly its 2-rig of representations) together with a faithful 2-rig map to Vect\mathsf{Vect} (which secretly sends any representation to its underlying vector space).

We want to apply this to the free 2-rig on an object xx of subdimension nn. Luckily because of this universal property it automatically gets a 2-rig map to Vect\mathsf{Vect} sending xx to k nk^n. So we just have to show this map is faithful, apply Tannaka reconstruction, and get out the commutative bialgebra corresponding to M(n,k)\text{M}(n,k)!

Well, I say ‘just’, but it takes some real work. It turns out to be useful to bring in the free 2-rig on one object. The reason is that we studied the free 2-rig on one object in two previous papers, so we know a lot about it:

We can use this knowledge if we think of the free 2-rig on an object of subdimension nn as a quotient of the free 2-rig on one object by a ‘2-ideal’. To do this, we need to develop the theory of ‘2-ideals’. But that’s good anyway — it will be useful for many other things.

So that’s the basic plan of the paper. It was really great working with Todd on this, taking a rough conjecture and building all the machinery necessary to make it precise and prove it.

What about representations of classical groups like GL(n,k),SL(n,k)\text{GL}(n,k), \text{SL}(n,k), the orthogonal and symplectic groups, and so on? At the end of the paper we state a bunch of conjectures about these. Here’s the simplest one:

Conjecture. Suppose k is a field of characteristic zero and let \mathsf{Rep}(\text{GL}(n,k)) be the 2-rig of algebraic representations of \text{GL}(n,k). Then the representation of \text{GL}(n,k) on x = k^n by matrix multiplication has dimension n, meaning its nth exterior power has an inverse with respect to tensor product. Moreover, \mathsf{Rep}(\text{GL}(n,k)) is the free 2-rig on an object of dimension n.

This ‘inverse with respect to tensor product’ stuff is an abstract way of saying that the determinant representation det(g)\text{det}(g) of gGL(n)g \in \text{GL}(n) has an inverse, namely the representation det(g) 1\text{det}(g)^{-1}.

It will take new techniques to prove this. I look forward to people tackling this and our other conjectures. Categorified rig theory can shed new light on group representation theory, bringing Weyl’s beautiful ideas forward into the 21st century.

Clifford Johnson Super-Fun!

[Image: completed paper, with pencil]

In January 2024 I wrote a paper showing how to define the Supersymmetric Virasoro Minimal String* (SVMS) as a random matrix model, compute many of its properties, and indeed predict many aspects of its physics. This was the first time the SVMS had been constructed. Despite that, a recent paper found it necessary to specifically single out my paper disparagingly as somehow not being a string theory paper, in service of (of course) their own work trying to formulate it. Odd - and disappointingly unkind - behaviour. But I’m used to it.

Anyway, since it remains the case that there is no other working definition of the SVMS out there, I thought I’d revisit the matter, clean up some unpublished work of mine (defining the 0B version) and develop the whole formalism much more. Might be useful for people pursuing other approaches. What I thought would be at most a 10 page paper turned into a 19 page one, packed with lots of fun results.

In particular it is now clear to me how the type 0A vs 0B choices, usually done at the level of perturbative worldsheet CFT methods, show up fully at the level of matrix model string equation solutions. It is often said that random matrix model methods can rather obscure issues like worldsheet supersymmetry, making it unclear what structures pertain to what features in other approaches. That can be true, but these new observations clearly show that this is not always the case. (This is true quite generally, beyond this particular family of models.)

Also (and this is lots of fun!) I demonstrate that the basic loop observables of the SVMS ....


John Preskill Congratulations, class of 2025! Words from a new graduate

Editor’s note (Nicole Yunger Halpern): Jade LeSchack, the Quantum Steampunk Laboratory’s first undergraduate, received her bachelor’s degree from the University of Maryland this spring. Kermit the Frog presented the valedictory address, but Jade gave the following speech at the commencement ceremony for the university’s College of Mathematical and Natural Sciences. Jade heads to the University of Southern California for a PhD in physics this fall.

Good afternoon, everyone. My name is Jade, and it is my honor and pleasure to speak before you. 

Today, I’m graduating with my Bachelor of Science, but when I entered UMD, I had no idea what it meant to be a professional scientist or where my passion for quantum science would take me. I want you to picture where you were four years ago. Maybe you were following a long-held passion into college, or maybe you were excited to explore a new technical field. Since then, you’ve spent hours titrating solutions, debugging code, peering through microscopes, working out proofs, and all the other things our disciplines require of us. Now, we’re entering a world of uncertainty, infinite possibility, and lifelong connections. Let me elaborate on each of these.

First, there is uncertainty. Unlike simplified projectile motion, you can never predict the exact trajectory of your life or career. Plans will change, and unexpected opportunities will arise. Sometimes, the best path forward isn’t the one you first imagined. Our experiences at Maryland have prepared us to respond to the challenges and curveballs that life will throw at us. And, we’re going to get through the rough patches.

Second, let’s embrace the infinite possibilities ahead of us. While the concept of the multiverse is best left to the movies, it’s exciting to think about all the paths before us. We’ve each found our own special interests over the past four years here, but there’s always more to explore. Don’t put yourself in a box. You can be an artist and a scientist, an entrepreneur and a humanitarian, an athlete and a scholar. Continue to redefine yourself and be open to your infinite potential.

Third, as we move forward, we are equipped not only with knowledge but with connections. We’ve made lasting relationships with incredible people here. As we go from place to place, the people who we’re close to will change. But we’re lucky that, these days, people are only an email or phone call away. We’ll always have our UMD communities rooting for us.

Now, the people we met here are certainly not the only important ones. We’ve each had supporters along the various stages of our journeys. These are the people who championed us, made sacrifices for us, and gave us a shoulder to cry on. I’d like to take a moment to thank all my mentors, teachers, and friends for believing in me. To my mom, dad, and sister sitting up there, I couldn’t have done this without you. Thank you for your endless love and support. 

To close, I’d like to consider this age-old question that has always fascinated me: Is mathematics discovered or invented? People have made a strong case for each side. If we think about science in general, and our future contributions to our fields, we might ask ourselves: Are we discoverers or inventors? My answer is both! Everyone here with a cap on their head is going to contribute to both. We’re going to unearth new truths about nature and innovate scientific technologies that better society. This uncertain, multitudinous, and interconnected world is waiting for us, the next generation of scientific thinkers! So let’s be bold and stay fearless. 

Congratulations to the class of 2024 and the class of 2025! We did it!

Author’s note: I was deeply grateful for the opportunity to serve as the student speaker at my commencement ceremony. I hope that the science-y references tickle the layman and SME alike. You can view a recording of the speech here. I can’t wait for my next adventures in quantum physics!

Scott Aaronson Raymond Laflamme (1960-2025)

Even with everything happening in the Middle East right now, even with (relatedly) everything happening in my own family (my wife and son sheltering in Tel Aviv as Iranian missiles rained down), even with all the rather ill-timed travel I’ve found myself doing as these events unfolded (Ecuador and the Galapagos and now STOC’2025 in Prague) … there’s been another thing, a huge one, weighing on my soul.

Ray Laflamme played a major role in launching the whole field of quantum computing and information, and also a major role in launching my own career. The world has lost him too soon. I’ve lost him too soon.

After growing up in Quebec—I still hear his French-Canadian accent, constantly on the verge of laughter, as I’m writing this—Ray went into physics and became a PhD student of Stephen Hawking. No, not a different Stephen Hawking. If you’ve read or watched anything by or about Hawking, including A Brief History of Time, you might remember the story where Hawking believed for a while that time would reverse itself as the universe contracted in a Big Crunch, with omelettes unscrambling themselves, old people turning into children, etc. etc., but then two graduate students persuaded him that that was totally wrong, and entropy would continue to increase like normal. Anyway, Ray was one of those students (Don Page was the other). I’d always meant to ask Ray to explain what argument changed Hawking’s mind, since the idea of entropy decreasing during contraction just seemed obviously wrong to me! Only today, while writing this post, did I find a 1993 paper by Hawking, Laflamme, and Lyons that explains the matter perfectly clearly, including three fallacious intuitions that Hawking had previously held. (Even though, as they comment, “the anatomy of error is not ruled by logic.”)

Anyway, in the mid-1990s, starting at Los Alamos National Lab and continuing at the University of Waterloo, Ray became a pioneer of the then-new field of quantum computing and information. In 1997, he was a coauthor of one of the seminal original papers that proved the possibility of fault-tolerant quantum computation with a constant error rate, what we now call the Threshold Theorem (Aharonov and Ben-Or had such a result independently). He made lots of other key early contributions to the theory of quantum error-correcting codes and fault-tolerance.

When it comes to Ray’s scientific achievements after his cosmology work with Hawking and after quantum fault-tolerance—well, there are many, but let me talk about two. Perhaps the biggest is the KLM (Knill-Laflamme-Milburn) Theorem. It would be fair to say that KLM started the entire field of optical or photonic quantum computation, as it’s existed in the 21st century. In one sentence, what KLM showed is that it’s possible to build a universal quantum computer using only

  1. identical single-photon states,
  2. a network of “linear-optical elements” (that is, beamsplitters and phaseshifters) that the photons travel through, and
  3. feedforward measurements—that is, measurements of an optical mode that tell you how many photons are there, in such a way that you can condition (using a classical computer) which optical elements to apply next on the outcome of the measurement.

All of a sudden, there was a viable path to building a quantum computer out of photons, where you wouldn’t need to get pairs of photons to interact with each other, which had previously been the central sticking point. The key insight was that feedforward measurements, combined with the statistical properties of identical bosons (what the photons are), are enough to simulate the effect of two-photon interactions.

Have you heard of PsiQuantum, the startup in Palo Alto with a $6 billion valuation and hundreds of employees that’s right now trying to build an optical quantum computer with a million qubits? Or Xanadu, its competitor in Toronto? These, in some sense, are companies that grew out of a theorem: specifically the KLM Theorem.

For me, though, the significance of KLM goes beyond the practical. In 2011, I used the KLM Theorem, together with the fact (known since the 1950s) that photonic amplitudes are the permanents of matrices, to give a new proof of Leslie Valiant’s celebrated 1979 theorem that calculating the permanent is a #P-complete problem. Thus, as I pointed out in a talk two years ago at Ray’s COVID-delayed 60th birthday conference, entitled Ray Laflamme, Complexity Theorist (?!), KLM had said something new about computational complexity, without any intention of doing so. More generally, KLM was crucial backdrop to my and Alex Arkhipov’s later work on BosonSampling, where we gave strong evidence that some classical computational hardness—albeit probably not universal quantum computation—remains in linear optics, even if one gets rid of KLM’s feedforward measurements.
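
To make the “amplitudes are permanents” fact concrete, here is a tiny numpy check of my own (the 50/50 beamsplitter convention is one common choice, not anything from Scott’s post). The amplitude for two photons entering different ports of a beamsplitter to leave with one photon in each output port is the permanent of the 2×2 unitary; for a 50/50 beamsplitter that permanent vanishes, which is the famous Hong–Ou–Mandel effect.

    from itertools import permutations
    import numpy as np

    def permanent(M):
        """Brute-force permanent -- fine for tiny matrices."""
        n = M.shape[0]
        return sum(np.prod([M[i, p[i]] for i in range(n)])
                   for p in permutations(range(n)))

    BS = np.array([[1, 1],
                   [1, -1]]) / np.sqrt(2)   # one standard 50/50 beamsplitter

    print(permanent(BS))         # ~0: the two photons never exit in separate ports
    print(np.linalg.det(BS))     # -1: the fermionic (determinant) amplitude survives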

(Incidentally, I gave my talk at Ray’s birthday conference by Zoom, as I had a conflicting engagement. I’m now sad about that: had I known that that would’ve been my last chance to see Ray, I would’ve cancelled any other plans.)

The second achievement of Ray’s that I wanted to mention was his 1998 creation, again with his frequent collaborator Manny Knill, of the One Clean Qubit or “DQC1” model of quantum computation. In this model, you get to apply an arbitrary sequence of 2-qubit unitary gates, followed by measurements at the end, just like in standard quantum computing—but the catch is that the initial state consists of just a single qubit in the state |0⟩, and all other qubits in the maximally mixed state. If all qubits started in the maximally mixed state, then nothing would ever happen, because the maximally mixed state is left invariant by all unitary transformations. So it would stand to reason that, if all but one of the qubits start out maximally mixed, then almost nothing happens. The big surprise is that this is wrong. Instead you get a model that, while probably not universal for quantum computation, can do a variety of things in polynomial time that we don’t know how to do classically, including estimating the traces of exponentially large unitary matrices and the Jones polynomials of trace closures of braids (indeed, both of these problems turn out to be DQC1-complete). The discovery of DQC1 was one of the first indications that there’s substructure within BQP. Since then, the DQC1 model has turned up again and again in seemingly unrelated investigations in quantum complexity theory—way more than you’d have any right to expect a priori.
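
For the curious, here is a minimal numpy sketch (my own illustration, not anything from Ray’s papers or Scott’s post) of the standard DQC1 trace-estimation circuit: a Hadamard on the clean qubit, a controlled-U, and then the ⟨X⟩ and ⟨Y⟩ expectation values of the clean qubit give the real and imaginary parts of Tr(U)/2ⁿ. In a real DQC1 run the controlled-U would be compiled into two-qubit gates and the expectation values estimated from repeated measurements; here they are computed exactly from the density matrix.

    import numpy as np

    def dqc1_trace_estimate(U):
        """Exact density-matrix simulation of the DQC1 trace-estimation circuit."""
        dim = U.shape[0]
        # initial state: |0><0| on the clean qubit, maximally mixed on the rest
        rho = np.kron(np.array([[1, 0], [0, 0]]), np.eye(dim) / dim)
        # Hadamard on the clean qubit
        H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
        HI = np.kron(H, np.eye(dim))
        rho = HI @ rho @ HI.conj().T
        # controlled-U, with the clean qubit as control
        cU = np.block([[np.eye(dim), np.zeros((dim, dim))],
                       [np.zeros((dim, dim)), U]])
        rho = cU @ rho @ cU.conj().T
        # <X> and <Y> of the clean qubit give Re and Im of Tr(U)/dim
        X = np.array([[0, 1], [1, 0]])
        Y = np.array([[0, -1j], [1j, 0]])
        ex = np.real(np.trace(np.kron(X, np.eye(dim)) @ rho))
        ey = np.real(np.trace(np.kron(Y, np.eye(dim)) @ rho))
        return dim * (ex + 1j * ey)

    # sanity check on a random 3-qubit unitary built via QR decomposition
    rng = np.random.default_rng(0)
    M = rng.normal(size=(8, 8)) + 1j * rng.normal(size=(8, 8))
    Q, _ = np.linalg.qr(M)
    print(dqc1_trace_estimate(Q))   # should match np.trace(Q)
    print(np.trace(Q))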

Beyond his direct contributions to quantum information, Ray will be remembered as one of the great institution-builders of our field. He directed the Institute for Quantum Computing (IQC) at the University of Waterloo in Canada, from its founding in 2002 until he finally stepped down in 2017. This includes the years 2005-2007, when I was a postdoc at IQC—two of the most pivotal years of my life, when I first drove a car and went out on dates (neither of which I do any longer, for different reasons…), when I started this blog, when I worked on quantum money and learnability of quantum states and much more, and when I taught the course that turned into my book Quantum Computing Since Democritus. I fondly remember Ray, as my “boss,” showing me every possible kindness. He even personally attended the Quantum Computing Since Democritus lectures, which is why he appears as a character in the book.

As if that wasn’t enough, Ray also directed the quantum information program of the Canadian Institute for Advanced Research (CIFAR). If you ever wondered why Canada, as a nation, has punched so far above its weight in quantum computing and information for the past quarter-century—Ray Laflamme is part of the answer.

At the same time, if you imagine the stereotypical blankfaced university administrator, who thinks and talks only in generalities and platitudes (“how can we establish public-private partnerships to build a 21st-century quantum workforce?”) … well, Ray was whatever is the diametric opposite of that. Despite all his responsibilities, Ray never stopped being a mensch, a friend, an intellectually curious scientist, a truth-teller, and a jokester. Whenever he and I talked, probably at least a third of the conversation was raucous laughter.

I knew that Ray had spent many years battling cancer. I naïvely thought he was winning, or had won. But as so often with cancer, it looks like the victory was only temporary. I miss him already. He was a ray of light in the world—a ray that sparkles, illuminates, and as we now know, even has the latent power of universal quantum computation.

June 23, 2025

John PreskillA (quantum) complex legacy: Part trois

When I worked in Cambridge, Massachusetts, a friend reported that MIT’s postdoc association had asked its members how it could improve their lives. The friend confided his suggestion to me: throw more parties.1 This year grants his wish on a scale grander than any postdoc association could. The United Nations has designated 2025 as the International Year of Quantum Science and Technology (IYQ), as you’ve heard unless you live under a rock (or without media access—which, come to think of it, sounds not unappealing).

A metaphorical party cracker has been cracking since January. Governments, companies, and universities are trumpeting investments in quantum efforts. Institutions pulled out all the stops for World Quantum Day, which happens every April 14 but which scored a Google doodle this year. The American Physical Society (APS) suffused its Global Physics Summit in March with quantum science like a Bath & Body Works shop with the scent of Pink Pineapple Sunrise. At the summit, special symposia showcased quantum research, fellow blogger John Preskill dished about quantum-science history in a dinnertime speech, and a “quantum block party” took place one evening. I still couldn’t tell you what a quantum block party is, but this one involved glow sticks.

Google doodle from April 14, 2025

Attending the summit, I felt a satisfaction—an exultation, even—redolent of twelfth grade, when American teenagers summit the Mont Blanc of high school. It was the feeling that this year is our year. Pardon me while I hum “Time of your life.”2

Speakers and organizer of a Kavli Symposium, a special session dedicated to interdisciplinary quantum science, at the APS Global Physics Summit

Just before the summit, editors of the journal PRX Quantum released a special collection in honor of the IYQ.3 The collection showcases a range of advances, from chemistry to quantum error correction and from atoms to attosecond-length laser pulses. Collaborators and I contributed a paper about quantum complexity, a term that has as many meanings as companies have broadcast quantum news items within the past six months. But I’ve already published two Quantum Frontiers posts about complexity, and you surely study this blog as though it were the Bible, so we’re on the same page, right? 

Just joshing. 

Imagine you have a quantum computer that’s running a circuit. The computer consists of qubits, such as atoms or ions. They begin in a simple, “fresh” state, like a blank notebook. Post-circuit, they store quantum information, such as entanglement, as a notebook stores information post-semester. We say that the qubits are in some quantum state. The state’s quantum complexity is the least number of basic operations, such as quantum logic gates, needed to create that state—via the just-completed circuit or any other circuit.

Today’s quantum computers can’t create high-complexity states. The reason is, every quantum computer inhabits an environment that disturbs the qubits. Air molecules can bounce off them, for instance. Such disturbances corrupt the information stored in the qubits. Wait too long, and the environment will degrade too much of the information for the quantum computer to work. We call the threshold time the qubits’ lifetime, among more-obscure-sounding phrases. The lifetime limits the number of gates we can run per quantum circuit.

The ability to perform many quantum gates—to perform high-complexity operations—serves as a resource. Other quantities serve as resources, too, as you’ll know if you’re one of the three diehard Quantum Frontiers fans who’ve been reading this blog since 2014 (hi, Mom). Thermodynamic resources include work: coordinated energy that one can harness directly to perform a useful task, such as lifting a notebook or staying up late enough to find out what a quantum block party is. 

My collaborators: Jonas Haferkamp, Philippe Faist, Teja Kothakonda, Jens Eisert, and Anthony Munson (in an order of no significance here)

My collaborators and I showed that work trades off with complexity in information- and energy-processing tasks: the more quantum gates you can perform, the less work you have to spend on a task, and vice versa. Qubit reset exemplifies such tasks. Suppose you’ve filled a notebook with a calculation, you want to begin another calculation, and you have no more paper. You have to erase your notebook. Similarly, suppose you’ve completed a quantum computation and you want to run another quantum circuit. You have to reset your qubits to a fresh, simple state.

Three methods suggest themselves. First, you can “uncompute,” reversing every quantum gate you performed.4 This strategy requires a long lifetime: the information imprinted on the qubits by a gate mustn’t leak into the environment before you’ve undone the gate. 

Second, you can do the quantum equivalent of wielding a Pink Pearl Paper Mate: you can rub the information out of your qubits, regardless of the circuit you just performed. Thermodynamicists inventively call this strategy erasure. It requires thermodynamic work, just as applying a Paper Mate to a notebook does. 
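
For a sense of scale (this is the textbook Landauer bound, not a number from the post): erasing one (qu)bit’s worth of information at temperature T costs at least k_B T ln 2 of work, a few zeptojoules at room temperature.

    import math

    k_B = 1.380649e-23          # Boltzmann constant, J/K
    T = 300.0                   # room temperature in kelvin (my assumption)
    landauer = k_B * T * math.log(2)
    print(f"minimum work to erase one bit: {landauer:.2e} J")   # ~2.9e-21 J
    # Tiny per qubit, but it adds up over many qubits and many circuit resets.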

Third, you can combine the two strategies.

Suppose your qubits have finite lifetimes. You can undo as many gates as you have time to. Then, you can erase the rest of the qubits, spending work. How does complexity—your ability to perform many gates—trade off with work? My collaborators and I quantified the tradeoff in terms of an entropy we invented because the world didn’t have enough types of entropy.5

Complexity trades off with work not only in qubit reset, but also in data compression and likely other tasks. Quantum complexity, my collaborators and I showed, deserves a seat at the great soda fountain of quantum thermodynamics.

The great soda fountain of quantum thermodynamics

…as quantum information science deserves a seat at the great soda fountain of physics. When I embarked upon my PhD, faculty members advised me to undertake not only quantum-information research, but also some “real physics,” such as condensed matter. The latter would help convince physics departments that I was worth their money when I applied for faculty positions. By today, the tables have turned. A condensed-matter theorist I know has wound up an electrical-engineering professor because he calculates entanglement entropies.

So enjoy our year, fellow quantum scientists. Party like it’s 1925. Burnish those qubits—I hope they achieve the lifetimes of your life.

1Ten points if you can guess who the friend is.

2Whose official title, I didn’t realize until now, is “Good riddance.” My conception of graduation rituals has just turned a somersault. 

3PR stands for Physical Review, the brand of the journals published by the APS. The APS may have intended for the X to evoke exceptional, but I like to think it stands for something more exotic-sounding, like ex vita discedo, tanquam ex hospitio, non tanquam ex domo.

4Don’t ask me about the notebook analogue of uncomputing a quantum state. Explaining it would require another blog post.

5For more entropies inspired by quantum complexity, see this preprint. You might recognize two of the authors from earlier Quantum Frontiers posts if you’re one of the three…no, not even the three diehard Quantum Frontiers readers will recall; but trust me, two of the authors have received nods on this blog before.

June 22, 2025

Scott Aaronson Trump and Iran, by popular request

I posted this on my Facebook, but several friends asked me to share more widely, so here goes:

I voted against Trump three times, and donated thousands to his opponents. I’d still vote against him today, seeing him as a once-in-a-lifetime threat to American democracy and even to the Enlightenment itself.

But last night I was also grateful to him for overruling the isolationists and even open antisemites in his orbit, striking a blow against the most evil regime on the planet, and making it harder for that regime to build nuclear weapons. I acknowledge that his opponents, who I voted for, would’ve probably settled for a deal that would’ve resulted in Iran eventually getting nuclear weapons, and at any rate getting a flow of money to redirect to Hamas, Hezbollah, and the Houthis.

May last night’s events lead to the downfall of the murderous ayatollah regime altogether, and to the liberation of the Iranian people from 46 years of oppression. To my many, many Iranian friends: I hope all your loved ones stay safe, and I hope your great people soon sees better days. I say this as someone whose wife and 8-year-old son are right now in Tel Aviv, sheltering every night from Iranian missiles.

Fundamentally, I believe not only that evil exists in the world, but that it’s important to calibrate evil on a logarithmic scale. Trump (as I’ve written on this blog for a decade) terrifies me, infuriates me, and embarrasses me, and through his evisceration of American science and universities, has made my life noticeably worse. On the other hand, he won’t hang me from a crane for apostasy, nor will he send a ballistic missile to kill my wife and son and then praise God for delivering them into his hands.


Update: I received the following comment on this post, which filled me with hope, and demonstrated more moral courage than perhaps every other anonymous comment in this blog’s 20-year history combined. To this commenter and their friends and family, I wish safety and eventually, liberation from tyranny.

I will keep my name private for clear reasons. Thank you for your concern for Iranians’ safety and for wishing the mullah regime’s swift collapse. I have fled Tehran and I’m physically safe but mentally, I’m devastated by the war and the internet blackout (the pretext is that Israeli drones are using our internet). Speaking of what the mullahs have done, especially outrageous was the attack on the Weizmann Institute. I hope your wife and son remain safe from the missiles of the regime whose thugs have chased me and my friends in the streets and imprisoned my friends for simple dissent. All’s well that ends well, and I hope this all ends well.

June 21, 2025

Doug NatelsonBrief items - fresh perspectives, some news bits

As usual, I hope to write more about particular physics topics soon, but in the meantime I wanted to share a sampling of news items:
  • First, it's a pleasure to see new long-form writing about condensed matter subjects, in an era where science blogging has unquestionably shrunk compared to its heyday.  The new Quantum Matters substack by Justin Wilson (and William Shelton) looks like it will be a fun place to visit often.
  • Similar in spirit, I've also just learned about the Knowmads podcast (here on youtube), put out by Prachi Garella and Bhavay Tyagi, two doctoral students at the University of Houston.  Fun interviews with interesting scientists about their science and how they get it done.
  • There have been some additional news bits relevant to the present research funding/university-govt relations mess.  Earlier this week, 200 business leaders published an open letter about how slashing support for university research will seriously harm US economic competitiveness.  More of this, please.  I continue to be surprised by how quiet technology-related, pharma, and finance companies are being, at least in public.  Crushing US science and engineering university research will lead to serious personnel and IP shortages down the line, which would definitely be poor for US standing.  Again, now is the time to push back on legislators about cuts mooted in the presidential budget request.
  • The would-be 15% indirect cost rate at NSF has been found to be illegal, in a summary court judgment released yesterday.  (Brief article here, pdf of the ruling here.)
  • Along these lines, there are continued efforts for proposals about how to reform/alter indirect cost rates in a far less draconian manner.  These are backed by collective organizations like the AAU and COGR.  If you're interested in this, please go here, read the ideas, and give some feedback.  (Note for future reference:  the Joint Associations Group (JAG) may want to re-think their acronym.  In local slang where I grew up, the word "jag" does not have pleasant connotations.)
  • The punitive attempt to prevent Harvard from taking international students has also been stopped for now in the courts. 

June 20, 2025

John BaezPolarities (Part 6)

I’ve been working with Adittya Chaudhuri on some ideas related to this series of blog articles, and now our paper is done!

• John Baez and Adittya Chaudhuri, Graphs with polarities.

Abstract. In fields ranging from business to systems biology, directed graphs with edges labeled by signs are used to model systems in a simple way: the nodes represent entities of some sort, and an edge indicates that one entity directly affects another either positively or negatively. Multiplying the signs along a directed path of edges lets us determine indirect positive or negative effects, and if the path is a loop we call this a positive or negative feedback loop. Here we generalize this to graphs with edges labeled by a monoid, whose elements represent ‘polarities’ possibly more general than simply ‘positive’ or ‘negative’. We study three notions of morphism between graphs with labeled edges, each with its own distinctive application: to refine a simple graph into a complicated one, to transform a complicated graph into a simple one, and to find recurring patterns called ‘motifs’. We construct three corresponding symmetric monoidal double categories of ‘open’ graphs. We study feedback loops using a generalization of the homology of a graph to homology with coefficients in a commutative monoid. In particular, we describe the emergence of new feedback loops when we compose open graphs using a variant of the Mayer–Vietoris exact sequence for homology with coefficients in a commutative monoid.


Read the whole series:

Part 1: Causal loop diagrams, and more generally graphs with edge labeled by elements of a monoid.

Part 2: graphs with edges labeled by elements of a ring.

Part 3: hyperrings and hyperfields.

Part 4: rigs from hyperrings.

Part 5: pulling back and pushing forwards edge labels on labeled graphs.

Part 6: a paper called ‘Graphs with polarities’ with Adittya Chaudhuri, summarizing some of the work here but also much more.

Matt von HippelAmplitudes 2025 This Week

Summer is conference season for academics, and this week held my old sub-field’s big yearly conference, called Amplitudes. This year, it was in Seoul at Seoul National University, the first time the conference has been in Asia.

(I wasn’t there, I don’t go to these anymore. But I’ve been skimming slides in my free time, to give you folks the updates you crave. Be forewarned that conference posts like these get technical fast, I’ll be back to my usual accessible self next week.)

There isn’t a huge amplitudes community in Korea, but it’s bigger than it was back when I got started in the field. Of the organizers, Kanghoon Lee of the Asia Pacific Center for Theoretical Physics and Sangmin Lee of Seoul National University have what I think of as “core amplitudes interests”, like recursion relations and the double-copy. The other Korean organizers are from adjacent areas, work that overlaps with amplitudes but doesn’t show up at the conference each year. There was also a sizeable group of organizers from Taiwan, where there has been a significant amplitudes presence for some time now. I do wonder if Korea was chosen as a compromise between a conference hosted in Taiwan or in mainland China, where there is also quite a substantial amplitudes community.

One thing that impresses me every year is how big, and how sophisticated, the gravitational-wave community in amplitudes has grown. Federico Buccioni’s talk began with a plot that illustrates this well (though that wasn’t his goal):

At the conference Amplitudes, dedicated to the topic of scattering amplitudes, there were almost as many talks with the phrase “black hole” in the title as there were with “scattering” or “amplitudes”! This is for a topic that did not even exist in the subfield when I got my PhD eleven years ago.

With that said, gravitational wave astronomy wasn’t quite as dominant at the conference as Buccioni’s bar chart suggests. There were a few talks each day on the topic: I counted seven in total, excluding any short talks on the subject in the gong show. Spinning black holes were a significant focus, central to Jung-Wook Kim’s, Andres Luna’s and Mao Zeng’s talks (the latter two showing some interesting links between the amplitudes story and classic ideas in classical mechanics) and relevant in several others, with Riccardo Gonzo, Miguel Correia, Ira Rothstein, and Enrico Herrmann’s talks showing not just a wide range of approaches, but an increasing depth of research in this area.

Herrmann’s talk in particular dealt with detector event shapes, a framework that lets physicists think more directly about what a specific particle detector or observer can see. He applied the idea not just to gravitational waves but to quantum gravity and collider physics as well. The latter is historically where this idea has been applied the most thoroughly, as highlighted in Hua Xing Zhu’s talk, where he used them to pick out particular phenomena of interest in QCD.

QCD is, of course, always of interest in the amplitudes field. Buccioni’s talk dealt with the theory’s behavior at high-energies, with a nice example of the “maximal transcendentality principle” where some quantities in QCD are identical to quantities in N=4 super Yang-Mills in the “most transcendental” pieces (loosely, those with the highest powers of pi). Andrea Guerreri’s talk also dealt with high-energy behavior in QCD, trying to address an experimental puzzle where QCD results appeared to violate a fundamental bound all sensible theories were expected to obey. By using S-matrix bootstrap techniques, they clarify the nature of the bound, finding that QCD still obeys it once correctly understood, and conjecture a weird theory that should be possible to frame right on the edge of the bound. The S-matrix bootstrap was also used by Alexandre Homrich, who talked about getting the framework to work for multi-particle scattering.

Heribertus Bayu Hartanto is another recent addition to Korea’s amplitudes community. He talked about a concrete calculation, two-loop five-particle scattering including top quarks, a tricky case that includes elliptic curves.

When amplitudes lead to integrals involving elliptic curves, many standard methods fail. Jake Bourjaily’s talk raised a question he has brought up again and again: what does it mean to do an integral for a new type of function? One possible answer is that it depends on what kind of numerics you can do, and since more general numerical methods can be cumbersome one often needs to understand the new type of function in more detail. In light of that, Stephen Jones’ talk was interesting in taking a common problem often cited with generic approaches (that they have trouble with the complex numbers introduced by Minkowski space) and finding a more natural way in a particular generic approach (sector decomposition) to take them into account. Giulio Salvatori talked about a much less conventional numerical method, linked to the latest trend in Nima-ology, surfaceology. One of the big selling points of the surface integral framework promoted by people like Salvatori and Nima Arkani-Hamed is that it’s supposed to give a clear integral to do for each scattering amplitude, one which should be amenable to a numerical treatment recently developed by Michael Borinsky. Salvatori can currently apply the method only to a toy model (up to ten loops!), but he has some ideas for how to generalize it, which will require handling divergences and numerators.

Other approaches to the “problem of integration” included Anna-Laura Sattelberger’s talk that presented a method to find differential equations for the kind of integrals that show up in amplitudes using the mathematical software Macaulay2, including presenting a package. Matthias Wilhelm talked about the work I did with him, using machine learning to find better methods for solving integrals with integration-by-parts, an area where two other groups have now also published. Pierpaolo Mastrolia talked about integration-by-parts’ up-and-coming contender, intersection theory, a method which appears to be delving into more mathematical tools in an effort to catch up with its competitor.

Sometimes, one is more specifically interested in the singularities of integrals than their numerics more generally. Felix Tellander talked about a geometric method to pin these down which largely went over my head, but he did have a very nice short description of the approach: “Describe the singularities of the integrand. Find a map representing integration. Map the singularities of the integrand onto the singularities of the integral.”

While QCD and gravity are the applications of choice, amplitudes methods germinate in N=4 super Yang-Mills. Ruth Britto’s talk opened the conference with an overview of progress along those lines before going into her own recent work with one-loop integrals and interesting implications of ideas from cluster algebras. Cluster algebras made appearances in several other talks, including Anastasia Volovich’s talk which discussed how ideas from that corner called flag cluster algebras may give insights into QCD amplitudes, though some symbol letters still seem to be hard to track down. Matteo Parisi covered another idea, cluster promotion maps, which he thinks may help pin down algebraic symbol letters.

The link between cluster algebras and symbol letters is an ongoing mystery where the field is seeing progress. Another symbol letter mystery is antipodal duality, where flipping an amplitude like a palindrome somehow gives another valid amplitude. Lance Dixon has made progress in understanding where this duality comes from, finding a toy model where it can be understood and proved.

Others pushed the boundaries of methods specific to N=4 super Yang-Mills, looking for novel structures. Song He’s talk pushes an older approach by Bourjaily and collaborators up to twelve loops, finding new patterns and connections to other theories and observables. Qinglin Yang bootstraps Wilson loops with a Lagrangian insertion, adding a side to the polygon used in previous efforts and finding that, much like when you add particles to amplitudes in a bootstrap, the method gets stricter and more powerful. Jaroslav Trnka talked about work he has been doing with “negative geometries”, an odd method descended from the amplituhedron that looks at amplitudes from a totally different perspective, probing a bit of their non-perturbative data. He’s finding more parts of that setup that can be accessed and re-summed, finding interestingly that multiple-zeta-values show up in quantities where we know they ultimately cancel out. Livia Ferro also talked about a descendant of the amplituhedron, this time for cosmology, getting differential equations for cosmological observables in a particular theory from a combinatorial approach.

Outside of everybody’s favorite theories, some speakers talked about more general approaches to understanding the differences between theories. Andreas Helset covered work on the geometry of the space of quantum fields in a theory, applying the method to a general framework for characterizing deviations from the standard model called the SMEFT. Jasper Roosmale Nepveu also talked about a general space of theories, thinking about how positivity (a trait linked to fundamental constraints like causality and unitarity) gets tangled up with loop effects, and the implications this has for renormalization.

Soft theorems, universal behavior of amplitudes when a particle has low energy, continue to be a trendy topic, with Silvia Nagy showing how the story continues to higher orders and Sangmin Choi investigating loop effects. Callum Jones talks about one of the more powerful results from the soft limit, Weinberg’s theorem showing the uniqueness of gravity. Weinberg’s proof was set up in Minkowski space, but we may ultimately live in curved, de Sitter space. Jones showed how the ideas Weinberg explored generalize in de Sitter, using some tools from the soft-theorem-inspired field of dS/CFT. Julio Parra-Martinez, meanwhile, tied soft theorems to another trendy topic, higher symmetries, a more general notion of the usual types of symmetries that physicists have explored in the past. Lucia Cordova reported work that was not particularly connected to soft theorems but was connected to these higher symmetries, showing how they interact with crossing symmetry and the S-matrix bootstrap.

Finally, a surprisingly large number of talks linked to Kevin Costello and Natalie Paquette’s work with self-dual gauge theories, where they found exact solutions from a fairly mathy angle. Paquette gave an update on her work on the topic, while Alfredo Guevara talked about applications to black holes, comparing the power of expanding around a self-dual gauge theory to that of working with supersymmetry. Atul Sharma looked at scattering in self-dual backgrounds in work that merges older twistor space ideas with the new approach, while Roland Bittelson talked about calculating around an instanton background.


Also, I had another piece up this week at FirstPrinciples, based on an interview with the (outgoing) president of the Sloan Foundation. I won’t have a “bonus info” post for this one, as most of what I learned went into the piece. But if you don’t know what the Sloan Foundation does, take a look! I hadn’t known they funded Jupyter notebooks and Hidden Figures, or that they introduced Kahneman and Tversky.

Justin WilsonA Bite of Quasicrystals

This is a slight reworking of a previous post on my personal blog since I am currently traveling.

An image quilt of quasicrystals

Quasicrystals, a beautiful manifestation of order without strict crystalline symmetry, won a Nobel prize in 2011. A bit more recently, in 2018, a dodecagonal graphene quasicrystal (two sheets of graphene twisted 30 degrees with respect to each other) made its way onto the cover of Science1.


This was just the beginning: there have been other instances where experimentalists take two layers of atomically thin materials (like graphene, a single layer of carbon) and obtain patterns that are quasicrystalline. One comes from some Rutgers experimentalists working with two layers of graphene and hexagonal boron nitride2.

I’ve been exploring (with collaborators) how this kind of phenomenon could help or impede some interesting effects like superconductivity, but that’s a story for another day.

This phenomenon inspired me to make what is known as a Penrose tiling so you can see how twisting two layers of graphene at 30 degrees with respect to each other leads to a quasicrystal. (Individually, each is a tiling of “hexagons.”) Here it is partially filled up:

Building a Penrose tiling from two sheets of graphene twisted at 30-degrees with respect to each other.

One can tell how this is done: You find the points where two hexagons are on top of each other, put down a point, and connect. There are three shapes: a rhombus, an equilateral triangle, and a square. This can be done along the entire sheet to create an amazing looking pattern. For completeness, we can fill in the rest of the pictured grid to obtain:

A fully Penrose-tiled sheet.

The pattern starts to look even more intriguing the further out in the tiling you go. There is much to learn about such physical systems and their quasiperiodic cousins.
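
If you want to play with the construction yourself, here is a minimal matplotlib sketch (the function names are my own) that just overlays two honeycomb lattices twisted by 30 degrees with respect to each other. Spotting the near-coincidence points and connecting them into squares, triangles, and rhombi is the fun part left to the reader.

    import numpy as np
    import matplotlib.pyplot as plt

    def honeycomb_points(n=8, a=1.0):
        """Vertices of a honeycomb lattice: a triangular Bravais lattice
        with a two-site basis."""
        a1 = a * np.array([1.0, 0.0])
        a2 = a * np.array([0.5, np.sqrt(3) / 2])
        basis = [np.zeros(2), (a1 + a2) / 3.0]
        pts = [i * a1 + j * a2 + b
               for i in range(-n, n + 1)
               for j in range(-n, n + 1)
               for b in basis]
        return np.array(pts)

    layer1 = honeycomb_points()
    theta = np.radians(30.0)
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    layer2 = layer1 @ R.T           # second sheet, twisted by 30 degrees

    plt.scatter(layer1[:, 0], layer1[:, 1], s=4, label="layer 1")
    plt.scatter(layer2[:, 0], layer2[:, 1], s=4, label="layer 2 (30 deg twist)")
    plt.gca().set_aspect("equal")
    plt.legend()
    plt.show()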

June 19, 2025

Jordan Ellenberg2009 AL ROY nonprescience

Just about 16 years ago today I blogged:

The Orioles have two legitimate Rookie of the Year candidates and neither one of them is named Matt Wieters

“I’m talking about Nolan Reimold, currently slugging .546 and leading all major-league rookies in OPS; and Brad Bergesen, who’s been the Orioles’ best starter this year at 23. Higher-profile pitching prospects Rick Porcello and David Price have ERAs a little lower, but Bergesen looks better on home runs, walks, and strikeouts. He is, as they say, “in the discussion.””

Yeah. Reimold and Bergesen did not win Rookie of the Year, and in fact, both of them had the majority of their career WAR in 2009. Bergesen, in fact, had more WAR in 2009 than his career total, and was out of the major leagues by the age of 27. Price and Porcello, meanwhile, had long careers, and each won a Cy Young before turning 30. I guess the guys who rate pitching prospects know something about what they’re doing. In my defense, the 2009 AL Rookie of the Year was in the end not Nolan Reimold, or Brad Bergesen, or David Price, or Rick Porcello, or Matt Wieters — it was A’s reliever Andrew Bailey, who also had the majority of his career WAR in 2009.

Jordan EllenbergHappy Juneteenth

It’s a fine thing that we now have a national holiday that asks us to remember slavery. Some people think patriotism means insisting that America never did or does anything wrong, and that our schools should teach a purely heroic account of the American story. That’s foolish. America is made of people like you and me, who share certain ideals, truly heroic ideals, but don’t always live up to them — and some scoundrels too. Any patriotism that can’t survive contact with the actual history of America is weak stuff. A real patriot loves his country with open eyes.

Jordan EllenbergMath and medicine webinar

The National Academies of Sciences, Engineering, and Medicine have me moderating a series of webinars about the use of novel math in the applied sciences. I learn a lot every time I do one! Here’s the latest, on Machine Learning for Breakthroughs in Medical Care, featuring Charley Taylor and Lorin Crawford. Some fun! Looking forward to doing more of these.

June 17, 2025

Justin WilsonIs Science Still Working? Here’s What I Think

You’ve probably seen headlines lately questioning science—its funding, its fairness, and whether it’s even open to new ideas anymore. Some people worry that science has become too rigid, unwilling to entertain fresh perspectives. As someone who lives and breathes science, I wanted to share my take.

First, let me tell you about myself and my expertise. I have a Ph.D. in theoretical condensed matter physics, which is the study of solids, surfaces, interfaces, and liquids. This also includes disordered systems. I have also worked in catalysis and energy storage. I have been doing active research from my first published article in 1987 to the present, about 38 years of active work. I have worked at two Department of Energy National Laboratories, Oak Ridge National Laboratory and Pacific Northwest National Laboratory. I am currently a full professor of Physics at Louisiana State University, where I have been for the last twelve years. I currently have 3 funded research projects, one on “Improving Transmon Qubit Performance,” the second on “Directed Assembly of Metastable States for Harnessing Quantum Effects,” and the third on “Enabling Formate-Based Hydrogen Storage and Generation via Multimetallic Alloy Catalysts.”


Let me start by saying this: I don’t think science is broken. In fact, science needs new ideas to survive. That’s what keeps it alive. It’s not about guarding the past—it’s about exploring what comes next.


So, What Is Science Anyway?

At its heart, science is just a way of asking questions and trying to find honest answers. There’s a process to it, and it goes something like this:

  1. You notice something interesting.

  2. You ask a question about it.

  3. You come up with a possible explanation—what we call a hypothesis.

  4. You test that idea through experiments, models, or observation.

  5. You look at the results and ask, “Was I right?”

  6. Whether the answer is yes or no, you learn something—and you keep going.

  7. Finally, you share what you found, so others can learn from it too.

That’s it. And we repeat this process again and again. If the answer turns out to be wrong, that’s still progress. It tells us what doesn’t work, which is just as important as figuring out what does.

Let’s pick an example of the scientific process from our previous blog post on quantum materials: the quantum topological material Bi₂Se₃

🔍 Observation

Some newly discovered materials—like bismuth selenide (Bi₂Se₃)—conduct electricity on their surfaces while remaining insulating inside. This unusual behavior hints at a new phase of matter.

❓ Question

Why do these materials conduct only on their surfaces? What quantum mechanical principles govern this behavior? Can this property be harnessed for new electronic or quantum computing applications?

💡 Hypothesis

These materials are topological insulators. Their unique surface conductivity arises from strong spin-orbit coupling, which protects surface states from scattering due to defects or impurities. These protected states are a result of the material’s non-trivial topological order.

⚗️ Experiment

Researchers test the hypothesis by:

  • Synthesizing high-quality crystals of Bi₂Se₃.

  • Measuring surface conductivity via scanning tunneling microscopy (STM) and angle-resolved photoemission spectroscopy (ARPES) for directly observing the electronic states.

  • Applying magnetic fields and introducing defects to see if the surface states remain intact.

  • Using transport measurements to compare the bulk and surface contributions to conductivity.

📊 Analysis

  • ARPES data shows Dirac-like surface states.

  • STM confirms surface conduction pathways even when the bulk is insulating.

  • Magnetic fields break time-reversal symmetry, gapping the surface states—validating the topological protection mechanism.

✅ Conclusion

The data confirm that Bi₂Se₃ is a 3D topological insulator. Its conductive surface states are protected by time-reversal symmetry, making it a strong candidate for spintronic devices and fault-tolerant quantum computing.

📢 Communication

Findings are published in Nature Materials and Science, presented at condensed matter physics conferences, and used to inspire further research into topological superconductors, Majorana fermions, and quantum information systems.


🔍 Why It Matters

This discovery has opened up new paths in quantum electronics, where data can flow with minimal energy loss, and in quantum computing, where these materials could enable robust, error-resistant qubits.


Why This Matters

The same process is how new medications move from lab benches to pharmacy shelves: it ensures treatments are both safe and effective before reaching the public.

But Is Science Open to New Ideas?

Yes, absolutely—but it’s not always easy. Getting a scientific paper published takes a lot of work. Top-tier journals reject around 80 to 95% of the papers they receive. Even solid mid-tier journals reject more than half. Why? Because these journals want well-supported, clearly written work that pushes knowledge forward. Lastly, here is an interesting statistic: did you know that roughly 1% of the scientific workforce publishes every single year? That’s a small fraction maintaining a consistent publishing presence. It is truly difficult to get articles published.

That doesn’t mean new or unusual ideas get shut out. They just have to be backed up with solid evidence—and explained clearly. I’ve seen good ideas get rejected just because the writing was hard to follow, or the data didn’t quite hold up. That’s why I often help younger researchers refine their papers. Heck, I’ve had others help me, too. We’re all learning as we go.

And yes, rejection stings. It really does. But it’s not about ego—it’s about making the work stronger.

Here’s the Bottom Line

Science is hard. But it’s supposed to be. We ask tough questions and hold ourselves to high standards. That’s how we make sure the answers we get are worth something.

Is it perfect? No. But the system is built to self-correct. Bad ideas eventually fall away. Good ones rise to the top—even if it takes a few tries.

Science isn’t just a pile of facts. It’s a conversation—a messy, exciting, and sometimes frustrating conversation about how the world works. And yes, it still works.

So the next time someone says science is too closed off, remind them: science is open—to anyone willing to do the work, ask the hard questions, and follow the evidence wherever it leads.


June 16, 2025

Jordan EllenbergBertrand Russell on touching grass

No one seems to know the origin of the contemporary phrase “touch grass,” meaning “get off-line and remember that the real world exists.” I think it is unlikely that it actually springs from Bertrand Russell’s 1930 self-help book The Conquest of Happiness, but — this description of touching grass really captures the modern sentiment!

To the child even more than to the man, it is necessary to preserve some contact with the ebb and flow of terrestrial life… I have seen a boy two years old, who had been kept in London, taken out for the first time to walk in green country. The season was winter, and everything was wet and muddy. To the adult eye there was nothing to cause delight, but in the boy there sprang up a strange ecstasy; he kneeled on the wet ground and put his face in the grass, and gave utterance to half-articulate cries of delight. The joy that he was experiencing was primitive, simple, and massive. The organic need that was being satisfied is so profound that those in whom it is starved are seldom completely sane.

Touch grass!

June 15, 2025

Jordan EllenbergNigel Boston (1961-2024)

Judy Walker and I put together a memorial article about my colleague and collaborator Nigel Boston in this month’s Notices of the AMS. What a great guy. And this conference at Zürich on arithmetic statistics I just returned from was, in almost every aspect, influenced by Nigel’s ideas and outlook. And not only because many of his collaborators and students were present. Nigel was the first person, I think, to really understand what form a non-abelian Cohen-Lenstra theory might take. And he was insistent on the importance of the pro-p story. And on the role that computation would play in actually understanding what’s going on. All three of these strands are very much alive at the current frontier of the subject.

Doug NatelsonSo you want to build a science/engineering laboratory building

A very quick summary of some non-negative news developments:
  • The NSF awarded 500 more graduate fellowships this week, bringing the total for this year up to 1500.  (Apologies for the X link.)  This is still 25% lower than last year's number, and of course far below the original CHIPS and Science act target of 3000, but it's better than the alternative.  I think we can now all agree that the supposed large-scale bipartisan support for the CHIPS and Science act was illusory.
  • There seems to be some initial signs of pushback on the senate side regarding the proposed massive science funding cuts.  Again, now is the time to make views known to legislators - I am told by multiple people with experience in this arena that it really can matter.
  • There was a statement earlier this week that apparently the US won't be going after Chinese student visas.  This would carry more weight if it didn't look like US leadership was wandering ergodically through all possible things to say with no actual plan or memory.
On to the main topic of this post.  Thanks to my professional age (older than dirt) and my experience (overseeing shared research infrastructure; being involved in a couple of building design and construction projects; and working on PI lab designs and build-outs), I have some key advice and lessons learned for anyone designing a new big science/engineering research building.  This list is by no means complete, and I invite readers to add their insights in the comments.  While it seems likely that many universities will be curtailing big capital construction projects in the near term because of financial uncertainty, I hope this may still come in handy to someone.  
  • Any big laboratory building should have a dedicated loading dock with central receiving.  If you're spending $100M-200M on a building, this is not something that you should "value engineer" away.  The long term goal is a building that operates well for the PIs and is easy to maintain, and you're going to need to be able to bring in big crates for lab and service equipment.  You should have a freight elevator adjacent to the dock.  
  • You should also think hard about what kind of equipment will have to be moved in and out of the building when designing hallways, floor layouts, and door widths.  You don't want to have to take out walls, doorframes, or windows, or to need a crane to hoist equipment into upper floors because it can't get around corners.
  • Think hard about process gasses and storage tanks at the beginning.  Will PIs need to have gas cylinders and liquid nitrogen and argon tanks brought in and out in high volumes all the time, with all the attendant safety concerns?  Would you be better off getting LN2 or LAr tanks even though campus architects will say they are unsightly?  
  • Likewise, consider whether you should have building-wide service for "lab vacuum", N2 gas, compressed air, DI water, etc.  If not and PIs have those needs, you should plan ahead to deal with this.
  • Gas cylinder and chemical storage - do you have enough on-site storage space for empty cylinders and back-up supply cylinders?  If this is a very chemistry-heavy building, think hard about safety and storing solvents. 
  • Make sure you design for adequate exhaust capacity for fume hoods.  Someone will always want to add more hoods.  While all things are possible with huge expenditures, it's better to make sure you have capacity to spare, because adding hoods beyond the initial capacity would likely require a huge redo of the building HVAC systems.
  • Speaking of HVAC, think really hard about controls and monitoring.  Are you going to have labs that need tight requirements on temperature and humidity?  When you set these up, put enough sensors of the right types in the right places, and make sure that your system is designed to work even when the outside air conditions are at their seasonal extremes (hot and humid in the summer, cold and dry in the winter).  Also, consider having a vestibule (air lock) for the main building entrance - you'd rather not scoop a bunch of hot, humid air (or freezing, super-dry air) into the building every time a student opens the door.
  • Still on HVAC, make sure that power outages and restarts don't lead to weird situations like having the whole building at negative pressure relative to the outside, or duct work bulging or collapsing.
  • Still on HVAC, actually think about where the condensate drains for the fan units will overflow if they get plugged up or overwhelmed.  You really don't want water spilling all over a rack of networking equipment in an IT closet.  Trust me.
  • Chilled water:  Whether it's the process chilled water for the air conditioning, or the secondary chilled water for lab equipment, make sure that the loop is built correctly.   Incompatible metals (e.g., some genius throws in a cast iron fitting somewhere, or joints between dissimilar metals) can lead to years and years of problems down the line.  Make sure lines are flushed and monitored for cleanliness, and have filters in each lab that can be checked and maintained easily.
  • Electrical - design with future needs in mind.  If possible, it's a good idea to have PI labs with their own isolation transformers, to try to mitigate inter-lab electrical noise issues.  Make sure your electrical contractors understand the idea of having "clean" vs. "dirty" power and can set up the grounding accordingly while still being in code.
  • Still on electrical, consider building-wide surge protection, and think about emergency power capacity.  For those who don't know, emergency power is usually a motor-generator that kicks in after a few seconds to make sure that emergency lighting and critical systems (including lab exhaust) keep going.
  • Ceiling heights, duct work, etc. - It's not unusual for some PIs to have tall pieces of equipment.  Think about how you will accommodate these.  Pits in the floors of basement labs?  5 meter slab-to-slab spacing?  Think also about how ductwork and conduits are routed.  You don't want someone to tell you that installation of a new apparatus is going to cost a bonus $100K because shifting a duct sideways by half a meter will require a complete HVAC redesign.
  • Think about the balance between lab space and office space/student seating.  No one likes giant cubicle farm student seating, but it does have capacity.  In these days of zoom and remote access to experiments, the way students and postdocs use offices is evolving, which makes planning difficult.  Health and safety folks would definitely prefer not to have personnel effectively headquartered directly in lab spaces.  Seriously, though, when programming a building, you need to think about how many people per PI lab space will need places to sit.  I have yet to see a building initially designed with enough seating to handle all the personnel needs if every PI lab were fully occupied and at a high level of research activity. 
  • Think about maintenance down the line.  Every major building system has some lifespan.  If a big air handler fails, is it accessible and serviceable, or would that require taking out walls or cutting equipment into pieces and disrupting the entire building?  Do you want to set up a situation where you may have to do this every decade?  (Asking for a friend.)
  • Entering the realm of fantasy, use your vast power and influence to get your organization to emphasize preventative maintenance at an appropriate level, consistently over the years.  Universities (and national labs and industrial labs) love "deferred maintenance" because kicking the can down the road can make a possible cost issue now into someone else's problem later.  Saving money in the short term can be very tempting.  It's also often easier and more glamorous to raise money for the new J. Smith Laboratory for Physical Sciences than it is to raise money to replace the HVAC system in the old D. Jones Engineering Building.  Avoid this temptation, or one day (inevitably when times are tight) your university will notice that it has $300M in deferred maintenance needs.
I may update this list as more items occur to me, but please feel free to add input/ideas.

June 13, 2025

Matt von HippelBonus info for Reversible Computing and Megastructures

After some delay, a bonus info post!

At FirstPrinciples.org, I had a piece covering work by engineering professor Colin McInnes on stability of Dyson spheres and ringworlds. This was a fun one to cover, mostly because of how it straddles the borderline between science fiction and practical physics and engineering. McInnes’s claim to fame is work on solar sails, which seem like a paradigmatic example of that kind of thing: a common sci-fi theme that’s surprisingly viable. His work on stability was interesting to me because it’s the kind of work that a century and a half ago would have been paradigmatic physics. Now, though, very few physicists work on orbital mechanics, and a lot of the core questions have passed on to engineering. It’s fascinating to see how these classic old problems can still have undiscovered solutions, and how the people best equipped to find them now are tinkerers practicing their tools instead of cutting-edge mathematicians.

At Quanta Magazine, I had a piece about reversible computing. Readers may remember I had another piece on that topic at the end of March, a profile on the startup Vaire Computing at FirstPrinciples.org. That piece talked about FirstPrinciples, but didn’t say much about reversible computing. I figured I’d combine the “bonus info” for both posts here.

Neither piece went into much detail about the engineering involved, as it didn’t really make sense in either venue. One thing that amused me a bit is that the core technology that drove Vaire into action is something that actually should be very familiar to a physics or engineering student: a resonator. Theirs is obviously quite a bit more sophisticated than the base model, but at its heart it’s doing the same thing: storing charge and controlling frequency. It turns out that those are both essential to making reversible computers work: you need to store charge so it isn’t lost to ground when you empty a transistor, and you need to control the frequency so you can have waves with gentle transitions instead of the sharper corners of the waves used in normal computers, thus wasting less heat in rapid changes of voltage. Vaire recently announced they’re getting 50% charge recovery from their test chips, and they’re working on raising that number.
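
As a toy illustration of why gentle voltage transitions dissipate less (my own back-of-the-envelope numbers, nothing from Vaire): charging a node capacitance C to voltage V through a resistance R with an abrupt step always dissipates CV²/2, while a linear ramp of duration T much longer than RC dissipates only about (RC/T)·CV².

    # Toy adiabatic-charging comparison; all component values are assumptions.
    C = 1e-15       # node capacitance: 1 femtofarad
    V = 0.7         # logic swing in volts
    R = 1e3         # effective charging resistance: 1 kilohm

    E_abrupt = 0.5 * C * V**2                 # step charging: CV^2/2, independent of R

    def E_ramp(T):
        """Approximate dissipation for a linear voltage ramp of duration T >> RC."""
        return (R * C / T) * C * V**2

    for T in (1e-9, 1e-8, 1e-7):
        print(f"ramp {T:.0e} s: {E_ramp(T):.1e} J   vs abrupt: {E_abrupt:.1e} J")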

Originally, the Quanta piece was focused more on reversible programming than energy use, as the energy angle seemed a bit more physics-focused than their computer science desk usually goes. The emphasis ended up changing as I worked on the draft, but it meant that an interesting parallel story got lost on the cutting-room floor. There’s a community of people who study reversible computing not from the engineering side, but from the computer science side, studying reversible logic and reversible programming languages. It’s a pursuit that goes back to the 1980’s, when at Caltech, around the time Feynman was teaching his course on the physics of computing, a group of students figured out how to set up a reversible programming language they called Janus. They sent their creation to Landauer, and the letter ended up with Michael Frank after Landauer died. There’s a lovely quote from it regarding their motivation: “We did it out of curiosity over whether such an odd animal as this was possible, and because we were interested in knowing where we put information when we programmed. Janus forced us to pay attention to where our bits went since none could be thrown away.”

Being forced to pay attention to information, in turn, is what has animated the computer science side of the reversible computing community. There are applications to debugging, where you can run code backwards when it gets stuck, to encryption and compression, where you want to be able to recover the information you hid away, and to security, where you want to keep track of information to make sure a hacker can’t figure out things they shouldn’t. Also, for a lot of these people, it’s just a fun puzzle. Early on my attention was caught by a paper by Hannah Earley describing a programming language called Alethe, a word you might recognize from the Greek word for truth, which literally means something like “not-forgetting”.

(Compression is particularly relevant for the “garbage data” you need to output in a reversible computation. If you want to add two numbers reversibly, naively you need to keep both input numbers and their output, but you can be more clever than that and just keep one of the inputs since you can subtract to find the other. There are a lot of substantially more clever tricks in this vein people have figured out over the years.)
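
As a minimal illustration of that trick (a generic example, not tied to any particular reversible language): instead of mapping (a, b) to just a + b, which forgets information, a reversible adder maps (a, b) to (a, a + b), and the inverse subtracts to recover the second input.

```python
# A reversible adder: keep one input alongside the sum, so nothing is forgotten.
def radd(a, b):
    return (a, a + b)

def radd_inverse(a, s):
    return (a, s - a)   # subtract to recover the input that looked "discarded"

assert radd_inverse(*radd(3, 4)) == (3, 4)   # round-trips exactly
```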

I didn’t say anything about the other engineering approaches to reversible computing, the ones that try to do something outside of traditional computer chips. There’s DNA computing, which tries to compute with a bunch of DNA in solution. There’s the old concept of ballistic reversible computing, where you imagine a computer that runs like a bunch of colliding billiard balls, conserving energy. Coordinating such a computer can be a nightmare, and early theoretical ideas were shown to be disrupted by something as tiny as a few stray photons from a distant star. But people like Frank figured out ways around the coordination problem, and groups have experimented with superconductors as places to toss those billiard balls around. The early billiard-inspired designs also had a big impact on quantum computing, where you need reversible gates and the only irreversible operation is the measurement. The name “Toffoli” comes up a lot in quantum computing discussions; I hadn’t known before this that Toffoli gates were originally for reversible computing in general, not specifically quantum computing.
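
For the curious, the Toffoli (controlled-controlled-NOT) gate is easy to write down: it flips its third bit only when the first two are 1, it is its own inverse, and with the target bit pinned to a constant it reproduces an ordinary AND or NAND, which is why it shows up in both reversible and quantum circuits. A quick sketch:

```python
# Toffoli gate: (a, b, c) -> (a, b, c XOR (a AND b)). Reversible and self-inverse.
def toffoli(a, b, c):
    return (a, b, c ^ (a & b))

# Self-inverse: applying it twice returns the original bits.
for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            assert toffoli(*toffoli(a, b, c)) == (a, b, c)

# With the target bit fixed to 0, the third output is simply a AND b;
# fixing it to 1 gives NAND instead.
assert all(toffoli(a, b, 0)[2] == (a & b) for a in (0, 1) for b in (0, 1))
```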

Finally, I only gestured at the sci-fi angle. For reversible computing’s die-hards, it isn’t just a way to make efficient computers now. It’s the ultimate future of the technology, the kind of energy-efficiency civilization will need when we’re covering stars with shells of “computronium” full of busy joyous artificial minds.

And now that I think about it, they should chat with McInnes. He can tell them the kinds of stars they should build around.

Justin Wilson Controlling Quantum Chaos with Randomness

I’m only putting out a small Quantum Bite today because I’m on the road. Here’s a brief rundown of our new preprint “Universality of stochastic control of quantum chaos with measurement and feedback.” It hasn’t been peer-reviewed yet, but I’m excited about the result.

Someone pushing a boulder up a hill and nearly at the top

The idea came from classical chaos and control, and is pretty simple1. Imagine standing in a chaotic wilderness dotted with a few tiny, perfectly calm mountaintops. Place a boulder on one of those peaks and it can sit there forever—but miss the summit by a hair and it tumbles into the valley below. We want to escape the chaos of avalanches and rock slides, so our task is to set a boulder on the top of a mountain.


Now suppose you randomly stop paying attention. Every so often (flip a coin, roll a die), you look up, find the boulder, and nudge it some fraction of the way back up the mountain. This sounds thoughtless—counterintuitive even—but it leads us to an interesting realization: there is a critical “nudge-rate” that will get our boulder up the mountain. Nudge any less often than that, and the boulder is lost to the valleys below. This is a classical phase transition in disguise.
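
Here is a minimal classical toy model in the spirit of that picture (my own choice of map and control step, not necessarily the model in the preprint): a chaotic doubling map pushes the “boulder” away from the summit at x = 0, and with probability p per step a nudge halves its distance to the summit. In this toy the critical nudge-rate is p = 1/2, matching the classical p > 0.5 threshold mentioned below.

```python
import random

def summit_fraction(p, steps=100_000, tol=1e-3, seed=1):
    """Classical toy: doubling map x -> 2x mod 1 (unstable fixed point, the
    "summit", at x = 0), with a control step x -> x/2 applied with probability
    p each step. Returns the fraction of time spent within tol of the summit."""
    rng = random.Random(seed)
    x, near = rng.random(), 0
    for _ in range(steps):
        if rng.random() < p:
            x *= 0.5             # nudge: halve the distance to the summit
        else:
            x = (2.0 * x) % 1.0  # chaos: the map kicks the boulder around
        near += x < tol
    # For p well above 1/2, x eventually underflows to exactly 0.0 --
    # the classical boulder really can be parked at the summit.
    return near / steps

for p in (0.30, 0.45, 0.50, 0.55, 0.70):
    print(f"p = {p:.2f}: fraction of time near the summit = {summit_fraction(p):.3f}")
```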

We extended the idea to a quantum-mechanical boulder, and quantum fuzziness changes everything. These types of dynamics have little points of stability classically, but when you look at them quantum mechanically, there is no way to park a boulder at these points. Quantum mechanics has this bad habit of blurring those otherwise stable points.

This is really a battle with quantum uncertainty. If we put our quantum boulder at the top of the hill it is stopped (zero velocity) and we know where it is, but this violates the uncertainty principle! If we know where it is, we should have no idea how fast it’s moving! Even if you get a quantum boulder to the top of the mountain, it will always fall back down.

This alters our whole conception of “control” from before2. Miraculously, there is still a phase transition, but it behaves differently. When we push our quantum boulder up the hill at just the right rate, it still never reaches the top, but it starts to be up there a lot more often than normal. Formally, the probability P(h) that the boulder sits at a height h takes the schematic form

(This looks like it diverges at the summit, but that’s the secret sauce of quantum uncertainty: this denominator never actually gets to zero.)

The result is the following plot from the preprint:

The horizontal axis is the “nudge-rate” p and the y-axis represents (roughly) how often the boulder is at the top of the hill. Classically, you’d get perfect control for p > 0.5. Quantum mechanically, p = 0.5 is still a threshold, but above it the boulder isn’t simply parked at the summit: even at p = 0.8, you “only” hold the peak about 80% of the time.

So we trade absolute control for something a bit more “uncertain.” We still get to the top more often than not, but quantum mechanics tends to get in the way and push us down the side of the mountain again.

Behind the scenes are simulations, analytic calculations, and connections to random walks, turbulence, and even market dynamics, all of which justify the “Universality” in the title. If you’d like the technical story—including how weak measurements implement the quantum mechanical “nudge”—check out our preprint!

1. The original classical paper was titled The Probabilistic Control of Chaos.

2. I’m neglecting here how to even implement control quantum mechanically, which involves quantum weak measurements. That’s an interesting story, and if you want to know the details, please read our preprint!

June 09, 2025

Justin Wilson Why Quantum Materials Are Shaping the Future of Electronics

Silicon has been the backbone of the electronics industry for decades. It powered the digital revolution, drove Moore’s Law, and made today’s laptops, smartphones, and data centers possible. But we’re hitting a wall.

black and yellow rubber puzzle mat
Photo by Ryan on Unsplash

As we demand more power, faster performance, and smaller devices, silicon is running into serious challenges—slower electron speeds, overheating, and physical limits to how small we can make transistors. Enter: quantum materials.


These are next-generation materials with properties that go far beyond what traditional silicon can offer. They’re not just incremental improvements—they represent a fundamental shift in how we think about electronics, computing, and even national defense.

The Rising Stars of Quantum Materials

Let’s break down a few of the most promising candidates:

  • Gallium Nitride (GaN) – GaN has high electron mobility and a wide bandgap, which means it can handle higher voltages and switch faster than silicon. It’s already making waves in 5G technology, electric vehicles, and advanced radar systems.

  • Silicon Carbide (SiC) – Known for its durability in extreme environments, SiC is great for high-power electronics. It keeps running in high temperatures while minimizing energy loss, making it ideal for heavy-duty industrial and defense applications.

  • Graphene & 2D Materials – Think of materials like molybdenum disulfide (MoS₂) and tungsten diselenide (WSe₂). These ultra-thin, ultra-conductive materials could lead to super-efficient, flexible transistors and wearables.

a very tall building with a lot of hexagonal shapes on it
Photo by Sam Balye on Unsplash
  • Topological Insulators – These are the real game-changers. They allow electricity to flow on their surfaces while remaining insulating inside. Even better? Their surface states are resistant to impurities and defects, making them incredibly reliable. This makes them strong contenders for spintronics (electronics that use electron spin) and quantum computing.

Why Silicon Can’t Keep Up

Here’s the problem: silicon has limitations we can’t engineer our way around anymore.

  1. Slower switching speeds – Compared to newer materials like GaN, silicon just isn’t as fast.

  2. Overheating – As we pack more transistors into chips, heat buildup becomes a serious issue, requiring costly cooling systems.

  3. Miniaturization limits – We’re approaching atomic scales, and quantum effects are interfering with performance.

  4. Power inefficiency – Silicon struggles in high-power applications like data centers, defense tech, and electric transportation.

In short, we need something better—and quantum materials are stepping up.

Powering the Future: Quantum, Secure, and Ultra-Precise

One of the most exciting classes of these materials is quantum topological materials. These materials don’t just offer better performance—they behave in fundamentally different ways, thanks to their quantum properties.

  • Topological Protection – Their electronic states are robust against defects, making devices more reliable.

  • High-speed, low-power performance – Their exotic structures allow electrons to move faster with less energy loss.

  • Quantum Stability – In quantum computing, they help build stable qubits—units of quantum information that don’t fall apart under noise, heat, or time.

These properties make them ideal for:

  • Quantum Computers – Topological superconductors could unlock fault-tolerant quantum computing by protecting qubits from errors.

  • Quantum Sensors – These materials are incredibly sensitive to their environment, opening the door to ultra-precise sensors for medical imaging, navigation, and even detecting gravitational waves.

  • National Security & Defense – From secure communications to subsurface detection and stealth-defeating tech, the strategic applications are massive.

  • Quantum Metrology – Think of this as the science of ultra-precise measurement. It’s the backbone of GPS, secure transactions, and global communications.

The Big Picture

Quantum materials—and especially topological quantum materials—represent more than a new tool in the electronics toolbox. They’re reshaping what’s possible in technology, science, and security.

With their unique electronic structures, extreme sensitivity, and resistance to defects, these materials could power everything from next-gen smartphones to quantum computers, while enabling breakthroughs in physics and transforming how we interact with the world.

The future of electronics isn’t just smaller and faster—it’s quantum.


June 08, 2025

Tommaso Dorigo Win A MSCA Post-Doctoral Fellowship!

Applications for MSCA Post-doctoral Fellowships are open, and will remain so until September 10 this year. What that means is that if you have less than 8 years of experience after your Ph.D., you can pair up with a research institute in Europe to present a research plan, and the European Commission may decide to fund it for two years (plus 6 months in industry in some cases).

In order for your application to have a chance to win funding, you need to: 
  1. have a great research topic in mind, 
  2. be ready to invest some time in writing a great application, and 
  3. pair up with an outstanding supervisor at a renowned research institute. 

read more

June 07, 2025

Doug Natelson A precision measurement science mystery - new physics or incomplete calculations?

Again, as a distraction from persistently concerning news, here is a science mystery of which I was previously unaware.

The role of approximations in physics is something that very often comes as a shock to new students.  There is a cultural expectation out there, reinforced by the typical way we teach math and science in K12 education, that because physics is all about quantitative understanding of physical phenomena, we should be able to get exact solutions to many of our attempts to model nature mathematically.  In practice, though, constructing physics theories is almost always about approximations, either in the formulation of the model itself (e.g. let's consider the motion of an electron about the proton in the hydrogen atom by treating the proton as infinitely massive and of negligible size) or in solving the mathematics (e.g., we can't write an exact analytical solution of the problem when including relativity, but we can do an order-by-order expansion in powers of \(p/mc\)).  Theorists have a very clear understanding of what it means to say that an approximation is "well controlled" - you know on both physical and mathematical grounds that a series expansion actually converges, for example.

Some problems are simpler than others, just by virtue of having a very limited number of particles and degrees of freedom, and some problems also lend themselves to high precision measurements.  The hydrogen atom problem is an example of both features.  It involves just two spin-1/2 particles (if we approximate the proton as a lumped object) and is readily accessible to optical spectroscopy to measure the energy levels for comparison with theory.  We can do perturbative treatments to account for other effects of relativity, spin-orbit coupling, interactions with nuclear spin, and quantum electrodynamic corrections (here and here).  A hallmark of atomic physics is the remarkable precision and accuracy of these calculations when compared with experiment.  (The \(g\)-factor of the electron is experimentally known to a part in \(10^{10}\) and matches calculations out to fifth order in \(\alpha = e^2/(4 \pi \epsilon_{0}\hbar c)\).)
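
To give a flavor of that agreement with a worked number (a rough illustration, not the full fifth-order calculation): the leading QED correction to the electron's magnetic moment is Schwinger's famous \(\alpha/2\pi\) term, which by itself already lands within about 0.2% of the measured anomaly; the higher-order terms close the remaining gap to the precision quoted above.

```python
import math

alpha = 1 / 137.035999              # fine-structure constant (approximate value)
a_leading = alpha / (2 * math.pi)   # Schwinger's leading-order anomaly a = (g-2)/2
a_measured = 0.001159652181         # measured electron anomaly (approximate)

print(f"leading-order a = {a_leading:.9f}")    # ~0.001161410
print(f"measured a      = {a_measured:.9f}")
print(f"relative difference ~ {abs(a_leading - a_measured) / a_measured:.2%}")
```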

The helium atom is a bit more complicated, having two electrons and a more complicated nucleus, but over the last hundred years we've learned a lot about how to do both calculations and spectroscopy.   As explained here, there is a problem.  It is possible to put helium into an excited metastable triplet state with one electron in the \(1s\) orbital, the other electron in the \(2s\) orbital, and their spins in a triplet configuration.  Then one can measure the ionization energy of that system - the minimum energy required to kick an electron out of the atom and off to infinity.  This energy can be calculated to seventh order in \(\alpha\), and the theorists think that they're accounting for everything, including the finite (but tiny) size of the nucleus.  The issue:  The calculation and the experiment differ by about 2 nano-eV.  That may not sound like a big deal, but the experimental uncertainty is supposed to be a little over 0.08 nano-eV, and the uncertainty in the calculation is estimated to be 0.4 nano-eV.  This works out to something like a 9\(\sigma\) discrepancy.  Most recently, a quantitatively very similar discrepancy shows up in the case of measurements performed in 3He rather than 4He.  

This is pretty weird.  Historically, it would seem that the most likely answer is a problem with either the measurements (though that seems doubtful, since precision spectroscopy is such a well-developed set of techniques), the calculation (though that also seems weird, since the relevant physics seems well known), or both.  The exciting possibility is that somehow there is new physics at work that we don't understand, but that's a long shot.  Still, something fun to consider (as my colleagues and I try to push back on the dismantling of US scientific research).



June 06, 2025

John Baez The Oort Cloud

The Oort cloud is a huge region of icy objects surrounding our Sun. We’re not sure it exists, but we think it’s where comets come from.

I’ve often seen the Oort cloud drawn as a vague round blob. But recently some people simulated it—and discovered that tidal forces from the Milky Way may pull it into a much more interesting shape:

• David Nesvorný, Luke Dones, David Vokrouhlický, Hal F. Levison, Cristian Beaugé, Jacqueline Faherty, Carter Emmart, and Jon P. Parker, A spiral structure in the inner Oort cloud, The Astrophysical Journal 983 (2025).

It actually looks like a cartoon of a galaxy! But it’s poking up at right angles to the plane of our galaxy, drawn in blue here. The red line is the plane that the planets mostly move in, called the ‘ecliptic’.

According to our theories, the Oort cloud formed about 4.6 billion years ago when the Solar System was young. As the outer planets cleared their orbital neighborhood, trillions of small icy objects were pushed into very eccentric orbits that come as close as 30 AU to the Sun and then shoot out as far as 1000 AU. (Remember, the Earth is 1 AU from the Sun.) Later, Galactic tidal forces slowly pulled these objects farther from the Sun and tilted their orbits. Encounters with nearby stars tend to randomize the orbits of these Oort cloud objects.

By now, the inner Oort cloud consists of icy objects about 1000 to 10,000 AU from the Sun. It’s more or less flat, roughly 15,000 AU across, tilted 30° to the ecliptic, and it looks like a spiral with two twisted arms.

The spiral structure was first noticed when they showed this simulation in the Hayden Planetarium in preparation for a new space show!

Physicists like to start by “assuming a spherical cow”. But when they study something in detail, it’s usually more complex. Even black holes usually have a disk, with jets shooting out.

Matt von Hippel Branching Out, and Some Ground Rules

In January, my time at the Niels Bohr Institute ended. Instead of supporting myself by doing science, as I’d done the last thirteen or so years, I started making a living by writing, doing science journalism.

That work picked up. My readers here have seen a few of the pieces already, but there are lots more in the pipeline, getting refined by editors or waiting to be published. It’s given me a bit of income, and a lot of visibility.

That visibility, in turn, has given me new options. It turns out that magazines aren’t the only companies interested in science writing, and journalism isn’t the only way to write for a living. Companies that invest in science want a different kind of writing, one that builds their reputation both with the public and with the scientific community. And as I’ve discovered, if you have enough of a track record, some of those companies will reach out to you.

So I’m branching out, from science journalism to science communications consulting, advising companies how to communicate science. I’ve started working with an exciting client, with big plans for the future. If you follow me on LinkedIn, you’ll have seen a bit about who they are and what I’ll be doing for them.

Here on the blog, I’d like to maintain a bit more separation. Blogging is closer to journalism, and in journalism, one ought to be careful about conflicts of interest. The advice I’ve gotten is that it’s good to establish some ground rules, separating my communications work from my journalistic work, since I intend to keep doing both.

So without further ado, my conflict of interest rules:

  • I will not write in a journalistic capacity about my consulting clients, or their direct competitors.
  • I will not write in a journalistic capacity about the technology my clients are investing in, except in extremely general terms. (For example, most businesses right now are investing in AI. I’ll still write about AI in general, but not about any particular AI technologies my clients are pursuing.)
  • I will more generally maintain a distinction between areas I cover journalistically and areas where I consult. Right now, this means I avoid writing in a journalistic capacity about:
    • Health/biomedical topics
    • Neuroscience
    • Advanced sensors for medical applications

I plan to update these rules over time as I get a better feeling for what kinds of conflict of interest risks I face and what my clients are comfortable with. I now have a Page for this linked in the top menu; clients and editors can check there to see my current conflict of interest rules.

Justin Wilson Science when funding and stability are in doubt

I wanted to launch into an idea that would not end up in The Margins. But can I talk about quantum measurements when students are banned from the US? Can we discuss symmetries when students are detained for writing editorials? How can I explore energy when people are losing research grants? I have so much physics to discuss, but can we do physics when science itself is being dismantled?

a group of people standing in front of a building
Photo by Siyi Zhou on Unsplash

It feels tone-deaf to launch into pure science in this Substack when we’ve had one of the largest assaults on science in my lifetime. For instance, just yesterday the administration moved to restrict entry to the US for international students going to Harvard, directly following a broader travel ban affecting our department. At the same time, I don’t want to turn this into a polemic. So, I won’t. Instead, I’ll tell you about what people are feeling. If you want to know more, Doug Natelson’s blog Nanoscale Views (he’s a Rice professor closely following developments at NSF), reporting from the New York Times, and the reporting in Science magazine are all good sources.

One prominent feeling is anger: fury toward travel bans, the detaining of international researchers, canceled funding, and rhetoric promising more. As stated by our vice president: “Universities in our country are fundamentally corrupt and dedicated to deceit and lies, not to the truth. […] They pursue deceit and lies.”1 This denigration and these attacks create a climate of fear, especially for those who are most vulnerable: the students.

When I decided to pursue a graduate degree in physics, I did so because I wanted to understand how the universe works in the language of physics and mathematics. Graduate school is stressful, and often, the introduction to research can be a sharp transition for people; suddenly, the problems may not have solutions, or they might be way more complicated than you (or your advisor) anticipated. Our country has been known for its research culture and its excellent universities worldwide, and that has attracted, to the US, talent from across the globe. I personally know scientists from all continents whom I met while they were attending or working at US institutions. Even before these recent policies, getting a visa was incredibly stressful for international researchers. Now, they are afraid to exercise their First Amendment rights2 and anxious about leaving and re-entering the US. Uncertain visa conditions disrupt their lives, hindering their ability to travel, share results, and fully participate in science.

There is also concern among faculty that the National Science Foundation will no longer be a reliable source of funding if the presidential budget is adopted. Already, divisions are being abolished or restructured to focus on funding research in artificial intelligence, quantum information science, biotechnology, nuclear energy, or translational science3. Research into materials and high-energy physics could come to a sudden halt. This has many rattled, since we fund our research groups with federal research grants. If we lose that, we lose the ability to fund our students and postdocs4. Even temporary interruptions have this human cost in addition to devastating entire projects.

We hear all the moves against us at universities5, and it hits hard, creating an atmosphere of anger and fear. I am sad and disappointed as well. Countries are trying to entice away US scientists. France and the EU are already allocating funds and hiring US scientists. China directly offered a deal to a Nobel laureate affected by US cuts (he declined). I have heard genuine sentiments from people considering leaving the US. After elections, some might half-jokingly talk about leaving. But this time, it’s different. Researchers fear losing their careers and see moving as a real option to continue innovating. And how can the US compete with any other country if we lose our researchers? We won’t develop the next touch screens and lithium-ion batteries, nor find the next cure for tuberculosis, innovations born at US universities.

I hope those in charge walk us back from the edge and support the center of science in the world: US universities and the people who work here. In the meantime, call your representatives and senators. Tell them clearly: cutting NSF funding directly harms America’s scientific leadership. The American Physical Society’s advocacy webpage can guide you. Pay attention locally, too, as some states (like Indiana) are considering similar harmful measures.

Scientists, professors, graduate students, and researchers are not, in any sense, the enemy. We are part of what makes America great.


1. Quote found in this NYTimes piece.

2. Rights which are for the people, not just citizens. I’m not a lawyer, but the spirit of the Bill of Rights is dead when people are being detained and deported for what they say and do.

3. While “quantum” is in there, whole swaths of physics are missing, not to mention other disciplines the NSF has traditionally funded.

4. In case you don’t know the lingo, postdocs are postdoctoral researchers: temporary research positions post-PhD where you work for an advisor doing science nearly full time.

5. There have been enough that my attempts to make a list have always missed something.

June 04, 2025

Terence Tao Decomposing a factorial into large factors (second version)

Boris Alexeev, Evan Conway, Matthieu Rosenfeld, Andrew Sutherland, Markus Uhr, Kevin Ventullo, and I have uploaded to the arXiv a second version of our paper “Decomposing a factorial into large factors”. This is a completely rewritten and expanded version of a previous paper of the same name. Thanks to many additional theoretical and numerical contributions from the other coauthors, we now have much more precise control on the main quantity {t(N)} studied in this paper, allowing us to settle all the previous conjectures about this quantity in the literature.

As discussed in the previous post, {t(N)} denotes the largest integer {t} such that the factorial {N!} can be expressed as a product of {N} factors, each of which is at least {t}. Computing {t(N)} is a special case of the bin covering problem, which is known to be NP-hard in general; and prior to our work, {t(N)} was only computed for {N \leq 599}; we have been able to compute {t(N)} for all {N \leq 10000}. In fact, we can get surprisingly sharp upper and lower bounds on {t(N)} for much larger {N}, with a precise asymptotic

\displaystyle \frac{t(N)}{N} = \frac{1}{e} - \frac{c_0}{\log N} - \frac{O(1)}{\log^{1+c} N}

for an explicit constant {c_0 = 0.30441901\dots}, which we conjecture to be improvable to

\displaystyle \frac{t(N)}{N} = \frac{1}{e} - \frac{c_0}{\log N} - \frac{c_1+o(1)}{\log^{2} N}

for an explicit constant {c_1 = 0.75554808\dots}: … For instance, we can demonstrate numerically that

\displaystyle 0 \leq t(9 \times 10^8) - 316560601 \leq 113.

As a consequence of this precision, we can verify several conjectures of Guy and Selfridge, namely

  • {t(N) \leq N/e} for all {N \neq 1,2,4}.
  • {t(N) \geq \lfloor 2N/7\rfloor} for all {N \neq 56}.
  • {t(N) \geq N/3} for all {N \geq 3 \times 10^5}. (In fact we show this is true for {N \geq 43632}, and that this threshold is best possible.)

Guy and Selfridge also claimed that one can establish {t(N) \geq N/4} for all large {N} purely by rearranging factors of {2} and {3} from the standard factorization {1 \times 2 \times \dots \times N} of {N!}, but surprisingly we found that this claim (barely) fails for all {N > 26244}:

The accuracy of our bounds comes from several techniques:

  • Greedy algorithms, in which one allocates the largest prime factors of {N!} first and then moves to smaller primes, provide quickly computable, though suboptimal, lower bounds on {t(N)} for small, medium, and moderately large values (a toy sketch of this greedy idea appears just after this list);
  • Linear programming and integer programming methods provide extremely accurate upper and lower bounds on {t(N)} for small and medium values of {N};
  • Rearrangement methods can be analyzed asymptotically via linear programming, and work well for large {N}; and
  • The modified approximate factorization strategy, discussed in the previous post, is now sharpened by using {3}-smooth numbers (products of {2} and {3}) as the primary “liquidity pool” to reallocate factors of {N!}, as opposed to the previous approach of only using powers of {2}.
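
As a concrete illustration of the greedy idea from the first bullet above (a deliberately simplified toy, not the code used in the paper): compute the prime factorization of {N!} via Legendre’s formula, sweep through the primes from largest to smallest, and close off a factor each time the running product reaches a candidate threshold {t}; if at least {N} factors get completed, then {t(N) \geq t}, and a binary search finds the largest {t} this particular packing certifies.

```python
def primes_up_to(n):
    """Simple sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]

def factorial_exponents(N):
    """Legendre's formula: the exponent of each prime p <= N in N!."""
    exps = {}
    for p in primes_up_to(N):
        e, q = 0, p
        while q <= N:
            e += N // q
            q *= p
        exps[p] = e
    return exps

def greedy_count(N, t, exps):
    """Greedily pack the prime factors of N! (largest primes first) into
    factors of size >= t; return how many such factors get completed."""
    count, current = 0, 1
    for p in sorted(exps, reverse=True):
        for _ in range(exps[p]):
            current *= p
            if current >= t:
                count += 1
                current = 1
    return count  # any leftover product < t can be merged into a completed factor

def greedy_lower_bound(N):
    """Largest t this particular greedy packing certifies, so t(N) >= result."""
    exps = factorial_exponents(N)
    lo, hi = 1, N   # t = 1 is trivially achievable (pad with 1s); t(N) < N caps the search
    while lo < hi:  # binary search; the greedy count is non-increasing in t
        mid = (lo + hi + 1) // 2
        if greedy_count(N, mid, exps) >= N:
            lo = mid
        else:
            hi = mid - 1
    return lo

# e.g. greedy_lower_bound(100) gives a quickly computable lower bound on t(100)
```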

To me, the biggest surprise was just how stunningly accurate the linear programming methods were; the very large number of repeated prime factors here actually make this discrete problem behave rather like a continuous one.

June 03, 2025

Terence Tao A Lean companion to “Analysis I”

Almost 20 years ago, I wrote a textbook in real analysis called “Analysis I“. It was intended to complement the many good available analysis textbooks out there by focusing more on foundational issues, such as the construction of the natural numbers, integers, rational numbers, and reals, as well as providing enough set theory and logic to allow students to develop proofs at high levels of rigor.

While some proof assistants such as Coq or Agda were well established when the book was written, formal verification was not on my radar at the time. However, now that I have had some experience with this subject, I realize that the content of this book is in fact very compatible with such proof assistants; in particular, the ‘naive type theory’ that I was implicitly using to do things like construct the standard number systems, dovetails well with the dependent type theory of Lean (which, among other things, has excellent support for quotient types).

I have therefore decided to launch a Lean companion to “Analysis I”, which is a “translation” of many of the definitions, theorems, and exercises of the text into Lean. In particular, this gives an alternate way to perform the exercises in the book, by instead filling in the corresponding “sorries” in the Lean code. (I do not however plan on hosting “official” solutions to the exercises in this companion; instead, feel free to create forks of the repository in which these sorries are filled in.)

Currently, the following sections of the text have been translated into Lean:

The formalization has been deliberately designed to be separate from the standard Lean math library Mathlib at some places, but reliant on it at others. For instance, Mathlib already has a standard notion of the natural numbers {{\bf N}}. In the Lean formalization, I first develop “by hand” an alternate construction Chapter2.Nat of the natural numbers (or just Nat, if one is working in the Chapter2 namespace), setting up many of the basic results about these alternate natural numbers which parallel similar lemmas about {{\bf N}} that are already in Mathlib (but with many of these lemmas set as exercises to the reader, with the proofs currently replaced with “sorries”). Then, in an epilogue section, isomorphisms between these alternate natural numbers and the Mathlib natural numbers are established (or more precisely, set as exercises). From that point on, the Chapter 2 natural numbers are deprecated, and the Mathlib natural numbers are used instead. I intend to continue this general pattern throughout the book, so that as one advances into later chapters, one increasingly relies on Mathlib’s definitions and functions, rather than directly referring to any counterparts from earlier chapters. As such, this companion could also be used as an introduction to Lean and Mathlib as well as to real analysis (somewhat in the spirit of the “Natural number game“, which in fact has significant thematic overlap with Chapter 2 of my text).
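
To give a flavor of the pattern just described, here is a minimal mock-up of my own (not code from the actual companion repository): a hand-rolled copy of the naturals inside a chapter namespace, a lemma left as a “sorry” for the reader, and a map to Mathlib’s ℕ that an epilogue would upgrade to an isomorphism.

```lean
-- A minimal mock-up of the pattern (not the companion's actual code).
import Mathlib

namespace Chapter2

/-- A from-scratch copy of the natural numbers. -/
inductive Nat where
  | zero : Nat
  | succ : Nat → Nat

/-- Addition, by recursion on the second argument. -/
def Nat.add : Nat → Nat → Nat
  | n, .zero => n
  | n, .succ m => .succ (Nat.add n m)

/-- Exercise: commutativity of addition (the `sorry` is for the reader to fill). -/
theorem Nat.add_comm (n m : Nat) : n.add m = m.add n := by
  sorry

/-- Translate the hand-rolled naturals into Mathlib's ℕ. -/
def Nat.toMathlib : Nat → ℕ
  | .zero => 0
  | .succ n => Nat.toMathlib n + 1

/-- Epilogue-style exercise: the two notions of natural number agree. -/
def natEquiv : Chapter2.Nat ≃ ℕ where
  toFun := Nat.toMathlib
  invFun := sorry
  left_inv := sorry
  right_inv := sorry

end Chapter2
```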

The code in this repository compiles in Lean, but I have not tested whether all of the (numerous) “sorries” in the code can actually be filled (i.e., if all the exercises can actually be solved in Lean). I would be interested in having volunteers “playtest” the companion to see if this can actually be done (and if the helper lemmas or “API” provided in the Lean files are sufficient to fill in the sorries in a conceptually straightforward manner without having to rely on more esoteric Lean programming techniques). Any other feedback will of course also be welcome.

[UPDATE, May 31: moved the companion to a standalone repository.]

June 02, 2025

Terence Tao On the number of exceptional intervals to the prime number theorem in short intervals

Ayla Gafni and I have just uploaded to the arXiv the paper “On the number of exceptional intervals to the prime number theorem in short intervals“. This paper makes explicit some relationships between zero density theorems and prime number theorems in short intervals which were somewhat implicit in the literature at present.

Zero density theorems are estimates of the form

\displaystyle N(\sigma,T) \ll T^{A(\sigma)(1-\sigma)+o(1)}

for various {0 \leq \sigma < 1}, where {T} is a parameter going to infinity, {N(\sigma,T)} counts the number of zeroes of the Riemann zeta function of real part at least {\sigma} and imaginary part between {-T} and {T}, and {A(\sigma)} is an exponent which one would like to be as small as possible. The Riemann hypothesis would allow one to take {A(\sigma)=-\infty} for any {\sigma > 1/2}, but this is an unrealistic goal, and in practice one would be happy with some non-trivial upper bounds on {A(\sigma)}. A key target here is the density hypothesis that asserts that {A(\sigma) \leq 2} for all {\sigma} (this is in some sense sharp because the Riemann-von Mangoldt formula implies that {A(1/2)=2}); this hypothesis is currently known for {\sigma \leq 1/2} and {\sigma \geq 25/32}, but the known bounds are not strong enough to establish this hypothesis in the remaining region. However, there was a recent advance of Guth and Maynard, which among other things improved the upper bound {A_0} on {\sup_\sigma A(\sigma)} from {12/5=2.4} to {30/13=2.307\dots}, marking the first improvement in this bound in over four decades. Here is a plot of the best known upper bounds on {A(\sigma)}, either unconditionally, assuming the density hypothesis, or the stronger Lindelöf hypothesis:

One of the reasons we care about zero density theorems is that they allow one to localize the prime number theorem to short intervals. In particular, if we have the uniform bound {A(\sigma) \leq A_0} for all {\sigma}, then this leads to the prime number theorem

\displaystyle  \sum_{x \leq n < x+x^\theta} \Lambda(n) \sim x^\theta

holding for all {x} if {\theta > 1-\frac{1}{A_0}}, and for almost all {x} (possibly excluding a set of density zero) if {\theta > 1 - \frac{2}{A_0}}. For instance, the Guth-Maynard results give a prime number theorem in almost all short intervals for {\theta} as small as {2/15+\varepsilon}, and the density hypothesis would lower this just to {\varepsilon}.

However, one can ask about more information on this exceptional set, in particular to bound its “dimension” {\mu(\theta)}, which roughly speaking amounts to getting an upper bound of {X^{\mu(\theta)+o(1)}} on the size of the exceptional set in any large interval {[X,2X]}. Based on the above assertions, one expects {\mu(\theta)} to only be bounded by {1} for {\theta < 1-2/A}, be bounded by {-\infty} for {\theta > 1-1/A}, but have some intermediate bound for the remaining exponents.

This type of question had been studied in the past, most directly by Bazzanella and Perelli, although there is earlier work by many authors on some related quantities (such as the second moment {\sum_{n \leq x} (p_{n+1}-p_n)^2} of prime gaps) by such authors as Selberg and Heath-Brown. In most of these works, the best available zero density estimates at that time were used to obtain specific bounds on quantities such as {\mu(\theta)}, but the numerology was usually tuned to those specific estimates, with the consequence being that when newer zero density estimates were discovered, one could not readily update these bounds to match. In this paper we abstract out the arguments from previous work (largely based on the explicit formula for the primes and the second moment method) to obtain an explicit relationship between {\mu(\theta)} and {A(\sigma)}, namely that

\displaystyle  \mu(\theta) \leq \inf_{\varepsilon>0} \sup_{0 \leq \sigma<1; A(\sigma) \geq \frac{1}{1-\theta}-\varepsilon} \mu_{2,\sigma}(\theta)

where

\displaystyle  \mu_{2,\sigma}(\theta) = (1-\theta)(1-\sigma)A(\sigma)+2\sigma-1.

Actually, by also utilizing fourth moment methods, we obtain a stronger bound

\displaystyle  \mu(\theta) \leq \inf_{\varepsilon>0} \sup_{0 \leq \sigma<1; A(\sigma) \geq \frac{1}{1-\theta}-\varepsilon} \min( \mu_{2,\sigma}(\theta), \mu_{4,\sigma}(\theta) )

where

\displaystyle  \mu_{4,\sigma}(\theta) = (1-\theta)(1-\sigma)A^*(\sigma)+4\sigma-3

and {A^*(\sigma)} is the exponent in “additive energy zero density theorems”

\displaystyle N^*(\sigma,T) \ll T^{A^*(\sigma)(1-\sigma)+o(1)}

where {N^*(\sigma,T)} is similar to {N(\sigma,T)}, but bounds the “additive energy” of zeroes rather than just their cardinality. Such bounds have appeared in the literature since the work of Heath-Brown, and are for instance a key ingredient in the recent work of Guth and Maynard. Here are the current best known bounds:

These explicit relationships between exponents are perfectly suited for the recently launched Analytic Number Theory Exponent Database (ANTEDB) (discussed previously here), and have been uploaded to that site.

This formula is moderately complicated (basically an elaborate variant of a Legendre transform), but easy to calculate numerically with a computer program. Here is the resulting bound on {\mu(\theta)} unconditionally and under the density hypothesis (together with a previous bound of Bazzanella and Perelli for comparison, where the range had to be restricted due to a gap in the argument we discovered while trying to reproduce their results):

For comparison, here is the situation assuming strong conjectures such as the density hypothesis, Lindelof hypothesis, or Riemann hypothesis:
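
Since the bound above is described as easy to calculate numerically, here is a minimal sketch of what such a computation might look like, using only the second-moment exponent {\mu_{2,\sigma}(\theta)} and a stand-in choice of {A(\sigma)} (Ingham’s classical exponent {3/(2-\sigma)}, purely for illustration; the paper and the ANTEDB of course work with the best known piecewise bounds):

```python
import numpy as np

def A_ingham(sigma):
    """Stand-in zero-density exponent: Ingham's classical A(sigma) <= 3/(2 - sigma).
    Illustrative only; real applications plug in the best known piecewise bounds."""
    return 3.0 / (2.0 - sigma)

def mu_upper_bound(theta, A=A_ingham, eps=1e-9, n_grid=100_000):
    """Evaluate the second-moment bound
        mu(theta) <= sup { (1-theta)(1-sigma)A(sigma) + 2*sigma - 1 :
                           0 <= sigma < 1,  A(sigma) >= 1/(1-theta) - eps }
    on a grid of sigma values; returns -inf if no sigma is admissible."""
    sigmas = np.linspace(0.0, 1.0, n_grid, endpoint=False)
    Avals = np.array([A(s) for s in sigmas])
    admissible = Avals >= 1.0 / (1.0 - theta) - eps
    if not admissible.any():
        return float("-inf")
    vals = (1.0 - theta) * (1.0 - sigmas) * Avals + 2.0 * sigmas - 1.0
    return float(vals[admissible].max())

for theta in (0.2, 0.4, 0.6, 0.8):
    print(f"theta = {theta:.1f}: mu(theta) <= {mu_upper_bound(theta):.4f}")
```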

June 01, 2025

Doug Natelson Pushing back on US science cuts: Now is a critical time

Every week has brought more news about actions that, either as a collateral effect or a deliberate goal, will deeply damage science and engineering research in the US.  Put aside for a moment the tremendously important issue of student visas (where there seems to be a policy of strategic vagueness, to maximize the implicit threat that there may be selective actions).  Put aside the statement from a Justice Department official that the general plan is to "bring these universities to their knees", on the pretext that this is somehow about civil rights.

The detailed version of the presidential budget request for FY26 is now out (pdf here for the NSF portion).  If enacted, it would be deeply damaging to science and engineering research in the US and the pipeline of trained students who support the technology sector.  Taking NSF first:  The topline NSF budget would be cut from $8.34B to $3.28B.  Engineering would be cut by 75%, Math and Physical Science by 66.8%.  The anticipated agency-wide success rate for grants would nominally drop below 7%, though that is misleading (basically taking the present average success rate and cutting it by 2/3, while some programs are already more competitive than others).  In practice, many programs already have future-year obligations, and any remaining funds will have to go there, meaning that many programs would likely have no awards at all in the coming fiscal year.  The NSF's CAREER program (that agency's flagship young investigator program) would go away.  This plan would also close one of the LIGO observatories (see previous link).  (This would be an extra bonus level of stupid, since LIGO's ability to do science relies on having two facilities, to avoid false positives and to identify event locations in the sky.  You might as well say that you'll keep an accelerator running but not the detector.)  Here is the table that I think hits hardest, dollars aside:

The number of people involved in NSF activities would drop by 240,000.  The graduate research fellowship program would be cut by more than half.  The NSF research training grant program (another vector for grad fellowships) would be eliminated.  

The situation at NIH and NASA is at least as bleak.  See here for a discussion from Joshua Weitz at Maryland which includes this plot: 


This proposed dismantling of US research and especially the pipeline of students who support the technology sector (including medical research, computer science, AI, the semiconductor industry, chemistry and chemical engineering, the energy industry) is astonishing in absolute terms.  It also does not square with the claim of some of our elected officials and high tech CEOs to worry about US competitiveness in science and engineering.  (These proposed cuts are not about fiscal responsibility; just the amount added in the proposed DOD budget dwarfs these cuts by more than a factor of 3.)

If you are a US citizen and think this is the wrong direction, now is the time to talk to your representatives in Congress. In the past, Congress has ignored presidential budget requests for big cuts.  The American Physical Society, for example, has tools to help with this.  Contacting legislators by phone is also made easy these days.  From the standpoint of public outreach, Cornell has an effort backing large-scale writing of editorials and letters to the editor.




May 31, 2025

Tommaso Dorigo The Anomaly That Wasn't: An Example Of Shifting Consensus In Science

Time is a gentleman - it waits patiently. And in physics, as in all exact sciences, problems and mysteries eventually get resolved, if we give it enough time. That is how science works, after all: the consensus on our explanation of reality changes as we acquire more information on the latter.

read more

Scott Aaronson “If Anyone Builds It, Everyone Dies”

Eliezer Yudkowsky and Nate Soares are publishing a mass-market book, the rather self-explanatorily-titled If Anyone Builds It, Everyone Dies. (Yes, the “it” means “sufficiently powerful AI.”) The book is now available for preorder from Amazon:

(If you plan to buy the book at all, Eliezer and Nate ask that you do preorder it, as this will apparently increase the chance of it making the bestseller lists and becoming part of The Discourse.)

I was graciously offered a chance to read a draft and offer, not a “review,” but some preliminary thoughts. So here they are:

For decades, Eliezer has been warning the world that an AI might soon exceed human abilities, and proceed to kill everyone on earth, in pursuit of whatever strange goal it ended up with.  It would, Eliezer said, be something like what humans did to the earlier hominids.  Back around 2008, I followed the lead of most of my computer science colleagues, who considered these worries, even if possible in theory, comically premature given the primitive state of AI at the time, and all the other severe crises facing the world.

Now, of course, not even two decades later, we live on a planet that’s being transformed by some of the signs and wonders that Eliezer foretold.  The world’s economy is about to be upended by entities like Claude and ChatGPT, AlphaZero and AlphaFold—whose human-like or sometimes superhuman cognitive abilities, obtained “merely” by training neural networks (in the first two cases, on humanity’s collective output) and applying massive computing power, constitute (I’d say) the greatest scientific surprise of my lifetime.  Notably, these entities have already displayed some of the worrying behaviors that Eliezer warned about decades ago—including lying to humans in pursuit of a goal, and hacking their own evaluation criteria.  Even many of the economic and geopolitical aspects have played out as Eliezer warned they would: we’ve now seen AI companies furiously racing each other, seduced by the temptation of being (as he puts it) “the first monkey to taste the poisoned banana,” discarding their previous explicit commitments to safety, transparency, and the public good once they get in the way.

Today, then, even if one still isn’t ready to swallow the full package of Yudkowskyan beliefs, any empirically minded person ought to be updating in its direction—and acting accordingly.  Which brings us to the new book by Eliezer and his collaborator Nate Soares.  This book is far and away the clearest, most accessible presentation of Eliezer’s beliefs, the culmination of a quarter-century of his developing and talking about them.  That undoubtedly owes a great deal to Nate, who seems to have sanded down the infamously brusque rough edges of Eliezer’s writing style.  So much the better!  But it also owes a lot to the world itself: current events now offer an endless supply of real-world examples for Eliezer’s formerly abstract arguments about AI, examples that the book deploys to maximum effect.

The book also mines history—everything from the Wright Brothers to World War II to the Chernobyl accident—for lessons about human attitudes toward technological progress, safety, and risk.  And it maintains Eliezer’s fondness for stories and parables, one of the most charming features of his writing.

Even today, I’m not nearly as confident about the doom scenario as Eliezer and Nate are.  I don’t know whether an AI’s goals are really “orthogonal” to its abilities, in the sense that will matter in practice.  And when I reach the part where the AI, having copied itself all over the Internet and built robot factories, then invents and releases self-replicating nanotechnology that gobbles the surface of the earth in hours or days, a large part of me still screams out that there must be practical bottlenecks that haven’t been entirely accounted for here.

And yet, even if you agree with only a quarter of what Eliezer and Nate write, you’re likely to close this book fully convinced—as I am—that governments need to shift to a more cautious approach to AI, an approach more respectful of the civilization-changing enormity of what’s being created.  And that, if they won’t, their citizens need to pressure them to do so.

So regardless of how much they agree or disagree, I’d like everyone on earth who cares about the future to read this book, debate its ideas, and have its thesis in mind when they’re discussing AI.

As for me?  It would’ve been better if I’d reached my current position earlier: if I hadn’t needed empirical reality, plus superb writing like Eliezer’s and Nate’s, to bonk me over the head with the risks that AI was likely to pose to humanity in my lifetime.  But having failed to see as far ahead as they did, the least I can do is update.  You should too, and you can start by reading the book.


As it happens, this weekend I’ll be at LessOnline, the rationalist blogging conference in Berkeley, where (among my other events) I’ll engage in a dialogue/debate with Nate Soares about the orthogonality thesis, one of the crucial underpinnings of his and Eliezer’s case for AI doom. So, I’ll probably be LessAvailable to respond to comments on this post. But feel free to discuss anyway! After all, it’s merely the fate of all Earth-originating life that’s at stake here, not some actually hot-button topic like Trump or Gaza.

May 30, 2025

Matt von Hippel In Scientific American, With a Piece on Vacuum Decay

I had a piece in Scientific American last week. It’s paywalled, but if you’re a subscriber there you can see it, or you can buy the print magazine.

(I also had two pieces out in other outlets this week. I’ll be saying more about them…in a couple weeks.)

The Scientific American piece is about an apocalyptic particle physics scenario called vacuum decay. It’s a topic I covered last year in Quanta Magazine, an unlikely event where the Higgs field which gives fundamental particles their mass changes value, suddenly making all other particles much more massive and changing physics as we know it. It’s a change that physicists think would start as a small bubble and spread at (almost) the speed of light, covering the universe.

What I wrote for Quanta was a short news piece covering a small adjustment to the calculation, one that made the chance of vacuum decay slightly more likely. (But still mind-bogglingly small, to be clear.)

Scientific American asked for a longer piece, and that gave me space to dig deeper. I was able to say more about how vacuum decay works, with a few metaphors that I think should make it a lot easier to understand. I also got to learn about some new developments, in particular, an interesting story about how tiny primordial black holes could make vacuum decay dramatically more likely.

One thing that was a bit too complicated to talk about were the puzzles involved in trying to calculate these chances. In the article, I mention a calculation of the chance of vacuum decay by a team including Matthew Schwartz. That calculation wasn’t the first to estimate the chance of vacuum decay, and it’s not the most recent update either. Instead, I picked it because Schwartz’s team approached the question in what struck me as a more reliable way, trying to cut through confusion by asking the most basic question you can in a quantum theory: given that now you observe X, what’s the chance that later you observe Y? Figuring out how to turn vacuum decay into that kind of question correctly is tricky (for example, you need to include the possibility that vacuum decay happens, then reverses, then happens again).

The calculations of black holes speeding things up didn’t work things out in quite as much detail. I like to think I’ve made a small contribution by motivating them to look at Schwartz’s work, which might spawn a more rigorous calculation in future. When I talked to Schwartz, he wasn’t even sure whether the picture of a bubble forming in one place and spreading at light speed is correct: he’d calculated the chance of the initial decay, but hadn’t found a similarly rigorous way to think about the aftermath. So even more than the uncertainty I talk about in the piece, the questions about new physics and probability, there is even some doubt about whether the whole picture really works the way we’ve been imagining it.

That makes for a murky topic! But it’s also a flashy one, a compelling story for science fiction and the public imagination, and yeah, another motivation to get high-precision measurements of the Higgs and top quark from future colliders! (If maybe not quite the way this guy said it.)

Terence Tao Cosmic Distance Ladder videos with Grant Sanderson (3blue1brown): commentary and corrections

Grant Sanderson (who runs, and creates most of the content for, the website and Youtube channel 3blue1brown) has been collaborating with myself and others (including my coauthor Tanya Klowden) on producing a two-part video giving an account of some of the history of the cosmic distance ladder, building upon a previous public lecture I gave on this topic, and also relating to a forthcoming popular book with Tanya on this topic. The first part of this video is available here; the second part is available here.

The videos were based on a somewhat unscripted interview that Grant conducted with me some months ago, and as such contained some minor inaccuracies and omissions (including some made for editing reasons to keep the overall narrative coherent and within a reasonable length). They also generated many good questions from the viewers of the Youtube video. I am therefore compiling here a “FAQ” of various clarifications and corrections to the videos; this was originally placed as a series of comments on the Youtube channel, but the blog post format here will be easier to maintain going forward. Some related content will also be posted on the Instagram page for the forthcoming book with Tanya.

Questions on the two main videos are marked with an appropriate timestamp to the video.

Comments on part 1 of the video

  • 4:26 Did Eratosthenes really check a local well in Alexandria?

    This was a narrative embellishment on my part. Eratosthenes’s original work is lost to us. The most detailed contemporaneous account, by Cleomedes, gives a simplified version of the method, and makes reference only to sundials (gnomons) rather than wells. However, a secondary account of Pliny states (using this English translation), “Similarly it is reported that at the town of Syene, 5000 stades South of Alexandria, at noon in midsummer no shadow is cast, and that in a well made for the sake of testing this the light reaches to the bottom, clearly showing that the sun is vertically above that place at the time”. However, no mention is made of any well in Alexandria in either account.
  • 4:50 How did Eratosthenes know that the Sun was so far away that its light rays were close to parallel?

    This was not made so clear in our discussions or in the video (other than a brief glimpse of the timeline at 18:27), but Eratosthenes’s work actually came after Aristarchus, so it is very likely that Eratosthenes was aware of Aristarchus’s conclusions about how distant the Sun was from the Earth. Even if Aristarchus’s heliocentric model was disputed by the other Greeks, at least some of his other conclusions appear to have attracted some support. Also, after Eratosthenes’s time, there was further work by Greek, Indian, and Islamic astronomers (such as Hipparchus, Ptolemy, Aryabhata, and Al-Battani) to measure the same distances that Aristarchus did, although these subsequent measurements for the Sun also were somewhat far from modern accepted values.
  • 5:17 Is it completely accurate to say that on the summer solstice, the Earth’s axis of rotation is tilted “directly towards the Sun”?

    Strictly speaking, “in the direction towards the Sun” is more accurate than “directly towards the Sun”; it tilts at about 23.5 degrees towards the Sun, but it is not a total 90-degree tilt towards the Sun.
  • 5:39 Wait, aren’t there two tropics? The tropic of Cancer and the tropic of Capricorn?

    Yes! This corresponds to the two summers Earth experiences, one in the Northern hemisphere and one in the Southern hemisphere. The tropic of Cancer, at a latitude of about 23 degrees north, is where the Sun is directly overhead at noon during the Northern summer solstice (around June 21); the tropic of Capricorn, at a latitude of about 23 degrees south, is where the Sun is directly overhead at noon during the Southern summer solstice (around December 21). But Alexandria and Syene were both in the Northern Hemisphere, so it is the tropic of Cancer that is relevant to Eratosthenes’ calculations.
  • 5:41 Isn’t it kind of a massive coincidence that Syene was on the tropic of Cancer?

    Actually, Syene (now known as Aswan) was about half a degree of latitude away from the tropic of Cancer, which was one of the sources of inaccuracy in Eratosthenes’ calculations.  But one should take the “look-elsewhere effect” into account: because the Nile cuts across the tropic of Cancer, it was quite likely to happen that the Nile would intersect the tropic near some inhabited town.  It might not necessarily have been Syene, but that would just mean that Syene would have been substituted by this other town in Eratosthenes’s account.  

    On the other hand, it was fortunate that the Nile ran from South to North, so that distances between towns were a good proxy for the differences in latitude.  Apparently, Eratosthenes actually had a more complicated argument that would also work if the two towns in question were not necessarily oriented along the North-South direction, and if neither town was on the tropic of Cancer; but unfortunately the original writings of Eratosthenes are lost to us, and we do not know the details of this more general argument. (But some variants of the method can be found in later work of Posidonius, Aryabhata, and others.)

    Nowadays, the “Eratosthenes experiment” is run every year on the March equinox, in which schools at the same longitude are paired up to measure the elevation of the Sun at the same point in time, in order to obtain a measurement of the circumference of the Earth.  (The equinox is more convenient than the solstice when neither location is on a tropic, due to the simple motion of the Sun at that date.) With modern timekeeping, communications, surveying, and navigation, this is a far easier task to accomplish today than it was in Eratosthenes’ time.
  • 6:30 I thought the Earth wasn’t a perfect sphere. Does this affect this calculation?

    Yes, but only by a small amount. The centrifugal forces caused by the Earth’s rotation about its axis create an equatorial bulge and a polar flattening, so that the radius of the Earth varies by about 20 kilometers between the poles and the equator. This sounds like a lot, but it is only about 0.3% of the mean Earth radius of 6371 km and is not the primary source of error in Eratosthenes’ calculations.
  • 7:27 Are the riverboat merchants and the “grad student” the leading theories for how Eratosthenes measured the distance from Alexandria to Syene?

    There is some recent research that suggests that Eratosthenes may have drawn on the work of professional bematists (step measurers – a precursor to the modern profession of surveyor) for this calculation. This somewhat ruins the “grad student” joke, but perhaps should be disclosed for the sake of completeness.
  • 8:51 How long is a “lunar month” in this context? Is it really 28 days?

    In this context the correct notion of a lunar month is a “synodic month” – the length of a lunar cycle relative to the Sun – which is actually about 29 days and 12 hours. It differs from the “sidereal month” – the length of a lunar cycle relative to the fixed stars – which is about 27 days and 8 hours – due to the motion of the Earth around the Sun (or the Sun around the Earth, in the geocentric model). [A similar correction needs to be made around 14:59, using the synodic month of 29 days and 12 hours rather than the “English lunar month” of 28 days (4 weeks).]
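
    As a quick sanity check (my own addition, using standard modern values rather than anything from the video), here is the relation between the sidereal and synodic months in Python:

        # The Earth-Sun direction advances by about 360/365.25 degrees per day, so relative
        # to the Sun the Moon must cover that extra angle each cycle:
        #   1/T_synodic = 1/T_sidereal - 1/T_year
        sidereal_month = 27.32   # days, lunar orbit measured against the fixed stars
        year = 365.25            # days

        synodic_month = 1 / (1 / sidereal_month - 1 / year)
        print(f"synodic month = {synodic_month:.2f} days")   # 29.53 days, i.e. about 29 days and 12.7 hours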
  • 10:47 Is the time taken for the Moon to complete an observed rotation around the Earth slightly less than 24 hours as claimed?

    Actually, I made a sign error: the lunar day (also known as a tidal day) is actually 24 hours and 50 minutes, because the Moon orbits the Earth in the same direction as the Earth spins about its axis. The animation is therefore also moving in the wrong direction (relatedly, the line of sight covers up the Moon in the direction opposite to that of the Moon rising at around 10:38).
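
    The corrected figure can be checked with the same kind of “beat frequency” computation as in the previous entry (again my own sketch, with modern values):

        # Both the Earth's spin and the Moon's orbit run eastward, so the Earth must rotate
        # slightly more than once (relative to the stars) to face the Moon again:
        #   1/T_lunar_day = 1/T_sidereal_day - 1/T_sidereal_month
        sidereal_day = 23.934 / 24   # Earth's rotation period relative to the stars, in days
        sidereal_month = 27.32       # Moon's orbital period relative to the stars, in days

        lunar_day_hours = 24 / (1 / sidereal_day - 1 / sidereal_month)
        print(f"lunar day = {lunar_day_hours:.1f} hours")   # about 24.8 hours, i.e. 24 hours 50 minutes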
  • 11:32 Is this really just a coincidence that the Moon and Sun have almost the same angular width?

    I believe so. First of all, the agreement is not that good: due to the non-circular nature of the orbit of the Moon around the Earth, and of the Earth around the Sun, the angular width of the Moon actually fluctuates to be as much as 10% larger or smaller than that of the Sun at various times (cf. the “supermoon” phenomenon). Moreover, no other known planet with moons exhibits this sort of agreement, so there does not appear to be any universal law of nature that would enforce this coincidence. (This is in contrast with the empirical fact that the Moon always presents the same side to the Earth, which also occurs for all other known large moons (as well as Pluto), and is well explained by the physical phenomenon of tidal locking.)

    On the other hand, as the video hopefully demonstrates, the existence of the Moon was extremely helpful in allowing the ancients to understand the basic nature of the solar system. Without the Moon, their task would have been significantly more difficult; but in this hypothetical alternate universe, it is likely that modern cosmology would have still become possible once advanced technology such as telescopes, spaceflight, and computers became available, especially when combined with the modern mathematics of data science. Without giving away too many spoilers, a scenario similar to this was explored in the classic short story and novel “Nightfall” by Isaac Asimov.
  • 12:58 Isn’t the illuminated portion of the Moon, as well as the visible portion of the Moon, slightly smaller than half of the entire Moon, because the Earth and Sun are not an infinite distance away from the Moon?

    Technically yes (and this is actually for a very similar reason to why half Moons don’t quite occur halfway between the new Moon and the full Moon); but this fact turns out to have only a very small effect on the calculations, and is not the major source of error. In reality, the Sun turns out to be about 86,000 Moon radii away from the Moon, so asserting that half of the Moon is illuminated by the Sun is actually a very good first approximation. (The Earth is “only” about 220 Moon radii away, so the visible portion of the Moon is a bit more noticeably less than half; but this doesn’t actually affect Aristarchus’s arguments much.)

    The angular diameter of the Sun also creates an additional thin band between the fully illuminated and fully non-illuminated portions of the Moon, in which the Sun is intersecting the lunar horizon and so only illuminates the Moon with a portion of its light, but this is also a relatively minor effect (and the midpoints of this band can still be used to define the terminator between illuminated and non-illuminated for the purposes of Aristarchus’s arguments).
  • 13:27 What is the difference between a half Moon and a quarter Moon?

    If one divides the lunar month, starting and ending at a new Moon, into quarters (weeks), then half moons occur both near the end of the first quarter (a week after the new Moon, and a week before the full Moon), and near the end of the third quarter (a week after the full Moon, and a week before the new Moon). So, somewhat confusingly, half Moons come in two types, known as “first quarter Moons” and “third quarter Moons”.
  • 14:49 I thought the sine function was introduced well after the ancient Greeks.

    It’s true that the modern sine function only dates back to the Indian and Islamic mathematical traditions in the first millennium CE, several centuries after Aristarchus.  However, he still had Euclidean geometry at his disposal, which provided tools such as similar triangles that could be used to reach basically the same conclusions, albeit with significantly more effort than would be needed if one could use modern trigonometry.

    On the other hand, Aristarchus was somewhat hampered by not knowing an accurate value for \pi, which is also known as Archimedes’ constant: the fundamental work of Archimedes on this constant actually took place a few decades after that of Aristarchus!
  • 15:17 I plugged in the modern values for the distances to the Sun and Moon and got 18 minutes for the discrepancy, instead of half an hour.

    Yes; I quoted the wrong number here. In 1630, Godfried Wendelen replicated Aristarchus’s experiment. With improved timekeeping and the then-recent invention of the telescope, Wendelen obtained a measurement of half an hour for the discrepancy, which is significantly better than Aristarchus’s figure of six hours, but still a little bit off from the true value of 18 minutes. (As such, Wendelen’s estimate for the distance to the Sun was about 60% of the true value.)
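
    For those who want to reproduce this figure, here is one way to do so (a sketch of my own, interpreting the discrepancy as the time offset of the half Moon from the quadrature point, and using modern values):

        import math

        # At a perfect half Moon the Sun-Moon-Earth angle is 90 degrees, so the Moon's
        # elongation from the Sun falls short of 90 degrees by arcsin(d_moon / d_sun).
        d_moon = 384_400        # km, mean Earth-Moon distance
        d_sun = 149_600_000     # km, one astronomical unit
        synodic_month = 29.53   # days

        offset_deg = math.degrees(math.asin(d_moon / d_sun))
        offset_min = offset_deg / 360 * synodic_month * 24 * 60
        print(f"offset = {offset_min:.1f} minutes")   # about 17-18 minutes

    which is consistent with the figure quoted in the question.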
  • 15:27 Wouldn’t Aristarchus also have access to other timekeeping devices than sundials?

    Yes, for instance clepsydrae (water clocks) were available by that time; but they were of limited accuracy. It is also possible that Aristarchus could have used measurements of star elevations to estimate time; it is not clear whether the astrolabe or the armillary sphere was available to him, but he would have had some other more primitive astronomical instruments, such as the dioptra, at his disposal. But again, the accuracy and calibration of these timekeeping tools would have been poor.

    However, most likely the more important limiting factor was the ability to determine the precise moment at which a perfect half Moon (or new Moon, or full Moon) occurs; this is extremely difficult to do with the naked eye. (The telescope would not be invented for almost two more millennia.)
  • 17:37 Could the parallax problem be solved by assuming that the stars are not distributed in a three-dimensional space, but instead on a celestial sphere?

    Putting all the stars on a fixed sphere would make the parallax effects less visible, as the stars in a given portion of the sky would now all move together at the same apparent velocity – but there would still be visible large-scale distortions in the shape of the constellations because the Earth would be closer to some portions of the celestial sphere than others; there would also be variability in the brightness of the stars, and (if they were very close) the apparent angular diameter of the stars. (These problems would be solved if the celestial sphere was somehow centered around the moving Earth rather than the fixed Sun, but then this basically becomes the geocentric model with extra steps.)
  • 18:29 Did nothing of note happen in astronomy between Eratosthenes and Copernicus?

    Not at all! There were significant mathematical, technological, theoretical, and observational advances by astronomers from many cultures (Greek, Islamic, Indian, Chinese, European, and others) during this time: for instance, improvements to some of the previous measurements on the distance ladder, a better understanding of eclipses, axial tilt, and even axial precession, more sophisticated trigonometry, and the development of new astronomical tools such as the astrolabe. See for instance this “deleted scene” from the video, as well as the FAQ entry for 14:49 for this video and 24:54 for the second video, or this instagram post. But in order to make the overall story of the cosmic distance ladder fit into a two-part video, we chose to focus primarily on the first time each rung of the ladder was climbed.
  • 18:30 Is that really Kepler’s portrait?

    We have since learned that this portrait was most likely painted in the 19th century, and may have been based more on Kepler’s mentor, Michael Mästlin. A more commonly accepted portrait of Kepler may be found at his current Wikipedia page.
  • 19:07 Isn’t it tautological to say that the Earth takes one year to perform a full orbit around the Sun?

    Technically yes, but this is an illustration of the philosophical concept of “referential opacity”: the content of a sentence can change when substituting one term for another (e.g., “1 year” and “365 days”), even when both terms refer to the same object. Amusingly, the classic illustration of this, known as Frege’s puzzle, also comes from astronomy: it is an informative statement that Hesperus (the evening star) and Phosphorus (the morning star, also known as Lucifer) are the same object (which nowadays we call Venus), but it is a mere tautology that Hesperus and Hesperus are the same object: replacing the term “Phosphorus” with “Hesperus” changes the meaning of the sentence, even though both terms have the same referent.
  • 19:10 How did Copernicus figure out the crucial fact that Mars takes 687 days to go around the Sun? Was it directly drawn from Babylonian data?

    Technically, Copernicus drew from tables by European astronomers that were largely based on earlier tables from the Islamic golden age, which in turn drew from earlier tables by Indian and Greek astronomers, the latter of which also incorporated data from the ancient Babylonians, so it is more accurate to say that Copernicus relied on centuries of data, at least some of which went all the way back to the Babylonians. Among all of this data were the times when Mars was in opposition to the Sun; if one imagines the Earth and Mars as being like runners going around a race track circling the Sun, with Earth on an inner track and Mars on an outer track, oppositions are analogous to when the Earth runner “laps” the Mars runner. From the centuries of observational data, such “laps” were known to occur about once every 780 days (this is known as the synodic period of Mars). Because the Earth takes roughly 365 days to perform a “lap”, it is possible to do a little math and conclude that Mars must therefore complete its own “lap” in 687 days (this is known as the sidereal period of Mars), as sketched below. (See also this post on the cosmic distance ladder Instagram for some further elaboration.)
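
    The “little math” is short enough to spell out (my own sketch, not Copernicus’s actual computation): the rate at which Earth laps Mars is the difference of their angular rates, so

        # 1/T_synodic = 1/T_earth - 1/T_mars, rearranged to solve for T_mars
        T_earth = 365.25   # days
        T_synodic = 780    # days between successive oppositions of Mars

        T_mars = 1 / (1 / T_earth - 1 / T_synodic)
        print(f"sidereal period of Mars = {T_mars:.0f} days")   # about 687 days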
  • 20:52 Did Kepler really steal data from Brahe?

    The situation is complex. When Kepler served as Brahe’s assistant, Brahe only provided Kepler with a limited amount of data, primarily involving Mars, in order to confirm Brahe’s own geo-heliocentric model. After Brahe’s death, the data was inherited by Brahe’s son-in-law and other relatives, who intended to publish Brahe’s work separately; however, Kepler, who was appointed as Imperial Mathematician to succeed Brahe, had at least some partial access to the data, and many historians believe he secretly copied portions of this data to aid his own research before finally securing complete access to the data from Brahe’s heirs after several years of disputes. On the other hand, as intellectual property rights laws were not well developed at this time, Kepler’s actions were technically legal, if ethically questionable.
  • 21:39 What is that funny loop in the orbit of Mars?

    This is known as retrograde motion. It arises because the orbital velocity of Earth (about 30 km/sec) is a little bit larger than that of Mars (about 24 km/sec). So, near opposition (when Mars is in the opposite position in the sky to the Sun), Earth will briefly overtake Mars, causing Mars’s observed position to move westward rather than eastward. But at most other times, the motions of Earth and Mars are at a sufficient angle that Mars continues its apparent eastward motion despite the slightly faster speed of the Earth.
  • 21:59 Couldn’t one also work out the direction to other celestial objects in addition to the Sun and Mars, such as the stars, the Moon, or the other planets?  Would that have helped?

    Actually, the directions to the fixed stars were implicitly used in all of these observations to determine how the celestial sphere was positioned, and all the other directions were taken relative to that celestial sphere.  (Otherwise, all the calculations would be taken on a rotating frame of reference in which the unknown orbits of the planets were themselves rotating, which would have been an even more complex task.)  But the stars are too far away to be useful as one of the two landmarks to triangulate from, as they generate almost no parallax and so cannot distinguish one location from another.

    Measuring the direction to the Moon would tell you which portion of the lunar cycle one was in, and would determine the phase of the Moon, but this information would not help one triangulate, because the Moon’s position in the heliocentric model varies over time in a somewhat complicated fashion, and is too tied to the motion of the Earth to serve as a useful “landmark” for determining the Earth’s orbit around the Sun.

    In principle, using the measurements to all the planets at once could allow for some multidimensional analysis that would be more accurate than analyzing each of the planets separately, but this would require some sophisticated statistical analysis and modeling, as well as non-trivial amounts of compute – neither of which were available in Kepler’s time.
  • 22:57 Can you elaborate on how we know that the planets all move on a plane?

    The Earth’s orbit lies in a plane known as the ecliptic (it is where the lunar and solar eclipses occur). Different cultures have divided up the ecliptic in various ways; in Western astrology, for instance, the twelve main constellations that cross the ecliptic are known as the Zodiac. The planets can be observed to only wander along the Zodiac, but not other constellations: for instance, Mars can be observed to be in Cancer or Libra, but never in Orion or Ursa Major. From this, one can conclude (as a first approximation, at least), that the planets all lie on the ecliptic.

    However, this isn’t perfectly true, and the planets will deviate from the ecliptic by a small angle known as the ecliptic latitude. Tycho Brahe’s observations on these latitudes for Mars were an additional useful piece of data that helped Kepler complete his calculations (basically by suggesting how to join together the different “jigsaw pieces”), but the math here gets somewhat complicated, so the story here has been somewhat simplified to convey the main ideas.
  • 23:04 What are the other universal problem solving tips?

    Grant Sanderson has a list (in a somewhat different order) in this previous video.
  • 23:28 Can one work out the position of Earth from fixed locations of the Sun and Mars when the Sun and Mars are in conjunction (the same location in the sky) or opposition (opposite locations in the sky)?

    Technically, these are two times when the technique of triangulation fails to be accurate; and also in the former case it is extremely difficult to observe Mars due to the proximity to the Sun. But again, following the Universal Problem Solving Tip from 23:07, one should initially ignore these difficulties to locate a viable method, and correct for these issues later. This video series by Welch Labs goes into Kepler’s methods in more detail.
  • 24:04 So Kepler used Copernicus’s calculation of 687 days for the period of Mars. But didn’t Kepler discard Copernicus’s theory of circular orbits?

    Good question! It turns out that Copernicus’s calculations of orbital periods are quite robust (especially with centuries of data), and continue to work even when the orbits are not perfectly circular. But even if the calculations did depend on the circular orbit hypothesis, it would have been possible to use the Copernican model as a first approximation for the period, in order to get a better, but still approximate, description of the orbits of the planets. This in turn can be fed back into the Copernican calculations to give a second approximation to the period, which can then give a further refinement of the orbits. Thanks to the branch of mathematics known as perturbation theory, one can often make this type of iterative process converge to an exact answer, with the error in each successive approximation being smaller than the previous one. (But performing such an iteration would probably have been beyond the computational resources available in Kepler’s time; also, the foundations of perturbation theory require calculus, which only was developed several decades after Kepler.)
  • 24:21 Did Brahe have exactly 10 years of data on Mars’s positions?

    Actually, it was more like 17 years, but with many gaps, due both to inclement weather, as well as Brahe turning his attention to other astronomical objects than Mars in some years; also, in times of conjunction, Mars might only be visible in the daytime sky instead of the night sky, again complicating measurements. So the “jigsaw puzzle pieces” in 25:26 are in fact more complicated than always just five locations equally spaced in time; there are gaps and also observational errors to grapple with. But to understand the method one should ignore these complications; again, see “Universal Problem Solving Tip #1”. Even with his “idea of true genius”, it took many years of further painstaking calculation for Kepler to tease out his laws of planetary motion from Brahe’s messy and incomplete observational data.
  • 26:44 Shouldn’t the Earth’s orbit be spread out at perihelion and clustered closer together at aphelion, to be consistent with Kepler’s laws?

    Yes, you are right; there was a coding error here.
  • 26:53 What is the reference for Einstein’s “idea of pure genius”?

    Actually, the precise quote was “an idea of true genius”, and can be found in the introduction to Carola Baumgardt’s “Life of Kepler“.

Comments on the “deleted scene” on Al-Biruni

  • Was Al-Biruni really of Arab origin?

    Strictly speaking, no: his writings are all in Arabic, and he was nominally a subject of the Abbasid Caliphate, whose rulers were Arab; but he was born in Khwarazm (in modern-day Uzbekistan), and would have been a subject of either the Samanid empire or the Khwarazmian empire, both of which were largely self-governed and primarily Persian in culture and ethnic makeup, despite being technically vassals of the Caliphate. So he would have been part of what is sometimes called “Greater Persia” or “Greater Iran”.

    Another minor correction: while Al-Biruni was born in the tenth century, his work on the measurement of the Earth was published in the early eleventh century.
  • Is \theta really called the angle of declination?

    This was a misnomer on my part; this angle is more commonly called the dip angle.
  • But the height of the mountain would be so small compared to the radius of the Earth! How could this method work?

    Using the Taylor approximation \cos \theta \approx 1 - \theta^2/2, one can approximately write the relationship R = \frac{h \cos \theta}{1-\cos \theta} between the mountain height h, the Earth radius R, and the dip angle \theta (in radians) as R \approx 2 h / \theta^2. The key point here is the inverse quadratic dependence on \theta, which allows even relatively small values of h to still be realistically useful for computing R. Al-Biruni’s measurement of the dip angle \theta was about 0.01 radians, leading to an estimate of R that is about four orders of magnitude larger than h; this is at least in the right ballpark, given that a typical mountain height is on the order of a kilometer and the radius of the Earth is about 6400 kilometers.
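
    As a numerical illustration (with a hypothetical mountain height, since Al-Biruni’s own figures are in disputed units of cubits):

        import math

        theta = 0.01   # dip angle in radians, roughly the value quoted above
        h = 0.5        # km; an illustrative mountain height, not Al-Biruni's actual datum

        R_exact = h * math.cos(theta) / (1 - math.cos(theta))
        R_approx = 2 * h / theta ** 2
        print(f"exact: {R_exact:.0f} km, Taylor approximation: {R_approx:.0f} km")
        # both are on the order of 10^4 km, i.e. about four orders of magnitude larger than h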
  • Was the method really accurate to within a percentage point?

    This is disputed, somewhat similarly to the previous calculations of Eratosthenes. Al-Biruni’s measurements were in cubits, but there were multiple incompatible types of cubit in use at the time. It has also been pointed out that atmospheric refraction effects would have created noticeable changes in the observed dip angle \theta. It is thus likely that the true accuracy of Al-Biruni’s method was poorer than 1%, but that this was somehow compensated for by choosing a favorable conversion between cubits and modern units.

Comments on the second part of the video

  • 1:13 Did Captain Cook set out to discover Australia?

    One of the objectives of Cook’s first voyage was to discover the hypothetical continent of Terra Australis. This was considered to be distinct from Australia, which at the time was known as New Holland. As this name might suggest, prior to Cook’s voyage, the northwest coastline of New Holland had been explored by the Dutch; Cook instead explored the eastern coastline, naming this portion New South Wales. The entire continent was later renamed to Australia by the British government, following a suggestion of Matthew Flinders; and the concept of Terra Australis was abandoned.
  • 4:40 The relative position of the Northern and Southern hemisphere observations is reversed from those earlier in the video.

    Yes, this was a slight error in the animation; the labels here should be swapped for consistency of orientation.
  • 7:06 So, when did they finally manage to measure the transit of Venus, and use this to compute the astronomical unit?

    While Le Gentil had the misfortune to not be able to measure either the 1761 or 1769 transits, other expeditions of astronomers (led by Dixon-Mason, Chappe d’Auteroche, and Cook) did take measurements of one or both of these transits with varying degrees of success, with the measurements of Cook’s team of the 1769 transit in Tahiti being of particularly high quality. All of this data was assembled later by Lalande in 1771, leading to the most accurate measurement of the astronomical unit at the time (within 2.3% of modern values, which was about three times more accurate than any previous measurement).
  • 8:53 What does it mean for the transit of Io to be “twenty minutes ahead of schedule” when Jupiter is in opposition (Jupiter is opposite to the Sun when viewed from the Earth)?

    Actually, it should be halved to “ten minutes ahead of schedule”, with the transit being “ten minutes behind schedule” when Jupiter is in conjunction, and the net discrepancy being twenty minutes (or actually closer to 16 minutes when measured with modern technology). Both transits are being compared against an idealized periodic schedule in which the transits occur at a perfectly regular rate (about once every 42 hours), with the period chosen to be the best fit to the actual data. This discrepancy is only noticeable after carefully comparing transit times over a period of months; at any given position of Jupiter, the Doppler effects of Earth moving towards or away from Jupiter would only shift each transit by a few seconds compared to the previous transit, with the delays or accelerations only becoming cumulatively noticeable after many such transits.

    Also, the presentation here is oversimplified: at times of conjunction, Jupiter and Io are too close to the Sun for observation of the transit. Rømer actually observed the transits at other times than conjunction, and Huygens used more complicated trigonometry than what was presented here to infer a measurement for the speed of light in terms of the astronomical unit (which they had begun to measure a bit more accurately than in Aristarchus’s time; see the FAQ entry for 15:17 in the first video).
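
    As a modern cross-check of the “closer to 16 minutes” figure (my own arithmetic, not part of Rømer’s or Huygens’s reasoning), the net discrepancy between opposition and conjunction is essentially the light travel time across the diameter of Earth’s orbit:

        AU = 1.496e8   # km
        c = 299_792    # km/s, modern value of the speed of light

        delay_min = 2 * AU / c / 60
        print(f"2 AU / c = {delay_min:.1f} minutes")   # about 16.6 minutes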
  • 10:05 Are the astrological signs for Earth and Venus swapped here?

    Yes, this was a small mistake in the animation.
  • 10:34 Shouldn’t one have to account for the elliptical orbit of the Earth, as well as the proper motion of the star being observed, or the effects of general relativity?

    Yes; the presentation given here is a simplified one to convey the idea of the method, but in the most advanced parallax measurements, such as the ones taken by the Hipparcos and Gaia spacecraft, these factors are taken into account, basically by taking as many measurements (not just two) as possible of a single star, and locating the best fit of that data to a multi-parameter model that incorporates the (known) orbit of the Earth with the (unknown) distance and motion of the star, as well as additional gravitational effects from other celestial bodies, such as the Sun and other planets.
  • 14:53 The formula I was taught for apparent magnitude of stars looks a bit different from the one here.

    This is because astronomers use a logarithmic scale to measure both apparent magnitude m and absolute magnitude M. If one takes the logarithm of the inverse square law in the video, and performs the normalizations used by astronomers to define magnitude, one arrives at the standard relation M = m - 5 \log_{10} d_{pc} + 5 between absolute and apparent magnitude.

    But this is an oversimplification, most notably due to the neglect of extinction effects caused by interstellar dust. This is not a major issue for the relatively short distances observable via parallax, but causes problems at larger scales of the ladder (see for instance the FAQ entry here for 18:08). To compensate for this, one can work in multiple frequencies of the spectrum (visible, x-ray, radio, etc.), as some frequencies are less susceptible to extinction than others. From the discrepancies between these frequencies one can infer the amount of extinction, leading to “dust maps” that can then be used to facilitate such corrections for subsequent measurements in the same area of the universe. (More generally, the trend in modern astronomy is towards “multi-messenger astronomy”, in which one combines together very different types of measurements of the same object to obtain a more accurate understanding of that object and its surroundings.)
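
    To make the magnitude relation above concrete, here is how it is inverted in practice to obtain a distance (the numbers below are purely illustrative, not data for any real star, and extinction is neglected):

        def distance_pc(m, M):
            """Distance in parsecs from apparent magnitude m and absolute magnitude M,
            using M = m - 5 log10(d_pc) + 5."""
            return 10 ** ((m - M + 5) / 5)

        # e.g. a Cepheid-like star with absolute magnitude -4 observed at apparent magnitude 21
        print(f"{distance_pc(21, -4):,.0f} parsecs")   # 1,000,000 pc, i.e. 1 megaparsec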
  • 18:08 Can we really measure the entire Milky Way with this method?

    Strictly speaking, there is a “zone of avoidance” on the far side of the Milky Way that is very difficult to measure in the visible portion of the spectrum, due to the large number of intervening stars and dust clouds, as well as a supermassive black hole in the galactic center. However, in recent years it has become possible to explore this zone to some extent using the radio, infrared, and x-ray portions of the spectrum, which are less affected by these factors.
  • 18:19 How did astronomers know that the Milky Way was only a small portion of the entire universe?

    This issue was the topic of the “Great Debate” in the early twentieth century. It was only with the work of Hubble using Leavitt’s law to measure distances to Magellanic clouds and “spiral nebulae” (that we now know to be other galaxies), building on earlier work of Leavitt and Hertzsprung, that it was conclusively established that these clouds and nebulae in fact were at much greater distances than the diameter of the Milky Way.
  • 18:45 How can one compensate for light blending effects when measuring the apparent magnitude of Cepheids?

    This is a non-trivial task, especially if one demands a high level of accuracy. Using the highest resolution telescopes available (such as HST or JWST) is of course helpful, as is switching to other frequencies, such as near-infrared, where Cepheids are even brighter relative to nearby non-Cepheid stars. One can also apply sophisticated statistical methods to fit to models of the point spread of light from unwanted sources, and use nearby measurements of the same galaxy without the Cepheid as a reference to help calibrate those models. Improving the accuracy of the Cepheid portion of the distance ladder is an ongoing research activity in modern astronomy.
  • 18:54 What is the mechanism that causes Cepheids to oscillate?

    For most stars, there is an equilibrium size: if the star’s radius collapses, then the reduced potential energy is converted to heat, creating pressure that pushes the star outward again; and conversely, if the star expands, then it cools, causing a reduction in pressure that no longer counteracts gravitational forces. But for Cepheids, there is an additional mechanism called the kappa mechanism: the increased temperature caused by contraction increases ionization of helium, which drains energy from the star and accelerates the contraction; conversely, the cooling caused by expansion causes the ionized helium to recombine, with the energy released accelerating the expansion. If the parameters of the Cepheid are in a certain “instability strip”, then the interaction of the kappa mechanism with the other mechanisms of stellar dynamics creates a periodic oscillation in the Cepheid’s radius, whose period increases with the mass and brightness of the Cepheid.

    For a recent re-analysis of Leavitt’s original Cepheid data, see this paper.
  • 19:10 Did Leavitt mainly study the Cepheids in our own galaxy?

    This was an inaccuracy in the presentation. Leavitt’s original breakthrough paper studied Cepheids in the Small Magellanic Cloud. At the time, the distance to this cloud was not known; indeed, it was a matter of debate whether this cloud was in the Milky Way, or some distance away from it. However, Leavitt (correctly) assumed that all the Cepheids in this cloud were roughly the same distance away from our solar system, so that the apparent brightness was proportional to the absolute brightness. This gave an uncalibrated form of Leavitt’s law between absolute brightness and period, subject to the (then unknown) distance to the Small Magellanic Cloud. After Leavitt’s work, there were several efforts (by Hertzsprung, Russell, and Shapley) to calibrate the law by using the few Cepheids for which other distance methods were available, such as parallax. (Main sequence fitting to the Hertzsprung-Russell diagram was not directly usable, as Cepheids did not lie on the main sequence; but in some cases one could indirectly use this method if the Cepheid was in the same stellar cluster as a main sequence star.) Once the law was calibrated, it could be used to measure distances to other Cepheids, and in particular to compute distances to extragalactic objects such as the Magellanic clouds.
  • 19:15 Was Leavitt’s law really a linear law between period and luminosity?

    Strictly speaking, the period-luminosity relation commonly known as Leavitt’s law was a linear relation between the absolute magnitude of the Cepheid and the logarithm of the period; undoing the logarithms, this becomes a power law between the luminosity and the period.
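
    To spell out the “undoing the logarithms” step (with \alpha and \beta standing for schematic, uncalibrated coefficients rather than Leavitt’s actual values): writing the law in magnitudes and using the definition of absolute magnitude,

        M = \alpha \log_{10} P + \beta, \qquad M = -2.5 \log_{10}(L/L_0)
        \implies \log_{10}(L/L_0) = -\frac{\alpha}{2.5} \log_{10} P - \frac{\beta}{2.5}
        \implies L \propto P^{-\alpha/2.5},

    which is indeed a power law (with a positive exponent, since \alpha is negative: longer-period Cepheids are brighter).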
  • 20:26 Was Hubble the one to discover the redshift of galaxies?

    This was an error on my part; Hubble was using earlier work of Vesto Slipher on these redshifts, and combining it with his own measurements of distances using Leavitt’s law to arrive at the law that now bears his name; he was also assisted in his observations by Milton Humason. It should also be noted that Georges Lemaître had also independently arrived at essentially the same law a few years prior, but his work was published in a somewhat obscure journal and did not receive broad recognition until some time later.
  • 20:37 Hubble’s original graph doesn’t look like a very good fit to a linear law.

    Hubble’s original data was somewhat noisy and inaccurate by modern standards, and the redshifts were affected by the peculiar velocities of individual galaxies in addition to the expanding nature of the universe. However, as the data was extended to more galaxies, it became increasingly possible to compensate for these effects and obtain a much tighter fit, particularly at larger scales where the effects of peculiar velocity are less significant. See for instance this article from 2015 where Hubble’s original graph is compared with a more modern graph. This more recent graph also reveals a slight nonlinear correction to Hubble’s law at very large scales that has led to the remarkable discovery that the expansion of the universe is in fact accelerating over time, a phenomenon that is attributed to a positive cosmological constant (or perhaps a more complex form of dark energy in the universe). On the other hand, even with this nonlinear correction, there continues to be a roughly 10% discrepancy of this law with predictions based primarily on the cosmic microwave background radiation; see the FAQ entry for 23:49.
  • 20:46 Does general relativity alone predict a uniformly expanding universe?

    This was an oversimplification. Einstein’s equations of general relativity contain a parameter \Lambda, known as the cosmological constant, which currently is only computable indirectly from fitting to experimental data. But even with this constant fixed, there are multiple solutions to these equations (basically because there are multiple possible initial conditions for the universe). For the purposes of cosmology, a particularly successful family of solutions are the solutions given by the Lambda-CDM model. This family of solutions contains additional parameters, such as the density of dark matter in the universe. Depending on the precise values of these parameters, the universe could be expanding or contracting, with the rate of expansion or contraction either increasing, decreasing, or staying roughly constant. But if one fits this model to all available data (including not just red shift measurements, but also measurements on the cosmic microwave background radiation and the spatial distribution of galaxies), one deduces a version of Hubble’s law which is nearly linear, but with an additional correction at very large scales; see the next item of this FAQ.
  • 21:07 Is Hubble’s original law sufficiently accurate to allow for good measurements of distances at the scale of the observable universe?

    Not really; as mentioned in the end of the video, there were additional efforts to cross-check and calibrate Hubble’s law at intermediate scales between the range of Cepheid methods (about 100 million light years) and observable universe scales (about 100 billion light years) by using further “standard candles” than Cepheids, most notably Type Ia supernovae (which are bright enough and predictable enough to be usable out to about 10 billion light years), the Tully-Fisher relation between the luminosity of a galaxy and its rotational speed, and gamma ray bursts. It turns out that due to the accelerating nature of the universe’s expansion, Hubble’s law is not completely linear at these large scales; this important correction cannot be discerned purely from Cepheid data, but also requires the other standard candles, as well as fitting that data (as well as other observational data, such as the cosmic microwave background radiation) to the cosmological models provided by general relativity (with the best fitting models to date being some version of the Lambda-CDM model).

    On the other hand, a naive linear extrapolation of Hubble’s original law to all larger scales does provide a very rough picture of the observable universe which, while too inaccurate for cutting edge research in astronomy, does give some general idea of its large-scale structure.
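
    As a minimal sketch of such a naive extrapolation (with an assumed round value of the Hubble constant; the roughly 10% spread in its measured values is discussed in the 23:49 entry below):

        c = 3.0e5    # speed of light, km/s
        H0 = 70.0    # assumed Hubble constant, km/s per megaparsec (illustrative round value)

        for z in (0.01, 0.1, 1.0):
            d_mpc = c * z / H0               # naive linear Hubble law: v = c z = H0 d
            d_gly = d_mpc * 3.26e6 / 1e9     # convert to billions of light years
            print(f"z = {z}: d = {d_mpc:.0f} Mpc = {d_gly:.2f} billion light years")
        # the z = 1 line is already well outside the regime where the linear law is trustworthy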
  • 21:15 Where did this guess of the observable universe being about 20% of the full universe come from?

    There are some ways to get a lower bound on the size of the entire universe that go beyond the edge of the observable universe. One is through analysis of the cosmic microwave background radiation (CMB), that has been carefully mapped out by several satellite observatories, most notably WMAP and Planck. Roughly speaking, a universe that was less than twice the size of the observable universe would create certain periodicities in the CMB data; such periodicities are not observed, so this provides a lower bound (see for instance this paper for an example of such a calculation). The 20% number was a guess based on my vague recollection of these works, but there is no consensus currently on what the ratio truly is; there are some proposals that the entire universe is in fact several orders of magnitude larger than the observable one.

    The situation is somewhat analogous to Aristarchus’s measurement of the distance to the Sun, which was very sensitive to a small angle (the half-moon discrepancy). Here, the predicted size of the universe under the standard cosmological model is similarly dependent in a highly sensitive fashion on a measure \Omega_k of the flatness of the universe which, for reasons still not fully understood (but likely caused by some sort of inflation mechanism), happens to be extremely close to zero. As such, predictions for the size of the universe remain highly volatile at the current level of measurement accuracy.
  • 23:44 Was it a black hole collision that allowed for an independent measurement of Hubble’s law?

    This was a slight error in the presentation. While the first gravitational wave observation by LIGO in 2015 was of a black hole collision, it did not come with an electromagnetic counterpart that allowed for a redshift calculation that would yield a Hubble’s law measurement. However, a later collision of neutron stars, observed in 2017, did come with an associated kilonova in which a redshift was calculated, and led to a Hubble measurement which was independent of most of the rungs of the distance ladder.
  • 23:49 Where can I learn more about this 10% discrepancy in Hubble’s law?

    This is known as the Hubble tension (or, in more sensational media, the “crisis in cosmology”): roughly speaking, the various measurements of Hubble’s constant (either from climbing the cosmic distance ladder, or by fitting various observational data to standard cosmological models) tend to arrive at one of two values, that are about 10% apart from each other. The values based on gravitational wave observations are currently consistent with both values, due to significant error bars in this extremely sensitive method; but other more mature methods are now of sufficient accuracy that they are basically only consistent with one of the two values. Currently there is no consensus on the origin of this tension: possibilities include systemic biases in the observational data, subtle statistical issues with the methodology used to interpret the data, a correction to the standard cosmological model, the influence of some previously undiscovered law of physics, or some partial breakdown of the Copernican principle.

    For an accessible recent summary of the situation, see this video by Becky Smethurst (“Dr. Becky”).
  • 24:49 So, what is a Type Ia supernova and why is it so useful in the distance ladder?

    A Type Ia supernova occurs when a white dwarf in a binary system draws more and more mass from its companion star, until it reaches the Chandrasekhar limit, at which point its gravitational forces are strong enough to cause a collapse that increases the pressure to the point where a supernova is triggered via a process known as carbon detonation. Because of the universal nature of the Chandrasekhar limit, all such supernovae have (as a first approximation) the same absolute brightness and can thus be used as standard candles in a similar fashion to Cepheids (but without the need to first measure any auxiliary observable, such as a period). But these supernovae are also far brighter than Cepheids, and so this method can be used at significantly larger distances than the Cepheid method (roughly speaking, it can handle distances of up to ~10 billion light years, whereas Cepheids are reliable out to ~100 million light years). Among other things, the supernovae measurements were the key to detecting an important nonlinear correction to Hubble’s law at these scales, leading to the remarkable conclusion that the expansion of the universe is in fact accelerating over time, which in the Lambda-CDM model corresponds to a positive cosmological constant, though there are more complex “dark energy” models that have also been proposed to explain this acceleration.

  • 24:54 Besides Type Ia supernovae, I felt that a lot of other topics relevant to the modern distance ladder (e.g., the cosmic microwave background radiation, the Lambda CDM model, dark matter, dark energy, inflation, multi-messenger astronomy, etc.) were omitted.

    This is partly due to time constraints, and the need for editing to tighten the narrative, but was also a conscious decision on my part. Advanced classes on the distance ladder will naturally focus on the most modern, sophisticated, and precise ways to measure distances, backed up by the latest mathematics, physics, technology, observational data, and cosmological models. However, the focus in this video series was rather different; we sought to portray the cosmic distance ladder as evolving in a fully synergistic way, across many historical eras, with the evolution of mathematics, science, and technology, as opposed to being a mere byproduct of the current state of these other disciplines. As one specific consequence of this change of focus, we emphasized the first time any rung of the distance ladder was achieved, at the expense of more accurate and sophisticated later measurements at that rung. For instance, refinements in the measurement of the radius of the Earth since Eratosthenes, improvements in the measurement of the astronomical unit between Aristarchus and Cook, or the refinements of Hubble’s law and the cosmological model of the universe in the twentieth and twenty-first centuries, were largely omitted (though some of the answers in this FAQ are intended to address these omissions).

    Many of the topics not covered here (or only given a simplified treatment) are discussed in depth in other expositions, including other Youtube videos. I would welcome suggestions from readers for links to such resources in the comments to this post. Here is a partial list:

May 29, 2025

Doug NatelsonQuick survey - machine shops and maker spaces

Recent events are very dire for research at US universities, and I will write further about those, but first a quick unrelated survey for those at such institutions.  Back in the day, it was common for physics and some other (mechanical engineering?) departments to have machine shops with professional staff.  In the last 15-20 years, there has been a huge growth in maker-spaces on campuses to modernize and augment those capabilities, though often maker-spaces are aimed at undergraduate design courses rather than doing work to support sponsored research projects (and grad students, postdocs, etc.).  At the same time, it is now easier than ever (modulo tariffs) to upload CAD drawings to a website and get a shop in another country to ship finished parts to you.

Quick questions:   Does your university have a traditional or maker-space-augmented machine shop available to support sponsored research?  If so, who administers this - a department, a college/school, the office of research?  Does the shop charge competitive rates relative to outside vendors?  Are grad students trained to do work themselves, and are there professional machinists - how does that mix work?

Thanks for your responses.  Feel free to email me if you'd prefer to discuss offline.

May 27, 2025

John PreskillI know I am but what are you? Mind and Matter in Quantum Mechanics

Nowadays it is best to exercise caution when bringing the words “quantum” and “consciousness” anywhere near each other, lest you be suspected of mysticism or quackery. Eugene Wigner did not concern himself with this when he wrote his “Remarks on the Mind-Body Question” in 1967. (Perhaps he was emboldened by his recent Nobel prize for contributions to the mathematical foundations of quantum mechanics, which gave him not a little no-nonsense technical credibility.) The mind-body question he addresses is the full-blown philosophical question of “the relation of mind to body”, and he argues unapologetically that quantum mechanics has a great deal to say on the matter. The workhorse of his argument is a thought experiment that now goes by the name “Wigner’s Friend”. About fifty years later, Daniela Frauchiger and Renato Renner formulated another, more complex thought experiment to address related issues in the foundations of quantum theory. In this post, I’ll introduce Wigner’s goals and argument, and evaluate Frauchiger’s and Renner’s claims of its inadequacy, concluding that these are not completely fair, but that their thought experiment does do something interesting and distinct. Finally, I will describe a recent paper of my own, in which I formalize the Frauchiger-Renner argument in a way that illuminates its status and isolates the mathematical origin of their paradox.

* * *

Wigner takes a dualist view about the mind, that is, he believes it to be non-material. To him this represents the common-sense view, but is nevertheless a newly mainstream attitude. Indeed,

[until] not many years ago, the “existence” of a mind or soul would have been passionately denied by most physical scientists. The brilliant successes of mechanistic and, more generally, macroscopic physics and of chemistry overshadowed the obvious fact that thoughts, desires, and emotions are not made of matter, and it was nearly universally accepted among physical scientists that there is nothing besides matter.

He credits the advent of quantum mechanics with

the return, on the part of most physical scientists, to the spirit of Descartes’s “Cogito ergo sum”, which recognizes the thought, that is, the mind, as primary. [With] the creation of quantum mechanics, the concept of consciousness came to the fore again: it was not possible to formulate the laws of quantum mechanics in a fully consistent way without reference to the consciousness.

What Wigner has in mind here is that the standard presentation of quantum mechanics speaks of definite outcomes being obtained when an observer makes a measurement. Of course this is also true in classical physics. In quantum theory, however, the principles of linear evolution and superposition, together with the plausible assumption that mental phenomena correspond to physical phenomena in the brain, lead to situations in which there is no mechanism for such definite observations to arise. Thus there is a tension between the fact that we would like to ascribe particular observations to conscious agents and the fact that we would like to view these observations as corresponding to particular physical situations occurring in their brains.

Once we have convinced ourselves that, in light of quantum mechanics, mental phenomena must be considered on an equal footing with physical phenomena, we are faced with the question of how they interact. Wigner takes it for granted that “if certain physico-chemical conditions are satisfied, a consciousness, that is, the property of having sensations, arises.” Does the influence run the other way? Wigner claims that the “traditional answer” is that it does not, but argues that in fact such influence ought indeed to exist. (Indeed this, rather than technical investigation of the foundations of quantum mechanics, is the central theme of his essay.) The strongest support Wigner feels he can provide for this claim is simply “that we do not know of any phenomenon in which one subject is influenced by another without exerting an influence thereupon”. Here he recalls the interaction of light and matter, pointing out that while matter obviously affects light, the effects of light on matter (for example radiation pressure) are typically extremely small in magnitude, and might well have been missed entirely had they not been suggested by the theory.

Quantum mechanics provides us with a second argument, in the form of a demonstration of the inconsistency of several apparently reasonable assumptions about the physical, the mental, and the interaction between them. Wigner works, at least implicitly, within a model where there are two basic types of object: physical systems and consciousnesses. Some physical systems (those that are capable of instantiating the “certain physico-chemical conditions”) are what we might call mind-substrates. Each consciousness corresponds to a mind-substrate, and each mind-substrate corresponds to at most one consciousness. He considers three claims (this organization of his premises is not explicit in his essay):

1. Isolated physical systems evolve unitarily.

2. Each consciousness has a definite experience at all times.

3. Definite experiences correspond to pure states of mind-substrates, and arise for a consciousness exactly when the corresponding mind-substrate is in the corresponding pure state.

The first and second assumptions constrain the way the model treats physical and mental phenomena, respectively. Assumption 1 is often paraphrased as the “completeness of quantum mechanics”, while Assumption 2 is a strong rejection of solipsism – the idea that only one’s own mind is sure to exist. Assumption 3 is an apparently reasonable assumption about the relation between mental and physical phenomena.

With this framework established, Wigner’s thought experiment, now typically known as Wigner’s Friend, is quite straightforward. Suppose that an observer, Alice (to name the friend), is able to perform a measurement of some physical quantity q of a particle, which may take two values, 0 and 1. Assumption 1 tells us that if Alice performs this measurement when the particle is in a superposition state, the joint system of Alice’s brain and the particle will end up in an entangled state. Now Alice’s mind-substrate is not in a pure state, so by Assumption 3 her consciousness does not have a definite experience. This contradicts Assumption 2. Wigner’s proposed resolution to this paradox is that in fact Assumption 1 is incorrect, and that there is an influence of the mental on the physical, namely objective collapse or, as he puts it, that the “statistical element which, according to the orthodox theory, enters only if I make an observation enters equally if my friend does”.
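
A toy numerical version of this step may be helpful (this is my own illustration, modeling the friend's memory as a single qubit and the measurement interaction as a CNOT gate; it is not Wigner's formalism):

    import numpy as np

    # Particle starts in a superposition, the friend's memory starts in a "ready" state |0>,
    # and the measurement interaction is modeled as a CNOT (particle as control).
    plus = np.array([1.0, 1.0]) / np.sqrt(2)      # particle state (|0> + |1>)/sqrt(2)
    ready = np.array([1.0, 0.0])                  # friend's memory state |0>
    cnot = np.array([[1, 0, 0, 0],
                     [0, 1, 0, 0],
                     [0, 0, 0, 1],
                     [0, 0, 1, 0]], dtype=float)

    joint = cnot @ np.kron(plus, ready)           # = (|00> + |11>)/sqrt(2), an entangled state
    rho = np.outer(joint, joint)                  # joint density matrix
    rho_friend = rho.reshape(2, 2, 2, 2).trace(axis1=0, axis2=2)   # trace out the particle
    print(rho_friend)                             # [[0.5, 0], [0, 0.5]] (up to rounding)
    print("purity:", np.trace(rho_friend @ rho_friend))            # 0.5 < 1: not a pure state

The reduced state of the friend's memory is maximally mixed rather than pure, which is precisely the tension with Assumptions 2 and 3 described above.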

* * *

Decades after the publication of Wigner’s essay, Daniela Frauchiger and Renato Renner formulated a new thought experiment, involving observers making measurements of other observers, which they intended to remedy what they saw as a weakness in Wigner’s argument. In their words, “Wigner proposed an argument […] which should show that quantum mechanics cannot have unlimited validity”. In fact, they argue, Wigner’s argument does not succeed in doing so. They assert that Wigner’s paradox may be resolved simply by noting a difference in what each party knows. Whereas Wigner, describing the situation from the outside, does not initially know the result of his friend’s measurement, and therefore assigns the “absurd” entangled state to the joint system composed of both her body and the system she has measured, his friend herself is quite aware of what she has observed, and so assigns to the system either, but not both, of the states corresponding to definite measurement outcomes. “For this reason”, Frauchiger and Renner argue, “the Wigner’s Friend Paradox cannot be regarded as an argument that rules out quantum mechanics as a universally valid theory.”

This criticism strikes me as somewhat unfair to Wigner. In fact, Wigner’s objection to admitting two different states as equally valid descriptions is that the two states correspond to different sets of physical properties of the joint system consisting of Alice and the system she measures. For Wigner, physical properties of physical systems are distinct from mental properties of consciousnesses. To engage in some light textual analysis, we can note that the word ‘conscious’, or ‘consciousness’, appears forty-one times in Wigner’s essay, and only once in Frauchiger and Renner’s, in the title of a cited paper. I have the impression that the authors pay inadequate attention to how explicitly Wigner takes a dualist position, including not just physical systems but also, and distinctly, consciousnesses in his ontology. Wigner’s argument does indeed achieve his goals, which are developed in the context of this strong dualism, and differ from the goals of Frauchiger and Renner, who appear not to share this philosophical stance, or at least do not commit fully to it.

Nonetheless, the thought experiment developed by Frauchiger and Renner does achieve something distinct and interesting. We can understand Wigner’s no-go theorem to be of the following form: “Within a model incorporating both mental and physical phenomena, a set of apparently reasonable conditions on how the model treats physical phenomena, mental phenomena, and their interaction cannot all be satisfied”. The Frauchiger-Renner thought experiment can be cast in the same form, with different choices about how to implement the model and which conditions to consider. The major difference in the model itself is that Frauchiger and Renner do not take consciousnesses to be entities in their own right, but simply take some states of certain physical systems to correspond to conscious experiences. Within such a model, Wigner’s assumption that each mind has a single, definite conscious experience at all times seems far less natural than it did within his model, where consciousnesses are distinct entities from the physical systems that determine them. Thus Frauchiger and Renner need to weaken this assumption, which was so natural to Wigner. The weakening they choose is a sort of transitivity of theories of mind. In their words (Assumption C in their paper):

Suppose that agent A has established that “I am certain that agent A’, upon reasoning within the same theory as the one I am using, is certain that x = \xi at time t.” Then agent A can conclude that “I am certain that x = \xi at time t.”

Just as Assumption 3 above was, for Wigner, a natural restriction on how a sensible theory ought to treat mental phenomena, this serves as Frauchiger’s and Renner’s proposed constraint. Just as Wigner designed a thought experiment that demonstrated the incompatibility of his assumption with an assumption of the universal applicability of unitary quantum mechanics to physical systems, so do Frauchiger and Renner.

* * *

In my recent paper “Reasoning across spacelike surfaces in the Frauchiger-Renner thought experiment”, I provide two closely related formalizations of the Frauchiger-Renner argument. These are motivated by a few observations:

1. Assumption C ought to make reference to the (possibly different) times at which agents A and A' are certain about their respective judgments, since these states of knowledge change.

2. Since Frauchiger and Renner do not subscribe to Wigner’s strong dualism, an agent’s certainty about a given proposition, like any other mental state, corresponds within their implicit model to a physical state. Thus statements like “Alice knows that P” should be understood as statements about the state of some part of Alice’s brain. Conditional statements like “if upon measuring a quantity q Alice observes outcome x, she knows that P” should be understood as claims about the state of the composite system composed of the part of Alice’s brain responsible for knowing P and the part responsible for recording outcomes of the measurement of q.

3. Because the causal structure of the protocol does not depend on the absolute times of each event, an external agent describing the protocol can choose various “spacelike surfaces”, corresponding to fixed times in different spacetime embeddings of the protocol (or to different inertial frames). There is no reason to privilege one of these surfaces over another, and so each of them should be assigned a quantum state. This may be viewed as an implementation of a relativistic principle.

A visual representation of the formalization of the Frauchiger-Renner protocol and the arguments of the no-go theorem. The graphical conventions are explained in detail in “Reasoning across spacelike surfaces in the Frauchiger-Renner thought experiment”.

After developing a mathematical framework based on these observations, I recast Frauchiger’s and Renner’s Assumption C in two ways: first, in terms of a claim about the validity of iterating the “relative state” construction that captures how conditional statements are interpreted in terms of quantum states; and second, in terms of a deductive rule that allows chaining of inferences within a system of quantum logic. By proving that these claims are false in the mathematical framework, I provide a more formal version of the no-go theorem. I also show that the first claim can be rescued if the relative state construction is allowed to be iterated only “along” a single spacelike surface, and the second if a deduction is only allowed to chain inferences “along” a single surface. In other words, the mental transitivity condition desired by Frauchiger and Renner can in fact be combined with universal physical applicability of unitary quantum mechanics, but only if we restrict our analysis to a single spacelike surface. Thus I hope that the analysis I offer provides some clarification of what precisely is going on in Frauchiger and Renner’s thought experiment, what it tells us about combining the physical and the mental in light of quantum mechanics, and how it relates to Wigner’s thought experiment.
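
For readers who want a concrete handle on the relative-state construction mentioned above, here is the textbook Everett-style version that it generalizes (a schematic sketch in my own words, not the paper's definitions). Suppose a record system Q and the remaining relevant degrees of freedom B are jointly in the entangled state

|\Psi\rangle_{QB} = \sum_x c_x \, |x\rangle_Q \otimes |\beta_x\rangle_B .

Then the state of B relative to the record value x is |\beta_x\rangle_B (after normalization), and a conditional statement of the form “if the record reads x, then B is in the state |\beta_x\rangle” is interpreted through this assignment. The question the formalization addresses is when such assignments can legitimately be iterated across nested records and across different spacelike surfaces.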

* * *

In view of the fact that “Quantum theory cannot consistently describe the use of itself” has, at present, over five hundred citations, and “Remarks on the Mind-Body Question” over thirteen hundred, it seems fitting to close with a thought, cautionary or exultant, from Peter Schwenger’s book on asemic, that is meaningless, writing. He notes that

commentary endlessly extends language; it is in the service of an impossible quest to extract the last, the final, drop of meaning.

I provide no analysis of this claim.

May 25, 2025

John PreskillThe most steampunk qubit

I never imagined that an artist would update me about quantum-computing research.

Last year, steampunk artist Bruce Rosenbaum forwarded me a notification about a news article published in Science. The article reported on an experiment performed in physicist Yiwen Chu’s lab at ETH Zürich. The experimentalists had built a “mechanical qubit”: they’d stored a basic unit of quantum information in a mechanical device that vibrates like a drumhead. The article dubbed the device a “steampunk qubit.”

I was collaborating with Bruce on a quantum-steampunk sculpture, and he asked if we should incorporate the qubit into the design. Leave it for a later project, I advised. But why on God’s green Earth are you receiving email updates about quantum computing? 

My news feed sends me everything that says “steampunk,” he explained. So keeping a bead on steampunk can keep one up to date on quantum science and technology—as I’ve been preaching for years.

Other ideas displaced Chu’s qubit in my mind until I visited the University of California, Berkeley this January. Visiting Berkeley in January, one can’t help noticing—perhaps with a trace of smugness—the discrepancy between the temperature there and the temperature at home. And how better to celebrate a temperature difference than by studying a quantum-thermodynamics-style throwback to the 1800s?

One sun-drenched afternoon, I learned that one of my hosts had designed another steampunk qubit: Alp Sipahigil, an assistant professor of electrical engineering. He’d worked at Caltech as a postdoc around the time I’d finished my PhD there. We’d scarcely interacted, but I’d begun learning about his experiments in atomic, molecular, and optical physics then. Alp had learned about my work through Quantum Frontiers, as I discovered this January. I had no idea that he’d “met” me through the blog until he revealed as much to Berkeley’s physics department, when introducing the colloquium I was about to present.

Alp and collaborators proposed that a qubit could work as follows. It consists largely of a cantilever, which resembles a pendulum that bobs back and forth. The cantilever, being quantum, can have only certain amounts of energy. When the pendulum has a particular amount of energy, we say that the pendulum is in a particular energy level. 

One might hope to use two of the energy levels as a qubit: if the pendulum were in its lowest-energy level, the qubit would be in its 0 state; and the next-highest level would represent the 1 state. A bit—a basic unit of classical information—has 0 and 1 states. A qubit can be in a superposition of 0 and 1 states, and so the cantilever could be.

A flaw undermines this plan, though. Suppose we want to process the information stored in the cantilever—for example, to turn a 0 state into a 1 state. We’d inject quanta—little packets—of energy into the cantilever. Each quantum would contain an amount of energy equal to (the energy associated with the cantilever’s 1 state) – (the amount associated with the 0 state). This equality would ensure that the cantilever could accept the energy packets lobbed at it.

But the cantilever doesn’t have only two energy levels; it has loads. Worse, all the inter-level energy gaps equal each other. However much energy the cantilever consumes when hopping from level 0 to level 1, it consumes that much when hopping from level 1 to level 2. This pattern continues throughout the rest of the levels. So imagine starting the cantilever in its 0 level, then trying to boost the cantilever into its 1 level. We’d probably succeed; the cantilever would probably consume a quantum of energy. But nothing would stop the cantilever from gulping more quanta and rising to higher energy levels. The cantilever would cease to serve as a qubit.

We can avoid this problem, Alp’s team proposed, by placing an atomic-force microscope near the cantilever. An atomic force microscope maps out surfaces similarly to how a Braille user reads: by reaching out a hand and feeling. The microscope’s “hand” is a tip about ten nanometers across. So the microscope can feel surfaces far more fine-grained than a Braille user can. Bumps embossed on a page force a Braille user’s finger up and down. Similarly, the microscope’s tip bobs up and down due to forces exerted by the object being scanned. 

Imagine placing a microscope tip such that the cantilever swings toward it and then away. The cantilever and tip will exert forces on each other, especially when the cantilever swings close. This force changes the cantilever’s energy levels. Alp’s team chose the tip’s location, the cantilever’s length, and other parameters carefully. Under the chosen conditions, boosting the cantilever from energy level 1 to level 2 costs more energy than boosting from 0 to 1.

So imagine, again, preparing the cantilever in its 0 state and injecting energy quanta. The cantilever will gobble a quantum, rising to level 1. The cantilever will then remain there, as desired: to rise to level 2, the cantilever would have to gobble a larger energy quantum, which we haven’t provided.1
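
To make the equal-spacing problem and its cure concrete, here is a small numerical toy model (my own illustration, not the actual tip-cantilever Hamiltonian from Alp's proposal). A truncated harmonic oscillator has perfectly equal gaps; adding a weak quartic term, standing in for the level shifts induced by the microscope tip, makes the 0-to-1 and 1-to-2 gaps differ, which is exactly the anharmonicity a qubit needs.

    import numpy as np

    # Toy model: bare cantilever = harmonic oscillator (equal level spacings);
    # the "tip" is mimicked by a weak quartic perturbation of arbitrary strength.
    dim = 40                                       # truncated Fock-space dimension
    n = np.arange(dim)
    a = np.diag(np.sqrt(n[1:]), k=1)               # annihilation operator
    x = (a + a.T) / np.sqrt(2)                     # dimensionless position operator

    H0 = np.diag(n + 0.5)                          # bare cantilever, in units of one quantum
    H = H0 + 0.02 * np.linalg.matrix_power(x, 4)   # cantilever plus stand-in tip coupling

    for label, h in [("bare cantilever", H0), ("with tip term  ", H)]:
        E = np.sort(np.linalg.eigvalsh(h))[:3]
        print(f"{label}: gap 0->1 = {E[1]-E[0]:.3f}, gap 1->2 = {E[2]-E[1]:.3f}")

With equal gaps, a drive that boosts 0-to-1 also boosts 1-to-2; once the gaps differ, a drive tuned to the 0-to-1 transition leaves level 2 out of reach, as described above.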

Will Alp build the mechanical qubit proposed by him and his collaborators? Yes, he confided, if he acquires a student nutty enough to try the experiment. For when he does—after the student has struggled through the project like a dirigible through a hurricane, but ultimately triumphed, and a journal is preparing to publish their magnum opus, and they’re brainstorming about artwork to represent their experiment on the journal’s cover—I know just the aesthetic to do the project justice.

1Chu’s team altered their cantilever’s energy levels using a superconducting qubit, rather than an atomic force microscope.

May 24, 2025

Matt Strassler The War on Harvard University

The United States’ government is waging an all-out assault on Harvard University. The strategy, so far, has been:

  • Cut most of the grants (present and future) for scientific and medical research, so that thousands of Harvard’s scientists, researchers and graduate students have to stop their work indefinitely. That includes research on life-saving medicine, on poorly understood natural phenomena, and on new technology. This also means that the university will have no money from these activities to pay salaries of its employees.
  • Eliminate the tax-advantageous status of the university, so that the university is much more expensive to operate.
  • Prohibit Harvard from having any international students (undergraduate and graduate) and other researchers, so that large numbers of existing scientific and medical research projects that still have funding will have to cease operation. This destroys the careers of thousands of brilliant people — and not just foreigners. Many US faculty and students are working with and depend upon these expelled researchers, and their work will stop too. It also means that Harvard’s budget for the next academic year will be crushed, since it is far too late to replace the tuition from international undergraduate students for the coming year.

The grounds for this war are that Harvard allegedly does not provide a safe environment for its Jewish students, and that Harvard refuses to let the government determine who it may and may not hire.

Now, maybe you can explain to me what this is really about. I’m confused about what crimes these scientific researchers committed that justify stripping them of their grants and derailing their research. I’m also unclear as to why many apolitical, hard-working young trainees in laboratories across the campus deserve to be ejected from their graduate and post-graduate careers and sent home, delaying or ruining their futures. [Few will be able to transfer to other US schools; with all the government cuts to US science, there’s no money to support them at other locations.] And I don’t really understand how such enormous damage and disruption to the lives and careers of ten thousand-ish scientists, researchers and graduate students at Harvard (including many who are Jewish) will actually improve the atmosphere for Harvard’s Jewish students.

As far as I can see, the government is merely using Jewish students as pawns, pretending to attack Harvard on their behalf while in truth harboring no honest concern for their well-being. The fact that the horrors and nastiness surrounding the Gaza war are being exploited by the government as cover for an assault on academic freedom and scientific research is deeply cynical and exceedingly ugly.

From the outside, where Harvard is highly respected — it is certainly among the top five universities in the world, however you rank them — this must look completely idiotic, as idiotic as France gutting the Sorbonne, or the UK eviscerating Oxford. But keep in mind that Harvard is by no means the only target here. The US government is cutting the country’s world-leading research in science, technology and medicine to the bone. If that’s what you want to do, then ruining Harvard makes perfect sense.

The country that benefits the most from this self-destructive behavior? China, obviously. As a friend of mine said, this isn’t merely like shooting yourself in the foot, it’s like shooting yourself in the head.

I suspect most readers will understand that I cannot blog as usual right now. To write good articles about quantum physics requires concentration and focus. When people’s careers and life’s work are being devastated all around me, that’s simply not possible.

May 23, 2025

Matt von HippelPublishing Isn’t Free, but SciPost Makes It Cheaper

I’ve mentioned SciPost a few times on this blog. They’re an open journal in every sense you could think of: diamond open-access scientific publishing on an open-source platform, run with open finances. They even publish their referee reports. They’re aiming to cover not just a few subjects, but a broad swath of academia, publishing scientists’ work in the most inexpensive and principled way possible and challenging the dominance of for-profit journals.

And they’re struggling.

SciPost doesn’t charge university libraries for access: they let anyone read their articles for free. And they don’t charge authors Article Processing Charges (or APCs): they let anyone publish for free. All they do is keep track of which institutions those authors are affiliated with, calculate what fraction of their total costs is attributable to each institution, and post it in a nice searchable list on their website.
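
As a minimal sketch of that bookkeeping (my own reading of the description above, not SciPost's actual accounting code, with made-up institution names), one could attribute each paper's cost evenly to its authors' institutions and tally the totals:

    from collections import defaultdict

    papers = [                      # hypothetical author affiliations, one list per paper
        ["Uni A", "Uni B"],
        ["Uni A"],
        ["Uni C", "Uni A", "Uni B"],
    ]
    cost_per_paper = 500.0          # euros, roughly SciPost's per-article cost (see below)

    share = defaultdict(float)
    for affiliations in papers:
        for inst in affiliations:
            share[inst] += cost_per_paper / len(affiliations)

    for inst, eur in sorted(share.items()):
        print(f"{inst}: suggested support {eur:.2f} EUR")

The resulting numbers are suggestions rather than invoices, which is exactly where the free-rider problem discussed below comes from.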

And amazingly, for the last nine years, they’ve been making that work.

SciPost encourages institutions to pay their share, mostly by encouraging authors to bug their bosses until they do. SciPost will also quite happily accept more than an institution’s share, and a few generous institutions do just that, which is what has kept them afloat so far. But since nothing compels anyone to pay, most organizations simply don’t.

From an economist’s perspective, this is that most basic of problems, the free-rider problem. People want scientific publication to be free, but it isn’t. Someone has to pay, and if you don’t force someone to do it, then the few who pay will be exploited by the many who don’t.

There’s more worth saying, though.

First, it’s worth pointing out that SciPost isn’t paying the same cost everyone else pays to publish. SciPost has a stripped-down system, without any physical journals or much in-house copyediting, based entirely on their own open-source software. As a result, they pay about 500 euros per article. Compare this to the fees negotiated by particle physics’ SCOAP3 agreement, which average closer to 1000 euros, and realize that those fees are on the low end: for-profit journals tend to make their APCs higher in order to, well, make a profit.

(By the way, while it’s tempting to think of for-profit journals as greedy, I think it’s better to think of them as not cost-effective. Profit is an expense, like the interest on a loan: a payment to investors in exchange for capital used to set up the business. The thing is, online journals don’t seem to need that kind of capital, especially when they’re based on code written by academics in their spare time. So they can operate more cheaply as nonprofits.)

So when an author publishes in SciPost instead of a journal with APCs, they’re saving someone money, typically their institution or their grant. This would happen even if their institution paid their share of SciPost’s costs. (But then they would pay something rather than nothing, hence free-rider problem.)

If an author instead would have published in a closed-access journal, the kind where you have to pay to read the articles and university libraries pay through the nose to get access? Then you don’t save any money at all, your library still has to pay for the journal. You only save money if everybody at the institution stops using the journal. This one is instead a collective action problem.

Collective action problems are hard, and don’t often have obvious solutions. Free-rider problems do suggest an obvious solution: why not just charge?

In SciPost’s case, there are philosophical commitments involved. Their desire to attribute costs transparently and equally means dividing a journal’s cost among all its authors’ institutions, a cost only fully determined at the end of the year, which doesn’t make for an easy invoice.

More to the point, though, charging to publish is directly against what the Open Access movement is about.

That takes some unpacking, because of course, someone does have to pay. It probably seems weird to argue that institutions shouldn’t have to pay charges to publish papers…instead, they should pay to publish papers.

SciPost itself doesn’t go into detail about this, but despite how weird it sounds when put like I just did, there is a difference. Charging a fee to publish means that anyone who publishes needs to pay a fee. If you’re working in a developing country on a shoestring budget, too bad, you have to pay the fee. If you’re an amateur mathematician who works in a truck stop and just puzzled through something amazing, too bad, you have to pay the fee.

Instead of charging a fee, SciPost asks for support. I have to think that part of the reason is that they want some free riders. There are some people who would absolutely not be able to participate in science without free riding, and we want their input nonetheless. That means to support them, others need to give more. It means organizations need to think about SciPost not as just another fee, but as a way they can support the scientific process as a whole.

That’s how other things work, like the arXiv. They get support from big universities and organizations and philanthropists, not from literally everyone. It seems a bit weird to do that for a single scientific journal among many, though, which I suspect is part of why institutions are reluctant to do it. But for a journal that can save money like SciPost, maybe it’s worth it.

Tommaso DorigoAn Innovative Proposal

The other day I finally emerged from a very stressful push to submit two grant applications to the European Innovation Council. The call in question is for PATHFINDER_OPEN projects, which aim at proofs of principle of groundbreaking technological innovations. So I thought I would broadly report on that experience (no, I am not new to it, but you never cease to learn!), and disclose just a little about the ideas that brought about one of the two projects.

May 19, 2025

Clifford JohnsonA New Equation?

Some years ago I speculated that it would be nice if a certain mathematical object existed, and even nicer if it were to satisfy an ordinary differential equation of a special sort. I was motivated by a particular physical question, and it seemed very natural to me to imagine such an object... So natural that I was sure that it must already have been studied, the equation for it known. As a result, every so often I'd go down a rabbit hole of a literature dig, but not with much success because it isn't entirely clear where best to look. Then I'd get involved with other projects and forget all about the matter.

Last year I began to think about it again because it might be useful in a method I was developing for a paper, went through the cycle of wondering, and looking for a while, then forgot all about it in thinking about other things.

Then, a little over a month ago at the end of March, while starting on a long flight across the continent, I started thinking about it again, and given that I did not have a connection to the internet to hand, took another approach: I got out a pencil and began to mess around in my notebook and just derive what I thought the equation for this object should be, given certain properties it should have. One property is that it should in some circumstances reduce to a known powerful equation (often associated with the legendary 1975 work of Gel'fand and Dikii*) satisfied by the diagonal resolvent $\widehat R(E,x) = \langle x|({\cal H}-E)^{-1}|x\rangle$ of a Schrödinger Hamiltonian ${\cal H}=-\hbar^2\partial^2_x+u(x)$. It is:

$4(u(x)-E)\,\widehat R^2-2\hbar^2\,\widehat R\,\widehat R^{\prime\prime}+\hbar^2(\widehat R^\prime)^2 = 1\ .$

Here, $E$ is an energy of the Hamiltonian, in potential $u(x)$, and $x$ is a coordinate on the real line.
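
As a quick sanity check on the equation as quoted (my own verification in sympy, not anything from the post): for a constant potential $u(x)=u_0$, the $x$-independent function $\widehat R = 1/(2\sqrt{u_0-E})$ solves it, since the derivative terms drop out.

    import sympy as sp

    x, E, hbar, u0 = sp.symbols('x E hbar u0')

    R = 1 / (2 * sp.sqrt(u0 - E))      # x-independent ansatz for constant potential u0

    # Left-hand side of the quoted equation, with u(x) = u0:
    lhs = 4*(u0 - E)*R**2 - 2*hbar**2*R*sp.diff(R, x, 2) + hbar**2*sp.diff(R, x)**2

    print(sp.simplify(lhs))            # prints 1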

The object itself would be a generalisation of the diagonal resolvent $\widehat R(E,x)$, although non-diagonal in the energy, not the [...]


May 18, 2025

John BaezDead Stars Don’t Radiate

Three guys claim that any heavy chunk of matter emits Hawking radiation, even if it’s not a black hole:

• Michael F. Wondrak, Walter D. van Suijlekom and Heino Falcke, Gravitational pair production and black hole evaporation, Phys. Rev. Lett. 130 (2023), 221502.

Now they’re getting more publicity by claiming this will make the universe fizzle out sooner than expected. They’re claiming, for example, that a dead, cold star will emit Hawking radiation, and thus slowly lose mass and eventually disappear!

They admit that this would violate baryon conservation: after all, the protons and neutrons in the star would have to go away somehow! They admit they don’t know how this would happen. They just say that the gravitational field of the star will create particle-antiparticle pairs that will slowly radiate away, forcing the dead star to lose mass somehow to conserve energy.

If experts thought this had even a chance of being true, it would be the biggest thing since sliced bread—at least in the field of quantum gravity. Everyone would be writing papers about it, because if true it would be revolutionary. It would overturn calculations by experts which say that a stationary chunk of matter doesn’t emit Hawking radiation. It would also mean that quantum field theory in curved spacetime can only be consistent if baryon number fails to be conserved! This would be utterly shocking.

But in fact, these new papers have had almost zero effect on physics. There’s a short rebuttal, here:

• Antonio Ferreiro, José Navarro-Salas and Silvia Pla, Comment on “Gravitational pair production and black hole evaporation”, Phys. Rev. Lett. 133 (2024), 229001.

It explains that these guys used a crude approximation that gives wrong results even in a simpler problem. Similar points are made here:

• E. T. Akhmedov, D. V. Diakonov and C. Schubert, Complex effective actions and gravitational pair creation, Phys. Rev. D 110 (2024), 105011.

Unfortunately, it seems the real experts on quantum field theory in curved spacetime have not come out and mentioned the correct way to think about this issue, which has been known at least since 1975. To them—or maybe I should dare to say “us”—it’s just well known that the gravitational field of a static mass does not cause the creation of particle-antiparticle pairs.

Of course, the referees should have rejected Wondrak, van Suijlekom and Falcke’s papers. But apparently none of those referees were experts on the subject at hand. So you can’t trust a paper just because it appears in a supposedly reputable physics journal. You have to actually understand the subject and assess the paper yourself, or talk to some experts you trust.

If I were a science journalist writing an article about a supposedly shocking development like this, I would email some experts and check to see if it’s for real. But plenty of science journalists don’t bother with that anymore: they just believe the press releases. So now we’re being bombarded with lazy articles like these:

• Universe will die “much sooner than expected,” new research says, CBS News, May 13, 2025.

• Sharmila Kuthunur, Scientists calculate when the universe will end—it’s sooner than expected, Space.com, 15 May 2025.

• Jamie Carter, The universe will end sooner than thought, scientists say, Forbes, 16 May 2025.

The list goes on; these are just three. There’s no way what I say can have much effect against such a flood of misinformation. As Mark Twain said, “A lie can travel around the world and back again while the truth is lacing up its boots.” Actually he probably didn’t say that—but everyone keeps saying he did, illustrating the point perfectly.

Still, there might be a few people who both care and don’t already know this stuff. Instead of trying to give a mini-course here, let me simply point to an explanation of how things really work:

• Abhay Ashtekar and Anne Magnon, Quantum fields in curved space-times, Proceedings of the Royal Society, 346 (1975), 375–394.

It’s technical, so it’s not easy reading if you haven’t studied quantum field theory and general relativity, but that’s unavoidable. It shows that in a static spacetime there is a well-defined concept of ‘vacuum’, and the vacuum is stable. Jorge Pullin pointed out the key sentence for present purposes:

Thus, if the underlying space-time admits an everywhere time-like Killing field, the vacuum state is indeed stable and phenomena such as the spontaneous creation of particles do not occur.

This condition of having an “everywhere time-like Killing field” says that a spacetime has time translation symmetry. Ashtekar and Magnon also assume that spacetime is globally hyperbolic and that the wave equation for a massive spin-zero particle has a smooth solution given smooth initial data. All this lets us define a concept of energy for solutions of this equation. It also lets us split solutions into positive-frequency solutions, which correspond to particles, and negative-frequency ones, which correspond to antiparticles. We can thus set up quantum field theory in the way we’re used to on Minkowski spacetime, where there’s a well-defined vacuum which does not decay into particle-antiparticle pairs.
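
Schematically, and in my own words rather than Ashtekar and Magnon's, the construction runs as follows. With $t$ the time coordinate generated by the Killing field, a real solution of the field equation can be expanded in modes of definite frequency,

\phi = \sum_n \left( a_n\, e^{-i\omega_n t}\, u_n(\mathbf{x}) + \bar a_n\, e^{+i\omega_n t}\, \bar u_n(\mathbf{x}) \right), \qquad \omega_n > 0 .

The positive-frequency part defines the one-particle Hilbert space, the vacuum is the Fock state with no excitations in any mode, and because time translation along the Killing field maps positive-frequency solutions to positive-frequency solutions, that vacuum is left unchanged by the evolution: no particle-antiparticle pairs appear.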

The Schwarzschild solution, which describes a static black hole, also has a Killing field. But this ceases to be timelike at the event horizon, so this result does not apply to that!

I could go into more detail if required, but you can find a more pedagogical treatment in this standard textbook:

• Robert Wald, Quantum Field Theory in Curved Spacetime and Black Hole Thermodynamics, University of Chicago Press, Chicago, 1994.

In particular, go to Section 4.3, which is on quantum field theory in stationary spacetimes.

I also can’t resist citing this thesis by a student of mine:

• Valeria Michelle Carrión Álvarez, Loop Quantization versus Fock Quantization of p-Form Electromagnetism on Static Spacetimes, Ph.D. thesis, U. C. Riverside, 2004.

This thesis covers the case of electromagnetism, while Ashtekar and Magnon, and also Wald, focus on a massive scalar field for simplicity.

So: it’s been rigorously shown that the gravitational field of a static object does not create particle-antiparticle pairs. This has been known for decades. Now some people have done a crude approximate calculation that seems to show otherwise. Some flaws in the approximation have been pointed out. Of course the authors of the calculation don’t believe their approximation is flawed. We could argue about that for a long time. But it’s scarcely worth thinking about, because no approximations were required to settle this issue. It was settled over 50 years ago, and the new work is not shedding new light on the issue: it’s much more hand-wavy than the old work.

May 16, 2025

John BaezMeteor Burst Communications

Back before satellites, to transmit radio waves over really long distances folks bounced them off the ionosphere—a layer of charged particles in the upper atmosphere. Unfortunately this layer only reflects radio waves with frequencies up to 30 megahertz. This limits the rate at which information can be transmitted.

How to work around this?

METEOR BURST COMMUNICATIONS!

On average, 100 million meteorites weighing about a milligram hit the Earth each day. They vaporize about 120 kilometers up. Each one creates a trail of ions that lasts about a second. And you can bounce radio waves with a frequency up to 100 megahertz off this trail.

That’s not a huge improvement, and you need to transmit in bursts whenever a suitable meteorite comes your way, but the military actually looked into doing this.

The National Bureau of Standards tested a burst-mode system in 1958 that used the 50-MHz band and offered a full-duplex link at 2,400 bits per second. The system used magnetic tape loops to buffer data and transmitters at both ends of the link that operated continually to probe for a path. Whenever the receiver at one end detected a sufficiently strong probe signal from the other end, the transmitter would start sending data. The Canadians got in on the MBC action with their JANET system, which had a similar dedicated probing channel and tape buffer. In 1954 they established a full-duplex teletype link between Ottawa and Nova Scotia at 1,300 bits per second with an error rate of only 1.5%.

This is from

• Dan Maloney, Radio apocalypse: meteor burst communications, Hackaday, 2025 May 12.

and the whole article is a great read.

There’s a lot more to the story. For example, until recently people used this method in the western United States to report the snow pack from mountain tops!

The system was called SNOTEL, and you can read more about it here:

• Dan Maloney, Know snow: monitoring snowpack with the SNOTEL network, Hackaday, 2023 June 29.

Also, a lot of ham radio operators bounce signals off meteors just for fun!

• Robert Gulley, Incoming! An introduction to meteor scatter propagation, The SWLing Post, 2024 January 17.

May 15, 2025

John BaezVisions for the Future of Physics

On Wednesday May 14, 2025 I’ll be giving a talk at 2 pm Pacific Time, or 10 pm UK time. The talk is for physics students at the Universidade de São Paulo in Brazil, organized by Artur Renato Baptista Boyago.

Visions for the Future of Physics

Abstract. The 20th century was the century of fundamental physics. What about the 21st? Progress on fundamental physics has been slow since about 1980, but there is exciting progress in other fields, such as condensed matter. This requires an adjustment in how we think about the goal of physics.

You can see my slides here, or watch a video of the talk here:

May 14, 2025

Terence TaoSome variants of the periodic tiling conjecture

Rachel Greenfeld and I have just uploaded to the arXiv our paper Some variants of the periodic tiling conjecture. This paper explores variants of the periodic tiling phenomenon: the phenomenon that, in some cases, a tile that can translationally tile a group must also be able to translationally tile the group periodically. For instance, for a given discrete abelian group {G}, consider the following question:

Question 1 (Periodic tiling question) Let {F} be a finite subset of {G}. If there is a solution {1_A} to the tiling equation {1_F * 1_A = 1}, must there exist a periodic solution {1_{A_p}} to the same equation {1_F * 1_{A_p} = 1}?
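
To fix the notation with a concrete instance (a toy example of mine, not from the paper): the 2 x 2 square tile {F = \{0,1\}^2} tiles {{\bf Z}^2} at level one using the 2-periodic set {A = 2{\bf Z} \times 2{\bf Z}}, and because {F} is finite and {A} is periodic, the tiling equation {1_F * 1_A = 1} can be checked as a cyclic convolution on a torus:

    import numpy as np

    # Check 1_F * 1_A = 1 for F = {0,1}^2 and A = 2Z x 2Z, on the torus (Z/8)^2.
    # A is 2-periodic and F is finite, so the computation on the torus is faithful.
    N = 8
    F = np.zeros((N, N), dtype=int)
    F[0:2, 0:2] = 1                     # indicator function 1_F of the tile
    A = np.zeros((N, N), dtype=int)
    A[::2, ::2] = 1                     # indicator function 1_A of the translate set

    conv = np.fft.ifft2(np.fft.fft2(F) * np.fft.fft2(A)).real.round().astype(int)
    print((conv == 1).all())            # True: every point is covered exactly once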

We know that the answer to this question is positive for finite groups {H} (trivially, since all sets are periodic in this case), one-dimensional groups {{\bf Z} \times H} with {H} finite, and in {{\bf Z}^2}, but it can fail for {{\bf Z}^2 \times H} for certain finite {H}, and also for {{\bf Z}^d} for sufficiently large {d}; see this previous blog post for more discussion. But now one can consider other variants of this question:

  • Instead of considering level one tilings {1_F * 1_A = 1}, one can consider level {k} tilings {1_F * 1_A = k} for a given natural number {k} (so that every point in {G} is covered by exactly {k} translates of {F}), or more generally {1_F * 1_A = g} for some periodic function {g}.
  • Instead of requiring {1_F} and {1_A} to be indicator functions, one can allow these functions to be integer-valued, thus we are now studying convolution equations {f*a=g} where {f, g} are given integer-valued functions (with {g} periodic and {f} finitely supported).

We are able to obtain positive answers to three such analogues of the periodic tiling conjecture for three cases of this question. The first result (which was kindly shared with us by Tim Austin), concerns the homogeneous problem {f*a = 0}. Here the results are very satisfactory:

Theorem 2 (First periodic tiling result) Let {G} be a discrete abelian group, and let {f} be integer-valued and finitely supported. Then the following are equivalent.
  • (i) There exists an integer-valued solution {a} to {f*a=0} that is not identically zero.
  • (ii) There exists a periodic integer-valued solution {a_p} to {f * a_p = 0} that is not identically zero.
  • (iii) There is a vanishing Fourier coefficient {\hat f(\xi)=0} for some non-trivial character {\xi \in \hat G} of finite order.

By combining this result with an old result of Henry Mann about sums of roots of unity, as well as an even older decidability result of Wanda Szmielew, we obtain

Corollary 3 Any of the statements (i), (ii), (iii) is algorithmically decidable; there is an algorithm that, when given {G} and {f} as input, determines in finite time whether any of these assertions hold.
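
A toy instance of the equivalence in Theorem 2, with {G = {\bf Z}} (again my own illustration, not taken from the paper): for {f = 1_{\{0,1\}}}, the Fourier transform {\hat f} vanishes at the order-two character, matching criterion (iii), and correspondingly {a(n) = (-1)^n} is a nonzero periodic integer solution of {f*a=0}, matching criteria (i) and (ii).

    import numpy as np

    # f = 1_{0,1} on Z, so (f*a)(n) = a(n) + a(n-1).  The 2-periodic candidate
    # a(n) = (-1)^n solves the homogeneous equation, modelled cyclically on Z/64.
    n = np.arange(64)
    a = (-1) ** n
    conv = a + np.roll(a, 1)
    print((conv == 0).all())                 # True: criteria (i) and (ii)

    f_hat = 1 + np.exp(2j * np.pi * 0.5)     # \hat f at the order-2 character n -> e^{i pi n}
    print(abs(f_hat) < 1e-12)                # True: criterion (iii)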

Now we turn to the inhomogeneous problem in {{\bf Z}^2}, which is the first difficult case (periodic tiling type results are easy to establish in one dimension, and trivial in zero dimensions). Here we have two results:

Theorem 4 (Second periodic tiling result) Let {G={\bf Z}^2}, let {g} be periodic, and let {f} be integer-valued and finitely supported. Then the following are equivalent.
  • (i) There exists an integer-valued solution {a} to {f*a=g}.
  • (ii) There exists a periodic integer-valued solution {a_p} to {f * a_p = g}.

Theorem 5 (Third periodic tiling result) Let {G={\bf Z}^2}, let {g} be periodic, and let {f} be integer-valued and finitely supported. Then the following are equivalent.
  • (i) There exists an indicator function solution {1_A} to {f*1_A=g}.
  • (ii) There exists a periodic indicator function solution {1_{A_p}} to {f * 1_{A_p} = g}.

In particular, the previously established case of the periodic tiling conjecture for level one tilings of {{\bf Z}^2} is now extended to higher level. By an old argument of Hao Wang, we know that the statements mentioned in Theorem 5 are now also algorithmically decidable, although it remains open whether the same is the case for Theorem 4. We know from past results that Theorem 5 cannot hold in sufficiently high dimension (even in the classic case {g=1}), but it also remains open whether Theorem 4 fails in that setting.

Following past literature, we rely heavily on a structure theorem for solutions {a} to tiling equations {f*a=g}, which roughly speaking asserts that such solutions {a} must be expressible as a finite sum of functions {\varphi_w} that are one-periodic (periodic in a single direction). This already explains why tiling is easy to understand in one dimension, and why the two-dimensional case is more tractable than the case of general dimension. This structure theorem can be obtained by averaging a dilation lemma, which is a somewhat surprising symmetry of tiling equations that basically arises from finite characteristic arguments (viewing the tiling equation modulo {p} for various large primes {p}).

For Theorem 2, one can take advantage of the fact that the homogeneous equation {f*a=0} is preserved under finite difference operators {\partial_h a(x) := a(x+h)-a(x)}: if {a} solves {f*a=0}, then {\partial_h a} also solves the same equation {f * \partial_h a = 0}. This freedom to take finite differences allows one to selectively eliminate certain one-periodic components {\varphi_w} of a solution {a} to the homogeneous equation {f*a=0} until the solution is a pure one-periodic function, at which point one can appeal to an induction on dimension, to equate parts (i) and (ii) of the theorem. To link up with part (iii), we also take advantage of the existence of retraction homomorphisms from {{\bf C}} to {{\bf Q}} to convert a vanishing Fourier coefficient {\hat f(\xi)= 0} into an integer solution to {f*a=0}.

The inhomogeneous results are more difficult, and rely on arguments that are specific to two dimensions. For Theorem 4, one can also perform finite differences to analyze various components {\varphi_w} of a solution {a} to a tiling equation {f*a=g}, but the conclusion now is that these components are determined (modulo {1}) by polynomials of one variable. Applying a retraction homomorphism, one can make the coefficients of these polynomials rational, which makes the polynomials periodic. This turns out to reduce the original tiling equation {f*a=g} to a system of essentially local combinatorial equations, which allows one to “periodize” a non-periodic solution by periodically repeating a suitable block of the (retraction homomorphism applied to the) original solution.

Theorem 5 is significantly more difficult to establish than the other two results, because of the need to maintain the solution in the form of an indicator function. There are now two separate sources of aperiodicity to grapple with. One is the fact that the polynomials involved in the components {\varphi_w} may have irrational coefficients (see Theorem 1.3 of our previous paper for an explicit example of this for a level 4 tiling). The other is that in addition to the polynomials (which influence the fractional parts of the components {\varphi_w}), there is also “combinatorial” data (roughly speaking, associated to the integer parts of {\varphi_w}) which also interact with each other in a slightly non-local way. Once one can make the polynomial coefficients rational, there is enough periodicity that the periodization approach used for the second theorem can be applied to the third theorem; the main remaining challenge is to find a way to make the polynomial coefficients rational, while still maintaining the indicator function property of the solution {a}.

It turns out that the retraction homomorphism approach is no longer available here (it makes the components {\varphi_w} unbounded, which makes the combinatorial problem too difficult to solve). Instead, one has to first perform a second moment analysis to discern more structure about the polynomials involved. It turns out that the components {\varphi_w} of an indicator function {1_A} can only utilize linear polynomials (as opposed to polynomials of higher degree), and that one can partition {{\bf Z}^2} into a finite number of cosets, on any given coset of which only three of these linear polynomials are “active”. The irrational coefficients of these linear polynomials then have to obey some rather complicated, but (locally) finite, sentence in the theory of first-order linear inequalities over the rationals, in order to form an indicator function {1_A}. One can then use the Weyl equidistribution theorem to replace these irrational coefficients with rational coefficients that obey the same constraints (although one first has to ensure that one does not accidentally fall into the boundary of the constraint set, where things are discontinuous). Then one can apply periodization to the remaining combinatorial data to conclude.

A key technical problem arises from the discontinuities of the fractional part operator {\{x\}} at integers, so a certain amount of technical manipulation (in particular, passing at one point to a weak limit of the original tiling) is needed to avoid ever having to encounter this discontinuity.

May 11, 2025

Tommaso DorigoOn Progress

The human race has made huge progress in the past few thousand years, gradually improving the living conditions of human beings by learning how to cure illness; improving farming; harvesting, storing, and using energy in several forms; and engaging in countless other activities.

Progress is measured over long time scales, and on metrics related to the access to innovations by all, as Ford once noted. So it is natural for us to consider ourselves lucky to have lived "in the best of times". 

Why, if you had been born 400 years ago, for example, you would probably never even have learned what a hot shower is! And even just 100 years ago you could have been watching, powerless, as your children died of diseases that today elicit little worry.


Mark GoodsellChoose France for Science

In the news this week was the joint announcement by the presidents of the European Commission and France of initiatives about welcoming top researchers from abroad, with the aim being especially to encourage researchers from the USA to cross the Atlantic. I've seen some discussion online about this among people I know and thought I'd add a few comments here, for those outside Europe thinking about making such a jump.

Firstly, what is the new initiative? Various programmes have been put in place; on the EU side it seems to be encouraging applications to Marie Curie Fellowships for postdocs and ERC grants. It looks like there is some new money, particularly for Marie Curie Fellowships for incoming researchers. Applying for these is generally good advice, as they are prestigious programs that open the way to a career; in my field a Marie Curie often leads to a permanent position, and an ERC grant is so huge that it opens doors everywhere. In France, the programme seems to be an ANR programme targeting specific strategic fields, so unlikely to be relevant for high-energy physicists (despite the fact that they invited Mark Thomson to speak at the meeting). But France can be a destination for the European programmes, and there are good reasons to choose France as a destination.

So the advice would seem to be to try out life in France with a Marie Curie Fellowship, and then apply through the usual channels for a permanent position. This is very reasonable, because it makes little sense to move permanently before having some idea of what life and research is actually like here first. I would heartily recommend it. There are several permanent positions available every year in the CNRS at the junior level, but because of the way the CNRS hiring works -- via a central committee that decides on positions for the whole country -- if someone leaves it is not very easy to replace them, and people job-hopping is a recurrent problem. There is also the possibility for people to enter the CNRS at a senior level, with up to one position available in theoretical physics most years.

I wrote a bit last year where I mentioned some of the great things about the CNRS but I will add a bit now. Firstly, what is it? It is a large organisation that essentially just hires permanent researchers, who work in laboratories throughout the country. Most of these laboratories are hosted by universities, such as my lab (the LPTHE) which is hosted by Sorbonne University. Most of these laboratories are mixed, meaning that they also include university staff, i.e. researchers who also teach undergraduates. University positions have a similar but parallel career to the CNRS, but since the teaching is done in French, and because the positions only open on a rather unpredictable basis, I won't talk about them today. The CNRS positions are 100% research; there is little administrative overhead, and therefore plenty of time to focus on what is important. This is the main advantage of such positions; but also the fact that the organisation of researchers is done into laboratories is a big difference to the Anglo-Saxon model. My lab is relatively small, yet contains a large number of people working in HEP, and this provides a very friendly environment with lots of interesting interactions, without being lost in a labyrinthine organisation or having key decisions taken by people working in vastly different (sub) fields. 

The main criticisms I have seen bandied around on social media about the CNRS are that the pay is not competitive, and that CNRS researchers are lazy/do not work. I won't comment about pay, because it's difficult to compare. But there is plenty of oversight by the CNRS committee -- a body of our peers elected by all researchers -- which scrutinises activity, in addition to deciding on hiring and promotions. If people were really sitting on their hands then this would be spotted and nipped in the bud; but the process of doing this is not onerous or intrusive, precisely because it is done by our peers. In fact, the yearly and five-yearly reports serve a useful role in helping people to focus their activities and plan for the next one to five years. There is also evaluation of laboratories and universities (the HCERES, which will now be changed into something else) that however seems sensible: it doesn't seem to lead to the same sort of panic or perverse incentives that the (equivalent) REF seems to induce in the UK, for example. 

The people I know are incredibly hard-working and productive. This is, to be fair, also a product of the fact that we have relatively few PhD students compared to other countries. This is partly by design: the philosophy is that it is unfair to train lots of students who can never get permanent positions in the field. As a result, we take good care of our students, and the students we have tend to be good; but since we have the time, we mostly do research ourselves, rather than just being managers. 

So the main reason to choose France is to be allowed to do the research you want to do, without managerialisation, bureaucrats or other obstacles interfering. If that sounds appealing, then I suggest getting in touch and/or arranging to visit. A visit to the RPP or one of the national meetings would be a great way to start. The applications for Marie Curie fellowships are open now, and the CNRS competition opens in December with a deadline usually in early January. 

May 08, 2025

Scott Aaronson Cracking the Top Fifty!

I’ve now been blogging for nearly twenty years—through five presidential administrations, my own moves from Waterloo to MIT to UT Austin, my work on algebrization and BosonSampling and BQP vs. PH and quantum money and shadow tomography, the publication of Quantum Computing Since Democritus, my courtship and marriage and the birth of my two kids, a global pandemic, the rise of super-powerful AI and the terrifying downfall of the liberal world order.

Yet all that time, through more than a thousand blog posts on quantum computing, complexity theory, philosophy, the state of the world, and everything else, I chased a form of recognition for my blogging that remained elusive.

Until now.

This week I received the following email:

I emailed regarding your blog Shtetl-Optimized Blog which was selected by FeedSpot as one of the Top 50 Quantum Computing Blogs on the web.

https://bloggers.feedspot.com/quantum_computing_blogs

We recommend adding your website link and other social media handles to get more visibility in our list, get better ranking and get discovered by brands for collaboration.

We’ve also created a badge for you to highlight this recognition. You can proudly display it on your website or share it with your followers on social media.

We’d be thankful if you can help us spread the word by briefly mentioning Top 50 Quantum Computing Blogs in any of your upcoming posts.

Please let me know if you can do the needful.

You read that correctly: Shtetl-Optimized is now officially one of the top 50 quantum computing blogs on the web. You can click the link to find the other 49.


Maybe it’s not unrelated to this new notoriety that, over the past few months, I’ve gotten a massively higher-than-usual volume of emailed solutions to the P vs. NP problem, as well as the other Clay Millennium Problems (sometimes all seven problems at once), as well as quantum gravity and life, the universe, and everything. I now get at least six or seven confident such emails per day.

While I don’t spend much time on this flood of scientific breakthroughs (how could I?), I’d like to note one detail that’s new. Many of the emails now include transcripts where ChatGPT fills in the details of the emailer’s theories for them—unironically, as though that ought to clinch the case. Who said generative AI wasn’t poised to change the world? Indeed, I’ll probably need to start relying on LLMs myself to keep up with the flood of fan mail, hate mail, crank mail, and advice-seeking mail.

Anyway, thanks for reading everyone! I look forward to another twenty years of Shtetl-Optimized, if my own health and the health of the world cooperate.

May 04, 2025

Tommaso DorigoThe Night Sky From Atacama

For the third time in 9 years I am visiting San Pedro de Atacama, a jewel in the middle of nowhere in northern Chile. The Atacama desert is a stretch of extremely dry land at high altitude, which makes it exceptionally attractive for astronomical activities. Nearby, for example, are some of the largest telescopes in the world - the Cerro Paranal Very Large Telescope (VLT), and the planned Extremely Large Telescope (ELT) now being built on Cerro Armazones. And I have news that an even larger telescope, tentatively dubbed RLT for Ridiculously Large Telescope, is being planned in the region...


April 28, 2025

John PreskillQuantum automata

Do you know when an engineer built the first artificial automaton—the first human-made machine that operated by itself, without external control mechanisms that altered the machine’s behavior over time as the machine undertook its mission?

The ancient Greek thinker Archytas of Tarentum reportedly created it about 2,300 years ago. Steam propelled his mechanical pigeon through the air.

For centuries, automata cropped up here and there as curiosities and entertainment. The wealthy exhibited automata to amuse and awe their peers and underlings. For instance, the French engineer Jacques de Vaucanson built a mechanical duck that appeared to eat and then expel grains. The device earned the nickname the Digesting Duck…and the nickname the Defecating Duck.

Vaucanson also invented a mechanical loom that helped foster the Industrial Revolution. During the 18th and 19th centuries, automata began to enable factories, which changed the face of civilization. We’ve inherited the upshots of that change. Nowadays, cars drive themselves, Roombas clean floors, and drones deliver packages.1 Automata have graduated from toys to practical tools.2

Rather, classical automata have. What of their quantum counterparts?

Scientists have designed autonomous quantum machines, and experimentalists have begun realizing them. The roster of such machines includes autonomous quantum engines, refrigerators, and clocks. Much of this research falls under the purview of quantum thermodynamics, due to the roles played by energy in these machines’ functioning: above, I defined an automaton as a machine free of time-dependent control (exerted by a user). Equivalently, according to a thermodynamicist mentality, we can define an automaton as a machine on which no user performs any work as the machine operates. Thermodynamic work is well-ordered energy that can be harnessed directly to perform a useful task. Often, instead of receiving work, an automaton receives access to a hot environment and a cold environment. Heat flows from the hot to the cold, and the automaton transforms some of the heat into work.

Quantum automata appeal to me because quantum thermodynamics has few practical applications, as I complained in my previous blog post. Quantum thermodynamics has helped illuminate the nature of the universe, and I laud such foundational insights. Yet we can progress beyond laudation by trying to harness those insights in applications. Some quantum thermal machines—quantum batteries, engines, etc.—can outperform their classical counterparts, according to certain metrics. But controlling those machines, and keeping them cold enough that they behave quantum mechanically, costs substantial resources. The machines cost more than they’re worth. Quantum automata, requiring little control, offer hope for practicality. 

To illustrate this hope, my group partnered with Simone Gasparinetti’s lab at Chalmers University of Technology in Sweden. The experimentalists created an autonomous quantum refrigerator from superconducting qubits. The quantum refrigerator can help reset, or “clear,” a quantum computer between calculations.

Artist’s conception of the autonomous-quantum-refrigerator chip. Credit: Chalmers University of Technology/Boid AB/NIST.

After we wrote the refrigerator paper, collaborators and I raised our heads and peered a little farther into the distance. What does building a useful autonomous quantum machine take, generally? Collaborators and I laid out guidelines in a “Key Issues Review” published in Reports on Progress in Physics last November.

We based our guidelines on DiVincenzo’s criteria for quantum computing. In 1996, David DiVincenzo published seven criteria that any platform, or setup, must meet to serve as a quantum computer. He cast five of the criteria as necessary and two criteria, related to information transmission, as optional. Similarly, our team provides ten criteria for building useful quantum automata. We regard eight of the criteria as necessary, at least typically. The final two guidelines, which are optional, govern information transmission and machine transportation.

Time-dependent external control and autonomy

DiVincenzo illustrated his criteria with multiple possible quantum-computing platforms, such as ions. Similarly, we illustrate our criteria in two ways. First, we show how different quantum automata—engines, clocks, quantum circuits, etc.—can satisfy the criteria. Second, we illustrate how quantum automata can consist of different platforms: ultracold atoms, superconducting qubits, molecules, and so on.

Nature has suggested some of these platforms. For example, our eyes contain autonomous quantum energy transducers called photoisomers, or molecular switches. Suppose that such a molecule absorbs a photon. The molecule may use the photon’s energy to switch configuration. This switching sets off chemical and neurological reactions that result in the impression of sight. So the quantum switch transduces energy from light into mechanical, chemical, and electric energy.

Photoisomer. (Image by Todd Cahill, from Quantum Steampunk.)

My favorite of our criteria ranks among the necessary conditions: every useful quantum automaton must produce output worth the input. How one quantifies a machine’s worth and cost depends on the machine and on the user. For example, an agent using a quantum engine may care about the engine’s efficiency, power, or efficiency at maximum power. Costs can include the energy required to cool the engine to the quantum regime, as well as the control required to initialize the engine. The agent also chooses which value they regard as an acceptable threshold for the output produced per unit input. I like this criterion because it applies a broom to dust that we quantum thermodynamicists often hide under a rug: quantum thermal machines’ costs. Let’s begin building quantum engines that perform more work than they require to operate.

One might object that scientists and engineers are already sweating over nonautonomous quantum machines. Companies, governments, and universities are pouring billions of dollars into quantum computing. Building a full-scale quantum computer by hook or by crook, regardless of classical control, is costing enough. Eliminating time-dependent control sounds even tougher. Why bother?

Fellow Quantum Frontiers blogger John Preskill pointed out one answer, when I described my new research program to him in 2022: control systems are classical—large and hot. Consider superconducting qubits—tiny quantum circuits—printed on a squarish chip about the size of your hand. A control wire terminates on each qubit. The rest of the wire runs off the edge of the chip, extending to classical hardware standing nearby. One can fit only so many wires on the chip, so one can fit only so many qubits. Also, the wires, being classical, are hotter than the qubits should be. The wires can help decohere the circuits, introducing errors into the quantum information they store. The more we can free the qubits from external control—the more autonomy we can grant them—the better.

Besides, quantum automata exemplify quantum steampunk, as my coauthor Pauli Erker observed. I kicked myself after he did, because I’d missed the connection. The irony was so thick, you could have cut it with the retractible steel knife attached to a swashbuckling villain’s robotic arm. Only two years before, I’d read The Watchmaker of Filigree Street, by Natasha Pulley. The novel features a Londoner expatriate from Meiji Japan, named Mori, who builds clockwork devices. The most endearing is a pet-like octopus, called Katsu, who scrambles around Mori’s workshop and hoards socks. 

Does the world need a quantum version of Katsu? Not outside of quantum-steampunk fiction…yet. But a girl can dream. And quantum automata now have the opportunity to put quantum thermodynamics to work.


1And deliver pizzas. While visiting the University of Pittsburgh a few years ago, I was surprised to learn that the robots scurrying down the streets were serving hungry students.

2And minions of starving young scholars.

April 15, 2025

n-Category Café Position in Stellenbosch

guest post by Bruce Bartlett

Stellenbosch University is hiring!

The Mathematics Division at Stellenbosch University in South Africa is looking to hire a new permanent appointment at Lecturer / Senior Lecturer level (other levels may be considered too under the appropriate circumstances).

Preference will be given to candidates working in number theory or a related area, but those working in other areas of mathematics will definitely also be considered.

The closing date for applications is 30 April 2025. For more details, kindly see the official advertisement.

Consider a wonderful career in the winelands area of South Africa!