Planet Musings

November 27, 2022

Tommaso Dorigo Toward Artificial Intelligence Assisted Design Of Experiments

That's the title of a short article I just published (it is online here, but beware - for now you need to access it from an institution that has access to the journal's contents), in Nuclear Instruments and Methods - a renowned journal for particle physics and nuclear physics instrumentation. The contents are nothing very new, in the sense that they are little more than a summary of things that the MODE collaboration published last March here. But for the distracted among you, I will summarize the summary below.


read more

November 26, 2022

Robert Helling Get Rich Fast

I wrote a text as a comment on the episode of the Logbuch Netzpolitik podcast on the FTX debacle but could not post it to the comment section (because that appears to be disabled). So, in order not to waste it, I post it here (translated from the German original):


1. Leverage ("Hebel"): If I believe, say, that the Apple share price will keep rising, I can buy an Apple share to profit from that. It currently costs about 142 euros; if I buy one and the price rises to 150 euros, I have of course made 8 euros of profit. Even better, of course, if I buy 100: then I make 800 euros of profit. The only obstacle is if I don't have 14,200 euros available for that. But no problem, I simply take out a loan for the price of 99 shares (that is, 14,058 euros). For simplicity let's ignore the fact that I have to pay interest on it; interest only makes the whole game less attractive for me. So I buy 100 shares, 99 of them on credit. Once the price is at 150, I sell them again, pay off the loan, and go home with 800 euros more. I have thus multiplied the gain on the price move by a hundred.


The only trouble is that at the same time I also multiply the risk of loss by a hundred: if the share price falls, contrary to my optimistic expectations, it can quickly happen that selling the shares no longer brings in enough money to pay off the loan. That happens when the 100 shares are worth less than the loan, i.e. when the share price falls below 140.58 euros. If I sell my shares at that moment, I can just barely pay my debts, but I have completely lost my equity, which was the one share I bought with my own money. If the price has fallen even further, though, speculating on credit means I can lose more than all my money: I have nothing left and still haven't paid off my debts. Of course the bank that gave me the loan is afraid of exactly that, which is why it forces me to sell the shares at the latest when the price has fallen to 140.58, so that it gets its loan back in any case. Daniel calls this "glattstellen" (closing out the position).
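To make point 1 concrete, here is a tiny worked version of the arithmetic above in code (my own sketch, not part of the original post):

```python
# A toy version of the leverage arithmetic above: 1 share bought with my own
# 142 euros, 99 shares bought with a 14,058-euro loan, interest ignored.

LOAN = 99 * 142.0          # 14,058 euros borrowed from the bank
SHARES = 100               # 100 shares bought at 142 euros each

def equity(price):
    """What is left for me after selling all the shares and repaying the loan."""
    return SHARES * price - LOAN

for price in [150.0, 142.0, 140.58, 138.0]:
    print(f"price {price:7.2f} -> equity {equity(price):8.2f}")
# price  150.00 -> equity   942.00  (my 142 euros plus the 800 euros of profit)
# price  142.00 -> equity   142.00  (break-even: I just get my own money back)
# price  140.58 -> equity     0.00  (the bank forces a sale at the latest here)
# price  138.00 -> equity  -258.00  (I still owe money after selling everything)
```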


2. Naturally I find that annoying, because the price drops by those 1.42 euros far more readily than it rises by 8 euros. To prevent it, I can deposit other things of value with the bank, e.g. my iPhone, which is still worth 100 euros. Then the bank only forces me to sell my shares once the value of the shares plus the 100 euros for the iPhone falls below the value of the loan; it could, after all, still sell the iPhone to get its money back. If I have no iPhone to deposit, I have to deposit something else of value with the bank (collateral).


3. This is where the tokens come into play. I can invent 1000 crypto tokens (whether or not they are tied to ownership of computer-generated cartoons of Tim and Linus doesn't matter). Since I have merely invented them, I am no further along: as such they have no value. I can try to sell them, but I will only be laughed at. Here my second company, the investment fund, enters the picture: with it I buy 100 of the tokens from myself at a price of 30 euros apiece. If it is not obvious that I bought the things from myself (if necessary via a straw buyer), it looks as though the tokens are seriously being traded at a value of 30 euros. On top of that, I sell the customers of my fund 100 more, also at 30 euros each, with the promise that holders of the coins get a discount on my fund's fees. By now, at the latest, the value of 30 euros per token is established. Of the original 1000 I still have 800. Now I can claim to hold assets worth 24,000 euros, because that is 800 times 30 euros. I have created this wealth essentially out of nothing, since the assumption that I could also find real buyers for the other 800 at this price is nonsense.


But if I disguise all of this well enough, someone may believe that I really am sitting on 24,000 euros worth of assets. In particular, the bank from steps 1 and 2 may believe it, and I can deposit these tokens as collateral for the loan and thereby take out even larger loans to buy Apple shares with.


The whole thing only blows up when the share price falls so far that the bank insists the loan be repaid. Then I have to sell not only the shares and the iPhone but also the remaining tokens. And then I am left standing there with my pants down, because it becomes clear that of course nobody wants the tokens I simply made up, certainly not for 30 euros. Then, in Daniel's words, the "liquidity" is missing.


That, as I understand it, is what happened: not with Apple shares and iPhones, of course, but in principle. And the point of doing deals in a circle with yourself is precisely to artificially drive up the apparent price of something you hold even more of. The flaw in all of this is that it is hard to judge the value of something that is not really traded at all, or whose value rests only on other assumed values; and those assumptions can change very quickly once somebody says "show me!" and there are no real assets (traditionally in the form of factories, know-how, etc.) standing behind them.

November 25, 2022

Matt von Hippel Confidence and Friendliness in Science

I’ve seen three kinds of scientific cultures.

First, there are folks who are positive about almost everyone. Ask them about someone else’s lab, even a competitor, and they’ll be polite at worst, and often downright excited. Anyone they know, they’ll tell you how cool the work they’re doing is, how it’s important and valuable and worth doing. They might tell you they prefer a different approach, but they’ll almost never bash someone’s work.

I’ve heard this comes out of American culture, and I can kind of see it. There’s an attitude in the US that everything needs to be described as positively as possible. This is especially true in a work context. Negativity is essentially a death sentence, doled out extremely rarely: if you explicitly say someone or their work is bad, you’re trying to get them fired. You don’t do that unless someone really really deserves it.

That style of scientific culture is growing, but it isn’t universal. There’s still a big cultural group that is totally ok with negativity: as long as it’s directed at other people, anyway.

This scientific culture prides itself on “telling it like it is”. They’ll happily tell you about how everything everyone else is doing is bullshit. Sometimes, they claim their ideas are the only ways forward. Others will have a small number of other people who they trust, who have gained their respect in one way or another. This sort of culture is most stereotypically associated with Russians: a “Russian-style” seminar, for example, is one where the speaker is aggressively questioned for hours.

It may sound like those are the only two options, but there is a third. While “American-style” scientists don’t criticize anyone, and “Russian-style” scientists criticize everyone else, there are also scientists who criticize almost everyone, including themselves.

With a light touch, this culture can be one of the best. There can be a real focus on “epistemic humility”, on always being clear about how much we still don’t know.

However, it can be worryingly easy to spill past that light touch, into something toxic. When the criticism goes past humility and into a lack of confidence in your own work, you risk falling into a black hole, where nothing is going well and nobody has a way out. This kind of culture can spread, filling a workplace and infecting anyone who spends too long there with the conviction that nothing will ever measure up again.

If you can’t manage that light skeptical touch, then your options are American-style or Russian-style. I don’t think either is obviously better. Both have their blind spots: the Americans can let bad ideas slide to avoid rocking the boat, while the Russians can be blind to their own flaws, confident that because everyone else seems wrong they don’t need to challenge their own worldview.

You have one more option, though. Now that you know this, you can recognize each for what it is: not the one true view of the world, but just one culture’s approach to the truth. If you can do that, you can pick up each culture as you need, switching between them as you meet different communities and encounter different things. If you stay aware, you can avoid fighting over culture and discourse, and use your energy on what matters: the science.

November 24, 2022

Matt Strassler In Brief: Unfortunate News from the Moon

Sadly, the LunaH-MAP mini-satellite (or “CubeSat”) that I wrote about a couple of days ago, describing how it would use particle physics to map out the water-ice in lunar soil, has had a serious setback and may not be able to carry out its mission. A stuck valve is the most likely reason that its thruster did not fire when instructed to do so, and so it has sailed past the Moon instead of going into the correct orbit. There’s still some hope that the situation can be salvaged, but it will take some luck. I feel badly for the scientists involved, who worked so hard and now face great disappointment.

In fact at least four and perhaps five of the ten CubeSats launched along with NASA’s Artemis mission have apparently failed in one way or another. This includes the Near-Earth Asteroid Scout and Team Miles, both of which were intended to test and use new technologies for space travel but with which communication has not been established, and OMOTENASHI, which is intended to study the particle physics environment around the Moon and land a mini-craft on the surface, but which has had communication issues and will not be able to deploy its lander. It’s not clear what’s happening with Lunar-IR either.

One has to wonder whether this very high failure rate is due to the long delays suffered by the Artemis mission. The original launch date was at the end of August; batteries do degrade, and even satellites designed for the rigors of outer space can suffer in Florida’s heat and moisture.

Matt Strassler Welcome!

Hi all, and welcome! On this site, devoted to sharing the excitement and meaning of science, you’ll find a blog (posts begin below) and reference articles (accessible from the menus above.) The site is being upgraded, so you’ll see some ongoing changes; if you notice technical problems, please let me know.  Thanks, and enjoy!

Don’t forget to leave (polite) comments, and keep an eye out for days when I take direct questions from readers. Oh, and for the moment you can follow me on Twitter and Facebook. 

Sean Carroll Thanksgiving

This year we give thanks for Arrow’s Impossibility Theorem. (We’ve previously given thanks for the Standard Model Lagrangian, Hubble’s Law, the Spin-Statistics Theorem, conservation of momentum, effective field theory, the error bar, gauge symmetry, Landauer’s Principle, the Fourier Transform, Riemannian Geometry, the speed of light, the Jarzynski equality, the moons of Jupiter, space, black hole entropy, and electromagnetism.)

Arrow’s Theorem is not a result in physics or mathematics, or even in physical science, but rather in social choice theory. To fans of social-choice theory and voting models, it is as central as conservation of momentum is to classical physics; if you’re not such a fan, you may never have even heard of it. But as you will see, there is something physics-y about it. Connections to my interests in the physics of democracy are left as an exercise for the reader.

Here is the setup. You have a set of voters {1, 2, 3, …} and a set of choices {A, B, C, …}. The choices may be candidates for office, but they may equally well be where a group of friends is going to meet for dinner; it doesn’t matter. Each voter has a ranking of the choices, from most favorite to least, so that for example voter 1 might rank D first, A second, C third, and so on. We will ignore the possibility of ties or indifference concerning certain choices, but they’re not hard to include. What we don’t include is any measure of intensity of feeling: we know that a certain voter prefers A to B and B to C, but we don’t know whether (for example) they could live with B but hate C with a burning passion. As Kenneth Arrow observed in his original 1950 paper, it’s hard to objectively compare intensity of feeling between different people.

The question is: how best to aggregate these individual preferences into a single group preference? Maybe there is one bully who just always gets their way. But alternatively, we could try to be democratic about it and have a vote. When there are more than two choices, however, voting becomes tricky.

This has been appreciated for a long time, for example in the Condorcet Paradox (1785). Consider three voters and three choices, coming out as in this table.

Voter 1   Voter 2   Voter 3
A         B         C
B         C         A
C         A         B

Then simply posit that one choice is preferred to another if a majority of voters prefer it. The problem is immediate: more voters prefer A over B, and more voters prefer B over C, but more voters also prefer C over A. This violates the transitivity of preferences, which is a fundamental postulate of rational choice theory. Maybe we have to be more clever.
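The cycle is easy to check mechanically. Here is a tiny script (my own illustration of the table above, not part of the original post) that tallies the pairwise majorities:

```python
# Pairwise majorities for the three-voter Condorcet example above.

ballots = [            # each ballot lists the choices from most to least preferred
    ["A", "B", "C"],   # voter 1
    ["B", "C", "A"],   # voter 2
    ["C", "A", "B"],   # voter 3
]

def prefers(ballot, x, y):
    """True if this ballot ranks x above y."""
    return ballot.index(x) < ballot.index(y)

for x, y in [("A", "B"), ("B", "C"), ("C", "A")]:
    wins = sum(prefers(b, x, y) for b in ballots)
    print(f"{x} beats {y}: {wins} of {len(ballots)} voters agree")
# A beats B (2 of 3), B beats C (2 of 3), and yet C beats A (2 of 3): a cycle.
```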

So, much like Euclid did a while back for geometry, Arrow set out to state some simple postulates we can all agree a good voting system should have, then figure out what kind of voting system would obey them. The postulates he settled on (as amended by later work) are:

  • Nobody is a dictator. The system is not just “do what Voter 1 wants.”
  • Independence of irrelevant alternatives. If the method says that A is preferred to B, adding in a new alternative C will not change the relative ranking between A and B.
  • Pareto efficiency. If every voter prefers A over B, the group prefers A over B.
  • Unrestricted domain. The method provides group preferences for any possible set of individual preferences.

These seem like pretty reasonable criteria! And the answer is: you can’t do it. Arrow’s Theorem proves that there is no ranked-choice voting method that satisfies all of these criteria. I’m not going to prove the theorem here, but the basic strategy is to find a subset of the voting population whose preferences are always satisfied, and then find a similar subset of that population, and keep going until you find a dictator.

It’s fun to go through different proposed voting systems and see how they fall short of Arrow’s conditions. Consider for example the Borda Count: give 1 point to a choice for every voter ranking it first, 2 points for second, and so on, finally crowning the choice with the least points as the winner. (Such a system is used in some political contexts, and frequently in handing out awards like the Heisman Trophy in college football.) Seems superficially reasonable, but this method violates the independence of irrelevant alternatives. Adding in a new option C that many voters put between A and B will increase the distance in points between A and B, possibly altering the outcome.
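Here is a concrete instance of that failure (a toy example of my own, not from the post): with only A and B on the ballot, A wins the Borda count described above; adding C, which two of the voters slot in between B and A, hands the win to B, even though nobody's preference between A and B has changed.

```python
# Borda count as described above: a choice gets points equal to its rank on each
# ballot (1 for first place, 2 for second, ...); the fewest points wins.

def borda(ballots):
    choices = ballots[0]
    points = {c: sum(b.index(c) + 1 for b in ballots) for c in choices}
    winner = min(points, key=points.get)
    return winner, points

two_way   = 3 * [["A", "B"]]      + 2 * [["B", "A"]]
three_way = 3 * [["A", "B", "C"]] + 2 * [["B", "C", "A"]]

print(borda(two_way))    # ('A', {'A': 7, 'B': 8})           -- A wins
print(borda(three_way))  # ('B', {'A': 9, 'B': 8, 'C': 13})  -- now B wins
```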

Arrow’s Theorem reflects a fundamental feature of democratic decision-making: the idea of aggregating individual preferences into a group preference is not at all straightforward. Consider the following set of preferences:

Voter 1   Voter 2   Voter 3   Voter 4   Voter 5
A         A         A         D         D
B         B         B         B         B
C         D         C         C         C
D         C         D         A         A

Here a simple majority of voters have A as their first choice, and many common systems will spit out A as the winner. But note that the dissenters seem to really be against A, putting it dead last. And their favorite, D, is not that popular among A’s supporters. But B is ranked second by everyone. So perhaps one could make an argument that B should actually be the winner, as a consensus not-so-bad choice?

Perhaps! Methods like the Borda Count are intended to allow for just such a possibility. But it has its problems, as we’ve seen. Arrow’s Theorem assures us that all ranked-voting systems are going to have some kind of problems.
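For what it's worth, the Borda count described earlier does pick B for the five-voter profile in the table above (a quick check of my own):

```python
# Borda scores for the five-voter profile above (rank = points, fewest wins).

ballots = [
    ["A", "B", "C", "D"],   # voter 1
    ["A", "B", "D", "C"],   # voter 2
    ["A", "B", "C", "D"],   # voter 3
    ["D", "B", "C", "A"],   # voter 4
    ["D", "B", "C", "A"],   # voter 5
]

points = {c: sum(b.index(c) + 1 for b in ballots) for c in "ABCD"}
print(points)                       # {'A': 11, 'B': 10, 'C': 16, 'D': 13}
print(min(points, key=points.get))  # 'B': the consensus second choice wins
```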

By far the most common voting system in the English-speaking world is plurality voting, or “first past the post.” There, only the first-place preferences count (you only get to vote for one choice), and whoever gets the largest number of votes wins. It is universally derided by experts as a terrible system! A small improvement is instant-runoff voting, sometimes just called “ranked choice,” although the latter designation implies something broader. There, we gather complete rankings, count up all the top choices, and declare a winner if someone has a majority. If not, we eliminate whoever got the fewest first-place votes, and run the procedure again. This is … slightly better, as it allows for people to vote their conscience a bit more easily. (You can vote for your beloved third-party candidate, knowing that your vote will be transferred to your second-favorite if they don’t do well.) But it’s still rife with problems.
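Since the procedure is spelled out step by step, here is a bare-bones version in code (my own sketch; real election rules add tie-breaking, exhausted ballots, and other details). On the five-voter profile from the table above it immediately returns A, since A already has a majority of first-place votes, which is also what plurality would do; contrast that with the Borda outcome shown earlier.

```python
from collections import Counter

def instant_runoff(ballots):
    """Repeatedly eliminate the choice with the fewest first-place votes."""
    ballots = [list(b) for b in ballots]        # ranked choices, best first
    while True:
        remaining = ballots[0]                  # every ballot ranks the same set
        tallies = Counter({c: 0 for c in remaining})
        tallies.update(b[0] for b in ballots)   # count first-place votes
        leader, votes = tallies.most_common(1)[0]
        if 2 * votes > len(ballots):
            return leader                       # someone has a strict majority
        loser = min(tallies, key=tallies.get)   # fewest first-place votes
        for b in ballots:                       # transfer the loser's ballots
            b.remove(loser)

print(instant_runoff([
    ["A", "B", "C", "D"], ["A", "B", "D", "C"], ["A", "B", "C", "D"],
    ["D", "B", "C", "A"], ["D", "B", "C", "A"],
]))   # -> 'A' (A already has 3 of 5 first-place votes)
```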

One way to avoid Arrow’s result is to allow for people to express the intensity of their preferences after all, in what is called cardinal voting (or range voting, or score voting). This allows the voters to indicate that they love A, would grudgingly accept B, but would hate to see C. This slips outside Arrow’s assumptions, and allows us to construct a system that satisfies all of his criteria.

There is some evidence that cardinal voting leads to less “regret” among voters than other systems, for example as indicated in this numerical result from Warren Smith, where it is labeled “range voting” and left-to-right indicates best-to-worst among voting systems.

On the other hand — is it practical? Can you imagine elections with 100 candidates, and asking voters to give each of them a score from 0 to 100?

I honestly don’t know. Here in the US our voting procedures are already laughably primitive, in part because that primitivity serves the purposes of certain groups. I’m not that optimistic that we will reform the system to obtain a notably better result, but it’s still interesting to imagine how well we might potentially do.

n-Category Café Inner Automorphisms of the Octonions

What are the inner automorphisms of the octonions?

Of course this is an odd question. Since the octonions are nonassociative you might fear the map

f : \mathbb{O} \to \mathbb{O}

given by

f(x) = g x g^{-1}

for some octonion g \ne 0 is not even well-defined!

But it is.

The reason is that the octonions are alternative: the unital subalgebra generated by any two octonions is associative. Furthermore, the inverse g^{-1} of g \ne 0 is in the unital subalgebra generated by g. This follows from

g^{-1} = \frac{1}{|g|^2} \overline{g}

and the fact that \overline{g} is in the unital subalgebra generated by g, since we can write g = a + b where a is a real multiple of the identity and b is purely imaginary, and then \overline{g} = a - b = 2a - g.

It follows that whenever g is a nonzero octonion, we have

(g x) g^{-1} = g (x g^{-1})

for all octonions x, so we can write either side as

f(x) = g x g^{-1} .

However, there is no reason a priori to expect f to be an automorphism, meaning

f(xy) = f(x) f(y)

for all x, y \in \mathbb{O}. For which octonions g does this happen?

Of course it happens when g is real, i.e. a real multiple of 1. But that's boring, because then f is the identity. Can we find more interesting inner automorphisms of the octonions?

A correspondent, Charles Wynn, told me that f is an automorphism when

g = \frac{1}{2} + \frac{\sqrt{3}}{2} i

and i \in \mathbb{O} is any element with i^2 = -1. This kind of element g is a particular sort of 6th root of unity in the octonions: one that lies at a 60^\circ angle from the positive real axis.
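Wynn's claim is easy to spot-check numerically. Here is a minimal sketch of my own (not from any of the papers involved): it builds the octonions by one Cayley-Dickson doubling of the quaternions, using one standard choice of sign conventions, takes i to be the first standard imaginary unit, and tests f(xy) = f(x)f(y) on random octonions.

```python
# Numerical spot-check that conjugation by g = 1/2 + (sqrt(3)/2) i acts as an
# automorphism of the octonions.  Octonions are represented as length-8 arrays,
# i.e. pairs of quaternions, multiplied via a Cayley-Dickson doubling formula.

import numpy as np

def q_mul(p, q):
    """Quaternion product; p and q are arrays [w, x, y, z]."""
    w1, x1, y1, z1 = p
    w2, x2, y2, z2 = q
    return np.array([
        w1*w2 - x1*x2 - y1*y2 - z1*z2,
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 - x1*z2 + y1*w2 + z1*x2,
        w1*z2 + x1*y2 - y1*x2 + z1*w2,
    ])

def q_conj(q):
    return np.array([q[0], -q[1], -q[2], -q[3]])

def o_mul(A, B):
    """Octonion product: (a,b)(c,d) = (ac - d b*, a* d + c b)."""
    a, b = A[:4], A[4:]
    c, d = B[:4], B[4:]
    return np.concatenate([q_mul(a, c) - q_mul(d, q_conj(b)),
                           q_mul(q_conj(a), d) + q_mul(c, b)])

def o_conj(A):
    return np.concatenate([q_conj(A[:4]), -A[4:]])

def o_inv(A):
    return o_conj(A) / np.dot(A, A)

g = np.zeros(8)
g[0], g[1] = 0.5, np.sqrt(3) / 2          # g = 1/2 + (sqrt(3)/2) i, a 6th root of 1
g_inv = o_inv(g)

def f(x):
    return o_mul(o_mul(g, x), g_inv)      # (g x) g^{-1} = g (x g^{-1}) by alternativity

rng = np.random.default_rng(0)
for _ in range(1000):
    x, y = rng.normal(size=8), rng.normal(size=8)
    assert np.allclose(f(o_mul(x, y)), o_mul(f(x), f(y)))
print("f(xy) = f(x) f(y) holds numerically for 1000 random pairs")
```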

A bit of digging revealed this paper:

In Theorem 2.1, Lamont claims that f(x) = g x g^{-1} is an automorphism of the octonions if and only if g is either real or

(\mathrm{Re}(g))^2 = \frac{1}{4} |g|^2.

In other words, f is an automorphism iff the octonion g lies at an angle of 0^\circ, 60^\circ, 120^\circ or 180^\circ from the positive real axis. These cases include all 6th roots of unity in the octonions!

I haven’t fully checked the proof, but it seems to use little more than the Moufang identity.

I wonder what this fact means? How do these inner automorphisms sit inside the group of all automorphisms of the octonions, \mathrm{G}_2?

November 23, 2022

Scott Aaronson Dismantling Quantum Hype with Tim Nguyen

Happy Thanksgiving to my American readers! While I enjoy a family holiday-week vacation in exotic Dallas—and yes, I will follow up on my old JFK post by visiting Dealey Plaza—please enjoy the following Thanksgiving victuals:

I recently recorded a 3-hour (!) YouTube video with Timothy Nguyen, host of the Cartesian Cafe. Our episode is entitled Quantum Computing: Dismantling the Hype. In it, I teach a sort of extremely compressed version of my undergraduate Intro to Quantum Information Science course, unburdening myself about whatever Tim prompts me to explain: the basic rules of quantum information, quantum circuits, the quantum black-box model, the Deutsch-Jozsa algorithm, BQP and its relationship to classical complexity classes, and sampling-based quantum supremacy experiments. This is a lot more technical than an average podcast, a lot less technical than an actual course, and hopefully just right for some nonempty subset of readers.

Outside of his podcasting career, some of you might recognize Nguyen as the coauthor, with Theo Polya, of a rebuttal of “Geometric Unity.” This latter is the proposal by the financier, podcaster, and leading “Intellectual Dark Web” figure Eric Weinstein for a unified theory of particle physics. Now, I slightly know Weinstein, and have even found him fascinating, eloquent, and correct about various issues. So, in an addendum to the main video, Nguyen chats with me about his experience critiquing Weinstein’s theory, and also about something where my knowledge is far greater: namely, my 2002 rebuttal of some of the central claims in Stephen Wolfram’s A New Kind of Science, and whether there are any updates to that story twenty years later.

Enjoy!

David Hogg Iron Snail (tm)

My day started (at 07:00) with a call with Neige Frankel (CITA) and Scott Tremaine (IAS) about our project to understand the phase-space spiral in the vertical kinematics of the disk in terms of metallicity, element abundances, stellar ages, and so on. Indeed, we have a general argument that any non-equilibrium perturbation of the Galaxy, winding up into a spiral, will show a metallicity (or other stellar-label) effect, provided that there were gradients in the metallicity (or other label) with respect to stellar density, or phase-space density, or orbital actions. The argument is exceedingly general; I want to write a paper with wide scope. Tremaine is careful with his conclusions; he wants to write a paper with narrow scope. We argued. The data (compiled and visualized very cleverly by Frankel) are beautiful.

November 22, 2022

Scott Aaronson Reform AI Alignment

Update (Nov. 22): Theoretical computer scientist and longtime friend-of-the-blog Boaz Barak writes to tell me that, coincidentally, he and Ben Edelman just released a big essay advocating a version of “Reform AI Alignment” on Boaz’s Windows on Theory blog, as well as on LessWrong. (I warned Boaz that, having taken the momentous step of posting to LessWrong, in 6 months he should expect to find himself living in a rationalist group house in Oakland…) Needless to say, I don’t necessarily endorse their every word or vice versa, but there’s a striking amount of convergence. They also have a much more detailed discussion of (e.g.) which kinds of optimization processes they consider relatively safe.


Nearly halfway into my year at OpenAI, still reeling from the FTX collapse, I feel like it’s finally time to start blogging my AI safety thoughts—starting with a little appetizer course today, more substantial fare to come.

Many people claim that AI alignment is little more than a modern eschatological religion—with prophets, an end-times prophecy, sacred scriptures, and even a god (albeit, one who doesn’t exist quite yet). The obvious response to that claim is that, while there’s some truth to it, “religions” based around technology are a little different from the old kind, because technological progress actually happens regardless of whether you believe in it.

I mean, the Internet is sort of like the old concept of the collective unconscious, except that it actually exists and you’re using it right now. Airplanes and spacecraft are kind of like the ancient dream of Icarus—except, again, for the actually existing part. Today GPT-3 and DALL-E2 and LaMDA and AlphaTensor exist, as they didn’t two years ago, and one has to try to project forward to what their vastly-larger successors will be doing a decade from now. Though some of my colleagues are still in denial about it, I regard the fact that such systems will have transformative effects on civilization, comparable to or greater than those of the Internet itself, as “already baked in”—as just the mainstream position, not even a question anymore. That doesn’t mean that future AIs are going to convert the earth into paperclips, or give us eternal life in a simulated utopia. But their story will be a central part of the story of this century.

Which brings me to a second response. If AI alignment is a religion, it’s now large and established enough to have a thriving “Reform” branch, in addition to the original “Orthodox” branch epitomized by Eliezer Yudkowsky and MIRI.  As far as I can tell, this Reform branch now counts among its members a large fraction of the AI safety researchers now working in academia and industry.  (I’ll leave the formation of a Conservative branch of AI alignment, which reacts against the Reform branch by moving slightly back in the direction of the Orthodox branch, as a problem for the future — to say nothing of Reconstructionist or Marxist branches.)

Here’s an incomplete but hopefully representative list of the differences in doctrine between Orthodox and Reform AI Risk:

(1) Orthodox AI-riskers tend to believe that humanity will survive or be destroyed based on the actions of a few elite engineers over the next decade or two.  Everything else—climate change, droughts, the future of US democracy, war over Ukraine and maybe Taiwan—fades into insignificance except insofar as it affects those engineers.

We Reform AI-riskers, by contrast, believe that AI might well pose civilizational risks in the coming century, but so does all the other stuff, and it’s all tied together.  An invasion of Taiwan might change which world power gets access to TSMC GPUs.  Almost everything affects which entities pursue the AI scaling frontier and whether they’re cooperating or competing to be first.

(2) Orthodox AI-riskers believe that public outreach has limited value: most people can’t understand this issue anyway, and will need to be saved from AI despite themselves.

We Reform AI-riskers believe that trying to get a broad swath of the public on board with one’s preferred AI policy is something close to a deontological imperative.

(3) Orthodox AI-riskers worry almost entirely about an agentic, misaligned AI that deceives humans while it works to destroy them, along the way to maximizing its strange utility function.

We Reform AI-riskers entertain that possibility, but we worry at least as much about powerful AIs that are weaponized by bad humans, which we expect to pose existential risks much earlier in any case.

(4) Orthodox AI-riskers have limited interest in AI safety research applicable to actually-existing systems (LaMDA, GPT-3, DALL-E2, etc.), seeing the dangers posed by those systems as basically trivial compared to the looming danger of a misaligned agentic AI.

We Reform AI-riskers see research on actually-existing systems as one of the only ways to get feedback from the world about which AI safety ideas are or aren’t promising.

(5) Orthodox AI-riskers worry most about the “FOOM” scenario, where some AI might cross a threshold from innocuous-looking to plotting to kill all humans in the space of hours or days.

We Reform AI-riskers worry most about the “slow-moving trainwreck” scenario, where (just like with climate change) well-informed people can see the writing on the wall decades ahead, but just can’t line up everyone’s incentives to prevent it.

(6) Orthodox AI-riskers talk a lot about a “pivotal act” to prevent a misaligned AI from ever being developed, which might involve (e.g.) using an aligned AI to impose a worldwide surveillance regime.

We Reform AI-riskers worry more about such an act causing the very calamity that it was intended to prevent.

(7) Orthodox AI-riskers feel a strong need to repudiate the norms of mainstream science, seeing them as too slow-moving to react in time to the existential danger of AI.

We Reform AI-riskers feel a strong need to get mainstream science on board with the AI safety program.

(8) Orthodox AI-riskers are maximalists about the power of pure, unaided superintelligence to just figure out how to commandeer whatever physical resources it needs to take over the world (for example, by messaging some lab over the Internet, and tricking it into manufacturing nanobots that will do the superintelligence’s bidding).

We Reform AI-riskers believe that, here just like in high school, there are limits to the power of pure intelligence to achieve one’s goals.  We’d expect even an agentic, misaligned AI, if such existed, to need a stable power source, robust interfaces to the physical world, and probably allied humans before it posed much of an existential threat.

What have I missed?

Doug Natelson The need for energy-efficient computing

Computing is consuming a large and ever-growing fraction of the world's energy capacity.

I've seen the essential data in this figure several times over the last few months, and it has convinced me that the need for energy-efficient computing hardware is genuinely pressing.  This is from a report by the Semiconductor Research Corporation from 2020.  It argues that if computing needs continue to grow at the present rate, then by the early 2030s something like 10% of all of the world's energy production (and therefore something like 40% of the world's electricity production) will be tied up in computing hardware.  (ZIPs = \(10^{21}\) instructions per second)

Now, we all know the dangers of extrapolation.  Still, this trend tells us that something is going to change drastically - either the rate at which computing power grows will slow dramatically, or we will be compelled to find a much more energy-efficient computational approach, or some intermediate situation will develop.  (Note:  getting rid of crypto currencies sure wouldn't hurt, as they are incredibly energy-hungry and IMO have virtually zero positive contributions to the world, but that just slows the timeline.)

I've written before about neuromorphic computing as one approach to this problem.  Looking at neural nets as an architectural model is not crazy - your brain consumes about 12 W of power continuously, but it is far better at certain tasks (e.g. identifying cat pictures) than much more power-hungry setups.  Here is a nice article from Quanta on this, referencing a recent Nature paper.  Any big change will likely require the adoption of new materials and therefore new manufacturing processes.  Just something to bear in mind when people ask why anyone is studying the physics of electronic materials.

John Preskill A peek inside Northrop Grumman’s subatomic endeavors

As the weather turns colder and we trade outdoor pools for pumpkin spice and then Christmas carols, perhaps you’re longing for summer’s warmth. For me, it is not just warmth I yearn for: This past summer, I worked as a physics intern at Northrop Grumman. With the internship came invaluable lessons and long-lasting friendships formed in a unique environment that leverages quantum computing in industry.

More on that in a bit. First, allow me to introduce myself. My name is Jade LeSchack, and I am an undergraduate physics major at the University of Maryland, College Park. I interact with Dr. Nicole Yunger Halpern’s group and founded the Undergraduate Quantum Association at UMD, a student organization for those interested in quantum science and technology. 

Undergraduate Quantum Association Vice President, Sondos Quqandi (right), and me hosting the quantum track of the Bitcamp hackathon

Back to Northrop Grumman. Northrop Grumman’s work as a defense contractor has led them to join the global effort to harness the power of quantum computing through their transformational-computing department, which is where I worked. Northrop Grumman is approaching quantum computing via proprietary superconducting technology. Superconductors are special types of conductors that can carry electric current with zero resistance when cooled to very low temperatures. We’re talking one hundred times colder than outer space. Superconducting electronics are brought to almost-absolute-zero temperatures using a dilution refrigerator, a machine that, frankly, looks closer to a golden chandelier than an appliance for storing your perishables.

An example of the inside of a dilution refrigerator

I directly worked with these golden chandeliers for one week during my internship. This week entailed shadowing staff physicists and was my favorite week of the internship. I shadowed Dr. Amber McCreary as she ran experiments with the dilution fridges and collected data. Amber explained all the steps of her experiments and answered my numerous questions.

Working in the transformational-computing unit, I had physicists from a variety of backgrounds at my disposal. These physicists hailed from across the country — with quite a few from my university — and were welcoming and willing to show me the ropes. The structure of the transformational-computing department was unlike what I have seen with academia since the department is product-oriented. Some staff manned a dilution fridge, while others managed products stemming from the superconductor research.

Outside this week in the lab, I worked on my chosen, six-week-long project: restructuring part of the transformational-computing codebase. Many transformational-computing experiments require curve fitting, which is finding the curve of best fit through a set of data points. Pre-written algorithms can perform curve fitting for certain equations, such as polynomial equations, but it is harder for more-complicated equations. I worked with a fellow intern named Thomas, and our job was to tackle these more-complicated equations. Although I never saw the dilution fridges again, I gained many programming skills and improved programs for the transformational-computing department. 

The internship was not all work and no play. The memories I made and connections I forged will last much longer than the ten weeks in which they were founded. Besides the general laughs, there were three happy surprises I’d like to share. The first was lunch-time ultimate frisbee. I play ultimate frisbee on the University of Maryland women’s club team, and when my manager mentioned there was a group at Northrop Grumman who played during the week, I jumped on the chance to join. 

The second happy surprise involved a frozen treat. On a particularly long day of work, my peers and I scoured a storage closet in the office on an office-supplies raid. What we found instead of supplies was an ice-cream churner. Since the COVID lock-down, a hobby of mine that I have avidly practiced has been ice-cream making. A rediscovered ice-cream churner plus an experienced ice-cream maker brought three ice-cream days for the office. Naturally, they were huge successes! 

And last, I won an Emmy. 

Me winning an Emmy

Well, not quite.

I was shocked when, after a team lunch, my manager turned to the intern team and nonchalantly said, “Let’s go see if the Emmy is available.” I was perplexed but intrigued, and my manager explained that Northrop Grumman had won an Emmy for science in advancing cinematic technology. And it turned out that the Emmy was available for photographs! We were all excited; this was probably the only time we would hold a coveted cinema award reserved for the red carpet.

Not only did I contribute to Northrop Grumman’s quantum efforts, but I also played ultimate frisbee and held an Emmy. Interning at Northrop Grumman was a wonderful opportunity that has left me with new quantum knowledge and fond memories. 

Terence Tao Trying out Mathstodon

It’s been a while since I’ve actively participated in social media outside of this blog – I was active in Google Buzz/Google+ for a while, until that service closed – but I’ve decided to try out Mathstodon, one of the servers of the open source social media software platform Mastodon. As I understand it, Mastodon functions in many ways similar to the significantly more well-known platform Twitter, but is decentralized into a federation of servers that share content with each other but can have their own moderation rules and add-ons. For instance, the Mathstodon server has the additional feature of supporting LaTeX in its posts. Another consequence of this decentralization is that if one for some reason ends up disagreeing with the administration of the server one is in, one has the option of transferring one’s account to a different server while staying on the same platform.

I just created an account at Mathstodon and it currently has very little content, but I hope to add some soon (though I will probably not be as prolific as some other mathematicians already on that site, such as John Baez or Nalini Joshi).

November 21, 2022

Matt Strassler The Artemis Rocket Launch and Particle Physics

A post for general readers:

The recent launch of NASA’s new moon mission, Artemis 1, is mostly intended to demonstrate that NASA’s incredibly expensive new rocket system will actually work and be safe for humans to travel in. But along the way, a little science will be done. The Orion spacecraft at the top of the giant rocket, which will actually make the trip to the Moon and back and will carry astronauts in future missions, has a few scientific instruments of its own. Not surprisingly, though, most are aimed at monitoring the environment that future astronauts will encounter. But meanwhile the mission is delivering ten shoe-box-sized satellites (“CubeSats“) which will carry out various other scientific and/or technological investigations. A number of these involve physics, and a few directly employ particle physics.

The use of particle physics detectors for the purpose of studying the not-so-empty space around the Moon and Earth is no surprise. Near any star like the Sun, what we think of as the vacuum of space (and biologically speaking, it is vacuum: no air and hardly any atoms, making it unsurvivable as well as silent) is actually swarming with subatomic particles. Well, perhaps “swarming” is an overstatement. But nevertheless, if you want to understand the challenges to humans and equipment in the areas beyond the Earth, you’ll inevitably be doing particle physics. That’s what a couple of the CubeSats will be studying, entirely or in part.

What’s more of a surprise is that one of the best ways to find water on the Moon without actually landing on it involves putting particle physics to use. Although the technique is not new, it’s not so obvious or widely known, so I thought I’d draw your attention to it.

The Lunar Polar Hydrogen Mapper (LunaH-Map)

Designed at Arizona State University, the LunaH-Map CubeSat will look for water on the Moon, using a tried and true technique known as “neutron spectroscopy”. The strategy relies from the start on particle physics, taking advantage of the existence of “cosmic rays”, which are (mainly) protons and atomic nuclei traveling at near the speed of light across the universe. These particles are accelerated to extreme speeds by natural particle accelerators found in supernovas and perhaps elsewhere. They may travel for many thousands of years across the galaxy, or even longer from outside our galaxy, before reaching our vicinity. The Sun and its planets and moons are all constantly being peppered by these particles.

When an ultra-high-energy proton (shown arriving from above) enters material, it collides with atomic nuclei, creating a dramatic shower of other particles, including protons and neutrons, of somewhat lower energy. [Source]

On Earth, most cosmic rays strike an atom in the atmosphere before they reach the ground. (The debris from these collisions allowed scientists to discover a number of subatomic particles, such as the positron and the muon, and they play a role in many modern experiments, such as this one.) Since the Moon has no atmosphere to speak of, cosmic rays instead slam straight into the lunar dirt.

What ensues is a “hadronic shower”, a natural particle physics process similar to that found at the Large Hadron Collider, within the “hadron calorimeters” of the ATLAS or CMS experiments. (These portions of the ATLAS and CMS detectors measure the energies of hadrons, particles containing quarks, antiquarks and gluons.) A computer simulation of a hadron shower is shown at left. How does a shower arise?

When a high-energy proton hits an atomic nucleus (typically within a meter of the lunar surface), it breaks the nucleus apart into protons, neutrons and smaller atomic nuclei. Typically some of the remnants now have enough energy themselves to break apart nearby atomic nuclei, whose remnants break apart further nuclei, etc. The result is that the cosmic ray’s large amount of energy is transformed into a shower of protons, neutrons and other nuclear fragments. Whereas the original cosmic ray was moving at nearly the cosmic speed limit (a.k.a. the speed of light, 300,000 kilometers per second), the particles in the shower are typically moving much more slowly, perhaps 10 to 1,000 kilometers per second — still fast by human standards, but far below light speed.

Not surprisingly, since the cosmic ray comes from above, most of the particles in this shower move downward into the lunar soil. They collide with other atomic nuclei and eventually slow to a stop. But there are always a few protons and neutrons that by chance have a collision that knocks them upwards. Consequently, even a downward-directed cosmic ray shower will produce some particles that make their way out of the ground and back into space. Once they get out, there’s nothing to stop them, since there’s no air around the Moon. A spacecraft going by just overhead can hope to detect them as they head out into empty space.

The basic trick of LunaH-Map, which I’ll explain in a moment, is that if the ground that the cosmic ray struck contains hydrogen, any upward-going particles will be slower on average than if there’s no hydrogen there. Most lunar soil has no hydrogen, as determined by various missions to the moon. But lunar soil that contains water ice in it will have plenty of hydrogen, since water is hydrogen and oxygen (H2O). So if you can detect particles of certain speeds as you pass over a certain part of the Moon, you can tell whether there’s water ice embedded in the soil.

To explain all this, I need to tell you why particle speeds are sensitive to the presence of hydrogen, and which types of particles are best to measure.

Hydrogen and its Effect on Speed

The effect of hydrogen on particle speed involves something you probably understand intuitively, even if you were not taught it in a first-year physics class. It’s illustrated in the figure below.

  • If you bounce a ping-pong ball off a heavy, stationary rock, the rock won’t budge, and the ping-pong ball will bounce off it in a new direction but without losing any of its speed.
  • If you bounce a ping-pong ball off a second, stationary ping-pong ball, both ping-pong balls will be moving after the collision, and the speed of both balls will be less than the initial speed of the first ball.

(For those who’ve had a little physics: these facts are required by conservation of energy and momentum. In the first case the rock absorbs almost none of the ping-pong ball’s kinetic energy, so the ball retains what it had before, as for a tennis ball bouncing off a wall; in the second case, the first ping-pong ball loses a significant fraction of its kinetic energy to the second one.)

Top: A ping-pong ball striking a rock will bounce off in a new direction but with the same speed. Bottom: When a ping-pong ball strikes a second, stationary ping-pong ball, the two balls will both end up moving, though more slowly than the initial motion.

Imagine, then, a proton or a neutron that emerges from the shower of particles that follows a cosmic ray impact. Much slower than the original cosmic ray, it is still moving at many kilometers per second. What happens as it repeatedly strikes atoms in the soil?

Protons and neutrons have about the same mass. Typical atomic nuclei in the Moon, such as oxygen or silicon, contain more than ten protons and neutrons, and so, with much larger masses than a single proton or neutron, they act like a heavy rock. Our speedy neutron or proton will bounce off such a nucleus without slowing down. (A minor detail: It may not always simply bounce; other things may happen which can slow it down somewhat, but not enough to affect what I’m about to tell you.)

But because hydrogen’s nucleus is itself just a single proton, the collision of a proton or neutron with a hydrogen nucleus is like the collision of two ping-pong balls — two objects of equal or nearly equal mass. The result: the one that’s moving will lose on average half its energy, or about 30% of its speed, relative to the lunar surface. If there’s a lot of hydrogen in the soil, then this process may happen repeatedly to most protons and neutrons in the cosmic ray shower. And so protons and neutrons emerging from soil rich in hydrogen are on average much slower compared to those emerging from soil that’s poor in hydrogen.
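For readers who want the arithmetic behind that 30% figure (my own aside, not part of the original post): in a non-relativistic elastic collision where a moving particle strikes an equal-mass particle at rest, the moving particle keeps a fraction cos²θ of its kinetic energy, with θ its lab-frame scattering angle; if the scattering is roughly isotropic in the center-of-mass frame, that fraction is uniformly distributed between 0 and 1, so

\displaystyle{ \frac{E'}{E} = \cos^2\theta , \qquad \left\langle \frac{E'}{E} \right\rangle = \frac{1}{2} }

and the speed corresponding to the average retained energy is v/√2 ≈ 0.71 v, i.e. roughly the 30% loss of speed quoted above. A collision with a much heavier nucleus, by contrast, leaves the energy essentially unchanged.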

Measuring Neutrons

The LunaH-MAP cubesat, like many spacecraft before it, is looking for this effect on neutrons. Why on neutrons and not on protons? Because there are protons everywhere around the Moon, streaming in from the Sun and from elsewhere. Neutrons, by contrast, only can exist on their own (as opposed to inside a stable atomic nucleus) for about 15 minutes. Consequently, any neutrons from the Sun or other distant source won’t make it to the Moon. So any neutrons near the Moon, even some distance overhead, are much more likely to have come from the Moon than from anywhere else.

LunaH-MAP comes quite close to the Moon’s surface (just a few kilometers above it), which allows it to examine the Moon in considerable detail. All it does, as it flies over the surface, is count how many neutrons it encounters. What’s crucial is that it is only sensitive to neutrons of moderate to high speed, and it can’t detect the slow ones (slower than about 7 kilometers per second.) Above most regions of the Moon, where the heat of the Sun quickly vaporizes any water ice and releases it to space, the spacecraft will find many neutrons of moderate speed, dislodged by cosmic rays and leaking out from the surface. But in craters near the poles, where there are regions mostly or always in shadow, ice deposited by comets still remains; and there, as the spacecraft passes overhead, many of the neutrons will have been slowed down so much that LunaH-MAP can’t detect them. (This figure shows how dramatic this can be; if the first meter of dirt were just 3% ice, the number of moderate-speed neutrons would drop in half!) In this way, by counting the number of neutrons observed as it flies through the dark lunar sky, LunaH-MAP can distinguish hydrogen-poor soil below it from hydrogen-rich soil.

That’s how, without ever landing there, LunaH-MAP can give us a detailed map as to where hydrogen, and likely water ice, is to be found on the Moon. Similar techniques can and have been used on planets such as Mars and asteroids such as Ceres. Pretty cool, right? Just another great example of how seemingly exotic and esoteric discoveries of one century — cosmic rays were first observed in 1911, and the neutron was identified in 1932 — turn out to be essential tools in the next.

November 20, 2022

John Baez The Icosidodecahedron

The icosidodecahedron can be built by truncating either a regular icosahedron or a regular dodecahedron. It has 30 vertices, one at the center of each edge of the icosahedron—or equivalently, one at the center of each edge of a dodecahedron. It is a beautiful, highly symmetrical shape. But it is just a shadow of a more symmetrical shape with twice as many vertices, which lives in a space with twice as many dimensions! Namely, it is a projection down to 3d space of a 6-dimensional polytope with 60 vertices.

Even better, it is also a slice of a more symmetrical 4d polytope with 120 vertices, which in turn is the projection down to 4d space of an even more symmetrical 8-dimensional polytope with 240 vertices: the so-called ‘E8 root polytope’. Note how the numbers keep doubling: 30, 60, 120 and 240.

To understand all this, start with the group of rotational symmetries of the icosahedron. This is a 60-element subgroup of the rotation group SO(3), so it has a double cover, called the binary icosahedral group, consisting of 120 unit quaternions. With a suitable choice of coordinates, we can take these to be

\displaystyle{ \pm 1 , \quad \frac{\pm 1 \pm i \pm j \pm k}{2}, \quad \frac{\pm i \pm \phi j \pm \Phi k}{2} }

together with everything obtained from these by even permutations of 1, i, j, and k, where

\displaystyle{ \phi = \frac{\sqrt{5} - 1}{2}, \quad \Phi = \frac{\sqrt{5} + 1}{2} }

are the ‘little’ and ‘big’ golden ratios, respectively. These 120 unit quaternions are the vertices of a convex polytope in 4 dimensions. In fact this is a regular polytope, called the 600-cell since it has 600 regular tetrahedra as faces.

If we slice the 600-cell halfway between two of its opposite vertices, we get an icosidodecahedron. This is easiest to see by intersecting the 600-cell with the space of purely imaginary quaternions

\{ ai + bj + ck : \; a,b,c \in \mathbb{R} \}

Of the 600-cell’s vertices, those that lie in this 3-dimensional space are

\pm i, \pm j, \pm k

which form the corners of an octahedron, and

\displaystyle{ \frac{\pm i \pm \phi j \pm \Phi k}{2} ,  \quad  \frac{\pm j \pm \phi k \pm \Phi i}{2} , \quad  \frac{\pm k \pm \phi i \pm \Phi j}{2}   }

which form the corners of three ‘golden boxes’. A golden box is the 3d analogue of a golden rectangle: its three sides are in the proportions \phi, 1 and \Phi.

It is well-known that these points are the vertices of an icosidodecahedron. Here are the three golden boxes and octahedron inscribed in an icosidodecahedron, as drawn by Rahul Narain:

But we are not done with the binary icosahedral group—far from it!

Integer linear combinations of these 120 elements of the quaternions form a subring of the quaternions, which Conway and Sloane [CS] call the icosians. Since any icosian can be written as a + bi + cj + dk where the numbers a,b,c,d \in \mathbb{R} are of the form x + y \sqrt{5} with x,y rational, any icosian gives an 8-tuple of rational numbers. However, we do not get all 8-tuples of rationals this way, only those lying in a certain lattice in \mathbb{R}^8. And there is a way to think of this lattice as a rescaled copy of the famous E8 lattice! To do this, Conway and Sloane put a new norm on the icosians as follows. The usual quaternionic norm is

\|a + bi + cj + dk\|^2 = a^2 + b^2 + c^2 + d^2

But for an icosian this norm is always of the form x + \sqrt{5} y for some rationals x and y. Conway and Sloane define a new norm on the icosians by setting

|a + bi + cj + dk|^2 = x + y

With this new norm, Conway and Sloane show the icosians are isomorphic to a rescaled version of the E8 lattice in \mathbb{R}^8.

The 240 shortest nonzero vectors in this lattice are the vertices of an 8-dimensional convex polytope called the E8 root polytope:

However, if we remember that each of these 240 vectors came from a quaternion, we can also think of them as 240 quaternions. These turn out to be the vertices of two 600-cells in the quaternions! In the usual quaternionic norm, one of these 600-cells is larger than the other by a factor of \Phi.

In fact, there is an orthogonal projection from \mathbb{R}^8 down to \mathbb{R}^4 that maps the E8 root polytope to the 600-cell. So, in a very real sense, the 600-cell is the ‘shadow’ of a polytope with twice as many vertices, living in a space whose dimension is twice as large. And as a spinoff, this fact gives the same sort of relationship between the icosidodecahedron and a 6-dimensional polytope.

The key is to look at pure imaginary icosians: those of the form a i + b j + c k for real a,b,c. Since a,b and c are each of the form x + \sqrt{5}y with x and y rational, any pure imaginary icosian gives a 6-tuple of rational numbers. We do not get all 6-tuples of rationals this way, but only those lying in a certain lattice. We have

\|ai + bj + ck\|^2 = a^2 + b^2 + c^2

For a pure imaginary icosian this is always of the form x + \sqrt{5} y for some rationals x and y. So, we can define a new norm on the pure imaginary icosians by

|ai + bj + ck|^2 = x + y

With this new norm, the pure imaginary icosians are isomorphic to a rescaled version of a familiar lattice in \mathbb{R}^6, called the ‘D6 lattice’.

The 60 shortest nonzero vectors in the D6 lattice are called the roots of D6, and they are the vertices of a 6-dimensional convex polytope called the D6 root polytope. There is an orthogonal projection from \mathbb{R}^6 to \mathbb{R}^3 that maps this polytope to an icosidodecahedron. In fact 30 vertices of the D6 root polytope map to the vertices of this icosidodecahedron, while the other 30 map to vertices of a second, smaller icosidodecahedron.

Here is an image of the setup, created by Greg Egan:

Let’s see some details! The usual coordinatization of the D6 lattice in Euclidean \mathbb{R}^6 is

\mathrm{D}_6 = \left\{ (x_1, \dots, x_6) : \; x_i  \in \mathbb{Z}, \; \sum_i x_i \in 2\mathbb{Z} \right\} \subset \mathbb{R}^6

The roots of D6 are

(\pm 1, \pm 1, 0, 0, 0, 0)

and all vectors obtained by permuting the six coordinates. We shall see that these vectors are sent to the vertices of an icosidodecahedron by the linear map T \colon  \mathbb{R}^6 \to \mathbb{R}^3 given as a 3 × 6 matrix by

\left( \begin{array}{cccccc}  \Phi &  \Phi  & -1 & -1 & 0 &  0 \\  0 &  0  & \Phi &  -\Phi & -1 & 1 \\  -1 &  1 &  0 &  0 &  \Phi  & \Phi  \end{array} \right)

The rows of this matrix are orthogonal, all with the same norm, so after rescaling it by a constant factor we obtain an orthogonal projection. The columns of this matrix are six vertices of an icosahedron, chosen so that we never have a vertex and its opposite. For any pair of columns, they are either neighboring vertices of the icosahedron, or a vertex and the opposite of a neighboring vertex.

The map T thus sends any D6 root to either the sum or the difference of two neighboring icosahedron vertices. In this way we obtain all possible sums and differences of neighboring vertices of the icosahedron. It is easy to see that the sums of neighboring vertices give the vertices of an icosidodecahedron, since by definition the icosidodecahedron has vertices at the midpoints of the edges of a regular icosahedron. It is less obvious that the differences of neighboring vertices of the icosahedron give the vertices of a second, smaller icosidodecahedron. But thanks to the symmetry of the situation, we can check this by considering just one example. In fact the vectors defining the vertices of the larger icosidodecahedron turn out to be precisely \Phi times the vectors defining the vertices of the smaller one!
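Here is a quick numerical confirmation of that last statement (my own script, not from the article): apply T to all 60 roots of D6 and check that the images fall on exactly two norm shells of 30 vectors each, with norms in the ratio Φ.

```python
# Check that T maps the 60 roots of D6 onto two icosidodecahedra of size ratio Phi.

import itertools
import numpy as np

Phi = (1 + np.sqrt(5)) / 2

T = np.array([
    [ Phi,  Phi,  -1,   -1,    0,    0],
    [   0,    0,  Phi, -Phi,  -1,    1],
    [  -1,    1,   0,    0,   Phi,  Phi],
])

# the roots of D6: all vectors with two entries equal to +-1 and the rest 0
roots = []
for i, j in itertools.combinations(range(6), 2):
    for si, sj in itertools.product([1, -1], repeat=2):
        r = np.zeros(6)
        r[i], r[j] = si, sj
        roots.append(r)

images = T @ np.array(roots).T            # shape (3, 60)
norms = np.linalg.norm(images, axis=0)

small, large = np.unique(np.round(norms, 6))
print(len(roots))                                      # 60 roots
print(np.sum(np.isclose(norms, small)),
      np.sum(np.isclose(norms, large)))                # 30 on each shell
print(large / small, Phi)                              # the ratio is Phi
```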

The beauties we have just seen are part of an even larger pattern relating all the non-crystallographic Coxeter groups to crystallographic Coxeter groups. For more, see the work of Fring and Korff [FK1,FK2], Boehm, Dechant and Twarock [BDT] and the many papers they refer to. Fring and Korff apply these ideas to integrable systems in physics, while the latter authors explore connections to affine Dynkin diagrams. For more relations between the icosahedron and E8, see [B2].

Acknowledgements

I thank Greg Egan for help with developing these ideas. The spinning icosidodecahedron was created by Cyp and was put on Wikicommons with a Creative Commons Attribution-Share Alike 3.0 Unported license. The 600-cell was made using Robert Webb’s Stella software and is on Wikicommons. The icosidodecahedron with three golden boxes and an octahedron inscribed in it was created by Rahul Narain on Mathstodon. The projection of the 240 E8 roots to the plane was created by Claudio Rocchini and put on Wikicommons with a Creative Commons Attribution 3.0 Unported license. The spinning pair of icosidodecahedra was created by Greg Egan and appears in an earlier blog article on this subject [B1]. The article here is an expanded version of that earlier article: the only thing I left out is the matrix describing a linear map S \colon \mathbb{R}^8 \to \mathbb{R}^4 that when suitably rescaled gives a projection mapping the E8 lattice in its usual coordinatization

\{ x \in \mathbb{R}^8: \, \textrm{all } x_i \in \mathbb{Z} \textrm{ or all } x_i \in \mathbb{Z} + \frac{1}{2} \textrm{ and } \sum_i x_i \in 2\mathbb{Z} \}

to the icosians, and thus mapping the 240 E8 roots to two 600-cells. For completeness, here is that matrix:

\left( \begin{array}{cccccccc}  \Phi+1 & \Phi -1 & 0  & 0 &  0 &  0 &   0  & 0 \\  0 & 0 & \Phi &  \Phi  & -1 & -1 & 0 & 0 \\  0 & 0  & 0 &  0  & \Phi &  -\Phi & -1 & 1   \\  0 & 0 & -1 &  1 &  0 &  0 &  \Phi  & \Phi  \end{array} \right)
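As a sanity check on the phrase ‘suitably rescaled’, one can verify numerically that the rows of this matrix are mutually orthogonal and all of the same length, so that dividing by that common length gives an orthogonal projection from \mathbb{R}^8 to \mathbb{R}^4. A minimal sketch, again assuming numpy:

    import numpy as np

    Phi = (1 + 5**0.5) / 2
    S = np.array([
        [Phi+1, Phi-1,   0,    0,    0,    0,    0,    0],
        [    0,     0, Phi,  Phi,   -1,   -1,    0,    0],
        [    0,     0,   0,    0,  Phi, -Phi,   -1,    1],
        [    0,     0,  -1,    1,    0,    0,  Phi,  Phi],
    ])

    print(np.round(S @ S.T, 9))    # (2Φ + 4) times the 4×4 identity matrix
    print(2*Phi + 4)               # the common squared row length, ≈ 7.236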

The first image at the bottom of this post was also created by Greg Egan, on Mathstodon. The second shows an icosahedron and 3 golden rectangles morphing to an icosidodecahedron with 3 golden boxes, with an octahedron present at every stage. It was created by Vincent Pantaloni, on GeoGebra.

References

[B1] John Baez, Icosidodecahedron from D6, Visual Insight, January 1, 2015.

[B2] John Baez, From the icosahedron to E8, London Math. Soc. Newsletter 476 (2018), 18–23.

[BDT] Celine Boehm, Pierre-Philippe Dechant and Reidun Twarock, Affine extensions of non-crystallographic Coxeter groups induced by projection, J. Math. Phys. 54, 093508 (2013).

[CS] John H. Conway and Neil J. A. Sloane, Sphere Packings, Lattices and Groups, Springer, Berlin, 2013.

[FK1] Andreas Fring and Christian Korff, Affine Toda field theories related to Coxeter groups of non-crystallographic type, Nucl. Phys. B729 (2005), 361–386.

[FK2] Andreas Fring and Christian Korff, Non-crystallographic reduction of generalized Calogero–Moser models, J. Phys. A39 (2006) 1115–1132.

November 19, 2022

David Hogghalo mass assembly

On Fridays, Kate Storey-Fisher (NYU) organizes a small meeting to discuss her projects on dark-matter halos using equivariant scalar objects constructed from n-body simulation outputs. Today we included Yongseok Jo (Flatiron), who has worked on building tools to paint galaxies onto dark-matter-only n-body simulations. We discussed joint projects, and conceptual issues about mass-assembly histories. In particular, I am interested in how we can predict formation histories of dark-matter halos from the galaxy contents alone, or infer the dark matter distribution in phase space from the stellar distribution in phase space. I love these projects, because they combine growth of structure, gravitational dynamics, galaxy formation, and machine learning.

David Hoggnon-parametric model of the density of the Milky Way disk?

Danny Horta-Darrington (Flatiron) has been working with Adrian Price-Whelan (Flatiron) to measure things about abundances and dynamics of stars in the Milky Way disk. Horta is finding that there are way better abundance gradients, in way more directions in phase space, than previously have been (usefully) visualized. But along the way, he stumbled upon a plot that clearly shows the variation of the Milky Way thin disk density with radius. We discussed today how to make the simplest possible measurement of this, with a variation of Orbital Torus Imaging, or really a simplification of it. We realized today that there is enough data to just make this measurement in patches all over the (nearby) disk. The scale length looks short!

David Hoggcode and words on Standard Practice (tm)

I worked today on the code and text on my project with Andy Casey (Monash) on combining spectra. What I did today was code up and describe what I call Standard Practice (tm), which is to shift (interpolate) and coadd (average) your data.

November 18, 2022

Matt von HippelVisiting the IAS

I’m at the Institute for Advanced Study, or IAS, this week.

There isn’t a conference going on, but if you looked at the visitor list you’d be forgiven for thinking there was. We have talks in my subfield almost every day this week, two professors from my subfield here on sabbatical, and extra visitors on top of that.

The IAS is a bit of an odd place. Partly, that’s due to its physical isolation: tucked away in the woods behind Princeton, a half-hour’s walk from the nearest restaurant, it’s supposed to be a place for contemplation away from the hustle and bustle of the world.

Since the last time I visited they’ve added a futuristic new building, seen here out of my office window. The building is most notable for one wild promise: someday, they will serve dinner there.

Mostly, though, the weirdness of the IAS is due to the kind of institution it is.

Within a given country, most universities are pretty similar. Each may emphasize different teaching styles, and the US has a distinction between public and private, but (neglecting scammy for-profit universities), there are some commonalities of structure: both how they’re organized, and how they’re funded. Even between countries, different university systems have quite a bit of overlap.

The IAS, though, is not a university. It’s an independent institute. Neighboring Princeton supplies it with PhD students, but otherwise the IAS runs, and funds, itself.

There are a few other places like that around the world. The Perimeter Institute in Canada is also independent, and also borrows students from a neighboring university. CERN pools resources from several countries across Europe and beyond, Nordita from just the Nordic countries. Generalizing further, many countries have some sort of national labs or other nation-wide systems, from US Department of Energy labs like SLAC to Germany’s Max Planck Institutes.

And while universities share a lot in common, non-university institutes can be very different. Some are closely tied to a university, located inside university buildings with members with university affiliations. Others sit at a greater remove, less linked to a university or not linked at all. Some have their own funding, investments or endowments or donations, while others are mostly funded by governments, or groups of governments. I’ve heard that the IAS gets about 10% of its budget from the government, while Perimeter gets its everyday operating expenses entirely from the Canadian government and uses donations for infrastructure and the like.

So ultimately, the IAS is weird because every organization like it is weird. There are a few templates, and systems, but by and large each independent research organization is different. Understanding one doesn’t necessarily help at understanding another.

Scott Aaronson WINNERS of the Scott Aaronson Grant for Advanced Precollege STEM Education!

I’m thrilled to be able to interrupt your regular depressing programming for 100% happy news.

Some readers will remember that, back in September, I announced that an unnamed charitable foundation had asked my advice on how best to donate $250,000 for advanced precollege STEM education. So, just like the previous time I got such a request, from Jaan Tallinn’s Survival and Flourishing Fund, I decided to do a call for proposals on Shtetl-Optimized before passing along my recommendations.

I can now reveal that the generous foundation, this time around, was the Packard Foundation. Indeed, the idea and initial inquiries to me came directly from Dave Orr: the chair of the foundation, grandson of Hewlett-Packard cofounder David Packard, and (so I learned) longtime Shtetl-Optimized reader.

I can also now reveal the results. I was honored to get more than a dozen excellent applications. After carefully considering all of them, I passed along four finalists to the Packard Foundation, which preferred to award the entire allotment to a single program if possible. After more discussion and research, the Foundation then actually decided on two winners:

  • $225,000 for general support to PROMYS: the long-running, world-renowned summer math camp for high-school students, which (among other things) is in the process of launching a new branch in India. While I ended up at Canada/USA Mathcamp (which I supported in my first grant round) rather than PROMYS, I knew all about and admired PROMYS even back when I was the right age to attend it. I’m thrilled to be able to play a small role in its expansion.
  • $30,000 for general support to AddisCoder: the phenomenal program that introduces Ethiopian high-schoolers to programming and algorithms. AddisCoder was founded by UC Berkeley theoretical computer science professor and longtime friend-of-the-blog Jelani Nelson, and also received $30,000 in my first grant round. Jelani and his co-organizers will be pressing ahead with AddisCoder despite political conflict in Ethiopia including a recently-concluded civil war. I’m humbled if I can make even the tiniest difference.

Thanks so much to the Packard Foundation, and to Packard’s talented program officers, directors, and associates—especially Laura Sullivan, Jean Ries, and Prithi Trivedi—for their hard work to make this happen. Thanks so much also to everyone who applied. While I wish we could’ve funded everyone, I’ve learned a lot about programs to which I’d like to steer future support (other prospective benefactors: please email me!!), and to which I’d like to steer kids: my own, once they’re old enough, and other kids of my acquaintance.

I feel good that, in the tiny, underfunded world of accelerated STEM education, the $255,000 that Packard is donating will already make a difference. But of course, $255,000 is only a thousandth of $255 million, which is a thousandth of $255 billion. Perhaps I could earn the latter sort of sums, to donate to STEM education or any other cause, by (for example) starting my own cryptocurrency exchange. I hope my readers will forgive me for not having chosen that route, expected-utility-maximization arguments be damned.

November 17, 2022

Scott Aaronson Sneerers

In the past few weeks, I’ve learned two ways to think about online sneerers that have been helping me tremendously, and that I wanted to share in case they’re helpful to others:

First, they’re like a train in a movie that’s barreling directly towards the camera. If you haven’t yet internalized how the medium works, absolutely terrifying! Run from the theater! If you have internalized it, though, you can sit and watch without even flinching.

Second, the sneerers are like alligators—and about as likely to be moved by your appeals to reason and empathy. But if, like me, you’re lucky enough to have a loving family, friends, colleagues, and a nigh-uncancellable career, then it’s as though you’re standing on a bridge high above, looking down at the gators as they snap their jaws at you uselessly. There’s really no moral or intellectual obligation to go down to the swamp to wrestle them. If they mean to attack you, let them at least come up to the bridge.

Scott Aaronson Sam Bankman-Fried and the geometry of conscience

Update (Nov. 16): Check out this new interview of SBF by my friend and leading Effective Altruist writer Kelsey Piper. Here Kelsey directly confronts SBF with some of the same moral and psychological questions that animated this post and the ensuing discussion—and, surely to the consternation of his lawyers, SBF answers everything she asks. And yet I still don’t know what exactly to make of it. SBF’s responses reveal a surprising cynicism (surprising because, if you’re that cynical, why be open about it?), as well as an optimism that he can still fix everything that seems wildly divorced from reality.

I still stand by most of the main points of my post, including:

  • the technical insanity of SBF’s clearly-expressed attitude to risk (“gambler’s ruin? more like gambler’s opportunity!!”), and its probable role in creating the conditions for everything that followed,
  • the need to diagnose the catastrophe correctly (making billions of dollars in order to donate them to charity? STILL VERY GOOD; lying and raiding customer deposits in the course of doing so? DEFINITELY BAD), and
  • how, when sneerers judge SBF guilty just for being a crypto billionaire who talked about Effective Altruism, it ironically lets him off the hook for what he specifically did that was terrible.

But over the past couple days, I’ve updated in the direction of understanding SBF’s psychology a lot less than I thought I did. While I correctly hit on certain aspects of the tragedy, there are other important aspects—the drug use, the cynical detachment (“life as a video game”), the impulsivity, the apparent lying—that I neglected to touch on and about which we’ll surely learn more in the coming days, weeks, and years. –SA


Several readers have asked me for updated thoughts on AI safety, now that I’m 5 months into my year at OpenAI—and I promise, I’ll share them soon! The thing is, until last week I’d entertained the idea of writing up some of those thoughts for an essay competition run by the FTX Future Fund, which (I was vaguely aware) was founded by the cryptocurrency billionaire Sam Bankman-Fried, henceforth SBF.

Alas, unless you’ve been tucked away on some Caribbean island—or perhaps, especially if you have been—you’ll know that the FTX Future Fund has ceased to exist. In the course of 2-3 days last week, SBF’s estimated net worth went from ~$15 billion to a negative number, possibly the fastest evaporation of such a vast personal fortune in all human history. Notably, SBF had promised to give virtually all of it away to various worthy causes, including mitigating existential risk and helping Democrats win elections, and the worldwide Effective Altruist community had largely reoriented itself around that pledge. That’s all now up in smoke.

I’ve never met SBF, although he was a physics undergraduate at MIT while I taught CS there. What little I knew of SBF before this week, came mostly from reading Gideon Lewis-Kraus’s excellent New Yorker article about Effective Altruism this summer. The details of what happened at FTX are at once hopelessly complicated and—it would appear—damningly simple, involving the misuse of billions of dollars’ worth of customer deposits to place risky bets that failed. SBF has, in any case, tweeted that he “fucked up and should have done better.”

You’d think none of this would directly impact me, since SBF and I inhabit such different worlds. He ran a crypto empire from the Bahamas, sharing a group house with other twentysomething executives who often dated each other. I teach at a large state university and try to raise two kids. He made his first fortune by arbitraging bitcoin between Asia and the West. I own, I think, a couple bitcoins that someone gave me in 2016, but have no idea how to access them anymore. His hair is large and curly; mine is neither.

Even so, I’ve found myself obsessively following this story because I know that, in a broader sense, I will be called to account for it. SBF and I both grew up as nerdy kids in middle-class Jewish American families, and both had transformative experiences as teenagers at Canada/USA Mathcamp. He and I know many of the same people. We’ve both been attracted to the idea of small groups of idealistic STEM nerds using their skills to help save the world from climate change, pandemics, and fascism.

Aha, the sneerers will sneer! Hasn’t the entire concept of “STEM nerds saving the world” now been utterly discredited, revealed to be just a front for cynical grifters and Ponzi schemers? So if I’m also a STEM nerd who’s also dreamed of helping to save the world, then don’t I stand condemned too?

I’m writing this post because, if the Greek tragedy of SBF is going to be invoked as a cautionary tale in nerd circles forevermore—which it will be—then I think it’s crucial that we tell the right cautionary tale.

It’s like, imagine the Apollo 11 moon mission had almost succeeded, but because of a tiny crack in an oxygen tank, it instead exploded in lunar orbit, killing all three of the astronauts. Imagine that the crack formed partly because, in order to hide a budget overrun, Wernher von Braun had secretly substituted a cheaper material, while telling almost none of his underlings.

There are many excellent lessons that one could draw from such a tragedy, having to do with, for example, the construction of oxygen tanks, the procedures for inspecting them, Wernher von Braun as an individual, or NASA safety culture.

But there would also be bad lessons to not draw. These include: “The entire enterprise of sending humans to the moon was obviously doomed from the start.” “Fate will always punish human hubris.” “All the engineers’ supposed quantitative expertise proved to be worthless.”

From everything I’ve read, SBF’s mission to earn billions, then spend it saving the world, seems something like this imagined Apollo mission. Yes, the failure was total and catastrophic, and claimed innocent victims. Yes, while bad luck played a role, so did, shall we say, fateful decisions with a moral dimension. If it’s true that, as alleged, FTX raided its customers’ deposits to prop up the risky bets of its sister organization Alameda Research, multiple countries’ legal systems will surely be sorting out the consequences for years.

To my mind, though, it’s important not to minimize the gravity of the fateful decision by conflating it with everything that preceded it. I confess to taking this sort of conflation extremely personally. For eight years now, the rap against me, advanced by thousands (!) on social media, has been: sure, while by all accounts Aaronson is kind and respectful to women, he seems like exactly the sort of nerdy guy who, still bitter and frustrated over high school, could’ve chosen instead to sexually harass women and hinder their scientific careers. In other words, I stand condemned by part of the world, not for the choices I made, but for choices I didn’t make that are considered “too close to me” in the geometry of conscience.

And I don’t consent to that. I don’t wish to be held accountable for the misdeeds of my doppelgängers in parallel universes. Therefore, I resolve not to judge anyone else by their parallel-universe doppelgängers either. If SBF indeed gambled away his customers’ deposits and lied about it, then I condemn him for it utterly, but I refuse to condemn his hypothetical doppelgänger who didn’t do those things.

Granted, there are those who think all cryptocurrency is a Ponzi scheme and a scam, and that for that reason alone, it should’ve been obvious from the start that crypto-related plans could only end in catastrophe. The “Ponzi scheme” theory of cryptocurrency has, we ought to concede, a substantial case in its favor—though I’d rather opine about the matter in (say) 2030 than now. Like many technologies that spend years as quasi-scams until they aren’t, maybe blockchains will find some compelling everyday use-cases, besides the well-known ones like drug-dealing, ransomware, and financing rogue states.

Even if cryptocurrency remains just a modern-day tulip bulb or Beanie Baby, though, it seems morally hard to distinguish a cryptocurrency trader from the millions who deal in options, bonds, and all manner of other speculative assets. And a traditional investor who made billions on successful gambles, or arbitrage, or creating liquidity, then gave virtually all of it away to effective charities, would seem, on net, way ahead of most of us morally.

To be sure, I never pursued the “Earning to Give” path myself, though certainly the concept occurred to me as a teenager, before it had a name. Partly I decided against it because I seem to lack a certain brazenness, or maybe just willingness to follow up on tedious details, needed to win in business. Partly, though, I decided against trying to get rich because I’m selfish (!). I prioritized doing fascinating quantum computing research, starting a family, teaching, blogging, and other stuff I liked over devoting every waking hour to possibly earning a fortune only to give it all to charity, and more likely being a failure even at that. All told, I don’t regret my scholarly path—especially not now!—but I’m also not going to encase it in some halo of obvious moral superiority.

If I could go back in time and give SBF advice—or if, let’s say, he’d come to me at MIT for advice back in 2013—what could I have told him? I surely wouldn’t talk about cryptocurrency, about which I knew and know little. I might try to carve out some space for deontological ethics against pure utilitarianism, but I might also consider that a lost cause with this particular undergrad.

On reflection, maybe I’d just try to convince SBF to weight money logarithmically when calculating expected utility (as in the Kelly criterion), to forsake the linear weighting that SBF explicitly advocated and that he seems to have put into practice in his crypto ventures. Or if not logarithmic weighting, I’d try to sell him on some concave utility function—something that makes, let’s say, a mere $1 billion in hand seem better than $15 billion that has a 50% probability of vanishing and leaving you, your customers, your employees, and the entire Effective Altruism community with less than nothing.

At any rate, I’d try to impress on him, as I do on anyone reading now, that the choice between linear and concave utilities, between risk-neutrality and risk-aversion, is not bloodless or technical—that it’s essential to make a choice that’s not only in reflective equilibrium with your highest values, but that you’ll still consider to be such regardless of which possible universe you end up in.

November 16, 2022

David Hoggdust and star formation

Julianne Dalcanton (Flatiron) gave a great talk at NYU today about star formation, interstellar medium, stellar ages, and dust in Local Group galaxies. She showed that the standard star-formation indicators from infrared emission from dust are way wrong. But she also showed lots of interesting detail in the interstellar medium and star-formation history in M33 and M31. M31 really does seem to have a ring which is not just over-dense in star formation; it's actually over-dense in stars. That's odd, and interesting.

n-Category Café The Icosidodecahedron

The icosidodecahedron can be built by truncating either a regular icosahedron or a regular dodecahedron. It has 30 vertices, one at the center of each edge of the icosahedron—or equivalently, one at the center of each edge of a dodecahedron. It is a beautiful, highly symmetrical shape. But it is just a shadow of a more symmetrical shape with twice as many vertices, which lives in a space with twice as many dimensions! Namely, it is a projection down to 3d space of a 6-dimensional polytope with 60 vertices.

Even better, it is also a slice of a more symmetrical 4d polytope with 120 vertices, which in turn is the projection down to 4d space of an even more symmetrical 8-dimensional polytope with 240 vertices: the so-called ‘E8 root polytope’.

Note how the numbers keep doubling: 30, 60, 120 and 240.

To understand all this, start with the group of rotational symmetries of the icosahedron. This is a 60-element subgroup of the rotation group SO(3), so it has a double cover, called the binary icosahedral group, consisting of 120 unit quaternions. With a suitable choice of coordinates, we can take these to be

\pm 1 , \quad \frac{\pm 1 \pm i \pm j \pm k}{2}, \quad \frac{\pm i \pm \phi j \pm \Phi k}{2}

together with everything obtained from these by even permutations of 1, i, j, and k, where

\phi = \frac{\sqrt{5} - 1}{2}, \quad \Phi = \frac{\sqrt{5} + 1}{2}

are the ‘little’ and ‘big’ golden ratios, respectively. These 120 unit quaternions are the vertices of a convex polytope in 4 dimensions. In fact this is a regular polytope, called the 600-cell since it has 600 regular tetrahedra as faces.
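If you want to play with these 120 quaternions concretely, here is a small Python sketch that generates them as 4-tuples (a, b, c, d) standing for a + bi + cj + dk, and checks that there are indeed 120 of them, all of unit norm; anticipating the next step, it also counts how many are purely imaginary.

    import itertools

    phi = (5**0.5 - 1) / 2    # little golden ratio
    Phi = (5**0.5 + 1) / 2    # big golden ratio

    def is_even(perm):
        # parity of a permutation of (0, 1, 2, 3), by counting inversions
        inversions = sum(1 for a in range(4) for b in range(a + 1, 4) if perm[a] > perm[b])
        return inversions % 2 == 0

    verts = set()
    # ±1, ±i, ±j, ±k
    for pos in range(4):
        for s in (1.0, -1.0):
            v = [0.0, 0.0, 0.0, 0.0]
            v[pos] = s
            verts.add(tuple(v))
    # (±1 ± i ± j ± k)/2
    for signs in itertools.product((1, -1), repeat=4):
        verts.add(tuple(s / 2 for s in signs))
    # (±i ± φj ± Φk)/2 and its images under even permutations of 1, i, j, k
    base = (0.0, 1.0, phi, Phi)
    for perm in itertools.permutations(range(4)):
        if not is_even(perm):
            continue
        for signs in itertools.product((1, -1), repeat=4):
            verts.add(tuple(round(signs[k] * base[perm[k]] / 2, 12) for k in range(4)))

    print(len(verts))                                               # 120
    print(all(abs(sum(c*c for c in v) - 1) < 1e-9 for v in verts))  # True: all unit quaternions
    print(sum(1 for v in verts if v[0] == 0))                       # 30 purely imaginary vertices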

If we slice the 600-cell halfway between two of its opposite vertices, we get an icosidodecahedron. This is easiest to see by intersecting the 600-cell with the space of purely imaginary quaternions

\{ a i + b j + c k : \; a,b,c \in \mathbb{R} \}

Of the 600-cell’s vertices, those that lie in this 3-dimensional space are

\pm i, \pm j, \pm k

which form the corners of an octahedron, and

\frac{\pm i \pm \phi j \pm \Phi k}{2} , \quad \frac{\pm j \pm \phi k \pm \Phi i}{2} , \quad \frac{\pm k \pm \phi i \pm \Phi j}{2}

which form the corners of three ‘golden boxes’. A golden box is the 3d analogue of a golden rectangle: its three sides are in the proportions \phi, 1 and \Phi.

It is well-known that these points are the vertices of an icosidodecahedron. Here are the three golden boxes and octahedron inscribed in an icosidodecahedron, as drawn by Rahul Narain:

But we are not done with the binary icosahedral group—far from it!

Integer linear combinations of these 120 elements of the quaternions form a subring of the quaternions, which Conway and Sloane [CS] call the icosians. Since any icosian can be written as a + bi + cj + dk where the numbers a,b,c,d \in \mathbb{R} are of the form x + y\sqrt{5} with x,y rational, any icosian gives an 8-tuple of rational numbers. However, we do not get all 8-tuples of rationals this way, only those lying in a certain lattice in \mathbb{R}^8. And there is a way to think of this lattice as a rescaled copy of the famous E8 lattice! To do this, Conway and Sloane put a new norm on the icosians as follows. The usual quaternionic norm is

\|a + bi + cj + dk\|^2 = a^2 + b^2 + c^2 + d^2

But for an icosian this norm is always of the form x + \sqrt{5} y for some rationals x and y. Conway and Sloane define a new norm on the icosians by setting

|a + bi + cj + dk|^2 = x + y

With this new norm, Conway and Sloane show the icosians are isomorphic to a rescaled version of the E8 lattice in \mathbb{R}^8.

The 240 shortest nonzero vectors in this lattice are the vertices of an 8-dimensional convex polytope called the E8 root polytope:

However, if we remember that each of these 240 vectors came from a quaternion, we can also think of them as 240 quaternions. These turn out to be the vertices of two 600-cells in the quaternions! In the usual quaternionic norm, one of these 600-cells is larger than the other by a factor of \Phi.
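To see how two different quaternionic sizes can share a single value of the new norm, here is a tiny exact-arithmetic sketch in Python, where a pair (x, y) of rationals stands for the number x + y\sqrt{5}: it takes the unit icosian (i + \phi j + \Phi k)/2 and \phi times it, and computes both norms.

    from fractions import Fraction as F

    def gmul(p, q):
        # multiply two numbers of the form x + y√5, stored as pairs (x, y)
        return (p[0]*q[0] + 5*p[1]*q[1], p[0]*q[1] + p[1]*q[0])

    def quat_norm2(q):
        # usual quaternionic norm² of a + bi + cj + dk, as a pair (x, y) meaning x + y√5
        x, y = F(0), F(0)
        for c in q:
            s = gmul(c, c)
            x, y = x + s[0], y + s[1]
        return (x, y)

    def new_norm2(q):
        # the new norm: x + y, where the quaternionic norm² is x + y√5
        x, y = quat_norm2(q)
        return x + y

    zero = (F(0), F(0))
    half = (F(1, 2), F(0))
    Phi  = (F(1, 2), F(1, 2))     # big golden ratio (1 + √5)/2
    phi  = (F(-1, 2), F(1, 2))    # little golden ratio (√5 - 1)/2

    u = (zero, half, gmul(half, phi), gmul(half, Phi))   # the icosian (i + φj + Φk)/2
    v = tuple(gmul(phi, c) for c in u)                   # φ times it, again an icosian

    print(quat_norm2(u), new_norm2(u))   # quaternionic norm² = 1, new norm = 1
    print(quat_norm2(v), new_norm2(v))   # quaternionic norm² = (3 - √5)/2, new norm = 1

In the usual norm v is \phi times as long as u, so the two differ in size by a factor of \Phi, while the new norm cannot tell them apart.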

In fact, there is an orthogonal projection from \mathbb{R}^8 down to \mathbb{R}^4 that maps the E8 root polytope to the 600-cell. So, in a very real sense, the 600-cell is the ‘shadow’ of a polytope with twice as many vertices, living in a space whose dimension is twice as large. And as a spinoff, this fact gives the same sort of relationship between the icosidodecahedron and a 6-dimensional polytope.

The key is to look at pure imaginary icosians: those of the form a i + b j + c k for real a,b,c. Since a,b and c are each of the form x + \sqrt{5}y with x and y rational, any pure imaginary icosian gives a 6-tuple of rational numbers. We do not get all 6-tuples of rationals this way, but only those lying in a certain lattice. We have

\|ai + bj + ck\|^2 = a^2 + b^2 + c^2

For a pure imaginary icosian this is always of the form x + \sqrt{5} y for some rationals x and y. So, we can define a new norm on the pure imaginary icosians by

|ai + bj + ck|^2 = x + y

With this new norm, the pure imaginary icosians are isomorphic to a rescaled version of a familiar lattice in \mathbb{R}^6, called the ‘D6 lattice’.

The 60 shortest nonzero vectors in the D6 lattice are called the roots of D6, and they are the vertices of a 6-dimensional convex polytope called the D6 root polytope. There is an orthogonal projection from \mathbb{R}^6 to \mathbb{R}^3 that maps this polytope to an icosidodecahedron. In fact 30 vertices of the D6 root polytope map to the vertices of this icosidodecahedron, while the other 30 map to vertices of a second, smaller icosidodecahedron.

Here is an image of the setup, created by Greg Egan:

Let’s see some details! The usual coordinatization of the D6 lattice in Euclidean \mathbb{R}^6 is

\mathrm{D}_6 = \left\{ (x_1, \dots, x_6) : \; x_i  \in \mathbb{Z}, \; \sum_i x_i \in 2\mathbb{Z} \right\} \subset \mathbb{R}^6

The roots of D6 are

(\pm 1, \pm 1, 0, 0, 0, 0)

and all vectors obtained by permuting the six coordinates. We shall see that these vectors are sent to the vertices of an icosidodecahedron by the linear map T \colon  \mathbb{R}^6 \to \mathbb{R}^3 given as a 3 × 6 matrix by

\left( \begin{array}{cccccc}  \Phi &  \Phi  & -1 & -1 & 0 &  0 \\  0 &  0  & \Phi &  -\Phi & -1 & 1 \\  -1 &  1 &  0 &  0 &  \Phi  & \Phi  \end{array} \right)

The rows of this matrix are orthogonal, all with the same norm, so after rescaling it by a constant factor we obtain an orthogonal projection. The columns of this matrix are six vertices of an icosahedron, chosen so that we never have a vertex and its opposite. For any pair of columns, they are either neighboring vertices of the icosahedron, or a vertex and the opposite of a neighboring vertex.

The map T thus sends any D6 root to either the sum or the difference of two neighboring icosahedron vertices. In this way we obtain all possible sums and differences of neighboring vertices of the icosahedron. It is easy to see that the sums of neighboring vertices give the vertices of an icosidodecahedron, since by definition the icosidodecahedron has vertices at the midpoints of the edges of a regular icosahedron. It is less obvious that the differences of neighboring vertices of the icosahedron give the vertices of a second, smaller icosidodecahedron. But thanks to the symmetry of the situation, we can check this by considering just one example. In fact the vectors defining the vertices of the larger icosidodecahedron turn out to be precisely \Phi times the vectors defining the vertices of the smaller one!

The beauties we have just seen are part of an even larger pattern relating all the non-crystallographic Coxeter groups to crystallographic Coxeter groups. For more, see the work of Fring and Korff [FK1,FK2], Boehm, Dechant and Twarock [BDT] and the many papers they refer to. Fring and Korff apply these ideas to integrable systems in physics, while the latter authors explore connections to affine Dynkin diagrams. For more relations between the icosahedron and E8, see [B2].

Acknowledgements

I thank Greg Egan for help with developing these ideas. The spinning icosidodecahedron was created by Cyp and was put on Wikicommons with a Creative Commons Attribution-Share Alike 3.0 Unported license. The 600-cell was made using Robert Webb’s Stella software and is on Wikicommons. The icosidodecahedron with three golden boxes and an octahedron inscribed in it was created by Rahul Narain on Mathstodon. The projection of the 240 E8 roots to the plane was created by Claudio Rocchini and put on Wikicommons with a Creative Commons Attribution 3.0 Unported license. The spinning pair of icosidodecahedra was created by Greg Egan and appears in an earlier blog article on this subject [B1]. The article here is an expanded version of that earlier article: the only thing I left out is the matrix describing a linear map S \colon \mathbb{R}^8 \to \mathbb{R}^4 that when suitably rescaled gives a projection mapping the E8 lattice in its usual coordinatization

\{ x \in \mathbb{R}^8: \, \textrm{all } x_i \in \mathbb{Z} \textrm{ or all } x_i \in \mathbb{Z} + \frac{1}{2} \textrm{ and } \sum_i x_i \in 2\mathbb{Z} \}

to the icosians, and thus mapping the 240 E8 roots to two 600-cells. For completeness, here is that matrix:

\left( \begin{array}{cccccccc}  \Phi+1 & \Phi -1 & 0  & 0 &  0 &  0 &   0  & 0 \\  0 & 0 & \Phi &  \Phi  & -1 & -1 & 0 & 0 \\  0 & 0  & 0 &  0  & \Phi &  -\Phi & -1 & 1   \\  0 & 0 & -1 &  1 &  0 &  0 &  \Phi  & \Phi  \end{array} \right)

The image at the bottom of this post was also created by Greg Egan, on Mathstodon.

References

[B1] John Baez, Icosidodecahedron from D6, Visual Insight, January 1, 2015.

[B2] John Baez, From the icosahedron to E8, London Math. Soc. Newsletter 476 (2018), 18–23.

[BDT] Celine Boehm, Pierre-Philippe Dechant and Reidun Twarock, Affine extensions of non-crystallographic Coxeter groups induced by projection, J. Math. Phys. 54, 093508 (2013).

[CS] John H. Conway and Neil J. A. Sloane, Sphere Packings, Lattices and Groups, Springer, Berlin, 2013.

[FK1] Andreas Fring and Christian Korff, Affine Toda field theories related to Coxeter groups of non-crystallographic type, Nucl. Phys. B729 (2005), 361–386.

[FK2] Andreas Fring and Christian Korff, Non-crystallographic reduction of generalized Calogero–Moser models, J. Phys. A39 (2006) 1115–1132.

November 14, 2022

John PreskillThe spirit of relativity

One of the most immersive steampunk novels I’ve read winks at an experiment performed in a university I visited this month. The Watchmaker of Filigree Street, by Natasha Pulley, features a budding scientist named Grace Carrow. Grace attends Oxford as one of its few women students during the 1880s. To access the university’s Bodleian Library without an escort, she masquerades as male. The librarian grouses over her request.

“‘The American Journal of  Science – whatever do you want that for?’” As the novel points out, “The only books more difficult to get hold of than little American journals were first copies of [Isaac Newton’s masterpiece] Principia, which were chained to the desks.”

As a practitioner of quantum steampunk, I relish slipping back to this stage of intellectual history. The United States remained an infant, to centuries-old European countries. They looked down upon the US as an intellectual—as well as partially a literal—wilderness.1 Yet potential was budding, as Grace realized. She was studying an American experiment that paved the path for Einstein’s special theory of relativity.

How does light travel? Most influences propagate through media. For instance, ocean waves propagate in water. Sound propagates in air. The Victorians surmised that light similarly travels through a medium, which they called the luminiferous aether. Nobody, however, had detected the aether.

Albert A. Michelson and Edward W. Morley squared up to the task in 1887. Michelson, brought up in a Prussian immigrant family, worked as a professor at the Case School of Applied Science in Cleveland, Ohio. Morley taught chemistry at Western Reserve University, which shared its campus with the recent upstart Case. The two schools later merged to form Case Western Reserve University, which I visited this month.

We can intuit Michelson and Morley’s experiment by imagining two passengers on a (steam-driven, if you please) locomotive: Audrey and Baxter. Say that Audrey walks straight across the aisle, from one window to another. In the same time interval, and at the same speed relative to the train, Baxter walks down the aisle, from row to row of seats. The train carries both passengers in the direction in which Baxter walks.

The Audrey and Baxter drawings (not to scale) are by Todd Cahill.

Baxter travels farther than Audrey, as the figures below show. Covering a greater distance in the same time, he travels more quickly.

Relative lengths of Audrey’s and Baxter’s displacements (top and bottom, respectively)

Replace each passenger with a beam of light, and replace the train with the aether. (The aether, Michelson and Morley reasoned, was moving relative to their lab as a train moves relative to the countryside. The reason was, the aether filled space and the Earth was moving through space. The Earth was moving through the aether, so the lab was moving through the aether, so the aether was moving relative to the lab.)

The scientists measured how quickly the “Audrey” beam of light traveled relative to the “Baxter” beam. The measurement relied on an apparatus that now bears the name of one of the experimentalists: the Michelson interferometer. To the scientists’ surprise, the Audrey beam traveled just as quickly as the Baxter beam. The aether didn’t carry either beam along as a train carries a passenger. Light can travel in a vacuum, without any need for a medium.
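For readers who want the numbers behind the analogy, here is the standard textbook version of the classical prediction: with c the speed of light through the supposed aether, v the speed of the lab through it, and L the length of an interferometer arm, a round trip along the direction of motion should take

t_\parallel = \frac{L}{c - v} + \frac{L}{c + v} = \frac{2L}{c}\,\frac{1}{1 - v^2/c^2}

while a round trip across the direction of motion should take

t_\perp = \frac{2L}{\sqrt{c^2 - v^2}} = \frac{2L}{c}\,\frac{1}{\sqrt{1 - v^2/c^2}}

Since t_\parallel > t_\perp whenever v \neq 0, the two beams should arrive slightly out of step. The interferometer saw no such difference.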

Exhibit set up in Case Western Reserve’s physics department to illustrate the Michelson-Morley experiment rather more articulately than my sketch above does

The American Physical Society, among other sources, calls Michelson and Morley’s collaboration “what might be regarded as the most famous failed experiment to date.” The experiment provided the first rigorous evidence that the aether doesn’t exist and that, no matter how you measure light’s speed, you’ll only ever observe one value for it (if you measure it accurately). Einstein’s special theory of relativity provided a theoretical underpinning for these observations in 1905. The theory provides predictions about two observers—such as Audrey and Baxter—who are moving relative to each other. As long as they aren’t accelerating, they agree about all physical laws, including the speed of light.

Morley garnered accolades across the rest of his decades-long appointment at Western Reserve University. Michelson quarreled with his university’s administration and eventually resettled at the University of Chicago. In 1907, he received the first Nobel Prize awarded to any American for physics. The citation highlighted “his optical precision instruments and the spectroscopic and metrological investigations carried out with their aid.”

Today, both scientists enjoy renown across Case Western Reserve University. Their names grace the sit-down restaurant in the multipurpose center, as well as a dormitory and a chemistry building. A fountain on the quad salutes their experiment. And stories about a symposium held in 1987—the experiment’s centennial—echo through the physics building. 

But Michelson and Morley’s spirit most suffuses the population. During my visit, I had the privilege and pleasure of dining with members of WiPAC, the university’s Women in Physics and Astronomy Club. A more curious, energetic group, I’ve rarely seen. Grace Carrow would find kindred spirits there.

With thanks to Harsh Mathur (pictured above), Patricia Princehouse, and Glenn Starkman, for their hospitality, as well as to the Case Western Reserve Department of Physics, the Institute for the Science of Origins, and the Gundzik Endowment.

Aside: If you visit Cleveland, visit its art museum! As Quantum Frontiers regulars know, I have a soft spot for ancient near-Eastern and ancient Egyptian art. I was impressed by the Cleveland Museum of Art’s artifacts from the reign of pharaoh Amenhotep III and the museum’s reliefs of the Egyptian queen Nefertiti. Also, boasting a statue of Gudea (a ruler of the ancient city-state of Lagash) and a relief from the palace of Assyrian king Ashurnasirpal II, the museum is worth its ancient-near-Eastern salt.

1Not that Oxford enjoyed scientific renown during the Victorian era. As Cecil Rhodes—creator of the Rhodes Scholarship—opined then, “Wherever you turn your eye—except in science—an Oxford man is at the top of the tree.”

November 13, 2022

Doug NatelsonBob Curl - it is possible to be successful and also a good person

I went to a memorial service today at Rice for my late colleague Bob Curl, who died this past summer, and it was a really nice event.  I met Bob almost immediately upon my arrival at Rice back in 2000 (though I’d heard about him from my thesis advisor, who’d met him at the Nobel festivities in Stockholm in 1996).  As everyone who interacted with him for any length of time will tell you, he was simultaneously extremely smart and amazingly nice.  He was very welcoming to me, even though I was a new assistant professor not even in his department.  I’d see him at informal weekly lunch gatherings of some folks from what was then called the Rice Quantum Institute, and he was always interested in learning about what his colleagues were working on - he had a deep curiosity and an uncanny ability to ask insightful questions.  He was generous with his time and always concerned about students and the well-being of the university community.

A refrain that came up over and over at the service was that Bob listened.  He talked with you, not at you, whether you were an undergrad, a grad student, a postdoc, a professor, or a staff member.  I didn’t know him nearly as well as others, but in 22 years I never heard him say a cross word or treat anyone with less than respect.  

His insatiable curiosity also came up repeatedly.  He kept learning new topics, right up to the end, and actually coauthored papers on economics, like this one.  By all accounts he was scientifically careful and rigorous.

Bob was a great example of how it is possible to be successful as an academic and a scientist while still being a nice person.  It’s important to be reminded of that sometimes.

November 12, 2022

John BaezThe Circle of Fifths

The circle of fifths is a beautiful thing, fundamental to music theory.

Sound is vibrations in air. Start with some note on the piano. Then play another note that vibrates 3/2 times as fast. Do this 12 times. Since

(3/2)¹² ≈ 128 = 2⁷

when you’re done your note vibrates about 2⁷ times as fast as when you started!

Notes have letter names, and two notes whose frequencies differ by a power of 2 have the same letter name. So the notes you played form a 12-pointed star:

Each time you increase the frequency by a factor of 3/2 you move around the points of this star: from C to G to D to A, and so on. Each time you move about 7/12 of the way around the star, since

log(3/2) / log(2) ≈ 7/12

This is another way of stating the approximate equation I wrote before!

It’s great! It’s called the circle of fifths, for reasons that don’t need to concern us here.

But this pattern is just approximate! In reality

(3/2)¹² = 129.746…

not 128, and

log(3/2) / log(2) = 0.58496…

not 7/12 = 0.58333… So the circle of fifths does not precisely close:

The failure of it to precisely close is called the Pythagorean comma, and you can hear the problem here:

This video plays you notes that increase in frequency by a factor of 3/2 each time, and finally two notes that differ by the Pythagorean comma: they’re somewhat out of tune.
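If you want to check these numbers yourself, here is a minimal Python sketch; the last line expresses the gap in cents, that is, hundredths of an equal-tempered semitone:

    from math import log2

    comma = (3/2)**12 / 2**7      # twelve perfect fifths versus seven octaves
    print(comma)                  # 1.01364..., the Pythagorean comma
    print(log2(3/2), 7/12)        # 0.58496... versus 0.58333...
    print(1200 * log2(comma))     # about 23.46 cents, roughly a quarter of a semitone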

People have dealt with this in many, many ways. No solution makes everyone happy.

For example, the equal-tempered 12-tone scale now used on most pianos doesn’t have ‘perfect fifths’—that is, frequency ratios of 3/2. It has frequency ratios of

2^{7/12} \approx 1.4983

I have tried in this blog article to be understandable by people who don’t know standard music theory terminology—basic stuff like ‘octaves’ and ‘fifths’, or the letter names for notes. But the circle of fifths is very important for people who do know this terminology. It’s a very practical thing for musicians, for example if you want to remember how many sharps or flats there are in any key. Here’s a gentle introduction to it by Gracie Terzian:

Here she explains some things you can do with it:

Here’s another version of the circle of fifths made by “Just plain Bill”—full of information used by actual musicians:

If you watch Terzian’s videos you’ll learn what all this stuff is about.

November 11, 2022

Tommaso DorigoScience Without Borders, The USERN, And Its New President

Humanity progresses thanks to the diffusion and sharing of human knowledge. In particular, scientific progress is brought forth by the sharing of ideas, measurements and experimental results among scientists, and the distribution of excellent education. We have grown very good at doing that, but can we improve the sharing of knowledge for the common good? 

The answer is certainly yes, as the interconnection of the scientific community and the interdisciplinarity of its efforts are hampered by borders, language barriers, cultural differences, political influences, religious hindrances, education system challenges, and also by different conventions, policies, metrics in the different areas of scientific research.

read more

Matt von HippelFields and Scale

I am a theoretical particle physicist, and every morning I check the arXiv.

arXiv.org is a type of website called a preprint server. It’s where we post papers before they are submitted to (and printed by) a journal. In practice, everything in our field shows up on arXiv, publicly accessible, before it appears anywhere else. There’s no peer review process on arXiv, the journals still handle that, but in our field peer review doesn’t often notice substantive errors. So in practice, we almost never read the journals: we just check arXiv.

And so every day, I check the arXiv. I go to the section on my sub-field, and I click on a link that lists all of the papers that were new that day. I skim the titles, and if I see an interesting paper I’ll read the abstract, and maybe download the full thing. Checking as I’m writing this, there were ten papers posted in my field, and another twenty “cross-lists” were posted in other fields but additionally classified in mine.

Other fields use arXiv: mathematicians and computer scientists and even economists use it in roughly the same way physicists do. For biology and medicine, though, there are different, newer sites: bioRxiv and medRxiv.

One thing you may notice is the different capitalization. When physicists write arXiv, the “X” is capitalized. In the logo, it looks like a Greek letter chi, thus saying “archive”. The biologists and medical researchers capitalize the R instead. The logo still has an X that looks like a chi, but positioned with the R it looks like the Rx of medical prescriptions.

Something I noticed, but you might not, was the lack of a handy link to see new papers. You can search medRxiv and bioRxiv, and filter by date. But there’s no link that directly takes you to the newest papers. That suggests that biologists aren’t using bioRxiv like we use arXiv, and checking the new papers every day.

I was curious if this had to do with the scale of the field. I have the impression that physics and mathematics are smaller fields than biology, and that much less physics and mathematics research goes on than medical research. Certainly, theoretical particle physics is a small field. So I might have expected arXiv to be smaller than bioRxiv and medRxiv, and I certainly would expect fewer papers in my sub-field than papers in a medium-sized subfield of biology.

On the other hand, arXiv in my field is universal. In biology, bioRxiv and medRxiv are still quite controversial. More and more people are using them, but not every journal accepts papers posted to a preprint server. Many people still don’t use these services. So I might have expected bioRxiv and medRxiv to be smaller.

Checking now, neither answer is quite right. I looked between November 1 and November 2, and asked each site how many papers were uploaded between those dates. arXiv had the most, 604 papers. bioRxiv had roughly half that many, 348. medRxiv had 97.

arXiv represents multiple fields, bioRxiv is “just” biology. Specializing, on that day arXiv had 235 physics papers, 135 mathematics papers, and 250 computer science papers. So each individual field has fewer papers than biology in this period.

Specializing even further, I can look at a subfield. My subfield, which is fairly small, had 20 papers between those dates. Cell biology, which I would expect to be quite a big subfield, had 33.

Overall, the numbers were weirdly comparable, with medRxiv unexpectedly small compared to both arXiv and bioRxiv. I’m not sure whether there are more biologists than physicists, but I’m pretty sure there should be more cell biologists than theoretical particle physicists. This suggests that many still aren’t using bioRxiv. It makes me wonder: will bioRxiv grow dramatically in future? Are the people running it ready for if it does?

November 10, 2022

John BaezThis Week’s Finds – Lecture 7

Today I’ll be talking about quaternions, octonions and E8. But a warning to everyone: today, November 10th, the seminar will be in a different building.

Today we’re in Lecture Theatre 1 of the Daniel Rutherford Building. This is a few minutes’ walk from the James Clerk Maxwell Building, just along from the Darwin Building.

Next week we will return to the usual place: Room 6206 of the James Clerk Maxwell Building, home of the Department of Mathematics of the University of Edinburgh.

As usual you can attend via Zoom:

https://ed-ac-uk.zoom.us/j/82270325098
Meeting ID: 822 7032 5098
Passcode: Yoneda36

And as usual, a video of today’s talk will appear here later.

November 06, 2022

John BaezModes (Part 2)

When you first learn about the major scale it’s fairly straightforward, because they tell you about just one major scale. But the minor scale is more tricky, because they tell you about three—or actually four, two of which are the same!

The most fundamental of these is the natural minor scale. The C major scale goes

C D E F G A B C

The C natural minor scale goes

C D E♭ F G A♭ B♭ C

As you can see the 3rd, 6th and 7th notes of the scale are ‘flatted’: moved down a half-tone compared to the major scale. This gives the natural minor scale a darker, even ‘sadder’ quality compared to the major scale.

I prefer to work with note numbers instead of note names, not because I’m a mathematician so I love numbers, but because then we can simultaneously talk about different keys at once, not just the key of C. In this approach we call the notes of the major scale

1 2 3 4 5 6 7 8

and then the natural minor scale is

1 2 ♭3 4 5 ♭6 ♭7 8

Don’t ask me why the flats are written in front of the numbers now instead of after them—it’s just a convention.

Now, one thing about ‘common practice’ western harmony is that the 7th tone plays a special role. It’s just a half-step below the 8, and we act like that dissonance makes it want very strongly to go up to the 8. The 8 is one octave above the 1, twice the frequency. Either the 1 or 8 instantly serves as a home base: we feel like a piece or passage is done, or momentarily at peace, when we play these notes. We say the 7 wants to ‘resolve’ to the 8, and we call it the ‘leading-tone’ for this reason: it suggests that we’ve almost reached the tonic, and makes us want to get there!

There’s much more we could say here, but it all combines to make people want a scale that’s like minor but contains the 7 instead of the ♭7. And since this scale is motivated by reasons of harmony theory, it’s called the harmonic minor scale. It goes like this:

1 2 ♭3 4 5 ♭6 7 8

However, now people singing this scale find it mildly awkward to jump up from ♭6 to the 7 because the distance between them is larger. In fact it’s 3 half-tones, larger than any step in the major or natural minor scale! One way to shrink this gap is to raise the ♭6 to a 6 as well. This gives the melodic minor scale:

1 2 ♭3 4 5 6 7 8

By now we’re almost back to the major scale! The only difference is the flatted 3. However, that’s still a lot: the ♭3 is considered the true hallmark of minorness. There are reasons for this, like the massive importance of the 1 3 5 chord, which serves to pound home the message “we’re back to 1, and this is the major scale, so we are very happy”. Playing 1 ♭3 5 says “we’re back to 1, but this is minor, so we are done but we are sad”.

However, singing up the scale is different from singing down the scale. When we sing up the melodic minor scale we are very happy to sing the 7 right before the 8, because it’s the leading-tone: it tells us we’re almost home. But when we sing down we don’t so much mind plunging from the 8 down to ♭7, and then it’s not so far down to ♭6: these are both steps of a whole tone. If we do this we are singing in the natural minor scale. So what I called ‘melodic minor’ is also called melodic minor ascending, while natural minor is also called melodic minor descending.

Here I should admit that while this is an oft-told pedagogical story, the actual reality is more complex. Good composers or improvisers use whatever form of minor they want at any given moment! However, most western musicians have heard some version of the story I just told, and that does affect what they do.

To listen to these various forms of the minor scale, and hear them explained more eloquently than I just did, try this:

Gracie Terzian is the patient teacher of music theory I wish I’d had much earlier. You may feel a bit impatient listening to her carefully working through various scales, but that’s because she’s giving you enough time for the information to really sink into your brain!

Anyway: we’ve seen one form of major scale and three forms of minor, one of which has two names. All these scales differ solely in whether or not we flat the 3, 6 or 7. So, we can act like mathematicians and fit them into a cube where the operations of flatting the 3, 6 or 7 are drawn as arrows:

Here to save space I’ve written flatted notes with little superscripts like 3^\flat instead of ♭3: it makes no difference to the meaning.

This chart shows that flatting the 3 pushes our scale into minor territory, while flatting the 6 and then the 7 are ways to further intensify the darkness of the scale. But you’ll also see that we’re just using a few of the available options!

In part 1 I showed you another way to modify the major scale, namely by starting it at various different notes to get different ‘modes’. If we list them in order of the starting note—1, 2, 3, etc.—they look like this:

For example, Ionian is just major. But we saw that it is also very nice to list the modes from the ‘brightest’ to the ‘darkest’. Rob van Hal made a nice chart showing how this works:

Skipping over Lydian, which is a bit of an exception, we start with major—that is, Ionian—and then start flatting more and more notes. When we reach the Phrygian and Locrian we flat the 2 and then the 5, which are very drastic things to do. So these modes have a downright sinister quality. But before we reach these, we pass through various modes that fit into my cube!

Let’s look at them:

We’re now tracing out a different path from top to bottom. Ionian has no notes flatted. In Mixolydian we flat the 7. In Dorian we also flat the 3. Then in Aeolian we also flat the 6.

I mentioned that the ♭3 is considered the true hallmark of minorness. Thus, in the classification of modes, those with a flatted 3 are considered ‘minor’ while those without are considered ‘major’. So in our new path from the cube’s top to its bottom, we switch from major to minor modes when we pass from Mixolydian to Dorian.

Note that Ionian is just our old friend the major scale, and Aeolian is our friend the natural minor. We can combine the two cubes I’ve showed you, and see how they fit together:

Now we can get from the top to Dorian following two paths that pass only through scales or modes we’ve seen! Similarly we can get from melodic minor ascending to the bottom following two paths through scales or modes we’ve seen. In general, moving around this cube through the course of a piece provides a lot of interesting ways to subtly change the mood.

But two corners of our cube don’t have names yet! These are more exotic! But of course they exist, and are sometimes used in music. The mode

1 2 3 4 5 ♭6 7

is called harmonic major, and it’s used in the Beatles’ ‘Blackbird’. The mode

1 2 3 4 5 ♭6 ♭7

is called the melodic major scale, also known as Mixolydian flat 6 or Aeolian dominant. It’s used in the theme song of the movie The Mask of Zorro, called ‘I Want to Spend My Lifetime Loving You’.

So, let’s add these two modes to our cube:

This is the whole enchilada: a ‘commuting cube’, meaning that regardless of which path we take from any point to any other point, we get the same mode in the end. We can also strip it of all the musical names and think of it in a purely mathematical way:

We could go further and study a 5-dimensional hypercube where we also consider the results of flatting the 2 and 5. That would let us include darker and scarier modes like Phrygian, Phrygian dominant and Locrian—but it would be tougher to draw!
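If you like, you can enumerate this little cube by brute force. Here is a minimal Python sketch (mine, purely for illustration; the scale names follow the discussion above):

# Enumerate the 8 corners of the 'flatting cube': every way of flatting
# some subset of {3, 6, 7} in the major scale.
from itertools import combinations

MAJOR = {1: 0, 2: 2, 3: 4, 4: 5, 5: 7, 6: 9, 7: 11}   # semitones above the tonic

NAMES = {
    frozenset():          "major (Ionian)",
    frozenset({7}):       "Mixolydian",
    frozenset({3, 7}):    "Dorian",
    frozenset({3, 6, 7}): "natural minor (Aeolian)",
    frozenset({3}):       "melodic minor ascending",
    frozenset({3, 6}):    "harmonic minor",
    frozenset({6}):       "harmonic major",
    frozenset({6, 7}):    "melodic major (Mixolydian flat 6)",
}

for r in range(4):
    for flats in combinations((3, 6, 7), r):
        scale = {deg: st - (1 if deg in flats else 0) for deg, st in MAJOR.items()}
        label = NAMES[frozenset(flats)]
        degrees = " ".join(("♭" if d in flats else "") + str(d) for d in range(1, 8))
        print(f"{label:35} {degrees}   semitones: {sorted(scale.values())}")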

John BaezModes (Part 1)

I’ve been away from my piano since September. I really miss playing it. So, I’ve been sublimating my desire to improvise on this instrument by finally learning a bunch of basic harmony theory, which I practice just by singing or whistling.

For example, I’m getting into modes. The following 7 modes are all obtained by taking the major scale and starting it at different points. But I find that’s not the good way for me to understand the individual flavor of each one.

Much better for me is to think of each mode as the major scale (= Ionian mode) with some notes raised or lowered a half-step — since I already have an intuitive sense of what that will do to the sound:

For example, anything with the third lowered a half-step (♭3) will have a minor feel. And Aeolian, which also has the 6th and 7th lowered (♭6 and ♭7), is nothing but my old friend the natural minor scale!

A more interesting mode is Dorian, which has just the 3rd and 7th notes lowered a half-step (♭3 and ♭7). Since the 6th is not lowered, this is not as sad as minor. You can play happy tunes in minor, but it’s easier to play really lugubrious tear-jerkers, which I find annoying. The major 6th of Dorian changes the sound to something more emotionally subtle. Listen to a bunch of examples here:

Some argue that the Dorian mode gets a peculiarly ‘neutral’ quality by being palindromic: the pattern of whole and half steps when you go up this mode is the same as when you go down:

w h w w w h w

This may seem crazily mathematical, but Leibniz said “Music is the pleasure the human mind experiences from counting without being aware that it is counting.”

Indeed, there is a marvelous theory of how modes sound ‘bright’ or ‘dark’ depending on how many notes are sharped—that is, raised a half-tone—or flatted—that is, lowered a half-tone. I learned about it from Rob van Hal, here:

The more notes are flatted compared to the major scale, the ‘darker’ a mode sounds! The fewer are flatted, the ‘brighter’ it sounds. And one, Lydian, is even brighter than major (= Ionian), because it has no flats and one sharp!

So, let’s list them from bright to dark. Here’s a chart from Rob van Hal’s video:

You can see lots of nice patterns here, like how the flats come in ‘from top down’ as the modes get darker: that is, starting at the 7th, then the 6th and then the 5th… but also, interspersed with these, the 3rd and then the 2nd.

But here’s something even cooler, which I also learned from Rob van Hal (though he was surely not the first to discover it).

If we invert each mode—literally turn it upside down, by playing the pattern of whole and half steps from the top of the scale down instead of from bottom to top—the brighter modes become the darker modes, and vice versa!

Let’s see it! Inverting the brightest, Lydian:

w w w h w w h

we get the darkest, Locrian:

h w w h w w w

Inverting the 2nd brightest, the happy Ionian (our familiar friend the major scale):

w w h w w w h

we get the 2nd darkest, Phrygian:

h w w w h w w

Inverting the third brightest, Mixolydian:

w w h w w h w

we get the third darkest, the sad Aeolian (our friend the natural minor):

w h w w h w w

And right in the middle is the palindromic Dorian:

w h w w w h w

What a beautiful pattern!
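If you want to check the whole pairing at once, here is a small Python sketch (mine, not from Rob van Hal’s video) that reverses each mode’s step pattern and confirms the k-th brightest inverts to the k-th darkest:

# Verify that inverting (reversing) each mode's step pattern swaps
# brightness for darkness, using the brightness order from the chart above.
MAJOR_STEPS = ["w", "w", "h", "w", "w", "w", "h"]          # Ionian
MODE_NAMES  = ["Ionian", "Dorian", "Phrygian", "Lydian",
               "Mixolydian", "Aeolian", "Locrian"]          # starting on degrees 1..7

def steps(mode_index):
    """Step pattern of the mode starting on scale degree mode_index + 1."""
    return MAJOR_STEPS[mode_index:] + MAJOR_STEPS[:mode_index]

patterns = {MODE_NAMES[i]: steps(i) for i in range(7)}
bright_to_dark = ["Lydian", "Ionian", "Mixolydian", "Dorian",
                  "Aeolian", "Phrygian", "Locrian"]

for k, name in enumerate(bright_to_dark):
    inverted = list(reversed(patterns[name]))
    partner = next(m for m, p in patterns.items() if p == inverted)
    print(f"{name:10} inverts to {partner}")
    assert partner == bright_to_dark[-(k + 1)]   # k-th brightest <-> k-th darkest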

By the way, it’s also cool how both the ultra-bright Lydian and the ultra-dark Locrian, and only these modes, have a note that’s exactly half an octave above the 1. This is a very dissonant thing for a mode to have! In music jargon we say it like this: these modes have a note that’s a tritone above the tonic.

In Lydian this note is the sharped 4th, which is a ‘brighter than usual 4th’. In Locrian it’s the flatted 5th, which is a ‘darker than usual 5th’. But these are secretly the same note, or more technically ‘enharmonic equivalents’. They differ just in the role they play—but that makes a big difference.

Why do both Lydian and Locrian have a note that’s a tritone above the tonic? It’s not a coincidence: the tritone is mapped to itself by inversion of the octave, and inversion interchanges Lydian and Locrian!

This stuff is great, especially when I combine it with actually singing in different modes and listening to how they sound. Why am I learning it all just now, after decades of loving music? Because normally when I want to think about music I don’t study theory—I go to the piano and start playing!

The mathematics of modes

We clearly have an action of the 7-element cyclic group \mathbb{Z}/7 on the set of modes I’m talking about: they’re defined by taking the major scale and cyclically permuting its notes. But as we’ve seen, inversion gives an action of \mathbb{Z}/2 on the set of modes, with Dorian as its only fixed point.

Putting these two groups together, we get an action of the 14-element dihedral group \mathrm{D}_{14} on the modes. This is the semidirect product \mathbb{Z}/2 \ltimes \mathbb{Z}/7. More intuitively, it’s the symmetry group of the regular heptagon! The modes can be seen as the vertices of this heptagon.
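Here is a quick computational sanity check (my own sketch): starting from the major scale’s cyclic step pattern, the ‘shift to the next mode’ and ‘invert’ operations generate exactly 14 distinct permutations of the 7 modes, so the dihedral group of the heptagon really does act faithfully:

# The modes are the 7 cyclic rotations of the major scale's step pattern.
# Rotation (start one degree higher) and inversion (reverse the pattern)
# generate a permutation group of the modes; we check it has 14 elements.
MAJOR_STEPS = ("w", "w", "h", "w", "w", "w", "h")
rotations = [MAJOR_STEPS[i:] + MAJOR_STEPS[:i] for i in range(7)]   # the 7 modes

def rotate(i):                 # shift to the next mode
    return (i + 1) % 7

def invert(i):                 # reverse the step pattern, find which mode that is
    return rotations.index(tuple(reversed(rotations[i])))

identity = tuple(range(7))
group = {identity}
frontier = {identity}
while frontier:                # close up under the two generators
    new = set()
    for perm in frontier:
        for gen in (rotate, invert):
            g = tuple(gen(perm[i]) for i in range(7))
            if g not in group:
                new.add(g)
    group |= new
    frontier = new

print(len(group))              # 14: seven rotations and seven reflections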

We’ve also seen the modes have a linear ordering by ‘brightness’. However, this ordering is not preserved by the symmetries I’ve described: only the identity transformation preserves this linear ordering.

All this should have been studied in neo-Riemannian music theory, but I don’t know if it has—so if you know references, please tell me! The \mathrm{D}_{14} group here is a baby version of the \mathrm{D}_{24} group often studied in neo-Riemannian theory. For more, see:

• Alissa S. Crans, Thomas M. Fiore and Ramon Satyendra, Musical actions of dihedral groups, American Mathematical Monthly 116 (2009), 479–495.

More on individual modes

For music, more important than the mathematical patterns relating different modes is learning the ‘personality’ of individual modes and how to compose or improvise well in each mode.

Here are some introductions to that! Since I’m in awe of Rob van Hal I will favor his when possible. But there are many introductions to each mode on YouTube, and it’s worth watching a lot, for different points of view.

Locrian is so unloved that I can’t find a good video on how to compose in Locrian. Instead, there’s a good one on how Björk created a top 20 hit that uses Locrian:

and also a good one about Adam Neely and friends trying to compose in Locrian:

For more, read Modes (part 2).

Doug NatelsonThe 2022 Welch Conference

The last couple of weeks have been very full.  

One event was the annual Welch Foundation conference (program here).  The program chair for this one was W. E. Moerner, expert (and Nobel Laureate) on single-molecule spectroscopy, and it was really a great meeting.  I'm not just saying that because it's the first one in several years that was well aligned to my own research.  

The talks were all very good, and I was particularly impressed by the presentation by Yoav Shechtman, who spoke about the use of machine learning in super-resolution microscopy.  It had me convinced that machine learning (ML) can, under the right circumstances, basically be magic.   The key topic is discussed in this paper.  Some flavors of super-resolution microscopy rely on the idea that fluorescence is coming from individual, hopefully well-separated single emitters.  Diffraction limits the size of a spot, but if you know that the light is coming from one emitter, you can use statistics to figure out the x-y centroid position of that spot to much higher precision.  That can be improved by ML methods, but there's more.  There are ways to get z information as well.  Xiaowei Zhuang's group had this paper in 2008 that's been cited 2000+ times, using a clever idea:  with a cylindrical lens in the beam path, a spot from an emitter above the focal plane is distorted along one axis, while a spot from an emitter below the focal plane is distorted along the orthogonal axis.  In the new work, Shechtman's folks have gone further, putting a phase mask into the path that produces more interesting distortions along those lines.  They use ML trained on a detailed simulation of their microscope data to get improved z precision.  Moreover, they can also use ML to design an optimal version of that phase mask, to get even better precision.  Very impressive.
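As a toy illustration of that statistical point (my own sketch, with made-up numbers, not anything from the papers mentioned): the centroid of N detected photon positions localizes a single emitter to roughly sigma/sqrt(N), far below the diffraction-limited spot width sigma.

# Toy single-emitter localization: the centroid of N photon positions pins
# down the emitter far more precisely than the spot width sigma.
import numpy as np

rng = np.random.default_rng(0)
sigma = 250.0                      # diffraction-limited spot width, in nm (illustrative)
true_x = 10.0                      # emitter x position in nm (made up for the demo)

for n_photons in (100, 1000, 10000):
    # 500 repeated "images", each a set of photon x positions drawn from the spot
    photons = rng.normal(true_x, sigma, size=(500, n_photons))
    centroids = photons.mean(axis=1)              # estimated x per image
    rms_error = np.sqrt(((centroids - true_x) ** 2).mean())
    print(f"N = {n_photons:6d}: rms error ≈ {rms_error:5.1f} nm, "
          f"sigma/sqrt(N) = {sigma / n_photons ** 0.5:5.1f} nm")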

The other talk that really stuck out was the Welch award talk by Carolyn Bertozzi, one of this year's Nobel Laureates in Chemistry.  She gave a great presentation about the history of bioorthogonal chemistry, and it was genuinely inspiring, especially given the clinical treatment possibilities it's opened up.  Even though she must've given some version of that talk hundreds of times, her passion and excitement about the actual chemistry (e.g. see, these bonds here are really strained, so we know that the reaction has to happen here) was just palpable.  

November 05, 2022

Terence TaoUCLA Math Undergraduate Merit Scholarship for 2023

In 2010, the UCLA mathematics department launched a scholarship opportunity for entering freshman students with exceptional background and promise in mathematics. This program was unfortunately suspended for a while due to technical reasons, but we are once again able to offer one scholarship each year. The UCLA Math Undergraduate Merit Scholarship provides for full tuition and a room and board allowance for 4 years, contingent on continued high academic performance. In addition, scholarship recipients follow an individualized accelerated program of study, as determined after consultation with UCLA faculty. The program of study leads to a Master's degree in Mathematics in four years.

More information and an application form for the scholarship can be found on the web at:

https://ww3.math.ucla.edu/ucla-math-undergraduate-merit-scholarship/

To be considered for Fall 2023, candidates must apply for the scholarship and also for admission to UCLA on or before November 30, 2022.

November 04, 2022

Matt von HippelNo, PhD Students Are Not Just Cheap Labor

Here’s a back-of-the-envelope calculation:

In 2019, there were 83,050 unionized graduate students in the US. Let’s assume these are mostly PhD students, since other graduate students are not usually university employees. I can’t find an estimate of the total number of PhD students in the US, but in 2019, 55,614 of them graduated. In 2020, the average US doctorate took 7.5 years to complete. That implies that 83,050/(55,614 x 7.5) = about one-fifth of PhD students in the US are part of a union.
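For anyone who wants to rerun the estimate with different inputs, it is just:

unionized = 83_050          # unionized grad students, 2019
graduated = 55_614          # PhDs awarded, 2019
years = 7.5                 # average time to degree, 2020
print(unionized / (graduated * years))   # ≈ 0.20, i.e. about one-fifth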

That makes PhD student unions common, but not the majority. It means they’re not unheard of and strange, but a typical university still isn’t unionized. It’s the sweet spot for controversy. It leads to a lot of dumb tweets.

I saw one such dumb tweet recently, from a professor arguing that PhD students shouldn’t unionize. The argument was that if PhD students were paid more, then professors would prefer to hire postdocs, researchers who already have a doctoral degree.

(I won’t link to the tweet, in part because this person is probably being harassed enough already.)

I don’t know how things work in this professor’s field. But the implication, that professors primarily take on PhD students because they’re cheaper, not only doesn’t match my experience: it also just doesn’t make very much sense.

Imagine a neighborhood where the children form a union. They decide to demand a higher allowance, and to persuade any new children in the neighborhood to follow their lead.

Now imagine a couple in that neighborhood, deciding whether to have a child. Do you think that they might look at the fees the “children’s union” charges, and decide to hire an adult to do their chores instead?

Maybe there’s a price where they’d do that. If neighborhood children demanded thousands of dollars in allowance, maybe the young couple would decide that it’s too expensive to have a child. But a small shift is unlikely to change things very much: people have kids for many reasons, and those reasons don’t usually include cheap labor.

The reasons professors take on PhD students are similar to the reasons parents decide to have children. Some people have children because they want a legacy, something of theirs that survives to the next generation. For professors, PhD students are our legacy, our chance to raise someone on our ideas and see how they build on them. Some people have children because they love the act of child-raising: helping someone grow and learn about the world. The professors who take on students like taking on students: teaching is fun, after all.

That doesn’t mean there won’t be cases “on the margin”, where a professor finds they can’t afford a student they previously could. (And to be fair to the tweet I’m criticizing, they did even use the word “marginal”.) But they would have to be in a very tight funding situation, with very little flexibility.

And even for situations like that, long-term, I’m not sure anything would change.

I did my PhD in the US. I was part of a union, and in part because of that (though mostly because I was in a physics department), I was paid relatively decently for a PhD student. Relatively decently is still not that great, though. This was the US, where universities still maintain the fiction that PhD students only work 20 hours a week and pay them proportionately, and where salaries in a university can change dramatically from student to postdoc to professor.

One thing I learned during my PhD is that despite our low-ish salaries, we cost our professors about as much as postdocs did. The reason why is tuition: PhD students don’t pay their own tuition, but that tuition still exists, and is paid by the professors who hire those students out of their grants. A PhD salary plus a PhD tuition ended up roughly equal to a postdoc salary.

Now, I’m working in a very different system. In a Danish university, wages are very flat. As a postdoc, a nice EU grant put me at almost the same salary as the professors. As a professor, my salary is pretty close to that of one of the better-paying schoolteacher jobs.

At the same time, tuition is much less relevant. Undergraduates don’t pay tuition at all, so PhD tuition isn’t based on theirs. Instead, it’s meant to cover costs of the PhD program as a whole.

I’ve filled out grants here in Denmark, so I know how much PhD students cost, and how much postdocs cost. And since the situation is so different, you might expect a difference here too.

There isn’t one. Hiring a PhD student, salary plus tuition, costs about as much as hiring a postdoc.

Two very different systems, with what seem to be very different rules, end up with the same equation. PhD students and postdocs cost about as much as each other, even if every assumption that you think would affect the outcome turns out completely different.

This is why I expect that, even if PhD students get paid substantially more, they still won’t end up that out of whack with postdocs. There appears to be an iron law of academic administration keeping these two numbers in line, one that holds across nations and cultures and systems. The proportion of unionized PhD students in the US will keep working its way upwards, and I don’t expect it to have any effect on whether professors take on PhDs.

n-Category Café Categories and Epidemiology

I gave a talk about my work using category theory to help design software for epidemic modeling:

• Category theory and epidemiology, African Mathematics Seminar, Wednesday November 2, 2022, 3 pm Nairobi time or noon UTC. Organized by Layla Sorkatti and Jared Ongaro.

This talk is a lot less technical than previous ones I’ve given on this subject, which were aimed mainly at category theorists. You can watch it on YouTube.

Here’s the abstract:

Category theory provides a general framework for building models of dynamical systems. We explain this framework and illustrate it with the example of “stock and flow diagrams”. These diagrams are widely used for simulations in epidemiology. Although tools already exist for drawing these diagrams and solving the systems of differential equations they describe, we have created a new software package called StockFlow which uses ideas from category theory to overcome some limitations of existing software. We illustrate this with code in StockFlow that implements a simplified version of a COVID-19 model used in Canada. This is joint work with Xiaoyan Li, Sophie Libkind, Nathaniel Osgood and Evan Patterson.

Check out these papers for more:

For some more mathematical talks on the same subject, go here.

November 03, 2022

Matt Strassler W boson mass too high? Charm quarks in the proton? There’s a (worrisome) link.

Two of the most widely reported stories of the year in particle physics,

  • the measurement suggesting the W boson's mass is higher than the Standard Model predicts, and
  • the claim of "intrinsic" charm quark/anti-quark pairs in the proton,

both depend crucially on our understanding of the fine details of the proton, as established to high precision by the NNPDF collaboration itself.  This large group of first-rate scientists starts with lots of data, collected over many years and in many experiments, which can give insight into the proton's contents. Then, with a careful statistical analysis, they try to extract from the data a precision picture of the proton's internal makeup (encoded in what is known as "Parton Distribution Functions" — that's the PDF in NNPDF).

NNPDF are by no means the first group to do this; it’s been a scientific task for decades, and without it, data from proton colliders like the Large Hadron Collider couldn’t be interpreted.   Crucially, the NNPDF group argues they have the best and most modern methods for the job  — NN stands for “neural network”, so it has to be good, right? 😉 — and that they carry it out at higher precision than anyone has ever done  before.

But what if they’re wrong? Or at least, what if the uncertainties on their picture of the proton are larger than they say?  If the uncertainties were double what NNPDF believes they are, then the claim of excess charm quark/anti-quark pairs in the proton — just barely above detection at 3 standard deviations — would be nullified, at least for now.  And even the claim of the W boson mass being different from the theoretical prediction,  which was argued to be a 7 standard deviation detection, far above “discovery” level, is in some question. In that mass measurement, the largest single source of systematic uncertainty is from the parton distribution functions.  A mere doubling of this uncertainty would reduce the discrepancy to 5 standard deviations, still quite large.  But given the thorny difficulty of the W mass measurement, any backing off from the result would certainly make people more nervous about it… and they are already nervous as it stands. (Some related discussion of these worries appeared in print here, with an additional concern here.)

In short, a great deal, both current and future, rides on whether the NNPDF group’s uncertainties are as small as they think they are.  How confident can we be?

The problem is that there are very few people who have the technical expertise to check whether NNPDF’s analysis is correct, and the numbers are shrinking.  NNPDF is a well-funded European group of more than a dozen people.  But in the United States, the efforts to study the proton’s details are poorly funded, and smaller than ever.  I don’t agree with Sabine Hossenfelder’s bludgeoning of high-energy physics, much of which seems to arise from a conflation of real problems with imaginary ones — but she’s not wrong when she argues that basic science is under-funded compared to more fancy-sounding stuff.  After all, the US has spent a billionish dollars helping to build and run a proton collider.  How is it that we can’t spend a couple of million per year to properly support the US-based PDF experts, so that they can help us make full use of this collider’s treasure trove of data? Where are our priorities?

A US-based group called CTEQ-TEA, which has been around for decades and was long a leader in the field, is disputing NNPDF's uncertainties, suggesting that the true uncertainties are closer to the larger ones that CTEQ-TEA finds in its own PDFs.  (Essentially, if I understand correctly, they are suggesting that NNPDF's methods fail to account for all possible functional forms [i.e. shapes] of the parton distribution functions, and that this leads the NNPDF group to conclude they know more than they actually do.)  I'm in no position, currently, to evaluate this claim; it's statistically subtle.  Nor have I spoken to any NNPDF experts yet to understand their counter-arguments.  And of course the CTEQ-TEA group is inevitably at risk of seeming self-serving, since their PDFs have larger uncertainties than those obtained by NNPDF.

But frankly, it doesn’t matter what NNPDF says or how good their arguments are.   With such basic questions about nature riding on their uncertainties, we need a second and ideally a third group that has the personnel to carry out a similar analysis, with different assumptions, to see if they all come to the same conclusion.  We cannot abide a situation where we depend on one and only one group of scientists to tell us how the proton works at the most precise level; we cannot simply assume that they did it right, no matter how careful their arguments might seem.  Mistakes at the forefront of science happen all the time; the forefront is a difficult place, which is why we revere those who achieve something there.  We cannot have claims of major discoveries (or lack thereof!) reliant on a single group of people.  And so — we need funding for other groups.  Otherwise it will be a very long time before we know whether or not the W boson’s mass is actually above the Standard Model prediction, or whether there really are charm quark/anti-quark pairs playing a role in the proton… and meanwhile we won’t be able to answer other questions that depend on precision measurements, such as whether the properties of the Higgs boson exactly agree with the Standard Model.

Prizes worth millions of dollars a year, funded by the ultra-wealthy, are given to famous theoretical physicists whose best work is already in the past. At many well-known universities, the string theory and formal quantum field theory efforts are well-funded, thanks in part to gifts from very rich people.  That’s great and all, but progress in science depends not only on the fancy-sounding stuff that makes the headlines, but also on the hard, basic work that makes the headline-generating results possible.  Somebody needs to be funding those foundational efforts, or we’ll end up with huge piles of experimental data that we can’t interpret, and huge piles of theory papers that sound exciting but whose relation to nature can’t be established.

I doubt this message will get through to anyone important who can do something about it — it's a message I've been trying to deliver for over 20 years — but in an ideal world I'd like it to be heard by two groups of people: (1) the funders of particle physics at the National Science Foundation and the Department of Energy, who ought to fund string theory/supersymmetry a little less and proton fundamentals a little more; and (2) Elon Musk, Mark Zuckerberg, Jeff Bezos, Yuri Milner, and other gazillionaires who could solve this problem with a flick of their fingers.

Peter Rohde MoodSnap now in Spanish

We’re delighted to announce that MoodSnap mood diary is now localised to the Spanish language. We’re incredibly excited that all our Spanish-speaking friends can now use MoodSnap in their native language.

Thank you to Melany Nadine Monroy Icaza for providing the translation and Christian Ronald Cresci for proof-checking.

www.moodsnap.app

The post MoodSnap now in Spanish appeared first on Peter Rohde.

November 01, 2022

Tommaso DorigoThe Cleverest Experiment Of The Twentieth Century

At about this time of the year I find myself teaching my students about the construction of V-A theory, which is a milestone in the construction of the Standard Model of particle physics. And in so doing I rejoice about having a chance to tell them the details of one of the most brilliant experiments of the twentieth century, one performed in 1957 by Maurice Goldhaber with his colleagues Grodzins and Sunyar, and which has become a cornerstone of the physics of weak interactions and of particle physics in general.

read more

October 31, 2022

Jordan EllenbergCompactness as groupwork

Compactness is one of the hardest things in the undergrad curriculum to teach, I find. It’s very hard for students to grasp that the issue is not “is there a finite collection of open sets that covers K” but rather “for every collection of open sets that covers K, some finite subcollection covers K.” This year I came up with a new metaphor. I asked the students to think about how, when a professor assigns a group project, there’s always a very small subgroup of the students who does all the work. That’s sort of how compactness works! Yes, the professor assigned infinitely many open sets to the project, but actually most of them are not really contributing. And it doesn’t matter if some small set of students could do the project; what matters is that some small group among the students assigned to the project is going to end up doing it! And this seems to happen — my students nodded in agreement — no matter how the professor forms the group.

October 28, 2022

Matt von HippelChaos: Warhammer 40k or Physics?

As I mentioned last week, it’s only natural to confuse chaos theory in physics with the forces of chaos in the game Warhammer 40,000. Since it will be Halloween in a few days, it’s a perfect time to explain the subtle differences between the two.

Warhammer 40k | Physics
In the grim darkness of the far future, there is only war! | In the grim darkness of Chapter 11 of Goldstein, Poole, and Safko, there is only Chaos!
Birthed from the psychic power of mortal minds | Birthed from the numerical computations of mortal physicists
Ruled by four chaos gods: Khorne, Tzeench, Nurgle, and Slaanesh | Ruled by three principles: sensitivity to initial conditions, topological transitivity, and dense periodic orbits
In the 31st millennium, nine legions of space marines leave humanity due to the forces of chaos | In the 3.5 millionth millennium, Mercury leaves the solar system due to the force of gravity
While events may appear unpredictable, everything is determined by Tzeench’s plans | While events may appear unpredictable, everything is determined by the initial conditions
Humans drawn to strangely attractive cults | Systems in phase space drawn to strange attractors
Over time, cultists mutate, governed by the warp | Over time, trajectories diverge, governed by the Lyapunov exponent
To resist chaos, the Imperium of Man demands strict spiritual control | To resist chaos, the KAM Theorem demands strict mathematical conditions
Inspires nerds to paint detailed miniatures | Inspires nerds to stick pendulums together
Fantasy version with confusing relation to the original | Quantum version with confusing relation to the original
Lots of cool gothic art | Pretty fractals

Jordan EllenbergThe Greatest Astro/Phillie

The time is here. The lesser contenders have been dispatched (it seemed like it was gonna be the Mets’ year, didn’t it?) and we have our World Series matchup; the seemingly unstoppable Astros, who just keep pennanting and pennanting and are so far without a loss this postseason, and the scrappy Phillies, third in the NL East this year but hot at the right time.

And that brings us to our annual question: who was the greatest ever Astro/Phillie? Our methodology, as usual — pull up the top 200 career WARs for each team on Stathead (the best subscription on the Internet, if you like this kind of thing) and find the player who maximizes (WAR with Astros) times (WAR with Phillies).

This year it’s not even close: Roy Oswalt, the Astros’ career leader in WAR from a pitcher. I’d forgotten that after most of a career in Houston, he went to Philadelphia halfway through 2010 and was terrific, helping the Phils get to the NLCS. He pitched just a year and a half in Philadelphia but that, combined with his record in Houston, is enough to give him the title by a long ways.

But if you really dislike the imbalance here, you can instead rank players by min(WAR with Astros, WAR with Phillies) and get a different greatest Astro/Phillie: Turk Farrell, who started and ended his career a Phillie with six years of Houston in between, usually good, never great. He was an all-star once as a Phillie and four times as an Astro, though two of those times were in 1962. Why were there two All-Star Games in 1962? No idea. Mysteries of baseball.

October 27, 2022

n-Category Café Booleans, Natural Numbers, Young Diagrams, Schur Functors

There’s an adjunction between commutative monoids and pointed sets, which gives a comonad. Then:

Take the booleans, apply the comonad and get the natural numbers.

Take the natural numbers, apply the comonad and get Young diagrams.

Take the Young diagrams, apply the comonad and get Schur functors.

Let me explain how this works!

There’s an adjunction between commutative monoids and pointed sets. Any commutative monoid (M,+,0) has an underlying pointed set (M,0), so we get a functor

U : \mathsf{CommMon} \to \mathsf{Set}_\ast

from commutative monoids to pointed sets. And this has a left adjoint

F : \mathsf{Set}_\ast \to \mathsf{CommMon}

This sends any pointed set (S,\ast) to the free commutative monoid on S modulo the congruence relation that forces \ast to be the identity. And that’s naturally isomorphic to the free commutative monoid on the set S - \{\ast\}.

So, we get a comonad

F U : \mathsf{CommMon} \to \mathsf{CommMon}

What happens if we start with our favorite 2-element commutative monoid, and repeatedly apply this comonad?

My favorite 2-element commutative monoid is the booleans B = \{0,1\} made into a commutative monoid using ‘or’. Its identity element is 0.

If we take (B, \mathrm{or}, 0) and apply the functor

U : \mathsf{CommMon} \to \mathsf{Set}_\ast

we get the 2-element pointed set (B,0). When we apply the functor

F : \mathsf{Set}_\ast \to \mathsf{CommMon}

to this 2-element pointed set we get \mathbb{N}, made into a commutative monoid using addition. The reason is that \mathbb{N} is also the free commutative monoid on the 1-element set B - \{0\}.

If we apply the functor U to (\mathbb{N}, +, 0) we get the pointed set (\mathbb{N},0). When we apply the functor F to the pointed set (\mathbb{N},0) we get a commutative monoid that’s also the free commutative monoid on the set \mathbb{N} - \{0\} = \{1,2,3,\dots\}. This is usually called the set of Young diagrams, since a typical element looks like

3 + 2 + 2 + 2 + 1

so it can be drawn like this:

(I’m counting the number of boxes in columns. We can also use the other convention, where we count the number of boxes in rows. That’s actually more common.)

Note that there is an ‘empty Young diagram’ with no boxes at all, and that’s the identity element of the free commutative monoid on \{1,2,3,\dots\}. But there aren’t Young diagrams with a whole bunch of 0-box columns, which is why I prefer the free commutative monoid on \{1,2,3,\dots\} to the free commutative monoid on \mathbb{N}.

Let (Y,+,0) be the commutative monoid of Young diagrams, where 0 is the empty Young diagram — the one with no boxes at all. Applying U to this we get the pointed set of Young diagrams, (Y,0). Applying F to that we get a commutative monoid F(Y,0) that’s also the free commutative monoid on the set of nonempty Young diagrams.

And this commutative monoid F(Y,0) is important in representation theory! The category \mathsf{Schur} of Schur functors has Young diagrams as its simple objects. But a general object is a finite direct sum of simple objects. So, the set of isomorphism classes of Schur functors is naturally isomorphic to F(Y,0). And they are isomorphic as commutative monoids, where we use direct sums of Schur functors to get a monoid structure.
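If a concrete model helps, here is a small Python sketch (my own choice of representation, not from the post): take a commutative monoid’s non-identity elements as generators and represent an element of the free commutative monoid on them as a sorted tuple, i.e. a finite multiset. Applying this F U construction to the booleans gives multisets of a single generator, i.e. natural numbers, and applying it again gives multisets of positive integers, i.e. Young diagrams.

# Model F U: a commutative monoid's underlying pointed set has basepoint 0
# (its identity); F builds the free commutative monoid on the non-basepoint
# elements, whose elements we represent as sorted tuples (finite multisets).
from itertools import combinations_with_replacement

def FU_elements(monoid_elements, identity, max_size):
    """Some elements of F U applied to a commutative monoid: finite multisets
    (here: sorted tuples) of non-identity elements, up to a given size."""
    generators = sorted(x for x in monoid_elements if x != identity)
    for size in range(max_size + 1):
        yield from combinations_with_replacement(generators, size)

booleans = [0, 1]                       # ({0,1}, or, 0)
naturals = list(FU_elements(booleans, identity=0, max_size=4))
print(naturals)      # (), (1,), (1,1), ... : multisets of 1s, i.e. natural numbers

# Apply F U again, now to a finite chunk of the natural numbers (N, +, 0):
young = list(FU_elements(range(5), identity=0, max_size=3))
print(young[:12])    # sorted tuples of positive integers, i.e. Young diagrams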

So here is the slogan:

Take the booleans, apply the comonad F U and get the commutative monoid of natural numbers.

Take the natural numbers, apply the comonad F U and get the commutative monoid of Young diagrams.

Take the Young diagrams, apply the comonad F U and get the commutative monoid of isomorphism classes of Schur functors.

Of course I want to do it again, but I’m not sure where the resulting structure shows up in math. Maybe some sort of categorified representation theory?

October 26, 2022

Doug NatelsonRice University Academy of Fellows postdoc opportunity, 2023

As I have posted in previous years, Rice has a university-wide endowed honorific postdoctoral program called the Rice Academy of Fellows.   Like all such things, it's very competitive. The new application listing has gone live here with a deadline of January 4, 2023. Applicants have to have a faculty mentor, so in case someone is interested in working with me on this, please contact me via email. We've got some fun, exciting stuff going on!

October 21, 2022

Terence TaoA Bayesian probability worksheet

This is a spinoff from the previous post. In that post, we remarked that whenever one receives a new piece of information {E}, the prior odds {\mathop{\bf P}( H_1 ) / \mathop{\bf P}( H_0 )} between an alternative hypothesis {H_1} and a null hypothesis {H_0} is updated to a posterior odds {\mathop{\bf P}( H_1|E ) / \mathop{\bf P}( H_0|E )}, which can be computed via Bayes’ theorem by the formula

\displaystyle  \frac{\mathop{\bf P}( H_1|E )}{\mathop{\bf P}(H_0|E)} = \frac{\mathop{\bf P}(H_1)}{\mathop{\bf P}(H_0)} \times \frac{\mathop{\bf P}(E|H_1)}{\mathop{\bf P}(E|H_0)}

where {\mathop{\bf P}(E|H_1)} is the likelihood of this information {E} under the alternative hypothesis {H_1}, and {\mathop{\bf P}(E|H_0)} is the likelihood of this information {E} under the null hypothesis {H_0}. If there are no other hypotheses under consideration, then the two posterior probabilities {\mathop{\bf P}( H_1|E )}, {\mathop{\bf P}( H_0|E )} must add up to one, and so can be recovered from the posterior odds {o := \frac{\mathop{\bf P}( H_1|E )}{\mathop{\bf P}(H_0|E)}} by the formulae

\displaystyle  \mathop{\bf P}(H_1|E) = \frac{o}{1+o}; \quad \mathop{\bf P}(H_0|E) = \frac{1}{1+o}.

This gives a straightforward way to update one’s prior probabilities, and I thought I would present it in the form of a worksheet for ease of calculation:

A PDF version of the worksheet and instructions can be found here. One can fill in this worksheet in the following order:

  1. In Box 1, one enters in the precise statement of the null hypothesis {H_0}.
  2. In Box 2, one enters in the precise statement of the alternative hypothesis {H_1}. (This step is very important! As discussed in the previous post, Bayesian calculations can become extremely inaccurate if the alternative hypothesis is vague.)
  3. In Box 3, one enters in the prior probability {\mathop{\bf P}(H_0)} (or the best estimate thereof) of the null hypothesis {H_0}.
  4. In Box 4, one enters in the prior probability {\mathop{\bf P}(H_1)} (or the best estimate thereof) of the alternative hypothesis {H_1}. If only two hypotheses are being considered, we of course have {\mathop{\bf P}(H_1) = 1 - \mathop{\bf P}(H_0)}.
  5. In Box 5, one enters in the ratio {\mathop{\bf P}(H_1)/\mathop{\bf P}(H_0)} between Box 4 and Box 3.
  6. In Box 6, one enters in the precise new information {E} that one has acquired since the prior state. (As discussed in the previous post, it is important that all relevant information {E} – both supporting and invalidating the alternative hypothesis – is reported accurately. If one cannot be certain that key information has not been withheld from you, then Bayesian calculations become highly unreliable.)
  7. In Box 7, one enters in the likelihood {\mathop{\bf P}(E|H_0)} (or the best estimate thereof) of the new information {E} under the null hypothesis {H_0}.
  8. In Box 8, one enters in the likelihood {\mathop{\bf P}(E|H_1)} (or the best estimate thereof) of the new information {E} under the alternative hypothesis {H_1}. (This can be difficult to compute, particularly if {H_1} is not specified precisely.)
  9. In Box 9, one enters in the ratio {\mathop{\bf P}(E|H_1)/\mathop{\bf P}(E|H_0)} between Box 8 and Box 7.
  10. In Box 10, one enters in the product of Box 5 and Box 9.
  11. (Assuming there are no other hypotheses than {H_0} and {H_1}) In Box 11, enter in {1} divided by {1} plus Box 10.
  12. (Assuming there are no other hypotheses than {H_0} and {H_1}) In Box 12, enter in Box 10 divided by {1} plus Box 10. (Alternatively, one can enter in {1} minus Box 11.)

To illustrate this procedure, let us consider a standard Bayesian update problem. Suppose that at a given point in time, {2\%} of the population is infected with COVID-19. In response to this, a company mandates COVID-19 testing of its workforce, using a cheap COVID-19 test. This test has a {20\%} chance of a false negative (testing negative when one has COVID) and a {5\%} chance of a false positive (testing positive when one does not have COVID). An employee {X} takes the mandatory test, which turns out to be positive. What is the probability that {X} actually has COVID?

We can fill out the entries in the worksheet one at a time:

  • Box 1: The null hypothesis {H_0} is that {X} does not have COVID.
  • Box 2: The alternative hypothesis {H_1} is that {X} does have COVID.
  • Box 3: In the absence of any better information, the prior probability {\mathop{\bf P}(H_0)} of the null hypothesis is {98\%}, or {0.98}.
  • Box 4: Similarly, the prior probability {\mathop{\bf P}(H_1)} of the alternative hypothesis is {2\%}, or {0.02}.
  • Box 5: The prior odds {\mathop{\bf P}(H_1)/\mathop{\bf P}(H_0)} are {0.02/0.98 \approx 0.02}.
  • Box 6: The new information {E} is that {X} has tested positive for COVID.
  • Box 7: The likelihood {\mathop{\bf P}(E|H_0)} of {E} under the null hypothesis is {5\%}, or {0.05} (the false positive rate).
  • Box 8: The likelihood {\mathop{\bf P}(E|H_1)} of {E} under the alternative is {80\%}, or {0.8} (one minus the false negative rate).
  • Box 9: The likelihood ratio {\mathop{\bf P}(E|H_1)/\mathop{\bf P}(E|H_0)} is {0.8 / 0.05 = 16}.
  • Box 10: The product of Box 5 and Box 9 is approximately {0.32}.
  • Box 11: The posterior probability {\mathop{\bf P}(H_0|E)} is approximately {1/(1+0.32) \approx 75\%}.
  • Box 12: The posterior probability {\mathop{\bf P}(H_1|E)} is approximately {0.32/(1+0.32) \approx 25\%}.

The filled worksheet looks like this:

Perhaps surprisingly, despite the positive COVID test, the employee {X} only has a {25\%} chance of actually having COVID! This is due to the relatively large false positive rate of this cheap test, and is an illustration of the base rate fallacy in statistics.
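If you prefer code to boxes, the worksheet is only a few lines; here is a sketch (the function name and layout are mine) that reproduces the COVID example:

def bayes_update(prior_h0, prior_h1, likelihood_e_given_h0, likelihood_e_given_h1):
    """Boxes 5, 9, 10, 11, 12 of the worksheet: returns the posterior
    probabilities of H0 and H1 given the new information E."""
    prior_odds = prior_h1 / prior_h0                                    # Box 5
    likelihood_ratio = likelihood_e_given_h1 / likelihood_e_given_h0    # Box 9
    posterior_odds = prior_odds * likelihood_ratio                      # Box 10
    return 1 / (1 + posterior_odds), posterior_odds / (1 + posterior_odds)  # Boxes 11, 12

# COVID testing example: 2% prevalence, 5% false positives, 20% false negatives.
p_h0, p_h1 = bayes_update(0.98, 0.02, 0.05, 0.80)
print(f"P(no COVID | positive test) ≈ {p_h0:.2f}")   # ≈ 0.75
print(f"P(COVID | positive test)    ≈ {p_h1:.2f}")   # ≈ 0.25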

We remark that if we switch the roles of the null hypothesis and alternative hypothesis, then some of the odds in the worksheet change, but the ultimate conclusions remain unchanged:

So the question of which hypothesis to designate as the null hypothesis and which one to designate as the alternative hypothesis is largely a matter of convention.

Now let us take a superficially similar situation in which a mother observes her daughter exhibiting COVID-like symptoms, to the point where she estimates the probability of her daughter having COVID at {50\%}. She then administers the same cheap COVID-19 test as before, which returns positive. What is the posterior probability of her daughter having COVID?

One can fill out the worksheet much as before, but now with the prior probability of the alternative hypothesis raised from {2\%} to {50\%} (and the prior probability of the null hypothesis dropping from {98\%} to {50\%}). One now gets that the probability that the daughter has COVID has increased all the way to {94\%}:

Thus we see that prior probabilities can make a significant impact on the posterior probabilities.

Now we use the worksheet to analyze an infamous probability puzzle, the Monty Hall problem. Let us use the formulation given in that Wikipedia page:

Problem 1 Suppose you’re on a game show, and you’re given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows what’s behind the doors, opens another door, say No. 3, which has a goat. He then says to you, “Do you want to pick door No. 2?” Is it to your advantage to switch your choice?

For this problem, the precise formulation of the null hypothesis and the alternative hypothesis become rather important. Suppose we take the following two hypotheses:

  • Null hypothesis {H_0}: The car is behind door number 1, and no matter what door you pick, the host will randomly reveal another door that contains a goat.
  • Alternative hypothesis {H_1}: The car is behind door number 2 or 3, and no matter what door you pick, the host will randomly reveal another door that contains a goat.
Assuming the prizes are distributed randomly, we have {\mathop{\bf P}(H_0)=1/3} and {\mathop{\bf P}(H_1)=2/3}. The new information {E} is that, after door 1 is selected, door 3 is revealed and shown to be a goat. After some thought, we conclude that {\mathop{\bf P}(E|H_0)} is equal to {1/2} (the host has a fifty-fifty chance of revealing door 3 instead of door 2) but that {\mathop{\bf P}(E|H_1)} is also equal to {1/2} (if the car is behind door 2, the host must reveal door 3, whereas if the car is behind door 3, the host cannot reveal door 3). Filling in the worksheet, we see that the new information does not in fact alter the odds, and the probability that the car is not behind door 1 remains at 2/3, so it is advantageous to switch.
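A brute-force simulation (my own sketch, using this first pair of hypotheses, where the host always reveals a random goat door other than your pick) gives the same answer once we condition on door 3 being revealed:

import random

wins_stay = wins_switch = trials = 0
for _ in range(100_000):
    car = random.randint(1, 3)
    pick = 1                                   # you always pick door 1
    # Host randomly reveals a goat door other than your pick
    host = random.choice([d for d in (1, 2, 3) if d != pick and d != car])
    if host != 3:
        continue                               # condition on the observed information E
    trials += 1
    wins_stay += (car == pick)
    wins_switch += (car == 2)                  # switching means taking door 2
print(wins_stay / trials, wins_switch / trials)   # ≈ 1/3 and ≈ 2/3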

However, consider the following different set of hypotheses:

  • Null hypothesis {H'_0}: The car is behind door number 1, and if you pick the door with the car, the host will reveal another door to entice you to switch. Otherwise, the host will not reveal a door.
  • Alternative hypothesis {H'_1}: The car is behind door number 2 or 3, and if you pick the door with the car, the host will reveal another door to entice you to switch. Otherwise, the host will not reveal a door.

Here we still have {\mathop{\bf P}(H'_0)=1/3} and {\mathop{\bf P}(H'_1)=2/3}, but while {\mathop{\bf P}(E|H'_0)} remains equal to {1/2}, {\mathop{\bf P}(E|H'_1)} has dropped to zero (since if the car is not behind door 1, the host will not reveal a door). So now {\mathop{\bf P}(H'_0|E)} has increased all the way to {1}, and it is not advantageous to switch! This dramatically illustrates the importance of specifying the hypotheses precisely. The worksheet is now filled out as follows:

Finally, we consider another famous probability puzzle, the Sleeping Beauty problem. Again we quote the problem as formulated on the Wikipedia page:

Problem 2 Sleeping Beauty volunteers to undergo the following experiment and is told all of the following details: On Sunday she will be put to sleep. Once or twice, during the experiment, Sleeping Beauty will be awakened, interviewed, and put back to sleep with an amnesia-inducing drug that makes her forget that awakening. A fair coin will be tossed to determine which experimental procedure to undertake:
  • If the coin comes up heads, Sleeping Beauty will be awakened and interviewed on Monday only.
  • If the coin comes up tails, she will be awakened and interviewed on Monday and Tuesday.
  • In either case, she will be awakened on Wednesday without interview and the experiment ends.
Any time Sleeping Beauty is awakened and interviewed she will not be able to tell which day it is or whether she has been awakened before. During the interview Sleeping Beauty is asked: “What is your credence now for the proposition that the coin landed heads?”

Here the situation can be confusing because there are key portions of this experiment in which the observer is unconscious, but nevertheless Bayesian probability continues to operate regardless of whether the observer is conscious. To make this issue more precise, let us assume that the awakenings mentioned in the problem always occur at 8am, so in particular at 7am, Sleeping Beauty will always be unconscious.

Here, the null and alternative hypotheses are easy to state precisely:

  • Null hypothesis {H_0}: The coin landed tails.
  • Alternative hypothesis {H_1}: The coin landed heads.

The subtle thing here is to work out what the correct prior state is (in most other applications of Bayesian probability, this state is obvious from the problem). It turns out that the most reasonable choice of prior state is “unconscious at 7am, on either Monday or Tuesday, with an equal chance of each”. (Note that whatever the outcome of the coin flip is, Sleeping Beauty will be unconscious at 7am Monday and unconscious again at 7am Tuesday, so it makes sense to give each of these two states an equal probability.) The new information is then

  • New information {E}: One hour after the prior state, Sleeping Beauty is awakened.

With this formulation, we see that {\mathop{\bf P}(H_0)=\mathop{\bf P}(H_1)=1/2}, {\mathop{\bf P}(E|H_0)=1}, and {\mathop{\bf P}(E|H_1)=1/2}, so on working through the worksheet one eventually arrives at {\mathop{\bf P}(H_1|E)=1/3}, so that Sleeping Beauty should only assign a probability of {1/3} to the event that the coin landed as heads.
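In worksheet terms the computation is just the following (a quick sketch of the same arithmetic):

prior_odds = 0.5 / 0.5                  # P(H1)/P(H0): heads vs tails
likelihood_ratio = 0.5 / 1.0            # P(awakened | heads) / P(awakened | tails)
posterior_odds = prior_odds * likelihood_ratio
print(posterior_odds / (1 + posterior_odds))    # P(heads | awakened) = 1/3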

There are arguments advanced in the literature to adopt the position that {\mathop{\bf P}(H_1|E)} should instead be equal to {1/2}, but I do not see a way to interpret them in this Bayesian framework without either substantially altering the notion of the prior state or failing to present the new information {E} properly.

If one has multiple pieces of information {E_1, E_2, \dots} that one wishes to use to update one’s priors, one can do so by filling out one copy of the worksheet for each new piece of information, or by using a multi-row version of the worksheet using such identities as

\displaystyle  \frac{\mathop{\bf P}( H_1|E_1,E_2 )}{\mathop{\bf P}(H_0|E_1,E_2)} = \frac{\mathop{\bf P}(H_1)}{\mathop{\bf P}(H_0)} \times \frac{\mathop{\bf P}(E_1|H_1)}{\mathop{\bf P}(E_1|H_0)} \times \frac{\mathop{\bf P}(E_2|H_1,E_1)}{\mathop{\bf P}(E_2|H_0,E_1)}.

We leave the details of these variants of the Bayesian update problem to the interested reader. The only thing I will note though is that if a key piece of information {E} is withheld from the person filling out the worksheet, for instance if that person relies exclusively on a news source that only reports information that supports the alternative hypothesis {H_1} and omits information that debunks it, then the outcome of the worksheet is likely to be highly inaccurate, and one should only perform a Bayesian analysis when one has a high confidence that all relevant information (both favorable and unfavorable to the alternative hypothesis) is being reported to the user.

Terence TaoUpcoming workshop on “Machine assisted proofs” at IPAM

Just a short post to advertise the workshop “Machine assisted proofs” that will be held on Feb 13-17 next year, here at the Institute for Pure and Applied Mathematics (IPAM); I am one of the organizers of this event together with Erika Abraham, Jeremy Avigad, Kevin Buzzard, Jordan Ellenberg, Tim Gowers, and Marijn Heule. The purpose of this event is to bring together experts in the various types of formal computer-assisted methods used to verify, discover, or otherwise assist with mathematical proofs, as well as pure mathematicians who are interested in learning about the current and future state of the art with such tools; this seems to be an opportune time to bring these communities together, given some recent high-profile applications of formal methods in pure mathematics (e.g., the liquid tensor experiment). The workshop will consist of a number of lectures from both communities, as well as a panel to discuss future directions. The workshop is open to general participants (both in person and remotely), although there is a registration process and a moderate registration fee to cover costs and to restrict the capacity to genuine applicants.

October 20, 2022

Peter Rohde Terminally Quantum podcast series

Together with Alexandra Dickie we’re pleased to announce our new quantum podcast series Terminally Quantum, hosted at The Quantum Terminal in Sydney, Australia. Our first episode, featuring Prof Peter Turner, CEO of the Sydney Quantum Academy, is now available on Spotify and Apple Podcasts.

The post Terminally Quantum podcast series appeared first on Peter Rohde.

Matt Strassler An Extraordinarily Productive Visit to Fermilab

Fermilab’s main building at sunset. [Credit: M. Strassler]

This has been an exceptional few days, and I’ve had no time to breathe, much less blog. In pre-covid days, visits to the laboratories at CERN or Fermilab were always jam-packed with meetings, both planned and spontaneous, and with professional talks by experts visiting the labs. But many things changed during the pandemic. The vitality of labs like Fermilab and CERN depends on their many visitors, and so it is going to take time for them to recover from the isolation and work-from-home culture that covid-19 imposed on them.

My visit, organized by the LHC Physics Center [LPC], the US organizing center for the CMS experiment at the Large Hadron Collider [LHC], is my first trip to Fermilab since before 2020. I feared finding a skeleton crew, with many people working from home, and far fewer people traveling to Fermilab from other institutions. There is some truth in that; the place is quieter than it was pre-2020. But nevertheless, the quality of the Fermilab staff and the visitors passing through has not declined. It is fair to say that in every meeting I’ve had and every presentation I have attended — and yesterday I started at 7:30 and ended at 4 without a single break — I have learned something new and important.

Today I’ll just give you a flavor of what I’ve learned; each one of these topics deserves a blog post all its own.

  • One Fermilab postdoc explained a new and very powerful technique for looking for long-lived particles at CMS, using parts of the CMS detector in a novel, creative way. Because it’s possible that the Higgs boson (or top quark, Z boson, W boson, bottom quark, or some unknown particle) can sometimes decay to a long-lived particle, which travels a macroscopic distance before decaying to a spray of other particles, this is an important scientific target. It’s one the existing LHC experiments weren’t really designed to study, but with a wide range of creative developments, they’ve developed an impressive range of techniques for doing so.
  • Another has a strategy for looking for certain decays of the Higgs boson that would be extremely difficult to find using standard techniques. Specifically, decays in which only hadrons are produced are very difficult to observe; hadrons are so abundant in collisions at the LHC that this is a signal drowned in background. But there is a possible way around this if the Higgs boson is kicked hard enough sideways in the collision.
  • A third is digging very deep into the challenging subject of low-energy muons and electrons. Particles with energy below 5 GeV become increasingly difficult to observe, for a whole host of reasons. But again, there can be decays of the Higgs boson (or other known particles) which would predominantly show themselves in these low-energy, difficult-to-identify muons or electrons. So this is a frontier where new ground needs to be broken.
  • A visiting expert taught me more about the technical meaning of “intrinsic charm”, which was widely over-reported as meaning that “there are charm quarks in the proton”. Understanding precisely what this means is quite subtle, even for a theoretical physics expert, and I’m still not in a position where I can explain it to you properly — though I did discuss a closely related issue carefully. Moreover, he questions whether the story is actually correct — it depends on a claim of statistical errors being small enough, but he has doubts, and some evidence to support his doubts. (The same doubts, incidentally, potentially affect whether the difference of the W boson’s mass from the Standard Model prediction is really as significant as has been claimed.) In my opinion, it is not yet certain that there really is “intrinsic” charm in the proton. You can definitely expect another blog post about this!
  • Another visiting expert pointed out that in some limited but interesting cases, there could be very slowly-moving particles captured not only in the core of the Earth but also floating near its surface, a possible target for underground experiments that are sensitive to extremely low energy collisions of unknown particles with atoms.
  • Then there are the applications of machine learning in particle physics, which are increasingly being used in the complex environment of the LHC to make certain basic techniques of particle physics much more efficient. I heard about several very different examples, at least one of which (involving the identification of jets from bottom quarks) has already proven particularly successful.
  • A visiting CMS experimentalist pointed out to me that in a search through LHC data that she’d been involved in for many years, there are two surprising collisions observed with an extraordinary amount of energy, and very unusual (but similar) characteristics. It’s hard to quantify how unusual they are, but hopefully we will soon hear about a similar search at ATLAS, which could add or subtract weight from this observation. In any case, upcoming data from Run 3 will give us enough information, within a year or two, to see if this hint is actually of something real.
  • If these events aren’t a fluke and represent something real and new, then one of the local theorists at Fermilab is the fellow to talk to; back in 2018, when only one of these events had been observed, he and a couple of others thought through what the options are to explain where it might have come from. The options are unusual and would certainly be surprising to most theorists, but he convinced me that they’re not inconsistent with theoretical reasoning or with other data, so we should keep an open mind.
  • Yet another visiting theorist taught me about the possibility of non-linearities in quantum physics. Steven Weinberg tried to consider this possibility some time ago, but it turned out his approach violated causality; but now, inspired by old ideas of Joe Polchinski, there’s a new proposal to try this in another way. I’m grateful for that 45 minute conversation, at the end of which I felt pretty confident that I understood how the idea works. Now I can go off and think about it. When I understand its implications in some very simple settings (the only way I ever deeply understand anything), I’ll explain it to you.
  • Oh, and on top of this, I gave a talk on Tuesday, about powerful and sweeping strategies for searches in LHC data that haven’t yet been done, but ought to be, in my opinion. My ideas about this are 10-15 years old, but I have stronger arguments now that rely on Open Data. That of course led to a variety of follow-up conversations.


The visit’s not over; I’ve got one more day to try to drink from this fire-hose.

October 18, 2022

Tommaso DorigoKeynote Lecture On End-to-end Optimization At DeepLearn 2022

The DeepLearn school series, now reaching the seventh edition, offers insight into artificial intelligence and applications in a week-long course, tightly packing a significant number of high-profile instructors. The present edition, currently being held at the Technical University of Lulea, in the north of Sweden, features the following:

- Sean Benson, Netherlands Cancer Institute
- Thomas Breuel, Nvidia
- Hao Chen, Hong Kong University of Science and Technology
- Janlin Chen, University of Missouri
- Nadya Chernyavskaya, CERN
- Efstratios Gavves, University of Amsterdam
- Quanquan Gu, University of California Los Angeles

read more

Jacques Distler Fine Structure

I’m teaching the undergraduate Quantum II course (“Atoms and Molecules”) this semester. We’ve come to the point where it’s time to discuss the fine structure of hydrogen. I had previously found this somewhat unsatisfactory. If one wanted to do a proper treatment, one would start with a relativistic theory and take the non-relativistic limit. But we’re not going to introduce the Dirac equation (much less QED). And, in any case, introducing the Dirac equation would get you the leading corrections but fail miserably to get various non-leading corrections (the Lamb shift, the anomalous magnetic moment, …).

Instead, various hand-waving arguments are invoked (“The electron has an intrinsic magnetic moment and since it’s moving in the electrostatic field of the proton, it sees a magnetic field …”) which give you the wrong answer for the spin-orbit coupling (off by a factor of two), which you then have to further correct (“Thomas precession”) and then there’s the Darwin term, with an even more hand-wavy explanation …

So I set about trying to find a better way. I want to use as little input as possible from the relativistic theory and still get the leading relativistic correction(s).

  1. For a spinless particle, the correction amounts to replacing the nonrelativistic kinetic energy by the relativistic expression $\frac{p^2}{2m} \to \sqrt{p^2 c^2 + m^2 c^4} - m c^2 = \frac{p^2}{2m} - \frac{(p^2)^2}{8m^3 c^2}+\dots$
  2. For a spin-1/2 particle, “$\vec{p}$” only appears dotted into the Pauli matrices, $\vec{\sigma}\cdot\vec{p}$.
    • In particular, this tells us how the spin couples to external magnetic fields: $\vec{\sigma}\cdot\vec{p} \to \vec{\sigma}\cdot(\vec{p}-q \vec{A}/c)$.
    • What we previously wrote as “$p^2$” could just as well have been written as $(\vec{\sigma}\cdot\vec{p})^2$.
  3. Parity and time-reversal invariance1 imply only even powers of $\vec{\sigma}\cdot\vec{p}$ appear in the low-velocity expansion.
  4. Shifting the potential energy, $V(\vec{r})\to V(\vec{r})+\text{const}$, should shift $H\to H+\text{const}$.

With those ingredients, at $O(\vec{v}^2/c^2)$ there is a unique term (in addition to the correction to the kinetic energy that we found for a spinless particle) that can be written down for a spin-1/2 particle: \[ H = \frac{p^2}{2m} + V(\vec{r}) - \frac{(p^2)^2}{8m^3 c^2} - \frac{c_1}{m^2 c^2} [\vec{\sigma}\cdot\vec{p},[\vec{\sigma}\cdot\vec{p},V(\vec{r})]] \] Expanding this out a bit, \[ [\vec{\sigma}\cdot\vec{p},[\vec{\sigma}\cdot\vec{p},V]] = (p^2 V + V p^2) - 2\, \vec{\sigma}\cdot\vec{p}\, V\, \vec{\sigma}\cdot\vec{p} \] Both terms are separately Hermitian, but condition (4) fixes their relative coefficient.
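
(A quick check I have added, not part of the original argument: the expansion of the double commutator is a purely algebraic identity, which one can verify with noncommutative symbols, writing $a$ as a stand-in for $\vec{\sigma}\cdot\vec{p}$.)

```python
import sympy as sp

# Noncommutative placeholders: a stands for sigma.p, V for the potential.
a, V = sp.symbols('a V', commutative=False)

comm = lambda x, y: x*y - y*x        # commutator [x, y]
double_comm = comm(a, comm(a, V))    # [a, [a, V]]

# [a, [a, V]] = a^2 V + V a^2 - 2 a V a   (prints 0)
print(sp.expand(double_comm - (a**2*V + V*a**2 - 2*a*V*a)))
```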

Expanding this out yet further (and letting $\vec{S}=\tfrac{\hbar}{2}\vec{\sigma}$), \[ -\frac{c_1}{m^2 c^2} [\vec{\sigma}\cdot\vec{p},[\vec{\sigma}\cdot\vec{p},V]] = \frac{4c_1}{m^2 c^2} (\vec{\nabla}(V)\times \vec{p})\cdot\vec{S} + \frac{c_1\hbar^2}{m^2 c^2} \nabla^2(V) \]

For a central force potential, $\vec{\nabla}(V) = \vec{r}\,\frac{1}{r}\frac{d V}{d r}$, and the first term is the spin-orbit coupling, $\frac{4c_1}{m^2 c^2} \frac{1}{r}\frac{d V}{d r}\vec{L}\cdot\vec{S}$. The second term is the Darwin term. While I haven’t fixed the overall coefficient ($c_1=1/8$), I got the form of the spin-orbit coupling and of the Darwin term correct, and I fixed their relative coefficient (correctly!).

No hand-wavy hocus-pocus was required.

And I did learn something that I never knew before, namely that this correction can be succinctly written as a double-commutator $[\vec{\sigma}\cdot\vec{p},[\vec{\sigma}\cdot\vec{p},V]]$. I don’t think I’ve ever seen that before …


1 On the Hilbert space $\mathcal{H}=L^2(\mathbb{R}^3)\otimes \mathbb{C}^2$, time-reversal is implemented as the anti-unitary operator \[ \Omega_T: \begin{pmatrix}f_1(\vec{r})\\ f_2(\vec{r})\end{pmatrix} \mapsto \begin{pmatrix}-f^*_2(\vec{r})\\ f^*_1(\vec{r})\end{pmatrix} \] and parity is implemented as the unitary operator \[ U_P: \begin{pmatrix}f_1(\vec{r})\\ f_2(\vec{r})\end{pmatrix} \mapsto \begin{pmatrix}f_1(-\vec{r})\\ f_2(-\vec{r})\end{pmatrix} \] These obviously satisfy \[ \begin{aligned} \Omega_T \vec{\sigma} \Omega_T^{-1} &= -\vec{\sigma},\quad& U_P \vec{\sigma} U_P &= \vec{\sigma}\\ \Omega_T \vec{p} \Omega_T^{-1} &= -\vec{p},\quad& U_P \vec{p} U_P &= -\vec{p} \end{aligned} \]

October 17, 2022

Doug NatelsonMaterials labs of the future + cost

The NSF Division of Materials Research has been soliciting input from the community about both the biggest outstanding problems in condensed matter and materials science, and the future of materials labs - what kind of infrastructure, training, etc. will be needed to address those big problems.  In thinking about this, I want to throw out a stretch idea.  

I think it would have transformative impact on materials research and workforce development if there were fabrication and characterization tools that offered great performance at far lower prices than currently possible.  I'd mentioned the idea of developing a super-cheap SEM a while ago. I definitely worry that we are approaching a funding situation where the separation between top universities and everyone else will continue to widen rapidly.  The model of a network of user facilities seems to be how things have been trending (e.g. go to Harvard and use their high-res TEM, if your institution can't afford one).  However, if we really want to move the needle on access and training for a large, highly diverse workforce, it would be incredible to find a way to bring more capabilities to the broadest sweep of universities.   Maybe it's worth thinking hard about what could be possible to radically reduce hardware costs for the suite of materials characterization techniques that would be most important.


October 16, 2022

n-Category Café Partition Function as Cardinality

In classical statistical mechanics we often think about sets where each point has a number called its ‘energy’. Then the ‘partition function’ counts the set’s points — but points with large energy count for less! And the amount each point gets counted depends on the temperature.

So, the partition function is a generalization of the cardinality $|X|$ that works for sets $X$ equipped with a function $E\colon X \to \mathbb{R}$. I’ve been talking with Tom Leinster about this lately, so let me say a bit more about how it works.

Say $X$ is a set where each point $i$ has a number $E_i \in \mathbb{R}$. Following the physicists, I’ll call this number the point’s energy.

The partition function is

\[ Z = \sum_{i \in X} e^{- E_i/k T} \]

where $T \in \mathbb{R}$ is called temperature and $k$ is a number called Boltzmann’s constant. If you are a mathematician, feel free to set this constant equal to $1$.

So, the partition function counts the points of $X$ — but the idea is that it counts points with large energy for less. Points with energy $E_i \gg k T$ count for very little. But as $T \to \infty$, all points get fully counted and

\[ Z \to |X| \]

So, the partition function is a generalization of the cardinality $|X|$ that works for sets $X$ equipped with a function $E\colon X \to \mathbb{R}$. And it reduces to the cardinality in the high-temperature limit.

Just like the cardinality, the partition function adds when you take disjoint unions, and multiplies when you take products! Let me explain this.

Let’s call a set $X$ with a function $E \colon X \to \mathbb{R}$ an energetic set. I may just call it $X$, and you need to remember it has this function. I’ll call its partition function $Z(X)$.

How does the partition function work for the disjoint union or product of energetic sets?

The disjoint union $X+X'$ of energetic sets $E\colon X \to \mathbb{R}$ and $E' \colon X' \to \mathbb{R}$ is again an energetic set: for points in $X$ we use the energy function $E$, while for points in $X'$ we use the function $E'$. And we can show that

\[ Z(X+X') = Z(X) + Z(X') \]

Just like cardinality!

The cartesian product $X \times X'$ of energetic sets $E\colon X \to \mathbb{R}$ and $E' \colon X' \to \mathbb{R}$ is again an energetic set: define the energy of a point $(x,x') \in X \times X'$ to be $E(x) + E(x')$. This is how it really works in physics. And we can show that

\[ Z(X \times X') = Z(X)\, Z(X') \]

Just like cardinality!
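
Here is a tiny numerical illustration of these two facts (my own sketch, not part of the original post); the energies and the choice $kT = 1$ are arbitrary:

```python
import math

def Z(energies, kT=1.0):
    """Partition function of a finite energetic set, given its list of energies."""
    return sum(math.exp(-E / kT) for E in energies)

X  = [0.0, 1.0, 2.5]          # an arbitrary energetic set
Xp = [0.3, 0.3, 1.7, 4.0]     # another one

disjoint_union = X + Xp                        # concatenate the energy lists
product = [E + Ep for E in X for Ep in Xp]     # energies add on the product

print(math.isclose(Z(disjoint_union), Z(X) + Z(Xp)))   # True
print(math.isclose(Z(product), Z(X) * Z(Xp)))          # True
```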

If you like category theory, here are some fun things to do:

1) Make up a category of energetic sets.

(Hint: I’m thinking about a slice category.)

2) Show the disjoint union of energetic sets, defined as above, is the coproduct in this category.

3) Show the ‘cartesian product’ of energetic sets, defined as above, is not the product in this category.

4) Show that the ‘cartesian product’ of energetic sets, defined as above, gives a symmetric monoidal structure on the category of energetic sets. So we should really write it as a tensor product $X \otimes X'$, not $X \times X'$.

5) Show the category of energetic sets has colimits and the tensor product distributes over them.

6) Show that the category $\mathbf{FinEn}$ of finite energetic sets has finite colimits and the tensor product distributes over them. So, it is a nice kind of symmetric rig category.

7) Show the partition function defines a map of symmetric rig categories

\[ Z \colon \mathbf{FinEn} \to C^\infty(\mathbb{R}) \]

where $C^\infty(\mathbb{R})$ is the usual ring of smooth real functions on the real line, thought of as a symmetric rig category with only identity morphisms.

Finally, a really nice fact:

8) Show that for finite energetic sets $X$ and $X'$, $X \cong X'$ if and only if $Z(X) = Z(X')$.

(Hint: use the Laplace transform.)

So, the partition function for finite energetic sets acts a lot like the cardinality of finite sets. Like the cardinality of finite sets, it’s a map of symmetric rig categories and a complete invariant. And it reduces to counting as $T \to +\infty$.

We can generalize 6) to certain infinite energetic sets, but then we have to worry about whether this sum converges:

\[ Z = \sum_{i \in X} e^{- E_i/k T} \]

We can also go ahead and consider measure spaces, replacing this sum by an integral. This is very common in physics. But again, we need some conditions if we want these integrals to converge.
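
For instance (my own example, not from the post), take the half-line $[0,\infty)$ with Lebesgue measure and energy $E(x) = x$; then $Z = \int_0^\infty e^{-x/kT}\, dx = kT$, which converges for every $T > 0$ and is easy to check numerically:

```python
from scipy.integrate import quad
import math

kT = 2.0   # arbitrary temperature (times Boltzmann's constant)
Z, err = quad(lambda x: math.exp(-x / kT), 0, math.inf)
print(Z)   # ~2.0, i.e. equal to kT, as the exact integral predicts
```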

October 12, 2022

Tommaso DorigoMaster Thesis Offers At The Crossroads Of Physics And Computer Science

For some reason, my personal web page features high in web searches for master thesis offers. I learned this by asking a few students who had approached me to supervise them remotely on some of the offered topics: where did they get to know about my research activities, and what led them to pick my offers? They all answered that they had bumped into the "thesis offers" section of my web page. Well, at least the time I spent writing it was not wasted.

read more

October 10, 2022

Terence TaoWhat are the odds?

An unusual lottery result made the news recently: on October 1, 2022, the PCSO Grand Lotto in the Philippines, which draws six numbers from {1} to {55} at random, managed to draw the numbers {9, 18, 27, 36, 45, 54} (though the balls were actually drawn in the order {9, 45, 36, 27, 18, 54}). In other words, they drew exactly six multiples of nine from {1} to {55}. In addition, a total of {433} tickets were bought with this winning combination, whose owners then had to split the {236} million peso jackpot (about {4} million USD) among themselves. This raised enough suspicion that there were calls for an inquiry into the Philippine lottery system, including from the minority leader of the Senate.

Whenever an event like this happens, journalists often contact mathematicians to ask the question: “What are the odds of this happening?”, and in fact I myself received one such inquiry this time around. This is a number that is not too difficult to compute – in this case, the probability of the lottery producing the six numbers {9, 18, 27, 36, 45, 54} in some order turns out to be {1} in {\binom{55}{6} = 28,989,675} – and such a number is often dutifully provided to such journalists, who in turn report it as some sort of quantitative demonstration of how remarkable the event was.

But on the previous draw of the same lottery, on September 28, 2022, the unremarkable sequence of numbers {11, 26, 33, 45, 51, 55} were drawn (again in a different order), and no tickets ended up claiming the jackpot. The probability of the lottery producing the six numbers {11, 26, 33, 45, 51, 55} is also {1} in {\binom{55}{6} = 28,989,675} – just as likely or as unlikely as the October 1 numbers {9, 18, 27, 36, 45, 54}. Indeed, the whole point of drawing the numbers randomly is to make each of the {28,989,675} possible outcomes (whether they be “unusual” or “unremarkable”) equally likely. So why is it that the October 1 lottery attracted so much attention, but the September 28 lottery did not?
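
(A small snippet of my own, just to verify the count quoted above and the fact that any particular combination is equally likely:)

```python
from math import comb

n_outcomes = comb(55, 6)   # number of possible six-number draws from 1..55
print(n_outcomes)          # 28989675
print(1 / n_outcomes)      # ~3.45e-8, the same for {9,18,27,36,45,54} and {11,26,33,45,51,55}
```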

Part of the explanation surely lies in the unusually large number ({433}) of lottery winners on October 1, but I will set that aspect of the story aside until the end of this post. The more general points that I want to make with these sorts of situations are:

  1. The question “what are the odds of this happening” is often easy to answer mathematically, but it is not the correct question to ask.
  2. The question “what is the probability that an alternative hypothesis is the truth” is (one of) the correct questions to ask, but is very difficult to answer (it involves both mathematical and non-mathematical considerations).
  3. The answer to the first question is one of the quantities needed to calculate the answer to the second, but it is far from the only such quantity. Most of the other quantities involved cannot be calculated exactly.
  4. However, by making some educated guesses, one can still sometimes get a very rough gauge of which events are “more surprising” than others, in that they would lead to relatively higher answers to the second question.

To explain these points it is convenient to adopt the framework of Bayesian probability. In this framework, one imagines that there are competing hypotheses to explain the world, and that one assigns a probability to each such hypothesis representing one’s belief in the truth of that hypothesis. For simplicity, let us assume that there are just two competing hypotheses to be entertained: the null hypothesis {H_0}, and an alternative hypothesis {H_1}. For instance, in our lottery example, the two hypotheses might be:

  • Null hypothesis {H_0}: The lottery is run in a completely fair and random fashion.
  • Alternative hypothesis {H_1}: The lottery is rigged by some corrupt officials for their personal gain.

At any given point in time, a person would have a probability {{\bf P}(H_0)} assigned to the null hypothesis, and a probability {{\bf P}(H_1)} assigned to the alternative hypothesis; in this simplified model where there are only two hypotheses under consideration, these probabilities must add to one, but of course if there were additional hypotheses beyond these two then this would no longer be the case.

Bayesian probability does not provide a rule for calculating the initial (or prior) probabilities {{\bf P}(H_0)}, {{\bf P}(H_1)} that one starts with; these may depend on the subjective experiences and biases of the person considering the hypothesis. For instance, one person might have quite a bit of prior faith in the lottery system, and assign the probabilities {{\bf P}(H_0) = 0.99} and {{\bf P}(H_1) = 0.01}. Another person might have quite a bit of prior cynicism, and perhaps assign {{\bf P}(H_0)=0.5} and {{\bf P}(H_1)=0.5}. One cannot use purely mathematical arguments to determine which of these two people is “correct” (or whether they are both “wrong”); it depends on subjective factors.

What Bayesian probability does do, however, is provide a rule to update these probabilities {{\bf P}(H_0)}, {{\bf P}(H_1)} in view of new information {E} to provide posterior probabilities {{\bf P}(H_0|E)}, {{\bf P}(H_1|E)}. In our example, the new information {E} would be the fact that the October 1 lottery numbers were {9, 18, 27, 36, 45, 54} (in some order). The update is given by the famous Bayes theorem

\displaystyle  {\bf P}(H_0|E) = \frac{{\bf P}(E|H_0) {\bf P}(H_0)}{{\bf P}(E)}; \quad {\bf P}(H_1|E) = \frac{{\bf P}(E|H_1) {\bf P}(H_1)}{{\bf P}(E)},

where {{\bf P}(E|H_0)} is the probability that the event {E} would have occurred under the null hypothesis {H_0}, and {{\bf P}(E|H_1)} is the probability that the event {E} would have occurred under the alternative hypothesis {H_1}. Let us divide the second equation by the first to cancel the {{\bf P}(E)} denominator, and obtain

\displaystyle  \frac{ {\bf P}(H_1|E) }{ {\bf P}(H_0|E) } = \frac{ {\bf P}(H_1) }{ {\bf P}(H_0) } \times \frac{ {\bf P}(E | H_1)}{{\bf P}(E | H_0)}. \ \ \ \ \ (1)

One can interpret {\frac{ {\bf P}(H_1) }{ {\bf P}(H_0) }} as the prior odds of the alternative hypothesis, and {\frac{ {\bf P}(H_1|E) }{ {\bf P}(H_0|E) } } as the posterior odds of the alternative hypothesis. The identity (1) then says that in order to compute the posterior odds {\frac{ {\bf P}(H_1|E) }{ {\bf P}(H_0|E) }} of the alternative hypothesis in light of the new information {E}, one needs to know three things:
  1. The prior odds {\frac{ {\bf P}(H_1) }{ {\bf P}(H_0) }} of the alternative hypothesis;
  2. The probability {\mathop{\bf P}(E|H_0)} that the event {E} occurs under the null hypothesis {H_0}; and
  3. The probability {\mathop{\bf P}(E|H_1)} that the event {E} occurs under the alternative hypothesis {H_1}.

As previously discussed, the prior odds {\frac{ {\bf P}(H_1) }{ {\bf P}(H_0) }} of the alternative hypothesis are subjective and vary from person to person; in the example earlier, the person with substantial faith in the lottery may only give prior odds of {\frac{0.01}{0.99} \approx 0.01} (99 to 1 against) of the alternative hypothesis, whereas the cynic might give odds of {\frac{0.5}{0.5}=1} (even odds). The probability {{\bf P}(E|H_0)} is the quantity that can often be calculated by straightforward mathematics; as discussed before, in this specific example we have

\displaystyle  \mathop{\bf P}(E|H_0) = \frac{1}{\binom{55}{6}} = \frac{1}{28,989,675}.

But this still leaves one crucial quantity that is unknown: the probability {{\bf P}(E|H_1)}. This is incredibly difficult to compute, because it requires a precise theory for how events would play out under the alternative hypothesis {H_1}, and in particular is very sensitive as to what the alternative hypothesis {H_1} actually is.

For instance, suppose we replace the alternative hypothesis {H_1} by the following very specific (and somewhat bizarre) hypothesis:

  • Alternative hypothesis {H'_1}: The lottery is rigged by a cult that worships the multiples of {9}, and views October 1 as their holiest day. On this day, they will manipulate the lottery to only select those balls that are multiples of {9}.

Under this alternative hypothesis {H'_1}, we have {{\bf P}(E|H'_1)=1}. So, when {E} happens, the odds of this alternative hypothesis {H'_1} will increase by the dramatic factor of {\frac{{\bf P}(E|H'_1)}{{\bf P}(E|H_0)} = 28,989,675}. So, for instance, someone who already was entertaining odds of {\frac{0.01}{0.99}} of this hypothesis {H'_1} would now have these odds multiply dramatically to {\frac{0.01}{0.99} \times 28,989,675 \approx 290,000}, so that the probability of {H'_1} would have jumped from a mere {1\%} to a staggering {99.9997\%}. This is about as strong a shift in belief as one could imagine. However, this hypothesis {H'_1} is so specific and bizarre that one’s prior odds of this hypothesis would be nowhere near as large as {\frac{0.01}{0.99}} (unless substantial prior evidence of this cult and its hold on the lottery system existed, of course). A more realistic prior odds for {H'_1} would be something like {\frac{10^{-10^{10}}}{1-10^{-10^{10}}}} – which is so minuscule that even multiplying it by a factor such as {28,989,675} barely moves the needle.
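
Here is a minimal sketch (mine, not from the original post) of this odds update, i.e. formula (1) applied to the cult hypothesis {H'_1} with the prior odds {\frac{0.01}{0.99}} used above:

```python
from math import comb

# Likelihoods of the observed draw E under each hypothesis
p_E_given_H0 = 1 / comb(55, 6)   # fair lottery: 1 / 28,989,675
p_E_given_H1 = 1.0               # the "multiples-of-9 cult" hypothesis predicts E with certainty

prior_odds = 0.01 / 0.99                                   # prior odds of H'_1
posterior_odds = prior_odds * p_E_given_H1 / p_E_given_H0  # Bayes' theorem in odds form, eq. (1)

posterior_prob = posterior_odds / (1 + posterior_odds)
print(round(posterior_odds), round(posterior_prob, 7))     # ~292,825 and ~0.9999966
```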

Remark 1 The contrast between alternative hypothesis {H_1} and alternative hypothesis {H'_1} illustrates a common demagogical rhetorical technique when an advocate is trying to convince an audience of an alternative hypothesis, namely to use suggestive language (“I’m just asking questions here”) rather than precise statements in order to leave the alternative hypothesis deliberately vague. In particular, the advocate may take advantage of the freedom to use a broad formulation of the hypothesis (such as {H_1}) in order to maximize the audience’s prior odds of the hypothesis, simultaneously with a very specific formulation of the hypothesis (such as {H'_1}) in order to maximize the probability of the actual event {E} occurring under this hypothesis. (A related technique is to be deliberately vague about the hypothesized competency of some suspicious actor, so that this actor could be portrayed as being extraordinarily competent when convenient to do so, while simultaneously being portrayed as extraordinarily incompetent when that instead is the more useful hypothesis.) This can lead to wildly inaccurate Bayesian updates of this vague alternative hypothesis, and so precise formulation of such a hypothesis is important if one is to approach a topic from anything remotely resembling a scientific approach. [EDIT: as pointed out to me by a reader, this technique is a Bayesian analogue of the motte and bailey fallacy.]

At the opposite extreme, consider instead the following hypothesis:

  • Alternative hypothesis {H''_1}: The lottery is rigged by some corrupt officials, who on October 1 decide to randomly determine the winning numbers in advance, share these numbers with their collaborators, and then manipulate the lottery to choose those numbers that they selected.

If these corrupt officials are indeed choosing their predetermined winning numbers randomly, then the probability {{\bf P}(E|H''_1)} would in fact be just the same probability {\frac{1}{\binom{55}{6}} = \frac{1}{28,989,675}} as {{\bf P}(E|H_0)}, and in this case the seemingly unusual event {E} would in fact have no effect on the odds of the alternative hypothesis, because it was just as unlikely for the alternative hypothesis to generate this multiples-of-nine pattern as for the null hypothesis to. In fact, one would imagine that these corrupt officials would avoid “suspicious” numbers, such as the multiples of {9}, and only choose numbers that look random, in which case {{\bf P}(E|H''_1)} would in fact be less than {{\bf P}(E|H_0)} and so the event {E} would actually lower the odds of the alternative hypothesis in this case. (In fact, one can sometimes use this tendency of fraudsters to not generate truly random data as a statistical tool to detect such fraud; violations of Benford’s law for instance can be used in this fashion, though only in situations where the null hypothesis is expected to obey Benford’s law, as discussed in this previous blog post.)

Now let us consider a third alternative hypothesis:

  • Alternative hypothesis {H'''_1}: On October 1, the lottery machine developed a fault and now only selects numbers that exhibit unusual patterns.

Setting aside the question of precisely what faulty mechanism could induce this sort of effect, it is not clear at all how to compute {{\bf P}(E|H'''_1)} in this case. Using the principle of indifference as a crude rule of thumb, one might expect

\displaystyle  {\bf P}(E|H'''_1) \approx \frac{1}{\# \{ \hbox{unusual patterns}\}}

where the denominator is the number of patterns among the possible {\binom{55}{6}} lottery outcomes that are “unusual”. Among such patterns would presumably be the multiples-of-9 pattern {9,18,27,36,45,54}, but one could easily come up with other patterns that are equally “unusual”, such as consecutive strings like {11, 12, 13, 14, 15, 16}, or the first few primes {2, 3, 5, 7, 11, 13}, or the first few squares {1, 4, 9, 16, 25, 36}, and so forth. How many such unusual patterns are there? This is too vague a question to answer with any degree of precision, but as one illustrative statistic, the Online Encyclopedia of Integer Sequences (OEIS) currently hosts about {350,000} sequences. Not all of these would begin with six distinct numbers from {1} to {55}, and several of these sequences might generate the same set of six numbers, but this does suggest that patterns that one would deem to be “unusual” could number in the thousands, tens of thousands, or more. Using this guess, we would then expect the event {E} to boost the odds of this hypothesis {H'''_1} by perhaps a thousandfold or so, which is moderately impressive. But subsequent information can counteract this effect. For instance, on October 3, the same lottery produced the numbers {8, 10, 12, 14, 26, 51}, which exhibit no unusual properties (no search results in the OEIS, for instance); if we denote this event by {E'}, then we have {{\bf P}(E'|H'''_1) \approx 0} and so this new information {E'} should drive the odds for this alternative hypothesis {H'''_1} way down again.

Remark 2 This example demonstrates another demagogical rhetorical technique that one sometimes sees (particularly in political or other emotionally charged contexts), which is to cherry-pick the information presented to their audience by informing them of events {E} which have a relatively high probability of occurring under their alternative hypothesis, but withholding information about other relevant events {E'} that have a relatively low probability of occurring under their alternative hypothesis. When confronted with such new information {E'}, a common defense of a demagogue is to modify the alternative hypothesis {H_1} to a more specific hypothesis {H'_1} that can “explain” this information {E'} (“Oh, clearly we heard about {E'} because the conspiracy in fact extends to the additional organizations {X, Y, Z} that reported {E'}”), taking advantage of the vagueness discussed in Remark 1.

Let us consider a superficially similar hypothesis:

  • Alternative hypothesis {H''''_1}: On October 1, a divine being decided to send a sign to humanity by placing an unusual pattern in a lottery.

Here we (literally) stay agnostic on the prior odds of this hypothesis, and do not address the theological question of why a divine being should choose to use the medium of a lottery to send their signs. At first glance, the probability {{\bf P}(E|H''''_1)} here should be similar to the probability {{\bf P}(E|H'''_1)}, and so perhaps one could use this event {E} to improve the odds of the existence of a divine being by a factor of a thousand or so. But note carefully that the hypothesis {H''''_1} did not specify which lottery the divine being chose to use. The PCSO Grand Lotto is just one of a dozen lotteries run by the Philippine Charity Sweepstakes Office (PCSO), and of course there are over a hundred other countries and thousands of states within these countries, each of which often run their own lotteries. Taking into account these thousands or tens of thousands of additional lotteries to choose from, the probability {{\bf P}(E|H''''_1)} now drops by several orders of magnitude, and is now basically comparable to the probability {{\bf P}(E|H_0)} coming from the null hypothesis. As such one does not expect the event {E} to have a significant impact on the odds of the hypothesis {H''''_1}, despite the small-looking nature {\frac{1}{28,989,675}} of the probability {{\bf P}(E|H_0)}.

In summary, we have failed to locate any alternative hypothesis {H_1} which

  1. Has some non-negligible prior odds of being true (and in particular is not excessively specific, as with hypothesis {H'_1});
  2. Has a significantly higher probability of producing the specific event {E} than the null hypothesis; AND
  3. Does not struggle to also produce other events {E'} that have since been observed.
One needs all three of these factors to be present in order to significantly weaken the plausibility of the null hypothesis {H_0}; in the absence of these three factors, a moderately small numerical value of {{\bf P}(E|H_0)}, such as {\frac{1}{28,989,675}} does not actually do much to affect this plausibility. In this case one needs to lay out a reasonably precise alternative hypothesis {H_1} and make some actual educated guesses towards the competing probability {{\bf P}(E|H_1)} before one can lead to further conclusions. However, if {{\bf P}(E|H_0)} is insanely small, e.g., less than {10^{-1000}}, then the possibility of a previously overlooked alternative hypothesis {H_1} becomes far more plausible; as per the famous quote of Arthur Conan Doyle’s Sherlock Holmes, “When you have eliminated all which is impossible, then whatever remains, however improbable, must be the truth.”

We now return to the fact that for this specific October 1 lottery, there were {433} tickets that managed to select the winning numbers. Let us call this event {F}. In view of this additional information, we should now consider the ratio of the probabilities {{\bf P}(E \& F|H_1)} and {{\bf P}(E \& F|H_0)}, rather than the ratio of the probabilities {{\bf P}(E|H_1)} and {{\bf P}(E|H_0)}. If we augment the null hypothesis to

  • Null hypothesis {H'_0}: The lottery is run in a completely fair and random fashion, and the purchasers of lottery tickets also select their numbers in a completely random fashion.

Then {{\bf P}(E \& F|H'_0)} is indeed of the “insanely improbable” category mentioned previously. I was not able to get official numbers on how many tickets are purchased per lottery, but let us say for the sake of argument that it is 1 million (the conclusion will not be extremely sensitive to this choice). Then the expected number of tickets that would have the winning numbers would be

\displaystyle  \frac{1 \hbox{ million}}{28,989,675} \approx 0.03

(which is broadly consistent, by the way, with the jackpot being reached every {30} draws or so), and standard probability theory suggests that the number of winners should now follow a Poisson distribution with this mean {\lambda = 0.03}. The probability of obtaining {433} winners would now be

\displaystyle  {\bf P}(F|H'_0) = \frac{\lambda^{433} e^{-\lambda}}{433!} \approx 10^{-1600}

and of course {{\bf P}(E \& F|H'_0)} would be even smaller than this. So this clearly demands some sort of explanation. But in actuality, many purchasers of lottery tickets do not select their numbers completely randomly; they often have some “lucky” numbers (e.g., based on birthdays or other personally significant dates) that they prefer to use, or choose numbers according to a simple pattern rather than go to the trouble of trying to make them truly random. So if we modify the null hypothesis to

  • Null hypothesis {H''_0}: The lottery is run in a completely fair and random fashion, but a significant fraction of the purchasers of lottery tickets only select “unusual” numbers.

then it can now become quite plausible that a highly unusual set of numbers such as {9,18,27,36,45,54} could be selected by as many as {433} purchasers of tickets; for instance, if {10\%} of the 1 million ticket holders chose to select their numbers according to some sort of pattern, then only {0.4\%} of those holders would have to pick {9,18,27,36,45,54} in order for the event {F} to hold (given {E}), and this is not extremely implausible. Given that this reasonable version of the null hypothesis already gives a plausible explanation for {F}, there does not seem to be a pressing need to locate an alternate hypothesis {H_1} that gives some other explanation (cf. Occam’s razor). [UPDATE: Indeed, given the actual layout of the tickets of this lottery, the numbers {9,18,27,36,45,54} form a diagonal, and so all that is needed in order for the modified null hypothesis {H''_0} to explain the event {F} is to postulate that a significant fraction of ticket purchasers decided to lay out their numbers in a simple geometric pattern, such as a row or diagonal.]
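
A short sketch (my own, following the assumptions stated above: 1 million tickets, {10\%} pattern-pickers, {0.4\%} of those picking this particular pattern) of the two back-of-the-envelope calculations:

```python
from math import comb, lgamma, log

n_tickets = 1_000_000
p_win = 1 / comb(55, 6)

# Under H'_0 (all ticket numbers chosen uniformly at random):
lam = n_tickets * p_win                                    # expected number of winners, ~0.03
k = 433
log10_p = (k * log(lam) - lam - lgamma(k + 1)) / log(10)   # log10 of the Poisson probability of 433 winners
print(lam, log10_p)                                        # ~0.034 and roughly -1600

# Under H''_0 (a fraction of buyers pick "patterned" numbers):
pattern_pickers = 0.10 * n_tickets
print(0.004 * pattern_pickers)   # ~400 tickets on this one diagonal: 433 winners is no longer shocking
```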

Remark 3 In view of the above discussion, one can propose a systematic way to evaluate (in as objective a fashion as possible) rhetorical claims in which an advocate is presenting evidence to support some alternative hypothesis:
  1. State the null hypothesis {H_0} and the alternative hypothesis {H_1} as precisely as possible. In particular, avoid conflating an extremely broad hypothesis (such as the hypothesis {H_1} in our running example) with an extremely specific one (such as {H'_1} in our example).
  2. With the hypotheses precisely stated, give an honest estimate to the prior odds of this formulation of the alternative hypothesis.
  3. Consider if all the relevant information {E} (or at least a representative sample thereof) has been presented to you before proceeding further. If not, consider gathering more information {E'} from further sources.
  4. Estimate how likely the information {E} was to have occurred under the null hypothesis.
  5. Estimate how likely the information {E} was to have occurred under the alternative hypothesis (using exactly the same wording of this hypothesis as you did in previous steps).
  6. If the second estimate is significantly larger than the first, then you have cause to update your prior odds of this hypothesis (though if those prior odds were already vanishingly unlikely, this may not move the needle significantly). If not, the argument is unconvincing and no significant adjustment to the odds (except perhaps in a downwards direction) needs to be made.

John PreskillAnnouncing the quantum-steampunk short-story contest!

The year I started studying calculus, I took the helm of my high school’s literary magazine. Throughout the next two years, the editorial board flooded campus with poetry—and poetry contests. We papered the halls with flyers, built displays in the library, celebrated National Poetry Month, and jerked students awake at morning assembly (hitherto known as the quiet kid you’d consult if you didn’t understand the homework, I turned out to have a sense of humor and a stage presence suited to quoting from that venerated poet Dr. Seuss.1 Who’d’ve thought?). A record number of contest entries resulted.

That limb of my life atrophied in college. My college—a stereotypical liberal-arts affair complete with red bricks—boasted a literary magazine. But it also boasted English and comparative-literature majors. They didn’t need me, I reasoned. The sun ought to set on my days of engineering creative-writing contests.

I’m delighted to be eating my words, in announcing the Quantum-Steampunk Short-Story Contest.

From Pinterest

The Maryland Quantum-Thermodynamics Hub is running the contest this academic year. I’ve argued that quantum thermodynamics—my field of research—resembles the literary and artistic genre of steampunk. Steampunk stories combine Victorian settings and sensibilities with futuristic technologies, such as dirigibles and automata. Quantum technologies are cutting-edge and futuristic, whereas thermodynamics—the study of energy—developed during the 1800s. Inspired by the first steam engines, thermodynamics needs retooling for quantum settings. That retooling is quantum thermodynamics—or, if you’re feeling whimsical (as every physicist should), quantum steampunk.

The contest opens this October and closes on January 15, 2023. Everyone aged 13 or over may enter a story, written in English, of up to 3,000 words. Minimal knowledge of quantum theory is required; if you’ve heard of Schrödinger’s cat, superpositions, or quantum uncertainty, you can pull out your typewriter and start punching away. 

Entries must satisfy two requirements: First, stories must be written in a steampunk style, including by taking place at least partially during the 1800s. Transport us to Meiji Japan; La Belle Époque in Paris; gritty, smoky Manchester; or a camp of immigrants unfurling a railroad across the American west. Feel free to set your story partially in the future; time machines are welcome.

Second, each entry must feature at least one quantum technology, real or imagined. Real and under-construction quantum technologies include quantum computers, communication networks, cryptographic systems, sensors, thermometers, and clocks. Experimentalists have realized quantum engines, batteries, refrigerators, and teleportation, too. Surprise us with your imagined quantum technologies (and inspire our next research-grant proposals).

In an upgrade from my high-school days, we’ll be awarding $4,500 worth of Visa gift certificates. The grand prize entails $1,500. Entries can also win in categories to be finalized during the judging process; I anticipate labels such as Quantum Technology We’d Most Like to Have, Most Badass Steampunk Hero/ine, Best Student Submission, and People’s Choice Award.

Our judges run the gamut from writers to quantum physicists. Judge Ken Liu‘s latest novel peered out from a window of my local bookstore last month. He’s won Hugo, Nebula, and World Fantasy Awards—the topmost three prizes that pop up if you google “science-fiction awards.” Appropriately for a quantum-steampunk contest, Ken has pioneered the genre of silkpunk, “a technology aesthetic based on a science fictional elaboration of traditions of engineering in East Asia’s classical antiquity.” 

Emily Brandchaft Mitchell is an Associate Professor of English at the University of Maryland. She’s authored a novel and published short stories in numerous venues. Louisa Gilder wrote one of the New York Times 100 Notable Books of 2009, The Age of Entanglement. In it, she imagines conversations through which scientists came to understand the core of this year’s Nobel Prize in physics. Jeffrey Bub is a philosopher of physics and a Distinguished University Professor Emeritus at the University of Maryland. He’s also published graphic novels about special relativity and quantum physics with his artist daughter. 

Patrick Warfield, a musicologist, serves as the Associate Dean for Arts and Programming at the University of Maryland. (“Programming” as in “activities,” rather than as in “writing code,” the meaning I encounter more often.) Spiros Michalakis is a quantum mathematician and the director of Caltech’s quantum outreach program. You may know him as a scientific consultant for Marvel Comics films.

Walter E. Lawrence III is a theoretical quantum physicist and a Professor Emeritus at Dartmouth College. As department chair, he helped me carve out a niche for myself in physics as an undergrad. Jack Harris, an experimental quantum physicist, holds a professorship at Yale. His office there contains artwork that features dragons.

University of Maryland undergraduate Hannah Kim designed the ad above. She and Jade LeSchack, founder of the university’s Undergraduate Quantum Association, round out the contest’s leadership team. We’re standing by for your submissions through—until the quantum internet exists—the hub’s website. Send us something to dream on.

1Come to think of it, Seuss helped me prepare for a career in physics. He coined the terms wumbus and nerd; my PhD advisor invented NISQ, the name for a category of quantum devices. NISQ now has its own Wikipedia page, as does nerd.

October 08, 2022

Mark GoodsellPointless particle theorists?

The latest (twitter) storm in a teacup on this subject nearly prompted me to return to my soapbox, but now my favoured news source (not the Onion but the Grauniad) has an article attacking particle theorists. Yet again, the usual suspect with books to sell and a career to save is dunking on particle physics, the supposedly finished branch of science that should give up and go home now that we found the Higgs boson, because we can't prove there's a new particle just round the corner and isn't the Standard Model great anyway? The problem is, of course, that this picture of maverick (Top Gun Maverick!) outsiders revealing the "truth" that elites in an ivory tower are spending oodles of public money on failed ideas, and of a whistleblower being needed to expose the fraud, gains such traction with the taxpaying public.

I even see other physicists defending this public anti-vax-style conspiracy-theory propagation as good for the field, to rattle people's cages and get people to question their assumptions. The sociology of this is quite interesting, because there are people within the field who either work on niche theories or just want to take down adjacent fields, and would like to see the popular paradigms brought down a peg or two, presumably naively believing that this will lead to more resources (human, cash or just attention) being sent in their direction. But of course public disparagement of scientists can only ever lead to a general reduction of public trust and a shrinking of the pie for everyone. There exist so many internal mechanisms for theorists to (re)consider what they are working on, e.g.:

  • Grant proposals. The good thing about writing them is that they give you an opportunity to think deeply about what you really want to work on. Boring or uninnovative things just don't get funded. Of course, the evaluation systems may be terrible and biased and reward things not being too far from the reviewer's/panel's interests ... there is much room for improvement; but at least the writing part can be useful.
  • Researcher evaluations. At least here in France we must make a declaration of our activities and plans every year, and write a longer report every few years. This serves a similar purpose to the above.
  • New hires/promotions. Groups want to hire people who are working on interesting stuff. Hiring someone permanently is an endorsement of a field.
  • Citations, talk invitations etc. While people citing your work may just be because they are working on something similar, and people like to invite their friends for talks, sufficiently interesting or new work will persuade people to follow it up and garner attention.
These are all group mechanisms whereby scientists evaluate each other and what they are doing themselves. I am sure someone has studied the game theory of it; indeed, as individual researchers trying to succeed in our careers we all have to adopt strategies to "win", and it is a shockingly competitive system at every stage. Of course, promoting a new idea can be difficult -- we are in a constant battle for attention (maybe writing a blog is a good strategy?) -- but if there is something really promising people will not ignore it. Ambulance chasing (where tens or hundreds of papers follow a new result) is a sign that plenty of people are ready to do exactly that. If a maverick outsider really had a great idea there would not be a shortage of people willing to follow. To take an example, if "the foundations of physics" really offered opportunities for rapid important progress, people would vote with their feet. I see examples of this all the time with people trying out Quantum Computing, Machine Learning, etc.

I'll let you in on a secret, therefore: the target of the bile is a straw man. I don't know anyone hired as a BSM model builder in recent years. People became famous for it in the 90s/early 00s because there was no big experiment running and the field was dreaming big. Now we have the LHC, and that has focussed imaginations much more. People now hired as phenomenologists may also do some ambulance chasing on the side, but it is not their bread and butter. Inventing models is usually a difficult and imaginative task, aimed at connecting often disparate ideas, but it's not the only task of a phenomenologist: the much bigger ones are understanding existing models, and trying to connect theory to experiments!


In defence of ambulance chasing (retch)

When an experiment announces something unexpected (as happens quite frequently!), what is the correct response? According to our outsider, presumably we should just wait for it to go away and for the Standard Model to be reaffirmed. People in the field instead take the view that we could be curious and try to explain it; the best ideas come with new features or explain more than one anomaly. What should we do with wrong explanations? Should we be punished for not coming up with proven theories? Do we need external policing of our curiosity? What does ambulance chasing really cost? The attraction for many departments of forming a theory group is that it is cheap -- theorists don't need expensive experiments or engineers/technicians/people to wash the test tubes. The reward for coming up with a failed theory is usually nothing; but it costs almost nothing too. So why the bitterness? Of course, we can begrudge people becoming famous for coming up with fanciful science fictions -- the mechanisms for identifying promising ideas are far from perfect -- but usually they have come up with something with at least some degree of novelty.

When looking at CVs, it's very easy to spot and discount 'ambulance citations'. By the way, another phenomenon is to sign 'community papers', where tens or hundreds of authors group-source a white paper on a popular topic; and a third is to write a review of a hot subject. Both of these work very well to generate citations. Should we stop doing them too? In the end, the papers that count are the ones with an interesting result or idea, and there is no sure mechanism for writing them. In the aftermath of every ambulance-chasing cycle there are almost always papers that have some interesting nugget of an idea in them, something that remains that would not have been suggested otherwise, or computations done that would otherwise not have been thought of, and that hopefully brings us closer to discoveries.

Recent progress

We have an amazing collider experiment -- the LHC -- which will run for another ten years or so at high luminosities. We can either take the view in advance that it will tell us nothing about the energy frontier, or we can try to make the most of it. The fundamental problems with our understanding of physics have not been solved; I wrote a response to a similar article in 2020 and I stand by my opinion of the state of the field, and you can look there for my laundry list of problems that we are trying to make sense of. What has changed since then? Here are just a few things, biased by my own interests:

Muon g-2

The measurement of the muon g-2 by Fermilab confirmed the earlier anomalous measurement. However, we now have the problem that a series of lattice QCD groups have a calculation implying that the Standard Model prediction is closer to the measurement, in contradiction with the R-ratio method. Someone has underestimated their uncertainties, but we don't know who! This is a problem for theorists working with the experiments; perhaps the new MUonE experiment will help resolve it?

CDF measurement of the W mass

As reported everywhere, the CDF experiment at the Tevatron (the previous energy-frontier collider, which shut down ten years ago) analysed its data and found a measurement of the mass of the W boson with an enormous disagreement with the Standard Model of 7 standard deviations. If confirmed, it would signal new physics around the TeV scale. Since the W boson mass is just about the most generic thing that can be modified by new particles near the electroweak scale, there are any number of new theories that can explain it (as the arXiv this year will attest). There is also a 4 standard deviation tension with a measurement at the LHC, which has a much larger uncertainty. Another LHC measurement is now needed to settle the issue, but this may take a long time as it is a difficult measurement to make at the LHC. (Maybe we should just not bother?) Other than lots of fanciful (and dull) model building, this has recentred theory efforts on how to extract information from the W boson mass in new theories, which is a problem of precision calculations and hugely interesting ...

The Xenon-1T anomaly disappeared

Xenon released new results this summer showing that the anomaly at low recoils they had found had disappeared with more data. While this immediately killed many theories invented to explain it, the lasting effect is that people have given serious thought to low-mass dark matter models that could have explained it, and have come up with new ways to search for them. Without looking, we don't know if they are there!

An anomaly in extragalactic background light was found

A four standard deviation anomaly was reported in the extra-galactic background light (EBL), i.e. there is too much light coming from outside galaxies in every direction! This would naturally be explained by an axion-like particle decaying -- indeed, measurements of the EBL have long been used as constraints. (Maybe we should never have come up with axions?)

The LHC reported three other anomalies

In analysing the data of run 2, three different searches reported anomalies of three standard deviations. Explanations for them have been suggested; perhaps we should see if there are correlations with other searches, or look for new ways of corroborating the possible signals? Or just not bother?

Run 3 has started

Run 3 of the LHC has started with a slightly higher energy and then stopped due to technical problems. It will be some time before significant luminosity is collected, and our experimentalists are looking at new types of searches that might lead to discoveries. Their main motivation is that new signatures mean new reach. Our colleagues certainly need justifications or interpretations for their results, but whether the models really offer explanations of other types of new physics (e.g. dark matter), while of course a concern, is not the main one. The reason to do an experiment at the LHC is curiosity-based -- experimentalists are not children looking for theorists' latest whims. The point is that we should test the theories every which way we can, because we don't know what we will find. A good analogy might be zoologists looking for new species: they might want to go to a previously unexplored region of the earth, or they might find a new way of looking at regions they've been to before, e.g. by turning over rocks that they would otherwise have stepped over.

Long Lived Particles

One of these classes is long lived particles (LLPs) -- which I have written about before on here -- and they have also caught the imagination of theorists. In fact, I'm working with experimentalists with the aim of making their searches more widely applicable.

SMEFT

Two years ago I wrote that I thought the field was less inclined to follow hot topics and that this is healthy. This is still the case. However, some hot topics do exist, and one of these is the Standard Model Effective Field Theory. There is now rapid development of all manner of aspects, from phenomenology to exploration of the higher-order versions to matching, etc.

Machine Learning

Another example is machine learning, which is becoming more prevalent and interesting, especially at the interface between theory and experiments.

Of course, there are many more developments and I'm sure many I'm not aware of. Obviously this is a sign of a field in big trouble!

October 04, 2022

Robert HellingNo action at a distance, spooky or not

On the occasion of the announcement of the Nobel prize for Aspect, Clauser and Zeilinger for the experimental verification that quantum theory violates Bell's inequality, there seems to be a strong urge in popular explanations to state that this proves that quantum theory is non-local, that entanglement is somehow a strong bond between quantum systems, and people quote Einstein on "spooky action at a distance".

But it should be clear (and I have talked about this here before) that this is not a necessary consequence of the Bell inequality violation. There is a way to keep locality in quantum theory (at the price of "realism" in a technical sense, as we will see below). And that is not just a convenience: in fact, quantum field theory (and the whole idea of a field mediating interactions between distant entities like the earth and the sun) is built on the idea of locality. This is most strongly emphasised in the Haag-Kastler approach (algebraic quantum field theory), where pretty much everything is encoded in the algebras of observables that can be measured in local regions and in how these algebras fit into each other. So throwing out locality with the bath water removes the basis of QFT. And I am convinced this is the reason why there is no good version of QFT in the Bohmian approach (which famously sacrifices locality to preserve realism, something some of its proponents do not even acknowledge as an assumption, since it is there in the classical theory and it takes some abstraction to realise that it is actually an assumption and not god-given).

But let's get technical. To be specific, I will use the CHSH version of the Bell inequality (but you could as well use the original one, or the GHZ version as Coleman does). This is about particles that have two different properties, here termed A and B. These can be measured, and the outcome of such a measurement can be either +1 or -1. An example could be spin 1/2 particles, with A and B representing twice the components of the spin in the x and y directions, respectively.

Now, we have two such particles, with properties A and B for particle 1 and A' and B' for particle 2. CHSH instruct you to look at the expectation value of the combined observable

\[A (A'+B') + B (A'-B').\]

Let's first do the classical analysis: we don't know the two properties of particle 2, the primed variables. But we know they are either equal or different. In case they are equal, the absolute value of A'+B' is 2 while A'-B'=0. If they are different, we have A'+B'=0 while the absolute value of A'-B' is two. In either case, only one of the two terms contributes, and in absolute value it is 2 times an unprimed observable of particle 1: A if the values for particle 2 are equal, B if they are different. No matter which possibility is realised, the absolute value of this observable is always 2.

If you allow for probabilistic outcomes of the measurements, you can convince yourself that you can also realise absolute values smaller than 2, but never larger ones. So much for the classical analysis.

In quantum theory, you can, however, write down an entangled state of the two-particle system (in the spin 1/2 case specifically) where this expectation value is 2 times the square root of 2, so larger than all the possible classical values. But didn't we just prove it cannot be larger than 2?
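
Here is a small numerical check I have added (it assumes one standard concrete choice of entangled state and measurement directions, not necessarily the one intended above): a brute-force scan over deterministic ±1 assignments reproduces the classical bound of 2, while the entangled two-qubit state reaches 2√2.

```python
import numpy as np
from itertools import product

# Classical: scan all deterministic +/-1 assignments of A, B, A', B'
classical_max = max(abs(A*(Ap + Bp) + B*(Ap - Bp))
                    for A, B, Ap, Bp in product([1, -1], repeat=4))
print(classical_max)                          # 2

# Quantum: maximally entangled state (|00> + |11>)/sqrt(2), one standard choice of axes
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
phi = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)

A, B = Z, X                                   # particle 1: two spin components
Ap = (Z + X) / np.sqrt(2)                     # particle 2: axes rotated by 45 degrees
Bp = (Z - X) / np.sqrt(2)

def expval(op1, op2):
    """Expectation value <phi| op1 (x) op2 |phi>."""
    return np.real(phi.conj() @ np.kron(op1, op2) @ phi)

chsh = expval(A, Ap) + expval(A, Bp) + expval(B, Ap) - expval(B, Bp)
print(chsh, 2*np.sqrt(2))                     # both ~2.828, larger than the classical bound of 2
```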

    If you are ready to give up locality you can now say that there is a non-local interaction that tells particle 2 if we measure A or B on particle one and by this adjust its value that is measured at the site of particle two. This is, I presume, what the Bohmians would argue (even though I have never seen a version of this experiment spelled out in detail in the Bohmian setting with a full analysis of the particles following the guiding equation).

    But as I said above, I would rather give up realism: In the formula above and the classical argument, we say things like "A' and B' are either the same or opposite". Note, however, that in the case of spins, you cannot both measure the spin in x and in y direction on the same particle because they do not commute and there is the uncertainty relation. You can measure either of them but once you decided you cannot measure the other (in the same round of the experiment). To give up realism simply means that you don't try to assign a value to an observable that you cannot measure because it is not compatible with what you actually measure. If you measure the spin in x direction it is no longer the case that the spin in the y direction is either +1/2 or -1/2 and you just don't know because you did not measure it, in the non-realistic theory you must not assign any value to it if you measured the x spin. (Of course you can still measure A+B, but that is a spin in a diagonal direction and then you don't measure either the x nor the y spin).

    You just have to refuse to make statements like "the spins in the x and y directions are either the same or opposite", as they involve things that cannot all be measured, so such a statement would be non-observable anyway. And without these types of statements, the "proof" of the inequality goes down the drain, and this is how quantum theory avoids it. Just don't talk about things you cannot measure even in principle (metaphysical statements, if you like) and we can keep our beloved locality.

    John PreskillMo’ heights mo’ challenges – Climbing mount grad school

    My wife’s love of mountain hiking and my interest in quantum thermodynamics collided in Telluride, Colorado.

    We spent ten days in Telluride, where I spoke at the Information Engines at the Frontiers of Nanoscale Thermodynamics workshop. Telluride is a gorgeous city surrounded by mountains and scenic trails. My wife (Hasti) and I were looking for a leisurely activity one morning. We chose to hike Bear Creek Trail, a 5.1-mile route with 1,092 feet of elevation gain. This would have been a reasonable choice… back home.

    Telluride’s elevation is 8,750 feet (ten times that of our hometown). This meant there was nothing leisurely about the hike. Ill-prepared, I dragged myself up the mountain in worn runners and tight jeans. My gasps for breath reminded me how new heights (literal ones, in this case) can bring unexpected challenges, a lesson I’ve encountered many times as a graduate student.

    My wife and I atop Bear Creek Trail

    I completely squandered my undergrad. One story sums it up best: I was studying for my third-year statistical mechanics final when I realized I could pass the course without writing the final. So I didn’t write the final. After four years of similar negligence, I somehow graduated, certain I’d left academics forever. Two years later, I rediscovered my love for physics and grieved over having wasted my undergraduate years. I decided to try again and apply to grad school. I learned that Canada had 17 universities I could apply to with a 70 average; each one rejected me.

    I was ecstatic to eventually be accepted into a master’s in math. But the high didn’t last. Learning math and physics from popular-science books and PBS videos is very different from studying for university exams. If I wanted to keep this opportunity, I had to learn how to study.

    16 months later, I graduated with a 97 average and an offer to do a master’s in physics at the University of Waterloo. I would be working at the Institute for Quantum Computing (IQC), supervised by Raymond Laflamme (one of the co-founders of the field of quantum computing). My first day at IQC felt surreal. I had become an efficient student and felt ready for the IQC. But, like Bear Creek Trail, another height would bring another challenge. Ultimately, grad school isn’t about getting good grades; it’s about doing research. Raymond (Ray) gave me my first research project, and I was dumbfounded about where to start and too insecure to ask for help.

    With guidance from Ray and Jonathan Halliwell (professor at Imperial College London and guitarist extraordinaire), I published my first paper and accepted a Ph.D. offer from Ray. After publishing my second paper, I thought it would be smooth sailing through my Ph.D. Alas, I was again mistaken. It’s also not enough to solve problems others give you; you need to come up with some problems on your own. So I tried. I spent the first 8 months of my Ph.D. pursuing a problem I came up with, and it was a complete dud. It turns out the problems also need to be worth solving. For those keeping track, this is challenge number four.

    I have now finished the third year of my Ph.D. (two, if you don’t count the year I “took off” to write a textbook with Ray and superconducting-qubit experimentalist, Prof. Chris Wilson). During that time, Nicole Yunger Halpern (NIST physicist, my new co-advisor, and Quantum Frontiers blogger) introduced me to the field of quantum thermodynamics. We’ve published a paper together (related blog post and Vanier interview) and have a second on the way. Despite that, I’m still grappling with that last challenge. I have no problem finding research questions that would be fun to solve. However, I’m still not sure which ones are worth solving. But, like the other challenges, I’m hopeful I’ll figure it out.

    While this lesson was inspiring, the city of Telluride inspired me the most. Telluride sits at a local minimum of elevation, surrounded by mountains, meaning there is virtually nowhere to go but up. I’m hoping the same is true for me.

    September 22, 2022

    John PreskillWe’re founding a quantum-thermodynamics hub!

    We’re building a factory in Maryland. 

    It’ll tower over the University of Maryland campus, a behemoth of 19th-century brick and 21st-century glass across from the football field. Turbines will turn, and gears will grind, where students now sip lattes near the Stadium Drive parking lot. The factory’s fuel: steam, quantum physics, and ambition. Its goal: to create an epicenter for North American quantum thermodynamics.

    The factory is metaphorical, of course. Collaborators and I are establishing a quantum-thermodynamics hub centered at the University of Maryland. The hub is an abstraction—a community that’ll collaborate on research, coordinate gatherings, host visitors, and raise the public’s awareness of quantum thermodynamics. But I’d rather envision the hub as a steampunk factory that pumps out discoveries and early-career scientists.

    Quantum thermodynamics has burgeoned over the past decade, especially in Europe. At the beginning of my PhD, I read paper after paper that acknowledged COST, a funding agency established by the European Union. COST dedicated a grant to thermodynamics guided by the mathematics and concepts of quantum information theory. The grant funded students, travel, and the first iterations of an annual conference that continues today. Visit Germany, Finland, France, Britain (which belonged to the European Union when I began my PhD), or elsewhere across the pond, and you’ll stumble across quantum-thermodynamics strongholds. Hotspots burn also in Brazil, Israel, Singapore, and elsewhere.

    Inspired by our international colleagues, collaborators and I are banding together. Since I founded a research group last year, Maryland has achieved a critical mass of quantum thermodynamicists: Chris Jarzynski reigns as a king of the field of fluctuation relations, equalities that help us understand why time flows in only one direction. Sebastian Deffner, I regard as an academic older brother to look up to. And I run the Quantum-Steampunk Laboratory.

    We’ve built railroads to research groups across the continent and steamers to cross the ocean. Other members of the hub include Kanu Sinha, a former Marylander who studies open systems in Arizona; Steve Campbell, a Dublin-based prover of fundamental bounds; and two experts on quantum many-body systems: former Marylander Amir Kalev and current Marylander Luis Pedro García-Pintos. We’re also planning collaborations with institutions from Canada to Vienna.

    The hub will pursue a threefold mission of research, community building, and outreach. As detailed on our research webpage, “We aim to quantify how, thermodynamically, decoherence and the spread of information lead to emergent phenomena: classical objectivity and the flow of time.” To grow North America’s quantum-thermodynamics community, we’ll run annual symposia and an international conference. Our visitors’ program will create the atmosphere of a local watering hole. Outreach will include more posts on this blog—including by guest authors—a quantum-steampunk short-story contest (expect details this fall), and more.

    Come visit us by dirigible, train, or gyropter. Air your most thought-provoking quantum-thermodynamics discoveries in a seminar with us, and solicit feedback. Find collaborators, and learn about the latest. The factory wheels are beginning to turn.

    With thanks to the John Templeton Foundation for the grant to establish the hub.

    September 21, 2022

    Sean Carroll The Biggest Ideas in the Universe: Space, Time, and Motion

    Just in case there are any blog readers out there who haven’t heard from other channels: I have a new book out! The Biggest Ideas in the Universe: Space, Time, and Motion is Volume One of a planned three-volume series. It grew out of the videos that I did in 2020, trying to offer short and informal introductions to big ideas in physics. Predictably, they grew into long and detailed videos. But they never lost their informal charm, especially since I didn’t do that much in the way of research or preparation.

    For the book, by contrast, I actually did research and preparation! So the topics are arranged a bit more logically, the presentation is a bit more thorough and coherent, and the narrative is sprinkled with fun anecdotes about the philosophy and history behind the development of these ideas. In this volume, “these ideas” cover classical physics, from Aristotle through Newton up through Einstein.

    The gimmick, of course, is that we don’t shy away from using equations. The goal of this book is to fill the gap between what you generally get as a professional physics student, whom the teacher can rely on to spend years of study and hours doing homework problems, and what you get as an interested amateur, who is assumed to be afraid of equations or unable to handle them. I think equations are not so scary, and that essentially everyone can handle them, if they are explained fully along the way. So there are no prerequisites, but we will teach you about calculus and vectors and all that stuff along the way. Not enough to actually solve the equations and become a pro, but enough to truly understand what the equations are saying. If it all works, this will open up a new way of looking at the universe for people who have been denied it for a long time.

    The payoff at the end of the book is Einstein’s theory of general relativity and its prediction of black holes. You will understand what Einstein’s equation really says, and why black holes are an inevitable outcome of that equation, something most people who earn an undergraduate degree in physics never get to.

    Table of contents:

    • Introduction
    • 1. Conservation
    • 2. Change
    • 3. Dynamics
    • 4. Space
    • 5. Time
    • 6. Spacetime
    • 7. Geometry
    • 8. Gravity
    • 9. Black Holes
    • Appendices

    Available wherever books are available: Amazon * Barnes and Noble * BAM * IndieBound * Bookshop.org * Apple Books.