### Diversity and the Mysteries of Counting

#### Posted by Tom Leinster

Back in 2005 or so, John Baez was giving talks about groupoid cardinality, Euler characteristic, and strange objects with two and a half elements.

I saw a version of this talk at Streetfest in Sydney, called *The
Mysteries of Counting*. It had a big impact on me.

This post makes one simple point: that by thinking along the lines John advocated, we can arrive at the exponential of Shannon entropy — otherwise known as diversity of order $1$.

#### Background: counting with fractional elements

Consider a group $G$ acting on a set $X$, both finite. How many “pieces” of $X$ are there, modulo $G$?

The first, simple, answer is $|X/G|$, the number of orbits. For example, in the action of $C_2$ on the five-element set shown in the diagram above, this gives an answer of $3$.

But there’s a second answer: $|X|/|G|$, the cardinality of $X$ divided by the order of $G$. If the action is free then this is equal to $|X/G|$, but in general it’s not. For instance, in the example shown, this gives an answer of $2\tfrac{1}{2}$.

The short intuitive justification for this alternative answer is that when counting, we often want to quotient out by the order of symmetry groups. For example, there are situations in which a point with an $S_n$-action deserves to count as only $1/n!$ elements. John’s slides say much more on this theme.

This second answer also fits into an abstract framework.

Every action of a group on a set gives rise to a so-called **action
groupoid** $X/\!/G$. Its objects are the elements of $X$, and a map $x \to y$
is an element $g \in G$ such that $g x = y$. Equivalently, it’s the
category of elements of the functor $G \to Set$ corresponding to the group
action.

There is a notion of the
**cardinality** $|A|
\in \mathbb{Q}$ of a finite groupoid $A$. And when you take the cardinality
of this action groupoid, you get exactly $|X|/|G|$. That is:

$|X/\!/G| = |X|/|G|.$

In summary, the first answer to “how many pieces are there?” is $|X/G|$, and the second is $|X/\!/G|$.

Now let’s do the same thing with monoids in place of groups. Let $M$ be a monoid acting on a set $X$, and assume they’re both finite whenever we need to. How many “pieces” of $X$ are there, modulo $M$?

Again, the obvious answer is $|X/M|$. Here $X/M$ is again the set of
orbits. (You have to be a bit more careful defining “orbit” for
monoids than for groups: the equivalence relation on $X$ is *generated*,
not *defined*, by $x \sim y$ if there exists $m \in M$ such that $m x = y$.)

And again, there’s a second, alternative answer: $|X|/|M|$.

The abstract framework for this second answer is now about categories
rather than groupoids. Every action of a monoid on a set gives rise to an
**action category** $X/\!/M$. It has the same explicit and categorical
descriptions as in the group case.

There is a notion of the **Euler
characteristic** or
**magnitude**
$|A| \in \mathbb{Q}$
of a finite category $A$. It could equally well be called “cardinality”,
and generalizes groupoid cardinality. And when you take the Euler
characteristic of this action category, you get $|X|/|M|$. That is:

$|X/\!/M| = |X|/|M|.$

So just as for groups, the first and second answers to “how many pieces are there?” are $|X/M|$ and $|X/\!/M|$, respectively.

#### How diversity emerges

Take a surjective map of sets $\pi: A \to I$. We’re going to think about two structures that arise from $\pi$:

The monoid $End_I(A)$ of endomorphisms of $A$ over $I$ (that is, making the evident triangle commute). The monoid structure is given by composition.

The set $End(A)$ of

*all*endomorphisms of $A$. We could give this a monoid structure too, but we’re only ever going to treat it as a set.

The monoid $End_I(A)$ acts on the set $End(A)$ on the left, by composition: an endomorphism of $A$ followed by an endomorphism of $A$ over $I$ is an endomorphism over $A$. (The fact that one of them is over $I$ does nothing.)

So, how many pieces of $End(A)$ are there, modulo $End_I(A)$?

The first answer is “count the number of orbits”. How many orbits are there?

It’s not too hard to show that there’s a canonical isomorphism

$\frac{End(A)}{End_I(A)} \cong I^A,$

where the right-hand side is the set of functions $A \to I$. It’s a simple little proof of a type that doesn’t work very well in a blog post. You start with the fact that composition with $\pi$ gives a map $End(A) \to I^A$ and take it from there.

So assuming now that $A$ and $I$ are finite, the number of orbits is given by

$\biggl| \frac{End(A)}{End_I(A)} \biggr| = |I|^{|A|}.$

That’s the first answer, but while we’re here, let’s observe that this equation can be rearranged to give a funny formula for the cardinality of $I$:

$|I| = \biggl| \frac{End(A)}{End_I(A)} \biggr|^{1/|A|}.$

The second, alternative, answer is that the number of pieces is

$\frac{|End(A)|}{|End_I(A)|}.$

What is this, explicitly? Write the fibre $\pi^{-1}(i)$ as $A_i$. Then

$|End(A)| = |A|^{|A|}, \quad |End_I(A)| = \prod_{i \in I} |A_i|^{|A_i|},$

and since $A = \coprod_{i \in I} A_i$, this gives

$\frac{|End(A)|}{|End_I(A)|} = \prod_{i \in I} \biggl( \frac{|A|}{|A_i|} \biggr)^{|A_i|}$

as our second answer.

Now here’s the punchline. The first answer gave us a formula for the cardinality of $I$:

$|I| = \biggl| \frac{End(A)}{End_I(A)} \biggr|^{1/|A|}.$

What’s the analogous formula for the second answer? That is, in the equation

$? = \biggl(\frac{|End(A)|}{|End_I(A)|}\biggr)^{1/|A|},$

what goes on the left-hand side?

Well, it’s

$\prod_{i \in I} \biggl( \frac{|A|}{|A_i|} \biggr)^{|A_i|/|A|}.$

We can simplify this if we write $p_i = |A_i|/|A|$. These numbers $p_i$ are nonnegative and sum to $1$, so they amount to a probability distribution

$\mathbf{p} = (p_i)_{i \in I} = (|A_i|/|A|)_{i \in I}.$

The formula for “?” then simplifies to

$\prod_{i \in I} p_i^{-p_i}.$

And this is nothing but the exponential of Shannon entropy, also called the
**diversity** or **Hill number** $D_1(\mathbf{p})$ of order $1$.

In conclusion, if we take that first-answer formula for the cardinality of $I$ —

$|I| = \biggl| \frac{End(A)}{End_I(A)} \biggr|^{1/|A|}$

— then its second-answer analogue is a formula for the diversity of $\mathbf{p}$ —

$D_1(\mathbf{p}) = \biggl(\frac{|End(A)|}{|End_I(A)|}\biggr)^{1/|A|}.$

And this is yet another glimpse of an analogy that André Joyal first explained to me one fateful day in Barcelona in 2008: exponential of entropy is a cousin of cardinality.

## Re: Diversity and the Mysteries of Counting

Beautiful!

I learned about the diversity of order 1 while reading Shannon’s seminal paper and I always liked it more than the entropy because it has the nice interpretation as “the probability of a typical event”, and because it has the form of the “endomorphism number of the probability distribution” if you squint and pretend that everything you can do in the biclosed equipment of spans works for good ol’ matrices as well. I mentioned this to David Spivak at some point and he came back and told me about this counting endomorphisms story which requires significantly less squinting, but still had a pesky division where it shouldn’t be.

This is a great explanation of why that pesky division really should be there!

This makes me wonder: what other quantities and ways of doing things from basic information theory and probability theory can be described “objectively” with the combinatorics of surjections of finite sets like this?