Skip to the Main Content

Note:These pages make extensive use of the latest XHTML and CSS Standards. They ought to look great in any standards-compliant modern browser. Unfortunately, they will probably look horrible in older browsers, like Netscape 4.x and IE 4.x. Moreover, many posts use MathML, which is, currently only supported in Mozilla. My best suggestion (and you will thank me when surfing an ever-increasing number of sites on the web which have been crafted to use the new standards) is to upgrade to the latest version of your browser. If that's not possible, consider moving to the Standards-compliant and open-source Mozilla browser.

November 30, 2019

Random Permutations (Part 5)

Posted by John Baez

To go further in understanding the properties of random permutations, we need to use ‘bivariate’ generating functions — that is, generating functions involving 2 variables. Here’s a good introduction to these:

  • Phillipe Flajolet and Robert Sedgewick, Analytic Combinatorics, Part III: Combinatorial Parameters and Multivariate Generating Functions.

I will try to explain these in a way that takes advantage of Joyal’s work on species. Then I’ll use them to tackle this puzzle from Part 3.

Puzzle 5. What is the expected number of cycles in a random permutation of an nn-element set?

There are two main kinds of generating functions in combinatorics, which apply to functors from two different groupoids to FinSetFinSet:

  • Let SS be the groupoid of finite sets and bijections, or equivalently the groupoid with natural numbers as objects and permutations σS n\sigma \in S_n as morphisms from nn to nn. Let G:SFinSetG: S \to FinSet be a functor. (We call this sort of functor a finite species.) Then the exponential generating function of GG is the formal power series |G||G| defined by |G|(z)= n0|G(n)|z nn! \displaystyle{ |G|(z) = \sum_{n \ge 0} |G(n)| \frac{z^n}{n!} } where |G(n)||G(n)| is the cardinality of GG applied to nn.
  • Let \mathbb{N} be the groupoid of totally ordered finite sets and order-preserving bijections, or equivalently the groupoid with natural numbers as objects and only identity morphisms. Let G:FinSetG: \mathbb{N} \to FinSet be a functor. Then the ordinary generating function of GG is the formal power series |G||G| defined by |G|(u)= n0|G(n)|u n |G|(u) = \sum_{n \ge 0} |G(n)| u^n where |G(n)||G(n)| is the cardinality of GG applied to nn.

Why is there an n!n! in the denominator in the first case but not the second? It comes from the theory of groupoid cardinality; using this theory we could define generating functions for functors G:XFinSetG \colon X \to FinSet for any groupoid XX. But we only need these two cases so I’ll resist the temptation to explain such generalities.

I’ve spent a lot more time thinking about functors G:SFinSetG : S \to FinSet and their exponential generating functions than functors G:FinSetG : \mathbb{N} \to FinSet and their ordinary generating functions. The former are quite rich and interesting; the latter seem a bit pathetic in comparison. A functor G:FinSetG : \mathbb{N} \to FinSet is determined up to isomorphism by the cardinality of G(n)G(n) for each nn, so it’s really just a function from the natural numbers to itself. Still, converting it into a formal power series — its ordinary generating function — is a useful trick for solving all sorts of problems. So we shouldn’t scorn the ordinary generating functions. Furthermore, we can restrict any functor SFinSetS \to FinSet to a functor from FinSet\mathbb{N} \to FinSet, or left Kan extend any functor FinSet\mathbb{N} \to FinSet to a functor SFinSetS \to FinSet, and this sets up some nice relations between exponential and ordinary generating functions.

The fun really starts when we combine the two ideas. Suppose we have a functor

G:S×FinSet G : S \times \mathbb{N} \to FinSet

Then we can define its generating function |G||G|, which is a formal power series in two variables zz and uu, by

|G|(z,u)= n0 k0|G(n,k)|z nn!u k \displaystyle{ |G|(z,u) = \sum_{n \ge 0} \sum_{k \ge 0} |G(n,k)| \, \frac{z^n}{n!} u^k }

This is called a bivariate generating function. All the usual tricks with generating functions work for this kind too. The systematic mathematician in me wants to explain all these tricks, but that would take a while. So with no further ado let me just illustrate them by using them to answer a question:

Puzzle 5. What is the expected number of cycles in a random permutation of an nn-element set?

This is Example III.4 in Analytic Combinatorics, but I’ll explain the method from scratch.

We start by making up a functor

G:S×FinSet G \colon S \times \mathbb{N} \to FinSet

that assigns to the pair (n,k)(n,k) the set of permutations of nn with kk cycles. Note that we’re treating nn as a finite set but kk as a natural number.

What is the generating function |G||G|? To answer that it’s best to make up functors

G k:SFinSet G_k \colon S \to \FinSet

that assign to each finite set nn the set of permutations of nn with kk cycles. We have

G(n,k)G k(n) G(n,k) \cong G_k(n)

so

|G|(z,u) = n0 k0|G(n,k)|z nn!u k = k0|G k|(z)u k \begin{array}{ccl} |G|(z,u) &=& \displaystyle{ \sum_{n \ge 0} \sum_{k \ge 0} |G(n,k)| \, \frac{z^n}{n!} u^k } \\ \\ &=& \displaystyle{ \sum_{k \ge 0} |G_k|(z) \; u^k } \end{array}

Thus, to figure out |G||G| we just need to know |G k||G_k| for each kk.

A permutation of nn with kk cycles is the same as a way to chop up nn into kk unlabeled pieces and put a cyclic ordering on each piece, so

G kCyc kk! \displaystyle{ G_k \cong \frac{Cyc^k}{k!} }

where Cyc:SFinSetCyc : S \to FinSet is the species of cyclic orderings. We thus have

|G k|=|Cyc| kk! \displaystyle{ |G_k| = \frac{|Cyc|^k}{k!} }

But the generating function for cyclic orderings is famous: there are (n1)!(n-1)! cyclic orderings on an nn-element set, so

|Cyc|(z) = n0|Cyc(n)|z nn! = n0z nn = ln(1z) \begin{array}{ccl} |Cyc|(z) &=& \displaystyle{ \sum_{n \ge 0} |Cyc(n)| \frac{z^n}{n!} } \\ \\ &=& \displaystyle{ \sum_{n \ge 0} \frac{z^n}{n} } \\ \\ &=& -\ln(1-z) \end{array}

Thus, we get

|G k|=1k!(ln(1z)) k \displaystyle{ |G_k| = \frac{1}{k!} (-\ln(1-z))^k }

This is fairly complicated: if we actually wrote this out as a power series, the coefficients would involve Stirling numbers of the first kind. That shouldn’t be surprising, since these numbers count the permutations of nn having kk cycles. But it’s easier to work with |G||G|, which is why this ‘bivariate generating function’ idea is a good thing. Let’s see why:

|G|(z,k) = k0|G k|(z)u k = k01k!(ln(1z)) ku k = exp(uln(1z)) = (1z) u \begin{array}{ccl} |G|(z,k) &=& \displaystyle{ \sum_{k \ge 0} |G_k|(z) \;u^k } \\ \\ &=& \displaystyle{ \sum_{k \ge 0} \frac{1}{k!} (-\ln(1-z))^k u^k } \\ \\ &=& \exp\left( -u\ln(1-z) \right) \\ \\ &=& (1-z)^{-u} \end{array}

See? It’s much more pretty to combine all the |G k||G_k| into a single |G||G| this way.

The function |G||G| is not exactly what we need to solve our puzzle, but we’re very close now. We want to know the average number of cycles of a random permutation of a nn-element set. Call this A(n)A(n). Since there are |G(n,k)||G(n,k)| permutations with kk cycles, we have

A(n)= k0k|G(n,k)|n! \displaystyle{ A(n) = \sum_{k \ge 0} k \frac{|G(n,k)|}{n!} }

But notice, this is something we can compute using |G||G|, since

u|G|(z,u)| u=1 = u n0 k0|G(n,k)|z nn!u k| u=1 = n0 k0k|G(n,k)|z nn! = n0A(n)z n \begin{array}{ccl} \displaystyle{\left. \frac{\partial}{\partial u} |G|(z,u) \right|_{u = 1} } &=& \displaystyle{ \left. \frac{\partial}{\partial u} \sum_{n \ge 0} \sum_{k \ge 0} |G(n,k)| \, \frac{z^n}{n!} u^k \right|_{u = 1} } \\ \\ &=& \displaystyle{ \sum_{n \ge 0} \sum_{k \ge 0} k |G(n,k)| \, \frac{z^n}{n!} } \\ \\ &=& \displaystyle{ \sum_{n \ge 0} A(n) \, z^n } \end{array}

So, to work out the average A(n)A(n) we just differentiate |G||G| with respect to uu, set u=1u = 1, and work out the coefficients of its Taylor series as a function of zz. Let’s do it!

We saw

|G|(z,u)=(1z) u \displaystyle{ |G|(z,u) = (1-z)^{-u} }

For those whose calculus is rusty this could be the hardest step in the whole computation:

u|G|(z,u)=ln(1z)(1z) u \displaystyle{ \frac{\partial}{\partial u} |G|(z,u) = -\ln(1-z) \, (1-z)^{-u} }

This gives

u|G|(z,u)| u=1=ln(1z)1z \displaystyle{ \left. \frac{\partial}{\partial u} |G|(z,u) \right|_{u =1} = -\frac{\ln(1-z)}{1-z} }

Now we just need to work out the Taylor series of this function. I take it back: this is the hardest step. We have

ln(1z)1z= n1H nz n -\frac{\ln(1-z)}{1-z} = \sum_{n \ge 1} H_n z^n

where H nH_n is the harmonic number

H n=11+12++1n H_n = \frac{1}{1} + \frac{1}{2} + \cdots + \frac{1}{n}

I’ll show this in a minute. Given this, we get

A n=H n A_n = H_n

Or, in something more like plain English: the average number of cycles in a random permutation of an nn-element set is

11+12++1n \frac{1}{1} + \frac{1}{2} + \cdots + \frac{1}{n}

Asymptotically this is just lnn\ln n . For a better estimate, use lnn+γ\ln n + \gamma, where γ\gamma is Euler’s constant.

So, we’re done! This offers more evidence that a random permutation of a huge set has rather few, mostly rather large cycles. For example, suppose we take a random permutation of a set with a million elements. Then our calculation show that on average it has only about 14 cycles. From last time, we know that the median size of the largest cycle is roughly 606,000, while its mean size is roughly 624,000.

But how do we do that last step in our computation — the one that led to the harmonic numbers? For this, note that if we multiply any formal power series

f(z)= n=0 a nz n f(z)= \sum_{n=0}^\infty a_n z^n

by the geometric series

11z= n=0 z n \frac{1}{1-z} = \sum_{n=0}^\infty z^n

we get a series whose coefficients are the partial sums of the sequence a na_n:

11zf(z)= n=0 ( i=0 na i)z n \frac{1}{1-z} f(z) = \sum_{n=0}^\infty \left(\sum_{i=0}^n a_i \right) z^n

Since the power series for ln(1z)-\ln(1-z) has coefficients a n=1/na_n = 1/n starting at n=1n = 1, when we multiply it by 1/(1z)1/(1-z) we get a power series whose coefficients are the harmonic numbers!

Some readers may be disappointed at how a calculation that started so nobly with functors devolved into some clever tricks with calculus. Others may feel the opposite way.

Personally I like the combination of functors and calculus. I’d like to get more of the calculations to proceed at the level of functors from S×S \times \mathbb{N} to FinSetFinSet, or even from S×SS \times S to FinSetFinSet. But the main ‘trick’ here, the method of computing the mean of a family of probability distributions by differentiating a bivariate generating function with respect to uu and then setting u=1u = 1, is very pretty. And it’s not a one-off: it’s quite generally useful. See Analytic Combinatorics for more examples!

Posted at November 30, 2019 2:41 AM UTC

TrackBack URL for this Entry:   https://golem.ph.utexas.edu/cgi-bin/MT-3.0/dxy-tb.fcgi/3166

6 Comments & 0 Trackbacks

Re: Random Permutations (Part 5)

Here’s one part of the computation that’s easy to categorify:

ln(1z)1z= n1H nz n -\frac{\ln(1-z)}{1-z} = \sum_{n \ge 1} H_n z^n

where

H n=11++1n H_n = \frac{1}{1} + \cdots + \frac{1}{n}

As noted in the post, ln(1z)-\ln(1-z) is the generating function of CycCyc, the species of cyclic orderings. Its derivative 1/(1z)1/(1-z) is the generating function of TOTO, the species of total orderings. So,

ln(1z)1z=|TOCyc|(z) -\frac{\ln(1-z)}{1-z} = |TO \cdot Cyc|(z)

A TOCycTO \cdot Cyc-structure on a finite set is a way of chopping it into two parts, totally ordering the first part and cyclically ordering the second. There’s a groupoid of TOCycTO \cdot Cyc-structured nn-element sets, and the above equations imply that the cardinality of this groupoid is

11++1n \frac{1}{1} + \cdots + \frac{1}{n}

But this is easy to see directly! There are different kinds — different isomorphism classes — of TOCycTO \cdot Cyc-structured nn-element sets, one for each 1kn1 \le k \le n. The kkth kind consists of nn-element sets where we’ve cyclically ordered kk of the elements and totally ordered the rest. The automorphism group of this kkth kind is /k\mathbb{Z}/k since we can ‘cycle’ the cyclically ordered part.

The cardinality of a groupoid is the sum over isomorphism classes of the reciprocals of the cardinalities of the automorphism groups. So, the cardinality of the groupoid of TOCycTO \cdot Cyc-structured nn-element sets is

11++1n \frac{1}{1} + \cdots + \frac{1}{n}

The next question: why does the average length of a cycle in a random permutation of an nn-element set equal the cardinality of the groupoid of TOCycTO \cdot Cyc-structured nn-element sets?

My post explains why. But we can seek a deeper reason. Is this just an equation between numbers, or does the computation in my post lift to something like an equivalence of groupoids that explains this equation?

Posted by: John Baez on November 30, 2019 5:21 PM | Permalink | Reply to this

Re: Random Permutations (Part 5)

You said that there was an isomorphism

G kCyc kk!G_k \cong \frac{Cyc^k}{k!}

Is this an isomorphism of functors? If so what is the interpretation of dividing by k!k!?

Posted by: Jade Master on November 30, 2019 6:09 PM | Permalink | Reply to this

Re: Random Permutations (Part 5)

Hi! Yeah, I didn’t explain that carefully. Good question.

In class we talked about a species A kA_k, ‘being a kk-element set’, where A k(X)=1A_k(X) = 1 if XX has kk elements and A k(X)=A_k(X) = \emptyset otherwise. The generating function of A kA_k is

|A k|(x)=x kk! \displaystyle{ |A_k|(x) = \frac{x^k}{k!} }

In short, the species A kA_k is the categorification of the polynomial x k/k!x^k/k!.

When talking to myself, I often deliberately blur the distinction between the species A kA_k and the polynomial x k/k!x^k/k!. So when I wrote Cyc k/k!Cyc^k/k! I really meant the species A kCycA_k \circ Cyc. To put this structure on a finite set, we chop that set up into kk unlabeled parts and put a cyclic ordering on each part.

The fact that they are kk unlabeled parts put the k!k! in the denominator of the x k/k!x^k/k!: we can permute the parts without it mattering.

Anyway, it’s then obvious that A kCycA_k \circ Cyc is isomorphic to the species G kG_k, which is ‘being a set equipped with a permutation having kk cycles’. So we get a natural isomorphism of functors

G kA kCyc G_k \cong A_k \circ Cyc

Posted by: John Baez on November 30, 2019 11:18 PM | Permalink | Reply to this

Re: Random Permutations (Part 5)

Ah okay thanks! I get it now.

Posted by: Jade Master on December 1, 2019 3:44 AM | Permalink | Reply to this

Re: Random Permutations (Part 5)

Typos in the definition of an ordinary generating function:

“a functor” \to “be a functor”

“cardinality of HH\to “cardinality of GG

Posted by: Blake Stacey on December 1, 2019 2:30 PM | Permalink | Reply to this

Re: Random Permutations (Part 5)

Thanks, Blake! I’m putting these on my website, so I want them to be nice. I’ll incorporate my answer to Jade’s question about Cyc k/k!Cyc^k/k!, too.

Posted by: John Baez on December 1, 2019 5:32 PM | Permalink | Reply to this

Post a New Comment