Functional Equations V: Expected Surprise
Posted by Tom Leinster
In today’s class I explained the concept of “expected surprise”, which also made an appearance on this blog back in 2008: Entropy, Diversity and Cardinality (Part 1). Expected surprise is a way of interpreting the $q$-deformed entropies that I like to call “surprise entropies”, and that are usually and mistakenly attributed to Tsallis. These form a one-parameter family of deformations of ordinary Shannon entropy.
Also in this week’s session: $q$-logarithms, and a sweet, unexpected surprise:
Surprise entropies are much easier to characterize than ordinary entropy!
For instance, all characterization theorems for Shannon entropy involve some regularity condition (continuity or at least measurability), whereas each of its $q$-deformed cousins has an easy characterization that makes no regularity assumption at all.
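For readers who want the formulas, here is a quick reminder of the standard definitions, in my own notation (the course notes may set things up slightly differently). For $q \neq 1$, the $q$-logarithm is

$$\ln_q(x) = \frac{x^{1-q} - 1}{1 - q},$$

and the surprise entropy of order $q$ of a distribution $\mathbf{p} = (p_1, \ldots, p_n)$ is the expected surprise

$$S_q(\mathbf{p}) = \sum_i p_i \ln_q\!\left(\frac{1}{p_i}\right) = \frac{1}{q - 1}\left(1 - \sum_i p_i^q\right),$$

where $\ln_q(1/p_i)$ measures the surprise of an event of probability $p_i$. As $q \to 1$, the $q$-logarithm tends to the ordinary logarithm and $S_q$ reduces to Shannon entropy $-\sum_i p_i \ln p_i$.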
It’s all on pages 18–21 of the course notes so far.
Posted at March 8, 2017 1:36 AM UTC
Re: Functional Equations V: Expected Surprise
Wow, that characterization of surprise entropies is indeed an unexpected surprise!
I agree that that particular theorem does not transfer to the case of Shannon entropy. Nevertheless, I’d like to propose a candidate characterization of Shannon entropy that doesn’t involve any regularity condition. After two paragraphs of prelude, I’ll give a sketch of how it could work. It’s not a complete sketch yet: I don’t get continuity for free, only boundedness on $\Delta_2$. But in light of the equivalence of continuity and boundedness in the case of Cauchy’s functional equation, I’m hoping that this could be enough. Do you happen to know?
Basically, the new ingredient is that we can simulate any biased coin using two expected tosses of a fair coin, like this: use the fair coin to generate the infinite distribution $(\tfrac{1}{2}, \tfrac{1}{4}, \tfrac{1}{8}, \ldots)$ by tossing the coin repeatedly until it lands heads and counting the number of tosses needed. Now for any bias $p \in [0, 1]$, the binary expansion $p = \sum_{i \geq 1} b_i 2^{-i}$ shows that the biased coin $(p, 1-p)$ is a coarse-graining of the above infinite distribution: lump together the outcomes $i$ with $b_i = 1$ (total mass $p$) and those with $b_i = 0$ (total mass $1-p$).
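Here is a short Python sketch of that simulation, purely as an illustration of my own (the function names are invented, not from the post or the course notes). It reveals a uniform random number $U = 0.u_1 u_2 u_3 \ldots$ one fair bit at a time and declares heads exactly when $U < p$; the first bit that disagrees with the corresponding binary digit of $p$ settles the comparison, and each bit settles it with probability $1/2$, so two flips are needed on average.

```python
import random

def binary_digits(p):
    """Yield the binary digits b1, b2, ... of p in [0, 1), so p = 0.b1 b2 b3 ..."""
    while True:
        p *= 2
        bit = int(p)
        yield bit
        p -= bit

def biased_coin(p, fair_flip=lambda: random.randint(0, 1)):
    """Return True ("heads") with probability p, using only fair coin flips.

    The flips u1, u2, ... define a uniform number U = 0.u1 u2 u3 ... in binary,
    and "heads" means U < p.  The comparison is decided at the first index
    where a flip differs from the corresponding digit of p, which happens with
    probability 1/2 at each step, so the expected number of flips is 2.
    """
    for digit in binary_digits(p):
        u = fair_flip()
        if u != digit:
            return u < digit  # U < p exactly when the flipped bit is the smaller one
```

For example, `sum(biased_coin(0.3) for _ in range(10**5)) / 10**5` should come out close to $0.3$.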
More generally, I think that this kind of procedure lets us simulate any countable distribution with finite entropy using a finite expected number of fair coin tosses.
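And here is one guess at how the general procedure could go, again just my own sketch rather than anything from the comment: reveal fair bits of a uniform number $U$ one at a time, and stop as soon as the dyadic interval known to contain $U$ fits inside a single cell of the cumulative distribution. (This only makes the procedure concrete; it says nothing about the expected number of flips, which is the part that needs the finite-entropy hypothesis.)

```python
from fractions import Fraction
import random

def sample(prob, fair_flip=lambda: random.randint(0, 1)):
    """Sample k = 0, 1, 2, ... with probability prob(k), using only fair flips.

    prob(k) should return exact Fractions summing to 1 over k.  Each flip
    halves the dyadic interval [low, low + width) known to contain a uniform
    number U; we return k as soon as that interval lies inside the cell
    [C_{k-1}, C_k) of the cumulative distribution, so U's outcome is decided.
    """
    low, width = Fraction(0), Fraction(1)
    while True:
        # Reveal one more fair bit of U, halving its interval.
        width /= 2
        if fair_flip():
            low += width
        # Walk through the cumulative sums until we find the cell containing `low`.
        c, k = Fraction(0), 0
        while True:
            c += prob(k)
            if c > low:
                break
            k += 1
        # If the whole interval for U fits inside that cell, the outcome is decided.
        if low + width <= c:
            return k

def geometric(k):
    # The distribution (1/2, 1/4, 1/8, ...) from the comment above.
    return Fraction(1, 2 ** (k + 1))
```

Calling `sample(geometric)` then reproduces the toss-until-heads distribution.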
Now let $\Delta$ be the infinite-dimensional ‘simplex’ consisting of those countable distributions that have finite entropy. These should be precisely those distributions that can be generated from a finite expected number of fair coin tosses. My conjecture is that up to a scalar multiple, Shannon entropy is the only function $H \colon \Delta \to [0, \infty)$ which satisfies the chain rule. So why would boundedness on $\Delta_2$ be automatic? That’s because of the above simulation argument: there’s a chain rule equation equating $H(\tfrac{1}{2}, \tfrac{1}{4}, \tfrac{1}{8}, \ldots)$ to $H(p, 1-p)$ plus some extra terms. Since these extra terms are nonnegative by assumption, we have $H(p, 1-p) \leq H(\tfrac{1}{2}, \tfrac{1}{4}, \tfrac{1}{8}, \ldots)$ for all $p$, QED.
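To spell out that last step in symbols (my notation, not necessarily what the comment had in mind): write $\gamma = (\tfrac{1}{2}, \tfrac{1}{4}, \tfrac{1}{8}, \ldots)$, and for a bias $p$ with binary digits $b_1 b_2 b_3 \ldots$, let $\gamma_1$ and $\gamma_0$ be the conditional distributions of $\gamma$ on the outcomes with $b_i = 1$ and with $b_i = 0$ respectively (both again of finite entropy). The coarse-graining above exhibits $\gamma$ as the composite of $(p, 1-p)$ with $(\gamma_1, \gamma_0)$, so the chain rule gives

$$H(\gamma) = H(p, 1-p) + p\,H(\gamma_1) + (1-p)\,H(\gamma_0),$$

and since $H \geq 0$ the last two terms can be dropped, leaving $H(p, 1-p) \leq H(\gamma)$, a bound independent of $p$.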