Functional Equations IV: A Simple Characterization of Relative Entropy
Posted by Tom Leinster
Relative entropy appears in mathematics, physics, engineering, and statistics under an unhelpful variety of names: relative information, information gain, information divergence, Kullback-Leibler divergence, and so on and on. This hints at how important it is.
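For reference, here is the definition. Given probability distributions $p = (p_1, \ldots, p_n)$ and $q = (q_1, \ldots, q_n)$ on the same finite set, the relative entropy of $p$ with respect to $q$ is

$$H(p \,\|\, q) = \sum_{i=1}^n p_i \log \frac{p_i}{q_i},$$

with the conventions that $0 \log(0/q_i) = 0$ and that $p_i \log(p_i/0) = \infty$ when $p_i > 0$.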
In the functional equations course that I’m currently delivering, I stated and proved a theorem today that characterizes relative entropy uniquely:
Theorem.  Relative entropy is essentially the only function that satisfies three trivial requirements and the chain rule.
You can find this on pages 15–19 of the course notes so far, including an explanation of the “chain rule” in terms of Swiss and Canadian French.
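The rough shape of the chain rule, in notation of my own choosing (the precise statement is in the notes): given distributions $w, \tilde{w}$ on $n$ elements and, for each $i$, a pair of distributions $p^i, \tilde{p}^i$ on a common set, relative entropy satisfies

$$H\bigl(w \circ (p^1, \ldots, p^n) \,\big\|\, \tilde{w} \circ (\tilde{p}^1, \ldots, \tilde{p}^n)\bigr) = H(w \,\|\, \tilde{w}) + \sum_{i=1}^n w_i \, H(p^i \,\|\, \tilde{p}^i),$$

where $w \circ (p^1, \ldots, p^n)$ is the composite distribution: first choose $i$ according to $w$, then choose a point according to $p^i$.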
I don’t know who this theorem is due to. I came up with it myself, I’m not aware of its existence in the literature, and it didn’t appear on this list until I added it. However, it could have been proved any time since the 1950s, and I bet someone’s done it before.
The proof I gave owes a great deal to the categorical characterization of relative entropy by John Baez and Tobias Fritz, which John blogged about here before.
Posted at February 28, 2017 7:05 PM UTC
Re: Functional Equations IV: A Simple Characterization of Relative Entropy
A problem I had in preparing for this week’s session was that I couldn’t think of a good example of two languages that use the same set of accents. If one language uses an accent that the other doesn’t, then the relative entropy in one direction or the other is infinite, which makes things rather trivial. Swiss and Canadian French was the best I could do.
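To make the infinity concrete, here is a minimal computational sketch. The accent frequencies below are invented purely for illustration, not real letter statistics for any language.

```python
import math

def relative_entropy(p, q):
    """Relative entropy H(p || q) for distributions given as dicts
    mapping symbols to probabilities. Returns infinity if p puts
    positive mass on a symbol that q never uses."""
    total = 0.0
    for symbol, p_i in p.items():
        if p_i == 0:
            continue  # convention: 0 * log(0 / q_i) = 0
        q_i = q.get(symbol, 0.0)
        if q_i == 0:
            return math.inf  # p uses a symbol that q doesn't
        total += p_i * math.log(p_i / q_i)
    return total

# Invented accent distributions: each "language" uses one accent
# that the other lacks, so the relative entropy is infinite both ways.
language_a = {"é": 0.5, "è": 0.3, "ç": 0.2}
language_b = {"é": 0.4, "è": 0.4, "ò": 0.2}

print(relative_entropy(language_a, language_b))  # inf: B never uses "ç"
print(relative_entropy(language_b, language_a))  # inf: A never uses "ò"
```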
The only other suggestions I’ve been given are other variants of French, or the various versions of German or Dutch.
So, can anyone think of two different languages that use precisely the same accents?
Because some people reading this will be mathematicians, I’d better say explicitly: answers involving languages with no accents don’t count!