# C.R. Rao and information geometry

On Lalitha’s recommendation I read Frank Nielsen’s paper “Cramer-Rao Lower Bound and Information Geometry,” which is a survey how C.R. Rao’s work has impacted information geometry. I remember spending some time in grad school trying to learn information geometry (mostly for fun), but since it ended up not being particularly useful in my research, I’m afraid a lot of it has leaked out of my ears. This paper has a short introduction to the Cramer-Rao lower bound and an introduction to information geometry which might be a nice read for some of the readers of this blog. It’s certainly faster than trying to read Amari’s monograph! In particular, it goes over the “highlights” of geodesics and other geometric features on the manifold of probability distributions.

The paper mentions the sub-family of f-divergences known as $\alpha$-divergences, which are given by

$D_{\alpha}(p \| q) = \frac{4}{1 - \alpha^2} \left( 1 - \int p(x)^{(1 - \alpha)/2)} q(x)^{(1 + \alpha)/2} dx \right)$

The KL divergence is $D_{-1}(p \| q)$ — you have to take the limit as $\alpha \to -1$. Within this family of divergences we have the relation $D_{\alpha}(p \| q) = D_{-\alpha}(q \| p)$. Consider a pair of random variables $(X,Y)$ with joint distribution $P_{XY}$ and marginal distributions $P_X$ and $P_Y$. If we take $q = P_X P_Y$ and $p = P_{XY}$ then the mutual information is $D_{-1}( p \| q )$. But we can also take

$D_{-1}( P_{X} P_{Y} \| P_{XY}) = D_1( P_{XY} \| P_{X} P_{Y} )$

Thus it turns out that the “lautum information” defined by Palomar and Verdú is a special case of this: it’s the 1-divergence between the the joint distribution and the product of the marginals. While their paper mentions the lautum information is an f-divergence, it doesn’t discuss this connection to this family of divergences. Nielsen’s paper calls this the “reverse Kullback-Leibler divergence,” but some googling doesn’t seem to indicate that this is a common term, or indeed if it has some use in information geometry. Palomar and Verdú give several operational interpretations of the lautum information.