02 | November | 2011 | An Ergodic Walk

From one of the presentations by Zhao and Chia at Allerton this year, I learned of a paper by Elza Erkip and Tom Cover, “The efficiency of investment information,” which uses one of my favorite quantities, the Hirschfeld–Gebelein–Rényi maximal correlation; I first discovered it in this gem of a paper by Witsenhausen.

The Hirschfeld–Gebelein–Rényi maximal correlation $\rho_m(X,Y)$ between two random variables $X$ and $Y$ is

$\sup_{f \in \mathcal{F}_X, g \in \mathcal{G}_Y} \mathbb{E}[ f(X) g(Y) ]$

where $\mathcal{F}_X$ is the set of all real-valued functions such that $\mathbb{E}[ f(X) ] = 0$ and $\mathbb{E}[ f(X)^2 ] = 1$, and $\mathcal{G}_Y$ is the set of all real-valued functions such that $\mathbb{E}[ g(Y) ] = 0$ and $\mathbb{E}[ g(Y)^2 ] = 1$. It’s a cool measure of dependence that covers both discrete and continuous variables, since they all get passed through these “normalizing” $f$ and $g$ functions.
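To make the definition concrete for finite alphabets, here is a minimal sketch that estimates $\rho_m(X,Y)$ from a joint pmf by alternating conditional expectations: set $f(x) \propto \mathbb{E}[g(Y) | X = x]$, then $g(y) \propto \mathbb{E}[f(X) | Y = y]$, re-centering and re-normalizing each time. This is a power-iteration technique I am swapping in for illustration, not something taken from the papers above. The function name `maximal_correlation`, the `joint[i][j]` layout, and the iteration count are all assumptions of the sketch, and it assumes every symbol has positive marginal probability.

```python
import math
import random


def maximal_correlation(joint, iters=500, seed=0):
    """Estimate the HGR maximal correlation of a finite joint pmf.

    joint[i][j] = P(X = i, Y = j); all marginals are assumed positive.
    Alternates conditional expectations, keeping f and g zero-mean
    and unit-variance under their own marginals.
    """
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    nx, ny = len(px), len(py)

    def normalize(h, p):
        # Center and scale h so that E[h] = 0 and E[h^2] = 1 under p.
        mean = sum(hi * pi for hi, pi in zip(h, p))
        h = [hi - mean for hi in h]
        norm = math.sqrt(sum(hi * hi * pi for hi, pi in zip(h, p)))
        return [hi / norm for hi in h] if norm > 1e-12 else None

    rng = random.Random(seed)
    g = normalize([rng.random() for _ in range(ny)], py)
    if g is None:  # Y is constant, so no admissible g exists
        return 0.0

    rho = 0.0
    for _ in range(iters):
        # f(x) proportional to E[g(Y) | X = x]
        f = normalize(
            [sum(joint[i][j] * g[j] for j in range(ny)) / px[i] for i in range(nx)],
            px,
        )
        if f is None:  # conditional expectation vanished: treat as independent
            return 0.0
        # g(y) proportional to E[f(X) | Y = y]
        g = normalize(
            [sum(joint[i][j] * f[i] for i in range(nx)) / py[j] for j in range(ny)],
            py,
        )
        if g is None:
            return 0.0
        # Current value of E[f(X) g(Y)]
        rho = sum(joint[i][j] * f[i] * g[j] for i in range(nx) for j in range(ny))
    return rho


# Uniform bit observed through a binary symmetric channel with crossover 0.1.
print(maximal_correlation([[0.45, 0.05], [0.05, 0.45]]))
```

For binary $X$ and $Y$ the only admissible $f$ and $g$ are the standardized variables themselves (up to sign), so the maximal correlation reduces to the absolute value of the ordinary correlation coefficient.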

The fact in the Erkip-Cover paper is this one:

$\sup_{ P(z|y) : Z \to Y \to X } \frac{I(Z ; X)}{I(Z ; Y)} = \rho_m(X,Y)^2$.

That is, the square of the HGR maximal correlation is the best (or worst, depending on your perspective) ratio of the two sides in the Data Processing Inequality:

$I(Z ; Y) \ge I(Z ; X)$ .
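As a sanity check in one special case, the sketch below computes the ratio $I(Z ; X) / I(Z ; Y)$ for a binary example: $Y$ is a uniform bit, $X$ is $Y$ flipped with probability $p$, and $Z$ is $Y$ flipped with probability $q$. For this pair the maximal correlation is $1 - 2p$. The restriction to symmetric binary channels for $P(z|y)$, and the helper names `mutual_information` and `dpi_ratio`, are assumptions made for illustration; this checks a single family of channels and is not a proof of the supremum.

```python
import math


def mutual_information(joint):
    """I(A;B) in bits for a joint pmf given as joint[a][b]."""
    pa = [sum(row) for row in joint]
    pb = [sum(col) for col in zip(*joint)]
    return sum(
        pab * math.log2(pab / (pa[a] * pb[b]))
        for a, row in enumerate(joint)
        for b, pab in enumerate(row)
        if pab > 0
    )


def flip_channel(e):
    """Binary symmetric channel: result[inp][out] = P(out | inp)."""
    return [[1 - e, e], [e, 1 - e]]


def dpi_ratio(p, q):
    """I(Z;X) / I(Z;Y) for the Markov chain Z -> Y -> X.

    Y is uniform on {0, 1}, X is Y flipped w.p. p (the given channel),
    and Z is Y flipped w.p. q (the channel P(z|y) we get to choose).
    Requires q != 0.5, otherwise I(Z;Y) = 0.
    """
    p_y = [0.5, 0.5]
    p_x_given_y = flip_channel(p)
    p_z_given_y = flip_channel(q)
    joint_zy = [[p_y[y] * p_z_given_y[y][z] for y in range(2)] for z in range(2)]
    # Z and X are conditionally independent given Y, so sum Y out.
    joint_zx = [
        [
            sum(p_y[y] * p_z_given_y[y][z] * p_x_given_y[y][x] for y in range(2))
            for x in range(2)
        ]
        for z in range(2)
    ]
    return mutual_information(joint_zx) / mutual_information(joint_zy)


p = 0.1
rho_sq = (1 - 2 * p) ** 2  # squared maximal correlation of this binary pair
for q in (0.1, 0.3, 0.45, 0.49):
    print(f"q={q:.2f}  ratio={dpi_ratio(p, q):.4f}  rho^2={rho_sq:.4f}")
```

In this family the ratio stays below $(1 - 2p)^2$ and creeps up toward it as $q$ approaches $1/2$, that is, as $Z$ becomes a very noisy look at $Y$.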

It’s a bit surprising to me that this fact is not better known. Perhaps it’s because the “data processing” is happening at the front end here (by choosing $P(z|y)$) and not in the actual data processing $Y \to X$, which is given to you.



HGR maximal correlation and the ratio of mutual informations