# ISIT 2010 : some talks on statistics and probability

For many of the talks I attended I didn’t take notes — partly this is because I didn’t feel expert enough to note things down correctly, and partly because I am

RÉNYI DIVERGENCE AND MAJORIZATION (Tim van Erven; Centrum Wiskunde & Informatica, Peter Harremoës; Copenhagen Business College)
Peter gave a talk on some properties of the Rényi divergence

$D_{\alpha}(P \| Q) = \frac{1}{\alpha - 1} \log \int p^{\alpha} q^{1 - \alpha} d \mu$

where $\mu$ is a dominating measure, and a related geometric quantity, the Lorenz diagram. This is the set of all points in $\mathbb{R}^2$ which are equal to $(\int f dP, \int f dQ)$ for some measurable function $f$ taking values in $[0,1]$. It turns out that subset relationships of Lorenz diagrams are the same as majorization relationships between the measures. This is related to something called Markov ordering from a recent paper by Gorban, Gorban, and Judge. This was one of those nice talks which pointed me towards some results and tools that are new to me but that I could not fully understand during the talk.
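
For discrete distributions the integral in the definition of $D_{\alpha}$ becomes a sum, which makes the divergence easy to experiment with. The following is a minimal sketch, not something from the talk: the function name `renyi_divergence` and the example distributions are made up for illustration, $\mu$ is taken to be the counting measure, and only $\alpha > 0$, $\alpha \ne 1$ is handled.

```python
import math

def renyi_divergence(p, q, alpha):
    """Renyi divergence D_alpha(P || Q) in nats for discrete P and Q.

    p and q are sequences of probabilities over the same finite alphabet.
    Only alpha > 0 with alpha != 1 is handled in this sketch.
    """
    if alpha <= 0 or alpha == 1:
        raise ValueError("this sketch needs alpha > 0 and alpha != 1")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue  # p^alpha = 0 when alpha > 0
        if qi == 0:
            if alpha > 1:
                return math.inf  # q^(1 - alpha) blows up where P has mass
            continue  # for alpha < 1, q^(1 - alpha) = 0
        total += pi ** alpha * qi ** (1 - alpha)
    if total == 0:
        return math.inf  # P and Q have disjoint supports
    return math.log(total) / (alpha - 1)

# Illustrative (made-up) distributions on a two-letter alphabet.
p = [0.5, 0.5]
q = [0.25, 0.75]
for alpha in (0.5, 0.999, 2.0):
    print(alpha, renyi_divergence(p, q, alpha))
```

As $\alpha \to 1$ the values approach the KL divergence, which for this pair is $\frac{1}{2} \log \frac{4}{3} \approx 0.144$ nats, and the divergence is nondecreasing in $\alpha$.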

MINIMAX LOWER BOUNDS VIA F-DIVERGENCES (Adityanand Guntuboyina; Yale University)
This talk was on minimax lower bounds for parameter estimation which can be calculated in terms of generalized $f$-divergences, which are of the form

$D_f(P \| Q) = \int f \left( \frac{dP}{dQ} \right) dQ$

Suppose $T$ is an estimator for some parameter $\theta$. We want universal (over $T$) lower bounds on $\mathbb{P}(T(X^n) \ne \theta)$. The talk gave a generalization of Fano’s inequality:

$\mathbb{P}(T(X^n) \ne \theta) \ge \inf_Q \sum_{\theta} D_f(P_{\theta} \| Q)$

This has a very geometric feel to it but I had a bit of a hard time following all of the material since I didn’t know the related literature very well (much of it was given in relation to Yang and Barron’s 1999 paper).
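
To make the definition of $D_f$ concrete, here is a small sketch for discrete distributions, where $dP/dQ$ is just the ratio of probability masses. It is an illustration only and not taken from the talk: the helper names are hypothetical, and it assumes $Q$ puts positive mass on every point so that the ratio is well defined. Different convex choices of $f$ recover familiar divergences.

```python
import math

def f_divergence(p, q, f):
    """D_f(P || Q) = sum_x q(x) * f(p(x) / q(x)) for discrete P and Q.

    Assumes q(x) > 0 for every x (an assumption of this sketch).
    """
    return sum(qi * f(pi / qi) for pi, qi in zip(p, q))

def f_kl(t):
    # f(t) = t log t gives the KL divergence (with 0 log 0 = 0).
    return t * math.log(t) if t > 0 else 0.0

def f_tv(t):
    # f(t) = |t - 1| / 2 gives the total variation distance.
    return 0.5 * abs(t - 1)

def f_chi2(t):
    # f(t) = (t - 1)^2 gives the chi-squared divergence.
    return (t - 1) ** 2

# Illustrative (made-up) distributions on a two-letter alphabet.
p = [0.5, 0.5]
q = [0.25, 0.75]
print("KL   :", f_divergence(p, q, f_kl))
print("TV   :", f_divergence(p, q, f_tv))
print("chi^2:", f_divergence(p, q, f_chi2))
```

The choice $f(t) = t \log t$ is the one behind the classical form of Fano’s inequality, which is stated in terms of KL divergence and mutual information.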

MUTUAL INFORMATION SADDLE POINTS IN CHANNELS OF EXPONENTIAL FAMILY TYPE (Todd Coleman; University of Illinois, Maxim Raginsky; Duke University)
In some cases where we have a class of channels and a class of channel inputs, there is a saddle point. The main example is that Gaussian noise is the worst for a given power constraint and Gaussian inputs are the best. Another example is the “bits through queues” channel. This paper gave a more general class of channels of “exponential family type” and gave a more general condition under which you get a saddle point for the mutual information. The channel models are related to the Gibbs variational principle, and the arguments had a kind of “free energy” interpretation. Ah, statistical physics rears its head again.

INFORMATION-THEORETIC BOUNDS ON MODEL SELECTION FOR GAUSSIAN MARKOV RANDOM FIELDS (Wei Wang, Martin J. Wainwright, Kannan Ramchandran; University of California, Berkeley)
This was on two problems: inferring edges in a graphical model from iid samples from that model, and inverse covariance estimation. They are similar, but not quite the same. The goal was to prove necessary conditions for doing this; these necessary conditions match the corresponding achievable rates from polytime algorithms. The main result was that you need $n = \Omega( d^2 \log p)$ samples, where $p$ is the number of variables, and $d$ is the max degree in the graphical model. The proof approach was through modeling graph selection as channel coding in which the channel outputs samples from a graphical model chosen by the encoder. If the decoder can identify the model then the problem is solvable. This is analogous to the talk Sriram Vishwanath gave today on matrix completion.

# ISIT 2010 : Anthony Ephremides

“The first phase [of life] you believe in Santa Claus, the second you don’t believe in Santa Claus, and in the third you become Santa Claus.” — Tony Ephremides

Prof. Anthony Ephremides gave the third plenary at ISIT, which included an interim report on the consummation between information theory and network coding and many connections between opera and research. I think he did a great job of explaining the difference between the throughput region, the stability region, and the capacity region (under bursty vs. non-bursty use). These are in increasing order of inclusion. Some interesting tidbits (some new, some not):

• He had a complaint about the way networking handles fading by simply saying the success probability for a packet being received is just $\mathbb{P}( SNR > \gamma)$.
• Under contention, the capacity region may not be convex, unlike in information theory where you can do time sharing.
• For wireless network coding it is important to connect the MAC protocol and scheduling issues as well as change the notion of cut capacities. That is, you shouldn’t replace edges with hyper-edges, because that’s not a good model.
• The information rate is the rate from the payload plus the information in the idleness and the information from the identity of the transmitter. That is, you get data from the actual packet, when the packet was sent, and who is sending the packet.
• Extending many analyses to more than 2 users has not been done.
• Can bursty traffic capacity be larger than non-bursty? The NNN (*) community says that would be a contradiction, because you can always then emulate bursty traffic. But this is unfair because idling the transmitter to build up packets and hence create artificial bursts is not allowed in the problem formulation, so there is no contradiction.
• Relaying is good from a network perspective because it can partially enable a first-come first-serve (FCFS) discipline. So relays bridge a gap to the optimal scheduling policy. This is different than the information theory notion of cooperation yielding diversity gain.
• Multicast throughput is an interesting thing to think about more rigorously for the future.
• His prescription : the information theory community has to apply its tools more broadly to networking problems and to unorthodox problems. The networking community should use more rigorous modeling and analysis methods.

(*) NNN = Nagging nabobs of negativism (really it should be “nattering nabobs of negativism,” Safire’s famous alliterative phrase wielded with such gusto by Spiro Agnew).
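
The packet-success model from the first bullet above, $\mathbb{P}(SNR > \gamma)$, has a closed form once a fading distribution is fixed. As a sketch, assume Rayleigh fading (an assumption made here for illustration, not something stated in the talk), so that the instantaneous SNR is exponentially distributed and the success probability is $e^{-\gamma / \overline{SNR}}$, where $\overline{SNR}$ is the mean SNR. The function names and the numbers are hypothetical.

```python
import math
import random

def packet_success_prob(gamma, mean_snr):
    """P(SNR > gamma) when SNR is exponential with the given mean.

    The exponential SNR corresponds to Rayleigh fading, which is an
    assumption of this sketch.
    """
    return math.exp(-gamma / mean_snr)

def simulate_success_rate(gamma, mean_snr, trials=200_000, seed=0):
    """Monte Carlo estimate of the same probability, as a sanity check."""
    rng = random.Random(seed)
    # expovariate takes the rate, i.e. 1 / mean.
    hits = sum(rng.expovariate(1.0 / mean_snr) > gamma for _ in range(trials))
    return hits / trials

# Illustrative numbers on a linear (not dB) scale.
gamma, mean_snr = 5.0, 10.0
print("closed form:", packet_success_prob(gamma, mean_snr))
print("simulated  :", simulate_success_rate(gamma, mean_snr))
```

This collapses the whole physical layer into a single number per packet, which is the kind of simplification the complaint was aimed at.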

# Csiszár and Körner to be back in print

According to our Cambridge University Press representative Lorena Verdes, the press will be republishing Csiszár and Körner at the end of this year.