July 2010

Via Inherent Uncertainty I learned that David Blackwell passed away on July 8th.

Prof. Blackwell’s original paper (with Leo Breiman and Aram Thomasian) on the arbitrarily varying channel was an inspiration to me, and he served on my thesis committee a scant 2 years ago.

I’ll always remember what he told me when I handed him a draft of my thesis. “The best thing about Bayesians is that they’re always right.”

I owe Lav a response about birthdays, but in lieu of that, here are some links.

Via Matt Pugh, a grad student here, when intuition and math probably look wrong.

Via my father, what poetry and mathematics have in common.

I think I lack the willpower to write up more notes on talks, and there are other things I’d like to blog about, but here are one or two sentences on some other talks that I found interesting… I also enjoyed the energy session on Friday and the talks by Sundeep and Galen on compressed sensing, but time has gotten the better of me. Next time, folks.

Channel Intrinsic Randomness
Matthieu Bloch
This was on extracting random bits from the output of noisy channel. These bits should be independent of the input to the channel. Matthieu uses the enigmatic information spectrum method to get his results — thanks to the plenary lecture I was able to understand it a bit better than I might have otherwise.

Assisted Common Information
Vinod Prabhakaran and Manoj Prabhakaran
I was very interested in this talk because I have been thinking of a related problem. Two terminals observe correlated sources X_1^n and Y_1^n respectively. A genie observes both sources and sends messages at rates R_1 and R_2 to the two terminals, who then have to produce variables W_1 and W_2 which have large entropies and are also equal with high probability. This problem is connected to the Gacs-Korner problem and Wyner’s common information problem, and also possibly this recent preprint by Andrej Bogdanov and Elchanan Mossel. They manage to solve it using a novel construction of “monotone regions.”

Patterns and Exchangeability
N. P. Santhanam and M. Madiman
This work grew out of the AIM workshop on permanents. De Finetti’s theorem says an exchangeable process can be built up as a mixture of iid processes. Kingman showed that something called an exchangeable partition process is built up from something he called “paintbox processes.” One thing that we discovered at the workshop was that the pattern process of an iid process is the same as a paintbox process (and vice-versa). The paper then goes through many connections between these processes, certain limits of graphs, and connections to universal compression.

Universal Hypothesis Testing in the Learning-Limited Regime
Benjamin G. Kelly, Thitidej Tularak, Aaron B. Wagner, and Pramod Viswanath
This was a really great talk. The problem here is that for each n you get n samples X^n distributed i.i.d. according to p_n or q_n on an alphabet \mathcal{A}_n which can grow with n. Given a new sequence Z^n you have to decide if it was generated according to p_n or q_n. They show a number of results which say that consistency is possible for |\mathcal{A}_n| sublinear in n, impossible for quadratic in n, and other intermediate results. In particular, for well-behaved distributions with |\mathcal{A}_n| = \Theta(n^{\alpha}) and all probabilities are \Theta(n^{-\alpha}), they can get some consistency results, but in particular the generalized likelihood ratio test (GLRT) is inconsistent for \alpha = 1.

Feature Extraction for Universal Hypothesis Testing via Rank-Constrained Optimization
Dayu Huang and Sean Meyn
This talk was of interest to me because I have been looking at hypothesis testing problems in connection with election auditing. In universal testing you know a lot about the distribution for one hypothesis, but much less about the other hypothesis. The Hoeffding test is a threshold test on the KL-divergence between the empirical distribution and the known hypothesis. This test is asymptotically optimal but has a high variance when the data is in high dimension. Thus for smaller sample sizes, a so-called mismatched divergence test may be better. In this paper they look at how to tradeoff the variance and the error exponent of the test.

I seem to have gotten all behind on wrapping up the ISIT blogging, so the remainder may be more compressed takes on things. This is not in the compressed sensing world, where the signals are sparse and my comments are meant to reconstruct, but more like lossy compression where D \to \sigma^2 (for the Gaussian case).

Abbas El Gamal gave a very nice plenary on “Coding for Noisy Networks” in which he really brought together a lot of different eras and streams of work on network information theory and tried to tie them together in a conceptual framework. There was a nice mix of older and newer results. The thing I liked best about it was that he was very optimistic about making progress in understanding how to communicate in networks from an information-theory perspective, which counteracts the sentiment that I heard that “well, it’s just too messy.”

Te Sun Han gave the Shannon Lecture, of course, and he used his time to give a tutorial on the information spectrum method. I had tried to read the book earlier, and honestly found it a little impenetrable (or rather, I wasn’t sure what I was supposed to use from it). The talk was more like reading the papers — concisely stated, but with a clear line of intuition. I know some people are not a big fan of Shannon Lectures as tutorials, but I think there is also a case to be made that most people are unfamiliar with the information spectrum method. A nice example he gave was to show when the output of an optimal source coder looks “completely random.” Maybe this has been done already, but is there a connection between existing theories of pseudorandomness and the information spectrum method?

Cosma has posted a short proof of an ergodic theorem, which makes for some light weekend reading…

I recently came across this paper:

Volumes of Generalized Unit Balls
Xianfu Wang
Mathematics Magazine, Vol. 78, No. 5 (Dec., 2005), pp. 390-395

which has nice formula for a “generalized unit ball” in \mathbb{R}^n:

\mathbb{B}_{p_1,p_2,\ldots,p_n} = \{ \mathbf{x} = (x_1, x_2, \ldots, x_n) : |x_1|^{p_1} + |x_2|^{p_2} + \cdots + |x_n|^{p_n} \le 1 \}

These balls can look pretty crazy (as some pictures in the paper show).

The main result is that for p_1, \ldots, p_n > 0, the volume is equal to

\mathrm{Vol}(\mathbb{B}_{p_1,\ldots,p_n}) = 2^n \frac{ \Gamma(1 + 1/p_1) \cdots \Gamma(1 + 1/p_n) }{ \Gamma(1/p_1 + 1/p_2 + \cdots + 1/p_n + 1) }

The formula for the volume of the n-sphere in the L_2 norm is well known, but this formula lets us calculate all sorts of volumes. For example, for the unit L_1 ball we get the rather clean and beautiful formula

2^n \frac{\Gamma(2)^n}{\Gamma(n + 1)} = \frac{2^n}{n!}

The proof given in the note is by induction, and a remark at the end points to several other proofs based on Laplace transforms.


Get every new post delivered to your Inbox.

Join 159 other followers