Greetings from the School of IT in Austin

I’m at the 2011 School of Information Theory at UT Austin — I’m mostly here to help out with the activities on Monday, which I will blog about after they happen (it’s a secret!) I missed most of the talks, but I went to the first part of Bob Gray’s lecture this morning on stationary codes and the connections between information theory and ergodic theory. Here’s his set of inclusions of the different species of random processes:

IID \subset B-processes \subset K-processes \subset strongly mixing \subset weakly mixing \subset stationary ergodic \subset stationary \subset block stationary \subset asymptotically stationary \subset asymptotically mean stationary

B-processes are the result of applying stationary codes to IID process and K-processes are those which satisfy the Kolmogorov 0-1 law. The last one implies that sample averages converge. It’s only appropriate to mention it, given the title of this blog.

The School is a nice event with a mixture of classical and “hot topics” in information theory. Even though my own research interests have shifted a bit away from the hard core of IT, there are still fun things to think about and it’s nice to see some new new and new old results.

the job market : so many applicants

I just got a rejection from the CS department at Brown, and they sadly neglected to Bcc the recipients, so I now know that they rejected 362 people in one fell swoop. Just glancing through the addresses I recognized several of the recipients. I think, based on my limited expertise in privacy-preserving algorithms, that this is pretty much satisfies the Dinur-Nissim definition of blatant non-privacy: if there are n applicants, I have reconstructed n - o(n) of them, since I can’t imagine that they would interview more than o(n). Ok that was a little more nerdy than I intended. I do think that they deserve a wag of the finger.

Update: I just got a followup saying that the sender “would like to recall the message…” Alas, no.

Update 2: Another followup came in saying to “please ignore and delete” the previous message. Does this mean I still have a chance?!? (Again, alas, no).


I never knew Jell-O could be so graceful.

I kind of like this version of Take Five from Sachal Music.

Sometimes the Library of Congress does awesome things. This jukebox is up there.

I wouldn’t have believed before that there is money in a bannass stand, but I could be wrong.

The clarity in this press nugget leaves a lot to be desired. The statement “the trio has found a way to determine the smallest number of nodes that must be externally controlled to force a given network from any initial state to any desired final state,” is so vague! The full article is here. It turns out they are looking at a linear control problem d\mathbf{x}/dt = A \mathbf{x}(t) + B \mathbf{u}(t) where the different elements of the state are related via a graph matched to A and you want the input \mathbf{u}(t) to only be nonzero on a subset of the nodes. Thanks to Ann Wehman for the pointer.

Responses to reviewers : raw TeX please

I am revising a paper now and one of the reviewers sent their comments as a PDF what looked like a Word document or RTF containing the LaTeX, rather than a rendered version. So it was a little annoying to read, what with all of the $’s and so on. The beauty of it was that I could just cut and paste from the PDF into my document of responses to the reviewers without having to reformat and it (almost) rendered without a hitch. There were some lingering issues though with quotation marks (I hate smart quotes) and itemizing/enumerating lists.

As a recommendation, I think that the raw .tex file of the review should be uploaded instead. That will make it much easier for the authors to revise, no? I plan on doing this in the future. What do you do?

Sampling one distribution out of another

I am reading this nice paper by Harsha et al on The Communication Complexity of Correlation and some of their results rely on the following cute lemma (slightly reworded).

Let P and Q be two distributions on a finite set \mathcal{X} such that the KL-divergence D(P \| Q)  > 0. Then there exists a sampling procedure which, when given a sequence of samples x_1, x_2, \ldots drawn i.i.d. according to Q, will output (with probability 1) an index i^{\ast} such that the sample x_{i^{\ast}} is distributed according to the distribution P and the expected encoding length of the index i^{\ast} is at most

D(P \| Q) + 2 \log( D(P \| Q) + 1 ) + O(1)

where the expectation is taken over the sample sequence and the internal randomness of the procedure.

So what is the procedure? It’s a rejection sampler that does the following. First set p_0(x) = 0 for all x \in \mathcal{X} and set p_0^{\ast} = 0 as well. Then for each i = 1, 2, \ldots do the following:

  1. \alpha_i(x) = \min\{ P(x) - p_{i-1}(x), \ (1 - p^{\ast}_{i-1}) Q(x) \}
  2. p_i(x) = p_{i-1}(x) + \alpha_i(x)
  3. p_{i^{\ast}} = \sum_{x} p_{i}(x)
  4. \beta_i = \alpha_i(x_i)/( (1 - p_{i-1}^{\ast}) Q(x_i) )
  5. Output i with probability \beta_i.

Ok, so what is going on here? It turns out that \alpha_i(x) is tracking the probability that the procedure halts at i with x_i = x (this is not entirely clear at first). Thus p_i(x) is the probability that we halted before time i and the sample is x, and p_i^{\ast} is the probability that we halt at time i. Thus (1 - p_{i-1}^{\ast}) is the probability that the procedure gets to step i. Getting back to \alpha_i(x), we see that x_i = x with probability Q(x_i) and we halt with probability \beta_i, so indeed, \alpha_i(x) is the probability that we halt at time i with x_i = x.

What we want then is that P(x) = \sum_{i=1}^{\infty} \alpha_i(x). So how should we define \alpha_i(x)? It’s the minimum of two terms : in order to stop at time i such that x_i = x, the factor \alpha_i(x) is at most (1 - p_{i-1}^{\ast}) Q(x). However, we don’t want to let p_{i}(x) = \sum_{j=1}^{i} \alpha_j(x) exceed the target probability P(x), so \alpha_i(x) must be less than P(x) - p_{i-1}(x). The authors say that in this sense the procedure is greedy, since it tries to get p_i(x) as close to P(x) as it can.

In order to show that the sampling is correct, i.e. that the output distribution is P, we want to show what p_i(x) \to P(x). This follows by induction. To get a bound on the description length of the output index, they have to show that

\mathbb{E}[ \log i^{\ast} ] \le D(P \| Q) + O(1).

This is an interesting little fact on its own. It comes out because

i \le \frac{1}{1 - p_{i-1}^{\ast}} \cdot \frac{P(x)}{Q(x)} + 1

and then expanding out \mathbb{E}[ \log i^{\ast} ] = \sum_{x} \sum_{i} \alpha_i(x) \cdot \log i.

One example application of this Lemma in the paper is the following problem of simulation. Suppose Alice and Bob share an unlimited amount of common randomness. Alice has some X drawn according to P(x) and wants Bob to generate a Y according to the distribution P(y | x). So ideally, (X,Y) have joint distribution P(x) P(y | x). They can both generate a sequence of common y_1, y_2, \ldots according to the marginal distribution of Y. Then Alice tries to generate the distribution P(y | x) according to this sampling scheme and sends the index i^{\ast} to Bob, who chooses the corresponding y_{i^{\ast}}. How many bits does Alice have to send on average? It’s just

\mathbb{E}_{P(x)}[ D( P(y|x) \| P(y) ) + 2 \log( D( P(y|x) \| P(y) ) + 1) + O(1) ].

which, after some love from Jensen’s inequality, turns out to be upper bounded by

I(X ; Y) + 2 \log( I(X; Y) + 1) + O(1) ].

Pretty cute, eh?

Concert Bleg : Friede auf Erden with Orchestra Nova

San Diego folks : I’m singing in another concert next weekend! We’re only doing the < 10 minute piece Friede Auf Erden by Schoenberg, but there’s the spoonful of sugar that is Beethoven 5 to help the decayed remnants of late-Romantic harmonics go down smoothly…

Victory Through Peace
Orchestra Nova San Diego
Jung-Ho Pak, conductor

Symphony No. 5
Ludwig van Beethoven
Egmont Overture
Ludwig van Beethoven
Peace on Earth
Arnold Schoenberg
Featuring SACRA/PROFANA, dir. Krishan Oberoi
Ascent to Victory
Nancy Bloomer Deussen

Friday, May 13, 2011 – 7:30 p.m. at St. Paul’s Cathedral, 2728 Sixth Avenue San Diego, CA
Saturday, May 14, 2011 – 7:30 p.m. at the Irwin M. Jacobs Qualcomm Hall, 5775 Morehouse Drive, San Diego, CA
Monday, May 16, 2011 – 7:30 p.m. at Sherwood Auditorium, 700 Prospect Street, La Jolla, CA

Ticket prices depend on venue.


I’m heading off to Mexico in less than 12 hours for a week during which I hope to disconnect : no email, web, or phone. I guess I’ll miss the majority of the post-Bin Laden news cycle. In the meantime, here are some more links because I am too lazy to post content.

Speaking of 9/11, this is simply terrible.

An interview with George Saunders, one of my favorite authors.

Blackwell’s proof of Wald’s Identity, as told by Max.

Long pepper looks fascinating and tasty!

Can Voter ID Laws Be Administered in a Race-Neutral Manner? The short answer is no. The longer answer is a 30 page paper.

Frank has blogger about our trip last weekend to The 2nd 8th Annual Grilled Cheese Invitational. My arteries may never be the same again.

There are no more typewriter factories. This makes me especially sad, as I have a 1914 Underwood No. 5 that I love (and lug).