This is kind of a cop-out, since it's a two-page "note" rather than a full paper, but since I'm so out of practice at reading and synthesizing papers I think it'll be a baby step back into the swing of things. This paper is another information-theory one, but it uses a more general strategy that is easily explained to a general audience.
A Counterexample to Cover’s 2P Conjecture on Gaussian Feedback Capacity
Young-Han Kim
arXiv: cs.IT/0511019
Imagine you have a communication channel that can be modeled as

$$ Y_i = X_i + Z_i, $$
where X is the input, Y is the output, and Z is a stationary Gaussian process (not necessarily white noise, but possibly shaped with some spectrum). The approach information theory takes is to limit ourselves to a block of n time steps and then pick a set of $2^{nR}$ sequences of length n that will encode $2^{nR}$ messages. These sequences are called codewords, and R is the rate of the code, measured in bits/time. If you use n = 10 time steps at rate R, you get a total of 10R bits, which can stand for one of $2^{10R}$ messages. A coding system for this model picks a set of codewords and a decoder that will take the noisy output Y = X + Z and guess which codeword was sent. Shannon's noisy channel coding theorem says that there's a number C, called the capacity of the channel, such that for any rate R below C, the probability of making a decoding error goes to 0 as the blocklength n goes to infinity.
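In symbols, writing W for the transmitted message and $\hat{W}$ for the decoder's guess, the theorem says (this is the standard statement, nothing specific to this paper):

$$ R < C \quad \Longrightarrow \quad \Pr\{\hat{W} \neq W\} \to 0 \ \text{ as } n \to \infty. $$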
Of course, you have constrained resources, and in the above Gaussian problem we assume that the power (squared Euclidean norm) of the codewords is constrained to nP. If we fix the noise power as well, the capacity is neatly parameterized in terms of P, and we can denote it by C(P). The paper here is about the role of feedback: if the encoder knows exactly what the decoder receives, then picking codewords "in advance" may not be a good idea, since the encoder can adapt what it sends based on "how confused" the decoder is. There's a longstanding conjecture, due to Cover, that feedback can't improve the actual capacity by more than doubling the effective power, so that $C_{FB}(P)$ is no bigger than C(2P).
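For the special case of white noise with power N (the colored-noise capacity that the paper actually deals with is instead given by a water-filling characterization), the capacity has the familiar closed form, and the conjecture reads:

$$ C(P) = \frac{1}{2} \log_2\!\left(1 + \frac{P}{N}\right) \ \text{bits/time}, \qquad C_{FB}(P) \stackrel{?}{\leq} C(2P). $$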
It turns out this is false, and the scheme used to show it is a modification of an old one by Schalkwijk and Kailath (IEEE Transactions on Information Theory, 1966). The intuition is that with perfect feedback, the encoder knows exactly what the decoder does, and therefore can mimic the decoder's operation perfectly. Imagine Alice is talking to Bob, but Alice knows exactly what Bob thinks she is saying. Using this knowledge, she can tailor the next thing she says to correct Bob's misunderstanding as best she can. Bob will still not quite get it, but he will be closer, so Alice can iteratively zoom in on the precise meaning until she's satisfied that Bob has gotten close enough.
Another way of thinking about it is helping someone hang a picture. You can see whether the picture is level or not, and you issue commands: "two inches higher on the left," "one inch on the right," "half an inch on the left," "a nudge more on the right," until it's hung straight. This is precisely the intuition behind the Schalkwijk-Kailath coding scheme (sketched in code below). With this kind of feedback you can be more precise than you could without it, and similarly in this Gaussian communication problem, you can communicate more bits reliably with feedback than without.
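To make the zooming-in concrete, here's a minimal simulation (my own sketch, assuming white noise with power N; the constants and variable names are illustrative, not from the paper) of the error-correction recursion at the heart of the Schalkwijk-Kailath scheme. The message is a point theta in an interval; each round, the encoder learns the decoder's current estimate through the feedback and transmits the decoder's estimation error, scaled up to full power:

```python
import numpy as np

rng = np.random.default_rng(0)

P = 1.0   # transmit power constraint per symbol
N = 1.0   # white noise power (white noise assumed for this sketch)
n = 20    # number of channel uses

# Message: a point in [-1/2, 1/2]; more messages = finer spacing of points.
theta = 0.3141

# Round 1: send the message point itself, scaled to power P
# (a uniform message on [-1/2, 1/2] has variance 1/12).
x = np.sqrt(12 * P) * theta
y = x + rng.normal(scale=np.sqrt(N))
theta_hat = y / np.sqrt(12 * P)   # decoder's first estimate
alpha = N / (12 * P)              # variance of the decoder's estimation error

# Rounds 2..n: the encoder sees theta_hat via feedback and transmits the
# decoder's current error at full power; the decoder applies the MMSE
# correction, which shrinks its error variance by a factor N/(P+N) per round.
for _ in range(n - 1):
    eps = theta_hat - theta                 # error, known to encoder via feedback
    x = np.sqrt(P / alpha) * eps            # scale the error up to power P
    y = x + rng.normal(scale=np.sqrt(N))
    theta_hat -= np.sqrt(alpha * P) / (P + N) * y   # MMSE update
    alpha *= N / (P + N)                    # error-variance recursion

print(f"final error:   {abs(theta_hat - theta):.3e}")
print(f"predicted std: {np.sqrt(alpha):.3e}")
```

Each round shrinks the error variance by a factor of 1/(1 + P/N), so after n channel uses the error standard deviation is roughly $(1 + P/N)^{-n/2}$ — small enough to reliably distinguish about $2^{nC(P)}$ message points packed into the interval, i.e., to operate at the (white-noise) capacity.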
The result is not earth-shattering, since there's another result that says you can get at most half a bit per unit time more with feedback than without. But it's a nice application of the coding scheme and a cool demonstration of the usefulness of concrete examples.
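For the record, the half-bit bound referred to here is, in symbols,

$$ C_{FB}(P) \leq C(P) + \frac{1}{2} \ \text{bit per time step}, $$

so feedback can never buy you more than half a bit of rate, even though (as this paper shows) it can buy you more than a doubling of the effective power.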