Another take on matched sources and channels

Ehsan Ardestanizadeh posed a little problem in Young-Han Kim‘s group meeting before break and Paolo Minero presented a cute solution which I thought might be of interest. It’s related to ideas my advisor, Michael Gastpar worked on in his thesis.

Suppose I have a Gaussian variable S that is \mathcal{N}(0,1), so zero-mean and unit variance. If I observe Y = S + Z where Z is also \mathcal{N}(0,1) and independent of S, the minimum mean-square error (MMSE) estimate of S given Y is just

\mathsf{MMSE}( S | Y) = \frac{1}{2} Y

and the MSE is \mathbb{E}[ (S - \hat{S})^2 ] = 1/2. The question is this: can we somehow get an MSE of less than 1/2 by “encoding” S? Suppose now we can take any function f and let X = f(S), and Y = X + Z but with the restriction that \mathsf{Var}(X) \le 1. That is, the encoding cannot take any more power.

The answer is no, and comes via an “information theoretic” argument. Consider the vector version of the problem where you have n iid unit variance Gaussians S^n and you want to estimate S^n from Y^n = f(S^n) + Z^n where Z^n is iid unit variance Gaussian as well. The goal is to minimize the average per-letter distortion:

d(\hat{S}^n, S^n) = \frac{1}{n} \sum_{i=1}^{n} \mathbb{E}[ (S_i - \hat{S}_i)^2 ]

This is just the problem of joint source-channel coding of a Gaussian source over an AWGN channel with quadratic distortion and encoding function f must satisfy a unit power constraint. For this problem the rate distortion function is

R(D) = \frac{1}{2} \log \frac{1}{D}

and the capacity of the channel is \frac{1}{2} \log(1 + 1) = \frac{1}{2}. Since separate source coding (compression) followed by channel coding (error control) is optimal, in order to get distortion D the rate R(D) \le 1/2 so D \ge 1/2. Furthermore, this is achievable with no coding at all by just setting f(S^n) = S^n.

Now if there was a scheme for the single-letter case which got MSE less than 1/2, we could concatenate it to get a vector scheme with distortion less than 1/2. But since D \ge 1/2 in the optimal code, we get a contradiction. Thus encoding does not help in the single-letter case either. If S isn’t Gaussian the whole story changes though.