and years later they write…

A little end of the semester news, but good news this time…

Dear Mr. Anand Dilip Sarwate:

We are pleased to inform you that your manuscript, “Exact emulation of a priority queue with a switch and delay lines” has been accepted for publication in Queueing Systems.

But no rest for the wicked, as they say…

a paper a day : 1

A new feature! Just to keep myself motivated on research and to dissuade people from reading the blog, I am trying to “read” one research paper a day (-ish) to get the new ideas running around my head. And you guessed it, I’m going to blog the interesting (to me) ideas here.

Denoising and Filtering Under the Probability of Excess Loss Criterion (PDF)
Stephanie Pereira and Tsachy Weissman
Proc. 43rd Allerton Conf. Communication, Control, and Computing (2005)

This paper looks at the discrete denoising problem, which is related to filtering, estimation, and lossy source coding. Very briefly, the idea is that you have an i.i.d. sequence of pairs of discrete random variables taking values in a finite alphabet:

$(X_1, Z_1), (X_2, Z_2), \ldots, (X_n, Z_n)$

where X is the “clean” source and Z is the “noisy” observation, so that the joint distribution is p(x,z) = p(x) p(z | x), where p(z | x) is some discrete memoryless channel. A denoiser is a set of mappings

$g_i : \mathcal{Z}^n \to \hat{\mathcal{X}}, \qquad i = 1, 2, \ldots, n,$

so that g_i(z^n) is the “estimate” of X_i. One can impose many different constraints on these functions g_i. For example, they may be forced to operate only causally on the Z sequence, or may only use a certain subset of the Z’s, or only the single symbol Z_i. This last case is called a symbol-by-symbol denoiser. The goal is to minimize the time-average of some per-symbol loss function $\Lambda$:

$L_n = \frac{1}{n} \sum_{i=1}^{n} \Lambda\big(X_i, g_i(Z^n)\big)$

This minimization is usually done on the expectation E[L_n], but this paper chooses to look at the probability of exceeding a certain value, P(L_n > D).

The major insight I got from this paper was that you can treat the individual terms of the loss function,

$h_k = \Lambda\big(X_k, g_k(Z^n)\big), \qquad k = 1, 2, \ldots, n,$

as outputs of a source with time-varying (arbitrarily varying) statistics. Conditioned on Z^n, each h_k is independent with a distribution in a finite set of possible distributions. Then, to bound the probability P(L_n > D), they prove a large deviations result on L_n, which is the time-average of this arbitrarily varying source.
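To get a feel for the excess-loss criterion, here is a minimal Monte Carlo sketch in Python (my own toy illustration, not code from the paper): a binary source goes through a binary symmetric channel, the denoiser is the trivial symbol-by-symbol rule that just outputs what it sees, the loss is Hamming, and P(L_n > D) is estimated by simulation.

    import random

    def excess_loss_prob(n=100, p_flip=0.2, D=0.25, trials=2000):
        # Estimate P(L_n > D) for the "say what you see" denoiser hat{X}_i = Z_i,
        # Hamming loss, X_i ~ Bernoulli(1/2), and a BSC with crossover p_flip.
        exceed = 0
        for _ in range(trials):
            losses = 0
            for _ in range(n):
                x = random.randint(0, 1)
                z = x ^ 1 if random.random() < p_flip else x
                losses += (z != x)      # Hamming loss of the estimate hat{X}_i = Z_i
            if losses / n > D:
                exceed += 1
        return exceed / trials

    print(excess_loss_prob())

For this trivial denoiser the per-symbol losses are i.i.d. Bernoulli(p_flip), so P(L_n > D) decays exponentially in n by a standard Chernoff bound; the interesting part of the paper is handling denoisers whose conditional loss distributions change from symbol to symbol.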

Some of the other results in this paper are

  • For a Hamming loss function the optimal denoiser is symbol-by-symbol (a toy sketch of such a rule follows this list).
  • Among symbol-by-symbol denoisers, time-invariant ones are optimal.
  • A large deviations principle (LDP) for block denoisers and some analysis of the rate.
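
To illustrate the first two bullets, here is what a time-invariant symbol-by-symbol rule can look like: the sketch below (my own toy example, not from the paper) computes the MAP rule $\hat{x}(z) = \arg\max_x p(x) p(z|x)$, which minimizes the expected Hamming loss among rules that look only at the current symbol.

    # Hypothetical ternary source and channel, chosen just for illustration.
    p_x = {0: 0.5, 1: 0.3, 2: 0.2}              # source distribution p(x)
    p_z_given_x = {                              # channel p(z | x)
        0: {0: 0.8, 1: 0.1, 2: 0.1},
        1: {0: 0.1, 1: 0.8, 2: 0.1},
        2: {0: 0.1, 1: 0.1, 2: 0.8},
    }

    def map_denoiser(p_x, p_z_given_x):
        # For each observed symbol z, output the x maximizing p(x) p(z | x).
        alphabet = list(p_x)
        return {z: max(alphabet, key=lambda x: p_x[x] * p_z_given_x[x][z])
                for z in alphabet}

    print(map_denoiser(p_x, p_z_given_x))        # {0: 0, 1: 1, 2: 2} for these numbers

Whether such a rule remains the right thing to do under the excess-loss criterion is exactly the kind of question the paper addresses.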

Most of the meat of the proofs is in a preprint which seems to still be in flux.

gossip and geographic routing

Having opening night for a play at the same time as a conference paper deadline is somewhat killer, but everything seems to have turned out well:

Geographic Gossip : Efficient Aggregation for Sensor Networks
A.G. Dimakis, A.D. Sarwate, and M.J. Wainwright

Gossip algorithms for aggregation have recently received significant attention for sensor network applications because of their simplicity and robustness in noisy and uncertain environments. However, gossip algorithms can waste significant energy by essentially passing around redundant information multiple times. Specifically, for realistic sensor network model topologies like grids and random geometric graphs, the inefficiency of gossip schemes is caused by slow mixing times of random walks on those graphs. We propose and analyze an alternative gossiping scheme that exploits geographic information. By utilizing a simple resampling method, we can demonstrate substantial gains over previously proposed gossip protocols. In particular, for random geometric graphs, our algorithm computes the true average to accuracy $1/n^a$ using $O(n^{1.5}\sqrt{\log n})$ radio transmissions, which reduces the energy consumption by a $\sqrt{\frac{n}{\log n}}$ factor over standard gossip algorithms.
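
For those who haven't seen gossip algorithms before, the baseline the abstract calls standard gossip works roughly like the Python sketch below (my own toy version; the geographic scheme in the paper is smarter about whom to exchange values with):

    import random

    def pairwise_gossip(values, edges, steps=20000):
        # Standard randomized pairwise gossip: repeatedly pick an edge at random
        # and replace both endpoint values by their average. Every node's value
        # converges to the average of the initial values; how many steps this
        # takes is governed by the mixing of a random walk on the graph, which
        # is exactly what is slow on grids and random geometric graphs.
        values = list(values)
        for _ in range(steps):
            i, j = random.choice(edges)
            values[i] = values[j] = (values[i] + values[j]) / 2
        return values

    # Toy example: a ring of 6 nodes; every entry should approach 3.5.
    ring = [(i, (i + 1) % 6) for i in range(6)]
    print(pairwise_gossip([1, 2, 3, 4, 5, 6], ring))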

Koetter-Vardy algorithm in MATLAB redux

Pursuant to my previous post, some raw (and possibly now not as functional) code for the Koetter-Vardy and Guruswami-Sudan algorithms is now posted off of my work page. It’s horrendously slow, but I’m sure others can make it better.

Update: I’ve moved and the code is gone, archived elsewhere. I don’t plan to post it again — it’s just a class project, so people should try to do it on their own for educational purposes.

amusing paragraph

From a paper I’ve been trying to understand:

It seems to us that the aforementioned capacity theorem is one of the most complex coding theorems ever proved. Its proof not only involves the techniques of AVC theory but also some of the most advanced techniques from multiuser theory… Fifteen years ago such a capacity theorem must have been out of reach, but now it serves almost only as a demonstration for the power of certain methods. It is even conceivable that soon a much simpler proof will be found. This shows that there is hope also for several of the harder problems in multiuser theory, which seem to resist all efforts for their solution. Some problems can be solved only at the right time; the time is right if the methods are mature.

Corollary : The author is a super-genius.

Guruswami-Sudan algorithm in MATLAB

Read the updates at the end! Comments are disabled because people don’t actually read the post. This is sad to me, since it shows that the majority of people commenting on this post are just trolling for a free pass at an implementation: they think that I or someone else will send it to them or port it to C for them, and they are too lazy to scroll through the post themselves.

For part of my class project in coding theory this semester I’ve implemented the Guruswami-Sudan decoding algorithm for Reed-Solomon codes. Reed-Solomon codes are used widely in practice for error-control coding. For example, CDs use a Reed-Solomon code to provide resilience to errors introduced by scratches or machine error. The math behind RS codes is somewhat complicated, but the idea is as follows. Suppose you have some k symbols of data that you want to encode into n symbols. The redundancy (and hence error correction capability) can be thought of as the difference (n – k). This is the “fat” we add into the data to make it resistant to errors.

RS codes take these k symbols and think of them as the coefficients of a polynomial

f(x) = f_0 + f_1 x + f_2 x^2 + … + f_{k-1} x^{k-1}

The encoder and decoder agree to fix n points {a_1, a_2, …, a_n}. The encoding map consists of evaluating f(x) at these n points and sending the evaluations:

codeword = (f(a_1), f(a_2), … f(a_n))
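
To make the encoding step concrete, here is a toy Python sketch (my own illustration; it works over the prime field GF(929) instead of GF(2^m) just to keep the arithmetic simple):

    def rs_encode(message, points, p=929):
        # p = 929 is prime, so integers mod p form a field (a stand-in for GF(2^m)).
        # message = (f_0, ..., f_{k-1}) are the coefficients of f;
        # the codeword is (f(a_1), ..., f(a_n)) at the agreed evaluation points.
        return [sum(c * pow(a, i, p) for i, c in enumerate(message)) % p
                for a in points]

    message = [3, 0, 1]                    # k = 3 data symbols, i.e. f(x) = 3 + x^2
    points = list(range(1, 8))             # n = 7 agreed points a_1, ..., a_7
    print(rs_encode(message, points))      # [4, 7, 12, 19, 28, 39, 52]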

The key is in decoding: if the decoder gets any k symbols correctly, it can interpolate the unique polynomial f of degree at most k-1 through them (this is just linear algebra, since k evaluations determine k coefficients). Thus in the CD example, if the CD player can’t read some of the symbols, it simply passes along the symbols it did read. As long as at least k of the symbols were received, the decoder can recover the polynomial and hence the data sequence.
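
The erasure-recovery step described above is just Lagrange interpolation in the field. Continuing the toy GF(929) example (and assuming, as in the CD scenario, that the decoder knows which positions it failed to read):

    def interp_eval(known, x, p=929):
        # Evaluate, at the point x, the unique polynomial of degree < len(known)
        # passing through the pairs (a_i, f(a_i)) in `known`, using the Lagrange
        # formula with arithmetic mod p.
        total = 0
        for ai, yi in known:
            num, den = 1, 1
            for aj, _ in known:
                if aj != ai:
                    num = num * (x - aj) % p
                    den = den * (ai - aj) % p
            total = (total + yi * num * pow(den, -1, p)) % p
        return total

    # Say the symbols at a = 3 and a = 6 were unreadable; any k = 3 good pairs suffice.
    known = [(1, 4), (2, 7), (4, 19)]
    print(interp_eval(known, 3), interp_eval(known, 6))   # 12 39, the erased symbols

(Here pow(den, -1, p) computes a modular inverse and needs Python 3.8 or later.)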

Now of course the math is a little more complicated than that, but not by too much. RS codewords actually come from polynomials in the ring F[x], where F is the finite field with 2^m elements, and the classical encoding evaluates a polynomial of degree at most k-1 at all the nonzero elements of the field.
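
The field arithmetic itself is not too bad to implement either. As a small aside (my own illustration, using the irreducible polynomial $x^8 + x^4 + x^3 + x + 1$ familiar from AES; a real RS implementation would typically precompute log/antilog tables instead), multiplication in GF(2^8) looks like:

    def gf256_mul(a, b, modulus=0x11B):
        # "Carry-less" multiplication of two GF(2^8) elements (integers 0..255),
        # reduced modulo the irreducible polynomial x^8 + x^4 + x^3 + x + 1 (0x11B).
        result = 0
        while b:
            if b & 1:
                result ^= a        # add (XOR) the current shift of a
            b >>= 1
            a <<= 1
            if a & 0x100:
                a ^= modulus       # reduce when a overflows degree 7
        return result

    print(hex(gf256_mul(0x57, 0x83)))   # 0xc1, the worked example from the AES spec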

The Guruswami-Sudan algorithm works in two steps: it interpolates a certain bivariate polynomial Q(x,y) in F[x,y], and then factors it to find all factors of the form (y - f_i(x)). The y-roots {f_i(x)} can be thought of as “candidate polynomials” for the data sequences. The algorithm runs in polynomial time and outputs a small list of candidates which is guaranteed to contain the transmitted codeword under certain conditions on how bad the errors are. The real key is that the list will often contain the true codeword even if more errors occurred than the code was designed for.
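
To quantify “how bad the errors are”: if I am remembering the bounds correctly, unique decoding of an (n, k) RS code handles at most

$e \le \lfloor (n-k)/2 \rfloor$

errors, while the Guruswami-Sudan list decoder (with large enough interpolation multiplicity) succeeds whenever roughly

$e < n - \sqrt{(k-1)\,n},$

which is strictly larger for every rate below 1 and dramatically larger at low rates.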

My implementation is based on a very nice tutorial paper by Robert McEliece, who is a really fantastic writer. It uses the Koetter algorithm for interpolation and the Roth-Ruckenstein algorithm to factor the resulting polynomial. This posting is mostly so that Google will possibly pop up this page if people are looking for MATLAB implementations of this algorithm and don’t want to waste sleepless nights like I did trying to force it to work. Caveat emptor, as they always say.

UPDATE : I’ve moved several times since this post was written and in the meantime taken down the code since implementing this is a common course project. It was tricky but not too hard, and I’ve decided the learning benefits of implementing it yourself outweigh the (now negligible) time savings for projects involving follow-on work.