ISIT 2009 : talks part three

This is the penultimate post on papers I saw at ISIT 2009 that I thought were interesting and on which I took some notes. I think I’m getting lazier and lazier with the note taking — I went to lots more talks than these, but I’m only writing about a random subset.

Existence and Construction of Capacity-Achieving Network Codes for Distributed Storage
Yunnan Wu

This paper looked at the distributed storage problem using the framework of network coding that Alex has worked on. The basic idea is that you have a bunch of drives redundantly storing some information (say n drives and you can query any k to reconstruct the data). You want to make it so that if any drive fails, it can be replaced by a new drive so that the new drive doesn’t have to download too much data to maintain the “k out of n” property. The problem can be translated into a network code on an infinite graph. This paper talked about how to achieve the optimal tradeoff between the bandwidth needed to repair the code and the storage efficiency/redundancy. The key thing is that even though the network graph is infinite, the network code can be constructed over a field whose size only depends on the number of active disks. Schwartz-Zippel reared its head again in the proof…
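The "any k of n" reconstruction property at the heart of the problem can be illustrated with a toy systematic Reed-Solomon-style MDS code. To be clear, this is just the classical baseline the repair problem builds on, not Wu's network-coding construction, and the prime P = 257 is an arbitrary choice for illustration:

```python
# Toy "any k of n" storage via a systematic MDS (Reed-Solomon-style) code
# over a prime field. NOT Wu's construction -- just the baseline property
# that any k of the n drives suffice to reconstruct the data.
P = 257  # small prime field; in the paper the field size depends only
         # on the number of active disks

def lagrange_eval(pts, x):
    """Evaluate the unique polynomial through pts at x, mod P."""
    total = 0
    for j, (xj, yj) in enumerate(pts):
        num = denom = 1
        for m, (xm, _) in enumerate(pts):
            if m != j:
                num = num * (x - xm) % P
                denom = denom * (xj - xm) % P
        total = (total + yj * num * pow(denom, P - 2, P)) % P
    return total

def encode(data, n):
    """Drive x stores f(x), where f interpolates the data at points 1..k."""
    pts = list(enumerate(data, start=1))
    return {x: lagrange_eval(pts, x) for x in range(1, n + 1)}

def decode(shares, k):
    """Recover the data from ANY k surviving drives."""
    pts = list(shares.items())[:k]
    return [lagrange_eval(pts, x) for x in range(1, k + 1)]

data = [42, 7, 99]                             # k = 3 data symbols
drives = encode(data, n=5)                     # spread across 5 drives
survivors = {x: drives[x] for x in (2, 4, 5)}  # any 3 of the 5 suffice
assert decode(survivors, k=3) == data
```

The repair question in the paper is what this toy version hides: replacing a failed drive naively requires downloading k full shares, and the network-coding formulation shows how to do better.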

Achievability Results for Statistical Learning under Communication Constraints
Maxim Raginsky

This talk was on how to learn a classifier when the data-label pairs (X_i, Y_i) must be compressed. Max talked about two different scenarios: one in which you can use R bits per pair, and another in which only the labels Y_i must be sent. He defined a rate-distortion-like function in which the generalization error of the classifier played the role of the distortion. In the first case it’s easier for the encoder to learn the classifier itself and send the classifier over — this takes 0 bits per sample, asymptotically, and incurs no extra generalization error. In the second case he uses a kind of Wyner-Ziv-like scheme which is a bit more involved. Compression is not a thing many machine learning folks think about, so I think there’s a lot of interesting stuff in this direction.

On Feedback in Network Source Coding
Mayank Bakshi, Michelle Effros

This paper was about how feedback can help in distributed source coding. In some lossless and lossy problems, feedback can strictly increase the rate region. Some of these problems took the form of “helper” scenarios where the decoder wants one of the sources but gets coded side information from another source. They show that this phenomenon holds more generally, so feedback can help in general in network source coding.

Feedback Communication over Individual Channels
Power Adaptive Feedback Communication over an Additive Individual Noise Sequence Channel

Yuval Lomnitz, Meir Feder

These talks were near and dear to my heart since they dealt with coding over channels which are basically unknown. The basic idea is that you know only X^n and Y^n, the inputs and outputs of the channel. The trick is then to define a corresponding “achievable rate.” They define an empirical mutual information and show that it is asymptotically achievable under some circumstances. The scheme is based on random coding plus maximum mutual information (MMI) decoding. When feedback is present they can do some rateless coding. There’s a full version on arXiv.
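The empirical mutual information in question is just the mutual information of the joint type of the two sequences. A minimal sketch (my own rendering, not the authors' notation):

```python
# Sketch of the empirical mutual information computed from the joint
# type of (x^n, y^n) -- the quantity whose achievability the talks
# establish. My own rendering, not the authors' notation.
from collections import Counter
from math import log2

def empirical_mi(xs, ys):
    """Mutual information of the empirical joint distribution of (xs, ys)."""
    n = len(xs)
    pxy = Counter(zip(xs, ys))          # joint type counts
    px, py = Counter(xs), Counter(ys)   # marginal type counts
    return sum((nxy / n) * log2(nxy * n / (px[x] * py[y]))
               for (x, y), nxy in pxy.items())

# For a noiseless channel the empirical MI is the input's empirical entropy:
x = [0, 1, 0, 1, 1, 0, 0, 1]
print(empirical_mi(x, x))  # 1.0 bit for this balanced binary sequence
```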

Upper Bounds to Error Probability with Feedback
Barış Nakiboğlu, Lizhong Zheng

Barış gave a nice talk on his approach to modifying the Gallager exponent bounds to the case when feedback is available. The encoder constantly adapts to force the decoder into a decision. The upper bound is based on a one-step/channel-use reduction to show an exponentially decaying error probability. One nice thing he mentioned is that at rates below capacity we can get a better error exponent by not using the capacity-achieving input distribution. Even though I read the paper and Barış explained it to me very patiently, I still don’t quite get what is going on here. That’s probably because I never really worked on error exponents…

ISIT : first set of talks

Outer Bounds for User Cooperation
Ravi Tandon, Sennur Ulukus

This paper looked at converse bounds for the MAC with generalized feedback. By applying the dependence balance bound, one can get an outer bound for the capacity region in terms of auxiliary random variables whose cardinality is difficult to bound. For a specific form of the channel, the “user cooperation channel,” where the feedback signals to the users are made from noisy versions of the other user’s signal, they find a way to evaluate the bound by looking at all input densities satisfying the constraints and then arguing that it is sufficient to consider Gaussian densities. The resulting bound matches the full-cooperation and no-cooperation bounds when taking the respective limits in the feedback noise, and is tighter than the cutset bound.

Optimal Quantization of Random Measurements in Compressed Sensing
John Sun, Vivek Goyal

The goal here was to design a quantizer for noisy measurements in compressed sensing, while keeping the decoder/reconstructor fixed. This was done in the same high-rate scalar quantization setting that I’d seen in other talks, with point-density functions to represent the reconstruction points. They draw a connection to the earlier work on functional scalar quantization and find optimal quantizers for the Lasso. They used a recent technique called “homotopy continuation,” which looks at the solution produced by the Lasso as a function of the regularization parameter, which I should probably read up on a bit more…
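Homotopy continuation exploits the fact that the Lasso solution is piecewise linear in the regularization parameter. In the special case of an orthonormal design the whole path even has a closed form (soft thresholding), which gives a feel for what "the solution as a function of the regularization parameter" looks like:

```python
# The lasso solution path is piecewise linear in the regularization
# parameter -- the structure homotopy-continuation methods track. In the
# orthonormal-design special case the path is explicit: each coefficient
# is a soft-thresholded least-squares coefficient.
def soft_threshold(z, lam):
    """Lasso solution for one coordinate under an orthonormal design."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

beta_ls = [3.0, -1.5, 0.4]           # least-squares coefficients
for lam in (0.0, 0.5, 1.0, 2.0):     # walk down the regularization path
    print(lam, [soft_threshold(b, lam) for b in beta_ls])
```

Note how coefficients drop out of the active set one at a time as the regularization grows; the homotopy algorithms track exactly these breakpoints.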

A Sparsity Detection Framework for On-Off Random Access Channels
Sundeep Rangan, Alyson Fletcher, Vivek Goyal

This looked at a MAC with n users where the decoder wants to detect which users transmitted. The users are each “on” with some fixed probability, and they show this can be put into a sparsity detection framework. There are two problems to overcome — the near-far effect of the large dynamic range of the received signals, and a “multiaccess interference” (MAI) phenomenon that plagues most detection algorithms, including the Lasso and orthogonal matching pursuit (OMP). The question is whether a practical algorithm can overcome these effects — they show that a modification of OMP can do so as long as the decoder knows some additional information about the received signals, namely the power profile: the ordering of the users’ powers, conditioned on their being active.
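For reference, vanilla OMP (the baseline the paper modifies) is only a few lines. This sketch uses an orthonormal sensing matrix so that noiseless recovery is guaranteed; the near-far/MAI regime the paper worries about is exactly where this plain version starts to fail:

```python
# Vanilla orthogonal matching pursuit (OMP). The sensing matrix here has
# orthonormal columns, so exact noiseless support recovery is guaranteed;
# this is a sketch of the baseline, not the paper's modified algorithm.
import numpy as np

def omp(A, y, k):
    """Greedily pick k columns of A to explain y; return the support."""
    r, support = y.copy(), []
    for _ in range(k):
        # choose the column most correlated with the current residual
        j = int(np.argmax(np.abs(A.T @ r)))
        support.append(j)
        # re-fit on the chosen columns and update the residual
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        r = y - A[:, support] @ coef
    return sorted(support)

rng = np.random.default_rng(0)
A, _ = np.linalg.qr(rng.standard_normal((40, 10)))  # orthonormal columns
x = np.zeros(10)
x[[2, 7]] = [3.0, -2.0]   # two "active users" out of ten
y = A @ x
print(omp(A, y, 2))       # recovers the active set [2, 7]
```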

Eigen-Beamforming with Delayed Feedback and Channel Prediction
T. R. Ramya, Srikrishna Bhashyam

This paper looked at the effect of delay in the feedback link in a system that is trying to do beamforming. The main effect of delay is that there is a mismatch between CSIT and CSIR. There is also the effect of channel estimation error. One solution is to do some channel prediction at the encoder, and they show that delayed feedback can affect the diversity order, while prediction can help improve the performance. Unfortunately, at high Doppler shift and high SNR, the prediction filter becomes too complex. I hadn’t really seen much work on the effect of longer delay in feedback, so this was an interesting set of ideas for me.

ISIT 2009 : plenaries and the Shannon Award

As I mentioned earlier, Rich Baraniuk’s plenary was quite energetic and entertaining. David Tse gave the next plenary, called It’s Easier to Approximate, mainly building on his recent string of work with Raul Etkin, Hua Wang, Suhas Diggavi, Salman Avestimehr, Guy Bresler, Changho Suh, and Mohammad Ali Maddah-Ali (and others too I imagine). He motivated the idea of using deterministic models for multiterminal Gaussian problems essentially by appealing to the idea of building approximation algorithms, although the connection to that community wasn’t made as explicitly (cf. Michelle Effros’s plenary). David is also a great speaker, so even though I had seen a lot of these results before from being at Berkeley, the talk helped really put it all together. Raymond Yeung gave another great talk on information inequalities, and for the first time I think I understood the point of “non-Shannon information inequalities” in terms of their connections to other disciplines. Hopefully I’ll get around to posting something about the automatic inequality provers out there. Unfortunately I missed all of Friday, so I didn’t get to see Noga Alon’s plenary, but I’m sure it was great too. Does anyone else care to comment?

The Shannon Lecture was given, of course, by Jorma Rissanen, and it was on a basic question in statistics : model selection. Rissanen developed the Minimum Description Length (MDL) principle, which I had always understood in a fuzzy sense to mean that you choose a model which is easy to describe information-theoretically, but I never had a good handle on what that meant besides taking some logs. The talk was peppered with good bits of philosophy. One which stood out to me was that our insistence that there be a “true model” for the data often leads to problems. That is, sometimes we’re better off not assuming that the data was generated according to a particular model, but to focus on finding the best model in our class that fits the data. I got to chat with Rissanen later about this and pointed out that it’s a bit like religion to assume an underlying true distribution. Another great tidbit was his claim that “nonparametric models are nonsense,” by which he meant that essentially every model is parametric — it’s just that sometimes the number of parameters is mind-bogglingly huge. The most interesting thing is that there were new results in the Shannon lecture, and Rissanen is working on a paper with those results now!

The big news was of course that Te Sun Han is going to be next year’s Shannon Awardee. I was very happy to hear this — I’ve been hoping he would win it for a while now, so I had a big grin on my face leaving the banquet…

ISIT 2009 : post the first

I’m at ISIT 2009 now at the COEX Center. Korea is pretty good so far (pictures to be posted later). The conference started today with a plenary by Rich Baraniuk, who talked about compressed sensing. I’d seen parts of the talk before, but it’s always nice to hear Rich speak because he’s so passionate about how cool the thing is.

I’ll try to blog about talks that I saw that were interesting as time permits — this time I’ll try to reduce the delay between talks and posts!

Futurama and information theory

According to the article The Truth About Bender’s Brain in the May 2009 issue of IEEE Spectrum, one of the writers of Futurama, Ken Keeler, “confirms that he does read every issue of IEEE Spectrum and occasionally looks at the Transactions on Information Theory.” So is it true that source coding people are more funny than channel coding people?

Infocom 2009 : capacity of wireless networks

A Theory of QoS for Wireless
I-Hong Hou (University of Illinois at Urbana-Champaign, US); Vivek Borkar (Tata Institute of Fundamental Research, IN); P. R. Kumar (University of Illinois at Urbana-Champaign, US)
This paper claimed to propose a tractable and relevant framework for analyzing quality of service in a network. In particular, the model should capture deadlines, delivery ratios, and channel reliability while providing guarantees for admission control and scheduling. The model presented was rather simple: there are N clients trying to access a single access point in slotted time. The clients generate packets periodically with period \tau and each packet has a deadline of \tau. The channel quality is captured by a client-dependent success probability for the uplink; the downlink always succeeds. For the uplink, the access point selects a transmitter via polling and the client sends the packet until it is received. The questions are then whether a given set of clients and rates is feasible and what policy can achieve these rates. The paper presents an outer bound and two policies based on the notion of workload w_n = (1/p_n) (q_n/\tau), where q_n is the delivery ratio of user n. Policies based on largest “debt” first are shown to be optimal, where “debt” can be the difference between w_n and the fraction of time given to user n, or a weighted delivery ratio.
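A largest-debt-first policy is easy to simulate. This sketch follows my reading of the model (one packet per client per period of \tau slots, one polled transmission per slot, success probability p_n on the uplink); the debt bookkeeping here, deliveries owed minus deliveries made, and the parameters are illustrative assumptions rather than the paper's exact definitions:

```python
# Simulation sketch of a largest-debt-first policy, under my reading of
# the model: each period of tau slots every client has one packet with a
# deadline of tau; the AP polls one client per slot; client n's
# transmission succeeds with probability p[n]. "Debt" here is deliveries
# owed (q[n] per period) minus deliveries made -- an illustrative variant.
import random
random.seed(1)

def simulate(p, q, tau, periods):
    N = len(p)
    debt = [0.0] * N
    delivered = [0] * N
    for _ in range(periods):
        done = [False] * N
        for n in range(N):
            debt[n] += q[n]               # each period owes q[n] deliveries
        for _ in range(tau):              # tau slots per period
            pending = [n for n in range(N) if not done[n]]
            if not pending:
                break
            n = max(pending, key=lambda m: debt[m])  # largest debt first
            if random.random() < p[n]:    # uplink succeeds w.p. p[n]
                done[n] = True
                debt[n] -= 1
                delivered[n] += 1
        # undelivered packets simply expire at the deadline
    return [d / periods for d in delivered]

ratios = simulate(p=[0.9, 0.6], q=[0.8, 0.5], tau=4, periods=20000)
print(ratios)  # comfortably meets the required ratios [0.8, 0.5] here
```

The workload check for admissibility is the interesting part in the paper; the parameters above are chosen well inside the feasible region (0.8/0.9 + 0.5/0.6 is well under \tau = 4).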

Computing the Capacity Region of a Wireless Network
Ramakrishna Gummadi (University of Illinois at Urbana-Champaign, US); Kyomin Jung (Massachusetts Institute of Technology, US); Devavrat Shah (Massachusetts Institute of Technology, US); Ramavarapu Sreenivas (University of Illinois at Urbana-Champaign, US)
This looked at wireless networks under the protocol model of Gupta and Kumar and asks the following question : if I give you a vector of rates \mathbf{r} is there a way to tell if \mathbf{r} is in the capacity region of the network? Previous work had looked at polynomial-time algorithms for scheduling, but apparently in the wireless setting the scheduling problem is NP-hard. The question they look at instead is testing whether there exists a routing scheme which leads to feasible link rates. This can be reduced to approximating “max-weight independent set” (MWIS), which I am not so familiar with. I think it relates to the backpressure algorithm. In any case, they find a polynomial-time approximation assuming the communication graph has polynomial growth, which holds for many nice graphs (like random geometric graphs with standard connectivity assumptions).

The Multicast Capacity Region of Large Wireless Networks
Urs Niesen (Massachusetts Institute of Technology, US); Piyush Gupta (Bell Labs, Alcatel-Lucent, US); Devavrat Shah (Massachusetts Institute of Technology, US)
The network model is n users placed uniformly in the square [0, \sqrt{n}]^2. This looks at the capacity region assuming arbitrary multicast demands between users. So each node in the network has a set of receivers to which it would like to multicast its message. What traffic demands are admissible? The framework is one of scaling laws as the number of nodes goes to infinity, under the protocol model and enforcing multi-hop routing. The main result is that the capacity region is equal to the cutset region within an n^{o(1)} multiplicative factor (on both the upper and lower bounds), and furthermore the cutset region can be evaluated with fewer constraints. The achievable scheme involves building a “multiresolution tree” of sorts and then showing that routing in the original problem is the same as routing in the tree. The tree structure gives the constraint reduction for the cutset bound. To actually route the traffic, they develop a “message concentration” scheme that is reminiscent of other hierarchical cooperation schemes but I guess is a bit different.

Infocom 2009 : delay issues

Effective Delay Control for Online Network Coding
Joao Barros (University of Porto, PT); Rui Costa (Universidade do Porto / Instituto de Telecomunicações, PT); Daniele Munaretto (DoCoMo Euro-Labs, DE); Joerg Widmer (DoCoMo Euro-Labs, DE)
This talk tries to merge ARQ with network coding. The key idea is that a packet is “seen” if the receiver can compute it up to an XOR of later packets. The delay analysis is then based on analyzing the chains of dependent packets induced by erasures. This paper looks at the problem of multicast. Here the issue is managing the delays to multiple receivers and assessing which receiver is the “leader” and how the leader switches over time. For random erasures (different for each receiver) this means we are doing a biased random walk whose location shows who the leader is. A coding strategy is trying to control this random walk, and a strategy is proposed to maintain a “leader” by sending the XOR of the last unseen packet of all users when the leader loses a packet.

The Capacity Allocation Paradox
Asaf Baron (Technion – Israel Institute of Technology, IL); Isaac Keslassy (Technion, IL); Ran Ginosar (Technion, IL)
This talk was about a simple example of a network in which adding capacity can make a stable network unstable. To me it seemed to be because of the particular model adopted for the network, namely that if a link of capacity C is available, then the transmitter will operate at rate C. The simple example of the paradox is a 2-user multiaccess link. Suppose we have arrival processes of rate 1 at each of two users, which have outgoing links of capacity 1 to a shared queue. This queue has output capacity 2, so the whole network is stable. However, if one user gets a capacity-2 link to the queue, then its traffic can hog the output link and cause increasing delay to the second user. The paradox was motivated by networks on a chip, which seem to be an interesting source of new problems with different emphases than traditional networking problems.

Power-Aware Speed Scaling In Processor Sharing Systems
Adam Wierman (California Institute of Technology, US); Lachlan Andrew (Swinburne University of Technology, AU); Kevin Tang (Cornell University, US)
This talk was about assigning speeds to processing jobs in a queue — if you do a job faster it takes more power but reduces delay. There are different ways of assigning speeds: a fixed speed for all jobs (static), a square wave for speed vs. time (gated static), or some arbitrary curve (dynamic). The metric they choose to look at is a sort of regularized energy E[\mathrm{energy}] + \beta E[ \mathrm{delay} ], where \mathrm{energy} = (\mathrm{speed})^{\alpha}. For a large number of jobs they get a kind of limiting form for the optimal speed and show that a gated static policy performs within a factor of 2 of the optimal dynamic policy, which is verified by simulation. In general we may not know the arrival process, so choosing the duty cycle for a gated static policy may be hard a priori. In this case a dynamic strategy may be much better at handling model variability. This was one of those problems I had never thought about before and I thought the results were pretty cute.
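To get a feel for the static policy, here is a back-of-the-envelope sketch under modeling assumptions of my own, not the paper's exact setup: an M/M/1 queue with arrival rate lam and unit-mean jobs run at fixed speed s, so E[delay] = 1/(s - lam) and power = s^\alpha, and we pick the fixed speed minimizing the combined cost:

```python
# Back-of-the-envelope sketch of the static speed-scaling policy, under
# assumptions of mine (M/M/1 delay formula, unit-mean jobs): minimize
# cost(s) = s**alpha + beta / (s - lam) over fixed speeds s > lam.
def static_cost(s, lam, alpha, beta):
    return s ** alpha + beta / (s - lam)

def best_static_speed(lam, alpha, beta, grid=10000):
    # grid search over speeds just above lam (the stability boundary)
    speeds = [lam + (i + 1) * (10.0 / grid) for i in range(grid)]
    return min(speeds, key=lambda s: static_cost(s, lam, alpha, beta))

s_opt = best_static_speed(lam=1.0, alpha=2.0, beta=1.0)
print(s_opt, static_cost(s_opt, 1.0, 2.0, 1.0))
```

The tension the talk is about is visible here: running slower saves power (the s^\alpha term) but drives the delay term up as s approaches the arrival rate.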