# ITA 2008 : Part the First

I’ve spent this week at ITA 2008 at UCSD. There have been lots of great talks, but my brain is fairly saturated, so my ability to take more than cursory notes decreased as the workshop progressed. I’ve met a number of people who apparently read this blog (hi!), which is nice. One thing I’ll try to do is put in some more links to things as I think of them and also to raise some (possibly trivial) questions that came up as I wrote about the papers.

Iterative coding for network coding
Andrea Montanari, Stanford, Vishwambar Rathi, EPFL, and Rudiger Urbanke, EPFL

I think of this paper as the Shannon-theory version of the Koetter-Kschischang coding-theory analysis of network coding. There is a preprint of some of the results on arXiv that I read a little while ago (and unsuccessfully presented in our group meeting). The channel model is a rank metric channel (see Gabidulin’s work), and a notion of distance between subspaces can be defined. The idea is the following: messages are encoded into subspaces of a vector space over a finite field (taken to be F2 here). The set of all subspaces is the projective space, and the set of all subspaces of dimension k is the Grassmann manifold. So we can look at a codebook as a set of binary matrices of dimension m x l. In this model, headers are preserved, so the decoder in a network code knows the transfer matrix of the channel exactly. However, errors can still be introduced in the packet payload in the following sense. Each column can be written as

y_j = x_j + z_j

where z_j is uniformly chosen from an unknown subspace of limited rank. This is a perfectly good DMC model, and one can calculate its capacity. The interesting thing is that you can then define an LDPC code (with all variable nodes of degree 2 and a non-soliton distribution for the check nodes) across the columns of the codeword (which is a matrix, remember), with matrix-valued weights on the edges of the Tanner graph. The messages propagated in the BP algorithm correspond to information about affine subspace memberships of the rows. This code achieves capacity on the rank metric channel, and the error probability for erasures also goes to 0. Since I have been concerned about jamming, interference, and malicious attacks, the assumption that the header is preserved feels like a limitation of the approach, but I think that perhaps concatenating codes may be able to preserve something about the header information. The reason I call it sort of Shannon-theoretic is that the error model is random and the density evolution arguments are a bit like the probabilistic method.
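The distance between subspaces that this line of work uses is d(U, V) = dim U + dim V - 2 dim(U ∩ V), which boils down to rank computations over F2. Here is a minimal Python sketch, where each basis vector is represented as an integer bitmask of its coordinates (my own toy encoding, not the paper’s):

```python
def rank_f2(vectors):
    """Rank over F2 of a set of row vectors given as integer bitmasks,
    via Gaussian elimination keyed on the highest set bit."""
    pivots = {}  # highest set bit -> basis vector with that pivot
    for v in vectors:
        while v:
            h = v.bit_length() - 1
            if h not in pivots:
                pivots[h] = v
                break
            v ^= pivots[h]  # eliminate the leading bit and continue
    return len(pivots)

def subspace_distance(U, V):
    """d(U, V) = dim U + dim V - 2 dim(U ∩ V), computed using
    dim(U ∩ V) = dim U + dim V - dim(U + V); U and V are lists of
    generating vectors (integer bitmasks)."""
    d_sum = rank_f2(list(U) + list(V))  # dim(U + V)
    return 2 * d_sum - rank_f2(U) - rank_f2(V)
```

For example, the spans of {e1, e2} and {e1, e3} in F2^3 intersect in a line, so their distance is 2.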

On rate-distortion function of Poisson processes with a queueing distortion measure
Todd P. Coleman, UIUC, and Negar Kiyavash, UIUC, and Vijay G. Subramanian, Hamilton Institute, NUIM

The main idea here is to find a rate-distortion problem corresponding to the bits-through-queues result on timing channels. So given a set of Poisson arrivals {X_i}, how can we define a distortion measure and rate-distortion function that make sense? The choice they make is to require that the reconstruction be causal, so that the packet times in the reconstructed version are delayed w.r.t. the originals, and the objective is then to minimize the induced service time of the reconstruction. This makes the test channel into a first-come first-served queue, and in particular the optimal test channel is the exponential timing channel. The rate-distortion function turns out to be -a log(D), where a is the intensity of the Poisson process. The benefit of their definition is that the proofs look as clean as the AWGN and BSC rate-distortion proofs.
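Plugging in numbers makes the form of the result concrete. A tiny sketch of the quoted rate-distortion function (the normalization of D is my assumption; the precise statement is in the paper):

```python
import math

def poisson_rd(intensity, distortion):
    """Evaluate R(D) = -a * log(D) for a Poisson process of intensity a
    under the queueing distortion measure, as quoted in the talk.
    Assumes D is normalized so that 0 < D <= 1, where R(D) >= 0
    (my assumption about the valid range)."""
    if not 0 < distortion <= 1:
        raise ValueError("expected 0 < D <= 1 under this normalization")
    return -intensity * math.log(distortion)
```

Note the same logarithmic shape as the Gaussian quadratic rate-distortion function: halving the allowed distortion costs a fixed a log 2 nats, regardless of where you are on the curve.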

Network tomography via network coding
G. Sharma, IIT Bombay, S. Jaggi, CUHK, and B. K. Dey, IIT Bombay

Network tomography means using input-output probing of a network to infer internal structural properties. In their model, all nodes have unique IDs and share some common randomness. Inferring the presence or absence of a single edge in the graph is relatively easy, and then the process can be repeated for every edge. The more interesting problem is structural inference in an adversarial setting. They show that the graph can be recovered if the min-cut from the source to the destination is greater than twice the min-cut from the adversary to the destination (I think), which squares with the Byzantine Generals problem. The tomography problem is interesting, and I wonder if one can also embed adversary-detection methods to create a protocol that quarantines misbehaving internal nodes.
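By max-flow min-cut duality, the cut condition above can be checked mechanically with any max-flow routine. A sketch in Python; the graph encoding and function names are mine, and the exact condition should be checked against the paper:

```python
from collections import deque, defaultdict

def max_flow(cap, s, t):
    """Edmonds-Karp max-flow (equal to the s-t min-cut). `cap` maps
    u -> {v: capacity} for a directed graph."""
    flow = defaultdict(lambda: defaultdict(int))
    total = 0
    while True:
        # BFS for a shortest augmenting path in the residual graph
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in set(cap.get(u, {})) | set(flow[u]):
                if v not in parent and cap.get(u, {}).get(v, 0) - flow[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return total
        # trace the path back and augment by its bottleneck residual
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(cap.get(u, {}).get(v, 0) - flow[u][v] for u, v in path)
        for u, v in path:
            flow[u][v] += bottleneck
            flow[v][u] -= bottleneck
        total += bottleneck

def recovery_condition(cap, source, adversary, dest):
    """The condition as I understood it from the talk:
    mincut(source, dest) > 2 * mincut(adversary, dest)."""
    return max_flow(cap, source, dest) > 2 * max_flow(cap, adversary, dest)
```

Edge capacities here stand in for link multiplicities; the hard part of the paper is of course the coding scheme, not the cut computation.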

Coding theory in projective spaces
Tuvi Etzion, Technion, and Alexander Vardy, UCSD

This talk also dealt with the rank-metric channel. Previous work of Koetter and Kschischang had found analogues to the Singleton and Johnson bounds, and here they find Delsarte, iterated Johnson, and packing bounds, as well as a “linear programming bound.” They show there are no nontrivial perfect codes in the projective space of dimension n over Fq, and define the notion of a quasilinear code, which is an abelian group in the projective space. A code is quasilinear iff its size is a power of two. Honestly, I found myself a bit lost, since I’m not really a coding theorist, but it seems that the rank metric channel has lots of interesting properties, some of which are old and some of which are new, and so exciting new algebraic codes can be constructed for it.
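For a sense of the spaces these codes live in: the number of k-dimensional subspaces of F_q^n (the points of the Grassmannian) is the Gaussian, or q-binomial, coefficient. A quick Python sketch for counting them:

```python
def gaussian_binomial(n, k, q=2):
    """Number of k-dimensional subspaces of F_q^n: the q-binomial
    coefficient [n choose k]_q = prod_{i=0}^{k-1} (q^(n-i)-1)/(q^(i+1)-1)."""
    if k < 0 or k > n:
        return 0
    num = den = 1
    for i in range(k):
        num *= q ** (n - i) - 1
        den *= q ** (i + 1) - 1
    return num // den  # the quotient is always an integer

def projective_space_size(n, q=2):
    """Total number of subspaces of F_q^n, of all dimensions; this is
    the ambient space in which these codes are packed."""
    return sum(gaussian_binomial(n, k, q) for k in range(n + 1))
```

For example, F_2^4 has 35 two-dimensional subspaces, already many more than the 6 two-element subsets-of-coordinates intuition might suggest; these counts are what the packing and LP bounds are fighting against.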

On capacity scaling in arbitrary wireless networks
Urs Niesen, MIT, Piyush Gupta, Bell Labs, and Devavrat Shah, MIT

The goal of this paper is to relax some of the random-placement assumptions made in a lot of cooperative schemes for large wireless networks. They look at n nodes scattered in [0, n^{1/2}]^2 with a standard path-loss model for the gains, full CSI, and equal rate demands for each source-destination pair. There are three schemes to consider: multihop, hierarchical cooperation, and a new scheme they propose here, cooperative multihop. For large path-loss exponents the multihop strategy is optimal, and in other regimes hierarchical cooperation is optimal. The cooperative multihop scheme works by dividing the network into squarelets and then doing multihop on the squarelet-level overlay graph. This strategy is optimal in some cases and has the benefit of not relying on the node placement model, unlike the hierarchical cooperation scheme. These network models are interesting, but the full CSI assumption seems pretty darn restrictive. I wonder if cooperative multihop would be more attractive also because channel estimation and ranging need only be done at a local level.
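The squarelet overlay is easy to prototype. A toy sketch (the names and the fixed cell size are mine; the paper chooses the squarelet side length carefully as a function of n and the path-loss exponent):

```python
def squarelet_overlay(positions, side, cell):
    """Partition [0, side]^2 into cell-by-cell squarelets, bucket nodes
    by squarelet, and build the 4-neighbor adjacency between occupied
    squarelets; cell-level multihop then runs on this overlay graph."""
    last = int(side / cell) - 1  # clamp boundary points into the last cell
    cells = {}
    for i, (x, y) in enumerate(positions):
        key = (min(int(x / cell), last), min(int(y / cell), last))
        cells.setdefault(key, []).append(i)
    adj = {}
    for (cx, cy) in cells:
        adj[(cx, cy)] = [c for c in ((cx + 1, cy), (cx - 1, cy),
                                     (cx, cy + 1), (cx, cy - 1))
                         if c in cells]
    return cells, adj
```

The point of the construction is that routing decisions only depend on which squarelets are occupied, not on exact node positions, which is why the scheme survives arbitrary placement.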

Percolation processes and wireless network resilience
Zhenning Kong and Edmund M. Yeh, Yale

Consider a random geometric graph with density chosen above the percolation threshold. This is an often (ab?)used model for wireless sensor networks. First off, throwing n points in the unit square converges in distribution to the percolation model with an infinite Poisson process in the plane and the Gilbert disc model. If we make nodes “fail” by flipping a coin for each node, this just thins the process, so it’s easy to find a threshold on the failure probability, in terms of the percolation threshold, that guarantees a giant connected component will still exist. So far so good, but what happens if the failure probability depends on the degree of the node? High-degree nodes may serve more traffic and hence fail sooner, so this is a more realistic model. This work finds necessary conditions and sufficient conditions on the failure probability distribution p(k), where k is the node degree, for a giant connected component to still exist. The necessary and sufficient conditions don’t match, but they are interesting; in particular, the sufficient condition is very clean. One application of these results is to a cascading failures model, where you want to stop one node going down from taking down the whole network (cf. the power failures in August 1996). It would be interesting to see if a real load redistribution could be integrated into this: nodes have weights (load) on them, proportional to degree, say, and then dynamically nodes are killed and push their load onto their ex-neighbors. How fast does the whole network collapse? Or perhaps a sleeping-waking model where a node dumps its traffic on neighbors while sleeping and then picks up the slack when it wakes up again…
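The degree-dependent failure model is straightforward to simulate. A small Monte Carlo sketch (the parameters and the naive O(n^2) neighbor search are mine; this only illustrates the model, not the paper’s thresholds):

```python
import random

def largest_component_fraction(n, area_side, radius, p_fail, seed=0):
    """Drop n uniform points in a square, connect pairs within `radius`
    (Gilbert disc model), remove each node independently with
    degree-dependent probability p_fail(k), and return the size of the
    largest surviving component as a fraction of n."""
    rng = random.Random(seed)
    pts = [(rng.uniform(0, area_side), rng.uniform(0, area_side))
           for _ in range(n)]
    nbrs = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = pts[i][0] - pts[j][0], pts[i][1] - pts[j][1]
            if dx * dx + dy * dy <= radius * radius:
                nbrs[i].append(j)
                nbrs[j].append(i)
    alive = [rng.random() >= p_fail(len(nbrs[i])) for i in range(n)]
    # union-find over the surviving nodes
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    for i in range(n):
        if alive[i]:
            for j in nbrs[i]:
                if alive[j]:
                    parent[find(i)] = find(j)
    sizes = {}
    for i in range(n):
        if alive[i]:
            r = find(i)
            sizes[r] = sizes.get(r, 0) + 1
    return max(sizes.values(), default=0) / n
```

Sweeping p_fail over families like p(k) = min(1, c*k/k_max) and watching the giant-component fraction is a cheap way to get intuition for where the paper’s sufficient condition bites.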

# The FCC and buttocks

I get the FCC’s daily email updates, which mostly contain links to boring procedural documents, but occasionally are a source of amusement. They decided to fine [pdf] ABC for the Feb. 25, 2003 episode of NYPD Blue in which a woman’s buttocks are shown. On pages 4-5, they state:

As an initial matter, we find that the programming at issue is within the scope of our indecency definition because it depicts sexual organs and excretory organs – specifically an adult woman’s buttocks. Although ABC argues, without citing any authority, that the buttocks are not a sexual organ, we reject this argument, which runs counter to both case law and common sense.

I remember that episode causing a bit of an uproar at the time, but it took almost 5 years for this decision to come out, so I suppose they have been thinking really hard about butts over at the FCC. Perhaps they are getting a bit obsessed.

UPDATE : Language Log has more.

# how JPEG works?

In this month’s Notices of the AMS, there was an article on how JPEG works [pdf]. These little articles are supposed to be about neat topics in mathematics or applications, and they are often uneven and sometimes contentious, as the Koblitz article [pdf] has recently demonstrated.

Even though I do signal processing research, I hadn’t looked under the hood of JPEG too carefully in any of my classes, so it was interesting to read about the process of translating RGB to luminance and two chrominances, the DCT coefficient quantization scheme, and so on. The bit at the end on wavelets in JPEG 2000 I already knew about. But I think that for a mathematician the “why” was lost in all of the details of Huffman coding, Cohen-Daubechies wavelets, and so on. All of the engineering choices were made (in terribly contentious standards meetings, I’m sure) for a reason, and the choice of the mathematics is really a choice of model of “natural” images. He gives a good explanation of luminance-chrominance vs. RGB in terms of the visual system, but what about explaining the Fourier transform in terms of capturing edge detail in high-frequency coefficients?
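For the curious, the 8x8 DCT and quantization steps the article describes can be sketched directly from their definitions. This is a naive O(N^4) implementation with a flat quantization table; real JPEG encoders use the standard tables from the spec and fast factored DCTs:

```python
import math

def dct2_8x8(block):
    """2-D DCT-II of an 8x8 block (the transform at the heart of JPEG),
    computed directly from the definition; `block` is an 8x8 list of
    pixel values, typically level-shifted by -128 beforehand."""
    N = 8
    def alpha(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = sum(block[x][y]
                    * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                    * math.cos((2 * y + 1) * v * math.pi / (2 * N))
                    for x in range(N) for y in range(N))
            out[u][v] = alpha(u) * alpha(v) * s
    return out

def quantize(coeffs, qtable):
    """Divide each DCT coefficient by its quantization-table entry and
    round; this lossy step is where JPEG discards (mostly
    high-frequency) detail."""
    return [[round(c / q) for c, q in zip(crow, qrow)]
            for crow, qrow in zip(coeffs, qtable)]
```

A constant block produces a single nonzero DC coefficient, which is the cleanest way to see why smooth image regions compress so well under this scheme.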

Unfortunately for me, the article ended up confirming my pessimistic view that standards look like a sequence of ad hoc decisions. But there is lots of theory behind those implementations, and I think presenting it might appeal to more mathematicians.

# anonymous commenting

At the risk of excess blog spam, I’ve enabled anonymous commenting for those who would prefer to comment without revealing themselves.

# From graduate admissions to the future

I realize this is a bit rambling.

One thing that I’ve done for the last few years is act as an associate reviewer for graduate applications for our department. Basically this means I can chip in my 2 cents on applications during the initial review. I’ve already written about my shock at the kinds of letters people write (although I was certainly more naive then). One thing that I’ve come to realize over the years is that it is possible to write a full, good letter for someone whom you don’t know too well that both indicates the basis on which you can evaluate them (e.g. took your class and got an A, did research with your grad student, or did research with you) and is not ambiguous due to “conspicuous omission.” Writing a recommendation letter is an art: if you could just talk to the committee it would be a lot easier, but with N-hundred applicants of whom you admit at most 4N, the written word is most of what they have to go on.

I often wonder about the overall statistics of people who apply to graduate school, especially domestic students. At a state school like Berkeley there is a policy to encourage the admission of US citizens and permanent residents (a good thing, in my opinion). But how many domestic students are interested in graduate programs? Are there many potentially strong candidates who don’t even consider graduate school? My gut feeling is that yes, there are many graduates who would benefit from and enjoy some exposure to engineering research at the graduate level but never even consider it. But that’s just a gut sense, which may be off.

The whole process raises a lot of big questions, ranging from “what is the purpose of the graduate program?” to “what can we do at the undergraduate level to bolster interest in postgraduate research?” to “where should the future of academic research in engineering go?” None of these has an answer, and I think that succinct answers are impossible unless you subscribe to some inflexible dogma. But they’re good to keep in the back of my head, I think, especially since I am (hopefully) looking towards a future in the academy.

# Well, that’s done then.

Mystery Hunt 2008 is over.

This year I had to hunt from Berkeley due to thesis pressures, keeping in touch via AIM and Skype. Staying awake for a long time to work is much harder when nobody else is around to help keep the energy up. But sleeping 3 hours in your own bed is about four times as rejuvenating as sleeping 3 hours on a couch or floor.

Just one observation for now : infinitely ascending chains of cryptic crosswords with interchangeable clues are awesome. But try not to pay attention to them when they start talking to you.