# ITA 2013 : post the second

Again a caveat — these are the talks in which I took reasonable enough notes to write anything coherent.

Green Communication: From Maxwell’s Demon to “Informational Friction”
Pulkit Grover
Pulkit talked about trying to tie a physical interpretation to the energy used in communication during computation. Physicists might argue that reversible computation costs nothing, but this ignores friction and noise. Pulkit discussed a simple network model to account for “informational friction” that penalizes the bit-distance product in communicating on a chip. See also Pulkit’s short video on the topic.

Hajar Mahdavi-Doost, Roy Yates
Roy talked about a model in which receivers have to harvest the energy they need for sampling/buffering/decoding the transmissions. These three tasks cost different amounts, and in particular, the rate at which the receiver samples the output dictates the other parameters. The goal is to choose a rate which helps meet the decoder energy requirements. Because the receiver has to harvest the energy it needs, it has to design a policy to switch between the three operations while harvesting the (time-varying) energy available to it.

Multiple Access and Two-way Channels with Energy Harvesting and Bidirectional Energy Cooperation
Kaya Tutuncuoglu, Aylin Yener
Unlike the previous talk, this was about encoders which have to transmit energy to the receivers. There’s a tradeoff between transmitting data and energy, and in the MAC and TWC there is yet another dimension in how the two users can cooperate. For example, they can cooperate in energy transmission but not in data transmission. There were a lot of results in here, but there was also a discussion of policies for the users. In particular a “procrastination” strategy turns out to work well (rejoice!).

An equivalence between network coding and index coding
Michelle Effros, Salim El Rouayheb, Michael Langberg
The title says it all! For every network coding problem (multiple unicast, multicast, whatever), there exists a corresponding index coding problem (constructed via a reduction) such that a solution to the latter can be easily translated to a solution for the former. This equivalence holds for all network coding problems, not just linear ones.

Crowd-sourcing epidemic detection
Constantine Caramanis, Chris Milling, Shie Mannor, Sanjay Shakkottai
Suppose we have a graph and we can see some nodes are infected. This paper was on trying to distinguish between whether the infected nodes started from a single point infection spread via an SI model, or just from a random pattern of infection. They provide two algorithms for doing this and then address how to deal with false positives using ideas from robust statistics.

# The things we know we don’t know

As a theoretical engineer, I find myself getting lulled into the trap of what I am now starting to call “lazy generalization.” It’s a form of bland motivation that you often find at the beginning of papers:

Sensor networks are large distributed collections of low-power nodes with wireless radios and limited battery power.

Really? All sensor networks are like this? I think not. Lots of sensor networks are wired (think of the power grid) but still communicate wirelessly. Others communicate through wires. This is the kind of ontological statement that metastasizes into the research equivalent of a meme — 3 years after Smart Dust appears, suddenly all papers are about dust-like networks, ignoring the vast range of other interesting problems that arise in other kinds of “sensor networks.”

Another good example is “it is well known that most [REAL WORLD THING] follows a power law,” which bugs Cosma to no end. We then get lots of papers which start with something about power laws and then proceed to analyze some algorithms which work well on graphs which have power law degree distributions. And then later we get statements like “all natural graphs follow power laws, so here’s a theory for those graphs, which tells us all about nature.”

Yet another example of this is sparsity. Sparsity is interesting! It lets you do a lot of cool stuff, like compressed sensing. And it’s true that some real world signals are approximately sparse in some basis. However, turn the crank and we get papers which make crazy statements approximately equal to “all interesting signals are sparse.” This is trivially true if you take the signal itself as a basis element, but in the way it’s meant (e.g. “in some standard basis”), it is patently false.

So why are these lazy generalizations? It’s a kind of fallacy which goes something like:

1. Topic A is really useful.
2. By assuming some Structure B about Topic A, we can do lots of cool/fun math.
3. All useful problems have Structure B.

Pattern matching, we get A = [sensor networks, the web, signal acquisition], and B = [low power/wireless, power laws, sparsity].

This post may sound like I’m griping about these topics being “hot” — I’m not. Of course, when a topic gets hot, you get a lot of (probably incremental) papers all over the place. That’s the nature of “progress.” What I’m talking about is the third point. When we go back to our alabaster spire of theory on top of the ivory tower, we should not fall into the same trap of saying that “by characterizing the limits of Structure B I have fundamentally characterized Topic A.” Maybe that’s good marketing, but it’s not very good science, I think. Like I said, it’s a trap that I’m sure I’m guilty of stepping into on occasion, but it seems to be creeping into a number of things I’ve been reading lately.

# The Data Map : a map of information flows

One of the things Latanya Sweeney mentioned during her talk at the iDash workshop is a new project called theDataMap, which is trying to visualize how personal information about individuals flows through institutions. One of the starting points is an older map which shows how a putative hospital patient Alice’s personal information is accessed and used by a number of entities of whom she is likely unaware, including pharma companies, her employer, and medical researchers.

This is analogous to a map Lee Tien sent me, also from a report a few years ago, on how private medical information flows look in California.

It’s worth looking at and thinking a bit about how we balance privacy and utility/profit at the moment, and whether generally erring on the side of sharing is the best way to go.

I’ve been refraining from talking about the Dharun Ravi case, because it’s pretty complicated. On the one hand, after reading the New Yorker article and other material, it’s pretty clear Dharun is a grade-A jerk. And Tyler Clementi’s death was a terrible tragedy. But on the other hand, 10 years in prison is a serious thing, as Ta-Nehisi Coates points out. Ashvin shared a link to a blog post on “Deporting Homophobia”:

I have been Tyler and Dharun in a post 9/11 U.S. that accuses white men of exploiting the rest of the world and accuses brown men of destroying it. I have been Tyler and Dharun in a post 9/11 world where white men advocate for homosexual rights and advance homophobia and where brown men are understood as always homophobic. I am being presumptuous, so let me stop.

It’s an interesting take on things, and has made me think about the media coverage of the event and if and how Dharun’s race has played into how the story has been told.

Via Kamalika I learned about a lawsuit against IMDB.

A gem from SMBC via Cosma. The Beef Tensors are a nice touch.

Sepia Mutiny is shutting down, and Amardeep has some closing thoughts.

We always get to hear these stories about how service providers need differential pricing for network traffic because they can’t make money, but then stories like this make me question the integrity of the complainers.

I heard Of Monsters and Men on KEXP and their show is sold out in Chicago, boo. Here’s their crazy video though:

# ITA Workshop 2012 : More Talks

There are some more talks to blog about, probably, but I am getting lazy, and one of them I wanted to mention was Max’s, but he already blogged a lot of it. I still don’t get what the “Herbst argument” is, though.

Vinod Prabhakaran gave a talk about indirect decoding in the 3-receiver broadcast channel. In indirect decoding, there is a “semi-private” message that is not explicitly decoded by the third receiver. However, Vinod argued that this receiver can decode it anyway, so the indirectness is not needed, somehow. At least, that’s how I understood the talk.

Lalitha Sankar talked about two different privacy problems that could arise in “smart grid” or power monitoring situations. The first is a model of system operators (ISOs) and how to view the sharing of load information: there was a model of $K$ different “sources” or states being observed through a channel which looked like an AWGN faded interference channel, where the fading represents the relative influence of the source (or load on the network) on the receiver (or ISO). She didn’t quite have time to go into the second model, which was more at the level of individual homes, where short-time-scale monitoring of loading can reveal pretty much all the details of what’s going on in a house. The talk was a summary of some recent papers available on her website.

Negar Kiyavash talked about timing side channel attacks — an adversary can ping your router and from the delays in the round trip times can learn pretty much what websites you are surfing. Depending on the queueing policy, the adversary can learn more or less about you. Negar showed that first come first serve (FCFS) is terrible in this regard, and there is a bit of a tradeoff wherein policies with higher delay offer more privacy. This seemed reminiscent of the work Parv did on Chaum mixing…

Lav Varshney talked about security in RFID — the presence of an eavesdropper actually detunes the RFID circuit, so it may be possible for the encoder and decoder to detect if there is an eavesdropper. The main challenge is that nobody knows the transfer function, so it has to be estimated (using a periodogram energy detector). Lav proposed a protocol in which the transmitter sends a key and the receiver tries to detect if there is an eavesdropper; if not, then it sends the message.

Tsachy Weissman talked about how to estimate directed mutual information from data. He proposed a number of estimators of increasing complexity and showed that they were consistent. The basic idea was to leverage all of the results on universal probability estimation for finite alphabets. It’s unclear to me how to extend some of these results to the continuous setting, but this is an active area of research. I saw a talk recently by John Lafferty on forest density estimation, and this paper on estimating mutual information also seems relevant.
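For reference, here is the definition of directed information as I know it (Massey's version; the talk may well have worked with a normalized rate instead, so treat the notation as my assumption). The only difference from ordinary mutual information is that the $i$-th term uses the inputs up to time $i$ rather than the whole block:

$$I(X^n \to Y^n) = \sum_{i=1}^{n} I(X^i ; Y_i \mid Y^{i-1}), \qquad I(X^n ; Y^n) = \sum_{i=1}^{n} I(X^n ; Y_i \mid Y^{i-1}).$$

Since $X^i$ is a sub-block of $X^n$, each term on the left is at most the corresponding term on the right, so $I(X^n \to Y^n) \le I(X^n ; Y^n)$.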

Shing-Tung Yau and Steve Nadis, The Shape of Inner Space — This book was about the Calabi conjecture, Calabi-Yau manifolds, string theory, and all that jazz. It’s supposed to be for a general/lay audience, but I found it rather daunting and often confusing. Perhaps I know just enough math to get confused, whereas other readers might gloss over things. I definitely would not recommend it to those without some serious mathematical background (like a few college classes). That being said, I found it pretty interesting, and now I know (kind of) what a Calabi-Yau space is.

Donald G. Saari, Decisions and Elections : Explaining the Unexpected — This sums up a large chunk of the analysis of social choice problems and voting systems done by Donald Saari. It’s a bit overwritten for my taste and veers between some mathematical formalism and a chatty form of argumentation. I don’t think I fit in the right “audience” for this book, which is part of the problem. It discusses Arrow’s Theorem and Sen’s Theorem via a bunch of examples and spends a fair bit of time on the “paradoxes” and perversities of different choice systems. The chattiness makes it feel less than systematic. Towards the end Saari puts on more of an advocate hat and argues that symmetry (in a particular sense) is a desirable property of election systems and puts out a case for the Borda count. That in a sense is the least convincing part of the book. This might be a good present for a precocious high school student, since the math is not so complicated but there are a lot of ideas to chew on in there.

Hannu Nurmi, Voting Procedures under Uncertainty — This also fits into the “slightly math-y books for political scientists” genre, so I found it insufficiently math-y. It’s a survey of different models of uncertainty in voting procedures and a survey of the work in that area. As such, it discusses alternatives to “traditional” social choice theory including Euclidean models of preference and so on. There’s a little more “survey” and less “integrating the different perspectives” but that’s ok. I am not sure who should read it, but it did point out some new literatures of which I had previously been unaware.

Moez Draif and Laurent Massoulié, Epidemics and Rumors in Complex Networks — A nice and compact introduction to rumor-spreading processes, including branching processes, small world graphs, SIS/SIR type models, and concluding with some models for “viral marketing.” I really liked this book because it was concise and to the point, but others may find that it lacks some context and connections to literature with which they are familiar. It doesn’t feel like a tutorial in that respect, but it’s self-contained and great for someone who has seen some of the material before but not all of it.

John Mortimer, Rumpole and the Primrose Path — Reading Rumpole short stories is kind of like relaxing in a pair of old slippers. Enjoyable, but probably not his best work.

I kind of like this version of Take Five from Sachal Music.

Sometimes the Library of Congress does awesome things. This jukebox is up there.

I wouldn’t have believed before that there is money in a banana stand, but I could be wrong.

The clarity in this press nugget leaves a lot to be desired. The statement “the trio has found a way to determine the smallest number of nodes that must be externally controlled to force a given network from any initial state to any desired final state,” is so vague! The full article is here. It turns out they are looking at a linear control problem $d\mathbf{x}/dt = A \mathbf{x}(t) + B \mathbf{u}(t)$ where the different elements of the state are related via a graph matched to $A$ and you want the input $\mathbf{u}(t)$ to only be nonzero on a subset of the nodes. Thanks to Ann Wehman for the pointer.
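To make the question concrete, here is a minimal sketch of the textbook Kalman rank test: the pair $(A, B)$ is controllable exactly when $[B, AB, \ldots, A^{n-1}B]$ has rank $n$. This is not necessarily how the article finds the smallest driver set; it only checks a given candidate set. The names `is_controllable` and `driver_nodes` and the three-node example are my own, and I am assuming the convention that `A[i, j] != 0` means node `j` influences node `i`.

```python
import numpy as np


def controllability_matrix(A, B):
    """Stack [B, AB, ..., A^(n-1) B] side by side."""
    n = A.shape[0]
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])
    return np.hstack(blocks)


def is_controllable(A, driver_nodes):
    """Kalman rank test when the input enters only at driver_nodes."""
    n = A.shape[0]
    # One input per driver node: B has a single 1 in each column.
    B = np.zeros((n, len(driver_nodes)))
    for col, node in enumerate(driver_nodes):
        B[node, col] = 1.0
    rank = np.linalg.matrix_rank(controllability_matrix(A, B))
    return bool(rank == n)


# Hypothetical example: a directed path 0 -> 1 -> 2.
A_path = np.array([[0.0, 0.0, 0.0],
                   [1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])

print(is_controllable(A_path, [0]))  # driving the head of the path
print(is_controllable(A_path, [2]))  # driving the tail of the path
```

Driving the head of the path reaches every node, while driving the tail reaches nothing upstream, which is the kind of asymmetry that makes the choice of driver nodes interesting.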

Apparently I spend half my time reading Crooked Timber.

Žižek gets a lashing for his lazy contrarianism.

A great piece by Michael Bérubé on the Sokal hoax and its aftermath.

Scott Aaronson thinks people should vote to cut funding for quantum computing via YouCut. Why? Because “seeing my own favorite research topics attacked on the floor of the House” would be hilarious (and it would too!).

Marc Lelarge has a new paper up on diffusion and cascade effects in random networks. Fun reading for the break, assuming I can get time.

Some new ways of measuring impact factors.

# ISIT 2010 : Abbas El Gamal and Te Sun Han

I seem to have gotten all behind on wrapping up the ISIT blogging, so the remainder may be more compressed takes on things. This is not in the compressed sensing world, where the signals are sparse and my comments are meant to reconstruct, but more like lossy compression where $D \to \sigma^2$ (for the Gaussian case).
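In case the joke is too compressed: for a Gaussian source with variance $\sigma^2$ under squared-error distortion, the rate-distortion function is

$$R(D) = \begin{cases} \frac{1}{2} \log \frac{\sigma^2}{D}, & 0 < D \le \sigma^2, \\ 0, & D > \sigma^2, \end{cases}$$

so as $D \to \sigma^2$ the rate goes to zero, which is roughly the number of bits I expect to spend per talk.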

Abbas El Gamal gave a very nice plenary on “Coding for Noisy Networks” in which he really brought together a lot of different eras and streams of work on network information theory and tried to tie them together in a conceptual framework. There was a nice mix of older and newer results. The thing I liked best about it was that he was very optimistic about making progress in understanding how to communicate in networks from an information-theory perspective, which counteracts the sentiment that I heard that “well, it’s just too messy.”

Te Sun Han gave the Shannon Lecture, of course, and he used his time to give a tutorial on the information spectrum method. I had tried to read the book earlier, and honestly found it a little impenetrable (or rather, I wasn’t sure what I was supposed to use from it). The talk was more like reading the papers: concisely stated, but with a clear line of intuition. I know some people are not big fans of Shannon Lectures as tutorials, but I think there is also a case to be made that most people are unfamiliar with the information spectrum method. A nice example he gave was to show when the output of an optimal source coder looks “completely random.” Maybe this has been done already, but is there a connection between existing theories of pseudorandomness and the information spectrum method?

# ITA Workshop Aftermath

The 2010 ITA Workshop is over, and now that I have almost caught up on sleep, I can give a short report. I think this year was the largest workshop so far, with over 300 speakers.

One of the additions this year was a poster session to give those students who didn’t get a chance to speak at the “Graduation Day” talks on Wednesday an opportunity to present their research. The posters went up on Monday and many stayed up most of the week. I am usually dubious of poster sessions for more theoretical topics; from past experience I have had a hard time conveying anything useful in the poster and they are poorly attended. However, this poster session seemed to be a rousing success, and I hope they continue to do it in future years.

The theme this year was “networks,” broadly construed, and although I didn’t end up going to a lot of the “pure” networking sessions, the variety of topics was nice to see, from learning in networks to network models. Jon Kleinberg gave a great plenary lecture on cascading phenomena. I particularly enjoyed Mor Harchol-Balter’s tutorial on job scheduling in server farms. I learned to re-adjust my intuition for what good scheduling policies might be. The tutorials should be online sometime and I’ll post links when that happens.

The “Senseless Decompression” session organized by Rudi Urbanke and Ubli Mitra should also be online sometime. Apparently 20+ schools contributed short videos on the theme of $\frac{1}{2} \log(1 + \mathrm{SNR})$. Maybe we can make a YouTube channel for them.

Perhaps later I’ll touch on specific talks that I went to, but this time around I didn’t take too many notes, probably because I was a little exhausted from the organizing. I may post a more technical thing on the talk I gave about my recent work on privacy-preserving machine learning, but that will have to wait a bit in the queue. Mor’s tutorial suggests I should use Shortest Remaining Processing Time (SRPT) to make sure things don’t wait too long, and I have some low-hanging tasks that I can dispatch first.
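For anyone who wants to play with the idea, here is a toy single-server simulation comparing first come first serve with SRPT. It is a sketch under my own assumptions, not anything from Mor’s tutorial: jobs are `(arrival_time, size)` pairs, preemption is free, and the function name `mean_response_time` and the example jobs are made up for illustration.

```python
import heapq


def mean_response_time(jobs, policy):
    """Average (completion - arrival) on one server.

    jobs: list of (arrival_time, size); policy: "fcfs" or "srpt".
    """
    jobs = sorted(jobs)
    n = len(jobs)
    t = 0.0
    i = 0          # index of the next job that has not arrived yet
    queue = []     # heap of (priority, arrival, remaining_work)
    total = 0.0
    while i < n or queue:
        if not queue:
            # Server is idle: jump ahead to the next arrival.
            t = max(t, jobs[i][0])
        while i < n and jobs[i][0] <= t:
            arrival, size = jobs[i]
            key = arrival if policy == "fcfs" else size
            heapq.heappush(queue, (key, arrival, size))
            i += 1
        _, arrival, remaining = heapq.heappop(queue)
        # Run the chosen job until it finishes or the next job arrives.
        next_arrival = jobs[i][0] if i < n else float("inf")
        run = min(remaining, next_arrival - t)
        t += run
        remaining -= run
        if remaining > 1e-12:
            # Unfinished: requeue. Under FCFS its key is unchanged, so it
            # keeps the server; under SRPT a shorter arrival may preempt it.
            key = arrival if policy == "fcfs" else remaining
            heapq.heappush(queue, (key, arrival, remaining))
        else:
            total += t - arrival
    return total / n


# Hypothetical workload: one long job followed by two short ones.
jobs = [(0.0, 10.0), (1.0, 1.0), (2.0, 1.0)]
fcfs = mean_response_time(jobs, "fcfs")
srpt = mean_response_time(jobs, "srpt")
print(fcfs, srpt)
```

On this workload the two short jobs sit behind the long one under FCFS, while under SRPT they slip in and out and the long job finishes at the same time anyway, so the mean response time drops without hurting the long job.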