Typical review loads

Since becoming faculty at TTI, I’ve started to appreciate better the tensions of service commitments and I can see how many people begin to view reviewing as a chore, a burden they must bear to maintain goodwill in the “community.” Since I work in a few different communities now, I end up reviewing papers from a lot of different areas : information theory and signal processing of course, but also machine learning, security, and networks. There’s been a distinct uptick in my reviewing queue, which I find somewhat alarming.

Looking back, I did a quick calculation and in the almost 6 months I’ve been here, I’ve either finished or committed to reviewing 9 journal papers and 16 conference papers. These numbers don’t really mean too much, because some journal papers are shorter (e.g. a correspondence) and some conference papers are long (40+ pages including supplementary material). Page numbers also don’t really help because of formatting differences. I’m hoping my new iPad (ooh, shiny!) will let me pack in some reviewing time during my commute and stop me from killing so many trees.

However, I have no idea if these numbers are typical. I’ve turned down review requests because I felt like I didn’t have enough time as it is. So readers : what’s a typical review load like? Should I just suck it up and accept more reviews?

Note that I’m not asking about what’s “fair” in terms of I submit N papers and therefore should review 3N or something like that. Those games are fine and all, but I really wonder what the distribution of review load is across individuals for a given journal. More on that point later…

Update: I should be clear that being on a PC will clearly cause your review load to go up. I am on 2 PCs but for smaller conferences; having 10+ ISIT reviews would add significantly to one’s total load.

Updated perl script for merging TeX files for ArXiV

Manu Sridharan (blog) left a comment the other day on my old post on my script to merge multiple TeX files (and strip the comments) for posting to ArXiV. He’s created a git repository for it, which seems so much more official and stuff. It’s at:

https://gist.github.com/2175026

Thanks a bunch, Manu!

As a side note, Péter Gács has a de-macro script to eliminate all of your private macros if you’re so inclined.
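For anyone curious what "merge and strip comments" amounts to, here is a rough Python sketch of the idea. It is not the perl script or Manu's version; the names `flatten` and `read_tex` are made up for illustration, and it assumes includes look like `\input{name}` or `\include{name}` on a single line. It keeps the `%` itself when cutting a trailing comment so that line-end spacing in the TeX output does not change, and it does not try to handle verbatim environments or a `\\` immediately followed by `%`.

```python
import re
from pathlib import Path

# Matches \input{name} or \include{name} (assumed to sit on one line).
INPUT_RE = re.compile(r"\\(?:input|include)\{([^}]+)\}")
# A percent sign that is not escaped as \% starts a comment.
COMMENT_RE = re.compile(r"(?<!\\)%.*")


def read_tex(name):
    """Hypothetical loader: read `name`, appending .tex if it is missing."""
    path = Path(name if name.endswith(".tex") else name + ".tex")
    return path.read_text()


def flatten(text, read_file=read_tex):
    """Inline included files recursively and drop comment text."""
    out = []
    for line in text.splitlines():
        if line.lstrip().startswith("%"):
            continue  # whole-line comment: drop the line entirely
        # Cut trailing comment text but keep the % so spacing is preserved.
        line = COMMENT_RE.sub("%", line)
        # Splice in the flattened contents of each included file.
        line = INPUT_RE.sub(
            lambda m: flatten(read_file(m.group(1)), read_file), line
        )
        out.append(line)
    return "\n".join(out)
```

Calling `flatten(Path("main.tex").read_text())` from the paper's directory would give a single string to write out as the merged file. Comments are stripped before includes are expanded, so a commented-out `\input` is never followed.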

Linkage

I’ve been refraining from talking about the Dharun Ravi case, because it’s pretty complicated. On the one hand, after reading the New Yorker article and other material, it’s pretty clear Dharun is a grade-A jerk. And Tyler Clementi’s death was a terrible tragedy. But on the other hand, 10 years in prison is a serious thing, as Ta-Nehisi Coates points out. Ashvin shared a link to a blog post on “Deporting Homophobia”:

I have been Tyler and Dharun in a post 9/11 U.S. that accuses white men of exploiting the rest of the world and accuses brown men of destroying it. I have been Tyler and Dharun in a post 9/11 world where white men advocate for homosexual rights and advance homophobia and where brown men are understood as always homophobic. I am being presumptuous, so let me stop.

It’s an interesting take on things, and has made me think about the media coverage of the event and if and how Dharun’s race has played into how the story has been told.

Via Kamalika I learned about a lawsuit against IMDB.

A gem from SMBC via Cosma. The Beef Tensors are a nice touch.

Sepia Mutiny is shutting down, and Amardeep has some closing thoughts.

We always get to hear these stories about how service providers need differential pricing for network traffic because they can’t make money, but then stories like this make me question the integrity of the complainers.

I heard Of Monsters and Men on KEXP and their show is sold out in Chicago, boo. Here’s their crazy video though:

CISS 2012 : day 1

I’m at CISS right now on the magnolia-filled Princeton campus. The last time I came here was in 2008, when I was trying to graduate and was horribly ill, so this year was already a marked improvement. CISS bears some similarities to Allerton — there are several invited sessions in which the talks are a little longer than the submitted sessions. However, the session organizers get to schedule the entire morning or afternoon (3 hours) as they see fit, so hopping between sessions is not usually possible. I actually find this more relaxing — I know where I’m going to be for the afternoon, so I just settle down there instead of watching the clock so I don’t miss talk X in the other session.

Because there are these invited slots, I’ve begun to realize that I’ve seen some of the material before in other venues such as ITA. This is actually a good thing, since in general I’ve begun to realize that I have to see things 3 times for me to wrap my brain around them.

In the morning I went to Wojciech Szpankowski’s session on the Science of Information, a sort of showcase for the new multi-university NSF Center. Peter Shor gave an overview of quantum information theory, ending with comments on the additivity conjecture. William Bialek discussed how improvements in array sensors for multi-neuron recording and other measurement technologies are allowing experimental verification of some theoretical/statistical approaches to neuroscience and communication in biological systems. In particular, he discussed an interesting example of how segmentation appears in the embryonic development of fruit flies and how they can track the propagation of chemical markers during development.

David Tse gave a slightly longer version of his ITA talk on DNA sequencing, with more of the proof details. It’s a cute version of the genome assembly problem but I am not entirely sure what it tells us about the host of other questions biologists have about this data. I’m trying to wrestle with some short-read sequencing data to understand it (and learning some Bioconductor in the process), and the real data is pretty darn messy.

Madhu Sudan talked about his work with Brendan Juba (and now Oded Goldreich) on Semantic Communication — it’s mostly trying to come up with definitions of what it means to communicate meaning using computer science, and somehow feels like some of these early papers in Information and Control which tried to mathematize linguistics or other fields. This is the magical 3rd time I’ve seen this material, so maybe it’s starting to make sense to me.

Andrea Goldsmith gave a whirlwind tour of the work in backing away from asymptotic studies in information theory, and how insights we get from asymptotic analyses often don’t translate into the finite parameter regime. This is of a piece with her stand a few years ago on cross-layer design. High SNR assumptions in MIMO and relaying imply that certain tradeoffs (such as diversity-multiplexing) or certain protocols (such as amplify-and-forward) are fundamental but at moderate SNR the optimal strategies are different or unknown. Infinite blocklengths are the bread and butter of information theory but now there are more results on what we can do with finite blocklength. She ended with some comments on infinite processing power and trying to consider transmit and processing power jointly, which caused some debate in the audience.

Alas, I missed Tsachy Weissman’s talk, but at least I saw it at ITA? Perhaps I will get to see it two more times in the future!

In the afternoon I went to the large alphabets session which was organized by Aaron Wagner. Unfortunately, Aaron couldn’t make it so I ended up chairing the session. Venkat Chandrasekaran didn’t really talk about large alphabets, but instead about estimating high dimensional covariance matrices when you have symmetry assumptions on the matrix. These are represented by the invariance of the true covariance under actions of a subgroup of the symmetric group — taking these into account can greatly improve sample complexity bounds. Mesrob Ohanessian talked about his canonical estimation framework for large alphabet problems and summarized a lot of other work before (too briefly!) mentioning his own work on the consistency of estimators under some assumptions on the generating distribution.
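To make the symmetry idea from Venkat's talk a little more concrete, here is a toy sketch in plain Python. One natural way to use a known symmetry group is to average the sample covariance over the group, which forces the estimate to be invariant. I am assuming this is in the spirit of the talk rather than recalling the exact estimator, and the names `symmetrize` and `cyclic_group` are mine. The example group is the cyclic shifts, for which the averaged matrix comes out circulant.

```python
def cyclic_group(p):
    """All p cyclic shifts of range(p), each written as a permutation list."""
    return [[(i + s) % p for i in range(p)] for s in range(p)]


def symmetrize(cov, group):
    """Average the permuted copies of `cov` over every permutation in `group`.

    `group` must be closed under composition (an actual group) for the
    result to be invariant under each of its elements.
    """
    p = len(cov)
    avg = [[0.0] * p for _ in range(p)]
    for perm in group:
        for i in range(p):
            for j in range(p):
                # Entry (i, j) of the covariance with rows/columns permuted.
                avg[i][j] += cov[perm[i]][perm[j]] / len(group)
    return avg


if __name__ == "__main__":
    sample_cov = [[1.0, 2.0, 3.0], [2.0, 5.0, 6.0], [3.0, 6.0, 9.0]]
    for row in symmetrize(sample_cov, cyclic_group(3)):
        print(row)
```

The intuition for the sample complexity gain is that entries tied together by the symmetry get pooled, so each free parameter is effectively estimated from more data.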

Prasad Santhanam talked about the insurance problem that he worked on with Venkat Anantharam, and I finally understood it a bit better. Suppose you are observing i.i.d. samples $X_t$ from a distribution $P$ on $\mathbb{R}^{+}$ that represent losses paid out by an insurer. The insurer gets to observe the losses for a while and then has to start setting premiums $Y_t$. The question is this : when can we guarantee that $Y_t$ remains bounded and $\mathbb{P}( Y_t > X_t \ \forall t ) > 1 - \eta$? In this case we would say the distribution is insurable.
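A small simulation helps me see what the definition is asking. The sketch below is only an illustration: the premium rule (a fixed multiple of the largest loss seen so far) is something I made up for the example, not the scheme from the talk, and a finite horizon can only hint at the "for all t" event. The names `premiums_cover` and `estimate_success` are hypothetical.

```python
import random


def premiums_cover(losses, warmup, factor):
    """True if the toy premium exceeds every loss after the warm-up period.

    The first `warmup` losses (at least one) are only observed. After that
    the premium for each round is `factor` times the largest loss seen so
    far, and it is set before that round's loss is revealed.
    """
    running_max = max(losses[:warmup])
    for loss in losses[warmup:]:
        premium = factor * running_max
        if premium <= loss:
            return False
        running_max = max(running_max, loss)
    return True


def estimate_success(sample_loss, horizon, warmup, factor, trials, seed=0):
    """Monte Carlo estimate of P(premium > loss at every step up to horizon)."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        losses = [sample_loss(rng) for _ in range(horizon)]
        if premiums_cover(losses, warmup, factor):
            wins += 1
    return wins / trials


if __name__ == "__main__":
    bounded = lambda rng: rng.uniform(0.0, 1.0)   # losses in [0, 1]
    heavy = lambda rng: rng.paretovariate(1.0)    # heavy-tailed losses
    print("uniform:", estimate_success(bounded, 1000, 20, 2.0, 500))
    print("pareto: ", estimate_success(heavy, 1000, 20, 2.0, 500))
```

With bounded losses this toy premium can never exceed the multiplier times the upper bound, so boundedness of the premium holds automatically and only the coverage probability is in question. For the heavy-tailed case the premium itself keeps growing, which is the other half of what the definition rules out.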

To round out the session, Wojciech Szpankowski gave a talk on analytic approaches to bounding minimax redundancy under different scaling assumptions on the alphabet and sample sizes. There was a fair bit of generatingfunctionology and Lambert W-functions. The end part of the talk was on scaling when you know part of the distribution exactly (perhaps through offline simulation or training) but then there is part which is unknown. The last talk was by Greg Valiant, who talked about his papers with Paul Valiant on estimating properties of distributions on $n$ elements using only $\Theta(n/\log n)$ samples. It was a variant of the talk he gave at Banff, but I think I understood the lower bound CLT results a bit better (using Stein’s Method).

I am not sure how much blogging I will do about the rest of the conference, but probably another post or two. Despite the drizzle, the spring is rather beautiful here — la joie du printemps.

ICITS Deadline Extension

Due to conflicts with other deadlines and conferences, the submission
deadline for the “conference” track of ICITS 2012 — the International
Conference on Information-Theoretic Security — has been moved back
ten days to Thursday, March 22, 2012.

The “conference” deadline is now Thursday, March 22 (3pm EDT / 19:00 GMT).
The “workshop” deadline is Monday, April 9.

ICITS will have two tracks this year, one which will act as a regular
computer science-style conference (published proceedings, original
work only) and the other which will behave more like a workshop,
without proceedings, where presentations on previously published work
or work in progress are welcome.

For more information, see the conference website.

Bach Collegium San Diego Bleg

The Bach Collegium San Diego, a group with whom I sang on occasion, has a Kickstarter going to fund a tour. Please consider helping them out!

The Bach St John Passion is truly coming full circle for the BCSD, as it was our debut concert in 2003. This year marks our second annual performance of this work. We are seeking to establish an annual tradition of performing a Bach Passion (and other Passion Music) near Holy Week and Easter leading to an eventual Easter Festival.

In order to help bring this annual tradition to reality, we’re asking that you consider helping to sponsor the 16 singers who will form the dynamic vocal ensemble that will propel this dramatic work. The size of the donation is not as important as the interaction and participation of those who believe in our mission and work. We thank you in advance for your generosity and we look forward to seeing you at the performances in April!

Facts, “facts,” and ficts

After reading and hearing about The Lifespan of a Fact, I was primed to learn from TPM that the This American Life story involving Mike Daisey’s “trip” to see Apple’s products being made at Foxconn was retracted due to significant fabrications by Daisey, resulting in an episode entitled Retraction going into it:

When the original 39-minute excerpt was broadcast on This American Life on January 6, 2012, Marketplace China Correspondent Rob Schmitz wondered about its truth. Marketplace had done a lot of reporting on Foxconn and Apple’s supply chain in China in the past, and Schmitz had first-hand knowledge of the issues. He located and interviewed Daisey’s Chinese interpreter Li Guifen (who goes by the name Cathy Lee professionally with westerners). She disputed much of what Daisey has been telling theater audiences since 2010 and much of what he said on the radio.

Bellairs Workshop 2012

The beach at Bellairs

I am spending the week at the Bellairs Research Institute in Holetown, Barbados. McGill owns this facility and faculty organize informal workshops throughout the year on various topics. There are two going on right now: one on control theory approaches in computer animation, and our workshop on signal processing in networks. The McGill side is Mark Coates and Mike Rabbat and a number of their students, both masters and PhD. Anna Scaglione and Angelia Nedich arrived recently.

The format of the workshop has been a mix of tutorials and student presentations and plenty of time for discussion and some problem formation. And of course, for the beach, which is just behind the research facility. Holetown is on the west coast of Barbados, and the Caribbean is warm and inviting. I’m having a great time, even though I am falling behind on my other projects and email and whatnot.

People from Barbados call themselves Bajans (/ˈbeɪdʒənz/), so one should be careful not to discuss p-values or t-tests around them.

Quote of the day : the Scholar

In the Hieroglyphica of Valerian, the Ass is the symbol of the Scholar, humbly chewing his dry diet of texts, laboring mightily for Learning. The lacquered eyeballs moist, intelligent maybe, but a little cocked, not easy to make them both look in the same direction. Left side the wackier one, mad and errant; right side patient and mild.

Daemonomania, by John Crowley