2015 North American School of Information Theory

The 2015 North American School of Information Theory (NASIT) will be held on August 10-13, 2015, at the University of California, San Diego in La Jolla. If you or your colleagues have students who might be interested in this event, we would be grateful if you could forward this email to them and encourage their participation. The application deadline is Sunday, June 7. As in past schools, we again have a great set of lecturers this year.

We are pleased to announce that Paul Siegel will be the Padovani Lecturer of the IEEE Information Theory Society and will give his lecture at the School. The Padovani Lecture is sponsored by a generous gift from Roberto Padovani.

For more information and to apply, please visit the School website.

Signal boost: Postdoc positions at Tel Aviv University

Two postdoctoral research positions are now available in the Department of Electrical Engineering – Systems at Tel Aviv University, Israel, in the fields of information theory and interactive communications. The positions start immediately, run for up to two years, and are funded by the European Research Council (ERC).

We offer two postdoctoral fellowships for researchers in the broad area of information theory, with special emphasis on interactive communications. Specific topics of interest include single-user and multiuser communications with noisy feedback, iterative-refinement coding for two-way channels, interactive coding and its relations to dynamical systems and stochastic control, resource-limited interactive communications, distributed function computation, and combinatorial aspects of multiuser interactive communications. The research will be conducted in close collaboration with Dr. Ofer Shayevitz and his group, and is funded by a grant from the European Research Council (ERC).

The positions are available immediately and for a period of up to two years. Applicants should hold a PhD in electrical engineering, computer science, or mathematics, and are expected to have a strong background in information theory or closely related fields. Remuneration is highly competitive and commensurate with skills and track record. To apply, please send your CV along with a short statement of research interests to Dr. Ofer Shayevitz at ofersha@eng.tau.ac.il.

CFP: 2015 Information Theory Workshop (ITW), Jeju Island

I am on the TPC for ITW 2015 in Jeju Island, South Korea.

The 2015 IEEE Information Theory Workshop will take place on Jeju Island, Korea, from October 11 to October 15, 2015. Jeju Island is the largest island in Korea and is located in the Pacific Ocean just off the south-western tip of the Korean peninsula. It is a volcanic island with mountainous terrain, a dramatic rugged coastline, and spectacular watershed courses. The island has a unique culture as well as natural beauty: it is a living folk village with approximately 540,000 people. As a result of its isolated location and romantic tropical image, Jeju Island has become a favorite retreat for honeymooners and tourists. The tour programs of the conference will also give participants the opportunity to enjoy some of the island's fascinating attractions.

Special topics of emphasis include:

  • Big data
  • Coding theory
  • Communication theory
  • Computational biology
  • Interactive communication
  • Machine learning
  • Network information theory
  • Privacy and security
  • Signal processing

ISIT Deadline Extended to Monday

Apparently not everyone got this email, so here it is. I promise this blog will not become PSA-central.

Dear ISIT-2015-Submission Reviewers:

In an effort to ensure that each paper has an appropriate number of reviews, the deadline for the submission of all reviews has been extended to March 2nd. If you have not already done so, please submit your review by March 2nd as we are working to a very tight deadline.

In filling out your review, please keep in mind the following:

(a) All submissions are eligible to be considered for presentation in a semi-plenary session. Please ensure that your review provides an answer to Question 11.
(b) For a submission that is eligible for the 2015 IEEE Jack Keil Wolf ISIT Student Paper Award, the evaluation form contains a box at the top with the text: "Notice: This paper is to be considered for the 2015 IEEE Jack Keil Wolf ISIT Student Paper Award, even if the manuscript itself does not contain a statement to that effect." Please ensure that your review provides an answer to Question 12 in this case.

Thanks very much for helping out with the review process for ISIT; your inputs are of critical importance in ensuring that the high standards of an ISIT conference are maintained. We know that reviewing a paper takes much effort, and we are grateful for all the time you have put in!

With regards,

Pierre, Suhas and Vijay
(TPC Co-Chairs, ISIT 2015)

ITA 2015: quick takes

Better late than never, I suppose. A few weeks ago I escaped the cold of New Jersey to my old haunts of San Diego. Although La Jolla was always a bit fancy for my taste, it’s hard to beat a conference which boasts views like this:

A view from the sessions at ITA 2015

I’ll just recap a few of the talks that I remember from my notes — I didn’t really take notes during the plenaries so I don’t have much to say about them. Mostly this was due to laziness, but finding the time to blog has been challenging in this last year, so I think I have to pick my battles. Here’s a smattering consisting of

\{ \mathrm{talks\ attended} \} \cap \{ \mathrm{talks\ with\ understandable\ notes} \}

(Information theory)
Emina Soljanin talked about designing codes that are good for fast access to the data in distributed storage. Initial work focused on how to repair codes under disk failures; she looked at how easy it is to retrieve the information afterwards, to guarantee some QoS for the storage system. Adam Kalai talked about designing compression schemes that work for an "audience" of decoders. The decoders have different priors on the set of elements/messages, so the idea is to design an encoder that works for this ensemble of decoders. I kind of missed the first part of the talk, so I wasn't quite sure how this relates to classical work on mismatched decoding in the information theory world. Gireeja Ranade gave a great talk about defining notions of the capacity/rate needed to control a system which has multiplicative uncertainty. That is, x[n+1] = x[n] + B[n] u[n], where the uncertainty is in the gain B[n]. She gave a couple of different notions of capacity, relating to the ratio |x[n]/x[0]|: either the expected value of the square or of the log, appropriately normalized. She used a "deterministic model" to give an explanation of how control in this setting is kind of like controlling the number of significant bits in the state: uncertainty increases this number, and you need a certain "amount" of control to cancel that growth.
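
To get a feel for the multiplicative-uncertainty model, here is a minimal simulation sketch (my own illustration, not code from the talk). It assumes B[n] is uniform on [1-delta, 1+delta] and applies the naive proportional controller u[n] = -x[n], so that x[n+1] = (1 - B[n]) x[n]:

```python
import numpy as np

rng = np.random.default_rng(0)

def log_growth_rate(delta, n_steps=1000, n_trials=5000):
    """Average growth rate of log|x[n]/x[0]| for x[n+1] = x[n] + B[n] u[n]
    under the naive controller u[n] = -x[n], i.e. x[n+1] = (1 - B[n]) x[n].
    B[n] ~ Uniform[1-delta, 1+delta] is an assumed distribution, chosen
    only for illustration."""
    log_ratio = np.zeros(n_trials)            # log|x[n]/x[0]| per trial
    for _ in range(n_steps):
        B = rng.uniform(1 - delta, 1 + delta, size=n_trials)
        log_ratio += np.log(np.abs(1.0 - B))  # per-step increment of log|x|
    return log_ratio.mean() / n_steps

# For this controller E[log|1 - B[n]|] = log(delta) - 1, so the log
# criterion stays bounded iff delta < e; the sign flip shows up empirically:
for delta in [0.5, 1.0, 2.0, 4.0]:
    print(f"delta = {delta}: growth rate ~ {log_growth_rate(delta):+.3f}")
```

A negative rate means this particular controller shrinks the state on average; the point of the talk, as I understood it, was to characterize such thresholds fundamentally, over all controllers.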

(Learning and statistics)
I learned about active regression approaches from Sivan Sabato that provably work better than passive learning. The idea there is to use a partition of the X space and then do piecewise constant approximations to a weight function that they use in a rejection sampler. The rejection sampler (which I thought of as sort of doing importance sampling to make sure they cover the space) helps limit the number of labels requested by the algorithm. Somehow I had never met Raj Rao Nadakuditi until now, and I wish I had gotten a chance to talk to him further. He gave a nice talk on robust PCA, and in particular how outliers "break" regular PCA. He proposed a combination of shrinkage and truncation to help make PCA a bit more stable/robust. Laura Balzano talked about "estimating subspace projections from incomplete data." She proposed an iterative algorithm for doing estimation on the Grassmann manifold that can do subspace tracking. Constantine Caramanis talked about a convex formulation for mixed regression that gives a guaranteed solution, along with minimax sample complexity bounds showing that it is basically optimal. Yingbin Liang talked about testing approaches for understanding whether there is an "anomalous structure" in a sequence of data. Basically, for a sequence Y_1, Y_2, \ldots, Y_n, the null hypothesis is that they are all i.i.d. \sim p, and the (composite) alternative is that there is an interval of indices which are \sim q instead. She proposed an RKHS-based discrepancy measure and a threshold test on this measure. Pradeep Ravikumar talked about a "simple" estimator that was a "fix" for ordinary least squares with some soft thresholding. He showed consistency for linear regression in several senses, competitive with LASSO in some settings. Pretty neat, all said, although he also claimed that least squares was "something you all know from high school"; I went to a pretty good high school, and I don't think we did least squares! Sanmi Koyejo talked about a Bayesian decision theory approach to variable selection that involved minimizing some KL-divergence. Unfortunately, the resulting optimization ended up being NP-hard (for reasons I can't remember), so they use a greedy algorithm that seems to work pretty well.
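
I didn't write down Ravikumar's exact estimator, but the flavor of "ordinary least squares plus soft thresholding" is easy to sketch. This is my own toy version with a hand-picked threshold tau, not the estimator from the talk:

```python
import numpy as np

def soft_threshold(v, tau):
    """Elementwise soft-thresholding: shrink each entry toward zero by tau."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def thresholded_ols(X, y, tau):
    """Fit ordinary least squares, then soft-threshold the coefficients.
    A principled choice of tau is exactly what the actual paper would
    supply; here it is just a hand-picked constant."""
    beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
    return soft_threshold(beta_ols, tau)

# Sanity check on synthetic sparse data (illustrative only).
rng = np.random.default_rng(1)
n, d, k = 200, 50, 5
beta_true = np.zeros(d)
beta_true[:k] = 1.0
X = rng.standard_normal((n, d))
y = X @ beta_true + 0.1 * rng.standard_normal(n)
print("recovered support:", np.flatnonzero(thresholded_ols(X, y, tau=0.1)))
```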

(Privacy)
Cynthia Dwork gave a tutorial on differential privacy with an emphasis on the recent work involving false discovery rate. In addition to her plenary, there were several talks on differential privacy and other privacy measures. Kunal Talwar talked about their improved analysis of the SuLQ method for differentially private PCA. Unfortunately there were two privacy sessions in parallel, so I hopped over to see John Duchi talk about definitions of privacy and how definitions based on testing are equivalent to differential privacy. The testing framework makes it easier to prove minimax bounds, though, so it may be a more useful view at times. Nadia Fawaz talked about privacy for time-series data such as smart meter data. She defined different types of attacks in this setting, showed that they correspond to mutual information or directed mutual information, and presented empirical results on a real data set. Raef Bassily studied an estimation problem in the streaming setting where you want to get a histogram of the most frequent items in the stream. They reduce the problem to one of finding a "unique heavy hitter" and develop a protocol that looks sort of like a code for the MAC: they encode bits into a real vector, add noise, and then add those up over the reals. It's accepted to STOC 2015, and he said the preprint will be up soon.
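
For context, the basic shape of SuLQ-style private PCA is easy to sketch: perturb the empirical second-moment matrix with symmetric noise and eigendecompose the result. The sketch below is mine; in particular, calibrating noise_scale to the privacy parameters (which is what the improved analysis is about) is deliberately left out:

```python
import numpy as np

def sulq_pca(X, top_k, noise_scale, rng):
    """SuLQ-flavored private PCA sketch: add symmetric Gaussian noise
    to the empirical second-moment matrix, then take the top-k
    eigenvectors. noise_scale is a placeholder; a real implementation
    must calibrate it to the privacy budget and the data norm."""
    n, d = X.shape
    A = (X.T @ X) / n                   # empirical second-moment matrix
    E = rng.standard_normal((d, d)) * noise_scale
    E = (E + E.T) / 2.0                 # symmetrize the noise matrix
    eigvals, eigvecs = np.linalg.eigh(A + E)
    return eigvecs[:, -top_k:]          # eigh sorts eigenvalues ascending

# Illustrative run; rows are normalized to unit norm, as is standard in
# this literature, so each individual's influence on A is bounded.
rng = np.random.default_rng(2)
X = rng.standard_normal((500, 10))
X /= np.linalg.norm(X, axis=1, keepdims=True)
print(sulq_pca(X, top_k=2, noise_scale=0.05, rng=rng).shape)  # (10, 2)
```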

Feature Engineering for Review Times

The most popular topic of conversation among information theory aficionados is probably the long review times for the IEEE Transactions on Information Theory. Everyone has a story of a very delayed review, either for their own paper or for a friend's. The Information Theory Society Board of Governors and Editor-in-Chief have presented charts of "sub-to-pub" times and other statistics and are working hard on ways to improve the speed of reviews without impairing their quality. These are all laudable. But it occurs to me that there is room for social engineering on the input side of things as well. That is, if we treat the process as a black box, with inputs (papers) and outputs (decisions), what would a machine-learning approach to predicting decision time look like?

Perhaps the most important (and sometimes overlooked) aspect of learning a predictor from real data is figuring out which features to measure for each input. Off the top of my head, things which may be predictive include:

  • length
  • number of citations
  • number of equations
  • number of theorems/lemmas/etc.
  • number of previous IT papers by the authors
  • h-index of authors
  • membership status of the authors (student members to Fellows)
  • associate editor handling the paper — although for obvious reasons we may not want to include this

I am sure I am missing a bunch of relevant measurable quantities here, but you get the picture.

I would bet that paper length is a strong predictor of review time, not because it takes a longer time to read a longer paper, but because the activation energy of actually picking up the paper to review it is a nonlinear function of the length.

Doing a regression analysis might yield some interesting suggestions on how to pick coauthors and paper length to minimize the review time; a toy version is sketched below. This could also help make the system go faster, no? Should we request these sorts of statistics from the EiC?
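
If the data ever materialized, the analysis itself would be only a few lines. Here is a hypothetical sketch (assuming scikit-learn is available; every feature name and number below is made up, not real ISIT or Transactions data):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(3)
n_papers = 1000

# Hypothetical feature matrix: one row per submission, columns matching
# (a subset of) the feature list above. All values are fabricated.
features = np.column_stack([
    rng.integers(8, 60, n_papers),    # page length
    rng.integers(5, 80, n_papers),    # number of citations
    rng.integers(0, 200, n_papers),   # number of equations
    rng.integers(0, 30, n_papers),    # number of theorems/lemmas
    rng.integers(0, 20, n_papers),    # previous IT papers by the authors
    rng.integers(0, 60, n_papers),    # max h-index among the authors
])

# Fake "sub-to-pub" times in months, with a superlinear length term to
# mimic the activation-energy hypothesis above.
months = (6 + 0.01 * features[:, 0].astype(float) ** 1.5
          + rng.normal(0, 2, n_papers))

model = LinearRegression().fit(features, months)
names = ["length", "citations", "equations", "theorems",
         "prior_IT_papers", "h_index"]
for name, coef in zip(names, model.coef_):
    print(f"{name:>16s}: {coef:+.4f} months per unit")
```

Of course, a linear model would miss exactly the nonlinearity conjectured above, so a tree-based or kernelized regressor might be a more honest choice.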