Distinguished Lecturers should not be vetoed by the US

I attended the IEEE Information Theory Society (ITSOC) Board of Governors meeting at ISIT in Paris this week, and something from the presentation about the Distinguished Lecturer (DL) program has been gnawing at me since. The presentation said that “IEEE denied the selection of a DL based in Iran due to U.S. sanction.” The name of the particular DL nominee does not appear in the public record.

Why can IEEE deny the selection of a DL? Part of the answer is that there are now requirements for DLs:

  • DLs should visit IT Society local chapters.
  • The DL program pays for airfare and travel; the local chapter pays for local expenses (hotel).
  • If traveling to a different continent, visits to two locations are required.
  • DL lectures should be freely accessible to the public (i.e., no registration fees).

A DL from Iran cannot be reimbursed by IEEE because the IEEE is based in the US and has to abide by US law. Under the new rules, then, scholars from Iran are automatically disqualified from the DL program.

Being a DL is an important recognition: it is arguably an award. It certainly bestows a certain level of prestige. Acceding to this intervention by IEEE sends the message that “if you are from Iran, you can’t get an award.” Once we go down this road we might as well ban conference submissions, membership, and participation in the academic community for scholars from Iran. Why not go whole hog and become a tool of the US State Department? It’s ludicrous.

ITSOC should not sit by and passively accept this “veto” from IEEE: it’s an assault on academic freedom that devalues scholarship on purely political grounds. Declining even to name the nominee erases the honor to which they are entitled. In fact, they should be given the honor/award with the stipulation that the (unpayable) reimbursement is waived. It is possible to take a stand without violating the law: recognize this scholar and publicly oppose the encroachment of American foreign policy onto an international academic community.

That review is so… meta

Reviewing has started for NeurIPS 2019 and this time around I am an area chair (AC). We’ve been given a lot of instructions and some tasks: bidding on papers, bidding on reviewers, adjusting reviewers, identifying what we think are likely rejects in the batch of papers we are handling, and so on. It’s a little more involved than being an AC for ICML, but that’s to be expected since the whole reviewing game has been evolving rapidly to adapt to the massive increase in submissions.

Since there is yet another tier of TPC above the ACs (the Senior ACs), how should one approach the meta-review? One view is that the meta-review is the AC’s decision/opinion, informed by the reviews, the response, the discussion, and their own reading of the paper. This makes the AC a bit like an associate editor at a journal. This also gives the AC quite a bit of flexibility: if the discussion is limited or not particularly useful, the AC can fill in the gap by adding their own voice. The downside is that ACs might bring more of their own preferences (or biases) to the process.

A different approach is to make the meta-review akin to a panel summary as part of an NSF review. In the panels I have been on, there are N people who write reviews of each proposal, one of whom leads the discussion. There is also a scribe for the discussion who has not written a review: a dispassionate observer. The whole panel (even those who didn’t read the proposal) participates in the discussion. The scribe is supposed to draft a summary/synthesis of the discussion and runs it past the panel for edits until they reach a consensus. The N reviews are still there though, with their diversity of opinion.

I think I might prefer the second model. The setup is a bit different, since authors get to respond to the reviews. The meta-review is supposed to augment the existing reviews by incorporating the discussion and author response. The AC is supposed to guide the discussion, which is a role shared by the lead discussant and program officer in the NSF model. The only problem is that the amount of discussion on each paper is highly variable. It’s sometimes like pulling teeth to get reviewers to respond/interact. Reviewers, for their part, might be participating in 5 different discussions, so context switching to each paper can be tough. But for papers with some reasonable discussion, the meta-review as panel summary might be a good way to go.

One complaint about panel summaries is that they often feel anodyne. However, I think this might be desirable in a meta-review, since it could lead to fewer angry authors. One aspect of the NSF model which I think could be adopted, regardless of how the AC views their job, is running the meta-review past the reviewers. I did this for ICML and got some edits and feedback from the reviewers that improved the final review.

CFP: Theory and Practice of Differential Privacy (TPDP) 2019

November 11
London, UK
Colocated with CCS 2019

Differential privacy is a promising approach to privacy-preserving data analysis.  Differential privacy provides strong worst-case guarantees about the harm that a user could suffer from participating in a differentially private data analysis, but is also flexible enough to allow for a wide variety of data analyses to be performed with a high degree of utility.  Having already been the subject of a decade of intense scientific study, it has also now been deployed in products at government agencies such as the U.S. Census Bureau and companies like Apple and Google.
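As a concrete illustration of such a worst-case guarantee, here is a minimal sketch of the textbook Laplace mechanism applied to a counting query (the data and parameter values are made up for illustration; this is not tied to any particular deployment):

```python
import random

def laplace_noise(scale):
    """A Laplace(0, scale) sample, via the difference of two exponentials."""
    return random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)

def private_count(records, predicate, epsilon):
    """Release a counting query under epsilon-differential privacy.
    Adding or removing one record changes a count by at most 1, so the
    sensitivity is 1 and the required noise scale is 1/epsilon."""
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon)

random.seed(0)
ages = [23, 45, 31, 62, 38, 29, 55, 41]  # toy data
noisy = private_count(ages, lambda a: a >= 40, epsilon=0.5)
print(f"noisy count of records with age >= 40: {noisy:.2f}")
```

Smaller epsilon means a stronger guarantee and a noisier answer, and the guarantee holds no matter what the other records in the dataset are.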

Researchers in differential privacy span many distinct research communities, including algorithms, computer security, cryptography, databases, data mining, machine learning, statistics, programming languages, social sciences, and law.  This workshop will bring researchers from these communities together to discuss recent developments in both the theory and practice of differential privacy.

Specific topics of interest for the workshop include (but are not limited to):

  • theory of differential privacy,
  • differential privacy and security,
  • privacy preserving machine learning,
  • differential privacy and statistics,
  • differential privacy and data analysis,
  • trade-offs between privacy protection and analytic utility,
  • differential privacy and surveys,
  • programming languages for differential privacy,
  • relaxations of the differential privacy definition,
  • differential privacy vs other privacy notions and methods,
  • experimental studies using differential privacy,
  • differential privacy implementations,
  • differential privacy and policy making,
  • applications of differential privacy.

Submissions

The goal of TPDP is to stimulate discussion about the relevance of differentially private data analyses in practice. For this reason, we seek contributions from different research areas of computer science and statistics.  Authors are invited to submit a short abstract (4 pages maximum) of their work. Submissions will undergo a lightweight review process and will be judged on originality, relevance, interest, and clarity. Submissions should describe novel work, or work that has already appeared elsewhere but that can stimulate discussion between different communities at the workshop. Accepted abstracts will be presented at the workshop either as a talk or a poster.  The workshop will not have formal proceedings and is not intended to preclude later publication at another venue. Authors of selected papers from the workshop will be invited to submit a full version of their work for publication in a special issue of the Journal of Privacy and Confidentiality.

Submission website: https://easychair.org/conferences/?conf=tpdp2019

Important Dates

Submission: June 21 (anywhere on earth)

Notification: August 9

Workshop: November 11

Program Committee

  • Michael Hay (co-chair), Colgate University
  • Aleksandar Nikolov (co-chair), University of Toronto
  • Aws Albarghouthi, University of Wisconsin–Madison
  • Borja Balle, Amazon
  • Mark Bun, Boston University
  • Graham Cormode, University of Warwick
  • Rachel Cummings, Georgia Tech
  • Xi He, University of Waterloo
  • Gautam Kamath, University of Waterloo
  • Ilya Mironov, Google Research – Brain
  • Uri Stemmer, Ben-Gurion University
  • Danfeng Zhang, Penn State University

For more information, visit the workshop website at https://tpdp.cse.buffalo.edu/2019/.

Signal boost: travel grants for SPAWC 2019

Passing a message along for my colleague Waheed Bajwa:

As the US Liaison Chair of IEEE SPAWC 2019, I have received NSF funds to support travel of undergraduate and/or graduate students to Cannes, France for IEEE SPAWC 2019. Having a paper at the workshop is not a prerequisite for these grants and a number of grants are reserved for underrepresented minority students whose careers might benefit from these travel grants. Please share this with any interested students and, if you know one, please encourage her/him to consider applying for these grants.

ICML 2019 encouraged code submission. That is great!

ICML 2019 had an optional code submission for papers. As an area chair, I handled a mix of papers, some more theoretical than others, but almost all of them had some empirical validation. Not all of them submitted code. For a paper with a theorem, the experiments can range from sanity checks to a detailed exploration of the effects of some parameters for problem sizes of interest. For more applied/empirical papers, the experiments are doing the heavy lifting of making a case. A survey just went out to Area Chairs asking to what degree code submission was taken as a factor in our recommendations to the senior program committee.

Absent a compelling reason not to submit code, I think that ensuring some form of reproducibility is important for both transparency and the open communication of ideas. Reviewers already approach reading a paper with some skepticism — the burden of proof is on the authors to make a compelling argument in their paper. But if the argument is largely empirical (e.g. “this heuristic works very well for problem A”) then the burden of proof consists of making a case that the experiments, as described in the paper, were in fact carried out and not mere fabrications. How better to do that than to provide the implementation of the method?

Providing implementations is not always possible: examples abound in multiple fields, including electrical engineering. In antenna design the schematic might be provided in the paper, but the actual fabricated antenna and anechoic chamber are not available to the reviewers. Nobody seems to think this is a problem: reviewers somehow trust that the authors are not making things up. Shouldn’t we trust ML authors as well?

One factor that makes a difference is that conferences are just not as competitive outside of computer science. Conferences have a short review period in which to evaluate a large volume of papers. The prestige conferred by getting a paper accepted to a top CS conference is often compared to getting a paper accepted to a top journal. Authors benefit a lot from the research community accepting their paper. It is only appropriate that they also share a lot.

Let’s take an example. Suppose you are working in academia, have developed a new method for solving Problem X, and are going to launch a startup based on this method. How much more appealing would you be to funders if you had one (or more!) ICML papers about how you’ve totally nailed Problem X, showing that you are a total rockstar in the ML/AI community? But your competitive advantage might be at risk if reviewers (and then later the community) have access to your code. So you write a paper where you discuss the main ideas behind your approach and give the experimental results, but no implementation and none of the 5 other things you had to do to make the thing actually work. In this case you’re getting the stamp of approval while not sharing with the rest of the research community.

Of course, one can imagine that submissions from industry authors might rely on proprietary code bases which they cannot (for policy reasons) provide. An academic conference is about the open and free exchange of ideas, knowledge, and techniques. It seems that a trade show would be a more appropriate venue for showing the results without sharing the methods. I’m not trying to suggest that industry researchers are nefarious in some way, but it’s important to think about the incentives and benefits. The rules for submission (in this case code submission) articulate some of the values of the research community. Encouraging (but not requiring) code submission requires authors to signal (and allows reviewers to consider) whether they agree to the social contract.

Readings

I’m on sabbatical now, which ostensibly means having a bit more time to read things (technical and not). I’m not exactly burning through books but 2019 has already had a few good ones.

Convenience Store Woman (Sayaka Murata): The book has a lot of critical acclaim but I can see many readers being put off by it: the story is pretty disturbing in the end. The narrator and protagonist is (I think) neuro-atypical, which comes across in the writing (I’d love to read some notes from the translator). On the other hand, it is also a critique of Japanese work culture, I think, although not the usual office-drone/salaryman/Aggretsuko type, which is refreshing.

Golden Hill (Francis Spufford): This was a really great picaresque with a lot of detail about early New York that made me want to tour around lower Manhattan with a copy of the manuscript to trace out some of the locations. There’s a twist (as always!) but I don’t want to give it away. Spufford, as usual, has a real ear for the language: although it took a little getting used to, I eventually settled in and it was a real page-turner.

The Ministry of Pain (Dubravka Ugrešić): A novel by a Balkan author living in Amsterdam about a Balkan refugee teaching Balkan literature in Amsterdam to (mostly) other refugees. There’s a lot about language and the war and “our language” as she puts it. The story unfolds slowly but I think the atmosphere and ideas were what I appreciated the most about it. The discomfort of addressing while not reenacting trauma is palpable.

Binti: Home and Binti: The Night Masquerade (Nnedi Okorafor): Books 2 and 3 in a sci-fi trilogy. The world expands quite a bit beyond the first one and I thought Binti’s character arc was quite dramatic. I wish there had been more to learn about the other characters as well. But these books are novellas so perhaps I should do a bit more work to fill in the gaps with my imagination?

Stranger in a Strange Land: Searching for Gershom Scholem and Jerusalem (George Prochnik): A rather discursive biography of Gershom Scholem, who almost single-handedly (it seems) started the academic study of Kabbalah, interleaved with the author’s own memoir of moving to Jerusalem, taking up graduate studies, starting a family, and becoming disenchanted. I thought it was a stretch at times to relate the two, and since I knew nothing about Scholem beforehand, I found myself almost wanting two books: a straight biography and a straight memoir. Both had their merits, but the alternation made it a bit of a slog to read.

gender inclusivity in communication models

I submitted a paper to ISIT in which I tried something different. It’s about communication models with a jammer, so there are three parties: Alice, Bob, and the jammer. Alice wants to send a message to Bob. The jammer wants to prevent it from being reliably received.

We always use Alice for the encoder/transmitter. Knowing nothing more than the name, I would use the pronouns she/her/hers to refer to Alice.

We always use Bob for the decoder/receiver. Knowing nothing more than the name, I would use the pronouns he/him/his to refer to Bob.

What about the jammer? In previous papers (and in our research discussions) we called the jammer various names: Calvin (to get the C) or James (for the J). We ended up also using he/him/his for the jammer too.

This time I proposed we use Jamie for the jammer. Knowing nothing more than the name, I suggested they/them/their as the most appropriate. In my mind, Jamie may be gender nonconforming, right?

At this point many readers (if there are any) would say I’m being a bit on the nose. Why make James into Jamie and why deliberately change the pronouns? Won’t it just confuse people?

There are so many responses to this.

First, just on pragmatics: this makes the pronouns uniquely decodable across the parties in the communication model. What could be clearer?

Second, if pronouns create a problem for a mathematically minded reader, then they are far too attached to the (gendered) Alice/Bob metaphor. It’s a mathematical engineering paper, not a children’s story.

But finally, and most importantly, even though all the authors of this paper may be cis-gendered, writing the stories in our papers in a more inclusive way is the right thing to do. Why Alice and Bob? Why not Aarti and Bhaskar, Anting and Bolei, Avital and Binyamin, or Arash and Babak? I’ve heard arguments that we should be more ecumenical in the national origin of our communicating parties. Can we be more inclusive by gender as well?

We can and should!

What signals are sent by author lists

I recently had a conversation about ordering of author lists for papers. Of course, each field has its own conventions but as people start publishing in multiple communities’ venues things can get a bit murky. There are pros and cons and different people have different values, etc.

This is all standard and has been hashed to death.

But what happens when you merge two papers with different author lists? Alphabetical makes things very easy, but if you go a different route, then the primary authors have to slug it out to see who gets first author credit. To split the difference, you could put a footnote saying that authors are in alphabetical order. In the conversation, it came up that putting the footnote implies that there was some tension between the two author groups and so this was the compromise solution after a debate. That was new to me: is this the correct inference to make in most cases?

Condorcet Paradoxes and dice

NB: don’t take this as a sign that I’ve brought the blog back to life for real. I’ve made too many unfulfilled promises on that front…

There was a talk at UChicago in the fall (I’m still on the seminar mailing lists) given by Jan Hązła based, I think, on a paper titled “The Probability of Intransitivity in Dice and Close Elections” with Elchanan Mossel, Nathan Ross, and Guangqu Zheng. The abstract was quite interesting and led to a discussion with my colleagues Emina Soljanin and Roy Yates in which I realized I didn’t quite get the result so I promised to come back after reading it more carefully. Fast forward several months and now I am in Berkeley (on a pre-tenure sabbatical) and they are back east so I figured I could at least blog about it.

The problem is about intransitive dice, which was a new term for me. Consider an n-sided die with face values \mathbf{a} = (a_1, a_2, \ldots, a_n) and call \sum_{i=1}^{n} a_i the face sum. The die is fair, so the expected face value is \frac{1}{n} \sum_{i=1}^{n} a_i. We can define an ordering on dice by saying \mathbf{a} \succ \mathbf{b} if a uniformly chosen face of \mathbf{a} is larger than a uniformly chosen face of \mathbf{b} more often than not. That is, if you roll both dice then on average \mathbf{a} would beat \mathbf{b}.

A collection of dice is intransitive if the relation \succ based on dice beating each other cannot be extended to a linear order. The connection to elections is in ranked voting: an election in which voters rank candidates may exhibit a Condorcet paradox, in which people’s pairwise preferences form a cycle: A beats B, B beats C, but C beats A in pairwise contests. (As an aside, in the election data we looked at for my paper on instant runoff voting, we rarely (maybe never?) saw a Condorcet cycle.)

Suppose we generate a die randomly with face values drawn from the uniform distribution on [-1,1], conditioned on the face sum being equal to 0. Then as the number of faces n \to \infty, three such independently generated dice form an intransitive cycle with probability bounded away from zero (see the Polymath project).

However, it turns out that this is very special to the uniform distribution. What this paper shows (among other things) is that if you generate the faces from any other distribution (but still condition the face sum to be 0), the dice are in fact transitive with high probability. This is interesting to me because it shows the uniform distribution as rather precariously balanced: any shift and a total linear order pops out. But this also makes some sense: the case where the dice become intransitive happens when individuals are essentially choosing a random permutation of the candidates as their preferences. In fact, the authors show that if you generate voters/votes this way and then condition on the election being “close,” you get a higher chance of turning up a Condorcet paradox.
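Out of curiosity, here is a quick Python sketch of the uniform case (my own, not from the paper). Sampling exactly from the conditional distribution given a zero face sum is awkward, so the sketch just centers i.i.d. uniform faces, which only approximates the conditioning:

```python
import random

def make_die(n):
    """Draw n face values i.i.d. uniform on [-1, 1], then center them
    so the face sum is exactly 0 (an approximation of conditioning)."""
    faces = [random.uniform(-1.0, 1.0) for _ in range(n)]
    mean = sum(faces) / n
    return [f - mean for f in faces]

def beats(a, b):
    """True if a uniformly chosen face of a beats a uniformly chosen
    face of b more often than not: wins minus losses over all pairs."""
    score = sum((x > y) - (x < y) for x in a for y in b)
    return score > 0

def is_intransitive(a, b, c):
    """Do the three dice form a cycle under the beats relation?"""
    return (beats(a, b) and beats(b, c) and beats(c, a)) or \
           (beats(b, a) and beats(c, b) and beats(a, c))

random.seed(0)
n, trials = 60, 100
cycles = sum(is_intransitive(make_die(n), make_die(n), make_die(n))
             for _ in range(trials))
print(f"{cycles}/{trials} random triples were intransitive")
```

Swapping random.uniform for, say, random.gauss inside make_die is one way to eyeball the paper’s contrast between the uniform and non-uniform cases.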

The details of the proofs are a bit hairy, but I often find ranking problems neat and interesting. Maybe one day I will work on them again…

PLoS One and its absurdly short review times

I was asked to review a manuscript for PLoS One recently and declined because they asked for a review in 10 days. This might be standard for biology papers or something, but seems absurd for a paper where the reviewer is asked to sign off on technical correctness for something which may entail a fair bit of math. This sort of one-size-fits-all approach to academic practice drives me nuts. It’s the same kind of thing that leads to workshops on grant writing led by someone who has had a lot of success writing grants to one program at NIH/NSF/wherever and then dispenses advice specific to that area with almost zero recognition that different programs/agencies have different priorities. Wow, context matters! Who knew?

Now, a reasonable claim is that 10 days at 8 hours a day is 80 hours and that is a totally reasonable amount of time to check all the math in a paper, assuming I had nothing else to do with my time. A friend told me their advisor had a policy to decline a review if they couldn’t do it in the next week. This strikes me as an admirable approach to things that probably worked well in the 80s.

However, given that 50% of papers are accepted to PLoS on a pay-to-publish model, what is the prior belief that spending even 30 minutes of my time reading the paper is worthwhile? Far better to spend 10 minutes complaining about it on a nearly defunct blog, no?