CRA Best Practices on Mentoring Postdocs

I just got the CRA newsletter, and it had a link to a document on best practices for mentoring postdocs:

… data from the Computing Research Association’s (CRA) annual Taulbee Survey indicate that the numbers of recent Ph.D.s pursuing postdocs following graduate school soared from 60 in 1998 to 249 in 2011 (three-year rolling averages), an increase of 315 percent during this period. Because research organizations are suddenly channeling many more young researchers into these positions, it is incumbent upon us as a community to have a clear understanding of the best practices associated with pursuing, hosting, and nurturing postdocs.

I think you’d find the same numbers in EE as well. This report relies a fair bit on the National Academies report, which is a little out of date and I thought very skewed towards those in the sciences. Engineering is a different beast (and perhaps computer science an even more different beast), so I think that while there are some universal issues, the emphasis and importance of different aspects vary across fields quite a bit. For example, the NA report focuses quite a bit on fairness in recruiting, a concern that is predicated on the postdoc being a “normal” thing to do. By contrast, in many engineering fields postdoc positions are relatively new and there’s an opportunity to define what the position means and what it is for (i.e. not a person you can pay cheaply to supervise your graduate students for you).

Anyway, it’s worth reading!

Scholarly communication in conferences

A few weeks (!) ago I was talking with an anthropologist friend of mine about how different fields have different modes of communicating research “findings” in the conference setting. Some places people just read their paper out loud, others have slide presentations, yet others have posters, and I imagine some people do blackboard talks. Of course, conferences have many purposes — schmoozing, job hunting, academic political wrangling, and so on. What is unclear to me is why particular academic communities have settled on particular norms for presenting their work.

One axis along which to understand this might be the degree to which the presentation of the paper is an advertisement for the written paper. In many humanities conferences, people simply read their paper out loud. You’d think that theater researchers would be able to make a more… dramatic reading of their work, but you’d be wrong much of the time. It’s very hard to sit and listen and follow a jargon-heavy analysis of something that you probably have never read about (e.g. turn of the century commercial theater in Prague), and in some sense I feel that the talk as an advertisement for the paper is minimal here.

On the other hand, a poster session maximizes the “advertisement of the paper” aspect. People stand there for 5 minutes while you explain the ideas in the paper, and if it seems sufficiently interesting then they will go and read the actual paper. A difference from the model in the humanities is that here there is a paper in the proceedings, while in humanities conferences this is not necessarily the case.

Slide presentations are somewhere in the middle — I often go to a talk at a conference and think “well, now I don’t need to read the paper.” These are the trickiest because the audience is captive but you cannot give them the full story. It’s more of a method for luring already-interested people into wanting to read the paper rather than the browsing model of a poster session.

However, even this “advertisement” categorization raises the question of why we have poster sessions, slide presentations, and paper readings. Are these the best way to present the research in those fields? Should we have more posters at ISIT and fewer talks (more like NIPS)? Should NIPS have more parallel sessions to reflect the spread of interest in the “community?” Should anthropology conferences have each panelist give an 8 minute slide presentation followed by real discussion?

I missed ITW in Lausanne this year, but I heard that they mixed up the format to great success. More posters and fewer talks meant more interaction and more discussion. I think more experimenting could be good — maybe some talks should be given as chalk talks with no slides!

Is there an incentive for misrepresentation?

I was recently reading a paper on arXiv that is from the VLDB 2012 conference:

Functional Mechanism: Regression Analysis under Differential Privacy
Jun Zhang, Zhenjie Zhang, Xiaokui Xiao, Yin Yang, Marianne Winslett

The idea of the paper is to make a differentially private approximation to an optimization by perturbing a Taylor series expansion of the objective function, which is an interesting idea. However, what caught my eye was how they referred to an earlier paper of mine (with Kamalika Chaudhuri and Claire Monteleoni) on differentially private empirical risk minimization. What we did in that paper was look at the problem of training classifiers via ERM and the particular examples we used for experiments were logistic regression and SVM.

In the VLDB paper, the authors write:

The algorithm, however, is inapplicable for standard logistic regression, as the cost function of logistic regression does not satisfy convexity requirement. Instead, Chaudhuri et al. demonstrate that their algorithm can address a non-standard type of logistic regression with a modified input (see Section 3 for details). Nevertheless, it is unclear whether the modified logistic regression is useful in practice.

This is just incorrect. What we look at is a fairly standard formulation of logistic regression with labels in {-1,+1}, and do the standard machine learning approach, namely regularized empirical risk minimization. The objective function is, in fact, convex. We further do experiments using that algorithm on standard datasets. Perhaps the empirical performance was not as great as they might like, but then they should make a claim of some sort instead of saying it’s “unclear.”
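To make the convexity point concrete, here is a sketch of the usual L2-regularized logistic regression objective with labels in {-1,+1}. The notation (w for the weight vector, λ > 0 for the regularization parameter, n for the number of training points, σ for the logistic function) is assumed here for illustration and is not copied from either paper.

```latex
% Regularized ERM objective for logistic regression, y_i \in \{-1,+1\}
J(w) = \frac{1}{n} \sum_{i=1}^{n} \log\!\left(1 + e^{-y_i w^\top x_i}\right)
       + \frac{\lambda}{2} \|w\|_2^2

% Hessian, using y_i^2 = 1 and \sigma(z) = 1/(1 + e^{-z})
\nabla^2 J(w) = \frac{1}{n} \sum_{i=1}^{n}
       \sigma(y_i w^\top x_i)\bigl(1 - \sigma(y_i w^\top x_i)\bigr)\, x_i x_i^\top
       + \lambda I \;\succeq\; \lambda I
```

Each term in the sum is a nonnegative multiple of the positive semidefinite matrix x_i x_i^T, so the Hessian is at least λI and the objective is (strongly) convex in w.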

They further claim:

In particular, they assume that for each tuple t_i, its value on Y is not a boolean value that indicates whether t_i satisfies certain condition; instead, they assume y_i equals the probability that a condition is satisfied given x_i… Furthermore, Chaudhuri et al.’s method cannot be applied on datasets where Y is a boolean attribute…

Firstly, we never make this “assumption.” Secondly, we do experiments using that algorithm on standard datasets where the label is binary. Reading this description was like being in a weird dream-world in which statements are made up and attributed to you.

Naturally, I was a bit confused about this rather blatant misrepresentation of our paper, so I emailed the authors, who essentially said that they were confused by the description in our paper and that more technical definitions are needed because we are from “different communities.” They claimed that they emailed questions about it but we could not find any such emails. Sure, sometimes papers can be confusing if they are out of your area, but to take “I don’t understand X” to “let me make things up about X” requires a level of gumption that I don’t think I could really muster.

In a sense, the publication incentives are stacked in favor of this kind of misrepresentation. VLDB is a very selective conference, so in order to make your contribution seem like a big deal, you have to make it seem that alternative approaches to the problem are severely lacking. However, rather than making a case against the empirical performance of our method, this paper just invented “facts” about our paper. The sad thing is that it seems completely unnecessary, since their method is quite different.

Readings

Snuff [Terry Pratchett] : this was standard Discworld stuff, but I found it a little below-average. I’m finicky that way though.

Snakes Can’t Run [Ed Lin] : a follow-up to This Is A Bust, this book is about human smuggling in New York Chinatown in the 70s. Recommended if you like things Asian and mysterious.

Shark’s Fin and Sichuan Pepper: A Sweet-Sour Memoir of Eating in China [Fuchsia Dunlop] : I found this memoir to be engaging but it’s definitely got that feel of “Western person’s observations about China.” Dunlop is more aware of her situation as outsider/observer, but sometimes it’s hard to shake that narrative vibe. That being said, you should definitely read this if you want to know more about Chinese cuisines.

Among Others [Jo Walton] : it won a Nebula and a Hugo and I could see why. This is a really sharply observed and narrated coming-of-age story about a high school girl who is not part of the “main crowd” and finds her solace in voraciously reading all of the SciFi/Fantasy novels she can get her hands on. Really lovely writing.

The Lost Soul of Higher Education [Ellen Schrecker] : a pretty sobering read with a lot of historical background on the state of academic freedom, the corporatization of the university system, and possible ramifications for the future of the US. It was a bit depressing but well worth reading.

One Day in the Life of Ivan Denisovich [Aleksandr Solzhenitsyn] : this is Solzhenitsyn’s first book, a first-person narrative of one person’s life in a Stalinist labor camp. It really brings the grimness of the place alive — Cool Hand Luke’s prison camp had nothing on this. They’re worth comparing, I think.

Linkage : Black Friday edition

This is an amazing video that makes me miss the Bay Area. (via Bobak Nazer)

Also via Bobak, we’re number 8 and 10!

Since it’s holiday season, I figured it’s time to link to some profanity-laden humor about the holidays. For the new, The Hater’s Guide to the Williams-Sonoma Catalog, and the classic It’s Decorative Gourd Season….

A Game of Food Trucks. (via MetaFilter)

Larry Wasserman takes on the Bayesian/Frequentist debate.

LCD Soundsystem + Miles Davis youtube mashup.

My friend Erik, who started the Mystery Brewing Company, has a blog called Top Fermented. He is now starting a podcast, which also has an RSS feed.

Linkage

New(ish) policies at the NSF — read up if you are planning on writing some grants! h/t to Helena, who sent this in aaaaages ago.

I’m not sure I agree that these are the 10 must-listen, but it’s something at least.

This article on Jonah Lehrer is quite interesting. I think there are some things to be learned here for academic writers as well…

I forgot to add a link to Suhas Mathur’s blog, sorry!

“bibimbap is a tool to import BibTeX entries from various sources. It runs in the console and is designed to be simple and fast. bibimbap is simply the easiest way to manage your BibTeX bibliographies. Be advised that it still won’t read the papers for you, though.” Looks like it could be awesome. h/t to Manu.

ICML reviewing absurdity

I’m a reviewer for ICML 2013, which has a novel submission format this year. Papers for the first cycle were due October 1. They received more submissions than they expected (by a significant factor), but I was only assigned papers to review today, more than 2 weeks later. We have been given 2 weeks to submit reviews; given my stack, that’s 2 weeks’ notice to review ~60 pages of material.

I may be going out on a limb here, but I think that the review quality is not going to be that high this time. Perhaps this is a Mechanical Turk approach to the problem — get a bunch of cheap noisy labels and then hope that you can get a good label by majority vote?
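For what it's worth, the majority-vote intuition can be made quantitative with a toy model. The sketch below assumes (unrealistically) that each reviewer independently makes the right accept/reject call with the same probability; the function name and the numbers are hypothetical and only illustrate the calculation.

```python
from math import comb


def majority_correct_prob(n_reviewers: int, p_correct: float) -> float:
    """Probability that a strict majority of n independent reviewers,
    each correct with probability p_correct, reaches the right decision."""
    if n_reviewers % 2 == 0:
        raise ValueError("use an odd number of reviewers to avoid ties")
    needed = n_reviewers // 2 + 1  # smallest strict majority
    # Sum the binomial probabilities of getting at least `needed` correct votes.
    return sum(
        comb(n_reviewers, k)
        * p_correct**k
        * (1 - p_correct) ** (n_reviewers - k)
        for k in range(needed, n_reviewers + 1)
    )


if __name__ == "__main__":
    for n in (1, 3, 5):
        print(n, round(majority_correct_prob(n, 0.7), 4))
```

Under these assumptions, adding noisy reviewers helps only if each one is right more often than not, and correlated errors (everyone rushing through the same stack) would weaken the effect.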

Update: We’ve been given another week, hooray.

Linkage

An initiative to prevent irreproducible science.

A video about Graham’s number.

I don’t tweet, but all of this debate seems ridiculous to me. I think the real issue is: who follows Twitter? I know Sergio is on Twitter, but is anyone else?

Food : An Atlas is a book project on kickstarter by people who do “guerrilla cartography.” It is about food, broadly construed. $25 gets you a copy of the book, and it looks awesome, especially if you like maps. And who doesn’t like maps?

I remember reading about the demise of the American Chestnut tree, but apparently it may make a comeback!

eduroam is awesome

At Allerton I finally set up the eduroam network on my phone and laptop. It was great — with the UIUC system you had to log in with a special temporary ID and password each time, but with eduroam it would automatically connect and authenticate like any other normal wireless network.

Basically you use the same login/password as for other authenticated services on your campus. At Chicago it’s called your CNetID, but the credentials will be different from place to place. The key is that you validate to the network using those credentials and not some locally-given account.

It seems that the system has been expanding — if your institution doesn’t support it then you should ask them to do so. Of course, maybe we should just have more open networks, but at least with this you can get wifi on many campuses without having to deal with the bureaucratic overhead of the IT services.

What we want are young fresh faces

Via SEK I read about a job ad at Colorado State which specifies that applicants should have received their PhD from 2010 onwards. Sorry 2009-ers, you’re not eligible.

According to the Colorado State jobs site:

Colorado State University does not discriminate on the basis of race, age, color, religion, national origin or ancestry, sex, gender, disability, veteran status, genetic information, sexual orientation, or gender identity or expression. Colorado State University is an equal opportunity/equal access/affirmative action employer fully committed to achieving a diverse workforce and complies with all Federal and Colorado State laws, regulations, and executive orders regarding non-discrimination and affirmative action. The Office of Equal Opportunity is located in 101 Student Services.

Because I was interested, I called the Office of Equal Opportunity at (970) 491-5836 to see if their office had vetted the ad and whether or not requiring a post-2010 PhD constituted age discrimination. This is what they told me, not specifically about this position, but in general about ads of this type:

  • Fort Collins is a nice place to live and they get lots of people who apply to jobs who have more experience (even at the associate level, so they claim). They want to make it very clear that those people are not welcome to apply.
  • It’s a salary thing where they don’t want to pay for the experience of someone with a PhD pre-2010.
  • They want to be “very clear about what they are looking for” (sounds ominous!)
  • This is a way of cutting down on the number of applications they have to read. Sure they may lose out on some qualified candidates. Implicit in this is that it’s an employer’s market out there.

All in all, this is not a sanguine situation for job applicants. I bet it would be nice if Michael Bérubé could weigh in on this in his capacity as MLA President.