Papers: you know, to organize ’em

I ponied up the money and bought Papers recently — it’s not perfect but it does let me store all of those pesky PDFs I have lying around in a convenient single location.

The program acts like “iTunes for your papers.” It has its own internal storage system (which is also customizable) and lets you create collections (think playlists). The best feature is the interface to various repositories such as PubMed, arXiv, JSTOR, ACM, and Web of Science. It technically lets you search IEEE Xplore as well, but IEEE just upgraded their system (color me unimpressed), which broke the current version of Papers’ search interface. I’m sure it will get fixed soon enough.

What I wish it let you do was tag papers so that you can click on a tag to see all papers on that topic; the functionality is technically there, but it’s not obvious how to use it. I’d also like the BibTeX to be attached to the paper file as metadata, so that I could integrate it better with BibDesk. I had contemplated getting DEVONthink to organize all of my files, but I felt like that was overkill.

Does anyone else out there have a killer system for organizing papers? I know it’s just a crazy dream that I’ll actually get a chance to read most of the papers I have sitting on my hard drive, but I’ll be more likely to read ’em if I can find ’em.

That viral singing Russian video

Several people have sent this video to me:

I have to admit, I was baffled by it, but assumed that it was some sort of Soviet television entertainment for which I lacked the context. However, I learned a lot more from this fascinating blog post by Justin Smith.

some privacy humor

Ever since I started working on privacy problems (better living through statistics!), I have been struck by the generally fatalistic view most people have about privacy. “The credit card companies know everything about me already,” “Google could easily steal my identity,” and so on. When sentiments like this become so widespread, they are fodder for humorists. (via Celeste LeCompte).

The “parity check” on credit card numbers

Via Lifehacker, I came across this short description of how credit card numbers are coded, and how the last digit is a parity check. It’s a cute example of “real world” error-detection that pretty much anyone could understand. Cute extra-credit problem: how many valid credit card numbers are there out there? (This reminds me of a USAMTS problem from ages past).
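
For the curious, here is a rough sketch of the check, assuming the scheme in question is the standard Luhn (mod 10) algorithm; the linked description may differ in details. The function names (`luhn_checksum`, `is_valid`, `check_digit`) are my own, chosen for illustration. Starting from the rightmost digit, every second digit is doubled (subtracting 9 if the result exceeds 9), and the total must be divisible by 10.

```python
def luhn_checksum(digits: str) -> int:
    """Return the Luhn sum mod 10 for a string of digits (0 means it passes)."""
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9  # same as adding the two digits of the doubled value
        total += d
    return total % 10


def is_valid(number: str) -> bool:
    """Check a card-like number, ignoring spaces and dashes."""
    digits = number.replace(" ", "").replace("-", "")
    return digits.isdigit() and luhn_checksum(digits) == 0


def check_digit(payload: str) -> int:
    """Compute the final digit that makes payload + digit pass the check."""
    return (10 - luhn_checksum(payload + "0")) % 10


if __name__ == "__main__":
    print(is_valid("4111 1111 1111 1111"))  # a widely used test number
    print(is_valid("4111 1111 1111 1112"))  # one digit changed
    print(check_digit("411111111111111"))
```

Under this scheme any single mistyped digit changes the sum mod 10, so it always gets caught. As a hint for the extra-credit problem: every 15-digit prefix has exactly one check digit that works, so counting 16-digit strings that pass the check is easy; counting numbers that issuers would actually hand out is the harder part.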

Noose at UCSD Library

A noose was found at the UCSD library, and the campus police have issued a report calling it “[h]anging a noose with intent to terrorize.” It’s unclear if the incident is related to the recent outcry over the off-campus “Compton Cookout” party. During a recent teach-in, there was a massive walk-out by students angry at the school’s response. There are going to be new protests today over the noose incident.

update: I say that it’s unclear if it is connected but what I really mean is that it’s not clear if the party organizers are involved. The timing is too close for it to be unrelated. I would not be surprised if it turns out to be some idiot’s bad idea of a “joke” or “site-specific provocative art.”

update 2: A student has contacted the police and confessed to placing the noose. No other news, however.

update 3: Students are occupying the chancellor’s office.

update 4: More updates here.

Pitfalls in author ordering

Apparently the number of co-authored papers in political science is on the rise, and there are questions on how to order the author names. I had never heard the phrase “the tyranny of the alphabet” before to refer to alphabetical author ordering, but I know that since conventions are different in math/statistics, computer science, and electrical engineering, there ends up being a lot of confusion (esp. on the part of graduate students) as to who actually did “most of the work” on a paper. Fan Chung Graham gives a succinct description of an ideal:

In math, we use the Hardy-Littlewood rule. That is, authors are alphabetically ordered and everyone gets an equal share of credit. The one who has worked the most has learned the most and is therefore in the best position to write more papers on the topic.

This ideal doesn’t really hold in electrical engineering (or computer science, for that matter), and can lead to some dangerous assumptions when people’s conventions vary or when you are doing interdisciplinary work.


Privacy Workshop at IPAM

I’m at the Institute for Pure and Applied Math at a workshop on Statistical and Learning-Theoretic Challenges in Data Privacy. It’s a rather diverse group of computer scientists, statisticians, medical informaticists, and policy researchers, and I feel a bit like I’m the only electrical engineer here. It’s been pretty educational in the “learning about new problems” way, but I think by the time Friday morning rolls around I’ll be suffering from information overload. The nice thing is that the slides are posted online so I can refresh my memory when I get back. There’s also a poster session for some more recent results.

Most of the speakers have been talking about (a) applications of the differential privacy model to some problems (e.g. data release, function computation, classification, PAC-learning, auctions, Google’s MapReduce, the Census Bureau’s OnTheMap, and PINQ), (b) areas in which privacy is a real problem (hospital discharge data and the dangers of re-identification, genome-wide association studies (GWAS), query logs from search engines), or (c) bridges between fields and their privacy definitions and models.

I’ve just started working in this area, so I’m still processing things (the talks range from high level to technical, and I often lack the background to understand fully what’s going on). I might blog a bit more about it as things come up.

notes on a review

I received the following TPC review recently (a rejection):

The current version of the paper is incomplete, as important proofs (the novel results…) are left to supplementary material. This can be resolved, however would require a major structural change.

I think I would have preferred the TPC to simply say “we had too many papers, and yours wasn’t in the top X%,” rather than append this completely nonsensical reason for rejection. We put the proofs in the supplementary material because of space constraints. We could just as easily have omitted other things and put the proofs in the main body by doing some minor cutting and pasting. It may be a “major structural change,” but it’s also trivial. Perhaps they thought the paper was poorly written, but they did not say that.

Of course I’m disappointed that the paper wasn’t accepted, especially given that all the reviewers recommended acceptance. It’s clear that the real reason the TPC rejected us was that the scores were not high enough and they had to reject a lot of papers. It sucks to be on the bad side of a subjective decision, but it happens to everyone. Making up a pseudo-objective reason is about as useful as a little white lie. As it is, this description is about as principled as “your paper has too many authors,” or “your bibliography is too long” or “we cannot accept any more papers starting with the letter D.” There’s always the next deadline, anyway.

Bach Collegium on KPBS

I sang this morning with other members of the Bach Collegium San Diego on the KPBS radio show These Days. Our director, Ruben, and two singers, Anne-Marie and Martha, talked about how we’re going for Baroque this weekend in singing all six of Bach’s motets.

Also featured: comparisons between zoning out while singing Bach and wandering into a dark alley in Leipzig (followed by the Brandenburg Boys? You don’t want to mess with them!)