An article against tenure

Inside Higher Ed has an opinion essay by UToronto’s Mark Kingwell arguing against the institution of tenure, or rather arguing that it should be revisited. Some of the comments are as interesting as the article itself; it’s worth a read.

As an aside, I disagree with Kingwell, but I don’t have a fully articulated critique (yet).

A hodgepodge of links

My friend Reno has a California Bankruptcy Blog.

The ISIT 2010 site seems quite definitive, no? (h/t Pulkit.)

The Times has a nice profile of Martin Gardner.

My buddy, buildingmate at UCSD, and fellow MIT thespian Stephen Larson premiered the Whole Brain Catalog at the Society for Neuroscience conference.

A fascinating article on the US-Mexico border (h/t Animikwaan.)

Kanye West is an oddly compelling trainwreck. (via MeFi).

Allerton 2009 : quick takes

I was at the Allerton conference last week for a quick visit home — I presented a paper I wrote with Can Aysal and Alex Dimakis on analyzing broadcast-based consensus algorithms when channels vary over time. The basic intuition is that if you can get some long-range connections “enough of the time,” then you’ll get a noticeable speed-up in reaching a consensus value, but with high connectivity the variance of the consensus value from the true average increases. It was a fun chance to try out some new figures that I made in OmniGraffle.
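To give a flavor of the broadcast updates (this is a toy sketch, not the algorithm from our paper; the mixing weight `gamma`, the ring topology, and the round count are all illustrative assumptions):

```python
import random

def broadcast_gossip(values, neighbors, gamma=0.5, rounds=5000, seed=0):
    """Toy broadcast gossip: a random node wakes up and broadcasts its
    value; each node that hears it moves a fraction (1 - gamma) of the
    way toward the broadcast value.  Illustrative only."""
    rng = random.Random(seed)
    x = list(values)
    for _ in range(rounds):
        i = rng.randrange(len(x))          # node i wakes up and broadcasts
        for j in neighbors[i]:             # its neighbors mix toward x[i]
            x[j] = gamma * x[j] + (1 - gamma) * x[i]
    return x

# 4-node ring: nodes agree on *a* value, but broadcast updates do not
# preserve the average exactly, so the consensus value is random — the
# variance-vs-connectivity tradeoff mentioned above.
ring = [[1, 3], [0, 2], [1, 3], [0, 2]]
x = broadcast_gossip([0.0, 1.0, 2.0, 3.0], ring)
```

Adding long-range edges to `neighbors` speeds up agreement, at the cost of more spread in where the consensus lands.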

The plenary was given by Abbas El Gamal on a set of new lecture notes on network information theory that he has been developing with Young-Han Kim. He went through the organization of the material and tried to show how much of the treatment could be developed using robust typicality (see Orlitsky-Roche) and a few basic packing and covering lemmas. This in turn simplifies many of the proofs structurally and can give a unified view. Since I’ve gotten an inside peek at the notes, I can say they’re pretty good and clear, but definitely dense going at times. They are one way to answer the question of “well, what do we teach after Cover and Thomas.” They’re self-contained and compact.
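For the curious, the robust typicality in question (going back to Orlitsky and Roche) uses a multiplicative rather than additive slack — my paraphrase of the standard definition:

```latex
\mathcal{T}_\epsilon^{(n)}(X) = \left\{ x^n : \left| \frac{N(a \mid x^n)}{n} - p(a) \right| \le \epsilon\, p(a) \ \text{for all } a \in \mathcal{X} \right\}
```

where N(a | x^n) counts the occurrences of the letter a in x^n. One nice consequence of the multiplicative slack: letters with p(a) = 0 never appear in a typical sequence, which keeps the packing and covering arguments tidy.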

What struck me is that these are really notes on network information theory and not on advanced information theory, which is the special topics class I took at Berkeley. It might be nice to have some of the “advanced but non-network” material in a little set of notes. Maybe the format of Körner’s Fourier Analysis book would work: it could cover things like strong converses, delay issues, universality, wringing lemmas, AVCs, and so on. Almost like a wiki of little technical details and so on that could serve as a reference, rather than a class. The market’s pretty small though…

I felt a bit zombified during much of the conference, so I didn’t take particularly detailed notes, but here were a few talks that I found interesting.

  • Implementing Utility-Optimal CSMA (Lee, Jinsung (KAIST), Lee, Junhee (KAIST), Yi, Yung (KAIST), Chong, Song (KAIST), Proutiere, Alexandre (Microsoft Research), Chiang, Mung (Princeton University)) — There are lots of different models and algorithms for distributed scheduling in interference-limited networks. Many of these protocols involve message passing, and the overhead from the messages may become heavy. This paper looked at how to use the implicit feedback from CSMA. They analyze a simple two-link system with two parameters (access and hold) and then use what I thought was an “effective queue size” scheduling method. Some of the analysis was pretty tricky, using stochastic approximation tools. There were also extensive simulation results from a real deployment.
  • LP Decoding Meets LP Decoding: A Connection between Channel Coding and Compressed Sensing (Dimakis, Alexandros G. (USC), Vontobel, Pascal O. (Hewlett-Packard Laboratories)) — This paper proved some connections between Linear Programming (LP) decoding of LDPC codes and goodness of compressed sensing matrices. A simple example: if a parity check matrix is good for LP decoding then it is also good for compressed sensing. In particular, if it can correct k errors it can recover k-sparse signals, and if it can correct a specific error pattern \vec{e} it can recover all signals whose support is the same as that of \vec{e}. There were many other extensions of this result as well, to other performance metrics and also to the Gaussian setting.
  • The Compound Capacity of Polar Codes (Hassani, Hamed (EPFL), Korada, Satish Babu (EPFL), Urbanke, Ruediger (EPFL)) — Rudi gave an overview of polar codes from two different perspectives and then showed that in general polar codes do not achieve the compound channel capacity, but it’s not clear if the problem is with the code or with the decoding algorithm (so that’s still an open question).
  • Source and Channel Simulation Using Arbitrary Randomness (Altug, Yucel (Cornell University), Wagner, Aaron (Cornell University)) — The basic source simulation problem is to simulate an arbitrary source Y_1^n with an iid process (say equiprobable coin flips) X_1^n. Using the information spectrum method, non-matching necessary and sufficient conditions can be found for when this can be done. These are in terms of sup-entropy rates and so on. Essentially though, it’s a theory built on the limits or support of the spectra of the two processes X and Y. If the entropy spectrum of X dominates that of Y then you’re in gravy. This paper proposed a more refined notion of dominance which looks a bit like a kind of majorization. I think they showed that it is sufficient, but maybe it’s necessary as well. Things got a bit rushed towards the end.
  • Channels That Die (Varshney, Lav R. (MIT), Mitter, Sanjoy K. (MIT), Goyal, Vivek K (MIT)) — This paper looked at a channel which has two states, alive (a) and dead (d), and at some random time during transmission, the channel will switch from alive to dead. For example, it might switch according to a Markov chain. Once dead, it stays dead. If you want to transmit over this channel in a Shannon-reliable sense you’re out of luck, but what you can do is try to maximize the expected number of bits you get through before the channel dies. I think they call this the “volume.” So you try to send a few bits at a time (say b_1, b_2, \ldots b_k are the number of bits in each chunk). How do you size the chunks to maximize the volume? This can be solved with dynamic programming. There are still several open questions left, but the whole construction relies on using “optimal codes of finite blocklength” which also needs to be solved. Lav has some interesting ideas along these lines that he told me about when I was in Boston two weeks ago…
  • The Feedback Capacity Region of El Gamal-Costa Deterministic Interference Channel (Suh, Changho (UC Berkeley), Tse, David (UC Berkeley)) — This paper found a single letter expression for the capacity region, which looks like the Han-Kobayashi achievable region minus the two weird 2 R_i + R_j constraints. Achievability uses Han-Kobayashi in a block-Markov encoding with backward decoding, and the converse uses the El Gamal-Costa argument with some additional Markov properties. For the “bit-level” deterministic channel model, there is an interpretation of the feedback as filling a “hole” in the interference graph.
  • Upper Bounds to Error Probability with Feedback (Nakiboglu, Baris (MIT), Zheng, Lizhong (MIT)) — This is a follow-on to their ISIT paper, which uses a kind of “tilted posterior matching” (a phrase which means a lot to people who really know feedback and error exponents in and out) in the feedback scheme. Essentially there’s a tradeoff between getting close to decoding quickly and minimizing the error probability, and so if you change the tilting parameter in Baris’ scheme you can do a little better. He analyzes a two phase scheme (more phases would seem to be onerous).
  • Infinite-Message Distributed Source Coding for Two-Terminal Interactive Computing (Ma, Nan (Boston University), Ishwar, Prakash (Boston University)) — This looked at the interactive function computation problem, in which Alice has X^n, Bob has Y^n and they want to compute some functions \{f_A(X_i,Y_i) : i \le n\} and \{f_B(X_i,Y_i) : i \le n\}. The pairs (X_i, Y_i) are iid and jointly distributed according to some distribution p_{XY}. As an example, (X_i,Y_i) could be correlated bits and f_A and f_B could be the AND function. In previous work the authors characterized the rate region (allowing vanishing probability of error) for t rounds of communication, and here they look at the case when there are infinitely many back-and-forth messages. The key insight is to characterize the sum-rate needed as a function of p_{XY} — that is, to look at the function R : \mathcal{P}_{XY} \to \mathbb{R} and look at that surface’s analytic properties. In particular, they show it’s the minimum element of a class \mathcal{F} of functions which have some convexity properties. There is a preprint on ArXiV.
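The chunking tradeoff in the “Channels That Die” problem is fun to play with numerically. Here is a minimal sketch of the dynamic program, under assumptions that are entirely my own: a geometric death time with per-use probability `P_DEAD`, and a crude fixed per-chunk coding overhead standing in for the finite-blocklength codes the real construction needs:

```python
from functools import lru_cache

P_DEAD = 0.01    # assumed per-use probability that the channel dies
OVERHEAD = 8     # assumed fixed coding overhead (in channel uses) per chunk

@lru_cache(maxsize=None)
def best_volume(remaining):
    """Max expected number of bits delivered with `remaining` channel
    uses left, given that the channel is still alive right now."""
    best = 0.0
    for b in range(1, remaining - OVERHEAD + 1):
        n = b + OVERHEAD                  # uses consumed by a b-bit chunk
        p_survive = (1 - P_DEAD) ** n     # chunk lands only if alive throughout
        best = max(best, p_survive * (b + best_volume(remaining - n)))
    return best
```

Longer chunks amortize the overhead but risk losing more bits when the channel dies mid-chunk; the recursion balances the two.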

Information theory, emotion, music

From The Information Theory of Emotions of Musical Chords, posted on ArXiV today:

My basic hypothesis is as follows. While perceiving a separate musical chord, the value of a relative objective function L that is directly related to the main proportion of pitches of its constituent sounds is generated in the mind of the subject. The idea of an increasing value of the objective function (L > 1) accompanied by positive utilitarian emotions corresponds to a major chord, while the idea of a decreasing value of the objective function (L < 1) accompanied by negative utilitarian emotions corresponds to a minor chord.

I naturally thought of The Mathematics of Love.

Posting has been light

Due to extensive traveling around. At the end of August and beginning of September I attended a workshop at the American Institute of Mathematics on Permanents and modeling probability distributions. It was a lot of fun, and quite different from any workshop I’d been to before. There were only 20 or so participants and every day we’d break out in smaller groups to actually work on some sub-problem or area. By the end of the week I felt like I’d had a crash course in large-alphabet probability estimation and also got a sense of the huge range of problems and techniques (from exchangeable random partition processes to Markov chain Monte Carlo) involved in tackling them.
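As a taste of what “large-alphabet probability estimation” is about, here is the classic Good–Turing estimate of the missing mass — the probability that the next sample is a symbol you have never seen. (A bare-bones sketch; the workshop covered far more sophisticated machinery.)

```python
from collections import Counter

def good_turing_missing_mass(samples):
    """Good–Turing estimate of the missing mass: the fraction of the
    sample made up of symbols seen exactly once estimates the
    probability that the *next* draw is a brand-new symbol."""
    counts = Counter(samples)
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / len(samples)
```

On small samples from a huge alphabet this simple ratio is a far better estimate of the unseen probability than the naive empirical estimate of zero.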

Last week I gave a talk on gossip algorithms at UC Davis and got a chance to talk with a lot of people there. It was a great trip, except for the 6:30 flight time out of San Diego. Then last weekend there were heirloom tomatoes:

Heirloom Tomato with bread and Cowgirl Creamery Red Tail cheese

And also peach pie:
Peach pie...

Mmmm, pie.

Criticism of open access is backwards, as usual

Inside Higher Ed has a piece today on the presidents of liberal arts colleges writing to support the Federal Research Access Act of 2009. The law would make federal agencies that sponsor research come up with methods for archiving and publishing research that they fund so it would be “made immediately available” to the public. It would (essentially) apply only to journal papers, which raises a question about computer science, which lives the fast-and-dangerous conference life.

The article ends with a reaction from “Martin Frank, executive director of the American Physiological Society and coordinator of the Washington D.C. Principles for Free Access to Science,” who claimed that since there are many foreign journal subscribers, the argument that taxpayers should have access to the research is not very strong. Frank is concerned with non-profit publishers (such as professional societies like the IEEE), but in his eagerness to protect his own turf he completely ignores the fact that mega-publishers like Elsevier and the Nature Publishing Group are based in other countries. Elsevier is headquartered in Amsterdam and NPG is run by Macmillan, which “is itself owned by German-based, family run company Verlagsgruppe Georg von Holtzbrinck GmbH.”

If Mr. Frank wants to make a nativist argument against an open access mandate, then perhaps he should support a ban on wasting American taxpayer dollars to fund foreign publishing houses. The whole “taxpayer” argument in the end is marketing for both sides — although in principle any citizen should have access to government-funded research, the real volume comes from universities and industry. Federal money is used many times over for the same piece of research — once to fund it and then once for every (public) university library which has to buy a subscription to the journal where the result was published. University libraries will not stop subscribing to the IEEE journals just because the NSF and DARPA funded research will be made available in (probably separate) repositories run by the NSF and DARPA. If a non-profit is publishing its journals at cost then they should still be affordable. The for-profit publishers are the ones who will have to realize that the “value added” by the Nature brand is not worth the markup they charge.

Samidh Chakrabarti on Transacting Philosophy

I recently re-read my old roommate Samidh Chakrabarti’s master’s thesis: Transacting Philosophy: A History of Peer Review in Scientific Journals (Oxford, 2004). It’s a fascinating history of scientific publishing from the Royal Society up to the present, and shows that “peer review has never been inseparable from the scientific method.” His analysis is summed up in the following cartoon, which shows three distinct phases of peer review:
[Figure: Samidh’s cartoon of the three phases of peer review]
When there are few journals but a large supply of papers, peer review is necessary to select the papers to be published. However, when printing became cheap in the 19th century, everybody and their uncle had a journal and sometimes had to solicit papers to fill their pages. After WWII the trend reversed again, so now peer review is “in.” In this longish post I’m going to summarize/highlight a few things I learned.

The first scientific journal was started by the Royal Society, called Philosophical Transactions: giving some Account of the Present Undertakings, Studies and Labours of the Ingenious in many considerable Parts of the World, but is usually shortened to Phil. Trans. Henry Oldenburg, the secretary of the Society, came up with the idea of using referees. Samidh’s claim is that Oldenburg was motivated by intellectual property claims. Time stamps for submitted documents would let philosophers establish when they made a discovery — Oldenburg essentially made Phil. Trans. the arbiter of priority. However, peer review was necessary to provide quality guarantees, since the Royal Society was putting their name on it. He furthermore singled out articles which were not reviewed by adding the following disclaimer:

sit penes authorem fides [let the author take responsibility for it]: We only set it downe, as it was related to us, without putting any great weight upon it.

Phil. Trans. was quite popular but not profitable. The Society ended up taking over the full responsibility (including fiscal) of the journal, and decided that peer review would not be about endorsing the papers or guaranteeing correctness:

And the grounds of their choice are, and will continue to be, the importance or singularity of the subjects, or the advantageous manner of treating them; without pretending to answer for the certainty of the facts, or propriety of the reasonings, contained in the several papers so published, which must still rest on the credit or judgment of their respective authors.

In the 19th century all this changed. Peer review began to smack of anti-democracy (compare this to the intelligent design crowd now), and doctors of medicine had been upset ever since the Royal Society rejected Edward Jenner’s 1796 report on the smallpox vaccine for having too small a sample size. Peer review made it tough for younger scientists to be heard, and politics played no small role in papers getting rejected. Those journals which still practiced peer review sometimes paid a hefty price. Samidh writes of Einstein:

In 1937 (a time when he was already a celebrity), he submitted an article to Physical Review, one of the most prestigious physics journals. The referees sent Einstein a letter requesting a few revisions before they would publish his article. Einstein was so enraged by the reviews that he fired off a letter to the editor of Physical Review in which he strongly criticized the editor for having shown his paper to other researchers… he retaliated by never publishing in Physical Review again, save a note of protest.

The 19th century also saw the rise of cheap printing and the industrial revolution which created a larger middle class that was literate and interested in science. A lot hadn’t been discovered yet, and an amateur scientist could still make interesting discoveries with their home microscope. There was a dramatic increase in magazines, journals, gazettes, and other publications, each with their own editor, and each with a burning need to fill their pages.

The content of these new scientific journals became a reflection of the moods and ideas of their editors. Even the modern behemoths, Science and Nature, used virtually no peer review. James McKeen Cattell, the editor of Science from 1895 to 1944, got most of his content from personal solicitations. The editor of Nature would just ask people around the office or his friends at the club. Indeed, the Watson-Crick paper on the structure of DNA was not reviewed because the editor said “its correctness is self-evident.”

As the 20th century dawned, science became more specialized and discoveries became more rapid, so that editors could not themselves curate the contents of their journals. As the curve shows, the number of papers written started to exceed the demand of the journals. In order to maintain their competitive edge and get the “best” papers, peer review became necessary again.

Another important factor was the rise of Nazi Germany and the corresponding decline of German science as Jewish and other scientists fled. Elsevier hired these exiles to start a number of new journals with translations into English, and became a serious player in the scientific publishing business. And it was a business — Elsevier could publish more “risky” research because it had other revenue streams, and so it could publish a larger volume of research than other publishers. This was good and bad for science as a whole — journals were published more regularly, but the content was mixed. After the war, investment in science and technology research increased; since the commercial publishers were more established, they had an edge.

How could the quality of a journal be measured?

Eugene Garfield came up with a method of providing exactly this kind of information starting in 1955, though it wasn’t his original intent. Garfield was intrigued by the problem of how to trace the lineage of scientific ideas. He wanted to know how the ideas presented in an article percolated down through other papers and led to the development of new ideas. Garfield drew his inspiration from law indexes. These volumes listed a host of court decisions. Under each decision, they listed all subsequent decisions that used it as a precedent. Garfield realized that he could do the same thing with scientific papers using bibliographical citations. He conceived of creating an index that not only listed published scientific articles, but also listed all subsequent articles that cited each article in question. Garfield founded the Institute for Scientific Information (ISI) to make his vision a reality. By 1963, ISI had published the first incarnation of Garfield’s index, which it called the Science Citation Index.

And hence the impact factor was born — a ratio of citations to citable articles. This proved to be helpful to librarians as well as tenure and promotion committees. They just had to look at the aggregate impact of a professor’s research. Everything became about the impact factor, and the way to improve the impact factor of a journal was to improve the quality (or at least perceived quality) of its peer review. And fortunately, most of it was (and is) given for free — “unpaid editorial review is the only thing keeping the journal industry solvent.” However, as Samidh puts it succinctly in his thesis:
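Concretely, the standard two-year impact factor is just a ratio (the numbers below are made up for illustration):

```python
def impact_factor(citations_this_year, citable_items_prev_two_years):
    """Two-year impact factor: citations received this year to items
    published in the previous two years, divided by the number of
    citable items published in those two years."""
    return citations_this_year / citable_items_prev_two_years

# Hypothetical journal: 210 citations in 2009 to 70 papers from 2007-08.
impact_factor(210, 70)  # → 3.0
```

The denominator counts only “citable” items (research articles, not editorials), which is itself a source of gaming.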

All of this sets aside the issue of whether the referee system in fact provides the best possible quality control. But this merely underscores the fact that in the historical record, the question of peer review’s efficacy has always been largely disconnected from its institutionalization. To summarize the record, peer review became institutionalized largely because it helped commercial publishers inexpensively sustain high impact factors and maintain exalted positions in the hierarchy of journals. Without this hierarchy, profits would vanish. And without this hierarchy, the entire system of academic promotion in universities would be called into question. Hence, every scientist’s livelihood depends on peer review and it has become fundamental to the professional organization of science. As science is an institution chiefly concerned with illuminating the truth, it’s small wonder, then, that editorial peer review has become confused with truth validation.

It seems all like a vicious cycle — is there any way out? Samidh claims that we’re moving to a “publish, then filter” approach where things are put on ArXiV and then are reviewed. He’s optimistic about “a system where truth is debated, not assumed, and where publication is for the love of knowledge, not prestige.” I’m a little more dubious, to be honest. But it’s a fascinating history, and some historical perspective may yield clues about how to design a system with the right incentives for the future of scientific publishing.