Feller’s anti-Bayesianism

I didn’t really realize that in Feller’s classic probability book he had the following dismissal of Bayesian statistics:

Unfortunately Bayes’ rule has been somewhat discredited by metaphysical applications of the type described above. In routine practice, this kind of argument can be dangerous. A quality control engineer is concerned with one particular machine and not with an infinite population of machines from which one was chosen at random. He has been advised to use Bayes’ rule on the grounds that it is logically acceptable and corresponds to our way of thinking. Plato used this type of argument to prove the existence of Atlantis, and philosophers used it to prove the absurdity of Newton’s mechanics. In our case it overlooks the circumstance that the engineer desires success and that he will do better by estimating and minimizing the sources of various types of errors in predicting and guessing. The modern method of statistical tests and estimation is less intuitive but more realistic. It may be not only defended but also applied.” — W. Feller, 1950 (pp. 124-125 of the 1970 edition)

A few weeks ago, a little note on Feller’s anti-Bayesianism was posted to ArXiV. It’s a bit of an emotional read; a mathematical Op-Ed if you will. However, it does present an interesting perspective on historical “received wisdom” in light of more modern approaches to statistics and Bayesian data analysis. As an example, take the methods from Michael Jordan’s talk at ISIT (video and slides on the ITSOC website now!), using which you can do some cross-validation to see that they are indeed producing the correct results on real data.

What I am missing (as an outsider to the internecine battles of statistics) is an even-handed explanation of what all the hullabaloo is about. Such an article probably exists, but I haven’t seen it…

Postdocs at UC have unionized

By now it’s officially official, but postdoc employees across the UC system unionized and got a contract with the university. Here are a few of the good things that came out:

  • experience-based minimum salary steps according to the NIH/NRSA pay scale – these are minimum pay guidelines, so PIs who feel generous can of course pay more. Why are minimum pay requirements important? Many postdocs are here for 5 years. With durations like that, the position is not “training,” it’s a job. And therefore we should treat it like a job. Prior to this contract, many postdocs were receiving well below the minimum that NIH recommends, even though they were funded by NIH grants. In addition, you cannot be a postdoc for more than 5 years — after that you should be hired as a staff scientist. Some PIs oppose this, because postdocs are “cheaper” than staff scientists.
  • health insurance – the UC administration wanted to slash benefits in a way that would ultimately end up cutting compensation.
  • workplace safety – suppose the lab you work in is unsafe, but if you report any violations your PI may fire you. Does that seem fair?

There are a lot of other things in there, especially with regards to time off, parental leave, and so on. There is a pernicious attitude in the sciences that if you have kids while a grad student/postdoc/pre-tenure faculty you are “not serious about your career.” If you have 6 years of grad school and then 5 years of postdoc and then start a tenure-track job, and wait to have kids until after tenure, you might be 38 or 40. Breaking this attitude is hard, but it’s really starts with establishing basic expectations and treating employees like people.

And that is why this contract is important. I think of it as a restructuring of the playing field — without rules from the University as a whole, PIs are incentivized to pay postdocs as little as possible and work them as hard as possible, trading on the reputation of their lab and the University to make the deal more palatable. This is not to say most PIs do this, but certainly some do. With this contract there is a minimum set of rules by which PIs have to play, rules which are in fact in accordance with recommendations by funding agencies.

In talking about this with faculty from different places, I’ve heard diverse perspectives on why this is a difficult thing for them to accept, the fears they have about being demonized or not being able to have the flexibility they feel is so important to being able to run the kind of research program that they desire. These are important concerns, and ones which can and should be explored as this new contract is implemented. However, I have heard no good proposals from them about how to address the real issues faced by postdoctoral employees, whereas this contract does just that.

EVT/WOTE ’10 : Panel on India’s Electronic Voting Machine

I’m attending the…

Panel on Indian Electronic Voting Machines (EVMs)
Moderator: Joseph Lorenzo Hall, University of California, Berkeley and Princeton University
Panelists: P.V. Indiresan, Former Director, IIT-Madras; G.V.L Narasimha Rao, Citizens for Verifiability, Transparency, and Accountability in Elections, VeTA; Alok Shukla, Election Commission of India; J. Alex Halderman, University of Michigan

The first speaker was G.V.L. Narasimha Rao, who is also a blogger on the topic of elections. He is a staunch opponent of Electonic Voting Machines (EVMs). He gave a summary of voting in India — until 1996, all voting was with paper ballots and hand counting. In 1998 there were some EVMs introduced in urban areas, and then in 2004 it moved entirely to EVMs. Vote confirmation was given by a beep, and there were several complaints of machine failure. His claim is that exit polling was accurate prior to 2004 and then after the introduction of EVMs, the exit polls diverged widely from the actual results. In these elections I believe the BJP got a drubbing from Congress (Rao probably got suspicious since he appears to be a BJP political analyst).

Next up was Alok Shukla, the Deputy Election Commissioner of India. He gave an overview of the EVMs in use in India. He gave a review of how India decided to move to EVMs (the Parliament ended up approving the use of EVMs). He claimed that a paper trail was not the solution (mostly due to infeasibility/cost/remoteness of polling locations, etc), and said solutions lie in better transparency and administrative oversight. His main answer to claims that the EVMs have been hacked is that the attacks are infeasible and detectable by election officials. Finally, he said essentially “different systems for different people” (or different strokes for different folks?).

The third speaker was J. Alex Halderman, who is one of the people who attacked the Indian EVM. He described how he got hold of an EVM and showed details on the insides. The first problem is that the devices can be duplicated (or fake ones could be substituted). Another issue is that verifying the code in the EVM is not possible (so they can be tampered with at the time of manufacture). Finally, the reported counts are stored in two EEPROMS which can be swapped out. There are two attacks (at least) that they performed. The first is to hack the display so that false counts are displayed on the LED. A bluetooth radio lets a mobile user select who should win. The second is to clip on a device to reprogram the EEPROMS. Full details will appear at CCS. Halderman’s last bit of news was that one of their co-authors in India, Hari K. Prasad, has been summoned by the police as a result of a criminal complaint that he stole the EVM, which seems like an attempt by the government of India to silence their critics. He called upon Shukla to drop the suit, who was rather upset by this public accusation.

The last panelist was P.V. Indiresan, who is on the advisory committee to the government. He discussed some new security features in EVMs, such as signatures to prevent tampering with the cable between the ballot unit (where people push buttons) and the control unit (which counts the ballots). He claimed that most of the attacks proposed so far are farfetched. Much of his latter complaints were to the effect that to break the EVM is a criminal act (which is a claim of security through obscurity). He ended with a plea to ask researchers to stop (!) hacking the EVMs because they “are working.”

To sum up : the Indian government says the system works and that there is no actual evidence of tampering (with the exception of Prasad, who apparently received stolen goods). Halderman says the attacks show that the system as a whole are not secure, and Rao says that the results are suspicious.

Shukla responded to critics that the Election Commission of India is willing to listen to critics and said that the only kind of attack that is of interest is one on a sealed machine. He reiterated the statement that Prasad was in receipt of stolen government property and needs to be questioned.

The Q&A was quite contentious. I might have more to say about it later… but wow.

EVT/WOTE ’10 : the keynote

I am at EVT/WOTE (Electronic Voting Technology Workshop/Workshop on Trustworthy Elections) today and tomorrow, and will try to blog about it a bit. The keynote today was given by Donetta Davidson, who runs the Election Assisstance Commission. She gave an overview of the EAC’s activities and priorities. The Q&A has focused a bit on how voting research is underfunded and that CS researchers want the EAC to lobby for more research funding. I guess some things don’t change much.

A proof that P is not equal to NP?

Today we had the last day of the information theory summer school and Rudiger Urbanke gave a lecture on LDPC codes, random graphs, polar codes and other ways to approach Shannon Capacity.

After it was over, I received a few emails about a potential proof that P is not equal to NP from Vinay Deolalikar.

A quick look at the proof seems (to my ignorant eyes) like a serious, well-written attempt, and some people who know more seem to agree.

Surprisingly, Replica symmetry breaking, random graphs (and how they are locally tree-like) and inference on graphical models (things we discussed in the school) seem to be central in the arguments.

Let’s hope the proof is correct!

David Blackwell has passed away

Via Inherent Uncertainty I learned that David Blackwell passed away on July 8th.

Prof. Blackwell’s original paper (with Leo Breiman and Aram Thomasian) on the arbitrarily varying channel was an inspiration to me, and he served on my thesis committee a scant 2 years ago.

I’ll always remember what he told me when I handed him a draft of my thesis. “The best thing about Bayesians is that they’re always right.”

ISIT 2010 : a few more talks

I think I lack the willpower to write up more notes on talks, and there are other things I’d like to blog about, but here are one or two sentences on some other talks that I found interesting… I also enjoyed the energy session on Friday and the talks by Sundeep and Galen on compressed sensing, but time has gotten the better of me. Next time, folks.

Channel Intrinsic Randomness
Matthieu Bloch
This was on extracting random bits from the output of noisy channel. These bits should be independent of the input to the channel. Matthieu uses the enigmatic information spectrum method to get his results — thanks to the plenary lecture I was able to understand it a bit better than I might have otherwise.

Assisted Common Information
Vinod Prabhakaran and Manoj Prabhakaran
I was very interested in this talk because I have been thinking of a related problem. Two terminals observe correlated sources X_1^n and Y_1^n respectively. A genie observes both sources and sends messages at rates R_1 and R_2 to the two terminals, who then have to produce variables W_1 and W_2 which have large entropies and are also equal with high probability. This problem is connected to the Gacs-Korner problem and Wyner’s common information problem, and also possibly this recent preprint by Andrej Bogdanov and Elchanan Mossel. They manage to solve it using a novel construction of “monotone regions.”

Patterns and Exchangeability
N. P. Santhanam and M. Madiman
This work grew out of the AIM workshop on permanents. De Finetti’s theorem says an exchangeable process can be built up as a mixture of iid processes. Kingman showed that something called an exchangeable partition process is built up from something he called “paintbox processes.” One thing that we discovered at the workshop was that the pattern process of an iid process is the same as a paintbox process (and vice-versa). The paper then goes through many connections between these processes, certain limits of graphs, and connections to universal compression.

Universal Hypothesis Testing in the Learning-Limited Regime
Benjamin G. Kelly, Thitidej Tularak, Aaron B. Wagner, and Pramod Viswanath
This was a really great talk. The problem here is that for each n you get n samples X^n distributed i.i.d. according to p_n or q_n on an alphabet \mathcal{A}_n which can grow with n. Given a new sequence Z^n you have to decide if it was generated according to p_n or q_n. They show a number of results which say that consistency is possible for |\mathcal{A}_n| sublinear in n, impossible for quadratic in n, and other intermediate results. In particular, for well-behaved distributions with |\mathcal{A}_n| = \Theta(n^{\alpha}) and all probabilities are \Theta(n^{-\alpha}), they can get some consistency results, but in particular the generalized likelihood ratio test (GLRT) is inconsistent for \alpha = 1.

Feature Extraction for Universal Hypothesis Testing via Rank-Constrained Optimization
Dayu Huang and Sean Meyn
This talk was of interest to me because I have been looking at hypothesis testing problems in connection with election auditing. In universal testing you know a lot about the distribution for one hypothesis, but much less about the other hypothesis. The Hoeffding test is a threshold test on the KL-divergence between the empirical distribution and the known hypothesis. This test is asymptotically optimal but has a high variance when the data is in high dimension. Thus for smaller sample sizes, a so-called mismatched divergence test may be better. In this paper they look at how to tradeoff the variance and the error exponent of the test.

ISIT 2010 : Abbas El Gamal and Te Sun Han

I seem to have gotten all behind on wrapping up the ISIT blogging, so the remainder may be more compressed takes on things. This is not in the compressed sensing world, where the signals are sparse and my comments are meant to reconstruct, but more like lossy compression where D \to \sigma^2 (for the Gaussian case).

Abbas El Gamal gave a very nice plenary on “Coding for Noisy Networks” in which he really brought together a lot of different eras and streams of work on network information theory and tried to tie them together in a conceptual framework. There was a nice mix of older and newer results. The thing I liked best about it was that he was very optimistic about making progress in understanding how to communicate in networks from an information-theory perspective, which counteracts the sentiment that I heard that “well, it’s just too messy.”

Te Sun Han gave the Shannon Lecture, of course, and he used his time to give a tutorial on the information spectrum method. I had tried to read the book earlier, and honestly found it a little impenetrable (or rather, I wasn’t sure what I was supposed to use from it). The talk was more like reading the papers — concisely stated, but with a clear line of intuition. I know some people are not a big fan of Shannon Lectures as tutorials, but I think there is also a case to be made that most people are unfamiliar with the information spectrum method. A nice example he gave was to show when the output of an optimal source coder looks “completely random.” Maybe this has been done already, but is there a connection between existing theories of pseudorandomness and the information spectrum method?