# Quote of the day : Herman Chernoff on Bayesians

I came across this sally in the Bayesian/frequentist wars:

In general, the religious Bayesian states that no good and only harm can come from randomized experiments. In principle, he is opposed even to random sampling in opinion polling. However, this principle puts him in untenable computational positions, and a pragmatic Bayesian will often ignore what seems useless design information if there are no obvious quirks in a randomly selected sample.

— Herman Chernoff, Sequential Analysis and Optimal Design, Philadelphia : SIAM, 1972

This doesn’t seem to capture the current state of things, but the upshot here is that Chernoff is calling shenanigans on the “philosophical consistency” of Bayesian statistics.

Sometimes I wonder if what is needed is a Kinsey scale for statistical practice… can one be Bayes-curious?

# MNIST error rates for SVM on projected data

I’ve started doing more machine learning research lately, which means I’ve been sullying my delicate theorist’s hands testing out my algorithms on data. Perhaps the most (over)used dataset is the MNIST handwritten digits collection, which has been put into MATLAB form by Sam Roweis (RIP). As a baseline, I wanted to see how an SVM would perform after I projected the data (using PCA) into the top 100 dimensions. The primal program is

$\min_{\mathbf{w},b} \frac{1}{2} \| \mathbf{w} \|_2^2 + C \sum_{i=1}^{n} z_i$
s.t. $y_i (\mathbf{w}^T \mathbf{x}_i + b) \ge 1 - z_i, \quad z_i \ge 0$
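To make the objective concrete, here is a minimal sketch in plain Python that evaluates the primal cost for a given $(\mathbf{w}, b)$. It assumes the usual soft-margin convention that the slacks are nonnegative, so for a fixed $(\mathbf{w}, b)$ the best slack is $z_i = \max(0, 1 - y_i(\mathbf{w}^T \mathbf{x}_i + b))$. The function name and the toy data are made up for illustration; this is not the code used for the experiments.

```python
def primal_objective(w, b, X, y, C):
    """Evaluate 0.5 * ||w||^2 + C * sum(z_i) with the smallest feasible slacks.

    Assumes z_i >= 0, so z_i = max(0, 1 - y_i * (w . x_i + b)).
    Returns the objective value and the list of slacks.
    """
    slacks = []
    for x_i, y_i in zip(X, y):
        margin = y_i * (sum(w_j * x_j for w_j, x_j in zip(w, x_i)) + b)
        slacks.append(max(0.0, 1.0 - margin))
    regularizer = 0.5 * sum(w_j * w_j for w_j in w)
    return regularizer + C * sum(slacks), slacks


# Toy example (hypothetical data): two points outside the margin,
# one correctly classified point inside it.
X = [[2.0, 0.0], [-2.0, 0.0], [0.5, 0.0]]
y = [1, -1, 1]
value, slacks = primal_objective([1.0, 0.0], 0.0, X, y, C=1.0)
print(value, slacks)
```

Only the third point pays a penalty here, since it is on the correct side but within the margin.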

I chose some “reasonable” value for C and tried to train a classifier on all pairs of digits and got the following error rates on the test set (in percentages, rounded).

0
0      0
0.56   0.43   0
0.33   0.45   2.37   0
0.04   0.06   1.17   0.23   0
1.02   0.11   1.89   3.77   0.72   0
0.52   0      1.31   0.08   0.60   1.66   0
0.01   0.15   1.01   0.80   0.80   0.42   0      0
0.43   1.15   2.22   2.69   0.38   3.41   0.54   0.47   0
0.20   0.14   0.85   1.13   3.03   1.02   0      3.82   1.27   0


The rows and columns are the digits from 0 to 9, so for example, the training error for classifying 0 versus 1 was zero percent, but it’s about 3.8 percent error to decide between 9 and 7. I did this to try and get a sense of which digits were “harder” for SVM to distinguish between so that I could pick a good pair for experiments, or better yet, to pick a pair based on a target error criterion. Running experiments on Gaussian synthetic examples is all fine and good, but it helps to have a range of data sets to test out how resilient an algorithm is to more noise, for example.
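Picking a pair by a target error criterion can be sketched in a few lines of Python. The numbers are copied from the table above; the helper name `closest_pair` and the choice of "smallest absolute gap to the target" as the criterion are my own assumptions for illustration.

```python
# Pairwise error rates (percent), copied from the table above.
# ERRORS[i][j] with j <= i is the error for digit i versus digit j.
ERRORS = [
    [0],
    [0, 0],
    [0.56, 0.43, 0],
    [0.33, 0.45, 2.37, 0],
    [0.04, 0.06, 1.17, 0.23, 0],
    [1.02, 0.11, 1.89, 3.77, 0.72, 0],
    [0.52, 0, 1.31, 0.08, 0.60, 1.66, 0],
    [0.01, 0.15, 1.01, 0.80, 0.80, 0.42, 0, 0],
    [0.43, 1.15, 2.22, 2.69, 0.38, 3.41, 0.54, 0.47, 0],
    [0.20, 0.14, 0.85, 1.13, 3.03, 1.02, 0, 3.82, 1.27, 0],
]


def closest_pair(target):
    """Return (smaller digit, larger digit, error) whose error is nearest to target."""
    best = None
    for i, row in enumerate(ERRORS):
        for j in range(i):  # skip the diagonal (a digit versus itself)
            gap = abs(row[j] - target)
            if best is None or gap < best[0]:
                best = (gap, j, i, row[j])
    _, a, b, err = best
    return a, b, err


print(closest_pair(1.0))  # a pair with roughly 1 percent error
print(closest_pair(4.0))  # the hardest pair in the table
```

Ties are broken by scan order (first pair found wins), which is arbitrary; a real experiment might prefer to list all pairs within some tolerance of the target.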

# CFP : GlobalSIP 2013 Deadline Extended to June 15

I’m on the program committee for the Cyber-Security and Privacy symposium, so I figured I would post this here to make more work for myself.

GlobalSIP 2013 – Call for Papers
IEEE Global Conference on Signal and Information Processing
December 3-5, 2013 | Austin, Texas, U.S.A.

GlobalSIP: IEEE Global Conference on Signal and Information Processing is a new flagship IEEE Signal Processing Society conference. The focus of this conference is on signal and information processing and up-and-coming signal processing themes.

GlobalSIP is composed of symposia on hot topics related to signal and information processing, selected based on responses to the call-for-symposia proposals.

The selected symposia are:

Paper submission will be online only through the GlobalSIP 2013 website. Papers should be in IEEE two-column format. The maximum length varies among the symposia; be sure to check each symposium’s information page for details. Authors of Signal Processing Letters papers will be given the opportunity to present their work at GlobalSIP 2013, subject to space availability and approval by the Technical Program Chairs of GlobalSIP 2013. The authors need to specify in which symposium they wish to present their paper. Please check the conference webpage for details.

Important Dates:
*New* Paper Submission Deadline – June 15, 2013
Review Results Announced – July 30, 2013
Camera-Ready Papers Due – September 7, 2013
*New* SPL request for presentation – September 7, 2013


# Persi Diaconis on coincidence

Persi Diaconis gave the second annual Billingsley Lecture at UChicago yesterday on the topic of coincidences and what a skeptical statistician/probabilist should say about them. He started out by talking about how Jung was fascinated by paradoxes (apparently there’s one about having fish come up all the time in conversation).

It was mostly a general-audience talk (with some asides about Poisson approximation), and the first part was on the birthday problem and variants. Abstracted away, the question is: given $n$ balls (people) and $C$ bins/categories (days), how big should $n$ be so that there’s an even chance that two balls land in the same bin? Turns out $n \approx 1.2 \sqrt{C}$, as we know, but we can expand this to deal with approximate matches (you need only 7 people to get 2 birthdays in the same week with probability around 1/2). If you want to put a graph on it you can ask social-network coincidence questions and get some scalings as a function of the number of edges and number of categories; here there are $n$ vertices and $C$ colors for the vertices. What these calculations show, of course, is that most coincidences are not so surprising, at least in this probabilistic sense. Some more advanced treatment might be found in Sukhada Fadnavis’s preprint (which also has something about a “shameful conjecture” on chromatic polynomials that was proved in 2000, but I don’t know why it is shameful).

The second part of the talk was on problems arising in the study of ESP, namely that experimental controls are not really present, so the notion of a “trial” is hard to pin down, leading (of course) to more false perceptions of coincidences as being surprising. He closed with some remarks about how our perception of coincidence is really about how our minds work, and pointed to some work by Ruma Falk for those who are interested in that angle of things.
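As a quick sanity check on the $1.2 \sqrt{C}$ rule for the birthday problem, here is a small Python sketch that computes the exact match probability and finds the smallest $n$ giving an even chance. It assumes all $C$ bins are equally likely and covers only exact matches, not the approximate (within-a-week) variant; the function names are mine. The constant is roughly $\sqrt{2 \ln 2} \approx 1.18$.

```python
import math


def match_probability(n, C):
    """Probability that at least two of n balls land in the same one of C bins.

    Assumes each ball picks a bin uniformly and independently.
    """
    p_all_distinct = 1.0
    for k in range(n):
        p_all_distinct *= (C - k) / C
    return 1.0 - p_all_distinct


def smallest_n_for_even_chance(C):
    """Smallest n for which the match probability is at least 1/2."""
    n = 1
    while match_probability(n, C) < 0.5:
        n += 1
    return n


# Compare the exact threshold with the 1.2 * sqrt(C) approximation.
for C in (52, 365, 10000):
    n = smallest_n_for_even_chance(C)
    print(C, n, round(1.2 * math.sqrt(C), 1))
```

For $C = 365$ this recovers the classic answer of 23 people.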

I was unaware of this body of Diaconis’s work, and it was nice to have a high-level talk to cap off the day.

# Capacity-achieving philately

The following email came through the ITSOC mailing list, but may be of interest to other readers of the blog.

Dear Colleagues,

We are making a proposal to the United States Postal Service for the production of a stamp honoring Claude Elwood Shannon on the 100th anniversary of his birth.

The proposal is available at: http://www.itsoc.org/about/shannons-centenary-us-postal-stamp/

We would love to have the endorsements of your friends, colleagues, department chair, dean, university president, CEO, government representatives, school-aged children, and the public at large. [Contact information for endorsing individuals will not be posted.]

Thanks for your support!
Best,
Michelle Effros

# Linkage (desi edition)

An op-ed from n+1 on the safety of being brown.

Via Mimosa (I think), a profile of photographer Nemai Ghosh, who worked with Satyajit Ray.

Via my father, the story of Indian Jewish actresses in early Bollywood.

Things seem to be heating up on the LAC. Not a good sign.

The death toll in Dhaka keeps rising. This makes Matthew Yglesias’s reaction (see a stunningly poor example of self-reflection here) a bit more than the usual brand of neoliberal odiousness.