I went to a workshop at the end of the summer at the American Institute of Mathematics on Permanents and modeling probability distributions. The main question we looked at was how to estimate a probability distribution from samples when you don’t know, for example, how many possible values there are. A simple version of this that is often mentioned is estimating the number of different butterfly species from a sample in which many species appear only once. As C.B. Williams wrote:

About 1940, a closer cooperation in a new field was started by a letter from Dr Steven Corbet, who had collected butterflies for many years in Malaya. On studying his collections after returning to England, he found that he had approximately 9000 specimens which included 316 species. Of these 118 were represented by only a single individual; 74 by 2 individuals, 44 by 3 and so on - the greater the number of individuals per species, the fewer were the number of species at that level. From this skew distribution it followed that the 37% rarer species accounted for only 1.3% of the total sample.
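The percentages in that last sentence can be checked from the counts in the quote. This is only a quick sketch, and it assumes (my reading, not something Williams states outright) that the "37% rarer species" are the 118 species seen exactly once, and that "approximately 9000" is close enough to use as the total. The variable names are mine.

```python
# Figures quoted in Williams's account of Corbet's collection.
total_specimens = 9000    # "approximately 9000 specimens"
total_species = 316
singleton_species = 118   # species represented by a single individual

# Each singleton species contributes exactly one specimen to the sample,
# so the same count is used in both ratios.
share_of_species = singleton_species / total_species
share_of_sample = singleton_species / total_specimens

print(f"{share_of_species:.1%} of species, {share_of_sample:.1%} of specimens")
# prints "37.3% of species, 1.3% of specimens"
```

So under that reading the singletons alone account for both numbers: more than a third of the species, but barely one percent of the individuals.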

R.A. Fisher, Corbet and Williams wrote a paper in the Journal of Animal Ecology in 1943 to estimate the number of species (“The relation between the number of species and the number of individuals in a random sample of an animal population”). Another famous example is estimating the number of words Shakespeare knew, as in the work of Thisted and Efron (popularized a bit more by the application to determining if a new poem is by Shakespeare or not).

At the workshop I learned from Susan Holmes about a collection of essays by S.L. Zabell called Symmetry and its Discontents, which I have been enjoying tremendously. Alon Orlitsky gave a short presentation orienting the workshop participants, and started out with a “simpler” problem of inductive inference: given that you saw a trial succeed p times and fail q times, what is the bias of the coin? Bayes thought about this a lot, as did Laplace, who wrote this infamous example (excerpted from Zabell’s book):

Thus we find that an event having occurred successively any number of times, the probability that it will happen again the next time is equal to this number increased by unity, divided by the same number, increased by two units. Placing the most ancient epoch of history at five thousand years ago, or at least 1826213 days, and the sun having risen constantly in the interval at each revolution of twenty-four hours, it is a bet of 1826214 to one that it will rise again tomorrow. [Essai philosophique p. xvii]

Laplace gets a lot of flak for this estimate, and it’s an infamous example of the “excesses of probabilistic reasoning.” But as I learned this morning, Laplace immediately went on to say:

But this number is incomparably greater for him who, recognizing in the totality of the phenomena the regulatory principle of days and seasons, sees that nothing at the present moment can arrest the course of it.

Laplace was advocating this method of calculation formally, as the correct way to compute a probability based only on the information from the sample (e.g. the number of repeated successes/events). As Persi Diaconis pointed out at the workshop, Laplace would be “turning over in his grave” at some of the things people have accused him of mathematically.
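For concreteness, the calculation Laplace describes is what is usually called the rule of succession: after p successes and q failures, the probability of success on the next trial is (p + 1)/(p + q + 2), which is what you get from a uniform prior on the coin's bias. Here is a minimal sketch that reproduces his sunrise odds; the function name is my own, and exact fractions are used just to keep the arithmetic visible.

```python
from fractions import Fraction


def rule_of_succession(successes: int, failures: int = 0) -> Fraction:
    """Probability that the next trial succeeds, assuming a uniform prior on the bias."""
    return Fraction(successes + 1, successes + failures + 2)


# Laplace's sunrise example: 1826213 days, with a sunrise on every one of them.
days = 1826213
p_sunrise = rule_of_succession(days)

# Convert the probability to odds in favor, p / (1 - p).
odds_in_favor = p_sunrise / (1 - p_sunrise)

print(p_sunrise)      # 1826214/1826215
print(odds_in_favor)  # 1826214, i.e. "a bet of 1826214 to one"
```

The same function covers the coin problem above: three successes and one failure give `rule_of_succession(3, 1)`, which is 2/3.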

And that is your historical nugget for the day.