Someone hands you a coin which has a probability $p$ of coming up heads. You can flip the coin as many times as you like (or more precisely, you can flip the coin an infinite number of times). Let $Q$ be the set of rational numbers in $[0,1]$. After each flip, you have to guess one of the following hypotheses: that $p = q$ for a particular $q \in Q$, or that $p$ is irrational. Furthermore, you can only make a finite number of errors for any $p \in [0,1] - N_0$, where $N_0$ is a set of irrationals of Lebesgue measure 0. Can you do it? If so, how?

This is the topic addressed by a pair of papers that Avrim Blum mentioned in Yoav Freund’s remembrances of Tom Cover:

Cover, T.M. (1973). On determining the irrationality of the mean of a random variable. Ann. Statist. 1, 862–871.

Cover, T.M. and Hirschler, P. (1975). A finite memory test of the irrationality of the parameter of a coin. Ann. Statist. 3, 939–946.

I’ll talk about the first paper in this post.

The algorithm is not too complicated; you basically go in stages. For each time $n$ you have a decision function $f_n$. Think of $f_n$ as piecewise constant. There are two sequences: a threshold $k_n$ on the number of rationals you consider, and an interval width $\Delta_n$.

- Take the sample mean $\hat{p}_n$ and look at an interval of width $\Delta_n$ centered on it. Note that this makes the same decision for each $n$ until $\Delta_n$ changes.
- Given an enumeration $\{q_k\}$ of the set $Q$, find the smallest $k$ such that $|q_k - \hat{p}_n| \le \Delta_n$.
- If there is such a $k$ with $k \le k_n$, then declare $p = q_k$; otherwise declare $p$ irrational.
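The steps above can be sketched in a few lines of Python. This is only an illustration: the enumeration of $Q$, the names `cover_decision`, `k_n`, and `delta_n`, and the exact interval rule are my own filling-in, not the precise schedule from Cover's paper.

```python
import itertools
from fractions import Fraction

def rationals():
    """Enumerate the rationals in [0,1]: 0, 1, 1/2, 1/3, 2/3, 1/4, 3/4, ..."""
    yield Fraction(0)
    yield Fraction(1)
    for d in itertools.count(2):
        for n in range(1, d):
            q = Fraction(n, d)
            if q.denominator == d:  # skip duplicates like 2/4 == 1/2
                yield q

def cover_decision(flips, k_n, delta_n):
    """One decision step after n flips (a sketch, not the paper's exact rule).

    flips   : list of 0/1 coin-flip outcomes so far
    k_n     : threshold -- how many rationals from the enumeration to consider
    delta_n : half-width of the interval around the sample mean

    Returns the first rational q among the first k_n in the enumeration
    with |q - sample mean| <= delta_n, or None ("declare p irrational").
    """
    mean = sum(flips) / len(flips)
    for q in itertools.islice(rationals(), k_n):
        if abs(float(q) - mean) <= delta_n:
            return q
    return None
```

For example, six flips alternating heads and tails give a sample mean of $1/2$, and `cover_decision([1, 0, 1, 0, 1, 0], k_n=20, delta_n=0.01)` would declare $p = 1/2$, since $1/2$ is the first rational in the enumeration that lands inside the interval.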

The last thing to do is pick all of these scalings. This is done in the paper (I won’t put it here), but the key thing to use is the law of the iterated logarithm (LIL), which I never really had a proper appreciation for prior to this. For $p \in (0,1)$ and any $c > 1$,

$$|\hat{p}_n - p| \le c \sqrt{\frac{2 p (1-p) \log \log n}{n}}$$

for all but finitely many values of $n$, almost surely. This gets used to set the interval width $\Delta_n$.
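To get a feel for how fast the LIL envelope shrinks, here's a quick numerical sketch. The scaling constant `c = 1.1` is an arbitrary choice for illustration; the paper's actual $\Delta_n$ schedule is chosen more carefully.

```python
from math import log, sqrt

def lil_width(n, p=0.5, c=1.1):
    """LIL envelope c * sqrt(2 p (1-p) loglog n / n) for the sample mean.

    For any c > 1, the sample mean stays within this width of the true p
    for all but finitely many n, almost surely -- so an interval width
    Delta_n shrinking no faster than this misses the true p only
    finitely often.
    """
    return c * sqrt(2 * p * (1 - p) * log(log(n)) / n)

for n in (100, 10_000, 1_000_000):
    print(n, lil_width(n))
```

The width decays roughly like $1/\sqrt{n}$ (up to the very slowly growing $\log \log n$ factor), which is why the set of rationals you can distinguish grows steadily with the data.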

The cool thing to me about this paper is that it’s an example of “letting the hypothesis class grow with the data.” We’re trying to guess if the coin parameter $p$ is rational and if so, which rational. But we can only apprehend a set of hypotheses commensurate with the data we have, so the threshold $k_n$ limits the “complexity” of the hypotheses we are willing to consider at time $n$. The LIL sets the threshold for us so that we don’t make too many errors.

There are lots of little extensions and discussions about the rationality of physical constants, testing for rationality by revealing digits one by one, and other fun ideas. It’s worth a skim for some of the readers of this blog, I’m sure. A miscellaneous last point: Blackwell suggested a Bayesian method for doing this (mentioned in the paper) using martingale arguments.

I think the null hypothesis is just p \in [0, 1] – N_0, and N_0 is a set of irrationals, not rationals?

Oops poor phrasing, will rephrase…

Rather: yes, a typo, plus a misstatement of the problem. Fixed now!

No wait, I’m confused.

So the setup is as follows — there are countably many hypotheses, one for each rational, and the null hypothesis, which is that p is irrational. You get to come up with a sequential testing procedure.

Now I will test your procedure with each p \in [0,1]. It has to make a finite number of errors for all values of p in a set [0,1] – N_0, where N_0 has Lebesgue measure 0. So, for example, it can fail to have a finite number of errors for all p in the Cantor set minus the rationals.

This is a really cool problem. Tom Cover used to really like these ‘daylight robbery’ kind of puzzles, where the first reaction to hearing the problem statement is to declare that it is impossible. I will read the paper.