I’m reading Cesa-Bianchi and Lugosi’s Prediction, Learning, and Games in more detail now, and in Chapter 4 they discuss randomized prediction and forecasting strategies. The basic idea is that nature has a sequence of states or outcomes $(y_1, y_2, \ldots, y_n)$. You have a set of possible actions $\{1, 2, \ldots, N\}$ to choose from at each time. If you choose action $i$ at time $t$, then you incur a loss $\ell(i, y_t)$. After you choose your action, the state of nature $y_t$ is revealed to you. The goal is to come up with a strategy for choosing an action at each time $t$, given the past states $(y_1, \ldots, y_{t-1})$.

One simple strategy you might consider is to choose, at time $t$, the action $i$ that would have incurred the minimum cumulative loss so far:

$I_t = \operatorname*{arg\,min}_{i} \sum_{k=1}^{t-1} \ell(i, y_k)$.

This strategy is called fictitious play, because you pretend to have played each action over the entire past and choose the one that would have given you the best performance to date. It seems like a good idea at first glance, but it can easily be “fooled” by a bad state sequence $(y_1, \ldots, y_n)$. Unfortunately, the strategy is not Hannan consistent, which loosely means that the gap between the average loss under this strategy and the average loss under the best fixed action need not go to 0: an adversarial sequence can keep that gap bounded away from 0.
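To see the failure concretely, here is a minimal sketch (the function names and the alternating outcome sequence are my own illustration, not from the book): with two actions, zero-one loss, and outcomes that alternate $1, 0, 1, 0, \ldots$, follow-the-leader loses every single round, while the best fixed action loses only half the time, so the average regret stays near $1/2$.

```python
# Fictitious play (follow the leader) fooled by an alternating
# outcome sequence. Two actions, zero-one loss l(i, y) = 1 if i != y.

def loss(i, y):
    return 1 if i != y else 0

def ftl(cum_losses):
    # Pick the action with the smallest cumulative past loss,
    # breaking ties in favor of the lowest index.
    return min(range(len(cum_losses)), key=lambda i: cum_losses[i])

n = 1000
outcomes = [t % 2 for t in range(1, n + 1)]  # 1, 0, 1, 0, ...

cum = [0, 0]       # cumulative loss of each action
ftl_loss = 0       # cumulative loss of the follow-the-leader strategy
for y in outcomes:
    i = ftl(cum)
    ftl_loss += loss(i, y)
    for a in range(2):
        cum[a] += loss(a, y)

# FTL loses every round (average loss 1), the best fixed action
# loses half the rounds (average loss 1/2): average regret ~ 1/2.
print(ftl_loss / n, min(cum) / n)
```

The tie-breaking rule is what gets exploited here: each round, the current leader is exactly the action that the next outcome punishes.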

So what to do? One option, proposed by Hannan, is to add independent random noise $Z_{i,t}$ to each of the cumulative losses $\sum_{k=1}^{t-1} \ell(i, y_k)$ and then choose the action whose perturbed loss is smallest:

$I_t = \operatorname*{arg\,min}_{i} \left\{ \sum_{k=1}^{t-1} \ell(i, y_k) + Z_{i,t} \right\}$.

This strategy is Hannan consistent, provided the noise is suitably scaled, and is called follow the perturbed leader, since the action you choose is the leader after a perturbation of the losses.
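Here is a sketch of the perturbed strategy on the same alternating sequence that fools plain follow-the-leader. The uniform noise with scale growing like $\sqrt{t}$ is one standard choice; the constant and the helper names here are my own illustration, not the book’s.

```python
import random

def ftpl(cum_losses, t, rng, scale=1.0):
    # Perturb each cumulative loss with independent uniform noise
    # whose magnitude grows like sqrt(t), then follow the leader.
    width = scale * (t ** 0.5)
    noisy = [L + rng.uniform(0, width) for L in cum_losses]
    return min(range(len(noisy)), key=lambda i: noisy[i])

rng = random.Random(0)
n = 10_000
outcomes = [t % 2 for t in range(1, n + 1)]  # the sequence that fools FTL

cum = [0, 0]   # cumulative loss of each action
total = 0      # cumulative loss of follow-the-perturbed-leader
for t, y in enumerate(outcomes, start=1):
    i = ftpl(cum, t, rng)
    total += 1 if i != y else 0
    for a in range(2):
        cum[a] += 1 if a != y else 0

# Per-round regret against the best fixed action: typically small
# here, compared with FTL's 1/2 on the same sequence.
print((total - min(cum)) / n)
```

Because the two cumulative losses never differ by more than 1 while the noise width keeps growing, the perturbed leader is close to a fair coin flip each round, so the adversarial alternation no longer extracts a loss every single step.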

If you got this far, I bet you were thinking I’d make some comment on the election. We’ve been following a perturbed leader for a while now. So here is an exercise: how is this prediction model a bad model for politics?