## 18 August 2017

### Surprise and belief update

In a previous post, I started discussing a paper [1] on the (un)surprising nature of a long streak of heads in a coin toss. My conclusion was that the surprise is not intrinsic to the particular sequence of throws, but rather resides in its relation to our prior information. I will detail this reasoning here, before returning to the paper itself.

Let us accept as prior information the null hypothesis $$H_0$$ "the coin is unbiased". The conditional probabilities of throwing heads or tails are then equal: $$P(H|H_0) = P(T|H_0)=1/2$$. With the same prior, the probability of any sequence $$S_k$$ of 92 throws is the same: $$P(S_k|H_0) = 2^{-92}$$, where $$k$$ ranges from $$1$$ to $$2^{92}$$.

Assume now that the sequence we actually get consists of all heads: $$S_1 = \lbrace HH \ldots H\rbrace$$. What is the (posterior) probability of getting heads on the 93rd throw? Let us consider two options:

1. We can hold steadfast to our initial estimate of lack of bias, $$P(H|H_0) = 1/2$$.
2. We can update our "belief value" and say something like: "although my initial assessment was that the coin is unbiased [and the process of throwing is really random and I'm not hallucinating etc.], having thrown 92 heads in a row is good evidence to the contrary and on the next throw I'll probably also get heads". Thus, $$P(H|H_0 S_1) > 1/2$$ and in fact much closer to 1. How close exactly depends on the strength of our initial confidence in $$H_0$$, but I will not do the calculation here (I sketched it in the previous post).

I would say that most rational people would choose option 2 and abandon $$H_0$$; holding on to it (option 1) would require an extremely strong confidence in our initial assessment.
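
To make the dependence on the prior concrete, here is a minimal sketch under a deliberately crude assumption that is not taken from the paper and may differ from the calculation in the previous post: the only alternative to $$H_0$$ is a double-headed coin. The function name `predictive_heads` and the prior values are purely illustrative.

```python
from fractions import Fraction

def predictive_heads(prior_fair, n_heads):
    """P(heads on the next throw | n_heads heads in a row), toy two-hypothesis model.

    Assumption (illustrative only): the sole alternative to a fair coin
    is a double-headed coin, which lands heads with probability 1.
    """
    prior_fair = Fraction(prior_fair)
    like_fair = Fraction(1, 2) ** n_heads   # P(S_1 | fair coin)
    like_rigged = Fraction(1)               # P(S_1 | double-headed coin)
    evidence = prior_fair * like_fair + (1 - prior_fair) * like_rigged
    post_fair = prior_fair * like_fair / evidence
    # Mix the two predictions, weighted by the posterior of each hypothesis.
    return post_fair * Fraction(1, 2) + (1 - post_fair) * 1

# Print the remaining doubt, 1 - P(heads next), for two prior confidences in fairness.
for prior_fair in ("0.99", "0.999999"):
    p_next = predictive_heads(prior_fair, 92)
    print(prior_fair, float(1 - p_next))
```

In this toy model, even a prior of 0.999999 in favour of fairness leaves a doubt of order $$10^{-22}$$ about heads on the 93rd throw; keeping the estimate near 1/2 would require a prior probability for the alternative well below $$2^{-92}$$.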

Note that for a sequence $$S_2$$ consisting of 46 heads and 46 tails (in any order), the distinction above is moot, since $$P(H|H_0 S_2) = P(H|H_0) = 1/2$$. The difference between $$S_1$$ and $$S_2$$ lies not in their prior probability [2] but in the way they challenge (and update) our belief.
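
The contrast between $$S_1$$ and $$S_2$$ can also be checked in a slightly richer toy model. The sketch below assumes (this is an illustrative choice, not something stated in the paper) that the unknown probability of heads has a symmetric Beta(a, a) prior, where a larger integer `a` encodes a stronger initial confidence in fairness. The posterior predictive probability of heads is then `(a + heads) / (2a + heads + tails)`.

```python
from fractions import Fraction

def beta_predictive(heads, tails, a=1):
    """Posterior predictive P(heads) under a symmetric Beta(a, a) prior on the bias.

    `a` is a hypothetical integer pseudo-count; larger values mean a
    stronger prior belief that the coin is fair.
    """
    return Fraction(a + heads, 2 * a + heads + tails)

print(beta_predictive(46, 46))          # 1/2, whatever the value of a
print(beta_predictive(92, 0))           # 93/94 with a uniform prior (a = 1)
print(beta_predictive(92, 0, a=1000))   # 273/523, still above 1/2 with a strong prior
```

The balanced sequence leaves the estimate at exactly 1/2 for any `a`, whereas the all-heads sequence shifts it by an amount that depends on the strength of the prior.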

Back to Martin Smith's paper now: what makes him adopt the first choice? I think the most revealing phrase is the following:

> When faced with this result, of course it is sensible to check [...] whether the coins are double-headed or weighted or anything of that kind. Having observed a run of 92 heads in a row, one should regard it as very likely that the coins are double-headed or weighted. But, once these realistic possibilities have been ruled out, and we know they don’t obtain, any remaining urge to find some explanation (no matter how farfetched) becomes self-defeating. [italics in the text]

As I understand it, he implicitly distinguishes between two kinds of propositions: observations (such as $$S_1$$) and checks (which are "of the nature of" $$H_0$$, although they can occur after the fact), and bestows upon the second category a protected status: conclusions of this type, e.g. "the coin is unbiased", survive even in the face of overwhelming evidence to the contrary (at least when that evidence results from observation).

There is, however, no basis for this distinction, because checks are also empirical findings: by visual inspection, I conclude that the coin does indeed exhibit two different faces; by more elaborate experiments, I deduce that the center of mass is indeed at the geometrical center of the coin, within experimental precision; by some unspecified method, I conclude that the "throwing process" is indeed random; by pinching myself, I decide that I am not dreaming; and so on. At this point, the common-sense remark is: "if you want to check the coin for bias, the easiest way would be to throw it about 92 times and count the heads".

If we estimate the probability of the observations (given our prior belief) we should also update our belief in light of the observations. Recognizing this symmetry gives quantitative meaning to the "surprise" element, which is higher for some sequences than for others.
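
One possible way to put a number on this "surprise" is the Kullback-Leibler divergence from prior to posterior belief, sometimes called Bayesian surprise. The post does not commit to this particular measure, so the sketch below is only an illustration. It assumes two hypotheses, a fair coin and a coin biased towards heads with a hypothetical probability `q_biased`; the prior `prior_fair` is likewise an arbitrary choice.

```python
import math

def bayesian_surprise(heads, tails, prior_fair=0.999, q_biased=0.9):
    """KL(posterior || prior) in bits over the two hypotheses {fair, biased}.

    Both default parameters are illustrative assumptions.
    """
    # Log-likelihood of the observed sequence under each hypothesis.
    ll_fair = (heads + tails) * math.log(0.5)
    ll_biased = heads * math.log(q_biased) + tails * math.log(1 - q_biased)
    # Unnormalised log-posteriors, normalised with a log-sum-exp for stability.
    lp_fair = math.log(prior_fair) + ll_fair
    lp_biased = math.log(1 - prior_fair) + ll_biased
    m = max(lp_fair, lp_biased)
    log_evidence = m + math.log(math.exp(lp_fair - m) + math.exp(lp_biased - m))
    post_fair = math.exp(lp_fair - log_evidence)
    post_biased = math.exp(lp_biased - log_evidence)
    # Sum post * log2(post / prior), skipping terms with zero posterior mass.
    kl = 0.0
    for post, prior in ((post_fair, prior_fair), (post_biased, 1 - prior_fair)):
        if post > 0:
            kl += post * math.log2(post / prior)
    return kl

print(bayesian_surprise(92, 0))    # all heads: belief moves a lot
print(bayesian_surprise(46, 46))   # balanced: belief barely moves
```

With these assumed parameters, the all-heads sequence yields roughly 10 bits of surprise, while the balanced sequence yields on the order of a thousandth of a bit, although both sequences have the same probability $$2^{-92}$$ under $$H_0$$.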

1. Martin Smith, Why throwing 92 heads in a row is not surprising, Philosophers' Imprint (forthcoming) 2017.
2. We only considered here the probabilities before and after the 92 throws. One might also update one's belief after each individual throw, so that $$P(H)$$ would increase gradually.
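
The throw-by-throw update mentioned in note 2 can be sketched as follows, again under the illustrative assumption (not from the paper) that the only alternative to a fair coin is a coin landing heads with a hypothetical probability `q_biased`, set here to 1 for a double-headed coin.

```python
def sequential_predictive(throws, prior_fair=0.999999, q_biased=1.0):
    """Yield P(heads on the next throw) after each observed throw ('H' or 'T').

    Both default parameters are illustrative assumptions.
    """
    p_fair = prior_fair
    for throw in throws:
        like_fair = 0.5
        like_biased = q_biased if throw == "H" else 1.0 - q_biased
        # Bayes' rule for a single throw; the posterior becomes the next prior.
        evidence = p_fair * like_fair + (1 - p_fair) * like_biased
        p_fair = p_fair * like_fair / evidence
        yield p_fair * 0.5 + (1 - p_fair) * q_biased

probs = list(sequential_predictive("H" * 92))
for n in (1, 10, 20, 30, 92):
    print(n, probs[n - 1])
```

In this model the estimate stays close to 1/2 for the first few throws and then climbs towards 1 as the streak lengthens.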