Martin Smith (from the University of Edinburgh) discusses the relation between surprise and belief [1].

As a striking introduction, he claims that, in a coin toss, throwing a large number of heads in a row is not surprising. He deploys a version of the *sorites* argument insofar as "surprise" is concerned: if the individual events \(e_k\) of getting heads on the \(k\)-th throw are unsurprising, then so is their conjunction.

This particular example is easily dealt with by noting the importance of prior information: the sequence of heads is surprising because we know that "The coins don’t appear to be double-headed or weighted or anything like that – just ordinary coins", as Smith insists in his first paragraph [2]. I know nothing about the theories of Shackle and Spohn, but I doubt his analysis would survive adding event \(e_0\): "We checked that all coins were unbiased". On the other hand, I believe a Bayesian treatment similar to that given by Jaynes in §5.2 of [3] would be quite satisfactory (see also Chapter 4 for a general presentation and §9.4 for an example dealing specifically with coin tossing and bias). Once again, the information brought by \(e_1\) through \(e_{92}\) contradicts \(e_0\); this is why the sequence is surprising (or informative), not because the events have an intrinsic "surprising" character.

The author's insistence on the equivalence of the various results ("[E]ach one of these sequences is just as unlikely as 92 heads in a row.") glosses over the fact that the sequences are not equally compatible with the fairness assumption \(e_0\). Let us introduce the probability \(p\) of throwing heads. Then \(e_0\) amounts to saying that the (prior) probability distribution \(f(p)\) of the parameter \(p\) is peaked at \(0.5\) and has a certain width \(w\). The higher our confidence in the fairness of the coins, the smaller \(w\).
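To spell out the step behind this, here is the standard Bayesian update written for the coin problem. The symbols \(s\) (a particular sequence), \(n\) (number of throws) and \(h\) (number of heads in \(s\)) are my own notation, not Smith's, and the throws are assumed independent given \(p\):

\[
P(s \mid p) = p^{h}\,(1-p)^{n-h},
\qquad
f'(p) \propto f(p)\, p^{h}\,(1-p)^{n-h},
\]

where \(f'(p)\) denotes the posterior distribution of \(p\). For \(n = 92\), every sequence has the same probability \(2^{-92}\) at \(p = 1/2\), whatever \(h\) is. Away from \(p = 1/2\) this is no longer true: the likelihood \(p^{92}\) of 92 heads is largest at \(p = 1\), while the likelihood \([p(1-p)]^{46}\) of a sequence with 46 heads is largest at \(p = 1/2\).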

It is only in the case of absolute certainty \(w \to 0\) (\(f(p)\) is a Dirac delta) that the results are equivalent. As soon as \(w\) exceeds a ridiculously small value, the evidence brought by the 92 heads dramatically shifts the peak of the (posterior) probability distribution \(f'(p)\) close to 1. A sequence with 46 heads and 46 tails, although exactly as improbable under \(p = 0.5\), has no such effect (at most, it may lead to a modest decrease in \(w\)).
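The effect of \(w\) can be explored numerically. What follows is only a minimal sketch under assumptions of my own, not something taken from Smith or Jaynes: the prior is a symmetric Beta distribution \(\mathrm{Beta}(a, a)\), \(w\) is identified with its standard deviation, and the function names are hypothetical. How far the posterior moves for a given \(w\) depends on this choice of prior family (in particular on its tails), so the numbers should be read as an illustration rather than as a general result.

```python
from math import sqrt


def symmetric_beta_prior(width):
    """Return a such that Beta(a, a) has standard deviation `width`.

    Assumption: the prior f(p) is a symmetric Beta distribution centred
    on 0.5, for which Var = 1 / (4 * (2a + 1)).
    """
    if not 0.0 < width < 0.5:
        raise ValueError("width must lie strictly between 0 and 0.5")
    return (1.0 / (4.0 * width ** 2) - 1.0) / 2.0


def posterior_summary(width, heads, tosses):
    """Posterior mean and standard deviation of p after the tosses.

    The Beta prior is conjugate to the coin-toss likelihood, so the
    posterior is Beta(a + heads, a + tails).
    """
    a = symmetric_beta_prior(width)
    alpha = a + heads
    beta = a + (tosses - heads)
    total = alpha + beta
    mean = alpha / total
    std = sqrt(alpha * beta / (total ** 2 * (total + 1.0)))
    return mean, std


if __name__ == "__main__":
    # Compare 92 heads with a balanced sequence for several prior widths.
    for width in (0.001, 0.01, 0.05, 0.2):
        mean_92, std_92 = posterior_summary(width, 92, 92)
        mean_46, std_46 = posterior_summary(width, 46, 92)
        print(f"w = {width:5.3f} | 92 heads: mean {mean_92:.4f}, std {std_92:.4f}"
              f" | 46 heads: mean {mean_46:.4f}, std {std_46:.4f}")
```

With this prior, a sequence with 46 heads leaves the posterior mean at exactly 0.5 by symmetry and only narrows the distribution, whereas 92 heads pull the mean towards 1, the more so the wider the prior.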

The surprise is not related to the probability of a particular sequence, but to the extent to which that sequence challenges our belief. I believe this statement to be rather trivial (or at least uncontroversial), and indeed Smith reaches pretty much the same conclusion in the last, and most interesting, section of the paper (to be discussed in a future post), although he cannot see the element of surprise in the coin toss experiment.


[1] Martin Smith, *Why throwing 92 heads in a row is not surprising*, Philosophers' Imprint (forthcoming), 2017.

[2] Had we known that all coins were double-headed, throwing only heads would not only be unsurprising, it would be certain.

[3] E. T. Jaynes and G. L. Bretthorst, *Probability Theory: The Logic of Science*, Cambridge University Press, 2003.
