11 July 2014

Reproducible experiments

Yesterday evening, after having spent my day trying (and failing) to reproduce somebody's published research, I stumbled (via Soylent News) upon a psychologist's essay on "the emptiness of failed replications". Jason Mitchell, psychology professor at Harvard, states that failing to replicate somebody else's experiment does not represent a meaningful scientific contribution. Well, thank you, Prof. Mitchell!

All jokes aside, it took me quite some time to parse the text, and even more time to realize that this difficulty is likely due to the implicit assumptions that I brought from my own field of work (experimental physics), which are quite different from those of the author, an experimental psychologist. Ultimately, I learned more from trying to separate these two viewpoints than from the text itself, which makes a rather simplistic argument.

The argument

Mitchell's main point appears to be that one cannot learn from negative results, since not finding something cannot prove that it doesn't exist. This sounds entirely reasonable, and is certainly true for the "black swan" example the author uses, but it is simply wrong for typical scientific experiments: learning that the correlation between two variables is zero (within the uncertainty) is as strong a result as learning that it is significant and positive. Of course, the first outcome is less likely to lead to a high-profile paper.
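To make the "zero within the uncertainty" point concrete, here is a minimal sketch in Python. It is my own illustration, not anything taken from Mitchell's essay: the helper names (`pearson_r`, `fisher_interval`, `simulate`), the simulated Gaussian data and the 95% level are all assumptions. It uses the standard Fisher z-transform approximation, in which the width of the confidence interval depends only on the sample size, not on whether the measured correlation happens to be near zero.

```python
import math
import random


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def fisher_interval(r, n, z_crit=1.96):
    """Approximate confidence interval for a correlation (Fisher z-transform).

    z_crit=1.96 corresponds to roughly 95% coverage; requires |r| < 1 and n > 3.
    """
    z = math.atanh(r)
    half_width = z_crit / math.sqrt(n - 3)  # depends on n only, not on r
    return math.tanh(z - half_width), math.tanh(z + half_width)


def simulate(true_slope, n, seed):
    """Hypothetical experiment: y = true_slope * x + unit Gaussian noise."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [true_slope * x + rng.gauss(0.0, 1.0) for x in xs]
    r = pearson_r(xs, ys)
    return r, fisher_interval(r, n)


if __name__ == "__main__":
    # A "null" experiment and a "positive" one, both with the same sample size.
    for label, slope in (("no effect", 0.0), ("real effect", 1.0)):
        r, (lo, hi) = simulate(slope, n=10_000, seed=1)
        print(f"{label}: r = {r:+.3f}, 95% interval = [{lo:+.3f}, {hi:+.3f}]")
```

With enough data, the "no effect" run yields a narrow interval around zero, which rules out any sizeable correlation just as firmly as the other run rules out a vanishing one.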

The assumptions

A basic assumption in physical sciences is that of "homogeneity": the outcome of an experiment should not depend on its location, time or the personality of the scientist. Mitchell does not address this point directly, but seems to imply that getting all the details right for precisely replicating an experiment is next to impossible. He then blames this on the replicators' lack of some sort of "core competence". This is a valid point: if Nature is the same everywhere but the experimentalists are sloppy, their results will of course differ. From this I would however draw two uncomfortable conclusions:
  1. This sloppiness may just as well affect the initial experiment as the attempt to reproduce it.
  2. It also undermines an entire field of study if there is no way of distinguishing careful scientists from the careless (or incompetent) ones.

In "tabletop" physics, replicating an experiment is relatively cheap [1]. It is also crucial: our research builds on someone else's results, and very often the replication is a necessary step before being able to go further. Chemists sometimes spend weeks or months reproducing published protocols. Needless to say, this is not done to prove the original author wrong! Neither of these points seems to apply in psychology, as presented by Mitchell.

Finally, I find it quite strange that Mitchell treats replicating experiments as almost morally wrong: "One senses either a profound naiveté or a chilling mean-spiritedness at work." This goes beyond mere scientific debate and sounds more like a response to a personal offense.



1. Even in large-scale experiments, reproducing the results may be necessary, albeit very expensive. A good example is the search for the Higgs boson, with the two experiments, ATLAS and CMS, working side by side but without communicating (see for instance Jon Butterworth's "Smashing Physics").
