Coincidence. Or is it?

March 29, 2008

I attended an interesting talk by Prof. Persi Diaconis, Professor of Statistics and Mathematics at Stanford, called ‘On Coincidences’. It was a lecture aimed at a general scientific audience, in which he explained how simple statistical tools can account for coincidences that a lot of people are all too willing to deem paranormal. You should read about Prof. Diaconis here. Read it to find out what a strikingly unconventional career path he has taken.

He started with a story recorded by Carl Jung, the psychoanalyst. On April Fish day in 1949, Jung made a note of a picture that was half human and half fish, then ate fish for lunch, met a patient who talked about dreaming of a big fish, and was shown pictures of fish by another patient; months later, while he was writing about these things, he found a dead fish.

Now the question is, is this really as weird, or even as surprising, as Jung or anybody else thinks? Prof. Diaconis argued that it is not. By modeling day-to-day incidents with simple statistics, we can think about these occurrences quantitatively and make sense of them. For instance, we can model incidents that are individually ordinary, but look weird when taken together, as being generated by a Poisson process. Modeling the fish story this way, and assuming that we hear about ‘fish’ once a day, Prof. Diaconis found that there is a 22% chance of the above story occurring, which makes the occurrence not very surprising.

The other important thing he discussed was how the simple Birthday problem can be used as a tool to quantify coincidences. In the birthday problem we have 365 categories (the possible birthdays). Now, how many people do we need to sample so that at least one pair shares a birthday with a probability of 50%? The answer is given approximately by the simple formula N = 1.2 * sqrt(C), where C is the number of categories. For a 95% chance of a match, N = 2.5 * sqrt(C). Taking C = 365, we get about 23 and 48 as the sample sizes, respectively.
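To see how close the rule of thumb is, here is a minimal sketch in Python that computes the exact match probability and compares it with the two formulas above. It assumes that all C categories are equally likely and that people are independent; the function names are my own, made up for illustration, and not anything from the talk.

```python
import math


def match_probability(n, categories=365):
    """Exact chance that at least two of n people share a category,
    assuming every category is equally likely (an idealization)."""
    if n > categories:
        return 1.0  # pigeonhole: a match is certain
    p_no_match = 1.0
    for k in range(n):
        # The (k+1)-th person must avoid the k categories already taken.
        p_no_match *= (categories - k) / categories
    return 1.0 - p_no_match


def smallest_sample(target, categories=365):
    """Smallest n whose exact match probability reaches the target."""
    n = 1
    while match_probability(n, categories) < target:
        n += 1
    return n


def rule_of_thumb(constant, categories=365):
    """The approximation N = constant * sqrt(C) quoted in the text."""
    return constant * math.sqrt(categories)


if __name__ == "__main__":
    for target, constant in ((0.50, 1.2), (0.95, 2.5)):
        exact = smallest_sample(target)
        approx = rule_of_thumb(constant)
        print(f"target {target:.0%}: exact N = {exact}, rule of thumb = {approx:.1f}")
```

With C = 365 the exact sample sizes come out as 23 and 47, while the rule of thumb gives roughly 22.9 and 47.8, so the approximation is off by at most one person here.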

In this paper, he explains the above and gives similar general formulas for other cases: when you have to find a match in any of several categories, when a close match is enough, and so on. It’s a very simple yet useful way of thinking.

***

This talk was yesterday. Incidentally, someone was giving a seminar in the lab here today on risks in nuclear plants. While talking about the Three Mile Island reactor accident, he mentioned that the failure started at 4:00 AM sharp on March 28th, 1979, on the day of their first anniversary. It was, in fact, the first anniversary down to the minute. There were a few gentle giggles, and then, after a couple of seconds, one of the fellows (who had attended yesterday’s talk) said, “That is today!” The speaker (who hadn’t attended the talk) said, “Yeah, coincidence.” The prof (who had attended the talk) said, “Or is it?” People were all giggling about the good joke. All but one. Because through his extrasensory perception he had come to terms with a greater truth that no one else understood; only I had noticed that the time then was exactly 4:00:00 AM!

OK, I made up the last bit about the time being 4 AM. The world still isn’t so crazy as to hold talks about nuclear reactor failures at 4 AM. The talk started around 3:40 PM, so there is more than a good chance that the time at that moment was 4 PM. But then, we all like to make a good story better, don’t we?