People are fond of the notion that “you can prove anything with statistics.” That’s not true, but it often takes deep understanding to find errors in statistical reasoning. Even simple problems can be difficult to solve correctly, and when it comes to complex problems the opportunities for error multiply. The simple problem I have in mind is called “The Monty Hall Problem” and the complex problem is unraveling the errors in the derivation of the global warming Hockey Stick. The first is discussed in The Drunkard’s Walk: How Randomness Rules our Lives, a superb introduction to statistical theory by physicist Leonard Mlodinow. The second is the subject of The Hockey Stick Illusion: Climategate and the Corruption of Science, a careful exposition by A.W. Montford of the errors made by climate scientists. Together they explain a great deal of what is bogus in modern science.

The Monty Hall Problem is from a television game show once hosted by personality Monty Hall. There is a prize behind one of three doors. The contestant picks a door. Mr. Hall then reveals that there is no prize behind one of the two doors the contestant did not pick, and asks if the contestant would like to keep his original door selection or switch to the door that has not yet been opened. What should the contestant do?

I am proud, darn proud, that I solved the problem correctly in a few seconds. That’s a result of my lengthy education and experience, and also that the problem really is a simple one. The answer is that the contestant should switch. The short analysis is that opening a door adds information about where the prize is and switching cashes in on that information.

A longer analysis considers two cases. If the contestant happens to have chosen the door with the prize initially, then the switch means he will not end up with the prize. Alternatively, if he originally chose a door without the prize, then switching means he wins the prize. His chance of selecting the prize initially is one in three and the chance of having not selected the prize initially is two in three. That means the strategy of switching, which changes not-win to win, improves his odds of winning from one in three to two in three. Records of the outcomes on the show verify that is what happened.
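The two-case argument can be checked by brute force. The sketch below is only an illustration, not anything from Mlodinow’s book: it lists every equally likely combination of prize door and first pick and counts the wins for each strategy. The function name win_probability is made up for this example, and it assumes the host always opens an empty door the contestant did not pick.

```python
from fractions import Fraction
from itertools import product

DOORS = (0, 1, 2)

def win_probability(switch: bool) -> Fraction:
    """Exact chance of winning, counting every equally likely (prize, pick) pair."""
    wins = 0
    cases = 0
    for prize, pick in product(DOORS, DOORS):
        cases += 1
        if switch:
            # Assumption: the host always opens an empty, unpicked door, so the
            # one remaining door holds the prize exactly when the first pick was wrong.
            wins += (pick != prize)
        else:
            wins += (pick == prize)
    return Fraction(wins, cases)

print("stay:  ", win_probability(switch=False))  # 1/3
print("switch:", win_probability(switch=True))   # 2/3
```

Of the nine combinations, three have the first pick on the prize and six do not, which is where the one in three and two in three come from.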

Most people are not familiar with problems of chance, so we shouldn’t be surprised if the average person gets the wrong answer. What is amazing is that many well-educated scientific types, including learned professors, got it wrong. Some could not be convinced of the right answer despite mathematical proof. They only crumbled under the evidence of computer simulations showing winning chances doubled. Apparently, the notion that the odds “must be” one in three was too much for some to overcome.
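A simulation of the kind that finally persuaded the doubters takes only a few lines. This is a minimal sketch, not the code anyone actually ran at the time. The names play and win_rate, the trial count, and the seed are all arbitrary choices for illustration, and the host is assumed to choose at random when he has two empty doors to pick from.

```python
import random

def play(switch: bool, rng: random.Random) -> bool:
    """Play one round and report whether the contestant wins the prize."""
    doors = [0, 1, 2]
    prize = rng.choice(doors)
    pick = rng.choice(doors)
    # The host opens a door that is neither the contestant's pick nor the prize.
    opened = rng.choice([d for d in doors if d != pick and d != prize])
    if switch:
        # Move to the one door that is neither picked nor opened.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == prize

def win_rate(switch: bool, trials: int = 100_000, seed: int = 0) -> float:
    """Fraction of simulated rounds won with the given strategy."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials

print("stay:  ", win_rate(switch=False))  # should land near one in three
print("switch:", win_rate(switch=True))   # should land near two in three
```

With a hundred thousand rounds per strategy, the two rates should settle close to one third and two thirds, so switching roughly doubles the chance of winning.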

The lesson here is that doing the calculations successfully is difficult, but can have a substantial payoff. Moreover, the answer is not arbitrary or ultimately in doubt; there is only one right answer. You cannot prove just anything.

The global warming Hockey Stick was a whopper of a mistake, appearing in a report of the UN Intergovernmental Panel on Climate Change (IPCC) as proof that recent warming is rapid and unprecedented. The goal of the Hockey Stick effort was to make a graph of world temperature going back many hundreds of years. Thermometers have not been around for that long, and temperature records from thermometers did not extend to remote areas until recent times. Lacking thermometers, scientists look for proxies, natural phenomena that vary with temperature. The falsely derived graph of temperature looks flat with a sudden rise at the end, like the shape of a hockey stick on its side.

The Hockey Stick derivation starts by considering many proxies with a computer program. The program sorts through the possibilities until it finds proxies that match thermometer measurements in modern times, and the program gives the most weight to the ones that match the best. It turned out that certain data on tree rings from the American southwest and certain forests in Russia matched most closely and were therefore weighted most heavily. The program, having found the best-matching proxies, then used them to go back in time and fill in the missing temperatures.

It turns out that if the program is fed completely random data, it will produce a Hockey Stick curve. That is because some random curves fed into the program will by chance happen to match the recent temperature rise. However, they are only selected for their ability to match recent temperatures. Outside of the examined region, in the past, the curves are random. Some increase and others decrease, so on average they mostly cancel each other out, leaving a flat shaft in front of the recent rise.
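The selection effect is easy to reproduce in a toy model. The sketch below is emphatically not the Hockey Stick software, and every name and number in it is an assumption made for illustration. It uses random walks as the “random data” (plain uncorrelated noise would show a weaker version of the same effect), a straight upward ramp as a stand-in for the modern thermometer record, and a simple rule that weights each fake proxy by its correlation with that ramp over the last 100 “years.”

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def random_walk(n_years, rng):
    """A fake proxy that is pure noise: a running sum of random steps."""
    level, series = 0.0, []
    for _ in range(n_years):
        level += rng.gauss(0.0, 1.0)
        series.append(level)
    return series

def reconstruct(n_series=1000, n_years=600, calib=100, seed=1):
    """Weighted average of fake proxies, weighted by fit to the recent rise."""
    rng = random.Random(seed)
    # Assumed stand-in for the thermometer record: a steady rise.
    target = list(range(calib))
    proxies = [random_walk(n_years, rng) for _ in range(n_series)]
    # Weight each proxy by how well it matches the recent rise;
    # proxies that run the wrong way get zero weight.
    weights = [max(0.0, pearson(p[-calib:], target)) for p in proxies]
    total = sum(weights)
    return [sum(w * p[t] for w, p in zip(weights, proxies)) / total
            for t in range(n_years)]

curve = reconstruct()
shaft_change = curve[-100] - curve[0]   # drift before the matching window
blade_change = curve[-1] - curve[-100]  # rise inside the matching window
print(f"change over the first 500 years: {shaft_change:+.2f}")
print(f"change over the last 100 years:  {blade_change:+.2f}")
```

Nothing in the input knows anything about temperature, yet the weighted average should climb over the final century, because that is the only stretch the weights were chosen on. Over the earlier centuries the ups and downs of the individual walks largely cancel.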

Further examination revealed that tree ring data of the type selected by the software as best was actually known separately not to be a good proxy for temperature. Trees a short distance from the ones used did not show a temperature rise, and the same trees in more recent years also have not tracked measured temperature. Taking out the few sets of tree rings that the software had homed in on destroyed the Hockey Stick curve. The scientists deriving the Hockey Stick were not qualified statisticians, and neither was anyone who reviewed their work.

The error was found by a Canadian statistical expert, Stephen McIntyre, through incredible persistence in the face of the uncooperative scientists who concealed what data they used and how they processed it. Ultimately, a House committee assembled a panel of expert statisticians, headed by statistics authority Edward Wegman. The panel verified that McIntyre was correct.

Hockey Stick defenders complained that Wegman was not an atmospheric scientist, and hence not qualified. This is like arguing that analysis of the Monty Hall problem cannot be done without carpenters on the panel, because only experts on doors can understand the issue. To this day, many proponents of the Hockey Stick still do not understand what they did wrong. Reading Montford’s book might help, but I doubt it.

Actually, the Monty Hall problem is in the realm of probability while the Hockey Stick was in the realm of statistics. The pitfalls are much the same. I’ll let you read Prof. Mlodinow to appreciate the difference. Social scientists should read the book twice.

Years ago, the brilliant cartoonist B. Kliban published a cartoon showing police clearing the way for a man with a beautiful girl on each arm. The policeman says, “Out of the way, you swine, a cartoonist is coming!” That’s ridiculous. But a good statistician deserves that kind of respect.