23 February 2014

the signal and the noise

I finished Nate Silver's book (in November, and in time to hand it off to my dad at Thanksgiving), and I'm finally getting around to posting some of my favorite points.

The first thing that caught my eye reminds me of conversations with the Internet MRA IPAM group, especially John Doyle. Talking about several wrong predictions made during the recession that began in 2008, predictions based solely on data, Silver wrote "ECRI actually seems quite proud of this approach. 'Just as you do not need to know exactly how a car engine works in order to drive safely,' it advised its clients in a 2004 book, 'You do not need to understand all the intricacies of the economy to accurately read those gauges.'" (p 197) This kind of statement makes me cringe, and Silver agrees: "This kind of statement is becoming more common in the age of Big Data. Who needs theory when you have so much information? But this is categorically the wrong attitude to take toward forecasting, especially in a field like economics where the data are so noisy."
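
To see concretely why pure pattern-matching goes wrong in noisy settings, here's a toy sketch (my own illustration, not Silver's): the true relationship is a simple line, but a flexible model fit to the same noisy data chases the noise and forecasts badly outside the observed range.

```python
import numpy as np

# Toy illustration (mine, not Silver's): the true signal is a line,
# but a very flexible polynomial fit to the same noisy data chases
# the noise and extrapolates badly.

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 15)
y = 2.0 * x + rng.normal(0, 3, size=x.size)  # linear signal plus heavy noise

theory_fit = np.polyfit(x, y, 1)  # constrained by a simple "theory": a line
data_fit = np.polyfit(x, y, 9)    # pure pattern-matching on the data

x_future = 12.0  # forecast beyond the observed range
print("true value:       ", 2.0 * x_future)
print("linear forecast:  ", np.polyval(theory_fit, x_future))
print("degree-9 forecast:", np.polyval(data_fit, x_future))
```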

I often think about self-canceling predictions, and he does a good job discussing them in Chapter 7. A great example is that of a GPS that predicts where the heavy traffic will be-- if it sends more drivers on the less busy route, it won't be the less busy route for long.
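
Here's a minimal toy model of that feedback (my own sketch, not from the book): a "GPS" keeps sending drivers to whichever of two routes currently looks faster, and the predicted less-busy route quickly stops being less busy.

```python
# Two routes; each step, the "GPS" shifts drivers toward whichever
# route currently looks faster, canceling its own prediction.

def travel_time(cars, free_time=10.0, slope=0.1):
    # Travel time grows with congestion.
    return free_time + slope * cars

cars_a, cars_b = 900, 100  # route A starts congested
for step in range(5):
    t_a, t_b = travel_time(cars_a), travel_time(cars_b)
    print(f"step {step}: A={t_a:.0f} min ({cars_a} cars), "
          f"B={t_b:.0f} min ({cars_b} cars)")
    # Some drivers on the slower route switch to the predicted-faster one.
    if t_a > t_b:
        movers = min(cars_a, 100)
        cars_a -= movers; cars_b += movers
    elif t_b > t_a:
        movers = min(cars_b, 100)
        cars_b -= movers; cars_a += movers
```

By the last step the two routes have equalized: the forecast of "less traffic on B" destroyed itself as drivers acted on it.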

Chapter 11 is about betting. There I learned that "free-market capitalism and Bayes' theorem come out of something of the same intellectual tradition. Adam Smith and Thomas Bayes were contemporaries, and both were educated in Scotland and were heavily influenced by the philosopher David Hume."

In this chapter I also enjoyed a quote from Daniel Kahneman about the two arrowed-lines illusion (the Müller-Lyer illusion) that makes two same-length lines appear to be different lengths. He said "You can look at them, and one of the arrows is going to look longer than the other. But you can train yourself to recognize that this is a pattern that causes an illusion, and in that situation, I can't trust my impressions; I've got to use a ruler." This reminds me of abstract mathematics. At some point you learn the situations in which you can't trust your judgement or intuition, and you slow down and look at things more carefully.

"A climate of healthy skepticism" is the title of Chapter 12; you all know this must be one of my favorite chapters. Not only because it's about skepticism though-- also I am very interested in skepticism about climate change, which is the topic of this chapter. A lot of the difficulties with policy on climate change are human-understanding related, not science related. Silver quotes Richard Rood, a scientist at NASA who also teaches at Michigan (I'd like to meet him!). Rood said, "At NASA, I finally realized that the definition of rocket science is using relatively simple physics to solve complex problems. The science part is relatively easy. The other parts-- how do you develop policy, how do you respond in terms of public health-- these are all relatively difficult problems because they don't have as well defined a cause-and-effect mechanism." Unfortunately the conclusion of this chapter is that politics these days are just too polarizing, and "It is seen as a gaffe when one says something inconvenient-- and true" (p411). It's my hope that this doesn't hold true permanently.

From Tom Schelling: "There is a tendency in our planning to confuse the unfamiliar with the improbable." This quote reminds me of the unseen species problem-- the problem of estimating the probability of some event when we have never (or only rarely) seen it happen. I wonder whether someday probability theory will be able to help us better handle predicting rare or previously unconsidered events.
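
There is in fact some classical machinery for this: the Good-Turing estimate says the chance that the next observation is a never-before-seen species is roughly the fraction of past observations that were singletons. A small sketch (my addition, with made-up species names):

```python
from collections import Counter

def good_turing_unseen(observations):
    # Good-Turing estimate of the probability that the next
    # observation is a species we have never seen before.
    counts = Counter(observations)
    n1 = sum(1 for c in counts.values() if c == 1)  # species seen exactly once
    n = len(observations)                           # total observations
    return n1 / n

# Example: 10 sightings of 6 species, 3 of them seen only once.
sightings = ["robin", "robin", "sparrow", "sparrow", "sparrow",
             "crow", "jay", "finch", "finch", "wren"]
print(good_turing_unseen(sightings))  # 0.3 -> ~30% chance the next bird is new
```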

Last but certainly not least, there was a very nice little description of how "signal and noise" comes, of course, from electrical engineering, in particular the study of communications. It gave me warm fuzzies... little did I understand, when I was just a 19-year-old kid in Don Johnson's intro to EE class, that I was learning math and ideas that had been built up over centuries to create some of the most amazing technology we've ever seen!

Addendum: I think the one thing I didn't like about Silver's book is captured nicely in this New Yorker article:

"The real reason that too many published studies are false is not because lots of people are testing ridiculous things, which rarely happens in the top scientific journals; it’s because in any given year, drug companies and medical schools perform thousands of experiments. In any study, there is some small chance of a false positive; if you do a lot of experiments, you will eventually get a lot of false positive results (even putting aside self-deception, biases toward reporting positive results, and outright fraud)—as Silver himself actually explains two pages earlier. Switching to a Bayesian method of evaluating statistics will not fix the underlying problems; cleaning up science requires changes to the way in which scientific research is done and evaluated, not just a new formula."