Friday, December 02, 2011

Review of The Signal and the Noise

I recently finished reading Nate Silver's The Signal and the Noise: Why So Many Predictions Fail - but Some Don't. This book really opened my eyes to Bayesian inference, i.e., the ability to make successively better predictions by updating estimates based on prior knowledge as new evidence arrives. Beyond that, I found Silver's study of several fields where predictions are commonly made to be insightful. He looks at sports gambling, economics, political elections, flu epidemics, weather, and military and terrorist attacks. Although his own expertise is in baseball and election forecasting, where he has a strong track record, Silver's analysis transcends the statistics to include a study of human nature and of technology.

In the field of weather forecasting, the author shows how commercial broadcast forecasters use a different standard for deciding what to report than the more independent government forecasters at the National Weather Service do. This leads to forecasts biased toward a higher chance of precipitation, because people are more likely to enjoy an unexpected sunny day than an unexpected rain shower.
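
As a rough sketch of how that kind of bias could be measured (my own illustration with made-up numbers, not code from the book), one can compare a forecaster's stated probabilities against the frequencies actually observed:

```python
# Sketch (my own, not from the book): checking a forecaster's calibration.
# A well-calibrated forecaster's "20% chance of rain" days should see rain
# about 20% of the time; a wet bias shows up as observed frequencies
# consistently below the stated probabilities.
from collections import defaultdict

def calibration_table(forecasts, outcomes):
    """forecasts: stated rain probabilities (0.0-1.0);
    outcomes: 1 if it actually rained that day, else 0."""
    buckets = defaultdict(lambda: [0, 0])  # stated prob -> [rain days, total days]
    for p, rained in zip(forecasts, outcomes):
        buckets[p][0] += rained
        buckets[p][1] += 1
    for p in sorted(buckets):
        rain_days, total = buckets[p]
        print(f"stated {p:.0%}: observed {rain_days / total:.0%} over {total} days")

# Hypothetical data: "20%" forecasts that verify only 10% of the time
calibration_table([0.2] * 10 + [0.8] * 5,
                  [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0])
```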

Something I'm finding more and more often in reading about how we think and process information is the human tendency to see a pattern where there is none. Silver makes this point in regard to the much larger volumes of information available to us in many fields as a result of technological progress: more information leads to more theories, but it has not necessarily led to better predictions.
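
As a toy illustration of that tendency (my own sketch, not Silver's), notice how easily a "pattern" emerges from pure noise when there are many series to compare:

```python
# Among many unrelated random series, some pair will look strongly
# correlated just by chance - a pattern where there is none.
import random
import statistics  # statistics.correlation requires Python 3.10+

random.seed(42)
# 40 unrelated series of 20 random values each
series = [[random.gauss(0, 1) for _ in range(20)] for _ in range(40)]

# Search every pair for the strongest-looking "relationship"
best = max(
    ((i, j, statistics.correlation(series[i], series[j]))
     for i in range(len(series)) for j in range(i + 1, len(series))),
    key=lambda t: abs(t[2]),
)
print(f"series {best[0]} and {best[1]} correlate at r = {best[2]:+.2f}, "
      "even though every series is pure noise")
```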

Silver's insight that the explosion of information following the invention of the printing press led to increased sectarianism is brilliant. With more information available via books and pamphlets, those with strong beliefs could publish their stories and rationales in a form that presented them as the truth, which led to more divisiveness. The explosion of information resulting from the internet's wide usage is likely producing a similar divisiveness in political opinion.

The difference between risk and uncertainty is something Silver does a nice job of explaining. He says that risk "is something that you can put a price on..." while uncertainty is "risk that is hard to measure." How do we deal with uncertainty? Silver puts his money behind Bayesian inference, whereby we reduce uncertainty gradually, starting from our prior experience and folding in the knowledge gained from each new experience.
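
To make that concrete, here is a minimal sketch of Bayesian updating (my own illustration with made-up numbers, not an example from the book):

```python
# A minimal sketch of the updating rule Silver describes: start from a
# prior probability, fold in each new piece of evidence via Bayes'
# theorem, and the posterior becomes the prior for the next update.
def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Posterior P(hypothesis | evidence) via Bayes' theorem."""
    numerator = p_evidence_if_true * prior
    return numerator / (numerator + p_evidence_if_false * (1 - prior))

# Hypothetical numbers: a 1% prior, with evidence that is five times
# more likely if the hypothesis is true (50%) than if it is false (10%).
belief = 0.01
for _ in range(3):  # three independent observations of the same evidence
    belief = bayes_update(belief, 0.5, 0.1)
    print(f"updated belief: {belief:.1%}")  # ~4.8%, ~20.1%, ~55.8%
```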

The emergence of "complex systems" in our lives has led to some interesting mistakes in prediction. The weather, or the climate if you think longer term, is a complex system with many variables; the economy is another. Silver delivers a scathing critique of the major credit rating agencies over how they miscalculated the risk of a major financial crash. "S&P and Moody's underestimated the default risk associated with CDOs by a factor of two hundred..." reports Silver. I'll leave the full analysis for the reader to enjoy. One additional point he makes, though, is that the gap between what we know and what we think we know is increasing as the volume of available information grows. That's a caveat emptor for predictors if I've ever heard one!

Silver does a nice job of explaining two different "personas" among experts who make predictions. He draws on a study by Philip Tetlock, a psychology and political science professor, who, while studying economists' predictions, decided to give the economists some of his psychological profile tests. The study eventually grew to cover experts in other fields of prediction and spanned fifteen years. Tetlock found that the experts fell into one of two groups: hedgehogs or foxes. Hedgehogs were more convinced their theory was correct and less likely to change it based on new information; foxes were just the opposite. The foxes turned out to make better predictions. This whole area of study, looking at how one's psychological profile affects one's ability to use information to make predictions, is exciting and can be applied, with caution, across a lot of disciplines. Even in software development, where I work, I can see how it applies.

Silver's analysis of the Pearl Harbor and 9/11 attacks bears some mention. In both cases, he explains how the events were, to some extent, statistically likely, but were not thought probable because they were unfamiliar. The United States in late 1941 was on alert for sabotage both in Hawaii and on the mainland, because it was thought that Japanese Americans or Nazi sympathizers were likely to strike in that way. In the Pacific, Japanese attacks on southeast Asian nations were considered a high probability, given the volume of Japanese radio communication in those areas. In 2001, there had never been a serious airplane attack against a building; if a plane were hijacked, it was expected to be flown to foreign soil.

Finally, Silver does a nice job of explaining power-law distributions, model over-fitting, and Bayesian inference. The book encouraged me to learn more about the mathematics. He also includes an incredible number of notes and references; I found myself reading many of the footnotes, which offer interesting commentary.
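
As a parting illustration of the over-fitting idea (again my own sketch, with made-up data), a flexible model can chase the noise in its training data and typically ends up predicting worse than a simple one:

```python
# Over-fitting in miniature: a degree-9 polynomial matches 12 noisy
# training points almost exactly, yet typically predicts the true
# underlying line worse than a plain degree-1 fit does.
import numpy as np

rng = np.random.default_rng(0)
x_train = np.linspace(0, 10, 12)
y_train = 2 * x_train + 1 + rng.normal(0, 2, size=x_train.size)  # noisy line
x_test = np.linspace(0, 10, 50)
y_test = 2 * x_test + 1  # the true signal, noise-free

for degree in (1, 9):
    coeffs = np.polyfit(x_train, y_train, degree)
    test_error = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree}: test MSE = {test_error:.2f}")
```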

Overall, this was, in my opinion, a ground-breaking book, at least for the lay reader if not for professionals who make predictions for a living.