Hillary Clinton's Flawed Algorithm Pissed Off the Data Science Gods
Hillary Clinton's campaign designed an algorithm named Ada to help make strategic decisions like where to place television ads to obtain the most electoral votes for the buck, where and when to deploy Clinton and others in person for optimal influence, and when it was safe to disappear and let the media focus on her opponent, Donald Trump.
Ada and its deployment were a closely held secret. It ran on a separate high-performance computer to which only a few top people had access. A massive amount of data, including daily public polling numbers and privately collected ground-level voter data, was fed to Ada, which ran more than 400,000 simulations a day to evaluate scenario options. Ada crunched the data, and the algorithm determined which states deserved time and resources in the pursuit of critical electoral votes and which did not.
Yet Ada misinterpreted voter polling data, made a number of wrong and unreasonable assumptions, and was fooled by what she did not know. Ada appeared to unreasonably assume that almost all polled voters would answer honestly, and so underestimated the number of folks who would not admit to voting for Trump. It appears the "misdirection" of a large number of polled voters produced what is known as a "correlated error": a systematic error shared across many polls at once (in contrast to many different independent errors, which tend to cancel each other out) that fatally messed up Ada's prediction models.
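To see why a correlated error is so much more damaging than independent ones, here is a minimal Monte Carlo sketch. It is not Ada's actual model, which was never made public. The five states, the 3-point poll lead, and the error sizes are all hypothetical numbers chosen for illustration, and `simulate_win_probability` is a made-up helper name. Both runs give each state the same total polling error; the only difference is whether part of that error is shared across every state.

```python
import random


def simulate_win_probability(poll_margins, state_sd, shared_sd,
                             n_sims=20000, seed=0):
    """Fraction of simulated elections in which the candidate carries
    a majority of the polled states.

    poll_margins: polled lead in each state, in points (hypothetical).
    state_sd:     std. dev. of the independent, state-specific poll error.
    shared_sd:    std. dev. of the error common to every state
                  (for example, voters who will not admit their choice).
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_sims):
        # One draw per election: this error hits every state the same way.
        shared_error = rng.gauss(0.0, shared_sd)
        carried = 0
        for margin in poll_margins:
            # Each state also gets its own independent error.
            true_margin = margin + shared_error + rng.gauss(0.0, state_sd)
            if true_margin > 0:
                carried += 1
        if carried > len(poll_margins) / 2:
            wins += 1
    return wins / n_sims


if __name__ == "__main__":
    polls = [3.0] * 5  # assumed: a 3-point lead in five battleground states

    # Total error variance per state is 16 in both cases (4^2 = 3^2 + 7).
    independent = simulate_win_probability(polls, state_sd=4.0, shared_sd=0.0)
    correlated = simulate_win_probability(polls, state_sd=7 ** 0.5, shared_sd=3.0)

    print(f"independent errors only: {independent:.3f}")
    print(f"with a shared error:     {correlated:.3f}")
```

Under these assumptions the run with a shared error should report a noticeably lower win probability. Independent misses in one state are usually offset by hits in another, while a shared miss moves every state in the same direction at once.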
Ada also unreasonably assumed a strong majority of women would vote for Clinton. Yet she badly underestimated the number of women (both college educated and working class) who would vote for Trump as an agent of change despite the risk he presented. Exit polls put Trump's share among white women at roughly 52 to 53 percent, even though Clinton carried women overall, in what strong historical evidence suggested was a "change" election away from the party in power. Why?
Ada also seriously underestimated the power of rural voters in Rust Belt states. Why?
Interestingly, Ada neither understood nor properly weighed the power of what appeared to be corruption: the pay-for-play allegations surrounding the Clinton Foundation, and the FBI investigation into Clinton's private email server and the destruction of emails that potentially revealed corruption and compromised national security.
Machine learning algorithms have one big weakness: they cannot analyze or understand events that have never happened before. For example, say you buy beachfront property where, over the past 200 years, none of the numerous hurricanes raised waves above 20 feet. So you build a house on strong concrete stilts 30 feet high to protect it from future hurricanes. Along comes a hurricane with 40-foot waves that washes your house away. No algorithm trained on those 200 years would predict this, because something ahistoric happened. The exception would be an algorithm with access to a far longer record that might show, for example, that 40-foot waves happen predictably once every 300 years and usually occur when X, Y or Z happens. The remedy is to give the algorithm this unlikely future scenario to consider (as a fake past event) or, if possible, to provide it with data covering a much longer time scale. This is why machine learning algorithms predict much better with more data and by learning from experience over time.
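The beach house problem can be sketched in a few lines. This is a toy illustration, not a real hurricane model: the wave record is invented, `build_record` and `empirical_exceedance` are hypothetical helper names, and the "model" is nothing more than counting how often a threshold was exceeded in the data it has seen. The record is built so that a 40-foot wave arrives once every 300 years, matching the example above.

```python
def build_record(years):
    """Invented yearly peak wave heights in feet.

    Ordinary years stay between 5 and 19 feet; every 300th year
    (an assumption made for this sketch) brings a 40-foot wave.
    """
    return [
        40.0 if year % 300 == 299 else 5.0 + (year * 7) % 15
        for year in range(years)
    ]


def empirical_exceedance(history, threshold):
    """Share of recorded years whose peak wave exceeded the threshold.

    This is all a purely data-driven estimate can know: if the event
    is not in the history, its estimated probability is zero.
    """
    return sum(1 for height in history if height > threshold) / len(history)


if __name__ == "__main__":
    stilts = 30.0  # height of the concrete stilts, in feet

    short_record = build_record(200)  # what the home buyer looked at
    long_record = build_record(900)   # a much longer time scale
    # Remedy from the text: feed the unlikely scenario in as a fake past event.
    augmented = short_record + [40.0]

    for name, record in [("200 years", short_record),
                         ("900 years", long_record),
                         ("200 years + fake event", augmented)]:
        p = empirical_exceedance(record, stilts)
        print(f"{name}: estimated yearly chance of topping the stilts = {p:.4f}")
```

With only the 200-year record the estimate is exactly zero, so the 30-foot stilts look perfectly safe. The longer record, or the injected scenario, is what makes the risk visible to the algorithm at all.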
In this election case, Ada provided the Clinton campaign with an illusion of reality. There is no reason why the Clinton data science team could not have designed a better learning algorithm using data science techniques to consider the above-mentioned factors to provide more realistic scenario options and suggest better strategic decisions.
Clinton's campaign data science team designed a flawed algorithm that pissed off the data science Gods.