Proper Use and Misuse of Modeling - DSA ADS Course 2022


Discuss the proper use and misuse of modeling. Discuss formulating appropriate assumptions and the danger of simplifying assumptions. Discuss methods to evaluate assumptions and judge whether they are reasonable, dubious or untestable.

Discuss appropriate setting of model specifications and parameters.

Discuss subjecting models to rigorous empirical tests to avoid creating an illusion of reality.

Review:

  • Predictive Modeling
  • Probabilistic Modeling
  • Decision Modeling
  • Black Box Models

The Model's New Clothes

COVID-19 exposed the danger of relying on models to make policy decisions. It appears to be in vogue among certain academics to treat models as "science", fit for forecasting and guiding policy decisions. Yet as any successful real-world executive, trader or surgeon will tell you, models are NOT appropriate for decision making, though they may or may not be useful for understanding phenomena.

Models are a useful tool, NOT an accurate forecaster.

Computational experiments are a promising new type of science, but they are explicitly distinguishable from standard statistical models purporting to forecast and reflect reality. While models may or may not be useful for understanding complexity, they should NEVER be used to make policy decisions. For example, many defective COVID-19 models created an illusion of doomsday reality, causing public fear and hysteria. Mass societal fear created a catastrophic feedback loop between a frightened public and leaders (amplified by irresponsible media), leading to a negative cascade of panic-driven policy decisions (lockdowns and hard social, healthcare and business restrictions) that caused massive damage to society.

The implicit assumption in building and relying on models is that if you understand complex relationships and find patterns and correlations, you can make better decisions, forecast events and manage risk. As we all know, the real world is not that simple and causes are usually obscure. Critical information is often unknown or unknowable and causes can be concealed or misrepresented. Moreover, key assumptions embedded in models are often wrong. In high causal density environments, finding true causality is difficult and sometimes impossible.
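One way to see why patterns and correlations alone are weak evidence is a small simulation. The sketch below is only an illustration under stated assumptions: it generates pairs of series that are independent by construction and counts how often they nonetheless look strongly correlated. The function names, the 0.5 threshold and the series length are hypothetical choices made for this example.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def white_noise(rng, n):
    """n independent standard normal draws (no trend, no memory)."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

def random_walk(rng, n):
    """Cumulative sum of white noise: a trending series with no real driver."""
    level = 0.0
    path = []
    for step in white_noise(rng, n):
        level += step
        path.append(level)
    return path

def share_strongly_correlated(make_series, trials=500, n=200,
                              threshold=0.5, seed=0):
    """Fraction of independent pairs whose |correlation| exceeds threshold."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        a = make_series(rng, n)
        b = make_series(rng, n)  # generated independently of a
        if abs(pearson(a, b)) > threshold:
            hits += 1
    return hits / trials

walk_share = share_strongly_correlated(random_walk)
noise_share = share_strongly_correlated(white_noise)
print(f"independent random walks with |r| > 0.5: {walk_share:.0%}")
print(f"independent white noise with |r| > 0.5:  {noise_share:.0%}")
```

Neither series in any pair causes the other, yet trending series can correlate strongly by chance far more often than untrended noise does. A correlation found in trending real-world data deserves the same suspicion.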

I suggest models should be judged by whether their assumptions are reasonable, dubious or untestable, not only by their predictive results (even a broken clock is right twice a day). Simplifying assumptions usually make models unrealistic and disconnected from the real world. I often get the feeling that a modeler intentionally searched for certain assumptions to create a specific result. Yet searching for assumptions that produce a desired result is not acceptable data science practice. Bad assumptions have consequences: freedom to select any assumptions allows the creation of a model to support any result.
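The point about freedom of assumptions can be made concrete with a toy projection. The following is a minimal sketch, not any real epidemiological or financial model: the function names, starting count and growth rates are hypothetical values chosen only to show how much a single assumed parameter drives the output.

```python
def project_cases(initial_cases, daily_growth, days):
    """Project a count forward assuming constant compound daily growth."""
    return initial_cases * (1.0 + daily_growth) ** days

def sweep_assumptions(initial_cases, growth_rates, days):
    """Return the projection produced by each assumed growth rate."""
    return {g: project_cases(initial_cases, g, days) for g in growth_rates}

# Hypothetical inputs: same starting point, same horizon,
# only the assumed growth rate changes.
results = sweep_assumptions(initial_cases=100,
                            growth_rates=[0.02, 0.05, 0.10, 0.15],
                            days=60)
for growth, projected in results.items():
    print(f"assumed growth {growth:.0%}/day -> {projected:,.0f} after 60 days")
```

With identical data and an identical formula, the projections differ by orders of magnitude depending on which growth rate is assumed. Reporting the full sweep, rather than one hand-picked run, is one simple guard against assumption shopping.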

Moreover, predictive models usually work only under certain limited circumstances and for a limited time (until they stop working). One should always be skeptical of the usefulness of predictive models in high causal density environments (e.g., human behavior, public health, climate, finance, etc.). Data scientists should use models properly: to gain understanding of simple and complex phenomena when no real alternatives are available. All models need to be subjected to rigorous empirical tests to avoid creating an illusion of reality that leads to policy malpractice and bad consequences.
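A basic form of empirical test is to fit a model on one period and score it on a later period it has never seen. The sketch below assumes a synthetic, saturating "true" process (a logistic curve with made-up parameters) and a straight-line model fitted to the middle of it. All names, windows and numbers are hypothetical and serve only to illustrate a holdout check.

```python
import math

def saturating_series(n_days, capacity=1000.0, midpoint=30.0, scale=5.0):
    """Synthetic 'reality': logistic growth that levels off at capacity."""
    return [capacity / (1.0 + math.exp(-(t - midpoint) / scale))
            for t in range(n_days)]

def fit_line(ts, ys):
    """Ordinary least squares fit of y = slope * t + intercept."""
    n = len(ts)
    mean_t = sum(ts) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in ts)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_t

def mean_abs_error(ts, ys, slope, intercept):
    """Average absolute gap between observed values and the fitted line."""
    return sum(abs(y - (slope * t + intercept))
               for t, y in zip(ts, ys)) / len(ts)

series = saturating_series(60)
train_t = list(range(22, 38))   # window where growth looks roughly linear
test_t = list(range(38, 60))    # later period, held out from fitting
train_y = [series[t] for t in train_t]
test_y = [series[t] for t in test_t]

slope, intercept = fit_line(train_t, train_y)
train_mae = mean_abs_error(train_t, train_y, slope, intercept)
test_mae = mean_abs_error(test_t, test_y, slope, intercept)
print(f"in-sample mean absolute error:     {train_mae:.1f}")
print(f"out-of-sample mean absolute error: {test_mae:.1f}")
```

The line fits its own window well and then fails badly once the underlying process changes regime. A good in-sample fit says little about future usefulness, which is why the out-of-sample score is the one that matters.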

In complex, high causal density environments, models are NOT appropriate for real-world decision making, given the unlimited freedom modelers have in choosing model specifications and parameters.

 
