July 5, 2020
Together, we can shape a better future using data science!
Bad Data: Don't be a Gullible Fool
Professional data scientists rank the quality and veracity of data as a major concern. Recently we have seen a significant rise in the amount of untruthful data and in the creation of false data. During the COVID19 pandemic, we often see both untruthful and truthful data taken out of context, creating misleading interpretations. Data scientists sometimes call this torturing the data to fit a narrative or theory.
One major issue with data science results is the truthfulness of data, also known as "data veracity". In the past few years we have seen a rapid rise in the creation of false data and in misleading data presentation. A data veracity problem arises when data is false or inaccurate, whether it was falsified intentionally, negligently or by mistake.
Data veracity may be distinguished from data quality, which is usually defined as the reliability and application efficiency of data and is sometimes used to describe incomplete, uncertain or imprecise data.
The truthfulness or accuracy of data takes precedence over data quality issues: if data is objectively false, then data science results are meaningless and unreliable. Such results may create an illusion of reality that leads to bad or sub-optimal decisions, and sometimes to fraud with civil or criminal liability.
- Failure of Institute for Health Metrics & Evaluation COVID19 Model
- Full Genome COVID19 Viral Sequences in Israel
- Assessing Big Five Personality Traits Using Real-life Facial Images
- Oxford COVID19 Government Response Tracker
New Books from DSA Store:
- Visualize This: The FlowingData Guide to Design, Visualization, and Statistics
- The Naked Future: What Happens in a World That Anticipates Your Every Move?
New DSA Videos:
New DSA Resources:
- Sentinel Surveillance of SARS-CoV-2 in Wastewater Anticipates Occurrence of COVID19
- Citizenship in Networked Age - Agenda for Rebuilding Our Civic Ideals
© 2020 Data Science Association, Inc. — All Rights Reserved.