Tell a story or perish?

A meme has been steadily gaining strength over the past couple of years: the output of a data science analysis is a report in which a "story" should be told, and if you can't tell a story, you're not communicating effectively.

This meme has popped up in the business press, major Data Science websites, individual blogs, and even Data Science courses.

Phrasing scientific results in the form of a story is all well and good, but what happens when storytelling, rather than following the scientific method, becomes the primary goal? By harping on "telling a story," aren't we setting ourselves up for the same perverse incentives that have led to the flawed scientific research that the Economist last year labeled a load of rubbish?

This came up in the Data Science Association's (DSA) panel discussion last year. Proper use of the scientific method starts with a hypothesis and an experimental approach, followed by execution of the experiment and reporting of the results, whether exciting or not. Pressure to come up with positive results leads to bogus results.

The "science" in "data science" means the scientific method should be followed, and the results should be reported whether "exciting" or not. Once the scientific method has been faithfully executed, then and only then should storytelling be brought in to communicate those results effectively. Even a negative result for a promising hypothesis can be made dramatic with skillful storytelling: "We were so sure of this hypothesis, when to our surprise there was no correlation..."

But if, in contrast, storytelling were made the primary goal, then that would drive hypothesis formation, experimental method, and possibly even manipulation of results and cherry-picking -- all violations of the Code of Conduct proposed by the Data Science Association.

Ethical data science means avoiding the "publish or perish" pressure (or in this case, "tell a story or perish") that has plagued the traditional scientific community.