(by John Johnson)
Revolution Analytics recently published the results of a poll indicating that JSM 2011 attendees consider themselves “data scientists.” Nancy Geller, President of the ASA, asks statisticians not to “Shun the ‘S’ word.” Yet a third take on the matter is the top tweet from JSM 2011 with Dave Blei’s quote “‘machine learning’ is how you say ‘statistics’ to a computer scientist.”
Comments about selection bias from Revolution’s poll aside (it was conducted as part of the free wifi connection in the expo), the shift from “statistics” to “analytics,” “machine learning,” “data science,” and other terms seems to reflect that calling oneself a “statistician” is just not cool or scares our colleagues. So I open the floor up to the question: has “statistics” become a dirty word?
You write, ‘Revolution Analytics recently published the results of a poll indicating that JSM 2011 attendees consider themselves “data scientists”’, which you cite as evidence of a ‘shift from “statistics” to “analytics,” “machine learning,” “data science,” and other terms …’
But the question in the poll was “How strongly do you identify with the statement, ‘I consider myself a data scientist’”, not “Which term most accurately describes your profession?” Many people might identify with both “data scientist” and “statistician”. So is there really evidence of a ‘shift from “statistics”’?
Two thoughts: 1) there’s an old saying that if you have to label it a science, it’s not, and 2) I imagine that “data scientist” sheds several connotations attached to “statistician”.
It may just be my own connotation, but since I don’t have a degree in statistics, I’d hesitate to call myself a statistician. Also, statistician, like mathematician, sounds like the person might be a theoretician in an ivory tower, while data scientist sounds much more practical.
I believe it depends on where most of the respondents came from. Statisticians in industry have usually become the go-to people when it comes to manipulating, sifting through, analyzing and interpreting the information gleaned from massive data sets. Hence, across the halls of corporations, one may hear managers and other decision makers say: “Hey, let us ask the data person. She will certainly provide the answer and recommend a solution.”
The line between statisticians and data analysts is blurred. The days when statisticians were the people to go to only for promotional mix models, regression models, segmentation, biostatistics analyses, etc. may be over. Statisticians are expected to manipulate large data sets, develop models, create presentations, tell the story and provide business recommendations.
I think university statistics and business departments may need to adapt to the situation, if they have not already.
Perhaps that is why the poll gave these results.
Felicien Kanyamibwa, PhD
http://www.aroni.us