Not long ago, the term “data science” meant nothing to most people, even those who worked in data. A likely response to the term was: “Isn’t that just statistics?”
These days, data science is hot. The Harvard Business Review called the job of “data scientist” the “Sexiest Job of the 21st Century.” Why did data science come to exist? And just what distinguishes data science from statistics?
The very first line of the American Statistical Association’s definition of statistics is “Statistics is the science of learning from data…” Given that the words “data” and “science” appear in the beginning fragment of this definition, one might assume that data science is just a rebranding of statistics. A number of Twitter humorists certainly have:
“A data scientist is a statistician who lives in San Francisco.”
“Data Science is statistics on a Mac.”
While there’s a grain of truth in these jokes, the reality is more complicated. Data science, and its differentiation from statistics, has deep roots in the history of computers.
Statistics was developed primarily to help people deal with pre-computer data problems, such as testing the impact of fertilizer in agriculture or figuring out the accuracy of an estimate from a small sample. Data science emphasizes the data problems of the 21st century, such as accessing information in large databases, writing code to manipulate data, and visualizing data.