Here are some fun maps of Austin-area houses based on Redfin data from 2015. The red dots are homes priced above $250K; the blue dots are homes priced below $250K. Note: only houses with at least 1,250 square feet, 2 bathrooms, and 3 bedrooms are included. All graphics were generated using R.
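For readers who want to reproduce the filtering and coloring rule, here is a minimal sketch. The original graphics were made in R; this sketch uses Python instead, and the `Home` record, its field names, and the sample listings are all made up for illustration (they are not the Redfin schema or real data). It also assumes "at least" applies to all three size criteria.

```python
from dataclasses import dataclass
from typing import List, Optional

PRICE_THRESHOLD = 250_000  # dollars; the red/blue cutoff described above


@dataclass
class Home:
    # Hypothetical record layout, not the actual Redfin export format.
    price: int
    sqft: int
    baths: float
    beds: int


def meets_criteria(home: Home) -> bool:
    """Keep only homes with >= 1,250 sq ft, >= 2 baths, and >= 3 beds."""
    return home.sqft >= 1250 and home.baths >= 2 and home.beds >= 3


def dot_color(home: Home) -> Optional[str]:
    """Red above the threshold, blue below it.

    The post does not say how a home priced at exactly $250K is drawn,
    so this sketch returns None for that case (an assumption).
    """
    if home.price > PRICE_THRESHOLD:
        return "red"
    if home.price < PRICE_THRESHOLD:
        return "blue"
    return None


def colors_for(homes: List[Home]) -> List[Optional[str]]:
    """Color every home that passes the size filter, in input order."""
    return [dot_color(h) for h in homes if meets_criteria(h)]


# Invented sample listings, for illustration only.
sample = [
    Home(price=310_000, sqft=1800, baths=2.0, beds=3),  # kept, above cutoff
    Home(price=199_000, sqft=1300, baths=2.0, beds=3),  # kept, below cutoff
    Home(price=450_000, sqft=1100, baths=2.0, beds=3),  # dropped: too small
    Home(price=180_000, sqft=1500, baths=1.0, beds=3),  # dropped: one bath
]

if __name__ == "__main__":
    print(colors_for(sample))
```

Plotting is left out on purpose: once each home has a color, any mapping library can draw the dots from latitude and longitude columns.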
Manager: We need to start doing predictive analytics.
Data scientist: Sure, what would you like to predict?
Manager: I don’t know, but I’ve been told that we aren’t getting value from data unless we are predicting things.
In the hype about predictive analytics, descriptive analytics has been terribly underrated. In addition, a misconception has arisen that data scientists only do predictive analytics.
Michael Watson, Peter Cacioppi, and Derek Nelson offer a refreshing perspective on the relative value of descriptive, prescriptive, and predictive analytics in Managerial Analytics: An Applied Guide to Principles, Methods, Tools, and Best Practices. They show the following well-known diagram and then proceed to tear it apart:
This diagram is visually interesting and can provoke a good discussion. And we, the authors of this book, used such a diagram extensively in the past. However, we now think that this diagram is misleading. In some cases, the diagram can be true. But, it is not universally true. And it may be true in only a small number of cases.
The value of the different types of analytics is tied to the problems you are solving, not the techniques themselves. Each type of analytics can have relatively minor impact on a business or completely change the business.
…good descriptive analytics applied to data sets concerning cancer patients has uncovered potential life-saving treatments for different types of patients. It would be hard to argue that this type of descriptive analytics isn’t extremely valuable.
There is tremendous opportunity to do better descriptive analytics using the modern tools of data science. In fact, in my experience, the majority of business questions can be answered entirely with descriptive analytics.
Businesses could benefit from a more holistic view of analytics where value from data is the only criterion, not whether a machine learning algorithm is required. When this mindset is adopted, analytics are developed to answer relevant business questions, not just for the sake of doing something “predictive.”
Watson, Cacioppi, and Nelson also debunk the myth that descriptive analytics are the easiest to implement:
It is also impossible to say which type of analytics is easiest to implement. You can have a simple descriptive analytics project where you load the data you have into a better reporting tool and immediately gain insight. Or you could spend two years implementing a full-blown descriptive analytics system that gives you access to every bit of data in your organization. Likewise for predictive and prescriptive analytics: You can do good work in an Excel spreadsheet or you can custom build systems that require years of effort and huge teams of people.
I think it is time to invite descriptive analytics back into the realm of data science and to remember that, as Wikipedia defines it, data science is the extraction of knowledge from large volumes of data.