The whispers have started: Big Data projects are sometimes yielding rather ho-hum ROIs, with data scientists underestimating both the cost and the time-to-market of their projects. Executive management teams are starting to ask, “Is Big Data just an endless money pit?”
Several voices have sounded this theme recently:
- CIO magazine reports that over half of Big Data projects fail to meet their sponsors’ expectations.
- A survey showed an average return of around 55 cents per dollar invested, whereas organizations had expected a return of better than $3 per dollar invested over a multi-year project.
- A report co-authored by IBM and a non-profit research group writes disparagingly of the “deafening hype” around Big Data and warns that “evidence is beginning to show that the return on big data investments to date is less than promised.”
- Even defenders of Big Data ROI, such as Chuck Schaeffer, admit that many companies struggle with their implementations, owing partly to the fact that “with more data comes more noise”, and that Big Data is “not for everybody” as of today.
- Medical technology blogger Derek Lowe explains how Google’s attempt to use Big Data to track flu outbreaks in the US was a big flop.
When IBM’s Big Data Evangelist, James Kobielus, found it necessary last year to write a post busting ten myths about data scientists, it was an indication that the hype machine had gone stratospheric over this field. The more recent stirrings, listed above, tell us that the hype is giving way to more down-to-earth sentiments.
Wise are those who are fooled neither by the hype nor by the post-hype deflationary mood swing. I have always believed the ROI on Big Data will eventually be massive, but it will not happen overnight. Right now there are a few challenges making our immediate progress more difficult:
- “Honk if you’re a data scientist!” Everybody and his brother (and sister) is a self-proclaimed “data scientist”, but are they really?
Here’s a hint (and please imagine a Jeff Foxworthy voice-over):
“If you haven’t got ‘PhD’ after your name and you haven’t published research results in peer-reviewed venues, you might not be a data scientist.”
This is true even if you were previously the hottest code jockey or BI specialist from [insert household-name Internet company here] or the most whiz-bang Wunderkind from the CompSci department at [insert top-tier University name here]. Indeed, the flip side of the mad hiring frenzy for “data scientists” is that, as job-seekers have redefined their work histories on LinkedIn to match this title and academic institutions have stretched their programs to cast more of their new graduates into it, we may have watered down the definition in more cases than we want to admit.
- “Science, Shmai-ence!” Data scientists are not enough. Even if you have a damn clever scientist working with your data, you need not only a strong business case — one that could not be done any other way than by working with Big Data — but also a business partner who can make that case to stakeholders, shaping the scientist’s work so that it really helps an organization in a way that they can’t help but recognize. Instead, many places are taking a “build it and profits will come” approach. Ummm, how often should we expect that to work?
- “Is there an app for that? No.” While some enterprise Big Data projects which center on CRM systems already have their data warehousing and analytics tools fully dialed in, when it comes to the consumer Internet, where Web browsing, UGC and social media data are the focus, our platforms are still mostly first-generation. Other than Hadoop/MapReduce, any tools that are specifically aimed at the Big Data of a Web 2.0 world still place heavy demands on their implementers to combine and customize, tune and tweak, guess and test. In order for Big Data to come of age among Internet companies, we’ll need a couple more generations of evolution in not only our data stores but also our analytics — including both numerical and text analytics. We’re still in the early days of all this.
So my answer to the question in the title of this post is “Yes and No.” Yes, the craze, the irrational, jump-on-the-bandwagon, overblown-expectations-laden hype fest, is finally past its high point. But that just means that the real, substantive merit of Big Data lies squarely in front of us. The worst of the craze is over, and the best is yet to come.