Despite the inevitable burnout now following the hype, more than half of enterprises believe their investment in Big Data over the next three years will outstrip past investment in information management. Companies are even beginning to name analytics and Big Data leaders to senior management roles. Time Inc. recently hired a chief data officer to better tap its print and digital audiences, and interest in the role is soaring.
Accordingly, many organizations now find themselves reluctantly chasing the trend, realizing that they’re investing not to get ahead but to keep up, and accepting diminished returns on investment now that the competitive risk of falling behind has grown. However, savvy Big Data followers can still lap the field if they’re smart about how they go about it. Here are five suggestions:
- Big Data is a means, not an end.
Run for the hills if someone pitches you a “Big Data strategy.” That usually means a vendor has a strategy for extracting cash from your pocket, or some internal advocate is empire-building on the back of a “capability development” effort.
The most important thing you can do is to ask how some stream of information is relevant to improving your business results. A great example of this is re-targeting website visitors who don’t convert. “If I can spend a little bit of money on post-visit display advertising to persuade a near-converter to come back and buy from me, it would be worth X. Now, what do I need to do to go find and advertise to those folks?”
There’s a Big Data solution behind that question, but with such a tight frame you should evaluate it on its real merits and not its hype.
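The retargeting question above is really a piece of break-even arithmetic. A minimal sketch of that arithmetic, where every rate and dollar figure is an illustrative assumption rather than data from any real campaign:

```python
# Back-of-the-envelope value of re-targeting a near-converter.
# All numbers below are illustrative assumptions, not real campaign data.

def max_retargeting_spend(return_rate, conversion_rate, avg_order_value, margin):
    """Most you can pay to re-engage one non-converting visitor and break even.

    return_rate:     chance a retargeted visitor comes back to the site
    conversion_rate: chance a returning visitor then buys
    avg_order_value: average order size, in dollars
    margin:          gross margin on that order
    """
    expected_profit = return_rate * conversion_rate * avg_order_value * margin
    return expected_profit

# Example: 8% return, 5% conversion, $120 order, 30% margin
x = max_retargeting_spend(0.08, 0.05, 120.0, 0.30)  # 0.08 * 0.05 * 120 * 0.30
print(f"Worth up to ${x:.2f} per near-converter")
```

With a tight frame like this, "what do I need to do to go find and advertise to those folks" becomes a comparison against a known per-visitor ceiling, which is exactly the kind of real-merits evaluation the hype obscures.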
A corollary of this guiding principle is to think backward from potential actions, not forward from capabilities. Melanie Murphy, senior director of customer analytics at Bed Bath & Beyond, says “If you can’t act on a model, for example, you probably shouldn’t build it.” Considering (in advance) the actions that will be taken from an analysis or how a model will be deployed directly impacts the way the analysis is conducted or how a model is developed. In Melanie’s experience, this type of planning leads to a higher success rate, saves valuable time in rebuilding models, and avoids other “bridge-to-nowhere” analyses.
- Be hypothesis-driven first, and “emergent” second.
In the Age of the Machine, it’s thought that the hypothesis is dead. Prophets of this age suggest you simply toss all the data into a box and let it sort out what variables are significant and what models work best. This is great if your data comes free, clean, and gift-wrapped. For the rest of us, it still pays to think carefully in advance about which data we think might be relevant to the question at hand.
Scott McDonald, founder of Nomos Research and former senior vice president of research and insights at Condé Nast, also suggests there’s a communications benefit: hypotheses make it easier to engage with clients to help explore potential variables to include in your analysis.
- A little homework goes a long way.
There are some terrific, free resources out there for making yourself smart about how this stuff works before you start writing six-, seven-, and eight-figure checks to vendors. MapR and Hortonworks have great tutorials on Hadoop (a tool for organizing and pre-processing massive data sets); I’ve been taking the edX “Analytics Edge” course and recommend it highly as a blend of introduction to (or review of) basic statistical analysis, coupled with practical, hands-on use of R, the stats-oriented programming language.
Lest you think this is a bit geeky, consider the example of Wayfair, the highly successful online furniture retailer. Not long ago, fully half of its marketing team was functionally literate in SQL, the database query language. This enabled them to run their own custom reports and analyses against the firm’s data sets, so they could better manage their businesses and invest more thoughtfully in fancier analytics and reporting front ends.
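To make "functionally literate in SQL" concrete, here is the flavor of ad-hoc report such a marketer might run. The table, columns, and figures are hypothetical, and an in-memory SQLite database stands in for the firm's warehouse:

```python
# Self-serve reporting sketch. Table name, columns, and figures are
# hypothetical; in-memory SQLite stands in for a real data warehouse.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (channel TEXT, revenue REAL)")
con.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("email", 120.0), ("search", 80.0), ("email", 45.0), ("display", 60.0)],
)

# Revenue by marketing channel, highest first — a typical custom report
# that needs no analyst, no ticket, and no fancy front end.
report = con.execute(
    """SELECT channel, SUM(revenue) AS total
       FROM orders
       GROUP BY channel
       ORDER BY total DESC"""
).fetchall()
for channel, total in report:
    print(f"{channel:8s} {total:8.2f}")
```

A few lines of `SELECT ... GROUP BY` is the whole skill; the leverage comes from not having to wait on anyone else to answer your own question.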
- Be a Big Data Governance “Libertarian.”
Classic data warehouse efforts started by spending a long time to clean and organize data before letting users anywhere near it. This spawned the creation of data governance structures and processes that looked for all the world like police states.
At Lenovo, Mohammed “Mo” Chaara, a director in the corporate analytics group, suggests a three-tier “analytic maturity” model that he has used to guide his work. In the first stage, the lowest level of maturity, he’s had to prove the value of analyzing a particular opportunity. At this stage he stays close to the business, working with the data they’ve either already got or have access to through existing permissions. “I actually avoid contact with IT at this stage,” he says.
“They’re typically very resource-constrained, and there’s no sense in having them spend scarce cycles planning to manage scenarios that may not go anywhere after we explore them.” At the second level, mid-maturity, Mo uses the samples of data gathered to that point to develop a feel for the data management and governance challenges involved in using the relevant data on a regular basis. Once he’s got a clear idea of what kind of management the necessary data will require to support ongoing operationalization of an insight, he brings in IT to review what it will take to make it all work.
“Ironically,” he notes, “things are usually approached the opposite way.” He adds, illustrating with examples from his past experience, “An analytic project starts with foundational thinking about the universe of data that might be used for a class of analysis, and how it might be physically and logically governed and defined. The problem is that nine months later, you have nothing to show for it, in terms of any business insight or result.” So, he continues, “Start with the data you have. Any good analyst can extract value from dirty, incomplete data, at least enough to get a sense for whether there’s value worth pursuing further.”