Big Ideas

Taming the Data Deluge

Four ways smart technology can transform massive data sets into real profits.

by Venki Rajah

February 2012

It sometimes seems that data now rivals oxygen as a vital resource. According to IDC, humans created 150 exabytes of data in 2005, a figure that grew eightfold to 1,200 exabytes by 2010. Researchers estimate that enterprise data grows nearly 60 percent per year (90 percent of it unstructured) and that the average company stores 200 terabytes of data.

Four trends are driving this growth in data: the capture of detailed data more frequently across every customer interaction; the prevalent use of multimedia; the widespread adoption of social media; and the rollout of intelligent sensors embedded in physical devices that can sense, create, and communicate data.

This flood of information creates a wealth of opportunities for businesses, but several obstacles limit their ability to turn data into profit. The growth of high-volume, low-density data has outstripped companies' capacity to provision and operate the required infrastructure efficiently. Existing infrastructure and data architectures leave enterprises unable to handle unstructured data. A lack of enterprise-ready statistical analysis tools prevents the kind of analysis needed to spot trends. And executives cannot make real-time analytical decisions because they lack a user interface that surfaces actionable information.

To tame the data deluge and drive new profits, executives need to acquire four critical capabilities:

1. Fast, cheap capture of high-volume, unstructured data. A four-engine jumbo jet can create 640 terabytes of data in just one crossing of the Atlantic Ocean. Multiply that by the 25,000-plus flights a day and you get a sense of the volume of data involved in ensuring flight safety. For managers who see value in capturing this kind of high-volume data with minimal downtime and at low cost, deploying a purpose-built system that can process unstructured data is critical.
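Capturing data like this typically means accepting records as they arrive, with no fixed schema, and deferring structure until read time ("schema on read"). The Python sketch below illustrates that pattern at its simplest, appending raw JSON lines to a log; the file name and sensor payloads are invented for illustration and do not represent any particular vendor's system.

```python
import json
import os
import tempfile

# Invented example events: each record may have different fields,
# mimicking unstructured or semi-structured sensor and text data.
events = [
    {"sensor": "engine-1", "reading": {"egt_c": 612, "vibration": 0.4}},
    {"sensor": "engine-2", "note": "free-text maintenance remark"},
]

# Append-only JSON Lines log: no schema is declared up front,
# so nothing is rejected at capture time (illustrative path).
path = os.path.join(tempfile.mkdtemp(), "flight_events.jsonl")
with open(path, "a") as log:
    for event in events:
        log.write(json.dumps(event) + "\n")

# Structure is imposed only when the data is read back.
with open(path) as log:
    captured = [json.loads(line) for line in log]
print(captured == events)  # True
```

The append-only, schema-free write path is what keeps capture fast and cheap; the cost of interpreting the data is paid later, at analysis time.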

2. Cost-effective organization of structured and unstructured data. A recent conversation with a telecommunications executive revealed that his company now stores more than 10 trillion rows of data. An engineered system, with performance-grade hardware and optimized Hadoop cluster software, can cost-effectively organize these data sets and ease the analysis process.
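The division of labor inside a Hadoop cluster boils down to three phases: map, shuffle, and reduce. The Python sketch below shows those phases on a handful of invented call-detail records; a real deployment would run the same logic in parallel over trillions of rows sharded across many machines.

```python
from collections import defaultdict

# Toy call-detail records: (subscriber_id, minutes_used).
# Names and values are invented for illustration.
records = [
    ("alice", 12), ("bob", 7), ("alice", 3),
    ("carol", 20), ("bob", 5),
]

def map_phase(rows):
    """Map: emit a (key, value) pair per input record."""
    for subscriber, minutes in rows:
        yield subscriber, minutes

def shuffle(pairs):
    """Shuffle: group values by key, as Hadoop does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: aggregate each key's values into a total."""
    return {key: sum(values) for key, values in groups.items()}

totals = reduce_phase(shuffle(map_phase(records)))
print(totals)  # {'alice': 15, 'bob': 12, 'carol': 20}
```

Because each phase works on independent chunks, the cluster can scale out by adding commodity nodes rather than scaling up a single machine, which is where the cost-effectiveness comes from.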

3. Trend spotting via statistical analysis. Retailers analyze basket data to offer the right promotions based on a particular customer’s preferences. Enterprise-grade engineered systems with the ability to support the R statistical programming language can run these types of queries securely inside the database, improving performance and supporting better decisions.
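A minimal form of basket analysis simply counts which items are purchased together. The Python sketch below uses invented baskets; an engineered system would run the equivalent query in-database (for example via R) over millions of transactions rather than in application code.

```python
from collections import Counter
from itertools import combinations

# Invented basket data: each set is one customer transaction.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "cereal"},
    {"bread", "butter", "jam"},
]

# Count how often each pair of items appears in the same basket.
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# The most frequent pair is a natural candidate for a joint promotion.
top_pair, count = pair_counts.most_common(1)[0]
print(top_pair, count)  # ('bread', 'butter') 3
```

Running this inside the database avoids shipping the raw transaction rows to a separate analysis tier, which is what makes the in-database approach faster at scale.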

4. Interfaces that help users turn data into action. A multinational consumer goods company’s entire C-suite meets every day and pores over a real-time sales dashboard to drive critical sales decisions. Engineered systems with an in-memory database can drive real-time performance data to the dashboards, giving executives the most up-to-date information with which to make business decisions.
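The dashboard pattern reduces to updating aggregates in memory on every incoming event, so reads never wait on a batch job. A toy Python sketch of that idea follows; the region names and amounts are invented, and a production system would keep these aggregates in an in-memory database rather than a process-local object.

```python
from collections import defaultdict

class SalesDashboard:
    """Minimal in-memory aggregator: per-region revenue is updated on
    every sale event, so any read reflects the latest data."""

    def __init__(self):
        self.revenue_by_region = defaultdict(float)

    def record_sale(self, region, amount):
        # Each event updates the aggregate immediately; no batch delay.
        self.revenue_by_region[region] += amount

    def snapshot(self):
        # What the executives' dashboard would render.
        return dict(self.revenue_by_region)

dash = SalesDashboard()
dash.record_sale("EMEA", 1200.0)
dash.record_sale("APAC", 800.0)
dash.record_sale("EMEA", 300.0)
print(dash.snapshot())  # {'EMEA': 1500.0, 'APAC': 800.0}
```

The design choice is to pay a small cost per write (updating the aggregate) in exchange for instant reads, which is exactly the trade an executive dashboard wants.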

McKinsey & Company estimates that retailers with the right processes in place to properly utilize massive amounts of data can benefit from a potential 60 percent increase in operating margins and a 0.5 percent annual productivity boost. Companies would do well to harness the data deluge ahead of the rising tide.

Photography by Shutterstock