Big Data Challenges & Considerations
By Neela Chaudhari on Apr 25, 2014
By: Murad Fatehali
While 'Big Data' is dominating a lot of media and executive attention (it's a Top-5 Initiative according to IDC Retail Insights 2014 Predictions), the underlying considerations and challenges of Big Data are unfortunately being trivialized. As corporations and people make the Internet a more fundamental part of life (from both social and transactional perspectives), the volume of data that gets created, stored, retrieved, and analyzed will grow commensurately, and likely exponentially. Although many factors and assumptions go into Big Data discussions, five pivotal dimensions are outlined below:
1. Size: This is the most obvious and most talked-about factor. Everybody seems to understand that, given the rise of the Internet and the plethora of apps in the hands of consumers and users, the growth in the scale and volume of data is truly astonishing (some say the quantity is doubling every 2 years). What gets neglected, however, are the challenges of storing and analyzing vast volumes of data captured in a variety of formats, for example: text-based (tweets, SMSs), graphical (pictures and drawings), and audio-visual (podcasts, movies, etc.). For anyone interested in making sense of Big Data, how efficiently the data gets stored and retrieved will be a key factor - no wonder we are seeing an increase in the number and capacity of purpose-built storage systems and lightning-fast data-crunching machines.
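To put the doubling figure in perspective, here is a minimal sketch, assuming the "doubling every 2 years" rate holds steadily (the figure is only something "some say", so treat the results as illustrative). The function name and the starting volume of 1.0 are hypothetical.

```python
def projected_volume(start_volume: float, years: float,
                     doubling_period_years: float = 2.0) -> float:
    """Return the data volume after `years` of steady doubling.

    Assumes (hypothetically) that volume doubles once every
    `doubling_period_years`, so growth is 2 ** (years / period).
    """
    return start_volume * 2 ** (years / doubling_period_years)


if __name__ == "__main__":
    # Express growth as a multiple of today's volume (start_volume = 1.0).
    for years in (2, 4, 10):
        factor = projected_volume(1.0, years)
        print(f"After {years} years: {factor:g}x the starting volume")
```

Under that assumption, a storage plan sized for today's volume would need roughly 32 times the capacity in ten years, which is why storage and retrieval efficiency matter as much as raw size.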
2. Sources: Big Data discussions usually refer to ‘external’ sources like the Internet (Facebook, Twitter) and media (digital or otherwise), but often overlooked are the ‘internal’ data sources that can be mined for insight - the transactional and operational systems supporting marketing, sales, fulfillment, and service, in addition to the company’s own BI/reporting systems. Ask any CMO and they will tell you about the widening gaps in data caused by a lack of integration and data governance. Many companies struggle to say how many customers they have, not to mention identifying which are the most profitable or the most loyal. Since the data opportunities inside the company have not been fully explored, diving into external sources of Big Data without adequate forethought can add not only cost but also complexity and confusion. In our engagements with customers, while ERP systems and POS data remain good inputs into a company’s Business Intelligence and reporting efforts, we increasingly see executives asking for data from other sources in order to develop ‘personalized’ offers and solutions, for example:
a. Machine usage data about performance and diagnostics using sensors (vehicle-to-smartphone connectivity devices for smarter driving from: dash.by, automatic.com, metromile.com, etc.)
b. Medical data from patients/public records and research studies (Fitbit and wearable technologies, research findings, etc.)
c. Geographic and telemetric data from devices, maps, GPS signals, user tags and flags (Google Maps, in-vehicle trackers, etc.)
d. Smartphone check-ins (Foursquare, etc.)
e. Social networks that capture trends (Twitter, etc.)
f. Content sites (Wikipedia, etc.)
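The customer-count problem described under Sources can be illustrated with a small sketch. It assumes, purely for illustration, that each internal system exports customer records with an email field and that a normalized email is a good enough matching key; real master-data matching is usually far more involved. The system names, field names, and records below are all hypothetical.

```python
def normalize_email(email: str) -> str:
    """Trim whitespace and lowercase so trivially different spellings match."""
    return email.strip().lower()


def count_unique_customers(*systems: list) -> int:
    """Count distinct customers across systems, keyed on normalized email."""
    seen = set()
    for records in systems:
        for record in records:
            seen.add(normalize_email(record["email"]))
    return len(seen)


# Hypothetical exports from two internal systems.
crm_records = [
    {"name": "Ann Lee", "email": "Ann.Lee@example.com"},
    {"name": "Bo Chen", "email": "bo@example.com"},
]
pos_records = [
    {"name": "A. Lee", "email": " ann.lee@example.com "},  # same person as in CRM
    {"name": "Cy Diaz", "email": "cy@example.com"},
]

# Adding up per-system totals double-counts customers present in both systems.
naive_total = len(crm_records) + len(pos_records)
unique_total = count_unique_customers(crm_records, pos_records)

if __name__ == "__main__":
    print(f"Naive total: {naive_total}, unique customers: {unique_total}")
```

Even in this toy case the per-system totals overstate the customer base, which is the kind of gap that integration and data governance are meant to close before external sources are layered on top.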
3. Speed: The flow of data is something that can no longer go unnoticed - it is instant and constant. According to then-Google CEO Eric Schmidt, “There were 5 exabytes of information created between the dawn of civilization through 2003, but that much information is now created every 2 days, and the pace is increasing.” Clearly, the rate of incoming data can quickly alter any trends derived from historical data and render meaningless any campaigns that don’t take real-time response tracking into account. For example, without compute-intensive servers tied into the large-scale infrastructure hosting Facebook posts, Twitter feeds, and Google searches, it would be impossible to capitalize on fads, keep pace with up-to-the-minute news, understand critical events, and predict emerging trends [case in point: who knew that tracking the number of flu queries on Google can help predict localized outbreaks, or that analyzing global weather patterns can help predict harvest outputs or crop failures?]. The implications are huge for those who can robustly and quickly find relationships in data: commonality, lineage, and correlation are going to be big differentiators.
4. Standards: Unlike organizational transactions designed to meet compliance standards for financial reporting, Big Data does not need to conform to any such rules or conventions. The vast majority (some say as much as 90%) of data coming from uploads, social network chatter, comments, likes, or tweets is neither structured nor precise, to say nothing of its reliability (accuracy). The sheer variety of posts, representing wide-ranging interests and agendas across multiple sites and sources and repeatedly re-circulated, requires significant analytical prowess, speed, and talent to make intelligible.
5. Strategy: Big Data has the potential to help a company truly cross channels (understand personal profiles and preferences) and deliver world-class experiences to its customers; it can also help inform a company’s strategy by making sense of widespread data coming from public and private networks, internal and external systems, and individual and institutional sources. How Big Data plays into a company’s success will depend on the priorities of its leadership - and on their willingness to make data-driven decisions a reality and turn Big Data into more than just a buzzword. LinkedIn, for example, is analyzing vast quantities of data to generate billions of personalized recommendations every week - no small feat when viewed through the siloed and antiquated lens of yesterday’s IT landscape. According to Baseline, a 10% increase in data accessibility translates into an additional $65.7M in net income for a typical Fortune 1000 company.
The Big Data challenges and considerations outlined above are further affected by the availability of skilled talent - people who understand data in all its forms and can help govern and master it. Ethical and legal challenges compound the problem, since perspectives differ on what counts as personal, private, and sensitive information. To be sure, none of these Big Data challenges is expected to be resolved anytime soon - all the more reason for executives to be cognizant of the five Big Data considerations (size, sources, speed, standards, and strategy) as they chart new ground in their Big Data journey to unlock its value.
For additional details on the Insight program, please visit: www.oracle.com/insight
Murad Fatehali is a Senior Director with the Insight & Customer Strategy Team at Oracle Corporation; he focuses on helping Oracle customers solve Data Strategy & Integration challenges.