Tuesday Jun 02, 2015

MIT Technology Review: Diversity of Big Data Sources Creates Big Security Challenges

According to Oracle’s Neil Mendelson, many companies today make a key mistake in setting up their big data environments.

“In an effort to gain insights and drive business growth, companies can too often overlook or underestimate the challenge of securing information in a new and unfamiliar environment,” says Mendelson, vice president for big data and advanced analytics at Oracle. That lack of attention to big data security requirements can, of course, leave the organization open to attacks from any number of unknown sources. 

Other evolving circumstances also contribute to a wide range of security-related risks, hurdles, and potential pitfalls associated with big data. As the Cloud Security Alliance, an industry group, notes: “Large-scale cloud infrastructures, diversity of data sources and formats, the streaming nature of data acquisition, and high-volume inter-cloud migration all create unique security vulnerabilities.”

Learn more here about factors that complicate big data implementations, and what is required for organizations to secure the big data life cycle. 

Tuesday May 26, 2015

Security Inside Out Newsletter, May Edition

Get the latest Security Inside Out newsletter and hear about securing the big data life cycle, data security training, and more.

Also, subscribe to get the bi-monthly news in your own inbox.

Tuesday May 19, 2015

Securing the Big Data Life Cycle: A New MIT Technology Review and Oracle Paper

The big data phenomenon is a direct consequence of the digitization and “datafication” of nearly every activity in personal, public, and commercial life. Consider, for instance, the growing impact of mobile phones. The global smartphone audience grew from 1 billion users in 2012 to 2 billion today, and is likely to double again, to 4 billion, by 2020, according to Benedict Evans, a partner with the venture capital firm Andreessen Horowitz. 

“Companies of all sizes and in virtually every industry are struggling to manage the exploding amounts of data,” says Neil Mendelson, vice president for big data and advanced analytics at Oracle. “But as both business and IT executives know all too well, managing big data involves far more than just dealing with storage and retrieval challenges—it requires addressing a variety of privacy and security issues as well.”

With big data comes bigger responsibility. A new joint Oracle and MIT Technology Review paper drills into addressing these big data privacy and security issues.

Get the paper, Securing the Big Data Life Cycle, and learn more here.

Monday May 11, 2015

Using Earthquakes to Predict Cybercrime

Known for big surf and occasional big earthquakes, Santa Cruz, California has also been in the news regarding big data. In fact, the police force has used predictive analytics to capture would-be thieves. Two women were taken into custody after they were discovered peering into cars in a downtown parking garage. After further questioning, one was found to have outstanding warrants while the other was carrying illegal drugs.

The unique thing here is that the police officers were directed to the parking structure by a computer program that had predicted that car burglaries were especially likely there that day. This computer program, developed by PredPol, is based on models used for predicting aftershocks from earthquakes, a common occurrence here in California. The algorithms used generated projections about which areas and windows of time are at highest risk for future crimes.
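As a rough illustration of what an aftershock-style model looks like, the sketch below uses a self-exciting rate: each past event temporarily raises the expected rate of new events nearby, and that boost fades over time. This is a generic textbook form, not PredPol's actual algorithm, and the function name and all parameter values (background, boost, decay) are hypothetical.

```python
import math

def self_exciting_rate(now, past_event_times, background=0.2, boost=0.5, decay=1.0):
    """Expected event rate at time `now` (in days) for one location.

    background: baseline events per day (hypothetical value)
    boost:      extra rate added right after each past event (hypothetical)
    decay:      how quickly that extra rate fades, per day (hypothetical)
    """
    rate = background
    for t in past_event_times:
        if t < now:
            # Each earlier event contributes a boost that decays exponentially.
            rate += boost * math.exp(-decay * (now - t))
    return rate

# Two locations with the same number of past burglaries but different recency.
recent_cluster = [8.0, 9.0, 9.5]   # days on which burglaries were reported
old_cluster = [1.0, 2.0, 2.5]

rate_recent = self_exciting_rate(10.0, recent_cluster)
rate_old = self_exciting_rate(10.0, old_cluster)
print(f"recent cluster: {rate_recent:.3f} events/day")
print(f"old cluster:    {rate_old:.3f} events/day")
```

Under these assumptions, the location with the recent cluster scores higher than the one with the older cluster, which is the intuition behind ranking areas and time windows by risk.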

The Innovative Hacker 

Organizations struggle to mitigate threats due to the continuing evolution of hackers and their methods of attack. Since Robert Tappan Morris first introduced the infant internet to his Morris worm in 1988, organizations have been fighting tweakers, script kiddies, espionage, and organized crime. The problem is that every time a solution is devised, a new hack is created. It’s a never-ending cycle, and unfortunately, the turnaround time for hackers is getting shorter and shorter. They are innovating and sharing their innovations with others, who in turn take advantage and increase the number of effective attacks.

According to the 2015 Verizon Data Breach Investigations Report, with over 80,000 incidents examined, hackers have become more inventive, thinking up new tactics to evade defenses. “I hate to admit defeat,” says Jay Jacobs, co-author of the report, “but there does seem to be an advantage to the attackers right now.” (Source: Financial Times; access for a fee.)

Learning from the Past 

By analyzing and detecting patterns in years of past crime data, the Santa Cruz police department was able to determine hot spots of potential crime. In fact, on the day the two women were arrested, the program had identified the approximately one-square-block area where the parking garage is situated as one of the highest-risk locations for car burglaries.

According to the RAND Corporation's “Predictive Policing” study, there is strong evidence to support the theory that crime is statistically predictable. That’s because criminals tend to operate in their comfort zone. They commit the type of crimes that they’ve committed successfully in the past, generally at about the same time, in the same location, and with the same methods.

There is a connection between physical crime and the cybercrime organizations face today. To explain this connection further, the RAND Corporation found that prediction-led policing is not just about making predictions; “but it is a comprehensive business process, of which predictive policing is a part.” That process is summarized here to show the steps taken to analyze past information and prevent further criminal activity.

First, the police force collects and analyzes previous crime, incident, and offender data in order to produce predictions. These predictions uncover hot spots. Next, data from multiple and disparate sources in the community is combined, often using big data environments to quickly process terabytes of data. This data helps inform police where hot spots of potential crime will break out based on time of day, weather, recent criminal activity, and more. The predictions then inform how the police will respond to a potential incident. Criminals will then react to the changed environment: either they will be removed, or those still operating in the area may change their practices or move to a different area. Regardless of the response, the environment has been altered, the initial data will be out of date, and new data will need to be collected for analysis.

The Importance of Acquiring Good, Clean Data 

This entire process hinges on collecting data and on the quality of that data for making predictions.

Organizations today have the data necessary to make these types of predictions. In fact, our systems are churning out this data all the time through system server logs, database audits, event logs, and more. If crime is statistically predictable, and we have all the evidence right there in front of us, then we need to collect and analyze it.

Of course, the future of predictive analytics and machine learning is much more than analyzing audit and log data and monitoring our databases; however, these two critical practices are important first steps toward a comprehensive cybersecurity program.

The recent 2015 Verizon Data Breach Investigations Report highlights that once you have the data you need, analysis is performed using inferred or computed elements of the data. In order to mitigate data breaches, they suggest looking for anomalies within the following:
  • Volume or amount of content transfer, such as e-mail attachments or uploads
  • Resource access patterns, such as logins or data repository touches
  • Time-based activity patterns, such as daily and weekly habits
  • Indications of job contribution, such as the amount of source code checked in by developers
  • Time spent in activities indicative of job satisfaction or discontent
Although this data is all around us, the tough part is how to effectively and efficiently collect all of it securely, and then make sense of it to predict and prescribe future actions and prevent the next data breach.
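To make the first item in that list concrete, here is a minimal sketch of flagging unusual transfer volume for a single account. It compares each day with the account's own preceding history using a z-score. The function name, the threshold, and the sample numbers are all hypothetical; this is one simple technique, not a method prescribed by the Verizon report.

```python
from statistics import mean, stdev

def find_volume_anomalies(daily_totals, threshold=3.0, min_history=5):
    """Return (day_index, value, z_score) for days that deviate sharply
    from the account's own preceding history."""
    anomalies = []
    for i in range(min_history, len(daily_totals)):
        history = daily_totals[:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            continue  # no variation yet, so a z-score is undefined
        z = (daily_totals[i] - mu) / sigma
        if abs(z) >= threshold:
            anomalies.append((i, daily_totals[i], z))
    return anomalies

# Hypothetical megabytes uploaded per day by one account.
uploads_mb = [12, 15, 11, 14, 13, 12, 16, 240, 13, 14]
flagged = find_volume_anomalies(uploads_mb)
for day, value, z in flagged:
    print(f"day {day}: {value} MB (z = {z:.1f})")
```

One limitation of this sketch is that a large spike stays in the history and inflates the standard deviation for later days. A more robust variant might use the median and median absolute deviation instead.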

Monday Mar 16, 2015

Three Big Data Threat Vectors

The Biggest Breaches are Yet to Come

Where a few years ago we saw 1 million to 10 million records breached in a single incident, today we are in the age of mega-breaches, where 100 and 200 million records breached is not uncommon.

According to the Independent Oracle Users Group Enterprise Data Security Survey, 34% of respondents say that a data breach at their organization is "inevitable" or "somewhat likely" in 2015.

Combine this with the fact that the 2014 Verizon Data Breach Investigations Report tallied more than 63,000 security incidents—including 1,367 confirmed data breaches. That's a lot of data breaches.

As business and IT executives are learning by experience, big data brings big security headaches. Built with very little security in mind, Hadoop is now being integrated with existing IT infrastructure. This can further expose existing database data with less secure Hadoop infrastructure. Hadoop is an open-source software framework for storing and processing big data in a distributed fashion. Simply put, it was developed to address massive data storage and faster processing, not security.

With enormous amounts of less secure big data, integrated with existing database information, I fear the biggest data breaches are yet to be announced. When organizations are not focusing on security for their big data environments, they jeopardize their company, employees, and customers.

Top Three Big Data Threats

For big data environments, and Hadoop in particular, today's top threats include:
  • Unauthorized access. Built with the notion of “data democratization” (meaning all data was accessible by all users of the cluster), Hadoop is unable to stand up to rigorous compliance standards, such as HIPAA and PCI DSS, due to the lack of access controls on data. The lack of password controls, basic file system permissions, and auditing leaves the Hadoop cluster open to sensitive data exposure.
  • Data provenance. In traditional Hadoop, it has been difficult to determine where a particular data set originated and what data sources it was derived from. At a minimum, the potential for garbage-in, garbage-out issues arises; at worst, analytics that drive business decisions could be drawn from suspect or compromised data. Users need to know the source of the data in order to trust its validity, which is critical for relevant predictive activities.
  • DIY Hadoop. A build-your-own cluster presents inherent risks, especially in shops where there are few experienced engineers who can build and maintain a Hadoop cluster. As a cluster grows from small project to advanced enterprise Hadoop, every period of growth (patching, tuning, verifying versions between Hadoop modules, OS libraries, utilities, user management, etc.) becomes more difficult. Security holes, operational security, and stability may be ignored until a major disaster occurs, such as a data breach.
Big data security is an important topic that I plan to write more about. I am currently working with MIT on a new paper to help provide some more answers to the challenges raised here. Stay tuned.

Monday Mar 09, 2015

Security and Governance Will Increase Big Data Innovation in 2015

"Let me begin with my vision of the FTC and its role in light of the emergence of big data. I grew up in a beach town in Southern California. To me, the FTC is like the lifeguard on a beach. Like a vigilant lifeguard, the FTC’s job is not to spoil anyone’s fun but to make sure that no one gets hurt. With big data, the FTC’s job is to get out of the way of innovation while making sure that consumer privacy is respected."

- Edith Ramirez, Chairwoman, Federal Trade Commission

Ms. Ramirez highlights the FTC's role in protecting consumers from what she refers to as "indiscriminate data collection" of personal information. Her main concern is that organizations can potentially use this information in ways that ultimately compromise individual privacy. There are many instances in which data previously considered anonymous was correlated with other publicly available information in order to identify individuals.

Finding Out Truthful Data from "Anonymous" Information 

Her concerns are not unfounded; the highly referenced paper Robust De-anonymization of Large Sparse Datasets illustrates the sensitivity of supposedly anonymous information. The authors were able to de-anonymize records in the publicly available, "anonymous" dataset of 500,000 Netflix subscribers by cross-referencing it with the Internet Movie Database. They successfully identified records of users, revealing such sensitive data as the subscribers' political and religious preferences. In a more recent instance of big data security concerns, the public release of a New York taxi cab data set was completely de-anonymized, ultimately unveiling cab drivers' annual income and, possibly more alarming, the weekly travel habits of their passengers.
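The cross-referencing idea can be shown with a toy example. The sketch below links named public reviews to "anonymous" rating records by counting shared (title, date) pairs. It is a deliberately simplified stand-in, not the algorithm from the paper, and every record, name, and threshold in it is hypothetical.

```python
# Hypothetical "anonymous" release: names removed, ratings and dates kept.
anonymous_ratings = {
    "user_1041": {("Film A", "2005-03-01"), ("Film B", "2005-03-04"), ("Film C", "2005-04-11")},
    "user_2207": {("Film A", "2005-06-20"), ("Film D", "2005-06-21")},
}

# Hypothetical public reviews posted under real names on another site.
public_reviews = {
    "Pat Example": {("Film A", "2005-03-01"), ("Film C", "2005-04-11")},
    "Sam Sample": {("Film D", "2005-06-21"), ("Film E", "2005-07-02")},
}

def link_records(anonymous, public, min_overlap=2):
    """Match each named profile to the anonymous record sharing the most
    (title, date) pairs, if the overlap reaches min_overlap."""
    matches = {}
    for name, reviews in public.items():
        best_id, best_overlap = None, 0
        for anon_id, ratings in anonymous.items():
            overlap = len(reviews & ratings)  # set intersection
            if overlap > best_overlap:
                best_id, best_overlap = anon_id, overlap
        if best_overlap >= min_overlap:
            matches[name] = best_id
    return matches

matches = link_records(anonymous_ratings, public_reviews)
print(matches)
```

Even in this toy data, two matching data points are enough to tie one named person to one "anonymous" record, at which point every other rating in that record is exposed as well.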

Many large firms have found their big data projects shut down by compliance officers concerned about legal or regulatory violations. Chairwoman Ramirez highlights specific cases where the FTC has cracked down on firms it feels have violated customer privacy rights, including the United States vs. Google, Facebook, and Twitter. She feels that big data opens up additional security challenges that must be addressed.

"Companies are putting data together in new ways, comingling data sets that have never been comingled before," says Jeff Pollock, Oracle vice president for product management. "That’s precisely the value of big data environments. But these changes are also leading to interesting new security and compliance concerns."

The possible security and privacy pitfalls of big data center around three fundamental areas:

  • Ubiquitous and indiscriminate collection from a wide range of devices 
  • Unexpected uses of collected data, especially without customer consent 
  • Unintended data breach risks with larger consequences

Organizations will find big data experimentation easier to initiate when the data involved is locked down. They need to be able to address regulatory and privacy concerns by demonstrating compliance. This means extending modern security practices like data masking and redaction to the full big data environment, in addition to the must-haves of access, authorization and auditing.

Securing the big data lifecycle requires:

  • Authentication and authorization of users, applications and databases 
  • Privileged user access and administration 
  • Data encryption of data at rest and in motion 
  • Data redaction and masking for non-production environments
  • Separation of roles and responsibilities 
  • Implementing least privilege 
  • Transport security 
  • API security 
  • Monitoring, auditing, alerting and compliance reporting
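
As one illustration of the redaction and masking item above, the sketch below prepares a record for a non-production copy. It redacts card numbers in free text and replaces an e-mail address with a stable pseudonym. The function names, the salt, and the sample record are hypothetical, and this is not how any particular Oracle product implements masking.

```python
import hashlib
import re

def redact_card_number(text):
    """Replace all but the last four digits of 16-digit card numbers."""
    pattern = re.compile(r"\b(?:\d{4}[ -]?){3}(\d{4})\b")
    return pattern.sub(lambda m: "****-****-****-" + m.group(1), text)

def mask_email(email, salt="hypothetical-salt"):
    """Swap an address for a stable pseudonym so that joins still work
    in non-production copies."""
    digest = hashlib.sha256((salt + email.lower()).encode("utf-8")).hexdigest()
    return f"user_{digest[:12]}@example.invalid"

record = {"email": "Jane.Doe@example.com", "note": "Paid with 4111 1111 1111 1111 today"}
masked = {
    "email": mask_email(record["email"]),
    "note": redact_card_number(record["note"]),
}
print(masked)
```

The pseudonym is deterministic, so the same address always maps to the same masked value. That keeps test data joinable, but it also means the salt must be protected like any other secret.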

With Oracle, organizations can achieve all the benefits that big data has to offer while providing a comprehensive data security approach that ensures the right people, internal and external, get access to the appropriate data at the right time and place, within the right channel. The Oracle Big Data solution safeguards against malicious attacks and protects organizational information assets by securing data in motion and at rest. It enables organizations to separate roles and responsibilities and protect sensitive data without compromising the access of privileged users, such as database administrators. Furthermore, it provides monitoring, auditing, and compliance reporting across big data systems as well as traditional data management systems.

Learn more about Oracle Security Solutions.

This article has been re-purposed from the Oracle Big Data blog.  

Friday Sep 27, 2013

Oracle OpenWorld News: Oracle Big Data Appliance Secures Big Data in the Enterprise

Software Enhancements to Leading Big Data Appliance Help Organizations Secure Data and Accelerate Strategic Business Insights

While Hadoop provides a scalable foundation for Big Data projects, the lack of built-in security has been an obstacle for many enterprises. To meet this need, Oracle has enhanced the Oracle Big Data Appliance to include enterprise-class security capabilities for Hadoop using Oracle Audit Vault and Database Firewall.

By consolidating and analyzing the Hadoop audit trail, Oracle Audit Vault and Database Firewall can enforce policies to alert on suspicious or unauthorized activities. Additionally, the consolidated audit data allows organizations to demonstrate the controls and generate the reports needed for regulatory compliance and audits.
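
The general idea of alerting from a consolidated audit trail can be sketched in a few lines. The example below counts denied requests per user and flags anyone over a limit. The record layout, field names, and limit are hypothetical, and the code does not represent the policy mechanism in Oracle Audit Vault and Database Firewall.

```python
from collections import Counter

# Hypothetical, already-parsed audit records; field names are illustrative.
audit_records = [
    {"user": "etl_job", "action": "open", "path": "/data/sales/2015.csv", "allowed": True},
    {"user": "intern7", "action": "open", "path": "/data/hr/salaries.csv", "allowed": False},
    {"user": "intern7", "action": "open", "path": "/data/hr/reviews.csv", "allowed": False},
    {"user": "intern7", "action": "listStatus", "path": "/data/hr", "allowed": False},
    {"user": "analyst2", "action": "open", "path": "/data/hr/salaries.csv", "allowed": False},
]

def users_over_denial_limit(records, max_denials=2):
    """Return users whose denied requests exceed max_denials."""
    denials = Counter(r["user"] for r in records if not r["allowed"])
    return sorted(user for user, count in denials.items() if count > max_denials)

alerts = users_over_denial_limit(audit_records)
for user in alerts:
    print(f"ALERT: {user} exceeded the denied-access limit")
```

A single denied request is often an honest mistake, which is why this sketch alerts on a count rather than on every denial.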

Read the press release. 

