By Troy Kitch-Oracle on Mar 16, 2015
The Biggest Breaches are Yet to Come
- Unauthorized access. Hadoop was built on the notion of “data democratization,” meaning all data was accessible to all users of the cluster. Because it lacks access controls on data, it has been unable to meet rigorous compliance standards such as HIPAA and PCI DSS. The lack of password controls, basic file system permissions, and auditing leaves sensitive data in the cluster exposed.
- Data provenance. In traditional Hadoop, it has been difficult to determine where a particular data set originated and what data sources it was derived from. At a minimum, this creates the potential for garbage-in-garbage-out issues; at worst, analytics that drive business decisions could be based on suspect or compromised data. Users need to know the source of the data in order to trust its validity, which is critical for relevant predictive activities.
- DIY Hadoop. A build-your-own cluster presents inherent risks, especially in shops with few experienced engineers who can build and maintain a Hadoop cluster. As a cluster grows from a small project to an advanced enterprise Hadoop deployment, the work that comes with each period of growth (patching, tuning, user management, verifying versions between Hadoop modules, OS libraries, and utilities, and so on) becomes more difficult. Security holes and problems with operational security and stability may be ignored until a major disaster occurs, such as a data breach.
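
To make the data provenance point concrete, here is a minimal sketch in Python of what recording lineage for a data set could look like. It is an illustration only, not a Hadoop feature or API: `ProvenanceRecord`, `record_provenance`, `lineage`, and `is_unmodified` are hypothetical names, and the catalog is assumed to be a plain in-memory dictionary.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ProvenanceRecord:
    """Hypothetical lineage entry; not part of any Hadoop API."""
    dataset: str       # name of this data set
    source: str        # system or job that produced it
    parents: tuple     # names of the data sets it was derived from
    sha256: str        # content hash taken when the record was made
    recorded_at: str   # UTC timestamp in ISO 8601 form


def record_provenance(dataset, source, payload, parents=()):
    """Create a lineage entry for `payload` (bytes) at ingestion time."""
    return ProvenanceRecord(
        dataset=dataset,
        source=source,
        parents=tuple(parents),
        sha256=hashlib.sha256(payload).hexdigest(),
        recorded_at=datetime.now(timezone.utc).isoformat(),
    )


def lineage(name, catalog):
    """Walk parent links and return every upstream data set name."""
    seen = []
    stack = [name]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.append(current)
        record = catalog.get(current)
        if record is not None:
            stack.extend(record.parents)
    return seen[1:]  # drop the starting data set itself


def is_unmodified(record, payload):
    """Check that `payload` still matches the hash recorded at ingestion."""
    return hashlib.sha256(payload).hexdigest() == record.sha256


# Example: a summary derived from raw click data (made-up names and bytes).
raw = record_provenance("raw_clicks", "web-logs", b"a,b\n1,2\n")
daily = record_provenance("daily_summary", "etl-job", b"sum\n3\n",
                          parents=["raw_clicks"])
catalog = {r.dataset: r for r in (raw, daily)}
upstream = lineage("daily_summary", catalog)
```

With records like these, an analyst could ask which sources a report ultimately depends on, and whether the underlying bytes still match what was originally ingested, before trusting the result.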