By Jean-Pierre Dijcks on Mar 19, 2014
With the release of Big Data Appliance software bundle 2.5, BDA completes the encryption story underneath Cloudera CDH. BDA already came with network encryption, ensuring no network sniffing can be applied in between the nodes, it now adds encryption of data-at-rest.
A Brief Overview
Encryption of data-at-rest can be done in 2 modes. One mode leverages the Trusted Platform Module (TPM) on the motherboard to provide a key to encrypt the data on disk. This mode does not require a password or pass phrase but relies on the motherboard. The second mode leverages a passphrase, which in turn will be used to generate a private-public key pair generated with OpenSSL. The key pair is encrypted as well.
The passphrase encryption has a few more interesting aspects. For one, it does require the passphrase to be entered upon re-booting the system. Leveraging the TPM option does not require any manual intervention at reboot. On Big Data Appliance it is possible to regularly change the passphrase without impacting the encryption, or required re-encryption of the data.
Neither one of the encryption methods affect user access to user data. In other words, on an unprotected cluster a user that can read data before encryption will be able to read data after encryption. The goal is to ensure data is protected on physical media - like theft or incorrect disposal of a disk. Both forms protect from that, but only passphrase based encryption protects from disposal or theft of a server.
On BDA, it is possible to switch between these two methods. This does have impact on running the cluster as data needs to be re-encrypted. For this step the cluster will be down, however data is not duplicated, so there is no need to reserve double the space to do the re-encryption.
How to Encrypt Data
As with all installation or changes on Big Data Appliance you will leverage Mammoth to do the install with encryption or to make changes to the system if you are already in production. Before you set up either of the two modes of data-at-rest encryption, you should consider your requirements. Changing the mode - as described - is possible, but will require the cluster to be down for re-encryption.
Full Set of Security Features
Encryption - out-of-the-box is yet another feature that is specific to Oracle Big Data Appliance. On top of pre-configured Kerberos, Apache Sentry, Oracle Audit Vault Encryption now adds another security dimension. To read more about the full set of features start here.