Security has been an important theme across recent Big Data Appliance releases. Our most recent release includes encryption of data at rest
and automatic configuration of Sentry for data authorization. This is in addition to the security features previously added to the BDA
, including Kerberos-based authentication, network encryption and auditing.
Auditing data access across the enterprise - including databases, operating systems and Hadoop - is critically important, and often required for SOX, PCI DSS and other regulations. Let's take a look at a demonstration of how Oracle Audit Vault and Database Firewall delivers comprehensive audit collection, alerting and reporting of activity on an Oracle Big Data Appliance and Oracle Database 12c.
In this scenario, we've set up auditing for both the BDA and Oracle Database 12c.
The Audit Vault Server is deployed to its own secure server and serves as mission control for auditing. It is used to administer audit policies, configure activities that are tracked on the secured targets and provide robust audit reporting and alerting. In many ways, Audit Vault is a specialized auditing data warehouse. It automates ETL from a variety of sources into an audit schema and then delivers both pre-built and ad hoc reporting capabilities.
For our demonstration, Audit Vault agents are deployed to the BDA and Oracle Database 12c monitored targets; these agents are responsible for managing collectors that gather activity data. This is a secure agent deployment; the Audit Vault Server has a trusted relationship with each agent. To set up the trusted relationship, the agent makes an activation request to the Audit Vault Server; this request is then activated (or "approved") by the AV Administrator. The monitored target then applies an Agent Activation Key generated by the AV Server to complete the activation.
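The handshake can be sketched roughly as follows; the AVCLI and agentctl invocations are assumptions based on typical AVDF deployments, and the exact command syntax varies by release:

```bash
# On the monitored target: deploy the agent, which issues an activation
# request to the Audit Vault Server.
java -jar agent.jar -d /opt/avagent

# On the Audit Vault Server: the AV Administrator approves ("activates")
# the request, generating an Agent Activation Key.
avcli -u admin
AVCLI> ACTIVATE HOST 'bda-node01';

# Back on the monitored target: apply the activation key to complete
# the trusted relationship.
/opt/avagent/bin/agentctl start -k
```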
On the BDA, these installation and configuration steps have all been automated for you. Using the BDA's Configuration Generation Utility, you simply specify that you would like to audit activity in Hadoop. Then, you identify the Audit Vault Server that will receive the audit data. Mammoth - the BDA's installation tool - uses this information to configure the audit processing. Specifically, it sets up audit trails across the following services (a sample audit record follows the list):
- HDFS: collects all file access activity
- MapReduce: identifies who ran what jobs on the cluster
- Oozie: audits who ran what as part of a workflow
- Hive: captures changes that were made to the Hive metadata
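To give a flavor of what these trails capture, consider HDFS: every file access is recorded by the NameNode's audit log, one line per operation. A representative record - with invented values - looks like the following:

```
2015-03-10 14:02:31,337 INFO FSNamesystem.audit: allowed=false ugi=DrEvil (auth:KERBEROS) ip=/192.168.10.5 cmd=open src=/user/hr/salaries.txt dst=null perm=null
```

Each field - the user (ugi), the originating IP, the command and source path, and whether the access was allowed - maps naturally onto the audit dimensions reported by Audit Vault.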
There is much more flexibility when monitoring the Oracle Database. You can create audit policies for SQL statements, schema objects, privileges and more. Check out the auditor's guide for more details. In our demonstration, we kept it simple: we are capturing all SELECT statements on the sensitive HR.EMPLOYEES table, all statements made by the HR user and any unsuccessful attempt to select from any table in any schema.
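As a rough sketch, those three audit settings could be expressed with traditional audit statements along these lines (run from SQL*Plus as a suitably privileged user; 12c unified audit policies are an equivalent alternative):

```bash
sqlplus / as sysdba <<'SQL'
-- All SELECT statements against the sensitive table
AUDIT SELECT ON hr.employees BY ACCESS;
-- All statements issued by the HR user
AUDIT ALL BY hr BY ACCESS;
-- Unsuccessful SELECT attempts against any table, by anyone
AUDIT SELECT TABLE WHENEVER NOT SUCCESSFUL;
SQL
```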
Now that we are capturing activity across the BDA and Oracle Database 12c, we'll set up an alert that fires whenever suspicious activity is attempted against sensitive HR data in Hadoop:
In the alert definition shown above, a critical alert is defined as three unsuccessful attempts from a given IP address to access data in the HR directory. Alert definitions are extremely flexible - any audited field can be used as input to a conditional expression. Alerts are automatically surfaced on the Audit Vault Server's monitoring dashboard and can also be delivered via email to the appropriate security administrators.
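Conceptually, the alert boils down to a condition over audited fields plus a threshold and a grouping. A hypothetical rendering - the field names and syntax here are assumptions, not the exact AVDF expression language - might look like:

```
Condition : :EVENT_STATUS = 'FAILURE' AND :TARGET_OBJECT LIKE '/user/hr/%'
Threshold : 3          -- unsuccessful attempts before the alert fires
Group By  : CLIENT_IP  -- count failures per originating IP address
Severity  : Critical
```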
Now that auditing is configured, we'll generate activity by two different users: oracle and DrEvil. We'll then see how the audit data is consolidated in the Audit Vault Server and how auditors can interrogate that data.
The demonstration is driven by a few scripts that generate different types of activity by both the oracle and DrEvil users. These activities, sketched in the script excerpt after the list, include:
- an Oozie workflow that removes salary data from HDFS
- numerous HDFS commands that upload files, change file access privileges, copy files and list the contents of directories and files
- Hive commands that query, create, alter and drop tables
- Oracle Database commands that connect as different users, create and drop users, select from tables and delete records from a table
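A driver script along these lines would produce that activity; the paths, table names, credentials and Oozie endpoint below are invented for illustration:

```bash
#!/bin/bash
# HDFS activity: upload, change privileges, copy and list (illustrative paths).
hdfs dfs -put salaries.txt /user/hr/salaries.txt
hdfs dfs -chmod 600 /user/hr/salaries.txt
hdfs dfs -cp /user/hr/salaries.txt /user/hr/salaries2.txt
hdfs dfs -ls /user/hr

# Hive activity: create, alter and query a table.
hive -e "CREATE TABLE my_salarys (name STRING, salary INT)"
hive -e "ALTER TABLE my_salarys RENAME TO my_salaries"
hive -e "SELECT * FROM my_salaries LIMIT 10"

# Oozie activity: run the workflow that removes salary data from HDFS.
oozie job -oozie http://bda-node01:11000/oozie -config job.properties -run

# Oracle Database activity: connect, select and delete (illustrative credentials).
sqlplus hr/hr@orcl <<'SQL'
SELECT * FROM employees;
DELETE FROM employees WHERE employee_id = 999;
SQL
```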
After running the scripts, we log into the Audit Vault Server as an auditor. Immediately, we see our alert has been triggered by the users' activity.
Drilling down on the alert reveals DrEvil's three failed attempts to access the sensitive data in HDFS:
Now that we see the alert triggered in the dashboard, let's see what other activity is taking place on the BDA and in the Oracle Database.
Ad Hoc Reporting
Audit Vault Server delivers rich reporting capabilities that enable you to better understand the activity that has taken place across the enterprise. In addition to the numerous reports delivered out of the box with Audit Vault, you can create custom reports that meet your specific needs. Here, we are looking at a BDA monitoring report that focuses on Hadoop activities that occurred in the last 24 hours:
As you can see, the report captures all of the key elements required to understand an event: 1) when the activity took place, 2) the source service for the event, 3) what object was referenced, 4) whether or not the event was successful, 5) who executed the event, 6) the IP address (or host) that initiated the event, and 7) how the object was modified or accessed. Stoplight reporting is used to highlight critical activity - including DrEvil's failed attempts to open the sensitive salaries.txt file.
Notice that events may be related to one another. The Hive command "ALTER TABLE my_salarys RENAME TO my_salaries" generates two events. The first event is sourced from the Metastore: the ALTER TABLE command is captured and the metadata definition is updated. The Hive command also impacts HDFS, because the table is backed by an HDFS folder. Therefore, an HDFS event is logged that renames the "my_salarys" folder to "my_salaries".
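Schematically, with invented audit field values, the fan-out looks like this:

```bash
hive -e "ALTER TABLE my_salarys RENAME TO my_salaries"
# Resulting audit events (representative; values and paths assumed):
#   1) Hive Metastore : ALTER TABLE  object: my_salarys -> my_salaries
#   2) HDFS           : rename       src: /user/hive/warehouse/my_salarys
#                                    dst: /user/hive/warehouse/my_salaries
```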
Next, consider an Oozie workflow that performs a simple task: deleting the file "salaries2.txt" from HDFS. This Oozie workflow generates the following events (a minimal workflow definition is sketched after the list):
- First, an Oozie workflow event is generated indicating the start of the workflow.
- The workflow definition is read from the "workflow.xml" file found in HDFS.
- An Oozie working directory is created.
- The salaries2.txt file is deleted from HDFS.
- Oozie runs its clean-up process.
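For reference, a minimal workflow definition that performs the delete might look like this - the application path and file names are assumptions:

```bash
# Create the workflow definition and stage it in HDFS (paths illustrative).
cat > workflow.xml <<'XML'
<workflow-app name="delete-salaries" xmlns="uri:oozie:workflow:0.4">
  <start to="rm-salaries"/>
  <action name="rm-salaries">
    <fs>
      <delete path="${nameNode}/user/hr/salaries2.txt"/>
    </fs>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>fs action failed</message>
  </kill>
  <end name="end"/>
</workflow-app>
XML
hdfs dfs -put workflow.xml /user/oracle/apps/delete-salaries/workflow.xml
```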
The Audit Vault reports reveal all of the underlying activity that is executed by the Oozie workflow. Its flexible reporting allows you to sequence these independent events into a logical series of related activities.
The reporting focus so far has been on Hadoop - but one of the core strengths of Oracle Audit Vault is its ability to consolidate all audit data. We know that DrEvil had a few unsuccessful attempts to access sensitive salary data in HDFS. But what other unsuccessful events have occurred recently across our data platform? We'll use Audit Vault's ad hoc reporting capabilities to answer that question. Report filters enable users to search audit data based on a range of conditions. Here, we'll keep it pretty simple: let's find all failed access attempts across both the BDA and the Oracle Database within the last two hours:
Again, DrEvil's activity stands out. As you can see, DrEvil is attempting to access sensitive salary data not only in HDFS but also in the Oracle Database.
Security and integration with the rest of the Oracle ecosystem are table stakes for every Oracle Big Data Appliance release. Oracle Audit Vault and Database Firewall's auditing of data across the BDA, databases and operating systems epitomizes this goal - providing a single repository and reporting environment for all of your audit data.