Lock up Your Data Warehouse - Part 3
By KLaker on Nov 30, 2009
How best to manage security within the data warehouse is a huge subject area to cover. In previous postings on this blog I have looked at how to secure the transmission of sensitive data over your corporate network. This helps ensure line sniffers cannot pick up useful information by simply watching the packets go round the network.
Taking security to the next level is all about understanding who (or what) is accessing your data warehouse and exactly what they are doing to the tables and associated data within the data warehouse. Oracle Database comes complete with a large set of security features to help you create a safe, secure and compliant data warehouse. While investigating how auditing works in 11g I uncovered a huge information pool relating to database activity that I suspect most customers may not be aware of. Developing and implementing a security strategy can take a considerable amount of time and effort but in my opinion there are some interesting benefits that arise from collecting all this information about database activity. As always with Oracle there is so much more to each feature if you just take a few minutes to think "out of the box". This is definitely one of those cases.
I am not going to explore in detail all the data auditing/security features in depth because there are so many blog articles already on the web that explain how to set-up the various security features. However, here is a brief introduction to the latest data auditing features courtesy of the 11gR2 documentation, specifically the Oracle Database Security Guide 11g Release 2 (11.2), Chapter 9 Verifying Security Access with Auditing, see here.
What Is Auditing?
Auditing is the monitoring and recording of selected user database actions, from both database users and non-database users. You can base auditing on individual actions, such as the type of SQL statement executed, or on combination's of data that can include the user name, application, time, and so on. You can audit both successful and failed activities. To use auditing, you enable it, and then configure what must be audited. The actions that you audit are recorded in either data dictionary tables or in operating system files.
Oracle recommends that you enable and configure auditing. Auditing is an effective method of enforcing strong internal controls so that your site can meet its regulatory compliance requirements, as defined in the Sarbanes-Oxley Act. This enables you to monitor business operations, and find any activities that may deviate from company policy. Doing so translates into tightly controlled access to your database and the application software, ensuring that patches are applied on schedule and preventing ad hoc changes. By enabling auditing by default, you can generate an audit record for audit and compliance personnel. Be selective with auditing and ensure that it meets your business compliance needs.
When to use auditing
You typically use auditing to perform the following activities:
- Enable accountability for actions. These include actions taken in a particular schema, table, or row, or affecting specific content.
- Deter users (or others, such as intruders) from inappropriate actions based on their accountability.
- Investigate suspicious activity. For example, if a user is deleting data from tables, then a security administrator might decide to audit all connections to the database and all successful and unsuccessful deletions of rows from all tables in the database.
- Notify an auditor of the actions of an unauthorized user. For example, an unauthorized user could be changing or deleting data, or the user has more privileges than expected, which can lead to reassessing user authorizations.
- Monitor and gather data about specific database activities. For example, the database administrator can gather statistics about which tables are being updated, how many logical I/Os are performed, or how many concurrent users connect at peak times.
- Detect problems with an authorization or access control implementation. For example, you can create audit policies that you expect will never generate an audit record because the data is protected in other ways. However, if these policies generate audit records, then you will know the other security controls are not properly implemented.
- Address auditing requirements for compliance. Regulations such as the following have common auditing-related requirements:
- Sarbanes-Oxley Act
- Health Insurance Portability and Accountability Act (HIPAA)
- International Convergence of Capital Measurement and Capital Standards: a Revised Framework (Basel II)
- Japan Privacy Law
- European Union Directive on Privacy and Electronic Communications
What I am actually interested in for this post is what you can do with this audit data you collect. However, let's start with an overview before looking at the actual data:
What can you audit/monitor?
First let us consider what you can actually monitor. The security features in the database allow you to monitor individual sessions, DML and DDL commands, activity for a specific table, and even activity on a specific column within a table. It is all a matter of how much data do you want to collect. The important thing is that all this information is then stored in system tables to create a complete audit trail.
How to enable auditing
There are various levels of auditing that are available. For example:
- You can audit SQL statements, privileges, schema objects, functions, procedures, packages, triggers, and network activity. For example, you can audit each time a particular user performs an UPDATE or a DELETE SQL statement.
- Use the default settings to enable auditing for commonly used security-relevant SQL statements and privileges.
- You can audit at the most granular level, data access, and actions based on content, using Boolean measures, such as value > 7800 or the IP address from which an action occurred.
What this means is you can capture information about when users logon or logoff, who runs SELECT/UPDATE/DELETE statements against specific tables, capture all SELECT/UPDATE/DELETE statements for a specific user, or even monitor use of DML/DDL statements against specific columns within a table.
For example to audit all session connecting to an instance you can use the following commands:
Or you can simply audit a specific account:
SQL>AUDIT ALL BY SH
You can further refine your auditing by logging only successful statements, or just unsuccessful statements or both. For example:
SQL>AUDIT SELECT, INSERT, UPDATE, DELETE ON SH.SALES BY ACCESS WHENEVER SUCCESSFUL
However, the more activities you place under the watchful eye of the auditing features the more data you will collect and the more space you will need to store that data. Which means you might need to archive/purge the audit tables on a regular basis.
Reviewing your audit data
There are a number of views
For the purposes of this article I have concentrated on the DBA_COMMON_AUDIT_TRAIL table as it contains the most useful information in the context of being able to generate some BI reports. Below is an screenshot of some of the information I extracted from the DBA_COMMON_AUDIT_TRAIL table.
Don't forget to archive
You should periodically archive and then delete (purge) audit trail records, because the audit trail cannot accept new records if it grows too large. You can purge a subset of database audit trail records. For both database and operating system audit trail types, you can manually purge the records or create a purge job that performs at a specified time interval. In that case, the purge operation either purges the audit trail records that were created before the archive timestamp, or it purges all audit trail records
Can you use this information in other ways?
Absolutely. Looking at the main audit views there is a lot of very useful information that is collected.
A better solution to simple archiving to file is to create a security data warehouse. Personally, I can see a lot of value in creating a security data warehouse where information from the main audit tables is regularly moved to a separate schema as part of the purge process. This audit archive schema can then be enhanced with additional dimensions and attributes, which can be used to create some very interesting reports.
Moving Audit Data into the Security Warehouse
I was wondering how you could automate the process of moving only the incremental data into the security warehouse tables and then I remembered an earlier conversation with Jean-Pierre where we talked about flashback queries. Using flashback query it is possible to create a simplified change-data-capture process using the "AS OF TIMESTAMP" as part of the SELECT statement. Assuming flashback is enabled this would allow you to identify the new rows in the audit table and move them into the corresponding security warehouse table prior to purging the audit data from the main audit tables.
select * from DBA_COMMON_AUDIT_TRAIL
AS OF TIMESTAMP TO_TIMESTAMP('30-NOV-09 11:31:58','DD-MON-YY HH24:MI:SS')
One word of warning - it is useful to convert the extended_timestamp column, which contains a time zone element as part of its definition, to a varchar2 data type to avoid possible data type errors in some BI tools (Excel accessing Oracle via ODBC will generate an error).
When moving the data out of the audit tables I added a basic counter column, set to 1 for each record, and created an hour data column for grouping records to show activity during the day. There are lots more possibilities that could be explored, as this is a very rich data set.
It is also useful to add some additional dimensions such as date dimension, a time dimension, a user dimension and an object dimension. You can use the free features of OWB to generate a time dimension table and then use OWB to create the other dimensions. You could use OWB to control the whole data flow process from the audit tables to the target schema and since the required features are included with the database license it would be mad to use anything else.
The user dimension is useful for grouping types of users together either by department or by type. In the case of my example I collected a lot of audit information for the users DBSNMP, SYS and SYSMAN. Therefore, it would be useful to either exclude them as part of the ETL process (which I think is a bad idea) or create a higher level of grouping such as SYSTEM USERS and APPLICATION USERS. Obviously you can create other hierarchies as well but this categorization is helpful when you want to just isolate operations relating to your corporate applications.
Below is a very basic model of a security data warehouse schema and you can obviously add more dimensions to support your own specific requirements.
For this example, the above schema is probably sufficient.
Delivering Security Analytics
Once you have a security warehouse you can start to answer a lot of questions about your schema, such as:
- Which tables are the most heavily accessed and when
- What is the general pattern of activity over a day, week, month, year
- Which users are running the most select statements, or updates, or even deletes.
- Is there a pattern to certain activities
Using the Excel pivot table feature you can create some interesting graphs such as these:
The user SCOTT had auditing set up BY SESSION and we see that the ACTION_NAME appears as 'SESSION REC' in each case, which shows that auditing is being done at the session level. The user SH had auditing set at the access level, and here we see a listing of every select, insert and delete. The detailed data also contains the timestamp of the action, even though several inserts were performed on the same table in the same session.
This graph shows activity by time. Admittedly the data is not brilliant but this is from data I collected from my laptop database whilst I was creating this blog post. As you can see I ran a lot of select and update statements around 11:00am and then I just left the machine to idle. A lot of the activity after 11:00am is just normal database processes running which is why it can useful to create a hierarchy of user types (SYSTEM, APPLICATION etc) so you can either include or exclude certain user groups as required.
Using audit data to charge for access?
One of the metrics captured is the session CPU and I was wondering if this could be used to develop a charging model for access or use of a particular application. With the push to embrace cloud computing one of the key issues is charging for access. While the audit data does not provide the complete picture it should be possible to determine how long a user was connected for, what commands were executed and how much session CPU was used. Taking all this information in to account it should be possible to create a cross-charging model to that would provide a cost per user for based on some notional price per unit (how a unit is defined I leave to you!).