OLTP And Data Warehouse? How Does That Work?
By KLaker on Sep 27, 2009
I recently came across a posting on another website that was questioning some of the messaging associated with our recent launch of the Sun Oracle Database Machine. The implication being made was that an appliance designed to support data warehousing could not support OLTP operations and an appliance designed to support OLTP operations could not be expected to support data warehousing.
Are these two types of operations mutually exclusive? Is it actually possible to deliver an appliance that truly does provide the best of both worlds? More importantly why would you need such an appliance? I would argue that were are rapidly reaching a point where there is very little difference between the types of features and of operations needed to support OLTP operations compared to data warehouse.
Many of today's data warehouses are looking more and more like an OLTP application. This is what is often referred to as "active data warehousing". Active data warehousing is completely different to traditional data warehousing (as supported by the current data warehouse appliance vendors) and is based around three types of operations that overlap and intersect:
- Data loading in near-real time using trickle feed ETL jobs to update small amounts of data
- Operational style queries requesting very small amounts of data to support transactional operations that run around the clock
- Analytical style queries, executed continuously to plan next year's budgets, uncover market trends or predict future growth patterns
To support these types of "active" operations a data warehouse appliance must provide the following:
1) Strong Transactional Heritage
The traditional data warehouse is becoming a thing of the past. Very few customers are looking to build a data warehouse based on single batch updates where the database is taken down over night, or over the weekend, to load new data. An active data warehouse is as the name suggests, active. It is not supposed to be offline. By linking into operational systems the data warehouse becomes a 24*7 extension. In reality an active data warehouse cannot be unavailable. It is also wrong to assume that trickle feeds just deliver small amounts of data at regular intervals. For many customers, such as those in retail, financial organizations or telecommunications this is not the case. The trickle feeds deliver terabytes of data that need to be loaded as quickly a possible so business users can extract competitive advantages as quickly as possible.
Active data warehousing is often seen as "the revenge of OLTP" systems because of the need to combine a strong robust transactional model with data warehouse features within a single database engine. Therefore, this latest announcement of a Sun Oracle Database Machine that supports both OLTP and data warehousing means it provides the perfect platform for delivering active data warehousing.
The traditional data warehouse appliance vendors have no background in transactional processing. Their databases lack a sophisticated locking model so queries struggle to deliver accurate and consistent results during data loads. Typically these appliance databases fail to support active data warehouse style operations because they can only support a single view of the data. Consequently when data is being loaded all readers are locked out of the system until the write operation(s) completes.
If read operations are running when the data load process starts the data load is locked out until the queries all complete. This simplistic model causes severe performance problems for customers looking for the benefits of moving to a truly active data warehouse.
In contrast Oracle has a long and distinguished history of both transactional processing and data warehousing. The world-class robust locking model within the Oracle Database allows both readers and writers to co-exist without blocking each other. Oracle automatically provides read consistency to a query so that all the data that the query sees comes from a single point in time. Oracle can also provide read consistency to all of the queries in a transaction. This means the Oracle Database is ideally suited to deliver a real active data warehouse.
The leading analyst companies such as Gartner, Forrester and IDC recognize the Oracle Database as a market leader in both data warehousing and OLTP.
2) Robust Security Model
Pushing data out to many different OLTP applications and business users means that the issue of data security becomes a critical focus area. There are many areas that have to be considered when attempting to secure a database. Security has become a multidimensional problem. Customers need to consider how best to:
- Protect their network and media
- Develop a strong authentication framework
- Manage internal threats
- Deliver audit and configuration monitoring
- Create a central place to manage all users
- Encrypt and mask sensitive data
In addition, many countries have passed laws that specify how data must be managed and who can access that data. Failure to meet the requirements of this legislation can result in damaging publicity as well as fines or even lengthy jail sentences.
Security within most data warehouse appliance systems is limited. The typical appliance database provides basic user authentication and authorization procedures so users who access the data can only see information to which they have been granted access. In some cases appliance vendors will offer some form of data encryption, which can be applied to a specific column of data. However, the encryption process is not transparent and usually requires changes to both ETL/data loading processes and the query access processes.
The auditing and logging of user activity is invariably limited to basic information such as details of logons and logoffs, number of failed logins, non-typical access periods and non-typical logon sources.
In today's security conscious world these security features are completely inadequate. In fact, most of the appliance vendors have now woken up to this fact and are slowly linking up with a third party vendors in an attempt to bring 21st Century rigor to their security features. The problem for customers is that they have to deal with another software vendor, additional licenses and attempt to broker solutions to support issues where the boundaries of responsibility are unclear.
In contrast Oracle is able to fully secure your data from end to end and across the whole data lifecycle. For over 30 years Oracle has been developing market-leading solutions to help customers secure all information within their data warehouses and OLTP systems. Oracle data security solutions help address numerous government and industry regulations including SOX, PCI, HIPAA, FISMA, JSOX and the European Union Data Privacy Directive.
To help customers secure and defend their data warehouse Oracle has developed its 'Defence-in-Depth Guide' along with its 'Maximum Security Architecture'. This is a series of solutions to help customers address today's top security concerns. The Oracle Maximum Security Architecture provides an easy to follow overview of Oracle's defence-in-depth approach to data security.
Oracle's market leading security and data encryption features are designed to work transparently, minimizing any impact on existing applications while addressing mandatory requirements found in many regulations.
Overall, Oracle provides everything you need to secure the data within both your data warehouse and your OLTP applications. There is no need to contact third party vendors, no need to change your database schemas, no need to change your existing business intelligence tools. Oracle Database security features are designed to be transparent to make customers lives easier and all Oracle security features work transparently with Sun Oracle Database Machine.
3) Strong Resource Management
The other implication made on the website mentioned earlier was that you would have to be idiot to put your operational systems on the same appliance as your data warehouse. But why would the appliance vendors want to stop customers from running OLTP applications along side their data warehouse? Apart from the lack of strong transactional features the appliance vendors typical uses a shared nothing architecture, which is optimized to support pre-defined batch queries that scan large volumes of data. These optimizations do not suit the types of queries that dominate active data warehouse and OLTP operations. Operational queries request small amounts of data but the shared nothing architecture has to run these queries across all nodes within the cluster tying up resources that really are not needed. It does not take many of these operational queries to flood an appliance system and the DBA has little, if any, control over the amount of resource acquired by each query.
The Oracle DBA has complete control over how system resources are used because Oracle's 'shared everything' architecture provides the ability to dynamically optimize each query. Before the query is executed the current workload is examined and the amount of processing power, i.e. degree of parallelism, allocated to the query is adjusted rather than simply allocating the same amount of processing resource every time (as happens with the other appliance vendors). The degree of parallelism is, therefore, optimized for each query and there is no requirement for a minimal degree of parallelism across all nodes. Operations can run in parallel using one, some or all nodes depending on the current workload, the characteristics and importance of the query.
Over the years the Oracle Database has been extended to provide fully automated performance optimization features. These query optimization features have the ability to learn over time, which are the most optimal query plans, and then lock subsequent queries into using those plans to ensure consistent performance.
It is this ability to effectively manage the resources within the Sun Oracle Database Machine that allows Oracle to run both OLTP and data warehouse operations on the same platform. Oracle has a world-class database with strong transactional and data warehouse features, it has world-class resource management features and now with Sun Oracle Database Machine it has a world-class high performance platform.
4) Deliver High Availability
Customers cannot afford to have their active data warehouse or OLTP applications inactive. Without access to the operational systems and/or the data warehouse, revenue and customers can be lost, penalties incurred and bad press generated, all of which can have a lasting effect on both customers and the company's reputation. Building a robust, high availability IT infrastructure to support both a data warehouse and OLTP operations is critical to the success and well being of all enterprises in today's fast moving economy. The data warehouse, like all OLTP systems, is now a mission-critical system.
Building a robust and resilient platform based on a typical data warehouse appliance is very difficult because of its shared nothing architecture. This means data is dedicated to a specific node and if a node fails then the data associated with that node also becomes unavailable. Which by default, implies that the whole system is unavailable.
To resolve this architectural challenge appliance vendors offer a number of different methods for delivering a resilient enterprise data warehouse, each governed by the size of the customer's budget.
In some cases the use of features to deliver a robust and resilient platform can affect performance. For example when using a 'fallback' configuration to protect data, write performance is reduced as data has to be written to both the primary and fallback locations to ensure consistency. This has a knock-on effect in terms of query performance because user requests can be blocked while a write operation is being performed. If that write operation takes longer because of the need to support a fallback copy of the data, business users will be locked out of the system for longer.
Oracle databases operate in highly resilient and recoverable configurations and run thousands of mission-critical systems around the world. Oracle's database is designed to support mission critical environments from the ground up. One of the key parts in providing a highly available solution is Oracle Real Application Clusters (RAC). This is the premier database clustering technology that allows two or more computers (also referred to as "nodes") in a cluster to concurrently access a single shared database. This effectively creates a single database system that spans multiple hardware systems yet appears to the application as a single unified database.
Oracle Automatic Storage Management (ASM) provides protection at the storage layer. With performance and high availability as a primary objective, ASM builds on the principle of stripe and mirror everything. Intelligent mirroring capabilities allow administrators to define two or three way mirrors for the ultimate protection of critical business data.
Furthermore, Oracle has an extremely mature set of fully integrated and resilient solutions that enable customers to deliver highly available data warehousing. Oracle has developed "Maximum Availability Architecture" (MAA) to provide superior data protection and availability by minimizing or eliminating planned and unplanned downtime at all technology stack layers including hardware or software components.
OLTP And Data Warehouse? How Does That Work?
For many appliance vendors it just not an option to run an OLTP application on their servers because of the inherent limitations built in to their database and architecture. Therefore, to go back to the original question: "OLTP And Data Warehouse? How Does That Work?" For Oracle customers the answer is: very nicely thank you!