Delivering hardware to support a data warehouse cloud strategy
By Klaker-Oracle on Jan 18, 2010
From a hardware perspective, the move to the cloud is mostly being driven by the need to use hardware resources more effectively and efficiently. Most enterprise data warehouse environments consist of an enterprise data warehouse (EDW) with, usually but not always, additional data marts spun off to support key departmental objectives. In addition, there are the separate processing engines needed to support the EDW, such as data transformation, data quality, metadata repositories and de-duplication. Each of these applications (EDW, marts and so on) tends to run on its own dedicated platform, and this results in an inefficient use of resources. In some cases, attempts have been made to share storage services, but this creates a whole series of other performance and management problems.
As I stated before, the rush to move everything to the cloud seems to be based on the idea that somehow this new concept is magically going to deliver better hardware utilization. In reality, this can only really be true if the hardware platform used to support a cloud based EDW strategy has the following characteristics:
- Based on high volume hardware
- Based on a grid processing model
- Uses intelligent shared storage
- Based on open standards based operating systems
- Provides a highly available platform
- Delivers increased utilization
1) Use of High Volume Hardware
One of the main factors holding back many data warehouses today is their inability to quickly and easily integrate new hardware developments: new CPUs, bigger disks, faster interconnects and so on. Every new advance in hardware delivers faster performance for less money. This makes it vital that such developments can be incorporated into the data center as soon as possible.
This is especially true in data warehousing, where many customers need to store ever-increasing volumes of data as well as support an ever-growing community of users running ever more complex queries. New innovations such as Intel’s latest Nehalem chipset, interconnect technology from the super-computing industry (InfiniBand), ever-expanding SATA/SAS disk storage capacities and the introduction of SSD/flash technology are all vital in terms of delivering increased performance, improving overall efficiency and reducing total costs.
Many customers are being told that the simplest way to access new technology is to off-load some, or all, of their processing to the cloud (at this point I am not differentiating between public and private clouds). The problem is that simply moving to the cloud (public or private) does not guarantee access to the latest hardware innovations. In many cases, it is simply a way of masking the use of proprietary hardware (and related software) that is probably well past its sell-by date.
2) Use of Grid Processing
Most customers are working with large numbers of dedicated servers with storage assigned to each application. This creates a computing infrastructure where resources are tied to specific applications, resulting in an inflexible architecture. This increases cost and power requirements, and reduces overall performance, scalability and availability.
The way to resolve these issues is to move to an approach based on the grid. Grid computing virtualizes and pools IT resources, such as compute power, storage and network capacity, into a set of shared services that can be distributed and re-distributed as needed. Grid computing provides the required flexibility to meet the changing needs of the business. It is much easier to support short-term special projects for a department if additional resources can be quickly and easily provisioned. Placing applications on a grid computing based architecture enables multiple applications to share computing infrastructure, resulting in much greater flexibility, cost and power efficiency, performance, scalability and availability, all at the same time.
Oracle Real Application Clusters (RAC) allows the Oracle Database to run on a grid platform. Nodes, CPUs, storage and memory can all be dynamically provisioned while the system remains online. This makes it easy to maintain service levels while at the same time lowering overall costs through improved utilization. In fact, you could consider the "C" in RAC as referring to "Cloud" rather than "Cluster".
Adding additional resources “on-demand” is a key requirement for delivering cloud computing. I would argue that this can be a complicated process within a shared-nothing infrastructure, which makes that type of approach unsuitable for use with a cloud computing strategy. In reality, adding something relatively simple such as more storage has a profound impact on the whole environment. The database has to be completely rebuilt to re-distribute data evenly across all the disks. For many vendors, adding more storage space also means adding more processing nodes to ensure the hardware remains in balance with the data. This all creates additional downtime for the business, as the whole platform has to go offline while the new resources are added and configured, which impacts SLAs.
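To make the re-distribution point concrete, here is a minimal sketch of why adding a node to a shared-nothing system can touch most of the data. It assumes a deliberately simple placement rule (row key modulo node count); real products use their own distribution schemes, so treat the function names and the numbers as illustrative only.

```python
def node_for(key: int, node_count: int) -> int:
    """Toy placement rule (an assumption): assign a row to a node by key modulo node count."""
    return key % node_count


def rows_moved(keys, old_nodes: int, new_nodes: int) -> int:
    """Count the rows whose owning node changes when the node count changes."""
    return sum(
        1 for k in keys if node_for(k, old_nodes) != node_for(k, new_nodes)
    )


keys = range(100_000)
moved = rows_moved(keys, old_nodes=4, new_nodes=5)
print(f"{moved / len(keys):.0%} of rows change node")  # prints "80% of rows change node"
```

Under this toy rule, growing from four nodes to five relocates four out of every five rows, which is the kind of wholesale rebuild described above.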
3) Use of Intelligent Shared Storage
Today’s data warehouse is completely different from yesterday’s data warehouse. Data volumes, query complexity and numbers of users have all increased dramatically and will continue to increase. The pressure to analyze increasing amounts of data will put more strain on the storage layer, and many systems will struggle with I/O bottlenecks. With traditional storage, creating a shared storage grid is difficult to achieve because of the inability to prioritize the work of the various jobs and users consuming I/O bandwidth from the storage subsystem. The same occurs when multiple databases share the storage subsystem.
Exadata delivers a new kind of storage, intelligent storage, specifically built for the Oracle database. Exadata has powerful smart scan features which reduce the time taken to find the data relevant to a specific query and begin the process of transforming the data into information. At the disk level there is a huge amount of intelligent processing to support a query. Consequently, the result returned from the disk is reduced to the information needed to satisfy the query, and is significantly smaller than the result returned by a traditional block storage approach (as used by many other data warehouse vendors).
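The idea of filtering at the storage layer can be illustrated with a toy comparison. This is not Exadata code or its API; the row layout, the function names and the "values shipped" measure are all assumptions made purely to show why pushing the filter and column selection down to storage shrinks what travels back to the database server.

```python
# Hypothetical table: 1,000 order rows with four columns each.
ROWS = [
    {
        "order_id": i,
        "region": "EMEA" if i % 4 == 0 else "APAC",
        "amount": i * 10,
        "notes": "x" * 50,
    }
    for i in range(1000)
]


def block_scan(rows):
    """Traditional path: every row and every column is shipped to the database server."""
    return list(rows)


def filtered_scan(rows, predicate, columns):
    """Storage-side filtering: only matching rows and requested columns are shipped."""
    return [{c: r[c] for c in columns} for r in rows if predicate(r)]


def values_shipped(result) -> int:
    """Crude size measure: number of column values sent back."""
    return sum(len(r) for r in result)


everything = block_scan(ROWS)
relevant = filtered_scan(ROWS, lambda r: r["region"] == "EMEA", ["order_id", "amount"])
print(values_shipped(everything), values_shipped(relevant))  # prints "4000 500"
```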
The resource management capabilities of Exadata storage can prevent one class of work, or one database, from monopolizing disk resources and bandwidth and ensures user defined SLAs are met when using Exadata storage. With an Exadata system it is possible to identify various types of workloads, assign priority to these workloads, and ensure the most critical workloads get priority.
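As a rough illustration of workload prioritization, the sketch below divides a fixed amount of I/O bandwidth between workloads in proportion to assigned shares, without giving any workload more than it asks for. The share values, workload names and the `allocate_bandwidth` function are hypothetical; this is a generic proportional-share scheme, not a description of how Exadata implements resource management.

```python
def allocate_bandwidth(total_mbps, workloads):
    """Split bandwidth by share, capped at each workload's demand.

    workloads maps a name to (share, demand_mbps). Bandwidth a workload
    does not need is re-offered to the remaining workloads by share.
    """
    granted = {name: 0.0 for name in workloads}
    active = dict(workloads)
    remaining = float(total_mbps)
    while active and remaining > 1e-9:
        total_shares = sum(share for share, _ in active.values())
        satisfied = []
        spent = 0.0
        for name, (share, demand) in active.items():
            offer = remaining * share / total_shares
            give = min(offer, demand - granted[name])
            granted[name] += give
            spent += give
            if granted[name] >= demand - 1e-9:
                satisfied.append(name)
        remaining -= spent
        if not satisfied:
            break  # everyone took their full offer; nothing left to re-offer
        for name in satisfied:
            del active[name]
    return granted


# The critical workload holds 3 of 5 shares, so it gets 600 of the 1000 MB/s.
print(allocate_bandwidth(1000, {
    "critical_reports": (3, 800),
    "etl": (1, 500),
    "adhoc": (1, 500),
}))
```

The point of the design is that a low-priority job can still use idle bandwidth, but it cannot crowd out the workloads the business has marked as critical.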
The tight integration between the storage layer, Exadata, and the Oracle Database ensures customers get all the benefits of extreme performance with all the scalability and high availability required to support a “cloud” based enterprise data warehouse.
4) Use of Open Standards Based Operating Systems
The same concept that applies to hardware also applies to operating systems. Customers need to move from proprietary operating systems to one based on open standards, such as Linux. The use of open standards based operating systems also allows new technologies to be rapidly incorporated.
Oracle provides its own branded version of Linux, Oracle Enterprise Linux. Oracle is committed to making Linux stronger and better. Oracle works closely with, and contributes to, the Linux community to ensure the Oracle Database runs optimally across all major flavors of Linux. This cooperation extends to the very latest technology supporting both Exadata and Sun Oracle Database Machine, such as the support of InfiniBand as a networking infrastructure. Oracle is working with the Linux community to help standardize the use of InfiniBand interconnects. Oracle has already released the InfiniBand drivers it developed for use with Oracle Database Machine to the open-source community.
With its support for Linux, use of commodity hardware components, intelligent shared storage and grid architecture, Oracle is able to deliver the most open approach to enterprise data warehousing in the market today and support the key elements needed to allow customers to develop a successful cloud based data warehouse strategy.
5) Use of a Highly Available Framework
In a hardware cloud it is vital that there is no single point of failure. As the number of applications sharing the hardware increases, so does the impact of a loss of service. A data warehouse, either inside or outside the cloud, can be subjected to both planned and unplanned outages. There are many types of unplanned system outage, such as computer failure, storage failure, human error, data corruption, lost writes, hangs or slowdowns and even complete site failure. Planned system outages are the result of needing to perform routine and periodic maintenance operations and new deployments. The key is to minimize the amount of downtime and so limit its consequences: reduced productivity, lost revenue, damaged customer relationships, bad publicity and lawsuits.
A data warehouse built around a shared nothing architecture is vulnerable to the loss of a node or a disk since losing one or both of these items means that a specific portion of the data set is unavailable. As a result queries and/or processes have to be halted until the node/disk is repaired.
A shared everything architecture, such as Oracle’s, is the ideal solution for cloud computing since there is no single point of failure. If a disk or node fails, queries and/or processes are simply serviced from another disk containing a copy of the data from the failed disk or transparently moved to another node. This is achieved without interruptions in service, saving cost and ensuring business continuity.
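The difference a mirrored copy makes can be shown with a small sketch. The `MirroredStore` class and its placement rule (primary disk by block number, mirror on the next disk) are assumptions for illustration, not a model of any particular product; the point is simply that a read survives the loss of one disk when a second copy exists elsewhere.

```python
class MirroredStore:
    """Toy block store that keeps two copies of every block on different disks."""

    def __init__(self, disk_count: int):
        if disk_count < 2:
            raise ValueError("mirroring needs at least two disks")
        self.disks = [dict() for _ in range(disk_count)]
        self.failed = set()

    def _copies(self, block_id: int):
        # Hypothetical placement: primary by modulo, mirror on the next disk.
        primary = block_id % len(self.disks)
        mirror = (primary + 1) % len(self.disks)
        return primary, mirror

    def write(self, block_id: int, data: str) -> None:
        for disk in self._copies(block_id):
            self.disks[disk][block_id] = data

    def fail_disk(self, disk: int) -> None:
        self.failed.add(disk)

    def read(self, block_id: int) -> str:
        # Serve the read from the first surviving copy.
        for disk in self._copies(block_id):
            if disk not in self.failed and block_id in self.disks[disk]:
                return self.disks[disk][block_id]
        raise IOError(f"block {block_id} unavailable: all copies are on failed disks")


store = MirroredStore(disk_count=3)
for block in range(6):
    store.write(block, f"data-{block}")

store.fail_disk(1)
# Every block is still readable after a single disk failure.
print([store.read(block) for block in range(6)])
```

Without the mirror copy, the blocks whose only home was disk 1 would be unavailable until the disk was repaired, which is the shared-nothing scenario described earlier.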
Exadata has been designed to incorporate the same standards of high availability (HA) customers have come to expect from Oracle products. With Exadata, all database features and tools work just as they do with traditional non-Exadata storage. With the Exadata architecture, all single points of failure are eliminated. Familiar features such as mirroring, fault isolation, and protection against drive and cell failure have been incorporated into Exadata to ensure continual availability and protection of data. Other features to ensure high availability within the Exadata Storage Server are described below.
Oracle's Hardware Assisted Resilient Data (HARD) Initiative is a comprehensive program designed to prevent data corruptions before they happen. Data corruptions are very rare, but when they happen, they can have a catastrophic effect on a database, and therefore a business. Exadata has enhanced HARD functionality embedded in it to provide even higher levels of protection and end-to-end data validation for your data.
6) Delivers Increased Utilization
One of the key aims of cloud computing is to increase the utilization of existing hardware. What is actually required is a new approach to hardware that allows applications to be consolidated onto a single, scalable platform. This allows resources (disk, processing, memory) to be shared across all applications and allocated as required. If one particular application needs additional short-term resources for a special project, the infrastructure should be flexible enough to allow those resources to be made available without significantly impacting other applications.
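A quick worked example shows where the utilization gain comes from. The demand figures below are invented for illustration (arbitrary capacity units over four time periods), and the function names are my own. Dedicated platforms each have to be sized for their own peak, whereas a shared platform only has to be sized for the peak of the combined load.

```python
# Hypothetical demand per application over four time periods.
DEMAND = {
    "edw":   [20, 80, 40, 20],
    "marts": [60, 20, 20, 30],
    "etl":   [70, 10, 10, 80],
}


def dedicated_capacity(demand_by_app) -> int:
    """Each application gets its own platform, sized for its own peak."""
    return sum(max(series) for series in demand_by_app.values())


def pooled_capacity(demand_by_app) -> int:
    """One shared platform, sized for the peak of the combined demand."""
    combined = [sum(values) for values in zip(*demand_by_app.values())]
    return max(combined)


def utilization(demand_by_app, capacity: int) -> float:
    """Average fraction of the provisioned capacity that is actually used."""
    total_demand = sum(sum(series) for series in demand_by_app.values())
    periods = len(next(iter(demand_by_app.values())))
    return total_demand / (capacity * periods)


dedicated = dedicated_capacity(DEMAND)  # 80 + 60 + 80 = 220
pooled = pooled_capacity(DEMAND)        # max(150, 110, 70, 130) = 150
print(f"dedicated: {dedicated} units, {utilization(DEMAND, dedicated):.0%} utilized")
print(f"pooled:    {pooled} units, {utilization(DEMAND, pooled):.0%} utilized")
```

With these made-up numbers the consolidated platform needs 150 units instead of 220, because the applications do not all peak at the same time.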
The use of high volume hardware, grid architecture, highly available framework and open standards makes it possible to create a suitable platform for consolidation to support enterprise wide applications.
Now we have the second part of our cloud strategy in place: a hardware platform to support the data warehouse cloud.