Wednesday Nov 05, 2014

Oracle MAA Part 4: Gold HA Reference Architecture

Welcome to the fourth installment in a series of blog posts describing MAA Best Practices that define four standard reference architectures for HA and data protection: BRONZE, SILVER, GOLD and PLATINUM. The objective of each reference architecture is to deploy an optimal set of Oracle HA capabilities that reliably achieve a given service level (SLA) at the lowest cost.

This article provides details for the Gold reference architecture.

Gold substantially raises the service level for business critical applications that cannot accept vulnerability to single points-of-failure. Gold builds upon Silver by using database replication technology to eliminate single point of failure and provide a much higher level of data protection and HA from all types of unplanned and planned outages.

An overview of Gold is provided in the figure below.

Gold delivers substantially enhanced service levels using the following capabilities:

Oracle Active Data Guard replaces backups used by Bronze and Silver reference architectures as the first line of defense against an unrecoverable outage of the production database. Recovery time (RTO) for outages caused by data corruption, database failure, cluster failure, and site failures is reduced to seconds or minutes with an accompanying data loss exposure (RPO) of zero or near zero depending upon configuration. While backups are no longer used for availability, they are still included in the Gold reference architecture for archival purposes and as an additional level of data protection.

Active Data Guard uses simple physical replication to maintain one or more synchronized copies (standby databases) of the production database (primary database). If the primary becomes unavailable for any reason, production is quickly failed over to the standby and availability is restored.

Active Data Guard offers a unique set of capabilities for availability and Oracle data protection that exceed other alternatives based upon storage remote-mirroring or other methods of database replication. These capabilities include:

  • Choice of zero data loss (sync) or near-zero data loss (async) disaster protection.
  • Direct transmission of database changes (redo) directly from the log buffer of the primary database providing strong isolation from lower layer hardware and software faults.
  • Use of intimate knowledge of Oracle data block and redo structures to perform continuous Oracle data validation at the standby, further isolating the standby database from corruptions that can impact a primary database.
  • Native support for all Oracle data types and features combined with high performance capable of supporting all applications and workloads.
  • Manual or automatic failover to quickly transfer production to the standby database should the primary become unavailable for any reason.
  • Integrated application failover to quickly transition application connections to the new primary database after a failover has occurred.
  • Database rolling maintenance to reduce downtime and risk during planned maintenance.
  • High return on investment by offloading read-only workloads and backups to an Active Data Guard standby while it is being synchronized by the primary database.

Oracle GoldenGate logical replication is also included in the Gold reference architecture, either to complement Active Data Guard when performing planned maintenance or to use as an alternative replication mechanism for maintaining a synchronized copy (target database) of a production database (source database).

GoldenGate reads changes from disk at a source database, transforms the data into a platform independent file format, transmits the file to a target database, then transforms the data into SQL (updates, inserts, and deletes) native to a target database that is open read-write. The target database contains the same data, but is a different physical database from the source (for example, backups are not interchangeable). This enables GoldenGate to easily support heterogeneous environments across different hardware platforms and relational database management systems. This flexibility makes it ideal for a wide range of planned maintenance and other replication requirements.  GoldenGate can:

  • Efficiently replicate subsets of a source database to distribute data to other target databases. It can also be used to consolidate data into a single target database (for example, an Operational Data Store) from multiple source databases. This function of GoldenGate is relevant to each of the four MAA reference architectures and is complementary to the use of Active Data Guard.
  • Perform maintenance and migrations in a rolling manner for use-cases that cannot be supported using Data Guard replication. For example, Oracle GoldenGate enables replication from a source database running on a big-endian platform to a target database running on a little-endian platform. This enables cross-platform migration with the additional advantage of being able to reverse replication for fast fallback to the prior version after cutover. When used in this fashion GoldenGate is complementary to Active Data Guard.
  • Maintain a complete replica of a source database for high availability or disaster protection that is ready for immediate failover should the source database become unavailable. GoldenGate would be an alternative to Active Data Guard when used for this purpose. The primary use-case where you would use GoldenGate instead of Active Data Guard for complete database replication is when there is a requirement for the target database to be open read-write at all times (remember an Active Data Guard standby is open read-only).

Note that there are several trade-offs that must be accepted when using logical replication in place of Active Data Guard for data protection and availability

  • Logical replication has additional pre-requisites and operational complexity.
  • Logical replication is inherently an asynchronous process and thus not able to provide zero data loss protection. Only Active Data Guard can provide zero data loss protection
  • It is obvious that a logical copy is not a physical replica of the source database. Rather than offload backups to a standby you must backup both source and target. Logical replication also can not support advanced data protection features that come with Data Guard physical replication: lost-write detection and automatic block repair.

Oracle Site Guard is optional in the Gold tier but is useful to reduce administrative overhead and the potential for human error. Site Guard enables administrators to automate the orchestration of switchover (a planned event) and failover (in response to an unplanned outage) of their complete Oracle environment - multiple databases and applications - between a production site and a remote disaster recovery site. Oracle Site Guard is included with the Oracle Enterprise Manager Life-Cycle Management Pack.

Oracle Site Guard offers the following benefits:

  • Reduction of errors due to prepared response to site failure. Recovery strategies are mapped out, tested, and rehearsed in prepared responses within the application. Once an administrator initiates a Site Guard operation for disaster recovery, human intervention is not required.
  • Coordination across multiple applications, databases, and various replication technologies. Oracle Site Guard automatically handles dependencies between different targets while starting or stopping a site. Site Guard integrates with Oracle Active Data Guard to coordinate multiple concurrent database failovers. Site Guard also integrates with storage remote mirroring that may be used for data that resides outside of the Oracle Database.
  • Faster recovery time. Oracle Site Guard automation minimizes time spent in the manual coordination of recovery activities. 

Gold = Better HA and Data Protection

Gold builds upon Silver by addressing all fault domains. Even in the worst cases of a complete cluster or site outage, database service can be resumed within seconds or minutes of a failure occurring. Gold eliminates the downtime and potential uncertainty of a restore from backup. Gold also eliminates data loss by protecting every database transaction in real-time.

Database-aware replication is key to achieving Gold service levels. It is network efficient. It enforces a high degree of isolation between replicated copies for optimal data protection and HA. It enables fast failover to an already synchronized and running copy of production. It achieves high ROI by enabling workloads to be offloaded from production to the replicated copy. As important as these tangible benefits are, there is the equally significant benefit of reducing risk. By running workloads at the replicated copy you are performing continuous application-level validation that it is ready for production when needed.

So what is left to address after Gold? There is a class of application where it is desirable to mask the effect of an outage from the end-user. Imagine you are the customer in the process of making a purchase online - you don't want to be left in an uncertain state should there be a database outage. Did my purchase go through, do I resubmit my payment, and if I do will I be charged twice? From a data loss perspective, what if the DR site is 100's or 1000's of miles away. How do you guarantee zero data loss protection? Finally, this same class of application frequently can not  tolerate downtime for planned maintenance - how can you shrink maintenance windows to zero so that applications can be available at all times? To learn how to address this set of requirements stay tuned for the final installment in this MAA series when we cover the Platinum reference architecture.   

Monday Sep 29, 2014

Oracle Reinvents Database Protection with Zero Data Loss Recovery Appliance

Reinventing Database Protection

During his opening keynote at Oracle OpenWorld 2014, Oracle Executive Chairman and Chief Technology Officer Larry Ellison announced the general availability of Oracle’s Zero Data Loss Recovery Appliance, the world's first and only engineered system designed specifically for Oracle Database protection. This massively scalable appliance delivers unparalleled data protection, efficiency, and scalability.

Unparalleled Data Protection

Visit the Oracle.com Main Landing Page

Read the Enhanced Press Release

Watch Sr. Vice President Juan Loaiza's Webcast

Read the story, featuring an interview with Ashish Ray, VP of Product Management

Oracle's Zero Data Loss Recovery Appliance (Recovery Appliance) is the first appliance ever to deliver zero data loss protection for critical Oracle Databases. When using today’s solutions to restore a database, businesses typically lose all data generated since the last backup—often hours to days of critical data.
The Recovery Appliance dramatically reduces the impact of backups on production servers and networks, virtually eliminating the need for lengthy backup windows. The cloud-scale architecture enables a single Recovery Appliance to manage the data protection requirements of thousands of databases, avoiding the cost and complexity of disparate backup systems. 

Zero Data Loss Recovery Appliance infographic

View the Infographic

Key Messages:

  • Eliminates data loss: Unique database integration enables continuous transport of redo data to the appliance, providing real-time protection for the most recent transactions so that databases can be restored without data loss.
  • Eliminates production impact: Backup algorithms integrated into Oracle Database send only changed data to the appliance, minimizing production database impact, I/O traffic, and network load. All expensive backup processing is offloaded to the Appliance.
  • Offloads tape archival: The Recovery Appliance can directly archive database backups to low-cost tape storage, offloading production database servers. Archival operations can run both day and night to improve tape drive utilization.
  • Enables restore to any point in time: The database change data stored on the Appliance can be used to create virtual full database copies at any desired point in time.
  • Delivers cloud-scale protection: A single Recovery Appliance can serve the data protection requirements of thousands of databases in a data center or region. Capacity expands seamlessly to petabytes of storage, with no downtime.
  • Protects data from disasters: The Recovery Appliance can replicate data in real time to a remote Recovery Appliance or to Oracle Database Backup Cloud Service to protect business data from site outages. Database blocks are continuously validated to eliminate data corruption at any stage of transmission or processing.

Want to learn more?

What else is happening at Oracle OpenWorld 2014 that includes the Zero Data Loss Recovery Appliance? Check out the following sessions:

  • Monday, Sept. 29:  2:45 - 3:30 (CON7684) - Moscone South - 307

Oracle Zero Data Loss Recovery Appliance: A New Era in Data Protection

  • Tuesday, Sept. 30:  3:45 - 4:30 (CON7686) - Moscone South - 305

Oracle Zero Data Loss Recovery Appliance: Deployment Best Practices 

And don't forget to see the Zero Data Loss Recovery Appliance in action. Visit the Engineered Systems Showcase at Moscone Center North and take a tour with one of our product experts.


Thursday Sep 11, 2014

Oracle MAA Part 3: Silver HA Reference Architecture

This is the third installment in a series of blog posts describing Oracle Maximum Availability Architecture (Oracle MAA) best practices that define four standard reference architectures for data protection and high availability: BRONZE, SILVER, GOLD and PLATINUM.  Each reference architecture uses an optimal set of Oracle HA capabilities that reliably achieve a given service level (SLA) at the lowest cost.

This article provides details for the Silver reference architecture.

Silver builds upon Bronze by adding clustering technology - either Oracle RAC or RAC One Node. This enables automatic failover if there is an unrecoverable outage of a database instance or a complete failure of the server on which it runs. Oracle RAC also delivers substantial benefit by eliminating downtime for many types of planned maintenance. It does this by performing maintenance in a rolling manner across Oracle RAC nodes so that services remain available at all times. As in the case of Bronze, RMAN provides database-optimized backups to protect data and restore availability should an outage prevent the cluster from being able to restart. An overview of Silver is provided in the figure below.

Silver HA Reference Architecture

Oracle RAC

Oracle RAC is an active-active clustering solution that provides instantaneous failover should there be an outage of a database instance or of the server on which it runs. A quick review of how Oracle RAC functions helps to understand its many benefits. There are two major components to any Oracle RAC cluster: Oracle Database instances and the Oracle Database itself.

  • A database instance is defined as a set of server processes and memory structures running on a single node (or server) which make a particular database available to clients.
  • The database is a particular set of shared files (data files, index files, control files, and initialization files) that reside on persistent storage, and together can be opened and used to read and write data.
  • Oracle RAC uses an active-active architecture that enables multiple database instances, each running on different nodes, to simultaneously read and write to the same database.

Oracle RAC is the MAA best practice for server HA and provides a number of advantages:

  • Improved HA: If a server or database instance fails, connections to surviving instances are not affected; connections to the failed instance are quickly failed over to surviving instances that are already running and open on other servers in the cluster.
  • Scalability: Oracle RAC is ideal for applications with high workloads or consolidated environments where scalability and the ability to dynamically add or reprioritize capacity are required. Additional servers, database instances, and database services can be provisioned online. The ability to easily distribute workload across the cluster makes Oracle RAC an ideal solution when Oracle Multitenant is used for database consolidation.
  • Reliable performance: Oracle Quality of Service (QoS) can be used to allocate capacity for high priority database services to deliver consistent high performance in database consolidated environments. Capacity can be dynamically shifted between workloads to quickly respond to changing requirements.
  • HA during planned maintenance: High availability is maintained by implementing changes in a rolling manner across Oracle RAC nodes. This includes hardware, OS, or network maintenance that requires a server to be taken offline; software maintenance to patch the Oracle Grid Infrastructure or database; or if a database instance needs to be moved to another server to increase capacity or balance the workload.

Oracle RAC One Node

RAC One Node provides an alternative to Oracle RAC when scalability and instant failover are not required. RAC One Node license is one-half the price of Oracle RAC, providing a lower cost option when an RTO of minutes is sufficient for server outages.

RAC One Node is an active-passive failover technology. During normal operation it only allows a single database instance to be open at one time. If the server hosting the open instance fails, RAC One Node automatically starts a new database instance on a second node to quickly resume service.

RAC One Node provides several advantages over alternative active-passive clustering technologies.

  • It automatically responds to both database instance and server failures.
  • Oracle Database HA Services, Grid Infrastructure, and database listeners are always running on the second node. At failover time only the database instance and database services need to start, reducing the time required to resume service, and enabling service to resume in minutes. 
  • It provides the same advantages for planned maintenance as Oracle RAC. RAC One Node allows two active database instances during periods of planned maintenance to allow graceful migration of users from one node to another with zero downtime; database services remain available to users at all times.

Silver = Better HA

Silver represents a significant increase in HA compared to the Bronze reference architecture and is very well suited to a broad range of application requirements.  Oracle RAC immediately responds to an instance or server outage and reconnects users to surviving instances.

While Silver has substantial benefits, it is only one step above Bronze - there is still a much broader fault domain beyond instance or server failure. This includes events that can impact the availability of an entire cluster - data corruptions, storage array failures, bugs, human error, site outages, etc. There is also a class of application where the impact of outages must be completely transparent to the user. Stay tuned for future installments when we address this expanded set of requirements with the Gold and Platinum reference architectures.

Wednesday Jul 16, 2014

Oracle MAA Part 2: Bronze HA Reference Architecture

In the first installment of this series we discussed how one size does not fit all when it comes to HA architecture. We described Oracle Maximum Availability Architecture (Oracle MAA) best practices that define four standard reference architectures for data protection and high availability: BRONZE, SILVER, GOLD and PLATINUM.  Each reference architecture uses an optimal set of Oracle HA capabilities that reliably achieve a given service level (SLA) at the lowest cost. As you progress from one level to the next, each architecture expands upon the one that preceded it in order to handle an expanded fault domain and deliver a high level of service.

This article provides details for the Bronze reference architecture.

Bronze is appropriate for databases where simple restart or restore from backup is ‘HA enough’. It uses single instance Oracle database (no cluster) to provide a very basic level of HA and data protection in exchange for reduced cost and implementation complexity. An overview is provided in the figure below.

Bronze Reference Architecture

When a database instance or the server on which it is running fails, the recovery time objective (RTO) is a function of how quickly the database can be restarted and resume service. If a database is unrecoverable the RTO becomes a function of how quickly a backup can be restored. In a worst case scenario of a complete site outage additional time is required to provision new systems and perform these tasks at a secondary location, in some cases this can take days.

The potential data loss if there is an unrecoverable outage (recovery point objective or RPO), is equal to the data generated since the last backup was taken. Copies of database backups are retained locally and at a remote location or on the Cloud for the dual purpose of archival and DR should a disaster strike the primary data center.

Major components of the Bronze reference architecture and the service levels achieved include:

Oracle Database HA and Data Protection

  • Oracle Restart automatically restarts the database, the listener, and other Oracle components after a hardware or software failure or whenever a database host computer restarts.
  • Oracle corruption protection checks for physical corruption and logical intra-block corruptions. In-memory corruptions are detected and prevented from being written to disk and in many cases can be repaired automatically. For more details see Preventing, Detecting, and Repairing Block Corruption.
  • Automatic Storage Management (ASM) is an Oracle-integrated file system and volume manager that includes local mirroring to protect against disk failure.
  • Oracle Flashback Technologies provide fast error correction at a level of granularity that is appropriate to repair an individual transaction, a table, or the full database.
  • Oracle Recovery Manager (RMAN) enables low-cost, reliable backup and recovery optimized for the Oracle Database.
  • Online maintenance includes online redefinition and reorganization for database maintenance, online file movement, and online patching. 

Database Consolidation

  • Databases deployed using Bronze often include development and test databases and databases supporting smaller work group and departmental applications that are often the first candidates for database consolidation.
  • Oracle Multitenant is the MAA best practice for database consolidation from Oracle Database 12c onward. 
Life Cycle Management
  • Oracle Enterprise Manager Cloud Control enables self service deployment of IT resources for business users along with resource pooling models that cater to various multitenant architectures. It supports Database as a Service (DBaaS), a paradigm in which end users (Database Administrators, Application Developers, Quality Assurance Engineers, Project Leads, and so on) can request database services, consume it for the lifetime of the project, and then have them automatically de-provisioned and returned to the resource pool.

Oracle Engineered Systems

  • Oracle Engineered Systems are an efficient deployment option for database consolidation and DBaaS. Oracle Engineered Systems reduce lifecycle cost by standardizing on a pre-integrated and optimized platform for Oracle Database that is completely supported by Oracle.

Bronze Summary:  Data Protection, RTO, and RPO

Table 1 summarizes the data protection capabilities and service levels provided by the Bronze tier. The first column indicates when validations for physical and logical corruption are performed:

  • Manual checks are initiated by the administrator or at regular intervals by a scheduled job.
  • Runtime checks are automatically executed on a continuous basis by background processes while the database is open.
  • Background checks are run on a regularly scheduled interval, but only during periods when resources would otherwise be idle.
  • Each check is unique to Oracle Database using specific knowledge of Oracle data block and redo structures.

Table 1: Bronze - Data Protection

Type Capability Physical Block Corruption
Logical Block Corruption
Manual Dbverify, Analyze Physical block checks Logical checks for intra-block and inter-object consistency
Manual RMAN Physical block checks during backup and restore Intra-block logical checks
Runtime Database In-memory block and redo checksum In-memory intra block logical checks
Runtime ASM Automatic corruption detection and repair using local extent pairs
Runtime Exadata HARD checks on write HARD checks on write
Background Exadata Automatic HARD Disk Scrub and Repair

Note that HARD validation and the Automatic Hard Disk Scrub and Repair (the last two rows of Table 1) are unique to Exadata storage. HARD validation ensures that Oracle Database does not write physically corrupt blocks to disk. Automatic Hard Disk Scrub and Repair inspects and repairs hard disks with damaged or worn out disk sectors (cluster of storage) or other physical or logical defects periodically when there are idle resources.

Table 2 summarizes RTO and RPO for the Bronze tier for various unplanned and planned outages.

Table 2: Bronze - Recovery Time and Data Loss Potential

Type  Event  Downtime Data Loss Potential
Unplanned  Database instance failure
 Minutes  Zero
Unplanned  Recoverable server failure
Minutes to an hour
 Zero
Unplanned Data corruptions, unrecoverable server failure, database failures, or site failures
Hours to days
Since last backup
Planned Online file move, online reorganization and redefinition, online patching
Zero
 Zero
Planned Hardware or operating system maintenance and database patches that cannot be performed online
Minutes to hours
Zero
Planned Database upgrades: patch sets and full database releases
Minutes to hours
Zero
Planned Platform migrations
Hours to a day
Zero
Planned Application upgrades that modify back-end database objects
Hours to days
Zero

So when would you use bronze?  Bronze is useful when users can wait for a backup to be restored if there is an unrecoverable outage and accept that any data generated since the last backup was taken will be lost. The Oracle Database has a number of included capabilities described above that provide unique levels of data protection and availability for a low-cost environment based upon the Bronze reference architecture.

But what if I can't accept this level of downtime or data loss potential - well that is where the Silver, Gold and Platinum reference architectures come in. Bronze is only a starting point that establishes the foundation for subsequent HA reference architectures that provide higher quality of service. Stay tuned for future blog posts that will dive into the details of each reference architecture.

Thursday Jun 12, 2014

Oracle Data Protection: How Do You Measure Up? - Part 1

This is the first installment in a blog series, which examines the results of a recent database protection survey conducted by Database Trends and Applications (DBTA) Magazine.

All Oracle IT professionals know that a sound, well-tested backup and recovery strategy plays a foundational role in protecting their Oracle database investments, which in many cases, represent the lifeblood of business operations. But just how common are the data protection strategies used and the challenges faced across various enterprises? In January 2014, Database Trends and Applications Magazine (DBTA), in partnership with Oracle, released the results of its “Oracle Database Management and Data Protection Survey”. Two hundred Oracle IT professionals were interviewed on various aspects of their database backup and recovery strategies, in order to identify the top organizational and operational challenges for protecting Oracle assets.
Here are some of the key findings from the survey:

  • The majority of respondents manage backups for tens to hundreds of databases, representing total data volume of 5 to 50TB (14% manage 50 to 200 TB and some up to 5 PB or more).
  • About half of the respondents (48%) use HA technologies such as RAC, Data Guard, or storage mirroring, however these technologies are deployed on only 25% of their databases (or less).
  • This indicates that backups are still the predominant method for database protection among enterprises. Weekly full and daily incremental backups to disk were the most popular strategy, used by 27% of respondents, followed by daily full backups, which are used by 17%. Interestingly, over half of the respondents reported that 10% or less of their databases undergo regular backup testing.

 A few key backup and recovery challenges resonated across many of the respondents:

  • Poor performance and impact on productivity (see Figure 1)
    • 38% of respondents indicated that backups are too slow, resulting in prolonged backup windows.
    • In a similar vein, 23% complained that backups degrade the performance of production systems.
  • Lack of continuous protection (see Figure 2)
    • 35% revealed that less than 5% of Oracle data is protected in real-time.
  •  Management complexity
    • 25% stated that recovery operations are too complex. (see Figure 1)
    •  31% reported that backups need constant management. (see Figure 1)
    • 45% changed their backup tools as a result of growing data volumes, while 29% changed tools due to the complexity of the tools themselves.

Figure 1: Current Challenges with Database Backup and Recovery

Figure 2: Percentage of Organization’s Data Backed Up in Real-Time or Near Real-Time

In future blogs, we will discuss each of these challenges in more detail and bring insight into how the backup technology industry has attempted to resolve them.

Tuesday Jun 03, 2014

How to Achieve Real-Time Data Protection and Availabilty....For Real

There is a class of business and mission critical applications where downtime or data loss have substantial negative impact on revenue, customer service, reputation, cost, etc. Because the Oracle Database is used extensively to provide reliable performance and availability for this class of application, it also provides an integrated set of capabilities for real-time data protection and availability.


Active Data Guard, depicted in the figure below, is the cornerstone for accomplishing these objectives because it provides the absolute best real-time data protection and availability for the Oracle Database. This is a bold statement, but it is supported by the facts. It isn’t so much that alternative solutions are bad, it’s just that their architectures prevent them from achieving the same levels of data protection, availability, simplicity, and asset utilization provided by Active Data Guard. Let’s explore further.



Active Data Guard - Overview

Backups are the most popular method used to protect data and are an essential best practice for every database. Not surprisingly, Oracle Recovery Manager (RMAN) is one of the most commonly used features of the Oracle Database. But comparing Active Data Guard to backups is like comparing apples to motorcycles. Active Data Guard uses a hot (open read-only), synchronized copy of the production database to provide real-time data protection and HA. In contrast, a restore from backup takes time and often has many moving parts - people, processes, software and systems – that can create a level of uncertainty during an outage that critical applications can’t afford. This is why backups play a secondary role for your most critical databases by complementing real-time solutions that can provide both data protection and availability.


Before Data Guard, enterprises used storage remote-mirroring for real-time data protection and availability. Remote-mirroring is a sophisticated storage technology promoted as a generic infrastructure solution that makes a simple promise – whatever is written to a primary volume will also be written to the mirrored volume at a remote site. Keeping this promise is also what causes data loss and downtime when the data written to primary volumes is corrupt – the same corruption is faithfully mirrored to the remote volume making both copies unusable. This happens because remote-mirroring is a generic process. It has no  intrinsic knowledge of Oracle data structures to enable advanced protection, nor can it perform independent Oracle validation BEFORE changes are applied to the remote copy. There is also nothing to prevent human error (e.g. a storage admin accidentally deleting critical files) from also impacting the remote mirrored copy.


Remote-mirroring tricks users by creating a false impression that there are two separate copies of the Oracle Database. In truth; while remote-mirroring maintains two copies of the data on different volumes, both are part of a single closely coupled system. Not only will remote-mirroring propagate corruptions and administrative errors, but the changes applied to the mirrored volume are a result of the same Oracle code path that applied the change to the source volume. There is no isolation, either from a storage mirroring perspective or from an Oracle software perspective.  Bottom line, storage remote-mirroring lacks both the smarts and isolation level necessary to provide true data protection.


Active Data Guard offers much more than storage remote-mirroring when your objective is protecting your enterprise from downtime and data loss. Like remote-mirroring, an Active Data Guard replica is an exact block for block copy of the primary. Unlike remote-mirroring, an Active Data Guard replica is NOT a tightly coupled copy of the source volumes - it is a completely independent Oracle Database. Active Data Guard’s inherent knowledge of Oracle data block and redo structures enables a separate Oracle Database using a different Oracle code path than the primary to use the full complement of Oracle data validation methods before changes are applied to the synchronized copy. These include: physical check sum, logical intra-block checking, lost write validation, and automatic block repair. The figure below illustrates the stark difference between the knowledge that remote-mirroring can discern from an Oracle data block and what Active Data Guard can discern.



Oracle Block Structure


An Active Data Guard standby also provides a range of additional services enabled by the fact that it is a running Oracle Database - not just a mirrored copy of data files. An Active Data Guard standby database can be open read-only while it is synchronizing with the primary. This enables read-only workloads to be offloaded from the primary system and run on the active standby - boosting performance by utilizing all assets. An Active Data Guard standby can also be used to implement many types of system and database maintenance in rolling fashion. Maintenance and upgrades are first implemented on the standby while production runs unaffected at the primary. After the primary and standby are synchronized and all changes have been validated, the production workload is quickly switched to the standby. The only downtime is the time required for user connections to transfer from one system to the next. These capabilities further expand the expectations of availability offered by a data protection solution beyond what is possible to do using storage remote-mirroring.


So don’t be fooled by appearances.  Storage remote-mirroring and Active Data Guard replication may look similar on the surface - but the devil is in the details. Only Active Data Guard has the smarts, the isolation, and the simplicity, to provide the best data protection and availability for the Oracle Database.


Stay tuned for future blog posts that dive into the many differences between storage remote-mirroring and Active Data Guard along the dimensions of data protection, data availability, cost, asset utilization and return on investment.


For additional information on Active Data Guard, see:




Thursday May 29, 2014

Oracle MAA Part 1: When One Size Does Not Fit All

The good news is that Oracle Maximum Availability Architecture (MAA) best practices combined with Oracle Database 12c (see video) introduce first-in-the-industry database capabilities that truly make unplanned outages and planned maintenance transparent to users. The trouble with such good news is that Oracle’s enthusiasm in evangelizing its latest innovations may leave some to wonder if we’ve lost sight of the fact that not all database applications are created equal. Afterall, many databases don’t have the business requirements for high availability and data protection that require all of Oracle’s ‘stuff’. For many real world applications, a controlled amount of downtime and/or data loss is OK if it saves money and effort.


Well, not to worry. Oracle knows that enterprises need solutions that address the full continuum of requirements for data protection and availability. Oracle MAA accomplishes this by defining four HA service level tiers: BRONZE, SILVER, GOLD and PLATINUM. The figure below shows the progression in service levels provided by each tier.




Each tier uses a different MAA reference architecture to deploy the optimal set of Oracle HA capabilities that reliably achieve a given service level (SLA) at the lowest cost.  Each tier includes all of the capabilities of the previous tier and builds upon the architecture to handle an expanded fault domain.



  • Bronze is appropriate for databases where simple restart or restore from backup is ‘HA enough’. Bronze is based upon a single instance Oracle Database with MAA best practices that use the many capabilities for data protection and HA included with every Oracle Enterprise Edition license. Oracle-optimized backups using Oracle Recovery Manager (RMAN) provide data protection and are used to restore availability should an outage prevent the database from being able to restart.

  • Silver provides an additional level of HA for databases that require minimal or zero downtime in the event of database instance or server failure as well as many types of planned maintenance. Silver adds clustering technology - either Oracle RAC or RAC One Node. RMAN provides database-optimized backups to protect data and restore availability should an outage prevent the cluster from being able to restart.

  • Gold raises the game substantially for business critical applications that can’t accept vulnerability to single points-of-failure. Gold adds database-aware replication technologies, Active Data Guard and Oracle GoldenGate, which synchronize one or more replicas of the production database to provide real time data protection and availability. Database-aware replication greatly increases HA and data protection beyond what is possible with storage replication technologies. It also reduces cost while improving return on investment by actively utilizing all replicas at all times.

  • Platinum introduces all of the sexy new Oracle Database 12c capabilities that Oracle staff will gush over with great enthusiasm. These capabilities include Application Continuity for reliable replay of in-flight transactions that masks outages from users; Active Data Guard Far Sync for zero data loss protection at any distance; new Oracle GoldenGate enhancements for zero downtime upgrades and migrations; and Global Data Services for automated service management and workload balancing in replicated database environments. Each of these technologies requires additional effort to implement. But they deliver substantial value for your most critical applications where downtime and data loss are not an option.


The MAA reference architectures are inherently designed to address conflicting realities. On one hand, not every application has the same objectives for availability and data protection – the Not One Size Fits All title of this blog post. On the other hand, standard infrastructure is an operational requirement and a business necessity in order to reduce complexity and cost.


MAA reference architectures address both realities by providing a standard infrastructure optimized for Oracle Database that enables you to dial-in the level of HA appropriate for different service level requirements. This makes it simple to move a database from one HA tier to the next should business requirements change, or from one hardware platform to another – whether it’s your favorite non-Oracle vendor or an Oracle Engineered System.


Please stay tuned for additional blog posts in this series that dive into the details of each MAA reference architecture.


Meanwhile, more information on Oracle HA solutions and the Maximum Availability Architecture can be found at:






About

Musings on Oracle's Maximum Availability Architecture (MAA), by members of Oracle Development team. Note that we may not have the bandwidth to answer generic questions on MAA.

Search

Categories
Archives
« July 2015
SunMonTueWedThuFriSat
   
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
 
       
Today