This article outlines the Incremental Merge feature of the Oracle database and its intended usage. It also addresses how 3rd party products have been built upon this feature of Oracle, delivering database cloning capabilities (also known as copy data management) as well as backup/recovery solutions. Finally, this article covers how Oracle addresses such requirements using native features of the Oracle database in a Maximum Availability Architecture (MAA) configuration rather than relying on the Incremental Merge feature.
The Incremental Merge capability of the Oracle database refers to the ability to create a copy of a database and periodically update that copy by merging incremental changes into that copy. In short, Incremental Merge is comprised of the following capabilities:
Incremental Merge involves a single copy of the database as of a single point in time. The database is drawn forward on the change timeline by applying incremental backups. Archived redo logs spanning the duration of backup execution are required to de-fuzzy the contents of the resulting image copy (see Note 1). The resulting image copy can be used for restore and recovery, duplicating databases, and switching a database to the image copy.
The following diagram shows how RMAN is used to apply the changes from an incremental backup to an image copy of the database.
Note 1: For the purposes of this article, we assume customers are using HOT backup. Customers seldom use COLD backups, where the database is shut down during the backup. Database backups (including image copy and incrementals) should be considered "fuzzy" if the database was online and active during the backup.
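The merge cycle described above can be sketched as a small Python helper that only assembles RMAN command text; nothing is executed. The two commands are RMAN's documented incrementally updated backup commands, while the function name and the tag `incr_merge` are illustrative assumptions, not part of the original article.

```python
def incremental_merge_script(tag: str = "incr_merge") -> str:
    """Return RMAN command text for one Incremental Merge cycle.

    Nothing is executed here; the text would be passed to the rman client.
    """
    return "\n".join([
        "RUN {",
        # Roll the image copy forward using the previous level 1 backup.
        # On the very first run there is nothing to apply yet.
        f"  RECOVER COPY OF DATABASE WITH TAG '{tag}';",
        # Take a new level 1 backup for the next cycle. When no image copy
        # exists yet, RMAN creates the level 0 image copy instead.
        f"  BACKUP INCREMENTAL LEVEL 1 FOR RECOVER OF COPY WITH TAG '{tag}' DATABASE;",
        "}",
    ])


print(incremental_merge_script())
```

Running the same script on a schedule keeps the image copy one cycle behind the most recent incremental backup. The archived redo covering each backup is still needed to make the copy consistent.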
The Incremental Merge feature was intended to be used in a transient manner for specific tasks such as:
The Incremental Merge feature was not developed to be an operational backup/recovery capability. Incremental Merge maintains a single image copy as of a single point in time, whereas backup/recovery typically requires recoverability over a much longer window of time such as weeks or months. The Recovery Window can be extended by DELAYING the apply of incremental backups to allow as much as 7 to 10 days of recoverability. However, most enterprises require recovery windows of 30 days or longer, so inducing such a delay typically is not sufficient to meet most backup/recovery needs. In addition, delaying incremental apply also elongates the time required to perform recovery. Oracle recommends using a fully functional backup/recovery solution (meeting the business mandated recovery window) for the use-cases above instead of using Incremental Merge.
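As a rough sketch of the delayed apply described above, the helper below builds the RMAN text with an `UNTIL TIME` clause and compares the delay against a required recovery window. The function names, the tag, and the simplification that the recoverable window equals the apply delay are assumptions made for illustration.

```python
def delayed_merge_script(tag: str, delay_days: int) -> str:
    """Return RMAN text that keeps the image copy delay_days behind current."""
    if delay_days < 0:
        raise ValueError("delay_days must not be negative")
    return "\n".join([
        "RUN {",
        # Only apply incrementals older than the delay, so the image copy
        # stays at roughly SYSDATE minus delay_days.
        f"  RECOVER COPY OF DATABASE WITH TAG '{tag}' UNTIL TIME 'SYSDATE-{delay_days}';",
        f"  BACKUP INCREMENTAL LEVEL 1 FOR RECOVER OF COPY WITH TAG '{tag}' DATABASE;",
        "}",
    ])


def meets_recovery_window(delay_days: int, required_days: int) -> bool:
    """Simplified check: the recoverable window is taken to be the apply delay."""
    return delay_days >= required_days


print(delayed_merge_script("incr_merge", 7))
print(meets_recovery_window(delay_days=10, required_days=30))
```

With a 10 day delay and a 30 day requirement the check fails, which is the shortfall the paragraph above describes.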
Several 3rd party vendors have developed solutions based on the Incremental Merge feature including the following:
Some of these vendors go beyond the intended use of Incremental Merge as a transient data copy capability and recommend using it for operational backup/recovery. Storage snapshots are used to provide multiple restore-points, whereas Incremental Merge natively provides a single restore point.
Delphix is another 3rd party that offers a capability similar to that of the vendors above (implemented differently). Delphix positions its solution as a DEV/TEST cloning solution rather than for backup/recovery.
Customers should avoid using any 3rd party products that use undocumented interfaces or reverse engineer features of Oracle such as internal Oracle data structures, including the contents of Oracle RMAN Backups. Oracle may modify those structures and product behavior without notice at any time in any version, release, or even with a simple patch. Customers should evaluate 3rd party products to determine if undocumented interfaces are being used or if the vendor has reverse engineered the Oracle database.
It is important to note that customers have also built custom-scripted Incremental Merge solutions using these same core Oracle database features for creating image copy backups and updating them incrementally. Customers have used both SAN (block storage) as well as NAS storage (such as Oracle's ZFS storage) to build these Incremental Merge solutions.
Incremental Merge provides a single copy of the database as of a single point in time, which fits the intended use-cases of building databases for DEV/TEST, migrating to new hardware, or instantiating Data Guard Standby databases. A single copy of the database is created on a different set of hardware, and the database is incrementally updated until it reaches the desired point.
The following diagram shows how snapshots are added to incremental merge to provide multiple restore points.
Some 3rd party vendors add storage snapshots to the Incremental Merge solution to allow multiple restore points so it can serve as a backup/recovery solution. Some customers have also built custom scripted solutions following this model. The resulting snapshots can then be cataloged with RMAN to allow database recovery using those snapshots.
The Incremental Merge solution presents several issues where the process must be executed correctly to avoid corruption. Customers should be aware of the following issues when implementing an Incremental Merge solution:
Timing of the merge process and snapshot execution is critical to avoid corruption of snapshot copies. Incremental Merge was not designed for use with snapshots, and does not include the "snapshot optimization" feature of the Oracle database itself. Snapshots cannot overlap with the Incremental Merge process, or corruption will result. It is also important to note that the resulting Image Copy is not consistent and needs some amount of redo applied to make it consistent. The necessary change-vectors to de-fuzzy the backup must be externalized into the archived redo and backed up. The recovery point is also critical to avoid "file needs more media recovery" errors.
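The non-overlap rule can be expressed as a simple guard that a custom script might run before triggering a storage snapshot. This is a minimal sketch: the function name and the idea of recording merge start and end times are assumptions, and a real solution would obtain those times from its own job scheduler.

```python
from datetime import datetime


def snapshot_is_safe(merge_start: datetime, merge_end: datetime,
                     snapshot_time: datetime) -> bool:
    """Return True only when the snapshot falls outside the merge interval."""
    if merge_end < merge_start:
        raise ValueError("merge_end must not precede merge_start")
    # A snapshot taken while the merge is still writing to the image copy
    # would capture partially updated files.
    return snapshot_time < merge_start or snapshot_time > merge_end


merge_start = datetime(2024, 1, 1, 1, 0)
merge_end = datetime(2024, 1, 1, 2, 0)
print(snapshot_is_safe(merge_start, merge_end, datetime(2024, 1, 1, 1, 30)))
print(snapshot_is_safe(merge_start, merge_end, datetime(2024, 1, 1, 2, 30)))
```

The check treats a snapshot taken exactly at the merge boundary as unsafe, which errs on the side of caution.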
Incremental Merge produces a "fuzzy" backup that needs recovery to become consistent (that is, not corrupt). Most experienced Oracle DBAs are familiar with the Oracle error "file needs more media recovery", which indicates the backup is corrupt ("fuzzy") and can occur in several circumstances with different Oracle error numbers as follows:
In all of these cases, redo logs are required to de-fuzzy the database and make it consistent before it can be opened for use. Recovery into the middle of a fuzzy range requires restoring a PREVIOUS backup and rolling forward. Image copy backups should be treated as fuzzy if taken while the database is up and running. Proper redo log handling is critical to the Incremental Merge process because redo logs are required to de-fuzzy the image copy.
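The rule about recovering into the middle of a fuzzy range can be illustrated with a small model. Each backup is given a start and end SCN (its fuzzy range), and a backup is treated as usable for a target only when the target is at or beyond the end of that range. The `Backup` type, the names, and the SCN values are hypothetical and serve only to illustrate the selection logic.

```python
from typing import NamedTuple, Optional, Sequence


class Backup(NamedTuple):
    name: str
    start_scn: int  # SCN when the hot backup began
    end_scn: int    # SCN when the hot backup finished


def pick_backup(backups: Sequence[Backup], target_scn: int) -> Optional[Backup]:
    """Pick the most recent backup whose fuzzy range ends at or before the target."""
    # A backup whose fuzzy range extends past the target cannot be made
    # consistent at that target, so an earlier backup must be restored.
    usable = [b for b in backups if b.end_scn <= target_scn]
    return max(usable, key=lambda b: b.end_scn, default=None)


backups = [Backup("sunday", 100, 150), Backup("monday", 200, 260)]
print(pick_backup(backups, 230))  # falls inside Monday's fuzzy range
print(pick_backup(backups, 300))
```

A target of 230 lands inside Monday's fuzzy range, so the model falls back to Sunday's backup and relies on redo to roll forward from there.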
The Incremental Merge feature deals with handling of DATA blocks only, and does not address how the redo stream is handled. Redo is critical to successful operation of the Incremental Merge feature. Proper redo handling is even more critical when used in conjunction with snapshots. Oracle recommends switching, archiving, and backing up the redo log so that change-vectors required to de-fuzzy the image copy are included in the backup.
The following diagram (from the Oracle RMAN documentation) shows how the redo log stream records database incarnations, which are used during database recovery.
Oracle also recommends NOT including the archived redo stream in a snapshot, and certainly not in the same snapshot that contains the Image Copy. Redo log change-vectors generated AFTER the incremental backup are required to de-fuzzy the database. Database recovery also requires access to ALL available incarnations of the redo stream to properly navigate incarnations as shown in the diagram above.
Archived redo is typically 20 minutes or longer behind the current changes in the database depending on the log switch interval. The image copy will be as much as 24 hours behind current, while redo backups will be at least 20 minutes behind current. Therefore, customers should expect anywhere from 20 minutes to 24 hours of data loss (loss of transactions) when using Incremental Merge for backup/recovery.
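The arithmetic behind the 20 minute to 24 hour range can be sketched as follows. The function name and the boolean flag are illustrative assumptions; the two lag values are the ones quoted above.

```python
from datetime import timedelta


def worst_case_data_loss(image_copy_lag: timedelta,
                         archived_redo_lag: timedelta,
                         redo_backups_usable: bool) -> timedelta:
    """Estimate how far behind current the recovered database can be."""
    # With usable archived redo backups, recovery reaches the last archived
    # log. Without them, recovery stops at the image copy itself.
    return archived_redo_lag if redo_backups_usable else image_copy_lag


image_lag = timedelta(hours=24)
redo_lag = timedelta(minutes=20)
print(worst_case_data_loss(image_lag, redo_lag, redo_backups_usable=True))
print(worst_case_data_loss(image_lag, redo_lag, redo_backups_usable=False))
```

The best case is bounded by the archived redo lag, and the worst case by the age of the image copy.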
Online REDO logs are re-used in a circular fashion. Archived redo is sequential and provides a record of changes over the course of time.
The structure of Oracle redo also includes incarnations (as discussed earlier), with each incarnation representing a different branch of the timeline. In the diagram above, the latest transactions are contained in log sequence 110, which is not yet available in the archived redo. Those transactions will be lost in an Incremental Merge solution. Customers should consider Oracle Data Guard or the Zero Data Loss Recovery Appliance (ZDLRA) to eliminate loss of transactions.
Oracle provides a feature known as Switch to Copy, which allows a database to be switched to an Image Copy instead of using Media Recovery. Some 3rd party vendors have described this as "instant recovery", which is not correct. Switch to copy is a SWITCH operation that can be used in place of restore as shown below.
Switch to copy involves pointing Oracle at a different copy of the database, which is similar to a restore operation, whereas database recovery is the act of applying redo logs. At the end of the switch to copy command, the image copy will still need recovery, which is not instantaneous. As discussed earlier, the image copy will normally be as much as 24 hours behind current, and redo logs will be at least 20 minutes behind current. Any Image Copy taken HOT will also need some amount of recovery (log apply) to make it consistent as well as to reach the desired point-in-time.
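The distinction between the switch and the recovery that must follow it can be seen in the order of RMAN commands. The helper below is a sketch that only assembles command text for a mounted database; the function name and the optional `until_clause` parameter are assumptions, and the sequence number in the usage example is taken from the log sequence 110 mentioned earlier.

```python
from typing import List


def switch_to_copy_steps(until_clause: str = "") -> List[str]:
    """Return the RMAN commands for a switch to copy on a mounted database."""
    if until_clause:
        # Incomplete recovery: stop at the given point, then open RESETLOGS.
        recover = f"RECOVER DATABASE {until_clause};"
        open_cmd = "ALTER DATABASE OPEN RESETLOGS;"
    else:
        # Complete recovery: apply all available redo, then open normally.
        recover = "RECOVER DATABASE;"
        open_cmd = "ALTER DATABASE OPEN;"
    # The switch only repoints the control file at the image copy.
    # The recover step is where the time is actually spent.
    return ["SWITCH DATABASE TO COPY;", recover, open_cmd]


for step in switch_to_copy_steps("UNTIL SEQUENCE 110 THREAD 1"):
    print(step)
```

Only the first command is near-instant. The second applies incremental backups and redo, and its duration depends on how far behind the image copy is.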
Again, recall that Incremental Merge does not include provisions for handling the redo logs, so switch to copy involves data loss (loss of transactions). Depending on the intended usage, this capability isn't usable if the Image Copy resides on lower tier (slower) storage. The concept of "instant recovery" implies that the database is usable and will provide the same level of service that users expect of the production database.
The switch-to-copy feature should only be used with storage that meets performance expectations. Production databases typically cannot operate on a lower tier of storage than used for production. It is important to note that switch-to-copy using any 3rd party storage is not compatible with Exadata and is not supported.
Oracle recommends customers use Data Guard rather than switch-to-copy. Data Guard is more widely used and avoids the data loss issues inherent in switch-to-copy as outlined above. Data Guard Standby databases can also be placed on equivalent storage as the production database to meet end-user performance expectations.
Incremental Merge steals resources from the source databases including CPU, memory, network I/O, and disk I/O resources. The same Oracle software version must be used to APPLY incremental changes to the image copy, and the most common method is to simply use the source database to merge incremental changes into the image copy. Resource Stealing needs to be considered in system capacity planning and customers need to be aware of performance impact from resource stealing. Oracle Data Guard does not rely on resource stealing, and places minimal overhead on source servers and network.
Oracle Data Guard provides the ability to instantiate a copy of a database and update that copy either synchronously or asynchronously via the redo log stream. The following diagram shows the basic Data Guard configuration including the observer capability known as the Data Guard Broker.
In addition to keeping a standby database in close synchronization with the primary, Data Guard also provides the ability to use a TIME DELAY, which is functionally similar to Incremental Merge with a different update mechanism. Data Guard advances a copy of a database forward on the timeline of changes by using the REDO log, whereas Incremental Merge updates a copy of a database using incremental backups.
Using Data Guard for this case assumes the use of ARCHIVELOG mode, and FORCE LOGGING is required to eliminate gaps caused by use of NOLOGGING operations. Some legacy hardware configurations might not offer sufficient performance for this configuration, while Exadata has proven to deliver the performance necessary for fully logged databases even with high transaction volumes as well as high volume ETL processing in Data Warehouse environments.
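As a hedged sketch of the configuration discussed above, the helper below collects the statements an administrator might run: a check of `log_mode` and `force_logging` in `v$database`, the `ALTER DATABASE FORCE LOGGING` statement, and a Data Guard Broker command setting the `DelayMins` property. The function name, the standby name `stby`, and the dictionary layout are illustrative assumptions; nothing is executed.

```python
from typing import Dict, List


def delayed_standby_commands(standby: str, delay_minutes: int) -> Dict[str, List[str]]:
    """Group the commands by the tool they would be entered in."""
    if delay_minutes < 0:
        raise ValueError("delay_minutes must not be negative")
    return {
        "sqlplus_on_primary": [
            # Confirm ARCHIVELOG mode and the current force logging state.
            "SELECT log_mode, force_logging FROM v$database;",
            # Close the gaps that NOLOGGING operations would leave in redo.
            "ALTER DATABASE FORCE LOGGING;",
        ],
        "dgmgrl": [
            # Hold redo apply on the standby behind the primary by this delay.
            f"EDIT DATABASE '{standby}' SET PROPERTY DelayMins={delay_minutes};",
        ],
    }


for tool, commands in delayed_standby_commands("stby", 1440).items():
    print(tool, commands)
```

A delay of 1440 minutes keeps the standby about one day behind, comparable to a daily Incremental Merge cycle, while the redo itself is still shipped to the standby site.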
Oracle's Snapshot Standby is a critical feature for creating DEV/TEST copies of databases from production. The process is fully automated through Oracle Enterprise Manager (OEM), Data Guard Broker, and SQL*Plus. The Snapshot Standby is created from a Data Guard Physical Standby, and can be reverted back to Physical and re-synchronized with the production database.
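The round trip between physical and snapshot standby can be sketched with the Data Guard Broker `CONVERT DATABASE` command. The helper below only builds the command text for the opposite role; the function name and the standby name `stby` are hypothetical.

```python
def convert_command(standby: str, current_role: str) -> str:
    """Return the DGMGRL command that toggles the standby's role."""
    targets = {
        "PHYSICAL STANDBY": "SNAPSHOT STANDBY",
        "SNAPSHOT STANDBY": "PHYSICAL STANDBY",
    }
    role = current_role.upper()
    if role not in targets:
        raise ValueError(f"unsupported role: {current_role}")
    return f"CONVERT DATABASE '{standby}' TO {targets[role]};"


# Open the standby for DEV/TEST use, then hand it back for re-synchronization.
print(convert_command("stby", "physical standby"))
print(convert_command("stby", "snapshot standby"))
```

Converting back to a physical standby discards the changes made while it was a snapshot standby, after which the accumulated redo from production is applied.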
Once a snapshot standby is created, Oracle's SPARSE Disk Groups also provide the ability to create multiple thin-provisioned (SPARSE) clones from a Snapshot Standby. Oracle RMAN also allows SPARSE backups, extending the thin-provisioning capability into the backup solution as well.
The Incremental Merge process involves IMAGE COPY backups that are typically fuzzy copies needing redo to be applied to make them consistent. There are essentially 4 database recovery scenarios that any backup/recovery solution needs to support as follows:
Oracle's RMAN (Recovery Manager) tool is used to perform recovery of Oracle databases in all of the use-cases above. Some 3rd party solutions also include interactive tools or APIs that layer on top of the functionality provided by Oracle. In this section, we will cover this from the standpoint of the RMAN tool that most DBAs are familiar with.
The first use-case for RMAN addresses the need to repair databases if physical corruption occurs, which is also referred to as "recover to current" or recovering the database to the current (or latest) transaction. RMAN provides 3 levels of physical database repair as follows:
Block Media Recovery can use a FULL or LEVEL0 backup, or can use a Virtual Full on Oracle's Recovery Appliance (RA). Redo logs are applied to recover the block(s) forward after the block is restored from the FULL, LEVEL0 or RA Virtual Full. Block recovery also works with Oracle Data Guard as shown below.
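For illustration, block media recovery in RMAN is driven by the `RECOVER ... BLOCK` command, or by `RECOVER CORRUPTION LIST` to repair every block the database has already recorded as corrupt. The helper below is a sketch that builds the command text; the function name and the datafile and block numbers are hypothetical.

```python
from typing import Iterable, Tuple


def block_recovery_command(corrupt_blocks: Iterable[Tuple[int, int]]) -> str:
    """Build an RMAN block recovery command from (datafile, block) pairs."""
    clauses = [f"DATAFILE {datafile} BLOCK {block}"
               for datafile, block in corrupt_blocks]
    if not clauses:
        # No explicit list given: repair everything the database has
        # already recorded as corrupt.
        return "RECOVER CORRUPTION LIST;"
    return "RECOVER " + " ".join(clauses) + ";"


print(block_recovery_command([(8, 13), (2, 19)]))
print(block_recovery_command([]))
```

Only the named blocks are restored and rolled forward with redo, so the rest of the datafile stays online during the repair.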
While the Recovery Advisor can detect and automatically launch repair actions when physical corruption is encountered, it's not possible to automatically evaluate "logical" corruption caused by factors such as application failures. For example, an application might be defective, or a user might delete data by mistake. Those types of failures simply cannot be detected by the database.
In some cases, logical corruption might impact only a single or a few tables within the database. Rather than recovering the entire database to a prior point-in-time, it might be desirable to recover only those tables affected by the wayward application or user.
Incremental Merge with snapshots was useful in previous releases to facilitate table recovery. However, Oracle 12c and above offer the ability to recover tables using backups, as well as the ability to REMAP the table into a different schema as shown below:
Tables are often recovered to a previous point in time due to application failures or end-user errors. Placing the recovered table in a different schema allows a developer or user to examine the data to determine what changes (if any) should be made to the production data. This process has become much simpler in Oracle 12c due to the feature outlined above.
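A table recovery with remapping might look like the sketch below, which assembles an RMAN `RECOVER TABLE` command with `UNTIL TIME`, `AUXILIARY DESTINATION`, and `REMAP TABLE` clauses. The schema and table names, the auxiliary path, and the function name are hypothetical. The sketch also assumes a release in which `REMAP TABLE` accepts a target in a different schema, which should be confirmed for the version in use.

```python
def recover_table_command(source: str, target: str,
                          until_time: str, aux_dest: str) -> str:
    """Build RMAN text to recover one table and remap it to a new name."""
    return "\n".join([
        f"RECOVER TABLE {source}",
        # Point in time before the application or user error occurred.
        f"  UNTIL TIME '{until_time}'",
        # Scratch location for the temporary auxiliary instance.
        f"  AUXILIARY DESTINATION '{aux_dest}'",
        # Import under a different name so production data is untouched.
        f"  REMAP TABLE {source}:{target};",
    ])


print(recover_table_command(
    source="HR.EMPLOYEES",
    target="REVIEW.EMPLOYEES_RECOVERED",
    until_time="SYSDATE-1",
    aux_dest="/tmp/auxdest",
))
```

The recovered copy lands beside the production table rather than replacing it, which supports the review step described above.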
The Incremental Merge process effectively validates the blocks that have changed, but un-changed blocks are never validated. Oracle recommends periodically executing the RMAN RESTORE command with the VALIDATE option to ensure integrity of Image Copy backups. Oracle's Zero Data Loss Recovery Appliance provides automatic validation of backups without using resources of the database server, and without manual data validation scripts.
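The validation gap can be illustrated with a toy model: blocks that never appear in any incremental backup are never touched by the merge, so only a separate pass such as `RESTORE DATABASE VALIDATE` would read them. The function name and block numbers below are hypothetical.

```python
from typing import Iterable, Set


def never_validated_blocks(all_blocks: Set[int],
                           changed_per_cycle: Iterable[Set[int]]) -> Set[int]:
    """Return the blocks that no merge cycle has ever touched."""
    touched: Set[int] = set()
    for changed in changed_per_cycle:
        # Each merge cycle only processes the blocks in that incremental.
        touched |= changed
    return all_blocks - touched


all_blocks = set(range(1, 11))
cycles = [{1, 2}, {2, 3}, {9}]
print(sorted(never_validated_blocks(all_blocks, cycles)))
```

In this model six of the ten blocks are never checked by the merge, however many cycles run, unless they happen to change.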
Incremental Merge is a feature of the Oracle database that was developed for transient use such as creating database clones for TEST/DEV, instantiating Data Guard standby databases, and for migrating to new hardware. Some 3rd party vendors have used the Incremental Merge feature to replicate capabilities that are provided natively as part of the Oracle database. This article outlined how native features of the Oracle database provide many of the capabilities that Incremental Merge solutions have provided in previous releases. Oracle recommends customers follow the Maximum Availability Architecture (MAA) reference architectures to meet business goals.
Maximum Availability Architecture: http://www.oracle.com/goto/maa