Maximum Availability Architecture – Oracle’s industry-leading set of database high availability capabilities

Using Oracle Incremental Merge

Christian Craft
Senior Director, Product Management

This article outlines the Incremental Merge feature of the Oracle database and its intended usage.  It also addresses how 3rd party products have been built upon this feature, delivering database cloning capabilities (also known as copy data management) as well as backup/recovery solutions.  Finally, this article covers how Oracle addresses such requirements using native features of the Oracle database in a Maximum Availability Architecture (MAA) configuration rather than relying on the Incremental Merge feature.

What is Incremental Merge?

The Incremental Merge capability of the Oracle database refers to the ability to create a copy of a database and periodically update that copy by merging incremental changes into it.  In short, Incremental Merge comprises the following capabilities:

  • Image Copy Backup of Oracle Database
  • Incrementally Updating the Image Copy
  • Archive Redo Log Management
  • Restore and Recovery from Image Copy
  • Duplicate Database from Image Copy
  • Switch to Copy Feature

Incremental Merge involves a single copy of the database as of a single point in time.  The copy is drawn forward on the change timeline by applying incremental backups.  Archived redo logs spanning the duration of backup execution are required to de-fuzzy the contents of the resulting image copy (see Note 1).  The resulting image copy can be used for restore and recovery, duplicating databases, and switching a database to the image copy.

The following diagram shows how RMAN is used to apply the changes from an incremental backup to an image copy of the database.

[Diagram: RMAN applying the changes from an incremental backup to an image copy of the database]

Note 1: For the purposes of this article, we assume customers are using HOT backups.  Customers seldom use COLD backups, where the database is shut down during the backup.  Database backups (including image copies and incrementals) should be considered "fuzzy" if the database was online and active during the backup.
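The merge and the resulting "fuzzy" state can be illustrated with a small Python sketch.  This is a toy model only, not an Oracle data structure: the ImageCopy class, its field names, and the SCN values are all assumptions made for illustration.  It treats the copy as a map of block numbers to the SCN of the last change captured in each block, and considers the copy fuzzy whenever some block is newer than the SCN at which the backup began.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ImageCopy:
    """Toy model of an image copy (hypothetical, not an Oracle structure)."""
    checkpoint_scn: int        # SCN at which the most recent backup began
    blocks: Dict[int, int]     # block number -> SCN of the last change captured

    def merge(self, incr_checkpoint_scn: int, changed: Dict[int, int]) -> None:
        """Apply an incremental: overwrite changed blocks, advance the checkpoint."""
        self.blocks.update(changed)
        self.checkpoint_scn = incr_checkpoint_scn

    def fuzzy_scn(self) -> int:
        """Highest SCN found in any block of the copy."""
        return max(self.blocks.values())

    def is_fuzzy(self) -> bool:
        """True when some block is newer than the checkpoint, so redo is needed."""
        return self.fuzzy_scn() > self.checkpoint_scn


# Level 0 copy started at SCN 100; block 3 changed while the copy was running.
copy = ImageCopy(checkpoint_scn=100, blocks={1: 90, 2: 95, 3: 104})

# Incremental started at SCN 200; block 3 changed again during that backup.
copy.merge(incr_checkpoint_scn=200, changed={2: 150, 3: 207})

# In this model, redo covering SCN 200 through 207 is needed to de-fuzzy the copy.
redo_needed = (copy.checkpoint_scn, copy.fuzzy_scn())
```

In this model only the changed blocks are rewritten, and the merged copy stays fuzzy until the redo generated while the incremental backup was running has been applied.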

Intended Usage of Incremental Merge

The Incremental Merge feature was intended to be used in a transient manner for specific tasks such as:

  • Cloning Databases to DEV/TEST
  • Upgrading database/storage hardware
  • Instantiating Data Guard Standby

The Incremental Merge feature was not developed to be an operational backup/recovery capability.  Incremental Merge maintains a single image copy as of a single point in time, whereas backup/recovery typically requires recoverability over a much longer window of time such as weeks or months.  The Recovery Window can be extended by DELAYING the apply of incremental backups to allow as much as 7-10 days of recoverability.  However, most enterprises require recovery windows of 30 days or longer, so inducing such a delay typically is not sufficient to meet most backup/recovery needs.  In addition, delaying incremental apply also lengthens the time required to perform recovery.  Oracle recommends using a fully functional backup/recovery solution (meeting the business-mandated recovery window) for the use-cases above instead of using Incremental Merge.
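The arithmetic behind the recovery window can be sketched in a few lines of Python.  The function names and the dates are hypothetical, and the sketch assumes that every incremental backup and archived log newer than the delayed image copy is still available.

```python
from datetime import datetime, timedelta


def oldest_restore_point(now: datetime, apply_delay_days: int) -> datetime:
    """The image copy trails `now` by the apply delay, so nothing older is reachable."""
    return now - timedelta(days=apply_delay_days)


def window_shortfall_days(apply_delay_days: int, required_days: int) -> int:
    """Days of required recoverability that the delayed image copy cannot cover."""
    return max(0, required_days - apply_delay_days)


now = datetime(2018, 11, 30)   # hypothetical "today"
oldest = oldest_restore_point(now, apply_delay_days=10)
shortfall = window_shortfall_days(apply_delay_days=10, required_days=30)
```

With a 10 day apply delay and a 30 day requirement, the oldest reachable restore point is 10 days back, leaving 20 days of the required window uncovered.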

3rd Party Vendor Solutions

Several 3rd party vendors have developed solutions based on the Incremental Merge feature including the following:

  • Actifio
  • Rubrik
  • NetBackup Co-Pilot

Some of these vendors go beyond the intended use of Incremental Merge as a transient data copy capability and recommend using it for operational backup/recovery.  Storage snapshots are used to provide multiple restore-points, whereas Incremental Merge natively provides a single restore point.

Delphix is another 3rd party vendor that offers a capability similar to those of the vendors above (implemented differently), but Delphix positions its solution as a DEV/TEST cloning solution rather than for backup/recovery.

Avoid Products that use Reverse Engineering

Customers should avoid using any 3rd party products that use undocumented interfaces or reverse engineer features of Oracle such as internal Oracle data structures, including the contents of Oracle RMAN Backups.  Oracle may modify those structures and product behavior without notice at any time in any version, release, or even with a simple patch.  Customers should evaluate 3rd party products to determine if undocumented interfaces are being used or if the vendor has reverse engineered the Oracle database.

Custom Built Solutions

It is important to note that customers have also built custom-scripted Incremental Merge solutions using these same core Oracle database features for creating image copy backups and updating them incrementally.  Customers have used both SAN (block storage) as well as NAS storage (such as Oracle's ZFS storage) to build these Incremental Merge solutions.

Adding Snapshots to Incremental Merge

Incremental Merge provides a single copy of the database as of a single point in time, which fits the intended use-cases of building databases for DEV/TEST, migrating hardware, or instantiating Data Guard Standby databases.  A single copy of the database is created on a different set of hardware, and the copy is incrementally updated until it reaches the desired point.

The following diagram shows how snapshots are added to incremental merge to provide multiple restore points.

[Diagram: snapshots added to Incremental Merge to provide multiple restore points]

Some 3rd party vendors add storage snapshots to the Incremental Merge solution to allow multiple restore points so it can serve as a backup/recovery solution.  Some customers have also built custom-scripted solutions following this model.  The resulting snapshots can then be cataloged with RMAN to allow database recovery using those snapshots.

Critical Solution Design Issues

The Incremental Merge solution presents several issues where the process must be executed correctly to avoid corruption.  Customers should be aware of the following issues when implementing an Incremental Merge solution:

  • Timing of Merge Process
  • Timing of Snapshot Execution
  • Timing of Redo Log Archival
  • Handling of Archived Redo Log Backups
  • Recovery point required to de-fuzzy Image Copy

Timing of the merge process and snapshot execution is critical to avoid corruption of snapshot copies.  Incremental Merge was not designed for use with snapshots, and does not include the "snapshot optimization" feature of the Oracle database itself.  Snapshots cannot overlap with the Incremental Merge process, or corruption will result.  It is also important to note that the resulting Image Copy is not consistent and needs some amount of redo applied to make it consistent.  The necessary change-vectors to de-fuzzy the backup must be externalized into the archived redo and backed up.  The recovery point is also critical to avoid "file needs more media recovery" errors.
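A custom-scripted solution might guard against overlapping snapshots with a check along the following lines.  This is a minimal Python sketch under stated assumptions: the function names and the times are hypothetical, and obtaining the actual start and end times of the merge and the snapshot is left to the surrounding scripts.

```python
from datetime import datetime
from typing import Tuple

Window = Tuple[datetime, datetime]   # (start, end)


def overlaps(a: Window, b: Window) -> bool:
    """True when the two time windows share any span of time."""
    return a[0] < b[1] and b[0] < a[1]


def snapshot_is_safe(merge: Window, snapshot: Window) -> bool:
    """A snapshot is only safe if it does not overlap the merge process."""
    return not overlaps(merge, snapshot)


# Hypothetical schedule: the merge runs from 01:00 to 02:30.
merge = (datetime(2018, 11, 7, 1, 0), datetime(2018, 11, 7, 2, 30))
during = (datetime(2018, 11, 7, 2, 0), datetime(2018, 11, 7, 2, 5))
after = (datetime(2018, 11, 7, 3, 0), datetime(2018, 11, 7, 3, 5))

safe_during = snapshot_is_safe(merge, during)   # snapshot taken mid-merge
safe_after = snapshot_is_safe(merge, after)     # snapshot taken after the merge
```

A check like this only addresses snapshot timing.  The redo needed to de-fuzzy the image copy still has to be archived and backed up separately.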

Oracle Error: "File Needs More Media Recovery"

An image copy maintained by Incremental Merge is a "fuzzy" backup that needs recovery before it is consistent and can be opened without corruption.  Most experienced Oracle DBAs are familiar with the Oracle error "file needs more media recovery", which indicates that the backup is still fuzzy.  It can occur in several circumstances, with different Oracle error numbers, as follows:

  • ORA-01113
  • ORA-01194
  • ORA-01195
  • ORA-19901
  • ORA-01143

In all of these cases, redo logs are required to de-fuzzy the database and make it consistent before it can be opened for use.  Recovery into the middle of a fuzzy range requires restoring a PREVIOUS backup and rolling forward.  Image copy backups should be treated as fuzzy if taken while the database is up and running.  Proper redo log handling is critical to the Incremental Merge process because redo logs are required to de-fuzzy the image copy.
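The rule about recovering into the middle of a fuzzy range can be sketched as a selection problem.  The Python below is a simplified model, not RMAN's actual logic: the Backup tuple, its field names, and the SCN values are assumptions for illustration.  Each backup is fuzzy between the SCN at which it began and the highest SCN it captured, and a backup is usable only when the recovery target lies at or beyond the end of that range.

```python
from typing import List, NamedTuple, Optional


class Backup(NamedTuple):
    name: str
    checkpoint_scn: int   # SCN at which the backup began
    fuzzy_scn: int        # highest SCN captured while the backup ran


def pick_backup(backups: List[Backup], target_scn: int) -> Optional[Backup]:
    """Return the newest backup whose fuzzy range ends at or before the target."""
    usable = [b for b in backups if b.fuzzy_scn <= target_scn]
    return max(usable, key=lambda b: b.checkpoint_scn, default=None)


backups = [
    Backup("sunday", checkpoint_scn=100, fuzzy_scn=104),
    Backup("monday", checkpoint_scn=200, fuzzy_scn=207),
]

# SCN 203 falls inside Monday's fuzzy range (200 to 207), so the previous
# backup must be restored and rolled forward with redo instead.
choice = pick_backup(backups, target_scn=203)
```

Because Incremental Merge keeps only one image copy, there is no PREVIOUS backup to fall back on unless snapshots or other backups are retained.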

Managing the Archived Redo Log Stream

The Incremental Merge feature deals with handling of DATA blocks only, and does not address how the redo stream is handled.  Redo is critical to successful operation of the Incremental Merge feature.  Proper redo handling is even more critical when used in conjunction with snapshots.  Oracle recommends switching, archiving, and backing up the redo log so that change-vectors required to de-fuzzy the image copy are included in the backup.

The following diagram (from the Oracle RMAN documentation) shows how the redo log stream records database incarnations, which are used during database recovery.

[Diagram (from the Oracle RMAN documentation): database incarnations recorded in the redo log stream]

Oracle also recommends NOT including the archived redo stream in a snapshot, and certainly not in the same snapshot that contains the Image Copy.  Redo log change-vectors generated AFTER the incremental backup are required to de-fuzzy the database.  Database recovery also requires access to ALL available incarnations of the redo stream to properly navigate incarnations as shown in the diagram above.

Data Loss During Recovery

Archived redo is typically 20 minutes or longer behind the current changes in the database depending on the log switch interval.  The image copy will be as much as 24 hours behind current, while redo backups will be at least 20 minutes behind current.  Therefore, customers should expect anywhere from 20 minutes to 24 hours of data loss (loss of transactions) when using Incremental Merge for backup/recovery.

Online REDO logs are re-used in a circular fashion.  Archived redo is sequential and provides a record of changes over the course of time.

[Diagram: online redo logs re-used in a circular fashion alongside the sequential archived redo stream]

The structure of Oracle redo also includes incarnations (as discussed earlier), with each incarnation representing a different branch of the timeline.  In the diagram above, the latest transactions are contained in log sequence 110, which is not yet available in the archived redo.  Those transactions will be lost in an Incremental Merge solution.  Customers should consider Oracle Data Guard or the Zero Data Loss Recovery Appliance (ZDLRA) to eliminate loss of transactions.
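The range of data loss described in this section can be worked through with a short Python sketch.  The function names and the timestamps are hypothetical, and the model is deliberately simple: the recovery point is the end of the newest backed-up archived log, or the image copy time when no later archived redo is available.

```python
from datetime import datetime, timedelta
from typing import List


def recovery_point(image_copy_time: datetime,
                   archived_log_ends: List[datetime]) -> datetime:
    """Latest reachable time: newest backed-up archived log, else the image copy."""
    return max([image_copy_time, *archived_log_ends])


def data_loss(failure_time: datetime, image_copy_time: datetime,
              archived_log_ends: List[datetime]) -> timedelta:
    """Transactions after the recovery point and before the failure are lost."""
    return failure_time - recovery_point(image_copy_time, archived_log_ends)


image_copy = datetime(2018, 11, 12, 1, 0)    # last merge completed
failure = datetime(2018, 11, 13, 1, 0)       # failure 24 hours later

# Archived redo backed up until 20 minutes before the failure.
best_case = data_loss(failure, image_copy, [datetime(2018, 11, 13, 0, 40)])

# No archived redo available beyond the image copy itself.
worst_case = data_loss(failure, image_copy, [])
```

In both cases the changes still held in the online redo logs, such as log sequence 110 in the diagram above, are outside the recovery point.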

Does Switch to Copy = Instant Recovery?

Oracle provides a feature known as Switch to Copy, which allows a database to be switched to an Image Copy instead of using Media Recovery.  Some 3rd party vendors have described this as "instant recovery", which is not correct.  Switch to copy is a SWITCH operation that can be used in place of restore as shown below.

[Diagram: switch to copy used in place of a restore operation]

Switch to copy involves pointing Oracle at a different copy of the database, which is similar to a restore operation, whereas database recovery is the act of applying redo logs.  At the end of the switch to copy command, the image copy will still need recovery, which is not instantaneous.  As discussed earlier, the image copy will normally be as much as 24 hours behind current, and redo logs will be at least 20 minutes behind current.  Any Image Copy taken HOT will also need some amount of recovery (log apply) to make it consistent as well as to reach the desired point-in-time.

Again, recall that Incremental Merge does not include provisions for handling the redo logs, so switch to copy involves data loss (loss of transactions).  Depending on the intended usage, this capability isn't usable if the Image Copy resides on lower tier (slower) storage.  The concept of "instant recovery" implies that the database is usable and will provide the same level of service that users expect of the production database.

Switching to Equivalent Storage

The switch-to-copy feature should only be used with storage that meets performance expectations.  Production databases typically cannot operate on a lower tier of storage than used for production.  It is important to note that switch-to-copy using any 3rd party storage is not compatible with Exadata and is not supported.

Oracle recommends customers use Data Guard rather than switch-to-copy.  Data Guard is more widely used and avoids the data loss issues inherent in switch-to-copy as outlined above.  Data Guard Standby databases can also be placed on storage equivalent to that of the production database to meet end-user performance expectations.

Resource Stealing

Incremental Merge steals resources from the source databases including CPU, memory, network I/O, and disk I/O resources.  The same Oracle software version must be used to APPLY incremental changes to the image copy, and the most common method is to simply use the source database to merge incremental changes into the image copy.  Resource Stealing needs to be considered in system capacity planning and customers need to be aware of performance impact from resource stealing.  Oracle Data Guard does not rely on resource stealing, and places minimal overhead on source servers and network.

Oracle Data Guard

Oracle Data Guard provides the ability to instantiate a copy of a database and update that copy either synchronously or asynchronously via the redo log stream.  The following diagram shows the basic Data Guard configuration including the observer capability known as the Data Guard Broker.

[Diagram: basic Data Guard configuration, including the observer]

In addition to keeping a standby database in close synchronization with the primary, Data Guard also provides the ability to use a TIME DELAY, which is functionally similar to Incremental Merge with a different update mechanism.  Data Guard advances a copy of a database forward on the timeline of changes by using the REDO log, whereas Incremental Merge updates a copy of a database using incremental backups.

Using Data Guard for this case assumes the use of ARCHIVELOG mode, and FORCE LOGGING is required to eliminate gaps caused by use of NOLOGGING operations.  Some legacy hardware configurations might not offer sufficient performance for this configuration, while Exadata has proven to deliver the performance necessary for fully logged databases even with high transaction volumes as well as high volume ETL processing in Data Warehouse environments.

Oracle Snapshot Standby & SPARSE Disk Groups

Oracle's Snapshot Standby is a critical feature for creating DEV/TEST copies of databases from production.  The process is fully automated through Oracle Enterprise Manager (OEM), Data Guard Broker, and SQL*Plus.  The Snapshot Standby is created from a Data Guard Physical Standby, and can be converted back to a Physical Standby and re-synchronized with the production database.

Once a snapshot standby is created, Oracle's SPARSE Disk Groups also provide the ability to create multiple thin-provisioned (SPARSE) clones from a Snapshot Standby.  Oracle RMAN also allows SPARSE backups, extending the thin-provisioning capability into the backup solution as well.

[Diagram: thin-provisioned (SPARSE) clones created from a Snapshot Standby]

Database Recovery with Incremental Merge

The Incremental Merge process involves IMAGE COPY backups that are typically fuzzy copies needing redo to be applied to make them consistent.  There are essentially 4 database recovery scenarios that any backup/recovery solution needs to support as follows:

  1. Repairing Physical Corruption
  2. Point-in-Time Database Recovery
  3. Point-in-Time Object/Table Recovery
  4. Recovery Based Cloning

Oracle's RMAN (Recovery Manager) tool is used to perform recovery of Oracle databases in all of the use-cases above.  Some 3rd party solutions also include interactive tools or APIs that layer on top of the functionality provided by Oracle.  In this section, we will cover this from the standpoint of the RMAN tool that most DBAs are familiar with.

Repairing Physical Corruption

The first use-case for RMAN addresses the need to repair databases if physical corruption occurs, which is also referred to as "recover to current" or recovering the database to the current (or latest) transaction.  RMAN provides 3 levels of physical database repair as follows:

  • Block Media Recovery
  • Data File Restore & Recovery
  • Database Restore & Recovery

Block Media Recovery can use a FULL or LEVEL0 backup, or can use a Virtual Full on Oracle's Recovery Appliance (RA).  Redo logs are applied to recover the block(s) forward after the block is restored from the FULL, LEVEL0 or RA Virtual Full.  Block recovery also works with Oracle Data Guard as shown below.

[Diagram: block media recovery with Oracle Data Guard]

Point-in-Time Database Recovery

While the Recovery Advisor can detect and automatically launch repair actions when physical corruption is encountered, it's not possible to automatically evaluate "logical" corruption caused by factors such as application failures.  For example, an application might be defective, or a user might delete data by mistake.  Those types of failures simply cannot be detected by the database.

Point-in-Time Object/Table Recovery

In some cases, logical corruption might impact only one or a few tables within the database.  Rather than recovering the entire database to a prior point-in-time, it might be desirable to recover only those tables affected by the wayward application or user.

Incremental Merge with snapshots was useful in previous releases to facilitate table recovery.  However, Oracle 12c and above offer the ability to recover tables using backups, as well as the ability to REMAP the table into a different schema as shown below:

[Figure: recovering a table from backups and remapping it into a different schema]

Tables are often recovered to a previous point in time due to application failures or end-user errors.  Placing the recovered table in a different schema allows a developer or user to examine the data to determine what changes (if any) should be made to the production data.  This process has become much simpler in Oracle 12c due to the feature outlined above.

Validation of Image Copy Backups

The Incremental Merge process effectively validates the blocks that have changed, but unchanged blocks are never validated.  Oracle recommends periodically executing the RMAN RESTORE command with the VALIDATE option to ensure integrity of Image Copy backups.  Oracle's Zero Data Loss Recovery Appliance provides automatic validation of backups without using resources of the database server, and without manual data validation scripts.
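The validation gap can be illustrated with a small Python sketch.  This is a toy model with hypothetical names and block numbers: each merge is represented by the set of blocks it rewrote, and any block never touched by a merge has not been checked since the image copy was first created.

```python
from typing import List, Set


def never_validated(all_blocks: Set[int], merges: List[Set[int]]) -> Set[int]:
    """Blocks untouched by every merge, and therefore never re-read or checked."""
    touched: Set[int] = set().union(*merges)
    return all_blocks - touched


all_blocks = {1, 2, 3, 4, 5, 6}
merges = [{2, 3}, {3, 5}]   # blocks rewritten by two successive merges

stale = never_validated(all_blocks, merges)
```

In this example blocks 1, 4, and 6 are never checked by the merge process, which is the gap that periodic validation of the whole image copy is meant to close.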

Conclusion

Incremental Merge is a feature of the Oracle database that was developed for transient use such as creating database clones for TEST/DEV, instantiating Data Guard standby databases, and for migrating to new hardware.  Some 3rd party vendors have used the Incremental Merge feature to replicate capabilities that are provided natively as part of the Oracle database.  This article outlined how native features of the Oracle database provide many of the capabilities that Incremental Merge solutions have provided in previous releases.  Oracle recommends customers follow the Maximum Availability Architecture (MAA) reference architectures to meet business goals.

References:

Maximum Availability Architecture: http://www.oracle.com/goto/maa

ZDLRA: https://www.oracle.com/engineered-systems/zero-data-loss-recovery-appliance/

Comments ( 1 )
  • Amit Sharma Tuesday, December 3, 2019
    great article on incremental merge

    Is it possible to restore image created by incremental merge to an alternate client.

    like PROD DB backup taken as incr merge and restore it on DEV DB on alternate client.