Why (CDM) Snapshots are NOT Backups - Pt. 2

December 8, 2020 | 5 minute read
Tim Chien
Senior Director of Product Management

In Part 1 of our series "Why (CDM) Snapshots are NOT Backups", we discussed how storage-based database snapshots are suited neither to Oracle database production recovery procedures nor to achieving Oracle database production service levels after a recovery event, particularly for large database volumes. In contrast, an Oracle database-integrated solution, such as Oracle's Zero Data Loss Recovery Appliance, is designed for simple, efficient, and high-performance database recovery - all while handling large volumes of data.

In Part 2, we conclude with two additional reasons why snapshots are not suitable for database backups, specifically with regard to database validation and data loss.

3. No database recovery validation. While snapshots are often used for test activities, to be useful for Oracle database recovery purposes they need to be validated by RMAN on a periodic basis to ensure there are no issues in the backup storage and network layers. RMAN backup validation goes beyond storage-based validation by checking database block data for inconsistencies and confirming that all files required for a successful recovery are present. Because this in-depth validation requires database server processes, production resources are needed to read and validate backup files, which can induce non-trivial load, especially for large data volumes.

As discussed in Part 1, snapshots must first be restored as image copies and then cataloged by RMAN in order to run validation. Couple this process with the time and overhead to run validation and it's easy to see that performing this activity on a continual basis would be a massive undertaking and practically infeasible for large database environments. This presents an additional risk when relying on snapshots for recovery, as one corrupt block in a single snapshot can affect the restorability of successive snapshots that rely on that same block. When a corruption is first discovered during a restore, it is especially painful to resolve, prolonging database downtime and ultimately impacting business operations.
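The way a single corrupt block can spread across a snapshot chain can be illustrated with a small Python sketch. This is a simplified, hypothetical model and not how any particular CDM product is implemented: each snapshot is assumed to map logical block numbers to shared physical blocks, and the names `take_snapshot` and `affected_snapshots` are invented for illustration.

```python
from typing import Dict, List, Set

# Hypothetical model: a snapshot maps a logical block number to the ID of
# the physical block that stores its contents. Blocks that did not change
# between snapshots are shared rather than copied.
Snapshot = Dict[int, str]


def take_snapshot(previous: Snapshot, changed: Dict[int, str]) -> Snapshot:
    """Return a new snapshot that shares every unchanged block with `previous`."""
    snapshot = dict(previous)
    snapshot.update(changed)
    return snapshot


def affected_snapshots(chain: List[Snapshot], corrupt: Set[str]) -> List[int]:
    """Return indexes of snapshots that reference at least one corrupt physical block."""
    return [
        index
        for index, snapshot in enumerate(chain)
        if any(physical in corrupt for physical in snapshot.values())
    ]


# Three logical blocks captured over four snapshots.
s0 = {0: "p0", 1: "p1", 2: "p2"}
s1 = take_snapshot(s0, {1: "p3"})  # logical block 1 rewritten
s2 = take_snapshot(s1, {2: "p4"})  # logical block 2 rewritten
s3 = take_snapshot(s2, {1: "p5"})  # logical block 1 rewritten again
chain = [s0, s1, s2, s3]

# Physical block p0 is never rewritten, so every snapshot shares it.
print(affected_snapshots(chain, {"p0"}))  # [0, 1, 2, 3]
# Physical block p3 is referenced only by the second and third snapshots.
print(affected_snapshots(chain, {"p3"}))  # [1, 2]
```

In this model, a block that rarely changes is exactly the one that the most snapshots depend on, which is why undetected corruption in it is so costly.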

With Recovery Appliance, database backups sent, stored, and restored are continually validated for Oracle database recoverability. The Recovery Appliance leverages RMAN validation, but without impacting production database server resources. This unique capability ensures that potential recovery issues are caught early and resolved proactively, instead of being uncovered during a critical production restore operation. Additionally, through Enterprise Manager, validation and recoverability status can be monitored on a per-database level, providing insights into the overall health of enterprise data that were never before possible.

    

4. Inherent data loss. Snapshots represent fixed, point-in-time views of source data. This means that for database snapshots, complete recoverability is gated by:

  1. The last validated snapshot
  2. All archived log backups from the snapshot time forward being available
  3. Current archived and redo logs being available on production storage

However, if the production storage system becomes unavailable because of an outage or a disaster, the recovery point objective (RPO) is gated by the timestamp of the last completed archived log backup, which could be many hours back. This means loss of critical business data from that time onward, a scenario that has a lasting adverse impact on any company.
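The arithmetic behind that exposure is simple, and a minimal Python sketch makes it concrete. The function name `data_loss_window`, its parameters, and the timestamps are illustrative assumptions; the sketch also assumes the first two conditions in the list above already hold (a validated snapshot and an unbroken set of archived log backups).

```python
from datetime import datetime, timedelta


def data_loss_window(
    failure_time: datetime,
    last_log_backup: datetime,
    production_logs_survive: bool,
) -> timedelta:
    """Estimate the span of transactions that cannot be recovered.

    If the current archived and redo logs survive on production storage,
    recovery can run up to the failure itself. If production storage is
    lost, recovery stops at the last completed archived log backup.
    """
    if last_log_backup > failure_time:
        raise ValueError("last_log_backup must not be later than failure_time")
    if production_logs_survive:
        return timedelta(0)
    return failure_time - last_log_backup


# Hypothetical timeline: logs backed up at 08:00, storage lost at 14:30.
failure = datetime(2020, 12, 8, 14, 30)
last_backup = datetime(2020, 12, 8, 8, 0)

print(data_loss_window(failure, last_backup, production_logs_survive=False))  # 6:30:00
print(data_loss_window(failure, last_backup, production_logs_survive=True))   # 0:00:00
```

The only way to shrink the first result is to shorten the interval between log backups, which is the gap that real-time redo transport is meant to close.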

With Recovery Appliance, all database transactions are recorded in real time via an efficient transport of redo blocks, the same technology underpinning the industry-proven Oracle Data Guard. This means there are always current and validated archived log backups on the appliance ready for use in any database recovery operation. Simply put, hours or more of data loss risk are eliminated, in favor of immediate transaction protection with zero data loss exposure.

In conclusion, let's review the key backup and recovery claims made by CDM vendors, clarify what they mean for Oracle databases, and contrast that with a database-optimized, efficient and high-performance data protection solution, such as Recovery Appliance:

CDM Vendor Claim: Incremental Forever Backup

With a CDM solution: After an incremental backup is taken, a snapshot of the data file copy is created, and then RMAN on the Oracle database performs an incremental merge to bring the copy up-to-date. For large database change volumes, merge operations induce heavy production server load and CDM storage I/O utilization, prolonging the time until an up-to-date copy is available for recovery use.

With Recovery Appliance: Incremental forever backups to the appliance result in up-to-date virtual full backups, without the use of incremental merge operations. The database only sends the changes and the Recovery Appliance does the work. This frees up production resources for more business critical workloads.

CDM Vendor Claim: Recover to Any Point-in-Time

With a CDM solution: Database recovery requires more than just reverting to an older set of snapshot files. The snapshot files and archived log backups from that point onwards must be cataloged by RMAN on the Oracle database and recovered to a transactionally consistent point so the database can be opened. This multi-step process can prolong recovery time, especially for large databases.

With Recovery Appliance: The appliance integrates directly with RMAN for all restore and recovery operations in a DBA-familiar and seamless manner. Virtual full backups are restored directly by RMAN on the Oracle database - without incremental merge - to achieve fast recovery to any point-in-time.

Furthermore, backups on the Recovery Appliance are continually validated for database recoverability, a unique differentiator when compared to CDM storage products.

CDM Vendor Claim: Instant Recovery

With a CDM solution: RMAN 'switch to copy' is used to redirect database file pointers to snapshot file copies, followed by restoring and recovering with archived logs so that the database can be opened for access.

With Recovery Appliance: The appliance offers fast, virtual full restore directly to production storage, up to 38 TB/hr with a single rack.

Furthermore, for 'instant recovery' needs, Oracle Data Guard allows a primary database to be switched over to an up-to-date standby database copy within seconds to minutes.

CDM Vendor Claim: Reduce Production RTO and RPO

With a CDM solution: While 'instant recovery' using CDM snapshots sounds good, a production recovered database is expected to offer production service levels. Running database files directly on CDM storage does not yield production-level performance, especially since copy-on-write operations must also be maintained for new snapshots that are created. Furthermore, such usage is not supported with Oracle Exadata (MOS Note 2663308.1 - Using External Storage with Exadata).

With all traditional backup methods, RPO is gated by when the last good backup was taken, which can be hours or more of data loss exposure.

With Recovery Appliance: Virtual full restores go directly to production storage with no sacrifice in database performance.

Zero to sub-second RPOs are achieved via industry-proven Oracle redo block transport, whereby transactions are backed up in real-time to the appliance.
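To put the restore rate quoted above in perspective, here is a rough back-of-the-envelope calculation in Python. It treats the "up to 38 TB/hr" single-rack figure as a best-case rate, so the result is only a lower bound on the restore step; the function name `min_restore_hours` and the database sizes are illustrative assumptions, and actual times depend on the network, production storage, and the recovery work that follows the restore.

```python
def min_restore_hours(database_tb: float, rate_tb_per_hr: float = 38.0) -> float:
    """Lower bound on restore time, assuming the best-case rate is sustained."""
    if database_tb < 0:
        raise ValueError("database_tb must be non-negative")
    if rate_tb_per_hr <= 0:
        raise ValueError("rate_tb_per_hr must be positive")
    return database_tb / rate_tb_per_hr


# Hypothetical database sizes, in terabytes.
for size_tb in (10, 50, 200):
    hours = min_restore_hours(size_tb)
    print(f"{size_tb} TB: at least {hours:.2f} hours")
```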

 

We hope this blog series was helpful in clearly distinguishing the use cases of CDM products versus backup solutions and properly assessing these technologies against your enterprise business requirements, including service level agreements. For more information, see this 3-minute Recovery Appliance value proposition video and compilation of product resources.

As always, let us know your questions and comments below!

 

Tim Chien is Senior Director of Product Management with Oracle's High Availability and Storage Management Group, focusing on Backup and Recovery, including Zero Data Loss Recovery Appliance, Recovery Manager (RMAN), and Flashback Technologies. His 20+ years of product management and marketing experience include both application server and database products, and he has presented at numerous Oracle and industry conferences around the world. Tim received his bachelor's and master's degrees in computer science from the Massachusetts Institute of Technology.
