Wednesday Jul 30, 2014

Backup to Oracle Cloud - Introduction to Oracle Database Backup Service

Backup and recovery of application data is the fundamental protection strategy for maintaining enterprise business continuity. I would be extremely surprised to hear of any enterprise that has never backed up its mission-critical or business-critical data. Any such scenario is basically a ticking time bomb.

Depending on the specific RTO (recovery time objective) and RPO (recovery point objective) for each database, different Oracle Maximum Availability Architecture (MAA) strategies can be deployed by the enterprise.

From the backup and recovery perspective, the following are general practice guidelines that customers typically follow to address RTO and RPO requirements:

•    Local Fast Recovery Area (FRA): Typically stores backups for up to 7 days
•    External Storage (NAS): up to 30 days
•    Tape media (if available): 1 to 6 months
•    Tape vaulting (offsite storage): months to years

In addition to the above backup storage tiers, sophisticated organizations take additional precautions to avoid single site failure and to reduce load from production resources. MAA best practices, for example, recommend that copies of the backup data be stored in an offsite location.

But consider the following complications:

  • Other than tape vaulting, there is no alternative that enables complete physical offsite storage for short- and long-term backups.
  • Many IT shops don’t have the tape infrastructure required for long-term archival. Hence they are restricted to using local disk backups or expensive backup appliances.
  • Organizations with multiple databases that have various RTO/RPO requirements may have certain 2nd or 3rd tier databases that never get backed up.
  • Due to compliance requirements, customers now have to store backups for many years. Storing large volumes of data on local disks can become prohibitively expensive.
  • Many enterprises don’t have the CAPEX budget in place to implement these additional data protection steps.
  • And almost ALL enterprises want a solution that’s operational right away.

So what’s the answer?

Introducing Oracle Database Backup Service - A Cloud Storage Solution for your Oracle Database Backups

Oracle Database Backup Service addresses the above needs by providing a low-cost alternative for storing backups in an offsite location. It is an Oracle Public Cloud object-based storage offering that enables you to store your on-premises or cloud-deployed database backups. You can use Oracle Database Backup Service as the primary backup for 2nd or 3rd tier databases, or use the cloud backup as a secondary copy for long-term archival requirements.

If you are familiar with Oracle Recovery Manager (RMAN), it should take only a few minutes for you to start backing up your database to the cloud. Here’s all you need to do:

1. Subscribe to the Oracle Database Backup Service.

  • This offering is available as a month-to-month or longer-term subscription (1, 2, or 3 years). Note that the subscription model is subject to change.

2. Download the Oracle Database Cloud Backup Module from the OTN site.

  • Unzip the file, which includes a detailed README describing the steps to execute.

3. Run the installation procedure.

  • Provide your Oracle Public Cloud credentials, which are stored securely in an Oracle wallet alongside your database. The installation script also sets up the required configuration files.

4. Configure RMAN.

  • By using CONFIGURE (persistent), SET or even BACKUP commands, you can instruct RMAN to use the backup service module for backups.

5. Start enabling your backups and restores.

  • Use regular RMAN BACKUP and RESTORE commands for backups and restores. All backup and recovery operations that use backup sets are supported.
  • You can also perform backups from FRA and other disk-based backup locations to the cloud.

How does this process work?

The Oracle Database Cloud Backup Module (ODCBM) receives backup blocks from RMAN, splits them into 20 MB chunks, and transmits the chunks to the Oracle cloud. During the restore process, the same module retrieves the data from the cloud. RMAN sees the module as an SBT (tape) device.

What are some unique features that Oracle Database Backup Service offers?
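The chunking step can be pictured with a small Python sketch. This is only an illustration of splitting a stream into fixed-size pieces, not the module's actual implementation: the function names are hypothetical, and treating 20 MB as 20 × 1024 × 1024 bytes is an assumption.

```python
import io
from typing import BinaryIO, Iterable, Iterator

# Assumption: "20MB" is read here as 20 * 1024 * 1024 bytes.
CHUNK_SIZE = 20 * 1024 * 1024


def iter_chunks(stream: BinaryIO, chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Yield successive chunks of at most chunk_size bytes from a binary stream."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break  # end of stream
        yield chunk


def reassemble(chunks: Iterable[bytes]) -> bytes:
    """Concatenate chunks in order, as a restore would do after retrieving them."""
    return b"".join(chunks)


if __name__ == "__main__":
    # A tiny stand-in for a backup piece, chunked at 1000 bytes for readability.
    data = bytes(range(256)) * 10
    parts = list(iter_chunks(io.BytesIO(data), chunk_size=1000))
    print([len(p) for p in parts])
    print(reassemble(parts) == data)
```

Only the final chunk can be shorter than the chunk size, and concatenating the chunks in order gives back the original stream.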

To name a few:

  • End-to-end security: RMAN encryption is performed at backup time, and the data is transmitted securely over the WAN. And by the way, you don’t have to purchase the Advanced Security Option (ASO) to use RMAN encryption. You can use password-based encryption, Transparent Data Encryption (TDE), or dual mode. Encryption is supported for EE, SE, and SE1 editions.
  • Backups can be compressed to reduce the volume of data being transmitted. For Oracle Database 10gR2 and 11gR1, you can use BASIC compression. For 11gR2 and above, you can choose from LOW, MEDIUM, BASIC, and HIGH.
  • There’s NO ADDITIONAL COST other than the subscription to Oracle Database Backup Service.
  • You can use any number of RMAN channels to parallelize your backup and restore operations.
  • There are NO new commands to learn. Use the familiar RMAN commands.
  • Because a large portfolio of applications is already available in Oracle Cloud, you can use your backup in the cloud to spin up a new instance or use it for your other PaaS or SaaS requirements.

So what are you waiting for?  Do you want to check the network throughput before you sign up? Start with a no-obligation one month trial by clicking “Try Now” from

For more information,

In future blogs on Oracle Database Backup Service, I will discuss some best practices when deploying cloud-based backups.

Welcome to the MAA Blog!!

Welcome to the MAA blog! This set of blogs is created and maintained by members of Oracle’s Maximum Availability Architecture (MAA) team within Oracle’s Server Technology Development group. The MAA team interacts with Oracle’s customers around the world on various critical high availability (HA) initiatives, and with this blog forum, we hope to bring you musings on some of the rich experiences we have gained to date. Our goal is to enrich the Oracle ecosystem with an interesting, informative and interactive conversation around Oracle MAA.

Please refer to the MAA website on OTN for the latest collection of best practices for Oracle MAA.

Ashish Ray

Wednesday Jul 16, 2014

Oracle MAA Part 2: Bronze HA Reference Architecture

In the first installment of this series we discussed how one size does not fit all when it comes to HA architecture. We described Oracle Maximum Availability Architecture (Oracle MAA) best practices that define four standard reference architectures for data protection and high availability: BRONZE, SILVER, GOLD and PLATINUM. Each reference architecture uses an optimal set of Oracle HA capabilities that reliably achieves a given service level (SLA) at the lowest cost. As you progress from one level to the next, each architecture expands upon the one that preceded it in order to handle an expanded fault domain and deliver a higher level of service.

This article provides details for the Bronze reference architecture.

Bronze is appropriate for databases where a simple restart or restore from backup is ‘HA enough’. It uses a single-instance Oracle database (no cluster) to provide a very basic level of HA and data protection in exchange for reduced cost and implementation complexity. An overview is provided in the figure below.

Bronze Reference Architecture

When a database instance or the server on which it is running fails, the recovery time objective (RTO) is a function of how quickly the database can be restarted and resume service. If a database is unrecoverable, the RTO becomes a function of how quickly a backup can be restored. In the worst-case scenario of a complete site outage, additional time is required to provision new systems and perform these tasks at a secondary location; in some cases this can take days.

The potential data loss if there is an unrecoverable outage (the recovery point objective, or RPO) is equal to the data generated since the last backup was taken. Copies of database backups are retained locally and at a remote location or in the cloud for the dual purpose of archival and DR should a disaster strike the primary data center.
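As a rough illustration of that RPO statement, the exposure window is simply the time elapsed between the last completed backup and the failure. The sketch below is a minimal example with made-up timestamps; the function name is hypothetical.

```python
from datetime import datetime, timedelta


def data_loss_window(last_backup: datetime, failure: datetime) -> timedelta:
    """Return how much time's worth of changes is lost when restoring the last backup."""
    if failure < last_backup:
        raise ValueError("failure time precedes the last backup")
    return failure - last_backup


if __name__ == "__main__":
    # Made-up example: nightly backup at 23:00, unrecoverable outage the next afternoon.
    last_backup = datetime(2014, 7, 15, 23, 0)
    failure = datetime(2014, 7, 16, 14, 30)
    print(data_loss_window(last_backup, failure))
```

With a fixed backup schedule, the worst case for this window approaches the interval between backups.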

Major components of the Bronze reference architecture and the service levels achieved include:

Oracle Database HA and Data Protection

  • Oracle Restart automatically restarts the database, the listener, and other Oracle components after a hardware or software failure or whenever a database host computer restarts.
  • Oracle corruption protection checks for physical corruption and logical intra-block corruptions. In-memory corruptions are detected and prevented from being written to disk and in many cases can be repaired automatically. For more details see Preventing, Detecting, and Repairing Block Corruption.
  • Automatic Storage Management (ASM) is an Oracle-integrated file system and volume manager that includes local mirroring to protect against disk failure.
  • Oracle Flashback Technologies provide fast error correction at a level of granularity that is appropriate to repair an individual transaction, a table, or the full database.
  • Oracle Recovery Manager (RMAN) enables low-cost, reliable backup and recovery optimized for the Oracle Database.
  • Online maintenance includes online redefinition and reorganization for database maintenance, online file movement, and online patching. 

Database Consolidation

  • Databases deployed using Bronze often include development and test databases and databases supporting smaller work group and departmental applications that are often the first candidates for database consolidation.
  • Oracle Multitenant is the MAA best practice for database consolidation from Oracle Database 12c onward.

Life Cycle Management

  • Oracle Enterprise Manager Cloud Control enables self-service deployment of IT resources for business users along with resource pooling models that cater to various multitenant architectures. It supports Database as a Service (DBaaS), a paradigm in which end users (Database Administrators, Application Developers, Quality Assurance Engineers, Project Leads, and so on) can request database services, consume them for the lifetime of the project, and then have them automatically de-provisioned and returned to the resource pool.

Oracle Engineered Systems

  • Oracle Engineered Systems are an efficient deployment option for database consolidation and DBaaS. Oracle Engineered Systems reduce lifecycle cost by standardizing on a pre-integrated and optimized platform for Oracle Database that is completely supported by Oracle.

Bronze Summary:  Data Protection, RTO, and RPO

Table 1 summarizes the data protection capabilities and service levels provided by the Bronze tier. The first column indicates when validations for physical and logical corruption are performed:

  • Manual checks are initiated by the administrator or at regular intervals by a scheduled job.
  • Runtime checks are automatically executed on a continuous basis by background processes while the database is open.
  • Background checks are run on a regularly scheduled interval, but only during periods when resources would otherwise be idle.
  • Each check is unique to Oracle Database using specific knowledge of Oracle data block and redo structures.

Table 1: Bronze - Data Protection

Type | Capability | Physical Block Corruption | Logical Block Corruption
--- | --- | --- | ---
Manual | Dbverify, Analyze | Physical block checks | Logical checks for intra-block and inter-object consistency
Manual | RMAN | Physical block checks during backup and restore | Intra-block logical checks
Runtime | Database | In-memory block and redo checksum | In-memory intra-block logical checks
Runtime | ASM | Automatic corruption detection and repair using local extent pairs |
Runtime | Exadata | HARD checks on write | HARD checks on write
Background | Exadata | Automatic HARD Disk Scrub and Repair |

Note that HARD validation and the Automatic Hard Disk Scrub and Repair (the last two rows of Table 1) are unique to Exadata storage. HARD validation ensures that Oracle Database does not write physically corrupt blocks to disk. Automatic Hard Disk Scrub and Repair inspects and repairs hard disks with damaged or worn out disk sectors (cluster of storage) or other physical or logical defects periodically when there are idle resources.

Table 2 summarizes RTO and RPO for the Bronze tier for various unplanned and planned outages.

Table 2: Bronze - Recovery Time and Data Loss Potential

Type | Event | Downtime | Data Loss Potential
--- | --- | --- | ---
Unplanned | Database instance failure | Minutes | Zero
Unplanned | Recoverable server failure | Minutes to an hour |
Unplanned | Data corruptions, unrecoverable server failure, database failures, or site failures | Hours to days | Since last backup
Planned | Online file move, online reorganization and redefinition, online patching | |
Planned | Hardware or operating system maintenance and database patches that cannot be performed online | Minutes to hours |
Planned | Database upgrades: patch sets and full database releases | Minutes to hours |
Planned | Platform migrations | Hours to a day |
Planned | Application upgrades that modify back-end database objects | Hours to days |

So when would you use Bronze? Bronze is useful when users can wait for a backup to be restored if there is an unrecoverable outage and accept that any data generated since the last backup was taken will be lost. The Oracle Database includes a number of the capabilities described above that provide unique levels of data protection and availability for a low-cost environment based upon the Bronze reference architecture.

But what if you can't accept this level of downtime or data loss potential? That is where the Silver, Gold and Platinum reference architectures come in. Bronze is only a starting point that establishes the foundation for subsequent HA reference architectures that provide higher quality of service. Stay tuned for future blog posts that will dive into the details of each reference architecture.

Wednesday Jul 02, 2014

Oracle GoldenGate Active-Active Part 2

My last post focused on whether or not an application's database structure was set up sufficiently to perform conflict detection and resolution in active-active GoldenGate environments. Assuming that your application structure is ready, I'll now explain how to actually prevent conflicts from happening in the first place. While this is ideal, I don't think conflict prevention is something we could ever guarantee... especially when a fault or hiccup occurs in either the database or GoldenGate itself.

Let's break up conflicts into 3 types, based on the DML: 

1. Inserts

2. Deletes

3. Updates 

1. Insert conflicts typically occur when two rows have the same primary key or when there are duplicate unique keys within a table. 

· Two rows with the same primary key: To address these cases we could have primary keys generated based on a sequence value, then set up something like alternating sequences. Depending on how many nodes or servers are in the environment, you could use an algorithm that starts with n and increments by N (where n is the node or server number and N is the total number of nodes or servers). For example, in a 2-way scenario, one side would have odd sequence values (start with 1 and increment by 2) and the other would have even sequence values (start with 2 and increment by 2).

· Duplicate unique keys: Avoiding conflicts in tables that have duplicate unique keys is a little trickier, and sometimes must be managed from the application perspective.  For example, let's say for a particular application that we have a table that contains login information for an account.  We would want the login name to be a unique value.  However it is possible that two people working on two different servers could attempt to obtain the same login name.  These kinds of operations can be eliminated if we restrict new account creation to a single server, thereby letting the database handle the uniqueness of a column. 
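The alternating-sequence idea for primary keys (start with n, increment by N) can be checked with a short Python sketch. It only models the arithmetic; in a real deployment each database would define its own sequence with the matching start and increment values. The function name is hypothetical.

```python
from itertools import count, islice
from typing import Iterator


def node_key_sequence(node_number: int, total_nodes: int) -> Iterator[int]:
    """Generate keys for one node: start at n, increment by N (nodes numbered 1..N)."""
    if total_nodes < 1 or not 1 <= node_number <= total_nodes:
        raise ValueError("node_number must be between 1 and total_nodes")
    return count(start=node_number, step=total_nodes)


if __name__ == "__main__":
    # 2-way scenario from the text: one side gets odd values, the other even values.
    side_one = list(islice(node_key_sequence(1, 2), 5))
    side_two = list(islice(node_key_sequence(2, 2), 5))
    print(side_one)
    print(side_two)
    print(set(side_one).isdisjoint(side_two))
```

Because every node's values share the same remainder modulo N, two nodes can never generate the same key, regardless of how many rows each one inserts.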

2. Delete conflicts are usually nothing to worry about. In most cases, they occur when two people attempt to delete the same record, or when someone tries to update a record that has already been deleted. These conflicts can usually just be ignored. However, I typically recommend that customers keep track of these types of conflicts in an exception table, just to make sure that nothing out of the ordinary is occurring. Once you’ve confirmed that things are running smoothly you can eliminate the exception mapping and just ignore the conflicts completely.

3. Update conflicts are definitely the most prevalent.  These conflicts occur when two people try to update the same logical record on two different servers.  A typical example is when a customer is on the phone with support to change something associated with his or her credit card. At the same time, the customer is also logged into the account and is trying to change his or her address.  If these activities occur on two different servers and the lag is high enough, it could cause a conflict. In order to reduce or eliminate these conflicts there are a few best practices to follow: 

1) Reduce the Oracle GoldenGate (OGG) lag to the lowest level possible. There are a few knowledge notes on this. The master note is "Main Note - Oracle GoldenGate - Lag, Performance, Slow and Hung Processes" (Doc ID 1304557.1).

2) Logically partition users based upon geographical regions or usernames.  For example, when all users in North America access one server, and users in Europe access a different server, the chance of two people updating the same logical record on two different machines is greatly reduced.  Another option is to split up the users based on their usernames. Even something as simple as setting up usernames A-M to log into one server and usernames N-Z to log into another server can help reduce conflicts.   The reason this helps is related to my next point...

3) Set up Session Persistence time. IP or Session Persistence is the ability of a load balancer or router to keep track of where a connection is sent. In the event that a connection is lost, disconnected, etc, and a user attempts to reconnect or log back in, the connection will be sent to the same server where it was originally connected.  Most sessions have a time value that can be associated with this persistence. For example, if I set my session persistence to 10 seconds, then any time a session is disconnected or killed, the user will be sent to the same server as long as he or she logs back in within 10 seconds.  This is ideal for Oracle GoldenGate environments, where there would be lag between the different databases. In an ideal situation you would set this session persistence time value to be twice the average lag or 20 seconds – whichever is higher.  This allows a user who is filling a shopping cart or booking a reservation to maintain a consistent view of the data, even in the event of a client or network failure. 
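Practices 2 and 3 above both reduce to simple rules, sketched below in Python. This is an illustration only: the server names are placeholders, sending names outside A-Z to the second server is an assumption, and a real load balancer would express these rules in its own configuration.

```python
def server_for_username(username: str) -> str:
    """Route usernames starting with A-M to one server and all others to a second one."""
    first = username.strip()[:1].upper()
    # Assumption: names that do not start with A-M (including digits) go to server_b.
    return "server_a" if "A" <= first <= "M" else "server_b"


def session_persistence_seconds(average_lag_seconds: float,
                                floor_seconds: float = 20.0) -> float:
    """Twice the average replication lag or 20 seconds, whichever is higher."""
    return max(2 * average_lag_seconds, floor_seconds)


if __name__ == "__main__":
    print(server_for_username("alice"), server_for_username("Nadia"))
    print(session_persistence_seconds(4))   # low lag: the 20-second floor applies
    print(session_persistence_seconds(15))  # higher lag: twice the lag applies
```

Keeping a given user on one server for longer than the replication lag means that the user's own changes have normally reached the other database before he or she can be routed to it.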

By using these methods, the number of conflicts that actually occur can be drastically reduced, leading to a happier end user experience.  But even with the best intentions and preparation, not every conflict can be avoided. In my next post I will cover how to resolve such unavoidable conflicts. 


Musings on Oracle's Maximum Availability Architecture (MAA), by members of Oracle Development team. Note that we may not have the bandwidth to answer generic questions on MAA.

