To SYNC or not to SYNC – Part 4

This is Part 4 of a multi-part blog article in which we discuss various aspects of setting up Data Guard synchronous redo transport (SYNC). In Part 1 of this article, I debunked the myth that Data Guard SYNC is similar to a two-phase commit operation. In Part 2, I discussed the various ways that network latency may or may not impact a Data Guard SYNC configuration. In Part 3, I talked in detail about why Data Guard SYNC is a good thing, and the distance implications you have to keep in mind.


In this final article of the series, I will talk about how you can nicely complement Data Guard SYNC with the ability to fail over in seconds.


Wait - Did I Say “Seconds”?


Did I just say that some customers do Data Guard failover in seconds? Yes, Virginia, there is a Santa Claus.


Data Guard has an automatic failover capability, aptly called Fast-Start Failover. Initially available with Oracle Database 10g Release 2 for Data Guard SYNC transport mode (and enhanced in Oracle Database 11g to support Data Guard ASYNC transport mode), this capability, managed by Data Guard Broker, lets your Data Guard configuration automatically fail over to a designated standby database. Yes, this means no human intervention is required to do the failover. This process is controlled by a low-footprint Data Guard Broker client called the Observer, which makes sure that the primary database and the designated standby database are behaving like good kids. If something bad were to happen to the primary database, the Observer, after a configurable threshold period, tells that standby, “Your time has come, you are the chosen one!” The standby dutifully follows the Observer’s directives by assuming the role of the new primary database. The DBA or the sysadmin doesn’t need to be involved.
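
To make the "configurable threshold period" idea concrete, here is a minimal Python sketch of the decision rule. It is a toy model only, not how the Observer is actually implemented: the function name, the health-check sample format, and the 30-second value are all assumptions made for illustration. The point it shows is that a brief blip resets the clock, and failover is triggered only once the primary has been continuously unreachable for the full threshold.

```python
from typing import Iterable, Optional, Tuple


def failover_time(samples: Iterable[Tuple[float, bool]],
                  threshold_s: float) -> Optional[float]:
    """Return the time at which a toy observer would start failover, or None.

    samples: (timestamp_seconds, primary_reachable) health checks, in time order.
    threshold_s: how long the primary must stay unreachable before failover.
    Hypothetical model for illustration; not Oracle's implementation.
    """
    outage_start: Optional[float] = None
    for t, reachable in samples:
        if reachable:
            outage_start = None  # primary is back; reset the clock
            continue
        if outage_start is None:
            outage_start = t  # first failed check of this outage
        if t - outage_start >= threshold_s:
            return t  # outage has lasted the full threshold
    return None


# A short blip at t=5 recovers at t=10; the outage starting at t=15 does not.
checks = [(0, True), (5, False), (10, True), (15, False), (30, False), (45, False)]
print(failover_time(checks, threshold_s=30))
```

In this made-up trace the blip at t=5 is ignored because the primary answers again at t=10, while the outage that begins at t=15 crosses the 30-second threshold at t=45.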


And - in case you are following this discussion very closely, and are wondering … “Hmmm … what if the old primary is not really dead, but just network isolated from the Observer or the standby - won’t this lead to a split-brain situation?” The answer is No - It Doesn’t. With respect to why-it-doesn’t, I am sure there are some smart DBAs in the audience who can explain the technical reasons. Otherwise - that will be the material for a future blog post.


So - this combination of SYNC and Fast-Start Failover is the nirvana of lights-out, integrated HA and DR, as practiced by some of our advanced customers. They have observed failover times (with no data loss) ranging from single-digit seconds to tens of seconds. With this, they support operations in industry verticals such as manufacturing, retail, telecom, Internet, etc. that have the most demanding availability requirements.


One of our leading customers with massive cloud deployment initiatives tells us that they know about server failures only after Data Guard has automatically completed the failover process and the app is back up and running! Needless to say, Data Guard Broker has the integration hooks for interfaces such as JDBC and OCI, or even for custom apps, to ensure the application gets automatically rerouted to the new primary database after the database-level failover completes.


Net Net?


To sum up this multi-part blog article, Data Guard with SYNC redo transport mode, plus Fast-Start Failover, gives you the ideal triple-combo - that is, it gives you the assurance that for critical outages, you can fail over your Oracle databases:



  1. very fast

  2. without human intervention, and

  3. without losing any data.


In short, it takes the element of risk out of critical IT operations. It does require you to be more careful with your network and systems planning, but as far as HA is concerned, the benefits outweigh the investment costs.


So, this is what we in the MAA Development Team believe in. What do you think? How has your deployment experience been? We look forward to hearing from you!

Comments:

Thanks for the great explanation on To SYNC or not to SYNC 1-4
We are at a critical decision to invest in exadata
A. Primary Site [Prod] Oracle11gR2 RAC
B. Primary Site [hot Active Standby] Oracle11gR2 Single no RAC [SYNC]
cascade to DR with ASYNC
C. DR Site [DR] Oracle11gR2 RAC Physical Standby [ASYNC]

We want to achieve RPO=0 data loss.
1. from A TO B active dg with SYNC max protection and B TO C with cascade ASYNC max Performance.
A quick reply today to my email would be of great help.
I have read an old post saying you cannot cascade with RAC 11gR2. Is that still true?

Posted by cleto on April 20, 2013 at 11:38 PM PDT #

There are no restrictions with respect to RAC and cascading from Oracle Database 11.2.0.2 onwards. However, in Oracle Database 11g, if your Data Guard setup is like A -> B -> C, the B -> C redo shipping is like ARCH. Please look into Oracle Database 12c, where we have enabled B -> C redo shipping to be ASYNC. While you are reviewing the Database 12c documentation, please also refer to Data Guard Far Sync, which enables zero data loss for long-distance Data Guard deployments.

Posted by Ashish Ray on July 18, 2013 at 12:24 PM PDT #

About

Musings on Oracle's Maximum Availability Architecture (MAA), by members of Oracle Development team. Note that we may not have the bandwidth to answer generic questions on MAA.
