To SYNC or not to SYNC – Part 1

Zero Data Loss – Nervously So?

As part of our Maximum Availability Architecture (MAA) conversations with customers, one issue that is often discussed is the capability of zero data loss in the event of a disaster. Naturally, this offers the best RPO (Recovery Point Objective), as far as disaster recovery (DR) is concerned. The Oracle solution that is a must-have for this is Oracle Data Guard, configured for synchronous redo transport. However, whenever the word “synchronous” is mentioned, the nervousness barometer rises. Some objections I have heard:

  • “Well, we don’t want our application to be impacted by network hiccups.”
  • “Well, what Data Guard does is two-phase-commit, which is so expensive!”
  • “Well, our DR data center is on the other coast, so we can’t afford a synchronous network.”

And a few others.

Some of these objections are valid, some are not. In this multi-part blog series, I will address these concerns, and more. In this particular blog, which is Part 1 of this series, I will debunk the myth that Data Guard synchronous redo transport is similar to two-phase commit.

SYNC != 2 PC

Let’s be as clear as possible. Data Guard synchronous redo transport (SYNC) is NOT two-phase-commit. Unlike distributed transactions, there is no concept of a coordinator node initiating the transaction, there are no participating nodes, there are no prepare and commit phases working in tandem.

So what really happens with Data Guard SYNC? Let’s look under the covers.

Upon every commit operation in the database, the LGWR process flushes the redo buffer to local online redo logs - this is the standard way Oracle database operates. With Data Guard SYNC, in addition, the LGWR process tells the NSS process on the primary database to make these redo blocks durable on the standby database disk as well. Until LGWR hears back from NSS that the redo blocks have been written successfully in the standby location, the commit operation is held up. That’s what provides the zero data loss assurance. The local storage on the primary database gets damaged? No problem. The bits are available on the standby storage.

But how long should LGWR wait to hear back from NSS? Well, that is governed by the NET_TIMEOUT attribute of the log_archive_dest parameter corresponding to the standby. Once LGWR hears back from NSS that life is good, the commit operation completes.

Now, let’s look into how the NSS process operates. Upon every commit, the NSS process on the primary database dutifully sends the committed redo blocks to the standby database, and then waits till the RFS process on the standby receives them, writes them on disk on the standby (in standby redo logs or SRLs), and then sends the acknowledgement back to the NSS process.

So - on the standby database, what’s happening is just disk I/O to write the incoming redo blocks into the SRLs. This should not be confused with two-phase-commit, and naturally this process is much faster compared to a distributed transaction involving two-phase-commit coordination.

In case you are wondering what happens to these incoming redo blocks in the SRLs - well, they get picked up - asynchronously, by the Managed Recovery Process (MRP) as part of Redo Apply, and the changes get applied to the standby data files in a highly efficient manner. But this Redo Apply process is a completely separate process from Redo Transport - and that is an important thing to remember whenever these two-phase-commit questions come up.

Now that you are convinced that Data Guard SYNC is not the same as two-phase commit, in the next blog article, I will talk about impact of network latency on Data Guard SYNC redo transport.


Comments:

Thanks, I was just asking my DBA about this the other day. Very clear explanation!

Posted by Tom Shaw on September 06, 2011 at 06:12 PM PDT #

Hi Ashish,

Is it then possible for data to be available on the standby that has not been confirmed as commited to the user session?

What I mean by that is, if the RFS process has written the change to the SRL but you have a disaster on the primary before the user session has received the confirmation.

Sure there might be a very small window of time when this can occur, but I'm just curious if this theoretically a possbility?

regards,

jason.

Posted by jason arneil on September 07, 2011 at 12:58 AM PDT #

Ashish,
I really appreciate that you take the effort to debunk this 2PC myth about LGWR SYNC AFFIRM! Did so myself sometimes while teaching our Data Guard courses. I am looking forward to the further postings that you announced about the topic. Keep up the good work! :-)

Posted by Uwe Hesse on September 07, 2011 at 03:36 AM PDT #

Good question, Jason.

Yes, this situation is similar to the case of any regular Oracle database (without any standbys) completing the commit on disk, but suffering an outage before the user session receives the commit acknowledgment. So in such a case, typically the application may do a re-query to determine durability status, or run whatever exception handling code that is appropriate under the circumstance, and the same rule applies for a Data Guard setup as well.

On a related note, in case you are wondering, except under Data Guard failover situations, we don't apply redo at the standby (even if written to standby redo log) unless we know it has been committed to the online redo log at the primary. That way we act safe and prevent standby data divergence compared to the primary.

Posted by Ashish Ray on September 07, 2011 at 07:30 AM PDT #

Just discovered this blog, thanks for a very clear explanation. Looking forward to more blog posts.

Posted by Thinus Meyer on September 06, 2012 at 12:11 AM PDT #

Hi Ashish,

In context of your explanatory reply to Jason, can you kindly throw some light on how does the MRP confirm that redo in SRL is committed to the ORL at primary?

Thanks

Posted by PC on April 11, 2013 at 02:23 AM PDT #

The redo data structure has internal bits to indicate the commit boundary.

Posted by Ashish Ray on July 18, 2013 at 12:14 PM PDT #

Hi Ashish,

Thanks for your reply.

I am somewhat more confused. In that reply of yours to Jason, 2nd para, end of 2nd line, you say "committed to the online redo log" - did that mean, "written to the online redo log" or "commit for the transaction written to the online redo log"? Believe its the former. Because if its the latter, then it would mean that if there is a transaction that spans multiple ORL before being committed then MRP will not apply the redo in the SRL initially and will move on to the next SRL. Later, when its committed and the SRL comes across the commit, MRP will have to go back and read an old SRL which I dont think is the case.

But in that case, the commit boundary (which I believe is to indicate whether the transaction has committed or no) should not matter - isnt it?

Kindly help understand.

Thanks.

Posted by PC on July 22, 2013 at 03:04 AM PDT #

>>> did that mean, "written to the online redo log"

Yes, you are correct - that was incorrect choice of words on my part. I re-looked at my response - there might be another incorrect implication. MRP just goes on applying (as fast as it can) from the SRLs - whether they are committed or uncommitted entries. When the time comes for exposing the changes thru a query on an Active Data Guard standby, the Data Guard infrastructure makes sure that the commit boundaries are honored - in other words, dirty data is not exposed. It's very similiar to the way multi-version concurrency control works in a production Oracle database.

Posted by Ashish Ray on July 22, 2013 at 10:57 AM PDT #

Thanks Ashish.

So the change recieved on standby side (written in the SRL) is always applied. Just that, in case of DG Failover, uncommitted changes will be rolled back - right?

Posted by PC on July 23, 2013 at 04:35 AM PDT #

They are rolled back after new primary database opens read-write.

Posted by Ashish Ray on July 29, 2013 at 07:49 PM PDT #

Post a Comment:
  • HTML Syntax: NOT allowed
About

Musings on Oracle's Maximum Availability Architecture (MAA), by members of Oracle Development team. Note that we may not have the bandwidth to answer generic questions on MAA.

Search

Categories
Archives
« April 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today