I am often asked what sort of recovery time Full Stack DR can guarantee for a business system or application stack. Full Stack DR has no settings or way to configure anything about recovery point or recovery time. Recovery time really depends on you and OCI.
Based on the deployments we are building for application-specific tutorials, I can tell you that we are seeing an average of 20 to 30 minutes for a switchover, including the time it takes to automatically pre-check all the steps and stop things nicely at the primary region. However, these are simple deployments of PeopleSoft, E-Business Suite, JD Edwards EnterpriseOne, Oracle Analytics Cloud and WebLogic with absolutely no workload, which makes those numbers sort of meaningless in the long run.
When I say “sort of meaningless”, I mean that the time it takes Data Guard to recover a database with zero transactions per second (tps), as in our case, is going to be significantly different from the time it takes to restore a production database with 800,000 tps. The time it takes to restore a database depends entirely on your workload and Data Guard, not Full Stack DR.
Each company has some unique way of deploying a given business system that usually includes a host of different applications and homegrown, in-house satellite applications as part of a single, unified ecosystem. We want you to be free to deploy an application stack for DR any way you think is best for your unique situation. Therefore, we leave the installation and configuration of applications, databases, Data Guard, storage and networking up to you - do it your way, not our way.
This always leads to the next logical question: if Full Stack DR doesn’t guarantee or control how long a recovery takes, then how does it help me?
Full Stack DR helps you because it only takes one person to recover an application, not five system admins, a couple of DBAs, some storage and network admins and a few application specialists performing heroics and slaving away at a keyboard for hours on end. Plus, one person can execute recovery for many completely different business systems at the same time if needed.
Full Stack DR has the flexibility to handle highly unique business systems with the ability to automate recovery for almost anything because we don’t limit you to a specific way of installing or deploying your applications and databases for disaster recovery.
Although OCI services are generally fast, we can’t make them move any faster. But we can eliminate the time your senior IT staff spend sitting at a keyboard walking through twenty or more tedious, time-consuming and error-prone steps in a DR runbook for hours at a time. This means your IT staff can recover several business systems in the time it would normally take them to recover just one.
NEC Corporation is a multinational information technology corporation headquartered in Tokyo, Japan. NEC has provided Oracle products and services with Oracle Japan for over 30 years, helping Japanese customers with their requirements and technological challenges. NEC provides support services for approximately 20,000 Oracle Database systems in Japan and is currently on a mission to help power their customers’ journey to the Oracle Cloud.
Japan experiences many natural disasters such as earthquakes and typhoons, so there is strong demand for reliable disaster recovery systems. The process of recovering from a disaster is extremely complex and time-consuming, requiring system-specific procedures such as changing Data Guard and OCI storage replication settings.
NEC measured how easily Full Stack DR can perform the complex tasks needed to recover a sample web portal. As shown in the diagram below, the test measured the effort required for switchover and failover with and without Full Stack DR using a model case of a highly available web portal built around Oracle RAC and Oracle Active Data Guard.
The test results shown in Table 1 below illustrate a couple of interesting metrics related to reducing effort and recovery time when using Full Stack Disaster Recovery. For a switchover, the number of steps the user must complete was reduced by almost 90% and the time required to fully recover the web portal was reduced by almost 40%.
However, failovers are a more realistic measure of actual recovery time since failovers don't include the time to shut things down and clean up artifacts at a primary region. In the case of a failover, the recovery steps were reduced by around 88% and recovery time was reduced by almost 60% compared with a manually executed recovery. The important thing to understand about these data is that the dramatic reduction in recovery time for both DR operations was due almost entirely to the fact that no one had to type anything at a keyboard once the DR operation was started.
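The reduction percentages quoted above come from simple before/after arithmetic. Here is a minimal sketch in Python of that calculation. The step counts are hypothetical values chosen for illustration (50 manual steps versus 6 automated ones) and are not taken from NEC's Table 1; the function name `percent_reduction` is likewise just an illustrative helper.

```python
def percent_reduction(before: float, after: float) -> float:
    """Return how much smaller `after` is than `before`, as a percentage."""
    if before <= 0:
        raise ValueError("before must be positive")
    return 100 * (before - after) / before


# Hypothetical figures for illustration only, not measured results.
manual_steps = 50
automated_steps = 6

step_reduction = percent_reduction(manual_steps, automated_steps)
print(f"Steps reduced by {step_reduction:.0f}%")
```

The same helper works for recovery time: pass the manual and automated durations in any consistent unit.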
The amount of mouse and keyboard work is dramatically reduced when using Full Stack DR. For example, the number of steps it takes to perform a switchover or a failover is exactly the same for any application stack, no matter how complex or extensive the recovery process. It's always the same six steps, from logging into the OCI console at the standby region, to selecting the desired DR protection group, to clicking on the Execute DR plan button. It's that simple.
It is interesting to note that the time it took Full Stack DR to recover the application included the automatic pre-check to validate the readiness of the DR plan for the switchover/failover. Checking the readiness of things like Data Guard, storage, networking and compute, and ensuring custom automation is accessible, is frequently forgotten or not fully completed in manually executed DR operations. The pre-checks take a few minutes to complete, which of course increases the recovery time.
Pre-checks have zero impact on production, so OCI customers can perform periodic pre-checks as part of ongoing operations. Full Stack DR allows you to skip the pre-checks during a recovery operation, which means you could shave five or six minutes off the overall recovery time if that helps.
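As a rough illustration of that saving, the sketch below subtracts the five-to-six-minute pre-check from the 20-to-30-minute switchover range quoted earlier for idle tutorial deployments. Combining those two figures is an assumption made here for illustration, and `range_without_precheck` is a hypothetical helper. Real durations depend on your workload and on OCI.

```python
def range_without_precheck(total_min, precheck_min):
    """Given (low, high) ranges in minutes, return the (low, high) range
    that remains when the pre-check portion is skipped."""
    total_low, total_high = total_min
    pre_low, pre_high = precheck_min
    # Best case: shortest run minus longest pre-check.
    # Worst case: longest run minus shortest pre-check.
    return (total_low - pre_high, total_high - pre_low)


# Figures from the text: 20-30 minute switchover, 5-6 minute pre-check.
low, high = range_without_precheck((20, 30), (5, 6))
print(f"Estimated switchover without pre-checks: {low} to {high} minutes")
```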
Full Stack DR saves time and money since it reduces the time needed to implement and maintain DR as well as the number of people and resources involved in a recovery operation. This allows for quick and error-free failovers in the event of unplanned outages.
We maintain a wealth of videos explaining how Full Stack DR works, as well as videos showing Full Stack DR in action recovering various Oracle and non-Oracle application stacks. These videos are available in several topic-focused playlists on our Oracle YouTube channel.
Full Stack DR gives you the power and flexibility to implement DR for Oracle or non-Oracle applications in OCI the way you want, not the way we want. Learn more about Full Stack DR by visiting the following links to resources:
I have been specializing in high availability and disaster recovery since 1996. I also have a systems and storage administration/networking background and have worked in various roles, including 8 years in production systems administration and IT operations here at Oracle. I began my product management experience with Oracle VM in 2011, including a stint as a software developer in virtualization for 5 years, returning to product management in time to launch Full Stack DR in October 2022.