Oracle Database Service for Microsoft Azure (ODSA) is an Oracle-managed service that enables customers to easily provision, access, and operate enterprise-grade Oracle Database services in Oracle Cloud Infrastructure (OCI) with a familiar Azure-like experience. ODSA builds on the OCI-Azure Interconnect to simplify the setup, management, and connectivity of Azure applications to databases running in OCI.
One of the most critical aspects of designing an enterprise-grade solution is ensuring high availability and business continuity, even in a disaster. A disaster is a sudden and unplanned event, such as an accident, a natural catastrophe, or a distributed network outage, that causes significant damage or loss across a wide geographic area. A well-architected disaster recovery solution helps reduce harm and disruption and recover as quickly as possible when a disaster leads to system failure.
This article is the first of a series that discusses disaster recovery best practices for the most common ODSA scenarios, including Oracle Autonomous Database, Exadata Database service, Base Database service, and regional and cross-region architectures. In this blog, we focus on ODSA best practices for disaster recovery across different cloud regions, with guidelines to ensure that both the application stack and a database tier based on Exadata Database service or Base Database service continue serving you if a failover is triggered.
Let’s review the main considerations for identifying a deployment strategy that meets your organization’s requirements when a disaster occurs:
Recovery time objective (RTO) and recovery point objective (RPO) expectations for both the database and the application layers
Latency between the primary and disaster recovery regions, and latency between those regions and the end users or consumers of your application
Data residency requirements and regulations for your application’s data, which constrain the choice of failover cloud region
Identifying and selecting the most appropriate cloud regions’ locations is critical to meeting these requirements.
OCI and Azure have several interconnected regions across the globe (12 at the time of publication, with new locations being planned). We recommend reviewing the documentation for the current list of regions. Although ODSA doesn’t come with a predefined failover region, table 1 shows the preferred failover regions based on these considerations and the latency data provided by the OCI inter-region latency dashboard.
| Geographic region | Primary cloud regions | Preferred disaster recovery regions |
|---|---|---|
| Asia Pacific | OCI Japan East (Tokyo)–Azure Tokyo | OCI Singapore (Singapore)–Azure Singapore; OCI South Korea Central (Seoul)–Azure Seoul |
| Asia Pacific | OCI Singapore (Singapore)–Azure Singapore | OCI Japan East (Tokyo)–Azure Tokyo; OCI South Korea Central (Seoul)–Azure Seoul |
| Asia Pacific | OCI South Korea Central (Seoul)–Azure Seoul | OCI Singapore (Singapore)–Azure Singapore; OCI Japan East (Tokyo)–Azure Tokyo |
| Europe, Middle East, Africa | OCI Germany Central (Frankfurt)–Azure Frankfurt 1 & 2 | OCI Netherlands Northwest (Amsterdam)–Azure Amsterdam2; OCI UK South (London)–Azure London |
| Europe, Middle East, Africa | OCI Netherlands Northwest (Amsterdam)–Azure Amsterdam2 | OCI Germany Central (Frankfurt)–Azure Frankfurt 1 & 2; OCI UK South (London)–Azure London |
| Europe, Middle East, Africa | OCI UK South (London)–Azure London | OCI Germany Central (Frankfurt)–Azure Frankfurt 1 & 2; OCI Netherlands Northwest (Amsterdam)–Azure Amsterdam2 |
| Europe, Middle East, Africa | OCI South Africa Central (Johannesburg)–Azure Johannesburg | OCI Germany Central (Frankfurt)–Azure Frankfurt 1 & 2; OCI UK South (London)–Azure London |
| Latin America | OCI Brazil Southeast (Vinhedo)–Azure Campinas | OCI US West (Phoenix)–Azure Phoenix; OCI US West (San Jose)–Azure Silicon Valley |
| North America | OCI Canada Southeast (Toronto)–Azure Canada Central | OCI US East (Ashburn)–Azure Washington DC 1 & 2; OCI US West (Phoenix)–Azure Phoenix |
| North America | OCI US East (Ashburn)–Azure Washington DC 1 & 2 | OCI US West (Phoenix)–Azure Phoenix; OCI US West (San Jose)–Azure Silicon Valley |
| North America | OCI US West (Phoenix)–Azure Phoenix | OCI US East (Ashburn)–Azure Washington DC 1 & 2; OCI US West (San Jose)–Azure Silicon Valley |
| North America | OCI US West (San Jose)–Azure Silicon Valley | OCI US West (Phoenix)–Azure Phoenix; OCI US East (Ashburn)–Azure Washington DC 1 & 2 |
Table 1: Preferred failover regions
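If you automate region selection, the pairings in table 1 can be captured as a simple lookup. The following Python sketch is only an illustration: the short city names used as keys, the `pick_dr_region` helper, and the `allowed` allow-list (standing in for your data residency rules) are all assumptions of this example, and it assumes that the order in which table 1 lists the two disaster recovery regions reflects preference.

```python
# Preferred disaster recovery regions from table 1, keyed by the city of the
# primary OCI region. Assumption: list order mirrors the table's order.
PREFERRED_DR = {
    "Tokyo": ["Singapore", "Seoul"],
    "Singapore": ["Tokyo", "Seoul"],
    "Seoul": ["Singapore", "Tokyo"],
    "Frankfurt": ["Amsterdam", "London"],
    "Amsterdam": ["Frankfurt", "London"],
    "London": ["Frankfurt", "Amsterdam"],
    "Johannesburg": ["Frankfurt", "London"],
    "Vinhedo": ["Phoenix", "San Jose"],
    "Toronto": ["Ashburn", "Phoenix"],
    "Ashburn": ["Phoenix", "San Jose"],
    "Phoenix": ["Ashburn", "San Jose"],
    "San Jose": ["Phoenix", "Ashburn"],
}


def pick_dr_region(primary, allowed=None):
    """Return the first listed DR region for `primary` that is in `allowed`.

    `allowed` is a hypothetical allow-list of regions that satisfy your data
    residency rules; None means no restriction. Returns None if nothing fits.
    """
    for candidate in PREFERRED_DR[primary]:
        if allowed is None or candidate in allowed:
            return candidate
    return None


print(pick_dr_region("Tokyo"))
print(pick_dr_region("Vinhedo", allowed={"San Jose"}))
```

Latency and RTO/RPO targets still need to be checked separately; the lookup only narrows the candidates.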
After the primary and disaster recovery regions have been identified and both the application layer and the database tier have been provisioned in the primary (production) environment, we can define the disaster recovery plan for our solution. A cross-region approach provides resiliency in the rare cases of either a disaster event that makes a whole region unavailable or a failure of the low-latency interconnection network link.
We discuss the proposed solution under the following assumptions:
Both the primary and disaster recovery environments are hosted in ODSA-enabled regions (see table 1).
The application layer running on Azure and the database layer running on OCI are deployed in the same geographical location at all times.
These assumptions are meant to provide an easy path to consistently design a disaster recovery architecture that applies ODSA capabilities and always ensures the lowest latency between the application stack and the database tier, even in the event of a disaster. This architecture provides an effective solution for the most common scenarios. However, you can achieve more elaborate architectures by adopting the OCI-Azure Interconnect.
Figure 2 shows the disaster recovery capability for a split-stack solution across regions between OCI and Azure.
Connect applications using the appropriate TNS connection string. How connections are established determines how efficiently applications can reconnect to the failover destination after a failure. The following TNS connection string is recommended for all Oracle drivers 12.2 and later:
ALIAS = (DESCRIPTION =
  (CONNECT_TIMEOUT=90)(RETRY_COUNT=20)(RETRY_DELAY=3)(TRANSPORT_CONNECT_TIMEOUT=3)
  (ADDRESS_LIST =
    (LOAD_BALANCE=on)
    (ADDRESS = (PROTOCOL = TCP)(HOST=primary-scan)(PORT=1521)))
  (ADDRESS_LIST =
    (LOAD_BALANCE=on)
    (ADDRESS = (PROTOCOL = TCP)(HOST=secondary-scan)(PORT=1521)))
  (CONNECT_DATA=(SERVICE_NAME = gold-cloud)))
You can tune the specific values, but the values quoted in this example are reasonable starting points. For more details, refer to the Application Checklist for Continuous Service for MAA Solutions.
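If you maintain several environments, it can help to generate this entry rather than edit it by hand. The following Python sketch is a minimal illustration, not an Oracle-provided tool: the `build_descriptor` and `retry_delay_budget` helpers are hypothetical names, and the defaults simply repeat the values from the example above. The budget calculation assumes the client waits `RETRY_DELAY` seconds before each of the `RETRY_COUNT` retries and ignores the time spent on the attempts themselves.

```python
def build_descriptor(alias, primary_host, standby_host, service,
                     port=1521, connect_timeout=90, retry_count=20,
                     retry_delay=3, transport_connect_timeout=3):
    """Build a tnsnames.ora entry with one ADDRESS_LIST per region."""
    def address_list(host):
        # Continuation lines are indented so they stay part of the same entry.
        return (
            "  (ADDRESS_LIST =\n"
            "    (LOAD_BALANCE=on)\n"
            f"    (ADDRESS = (PROTOCOL = TCP)(HOST={host})(PORT={port})))\n"
        )

    return (
        f"{alias} = (DESCRIPTION =\n"
        f"  (CONNECT_TIMEOUT={connect_timeout})(RETRY_COUNT={retry_count})"
        f"(RETRY_DELAY={retry_delay})"
        f"(TRANSPORT_CONNECT_TIMEOUT={transport_connect_timeout})\n"
        + address_list(primary_host)   # tried first
        + address_list(standby_host)   # tried when the primary is unreachable
        + f"  (CONNECT_DATA=(SERVICE_NAME = {service})))\n"
    )


def retry_delay_budget(retry_count=20, retry_delay=3):
    """Seconds spent waiting between retries, ignoring the attempts themselves."""
    return retry_count * retry_delay


entry = build_descriptor("ALIAS", "primary-scan", "secondary-scan", "gold-cloud")
print(entry)
print(retry_delay_budget())
```

With the example values, the delays between retries add up to about 60 seconds, so when you tune the values it is worth comparing that window with how long your database role change is expected to take.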
Replicate the application tier from the primary Azure region to the disaster recovery region using the Azure backbone network connectivity. The tools and processes that keep the primary and disaster recovery regions in sync vary depending on the application’s components and the Azure services and resources involved. For example, to migrate and synchronize Azure storage, you have several options to consider, including Robocopy or AzCopy, which support SMB Azure file shares, Linux-based rsync, or Azure Backup. For more details and a comprehensive analysis, refer to Disaster recovery and storage account failover on Azure.
Set up Azure Traffic Manager or OCI DNS Traffic Management so that end users can connect seamlessly, with the help of automation, to a secondary or standby application configured in another Azure region. Set up an automated process, such as a script, to detect the application failover in Azure and initiate a database switchover in OCI.
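The detection script mentioned in this step can take many forms. As a rough Python sketch of one possible shape, the loop below triggers an action only after several consecutive failed health checks, which avoids reacting to a single transient error. Everything here is an assumption of the example: the `watch_and_switch` name, the thresholds, and the `probe` and `switchover` callables, which you would replace with your own health check and your own way of starting the database role change (for example, by invoking the Data Guard broker).

```python
import time


def watch_and_switch(probe, switchover, failures_needed=3, interval=10.0,
                     sleep=time.sleep, max_checks=None):
    """Call `switchover` once `probe` fails `failures_needed` times in a row.

    probe:      callable returning True while the primary application is healthy
    switchover: callable that starts the database role change (placeholder)
    Returns True if the switchover was triggered, False if max_checks ran out.
    """
    consecutive = 0
    checks = 0
    while max_checks is None or checks < max_checks:
        checks += 1
        try:
            healthy = bool(probe())
        except Exception:
            healthy = False  # treat probe errors as a failed check
        # A healthy check resets the streak; a failed one extends it.
        consecutive = 0 if healthy else consecutive + 1
        if consecutive >= failures_needed:
            switchover()
            return True
        sleep(interval)
    return False
```

The `sleep` and `max_checks` parameters exist only to make the loop easy to exercise in a dry run without waiting or touching a real database.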
Through the ODSA console, provision the secondary database in the disaster recovery region.
Log in to the Oracle Cloud Console and set up a private remote virtual cloud network (VCN) peering connection between the primary and disaster recovery regions. The traffic between OCI regions goes through the OCI backbone network connectivity.
Manually set up a physical standby database to use with Oracle Data Guard to sync the primary database and standby database across OCI regions through the remote VCN peering.
For Oracle Data Guard configurations, enable fast-start failover (FSFO) to allow the broker to automatically fail over to the standby database in the OCI disaster recovery region if the primary database is lost.
FSFO can run custom actions before and after the automatic failover occurs, so you can configure a process that initiates a switchover of the application layer running on Azure in the post-callout script that runs after the failover succeeds.
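What the post-callout script does is up to you. One option, sketched below in Python, is to notify whatever automation moves the application tier. This is only an illustration under stated assumptions: the webhook URL, the JSON payload fields, and the `build_failover_request` and `notify` helpers are hypothetical, and how the broker invokes the script is not covered here.

```python
import json
import urllib.request


def build_failover_request(webhook_url, new_primary_region):
    """Build (but do not send) a POST that asks automation to move the app tier."""
    body = json.dumps({
        "event": "database_failover",            # hypothetical payload field
        "new_primary_region": new_primary_region,
    }).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def notify(request, timeout=10):
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Placeholder endpoint; replace with your own automation entry point.
request = build_failover_request("https://example.invalid/failover-hook", "Phoenix")
print(request.get_method(), request.full_url)
```

Separating request construction from sending keeps the script easy to verify before it is wired into a real failover.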
Preparing for a disaster isn’t an easy task. It requires a comprehensive approach that weighs business requirements against the available architectures and turns them into an actionable disaster recovery plan. The scenarios we’ve described provide guidelines to help you select the disaster recovery approach that best fits your application deployment, using a simple but effective failover and disaster recovery configuration across your Oracle Cloud Infrastructure and Azure environments.
For more information, see the following resources:
Get started with Oracle Database Service for Microsoft Azure
Read industry analyst commentary
Oracle Database Service for Microsoft Azure technical overview
Oracle Data Guard: Best practices for synchronous redo transport