Oracle Cloud Infrastructure (OCI) Full Stack Disaster Recovery (Full Stack DR) announces native support for Oracle Kubernetes Engine (OKE). OKE clusters are now a selectable OCI resource in Full Stack DR, just like virtual machines, storage, load balancers, and Oracle databases, as shown below. This means we know exactly how to validate, fail over, switch over, and test your ability to recover OKE, infrastructure, and databases without your IT staff writing a single line of code or step-by-step instructions in a spreadsheet or text file.
Add existing OKE clusters in the primary and standby regions to Full Stack DR. Then, with the click of a single button, generate DR Plans in a couple of minutes that orchestrate fully automated, end-to-end failovers, switchovers, and DR drills between two different OCI regions or availability domains.
Easily integrate other OCI resources along with OKE into Full Stack DR to create complete, comprehensive recovery plans that consolidate everything your systems, network and database administrators perform into a single, seamless workflow that results in smooth, predictable recoveries or drills in less time, with less effort.
Bionexo, a Brazilian software-as-a-service provider in the healthcare industry, orchestrates disaster recovery for OKE along with the rest of their application stack using Full Stack DR. Read more about how Bionexo uses Full Stack DR to orchestrate recovery with 30 minutes of downtime and zero data loss for their entire application stack.
Orchestrate Drills, Switchovers and Failovers for OKE
Full Stack DR creates and uses DR Plans just like you use DR runbooks typically maintained in static text files or spreadsheets. The difference is that static DR runbooks require humans to find them, open them, make sure they are the right version and make sure they reflect the latest changes made to your OCI infrastructure.
Static DR runbooks are completely disconnected from the reality of your production environment in OCI and require time-consuming maintenance your IT staff can ill afford. The biggest problem with static files is that it is extremely hard and time-consuming to validate or test them to make sure there are no missing steps, or that they will even succeed in a time of crisis.
Full Stack DR Plans, on the other hand, are living, dynamic DR runbooks that you maintain with Full Stack DR in your production environment. This means your DR Plans can be validated and tested against your production environment without involving an expensive outage that requires all hands on deck for hours or days at a time.
Use failover plans to handle unplanned outages that require actual disaster recovery from a catastrophic event that takes down an entire region and all availability domains within that region. This is an essential function of disaster recovery, with all the following tasks and steps occurring only in the standby region:
- Activate block and file storage used for persistent volumes.
- Execute Data Guard failover for any Autonomous Databases that are members of the DR protection groups.
- Scale up standby OKE cluster.
- Restore workload on standby OKE cluster using resources from backup.
- Update persistent storage references in the OKE resources.
- Update container image references in the OKE resources.
- Update load balancer references in the OKE resources.
- Restore standby cluster using updated resources. This launches all resources with the container images and mounts persistent volumes as expected.
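The failover sequence above can be pictured as an ordered list of steps that run only in the standby region. The following Python sketch is purely illustrative: `Step`, `run_plan`, and `noop` are hypothetical names, not part of Full Stack DR or any OCI SDK, and each action is a placeholder that does no real work. It only shows the idea of running built-in steps strictly in order.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    """One step of a hypothetical DR plan (illustrative only)."""
    name: str
    action: Callable[[], None]


def noop() -> None:
    """Placeholder for the real work a step would perform."""


def run_plan(steps: List[Step]) -> List[str]:
    """Run steps strictly in order and return the names of completed steps."""
    completed = []
    for step in steps:
        step.action()  # an exception raised here stops the remaining steps
        completed.append(step.name)
    return completed


# Step names mirror the failover list above; the actions are assumptions.
FAILOVER_STEPS = [
    Step("Activate block and file storage", noop),
    Step("Execute Data Guard failover", noop),
    Step("Scale up standby OKE cluster", noop),
    Step("Restore workload from backup", noop),
    Step("Update persistent storage references", noop),
    Step("Update container image references", noop),
    Step("Update load balancer references", noop),
    Step("Restore standby cluster using updated resources", noop),
]

completed_steps = run_plan(FAILOVER_STEPS)
```

In this sketch the order of the list is the order of execution, which is the property a generated DR plan provides without anyone having to write it down by hand.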
Use DR drill plans to test a synthetic, mock failover that brings OCI infrastructure and platform up in the standby region while leaving your production systems untouched and running without any disruption. DR Drills exercise the same steps we execute as part of a failover, except we operate on cloned databases and persistent volumes to avoid impacting running production systems.
Use switchover plans to handle planned outages for compliance with government regulations, business continuity audits, compliance with service level agreements that specify periodic drills, or any other situation where you have the luxury to transition an entire application stack to another availability domain or OCI region.
The following illustration shows OKE deployed for cross-region disaster recovery. This is ready for a drill, failover, or switchover. Notice that Full Stack DR is responsible for backing up resources and container images to an existing OKE cluster in the standby region. Autonomous Data Guard is configured using the OCI Oracle Autonomous Database service, while replication of block and file storage for persistent volumes is configured using the corresponding OCI storage services.
The following illustration shows OKE after Full Stack DR has completed orchestrating a planned switchover. The OKE cluster in the second region is now running everything and the designated backend sets associated with load balancer(s) have been updated.
Notice that Full Stack DR automatically changes the DR role for the protection group in each region, and prepares everything for a drill, failback, or switchback in the other direction.
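The role change described above can be pictured as a simple swap between two protection groups. The Python sketch below is a conceptual model only: `ProtectionGroup`, `switchover`, the role strings, and the region names `region-a` and `region-b` are all hypothetical and are not taken from Full Stack DR or any OCI SDK.

```python
from dataclasses import dataclass


@dataclass
class ProtectionGroup:
    """Hypothetical model of a DR protection group in one region."""
    region: str
    role: str  # "PRIMARY" or "STANDBY" in this sketch


def switchover(primary: ProtectionGroup, standby: ProtectionGroup) -> None:
    """Swap roles so the former standby becomes the new primary."""
    if primary.role != "PRIMARY" or standby.role != "STANDBY":
        raise ValueError("switchover must run from the primary to the standby")
    primary.role, standby.role = "STANDBY", "PRIMARY"


# Placeholder region names, not real OCI region identifiers.
group_a = ProtectionGroup("region-a", "PRIMARY")
group_b = ProtectionGroup("region-b", "STANDBY")

switchover(group_a, group_b)  # region-b is now PRIMARY
```

Because the roles are swapped rather than discarded, the same operation can be run later in the other direction, with `group_b` as the primary, to switch back.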
Orchestrate Recovery for More Than OKE
We’re not just performing failover for Kubernetes; we’re orchestrating recovery for OCI infrastructure and platform services beyond just replicated virtual machines.
Simply add existing OKE clusters from two OCI regions to Full Stack DR. You add the resource type in both regions and provide a few properties to inform Full Stack DR what you need us to keep synchronized between regions and how you want the recovery to behave. Then click a single button to generate complete DR plans, prepopulated with all the steps needed to recover OKE and anything else you’ve included as part of the same application stack as shown in the example below.
The resources hosted in OKE clusters are often only part of a more comprehensive business system or application stack that needs to be recovered at the same time. Add any additional supported infrastructure and platform-as-a-service (IaaS and PaaS) resources you need recovered along with OKE. Then, push a single button that generates a DR Plan in a few minutes that looks something like the plan shown in the figure above.
The example DR Plan above includes built-in recovery steps for OKE, plus other resources for infrastructure and platform needed for the same business system, but completely outside of the OKE cluster. This can include block volumes, file systems and object storage needed by OCI Compute that have other applications or middleware installed, other load balancers that are not related to OKE and databases that are part of the same business system, but not part of the Kubernetes cluster in either region.
The example DR Plan above also has user-defined steps with custom automation that stop and start Oracle or non-Oracle applications on the virtual machines (VMs) hosted in the OKE clusters and VMs that aren’t part of either OKE cluster. User-defined steps like the ones in the figure are added after you have created the basic DR Plans that contain all the built-in steps; user-defined steps allow you to tailor DR plans to do anything that we don’t have built into Full Stack DR already.
For example, your business system might include in-house applications, popular OCI services such as Oracle Analytics Cloud, OCI Integration, Oracle Business Intelligence Enterprise Edition, or any number of other services, and OCI Marketplace applications that are not part of your OKE clusters.
Validate Readiness Anytime
Are you ready for recovery – for everything – at any time?
The cloud infrastructure supporting your business systems evolves over time. Your disaster recovery plans need to remain in lockstep with the periodic changes to databases, compute, node pools, storage, and everything else about the cloud infrastructure supporting your various business systems. Full Stack DR gives you the tools to help ensure you’re prepared for digital disasters at any time.
Periodically validate the readiness of DR plans for drills, failovers, and switchovers using the non-intrusive precheck feature built into Full Stack DR. A precheck doesn’t execute anything; it simply walks through every step of a DR plan looking for inconsistencies that would prevent a recovery operation from succeeding, allowing your IT operations staff to correct problems before they occur.
Execute periodic recovery drills to help validate that all OKE, infrastructure, and database resources you’ve added to Full Stack DR start in the standby region with zero downtime to your production system. A DR drill executes a mock failover, bringing up copies and clones of Oracle Databases, Compute and Storage in the standby region.
Our non-disruptive, non-intrusive DR drills go far beyond simply launching replicated virtual machines in a second region. DR Drills perform a synthetic mock failover that automatically brings up all OCI resources belonging to Full Stack DR Protection Groups, including compute, storage, databases, OKE, and load balancers.
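The idea of a precheck, walking every step to look for problems without executing anything, can be sketched in a few lines of Python. This is a conceptual illustration under stated assumptions, not how Full Stack DR is implemented: `PlanStep`, `precheck`, and the sample problem message are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class PlanStep:
    """Hypothetical plan step with a read-only check and a real action."""
    name: str
    check: Callable[[], Optional[str]]  # returns a problem description, or None
    execute: Callable[[], None]         # never called by precheck()


def precheck(steps: List[PlanStep]) -> List[str]:
    """Walk every step, collect problems, and execute nothing."""
    problems = []
    for step in steps:
        issue = step.check()
        if issue is not None:
            problems.append(f"{step.name}: {issue}")
    return problems


executed: List[str] = []  # records any step that actually ran

plan = [
    PlanStep("Scale up standby OKE cluster",
             check=lambda: None,
             execute=lambda: executed.append("scale")),
    PlanStep("Activate block and file storage",
             check=lambda: "replica not found",  # made-up example problem
             execute=lambda: executed.append("storage")),
]

problems = precheck(plan)
```

Note that the sketch keeps checking after it finds a problem, so a single pass reports every inconsistency rather than only the first one.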
The true value of our DR drills is that they can be executed with the click of a single button without having to involve a cadre of system, database, and application specialists handing tasks off between each other for hours at a time. Leave everything in place for a few days to allow your IT staff time to validate infrastructure and databases are up and running as expected in the standby region. Then automatically tear it all down and clean everything up using a stop DR Drill plan.
Conclusion
Full Stack DR makes it easy to add OKE resources to your disaster recovery plans by selecting what you already deployed and telling us what you need to keep synchronized and manage during a recovery. We do the rest.
Our zero downtime, non-intrusive prechecks and DR Drills exercise the entire infrastructure, including databases and load balancers, going far beyond simply launching replicated virtual machines in a standby region. Full Stack DR gives you the tools to validate that your disaster recovery protection is intact and ready for you to exercise with confidence when it is needed.
This all adds up to an OCI native disaster recovery service that goes beyond simple VM failover, integrating IaaS and PaaS services with almost anything else that needs to be recovered into a single, seamless workflow.
Do more with your disaster recovery plans in less time, with less effort using Full Stack DR.
Want to know more?
If you haven’t seen OCI Full Stack Disaster Recovery in action yet, ask your Oracle Cloud Infrastructure account team to set up a demonstration today. For more information, including documentation, pricing, customer success stories, videos, tutorials and hands-on labs, visit Full Stack Disaster Recovery.
Visit the following links to learn how to add your existing Oracle Kubernetes Engine clusters to Full Stack DR for fully automated, end-to-end recovery.
