
Backing up your OKE environment with Velero

Seshadri Dehalisan
Master Principal Enterprise Cloud Architect

Kubernetes has become the de facto standard for enterprises running containerized deployments. Oracle offers Oracle Container Engine for Kubernetes (OKE), a fully managed, scalable, and highly available service that you can use to deploy your containerized workloads in Oracle Cloud Infrastructure (OCI). While OKE makes it easy to run containerized workloads at scale, the business requirements that are essential for critical workloads, such as disaster recovery and backup, remain contextual to each customer's needs.

A robust disaster recovery and backup solution must include backing up the cluster metadata definitions and providing a backup of the data that persists in the Kubernetes cluster. While many technologies are available in the marketplace, this guide aims to provide a solution for the backup of OKE clusters using the open source tool, Velero. You can extend the Velero-based solution to achieve disaster recovery and migrate your containerized Kubernetes cluster from other providers to OCI. You can also use Kasten with OKE for backup and disaster recovery use cases, as explained in this blog.

Velero deployment process

Velero uses Restic to back up persistent volumes. Restic is a lightweight cloud native backup program that the backup industry has widely adopted. Velero creates Kubernetes objects to enable backup and restore, including deployments, Restic DaemonSets, and custom resource definitions.

Install prerequisites

Set up access to the OKE cluster and install kubectl locally, then verify connectivity to the cluster:

kubectl get nodes -o wide
NAME       STATUS   ROLES   AGE     VERSION    INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                  KERNEL-VERSION                      CONTAINER-RUNTIME
10.0.0.4   Ready    node    2d21h   v1.18.10   10.0.0.4      <none>        Oracle Linux Server 7.9   5.4.17-2036.100.6.1.el7uek.x86_64   docker://19.3.11
10.0.0.5   Ready    node    2d21h   v1.18.10   10.0.0.5      <none>        Oracle Linux Server 7.9   5.4.17-2036.100.6.1.el7uek.x86_64   docker://19.3.11

The installation steps vary depending on your client environment (Linux, Mac, or Windows). You can also install Velero from OCI Cloud Shell.

On Mac, you can install Velero with the following command:

brew install velero
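
On Linux, you can download the Velero CLI from the project's GitHub releases page; the following is a minimal sketch, with the release version (v1.5.3) shown here as an assumption you should adjust:

# Download and unpack the Velero CLI, then place the binary on the PATH
curl -LO https://github.com/vmware-tanzu/velero/releases/download/v1.5.3/velero-v1.5.3-linux-amd64.tar.gz
tar -xzf velero-v1.5.3-linux-amd64.tar.gz
sudo mv velero-v1.5.3-linux-amd64/velero /usr/local/bin/
velero version --client-only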

Now that the Velero CLI is installed locally, you can have it create the appropriate Kubernetes resources in the cluster with the following command:

velero install \
    --provider aws \
    --bucket velero \
    --prefix oke \
    --use-restic \
    --secret-file /Users/xxxx/velero/velero/credentials-velero \
    --backup-location-config s3Url=https://tenancyname.compat.objectstorage.region.oraclecloud.com,region=region,s3ForcePathStyle="true" \
    --plugins velero/velero-plugin-for-aws:v1.1.0 \
    --use-volume-snapshots=false

OCI Object Storage is S3 compatible, which is why the provider and object storage references in the previous code block use AWS S3 settings for the backup target. This compatibility also enables organizations to seamlessly migrate their EKS workloads to OCI. The --use-restic parameter enables Velero to use Restic to back up persistent volumes. On install, Velero creates a few Kubernetes resources, which are placed in the velero namespace by default.

The secret file refers to the credentials that Velero uses to back up to OCI Object Storage bucket. As mentioned in Managing User Credentials, you must generate these credentials as a customer secret key. The user profile that backs up to Object Storage needs the ability to manage the bucket into which the backups are written.
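
If you prefer the OCI CLI to the console for this step, the following is a minimal sketch for generating a customer secret key; the user OCID placeholder and the display name are assumptions:

oci iam customer-secret-key create \
    --user-id ocid1.user.oc1..<your_user_ocid> \
    --display-name velero-backup-key

The response includes the access key ID and the secret; record the secret immediately, because it isn't shown again.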

A sample credentials file looks like the following block:

[default]
aws_access_key_id=40xxxxxxxxxxxxxxxxxxxxxxxxxxxxxa32f8494a
aws_secret_access_key=YyuSZxxxxxxxxxxxxxxxxxxxxxxxxxxxxxRDYzNnv0c=

After installation, verify that the Velero deployment and the Restic DaemonSet pods are running in the velero namespace:

kubectl get pods --namespace velero -o wide
NAME                      READY   STATUS    RESTARTS   AGE   IP             NODE       NOMINATED NODE   READINESS GATES
restic-2lp99              1/1     Running   2          23h   10.234.0.26    10.0.0.5   <none>           <none>
restic-6jz9k              1/1     Running   0          23h   10.234.0.137   10.0.0.4   <none>           <none>
velero-84f5449954-46hnk   1/1     Running   0          23h   10.234.0.136   10.0.0.4   <none>           <none>
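
You can also confirm that the Velero custom resource definitions (such as backups.velero.io and restores.velero.io) were registered in the cluster:

kubectl get crds | grep velero.io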

Now that we have enabled Velero, let’s create a simple Nginx pod that uses a persistent volume claim. The steps are as follows:

  1. Create the storage class (cluster resource).

  2. Create the persistent volume (cluster resource).

  3. Create the namespace where the pod and persistent volume claim (PVC) reside.

  4. Create PVC (namespace scoped).

  5. Create the pod.

Create the storage class

To create the storage class, apply the following manifest (for example, with kubectl apply -f):

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: oci-fss1
provisioner: oracle.com/oci-fss
parameters:
  # Insert mount target from the FSS here
  mntTargetId: ocid1.mounttarget.oc1.us_ashburn_1.aaaaaa4np2sra5lqmjxw2llqojxwiotboaww25lxxxxxxxxxxxxxiljr

In the storage class, we present the OCI File Storage service mount target to Kubernetes. Read more about File Storage service in the documentation.
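
If you need to look up the mount target OCID, one option is the OCI CLI; the following is a minimal sketch, with the compartment OCID and availability domain as placeholders:

oci fs mount-target list \
    --compartment-id ocid1.compartment.oc1..<your_compartment_ocid> \
    --availability-domain "<your_availability_domain>"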

Create required namespace

To create the required namespace, run the following command:

kubectl create namespace testing

Create the persistent volume

Create the persistent volume by applying the following manifest:

apiVersion: v1
kind: PersistentVolume
metadata:
 name: oke-fsspv1
spec:
 storageClassName: oci-fss1
 capacity:
  storage: 100Gi
 accessModes:
  - ReadWriteMany
 mountOptions:
  - nosuid
 nfs:
# Replace this with the IP of your FSS file system in OCI
  server: 10.0.0.3
# Replace this with the Path of your FSS file system in OCI
  path: /testpv
  readOnly: false

Create the persistent volume claim

To create the persistent volume claim, apply the following manifest:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
 name: oke-fsspvc1
spec:
 storageClassName: oci-fss1
 accessModes:
 - ReadWriteMany
 resources:
  requests:
    storage: 100Gi
 volumeName: oke-fsspv1
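
The original pod manifest isn't shown in this post; the following is a minimal sketch of an Nginx pod that mounts the claim at the web root path used later in the walkthrough (the container image and overall structure are assumptions):

apiVersion: v1
kind: Pod
metadata:
  name: oke-fsspod3
  namespace: testing
spec:
  containers:
  - name: nginx
    image: nginx
    ports:
    - containerPort: 80
    volumeMounts:
    # Mount the FSS-backed claim at the Nginx web root
    - name: oke-fsspv1
      mountPath: /usr/share/nginx/html
  volumes:
  # The volume name must match the backup.velero.io/backup-volumes annotation added later
  - name: oke-fsspv1
    persistentVolumeClaim:
      claimName: oke-fsspvc1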

You can verify that the pod is running with the following command:

kubectl get pods --namespace testing -o wide
NAME          READY   STATUS    RESTARTS   AGE   IP            NODE       NOMINATED NODE   READINESS GATES
oke-fsspod3   1/1     Running   0          22h   10.234.0.28   10.0.0.5   <none>           <none>

Let’s verify that the pod is using the PVC and that File Storage is available and mounted in the pod.

kubectl exec -it oke-fsspod3 -n testing -- bash 
root@oke-fsspod3:/# mount |egrep -i nfs 
10.0.0.3:/testpv on /usr/share/nginx/html type nfs  
root@oke-fsspod3:/# cd /usr/share/nginx/html 
root@oke-fsspod3:/usr/share/nginx/html# ls -lrt *.dmp|wc -l
65
root@oke-fsspod3:/usr/share/nginx/html# ls -lrt *.dmp|head -5 
-rw-r--r--. 1 root root 75853 Feb 8 04:37 randomfile1.dmp 
-rw-r--r--. 1 root root 77341 Feb 8 04:37 randomfile2.dmp 
-rw-r--r--. 1 root root 76599 Feb 8 04:37 randomfile3.dmp 
-rw-r--r--. 1 root root 75066 Feb 8 04:38 randomfile4.dmp 
-rw-r--r--. 1 root root 75008 Feb 8 04:38 randomfile5.dmp

Volume annotation

Velero expects the pod to be annotated with the volume name. You can add the volume name with the following command:

kubectl -n testing annotate pod/oke-fsspod3 backup.velero.io/backup-volumes=oke-fsspv1
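
To confirm that the annotation was applied, you can inspect the pod's metadata:

kubectl -n testing get pod oke-fsspod3 -o jsonpath='{.metadata.annotations}'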

Backup process

Backing up the cluster

Back up the OKE cluster by issuing the following command:

./velero backup create backup-full-cluster-demo --default-volumes-to-restic=true
Backup request "backup-full-cluster-demo" submitted successfully.
Run `velero backup describe backup-full-cluster-demo` or `velero backup logs backup-full-cluster-demo` for more details.

./velero backup describe backup-full-cluster-demo

Name:         backup-full-cluster-demo
Namespace:    velero
Labels:       velero.io/storage-location=default
Annotations:  velero.io/source-cluster-k8s-gitversion=v1.18.10
              velero.io/source-cluster-k8s-major-version=1

By default in the current release, Velero tries to restore persistent volumes with dynamic provisioning. So, back up the statically created persistent volume separately. You can accomplish this task with the following command:

./velero backup create backup-pv-only-demo --default-volumes-to-restic=true --include-resources pv
Backup request "backup-pv-only-demo" submitted successfully.
Run `velero backup describe backup-pv-only-demo` or `velero backup logs backup-pv-only-demo` for more details.

% ./velero backup describe backup-pv-only-demo --details
Name:         backup-pv-only-demo
Namespace:    velero
Labels:       velero.io/storage-location=default
Annotations:  velero.io/source-cluster-k8s-gitversion=v1.18.10
              velero.io/source-cluster-k8s-major-version=1
              velero.io/source-cluster-k8s-minor-version=18

Phase:  Completed

Restoring

To set up a restore scenario, let’s delete the pod, PVC, namespace, and cluster-scoped persistent volume. Such a deletion could result from accidental loss, operator error, or a disaster. To verify later that the data is restored, let’s also delete the random files that were created in the pod:

kubectl exec -it oke-fsspod3 -n testing -- bash
root@oke-fsspod3:/# cd /usr/share/nginx/html
root@oke-fsspod3:/usr/share/nginx/html#
root@oke-fsspod3:/usr/share/nginx/html# ls -lrt *.dmp |wc -l
65
root@oke-fsspod3:/usr/share/nginx/html# rm *.dmp
root@oke-fsspod3:/usr/share/nginx/html# ls -lrt *.dmp |wc -l
ls: cannot access '*.dmp': No such file or directory
0

To delete the pod and its associated resources, run the following commands:

~ % kubectl delete pod oke-fsspod3 -n testing
pod "oke-fsspod3" deleted

~ % kubectl delete pvc oke-fsspvc1 -n testing
persistentvolumeclaim "oke-fsspvc1" deleted

 ~ % kubectl delete namespace testing
namespace "testing" deleted

 ~ % kubectl delete pv oke-fsspv1
persistentvolume "oke-fsspv1" deleted

 velero % kubectl get pv
No resources found

The pod and the persistent volume have been removed. Now, we can restore. Let’s check on the existing backups and issue the appropriate restore commands:

velero % ./velero backup get
NAME                         STATUS      ERRORS   WARNINGS   CREATED                         EXPIRES   STORAGE LOCATION   SELECTOR
backup-full-cluster-demo     Completed   0        0          2021-02-07 21:46:23 -0600 CST   28d       default            
backup-full-cluster-demo-1   Completed   0        0          2021-02-07 21:52:25 -0600 CST   28d       default            
backup-full-cluster-demo-2   Completed   0        0          2021-02-07 22:44:09 -0600 CST   29d       default            
backup-pv-demo-1             Completed   0        0          2021-02-07 21:55:08 -0600 CST   29d       default            
backup-pv-demo-2             Completed   0        0          2021-02-07 22:46:23 -0600 CST   29d       default            

Restoring the persistent volumes

Restore the persistent volumes with the following command:

velero % ./velero restore create --from-backup backup-pv-demo-2
Restore request "backup-pv-demo-2-20210208215719" submitted successfully.
Run `velero restore describe backup-pv-demo-2-20210208215719` or `velero restore logs backup-pv-demo-2-20210208215719` for more details.
 velero % ./velero restore describe backup-pv-demo-2-20210208215719
Name:         backup-pv-demo-2-20210208215719
Namespace:    velero
Labels:       <none>
Annotations:  <none>

Phase:  Completed

Started:    2021-02-08 21:57:22 -0600 CST
Completed:  2021-02-08 21:57:23 -0600 CST

Backup:  backup-pv-demo-2

Namespaces:
  Included:  all namespaces found in the backup
  Excluded:  <none>

Resources:
  Included:        *
  Excluded:        nodes, events, events.events.k8s.io, backups.velero.io, restores.velero.io, resticrepositories.velero.io
  Cluster-scoped:  auto

Namespace mappings:  <none>

Label selector:  <none>

Restore PVs:  auto
 % kubectl get pv
NAME         CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM                 STORAGECLASS   REASON   AGE
oke-fsspv1   100Gi      RWX            Retain           Available   testing/oke-fsspvc1   oci-fss1                31s

Let’s restore the cluster now. While we need to restore only one pod, in this example, we issue a full cluster restore to demonstrate cluster restore capability.
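
As an aside, if you only needed the testing namespace back, you could scope the restore with Velero's namespace filter; a minimal sketch:

./velero restore create --from-backup backup-full-cluster-demo-2 --include-namespaces testing

For this demonstration, we run a full restore: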

 % ./velero restore create --from-backup backup-full-cluster-demo-2
Restore request "backup-full-cluster-demo-2-20210208220018" submitted successfully.
Run `velero restore describe backup-full-cluster-demo-2-20210208220018` or `velero restore logs backup-full-cluster-demo-2-20210208220018` for more details.
 % ./velero restore describe backup-full-cluster-demo-2-20210208220018
Name:         backup-full-cluster-demo-2-20210208220018
Namespace:    velero
Labels:       <none>
Annotations:  <none>

Phase:  Completed

Started:    2021-02-08 22:00:20 -0600 CST
Completed:  2021-02-08 22:01:04 -0600 CST

 % kubectl get pods --namespace testing -o wide
NAME          READY   STATUS    RESTARTS   AGE   IP            NODE       NOMINATED NODE   READINESS GATES
oke-fsspod3   1/1     Running   0          12m   10.234.0.30   10.0.0.5   <none>           <none>

 % kubectl exec -it oke-fsspod3 -n testing -- bash
root@oke-fsspod3:/# cd /usr/share/nginx/html
root@oke-fsspod3:/usr/share/nginx/html# ls -lrt *.dmp |wc -l
65

As verified in the previous block, the pod and the associated persistent volume are restored. We’ve tested this process with Kubernetes v1.18.10 and v1.17.13. For more details, refer to the documentation.

Extending the process to other use cases

This blog post focuses on enabling basic backup and recovery of OKE clusters that use persistent volumes. As with most other backup tools, Velero allows you to schedule periodic automated backups (a sample schedule is sketched after the list below). You can also use Velero for the following use cases:

  • Point the Object Storage backup location to a different region to enable regional disaster recovery for your OKE cluster. This process must conform with your organization’s data residency requirements.

  • Migrate your Kubernetes deployments from other cloud providers to OKE to utilize the performance, security, and price benefits of OCI.

  • Migrate from your on-premises Kubernetes.
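
For periodic backups, Velero provides a schedule resource driven by a cron expression. The following is a minimal sketch; the schedule name, cron expression, and retention period (--ttl) are assumptions to adapt to your requirements:

# Nightly full-cluster backup at 2:00 AM, retained for 30 days
./velero schedule create nightly-oke-backup \
    --schedule="0 2 * * *" \
    --default-volumes-to-restic=true \
    --ttl 720h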

Combining the robustness and scalability of Oracle Container Engine for Kubernetes with Velero’s backup and disaster recovery capabilities helps organizations realize the production readiness of the Kubernetes platform.
