Introduction

Ensuring high availability of container images in multi-region Kubernetes deployments can be tricky. Redundancy and high availability usually get attention only at the level of the K8s cluster and its persistent stores, while it is frequently forgotten that Kubernetes has a strong dependency on the availability of container registries to instantiate pods. To facilitate regular updates and propagate changes quickly, it is common these days for Kubernetes deployments and pods to use imagePullPolicy: Always, forcing the kubelet to pull an updated image every time it launches a container. What happens, then, if the specific container registry hosting the image becomes unavailable due to the registry going down, connectivity issues, or other types of outages?
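
For illustration, a minimal pod spec with this policy looks as follows (the names and registry address are placeholders); every container start triggers a pull from the registry referenced in image:

apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
  - name: my-app
    image: myregistry.example.com/namespace/my-app:latest
    imagePullPolicy: Always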

The first problem that needs to be addressed is creating a mirror of your container registry in an alternative location. Containers can then fail over their image requests to this registry replica in an outage scenario. CI/CD pipelines or products like Rackware, FSDR, etc., provide tools to automate this replication. The second part of the problem is making pods and deployments use this alternative container registry’s address. Some frameworks replace the image’s address based on the region hosting the K8s cluster (i.e., they update the image: location tag in pods depending on the region the pods are running in). However, it is incorrect to tie the failover of the registry to the failover of the K8s cluster itself: registries may experience outages NOT affecting the K8s cluster’s region. In fact, registries may reside in different OCI regions, in locations completely external to OCI, or in a customer data center. Kubernetes clusters act as clients to container registries and should “virtualize” the container registry access so that they don’t need to be updated when an outage occurs in their primary registry system.

The Challenge

In OCI, for example, Container Registries are region-specific and, while you can configure registry replication in diverse ways, you will want to use a stable image address in your deployments that can provide transparent “failover” to other registries. For example, imagine you have deployed the same image in two different registries:

  • fra.ocir.io
  • ams.ocir.io

You want to provide seamless access to the image with some requirements:

–> Kubernetes pods should pull images seamlessly even if one region is unavailable.

–> TLS certificate validation must remain secure (no insecure=true should be used to retrieve images).

–> Podman CLI and kubelet/CRI-O runtime should behave consistently.

What Doesn’t Work

DNS or /etc/hosts overrides that map a virtual hostname to multiple backend registry addresses do not work properly with cloud registries. Using a hostname alias to point to a region’s registry IP breaks TLS: the certificate presented by OCI (or any other production-grade registry) only covers *.ocir.io. Podman’s CLI can be told to ignore this with --tls-verify=false, but Kubernetes’ CRI-O fails with x509 errors because the TLS connection is established with a site name that doesn’t match the certificate. (Updating certificates on the backend OCI registries is not possible, since OCI manages the registries’ certificates.) This approach is valid, though, for custom local registries where you control the certificates and SANs used in the backends. On top of this, DNS steering does not allow authenticated requests in HTTP health checks, and all accesses to Container Registry paths require authentication. This precludes running health checks on the status of the precise repository and limits them to a simple TCP verification that the registry’s address is reachable.
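
As a quick illustration of the failure mode (the alias name and IP below are placeholders; the exact error text varies by client):

# Point a "virtual" registry hostname at one regional registry IP
echo "203.0.113.10 registry.virtual.example" | sudo tee -a /etc/hosts

# Podman can be forced past the hostname mismatch (NOT recommended)
podman pull --tls-verify=false registry.virtual.example/tenancy/myrepo:latest

# CRI-O verifies TLS strictly and rejects the same pull with an x509
# hostname-mismatch error, because the certificate only covers *.ocir.io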

The Working Solution: Registry Mirrors in the Repository’s Configuration

It is frequently recommended to use local registry mirrors to improve performance and offload large central registries, with their many images, from the burden of continuous access and image retrievals. However, registry mirrors are useful beyond this: configuring the repository’s access with registry mirrors can also provide the failover capabilities we are looking for. To use registry mirrors transparently in your K8s cluster and containers, follow these steps:

NOTE: The sample commands provided use OCI container registries, with Podman and CRI-O as container manager and container runtime, respectively. Refer to your precise implementation for specific steps in other cases.

  • Step 1 — Deploy your images using the same secret to your primary registry and at least one mirror:
    • Build your image

    podman build -t helidon-pod -f Dockerfile

    • Verify the image locally

    podman images
    REPOSITORY              TAG      IMAGE ID      CREATED       SIZE
    localhost/helidon-pod   latest   0042018ed34c  26 hours ago  681 MB


    • Log in to your container registry in region 1 and push the image

    podman login fra.ocir.io -u tenancy/oracleidentitycloudservice/user@oracle.com  -p "XYZWQ"
    podman push helidon-pod:latest fra.ocir.io/tenancy/myrepo

    • Log in to your container registry in region 2 and push the image

    podman login ams.ocir.io -u tenancy/oracleidentitycloudservice/user@oracle.com  -p "XYZWQ"
    podman push helidon-pod:latest ams.ocir.io/tenancy/myrepo
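
    • (Optional) Verify that both regions now serve the same image. If skopeo is available, it can inspect a remote image without pulling it (the credentials are the same ones used for podman login):

    skopeo inspect --creds "tenancy/oracleidentitycloudservice/user@oracle.com:XYZWQ" docker://fra.ocir.io/tenancy/myrepo:latest
    skopeo inspect --creds "tenancy/oracleidentitycloudservice/user@oracle.com:XYZWQ" docker://ams.ocir.io/tenancy/myrepo:latest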
  • Step 2 — Use one of your registries as the Top-Level Registry and add a secondary mirror
    • Configure your /etc/containers/registries.conf (refer to your precise container configuration if using other than CRI-O) with entries for your main registry and the mirror (apply this change on all your worker nodes):

    [[registry]]
    location = "fra.ocir.io"

      [[registry.mirror]]
      location = "ams.ocir.io"
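
    You can list additional mirrors in the same entry; per the containers-registries.conf format, mirrors are attempted in the order listed, with the primary location as the final fallback. A sketch with a hypothetical second mirror (iad.ocir.io, the Ashburn OCIR endpoint) would look like this:

    [[registry]]
    location = "fra.ocir.io"

      [[registry.mirror]]
      location = "ams.ocir.io"

      # hypothetical additional mirror, attempted after ams.ocir.io
      [[registry.mirror]]
      location = "iad.ocir.io"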


    • Restart CRI-O on each node:

      sudo systemctl restart crio
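
    • To confirm that CRI-O itself (not just Podman) picks up the mirror configuration, you can try a pull through the CRI on one node, if crictl is installed there (the credentials are the same ones used for podman login):

      sudo crictl pull --creds "tenancy/oracleidentitycloudservice/user@oracle.com:XYZWQ" fra.ocir.io/tenancy/myrepo:latest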

  • Step 3 — Deploy to Kubernetes
    • Create a secret in your K8s cluster to access the Container Registries (the same secret or auth token needs to be used in all CRs for seamless access):

    kubectl create secret docker-registry fraregcred --docker-server=fra.ocir.io --docker-username=tenancy/oracleidentitycloudservice/user@oracle.com --docker-password="XYZWQ" --docker-email=user@oracle.com
    • Reference your image and secret in your corresponding pods/deployments and trigger their creation or restart. Image pulls for fra.ocir.io will now transparently use the configured mirror chain: if the image cannot be retrieved from one registry, the pull falls back to the other.

kind: Service
apiVersion: v1
metadata:
  name: helidon-pod
  labels:
    app: pod
spec:
  type: ClusterIP
  selector:
    app: helidon-pod
  ports:
    - name: tcp
      port: 8080
      protocol: TCP
      targetPort: 8080
---
kind: Deployment
apiVersion: apps/v1
metadata:
  name: helidon-pod
spec:
  replicas: 1
  selector:
    matchLabels:
      app: helidon-pod
  template:
    metadata:
      labels:
        app: helidon-pod
        version: v1
    spec:
      containers:
      - name: helidon-quickstart-mp
        image: fra.ocir.io/tenancy/myrepo:latest
        imagePullPolicy: Always
        ports:
        - containerPort: 8080
      imagePullSecrets:
      - name: fraregcred
...
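
Assuming the manifest above is saved as helidon-pod.yaml (the filename is illustrative), apply it and check that the image was pulled:

kubectl apply -f helidon-pod.yaml
kubectl get pods -l app=helidon-pod
kubectl describe pod -l app=helidon-pod | grep -i -A2 pull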
  • Step 4 — Validate Failover
    You can check that failover is working properly using iptables rules (K8s worker nodes do not use firewalld; they rely on iptables, so you can add DROP rules directly). Obtain the registries’ IPs with nslookup and disable traffic to one of them: the pull should “jump” to the other registry’s address and obtain the image. If you disable traffic to all the registries’ IPs, the image pull fails; re-enable one of them, and the pull succeeds again.
[opc@oke-c6nieuduvlq-nrmiw4rws7a-stvzoukvq5q-3 ~]$ nslookup fra.ocir.io
Server:		169.254.169.254
Address:	169.254.169.254#53

Non-authoritative answer:
Name:	fra.ocir.io
Address: 147.154.141.197
Name:	fra.ocir.io
Address: 147.154.178.110

[opc@oke-c6nieuduvlq-nrmiw4rws7a-stvzoukvq5q-3 ~]$ nslookup ams.ocir.io
Server:		169.254.169.254
Address:	169.254.169.254#53

Non-authoritative answer:
Name:	ams.ocir.io
Address: 192.29.192.138
[opc@oke-c6nieuduvlq-nrmiw4rws7a-stvzoukvq5q-3 ~]$ sudo iptables -A OUTPUT -d 192.29.192.138 -j DROP

[opc@oke-c6nieuduvlq-nrmiw4rws7a-stvzoukvq5q-3 ~]$ podman pull fra.ocir.io/tenancy/helidon-pod:latest
Trying to pull fra.ocir.io/tenancy/helidon-pod:latest...
Getting image source signatures
Copying blob cca2af61522d skipped: already exists  
Copying blob 180f7e7c1d47 skipped: already exists  
Copying blob 0cac2d858a1f skipped: already exists  
Copying blob dd337f169991 skipped: already exists  
Copying blob 2fca19049556 skipped: already exists  
Copying blob cc3d1a8fa2ac skipped: already exists  
Copying config df3b1ca66c done   | 
Writing manifest to image destination
df3b1ca66c9818e022a59f14df47d4b743a477c55d7eb3eef95279326ac90749

[opc@oke-c6nieuduvlq-nrmiw4rws7a-stvzoukvq5q-3 ~]$ sudo iptables -A OUTPUT -d 147.154.141.197 -j DROP
[opc@oke-c6nieuduvlq-nrmiw4rws7a-stvzoukvq5q-3 ~]$ sudo iptables -A OUTPUT -d 147.154.178.110 -j DROP
[opc@oke-c6nieuduvlq-nrmiw4rws7a-stvzoukvq5q-3 ~]$ podman pull fra.ocir.io/tenancy/helidon-pod:latest
Trying to pull fra.ocir.io/tenancy/helidon-pod:latest...
Error: initializing source docker://fra.ocir.io/tenancy/helidon-pod:latest: (Mirrors also failed: [ams.ocir.io/tenancy/helidon-pod:latest: pinging container registry ams.ocir.io: Get "https://ams.ocir.io/v2/": dial tcp 192.29.192.138:443: i/o timeout]
[fra.ocir.io/tenancy/helidon-pod:latest: pinging container registry fra.ocir.io: Get "https://fra.ocir.io/v2/": dial tcp 147.154.141.197:443: i/o timeout]): fra.ocir.io/tenancy/helidon-pod:latest: pinging container registry fra.ocir.io: ...

[opc@oke-c6nieuduvlq-nrmiw4rws7a-stvzoukvq5q-3 ~]$ sudo iptables -D OUTPUT -d 192.29.192.138 -j DROP
[opc@oke-c6nieuduvlq-nrmiw4rws7a-stvzoukvq5q-3 ~]$ podman pull fra.ocir.io/tenancy/helidon-pod:latest
Trying to pull fra.ocir.io/tenancy/helidon-pod:latest...
Getting image source signatures
Copying blob 180f7e7c1d47 skipped: already exists  
Copying blob 2fca19049556 skipped: already exists  
Copying blob dd337f169991 skipped: already exists  
Copying blob 0cac2d858a1f skipped: already exists  
Copying blob cca2af61522d skipped: already exists  
Copying blob cc3d1a8fa2ac skipped: already exists  
Copying config df3b1ca66c done   | 
Writing manifest to image destination
df3b1ca66c9818e022a59f14df47d4b743a477c55d7eb3eef95279326ac90749
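
Once the test is done, remember to delete the remaining DROP rules so that traffic to the primary registry’s IPs is restored:

sudo iptables -D OUTPUT -d 147.154.141.197 -j DROP
sudo iptables -D OUTPUT -d 147.154.178.110 -j DROP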

Automation

For large clusters, it’s practical to automate updating /etc/containers/registries.conf across all worker nodes. You can extend the following script with additional mirrors and customizations:

#!/bin/bash

# -------------------------------
# Configuration
# -------------------------------

#Customize your current primary and mirror registry's addresses
MAIN_REGISTRY="fra.ocir.io"
MIRROR1="ams.ocir.io"
# Replace with your SSH username and key
SSH_USER="opc"
SSH_KEY="/home/opc/my_keys/sshkey.priv"
# Path to the kubeconfig used to list the worker nodes
KCFG="$HOME/.kube/config"
#Provide your containers configuration file
REGISTRIES_CONF="/etc/containers/registries.conf"

# -------------------------------
# Function to generate registry entry
# -------------------------------
generate_registry_entry() {
  cat <<EOF
[[registry]]
location = "$MAIN_REGISTRY"

  [[registry.mirror]]
  location = "$MIRROR1"

EOF
}

# -------------------------------
# Get all worker nodes
# -------------------------------
#WORKERS=$(kubectl get nodes -o jsonpath='{range .items[?(@.spec.taints==null || !(@.spec.taints[*].effect=="NoSchedule"))]}{.metadata.name}{" "}{end}')
WORKERS=$(kubectl --kubeconfig="$KCFG" get nodes | grep -v 'master\|control' | grep -v NAME | awk '{print $1}')
if [[ -z "$WORKERS" ]]; then
  echo "No worker nodes found"
  exit 1
fi

# -------------------------------
# Traverse and update each worker
# -------------------------------
for NODE in $WORKERS; do
  echo "Updating registries.conf on node: $NODE"

  # Backup existing registries.conf
  ssh -i $SSH_KEY "$SSH_USER@$NODE" "sudo cp $REGISTRIES_CONF ${REGISTRIES_CONF}.bak"

  # Append the registry entry if it is not already present (idempotent;
  # edit the file manually if you need to change an existing entry)
  ssh -i $SSH_KEY "$SSH_USER@$NODE" "sudo bash -c 'grep -q \"location = \\\"$MAIN_REGISTRY\\\"\" $REGISTRIES_CONF || cat >> $REGISTRIES_CONF'" <<< "$(generate_registry_entry)"

  # Restart CRI-O
  echo "Restarting CRI-O on node $NODE"
  ssh -i $SSH_KEY "$SSH_USER@$NODE" "sudo systemctl restart crio"

  echo "Done on node: $NODE"
done

echo "All worker nodes updated successfully."

Conclusion

Container registry access may fail for several reasons (connectivity, storage downtime, or other outages), and this can prevent a Kubernetes microservice or application from working during deployment updates or pod restarts. To avoid CRs becoming a SPOF for a K8s cluster, redundant registries must be configured. To provide transparent failover in the kubelet's image pulls, a mirror configuration can be used in the repository configuration. This allows using a list of registries that is traversed in failure scenarios WITHOUT requiring any manipulation of the registry address used by pods and deployments. This pattern is now the recommended MAA approach to implement multi-region OCI registry redundancy in Kubernetes clusters.