“Can you provision a database for my feature branch?”

This simple request used to trigger a 2–3 day process: create a ticket, wait for the DBA team to pick it up, manual database creation, configuration, connection details sent back, the inevitable misconfiguration, back to the DBA team…

By the time the database was ready, the developer had moved on to something else, context was lost, and momentum was dead.

We decided to fix this. What started as a simple automation project became a deep dive into Kubernetes internals, authentication patterns, and the challenges of managing stateful applications in cloud-native environments.

This is the story of that implementation — the problems we faced, the solutions we found, and the lessons we learned.
I presented this, with a live run of the workflow, at the recent Oracle AI World in Las Vegas. See my slides.

The Problem: Development Database Bottleneck

Our development workflow was broken. We had three teams working on different features of our application, all needing their own database environments for testing. But our infrastructure couldn’t keep up:

For Developers:

  • Shared development database meant constant conflicts — your schema change broke someone else’s work
  • Manual provisioning took days, killing momentum
  • No two development environments were configured the same way
  • “Works on my machine” problems were constant

For DBAs:

  • Managing 15+ manual database requests per week
  • Tracking which database belongs to which feature in spreadsheets
  • Forgotten test databases consuming cloud resources and budget
  • No standardization — every database was a snowflake

For the Business:

  • Features taking 20–30% longer due to infrastructure delays
  • Monthly cloud bills included $5,000 in orphaned test databases
  • Production issues from schema changes that weren’t properly tested
  • Developer frustration leading to retention concerns

We needed a better way.

The Vision: Database-as-Code

The initial implementation builds on Norman Aberin’s work; we envisioned a simple workflow:

  1. The developer creates a feature branch: feature/user-authentication
  2. The developer pushes code
  3. Magic happens
  4. 15 minutes later, they have a fully configured Oracle database with schemas deployed
  5. The developer works on their feature with an isolated database
  6. The developer deletes the branch
  7. The database automatically cleans up

No tickets. No waiting. No manual intervention.

Infrastructure-as-Code, but for databases.
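
To make this concrete, here is a minimal sketch of the GitHub Actions triggers that drive such a lifecycle: a push to a feature branch provisions or updates the database, and deleting the branch triggers cleanup. The file name, job names, and steps are illustrative, not our exact workflow.

# .github/workflows/feature-db.yml (illustrative sketch)
name: Feature branch database
on:
  push:
    branches:
      - 'feature/**'      # provision or update the database on every push
  delete:                 # fires when a branch (or tag) is deleted

jobs:
  provision:
    if: github.event_name == 'push'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # ...authenticate to OCI, render manifests with Kustomize,
      #    kubectl apply, run Liquibase (all covered below)...

  teardown:
    if: github.event_name == 'delete' && github.event.ref_type == 'branch'
    runs-on: ubuntu-latest
    steps:
      - run: echo "Clean up the namespace for branch ${{ github.event.ref }}"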

The Architecture

Here’s what you can achieve (so far I’ve only implemented it for a test-db feature branch):

Architecture diagram showing GitHub → GitHub Actions → OKE Cluster with namespaces

Key Components:

  • Oracle Kubernetes Engine (OKE): Our container orchestration platform
  • Oracle Database Operator: Manages the database lifecycle as a Kubernetes resource
  • GitHub Actions: Orchestrates the entire workflow
  • Kustomize: Handles configuration management
  • Liquibase: Manages schema versioning and deployment

Each feature branch gets its own namespace in the Kubernetes cluster, complete with an isolated Oracle database instance.

The flow looks like this:

GitHub Repository (feature branches)
               ↓
        GitHub Actions (triggered on push)
               ↓
        OKE Cluster
          ├─ Namespace: feature-user-auth
          │    ├─ Oracle Database Pod
          │    ├─ LoadBalancer Service
          │    └─ Persistent Volume
          └─ Namespace: feature-payment-system
               ├─ Oracle Database Pod
               ├─ LoadBalancer Service
               └─ Persistent Volume
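
The branch-to-namespace mapping is plain string handling inside the workflow. A minimal sketch, assuming simple sanitization rules (ours may differ):

# Turn a branch like "feature/user-authentication" into a DNS-safe namespace name
BRANCH="${GITHUB_REF#refs/heads/}"
ID_NAME=$(echo "${BRANCH#feature/}" \
  | tr '[:upper:]' '[:lower:]' \
  | tr -c 'a-z0-9-' '-' \
  | cut -c1-40 \
  | sed 's/-*$//')
NAMESPACE="feature-${ID_NAME}"
echo "namespace=${NAMESPACE}"    # e.g. feature-user-authentication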

The Journey: Four Major Challenges

What seemed straightforward in theory hit reality hard. Here are the four major technical challenges we faced and how we solved them.

Challenge 1: Authentication Hell

The Problem

We needed GitHub Actions to authenticate with our OKE cluster. Oracle provides a GitHub Action (oracle-actions/configure-kubectl-oke) that’s supposed to handle this. We followed the documentation, configured our secrets, ran the workflow, and…

Error: Authentication failed
    The required information to complete authentication 
    was not provided or was incorrect.

We spent three days trying different combinations of credentials, regions, and configuration options. The error messages were generic and unhelpful.

The Investigation

OCI authentication uses a private key and a fingerprint of that key’s public component. The Oracle action expected us to provide the fingerprint as a GitHub secret. But we kept getting mismatches — the fingerprint we stored didn’t match what the actual key generated.

Then the breakthrough: What if we don’t store the fingerprint at all? What if we generate it from the private key at runtime?

The Solution

- name: Setup OCI CLI and Configure kubectl
  run: |
    # Install OCI CLI
    bash -c "$(curl -L https://raw.githubusercontent.com/oracle/oci-cli/master/scripts/install/install.sh)" -- --accept-all-defaults
    export PATH="$HOME/bin:$PATH"

    # Create private key from GitHub secret
    mkdir -p ~/.oci
    echo "${{ secrets.OCI_KEY_FILE }}" > ~/.oci/key.pem
    chmod 600 ~/.oci/key.pem

    # Generate fingerprint from actual key (not from secret!)
    FINGERPRINT=$(openssl rsa -pubout -outform DER \
      -in ~/.oci/key.pem 2>/dev/null | \
      openssl md5 -c | awk -F'= ' '{print $2}')

    # Create OCI config with generated fingerprint
    cat > ~/.oci/config << EOF
    [DEFAULT]
    user=${{ secrets.OCI_USER_OCID }}
    fingerprint=${FINGERPRINT}
    tenancy=${{ secrets.OCI_TENANCY_OCID }}
    region=${{ secrets.OCI_REGION }}
    key_file=$HOME/.oci/key.pem
    EOF

    # Configure kubectl
    $HOME/bin/oci ce cluster create-kubeconfig \
      --cluster-id ${{ secrets.OKE_CLUSTER_OCID }} \
      --file ~/.kube/config \
      --region ${{ secrets.OCI_REGION }} \
      --token-version 2.0.0

Key Insight

Store only what you must. Generate what you can.

The fingerprint is derived from the key, so derive it at runtime. This guarantees they match because they come from the same source.

Impact: Authentication went from 0% success rate to 100%.
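
You can sanity-check the derivation locally: run the same commands against your key and compare the output with the fingerprint OCI shows for the uploaded API key (key path assumed):

# Should print the exact fingerprint the OCI console shows for this API key
openssl rsa -pubout -outform DER -in ~/.oci/key.pem 2>/dev/null | openssl md5 -c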

Challenge 2: The Kustomize Naming Nightmare

The Problem

We had a base database configuration named freedb-lite-sample. We wanted to add the branch name as a suffix, so for the branch test-db, we’d get db-test-db. Simple, right?

We used Kustomize’s namesuffix feature:

kustomize edit set namesuffix test-db

The result? freedb-lite-sampletest-db

Kustomize appended the suffix directly to the existing name. Our monitoring scripts were looking for db-test-db, but the database was actually named freedb-lite-sampletest-db.

We spent hours debugging “database not found” errors when the database was right there — just with the wrong name.

The Solution

We switched from namesuffix to JSON patches, which give complete control:

cat > kustomization.yaml << EOF
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: feature-test-db
resources:
  - base/sidb-free-lite
patches:
  - patch: |-
      - op: replace
        path: /metadata/name
        value: db-test-db
    target:
      kind: SingleInstanceDatabase
EOF

The patch says: “Find the SingleInstanceDatabase resource and replace its metadata name with exactly this value.” No ambiguity, full control.
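
A cheap way to catch this class of problem before anything is applied is to render the overlay locally and check the names Kustomize actually produces (a minimal sketch):

# Render the overlay and print every kind and metadata.name it emits
kubectl kustomize . | grep -E '^(kind|  name):'

# Expected to include:
#   kind: SingleInstanceDatabase
#   name: db-test-db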

Lesson Learned

Kustomize’s convenience features are great for simple cases, but for precise control, JSON patches are worth the extra verbosity.

Challenge 3: Namespaces That Won’t Die

The Problem

This was perhaps the most frustrating issue. We’d run our delete workflow to clean up an old database. The workflow completed successfully. But when we tried to create a new database with the same name:

Error from server (AlreadyExists): object is being deleted: 
    namespaces "feature-test-db" already exists

The namespace was stuck in the “Terminating” state. For hours. In one case, over 6 hours.

Kubernetes was trying to delete the namespace, but couldn’t because resources inside it had “finalizers” — cleanup handlers that must complete before deletion. The SingleInstanceDatabase custom resource had a finalizer for proper database shutdown. Something was preventing that finalizer from completing, so the namespace was stuck in limbo forever.

Developers couldn’t create new databases because the namespace name was technically “taken” even though nothing was actually running.

The Investigation

When I examined the namespace status, I found:

"message": "Some content in the namespace has finalizers remaining: 
    database.oracle.com/singleinstancedatabasefinalizer in 1 resource instances"

The database resource was blocking namespace deletion. But the database pod was long gone. The finalizer was waiting for something that would never happen.
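
Before forcing anything, it is worth confirming what is actually holding the namespace open. A sketch of the checks we’d run (namespace name assumed):

# What is the namespace waiting on?
kubectl get namespace feature-test-db -o jsonpath='{.status.conditions}'

# Which database resources still carry finalizers?
kubectl get singleinstancedatabase -n feature-test-db \
  -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.finalizers}{"\n"}{end}'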

The Solution

We had to force the issue by removing finalizers:

- name: Force delete stuck namespace and resources
  run: |
    NAMESPACE="feature-${{ steps.getname.outputs.idname }}"

    if kubectl get namespace $NAMESPACE 2>/dev/null | grep -q Terminating; then
      echo "Namespace stuck, removing resources..."

      # Remove finalizers from database resources first
      for db in $(kubectl get singleinstancedatabase -n $NAMESPACE -o name); do
        kubectl patch $db -n $NAMESPACE \
          -p '{"metadata":{"finalizers":[]}}' --type=merge
      done

      # Remove finalizers from namespace
      kubectl patch namespace $NAMESPACE \
        -p '{"metadata":{"finalizers":[]}}' --type=merge

      # Force delete if still exists
      sleep 15
      kubectl delete namespace $NAMESPACE --force --grace-period=0
    fi

We run this check before creating a new namespace. If an old one is stuck, we forcibly remove the finalizers and delete it.

Lessons Learned

Kubernetes finalizers are powerful for ensuring proper cleanup, but they can create deadlocks.

Always have a force-cleanup mechanism for production workflows. Custom Resource Definitions are especially prone to this issue.

Impact: Workflow success rate went from ~60% (blocked by stuck namespaces) to 95%+.

Challenge 4: The Connectivity Conundrum

The Problem

Database created successfully. Pod running. Everything looked perfect in the Kubernetes dashboard. But when Liquibase tried to connect from GitHub Actions:

ORA-12541: Cannot connect. No listener at host 10.0.10.10 port 30328
    nc: connect to 10.0.10.10 port 30328 (tcp) failed: Connection timed out

We verified credentials. Checked firewall rules. Tested from different networks. Nothing worked.

Then I realized: 10.0.10.10 is an internal Kubernetes IP (ClusterIP). It’s only accessible from inside the cluster. GitHub Actions runs on GitHub’s infrastructure, completely outside our cluster. They’re in different networks with no connectivity.

The Root Cause

The Oracle Database Operator created a ClusterIP service by default:

apiVersion: v1
kind: Service
metadata:
  name: db-service
spec:
  type: ClusterIP  # Internal only!
  clusterIP: 10.0.10.10
  ports:
  - port: 1521

ClusterIP is perfect for pod-to-pod communication within the cluster. But it’s invisible to the outside world.
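
You can see this at a glance from the service listing; a ClusterIP service never gets an external address (namespace assumed):

kubectl get svc -n feature-test-db
# A ClusterIP service shows <none> in the EXTERNAL-IP column, so nothing
# outside the cluster can reach 10.0.10.10:1521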

The Solution

We needed to expose the database externally via a LoadBalancer:

- name: Expose database externally
  run: |
    kubectl apply -f - <<EOF
    apiVersion: v1
    kind: Service
    metadata:
      name: db-external
      namespace: feature-test-db
    spec:
      type: LoadBalancer
      selector:
        app: db-test-db
      ports:
      - port: 1521
        targetPort: 1521
    EOF

    # Wait for external IP
    kubectl wait --for=jsonpath='{.status.loadBalancer.ingress[0].ip}' \
      service/db-external -n feature-test-db --timeout=300s

In Oracle Cloud, the LoadBalancer service automatically provisions an OCI Load Balancer with a public IP address. The flow becomes:

Liquibase (GitHub Actions) 
      → External IP (OCI Load Balancer) 
      → ClusterIP (Kubernetes Service) 
      → Database Pod

We wait for the external IP assignment (takes 2–3 minutes) before proceeding with Liquibase.
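
Once the address exists, the workflow reads it and hands it to Liquibase. A minimal sketch; the PDB service (FREEPDB1), schema user, and changelog path are assumptions, not our exact values:

# Read the public IP assigned to the OCI Load Balancer
DB_IP=$(kubectl get service db-external -n feature-test-db \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}')

# Deploy the schemas over the externally reachable listener
liquibase \
  --url="jdbc:oracle:thin:@//${DB_IP}:1521/FREEPDB1" \
  --username=liquibase_user \
  --password="${DB_PASSWORD}" \
  --changelog-file=db/changelog/db.changelog-master.xml \
  update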

Lessons Learned

Kubernetes networking has layers. ClusterIP is internal-only, NodePort exposes on node IPs, and LoadBalancer gets a cloud provider’s load balancer. Choose based on your access patterns.

Security Note: For production, you’d restrict access to this LoadBalancer (IP allowlists, VPN, etc.). For temporary development databases with non-sensitive data, external access is acceptable.

Alternative Approach: We’re considering running Liquibase as a Kubernetes Job inside the cluster for future iterations. This would keep the database internal-only and eliminate the need for external exposure.
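
A rough sketch of what that Job could look like; the image, credentials secret, in-cluster service name, and how the changelog reaches the container are all assumptions at this stage:

apiVersion: batch/v1
kind: Job
metadata:
  name: liquibase-update
  namespace: feature-test-db
spec:
  backoffLimit: 2
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: liquibase
        image: liquibase/liquibase:latest     # changelog baked in or mounted from a volume
        args:
          - --url=jdbc:oracle:thin:@//db-test-db:1521/FREEPDB1   # in-cluster service DNS name
          - --username=liquibase_user
          - --password=$(DB_PASSWORD)          # expanded by Kubernetes from the env var below
          - --changelog-file=db.changelog-master.xml
          - update
        env:
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: db-admin-secret            # assumed secret name
              key: password

Because the Job runs inside the cluster, it resolves the database by its internal service name and no LoadBalancer is required.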

Real World Performance

Here’s what a typical execution looks like:

Workflow execution timeline (screenshot from GitHub Actions)

What you get:

  • 1 Oracle Database instance (19c or 21c)
  • 1 Kubernetes namespace (isolated)
  • 3+ application schemas (fully deployed)
  • 1 external IP for connectivity
  • Automatic cleanup on branch deletion

Compare this to our previous process: a minimum of 2–3 days.

The Impact

Quantitative Results

Before:

  • Average database provisioning time: 2–3 days
  • Monthly orphaned database costs: ~$5,000
  • Developer satisfaction score (infrastructure): 4.2/10
  • Features delayed by database availability: 20–30%

After:

  • Average database provisioning time: 15–20 minutes (99.5% reduction)
  • Monthly orphaned database costs: ~$200 (96% reduction)
  • Developer satisfaction score (infrastructure): 8.7/10
  • Features delayed by database availability: <2%

Qualitative Benefits

Developers:

“I can experiment freely without worrying about breaking someone else’s work”

“Testing schema changes is so much easier now”

“I love that I can just push code and come back to a ready database”

DBAs:

“I’ve gone from managing 15+ manual requests per week to zero”

“Everything is standardized now — I know exactly how every database is configured”

“The automatic cleanup is amazing for cost control”

Project Managers:

“Features are shipping faster”

“Our cloud bill dropped significantly”

“Developers are happier, which helps with retention”

Lessons Learned

1. Authentication: Store Less, Generate More

Don’t store credentials you can derive. The API key fingerprint is no longer stored at all; it is generated from the private key at runtime. This reduced secret management complexity and eliminated mismatch errors.

2. Kubernetes: Plan for Finalizers

Custom Resource Definitions often have finalizers. In a perfect world, they clean up gracefully. In reality, they can deadlock. Always have force-cleanup mechanisms for production workflows.

3. Networking: Internal vs External Matters

ClusterIP vs LoadBalancer isn’t just about exposure — it fundamentally changes who can access your services. Understand your access patterns before choosing service types.

4. Debugging: Add Observability Early

Every step in our workflow now echoes what it’s doing and the results. When something fails at 2 AM, these logs are invaluable. Don’t wait for production failures to add debugging.

5. Automation: Perfect is the Enemy of Done

Our first version worked for 60% of cases. We shipped it, learned from failures, and iterated. Waiting for perfection would have meant developers still waiting days for databases.

Future Improvements

We’re not done. Here’s what we’re planning:

Security:

  • Run Liquibase as a Kubernetes Job (keep databases internal)
  • Add network policies (restrict database access to specific pods; see the sketch after this list)
  • Migrate from GitHub Secrets to OCI Vault (better secret management)
  • Implement TLS/SSL for database connections
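
For the network policy item, here is a sketch of the shape we have in mind: only Liquibase Job pods in the same namespace may reach the database port (labels and names assumed):

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: db-allow-liquibase-only
  namespace: feature-test-db
spec:
  podSelector:
    matchLabels:
      app: db-test-db                      # the database pods
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              job-name: liquibase-update   # label Kubernetes puts on the Job's pods
      ports:
        - protocol: TCP
          port: 1521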

Operations:

  • Automated backups before schema changes
  • Prometheus metrics for database health
  • Cost tracking by feature branch
  • Slack notifications for deployment status

Developer Experience:

  • Database seeding with realistic test data
  • Schema diff tool (compare feature branch to main)
  • Direct kubectl access for troubleshooting
  • Rollback capability for schema changes

Tools and Resources

This implementation wouldn’t have been possible without standing on the shoulders of others:

Core Technologies:

  • Oracle Kubernetes Engine (OKE)
  • Oracle Database Operator for Kubernetes
  • GitHub Actions
  • Kustomize
  • Liquibase

Inspiration:

  • Norman Aberin’s work, which the initial implementation builds on

Documentation & Community:

  • Oracle Cloud Infrastructure documentation
  • Kubernetes documentation (especially on finalizers and service types)
  • Stack Overflow for specific troubleshooting
  • Claude AI for debugging strategies and troubleshooting guidance

Try It Yourself

Want to implement this in your environment? Here’s where to start:

GitHub – Kuassim/AiWorld: Oracle AI World demo

  1. Set up an OKE cluster with the Oracle Database Operator installed
  2. Configure GitHub secrets for OCI authentication
  3. Create base database configuration using Oracle DB Operator manifests
  4. Implement the create workflow, starting with authentication
  5. Add Kustomize for configuration management
  6. Integrate Liquibase for schema deployment
  7. Build the delete workflow for cleanup

Start simple. Get authentication working first. Then add one piece at a time. Each challenge we faced taught us something valuable about the technology stack.
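
Authentication in this setup needs five GitHub secrets (the names match the workflow snippets above). A sketch of setting them with the GitHub CLI, using placeholder values:

# OCI API key and identifiers used by the provisioning workflow
gh secret set OCI_KEY_FILE < ~/.oci/key.pem
gh secret set OCI_USER_OCID --body "ocid1.user.oc1..aaaa..."
gh secret set OCI_TENANCY_OCID --body "ocid1.tenancy.oc1..aaaa..."
gh secret set OCI_REGION --body "eu-frankfurt-1"
gh secret set OKE_CLUSTER_OCID --body "ocid1.cluster.oc1..aaaa..."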

Conclusion

From 2–3 days to 15–20 minutes. From manual spreadsheet tracking to automated GitOps. From $5,000 in wasted resources to $200. From frustrated developers to delighted ones.

But more than the metrics, what we built is a culture shift. Databases are no longer special snowflakes that require manual care. They’re infrastructure-as-code, just like everything else in our cloud-native stack.

Every developer on our team can now provision a database with a git push. They can experiment freely, test thoroughly, and ship confidently. The database is no longer a bottleneck — it’s an enabler.

That’s the real value of automation: not just time saved, but possibilities created.

Have you implemented similar database automation in your environment? What challenges did you face? I’d love to hear your experiences in the comments.

Tags: #Kubernetes #DevOps #Oracle #GitHubActions #DatabaseAutomation #CloudNative #InfrastructureAsCode #OCI #Liquibase #Kustomize
