Maximizing efficiency and savings: Using Avesha Smart Scaler for cost-effective Kubernetes clusters in OKE

February 29, 2024 | 4 minute read
Mayank Kakani
Cloud Architect

Kubernetes has become a cornerstone for automating container deployment, scaling, and management, earning widespread acclaim for the agility, scalability, and cross-environment portability it gives organizations around the globe. As adoption grows, cost overruns are surfacing more frequently, echoing concerns from the early days of cloud computing adoption.

This phenomenon mirrors the cloud’s transformative impact on how organizations access and budget for computing resources, with Kubernetes now reshaping the container orchestration domain. But this evolution brings challenges related to cost predictability and effective financial control as some organizations confront unexpectedly high expenses.

Oracle Container Engine for Kubernetes

Oracle Container Engine for Kubernetes (OKE) is a managed Kubernetes service by Oracle that simplifies the operations of enterprise-grade Kubernetes at scale. It reduces the time, cost, and effort needed to manage the complexities of the Kubernetes infrastructure. OKE lets you deploy Kubernetes clusters and ensure reliable operations for both the control plane and the worker nodes with automatic scaling, upgrades, and security patching.

The cost challenge

The dynamic nature of Kubernetes workloads running in OKE, coupled with intricate resource dependencies, can lead to unpredictable costs. Containerized microservices can scale up and down rapidly, and without proper governance, this flexibility can translate into spiraling expenses. Managing the infrastructure underlying Kubernetes clusters also demands careful consideration of factors like storage, networking, and compute resources, all of which contribute to the overall bill.

Another key aspect of the challenge is that scaling decisions are usually based on infrastructure metrics alone. Integrating business logic into this process can make resource utilization more efficient and cost-effective, because scaling actions then align more closely with the actual needs of the application and the demands of its users.

To address these challenges, Avesha’s Smart Scaler offers a solution that transitions organizations from a reactive stance, typically associated with Kubernetes horizontal pod autoscaling (HPA), to a proactive strategy. By accurately forecasting traffic demands ahead of time, Smart Scaler enables precise scaling of both application and infrastructure resources, optimizing costs and resource efficiency based on these anticipatory insights.

Strategies for cost management

Organizations must adopt proactive cost management strategies, including optimizing resource allocation, applying autoscaling effectively, and implementing monitoring and alerting to identify and rectify inefficient resource usage.


  • Horizontal pod autoscaling (HPA): Implement HPA to automatically adjust the number of running pods based on observed CPU or memory utilization. This feature ensures that resources scale dynamically based on demand, preventing overprovisioning during periods of low activity and reducing costs.

  • Cluster autoscaling: Configure your Kubernetes cluster to scale the underlying infrastructure based on the workload. This can involve adding or removing nodes based on resource demand, optimizing costs during peak and off-peak periods.
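To make these two mechanisms concrete, here is a minimal Python sketch. The first function follows the replica calculation described in the Kubernetes HPA documentation, ceil(currentReplicas * currentMetric / targetMetric), including the default 10% tolerance. The second is only a back-of-the-envelope node estimate: the real cluster autoscaler reacts to unschedulable pods and node utilization, so treat it as an approximation. The function names and the example numbers are illustrative assumptions, not part of any Kubernetes or OKE API.

```python
import math


def hpa_desired_replicas(current_replicas: int,
                         current_metric: float,
                         target_metric: float,
                         tolerance: float = 0.1) -> int:
    """Replica count per the documented HPA rule, before min/max bounds."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    ratio = current_metric / target_metric
    # HPA skips scaling when the ratio is within the tolerance of 1.0.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


def estimated_nodes(replicas: int,
                    pod_cpu_request_millicores: int,
                    node_allocatable_millicores: int) -> int:
    """Rough node count if CPU requests were the only scheduling constraint."""
    if pod_cpu_request_millicores <= 0:
        raise ValueError("pod CPU request must be positive")
    pods_per_node = node_allocatable_millicores // pod_cpu_request_millicores
    if pods_per_node < 1:
        raise ValueError("a single pod does not fit on a node")
    return math.ceil(replicas / pods_per_node)


if __name__ == "__main__":
    # 4 pods averaging 90% CPU against a 60% target -> 6 pods.
    replicas = hpa_desired_replicas(4, current_metric=90, target_metric=60)
    # 6 pods requesting 500m CPU each on nodes with 2000m allocatable -> 2 nodes.
    print(replicas, estimated_nodes(replicas, 500, 2000))
```

Because the calculation is driven by the currently observed metric, it can only react after load has already changed.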

Incorporating business logic in Kubernetes scaling

  • Application-specific metrics: Traditional Kubernetes scaling relies on infrastructure metrics like CPU and memory usage. Integrating business logic means using application-specific metrics as scaling signals. For instance, an e-commerce application might scale based on the number of active shopping carts or transaction rates, rather than just CPU usage. A sports betting application might scale based on where a sports event takes place and the demand for it at that location.

  • Service graphs and service latency: Incorporating service graphs and service-to-service communication latencies can provide a holistic view of how various services should scale in relation to one another. This integration can be instrumental in understanding the impact of scaling on overall application performance and the user experience.
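The two ideas above can be combined in a short sketch: start from a business-level demand signal at the entry service (for example, checkout requests per second) and push it through a service graph to size every downstream service together. This is a simplified illustration that assumes an acyclic call graph and known average fan-out ratios; the service names, ratios, and per-pod capacities are hypothetical.

```python
import math


def propagate_load(graph: dict[str, dict[str, float]],
                   entry_service: str,
                   entry_rps: float) -> dict[str, float]:
    """Expected requests per second at each service of an acyclic call graph.

    graph[a][b] is the average number of calls that service a makes to
    service b for each request that a handles.
    """
    load: dict[str, float] = {}

    def visit(service: str, rps: float) -> None:
        # Load is additive across every call path that reaches a service.
        load[service] = load.get(service, 0.0) + rps
        for callee, calls_per_request in graph.get(service, {}).items():
            visit(callee, rps * calls_per_request)

    visit(entry_service, entry_rps)
    return load


def replicas_per_service(load: dict[str, float],
                         rps_per_pod: dict[str, float]) -> dict[str, int]:
    """Pods needed per service, given how many requests one pod can serve."""
    return {svc: max(1, math.ceil(rps / rps_per_pod[svc]))
            for svc, rps in load.items()}


if __name__ == "__main__":
    # Hypothetical e-commerce call graph and capacities.
    graph = {
        "storefront": {"cart": 2.0, "catalog": 1.0},
        "cart": {"payments": 0.5},
    }
    capacity = {"storefront": 50, "cart": 40, "catalog": 100, "payments": 25}
    load = propagate_load(graph, "storefront", entry_rps=100.0)
    print(replicas_per_service(load, capacity))
```

Sizing services from the graph rather than one at a time avoids scaling the front end while a downstream dependency becomes the bottleneck.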

We know the techniques. What approach should we take?

You have the following options:

  • Manual scaling: Practitioners in cloud computing readily acknowledge that overseeing resources manually is a recipe for disaster. Unless you’re dealing with an entirely static environment or a modest Kubernetes deployment, keeping pace with workload demands through manual pod scaling is impractical.

  • Automation: Many open source and commercial autoscaling tools bring a degree of automation to the management of Kubernetes resources. Administrators set minimum and maximum thresholds, and the tool adjusts cluster or pod scaling autonomously in response to resource requirements and availability. This option is a considerable advance in managing Kubernetes costs efficiently.

  • Intelligent autoscaling: The next step in Kubernetes autoscaling adds intelligence, such as reinforcement learning, to the automation process, so that scaling behavior is learned from the workload rather than fixed by static thresholds alone.
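Whichever approach produces the recommendation, whether an operator, a threshold rule, or a learned model, the administrator-defined minimum and maximum act as guardrails on the final pod count. A minimal sketch of that guardrail follows; the class name and fields are hypothetical and only mirror the idea of minimum and maximum thresholds described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingBounds:
    """Administrator-defined limits applied to any scaling recommendation."""
    min_replicas: int
    max_replicas: int

    def __post_init__(self) -> None:
        if not 0 <= self.min_replicas <= self.max_replicas:
            raise ValueError("require 0 <= min_replicas <= max_replicas")

    def apply(self, recommended: int) -> int:
        # Clamp the recommendation into [min_replicas, max_replicas].
        return max(self.min_replicas, min(self.max_replicas, recommended))


if __name__ == "__main__":
    bounds = ScalingBounds(min_replicas=2, max_replicas=10)
    for recommended in (1, 5, 12):
        print(recommended, "->", bounds.apply(recommended))
```

The maximum caps spend during a spike, and the minimum keeps a baseline of capacity when traffic is quiet.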

Introducing Avesha’s Smart Scaler, the proactive Kubernetes autoscaler

Powered by generative AI, Smart Scaler predicts demand in advance and autoscales infrastructure and application resources to match. Instead of using generic autoscaling templates, Avesha’s patented reinforcement learning process learns the specific workload characteristics and adjusts the autoscaling process accordingly. It works natively with HPA and Kubernetes Event-driven Autoscaling (KEDA).

Smart Scaler manages pods proactively: it predicts the number of pods needed from workload demand and anticipates traffic patterns through learned behavior. Its reinforcement learning engine, enhanced by generative AI, continuously refines the pod count to optimize resource allocation and efficiency.
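To illustrate the difference between reacting to current load and scaling ahead of it, here is a deliberately simple predict-then-scale sketch. It is not Smart Scaler’s algorithm, which this post describes as reinforcement learning enhanced by generative AI. As a stand-in, it uses a seasonal-naive forecast: the expected load for a future time slot is the average of the same slot in earlier periods. The function names, the 20% headroom, and the sample traffic figures are assumptions made for the example.

```python
import math
from statistics import mean


def seasonal_naive_forecast(history: list[float],
                            period: int,
                            steps_ahead: int = 1) -> float:
    """Average of past samples that share the target slot's position in the period."""
    if period < 1 or steps_ahead < 1:
        raise ValueError("period and steps_ahead must be at least 1")
    target_index = len(history) - 1 + steps_ahead
    same_phase = history[target_index % period::period]
    if not same_phase:
        raise ValueError("not enough history for the requested slot")
    return mean(same_phase)


def proactive_replicas(forecast_rps: float,
                       rps_per_pod: float,
                       headroom: float = 1.2) -> int:
    """Pods to have ready before the forecast load arrives."""
    if rps_per_pod <= 0:
        raise ValueError("rps_per_pod must be positive")
    return max(1, math.ceil(forecast_rps * headroom / rps_per_pod))


if __name__ == "__main__":
    # Two periods of four samples each (requests per second); hypothetical data.
    history = [100.0, 400.0, 900.0, 300.0, 120.0, 440.0, 880.0, 320.0]
    forecast = seasonal_naive_forecast(history, period=4, steps_ahead=2)
    print(forecast, proactive_replicas(forecast, rps_per_pod=100.0))
```

A learned model would replace the forecast function, but the flow stays the same: predict demand for an upcoming window, then size the deployment before that window starts.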

Cutting Kubernetes operating expenses is sound business practice and also contributes to a more sustainable future. Lower resource consumption improves the bottom line while reducing carbon footprint, which benefits both enterprises and the environment.

Want to know more?

Try out Avesha’s Smart Scaler yourself on Oracle Cloud Infrastructure and save on your Kubernetes spending. For more information, see Avesha’s blog post, “What is Kubernetes Right-Sizing and should I care?”
