You can use instance pools to create multiple compute instances from the same configuration and manage them as a group. With an autoscaling configuration, you can automatically adjust the number of compute instances in an instance pool. An autoscaling configuration helps you provide consistent performance when demand is high and reduce your costs when demand is low.
Before you can create an instance pool, you have to create an instance configuration. An instance configuration is a template that defines the settings to use when creating compute instances.
You can create an instance configuration by using an existing compute instance as a template. If you want to create an instance configuration from scratch, use the SDKs, CLI, or API.
Let’s create a compute instance that we can use as an instance configuration template.
After the instance is provisioned, you can create an instance configuration from the instance details page.
Now let’s use the new instance as a template for an instance configuration.
After you create the instance configuration, its details page is displayed, as shown in the following figure.
You can create an instance pool directly from the instance configuration page by clicking Create Instance Pool and following these steps:
On the Configure Pool Placement page, you can add one or more availability domains. For each availability domain, you can specify fault domains, a primary virtual cloud network (VCN), and a subnet. By default, the instances in a pool are distributed across all fault domains in a best-effort manner based on capacity. If capacity isn’t available in one fault domain, the instances are placed in other fault domains so that the instance pool can launch successfully. For a high availability scenario, you can instead require that the instances be evenly distributed across the fault domains that you select.
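To make the two placement behaviors concrete, here is a minimal Python sketch. It is an illustration only, not how Oracle Cloud Infrastructure places instances internally: the function names, the fault domain labels, the per-domain capacity numbers, and the round-robin strategy are all assumptions made for this example. It also assumes that a required even distribution fails when a fault domain lacks capacity.

```python
from typing import Dict


def distribute_best_effort(count: int, capacity: Dict[str, int]) -> Dict[str, int]:
    """Place instances round-robin, skipping fault domains that are full.

    `capacity` maps a (hypothetical) fault domain name to the number of
    instances it can still host.
    """
    placed = {fd: 0 for fd in capacity}
    remaining = count
    while remaining > 0:
        progressed = False
        for fd in capacity:
            if remaining == 0:
                break
            if placed[fd] < capacity[fd]:
                placed[fd] += 1
                remaining -= 1
                progressed = True
        if not progressed:
            # A full pass placed nothing: every fault domain is out of capacity.
            raise RuntimeError("not enough capacity in any fault domain")
    return placed


def distribute_evenly(count: int, capacity: Dict[str, int]) -> Dict[str, int]:
    """Require an even split (sizes differ by at most one) or fail."""
    domains = list(capacity)
    base, extra = divmod(count, len(domains))
    placed = {}
    for index, fd in enumerate(domains):
        wanted = base + (1 if index < extra else 0)
        if wanted > capacity[fd]:
            # Assumption: an even distribution is all-or-nothing.
            raise RuntimeError(f"{fd} cannot host {wanted} instances")
        placed[fd] = wanted
    return placed


# Hypothetical capacity: FD-1 has room for only one more instance.
capacity = {"FD-1": 1, "FD-2": 10, "FD-3": 10}
best_effort = distribute_best_effort(5, capacity)
```

With this made-up capacity, the best-effort placement still succeeds by putting the extra instances in the other two fault domains, while `distribute_evenly(5, capacity)` raises an error because FD-1 can’t take its share.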
Also, you can associate a load balancer with the instance pool by selecting the Attach a load balancer check box. To use this feature, you must have an existing load balancer.
To continue, select AD1 for the availability domain and a VCN and a subnet. Click Next and then Create to create the instance pool.
Instance pool creation can take a couple of minutes. After it’s created, you can click the Created Instances link in the left navigation pane to open the list of created instances in the pool, as shown in the following figure.
Imagine a scenario where the existing instances can’t handle the current demand. In this scenario, you can create an autoscaling configuration that automatically scales the number of instances in the pool.
Let’s create an autoscaling configuration from the instance pool’s details page.
Click the More Actions menu and select Create Autoscaling Configuration.
From the Create in compartment list, select the compartment where you created the instance pool.
For the name, enter my-autoscaling-config.
From the Instance pool list, select my-instance-pool.
Click Next. On the Configure Autoscaling Policy page, you can select Metric-based Autoscaling or Schedule-based Autoscaling.
With metric-based autoscaling, you can use CPU or memory utilization to trigger autoscaling events. You then define a scale-out rule and a scale-in rule. Each rule takes a threshold percentage and the number of instances to add to the pool (scale out) or remove from it (scale in).
You can also define scaling limits. The initial number of instances is taken from the pool (5 in our example), and you can set a minimum and a maximum number of instances. To ensure that instances aren’t added or removed too quickly, the cooldown setting enforces a minimum time, in seconds, between scaling events (300 seconds, or 5 minutes).
The following figure shows an example of a CPU utilization-based autoscaling policy. The policy adds two instances when the CPU utilization is greater than 70%. When the CPU utilization falls under 40%, the autoscaling policy removes two instances.
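The rules in this policy can be sketched in a few lines of Python. This is a simplified model of the decision, not the service’s actual implementation: the `MetricPolicy` class and `next_pool_size` function are hypothetical names, and the minimum (1) and maximum (10) pool sizes are assumed values, because the post doesn’t state which limits were used. The thresholds, step size, and cooldown come from the example above.

```python
from dataclasses import dataclass


@dataclass
class MetricPolicy:
    scale_out_threshold: float = 70.0  # add instances above this CPU %
    scale_in_threshold: float = 40.0   # remove instances below this CPU %
    step: int = 2                      # instances added or removed per event
    min_size: int = 1                  # assumed scaling limit
    max_size: int = 10                 # assumed scaling limit
    cooldown_seconds: int = 300        # minimum time between scaling events


def next_pool_size(
    policy: MetricPolicy,
    current_size: int,
    cpu_percent: float,
    seconds_since_last_scale: int,
) -> int:
    """Return the pool size after evaluating the policy once."""
    if seconds_since_last_scale < policy.cooldown_seconds:
        # Still cooling down: ignore the metric and keep the current size.
        return current_size
    if cpu_percent > policy.scale_out_threshold:
        return min(current_size + policy.step, policy.max_size)
    if cpu_percent < policy.scale_in_threshold:
        return max(current_size - policy.step, policy.min_size)
    # Between the two thresholds nothing changes.
    return current_size


policy = MetricPolicy()
busy = next_pool_size(policy, current_size=5, cpu_percent=85, seconds_since_last_scale=600)
idle = next_pool_size(policy, current_size=5, cpu_percent=30, seconds_since_last_scale=600)
```

Under these assumptions, `busy` is 7 and `idle` is 3. The gap between the two thresholds (40% to 70%) keeps the pool from flapping between sizes when utilization hovers around a single value.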
Use metric-based autoscaling when you can’t predict the amount of traffic and you want to automate scaling based on CPU or memory utilization. Use schedule-based autoscaling when you can predict the demand or you know that increased demand will occur (a launch event, for example).
With schedule-based autoscaling, you use cron expressions to configure when the number of instances in the pool should change. Read more about cron expressions in the Cron Trigger Tutorial.
For example, let’s say you plan a series of launch events that happen every Monday at 2 p.m., from January through March. You want to ensure that you have 10 instances running at those times. The following figure shows how to configure such a policy by setting the target pool size to 10 and entering values for the minute (0), hour (14), day of the month (?), month (1-3), day of the week (2), and year (*).
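To check your reading of such a schedule, you can model the six fields in Python and test a few dates against them. The following sketch is a simplified matcher written for this post, not a full cron parser and not part of any Oracle SDK. It handles only single numbers, ranges such as `1-3`, and the `*` and `?` wildcards, and it assumes the Quartz day-of-week numbering used in the example, where 1 is Sunday and 2 is Monday. The names `SCHEDULE`, `field_matches`, and `schedule_fires` are made up for illustration.

```python
from datetime import datetime
from typing import Dict

# The field values from the example above.
SCHEDULE = {
    "minute": "0",
    "hour": "14",
    "day_of_month": "?",
    "month": "1-3",
    "day_of_week": "2",
    "year": "*",
}


def field_matches(spec: str, value: int) -> bool:
    """Match one field: wildcard, inclusive range, or single number."""
    if spec in ("*", "?"):
        return True
    if "-" in spec:
        low, high = (int(part) for part in spec.split("-"))
        return low <= value <= high
    return int(spec) == value


def quartz_day_of_week(moment: datetime) -> int:
    """Convert Python's ISO weekday (Mon=1..Sun=7) to Quartz (Sun=1..Sat=7)."""
    return moment.isoweekday() % 7 + 1


def schedule_fires(schedule: Dict[str, str], moment: datetime) -> bool:
    """Return True if every field of the schedule matches the given minute."""
    actual = {
        "minute": moment.minute,
        "hour": moment.hour,
        "day_of_month": moment.day,
        "month": moment.month,
        "day_of_week": quartz_day_of_week(moment),
        "year": moment.year,
    }
    return all(field_matches(schedule[name], actual[name]) for name in schedule)


monday_launch = schedule_fires(SCHEDULE, datetime(2024, 1, 8, 14, 0))  # a Monday in January
april_monday = schedule_fires(SCHEDULE, datetime(2024, 4, 1, 14, 0))   # a Monday in April
```

Here `monday_launch` is `True` and `april_monday` is `False`, because April falls outside the `1-3` month range.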
With schedule-based autoscaling, you can define multiple autoscaling policies in the same configuration. For example, the policy just defined scales out the pool to 10 instances, but it doesn’t scale the pool back in after the launch events. To decrease the number of instances after each Monday launch event is finished, add another policy with a later start time and a smaller target pool size (5, for example).
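The effect of pairing a scale-out policy with a scale-in policy can be shown with a small timeline model. This is a sketch under assumptions: the 6 p.m. scale-in time is invented for the example (the post doesn’t say when the launch events end), and `pool_size_at` is a hypothetical helper that simply applies the most recent schedule execution.

```python
from datetime import datetime
from typing import List, Tuple

Execution = Tuple[datetime, int]  # (time the policy fired, target pool size)


def pool_size_at(moment: datetime, initial_size: int, executions: List[Execution]) -> int:
    """Return the pool size at `moment`, given the schedule executions so far."""
    size = initial_size
    for fired_at, target in sorted(executions):
        if fired_at <= moment:
            size = target  # the latest execution that has fired wins
    return size


launch_day = datetime(2024, 1, 8)  # a Monday
scale_out = (launch_day.replace(hour=14), 10)  # policy from the example
scale_in = (launch_day.replace(hour=18), 5)    # assumed second policy, 6 p.m.

evening = launch_day.replace(hour=19)
with_both = pool_size_at(evening, 5, [scale_out, scale_in])
scale_out_only = pool_size_at(evening, 5, [scale_out])
```

With both policies, the pool is back to 5 instances by the evening. With only the scale-out policy, it stays at 10 until something else changes it, which is the extra cost the second policy avoids.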
After you configure the autoscaling policy, click Create to create the policy. You can define multiple autoscaling configurations for the same instance pool, and you can enable or disable them individually.
In this post, you learned about instance pools, instance configurations, and autoscaling configurations. You saw how to create metric- and schedule-based autoscaling configurations to scale the instance pool in or out based on CPU or memory utilization or a defined schedule.
Using these resources helps you provide consistent performance when demand is high and reduce your costs when demand is low.
Every use case is different. The only way to know if Oracle Cloud Infrastructure is right for you is to try it. You can select either the Oracle Cloud Free Tier or a 30-day free trial, which includes US$300 in credit to get you started with a range of services, including compute, storage, and networking.