By Mike Cico-Oracle on Oct 30, 2015
Introducing Elasticity for Dynamic Clusters
WebLogic Server 12.1.2 introduced the concept of dynamic clusters, which are clusters where the Managed Server configurations are based on a single, shared template. This greatly simplifies the configuration of clustered Managed Servers, and allows servers to be dynamically assigned to machine resources, giving greater utilization of resources with minimal configuration.
In WebLogic Server 12.2.1, we build on the dynamic clusters concept to introduce elasticity to dynamic clusters, allowing them to be scaled up or down based on conditions identified by the user. Scaling a cluster can be performed on-demand (interactively by the administrator), at a specific date or time, or based on performance as seen through various server metrics.
In this blog entry, we take a high-level look at the different aspects of elastic dynamic clusters in WebLogic Server 12.2.1, the next piece in the puzzle for on-premise elasticity with WebLogic Server! In subsequent blog entries, we will provide more detailed examinations of the different ways of achieving elasticity with dynamic clusters.
The WebLogic Server Elasticity Framework
The diagram below shows the different parts to the elasticity framework for WebLogic Server:
The Elastic Services Framework is a set of services residing within the Administration Server for a WebLogic domain, and consists of
- A new set of elastic properties on the DynamicServersMBean for dynamic clusters to establish the elastic boundaries and characteristics of the cluster
- New capabilities in the WebLogic Diagnostics Framework (WLDF) to allow for the creation of automated elastic policies
- A new "interceptors" framework to allow administrators to interact with scaling events for provisioning and database capacity checks
- A set of internal services that perform the scaling
- (Optional) integration with Oracle Traffic Director (OTD) 12c to notify it of changes in cluster membership and allow it to adapt the workload accordingly
Note that tighter integration with OTD is possible in 12.2.1, but it is not required: if the OTD server pool is enabled for dynamic discovery, OTD will adapt as necessary to the set of available servers in the cluster.
Configuring Elasticity for Dynamic Clusters
To get started, when you're configuring a new dynamic cluster, or modifying an existing one, you'll want to use some new properties surfaced through the DynamicServersMBean for the cluster to set some elastic boundaries and control the elastic behavior of the cluster.
The new properties to be configured include
- The starting dynamic cluster size
- The minimum and maximum elastic sizes of the cluster
- The "cool-off" period required between scaling events
There are several other properties regarding how to manage the shutdown of Managed Servers in the cluster, but the above settings control the boundaries of the cluster (by how many instances it can scale up or down), and how frequently scaling events can occur. The Elastic Services Framework will allow the dynamic cluster to scale up to the specified maximum number of instances, or down to the minimum you allow.
The cool-off period is a safety mechanism designed to prevent scaling events from occurring too frequently. It should allow enough time for a scaling event to complete and for its effects to be felt on the dynamic cluster's performance characteristics.
Needless to say, the values for these settings should be chosen carefully and aligned with your cluster capacity planning!
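To make the cool-off idea concrete, here is a minimal Python sketch of how such a gate might behave. It is only a model for illustration: the `ElasticSettings` class, its field names, and the `scaling_allowed` function are hypothetical and are not the DynamicServersMBean API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class ElasticSettings:
    """Hypothetical stand-in for a dynamic cluster's elastic settings."""
    min_size: int
    max_size: int
    cool_off: timedelta

    def __post_init__(self) -> None:
        # The boundaries only make sense if 0 <= min <= max.
        if not 0 <= self.min_size <= self.max_size:
            raise ValueError("expected 0 <= min_size <= max_size")


def scaling_allowed(settings: ElasticSettings,
                    last_event: Optional[datetime],
                    now: datetime) -> bool:
    """Return True if the cool-off period since the last scaling event has passed."""
    if last_event is None:
        return True  # no earlier scaling event, so nothing to cool off from
    return now - last_event >= settings.cool_off


settings = ElasticSettings(min_size=2, max_size=8, cool_off=timedelta(minutes=15))
last = datetime(2015, 10, 30, 12, 0)
print(scaling_allowed(settings, last, datetime(2015, 10, 30, 12, 5)))   # False
print(scaling_allowed(settings, last, datetime(2015, 10, 30, 12, 20)))  # True
```

In this model, a scaling request that arrives five minutes after the previous event is held back, while one that arrives after the 15-minute cool-off goes through.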
Scaling Dynamic Clusters
Scaling of a dynamic cluster can be achieved through the following means:
- On-demand through WebLogic Server Administration Console and WLST
- Using an automated calendar-based schedule utilizing WLDF policies and actions
- Through automated WLDF policies based on performance metrics
WebLogic administrators have the ability to scale a dynamic cluster up or down on demand when needed:
- Through the WLST scaleUp() and scaleDown() commands (nicely detailed in Byron Nevins' blog entry here)
- Using the WebLogic Administration Console, on the dynamic cluster's "Control/Scaling" tab (see image below)
In the console case, the administrator simply indicates the total number of desired running servers in the cluster, and the Console will interact with the Elastic Services Framework to scale the cluster up or down accordingly, within the boundaries of the dynamic cluster.
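As a rough illustration of the console case, the sketch below turns a desired total into a scale-up or scale-down request. The function name `plan_scaling` is hypothetical, and clamping an out-of-range request to the boundaries (rather than rejecting it) is an assumption made for the example, not a statement of what the Console actually does.

```python
from typing import Tuple


def plan_scaling(running: int, desired: int,
                 min_size: int, max_size: int) -> Tuple[str, int]:
    """Translate a desired total of running servers into a scaling request.

    Assumption: a desired total outside [min_size, max_size] is clamped
    to the nearest boundary.
    """
    target = max(min_size, min(desired, max_size))
    delta = target - running
    if delta > 0:
        return ("scaleUp", delta)
    if delta < 0:
        return ("scaleDown", -delta)
    return ("none", 0)


print(plan_scaling(running=3, desired=10, min_size=2, max_size=8))  # ('scaleUp', 5)
print(plan_scaling(running=3, desired=1, min_size=2, max_size=8))   # ('scaleDown', 1)
```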
In addition to scaling a dynamic cluster on demand, WebLogic administrators can configure automated policies using the Policies & Actions feature (known in previous releases as the Watch & Notifications Framework) in WLDF.
Typically, automated scaling will consist of creating pairs of WLDF policies, one for scaling up a cluster, and one for scaling it down. Each scaling policy consists of
- (Optionally) A policy (previously known as a "Watch Rule") expression
- A schedule
- A scaling action
To create an automated scaling policy, an administrator must
- Configure a domain-level diagnostic system module and target it to the Administration Server
- Configure a scale-up or scale-down action for a dynamic cluster within that WLDF module
- Configure a policy and assign the scaling action
For more information you can consult the documentation for Configuring Policies and Actions.
Calendar Based Elastic Policies
In 12.2.1, WLDF introduces the ability for cron-style scheduling of policy evaluations. Policies that monitor MBeans according to a specific schedule are called "scheduled" policies.
A calendar based policy is a policy that unconditionally executes according to its schedule and executes any associated actions. When combined with a scaling action, you can create a policy that can scale up or scale down a dynamic cluster at specific scheduled times.
Each scheduled policy type has its own schedule (as opposed to earlier releases, in which policies were tied to a single evaluation frequency). The schedule is configured in calendar time, which makes it possible to create schedule patterns such as (but not limited to):
- Recurring interval based patterns (e.g., every 5th minute of the hour, or every 30th second of every minute)
- Days-of-week or days-of-month (e.g., "every Mon/Wed/Fri at 8 AM", or "every 15th and 30th of every month")
- Specific days and times within a year (e.g., "December 26th at 8AM EST")
So, for example, an online retailer could configure a pair of policies around the Christmas holidays:
- A "Black Friday" policy to scale up the necessary cluster(s) to meet increased shopping demand for the Christmas shopping season
- Another policy to scale down the cluster(s) on December 25th when the Christmas shopping season is over
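To show what calendar-time matching of this kind looks like, here is a small Python sketch. The `Schedule` class and its fields are hypothetical and are not the WLDF schedule syntax. They only model the idea that a policy fires when the current time matches every field of its pattern.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Schedule:
    """Hypothetical calendar pattern; None means 'any value' for that field."""
    months: Optional[FrozenSet[int]] = None
    days_of_month: Optional[FrozenSet[int]] = None
    weekdays: Optional[FrozenSet[int]] = None  # Monday=0 .. Sunday=6
    hours: Optional[FrozenSet[int]] = None
    minutes: Optional[FrozenSet[int]] = None

    def matches(self, t: datetime) -> bool:
        """Return True if every constrained field accepts the given time."""
        checks = (
            (self.months, t.month),
            (self.days_of_month, t.day),
            (self.weekdays, t.weekday()),
            (self.hours, t.hour),
            (self.minutes, t.minute),
        )
        return all(allowed is None or value in allowed
                   for allowed, value in checks)


# "Every Mon/Wed/Fri at 8 AM"
mon_wed_fri_8am = Schedule(weekdays=frozenset({0, 2, 4}),
                           hours=frozenset({8}), minutes=frozenset({0}))
# Scale down at midnight on December 25th
dec_25 = Schedule(months=frozenset({12}), days_of_month=frozenset({25}),
                  hours=frozenset({0}), minutes=frozenset({0}))

print(mon_wed_fri_8am.matches(datetime(2015, 10, 30, 8, 0)))  # a Friday: True
print(dec_25.matches(datetime(2015, 12, 25, 0, 0)))           # True
print(dec_25.matches(datetime(2015, 12, 26, 0, 0)))           # False
```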
Performance-based Elastic Policies
In addition to calendar-based scheduling, in 12.2.1 WLDF provides the ability to create scaling policies based on performance conditions within a server ("server-scoped") or cluster ("cluster-scoped"). You can create a policy based on various run-time metrics supported by WebLogic Server. WLDF also provides a set of pre-packaged, parameterized, out-of-the-box functions called "Smart Rules" to assist in creating performance-based policies.
Cluster-scoped Smart Rules allow you to look at trends in a performance metric across a cluster over a specified window of time and (when combined with scaling actions) scale up or down based on criteria that you specify. Some examples of the metrics that are exposed through Smart Rules include:
- Throughput (requests/second)
- JVM Free heap percentage
- Process CPU Load
- Pending user requests
- Idle threads count
- Thread pool queue length
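The general shape of a cluster-scoped rule can be sketched in a few lines of Python. This is an illustrative model only: the function `cluster_rule_fires`, its parameters, and the "average over the window" criterion are assumptions for the example, not the definition of any particular Smart Rule.

```python
from statistics import mean
from typing import Dict, List


def cluster_rule_fires(samples: Dict[str, List[float]],
                       threshold: float,
                       percent_servers: float) -> bool:
    """Fire when enough servers average at or above the threshold.

    samples maps a server name to the metric values collected for it
    over the sampling window.
    """
    if not samples:
        return False
    over = sum(1 for values in samples.values()
               if values and mean(values) >= threshold)
    return 100.0 * over / len(samples) >= percent_servers


# Throughput (requests/second) sampled on three servers over one window
window = {
    "server-1": [120.0, 135.0, 150.0],
    "server-2": [110.0, 140.0, 160.0],
    "server-3": [40.0, 55.0, 60.0],
}
print(cluster_rule_fires(window, threshold=100.0, percent_servers=60.0))  # True
print(cluster_rule_fires(window, threshold=100.0, percent_servers=75.0))  # False
```

Two of the three servers average above 100 requests/second, so a rule requiring 60% of the cluster fires, while one requiring 75% does not.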
Additionally, WLDF provides some "generic" Smart Rules to allow you to create policies based on your own JMX-based metrics. The full Smart Rule reference can be found here.
And, if a Smart Rule doesn't suit your needs, you can also craft your own policy expressions. In 12.2.1, WLDF utilizes Java EL 3.0 as the policy expression language, and allows you to craft your own policy expressions based on JavaBean objects and functions (including Smart Rules!) that we provide out of the box.
Provisioning and Safeguards with Elasticity
What if you need to add or remove virtual machines during the scaling process? In WLS 12.2.1 you can participate in the scaling event utilizing script interceptors. A script interceptor provides call-out hooks where you can supply custom shell scripts, or other executables, to be called when a scaling event happens on a cluster. In this manner, you can write a script to interact with 3rd-party virtual machine hypervisors to add virtual machines prior to scaling up, or remove/reassign virtual machines after scaling down.
WebLogic Server also provides administrators the ability to prevent overloading database capacity on a scale up event through the data source interceptor feature. Data source interceptors allow you to set a value for the maximum number of connections allowed on a database, by associating a set of data source URLs and URL patterns with a maximum connections constraint. When a scale up is requested on a cluster, the data source interceptor looks at what the new maximum connection requirements are for the cluster (with the additional server capacity), and if it looks like the scale up could lead to a database overload it rejects the scale up request. While this still requires adequate capacity planning for your database utilization, it allows you to put in some sanity checks at run time to ensure that your database doesn't get overloaded by a cluster scale up.
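The arithmetic behind such a check can be sketched as follows. This is a simplified model under stated assumptions: the `DataSource` type, the `scale_up_permitted` function, the shell-style URL pattern, and the idea that each server contributes the sum of its matching pool maximums are all hypothetical choices for illustration, not the interceptor's actual implementation.

```python
from fnmatch import fnmatchcase
from typing import List, NamedTuple


class DataSource(NamedTuple):
    url: str
    max_pool_size: int  # most connections one server may open to this URL


def scale_up_permitted(data_sources: List[DataSource],
                       url_pattern: str,
                       max_db_connections: int,
                       current_servers: int,
                       servers_to_add: int) -> bool:
    """Reject a scale up that could exceed the database connection limit."""
    # Connections a single server could open against the matching database.
    per_server = sum(ds.max_pool_size for ds in data_sources
                     if fnmatchcase(ds.url, url_pattern))
    projected = per_server * (current_servers + servers_to_add)
    return projected <= max_db_connections


sources = [
    DataSource("jdbc:oracle:thin:@dbhost:1521/orders", 15),
    DataSource("jdbc:oracle:thin:@dbhost:1521/catalog", 10),
    DataSource("jdbc:oracle:thin:@otherhost:1521/audit", 20),
]
pattern = "jdbc:oracle:thin:@dbhost:*"

# 25 connections per server: 5 servers need 125, 6 servers need 150
print(scale_up_permitted(sources, pattern, 140, current_servers=4, servers_to_add=1))  # True
print(scale_up_permitted(sources, pattern, 140, current_servers=4, servers_to_add=2))  # False
```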
Integration with Oracle Traffic Director
The elasticity framework also integrates with OTD through the WebLogic Server 12.2.1 life cycle management services. When a scaling event occurs, the elasticity framework interacts with the life cycle management services to notify OTD of the scaling event so that OTD can update its routing tables accordingly.
In a scale-up event, for example, OTD is notified of the candidate servers and adjusts the server pool accordingly.
In the case of a scale down, the life cycle management services notify OTD which instances are going away. OTD then stops sending new requests to the servers being scaled down and routes new traffic to the remaining instances in the cluster, allowing the instances being removed to be shut down gracefully without losing any requests.
In order for OTD integration to be active, you must enable life cycle management services for the domain as documented here.
The Big Picture - Tying It All Together
The elasticity framework in 12.2.1 provides a lot of power and flexibility to manage the capacity in your on-premise dynamic clusters. As part of your dynamic cluster capacity planning, you can use elasticity to take into account your dynamic cluster's minimum, baseline, and peak capacity needs, and incorporate those settings into your dynamic servers configuration on the cluster. Utilizing WLDF policies and actions, you can create automated policies to scale your cluster at times of known increased or decreased capacity, or to scale up or down based on cluster performance.
Through the use of script interceptors, you can interact with virtual machine pools to add or remove virtual machines during scaling, or perhaps even move shared VMs between clusters based on need. You can also utilize the data source interceptor to prevent exceeding the capacity of any databases affected by scale up events.
And, when so configured, the Elasticity Framework can interact with OTD during scaling events to ensure that new and in-flight sessions are managed safely when adding or removing capacity in the dynamic cluster.
In future blogs (and maybe vlogs!) we'll go into some of the details on these features. This is really just an overview of the new features that are available to help our users implement elasticity with dynamic clusters. We will follow up in the coming weeks and months with more detailed discussions and examples of how to utilize these powerful new features.
Feel free to post any questions you have here, or email me directly. In the meantime, download WebLogic Server 12.2.1 and start poking around!