Clustering and Load balancing in JBI Components
The aim of this note is to peek at the Glassfish’s implementation with respect to clustering and load balancing and to formulate a strategy for the components to be cluster aware.
The approach I took while writing this note is to give an over view of clustering, the components that are involved in making the clustering possible and also discuss the strategies that AppServer took while making its components cluster aware.
When we speak of AppServer the following 3 components come to our mind.
1. Web Server – catering to the http requests
2. EJB container – catering to the ORB requests and
3. MQ – catering to JMS
When we observe Glassfish each of the component has its own strategy for clustering and load balancing. Each of the strategy vows its implementation either to its legacy (meaning its already there like in case of http load balancer) or need arising out for that specific protocol in a given scenario. We can observer variance with respect to the same protocol being implemented with different strategies, since not being owners as in case of JMS.
I tried to explain each of the strategy for the above protocols with respect to Glassfish, and concluded that each protocol must come up with its own strategy and the same can not be generalized for all the protocols because of the sheer variety of them and the need of each protocol. Though we can generalize certain aspects and can bring forth the LCM of all the protocols at a protocol agnostic way the time and effort might not match the same.
I also tried to discuss different strategies that we can adopt for making the components cluster aware and having the load balancing and failover facilities in a generic way. I also tried to explore the API so that the BC while writing the cluster aware code can get the details and info required of the cluster.
AppServer Cluster environment:
Each server instance, whether it is standalone, DAS or clustered will contain a JBI runtime. DAS will contain Facade mbeans that will be communicating to the cluster instances or standalone instances based on the target information.
Basic JBI runtime will not have any special capability to handle clustering. The components on top of it will be aware of EE clustering. So, it will be the component's responsibility to work in a clustered manner. For example, a BPEL process doing a correlation will need the messages to routed to the same instance. However this will be done in a way specific to the component. For example, BPEL engine deployment to a cluster will use a group ID to identify the cluster. This can be the cluster name.
Load balancing to the EE cluster will be handled in a component specific way. For example, the SOAP BC, will use HTTP load balancer. and the JMS BC will use inbound JMS load balancing.
Each Resource accessed by applications reference external resources such as JDBC database resources and their associated connection pools, CMP persistence managers, JMS resources, java mail resources, custom JNDI resources, and connector resources. Like an application, each resource has a JNDI name which is unique across the domain. An un-clustered server instance or cluster can reference zero or more resources, and a resource can be staged in which case it is referenced by no server instances.
At the highest level Glassfish configuration information (including Java EE applications and resources) for a domain are stored in a Central Repository that is shared by all instances in the domain. The Central Repository is written to, by a single entity – the DAS (Domain Administration Server). All applications and resources deployed to a domain are stored in the central repository. They are also locally cached at each instance.
Each server instance maintains its own local cache of the central repository that serves two important purposes: to allow the instance to read its configuration in the absence of the DAS and for performance purposes (e.g. class loaders reading from the local file system are much more efficient). The server instance must synchronize its state with that of the Central Repository in two cases: incrementally as configuration changes are made to the Central Repository and at instance startup time (e.g. because an instance might miss configuration changes when it is down).
How does AppServer handle Clustering and Load Balancing?
This understanding is needed so that we can be aware of how things happen at the AppServer side and how can they be extrapolated in the JBI environment.
For AppServer there are 2 perspectives with respect to clustering:
1. Administrative -- from this stand point cluster is a bunch of homogeneous machines/Server Instances for the DAS2. Per Instance – this is for individual instance need.
Why do we need a load balancer at all? – The Load balancer the AppServer has is only a HTTP load balancer which in fact works only with in the Web Server component and the AppServer has only a hook for that via proxy (This portion is implemented by grizzly in AS). The LB is in fact a native LB for performance reasons. The functions of this LB are
a. load balancer
b. Route the request. – this mechanism works like having context root – port mapping which helps in routing the request.
c. Maintaining sticky session -- the LB acts as façade for HTTP requests that are coming in and maintains a session store in HADB for performance reasons and for optimization.
There should be one logical question as to how does the EJB Container is handling the issue of clustering and associated issues? We should remember that this load balancer is for only HTTP Protocol and not for any other protocol like JMS/IIOP etc..
Lets look at IIOP/ORB:
The Cluster aware ORB Runtime handles the requests from the App client. Whenever there is a request from the client the runtime checks for the cluster configuration information of the client if the client is not updated with the information it will set the information. The App client has the inbuilt routing capability to route the request and the stickiness (SFSB, servlets..) for performance and session maintenance requirements.
JMS: Cluster awareness is needed in case of inbound only since outbound communication to EIS can happen with out any issue (I will talk about the transactions involved here later.). In inbound of JMS we have 2 types.
1. Sun’s own MQ: In this case, when the MDB end point gets activated, it is set with the cluster aware setting (ClusterName+InstanceName+MDBName). This actually helps as a hook to the instance on which the MDB is listening for the JMS. This helps the MDB maintains the stickiness by way of maintaining Local Delivery Profile (LDP).
2. Third party MQs supported via generic JMSJCA: This is achieved by way of maintaining a timestamp based selector with respect to each server instance.
How are the transactions handled?
Every server instance is having its own Transaction Manager. This TM actually maintains the list of transactions that are handled. It also maintains the log that is pertaining to the transaction. If an application instance is killed during the transaction the TM pertaining to that server instance will take care of that transaction. Since transactions are atomic in nature the resource recovery (commit/rollback) would not be effected. There would not be a case where in a transaction is started by one instance and had to be dealt by other instance.
What modules are available for supporting Clustering in AppServer?
1. Group Management Server (GSM): GSM is an independent software module, from Project Shoal (https://shoal.dev.java.net), which may be embedded and started by processes that require runtime cluster communications and group management services such as:
1. Static Group Membership Composition change notifications :
a. Member Added Notification
b. Member Removed Notification
2. Dynamic Group Membership Composition change notifications :
a. Join Notifications
b. Failure Suspicion Notifications
c. Failure Notifications
d. Planned Shutdown Notifications
3. Recovery Oriented Support Services:
a. Delegate Recovery Instance Selection and notification
b. Protecting recovery operations through failure fencing
4. Messaging Service API for Group and Member-to-Member messaging
5. Distributed Caching of lightweight state and recovery states
GMS is an in-process component that can be accessed by other components within the process to receive events occurring in a group of distributed processes. GMS will provides the following features:
a. Failure Notifications
b. Recovery member selection and corresponding notifications
c. Failure Fencing
d. Member Joins and Planned Shutdown Notifications
e. Support for administrative configurations
f. Group, One-to-Many and One-To-One Messaging
g. A Distributed State Cache implementation to store data in a shared cache that lives in each instance's GMS module.
Examples of GMS clients in the application server include the Timer Service, the Transaction Service, the EJB Container for Read-Only or Read-Mostly beans' cache update notifications, the IIOP Failover Loadbalancer, the In-Memory replication module and the instance that serves as the administration server for reporting cluster health.
GMS provides a simple, easy-to-use API to its clients for accessing and consuming its functionalities. GMS provides a Group Communication Service Provider Interface for group communications provider technologies to be integrated. In our implementation, we use a Service Provider implementation based on JXTA peer-to-peer technology to construct the desired group communications infrastructure.
2. Load balancer Module:
Load balancer component of the application server is a webserver plug-in, which distributes the http requests to the application server instances. Currently it only supports simple round robin load balancing policy.
AppServer study Conclusion:
The above discussion says that AppServer has the Clustering/Load balancing feature for HTTP Protocol and JMS / IIOP in its own way. It also says that the need for clustering and load balancing differ from protocol to protocol and each individual protocol needs to take care of the need from its own perspective.