Technical overview GlassFish 3.0 on Amazon cloud
By Orgad Kimchi on Mar 04, 2009
The integrated GigaSpaces GlassFish solution with its components is captured in the following diagram :
The SLA Driven deployment environment is responsible for hosting all services in the network. It basically does match making between the application requirements and the availability of the resources over the network. It is comprised of the following components:
- Grid Service Manager - GSM – responsible for managing the application lifecycle and deployment
- Grid Service Container – GSC – a light weight container which is essentially a wrapper on top of the Java process that exposes the JVM to the GSM and provides a means to deploy and undeploy services dynamically.
- Processing-Unit (PU )– Represents the application deployment unit. A Processing Unit is essentially an extension of the spring application context that packages specific application components in a single package and uses dependency injection to mesh together these components. The Processing Unit is an atomic deployment artifact and its composition is determined by the scaling and failover granularity of a given application. It, therefore, is the unit-of-scale and failover. There are number of pre-defined Processing Unit types :
- Web Processing Unit – Web Processing Unit is responsible for managing Web Container instances and enables them to run within SLA driven container environment. With a Web Processing Unit, one can deploy the Web Container as group of services and apply SLAs or QoS semantics such as one-per-vm, one-per-machine, etc. In other words, one can easily use the Processing Unit SLA to determine how web containers would be provisioned on the network. In our specific case most of the GlassFish v3 Prelude integration takes place at this level.
- Data Grid Processing Unit – Data Grid is a processing unit that wraps the GigaSpaces space instances. By wrapping the space instance it adds SLA capabilities avliable with each processing unit. One of the common SLA is to ensure that primary instances will not be running on the same machine as the backup instances etc. It also determines deployment topology (partitioned, replicated), as well as scaling policy, etc. The data grid includes another instance, not shown in the above diagram, called the Mirror Service. The Mirror Service is responsible for making sure that all updates made on the Data Grid will be passed reliably to the underlying database.
- Load Balancer Agent – The Load Balancer Agent is responsible for listening to web-containers availability and add those instances to the Load Balancer list when a new container is added, or remove it when it has been removed. The Load Balancer Agent is currently configured to work with the Apache Load Balancer but can be easily set up to work with any external Load Balancer.
How it works:
The following section provides a high-level description of how all the above components work together to provide high performance and scaling.
- Deployment - The deployment of the application is done through the GigaSpaces processing-unit deployment command. Assigning specific SLA as part of the deployment lets the GSM know how we wish to distribute the web instances over the network. For example, one could specify in the SLA that there would be only one instance per machine and define the minimum number of instances that need to be running. If needed, one can add specific system requirements such as JVM version, OS-Type, etc. to the SLA . The deployment command points to to a specific web application archive (WAR). The WAR file needs to include a configuration attribute in its META-INF configuration that will instruct the deployer tool to use GlassFish v3 Prelude as the web container for this specific web application. Upon deployment the GlassFish-processing-unit will be started on the available GSC containers that matches the SLA definitions. The GSC will assign specific port to that container instance. . When GlassFish starts it will load the war file automatically and start serving http requests to that instance of the web application.
- Connecting to the Load Balancer - Auto scaling -The load balancer agent is assigned with each instance of the GSM. It listens for the availability of new web containers and ensures that the available containers will join the load balancer by continuously updating the load-balancer configuration whenever such change happens. This happens automatically through the GigaSpaces discovery protocol and does not require any human intervention.
- Handling failure - Self healing - If one of the web containers fails, the GSM will automatically detect that and start and new web container on one of the available GSC containers if one exists. If there is not enough resources available, the GSM will wait till such a resource will become available. In cloud environments, the GSM will initiate a new machine instance in such an event by calling the proper service on the cloud infrastructure.
- Session replication - HttpSession can be automatically backed up by the GigaSpaces In Memroy Data Grid (IMDG) . In this case user applications do not need to change their code. When user data is stored in the HttpSession, that data gets stored into the underlying IMDG. When the http request is completed that data is flushed into the shared data-grid servers.
- Scaling the database tier - Beyond session state caching, the web application can get a a reference to the GigaSpaces IMDG and use it to store data in-memory in order to reduce contention on the database. GigaSpaces data grid automatically synchronizes updates with the database. To enable maximum performance, the update to the database is done in most cases asynchronously (Write-Behind). A built-in hibernate plug-in handles the mapping between the in-memory data and the relational data model. You can read more on how this model handles failure as well as consistency, aggregated queries here.