Load balancing in Oracle Service Bus v3.0 (Part 2)
By Chris Tomkins on Jun 06, 2008
Note: This post was first published before Oracle merged with BEA when the Oracle Service Bus product was known as AquaLogic Service Bus, hence the occasional reference to BEA and AquaLogic Service Bus.
In the last post I demonstrated how to configure a business service in Oracle Service Bus to support load balancing across multiple service endpoints. What we didn't consider was what would happen if one of these endpoints was unavailable, perhaps due to a network or hardware failure, or because it is down for maintenance. If we leave things as they are, any service request directed to an endpoint that is currently offline will fail. This is almost certainly not what we want: we would prefer the request to be redirected to an online endpoint (if one is available) rather than the offline one. Luckily, Oracle Service Bus comes to the rescue with its new endpoint failover feature. This lets us automatically mark an endpoint as offline, either temporarily or permanently, and so ensure that service requests are only sent to endpoints that are online.
To apply this logic to our business service we need to make a couple of changes to our configuration.
The first step is to enable a retry on our business service so that if the endpoint we attempt to invoke is offline we try to invoke an alternative endpoint. To configure this we need to do the following:
- Open up the business service definition for LoadBalancedService
- Switch to the Transport tab
- Set the Retry Count field to 1
This is an improvement over our original configuration. Now, if a service request gets directed to an offline endpoint, service bus will recognise this as an error and try to send the service request to an alternate endpoint. However, if this alternate endpoint is also offline, the service request will fail at this stage, even though another endpoint may still be online. To handle this case as well, we need to enable the offline endpoint URI capability in Oracle Service Bus.
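To make the limitation of a retry count of 1 concrete, here is a toy Python model of the behaviour described above. It is only a sketch, not Oracle Service Bus code: the `invoke` function, the `EndpointDown` exception and the third endpoint `EchoService3` are all hypothetical names introduced for illustration, and the model assumes a retry simply moves on to the next endpoint in round-robin order.

```python
class EndpointDown(Exception):
    """Raised by this toy model when every attempted endpoint is offline."""


def invoke(endpoints, online, start, retry_count):
    """Try endpoints in round-robin order, beginning at index `start`.

    Makes 1 + retry_count attempts in total. Returns the name of the
    endpoint that accepted the request, or raises EndpointDown.
    """
    attempts = 1 + retry_count
    for i in range(attempts):
        endpoint = endpoints[(start + i) % len(endpoints)]
        if endpoint in online:
            return endpoint
    raise EndpointDown(f"all {attempts} attempts failed")


# EchoService3 is a hypothetical third endpoint, added only for this example.
endpoints = ["EchoService", "EchoService2", "EchoService3"]

# One endpoint offline: the single retry reaches EchoService2.
one_down = invoke(endpoints, {"EchoService2", "EchoService3"}, 0, 1)

# Two endpoints offline: the single retry also fails, although
# EchoService3 is online and could have handled the request.
try:
    invoke(endpoints, {"EchoService3"}, 0, 1)
    two_down_failed = False
except EndpointDown:
    two_down_failed = True
```

In this model, raising the retry count to 2 would let the request reach `EchoService3`, but every request that starts at an offline endpoint would still pay for the failed attempts first.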
To do this:
- Right click on the server you have defined in the Servers view of Workspace Studio and choose Launch ALSB Administration Console
- Click on the Resource Browser tab in the left hand navigation
- Select the Business Services link
- Click on your LoadBalancedService business service
- Choose the Operational Settings tab
- Click Create in Change Center to start a new session so you can make a change
- Select the checkbox to enable Offline Endpoint URIs and set a retry interval of 10 seconds
- Click Update
- Click Activate in Change Center, enter a description of the change you have made and click Submit to apply your changes
That's all there is to it.
What we have done is configure our business service so that if a service request is sent to an endpoint which happens to be offline, service bus marks this endpoint offline and attempts to send the service request to an alternate endpoint (in accordance with the retry count). Subsequent service requests will not be sent to the offline endpoint until the offline endpoint URI retry interval has elapsed, at which point service bus will try sending service requests to the endpoint again. If the endpoint is still offline, the cycle repeats; if it is online, the service request is sent to it.
Note: If you wish to mark an endpoint permanently offline, set the offline endpoint URI retry interval to 0 hours, 0 mins and 0 seconds. To mark this endpoint online again, you will need to do so manually through the service bus console.
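The offline marking and retry interval described above can be sketched as a small Python model. This is an illustration under stated assumptions, not how Oracle Service Bus is implemented: the `BusinessService` class, its `invoke` and `mark_online` methods, and the `is_up` and `now` parameters are hypothetical, and the model assumes that skipping an endpoint already marked offline does not use up a retry.

```python
class BusinessService:
    """Toy model of round-robin load balancing with offline endpoint URIs."""

    def __init__(self, endpoints, retry_count=1, retry_interval=10.0):
        self.endpoints = list(endpoints)
        self.retry_count = retry_count
        self.retry_interval = retry_interval  # seconds; 0 means offline until reset by hand
        self.offline_since = {}               # endpoint -> time it was marked offline
        self.next_index = 0

    def _usable(self, endpoint, now):
        marked = self.offline_since.get(endpoint)
        if marked is None:
            return True                       # never marked offline
        if self.retry_interval == 0:
            return False                      # permanently offline
        return now - marked >= self.retry_interval

    def mark_online(self, endpoint):
        """Manual reset, standing in for the console action."""
        self.offline_since.pop(endpoint, None)

    def invoke(self, is_up, now):
        """Send one request at time `now`; `is_up(endpoint)` simulates the transport."""
        attempts_left = 1 + self.retry_count
        for _ in range(len(self.endpoints)):
            if attempts_left == 0:
                break
            endpoint = self.endpoints[self.next_index]
            self.next_index = (self.next_index + 1) % len(self.endpoints)
            if not self._usable(endpoint, now):
                continue                      # skipped, no attempt consumed
            attempts_left -= 1
            if is_up(endpoint):
                self.mark_online(endpoint)
                return endpoint
            self.offline_since[endpoint] = now  # (re)start the retry interval
        raise RuntimeError("no online endpoint accepted the request")


svc = BusinessService(["EchoService", "EchoService2"], retry_count=1, retry_interval=10)
down = {"EchoService"}
is_up = lambda endpoint: endpoint not in down

first = svc.invoke(is_up, now=0)    # EchoService fails and is marked offline
second = svc.invoke(is_up, now=5)   # EchoService is skipped, interval not yet elapsed
down.clear()                        # the endpoint comes back
third = svc.invoke(is_up, now=12)   # interval elapsed, EchoService is tried again
```

With `retry_interval=0`, the same model never tries a marked endpoint again until `mark_online` is called, which mirrors the manual reset mentioned in the note above.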
Note: If you are using the offline endpoint URI setting, you may well want to consider configuring an alert rule to notify you that an endpoint has gone offline so that you can address this. You can do this by following the instructions here.
To test that this works, we need to perform the following steps:
- Using the Test Console, invoke the service twice. This should exercise both endpoints. To check this, go to the Operations section of the administration console, click on Service Health, then click on LoadBalancedService (if you do not see this in the list, you need to go back and enable monitoring on your business service) and finally on Endpoint URIs. Provided you have waited for the monitoring aggregation interval, you should see that both endpoints have received a request.
- Now make the first endpoint unavailable. How you do this depends on the service you are calling and where it is hosted; for my example, I will just undeploy the application to simulate the endpoint being unavailable.
- Now, using the Test Console, invoke the service again. As we are using the round-robin load balancing algorithm, this request should be routed to the first endpoint, but since that endpoint is offline we should see the service request redirected to the other endpoint. To prove this is the case, take a look at the Endpoint URIs monitoring screen again.
- From this we can clearly see that service bus attempted to send the request to the EchoService endpoint and it failed (hence an Error Count of 1) because the endpoint was offline. We can also see that EchoService2 received this request as it is still online.
- Now re-enable the EchoService endpoint, wait for the 10 second offline endpoint URI retry interval to elapse, and invoke the business service again using the Test Console. If we look at the Endpoint URIs monitoring screen again, we should see that this service request has been directed to the EchoService endpoint, which is now online again.
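The walkthrough above can be replayed in a small Python simulation to see which per-endpoint counts to expect. This is only a sketch: the `MonitoredService` class and its `delivered` and `errors` counters are hypothetical stand-ins for the figures on the Endpoint URIs screen, the counters are assumed to start at zero, and the model assumes the retry interval has already elapsed before the final call.

```python
from collections import Counter


class MonitoredService:
    """Toy round-robin service that counts deliveries and errors per endpoint."""

    def __init__(self, endpoints, retry_count=1):
        self.endpoints = list(endpoints)
        self.retry_count = retry_count
        self.cursor = 0
        self.delivered = Counter()  # requests an endpoint handled successfully
        self.errors = Counter()     # failed attempts against an endpoint

    def invoke(self, online):
        """Send one request; `online` is the set of reachable endpoints."""
        for _ in range(1 + self.retry_count):
            endpoint = self.endpoints[self.cursor % len(self.endpoints)]
            self.cursor += 1
            if endpoint in online:
                self.delivered[endpoint] += 1
                return endpoint
            self.errors[endpoint] += 1
        raise RuntimeError("request failed on every attempted endpoint")


svc = MonitoredService(["EchoService", "EchoService2"])
both = {"EchoService", "EchoService2"}

svc.invoke(both)                # step 1: first request goes to EchoService
svc.invoke(both)                # step 1: second request goes to EchoService2
svc.invoke({"EchoService2"})    # step 3: EchoService is down, EchoService2 takes over
svc.invoke(both)                # step 5: EchoService is back and receives the request

print(dict(svc.delivered), dict(svc.errors))
```

How the real console attributes the failed attempt to its message counts is not modelled here; the sketch only tracks successful deliveries and errors separately.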
We now have a business service which load balances across a set of endpoints and handles the majority of communication-related issues we may have with these endpoints. The great thing about this is that this business service can now be used across the rest of the service bus, in any proxy service, in exactly the same way as any other business service.