CLB Enhancements in SailFin 2.0
By kshitiz on Oct 28, 2009
A handful of new features have been added to the Converged Load Balancer (CLB) in SailFin 2.0. These features are as follows:
- DCR plugin in Java
- CLB monitoring
- Increase load on recovered instance in periodic manner
- Reuse TCP connection between client and CLB front-end
DCR plugin in Java
In SailFin 1.5, users had to provide the DCR plugin in a proprietary XML format. Developers found it difficult to write DCR because the proprietary format was hard to understand. The format is also very restrictive and difficult to extend to support new rules.
Hence in SailFin 2.0, we have introduced support for DCR plugins written in Java. Please refer to my blog to get complete details on how to develop a DCR plugin in Java.
CLB monitoring
In SailFin 2.0, users can gather CLB monitoring data. CLB exposes both front-end and back-end statistics. The monitoring level is dynamically (re)configurable.
How to enable CLB monitoring
Users can enable CLB monitoring by setting the converged-load-balancer property under module-monitoring-levels to HIGH or LOW. It can be set using the admin CLI or GUI.
Front-end statistics cover the SIP and HTTP requests and responses handled by CLB acting as front-end. The exposed statistics are as follows:
- Total number of HTTP requests received
- Total number of HTTP requests received to be serviced locally
- Total number of HTTP requests proxied to remote back-end to be serviced
- Total number of HTTP requests which resulted in error responses
- Total number of failed over HTTP requests
- Total number of SIP requests received
- Total number of SIP requests received to be serviced locally
- Total number of SIP requests proxied to remote back-end to be serviced
- Total number of failed over SIP requests
- Total number of SIP requests which resulted in error responses
- Total number of outgoing SIP requests
- Total number of SIP responses received
- Total number of SIP responses received to be consumed locally
- Total number of SIP responses proxied to remote back-end to be consumed
- Total number of SIP responses discarded as they cannot be routed to its
Back-end statistics cover the SIP and HTTP requests and responses handled by CLB acting as back-end. The exposed statistics are as follows:
- Total number of incoming SIP requests
- Total number of incoming HTTP requests
- Total number of incoming SIP responses
- Total number of outgoing SIP responses
- Total number of outgoing SIP requests
Increase load on recovered instance in periodic manner
In SailFin 2.0, new properties have been added to increase the load on a recovered instance in a periodic manner. This feature was introduced mainly for SIP Session Replication (SSR).
These new properties are load-increase-factor and load-factor-increase-period-in-seconds. Both properties can be defined under converged-lb-config and are dynamically (re)configurable. Both are used by the consistent-hash algorithm in CLB. When an instance recovers, the consistent-hash algorithm does not immediately include that instance for load distribution. Instead, it raises the load factor of the recovered instance from 0 to 1 in a periodic manner: the load factor increases by load-increase-factor every load-factor-increase-period-in-seconds seconds.
The default value of load-increase-factor is 1 and the default value of load-factor-increase-period-in-seconds is 0. Users can set load-increase-factor to any decimal value between 0 and 1, and load-factor-increase-period-in-seconds to a positive integer.
These properties can be set using the following asadmin commands:
- asadmin set domain.converged-lb-configs.<converged-lb-config-name>.property.load-factor-increase-period-in-seconds=120
- asadmin set domain.converged-lb-configs.<converged-lb-config-name>.property.load-increase-factor=.05
All requests that are distributed using the consistent-hash algorithm follow the load distribution described above. The consistent-hash algorithm is used for:
1. SIP requests
2. HTTP requests for converged applications, only if a DCR XML is provided.
This does not affect the round-robin algorithm at all. Thus HTTP requests belonging to pure web application(s) see no change in behavior whatsoever.
This does not affect load distribution during cluster startup. It comes into effect only when an individual instance starts up.
If load-factor-increase-period-in-seconds is set to 0, load-increase-factor has no effect. However, if load-factor-increase-period-in-seconds is set and load-increase-factor is not, the recovered instance gets no load for the period specified by load-factor-increase-period-in-seconds. Once that period lapses, the load factor for the recovered instance immediately becomes 1.
Let me explain how it works with an example. Let's start with some assumptions:
=> There are two instances, instance1 and instance2, in the cluster
=> Total Calls Per Second (CPS) on the system is 200, so it is 100 CPS per instance
=> load-factor-increase-period-in-seconds is set to 60
=> load-increase-factor is set to .1
Now suppose instance1 goes down and then recovers. The load on instance1 will increase in this manner (time ranges are in seconds since recovery):
0 - 60 : 0 CPS
60 - 120 : 10 CPS
120 - 180 : 20 CPS
180 - 240 : 30 CPS
240 - 300 : 40 CPS
300 - 360 : 50 CPS
360 - 420 : 60 CPS
420 - 480 : 70 CPS
480 - 540 : 80 CPS
540 - 600 : 90 CPS
600 - onward : 100 CPS
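The schedule above can be reproduced with a few lines of Python. This is only an illustrative model of the behavior described in this post, not SailFin code: the function name load_factor and its parameters are made up for the example, and it assumes the load factor steps up once at the end of each full period.

```python
from decimal import Decimal


def load_factor(elapsed_seconds, load_increase_factor="1", period_seconds=0):
    """Load factor (0 to 1) of a recovered instance after elapsed_seconds.

    Hypothetical helper: an illustrative model, not the CLB implementation.
    """
    if period_seconds <= 0:
        # Default period of 0: no ramp-up, the instance takes full load at once.
        return Decimal(1)
    # Decimal avoids float rounding noise when adding up steps such as 0.1.
    step = Decimal(str(load_increase_factor))
    completed_periods = int(elapsed_seconds // period_seconds)
    # The factor grows by one step per completed period and is capped at 1.
    return min(Decimal(1), completed_periods * step)


if __name__ == "__main__":
    normal_cps = 100  # per-instance share from the example above
    for t in (0, 60, 120, 300, 600, 900):
        print(t, load_factor(t, "0.1", 60) * normal_cps)
```

With the defaults (factor 1, period 0) the function returns 1 right away. With only the period set, it returns 0 until the period lapses and 1 afterwards, which matches the rules described earlier.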
Reuse TCP connection between client and CLB front-end
If a client is behind a firewall, SailFin cannot open new TCP connections to the client. In such a scenario, SailFin must reuse the existing TCP connection between the client and the SailFin instance.
The new property that controls this behavior is reuseClientFEConnection. Its default value is false. Users need to set this property to true to enable this feature. It can be set using the following asadmin command:
- asadmin set
To achieve this, a client to CLB Front-End (FE) mapping is maintained within CLB. However, the mapping is stored only when the following conditions are met:
- The request comes over TCP
- The Contact is a SIP URI
- The response has a request associated with it, which implies the SailFin instance is acting as either UAS or B2BUA
A new client to CLB FE mapping is created in the following cases:
- For all REGISTER requests which result in a 2xx response, a mapping from the contact to the CLB FE address is created
- For all dialog-creating requests which result in a 2xx response, the CLB FE address is added as an attribute in the session
Mappings created by REGISTER requests are cleaned up either by timeout or by a REGISTER request with expires set to 0.
For all outgoing requests, we check whether there is a mapping for the contact address in the map created by REGISTER requests, or an attribute set in the session. If either is found, we push a Route header for the CLB FE and use that to send the request out to the client.
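To make the REGISTER-based part of this bookkeeping concrete, here is a small Python sketch of a contact to CLB FE map with expiry. Everything in it is hypothetical (the class ClientFeMap, its method names, and the addresses in the usage example), and it is not how SailFin stores the mapping internally. It models only the TCP and SIP URI conditions and the REGISTER cleanup rules, and leaves out the session attribute used for dialog-creating requests.

```python
import time


class ClientFeMap:
    """Hypothetical sketch of a contact -> CLB front-end mapping with expiry."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # contact URI -> (front-end address, expiry time)

    def on_register_2xx(self, contact, fe_address, expires_seconds, transport="TCP"):
        """Record the mapping for a REGISTER that got a 2xx response."""
        # Conditions from the text: request came over TCP, contact is a SIP URI.
        # Checking the "sip:" prefix is a simplification for this sketch.
        if transport.upper() != "TCP" or not contact.lower().startswith("sip:"):
            return False
        if expires_seconds == 0:
            # A REGISTER with expires set to 0 removes the mapping.
            self._entries.pop(contact, None)
            return False
        self._entries[contact] = (fe_address, self._clock() + expires_seconds)
        return True

    def route_for(self, contact):
        """Return the FE address to push as a Route header, or None."""
        entry = self._entries.get(contact)
        if entry is None:
            return None
        fe_address, expiry = entry
        if self._clock() >= expiry:
            del self._entries[contact]  # mapping timed out
            return None
        return fe_address


if __name__ == "__main__":
    fe_map = ClientFeMap()
    fe_map.on_register_2xx("sip:alice@192.0.2.10:5060", "fe1.example.com:5060", 3600)
    print(fe_map.route_for("sip:alice@192.0.2.10:5060"))
```

An outgoing request would call route_for with the contact address. A non-None result corresponds to the case where a Route header for the CLB FE is pushed.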