Tuxedo Load Balancing
By Todd Little on Jun 07, 2012
A question I often receive is how Tuxedo performs load balancing. It usually comes from customers who see an imbalance in the number of requests handled by the servers offering a specific service.
First of all, let me say that Tuxedo really does load or request optimization rather than load balancing. What I mean is that Tuxedo doesn't attempt to ensure that all servers offering a specific service get the same number of requests; instead it attempts to ensure that requests are processed in the least amount of time. Simple round robin "load balancing" can be employed to ensure that all servers for a particular service are given the same number of requests. But the question I ask is, "to what benefit?"
Instead, Tuxedo scans the queues to determine on which queue a request should be placed. A queue may or may not correspond to a single server, depending on whether SSSQ (Single Server Single Queue) or MSSQ (Multiple Server Single Queue) is in use. The scan is always performed in the same order, and if an empty queue is found during the scan, the request is immediately placed on that queue and request routing is done. However, should all the queues be busy, meaning that requests are currently being processed, Tuxedo chooses the queue with the least amount of "work" queued to it, where work is the sum of all the requests queued, weighted by their "load" value as defined in the UBBCONFIG file.
What this means is that under light loads, only the first few queues (servers) process all the requests, as an empty queue is often found before reaching the end of the scan. Thus the first few servers in the scan order handle most of the requests. While this sounds non-optimal, it in fact capitalizes on the behavior of the underlying operating system and hardware to produce the best possible performance. Round robin scheduling would spread the requests across all the available servers, and thus require all of them to be in memory and likely not share much in the way of hardware or memory caches. Tuxedo's approach keeps a small set of servers active, which makes the most of the various caches and thus optimizes overall performance.
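
To make the scan concrete, here is a minimal sketch in Python of the selection rule described above. It is an illustration, not Tuxedo's actual implementation: the names `Queue`, `choose_queue`, and `route` are hypothetical, the load value of 50 is just an illustrative number, and a request that is currently being processed is modeled as still sitting on its queue.

```python
from dataclasses import dataclass, field


@dataclass
class Queue:
    """Hypothetical model of a request queue (one server or an MSSQ set)."""
    name: str
    pending_loads: list = field(default_factory=list)  # load of each queued request

    def work(self):
        # "Work" is the sum of the load values of everything on the queue.
        return sum(self.pending_loads)


def choose_queue(queues):
    """Scan in a fixed order: the first empty queue wins, else the least work."""
    best = None
    for q in queues:
        if not q.pending_loads:
            return q  # empty queue found, stop scanning
        if best is None or q.work() < best.work():
            best = q  # strict '<' keeps the earliest queue on ties
    return best


def route(queues, load=50):
    """Place a request with the given (illustrative) load on the chosen queue."""
    q = choose_queue(queues)
    q.pending_loads.append(load)
    return q


queues = [Queue("q1"), Queue("q2"), Queue("q3")]
print(route(queues).name)       # q1: first empty queue in the scan
print(route(queues).name)       # q2: q1 is busy, q2 is the next empty queue
queues[0].pending_loads.pop()   # q1 finishes its request
print(route(queues).name)       # q1 again: the scan always starts from the front
```

Note that q3 never receives a request in this run, which is exactly the imbalance customers observe under light load.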
Hopefully this makes sense and explains why you may see a few servers handling most of the requests. Under heavy load, meaning enough load to keep all the servers that can handle a request busy, you should see a relatively equal number of requests processed by each. In the next post I'll try to cover how this applies to servers in a clustered (MP) environment, because the load balancing there is a little more complicated.
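
The difference between light and heavy load can be seen in a toy simulation. This is a rough sketch under simplifying assumptions, not a model of real Tuxedo timing: time advances in ticks, every request has the same load and takes `service_ticks` ticks to process, each queue has exactly one server, and the function name `simulate` and its parameters are made up for the illustration.

```python
def simulate(num_servers, arrivals_per_tick, service_ticks, ticks):
    """Return how many requests each server was given over the run."""
    # Each queue holds the remaining ticks of its requests; index 0 is in service.
    queues = [[] for _ in range(num_servers)]
    handled = [0] * num_servers
    for _ in range(ticks):
        for _ in range(arrivals_per_tick):
            # Fixed-order scan: take the first empty queue if there is one...
            target = next((i for i, q in enumerate(queues) if not q), None)
            if target is None:
                # ...otherwise the queue with the least work (equal loads, so
                # the shortest queue); min() keeps the earliest queue on ties.
                target = min(range(num_servers), key=lambda i: len(queues[i]))
            queues[target].append(service_ticks)
            handled[target] += 1
        for q in queues:
            if q:
                q[0] -= 1          # the server works on the head request
                if q[0] == 0:
                    q.pop(0)       # request finished
    return handled


# Light load: 1 request per tick, 4 servers, 2 ticks per request.
print(simulate(4, 1, 2, 100))   # [50, 50, 0, 0]

# Heavy load: 2 requests per tick keeps all 4 servers busy.
print(simulate(4, 2, 2, 100))   # [50, 50, 50, 50]
```

In the light case the first two servers absorb everything and the last two are never touched; once the arrival rate is enough to keep every server busy, the counts even out.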
Oracle Tuxedo Chief Architect