High Availability Part 6

As I’ve already mentioned, redundancy is critical to providing high availability because things do fail, even software.  In my previous post I described how to provide redundancy for some of the Tuxedo system servers.  In this post I’ll cover how to provide redundancy for the remaining system servers as well as application-provided servers.

Tuxedo provides many integration options to connect Tuxedo applications to other systems, including other Tuxedo applications.  The connectors or gateways that provide these integration options are essentially all network based.  These servers include the Tuxedo domain gateway, the various listeners such as the workstation listener or WSL, and add-on products such as the SALT Web service gateway or GWWS.  All of these servers listen on network ports, which cannot be shared across servers, and multiple copies of each can run on the same Tuxedo machine.  But since they can’t share network ports, each copy has to be configured individually with its own address, instead of just running multiple identical copies.

For the listeners such as the WSL, JSL, and ISL, this means that multiple copies should be running across the machines in a cluster.  Although a load balancer can be used to distribute the load across the multiple copies that are each listening on their own host and port address, the clients for these listeners have that capability built in.  So by properly configuring the address string that clients use to connect to the Tuxedo application, load balancing and failover can be handled by Tuxedo without the need for an external load balancer.
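To make the address string idea concrete, here is a minimal Python sketch of how a client might walk such a list.  It is not Tuxedo code.  It assumes the address format I understand workstation clients to use in WSNADDR: comma-separated entries are tried in order for failover, and a parenthesized group separated by `|` is tried in random order to spread the load.  Check the documentation for your release before relying on that.  The host names, port numbers, and function names are made up for illustration.

```python
import random


def connection_order(address, rng=random):
    """Return the order in which listener addresses would be tried.

    Assumed format: entries separated by ',' are tried in order (failover);
    a group written as (a|b|c) is tried in random order (load balancing).
    """
    order = []
    for entry in address.split(","):
        entry = entry.strip()
        if entry.startswith("(") and entry.endswith(")"):
            group = entry[1:-1].split("|")
            rng.shuffle(group)  # random pick within the group
            order.extend(group)
        elif entry:
            order.append(entry)
    return order


def connect(address, is_listening, rng=random):
    """Return the first address that accepts a connection, or None."""
    for candidate in connection_order(address, rng):
        if is_listening(candidate):
            return candidate
    return None


# Hypothetical hosts: two listeners share the load, a third is the fallback.
ADDRESS = "(//hostA:3050|//hostB:3050),//hostC:3050"

if __name__ == "__main__":
    up = {"//hostB:3050", "//hostC:3050"}  # pretend hostA is down
    print(connect(ADDRESS, up.__contains__))  # prints //hostB:3050
```

With hostA down, the client lands on hostB whichever way the group is shuffled, and it only falls back to hostC when both listeners in the first group are unavailable.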

For the SALT Web service gateway, multiple GWWS instances can be configured on a machine or in a cluster.  Each Web service should be configured with multiple endpoints for inbound services to ensure the service is available even if a particular gateway is down.  This is controlled by the WSDF for the service and the SALTDEPLOY file.  In addition, more than one gateway should import any external Web services that the application needs to call.

The Tuxedo domain gateway GWTDOMAIN and its related administrative servers GWADM and DMADM also need to be considered.  The GWADM and DMADM servers are not involved in the normal message flow of the domain gateway, so configuring them for redundancy doesn’t really help availability, although they can be configured to have redundant copies.  The GWTDOMAIN system server, on the other hand, is required to process messages, so to ensure connectivity it is valuable to have multiple copies of the domain gateway running at the same time.  The best-case scenario is multiple copies of GWTDOMAIN running on separate Tuxedo machines in a clustered configuration.  This requires defining at least two domain gateway groups in the UBBCONFIG file and two local access points, each in its own domain gateway group, in the DMCONFIG file.  Each gateway can be connected to multiple remote domains.  To further improve availability, you can specify for each imported service a list of failover/failback remote domains that it can be imported from.  You can also import the same service more than once to load balance requests across the multiple remote domains.

Finally, this leaves us with application-provided servers.  As should be obvious, to achieve high availability multiple copies of these should be run and distributed across multiple machines in a clustered configuration.  Within a server group, Tuxedo makes this trivial: just specify the MIN and MAX parameters on the server’s definition in the *SERVERS section of the UBBCONFIG.  Adding the same server to multiple server groups spread across multiple machines ensures that the failure of any given server or machine won’t prevent the application from functioning.  Running multiple copies of servers can also be used to scale out an application.

By following the above recommendations, your application should be free of any single points of failure.  One way to verify this is to follow the flow of a request through the entire system, from the client through each of the servers the request must pass through, and analyze what would happen if all the servers along a single path were unavailable.  Can the next request still be processed?  If not, you have a single point of failure.
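The check described above can be sketched in a few lines of Python.  This is only an illustration under assumptions of my own: the request flow is modeled as a series of stages, each stage lists its redundant server instances and the machine each one runs on, and the deployment shown (stage names, instance names, machine names) is hypothetical.  The sketch fails every server along one path at a time, then fails each machine in turn, and reports the cases where a request could no longer get through.

```python
from itertools import product

# Hypothetical deployment: stage -> list of (server instance, machine).
FLOW = {
    "listener": [("WSL-1", "machine1"), ("WSL-2", "machine2")],
    "gateway": [("GWTDOMAIN-1", "machine1"), ("GWTDOMAIN-2", "machine2")],
    "app": [("appsvr-1", "machine1"), ("appsvr-2", "machine1")],
}


def survives(flow, down_servers=(), down_machines=()):
    """True if every stage still has at least one usable instance."""
    return all(
        any(name not in down_servers and machine not in down_machines
            for name, machine in instances)
        for instances in flow.values()
    )


def single_points_of_failure(flow):
    """List the path and machine failures that would block the next request."""
    findings = []
    # Take down all the servers along one path, one path at a time.
    for path in product(*flow.values()):
        names = {name for name, _ in path}
        if not survives(flow, down_servers=names):
            findings.append(("path", tuple(sorted(names))))
    # Take down each machine in turn.
    machines = {m for instances in flow.values() for _, m in instances}
    for machine in sorted(machines):
        if not survives(flow, down_machines={machine}):
            findings.append(("machine", machine))
    return findings


if __name__ == "__main__":
    print(single_points_of_failure(FLOW))  # prints [('machine', 'machine1')]
```

In this made-up deployment every stage has two copies, so losing the servers along any one path is survivable, but both application server copies sit on machine1, so that machine is a single point of failure.  Moving one copy to machine2 makes the list come back empty.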


Hi Todd,

thanks for insightful posts regarding HA!

Could you please elaborate a little regarding running multiple copies of DMADM?

Best regards,

Posted by Per Lindström on March 25, 2014 at 04:57 AM CDT #

Hi Per,

My mistake. Only a single copy of DMADM can be configured in a domain. However as mentioned, it is only needed to make configuration changes, so having it fail and waiting for it to restart normally shouldn't pose a problem. Also note that you can configure it to failover to another machine in the case of a machine failure.


Posted by Todd Little on July 15, 2014 at 11:55 AM CDT #


This is the Tuxedo product team blog.

