Validating Multicast Transport: Where Did My Instances Go?
By user12611203 on Mar 04, 2011
For cluster health information and communication of high-availability state, GlassFish 3.1 depends on GMS. To dynamically discover the members of a cluster, GMS depends on UDP multicast. So if there's something about your network preventing multicast communication among hosts, instances will become isolated from each other.
As I wrote in my last blog on the 'asadmin get-health' command, it's a good idea after starting your cluster to make sure everything is working correctly. With the 'asadmin validate-multicast' command, you can diagnose issues with inter-instance communication or plan your cluster deployment before creating the cluster instances.
The validate-multicast asadmin subcommand is used to send and receive UDP multicast information, and
so acts to validate that various hosts can all communicate with each other. The usage is simple in concept: run the tool on each host at the same time, using the same multicast address and port, and verify from the tool's output that each host receives messages from the others. So if you're running on hosts 1, 2, and 3, then you should see this when running on host 1:
    Listening for data...
    Sending message with content "host1" every 2,000 milliseconds
    Received data from host1 (loopback)
    Received data from host2
    Received data from host3
Likewise, hosts 2 and 3 should see messages from all 3 machines. Make sure the DAS and instances are not running while you run the tool, or else there will be interference with the UDP traffic.
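With more than a few hosts, comparing each transcript by eye gets tedious. Here is a rough sketch of one way to check a captured transcript for missing hosts. It assumes the output lines look like the sample above ("Received data from <host>"); the function name missing_hosts and the host names are made up for illustration and are not part of asadmin.

```python
import re

# Assumes lines of the form "Received data from <host>", optionally followed
# by "(loopback)", as in the sample output above.
RECEIVED = re.compile(r"^Received data from (\S+)")

def missing_hosts(output, expected):
    """Return the expected hosts that never appear in the tool's output."""
    seen = set()
    for line in output.splitlines():
        match = RECEIVED.match(line.strip())
        if match:
            seen.add(match.group(1))
    return sorted(set(expected) - seen)

# Transcript captured on host1; host3 never shows up in it.
sample = """Listening for data...
Sending message with content "host1" every 2,000 milliseconds
Received data from host1 (loopback)
Received data from host2
"""

print(missing_hosts(sample, ["host1", "host2", "host3"]))  # ['host3']
```

An empty list means every expected host was heard from on that machine. The same check has to pass on each host, since multicast can work in one direction and fail in the other.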
Debugging, Step 1: Use Same Multicast Port/Address as Cluster
While this tool can be useful to check your network before deploying your cluster, it is most
helpful when one instance is not communicating with the DAS/other instances. You may see this
if you run the 'get-health' command described in the previous blog. If you know that an instance
is up according to its server log, but it's showing up as "not started" in the get-health output,
then it's likely that the DAS and the instance are not seeing each other's UDP multicast messages.
In this case, you want to run asadmin validate-multicast with the following options:
    --multicastport       The value of gms-multicast-port for your cluster in domain.xml.
    --multicastaddress    The value of gms-multicast-address for your cluster in domain.xml.
Using those options will make the tool use the same values as the members of your cluster, in
effect simulating the GMS traffic between the DAS and instances. To find the values for those
options, you can read them from the attributes on the <cluster> element
in domain.xml. For instance:
    <clusters>
      <cluster name="mycluster" gms-multicast-port="22262" gms-multicast-address="126.96.36.199" [etc.] >
        <server-ref ref="instance1"></server-ref>
        <!-- [etc.] -->
      </cluster>
    </clusters>
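If you would rather not dig through domain.xml by hand, the lookup can be scripted. The following is only a sketch: the helper name gms_multicast_settings is hypothetical, and the XML string is a trimmed-down stand-in for a real domain.xml that reuses the values from the example above.

```python
import xml.etree.ElementTree as ET

def gms_multicast_settings(domain_xml, cluster_name):
    """Return (port, address) for the named cluster, read from domain.xml text."""
    root = ET.fromstring(domain_xml)
    for cluster in root.iter("cluster"):
        if cluster.get("name") == cluster_name:
            return (cluster.get("gms-multicast-port"),
                    cluster.get("gms-multicast-address"))
    raise KeyError(f"no cluster named {cluster_name!r}")

# Stand-in for a real domain.xml; values copied from the example above.
sample = """
<domain>
  <clusters>
    <cluster name="mycluster" gms-multicast-port="22262" gms-multicast-address="126.96.36.199">
      <server-ref ref="instance1"></server-ref>
    </cluster>
  </clusters>
</domain>
"""

port, address = gms_multicast_settings(sample, "mycluster")
print(f"asadmin validate-multicast --multicastport {port} --multicastaddress {address}")
```

To run it against a real domain, read the contents of your domain.xml into a string and pass that in place of the sample.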
Debugging, Step 2: TTL
Unless specified on the command line, the validate-multicast tool and GMS use the default
MulticastSocket time-to-live for
your operating system or 4, whichever is greater. You can see this in the tool's output
if you run with the
--verbose flag. For example:
McastSender: The default TTL for the socket is 1. Setting it to minimum 4 instead.
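The rule behind that message fits in a few lines. This is a sketch of the behavior as described in this post, not the actual GMS or tool source; the names effective_ttl and MIN_TTL are made up for illustration.

```python
MIN_TTL = 4  # the floor described above

def effective_ttl(os_default_ttl, requested_ttl=None):
    """Pick the TTL: an explicitly requested value wins, otherwise
    use the OS default or 4, whichever is greater."""
    if requested_ttl is not None:
        return requested_ttl
    return max(os_default_ttl, MIN_TTL)

print(effective_ttl(1))      # 4: a default of 1 is raised to the minimum
print(effective_ttl(16))     # 16: a larger default is left alone
print(effective_ttl(1, 8))   # 8: a value given on the command line is used as-is
```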
You can try increasing this value to see if it is the limiting factor that prevents packets
from reaching the cluster members with your network configuration. To specify a different
value, use the following option with the tool (in addition to the multicast port and address options above):
    --timetolive    Sets the time-to-live value of the multicast packets sent by the tool.
If you are now seeing all the instances you expect, you can change your cluster configuration
so that GMS uses this TTL value. It is simple to pass this value into the
asadmin create-cluster command. See the
"Create a Cluster" section of the
Availability Administration Guide for an
example. If your cluster is already running, however, you can set this value with
asadmin set. See the
"Change GMS Settings After Cluster Creation" section of the HA guide. The property to set is
GMS_MULTICAST_TIME_TO_LIVE, and it is listed in the
"Names for GMS Settings" section.
Debugging, Step 3: Specifying the Network Adapter
On a multihomed machine (one with two or more network interfaces), you may need
to specify the local interface that should be used for UDP multicast traffic. You can use
ifconfig, or the equivalent on your system, to list the network interfaces
and obtain the local IP address of the interface you wish to use. This address can then be
used with the following command line parameter (along with any you're already specifying):
    --bindaddress    Sets the local interface used to receive packets.
Note that this value will be different on each machine where you are running the tool. If you are now seeing all the instances you expect, you can set the GMS bind interface for each instance following the instructions in the "Traffic Separation Using Multi-Homing" section of the HA guide.
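Because the bind address differs per machine, it can help to write down the exact command for each host before you start. The sketch below builds those command lines from the options covered in this post. The host names and 10.0.0.x addresses are hypothetical, the port and multicast address are copied from the domain.xml example earlier, and validate_multicast_command is just an illustrative helper.

```python
import shlex

def validate_multicast_command(port, address, bind_address=None, ttl=None):
    """Build the asadmin command line for one host as a list of arguments."""
    args = ["asadmin", "validate-multicast",
            "--multicastport", str(port),
            "--multicastaddress", address]
    if ttl is not None:
        args += ["--timetolive", str(ttl)]
    if bind_address is not None:
        args += ["--bindaddress", bind_address]
    return args

# Hypothetical hosts and the local IP of the interface to use on each one.
bind_addresses = {"host1": "10.0.0.11", "host2": "10.0.0.12"}

for host, local_ip in bind_addresses.items():
    command = validate_multicast_command(22262, "126.96.36.199",
                                         bind_address=local_ip)
    print(host, "->", shlex.join(command))
```

Keeping the port and multicast address identical across hosts while only the bind address changes matches how the tool is meant to be run: same group everywhere, one local interface per machine.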
If one or more machines are still missing in the output, then it may be that they are located on different subnets from each other. Or it could be that UDP multicast is not enabled on the network. You may need to ask the network administrator to verify that the network is configured so that the UDP multicast transport is available.
For more information, see the validate-multicast man page. A copy of the help information is attached here as well. The validate-multicast tool is also covered in more depth in the HA guide referenced above.