Learn MySQL with the Curriculum Team

  • July 4, 2013

Connection Failures Between Nodes in MySQL Cluster

Jeremy Smyth
Manager, MySQL Curriculum

If you're setting up MySQL Cluster on Oracle Linux or another Linux such as Red Hat, CentOS or Fedora, you might have come across some problems getting the nodes to communicate. There are two ways you might bump into problems with network connectivity: the iptables firewall and SELinux. These security mechanisms might prevent your various nodes—management, data, and API—from communicating with each other, with a range of symptoms.

Let's have a look at what you're likely to see. 

Data nodes stuck at "starting"

The first thing you might notice is that your data nodes get stuck in the "starting" state. Running show in the management client gives something like this:

[ndbd(NDB)]     2 node(s)
id=2    @  (mysql-5.6.11 ndb-7.3.2, Nodegroup: 0, Master)
id=3    @  (mysql-5.6.11 ndb-7.3.2, starting, Nodegroup: 0)

...and it never moves away from starting.

Looking in the management node's log (possibly named something like ndb_1_cluster.log), you might see something like this, repeated many times:

[MgmtSrvr] INFO     -- Node 3: Initial start, waiting for 2 to connect,  
      nodes [ all: 2 and 3 connected: 3 no-wait:  ]

This sort of thing appears too:

[MgmtSrvr] INFO     -- Node 2: Waiting 60 sec for non partitioned start, 
     nodes [ all: 2 and 3 connected: 2 missing: 3 no-wait:  ]

Just to make matters worse, the first data node might start, but the second gets stuck at this point:

[ndbd] INFO     -- Start phase 0 completed 

What might confuse you is that it's obvious that both data nodes have connected to the management node, but they're not talking to each other. The reason for this is grounded in how the various nodes communicate with data nodes.

Data Node Communication 

Data nodes communicate with the management node on port 1186. This port is registered with IANA, and is often open on systems that have MySQL installed. Once a data node launches and has been given a node ID, it gets a unique dynamically allocated port—an "ephemeral port"—on its host, and opens a socket so other hosts can communicate with it. This port is assigned by the operating system, and as a result cannot be predicted by the data node; IANA suggests a port range of 49152–65535, but on Linux the range is typically 32768 to 61000. You can get this range with the following command:

$ cat /proc/sys/net/ipv4/ip_local_port_range
32768   61000

This is where the problem lies: Firewalls routinely block incoming traffic on those high ports, so while each data node can open a connection to the management node, they can't open connections to other data nodes if those nodes are listening on sockets that are blocked by the firewall. MySQL Cluster won't declare the cluster started until all data nodes have connected (unless you use --nowait-nodes, and in general you shouldn't), so they get stuck in "starting" until they can talk to other data nodes.
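If you suspect the firewall, you can probe a data node's port from another host before changing any rules. This is a diagnostic sketch, not part of MySQL Cluster itself: it uses bash's built-in /dev/tcp redirection (so it works even where nc isn't installed), and datahost1 and port 50501 are the example host and port used later in this article.

```shell
# Probe a TCP port on a remote host; succeeds only if the firewall
# lets the connection through and something is listening there.
check_port() {
  timeout 3 bash -c "echo > /dev/tcp/$1/$2" 2>/dev/null \
    && echo "$1:$2 open" || echo "$1:$2 blocked"
}

check_port datahost1 50501
```

Bear in mind that a "blocked" result can also mean nothing is listening on that port at all, so run this while the ndbd process is up.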

iptables and ServerPort

 Many Linux servers use iptables as a firewall. To open the range of ephemeral ports in the Linux iptables firewall, use something like the following:

iptables -I INPUT -p tcp --dport 32768:61000 -s -j ACCEPT 

This isn't ideal, because you're opening thousands of ports, so MySQL Cluster provides a way to be more targeted in your setup.

You can assign fixed ports to data nodes by using the ServerPort option in your management node's config.ini file, like this: 


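A sketch of such a config.ini fragment, assuming two data node hosts named datahost1 and datahost2 (the same hosts and ports used in the commands below):

```ini
[ndb_mgmd]
HostName=mgmhost1

[ndbd]
HostName=datahost1
ServerPort=50501

[ndbd]
HostName=datahost2
ServerPort=50502
```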
Make sure to use ports that are unique to that host, and are unlikely to be used by other services. 

Once you've done that, open those ports in the data node hosts' firewalls. To match the example shown above, open port 50501 on datahost1 and port 50502 on datahost2. For example, the following command, when run as root on datahost1, opens its socket:

iptables -I INPUT -p tcp --dport 50501 -s -j ACCEPT

Once you've verified that this works, save your firewall rules with the following command (run as root) on each data node:

service iptables save 

Stuck Management Nodes

Of course, it's also possible that your management node can't listen on port 1186; this is particularly likely on systems where you've installed MySQL by unpacking a compressed archive (tar.gz or similar). If so, you might need to open that port too, using a similar technique.

For example, if you've configured two management servers, and port 1186 is not open on one of them, running show in the management client on the one that cannot see the other gives something like this:

ERROR Message: The cluster configuration is not yet confirmed 
   by all defined management servers. This management server 
   is still waiting for node 2 to connect.
Could not get configuration
*  4012: Failed to get configuration
*    The cluster configuration is not yet confirmed by all 
     defined management servers. This management server is 
     still waiting for node 2 to connect.

If one management node has port 1186 open and the other closed, the one that can communicate with the other displays this:

[ndb_mgmd(MGM)] 2 node(s)
id=1 (not connected, accepting connect from mgmhost1)
id=2    @  (mysql-5.6.11 ndb-7.3.2)

iptables and PortNumber

In this situation, the one that appears to be working is actually the one with the blocked port; it can see the other, because the other's port is available. Again, a quick blast of iptables will fix the problem:

iptables -I INPUT -p tcp --dport 1186 -s -j ACCEPT 

You can, of course, change the management node's port. The relevant option is PortNumber, although bear in mind that 1186 is a registered port, so changing it requires changing not only your iptables configuration, but also SELinux and any application connect strings you use.
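As a sketch, a config.ini fragment that moves the management node to a non-default port might look like this (1187 is an invented example, not a recommendation):

```ini
[ndb_mgmd]
HostName=mgmhost1
PortNumber=1187
```

Every client of that management node—data nodes, SQL nodes, and applications—would then need mgmhost1:1187 in its connect string.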

MySQL Server Node failing to connect

When you install MySQL from RPM—either MySQL Server or MySQL Cluster—the installer adds SELinux configuration to the machine. The default configuration allows MySQL's server process mysqld to talk to ports 3306 and 1186. These are the ports of the mysqld process and the MySQL Cluster management node respectively.

However, the default SELinux configuration doesn't know anything about MySQL acting as an API node in a MySQL Cluster, so doesn't configure any permissions that allow the mysqld process to access the data nodes' dynamic ports.

This typically manifests itself as the SQL node appearing to connect, but running show displays something like this:

[mysqld(API)]   1 node(s)
id=5 (not connected, accepting connect from any host) 

The MySQL Error log shows this:

[Warning] NDB : Tables not available after 30 seconds.  
   Consider increasing --ndb-wait-setup value 

 The cluster log shows this:

[MgmtSrvr] INFO     -- Nodeid 5 allocated for API at
[MgmtSrvr] INFO     -- Node 5: mysqld --server-id=1
[MgmtSrvr] INFO     -- Node 3: Node 5 Connected
[MgmtSrvr] INFO     -- Node 4: Node 5 Connected
[MgmtSrvr] INFO     -- Node 4: Node 5: API mysql-5.6.11 ndb-7.3.2
[MgmtSrvr] INFO     -- Node 3: Node 5: API mysql-5.6.11 ndb-7.3.2
[MgmtSrvr] ALERT    -- Node 4: Node 5 Disconnected
[MgmtSrvr] INFO     -- Node 4: Communication to Node 5 closed
[MgmtSrvr] ALERT    -- Node 3: Node 5 Disconnected
...and several more such lines.

Attempts to create a clustered table result in a vague error:

ERROR 157 (HY000): Could not connect to storage engine

 Running show warnings in the MySQL client gives a more bizarre result:

mysql> show warnings;
+---------+------+---------------------------------------------------------------------------------+
| Level   | Code | Message                                                                         |
+---------+------+---------------------------------------------------------------------------------+
| Warning | 1296 | Got error 4009 'Cluster Failure' from NDB. Could not acquire global schema lock |
| Warning | 1296 | Got error 4009 'Cluster Failure' from NDB                                       |
| Error   |  157 | Could not connect to storage engine                                             |
| Error   | 1499 | Too many partitions (including subpartitions) were defined                      |
+---------+------+---------------------------------------------------------------------------------+
4 rows in set (0.00 sec)

The MySQL Error log adds lines like this:

[Warning] NDB: Could not acquire global schema lock (4009)Cluster Failure

SELinux and MySQL Cluster SQL Nodes 

This all looks pretty disastrous, but is easily fixed by tweaking SELinux a bit. I've written about SELinux and MySQL in the past, and the same advice in that article applies here. You could, for example, disable SELinux entirely. But, being a more intelligent reader, you're more likely to want to know how to configure it.

Once you've set ServerPort as shown above,  you can easily let SELinux in on that particular secret and permit the mysqld process to access the ports you've assigned by running the following command as root on the MySQL host:

 semanage port -a -t mysqld_port_t -p tcp 50501-50524

This opens 24 ports starting with the two in the configuration example shown earlier. Having done this, your MySQL Server instance can talk to data nodes on their fixed ports.

Other Connections

A lot of what's said in this post also applies to other connections. If you've got a Java application that uses ClusterJPA to talk to your cluster, you'd better make sure it can see the management nodes and that SELinux isn't preventing it from accessing the data nodes.

Similarly, if you've changed your management node's port from the default, ensure you've configured the connection string correctly, with that port, in all of your applications' connect strings. This includes MySQL Server applications where you'll need to include the revised port number in the ndb-connectstring option. 
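As a sketch, assuming a management node on mgmhost1 listening on a hypothetical non-default port of 1187, a MySQL Server's my.cnf might contain:

```ini
[mysqld]
ndbcluster
ndb-connectstring=mgmhost1:1187
```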

You don't need to tell your applications what hosts or ports your data nodes are at. That's the joy of a centralized configuration, isn't it?


Comments ( 5 )
  • Oleg Wednesday, September 25, 2013


    Very useful, thanks


  • guest Wednesday, May 7, 2014

    I have a MySQL Cluster setup, and I think I'm facing the same issues. I can see the nodes connected to the management node, but when storing data in the cluster database (which uses the NDB engine), the data isn't stored. I can't see any errors or exceptions in the logs; my storage process just hangs. I am using the MySQL Cluster APIs to store data, and I have the ndb management host/port for connections.

    Any clues what i need to do?

    Thanks in advance for help.


  • Jeremy Smyth Wednesday, May 7, 2014

    Hi Sohan,

    Thanks for your comment. If the nodes are definitely connected to each other and you can see this in ndb_mgm, it's probably not the same issue as that described in the article. With no messages in the logs it's going to be difficult to troubleshoot it as a cluster problem. Why don't you give a detailed description on one of the forums that offer support or troubleshooting help? The MySQL Cluster forum at http://forums.mysql.com/list.php?25 might be one place to start.


  • guest Thursday, May 8, 2014

    Thanks for reply Jeremy.

    I have cracked the issue. From debugging, I found the issue also arises when the cluster jar you are using is not compatible with your Java version.

    In my case I was using clusterj-7.2.6.jar and my Java version was jdk1.7.0_51. I downgraded the Java version to _09 and it worked. So in future, bear in mind that a third-party jar isn't always compatible with your JVM version, as Java doesn't always guarantee backward compatibility.

    Conclusion: though everything is connected, when you're storing data in NDB through the NDB APIs, make sure the Java version is correct.

    See below exception:

    "Caused by java.lang.IllegalArgumentException:com.sun.proxy.$Proxy29 is not an interface"

    Later Java versions, i.e. jdk1.7.0_21 onwards, work with cluster 7.3.x.



  • Muhammad Karam Shehzad Thursday, November 3, 2016


    I am on Linux platform with MySQL NDB 5.7. I am trying to monitor all traffic related to MySQL clustering - between data nodes, management node and sql nodes. To that end, I used netstat to list all open ports listening on my machine before starting MySQL cluster. Then, I started MySQL cluster and ran netstat again. I assumed that the ports that were listening the second time around, but not the first time, were related to MySQL clustering.

    But there are two problems with this. First, there could be ports opened by other processes between the two netstat runs. Second, MySQL might open other ports after I ran the netstat command the second time.

    What is the best way to go about finding all ports being used by MySQL for clustering purposes?

    I believe ephemeral ports are picked dynamically, so perhaps if I knew all the MySQL clustering related processes that would be running, I can figure out every port that they are using. Pointers will be very welcome.
