Thursday Jul 04, 2013

Connection Failures Between Nodes in MySQL Cluster

If you're setting up MySQL Cluster on Oracle Linux or another Linux such as Red Hat, CentOS or Fedora, you might have come across some problems getting the nodes to communicate. There are two ways you might bump into problems with network connectivity: The iptables firewall, and SELinux. These security mechanisms might prevent your various nodes—management, data, and API—from communicating with each other in various ways and with various symptoms.

Let's have a look at what you're likely to see. 

Data nodes stuck at "starting"

The first thing you might notice is that your data nodes get stuck in the "starting" state. Running show in the management client gives something like this:

[ndbd(NDB)]     2 node(s)
id=2    @192.168.56.211  (mysql-5.6.11 ndb-7.3.2, Nodegroup: 0, Master)
id=3    @192.168.56.212  (mysql-5.6.11 ndb-7.3.2, starting, Nodegroup: 0)

...and it never moves away from starting.

Looking in the management node's log (possibly named something like ndb_1_cluster.log), you might see something like this, repeated many times:

[MgmtSrvr] INFO     -- Node 3: Initial start, waiting for 2 to connect,  
      nodes [ all: 2 and 3 connected: 3 no-wait:  ]

This sort of thing appears too:

[MgmtSrvr] INFO     -- Node 2: Waiting 60 sec for non partitioned start, 
     nodes [ all: 2 and 3 connected: 2 missing: 3 no-wait:  ]

Just to make matters worse, the first data node might start, but the second gets stuck at this point:

[ndbd] INFO     -- Start phase 0 completed 

What might confuse you is that it's obvious that both data nodes have connected to the management node, but they're not talking to each other. The reason for this is grounded in how the various nodes communicate with data nodes.

Data Node Communication 

Data nodes communicate with the management node on port 1186. This port is registered with IANA, and is often open on system that have MySQL installed. Once a data node launches and has been given a node ID, it gets a unique dynamically allocated port—an "ephemeral port"—on its host, and opens a socket so other hosts can communicate with it. This port is assigned by the operating system, and as a result cannot be predicted by the data node; the IANA suggest a port range of 49152–65535, but in Linux the range is from 32768 to 61000. You can get this range with the following command:

$ cat /proc/sys/net/ipv4/ip_local_port_range
32768	61000

This is where the problem lies: Firewalls routinely block incoming traffic on those high ports, so while each data node can open a connection to the management node, they can't open connections to other data nodes if those nodes are listening on sockets that are blocked by the firewall. MySQL Cluster won't declare the cluster started until all data nodes have connected (unless you use --nowait-nodes, and in general you shouldn't), so they get stuck in "starting" until they can talk to other data nodes.

iptables and ServerPort

 Many Linux servers use iptables as a firewall. To open the range of ephemeral ports in the Linux iptables firewall, use something like the following:

iptables -I INPUT -p tcp --dport 32768:61000 -s 192.168.56.0/24 -j ACCEPT 

This isn't ideal, because you're opening thousands of ports, so MySQL Cluster provides a way that you can be more targetted in your setup. 

You can assign fixed ports to data nodes by using the ServerPort option in your management node's config.ini file, like this: 

[ndbd]
Hostname=datahost1
ServerPort=50501
[ndbd]
Hostname=datahost2
ServerPort=50502 

Make sure to use ports that are unique to that host, and are unlikely to be used by other services. 

Once you've done that, open those ports in the data node hosts' firewalls. To match the example shown above, open port 50501 on datahost1 and port 50502 on datahost2. For example, the following command, when run as root on datahost1, opens its socket:

iptables -I INPUT -p tcp --dport 50501 -s 192.168.56.0/24 -j ACCEPT

Once you've verified that this works, save your firewall rules with the following command (run as root) on each data node:

service iptables save 

Stuck Management Nodes

Of course, it's also possible that your management node can't listen on port 1186; this is particularly possible on systems that you've installed by copying the binary from compressed archive, tar.gz or similar. If so, you might need to open that port too by using a similar technique.

For example, if you've configured two management servers, and port 1186 is not open on one of them, you'll see something like this when running show in the management client that cannot see the other:

ERROR Message: The cluster configuration is not yet confirmed 
   by all defined management servers. This management server 
   is still waiting for node 2 to connect.
Could not get configuration
*  4012: Failed to get configuration
*    The cluster configuration is not yet confirmed by all 
     defined management servers. This management server is 
     still waiting for node 2 to connect.

If one management node has port 1186 open and the other closed, the one that can communicate with the other displays this:

[ndb_mgmd(MGM)] 2 node(s)
id=1 (not connected, accepting connect from mgmhost1)
id=2    @192.168.56.216  (mysql-5.6.11 ndb-7.3.2)

iptables and PortNumber

In this situation, the one that appears to be working is actually the one with the blocked port; it can see the other, because the other's port is available. Again, a quick blast of iptables will fix the problem:

iptables -I INPUT -p tcp --dport 1186 -s 192.168.56.0/24 -j ACCEPT 

You can, of course, change the management node's port. The relevant option is PortNumber, although bear in mind that 1186 is a registered port, so changing it requires changing not only your iptables Tables configuration, but also SELinux and any application connect strings you use.

MySQL Server Node failing to connect

When you install MySQL from RPM—either MySQL Server or MySQL Cluster—the installer adds SELinux configuration to the machine. The default configuration allows MySQL's server process mysqld to talk to ports 3306 and 1186. These are the ports of the mysqld process and the MySQL Cluster management node respectively.

However, the default SELinux configuration doesn't know anything about MySQL acting as an API node in a MySQL Cluster, so doesn't configure any permissions that allow the mysqld process to access the data nodes' dynamic ports.

This typically manifests itself as the data node appearing to connect, but running show displays something like this:

[mysqld(API)]   1 node(s)
id=5 (not connected, accepting connect from any host) 

The MySQL Error log shows this:

[Warning] NDB : Tables not available after 30 seconds.  
   Consider increasing --ndb-wait-setup value 

 The cluster log shows this:

[MgmtSrvr] INFO     -- Nodeid 5 allocated for API at 192.168.56.215
[MgmtSrvr] INFO     -- Node 5: mysqld --server-id=1
[MgmtSrvr] INFO     -- Node 3: Node 5 Connected
[MgmtSrvr] INFO     -- Node 4: Node 5 Connected
[MgmtSrvr] INFO     -- Node 4: Node 5: API mysql-5.6.11 ndb-7.3.2
[MgmtSrvr] INFO     -- Node 3: Node 5: API mysql-5.6.11 ndb-7.3.2
[MgmtSrvr] ALERT    -- Node 4: Node 5 Disconnected
[MgmtSrvr] INFO     -- Node 4: Communication to Node 5 closed
[MgmtSrvr] ALERT    -- Node 3: Node 5 Disconnected
...and several more such lines.

Attempts to create a clustered table result in a vague error:

ERROR 157 (HY000): Could not connect to storage engine

 Running show warnings in the MySQL client gives a more bizarre result:

mysql> show warnings;
+---------+------+---------------------------------------------------------------------------------+
| Level   | Code | Message                                                                         |
+---------+------+---------------------------------------------------------------------------------+
| Warning | 1296 | Got error 4009 'Cluster Failure' from NDB. Could not acquire global schema lock |
| Warning | 1296 | Got error 4009 'Cluster Failure' from NDB                                       |
| Error   |  157 | Could not connect to storage engine                                             |
| Error   | 1499 | Too many partitions (including subpartitions) were defined                      |
+---------+------+---------------------------------------------------------------------------------+
4 rows in set (0.00 sec)

The MySQL Error log adds lines like this:

[Warning] NDB: Could not acquire global schema lock (4009)Cluster Failure

SELinux and MySQL Cluster SQL Nodes 

This all looks pretty disastrous, but is easily fixed: You can fix it by tweaking SELinux a bit. I've written about SELinux and MySQL in the past, and the same advice in that article applies here. You could, for example, disable SELinux entirely. But, being a more intelligent reader, you're more likely to want to know how to configure it. 

Once you've set ServerPort as shown above,  you can easily let SELinux in on that particular secret and permit the mysqld process to access the ports you've assigned by running the following command as root on the MySQL host:

 semanage port -a -t mysqld_port_t -p tcp 50501-50524

This opens 24 ports starting with the two in the configuration example shown earlier. Having done this, your MySQL Server instance can talk to data nodes on their fixed ports.

Other Connections

A lot of what's said in this post also applies to other connections. If you've got a Java application that uses ClusterJPA to talk to your cluster, you'd better make sure it can see the management nodes and that SELinux isn't preventing it from accessing the data nodes.

Similarly, if you've changed your management node's port from the default, ensure you've configured the connection string correctly, with that port, in all of your applications' connect strings. This includes MySQL Server applications where you'll need to include the revised port number in the ndb-connectstring option. 

You don't need to tell your applications what hosts or ports your data nodes are at. That's the joy of a centralized configuration, isn't it?

Friday Mar 22, 2013

SELinux and MySQL

I've previously written about AppArmor and MySQL, and how to change MySQL's default file locations on systems with AppArmor enabled. Ubuntu and SUSE ship with AppArmor enabled, but some other distributions such as Oracle Linux don't, along with related distrubutions such as Red Hat, CentOS and Fedora. Rather, these other distributions use another mandatory access control system called SELinux. 

Here's some technical detail that might come in handy later.

SELinux uses concepts such as types and domains. Types belong to resources such as files and ports; these are the "objects" in SELinux. Domains contain the "subjects" (processes) and object types that are associated with each other in some way, for example because they are all related to MySQL.

Each executable for a service that SELinux understands, including MySQL, has its own type. When the process runs, it is placed within a domain based on its type; members of that domain (normally processes) that wish to access objects (such as files and ports) must be authorised to do so by virtue of the object having a configured type that is approved for that domain.

This configuration is contained within policy files that usually ship with each distribution, but can be created using the appropriate tools.

SELinux (Security-Enhanced, if you're interested) "is a Linux feature that provides the mechanism for supporting access control security policies" according to Wikipedia. More simply, it stops things—like programs—from accessing things—like files and network ports—they shouldn't access. By "shouldn't access" I really mean "haven't been configured to access". For example, MySQL is allowed to write to its data directory in /var/lib/mysql, and read from /etc/my.cnf. It can open a socket on port 3306, but SELinux prevents it from writing to files in /home/jeremy or /sbin or anywhere else that isn't already configured as a MySQL location.

In short, if you try changing MySQL's port to a non-standard one, or try taking backups or configuring data files or log files to anywhere but the usual locations, you'll get some odd access-denied type errors in the MySQL error log. In addition, you'll get messages in  /var/log/audit/audit.log (if auditd is running, otherwise /var/log/messages or /var/log/syslog, depending how your system is configured).

What Error do I get? 

To set this demonstration up, I've installed MySQL 5.6 on an Oracle Linux 6.3 system, with SELinux enabled. When I change the datadir option to /datadir (which contains a copy of the MySQL data directory, and has the correct permissions) the service does not start. Let's look at the errors.

From the MySQL error log:

130321 11:50:51 mysqld_safe Starting mysqld daemon with databases from /datadir
...
2013-03-21 11:50:52 2119 [Warning] Can't create test file /datadir/boxy.lower-test
2013-03-21 11:50:52 2119 [Warning] Can't create test file /datadir/boxy.lower-test
...
2013-03-21 11:50:52 2119 [ERROR] /usr/sbin/mysqld: Can't create/write to file 
    '/datadir/boxy.pid' (Errcode: 13 - Permission denied)
2013-03-21 11:50:52 2119 [ERROR] Can't start server: can't create PID file: 
    Permission denied
130321 11:50:52 mysqld_safe mysqld from pid file /datadir/boxy.pid ended

 Now, I'm very sure the permissions on that folder are correct, so let's have a look in /var/log/audit/audit.log:

...
type=AVC msg=audit(1363866652.030:24): avc:  denied  { write } for  pid=2119 
    comm="mysqld" name="datadir" dev=dm-0 ino=394 
    scontext=unconfined_u:system_r:mysqld_t:s0 
    tcontext=unconfined_u:object_r:default_t:s0 tclass=dir
...

A similar error occurs if I try starting MySQL on port 3307, a non-default port:

In the MySQL error log:

2013-03-21 12:12:09 3436 [Note] Server hostname (bind-address): '*'; port: 3307
...
2013-03-21 12:12:09 3436 [ERROR] Can't start server: Bind on TCP/IP port: 
    Permission denied
2013-03-21 12:12:09 3436 [ERROR] Do you already have another mysqld server 
    running on port: 3307 ?
2013-03-21 12:12:09 3436 [ERROR] Aborting

In the audit log: 

type=AVC msg=audit(1363867929.432:42): avc:  denied  { name_bind } for  pid=3436 
    comm="mysqld" src=3307 
    scontext=unconfined_u:system_r:mysqld_t:s0 
    tcontext=system_u:object_r:port_t:s0 tclass=tcp_socket

 Clearly something going on here. The Access Vector Cache (as seen in the "avc: denied" message) is where SELinux caches permissions for the kernel, so it's definitely SELinux doing the denying. 

Just Stop It!

I'm going to start with the hammer and work my way down to the scalpel. 

Here's the hammer:

[root@boxy ~]# setenforce 0
[root@boxy ~]# getenforce
Permissive

The setenforce 0 command switches off SELinux enforcing until the next reboot, and getenforce shows you the current status. To stop SELinux from enforcing on any reboot, you'll need to change a configuration file:

[root@boxy ~]# cat /etc/selinux/config 
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#     enforcing - SELinux security policy is enforced.
#     permissive - SELinux prints warnings instead of enforcing.
#     disabled - No SELinux policy is loaded.
SELINUX=enforcing
# SELINUXTYPE= can take one of these two values:
#     targeted - Targeted processes are protected,
#     mls - Multi Level Security protection.
SELINUXTYPE=targeted  

 Change that enforcing to permissive (or disabled) and you're good to go. The difference:

  • enforcing blocks operations that SELinux does not allow
  • permissive does not block the operations, but logs them (to /var/log/audit/audit.log)
  • disabled switches off SELinux entirely, to the extent that you cannot use setenforce until you change it and reboot.

 For example, on a machine with SELinux set to permissive, I can do the following:

[root@boxy ~]# setenforce 1
[root@boxy ~]# getenforce
Enforcing

But if it's disabled, this happens:

[root@boxy ~]# setenforce 0
setenforce: SELinux is disabled
[root@boxy ~]# setenforce 1
setenforce: SELinux is disabled

 That's the hammer.

 So, to return to the example that generated the error, I can use the hammer:

[root@boxy ~]# setenforce 0
[root@boxy ~]# service mysql start --datadir=/datadir
Starting MySQL. SUCCESS! 
[root@boxy ~]# service mysql stop
Shutting down MySQL.. SUCCESS! 

If you're happy with that, you could then edit the configuration file to disable SELinux on next reboot, and thanks for reading. See you next time.

I'm intrigued. How do I configure it? 

Obviously, there's a lot more to SELinux than disabling it, and a responsible admin (that's you, right?) wants to know how to use it rather than disable it. I'm not going to get into too much detail here.

We can, however, look at how you can assign SELinux types to objects such as ports and files, so that members of the mysqld_t domain (specifically the mysqld_safe process, launced when you run service mysql start) can access those objects.

So here's the scalpel. First, let's configure SELinux to enable MySQL's use of port 3307:

 [root@boxy ~]# semanage port -a -t mysqld_port_t -p tcp 3307 
You'll need to install the policycoreutils-python package to use the semanage utility.

The semanage utility changes various SELinux settings.  In this case, it adds (-a) a type (-t  mysqld_port_t) to the port mappings for port 3307 using TCP as its protocol (-p tcp). When MySQL (through the mysqld_safe process) tries to access that port, SELinux recognises that the port has a type that is approved for such access by the policy.

We can also allow MySQL to use the /datadir directory: 

[root@boxy ~]# semanage fcontext -a -t mysqld_db_t "/datadir(/.*)?"
[root@boxy ~]# restorecon -Rv /datadir
restorecon reset /datadir context 
   unconfined_u:object_r:default_t:s0->unconfined_u:object_r:mysqld_db_t:s0
restorecon reset /datadir/mysql.sock context 
   system_u:object_r:mysqld_var_run_t:s0->system_u:object_r:mysqld_db_t:s0 

In this example, semanage is adding the type mysqld_db_t to the file context map (fcontext) for anything in the /datadir directory and subdirectories ("/datadir(/.*)?", a regular expression). File mappings such as this are contained in the file /etc/selinux/targeted/contexts/files/file_contexts.local; that file must subsequently be read in order to set the appropriate type on the file itself. That's done by the restorecon utility, and at system reboot.  If you want to change a file context immediately, but don't need it to survive a reboot, there's a chcon utility that performs that task.

The same principles and statements apply if you wish to use other ports or directories. There are similar types that apply to different kinds of files; I used mysqld_db_t above for database directories, but the standard SELinux policy for MySQL also include:

  • mysqld_etc_t for configuration files such as /etc/my.cnf
  • mysqld_log_t for logfiles such as those named /var/log/mysql*
  • Types for the PID file, tmp files, service startup files in the /etc/init.d directory, and the various executables you're likely to use
As you can see, you can get quite fine-grained as you wield your configuration scalpel. Personally, I've had mixed results using things like mysqld_log_t for custom logfile locations, but I got around it initially by using mysqld_db_t (as for data files), and subsequently by using a custom-made policy file.

Conclusion 

This post is already long enough, so I won't get into the deeper topics in SELinux, such as the ability to compile your own policy files and configure new policies for services that SELinux doesn't yet know about. At this stage, you know how to add an SELinux type to an object such as a port or a file so that MySQL can access that object, even if it's not used by default. You also know how to disable SELinux in a couple of ways, but you're not going to do that now, are you? You've got a perfectly good scalpel, after all. Why use a hammer?

About

Jeremy Smyth writes MySQL training courses, and likes exploring interesting questions that come up from novices and experts alike.

Connect

Search

Categories
Archives
« April 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today