MySQL NDB Cluster is a highly available and performant database with a shared-nothing architecture powering its linear scalability. Replication with MySQL NDB Cluster provides active-active geographical replication, allowing better data locality and disaster recovery with backup sites. Thus, it can serve a large number of clients at the same time in a global Cloud environment.

The blog post series that began with "MySQL NDB Cluster replication: Introduction" presented an overview of the MySQL NDB Cluster replication topologies. If you are unfamiliar with these replication methods, we suggest starting there to make sense of what we cover in this blog post.

Two Site Replication

The following figure illustrates a one-way replication path (single channel) between two NDB clusters.

Figure 1: Single channel replication from Primary MySQL NDB Cluster to Secondary MySQL NDB Cluster

In the figure, the green arrows show the replication path for changes originating from the Read-Write MySQL Servers on the first cluster. In the end, the changes performed on the first cluster can be read by the Read-Only MySQL Servers in the second cluster. If the replica server, MySQL Replica, has --log-replica-updates=ON, the applied changes are written to the MySQL Replica binary log file. However, since the servers in the second cluster are Read-Only, there is no need to write any changes to the binary log (--ndb-log-bin=OFF) unless another set of servers (or clusters) replicates from them. You can find more information on this topology at MySQL NDB Cluster replication: Single-channel replication.

Another, more complex, example is the active-active replication path through two (or more) NDB clusters. The following figure illustrates this with two NDB clusters.

Figure 2: Single channel replication between two NDB clusters

The green and blue arrows show the replication path for changes originating from Read-Write MySQL Servers on both clusters. Since this is active-active asynchronous replication, the loop must be closed (or it becomes infinite!). The figure illustrates the effect of the --log-replica-updates=OFF option: the MySQL Binlog Server in each cluster discards the changes that the MySQL Replica applied from the source cluster. Hence, the first cluster's changes (green arrows) applied at step (4) are dropped in the Cluster 2 MySQL Binlog server, shown by the red boxed arrow (A), and the second cluster's changes (blue arrows) applied at step (8) are dropped in the Cluster 1 MySQL Binlog server, shown by the red boxed arrow (B). More information on this topology is available at MySQL NDB Cluster replication: Circular replication for active-active clusters.
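To make the loop-closing behavior concrete, here is a minimal Python sketch of two clusters shipping binary log entries to each other. It is a toy model, not NDB code: the `Cluster` class, its methods, and the `replicate` and `run` helpers are all hypothetical names introduced for illustration, and the model deliberately ignores every other safeguard a real deployment has (for example, filtering by originating server_id).

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy stand-in for one cluster: its data, plus one binary log."""
    name: str
    log_replica_updates: bool
    rows: dict = field(default_factory=dict)
    binlog: list = field(default_factory=list)

    def local_write(self, key, value):
        # A change made through a Read-Write MySQL Server is always logged.
        self.rows[key] = value
        self.binlog.append((key, value))

    def replica_apply(self, key, value):
        # A change applied by the replica is logged only when
        # log_replica_updates is ON; with OFF it is dropped here.
        self.rows[key] = value
        if self.log_replica_updates:
            self.binlog.append((key, value))


def replicate(a, b, max_hops=10):
    """Ship binlog entries both ways until both binlogs are empty.

    Returns the number of hops used, or None if max_hops ran out,
    meaning the changes were still circulating.
    """
    for hop in range(max_hops):
        if not a.binlog and not b.binlog:
            return hop
        out_a, a.binlog = a.binlog, []
        out_b, b.binlog = b.binlog, []
        for key, value in out_a:
            b.replica_apply(key, value)
        for key, value in out_b:
            a.replica_apply(key, value)
    return None


def run(log_replica_updates):
    """Write one row on each cluster, then replicate in both directions."""
    c1 = Cluster("cluster1", log_replica_updates)
    c2 = Cluster("cluster2", log_replica_updates)
    c1.local_write("k1", "from-1")
    c2.local_write("k2", "from-2")
    return replicate(c1, c2), c1.rows, c2.rows
```

In this model, `run(False)` converges after a single hop with identical rows on both clusters, while `run(True)` never drains the binlogs and returns `None` for the hop count, which is the "infinite loop" the option prevents.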

Ignore Replica Updates

The examples above demonstrate that a MySQL Server can discard binary log updates that are irrelevant for it to write. But first, we should clarify what "discarded" means. NDB data nodes do not know what happens to a particular update after sending it to the subscribing nodes. They aim to do as little work as possible in order to be as fast as possible. Therefore, they limit themselves to sending the update event to the subscribers of the table that the change concerns. For more information on the work performed by the NDB data nodes, see the MySQL NDB Cluster Internals Manual.

For each binary log update, the data node sets up signals and sends them through the transport layer (CPU bound). The update then incurs the actual network I/O, the handling of the incoming packets on the receiving MySQL Server machine, the building and filling of event data buffers in the NDB API (memory and CPU bound), and the processing in the binary log injector (CPU bound). If the event from the data nodes ends up being discarded, all of this work and these computing resources are pure overhead. The overhead grows with the size of the event: discarding large BLOB values, for example, makes it more significant.

MySQL NDB Cluster 8.4 introduced a new feature to remove this unnecessary overhead for discarded events. With this feature, the binary logging MySQL Server node sets up its subscription against the NDB data nodes with the option to skip sending updates that were applied by a replica. Recalling Figure 2, at steps (4) and (8) the NDB data nodes know that the update was applied by a replica MySQL Server; therefore, if the MySQL Binlog servers have --log-replica-updates=OFF, the red arrows do not occur at all. This is further illustrated in the following figure.

Figure 3: Single channel replication between two NDB clusters with filtered replica applied updates
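The difference between discarding an event at the receiver and never sending it can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the NDB implementation: `ChangeEvent`, `Subscriber`, `send_events`, and the `filter_in_data_node` flag are hypothetical names used only to show where the filtering decision is taken.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeEvent:
    table: str
    applied_by_replica: bool


@dataclass
class Subscriber:
    """Toy binary logging MySQL Server subscribed to table changes."""
    name: str
    log_replica_updates: bool
    received: int = 0
    discarded: int = 0

    def on_event(self, event):
        # Receiving an event already cost transport, buffering and CPU,
        # even if the event is discarded right afterwards.
        self.received += 1
        if event.applied_by_replica and not self.log_replica_updates:
            self.discarded += 1


def send_events(events, subscriber, filter_in_data_node):
    """Deliver events to the subscriber; return how many were sent."""
    sent = 0
    for event in events:
        skip = (
            filter_in_data_node
            and event.applied_by_replica
            and not subscriber.log_replica_updates
        )
        if skip:
            continue  # never leaves the data node
        subscriber.on_event(event)
        sent += 1
    return sent
```

Feeding the same mix of local and replica-applied events to a subscriber with `log_replica_updates=False` shows the effect: without the data node filter every event is sent and the replica-applied ones are discarded on arrival, while with the filter only the local events are sent and nothing is discarded.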

This new subscription mode is transparent and requires no additional options. The binary logging MySQL Server reads its configuration and sets up the subscription based on the value of the --log-replica-updates option. The following messages appear in the server log, indicating that the filter was successfully installed.

<timestamp> 15 [System] [MY-010865] [NDB] Created event 'REPL$test/t1' for table 'test.t1' in NDB
<timestamp> 15 [System] [MY-010865] [NDB] Binlog: filter replica updates in NDB

The curious reader might wonder how NDB data nodes recognize a replica-applied change when sending a change event to a subscriber. This is possible because the binary log file records the server_id of the server where each transaction originated. The source's binary log becomes the relay log at the replica MySQL Server end. When the replica reads the relay log, the server_id is sent along with the applied transaction into NDB. This value is then recorded in the ndb_apply_status table. More information is available at MySQL NDB Cluster replication: Single-channel replication.
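The flow just described can be sketched in Python as follows. This is only an illustration: the `Transaction` type, the `LOCAL_ORIGIN` marker for changes made directly by a local client, and both functions are assumptions made for this sketch, and a plain dict stands in for the ndb_apply_status table. How NDB actually encodes the origin internally is not covered here.

```python
from dataclasses import dataclass

# Assumption for this sketch: changes made directly by a local client
# carry this marker instead of a source server_id.
LOCAL_ORIGIN = 0


@dataclass(frozen=True)
class Transaction:
    origin_server_id: int
    epoch: int
    statement: str


def apply_from_relay_log(relay_entries, apply_status):
    """Replay relay log entries, tagging each with its origin.

    relay_entries: iterable of (server_id, epoch, statement) tuples.
    apply_status: dict standing in for ndb_apply_status,
                  mapping server_id to the last applied epoch.
    """
    applied = []
    for server_id, epoch, statement in relay_entries:
        # The server_id read from the relay log travels with the
        # transaction into the cluster.
        applied.append(Transaction(server_id, epoch, statement))
        apply_status[server_id] = epoch
    return applied


def is_replica_applied(txn):
    """A change is replica-applied when it carries a source server_id."""
    return txn.origin_server_id != LOCAL_ORIGIN
```

With the origin attached to every applied transaction, the sending side can tell replica-applied changes apart from local ones without consulting the subscriber.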

Ignore No Logging Updates

Previously, if the user had sql_log_bin disabled when executing DML statements, any insert, update, or delete against NDB tables would still produce an update event that was sent to the subscribing binary logging MySQL Servers, only to be discarded at the receiving end. From MySQL NDB Cluster 8.4 onwards, DML executed against NDB tables with sql_log_bin disabled is no longer sent to the subscribing MySQL nodes. For every NDB user table created or synchronized, the following server messages appear:

<timestamp> 15 [System] [MY-010865] [NDB] Created event 'REPL$test/t1' for table 'test.t1' in NDB
<timestamp> 15 [System] [MY-010865] [NDB] Binlog: filter nologging in NDB
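If you want to check a server log for these messages, a small Python helper along the following lines may be enough. The two message texts are taken from the log excerpts in this post; the function name `installed_filters` and the labels `replica_updates` and `nologging` are our own choices for this sketch, and the matching is a plain substring test, so it assumes the messages appear in the log exactly as shown.

```python
# Message texts as they appear in the log excerpts in this post.
FILTER_MESSAGES = {
    "replica_updates": "Binlog: filter replica updates in NDB",
    "nologging": "Binlog: filter nologging in NDB",
}


def installed_filters(log_lines):
    """Return the set of filter labels whose message occurs in the log."""
    found = set()
    for line in log_lines:
        if "[NDB]" not in line:
            continue  # not an NDB subsystem message
        for label, message in FILTER_MESSAGES.items():
            if message in line:
                found.add(label)
    return found
```

For example, passing the lines of the server error log (read with `open(path).readlines()`, where `path` is wherever your server writes its log) returns the labels of the filters that were reported as installed.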

Summary

In this article, we gave a brief overview of MySQL NDB Cluster active-active asynchronous replication and detailed some steps that replicated data takes on every server hop. We also showed how a new MySQL NDB Cluster 8.4 feature now transparently reduces computing overhead without user intervention.

More Information

MySQL NDB Cluster is open source and can be downloaded in both source and binary form from MySQL Downloads, where the GA 8.4 LTS release and later versions can be found.