Replication is a hot topic in MySQL 5.6, and for good reason: There are many excellent features that make it a strong well-supported feature, from the new Global Transaction Identifiers (GTIDs), to simplified replication configuration and automated failover using MySQL Utilities (now available in alpha as a separate download).
The simplest topology consists of a master server that accepts changes, and slaves that replicate those changes from the master. A common requirement is for a network to have multiple servers that accept changes and replicate to each other. This is possible by means of circular replication, where each master is also the slave of another master, in a circular fashion. However, this configuration is prone to certain problems.
Firstly, you have to know a bit about how replication works. MySQL replication is asynchronous, which means each server executes operations without waiting for another server to replicate them. It does this by logging every event and subsequently transmitting these events to connected slaves. This works perfectly—without conflicts—when you have a single master that accepts changes from client applications. That master can have any number of slaves that can execute queries for applications, but don't change the contents of the database.
Circular replication enforces replication from only one other master, so MySQL avoids some sorts of conflicts such as time-order conflicts.
A time-order conflict occurs when masters can replicate directly from two or more sources, and conflicting event pairs replicate to masters in a different order. This problem cannot occur in MySQL replication.
Circular replication introduces the possibility that two or more servers can replicate concurrent changes to each other. It is therefore is subject to conflicts, because it is possible that two servers update the same row at the same (or nearly the same) time and that each subsequently replicates its changes to the other server. If the replicated changes apply to the same row, then that row ends up with a different value on each server.
MySQL does not perform conflict resolution in such situations, so you have to make sure your application caters for that possibility.
Conflicts in circular replication occur when your application allows updating the same data (rows and related rows) on different servers. If you do not take care to avoid conflicts, you risk a situation where two servers accept conflicting changes at nearly the same time, and replicate them to the other side.
For example, imagine a product that costs $520:
|Operation||Sales server||Management server|
|(0) Initial price||$520||$520|
|(1) Promotions update (subtract $50)||$470|
|(2) Brands update (increase by 20%)||$624|
|(3) Promotions update replicates||$574|
|(4) Brands update replicates||$564|
|(5) Final price||$564||$574|
After each of the changes replicates to the other server—after being performed on its local server—the price on the sales server ends up being $564, and that on the management server (for the same product) is $574. There are no errors generated, because each server executes the statements in the order it receives them (either from a connected client application or a replicating master), which, due to the nature of asynchronous replication, is not always the same order on each server.
The preceding example describes a simple situation that could occur when you use statement-based replication. If you use row-based replication, you are not immune either, because there are several ways in which operations can conflict. Here are some further examples of conflicts that can occur whether you use statement- or row-based replication:
WHEREclause that should include a row that has just changed on the other server - the
UPDATEpropagates in one direction and the
DELETEin the other, resulting in a row that exists on one server but not the other.
In short, a conflict is likely to occur at some point if you allow changes to common data on two or more servers.
The simplest way to avoid conflicts in a circular replication topology is to ensure that each master server can only update rows that no other master server changes. For example:
If you need to improve the performance of your replicated network, you can then scale out each master to multiple slaves to allow for greater bandwidth when running read-only queries. In practice, this means that you can load-balance queries across the slaves, and dedicate the master to handling writes for its tables.
Of course, to avoid conflicts all together, you should avoid circular replication and ensure that only one master accepts writes. But where would be the fun in that?