MySQL Group Replication is a powerful feature that allows you to create a fault-tolerant and redundant database system, ensuring high availability and data consistency across multiple servers. The primary goal of this technology is to provide automatic failover and improve the overall system’s reliability.

If one of the servers in the group fails, the others can continue to operate, serving client requests and maintaining the database’s integrity.

To further reduce failover time, we have introduced a new primary election method: most up-to-date member. This method can be used in conjunction with the existing methods.

Working with our HeatWave customers with large datasets and stringent requirements has led to valuable insights. As a result, we are now offering new features to benefit our Enterprise customers. This new primary election method is just one example of the improvements we’ve made.

This feature introduces a more sophisticated approach for DBAs to influence the election process by considering data consistency. When a primary fails, Group Replication will select a new primary that is not only prioritized by the DBA but also has the most current data, reducing the need for extensive synchronization. This approach provides a seamless experience during failover, maintaining high availability and data integrity.

This new primary election method is available in MySQL Enterprise Edition version 9.3.0, including InnoDB Cluster and ClusterSet.

It’s yet another opportunity for customers to benefit from the capabilities of the Enterprise Edition of MySQL.

 

Understanding the Issue

With Group Replication, you can create a cluster of MySQL servers that work together to replicate data changes and maintain an up-to-date copy of your database on every instance. Transactions are committed on all servers in the group, ensuring data consistency, while the primary server runs in read/write mode. When the primary fails, the primary election mechanism must determine which server takes its place.

DBAs can configure the weight of each Group Replication member to express their preference for which member becomes the primary during a failover. This weight-based approach is a static preference: it does not consider the state of the member's data. As a result, a member that lags behind the others can be selected as the new primary, which extends the failover time.

The new primary election method examines the data on each member and chooses the one with the most up-to-date information, thus reducing the time required for failover.
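To get a feel for how far a member lags, a DBA can look at the applier queue of each member. The query below is a minimal sketch that uses the standard performance_schema tables replication_group_members and replication_group_member_stats. It is a monitoring aid only; the article does not say that the election uses this exact measure.

```sql
-- Sketch: list every group member with the number of remote transactions
-- still waiting in its applier queue. A large value means the member is
-- behind and would need to catch up before it could serve as primary.
SELECT m.MEMBER_HOST,
       m.MEMBER_PORT,
       m.MEMBER_ROLE,
       s.COUNT_TRANSACTIONS_REMOTE_IN_APPLIER_QUEUE AS applier_queue
FROM performance_schema.replication_group_member_stats AS s
JOIN performance_schema.replication_group_members AS m
  ON m.MEMBER_ID = s.MEMBER_ID
ORDER BY applier_queue DESC;
```

The statistics in this table are refreshed periodically, so treat the numbers as an approximation rather than an exact snapshot.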

 

Primary election method: Weighted

MySQL Group Replication member weight election is the traditional method used to influence the selection of a primary server in a group replication setup.

The DBA assigns each group member a weight, which expresses the preference for that member to become the primary when a primary election occurs. Choosing member weights carefully can reduce latency, for example by taking into account the proximity between servers and clients and weighting the members accordingly.

If a tie occurs during the primary election because the candidate members have the same weight, the lexical order of their server_uuid values acts as the tie breaker.
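As an illustration, the weight is controlled by the group_replication_member_weight system variable (range 0 to 100, default 50). The statements below are a sketch; the value 80 is an arbitrary example, not a recommendation.

```sql
-- Check the current weight of this member.
SELECT @@GLOBAL.group_replication_member_weight;

-- Raise the preference of this member; a higher weight is preferred
-- in a weighted election. SET PERSIST also keeps the value across restarts.
SET PERSIST group_replication_member_weight = 80;

-- MEMBER_ID holds each member's server_uuid, the value used as tie breaker
-- when weights are equal.
SELECT MEMBER_HOST, MEMBER_PORT, MEMBER_ID, MEMBER_ROLE
FROM performance_schema.replication_group_members
ORDER BY MEMBER_ID;
```

The variable is set per member, so the statements have to be run on each server whose preference you want to change.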

 

New primary election method: Most up-to-date

In addition to the weighted method, we have added a new most up-to-date method. This method shortens primary failover by looking at each member’s data to choose the new primary.

This method is evaluated before the group member weights are checked, and is therefore the main criterion for determining the new primary.

How it works

The new functionality is implemented in the component group_replication_elect_prefers_most_updated, which must be installed on every Group Replication member. To install the component, use the following command:

INSTALL COMPONENT 'file://component_group_replication_elect_prefers_most_updated';

The component’s enabled variable must be set to TRUE, which is its default value. The value of the variable can be confirmed using:

SELECT @@GLOBAL.group_replication_elect_prefers_most_updated.enabled;
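Because the method only takes effect when every member has the component, it can help to confirm the installation on each server. The following is a sketch that reads the standard mysql.component table, where the server records components loaded with INSTALL COMPONENT; it assumes the account has SELECT privilege on that table.

```sql
-- Sketch: returns one row if the election component is installed on this
-- server, and no rows otherwise. Run it on every member of the group.
SELECT component_urn
FROM mysql.component
WHERE component_urn = 'file://component_group_replication_elect_prefers_most_updated';
```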

When all members have the component installed and the enabled variable set to TRUE, the most up-to-date method is the first criterion Group Replication considers when electing the new primary member. If two or more members are tied, Group Replication uses the traditional election protocol to choose between the tied members.

When a primary failover happens, information about each member’s data is collected. The member determined to have the most up-to-date data will be elected as the new primary.
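The ordering described above can be modeled with a small self-contained query. This is a simplified sketch, not the component's implementation: the candidates data set, the applied_trx column (standing in for "how up to date a member is"), and the sample weights and UUIDs are all hypothetical.

```sql
-- Hypothetical surviving members after the primary has failed.
WITH candidates (member_name, applied_trx, member_weight, server_uuid) AS (
  SELECT 'M2', 98, 50, 'bbbbbbbb-0000-0000-0000-000000000002' UNION ALL
  SELECT 'M3', 88, 80, 'aaaaaaaa-0000-0000-0000-000000000003'
)
SELECT member_name AS new_primary
FROM candidates
ORDER BY applied_trx DESC,    -- 1. most up-to-date member first
         member_weight DESC,  -- 2. then the traditional weight
         server_uuid ASC      -- 3. then lexical order of server_uuid
LIMIT 1;
```

With this sample data the query returns M2: it has more transactions than M3, so M3's higher weight is never consulted. The weight and server_uuid only matter when members are tied on the first criterion.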

Figure: Primary failover with the most up-to-date method
 

Observability

The new component adds two new global status variables:

mysql> SELECT * FROM performance_schema.global_status WHERE VARIABLE_NAME LIKE 'Gr_latest_primary_election%';
+---------------------------------------------------------------+---------------------+
| VARIABLE_NAME                                                 | VARIABLE_VALUE      |
+---------------------------------------------------------------+---------------------+
| Gr_latest_primary_election_by_most_uptodate_members_trx_delta | 10                  |
| Gr_latest_primary_election_by_most_uptodate_member_timestamp  | 2024-07-01 12:50:56 |
+---------------------------------------------------------------+---------------------+

The variable Gr_latest_primary_election_by_most_uptodate_members_trx_delta is the difference in the number of transactions between the new primary and the next most up-to-date secondary, recorded when the most up-to-date method was used to select the primary.

For example, consider a group with three members where M1 is the primary with 100 transactions, M2 has 98, and M3 has 88. A primary failover using the most up-to-date method elects M2 as the new primary and sets Gr_latest_primary_election_by_most_uptodate_members_trx_delta to 10 (98 - 88).
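The arithmetic in this example can be reproduced with a short query. It is a sketch over hypothetical data: the candidates data set and the applied_trx column are illustrative names, and only the transaction counts come from the example above.

```sql
-- Surviving members from the example, after M1 (the old primary) has failed.
WITH candidates (member_name, applied_trx) AS (
  SELECT 'M2', 98 UNION ALL
  SELECT 'M3', 88
),
ranked AS (
  SELECT member_name,
         applied_trx,
         -- transaction count of the next most up-to-date member
         LEAD(applied_trx) OVER (ORDER BY applied_trx DESC) AS next_applied_trx
  FROM candidates
)
SELECT member_name AS new_primary,
       applied_trx - next_applied_trx AS trx_delta
FROM ranked
ORDER BY applied_trx DESC
LIMIT 1;
```

The result is a single row with new_primary = M2 and trx_delta = 10, matching the value shown for the status variable.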

The variable Gr_latest_primary_election_by_most_uptodate_member_timestamp is updated whenever a new primary is chosen using the most-up-to-date selection method.

 

Conclusion

In a MySQL Group Replication system, minimizing failover time is key. The new primary election method, most up-to-date member, available in MySQL Enterprise Edition 9.3.0, offers a significant improvement in handling automatic failover scenarios. By considering how up to date each member’s data is, Group Replication can make more informed decisions when electing a new primary. This approach ensures that the chosen primary holds the most recent data while still respecting DBA preferences, thereby reducing failover time.

With this enhancement, MySQL Group Replication becomes even more robust and adaptable, empowering DBAs to manage their database infrastructure effectively, especially in critical environments where data consistency and high availability are paramount. It demonstrates the ongoing evolution of MySQL to meet the ever-increasing demands of modern database applications.

With the continuous feedback and collaboration from our HeatWave customers, we are able to learn, improve, and introduce new features tailored for our Enterprise customers.

MySQL Group Replication is just one of the many benefits of upgrading to MySQL Enterprise Edition.

 
