Quorum Device - Why Configure One?
By ira on Oct 17, 2007
The quorum component of Solaris Cluster (SC) guarantees that a cluster does not suffer from either of two kinds of partition, namely split brain and amnesia. Both can lead to data corruption in a cluster, and the quorum component prevents this from happening in an SC configuration. Split brain is a partition in space, where subclusters are up but not able to talk to each other. Amnesia is a partition in time, where a given cluster incarnation is not aware of the previous incarnation.
Quorum uses a voting mechanism in SC to prevent partitions. Each node is assigned a vote, and a cluster needs a majority of the votes in order to stay up. For a cluster with more than two nodes, it is straightforward to deduce why such a voting scheme rules out both split brain and amnesia. See the Solaris Cluster Concepts Guide for a detailed discussion of this concept.
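The split brain half of that deduction can be checked by brute force. The following is a minimal sketch, not SC code: the function names are made up for illustration, and it assumes one vote per node and no quorum devices. It tries every way of splitting a cluster into two groups and confirms that both groups never hold a majority at the same time.

```python
from itertools import combinations


def majority(total_votes: int) -> int:
    """Smallest vote count that is strictly more than half of the total."""
    return total_votes // 2 + 1


def has_quorum(votes_held: int, total_votes: int) -> bool:
    return votes_held >= majority(total_votes)


def both_sides_can_win(node_count: int) -> bool:
    """True if some split of the nodes into two groups gives both groups quorum."""
    nodes = range(node_count)
    for size in range(node_count + 1):
        for group in combinations(nodes, size):
            rest = node_count - len(group)  # votes held by the other group
            if has_quorum(len(group), node_count) and has_quorum(rest, node_count):
                return True
    return False


# One vote per node, cluster sizes 3 through 7 (illustrative range).
split_brain_possible = {n: both_sides_can_win(n) for n in range(3, 8)}
print(split_brain_possible)
# {3: False, 4: False, 5: False, 6: False, 7: False}
```

The same arithmetic covers amnesia: two majorities drawn from the same set of votes must share at least one vote, so a new incarnation that has quorum always includes a member of the previous one.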
For a two-node cluster, the voting mechanism requires an external tie-breaker, which is provided by a quorum device (QD). A QD is not required for clusters with more than two nodes. However, configuring one gives the cluster greater availability when multiple nodes fail. In fact, if you configure a fully connected QD (i.e., one connected to all nodes) in an N-node cluster, the cluster can survive the failure of (N-1) nodes. The lone surviving node will stay up and running and, assuming capacity planning has ensured that it can handle the entire load of the system, will be able to service all client requests.
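The (N-1) claim follows directly from the vote arithmetic. Here is a small sketch of it, assuming one vote per node, a single fully connected QD worth (N-1) votes, and a survivor that can still reach the QD. The function names are illustrative only.

```python
def majority(total_votes: int) -> int:
    """Smallest vote count that is strictly more than half of the total."""
    return total_votes // 2 + 1


def lone_survivor_has_quorum(node_count: int, with_qd: bool) -> bool:
    """Can a single surviving node keep an N-node cluster up?"""
    node_votes = node_count                       # one vote per node
    qd_votes = node_count - 1 if with_qd else 0   # fully connected QD: N-1 votes
    total = node_votes + qd_votes
    survivor_votes = 1 + qd_votes                 # own vote plus the QD's votes
    return survivor_votes >= majority(total)


for n in range(2, 7):
    print(n, lone_survivor_has_quorum(n, with_qd=False),
          lone_survivor_has_quorum(n, with_qd=True))
```

With the QD, the total is 2N-1 votes, the majority is N, and the survivor holds exactly 1 + (N-1) = N. Without the QD, a single vote is never a majority for N >= 2, which is also why the two-node case needs the tie-breaker.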
This is best illustrated through simple examples. Note that SC guarantees protection from single points of failure, so in all cases, the cluster will stay up if one of the nodes dies.
Example 1: 3 Node Cluster, No Quorum Device
Consider a 3 node cluster, with nodes A, B and C. Each has 1 vote.
Total votecount = 3
Majority votecount = 2
That is, 2 of the 3 nodes need to be up for the cluster to stay up. So the cluster can survive only single-node failures; it cannot survive the failure of more than one node.
Example 2: 3 Node Cluster, 1 Quorum Device
Consider a 3 node cluster, with nodes A, B and C, and a quorum device QD connected to all three nodes. Each node has a votecount of 1. The QD is assigned a votecount of (#connected votes - 1) = 3 - 1 = 2.
Total votecount = 3 + 2 = 5
Majority votecount = 3
Here, if two of the nodes die, the cluster still has the surviving node's vote. It also has the QD's 2 votes, bringing the cluster votecount to 3. This is sufficient to keep the cluster up!
Example 3: 3 Node Cluster, 2 Quorum Devices with Restricted Connections
Consider a 3 node cluster, with nodes A, B and C, with one QD configured between nodes A and B, and another QD configured between nodes A and C. Each QD is assigned (#connected votes - 1) = 2 - 1 = 1 vote.
Total votecount = 3 + 1 + 1 = 5
Majority votecount = 3
Here, if nodes B and C were to die, node A can continue as a cluster, since it can count its own vote as well as those of the two QDs connected to it, for a total of 3 votes, the minimum required. However, if A dies along with B (or C), the sole survivor C (or B) can count only its own vote and that of the one QD connected to it, for a total of 2 votes, which is lower than the minimum required (3). Hence, the cluster goes down.
Of these examples, the cluster in Example 2 has the best availability characteristic (it also happens to be the maximum possible). The cluster in Example 1 has the worst of the three, and the cluster in Example 3 has better availability than the cluster in Example 1 because of the QDs configured in it.
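The three examples can be compared side by side with a few lines of code. This is a sketch under simplifying assumptions rather than a model of SC itself: each node has 1 vote, each QD has (#connected nodes - 1) votes, and a QD's votes go to the surviving nodes whenever at least one node connected to it is still up. All names are illustrative.

```python
def majority(total_votes):
    """Smallest vote count that is strictly more than half of the total."""
    return total_votes // 2 + 1


def qd_votes(connected_nodes):
    """A QD is worth (#connected nodes - 1) votes."""
    return len(connected_nodes) - 1


def survives(alive, nodes, qds):
    """Do the nodes in `alive` hold a majority of all configured votes?"""
    total = len(nodes) + sum(qd_votes(c) for c in qds)
    # A QD counts only if some surviving node is connected to it.
    held = len(alive) + sum(qd_votes(c) for c in qds if alive & c)
    return held >= majority(total)


NODES = frozenset("ABC")
EXAMPLES = {
    "Example 1": [],                                    # no QD
    "Example 2": [frozenset("ABC")],                    # one fully connected QD
    "Example 3": [frozenset("AB"), frozenset("AC")],    # two restricted QDs
}


def lone_survivors(qds):
    """Nodes that can keep the cluster up after both other nodes die."""
    return sorted(n for n in NODES if survives(frozenset(n), NODES, qds))


report = {name: lone_survivors(qds) for name, qds in EXAMPLES.items()}
for name, survivors in report.items():
    print(name, survivors)
# Example 1 []
# Example 2 ['A', 'B', 'C']
# Example 3 ['A']
```

The output matches the ranking above: no node can survive alone in Example 1, any node can in Example 2, and only node A can in Example 3.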
Since a QD can also be used to share data, it is good practice to configure QDs in a cluster to improve the availability of the system.
Word of Caution
When configuring QDs in the system, be careful not to overconfigure them, as that makes the cluster vulnerable to QD failures and thus lowers its availability. Again, this is best illustrated through a simple example.
Example 4: 3 Node Cluster, 2 Quorum Devices
Consider a 3 node cluster, with nodes A, B and C, and two fully connected QDs. Each QD has (#connected votes - 1) = 3 - 1 = 2 votes.
Total votecount = 3 + 2 + 2 = 7
Majority votecount = 4
Here, if the two QDs failed and there was a cluster reconfiguration, the cluster would go down even if all the nodes were healthy. This is because the nodes by themselves can contribute only 3 votes, which does not constitute a majority of 7.
Rule of Thumb
Therefore, when configuring QDs in the cluster, remember: Total number of QD votes < Total number of node votes
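The rule of thumb is another way of saying that the nodes by themselves must still hold a majority when every QD has failed. The sketch below checks that the two statements agree over a range of vote counts. The function names and the ranges are arbitrary choices for illustration.

```python
def majority(total_votes):
    """Smallest vote count that is strictly more than half of the total."""
    return total_votes // 2 + 1


def nodes_alone_have_quorum(node_votes, qd_votes):
    """Can the cluster stay up on node votes alone if all QDs fail?"""
    return node_votes >= majority(node_votes + qd_votes)


def rule_of_thumb_ok(node_votes, qd_votes):
    """Total number of QD votes < total number of node votes."""
    return qd_votes < node_votes


# Compare the two conditions for every combination in an arbitrary range.
mismatches = [
    (n, q)
    for n in range(1, 17)
    for q in range(0, 40)
    if nodes_alone_have_quorum(n, q) != rule_of_thumb_ok(n, q)
]
print(mismatches)  # []

print(rule_of_thumb_ok(3, 2))  # Examples 2 and 3 (2 QD votes): True
print(rule_of_thumb_ok(3, 4))  # Example 4 (4 QD votes): False
```

Examples 2 and 3 each carry 2 QD votes against 3 node votes and pass the rule. Example 4 carries 4 QD votes against 3 node votes and fails it.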
A Nifty Tool
Richard Elling at Sun has written a nice spreadsheet that lets you specify nodes and QD connections in your cluster, and get the resulting cluster failure modes of the system in the event of node failures. That tool can be found here.