To cluster or not to cluster
By lindaschneider on Feb 07, 2007
For example, my morning was spent thinking about training for the next release of MQ.
This reminded me that we have free training for the 4.0 release of the product that I'm sure no one knows about (I wouldn't if I hadn't stayed up for several nights in a row writing it). If you are new to MQ, new to JMS, or just have a few hours to kill, it's available for free at http://www.sun.com/training/catalog/enterprise/application.xml
Ok - on to what I originally planned to talk about ....
Warning: This is going to be one of those blog entries that could serve as a replacement for sleeping pills ... still, if you want to know a little bit about MQ clustering, here it is.
Now - if you have a burning desire to hear about something different, let me know. Until then I'll just have to write about whatever comes to mind.
(I like the idea of adding in nice pretty pictures, but I don't think I'm going to have time for this entry. We'll see. Right now I'm sticking to ugly ones.)
What are clusters
A cluster is a group of servers that are tied together to address scalability or availability. In MQ, our servers currently address scalability and service availability, but they don't address message availability (yet). I'll talk about how we did High Availability (an MQ 4.1 feature), which provides message availability, in a separate blog - it deserves its own attention. Clusters are generally broken into several types of configurations:
- Disconnected servers - servers provide scalability but don't communicate with each other.
- Fully connected clusters - every server speaks with every other server.
- Asymmetric clusters - servers are connected together in some way, but a server doesn't connect to everyone else.
- Primary/Backup clusters - one server mirrors the other and can take over if a failure occurs.
- Store and forward clusters - only some information passes between the servers or clusters of servers (e.g. a customer configures what is passed on). It's useful if you have servers running in different geographical areas over a WAN (where a fully connected setup just wouldn't make sense).
What does message queue do with clusters
Message Queue currently only supports Fully Connected Clusters. The original intention of clustering in message queue was to provide scaling (the ability to add more consumers in the system) and service availability (the ability for a producer or consumer to continue to send and receive messages in the case of failure). We chose this clustering scheme because it seemed the simplest choice to give us the behavior we wanted. If we had gone for Asymmetric clusters, then we'd also have had to deal with all the issues required to calculate the path that takes the fewest hops (and pass a lot of information about how the cluster is connected between brokers).
Actually we also support the idea of disconnected servers (in fact we have some customers using it for failover) because it was absolutely no work (once we put in the reconnect behavior for clusters on the client side we were done).
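To make the hop-count trade-off a bit more concrete, here is a rough sketch in Python (not MQ code; the broker names and the `hops` helper are made up for illustration). In a fully connected cluster every broker is exactly one hop from every other, so no path calculation is needed. In an asymmetric layout, something like a breadth-first search is needed to find the shortest path, and each broker has to know the topology to run it.

```python
from collections import deque

def fully_connected(brokers):
    """Build a topology where every broker links directly to every other."""
    return {b: {o for o in brokers if o != b} for b in brokers}

def hops(links, src, dst):
    """Breadth-first search: fewest hops from src to dst, or None if unreachable."""
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == dst:
            return dist
        for nxt in links[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

brokers = ["A", "B", "C", "D"]  # hypothetical broker names
full = fully_connected(brokers)

# A hypothetical asymmetric layout: a simple chain A - B - C - D.
chain = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}

# Each link is counted once per endpoint, so halve the total.
link_count = sum(len(peers) for peers in full.values()) // 2

print(hops(full, "A", "D"))   # direct connection
print(hops(chain, "A", "D"))  # has to pass through B and C
print(link_count)             # n * (n - 1) / 2 links for n brokers
```

The price of the simple scheme is the number of connections, which grows as n * (n - 1) / 2 for n brokers.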
Where are we looking at going in the future
- support data availability during failover (high availability) - targeted for 4.1
- support some sort of bridge (to allow store and forward clusters) - targeted for a future release.
How do brokers communicate?
Message Queue brokers send messages around the cluster. The messages are either broadcast (sent to all brokers in the cluster) or unicast (sent to only a specific broker). The format of these messages is a GPacket. Generally this packet is composed of three sections:
- a header - which includes a magic number, version, size (for parsing the rest of the packet) and enough information (timestamp, sequence) to create a unique id. A set of bit flags is also attached to the header to make it easy to mark packets in a specific way (e.g. the A bit, standing for acknowledge, if a reply should be sent)
- a variable length header - these are name, value pairs
- the body - a byte buffer designed to hold any type of data.
- the body - a byte buffer designed to hold any type of data.
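The three-section layout can be sketched roughly as follows. This is an illustration only, not the real GPacket wire format: the field widths, the field order, the magic number value, the `A_BIT` value and the length-prefixed string encoding are all assumptions made for the example.

```python
import struct
from dataclasses import dataclass, field

MAGIC = 0x4D514750   # hypothetical magic number, not the real one
VERSION = 1          # hypothetical protocol version
A_BIT = 0x1          # hypothetical flag value: a reply should be sent

# Assumed fixed header: magic, version, type, flags, timestamp,
# sequence, size of the name/value section, size of the body.
HEADER = struct.Struct(">IHHIQIII")

def _pack_props(props):
    """Encode name/value pairs as length-prefixed UTF-8 strings."""
    out = bytearray()
    for name, value in props.items():
        for text in (name, value):
            raw = text.encode("utf-8")
            out += struct.pack(">H", len(raw)) + raw
    return bytes(out)

def _unpack_props(data):
    """Decode the length-prefixed strings back into a dict."""
    items, pos = [], 0
    while pos < len(data):
        (n,) = struct.unpack_from(">H", data, pos)
        pos += 2
        items.append(data[pos:pos + n].decode("utf-8"))
        pos += n
    return dict(zip(items[0::2], items[1::2]))

@dataclass
class Packet:
    ptype: int
    timestamp: int
    sequence: int
    flags: int = 0
    props: dict = field(default_factory=dict)
    body: bytes = b""

    def encode(self):
        props = _pack_props(self.props)
        header = HEADER.pack(MAGIC, VERSION, self.ptype, self.flags,
                             self.timestamp, self.sequence,
                             len(props), len(self.body))
        return header + props + self.body

    @classmethod
    def decode(cls, data):
        (magic, _version, ptype, flags, ts, seq,
         psize, bsize) = HEADER.unpack_from(data)
        if magic != MAGIC:
            raise ValueError("bad magic number")
        start = HEADER.size
        props = _unpack_props(data[start:start + psize])
        body = data[start + psize:start + psize + bsize]
        return cls(ptype, ts, seq, flags, props, body)

packet = Packet(ptype=7, timestamp=1170806400000, sequence=42,
                flags=A_BIT, props={"dest": "orders"}, body=b"hello")
wire = packet.encode()
assert Packet.decode(wire) == packet
```

The sizes in the fixed header are what let a reader skip straight to the body, and the timestamp plus sequence pair is the part that would serve as the unique id.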
The specifics of the protocol (what we put in the packet) are here. But the general gist is that control messages (messages that are sent when a consumer is added, a new destination is created or a lock is needed) go to all brokers in the cluster ... messages and JMS acknowledgements go to only the brokers responsible for this message.
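As a rough illustration of that routing rule (again a sketch, not broker code): the packet kind names, the `recipients` function and the choice to exclude the sender from a broadcast are all assumptions made for the example.

```python
# Hypothetical names for the control events mentioned above.
CONTROL_KINDS = {"consumer_added", "destination_created", "lock_request"}

def recipients(kind, sender, cluster, responsible=()):
    """Return the set of brokers a packet of this kind is sent to."""
    if kind in CONTROL_KINDS:
        # Broadcast: every other broker in the cluster hears about it.
        return set(cluster) - {sender}
    # Unicast: messages and acknowledgements only go to the brokers
    # responsible for that message.
    return set(responsible) - {sender}

cluster = ["A", "B", "C"]  # hypothetical broker names
print(recipients("consumer_added", "A", cluster))
print(recipients("message", "A", cluster, responsible=["C"]))
```

Keeping the message traffic unicast is what stops a fully connected cluster from flooding every broker with every message.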
At some point, I'll talk about the ins and outs (what we send, and where) but not today. Maybe next I'll talk about High Availability (possible solutions and why we chose the one we did). Then again, maybe not.