Achieving High Availability with PostgreSQL using Solaris Cluster

Quick announcement: Sun has just released Solaris Express Developer Edition (SXDE) 1/08, a developer version that has all the latest and greatest Solaris features and tools. So what's new for PostgreSQL? The main updates are PostgreSQL 8.2.5, the Perl driver, and pgAdmin III v1.6.3.

High availability (HA) is very important for most applications. PostgreSQL has an HA feature called warm standby, or log shipping, which allows one or more secondary servers to take over when the primary server fails. However, there are some limitations with PostgreSQL warm standby, two of which are:

  1. No mechanism in PostgreSQL to identify and perform automatic failover when the primary server fails
  2. No read-only support on the standby server
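For context, warm standby in the 8.2 timeframe is wired together by hand: the primary ships completed WAL segments with archive_command, and the standby replays them in a restore loop until it is told to come online. A minimal sketch, assuming an archive directory visible to both servers (the paths and the wait_for_wal helper script are illustrative, not from this post):

```
# postgresql.conf on the primary:
archive_command = 'cp %p /export/walarchive/%f'

# recovery.conf on the standby; wait_for_wal is a hypothetical script
# that blocks until the requested WAL segment shows up in the archive,
# keeping the standby in continuous recovery:
restore_command = '/opt/pgha/wait_for_wal /export/walarchive/%f %p'
```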

Fortunately, Solaris Cluster can be used to solve limitation #1. Solaris Cluster is a robust clustering solution that has been around for many years, and best of all, it's open source and free.

Detlef Ulherr has recently implemented the Solaris Cluster agent to work with PostgreSQL warm standby. We discussed different possible use-case scenarios with PostgreSQL warm standby, and he came up with a design that I think will work well for non-shared storage clustering. Maybe not everything will be perfect initially, but as more people test out the agent, we'll know what can be improved. The agent is still in beta now and will be released soon, probably in a couple of months.

The cool thing with Solaris Cluster is that you can now set up a cluster on a single node using multiple Solaris Zones. This is extremely useful because it eliminates the need for multiple machines or a complicated hardware setup if you just want to try it out or if you want a simple environment for doing development. There are more details available on the Solaris Cluster and Zones integration.
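As a rough sketch of what the single-node setup involves, here is how one of the two zones might be created (zone names, the zonepath, the IP address, and the network interface are all illustrative; a cluster-enabled zone configuration has additional steps, so treat this as orientation rather than a recipe):

```shell
# Create, install, and boot the first zone; repeat for pgzone2
# with its own zonepath and IP address.
zonecfg -z pgzone1 <<EOF
create
set zonepath=/zones/pgzone1
set autoboot=true
add net
set address=192.168.1.11
set physical=e1000g0
end
commit
EOF
zoneadm -z pgzone1 install
zoneadm -z pgzone1 boot
```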

Of course you wouldn't want to deploy your HA application on a single machine. In a production environment, you should have at least a two-node cluster. Please refer to the Solaris Cluster documentation for more info.
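To give a feel for the cluster side of the setup, the logical host and its resource group are created with the Solaris Cluster 3.2 command-line tools. This is a hypothetical outline only: the group and hostname below are illustrative, and the exact resource type name registered by the warm standby PostgreSQL agent may differ once it is released.

```shell
# Create a failover resource group and add a logical hostname
# resource to it; clients will connect to pg-lh, not to either node.
clresourcegroup create pg-rg
clreslogicalhostname create -g pg-rg pg-lh

# Bring the group (and its monitors) online on the primary node.
clresourcegroup online -M pg-rg
```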

In my recent test, I set up the cluster on a single machine with two Solaris Zones. Here's how the automatic failover works in a nutshell:

  • Client connects to a logical host. The logical host is configured to have two resource groups, each containing a Zone.
  • The logical host initially points to the primary server (Zone 1) where PostgreSQL is running, and PostgreSQL is configured to ship WAL logs to the secondary server (Zone 2) where PostgreSQL is running in continuous recovery mode.
  • When the primary server fails, the Solaris Cluster agent detects the failure and triggers the logical host to automatically switch to the IP address of the secondary server.
  • From the PostgreSQL client's perspective, it's still using the same logical host, but the actual PostgreSQL server has moved to a different machine, and the move happens transparently.
  • If the client was connecting to the DB on the primary, the session would be disconnected momentarily and reconnected automatically to the DB on the secondary server, and the application would continue on its merry way.
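The reconnect step above can be sketched from the client side as a simple retry wrapper: keep re-running a command (for example, a psql query against the logical host) until it succeeds again, riding out the brief outage while the logical host moves. This is a minimal illustration, not part of the agent; the pg-lh hostname and the attempt count are made up for the example.

```shell
# retry ATTEMPTS CMD [ARGS...] -- re-run CMD until it exits 0,
# giving up after ATTEMPTS failures.
retry() {
    attempts=$1
    shift
    i=0
    until "$@"; do
        i=$((i + 1))
        if [ "$i" -ge "$attempts" ]; then
            return 1    # gave up: failover took too long
        fi
        sleep 1         # back off briefly before reconnecting
    done
    return 0
}

# Example: keep retrying a query against the logical host for ~30s:
#   retry 30 psql -h pg-lh -d mydb -c 'SELECT 1'
```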

The combination of PostgreSQL warm standby and Solaris Cluster can provide an enterprise-class clustering solution for free (unless you need support services, of course). So, please try it out and provide your feedback on what can be improved.

In my next blog, I will discuss how Solaris ZFS and Zones can be used in a clever way to overcome limitation #2. This idea has been used already, and some of you may have seen Theo's blog on this topic. I will provide working sample code and step-by-step instructions for setting it up.



"The logical host initially points to the primary server (Zone 1) where PostgreSQL is running, and PostgreSQL is configured to ship WAL logs to the secondary server (Zone 2) where PostgreSQL is running in continuous recovery mode."

So what this basically is, is the equivalent of Oracle DataGuard functionality.

This is failover cluster mode, not HA mode. In HA mode, all nodes in the cluster participate in both DML and DDI statements, and the load between database nodes is shared (Oracle calls this concept the grid, hence the "g" in 10g and 11g).

An example of HA DB setup would be Oracle RAC.

Posted by UX-admin on November 13, 2008 at 08:02 PM CST #

Typeo correction: DDI -> DDL.

Posted by UX-admin on November 13, 2008 at 08:04 PM CST #

I would rarely recommend automatic failover because when there is an issue you want to know what it is first before failing over to the standby, especially if your standby is at a remote disaster recovery site.

Posted by Enzo on February 12, 2009 at 08:17 AM CST #
