By Nicholas Solter on Apr 19, 2007
As I was putting my son to bed the other night, it occurred to me that S and I practice “high-availability parenting.” What does that mean?
First, some background. In the technology world, some services need to be available essentially all the time. Think of the credit card infrastructure: when you run your card at the gas station, the database that stores your credit limit, current balance, etc. had better be online! Otherwise your card will be rejected, you'll try a different card, and the first bank will lose your business on that transaction.
However, because of hardware failure rates, software bugs, people tripping on power cords, etc., a single machine isn't reliable enough for important applications like your bank's credit card database. To provide higher availability, these applications can be run on a group, or cluster, of physical machines, with software that automatically fails the application over from one computer to the next when something goes wrong (hardware or otherwise). At Sun, I work on one such high-availability cluster product, called Solaris Cluster.
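For the programmers in the audience, here is a toy sketch of that failover idea in Python. It is only an illustration under simplifying assumptions, not how Solaris Cluster actually works: the names Node, Cluster, and check_and_failover are made up for this example, and "health" is just a flag rather than a real heartbeat or hardware check.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """A hypothetical cluster member; 'healthy' stands in for a real health check."""
    name: str
    healthy: bool = True


class Cluster:
    """Toy model: one node is active at a time, and the service moves on failure."""

    def __init__(self, nodes: List[Node]) -> None:
        if not nodes:
            raise ValueError("a cluster needs at least one node")
        self.nodes = list(nodes)
        self.active: Optional[Node] = self.nodes[0]

    def check_and_failover(self) -> Optional[Node]:
        """Keep the active node if it is healthy; otherwise pick the first healthy one.

        Returns None when every node is down, meaning the service is unavailable.
        """
        if self.active is not None and self.active.healthy:
            return self.active
        self.active = next((n for n in self.nodes if n.healthy), None)
        return self.active


def demo() -> list:
    """Walk a two-node cluster through one failure, then a total outage."""
    a, b = Node("node-a"), Node("node-b")
    cluster = Cluster([a, b])
    history = [cluster.check_and_failover().name]      # node-a serves requests
    a.healthy = False                                  # someone trips on a power cord
    history.append(cluster.check_and_failover().name)  # service fails over to node-b
    b.healthy = False                                  # both nodes down
    history.append(cluster.check_and_failover())       # nobody left to serve
    return history
```

In this sketch the service stays on whichever node is currently active until that node fails, rather than hopping back as soon as the original node recovers. That is a design choice made for the example, not a claim about any particular product.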
Now back to the kids. Young children require 100% availability of the “parenting service.” They need someone on call constantly for food, entertainment, safety, etc. Both S and I can provide these services, so we switch off. In this way, we're like a two-node cluster. When one of us is “unavailable” (programming, cleaning, relaxing, etc.), the other is “active.” Most of our “downtime” is of the planned variety – sort of like a planned software upgrade. However, there's certainly some unplanned downtime that comes up now and then – if one of us wakes up with a fever, the other takes over (most likely at the expense of the “upgrades” planned for that day)!
To take the analogy a bit further than it should probably go, what happens when we both plan downtime simultaneously, say, to go to a movie? Luckily, we have a backup two-node cluster nearby: my in-laws!