By relling on Jun 15, 2004
I am part of a team that is taking a new look at NFS clusters. We've actually been doing this for a long time, but we're now focusing on a few new techniques that should identify key areas where we can improve the availability of NFS services. The current release of Sun Cluster, internally called SC3.1u2 and known to the rest of the world as Sun Cluster 3.1 4/04, showed significant improvement in the recovery of NFS services as measured by clients. I have to use the term "significant" here because it is not easy to describe these sorts of improvements accurately with a single metric. I would say recovery is already quite good, but we know we can improve it further, and we are taking a slightly different approach than in the past. As is often the case with such complex services, there are some failure modes that take longer to recover from than we'd like. We are also constrained not to make changes on the client, which is a serious constraint for NFS service availability.
I'll try to use this blog to communicate some of the gotchas we know about with NFS services in a highly available environment. Stay tuned...