
Why is SUNW.nfs required to configure HA-NFS over ZFS in Solaris Cluster?

Guest Author

The typical way to configure a highly available NFS file system in a Solaris Cluster environment is to use the HA-NFS agent (SUNW.nfs). The SUNW.nfs agent takes care of sharing the file systems that have to be exported.

With ZFS supported as a failover file system in SUNW.HAStoragePlus, there are two possible ways to configure a highly available NFS file system with ZFS as the underlying file system:

  1. Enable the ZFS sharenfs property (i.e., sharenfs=on) for the file systems of the zpool, without using SUNW.nfs.
  2. Disable the ZFS sharenfs property (i.e., sharenfs=off) for the file systems of the zpool and let SUNW.nfs do the actual sharing.

Of these two approaches, HA-NFS works correctly only when the SUNW.nfs agent is used (option 2). This blog explains why SUNW.nfs is required to configure a highly available NFS file system with ZFS in a cluster environment.

Lock reclaiming by Clients (NFSv[23])

statd(1M) keeps track of the clients and processes that hold locks on the server. The server can use this information to let clients reclaim their locks after an NFS server reboot or failover.

When a file system is shared by setting the ZFS sharenfs property to on, without SUNW.nfs, the lock information is kept under /var/statmon, which is on a local file system and specific to one host. After a failover, the stored information is therefore not available on the node that takes over the service, so the server cannot ask the clients to reclaim their locks.

The SUNW.nfs agent overcomes this problem by keeping the monitor information in stable storage on multi-ported disks, which is accessible from all cluster nodes.
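The effect can be illustrated with a small toy model. This is only a sketch, not how statd is implemented: temporary directories stand in for /var/statmon and for the shared stable storage, and the names monitor_client, clients_to_notify, failover, node1, node2, and client-a are all hypothetical.

```python
import tempfile
from pathlib import Path
from typing import List


def monitor_client(statmon_dir: Path, client: str) -> None:
    """Record that `client` holds a lock, as one empty file per host."""
    statmon_dir.mkdir(parents=True, exist_ok=True)
    (statmon_dir / client).touch()


def clients_to_notify(statmon_dir: Path) -> List[str]:
    """Return the hosts a restarted server could ask to reclaim locks."""
    if not statmon_dir.is_dir():
        return []
    return sorted(p.name for p in statmon_dir.iterdir())


def failover(shared: bool) -> List[str]:
    """Take a lock via node1, fail over to node2, list who node2 can notify."""
    with tempfile.TemporaryDirectory() as root:
        base = Path(root)
        if shared:
            # Both nodes use the same directory (stand-in for shared storage).
            node1_dir = node2_dir = base / "shared" / "statmon"
        else:
            # Each node has a private directory (stand-in for /var/statmon).
            node1_dir = base / "node1" / "statmon"
            node2_dir = base / "node2" / "statmon"
        monitor_client(node1_dir, "client-a")
        return clients_to_notify(node2_dir)


if __name__ == "__main__":
    print("node-local:", failover(shared=False))  # node-local: []
    print("shared:", failover(shared=True))       # shared: ['client-a']
```

With node-local records, the node that takes over has nobody to notify. With shared records, it sees the same list of hosts as the failed node did.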

State information of clients (NFSv4)

NFSv4 is a stateful protocol: nfsd(1M) keeps track of client state, such as open and locked files, in stable storage.

When a file system is shared by setting the ZFS sharenfs property to on, the stable storage is under /var/nfs, which is not accessible from all nodes of the cluster. After a server failover, the clients' reclaim requests then fail, which might cause client applications to exit (unless they catch the SIGLOST signal).

The SUNW.nfs agent overcomes this problem by keeping the state information in stable storage that is shared among the cluster nodes, which lets the server allow clients to reclaim their state.
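The following toy model sketches why an NFSv4 reclaim fails when the client records live on node-local storage. It is an illustration under simplifying assumptions, not the real nfsd logic: the class ToyNfs4Server, the JSON state file, and the names node1, node2, and client-a are hypothetical, and the real server's on-disk format is different.

```python
import json
import tempfile
from pathlib import Path


class ToyNfs4Server:
    """Toy NFSv4 server that persists known client IDs to `state_file`."""

    def __init__(self, state_file: Path) -> None:
        self.state_file = state_file
        # On start, load whatever client records survive in stable storage.
        if state_file.exists():
            self.known = set(json.loads(state_file.read_text()))
        else:
            self.known = set()

    def open_file(self, client_id: str) -> None:
        """Record the client in stable storage when it establishes state."""
        self.known.add(client_id)
        self.state_file.parent.mkdir(parents=True, exist_ok=True)
        self.state_file.write_text(json.dumps(sorted(self.known)))

    def reclaim(self, client_id: str) -> bool:
        """After a restart, only previously recorded clients may reclaim."""
        return client_id in self.known


def reclaim_after_failover(shared: bool) -> bool:
    """Open a file via node1, fail over to node2, then try to reclaim."""
    with tempfile.TemporaryDirectory() as root:
        base = Path(root)
        # Shared: both nodes read the same file. Local: one file per node.
        node1_state = base / ("shared" if shared else "node1") / "v4_state.json"
        node2_state = base / ("shared" if shared else "node2") / "v4_state.json"
        ToyNfs4Server(node1_state).open_file("client-a")
        return ToyNfs4Server(node2_state).reclaim("client-a")


if __name__ == "__main__":
    print("node-local:", reclaim_after_failover(shared=False))  # False
    print("shared:", reclaim_after_failover(shared=True))       # True
```

In the node-local case the second server starts with an empty record set and rejects the reclaim. In the shared case it loads the record written before the failover and accepts it.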

The two configurations are contrasted in the figures below.

[Figure: HA-NFS without SUNW.nfs]

[Figure: HA-NFS using SUNW.nfs]

To put it more precisely, the ZFS sharenfs property is not meant to work in a Solaris Cluster environment, so using the SUNW.nfs agent is a must for HA-NFS over ZFS.

SUNW.nfs keeps this information in stable storage on the highly available ZFS file system, at the location given by the Pathprefix property of the resource group that contains the SUNW.nfs resource.

Venkateswarlu Tella (Venku)
Solaris Cluster Engineering

Join the discussion

Comments ( 8 )
  • Wayne Dovey Wednesday, April 16, 2008

    Hi Venkateswarlu

    Do you have some configuration guides on how to get this working?

    They are very hard to find. Any help would be appreciated.



  • Maurice Volaski Monday, July 14, 2008

    I'm wondering what would be required if there wasn't shared storage and AVS were mirroring the local ZFS pool from a hot system to a standby. Would it be possible to forgo this complexity by simply making /var/statmon and /var/nfs symlinks to directories on the mirrored ZFS pool?

  • venku Tuesday, July 15, 2008

    It should work as long as AVS has synced the latest data to the standby node at the point the primary node goes down.

  • Manuel Tuesday, January 6, 2009


    I was doing some tests with an HA-NFS cluster and NFS clients, and on one Solaris 10 Update 5 client I got the following NFS error:

    NFS compound failed for server ha-nfs: error 5 (RPC: Timed out)

    NFS compound failed for server ha-nfs: error 5 (RPC: Timed out)

    NFS compound failed for server ha-nfs: error 5 (RPC: Timed out)

    NFS compound failed for server ha-nfs: error 5 (RPC: Timed out)

    NFS compound failed for server ha-nfs: error 5 (RPC: Timed out)

    NFS compound failed for server ha-nfs: error 5 (RPC: Timed out)

    nfs mount: mount: /nfs-filesystem: Connection timed out

    I got this error after an NFS failover, and I cannot mount the NFS file system again. The problem affected only this client; the others were fine. Like the others, this client was running tar, find, and cp over NFS (stress tests).

    After looking for information on the internet, I found that it may be related to some information held by the RPC services for NFS, maybe lockd and nfsd. I rebooted the cluster nodes, the complete cluster, and the NFS client, but nothing worked. I also stopped, restarted, offlined, and onlined the NFS resource group, but that did not help for this one client either.

    This is an NFS resource group running over a ZFS file system with the NFS share property (sharenfs) disabled.

    Any comment about how to resolve it is appreciated...

  • Venku Tuesday, February 10, 2009

    HAStoragePlus resource requires global device paths to work.

    It looks like you are planning to configure DID disk d1, and hence the corresponding global device path is /dev/global/dsk/d1s6

  • Ken Tuesday, February 10, 2009

    So, can I mount a shared folder from another Solaris as a valid global device?

  • pb Monday, September 21, 2009

    I understand why option 2 is needed for highly available NFS, but I need the flexibility and easy configuration of option 1 more. For me, the HA benefit of option 2 does not outweigh the ease of use of option 1. So is it OK, and still supported, if I don't use the HA-NFS agent and configure option 1?

    Or is there an easy way to configure the HA-NFS agent with the flexibility of ZFS sharenfs (inheritance, ...), so that I can use option 2?


  • Venku Tuesday, September 22, 2009

    Right now there is no easy option in the HA-NFS agent to share all the ZFS file systems in a pool. It has to be done manually so that applications behave properly after a failover.
