As of the 4.3 release, Oracle Solaris Cluster has support for:
Cluster File System with UFS
ZFS as a failover file system
For application deployments that need to access the same file system from multiple nodes at the same time, ZFS as a failover file system is not a suitable solution. ZFS is the preferred data storage management option on Oracle Solaris 11 and has many advantages over UFS. With Oracle Solaris Cluster 4.4, you can now have both: ZFS and global access to ZFS file systems.
Oracle Solaris Cluster 4.4 has added support for:
Cluster File System with ZFS
With this new feature, you can make ZFS file systems accessible from multiple nodes and run applications on the same file systems simultaneously on those nodes. Note that a zpool for globally mounted ZFS file systems is not actually a global ZFS pool; rather, a Cluster File System layer sits on top of ZFS and makes the file systems in the ZFS pool globally accessible.
The following procedures explain and illustrate a couple of methods to bring up this configuration.
1) Identify the shared device to be used for ZFS pool creation.
To configure a zpool for globally mounted ZFS file systems, choose one or more multi-hosted devices from the output of the cldevice show command.
phys-schost-1# cldevice show | grep Device
In the following example, the entries for DID devices /dev/did/rdsk/d1 and /dev/did/rdsk/d4 show that those devices are connected only to phys-schost-1 and phys-schost-2 respectively, while /dev/did/rdsk/d2 and /dev/did/rdsk/d3 are accessible by both nodes of this two-node cluster, phys-schost-1 and phys-schost-2. In this example, DID device /dev/did/rdsk/d3, with device name c1t6d0, will be used for global access by both nodes.
# cldevice show | grep Device

=== DID Device Instances ===
DID Device Name:                                /dev/did/rdsk/d1
  Full Device Path:                               phys-schost-1:/dev/rdsk/c0t0d0
DID Device Name:                                /dev/did/rdsk/d2
  Full Device Path:                               phys-schost-1:/dev/rdsk/c0t6d0
  Full Device Path:                               phys-schost-2:/dev/rdsk/c0t6d0
DID Device Name:                                /dev/did/rdsk/d3
  Full Device Path:                               phys-schost-1:/dev/rdsk/c1t6d0
  Full Device Path:                               phys-schost-2:/dev/rdsk/c1t6d0
DID Device Name:                                /dev/did/rdsk/d4
  Full Device Path:                               phys-schost-2:/dev/rdsk/c1t6d1
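If the cluster has many devices, a quick way to narrow the list down to multi-hosted candidates is to count the Full Device Path entries per DID device. The following one-liner is only an illustrative sketch and assumes the cldevice show output format shown above:

phys-schost-1# cldevice show | nawk '
    /DID Device Name/  { dev = $NF }
    /Full Device Path/ { paths[dev]++ }
    END { for (d in paths) if (paths[d] > 1) print d }'

On this cluster it would print /dev/did/rdsk/d2 and /dev/did/rdsk/d3 (in no particular order), the devices reachable from both nodes.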
2) Create a ZFS pool for the DID device(s) that you chose.
phys-schost-1# zpool create HAzpool c1t6d0
phys-schost-1# zpool list
NAME      SIZE  ALLOC   FREE  CAP  DEDUP  HEALTH  ALTROOT
HAzpool  49.8G  2.22G  47.5G   4%  1.00x  ONLINE  /
3) Create ZFS file systems on the pool.
phys-schost-1# zfs create -o mountpoint=/global/fs1 HAzpool/fs1
phys-schost-1# zfs create -o mountpoint=/global/fs2 HAzpool/fs2
4) Create files to show global access of the file systems.
Copy some files to the newly created file systems. These files will be used in the procedures below to demonstrate that the file systems are globally accessible from all cluster nodes.
phys-schost-1# cp /usr/bin/ls /global/fs1/
phys-schost-1# cp /usr/bin/date /global/fs2/
phys-schost-1# ls -al /global/fs1/ /global/fs2/
/global/fs1/:
total 120
drwxr-xr-x   3 root     root           4 Oct  8 23:22 .
drwxr-xr-x   5 root     sys            5 Oct  8 23:21 ..
-r-xr-xr-x   1 root     root       57576 Oct  8 23:22 ls

/global/fs2/:
total 7
drwxr-xr-x   3 root     root           4 Oct  8 23:22 .
drwxr-xr-x   5 root     sys            5 Oct  8 23:21 ..
-r-xr-xr-x   1 root     root       24656 Oct  8 23:22 date
At this point the ZFS file systems of the zpool are accessible only on the node where the zpool is imported.
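You can confirm this from the other node: because HAzpool is currently imported only on phys-schost-1, querying it on phys-schost-2 is expected to fail with an error similar to the following (illustrative output):

phys-schost-2# zpool list HAzpool
cannot open 'HAzpool': no such pool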
There are two ways of configuring a zpool for globally mounted ZFS file systems.
Method 1: Creating the zpool device group directly

You would use this method when the requirement is only to provide global access to the ZFS file systems, and it is not yet known how HA services will be created using the file systems or which cluster resource groups will be created.
1) Create a device group of type zpool with poolaccess set to global, using the same name as the zpool you created in step 2 above.
phys-schost-1# cldevicegroup create -p poolaccess=global \
-n phys-schost-1,phys-schost-2 -t zpool HAzpool
Note: The device group must have the same name as the pool, HAzpool in this example. The poolaccess property is set to global to indicate that the file systems of this pool will be globally accessible across the nodes of the cluster.
2) Bring the device group online.
phys-schost-1# cldevicegroup online HAzpool
3) Verify the configuration.
phys-schost-1# cldevicegroup show

=== Device Groups ===

Device Group Name:                              HAzpool
  Type:                                           ZPOOL
  failback:                                       false
  Node List:                                      phys-schost-1, phys-schost-2
  preferenced:                                    false
  autogen:                                        false
  numsecondaries:                                 1
  ZFS pool name:                                  HAzpool
  poolaccess:                                     global
  readonly:                                       false
  import-at-boot:                                 false
  searchpaths:                                    /dev/dsk

phys-schost-1# cldevicegroup status

=== Cluster Device Groups ===

--- Device Group Status ---

Device Group Name     Primary          Secondary        Status
-----------------     -------          ---------        ------
HAzpool               phys-schost-1    phys-schost-2    Online
In this configuration, the zpool is imported on the node that is primary for the zpool device group, but the file systems in the zpool are mounted globally.
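A quick way to see both halves of this statement is to compare where the pool is imported with where the file systems are mounted. The commands below are a hedged sketch; the exact mount options shown can vary by release, but a cluster-wide mount is normally flagged with the global option:

phys-schost-1# zpool list HAzpool            # succeeds: the pool is imported here
phys-schost-2# zpool list HAzpool            # fails: the pool is not imported on this node
phys-schost-2# mount -v | grep /global/fs1   # the mount entry carries the global option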
From a different node, execute the files copied in step 4 of the previous section. You can observe that the file systems are mounted globally and accessible from all nodes.
phys-schost-2# /global/fs1/ls -al /global/fs2
total 56
drwxr-xr-x   3 root     root           4 Oct  8 23:22 .
drwxr-xr-x   5 root     sys            5 Oct  8 23:21 ..
-r-xr-xr-x   1 root     root       24656 Oct  8 23:22 date
phys-schost-2# /global/fs2/date
Fri Oct  9 04:08:59 PDT 2018
You can also verify that a newly created ZFS file system is immediately accessible from all nodes. From the cldevicegroup status output above, you can see that phys-schost-1 is the primary node for the device group. Run the following command on the primary node:
phys-schost-1# zfs create -o mountpoint=/global/fs3 HAzpool/fs3
Then, from a different node, verify that the file system is accessible.
phys-schost-2# df -h /global/fs3
Filesystem             Size   Used  Available Capacity  Mounted on
HAzpool/fs3             47G    40K        47G     1%    /global/fs3
Method 2: Creating the zpool device group through an HAStoragePlus resource

You would typically use this method when you have already planned how HA services in resource groups will use the globally mounted file systems and expect the resources that manage the application to depend on a resource that manages the file systems.
If such a device group does not already exist, a device group of type zpool with poolaccess set to global is created automatically when an HAStoragePlus resource is created with the GlobalZpools property defined.
1) Create an HAStoragePlus resource for a zpool for globally mounted file systems and bring it online.
Note: The resource group can be scalable or failover as needed by the configuration.
phys-schost-1# clresourcegroup create hasp-rg
phys-schost-1# clresource create -t HAStoragePlus \
-p GlobalZpools=HAzpool -g hasp-rg hasp-rs
phys-schost-1# clresourcegroup online -eM hasp-rg
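If you want the storage resource online on more than one node at a time instead of failing over, you can make the resource group scalable by setting its primaries properties. The following is a minimal sketch for this two-node cluster; the names scal-hasp-rg and scal-hasp-rs are illustrative:

phys-schost-1# clresourcegroup create -p Maximum_primaries=2 \
-p Desired_primaries=2 scal-hasp-rg
phys-schost-1# clresource create -t HAStoragePlus \
-p GlobalZpools=HAzpool -g scal-hasp-rg scal-hasp-rs
phys-schost-1# clresourcegroup online -eM scal-hasp-rg

The remaining steps use the failover group hasp-rg created above.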
2) Verify the configuration.
phys-schost-1# clrs status hasp-rs

=== Cluster Resources ===

Resource Name       Node Name        State        Status Message
-------------       ---------        -----        --------------
hasp-rs             phys-schost-1    Online       Online
                    phys-schost-2    Offline      Offline
phys-schost-1# cldevicegroup show

=== Device Groups ===

Device Group Name:                              HAzpool
  Type:                                           ZPOOL
  failback:                                       false
  Node List:                                      phys-schost-1, phys-schost-2
  preferenced:                                    false
  autogen:                                        true
  numsecondaries:                                 1
  ZFS pool name:                                  HAzpool
  poolaccess:                                     global
  readonly:                                       false
  import-at-boot:                                 false
  searchpaths:                                    /dev/dsk

phys-schost-1# cldevicegroup status

=== Cluster Device Groups ===

--- Device Group Status ---

Device Group Name     Primary          Secondary        Status
-----------------     -------          ---------        ------
HAzpool               phys-schost-1    phys-schost-2    Online
From a different node, execute the files copied in step 4 of the previous section. You can observe that the file systems are mounted globally and accessible from all nodes.
phys-schost-2# /global/fs1/ls -al /global/fs2
total 56
drwxr-xr-x   3 root     root           4 Oct  8 23:22 .
drwxr-xr-x   5 root     sys            5 Oct  8 23:21 ..
-r-xr-xr-x   1 root     root       24656 Oct  8 23:22 date
phys-schost-2# /global/fs2/date
Fri Oct  9 04:23:26 PDT 2018
You can also verify that a newly created ZFS file system is immediately accessible from all nodes. From the cldevicegroup status output above, you can see that phys-schost-1 is the primary node for the device group. Run the following command on the primary node:
phys-schost-1# zfs create -o mountpoint=/global/fs3 HAzpool/fs3
Then, from a different node, verify that the file system is accessible.
phys-schost-2# df -h /global/fs3
Filesystem             Size   Used  Available Capacity  Mounted on
HAzpool/fs3             47G    40K        47G     1%    /global/fs3
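The main benefit of this method is that application resources can declare a dependency on hasp-rs, so the application is started only after the globally mounted file systems are available. The following is a hedged sketch using the generic data service (SUNW.gds) with a hypothetical start script /global/fs1/start-app.sh; a real deployment would use its own agent and properties:

phys-schost-1# clresourcetype register SUNW.gds
phys-schost-1# clresource create -t SUNW.gds -g hasp-rg \
-p Start_command=/global/fs1/start-app.sh \
-p Network_aware=false \
-p Resource_dependencies_offline_restart=hasp-rs app-rs
phys-schost-1# clresource status app-rs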
There might be situations where a system administrator has used Method 1 to meet the global access requirement so that application administrators can install the application, but later finds that an HAStoragePlus resource is needed for HA services deployment. In that case, there is no need to undo the steps from Method 1 and redo them with Method 2.
HAStoragePlus also supports zpools for globally mounted ZFS file systems whose zpool device groups were already registered manually.
The following steps illustrate how this configuration can be achieved.
1) Create an HAStoragePlus resource with an existing zpool for globally mounted ZFS file systems and bring it online.
Note: The resource group can be scalable or failover as needed by the configuration.
phys-schost-1# clresourcegroup create hasp-rg
phys-schost-1# clresource create -t HAStoragePlus \
-p GlobalZpools=HAzpool -g hasp-rg hasp-rs
phys-schost-1# clresourcegroup online -eM hasp-rg
2) Verify the configuration.
phys-schost-1# clresource show -p GlobalZpools hasp-rs

  GlobalZpools:                                  HAzpool

phys-schost-1# cldevicegroup status

=== Cluster Device Groups ===

--- Device Group Status ---

Device Group Name     Primary          Secondary        Status
-----------------     -------          ---------        ------
HAzpool               phys-schost-1    phys-schost-2    Online
Note: When the HAStoragePlus resource is deleted, the zpool device group is not automatically deleted. While the zpool device group exists, the ZFS file systems in the zpool will be mounted globally whenever the device group is brought online (# cldevicegroup online HAzpool).
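If you later need to remove the whole configuration, the following is a hedged sketch of one possible teardown order for the names used in this article; the final two commands destroy the pool and its data, so run them only if that is intended:

phys-schost-1# clresource disable hasp-rs
phys-schost-1# clresource delete hasp-rs
phys-schost-1# clresourcegroup offline hasp-rg
phys-schost-1# clresourcegroup unmanage hasp-rg
phys-schost-1# clresourcegroup delete hasp-rg
phys-schost-1# cldevicegroup offline HAzpool
phys-schost-1# cldevicegroup delete HAzpool
phys-schost-1# zpool import HAzpool
phys-schost-1# zpool destroy HAzpool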
For more information, see How to Configure a ZFS Storage Pool for Cluster-wide Global Access Without HAStoragePlus.
For information on how to use HAStoragePlus to manage ZFS pools for global access by data services, see
How to Set Up an HAStoragePlus Resource for a Global ZFS Storage Pool in Planning and Administering Data Services for Oracle Solaris Cluster 4.4.
Will this scale? (i.e. to 4 nodes, 40 nodes, 400 nodes, 4000 nodes?)
Is it safe for diversity? (i.e. 3 datacenters, located in different cities or continents?)
What is the performance implication when you add a node?
Can multiple nodes drop dead and the application continue to run without losing data?
I am trying to determine the use cases for clustering with ZFS... (i.e. is it a "Cloud Solution" where we could lose a datacenter and still have applications continue to run in the remaining datacenters?)
See this link for answers to your questions.
https://blogs.oracle.com/solaris/cluster-file-system-with-zfs%3a-general-deployment-questions
Thanks!