A shared file system is a very common requirement. It allows multiple applications to access the same data, or multiple users to work with the same information at the same time. You can have a shared file system on-premises if an available NAS or SAN device supports multiple connections, but how can you have a shared file system in the cloud?
Some technologies, such as iSCSI, NFS, SMB, and DRBD, enable a block device to be shared with two or more cloud instances. However, these services require additional configuration before you can set up a cluster file system service, such as Oracle Cluster File System version 2 (OCFS2) or GlusterFS, that allows users to perform read/write operations simultaneously.
With the new shareable volumes with multiple-instance attachment option in Oracle Cloud Infrastructure Block Volumes, there’s no need to set up those sharing services. This option lets you attach a block volume to multiple compute instances, enabling all the instances to get concurrent read/write access to the same data.
After the block volume is attached to all instances, the next step is to use a (shared) file system that is cluster-aware. This post explains the process for setting up OCFS2 with the new multiple-instance attachment option.
The following diagram shows the architecture for attaching a sharable, read/write block volume to multiple instances.
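For reference, the shareable attachment can also be created from the OCI CLI instead of the Console. The following is only a sketch under assumptions: the OCIDs are placeholders, and the parameter names (in particular --is-shareable) should be verified against the current documentation for the oci compute volume-attachment attach command.

$ oci compute volume-attachment attach \
      --instance-id ocid1.instance.oc1..exampleuniqueID \
      --volume-id ocid1.volume.oc1..exampleuniqueID \
      --type iscsi \
      --is-shareable true

Repeat the attachment for each instance that needs access. With an iSCSI attachment, the iSCSI setup commands shown in the Console still need to be run on each instance before the device becomes visible to the OS.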
OCFS2 is a general-purpose, shared-disk file system that’s intended for use in clusters to increase storage performance and availability. Almost any application can use OCFS2 because it provides local file-system semantics. Applications that are cluster-aware can use cache-coherent parallel I/O from multiple cluster nodes to balance activity across the cluster. Or, they can use the available file-system functionality to fail over and run on another node if a node fails.
OCFS2 has a large number of features that make it suitable for deployment in an enterprise-level computing environment.
The following high-level configuration steps are required for this architecture:
1. Open the required ports in the VCN security list and in the local OS firewall.
2. Install the OCFS2 packages on each node.
3. Create the OCFS2 cluster configuration and define the cluster nodes.
4. Configure and start the O2CB cluster stack.
5. Set the recommended kernel parameters for cluster operation.
6. Create the OCFS2 file system on the shared block volume.
7. Mount the file system on each node and add it to /etc/fstab.
Before you start the configuration, disable SELinux on the OCFS2 cluster nodes (an example follows the rules below) and open ports 7777 and 3260 in the security list for the virtual cloud network (VCN). In the Oracle Cloud Infrastructure Console, edit the VCN security list and perform one of the following steps:
Open all protocols for the internal subnet CIDR (172.0.0.0/16, not the public network):
Source: 172.0.0.0/16
IP Protocol: All Protocols
Allows: all traffic for all ports
Open only the required ports, 7777 and 3260:
Source: 172.0.0.0/16
IP Protocol: TCP
Source Port Range: All
Destination Port Range: 7777
Allows: TCP traffic for ports: 7777
Source: 172.0.0.0/16
IP Protocol: TCP
Source Port Range: All
Destination Port Range: 3260
Allows: TCP traffic for ports: 3260
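To disable SELinux, as mentioned before the rules above, a typical approach on Oracle Linux 7.x is the following (run on each cluster node; the setting in /etc/selinux/config takes full effect after a reboot):

$ sudo setenforce 0                                                   # switch to permissive mode for the current boot
$ sudo sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config   # make the change persistent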
Note: Ports 7777 and 3260 also need to be opened in the local OS firewall. Review the documentation if you are using a different OS.
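On Oracle Linux 7.x, which uses firewalld, opening the ports typically looks like the following (assuming the default zone):

$ sudo firewall-cmd --permanent --add-port=7777/tcp
$ sudo firewall-cmd --permanent --add-port=3260/tcp
$ sudo firewall-cmd --reload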
Additionally, ensure that DNS is working properly and that the compute instances can communicate across the tenancy’s availability domains. Here’s an example of the /etc/resolv.conf file based on this setup:
$ cat /etc/resolv.conf
; generated by /usr/sbin/dhclient-script
search baremetal.oraclevcn.com publicsubnetad1.baremetal.oraclevcn.com publicsubnetad2.baremetal.oraclevcn.com publicsubnetad3.baremetal.oraclevcn.com
nameserver 169.254.169.254
All availability domain DNS entries must be available in the resolv.conf file.
The environment contains the following compute instances:
Role | Instance | IP Address | OS
---- | -------- | ---------- | --
OCFS2 Node1 | node1.publicsubnetad1.baremetal.oraclevcn.com | 172.0.0.41 | Oracle Linux 7.x x86_64
OCFS2 Node2 | node2.publicsubnetad1.baremetal.oraclevcn.com | 172.0.0.42 | Oracle Linux 7.x x86_64
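To confirm that name resolution and connectivity work between the nodes, you can, for example, resolve and reach node2 from node1 (ping also requires ICMP to be allowed by the security list):

$ getent hosts node2.publicsubnetad1.baremetal.oraclevcn.com
$ ping -c 3 node2.publicsubnetad1.baremetal.oraclevcn.com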
Install the required OCFS2 packages:
$ sudo yum install ocfs2-tools-devel ocfs2-tools -y
Create the configuration file by using the o2cb command or a text editor. The following command creates a cluster definition named ociocfs2; it also creates the /etc/ocfs2/cluster.conf cluster configuration file if it doesn't already exist.
$ sudo o2cb add-cluster ociocfs2
For each node, use the following command to define the node:
$ sudo o2cb add-node ociocfs2 node1 --ip 172.0.0.41
$ sudo o2cb add-node ociocfs2 node2 --ip 172.0.0.42
Note: The name of each node must match the value of HOSTNAME that is configured in the node's /etc/sysconfig/network file, and the IP address is the one that the node uses for private communication in the cluster. Copy the /etc/ocfs2/cluster.conf cluster configuration file to each node in the cluster. Any changes made to the cluster configuration file don't take effect until the cluster stack is restarted.
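For example, from node1 the file could be copied to node2 as follows (assuming the default opc user and SSH access between the nodes; adjust the user and host names to your environment):

$ scp /etc/ocfs2/cluster.conf opc@node2:/tmp/cluster.conf
$ ssh opc@node2 'sudo mkdir -p /etc/ocfs2 && sudo mv /tmp/cluster.conf /etc/ocfs2/cluster.conf'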
The following /etc/ocfs2/cluster.conf configuration file defines a 2-node cluster named ociocfs2 with a local heartbeat, which is the configuration used in this post.
$ sudo cat /etc/ocfs2/cluster.conf
cluster:
        heartbeat_mode = local
        node_count = 2
        name = ociocfs2
node:
        number = 0
        cluster = ociocfs2
        ip_port = 7777
        ip_address = 172.0.0.41
        name = node1
node:
        number = 1
        cluster = ociocfs2
        ip_port = 7777
        ip_address = 172.0.0.42
        name = node2
Run the following command on each node of the cluster. The options are explained in the documentation.
$ sudo /sbin/o2cb.init configure
Configuring the O2CB driver.
This will configure the on-boot properties of the O2CB driver.
The following questions will determine whether the driver is loaded on
boot. The current values will be shown in brackets ('[]'). Hitting
<ENTER> without typing an answer will keep that current value. Ctrl-C
will abort.
Load O2CB driver on boot (y/n) [y]:
Cluster stack backing O2CB [o2cb]:
Cluster to start on boot (Enter "none" to clear) [ocfs2]: ociocfs2
Specify heartbeat dead threshold (>=7) [31]:
Specify network idle timeout in ms (>=5000) [30000]:
Specify network keepalive delay in ms (>=1000) [2000]:
Specify network reconnect delay in ms (>=2000) [2000]:
Writing O2CB configuration: OK
checking debugfs...
Setting cluster stack "o2cb": OK
Registering O2CB cluster "ociocfs2": OK
Setting O2CB cluster timeouts : OK
Starting global heartbeat for cluster "ociocfs2": OK
To verify the settings for the cluster stack, run the /sbin/o2cb.init status command:
$ sudo /sbin/o2cb.init status
Driver for "configfs": Loaded
Filesystem "configfs": Mounted
Stack glue driver: Loaded
Stack plugin "o2cb": Loaded
Driver for "ocfs2_dlmfs": Loaded
Filesystem "ocfs2_dlmfs": Mounted
Checking O2CB cluster "ociocfs2": Online
Heartbeat dead threshold: 31
Network idle timeout: 30000
Network keepalive delay: 2000
Network reconnect delay: 2000
Heartbeat mode: Local
Checking O2CB heartbeat: Active
Debug file system at /sys/kernel/debug: mounted
In this example, the cluster is online and is using local heartbeat mode. If no volumes have been configured, the O2CB heartbeat is shown as Not Active rather than Active.
Configure the o2cb and ocfs2 services so that they start at boot time after networking is enabled:
$ sudo systemctl enable o2cb
$ sudo systemctl enable ocfs2
These settings allow the node to mount OCFS2 volumes automatically when the system starts.
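You can confirm that both services are enabled for the next boot:

$ systemctl is-enabled o2cb ocfs2
enabled
enabled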
For the correct operation of the cluster, configure the kernel settings shown in the following table:
Kernel Setting | Description
-------------- | -----------
panic | Specifies the number of seconds after a panic occurs before a system automatically resets itself. If the value is 0, the system stops responding, which allows you to collect detailed information about the panic for troubleshooting. This is the default value. To enable automatic reset, set a nonzero value. If you require a memory image (vmcore), allow enough time for Kdump to create this image. The suggested value is 30 seconds, although large systems require a longer time.
panic_on_oops | Specifies that a system must panic if a kernel oops occurs. If a kernel thread required for cluster operation crashes, the system must reset itself. Otherwise, another node might not be able to tell whether a node is slow to respond or unable to respond, causing cluster operations to stop.
On each node, enter the following commands to set the recommended values for panic and panic_on_oops:
$ sudo sysctl kernel.panic=30
$ sudo sysctl kernel.panic_on_oops=1
To make the change persist across reboots, add the following entries to the /etc/sysctl.conf file:
# Define panic and panic_on_oops for cluster operation
kernel.panic=30
kernel.panic_on_oops=1
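To load the new entries without rebooting and confirm the values, you can run:

$ sudo sysctl -p
$ sudo sysctl kernel.panic kernel.panic_on_oops
kernel.panic = 30
kernel.panic_on_oops = 1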
The following table shows the commands for performing various operations on the cluster stack:

Command | Description
------- | -----------
/sbin/o2cb.init status | Check the status of the cluster stack.
/sbin/o2cb.init online | Start the cluster stack.
/sbin/o2cb.init offline | Stop the cluster stack.
/sbin/o2cb.init unload | Unload the cluster stack.
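For example, because changes to /etc/ocfs2/cluster.conf take effect only after the cluster stack is restarted, a typical sequence on each node (after unmounting any OCFS2 volumes) is:

$ sudo /sbin/o2cb.init offline ociocfs2
$ sudo /sbin/o2cb.init online ociocfs2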
Use the mkfs.ocfs2 command to create an OCFS2 volume on a device.
$ sudo mkfs.ocfs2 -L "ocfs2" /dev/sdb
mkfs.ocfs2 1.8.6
Cluster stack: classic o2cb
Label: ocfs2
Features: sparse extended-slotmap backup-super unwritten inline-data strict-journal-super xattr indexed-dirs refcount discontig-bg
Block size: 4096 (12 bits)
Cluster size: 4096 (12 bits)
Volume size: 12455405158400 (3040870400 clusters) (3040870400 blocks)
Cluster groups: 94274 (tail covers 512 clusters, rest cover 32256 clusters)
Extent allocator size: 780140544 (186 groups)
Journal size: 268435456
Node slots: 16
Creating bitmaps: done
Initializing superblock: done
Writing system files: done
Writing superblock: done
Writing backup superblock: 6 block(s)
Formatting Journals: done
Growing extent allocator: done
Formatting slot map: done
Formatting quota files: done
Writing lost+found: done
mkfs.ocfs2 successful
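A quick way to confirm the result is to check the device's file-system signature with blkid; the label matches the -L value used above, and the output looks similar to this:

$ sudo blkid /dev/sdb
/dev/sdb: LABEL="ocfs2" UUID="..." TYPE="ocfs2"

Note that the device name (/dev/sdb in this example) can differ on your instances; verify it before formatting.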
As shown in the following example, specify the _netdev option in /etc/fstab to allow the system to mount the OCFS2 volume at boot time after networking is started, and to unmount the file system before networking is stopped.
$ sudo mkdir /ocfs2
$ sudo vi /etc/fstab
# Include the following line to mount the OCFS2 volume after a restart
/dev/sdb /ocfs2 ocfs2 _netdev,defaults 0 0
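Because device names such as /dev/sdb can change across reboots when several volumes are attached, an alternative is to reference the label that was set with mkfs.ocfs2 -L, for example:

LABEL=ocfs2 /ocfs2 ocfs2 _netdev,defaults 0 0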
Run mount -a to mount the OCFS2 partition based on the fstab entry.
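For example, on each node:

$ sudo mount -a
$ df -h /ocfs2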
Congratulations! The cluster file system is mounted on /ocfs2 on both the Oracle Linux 7.x node1 and node2 servers.
Applications that are cluster-enabled can now use the OCFS2 storage as they would any network-attached storage on premises. Planning an environment thoughtfully and making use of availability domains and capabilities such as OCFS2 can help increase the performance and availability of the solutions built on Oracle Cloud Infrastructure.