
Using the Multi-Attach Block Volume Feature to Create a Shared File System on Oracle Cloud Infrastructure

Gilson Melo
Principal Product Manager

A shared file system is a very common requirement, whether to let applications access the same data or to let multiple users work with the same information at the same time. On premises this is an easy task to achieve with NAS or SAN devices, but how can it be done in the cloud?

 

There are different technologies, such as iSCSI, NFS, SMB, and DRBD, that let you share a block device between two or more cloud instances, but you still need to configure those services, and on top of that you also need a cluster file system such as OCFS2 or GlusterFS so that your users can read and write simultaneously.

 

With Oracle Cloud Infrastructure you have the multi-attach block volume option, which allows you to attach the same block device to two or more cloud instances. This feature is under Limited Availability, which means your tenancy needs to be enabled before you can use it. It lets customers easily connect the same block storage volume(s) to all the instances that need access to the same data, and it basically acts as a NAS device in the cloud.

 

As of today, the process is done through a preview version of the OCI CLI, which needs to be requested from Oracle. Once you have access to that OCI CLI version and your tenancy has been enabled for the feature, you can run the OCI command line to attach a block device to each of the cloud instances that will host your cluster file system. Here is an example:

 

"oci compute volume-attachment attach --instance-id ocid1.instance.oc1.OCID --type iscsi --volume-id ocid1.volume.oc1.REGION.OCID --is-shareable true" 

 

Now that you have your block volume attached to all the instances that need it, the next step is to create a file system that is cluster aware. For this blog we will use OCFS2 (Oracle Cluster File System), as the following diagram illustrates.

 

Why OCFS2?

Oracle Cluster File System version 2 (OCFS2) is a general-purpose shared-disk file system intended for use in clusters to increase storage performance and availability.  Almost any application can use OCFS2 because it provides local file-system semantics. Applications that are cluster-aware can use cache-coherent parallel I/O from multiple cluster nodes to balance activity across the cluster, or they can use the available file-system functionality to fail over and run on another node in the event that a node fails.

 

OCFS2 has a large number of features that make it suitable for deployment in an enterprise-level computing environment:

- Support for ordered and write-back data journaling that provides file system consistency in the event of power failure or system crash.
- Block sizes ranging from 512 bytes to 4 KB, and file-system cluster sizes ranging from 4 KB to 1 MB (both in increments of powers of 2). The maximum supported volume size is 16 TB, which corresponds to a cluster size of 4 KB. A volume size as large as 4 PB is theoretically possible for a cluster size of 1 MB, although this limit has not been tested.
- Extent-based allocations for efficient storage of very large files.
- Optimized allocation support for sparse files, inline-data, unwritten extents, hole punching, reflinks, and allocation reservation for high performance and efficient storage.
- Indexing of directories to allow efficient access to a directory even if it contains millions of objects.
- Metadata checksums for the detection of corrupted inodes and directories.
- Extended attributes to allow an unlimited number of name:value pairs to be attached to file system objects such as regular files, directories, and symbolic links.
- Advanced security support for POSIX ACLs and SELinux in addition to the traditional file-access permission model.
- Support for user and group quotas.
- Support for heterogeneous clusters of nodes with a mixture of 32-bit and 64-bit, little-endian (x86, x86_64, ia64) and big-endian (ppc64) architectures.
- An easy-to-configure, in-kernel cluster stack (O2CB) with a distributed lock manager (DLM), which manages concurrent access from the cluster nodes.
- Support for buffered, direct, asynchronous, splice and memory-mapped I/O.
- A tool set that uses similar parameters to the ext3 file system.

 

Getting Started

Below is a summary of the configuration steps required for this architecture:

1. Attach your Oracle Cloud Infrastructure block device(s) using the OCI CLI, as explained above

2. Set up your OCFS2/O2CB cluster Nodes

3. Create your OCFS2 file system and mount point

 

You also need to open ports 7777 and 3260 on the Oracle Cloud Infrastructure Dashboard. Edit the VCN security list and either open all ports for your tenancy's internal network (NOT the public network), as shown below for network 172.0.0.0/16:

Source: 172.0.0.0/16

IP Protocol: All Protocols

Allows: all traffic for all ports


or open only the required ports 7777 and 3260 for the internal network. Here is an example for port 7777:

Source: 172.0.0.0/16

IP Protocol: TCP

Source Port Range: All

Destination Port Range: 7777

Allows: TCP traffic for ports: 7777

 

NOTE: Ports 7777 and 3260 also need to be opened in the local OS firewall, as shown below:

- sudo firewall-cmd --zone=public --permanent --add-port=7777/tcp

- sudo firewall-cmd --zone=public --permanent --add-port=3260/tcp

- sudo firewall-cmd --complete-reload


Make sure DNS is working and that your bare metal instances can communicate across your tenancy's availability domains (ADs). Here is a quick example of /etc/resolv.conf based on this setup:

$ cat /etc/resolv.conf

; generated by /usr/sbin/dhclient-script

search baremetal.oraclevcn.com publicsubnetad3.baremetal.oraclevcn.com publicsubnetad2.baremetal.oraclevcn.com publicsubnetad1.baremetal.oraclevcn.com

nameserver 169.254.169.254

As you can see above, DNS entries for all ADs are present in that resolv.conf file.
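A quick way to confirm name resolution and cross-AD connectivity is to ping one node from the other using its fully qualified name. The hostname below comes from the environment table in the next section; adjust it to your own setup.

$ ping -c 3 node2.publicsubnetad2.baremetal.oraclevcn.com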

 

Environment

ROLE          INSTANCE                                         IP           OS
OCFS2 Node1   node1.publicsubnetad1.baremetal.oraclevcn.com    172.0.0.41   Oracle Linux 7.4 x86_64
OCFS2 Node2   node2.publicsubnetad2.baremetal.oraclevcn.com    172.0.1.42   Oracle Linux 7.4 x86_64

 

OCFS2

Creating the Configuration File for the Cluster Stack

Install the required OCFS2 packages

$ sudo yum install ocfs2-tools-devel ocfs2-tools -y

 

Now, create the configuration file by using the o2cb command or a text editor. Let's use the following command to create the cluster definition.

$ sudo o2cb add-cluster ociocfs2

The above command creates the configuration file /etc/ocfs2/cluster.conf if it does not already exist.

 

For each node, use the following command to define the node.

$ sudo o2cb add-node ociocfs2 node1 --ip 172.0.0.41

$ sudo o2cb add-node ociocfs2 node2 --ip 172.0.1.42

NOTE: The name of each node must be the same as the value of the system's HOSTNAME configured in /etc/sysconfig/network, and the IP address is the one that the node will use for private communication in the cluster. You need to copy the cluster configuration file /etc/ocfs2/cluster.conf to each node in the cluster. Any changes that you make to the cluster configuration file do not take effect until you restart the cluster stack.
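For example, assuming the file was created on node1 and that you connect between nodes as the default opc user with SSH keys in place, copying it to node2 could look like this (hostname taken from the environment table above):

$ scp /etc/ocfs2/cluster.conf opc@node2.publicsubnetad2.baremetal.oraclevcn.com:/tmp/
$ ssh opc@node2.publicsubnetad2.baremetal.oraclevcn.com "sudo mkdir -p /etc/ocfs2 && sudo cp /tmp/cluster.conf /etc/ocfs2/cluster.conf"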

 

The following /etc/ocfs2/cluster.conf configuration file defines a 2-node cluster named ociocfs2 with a local heartbeat, which is the configuration used for this tutorial.

$ sudo cat /etc/ocfs2/cluster.conf

cluster:

        heartbeat_mode = local

        node_count = 2

        name = ociocfs2

 

node:

        number = 0

        cluster = ociocfs2

        ip_port = 7777

        ip_address = 172.0.0.41

        name = node1

 

node:

        number = 1

        cluster = ociocfs2

        ip_port = 7777

        ip_address = 172.0.1.42

        name = node2

 

Configuring the Cluster Stack

Run the following command on each node of the cluster:

$ sudo /sbin/o2cb.init configure

Configuring the O2CB driver.

 

This will configure the on-boot properties of the O2CB driver.

The following questions will determine whether the driver is loaded on

boot.  The current values will be shown in brackets ('[]').  Hitting

<ENTER> without typing an answer will keep that current value.  Ctrl-C

will abort.

 

Load O2CB driver on boot (y/n) [y]:

Cluster stack backing O2CB [o2cb]:

Cluster to start on boot (Enter "none" to clear) [ocfs2]: ociocfs2

Specify heartbeat dead threshold (>=7) [31]:

Specify network idle timeout in ms (>=5000) [30000]:

Specify network keepalive delay in ms (>=1000) [2000]:

Specify network reconnect delay in ms (>=2000) [2000]:

Writing O2CB configuration: OK

checking debugfs...

Setting cluster stack "o2cb": OK

Registering O2CB cluster "ociocfs2": OK

Setting O2CB cluster timeouts : OK

Starting global heartbeat for cluster "ociocfs2": OK

An explanation of the above options can be found in the OCFS2 public documentation.

 

To verify the settings for the cluster stack, enter the /sbin/o2cb.init status command:

$ sudo /sbin/o2cb.init status

Driver for "configfs": Loaded

Filesystem "configfs": Mounted

Stack glue driver: Loaded

Stack plugin "o2cb": Loaded

Driver for "ocfs2_dlmfs": Loaded

Filesystem "ocfs2_dlmfs": Mounted

Checking O2CB cluster "ociocfs2": Online

  Heartbeat dead threshold: 31

  Network idle timeout: 30000

  Network keepalive delay: 2000

  Network reconnect delay: 2000

  Heartbeat mode: Local

Checking O2CB heartbeat: Active

Debug file system at /sys/kernel/debug: mounted

In this example, the cluster is online and is using local heartbeat mode. If no volumes have been configured, the O2CB heartbeat is shown as Not Active rather than Active.

 

Configure the o2cb and ocfs2 services so that they start at boot time after networking is enabled.

$ sudo systemctl enable o2cb

$ sudo systemctl enable ocfs2

These settings allow the node to mount OCFS2 volumes automatically when the system starts.

 

Configuring the Kernel for Cluster Operation

For the correct operation of the cluster, you must configure the kernel settings shown in the following table:

KERNEL SETTING   DESCRIPTION

panic            Specifies the number of seconds after a panic before a system will automatically reset itself.
                 If the value is 0, the system hangs, which allows you to collect detailed information about the panic for troubleshooting. This is the default value.
                 To enable automatic reset, set a non-zero value. If you require a memory image (vmcore), allow enough time for Kdump to create this image. The suggested value is 30 seconds, although large systems will require a longer time.

panic_on_oops    Specifies that a system must panic if a kernel oops occurs. If a kernel thread required for cluster operation crashes, the system must reset itself. Otherwise, another node might not be able to tell whether a node is slow to respond or unable to respond, causing cluster operations to hang.

 

On each node, enter the following commands to set the recommended values for panic and panic_on_oops:

$ sudo sysctl kernel.panic=30

$ sudo sysctl kernel.panic_on_oops=1

 

To make the change persist across reboots, add the following entries to the /etc/sysctl.conf file:

# Define panic and panic_on_oops for cluster operation

kernel.panic=30

kernel.panic_on_oops=1
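To apply the values from /etc/sysctl.conf to the running kernel without waiting for a reboot, reload them on each node:

$ sudo sysctl -p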

 

Starting and Stopping the Cluster Stack

The following table shows the commands that you can use to perform various operations on the cluster stack.

COMMAND                   DESCRIPTION
/sbin/o2cb.init status    Check the status of the cluster stack.
/sbin/o2cb.init online    Start the cluster stack.
/sbin/o2cb.init offline   Stop the cluster stack.
/sbin/o2cb.init unload    Unload the cluster stack.

 

Creating OCFS2 volumes

Use the mkfs.ocfs2 command to create an OCFS2 volume on a device. If you want to label the volume and mount it by specifying the label, the device must correspond to a partition; you cannot mount an unpartitioned disk device by specifying a label.

$ sudo mkfs.ocfs2 -L "ocfs2" /dev/sdb

mkfs.ocfs2 1.8.6

Cluster stack: classic o2cb

Label: ocfs2

Features: sparse extended-slotmap backup-super unwritten inline-data strict-journal-super xattr indexed-dirs refcount discontig-bg

Block size: 4096 (12 bits)

Cluster size: 4096 (12 bits)

Volume size: 12455405158400 (3040870400 clusters) (3040870400 blocks)

Cluster groups: 94274 (tail covers 512 clusters, rest cover 32256 clusters)

Extent allocator size: 780140544 (186 groups)

Journal size: 268435456

Node slots: 16

Creating bitmaps: done

Initializing superblock: done

Writing system files: done

Writing superblock: done

Writing backup superblock: 6 block(s)

Formatting Journals: done

Growing extent allocator: done

Formatting slot map: done

Formatting quota files: done

Writing lost+found: done

mkfs.ocfs2 successful

 

Mounting OCFS2 Volumes

As shown in the following example, specify the _netdev and nofail options in /etc/fstab if you want the system to mount an OCFS2 volume at boot time after networking is started, and to unmount the file system before networking is stopped.

$ sudo mkdir /ocfs2

$ sudo vi /etc/fstab

# Include the line below to mount your OCFS2 volume after a restart
/dev/sdb /ocfs2 ocfs2     _netdev,defaults,nofail   0 0

 

Run "mount -a" to mount the OCFS2 partition based on the fstab entry you created above and the setup is concluded. You should have a cluster file system mounted on /ocfs2 on both Oracle Linux 7.4 node1 and node2  servers.

 

Finally, you're finished!  Your applications can now use this storage as they would with any local file storage. Planning your environment thoughtfully and making use of Availability Domains and capabilities such as Oracle Cluster File System can help you increase the performance and availability of the solutions you build on Oracle Cloud Infrastructure.

 
