Using the Multiple-Instance Attach Block Volume Feature to Create a Shared File System on Oracle Cloud Infrastructure

Gilson Melo
Senior Principal Product Manager

A shared file system is a very common requirement: it allows multiple applications to access the same data, or multiple users to access the same information at the same time. You can have a shared file system on-premises if you have a NAS or SAN device that supports multiple connections, but how can you have a shared file system in the cloud?

Technologies such as iSCSI, NFS, SMB, and DRBD enable a block device to be shared with two or more cloud instances, but they require additional configuration before you can set up a cluster file system, such as Oracle Cluster File System version 2 (OCFS2) or GlusterFS, that allows simultaneous read/write operations.

With the new multiple-instance attachment option for shareable volumes in Oracle Cloud Infrastructure Block Volumes, there's no need to set up those sharing services. This option lets you attach a single block volume to multiple compute instances, giving all of the instances concurrent read/write access to the same data.

After the block volume is attached to all of the instances, the next step is to use a shared file system that is cluster-aware. This post explains the process for setting up OCFS2 with the new multiple-instance attachment option.
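You can attach the volume to each instance in the Console, or script the attachment with the OCI CLI. The following is only a minimal sketch with placeholder OCIDs; confirm the exact parameter names for your CLI version (in particular the shareable flag) with oci compute volume-attachment attach --help.

$ # Attach the same block volume to each instance, marking the attachment shareable
$ oci compute volume-attachment attach --type iscsi --is-shareable true \
      --instance-id <instance1_OCID> --volume-id <volume_OCID>
$ oci compute volume-attachment attach --type iscsi --is-shareable true \
      --instance-id <instance2_OCID> --volume-id <volume_OCID>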

Architecture

The following diagram shows the architecture for attaching a sharable, read/write block volume to multiple instances.

[Architecture diagram: a sharable, read/write block volume attached to multiple compute instances]

Why OCFS2?

OCFS2 is a general-purpose, shared-disk file system that’s intended for use in clusters to increase storage performance and availability. Almost any application can use OCFS2 because it provides local file-system semantics. Applications that are cluster-aware can use cache-coherent parallel I/O from multiple cluster nodes to balance activity across the cluster. Or, they can use the available file-system functionality to fail over and run on another node if a node fails.

OCFS2 has a large number of features that make it suitable for deployment in an enterprise-level computing environment:

  • Support for ordered and write-back data journaling that provides file-system consistency in the event of power failure or system crash.
  • Block sizes ranging from 512 bytes to 4 KB, and file-system cluster sizes ranging from 4 KB to 1 MB (both in increments of powers of 2). The maximum supported volume size is 16 TB, which corresponds to a cluster size of 4 KB. A volume size as large as 4 PB is theoretically possible for a cluster size of 1 MB, although this limit has not been tested.
  • Extent-based allocations for efficient storage of very large files.
  • Optimized allocation support for sparse files, inline data, unwritten extents, hole punching, reflinks, and allocation reservation for high performance and efficient storage.
  • Indexing of directories to allow efficient access to a directory even if it contains millions of objects.
  • Metadata checksums for the detection of corrupted inodes and directories.
  • Extended attributes to allow an unlimited number of name:value pairs to be attached to file system objects such as regular files, directories, and symbolic links.
  • Advanced security support for POSIX ACLs and SELinux in addition to the traditional file-access permission model.
  • Support for user and group quotas.
  • Support for heterogeneous clusters of nodes with a mixture of 32-bit and 64-bit, little-endian (x86, x86_64, ia64), and big-endian (ppc64) architectures.
  • An easy-to-configure, in-kernel cluster-stack (O2CB) with a distributed lock manager (DLM), which manages concurrent access from the cluster nodes.
  • Support for buffered, direct, asynchronous, splice, and memory-mapped I/O.
  • A tool set that uses similar parameters to the ext3 file system.

Getting Started

The following high-level configuration steps are required for this architecture:

  1. Attach an Oracle Cloud Infrastructure block volume to multiple compute instances (for iSCSI attachments, see the connection sketch after this list).
  2. Set up the OCFS2/O2CB cluster nodes.
  3. Create the OCFS2 file system and mount point.
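For iSCSI attachments, step 1 also requires each instance to log in to the volume's iSCSI target before the device appears in the OS. The IQN, IP address, and port used in the following commands come from the attachment's iSCSI Commands & Information section in the Console; the values shown here are placeholders.

$ sudo iscsiadm -m node -o new -T <volume_IQN> -p <volume_IP>:3260
$ sudo iscsiadm -m node -o update -T <volume_IQN> -n node.startup -v automatic
$ sudo iscsiadm -m node -T <volume_IQN> -p <volume_IP>:3260 -l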

Ports

Before you start the configuration, you need to open ports 7777 and 3260 in the security list for the virtual cloud network (VCN). In the Oracle Cloud Infrastructure Console, edit the VCN security list and perform ONE of the following steps:

  • Open all protocols for the internal subnet CIDR (172.0.0.0/16, not the public network):

    Source: 172.0.0.0/16
    IP Protocol: All Protocols
    Allows: all traffic for all ports
  • Open only the required ports, 7777 and 3260:

    Source: 172.0.0.0/16
    IP Protocol: TCP
    Source Port Range: All
    Destination Port Range: 7777
    Allows: TCP traffic for ports: 7777
    Source: 172.0.0.0/16
    IP Protocol: TCP
    Source Port Range: All
    Destination Port Range: 3260
    Allows: TCP traffic for ports: 3260

Note: Ports 7777 and 3260 also need to be opened in the local OS firewall. Following are the required commands for Oracle Linux 7.x. Review the documentation if you are using a different OS.

  • sudo firewall-cmd --zone=public --permanent --add-port=7777/tcp
  • sudo firewall-cmd --zone=public --permanent --add-port=3260/tcp
  • sudo firewall-cmd --complete-reload
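After the reload, you can confirm that firewalld has the ports open; the output should include 7777/tcp and 3260/tcp:

$ sudo firewall-cmd --zone=public --list-ports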

DNS

Additionally, ensure that DNS is working properly and that the compute instances can communicate across the tenancy’s availability domains. Here’s an example of the /etc/resolv.conf file based on this setup:

$ cat /etc/resolv.conf
; generated by /usr/sbin/dhclient-script
search baremetal.oraclevcn.com publicsubnetad1.baremetal.oraclevcn.com publicsubnetad2.baremetal.oraclevcn.com publicsubnetad3.baremetal.oraclevcn.com
nameserver 169.254.169.254

All availability domain DNS entries must be available in the resolv.conf file.
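A quick way to confirm that name resolution works across availability domains is to resolve and ping the peer node from each instance. The hostnames here are the ones used in the environment described in the next section; nslookup is provided by the bind-utils package, and ping succeeds only if ICMP is allowed in the security list (it is if you opened all protocols).

$ nslookup node2.publicsubnetad2.baremetal.oraclevcn.com
$ ping -c 3 node2.publicsubnetad2.baremetal.oraclevcn.com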

Environment

The environment contains the following compute instances:

  • OCFS2 Node1: node1.publicsubnetad1.baremetal.oraclevcn.com, IP address 172.0.0.41, Oracle Linux 7.x x86_64
  • OCFS2 Node2: node2.publicsubnetad2.baremetal.oraclevcn.com, IP address 172.0.1.42, Oracle Linux 7.x x86_64


Configuring OCFS2

Creating the Configuration File for the Cluster Stack

  1. Install the required OCFS2 packages:

    $ sudo yum install ocfs2-tools-devel ocfs2-tools -y
  2. Create the configuration file by using the o2cb command or a text editor. Let's use the following command to create a cluster definition. This command creates the /etc/ocfs2/cluster.conf cluster configuration file if it doesn’t already exist.

    $ sudo o2cb add-cluster ociocfs2
  3. For each node, use the following command to define the node:

    $ sudo o2cb add-node ociocfs2 node1 --ip 172.0.0.41
    $ sudo o2cb add-node ociocfs2 node2 --ip 172.0.1.42

    Note: The name of each node must match the system's hostname (as reported by the hostname command), and the IP address is the one that the node will use for private communication in the cluster. Copy the /etc/ocfs2/cluster.conf cluster configuration file to every node in the cluster (for example, with scp, as sketched after the listing below). Any changes made to the cluster configuration file don't take effect until the cluster stack is restarted.

    The following /etc/ocfs2/cluster.conf configuration file defines a 2-node cluster named ociocfs2 with a local heartbeat, which is the configuration used in this post.

    $ sudo cat /etc/ocfs2/cluster.conf
    cluster:
            heartbeat_mode = local
            node_count = 2
            name = ociocfs2
     
    node:
            number = 0
            cluster = ociocfs2
            ip_port = 7777
            ip_address = 172.0.0.41
            name = node1
     
    node:
            number = 1
            cluster = ociocfs2
            ip_port = 7777
            ip_address = 172.0.1.42
            name = node2
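To distribute the configuration file to the second node, as mentioned in the note above, one simple approach is scp. This is only a sketch; it assumes the default opc user and the node2 hostname from the environment table.

$ scp /etc/ocfs2/cluster.conf opc@node2.publicsubnetad2.baremetal.oraclevcn.com:/tmp/cluster.conf
$ # Then, on node2:
$ sudo mkdir -p /etc/ocfs2
$ sudo cp /tmp/cluster.conf /etc/ocfs2/cluster.conf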

Configuring the Cluster Stack

  1. Run the following command on each node of the cluster. The options are explained in the documentation.

    $ sudo /sbin/o2cb.init configure
    Configuring the O2CB driver.
     
    This will configure the on-boot properties of the O2CB driver.
    The following questions will determine whether the driver is loaded on
    boot. The current values will be shown in brackets ('[]'). Hitting
    <ENTER> without typing an answer will keep that current value. Ctrl-C
    will abort.
     
    Load O2CB driver on boot (y/n) [y]:
    Cluster stack backing O2CB [o2cb]:
    Cluster to start on boot (Enter "none" to clear) [ocfs2]: ociocfs2
    Specify heartbeat dead threshold (>=7) [31]:
    Specify network idle timeout in ms (>=5000) [30000]:
    Specify network keepalive delay in ms (>=1000) [2000]:
    Specify network reconnect delay in ms (>=2000) [2000]:
    Writing O2CB configuration: OK
    checking debugfs...
    Setting cluster stack "o2cb": OK
    Registering O2CB cluster "ociocfs2": OK
    Setting O2CB cluster timeouts : OK
    Starting global heartbeat for cluster "ociocfs2": OK
  2. To verify the settings for the cluster stack, run the /sbin/o2cb.init status command:

    $ sudo /sbin/o2cb.init status
    Driver for "configfs": Loaded
    Filesystem "configfs": Mounted
    Stack glue driver: Loaded
    Stack plugin "o2cb": Loaded
    Driver for "ocfs2_dlmfs": Loaded
    Filesystem "ocfs2_dlmfs": Mounted
    Checking O2CB cluster "ociocfs2": Online
      Heartbeat dead threshold: 31
      Network idle timeout: 30000
      Network keepalive delay: 2000
      Network reconnect delay: 2000
      Heartbeat mode: Local
    Checking O2CB heartbeat: Active
    Debug file system at /sys/kernel/debug: mounted

    In this example, the cluster is online and is using local heartbeat mode. If no volumes have been configured, the O2CB heartbeat is shown as Not Active rather than Active.

  3. Configure the o2cb and ocfs2 services so that they start at boot time after networking is enabled:

    $ sudo systemctl enable o2cb
    $ sudo systemctl enable ocfs2

    These settings allow the node to mount OCFS2 volumes automatically when the system starts. You can confirm that both services are enabled as sketched below.
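A quick check that both services are enabled for the next boot (each command should report enabled):

$ systemctl is-enabled o2cb
$ systemctl is-enabled ocfs2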

Configuring the Kernel for Cluster Operation

For the correct operation of the cluster, configure the following kernel settings:

  • panic: Specifies the number of seconds after a panic occurs before the system automatically resets itself. If the value is 0, the system stops responding, which allows you to collect detailed information about the panic for troubleshooting; this is the default value. To enable automatic reset, set a nonzero value. If you require a memory image (vmcore), allow enough time for Kdump to create this image. The suggested value is 30 seconds, although large systems might require a longer time.

  • panic_on_oops: Specifies that the system must panic if a kernel oops occurs. If a kernel thread required for cluster operation crashes, the system must reset itself. Otherwise, another node might not be able to tell whether a node is slow to respond or unable to respond, causing cluster operations to stop.

  1. On each node, enter the following commands to set the recommended values for panic and panic_on_oops:

    $ sudo sysctl kernel.panic=30
    $ sudo sysctl kernel.panic_on_oops=1
  2. To make the change persist across reboots, add the following entries to the /etc/sysctl.conf file, and then reload them as shown in the sketch after this list:

    # Define panic and panic_on_oops for cluster operation
    kernel.panic=30
    kernel.panic_on_oops=1
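To load the new /etc/sysctl.conf entries without a reboot and confirm the running values:

$ sudo sysctl -p
$ sysctl kernel.panic kernel.panic_on_oops
kernel.panic = 30
kernel.panic_on_oops = 1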

Starting and Stopping the Cluster Stack

The following commands perform various operations on the cluster stack:

  • /sbin/o2cb.init status: Check the status of the cluster stack.
  • /sbin/o2cb.init online: Start the cluster stack.
  • /sbin/o2cb.init offline: Stop the cluster stack.
  • /sbin/o2cb.init unload: Unload the cluster stack.


Creating and Mounting the Volumes

Creating the OCFS2 Volumes

Use the mkfs.ocfs2 command to create an OCFS2 volume on a device.

$ sudo mkfs.ocfs2 -L "ocfs2" /dev/sdb
mkfs.ocfs2 1.8.6
Cluster stack: classic o2cb
Label: ocfs2
Features: sparse extended-slotmap backup-super unwritten inline-data strict-journal-super xattr indexed-dirs refcount discontig-bg
Block size: 4096 (12 bits)
Cluster size: 4096 (12 bits)
Volume size: 12455405158400 (3040870400 clusters) (3040870400 blocks)
Cluster groups: 94274 (tail covers 512 clusters, rest cover 32256 clusters)
Extent allocator size: 780140544 (186 groups)
Journal size: 268435456
Node slots: 16
Creating bitmaps: done
Initializing superblock: done
Writing system files: done
Writing superblock: done
Writing backup superblock: 6 block(s)
Formatting Journals: done
Growing extent allocator: done
Formatting slot map: done
Formatting quota files: done
Writing lost+found: done
mkfs.ocfs2 successful
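You only need to run mkfs.ocfs2 from one node; the other node sees the same on-disk file system. As a quick check from node2 (assuming the shared volume also appears there as /dev/sdb, although device names can differ between instances), verify the label and file-system type:

$ sudo blkid /dev/sdb
/dev/sdb: LABEL="ocfs2" UUID="<uuid>" TYPE="ocfs2"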

Mounting the OCFS2 Volumes

  1. As shown in the following example, specify the _netdev option in /etc/fstab to allow the system to mount the OCFS2 volume at boot time after networking is started, and to unmount the file system before networking is stopped.

    $ sudo mkdir /ocfs2
    $ sudo vi /etc/fstab
    # Include the following line to mount the OCFS2 volume after a restart
    /dev/sdb /ocfs2 ocfs2 _netdev,defaults 0 0
  2. Run mount -a on each node to mount the OCFS2 partition based on the fstab entry, and verify the result as sketched after this step.
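After the volume is mounted on both nodes, a short verification sketch follows; mounted.ocfs2 is part of the ocfs2-tools package and, with the -f option, lists the cluster nodes that currently have the volume mounted.

$ df -h /ocfs2
$ sudo mounted.ocfs2 -f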

Congratulations! The cluster file system is mounted on /ocfs2 on both the Oracle Linux 7.x node1 and node2 servers.

Applications that are cluster-enabled can now use the OCFS2 storage just as they would use network-attached storage on premises. Planning an environment thoughtfully and making use of availability domains and capabilities such as OCFS2 can help increase the performance and availability of the solutions built on Oracle Cloud Infrastructure.
