
Using GlusterFS on Oracle Cloud Infrastructure: Quickly and Easily

Gilson Melo
Principal Product Manager

Why GlusterFS?

GlusterFS is a distributed scale-out filesystem that allows rapid provisioning of additional storage based on your storage consumption needs. It incorporates automatic failover as a primary feature, and all of this is accomplished without a centralized metadata server. GlusterFS is open source, is available as part of the Linux OS, and works well on Oracle Cloud Infrastructure services.

This experience is based on a real customer proof of concept conducted by the author, testing 57 Oracle Cloud Infrastructure nodes against a 52TB GlusterFS volume holding 26TB of data, accessed simultaneously by an HPC (High Performance Computing) application.

What will this Tutorial Cover?

This tutorial describes the steps to deploy a highly available GlusterFS storage environment on Oracle Cloud Infrastructure instances using a Distributed GlusterFS Volume. There are different types of GlusterFS volumes, as explained in the GlusterFS public documentation, and it is important to understand them before proceeding with the initial configuration. This blog demonstrates how to use a Distributed GlusterFS Volume on two Oracle Linux instances accessed by one Linux client.

As for the configuration options on Oracle Cloud Infrastructure, you can either keep all GlusterFS resources in a single availability domain (AD) or spread them across all ADs, as shown below. Depending on your network traffic utilization, decide whether to keep all GlusterFS resources in a single AD or distribute them across availability domains for fault tolerance. For production environments, adding a Gluster Replicated Volume to the setup is recommended to avoid data loss. In addition to using a replicated GlusterFS volume for fault tolerance (Distributed and Replicated Volume), you should enable the GlusterFS Trash Translator and snapshots to assist with file recovery if needed.
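For example, once a volume exists, the Trash Translator can be turned on per volume with a single volume option. Note that GlusterFS snapshots additionally require bricks on thin-provisioned LVM, which the bricks created later in this tutorial are not, so consult the GlusterFS documentation before enabling them. A minimal sketch, assuming the volume name glustervol1 used later in this tutorial:

$ sudo gluster volume set glustervol1 features.trash on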

Here is an example of a GlusterFS architecture that can be used on Oracle Cloud Infrastructure:

[Figure: GlusterFS-Oracle-BMCS.png - GlusterFS architecture on Oracle Cloud Infrastructure]

 

Getting Started

For this setup, you need to create at least three instances: two instances to hold the Distributed GlusterFS Volume, and a third instance as a GlusterFS client with a local mount point for the GlusterFS volume. As an optional step for this tutorial, create two 100GB block volumes and attach one to each GlusterFS server (two GlusterFS servers, each with one 100GB block volume); they will be used later to create a single 200GB GlusterFS volume as shown in the picture above.
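As a quick sanity check after attaching the block volumes (and completing the iSCSI attach steps if you used iSCSI attachments), you can confirm that the new device is visible on each server. The device name used in the rest of this tutorial is /dev/sdb, but it may differ in your environment:

$ sudo lsblk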

 

Environment

Server  gfsserver.publicsubnetad1.baremetal.oraclevcn.com Oracle Linux 7.x x86_64
Server gfsserver1.publicsubnetad1.baremetal.oraclevcn.com Oracle Linux 7.x x86_64
Client gfsclient.publicsubnetad1.baremetal.oraclevcn.com Oracle Linux 7.x x86_64

 

GlusterFS Server Installation

Enable the GlusterFS repository on the Oracle Linux 7.x x86_64 instances that will hold your GlusterFS volume(s). We recommend checking the Oracle Linux public yum portal for the current package versions. Below is an example for the gluster310 packages.

$ sudo yum-config-manager --add-repo http://yum.oracle.com/repo/OracleLinux/OL7/developer_gluster310/x86_64 
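To confirm the repository was added correctly, you can list the configured repositories; the exact repository id will vary with the URL used above:

$ sudo yum repolist | grep -i gluster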

 

Install GlusterFS Server and Samba packages on all nodes that will be used as GlusterFS servers.

$ sudo yum install glusterfs-server samba -y

 

XFS Bricks

Skip these optional steps if your environment already has XFS bricks (partitions) prepared. Use the /dev/sdb device, which is the 100GB block volume you attached earlier. Change the device name if necessary based on your current configuration. Now, create the brick1 logical volume for the XFS brick on both cluster nodes, as shown below.

$ sudo pvcreate /dev/sdb

$ sudo vgcreate vg_gluster /dev/sdb

$ sudo lvcreate -L 100000M -n brick1 vg_gluster    
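You can verify that the physical volume, volume group, and logical volume were created as expected with the standard LVM reporting commands:

$ sudo pvs
$ sudo vgs
$ sudo lvs vg_gluster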

 

Set up the XFS file system:

$ sudo mkfs.xfs /dev/vg_gluster/brick1   

 

Create a mount point and mount the XFS brick:

$ sudo mkdir -p /bricks/brick1

$ sudo mount /dev/vg_gluster/brick1 /bricks/brick1   

 

Extend /etc/fstab by inserting the following line:

/dev/vg_gluster/brick1  /bricks/brick1    xfs     defaults,_netdev,nofail  0 0 
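A quick way to validate the new fstab entry without rebooting is to unmount the brick and remount everything defined in /etc/fstab:

$ sudo umount /bricks/brick1
$ sudo mount -a
$ df -h /bricks/brick1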

 

Important Note: as described in the public documentation about connecting to iSCSI-attached block volumes on Linux instances, it is important to include the "_netdev" and "nofail" options on every non-root block volume in /etc/fstab, or the instance may fail to launch because the OS tries to mount the volume(s) before the iSCSI initiator has started.

 

For example, a single volume in /etc/fstab might look like this:

/dev/sdb    /data1    ext4      defaults,noatime,_netdev,nofail       0      2 

Without these options in fstab, the instance will fail to start after the next reboot. Instances in this state are not recoverable.

 

Trusted Pool (Storage Cluster)

Enable and start glusterd.service on both nodes:

$ sudo systemctl enable glusterd.service  

$ sudo systemctl start glusterd.service
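To confirm the daemon is active on each node:

$ sudo systemctl status glusterd.service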

 

Ports "TCP:24007-24008" are required for communication between GlusterFS nodes and each brick requires another TCP port starting at 24009.

Enable required ports on the firewall:

$ sudo firewall-cmd --zone=public --add-port=24007-24008/tcp --permanent 

$ sudo firewall-cmd --reload

 

You also need to open ports "24007-24008" on the Dashboard. Edit the Virtual Cloud Network security list and either open all ports for the internal network (NOT the public network), as shown in the first example below for network 172.0.0.0/16, or open just ports "24007-24008", as shown in the second example.

Source: 172.0.0.0/16

IP Protocol: All Protocols

Allows: all traffic for all ports   

or 

Source: 172.0.0.0/16

IP Protocol: TCP

Source Port Range: All

Destination Port Range: 24007-24008

Allows: TCP traffic for ports: 24007-24008

 

Now, use the gluster command to probe the second GlusterFS node and create a Trusted Pool (Storage Cluster).

$ sudo gluster peer probe gfsserver1

peer probe: success

 

Verify cluster peer:

$ sudo gluster peer status

Number of Peers: 1

Hostname: gfsserver1

Uuid: 955326ef-fb6f-4d51-8c2a-9166b7d3b5f8

State: Peer in Cluster (Connected)
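As an additional check, you can list all members of the trusted pool (including the local node) from either server:

$ sudo gluster pool list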

 

High Availability GlusterFS Volumes

A GlusterFS volume is a logical collection of XFS bricks. The following table shows how the volume type determines the usable size:

Distributed (for maximum space): 1G + 1G = 2G
Replicated (for high availability): 1G + 1G = 1G
Striped (for large files): 1G + 1G = 2G
Distributed and Replicated: (1G+1G) + (1G+1G) = 2G
Distributed and Striped: (1G+1G) + (1G+1G) = 4G
Distributed, Replicated, and Striped: [(1G+1G)+(1G+1G)] + [(1G+1G)+(1G+1G)] = 4G

 

NOTE: for production environments, it is recommended to use a Distributed GlusterFS volume along with the Replicated option to protect against data loss.
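For illustration only, a Distributed Replicated volume could be created from four bricks spread across the two servers, similar to the following sketch. The /bricks/brick2 paths are hypothetical and not part of this tutorial's setup; with "replica 2", bricks are paired in the order listed, so each replica pair spans both servers:

$ sudo gluster volume create glustervol2 replica 2 transport tcp gfsserver:/bricks/brick1/brick gfsserver1:/bricks/brick1/brick gfsserver:/bricks/brick2/brick gfsserver1:/bricks/brick2/brick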

Open the required port on the firewall. Remember, each brick in the GlusterFS Volume requires a TCP port starting at 24009:

$ sudo firewall-cmd --zone=public --add-port=24009/tcp --permanent   

$ sudo firewall-cmd --reload
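You can confirm which ports are currently open in the firewall's public zone, which should now include both the management ports and the brick port:

$ sudo firewall-cmd --zone=public --list-ports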

 

The same port needs to be opened on the Dashboard as well, in case you haven't opened all ports for the internal network.

Use the /bricks/brick1 XFS partition on both nodes to create the Distributed GlusterFS volume. First, create a sub-directory in the /bricks/brick1 mount point; this sub-directory will be used as the GlusterFS brick.

$ sudo mkdir /bricks/brick1/brick   

 

Create a Distributed GlusterFS Volume by running these commands on the first node only (gfsserver):

$ sudo gluster volume create glustervol1 transport tcp gfsserver:/bricks/brick1/brick gfsserver1:/bricks/brick1/brick

$ sudo gluster volume start glustervol1

 

Verify the GlusterFS volume

$ sudo gluster volume info all

   Volume Name: glustervol1

   Type: Distribute

   Volume ID: c660712f-29ea-4288-96b6-2c0a0c85a82a

   Status: Started

   Snapshot Count: 0

   Number of Bricks: 2

   Transport-type: tcp

   Bricks:

   Brick1: gfsserver:/bricks/brick1/brick

   Brick2: gfsserver1:/bricks/brick1/brick

   Options Reconfigured:

   transport.address-family: inet

   nfs.disable: on

 

GlusterFS Clients

GlusterFS volumes can be accessed using the GlusterFS Native Client (Oracle Linux 6.x and 7.x), NFS v3 (other Linux clients), or CIFS (Windows clients).

Open the Firewall for Glusterfs/NFS/CIFS Clients

$ sudo firewall-cmd --zone=public --add-service=nfs --add-service=samba --add-service=samba-client --permanent

$ sudo firewall-cmd --zone=public --add-port=111/tcp --add-port=139/tcp --add-port=445/tcp --add-port=965/tcp --add-port=2049/tcp --add-port=38465-38469/tcp --add-port=631/tcp --add-port=111/udp --add-port=963/udp --add-port=49152-49251/tcp  --permanent    

$ sudo firewall-cmd --reload

 

As already mentioned above, the same ports need to be opened on the Dashboard.

 

Access from another Linux machine via GlusterFS Native Client

All required GlusterFS client packages are available by default in the Oracle Linux 7.x base repository.

Install GlusterFS Client packages:

$ sudo yum install glusterfs glusterfs-fuse attr -y    

 

Mount GlusterFS Volumes on the client:

$ sudo mkdir /glusterfs

$ sudo mount -t glusterfs gfsserver:/glustervol1 /glusterfs

$ sudo mount |grep -i glusterfs

gfsserver:/glustervol1 on /glusterfs type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)

$ df -kh |grep glusterfs

gfsserver:/glustervol1   196G   65M   98G   1% /glusterfs

 

To make the mount permanent across restarts, add the following line to /etc/fstab on all GlusterFS Linux clients:

gfsserver:/glustervol1       /glusterfs  glusterfs   defaults,_netdev,nofail  0  0
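As on the servers, you can validate the client fstab entry without rebooting by unmounting the volume and remounting everything defined in /etc/fstab:

$ sudo umount /glusterfs
$ sudo mount -a
$ df -h /glusterfs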

 

Check the GlusterFS public documentation for more details about:

  • Troubleshooting steps
  • How to set up Windows/Linux machine access to GlusterFS volumes via CIFS

Try out Oracle’s Cloud Services by signing up for a free cloud trial today.
