
Tutorial: Using Ceph Distributed Storage Cluster on Bare Metal Cloud Services

Gilson Melo
Principal Product Manager


This tutorial describes how to deploy a Ceph Distributed Storage Cluster on Oracle Bare Metal Cloud Services using the Oracle Linux operating system.

Ceph is fully supported on Oracle Linux as described in the public documentation.

As an additional configuration option on Bare Metal Cloud Services, you can either keep all Ceph resources in a single availability domain (AD) or spread them across all ADs. Depending on network traffic utilization, it might be better to keep all resources in a single AD, or to distribute them across different availability domains for fault tolerance.
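If you choose the distributed option, you can, once the cluster is up, tell CRUSH which hosts live in which AD so that replicas are placed in different domains. The sketch below is only an illustration: the bucket names ad1/ad2 and the rule name replicated_ad are hypothetical, and the standard 'rack' bucket type is reused here to represent an availability domain.

$ sudo ceph osd crush add-bucket ad1 rack
$ sudo ceph osd crush add-bucket ad2 rack
$ sudo ceph osd crush move ad1 root=default
$ sudo ceph osd crush move ad2 root=default
$ sudo ceph osd crush move osd1 rack=ad1
$ sudo ceph osd crush move osd2 rack=ad2
$ sudo ceph osd crush rule create-simple replicated_ad default rack

The affected pool then has to be switched to the new rule with ceph osd pool set; check the CRUSH documentation for the exact option name on your Ceph release.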

Here is an example of a Ceph Distributed Storage Cluster architecture that can be used on Bare Metal Cloud Services:

[Figure: Ceph Distributed Storage Cluster architecture on Bare Metal Cloud Services (ceph-arch.png)]

 

What is Ceph Distributed Storage Cluster?

Ceph is a widely used open source storage platform that provides high performance, reliability, and scalability. This free distributed storage system offers interfaces for object, block, and file-level storage.

In this tutorial, you will install and build a Ceph cluster on Oracle Linux 7.X with the following components:

  • Ceph OSDs (ceph-osd) - Handle data storage, data replication, and recovery. A Ceph cluster needs at least two Ceph OSD servers, which in this tutorial are based on Oracle Linux.
  • Ceph Monitor (ceph-mon) - Monitors the cluster state, the OSD map, and the CRUSH map.
  • Ceph Metadata Server (ceph-mds) - Needed only if you want to use Ceph as a file system.

Additional details can be found in the Ceph public documentation; it's important that you understand these components before proceeding with the initial configuration.

Environment

  • 5 server nodes, all with Oracle Linux 7.X installed.
  • Root privileges on all nodes.
  • /dev/sdb is an empty 50GB disk (an attached iSCSI block storage volume).
  • /dev/sdb2 - Ceph journal partition.

You can check the disk layout directly on an OSD node with fdisk.

$ ssh osd1

$ sudo fdisk -l /dev/sdb
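If /dev/sdb is still unpartitioned, a layout matching the environment above (a data partition plus /dev/sdb2 as the journal) could be created with parted. This is only a sketch under the assumption that the whole 50GB volume is dedicated to Ceph; adjust the split to your needs.

$ sudo parted -s /dev/sdb mklabel gpt
$ sudo parted -s /dev/sdb mkpart ceph-data xfs 0% 80%
$ sudo parted -s /dev/sdb mkpart ceph-journal 80% 100%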

 

Next, deploy the management key to all associated nodes.

$ ceph-deploy admin ceph-admin mon1 osd1 osd2

 

Change the permission of the key file by running the command below on all nodes.

$ sudo chmod 644 /etc/ceph/ceph.client.admin.keyring
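Because the same permission change is needed on every node, you can also run it from the ceph-admin node in a loop, assuming passwordless SSH to the nodes is already configured:

$ for node in ceph-admin mon1 osd1 osd2; do ssh $node "sudo chmod 644 /etc/ceph/ceph.client.admin.keyring"; done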

 

Testing your Ceph setup

You have installed and created your new Ceph cluster and added the OSD nodes to it. Now you can test the cluster and make sure there are no errors in the setup.

From the ceph-admin node, log in to the ceph monitor server 'mon1'.

$ ssh mon1

 

Check the cluster health.

$ sudo ceph health

HEALTH_OK

 

Check the cluster status.

$ sudo ceph -s
    cluster 66adb950-1fc4-447b-9898-6b6cd7c45a40
     health HEALTH_OK
     monmap e1: 1 mons at {mon1=172.0.0.28:6789/0}
            election epoch 3, quorum 0 mon1
     osdmap e10: 2 osds: 2 up, 2 in
            flags sortbitwise
      pgmap v21: 64 pgs, 1 pools, 0 bytes data, 0 objects
            68744 kB used, 92045 MB / 92112 MB avail
                  64 active+clean

 

Make sure the Ceph health is OK as shown above, that there is a monitor node 'mon1' with IP address '172.0.0.28', and that both OSD servers are up and in. The available capacity should be roughly 90GB, coming from the two 50GB Ceph data disks minus the journal partitions.
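As an additional check, you can list the OSDs and their placement in the CRUSH hierarchy, and look at the raw capacity; both OSDs should be reported as up:

$ sudo ceph osd tree
$ sudo ceph df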

Your new Ceph cluster setup is done. Now you are ready to use your new Ceph block device.

 

Configure Ceph Client Node

In this section, you will configure an Oracle Linux 7.x server as a Ceph client, preparing it the same way as the other Ceph nodes (mon, osd).

Log in to the Ceph client node either through the Ceph admin node or using the Bare Metal instance public IP. Then add a new 'cephuser' account and set a password for the user.

$ sudo useradd -d /home/cephuser -m cephuser

$ sudo passwd cephuser

 

Repeat the visudo process, disable SELinux, and configure NTP on the client as described above.
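For reference, those preparation steps on the client typically look like the sketch below; a sudoers drop-in file is used here instead of an interactive visudo session, and your image may ship chronyd instead of ntpd, so adjust accordingly.

$ echo "cephuser ALL = (root) NOPASSWD:ALL" | sudo tee /etc/sudoers.d/cephuser
$ sudo chmod 0440 /etc/sudoers.d/cephuser
$ sudo setenforce 0
$ sudo sed -i 's/^SELINUX=enforcing/SELINUX=permissive/' /etc/selinux/config
$ sudo yum install -y ntp
$ sudo systemctl enable ntpd
$ sudo systemctl start ntpd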

Make sure you can SSH into the client instance from ceph-admin, as previously done for the other nodes.

$ ssh client

[cephuser@client ~]$

 

Install Ceph on Client Node

In this step, you will install Ceph on the client node from the ceph-admin node.

Log in to the ceph-admin node as root via SSH and become "cephuser" with su. Then go to the Ceph cluster directory; in this tutorial that is the 'cluster' directory.

 

$ su - cephuser

Last login: Tue Jul 18 21:25:25 GMT 2017 on pts/0

$ cd /cluster/

 

Install Ceph on the client node with ceph-deploy and then push the configuration and the admin key to the client node.

$ ceph-deploy install client

$ ceph-deploy admin client

 

The Ceph installation will take some time, depending on the server and network speed. When the task has finished, connect to the client node and change the permission of the admin key.

 

$ ssh client

$ sudo chmod 644 /etc/ceph/ceph.client.admin.keyring

 

Ceph has been installed on the client node.

 

Configure and Mount Ceph as Block Device

Ceph allows users to use the Ceph cluster as a thin-provisioned block device. You can mount the Ceph storage like a normal hard drive on your system. Ceph Block Storage, or Ceph RADOS Block Device (RBD), stores block device images as objects and automatically stripes and replicates your data across the Ceph cluster. Ceph RBD is also integrated with KVM, so you can use it as block storage on various virtualization platforms.

Before creating a new block device on the client node, you must check the cluster status as done above. Log in to the Ceph monitor node and check the cluster state.

$ ssh mon1

$ sudo ceph -s

 

Make sure the cluster health is 'HEALTH_OK' and the pgmap shows 'active+clean'.

After confirming that, you are ready to proceed with the client configuration. For this tutorial, you will use Ceph as a block device or block storage on a client server running Oracle Linux 7 as the client node operating system. From the ceph-admin node, connect to the client node with SSH. No password is required because you configured passwordless logins for that node.

$ ssh client

 

Ceph provides the rbd command for managing RADOS block device images. You can create a new image, resize it, create a snapshot, and export your block devices with the rbd command.

In this tutorial, you will create a new rbd image with a size of 40GB, and then check that 'disk01' shows up in the rbd list.

$ rbd create disk01 --size 40960

$ rbd ls -l
NAME     SIZE PARENT FMT PROT LOCK
disk01 40960M          2
[cephuser@client ~]$
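For illustration only, the same rbd command can also resize the image or snapshot it later; the snapshot name 'snap1' below is just an example, and the rest of this tutorial assumes disk01 stays at 40GB, so skip these commands if you are following along.

$ rbd resize disk01 --size 51200    # grow the image to 50GB
$ rbd snap create disk01@snap1      # take a snapshot named snap1
$ rbd snap ls disk01                # list snapshots of disk01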

 

Next, activate the rbd kernel module.

$ sudo modprobe rbd

$ sudo rbd feature disable disk01 exclusive-lock object-map fast-diff deep-flatten

 

Now, map the disk01 image to a block device via the rbd kernel module, and then make sure disk01 appears in the list of mapped devices.

$ sudo rbd map disk01

/dev/rbd0

$ rbd showmapped
id pool image  snap device
0  rbd  disk01 -    /dev/rbd0

 

You can see that the disk01 image has been mapped as the '/dev/rbd0' device. Before using it to store data, you have to format the disk01 image with the mkfs command. For this tutorial you will use the XFS file system.

$ sudo mkfs.xfs /dev/rbd0

 

Now, mount '/dev/rbd0' on the /mnt directory.

$ sudo mount /dev/rbd0 /mnt

 

The Ceph RBD or RADOS Block Device has been configured and mounted on the system. Check that the device has been mounted correctly with the df command.

$ df -hT | grep rbd0

/dev/rbd0      xfs        40G   33M   40G   1% /mnt
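To confirm that the mounted device is writable, you can create a small test file and remove it afterwards:

$ sudo dd if=/dev/zero of=/mnt/testfile bs=1M count=100
$ ls -lh /mnt/testfile
$ sudo rm /mnt/testfile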

 

Setup RBD at Boot time

After finishing the Ceph client configuration, you will configure the client to automatically mount the Ceph block device at boot time. One way of doing that is to create a script in the /usr/local/bin directory that mounts and unmounts the RBD disk01.

 

$ cd /usr/local/bin/

$ sudo vim rbd-mount

 

Paste the script below and feel free to modify it based on your requirements.

#!/bin/bash
# Script Author: http://bryanapperson.com/

# Change this to your pool name
export poolname=rbd

# Change this to your disk image name
export rbdimage=disk01

# Mount directory
export mountpoint=/mnt/mydisk

# The mount/unmount action is passed from the systemd service as an argument
# Determine if we are mounting or unmounting
if [ "$1" == "m" ]; then
   modprobe rbd
   rbd feature disable $rbdimage exclusive-lock object-map fast-diff deep-flatten
   rbd map $rbdimage --id admin --keyring /etc/ceph/ceph.client.admin.keyring
   mkdir -p $mountpoint
   mount /dev/rbd/$poolname/$rbdimage $mountpoint
fi

if [ "$1" == "u" ]; then
   umount $mountpoint
   rbd unmap /dev/rbd/$poolname/$rbdimage
fi

 

Save the file and exit vim, then make it executable with chmod.

$ sudo chmod +x rbd-mount
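Before wiring the script into systemd, you can test it by hand. Since disk01 is still mapped and mounted on /mnt from the previous section, release that first; the /mnt/mydisk mount point comes from the script above.

$ sudo umount /mnt
$ sudo rbd unmap /dev/rbd0
$ sudo /usr/local/bin/rbd-mount m
$ df -hT | grep mydisk
$ sudo /usr/local/bin/rbd-mount u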

 

Next, go to the systemd directory and create the service file.

$ cd /etc/systemd/system/

$ sudo vim rbd-mount.service

 

Paste the service configuration below:

[Unit]
Description=RADOS block device mapping for disk01 in pool rbd
Conflicts=shutdown.target
Wants=network-online.target
After=NetworkManager-wait-online.service

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/local/bin/rbd-mount m
ExecStop=/usr/local/bin/rbd-mount u

[Install]
WantedBy=multi-user.target

 

Save the file and exit vim.

Reload the systemd files and enable the rbd-mount service to start at boot time.

$ sudo systemctl daemon-reload

$ sudo systemctl enable rbd-mount.service
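You can also start the service right away and verify the mount without waiting for a reboot:

$ sudo systemctl start rbd-mount.service
$ systemctl status rbd-mount.service
$ df -hT | grep mydisk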

 

If you reboot the client node now, rbd 'disk01' will automatically be mounted to the '/mnt/mydisk' directory.

Your Ceph Distributed Storage Cluster and Client configuration are done!

Try out Oracle’s Bare Metal Cloud Services by signing up for a free cloud trial today.

Find out more about Bare Metal Storage Services at: https://cloud.oracle.com/en_US/bare-metal-storage
