dm nfs

A little known feature that we make good use of in Oracle VM is called dm nfs. Basically the ability to create a device mapper device directly on an nfs-based file/filesystem. We use this in Oracle VM 3 if your shared storage for the cluster is nfs based.

Oracle VM clustering relies on the OCFS2 clusterstack/filesystem that is native in the kernel (uek2/2.6.39-x). When we create an HA-enabled pool, we create, what we call, a pool filesystem. That filesystem contains an ocfs2 volume so that we can store cluster-wide data. In particular we store shared database files that are needed by the Oracle VM agents on the nodes for HA. It contains info on pool membership, which VMs are in HA mode, what the pool IP is etc...

When the user provides an nfs filesystem for the pool, we do the following :

  • mount the nfs volume in /nfsmnt/
  • create a 10GB sized file ovspoolfs.img
  • create a dm nfs volume(/dev/mapper/ovspoolfs> on this ovspoolfs.img file
  • create an ocfs2 volume on this dm nfs device
  • mount the ocfs2 volume on /poolfsmnt/
  • If someone wants to try out something that relies on block-based shared storage devices, such as ocfs2, but does not have iSCSI or SAN storage, using nfs is an alternative and dm nfs just makes it really easy.

    To do this yourself, the following commands will do it for you :

  • to find out if any such devices exist just type dmsetup table --target nfs
  • to create your own device, do something like this:
  • mount mynfsserver:/mountpoint /mnt
    dd if=/dev/zero of=/mnt/myvolume.img bs=1M count=2000 
    dmsetup create myvolume --table "0 4096000 nfs /mnt/myvolume.img 0"
    
    So mount the nfs volume, create a file which will be the container of the blockdevice, in this case a 2GB file and then create the dm device. The values for the dmsetup command are the following:

    myvolume = the name of the /dev/mapper device. Here we end up with /dev/mapper/myvolume

    table = start (normally always 0), number of blocks/length, this is in 512byte blocks, so you double the number,nfs since this is on nfs, filename of the nfs based file, offset (normally always 0)

    So now you have /dev/mapper/myvolume, it acts like a normal block device. If you do this on multiple servers, you can actually create an ocfs2 filesystem on this block device and it will be consistent across the servers.

    Credits go to Chuck Lever for writing dm nfs in the first place, thanks Chuck :) The code for dm nfs is here.

    Comments:

    Post a Comment:
    • HTML Syntax: NOT allowed
    About

    Wim Coekaerts is the Senior Vice President of Linux and Virtualization Engineering for Oracle. He is responsible for Oracle's complete desktop to data center virtualization product line and the Oracle Linux support program.

    You can follow him on Twitter at @wimcoekaerts

    Search

    Categories
    Archives
    « April 2014
    SunMonTueWedThuFriSat
      
    1
    2
    3
    4
    5
    6
    7
    9
    10
    11
    12
    13
    14
    15
    16
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
       
           
    Today