Wednesday Jan 09, 2013

How to Treat an NFS File As a Block Storage Device


Wim actually beat me to blogging about this feature while I was on vacation, but I'd like to add a little more background about dm-nfs, which I gathered from our kernel developers:

What is dm-nfs?

The dm-nfs kernel module provides a device-mapper target that allows you to treat an NFS file as a block device. It provides loopback-style emulation of a block device using a regular file as backing storage. The backing file resides on a remote system and is accessed via the NFS protocol.

The general idea is to have a more-efficient-than-loop access to files on NFS. The device mapper module directly converts requests to the dm device into NFS RPC calls.

dm-nfs is used transparently by Oracle VM's Dom0 when mounting NFS-backed virtual disks. It essentially allows for asynchronous and direct I/O to an NFS-backed block device, which is a lot faster than normal NFS for virtual disks. The Xen block hotplug script has been modified on OVM to look for files that are on NFS file systems. If the file is on NFS, OVM uses dm-nfs automatically; otherwise, it falls back to the regular (but slower) loop mount method.
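
The selection logic described above can be sketched in a few lines of shell. This is only an illustration and not the actual Oracle VM hotplug script: the function names are hypothetical, and it assumes GNU stat, whose -f -c '%T' output names the type of the file system that holds the file (normally "nfs" for an NFS mount).

```shell
#!/bin/sh
# Hypothetical helpers; not taken from the Oracle VM hotplug script.

# Map a file system type name to the backend that would be used.
backend_for_fstype() {
    case "$1" in
        nfs|nfs4) echo "dm-nfs" ;;  # NFS-backed file: use the dm-nfs target
        *)        echo "loop" ;;    # anything else: fall back to a loop device
    esac
}

# Look up the type of the file system holding a file and pick a backend.
backend_for_file() {
    fstype=$(stat -f -c '%T' "$1") || return 1
    backend_for_fstype "$fstype"
}
```

Calling backend_for_file "$filename" prints the name of the backend that this sketch would choose for that disk image.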

The original dm-nfs module was written by Chuck Lever. It has been supported and used by Oracle VM since version 2.2 and is also included in the Unbreakable Enterprise Kernel for Oracle Linux.

Why this feature matters

This feature creates virtual disk devices (LUNs) where the data is stored in an NFS file instead of on local storage. Managed networked storage has many benefits over keeping virtual devices on a disk local to the physical host.

A sample use case is the fast migration of guest VMs for load balancing or if a physical host requires maintenance. This functionality is also possible using iSCSI LUNs, but the advantage of dm-nfs is that you can manage new virtual drives on a local host system, rather than requiring a storage administrator to initialize new LUNs on the storage subsystem. Host administrators can handle their own virtual disk provisioning.

For durability and performance, dm-nfs uses asynchronous and direct I/O so all I/O operations are performed efficiently and coherently. Guest disk data is not double cached on the underlying host. If the underlying host crashes, there's a lower probability of data corruption. If the guest is frozen, a clean backup can be taken of the virtual disk, as you can be certain that its data has been fully written out.

How to use it

You use dm-nfs by first loading the kernel module, then using dmsetup to create a device mapper device on your file. The syntax is very similar to the dm-linear module.

The following sample code demonstrates how to use dmsetup to create a mapped device (/dev/mapper/$dm_nfsdev) for the file $filename that is accessible on a mounted NFS file system:

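# dmsetup table lengths are given in 512-byte sectors, so convert the file size from bytes
nblks=$(( $(stat -c '%s' "$filename") / 512 ))
echo -n "0 $nblks nfs $filename 0" | dmsetup create "$dm_nfsdev"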

Now you can mount /dev/mapper/$dm_nfsdev as you would any other block device that contains a file system.
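
For repeated use, the steps can be wrapped in small shell functions. The following is a minimal sketch under stated assumptions, not an official tool: the function names are hypothetical, it assumes the module can be loaded under the name "dm-nfs", and it relies on device-mapper tables expressing the start and length in 512-byte sectors, which is why the byte count reported by stat is divided by 512.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; the create/remove steps need root.

# Print a device-mapper table line for an NFS-backed file.
# stat -c '%s' reports the size in bytes, while dmsetup expects
# 512-byte sectors; a trailing partial sector is ignored.
dm_nfs_table() {
    bytes=$(stat -c '%s' "$1") || return 1
    echo "0 $((bytes / 512)) nfs $1 0"
}

# Load the module and create /dev/mapper/<name> backed by <file>.
# Assumption: the module is loadable under the name "dm-nfs".
dm_nfs_create() {
    modprobe dm-nfs || return 1
    dm_nfs_table "$2" | dmsetup create "$1"
}

# Remove the mapping again once the device has been unmounted.
dm_nfs_remove() {
    dmsetup remove "$1"
}
```

With these in place, dm_nfs_create "$dm_nfsdev" "$filename" sets up the device, and dm_nfs_remove "$dm_nfsdev" tears it down after you unmount it.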

- Lenz Grimmer (Oracle Linux Blog)


Tuesday Nov 13, 2012

We Need More Migration!


Eva Mendez says, "Oye chico, do you really want to keep your data in that tired legacy file system when it could be enjoying encryption, compression, deduplication, snapshots, remote replication and other benefits provided by ZFS in Oracle Solaris 11?

It's really not that hard to cross over. If you know how."

"I don't know how, me dices? Esta bien, papacito. Go to OTN. Take my word for it. They know how."

Aw shucks, Eva. Anything for you!

The Best Way to Migrate Data From Legacy File Systems to ZFS

To migrate data from a legacy filesystem to ZFS in Oracle Solaris 11, you need to install the shadow-migration package and enable the shadowd service. Then follow the simple procedure described by Dominic Kay.

How to Update to Oracle Solaris 11 Using the Image Packaging System

Oracle Solaris 11.1 has been released. You can upgrade using either Oracle's official Solaris release repository or, if you have a support contract, the Support repository. Peter Dennis explains how.

How to Migrate Oracle Database from Oracle Solaris 8 to Oracle Solaris 11

How to use the Oracle Solaris 8 P2V (physical to virtual) Archiver tool, which comes with Oracle Solaris Legacy Containers, to migrate a physical Oracle Solaris 8 system with Oracle Database and an Oracle Automatic Storage Management file system into an Oracle Solaris 8 branded zone inside an Oracle Solaris 10 guest domain on top of an Oracle Solaris 11 control domain.

- Ricardo


Wednesday Aug 15, 2012

It's Better with Btrfs


Two recently published articles to help you become proficient with the Btrfs file system in Oracle Linux:

How I Got Started with the Btrfs File System in Oracle Linux

By Margaret Bierman

Scalability and volume management. Write methodology and access. Tunables. Margaret describes these capabilities of the Btrfs file system, plus how it deals with redundant configurations, checksums, fault isolation and much more. She also walks you through the steps to create and set up a Btrfs file system so you can become familiar with it.

How I Use the Advanced Features of the Btrfs File System

By Margaret Bierman

How to create and mount a Btrfs file system. How to copy and delete files. How to create and manage a redundant file system configuration. How to check the integrity of the file system and its remaining capacity. How to take snapshots. How to clone. And more. In this article Margaret explores the more advanced features of the Btrfs file system.

Let us know what you think, and what you'd like to see Margaret write about in the future.

- Rick


Wednesday Aug 31, 2011

Save disk space on Linux by cloning files on Btrfs and OCFS2

"Dolly" by Rebecca W (CC BY-SA 2.0).

Btrfs and OCFS2 are two very advanced file systems for Linux. Btrfs is a next-generation local file system for Linux, and it provides a number of nice features like snapshots and subvolumes, dynamic resizing and built-in RAID functionality. OCFS2 is the ideal candidate for creating cluster file systems that can be shared across multiple machines (but it can also be used for local storage).

There is one neat little feature that both Btrfs and OCFS2 have in common — they are capable of creating "lightweight" copies ("snapshots" or "clones") of a file.

In this case the file system does not create a new link pointing to an existing inode; rather, it creates a new inode that shares the same disk blocks as the original file. This means that the operation only works within the boundaries of the same file system or subvolume. The outcome looks very much like a copy of the source file, but the actual data blocks have not been duplicated. Because the clone is copy-on-write, a modification of either file will not be visible in the other. Note that this should not be confused with hard links; this web page provides a good explanation of the differences.

For Btrfs, you can invoke this feature by using the cp(1) utility with the --reflink option, which was added to the GNU coreutils in version 7.5 (released in Aug. 2009):

cp --reflink <source file> <destination file>
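
To see how a clone differs from a hard link, you can compare inode numbers and then modify one of the two files. The sketch below uses hypothetical helper names and assumes GNU cp and GNU stat. Note that a bare --reflink means --reflink=always, which fails on file systems that cannot clone; the sketch uses --reflink=auto instead, which falls back to a regular copy in that case.

```shell
#!/bin/sh
# Hypothetical helpers for illustration.

# Clone a file, sharing data blocks where the file system supports it.
# --reflink=auto falls back to a normal copy if cloning is unavailable.
clone_file() {
    cp --reflink=auto "$1" "$2"
}

# Print the inode number of a file (GNU stat).
inode_of() {
    stat -c '%i' "$1"
}
```

After clone_file a b, inode_of a and inode_of b print different numbers (a hard link created with ln would print the same one), and appending data to b leaves a unchanged.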

Adding support for the reflink implementation of OCFS2 to cp still seems to be under development. For now, you need to download and install a separate reflink binary from here. It works like the ln(1) utility:

reflink <source file> <destination file>

Wim covered OCFS2 reflink in more detail in a blog post a while ago and there is another example for OCFS2 on our Wiki.

These kinds of file clones save disk space and complete much faster than copying entire files. This can be quite useful if you need to create copies of very large files that differ very little from each other, e.g. virtual machine disk images. In this case the disk space savings can be quite significant!


Rick Ramsey
Kemer Thomson
and members of the OTN community

