Friday Dec 06, 2013

Oracle Linux containers continued

More on Linux containers... in particular the use of btrfs and how easy it makes creating clones/snapshots of container images. To get started, you need an Oracle Linux 6.5 installation with UEKr3, and lxc installed and configured.

lxc by default uses /container as the directory to store container images and metadata: the root filesystem lives in /container/[containername]/rootfs and the configuration in /container/[containername]/config. You can specify an alternative pathname using -P. To make it easy, I added an extra disk (xvdc) to the VM I use to try out containers, and I just mount that volume under /container.

- Create btrfs volume

If not yet installed, install btrfs-progs (yum install btrfs-progs)

# mkfs.btrfs /dev/xvdc1

# mount /dev/xvdc1 /container 
You can auto-mount this at startup by adding a line to /etc/fstab

/dev/xvdc1		/container		btrfs   defaults 0 0

- Create a container

# lxc-create -n OracleLinux59 -t oracle -- -R 5.9
This creates a btrfs subvolume /container/OracleLinux59/rootfs

Use the following command to verify :

# btrfs subvolume list /container/
ID 260 gen 33 top level 5 path OracleLinux59/rootfs

- Start/Stop container

# lxc-start -n OracleLinux59

This starts the container, but without extra options your current shell becomes the console of the container.
Add -d to return control to the shell after starting the container, and -c [file] to log console output to a file.

# lxc-start -n OracleLinux59 -d -c /tmp/OL59console

# lxc-stop -n OracleLinux59

- Clone a container using btrfs's snapshot feature, support for which is built into lxc

# lxc-clone -o OracleLinux59 -n OracleLinux59-dev1 -s
Tweaking configuration
Copying rootfs...
Create a snapshot of '/container/OracleLinux59/rootfs' in '/container/OracleLinux59-dev1/rootfs'
Updating rootfs...
'OracleLinux59-dev1' created

# btrfs subvolume list /container/
ID 260 gen 34 top level 5 path OracleLinux59/rootfs
ID 263 gen 34 top level 5 path OracleLinux59-dev1/rootfs

This snapshot clone is instantaneous and is a copy-on-write snapshot.
You can check space usage like this :

# btrfs filesystem df /container
Data: total=1.01GB, used=335.17MB
System: total=4.00MB, used=4.00KB
Metadata: total=264.00MB, used=25.25MB

# lxc-clone -o OracleLinux59 -n OracleLinux59-dev2 -s
Tweaking configuration
Copying rootfs...
Create a snapshot of '/container/OracleLinux59/rootfs' in '/container/OracleLinux59-dev2/rootfs'
Updating rootfs...
'OracleLinux59-dev2' created

# btrfs filesystem df /container
Data: total=1.01GB, used=335.17MB
System: total=4.00MB, used=4.00KB
Metadata: total=264.00MB, used=25.29MB
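
To put a number on how little the second clone cost, the two btrfs filesystem df readings above can be compared. This is just a minimal sketch: delta_mb is a helper name I made up for illustration (it is not part of lxc or btrfs-progs), and the values are copied from the output above.

```shell
# delta_mb BEFORE AFTER: print the difference between two sizes given in MB.
# (Hypothetical helper, not part of lxc or btrfs-progs.)
delta_mb() {
    awk -v before="$1" -v after="$2" 'BEGIN { printf "%.2f\n", after - before }'
}

# Metadata used before and after creating OracleLinux59-dev2
delta_mb 25.25 25.29
# Data used before and after creating OracleLinux59-dev2
delta_mb 335.17 335.17
```

The data figure does not move at all and metadata grows by about 0.04MB, which is what you would expect from a copy-on-write snapshot.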

- Adding Oracle Linux 6.5

# lxc-create -n OracleLinux65 -t oracle -- -R 6.5

lxc-create: No config file specified, using the default config /etc/lxc/default.conf
Host is OracleServer 6.5
Create configuration file /container/OracleLinux65/config
Downloading release 6.5 for x86_64
...
Configuring container for Oracle Linux 6.5
Added container user:oracle password:oracle
Added container user:root password:root
Container : /container/OracleLinux65/rootfs
Config    : /container/OracleLinux65/config
Network   : eth0 (veth) on virbr0
'oracle' template installed
'OracleLinux65' created

- Install an RPM in a running container

# lxc-attach -n OracleLinux59-dev1 -- yum install mysql
Setting up Install Process
Resolving Dependencies
--> Running transaction check
---> Package mysql.i386 0:5.0.95-3.el5 set to be updated
..
Complete!

This connects to the container and executes yum install mysql inside the container.

- Modify container resource usage

# lxc-cgroup -n OracleLinux59-dev1 memory.limit_in_bytes 53687091
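
memory.limit_in_bytes takes the limit in bytes, so the 53687091 above works out to roughly 51MB. Here is a small sketch for getting from a size in MiB to a byte count; mib_to_bytes is a hypothetical helper and 512 is just an example value, not something from my setup.

```shell
# mib_to_bytes N: print N MiB expressed in bytes.
# (Hypothetical helper for illustration only.)
mib_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

mib_to_bytes 512

# The result could then be handed to lxc-cgroup, for example:
#   lxc-cgroup -n OracleLinux59-dev1 memory.limit_in_bytes "$(mib_to_bytes 512)"
```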

# lxc-cgroup -n OracleLinux59-dev1 cpuset.cpus
0-3

# lxc-cgroup -n OracleLinux59-dev1 cpuset.cpus 0,1

This assigns cores 0 and 1. You can also use ranges such as 0-2, or mix ranges and single cores in a comma-separated list.
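
If you want to see which cores a cpuset-style list actually covers, it can be expanded with a few lines of shell. A sketch only: expand_cpus is a made-up helper name, and it assumes well-formed input such as "0,1" or "0-2,5".

```shell
# expand_cpus LIST: print one CPU number per line for a cpuset-style list
# such as "0,1" or "0-2,5". (Hypothetical helper for illustration only.)
expand_cpus() {
    echo "$1" | tr ',' '\n' | while IFS=- read -r first last; do
        # A single number has no "-", so last is empty: treat it as a range of one.
        last=${last:-$first}
        i=$first
        while [ "$i" -le "$last" ]; do
            echo "$i"
            i=$((i + 1))
        done
    done
}

expand_cpus "0-2,5"
```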

# lxc-cgroup -n OracleLinux59-dev1 cpu.shares
1024

# lxc-cgroup -n OracleLinux59-dev1 cpu.shares 100

# lxc-cgroup -n OracleLinux59-dev1 cpu.shares
100

# lxc-cgroup -n OracleLinux59-dev1 blkio.weight
500

# lxc-cgroup -n OracleLinux59-dev1 blkio.weight 20

etc...
A list of resource control parameters : http://docs.oracle.com/cd/E37670_01/E37355/html/ol_subsystems_cgroups.html#ol_cpu_cgroups

Lenz has created a Hands-on lab which you can find here : https://wikis.oracle.com/display/oraclelinux/Hands-on+Lab+-+Linux+Containers

Tuesday Oct 18, 2011

get the rpms

A few days ago I wrote this blog entry. It was a little example on how to use a container on Oracle Linux with our 2.6.39 kernel.

We just pushed the RPMs to both public-yum and ULN. So you can subscribe to the UEK2 beta channel or configure your yum repository for public-yum and get the packages.

To try out what I did in my blog, make sure you install the 2.6.39 uek2 kernel, get the latest btrfs-progs and get the lxc tools.

Thursday Sep 29, 2011

btrfs compression

Another day another btrfs feature :) Compression

btrfs has built-in compression, with support for both lzo and zlib. This allows you to automatically compress data for the entire filesystem, subvolumes or even down to individual files. A mount option is all that's needed to specify which compression type to use: compress=lzo or compress=zlib.

To try this out, I took a simple setup where I used a linux kernel tree as a testcase. Here are the results:

mount the volume on /mnt and untar a linux-2.6.32.tar.bz2 tree onto /mnt
# btrfs filesystem df /mnt
Data, RAID1: total=1.00GB, used=369.81MB
Data: total=8.00MB, used=0.00
System, RAID1: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, RAID1: total=1.00GB, used=57.75MB
Metadata: total=8.00MB, used=0.00
Now, let's do this again,
# mount -o compress=lzo /dev/sdf /mnt

# btrfs filesystem df /mnt
Data, RAID1: total=1.00GB, used=188.95MB
Data: total=8.00MB, used=0.00
System, RAID1: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, RAID1: total=1.00GB, used=41.37MB
Metadata: total=8.00MB, used=0.00
So, 188.95M instead of 369.81M - pretty darned cool... go lzo. But... what about zlib?
# mount -o compress=zlib /dev/sdf /mnt

# btrfs filesystem df /mnt
Data, RAID1: total=1.00GB, used=128.75MB
Data: total=8.00MB, used=0.00
System, RAID1: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, RAID1: total=1.00GB, used=35.45MB
Metadata: total=8.00MB, used=0.00
Down to 128M! Of course from a filesystem point of view you don't have to do anything, just pass a mount option or set the attribute. As I mentioned earlier, btrfs lets you do this at a file level, a directory level (and inherit down for that directory) or all the way to the top level.
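
To compare the three runs a bit more precisely, the "used" data figures can be turned into percentages of the uncompressed size. A minimal sketch, with pct_of being a helper name I made up; the numbers are the ones from the btrfs filesystem df output above.

```shell
# pct_of ORIGINAL COMPRESSED: print COMPRESSED as a percentage of ORIGINAL.
# (Hypothetical helper for illustration only.)
pct_of() {
    awk -v orig="$1" -v comp="$2" 'BEGIN { printf "%.1f\n", comp / orig * 100 }'
}

pct_of 369.81 188.95   # lzo
pct_of 369.81 128.75   # zlib
```

That comes out at about 51% of the original size for lzo and about 35% for zlib on this particular kernel tree; other data will of course compress differently.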

More cool bits : The compression uses kernel threads and will make use of as many threads as there are CPUs. So compression gets load-balanced across all threads in a server. Even if it's a single big file, we split it up into 128KB chunks and compress those in parallel.
Another cool bit : if you have an existing uncompressed filesystem, and want to compress it, or even just compress a file on it, you can do that with btrfs filesystem defragment. The defragment command has an option -c that lets you specify zlib or lzo.
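
To get a feel for how much parallel work a single big file turns into, here is a small sketch that counts 128KB chunks for a given file size. chunk_count is a hypothetical helper, and it simply assumes the 128KB chunk size mentioned above.

```shell
# chunk_count SIZE_IN_BYTES: print how many 128KB chunks are needed to
# cover a file of that size, rounding up for a partial last chunk.
# (Hypothetical helper; the chunk size is the 128KB quoted in the text.)
chunk_count() {
    size=$1
    chunk=$(( 128 * 1024 ))
    echo $(( (size + chunk - 1) / chunk ))
}

# A 100MB file (104857600 bytes)
chunk_count 104857600
```

For a 100MB file that is 800 chunks that can be spread across the compression threads.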

Wednesday Sep 28, 2011

btrfs scrub - go fix corruptions with mirror copies please!

Another day, another btrfs entry. I'm trying to learn all the ins and outs of the filesystem here.

As many of you know, btrfs supports CRC for data and metadata. I created a simple btrfs filesystem :

# mkfs.btrfs -L btrfstest -d raid1 -m raid1 /dev/sdb /dev/sdc
then mounted it on /btrfs and, from that directory, created a file on the volume :
# dd if=/dev/urandom of=foo bs=1M count=100

# md5sum /btrfs/foo 
76f4c03dc7a3477939467ee230696b70  /btrfs/foo
So now let's play the bad guy and write over the disk itself, underneath the filesystem, so it has no idea. This could be a shared device with another server that accidentally had data written on it, or a bad userspace program that spews out to the wrong device, or even a bug in the kernel...

Step 1: find the physical layout of the file :
# filefrag -v /btrfs/foo
Filesystem type is: 9123683e
File size of /btrfs/foo is 104857600 (25600 blocks, blocksize 4096)
 ext logical physical expected length flags
   0       0   269312           25600 eof
/btrfs/foo: 1 extent found

# echo $[4096*269312]
1103101952
The filesystem is 4k blocksize and we know it's at block 269312. Now we call btrfs-map-logical to find out what the physical offsets are on both the mirrors (/dev/sdb /dev/sdc) so I can happily overwrite it with junk.
# btrfs-map-logical -l 1103101952 -o scratch /dev/sdb
mirror 1 logical 1103101952 physical 1083179008 device /dev/sdc
mirror 2 logical 1103101952 physical 1103101952 device /dev/sdb
There we go. Now, let's scribble :
# dd if=/dev/urandom of=/dev/sdc bs=1 count=50000 seek=1083179008
So we just wrote 50000 bytes of random stuff to /dev/sdc at the offset of its copy of file foo.
Accessing the file still gives the right md5sum, but now we have this command called scrub that can be run at any time. It goes through the filesystem you specify, checks for any nasty errors and repairs them. It does this by creating a kernel thread that works in the background, and you can use scrub status later to see where it's at.
# btrfs scrub start /btrfs

# btrfs scrub status /btrfs
scrub status for 15e213ad-4e2a-44f6-85d8-86d13e94099f
scrub started at Wed Sep 28 12:36:26 2011 and finished after 2 seconds
     total bytes scrubbed: 200.48MB with 13 errors
     error details: csum=13
     corrected errors: 13, uncorrectable errors: 0, unverified
As you can see above, the scrubber found 13 errors. A quick peek in dmesg shows the following :
btrfs: fixed up at 1103101952
btrfs: fixed up at 1103106048
btrfs: fixed up at 1103110144
btrfs: fixed up at 1103114240
btrfs: fixed up at 1103118336
btrfs: fixed up at 1103122432
btrfs: fixed up at 1103126528
btrfs: fixed up at 1103130624
btrfs: fixed up at 1103134720
btrfs: fixed up at 1103138816

# md5sum /btrfs/foo 
76f4c03dc7a3477939467ee230696b70  /btrfs/foo
Everything got repaired. This happens on both data and metadata. If there was a true IO error reading from one of the 2 sides we'd have handled that in the filesystem as well. If you don't have mirroring then with CRC it would have told you it was bad data and given you an IO error (instead of reading junk).
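
The count of 13 errors lines up with the arithmetic: 50000 bytes written at a 4k-aligned offset overlap 13 blocks of 4096 bytes. A small sketch of that calculation, assuming one checksum error gets reported per 4k block touched; blocks_touched is a made-up helper name.

```shell
# blocks_touched OFFSET LENGTH BLOCKSIZE: print how many blocks of
# BLOCKSIZE bytes overlap the byte range [OFFSET, OFFSET + LENGTH).
# (Hypothetical helper for illustration only.)
blocks_touched() {
    offset=$1
    length=$2
    bs=$3
    first=$(( offset / bs ))
    last=$(( (offset + length - 1) / bs ))
    echo $(( last - first + 1 ))
}

# The dd above: 50000 bytes at physical offset 1083179008, 4096-byte blocks
blocks_touched 1083179008 50000 4096
```

The "fixed up at" addresses in dmesg are also 4096 bytes apart, starting at the logical offset 1103101952 of the file.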

Monday Sep 26, 2011

btrfs root and yum update snapshots

ok so now it's Monday and I found a few minutes to continue my weekend project at work :)

Today, I want to take my OL6.1 with UEK setup and convert the root ext4 partition to btrfs. Then use yum update to create a snapshot before rpm installs/updates so that if something goes wrong, one can revert back to the original state.
here's my story :

The default OL6 install uses ext4 for the root filesystem (/). So the first step in my test is to convert the ext4 filesystem into a btrfs filesystem. The cool thing is that btrfs actually lets you do that: there's a tool called btrfs-convert which takes a volume as an argument, converts ext[2,3,4] to btrfs and leaves the original ext[2,3,4] as a snapshot, so you can even go back to it if you want to.

In order to do this I did the following :

- prepare the initrd to have btrfs built in: rebuild it by running mkinitrd using --with-module=btrfs, so that the kernel module for the btrfs filesystem is included in the initrd
- find a boot ISO that has btrfs-convert on it (not yet on the OL6 ISOs)
- reboot the machine in rescue mode off of the ISO image
- run btrfs-convert on the root volume; in my case it was /dev/mapper/vg_wcoekaersrv3-lv_root
- edit etc/fstab

/dev/mapper/vg_wcoekaersrv3-lv_root /                       ext4    defaults        1 1
to
/dev/mapper/vg_wcoekaersrv3-lv_root /                       btrfs    defaults        1 1
- reboot OL6 again
- at reboot OL presents a message saying that SELinux has to relabel the files. This takes a few minutes and is automatically followed by another reboot
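
The fstab edit in the steps above can also be scripted. This is only a sketch: set_root_fstype is a helper I made up, it prints the edited content to stdout instead of touching the file, and the example runs against a scratch file rather than the real fstab.

```shell
# set_root_fstype FILE TYPE: print FILE with the filesystem type of the
# "/" entry replaced by TYPE. Comment lines are passed through untouched.
# (Hypothetical helper; it writes to stdout and does not modify FILE.)
set_root_fstype() {
    awk -v fstype="$2" '
        $1 !~ /^#/ && $2 == "/" { $3 = fstype }
        { print }
    ' "$1"
}

# Try it on a scratch copy rather than the real fstab.
scratch=$(mktemp)
printf '%s\n' '/dev/mapper/vg_wcoekaersrv3-lv_root / ext4 defaults 1 1' > "$scratch"
set_root_fstype "$scratch" btrfs
rm -f "$scratch"
```

Note that awk re-joins the fields of the changed line with single spaces, so the column alignment of that line is lost; the entry itself stays valid.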

From this point on, you have OL6 running with btrfs as root filesystem.

# mount
/dev/mapper/vg_wcoekaersrv3-lv_root on / type btrfs (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw,rootcontext="system_u:object_r:tmpfs_t:s0")
/dev/sda1 on /boot type ext4 (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)

The original ext snapshot is still available as a subvolume :
# btrfs subvolume list /
ID 256 top level 5 path ext2_saved
I don't need it any more so I am just going to throw it out :
# btrfs subvolume delete /ext2_saved
Delete subvolume '//ext2_saved'

# btrfs subvolume list /
Just to run optimally, it's a good idea to de-fragment the volume as we inherit the old ext4 layout.
# btrfs filesystem defragment /
There. done.
Next up - make sure the yum-plugin-fs-snapshot is installed
# rpm -qa|grep yum-plugin
yum-plugin-fs-snapshot-1.1.30-6.el6.noarch
If not, just run yum install yum-plugin-fs-snapshot; it's on the OL6 media/ULN.

So, now the big experiment. I want to do a yum update. Thanks to the installed plugin, yum will detect that the filesystem is btrfs and it will automatically, prior to installing new rpms, create a snapshot, then install.
It's a long list in this case; the interesting tidbits are the "Loaded plugins: fs-snapshot" and "fs-snapshot: snapshotting" lines...
# yum update
Loaded plugins: fs-snapshot
Setting up Update Process
Resolving Dependencies
--> Running transaction check
---> Package binutils.x86_64 0:2.20.51.0.2-5.20.el6 will be updated
---> Package binutils.x86_64 0:2.20.51.0.2-5.20.el6_1.1 will be an update
---> Package ca-certificates.noarch 0:2010.63-3.el6 will be updated
---> Package ca-certificates.noarch 0:2010.63-3.el6_1.5 will be an update
---> Package certmonger.x86_64 0:0.42-1.el6 will be updated
---> Package certmonger.x86_64 0:0.42-1.el6_1.2 will be an update
---> Package cifs-utils.x86_64 0:4.8.1-2.el6 will be updated
---> Package cifs-utils.x86_64 0:4.8.1-2.el6_1.2 will be an update
---> Package cups.x86_64 1:1.4.2-39.el6 will be updated
---> Package cups.x86_64 1:1.4.2-39.el6_1.1 will be an update
---> Package cups-libs.x86_64 1:1.4.2-39.el6 will be updated
---> Package cups-libs.x86_64 1:1.4.2-39.el6_1.1 will be an update
---> Package ipa-client.x86_64 0:2.0.0-23.el6_1.1 will be updated
---> Package ipa-client.x86_64 0:2.0.0-23.el6_1.2 will be an update
---> Package ipa-python.x86_64 0:2.0.0-23.el6_1.1 will be updated
---> Package ipa-python.x86_64 0:2.0.0-23.el6_1.2 will be an update
---> Package kernel-uek-devel.x86_64 0:2.6.39-100.0.5.el6uek will be installed
---> Package kernel-uek-headers.x86_64 0:2.6.32-100.34.1.el6uek will be updated
---> Package kernel-uek-headers.x86_64 0:2.6.32-200.16.1.el6uek will be updated
---> Package kernel-uek-headers.x86_64 0:2.6.39-100.0.5.el6uek will be an update
---> Package kpartx.x86_64 0:0.4.9-41.0.1.el6 will be updated
---> Package kpartx.x86_64 0:0.4.9-41.0.1.el6_1.1 will be an update
---> Package nss.x86_64 0:3.12.9-9.0.1.el6 will be updated
---> Package nss.x86_64 0:3.12.9-12.0.1.el6_1 will be an update
---> Package nss-sysinit.x86_64 0:3.12.9-9.0.1.el6 will be updated
---> Package nss-sysinit.x86_64 0:3.12.9-12.0.1.el6_1 will be an update
---> Package nss-tools.x86_64 0:3.12.9-9.0.1.el6 will be updated
---> Package nss-tools.x86_64 0:3.12.9-12.0.1.el6_1 will be an update
---> Package perf.x86_64 0:2.6.32-131.6.1.el6 will be updated
---> Package perf.x86_64 0:2.6.32-131.12.1.el6 will be an update
---> Package phonon-backend-gstreamer.x86_64 1:4.6.2-17.el6 will be updated
---> Package phonon-backend-gstreamer.x86_64 1:4.6.2-17.el6_1.1 will be an update
---> Package portreserve.x86_64 0:0.0.4-4.el6 will be updated
---> Package portreserve.x86_64 0:0.0.4-4.el6_1.1 will be an update
---> Package qt.x86_64 1:4.6.2-17.el6 will be updated
---> Package qt.x86_64 1:4.6.2-17.el6_1.1 will be an update
---> Package qt-sqlite.x86_64 1:4.6.2-17.el6 will be updated
---> Package qt-sqlite.x86_64 1:4.6.2-17.el6_1.1 will be an update
---> Package qt-x11.x86_64 1:4.6.2-17.el6 will be updated
---> Package qt-x11.x86_64 1:4.6.2-17.el6_1.1 will be an update
---> Package rsyslog.x86_64 0:4.6.2-3.el6_1.1 will be updated
---> Package rsyslog.x86_64 0:4.6.2-3.el6_1.2 will be an update
---> Package samba-client.x86_64 0:3.5.6-86.el6 will be updated
---> Package samba-client.x86_64 0:3.5.6-86.el6_1.4 will be an update
---> Package samba-common.x86_64 0:3.5.6-86.el6 will be updated
---> Package samba-common.x86_64 0:3.5.6-86.el6_1.4 will be an update
---> Package samba-winbind-clients.x86_64 0:3.5.6-86.el6 will be updated
---> Package samba-winbind-clients.x86_64 0:3.5.6-86.el6_1.4 will be an update
---> Package selinux-policy.noarch 0:3.7.19-93.0.1.el6_1.2 will be updated
---> Package selinux-policy.noarch 0:3.7.19-93.0.1.el6_1.7 will be an update
---> Package selinux-policy-targeted.noarch 0:3.7.19-93.0.1.el6_1.2 will be updated
---> Package selinux-policy-targeted.noarch 0:3.7.19-93.0.1.el6_1.7 will be an update
---> Package tzdata.noarch 0:2011h-2.el6 will be updated
---> Package tzdata.noarch 0:2011h-3.el6 will be an update
---> Package tzdata-java.noarch 0:2011h-2.el6 will be updated
---> Package tzdata-java.noarch 0:2011h-3.el6 will be an update
---> Package xmlrpc-c.x86_64 0:1.16.24-1200.1840.el6 will be updated
---> Package xmlrpc-c.x86_64 0:1.16.24-1200.1840.el6_1.4 will be an update
---> Package xmlrpc-c-client.x86_64 0:1.16.24-1200.1840.el6 will be updated
---> Package xmlrpc-c-client.x86_64 0:1.16.24-1200.1840.el6_1.4 will be an update
--> Finished Dependency Resolution
--> Running transaction check
---> Package kernel-uek-devel.x86_64 0:2.6.32-100.28.9.el6 will be erased
--> Finished Dependency Resolution

Dependencies Resolved

=================================================================================================
 Package                  Arch   Version                   Repository                       Size
=================================================================================================
Installing:
 kernel-uek-devel         x86_64 2.6.39-100.0.5.el6uek     kernel-uek-2.6.39-100.0.5-alpha 7.3 M
Updating:
 binutils                 x86_64 2.20.51.0.2-5.20.el6_1.1  ol6_latest                      2.8 M
 ca-certificates          noarch 2010.63-3.el6_1.5         ol6_latest                      531 k
 certmonger               x86_64 0.42-1.el6_1.2            ol6_latest                      193 k
 cifs-utils               x86_64 4.8.1-2.el6_1.2           ol6_latest                       41 k
 cups                     x86_64 1:1.4.2-39.el6_1.1        ol6_latest                      2.3 M
 cups-libs                x86_64 1:1.4.2-39.el6_1.1        ol6_latest                      314 k
 ipa-client               x86_64 2.0.0-23.el6_1.2          ol6_latest                       88 k
 ipa-python               x86_64 2.0.0-23.el6_1.2          ol6_latest                      491 k
 kernel-uek-headers       x86_64 2.6.39-100.0.5.el6uek     kernel-uek-2.6.39-100.0.5-alpha 716 k
 kpartx                   x86_64 0.4.9-41.0.1.el6_1.1      ol6_latest                       41 k
 nss                      x86_64 3.12.9-12.0.1.el6_1       ol6_latest                      772 k
 nss-sysinit              x86_64 3.12.9-12.0.1.el6_1       ol6_latest                       28 k
 nss-tools                x86_64 3.12.9-12.0.1.el6_1       ol6_latest                      749 k
 perf                     x86_64 2.6.32-131.12.1.el6       ol6_latest                      998 k
 phonon-backend-gstreamer x86_64 1:4.6.2-17.el6_1.1        ol6_latest                      125 k
 portreserve              x86_64 0.0.4-4.el6_1.1           ol6_latest                       22 k
 qt                       x86_64 1:4.6.2-17.el6_1.1        ol6_latest                      4.0 M
 qt-sqlite                x86_64 1:4.6.2-17.el6_1.1        ol6_latest                       50 k
 qt-x11                   x86_64 1:4.6.2-17.el6_1.1        ol6_latest                       12 M
 rsyslog                  x86_64 4.6.2-3.el6_1.2           ol6_latest                      450 k
 samba-client             x86_64 3.5.6-86.el6_1.4          ol6_latest                       11 M
 samba-common             x86_64 3.5.6-86.el6_1.4          ol6_latest                       13 M
 samba-winbind-clients    x86_64 3.5.6-86.el6_1.4          ol6_latest                      1.1 M
 selinux-policy           noarch 3.7.19-93.0.1.el6_1.7     ol6_latest                      741 k
 selinux-policy-targeted  noarch 3.7.19-93.0.1.el6_1.7     ol6_latest                      2.4 M
 tzdata                   noarch 2011h-3.el6               ol6_latest                      438 k
 tzdata-java              noarch 2011h-3.el6               ol6_latest                      150 k
 xmlrpc-c                 x86_64 1.16.24-1200.1840.el6_1.4 ol6_latest                      103 k
 xmlrpc-c-client          x86_64 1.16.24-1200.1840.el6_1.4 ol6_latest                       25 k
Removing:
 kernel-uek-devel         x86_64 2.6.32-100.28.9.el6       installed                        22 M

Transaction Summary
=================================================================================================
Install       1 Package(s)
Upgrade      29 Package(s)
Remove        1 Package(s)

Total size: 63 M
Is this ok [y/N]: y
Downloading Packages:
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
fs-snapshot: snapshotting /: /yum_20110926132957
  Updating   : nss-sysinit-3.12.9-12.0.1.el6_1.x86_64                                       1/61 
  Updating   : nss-3.12.9-12.0.1.el6_1.x86_64                                               2/61 
  Updating   : xmlrpc-c-1.16.24-1200.1840.el6_1.4.x86_64                                    3/61 
  Updating   : xmlrpc-c-client-1.16.24-1200.1840.el6_1.4.x86_64                             4/61 
  Updating   : samba-winbind-clients-3.5.6-86.el6_1.4.x86_64                                5/61 
  Updating   : samba-common-3.5.6-86.el6_1.4.x86_64                                         6/61 
  Updating   : certmonger-0.42-1.el6_1.2.x86_64                                             7/61 
  Updating   : nss-tools-3.12.9-12.0.1.el6_1.x86_64                                         8/61 
  Updating   : ca-certificates-2010.63-3.el6_1.5.noarch                                     9/61 
  Updating   : 1:qt-4.6.2-17.el6_1.1.x86_64                                                10/61 
  Updating   : 1:qt-sqlite-4.6.2-17.el6_1.1.x86_64                                         11/61 
  Updating   : 1:qt-x11-4.6.2-17.el6_1.1.x86_64                                            12/61 
  Updating   : 1:phonon-backend-gstreamer-4.6.2-17.el6_1.1.x86_64                          13/61 
  Updating   : portreserve-0.0.4-4.el6_1.1.x86_64                                          14/61 
  Updating   : ipa-python-2.0.0-23.el6_1.2.x86_64                                          15/61 
  Updating   : 1:cups-libs-1.4.2-39.el6_1.1.x86_64                                         16/61 
  Updating   : selinux-policy-3.7.19-93.0.1.el6_1.7.noarch                                 17/61 
  Updating   : selinux-policy-targeted-3.7.19-93.0.1.el6_1.7.noarch                        18/61 
  Updating   : 1:cups-1.4.2-39.el6_1.1.x86_64                                              19/61 
  Updating   : ipa-client-2.0.0-23.el6_1.2.x86_64                                          20/61 
  Updating   : samba-client-3.5.6-86.el6_1.4.x86_64                                        21/61 
  Updating   : tzdata-2011h-3.el6.noarch                                                   22/61 
  Updating   : cifs-utils-4.8.1-2.el6_1.2.x86_64                                           23/61 
  Updating   : rsyslog-4.6.2-3.el6_1.2.x86_64                                              24/61 
  Installing : kernel-uek-devel-2.6.39-100.0.5.el6uek.x86_64                               25/61 
  Updating   : kernel-uek-headers-2.6.39-100.0.5.el6uek.x86_64                             26/61 
  Updating   : binutils-2.20.51.0.2-5.20.el6_1.1.x86_64                                    27/61 
  Updating   : tzdata-java-2011h-3.el6.noarch                                              28/61 
  Updating   : perf-2.6.32-131.12.1.el6.x86_64                                             29/61 
  Updating   : kpartx-0.4.9-41.0.1.el6_1.1.x86_64                                          30/61 
  Cleanup    : selinux-policy-targeted-3.7.19-93.0.1.el6_1.2.noarch                        31/61 
  Cleanup    : selinux-policy-3.7.19-93.0.1.el6_1.2.noarch                                 32/61 
  Cleanup    : tzdata-2011h-2.el6.noarch                                                   33/61 
  Cleanup    : kernel-uek-headers.x86_64                                                   34/61 
  Cleanup    : kernel-uek-headers.x86_64                                                   35/61 
  Cleanup    : tzdata-java-2011h-2.el6.noarch                                              36/61 
  Cleanup    : perf-2.6.32-131.6.1.el6.x86_64                                              37/61 
  Cleanup    : kernel-uek-devel-2.6.32-100.28.9.el6.x86_64                                 38/61 
  Cleanup    : ipa-client-2.0.0-23.el6_1.1.x86_64                                          39/61 
  Cleanup    : certmonger-0.42-1.el6.x86_64                                                40/61 
  Cleanup    : 1:qt-x11-4.6.2-17.el6.x86_64                                                41/61 
  Cleanup    : 1:phonon-backend-gstreamer-4.6.2-17.el6.x86_64                              42/61 
  Cleanup    : samba-client-3.5.6-86.el6.x86_64                                            43/61 
  Cleanup    : 1:cups-1.4.2-39.el6.x86_64                                                  44/61 
  Cleanup    : samba-common-3.5.6-86.el6.x86_64                                            45/61 
  Cleanup    : 1:qt-sqlite-4.6.2-17.el6.x86_64                                             46/61 
  Cleanup    : 1:qt-4.6.2-17.el6.x86_64                                                    47/61 
  Cleanup    : xmlrpc-c-client-1.16.24-1200.1840.el6.x86_64                                48/61 
  Cleanup    : nss-tools-3.12.9-9.0.1.el6.x86_64                                           49/61 
  Cleanup    : ca-certificates-2010.63-3.el6.noarch                                        50/61 
  Cleanup    : nss-sysinit-3.12.9-9.0.1.el6.x86_64                                         51/61 
  Cleanup    : nss-3.12.9-9.0.1.el6.x86_64                                                 52/61 
  Cleanup    : xmlrpc-c-1.16.24-1200.1840.el6.x86_64                                       53/61 
  Cleanup    : samba-winbind-clients-3.5.6-86.el6.x86_64                                   54/61 
  Cleanup    : 1:cups-libs-1.4.2-39.el6.x86_64                                             55/61 
  Cleanup    : portreserve-0.0.4-4.el6.x86_64                                              56/61 
  Cleanup    : ipa-python-2.0.0-23.el6_1.1.x86_64                                          57/61 
  Cleanup    : cifs-utils-4.8.1-2.el6.x86_64                                               58/61 
  Cleanup    : rsyslog-4.6.2-3.el6_1.1.x86_64                                              59/61 
  Cleanup    : binutils-2.20.51.0.2-5.20.el6.x86_64                                        60/61 
  Cleanup    : kpartx-0.4.9-41.0.1.el6.x86_64                                              61/61 

Removed:
  kernel-uek-devel.x86_64 0:2.6.32-100.28.9.el6                                                  

Installed:
  kernel-uek-devel.x86_64 0:2.6.39-100.0.5.el6uek                                                

Updated:
  binutils.x86_64 0:2.20.51.0.2-5.20.el6_1.1                                                     
  ca-certificates.noarch 0:2010.63-3.el6_1.5                                                     
  certmonger.x86_64 0:0.42-1.el6_1.2                                                             
  cifs-utils.x86_64 0:4.8.1-2.el6_1.2                                                            
  cups.x86_64 1:1.4.2-39.el6_1.1                                                                 
  cups-libs.x86_64 1:1.4.2-39.el6_1.1                                                            
  ipa-client.x86_64 0:2.0.0-23.el6_1.2                                                           
  ipa-python.x86_64 0:2.0.0-23.el6_1.2                                                           
  kernel-uek-headers.x86_64 0:2.6.39-100.0.5.el6uek                                              
  kpartx.x86_64 0:0.4.9-41.0.1.el6_1.1                                                           
  nss.x86_64 0:3.12.9-12.0.1.el6_1                                                               
  nss-sysinit.x86_64 0:3.12.9-12.0.1.el6_1                                                       
  nss-tools.x86_64 0:3.12.9-12.0.1.el6_1                                                         
  perf.x86_64 0:2.6.32-131.12.1.el6                                                              
  phonon-backend-gstreamer.x86_64 1:4.6.2-17.el6_1.1                                             
  portreserve.x86_64 0:0.0.4-4.el6_1.1                                                           
  qt.x86_64 1:4.6.2-17.el6_1.1                                                                   
  qt-sqlite.x86_64 1:4.6.2-17.el6_1.1                                                            
  qt-x11.x86_64 1:4.6.2-17.el6_1.1                                                               
  rsyslog.x86_64 0:4.6.2-3.el6_1.2                                                               
  samba-client.x86_64 0:3.5.6-86.el6_1.4                                                         
  samba-common.x86_64 0:3.5.6-86.el6_1.4                                                         
  samba-winbind-clients.x86_64 0:3.5.6-86.el6_1.4                                                
  selinux-policy.noarch 0:3.7.19-93.0.1.el6_1.7                                                  
  selinux-policy-targeted.noarch 0:3.7.19-93.0.1.el6_1.7                                         
  tzdata.noarch 0:2011h-3.el6                                                                    
  tzdata-java.noarch 0:2011h-3.el6                                                               
  xmlrpc-c.x86_64 0:1.16.24-1200.1840.el6_1.4                                                    
  xmlrpc-c-client.x86_64 0:1.16.24-1200.1840.el6_1.4                                             

Complete!

Well, wasn't that easy! You can see the snapshot here :

# btrfs subvolume list /
ID 256 top level 5 path yum_20110926132957
So if something went wrong in the rpm update, or you want to revert to the prior copy of the OS/filesystem, you can boot back into the snapshot by using subvolid=256 as a filesystem mount option for / in fstab.

If you want to just default to the snapshot, you can run btrfs subvolume set-default 256 / and, from the next mount on, you are running from the old snapshot state going forward.
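
The subvolume ID needed for subvolid= or set-default can be pulled out of the btrfs subvolume list output by name. A minimal sketch: subvol_id is a hypothetical helper that reads the listing on stdin, and the sample line is the one from my listing above.

```shell
# subvol_id NAME: read "btrfs subvolume list" output on stdin and print the
# ID of the subvolume whose path is NAME. (Hypothetical helper.)
subvol_id() {
    awk -v name="$1" '$1 == "ID" && $NF == name { print $2 }'
}

# Sample line copied from the listing above; on a real system the input
# would come from: btrfs subvolume list /
echo 'ID 256 top level 5 path yum_20110926132957' | subvol_id yum_20110926132957
```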

Sunday Sep 25, 2011

Playing with btrfs

Since I was playing with btrfs over the weekend, I figured I'd keep a log of things I tried out and put them together in a little blog. Just to show off some of the really nifty stuff you can do with this filesystem :)

btrfs is included in Oracle Linux and we are working hard to help make this into a production supportable filesystem and make sure it's going through a huge amount of filesystem testing, recovery scenarios, performance etc.

Let's summarize a few of the features first :

- checksumming of data and metadata (CRC)
- built-in device/space management spanning multiple devices (so no need for lvm)
- support for raid0, raid1, raid10 and single at this point (with raid5/6 in the works)
- ability to independently span metadata and data across these devices
- copy on write(COW) for both data and metadata
- writable snapshots
- create filesystem in existing btrfs pool without need to worry about device management
- online resize of filesystem (both grow and shrink)
- transparent compression (lzo or zlib), which you can specify per file or for the whole filesystem
- ability to defrag files and/or directories
- balance command to balance filesystem chunks in a path across multiple devices if needed
- online add and remove devices to/from filesystems
- support for trim and SSD optimizations
- in place conversion from ext3/4 to btrfs
- file-based or object based cloning support with reflink (per file clone)
- file allocation is extent based with B-tree directory structures
- a cool feature for cloning: you can use filesystem seeding on read-only storage and then have a COW btrfs fs on top
- for the little details :
  - Max file size 16 EiB
  - Max number of files 2^64
  - Max volume size 16 EiB

Getting started is very easy. btrfs.ko is the kernel module that needs to be loaded and btrfs-progs is the package that has all the needed utilities to get started.
# yum install btrfs-progs
# modprobe btrfs
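
A quick way to confirm the module actually registered the filesystem is to look for btrfs in /proc/filesystems. A minimal sketch, where fs_registered is a hypothetical helper name and the optional second argument exists only so it can be pointed at a test file:

```shell
#!/bin/sh
# Hypothetical helper: fs_registered NAME [FILE]
# Succeeds if NAME appears as the last field of a line in FILE
# (default: /proc/filesystems, where the kernel lists known filesystems).
fs_registered() {
    awk -v fs="$1" '$NF == fs { found = 1 } END { exit !found }' \
        "${2:-/proc/filesystems}" 2>/dev/null
}

if fs_registered btrfs; then
    echo "btrfs is available"
else
    echo "btrfs is not registered; try: modprobe btrfs"
fi
```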

I added three 8GB devices to my system; per the /proc/partitions output they are /dev/sdb, /dev/sdc and /dev/sdd.

# cat /proc/partitions 
major minor  #blocks  name

   8        0   20971520 sda
   8        1     512000 sda1
   8        2   20458496 sda2
   8       16    8388608 sdb
   8       32    8388608 sdc
   8       48    8388608 sdd
 253        0   16326656 dm-0
 253        1    4128768 dm-1
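
The #blocks column in /proc/partitions is in 1 KiB units, so 8388608 works out to 8 GiB per device. A small sketch of that conversion, with part_size_gib as a made-up helper name and the three device lines copied from the output above:

```shell
#!/bin/sh
# Hypothetical helper: part_size_gib DEVICE
# Reads /proc/partitions style text on stdin and prints DEVICE's size
# in GiB. The #blocks column is in 1 KiB units: 1048576 KiB = 1 GiB.
part_size_gib() {
    awk -v dev="$1" '$4 == dev { printf "%d\n", $3 / 1048576 }'
}

# Lines copied from the /proc/partitions output above.
sample='   8       16    8388608 sdb
   8       32    8388608 sdc
   8       48    8388608 sdd'

for d in sdb sdc sdd; do
    echo "$d: $(printf '%s\n' "$sample" | part_size_gib "$d") GiB"
done
```

On a live system, feed it /proc/partitions directly: part_size_gib sdb < /proc/partitions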

So let's create a btrfs filesystem on those 3 devices and label it btrfstest. I will also use -d raid10 and -m raid10 to show how easy it is to decide your spanning choices for both data and metadata.
# mkfs.btrfs -L btrfstest -d raid10 -m raid10 /dev/sdb /dev/sdc /dev/sdd

WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL
WARNING! - see http://btrfs.wiki.kernel.org before using

adding device /dev/sdc id 2
adding device /dev/sdd id 3
fs created label btrfstest on /dev/sdb
        nodesize 4096 leafsize 4096 sectorsize 4096 size 24.00GB
Btrfs Btrfs v0.19
Create a mountpoint /btrfs and mount the filesystem root there :
# mkdir /btrfs
# mount -t btrfs /dev/sdb /btrfs

/dev/sdb              25165824        28  25151488   1% /btrfs

As you can see, we now have a filesystem mounted that shows the disk space of the 3 disks we added. btrfs filesystem show gives more detailed information, including the device list and usage :
# btrfs filesystem show
Label: 'btrfstest'  uuid: 123e52f9-87f2-4764-a347-5bedb3cb12df
        Total devices 3 FS bytes used 28.00KB
        devid    2 size 8.00GB used 0.00 path /dev/sdc
        devid    3 size 8.00GB used 0.00 path /dev/sdd
        devid    1 size 8.00GB used 20.00MB path /dev/sdb
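
If you want the member devices in a script, the devid lines are easy to pick apart. A rough sketch, assuming the output format shown above (it may differ in other btrfs-progs versions); fs_devices is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: read `btrfs filesystem show` style output on
# stdin and print one member device path per line.
# Assumes each device line starts with "devid" and ends with the path.
fs_devices() {
    awk '$1 == "devid" { print $NF }'
}

# Output copied from the listing above.
sample="Label: 'btrfstest'  uuid: 123e52f9-87f2-4764-a347-5bedb3cb12df
        Total devices 3 FS bytes used 28.00KB
        devid    2 size 8.00GB used 0.00 path /dev/sdc
        devid    3 size 8.00GB used 0.00 path /dev/sdd
        devid    1 size 8.00GB used 20.00MB path /dev/sdb"

printf '%s\n' "$sample" | fs_devices
```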

Now to quickly show how easy it is to remove/add a device to an existing, mounted volume.
Let's remove /dev/sdc :
# btrfs device delete /dev/sdc /btrfs

# btrfs filesystem show
Label: 'btrfstest'  uuid: 123e52f9-87f2-4764-a347-5bedb3cb12df
        Total devices 3 FS bytes used 28.00KB
        devid    3 size 8.00GB used 0.00 path /dev/sdd
        devid    1 size 8.00GB used 20.00MB path /dev/sdb
        *** Some devices missing

# df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/mapper/vg_wcoekaersrv3-lv_root
                      16070076   7494668   7759076  50% /
tmpfs                  1028992         0   1028992   0% /dev/shm
/dev/sda1               495844    107990    362254  23% /boot
/dev/sdb              16777216        28  16501760   1% /btrfs
As you can see, it now shows 8GB less space available. So, let's add it back in :
# btrfs device add /dev/sdc /btrfs

# df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/mapper/vg_wcoekaersrv3-lv_root
                      16070076   7494668   7759076  50% /
tmpfs                  1028992         0   1028992   0% /dev/shm
/dev/sda1               495844    107990    362254  23% /boot
/dev/sdb              25165824        28  24889344   1% /btrfs
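
The 1K-blocks totals in the two df listings line up with whole devices leaving and rejoining the pool: each device is 8388608 1K-blocks according to /proc/partitions. A quick check of the arithmetic (dev_blocks is just an illustrative variable name):

```shell
#!/bin/sh
# Each device is 8388608 1K-blocks, per the /proc/partitions output.
dev_blocks=8388608

# With /dev/sdc removed, two devices remain.
echo "two devices:   $((2 * dev_blocks))"
# With /dev/sdc added back, all three are in the pool again.
echo "three devices: $((3 * dev_blocks))"
```

This prints 16777216 and 25165824, the same totals df reports for /btrfs after the delete and after the add.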

# btrfs filesystem show
Label: 'btrfstest'  uuid: 123e52f9-87f2-4764-a347-5bedb3cb12df
        Total devices 3 FS bytes used 28.00KB
        devid    4 size 8.00GB used 0.00 path /dev/sdc
        devid    3 size 8.00GB used 256.00MB path /dev/sdd
        devid    1 size 8.00GB used 20.00MB path /dev/sdb
And it's back!
On to snapshots. I created a few files in /btrfs and now want to create a snapshot. I will first sync the fs and then create a snapshot under /btrfs/.snapshot :
# btrfs filesystem sync /btrfs

# btrfs subvolume snapshot /btrfs /btrfs/.snapshot

# ls /btrfs/.snapshot
bar  baz  foo  test

# ls /btrfs
bar  baz  foo  test
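
If you take snapshots regularly, a timestamp in the name (like the yum_20110926132957 snapshot seen earlier) keeps them apart. Below is a sketch that only prints the command rather than running it; snapshot_cmd, the snap_ prefix and the /btrfs/.snapshots directory are all made up for illustration, and the target directory would have to exist first.

```shell
#!/bin/sh
# Hypothetical helper: snapshot_cmd SRC DESTDIR
# Prints (does not run) a snapshot command whose target name carries
# a YYYYMMDDHHMMSS timestamp.
snapshot_cmd() {
    stamp=$(date +%Y%m%d%H%M%S)
    echo "btrfs subvolume snapshot $1 $2/snap_$stamp"
}

# Dry run: review the command, then run it as root if it looks right.
snapshot_cmd /btrfs /btrfs/.snapshots
```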

Creating a new subvolume is just as easy (a new possible mountpoint without any files, so not a snapshot, really just like making a new filesystem) :

# btrfs subvolume create /btrfs/test

# btrfs subvolume list /btrfs
ID 256 top level 5 path test
Some random commands to play with :

1) filesystem df shows a more detailed explanation of what's going on :
# btrfs filesystem df /btrfs
Data: total=8.00MB, used=0.00
System: total=4.00MB, used=8.00KB
Metadata: total=264.00MB, used=24.00KB

2) list subvolumes :
# btrfs subvolume list /btrfs
ID 256 top level 5 path test
ID 257 top level 5 path .snapshot

3) if your filesystem is unbalanced due to tons of file creates and possible add/remove of devices you can rebalance it online :
# btrfs filesystem balance /btrfs

4) use cp to clone files on btrfs with COW (so individual file clones not just volumes) :
# cp --reflink foo1 foo4

5) defragment filesystem :
# btrfs filesystem defragment /btrfs
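
A note on the reflink copy in 4): with GNU cp, plain --reflink fails on a filesystem that cannot clone, while --reflink=auto falls back to a regular copy. The sketch below uses the auto form in a temporary directory so it runs anywhere; on btrfs the copy shares extents until one side is modified. The file names follow the example above, everything else is illustrative.

```shell
#!/bin/sh
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
echo "hello btrfs" > "$workdir/foo1"

# --reflink=auto: COW clone where supported, ordinary copy otherwise.
cp --reflink=auto "$workdir/foo1" "$workdir/foo4"
cmp -s "$workdir/foo1" "$workdir/foo4" && echo "contents match"

# Writing to the clone must not change the original.
echo "changed" > "$workdir/foo4"
orig_after=$(cat "$workdir/foo1")
echo "original still reads: $orig_after"

rm -rf "$workdir"
```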

There you go. A quick 5-minute overview of some of the nifty stuff this FS can do, all of which you have full access to. A lot more is coming and I will make sure to showcase new features as we make them available. Use this to keep a backup root filesystem for recovery purposes, or to do rpm updates with the ability to fall back to a known good previous state. Use it for virtual machine files and the power of reflink. So many possibilities and virtually no filesystem or volume limits.
happy btrfs'ing :)
About

Wim Coekaerts is the Senior Vice President of Linux and Virtualization Engineering for Oracle. He is responsible for Oracle's complete desktop to data center virtualization product line and the Oracle Linux support program.

You can follow him on Twitter at @wimcoekaerts
