Oracle VM 2.2 and the power of ocfs2

At Oracle OpenWorld we announced Oracle VM 2.2 along with the Oracle VM Storage Connect program, and showed an implementation demo at the booth. Now that Oracle OpenWorld 2009 is over, I finally found some time to play with this myself.

I just wanted to point out a few cool things related to the upgrade of ocfs2 from 1.2 to 1.4 as part of the Oracle VM 2.2 release.

In 1.4 we support sparse files. This is very convenient for virtual machine images because in many cases the template has a large virtual disk while the disk itself is virtually empty (lots of virtual stuff here :) ). By supporting sparse files on ocfs2, we can now save a lot of disk space. Here is an example:

# ls -l 32_deploytest/
total 5893120
-rw-r--r-- 1 root root 16106127360 Oct 12 03:59 System.img
drwxrwxrwx 3 root root 3896 Oct 15 00:53 config-chroot
-rw-r--r-- 1 root root 624 Oct 15 00:53 vm.cfg
-rw-rw-rw- 1 root root 481 Oct 22 2009 vm.cfg.orig

# du 32_deploytest/
0 32_deploytest/config-chroot/etc/ovs-autostart
0 32_deploytest/config-chroot/etc
0 32_deploytest/config-chroot
5893120 32_deploytest

As you can see in the above example, the VM disk (System.img) is 16GB in size, but the actual size on disk is just shy of 6GB. Previous versions would have allocated and used the full 16GB. Now it's 6GB, and as data gets written and the holes fill up, the space used on disk will grow up to the full 16GB.
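To see the same effect on any Linux box, here is a minimal sketch that builds a small sparse file and compares the size ls -l reports with the space actually allocated. The file name demo.img and the sizes are made up for illustration, and it assumes GNU coreutils (truncate, stat, dd) on a filesystem that supports holes.

```shell
#!/bin/sh
# Sketch: compare a sparse file's apparent size with the space it occupies.
# demo.img is a hypothetical stand-in for a VM disk such as System.img.
workdir=$(mktemp -d)
img="$workdir/demo.img"

# Create a 64 MiB file that is one big hole (no data blocks written yet).
truncate -s 64M "$img"

# Write 1 MiB of real data 10 MiB into the file; only this region needs blocks.
dd if=/dev/urandom of="$img" bs=1M seek=10 count=1 conv=notrunc iflag=fullblock 2>/dev/null

# %s is the size ls -l shows; %b * %B is the space allocated on disk.
apparent_bytes=$(stat -c %s "$img")
allocated_bytes=$(( $(stat -c %b "$img") * $(stat -c %B "$img") ))

echo "apparent:  $apparent_bytes bytes"
echo "allocated: $allocated_bytes bytes"

rm -rf "$workdir"
```

The gap between the two numbers is the same gap you see between ls -l and du in the listing above.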

When people download Oracle VM templates, they tend to contain very large disk images, but the files are quite sparse. In 2.2, when unpacking these images on ocfs2, the actual space used will be a lot less.

In the future this will also happen when deploying a VM from one server pool to another, cloning a VM, or creating a virtual disk.

The other thing I wanted to show off is the reflink business. I have an Oracle VM 2.2 setup here with ocfs2 version 1.6 and a kernel slightly newer than the one that is part of 2.2.

# df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda2 3050092 941232 1951424 33% /
/dev/sda1 101086 51467 44400 54% /boot
tmpfs 288340 0 288340 0% /dev/shm
/dev/sda3 484078592 33098752 450979840 7% /var/ovs/mount/572F5E5036E9404D8219F4E337B0561B

# ls -l 32_deploytest/
total 5893120
-rw-r--r-- 1 root root 16106127360 Oct 12 03:59 System.img
-rw-r--r-- 1 root root 624 Oct 15 00:53 vm.cfg

# ls -l clone/
total 0

# reflink -r 32_deploytest/System.img clone/System.img
# cp 32_deploytest/vm.cfg clone/

# df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda2 3050092 941232 1951424 33% /
/dev/sda1 101086 51467 44400 54% /boot
tmpfs 288340 0 288340 0% /dev/shm
/dev/sda3 484078592 33098752 450979840 7% /var/ovs/mount/572F5E5036E9404D8219F4E337B0561B

# ls -l clone
total 0
-rw-r--r-- 1 root root 16106127360 Oct 15 03:25 System.img
-rw-r--r-- 1 root root 624 Oct 15 03:25 vm.cfg

As you can see from above:
disk space used on /dev/sda3 was 33GB
the System.img file in 32_deploytest is 16GB
there are no files in clone/
after reflink -r, System.img shows up in clone/ at the same size
disk space used on /dev/sda3 is still 33GB, not 49GB

This is a totally independent file. I can just start the VM and it will work, and as I make changes to this VM the disk usage will grow because the clone gets more pages of its own.

I can also delete the original file without affecting the clone: rm -f of 32_deploytest/System.img would not affect clone/System.img.
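The independence of the clone can be checked with a small sketch. Note the swap: this does not use the ocfs2 reflink tool shown above but GNU cp --reflink=auto, which shares blocks where the filesystem supports it and quietly falls back to a normal copy where it does not. The file names are hypothetical stand-ins for the two System.img files.

```shell
#!/bin/sh
# Sketch: a clone stays intact when it is written to and when the
# original is deleted. GNU cp --reflink=auto stands in for the ocfs2
# reflink tool (assumption: GNU coreutils 7.5 or later).
workdir=$(mktemp -d)
orig="$workdir/original.img"   # hypothetical stand-in for 32_deploytest/System.img
clone="$workdir/clone.img"     # hypothetical stand-in for clone/System.img

printf 'base image contents\n' > "$orig"
cp --reflink=auto "$orig" "$clone"

# Write to the clone; with copy-on-write only the clone gets new blocks.
printf 'written by the clone\n' >> "$clone"
orig_after_write=$(cat "$orig")

# Remove the original; the clone keeps all of its data.
rm -f "$orig"
clone_after_rm=$(cat "$clone")

echo "original after the clone was modified:"
echo "$orig_after_write"
echo "clone after the original was removed:"
echo "$clone_after_rm"

rm -rf "$workdir"
```

Because of the fallback this runs on any filesystem, but the space savings only show up where block sharing is actually supported.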

How long does this take? Let me try again:
# time reflink -r 32_deploytest/System.img clone/System.img

real 0m0.001s
user 0m0.000s
sys 0m0.000s

Instant!

Comments:

Oh, and keep in mind that the reflink'd inode is cluster-safe and cluster-aware: this cloned VM can run read/write on any other node in the cluster without any issue!

Posted by Wim Coekaerts on October 22, 2009 at 01:12 PM PDT #

Wim, using reflink, is this a way that I can implement snapshotting with OVM 2.2?

Posted by Joe Hoot on October 22, 2009 at 09:33 PM PDT #

Technically, yes. Once we provide reflink support in the production product, it's basically the ability to snapshot.

Posted by wim.coekaerts on October 25, 2009 at 02:56 PM PDT #

Hi, Re Sparse Files, it seems they are only used when first deploying a VM, but if you try to save an already installed/configured VM (with a size of 30GB) as a template, the disk savings go out the window since the dd command used to copy the VM to the seed_pool creates the .img with an actual size of 30G. Same thing happens when going the other way. Is that the correct behaviour? It is possible to manually copy the .img from the running_pool to the seed_pool when the Manager has created the template (and still get the space savings) but you shouldn't have to.... regds /P

Posted by Peter on October 25, 2009 at 05:14 PM PDT #

Yes, that's also changing. In upcoming updates, rsync, dd and the other copy operations will move over to handling sparse files when the filesystem supports them.

Posted by wim.coekaerts on October 26, 2009 at 01:17 AM PDT #

Hi Wim, Like Peter, I noticed that dd is not doing sparse when saving a VM as a template. Lots of wasted space results! It must be rather simple to fix this with dd, must be in the Python scripts I guess. However, I have not seen any OVM updates that fix this annoying and unnecessary issue and we are now approaching June 2010. Do you have any workarounds for us? Otherwise I am very happy with all the improvements in Oracle VM 2.2. Thanks a million for all your great work. Jos (an ex-colleague from NL)

Posted by Jos Nijhoff on May 26, 2010 at 03:52 AM PDT #

Use "cp" to preserve sparse files. Excerpt from "man cp":

By default, sparse SOURCE files are detected by a crude
heuristic and the corresponding DEST file is made sparse
as well. That is the behavior selected by --sparse=auto.
Specify --sparse=always to create a sparse DEST file when-
ever the SOURCE file contains a long enough sequence of
zero bytes. Use --sparse=never to inhibit creation of
sparse files.

Posted by J Peters on September 30, 2011 at 04:20 AM PDT #
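As a rough illustration of the cp suggestion above, the sketch below writes out a fully allocated file of zeros (the kind of image a plain dd copy leaves behind) and copies it with --sparse=always. The file names and the 32 MiB size are hypothetical, and it assumes GNU coreutils on a filesystem that supports holes.

```shell
#!/bin/sh
# Sketch: copy an image so that long runs of zero bytes become holes.
workdir=$(mktemp -d)
src="$workdir/source.img"   # hypothetical stand-in for a fully allocated VM disk
dst="$workdir/copy.img"     # hypothetical destination, e.g. a template copy

# Write 32 MiB of literal zero bytes, so the source has no holes of its own.
dd if=/dev/zero of="$src" bs=1M count=32 2>/dev/null

# --sparse=always creates holes in the destination for zero-filled regions.
cp --sparse=always "$src" "$dst"

# Same content and apparent size; the allocation is what differs.
if cmp -s "$src" "$dst"; then identical=yes; else identical=no; fi
dst_size=$(stat -c %s "$dst")
src_alloc=$(( $(stat -c %b "$src") * $(stat -c %B "$src") ))
dst_alloc=$(( $(stat -c %b "$dst") * $(stat -c %B "$dst") ))

echo "identical content: $identical"
echo "apparent size:     $dst_size bytes"
echo "source allocated:  $src_alloc bytes"
echo "copy allocated:    $dst_alloc bytes"

rm -rf "$workdir"
```

The same idea applies when copying a real .img by hand from the running_pool to the seed_pool, as described in the earlier comment.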

About

Wim Coekaerts is the Senior Vice President of Linux and Virtualization Engineering for Oracle. He is responsible for Oracle's complete desktop to data center virtualization product line and the Oracle Linux support program.

You can follow him on Twitter at @wimcoekaerts
