Saturday Jun 13, 2009

How to compact your Virtual Disks

In the previous article, we discussed the reasons that cause your virtual disk images to become quite large over time. It might seem to be a pretty hopeless situation, but let me assure you: there is a way out. This article will show you how to reclaim your disk space. A couple of steps are presented, and while each of them makes sense on its own, it is important that you execute them in the exact order given if you want to achieve the best results.

The first step (don't shoot me for that) is to delete files in your VM that you don't really need. Empty the recycle bin, clean out the temp directories and your downloads, uninstall applications you never use, etc. There are a lot of tools out there to delete unused files from Windows; I'm sure you will find one that suits your needs.

Now that you have just the files on the disk that you need, it's time to defragment the hard disk. After defragmentation, the files will be nicely aligned and there will be little free space between them. Windows comes with (not so good) defragmentation software; you will find it in the properties of your disk drive in My Computer under the "Tools" tab. Just give it a run and hope it will reduce fragmentation. There are better tools out there, some of which cost money.

The data is now nicely aligned but we still have all those unused blocks that contain garbage (the contents of the files that used to live there). Therefore we need a tool that can find these blocks and overwrite them with zeros. Windows does not come with such a tool, but one called SDelete is available for download from Microsoft.

Close all your programs and do a

sdelete.exe -c

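This will take a while, and you should let it do its job without interfering. The tool goes through the free space of your virtual disk and overwrites it with zeros. It's known to be a very safe process, so don't worry. Be aware that later SDelete releases use -z rather than -c for zeroing free space, so check the usage text that your version prints.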

Next, you should shut down your virtual machine (power off, not save state) and let VirtualBox optimize the disk image and cut out all parts that SDelete zeroed out. There are two ways to do this: you can compact the image (this operates directly on the disk image and makes it smaller) or you can clone the disk image to a new image. The latter needs more disk space but has the advantage of being safer (you still have the original bloated file after all) and it even allows you to switch from one virtual disk format (e.g. VDI) to another (e.g. VMDK). Let's look at both options.

VBoxManage modifyhd XP.vdi --compact

This will compact your disk image and it will take some time. The cloning from VDI to VMDK works as follows:

VBoxManage clonehd XP.vdi NewXP.vmdk --format VMDK 

There are a lot more options to clonehd and modifyhd; have a look at the VirtualBox user manual.
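If you shrink images regularly, the two commands above are easy to script. The following is a minimal sketch in Python that only assembles the command lines shown in this article; the helper names (compact_command, clone_command, run) are made up for illustration, and it assumes VBoxManage is on your PATH.

```python
import subprocess


def compact_command(image):
    """Build the command line that compacts a disk image in place."""
    return ["VBoxManage", "modifyhd", image, "--compact"]


def clone_command(source, target, fmt="VMDK"):
    """Build the command line that clones a disk image into a new file."""
    return ["VBoxManage", "clonehd", source, target, "--format", fmt]


def run(command):
    """Run one command and raise CalledProcessError if it fails."""
    subprocess.run(command, check=True)


if __name__ == "__main__":
    # Only print the commands here; call run(...) to actually execute them.
    print(" ".join(compact_command("XP.vdi")))
    print(" ".join(clone_command("XP.vdi", "NewXP.vmdk")))
```

Keeping the command construction separate from run() lets you check what would be executed before touching an image.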

This concludes our article and I hope I've given you some useful information that allows you to reclaim some of your disk space. I'm about to go on an 11-hour intercontinental flight and the virtual machine I want to work with was too big for my small notebook, so I used these techniques to shrink the image to less than half of its previous size.

Some background on Virtual Disks

Virtual disk images have a life of their own: what appears to be just a big file on your host system is actually a whole filesystem from the guest's point of view. What does that mean? Most users go for the dynamically expanding images in VirtualBox as they do not want to limit themselves to a small virtual disk size and at the same time do not want to waste disk space on their host while the guest doesn't actually need it. A great concept in theory, but there are some caveats. Why does the disk image grow when there is still a lot of free space in the guest? Why does it only keep getting bigger, even if you delete files from the guest?

The answer lies in the way modern filesystems work. In this article, I will focus on Windows guests and the NTFS filesystem, which is both the most common combination and also the most problematic one. A disk is a (quite large) collection of bits that can be addressed in arbitrary order (random access). A filesystem is designed to manage such a disk and turn it into more useful things such as directories and files. The filesystem governs the disk and it blindly assumes that it owns the whole disk and that it can address all the bits on it. If it doesn't make use of some part of the disk, it considers that space to be wasted. It also makes some other assumptions, for example that it's better not to always touch the same bits because they might degrade after some time, making the disk more likely to fail. Some filesystems even try to benefit from the fact that the outer tracks of a spinning disk pass under the head faster than the inner ones, so reads and writes there are considerably faster. What does all this mean for VirtualBox? Well, all these things explain why a filesystem is so wasteful with disk space: it tends to scatter its data all over the disk and this makes the virtual disk image grow.

Why does the virtual disk image grow in large chunks, even if just a small file was written in the guest? That's actually an optimization. Dynamically growing disk images grow in chunks to limit the number of chunks that have to be tracked (if an image grew by e.g. 1 byte at a time, we'd have to spend more than one byte of overhead for each byte written!). For the standard VDI files, the chunk size is 1MB. If one or more bytes get written to a 1MB chunk (we call these chunks grains), the whole grain gets allocated. For VHD (originally from Microsoft Virtual PC), the grain size is even larger at 2MB. For VMDK (originally from VMware), the grain size is just 64kB. This means VMDK is the most storage-efficient file format, but it also has a bit more bookkeeping overhead and somewhat lower write performance.
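To put some numbers on these grain sizes, here is a small sketch in Python. The function name allocated_bytes and the workload (100 one-byte writes spaced 10MB apart) are made up for illustration, and the calculation ignores the image header and the allocation tables.

```python
KIB = 1024
MIB = 1024 * KIB

# Grain sizes as given in the text above
GRAIN_SIZES = {"VDI": 1 * MIB, "VHD": 2 * MIB, "VMDK": 64 * KIB}


def allocated_bytes(write_offsets, grain_size):
    """Space the image must allocate after one-byte writes at these guest offsets."""
    # Every grain that receives at least one byte is allocated as a whole.
    touched = {offset // grain_size for offset in write_offsets}
    return len(touched) * grain_size


# Hypothetical workload: 100 tiny writes scattered 10MB apart across the disk
offsets = [i * 10 * MIB for i in range(100)]

for fmt, grain in GRAIN_SIZES.items():
    print(fmt, allocated_bytes(offsets, grain) // KIB, "KiB allocated")
```

With this scattered workload, 100 bytes of real data cost 100MB in a VDI image, 200MB in a VHD image and a little over 6MB in a VMDK image, which is why scattering by the guest filesystem hurts so much.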

Another important factor is fragmentation. If you delete a 1MB file on your NTFS disk, there will be 1MB of free space somewhere on the disk (assuming the file was not fragmented). Now you want to copy a 2MB file to your disk. What should the filesystem do? Should it look for a place on the disk where there are 2MB free? Should it cut the file into chunks, put the first 1MB at the place where you just deleted a file and try to squeeze in the rest somewhere else? That's a decision that the filesystem has to make each time, and NTFS is known for tending towards fragmentation. Fragmentation is an efficient way of using free space, but if your files are distributed all over the disk, it will take a lot of time to read them and performance will degrade. Which user hasn't observed that Windows keeps getting slower and slower? Disk fragmentation is one explanation for that phenomenon.
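The "fill the holes first" choice from the example above can be sketched in a few lines of Python. This is a toy model for illustration only and not NTFS's actual allocation policy; the function name allocate and the extent layout are assumptions.

```python
def allocate(free_extents, size):
    """Fill free extents in disk order; return the (start, length) fragments used."""
    fragments = []
    remaining = size
    for start, length in sorted(free_extents):
        if remaining == 0:
            break
        take = min(length, remaining)
        fragments.append((start, take))
        remaining -= take
    if remaining:
        raise ValueError("not enough free space")
    return fragments


# Units are MB. There is a 1MB hole at offset 10 (the deleted file)
# and a large free area starting at offset 500.
free = [(10, 1), (500, 100)]

# The 2MB file ends up in two pieces that are far apart on the disk.
print(allocate(free, 2))
```

The file is split into one fragment at offset 10 and another at offset 500, so writing it touches two distant regions of the virtual disk instead of one.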

Let's look again at that 1MB file we just deleted. What happens when you delete a file? Not much actually: the filesystem just marks that file as deleted in some global file structure (the MFT, or master file table, for NTFS). That's very quick and allows undelete programs to do their job in many cases. However, this also means that the space where the file used to live will still contain the contents of the file we just deleted. Until the filesystem allocates these blocks again, the data will remain as it was. For dynamically growing disk images, this has a major consequence: as the blocks contain data, they appear to VirtualBox as being used, so they need to remain in the virtual disk.
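The following Python sketch illustrates why leftover data keeps a grain alive: seen from outside the guest, only a grain that contains nothing but zeros can safely be treated as unused. This is a toy model and not VirtualBox's actual implementation; the names is_zero_grain and droppable_grains and the 4-byte grain size are made up for illustration.

```python
def is_zero_grain(grain):
    """A grain is droppable only if every byte in it is zero."""
    return not any(grain)


def droppable_grains(disk, grain_size):
    """Return the indices of all grains that contain nothing but zeros."""
    count = len(disk) // grain_size
    return [
        i for i in range(count)
        if is_zero_grain(disk[i * grain_size:(i + 1) * grain_size])
    ]


GRAIN = 4  # toy grain size in bytes; real VDI grains are 1MB

# Grain 0 holds a live file, grain 1 the leftovers of a deleted file,
# grain 2 was never written, grain 3 holds another live file.
disk = bytearray(b"LIVE" + b"GONE" + bytes(GRAIN) + b"DATA")
before = droppable_grains(bytes(disk), GRAIN)

# Overwrite the deleted file's blocks with zeros.
disk[GRAIN:2 * GRAIN] = bytes(GRAIN)
after = droppable_grains(bytes(disk), GRAIN)

print("droppable before zeroing:", before)
print("droppable after zeroing:", after)
```

Before zeroing, only the never-written grain 2 is droppable; once the stale contents of grain 1 are overwritten with zeros, it becomes droppable too.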

If you've made it this far, you've seen answers to the following questions:

  • How are virtual disk images organized?
  • Why do virtual disk images grow so fast?
  • Why do virtual disk images never shrink?
  • What is fragmentation and how does it affect virtual disk images?
In the next article, I'll show you the weapons you need to fight excess disk use. 

Thursday Jan 22, 2009

Sun xVM VirtualBox 2.1.2 is released!

Just a quick note to say that version 2.1.2 is released.

Our friend the Fat Bloke has more details.

Wednesday Dec 17, 2008

VirtualBox 2.1 now released

Just a quick note to say that version 2.1 is released.

Our friend the Fat Bloke has more details.

Friday Sep 12, 2008

Version 2.0.2 now available

Fixing a few hiccups, version 2.0.2 is the new improved vintage.

Go GetIt now.


Thursday Sep 04, 2008

Sun xVM VirtualBox 2.0 is released

We're pleased to announce that VirtualBox 2.0 is now available.

Read all about it in the Press release, or just go download it yourself.

Headline Features:

• 64-bit guest support (64-bit hosts only)
• New native Leopard user interface on MacOS X hosts
• The GUI was converted from Qt3 to Qt4 with many visual improvements
• New-version notifier
• Guest property information interface
• Host Interface Networking on Mac OS X hosts
• Host Interface Networking on Solaris 10 hosts
• Support for Nested Paging on modern AMD-V CPUs (major performance gain)
• Framework for collecting performance and resource usage data (metrics)
• Clipboard integration for OS/2 Guests
• Support for VHD images
• Created separate SDK component featuring a new Python programming interface on Linux and Solaris hosts

In addition, the following items were fixed and/or added:
• VMM: VT-x fixes
• AHCI: improved performance
• GUI: keyboard fixes
• Linux installer: properly uninstall the package even if unregistering the DKMS module fails
• Linux additions: the guest screen resolution is properly restored
• Network: added support for jumbo frames (> 1536 bytes)

Wednesday Aug 13, 2008

VirtualBox-Powered stuff

When you say "VirtualBox" to someone they immediately think "desktop virtualization" and maybe picture the VirtualBox Console.

But the same virtualization engine that powers VirtualBox can be put to work in other ways too.

Sun just released a Press Release and accompanying Podcast about a few partners who have taken the technology and built upon it to deliver some innovative solutions.

Seems like the compact, efficient and modular architecture of VirtualBox coupled with the APIs available make it an ideal starting point for people wanting to innovate by using virtualization in different ways. And if partners want to know more they can email 


- FB 

Monday Aug 04, 2008

Sun xVM VirtualBox 1.6.4 now available!

A wise Superhero once said something along the lines of: "With great power comes great responsibility" and we, in the VirtualBox team, never disagree with Superheroes.

So when we were informed by the guys at CoreLabs of a security vulnerability on the Windows platform we took it very seriously indeed. And the result is a new maintenance release which fixes the security problem and several other niggly bugs. 

This new version (Sun xVM VirtualBox 1.6.4) is available from the usual place, and the ChangeLog contains fuller details of the bugs fixed in this release.




Wednesday Jul 09, 2008

Recent Coverage now moved

The blogosphere coverage was creating so much noise that it was drowning out other useful blogs, so we've moved it to its own VirtualBoxBuzz blog.


Thursday Jul 03, 2008

Recent Coverage - July 3rd, 2008


Wednesday Jul 02, 2008

Recent Coverage - July 2nd, 2008


Tuesday Jul 01, 2008

Recent Coverage - July 1st, 2008


Monday Jun 30, 2008

Recent Coverage - June 30th, 2008


Sunday Jun 29, 2008

Recent Coverage - June 29th, 2008


Saturday Jun 28, 2008

Recent Coverage - June 28th, 2008


This blog concerns all things VirtualBox.

