Wednesday May 27, 2009

Hello world with VirtualBox

Sun VirtualBox is one of the hottest virtualization products around. Although I work with a performance group at Sun Microsystems, I have not been involved with the programming or performance aspects of VirtualBox, so all my comments here are those of an outsider to VirtualBox.

My experience of using VirtualBox has been simply great. I did have some hiccups getting everything working together, so this blog post is intended to be a comprehensive summary of everything I had to scour the web for to get things working.

I needed virtualized operating systems (OSs) on my home desktop machine for a few reasons. My host OS is the latest Fedora. I host two websites at home: a vertical search engine for travel, and a blog that my wife maintains. Hosting at home works well for websites which are young and have low volumes of traffic, since you save on hosting fees. If you have a DSL or cable connection which gives you a dynamic IP, you may use dynamic DNS from zoneedit. But hosting at home also means that you want your machine running 24x7, so if you need any other OS for any other reason, you need a virtualization tool. VirtualBox is free and open source, which greatly helps. I needed the following guest OSs:

(i) Windows 7 Beta: I needed a Windows installation to try the latest OS from Microsoft and also for software such as Sopcast and iTunes.

(ii) Fedora 10: I needed another Fedora 10 so that I could run a VPN out of it without disrupting my web server. Virtualization really helps with this particular requirement.

(iii) Solaris Nevada release: I keep trying the latest unsupported release of Solaris Nevada available here, since I work with Solaris OS code in my daily job.

Here is what I had to do to get everything working together. The trickiest part is getting a good display resolution in the guest OS, so please pay attention to those steps.


1) Installing VirtualBox on Fedora 10: For some reason the latest VirtualBox release, 2.2.2, had problems starting up on my Fedora 10. So I went back to 2.2.0, and thereafter this was very smooth. Here are the steps, which are documented in greater detail over here.

Get the kernel development packages:
yum install kernel-devel make automake autoconf gcc

Get the latest VirtualBox rpm and install it. Here is an example:
rpm -ivh VirtualBox-2.2.0_45846_fedora9-1.x86_64.rpm

Setup virtualbox:
/etc/init.d/vboxdrv setup

Add users who can use virtualbox (replace USERNAME with the login name):
usermod -a -G vboxusers USERNAME

Start virtualbox (or open it from the GUI):
VirtualBox
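
After the usermod step it is worth confirming that the membership is actually visible, since group changes normally apply only to new login sessions. The following is a minimal sketch; the helper name in_group_list is my own invention and not part of VirtualBox.

```shell
# Hypothetical helper: succeeds if group $2 appears in the
# space-separated group list $1 (the format printed by `id -nG`).
in_group_list() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Check the groups of the user running this script.
if in_group_list "$(id -nG)" vboxusers; then
    echo "user is in vboxusers"
else
    echo "user is NOT in vboxusers yet (log out and back in after usermod)"
fi
```

Padding the list with spaces on both sides makes the match a whole-word match, so a group such as vboxusers2 is not mistaken for vboxusers.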

2) Installing Windows 7 Beta Guest OS: Installing a guest OS on VirtualBox is pretty simple. One of the things I did not realize at first is the utility of the VirtualBox Media Manager. There is no need to burn the ISO image of the OS to a DVD; you can use the Media Manager to register the ISO image. When you create a new instance, VirtualBox asks which OS is to be installed. The other information that VirtualBox needs is the amount of memory you would like to allocate to the guest OS, the amount of video memory, and a hard disk image where the guest OS will be installed. Most of these pop up with default values that VirtualBox recommends.

The installation went very smoothly, and once it completed, Windows came up just fine. The next step is to install the guest additions. These help with a number of things: seamless integration of the mouse with the guest OS, and a better display resolution. Once this was done, Windows booted up fine, minus networking. I had to manually set the DNS server address to the IP address of my wireless router, and then networking ran great.

I have observed that Windows 7 Beta requires at least 1 GB RAM to perform sanely; anything less makes the performance remarkably slow.

3) Installing Fedora 10 Guest OS: Installing one guest OS is much like installing another. With Fedora 10, I didn't have to do any hanky-panky for networking. The display came up with 800x600 resolution. Interestingly, the xorg.conf file was missing. I generated an xorg.conf file, manually edited it to match the settings in my Fedora 10 host OS xorg.conf, and then the resolution was excellent. Installing the guest additions was trickier. When I ran the script, it complained about missing header files. The solution is to run "yum install kernel kernel-devel" so that all the kernel headers are installed. The script ran fine after this.

I installed vpnc, and now I can connect to my work via VPN and also have the server running on the host OS at the same time. For some reason, my work DNS was not working on the guest OS although it worked easily on the host OS. I couldn't resolve this, but an easy solution was to manually add all the IP addresses I connect to in the /etc/hosts file.
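
For the /etc/hosts workaround, a small helper keeps the file from collecting duplicate entries. This is only a sketch: the function name add_host_entry is hypothetical, and the address and host name in the usage line are placeholders, not my actual work machines.

```shell
# Hypothetical helper: append "IP<TAB>NAME" to a hosts-format file unless
# NAME is already listed. Arguments: file, IP address, host name.
add_host_entry() {
    hosts_file=$1
    ip=$2
    name=$3
    # Look for NAME as a whole field (after the address) on non-comment lines.
    if awk -v n="$name" '
        !/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }' "$hosts_file" 2>/dev/null
    then
        return 0    # already present, nothing to do
    fi
    printf '%s\t%s\n' "$ip" "$name" >> "$hosts_file"
}

# Usage (editing the real /etc/hosts needs root); values are placeholders:
# add_host_entry /etc/hosts 192.0.2.10 mail.example.com
```

Each line of /etc/hosts is an IP address followed by one or more names, so the check skips the first field and compares only the names.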

4) Installing Solaris Nevada: This was by far the easiest. The display came up in 1200x1080, with networking going, so I didn't have much reason to play around with either of them.

On the whole, I have found the performance of guest OSs on VirtualBox extremely satisfactory. I worked on Word and PowerPoint presentations on Windows and the difference is barely noticeable. There is a small difference in latency for Internet browsing, but not enough to bother me.

I think what is important for VirtualBox, though, is having a lot of RAM on your machine. My machine has 2 GB RAM, and that was sufficient only for running the host OS with 1 GB RAM and one guest OS with 1 GB RAM. Running more than one guest OS, each with less memory, inevitably hurt performance. For me this was not a limitation since I am not planning to work on multiple OSs at the same time.
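
The arithmetic behind this is simple enough to script. A rough sketch, where host_ram_left is a made-up helper name and the numbers are the ones from my machine:

```shell
# Hypothetical helper: print the RAM (in MB) left for the host after the
# given guest allocations. Usage: host_ram_left TOTAL_MB GUEST_MB...
host_ram_left() {
    left=$1
    shift
    for guest in "$@"; do
        left=$((left - guest))
    done
    echo "$left"
}

host_ram_left 2048 1024        # one 1 GB guest on a 2 GB machine: prints 1024
host_ram_left 2048 1024 1024   # two 1 GB guests: prints 0
```

On a Linux host the real total can be read from the MemTotal line of /proc/meminfo, which is reported in kB.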

What I feel is missing
1) I would love to have a drag-and-drop copy feature so that I can just copy and paste files from one OS to the other. I don't think VirtualBox supports this feature yet, and Internet postings tell me that VMware Fusion does, so this would definitely be a great feature to have. As of now I am moving files around using scp, which I find a pain given that everything is on the same machine. Similarly, there is no way to copy a link from a browser in the host OS to a browser in the guest OS.

2) Sound support: I still couldn't get sound working in any of my guest OSs. I plan to post an update when I get it working.

On the whole I am a very satisfied customer of VirtualBox and I hope they keep up the good work.

Thursday Mar 20, 2008

Virtualization and Networking


Since virtualization is one of the hottest areas of growth today, it seems a good time to blog about virtualization and networking. This is one of the beauties of blogging: just writing about a topic in a public forum motivates one to do more research and become more thorough and proficient with the subject.

So why is virtualization so hot? It is primarily because, as servers grow more and more powerful, virtualization allows consolidation of multiple hosts on one physical system. The benefits of consolidation are many, mainly savings in power and administrative costs. These hosts can run very different operating systems. The challenge is to run each independently of the others, so that the performance of one host does not depend on the performance of another, while they all share the resources of the same physical system.

So what's the challenge that virtualization brings to networking? Simply put, sharing I/O is challenging. Why? Consider other components such as CPU and memory. Since modern servers have multiple CPUs, we can simply assign the desired number of CPUs to each host and not allow hosts to touch each other's. If a single CPU needs to be shared, that too can be done with a scheduling algorithm that follows a Time Division Multiplexing (TDM)-like approach. How about memory? Since memory is always managed as virtual memory, all we need to do is partition the memory and be careful with the paging algorithm. Now this is not always very simple because of memory locality issues in a system with Non-Uniform Memory Access (NUMA). But more on that later.
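
The TDM-like CPU sharing mentioned above can be pictured with a toy round-robin loop. This is purely an illustration of handing out time slices in turn; it is not how any real hypervisor scheduler is written, and the guest names are made up.

```shell
# Toy round-robin: hand out SLICES time slices in turn to the named guests
# and print which guest owns each slice. Usage: round_robin SLICES GUEST...
round_robin() {
    slices=$1
    shift
    i=0
    while [ "$i" -lt "$slices" ]; do
        # Position of the guest that owns slice i, counting from 1.
        idx=$(( i % $# + 1 ))
        eval "guest=\${$idx}"
        echo "slice $i -> $guest"
        i=$((i + 1))
    done
}

round_robin 4 guestA guestB
```

With two guests and four slices, the slices alternate between guestA and guestB, so each guest gets half of the CPU time regardless of what the other is doing.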

Now let us consider I/O. It is hard to partition peripheral devices across multiple hosts. Consider a Network Interface Card (NIC). Suppose two hosts do network I/O using this NIC simultaneously. Who resolves this conflict? Who coordinates the device instructions so that DMA mappings do not overlap with each other? How do we fairly distribute network bandwidth between the two hosts? These are challenging problems.

This is where the hypervisor comes in. The hypervisor is a thin layer of software which interfaces between the virtual hosts and the physical machine. Simply put, in a virtualized environment, it is the hypervisor which manages all the resources, such as CPU, memory, and I/O, and coordinates all the instructions sent by the virtual hosts.

So now let us talk about virtualization and networking. Here are the prominent ways in which network I/O works over virtualized environments today. The hypervisor plays different roles depending on the solution chosen by the vendor.

Software solutions

Binary Translations:

The idea here is to trap the privileged instructions issued by the guest operating system (OS) at the hypervisor layer and translate them into safe instructions. Binary translation has historically been used by VMware to support virtualization of unmodified OSs such as Microsoft Windows. The guest OS is completely unaware of the hypervisor and issues instructions assuming it is executing on a bare metal x86 box. The hypervisor classifies all instructions issued into two broad categories: those that may be directly executed (called non-privileged) and those that need to be translated (called privileged). Privileged instructions are translated on the fly and executed.
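
The two categories can be modeled with a toy classifier. This is only a sketch of the idea and says nothing about how VMware's translator actually works: the three instructions listed are a small subset of the privileged x86 instructions, and hv_emulate is a made-up name for a call into the hypervisor.

```shell
# Toy model: pass non-privileged instructions through unchanged and
# rewrite privileged ones into a (hypothetical) hypervisor call.
translate_insn() {
    case $1 in
        hlt|lgdt|lidt)
            # Privileged: must not run directly on behalf of the guest.
            echo "hv_emulate $*" ;;
        *)
            # Non-privileged: safe to execute as is.
            echo "$*" ;;
    esac
}

translate_insn add eax, ebx
translate_insn hlt
```

The first call prints the instruction unchanged, while the second is rewritten to go through the hypervisor.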

The biggest advantage of this technique is that it doesn't require any modification of guest OSs. However, performance often suffers because of the on-the-fly translation, and therefore the virtualization industry is moving more towards paravirtualization and hardware-assisted virtualization.


Paravirtualization:

In paravirtualization, the guest OS is modified to recognize the hypervisor and interact with it. The best examples of this technique are the open source Xen and Solaris xVM. In Solaris xVM, network I/O is handled by the Xen frontend driver, whose source code is available here. The frontend driver interacts with the Solaris xVM backend driver (found here), which runs on the control domain, also known as Dom0. Dom0 controls and manages the network and other I/O devices directly. Thus the network path from all guest OSs is Guest OS -> Dom0 -> external world for transmit, and the reverse direction for receive. Dom0 plays the role of the arbiter when multiple guest domains are contending for network I/O.

Paravirtualization typically performs better than binary translation because the hypervisor doesn't have to inspect each and every instruction. Moreover, it works well in cases like guest domain to guest domain communication, since Dom0 can recognize such traffic and avoid sending packets to the hardware. However, paravirtualized solutions often require a careful design (for example, to ensure that Dom0 does not become a bottleneck), and therefore carry a higher cost of support and maintenance.

Hardware solutions

Intel VT and AMD Pacifica (AMD-V) virtualization technologies: Since 2006, both Intel and AMD have had hardware support for virtualization. The hardware traps any privileged instruction and sends it to the hypervisor. This allows unmodified OSs to run on the Xen hypervisor on supported hardware. As an example, we can now run Windows XP, Solaris and Linux with Solaris xVM in the same box. Support for hardware virtualization, although currently at an initial stage, is expected to grow and become dominant in the coming years. But as of now, paravirtualized solutions are generally seen outperforming hardware-assisted solutions.

PCI-Express Technologies - I/O VT
The PCI-Express community is currently standardizing technologies that let multiple OSs running simultaneously within a single computer natively share PCI-Express devices. There are two main technologies currently undergoing standardization: single-root I/O virtualization and multi-root I/O virtualization. The idea here is to allow an OS to handle its own IOV-compliant interface over PCI-Express, which is also shared by other virtual OSs running in the system. This will allow more parallelism in hardware and reduce the role of the hypervisor in arbitrating amongst multiple OSs competing for the same I/O.

The industry is currently in flux, moving from software-based virtualization solutions to hardware-assisted ones. How much the performance of hardware solutions will improve over time is difficult to predict. Therefore, paravirtualized solutions are still expected to be dominant for some time. It is interesting to see most vendors supporting both hardware and software solutions for now.


