By user12608726 on Mar 20, 2008
Since virtualization is one of the hottest areas of growth today, it would be good to blog about virtualization and networking today. This is one of the beauties of blogging, just writing about some topic in a public forum motivates one to do more research and become more thorough and proficient with the subject.
So why is virtualization so hot? It is primarily because as servers grow more and more powerful, virtualization allows consolidation of multiple hosts on one physical system. The benefits of consolidation are many, mainly power and administrative costs saving. These end hosts can be very different operating systems. The challenge is to run each independent of the other. So that the performance of one host is independent of the performance of the other. While they all share resources of the same physical system.
So what's the challenge that virtualization brings to networking. Simply put, sharing I/O is challenging. Why? Consider other components such as CPU and memory. Since modern servers have multiple CPUs, we can simply assign the desired number of CPUs to each host and not allow hosts to touch each other. If a single CPU needed to be shared, that too could be done with a scheduling algorithm that follows some time Division Multiplexed (TDM) like approach. How about memory? Since memory is always managed as virtual memory, all we need to do is play with the paging algorithm. Partition the memory and just be careful about paging algorithms. Now this is not always very simple because of memory locality issues in a system which is Non-Uniform Memory Access (NUMA). But more on that later.
Now let us consider I/O. It is hard to partition peripheral devices across multiple hosts. Consider a Network Interface Card (NIC). Suppose two hosts do network I/O using this NIC simultaneously. Who resolves this conflict? Who coordinates the device instructions so that DMA mappings do not overlap with each other? How to fairly distribute network bandwidth amongst the two hosts? These are challenging problems.
In comes the role of the hypervisor. The hypervisor is a thin layer of software which interfaces between the virtual hosts and the physical machine. Simply put, in a virtualized environment, it is the hypervisor which plays the role of managing all the resources, such as CPU, memory, and I/O, and coordinating all the instructions sent by the virtual hosts.
So now let us talk about virtualization and networking. Here are the prominent ways in which network I/O works over virtualized environments today. The hypervisor plays different roles depending on the solution chosen by the vendor.
The idea here is to trap the privileged instructions issued by the guest operating system (OS) at the hypervisor layer and translate them into safe instructions. Binary translations have been historically used by VMWare to support virtualization on unmodified OSs such as Microsoft Windows. The guest OS being completely ignorant of the hypervisor, and issues instructions assuming it is executing on a bare metal x86 box. The hypervisor classifies all instructions issued into two broad categories, those that may be directly executed (called non-privileged) and those that need to be translated (called priviledged). Priviledged instructions are translated on the fly and executed.
The biggest advantage of this technique is that it doesn't require any modification in guest OSs. However, performance often suffers because of the in-flight translation, and therefore the virtualization industry is moving more towards paravirtualization and hardware assisted virtualization.
In paravirtualization, the guest OS is modified to recognize the hypervisor and interact with it. The best example of this technique is in the open source Xen and Solaris XVM. In Solaris XVM, network I/O is handled by the Xen frontend driver whose source code is available here. The frontend driver interacts with the Solaris XVM backend driver (found here) which is running on the control domain, also known as Dom0. Dom0 controls and manages the network and other I/O devices directly. Thus the network path from all guest OSs is Guest OS -> Dom0 -> external world for transmit and in the reverse direction for receive. Dom0 plays the role of the arbitrator when multiple guest domains are conflicting for network I/O.
Paravirtualization typically performs better than binary translations because the hypervisor doesn't have to inspect each and every instruction. Moreover, it works great in cases like guest domain to guest domain communication, since the Dom0 can recognize the same and avoid sending packets to the hardware. However, paravirtualized solutions often require a good design (to ensure that Dom0 does not become a bottleneck as an example), and therefore higher cost of support and maintenance.
Intel I/O virtualization and AMD Pacifica virtualization technologies: Since 2006, both Intel and AMD have had hardware support to support virtualization. The hardware provides support to trap any priviledged instruction and send it to the hypervisor. This allows support of unmodified OSs on the Xen hypervisor on supported hardware. As an example, we can now run Windows XP, Solaris and Linux with Solaris XVM in the same box. Support for hardware virtualization although currently an initial step, is expected to grow and become dominant in the coming years. But as of now, paravirtualized solutions are generally seen outperforming hardware assisted solutions.
PCI-Express Technologies- I/O VT
The PCI-Express community is currently standardizing technologies to support multiple OSs running simultaneously within a single computer to natively share PCI-Express devices. There are two main technologies currently undergoing standardization, single-root I/O virtualization and multi-root I/O virtualization. The idea here is to allow an OS handle its own IOV compliant interface over PCI-Express which is also shared by other virtual OSs running in the system. This will allow more parallelism in hardware and reduce the role of the hypervisor in arbitrating amongst multiple OSs competing for the same I/O.
The current industry is in a flux of moving from software based virtualization solutions to hardware assisted ones. How much the performance of hardware solutions will improve over time is difficult to speculate. Therefore, paravirtualized solutions are still expected to be dominant for some time. It is interesting to see most vendors to support both hardware and software solutions for now.