In this blog post, the first in a series of three, Oracle Linux kernel developer John Johnson introduces Oracle’s work towards a more secure QEMU-based hypervisor.

Disaggregating QEMU

QEMU is often used as the hypervisor for virtual machines running in the Oracle cloud. One of the advantages of cloud computing is the ability to run many VMs from different tenants on the same cloud infrastructure. The flip side is that a guest that compromised its hypervisor could potentially use the hypervisor’s access privileges to reach data it is not authorized to access.

QEMU can be susceptible to security attacks because it is a large, monolithic program that provides many services to the VMs it controls. Many of these services can be configured out of QEMU, but even a reduced-configuration QEMU contains a large amount of code that a guest can potentially attack in order to gain additional privileges.

QEMU services

QEMU can be broadly described as providing three types of services. The first is a VM control point, where VMs can be created, migrated, re-configured, and destroyed. The second emulates the CPU instructions within the VM, usually accelerated by hardware virtualization features such as Intel’s VT extensions. The third provides IO services to the VM by emulating hardware IO devices, such as disk and network devices.

All these services exist within a single, monolithic QEMU process:

[Figure: current QEMU, with all services in a single process]

A disaggregated QEMU

A disaggregated QEMU separates these services into multiple host processes. With each service in its own process, we can use SELinux mandatory access controls to constrain every process to only the files it needs to provide its service. For example, a disk emulation process would be given access to only the disk images it provides, and would not be allowed to access other host files or any network devices. An attacker who compromised such a disk emulation process would not be able to exploit it beyond the host files the process has been granted access to.
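The confinement idea can be illustrated with a toy model. The sketch below is not SELinux policy syntax and does not call any SELinux API; it only models the rule described above, that each service process gets an explicit list of files and nothing else. The class name, the service name, and the file paths are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServicePolicy:
    """Toy model of a per-process access policy (hypothetical, not SELinux)."""

    service: str
    allowed_files: frozenset
    network_allowed: bool = False

    def may_open(self, path: str) -> bool:
        # Access is granted only to files that were explicitly listed.
        return path in self.allowed_files

    def may_use_network(self) -> bool:
        return self.network_allowed


# Hypothetical disk emulation process that serves a single guest image.
disk_policy = ServicePolicy(
    service="disk-emulation",
    allowed_files=frozenset({"/var/lib/images/guest1.img"}),
)
```

Under this model, a compromised disk emulation process could still touch `guest1.img`, but a request for another tenant's image or for a network device would be refused by the policy rather than by the process's own code.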

A QEMU control process would remain, but in disaggregated mode it would act as a control point that launches the processes needed to support the VM being created and sets up the communication paths between them. The control process would have no direct interfaces to the VM, although it would still provide the user interface for controlling the VM, such as hot-plugging devices or live migrating the VM.
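As a rough illustration of a control point that launches a service process and wires up a communication path, here is a minimal sketch. It assumes a Unix host and uses a socket pair as the channel; the child program, the function names, and the `ack:` reply format are invented for this example and say nothing about how the actual project implements it.

```python
import socket
import subprocess
import sys

# Stand-in for a device emulation process (hypothetical): it receives one
# request on the inherited socket and echoes it back with an "ack:" prefix.
CHILD_CODE = r"""
import socket, sys
chan = socket.socket(fileno=int(sys.argv[1]))
request = chan.recv(64)
chan.sendall(b"ack:" + request)
chan.close()
"""


def launch_device_process():
    """Start the child process and return (process, control-side socket)."""
    control_end, device_end = socket.socketpair()
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE, str(device_end.fileno())],
        pass_fds=[device_end.fileno()],  # only this descriptor is inherited
    )
    device_end.close()  # the control process keeps only its own end
    return proc, control_end


def send_request(chan, payload):
    """Send one small request and wait for the reply."""
    chan.sendall(payload)
    return chan.recv(64)
```

A caller would run `proc, chan = launch_device_process()`, exchange messages with `send_request(chan, b"hello")`, then close the channel and wait for the child to exit. Passing a single descriptor to the child mirrors the goal of giving each service process only the resources it needs.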

Disaggregating IO services

A first step in creating a disaggregated QEMU is to separate IO services from the main QEMU program. The main QEMU process would continue to provide CPU emulation and to act as the VM control point. In a later phase, CPU emulation could be separated from the control process.

Disaggregating IO services is a good place to begin disaggregating QEMU for a couple of reasons. One is that the sheer number of IO devices QEMU can emulate presents a large surface of interfaces that could potentially be exploited. Another is that the modular nature of QEMU’s device emulation code provides interface points where the functions that perform device emulation can be separated from the functions that manage the emulation of guest CPU instructions.
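To make the idea of an interface point concrete, the sketch below shows a toy device whose register reads and writes are reached through a proxy that serializes each access into a message. The CPU emulation side would hold the proxy, while the device would live in another process. Everything here is hypothetical: the wire format, the opcodes, and the class names are made up for illustration and are not QEMU interfaces.

```python
import struct

# Hypothetical wire format: opcode (1 byte), register offset (4 bytes),
# value (4 bytes), all little-endian.
REQUEST = struct.Struct("<BII")
REPLY = struct.Struct("<I")
OP_READ, OP_WRITE = 0, 1


class ToyDevice:
    """Stand-in for an emulated device: a bank of 32-bit registers."""

    def __init__(self):
        self.regs = {}

    def read(self, offset):
        return self.regs.get(offset, 0)

    def write(self, offset, value):
        self.regs[offset] = value & 0xFFFFFFFF


def serve_request(device, message):
    """Device-process side: decode one request and apply it to the device."""
    op, offset, value = REQUEST.unpack(message)
    if op == OP_WRITE:
        device.write(offset, value)
        return REPLY.pack(0)
    if op == OP_READ:
        return REPLY.pack(device.read(offset))
    raise ValueError(f"unknown opcode {op}")


class DeviceProxy:
    """CPU-emulation side: same read/write calls, forwarded as messages."""

    def __init__(self, transport):
        self.transport = transport  # callable: request bytes -> reply bytes

    def read(self, offset):
        reply = self.transport(REQUEST.pack(OP_READ, offset, 0))
        return REPLY.unpack(reply)[0]

    def write(self, offset, value):
        self.transport(REQUEST.pack(OP_WRITE, offset, value & 0xFFFFFFFF))
```

In this sketch the transport is just a callable, so it can be an in-process function for testing or a function that sends the bytes over a socket to a separate device process. The caller's code does not change either way, which is the property that makes a modular device interface a convenient place to split processes.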

[Figure: phase 1 QEMU, with IO services disaggregated]

Disaggregated CPU emulation

After IO services have been disaggregated, a second phase would move CPU instruction emulation out of the main QEMU control function and into its own process. There are few existing object separation points for this code, so the first task would be to create interfaces between the control plane functions and the functions that manage guest CPUs.

[Figure: phase 2 QEMU, with CPU emulation disaggregated]

Progress to date

We’ve separated our first device from the main QEMU process: an LSI 895 SCSI disk controller. Future blog posts on this topic will cover the design of the project, its performance, where the source code can be found, and how to use it.

To see part 2 in this blog series, go to: https://blogs.oracle.com/linux/towards-a-more-secure-qemu-hypervisor%2c-part-2-of-3