Introduction
The virtio-blk device has supported multi-queue for quite a while. Multi-queue improves performance under heavy I/O by processing the queues in parallel. However, before QEMU 9.0, all the virtqueues were processed by a single IOThread or by the main loop, and this single thread could become a CPU bottleneck.
In QEMU 9.0, the ‘virtio-blk’ device gains true multi-queue functionality: multiple IOThreads can service distinct virtqueues of a single disk, distributing the workload across them. It is now possible to specify the mapping between multiple IOThreads and virtqueues for a virtio-blk device. This helps scalability, in particular when the guest generates enough I/O to overload the host CPU that processes all the virtio-blk requests in a single IOThread.
In this article we will see how to configure multiple IOThreads using the new ‘iothread-vq-mapping’ property for a virtio-blk device.
Set Up the KVM Host
The distro used for the KVM host was Oracle Linux 9 with a UEK-next Linux kernel installed. In the next sections, we will see how to install a UEK-next kernel, build QEMU 9.0, and create a null-block device for testing, to set up our host.
Install a UEK-next Linux Kernel
First, import the GPG key, then add the UEK-next yum repository.
$ rpm --import https://yum.oracle.com/RPM-GPG-KEY-oracle-development
$ dnf config-manager --add-repo "https://yum.oracle.com/repo/OracleLinux/OL9/developer/UEKnext/x86_64"
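As an optional sanity check (this step is not strictly required), you can confirm that the new repository is now visible to dnf:
$ dnf repolist | grep -i ueknext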
Now install the UEK-next kernel:
$ dnf install kernel-ueknext
Dependencies resolved.
=====================================================================================================================================
 Package                       Arch     Version          Repository                                                      Size
=====================================================================================================================================
Installing:
 kernel-ueknext                x86_64   6.8.0-2.el9uek   yum.oracle.com_repo_OracleLinux_OL9_developer_UEKnext_x86_64   235 k
Installing dependencies:
 kernel-ueknext-core           x86_64   6.8.0-2.el9uek   yum.oracle.com_repo_OracleLinux_OL9_developer_UEKnext_x86_64    17 M
 kernel-ueknext-modules        x86_64   6.8.0-2.el9uek   yum.oracle.com_repo_OracleLinux_OL9_developer_UEKnext_x86_64    53 M
 kernel-ueknext-modules-core   x86_64   6.8.0-2.el9uek   yum.oracle.com_repo_OracleLinux_OL9_developer_UEKnext_x86_64    34 M

Transaction Summary
=====================================================================================================================================
Install  4 Packages

Total download size: 104 M
Installed size: 155 M
Is this ok [y/N]: y
Downloading Packages:
(1/4): kernel-ueknext-6.8.0-2.el9uek.x86_64.rpm                1.4 MB/s | 235 kB   00:00
(2/4): kernel-ueknext-core-6.8.0-2.el9uek.x86_64.rpm            25 MB/s |  17 MB   00:00
(3/4): kernel-ueknext-modules-core-6.8.0-2.el9uek.x86_64.rpm    32 MB/s |  34 MB   00:01
(4/4): kernel-ueknext-modules-6.8.0-2.el9uek.x86_64.rpm         29 MB/s |  53 MB   00:01
-------------------------------------------------------------------------------------------------------------------------------------
Total                                                           56 MB/s | 104 MB   00:01
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
  Preparing        :                                                       1/1
  Installing       : kernel-ueknext-modules-core-6.8.0-2.el9uek.x86_64     1/4
  Running scriptlet: kernel-ueknext-core-6.8.0-2.el9uek.x86_64             2/4
  Installing       : kernel-ueknext-core-6.8.0-2.el9uek.x86_64             2/4
  Running scriptlet: kernel-ueknext-core-6.8.0-2.el9uek.x86_64             2/4
  Installing       : kernel-ueknext-modules-6.8.0-2.el9uek.x86_64          3/4
  Running scriptlet: kernel-ueknext-modules-6.8.0-2.el9uek.x86_64          3/4
  Installing       : kernel-ueknext-6.8.0-2.el9uek.x86_64                  4/4
  Running scriptlet: kernel-ueknext-modules-core-6.8.0-2.el9uek.x86_64     4/4
  Running scriptlet: kernel-ueknext-core-6.8.0-2.el9uek.x86_64             4/4
  Running scriptlet: kernel-ueknext-modules-6.8.0-2.el9uek.x86_64          4/4
  Running scriptlet: kernel-ueknext-6.8.0-2.el9uek.x86_64                  4/4
  Verifying        : kernel-ueknext-6.8.0-2.el9uek.x86_64                  1/4
  Verifying        : kernel-ueknext-core-6.8.0-2.el9uek.x86_64             2/4
  Verifying        : kernel-ueknext-modules-6.8.0-2.el9uek.x86_64          3/4
  Verifying        : kernel-ueknext-modules-core-6.8.0-2.el9uek.x86_64     4/4

Installed:
  kernel-ueknext-6.8.0-2.el9uek.x86_64
  kernel-ueknext-core-6.8.0-2.el9uek.x86_64
  kernel-ueknext-modules-6.8.0-2.el9uek.x86_64
  kernel-ueknext-modules-core-6.8.0-2.el9uek.x86_64

Complete!
Once installation has completed, reboot the server and confirm that we have booted into the new kernel:
$ uname -r
6.8.0-2.el9uek.x86_64
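If uname -r still reports the old kernel after the reboot, the new kernel may not be the default boot entry. One way to fix that (a sketch only; the kernel path below assumes the version installed above) is with grubby:
$ sudo grubby --default-kernel
$ sudo grubby --set-default=/boot/vmlinuz-6.8.0-2.el9uek.x86_64
$ sudo reboot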
QEMU Build
The new feature “iothread-vq-mapping” was introduced in QEMU 9.0 to map virtqueues to IOThreads. So we will now build QEMU 9.0 from source.
To build QEMU, run the commands below:
Notes:
- We will do a build for an x86_64 target.
- NR_CPUS is the number of CPUs on the system; you can check it with the lscpu command. Alternatively, replace $NR_CPUS with the desired number of parallel build jobs.
$ wget https://download.qemu.org/qemu-9.0.0.tar.xz
$ tar xvJf qemu-9.0.0.tar.xz
$ cd qemu-9.0.0
$ ./configure --target-list=x86_64-softmmu
$ make -j $NR_CPUS
In the next sections, we will run the QEMU command directly from the build directory to create a virtual machine (VM).
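Before moving on, it is worth confirming that the freshly built binary reports the expected version. The build/ path below assumes QEMU's default out-of-tree build layout created by ./configure:
$ ./build/qemu-system-x86_64 --version
# should report: QEMU emulator version 9.0.0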
Set Up a null-block Device
The null-block device (/dev/nullb*) emulates a block device of a configurable size (in GB). Its purpose is to test the different block-layer implementations: instead of performing actual read/write operations, it simply marks requests as completed in the request queue.
Create a null-block device /dev/nullb0:
$ sudo modprobe null_blk hw_queue_depth=1024 gb=100 submit_queues=$NR_CPUS nr_devices=1 max_sectors=1024
This uses the number of CPUs on your system for submit_queues, sets the device size to 100 GB and the hardware queue depth to 1024 commands, and limits the maximum I/O size to 512 KB (max_sectors=1024, i.e. 1024 sectors of 512 bytes).
Confirm creation of the device:
$ ls -l /dev/nullb*
brw-rw----. 1 root disk 251, 0 Jul 13 13:38 /dev/nullb0
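Optionally (not part of the original steps), you can double-check that the module parameters were applied; null_blk exposes them under /sys/module/null_blk/parameters, and lsblk shows the resulting size:
$ grep . /sys/module/null_blk/parameters/{gb,submit_queues,hw_queue_depth,nr_devices}
$ lsblk /dev/nullb0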
Note: You can also use any other backend device of your choice instead of nullb0.
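For example (a hypothetical alternative, not used in the rest of this article), you could back drive0 with a plain raw image file instead of the null-block device:
$ qemu-img create -f raw /work/test-disk.raw 100G
# ...and point the existing -drive option at it instead of /dev/nullb0:
#   -drive file=/work/test-disk.raw,if=none,id=drive0,format=raw,cache=none,aio=native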
Start a VM with virtio-blk Device
This section describes how to start a VM with a virtio-blk device attached, using the QEMU command line. First we will see how to create a VM with only one IOThread attached to the virtio-blk device, and then use the “iothread-vq-mapping” parameter to attach multiple IOThreads.
The VM configuration will be:
- os-release: OL9.4
- kernel: 6.8.0-2.el9uek.x86_64
- 16 vCPUs
- 16G RAM
Use Only a Single IOThread
Let’s see how to attach a single IOThread to a virtio-blk disk:
$ qemu-system-x86_64 -smp 16 -m 16G -enable-kvm -cpu host \
    -hda /work/OL9U4_x86_64.qcow2 -name debug-threads=on \
    -serial mon:stdio -vnc :7 \
    -object iothread,id=iothread0 \
    -device virtio-blk-pci,drive=drive0,id=virtblk0,iothread=iothread0,queue-size=1024,config-wce=false \
    -drive file=/dev/nullb0,if=none,id=drive0,format=raw,cache=none,aio=native
Note: Open a VNC client and connect to localhost:5907. Any VNC client program can be used to access the VM; for instance, RealVNC or TightVNC on Windows, or the vncviewer program that comes with your Linux distribution. The -vnc :X option starts a VNC server on display number X, and display X listens on TCP port 5900+X, so display 0 listens on 5900, 1 on 5901, and the display :7 used above listens on 5907.
Check if multi-queue is enabled in your VM:
$ ls /sys/block/vda/mq
0  1  10  11  12  13  14  15  2  3  4  5  6  7  8  9
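Optionally, still inside the guest, you can also see which vCPUs are mapped to each queue through the blk-mq cpu_list attribute (a quick check, not part of the original steps):
$ grep . /sys/block/vda/mq/*/cpu_list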
Check for the I/O thread. Use pidstat -t -p <QEMU-pid> to check the threads of the QEMU process on the host:
$ pidstat -t -p 1034347
11:08:11 AM   UID      TGID       TID    %usr %system  %guest   %wait    %CPU   CPU  Command
11:08:11 AM     0   1034347         -    0.00    0.00    0.00    0.00    0.00    29  qemu-system-x86
11:08:11 AM     0         -   1034347    0.00    0.00    0.00    0.00    0.00    29  |__qemu-system-x86
11:08:11 AM     0         -   1034348    0.00    0.00    0.00    0.00    0.00   160  |__qemu-system-x86
11:08:11 AM     0         -   1034349    0.00    0.00    0.00    0.00    0.00   175  |__IO iothread0
11:08:11 AM     0         -   1034353    0.00    0.00    0.00    0.00    0.00    20  |__CPU 0/KVM
11:08:11 AM     0         -   1034354    0.00    0.00    0.00    0.00    0.00    17  |__CPU 1/KVM
11:08:11 AM     0         -   1034355    0.00    0.00    0.00    0.00    0.00    25  |__CPU 2/KVM
11:08:11 AM     0         -   1034356    0.00    0.00    0.00    0.00    0.00    31  |__CPU 3/KVM
11:08:11 AM     0         -   1034357    0.00    0.00    0.00    0.00    0.00    21  |__CPU 4/KVM
11:08:11 AM     0         -   1034358    0.00    0.00    0.00    0.00    0.00    27  |__CPU 5/KVM
11:08:11 AM     0         -   1034359    0.00    0.00    0.00    0.00    0.00    27  |__CPU 6/KVM
11:08:11 AM     0         -   1034360    0.00    0.00    0.00    0.00    0.00    30  |__CPU 7/KVM
11:08:11 AM     0         -   1034361    0.00    0.00    0.00    0.00    0.00    29  |__CPU 8/KVM
11:08:11 AM     0         -   1034362    0.00    0.00    0.00    0.00    0.00    16  |__CPU 9/KVM
11:08:11 AM     0         -   1034363    0.00    0.00    0.00    0.00    0.00    18  |__CPU 10/KVM
11:08:11 AM     0         -   1034364    0.00    0.00    0.00    0.00    0.00    19  |__CPU 11/KVM
11:08:11 AM     0         -   1034365    0.00    0.00    0.00    0.00    0.00    23  |__CPU 12/KVM
11:08:11 AM     0         -   1034366    0.00    0.00    0.00    0.00    0.00    22  |__CPU 13/KVM
11:08:11 AM     0         -   1034367    0.00    0.00    0.00    0.00    0.00    28  |__CPU 14/KVM
11:08:11 AM     0         -   1034368    0.00    0.00    0.00    0.00    0.00    24  |__CPU 15/KVM
Add the “iothread-vq-mapping” Parameter
Now, let’s attach the disk using the ‘iothread-vq-mapping’ parameter to assign virtqueues to IOThreads.
Below is the command-line syntax of this new property:
--device '{"driver":"foo","iothread-vq-mapping":[{"iothread":"iothread0","vqs":[0,1,2]},...]}'
- iothread: the ID of an IOThread object.
- vqs: an optional array of virtqueue indices that will be handled by this IOThread.
The following is an alternate syntax that does not require specifying individual virtqueue indices:
--device '{"driver":"foo","iothread-vq-mapping":[{"iothread":"iothread0"},{"iothread":"iothread1"},...]}'
Remember, either all IOThreads in the mapping must have vqs specified, or none of them may.
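For illustration only (these fragments are ours, not taken from the QEMU documentation), the first mapping below is accepted while the second would be rejected because it mixes entries with and without vqs:
# Valid: every IOThread entry specifies "vqs"
"iothread-vq-mapping":[{"iothread":"iothread0","vqs":[0,1]},{"iothread":"iothread1","vqs":[2,3]}]

# Invalid: one entry specifies "vqs" and the other does not
"iothread-vq-mapping":[{"iothread":"iothread0","vqs":[0,1]},{"iothread":"iothread1"}]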
Allow QEMU to Assign Mapping
Let’s see how to use this feature without specifying the vqs parameter. In this case, all the virtqueues are assigned round-robin across the given set of IOThreads.
$ qemu-system-x86_64 -smp 16 -m 16G -enable-kvm -cpu host \
    -hda /work/OL9U4_x86_64.qcow2 -name debug-threads=on \
    -serial mon:stdio -vnc :7 \
    -object iothread,id=iothread0 -object iothread,id=iothread1 \
    -object iothread,id=iothread2 -object iothread,id=iothread3 \
    -object iothread,id=iothread4 -object iothread,id=iothread5 \
    -object iothread,id=iothread6 -object iothread,id=iothread7 \
    -object iothread,id=iothread8 -object iothread,id=iothread9 \
    -object iothread,id=iothread10 -object iothread,id=iothread11 \
    -object iothread,id=iothread12 -object iothread,id=iothread13 \
    -object iothread,id=iothread14 -object iothread,id=iothread15 \
    -drive file=/dev/nullb0,if=none,id=drive0,format=raw,cache=none,aio=native \
    --device '{"driver":"virtio-blk-pci","iothread-vq-mapping":[{"iothread":"iothread0"},{"iothread":"iothread1"},{"iothread":"iothread2"},{"iothread":"iothread3"},{"iothread":"iothread4"},{"iothread":"iothread5"},{"iothread":"iothread6"},{"iothread":"iothread7"},{"iothread":"iothread8"},{"iothread":"iothread9"},{"iothread":"iothread10"},{"iothread":"iothread11"},{"iothread":"iothread12"},{"iothread":"iothread13"},{"iothread":"iothread14"},{"iothread":"iothread15"}],"drive":"drive0","queue-size":1024,"config-wce":false}'
To check how the IOThreads are utilized once this feature is used, see the section Run the Tests.
Manually Assign virtqueues to IOThreads
Single vq per IOThread
Now, we will specify exactly which vqs are associated with which IOThreads. IOThreads are specified by name and virtqueues by their 0-based index.
In this example, we assign each of the 16 vqs to one of 16 IOThreads:
$ qemu-system-x86_64 -smp 16 -m 16G -enable-kvm -cpu host \
    -serial mon:stdio -vnc :7 \
    -object iothread,id=iothread0 -object iothread,id=iothread1 \
    -object iothread,id=iothread2 -object iothread,id=iothread3 \
    -object iothread,id=iothread4 -object iothread,id=iothread5 \
    -object iothread,id=iothread6 -object iothread,id=iothread7 \
    -object iothread,id=iothread8 -object iothread,id=iothread9 \
    -object iothread,id=iothread10 -object iothread,id=iothread11 \
    -object iothread,id=iothread12 -object iothread,id=iothread13 \
    -object iothread,id=iothread14 -object iothread,id=iothread15 \
    -drive file=/dev/nullb0,if=none,id=drive0,format=raw,cache=none,aio=native \
    --device '{"driver":"virtio-blk-pci","iothread-vq-mapping":[{"iothread":"iothread0","vqs": [0]},{"iothread":"iothread1","vqs": [1]},{"iothread":"iothread2","vqs": [2]},{"iothread":"iothread3","vqs": [3]},{"iothread":"iothread4","vqs": [4]},{"iothread":"iothread5","vqs": [5]},{"iothread":"iothread6","vqs": [6]},{"iothread":"iothread7","vqs": [7]},{"iothread":"iothread8","vqs": [8]},{"iothread":"iothread9","vqs": [9]},{"iothread":"iothread10","vqs": [10]},{"iothread":"iothread11","vqs": [11]},{"iothread":"iothread12","vqs": [12]},{"iothread":"iothread13","vqs": [13]},{"iothread":"iothread14","vqs": [14]},{"iothread":"iothread15","vqs": [15]}],"drive":"drive0","queue-size":1024,"config-wce":false}'
Multiple vqs per IOThread
Below is the --device definition showing how multiple virtqueues can be assigned to a single IOThread.
Here we will assign 2 vqs each to 8 IOThreads:
--device '{"driver":"virtio-blk-pci","iothread-vq-mapping":[{"iothread":"iothread0","vqs": [0,1]},{"iothread":"iothread1","vqs": [2,3]},{"iothread":"iothread2","vqs": [4,5]},{"iothread":"iothread3","vqs": [6,7]},{"iothread":"iothread4","vqs": [8,9]},{"iothread":"iothread5","vqs": [10,11]},{"iothread":"iothread6","vqs": [12,13]},{"iothread":"iothread7","vqs": [14,15]}],"drive":"drive0","queue-size":1024,"config-wce":false}'
Test with Fio
Pinning QEMU Threads
Check your system’s CPU configuration by running the lscpu -e command. On our system, CPUs 16 to 31 are on the same NUMA node and are free. Since our VM has 16 vCPUs, we’ll bind all the QEMU threads to 16 physical CPUs on the host.
Run the following command on the host:
$ taskset -cp -a 16-31 <QEMU-pid>
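If you want finer-grained control, you can additionally pin each IOThread to its own host CPU. The snippet below is only a sketch (the CPU range and the thread-name matching are assumptions based on the pidstat output shown earlier; adjust them to your system):
QEMU_PID=<QEMU-pid>    # replace with the PID of your QEMU process
cpu=16                 # first host CPU to use; we assume CPUs 16-31 are free
ps -T -p "$QEMU_PID" -o spid=,comm= | awk '$2 == "IO" {print $1}' | \
while read -r tid; do
    sudo taskset -cp "$cpu" "$tid"    # bind this IOThread to a single host CPU
    cpu=$((cpu + 1))
done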
Run the Tests
With this new feature in use, pidstat -t 1 will show that VMs with -smp 2 or higher are able to make use of multiple IOThreads. Let’s run the test below to verify this.
Run this fio workload that spreads I/O across 16 queues:
$ cat randread.fio
[global]
bs=4K
iodepth=64
direct=1
ioengine=libaio
group_reporting
time_based
runtime=60
numjobs=16
name=standard-iops
rw=randread
cpus_allowed=0-15

[job1]
filename=/dev/vda

# fio randread.fio
pidstat output:
$ pidstat -t -p 3465871 1
12:49:55 PM   UID      TGID       TID    %usr %system  %guest   %wait    %CPU   CPU  Command
12:49:55 PM     0   3465871         -  412.80  486.80  578.20    0.00 1477.80    26  qemu-system-x86
12:49:55 PM     0         -   3465871    0.00    0.20    0.00    0.00    0.20    26  |__qemu-system-x86
12:49:55 PM     0         -   3465872    0.00    0.00    0.00    0.00    0.00   135  |__qemu-system-x86
12:49:55 PM     0         -   3465873   25.80   29.60    0.00   26.40   55.40    22  |__IO iothread0
12:49:55 PM     0         -   3465874   24.80   28.60    0.00   27.20   53.40    31  |__IO iothread1
12:49:55 PM     0         -   3465875   26.60   28.80    0.00   26.20   55.40    24  |__IO iothread2
12:49:55 PM     0         -   3465876   26.40   27.80    0.00   26.60   54.20    30  |__IO iothread3
12:49:55 PM     0         -   3465877   27.00   28.00    0.00   26.40   55.00    23  |__IO iothread4
12:49:55 PM     0         -   3465878   26.40   28.60    0.00   26.40   55.00    21  |__IO iothread5
12:49:55 PM     0         -   3465879   24.80   29.20    0.00   27.00   54.00    18  |__IO iothread6
12:49:55 PM     0         -   3465880   26.40   29.60    0.00   26.60   56.00    25  |__IO iothread7
12:49:55 PM     0         -   3465881   25.60   28.20    0.00   26.40   53.80    17  |__IO iothread8
12:49:55 PM     0         -   3465882   25.80   28.00    0.00   27.00   53.80    26  |__IO iothread9
12:49:55 PM     0         -   3465883   26.40   28.00    0.00   26.80   54.40    16  |__IO iothread10
12:49:55 PM     0         -   3465884   26.00   29.00    0.00   26.20   55.00    24  |__IO iothread11
12:49:55 PM     0         -   3465885   25.80   29.00    0.00   26.80   54.80    26  |__IO iothread12
12:49:55 PM     0         -   3465886   25.80   29.00    0.00   26.60   54.80    27  |__IO iothread13
12:49:55 PM     0         -   3465887   25.80   28.00    0.00   26.00   53.80    21  |__IO iothread14
12:49:55 PM     0         -   3465888   26.20   29.60    0.00   26.80   55.80    17  |__IO iothread15
12:49:55 PM     0         -   3465891    0.60    2.80   36.40   19.20   39.80    16  |__CPU 0/KVM
12:49:55 PM     0         -   3465893    0.00    2.20   35.20   17.80   37.40    28  |__CPU 1/KVM
12:49:55 PM     0         -   3465894    0.00    2.40   36.40   16.00   38.40    20  |__CPU 2/KVM
12:49:55 PM     0         -   3465895    0.00    2.20   35.80   16.20   37.20    19  |__CPU 3/KVM
12:49:55 PM     0         -   3465896    0.00    2.00   36.40   15.80   37.80    31  |__CPU 4/KVM
12:49:55 PM     0         -   3465897    0.00    2.40   35.60   15.80   38.00    23  |__CPU 5/KVM
12:49:55 PM     0         -   3465898    0.00    1.80   35.20   16.60   36.60    18  |__CPU 6/KVM
12:49:55 PM     0         -   3465899    0.00    2.40   36.00   15.00   38.20    25  |__CPU 7/KVM
12:49:55 PM     0         -   3465900    0.00    1.80   35.80   15.80   37.20    20  |__CPU 8/KVM
12:49:55 PM     0         -   3465901    0.00    2.20   35.80   15.40   36.80    29  |__CPU 9/KVM
12:49:55 PM     0         -   3465902    0.00    2.20   35.80   16.40   37.60    27  |__CPU 10/KVM
12:49:55 PM     0         -   3465903    0.00    1.80   36.80   16.20   38.00    21  |__CPU 11/KVM
12:49:55 PM     0         -   3465904    0.00    1.80   38.00   16.40   38.20    19  |__CPU 12/KVM
12:49:55 PM     0         -   3465905    0.00    2.00   36.80   15.80   37.40    23  |__CPU 13/KVM
12:49:55 PM     0         -   3465906    0.00    1.80   35.00   16.20   37.20    28  |__CPU 14/KVM
12:49:55 PM     0         -   3465907    0.00    2.20   36.60   15.20   38.00    30  |__CPU 15/KVM
After running the fio workload shown above, we observed that the number of IOPS scaled up several times compared to the single-IOThread configuration when the ‘Allow QEMU to Assign Mapping’ VM configuration was used.
You can try different configurations on your own setup to see how the performance compares.
Conclusion
In this article, we saw how to set up a VM using the newly introduced ‘iothread-vq-mapping’ feature and its different use cases. We also tested this new feature using the fio tool to see how it can help improve performance.
References
- https://wiki.qemu.org/ChangeLog/9.0
- https://github.com/qemu/qemu/commit/cf03a152c5d749fd0083bfe540df9524f1d2ff1d
- https://blogs.oracle.com/linux/post/uek-next
- https://docs.oracle.com/en-us/iaas/Content/Block/References/samplefiocommandslinux.htm
- https://fio.readthedocs.io/en/latest/fio_doc.html
- https://docs.kernel.org/block/null_blk.html