Introduction
The virtio-scsi device has now gained true multiqueue support, similar to the virtio-blk device (as discussed in a previous blog), where I/O operations can be distributed by binding specific I/O threads to individual virtqueues.
Previously, the QEMU virtio-scsi device model and core SCSI emulation assumed that all requests are processed in a single AioContext, and that single thread can become a CPU bottleneck. Starting with QEMU 10.0.0, the iothread-vq-mapping feature lets us specify the mapping between multiple IOThreads and the virtqueues of a virtio-scsi device. This helps improve scalability for SMP virtual machines (VMs), where I/O-intensive workloads can otherwise be limited by the single IOThread. In this article, we will see how to configure multiple IOThreads using the new ‘iothread-vq-mapping’ property for a virtio-scsi device.
If you have read the blog “Improve virtio-blk device performance using iothread-vq-mapping”, you can use the same KVM host setup. The only difference is the QEMU version, which in this case is QEMU 10.0.0, so you can skip ahead to the Start a VM with virtio-scsi Device section. If this is a new setup, follow the next section to set up the KVM host.
Setup the KVM Host
The distro for the KVM host will be Oracle Linux 9 with a UEK-next Linux kernel installed. In the next sections, we will install a UEK-next kernel, build QEMU 10.0.0, and create a null-block device for testing to set up our host.
Install a UEK-next Linux Kernel
First, import the GPG key, then add the UEK-next repo.
$ rpm --import https://yum.oracle.com/RPM-GPG-KEY-oracle-development
$ dnf config-manager --add-repo "https://yum.oracle.com/repo/OracleLinux/OL9/developer/UEKnext/x86_64"
Now install the UEK-next kernel:
$ dnf install kernel-ueknext
After the install is complete, reboot the server and confirm that we have booted into the new kernel:
$ uname -r
6.14.0-0.el9ueknext.x86_64
QEMU Build
For virtio-scsi, the “iothread-vq-mapping” feature was introduced in QEMU 10.0.0 to map virtqueues to IOThreads, so we will build QEMU 10.0.0 from source here.
To build QEMU, run the commands below:
Notes:
- We will do a build for an x86_64 target.
- NR_CPUS is the maximum number of CPUs. To check the number of CPUs on your system, use the lscpu command. Alternatively, you can replace $NR_CPUS with the desired number of CPUs to use for the build.
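For example, here is one way to set NR_CPUS to the number of online CPUs before running the build (a minimal sketch; nproc comes from coreutils, or simply pick a smaller value by hand):

$ NR_CPUS=$(nproc)
$ echo $NR_CPUS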
$ wget https://download.qemu.org/qemu-10.0.0.tar.xz
$ tar xvJf qemu-10.0.0.tar.xz
$ cd qemu-10.0.0
$ ./configure --target-list=x86_64-softmmu
$ make -j $NR_CPUS
In the next sections, we will run the QEMU command directly from the build directory to create a VM.
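To sanity-check the build, you can print the version and confirm that the new property is available (assuming the binary lands in the build/ subdirectory; adjust the path to match your build output):

$ ./build/qemu-system-x86_64 --version
$ ./build/qemu-system-x86_64 -device virtio-scsi-pci,help | grep iothread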
Setup a null-block Device
The null-block device (/dev/nullb*) emulates a block device of X GB in size. Its purpose is to test different block-layer configurations. Instead of performing real read/write operations, it simply marks them as finished in the request queue.
Create a null-block device /dev/nullb0:
$ sudo modprobe null_blk hw_queue_depth=1024 gb=100 submit_queues=$NR_CPUS nr_devices=1 max_sectors=1024
Use the number of CPUs on your system for ‘submit_queues’, set the size to 100 GB, the hardware queue depth to 1024 commands, and the maximum I/O size to 512 KB (max_sectors=1024).
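As a quick check, the module parameters that took effect can be read back from sysfs (illustrative; the files mirror the options passed to modprobe):

$ cat /sys/module/null_blk/parameters/gb
100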
Confirm creation of the device:
$ ls -l /dev/nullb*
brw-rw----. 1 root disk 251, 0 Jun 12 05:40 /dev/nullb0
Note: You can use any backend device of your choice instead of nullb0.
Start a VM with virtio-scsi Device
This section describes how to start a VM with a virtio-scsi device attached using the QEMU command line. First, we will see how to create a VM using only a single IOThread attached to a virtio-scsi device and then use the “iothread-vq-mapping” parameter to attach multiple IOThreads.
VM configuration will be:
- os-release: OL9.6
- kernel: 6.12.0-1.23.3.2.el9uek.x86_64
- 16 vCPUs
- 16G RAM
Use Only a Single IOThread
Let’s see how to attach a single I/O thread to the virtio-scsi disk:
$ qemu-system-x86_64 -smp 16 -m 16G -enable-kvm -cpu host \
  -hda /test/System.img -name debug-threads=on \
  -serial mon:stdio -vnc :7 \
  -object iothread,id=iothread0 \
  -device virtio-scsi-pci,id=scsi0,virtqueue_size=1024,iothread=iothread0 \
  -device scsi-hd,drive=drive0,bus=scsi0.0,channel=0,scsi-id=0,lun=0 \
  -drive file=/dev/nullb0,if=none,id=drive0
Note: Open a VNC client application and connect to localhost:5907. The VM can be accessed with any VNC client program; for instance, RealVNC or TightVNC on Windows, or the vncviewer program that comes with your Linux distribution. The -vnc :X option starts a VNC server on display number X, which listens on TCP port 5900 + X. For example, display 0 listens on 5900, display 1 on 5901, and so on, so display 7 above corresponds to port 5907.
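For example, from a Linux client you could connect with (assuming a vncviewer such as TigerVNC is installed):

$ vncviewer localhost:5907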
The above command will create a 16-queue virtio-scsi HBA (host bus adapter) with one LUN. The following information is from the VM:
$ dmesg
...
[ 0.900594] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 246)
[ 1.701351] virtio_scsi virtio1: 16/0/0 default/read/poll queues
[ 1.710029] scsi host0: Virtio SCSI HBA
[ 1.714132] scsi 0:0:0:0: Direct-Access     QEMU     QEMU HARDDISK    2.5+ PQ: 0 ANSI: 5
[ 1.886642] scsi 0:0:0:0: Attached scsi generic sg0 type 0
[ 1.928895] sd 0:0:0:0: [sda] Attached SCSI disk
...

$ lsscsi
[0:0:0:0]    disk    QEMU     QEMU HARDDISK    2.5+  /dev/sda
...
Check if multi-queue is enabled in your VM:
$ ls /sys/block/sda/mq/
0  1  10  11  12  13  14  15  2  3  4  5  6  7  8  9
This confirms that the virtio-scsi HBA has 16 I/O queues.
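You can also see which vCPUs are served by each queue from inside the VM (shown only as an illustration; the output depends on your configuration):

$ grep -H . /sys/block/sda/mq/*/cpu_list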
Check for the I/O thread. Use pidstat -t -p <QEMU-pid> to list the threads of the QEMU process on the host:
$ pidstat -t -p 129178
08:43:29 AM   UID      TGID       TID    %usr %system  %guest   %wait    %CPU   CPU  Command
08:43:29 AM     0    129178         -    0.01    0.01    0.02    0.00    0.04    23  qemu-system-x86
08:43:29 AM     0         -    129178    0.00    0.00    0.00    0.00    0.00    23  |__qemu-system-x86
08:43:29 AM     0         -    129179    0.00    0.00    0.00    0.00    0.00    38  |__qemu-system-x86
08:43:29 AM     0         -    129180    0.00    0.00    0.00    0.00    0.00    14  |__IO iothread0
08:43:29 AM     0         -    129182    0.00    0.00    0.00    0.00    0.01    20  |__CPU 0/KVM
08:43:29 AM     0         -    129184    0.00    0.00    0.00    0.00    0.00    21  |__CPU 1/KVM
08:43:29 AM     0         -    129185    0.00    0.00    0.00    0.00    0.00    18  |__CPU 2/KVM
08:43:29 AM     0         -    129186    0.00    0.00    0.00    0.00    0.00    22  |__CPU 3/KVM
08:43:29 AM     0         -    129187    0.00    0.00    0.00    0.00    0.00    21  |__CPU 4/KVM
08:43:29 AM     0         -    129188    0.00    0.00    0.00    0.00    0.00    22  |__CPU 5/KVM
08:43:29 AM     0         -    129189    0.00    0.00    0.00    0.00    0.00    19  |__CPU 6/KVM
08:43:29 AM     0         -    129190    0.00    0.00    0.00    0.00    0.00    17  |__CPU 7/KVM
08:43:29 AM     0         -    129191    0.00    0.00    0.00    0.00    0.00    17  |__CPU 8/KVM
08:43:29 AM     0         -    129192    0.00    0.00    0.00    0.00    0.00    19  |__CPU 9/KVM
08:43:29 AM     0         -    129193    0.00    0.00    0.00    0.00    0.00    16  |__CPU 10/KVM
08:43:29 AM     0         -    129194    0.00    0.00    0.00    0.00    0.00    22  |__CPU 11/KVM
08:43:29 AM     0         -    129195    0.00    0.00    0.00    0.00    0.00    20  |__CPU 12/KVM
08:43:29 AM     0         -    129196    0.00    0.00    0.00    0.00    0.00    18  |__CPU 13/KVM
08:43:29 AM     0         -    129197    0.00    0.00    0.00    0.00    0.00    23  |__CPU 14/KVM
08:43:29 AM     0         -    129198    0.00    0.00    0.00    0.00    0.00    16  |__CPU 15/KVM
08:43:29 AM     0         -    129200    0.00    0.00    0.00    0.00    0.00    59  |__vhost-129178
08:43:29 AM     0         -    129201    0.00    0.00    0.00    0.00    0.00   225  |__kvm-nx-lpage-re
08:43:29 AM     0         -    129220    0.00    0.00    0.00    0.00    0.00    24  |__worker
Add the “iothread-vq-mapping” Parameter
Now, let’s attach the virtio-scsi disk using the ‘iothread-vq-mapping’ parameter to assign virtqueues to IOThreads.
Below is the command-line syntax of this new property using JSON format:
--device '{"driver":"foo","iothread-vq-mapping":[{"iothread":"iothread0","vqs":[0,1,2]},...]}'
- iothread: the id of an IOThread object.
- vqs: an optional array of virtqueue indices that will be handled by this IOThread.
The following is an alternate syntax that does not require specifying individual virtqueue indices:
--device '{"driver":"foo","iothread-vq-mapping":[{"iothread":"iothread0"},{"iothread":"iothread1"},...]}'
Remember, either every IOThread entry must specify vqs or none of them must.
Allow QEMU to Assign Mapping
Let’s see how to use this feature without specifying the vqs parameter. In this case, the virtqueues are assigned round-robin across the given set of IOThreads:
$ qemu-system-x86_64 -smp 16 -m 16G -enable-kvm -cpu host \
  -hda /test/System.img -name debug-threads=on \
  -serial mon:stdio -vnc :7 \
  -object iothread,id=iothread0 -object iothread,id=iothread1 \
  -object iothread,id=iothread2 -object iothread,id=iothread3 \
  -object iothread,id=iothread4 -object iothread,id=iothread5 \
  -object iothread,id=iothread6 -object iothread,id=iothread7 \
  -object iothread,id=iothread8 -object iothread,id=iothread9 \
  -object iothread,id=iothread10 -object iothread,id=iothread11 \
  -object iothread,id=iothread12 -object iothread,id=iothread13 \
  -object iothread,id=iothread14 -object iothread,id=iothread15 \
  -device '{"driver":"virtio-scsi-pci","id":"scsi0","iothread-vq-mapping":[
      {"iothread":"iothread0"},{"iothread":"iothread1"},{"iothread":"iothread2"},{"iothread":"iothread3"},
      {"iothread":"iothread4"},{"iothread":"iothread5"},{"iothread":"iothread6"},{"iothread":"iothread7"},
      {"iothread":"iothread8"},{"iothread":"iothread9"},{"iothread":"iothread10"},{"iothread":"iothread11"},
      {"iothread":"iothread12"},{"iothread":"iothread13"},{"iothread":"iothread14"},{"iothread":"iothread15"}],
      "virtqueue_size":1024}' \
  -drive file=/dev/nullb0,if=none,id=drive0 \
  -device '{"driver":"scsi-hd","bus":"scsi0.0","channel":0,"scsi-id":0,"lun":0,"drive":"drive0"}'
To check how the IOThreads are utilized once this feature is in use, refer to the Run the Tests section later in this blog.
Manually Assign virtqueues to IOThreads
Single vq per IOThread
Now, we will specify each vq associated with a different IOThread. IOThreads are specified by name and virtqueues by their 0-based index.
In this example, we’re assigning each of the 16 vqs to 16 IOThreads:
$ qemu-system-x86_64 -smp 16 -m 16G -enable-kvm -cpu host \
  -hda /test/System.img -name debug-threads=on \
  -serial mon:stdio -vnc :7 \
  -object iothread,id=iothread0 -object iothread,id=iothread1 \
  -object iothread,id=iothread2 -object iothread,id=iothread3 \
  -object iothread,id=iothread4 -object iothread,id=iothread5 \
  -object iothread,id=iothread6 -object iothread,id=iothread7 \
  -object iothread,id=iothread8 -object iothread,id=iothread9 \
  -object iothread,id=iothread10 -object iothread,id=iothread11 \
  -object iothread,id=iothread12 -object iothread,id=iothread13 \
  -object iothread,id=iothread14 -object iothread,id=iothread15 \
  -device '{"driver":"virtio-scsi-pci","id":"scsi0","iothread-vq-mapping":[
      {"iothread":"iothread0","vqs":[0]},{"iothread":"iothread1","vqs":[1]},{"iothread":"iothread2","vqs":[2]},{"iothread":"iothread3","vqs":[3]},
      {"iothread":"iothread4","vqs":[4]},{"iothread":"iothread5","vqs":[5]},{"iothread":"iothread6","vqs":[6]},{"iothread":"iothread7","vqs":[7]},
      {"iothread":"iothread8","vqs":[8]},{"iothread":"iothread9","vqs":[9]},{"iothread":"iothread10","vqs":[10]},{"iothread":"iothread11","vqs":[11]},
      {"iothread":"iothread12","vqs":[12]},{"iothread":"iothread13","vqs":[13]},{"iothread":"iothread14","vqs":[14]},{"iothread":"iothread15","vqs":[15]}],
      "virtqueue_size":1024}' \
  -drive file=/dev/nullb0,if=none,id=drive0 \
  -device '{"driver":"scsi-hd","bus":"scsi0.0","channel":0,"scsi-id":0,"lun":0,"drive":"drive0"}'
Multiple vqs per IOThread
Below is the --device definition to show how multiple virtqueues can be assigned to a single IOThread.
Here we will assign 2 vqs each to 8 IOThreads:
--device '{"driver":"virtio-scsi-pci","id":"scsi0","iothread-vq-mapping":[{"iothread":"iothread0","vqs": [0,1]}, {"iothread":"iothread1","vqs": [2,3]},{"iothread":"iothread2","vqs": [4,5]},{"iothread":"iothread3","vqs": [6,7]}, {"iothread":"iothread4","vqs": [8,9]},{"iothread":"iothread5","vqs": [10,11]},{"iothread":"iothread6","vqs": [12,13]}, {"iothread":"iothread7","vqs": [14,15]}],"virtqueue_size":1024 }'
You can use different combinations of virtqueues per IOThread as desired; one more example is shown below.
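For instance, here is an illustrative --device definition following the same pattern that spreads the 16 vqs across 4 IOThreads, 4 vqs each (assuming iothread0 through iothread3 have been created with -object):

--device '{"driver":"virtio-scsi-pci","id":"scsi0","iothread-vq-mapping":[
    {"iothread":"iothread0","vqs":[0,1,2,3]},{"iothread":"iothread1","vqs":[4,5,6,7]},
    {"iothread":"iothread2","vqs":[8,9,10,11]},{"iothread":"iothread3","vqs":[12,13,14,15]}],
    "virtqueue_size":1024}'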
Test with Fio
Pinning QEMU Threads
Check your system’s CPU configuration by running the lscpu -e command. On our system, CPUs 16 to 31 are on the same NUMA node and are free. Since our VM has 16 vCPUs, we’ll bind all the QEMU threads to these 16 physical CPUs on the host.
Run the following command on the host:
$ taskset -cp -a 16-31 <QEMU-pid>
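To verify that the affinity took effect, you can read back the mask for each thread of the QEMU process (a quick check; the TIDs are the ones seen in the pidstat output earlier):

$ for tid in $(ls /proc/<QEMU-pid>/task); do taskset -cp $tid; done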
Run the Tests
With this new feature in use, pidstat -t 1 will show that VMs with -smp 2 or higher are able to take advantage of multiple IOThreads. Let’s verify this. Inside the VM, run this fio workload that spreads I/O across 16 queues:
$ cat randread.fio
[global]
bs=4K
iodepth=64
direct=1
ioengine=libaio
group_reporting
time_based
runtime=60
numjobs=16
name=standard-iops
rw=randread
cpus_allowed=0-15

[job1]
filename=/dev/sda

$ fio randread.fio
Here, since we have 16 jobs, they are spread across 16 CPUs because of the cpus_allowed=0-15 argument. This matches the maximum number of queues for our test case. You can also test with numjobs=4 or 8 or any number of your choice; in that case, change cpus_allowed to 0-3 or 0-7 accordingly, as in the sketch below.
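For example, an 8-job variant of the same workload might look like this (illustrative only; all other parameters stay the same):

[global]
bs=4K
iodepth=64
direct=1
ioengine=libaio
group_reporting
time_based
runtime=60
numjobs=8
name=standard-iops
rw=randread
cpus_allowed=0-7

[job1]
filename=/dev/sda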
pidstat output:
$ pidstat -t -p 1240528 1
08:05:26 AM   UID      TGID       TID    %usr %system  %guest   %wait    %CPU   CPU  Command
08:05:27 AM     0   1240528         - 1045.10  748.04 1451.96    0.00 3245.10    61  qemu-system-x86
08:05:27 AM     0         -   1240528    8.82    5.88    0.00    0.00   14.71    61  |__qemu-system-x86
08:05:27 AM     0         -   1240529    0.00    0.00    0.00    0.00    0.00    24  |__qemu-system-x86
08:05:27 AM     0         -   1240530   54.90   26.47    0.00    0.00   81.37    89  |__IO iothread0
08:05:27 AM     0         -   1240531   53.92   23.53    0.00    0.00   77.45   154  |__IO iothread1
08:05:27 AM     0         -   1240532   59.80   21.57    0.00    0.00   81.37   234  |__IO iothread2
08:05:27 AM     0         -   1240533   58.82   25.49    0.00    0.00   84.31   166  |__IO iothread3
08:05:27 AM     0         -   1240534   58.82   22.55    0.00    0.00   81.37    41  |__IO iothread4
08:05:27 AM     0         -   1240535   57.84   29.41    0.00    0.00   87.25   181  |__IO iothread5
08:05:27 AM     0         -   1240536   58.82   26.47    0.00    0.00   85.29    65  |__IO iothread6
08:05:27 AM     0         -   1240537   58.82   27.45    0.00    0.00   86.27   147  |__IO iothread7
08:05:27 AM     0         -   1240538   56.86   26.47    0.00    0.00   83.33   208  |__IO iothread8
08:05:27 AM     0         -   1240539   59.80   28.43    0.00    0.00   88.24   122  |__IO iothread9
08:05:27 AM     0         -   1240540   57.84   23.53    0.00    0.00   81.37    16  |__IO iothread10
08:05:27 AM     0         -   1240541   57.84   23.53    0.00    0.00   81.37   183  |__IO iothread11
08:05:27 AM     0         -   1240542   60.78   29.41    0.00    0.00   90.20   136  |__IO iothread12
08:05:27 AM     0         -   1240543   56.86   26.47    0.00    0.00   83.33    14  |__IO iothread13
08:05:27 AM     0         -   1240544   57.84   23.53    0.00    0.00   81.37   224  |__IO iothread14
08:05:27 AM     0         -   1240545   60.78   30.39    0.00    0.00   91.18   145  |__IO iothread15
08:05:27 AM     0         -   1240547    0.00    2.94   88.24    0.00   90.20   191  |__CPU 0/KVM
08:05:27 AM     0         -   1240548    0.00    2.94   87.25    0.00   90.20   105  |__CPU 1/KVM
08:05:27 AM     0         -   1240550    0.00    3.92   90.20    0.00   94.12    74  |__CPU 2/KVM
08:05:27 AM     0         -   1240551    0.00    2.94   89.22    0.00   92.16   138  |__CPU 3/KVM
08:05:27 AM     0         -   1240552    0.00    3.92   90.20    0.00   94.12   120  |__CPU 4/KVM
08:05:27 AM     0         -   1240553    0.00    3.92   92.16    0.00   96.08    47  |__CPU 5/KVM
08:05:27 AM     0         -   1240554    0.00    2.94   91.18    0.00   93.14   152  |__CPU 6/KVM
08:05:27 AM     0         -   1240555    0.98    2.94   91.18    0.00   95.10    32  |__CPU 7/KVM
08:05:27 AM     0         -   1240556    0.00    2.94   90.20    0.00   93.14     1  |__CPU 8/KVM
08:05:27 AM     0         -   1240557    0.00    2.94   92.16    0.00   95.10    21  |__CPU 9/KVM
08:05:27 AM     0         -   1240558    0.98    2.94   90.20    0.00   94.12    64  |__CPU 10/KVM
08:05:27 AM     0         -   1240559    0.00    2.94   91.18    0.00   93.14   163  |__CPU 11/KVM
08:05:27 AM     0         -   1240560    0.00    3.92   96.08    0.00  100.00   185  |__CPU 12/KVM
08:05:27 AM     0         -   1240561    0.00    3.92   91.18    0.00   94.12   128  |__CPU 13/KVM
08:05:27 AM     0         -   1240562    0.00    3.92   87.25    0.00   91.18   112  |__CPU 14/KVM
08:05:27 AM     0         -   1240563    0.00    1.96   96.08    0.00   98.04   168  |__CPU 15/KVM
08:05:27 AM     0         -   1240565    0.00    0.00    0.00    0.00    0.00   187  |__vhost-1240528
08:05:27 AM     0         -   1240566    0.00    0.00    0.00    0.00    0.00   239  |__kvm-nx-lpage-re
After running the fio command used in the above example, we observed that the number of IOPS scaled up by multiples when the VM was configured to let QEMU assign the mapping. You can test your setup with different configurations to see how the performance looks.
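As a simple way to watch the device throughput from inside the guest while fio is running, you can use iostat (from the sysstat package; shown here only as an illustration):

$ iostat -x 1 /dev/sda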
Conclusion
In this article, we saw how to set up a VM using the ‘iothread-vq-mapping’ feature for a virtio-scsi device and explored its different use cases. We also exercised this new feature using the fio tool to see how it can help improve performance.
References
- https://wiki.qemu.org/ChangeLog/10.0
- https://lore.kernel.org/all/20250311091136.GA939747@fedora/T/#ma1fd49a95db105c4ba732f8c31212efe0f6ed659
- https://blogs.oracle.com/linux/post/uek-next
- https://docs.oracle.com/en-us/iaas/Content/Block/References/samplefiocommandslinux.htm
- https://fio.readthedocs.io/en/latest/fio_doc.html
- https://docs.kernel.org/block/null_blk.html