In modern datacenters, virtual machines (VMs) play a crucial role in providing scalable and adaptable resources. To ensure VMs perform efficiently, it is important to examine system statistics, which provide valuable insight into the VMs' overall health. These metrics help developers and administrators fine-tune configurations, pinpoint performance bottlenecks, and quickly address issues.
In this blog post, we’ll explore QEMU’s introspective framework to examine virtualization statistics.
Background
The initial motivation behind this work was to add QEMU support for KVM file descriptor-based VM and virtual CPU (vCPU) statistics. The plan was to add a new command for this purpose, as had been done previously for other statistics providers. However, separate commands end up being specific to the needs of each subsystem, with little consistency across the sources. In addition, some statistics providers require elevated privileges or special tools (e.g. kvm_stat or debugfs).
Rather than continue down this path, we decided to create a generic statistics framework that would not only cater to KVM’s requirements, but could also be used for other statistics sources, such as network interfaces and block devices. The goal was to provide system administrators and developers with a unified, simple, and efficient way of querying statistics across a variety of subsystems, and provide the results in a consistent format.
“Introspective” Statistics Framework
The new statistics framework was introduced in QEMU version 7.1.0. This framework is described as “introspective” because details about the number and type of statistics available are not known at QEMU compile time. Instead, QEMU queries a list of statistics sources (callbacks) for available data, and dynamically generates a list of available statistics.
Key Features:
Self-Describing Statistics: The statistics themselves are self-describing, meaning QEMU discovers and displays what is available at runtime. This makes the system more flexible, as there is no need for QEMU to have prior knowledge of every potential statistic. It also means a provider can expose additional statistics without requiring any QEMU changes.
Efficient API: The statistics are queried using the QEMU Machine Protocol (QMP), a JSON-based protocol that supports simple automation and scripting. The schema information (metadata) is separate from the actual statistics data, which avoids resending the metadata with each query. In addition, results can be filtered by name, target, and provider.
Easy-to-Use: The statistics can be accessed without the need for elevated privileges or special tools. Additionally, the Human Monitor Protocol (HMP) offers an easy-to-use command for interactive use.
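To give a feel for the scripting side, here is a minimal Python sketch of a QMP request/reply exchange. It assumes one JSON message per line, and the helper name `qmp_execute` is ours, not part of QEMU. A real session must also read the server greeting and send `qmp_capabilities` before any other command; the demo uses in-memory streams with made-up replies instead of a live monitor socket.

```python
import io
import json


def qmp_execute(reader, writer, command, arguments=None):
    """Send one QMP command and return the content of its "return" member.

    reader/writer are text streams (hypothetical helper; assumes one JSON
    message per line). Messages without "return" or "error", such as the
    greeting or asynchronous events, are skipped.
    """
    request = {"execute": command}
    if arguments is not None:
        request["arguments"] = arguments
    writer.write(json.dumps(request) + "\n")
    writer.flush()

    for line in reader:
        if not line.strip():
            continue
        message = json.loads(line)
        if "return" in message:
            return message["return"]
        if "error" in message:
            raise RuntimeError(message["error"].get("desc", "QMP error"))
    raise ConnectionError("QMP stream closed before a reply arrived")


# Demo with in-memory streams standing in for a monitor socket.
# The event and reply below are made up for illustration.
fake_monitor = io.StringIO(
    '{"event": "RESUME"}\n'
    '{"return": [{"provider": "kvm", "target": "vm", "stats": []}]}\n'
)
sent = io.StringIO()
schemas = qmp_execute(fake_monitor, sent, "query-stats-schemas")
print(schemas)
```

With a real monitor, the two streams could come from a connected socket, for example `sock.makefile("r")` and `sock.makefile("w")`.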
How It Works
Each statistics provider has QEMU callbacks to query the schema (metadata) and data. When a user queries the framework, QEMU invokes the callbacks and presents the data that is available.
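The callback idea can be modeled in a few lines of Python. This is only a toy illustration of the pattern, not QEMU's C implementation; the `Provider` class, the registry functions, and the "toy" provider with its numbers are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Provider:
    """Hypothetical provider: one callback for metadata, one for values."""
    name: str
    schema_cb: Callable[[], List[dict]]
    data_cb: Callable[[str], Dict[str, int]]


_providers: List[Provider] = []


def register_provider(provider: Provider) -> None:
    _providers.append(provider)


def query_schemas() -> List[dict]:
    # The framework knows nothing about individual statistics;
    # it just asks every registered provider what it offers.
    return [{"provider": p.name, "stats": p.schema_cb()} for p in _providers]


def query_stats(target: str) -> List[dict]:
    results = []
    for p in _providers:
        values = p.data_cb(target)
        results.append({
            "provider": p.name,
            "stats": [{"name": n, "value": v} for n, v in values.items()],
        })
    return results


# A toy provider with made-up data.
def toy_schema() -> List[dict]:
    return [{"name": "pages_4k", "type": "instant"}]


def toy_data(target: str) -> Dict[str, int]:
    return {"pages_4k": 1001} if target == "vm" else {}


register_provider(Provider("toy", toy_schema, toy_data))
print(query_schemas())
print(query_stats("vm"))
```

Adding a new statistic only means changing what a provider's callbacks return; the query functions stay untouched, which is the point of the introspective design.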
KVM File-Descriptor Based Statistics
The first provider to use QEMU’s new statistics framework is KVM. KVM provides a set of file descriptor-based interfaces which allow users to view various VM and vCPU statistics. This interface provides a lightweight, flexible, scalable, and lock-free method to collect telemetry data. The statistics can be pulled frequently, up to several times per second.
KVM hands out the statistics file descriptors through the KVM_GET_STATS_FD ioctl, issued on a VM or vCPU file descriptor. Each file has a header block, which defines the overall layout, a descriptor block holding the statistics metadata, and a data block holding the raw values. The header and descriptor blocks are static and only need to be read once. The ioctl details are in the Linux kernel documentation.
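To make the three-block layout concrete, here is a Python sketch that parses such a file image. The field order follows our reading of `struct kvm_stats_header` and `struct kvm_stats_desc` in the kernel's KVM API documentation and should be checked against your kernel headers before relying on it. The function names are hypothetical, and the demo parses a synthetic blob built in memory rather than a real statistics file descriptor.

```python
import struct

# Assumed layouts (native byte order, no padding):
# header: flags, name_size, num_desc, id_offset, desc_offset, data_offset
HEADER = struct.Struct("=6I")
# descriptor: flags, exponent, size, offset, bucket_size (then the name)
DESC = struct.Struct("=IhHII")


def parse_stats_blob(blob: bytes) -> dict:
    """Return {name: value} (or a list of values) from a stats file image."""
    _flags, name_size, num_desc, _id_off, desc_off, data_off = \
        HEADER.unpack_from(blob, 0)
    stats = {}
    for i in range(num_desc):
        base = desc_off + i * (DESC.size + name_size)
        _dflags, _exp, size, offset, _bucket = DESC.unpack_from(blob, base)
        raw_name = blob[base + DESC.size: base + DESC.size + name_size]
        name = raw_name.split(b"\0", 1)[0].decode("ascii")
        # Each statistic is `size` 64-bit values inside the data block.
        values = struct.unpack_from(f"={size}Q", blob, data_off + offset)
        stats[name] = values[0] if size == 1 else list(values)
    return stats


def build_demo_blob(entries, name_size=16) -> bytes:
    """Build a synthetic image with one 64-bit value per statistic."""
    id_offset = HEADER.size
    desc_offset = id_offset + name_size
    data_offset = desc_offset + len(entries) * (DESC.size + name_size)
    blob = HEADER.pack(0, name_size, len(entries),
                       id_offset, desc_offset, data_offset)
    blob += b"demo-vm".ljust(name_size, b"\0")  # made-up id string
    for i, (name, _value) in enumerate(entries):
        blob += DESC.pack(0, 0, 1, i * 8, 0)
        blob += name.encode("ascii").ljust(name_size, b"\0")
    for _name, value in entries:
        blob += struct.pack("=Q", value)
    return blob


demo = build_demo_blob([("pages_4k", 1001), ("pages_2m", 721)])
print(parse_stats_blob(demo))
```

Because the header and descriptors never change, a real reader would parse them once and afterwards re-read only the data block on each poll.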
The number of statistics continues to grow. As of Linux kernel version 6.13, x86_64 KVM includes 37 vCPU and 8 VM statistics. The exact set of statistics available can vary by architecture.
Core Commands
QMP
There are two key QMP commands in this framework:
query-stats-schemas
: Returns a list of available statistics for each target type and provider. This includes metadata for each statistic, such as the unit of measurement (bytes, seconds, cycles, etc., scaled by a base and exponent) and the collection method (cumulative, instant, peak, histogram, etc.).
query-stats
: Returns a list of statistics for a given target type (e.g., VM, vCPU). Filtering options let the caller restrict the results by statistic name, vCPU QOM path, and provider.
The QMP API details are described in qapi/stats.json in the QEMU source tree.
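As an illustration of the filtering options, the following Python sketch assembles a `query-stats` request. The helper `build_query_stats` is hypothetical; the argument names (`target`, `vcpus`, `providers`, `provider`, `names`) are the ones used in the examples in this post.

```python
import json


def build_query_stats(target, provider=None, names=None, vcpus=None):
    """Build a query-stats command as a Python dict (hypothetical helper)."""
    arguments = {"target": target}
    if vcpus:
        if target != "vcpu":
            raise ValueError("vcpus filter only applies to the vcpu target")
        arguments["vcpus"] = list(vcpus)
    if provider is not None:
        request = {"provider": provider}
        if names:
            request["names"] = list(names)
        arguments["providers"] = [request]
    elif names:
        # Name filters are given per provider in the request.
        raise ValueError("a names filter requires a provider")
    return {"execute": "query-stats", "arguments": arguments}


command = build_query_stats(
    "vcpu",
    provider="kvm",
    names=["guest_mode", "preemption_reported"],
    vcpus=["/machine/unattached/device[0]"],
)
print(json.dumps(command, indent=2))
```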
HMP
The HMP interface combines the schema and data query into a single command:
info stats
: Displays a list of statistics along with the schema information in a simple human readable format. It includes support for filtering by name (a comma-separated list), target, and provider.
Further details can be found in the QEMU source hmp-commands-info.hx.
Example Usage
QMP
Retrieve the schemas:
{ "execute": "query-stats-schemas" }
Response:
{ "return": [
    { "provider": "kvm", "target": "vcpu",
      "stats": [
        { "name": "guest_mode", "exponent": 0, "type": "instant" },
        { "name": "preemption_other", "exponent": 0, "type": "cumulative" },
        { "name": "preemption_reported", "exponent": 0, "type": "cumulative" },
        ... ] },
    { "provider": "kvm", "target": "vm",
      "stats": [
        { "name": "max_mmu_page_hash_collisions", "unit": "none", "base": 10, "exponent": 0, "type": "peak" }
        ... ] } ] }
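Since the data replies carry only names and values, a client will typically fetch the schema once and keep it for lookups. A minimal Python sketch, using a trimmed copy of the reply above as sample input (the `index_schemas` helper is hypothetical):

```python
def index_schemas(schemas):
    """Map (provider, target, name) to the schema entry for that statistic."""
    index = {}
    for entry in schemas:
        for stat in entry["stats"]:
            key = (entry["provider"], entry["target"], stat["name"])
            index[key] = stat
    return index


# Trimmed copy of the query-stats-schemas reply shown above.
sample_schemas = [
    {"provider": "kvm", "target": "vcpu", "stats": [
        {"name": "guest_mode", "exponent": 0, "type": "instant"},
        {"name": "preemption_other", "exponent": 0, "type": "cumulative"},
    ]},
    {"provider": "kvm", "target": "vm", "stats": [
        {"name": "max_mmu_page_hash_collisions", "unit": "none",
         "base": 10, "exponent": 0, "type": "peak"},
    ]},
]

schema_index = index_schemas(sample_schemas)
print(schema_index[("kvm", "vcpu", "preemption_other")]["type"])
```

Knowing the type matters when interpreting values: a cumulative counter is usually compared between two samples, while an instant value can be read directly.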
Query all vCPU statistics:
{ "execute": "query-stats", "arguments": { "target": "vcpu" } }
Response:
{ "return": [
    { "provider": "kvm", "qom-path": "/machine/unattached/device[0]",
      "stats": [
        { "name": "guest_mode", "value": 0 },
        { "name": "preemption_other", "value": 141 },
        { "name": "preemption_reported", "value": 416 },
        ... ] },
    { "provider": "kvm", "qom-path": "/machine/unattached/device[1]",
      "stats": [
        { "name": "guest_mode", "value": 0 },
        ... ] }
    ... ] }
Query the KVM provider for the guest_mode and preemption_reported statistics of vCPUs device[0] and device[3]:
{ "execute": "query-stats",
  "arguments": {
    "target": "vcpu",
    "vcpus": [ "/machine/unattached/device[0]",
               "/machine/unattached/device[3]" ],
    "providers": [
      { "provider": "kvm",
        "names": [ "guest_mode", "preemption_reported" ] } ] } }
Response:
{ "return": [
    { "provider": "kvm", "qom-path": "/machine/unattached/device[0]",
      "stats": [
        { "name": "guest_mode", "value": 0 },
        { "name": "preemption_reported", "value": 313 } ] },
    { "provider": "kvm", "qom-path": "/machine/unattached/device[3]",
      "stats": [
        { "name": "guest_mode", "value": 0 },
        { "name": "preemption_reported", "value": 274 } ] } ] }
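A reply like this is easy to reshape for scripts. The Python sketch below turns it into a per-vCPU dictionary; `stats_by_vcpu` is a hypothetical helper, and the sample data is the reply shown above.

```python
def stats_by_vcpu(reply):
    """Map each qom-path to a {statistic name: value} dictionary."""
    table = {}
    for entry in reply:
        per_vcpu = table.setdefault(entry.get("qom-path", ""), {})
        for stat in entry.get("stats", []):
            per_vcpu[stat["name"]] = stat["value"]
    return table


# The "return" member of the reply shown above.
sample_reply = [
    {"provider": "kvm", "qom-path": "/machine/unattached/device[0]",
     "stats": [{"name": "guest_mode", "value": 0},
               {"name": "preemption_reported", "value": 313}]},
    {"provider": "kvm", "qom-path": "/machine/unattached/device[3]",
     "stats": [{"name": "guest_mode", "value": 0},
               {"name": "preemption_reported", "value": 274}]},
]

table = stats_by_vcpu(sample_reply)
for path, values in table.items():
    print(path, values["preemption_reported"])
```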
HMP
Query all VM statistics:
(qemu) info stats vm
provider: kvm
    max_mmu_page_hash_collisions (peak): 0
    max_mmu_rmap_size (peak): 0
    nx_lpage_splits (instant): 0
    pages_1g (instant): 0
    pages_2m (instant): 721
    pages_4k (instant): 1001
    ...
Query the KVM provider for the VM's 4 KB page count and MMU cache misses:
(qemu) info stats vm pages_4k,mmu_cache_miss kvm
provider: kvm
    pages_4k (instant): 1001
    mmu_cache_miss (cumulative): 841
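HMP output is meant for people, and QMP is the better choice for tooling. Still, if all you have is captured monitor text, a small Python sketch like the one below can pull the values out. It assumes one `name (type): value` entry per line under a `provider:` line, as in the simple examples in this post; other output forms, such as histograms, are not handled. The `parse_info_stats` helper is hypothetical.

```python
import re

# Matches lines such as "pages_4k (instant): 1001".
STAT_LINE = re.compile(r"^(?P<name>\S+) \((?P<type>[^)]+)\): (?P<value>.+?)\s*$")


def parse_info_stats(text):
    """Return {provider: {name: {"type": ..., "value": ...}}}."""
    result = {}
    provider = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("provider:"):
            provider = line.split(":", 1)[1].strip()
            result.setdefault(provider, {})
            continue
        match = STAT_LINE.match(line)
        if match and provider is not None:
            value = match["value"]
            result[provider][match["name"]] = {
                "type": match["type"],
                # Keep anything that is not a plain integer as text.
                "value": int(value) if value.isdigit() else value,
            }
    return result


captured = """provider: kvm
    pages_4k (instant): 1001
    mmu_cache_miss (cumulative): 841
"""
parsed = parse_info_stats(captured)
print(parsed["kvm"]["pages_4k"]["value"])
```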
Wrapping Up
QEMU’s introspective statistics framework gives system administrators and developers an easy-to-use tool for examining system statistics. Used well, these statistics help keep VMs operating at peak efficiency, improving system responsiveness and the overall user experience.
Support is available in mainline QEMU and in Oracle QEMU 7.2.0 from the KVM AppStream for Oracle Linux 8 and KVM Utilities for Oracle Linux 9 repositories.