One of the perennial performance or sizing questions for a workload is how much I/O it generates. This can be broken down several ways: by reads versus writes, by the size of the operations, and by how much of the access is random versus sequential.
Many of these metrics have always been easy to quantify. There are standard operating system tools to measure disk I/O, network packets and bytes, etc. The question of how much of the I/O is random versus sequential is much harder to answer, but it can be an important one if your storage is a potential limiter of performance and that storage includes a significant amount of "conventional" disk - i.e. spindles of rotating rust.
Sequential I/O on conventional disks can generally be served at higher throughput, because the disk can avoid almost all rotational delay and seek penalties. Random I/O always incurs these penalties, to a degree that varies with the access pattern.
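To put rough, illustrative numbers on that: a 7,200 RPM spindle takes about 8.3 ms per revolution, so average rotational latency is around 4.2 ms; add a typical average seek of around 8 ms and each random I/O costs roughly 12 ms, which caps the disk at around 80 random IOPS. At 8 KB per operation that is well under 1 MB/s, while the same disk might stream sequential data at 100 MB/s or more. The exact figures vary by drive, but the gap is usually one to two orders of magnitude.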
So if you need to decompose your workload enough to understand how much random versus sequential I/O there is, what do you do? You may be able to discuss it with the application developer to get their take on how the application behaves, what it stores, and how and when it needs to fetch data into memory or write or update it on disk. This information (or access to the developer) is not always available, though.
What if I could get inside the application and measure the I/O as it happens? That is a task for DTrace. I have developed a script that tallies up all I/O for all filesystems and can tell you how much of the I/O is sequential (a read or write at one location in a file is followed by a read or write at the next location in that file) or random.
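To give a flavour of how this can work, here is a minimal sketch - not the script itself, which produces the tables shown below and handles more cases. On Solaris 10 and later the DTrace fsinfo provider fires a probe for each VFS-level operation, and its fileinfo_t argument includes the file's pathname and the offset of the operation. Tracking, per file, where the next sequential offset would fall lets each read or write be classified as it happens:

#!/usr/sbin/dtrace -s

#pragma D option quiet

/*
 * Minimal sketch: classify VFS reads and writes as sequential or random.
 * An operation is sequential if it starts where the previous operation
 * on the same file left off.  (A file's first access at offset 0 will
 * count as sequential, since nextoff[] defaults to 0.)  For the fsinfo
 * read/write probes, arg1 is the byte count of the operation.
 */

fsinfo:::read,
fsinfo:::write
{
        this->path = args[0]->fi_pathname;
        this->kind = (args[0]->fi_offset == nextoff[this->path]) ?
            "sequential" : "random";
        @ops[this->kind, probename] = count();
        @bytes[this->kind, probename] = sum(arg1);
        nextoff[this->path] = args[0]->fi_offset + arg1;
}

/* report and reset every 60 seconds */
profile:::tick-60s
{
        normalize(@bytes, 1024);
        printa("%-12s %-8s %@10d ops\n", @ops);
        printa("%-12s %-8s %@10d KB\n", @bytes);
        trunc(@ops);
        trunc(@bytes);
}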
Here is some example output from the script, where I first ran a sequential filebench test:
[table of per-filesystem I/O omitted; sample interval: 120 seconds]
then I ran a random filebench test:
[table of per-filesystem I/O omitted; sample interval: 60 seconds]
As you can see, the script outputs a table breaking down reads and writes both by operations and by KB/s, and further by "sequential", "random" and "page". There are also some totals, plus statistics on a few other VFS operations.
What does "page" mean? Well, this script uses Solaris' virtual filesystem interface to measure filesystem I/O. Through this interface we can see conventional read and write operations, however there are also "page" operations that can happen which do not necessarily map to a read or write from an application. Because of this uncertainty I decided to list the I/O via these page operations separately. They will generally be small for an application that uses conventional I/O calls, but may be large if an application for is using the mmap() interface for example.
For the purposes of this blog entry I have simplified the output of the script. It normally outputs a table for every filesystem that has I/O during its run. The script could easily be modified to select a desired filesystem. It could also be modified to select I/O related to a specific user, process, etc.
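For instance, restricting the sketch above to one filesystem and one process is just a matter of adding a predicate to the probe clauses (the mount point and process name below are hypothetical):

/* Sketch: trace only one filesystem, and only one process's I/O. */
fsinfo:::read,
fsinfo:::write
/args[0]->fi_mount == "/export/data" && execname == "filebench"/
{
        @[args[0]->fi_pathname, probename] = count();
}

The anonymous aggregation is printed automatically when the script exits, so this is a complete, runnable one-clause script.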