SAM/QFS Directory Performance
By Ted Pogue on Apr 25, 2007
With the release of the 4.6 version of SAM and QFS on April 3rd, I thought it would be appropriate to start discussing some of the improvements included in this update. Specifically, SAM/QFS added improvements to directory lookup performance. This is part of our effort to continue scaling SAM/QFS. I thought I'd address the directory lookup improvements in this article
This phase of the directory lookup performance improvements is primarily targeted at lookups for names that do not exist. The system calls that receive the greatest benefit are create(), link(), and rename(). System calls other than these three will not benefit greatly from the 4.6 changes. We would like to further expand the system calls that benefit in the future.
The SAM/QFS performance group ran a comparison test for these system calls on small (2 CPUs), medium (4 CPUs), and large (20 CPUs) configurations comparing 4.5 to 4.6. The tests were run using a C program on a QFS 'ma' file system with 'mm' and 'mr' devices. They reported a 66 times improvement for the case with 128K zero length files in a directory on the large configuration. On the small configuration, a 26 times improvement was realized for creation of 100K files in a single directory. Additionally, on the medium configuration, rename, symlink, and mkdir performance comparisons showed 970, 169, and 4 fold increases respectively. In general, the larger the directory, the greater the gain expected with 4.6 over 4.5.
It is important to realize that the performance of the application in using these system calls must be sufficient to benefit. For example, when the create logic was moved from a C program to a shell script on the small configuration, the performance improvement was only 2.4 times compared to the 26 times with the C program. Clearly, the shell script could not create files quickly enough to see the same benefit as the C program. This needs to be kept in mind when developing performance tests or understanding performance results.
We also took a look at some of the SAM commands that are run on QFS. The command that received the greatest performance gain in 4.6 was the samfsrestore command because it primarily does lookups of directory entries that do not exist. We observed a 2 times performance increase for a samfsdump file containing a few million files. You can't expect hundreds of times of improvement at this level because the directory lookups are only a subset of the operations performed by samfsrestore.
The really nice point for existing SAM/QFS users is that, other than a 4.6 upgrade, you do not need to do anything to obtain the improvements. The directory lookup performance improvements are compatible with all 4.x QFS file systems. All directories greater than 4K in size will automatically use the new directory caching lookup scheme.
All in all, I think this a big benefit to SAM/QFS users and shows the commitment to continue scaling SAM/QFS.
'til next time Ted