Filebench: A ZFS v VxFS Shootout

Overview

Here is an example of Filebench in action to give you an idea of its capabilities "out of the box" - a run through a couple of the test suites provided with the tool on the popular filesystems ZFS and VxFS/VxVM; I've given sufficient detail so that you can easily reproduce the tests on your own hardware. I apologise for the graphs, which have struggled to survive the Openoffice .odt -> .html conversion. I hadn't the energy to recreate all 24 of them from the original data

They summarize the differences between ZFS and VxFS/VM in a number of tests which are covered in greater detail further on . It can be seen that in most cases ZFS performed better at its initial release (in Solaris 10 06/06) than Veritas 4.1; in some cases it does not perform as well; but in all cases it performs differently. The aim of such tests is to give a feel for the differences between products/technologies so intelligent decisions can be made as to which file system is more appropriate for a given purpose.

It remains true however that access to the data will be more effective in helping decision makers reach their performance goals if these can be stated clearly in numerical terms. Quite often this is not the case.



Figure 1: Filebench: Testing Summary for Veritas Foundation Suite (ZFS = 0)

Hardware and Operating System

Solaris 10 Update 2 06/06 (inc. ZFS) running on a Sun Fire E6900 with 24 x 1.5 Ghz Ultrasparc IV+, 98 Gb RAM with storage comprising 4 x StorEdge 3500 JBOD (48 x 72 Gb disks) , Fiber attach (4 x 2 Gb PCI-X 133 MHz)

The software used was VERITAS Volume Manager 4.1, VERITAS File System 4.1, FileBench 1.64.5. For each Filebench test a brief description is given followed by a table which shows how it was configured for that test run. This enables you to reproduce the test on your hardware. Of course if you want greater detail on the tests, you have to download Filebench (see blogs passim).

Create & Delete

Creation and deletion of files is a metadata intensive activity which is key to many applications, especially in web-based commerce and software development.

PersonalityWorkloadVariables
createfilescreatefilesnfiles 100000, dirwidth 20, filesize 16k, nthreads 16
deletefilesdeletefilesnfiles 100000, meandirwidth 20, filesize 16k, nthreads 16



Figure 2: Create/Delete - Operations per Second

Figure 3: Create/Delete - CPU uSec per Operation



Figure 4: Create/Delete - Latency (ms)


Copyfiles

This test creates two large directory tree structures and then measures the rate at which files can be copied from one tree to the other.

PersonalityWorkloadVariables
copyfilescopyfilesnfiles 100000, dirwidth 20, filesize 16k, nthreads 16



Figure 5: Copy Files - Operations per Second

Figure 6: Copy Files - CPU uSec per Operation



Figure 7: Copy Files - Latency (ms)


File Creation

This test creates a directory tree and fills it with a population of files of specified sizes. File sizes are chosen according to a gamma distribution of 1.5, with a mean size of 16k. The different workloads are designed to test different types of I/O - see generally the Solaris manual pages for open(2), sync(2) and fsync(3C).

PersonalityWorkloadVariables
filemicro_createcreateandallocnfiles 100000, nthreads 1, iosize 1m, count 64

createallocsyncnthreads 1, iosize 1m, count 1k, sync 1
filemicro_writefsynccreateallocfsyncnthreads 1
filemicro_createrandcreateallocappendnthreads 1




Figure 8: File Creation - Operations per Second

Figure 9: File Creation - CPU uSec per Operation



Figure 10: File Creation - Latency (ms)


Random Reads

This test performs single-threaded random reads of 2 KB size from a file of 5 Gb.

PersonalityWorkloadVariables
filemicro_rreadrandread2kcached 0, iosize 2k

randread2kcachedcached 1, iosize 2k



Figure 11: Random Read - Operations per Second

Figure 12: Random Read - CPU uSec per Operation



Figure 13: Random Read - Latency (ms)


Random Writes

This test consists of multi-threaded writes to a single 5 Gb file.

PersonalityWorkloadVariables
filemicro_rwriterandwrite2ksynccached 1, iosize 2k

randwrite2ksync4threadiosize 2k, nthreads 4, sync 1



Figure 14: Random Writes -
Operations per Second

Figure 15: Random Writes -
CPU uSec per Operation



Figure 16: Random Writes - Latency (ms)


Sequential Read

These tests perform a single threaded read from a 5 Gb file.

PersonalityWorkloadVariables
filemicro_seqreadseqread32kiosize 32k, nthreads 1, cached 0, filesize 5g

seqread32kcachediosize 32k, nthreads 1, cached 1, filesize 5g



Figure 17: Sequential Read -
Operations per Second

Figure 18: Sequential Read -
CPU uSec per Operation



Figure 19: Sequential Read - Latency (ms)


Sequential Write

These tests perform single threaded writes to a 5 Gb file.

PersonalityWorkloadVariables
filemicro_seqwriteseqwrite32kiosize 32k, count 32k, nthreads 1, cached 0, sync 0

seqwrite32kdsynciosize 32k, count 32k, nthreads 1, cached 0, sync 1
filemicro_seqwriterandseqwriterand8kiosize 8k, count 128k, nthreads 1, cached 0, sync 0



Figure 20: Sequential Write -
Operations per Second

Figure 21: Sequential Write -
CPU uSec per Operation



Figure 22: Sequential Write - Latency (ms)


Application Simulations: Fileserver, Varmail, Web Proxy & Server

There are a number of scripts supplied with Filebench to emulate applications:

Fileserver:

A file system workload, similar to SPECsfs. This workload performs a sequence of creates, deletes, appends, reads, writes and attribute operations on the file system. A configurable hierarchical directory structure is used for the file set.

Varmail:

A /var/mail NFS mail server emulation, following the workload of Postmark, but multi-threaded. The workload consists of a multi-threaded set of open/read/close, open/append/close and deletes in a single directory.

Web Proxy:

A mix of create/write/close, open/read/close, delete of multiple files in a directory tree, plus a file append (to simulate the proxy log). 100 threads are used. 16k is appended to the log for every 10 read/writes.

Web Server:

A mix of open/read/close of multiple files in a directory tree, plus a file append (to simulate the web log). 100 threads are used. 16k is appended to the weblog for every 10 reads.

PersonalityWorkloadVariables
fileserverfileservernfiles 100000, meandirwidth 20, filesize 2k, nthreads 100, meaniosize 16k
varmailvarmailnfiles 100000, meandirwidth 1000000, filesize 1k, nthreads 16, meaniosize 16k
webproxywebproxynfiles 100000, meandirwidth 1000000, filesize 1k, nthreads 100, meaniosize 16k
webserverwebservernfiles 100000, meandirwidth 20, filesize 1k, nthreads 100



Figure 23: Application Simulations - Operations per Second

Figure 24: Application Simulations - CPU uSec per Operation



Figure 25: Application Simulations - Latency (ms)


OLTP Database Simulation

This database emulation performs transactions on a file system using the I/O model from Oracle 9i. This workload tests for the performance of small random reads & writes, and is sensitive to the latency of moderate (128Kb+) synchronous writes as occur in the database log file. It launches 200 reader processes, 10 processes for asynchronous writing, and a log writer. The emulation uses intimate shared memory (ISM) in the same way as Oracle which is critical to I/O efficiency (as_lock optimizations).


PersonalityWorkloadVariables
oltplarge_db_oltp_2k_cachedcached 1, directio 0, iosize 2k,
nshadows 200, ndbwriters 10, usermode 20000,
filesize 5g, memperthread 1m, workingset 0

large_db_oltp_2k_uncachedAs above except cached 0, directio 1
large_db_oltp_8k_cachedAs for 2k except iosoze 8k

large_db_oltp_8k_uncachedAs for 2k except iosoze 8k


Figure 26: OLTP 2/8 Kb Blocksize - Operations per Second


Figure 27: OLTP 2/8 Kb Blocksize - CPU uSec per Operation


Figure 28: OLTP 2/8 Kb Blocksize - Latency (ms)

The summary figures above are the tip of a vast numerical iceberg of statistics provided by Filebench and wrappers around it which probe every system resource counter you can think of. It is a truism though that in using data like this, there is an enthusiasm to reduce it to single figures and simple graphs, leaving the engineers working on the performance bugs to the excruciating detail.

Remember also that these are the pre-packackaged scripts. The possibilities for custom benchmark workloads are as infinite as your imagination. Its also worth saying that technologies move on. The snapshot above will start to fade as improvements are made.

Comments:

I was just thinking about doing similar comparison next week and now you did it! Great! I guess I'll do some benchmarks anyway.

Posted by Robert Milkowski on December 06, 2006 at 07:43 PM GMT #

Post a Comment:
Comments are closed for this entry.
About

dom

Search

Archives
« July 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
  
       
Today