Running PostgreSQL on the ZFS file system: some considerations
By paulvandenbogaard on Jun 19, 2008
Up until now I have compared ZFS to UFS for deploying the PostgreSQL database files. Only the database processes were running on the system, and all PostgreSQL-related files were on the file system being compared.
The file system was created either on a stripe set of nine disks using Solaris Volume Manager or on a zpool built from those same nine disks. The last blog entry reported ZFS outperforming UFS.
I did some extra tests to see whether performance (read: database throughput) could be increased by fiddling with some elements that are known to influence the I/O behavior of ZFS.
The ZFS Intent Log (ZIL) is known to consume bandwidth. The effect of this element can be seen by disabling it altogether. Of course this is definitely NOT advised for production environments. (See http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide#ZIL for some extra information.)
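For reference, the Evil Tuning Guide linked above describes disabling the ZIL via a kernel tunable. A sketch of the two usual routes on Solaris of that era might look like this (again: not for production):

```shell
# Option 1: persistent, takes effect after a reboot.
# Add this line to /etc/system:
#   set zfs:zil_disable = 1

# Option 2: set the tunable on the live kernel with mdb.
# File systems must be remounted before it takes effect.
echo "zil_disable/W0t1" | mdb -kw
```

Setting the tunable back to 0 (and remounting) re-enables the ZIL.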
I did some tests with the ZIL disabled, using 5, 15, 50 and 85% for the completion_target of PostgreSQL. The following graph shows the database transaction throughput:
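For readers who want to reproduce the settings: assuming the PostgreSQL 8.3 parameter name checkpoint_completion_target (expressed as a fraction rather than a percentage), the four test configurations would look like this in postgresql.conf:

```shell
# postgresql.conf fragments used per test run (one at a time).
# checkpoint_completion_target spreads checkpoint writes over
# this fraction of the checkpoint interval.
checkpoint_completion_target = 0.05   # the 5% run
# checkpoint_completion_target = 0.15 # the 15% run
# checkpoint_completion_target = 0.50 # the 50% run
# checkpoint_completion_target = 0.85 # the 85% run
```

A reload (pg_ctl reload) is enough to pick up a change to this parameter.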
Recalling the baseline I presented in an earlier blog entry, which used seven disk arrays with caches and showed a throughput of around 90K TPM: this result comes very close. Only nine internal disks this time...
The previous throughput graph (... the saga continues) showed a rate of ~ 50K TPM: transactional integrity does not come for free.
I did some more tests where the ZIL was enabled, but now using a dedicated disk array (with 1 GByte of write cache) for the intent log. The throughput results of this setup, the disabled-ZIL setup, and the 'all on nine disks' test are in the next graph (all with a completion_target setting of 15%):
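Putting the intent log on its own device uses the separate log ("slog") feature of zpool. A minimal sketch, assuming a pool named pgpool and a hypothetical device name for the array LUN:

```shell
# Add a dedicated log device to an existing pool.
# Requires a zpool version that supports separate log devices.
# "c4t0d0" is a placeholder for the disk-array LUN.
zpool add pgpool log c4t0d0

# Verify: the device should now appear under a separate
# "logs" section in the pool configuration.
zpool status pgpool
```

From then on synchronous ZIL writes go to the log device instead of competing with the data disks.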
The ZIL element obviously takes time, time during which the throughput is clearly throttled while a database checkpoint is in progress. Using a dedicated fast device pays off: it increases throughput and makes it more constant over time. The I/O (write part) was measured both for the ZIL device and for the data on the "nine disk" ZFS file system. Here is the result:
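With a separate log device in the pool, the two write streams can be observed independently; a sketch, again assuming the pool name pgpool:

```shell
# Per-vdev I/O statistics, sampled every 5 seconds.
# The dedicated log device gets its own line under "logs",
# so ZIL writes can be separated from data-disk writes.
zpool iostat -v pgpool 5
```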
The checkpoint at 1800 seconds resulted in 77 MByte/sec of writes for the ZIL alone. As already suggested by Mikael Gueck, I would surely like to do some extra tests with SSD disks. Guess I need to hold my breath a while, since right now I "just" do not have them.
However, right now I feel ZFS is already a realistic alternative for a PostgreSQL setup.
Will do some more tests where I'll play with some of the other features of ZFS to get a feeling for their effect on my PostgreSQL environment.