Need Inodes?

It seems that some old-school filesystems still need to statically allocate inodes to hold pointers to individual files. Normally this should not cause too many problems, as default settings account for an average file size of 32K. Or will it?

If the average file size you need to store on the filesystem is much smaller than this, then you are likely to eventually run out of inodes even if the space consumed on the storage is far from exhausted.

In ZFS, inodes are allocated on demand, and so the question came up: how many files can I store on a piece of storage? I managed to scrape up an old 33GB disk, created a pool, and wanted to see how many 1K files I could store on it.

ZFS stores files with the smallest number of sectors possible, and so 2 sectors were enough to store the data of each file. Then of course one also needs to store some amount of metadata (indirect pointers, directory entries, etc.) to complete the story. There I didn't know what to expect. My program would create 1000 files per directory, with a maximum depth of 2 levels; nothing sophisticated attempted here.
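The original program is not shown, so the following is only a minimal Python sketch of a generator with the same shape: 1K files, 1000 files per directory, directories nested two levels deep. The function name `create_small_files`, the `dNNNN`/`fNNNN` naming scheme, and the `dirs_per_level` parameter are assumptions made for illustration.

```python
import os


def create_small_files(root, total_files, files_per_dir=1000,
                       dirs_per_level=1000, size=1024):
    """Create total_files files of `size` bytes under root.

    Layout (hypothetical): root/dXXXX/dYYYY/fZZZZ, i.e. two directory
    levels with files_per_dir files in each leaf directory.
    Returns the number of files created.
    """
    payload = b"x" * size  # 1K of filler data per file
    created = 0
    dir_index = 0
    while created < total_files:
        # Map a running leaf index onto a two-level directory tree.
        top, leaf = divmod(dir_index, dirs_per_level)
        leaf_dir = os.path.join(root, f"d{top:04d}", f"d{leaf:04d}")
        os.makedirs(leaf_dir, exist_ok=True)
        for i in range(min(files_per_dir, total_files - created)):
            with open(os.path.join(leaf_dir, f"f{i:04d}"), "wb") as fh:
                fh.write(payload)
            created += 1
        dir_index += 1
    return created


# Example (hypothetical path): create_small_files("/space/create", 10_000)
```

A single-threaded loop like this mostly exercises the file create path; it makes no attempt to sync data to disk, which is left to the filesystem.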

So I let my program run for a while and eventually interrupted it at 81% of disk capacity:

        Filesystem  size  used  avail  capacity  Mounted on
        space        33G   27G   6.5G       81%  /space

Then I counted the files.

        # ptime find /space/create | wc
        real    51:26.697
        user     1:16.596
        sys     25:27.416
        23823805 23823805 1405247330


So 23.8M files consume 27GB of disk space: basically less than 1.2K of used disk space per 1K file. A legacy-type filesystem that allocates one inode per 32K would have run out of inodes after a meager 1M files, but ZFS managed to store 23X more on the disk without any tuning.
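The arithmetic behind these figures can be sketched in a few lines of Python. This is a back-of-the-envelope check, not a model of any particular filesystem: it assumes the df sizes are decimal gigabytes and that the legacy filesystem preallocates exactly one inode per 32K of capacity. The names below are illustrative.

```python
BYTES_PER_INODE = 32 * 1024   # assumed static ratio: one inode per 32K
DISK_BYTES = 33 * 10**9       # 33GB disk, taken as decimal GB (assumption)
USED_BYTES = 27 * 10**9       # "used" column from df, same assumption
FILES_ON_ZFS = 23_823_805     # entries counted by find | wc


def static_inode_limit(disk_bytes, bytes_per_inode):
    """Upper bound on file count when inodes are preallocated by ratio."""
    return disk_bytes // bytes_per_inode


legacy_limit = static_inode_limit(DISK_BYTES, BYTES_PER_INODE)
ratio = FILES_ON_ZFS / legacy_limit          # how many times more files on ZFS
bytes_per_file = USED_BYTES / FILES_ON_ZFS   # data plus metadata, per file

print(f"legacy inode limit : {legacy_limit}")
print(f"ZFS vs legacy      : {ratio:.1f}x")
print(f"disk bytes per file: {bytes_per_file:.0f}")
```

Note that the find count includes directories as well as files, so the per-file figure is a slight underestimate.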

The find command here is mostly gated on fstat performance, and we see that we did the 23.8M fstat calls in about 3087 seconds (51:26), or roughly 7700 fstat per second.
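For readers who want to redo this kind of calculation, here is a small Python sketch that converts the `real` time printed by ptime into seconds and derives an operations-per-second rate. The helper names `ptime_to_seconds` and `rate_per_second` are made up for this example, and the parser only assumes the `[h:]mm:ss.fff` shape seen in the output above.

```python
def ptime_to_seconds(stamp):
    """Convert a ptime-style duration such as '51:26.697' or
    '1:09:20.558' into seconds. Fields are colon-separated, most
    significant first, each worth 60 of the next."""
    seconds = 0.0
    for part in stamp.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def rate_per_second(count, stamp):
    """Operations per second for `count` operations over a ptime duration."""
    return count / ptime_to_seconds(stamp)


# The find run above: 23823805 entries in 51:26.697 of wall-clock time.
find_rate = rate_per_second(23_823_805, "51:26.697")
print(f"{find_rate:.0f} fstat per second")
```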

But here is the best part: how long did it take to create all those files?

   real 1:09:20.558
   user   9:20.236
   sys  2:52:53.624


This is hard to believe, but it took a little over 1 hour for 23.8 million files. This is on a single direct-attached drive:

   3. c1t3d0 <FUJITSU-MAP3367N SUN36G-0401-33.92GB>


ZFS created on average 5721 files per second. Now obviously such a drive cannot do 5721 IOPS, but with ZFS it didn't need to. File creation is actually more of a CPU benchmark because the application is interacting with the host cache. It's then the task of the filesystem to create the files on disk in the background. With ZFS, the combination of the allocate-on-write policy and the sophisticated I/O aggregation in the I/O scheduler (dynamics) means that the I/O for multiple independent file creates can be coalesced. Using dtrace, I counted the number of I/Os required and the file creates per minute. Typical samples show more than 200K files created per minute using about 3000 I/Os per minute, or about 3300 files per second using a mere 50 IOPS!

Per minute:

        Sample   Creates   I/Os
        #1        214643   2856
        #2        215409   3342
        #3        212797   2917
        #4        211545   2999
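To make the coalescing effect concrete, the per-minute samples can be converted to per-second rates and to file creates per I/O. The short Python sketch below uses only the four samples from the table; the function name `summarize` is an illustrative choice.

```python
# (creates per minute, I/Os per minute) for samples #1 to #4
samples = [
    (214643, 2856),
    (215409, 3342),
    (212797, 2917),
    (211545, 2999),
]


def summarize(creates_per_min, ios_per_min):
    """Return (files per second, IOPS, file creates per I/O)."""
    return (creates_per_min / 60,
            ios_per_min / 60,
            creates_per_min / ios_per_min)


for n, (creates, ios) in enumerate(samples, start=1):
    files_s, iops, per_io = summarize(creates, ios)
    print(f"#{n}: {files_s:7.0f} files/s  {iops:5.1f} IOPS  "
          f"{per_io:5.1f} creates per I/O")
```

Each I/O issued to the disk carries the result of several dozen file creates, which is why the drive's IOPS limit is not the bottleneck here.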


Finally, with all these files, is scrubbing a problem? It took 1h34m to scrub that many files, a pace of about 4200 scrubbed files per second. No sweat.

  pool: space
 state: ONLINE
 scrub: scrub completed after 1h34m with 0 errors on Wed Feb 11 12:17:20 2009


If you need to create, store, and otherwise manipulate lots of small files efficiently, ZFS has got to be the filesystem of choice for you.

Comments:

Awesome. Simply awesome.
Probably the biggest adoption point for Solaris right now is ZFS. It's a slam dunk.

Tell me, do you feel glad to be working in the ZFS team? What does that feel like? What does your day look like?

Posted by UX-admin on February 13, 2009 at 03:04 PM MET #

Very impressive!

Posted by chris hallman on February 23, 2009 at 07:18 AM MET #
