By DarrenMoffat on Oct 21, 2013
As part of our work for integrated compliance reporting in Solaris we plan to provide a check for determining if the system has "un-owned files", i.e. those which are owned by a uid that does not exist in our configured name service. Tests such as this already exist in the Solaris CIS Benchmark (9.24 Find Un-owned Files and Directories) and other security benchmarks.
The obvious method of doing this would be using find(1) with the -nouser flag. However, that requires bringing into memory the metadata for every single file and directory in every local file system we have mounted. That is probably not acceptable on a production system that has a large amount of storage, and it could take a long time.
Just as I went to bed last night an idea for a much faster way of
listing file systems that have un-owned files came to me.
I've now implemented it, and I'm happy to report it works very well and performs many orders of magnitude better than using find(1) ever will. ZFS (since pool version 15) has per-user space accounting and quotas. We can report very quickly, and without actually reading any files at all, how much space any given user id is using on a ZFS filesystem. Using that information we can implement a check to very quickly list which filesystems contain un-owned files.
First, a few caveats, because the output data won't be exactly the same as what you get with find, but it answers the same basic question. This only works for ZFS, and it will only tell you which filesystems have files owned by unknown users, not the actual files. If you really want to know what the files are (i.e. to give them an owner) you still have to run find(1). However, it has the huge advantage that it doesn't use find(1), so it won't be dragging the metadata for every single file and directory on the system into memory. It also has the advantage that it can check filesystems that are not currently mounted (which find(1) can't do).
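Once a filesystem has been flagged, the follow-up find(1) run can at least be confined to that one filesystem rather than the whole system. A minimal sketch, assuming the filesystem is mounted; the function name list_unowned is a hypothetical helper, not part of any Solaris tool:

```shell
#!/bin/sh
# list_unowned (hypothetical helper): print the files under one mountpoint
# whose owning uid has no entry in the configured name service.
# -xdev stops find(1) from descending into other file systems mounted
# beneath the mountpoint, so only the flagged file system is walked.
list_unowned() {
    find "$1" -xdev -nouser -print
}

# Example invocation (mountpoint taken from the report for that dataset):
#   list_unowned /builds/bob
```

This still pulls in the metadata for every file on that one filesystem, so it is best kept for the datasets the quick check has already pointed at.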
It ran in about 4 seconds on a system with 300 ZFS datasets from 2 pools totalling about 3.2T of allocated space, and that includes the uid lookups and output.

#!/bin/sh

for fs in $(zfs list -H -o name -t filesystem -r rpool) ; do
        unknowns=""
        for uid in $(zfs userspace -Hipn -o name,used $fs | cut -f1); do
                if [ -z "$(getent passwd $uid)" ]; then
                        unknowns="$unknowns$uid "
                fi
        done
        if [ ! -z "$unknowns" ]; then
                mountpoint=$(zfs list -H -o mountpoint $fs)
                mounted=$(zfs list -H -o mounted $fs)
                echo "ZFS File system $fs mounted ($mounted) on $mountpoint \c"
                echo "has files owned by unknown user ids: $unknowns";
        fi
done

Sample output:

ZFS File system rpool/ROOT/solaris-30/var mounted (no) on /var has files owned by unknown user ids: 6435 33667 101
ZFS File system rpool/ROOT/solaris-32/var mounted (yes) on /var has files owned by unknown user ids: 6435 33667
ZFS File system builds/bob mounted (yes) on /builds/bob has files owned by unknown user ids: 101
Note that the above might not actually appear exactly like that in any future Solaris product or feature; it is provided just as an example of what you can do with ZFS user space accounting to answer questions like the above.
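The script asks zfs userspace for the used column but then discards it with cut. As a possible variation, the same data can show how much space each unknown uid occupies on a dataset. This is only a sketch under the same assumptions as the script (numeric, tab-separated output from the -Hipn flags); the function name unknown_usage is hypothetical:

```shell
#!/bin/sh
# unknown_usage (hypothetical helper): for one dataset, print
# "uid<TAB>bytes used" for every uid that has no passwd entry.
unknown_usage() {
    # Same flags as the script: no headers, numeric ids, exact byte counts.
    zfs userspace -Hipn -o name,used "$1" |
    while read -r uid used; do
        # getent exits non-zero when the uid is not in the name service.
        if ! getent passwd "$uid" >/dev/null 2>&1; then
            printf '%s\t%s\n' "$uid" "$used"
        fi
    done
}

# Example invocation:
#   unknown_usage rpool/ROOT/solaris-32/var
```

Knowing the byte counts can help decide which flagged filesystems are worth a full find(1) pass first.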