Understanding snoop(1M) NFSv3 file handles

Introduction

When you read a snoop(1M) trace of NFS traffic you'll see references to file handles, but interpreting them is not straightforward. As any NFS engineer will tell you, file handles are opaque to the NFS client and are meaningful only to the NFS server that issued them.

On Solaris the file handle is derived from the underlying file system, so some parts of it are meaningful only to that file system.

Here's an example snoop(1M) output:

  3   0.00000 v4u-450f-gmp03 -> v4u-80a-gmp03 NFS C GETATTR3 FH=FD0D
  4   0.00035 v4u-80a-gmp03 -> v4u-450f-gmp03 NFS R GETATTR3 OK

So what does FH=FD0D mean? File handles are surely longer than that?

Analysis

To make things easier to read, snoop(1M) hashes the file handle down to 16 bits. See the sum_filehandle() code in OpenSolaris:

sum_filehandle(len)
	int len;
{
	int i, l;
	int fh = 0;

	for (i = 0; i < len; i += 4) {
		l = getxdr_long();
		fh ^= (l >> 16) ^ l;
	}

	return (fh);
}

To see the complete file handle we need the verbose output:

# snoop -p3,4 -v -i /tmp/snoop2.out | grep NFS:
NFS:  ----- Sun NFS -----
NFS:  
NFS:  Proc = 1 (Get file attributes)
NFS:  File handle = [FD0D]
NFS:   0080001000000002000A0000000091EF23696D2E000A000000008FDF5F48F2A0
NFS:  
NFS:  ----- Sun NFS -----
NFS:  
NFS:  Proc = 1 (Get file attributes)
NFS:  Status = 0 (OK)
NFS:    File type = 1 (Regular File)
NFS:    Mode = 0644
NFS:     Setuid = 0, Setgid = 0, Sticky = 0
NFS:     Owner's permissions = rw-
NFS:     Group's permissions = r--
NFS:     Other's permissions = r--
NFS:    Link count = 1, User ID = 0, Group ID = 0
NFS:    File size = 301, Used = 1024
NFS:    Special: Major = 0, Minor = 0
NFS:    File system id = 137438953488, File id = 37359
NFS:    Last access time      = 01-Feb-07 15:12:16.398735000 GMT
NFS:    Modification time     = 01-Feb-07 15:12:16.410570000 GMT
NFS:    Attribute change time = 01-Feb-07 15:12:16.410570000 GMT
NFS:  
NFS:  
# 
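
As a sanity check, the 16-bit summary can be recomputed from the full handle. What follows is a minimal sketch, not the snoop source: fh_sum16() and example_fh are names I've made up, the words are read straight from a byte array instead of through getxdr_long(), and I'm assuming snoop keeps only the low 16 bits when it prints the result. It uses unsigned arithmetic, which gives the same low 16 bits as the signed original.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Fold a file handle into 16 bits in the style of sum_filehandle():
 * XOR each big-endian 32-bit word with its own upper half, accumulate,
 * then keep the low 16 bits (the masking step is an assumption).
 */
static unsigned int
fh_sum16(const unsigned char *fh, size_t len)
{
	uint32_t acc = 0;
	size_t i;

	assert(fh != NULL);
	assert(len % 4 == 0);	/* the loop consumes whole 32-bit words */

	for (i = 0; i < len; i += 4) {
		uint32_t w = ((uint32_t)fh[i] << 24) |
		    ((uint32_t)fh[i + 1] << 16) |
		    ((uint32_t)fh[i + 2] << 8) |
		    (uint32_t)fh[i + 3];

		acc ^= (w >> 16) ^ w;
	}

	return (acc & 0xFFFF);
}

/* The 32-byte handle copied from the verbose trace above. */
static const unsigned char example_fh[32] = {
	0x00, 0x80, 0x00, 0x10, 0x00, 0x00, 0x00, 0x02,
	0x00, 0x0A, 0x00, 0x00, 0x00, 0x00, 0x91, 0xEF,
	0x23, 0x69, 0x6D, 0x2E, 0x00, 0x0A, 0x00, 0x00,
	0x00, 0x00, 0x8F, 0xDF, 0x5F, 0x48, 0xF2, 0xA0
};
```

Calling fh_sum16(example_fh, sizeof (example_fh)) gives 0xFD0D, which matches the FH=FD0D in the summary line.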

Note the file system ID and file ID values for later.

To break down that file handle you need to understand the NFS server's implementation. For NFSv3 on OpenSolaris we have:

typedef struct {
	fsid_t	_fh3_fsid;			/* filesystem id */
	ushort_t _fh3_len;			/* file number length */
	char	_fh3_data[NFS_FH3MAXDATA];	/* and data */
	ushort_t _fh3_xlen;			/* export file number length */
	char	_fh3_xdata[NFS_FH3MAXDATA];	/* and data */
} fhandle3_t;

Which means ...

fh3_fsid    0080001000000002
fh3_len     000A
fh3_data    0000000091EF23696D2E
fh3_xlen    000A
fh3_xdata   000000008FDF5F48F2A0
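
The same split can be sketched in C. This is an illustration only: struct fh3_fields, fh3_split() and FH3_DATA_MAX are hypothetical names, the 10-byte data areas are inferred from this 32-byte handle rather than taken from the NFS headers, and the multi-byte fields are read as big-endian, which assumes a SPARC server.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed size of each data area: (32 - 8 - 2 - 2) / 2 for this handle. */
#define	FH3_DATA_MAX	10

struct fh3_fields {
	uint32_t	fsid[2];		/* compressed dev, fs type */
	uint16_t	len;			/* file number length */
	unsigned char	data[FH3_DATA_MAX];
	uint16_t	xlen;			/* export file number length */
	unsigned char	xdata[FH3_DATA_MAX];
};

static uint16_t
be16(const unsigned char *p)
{
	return ((uint16_t)(((unsigned int)p[0] << 8) | p[1]));
}

static uint32_t
be32(const unsigned char *p)
{
	return (((uint32_t)be16(p) << 16) | be16(p + 2));
}

/* Split a raw handle into its fields; returns 0 on success, -1 if malformed. */
static int
fh3_split(const unsigned char *fh, size_t fhlen, struct fh3_fields *out)
{
	assert(fh != NULL && out != NULL);

	if (fhlen != 8 + 2 * (2 + FH3_DATA_MAX))
		return (-1);

	out->fsid[0] = be32(fh);
	out->fsid[1] = be32(fh + 4);
	out->len = be16(fh + 8);
	memcpy(out->data, fh + 10, FH3_DATA_MAX);
	out->xlen = be16(fh + 10 + FH3_DATA_MAX);
	memcpy(out->xdata, fh + 12 + FH3_DATA_MAX, FH3_DATA_MAX);

	if (out->len > FH3_DATA_MAX || out->xlen > FH3_DATA_MAX)
		return (-1);
	return (0);
}

/* The 32-byte handle copied from the verbose trace above. */
static const unsigned char example_fh[32] = {
	0x00, 0x80, 0x00, 0x10, 0x00, 0x00, 0x00, 0x02,
	0x00, 0x0A, 0x00, 0x00, 0x00, 0x00, 0x91, 0xEF,
	0x23, 0x69, 0x6D, 0x2E, 0x00, 0x0A, 0x00, 0x00,
	0x00, 0x00, 0x8F, 0xDF, 0x5F, 0x48, 0xF2, 0xA0
};
```

Passing example_fh to fh3_split() should fill in the same values as the table above: fsid words 0x00800010 and 0x00000002, and both lengths equal to 0x000A.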

The fh3_fsid is itself built from a compressed version of the dev_t for the device plus the file system type; see cmpldev(). Essentially:

  • the first 32 bits (0x00800010) are the compressed dev_t: the major number moved 14 bits to the right of its position in the 64-bit dev_t, plus the minor number
  • the second 32 bits (0x00000002) are the file system type, see struct vfssw

That compressed dev_t is what you see as the dev= option in mnttab. For this case:

# grep 800010 /etc/mnttab
/dev/dsk/c0t0d0s0       /       ufs     rw,intr,largefiles,logging,xattr,onerror=panic,dev=800010       1170093759
# 

Reassuringly, file system type 2 is ufs.

The fh3_data comes from the underlying file system (ufs), which supplies a ufid structure:

struct ufid {
	ushort_t ufid_len;
	ushort_t ufid_flags;
	int32_t	ufid_ino;
	int32_t	ufid_gen;
};

With fh3_len (000A) taking the place of ufid_len, 0000000091EF23696D2E breaks down as:

ufid_flags   0000
ufid_ino     000091EF
ufid_gen     23696D2E

Reassuringly again, ufid_ino (the inode) makes sense:

# mdb
> 000091EF=U
                37359           
> !ls -li /export
total 922
     37359 -rw-r--r--   1 root     root         301 Feb  1 15:12 motd

That's the file I was checking from the NFS client and it matches the file ID from the snoop output.

The fh3_xdata represents the export data, i.e. the exported file system. It has the same layout, so the inode number in this case is 0x00008FDF. Checking:

> !share
-               /export   rw   ""  
> 00008FDF=U
                36831           
> !ls -lid /export
     36831 drwxr-xr-x   2 root     sys          512 Feb  1 15:12 /export
> 
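
Both 10-byte data areas decode the same way, so the mdb arithmetic above can also be sketched in C. Treat this as a rough illustration under stated assumptions: struct ufid_fields and ufid_decode() are hypothetical names, the fields are read as big-endian (a SPARC server is assumed), and the inode and generation are held as unsigned values for easy comparison even though struct ufid declares them int32_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded view of the 10 bytes that follow the length field. */
struct ufid_fields {
	uint16_t	flags;	/* ufid_flags */
	uint32_t	ino;	/* ufid_ino: the inode number */
	uint32_t	gen;	/* ufid_gen: the inode generation */
};

static uint32_t
be32_at(const unsigned char *p)
{
	return (((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	    ((uint32_t)p[2] << 8) | (uint32_t)p[3]);
}

/* Decode one data area; returns 0 on success, -1 on an unexpected length. */
static int
ufid_decode(const unsigned char *data, size_t len, struct ufid_fields *out)
{
	assert(data != NULL && out != NULL);

	if (len != 10)		/* 2 bytes of flags + 4 of ino + 4 of gen */
		return (-1);

	out->flags = (uint16_t)(((unsigned int)data[0] << 8) | data[1]);
	out->ino = be32_at(data + 2);
	out->gen = be32_at(data + 6);
	return (0);
}

/* fh3_data and fh3_xdata from the handle in this post. */
static const unsigned char file_fid[10] = {
	0x00, 0x00, 0x00, 0x00, 0x91, 0xEF, 0x23, 0x69, 0x6D, 0x2E
};
static const unsigned char export_fid[10] = {
	0x00, 0x00, 0x00, 0x00, 0x8F, 0xDF, 0x5F, 0x48, 0xF2, 0xA0
};
```

Decoding file_fid gives inode 37359 (the motd file) and decoding export_fid gives inode 36831 (the /export directory), the same numbers ls reported.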

If you've been paying attention you might be wondering what happened to the file system ID (137438953488). This is the uncompressed, 64-bit dev_t value. We can check it by compressing it (shift the major right by 14 bits, add the minor):

> 0t137438953488=J
                2000000010      
> (0t137438953488>>0t14) + 0t137438953488&0xffffffff=J
                800010          
> 

Yes, that looks familiar.

Conclusion

As already noted, NFS file handles are only meaningful to the NFS server, and this example applies only to the Solaris NFSv3 implementation. However, I hope it's given some insight into how that works. With this knowledge it's relatively easy to match snoop(1M) file handles to files on the NFS server.

Comments:

Cool! Thanks for taking the time to write this up. This is exactly the kind of stuff I come to blogs.sun.com hoping to find.

Posted by Chad Mynhier on February 01, 2007 at 01:17 PM GMT #

This seems to get a little bit crazy and difficult on x86 architectures with ZFS. I'm trying to do the on-the-fly conversion stuff, and I've managed to find the file id == inode number, but the /etc/mnttab mapping is quite elusive from the file handle. Thanks for the post.. I'm trying to see if it's possible to zfs send/recv a filesystem between two hosts (same ISA), move the IP, and avoid stale file handles (holy grail)

Posted by guest on May 17, 2011 at 02:52 PM BST #
