By samf on Nov 14, 2008
Traditionally, NFS has always shared ordinary file systems, and thus has always dealt with vnodes. With the advent of Parallel NFS (pNFS), which stripes data across multiple servers, this is no longer the case. The initial implementation of pNFS uses the DMU API provided by ZFS to store its stripe data. The pNFS protocol also requires that a server community support proxy i/o; that is, a client must be able to perform all i/o against one server, and if the requested data lives on another node, that server must perform the i/o by proxy. Neither the DMU nor proxy i/o is accessed via vnodes.
Another change brought by pNFS is the distributed nature of the server. What was once confined to a single server is now spread over multiple servers. This necessitates a control protocol for communication among the server nodes, and implies that some server tasks may have longer latencies than before. In pathological cases, e.g. the reboot of one particular server, the latencies may be very high. This will likely change how the NFS server is implemented, and it will become advantageous for the NFS server code to use new APIs, with different design goals from vnodes. Asynchronous methods become more desirable for dealing with the new latencies involved in processing client requests.
Enter nnodes. nnodes can be thought of as vnodes, but customized for the needs of NFS (especially pNFS), and with three distinct ops-vector/private-data sections. The figure below shows an NFS server implementation interacting with an nnode that is backed by a vnode.
Because an nnode has three distinct sections, for data, metadata, and state, it is easy to mix and match commonly needed implementations. Here are some examples:
| Use case | Data | Metadata | State |
|---|---|---|---|
| Traditional shared file system | vnode | vnode | rfs4_db |
| pNFS metadata server | proxy i/o | vnode | rfs4_db |
| pNFS data server | DMU | not applicable | proxy to MDS |
But this is just the beginning. There will doubtless be more constructions in the future.
nnodes also serve as a place to cache or store information relevant to a file-like object. For example, in the pNFS data server, we can cache stateids that are known to be valid. Thus, the data server will not need to contact the metadata server on every i/o operation.
Today I have been writing a more verbose comment header for the header file "nnode.h". Look for it in our repository soon.