By Jamesmorris-Oracle on Sep 02, 2013
This is a contributed post from Chuck Lever, who heads up NFS development for the mainline Linux kernel team.
The 87th meeting of the IETF was held July 28 - August 2 in Berlin, Germany.
I was in Berlin for the week to attend the NFSv4 Working Group meeting and hold informal discussions related to NFS standardization with other attendees. The Internet Engineering Task Force (IETF) produces high quality technical documents that influence the way people design, use and manage the Internet. Essentially, this is the body that regulates the protocols computers use to communicate on the Internet, for the purpose of improving interoperability.
An IETF meeting is held every four months in venues around the world. Sponsorship for each event varies. DENIC, the central registry for domain names under .de, was the primary sponsor for this event. Participation is open to anyone, but a registration fee is required to attend.
NFS version 4 is the IETF standard for file sharing. The charter of the Working Group is to maintain NFS specifications and introduce new NFS features through NFSv4 minor versions. More on the Working Group charter can be found here: http://datatracker.ietf.org/wg/nfsv4/charter/
I attend each NFSv4 Working Group meeting to represent Oracle's interest in various current and new NFS-related features, including pNFS, NFSv4.2, and FedFS. I'm the editor of two of the IETF FedFS protocol specifications, and a co-author of an Internet-Draft that addresses protocol issues affecting NFSv4 migration. Other representatives at this meeting include Microsoft, EMC, NetApp, IBM, Oracle, Tonian, and others. Topics include progress updates on Internet-Drafts on their way to become standards, reports on implementation experience, and requests to start new work or restart old work. See: https://datatracker.ietf.org/meeting/87/materials.html#nfsv4
Meeting agenda, presentation materials, and minutes are available at this location.
Working Group editor Tom Haynes (NetApp) reported on several areas where progress appears to be stalled. In general we face challenges completing our deliverables because the IETF is a volunteer organization, and the tasks at hand are large. The largest item is RFC 3530bis, which is holding up FedFS and NFSv4.2. RFC 3530bis was rejected during IESG review mainly due to the new chapter that attempts to bridge the gap between existing i18n implementations in NFS, and how we'd like i18n to work.
The problem is nobody has implemented i18n for NFSv4, and the IETF has revised i18n since 3530 was ratified. The consensus was to move the offending section to a separate Internet-Draft where the correct language can be hammered out without holding up RFC 3530bis. NFSv4.2 is held up by a lack of enthusiasm for finishing a new revision of RPCSEC GSS. The GSS I-D has languished without an author or editor for many months, and two items in NFSv4.2 depend on its completion: labeled NFS and server-to-server copy. A rough consensus was not achieved, but Tom and Andy Adamson (NetApp) will investigate options, including removing the parts of server copy and labeled NFS that depend on GSSv3, and report back.
Benny Halevy (Tonian) has submitted a fresh draft proposing "Flexible File Layouts" which is a new pNFS layout type that improves upon the existing pNFS file layout defined in RFC 5661. Motivation for a new layout scheme includes: algorithmic data striping to support load balancing, life-cycle management, and other advanced administrative features; support for using legacy NFS servers as pNFS data servers; and direct pNFS support for existing cluster filesystems such as Ceph and GlusterFS.
Chuck Lever (Oracle) described recent progress to address security concerns in the FedFS documents waiting in the RFC Editor queue. He continued by walking through a group of possible future work items, including more modern LDAP security modes, additional administrative operations, and better mechanisms for clients to choose working fileset locations. Does the working group have the energy to consider a new revision of these documents? Or should we continue to focus on making small changes? This was left unresolved.
Sorin Faibish (EMC) discussed the need for a new layout enabling pNFS clients to access Lustre data servers directly. After a lot of discussion, the issue appears to be that the NFS protocol on high performance transports is not performant enough. The proposed solution was to use LNET over RDMA. It was suggested that it would be more interesting to the Working Group if we focused on fixing the performance issues in our RDMA specifications instead.
Marc Eshel (IBM) wanted to restart the age-old conversation on tightening NFS's data cache coherency. The immediate question is whether POSIX semantics are interesting given today's compute workloads and network environment. Implementing POSIX data coherency among multiple networked systems is still a challenge. Consensus that a callback-based solution, where network traffic is proportional to the level of inter-client sharing, was most appropriate. Such a solution (byte-range delegation) was proposed by Trond Myklebust in 2006. It was recommended to start with that work.
Chuck Lever (Oracle) proposed an experimental extension to NFS that enables NFS client and servers to convey end-to-end data integrity metadata. A new I-D has been submitted that describes the protocol changes. No prototype is available yet; the I-D is meant to coordinate discussion of technical details, and enable interoperable prototype implementations.
David Noveck (EMC) elaborated on the need to allow protocol changes outside of the NFS minor version process. He described the limitations of batching unrelated features together and waiting for a full pass through the IETF review process. There was some interest in allowing innovation outside of the minor version process. The Area Directory and Working Group chair felt that there is currently not enough energy behind work already planned for delivery.
Matt Benjamin (Linux Box) is restarting work on a feature proposed several years ago by Mike Eisler that allows directories to be striped across pNFS data servers, just like file data is today. An Internet-Draft is available, and a prototype is underway.
-- Chuck Lever