nfs4trace: a New Direction

It's been a while since my last entry. Yes, quite a while... In the meantime, I've been hard at work on pNFS, which is a top priority for both me and my employer. But that's not to say that a DTrace provider for NFS isn't important. So why the slip?

Besides the heavy emphasis that pNFS has in my priorities, it was determined that a fundamental design decision in my previous provider imposed too much of a burden when the probes were not enabled. Adam was very helpful in pointing out the general problems, and in working with me to come up with a new design that would have no impact on the kernel when the probes were not enabled.

The good news was that Adam's design was easier to implement than the work I had already started. The bad news was that I lost momentum, starting from scratch at a time when I was working overtime on pNFS. But more good news: there are now a couple of great engineers in Beijing who have started working on the next incarnation: Colin Yi and John Cui!

New Design

As I hinted above, the new design is simpler than the old. It is based on the sdt provider, and on the sdt_subr.c code that groups sdt probes into a virtual provider. Translators written in D will make the arguments as friendly as they were in the old provider.

One disadvantage of the new design is that the function slot of the probe cannot be customized the way it was in the old provider. Here is the same probe in both the old provider and the new provider (note that the exact syntax is subject to change):

old                    new
nfs4s::op-read:done    nfs4s:::op-read-done

I left out the slots of the probes that are unhelpful, for example, the module name (which is always "nfssrv"). But note that the function slot is missing in the new provider. That is because, with sdt probes, the function slot is literally the kernel function from which the probe fires. This knowledge of our particular implementation is exactly the kind of thing that the NFS provider wants to abstract away, so it's going to be skipped in your typical D script.

Losing the function slot of the probe seemed like a problem at first. One reason we liked the old scheme was that it made this sort of thing possible:

nfs4s::op-*:done
{
	@[probefunc] = count();
}

In the new scheme, the same thing is possible like this:

nfs4s:::op-*-done
{
	@[substr(probename, 0, rindex(probename, "-"))] = count();
}

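Here rindex() finds the last "-" in probename and substr() keeps everything before it, so a probe named op-read-done is counted under the key op-read. Once you get used to the new scheme, it isn't really much of a problem. And the benefits of SDT far outweigh this small inconvenience.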

As mentioned above, the old provider inflicted a performance penalty, however small, on a kernel without any probes enabled. A function pointer was called before and after events to be probed, e.g., after the server had processed a request. When the provider was not active, this function pointer pointed to a no-op function. This alone was questionable, but when the provider was loaded, the function pointer led to a function that had to do a small amount of processing to determine whether each given probe was active or inactive.
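The pattern described above can be sketched roughly in user-space C. This is only an illustration of the idea, not the old provider's code: every name here (nfs_hook, hook_noop, hook_dispatch, probe_enabled, the PROBE_* identifiers, and the counters) is hypothetical, and the counters exist only to make the cost visible.

```c
#include <stdbool.h>

/* Hypothetical probe identifiers; the old provider's real names are not shown here. */
enum { PROBE_OP_READ_START, PROBE_OP_READ_DONE, PROBE_COUNT };

typedef void (*nfs_hook_t)(int probe_id);

static bool probe_enabled[PROBE_COUNT]; /* per-probe enable flags */
static unsigned long hook_calls;        /* indirect calls made, whether or not a probe fired */
static unsigned long probes_fired;      /* probes that actually fired */

/* Target of the hook while the provider is not loaded: does nothing. */
void hook_noop(int probe_id) { (void)probe_id; }

/* Target of the hook once the provider is loaded: checks each probe. */
void hook_dispatch(int probe_id)
{
	if (probe_enabled[probe_id])
		probes_fired++;
}

static nfs_hook_t nfs_hook = hook_noop;

void provider_load(void)   { nfs_hook = hook_dispatch; }
void provider_unload(void) { nfs_hook = hook_noop; }

/* Every event pays for an indirect call, even when it leads to a no-op. */
static void call_hook(int probe_id)
{
	hook_calls++;
	nfs_hook(probe_id);
}

/* Stand-in for the server handling one request. */
void process_request(void)
{
	call_hook(PROBE_OP_READ_START);
	/* ... the real work of the request would happen here ... */
	call_hook(PROBE_OP_READ_DONE);
}
```

In this sketch, each call to process_request() goes through the function pointer twice regardless of whether anything is listening, and once the provider is loaded each call also performs the per-probe check.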

The new design eliminates this overhead entirely. The SDT provider enables and disables probes directly in machine code, so there is no function pointer to call when a probe is disabled. Once the probes are enabled, the new design does less work in the kernel and more work in translators, which is also beneficial. It's a simpler design, and will be easier to maintain.

Other changes will be made to the arguments, to make them more consistent with other providers, e.g., an iSCSI target provider.

Next Steps

I have taken down the archives and example scripts that were available on the opensolaris.org project page, since these are no longer helpful toward understanding the provider that is currently being worked on. Look for new content to replace the old in the future!

About

samf
