iSCSI DTrace provider and more to come

People often ask about the future direction of DTrace, and while we have some stuff planned for the core infrastructure, the future is really about extending DTrace's scope into every language, protocol, and application with new providers -- and this development is being done by many different members of the DTrace community. An important goal of this new work is to have consistent providers that work predictably. To that end, Brendan and I have started to sketch out an array of providers so that we can build a consistent model.

In that vein, I recently integrated a provider for our iSCSI target into Solaris Nevada (build 69; it should also appear in a Solaris 10 update, but don't ask me which one). It's a USDT provider, so the process ID is appended to the provider name; you can use \* to avoid typing the PID of the iSCSI target daemon. Here are the probes with their arguments (some of the names are obvious; for others you might need to refer to the iSCSI spec):

| probe name | args[0] | args[1] | args[2] |
| --- | --- | --- | --- |
| iscsi\*:::async-send | conninfo_t \* | iscsiinfo_t \* | - |
| iscsi\*:::login-command | conninfo_t \* | iscsiinfo_t \* | - |
| iscsi\*:::login-response | conninfo_t \* | iscsiinfo_t \* | - |
| iscsi\*:::logout-command | conninfo_t \* | iscsiinfo_t \* | - |
| iscsi\*:::logout-response | conninfo_t \* | iscsiinfo_t \* | - |
| iscsi\*:::data-receive | conninfo_t \* | iscsiinfo_t \* | - |
| iscsi\*:::data-request | conninfo_t \* | iscsiinfo_t \* | - |
| iscsi\*:::data-send | conninfo_t \* | iscsiinfo_t \* | - |
| iscsi\*:::nop-receive | conninfo_t \* | iscsiinfo_t \* | - |
| iscsi\*:::nop-send | conninfo_t \* | iscsiinfo_t \* | - |
| iscsi\*:::scsi-command | conninfo_t \* | iscsiinfo_t \* | iscsicmd_t \* |
| iscsi\*:::scsi-response | conninfo_t \* | iscsiinfo_t \* | - |
| iscsi\*:::task-command | conninfo_t \* | iscsiinfo_t \* | - |
| iscsi\*:::task-response | conninfo_t \* | iscsiinfo_t \* | - |
| iscsi\*:::text-command | conninfo_t \* | iscsiinfo_t \* | - |
| iscsi\*:::text-response | conninfo_t \* | iscsiinfo_t \* | - |
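
Because every probe listed above passes a conninfo_t \* as args[0], one clause can cover the whole provider. The following is a minimal sketch, not a tested script: it counts events by remote address and probe name, and it assumes an iSCSI target daemon is running so that the `iscsi*` wildcard matches at least one provider. The aggregation name `@events` and the column widths are illustrative choices.

```dtrace
#pragma D option quiet

/* Count every iSCSI target event, keyed by remote address and probe name. */
iscsi*:::
{
        @events[args[0]->ci_remote, probename] = count();
}

/* Print a small table when tracing ends (Ctrl-C). */
dtrace:::END
{
        printf("%-26s %-16s %8s\n", "REMOTE", "EVENT", "COUNT");
        printa("%-26s %-16s %@8d\n", @events);
}
```

Saved to a file (say, a hypothetical `iscsievents.d`), this would be run as root with `dtrace -s iscsievents.d`.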

The argument structures are defined as follows:

typedef struct conninfo {
        string ci_local;        /\* local host address \*/
        string ci_remote;       /\* remote host address \*/
        string ci_protocol;     /\* protocol (ipv4, ipv6, etc) \*/
} conninfo_t;

typedef struct iscsiinfo {
        string ii_target;               /\* target iqn \*/
        string ii_initiator;            /\* initiator iqn \*/
        uint64_t ii_lun;                /\* target logical unit number \*/

        uint32_t ii_itt;                /\* initiator task tag \*/
        uint32_t ii_ttt;                /\* target transfer tag \*/

        uint32_t ii_cmdsn;              /\* command sequence number \*/
        uint32_t ii_statsn;             /\* status sequence number \*/
        uint32_t ii_datasn;             /\* data sequence number \*/

        uint32_t ii_datalen;            /\* length of data payload \*/
        uint32_t ii_flags;              /\* probe-specific flags \*/
} iscsiinfo_t;

typedef struct iscsicmd {
        uint64_t ic_len;        /\* CDB length \*/
        uint8_t \*ic_cdb;        /\* CDB data \*/
} iscsicmd_t;
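
As an illustration of the third argument, here is a hedged sketch that breaks down SCSI commands by operation code. Byte 0 of a CDB is the opcode; the assumption here is that `ic_cdb` can be dereferenced directly in probe context, which may not hold in every build. The aggregation name `@opcodes` is made up for the example.

```dtrace
#pragma D option quiet

/* Only look at commands that actually carry a CDB. */
iscsi*:::scsi-command
/args[2]->ic_len > 0/
{
        /* Assumption: ic_cdb is readable here; byte 0 is the SCSI opcode. */
        @opcodes[args[2]->ic_cdb[0]] = count();
}

dtrace:::END
{
        printa("opcode 0x%02x  %@d\n", @opcodes);
}
```

The opcode values can then be looked up in the SCSI command set documentation (for example, 0x28 is READ(10) and 0x2a is WRITE(10)).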

Note that the arguments go from most generic (the connection for the application protocol) to most specific. As an aside, we'd like future protocol providers to make use of the conninfo_t so that one could write a simple script to see a table of frequent consumers for all protocols:

iscsi\*:::,
http\*:::,
cifs:::
{
        @[args[0]->ci_remote] = count();
}

With the iSCSI provider you can quickly see which targets are most active:

iscsi\*:::scsi-command
{
        @[args[1]->ii_target] = count();
}

or the volume of data transmitted:

iscsi\*:::data-send
{
        @ = sum(args[1]->ii_datalen);
}
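
The same fields can be combined into a finer-grained breakdown. This sketch, offered under the assumption that `ii_target` and `ii_lun` are populated for data-send probes, sums the bytes sent per remote host, target, and logical unit; the aggregation name `@bytes` is illustrative.

```dtrace
/* Bytes sent, keyed by remote address, target IQN, and LUN. */
iscsi*:::data-send
{
        @bytes[args[0]->ci_remote, args[1]->ii_target, args[1]->ii_lun] =
            sum(args[1]->ii_datalen);
}
```

Swapping `sum()` for `quantize()` would instead show the distribution of payload sizes for each key.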

Brendan has been working on a bunch of iSCSI scripts -- those are great for getting started examining iSCSI.

Comments:

Excellent news, thanks! Glad to see I'm not the only one typing 'iscis' too :)

Posted by Dick Davies on July 03, 2007 at 06:37 PM PDT #

Are you interested in suggestions for DTrace? I think what's missing is a stable, abstract programmatic interface to allow programs to watch themselves. As an example, databases can keep track of how many writes and reads they perform for each table, but they have no idea how many of those writes and reads turn into actual I/O and, unless they incur expensive gettimeofday calls, no idea how long they take. Currently, watching this kind of thing would mean interfacing via a text configuration file and running an external compiler binary. It seems there should be some kernel syscalls the program can make to request counts and measurements and get back the results in usable binary form.

Posted by Greg S on July 03, 2007 at 08:37 PM PDT #

Greg,

We do have a supported API in Java. The C interface is available, but -- as you observe -- not a stable interface. Creating such an interface will require a large amount of engineering work, so we were waiting to evaluate the demand.

Posted by Adam Leventhal on July 04, 2007 at 01:39 AM PDT #

So in the conninfo, are ci_local and ci_remote just the IP address, or do you expect some sort of port number to be added to the string?

I am still plinking around with creating a USDT provider for Open MPI, mainly probing PERUSE events. Adding the conninfo might prove helpful.

For the most part I have the USDT provider working; I am just coming up with some logical names and deciding how to expose the arguments (via parameters or structures).

Posted by Terry Dontje on July 16, 2007 at 04:34 AM PDT #

Terry,

Those fields are just meant to be the IP addresses without the TCP port number. The idea was to be as generic as possible to accommodate other protocols in the future (e.g. InfiniBand). I'm looking forward to seeing your proposed provider on the DTrace discussion list.

Posted by Adam Leventhal on July 16, 2007 at 12:06 PM PDT #

About

Adam Leventhal, Fishworks engineer
