DTrace as a Streams debugger

DTrace as a Streams debugger Since OpenSolaris is releasing today, I thought I'd take a few lines to recount how we can use DTrace to snoop in Streams data. Here DTrace helped me to quickly and cleanly understand a specific Streams bug (6230528).

My usual business is Sustaining Solaris Streams and TCP/IP code, but I occasionally moonlight supporting legacy communications protocols still used by the Telco world. In that capacity I was involved in certifying the legacy Solstice X.25 9.2 product as supportable on the (then) upcoming Solaris 10 release.

The X.25 product is highly configurable, and one of it's more esoteric configs encapsulates the IP protocol over an X.25 link. To do this a userland daemon, (x25netd) opens a stream to the product's ixe driver, (a pseudo NIC driver) and uses the Streams I_LINK ioctl to create a stream to the X.25 driver below. It then uses ifconfig to bring up the pseudo NIC instance, pushing the IP module onto the stream, and voila IP over X.25.

Well IP over X.25 functioned just fine, but occasionally when you brought the X.25 product down, the system paniced in Streams code, doing putnext() of a Streams message downstream from the ixe driver to the X.25 layer.

The clear clue was this console message at every close: 
WARNING:KERNEL: munlink: could not perform unlink ioctl, closing anyway (22)

The I_UNLINK ioctl sent downstream by x25netd when it was being shutdown was failing with EINVAL.

I needed to determine what syndrome caused the stream below ixe to evaporate leading to the panic, while the I_UNLINK ioctl intended to safely demolish that stream had failed (?)

DTrace is our friend here, we can use it to slice into the Stream and select what we want to see, without adding debug and recompiling multiple Streams modules and drivers, in this case from more than one code base.  DTrace has to be the world's most flexible debugging tool, Victorinox watch out.

Let's write a D script restricting the stuff we view to the x25netd process, and look at the munlink() and ioctl activity.
Examining the code we can see the functions involved in performing a Streams ioctl, we instrument those, looking for return codes and Streams errors.
Additionally we instrument the general purpose Streams routine for moving data putnext(), but we restrict what we see to only M_IOCTL Streams messages and potential replies to M_IOCTL.

#!/usr/sbin/dtrace -s


inline int M_IOCTL = 0x0e; /\* ioctl; set/get params \*/
inline int M_IOCACK = 0x81; /\* acknowledge ioctl \*/
inline int M_IOCNAK = 0x82; /\* negative ioctl acknowledge \*/

fbt::strdoioctl:entry
/ execname == "x25netd" / /\* All Streams ioctls sent by x25netd \*/
{
/\* Separator \*/
printf("\\n");
}

fbt::strdoioctl:entry
/ execname == "x25netd" / /\* All Streams ioctls sent by x25netd \*/
{
icp = (struct strioctl \*) arg1;
printf("IOCTL %d (0x%x), length %d", icp->ic_cmd, icp->ic_cmd, icp->ic_len);
}

fbt::strdoioctl:return
/ execname == "x25netd" / /\* Return code in arg1 \*/
{
printf("RETURN=%d errno=%d pid=%d", arg1, errno, pid);
}

fbt::munlink:return
/ execname == "x25netd" / /\* munlink() returns?? \*/
{
printf("RETURN=%d errno=%d pid=%d", arg1, errno, pid);
}

fbt::putiocd:return
/ execname == "x25netd" /
{
printf("RETURN=%d errno=%d pid=%d", arg1, errno, pid);
}

fbt::putnext:entry
/ execname == "x25netd" && (((mblk_t \*) arg1)->b_datap->db_type == M_IOCTL)/
{
iocp = (struct iocblk \*) ((mblk_t \*) arg1)->b_rptr;
printf(" putnext msg from: %s:: M_IOCTL ioctl %d (0x%x) count %d",
(stringof((queue_t \*) arg0)->q_qinfo->qi_minfo->mi_idname),
iocp->ioc_cmd, iocp->ioc_cmd, iocp->ioc_count);
}

fbt::putnext:entry
/ execname == "x25netd" && (((mblk_t \*) arg1)->b_datap->db_type == M_IOCACK)/
{
iocp = (struct iocblk \*) ((mblk_t \*) arg1)->b_rptr;
printf(" putnext msg from: %s:: M_IOCACK ioctl %d (0x%x) count %d",
(stringof((queue_t \*) arg0)->q_qinfo->qi_minfo->mi_idname),
iocp->ioc_cmd, iocp->ioc_cmd, iocp->ioc_count);
}

fbt::putnext:entry
/ execname == "x25netd" && (((mblk_t \*) arg1)->b_datap->db_type == M_IOCNAK)/
{
iocp = (struct iocblk \*) ((mblk_t \*) arg1)->b_rptr;
printf(" putnext msg from: %s:: M_IOCNAK ioctl %d (0x%x) count %d",
(stringof((queue_t \*) arg0)->q_qinfo->qi_minfo->mi_idname),
iocp->ioc_cmd, iocp->ioc_cmd, iocp->ioc_count);
}

fbt::strgeterr:return
/ execname == "x25netd" /
{
printf("RETURN=%d errno=%d pid=%d", arg1, errno, pid);
}

fbt:ixe:ixewput:entry /\* What M_IOCTLs arrive in ixe \*/
/ execname == "x25netd" && ((mblk_t \*) arg1)->b_datap->db_type == M_IOCTL /
{
iocp = (struct iocblk \*) ((mblk_t \*) arg1)->b_rptr;
printf(" ixe M_IOCTL %d (0x%x) received, count %d", iocp->ioc_cmd,
iocp->ioc_cmd, iocp->ioc_count);
}

Having started the X.25 product, the associated IP over X.25 NIC was tested, then I started the D script and waited for all the DTrace probes to be established. It was time to gather the data, by stopping the IP over X.25 device.
  0  13220      strdoioctl:entry 

0 13220 strdoioctl:entry IOCTL 21261 (0x530d), length 24
0 11497 putiocd:return RETURN=0 errno=0 pid=1763
0 3511 putnext:entry putnext msg from: strwhead:: M_IOCTL ioctl 21261 (0x530d) count 24
0 3511 putnext:entry putnext msg from: ip:: M_IOCNAK ioctl 21261 (0x530d) count 0
0 13221 strdoioctl:return RETURN=22 errno=0 pid=1763
0 9815 munlink:return RETURN=0 errno=0 pid=1763

From this it was apparent that the munlink was actually SUCCEEDING, but that the I_UNLINK ioctl was being NAK'ed by the IP module and never reaching the ixe driver.
The end result was that the STREAMS framework part of the unlink processing had occurred, and the linkage below ixe was gone, but nothing had told ixe that the stream below was no more. If a message from IP was in flight as the pseudo NIC was ifconfig'ed down, a system panic was likely. The munlink code has a special case to handle failure of the I_UNLINK ioctl upon closure of the stream.
         error = strdoioctl(stp, &strioc, NULL, FNATIVE, 
K_TO_K | STR_NOERROR | STR_NOSIG, crp, rvalp);

/\*
\* If there was an error and this is not called via strclose,
\* return to the user. Otherwise, pretend there was no error
\* and close the link.
\*/
if (error) {
if (flag & LINKCLOSE) {
cmn_err(CE_WARN, "KERNEL: munlink: could not perform "
"unlink ioctl, closing anyway (%d)\\n", error);
} else {
link_rempassthru(passq);
mutex_enter(&stpdown->sd_lock);
stpdown->sd_flag &= ~STRPLUMB;
cv_broadcast(&stpdown->sd_monitor);
mutex_exit(&stpdown->sd_lock);
mutex_exit(&muxifier);
return (error);
}
}

This allowed code later in munlink() to destroy the Streams linkage despite the failure of the ioctl. The special case doesn't seem like a good idea, however the root cause of this problem is elsewhere.

The IP module has no exclusive rights to the I_UNLINK ioctl, therefore it should be passing it downstream rather than NAK'ing it, and this is the information which the D script made clear immediately.
In fact it turned out that when the FireEngine code, (TCP/IP merge and rewrite to use vertical perimeters) was putback in S10, it altered IP's ioctl handling, and where IP consumes an ioctl when used as a Streams driver, it was assuming the right to do so, when pushed as a Streams module, on another driver's stream!
The fix was to mark the subset of the ioctls consumed by IP as a driver which are generic Streams ioctls, with a flag to indicate they must be passed downstream when IP is pushed as a Streams module.
Search the source of OpenSolaris for IPI_PASS_DOWN.

Technorati Tag:
Technorati Tag:

Technorati Tag:
Comments:

Post a Comment:
  • HTML Syntax: NOT allowed
About

user12629787

Search

Archives
« April 2014
MonTueWedThuFriSatSun
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
    
       
Today
News
Blogroll

No bookmarks in folder