In this post from Alan Maguire, he shows how to use DTrace to inspect common network-related system calls. DTrace is a powerful tracing tool for understanding user and kernel processes.
We've been steadily adding DTrace support for networking in Oracle Linux. Let's see how we can use these probes to better understand network behaviour. Let's do a quick tour to see what sort of support is available for dynamic tracing the network stack.
Before diving into the details though, it's worth asking why use DTrace at all? Don't we have enough tools in the networking space? The reasons I use it are:
As on other operating systems, the syscall provider is avaiable and we can trace network-related system calls - sendto, sendmsg, sendmmsg, recvfrom, recvmsg, recvmmsg.
Here's a simple example - trace bytes sent on a per-executable basis. Hit Ctrl+C when you're done collecting info.
# dtrace -n 'syscall::sendto:entry { @c[execname] = sum(arg2); }'
dtrace: description 'syscall::sendto:entry ' matched 1 probe
^C
DNS Resolver #6 65
ping 192
Socket Thread 2006
vpnagentd 3872
This is what we call a DTrace one-liner. Let's examine it piece-by-piece.
So to summarize, our one-liner says "show me the sum total of bytes sent via sendto() system calls, and collect the sums for each program name" ("execname" is a built in variable that references the current process executable name).
Now of course these are requests, and some may fail. Again note that the probe we used was syscall::sendto:entry. There is also a corresponding syscall::sendto:return. A common pattern in DTrace is to measure how long it takes from :entry to :return.
In Linux – as in Solaris – we have transport-level providers for TCP and UDP. These trace transmit/receive events, and for TCP we also trace TCP state machine transitions and connection establishment milestones.
# dtrace -l -P udp
ID PROVIDER MODULE FUNCTION NAME
1836 udp vmlinux udp_send_skb send
1838 udp vmlinux udp_recvmsg receive
1840 udp vmlinux udp_queue_rcv_skb receive
1868 udp vmlinux udp_v6_send_skb send
1869 udp vmlinux udpv6_recvmsg receive
1871 udp vmlinux udpv6_queue_rcv_skb receive
# dtrace -l -P tcp
ID PROVIDER MODULE FUNCTION NAME
1818 tcp vmlinux tcp_set_state state-change
1819 tcp vmlinux tcp_conn_request state-change
1820 tcp vmlinux tcp_finish_connect connect-established
1821 tcp vmlinux tcp_rcv_state_process accept-established
1822 tcp vmlinux tcp_rcv_state_process connect-refused
1823 tcp vmlinux tcp_transmit_skb send
1824 tcp vmlinux tcp_transmit_skb connect-request
1825 tcp vmlinux tcp_connect state-change
1826 tcp vmlinux tcp_v4_send_ack send
1827 tcp vmlinux tcp_v4_send_reset send
1828 tcp vmlinux tcp_v4_send_reset accept-refused
1829 tcp vmlinux tcp_v4_send_synack send
1830 tcp vmlinux tcp_v4_rcv receive
1831 tcp vmlinux tcp_time_wait state-change
1882 tcp vmlinux tcp_v6_send_response send
1883 tcp vmlinux tcp_v6_send_response accept-refused
1884 tcp vmlinux tcp_v6_send_synack send
1885 tcp vmlinux tcp_v6_rcv receive
Here's a quick example - what systems (remote IP) are we refusing connections to, and which ports are they trying to connect to?
# dtrace -n 'tcp:::accept-refused { @c[args[2]->ip_daddr, args[4]->tcp_sport] = count(); }'
dtrace: description 'tcp:::accept-refused ' matched 2 probes
^C
127.0.0.1 7 1
I telnet'ed port 7 locally in another window to generate this. Port 7 was closed so we see 1 connection being refused. The TCP, UDP and IP providers are all static-defined tracing (SDT) providers. The idea of these is that rather than trying to write probes that sit on ever-changing function entry or return to gather information from data structures that are also changing, we define a set of events of interest, and each place in the code that these events occur we place a static probe. So for example we can see that the tcp:::send probe is in tcp_v4_send_reset, tcp_v4_send_ack, etc. Importantly, these probes also define stable translators which take the OS-specific structures and translate them into stable arguments that DTrace consumers can rely on. To see the relevant deinifitions for Oracle Linux, you can look in /usr/lib64/dtrace/4.1/tcp.d, udp.d, ip.d.
For the most part we have preserved compatibility with Solaris. If there's something missing you really need, let us know! Here's how the various arguments look. We use args[0] instead of arg0 to let DTrace know we're trying to access the stable, translated argument. The raw arguments are still available via arg0, arg1 etc.
args[0] - pktinfo_t * [packet information]
args[1] - csinfo_t * [ connection information]
args[2] - ipinfo_t * [IP protocol information]
args[3] - tcpsinfo_t * for TCP, udpsinfo_t * for UDP [TCP/UDP state information]
args[4] - tcpinfo_t * for TCP, udpinfo_t * for UDP [TCP/UDP header information]
The IP provider traces send, receive, and inbound/outbound drops.
# dtrace -l -P ip
ID PROVIDER MODULE FUNCTION NAME
1839 ip vmlinux ip_rcv_finish drop-in
1840 ip vmlinux ip_local_deliver receive
1841 ip vmlinux ip_rcv drop-in
1842 ip vmlinux __ip_append_data drop-out
1843 ip vmlinux __ip_local_out_sk send
1844 ip vmlinux ip_append_page drop-out
1845 ip vmlinux ip_send_skb drop-out
1861 ip vmlinux raw_sendmsg send
1878 ip vmlinux __ip6_flush_pending_frames drop-out
1879 ip vmlinux ip6_xmit send
1880 ip vmlinux ip6_xmit drop-out
1881 ip vmlinux ip6_finish_output2 drop-out
1882 ip vmlinux __ip6_append_data drop-out
1883 ip vmlinux ip6_forward drop-out
1884 ip vmlinux ip6_output drop-out
1885 ip vmlinux ip6_send_skb drop-out
1886 ip vmlinux ip6_input_finish drop-in
1887 ip vmlinux ipv6_rcv drop-in
1888 ip vmlinux ip6_input receive
1895 ip vmlinux ndisc_send_skb send
1900 ip vmlinux rawv6_sendmsg send
1903 ip vmlinux mld_sendpack send
1904 ip vmlinux mld_sendpack drop-out
1905 ip vmlinux igmp6_send send
1906 ip vmlinux igmp6_send drop-out
1923 ip vmlinux __ip6_local_out_sk send
For IP the arguments are
args[0] - pktinfo_t * [packet information]
args[1] - csinfo_t * [connection information]
args[2] - ipinfo_t * [IP protocol information]
args[3] - ifinfo_t * [IP interface info]
args[4] - ipv4info_t * [IPv6 header info if IPv6 packet]
args[5] - ipv6info_t * [IPv6 header info if IPv6 packet]
Other
DTrace on Oracle Linux includes a perf provider, which traces perf events. And of course function boundary tracing (fbt) is available too. Here's an example of using function boundary tracing to get counts of socket buffer (struct sk_buff) allocation sizes. sk_buff is the key structure for network data, and here we're examining the counts in each power-of-two bucket 0-1, 1-2, 2-4, etc - this is the quantize() aggregating action. We see for example that the majority were in the 2048-4096 range.
# dtrace -n 'fbt::__alloc_skb:entry { @size = quantize(arg0); }'
dtrace: description 'fbt::__alloc_skb:entry ' matched 1 probe
^C
value ------------- Distribution ------------- count
-1 | 0
0 | 80
1 | 3
2 | 0
4 | 137
8 | 988
16 |@@ 5719
32 |@@@@@ 10977
64 | 779
128 |@@@ 6805
256 |@@@@@@@@@@@ 25462
512 | 508
1024 |@@ 5108
2048 |@@@@@@@@@@@@@@@@ 39011
4096 | 8
8192 | 0
16384 | 16
32768 | 0
We'll describe some more examples using fbt and perf providers in a future blog post!