Introduction
RDS provides reliable, ordered datagram delivery by using reliable transport between nodes. There can be multiple RDS connections between two nodes depending on how many IPs are configured. Even if there is only one IP, there can be multiple RDS connections thanks to QoS. Also, for TCP transport (which supports multipath), there can be multiple connections. The difference being, the connection is between nodes and not between individual processes running on these nodes. Multiple DTrace scripts have been developed to monitor various aspects of the RDS data traffic, which can help to analyze issues and diagnose them. These scripts are included as a part of the OLED suite of tools.
For installation instructions and more information about OLED, please see Oracle Linux Enhanced Diagnostics.
In addition, please see the below provided information, which will help to understand the execution details and the output information of these scripts.
Usage examples:
Script: arp_origin.d
This script detects and prints the details of all ARP requests, replies, and ignored packets on all the network interfaces present on the system where this script is being executed.
Output format: each line lists
<date>
<time>
<interface>
<Send/Recv>
arp <request/reply>
sip:<Source IP>,
tip:<Target IP>
Sample output shown below:
# ./arp_origin.d 2022 Nov 1 12:36:54 ib0 Recv arp request sip: 192.168.18.8, tip: 192.168.18.5 2022 Nov 1 12:36:54 ib0 Send arp reply sip: 192.168.18.5, tip: 192.168.18.8 2022 Nov 1 12:36:54 ib0 Send arp request: sip: 192.168.18.5, tip: 192.168.18.8 2022 Nov 1 12:36:54 ib0 Recv arp reply sip: 192.168.18.8 tip: 192.168.18.5 2022 Nov 1 12:36:54 ib1 Recv arp request (Ignoring) sip: 192.168.18.8, tip: 192.168.18.5 2022 Nov 1 12:36:55 ib0 Recv arp request sip: 192.168.18.8, tip: 192.168.18.5 2022 Nov 1 12:36:55 ib0 Send arp reply sip: 192.168.18.5, tip: 192.168.18.8 2022 Nov 1 12:36:55 ib0 Send arp request: sip: 192.168.18.5, tip: 192.168.18.8 2022 Nov 1 12:36:55 ib0 Recv arp reply sip: 192.168.18.8 tip: 192.168.18.5 2022 Nov 1 12:36:55 ib1 Recv arp request (Ignoring) sip: 192.168.18.8, tip: 192.168.18.5
Script: rds_bcopy_metric.d
This script is used to trace and display the send/recv bytes per connection.
Output format: [<src-ip>,<dst-ip>,<tos> <send/recv>] <Data transfer rate in MB/s>
Sample output shown below
# ./rds-bcopy-metric.d | grep "192.168.0.8,192.168.0.6" [<192.168.0.8,192.168.0.6,0> send] 16 [<192.168.0.8,192.168.0.6,0> recv] 16 [<192.168.0.8,192.168.0.6,0> send] 16 [<192.168.0.8,192.168.0.6,0> recv] 16 [<192.168.0.8,192.168.0.6,0> send] 16 [<192.168.0.8,192.168.0.6,0> recv] 16 [<192.168.0.8,192.168.0.6,0> send] 16
Script: rds_check_tx_stall.d
This script monitors the completions on all the connections and calculates the time difference between the consecutive completions. If this time difference exceeds the threshold time (in microseconds), given as the input argument, a message will be printed providing the details of the connection parameters.
Arguments: Threshold time in microseconds. (Minimum: 500) Output format: [<src-ip, <dst-ip>, <tos>] last send completion: <time observed> usecs ago 111
Sample output shown below.
Test 1: ./rds_check_tx_stall.d 200 Error: Argument should be >= 500 ( microseconds) [2022 Oct 31 13:39:24] Test 2: ./rds_check_tx_stall.d 1000000 2022 Oct 31 13:37:16 Print each time there is a gap that exceeds the threshold, ctrl+c to stop. [192.168.18.5,192.168.18.8,2] last send completion: 1000211 usecs ago [192.168.18.5,192.168.18.8,3] last send completion: 1000359 usecs ago [192.168.18.5,192.168.18.8,3] last send completion: 1000233 usecs ago [192.168.18.5,192.168.18.8,3] last send completion: 1000153 usecs ago ^C
Script: rds-rdma-lat.d
This script is used to trace and display the RDMA read/write in bytes and the corresponding latency (in usec) per connection.
Output format: [src ip, dest ip, tos, read/write, bytes, latency (usecs)]
Sample output shown below
# ./rds-rdma-lat.d 2022 Aug 4 06:57:31 [<connection> op] Bytes Latency(in usecs) [<192.168.0.6,192.168.0.7,0> r] bytes=1024 lat(usec)=106 [<192.168.0.6,192.168.0.7,0> w] bytes=1024 lat(usec)=47 [<192.168.0.6,192.168.0.7,0> r] bytes=1024 lat(usec)=49 [<192.168.0.6,192.168.0.7,0> w] bytes=1024 lat(usec)=71 [<192.168.0.6,192.168.0.7,0> r] bytes=1024 lat(usec)=41 [<192.168.0.6,192.168.0.7,0> w] bytes=1024 lat(usec)=41
Test with -D 100k size
# ./rds-rdma-lat.d 2022 Aug 4 07:03:33 [<connection> op] Bytes Latency(in usecs) [<192.168.0.6,192.168.0.7,0> r] bytes=102400 lat(usec)=126 [<192.168.0.6,192.168.0.7,0> w] bytes=102400 lat(usec)=93 [<192.168.0.6,192.168.0.7,0> r] bytes=102400 lat(usec)=81 [<192.168.0.6,192.168.0.7,0> w] bytes=102400 lat(usec)=70 [<192.168.0.6,192.168.0.7,0> r] bytes=102400 lat(usec)=79
Script: rds_conn2irq.d
This script is used to display RDS connection and its related qpn, cqn and ird lines. It is to be noted that this script works only on CX-5/RoCE based machines.
Output format: [<src-ip>, <dst-ip>, <tos>] conn=<rds_connection object> ic=<rds_ib_connection object> qpn=<source qp number> dqpn=<destination qp number> scqn=<source completion queue number> scqirq=<source completion queue irq number> rcqn=<receive completion queue number> rcqirq=<receive completion queue irq number>
Sample output shown below.
# ./rds_conn2irq.d
2022 Sep 1 03:26:05 print rds conn to irq mapping. Run rds-info -I parallelly to populate. ctrl+c to stop and print summary
[<192.168.100.125,192.168.100.63,0>] conn=ffff8d93e96e8148 ic=ffff8d35fa43c000 qpn=156 dqpn=0 scqn=1210 scqirq=546 rcqn=1211 rcqirq=545
^C
Summary:
irqn count
545 1
Script: rds_egress_TP.d
This script tracks the egress drop reason along with pid and comm.
Output format: <Date> <PID> <Comm> <src_ip, dst_ip, tos> reason=<reason>
Sample output shown below:
# ./rds_egress_TP.d 2022 Oct 21 03:04:45 ctrl+c to stop 2022 Oct 21 03:05:50 pid=373319 comm=rds-stress [<192.168.0.7,192.168.0.5,0>] reason=rdsinfo pending 2022 Oct 21 03:10:52 pid=373319 comm=rds-stress [<192.168.0.7,192.168.0.5,0>] reason=rdsinfo pending 2022 Oct 21 03:13:48 pid=373319 comm=rds-stress [<192.168.0.7,192.168.0.5,0>] reason=rdsinfo pending
Script: rds_rdma_xfer_rate.d
The script shows the RDMA transfer rate[MB/s] for each RDS connection. It also shows histograms of inverse bandwidth in nsecs/KB (Avg number of nsecs to transfer 1KB of data).
Output format: [src-ip, dst-ip, tos, <r/w>] <Data transfer rate in MB/s>
Sample output shown below.
# ./rds-rdma-xfer-rate.d
2022 Oct 25 10:25:17
[<connection> op] MB/s
2022 Oct 25 10:25:27
[<192.168.100.125,192.168.100.124,0> r] 1059
2022 Oct 25 10:25:37
[<192.168.100.125,192.168.100.124,0> r] 1059
2022 Oct 25 10:25:47
[<192.168.100.125,192.168.100.124,0> r] 1056
2022 Oct 25 10:25:57
[<192.168.100.125,192.168.100.124,0> r] 1049
^C << ---------- Press Cntr+c and stop the script to get histogram output.
2022 Oct 25 10:26:06 - Histogram of inverse bandwidth reported in nsecs/KB
[<192.168.100.125,192.168.100.124,0> r]
value ------------- Distribution ------------- count
128 | 0
256 |@@ 15827
512 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 373157
1024 | 3040
2048 | 0
Transfer rdma bytes of 1 MB size.
Script: rds_tx_funccount.d
This script is used to trace and display the rate of calls for send_xmit, ib_xmit and send_cqe_handler per 10 sec.
Output format: [<src_ip, dt_ip, tos, <sendmsg/send_xmit/ib_xmit/send_cqe_handler>] no_of_calls/10sec
Sample output shown below
# ./rds_tx_funccount.d 2022 Oct 26 23:52:16: rate of calls for send_xmit, ib_xmit and send_cqe_handler. ctrl+c to stop --- 2022 Oct 26 23:52:26 --- [<192.168.0.5,192.168.0.23,2> 03ib_xmits] 3 [<192.168.0.5,192.168.0.23,3> 03ib_xmits] 3 [<192.168.0.5,192.168.0.23,3> 04send_cqe_handler] 3 [<192.168.0.5,192.168.0.23,5> 01send_msg] 3 [<192.168.0.5,192.168.0.24,2> 03ib_xmits] 3 [<192.168.0.5,192.168.0.24,2> 04send_cqe_handler] 3 [<192.168.0.5,192.168.0.24,4> 03ib_xmits] 3 [<192.168.0.5,192.168.0.24,4> 04send_cqe_handler] 3 [<192.168.0.5,192.168.0.24,5> 03ib_xmits] 3 [<192.168.0.5,192.168.0.24,5> 04send_cqe_handler] 3