Introduction
We introduced Oracle Linux Enhanced Diagnostics (OLED) in a recent blog post; for installation instructions, please refer to that post. OLED includes several diagnostic scripts, and the scripts tool is used to manage and run them. With the scripts tool you can list, run, enable, and disable the various scripts. Some of the scripts are designed to run at system startup, while others can be run at any time. The majority of these scripts are written for dtrace(8); however, the scripts tool can also run any other kind of script as long as the interpreter is correctly defined in the shebang (#!) at the beginning of the script.
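Because the interpreter is taken from the shebang, a quick way to see which interpreter a script declares is to read its first line. The helper below is a minimal sketch; script_interpreter is a hypothetical name introduced here and is not part of oled-tools.

```shell
# Hypothetical helper (not part of oled-tools): print the interpreter
# declared on a script's shebang line. Returns non-zero if the file
# cannot be read or its first line does not start with "#!".
script_interpreter() {
    first_line=$(head -n 1 "$1") || return 1
    case "$first_line" in
        '#!'*) printf '%s\n' "${first_line#??}" ;;  # drop the leading "#!"
        *)     return 1 ;;
    esac
}
```

For example, script_interpreter /usr/libexec/oled-tools/scripts.d/ping_lat.d would print whatever interpreter line that script declares.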
Using scripts
Since the scripts tool is a part of OLED, you have to execute it using the following syntax:
sudo oled scripts <ARGUMENTS> <COMMAND> <OPTIONS>
As an example, to get help on the tool you would execute:
sudo oled scripts -h
usage: scripts [-h]
               {list,run,reset-startup,enable-startup,disable-startup,run-startup-enabled}
               ...

Performs actions on scripts provided by oled-tools.

positional arguments:
  {list,run,reset-startup,enable-startup,disable-startup,run-startup-enabled}
                        Subcommands
    list                List oled-tools scripts.
    run                 Run oled-tools script.
    reset-startup       Reset startup config of a script to system default
    enable-startup      Enable a script to run at startup
    disable-startup     Disable a script from running at startup
    run-startup-enabled
                        Run enabled startup scripts

optional arguments:
  -h, --help            show this help message and exit
For a brief overview of what each script does, and the arguments they may take, see the man page additional-scripts(8). For detailed information and examples, review the .txt files in /usr/libexec/oled-tools/scripts.d/docs/.
Let’s look at each of the subcommands.
- list: Use the list subcommand to list all the scripts available with OLED.
sudo oled scripts list
(Startup Enabled: '*' = enabled by default; '+' = enabled by user; '-' disabled by user)
                               Startup  Startup
Script Name                    Eligible Enabled
============================== ======== =======
arp_origin.d
cm_destroy_id.d                *
mlx_vhcaid.d                   *
mlxqpdump.sh
nvme_io_comp.d
rds_bcopy_metric.d
rds_check_tx_stall.d
rds_conn2irq.d
rds_egress_TP.d
rds_rdma_lat.d
rds_rdma_xfer_rate.d
rds_tx_funccount.d
scsi_latency.d
scsi_queue.d
spinlock_time.d
ping_lat.d
Notice the legend at the top of the output. It lists the markers that can appear in the Startup Enabled column. Only startup-eligible scripts can be enabled. We’ll see that in action later on.
- run: This subcommand runs the script you provide on the command line. Note that some scripts require an argument, and not all scripts will run on all hardware, because some depend on specific hardware to work.
sudo oled scripts run ping_lat.d
2024-05-16 20:09:19.182 INFO - Running script '/usr/libexec/oled-tools/scripts.d/ping_lat.d '...
DTrace 2.0.0 [Pre-Release with limited functionality]
icmp id: 14434, icmp sequence: 256, src ip: 192.168.18.5, dst ip: 192.168.18.9
routine               delta(ns)
ip_send_skb
dev_hard_start_xmit   190382
ipoib_start_xmit      44855
ipoib_cm_send         84644
icmp_rcv              322743
request-reply latency: 642624
------------------------------------------------------------------
- reset-startup: Some startup scripts may be enabled by default. Those would be identified by a '*' in the Enabled column when you run the list subcommand. You have the ability to disable these scripts, overriding the default setting. If you want to reset everything back to the way it was when oled-tools was installed, use the reset-startup subcommand.
- enable-startup: As you can see from the list above, none of the startup scripts are enabled. Use this subcommand to enable the startup script(s) of your choice. While you can enable as many startup scripts as you like, the enable-startup subcommand expects only one script at a time.
sudo oled scripts enable-startup cm_destroy_id.d
sudo oled scripts list
(Startup Enabled: '*' = enabled by default; '+' = enabled by user; '-' disabled by user)
                               Startup  Startup
Script Name                    Eligible Enabled
============================== ======== =======
arp_origin.d
cm_destroy_id.d                *        +
mlx_vhcaid.d                   *
mlxqpdump.sh
nvme_io_comp.d
rds_bcopy_metric.d
rds_check_tx_stall.d
rds_conn2irq.d
rds_egress_TP.d
rds_rdma_lat.d
rds_rdma_xfer_rate.d
rds_tx_funccount.d
scsi_latency.d
scsi_queue.d
spinlock_time.d
ping_lat.d
Note that cm_destroy_id.d now has a ‘+’ in the Startup Enabled column.
- disable-startup: As the name implies, this subcommand will disable a startup script.
sudo oled scripts disable-startup cm_destroy_id.d
sudo oled scripts list
(Startup Enabled: '*' = enabled by default; '+' = enabled by user; '-' disabled by user)
                               Startup  Startup
Script Name                    Eligible Enabled
============================== ======== =======
arp_origin.d
cm_destroy_id.d                *
mlx_vhcaid.d                   *
mlxqpdump.sh
nvme_io_comp.d
rds_bcopy_metric.d
rds_check_tx_stall.d
rds_conn2irq.d
rds_egress_TP.d
rds_rdma_lat.d
rds_rdma_xfer_rate.d
rds_tx_funccount.d
scsi_latency.d
scsi_queue.d
spinlock_time.d
ping_lat.d
- run-startup-enabled: Once configured and enabled, the startup scripts are executed at boot. If you want to run them immediately, use the run-startup-enabled subcommand.
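Because enable-startup expects one script per invocation, enabling several scripts is easiest with a small loop. The function below is a sketch under stated assumptions, not part of oled-tools: enable_startup_scripts and the OLED override variable are hypothetical names, introduced so the command can be swapped out for a dry run.

```shell
# Hypothetical wrapper: enable each named script for startup, one
# invocation per script, stopping at the first failure.
# OLED is an illustrative override; it defaults to "sudo oled".
enable_startup_scripts() {
    for script in "$@"; do
        # Left unquoted on purpose so "sudo oled" splits into two words.
        ${OLED:-sudo oled} scripts enable-startup "$script" || return 1
    done
}
```

A typical use would be enable_startup_scripts cm_destroy_id.d mlx_vhcaid.d, followed by sudo oled scripts run-startup-enabled to run them right away. Setting OLED=echo first prints the commands instead of executing them.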
Additional Setup
For the enabled startup scripts to run at boot up, you also need to enable the oled-tools-scripts.service.
sudo systemctl enable oled-tools-scripts.service
Created symlink /etc/systemd/system/multi-user.target.wants/oled-tools-scripts.service → /usr/lib/systemd/system/oled-tools-scripts.service.
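To confirm that the unit is set to start at boot, systemctl is-enabled can be queried. The following is a minimal sketch; oled_service_enabled and the SYSTEMCTL override variable are hypothetical names used only for illustration.

```shell
# Hypothetical helper: succeed only if oled-tools-scripts.service is
# enabled. SYSTEMCTL is an illustrative override; it defaults to systemctl.
oled_service_enabled() {
    ${SYSTEMCTL:-systemctl} is-enabled --quiet oled-tools-scripts.service
}
```

It can be combined with the enable command, for example: oled_service_enabled || sudo systemctl enable oled-tools-scripts.service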
Scripts
Here is a list of all the scripts available in oled-tools 0.8-1, with a brief description of each. For detailed information about each script, refer to the text files in /usr/libexec/oled-tools/scripts.d/docs. You may not see all of these when you run oled scripts list, because the listing is filtered by the kernel version the scripts will run on.
- arp_origin.d – Detects and prints the details of all ARP requests, replies, and ignored packets on all the network interfaces present on the system where this script is being executed.
- cm_destroy_id.d – Executes the external command “mlxqpdump.sh” when cm_destroy_id_wait_timeout() probe fires. It is better to give a full path to the external command. Stops after 5 probes.
- cq.d – Tracks new connections and prints the Completion Queue objects and the associated completion queue numbers.
- cqn_track.d – Tracks a Completion Queue number and prints when there is a new completion by tracking completion handler, tasklet, arming and poll_cq calls.
- mlxqpdump.sh – Executes the commands to gather FW resource dumps and sysinfo-snapshot for Mellanox devices.
- mlx_vhcaid.d – Prints the ‘vhca id’ of the Mellanox devices (CX5) from a VM. The Mellanox macro in UEK7 is different from UEK[5,6]. When running the script in UEK7, add the ‘-D uek7’ argument to the command.
- nvme_io_comp.d – Measures nvme IO latency in microseconds (us).
- ping_lat.d – Measures latency details of the icmp packet, during its journey in the kernel. Output shows the time delta between two consecutive kernel functions through which the packet travels.
- rds_bcopy_metric.d – Used to trace and display the send and recv bytes per connection.
- rds_check_tx_stall.d – Monitors the completions on all the connections and calculates the time difference between consecutive completions. If this time difference exceeds the threshold time (in microseconds) given as the input argument, a message is printed with the details of the connection parameters.
- rds_conn2irq.d – Displays rds connection and its related qpn, cqn and irq lines.
- rds_egress_TP.d – Tracks egress drop reason along with pid and comm.
- rds_rdma_lat.d – Traces and displays the rdma read/write in bytes and the corresponding latency (in usec) per connection.
- rds_rdma_xfer_rate.d – Shows the rdma transfer rate[MB/s] for each rds connection. It also shows histograms of inverse bandwidth in nsecs/KB [Avg number of nsecs to transfer 1KB of data].
- rds_tx_funccount.d – Used to trace and display rate of calls for sendmsg, send_xmit, ib_xmit and send_cqe_handler per 10 sec.
- scsi_latency.d – Measures SCSI Mid Layer IO latency in milliseconds (ms).
- scsi_queue.d – Measures SCSI IO queue time in milliseconds (ms).
- spinlock_time.d – Tracks the time interrupts are disabled by the spin_lock_irq*() kernel functions. Takes an argument (hold time in ms) and prints the process (and its call stack) that disables interrupts for longer than the given time.
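Since the listing is filtered by kernel version, it can be handy to check which of the scripts described above are absent on a given system. The sketch below assumes the oled scripts list layout shown earlier (a legend line, two header lines, a ==== separator, then one script per row with the name in the first column); listed_scripts and missing_scripts are hypothetical helper names, not part of oled-tools.

```shell
# Hypothetical helper: read `oled scripts list` output on stdin and print
# the script names, i.e. the first field of every row after the ==== line.
listed_scripts() {
    awk 'seen && NF { print $1 } /^=+/ { seen = 1 }'
}

# Hypothetical helper: print each name given as an argument that does not
# appear in the listing read from stdin.
missing_scripts() {
    listing=$(listed_scripts)
    for name in "$@"; do
        printf '%s\n' "$listing" | grep -qxF -- "$name" || echo "$name"
    done
}
```

For example, sudo oled scripts list | missing_scripts cq.d cqn_track.d ping_lat.d prints the names from that set that the running kernel's listing does not include.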
Conclusion
The scripts tool is a simple way to manage and run the additional scripts provided with OLED. If you have any issues with it, please open a Service Request with Oracle Linux Support.