The sosreport command has been around since 2009. Written in Python, the tool is designed to gather comprehensive diagnostic data from a Linux system. It has been modified and improved by the open source community over the years and is now run using the command sos report. sos report does not make modifications to your system configuration.
Quite simply, the output from sos report is a one-stop shop for diagnostic data. It contains most of the data we need to begin troubleshooting a problem on your Linux system. Sometimes it’s all the data we need. Yes, there are circumstances where we will come back and ask for more data, but sos report is the next best thing to troubleshooting live on your system.
In some cases we may ask for more than one sos report. We may want to compare a good system to a bad system, or we may want to see how the system changes over time. The sos report provides a snapshot of the system’s state as well as some historical information.
“Can’t you just ask for the specific files you need?” I hear you ask. Yes, we could, but we’re only human, and chances are we may miss something or decide we need to look at a different file. sos report will collect configuration information, log files, and the output from dozens of commands, saving time and avoiding potential mistakes by not asking you to collect those things individually by hand. By using sos report we hope to reduce the back and forth that can be annoying and can delay the resolution of your issue.
sudo yum install sos
sudo dnf install sos
Below are the commands that I prefer when asking someone to run sos report. These commands will ensure we get all the log files from /var/log (not just truncated versions) as well as all the sar data. Historical information is important! Even though the commands differ between some of the OL versions, the collected data is the same.
sudo sosreport --batch --all-logs -k sar.all_sar=on
sudo sos report --batch --all-logs -k sar.all_sar=on
The --batch option allows you to run sos report without any further keyboard input. Normally sos report will ask you for your name and ticket number. By using --batch, you can avoid this and the output file will contain the system name and a time stamp.
The --all-logs option ensures that we get all the files from /var/log. Without this, some of the files, like /var/log/messages, are truncated.
The -k sar.all_sar=on option tells sos report to collect all the sar files in /var/log/sa, not just the last 14 days.
sudo sos report --batch --all-logs -e sar -k sar.all_sar=on
The -e sar option turns the sar plugin back on. It is disabled by default in OL9.
Here is the output from an sos report run on OL9 (truncated).
[opc@ol9-1 ~]$ sudo sos report --batch --all-logs -e sar -k sar.all_sar=on
sosreport (version 4.5.3)
This command will collect diagnostic and configuration information from this Oracle Linux system and installed applications.
An archive containing the collected information will be generated in /var/tmp/sos.7p70_t00 and may be provided to a Oracle America support representative.
Any information provided to Oracle America will be treated in accordance with the published support policies at:
Distribution Website : https://support.oracle.com/
Commercial Support : https://support.oracle.com/
The generated archive may contain data considered sensitive and its content should be reviewed by the originating organization before being passed to any third party.
No changes will be made to system configuration.
Setting up archive ...
Setting up plugins ...
[plugin:networking] skipped command 'ip -s macsec show': required kmods missing: macsec. Use '--allow-system-changes' to enable collection.
[plugin:networking] skipped command 'ss -peaonmi': required kmods missing: xsk_diag. Use '--allow-system-changes' to enable collection.
[plugin:sar] sar: could not list /var/log/sa
[plugin:sssd] skipped command 'sssctl config-check': required services missing: sssd.
[plugin:sssd] skipped command 'sssctl domain-list': required services missing: sssd.
[plugin:systemd] skipped command 'systemd-resolve --status': required services missing: systemd-resolved.
[plugin:systemd] skipped command 'systemd-resolve --statistics': required services missing: systemd-resolved.
Running plugins. Please wait ...
Starting 1/95 alternatives [Running: alternatives]
Starting 2/95 anacron [Running: alternatives anacron]
Starting 3/95 ata [Running: alternatives anacron ata]
Starting 4/95 auditd [Running: alternatives anacron ata auditd]
Starting 5/95 bcache [Running: alternatives ata auditd bcache]
Starting 6/95 block [Running: alternatives ata auditd block]
Starting 7/95 boot [Running: alternatives ata block boot]
Starting 8/95 btrfs [Running: alternatives block boot btrfs]
Starting 9/95 cgroups [Running: block boot btrfs cgroups]
Starting 10/95 chrony [Running: block boot cgroups chrony]
Starting 11/95 cloud_init [Running: block boot cgroups cloud_init]
Starting 12/95 cockpit [Running: boot cgroups cloud_init cockpit]
Starting 13/95 console [Running: boot cgroups cloud_init console]
Starting 14/95 cron [Running: boot cgroups cloud_init cron]
Starting 15/95 crypto [Running: boot cgroups cloud_init crypto]
Starting 16/95 date [Running: boot cgroups cloud_init date]
Starting 17/95 dbus [Running: boot cgroups cloud_init dbus]
Starting 18/95 devicemapper [Running: boot cgroups dbus devicemapper]
Starting 19/95 devices [Running: boot cgroups devicemapper devices]
Starting 20/95 dnf [Running: boot devicemapper devices dnf]
Starting 21/95 dracut [Running: boot devicemapper dnf dracut]
Starting 22/95 ebpf [Running: boot dnf dracut ebpf]
Starting 23/95 filesys [Running: boot dnf ebpf filesys]
Starting 24/95 firewall_tables [Running: boot dnf ebpf firewall_tables]
Starting 25/95 firewalld [Running: boot dnf ebpf firewalld]
Starting 26/95 fwupd [Running: dnf ebpf firewalld fwupd]
Starting 27/95 grub2 [Running: dnf ebpf firewalld grub2]
Starting 28/95 gssproxy [Running: dnf ebpf grub2 gssproxy]
Starting 29/95 hardware [Running: dnf ebpf grub2 hardware]
Starting 30/95 host [Running: dnf grub2 hardware host]
Starting 31/95 hts [Running: dnf grub2 hardware hts]
Starting 32/95 i18n [Running: dnf grub2 hardware i18n]
Starting 33/95 iscsi [Running: dnf grub2 hardware iscsi]
Starting 34/95 jars [Running: dnf grub2 hardware jars]
Starting 35/95 kdump [Running: dnf grub2 hardware kdump]
<SNIP>
Starting 80/95 sunrpc [Running: process processor selinux sunrpc]
Starting 81/95 system [Running: process processor selinux system]
Starting 82/95 systemd [Running: processor selinux system systemd]
Starting 83/95 systemtap [Running: processor selinux systemd systemtap]
Starting 84/95 sysvipc [Running: processor selinux systemtap sysvipc]
Starting 85/95 teamd [Running: processor selinux systemtap teamd]
Starting 86/95 tuned [Running: processor selinux systemtap tuned]
Starting 87/95 udev [Running: selinux systemtap tuned udev]
Starting 88/95 udisks [Running: systemtap tuned udev udisks]
Starting 89/95 unpackaged [Running: systemtap tuned udisks unpackaged]
Starting 90/95 usb [Running: systemtap tuned unpackaged usb]
Starting 91/95 vdo [Running: systemtap tuned unpackaged vdo]
Starting 92/95 vhostmd [Running: systemtap tuned unpackaged vhostmd]
Starting 93/95 x11 [Running: systemtap tuned unpackaged x11]
Starting 94/95 xen [Running: systemtap tuned unpackaged xen]
Finishing plugins [Running: systemtap tuned unpackaged]
Starting 95/95 xfs [Running: systemtap tuned unpackaged xfs]
Finishing plugins [Running: systemtap unpackaged xfs]
Finishing plugins [Running: systemtap unpackaged]
Finishing plugins [Running: systemtap]
Finished running plugins
Creating compressed archive...
Your sosreport has been generated and saved in:
/var/tmp/sosreport-ol9-1-2023-08-05-aaevrlq.tar.xz
Size 10.52MiB
Owner root
sha256 1cb78f8f8114bd1f11b477eb30ed90b7095a03c648e2613cca8ea20b1102210e
Please send this file to your support representative.
[opc@ol9-1 ~]$
Notice that four plugins run at the same time. This helps reduce the amount of time it takes for sos report to complete. Want to see all the plugins that are active and inactive as well as the available plugin options? Try this:
sudo sos report -l
You can enable (-e) and disable (-n) plugins as you wish from the command line.
sar
You may be asking yourself why I’m so hung up on sar. Well, I do realize there are much better tools out there for performance monitoring, Performance Co-Pilot for example. However, sar provides an entire month’s worth of data in just a few MB. It can provide a nice bird’s-eye view and can show you trends that would be missed by more granular data, which is typically kept for a shorter duration.
Here’s an example of what you can do with a month’s worth of sar data. It’s not as dramatic as some issues I have seen, but by looking at the averages per day for CPU utilization you can see that there is more %user taken up on the 8th and 9th.
202307$ printf "=CPU Util= %s\n" "$(sar -u -f `ls -1rt sa??|head -1` | head -3 | tail -1)";ls -1rt sa?? | while read F; do sar -u -f $F | awk 'BEGIN{S=0} $1 == "Linux"{if (S==0){D=$(NF-3);S=1}} $1 == "Average:"{print D,$0}'; done
=CPU Util= 04:00:01 AM CPU %user %nice %system %iowait %steal %idle
07/10/2023 Average: all 0.18 0.00 0.05 0.20 0.00 99.57
07/11/2023 Average: all 0.18 0.00 0.06 0.30 0.00 99.46
07/11/2023 Average: all 0.21 0.00 0.04 0.21 0.00 99.53
07/12/2023 Average: all 0.30 0.00 0.08 0.44 0.00 99.18
07/13/2023 Average: all 0.37 8.72 0.22 0.41 0.00 90.29
07/14/2023 Average: all 0.27 2.91 0.10 0.34 0.00 96.38
07/15/2023 Average: all 0.21 0.00 0.06 0.26 0.00 99.47
07/16/2023 Average: all 0.29 0.00 0.07 0.33 0.00 99.31
07/17/2023 Average: all 0.24 0.00 0.07 0.30 0.00 99.39
07/18/2023 Average: all 1.36 3.72 0.60 0.80 0.00 93.51
07/19/2023 Average: all 0.47 0.01 0.25 0.94 0.00 98.33
07/20/2023 Average: all 0.53 3.64 0.48 0.83 0.00 94.52
07/21/2023 Average: all 0.78 0.00 0.16 0.48 0.00 98.58
07/22/2023 Average: all 0.48 0.00 0.15 0.46 0.00 98.90
07/23/2023 Average: all 4.17 0.00 0.47 0.70 0.00 94.65
07/24/2023 Average: all 2.32 0.00 0.34 0.58 0.00 96.76
07/25/2023 Average: all 0.60 0.00 0.21 0.53 0.00 98.66
07/26/2023 Average: all 1.25 0.00 0.39 0.81 0.00 97.55
07/27/2023 Average: all 0.98 0.00 0.24 0.58 0.00 98.21
08/01/2023 Average: all 1.34 0.01 0.36 0.81 0.00 97.49
08/02/2023 Average: all 1.55 0.00 0.43 1.23 0.00 96.79
08/03/2023 Average: all 2.16 0.00 0.48 1.13 0.00 96.23
08/04/2023 Average: all 2.26 3.48 0.74 1.13 0.00 92.39
08/05/2023 Average: all 3.02 0.00 0.59 1.15 0.00 95.25
08/06/2023 Average: all 6.45 0.00 0.91 1.42 0.00 91.21
08/07/2023 Average: all 7.59 0.00 1.53 3.45 0.00 87.43
08/08/2023 Average: all 6.36 0.00 1.10 2.07 0.00 90.47
08/09/2023 Average: all 8.23 0.00 1.70 3.23 0.00 86.83
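A per-day summary like the one above is easy to filter further. The following is a minimal sketch, assuming each summary line has the layout shown above (date, "Average:", "all", then %user in the fourth column). The function name flag_busy_days and the threshold are hypothetical choices for illustration, not part of sar or sos.

```shell
# flag_busy_days THRESHOLD
# Reads per-day "Average:" lines on stdin (date in column 1, %user in
# column 4) and prints the date and %user for every day whose %user
# is above THRESHOLD. Header lines and anything else are ignored.
flag_busy_days() {
    awk -v limit="$1" '
        $2 == "Average:" && ($4 + 0) > (limit + 0) { print $1, $4 }
    '
}

# Example (not run here): pipe the per-day summary into the function.
#   ... | flag_busy_days 5
```

With the data above and a threshold of 5, this prints the four days from 08/06/2023 through 08/09/2023.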
It is recommended to install sysstat if it is not already installed.
sudo dnf install sysstat
sudo systemctl enable --now sysstat
sos report will write its output files to /var/tmp. You can control this by providing the --tmp-dir option. On OL6 the files are written to /tmp, and there is no option to change the location. sos report will create two files: a compressed tar archive, which is the sos report itself, and a checksum file so you can verify the integrity of the sos report.
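Checking the archive against its checksum file before uploading can be sketched as follows. This assumes the checksum file holds a SHA-256 digest as its first whitespace-separated field, which matches the sha256 line in the OL9 output shown earlier; older releases may use a different hash. The function name verify_sos_archive is hypothetical.

```shell
# verify_sos_archive ARCHIVE CHECKSUM_FILE
# Compares the SHA-256 digest of ARCHIVE with the first field of the
# first line of CHECKSUM_FILE. Returns 0 on a match, 1 otherwise.
# Assumption: the checksum file stores a SHA-256 digest.
verify_sos_archive() {
    vsa_expected=$(awk 'NR == 1 { print $1 }' "$2")
    vsa_actual=$(sha256sum "$1" | awk '{ print $1 }')
    if [ -n "$vsa_expected" ] && [ "$vsa_expected" = "$vsa_actual" ]; then
        echo "OK: $1"
    else
        echo "MISMATCH: $1" >&2
        return 1
    fi
}

# Example (not run here), using the archive name from the output above:
#   verify_sos_archive /var/tmp/sosreport-ol9-1-2023-08-05-aaevrlq.tar.xz \
#       /var/tmp/sosreport-ol9-1-2023-08-05-aaevrlq.tar.xz.sha256
```

The name of the checksum file in the example is an assumption; use whatever file sos report created alongside the archive.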
Please run sos report as soon as possible after a problem has happened. If your system crashed, run it as soon as the system is back up. You can also run sos report anytime you want a baseline of the system’s configuration. Before making a big change on your system, you can save a copy of the sos report offline so you can refer to it if things go sideways. We have even recommended to some customers that they take a fresh sos report on boot up via systemd.
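One way to take an sos report at boot is a oneshot systemd service. The sketch below only writes such a unit file into a directory you choose. The unit name sos-report-boot.service, the function name, and the /usr/sbin/sos path are all assumptions for illustration (check the path with command -v sos), and the options are the OL9 ones used earlier.

```shell
# write_sos_boot_unit DEST_DIR
# Writes a hypothetical oneshot unit, sos-report-boot.service, into
# DEST_DIR. The unit runs sos report once per boot.
write_sos_boot_unit() {
    cat > "$1/sos-report-boot.service" <<'EOF'
[Unit]
Description=Collect an sos report at boot (illustrative example)

[Service]
Type=oneshot
# Assumption: the sos binary is installed at /usr/sbin/sos.
ExecStart=/usr/sbin/sos report --batch --all-logs -e sar -k sar.all_sar=on

[Install]
WantedBy=multi-user.target
EOF
}

# Example (not run here; requires root):
#   write_sos_boot_unit /etc/systemd/system
#   systemctl daemon-reload
#   systemctl enable sos-report-boot.service
```

Each boot adds another archive to the output directory, so plan to prune old reports if you adopt something like this.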
Once you untar the sos report and cd into the directory it creates, you will see several sub-directories and symbolic links. The symbolic links point to commonly used information in files under the directories. They are placed at the top level for the sake of convenience. Here are some of the symbolic links and directories you will find:
sos report command was run.
TIP: Want to sort the RPMs by the time they were installed? Try this:
while read -r R W M D T Y; do EPOCH=$(date +%s -d "${M} ${D} ${T} ${Y}"); printf "%s\t%3s %3s %2d %8s %4d\t%d\n" "${R}" "${W}" "${M}" "${D}" "${T}" "${Y}" "${EPOCH}"; done < installed-rpms | sort -rn -k 7 | less
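The same installed-rpms file is handy when comparing two sos reports, for example one from a good system and one from a bad system. Here is a minimal sketch, assuming both reports have been extracted into directories and that the first column of installed-rpms is the package name and version, as in the command above. The function name diff_installed_rpms is hypothetical.

```shell
# diff_installed_rpms GOOD_DIR BAD_DIR
# Prints packages that appear in only one of the two reports.
# Lines without indentation exist only in GOOD_DIR; lines indented
# with a tab exist only in BAD_DIR.
diff_installed_rpms() {
    dir_good_list=$(mktemp) || return 1
    dir_bad_list=$(mktemp) || { rm -f "$dir_good_list"; return 1; }
    # comm needs both inputs sorted with the same collation, so force
    # the C locale for sort and comm alike.
    awk '{ print $1 }' "$1/installed-rpms" | LC_ALL=C sort -u > "$dir_good_list"
    awk '{ print $1 }' "$2/installed-rpms" | LC_ALL=C sort -u > "$dir_bad_list"
    LC_ALL=C comm -3 "$dir_good_list" "$dir_bad_list"
    rm -f "$dir_good_list" "$dir_bad_list"
}

# Example (not run here; directory names are placeholders):
#   diff_installed_rpms sosreport-good/ sosreport-bad/
```

Packages present in both reports at the same version are suppressed, so what remains is a short list of candidates to investigate.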
Notice that some of the directories start with sos_. These contain data that was collected by commands run by sos report. The other directories contain files that were copied by sos report.
sos report runs. Spend some time checking out this directory and its sub-directories. Here are a few examples:
auditctl and ausearch commands.
lsblk, blkid, and parted.
dmsetup commands.
/boot as well as the output from grub2-mkconfig.
iscsiadm commands.
journalctl commands as well as a listing from /var/log.
sos report using the command above there were 79 of these sub directories.
sos report.
sos report. The html file is interesting as it breaks down all the data that is collected by plugin.
/usr file system, including /usr/lib, /usr/libexec, and /lib/share.
/var/log.
The customer has been alerted by their monitoring software to a “Link Down” message in their /var/log/messages file. The first thing we should do is gather some basic information about the customer’s system. Thankfully they have uploaded an sos report.
# cd sosreport-my-test-2023-07-12-dfbucbc/
# cat uname
Linux my-test 5.4.17-2136.315.5.el8uek.x86_64 #2 SMP Wed Dec 21 19:38:18 PST 2022 x86_64 x86_64 x86_64 GNU/Linux
# grep Product dmidecode
Product Name: ORACLE SERVER X6-2L
# grep device-mapper-multipath installed-rpms
device-mapper-multipath-0.8.4-22.el8.x86_64 Tue Nov 1 14:56:32 2022
From this data we know that this is an OL8 system running UEK6. We can see the hardware type, and the version of multipath that is running. Now we do some research to discover if there are any known issues with any of these versions. Next, having not found any applicable bugs, we check the error message and, with the PCI bus address, we can check the card type.
# grep -i "link down" var/log/messages
Jul 12 10:54:33 my-test kernel: lpfc 0000:23:00.1: 1:1305 Link Down Event x2 received Data: x2 x20 x800110 x0 x0
# grep "23:00.1" lspci
23:00.1 Fibre Channel [0c04]: Emulex Corporation LPe15000/LPe16000 Series 8Gb/16Gb Fibre Channel Adapter [10df:e200] (rev 30)
We can now check if there are any known issues with this type of adapter. If there aren’t any we can recommend to the customer that they have the hardware checked, but we also want to make sure that multipath did its job.
# grep multipathd var/log/messages | grep "Jul 12" | grep mpathr
Jul 12 10:54:38 my-test multipathd[5598]: checker failed path 135:224 in map mpathr
Jul 12 10:54:38 my-test multipathd[5598]: mpathr: remaining active paths: 15
Jul 12 10:54:39 my-test multipathd[5598]: mpathr: remaining active paths: 14
Jul 12 10:54:39 my-test multipathd[5598]: mpathr: remaining active paths: 13
Jul 12 10:54:40 my-test multipathd[5598]: checker failed path 128:32 in map mpathr
Jul 12 10:54:40 my-test multipathd[5598]: mpathr: remaining active paths: 12
Here we can see that multipath worked correctly and the customer still has 12 paths serving data.
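When a report covers many multipath maps, pulling out the last reported path count for one map saves some scrolling. This is a minimal sketch, assuming the multipathd log lines have the format shown above; the function name remaining_paths is hypothetical.

```shell
# remaining_paths MAP MESSAGES_FILE
# Prints the most recent "remaining active paths" count that
# multipathd logged for MAP in MESSAGES_FILE. Prints nothing if the
# map never logged such a line.
remaining_paths() {
    awk -v map="$1:" '
        /multipathd/ && /remaining active paths:/ {
            # The map name appears as its own field, e.g. "mpathr:".
            for (i = 1; i <= NF; i++)
                if ($i == map) last = $NF
        }
        END { if (last != "") print last }
    ' "$2"
}

# Example (not run here), from inside the extracted sos report:
#   remaining_paths mpathr var/log/messages
```

For the log excerpt above, the function prints 12.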
If you really, really need to, you can obfuscate the data in an sos report. We discourage this as it may make troubleshooting harder. Obfuscation takes certain data from the sos report and replaces it with a generic value, thus hiding the true value. The obfuscation tools will hide hostnames, domains, usernames, IP addresses, and, in the case of OL8 and OL9, user-provided keywords.
Our analysis of your sos report will be based on the obfuscated values. You will need to translate the generic values into real-world values by consulting the translation tables produced at the same time the data is obfuscated.
On OL7 you can obfuscate an sos report only after it has been collected. You do this with the soscleaner tool. There is no soscleaner tool for OL6.
sudo yum install soscleaner
sudo soscleaner sosreport-jyoder-ol7-1-2023-08-05-qklnggo.tar.xz
Here are the files it produces in /tmp:
-d or --domain=.
soscleaner command.
sos report. Upload this file only.
Here’s what the hostname table looks like:
Obfuscated Hostname,Original Hostname
host1.example.com,jyoder-ol7-1.allregionaliads.osdevelopmeniad.oraclevcn.com
host0,jyoder-ol7-1
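Translating an obfuscated value back to the real one is a simple lookup in such a table. Here is a minimal sketch, assuming the table is a two-column CSV with a header row, as in the hostname table above. The function name original_hostname and the file name in the example are hypothetical.

```shell
# original_hostname TABLE_CSV OBFUSCATED_NAME
# Looks up OBFUSCATED_NAME in the first column of TABLE_CSV and prints
# the matching original hostname from the second column.
# Returns non-zero if the name is not in the table.
original_hostname() {
    awk -F',' -v name="$2" '
        NR > 1 && $1 == name { print $2; found = 1 }
        END { exit !found }
    ' "$1"
}

# Example (not run here; hostname-table.csv is a placeholder name):
#   original_hostname hostname-table.csv host0
```

With the table above, looking up host0 prints jyoder-ol7-1.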
sudo sos clean --batch --keywords SecretApp sosreport-ol9-1-2023-08-05-qtskqpx.tar.xz
In this example I’m passing the --keywords option and providing it with the name of an application that I want to keep secret (appropriately named “SecretApp”). You can pass --keywords a comma-separated list of additional words that you want obfuscated in the sos report. Every occurrence of these keywords (“SecretApp” in this case) will be obfuscated, in addition to the default obfuscation of hostnames, domains, IPs and usernames. We’ll see an example of what the --keywords option does below. After we run this command the following files are created in /var/tmp:
sos report. Upload this file only.
sos report
Here is an example of the effect of the above command on /var/log/messages:
BEFORE
Aug 5 18:10:33 ol9-1 SecretApp[24833]: Started
Aug 5 18:11:38 ol9-1 SecretApp[24836]: ERROR: Can not continue. Aborting.
Aug 5 18:13:33 ol9-1 dracut[25537]: dracut-057-21.git20230214.0.2.el9
Aug 5 18:13:33 ol9-1 dracut[25539]: Executing: /usr/bin/dracut --list-modules
Aug 5 18:13:44 ol9-1 systemd[1]: Starting Hostname Service...
Aug 5 18:13:44 ol9-1 systemd[1]: Started Hostname Service.
AFTER
Aug 5 18:10:33 host0 obfuscatedword0[24833]: Started
Aug 5 18:11:38 host0 obfuscatedword0[24836]: ERROR: Can not continue. Aborting.
Aug 5 18:13:33 host0 dracut[25537]: dracut-057-21.git20230214.0.2.el9
Aug 5 18:13:33 host0 dracut[25539]: Executing: /usr/bin/dracut --list-modules
Aug 5 18:13:44 host0 systemd[1]: Starting Hostname Service...
Aug 5 18:13:44 host0 systemd[1]: Started Hostname Service.
You can see why it might make troubleshooting a little more cumbersome.
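If you do obfuscate, an optional sanity check before uploading is to confirm that the keyword no longer appears anywhere in the cleaned data. This is a minimal sketch, assuming you have extracted the obfuscated archive into a directory; the function name keyword_leaks is hypothetical and is not part of sos.

```shell
# keyword_leaks KEYWORD DIR
# Lists every text file under DIR that still contains KEYWORD
# (case-insensitive, fixed-string match, binary files skipped).
# Returns non-zero when nothing is found, which is the result you
# want after obfuscation.
keyword_leaks() {
    grep -rIilF -- "$1" "$2"
}

# Example (not run here; the directory name is a placeholder):
#   keyword_leaks SecretApp ./extracted-obfuscated-report/
```

Any file path this prints deserves a closer look before the archive leaves your site.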
You can also obfuscate the sos report as you collect it. This option is available for OL8 and OL9 only.
sudo sos report --clean --batch --keywords SecretApp --all-logs -e sar -k sar.all_sar=on
Here are the files that were created:
Note: The hostname is automatically obfuscated. You can disable this if you want. Check the man page for sos-clean.
Don’t want to have to remember all those command line options every time? Well, you don’t have to. You can configure all the command line options in /etc/sos/sos.conf, and then just run sos report. Here’s an example from an OL9 system using the options listed above.
$ egrep -v "^#|^$" /etc/sos/sos.conf
[global]
batch = yes
[report]
enable-plugins = sar
all-logs = yes
[collect]
[clean]
keywords = SecretApp
[plugin_options]
sar.all_sar=on
I would encourage you to take some time to run a few iterations of sos report on your system and check out all of the data you can find there. Make sure you are always running the latest version. We love sos reports because they make the job easier for everyone. If you have any issues with sos report, please open a Service Request with Oracle Linux Support, and we will get it sorted right away.