Introduction

The sosdiff Python tool takes sos reports from two systems and compares them. It can be very helpful for finding differences between a “good” and a “bad” system when you are investigating an operational or performance problem, or simply for validating that two systems are set up in a very similar fashion.

You can also use the tool to compare sos reports from the same system, taken at two different times, to identify what changed.

If you need an introduction to the base sos tool, please refer to the Documentation, or the sos report blog.

Installation

For installation instructions and more information about OLED, please see Oracle Linux Enhanced Diagnostics.

Using sosdiff

Because sosdiff is part of OLED, you need to prefix the command with oled. There is no need to run it as root.

For example, here is how you would view the help information:

$ oled sosdiff -h
usage: sosdiff [-h] [-d] [-o] [-c] [-v] dir1 dir2

sosdiff - Compare 2 sos reports

positional arguments:
  dir1            First sos report directory
  dir2            Second sos report directory

optional arguments:
  -h, --help      show this help message and exit
  -d, --detail    Run with extensive sos report detailed checking (default:
                  False)
  -o, --override  Run despite mis-matched OL version or CPU architecture
                  (default: False)
  -c, --color     Always output color escape sequences (default: False)
  -v, --version   show program's version number and exit

Of these options, three deserve a little more explanation.

Override

As the name suggests, this option makes sosdiff process the two sos reports even when a sanity check fails, such as a mismatch in OL release or CPU architecture.

Example: Try to sosdiff OL8 vs OL9:

$ oled sosdiff sosreport-jtest-ol8-2024-04-19-axlnnxx sosreport-jtest-ol9-2024-12-06-fxvstgd
sosdiff v20250403  Arguments validated .. beginning analysis ...
   First Report                       Second Report
>  jtest-ol8                          jtest-ol9
>  5.4.17-2136.329.3.1.el8uek.x86_64  6.12.0-rc7.master.20241117.ol9.x86_64
ERROR: O/S releases not identical and --override not specified, Exiting.

Example: Try to sosdiff x86_64 vs ARM:

$ oled sosdiff sosreport-jtest-ol8-2024-04-04-wofenpm sosreport-jtest-arm-2024-12-13-jrfheis
sosdiff v20250403  Arguments validated .. beginning analysis ...
   First Report                      Second Report
>  jtest-ol8                         jtest-arm
>  5.15.0-204.147.6.2.el8uek.x86_64  5.15.0-301.163.5.2.el8uek.aarch64
ERROR: Computer arch mis-match and --override not specified, Exiting.

Summary:

Add the --override flag to sosdiff to run the comparison anyway.
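
To make the sanity check concrete, here is a minimal Python sketch of how such a check could work. It is not the sosdiff source: the function names parse_kernel_release and find_mismatches are hypothetical, and reading the OL major version and architecture from the kernel release string is an assumption made for illustration.

```python
import re


def parse_kernel_release(release):
    """Return (OL major version, CPU architecture) from a `uname -r` style string.

    Assumption for this sketch: the architecture is the text after the last dot,
    and the OL major version appears as ".el8", ".el8uek", ".ol9", and so on.
    """
    arch = release.rsplit(".", 1)[-1]
    match = re.search(r"\.(?:el|ol)(\d+)", release)
    major = match.group(1) if match else "unknown"
    return major, arch


def find_mismatches(first_release, second_release):
    """Return a list of human-readable mismatches; empty if the reports look comparable."""
    major1, arch1 = parse_kernel_release(first_release)
    major2, arch2 = parse_kernel_release(second_release)
    mismatches = []
    if major1 != major2:
        mismatches.append(f"O/S release: OL{major1} vs OL{major2}")
    if arch1 != arch2:
        mismatches.append(f"CPU architecture: {arch1} vs {arch2}")
    return mismatches
```

A caller would stop when find_mismatches returns a non-empty list, unless the user asked to override the check.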

Detail

This option reports the differences between the two reports and also dumps all configuration values from the sos reports. The output is very verbose, so no example is shown here.

Color

Adding the --color flag makes sosdiff always output color escape sequences, which highlight the differences between the sos reports. If you pipe the output to less, use less -r to retain the color escape sequences.

Available modules

At this time, sosdiff v20250403 contains the following modules:

alternatives.py		Compare sos_commands/alternatives (alternative software) installed on the systems
cgroups.py		Compare kernel cgroup usage between systems
clocksources.py		Compare kernel clocksource between systems
cmdline.py		Check /proc/cmdline and diff between systems
cron.py			Compare cron jobs between systems
exadata.py		See if this is an Exadata Engineered system and compare imageinfo
hardware.py		Compare lspci between systems 
kdump.py		Check if kdump is setup / enabled and compare systems
kernel_messages.py	Compare kernel dmesg between systems looking for anomalies
kernel_modules.py	Compare lsmod (loaded kernel modules) between systems
lscpu.py		Contrast CPU capabilities between systems
meminfo.py		Do a diff on /proc/meminfo and show key differences between systems
mounts.py		Compare active mount points between systems
network_scripts.py	Diff /etc/sysconfig/network-scripts/ between systems
network_settings.py	Compare sos_commands/networking/ output between systems
network_stats.py	Diff sos_commands/networking/nstat_-zas between sos reports 
rpms.py			Do a diff between installed_rpms on both systems
selinux.py		Check SELinux state between systems
slabinfo.py		Compare kernel Slab usage between systems
sosdate.py		Compare date, Timezone and uptime on both systems
sysctl.py		Diff the kernel sysctl parameters between systems
systemd.py		Compare systemd units between systems
unames.py		Validate directories are sos reports and check OS Release & architecture
unpackaged.py		Display files not part of any RPM package, on both systems 

There are plans to add options to list the available modules and to select which modules you want to run.

Examples

Here is a comparison of two sos reports from the same system, the second taken after a system reboot:

$ oled sosdiff sosreport-test-2025-04-21-hzasqne sosreport-test-2025-04-21-qvbwphy
sosdiff v20250403  Arguments validated .. beginning analysis ...
   First Report                    Second Report
>  test                            test
>  5.15.0-306.177.4.el8uek.x86_64  5.15.0-307.178.5.el8uek.x86_64

Because both reports come from the same system, many sections of the sosdiff output report an INFO message stating “No differences found”, such as:

INFO: No differences found in alternatives comparison.
INFO: No differences found in clocksources comparison.
INFO: No differences found in cron comparison.
INFO: No differences found in kdump comparison.
INFO: No differences found in lspci comparison.
INFO: No differences found in network scripts comparison.
INFO: No differences found in time comparison.
INFO: No differences found in unpackaged comparison.

Other sections do differ, such as the RPM comparison and the kernel modules:

RPM Comparison__________________________________________________________________
   RPM Name                        First Report                            Second Report
>  NetworkManager                  1.40.16-18.0.3.el8_10.x86_64            1.40.16-19.0.1.el8_10.x86_64
>  NetworkManager-config-server    1.40.16-18.0.3.el8_10.noarch            1.40.16-19.0.1.el8_10.noarch
>  NetworkManager-libnm            1.40.16-18.0.3.el8_10.x86_64            1.40.16-19.0.1.el8_10.x86_64
>  NetworkManager-team             1.40.16-18.0.3.el8_10.x86_64            1.40.16-19.0.1.el8_10.x86_64
>  NetworkManager-tui              1.40.16-18.0.3.el8_10.x86_64            1.40.16-19.0.1.el8_10.x86_64
>  bpftool                         5.15.0-305.176.4.el8uek.x86_64          5.15.0-307.178.5.el8uek.x86_64
>  btrfs-progs                     5.15.1-1.el8.x86_64                     5.15.1-2.el8.x86_64
>  cpp                             8.5.0-23.0.1.el8_10.x86_64              8.5.0-26.0.1.el8_10.x86_64
>  device-mapper                   1.02.181-14.0.1.el8.x86_64              1.02.181-15.0.1.el8_10.x86_64
>  device-mapper-event             1.02.181-14.0.1.el8.x86_64              1.02.181-15.0.1.el8_10.x86_64
>  device-mapper-event-libs        1.02.181-14.0.1.el8.x86_64              1.02.181-15.0.1.el8_10.x86_64
>  device-mapper-libs              1.02.181-14.0.1.el8.x86_64              1.02.181-15.0.1.el8_10.x86_64
>  dnf                             4.7.0-20.0.1.el8.noarch                 4.7.0-21.0.1.el8_10.noarch
>  dnf-data                        4.7.0-20.0.1.el8.noarch                 4.7.0-21.0.1.el8_10.noarch
>  expat                           2.2.5-16.0.1.el8_10.x86_64              2.2.5-17.0.1.el8_10.x86_64
>  firewalld                       0.9.11-9.0.1.el8_10.noarch              0.9.11-10.0.1.el8_10.noarch
>  firewalld-filesystem            0.9.11-9.0.1.el8_10.noarch              0.9.11-10.0.1.el8_10.noarch
>  freetype                        2.9.1-9.el8.x86_64                      2.9.1-10.el8_10.x86_64
>  gcc                             8.5.0-23.0.1.el8_10.x86_64              8.5.0-26.0.1.el8_10.x86_64
>  glibc                           2.28-251.0.2.el8_10.13.x86_64           2.28-251.0.3.el8_10.16.x86_64
>  kernel-uek                      5.15.0-306.177.4.el8uek.x86_64          5.15.0-307.178.5.el8uek.x86_64
>  kernel-uek-core                 5.15.0-306.177.4.el8uek.x86_64          5.15.0-307.178.5.el8uek.x86_64
>  kernel-uek-devel                5.15.0-306.177.4.el8uek.x86_64          5.15.0-307.178.5.el8uek.x86_64
>  kernel-uek-modules              5.15.0-306.177.4.el8uek.x86_64          5.15.0-307.178.5.el8uek.x86_64

Kernel Module Comparison________________________________________________________
                Name  Loaded  Loaded
>              bochs  YES     NO
>          bochs_drm  NO      YES
>                cec  YES     NO
>     drm_ttm_helper  YES     NO
>        glue_helper  NO      YES
>  intel_rapl_common  YES     NO
>     intel_rapl_msr  YES     NO
>          libcrc32c  NO      YES
>      nf_tables_set  NO      YES
>        nvme_common  YES     NO
>       sch_fq_codel  NO      YES
>         sha1_ssse3  YES     NO
>       sha256_ssse3  YES     NO
>             t10_pi  YES     NO
>                tls  YES     NO
Total unique kernel modules: 107
Modules missing from the first report: 5
Modules missing from the second report: 10
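
The module counts above follow naturally from set arithmetic. The following Python sketch shows one way to reproduce that kind of summary from two lsmod outputs. It is an illustration only, not the sosdiff implementation: loaded_modules and compare_modules are hypothetical names, and treating "total unique" as the size of the union of both module sets is an assumption.

```python
def loaded_modules(lsmod_text):
    """Return the set of module names found in `lsmod` style output."""
    names = set()
    for line in lsmod_text.splitlines():
        fields = line.split()
        # Skip blank lines and the "Module  Size  Used by" header line.
        if not fields or fields[0] == "Module":
            continue
        names.add(fields[0])
    return names


def compare_modules(first, second):
    """Summarize which modules are loaded in only one of the two reports."""
    return {
        # Assumption: "total unique" counts every module seen in either report.
        "total_unique": len(first | second),
        # Loaded in the second report but not in the first.
        "missing_from_first": sorted(second - first),
        # Loaded in the first report but not in the second.
        "missing_from_second": sorted(first - second),
    }
```

Feeding the two lsmod files from a pair of sos reports through loaded_modules and then compare_modules yields the lists that correspond to the NO/YES and YES/NO rows in the table.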

Here is a comparison of a bare metal system (Intel) versus a KVM virtual machine hosted on AMD. Notice the differences not only in CPU speed, but also in the number of CPUs and the number of cores per socket.

$ oled sosdiff sosreport-devxx-2023-07-12-dfbucbc sosreport-jtest-ol8-2024-04-04-wofenpm
   First Report                     Second Report
>  devxx                            jtest-ol8
>  5.4.17-2136.315.5.el8uek.x86_64  5.15.0-204.147.6.2.el8uek.x86_64

....

lscpu Comparison_______________________________________________________________
   Name                 First Report                               Second Report
>  BIOS Model name      Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz  pc-i440fx-4.2
>  BIOS Vendor ID       Intel                                      QEMU
>  BogoMIPS             4389.94                                    3992.49
>  CPU MHz              2796.177                                   1996.246
>  CPU family           6                                          23
>  CPU max MHz          3100.0000                                  MISSING
>  CPU min MHz          1200.0000                                  MISSING
>  CPU(s)               40                                         4
>  Core(s) per socket   10                                         2
>  Hypervisor vendor    MISSING                                    KVM
>  L1d cache            32K                                        64K
>  L1i cache            32K                                        64K
>  L2 cache             256K                                       512K
>  L3 cache             25600K                                     16384K
>  Model                79                                         1
>  Model name           Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz  AMD EPYC 7551 32-Core Processor
>  NUMA node(s)         2                                          1
>  NUMA node0 CPU(s)    0-9,20-29                                  0-3
>  NUMA node1 CPU(s)    10-19,30-39                                MISSING
>  On-line CPU(s) list  0-39                                       0-3
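
A table like this can be built by treating each lscpu output as a set of key/value pairs and reporting only the keys whose values differ. The Python sketch below illustrates the idea under stated assumptions: parse_lscpu and diff_fields are hypothetical names, not sosdiff functions, and the "MISSING" placeholder is borrowed from the output shown above.

```python
def parse_lscpu(text):
    """Parse `lscpu` style "Key:   value" lines into a dictionary."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so values containing colons stay intact.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def diff_fields(first, second, missing="MISSING"):
    """Return (key, first value, second value) rows for keys whose values differ."""
    rows = []
    for key in sorted(set(first) | set(second)):
        a = first.get(key, missing)
        b = second.get(key, missing)
        if a != b:
            rows.append((key, a, b))
    return rows
```

Keys that are present in only one report show up with the placeholder on the other side, which is how a row such as "Hypervisor vendor" can appear for a virtual machine but not for bare metal.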

Conclusion

Using sosdiff may help you determine why one system is running better than another, or why your application can’t run on a newer system, perhaps because of missing RPM packages or lower kernel sysctl settings. It can also show you many other differences, including hardware, kernel modules, and network configuration. We are always adding features to sosdiff to make the utility more helpful.

References