X

News, tips, partners, and perspectives for Oracle’s virtualization offerings

DTrace on Oracle Linux

Jeff Savit
Product Management Senior Manager

DTrace - dynamic tracing on Oracle Linux

I like to joke that "all performance problems are either trivial or unsolvable", but that's really not true. While many performance issues can be diagnosed using standard tools like vmstat, mpstat, iostat, prstat, perf, and so on, sometimes you need to inspect the internal behavior of the system to understand what's going on. DTrace, the fantastic dynamic tracing tool introduced with Solaris, is ideal for this. While I haven't focussed on DTrace, I've blogged in the past on how I used it to discover interesting things about Oracle VM Server for SPARC live migration and internal workings of the Hercules emulator.  In one of those blogs I refer to the '*stat' tools as a stethoscope, while DTrace is the MRI you deploy when needed for deep information.

In that spirit, I'm very happy to see that DTrace continues to be enhanced on Oracle Linux. It's been around for some time, but has been increasing in scope and capability, such as recently adding the lockstat provider. This is very useful and interesting, and lets Oracle Linux uniquely provide capabilities for understanding the behavior of Linux systems.

Getting Started - Installing

First, install Oracle Linux and use the Unbreakable Enterprise Kernel (UEK). While much of this and subsequent blog entries apply to previous UEK versions, I'll use the latest UEK5 sneak preview described in Wim Coekaerts' blog. Follow the instructions there to install UEK5 and then simply yum install dtrace-utils to install DTrace. That's all there is to it.

Getting Started - simple examples

I start with DTrace on Oracle Linux the same way I do on Solaris, with 'one-liner' commands or simple scripts, and delving deeper as needed. Even trivial DTrace commands can show a lot. In this blog I'll show examples entered directly from the command line, rather than complex and powerful stored scripts.

The cliche first script below simply shows which processes are doing the most system calls. It fit on one line, but I split it here for readability. The syscall probe counts system calls by execname into a DTrace aggregate to show who issues the most system calls. The END probe fires when I stop the script, and truncates the aggregate to the highest and most interesting entries  I ran this while starting a large Java application (Oracle VM Manager, which runs under WebLogic), which explains why it's the top syscall user. Note: the exact same script works on Solaris.

System calls

# dtrace -n 'syscall:::entry { @num[execname] = count();}
    END {trunc(@num,10)}'
dtrace: description 'syscall:::entry ' matched 320 probes
^C
CPU     ID                    FUNCTION:NAME
  3      2                             :END

 dockerd                              7090
 postqueue                            7868
 tc                                   9618
 rpm                                 10148
 bash                                16562
 mysqld                              16901
 setroubleshootd                     31843
netdata                             205353
apps.plugin                         533458
java                                716525

 

Now let's find out which system calls are being executed:

which system calls

# dtrace -n 'syscall:::entry { @num[execname,probefunc] = count();} END {trunc(@num,10)}'
dtrace: description 'syscall:::entry ' matched 320 probes
^C
CPU     ID                    FUNCTION:NAME
  1      2                             :END

  apps.plugin           getdents       23604
  netdata                 read             31241
  apps.plugin           open            35258
  apps.plugin           lseek           35290
  apps.plugin           close           47048
  java                      futex            50128
  java                      newstat        50384
  apps.plugin           read             70594
  java                      lseek          130673
  java                      read           263427

 

That shows which are the hottest system calls, by caller.  Just for quick fun, let's see which files are being opened by Java (output randomly edited to just show some of the many items). Note the use of a predicate to restrict to system calls made by the Java process:

Which files are being opened?

# dtrace -qn 'syscall::open*:entry /execname=="java"/ { printf("%s \n",copyinstr(arg0)); }'

/u01/app/oracle/java/bin/../lib/amd64/jli/tls/x86_64/libpthread.so.0
/u01/app/oracle/java/bin/../lib/amd64/jli/tls/libpthread.so.0
/u01/app/oracle/java/bin/../lib/amd64/jli/x86_64/libpthread.so.0

/u01/app/oracle/java/jre/lib/rt.jar
/etc/passwd
/etc/localtime

I did the same for 'apps.plugin', and saw entries for /proc/pid/stat, because I was running Netdata, which I had forgotten.

If motivated I could drill further, such as looking at call stacks, or seeing which files are used with lseek, and read, or compute system calls per second. Or, see how locks are being managed with futex, such as how long they are held. Lock management will probably be in a later article.

Another simple example

Let's change gears. The next one-liner shows which processes are being started (again, the identical script works on Solaris). I immediately see the following (many limes omitted). That's pretty cool - you can see all this activity going on.

proc:::exec-success

#  dtrace -n 'proc:::exec-success { trace(curpsinfo->pr_psargs); }'

dtrace: description 'proc:::exec-success ' matched 1 probe
CPU     ID                    FUNCTION:NAME
  1   1864  do_execveat_common:exec-success   /usr/sbin/tc -s qdisc show dev eth0
  0   1864  do_execveat_common:exec-success   rpm -q selinux-policy                      
  0   1864  do_execveat_common:exec-success   /usr/bin/python -Es /usr/sbin/setroubleshootd -f
  0   1864  do_execveat_common:exec-success   basename /etc/cron.hourly/0anacron

...

 

These are very simple, trivial examples - just barely scratching the surface, but show how a lot of information can be obtained.

Summary.

Oracle Linux incorporates the Dynamic Tracing (DTrace) feature pioneered in Solaris. DTrace is an exceptionally powerful tool for finding out what is going on in your system.

Later articles will show more interesting (I hope) examples, and illustrate some of the features recently added to Oracle Linux. For further information, see the DTrace Guide and the DTrace Tutorial.

Be the first to comment

Comments ( 0 )
Please enter your name.Please provide a valid email address.Please enter a comment.CAPTCHA challenge response provided was incorrect. Please try again.Captcha

Recent Content

Oracle

Integrated Cloud Applications & Platform Services