X

News, tips, partners, and perspectives for the Oracle Solaris operating system

Making yourself indispensible with DTrace

DTrace technology and tools appeared in 2005, and
despite this, it is still not very well known among majority of
developers and system administrators. It is especially surprising given
that for five and a half years since the release of Solaris 10, where
DTrace was first implemented, there was no better technology made for
analyzing of operating systems and applications.

DTrace was
developed by Sun Microsystems, and all of its components were open
sourced in 2005. Later DTrace has been ported to Mac OS X, QNX, and
FreeBSD, and remained available in all the distributions that have
inherited from Solaris: Belenix, Korona, Nexenta, OpenIndiana and Milax.
DTrace is a core part of Solaris, and new DTrace probes were
introduced in Solaris 11 Express, announced in November 2010.

DTrace
is an instrument for system administrators as well as for developers,
and DTrace providers enable probes for many languages, including
scripting ones such as PHP.

DTrace has several unique features, which can not be found being together in other tools, and in particular DTrace:
1. allows to collect information in the system operating under maximum load in production - with low overhead;
2. collects any information from any part of the system, allowing you to observe applications and the kernel as well;
3. can show which arguments are passed from one function to another, even if one does not have source code for the functions;
4.
harvests function calls execution time info, calculates a percentage of
time spent to execute each of them, shows how many times each of the
functions was called, etc.;
5. filters information in a specified
way - for example, lets you restrict the observation scope by an
application, a thread, a particular system call, or another specified
execution unit;
6. may react to certain events (I/O, call of the given function, the completion of programs, starting a new thread, etc.).
7.
has high-level and low-level observation scopes - from observing an
internal functioning of a device driver to monitoring certain events in
PHP scripts execution or method calls in Java applications.
8. allows call trace, with tracking any of options - a run-time arguments passed, etc.

The main component of DTrace is a kernel module, servicing the whole DTrace sybsystem, and the main tool is dtrace,
which accepts scripts written in D, DTrace-specific language with
C-like syntax. Actually, the syntax is something between AWK and C
syntaxes, and those who like both languages will enjoy most.

In
this short article we are not going to dig into details of DTrace
implementation; those who are interested in them, can google more
details, or read DTrace User Guide at http://download.oracle.com/docs/cd/E19253-01/819-5488/index.html.
If you read Russian, you may wish to download Chapter 27 and 28 of the
book "Operating System Solaris" (second edition), written by Yevgeny
Ilyin and me in 2009. The chapters are available in .pdf in the
Internet.

Let's look at some DTrace usage practices. We'll see
how the desire to simplify one's life and indulge the laziness
influences the system administrator or developer to learn and apply new
tools, and how it improves their productivity in times, and
applications preformance exponentially.

Once upon a time at
OSDevCon conference in Dresden, Chad Mynhier led a master class on
DTrace. He has worked for Forsythe, and the company has earned a lot doing banking applications optimization. He told us a story
of a customer he triumphantly tuned applications for, speeding them up
in one thousand times, and breaking the record of fastest earning a
honorarium.

The bank was satisfied with an important
application before, but it significantly slowed down with load growth.
Bank was planning bigger increase of the load, and situation started to
be dangerous, as losses were unacceptable. An analysis of the
application with DTrace did not reveal anything special initially, but
function calls frequency distribution looked suspicious: gettimeofday()
call, returning the current system time, was the most demanded call.
Databases often call this function, because they often record a time at
which the transaction occurred, but this function is somehow dominated
the calls ... Chad located a particular portion of the code in the
application using DTrace, where the function was called exteremely
often, and it turned out that it was called by developer's mistake in a
cycle, which, in turn, was involved in almost every application's
transaction.

Probably, gettimeofday() call remained from some
sort of debug output, or has been put in a loop by mistake, but anyway,
the error was not revealed in the testing phase and it affected the
performance when the application's load exceeded a certain level only.
Before that, a powerful server was able to handle tens of thousands of
gettimeofday() calls in a second.

DTrace is necessary not
only to find problems or monitor performance, and I often use DTrace
when I'm too lazy to search and read the documentation, or when the
documentation is not complete, or documentation and source code are not
available, or I need to understand how the application works.

In D scripting language threre is always following syntax:

provider:module:function:probe
/ predicate /
{
actions
}


Provider is a kernel module or an application module, which registers
its DTrace probes in the system. For example, the syscall provider
registers the probes located in system calls,
mysql<mysql-server-PID> provider - do it for probes embedded into
MySQL server.
Module is a name of the module or library, for example, libc.
Function is function name, a where probe is located, for example, fopen.

Probe is a probe name (in many cases, the probe is called "entry" or "return").


If the predicate evaluates to true, DTrace performs actions. For
example, if it is necessary to do it when the probe was reached while
running an application "top" only, the predicate would be

/execname == top /

Actions are the list of statements used to harvest information, display/print it, or do aggregations.


A simple example of how I use DTrace in everyday life: after moving
user settings from one computer to another one, a printer stopped
working. It was clear that a configuration file has wrong settings for
some reason. But which file exactly? There are many of them. I started
to look for a suspicious file in  /etc/printers and /etc/lp/*,
but it did not give any clue. Then I moved to DTrace. Simple script
allows you to look inside of lpstat (the most innocuous of programs
working with a printer) and see which files it tries to open:

#lpstat-s
scheduler is running
system default printer: eaqvap21
aqvap21: unknown printer
aqvap21: unknown printer

To run the program and give its PID to a DTrace script one should use -c:

# dtrace-n 'pid $ target:: fopen: entry {printf ("% s", copyinstr (arg0 ));}'-c "lpstat-s"
dtrace: description 'pid$target::fopen:entry ' matched 1 probe
scheduler is running
system default printer: eaqvap21
aqvap21: unknown printer
aqvap21: unknown printer
dtrace: pid 11156 has exited
CPU ID FUNCTION:NAME
0 59882 fopen:entry /etc/default/init
0 59882 fopen:entry /etc/lp/ppd/eaqvap21.ppd
0 59882 fopen:entry /export/home/filip/.printers
0 59882 fopen:entry /export/home/filip/.printers
0 59882 fopen:entry /export/home/filip/.printers
0 59882 fopen:entry /etc/printers.conf
0 59882 fopen:entry /export/home/filip/.printers
0 59882 fopen:entry /etc/printers.conf
1 59882 fopen:entry /etc/nsswitch.conf
1 59882 fopen:entry /export/home/filip/.printers
1 59882 fopen:entry /export/home/filip/.printers
1 59882 fopen:entry /etc/printers.conf

That's all: I forgot that I had to remove .printers file in my home directory:

rm /export/home/filip/.printers

done! Everything works now!


Note: you have to give lpstat's PID to the script because pid provider
requires PID of the process we examine, and fopen function can be
checked with pid provider only.


Sometimes I use another modification of the script:

pfexec dtrace -n 'syscall:: open *: entry {printf ("% s \ n", copyinstr (arg0 ));}'


This script brings a bunch of data from all system calls that begin
with "open", which is inconvenient, but it is easier to remember it and
simply to use with already running applications, and information
received may be sufficient for analysis.

Finally, for those who
debugs Web applications, a script catching SQL-operators before
the server executes them can be useful. It allows us to find out exactly
which operators are to be executed; it helps if it appears that the
script presumably works with the database correctly, but the database
returns an unexpected result: it may be that either incorrect SQL
expression, or connection is made with wrong database, or a parameter
is passed to a DB server with improper syntax (for example, without
quotes or with extra quotation marks).

Here is the script:

#!/usr/sbin/dtrace -s
#pragma D option quiet
dtrace:::BEGIN
{
printf("%-20s %-20s %-40s %-9s\n", "Who", "Database", "Query", "Time(ms)");
}
mysql*:::query-start
{
self->query = copyinstr(arg0);
self->connid = arg1;
self->db = copyinstr(arg2);
self->who = strjoin(copyinstr(arg3),strjoin("@",copyinstr(arg4)));
self->querystart = timestamp;
}

mysql*:::query-done
{
printf("%-20s %-20s %-40s %-9d\n",self->who,self->db,self->query,
(timestamp - self->querystart) / 1000000);
}


Running it, we can observe what the server is doing (be careful! There
can be a lot of information if the server is heavily loaded with
requests!):

$ pfexec ./mysql.d
Who Database Query Time(ms)
root@localhost data SELECT DATABASE() 0
root@localhost test show tables 0
root@localhost test SELECT DATABASE() 0
root@localhost mysql select * from users 0
root@localhost mysql select * from host 0



A time taken for a SQL statement execution can be measured as a
difference between triggering query-start and query-done, and DTrace
allows to plot the distribution of time among statements. One can find
the database calls that take a lot of time, affecting performance of an
application. The great advantage of DTrace is that all these
measurements can be performed in production, because the overhead of
DTrace is minimal and will not interfere with the server.

These
examples show that DTrace probes can be put in Solaris applications
everywhere. Please, refer to manual for operating system (man) and the
documentation for specific applications (such as MySQL server on site
mysql.com) to learn which arguments associated with probes can be used
as shown above.

In case you wish to familarize with more examples of D-scripts please take a look at:
/opt/DTT/ (scripts are there by default in Solaris 11 Express)
http://blogs.sun.com/brendan/category/DTrace
http://www.brendangregg.com/DTrace/dtrace_oneliners.txt


Before you start playing with dtrace please note that you must have one
or more DTrace privileges in the operating system in order to run dtrace, regular users can not use dtrace for security reasons. A system administrator can give a right to use dtrace to a particular user, but it is not done by default. Root has administrative privileges, which is enough to run dtrace.

Join the discussion

Comments ( 1 )
  • shmerl Monday, July 4, 2011

    DTrace is not less known to the masses as Solaris at large. If DTrace would be widely available for example on Linux, it'll get the deserved appreciation. Alas, DTrace port on Linux is very far from mature.


Please enter your name.Please provide a valid email address.Please enter a comment.CAPTCHA challenge response provided was incorrect. Please try again.Captcha
Oracle

Integrated Cloud Applications & Platform Services