DTrace update to 0.2

We just put an updated version of DTrace on ULN.
This is version 0.2, another preview release (hence the 0.x version number).

To test it out:

Register a server with ULN and add the following channel to the server list: ol6_x86_64_Dtrace_BETA.
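
Registration itself is normally done with the uln_register tool on the server, and the channel subscription is then managed from the ULN web interface (just a sketch of the usual flow; details may vary with your setup):

# uln_register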

When you have that channel registered with your server, you can install the following required RPMs:

dtrace-modules-2.6.39-101.0.1.el6uek
dtrace-utils
kernel-uek-headers-2.6.39-101.0.1.el6uek.x86_64
kernel-uek-devel-2.6.39-101.0.1.el6uek.x86_64
kernel-uek-2.6.39-101.0.1.el6uek.x86_64
kernel-uek-firmware-2.6.39-101.0.1.el6uek.noarch
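
Assuming the server is registered and the channel above is subscribed, the packages can be pulled in with yum, for example (package names as listed above; yum resolves the architecture suffixes):

# yum install dtrace-utils dtrace-modules-2.6.39-101.0.1.el6uek \
      kernel-uek-2.6.39-101.0.1.el6uek kernel-uek-devel-2.6.39-101.0.1.el6uek \
      kernel-uek-headers-2.6.39-101.0.1.el6uek kernel-uek-firmware-2.6.39-101.0.1.el6uek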

Once the RPMs are installed, reboot the server into the correct kernel: 2.6.39-101.0.1.el6uek.
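
After the reboot, a quick check that the expected kernel is running:

# uname -r
2.6.39-101.0.1.el6uek.x86_64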

The DTrace modules are installed in /lib/modules/2.6.39-101.0.1.el6uek.x86_64/kernel/drivers/dtrace.

# cd /lib/modules/2.6.39-101.0.1.el6uek.x86_64/kernel/drivers/dtrace
# ls

dtrace.ko  dt_test.ko  profile.ko  sdt.ko  systrace.ko

Load the DTrace modules into the running kernel:
# modprobe dtrace 
# modprobe profile
# modprobe sdt
# modprobe systrace
# modprobe dt_test
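
To confirm that they all loaded, something along these lines works:

# lsmod | grep -E 'dtrace|profile|sdt|systrace|dt_test'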

The DTrace compiler is in /usr/sbin/dtrace. There are a few README files in /usr/share/doc/dtrace-0.2.4; these explain what's there and what's not yet there.
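
As a quick smoke test of the utility itself (option behaviour follows the familiar dtrace(1M) command, so treat this as a sketch):

# /usr/sbin/dtrace -V
# /usr/sbin/dtrace -l | head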

New features:

- The SDT provider is implemented, providing in-kernel static probes. Some of the proc provider is implemented using this facility.
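
With the sdt module loaded, the probes this provider makes available can be listed with the dtrace utility (standard dtrace(1M) usage; a sketch):

# dtrace -l -P proc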

Bugfixes:

- Syscall tracing of stub-based syscalls (such as fork, clone, exit, and sigreturn) now works.
- Invalid memory accesses inside D scripts no longer cause oopses or panics (a quick check for this is sketched after this list).
- Memory exhaustion inside D scripts no longer emits spurious oopses.
- Several crash fixes.
- Fixes to arithmetic inside aggregations, fixing quantize().
- Improvements to the installed headers.
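
A quick way to exercise the invalid-memory-access fix mentioned above is a deliberately bad dereference; dtrace should report an in-script error along the lines of "invalid address (0x0)" rather than the kernel oopsing (stop it with Ctrl-C):

# dtrace -n 'BEGIN { trace(*(int *)0); }'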

We are also getting pretty good coverage of both userspace and kernel with the DTrace test suites.
Thanks to the team working on it!

Below are a few examples with output and source code for the .d scripts.


activity.d    - this shows ongoing activity in terms of what program was executing,
          what its parent is, and how long it ran. This makes use of the proc SDT provider.
pstrace.d     - this is similar, but instead of providing timing it lists the ancestry
          of a process, based on whatever history is collected during the DTrace runtime
          of this script. This also makes use of the proc SDT provider.
rdbufsize.d   - this shows quantized results for buffer sizes used in read syscalls,
          i.e. it gives a statistical breakdown of the sizes passed to the read() syscall,
          which can be useful to see what buffer sizes are commonly used.

=====================
activity.d
=====================

#pragma D option quiet

proc:::create
{
    this->pid = *((int *)arg0 + 171);

    time[this->pid] = timestamp;
    p_pid[this->pid] = pid;
    p_name[this->pid] = execname;
    p_exec[this->pid] = "";
}

proc:::exec
/p_pid[pid]/
{
    p_exec[pid] = stringof(arg0);
}

proc:::exit
/p_pid[pid] && p_exec[pid] != ""/
{
    printf("%d: %s (%d) executed %s (%d) for %d msecs\n",
           timestamp, p_name[pid], p_pid[pid], p_exec[pid], pid,
           (timestamp - time[pid]) / 1000);
}

proc:::exit
/p_pid[pid] && p_exec[pid] == ""/
{
    printf("%d: %s (%d) forked itself (as %d) for %d msecs\n",
           timestamp, p_name[pid], p_pid[pid], pid,
           (timestamp - time[pid]) / 1000);

}


=============
pstrace.d
=============


#pragma D option quiet

proc:::create
{
    this->pid = *((int *)arg0 + 171);

    p_pid[this->pid] = pid;
    p_name[this->pid] = execname;
    p_exec[this->pid] = "";
    path[this->pid] = strjoin(execname, " ->  ");
}

proc:::create
/p_pid[pid]/
{
    this->pid = *((int *)arg0 + 171);

    path[this->pid] = strjoin(path[pid], " ->  ");
}

proc:::exec
/p_pid[pid]/
{
    this->path = basename(stringof(arg0));

    path[pid] = strjoin(p_name[pid], strjoin(" ->  ", this->path));
    p_exec[pid] = this->path;
}

proc:::exit
/p_pid[pid] && p_exec[pid] != ""/
{
    printf("%d: %s[%d] ->  %s[%d]\n",
           timestamp, p_name[pid], p_pid[pid], p_exec[pid], pid);

    p_name[pid] = 0;
    p_pid[pid] = 0;
    p_exec[pid] = 0;
    path[pid] = 0;
}

proc:::exit
/p_pid[pid] && p_exec[pid] == ""/
{
    printf("%d: %s[%d] ->  [%d]\n",
           timestamp, p_name[pid], p_pid[pid], pid);

    p_name[pid] = 0;
    p_pid[pid] = 0;
    p_exec[pid] = 0;
    path[pid] = 0;
}

proc:::create
/path[pid] != ""/
{
    this->pid = *((int *)arg0 + 171);

    p_name[this->pid] = path[pid];
}


==================
rdbufsize.d
==================

syscall::read:entry
{
    @["read"] = quantize(arg2);
}

Since we do not yet have CTF support, the scripts use raw memory access to get at the pid field in the task_struct:

this->pid = *((int *)arg0 + 171);

Here arg0 is a pointer to the new struct task_struct (it is passed to the proc:::create probe when a new task/thread/process is created).
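
If you need the equivalent constant for a different kernel build, one way to derive it is from the kernel debug information, assuming the matching debuginfo package is installed (the path is illustrative and the offset will differ between kernels):

# gdb -q /usr/lib/debug/lib/modules/2.6.39-101.0.1.el6uek.x86_64/vmlinux
(gdb) print/d &((struct task_struct *)0)->pid

The printed byte offset divided by sizeof(int) gives the constant to use; 171 in the scripts above corresponds to a byte offset of 684 on this kernel.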

Just cut and paste these scripts into text files and run them; I have some sample output below:

activity.d (here I just run some commands in a separate shell, which then show up in the output)
# dtrace -s activity.d
2134889238792594: automount (1736) forked itself (as 11484) for 292 msecs
2134912932312379: bash (11488) forked itself (as 11489) for 1632 msecs
2134912934171504: bash (11488) forked itself (as 11491) for 1319 msecs
2134912937531743: bash (11488) forked itself (as 11493) for 2150 msecs
2134912939231853: bash (11488) forked itself (as 11496) for 1366 msecs
2134912945152337: bash (11488) forked itself (as 11499) for 1135 msecs
2134912948946944: bash (11488) forked itself (as 11503) for 1285 msecs
2134912923230099: sshd (11485) forked itself (as 11486) for 8790195 msecs
2134912932092719: bash (11489) executed /usr/bin/id (11490) for 1005 msecs
2134912945773882: bash (11488) forked itself (as 11501) for 328 msecs
2134912937325453: bash (11493) executed /usr/bin/tput (11495) for 721 msecs
2134912941951947: bash (11488) executed /bin/grep (11498) for 1418 msecs
2134912933963262: bash (11491) executed /bin/hostname (11492) for 804 msecs
2134912936358611: bash (11493) executed /usr/bin/tty (11494) for 626 msecs
2134912939035204: bash (11496) executed /usr/bin/dircolors (11497) for 789 msecs
2134912944986994: bash (11499) executed /bin/uname (11500) for 621 msecs
2134912946568141: bash (11488) executed /bin/grep (11502) for 1003 msecs
2134912948757031: bash (11503) executed /usr/bin/id (11504) for 796 msecs
2134913874947141: ksmtuned (1867) forked itself (as 11505) for 2189 msecs
2134913883976223: ksmtuned (11507) executed /bin/awk (11509) for 8056 msecs
2134913883854384: ksmtuned (11507) executed /bin/ps (11508) for 8122 msecs
2134913884227577: ksmtuned (1867) forked itself (as 11507) for 9025 msecs
2134913874664300: ksmtuned (11505) executed /bin/awk (11506) for 1307 msecs
2134919238874188: automount (1736) forked itself (as 11511) for 263 msecs
2134920459512267: bash (11488) executed /bin/ls (11512) for 1682 msecs
2134930786318884: bash (11488) executed /bin/ps (11513) for 7241 msecs
2134933581336279: bash (11488) executed /bin/find (11514) for 161853 msecs


pstrace.d (as daemons or shells/users execute binaries, they show up automatically)
# dtrace -s pstrace.d 
2134960378397662: bash[11488] ->  ps[11517]
2134962360623937: bash[11488] ->  ls[11518]
2134964238953132: automount[1736] ->  [11519]
2134965712514625: bash[11488] ->  df[11520]
2134971432047109: bash[11488] ->  top[11521]
2134973888279789: ksmtuned[1867] ->  [11522]
2134973897131858: ksmtuned ->  [11524] ->  awk[11526]
2134973896999204: ksmtuned ->  [11524] ->  ps[11525]
2134973897400622: ksmtuned[1867] ->  [11524]
2134973888019910: ksmtuned ->  [11522] ->  awk[11523]
2134981995742661: sshd ->  sshd ->  bash[11531] ->  [11532]
2134981997448161: sshd ->  sshd ->  bash[11531] ->  [11534]
2134982000599413: sshd ->  sshd ->  bash[11531] ->  [11536]
2134982002035206: sshd ->  sshd ->  bash[11531] ->  [11539]
2134982007815639: sshd ->  sshd ->  bash[11531] ->  [11542]
2134982011627125: sshd ->  sshd ->  bash[11531] ->  [11546]
2134981989026168: sshd ->  sshd[11529] ->  [11530]
2134982008472173: sshd ->  sshd ->  bash[11531] ->  [11544]
2134981995518210: sshd ->  sshd ->  bash ->  [11532] ->  id[11533]
2134982000393612: sshd ->  sshd ->  bash ->  [11536] ->  tput[11538]
2134982004531164: sshd ->  sshd ->  bash[11531] ->  grep[11541]
2134981997256114: sshd ->  sshd ->  bash ->  [11534] ->  hostname[11535]
2134981999476476: sshd ->  sshd ->  bash ->  [11536] ->  tty[11537]
2134982001865119: sshd ->  sshd ->  bash ->  [11539] ->  dircolors[11540]
2134982007610268: sshd ->  sshd ->  bash ->  [11542] ->  uname[11543]
2134982009271769: sshd ->  sshd ->  bash[11531] ->  grep[11545]
2134982011408808: sshd ->  sshd ->  bash ->  [11546] ->  id[11547]


rdbufsize.d (in another shell I just did some random read operations and this shows a summary)
# dtrace -s rdbufsize.d 
dtrace: script 'rdbufsize.d' matched 1 probe
^C

  read                                              
           value  ------------- Distribution ------------- count    
              -1 |                                         0        
               0 |                                         8        
               1 |                                         59       
               2 |                                         209      
               4 |                                         72       
               8 |                                         488      
              16 |                                         67       
              32 |                                         1074     
              64 |                                         113      
             128 |                                         88       
             256 |                                         384      
             512 |@@@                                      6582     
            1024 |@@@@@@@@@@@@@@@@@@                       44787    
            2048 |@                                        2419     
            4096 |@@@@@@@                                  16239    
            8192 |@@@@                                     10395    
           16384 |@@@@@@                                   14784    
           32768 |                                         427      
           65536 |                                         669      
          131072 |                                         143      
          262144 |                                         43       
          524288 |                                         46       
         1048576 |                                         92       
         2097152 |                                         196      
         4194304 |                                         0   

Comments:

Source code please?

Posted by guest on February 22, 2012 at 04:44 PM PST #

There is a brand new OTN forum for this preview of DTrace on Linux at: https://forums.oracle.com/forums/forum.jspa?forumID=1398

(The Solaris DTrace forum is at https://forums.oracle.com/forums/forum.jspa?forumID=1017)

Posted by guest on February 22, 2012 at 10:11 PM PST #

Thanks for bringing such a great tool to Linux. Once UEK2 is production ready, will it be immediately certified for the 11g DB? Also, when do you see UEK2 making its way to the Exadata platform?

Posted by guest on February 23, 2012 at 08:20 AM PST #

When UEK2 is production ready, it will be certified for Oracle Database 11g Release 2 Enterprise Edition, and Oracle needs to do it all before June this year, or else Oracle needs to pull this result:

www.tpc.org/tpcc/results/tpcc_result_detail.asp?id=112011801

Open Grid Scheduler/Scalable Grid Engine

Posted by guest on February 23, 2012 at 09:15 PM PST #

The other alternative to Oracle's work on DTrace is my initiative, which appears to be significantly further advanced than the Oracle one.

See my link for access to the sources or see github (dtrace4linux).

It would be interesting to know what Oracle's plans are and whether they will share the code, as we could actually help each other.

Posted by paul fox on February 25, 2012 at 02:15 PM PST #
