Wednesday Feb 22, 2012

DTrace update to 0.2

We just put an updated version of DTrace on ULN.
This is version 0.2, another preview (hence the 0.x...)

To test it out :

Register a server with ULN and add the following channel to the server list : ol6_x86_64_Dtrace_BETA.

When you have that channel registered with your server, you can install the following required RPMs :

dtrace-modules-2.6.39-101.0.1.el6uek
dtrace-utils
kernel-uek-headers-2.6.39-101.0.1.el6uek.x86_64
kernel-uek-devel-2.6.39-101.0.1.el6uek.x86_64
kernel-uek-2.6.39-101.0.1.el6uek.x86_64
kernel-uek-firmware-2.6.39-101.0.1.el6uek.noarch

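With the channel registered, a single yum command should pull in the whole set (a sketch, using the versions listed above):

# yum install dtrace-utils dtrace-modules-2.6.39-101.0.1.el6uek \
      kernel-uek-2.6.39-101.0.1.el6uek kernel-uek-headers-2.6.39-101.0.1.el6uek \
      kernel-uek-devel-2.6.39-101.0.1.el6uek kernel-uek-firmware-2.6.39-101.0.1.el6uek
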
Once the RPMs are installed, reboot the server into the correct kernel : 2.6.39-101.0.1.el6uek.

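After the reboot, a quick way to confirm you are running the right kernel:

# uname -r
2.6.39-101.0.1.el6uek.x86_64
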
The DTrace modules are installed in /lib/modules/2.6.39-101.0.1.el6uek.x86_64/kernel/drivers/dtrace.

# cd /lib/modules/2.6.39-101.0.1.el6uek.x86_64/kernel/drivers/dtrace
# ls

dtrace.ko  dt_test.ko  profile.ko  sdt.ko  systrace.ko

Load the DTrace modules into the running kernel:
# modprobe dtrace 
# modprobe profile
# modprobe sdt
# modprobe systrace
# modprobe dt_test

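You can verify that they all loaded with lsmod:

# lsmod | grep -e dtrace -e profile -e sdt -e systrace -e dt_test
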
The DTrace compiler is in /usr/sbin/dtrace. There are a few README files in /usr/share/doc/dtrace-0.2.4; these explain what's there and what's not yet there...

New features:

- The SDT provider is implemented, providing in-kernel static probes. Some of the proc provider is implemented using this facility.

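For a quick look at the new SDT-based proc probes, here is a one-liner sketch (it relies on arg0 of proc:::exec being the path string, just like the scripts further down):

# dtrace -n 'proc:::exec { trace(stringof(arg0)); }'
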
Bugfixes:

- Syscall tracing of stub-based syscalls (such as fork, clone, exit, and sigreturn) now works.
- Invalid memory accesses inside D scripts no longer cause oopses or panics.
- Memory exhaustion inside D scripts no longer emits spurious oopses.
- Several crash fixes.
- Fixes to arithmetic inside aggregations, fixing quantize().
- Improvements to the installed headers.

We are also getting pretty good coverage on both userspace and kernel in terms of the DTrace testsuites.
Thanks to the team working on it!

Below are a few examples with output and source code for the .d scripts.


activity.d    - this shows ongoing activity in terms of what program was executing, 
          what its parent is, and how long it ran. This makes use of the proc SDT provider.
pstrace.d     - this is similar but instead of providing timing, it lists the ancestry
          of a process, based on whatever history is collected during the DTrace runtime
          of this script.  This makes use of the proc SDT provider.
rdbufsize.d   - this shows quantized results for buffer sizes used in read syscalls,
          i.e. it gives a statistical breakdown of sizes passed in the read() syscall, 
          which can be useful to see what buffer sizes are commonly used.

=====================
activity.d
=====================

#pragma D option quiet

proc:::create
{
    /*
     * arg0 points at the new task's task_struct; with no CTF support
     * yet, we read its pid field by raw offset (see the note below
     * the scripts).
     */
    this->pid = *((int *)arg0 + 171);

    time[this->pid] = timestamp;
    p_pid[this->pid] = pid;
    p_name[this->pid] = execname;
    p_exec[this->pid] = "";
}

proc:::exec
/p_pid[pid]/
{
    p_exec[pid] = stringof(arg0);
}

proc:::exit
/p_pid[pid] && p_exec[pid] != ""/
{
    /* timestamp is in nanoseconds, so dividing by 1000 yields usecs */
    printf("%d: %s (%d) executed %s (%d) for %d usecs\n",
           timestamp, p_name[pid], p_pid[pid], p_exec[pid], pid,
           (timestamp - time[pid]) / 1000);
}

proc:::exit
/p_pid[pid] && p_exec[pid] == ""/
{
    printf("%d: %s (%d) forked itself (as %d) for %d usecs\n",
           timestamp, p_name[pid], p_pid[pid], pid,
           (timestamp - time[pid]) / 1000);
}


=============
pstrace.d
=============


#pragma D option quiet

proc:::create
{
    this->pid = *((int *)arg0 + 171);

    p_pid[this->pid] = pid;
    p_name[this->pid] = execname;
    p_exec[this->pid] = "";
    path[this->pid] = strjoin(execname, " ->  ");
}

proc:::create
/p_pid[pid]/
{
    this->pid = *((int *)arg0 + 171);

    path[this->pid] = strjoin(path[pid], " ->  ");
}

proc:::exec
/p_pid[pid]/
{
    this->path = basename(stringof(arg0));

    path[pid] = strjoin(p_name[pid], strjoin(" ->  ", this->path));
    p_exec[pid] = this->path;
}

proc:::exit
/p_pid[pid] && p_exec[pid] != ""/
{
    printf("%d: %s[%d] ->  %s[%d]\n",
           timestamp, p_name[pid], p_pid[pid], p_exec[pid], pid);

    p_name[pid] = 0;
    p_pid[pid] = 0;
    p_exec[pid] = 0;
    path[pid] = 0;
}

proc:::exit
/p_pid[pid] && p_exec[pid] == ""/
{
    printf("%d: %s[%d] ->  [%d]\n",
           timestamp, p_name[pid], p_pid[pid], pid);

    p_name[pid] = 0;
    p_pid[pid] = 0;
    p_exec[pid] = 0;
    path[pid] = 0;
}

proc:::create
/path[pid] != ""/
{
    this->pid = *((int *)arg0 + 171);

    p_name[this->pid] = path[pid];
}


==================
rdbufsize.d
==================

syscall::read:entry
{
    @["read"] = quantize(arg2);
}

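A small variation on the same idea (just a sketch) keys the aggregation on the process name instead, so you can see which programs use which read sizes:

syscall::read:entry
{
    @[execname] = quantize(arg2);
}
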
Since we do not yet have CTF support, the scripts use raw memory access to get at the pid field in the task_struct:

this->pid = *((int *)arg0 + 171);

where arg0 is a pointer to struct task_struct (arg0 is passed to the proc:::create probe when a new task/thread/process is created).

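For comparison, once CTF support lands, the same lookup should be expressible with proper types instead of a hard-coded offset; a sketch of that form (not yet supported in this preview, and note that the offset of 171 ints is specific to this kernel build):

this->pid = ((struct task_struct *)arg0)->pid;
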
Just cut'n'paste these scripts into text files and run them. I have some sample output below :

activity.d (here I just ran some commands in a separate shell, which then show up in the output)
# dtrace -s activity.d
2134889238792594: automount (1736) forked itself (as 11484) for 292 usecs
2134912932312379: bash (11488) forked itself (as 11489) for 1632 usecs
2134912934171504: bash (11488) forked itself (as 11491) for 1319 usecs
2134912937531743: bash (11488) forked itself (as 11493) for 2150 usecs
2134912939231853: bash (11488) forked itself (as 11496) for 1366 usecs
2134912945152337: bash (11488) forked itself (as 11499) for 1135 usecs
2134912948946944: bash (11488) forked itself (as 11503) for 1285 usecs
2134912923230099: sshd (11485) forked itself (as 11486) for 8790195 usecs
2134912932092719: bash (11489) executed /usr/bin/id (11490) for 1005 usecs
2134912945773882: bash (11488) forked itself (as 11501) for 328 usecs
2134912937325453: bash (11493) executed /usr/bin/tput (11495) for 721 usecs
2134912941951947: bash (11488) executed /bin/grep (11498) for 1418 usecs
2134912933963262: bash (11491) executed /bin/hostname (11492) for 804 usecs
2134912936358611: bash (11493) executed /usr/bin/tty (11494) for 626 usecs
2134912939035204: bash (11496) executed /usr/bin/dircolors (11497) for 789 usecs
2134912944986994: bash (11499) executed /bin/uname (11500) for 621 usecs
2134912946568141: bash (11488) executed /bin/grep (11502) for 1003 usecs
2134912948757031: bash (11503) executed /usr/bin/id (11504) for 796 usecs
2134913874947141: ksmtuned (1867) forked itself (as 11505) for 2189 usecs
2134913883976223: ksmtuned (11507) executed /bin/awk (11509) for 8056 usecs
2134913883854384: ksmtuned (11507) executed /bin/ps (11508) for 8122 usecs
2134913884227577: ksmtuned (1867) forked itself (as 11507) for 9025 usecs
2134913874664300: ksmtuned (11505) executed /bin/awk (11506) for 1307 usecs
2134919238874188: automount (1736) forked itself (as 11511) for 263 usecs
2134920459512267: bash (11488) executed /bin/ls (11512) for 1682 usecs
2134930786318884: bash (11488) executed /bin/ps (11513) for 7241 usecs
2134933581336279: bash (11488) executed /bin/find (11514) for 161853 usecs


pstrace.d (as daemons or shells/users execute binaries, they show up automatically)
# dtrace -s pstrace.d 
2134960378397662: bash[11488] ->  ps[11517]
2134962360623937: bash[11488] ->  ls[11518]
2134964238953132: automount[1736] ->  [11519]
2134965712514625: bash[11488] ->  df[11520]
2134971432047109: bash[11488] ->  top[11521]
2134973888279789: ksmtuned[1867] ->  [11522]
2134973897131858: ksmtuned ->  [11524] ->  awk[11526]
2134973896999204: ksmtuned ->  [11524] ->  ps[11525]
2134973897400622: ksmtuned[1867] ->  [11524]
2134973888019910: ksmtuned ->  [11522] ->  awk[11523]
2134981995742661: sshd ->  sshd ->  bash[11531] ->  [11532]
2134981997448161: sshd ->  sshd ->  bash[11531] ->  [11534]
2134982000599413: sshd ->  sshd ->  bash[11531] ->  [11536]
2134982002035206: sshd ->  sshd ->  bash[11531] ->  [11539]
2134982007815639: sshd ->  sshd ->  bash[11531] ->  [11542]
2134982011627125: sshd ->  sshd ->  bash[11531] ->  [11546]
2134981989026168: sshd ->  sshd[11529] ->  [11530]
2134982008472173: sshd ->  sshd ->  bash[11531] ->  [11544]
2134981995518210: sshd ->  sshd ->  bash ->  [11532] ->  id[11533]
2134982000393612: sshd ->  sshd ->  bash ->  [11536] ->  tput[11538]
2134982004531164: sshd ->  sshd ->  bash[11531] ->  grep[11541]
2134981997256114: sshd ->  sshd ->  bash ->  [11534] ->  hostname[11535]
2134981999476476: sshd ->  sshd ->  bash ->  [11536] ->  tty[11537]
2134982001865119: sshd ->  sshd ->  bash ->  [11539] ->  dircolors[11540]
2134982007610268: sshd ->  sshd ->  bash ->  [11542] ->  uname[11543]
2134982009271769: sshd ->  sshd ->  bash[11531] ->  grep[11545]
2134982011408808: sshd ->  sshd ->  bash ->  [11546] ->  id[11547]


rdbufsize.d (in another shell I just did some random read operations and this
shows a summary)
# dtrace -s rdbufsize.d 
dtrace: script 'rdbufsize.d' matched 1 probe
^C

  read                                              
           value  ------------- Distribution ------------- count    
              -1 |                                         0        
               0 |                                         8        
               1 |                                         59       
               2 |                                         209      
               4 |                                         72       
               8 |                                         488      
              16 |                                         67       
              32 |                                         1074     
              64 |                                         113      
             128 |                                         88       
             256 |                                         384      
             512 |@@@                                      6582     
            1024 |@@@@@@@@@@@@@@@@@@                       44787    
            2048 |@                                        2419     
            4096 |@@@@@@@                                  16239    
            8192 |@@@@                                     10395    
           16384 |@@@@@@                                   14784    
           32768 |                                         427      
           65536 |                                         669      
          131072 |                                         143      
          262144 |                                         43       
          524288 |                                         46       
         1048576 |                                         92       
         2097152 |                                         196      
         4194304 |                                         0   

Monday Oct 10, 2011

it's DTrace not Dtrace :)

Sorry, Bryan smacked me on the fingers because I misspelled DTrace :) so wherever I wrote Dtrace, do s/Dtrace/DTrace :) keeps Bryan happy :) Bryan, we good now? ;) my bad for misspelling the name...

Sunday Oct 09, 2011

trying out dtrace

One of the things we talked about at Oracle OpenWorld last week was Dtrace for Oracle Linux. There's not been much information on it yet, so I wanted to write up what you need to do to give it a go.

We released a preview of Dtrace and we will provide updates as we add new features, on an ongoing basis. The biggest project left to implement is userspace tracing, along with many side projects to add and enhance various Dtrace providers in the kernel. So it's not all there yet, but it is going to be continuously enhanced with new updates.

Dtrace is made available to Oracle Linux support subscribers and currently requires you to do the following :

- have your system registered with the Unbreakable Linux Network (ULN) (or a local yum repo that mirrors the channels on ULN)
- run Oracle Linux 6, of course with the Unbreakable Enterprise Kernel
- register your server with the Dtrace channel (ol6_x86_64_Dtrace_BETA)
- install the updated version of 2.6.32 and the dtrace kernel module and userspace programs
- run it

Here are some of the detailed steps :
(1) register your server with ULN: run the uln_register command on your server as the root user. It requires you to enter your single sign-on ID, password and Oracle Linux support ID (CSI).
(2) once your system(s) is registered, login to the ULN website, click on Systems and find the server(s) you just registered
(3) Click on the server and go to Manage Subscriptions
(4) add the Dtrace for Oracle Linux 6 channel to your server and click Save
(5) back on the server, double-check that everything worked by typing yum repolist; it should now also show ol6_x86_64_Dtrace_BETA
(6) install the required packages :
... (a) yum install dtrace-modules
... (b) yum install dtrace-utils
... (c) yum install kernel-uek-2.6.32-201.0.4.el6uek
You have to install that specific kernel from the dtrace ULN channel because it is the kernel with the instrumentation.
(7) reboot into the new kernel
(8) load the kernel modules: modprobe dtrace and modprobe systrace

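At this point you can sanity-check the setup by asking dtrace to list the probes it found (the list is long, so pipe it through head):

# dtrace -l | head
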
Now you are ready to start playing with dtrace. We will publish a number of useful dtrace scripts to get you going. I have one quick example. When running an Oracle Database on my machine, I want to find out which files the oracle executable opens on the system. This simple command does it for me :

dtrace -n 'syscall::open*:entry/execname == "oracle"/{ printf("%s %s", execname, copyinstr(arg0)); }'

Any open* syscall executed by "oracle" will print its first argument, which in this case is the path being opened.

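If you would rather see a summary than one line per call, the same predicate works with an aggregation; a sketch that counts opens per path (press Ctrl-C to print the totals):

dtrace -n 'syscall::open*:entry/execname == "oracle"/{ @[copyinstr(arg0)] = count(); }'
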
A lot more to come, but existing customers that want to start taking a look at it should check it out. The 2.6.32-201.0.4 kernel source is available on ULN, and the dtrace kernel module source, which is under the CDDL license, is also available on ULN. The userspace tools are provided as is. Dtrace for our UEK2 Beta kernel is not there yet but will come in an update of the UEK2 beta kernel. Another thing we are evaluating is the use of the ksplice technology to "splice in" the probes/providers at runtime. That way you would be able to run a kernel without the extra code; if you want to enable dtrace, you would first apply the ksplice dtrace update, run dtrace, and unload it after the fact. The word evaluate is key here, however :).

a little sample :

# dtrace -n 'syscall::open*:entry/execname == "oracle"/{ printf("%s %s", execname, copyinstr(arg0)); }'
dtrace: description 'syscall::open*:entry' matched 2 probes
CPU     ID                    FUNCTION:NAME
  2      8                       open:entry oracle /proc/2672/stat
  2      8                       open:entry oracle /proc/2674/stat
  2      8                       open:entry oracle /proc/2678/stat
  2      8                       open:entry oracle /proc/2682/stat
  2      8                       open:entry oracle /proc/2686/stat
  2      8                       open:entry oracle /proc/2688/stat
  2      8                       open:entry oracle /proc/2690/stat
  2      8                       open:entry oracle /proc/2692/stat
  2      8                       open:entry oracle /proc/2694/stat
  3      8                       open:entry oracle /proc/loadavg
  1      8                       open:entry oracle /u01/app/oracle/oradata/XE/system.dbf
...