DTrace at Google

Recently, I gave a Tech Talk at Google on DTrace, the video for which is now online. If you've seen me present before and you don't want to suffer through the same tired anecdotes, arcane jokes, and disturbing rants, jump ahead to about 57:16 to see a demo of DTrace for Python -- and in particular John Levon's incredible Python ustack helper. You also might want to skip ahead to 1:10:46 for the Q&A -- any guesses what the first question was? (Hint: it was such an obvious question that both the room and I erupted in laughter.) Note that my rather candid answer to that question represents my opinion alone and does not constitute legal advice -- nor does it represent any sort of official position of Sun...
Comments:

Very cool! I had not heard about the python work.

Would have been great with a resolution good enough to read the screen...

Posted by Marc on August 21, 2007 at 02:39 AM PDT #

Wonderful so far - just had to let you know that I nearly snorted hot coffee through my nostrils when you explained about some smartypants wanting MAP_FIXED.

If I was paid by the number of times I have to explain to some people just WHY choosing your own mapping addresses is Not A Good Idea(tm), I'd be on an island somewhere...

And the sad thing is... it's oftentimes the same people that I explained to the last time!

Posted by Colin Burgess on August 21, 2007 at 05:28 AM PDT #

Hi,

I was present at the talk (which was really good, BTW -- little hype and more _why_ is it cool), and somehow forgot the follow-up to the obvious question: If Sun is interested in seeing DTrace in Linux, would simply dual-licensing the code (dual CDDL/GPLv2) be an option?

Posted by Steinar H. Gunderson on August 21, 2007 at 06:19 AM PDT #

Marc, apologies about the screen resolution. You might want to check out John's blog entry (linked above) for details on what I was describing.

Colin, glad I provided amusement on MAP_FIXED, and yes, it's a Bad Idea -- a great example of a little knowledge being a dangerous thing.

Steinar, dual licensing is a mess, unfortunately -- it opens up nasty pathologies like license-based forks that represent unacceptable risk. Should you be interested in a flame war on the subject, see this blog entry and its comments:

http://blogs.sun.com/ahl/entry/dtrace_knockoffs

Posted by Bryan Cantrill on August 21, 2007 at 07:01 AM PDT #

A very cool demonstration, and no sales pitch can beat this kind of demo. The resolution was very poor on Google Video; it would be nice if you could publish the few DTrace scripts that you used somewhere. (And add a comment on Google Video?)

Incidentally, we moved off Solaris 8 to Linux blade servers recently at work. Making such facilities available on other systems, via licensing (or as a paid-for kernel module? Does such a thing exist?), would definitely be useful. (Best to sell stuff while you can ... )

Posted by Alok Bisani on August 21, 2007 at 09:49 AM PDT #

Hey,

Would be cool if more devs used DTrace on Xorg to help clean up some of its many bugs. :P

Good work,
Edward.

Posted by Edward O'Callaghan on August 21, 2007 at 11:09 AM PDT #

Bryan, thanks for the clarification wrt. dual-licensing.

On a totally different note: When it comes to mixer_applet2, I'm not sure if my point got through at the presentation (I'm not a native speaker :-) ), but in case you're curious, mixer_applet2 is the GNOME mixer applet (i.e., the volume control; it does not mix PCM audio or anything like that). Every 100 ms it polls the sound card's volume controls, so it can show the correct volume in its icon if something else were to change it.

Now, this causes the CPU to wake up a lot, as you found out live -- and waking up is bad for a laptop, obviously. The Intel people have made a Linux program called PowerTOP designed to find causes of CPU wakeups (so, like a more specialized version of what you constructed in D at the presentation) in order to let the CPU sleep longer and thus save battery life when the machine is idle. One of the first issues fixed as a result of these efforts was in fact this one -- newer versions of mixer_applet2 can subscribe to ALSA notification events (via HAL) and simply get a message whenever the value changes. (I am unsure if this works for Solaris, though, as I don't think OSS has any sort of messaging like this.)

Posted by Steinar H. Gunderson on August 21, 2007 at 04:33 PM PDT #
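
To put rough numbers on the polling described in the comment above, here is a minimal Python sketch. It is not from the talk, and it is not how mixer_applet2 or PowerTOP is implemented; the function names and the sample change times are hypothetical. It only counts how often each design would wake the CPU over a given window.

```python
from typing import Iterable


def polling_wakeups(window_ms: int, interval_ms: int = 100) -> int:
    """Wakeups for a loop that checks the volume every interval_ms."""
    return window_ms // interval_ms


def notification_wakeups(change_times_ms: Iterable[int], window_ms: int) -> int:
    """Wakeups for a design that sleeps until the volume actually changes."""
    return sum(1 for t in change_times_ms if 0 <= t < window_ms)


if __name__ == "__main__":
    window_ms = 60_000             # one otherwise idle minute
    changes_ms = [12_000, 45_500]  # hypothetical: the volume changes twice
    print("polling wakeups:     ", polling_wakeups(window_ms))
    print("notification wakeups:", notification_wakeups(changes_ms, window_ms))
```

Under these assumptions, a 100 ms poll wakes the CPU 600 times in an idle minute, while the notification-driven version wakes only when something changes (twice in this made-up trace).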

Hi Bryan,

It's a joy to see your presentation again.

I don't know if you remember, but you visited my company in Australia last year (might have been the year before) after a DTrace presentation in Sydney. We put you straight to work on the terminal to demonstrate DTrace in action with our software.

Anyway, our software is extremely thread/mutex/condition-variable intensive. We have found that running 3 instances of our software performs much better than 1 instance (configured with the same number of threads as the 3 instances combined) whilst only using 50% of CPU (no iowait) on a fully loaded E2900.

So we have theorized that there are shared mutexes or condition variables that threads are blocking behind.

Could DTrace identify the top x mutexes/condition variables that have the most threads waiting on them?

Which probes should we be looking at?

Would love to see you out in Australia again :)

Thanks

Matt.

Posted by Matthew Johnson on August 22, 2007 at 06:12 PM PDT #

Hey Matt,

I definitely remember you, and the fun afternoon we had drinking beers and debugging performance issues in Sydney in October, 2005. And I still retell one of the things we found that day: as you might recall, you had seen a performance regression going from S9 to S10 -- which turned out to be due to a configuration change to use /var/tmp (an on-disk filesystem) instead of /tmp (an in-memory filesystem) for your temporary files. To me, that's a great example of a mistake that anyone could easily make -- but that can be reasonably difficult to find without a tool like DTrace.

As for your recent problem: yes (of course!) DTrace can be of help here. Take a look at lockstat(1M) (for in-kernel locks) and plockstat(1M) (for user-level locks), both of which are implemented in terms of DTrace. I think you might also want to make use of the sched provider to understand how your threads are being scheduled (and why three instances perform better than one!).

That should point you in the right direction, but if you're looking for more help, consider asking dtrace-discuss@opensolaris.org -- there's a ton of helpful DTrace expertise on that list.

Posted by Bryan Cantrill on August 22, 2007 at 11:42 PM PDT #
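
For readers without a DTrace-capable system to hand, the kind of "top x locks by blocked time" report asked about above can be approximated in-process. The following is a minimal Python illustration of the idea, not plockstat itself and not how plockstat is implemented. `TimedLock` and `top_contended` are hypothetical names, and the sketch only sees locks that are explicitly wrapped, whereas plockstat(1M) observes an unmodified process.

```python
import threading
import time


class TimedLock:
    """A lock that records how long callers spent blocked acquiring it."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._lock = threading.Lock()
        self.blocked_s = 0.0   # total time callers spent waiting
        self.blocks = 0        # number of acquisitions that had to wait

    def __enter__(self) -> "TimedLock":
        # Fast path: an uncontended acquire needs no bookkeeping.
        if not self._lock.acquire(blocking=False):
            start = time.perf_counter()
            self._lock.acquire()
            # We now hold the lock, so updating the counters is safe.
            self.blocked_s += time.perf_counter() - start
            self.blocks += 1
        return self

    def __exit__(self, *exc) -> None:
        self._lock.release()


def top_contended(locks, n=3):
    """Return the n locks with the most accumulated blocked time."""
    return sorted(locks, key=lambda lk: lk.blocked_s, reverse=True)[:n]


if __name__ == "__main__":
    hot, cold = TimedLock("hot"), TimedLock("cold")

    def worker() -> None:
        with hot:
            time.sleep(0.01)   # hold the shared lock while "working"
        with cold:
            pass               # released immediately, so rarely contended

    threads = [threading.Thread(target=worker) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    for lock in top_contended([hot, cold]):
        print(f"{lock.name}: blocked {lock.blocks} times, {lock.blocked_s:.3f}s total")
```

Sorting by accumulated blocked time rather than by hold time is a deliberate choice here: a lock held for long stretches by a single thread is not a scaling problem, whereas one that many threads queue behind is.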

Bryan,

If I wanted to get the Python ustack working with dtrace, which version of Solaris should I be downloading? And where do I download it from? Sorry I'm just new to Solaris!

Posted by Harish Mallipeddi on August 26, 2007 at 01:56 AM PDT #

Harish,

You want the latest Solaris Express Community Edition (which is updated every two weeks, and contains the latest version of the operating system). It's available here:

http://opensolaris.org/os/downloads/

Posted by Bryan Cantrill on August 26, 2007 at 11:24 AM PDT #
