Wednesday Sep 25, 2013

Counting how many threads a cv_broadcast wakes up

I had occasion during a call this week to want to observe what was causing a lot of threads to suddenly be made runnable, and thought I should share the DTrace that I wrote to do it. It's using the fbt provider so don't even think about considering the interfaces stable :)

# dtrace -q -x dynvarsize=4m -n '
BEGIN {trace("Monitoring ..\n"); }
fbt::cv_broadcast:entry {self->cv = (condvar_impl_t *)arg0; }
fbt::cv_broadcast:entry /self->cv && self->cv->cv_waiters>500/ {
       printf("%Y [%d] %s %d woken\n", walltimestamp, pid, execname, self->cv->cv_waiters);
       stack();}
fbt::cv_broadcast:entry /self->cv/ {self->cv = 0;}' 

I needed to set dynvarsize to 4m as I was running this on a pretty large machine, so a lot of thread-local variables were being created and destroyed.

I was rewarded with output like

Monitoring ..                                                                                                                       
2013 Sep 23 15:20:49 [0] sched 1024 woken

              vxfs`vx_sched_thread+0x131c
              unix`thread_start+0x4
2013 Sep 23 15:21:28 [0] sched 1024 woken

              vxfs`vx_sched_thread+0x131c
              unix`thread_start+0x4
2013 Sep 23 15:26:47 [0] sched 1024 woken
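
A possible variation, which I have not run on that machine: reading cv_waiters straight from arg0 in the predicate avoids the thread-local variable altogether, so the dynvarsize tuning may not be needed. The sketch below wraps that in a small shell function; cv_wake_script and its threshold argument are names made up for illustration.

```shell
# Hypothetical helper: emit a D program that reports cv_broadcast() calls
# waking more than a given number of waiters (default 500).
# It casts arg0 in place instead of stashing it in self->cv, so no
# thread-local variables are allocated.
cv_wake_script() {
    threshold=${1:-500}
    cat <<EOF
fbt::cv_broadcast:entry
/((condvar_impl_t *)arg0)->cv_waiters > $threshold/
{
        printf("%Y [%d] %s %d woken\n", walltimestamp, pid, execname,
            ((condvar_impl_t *)arg0)->cv_waiters);
        stack();
}
EOF
}

# Assumed usage, as root on a system with the fbt provider:
#   dtrace -q -n "$(cv_wake_script 1000)"
```

The same caveat applies as for the original: fbt probes and condvar_impl_t are private kernel details, so treat this as a starting point only.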

Posting in case anyone else has found themselves wanting to find out this kind of thing. Happy DTracing all.

Monday Mar 14, 2011

Summarising my "Nevada to OpenSolaris Sun Ray on SPARC" series

As previously noted, I finished my series on migrating from Nevada to OpenSolaris Sun Ray on SPARC. There are eight articles in it, and it's taken me just under a year, mostly because I had to find the time to work on it.

A lot has gone on in that year. I recognise that "OpenSolaris" is probably the wrong word to be using in the titles, but I'm not going to go back and change it. Please regard the terms as interchangeable for the purpose of this discussion.

The end result is that I am back running Sun Ray Server on the machine that I originally had nevada on right up until we stopped doing nevada distributions. The machine is running a current Development build and is serving me well.

Anyway, I thought that a small article summarising and linking to each of the posts would be in order and this seemed a good first real article for me back on blogs.sun.com/tpenta.

  1. Nevada to OpenSolaris on SPARC

    I look at the problems of getting OpenSolaris on to a SPARC Sun Blade 2000 that does not have a dvd reader on it. The end result was to install a late nevada on it and use the wanboot image of this installation to get something onto it. There were a few things which needed tidying up but it worked. I should note that I have subsequently installed Solaris 11 Express onto an Ultra 45 from the install CD, and it went flawlessly. Maybe it would have been easier to find a dvd drive for the box than all these hoops.

  2. Nevada to OpenSolaris on SPARC (part 2)

    I mention some problems I was having with sh returning an Exec format error when trying to do things as myself - the solution to which is outlined here.

    I also show how I migrated the zpools to the temporary machine.

  3. Nevada to OpenSolaris Sun Ray on SPARC (part 3 - reboot)

    Disaster strikes. My lab booking ran out and someone re-installed the box meaning I had to start from scratch. This blog is probably a much better technical reference for all that I had done before (isn't it always the way that when you have to do something over again, you manage to do a better job?).

  4. Nevada to OpenSolaris Sun Ray on SPARC (part 4 - imapd)

    Everything I had to do to get imapd running. Starting with installing the compilers, downloading the source code, working out the Makefile hacks to make it build, making the SSL certificates (oops I have to do that again now I am on a machine with a different name).

  5. Nevada to OpenSolaris Sun Ray on SPARC (part 5 - Sun Ray Server)

    Step by step of getting around dependencies on Solaris 10 inside the Sun Ray software to get it running on this box. As I say at the end of the article, "Getting Sun Ray running like this OpenSolaris completely voids any warranty and support. Don't do it if you don't know what you are doing."

  6. Nevada to OpenSolaris Sun Ray on SPARC (part 6 - cutting over)

    Smoke test time. What happened when I cut over to trying to do my day job on this machine. I realised I had missed a few things that were important, like openoffice, flash and acroread. The entry details the installations and also about installing certificates for the extras repository.

  7. Nevada to OpenSolaris Sun Ray on SPARC (part 7 - printing)

    Details how I got CUPS up and running. I believe that Solaris 11 Express comes with all of the packages already installed so you only need to configure things.

  8. Nevada to OpenSolaris Sun Ray on SPARC (part 8 - back to the original hardware)

    The last installment on moving everything back to the original machine, as I really did not want to be tying up a lab resource (though I had it booked for close to a year), and vesvi actually had a little more memory in it.

    I cover some gotchas in using cloned disks like this, how I ended up doing it, how to change a nodename these days, and a couple of local network gotchas that had me confused for a while.

Going back through this I realise that I may have left out configuring a couple of packages, like MySQL. I may do a final part 9 to cover these once I've made a list of them and I'll add that to here too.

Thursday Jul 30, 2009

Interim fixes for Bind Vulnerability VU#725188/CVE-2009-0696 (Updated)

Yesterday I noticed an article titled New DoS Vulnerability in All versions of BIND 9 on slashdot. The article refers to BIND Dynamic Update DoS at the ISC site describing Vulnerability Note VU#725188 - ISC BIND 9 vulnerable to denial of service via dynamic update request.

This very rapidly caused a stir on a few internal mailing lists that I'm on, and work on addressing this is being tracked as

        6865903 Updated, P1 network/dns CVE-2009-0696 BIND dynamic update problem

The current status of this within Sun is that the Interim Security Reliefs (ISR) are available from http://sunsolve.sun.com/tpatches for the following releases:

SPARC Platform:

  • Solaris 10 IDR142522-01
  • Solaris 9 IDR142524-01

x86 Platform:

  • Solaris 10 IDR142523-01
  • Solaris 9 IDR142525-01

Sun Alert 264828 is on its way to being published. When published it will be available from: http://sunsolve.sun.com/search/document.do?assetkey=1-66-264828-1

The fix is planned for build 121 for OpenSolaris/Nevada and we're attempting to get it into the next possible release Support Repository Update (SRU3).

Update 1

It turns out that the Solaris 9 ISR patches rely on an unreleased patch for Solaris 9. Work is underway to get this dependency out quickly.

Monday Jun 22, 2009

Live Upgrade and TimeSlider gotcha

Tried to upgrade my workstation over the weekend to snv_117. Apart from a little tidying up I had to do as a package didn't install correctly, all appeared to be going fine. I then went to unmount /.alt.snv_117, and it failed saying that the filesystem was busy.

fuser -c showed no processes using the mount point. What could it be?

A little bit of dtracing the umount2() system call was illuminating.

  1              <- zfsctl_umount_snapshots           0                0
  1            <- zfs_umount                          0               16

Hang on, snapshots? Although it returned 0, let's just check, as I do have timeslider enabled on this box.

rootksh@vesvi:~$ zfs list -t snapshot|grep 117                                                                     
pool/ROOT/snv_116@snv_117                                   4.03M      -  8.78G  -
pool/ROOT/snv_117@zfs-auto-snap:hourly-2009-06-19-09:00     43.8M      -  7.99G  -
pool/ROOT/snv_117@zfs-auto-snap:hourly-2009-06-19-10:00     48.9M      -  8.44G  -
pool/ROOT/snv_117@zfs-auto-snap:hourly-2009-06-19-11:00     43.7M      -  8.74G  -
pool/ROOT/snv_117@zfs-auto-snap:frequent-2009-06-19-11:15   42.6M      -  8.75G  -
pool/ROOT/snv_117@zfs-auto-snap:frequent-2009-06-19-11:30   45.8M      -  8.76G  -
pool/ROOT/snv_117@zfs-auto-snap:frequent-2009-06-19-11:45   38.1M      -  8.77G  -
pool/ROOT/snv_117@zfs-auto-snap:hourly-2009-06-19-12:00     38.5M      -  8.80G  -
pool/ROOT/snv_117@zfs-auto-snap:daily-2009-06-22-00:00          0      -  8.80G  -
pool/ROOT/snv_117@zfs-auto-snap:weekly-2009-06-22-00:00         0      -  8.80G  -
pool/ROOT/snv_117@zfs-auto-snap:hourly-2009-06-22-10:00         0      -  8.80G  -
pool/ROOT/snv_117@zfs-auto-snap:frequent-2009-06-22-10:30       0      -  8.80G  -
pool/ROOT/snv_117@zfs-auto-snap:frequent-2009-06-22-10:45       0      -  8.80G  -

Oh, timeslider was taking snapshots of the filesystem while it was upgrading. Hmm maybe we should be having that disabled on the target of a live upgrade (rfe coming, but I don't hold out a lot of hope).

Anyway, removing them was not difficult:

rootksh@vesvi:~$ zfs list -t snapshot|grep snv_117@zfs-auto|awk '{print $1}' | xargs -L 1 zfs destroy
rootksh@vesvi:~$ luumount snv_117                                                                                  
rootksh@vesvi:~$ 

Something to keep in mind if you are using timeslider, zfs root and live upgrade (I wonder if we would have the same issue with 'pkg image-update' in OpenSolaris).
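
If you would rather look before you leap, the selection step can be split from the destroy. A minimal sketch, assuming the snapshot names follow the zfs-auto-snap pattern shown above; auto_snaps_for_be is a made-up helper name, and it only prints names.

```shell
# Hypothetical helper: read snapshot listing lines on stdin and print the
# names of the time-slider auto-snapshots that belong to one boot environment.
# The match is a literal substring test, and nothing is destroyed here.
auto_snaps_for_be() {
    awk -v be="$1" 'index($1, "/" be "@zfs-auto-snap") { print $1 }'
}

# Assumed usage: review the list first, then feed it to zfs destroy.
#   zfs list -H -o name -t snapshot | auto_snaps_for_be snv_117
#   zfs list -H -o name -t snapshot | auto_snaps_for_be snv_117 | xargs -L 1 zfs destroy
```

Matching on the full "/name@zfs-auto-snap" string keeps the live upgrade's own snapshot (pool/ROOT/snv_116@snv_117 in the listing above) out of the list.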

Friday May 23, 2008

Yay for Sun Ray

As anyone who reads what I write here probably knows, I've spent the last 2 weeks in China, in two different offices in Beijing and one in Shanghai.

Some months back I installed Sun Ray server on my workstation in Sydney (as I prefer to run on relatively current nevada builds, rather than the Solaris 10 builds that IT Ops provides) and moved over to using a Sun Ray appliance both on my desk and at home.

I've been very happy with the ease of use of being able to transfer my work session home.

I've been even more happy with being able to transfer it to the various offices in China!

I just put my card in, it throws me to the Sydney ITops server that I would normally connect to, and I just utswitch from there to my own server. It all just works. What's more I've found working from Beijing very little different speedwise to working from home. It's all quite usable.

Wednesday Jan 09, 2008

PSARC 2008/008 DTrace Provider for Bourne Shell

I finally got to submit the fast track for the shell provider. I've already had one comment (from Darren Reed) that I have incorporated as it made very good sense. He suggested that if we are tracking variable assignments, we should also track unset. At this point I realised that a better name for the probes would be variable-set and variable-unset. I have a working copy for SPARC with these changes now.

Below is the prefix text and the revised specification.

I am sponsoring the following fast track for myself. I am doing the
bourne shell first for two primary reasons.

1. It is the "simplest" of the shells and thus should provide the
   minimum set of probes to implement for future work in other shells,
2. Providing probes into /bin/sh gives us observability of
   approximately 60% of all of the scripts on ON.

Additionally, as it has been around for a very long time there are
quite a lot of user written scripts for it, many very badly written.

I would expect future fast tracks for other shells (eg ksh88, ksh93,
zsh, bash, ...) to reference this fast track for the minimum set of
probes.

Note the probes are currently listed as Uncommitted. As the probes
gain use I would hope to log a future fast track to increase this
stability.

A Minor release binding is initially requested. Again, once things
settle down and the interfaces stabilise it is expected that a future
case may request a patch binding.

The sh provider makes available probes that can be used to observe the
behaviour of bourne shell scripts.

Probes
------

The sh provider makes available the following probes as exports:

builtin-entry   Fires on entry to a shell builtin command.
builtin-return  Fires on return from a shell builtin command.
command-entry   Fires when the shell execs an external command.
command-return  Fires on return from an external command.
function-entry  Fires on entry into a shell function.
function-return Fires on return from a shell function.
line            Fires before commands on a particular line of code are
		executed.
subshell-entry  Fires when the shell forks a subshell.
subshell-return Fires on return from a forked subshell.
script-start    Fires before any commands in a script are executed.
script-done     Fires on script exit.
variable-set	Fires on assignment to a variable.
variable-unset	Fires when a variable is unset.

The use of non-empty module or function names in a sh* probe is
undefined at this time.

Arguments
---------

builtin-entry,
command-entry,
function-entry

	char *	args[0]	Script Name
	char *	args[1]	Builtin/Command/Function Name
	int	args[2]	Line Number
	int	args[3]	# Arguments
	char **	args[4]	Pointer to argument list

builtin-return,
command-return,
function-return

	char *	args[0]	Script Name
	char *	args[1]	Builtin/Command/Function Name
	int	args[2]	Return Value

subshell-entry

	char *	args[0]	Script Name
	pid_t	args[1]	Forked Process ID

subshell-return

	char *	args[0]	Script Name
	int	args[1]	Return Value

line

	char *	args[0]	Script Name
	int	args[1]	Line Number

script-start

	char *	args[0]	Script Name

script-done

	char *	args[0]	Script Name
	int	args[1]	Exit Value

variable-set

	char *	args[0]	Script Name
	char *	args[1]	Variable Name
	char *	args[2]	Value

variable-unset

	char *	args[0]	Script Name
	char *	args[1]	Variable Name

Examples
--------

1. Catching a variable assignment

	Say we want to determine which line in the following script has
	an assignment to WatchedVar:

	#!/bin/sh

	# starting script
	WatchedVar=Value
	unset WatchedVar
	# ending script

	We could use the following script

	#!/usr/sbin/dtrace -s

	#pragma D option quiet

	sh$target:::line { self->line = arg1; }
	sh$target:::variable-set /copyinstr(arg1) == "WatchedVar"/ {
	        printf("%d: %s=%s\n", self->line, copyinstr(arg1),
		    copyinstr(arg2));
	}
	sh$target:::variable-unset /copyinstr(arg1) == "WatchedVar"/ {
	        printf("%d: unset %s\n", self->line, copyinstr(arg1)); }


	$ ./watch.d -c ./var.sh
	4: WatchedVar=Value
	5: unset WatchedVar

2. Watching the time spent in functions

	#!/usr/sbin/dtrace -s

	#pragma D option quiet

	sh$target:::function-entry { self->start = vtimestamp }
	sh$target:::function-return {
		@[copyinstr(arg1)] = quantize(vtimestamp - self->start)
	}

	Similar for the other entry/return probes, with the exception
	of subshell as the probe name is unavailable.

3. Wasted time using external functions instead of builtins

	This script is copied from the DTrace toolkit. Its function
	and how it works should be relatively self explanatory.

#!/usr/sbin/dtrace -Zs
/*
 * sh_wasted.d - measure Bourne shell elapsed times for "wasted" commands.
 *               Written for the sh DTrace provider.
 *
 * $Id: sh_wasted.d 25 2007-09-12 09:51:58Z brendan $
 *
 * USAGE: sh_wasted.d { -p PID | -c cmd }       # hit Ctrl-C to end
 *
 * This script measures "wasted" commands - those which are called externally
 * but are in fact builtins to the shell. Ever seen a script which calls
 * /usr/bin/echo needlessly? This script measures that cost.
 *
 * FIELDS:
 *              FILE            Filename of the shell or shellscript
 *              NAME            Name of call
 *              TIME            Total elapsed time for calls (us)
 *
 * IDEA: Mike Shapiro
 *
 * Filename and call names are printed if available.
 *
 * COPYRIGHT: Copyright (c) 2007 Brendan Gregg.
 *
 * CDDL HEADER START
 *
 *  The contents of this file are subject to the terms of the
 *  Common Development and Distribution License, Version 1.0 only
 *  (the "License").  You may not use this file except in compliance
 *  with the License.
 *
 *  You can obtain a copy of the license at Docs/cddl1.txt
 *  or http://www.opensolaris.org/os/licensing.
 *  See the License for the specific language governing permissions
 *  and limitations under the License.
 *
 * CDDL HEADER END
 *
 * 09-Sep-2007  Brendan Gregg   Created this.
 */

#pragma D option quiet

dtrace:::BEGIN
{
        isbuiltin["echo"] = 1;
        isbuiltin["test"] = 1;
        /* add builtins here */

        printf("Tracing... Hit Ctrl-C to end.\n");
        self->start = timestamp;
}

sh$target:::command-entry
{
        self->command = timestamp;
}

sh$target:::command-return
{
        this->elapsed = timestamp - self->command;
        this->path = copyinstr(arg1);
        this->cmd = basename(this->path);
}

sh$target:::command-return
/self->command && !isbuiltin[this->cmd]/
{
        @types_cmd[basename(copyinstr(arg0)), this->path] = sum(this->elapsed);
        self->command = 0;
}

sh$target:::command-return
/self->command/
{
        @types_wasted[basename(copyinstr(arg0)), this->path] =
            sum(this->elapsed);
        self->command = 0;
}

proc:::exit
/pid == $target/
{
        exit(0);
}

dtrace:::END
{
        this->elapsed = (timestamp - self->start) / 1000;
        printf("Script duration: %d us\n", this->elapsed);

        normalize(@types_cmd, 1000);
        printf("\nExternal command elapsed times,\n");
        printf("   %-30s %-22s %8s\n", "FILE", "NAME", "TIME(us)");
        printa("   %-30s %-22s %@8d\n", @types_cmd);

        normalize(@types_wasted, 1000);
        printf("\nWasted command elapsed times,\n");
        printf("   %-30s %-22s %8s\n", "FILE", "NAME", "TIME(us)");
        printa("   %-30s %-22s %@8d\n", @types_wasted);
}


Stability
---------

Element         Name Class      Data Class
------------------------------------------
Provider        Uncommitted     Uncommitted
Module          Private         Private
Function        Private         Private
Name            Uncommitted     Uncommitted
Arguments       Uncommitted     Uncommitted
------------------------------------------

Tuesday Nov 20, 2007

Mutex Contention vs number of cpus

I've had a few cases recently that have brought this issue to the fore.

It's amazing how many people think that the answer to all performance issues is to simply throw more cpu at the problem.

Let's work through this thought (and this holds true for other queuing type locks too).

  1. On Solaris, if the mutex holder is on proc, the waiter spins instead of blocking and sleeping
  2. If we have sufficient threads wanting the same mutex, we can quickly utilize all cpus in a box
  3. The more threads we have in the queue for a mutex, logically the longer any given thread will take to progress through this queue to obtain it.

What does this tell you about what is likely to happen if you add more cpus into the mix?

It's relatively obvious now, isn't it?

  1. More cpus stuck spinning in kernel space
  2. Longer mutex queues
  3. Longer average time to obtain mutex for each thread that wants it.

The obvious consequence being that adding more cpus can actually have the effect of making the problem worse.

Sigh.
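
To put rough numbers on the argument, here is a toy model, not a measurement. It assumes a strict FIFO queue, a fixed hold time, and one spinning waiter per cpu; mean_wait_us is a name made up for this sketch.

```shell
# Toy model (assumption): waiters queue FIFO and each holder keeps the mutex
# for hold_us microseconds. Waiter k (counting from 0) waits k * hold_us, so
# the mean over all waiters is hold_us * (waiters - 1) / 2.
mean_wait_us() {
    waiters=$1
    hold_us=$2
    echo $(( hold_us * (waiters - 1) / 2 ))
}

# One waiter per cpu, 10us hold time.
for n in 8 64 256; do
    echo "$n waiters: mean wait $(mean_wait_us "$n" 10) us"
done
# 8 waiters: mean wait 35 us
# 64 waiters: mean wait 315 us
# 256 waiters: mean wait 1275 us
```

Under those assumptions the mean wait grows linearly with the number of waiters, which is the point of the lists above: more cpus spinning on the same lock just lengthens the queue.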

Tuesday Nov 06, 2007

The official Melbourne Cup site can't cope?

I simply have to blog this, tongue in cheek. For those that don't know, Australia stops on the first Tuesday in November for the largest horse race in the country, the Melbourne Cup. Shortly after the race finished a colleague gathered the below screen shot from the official web site.


Do you think they might need to talk to Sun Australia about the new T2 boxes (e.g. T5220 & T5120), Solaris 10 and our Application Server? :-)

Tuesday Oct 23, 2007

Sun Developer Days

OK, I got back from CEC on Saturday a week back and walked into the house at about 9:30 absolutely knackered. About 2pm my pager went off and I discovered that I was on VOSJEC duty that weekend, and ended up with a right horror of a call that lasted the rest of the weekend (I won't go into detail here, save to say that I got an action plan out to these guys at about 00:30 on Monday morning).

Early Monday morning (ok I did get some sleep, this is real morning about 10-11), I got a call from Laurie Wong. Apparently the DTrace speaker they had organised for the Developer Days couldn't make it and they really couldn't find anyone else. After some discussion with my boss, we agreed that I would fly to Melbourne the next day to cover this and also cover Sydney on Wednesday.

Had an awful time actually using the system that we are supposed to use to book the flight; it ended up taking me a bit over an hour and by that time the fare had risen 50%!!! Anyway, I got that all sorted, and boy am I glad that I booked to get myself there well ahead of when I spoke.

First off, I was using someone else's slides, so of course I had to work out what I was going to say to each one (I use flash cards to remind me of what I want to talk about so I'm not just reading the slides). Going through the slides I noticed that the information on the javascript provider was actually out of date. Indeed, you can actually download a firefox 3.0 alpha that has the new provider in there and looks pretty damned spiffy. So, I updated that stuff, then I discovered that there were actually two sections of the talk not present in the slides. This was the "tie it all together" bit and the summary. Well I didn't have the time to write a "tie it all together bit", so I removed that from the agenda slide and did up an "in conclusion" slide.

The other part of being glad I booked an earlier flight is that even though we boarded close to time, we were about an hour late getting off the runway! I got in to Melbourne at about midday. Fortunately we were able to put another speaker in front of me so I could finish writing the talk which I ended up giving at 4pm.

Anyway, the talk covered some background on DTrace (and the slide author provided some really nice graphics and animations), and discussion of various providers. In particular I talked about PHP, javascript, and PostgreSQL. I did demos for some of the basic DTrace, javascript and PostgreSQL.

I also touched on the shell provider I'm working on and encouraged folks to get involved with working on and testing new providers.

Amazingly, without having timed this or even thought about the length, I managed to finish exactly on time.

Laurie took me into the QANTAS lounge where we were able to relax a little before the flight home. With the flight and the train trip I got home about midnight.

The next day was in Sydney, so I only needed to take the train into the city.

After finding the venue (google maps on a treo 750 is really useful!), I sat in on a couple of the other talks and quite enjoyed those. In Sydney my talk was at 3:15 and again went pretty well.

Headed home after being treated to a really nice dinner at Doyle's on the Harbour.

Unfortunately I had a prior commitment on Friday so I couldn't give the talk in Canberra.

These were probably the largest audiences that I have ever presented to (combining both talks I spoke to about 580 people). I actually enjoyed it and I think my audience had a bit of fun as well. It's nice to do this kind of thing every so often.

Monday Aug 20, 2007

sh provider update - command-entry fixed

I've just uploaded the latest diffs and binaries to www.opensolaris.org/os/community/dtrace/shells/.

So what changed?

There was a bug in that the command-entry probe fired in both parent and child shell. This was a simple oversight that I should not have missed. I originally had the probe before sh forked, then moved it such that it fired after we knew we were able to fork (basically I didn't want it firing if we were not able to actually fork and run the command). Unfortunately, I forgot to specify that it was only to fire in the parent shell. Simple fix. My bad.

Also note that the documentation on the Providers for various shells site supersedes what I previously wrote in my blog.

I haven't tested the diffs, but I did the same massaging to them that I did for the last lot, so they should be ok to use with gpatch.

Looks like things are starting to settle down now so I'll be able to think about progressing this one and starting to look at some others, using these probe names as some kind of standardisation.

There have been suggestions for extras in this provider, but the feeling that I'm getting is let's get a basic useful provider done as a v1.0 and look at things like stacks and other probes in a later update.

Wednesday Aug 15, 2007

sh provider update (changes and sparc available)

After discussions with Brendan and Adam, I've made a couple of changes to the provider.

  1. exec has been renamed to command, and

  2. script-begin and script-end have become script-start and script-done respectively.

These changes are reflected in the documentation that is available at www.opensolaris.org/os/community/dtrace/shells/

In addition, I have rebuilt the x86 binary with these changes and provided a SPARC binary.

Tuesday Aug 14, 2007

sh DTrace provider diffs and x86 binary available

The title says it all. I've uploaded the diffs and an x86 sh binary to the "Providers for various shells" page. I'll do up a SPARC binary if there is interest in it.

Note that the arguments for the entry probes for builtin, exec and function have changed. There is a link to the correct documentation on this page too.

Have fun playing folks and let me know if you find any bugs.

I would have had these up about 90 minutes ago, but my previous blog entry needed to be written.

Why did I do a sh provider first?

Since I posted my initial blog on this, I've received a few comments and a surprising amount of mail asking why I did /bin/sh, which is an obviously obsolete shell, and not ksh93; some of it bordering on insulting.

Let me go through a few reasons why I did this one first.

  1. It's the one I was asked to do.

  2. The Bourne shell has been around for a very long time and there are, quite literally, millions of scripts that have been written in it, some of them very poorly.

  3. Much work in the open source environment is done to scratch an itch. In my day to day work doing performance calls I come across an amazing number of instances where being able to probe /bin/sh the way that this allows me to would be incredibly useful in explaining to a customer why their thrown together script runs so slowly. Most of these are /bin/sh scripts. Coding probes for ksh93 has no immediate impact on my day to day work as ksh93 is not yet integrated into OpenSolaris.

  4. Just doing a quick poll of usr/src in the opensolaris tree gives me the following:

    Scripting LanguageActual scriptsDynamically created scriptsComments
    /bin/sh463538
    /usr/bin/sh3943/bin symlinked to /usr/bin
    /sbin/sh186272symlink to /bin/sh
    /bin/ksh212199
    /usr/bin/ksh5350/bin symlinked to /usr/bin
    /bin/perl139
    /usr/bin/perl241237/bin symlinked to /usr/bin
    ksh9300

  5. sh is a logical first step from which other providers can follow.

  6. ksh93 is currently on a track to get integrated. I really don't want to drop any roadblocks on it now.

  7. Does it really matter which one comes first?

At no point did I say I would not consider doing ksh93 or even helping someone else do it, indeed I have had communication with Roland suggesting ways in which this could be done. Believe it or not, there is a plan to get all the shells done. Keep an eye on http://www.opensolaris.org/os/community/dtrace/shells/.

The last thing I expected when I started on this was to be the target of insults because I didn't do someone's favorite shell. I'm wondering how this kind of behavior encourages anyone to actually do anything that is of any benefit to the community. Come on folks, we can do better than that.

Friday Aug 10, 2007

/bin/sh DTrace Provider

A couple of days ago Brendan was chatting with me on irc and we got to discussing such a beast. Mainly looking at something simple in the way of the python and perl providers that others have worked on.

Well, to make a long story short (ok it will be longer later), I've coded up something that appears to work against the nevada clone tree of a couple of days ago, and logged RFE 6591476 to track it.

I'll be putting the diffs up in the next day or so, but for a teaser, here is the documentation.

sh Provider

The sh provider makes available probes that can be used to observe the behaviour of bourne shell scripts.

Overview

The sh provider makes available the following probes:


builtin-entry      Probe that fires on entry to a shell builtin command.
builtin-return     Probe that fires on return from a shell builtin command.

exec-entry         Probe that fires when the shell execs an external command.
exec-return        Probe that fires on return from an external command.

function-entry     Probe that fires on entry into a shell function.
function-return    Probe that fires on return from a shell function.

line               Probe that fires before commands on a particular line of code are executed.

subshell-entry     Probe that fires when the shell forks a subshell.
subshell-return    Probe that fires on return from a forked subshell.

script-begin       Probe that fires before any commands in a script are executed.
script-end         Probe that fires on script exit.

Arguments

The argument types to the sh provider are listed in the below table.


Probe                                           args[0]   args[1]   args[2]   args[3]

builtin-entry, exec-entry, function-entry       char *    char *    int       char **
builtin-return, exec-return, function-return    char *    char *    int
line                                            char *    int
script-begin                                    char *
script-end                                      char *    int
subshell-entry                                  char *    pid_t
subshell-return                                 char *    int

arg0 in all probes is the script name.

In the builtin, exec, and function entry probes, and the builtin, exec, and function return probes, arg1 is the name of the function, builtin or program being called. arg2 and arg3 in these entry probes are the argument count and a pointer to the argument list. In these return probes, arg2 is the return code from the function, builtin or program.

In the subshell-entry probe, arg1 is the pid of the forked subshell, and in the subshell-return probe, arg1 is the return code from the subshell.

In the line probe, arg1 is the line number.

In the script-end probe, arg1 is the exit code of the script.

Stability

The sh provider uses DTrace's stability mechanism to describe its stabilities, as shown in the following table. For more information on the stability mechanism see Chapter 39 of the Solaris Dynamic Tracing guide.

Element      Name Class   Data Class   Dependency Class
Provider     Unstable     Unstable     Common
Module       Private      Private      Unknown
Function     Private      Private      Unknown
Name         Unstable     Unstable     Common
Arguments    Unstable     Unstable     Common

The probes that gave me the most trouble were line and subshell-*.

line was tricky as sh only does line numbers when it parses input. It has no concept of line numbers on execution, which is what we need.

So, I needed to add another structure element (line) to trenod, and every other struct that is cast over the top of it. In the first failed attempt, whenever we allocated a new one of these nodes, I set line to standin->flin. The problem with this is that if the parser hits a newline, this number gets incremented before we actually set it, which means that the last command on every line has the line number of the following line. Not quite what I wanted.

What I ended up going with was the creation of another variable in the fileblk structure (comline) that I set just before standin->flin is incremented. This looks like it works.

The subshell-* probes were not initially going to be a part of this, but an assumption I made about com in execute() when coding the exec-* probes ended up causing the shell to coredump on me. It turns out that com is properly defined when we fall through the switch to the TFORK code, but if we go directly into the TFORK case, it's something completely different (would you believe "1"?). So, I made the probe conditional on the node type. If it was a TFORK, then we do a subshell probe, otherwise we do an exec probe. This also appears to work.

In the meanwhile, I've been sending Brendan sh binaries and he's already started coding more tools for the DTrace Toolkit based on this provider, and I have to say that some of his ideas look pretty cool.

Wednesday Aug 08, 2007

Finding undocumented command options

I had a colleague this morning asking about undocumented (ie not listed in usage or man pages) options in a command. The actual command doesn't really matter, but I was feeling a little lazy and couldn't be bothered looking up the source code to the command (which actually wasn't in ON). Almost immediately I thought of DTrace.

Let's have a look at ls as an example. I'll give it a dummy directory as I really don't care about the output.

$ dtrace -q -n 'pid$target::getopt:entry {trace(copyinstr(arg2));}' -c 'ls /nosuchfile'                     
/nosuchfile: No such file or directory
aAbcCdeEfFghHilLmnopqrRsStux1@vV

As you can see, the second line of output is printing out the third argument (arg2) to getopt(3c), which will list every option that getopt(3c) will recognise for the command.

Of course I could have prettied it up, but it's a one liner, I know what the output means.

The point being, that DTrace is just another sysadmin tool to be used in day to day operations.
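
If you did want to pretty it up, the optstring is easy to post-process. A small sketch, assuming the usual getopt(3c) convention that a colon after a character means the option takes an argument; expand_optstring is a made-up name.

```shell
# Hypothetical helper: print one line per option in a getopt optstring,
# marking the options that are followed by a colon as taking an argument.
expand_optstring() {
    printf '%s\n' "$1" | awk '{
        n = length($0)
        for (i = 1; i <= n; i++) {
            c = substr($0, i, 1)
            if (c == ":")
                continue                      # colons only qualify other characters
            if (substr($0, i + 1, 1) == ":")
                printf "-%s <arg>\n", c
            else
                printf "-%s\n", c
        }
    }'
}

# Assumed usage with the string captured above:
#   expand_optstring 'aAbcCdeEfFghHilLmnopqrRsStux1@vV'
```

For the ls string above there are no colons, so every line would be a plain flag.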


Yes I know I have been lax in my blogging, I'm going to start doing something about that, starting with this one :-)

About

* - Solaris and Network Domain, Technical Support Centre


Alan is a kernel and performance engineer based in Australia who tends to have the nasty calls gravitate towards him
