Monday Dec 23, 2013

Who is renicing these processes?

I was helping out a colleague on just such a call this morning. While the DTrace script I produced was not helpful in this particular case, I think it bears sharing anyway.

What we wanted was a way to find out why various processes were running with nice set to -20. There are two ways in which a process can have its nice changed.

  • nice(2) - where it changes itself
  • priocntl(2) - where something else changes it

I ended up with the following script after a bit of poking around.

# dtrace -n '
syscall::nice:entry {
        printf("[%d] %s calling nice(%d)", pid, execname, arg0);}
syscall::priocntlsys:entry /arg2 == 6/ {
        this->n = (pcnice_t *)copyin(arg3, sizeof(pcnice_t));
        this->id = (procset_t *)copyin(arg1, sizeof(struct procset));
        printf("[%d] %s renicing %d by %d",
            pid, execname, this->id->p_lid, this->n->pc_val); }'

There is an assumption in there about p_lid being the PID that I want, but in this particular case it turns out to be ok. Matching arg2 against 6 restricts the output to priocntl() calls with the command PC_DONICE. I could also have had it check pcnice_t->pc_op to separate reads of the nice value from changes to it, but I can put up with the extra output.

So what happens when we have this running and then try something like

# renice -20 4147
...
dtrace: description 'syscall::nice:entry ' matched 2 probes
 CPU ID FUNCTION:NAME
 0 508 priocntlsys:entry [4179] renice renicing 4147 by 0
 0 508 priocntlsys:entry [4179] renice renicing 4147 by -20

Which is exactly what we wanted. We see the renice command (pid 4179) modifying pid 4147.

Oh, why didn't this help, I hear you ask?

Turns out that in this instance, the process in question was being started by init from /etc/inittab, and so started with its nice value set to whatever init was running at. In this case that was -20.

Wednesday Sep 25, 2013

Counting how many threads a cv_broadcast wakes up

I had occasion during a call this week to want to observe what was causing a lot of threads to suddenly be made runnable, and thought I should share the DTrace that I wrote to do it. It's using the fbt provider so don't even think about considering the interfaces stable :)

# dtrace -q -x dynvarsize=4m -n '
BEGIN {trace("Monitoring ..\n"); }
fbt::cv_broadcast:entry {self->cv = (condvar_impl_t *)arg0; }
fbt::cv_broadcast:entry /self->cv && self->cv->cv_waiters>500/ {
       printf("%Y [%d] %s %d woken\n", walltimestamp, pid, execname, self->cv->cv_waiters);
       stack();}
fbt::cv_broadcast:entry /self->cv/ {self->cv = 0;}' 

I needed to make the dynvarsize 4m as I was running this on a pretty large machine so we were getting a lot of thread local variables created and destroyed.

I was rewarded with output like

Monitoring ..                                                                                                                       
2013 Sep 23 15:20:49 [0] sched 1024 woken

              vxfs`vx_sched_thread+0x131c
              unix`thread_start+0x4
2013 Sep 23 15:21:28 [0] sched 1024 woken

              vxfs`vx_sched_thread+0x131c
              unix`thread_start+0x4
2013 Sep 23 15:26:47 [0] sched 1024 woken

Posting in case anyone else has found themselves wanting to find out this kind of thing. Happy DTracing all.

Thursday Feb 28, 2013

A Solaris tmpfs uses real memory

That title may sound a little self-explanatory and obvious, but over the last two weeks I have had two customers tell me flat out that /tmp uses swap and that I should still continue to investigate where their memory is being used.

This is likely because when you define /tmp in /etc/vfstab, you list the device being used as swap.

In the context of a tmpfs, swap means physical memory + physical swap. A tmpfs uses pageable kernel memory. This means that it will use kernel memory, but if required these pages can be paged to the swap device. Indeed if you put more data onto a tmpfs than you have physical memory, this is pretty much guaranteed.

If you are still not convinced try the following.

  1. In one window start up the command
    $ vmstat 2
  2. In another window make a 1gb file in /tmp.
    $ mkfile 1g /tmp/testfile
  3. Watch what happens in the free memory column in the vmstat output.

There seems to be a misconception amongst some that a tmpfs is a way of stealing some of the disk we have allocated as swap to use as a filesystem without impacting memory. I'm sorry, this is not the case.

Tuesday Jan 22, 2013

Using /etc/system on Solaris

I had cause to be reminded of this article I wrote for On#Sun almost ten years ago and just noticed that I had not transferred it to my blog.

/etc/system is a file that is read just before the root filesystem is mounted. It contains directives to the kernel about configuring the system. Going into depth on this topic could span multiple books so I'm just going to give some pointers and suggestions here.

Warning, Danger Will Robinson

Settings can affect initial array and structure allocation, and indeed such things as the module load path and where the root directory actually resides.

It is possible to render your system unbootable if you are not careful. If this happens you might try booting with the '-a' option where you get the choice to tell the system to not load /etc/system.

Just because you find a set of values works well on one system does not necessarily mean that they will work properly on another. This is especially true if we are looking at different releases of the operating system, or different hardware.

You will need to reboot your system before these new values will take effect.

The basic actions that can be taken are outlined in the comments of the file itself so I won't go into them here.

The most common action is to set a value. Any number of products make suggestions for settings in here (eg Oracle, Veritas Volume Manager and Filesystem to name a few). Setting a value overrides the system default.

A practice that I follow when working on this file is to place a comment explaining why and when I make a particular setting (remember that a comment in this file is prefixed by a '*', not a '#'). This is useful later down the track when I may have to upgrade a system. It could be that the setting may actually not have the desired effect and it would be good to know why we originally did it.

I harp on this point but it is important.

Just because settings work on one machine does not make them directly transferable to another.

For example

set lotsfree=1024

This tells the kernel not to start running the page scanner (which starts paging memory out to disk) until free memory drops below 8mb (1024 x 8k pages). While this setting may be fine on a machine with around 512mb of memory, it does not make sense for a machine with 10gb. Indeed if the machine is under memory pressure, by the time we get down to 8mb of free memory we have very little breathing space to try to recover before requiring more memory. The end result is a system that grinds to a halt until it can free up some resources.

Oracle makes available the Solaris Tunable Parameters guide as a part of the documentation for each release of Solaris. It gives information about the default values and the uses of a lot of system parameters.

Monday Jul 30, 2012

Using a libc.so from a previous kernel patch (Just Don't)

I was just assisting a colleague with an issue where, after patching, the customer found more lock spinning in malloc() in libc.

He just told me that the customer copied the old libc into a directory in /tmp, changed LD_LIBRARY_PATH to point there first and ran their application observing that the issue went away.

OK, where do we start here, ...

NO

Two things immediately spring to mind as to why this is a bad idea.

  1. libc is tightly linked to the kernel system call interfaces. These interfaces are private to libc. As such they can be changed as long as the same change is made in the libc code. If you mismatch libc and the kernel you risk incorrectly calling system calls, with potentially fatal consequences.
  2. Placing a library into /tmp (or a directory under /tmp) is dangerous. Picture the following scenario. Someone builds their own library (it doesn't have to be libc, just something that your application uses) and places it into the directory you added to your search path (e.g. by renaming your directory and creating their own). Now your application can be made to run trojan code with any kind of side effect. The same applies if you leave the path in a startup script and reboot: if the directory doesn't exist, anyone can create it and do the same thing.

In short, please don't.

Sunday Jun 03, 2012

The Importance of Fully Specifying a Problem

I had a customer call this week where we were provided a forced crashdump and asked to determine why the system was hung.

Normally when you are looking at a hung system, you will find a lot of threads blocked on various locks, and most likely very little actually running on the system (unless it's threads spinning on busy wait type locks).

This vmcore showed none of that. In fact we were seeing hundreds of threads actively on cpu in the second before the dump was forced.

This prompted the question back to the customer:

What exactly were you seeing that made you believe that the system was hung?

It took a few days to get a response, but the response that I got back was that they were not able to ssh into the system and when they tried to login to the console, they got the login prompt, but after typing "root" and hitting return, the console was no longer responsive.

This description puts a whole new light on the "hang". You immediately start thinking "name services".

Looking at the crashdump, yes the sshds are all in door calls to nscd, and nscd is idle waiting on responses from the network.

Looking at the connections I see a lot of connections to the secure ldap port in CLOSE_WAIT, but more interestingly I am seeing a few connections over the non-secure ldap port to a different LDAP server just sitting open.

My feeling at this point is that we have either a non-responding LDAP server, or one that is responding slowly, the resolution being to investigate that server.

Moral

When you log a service ticket for a "system hang", it's great to get the forced crashdump first up, but it's even better to get a description of what you observed that made you believe that the system was hung.

Tuesday Jan 24, 2012

Using lightning from homedir on SPARC and x86 Solaris

I make great use of lightning in my thunderbird installation.

At the moment I am in the process of migrating from my Sun Blade 2000 Sun Ray server to an x86 based one.

The problem is that I am running the lightning plugin from my automounted home directory and the lightning plugin has one shared library (libcalbasecomps.so) in it.

Now the thunderbird as installed in Solaris 11 actually comes with a compatible lightning installed so you can use that. Unfortunately (or fortunately) I try to run current thunderbird (at the time of writing 9.0.1).

For reference, you can get the lightning plugin for Solaris from http://releases.mozilla.org/pub/mozilla.org/calendar/lightning/releases/1.1.1/contrib.

The obvious answer would have been to install it where I keep my thunderbird executables, but I couldn't quickly work out how to do that.

I already had the SPARC version installed. Apart from the Identifier number being different the only differences in lightning.xpi (after unzipping it) appear to be a platform line in install.rdf and the shared library.

What I did was to make a directory in my thunderbird install directory to house the architecture specific library on both the SPARC and x86 machine.

$ mkdir /rpool/thunderbird/arch

On each machine I got hold of the shared library and put a copy of it into this directory.

$ unzip lightning.xpi
...
$ cp components/libcalbasecomps.so /rpool/thunderbird/arch

Then we head into the currently installed plugin in my home directory. Note the quotes. Shells have special meanings for braces.

$ cd '.thunderbird/profilename/extensions/{e2fda1a4-762b-4020-b5ad-a41df1933103}/components'
$ rm libcalbasecomps.so
$ ln -s /rpool/thunderbird/arch/libcalbasecomps.so .

Almost there.

Now in the directory one up from the components directory there is a file called install.rdf. In this file there is the following line:

<em:targetPlatform>SunOS_sparc-sunc</em:targetPlatform>

This needs to be commented out:

<!-- <em:targetPlatform>SunOS_sparc-sunc</em:targetPlatform> -->

I now can run my thunderbird from either machine and continue to use lightning. I just need to follow this process whenever I upgrade thunderbird/lightning (Part of the reason for doing this blog).

As an aside, my /rpool/thunderbird and /rpool/firefox are each a zfs filesystem under rpool. Before I upgrade anything I make a zfs snapshot. That way if anything breaks, rolling back to a working version is trivial.

Friday Nov 25, 2011

Interim Patches for CVE-2011-4313 released through MOS

As reported in the article on the Sun Security Blog, interim patches are available for Solaris 8, 9 and 10 directly from MOS without the need to log a Service Request. There is also Interim Relief available for Solaris 11, but at this point in time that will still require a Service Request.

As seen from running "named -V", these patches implement the same fix as ISC by taking BIND to the version:
BIND 9.6-ESV-R5-P1.

Thursday Nov 10, 2011

Upgrading Solaris 11 Express b151a with support to Solaris 11

The most common problem that I am seeing on the aliases this morning is folks who are running 151a with a support update finding that their upgrade is failing.

The reason for this is that the version of pkg that you need to do the upgrade is in SRU#13. You need to update to this before switching to the release repository and upgrading.

This is an absolutely required step.

If you have an SRU older than #13 and have already switched to the release repository, you will need to switch back to the support repository, update and then go back to the release repository.

Monday Aug 01, 2011

What are these door things?

I recently had cause to pass on an article that I wrote for the now defunct Australian Sun Customer magazine (On#Sun) on the subject of doors. It occurred to me that I really should put this on the blog. Hopefully this will give some insight as to why I think doors are really cool.


Where does this door go?

If you have had a glance through /etc you may have come across some files with door in their name. You may also have noticed calls to door functions if you have run truss over commands that interact with the name resolver routines or password entry lookup.

The Basic Idea (an example)

Imagine that you have an application that does two things. First, it provides a lookup function into a potentially slow database (e.g. the DNS). Second, it caches the results to minimise having to make the slower calls.

There are already a number of ways that we could call the cached lookup function from a client (e.g. RPCs & sockets), but these require that we give up the cpu and wait for a response from another process. Even for a potentially fast operation, it could be some time before the client is next scheduled. Wouldn't it be nice if we could complete the operation within our time slice? Well, this is what the door interface accomplishes.

The Server

When you initialise a door server, a number of threads are made available to run a particular function within the server. I'll call this function the door function. These threads are created as if they had made a call to door_return() from within the door function. The server will associate a file and an open file descriptor with this function.

The Client

When the client initialises, it opens the door file and specifies the file descriptor when it calls door_call(), along with some buffers for arguments and return values. The kernel uses this file descriptor to work out how to call the door function in the server.

At this point the kernel gets a little clever. Execution is transferred directly to an idle door thread in the server process, which runs as if the door function had been called with the arguments that the client specified. As it runs in the server context, it has access to all of the global variables and other functions available to that process. When the door function is complete, instead of using return(), it calls door_return(). Execution is transferred back to the client with the result returned in a buffer we passed door_call(). The server thread is left sleeping in door_return().

If we did not have to give up the CPU in the door function, then we have just gained a major speed increase. If we did have to give it up, then we didn't really lose anything, as the overhead is only small.

This is how services such as the name service cache daemon (nscd) work. Library functions such as gethostbyname(), getpwent() and indeed any call whose behaviour is defined in /etc/nsswitch.conf are implemented with door calls to nscd. Syslog also uses this interface so that processes are not slowed down substantially because of syslog calls. A door function for this kind of service could simply place the request in a queue (a fast operation) for another thread to look after and then call door_return(), although that is not actually how syslog uses it.

For further information see the man pages for door_create, door_info, door_return and door_call (section 3C on current Solaris releases, 3DOOR on older ones).

Friday Aug 14, 2009

Why I hate macros that make pointer dereferences look like structure elements

I have a colleague who generated an IDR patch for tcp in Solaris 10 for me to give relief to a customer for a bug while the formal fix was in progress.

As a part of the fix we had this code fragment

 18984          /*
 18985           * If the SACK option is set, delete the entire list of
 18986           * notsack'ed blocks.
 18987           */
 18988          if (tcp->tcp_sack_info != NULL) {
 18989                  if (tcp->tcp_notsack_list != NULL)
 18990                          TCP_NOTSACK_REMOVE_ALL(tcp->tcp_notsack_list, tcp);
 18991          }

replaced with this code fragment (the fix actually has a lot more to it than this, but this is what was relevant).

 18936          /*
 18937           * If the SACK option is set, delete the entire list of
 18938           * notsack'ed blocks.
 18939           */
 18940
 18941          if (tcp->tcp_notsack_list != NULL)
 18942                  TCP_NOTSACK_REMOVE_ALL(tcp->tcp_notsack_list, tcp);

Now, the assembly around here reads

ip:tcp_process_shrunk_swnd+0x38:       ldx      [%i0 + 0xf8], %g1
ip:tcp_process_shrunk_swnd+0x3c:       add      %g3, %i1, %g2
ip:tcp_process_shrunk_swnd+0x40:       stw      %g2, [%i0 + 0x80]
ip:tcp_process_shrunk_swnd+0x44:       ldx      [%g1 + 0x48], %i5

and the register dump

pc:  0x7bed2918 ip:tcp_process_shrunk_swnd+0x44:   ldx  [%g1 + 0x48], %i5
npc: 0x7bed291c ip:tcp_process_shrunk_swnd+0x48:   subcc  %i5, 0x0, %g0   ( cmp   %i5, 0x0 )
  global:                       %g1                  0
        %g2             0x761c  %g3             0x68ec
        %g4      0x600216f6e6c  %g5                  0
        %g6               0x1c  %g7      0x2a101f89ca0
  out:  %o0      0x600210d8640  %o1              0x1e0
        %o2              0x5a8  %o3              0x3c8
        %o4      0x600216f6e6c  %o5      0x600216f68c4
        %sp      0x2a101f88c61  %o7         0x7bed2900
  loc:  %l0         0xc7341c85  %l1             0x2000  
        %l2      0x60010972000  %l3      0x600210d8640  
        %l4             0x1000  %l5             0x1000  
        %l6             0x1000  %l7                0x5  
  in:   %i0      0x600210d8640  %i1              0xd30  
        %i2                  0  %i3         0xc73439b5  
        %i4                  0  %i5                0x4  
        %fp      0x2a101f88d11  %i7         0x7becbed4  

The last instruction is where we panicked (yes, the customer panicked [twice] as a result of this). As we can see from the register dump, %g1 is NULL, so we definitely have a NULL pointer dereference going on.

So where did this come from? %g1 was loaded from offset 0xf8 of %i0. %i0 is a (tcp_t *), making %g1 a (tcp_sack_info *), namely arg0->tcp_sack_info if we look at the structure; but hang on, the code says tcp->tcp_notsack_list, not tcp->tcp_sack_info. Indeed, no element called tcp_notsack_list exists within a tcp_t.

A light dawns when we see that:

   299  #define tcp_notsack_list	        tcp_sack_info->tcp_notsack_list

So in reality line 18941 is doing:

 18941          if (tcp->tcp_sack_info->tcp_notsack_list != NULL)

Without checking whether or not tcp->tcp_sack_info is non-NULL. The correct line should perhaps read

 18941          if (tcp->tcp_sack_info != NULL && tcp->tcp_notsack_list != NULL)

Now this would probably not have made it as far as an IDR patch delivered to a customer if we didn't have that macro definition, because alarm bells would have rung that we were doing another dereference!

Thursday Jul 30, 2009

Interim fixes for Bind Vulnerability VU#725188/CVE-2009-0696 (Updated)

Yesterday I noticed an article titled New DoS Vulnerability in All versions of BIND 9 on slashdot. The article refers to BIND Dynamic Update DoS at the ISC site describing Vulnerability Note VU#725188 - ISC BIND 9 vulnerable to denial of service via dynamic update request.

This very rapidly caused a stir on a few internal mailing lists that I'm on, and work began on addressing it as

        6865903 Updated, P1 network/dns CVE-2009-0696 BIND dynamic update problem

The current status of this within Sun is that the Interim Security Reliefs (ISR) are available from http://sunsolve.sun.com/tpatches for the following releases:

SPARC Platform

  • Solaris 10 IDR142522-01
  • Solaris 9 IDR142524-01

x86 Platform:

  • Solaris 10 IDR142523-01
  • Solaris 9 IDR142525-01

Sun Alert 264828 is on its way to be published. When published it will be available from: http://sunsolve.sun.com/search/document.do?assetkey=1-66-264828-1

The fix is planned for build 121 for OpenSolaris/Nevada and we're attempting to get it into the next possible release Support Repository Update (SRU3).

Update 1

It turns out that the Solaris 9 ISR patches rely on an unreleased patch for Solaris 9. Work is underway to get this dependency out quickly.

Monday Jun 22, 2009

Live Upgrade and TimeSlider gotcha

Tried to upgrade my workstation over the weekend to snv_117. Apart from a little tidying up I had to do as a package didn't install correctly, all appeared to be going fine. I then went to unmount /.alt.snv_117, and it failed saying that the filesystem was busy.

fuser -c showed no processes using the mount point. What could it be?

A little bit of dtracing the umount2() system call was illuminating.

  1              <- zfsctl_umount_snapshots           0                0
  1            <- zfs_umount                          0               16

Hang on, snapshots? Although it returned 0, let's just check, as I do have timeslider enabled on this box.

rootksh@vesvi:~$ zfs list -t snapshot|grep 117                                                                     
pool/ROOT/snv_116@snv_117                                   4.03M      -  8.78G  -
pool/ROOT/snv_117@zfs-auto-snap:hourly-2009-06-19-09:00     43.8M      -  7.99G  -
pool/ROOT/snv_117@zfs-auto-snap:hourly-2009-06-19-10:00     48.9M      -  8.44G  -
pool/ROOT/snv_117@zfs-auto-snap:hourly-2009-06-19-11:00     43.7M      -  8.74G  -
pool/ROOT/snv_117@zfs-auto-snap:frequent-2009-06-19-11:15   42.6M      -  8.75G  -
pool/ROOT/snv_117@zfs-auto-snap:frequent-2009-06-19-11:30   45.8M      -  8.76G  -
pool/ROOT/snv_117@zfs-auto-snap:frequent-2009-06-19-11:45   38.1M      -  8.77G  -
pool/ROOT/snv_117@zfs-auto-snap:hourly-2009-06-19-12:00     38.5M      -  8.80G  -
pool/ROOT/snv_117@zfs-auto-snap:daily-2009-06-22-00:00          0      -  8.80G  -
pool/ROOT/snv_117@zfs-auto-snap:weekly-2009-06-22-00:00         0      -  8.80G  -
pool/ROOT/snv_117@zfs-auto-snap:hourly-2009-06-22-10:00         0      -  8.80G  -
pool/ROOT/snv_117@zfs-auto-snap:frequent-2009-06-22-10:30       0      -  8.80G  -
pool/ROOT/snv_117@zfs-auto-snap:frequent-2009-06-22-10:45       0      -  8.80G  -

Oh, timeslider was taking snapshots of the filesystem while it was upgrading. Hmm maybe we should be having that disabled on the target of a live upgrade (rfe coming, but I don't hold out a lot of hope).

Anyway, removing them was not difficult:

rootksh@vesvi:~$ zfs list -t snapshot|grep snv_117@zfs-auto|awk '{print $1}' | xargs -L 1 zfs destroy
rootksh@vesvi:~$ luumount snv_117                                                                                  
rootksh@vesvi:~$ 

Something to keep in mind if you are using timeslider, zfs root and live upgrade (I wonder if we would have the same issue with 'pkg image-update' in OpenSolaris).

Monday May 04, 2009

multithreaded processes and mdb

Today I had to look at a gcore of devfsadm. Most specifically I wanted to have a look at what the threads in cond_wait() were doing. I haven't done a lot with such stuff in userland before so thought it would make a good short blog topic on things that can be done.

First off we run up mdb

#  mdb /usr/sbin/devfsadm devfsadm.gcore
Loading modules: [ libsysevent.so.1 libnvpair.so.1 libc.so.1 libavl.so.1 libuutil.so.1 ld.so.1 ]
> 

Great, we got all the modules. So, what lwps have we got?

> $L
lwpids 1, 2, 3, 4, 5 and 6 are in core of process 135.

So we have six threads, let's have a look at the registers in the first one (note that this is on SPARC).

> 1::regs
%g0 = 0x00000000                 %l0 = 0x00000000 
%g1 = 0x0000001d                 %l1 = 0x00043748 
%g2 = 0x0003cb2c                 %l2 = 0xffbff8ac 
%g3 = 0x00038000                 %l3 = 0x00000001 
%g4 = 0x0003cb2c                 %l4 = 0x00000000 
%g5 = 0x00000000                 %l5 = 0x00000000 
%g6 = 0x00000000                 %l6 = 0x00000000 
%g7 = 0xff342a00                 %l7 = 0x00000001 
%o0 = 0xff342c40                 %i0 = 0x00000001 
%o1 = 0xff13b90c libc.so.1`pause+0x50 %i1 = 0x0003a2a4 
%o2 = 0xff1c3800 libc.so.1`_uberdata %i2 = 0xff342a00 
%o3 = 0x00000000                 %i3 = 0x00039954 
%o4 = 0xff342a00                 %i4 = 0x00016964 
%o5 = 0x00000000                 %i5 = 0x00000000 
%o6 = 0xffbff850                 %i6 = 0xffbff8b0 
%o7 = 0xff13b914 libc.so.1`pause+0x58 %i7 = 0x00015ce4 

 %psr = 0x00000044 impl=0x0 ver=0x0 icc=nzvc
                   ec=0 ef=0 pil=0 s=0 ps=64 et=0 cwp=0x4
  %y = 0x00000000
  %pc = 0xff14c160 libc.so.1`_pause+4
  %npc = 0xff14c164 libc.so.1`_pause+8
  %sp = 0xffbff850
  %fp = 0xffbff8b0                    

 %wim = 0x00000082
 %tbr = 0x00000000

Now to have a look at the stack we simply find the %sp value and use it with the stack dcmd.

> 0xffbff850::stack
0x15ce4(0, 43b48, 39db4, 4, 2276c, 38000)
main+0x358(0, 39f2c, ffbffdec, 398e4, 1, 38000)
_start+0x108(0, 0, 0, 0, 0, 0)

Note that this gives the stack frames above the current one and not the current one itself. From the value of %pc above we can see where we are executing in the current frame. You can also see that the caller does not have an entry in the symbol table. Unfortunately, on Solaris 10, devfsadm has a lot of functions and variables declared as static, which really does make debugging a pain. Fortunately this is not the case in Nevada/OpenSolaris.

Looking at the other lwps is as simple as listing the lwp id in front of the regs dcmd and repeating what we just did. I won't go into how I worked out which of the static routines we were executing in for the other lwps in cond_wait(), save to say that there are only a couple of places that make that call in the code, and matching up the assembly around the locations to the source (especially looking at called functions), makes this not too difficult.

Tuesday Mar 03, 2009

My CMT machine loads Oracle Databases slower than ..

This is more of an "Oh no not again" type post, ...

I am constantly amazed at the number of escalations that make it to the performance group with this as the problem.

It really is a case of unrealistic expectations and a lack of knowledge of what the machines excel at.

The most recent of these to cross my desk talks of a customer concerned that a dual core 2.5GHz x86 based box loads data into an Oracle database much faster than his shiny new T5220.

Until such a time as Oracle makes their SQL Loader run multi-threaded (which may bring in problems all of their own) this will always be the case.

The design of the system is such that it will run single threaded applications much slower than the x86 counterparts. These machines, however, come into their own once we enter production and start getting lots of parallel requests on the database. As we are running far more cpus, the load on the database must be much higher before we start to see any significant degradation.

The question that really should be asked here is, "Where do you want your performance? In the database load that you will do once, or in responding to production queries?"

About

* - Solaris and Network Domain, Technical Support Centre


Alan is a kernel and performance engineer based in Australia who tends to have the nasty calls gravitate towards him
