
Alan Hargreaves's Weblog

Recent Posts

Solaris

An Interoperability Problem (WebLogic, Java and Solaris)

Last Friday I got pulled in to a very hot customer call. The issue was best summarised as:

    Since migrating our WebLogic and database services from AIX to Solaris, at random times we are seeing the WebLogic server pause for a few minutes at a time. This takes down the back office and client services part of the business for these periods and is causing increasing frustration on the part of staff and customers. This only occurs when a particular module is enabled in WebLogic.

Before going on, I should note that there were other parts to this call which required option changes to WebLogic that also addressed major performance issues (such as the field iPads timing out talking to the WebLogic service), but it was these pauses that were the great concern to the customer.

I'd been given data from a GUDS run which initially made me concerned that we were having pstack(1) run on the WebLogic java service. Pstack will stop the process while it walks all of the thread stacks. This could certainly have a nasty effect on accessibility to the service.

Unfortunately it was not to be that simple. The pstack collection was actually part of the data gathering process that the WebLogic folks were running. A great example of the Heisenberg Effect while looking at a problem. The effect of this data gathering masked out the possibility of seeing anything else.

I should also mention that, in order to keep business running, the customer had disabled the particular module, so we were very limited in when we could get it enabled. Data gathering also required them to send someone out to site with an iPad (which was the field interface that seemed to be an enabler of the problem). We were pretty much getting one shot at data gathering in any given 24 hour period.

The next day we gathered data with the pstack command commented out.

This was a little more interesting. However, the issue was only present for short periods and we were only gathering small lockstat profiles, so it was hit and miss whether a profile was being taken while the issue was apparent, which made it difficult to pin anything down. I did notice that we seemed to be spending more time in page-faulting than I would have expected (about 25% of available cpu at one point!), and about half of that time was being spent spinning on a cross call mutex to flush the newly mapped addresses from all other cpu caches.

With the data from the next day's run I also noticed that we had the kflt_user_evict() thread fighting for the same mutex. My thought at this time was to disable that thread, and for good measure also disable page coalescing, by adding the following lines to /etc/system and rebooting.

    set kflt_disable=1
    set mpss_coalesce_disable=1
    set pg_contig_disable=1

It still felt like we were addressing symptoms but not the cause.

We had our breakthrough on Tuesday when we got the following update from the customer:

    The iPad transaction which triggers this issue has a label printing as part of the transaction. The application uses the Solaris print spooling mechanism to spool prints to a label printer. The application code spawns a lp process to do this spooling. The code used is something like below:

        Runtime.getRuntime().exec("lpr -P <destination> <file-name>");

    We have observed that the CPU sys% spike behaviour seems not to occur once we have disabled this print spooling functionality. Is there any known issues in Oracle Java 1.7 with spawning multiple processes from within the JVM? Note that this functionality as such has always been working fine on an IBM JVM on AIX.

This suddenly made the page-faulting make sense.

On chasing up the Java folks, it looks like the default mechanism for this kind of operation is fork()/exec().

Now, fork will completely clone the address space of the parent process. This is what was causing all of the page-faults. The WebLogic Java process had a huge memory footprint and more than 600 threads.

Further discussion with the Java folks revealed that there was an option in the later Java versions that forces Java to use posix_spawn() rather than fork/exec, which would stop the address space duplication. The customer needed to start Java with the option:

    -Djdk.lang.Process.launchMechanism=posix_spawn

They implemented this along with the other application changes and it looks to have been running acceptably now for a few days.

The key point from this call is that without all three of the support groups (WebLogic, Java and Solaris) looking at it, it is a virtual certainty that we would not have gotten to root cause and found a solution.

Well done everyone.
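As a rough illustration of getting that option onto the JVM command line, here is a minimal shell sketch. It assumes the server's start scripts pick up extra JVM flags from a JAVA_OPTIONS environment variable; that variable name and the add_jvm_opt helper are assumptions for illustration, not something taken from the customer's setup.

```shell
#!/bin/sh
# Hypothetical helper: append a JVM flag to an option string
# unless that exact flag is already present.
add_jvm_opt() {
    opts=$1
    flag=$2
    if [ -z "$opts" ]; then
        printf '%s\n' "$flag"
        return
    fi
    case " $opts " in
        *" $flag "*) printf '%s\n' "$opts" ;;        # already set, leave unchanged
        *)           printf '%s\n' "$opts $flag" ;;  # append once
    esac
}

# Assumption: the start script reads JAVA_OPTIONS before launching java.
JAVA_OPTIONS=$(add_jvm_opt "${JAVA_OPTIONS:-}" \
    "-Djdk.lang.Process.launchMechanism=posix_spawn")
export JAVA_OPTIONS
```

Appending the flag only when it is missing keeps the option string from growing each time the environment script is sourced.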


Solaris

Quick and Dirty iSCSI between Solaris 11.1 targets and a Solaris 10 Initiator

I recently found myself with a support request to do some research involving looking at the results of removing vdevs from a pool in a recoverable way while doing operations on the pool.

My initial thought was to make the disk devices available to a guest ldom from a control ldom, but I found that Solaris and LDOMS coupled things too tightly for me to do something which had the potential to cause damage.

After a bit of thought, I realised that I also had two Solaris machines already configured in our dynamic lab set up based in the UK that I could use to create some iSCSI targets that could be made available to the guest domain that I'd already built. I needed to use two hosts to provide the targets as, for reasons that I really don't need to go in to, I wanted an easy way to make them progressively unavailable in such a way that I could make them available again. Using two hosts meant that I could do this with shutdown/boot.

The tricky part is that the ldom I wanted to test on was running Solaris 10 and the two target machines were running Solaris 11.1.

I needed to reference the following documents:

    Configuring Devices with COMSTAR (Solaris 11)
        How to Enable the STMF Service
        How to Create a Logical Unit
        How to Create an iSCSI Target
    Setting Up Solaris iSCSI Targets and Initiators (Solaris 10)
        How to Configure iSCSI Target Discovery

The boxes

    Name       Address         Location   Solaris Release
    target1    10.163.249.27   UK         Solaris 11.1
    target2    10.163.246.122  UK         Solaris 11.1
    initiator  10.187.56.220   Australia  Solaris 10

Setting up target1

Install the iSCSI packages.

    target1# pkg install group/feature/storage-server
    target1# svcadm enable stmf

Create a small pool and then make a small volume. Use a file for the pool, as we don't have any extra disk attached to the machine and we really don't need much.

    target1# mkfile 4g /var/tmp/iscsi
    target1# zpool create iscsi /var/tmp/iscsi
    target1# zfs create -V 1g iscsi/vol0

Make it available as an iSCSI target. Take note of the target name, we'll need that later.

    target1# stmfadm create-lu /dev/zvol/rdsk/iscsi/vol0
    Logical unit created: 600144F000144FF8C1F0556D55660001
    target1# stmfadm list-lu
    LU Name: 600144F000144FF8C1F0556D55660001
    target1# stmfadm add-view 600144F000144FF8C1F0556D55660001
    target1# stmfadm list-view -l 600144F000144FF8C1F0556D55660001
    target1# svcadm enable -r svc:/network/iscsi/target:default
    target1# svcs -l iscsi/target
    fmri         svc:/network/iscsi/target:default
    name         iscsi target
    enabled      true
    state        online
    next_state   none
    state_time   Tue Jun 02 08:06:29 2015
    logfile      /var/svc/log/network-iscsi-target:default.log
    restarter    svc:/system/svc/restarter:default
    manifest     /lib/svc/manifest/network/iscsi/iscsi-target.xml
    dependency   require_any/error svc:/milestone/network (online)
    dependency   require_all/none svc:/system/stmf:default (online)
    target1# itadm create-target
    Target iqn.1986-03.com.sun:02:e9d04086-3bd7-e8a7-e5b6-ac91ba0d4394 successfully created
    target1# itadm list-target -v
    TARGET NAME                                                  STATE    SESSIONS
    iqn.1986-03.com.sun:02:e9d04086-3bd7-e8a7-e5b6-ac91ba0d4394  online   0
            alias:                  -
            auth:                   none (defaults)
            targetchapuser:         -
            targetchapsecret:       unset
            tpg-tags:               default

Setting up target2

Pretty much the same as what we just did on target1.

Install the iSCSI packages.

    target2# pkg install group/feature/storage-server
    target2# svcadm enable stmf

Create a small pool and then make a small volume. Use a file for the pool, as we don't have any extra disk attached to the machine and we really don't need much.

    target2# mkfile 4g /var/tmp/iscsi
    target2# zpool create iscsi /var/tmp/iscsi
    target2# zfs create -V 1g iscsi/vol0

Make it available as an iSCSI target. Take note of the target name, we'll need that later.

    target2# stmfadm create-lu /dev/zvol/rdsk/iscsi/vol0
    Logical unit created: 600144F000144FFB7899556D5B750001
    target2# stmfadm add-view 600144F000144FFB7899556D5B750001
    target2# stmfadm list-view -l 600144F000144FFB7899556D5B750001
    View Entry: 0
        Host group   : All
        Target Group : All
        LUN          : Auto
    target2# svcadm enable -r svc:/network/iscsi/target:default
    target2# svcs -l iscsi/target
    fmri         svc:/network/iscsi/target:default
    name         iscsi target
    enabled      true
    state        online
    next_state   none
    state_time   Tue Jun 02 08:31:01 2015
    logfile      /var/svc/log/network-iscsi-target:default.log
    restarter    svc:/system/svc/restarter:default
    manifest     /lib/svc/manifest/network/iscsi/iscsi-target.xml
    dependency   require_any/error svc:/milestone/network (online)
    dependency   require_all/none svc:/system/stmf:default (online)
    target2# itadm create-target
    Target iqn.1986-03.com.sun:02:6cc0044c-3d29-6acd-a873-cfc80b91e52d successfully created
    target2# itadm list-target -v
    TARGET NAME                                                  STATE    SESSIONS
    iqn.1986-03.com.sun:02:6cc0044c-3d29-6acd-a873-cfc80b91e52d  online   0
            alias:                  -
            auth:                   none (defaults)
            targetchapuser:         -
            targetchapsecret:       unset
            tpg-tags:               default

Setting up initiator

Now make them statically available on the initiator. Note that we use the Target Names we got from the last step of each of the earlier setups. We also need to provide the IP address of the machine hosting the target as we are attaching them statically for simplicity.

    initiator# iscsiadm add static-config iqn.1986-03.com.sun:02:e9d04086-3bd7-e8a7-e5b6-ac91ba0d4394,10.163.249.27
    initiator# iscsiadm add static-config iqn.1986-03.com.sun:02:6cc0044c-3d29-6acd-a873-cfc80b91e52d,10.163.246.122
    initiator# iscsiadm modify discovery --static enable

Now we need to get the device nodes created.

    initiator# devfsadm -i iscsi
    initiator# format < /dev/null
    Searching for disks...done

    c1t600144F000144FF8C1F0556D55660001d0: configured with capacity of 1023.75MB
    c1t600144F000144FFB7899556D5B750001d0: configured with capacity of 1023.75MB

    AVAILABLE DISK SELECTIONS:
        0. c0d0
           /virtual-devices@100/channel-devices@200/disk@0
        1. c0d1
           /virtual-devices@100/channel-devices@200/disk@1
        2. c0d2
           /virtual-devices@100/channel-devices@200/disk@2
        3. c1t600144F000144FF8C1F0556D55660001d0
           /scsi_vhci/ssd@g600144f000144ff8c1f0556d55660001
        4. c1t600144F000144FFB7899556D5B750001d0
           /scsi_vhci/ssd@g600144f000144ffb7899556d5b750001
    Specify disk (enter its number):

Great, we've found them. Let's make a mirrored pool.

    initiator# zpool create tpool mirror c1t600144F000144FF8C1F0556D55660001d0 c1t600144F000144FFB7899556D5B750001d0
    initiator# zpool status -v tpool
      pool: tpool
     state: ONLINE
      scan: none requested
    config:

            NAME                                       STATE     READ WRITE CKSUM
            tpool                                      ONLINE       0     0     0
              mirror-0                                 ONLINE       0     0     0
                c1t600144F000144FF8C1F0556D55660001d0  ONLINE       0     0     0
                c1t600144F000144FFB7899556D5B750001d0  ONLINE       0     0     0

    errors: No known data errors

I was then in a position to go and do the testing that I needed to do.


Solaris

Bashed and Shellshocked

What a last four days this has been! Certainly from the perspective of a support engineer dealing with this who on days one and two ended up going 43 hours without sleep.

Unless you've been off the grid for the last week, you would know about Shellshock (CVE-2014-6271, CVE-2014-7169, CVE-2014-7186 and CVE-2014-7187).

To get the obligatories out of the way first, ...

Oracle has released a formal alert about this vulnerability which you can read at http://www.oracle.com/technetwork/topics/security/alert-cve-2014-7169-2303276.html

This document points to a MOS Note containing links to the fixes for this issue. Our current recommendation is to download all patches and IDR patches listed for your particular OS.

This document will be updated as patches get formally released. Given how quickly the initial IDRs were released as formal patches, I would not expect this to take very long.

I also need to thank those of you who have had patience while dealing with us in support on this. As you can imagine, we've had a huge number of calls on it which, given the nature of the issue and the fact that we hit the weekend, needed to be handled by a small number of support engineers. It's taken me most of the weekend to make sure that each of the calls that I own has a current status in it, which involved a couple of statuses from the initial updates (starting around 6am Friday morning Australia/Sydney) through various incarnations of updates. Any update that needs to go into all of my calls now will likely take at least four straight hours.

So I hope you can see that I am indeed grateful for your patience.

There are a few things that have cropped up quite a bit that I will detail here in the hope of avoiding some further calls on the issue.

The patch failed to install

The big one is that people get errors installing the Solaris 8 - Solaris 10 patches, generally accompanied by a message from checkinstall about being unable to open something.

Checkinstall runs as user nobody, group root (group 0 is really not special on Solaris; I would say that we get it because we only setuid(nobody) and don't touch the group), so if "nobody" does not have permission to read the patch, it will fail. Check the permissions on each element of the directory path into which you extracted the patch. I've found that "chmod go=u-w path", where path is the directory the patch was extracted into, fixes the issue. Of course extracting in somewhere like /tmp after a "umask 02" would also help.

Installing the patch didn't fix it

We've had a few folks tell us that they've installed the patch but bash still fails the tests. It's generally turned out that they've had another bash binary installed (eg in /usr/local/bin) that comes first in $PATH. Check "which bash".

Questions that we cannot answer

One last thing: Oracle has a policy of not announcing the time frame of releasing security fixes. If you ask when the IDR patches will become formal patches, we are not going to be able to answer you. What is worth noting though is that the -01 IDRs became formal patches very quickly. Indeed all fixes are now available in Solaris 11.2 sru 2.8.
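To make the "another bash earlier in $PATH" check concrete, here is a small sketch that lists every bash on the search path and probes a given binary. The function names are made up for illustration. The probe is the widely circulated environment-function test for the original CVE-2014-6271 only; it says nothing about the later CVEs, so treat a clean result as a hint rather than proof.

```shell
#!/bin/sh
# Print every executable named bash on $PATH, in search order.
list_bash_binaries() {
    old_ifs=$IFS
    IFS=:
    for dir in $PATH; do
        if [ -x "${dir:-.}/bash" ]; then
            printf '%s\n' "${dir:-.}/bash"
        fi
    done
    IFS=$old_ifs
}

# Return 0 if the binary given as $1 executes code smuggled in after a
# function definition in an environment variable (CVE-2014-6271 only).
is_shellshocked() {
    out=$(env 'x=() { :;}; echo VULNERABLE' "$1" -c 'echo probe' 2>/dev/null)
    case $out in
        *VULNERABLE*) return 0 ;;
        *)            return 1 ;;
    esac
}

# Usage: report on each bash found, first match in $PATH first.
for b in $(list_bash_binaries); do
    if is_shellshocked "$b"; then
        printf '%s: still vulnerable\n' "$b"
    else
        printf '%s: probe did not trigger\n' "$b"
    fi
done
```

The first line of output corresponds to the binary that a bare "bash" would run, which is the one worth checking against the patched location.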


Solaris

Who is renicing these processes?

I was helping out a colleague on such a call this morning. While the DTrace script I produced was not helpful in this actual case, I think it bears sharing anyway.

What we wanted was a way to find out why various processes were running with nice set to -20. There are two ways in which a process can have its nice changed:

    nice(2) - where it changes itself
    priocntl(2) - where something else changes it

I ended up with the following script after a bit of poking around.

    # dtrace -n '
    /* A process changing its own nice value */
    syscall::nice:entry
    {
            printf("[%d] %s calling nice(%d)", pid, execname, arg0);
    }

    /* priocntl() calls where the command (arg2) is PC_DONICE (6) */
    syscall::priocntlsys:entry
    /arg2 == 6/
    {
            this->n = (pcnice_t *)copyin(arg3, sizeof(pcnice_t));
            this->id = (procset_t *)copyin(arg1, sizeof(procset_t));
            printf("[%d] %s renicing %d by %d", pid, execname, this->id->p_lid, this->n->pc_val);
    }'

There is an assumption in there about p_lid being the PID that I want, but in this particular case it turns out to be ok. Matching arg2 against 6 is so that we only get priocntl() calls with the command PC_DONICE. I could have also had it check the pcnice_t->pc_op but I can put up with the extra output.

So what happens when we have this running and then try something like

    # renice -20 4147

    ...
    dtrace: description 'syscall::nice:entry ' matched 2 probes
    CPU     ID                    FUNCTION:NAME
      0    508                priocntlsys:entry [4179] renice renicing 4147 by 0
      0    508                priocntlsys:entry [4179] renice renicing 4147 by -20

Which is exactly what we wanted. We see the renice command (pid 4179) modifying pid 4147.

Oh, why didn't this help, I hear you ask?

It turns out that in this instance the process in question was being started by init from /etc/inittab, and as such was starting with nice set to whatever init is running at. In this case it is -20.


General

Oracle Support Service Request Surveys

This morning I had my manager tell me about a survey that was taken against one of my closed calls, where we had a very unhappy and dissatisfied customer. On having a look at the survey comments, it looks like the dissatisfaction was with a completely different call, as the comments don't bear any resemblance to anything in that particular call.

I can understand that if you've had a poor experience and you get a survey call, you will want to use that opportunity to express your dissatisfaction, but, ... It's really important that, if you want your dissatisfaction to go to the group that needs to hear about it, the information is put against the correct call.

You don't need to wait to see if your call is going to be randomly selected for a survey. Either by sending email to ops-cust-sat_ww@oracle.com and mentioning the SR in question, or by going to http://ora.cl/vNF after the call has been closed, you can request a survey on that call.

In fact, I would encourage folks to do exactly this for any call about which they want to say negative (or positive) things. The surveys are important feedback, but expressing your dissatisfaction against the wrong call number does not get the comments to the people who need to see them.

Edited to reflect the change in how to request a Survey -- 10 May 2017


Solaris

Using /etc/system on Solaris

I had cause to be reminded of this article I wrote for on#sun almost ten years ago and just noticed that I had not transferred it to my blog.

/etc/system is a file that is read just before the root filesystem is mounted. It contains directives to the kernel about configuring the system. Going into depth on this topic could span multiple books so I'm just going to give some pointers and suggestions here.

Warning, Danger Will Robinson

Settings can affect initial array and structure allocation, indeed such things as module load path and where the root directory actually resides.

It is possible to render your system unbootable if you are not careful. If this happens you might try booting with the '-a' option where you get the choice to tell the system to not load /etc/system.

Just because you find a set of values works well on one system does not necessarily mean that they will work properly on another. This is especially true if we are looking at different releases of the operating system, or different hardware.

You will need to reboot your system before these new values will take effect.

The basic actions that can be taken are outlined in the comments of the file itself so I won't go into them here.

The most common action is to set a value. Any number of products make suggestions for settings in here (eg Oracle, Veritas Volume Manager and Filesystem to name a few). Setting a value overrides the system default.

A practice that I follow when working on this file is to place a comment explaining why and when I made a particular setting (remember that a comment in this file is prefixed by a '*', not a '#'). This is useful later down the track when I may have to upgrade a system. It could be that the setting may actually not have the desired effect and it would be good to know why we originally did it.

I harp on this point but it is important.

Just because settings work on one machine does not make them directly transferable to another.

For example

    set lotsfree=1024
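When reviewing a heavily commented file before an upgrade, it can help to see only the active directives. The following is a minimal sketch that filters out comment lines (those starting with '*') and blank lines; active_settings is a made-up helper name, and the sample file is written to a temporary location purely for illustration rather than touching the real /etc/system.

```shell
#!/bin/sh
# Print only the active (non-comment, non-blank) lines of an
# /etc/system-style file given as $1.
active_settings() {
    grep -v '^\*' "$1" | grep -v '^[[:space:]]*$'
}

# Illustration only: build a throwaway sample file and filter it.
sample=$(mktemp)
cat > "$sample" <<'EOF'
* Example comment recording why and when a setting was made

set lotsfree=1024
EOF
active_settings "$sample"
rm -f "$sample"
```

Pointing the helper at a copy of the real file gives a short list of settings to re-justify against the comments that sit above them.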


Solaris

Using a libc.so from a previous kernel patch (Just Don't)

I was just assisting a colleague with an issue where, after patching, they found that there was higher lock spinning in malloc() in libc.

He just told me that the customer copied the old libc into a directory in /tmp, changed LD_LIBRARY_PATH to point there first and ran their application, observing that the issue went away.

OK, where do we start here, ...

NO

Two things immediately spring to mind as to why this is a bad idea.

libc is tightly linked to the kernel system call interfaces. These interfaces are private to libc. As such they can be changed as long as the same change is made in the libc code. If you mismatch libc and the kernel you risk incorrectly calling system calls, with potentially fatal consequences.

Placing a library into /tmp (or a directory under /tmp). Picture the following scenario. Someone builds their own library (it doesn't have to be libc, it just has to be something that your application uses) and places it into the directory you added to your search path (eg by renaming your directory and creating their own). Now we have the potential of having your application run trojan code with any kind of side effect. There are similar issues if you leave the path in a startup script and reboot: if the directory doesn't exist, anyone can create it and do the same thing.

In short, please don't.
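The second point can be checked mechanically. Here is a minimal sketch that walks a colon-separated search path and flags entries that do not exist or that anyone can write to; audit_lib_path is a hypothetical helper name. It only looks at each listed directory itself, not at its parent directories, so it is a first-pass check rather than a full audit.

```shell
#!/bin/sh
# Flag entries of a colon-separated library path ($1) that are
# missing or world-writable.
audit_lib_path() {
    old_ifs=$IFS
    IFS=:
    for dir in $1; do
        [ -n "$dir" ] || continue
        if [ ! -d "$dir" ]; then
            # Anyone able to create this directory later controls what is loaded.
            printf 'missing: %s\n' "$dir"
        elif [ -n "$(find "$dir" -prune -perm -0002 2>/dev/null)" ]; then
            printf 'world-writable: %s\n' "$dir"
        fi
    done
    IFS=$old_ifs
}

# Usage: check whatever the current environment would search.
audit_lib_path "${LD_LIBRARY_PATH:-}"
```

Any line of output is a directory that should not be in a search path used by a privileged or long-running application.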


Solaris

The Importance of Fully Specifying a Problem

I had a customer call this week where we were provided a forced crashdump and asked to determine why the system was hung.

Normally when you are looking at a hung system, you will find a lot of threads blocked on various locks, and most likely very little actually running on the system (unless it's threads spinning on busy wait type locks).

This vmcore showed none of that. In fact we were seeing hundreds of threads actively on cpu in the second before the dump was forced.

This prompted the question back to the customer:

    What exactly were you seeing that made you believe that the system was hung?

It took a few days to get a response, but the response that I got back was that they were not able to ssh into the system and when they tried to login to the console, they got the login prompt, but after typing "root" and hitting return, the console was no longer responsive.

This description puts a whole new light on the "hang". You immediately start thinking "name services".

Looking at the crashdump, yes, the sshds are all in door calls to nscd, and nscd is idle waiting on responses from the network.

Looking at the connections I see a lot of connections to the secure ldap port in CLOSE_WAIT, but more interestingly I am seeing a few connections over the non-secure ldap port to a different LDAP server just sitting open.

My feeling at this point is that we have an LDAP server that is either not responding or responding slowly, the resolution being to investigate that server.

Moral

When you log a service ticket for a "system hang", it's great to get the forced crashdump first up, but it's even better to get a description of what you observed that made you believe that the system was hung.


Solaris

Using lightning from homedir on SPARC and x86 Solaris

I make great use of lightning in my thunderbird installation.

At the moment I am in the process of migrating from my Sun Blade 2000 Sun Ray server to an x86 based one.

The problem is that I am running the lightning plugin from my automounted home directory and the lightning plugin has one shared library (libcalbasecomps.so) in it.

Now the thunderbird as installed in Solaris 11 actually comes with a compatible lightning installed so you can use that. Unfortunately (or fortunately) I try to run current thunderbird (at the time of writing 9.0.1).

For reference, you can get the lightning plugin for Solaris from http://releases.mozilla.org/pub/mozilla.org/calendar/lightning/releases/1.1.1/contrib.

The obvious answer would have been to install it where I keep my thunderbird executables, but I couldn't quickly work out how to do that.

I already had the SPARC version installed. Apart from the Identifier number being different, the only differences in lightning.xpi (after unzipping it) appear to be a platform line in install.rdf and the shared library.

What I did was to make a directory in my thunderbird install directory to house the architecture specific library on both the SPARC and x86 machine.

    $ mkdir /rpool/thunderbird/arch

On each machine I got hold of the shared library and put a copy of it into this directory.

    $ unzip lightning.xpi
    ...
    $ cp components/libcalbasecomps.so /rpool/thunderbird/arch

Then we head into the currently installed plugin in my home directory. Note the quotes. Shells have special meanings for braces.

    $ cd '.thunderbird/profilename/extensions/{e2fda1a4-762b-4020-b5ad-a41df1933103}/components'
    $ rm libcalbasecomps.so
    $ ln -s /rpool/thunderbird/arch/libcalbasecomps.so .

Almost there.

Now in the directory one up from the components directory there is a file called install.rdf. In this file there is the following line:

    <em:targetPlatform>SunOS_sparc-sunc</em:targetPlatform>

This needs to be commented out:

    <!-- <em:targetPlatform>SunOS_sparc-sunc</em:targetPlatform> -->

I now can run my thunderbird from either machine and continue to use lightning. I just need to follow this process whenever I upgrade thunderbird/lightning (part of the reason for doing this blog).

As an aside, my /rpool/thunderbird and /rpool/firefox are each a zfs filesystem under rpool. Before I upgrade anything I make a zfs snapshot. That way if anything breaks, rolling back to a working version is trivial.


Solaris

What are these door things?

I recently had cause to pass on an article that I wrote for the now defunct Australian Sun Customer magazine (On#Sun) on the subject of doors. It occurred to me that I really should put this on the blog. Hopefully this will give some insight as to why I think doors are really cool.

Where does this door go?

If you have had a glance through /etc you may have come across some files with door in their name. You may also have noticed calls to door functions if you have run truss over commands that interact with the name resolver routines or password entry lookup.

The Basic Idea (an example)

Imagine that you have an application that does two things. First, it provides a lookup function into a potentially slow database (e.g. the DNS). Second, it caches the results to minimise having to make the slower calls.

There are already a number of ways that we could call the cached lookup function from a client (e.g. RPCs & sockets), but these require that we give up the cpu and wait for a response from another process. Even for a potentially fast operation, it could be some time before the client is next scheduled. Wouldn't it be nice if we could complete the operation within our time slice? Well, this is what the door interface accomplishes.

The Server

When you initialise a door server, a number of threads are made available to run a particular function within the server. I'll call this function the door function. These threads are created as if they had made a call to door_return() from within the door function. The server will associate a file and an open file descriptor with this function.

The Client

When the client initialises, it opens the door file and specifies the file descriptor when it calls door_call(), along with some buffers for arguments and return values. The kernel uses this file descriptor to work out how to call the door function in the server.

At this point the kernel gets a little clever. Execution is transferred directly to an idle door thread in the server process, which runs as if the door function had been called with the arguments that the client specified. As it runs in the server context, it has access to all of the global variables and other functions available to that process. When the door function is complete, instead of using return(), it calls door_return(). Execution is transferred back to the client with the result returned in a buffer we passed to door_call(). The server thread is left sleeping in door_return().

If we did not have to give up the CPU in the door function, then we have just gained a major speed increase. If we did have to give it up, then we didn't really lose anything, as the overhead is only small.

This is how services such as the name service cache daemon (nscd) work. Library functions such as gethostbyname(), getpwent() and indeed any call whose behaviour is defined in /etc/nsswitch.conf are implemented with door calls to nscd. Syslog also uses this interface so that processes are not slowed down substantially because of syslog calls. The door function simply places the request in a queue (a fast operation) for another syslog thread to look after and then calls door_return() (that's actually not how syslog uses it).

For further information see the section 3 man pages on door_create, door_info, door_return and door_call.


General

I have a performance problem

(copied from my wordpress blog).

So start 95% of the performance calls that I receive. They usually continue something like:

    I have gathered some *stat data for you (eg the guds tool from Document 1285485.1), can you please root cause our problem?

So, do you think you could?

Neither can I. Based on this, my answer inevitably has to be "No".

Given this kind of problem statement, I have no idea about the expectations, the boundary conditions, or even the application. The answer may as well be "Performance problems? Consult your local Doctor for Viagra". It's really not a lot to go on.

So, what kind of problem description is going to allow me to start work on the issue that is being seen? I don't doubt that there really is an issue, it just needs to be pinned down somewhat.

What behavior exactly are you expecting to see?

Be specific and use business metrics. For example "run-time", "response-time" and "throughput".

This helps us define exit criteria.

Now, let's look at the system that is having problems.

How is what you are seeing different? Use the same type of metrics.

The answers to these two questions take us a long way towards being able to work a call.

Even more helpful are answers to questions like:

    Has this system ever worked to expectation?
    If so, when did it start exhibiting this behavior?
    Is the problem always present, or does it sometimes work to expectation?
    If it sometimes works to expectation, when are you seeing the problem? Is there any discernible pattern?
    Is the impact of the problem getting better, worse, or remaining constant?
    What kind of differences are there between when the system was performing to expectation and when it is not?
    Are there other machines where we could expect to see the same issue (eg similar usage and load), but are not? Again, differences?

Once we start to gather information like this we start to build up a much clearer picture of exactly what we need to investigate, and what we need to achieve so that both you and I agree that the problem has been solved.

Please help get that figure of poorly defined problem statements down from its current 95% value.


Solaris Express

Summarising my "Nevada to OpenSolaris Sun Ray on SPARC" series

As previously noted, I finished my series on migrating from Nevada to OpenSolaris Sun Ray on SPARC. There are eight articles on this and it's taken me just under a year, generally trying to find time to work on it.

A lot has gone on in that year. I recognise that "OpenSolaris" is probably the wrong word to be using in the titles, but I'm not going to go back and change it. Please regard the term as interchangeable for the purpose of this discussion.

The end result is that I am back running Sun Ray Server on the machine that I originally had nevada on right up until we stopped doing nevada distributions. The machine is running a current Development build and is serving me well.

Anyway, I thought that a small article summarising and linking to each of the posts would be in order and this seemed a good first real article for me back on blogs.sun.com/tpenta.

Nevada to OpenSolaris on SPARC

I look at the problems of getting OpenSolaris on to a SPARC Sun Blade 2000 that does not have a dvd reader on it. The end result was to install a late nevada on it and use the wanboot image of this installation to get something onto it. There were a few things which needed tidying up but it worked. I should note that I have subsequently installed Solaris 11 Express onto an Ultra 45 from the install CD, and it went flawlessly. Maybe it would have been easier to find a dvd drive for the box than to jump through all these hoops.

Nevada to OpenSolaris on SPARC (part 2)

I mention some problems I was having with sh returning an Exec format error when trying to do things as myself, the solution to which is outlined here. I also show how I migrated the zpools to the temporary machine.

Nevada to OpenSolaris Sun Ray on SPARC (part 3 - reboot)

Disaster strikes. My lab booking ran out and someone re-installed the box, meaning I had to start from scratch. This blog is probably a much better technical reference for all that I had done before (isn't it always the way that when you have to do something over again, you manage to do a better job?).

Nevada to OpenSolaris Sun Ray on SPARC (part 4 - imapd)

Everything I had to do to get imapd running. Starting with installing the compilers, downloading the source code, working out the Makefile hacks to make it build, making the SSL certificates (oops, I have to do that again now I am on a machine with a different name).

Nevada to OpenSolaris Sun Ray on SPARC (part 5 - Sun Ray Server)

Step by step of getting around dependencies on Solaris 10 inside the Sun Ray software to get it running on this box. As I say at the end of the article, "Getting Sun Ray running like this on OpenSolaris completely voids any warranty and support. Don't do it if you don't know what you are doing."

Nevada to OpenSolaris Sun Ray on SPARC (part 6 - cutting over)

Smoke test time. What happened when I cut over to trying to do my day job on this machine. I realised I had missed a few things that were important, like openoffice, flash and acroread. The entry details the installations and also covers installing certificates for the extras repository.

Nevada to OpenSolaris Sun Ray on SPARC (part 7 - printing)

Details how I got CUPS up and running. I believe that Solaris 11 Express comes with all of the packages already installed so you only need to configure things.

Nevada to OpenSolaris Sun Ray on SPARC (part 8 - back to the original hardware)

The last installment, on moving everything back to the original machine, as I really did not want to be tying up a lab resource (though I had it booked for close to a year), and vesvi actually had a little more memory in it.

I cover some gotchas in using cloned disks like this as well as how I ended up doing it, how to change a nodename now, and a couple of local network gotchas that had me confused for a while.

Going back through this I realise that I may have left out configuring a couple of packages, like MySQL. I may do a final part 9 to cover these once I've made a list of them and I'll add that to here too.


OpenSolaris

Software Freedom Day in Sydney

Cristina Cifuentes invited me to speak at Michael Chan's (Campus Ambassador - UTS) Software Freedom Day event at UTS yesterday. The broad guidance given to me by Michael was "Alan can you talk about Solaris, ZFS and Zones. If possible can you demonstrate this?".

OK, I've not done a lot with zones, especially under OpenSolaris, and I do love a challenge.

Last year I spoke on a number of things at the day at Sydney Uni and I was a bit disappointed in the slideware I created for the stuff on ZFS. This year I decided that the best slides that I had seen on ZFS were from Jeff and Bill in their ZFS - The Last Word in Filesystems presentations, so I used a few slides from there. The fun part was that, as I only had the pdf file, I had to recreate the wonderful drawings they used on the slides I wanted. I learned a bit about drawing in OpenOffice doing this :) (and was told after I finished that apparently there is an OpenOffice plugin that will let you pull images out of pdf documents. Oh well).

I also found some good zones resources in sun blogs, most notably Cloning Zones by Brian Leonard.

Wikipedia was also helpful in verifying some timelines for Solaris and OpenSolaris.

At the end of the slides I flipped over to a couple of terminal sessions and demonstrated cloning some simple zones as I'd outlined in the talk. Very simple zones with no resources imported or networks. The impressive part was that we could demonstrate that with cloning we could provision and boot a new clone in under 10 seconds, which led nicely into Andrew Latham's talk on Cloud Computing.

Anyway, the pdf of the presentation can be downloaded here.


Solaris

Why I hate macros that make pointer dereferences look like structure elements

I have a colleague who generated an IDR patch for tcp in Solaris 10 for me to give relief to a customer for a bug while the formal fix was in progress.

As a part of the fix we had this code fragment

    18984 	/*
    18985 	 * If the SACK option is set, delete the entire list of
    18986 	 * notsack'ed blocks.
    18987 	 */
    18988 	if (tcp->tcp_sack_info != NULL) {
    18989 		if (tcp->tcp_notsack_list != NULL)
    18990 			TCP_NOTSACK_REMOVE_ALL(tcp->tcp_notsack_list, tcp);
    18991 	}

replaced with this code fragment (the fix actually has a lot more to it than this, but this is what was relevant).

    18936 	/*
    18937 	 * If the SACK option is set, delete the entire list of
    18938 	 * notsack'ed blocks.
    18939 	 */
    18940
    18941 	if (tcp->tcp_notsack_list != NULL)
    18942 		TCP_NOTSACK_REMOVE_ALL(tcp->tcp_notsack_list, tcp);

Now, the assembly around here reads

    ip:tcp_process_shrunk_swnd+0x38: ldx [%i0 + 0xf8], %g1
    ip:tcp_process_shrunk_swnd+0x3c: add %g3, %i1, %g2
    ip:tcp_process_shrunk_swnd+0x40: stw %g2, [%i0 + 0x80]
    ip:tcp_process_shrunk_swnd+0x44: ldx [%g1 + 0x48], %i5

and the register dump

    pc:  0x7bed2918 ip:tcp_process_shrunk_swnd+0x44: ldx [%g1 + 0x48], %i5
    npc: 0x7bed291c ip:tcp_process_shrunk_swnd+0x48: subcc %i5, 0x0, %g0 ( cmp %i5, 0x0 )
    global:                    %g1 0
            %g2 0x761c         %g3 0x68ec
            %g4 0x600216f6e6c  %g5 0
            %g6 0x1c           %g7 0x2a101f89ca0
    out:    %o0 0x600210d8640  %o1 0x1e0
            %o2 0x5a8          %o3 0x3c8
            %o4 0x600216f6e6c  %o5 0x600216f68c4
            %sp 0x2a101f88c61  %o7 0x7bed2900
    loc:    %l0 0xc7341c85     %l1 0x2000
            %l2 0x60010972000  %l3 0x600210d8640
            %l4 0x1000         %l5 0x1000
            %l6 0x1000         %l7 0x5
    in:     %i0 0x600210d8640  %i1 0xd30
            %i2 0              %i3 0xc73439b5
            %i4 0              %i5 0x4
            %fp 0x2a101f88d11  %i7 0x7becbed4

The last instruction is where we panicked (yes, the customer panicked [twice] as a result of this). As we can see from the register dump, %g1 is NULL, so we definitely have a NULL pointer dereference going on.

So where did this come from? It looks like a dereference of 0xf8 from %i0. %i0 is a (tcp_t *), making %g1 a (tcp_sack_info *), namely arg0->tcp_sack_info if we look at the structure; but hang on, the code says tcp->tcp_notsack_list, not tcp->tcp_sack_info. Indeed that element name does not exist within a tcp_t.

A light dawns when we see that:

    299 #define tcp_notsack_list tcp_sack_info->tcp_notsack_list

So in reality line 18941 is doing:

    18941 	if (tcp->tcp_sack_info->tcp_notsack_list != NULL)

without checking whether or not tcp->tcp_sack_info is non-NULL. The correct line should perhaps read

    18941 	if (tcp->tcp_sack_info != NULL && tcp->tcp_notsack_list != NULL)

Now this would probably not have made it as far as an IDR patch delivered to a customer if we didn't have that macro definition, because alarm bells would have rung that we were doing another dereference!
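The trap described in this post can be reproduced in a few lines of C. What follows is only a sketch: the two structures are hypothetical, cut-down stand-ins for the real tcp_t and its SACK information (only the macro has the same shape as the one quoted above), and has_notsack_list() is a made-up helper name, not anything from the Solaris source.

```c
#include <stddef.h>

/* Hypothetical, cut-down stand-in for the real SACK info structure. */
typedef struct sack_info {
	void *tcp_notsack_list;
} sack_info_t;

/* Hypothetical, cut-down stand-in for the real tcp_t. */
typedef struct tcp {
	sack_info_t *tcp_sack_info;	/* may be NULL when SACK is not in use */
} tcp_t;

/* Same shape as the macro in the post: a "field" that hides a dereference. */
#define	tcp_notsack_list tcp_sack_info->tcp_notsack_list

/*
 * Returns 1 if there is a notsack list to remove, 0 otherwise.
 * The hidden pointer is tested first, so tcp->tcp_notsack_list (which
 * expands to tcp->tcp_sack_info->tcp_notsack_list) is only evaluated
 * when tcp_sack_info is non-NULL.
 */
static int
has_notsack_list(const tcp_t *tcp)
{
	return (tcp->tcp_sack_info != NULL && tcp->tcp_notsack_list != NULL);
}
```

Dropping the first half of that condition gives exactly the shape of line 18941 in the patched code: it compiles cleanly, reads like a plain member access, and dereferences NULL whenever tcp_sack_info is not set.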


Music

Jamm For Genes in Second Life

Last year we ran this event in Second Life. It ran for 27 hours and we raised about AUS$2000 for Children's Medical Research Institute as a part of Jamm for Genes.

This year we're doing it again. It's on now in Second Life. We have 35 hours of live music lined up. All artists are donating their time.

The schedule is as follows. All times are US/Pacific (or Second Life Time).

Friday 7th August 2009

6 PM - The Pocket Jamms for Genes
6:00 PM Von Johin
6:30 PM Gary Jonstone
7:00 PM Dallas Silverspar
8:00 PM Artel Brando
9:00 PM Pato and the Re-Tenders
10:00 PM OhMy Kidd
11:00 PM Rosedrop Rust

Saturday 8th August 2009

12 AM - The Pocket Jamms for Genes (continued)
12:00 AM Phoe Nix
1:00 AM Angelica Svenska

2 AM - Laceys Place Jamms for Genes
2:00 AM OhMY and the Kids
3:00 AM Auzzie and The Lohners
4:00 AM Jackdog Snook
5:00 AM Rara Destiny

6 AM - Angels Place
6:00 AM Dann Numbers
7:00 AM Shane Kirshner
8:00 AM HölliVals

9 AM - The Old Barn
9:00 AM Bara Jonson
10:00 AM Don Cabassoun
11:00 AM Kaklick Martin

12 PM - Sailors Cove
12:00 PM Pmann Sands
1:00 PM Jonas Lunasea
2:00 PM This Device

3 PM - Endeavour Cove
3:00 PM Jaggpro Mccann
4:00 PM OhMy Kidd
5:00 PM Maximillion Kleene

6 PM - THE POCKET JAMMS AGAIN
6:00 PM Paisley Beebe & Freddy Halderman
7:00 PM Cellandra Zon

7:00 PM - Freestar Bay Jamms for Genes
7:45 PM Freestar Tammas
8:00 PM Dale Katscher
8:30 PM Matthew Perreault
9:00 PM KelvinBlue Oh
10:00 PM Noma Falta
11:00 PM Randyb Magic
11:30 PM Freestar Tammas

Sunday 9th August 2009

12 AM - Woodstock Jamms for Genes
12 AM Rosedrop Rust
1 AM Tpenta Vanalten
2 AM OhMy Kidd
3 AM SOAR
4 AM onward - Mariposa Upsahw's Art Auction Spectacular

We've been going now for a bit over an hour and would love to see more folks come, listen and donate.


OpenSolaris

Windows, OpenSolaris and VirtualBox

Over the weekend (as I knew we were going to have some network stuff going on) I installed Virtual Box on my notebook on the Windows disk (I have nevada on the other disk [yes, I have a notebook with two 250gb discs]) and then installed a release candidate ISO of OpenSolaris 2009.06. I copied a backup of my punchin credentials and the two packages I needed onto the FAT32 partition of the windows disc from within nevada and then got to work setting things up.

Gotcha #1, don't try to do the install with only 512mb memory. It looks like it's working, but it just sits there doing precious little. I used up about an hour of battery on the train trying this. I got off the train at Tuggerah and went to McDonalds to both get dinner and finish the install, which then went along happily (I chose McDonalds mainly because they also have free wifi).

Installed the punchin packages and restored the credentials. It actually took a bit of research to find out how to use sharing. Doing it from the Windows side with Virtual Box was relatively straightforward, but doing it from the OpenSolaris side was not so obvious. I ended up finding it after hitting the User Guide. I'd called the directory I wanted "share"; from OpenSolaris I had to do

    $ pfexec mount -F vboxfs share /mnt

Once I'd done that it just worked fine. Everything looked great, except I'd now run out of battery with no power points easily in sight. Oh well, headed home.

Once connected to the power everything seemed to work. I dropped the memory back to half a gig (as I run a few things in windows that are kinda memory hungry), and it worked fine for me all weekend just on the notebook.

Arrived at work today to find that, for reasons that I won't go into today, the workstation that I normally run my Sun Ray IIFS from was off the internal network, making my Sun Ray useless to me.

I ran up Virtual Box on the notebook and then started the OpenSolaris that I had, to start working. A little into the day it occurred to me that I have two 22" screens and a full sized keyboard and mouse sitting next to the notebook, currently doing very little. The keyboard and mouse are in their own unpowered usb hub plugged into the Sun Ray. OK, they are now plugged into the notebook. That made a huge difference to productivity.

I then unplugged the digital connection from the screen and attached that to the notebook. Now I have a mirror of what's on the notebook and it is also easier to read. Productivity goes up again.

You know, I could go one step further: instead of mirroring the screens, actually go dual screen, such that I have the windows session displayed on the notebook and then I could go full screen (1600x1200) for OpenSolaris.

Once I arranged things so that the mouse moved in the correct direction between the two, this was wonderful and surprisingly usable (well, more so once I upped the memory for the OpenSolaris part to 768mb).

Gotcha #2, don't install the VDI files on a FAT32 filesystem, which is what I did because that's where I had the most free space. Everything worked wonderfully until the size of the dynamic disc that I had allocated hit 4gb. Then I started getting errors about full disks from Virtual Box. OK, moving the VDI file to the NTFS disc wasn't that hard. I had to first physically copy it, then release and remove it from within Virtual Box, then attach it again from the NTFS disc.

And that's how I ended up getting my job done today. I'm pretty happy with how it worked out.
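The 4gb wall in Gotcha #2 lines up with FAT32's per-file size limit of 2^32 - 1 bytes, one byte short of 4 GiB, which a dynamically growing VDI file will eventually reach. The following is a minimal sketch of that arithmetic; fits_on_fat32() is a hypothetical helper name, not part of Virtual Box or any library.

```c
#include <stdint.h>

/* Largest file FAT32 can hold: 2^32 - 1 bytes (one byte short of 4 GiB). */
#define	FAT32_MAX_FILE_BYTES	UINT64_C(4294967295)

/* Hypothetical helper: returns 1 if a file of this size fits on FAT32, else 0. */
static int
fits_on_fat32(uint64_t size_bytes)
{
	return (size_bytes <= FAT32_MAX_FILE_BYTES);
}
```

A disk image that is allowed to grow to exactly 4 GiB (4294967296 bytes) is already one byte too large, which is why the errors only show up once the image has filled out.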


OpenSolaris

Installing "extra" packages against OpenSolaris 2008.11 (with or without support repository updates)

It took me a bit to work out what was going on here (including a number of re-installs to make sure I hadn't screwed up), so I thought it worth sharing.

First, what a failure looks like:

After following Chris's instructions for adding the extra repository, I tried to install flash from it so the kids could play their browser based flash games.

    $ pfexec pkg install pkg://extra/web/firefox/plugin/flash
    Creating Plan |
    pkg: the following package(s) violated constraints:
        Package pkg:/SUNWcsl@0.5.11,5.11-0.111 conflicts with constraint in installed pkg:/entire:
            Pkg SUNWcsl: Optional min_version: 0.5.11,5.11-0.101 max version: 0.5.11,5.11-0.101 defined by: pkg:/entire

It turns out that what is at issue here is that the extra repository now has the "fat" packages that we will be using for 2009.06. The pkg command on 2008.11 (with any number of support repository updates - I was originally on SRU4 before re-install) can't handle these so it produces that cryptic message.

So, what can we do about it?

The first step is to have a look at all versions of the package we are interested in on extra.

    $ pfexec pkg list -af 'pkg://extra/web/firefox/plugin/flash'
    NAME (AUTHORITY)                  VERSION           STATE  UFIX
    web/firefox/plugin/flash (extra)  10.0.22.87-0.111  known  ----
    web/firefox/plugin/flash (extra)  10.0.22.87-0.101  known  u---
    web/firefox/plugin/flash (extra)  9.0.151-0.101     known  u---
    web/firefox/plugin/flash (extra)  9.0.125-0.101     known  u---
    web/firefox/plugin/flash (extra)  9.0.125-0.101     known  u---

The version 9 packages will work ok, so we simply install one of those:

    $ pfexec pkg install "pkg://extra/web/firefox/plugin/flash@9.0.151-0.101"
    PHASE                   ITEMS
    Indexing Packages       554/554
    DOWNLOAD                PKGS  FILES  XFER (MB)
    Completed               1/1   3/3    2.46/2.46
    PHASE                   ACTIONS
    Install Phase           19/19
    Reading Existing Index  9/9
    Indexing Packages       1/1

And done.

The note to myself for 2009.06 is that that 4gb root disc is just not going to cut it anymore :) Time for something more reasonable.


Solaris

multithreaded processes and mdb

Today I had to look at a gcore of devfsadm. Most specifically I wanted to have a look at what the threads in cond_wait() were doing. I haven't done a lot with such stuff in userland before so thought it would make a good short blog topic on things that can be done.

First off we run up mdb

    # mdb /usr/sbin/devfsadm devfsadm.gcore
    Loading modules: [ libsysevent.so.1 libnvpair.so.1 libc.so.1 libavl.so.1 libuutil.so.1 ld.so.1 ]
    >

Great, we got all the modules. So, what lwps have we got?

    > $L
    lwpids 1, 2, 3, 4, 5 and 6 are in core of process 135.

So we have six threads. Let's have a look at the registers in the first one (note that this is on SPARC).

    > 1::regs
    %g0 = 0x00000000                        %l0 = 0x00000000
    %g1 = 0x0000001d                        %l1 = 0x00043748
    %g2 = 0x0003cb2c                        %l2 = 0xffbff8ac
    %g3 = 0x00038000                        %l3 = 0x00000001
    %g4 = 0x0003cb2c                        %l4 = 0x00000000
    %g5 = 0x00000000                        %l5 = 0x00000000
    %g6 = 0x00000000                        %l6 = 0x00000000
    %g7 = 0xff342a00                        %l7 = 0x00000001
    %o0 = 0xff342c40                        %i0 = 0x00000001
    %o1 = 0xff13b90c libc.so.1`pause+0x50   %i1 = 0x0003a2a4
    %o2 = 0xff1c3800 libc.so.1`_uberdata    %i2 = 0xff342a00
    %o3 = 0x00000000                        %i3 = 0x00039954
    %o4 = 0xff342a00                        %i4 = 0x00016964
    %o5 = 0x00000000                        %i5 = 0x00000000
    %o6 = 0xffbff850                        %i6 = 0xffbff8b0
    %o7 = 0xff13b914 libc.so.1`pause+0x58   %i7 = 0x00015ce4

     %psr = 0x00000044 impl=0x0 ver=0x0 icc=nzvc
                       ec=0 ef=0 pil=0 s=0 ps=64 et=0 cwp=0x4
       %y = 0x00000000
      %pc = 0xff14c160 libc.so.1`_pause+4
     %npc = 0xff14c164 libc.so.1`_pause+8
      %sp = 0xffbff850
      %fp = 0xffbff8b0
     %wim = 0x00000082
     %tbr = 0x00000000

Now to have a look at the stack we simply find the %sp value and use it with the stack dcmd.

    > 0xffbff850::stack
    0x15ce4(0, 43b48, 39db4, 4, 2276c, 38000)
    main+0x358(0, 39f2c, ffbffdec, 398e4, 1, 38000)
    _start+0x108(0, 0, 0, 0, 0, 0)

Note that this gives the stack frames above the current one and not the current one. From the value of %pc above we can see where we are executing in the current frame. You can also see that the caller does not have an entry in the symbol table. Unfortunately, on Solaris 10, devfsadm has a lot of functions and variables declared as static, which really does make debugging a pain. Fortunately this is not the case in Nevada/OpenSolaris.

Looking at the other lwps is as simple as listing the lwp id in front of the regs dcmd and repeating what we just did. I won't go into how I worked out which of the static routines we were executing in for the other lwps in cond_wait(), save to say that there are only a couple of places that make that call in the code, and matching up the assembly around the locations to the source (especially looking at called functions) makes this not too difficult.


Music

New Song - That's Just the way that it goes

Finished recording this about 2 hours ago. It's now available on Myspace and The Sixty One as a download. I've just made the 128k mp3 available under the following license:

That's Just The Way That it Goes by Alan Hargreaves is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 2.5 Australia License.

Which basically means as long as you don't want to modify it, as long as you don't want to make money out of it, and as long as you attribute it, you can download and pass it around as much as you like.

You can also listen to a copy from thesixtyone.com here.

The technical side of things.

I learned some things while recording this.

The drums I created with Rhythm Rascal (and I will register this shareware when I have some spare cash). I found this application incredibly easy to use and it produced a really nice wav file that I could import into reaper. I could have saved as midi, but I found I liked the samples that it used for the drums better than anything else I had.

The bass line I did up in the midi editor in reaper using a bass sample from the Kore midi set (free download).

Now the guitar, ... big lesson number 1. Put new strings on the guitar. I could not get a decent recording of the guitar with month old strings on.

I also couldn't get the exact sound I wanted inside reaper, so I ended up recording the guitar (and actually the vocals too) through my Digitech RP-150 with some hall reverb, bright EQ and slight compression.

The other really big lesson I learned is just how absolutely essential it is to use compression given the huge dynamic range of an acoustic guitar. That made an enormous difference.

In order to keep a decent strum going through the whole song, I recorded some incidental stuff for the guitar as well.

After all that was done it was time to lay down the vocal. I set up the Behringer C-1 at head level, put a pop filter in front of it and kicked off the recorder. Wow, what a difference it makes to sing along against a full backing. It really helped to get into the feel of the song and I was bopping along while singing. Same thing adding the harmonies to the chorus.

All in all, I'm extremely happy with how this turned out and I hope you enjoy it too.

Alan.


OpenSolaris

Daaaaaaaad, the computer isn't booting

These were the words that my 10 year old boy yelled to me on Sunday. I'm documenting this as I tried to imagine facing this as an end user, rather than as a Solaris kernel support engineer, and shuddered.

The machine he is talking about is the OpenSolaris box that I installed for them and upgraded to SRU4 on Friday (more on supported updates shortly).

The box (silver) had been sitting at the boot load screen (those of us old enough to remember the original Battlestar Galactica would refer to it as the cylon screen) with the disk light hard on and the disk rattling away threatening to send itself off into hyperspace. It had apparently been like this for a few hours (he lost interest and went to watch TV before thinking to tell me).

He'd tried resetting it and it didn't help.

"OK", I thought, "I've just upgraded the box, maybe there was a problem with SRU4, let's boot into the prior boot environment." Easy enough to do, just reset and select the prior environment in grub.

No dice. Same issue.

Failsafe boot? No, it appears that we don't have one of those.

Right, a single user boot. I want to have a look at what is going on on the console, so we need to get rid of the graphics crud at the start.

This isn't too hard. Have a look at the options in the text boot to be sure, but all I did was hit 'e' (edit) in grub, d (delete) the splash and graphics lines, e (edit) the unix line to take the ",Graphics... " stuff off the end of the command, hit Enter to go back a screen, then hit b (boot) and watch what happens.

I didn't have to wait long.

Let me give you a little more background on this machine. It really is scrounged together. The root pool consists solely of a 4gb disc removed from an ultra 10.

The root zpool was 100% full. The disc full messages scrolled for a while.

OK, once we waited for a few minutes we got the prompt asking for a login name and password to drop us to a root single user shell. OK, let's go looking for where the space issue is.

A 'zfs list' showed me that rpool/export/home was a little larger than I expected. Unfortunately, as the pool was full, I couldn't mount those. No worries, let's poke around on / to try to find something to remove to make enough space so we can mount things.

A good place to look for such space on a workstation is in /var/log, specifically the Xorg logs.

Let's remove one of those, ....

Bzzzzzt wrong.

Copy on write, .... In order to unlink a file we need to write a new block for the directory entry. Oops, no free blocks.

The trick is to lose the space without having to rewrite the directory entry. We need to truncate one of the logs.

    # : > Xorg.0.log.old

Much better. For good measure I zapped Xorg.0.log as well.

OK, that looks much better.

Let's mount rpool/export/home and have a look.

    # zfs mount rpool/export/home

Ahhh, the kids' home directories each have a largish core in them. Remove those, unmount /export/home. Now, as I mounted rpool/export/home and not rpool/export, a directory got created in /export. We need to remove that or the filesystem/local service won't start correctly (it will complain about /export having stuff in it).

Log out of that shell and the system continues on to milestone=multiuser and we're good again, and Jake is off to do his daily moves in Kingdom of Loathing and resume his Club Penguin.
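The ": > Xorg.0.log.old" trick in this post has a direct C equivalent: open the existing file with O_TRUNC and close it again. The file keeps its name and directory entry and only its contents go away. This is a sketch only; truncate_in_place() and file_size() are hypothetical helper names, and how much space comes back on a full pool is as the post describes rather than something this code can show.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Truncate an existing file to zero length, much as the shell's
 * ": > file" does. The file itself is not removed.
 * Returns 0 on success, -1 on failure.
 */
static int
truncate_in_place(const char *path)
{
	int fd = open(path, O_WRONLY | O_TRUNC);

	if (fd == -1)
		return (-1);
	return (close(fd));
}

/* Size of the file in bytes, or -1 if it cannot be stat'ed. */
static long long
file_size(const char *path)
{
	struct stat st;

	if (stat(path, &st) == -1)
		return (-1);
	return ((long long)st.st_size);
}
```

Calling truncate_in_place() on a log file and then file_size() on the same path should report zero bytes while the file is still there to be written to, which is the property the recovery relied on.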


OpenSolaris

OpenSolaris 2008.11 on the kids computer

OK, colour me impressed.

We got a hold of a "broken" computer today. After replacing the power supply and putting in an old 9gb disk (the previous owner wanted to keep the disk) I started wondering what I was going to put on it. So it occurred to me that I really should try to put OpenSolaris on it, as I think it should do most of what the kids want.

Downloaded the image from opensolaris.com and burned the cd image. Note to self, don't try to burn a cd image while running itunes, downloading a podcast and playing second life under XP. It's a good way to have a write underrun and burn a coaster.

Put it into the machine and booted it. Lovely, came straight up into the live cd. Hmmm, but no network. Dived into the device assistant and it noted I had a Via ethernet and needed the vfe driver, for which it kindly pointed me at homepage2.nifty.com/mrym3/taiyodo/eng/. A little bit of finger trouble later, I found the compiled version of the amd64 driver for it and installed it. Reading the README.txt is a good idea, as I left a step out and was wondering why I had no network. The trick was, ...

    $ rm Makefile
    $ ln -s Makefile.amd64_gcc Makefile
    $ pfexec make install
    $ pfexec ./adddrv.sh

That last step is vital.

Just to be safe I rebooted (it's late on the Monday of a long weekend and my brain is not working real great) and it came back with a network and everything looks hunky dory.

Given this is for the kids and they spend a lot of time on You Tube and Club Penguin, I needed flash installed. I did a quick bit of googling and found something that I should have known from my day job (like I said, late on the Monday of a long weekend): if I went to pkg.sun.com/register and used my Sun Online credentials, I could register for the Extras repository, and there was a package for flash in there.

Well, I did this but still had some trouble as it kept telling me that my certificate date was in the future. OK, this one I could figure out. This box did use to run windows, so the time on the machine was an hour slow because of how windows set the clock for summer time. Easily fixed:

    $ pfexec ntpdate 0.pool.ntp.org

And getting back through the screen saver that obviously came up :).

OK, now it liked the certificates and I could install flash, and successfully look at youtube after a firefox restart.

The final step was to make a new account for my 10 year old son.

Now the smoke test. I had him log in. He immediately brought firefox up with no prompting from me. A few web sites later he is happily playing on Club Penguin.

We'll see how this runs for them over the next few weeks.

I'm pretty happy with how this has gone down so far.


General

Catching up

OK, I've been on holidays so I kind of have an excuse for not blogging this time.

Before I go on I have to acknowledge a man who quickly became a good friend in Second Life, who sadly passed on December 30. Chester Capalini was the monarch in the Tiny Empires kingdom (see Dana's blog for more on Tiny Empires) that I was playing in. On the 29th (I think) he was admitted to hospital, very ill. I performed a song and dedicated it to him while performing in Second Life (Running on Faith) and 8 hours later he was gone. Chester had a great heart and lots of people miss him dearly.

I spent New Years Eve in Rockdale with some other Second Life friends who also happen to be musicians. Shan plays bass and Byron is a drummer. We had a great jam for New Years Eve. Had a message from Shan the other day that she managed to seriously jam her fingers in a door, requiring surgery. Fortunately the nerves are still there and the doctors expect her to be able to play again in about three months.

There has to be some lighter news here somewhere :) Oh yes, while this is the last work day that constitutes my holidays, it also happens to be my 44th birthday. Jake and Lucy prepared me some breakfast (a crumpet with promite, a nectarine and a chocolate milkshake) and brought it up to me along with their present. Mum and dad will be down later in the day to take us out to lunch, so there is something else to look forward to. If I can get my act together today, I might try to head out to Brackets and Jam South tonight, and if Dexter is playing at Iguana's tomorrow that might also get a look in.


Music

Christmas All Year Through

Merry Christmas!

During the last month I've participated in an amazing collaboration, the result of which is the release of a song involving 22 Second Life artists from all over the world.

Below is a URL where you can download a free MP3. The MP3 is a collaborative recording featuring 22 SL musicians all performing an original song entitled "Christmas All Year Through", having recorded their contributions remotely from all over the world. The song was written by Djai Skjellerup and the final mix was painstakingly put together by Toby Lancaster. The rest of the collaborators, to whom we are hugely grateful for their excellent contributions, are listed beneath the song name. I'll try to get a copy uploaded to my tracks shortly.

URL for downloading Christmas All Year Through: http://www.mediafire.com/file/wttmmwunktz/SLChristmasAllYearThrough.mp3
Information Website: http://slgetittogether.weebly.com/
Project Log: http://slmc.myfastforum.org/about2137.html
Collaborators: BabbleGrabble Swindlehurst, Carah Nitely, Franziskus Paine, Jean Munro, Kaklick Martin, Kiarranne Flanagan, Krell Karu, Lonnie Nightfire, Lyn Carlberg, Mambo Welles, Mimi Carpenter, OhMy Kidd, The Professor, Rich Desoto, Saraine Sands, Slim Warrior, Tommy Cult, Tpenta Vanalten, Vicki Nilsson, Zak Claxton, Toby Lancaster and Djai Skjellerup

And as an extra bonus for you, here is the last collaborative recording we did, also free for you to download, entitled Get It Together....

URL for downloading Get It Together: http://www.mediafire.com/file/wyoygtyizgd/GetItTogether.mp3
Information Website: http://slgetittogether.weebly.com/
Project Log: http://slmc.myfastforum.org/about810.html
Collaborators: Norris Shepherd, Zak Claxton, BabbleGrabble Swindlehurst, Rich Desoto, Jambalaya Fonck, Lyn Carlberg, Kim Seifert, Jean Munro, The Professor, Freestar Tammas, Mimi Carpenter, Toby Lancaster and Djai Skjellerup

Please keep coming back to http://slgetittogether.weebly.com/index.html as well. It is only in its embryonic stage at the moment but will soon be updated with biographies of all the collaborators and other information.

Thanks for your time. We'd also like to send our best wishes to those who do not observe Christmas. We hope you enjoy our efforts all the same and have a happy time over the holiday period.

My thanks to all the Get It Together collaborators for your efforts, with a special mention to Toby Lancaster, Bree Birke and Cher Harrington for your invaluable contributions.


Solaris

Even More DTrace Lab Answers (9 & 10)

OK, let's get the rest out.

Exercise 9

List the processes that are connecting to a specific port.

OK, we'll take that to mean we want to know the name and the process ID of connections to a particular port. This requires just a little network programming knowledge. We can see that the second argument (arg1) to connect(3SOCKET) is a sockaddr. The extra knowledge that we need is that to get a port, we need to cast it to a sockaddr_in and look at the sin_port structure element. This bit of code expects command line argument 1 to be a port number.

    #!/usr/sbin/dtrace -s
    #pragma D option quiet

    syscall::connect:entry
    {
        /* arg1 is the user-space sockaddr, arg2 is its length */
        this->sock = (struct sockaddr_in *)copyin(arg1, arg2);
        /* sin_port is in network byte order; on a little-endian machine */
        /* $1 has to be given byte-swapped for the comparison to match   */
        self->port = this->sock->sin_port;
    }

    syscall::connect:entry
    /self->port == $1/
    {
        printf("%5d %s\n", pid, execname);
    }

    syscall::connect:entry
    {
        self->port = 0;
    }

Exercise 10

Write a script to show where a specific system call, rename for example, is blocking and how long it is blocked for.

This one was a bit of fun to write, especially since the output format specifier for a stack is not documented and I had to find it in dt_printf.c.

We'll use the sched provider in this. While it would be nice to use the wakeup probe, it's not fired from the thread context, so we won't have the thread local variable that we need to check and use. So we just use on-cpu, which will tell us how long we were off cpu since we were told to block. Note that this also means that we are not tracking normal scheduling where the thread may have been pre-empted by another thread for whatever reason. This is what we want.

    #!/usr/sbin/dtrace -s
    #pragma D option quiet

    BEGIN { printf("Collecting... ^C to continue\n"); }

    syscall::rename:entry { self->interest = 1; }

    sched:::sleep /self->interest/ { self->blocktime = timestamp; }

    sched:::on-cpu
    /self->blocktime/
    {
        this->taken = timestamp - self->blocktime;
        @[curthread, stack()] = quantize(this->taken);
        self->blocktime = 0;
    }

    syscall::rename:return { self->interest = 0; }

    END { printa("Thread 0x%p %k %@d\n", @); }

This leaves us with only Exercise 11 to go. I'll try to get that up tomorrow.
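One thing worth sketching for Exercise 9: sin_port is stored in network byte order, so on a little-endian (x86) machine the raw value the script compares against is the port with its two bytes exchanged. The helper below is only an illustration of that arithmetic; swap_port is a made-up name and not part of the lab answers.

```shell
#!/bin/sh
# swap_port (hypothetical helper): print a 16-bit port number with its
# two bytes exchanged. This is the value a little-endian host sees when
# it reads a network-order sin_port as a plain integer.
swap_port() {
    port=$1
    echo $(( ((port & 0xff) << 8) | ((port >> 8) & 0xff) ))
}

# Port 80 is 0x0050; with the bytes exchanged it becomes 0x5000.
swap_port 80
```

On x86 you would then pass the swapped value as the script argument (assuming the Exercise 9 script was saved under some name of your choosing); on SPARC the port can be passed unchanged.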


Solaris

More DTrace Lab Answers

OK, I promised more answers so here we go, ... (sorry, where files have copyright notices, I have to leave them there).

Exercise 2

Restrict the iosnoop.d to trace a specific process.

This solution will allow you to either specify a pid by appending '-p {pid}' or to run a command by appending '-c "command args ..."' to the command line, e.g. dtrace -s iosnoop.d -p 1234

    /*
     * CDDL HEADER START
     *
     * The contents of this file are subject to the terms of the
     * Common Development and Distribution License, Version 1.0 only
     * (the "License"). You may not use this file except in compliance
     * with the License.
     *
     * You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
     * or http://www.opensolaris.org/os/licensing.
     * See the License for the specific language governing permissions
     * and limitations under the License.
     *
     * When distributing Covered Code, include this CDDL HEADER in each
     * file and include the License file at usr/src/OPENSOLARIS.LICENSE.
     * If applicable, add the following below this CDDL HEADER, with the
     * fields enclosed by brackets "[]" replaced with your own identifying
     * information: Portions Copyright [yyyy] [name of copyright owner]
     *
     * CDDL HEADER END
     */
    /*
     * Copyright 2005 Sun Microsystems, Inc. All rights reserved.
     * Use is subject to license terms.
     *
     * This D script is used as an example in the Solaris Dynamic Tracing Guide
     * wiki in the "io Provider" Chapter.
     *
     * The full text of the this chapter may be found here:
     *
     * http://wikis.sun.com/display/DTrace/io+Provider
     *
     * On machines that have DTrace installed, this script is available as
     * iosnoop.d in /usr/demo/dtrace, a directory that contains all D scripts
     * used in the Solaris Dynamic Tracing Guide. A table of the scripts and their
     * corresponding chapters may be found here:
     *
     * file:///usr/demo/dtrace/index.html
     */

    #pragma D option quiet

    BEGIN
    {
        printf("%10s %58s %2s\n", "DEVICE", "FILE", "RW");
    }

    io:::start
    /pid == $target/
    {
        printf("%10s %58s %2s\n", args[1]->dev_statname,
            args[2]->fi_pathname, args[0]->b_flags & B_READ ? "R" : "W");
    }

Exercise 3

Display the arguments for the rename(2) system call along with its return code.

This is longer than it needs to be as I like the idea of a single line of output for each rename() system call. I save the two arguments in thread local variables (self->source and self->destination). Note that the two arguments are strings and they are in user space when this probe fires, so we use copyinstr() to both get them into kernel context and make them printable strings. Note also that we clean up after ourselves in the return probe.

    #!/usr/sbin/dtrace -s
    #pragma D option quiet

    syscall::rename:entry
    {
        self->source = copyinstr(arg0);
        self->destination = copyinstr(arg1);
    }

    syscall::rename:return
    {
        printf("%s: %s -> %s returned %d\n", probefunc, self->source,
            self->destination, arg1);
        self->source = self->destination = 0;
    }

Exercise 4

Display the real and sys times (see timex(1)) for the syscall.

I'm not completely happy with this as I saw some anomalous real time values running this on my notebook, but it should be about right. Note that we record walltimestamp (wall-clock time, which DTrace reports in nanoseconds since the epoch, not seconds) and vtimestamp (time on cpu in nanoseconds) and simply compare them in the return probe.

    #!/usr/sbin/dtrace -s
    #pragma D option quiet

    syscall:::entry
    {
        self->seconds = walltimestamp;
        self->sys = vtimestamp;
    }

    syscall:::return
    /self->sys/
    {
        /* both differences are in nanoseconds */
        printf("%s: %dns real, %dns system\n", probefunc,
            walltimestamp - self->seconds, vtimestamp - self->sys);
        self->seconds = self->sys = 0;
    }

Exercise 5

Show the kernel function name that triggers the io:::start probe.

Many folks did this using stack(). There is a better way. This makes use of the fact that in the io provider, the probe* values are defined.

    $ pfexec dtrace -qn 'io:::start {printf("iostart:::probefunc called from %s\n",probefunc);}'
    iostart:::probefunc called from bdev_strategy
    ...

You could also simply use the default action without -q, as that will print the probe specification each time it fires, which will include the probefunc.

Exercise 6

Show the flow of kernel functions for a write system call.

Most folks put a -F in the arguments on line 1. The clearer (and more correct) way to do this in a script is to use the flowindent pragma.

    #!/usr/sbin/dtrace -s
    #pragma D option flowindent

    syscall::write:entry { self->interest = 1; }
    fbt::: /self->interest/ {}
    syscall::write:return { self->interest = 0; }

Exercise 7

Show all lock events for mutexes that occur during two seconds.

The question hints that we may want to use an aggregation and summary rather than printing them as they happen.

    #!/usr/sbin/dtrace -s
    #pragma D option quiet

    lockstat:::adaptive* { @[probename] = count(); }
    tick-2s { exit(0); }

Exercise 8

Find the file to which most IO is being done.

I'm going to assume that the question means real IO, not IOs that hit the cache. In which case, we can use the answer to exercise 2 as a hint. Unfortunately, pretty much all of the IO that this tracks is being done by sched to flush buffers and we don't have any idea of the filename. I'll try to come up with a better solution.

    #!/usr/sbin/dtrace -s
    #pragma D option quiet

    BEGIN { printf("Collecting data ... ^C to finish.\n"); }
    io:::start { @[args[2]->fi_pathname] = count(); }
    END { trunc(@,1); printa(@); }

That's it for the moment, I'll finish them off later.


Solaris

DTrace Lab in the performance track at CEC

We have just been helping out in the DTrace lab for the performance track at CEC. On having a look at exercise 1, we suggested to the attendees that they attempt the other exercises and come back to this one. The exercise was:

Enhance /usr/demo/dtrace/iosnoop to be able to detect reads that are satisfied by the filesystem cache (UFS: pagecache).

We then went off to write our own versions of an answer to the problem.

I started from scratch and (referring back to a previous blog where I did some digging into this code) came up with the following:

    #!/usr/sbin/dtrace -s
    #pragma D option quiet

    BEGIN { printf("Collecting ...\n"); }

    syscall::read*:entry
    {
        self->interest = 1;
        self->file = fds[arg0].fi_name;
        self->phys = 0;
    }

    /* we only increment this kstat if we have to go to disk, from a filesystem */
    sysinfo:::bread
    /self->interest/
    {
        self->phys = 1;
    }

    syscall::read*:return
    /self->interest/
    {
        this->str = self->phys ? "Physical reads" : "Cache Reads";
        @[this->str, self->file] = count();
        self->interest = 0;
    }

    END { printa("%@6d %s %s\n", @); }

On running this you get output like:

    ...
       209 Cache Reads jaxrpc-impl.jar
       211 Cache Reads appserv-admin.jar
       311 Cache Reads xalan.jar
       354 Cache Reads ttysrch
       372 Cache Reads jaxr-impl.jar
       373 Cache Reads auxv
       500 Cache Reads appserv-assemblytool.jar
      1216 Cache Reads
      1324 Cache Reads appserv-rt.jar
      1392 Cache Reads clone@0:ptm
     83181 Cache Reads psinfo

Of course by the time I finished writing this, pretty much everything in the only ufs filesystem on the box had been cached :)


General

CEC '08 Saturday/Sunday

OK, I've been slack. Not only have I not blogged since September, I've been at CEC2008 since Saturday and also haven't written anything.

The trip out

Caught a shuttle to the airport, grabbed some lunch after clearing customs and caught a United flight to Los Angeles. Unfortunately I was stuck in a window seat at the back of the aircraft for the 13 hours in the air, which was less than comfortable, sigh.

As I was near the back of the aircraft I was one of the last off and at the rear of the line to clear US Immigration in Los Angeles. It was great to see the officers actually smiling and laughing with the passengers. Picked up my luggage, cleared customs and rechecked it. Walked out to terminal 7 to get the flight to Las Vegas. Cleared security ok, but unfortunately one of my colleagues was behind someone whose bags they decided to check thoroughly, and as a result he missed the flight and had to catch another.

I arrived in Las Vegas and went looking for my checked luggage, ...

This is the first time that this has happened to me. Apparently my luggage was still in LA, and might arrive on the next flight. As it turns out it wasn't on the next flight; they got it to the Casino at about 11:30pm, after I was asleep, so I got it the next morning.

Met up with Chris Gerhard on the bus and headed out to the Paris Casino. Quite a few of us joined up for a quiet drink in Napoleon's Piano Bar.

Sunday

Didn't wake up until around 10:30am and then spent the morning putting finishing touches on my presentation.

I noticed that the certification room was open, so I decided to give the Solaris 10 Security Administrator exam a go. This exam has a pass mark of 52%, and the course notes and trial exams did not give me a lot of hope for getting through first go.

Imagine my surprise on scoring 71%! Wonderful, I now possess the Admin, Network and Security Certifications!

Attended the Welcome reception and hit the sack again. More later.


OpenSolaris

Installing OpenSolaris 2008.05 under Virtual Box and Upgrading it

I'm in the process of preparing a Transfer of Information (TOI) for Support Services in Asia Pacific on some of the basic differences and things to watch out for in providing Support for OpenSolaris (as against Support for Solaris 10, which they are already doing). One of the things that I am asking folks to do is to have already installed and upgraded from the 2008.05 distribution. I went hunting for documentation on this and the closest that I could find was an email trail, and there were corrections to that process further down the trail. As a result I rewrote the instructions after doing a lot of running through them to make sure I got it right.

I noticed someone in #opensolaris today having trouble; he had missed one of the listed steps, which he didn't see in the release notes. As a result I was asked to blog this stuff, so here we go.

As we don't really have an easy way to do the equivalent of a jumpstart from this image in our labs, I'm recommending folks try it with Virtual Box.

There is excellent documentation on installing both Virtual Box and OpenSolaris available at http://dlc.sun.com/osol/docs/content/IPS/virtualbox.html. This includes information on downloading the packages and the ISO image for OpenSolaris 2008.05. Unfortunately it does not discuss installing Virtual Box on Solaris. Fortunately, this is not too hard. Simply download the appropriate package and install it with pkgadd(1M).

Once you have the distribution installed, you will need to upgrade it to the currently available system. The steps below walk you through exactly what is required, using the live-upgrade replacement and taking advantage of the system using ZFS root.

General instructions on updating to the latest OpenSolaris development build from the 2008.05 ISO Image

Before using the "image-update" subcommand, it is recommended that the latest available version of the IPS software be installed for your current boot environment

    $ BUILD=`uname -v | sed s/snv_//`
    $ pfexec pkg refresh
    $ pfexec pkg install SUNWipkg@0.5.11-0.$BUILD
    $ pfexec pkg install entire@0.5.11-0.$BUILD

One thing that I noticed when walking through these instructions was that NWAM had not modified /etc/nsswitch.conf to include dns on the line for hosts. If you are unable to resolve pkg.opensolaris.org, you can fix this by editing the file with "pfexec vi /etc/nsswitch.conf".

Before proceeding to the next step, verify your OpenSolaris build number

    $ echo $BUILD

If you are running build 93 or greater, you can use "pfexec pkg image-update" and skip the remainder of this page. In this case we are installing from the 2008.05 image, which is based on build 86.

As you are using a build prior to 93, it is recommended one apply the update directly to an alternate boot environment. First, display the list of the existing BEs on the system

    $ beadm list
    BE            Active Active on Mountpoint Space
    Name                 reboot               Used
    ----          ------ --------- ---------- -----
    opensolaris   no     no        -          33.92M
    opensolaris-1 yes    yes       -          17.06M

Next, choose the name of a new BE. If the most recently created BE is of the form "opensolaris-N" where N is an integer, then a suitable choice for the new BE is "opensolaris-{N+1}". In the above example, the new BE would be "opensolaris-2".

Then, execute the following sequence to create, mount, and update the new BE

    $ pfexec beadm create opensolaris-2
    $ pfexec beadm mount opensolaris-2 /mnt
    $ pfexec pkg -R /mnt image-update

If you are running build 86 (which you will be if you used the 2008.05 image), one additional work-around is required.

********** IMPORTANT **********

Due to changes in the GRUB boot system, one must manually update the Master Boot Record (MBR) to include these latest changes. Failure to follow these instructions when updating from 2008.05 (build 86) to a later build will result in a system that does not boot by default and instead the original boot environment must be manually selected.

Update the GRUB configuration on your ZFS boot device(s) using

    $ pfexec /mnt/boot/solaris/bin/update_grub -R /mnt

Unmount the boot environment you just updated and activate it.

    $ pfexec beadm unmount opensolaris-2
    $ pfexec beadm activate opensolaris-2

When you're ready to boot into the updated boot environment, you can reboot(1M) or init(1M) as usual.

Technorati Tags: OpenSolaris
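The "opensolaris-{N+1}" naming rule above can be sketched as a small shell function. This is only an illustration: next_be_name is a made-up helper, it assumes the BE names have already been pulled out of the first column of the beadm list output, and it treats a bare "opensolaris" as N=0.

```shell
#!/bin/sh
# next_be_name (hypothetical helper): read BE names, one per line, on
# stdin and print the next name in the "opensolaris-N" series.
next_be_name() {
    max=0
    while read -r name; do
        case "$name" in
            opensolaris-*[!0-9]*) ;;    # suffix is not a plain integer: ignore
            opensolaris-?*)
                n=${name#opensolaris-}
                if [ "$n" -gt "$max" ]; then
                    max=$n
                fi
                ;;
        esac
    done
    echo "opensolaris-$((max + 1))"
}

# The two BEs from the listing above.
printf 'opensolaris\nopensolaris-1\n' | next_be_name
```

The printed name is what you would hand to "pfexec beadm create". Names with leading zeros in the suffix are not handled by this sketch.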


Music

Jammin for Genes

At 4pm US/Pacific (now), we are kicking off a Jammin for Genes event in support of Jeans for Genes, and I am honoured to be a part of it.

From the Jammin website, ...

Jamm for Genes is a live music event that takes place on the weekend of Jeans for Genes Day (the first weekend in August). In 2004 Jamm for Genes launched a very successful maiden voyage that saw bands Thirsty Merc, Machine Gun Fellatio, Taxiride, Gus & Frank and Dave McCormack donating their time and skills to perform at two separate locations. It was fun, it was loud and it was raising funds for the Jeans for Genes campaign. Cut to 2006, it was no wonder that over 70 live music venues and 160 bands put their hands up to get involved when the event was run for the second time nationwide. 2007 saw beyond the support of the previous years, with ambassadors Adam Harvey, Beccy Cole, Jade McCrae, Jon Stevens, The McClymonts, Glenn Shorrock and Courtney Act all jumping on board and doing their best to help out.

How to hear it

Got Second Life? Go to Sailors Cove Theatre http://slurl.com/secondlife/Sailors%20Cove/243/140/24
To get Second Life go to http://www.secondlife.com

Just wanna listen and donate?

Go to www.jammforgenes.org.au to donate, and listen to the stream on one of the following URLs (depending on your preferred player)

Winamp: http://sh1.audio-stream.com/tunein.php/testacct2/playlist.pls
Real Audio: http://sh1.audio-stream.com/tunein.php/testacct2/playlist.ram
Windows Media: http://sh1.audio-stream.com/tunein.php/testacct2/playlist.asx
Itunes/QuickTime: http://sh1.audio-stream.com/tunein.php/testacct2/playlist.qtl

So who is playing

Note that these times are US/Pacific, which is what Second Life time is based on. It's 10am Saturday morning here on the east coast of Australia.

Aug 1
4PM warmup by Jonas Lunasea
5PM Tpenta Vanalten
6PM Wread Writer
7PM Dexter Ihnen
8PM Artel Brando
9PM Ande Foggerty
10PM Pato Milo
11PM Army of Ignorance

Aug 2
12AM Jaggpro Mcann
1AM OhmyKidd
2AM Lacey Lohner
3AM Jackdog Snook
4AM Paisley Beebe/Freddy Halderman
5AM Phoe Nix
6AM Midnight in Canberra
7AM Jonas Lunasea
8AM Winston Akland
9AM Robie Bloch
10AM Luigi DiPrima
11AM Mason Thorne
12PM Raspbury Rearwin
1PM Noma Falta
2PM Freetar Tammas
3PM Cylindrian Rutabaga
4PM Ohmy Kidd / fireworks spectacular

Come along, enjoy the music and, if you can, give for a worthy charity.


General

China Earthquake: Oh my god!

I'm supposed to be putting the finishing touches on another customer presentation this morning (in the light of one I gave yesterday). I simply had to stop doing that and get my thoughts down as I was finding it hard to focus.

Yesterday I made a comment on a colleague's blog about the earthquake, as I am also travelling in the region. I noted that I was giving a presentation to a customer at the time and actually didn't notice. I had it pointed out to me after I finished that we had had a tremor or a 'quake.

I got back to one of the offices in Beijing that afternoon and had an Australian colleague in a chat session point me at an article in an Australian newspaper about the incident, mentioning a loss of life of about a hundred. This in itself was incredibly sobering, as any such loss of life is tragic.

This morning I woke up and flipped on BBC World and was utterly gobsmacked to hear 10,000 dead!

I find myself at a loss to describe my feelings. On one hand I am incredibly grateful for my own safety, but 10,000 people?

Oh my god!

This is beyond tragedy.

The loss of human life on this scale is beyond comprehension.

The China Daily lists the numbers lost in various areas. One in particular leaps out at me. In comparison to some of the other areas the numbers are small, but how can the following not tug on your heart?

Dujiangyan: Over 50 dead in a middle school. Many more are buried beneath rubble.

I almost dread going into the office today as there are certain to be people who either know that they have lost family and friends, or perhaps worse, don't know whether or not they have. My heart goes out to all of these wonderful people who have made me feel so welcome here.

I wish I knew what more to say.

Update #1

I just called my manager in Sydney to let him know that I was fine. He told me that the Australian news services are reporting on 900 kids in a collapsed school.

I am fearful that the news is only going to get worse!


General

Jonathan on closed MySQL extensions

I have just been reading some questions and answers with Jonathan from Tim O'Reilly. One that jumped out at me was a question he passed on from Jesse Stay. I'm going to quote both the question and answer below in full. The added emphasis is mine.

JesseStay: does he anticipate a fallout of original MySQL users or fork in the mysql code and how will they handle that if it does happen? 2008-04-25 12:26:30

JonathanSchwartz: I'm not anticipating a fork - Marten Mickos (SVP, Database Group at Sun, former CEO, MySQL) made some comments saying he was considering making available certain MySQL add-ons to MySQL Enterprise subscribers only - and as I said on stage, leaders at Sun have the autonomy to do what they think is right to maximize their business value - so long as they remember their responsibility to the corporation and all of its communities (from shareholders to developers). Not just their silo.

I think Marten got some fairly direct and immediate feedback saying the idea was a bad one - and we have no plans whatever of "hiding the ball," of keeping any technology from the community. Everything Sun delivers will be freely available, via a free and open license (either GPL, LGPL or Mozilla/CDDL), to the community.

Everything.

No exception.

I think that puts things pretty much into black and white. I wonder if we will see some egg-on-face retractions from those who tried to pin keeping some bits proprietary on the Sun purchase, as it looks like the opposite is actually the case. That is, the Sun purchase is what is going to ensure that these extensions are open.

You know, I'm not holding my breath for any "Oops, I got it wrong" type comments.

Technorati Tags: MySQL, Sun


General

Acer wins back a customer

I guess many of you saw my rant about the poor support I was getting from Acer over getting my Ferrari 4005 fixed.

I really should have written this up earlier, as it has been resolved now for a bit over a month.

After speaking with the Escalation folks again after still having no joy, I was offered a new machine with the following specs.

Travelmate 7720G
17" screen
T5700 Core 2 duo running at 2.2GHz
2gb Memory
2x160gb disks
webcam, more interesting looking audio, 4 usb ports, ...

A week later I had the local repairer offering me a lower spec'd machine. After I explained what I had already been offered, they agreed, and we also managed to have the 3 year warranty replaced, and the internal disks were now two 250gb disks!

I've been using it for about a month now and I quite like it. Nevada simply just installed and ran. I also selected the XP option as I really don't think Vista is ready yet. I've actually been using it to perform live into Second Life (it does have relatively nice audio and in fact it also has a line level input on the front of the machine).

While I am pleased that they replaced the machine and I am very happy with it, the fact remains that I should not have had the poor support experience in the first place.

The two things that really stand out were

the complete lack of correct expectation setting (especially in the light of my having made it obvious what my expectations were).
making promises that they had no intention of keeping. That is, I did not receive a single one of the promised call backs.

Folks, these are Support 101 basics and really need to be fixed.

I will, however, say a big thank you to Acer for the actions that they did end up taking to address the issue.

As an aside, a little investigation of my own showed that the issue I had with the Ferrari was apparently rather common in Ferraris of that model and age, which could explain the difficulty in sourcing a motherboard. The original problem was the video adaptor dying in such a way that a hardware notification of an event was never received, leaving the cpu spinning on a lock and ending in a BSOD.


OpenSolaris

PSARC 2008/008 DTrace Provider for Bourne Shell

I finally got to submit the fast track for the shell provider. I've already had one comment (from Darren Reed) that I have incorporated as it made very good sense. He suggested that if we are tracking variable assignments, we should also track unset. At this point I realised that a better name for the probes would be variable-set and variable-unset. I have a working copy for SPARC with these changes now.Below is the prefix text and the revised specification.I am sponsoring the following fast track for myself. I am doing thebourne shell first for two primary reasons.1. It is the "simplest" of the shells and thus should provide the minimum set of probes to implement for future work in other shells,2. Providing probes into /bin/sh gives us observability of approximately 60% of all of the scripts on ON.Additionally, as it has been around for a very long time there arequite a lot of user written scripts for it, many very badly written.I would expect future fast tracks for other shells (eg ksh88, ksh93,zsh, bash, ...) to reference this fast track for the minimum set ofprobes.Note the probes are currently listed as Uncommitted. As the probesgain use I would hope to log a future fast track to increase thisstability.A Minor release binding is initially requested. 
Again, once thingssettle down and the interfaces stabilise it is expected that a futurecase may request a patch binding.The sh provider makes available probes that can be used to observe thebehaviour of bourne shell scripts.Probes------The sh provider makes available the following probes as exports:builtin-entry Fires on entry to a shell builtin command.builtin-return Fires on return from a shell builtin command.command-entry Fires when the shell execs an external command.command-return Fires on return from an external command.function-entry Fires on entry into a shell function.function-return Pires on return from a shell function.line Fires before commands on a particular line of code areexecuted.subshell-entry Fires when the shell forks a subshell.subshell-return Fires on return from a forked subshell.script-start Fires before any commands in a script are executed.script-done Fires on script exit.variable-setFires on assignment to a variable.variable-unsetFires when a variable is unset.The use of non-empty module or function names in a sh\* probe isundefined at this time.Arguments---------builtin-entry,command-entry,function-entrychar \*args[0]Script Namechar \*args[1]Builtin/Command/Function Nameintargs[2]Line Numberintargs[3]# Argumentschar \*\*args[4]Pointer to argument listbuiltin-return,command-return,function-returnchar \*args[0]Script Namechar \*args[1]Builtin/Command/Function Nameintargs[2]Return Valuesubshell-entrychar \*args[0]Script Namepid_targs[1]Forked Process IDsubshell-returnchar \*args[0]Script Nameintargs[1]Return Valuelinechar \*args[0]Script Nameintargs[1]Line Numberscript-startchar \*args[0]Script Namescript-donechar \*args[0]Script Nameintargs[1]Exit Valuevariable-setchar \*args[0]Script Namechar \*args[1]Variable Namechar \*args[2]Valuevariable-unsetchar \*args[0] Script Namechar \*args[1]Variable NameExamples--------1. 
Catching a variable assignment

Say we want to determine which line in the following script has an assignment to WatchedVar:

    #!/bin/sh
    # starting script
    WatchedVar=Value
    unset WatchedVar
    # ending script

We could use the following script:

    #!/usr/sbin/dtrace -s
    #pragma D option quiet

    sh$target:::line
    {
        self->line = arg1;
    }

    sh$target:::variable-set
    /copyinstr(arg1) == "WatchedVar"/
    {
        printf("%d: %s=%s\n", self->line, copyinstr(arg1), copyinstr(arg2))
    }

    sh$target:::variable-unset
    /copyinstr(arg1) == "WatchedVar"/
    {
        printf("%d: unset %s\n", self->line, copyinstr(arg1));
    }

    $ ./watch.d -c ./var.sh
    4: WatchedVar=Value
    5: unset WatchedVar

2. Watching the time spent in functions

    #!/usr/sbin/dtrace -s
    #pragma D option quiet

    sh$target:::function-entry
    {
        self->start = vtimestamp
    }

    sh$target:::function-return
    {
        @[copyinstr(arg1)] = quantize(vtimestamp - self->start)
    }

The same approach works for the other entry/return probes, with the exception of subshell, as the probe name is unavailable.

3. Wasted time using external functions instead of builtins

This script is copied from the DTrace toolkit. Its function and how it works should be relatively self-explanatory.

    #!/usr/sbin/dtrace -Zs
    /*
     * sh_wasted.d - measure Bourne shell elapsed times for "wasted" commands.
     *               Written for the sh DTrace provider.
     *
     * $Id: sh_wasted.d 25 2007-09-12 09:51:58Z brendan $
     *
     * USAGE:       sh_wasted.d { -p PID | -c cmd }    # hit Ctrl-C to end
     *
     * This script measures "wasted" commands - those which are called externally
     * but are in fact builtins to the shell. Ever seen a script which calls
     * /usr/bin/echo needlessly? This script measures that cost.
     *
     * FIELDS:
     *          FILE    Filename of the shell or shellscript
     *          NAME    Name of call
     *          TIME    Total elapsed time for calls (us)
     *
     * IDEA: Mike Shapiro
     *
     * Filename and call names are printed if available.
     *
     * COPYRIGHT: Copyright (c) 2007 Brendan Gregg.
     *
     * CDDL HEADER START
     *
     * The contents of this file are subject to the terms of the
     * Common Development and Distribution License, Version 1.0 only
     * (the "License"). You may not use this file except in compliance
     * with the License.
     *
     * You can obtain a copy of the license at Docs/cddl1.txt
     * or http://www.opensolaris.org/os/licensing.
     * See the License for the specific language governing permissions
     * and limitations under the License.
     *
     * CDDL HEADER END
     *
     * 09-Sep-2007  Brendan Gregg   Created this.
     */

    #pragma D option quiet

    dtrace:::BEGIN
    {
        isbuiltin["echo"] = 1;
        isbuiltin["test"] = 1;
        /* add builtins here */

        printf("Tracing... Hit Ctrl-C to end.\n");
        self->start = timestamp;
    }

    sh$target:::command-entry
    {
        self->command = timestamp;
    }

    sh$target:::command-return
    {
        this->elapsed = timestamp - self->command;
        this->path = copyinstr(arg1);
        this->cmd = basename(this->path);
    }

    sh$target:::command-return
    /self->command && !isbuiltin[this->cmd]/
    {
        @types_cmd[basename(copyinstr(arg0)), this->path] = sum(this->elapsed);
        self->command = 0;
    }

    sh$target:::command-return
    /self->command/
    {
        @types_wasted[basename(copyinstr(arg0)), this->path] = sum(this->elapsed);
        self->command = 0;
    }

    proc:::exit
    /pid == $target/
    {
        exit(0);
    }

    dtrace:::END
    {
        this->elapsed = (timestamp - self->start) / 1000;
        printf("Script duration: %d us\n", this->elapsed);

        normalize(@types_cmd, 1000);
        printf("\nExternal command elapsed times,\n");
        printf(" %-30s %-22s %8s\n", "FILE", "NAME", "TIME(us)");
        printa(" %-30s %-22s %@8d\n", @types_cmd);

        normalize(@types_wasted, 1000);
        printf("\nWasted command elapsed times,\n");
        printf(" %-30s %-22s %8s\n", "FILE", "NAME", "TIME(us)");
        printa(" %-30s %-22s %@8d\n", @types_wasted);
    }

Stability
---------

    Element      Name Class     Data Class
    ------------------------------------------
    Provider     Uncommitted    Uncommitted
    Module       Private        Private
    Function     Private        Private
    Name         Uncommitted    Uncommitted
    Arguments    Uncommitted    Uncommitted
    ------------------------------------------

Technorati Tags: Solaris, OpenSolaris, DTrace
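The idea behind sh_wasted.d, an external call to something the shell already provides as a builtin, can also be checked statically. What follows is only a rough sketch and not part of the DTrace Toolkit: is_wasted is a made-up helper name, and it assumes a POSIX shell with "command -v" and "${var##pattern}" expansion, which the legacy Solaris /bin/sh may not provide.

```shell
#!/bin/sh
# is_wasted: hypothetical helper, not part of the DTrace Toolkit.
# Succeeds when a command invoked by path has a same-named shell builtin.
is_wasted() {
    cmd_path=$1
    case $cmd_path in
        */*) ;;            # invoked by path: the shell must fork and exec
        *)   return 1 ;;   # bare name: any builtin is already being used
    esac
    name=${cmd_path##*/}   # basename, computed without forking
    # "command -v" prints a bare name for a builtin and an absolute
    # path for an external program.
    resolved=$(command -v "$name" 2>/dev/null) || return 1
    case $resolved in
        /*) return 1 ;;    # only available as an external program
        *)  return 0 ;;    # a builtin (or function/alias) of that name exists
    esac
}

# Classify a few sample invocations.
for c in /usr/bin/echo /usr/bin/test /bin/sh echo; do
    if is_wasted "$c"; then
        echo "$c: wasted, builtin available"
    else
        echo "$c: ok"
    fi
done
```

Unlike the D script, this says nothing about how much time is actually lost; it only flags candidates. A function or alias of the same name would also be reported, so treat the result as a hint.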


General

Customer Service (?) from Acer Support

Updated twice (see end of entry)

After having my Dell notebook die outside of warranty two years ago, I made sure to buy the extended warranty when I replaced it with an Acer Ferrari 4005 in November/December 2005.

About a month ago (around two years after purchase, interestingly, just like the Dell), I started having problems with it. I noticed a couple of cracks on the top of the screen and that two keys were becoming unreliable (they work about 50% of the time). Shortly after that I started seeing Windows regularly crash and Solaris hang. A little investigation showed that the video card (on the main board) was starting to play up.

So, obviously I arranged to have it sent back under warranty.

This is where the tragedy of errors begins.

I got the tracking number when DHL picked it up from Gordon at about 9:30.

It looks like the driver drove all day before dropping it into his depot at 8:00pm that evening (end of shift?). It then looks like it went out of the depot for an hour at about 3am before being returned at 4am, and it was finally delivered to the repair center in Flemington at about 9:15am the next morning. Note that the repair center is about an hour's drive away, if that.

OK, I was called a few days later to be told that the screen would not be replaced under warranty, but if I was willing to pay for it they'd replace it. I declined. My belief at this point was that the main board had been done and it would be shipped back to me shortly, and everything I said communicated this expectation. Nothing was done to correct it.

The following week I had a call stating that they wouldn't be replacing the keyboard as there had been a "liquid spill" on both the keyboard and the disk. I reached over my desk and picked up the disk (which I kept as I did not wish it reformatted) and said "hmmm interesting, I have the disk in my hand and see no such evidence. If there is evidence of a liquid spill on the disk currently installed, it must have happened there", which of course prompted denials. The upshot was that they also were not going to replace the keyboard unless I paid for it. Given I can clean a keyboard myself, I declined. Again the expectation thing; in fact I also changed the address for them to return it to, as I would be working from home the next few days. No correcting of that expectation.

At this point they had had it for a week.

Come Friday, I was concerned that I still hadn't got it. So, yet another call to the support line (and the interminable wait being constantly informed of my position in the queue). At this point I discovered (for the first time) that the main board was actually on back-order, and everyone that I had previously spoken to had been aware of this and not passed it on. They would not tell me the expected date of delivery.

Monday I tried again after hours and got probably my first good experience with Acer Customer Service. The guy was very helpful and understanding and did actually tell me that my part was expected on November 29.

I called during the day the next day to speak with some in-hours person about my disappointment in the way that I had been misled, and got the usual platitudes.

OK, come Thursday (Nov 29), I called a bit later in the day to verify that work had either been completed or had at least started. The back-order had slipped to the next day.

Let's try again, Monday December 3. It had slipped to December 5 (the following Wednesday).

At this point I demanded that, if it slipped again, I would be called as soon as that information was known, and I received that commitment.

I was on training Tuesday & Wednesday so didn't get to call them until after hours on Wednesday. Again, I got another person who really tried their best (hmm, why do I have better experiences with their after hours staff?). Unfortunately the case ticket had not been updated, but they offered to email the technician who was doing the repair so that they would call me first thing this morning.

You guessed it, it didn't happen.

I called up a few minutes ago (and waited on a queue that started at length 20). Gave my case number and identified myself. I was then told that the part had not arrived and that there was an outstanding query to their supplier about when it would be delivered.

I was livid. Especially at the "I understand why you are upset" platitudes.

Apart from not being called this morning as per the commitment from last night, the delivery had slipped again and no-one thought to get in touch with me.

Acer Support is not a small company, but this kind of behavior makes them look decidedly mickey mouse.

I have demanded to be told the instant that they know the new delivery date, and received yet another commitment to be called this afternoon with the information.

It goes to their management if they drop the ball again; I've had it. I've been without my notebook now for more than three weeks. Maybe I should send them a bill for the time I've wasted on the phone trying to sort this out. That should come close to replacing the screen!

One thing is for sure: much as I liked the machine and the good performance it has given me, I will not be replacing this machine with another from Acer when the time comes.

A suggestion to the folks on the phones at Acer support. Your customers are your reason for existence. Without them, you would not have a job. When you make a commitment, you honor it. A supplier's support division is one of the main reasons that they get return custom. Here in the Sun Support office in Gordon, we used to have a poster up that simply said "It can take years to win a customer, and seconds to lose them". Truer words were never uttered. If the first people that I had spoken to had set my expectations correctly from the outset, I would not be as angry as I currently am. A simple "Fixing your video card requires a main board replacement. We've had to order this in and they normally take 3-4 weeks" would have done this.

If I had treated a customer with the obvious contempt with which I feel I have been treated by Acer Support, I would expect and deserve a serious ass-kicking from my management.

Update #1

Well, it's just gone 5pm here and I have not received the promised call from Acer. Tomorrow we start talking to call center management. Sigh, I wish it hadn't come to this. Folks, you don't promise a customer something only to get them off the phone so you can forget about it.

Update #2

Just got off a phone call with their escalation department, speaking with someone called Frank. Unfortunately the phone system somewhere between us was playing up. I noticed while in the queue that I had extended periods of silence, and it looks like one of those occurred while I was speaking with this person.

He agrees that this has taken a long time and told me that there is still no ship date on the back-ordered main board. While I am skeptical about this, he has committed to having the repair folks actually see if the board can be repaired, as apparently this is one of the things that the Highpoint folks in Flemington do. I am to expect to be called by them early next week.

We shall see.

Just before we got cut off, I was pointing out that there appear to me to be call-centre folks in the support centre who will commit to almost anything to get an upset customer off the phone and then ignore it. Unfortunately I did not get to hear his response to this, as that's when the line went silent again and then was disconnected a minute later.


Music

Making a Stomp Box

I posted this over on my myspace blog, but I'm sufficiently pleased with myself that I thought I'd post it here too.

Well, I've just finished the main work in making a new stomp box.

I did a google search and came up with the following instructions:

2 x 350mm x 70mm (x20mm) hardwood
2 x 310mm x 70mm (x20mm) hardwood
(Gives you a total of 350mm x 350mm square, 70mm high.)

For the top: 350mm x 350mm 3ply (make sure all ply sheets are same thickness, first one they gave me was actually thinner because the top and bottom sheets of ply were as thin as a card).

Then i just used a round file to make a small circular groove for the mic lead to sit because if the stompbox isnt flush with the ground all the way around...ahh... block your ears! FEEDBACK.

Well, the best I could do for the sides was 66mm x 19mm meranti. I had Bunnings cut up the pieces to the above dimensions for me, and picked up some nails and glue. Unfortunately with Bunnings, you pay for the whole bit of wood they cut up. The meranti was 1.8m long (so not really a lot left over there), but the 3-ply was actually a 450x900mm cut. All up, it cost me about AUS$20.

I only screwed up on one thing with it: I put the 350mm x 350mm piece on upside down, so now I have a nice hole to fill before I stain and varnish it. I also still have to make the cable hole for the mic, but I'm pretty happy with it.


OpenSolaris

Sun Developer Days

OK, I got back from CEC on Saturday a week back and walked into the house at about 9:30 absolutely knackered. About 2pm my pager went off and I discovered that I was on VOSJEC duty that weekend, and ended up with a right horror of a call that lasted the rest of the weekend (that I won't go into detail here with, save to say that I got an action plan out to these guys at about 00:30 on Monday morning).

Early Monday morning (ok I did get some sleep, this is real morning, about 10-11), I got a call from Laurie Wong. Apparently the DTrace speaker they had organised for the Developer Days couldn't make it and they really couldn't find anyone else. After some discussion with my boss, we agreed that I would fly to Melbourne the next day to cover this and also cover Sydney on Wednesday.

Had an awful time actually using the system that we are supposed to use to book the flight; it ended up taking me a bit over an hour, and by that time the fare had risen 50%!!! Anyway, I got that all sorted, and boy am I glad that I booked to get myself there well ahead of when I spoke.

First off, I was using someone else's slides, so of course I had to work out what I was going to say to each one (I use flash cards to remind me of what I want to talk about so I'm not just reading the slides). Going through the slides I noticed that the information on the JavaScript provider was actually out of date. Indeed, you can actually download a Firefox 3.0 alpha that has the new provider in there and looks pretty damned spiffy. So, I updated that stuff, then I discovered that there were actually two sections of the talk not present in the slides. These were the "tie it all together" bit and the summary. Well, I didn't have the time to write a "tie it all together" bit, so I removed that from the agenda slide and did up an "in conclusion" slide.

The other part of being glad I booked an earlier flight is that even though we boarded close to time, we were about an hour late getting off the runway! I got in to Melbourne at about midday. Fortunately we were able to put another speaker in front of me so I could finish writing the talk, which I ended up giving at 4pm.

Anyway, the talk covered some background on DTrace (and the slide author provided some really nice graphics and animations), and discussion of various providers. In particular I talked about PHP, JavaScript, and PostgreSQL. I did demos for some of the basic DTrace, JavaScript and PostgreSQL.

I also touched on the shell provider I'm working on and encouraged folks to get involved with working on and testing new providers.

Amazingly, without having timed this or even thought about the length, I managed to finish exactly on time.

Laurie took me into the QANTAS lounge where we were able to relax a little before the flight home. With the flight and the train trip I got home about midnight.

The next day was in Sydney, so I only needed to take the train into the city.

After finding the venue (google maps on a treo 750 is really useful!), I sat in on a couple of the other talks and quite enjoyed those. In Sydney my talk was at 3:15 and again went pretty well.

Headed home after being treated to a really nice dinner at Doyle's on the Harbour.

Unfortunately I had a prior commitment on Friday so I couldn't give the talk in Canberra.

These were probably the largest audiences that I have ever presented to (combining both talks I spoke to about 580 people). I actually enjoyed it and I think my audience had a bit of fun as well. It's nice to do this kind of thing every so often.

Technorati Tags: Solaris, OpenSolaris, DTrace


General

Tuesday at CEC

I had planned for a 5:30am start to get some online stuff out of the way before breakfast. My alarm had other ideas, not waking me up until 6:30am. I lay in bed for a bit thinking my room mate was showering etc, only realizing about 20 minutes later that it wasn't him and he had wandered off somewhere. Got down to breakfast about 7:15 and came across someone else I'd hoped to meet in Martin Canoy (who helps manage the Performance V-Team). Interesting eating breakfast in a room that had a semi-trailer parked up the front with a black box on it.

The general session started at 8am with introductions by Dan Berg and then a talk from our VP for Eco Responsibility, David Douglas. One of the standouts in this for me was the work done on the Santa Clara data center. This site is now a showcase for a green data-centre and tours are conducted through it for companies interested in just what we did. We were encouraged to get customers to tour it, but I would love to see a few you-tube clips put up about it and perhaps a 30 minute documentary that we could give out on a DVD (are you listening guys), as the world is a lot bigger than the Bay area.

Next up we again had Jonathan spend some time talking to us and answering some surprisingly targeted questions, all handled very well.

After a short coffee break, we came back for the release of the new T2 boxes. I have to say that these boxes look awesome and are going to kick some major ass in the marketplace. Given the launch was done in front of 3500-4000 technical people, it was observed during question time that the type of questions that the panel were receiving were a lot more technical than they would normally get during a product launch. Many of the questions focused on the desire for a T2 based laptop or workstation, to which Andy jokingly replied along the lines of "I can't comment on future products". When the workstation question came up he pointed back to "I thought we were looking at a laptop" or something like that :) One of the questions asked about a dual socket T2 based machine, and it was confirmed that this is currently being worked on. Now that will be a box to contend with!

During the break between 11 & 12, I spent some time at the Second Life booth in the pavilion, providing a second avatar in the same general area as the one they were using to show things off.

I had a whole lot of breakouts I wanted to go to today, but I also wanted to get some Solaris Certifications done. Unfortunately, the room only seated 30 people and when it opened at 12:15pm, the queue already had 45 in it. They took the sensible move of taking folks' names and giving them a rough estimate of when they should come back. After getting my name on the list I wandered over to the installfest and started a live-upgrade of this notebook to nevada build 74 (as I was having some punchin problems with build 73). After I got the initial copy done and the upgrade kicked off, I went back to the certifications to find that there were only about half a dozen people in front of me on the list. I only had to wait 5-10 minutes to get in.

I was a little concerned that, as I was doing these examinations cold, I might not get through. I should not have worried. My 16 years of SA, as well as maintaining my own machines while at Sun, and doing kernel work and Open Solaris advocacy, stood me in good stead.

The Solaris 10 Admin (part one) allowed 90 minutes to complete it and had a 61% pass mark. I finished it in 30 minutes and scored 71.2%. Woo hoo. Heartened by this, I spoke to a proctor and asked if I could do part 2 as I'd only used 30 minutes of the potential 90. He agreed and I started part 2. This was a little harder, but I knocked it over in 45 minutes and scored 70.5%. So now I'm certified (well, you knew I was certifiable, but that's another story).

Wandered back to the installfest to pick up my notebook. Ran up the new build and was pleasantly surprised to see punchin working correctly.

By the time all this had finished, the final breakout session was about to start, but there really wasn't anything I wanted to see in there. As it was now 6pm, I headed back to my room to get changed for the party at the Palms, as the buses were leaving at 7pm. I'm glad I did, as when I got back down to the lobby at about 6:40, there was already a very long line for the buses.

The party was a blast. The live band was awesome. There were also a number of games put on for us, like air hockey, video mountain biking, surfing, gun fighting, etc. The mountain biking game was murderous and really wore you out in a hurry; of course, not being able to adjust the seat to a proper height made pedaling difficult. The game I fell in love with, though, was a water skiing one. Anyone who played it understood just why you would get to whooping and hollering while you were doing it. It was incredible fun. I lost count of the number of times I rode it!

I spent a lot of tonight looking for a couple of friends from the US that I wanted to spend some time with, but didn't manage to find them. Maybe they didn't go, maybe there were just too many people (note to self, get phone numbers next time). I did find a few Australian colleagues whom I spent time with, and Bob Sneed introduced me to some really nice friends of his whom I also spent some quality time with. Oh, I also met Bela Amade from the EMEA cluster group, with whom I have also done a lot of work, as well as a number of other folks that I had backlined escalations with, whose names are too many to recall this late at night (sorry folks, I did enjoy meeting you, it's always good to put a face to a name).

About 10:45, they started herding us to the buses (as the last bus back would be at 11). I continued chatting with one of the folks Bob introduced me to while traveling back on the bus.

All in all a wonderful day. We've got the group specific meetings tomorrow and after that I am doing a podcast with Don Grantham, which should be fun.

I forgot to mention that on Monday at lunch I also met someone else I had been hoping to meet: Dimitri DeWild. Dimitri works in a similar group to me in EMEA, and we have long communicated with each other over email and IM.

Technorati Tags: cec2007, suncec2007


General

Catching up - Monday at CEC

I'm more behind than I wanted to be blogging my time here.

I started Monday nice and early with a breakfast with Ian White. A lovely breakfast; I shared a table with Ian, Linda Park (VP for Services in APAC), Chris Gerhard, Clive King, and some others whose names escape me at this time of night the following day (sorry folks).

We then had a marathon general session which probably could have used a break, as about all I can recall of it was listening to Andy Bechtolsheim.

Lunch (boxed roll and some other bits and pieces), then breakout sessions for the afternoon. The standout sessions for me were the Performance ones presented by my mate Bob Sneed and another by Jim Mauro. There was also an interesting session on how support services might use Second Life to do some delivery.

For a while it appeared that dinner on Monday night was going to be fend for yourself, but instead I went up to the unconference. Probably only about 40-50 people attended this and I find that a shame, as it was a damned good session, and they fed us hot dogs and beer.

The unconference consisted of two parts:

- Ten selected to take folks off to various rooms to give a presentation, and
- the remainder and any other volunteers to participate in speed-geeking

We had thirteen folks want to present in a room, so a vote was held to determine who got rooms.

Speed Geeking is interesting.

The idea is that you get 10 people in front of a board with butchers paper, and they get five minutes to give a presentation to the folks at the table in front of them. At the end of the five minutes the listeners move to the next table. Every listener is given a poker chip and at the end they give the chip to the speaker they enjoyed the most, and that speaker gets a prize.

Hal Stern spoke on "DRM is for morons", Bob Sneed did five minutes on capacity planning, Clive King did some basic SGRT, and I gave an impromptu five minutes on the development of a DTrace Bourne shell provider. There were more speakers, but as I was a speaker I didn't get to hear these, and the listed ones are the ones I can recall. I got a grand total of four chips, so wasn't even in the running. I think the winner got 27. As I said earlier, this session was a lot of fun and it would have been nice if it had been better attended.

A few of us including Bob Sneed, Rodney Lindner and myself hung around socializing for quite some time after the session finished, and I didn't get to bed until about 1am. I really can't remember what I did after we broke up and before I went to bed, maybe it will come back to me :)

More on Tuesday soon.

Technorati Tags: cec2007, suncec2007


General

First day at CEC

OK, I'm currently sitting in the hallway outside the ballroom where the general sessions will be held in the morning and it's after midnight.

I've only just managed to get connectivity in the last six hours as the wireless access points have been turned on. My phone only just started working today too, although I've been here 24 hours.

Enough moaning :) I managed 14 hours sleep last night so I think I'm over the jet lag from the flight between Sydney and Las Vegas.

Went along to the APAC Technical council meeting earlier today to hear John Greaves talk about the Principal Engineer programme (or whatever it is going to be renamed after today), and it seems pretty much the track that I'm interested in following.

Caught up with a few folks that I intended to today, including Sara Turner and Leon Nicodemus (I hope I spelled that correctly), at the reception drinks.

Afterwards, I adjourned to another bar in the Paris Casino with Clive King, Chris Gerhard and a few others for some cleansing ales. While on this subject, I have to say that what they have done with the Paris Casino is amazing. With the work done on the ceiling you have the impression that you are outside on a Parisian twilight early evening, constantly. While I can see the advantage to the casino of having you lose track of the time and the like, it was still a very pleasant way to spend an hour or so with colleagues.

I'm also having a play around with CEC 2007, and these folks have done a marvelous job in the setup to provide access to the sessions and breakouts to those unable to attend in person.

More tomorrow!

Technorati Tags: cec2007, suncec2007


General

Open vs Proprietary: An effort in mudslinging?

I've just read an "article" by Carla Schroder on Enterprise Networking Planet, where she claims to know the real reason that closed and proprietary code exists.

The point of the article appears to simply be that she doesn't buy

... into the "protect our preciouss IP" excuse because it is so overused.

I see very little in the way of reasoned argument and example to support her stance. Instead I see a lot of mud slinging and cherry picked examples.

We start off with the assertion that the real (and by implication, the only) reason for closed and proprietary code is embarrassment, followed by some "cool" misspellings of phrases like "Trade Sekkkrits" and "Sooper Original Algorithms". There is also another typo in her "conclusion". The typos look to me to be deliberate slurs on anything that a company might say about their code. Really sad reporting, Carla. Shame.

She then asserts, without any argument or example, that Windows itself is the prime example of this; presented simply along the lines of 'well everybody knows it, I don't have to prove it'. It very well may be the case. It may not. Nothing in this article gives me cause to think it might be.

The rest of the article is a list of cherry picked examples of poor proprietary code.

With regard to the opening of netscape and openoffice, would she have preferred that they not be opened?

The impression that I'm left with is that the whole thing was a mudslinging exercise aimed at proprietary code, and not even a very well prepared one at that.

I'm not going to sink myself to the same level as this "article", but I know with pretty much certainty that I could find some disgusting examples of embarrassingly poor open source code if I really wanted to. She freely admits that there is bad open source code out there too, so I am left even more confused about the point of the article.

Embarrassment may be one of the reasons for not opening some code up. It's certainly not the only one, which she could have determined with even a tiny amount of research. Nor is it a differentiator between open and closed source.

Carla, you have made an assertion and completely failed to argue your case. You get a D-, and I'm being generous.

A small poke at Jupiter Online Media, who run the site. Most articles that I read that are purported to be by trade magazines and journalists at least provide a one liner of the author's credentials. I'm sorry, I can't find it here.


OpenSolaris

sh provider update - command-entry fixed

I've just uploaded the latest diffs and binaries to www.opensolaris.org/os/community/dtrace/shells/.

So what changed?

There was a bug in that the command-entry probe fired in both the parent and the child shell. This was a simple oversight that I should not have missed. I originally had the probe before sh forked, then moved it such that it fired after we knew we were able to fork (basically I didn't want it firing if we were not able to actually fork and run the command). Unfortunately, I forgot to specify that it was only to fire in the parent shell. Simple fix. My bad.

Also note that the documentation on the Providers for various shells site supersedes what I previously wrote in my blog.

I haven't tested the diffs, but I did the same massaging to them that I did for the last lot, so they should be ok to use with gpatch.

Looks like things are starting to settle down now, so I'll be able to think about progressing this one and starting to look at some others, using these probe names as some kind of standardisation.

There have been suggestions for extras in this provider, but the feeling that I'm getting is: let's get a basic useful provider done as a v1.0 and look at things like stacks and other probes in a later update.

Technorati Tags: Solaris, OpenSolaris, DTrace


OpenSolaris

Why did I do a sh provider first?

Since I posted my initial blog on this, I've received a few comments and a surprising amount of mail asking why I did /bin/sh, which is an obviously obsolete shell, and not ksh93; some of it bordering on insulting.

Let me go through a few reasons why I did this one first.

- It's the one I was asked to do.

- The Bourne shell has been around for a very long time and there are, quite literally, millions of scripts that have been written in it, some of them very poorly so.

- Much work in the open source environment is done to scratch an itch. In my day to day work doing performance calls I come across an amazing number of instances where being able to probe /bin/sh the way that this allows me to would be incredibly useful in explaining to a customer why their thrown together script runs so slowly; most of these are /bin/sh scripts. Coding probes for ksh93 has no immediate impact on my day to day work, as ksh93 is not yet integrated into OpenSolaris.

- Just doing a quick poll of usr/src in the opensolaris tree gives me the following:

    Scripting Language   Actual scripts   Dynamically created scripts   Comments
    /bin/sh              463              538
    /usr/bin/sh          39               43                            /bin symlinked to /usr/bin
    /sbin/sh             186              272                           symlink to /bin/sh
    /bin/ksh             212              199
    /usr/bin/ksh         53               50                            /bin symlinked to /usr/bin
    /bin/perl            13               9
    /usr/bin/perl        241              237                           /bin symlinked to /usr/bin
    ksh93                0                0

- sh is a logical first step from which other providers can follow.

- ksh93 is currently on a track to get integrated. I really don't want to drop any roadblocks on it now.

- Does it really matter which one comes first?

At no point did I say I would not consider doing ksh93 or even helping someone else do it; indeed I have had communication with Roland suggesting ways in which this could be done. Believe it or not, there is a plan to get all the shells done. Keep an eye on http://www.opensolaris.org/os/community/dtrace/shells/.

The last thing I expected when I started on this was to be the target of insults because I didn't do someone's favorite shell. I'm wondering how this kind of behavior encourages anyone to actually do anything that is of any benefit to the community. Come on folks, we can do better than that.

Technorati Tags: Solaris, OpenSolaris, DTrace
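For anyone wanting to run a similar poll over their own tree, here is one way it might be done. This is a sketch only and not the method used for the numbers above: count_interpreters is a made-up name, it looks only at files whose first line starts with "#!", and it makes no attempt to find scripts that are generated dynamically during a build. It assumes a POSIX shell.

```shell
#!/bin/sh
# count_interpreters: hypothetical helper that tallies "#!" lines under a tree.
count_interpreters() {
    find "$1" -type f | while IFS= read -r file; do
        first=
        # Read only the first line; ignore unreadable or empty files.
        IFS= read -r first 2>/dev/null < "$file" || :
        case $first in
            '#!'*)
                interp=${first#??}                   # drop the leading "#!"
                interp=${interp#"${interp%%[! ]*}"}  # drop blanks after "#!"
                interp=${interp%% *}                 # drop interpreter arguments
                printf '%s\n' "$interp"
                ;;
        esac
    done | sort | uniq -c | sort -rn
}

# Example use: count_interpreters usr/src
```

The output is one line per interpreter path with its count, most common first. Scripts started via /usr/bin/env are counted under /usr/bin/env rather than the real interpreter.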


OpenSolaris

/bin/sh DTrace Provider

A couple of days ago Brendan was chatting with me on irc and we got to discussing such a beast, mainly looking at something simple along the lines of the python and perl providers that others have worked on.

Well, to make a long story short (ok it will be longer later), I've coded up something that appears to work against the nevada clone tree of a couple of days ago, and logged RFE 6591476 to track it.

I'll be putting the diffs up in the next day or so, but for a teaser, here is the documentation.

sh Provider

The sh provider makes available probes that can be used to observe the behaviour of bourne shell scripts.

Overview

The sh provider makes available the following probes:

    builtin-entry      Probe that fires on entry to a shell builtin command.
    builtin-return     Probe that fires on return from a shell builtin command.
    exec-entry         Probe that fires when the shell execs an external command.
    exec-return        Probe that fires on return from an external command.
    function-entry     Probe that fires on entry into a shell function.
    function-return    Probe that fires on return from a shell function.
    line               Probe that fires before commands on a particular line of code are executed.
    subshell-entry     Probe that fires when the shell forks a subshell.
    subshell-return    Probe that fires on return from a forked subshell.
    script-begin       Probe that fires before any commands in a script are executed.
    script-end         Probe that fires on script exit.

Arguments

The argument types to the sh provider are listed in the table below.

    Probe                                          args[0]   args[1]   args[2]   args[3]
    ============================================   =======   =======   =======   =======
    builtin-entry, exec-entry, function-entry      char *    char *    int       char **
    builtin-return, exec-return, function-return   char *    char *    int
    line                                           char *    int
    script-begin                                   char *
    script-end                                     char *    int
    subshell-entry                                 char *    pid_t
    subshell-return                                char *    int

arg0 in all probes is the script name.

In the builtin, exec, and function entry and return probes, arg1 is the name of the function, builtin or program being called. arg2 and arg3 in the entry probes are the argument count and a pointer to the argument list. In the return probes, arg2 is the return code from the function, builtin or program.

In the subshell-entry probe, arg1 is the pid of the forked subshell, and in the subshell-return probe, arg1 is the return code from the subshell.

In the line probe, arg1 is the line number.

In the script-end probe, arg1 is the exit code of the script.

Stability

The sh provider uses DTrace's stability mechanism to describe its stabilities, as shown in the following table. For more information on the stability mechanism see Chapter 39 of the Solaris Dynamic Tracing guide.

    Element     Name Class   Data Class   Dependency Class
    =========   ==========   ==========   ================
    Provider    Unstable     Unstable     Common
    Module      Private      Private      Unknown
    Function    Private      Private      Unknown
    Name        Unstable     Unstable     Common
    Arguments   Unstable     Unstable     Common

The probes that gave me the most trouble were line and subshell-*.

line was tricky as sh only does line numbers when it parses input. It has no concept of line numbers on execution, which is what we need.

So, I needed to add another structure element (line) to trenod, and every other struct that is cast over the top of it. In the first failed attempt, I set line to standin->flin whenever we allocated a new one of these nodes. The problem with this is that if the parser hits a newline, this number gets incremented before we actually set it, which means that the last command on every line has the line number of the following line. Not quite what I wanted.

What I ended up going with was the creation of another variable in the fileblk structure (comline) that I set just before standin->flin is incremented. This looks like it works.

The subshell-* probes were not initially going to be a part of this, but an assumption I made about com in execute() when coding the exec-* probes ended up causing the shell to coredump on me. It turns out that com is properly defined when we fall through the switch to the TFORK code, but if we go directly into the TFORK case, it's something completely different (would you believe "1"?). So, I made the probe conditional on the node type. If it is a TFORK, then we do a subshell probe, otherwise we do an exec probe. This also appears to work.

In the meanwhile, I've been sending Brendan sh binaries and he's already started coding more tools for the DTrace Toolkit based on this provider, and I have to say that some of his ideas look pretty cool.

Technorati Tags: Solaris, OpenSolaris, DTrace
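To make the probe list a little more concrete, here is a small sketch of a Bourne shell script, with comments noting which probes I would expect each line to fire going purely by the descriptions above. The file name sh_provider_demo.sh and the probe-to-line mapping in the comments are assumptions for illustration; none of it is captured output from the provider.

```shell
# Write a throwaway demo script (hypothetical name) and run it with sh.
demo="${TMPDIR:-/tmp}/sh_provider_demo.sh"
cat > "$demo" <<'EOF'
#!/bin/sh
# script-begin would fire here, before any command runs (arg0 = script name).
# The line probe would fire before the commands on each line (arg1 = line number).

greet() {
    echo "hello $1"     # builtin-entry/builtin-return, arg1 = "echo"
}

greet world             # function-entry/function-return, arg1 = "greet"
( exit 3 )              # subshell-entry (arg1 = child pid), subshell-return (arg1 = 3)
date > /dev/null        # exec-entry/exec-return, arg1 = the program, here date
exit 0                  # script-end, arg1 = 0
EOF

# Run the demo; on a shell built with the provider this is what you would trace.
sh "$demo"
echo "exit status: $?"
```

The script is deliberately tiny so that each probe family shows up once, which should make it easy to compare against whatever a tracing script reports.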


General

And sometimes it doesn't suck as much

I should have made this update earlier. I'll be careful about how specific I get here as I'm not sure how much I can or should make public.

I had a phone call from Gosford Police on the Wednesday following the incident. They had made some arrests and wanted me to go through what I lost with them.

On Thursday I went in to make a statement and recovered everything except a couple of cds and dvds that I had written myself (I couldn't remember what was on them) and some blanks.

Something I didn't comment on in the prior blog entry was that when I went back to the station to look for anything that may have been dropped, I heard some arguing further up the street followed by three people running up to the station, jumping onto the tracks and running off. On my way back to the car I was stopped by some people in another car asking if I had seen three kids running away. It turns out that on the train following mine, they had assaulted and robbed another poor guy.

The CCTV footage from both Narara and Gosford, as well as the hat that we had, were instrumental in the police getting good photographs of these people, and they were able to quickly find them and make three arrests. Watching the CCTV footage was an eye opener for how these people had actually done things.

Anyway, I must add to my thank you list the detectives at Gosford and the folks involved in getting them the CCTV footage in such a timely fashion that my gear had not been disposed of. I also need to thank Steve Lau for the offer of one of his spare Ferrari chargers and Jeff Bailey for actually couriering his spare one to my office on Tuesday so I could use my notebook on Wednesday.


General

Sometimes life sucks

An unusual blog title from me for an unusual entry.

Last night coming home on the train (about 9:20pm) I was sleepily playing a game on my notebook as the train pulled into Narara Station (about 15 minutes from where I would disembark). Out of the corner of my eye I noticed my backpack moving across the seat. Too late I realised it was being taken. I ran after the thief shouting for him to stop. He jumped out of the car and ran along the tracks, jumping a fence and running off. I jumped out to follow but couldn't catch him.

Inside the bag were:

    Ferrari Charger
    60gb USB disk
    Ferrari bluetooth mouse
    VPN Token card
    Some CDs, pens and other small bibs and bobs

Pretty much nothing to make the theft worthwhile.

I must express my extreme thanks to the folks who watched my notebook while I tried to catch this guy and returned some course material that had fallen out; also to the train guard, firstly for noticing I was beside the tracks and not allowing the train to move, and secondly for his assistance on the rest of the trip home to make sure that it was correctly reported and that I was generally ok.

I know the thief probably won't be reading this, but he will have been caught on CCTV on both Gosford and Narara Stations. The police also have the hat he dropped.

Anyway, as I was talking to the guard, I noticed that my shins hurt and I'd cut my hands up a bit where I'd fallen on the ballast jumping out of the train. After I got to Wyong I went up to Wyong Police Station to report the theft and hand in the hat. I then drove back down to Narara to look around the area to see if anything had been discarded. Gave up after about half an hour and went home.

When I got there Lyn reminded me that as this was my usual journey home, it should be reported through workcover. As I commented earlier I was starting to hurt a little, so I ventured up to Wyong Hospital to be checked out. Unfortunately I chose a very busy night, so it was about five hours before anyone could see me (which I don't begrudge as I lost track of the number of ambulances bringing in folks who were hurt worse than me). As it turned out, the nurse who ended up bandaging me up after the Doctor had checked me over was a good friend of ours and she was very helpful, even grabbing me a cup of coffee and something to eat before I left (and offering a lift home if I wasn't up to driving).

Finally got home about 5:30am and decided it really was not worth going to bed until after the kids went to school, which I'm about to do now.

Oh yes, one more thank you, to all the second life folks who sent me their best wishes on hearing what happened. You know who you are, and you have my gratitude.

So I'm basically left with a notebook that has about 15 minutes of charge on it until I can organise a new power supply. I hope that is not going to take too long. I guess I'm also really glad that the notebook was on my knee and not in the bag. I only just finished paying it off in December!


OpenSolaris

in.telnetd exploit doing the rounds

I am hearing talk about an exploit of the in.telnetd issue doing the rounds. This affects Solaris 10 and Solaris Express.

References at sans.org and asert.arbornetworks.com

Now would be a particularly good time to disable your telnet daemon:

    # svcadm disable telnet

If you must run telnetd, then you need to get the patches referred to in Sun Alert 102802. The patches are freely available on sunsolve.

    120068-03 SunOS 5.10_sparc: in.telnetd Patch
    120069-03 SunOS 5.10_x86: in.telnetd Patch

In spite of what the README says, these patches do not require a reboot.

Some Details

While things are still sketchy, it looks like it propagates by connecting as both "adm" and "lp" and copies sparc and x86 binaries into /var/adm/sa/.adm, along with crontab entries for both of these users.

A quick check to see if you have been infected is to check the mode of /var/adm/wtmpx. If it is 0646 and you have the aforementioned directory, it is likely you are infected.

First off, disable the telnet daemon as described above. Clear out the cron entries that were added for adm and lp. There will also be a program running, listening on port 32982. It will likely have the same name as the only non-dot-file in /var/adm/sa/.adm. Make sure you kill the right one, as they choose from a number of solaris daemons for the naming. Run pfiles on it to look for port 32982.

Update

For more information see the security blog.

Technorati Tags: Solaris, OpenSolaris, Sun, Security
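The quick infection check described in this post (mode 0646 on /var/adm/wtmpx plus the presence of /var/adm/sa/.adm) can be sketched as a small shell function. This is only a rough illustration: the function name check_telnet_worm is made up here, the two paths default to the ones named in the post, and a clean result from it does not prove a machine is uninfected.

```shell
# Hypothetical helper: report whether both indicators from the post are present.
# $1 = path to wtmpx (default /var/adm/wtmpx)
# $2 = path to the drop directory (default /var/adm/sa/.adm)
check_telnet_worm() {
    wtmpx="${1:-/var/adm/wtmpx}"
    dropdir="${2:-/var/adm/sa/.adm}"

    # find -perm with a bare octal mode matches only that exact mode,
    # so this prints the file name only when wtmpx is exactly 0646.
    hit=`find "$wtmpx" -perm 0646 2>/dev/null`

    if [ -d "$dropdir" ] && [ -n "$hit" ]; then
        echo "SUSPECT"
        return 1
    fi
    echo "no indicators found"
    return 0
}
```

Called with no arguments it looks at the real paths; passing two arguments lets you try it against copies or a mounted image. If it reports SUSPECT, follow the cleanup steps in the post rather than relying on the script.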


OpenSolaris

The in.telnetd vulnerability/exploit (3rd update)

Before I get into the meat of this posting, let me acknowledge that, yes, this was an almighty cock up and should not have happened. It did happen. Let's move on.

Also, while I might not agree with the publication of zero day exploits, again, it happened. There's really not much I can do about that. There's really no point in being upset about it.

The upside to the posted exploit was that, because the code was available, the poster included an analysis of what was going wrong, pointing at the code that was broken. This almost certainly saved us some time in troubleshooting the issue. For this part of the post, you have my thanks.

I would certainly be interested if the person who posted the exploit could tell us how he found the problem, for no other reason than that I'm simply interested.

Anyway, this blog is supposed to be about getting it fixed.

All the times below are Australia/NSW.

One of our National SSEs (Rodney) was on-site with a large customer yesterday. This customer had asked him about a telnet exploit and described the problem to him. Rodney gave me a call and asked me about it at about 1pm. I hadn't heard of it, and on hearing the description (initially only described to him as a root exploit) two of us (thanks for your help Chris) dove into the code to start looking at how telnet -l-froot could behave as it did. At this point I did not know about the zero day exploit posting. Once we worked out what was going on, I called Rodney back and explained the full implications of the bug to him so he could explain it to the customer.

We told them that they could block the root vulnerability by uncommenting the CONSOLE= line in /etc/default/login. Note that this has been the default almost forever. However, I still see lots of customer configurations where it is commented out. The only way to protect against the other implications (login as any user without a password) would be to disable the telnet service until we could fix the issue, eg

    # svcadm disable svc:/network/telnet:default

We then started looking through the code to determine the best way to fix this. I logged an internal escalation and was in the process of logging a high priority bug when I saw the following in the SCCS history of usr/src/cmd/cmd-inet/usr.sbin/in.telnetd.c

    D 1.67 07/02/11 19:46:41 danmcd 90 89 00009/00010/04896
    6523815 LARGE vulnerability in telnetd

I immediately had a look at the bug and banged off an email to Dan stating that I could probably get IDR patches built for on10 pretty quickly. After a brief discussion of the bug and the fix, he pointed me at the manager of the group responsible for the backport and I started the backport (actually a very simple fix).

I got the IDRs built and basic testing done by about 5pm and started writing the Sun Alert.

The documentation for how to write a Sun Alert, and specifically the actions that need to be taken to get interim fixes available, was spot on, and I sent off the initial draft at about 6:45 along with a request for getting the IDR patches turned into ISR patches (Interim Security Relief) and getting them published.

Just before 9pm, I started getting into discussions with UK based folk in Derrick Scholl's group about getting the Sun Alert out and what needed to happen for me to get a gate open to get the fix back into the patch gate.

Thanks to Angela, Paul, Brent and Bill for working hard to get to the point that I could log the RTI at 10:10, and start doing the minor nit type stuff that needed to be done before Bill could pass the RTI onto the on10-patch gatekeepers and I could go home (at about 11:15).

(As an aside, I missed my train connection by four minutes due to a late running North Shore train and spent an hour sat on a blacked out Hornsby station, getting home at about 2:30am.)

While I was traveling (and sleeping) the heads up went out to the gatekeepers and all folk who needed to know about this, so that when I came online at 8 this morning, I had very little to do before doing the actual putback into the patch gate (which happened at about 8:30).

The gatekeepers immediately closed the gate and started work on a patch.

The reason that I've detailed what I've been through with this is to point out one thing.

The speed at which I was able to do this and get to the point that an ISR patch will shortly be available publicly is nothing short of phenomenal. For Sun to respond to and address a vulnerability like this in around 24 hours would have been completely unheard of even two to three years ago.

But it's not just the processes here. What really made for speed here was an incredibly focussed and helpful group of people who had an interest in rapidly getting this addressed. Without the help of folks like Dan, Rodney, Chris, Angela, Paul, Brent, Bill and Seth, and not forgetting the gatekeeping team for pulling out the stops to start building a formal patch, none of this would be possible. If I've missed anyone, please forgive me, I didn't get a lot of sleep last night :)

I love working for a company that has people like this.

update 1

The ISR patches are available for free download from http://sunsolve.sun.com/tpatches. The details of the patches are:

    IDR125457-01 SunOS i386_x86: in.telnetd can call login with an option given as a username
    IDR125456-01 in.telnetd can call login with an option given as a username

update 2

Sun Alert 102802 is publicly available talking about this issue. Section 4 should shortly be modified to add the following paragraph:

    Interim Security Relief (ISRs) are available from http://sunsolve.sun.com/tpatches for the following releases:

    SPARC Platform
        Solaris 10 IDR125456-01
    x86 Platform
        Solaris 10 IDR125457-01

    Note: This document refers to one or more Interim Security Relief (ISRs) which are designed to address the concerns identified herein. Sun has limited experience with these (ISRs) due to their interim nature. As such, you should only install the (ISRs) on systems meeting the configurations described above. Sun may release full patches at a later date, however, Sun is under no obligation whatsoever to create, release, or distribute any such patch.

Update 3

I've just been informed that the formal patches are release ready and should be released to sunsolve in the next few hours. Keep an eye out for:

    120068-02 SunOS 5.10_sparc: in.telnetd Patch
    120069-02 SunOS 5.10_x86: in.telnetd Patch

As these are security patches, they will be publicly available.

Technorati Tags: Solaris, OpenSolaris, Sun, Security
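As a rough sketch of the CONSOLE= mitigation mentioned earlier in this post, the following shell function reports whether the CONSOLE= line in /etc/default/login is uncommented. The function name console_restricted is made up for illustration, and the check only looks for an uncommented line starting with CONSOLE=; it does not validate the value. Remember that this setting only blocks the remote root login, not logins as other users.

```shell
# Hypothetical helper: is CONSOLE= uncommented in the login defaults file?
# $1 = file to inspect (default /etc/default/login)
console_restricted() {
    login_defaults="${1:-/etc/default/login}"

    # A commented-out entry starts with '#', so anchoring at the start of
    # the line matches only an active CONSOLE= setting.
    if grep '^CONSOLE=' "$login_defaults" > /dev/null 2>&1; then
        echo "CONSOLE= is set: direct root logins are limited to the console"
        return 0
    fi
    echo "CONSOLE= is not set: remote root logins are not blocked here"
    return 1
}
```

Redirecting grep's output rather than using a quiet flag keeps the sketch usable with older grep implementations that lack one.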


OpenSolaris

Modifying your predictions does not make them true

I've been holding off writing this as I wanted to think about it a bit first.

I'm referring to Bob Cringely's predictions for 2006.

You might recall that I made some comments on him changing the prediction so he got it right last year. Well, he's playing the same games again this year.

The actual prediction was:

    Sun's Woes Continue

    Still no good news for Sun. Those Galaxy servers are very nice, but they aren't enough to support the company and Eric Schmidt is too smart (I hope) to bail out his old firm.

And in his evaluation of his predictions he writes:

    4) More bad news for Sun. That's true.

Notice the subtle difference: "More bad news" in the evaluation and "Still no good news" in the prediction.

Reading the prediction, one has to think that if Sun had any good news last year, then he scored a clean miss.

I definitely think we had lots of good news last year.

Like many folks in other forums who have commented on this particular prediction (when was the last time you saw non-Sun folks standing up for Sun on slashdot!), I think the question has to be asked: what axe has Robert X Cringely got to grind against Sun Microsystems?

Bob, if you have any shred of credibility left, you really should come clean and score #4 from 2006 a clear miss.

One other interesting thing that I just noticed from my prior discussion on this: in January last year, the Sun prediction was #5; it's #4 now. Did a prediction get deleted?

Technorati Tags: Solaris, OpenSolaris, Sun, Cringley


General

Poor Stevel

Steve has just written up what he discovered on arriving home today. However, I think his delivery onto #opensolaris tells the story so much better:

    (13:52:25) stevel: sigh
    (13:52:30) stevel: of all the things my beagles had to find, why did it have to be the peanut butter
    (13:52:39) stevel: and of all the places they had to eat it.... why did it have to be my bed
    (13:55:51) stevel: got the dried mushrooms, leftover stinky tofu (that one is fun to cleanup), a pitcher of tea, garlic, peanut butter, garlic butter spread (that one's fun too), 3 eggs, and a bunch of packets of ramen
    (13:56:42) stevel: ... and then to top it off, they decided that eating all this newfound bounty in the kitchen wasn't enough
    (13:56:47) stevel: they had to take it into various rooms of the house
    (13:57:00) stevel: how the @#)$(*@#$ does a beagle carry a *raw* egg to another room and then decide to bite into it
    (13:57:39) stevel: and one of them somehow managed to tip a chair over, jump from there to the fireplace mantle, and then from there onto the top of our piano
    (13:57:51) stevel: so he's left scratches all over the piano
    (13:58:12) stevel: which, btw, is a $35,000 piano. so that's gonna be probably about $500-$700 to repair the finish on that
    (13:58:37) stevel: and of course... what do dogs do after they eat and drink too much?
    (13:58:46) stevel: poop and pee....
    (13:58:53) stevel: again, the kitchen wasn't good enough for that
    (13:58:59) stevel: no no... they had to go on the bathmat instead
    (13:59:34) Tpenta: steve this is blog material man
    (13:59:42) richlowe: oh yeah.
    (14:00:05) gisburn: stevel: please blog it.
    (14:00:07) gisburn: steleman_: with photos
    (14:00:11) gisburn: LOTS of photos
    (14:00:42) stevel: i've already cleaned up most of it, but i can take photos of the scratches
    (14:00:57) stevel: and i can take photos of the two of them hiding in the corner of the living room underneath the piano cause they can tell i'm f-ing pissed at them

And on another channel (names removed as it's not a public channel), telling it from the dog's perspective. I've also edited some of the more obvious words :)

    (14:22:53) xxxx: "!@#$% you for leaving me", and "!@#$% you, go clean this shit up yourself you abandoning !@#$%"
    (14:23:57) xxxx: "But look at my cute little eyes and eyebrows looking up at you-- you don't want to turn me into sausage, do you?"

I see that steve's now dropped the irc log into the note, but I'll still post this.


Kids

Lucy in a modelling competition

My daughter Lucy, who turned eight on Saturday, has a photo up in a modelling competition. We are being encouraged to have folks vote for her. I would really appreciate it if anyone who thinks she's had a good photo taken would do so by clicking on her photo and voting (and encouraging friends to). Voting requires an Email address (they send out a thank you email and a couple of links when you vote). Pramod, she's wearing the outfit that you haggled for on my behalf on my last day in Bangalore.

Note that I've used the image as the link because, while I have not yet received my copies, I have purchased the copyright to the photo shoot. If you can't see the image, you can click here. There appears to be a problem with accessibility of the site from outside Australia; I'm following this up with the website.

The terms and conditions of the competition state that if AKC believes that anyone voting is bypassing the normal system of vote casting by continuously changing their IP address, AKC reserves the right to disqualify that contestant. Please be sure to ask your friends and support crew not to do this as, ultimately, it will disadvantage your chances.

The terms and conditions of the competition are available here. The important ones relating to voting are:

5. Participants of this site may only use programs such as Netscape® Navigator®, Netscape® Communicator®, Safari and Microsoft® Internet Explorer® (or equivalent web browsers) in trying to win the prize. Use of any other software not designed for browsing the World Wide Web is expressly prohibited. No electronic devices or software programs may be used to falsify winnings from this site. In the event that use of non-approved client software is detected, we reserve the right to invalidate any prize or claims to the prize. In the event such fraudulent attempts to claim a prize is uncovered, such individuals will be subject to prosecution.

6. Aussie-Kids.Com reserves the right to permanently disqualify from this promotion any person it believes has intentionally violated these rules or otherwise tampered with this promotion. Any attempt by any person to deliberately damage this Website or undermine legitimate operation of the poll constitutes a violation of criminal and civil laws and, should such an attempt or attempts be made, we reserve the right to seek damages from any such person or persons to the fullest extent permitted by the law.

9. Voting is enabled via a simple polling method installed, activated and monitored by our webserver. It permits only one vote from an individual computer within a 24 hour period for any contestant entered in the "STRIKE-A-POSE" Photogenic Quest. It does not permit multiple voting due to the restrictions enabled through our software which identifies the IP address of the person (computer) voting. Aussie-Kids.Com is not responsible for any software malfunction that may occur during voting. Aussie-Kids.Com is not responsible if voting is not possible due to the networking of some computers in government bodies/departments or business houses and will not enter into any discussions or explanations as to this fact before, during or after the voting period.


OpenSolaris

Tamarack, Gnome 2.17 and external devices

This is a combination that I've been hanging on. vold never quite handled my cds, dvds and usb media quite right.

I'm happy to say that for the most part it all just works like it should.

Well, I have a multi-partitioned usb device which exhibits an interesting problem.

First, let's have a look at how the device (it's a 60gb usb 2 disk) is partitioned.

                 Total disk size is 57231 cylinders
                 Cylinder size is 2048 (512 byte) blocks

                                                   Cylinders
          Partition   Status    Type          Start   End   Length    %
          =========   ======    ============  =====   ===   ======   ===
              1                 Ext Win95         1  28615    28615     50
              2                 Solaris2      28616  57230    28615     50

Partition 1 has a fat-32 filesystem.

Partition 2 has an SMI label and slice 0 is a zfs pool with the imaginative name 'usb'.

    # fstyp /dev/dsk/c4t0d0p1
    pcfs
    # fstyp /dev/dsk/c4t0d0s0
    zfs

If I insert this disk into my notebook running b53, I get the notice about not being able to mount the zfs pool (which I expect), but it does not notice the fat-32 partition, let alone mount it:

    # rmmount -l
    /dev/dsk/c4t0d0s0 rmdisk,rmdisk0,usb

I had a chat with one of the developers about this. He found that if we place the zfs pool directly into the second partition, then everything works as you would expect. The different controller numbers and zfs pool name are purely due to using another disk and machine on another continent.

    # fstyp /dev/dsk/c2t0d0p1
    pcfs
    # fstyp /dev/dsk/c2t0d0p2
    zfs
    # rmmount -l
    /dev/dsk/c2t0d0p0:1 rmdisk,rmdisk0,NONAME,/media/NONAME
    /dev/dsk/c2t0d0s2 rmdisk,rmdisk0,lacie_p2

It turns out that if hal discovers an SMI label on any partition, it does not probe any of the other logical disks, leading to the logging of

    6502219 If SMI label exists, hal does not probe logical disks

In the meantime, as I really don't want to muck around with the existing zpool, I'll continue to mount my pcfs manually, but it will be nice when this one is fixed.

It's probably also worth mentioning that "zpool import usb" just works.

Technorati Tags: SXCR, OpenSolaris
