Tuesday May 05, 2009

Testing OpenSolaris in a heterogeneous world using VirtualBox

Solaris and OpenSolaris have very good reputations for being stable, well tested platforms while also being full of innovation like dtrace, the power aware dispatcher, ZFS, Crossbow etc. In this environment test coverage is a moving target: new features, new uses and new platforms all make it necessary for teams involved in testing to adapt and innovate to cope with the ever increasing workload. Running to stand still.

The PerfQE team provides Performance QE coverage for most of Sun's software and hardware assets, producing 40,000+ performance metrics a month, with automated regression isolation to the putback ( or to put it another way, when we log a bug in a Solaris biweekly build, which could have hundreds of separate changes/putbacks, we have automation that will automatically find the engineer that caused the regression and reassign the bug to him/her ).
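A minimal sketch of how a regression can be isolated to a single putback, assuming a plain binary search over an ordered list of putbacks. The post does not say which method the PerfQE automation actually uses, so treat this as an illustration only. The is_regressed function and the FIRST_BAD variable are hypothetical stand-ins for building a putback and running a benchmark against it.

```shell
#!/bin/sh
# Bisect an ordered range of putbacks to find the first regressed one.
# Assumes every putback before the culprit is good and every putback
# from the culprit onwards is bad.

# Hypothetical predicate: succeeds if the build containing putback $1 is
# regressed. A real harness would build and benchmark; here the answer is
# simply taken from $FIRST_BAD for illustration.
is_regressed() {
    [ "$1" -ge "$FIRST_BAD" ]
}

# find_culprit LOW HIGH
# Putbacks are numbered LOW..HIGH and HIGH is known to be regressed.
find_culprit() {
    lo=$1
    hi=$2
    while [ "$lo" -lt "$hi" ]; do
        mid=$(( (lo + hi) / 2 ))
        if is_regressed "$mid"; then
            hi=$mid               # culprit is mid or earlier
        else
            lo=$(( mid + 1 ))     # culprit is after mid
        fi
    done
    echo "$lo"
}

# Example: 300 putbacks in the build, the 137th introduced the regression.
FIRST_BAD=137
find_culprit 1 300
```

With hundreds of putbacks in a build, a search like this needs only a handful of benchmark runs (about log2 of the number of putbacks) rather than one run per putback.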

We have 1400+ systems across the globe, all run at 100% 24x7, and no dedicated lab staff in Dublin where most of our systems are located, so you can get the idea that we don't have the luxury of putting up with misbehaving tests that require us to kick start them. One pain point for us has been a configuration of 60 desktop Windows PCs placing stress on a Solaris server via in kernel CIFS and Samba. Between test runs we reboot the entire configuration, but in 1 in 8 to 10 reboots one of those Wintel PCs would hang, requiring a manual reboot. In the past we've added IP power switches to reboot offending systems hard after timeouts. But frankly they cost money and I have enough cables.

So we just finished replacing the 60 Windows 2000 PCs with a v40z ( Quad Core Opteron ) running OpenSolaris and 60 VirtualBox Windows instances. We've gone through a detailed review to ensure we are producing the same load ( actually it is a higher one ) on the Solaris CIFS server, and we're seeing the same load pattern on the system under test but no hangs so far.
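As a rough sketch of what managing 60 guests from one host can look like, the loop below prints the VBoxManage commands that would start each guest without a display. It only prints the commands rather than running them. The guest names ( win-01 and so on ) are hypothetical, and the --type headless form is the current VBoxManage syntax, which may differ from the release in use when this was written.

```shell
#!/bin/sh
# print_start_commands COUNT PREFIX
# Print (do not run) one "VBoxManage startvm" command per guest.
# Guest names such as "win-01" are hypothetical.
print_start_commands() {
    count=$1
    prefix=$2
    i=1
    while [ "$i" -le "$count" ]; do
        printf 'VBoxManage startvm "%s-%02d" --type headless\n' "$prefix" "$i"
        i=$(( i + 1 ))
    done
}

# Show the commands for the first three guests.
print_start_commands 3 win
```

Piping the output of print_start_commands 60 win into sh on the VirtualBox host would start the whole configuration, assuming guests with those names exist.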

So what have we gained from this ? What are the advantages ?

  1. Space savings of over 95% ( they were desktop PCs connected to a KVM )

  2. Power savings of 80%

  3. Capital savings on hardware: 60 desktops vs one server is a pretty large difference. ( I will not put a % on it as it varies too widely )

  4. Test hangs reduced by 100% ( making the team happier ), and getting more from our capital.

  5. We'll now be testing more versions of Windows as the overhead in managing the virtual images is so low.

  6. We can use dtrace to profile the load Windows sends to our server more easily.

  7. The v40z is easier to manage remotely and hardware problems are handled by FMA making life easier

There is nothing here to stop any test/QA/QE group implementing something similar, and with savings as significant as we are seeing it really is worth the time.

Tuesday Sep 16, 2008

Keeping track of those annoying controller number changes.

One “feature” of Solaris which personally drives me round the twist is the way controller numbers can get changed, i.e. /dev/dsk/c4t5d0s0 can turn into /dev/dsk/c7t5d0s0 when an additional HBA ( host bus adapter ) card is added to your system. Yes, I know why this happens, and I know it has to, but it is still a pain in the \*&\^\*(&.

And so I like to “brand” my disks with the name that they started out life with, so I know which disks have changed. I've been doing it for some time, and please no hassle about the quality of the scripting, I'm a manager now :)

So why bother posting this blast from the past ? Well, the reason is that the fixes for the following two bugs mean the default numbers of the controllers in your x4500 will change when you upgrade to the latest ILOM.

6727449 NPI: Require SWIE support for S10 U5 Thumper platform
6725713 ILOM and later: virtual cdrom and floppy are enabled when used


# Script to bind link names to disks
# and reset them if the link names change.
# Damien Farnham DBE
# Tue Feb 13 13:02:14 GMT 1996

format > /tmp/format.$$ 2>/dev/null - <<!

grep cyl /tmp/format.$$ >/tmp/disklist

cat /tmp/disklist |
while read line; do
DISK_NAME=`echo $line | awk '{print $2}'`
format -d $DISK_NAME > /dev/null 2>/dev/null - <<!

USAGE="Usage: set_links -r or -s "
case $1 in

echo "Usage: set_links -s saves links name on disk label"
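The branding idea boils down to comparing the name saved on each disk label with the name the disk has today. Here is a minimal sketch of that comparison step only. It assumes the current device name and the saved label have already been collected into two columns; that input format is an assumption for this sketch and is not how the script above gathers them.

```shell
#!/bin/sh
# report_renamed
# Read "current_device saved_label" pairs on stdin and report every disk
# whose current name no longer matches the name saved on its label.
# The two-column input format is an assumption made for this sketch.
report_renamed() {
    awk '$1 != $2 { printf "%s was %s\n", $1, $2 }'
}

# Example: the first disk moved from controller 4 to controller 7.
printf '%s\n' 'c7t5d0 c4t5d0' 'c1t0d0 c1t0d0' | report_renamed
```

Any line it prints is a disk whose controller number has moved since it was branded.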




Thursday Sep 11, 2008

Best Practice: maintaining the BIOS on Sun Intel and AMD x86 systems

( a follow on from the SPARC firmware blog )


For many Solaris system administrators a BIOS didn't exist until recently, because the vast majority were on SPARC systems and they had the OBP ( I'll not bore everyone with the long version of how simply awesome this was 18 years ago when I first saw the OK prompt and typed boot net ).

But today Solaris is multi platform. Solaris x86 is not a poor relation to Solaris on SPARC; they are feature for feature equals. Xeon and AMD boxes have grown from single socket, single thread, single core babies with pretty simple firmware / BIOS ( aka Basic Input/Output System ). Today's x86 platforms are far from simple and have grown into pretty powerful beasts: take the x4600 with 8 sockets of Quad Core Opteron, or the x4150 with its Quad Core Xeons. The BIOS to manage a 32 core x4600 is understandably a more complex beast than that of your IBM PC of yesteryear, and so having the right BIOS and the right BIOS settings for your platform is critical to get the best from your x86 box.

What does this mean to a Solaris Admin on Sun Xeon & Opteron platforms ?

It simply means that as part of your Solaris Patch Policy you should always include updating your BIOS as well as the Solaris Patches. There are a number of pretty compelling reasons why.

  1. BIOS releases contain the latest microcode patches from Intel and AMD. Microcode is in effect a set of instructions loaded into a CPU to work around hardware bugs.

  2. Sun updates the configuration of a BIOS to optimize it for the system to provide optimum performance. I recently tested two Quad core Xeon boxes from two vendors, and while they had the same CPUs and memory there was a 40% difference in performance on the SPECjbb2005 benchmark due to one having sub optimal settings.

  3. QA teams across Solaris and the Systems group test up-coming releases of Solaris Updates, Nevada and OpenSolaris and use the latest released BIOS for testing. Aligning with this aligns your own software stack with the most tested and trusted Sun stack. BIOS problems can be very hard to diagnose and so limiting your exposure to them is a good idea ( read as lazy but smart ).

  4. It is easier to stay current. Upgrading from minor release to minor release is really safe and painless, while going from a very old release may require you to do a number of intermediate upgrades, and of course this will happen when you least need additional work. And remember with all Sun servers you can upgrade from the SP.

  5. Your new Sun box may not come with the latest BIOS installed, an issue we are addressing ( please bear with us ) so even new systems can benefit from checking to ensure you are current.

How do I find out what my firmware release is on my X4150 ?

On your system's Service Processor:

ssh oaf413-sc

root@oaf413-sc's password:

Sun Microsystems Embedded Lights Out Manager

Copyright 2006 Sun Microsystems, Inc. All rights reserved.

Firmware Version: 4.0.10

SMASH Version: v1.1

Hostname: SUNSP001B2493C5CC

IP address:

MAC address: 00:1B:24:93:C5:CC

-> show SP










Firmwareversion = 4.0.10

Timeout = 300

CPLDVersion = 063

Target Commands:





Or, for the old guard, you can watch the system boot ;)

Details on how to log into your SC are included on docs.sun.com and the documentation supplied with each system.
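If you save a Service Processor session like the one above to a file, the firmware version can be pulled out with a couple of lines of shell. This is only a sketch: it assumes the output contains a property line of the form "Firmwareversion = x.y.z" as shown in the session, and the file name sp-session.txt in the usage note is hypothetical.

```shell
#!/bin/sh
# sp_firmware_version
# Read captured "show SP" output on stdin and print the value of the
# "Firmwareversion = x.y.z" property line (first match only).
sp_firmware_version() {
    awk -F' *= *' '$1 ~ /^[ \t]*Firmwareversion$/ { print $2; exit }'
}

# Example using lines taken from the session above.
printf '%s\n' 'Firmware Version: 4.0.10' \
              'Firmwareversion = 4.0.10' \
              'Timeout = 300' | sp_firmware_version
```

Typical use would be sp_firmware_version < sp-session.txt, then comparing the result by eye with the latest release listed for your platform.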

How do I find out where the latest versions of the Sun System BIOS are ?

I find the fastest way is to use Sun System Handbook ( All seems familiar and common sense ? Good )

Select Servers in the first drop down box.

Then select “x4150” in the 2nd.

And this pretty page jumps up



And follow the instructions to download.

Wednesday Sep 10, 2008

Mr AirBus, Boeing Help Stop the Madness.......

Why ! Why ! do Airlines use Linux to run their in flight programs !!!!!!!!

I spent last Sunday watching Linux boot on the in-flight TV on Aer Lingus's latest A322 for 9 hours solid ! Turning computers on and off is not trying to fix stuff, folks !

Solaris can boot in far less time than Linux, and reset to a ZFS snapshot config on boot to ensure it goes back to a known working state.... hey, you can use BrandZ and even the Linux apps can boot as Zones in like 5 secs. SMF/FMA can make it fix the simple things itself.

I've seen the same on a Virgin Boeing so I cry stop the madness ! Use an OS that rules out this crud ( sorry ). I'm also confused as to why a T1/T2 based system is not a better option, as they use a fraction of the power and footprint.

STOP the madness !!!!!!!!!!!!!!!!!!!

Give me my movies !!!!!!!

PS My book choice for the flight didn't prove super either.

OK a bit over the top but I needed to vent.

Best Practice In Firmware Patching for the T1000/T2000/T5X20


Many Sun SPARC Solaris system administrators will rave about how simple and elegant the OBP was on SPARC platforms going back to the SPARCstation 1. That simple OK prompt allowed you to rapidly boot net or boot disk1 to boot a different OS image from the default. Apart from showing my age and proving I was easy to please, the OBP is simply really effective. This firmware did little after the OS booted and in general most users never really needed to upgrade their OBP.

Today that firmware stack on the T1/T2 based system families has grown. It now includes the features of the OBP and also has features only found in the OSes in the past. The Hypervisor with Logical Domains software which makes up today's firmware stack, while very elegant and small for what it does, is far larger and also very active while the system is running your applications. Ldoms allows you to run many OS images on one system with different patch revs, like the E10k domains of old, and soon will let you migrate Ldoms from one system to another etc. ;) While most of this complexity is hidden from you and using Ldoms is simple for the average Solaris admin, it is not quite business as usual.

What does this mean to a Solaris Admin on T1&T2 platforms ?

It simply means that as part of your Solaris Patch Policy you should always include updating your Firmware and Ldoms stack as well as the Solaris Patches. There are a number of pretty compelling reasons why.

  1. The features in the Solaris part of Ldoms are closely linked to the corresponding release of Hypervisor with Ldoms. So if you want to try that new feature like live migration you need to have the right release in each case.

  2. As described above the FW stack has grown, and along with new features you get bug fixes and performance improvements. If you believe in the need to patch Solaris then you need to upgrade your FW. However, if you never patch your Solaris installation ( as many customers do not ) and never see an issue, then don't patch your FW.

  3. Sun QA teams across Solaris and the Systems group test up-coming releases of Solaris Updates, Nevada and OpenSolaris and always use the latest released firmware for testing. Aligning with this aligns your own software stack with the most tested and trusted Sun stack.

  4. It is easier to stay current. Upgrading from minor release to minor release is really safe and painless, while going from a very old release may require you to do a number of intermediate upgrades, and of course this will happen when you least need additional work.

  5. Your new Sun box may not come with the latest firmware installed, an issue we are addressing ( please bear with us ) so even new systems can benefit from checking to ensure you are current.

How do I find out what my firmware release is on my T5220 ?

On your system's Service Processor:

sc> showhost

Sun System Firmware 7.1.3.e 2008/07/29 13:40

Host flash versions:

Hypervisor 1.6.4.b 2008/07/11 08:04

OBP 4.28.10 2008/07/12 12:37

POST 4.28.10 2008/07/12 13:02

Details on how to log into your SC are included on docs.sun.com and the documentation supplied with each system.

Sun SPARC Enterprise T5120 and T5220 Servers Installation Guide
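To check a system against a firmware release, the showhost output above can be compared component by component with the versions listed for that release. A minimal sketch, assuming the showhost output has been captured as text in the layout shown above; the function names are made up for this example.

```shell
#!/bin/sh
# system_firmware_version
# Print the Sun System Firmware version from showhost output on stdin.
system_firmware_version() {
    awk '/^Sun System Firmware/ { print $4; exit }'
}

# component_version NAME
# Print the version of one component (Hypervisor, OBP, POST) from
# showhost output on stdin.
component_version() {
    awk -v name="$1" '$1 == name { print $2; exit }'
}

# is_current NAME EXPECTED
# Succeed if the component version on stdin equals EXPECTED.
is_current() {
    [ "$(component_version "$1")" = "$2" ]
}

# Example using the showhost output from this post.
showhost_output='Sun System Firmware 7.1.3.e 2008/07/29 13:40
Host flash versions:
Hypervisor 1.6.4.b 2008/07/11 08:04
OBP 4.28.10 2008/07/12 12:37
POST 4.28.10 2008/07/12 13:02'

printf '%s\n' "$showhost_output" | system_firmware_version
if printf '%s\n' "$showhost_output" | is_current OBP 4.28.10; then
    echo "OBP matches"
fi
```

The comparison is a plain string match, which is enough to answer "am I on this exact release ?" but does not try to decide which of two versions is newer.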


How do I update the firmware ?

Updating the firmware is explained in this document.


How do I find out where the latest version of Sun System Firmware is ?

I find the fastest way is to use Sun System Handbook ( All seems familiar and common sense ? Good )

Select Servers in the first drop down box.

Then select “T5220” in the 2nd.

And this pretty page jumps up


Including this section

Revision History:

136932-02 136932-01

Patch Installation Instructions:
Please refer to the Install.info file for instructions on updating the firmware
in the flashprom using the files included in this patch.  In particular, there
is information on the differences involved with the ILOM-based Sun System
Firmware (7.x) in connection with the use of the Solaris Sun Update Connection
Special Install Instructions:

NOTE 1:  Firmware component revisions included with this release:

         Sun System Firmware 7.1.3.e 2008/07/29 13:45
         ILOM Jul 29 2008 13:26:36
         VBSC 1.6.4.d  Jul 29 2008  13:08:55
         Hypervisor 1.6.4.b 2008/07/11 08:04
         OBP 4.28.10 2008/07/12 12:37
         POST 4.28.10 2008/07/12 13:02

         Checksum of Sun_System_Firmware-7_1_3_e-SPARC_Enterprise_T5120+T5220.pkg : 4292411777
         (generated by the /usr/bin/cksum command)

NOTE 2:  By using Sun System Firmware (Firmware) you agree to the terms of the
         Software License Agreement and Entitlement (SLA/Entitlement) found in

         By using the Firmware, you agree to the terms of the SLA/Entitlement.
         If you do not agree to all of the terms, promptly destroy the unused

NOTE 3:  Please refer to the online documentation for feature and version
         compatibility between Sun System Firmware and LDom Manager releases.
         LDoms Release Notes are available on http://docs.sun.com under
         this title and part number:
         Logical Domains (LDoms) 1.0.3 Release Notes 820-4895

NOTE 4:  If you are currently using LDoms 1.0 or 1.0.1 software, you must
         perform a full upgrade procedure to upgrade to LDoms 1.0.3 software.
         Refer to the Logical Domains (LDoms) 1.0.3 Administration Guide,
         820-4894, at http://docs.sun.com/app/docs/prod/ldoms#hic. You do
         not need to destroy configurations created with LDoms 1.0.2 software;
         you only need to upgrade the software.

NOTE 5:  Sun will update this posting in the future with a link to the
         GPL ILOM source code.  Until then, to request a copy of the GPL ILOM
         source code, please contact ilom-gpl-source-request@sun.com.

README -- Last modified date:  Wednesday, August 13, 2008
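The README quotes a checksum generated by /usr/bin/cksum, so it is worth checking the downloaded package against it before flashing anything. A minimal sketch; the function name verify_cksum is made up for this example.

```shell
#!/bin/sh
# verify_cksum FILE EXPECTED
# Succeed when the CRC that cksum prints for FILE equals EXPECTED.
verify_cksum() {
    actual=$(cksum "$1" | awk '{ print $1 }')
    [ "$actual" = "$2" ]
}

# Example usage with the package name and checksum from the README:
#   verify_cksum Sun_System_Firmware-7_1_3_e-SPARC_Enterprise_T5120+T5220.pkg 4292411777 \
#       && echo "checksum OK" || echo "checksum MISMATCH"
```

A mismatch usually just means a truncated or corrupted download, and it is much cheaper to find that out here than halfway through a firmware update.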

Tuesday Aug 19, 2008

PowerTop OpenSolaris

As my last post mentioned, we're working to take power management to the next level, working closely with Intel and AMD and of course the SPARC folks internally. Here are some of the early features..... ( OK we know this is catch up ;) but you have to catch up before you overtake )

http://www.youtube.com/OpenSolarisTesla

Great work by Rafeal ( superb camera work from Andrew too )

http://www.opensolaris.org/os/project/tesla/Work/Powertop/

PowerTOP is an observability tool that shows how effectively your system is taking advantage of the CPU's power management features. By running the tool on an otherwise idle system, you can see how much time the CPUs are spending running in lower power states. Ideally, an unutilized (idle) system will spend 100% of its time running at the lowest power CPU states, but because of background user and kernel activity (random software periodically waking to poll status), idle systems typically consume more power than they should. PowerTOP shows you which software (user and kernel) is waking up, and how often. By fixing, filing bugs against, (or just not running) power inefficient software you can help improve your system's power efficiency.

Friday Aug 08, 2008

Performance Power and Lifestyles

A number of years ago we evolved our performance QA model at Sun to
better support development and testing of high performance software and
hardware. We put big rules in place, such as "If Solaris is slower, it is
a bug", then developed a process to keep our competitive comparisons up to date.

We call this Sun's Performance Lifestyle. See

The buzz was around price performance but now the meaning of price has
changed. Total Cost of Ownership includes the cost of powering and cooling
our systems, so Sun's Performance QA process has changed.

The first major effort by the industry performance community is SPECpower.
This is a start but like many benchmarks it is open to abuse. In short, power
usage, like performance, depends on your application, your system load,
your configuration etc. Many of the more popular benchmarks
are adding a power metric but this takes time.

To support the many teams in Sun that are working on performance and
power management features we've extended power monitoring while
benchmarking to all the benchmarks. Every Solaris, BIOS
and SPARC firmware change will be measured for its effect on power
consumption. Is this the Green Lifestyle ? Power lifestyle ?
Utility Bill lifestyle ;) ?

Monday Oct 01, 2007

Solaris making teaching Windows/Linux/Solaris easier in DIT

Solaris an Open platform for Colleges.

One of the engineers in my team recently attended his graduation and got talking to his ex lecturer ( Mark Deegan ). The topic turned to Sun and Solaris and the fate of their old SunRay lab. The lab was not heavily used and Mark asked if we could help bring it up to date.

A number of the Performance team visited and installed Solaris 10
and configured Samba, ZFS, Containers, Linux Containers, Java
Enterprise System etc., and Mark and his team added some Windows Terminal Servers, so now any lecturer can use the lab to teach on Windows, Linux and Solaris from the same lab.

All platforms can share the same ZFS storage on the new T1000s they bought, backed up using the cool snapshot feature.
Mark was shocked to see what you get for your money !

I cannot say how annoyed I get when I read so many stories about Sun that start with Sun Microsystems the maker of "expensive proprietary systems". This is a myth: we're open, just check. You get a lot for your dollar, euro or pound, and college pricing is even better. I just love people's faces when you type psrinfo -v on a T1000: 32 threads in a 1U.

The lab has all the years-old SunRay advantages: it is quieter ( no fans ),
uses less power ( the SunRays draw a fraction of the power of a PC ) and the T1000s draw a fraction of your Dell. Everyone seems to be fighting to claim they are the greenest but this is old hat for us.

DIT have also started hosting Irish OpenSolaris user groups http://www.opensolaris.org/os/project/ie-osug/meetings/14/

DIT have also signed up for Sun's FREE online educational training program which covers Solaris, Java, Java Enterprise System and even soft skills like time management. Queen's
in Belfast are also members.

There is a news article which covers what DIT are doing here.


Wednesday Feb 14, 2007

Venting :)

It is amazing how little things can make you really frustrated when you'll work thru the big issues without too much trouble. Right now I'm being driven round the twist by callers looking for someone in our internal support organization. This person has the same 5 digit number as me, but to reach them from some parts of the world you need to add 70. One morning I received 10 calls. One gent from Germany rang and launched into his problem. I tell him politely that I think he has the wrong number; he tells me no, he's sure he has not, and continues ! I then tell him he has and he needs to put 70 in front. No sorry for wasting your time.

So, like many of the poor souls that suffer the same problem in Dublin, I added a new message to my voice mail explaining that I'm not in support and if you are calling about a ticket you need to redial with 70 in front. So at least when I get in each morning I hoped I would not have to go thru 5 or 6 often long, often rude messages. Fixed ! Of course not. I still get 1 or 2 each morning from &\^\*&\^\* ( this morning's best was from a Lady in Sweden ) angry I hadn't contacted her about her call ( after sitting thru my message telling her it's not me ).

It has also shown me that there is no country in which Sun does business that is more polite or ruder than any other. People ringing help desks are generally on a short fuse and act the same no matter where they are from, and it's not a good side of human nature ;) This is why I'd really like Sun to make more use of meeting.central and our REALLY COOL name finder phone book, because if you look up someone and ring them it knows when to add 70.

1) Globally people are rude when they ring help desks
2) People do not listen to messages on voice mail
3) People do not care if they give out to the wrong person as long as they find some poor sod.
4) I do not want the job of the guy who shares my number :)

Monday Apr 10, 2006

Zone & HP

I visited a large customer last week with a couple of engineers. It is always useful for folks from engineering to meet real customers. It grounds you, lets you understand what the real issues are.

Maybe it's just me but most problems within IT organizations are non technical, caused by the organization of a company rather than the technology required to solve a business problem.

Throw in outsourcing partners and life can get complex to say the least.

The customer described one non technical issue they had as they rolled out Solaris 10. They looked at zones/containers and said this rocks: each developer can have a "system" to develop on without impacting each other
at no cost but.......

HP delivers their system administration service and wanted to charge them for each zone as a separate system. The customer was clearly "unhappy" and they are working the issue with HP.

HP seem to view server virtualization as a revenue generation engine, less effort more billing. I know many of the folks that developed zones and I've never heard it described as a way for HP Professional services ( or anyone else ) to increase revenue ;)

Hopefully this was just one service sales rep thinking Christmas came early and that Sales Junket, a tropical beach, palm trees swaying in the breeze, drink with an umbrella, was in the bag.

P.S. ZFS ships soon, so for the record the self healing, reduced administration, increased performance and instant snapshots are not an opportunity to increase charges.

Monday Dec 12, 2005

dtrace & dprofile

I have not had a lot of time lately to update my blog !

We've just added compiler Performance to our test matrix. Compilers have always been tested but we're
integrating Studio and Solaris performance testing so each development team can better understand the effect of their work on the other ( and in turn on the customer )

I have been talking to a lot of ISVs ( independent software vendors ) bringing their code and workload into our performance test matrix. They often comment on the pain that goes with upgrading compiler releases. Our goal is to reduce this real pain and provide some positive incentive in terms of increased performance.

One of the numerous killer Sun Studio 11 compiler features is called dProfile, which uses some cool features of the SPARC platform and dtrace. Have you seen those T1000 & T2000 systems yet ?

So do we shout about this from the rafters ? Never !

If you do a search for dprofile on www.sun.com you'll only get 3 hits !
and none would get you interested enough to search more.

So if you love dtrace then you'll love this too. Check out the developer's own blog,
which is rather gentle in its claims ( Nick, you never struck me as shy :) )


PS check out the latest dtrace -z option in the latest Solaris Express.

Tuesday Sep 27, 2005

OpenSolaris Live on My New laptop

Finally got my Ferrari 4k installed with Nevada 23. ( Been using a 3200 for a long time ) It rocks, rocks I tell you. Fast, quiet and with the frkit it's got everything you'll ever need from an OS.

Wednesday Aug 03, 2005

Sun Again.

I was speaking to an engineer in my team yesterday while getting a tea.
We were discussing road maps ( which I cannot go into here ) for really
cool new hardware coming out of Sun both SPARC and AMD over
the next little while.

I have been around here a while and was a customer long before that and
expressed my view that these system designs were very "Sun". He asked
what I meant.

Well, the boxes are simple, pack in a HUGE amount and have a high
build quality ( even for the prototype units we have ). Just to show him I
 put one of our 2u rack mount systems next to a new IBM Xeon 2u system
( yes, we test Solaris x86, Java and of course JES on NON Sun hardware really )
and the difference was amazing. The IBM has got so many additional components
which make it look like a KIT built from spare bits and I'm just talking the

I'd love to post pictures but you'll have to wait a while longer to try one
for yourself. Even with the system packaging we're back to where we
started, putting standard bits together better than anyone ;) If you run a datacenter with 100's of rack mount units you'll LOVE these.

P.S. The IBM runs Solaris x86 just great. J2SE runs fine on it with XP
( yes we test Java on XP ), and RH & SuSE too.

Wednesday Jun 29, 2005

Sun and U2

I have worked in Sun for a long time now ( 12+ years ) and thought
I had seen it all ! But never did I expect to see a PRESS RELEASE with
Sun Microsystems and Bono ( yes of U2 fame ) working together !

I was lucky enough to get a ticket to see the Vertigo Tour
home coming opening night in Dublin's Croke Park.  They sold out 3 nights
and could sell 10 more if the venue was available.

The concert was AWESOME. great music a super show and
the band clearly enjoyed playing to the home crowd. The high tech
light show was incredible.

At the end of the concert Bono asked the crowd to text the word 'AFRICA' to 53131,
and that is where Sun comes in. We provided the back end infrastructure
( and I guess Java for all those phones too )

Maybe marketing could get a Sun Logo somewhere in the venues for the rest of the shows.
I had to update this as it now appears that Marketing were listening :) Check out www.sun.ie and we see Bono in all his glory.
Croke Park is an 80,000 seater stadium close to Sun's office and home
of Gaelic Football and Hurling. Check these out if you get the chance
when you visit Ireland; they are uniquely Irish and great to watch. Check out http://gaa.ie.

Thursday Jun 02, 2005

Switch Performance.

A couple of my team mates ( Fintan and Sean ) have posts that deal with
Linus deciding that Performance testing is a good idea and it should be done
for Linux. ( I'm sure reading the article again he'll see it as a Homer Simpson
moment, D'oh ! maybe we should test it !)

Sounds Silly ?

It seems that Linus may be ahead of some folks. We do a lot, in fact, a hell of
a lot of network performance testing. Last week we blew a low end 100/1000Mb switch.
We replaced it with a new one, same make, model etc., yet there is a 10% difference
between it and the original on the standard SPECweb99 benchmark. Ouch.

The same hardware now gets 10% less. Maybe switch vendors could start testing
their firmware too :)



