Testing OpenSolaris made easy in a heterogeneous world using Virtual Box

Testing OpenSolaris in a heterogeneous world using Virtual Box

Solaris and OpenSolaris have very good reputations for being stable, well-tested platforms while also being full of innovation such as DTrace, the power-aware dispatcher, ZFS and Crossbow. In this environment test coverage is a moving target: new features, new uses and new platforms all make it necessary for the teams involved in testing to adapt and innovate to cope with the ever-increasing workload. We are running to stand still.

The PerfQE team provides Performance QE coverage for most of Sun's software and hardware assets, producing 40,000+ performance metrics a month, with automated regression isolation to the putback. To put it another way, when we log a bug against a Solaris biweekly build, which could contain hundreds of separate changes/putbacks, we have automation that will identify the engineer who caused the regression and reassign the bug to him/her.

We have 1400+ systems across the globe, all run at 100% 24x7, and no dedicated lab staff in Dublin, where most of our systems are located, so you can get the idea that we don't have the luxury of putting up with misbehaving tests that need us to kick-start them. One pain point for us has been a configuration of 60 desktop Windows PCs placing stress on a Solaris server via in-kernel CIFS and Samba. Between test runs we reboot the entire configuration, but in 1 in 8 to 10 reboots one of those Wintel PCs would hang, requiring a manual reboot. In the past we've added IP power switches to hard-reboot offending systems after timeouts. But frankly they cost money and I have enough cables.

So we have just finished replacing the 60 Windows 2000 PCs with a v40z ( Quad Core Opteron ) running OpenSolaris and 60 Virtual Box Windows instances. We've gone through a detailed review to ensure we are producing the same load ( actually it is higher ) on the Solaris CIFS server, and we're seeing the same load pattern on the system under test but no hangs so far.
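With the clients virtualised, the old "IP power switch" trick becomes a command. The post doesn't describe how the guests are actually reset, so treat this as a minimal sketch only: the guest names win01..win60 are made up, and the function just prints the VBoxManage commands so you can eyeball them before running anything.

```shell
# Print (do not run) one hard-reset command per Windows guest.
# Assumption: guests are named win01, win02, ... (hypothetical names,
# the post does not say what the real VMs are called).
reset_cmds() {
    count=$1
    i=1
    while [ "$i" -le "$count" ]; do
        # "controlvm <name> reset" is the VirtualBox equivalent of
        # pressing the reset button on a hung desktop PC.
        printf 'VBoxManage controlvm win%02d reset\n' "$i"
        i=$((i + 1))
    done
}

reset_cmds 3
```

Piping the output into sh ( reset_cmds 60 | sh ) would run the resets for real; printing first keeps the sketch harmless.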

So what have we gained from this ? What are the advantages ?

  1. Space savings of over 95% ( they were desktop PCs connected to a KVM )

  2. Power savings of 80%

  3. Capital savings on hardware: 60 desktops vs one server is a pretty large difference. ( I will not put a % on it as it varies too widely )

  4. Test hangs reduced by 100% ( making the team happier ), and getting more from our capital.

  5. We'll now be testing more versions of Windows as the overhead in managing the virtual images is so low.

  6. We can use dtrace to profile the load Windows sends to our server more easily.

  7. The v40z is easier to manage remotely, and hardware problems are handled by FMA, making life easier.

There is nothing here to stop any test/QA/QE group implementing something similar, and with savings as significant as the ones we are seeing it really is worth the time.

X4500 controller numbers are renamed.

Keeping track of those annoying controller number changes.

One “feature” of Solaris which personally drives me round the twist is the way controller numbers can change, i.e. /dev/dsk/c4t5d0s0 can turn into /dev/dsk/c7t5d0s0 when an additional HBA ( Host Bus Adapter ) card is added to your system. Yes, I know why this happens, and I know it has to, but it is still a pain in the \*&\^\*(&.

And so I like to “brand” my disks with the names that they started out life with, so I know which disks have changed. I've been doing it for some time, and please, no hassle about the quality of the scripting, I'm a manager now :)

So why bother posting this blast from the past ? Well, the reason is that the fixes for the following two bugs mean the default numbers of the controllers in your x4500 will change when you upgrade to the latest ILOM.

6727449 NPI: Require SWIE support for S10 U5 Thumper platform
6725713 ILOM and later: virtual cdrom and floppy are enabled when used


# Script to bind link names to disks

# and reset them if the link names change.

# Damien Farnham DBE

# Tue Feb 13 13:02:14 GMT 1996



format > /tmp/format.$$ 2>/dev/null - <<!




grep cyl /tmp/format.$$ >/tmp/disklist




cat /tmp/disklist |

while read line; do

DISK_NAME=`echo $line | awk '{print $2}'`

format -d $DISK_NAME > /dev/null 2>/dev/null - <<!








USAGE="Usage: set_links -r or -s "

case $1 in






echo "Usage: set_links -s saves links name on disk label"




Best Practise BIOS patching on Sun Intel and AMD x86 systems

Best Practise: maintaining the BIOS on Sun Intel and AMD x86 systems

( a follow on from the SPARC firmware blog )


For many Solaris system administrators a BIOS didn't exist until recently, because the vast majority were on SPARC systems and they had the OBP ( I'll not bore everyone with the long version of how simply awesome this was 18 years ago when I first saw the OK prompt and typed boot net ).

But today Solaris is multi-platform. Solaris x86 is not a poor relation to Solaris on SPARC; they are feature-for-feature equals. Xeon and AMD boxes have grown from single-socket, single-thread, single-core babies with pretty simple firmware / BIOS ( aka Basic Input/Output System ). Today's x86 platforms are far from simple and have grown into pretty powerful beasts: take the x4600 with 8 sockets of Quad Core Opteron, or the x4150 with its Quad Core Xeons. The BIOS to manage a 32 core x4600 is understandably a more complex beast than that of your IBM PC of yesteryear, and so having the right BIOS and the right BIOS settings for your platform is critical to getting the best from your x86 box.

What does this mean to a Solaris Admin on Sun Xeon & Opteron platforms ?

It simply means that as part of your Solaris Patch Policy you should always include updating your BIOS as well as the Solaris Patches. There are a number of pretty compelling reasons why.

  1. BIOS releases contain the latest microcode patches from Intel and AMD. Microcode is in effect a set of instructions loaded into a CPU to work around hardware bugs.

  2. Sun updates the configuration of a BIOS to optimize it for the system to provide optimum performance. I recently tested two Quad Core Xeon boxes from two vendors, and while they had the same CPUs and memory there was a 40% difference in performance with the SPECjbb2005 benchmark, due to one having sub-optimal settings.

  3. QA teams across Solaris and the Systems group testing upcoming releases of Solaris Updates, Nevada and OpenSolaris use the latest released BIOS for testing. Aligning with this aligns your own software stack with the most tested and trusted Sun stack. BIOS problems can be very hard to diagnose, so limiting your exposure to them is a good idea ( read as lazy but smart ).

  4. It is easier to stay current. Upgrading from minor release to minor release is really safe and painless, while going from a very old release may require you to do a number of intermediate upgrades, and of course this will happen when you least need additional work. And remember, with all Sun servers you can upgrade from the SP.

  5. Your new Sun box may not come with the latest BIOS installed, an issue we are addressing ( please bear with us ) so even new systems can benefit from checking to ensure you are current.

How do I find out what my firmware release is on my X4150 ?

On your system's Service Processor

ssh oaf413-sc

root@oaf413-sc's password:

Sun Microsystems Embedded Lights Out Manager

Copyright 2006 Sun Microsystems, Inc. All rights reserved.

Firmware Version: 4.0.10

SMASH Version: v1.1

Hostname: SUNSP001B2493C5CC

IP address:

MAC address: 00:1B:24:93:C5:CC

-> show SP










Firmwareversion = 4.0.10

Timeout = 300

CPLDVersion = 063

Target Commands:





Or, for the old guard, you can watch the system boot ;)

Details on how to log into your SC are included on docs.sun.com and the documentation supplied with each system.
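If you have more than a handful of boxes you will want to script the check. Here is a minimal sketch, assuming you have captured the "show SP" output to a file or a pipe; the function names fw_version and version_lt are my own, and the comparison only understands plain three-part numeric versions like the 4.0.10 shown above. The version you compare against has to come from the download page; none is assumed here.

```shell
# Hypothetical helpers for checking captured "show SP" output.

fw_version() {
    # Reads "show SP" output on stdin and prints the value of the
    # Firmwareversion property, e.g. 4.0.10
    awk -F' *= *' '$1 ~ /Firmwareversion/ { print $2; exit }'
}

version_lt() {
    # Exit status 0 when dotted numeric version $1 is older than $2.
    # Assumes three numeric fields (major.minor.patch).
    [ "$1" = "$2" ] && return 1
    first=$(printf '%s\n%s\n' "$1" "$2" |
        sort -t. -k1,1n -k2,2n -k3,3n | head -n 1)
    [ "$first" = "$1" ]
}

# Example with the line from the transcript above:
printf 'Firmwareversion = 4.0.10\n' | fw_version   # prints 4.0.10
```

version_lt "$(fw_version < sp.txt)" "$latest" && echo "upgrade due" is then all the policy you need, where sp.txt and latest are placeholders you fill in yourself.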

How do I find out where the latest versions of the Sun System BIOS are ?

I find the fastest way is to use Sun System Handbook ( All seems familiar and common sense ? Good )

Select Servers in the first drop down box.

Then select “x4150” in the 2nd.

And this pretty page jumps up



And follow the instructions to download.

Mr AirBus, Boeing Help Stop the Madness.......

Why ! Why ! do Airlines use Linux to run their in flight programs !!!!!!!!

I spent last Sunday watching Linux boot on the in-flight TV on Aer Lingus's latest A322 for 9 hours solid ! Turning computers on and off is not fixing stuff, folks !

Solaris can boot in far less time than Linux, and can reset to a ZFS snapshot config on boot to ensure it goes back to a known working state.... hey, you can use BrandZ and even the Linux apps can boot as Zones in like 5 secs. SMF/FMA can make it fix the simple things itself.

I've seen the same on a Virgin Boeing, so I cry stop the madness ! Use an OS that rules out this crud ( sorry ). I'm also confused as to why a T1/T2 based system is not a better option, as they use a fraction of the power and footprint.

STOP the madness !!!!!!!!!!!!!!!!!!!

Give me my movies !!!!!!!

PS My book choice for the flight didn't prove super either.

OK a bit over the top but I needed to vent.

Best Practise In Firmware Patching for the T1000/T2000/T5X20


Many Sun SPARC Solaris system administrators will rave about how simple and elegant the OBP was on SPARC platforms going back to the SPARCstation 1. That simple OK prompt allowed you to rapidly boot net or boot disk1 to boot a different OS image from the default. Apart from showing my age and proving I was easy to please, the OBP is simply really effective. This firmware did little after the OS booted, and in general most users never really needed to upgrade their OBP.

Today that firmware stack on the T1/T2 based system families has grown. It now includes the features of the OBP and also has features only found in OSes in the past. The Hypervisor with Logical Domains software which makes up today's firmware stack, while very elegant and small for what it does, is far larger and also very active while the system is running your applications. Ldoms allows you to run many OS images on one system with different patch revs, like the E10k domains of old, and soon will let you migrate Ldoms from one system to another etc. ;) While most of this complexity is hidden from you and using Ldoms is simple for the average Solaris admin, it is not quite business as usual.

What does this mean to a Solaris Admin on T1&T2 platforms ?

It simply means that as part of your Solaris Patch Policy you should always include updating your Firmware and Ldoms stack as well as the Solaris Patches. There are a number of pretty compelling reasons why.

  1. The features in the Solaris part of Ldoms are closely linked to the corresponding release of Hypervisor with Ldoms. So if you want to try that new feature like live migration you need to have the right release in each case.

  2. As described above, the FW stack has grown, and along with new features you get bug fixes and performance improvements. If you believe in the need to patch Solaris then you need to upgrade your FW. However, if you never patch your Solaris installation ( as many customers do not ) and never see an issue, then don't patch your FW.

  3. Sun QA teams across Solaris and the Systems group test upcoming releases of Solaris Updates, Nevada and OpenSolaris and always use the latest released firmware for testing. Aligning with this aligns your own software stack with the most tested and trusted Sun stack.

  4. It is easier to stay current. Upgrading from minor release to minor release is really safe and painless, while going from a very old release may require you to do a number of intermediate upgrades, and of course this will happen when you least need additional work.

  5. Your new Sun box may not come with the latest firmware installed, an issue we are addressing ( please bear with us ) so even new systems can benefit from checking to ensure you are current.

How do I find out what my firmware release is on my T5220 ?

On your system's Service Processor

sc> showhost

Sun System Firmware 7.1.3.e 2008/07/29 13:40

Host flash versions:

Hypervisor 1.6.4.b 2008/07/11 08:04

OBP 4.28.10 2008/07/12 12:37

POST 4.28.10 2008/07/12 13:02

Details on how to log into your SC are included on docs.sun.com and the documentation supplied with each system.

Sun SPARC Enterprise T5120 and T5220 Servers Installation Guide
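The showhost output is regular enough to pick apart with awk if you want to collect versions from a rack of systems. A minimal sketch, assuming the output has been captured as text in the layout shown above; the function names sysfw_version and component_version are made up for the example.

```shell
# Hypothetical helpers for captured "showhost" output.

sysfw_version() {
    # Prints the System Firmware version, e.g. 7.1.3.e
    awk '/^Sun System Firmware/ { print $4; exit }'
}

component_version() {
    # $1 = Hypervisor, OBP or POST; prints that component's version.
    awk -v c="$1" '$1 == c { print $2; exit }'
}

# Example using two lines from the transcript above:
printf 'Sun System Firmware 7.1.3.e 2008/07/29 13:40\nOBP 4.28.10 2008/07/12 12:37\n' |
    sysfw_version        # prints 7.1.3.e
```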


How Do I update the firmware ?

Updating the firmware is explained in this document.


How do I find out where the latest versions of Sun System Firmware are ?

I find the fastest way is to use Sun System Handbook ( All seems familiar and common sense ? Good )

Select Servers in the first drop down box.

Then select “T5220” in the 2nd.

And this pretty page jumps up


Including this section

Flash PROM Patch

127580 - 7.0.x System Firmware
136932 - 7.1.x System Firmware
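Those two lines are really a lookup table from release train to patch ID. As a small sketch ( the function name patch_for_release is mine, and the table only covers the two T5120/T5220 trains listed here ), you can turn a version reported by showhost into the patch to look for:

```shell
# Hypothetical lookup, covering only the two trains listed for the T5220.
patch_for_release() {
    case $1 in
        7.0.*) echo 127580 ;;   # 7.0.x System Firmware
        7.1.*) echo 136932 ;;   # 7.1.x System Firmware
        *)     echo "unknown release train: $1" >&2
               return 1 ;;
    esac
}

patch_for_release 7.1.3.e    # prints 136932
```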

Select firmware from the box on the left and you get the example below.

Note: before you shout “my god, that is a lot of bug fixes in a firmware patch”, this is the full list of all fixes for this release on all platforms, not just the T5120. So you see other platforms like the T2000 listed, and many fixes related to platforms still under development. While the release train is shared across platforms, there are platform-specific sections.



Hardware/PROM: SPARC Enterprise T5120 & T5220 Sun System Firmware with LDOMS support

Copyright Notice:

Copyright © 2008 Sun Microsystems, Inc. All Rights Reserved

Update Date:

Wed Aug 13 06:54:45 MDT 2008

See Patch Revision History

Patch Id:136932-03
Keywords:hardware/prom: sparc enterprise t5120 & t5220 7.1.3.e flashprom security
Summary:Hardware/PROM: SPARC Enterprise T5120 & T5220 Sun System Firmware with LDOMS support
Date: Aug/13/2008
Installation Requirements:
Additional instructions may be listed below
Solaris Release:10
Sun OS Release:5.10
Unbundled Product:Sun System Firmware
Unbundled Release:7.1.3.e
SPARC Enterprise T5120 & T5220 Sun System Firmware 7.1.3.e flashprom update
Relevant Architecture:sparc
BugId's fixed with this patch:

Changes incorporated in this version:

6673405 6676561 6678254 6685358 6686422 6698816 6704491 6720401 6720737 6721020 6722602

Patches accumulated and obsoleted by this patch:
Patches which conflict with this patch:
Required Patches:
Obsoleted by:


Files Included in this Patch:
Problem Description:
6722602 mem leak in plat\*d get_faults code
6721020 OpenBoot should handle "shutdown" request from Hypervisor
6720737 small leak in plat_hwsvc_free_fault_data
6720401 POST should not fail the GBE device unless it is absolutely certain that the GBE is at fault
6704491 OBP doesn't display all disabled devices
6698816 multiple (all?) objects in the SUN-PLATFORM-MIB return "noSuchInstance" during SNMP GET
6686422 ILOM hangs and does not allow user logins via network or RS232 console port
6685358 need some caching in snmp
6678254 ipmitool sunoem cli through lanplus prints out garbages
6676561 changing NVRAM parameter at OS level causes panic : BAD TRAP: type=31
6673405 adding memory to system causes the saved LDoms configs on the SP to be invalidated

6719274 Thrasher does not work on VF platforms
6717373 cpu-table (in arch/sun4v/cpustruct.c) should be cleaned before use
6713501 improve PCIE test and scan from post menus.
6712683 POST mistake the slot of the faulted DIMM
6712395 POST times out with 8GB FBDIMMs
6712366 monitor preferred resolution does not work on XVR50
6712358 POST print function confusion
6711750 cpuset pointer is invalid after sc warm start on N2/VF systems
6710682 Sunvts hsclbtest test fails on Maramba platform
6710322 Turgo POST fails when T1 PCI-X card is plugged in
6710098 panic on CPU thread; BAD TRAP type=33 and type=34 reported;
        Crashme test provokes this immediately
6708608 master CPU timeout occurring on T5140/T5240
6708080 SNMP gets don't report fan module is a FRU
6708044 IPMI driver regression: broken on s10u3
6706405 expand 6679155 to apply to all platforms with PIU, not just Batoka
6705118 Intermittent XAUI Lane Synchronization Failure at Flex
6705064 POST, block memory test fails if thread 0 is asr disabled
6702424 larger vbsc md/rom allocations can cause problems on 64Mb machines (huron/glendale)
6702022 WARNING: invalid vector intr: number 0x413, pil 0x0 logged on Batoka during bootup after reboot
6699979 change SP_PROTO default in platform makefiles
6698692 PIU Secondary error handling for UE RECV needs fixes
6698362 Hypervisor abort upon receiving DAE_nc_page trap
6697899 Batoka/Maramba: Couldn't get to Unix prompt after boot disk. Typing in console didn't work.
6697814 use larger page sizes for real to physical mappings
6696383 PRI template for Batoka missing pci-mem64-support property; VF templates need factoring
6696045 trap Stack Array ECC errors handled incorrectly
6694831 increase size of ldc queues in sram for performance
6694497 POST incorrectly reports timeouts
6693906 probe-scsi-all on USB1.1 storage generates Fast Data Access MMU Miss
6693734 platforms that support MEM64 and PCI hotplug using invalid memory sizes
6692928 sunoem cli can drop connection just with single receive failure
6691804 VBSC missing POST reconfig flag
6691664 bad detector in the ereport.io.fire.pec.re
6690978 incorrect diagnosis of a memory UE
6690844 confusing POST PCI id test error
6690807 dmake needs to compile parallel targets correctly to allow parallel WOS assembly
6690208 no ereport generated for an FBU error
6689590 Webserver runs out of memory during security scan
6688749 cache target CPU pointer to improve VDEV interrupt efficiency
6688615 LDC packet transfer to/from SRAM could be a little more efficient
6687340 when hostconsole log gets full, entire log rotates, making info unavailable to customers
6687097 virtual-console device always returns NULL very first character available after it gets opened
6686442 Congo ILOM should support ipmitool sunoem CLI feature
6686172 TPM code incorrectly included in N2/VF POST makefiles
6686133 HV support for SP initiated reset
6685903 ldoms config boot after resetsc fails to save PRI; resetsc again emits
        PRI Error + ldmd fatal error
6684913 optimise enhanced POST Systest tests for N2 products
6684679 ipmitool sunoem cli command when sourced from file displays many "\^D"
6683801 N2 TSB miss/MMU error trap handlers could be a little more efficient
6683328 failure of some IO devices disabled incorrectly
6683322 with non-factory-default spconfig: ldmd fatal error: No available physical memory
6683273 micro optimisation for hypervisor spinlocks
6682970 clock drift on N2 platforms (Glendale, Monza & Turgo) due to spread spectrum
6682834 transition ds port state to "init" before bringing up corresponding ldc
6682758 VF Fatal NCX timeout when primary domain is reset - ereport.io.vf.ncx.to-fsr
6682424 ipmitool sunoem cli needs to increase timeout when spsh exits
6682118 RKVMS-AST2000: Deleting first session before second, login menu appears hangs javaRconsole
6681791 remove obsolete stack-related directives in n2 offsets.in
6681676 Hypervisor command cross-calls should be cleaned up
6680637 parallel make of VBSC has race in opcode-derived file creation
6680498 bincheck isn't quite binary safe
6679411 DBU ereports are missing L2 info
6679408 L2ND SER is discarded
6679155 VF team says to set pio_control_1 in Mem 64 PCIE Offset Register to avoid potential deadlock
6679121 possible race condition in ipmi threshold adjusting code
6679095 completion timeout bit not set in PIU_PLC_TLU_CTB_TLR_UE_INT_EN for sun4v platforms
6678016 enable features for SPARC powermgmt
6677901 ILOM timestamp creation needs to work reliably during parallel platform makes
6677840 enhanced POST memory tests on VF platforms should test local memory only
6677699 start %pc may not be what it should be when cpu_start() called immediately after cpu_stop()
6677467 Lynx/Virgo:spsh exit causes seg fault
6676959 Service tag is not properly programmed
6676924 vbsc frees memlayout buffer without checking if it is allocated
6676807 FERG needs to filter FBRs
6676309 clock drift in Huron due to spread spectrum
6676224 ipmitool sunoem led get:Sun OEM Get LED command failed: Parameter out of range
6676130 g12n - stale /etc/snmp/snmpd.static.conf causes no SNMP Traps to be sent
6675793 POST hangs when using diag_verbosity=debug
6675710 Gemini+_Pegasus+: Cannot ssh or console into SP using local/ldap user accounts
6675315 g12n - -  no snmp trap when a sensor event occurred
6674589 can't boot if HOST is reset two times approx 10 secs apart
6674338 cpuid and serial number should match in SERs
6674205 stick frequency inaccuracy due to spread spectrum
6672954 HV should offer build target to build without timestamp
6672380 vestiges of htraptracing not eliminated when vcpu is stopped
6672320 update user-visible copyrights to 2008
6672185 fopen, fprintf, fclose are not check for null
6671945 alom and ilom command lists incorrect DIMM size for 1GB Micron DIMMs
6671677 factory-default MD missing md-generation# property
6671538 vbsc should power off host immediately for POK Faults
6671365 Hypervisor makefiles could be rearranged to reduce duplication
6671343 Hypervisor sources could use cstyle and other misc cleanup
6671110 Compiler optimization ensures strands never sync and CPUs fail to start
6671070 POST diagnoses to wrong DIMM if faulty DIMM is in either D2 or D3 slots
6670841 HV must ensure C2C bit set for all remote cache errors
6670340 optimise enhanced POST memory tests for N2 products
6670188 memory leak in capi_get_fru_info
6670137 snmpd coredumps
6669841 get rid of unecessary build flags and compile flags
6669488 POST times out with verbosity min
6669456 GET_DRAM_ERROR should be the same for N2 and VF
6669412 G4F/Thor : WebGUI stops working after changing Network settings in WebGUI
6669255 domains fail to boot from right devalias name
6669222 lock in mmu_miss can be eliminated to reduce contention
6669206 picl displays error "PICL snmpplugin: sunPlatSensorClass 0 unsupported" when fault
        detected on iobox
6669121 remove unneeded IPMI mods files
6668651 HV RA to PA guest mappings should not use padding in structure
6668015 strand_in_error needs rework
6667926 no status update to user with non-debug vbsc
6667511 capsimload binary not installed on core 2.0 target image
6666961 max level POST on Batoka with memory CE's gets unexpected trap to address 0
6666913 add support for lint tool to source base for Linux builds
6666878 determine size of max_indicator_mech_map in libcapidirect
6666797 support reading md for ldc queue size and offsets for n1
6666425 bootload() function could be written in C instead of assembler
6666361 ILOM needs sis state memory
6666265 support event/reading type set to 0x6f for galaxyplatforms
6666154 hang in OBP when upgrading Huron from 7.0 to 7.1 FW
6666122 IPMI LAN consumes all available resources if it fails to bind to TCP port
6666120 fix "sunoem led get" for Phase 1 on non-Andromeda platforms
6665650 DRAM MA and CWQ load UEs may result in panic while trying to retire a page
6665257 Huron leaks 980 bytes on each power cycle
6665185 pci-mem64-support property (in openboot node) missing from the pri
6664193 speedy mondo dispatch rework leaves niu interrupts in the dust
6663348 inj command only works with root password being the default
6662162 watch-net-all not testing interfaces
6662147 coredump from lumain, hanging spsh
6661716 N2/VF should insert "variables" node only under "root" node and never "openboot" node
6661663 PID2VCPUP doesn't check for vcpu->strandp being NULL before dereferencing
6661197 Hypervisor should provide more useful information when aborting after error
6660672 SensorMonitor doesn't treat 2 byte DiscreteReadingMask properly
6660399 after sysfwdownload, flashupdate -s should remove /coredump/ILOM_flash.pkg when done
6660206 Huron POST spelling error: "Lopback" missing "o". eg:"ERROR: TEST = BCM8704 Internal Lopback"
6660171 port capisimload from trunk to core 2.0
6659611 mask wrong in relocate of vector interrupt table
6659301 need to support i2c-translation for 8275-based platforms
6659298 pca9553_blink routine logic for determining ACT (vs LOCATE) LED needs to scale
6659139 JRC: System cannot find CD image when installing any RHEL using JRC redirected image
6658998 X4100m2 ilom over ssh or javaRconsole stops reporting events; ipmitool reports current events
6658551 IPMI Redesign: sensorlist records should not need to be strictly ordered
6658510 snmp engineid incorrect default reduces v3 security
6658389 IPMI LAN process may crash sporadically because of ignored error code
6658373 motherboard shouldn't be faulted when ambient temperature threshold exceeded
6657972 HV not generate ereports for IO error injection after error injected to pciexrc1
6657747 use java workarounds for Turkish and Estonian keyboards
6657711 single scratchpad UE in kernel mode generates panic and HV abort
6657637 Scratchpad register correctable error injection generates panic: bad kernel MMU trap
6657584 enhance ILOM GUI Diagnostic tab on Andro & C10 to support manual operation of PCCheck
6656689 ldc subscribers cannot determine if SP has been reset
6656245 max_ra_bits property in pri should be 40
6656072 post CPU test failures, hardware under test is wrong field may state/fail wrong core
6655361 leave output of svn stats in webrev directory to be used for putback list
6655312 fan control algorithm issues
6655263 ILOM fmd 2.0b2 repeatedly core dumps on new platform
6655060 interrupts left pending for stopped CPUs result in non-idle idle strands
6654887 SNMP trap destination port should be configurable
6654860 POST code setup for xaui copper
6654395 ldmd fatal error: No available physical memory
6654272 Alt Graph press on JavaRConsole with Windows OS provides unusable keyboard
6654153 Frutool on Andromeda CMM can report invalid data to capi if i2c requests fail
6652635 NSTRANDS_PER_CHIP_MASK pre-processor define is incorrect
6652445 large amounts of data redirected to console eventually stops being printed
6652046 XAUI Conflicting error message
6652004 javarconsole: deleting first tab makes second tab appear disconnected
6651513 version negotiation in OBP's vnet/vdisk doesn't adhere to the protocol
6651325 fix keymaps for Tier 1 (Italy, French, Brazil, Spanish, German)
6650585 showplatform and /SYS/ACT do not always display correct host state
6650392 boot-time and reset-time can be improved with further memory scrubbing changes
6650241 vBSC runtime can be improved by %40 just by turning compiler optimization on
6649860 conflict between fatal handling and power management
6647462 problem with T2000 SysFW 6.5.5, won't see DVD rom device after upgrade
6642807 mis-merge in main.s causes SET_VCPU_STRUCT() and SET_STRAND_STRUCT() to happen twice
6642749 when connected to SP via serial port, sending NULL character results in <break> being
        sent to domain
6641936 'consolehistory' in alom-cli not displaying correct history of information
6641377 alom cli usage for logout is not correct
6639312 only FRUs should have fault_state property (except for /SYS)
6639099 cannot view entire history log using ilom cli due to line_count and pause_count restrictions
6639002 incorrect FRU callout in diagnosis of PCI card(s) for FMA IO faults
6637424 offsets.h complains about missing definition
6635541 SP needs to implement correct System Policy when SCC NVRAM Invalid or Not Present
6634980 possible Data corruption when using NVRAMRC kill or yank
6634974 incorrect Error reported on nvramrc CRC Error
6632561 incorrect GNUmakefile can hang the build
6630598 call ldc_copy with negative flag returns ENOACCESS
6626599 use SIS state and descriptive string in guest-soft-state delivered by host to sc
6625125 RESET_FEE needs to be set to non-zero value
6624658 when stop /SYS command is not used, HOST_LAST_POWER_STATE not effective
6623619 PRI IO slot canonical paths are missing trailing "@0" component
6618866 customers need to be able to automate FW updates
6617464 QJ070821-014 prtdiag not displayed FAN exchanged
6616381 offsetchk prints incorrect array size
6613209 value of /HOST/diag/trigger changed when motherboard replaced [QJ070903-042]
6613103 sp fatal error generated after ac powercycle with chassis cover removed and
6612947 system hangs after 60 ms power glitch
6610641 Solaris reboot causes HDD OK2RM to stay on
6607780 RFEN: ActDir to support 'group movement' within tree even if MS ActDir warns against it
6607719 unable to delete last session in javarconsole if session to left of it is deleted first
6604574 Erratum workaround: MCU Refresh Frequency and Sync Frame Frequency cannot be integer multiples
6602475 javarconsole: choose 'No' to stop cd image redirection, checkbox is blank
6601935 plathwsvcd/showfaults: fault message in /var/tmp/faults.tmp doesn't show in showfaults command
6600945 power-off doesn't power off virtual machine OBP is running within
6598988 need error handling support for UEs in foreign L2 cache reads
6593801 when all platform identity checks fail, system poweron should be disabled
6588443 system should respond to invalid mem config (mixed arch) sooner (less than current 5 minutes)
6587238 VBSC sometimes does not reboot system after SIR
6577471 VBSC should not power-cycle when reset requested via CLI
6574695 move Maramba IIT code to Common area
6566151 Ontario s10u4 eft.undiagnosable_problem for jbc.cpe jbc.mb_pea jbc.jtceew
6564189 power state should be checked before calling sensor access functions
6560249 ilom should read vbsc version via provided API
6558974 remove check for unbind property in guest parse routine
6556636 VBSC should build separate versions of object files for debug and non-debug
6555126 increase size of ldc queues in sram for performance
6549183 Guest Watchdog should be made LDoms aware
6545380 ilom DMTF cli: Unable to force RW console if another user has console
6541482 processor always starts on lowest available strand, even if it is asr disabled - For Huron
6538289 Galaxy-snmp dumps core, timeouts during compliance test on IlomCtrlMIB
6537680 ERROR: too much data in /var/opt/vbsc/asrdb - no backing store for ASR DB
6532136 ilom scc command improperly initializing SCC card
6529467 POST should handle preceeding spaces gracefully
6519557 Hypervisor should read md for ldc queue size and offsets
6488334 VBSC sets up memory incorrectly in mixed DIMM config on N2
6487885 javarconsole doesn't prompt user that it's been disconnected in timely manner
6485109 isalist output need to be corrected on Niagara2

6683322 with non-factory-default spconfig: ldmd fatal error: No available physical memory
6682834 transition ds port state to "init" before bringing up corresponding ldc
6676959 Service tag is not properly programmed
6676309 clock drift in Huron due to spread spectrum
6666154 hang in OBP when upgrading Huron from 7.0 to 7.1 FW
6664193 speedy mondo dispatch rework leaves niu interrupts in the dust
6661663 PID2VCPUP doesn't check for vcpu->strandp being NULL before dereferencing
6657972 HV not generate ereports for IO error injection after error injected to pciexrc1
6656689 ldc subscribers cannot determine if the SP has been reset
6655060 interrupts left pending for stopped CPUs result in non-idle idle strands
6654784 broke Huron build trying to compile it with multi-node support
6654395 ldmd fatal error: No available physical memory
6651301 add IOBOX connection property support to CAPI and envmon
6650585 showplatform and /SYS/ACT do not always display correct host state
6650227 bootmode reset_nvram fails to reset OBP to defaults
6649268 strand error struct - unaligned memory accesses
6648867 unaligned write in strand error handling
6648468 when there are memory holes, Hypervisor passes incorrect memory size argument to guest(s)
6647778 Error: Invalid property value when setting reset_to_defaults to None
6647337 restarting domains with multiple domains in existence causes incorrect uptime in Zeus
6647026 available commands listed for /X/powermgmt isn't correct in CLI
6644994 incorrect data received as response of pcp interface on Huron
6644049 need to ditch polling in favor of interrupts for idling
6643632 refactoring and restructuring changes needed for platform scalability
6642932 CWQ and MAU qconf set head_marker field to wrong value, causes interrupts immediately
6642809 /SPD/Timestamp printed incorrectly
6642804 SET_SIZE(start_secondary_master) is missing in NCHIPS=1 case
6642800 SET_SIZE(start_master) is missing
6642705 /etc/init.d links created on SPARC but not applicable
6642702 /etc/init.d scripts for scc, identify, bbrd, hwclock should not be replicated in mods area
6641981 CapiSDR needs to set bits 7:6 in Sensor Units 1 when Discrete sensors use Full SDRs
6641113 ActiveDirectory Web GUI not showing '<USERNAME>' portion of userdomain in table format
6641093 port CAPI Sensor simulator from ILOM 3.0 to 2.0
6640893 add better stack debugging to HV
6640613 ipmitool incorrectly displays Discrete sensors, especially Discrete sensors that use Full SDRs
6640494 no HV debug output on port 2001
6639912 uadmin 2 0 and reboot reads old bootmode settings (should be cleared)
6639225 backout 6549183, causes 6638756
6639013 N2 post interface update
6639010 change HV 1.6.x file names to be more apt
6638901 RED_STATE if MMU error in hyper-privileged mode
6637992 vecintr() mondo dispatch could be reordered for performance
6637682 move debugflags to mini-MD
6637134 HV abort handling broken at TL = 0
6636325 HV trap trace doesn't work after the guest resets
6635618 enable ILOM target /SP/powermgmt on T5120 and T5220
6635181 1.6 vbsc rejects LDoms 1.0.1 MD bootsets
6635149 mmu_tsb_ctxnon0() returns EINVAL with shared context setting, context_index = -1
6634997 some DAUs should be reported to guest as non-resumable
6634497 Servicetag information incorrectly populated
6632210 Hypervisor could use print macro that handles g7 clobbering
6631875 call ldc_mapin with misaligned cookie does not return EBADALIGN
6631187 Power Management query via web while chassis is powered off hangs
6630613 call ldc_copy with mismatched page size returns EOK
6630112 some DEVINST2INDEX or DEVINST2COOKIE failures will be nasty
6627667 change OBP environment variable pci-mem64? default value from false to true
6626599 use SIS state and descriptive string in guest-soft-state delivered by host to sc
6626045 update RNG OBP virtual-device node for Huron/N2 platforms
6626041 machine description for guest needs to support rng-#units propval in RNG virtual-device node
6624684 VCPU start_stick is never initialised
6624408 queued sp messages relayed to Solaris syslog should indicate so
6623916 software guest state updates generate too much traffic
6623142 Hypervisor needs to inform vbsc of soft state
6623132 DMU Data Parity Error should not be set as Fatal
6623032 VBSC builds fail with standard Sun make
6621998 configuring ldc channels should not corrupt other sysinos
6621584 Hypervisor stack size should be reduced
6621350 teja_profiler_dump() can dump bogus info to the console
6619991 update Phase 1 to allow a "Chassis" Entity
6619932 hv_ldc_copy should return EINVAL for SP endpoints
6619164 vBSC is running POST for user resets even if diag_trigger does not have user_reset set
6618239 dmake does not compile multiple N2 targets correctly
6615776 number of guest LDC channels need to be increased on per platform basis
6614911 configuring NCS_QTYPE_CWQ queue does not set tail offset to zero
6614166 need to pass correct Hypervisor md header to md_check_content_version
6613677 replace qas with fbe in Makefile
6611174 mau/cwq should target interrupt based on submitter
6607996 creating IPMI records from JEDEC DIMM SPDs, need to synthesize serial number properly
6607368 ENABLE_PCIE_OE_ERR_INTERRUPTS macro needs a tweak
6605824 additional error after error storm is handled results in incorrect ereport
6604967 number of guest LDC channels need to be increased on per platform basis
6604416 update versioning for vbsc1.6.x-gate
6603960 hwtw entry management could use a little optimization
6603949 sample guest dis_bpcc() function has typo and is C-style filthy
6603535 firmware needs to check FBDIMM voltage range
6602244 need parallel memscrub timeout
6602227 Hypervisor stack needs overflow and underflow safeguards
6601551 use mailbox for systest arguments
6596924 vbsc should check actual return value of stat()
6595231 rng_ctl_write() doesn't process and store wait counts passed in with control register values
6594395 "Options: true false" menu interrupts OBP reset
6594044 removal of "next" spconfig results in incorrect selection
6592934 occasional LDOM warning message after POST
6592314 ldc_read (vbsc) does not set qhead after processing control pkts
6591377 naming for IPMI FRUs should be consistent with previous platforms
6590036 malformed MSI errors are encoded wrong
6588393 N2 sends invalid interrupt to all strands on startup
6587522 missing information in "ldm list-config"
6587418 Huron vbsc should check MD content-version like N1 does to ensure firmware compatibility
6587179 rng_data_read_diag() returns wrong errno when buffer size not multiple of 8
6586938 add "show components" to cli dmtf argument alias
6584080 Post tests asr disabled XAUI cards
6583280 return status of MD property removal routine is useless
6581868 HV doesn't reset RNG to error state after watchdog timeout expiry
6581227 deserializing fails when number of entries greater than 12
6581061 number of guest LDC channels need to increase on per-platform basis
6576348 console hangs if lowest available strand of control domain is DR'ed out
6575857 DS should check return value from Xmalloc
6572421 regenerate ILOM servicetags entry on SP boot
6567748 LOM should show correct Guest Status sent from OBP and Solaris
6565130 partdef tool should flag error when nodes have same name
6563453 SNMP: udp port snmp unreachable on CMMs
6558876 combining -level all with property with show command doesn't work correctly
6558875 support parsing of mini-md from vbsc for HV-specific info to leave out of HV md
6558874 support delivery of mini-md from vbsc to HV at poweron with HV-specific info
6553956 Hypervisor spill/fill code uses %asi unnecessarily
6547428 warning message has incorrect spelling. WARNING: No response from Domain Service "Providor"
6530696 vBSC should support parallel make
6512417 vbsc needs to compile under modern operating system
6489630 ldc_unmap() with misaligned raddr expects EBADALIGN but receives EOK
136932-02 136932-01

Patch Installation Instructions:
Please refer to the Install.info file for instructions on updating the firmware
in the flashprom using the files included in this patch.  In particular, there
is information on the differences involved with the ILOM-based Sun System
Firmware (7.x) in connection with the use of the Solaris Sun Update Connection
Special Install Instructions:

NOTE 1:  Firmware component revisions included with this release:

         Sun System Firmware 7.1.3.e 2008/07/29 13:45
         ILOM Jul 29 2008 13:26:36
         VBSC 1.6.4.d  Jul 29 2008  13:08:55
         Hypervisor 1.6.4.b 2008/07/11 08:04
         OBP 4.28.10 2008/07/12 12:37
         POST 4.28.10 2008/07/12 13:02

         Checksum of Sun_System_Firmware-7_1_3_e-SPARC_Enterprise_T5120+T5220.pkg : 4292411777
         (generated by the /usr/bin/cksum command)
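
If /usr/bin/cksum is not to hand on the machine holding the download, the same POSIX checksum can be reproduced in a few lines of Python. This is only a sketch of the POSIX cksum algorithm (CRC-32 with polynomial 0x04C11DB7, with the byte length folded in at the end); the function names posix_cksum and cksum_file are made up for illustration and are not part of the patch or of any Sun tool.

```python
# Sketch of the POSIX cksum algorithm. posix_cksum and cksum_file are
# hypothetical helper names, not part of the firmware patch.

POLY = 0x04C11DB7  # CRC polynomial specified by POSIX for cksum


def _build_table():
    """Build the 256-entry lookup table for an MSB-first CRC-32."""
    table = []
    for i in range(256):
        c = i << 24
        for _ in range(8):
            if c & 0x80000000:
                c = ((c << 1) ^ POLY) & 0xFFFFFFFF
            else:
                c = (c << 1) & 0xFFFFFFFF
        table.append(c)
    return table


_TABLE = _build_table()


def posix_cksum(chunks):
    """Return (checksum, byte_count) for an iterable of bytes objects."""
    crc = 0
    length = 0
    for chunk in chunks:
        for byte in chunk:
            crc = ((crc << 8) & 0xFFFFFFFF) ^ _TABLE[(crc >> 24) ^ byte]
        length += len(chunk)
    # POSIX folds the length into the CRC, least significant byte first.
    n = length
    while n:
        crc = ((crc << 8) & 0xFFFFFFFF) ^ _TABLE[(crc >> 24) ^ (n & 0xFF)]
        n >>= 8
    return crc ^ 0xFFFFFFFF, length


def cksum_file(path, blocksize=1 << 16):
    """Checksum a file in blocks so a large .pkg need not fit in memory."""
    with open(path, "rb") as f:
        return posix_cksum(iter(lambda: f.read(blocksize), b""))


# Usage, assuming the package sits in the current directory:
#   crc, size = cksum_file(
#       "Sun_System_Firmware-7_1_3_e-SPARC_Enterprise_T5120+T5220.pkg")
#   print("OK" if crc == 4292411777 else "MISMATCH", crc, size)
```

The first value returned should equal the checksum listed above if the download is intact; the second is the file size in bytes, which cksum prints alongside it.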

NOTE 2:  By using Sun System Firmware (Firmware) you agree to the terms of the
         Software License Agreement and Entitlement (SLA/Entitlement) found in

         By using the Firmware, you agree to the terms of the SLA/Entitlement.
         If you do not agree to all of the terms, promptly destroy the unused

NOTE 3:  Please refer to the online documentation for feature and version
         compatibility between Sun System Firmware and LDom Manager releases.
         LDoms Release Notes are available on http://docs.sun.com under
         this title and part number:
         Logical Domains (LDoms) 1.0.3 Release Notes 820-4895

NOTE 4:  If you are currently using LDoms 1.0 or 1.0.1 software, you must
         perform a full upgrade procedure to upgrade to LDoms 1.0.3 software.
         Refer to the Logical Domains (LDoms) 1.0.3 Administration Guide,
         820-4894, at http://docs.sun.com/app/docs/prod/ldoms#hic. You do
         not need to destroy configurations created with LDoms 1.0.2 software;
         you only need to upgrade the software.

NOTE 5:  Sun will update this posting in the future with a link to the
         GPL ILOM source code.  Until then, to request a copy of the GPL ILOM
         source code, please contact ilom-gpl-source-request@sun.com.

README -- Last modified date:  Wednesday, August 13, 2008

PowerTop OpenSolaris

As my last post mentioned, we're working to take power management to the next level, working closely with Intel and AMD and of course the SPARC folks internally. Here are some of the early features ( OK, we know this is catch up ;) but you have to catch up before you overtake ): http://www.youtube.com/OpenSolarisTesla

Great work by Rafeal ( superb camera work from Andrew too ): http://www.opensolaris.org/os/project/tesla/Work/Powertop/

PowerTOP is an observability tool that shows how effectively your system is taking advantage of the CPU's power management features. By running the tool on an otherwise idle system, you can see how much time the CPUs are spending running in lower power states. Ideally, an unutilized (idle) system will spend 100% of its time running at the lowest power CPU states, but because of background user and kernel activity (random software periodically waking to poll status), idle systems typically consume more power than they should.

PowerTOP shows you which software (user and kernel) is waking up, and how often. By fixing, filing bugs against, or just not running power inefficient software, you can help improve your system's power efficiency.
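
To make the two numbers PowerTOP reports a little more concrete, here is a rough Python sketch of the arithmetic: the percentage of time spent in each CPU power state, and wakeups per second per offender. This is not PowerTOP's code; the function names, the state names and the sample data are all invented for illustration.

```python
# Illustration only: the arithmetic behind "time in each power state" and
# "who is waking the CPU, how often". All names and figures are hypothetical.
from collections import Counter


def state_residency(seconds_by_state):
    """Turn seconds spent in each CPU power state into percentages."""
    total = sum(seconds_by_state.values())
    if total == 0:
        return {}
    return {state: 100.0 * t / total for state, t in seconds_by_state.items()}


def top_wakers(wakeup_events, interval_s, n=3):
    """Rank the sources of wakeups by wakeups per second over the interval."""
    counts = Counter(wakeup_events)
    return [(name, count / interval_s) for name, count in counts.most_common(n)]


# Made-up sample: a 2 second window on a mostly idle box.
print(state_residency({"C0": 0.5, "C1": 1.5}))
print(top_wakers(["clock", "poller", "clock", "clock"], interval_s=2.0))
```

On a truly idle system you would hope to see nearly all of the time in the deepest state and a very short list of wakers.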

Performance Power and Lifestyles

A number of years ago we evolved our performance QA model at Sun to
better support development and testing of high performance software and
hardware. We put big rules in place, such as "If Solaris is slower, it is a bug",
then developed a process to keep our competitive comparisons up to date.

We call this Sun's Performance Lifestyle. See

The buzz was around price performance, but now the meaning of price has
changed. Total cost of ownership includes the cost of powering and cooling
our systems, so Sun's Performance QA process has changed.

The first major effort by the industry performance community is SPECpower.
This is a start, but like many benchmarks it is open to abuse. In short, power
usage, like performance, depends on your application, your system load,
your configuration etc. Many of the more popular benchmarks
are adding a power metric, but this takes time.

To support the many teams in Sun that are working on performance and
power management features, we've extended power monitoring while
benchmarking to all the benchmarks. Every Solaris, BIOS
and SPARC firmware change will be measured for its effect on power
consumption. Is this the Green Lifestyle ? Power lifestyle ?
Utility Bill lifestyle ;) ?

Solaris making teaching Windows/Linux/Solaris easier in DIT

Solaris an Open platform for Colleges.

One of the engineers in my team recently attended his graduation and got talking to his ex lecturer ( Mark Deegan ). The topic turned to Sun and Solaris and the fate of their old SunRay lab. The lab was not heavily used and Mark asked if we could help bring it up to date.

A number of the Performance team visited and installed Solaris 10
and configured Samba, ZFS, Containers, Linux Containers, Java
Enterprise System etc., and Mark and his team added some Windows Terminal Servers, so now any lecturer can use the lab to teach on Windows, Linux and Solaris from the same lab.

All platforms can share the same ZFS storage on the new T1000s they bought, and it is backed up using the cool snapshot feature.
Mark was shocked to see what you get for your money !

I cannot say how annoyed I get when I read so many stories about Sun that start with Sun Microsystems the maker of "expensive proprietary systems". This is a myth. We're open, just check: you get a lot for your dollar, euro or pound, and college pricing is even better. I just love people's faces when you type psrinfo -v on a T1000: 32 threads in a 1U.

The lab has all the years old SunRay advantages: it is quieter ( no fans )
and uses less power ( the SunRays draw a fraction of the power of a PC ), and the T1000s draw a fraction of your Dell. Everyone seems to be fighting to claim they are the greenest but this is old hat for us.

DIT have also started hosting Irish OpenSolaris user groups http://www.opensolaris.org/os/project/ie-osug/meetings/14/

DIT have also signed up for Sun's FREE online educational training program, which covers Solaris, Java, Java Enterprise System and even soft skills like time management. Queen's
in Belfast are also members.

There is a news article which covers what DIT are doing here.


Venting :)

It is amazing how little things can make you really frustrated when you'll work thru the big issues without too much trouble. Right now I'm being driven round the twist by callers looking for someone in our internal support organization. This person has the same 5 digit number as me, but to reach them from some parts of the world you need to add 70. One morning I received 10 calls. One gent from Germany rang and launched into his problem. I told him politely that I thought he had the wrong number; he told me no, he was sure he had not, and continued ! I then told him he had and that he needed to put 70 in front. No "sorry for wasting your time".

So like many of the poor souls that suffer the same problem in Dublin, I added a new message to my voice mail explaining that I'm not in support and that if you are calling about a ticket you need to redial with 70 in front. At least when I get in each morning, I hoped, I would not have to go thru 5 or 6 often long, often rude messages. Fixed ! Of course not. I still get 1 or 2 each morning from &^*&^* ( this morning's best was from a Lady in Sweden ) angry I hadn't contacted her about her call ( after sitting thru my message telling her it's not me ).

It has also shown me that there is no country where Sun does business that is more polite or ruder than another. People ringing help desks are generally on a short fuse and act the same no matter where they are from, and it's not a good side of human nature ;) This is why I'd really like Sun to make more use of meeting.central and our REALLY COOL name finder phone book, because if you look up someone and ring them it knows when to add 70.

1) Globally people are rude when they ring help desks
2) People do not listen to messages on voice mail
3) People do not care if they give out to the wrong person as long as they find some poor sod.
4) I do not want the job of the guy who shares my number :)

Zone & HP

I visited a large customer last week with a couple of engineers. It is always useful for folks from engineering to meet real customers. It grounds you and lets you understand what the real issues are.

Maybe it's just me, but most problems within IT organizations are non technical, caused by the organization of a company rather than the technology required to solve a business problem.

Throw in outsourcing partners and life can get complex to say the least.

The customer described one non technical issue they had as they rolled out Solaris 10. They looked at zones/containers and said this rocks: each developer can have a "system" to develop on without impacting each other,
at no cost, but.......

HP delivers their system administration service and wanted to charge them for each zone as a separate system. The customer was clearly "unhappy" and they are working the issue with HP.

HP seem to view server virtualization as a revenue generation engine: less effort, more billing. I know many of the folks that developed zones and I've never heard it described as a way for HP Professional Services ( or anyone else ) to increase revenue ;)

Hopefully this was just one service sales rep thinking Christmas came early and that the Sales Junket, a tropical beach, palm trees swaying in the breeze, drink with an umbrella, was in the bag.

P.S. ZFS ships soon, so for the record the self healing, reduced administration, increased performance and instant snapshots are not an opportunity to increase charges.

dtrace & dprofile

I have not had a lot of time lately to update my blog !

We've just added compiler performance to our test matrix. Compilers have always been tested, but we're
integrating Studio and Solaris performance testing so each development team can better understand the effect of their work on the other ( and in turn on the customer ).

I have been talking to a lot of ISVs ( independent software vendors ) about bringing their code and workloads into our performance test matrix. They often comment on the pain that goes with upgrading compiler releases. Our goal is to reduce this real pain and provide some positive incentive in terms of increased performance.

One of the numerous killer Sun Studio 11 compiler features is called dProfile, which uses some cool features of the SPARC platform and dtrace. Have you seen those T1000 & T2000 systems yet ?

So do we shout about this from the rafters ? Never !

If you do a search for dprofile on www.sun.com you'll only get 3 hits !
and none would get you interested enough to search more.

So if you love dtrace then you'll love this too. Check out the developer's own blog,
which is rather gentle in its claims ( Nick, you never struck me as shy :)


PS check out the new dtrace -z option in the latest Solaris Express.

OpenSolaris Live on My New laptop

Finally got my Ferrari 4k installed with Nevada 23. ( Been using a 3200 for a long time. ) It rocks, rocks I tell you. Fast, quiet, and with the frkit it's got everything you'll ever need from an OS.

Sun Again.

I was speaking to an engineer in my team yesterday while getting a tea.
We were discussing road maps ( which I cannot go into here ) for really
cool new hardware coming out of Sun both SPARC and AMD over
the next little while.

I have been around here a while and was a customer long before that, and
expressed my view that these system designs were very "Sun". He asked
what I meant.

Well, the boxes are simple, pack in a HUGE amount and have a high
build quality ( even for the prototype units we have ). Just to show him, I
put one of our 2u rack mount systems next to a new IBM Xeon 2u system
( yes, we test Solaris x86, Java and of course JES on NON Sun hardware, really )
and the difference was amazing. The IBM has got so many additional components
which make it look like a KIT built from spare bits and I'm just talking the

I'd love to post pictures but you'll have to wait a while longer to try one
for yourself. Even with the system packaging we're back to where we
started: putting standard bits together better than anyone ;) If you run a datacenter with 100's of rack mount units you'll LOVE these.

P.S. The IBM runs Solaris x86 just great. J2SE runs fine on it with XP
( yes we test Java on XP ), and RH & SuSE too.

Sun and U2

I have worked in Sun for a long time now ( 12+ years ) and thought
I had seen it all ! But never did I expect to see a PRESS RELEASE with
Sun Microsystems and Bono ( yes, of U2 fame ) working together !

I was lucky enough to get a ticket to see the Vertigo Tour
homecoming opening night in Dublin's Croke Park. They sold out 3 nights
and could sell 10 more if the venue was available.

The concert was AWESOME. Great music, a super show, and
the band clearly enjoyed playing to the home crowd. The high tech
light show was incredible.

At the end of the concert Bono asked the crowd to text the word 'AFRICA' to 53131,
and that is where Sun comes in. We provided the back end infrastructure
( and I guess Java for all those phones too ).

Maybe marketing could get a Sun logo somewhere in the venues for the rest of the shows.

I had to update this as it now appears that Marketing were listening :) Check out www.sun.ie and we see Bono in all his glory.

Croke Park is an 80,000 seater stadium close to Sun's office and the home
of Gaelic Football and Hurling. Check these out if you get the chance
when you visit Ireland; they are uniquely Irish and great to watch. Check out http://gaa.ie.

Switch Performance.

A couple of my team mates ( Fintan and Sean ) have posts that deal with
Linus deciding that performance testing is a good idea and that it should be done
for Linux. ( I'm sure reading the article again he'll see it as a Homer Simpson
moment, D'oh ! maybe we should test it ! )

Sounds Silly ?

It seems that Linus may be ahead of some folks. We do a lot, in fact a hell of
a lot, of network performance testing. Last week we blew a low end 100/1000Mb switch.
We replaced it with a new one, same make, model etc., yet there is a 10% difference
between it and the original on the standard SPECweb99 benchmark. Ouch.

The same hardware now gets 10% less. Maybe switch vendors could start testing
their firmware too :)
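
For anyone who wants to catch this sort of thing automatically, the check itself is tiny. Below is a minimal Python sketch of a "did the score drop" test for a higher-is-better benchmark result. The function names, the 5% threshold and the scores are assumptions picked for illustration, not our actual tooling or real SPECweb99 numbers.

```python
# Minimal sketch of a regression check for a higher-is-better benchmark.
# Function names, the 5% default threshold and the scores are hypothetical.


def percent_change(baseline, current):
    """Relative change of current versus baseline, in percent."""
    if baseline <= 0:
        raise ValueError("baseline score must be positive")
    return 100.0 * (current - baseline) / baseline


def is_regression(baseline, current, threshold_pct=5.0):
    """True when the score dropped by at least threshold_pct percent."""
    return percent_change(baseline, current) <= -threshold_pct


# Made-up scores: old switch versus its same-model replacement.
print(percent_change(1000, 900))   # a 10% drop
print(is_regression(1000, 900))    # flagged with the default threshold
```

The threshold has to sit above the normal run to run noise of the benchmark, otherwise you spend your days chasing ghosts.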



