Patching enhancements and other stuff
By Gerry Haskins-Oracle on Dec 04, 2008
New title, same role, same me
I was promoted to Director, Software Patch Services in September. The last couple of months have been quite hectic, as I've suddenly got a whole new bunch of buddies in Marketing and elsewhere who want some of my time. That's a good thing, and I believe it will help me to drive and co-ordinate improvements to the patching experience for you, our customers.
Resources are limited and, as always, I'm interested in getting your thoughts as to what areas I should concentrate on next.
Some of the stuff we're currently working on is outlined below, along with other information which I hope you will find useful.
Solaris 10 10/08 Patch Bundle
The Solaris 10 10/08 Patch Bundle, which delivers the equivalent set of patches to the Solaris 10 10/08 (Update 6) release image, is now available from SunSolve. See my blog entry below on the Solaris 10 5/08 (Update 5) Patch Bundle for further information on why we produce it, what it contains, why you might wish to use it, how to download it, etc.
Recommended and Sun Alert patch cluster contents updated
I discussed the purpose of, and difference between, the Solaris Recommended and Sun Alert patch clusters in a previous blog posting. To recap:
The "Recommended" Cluster contains the latest revision of any Solaris OS patch which addresses a Sun Alert issue. That is, a fix for a Security, Data Corruption, or System Availability issue. The cluster also contains the latest revision of the patch utility patches to ensure correct patch application and any patch required by any other patch in the cluster.
The Sun Alert Cluster is newer, and contains the minimum revision of any Solaris OS patch which addresses a Sun Alert issue. The cluster also contains the latest revision of the patch utility patches to ensure correct patch application and any patch required by any other patch in the cluster. Therefore, the Sun Alert Cluster provides the minimum amount of change to fix all Solaris OS Sun Alert issues.
Both clusters are updated whenever a new patch meeting their inclusion criteria is released. The Sun Alert Cluster changes less frequently than the "Recommended" Cluster as it contains only what is really needed to address Sun Alert issues and apply the patches.
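To make the "latest revision" versus "minimum revision" distinction concrete, here's a minimal sketch in Python. The PatchIDs, the revision history, and the function names are all hypothetical, invented purely for illustration. It assumes patch revisions are cumulative, and it ignores the patch utility patches and dependency patches that both clusters also carry.

```python
# Hypothetical revision history per base PatchID, in release order:
# (revision, introduces_sun_alert_fix). Illustrative data, not real patches.
HISTORY = {
    "900001": [(1, False), (2, True), (3, False), (4, False)],
    "900002": [(1, True), (2, False), (3, True)],
    "900003": [(1, False), (2, False)],  # no Sun Alert fix in any revision
}


def recommended_revision(revisions):
    """Latest revision of a patch that addresses a Sun Alert issue."""
    if not any(is_alert for _, is_alert in revisions):
        return None
    return max(rev for rev, _ in revisions)


def sun_alert_revision(revisions):
    """Lowest revision that already contains every Sun Alert fix.

    Assumes revisions are cumulative, so this is the most recent
    revision that introduced a Sun Alert fix.
    """
    alert_revs = [rev for rev, is_alert in revisions if is_alert]
    return max(alert_revs) if alert_revs else None


def build_cluster(select):
    """Apply a selection rule to every patch; skip patches it rejects."""
    cluster = {}
    for base_id, revisions in HISTORY.items():
        rev = select(revisions)
        if rev is not None:
            cluster[base_id] = f"{base_id}-{rev:02d}"
    return cluster


print(build_cluster(recommended_revision))
print(build_cluster(sun_alert_revision))
```

In this made-up history, the first patch lands in the "Recommended" Cluster at revision 04 but in the Sun Alert Cluster at revision 02, which is why the Sun Alert Cluster changes less often.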
One of my team members has been reconciling the cluster contents against the Sun Alert reports and the cluster contents have been updated as a result. Some issues were found, largely to do with patches for things like GNOME which are also part of the Solaris OS. A process has been put in place to ensure the cluster contents match the patches specified in the Sun Alert reports.
Keeping as up to date as possible with the Sun Alert or Recommended Cluster contents is advisable. Remember also to keep firmware up to date.
BTW: The monthly EIS (Enterprise Installation Standards) patch baseline is based upon the Recommended Cluster contents but also includes ca. 150 additional patches to address irritants which are not Sun Alert fixes and includes patches for SunCluster, SunVTS, etc. The monthly EIS patch baselines are available through xVM Ops Center and Sun Proactive Services.
I am planning to merge the Recommended and Sun Alert patch clusters into a single cluster using the Sun Alert cluster criteria as having two very similar clusters tends to confuse customers unnecessarily.
I also intend to merge the two cluster pages on SunSolve as one is essentially a better formatted subset of the other.
ZFS and Zones features fully contained in patches
As I've mentioned previously, there's effectively a single customer-visible code branch for each Solaris named release. That means that there's one set of patches for all of Solaris 10, a separate set for Solaris 9, and a separate set for Solaris 8. Within a named release, e.g. Solaris 10, the same set of patches will apply to any of the Solaris 10 releases, from the original Solaris 10 3/05 release right up to the current Solaris 10 10/08 (Update 6) release. This simplifies System Administration and enables Sun to provide very long-term support at reasonable cost for each Solaris named release.
A consequence of effectively having a single code branch for each Solaris named release is that any change to pre-existing packages will be delivered in patch format.
New features are typically only added to the current Solaris named release, which is currently Solaris 10. (They are also available via OpenSolaris.)
This means that if new features don't add any new packages, then the entire feature functionality is fully available in patches. Customers can utilize the new features by simply applying the appropriate patches to their existing Solaris 10 system. This is the case with all current Zones and ZFS\* functionality, including neat features like ZFS Root, ZFS Boot, and Zones "Update on Attach".
Other features which deliver new packages are only available from the Solaris Update release in which they were first included. So, for example, if a new package was first delivered in Solaris 10 8/07 (Update 4), then a customer wishing to use that feature would need to install or upgrade to the Solaris 10 8/07 (Update 4) or subsequent update release image. Such features are not available in patches.
\*OK, we cheated with ZFS. ZFS does deliver new packages, but they are streamed into existence from a patch. This type of patch is called a "genesis" patch, but they are hard to perfect, so we don't intend to release any more "genesis" patches.
Improving Zones Patching Performance
Zones Parallel Patching
My team has been working with those awfully nice folks in the Sustaining organization to deliver a Zones Parallel Patching enhancement to the patch utilities to dramatically improve Zones patching performance. We have a fully stable prototype which has been given to selected Beta customers to trial.
For a simple T2000 with 5 sparse non-global zones, the performance improvement is >3x. On systems with optimized I/O (as Zones patching is primarily I/O bound), we expect the performance improvement to be even better. A configuration file will allow users to select how many Zones to patch in parallel. This will typically equate to the number of processors or threads available on the target system.
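As a rough illustration of the idea only (this is not the patch utilities' actual implementation), here's a minimal Python sketch of patching a bounded number of Zones at a time. The function names and the 'num_parallel' setting are hypothetical stand-ins for whatever the configuration file ends up exposing, and 'patch_zone' is a placeholder for the real, I/O-bound work.

```python
from concurrent.futures import ThreadPoolExecutor


def patch_zone(zone):
    # Placeholder for the real, I/O-bound work of patching one zone.
    return f"{zone}: patched"


def patch_zones(zones, num_parallel):
    """Patch at most num_parallel zones at once; results keep input order."""
    with ThreadPoolExecutor(max_workers=num_parallel) as pool:
        return list(pool.map(patch_zone, zones))


for line in patch_zones(["zone1", "zone2", "zone3", "zone4", "zone5"], 2):
    print(line)
```

Because the work is mostly I/O bound, the sweet spot for the parallelism setting depends on the target system's I/O configuration as well as its processor count.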
The general release of this feature is planned for April 2009.
Zones "Update on Attach"
The Kernel patch associated with Solaris 10 10/08 (Update 6), 137137-09 (SPARC) / 137138-09 (x86), contains some cool new features, such as ZFS Root, ZFS Boot, and Zones "Update on Attach". Beware: installing this patch requires significant free disk space! See Sun Alert http://sunsolve.sun.com/search/document.do?assetkey=1-66-246207-1
Zones "Update on Attach" is a very cool feature indeed.
For example, if the patch level of non-global Zones is out-of-sync with respect to the global Zone, e.g. because the non-global Zones ran out of disk space during patch application, Zones "Update on Attach" provides a very neat way to bring the Zones back into sync. Simply detach the affected non-global Zones, apply Kernel patch 137137-09 (SPARC) / 137138-09 (x86) to the global Zone, and reattach the affected non-global Zones using 'zoneadm -z <zone-name> attach -u'. The non-global Zones will be automagically updated to the same patch level as the global Zone. Neat!
There are other interesting possibilities. For example, detach all non-global Zones, apply an arbitrary set of patches to the global Zone (including 137137-09 (SPARC) / 137138-09 (x86)), and reattach the non-global Zones using 'zoneadm -z <zone-name> attach -u'. Voilà! The non-global Zones will be automagically updated with all of the patches applied to the global Zone. Way neat! And more importantly, way faster than even the Zones Parallel Patching solution we're working on. And even better, it's available now! This could be a key solution for customers having difficulty completing patching updates on Zones systems during tight maintenance windows.
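The sequence described above can be sketched as a small Python helper that just builds the ordered list of commands without running anything. The zone names and the patch directory are hypothetical, and the sketch assumes the non-global Zones have already been halted, since a Zone can't be detached while it's running.

```python
def update_on_attach_plan(zones, patch_dirs):
    """Return the commands, in order, as argument lists (nothing is run)."""
    # 1. Detach every non-global zone.
    plan = [["zoneadm", "-z", zone, "detach"] for zone in zones]
    # 2. Patch the global zone only.
    plan += [["patchadd", patch_dir] for patch_dir in patch_dirs]
    # 3. Reattach each zone, updating it to the global zone's patch level.
    plan += [["zoneadm", "-z", zone, "attach", "-u"] for zone in zones]
    return plan


# Hypothetical zone names and patch location, for illustration only.
for cmd in update_on_attach_plan(["web01", "db01"], ["/var/tmp/137137-09"]):
    print(" ".join(cmd))
```

Printing the plan first, rather than executing it, makes it easy to review the order of operations before a maintenance window.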
We are working to explore potential caveats. For example, when a patch is applied using 'patchadd' to a non-global zone, an "Undo.Z" file containing the data necessary to back out the patch is created specifically for each non-global zone to which the patch is applied. Using Zones "Update on Attach" to patch non-global Zones will cause the "Undo.Z" file from the global Zone to be propagated to the non-global Zones. This could theoretically cause issues if the patch is subsequently backed out (e.g. data from global Zone config files could be merged into non-global Zone config files during patch backout), although we've never actually encountered such an issue. BTW: The same caveat applies to creating non-global Zones after the global Zone has been patched. Again, we have yet to see this causing an actual issue, so it appears to be more of a theoretical caveat than a practical issue.
Improvements to 'smpatch' and Update Manager
The way the PatchPro analysis engine for 'smpatch' and Update Manager used to work was fine in theory, but in practice was what I call "a process with too many moving parts". Too many steps had to happen correctly for the overall result to be correct. In Six Sigma terms, there was too much error opportunity. Occasionally, it would end up recommending a SPARC patch for an x86 system or a Solaris 8 patch for a Solaris 10 system. Not surprisingly, its reputation suffered.
I'm pleased to say that a major overhaul to dramatically simplify the back end processing of 'smpatch' and Update Manager has just been rolled out by their engineering team. The way 'smpatch' and Update Manager work is that Realization Detector(s) are associated with each patch. These Realization Detectors determine whether it's appropriate to recommend a patch for application on a target system. In the vast majority of cases, the Realization Detectors are simply comparing the packages contained in the patch to the packages installed on the system to see if the patch is applicable. The enhancement is to replace these myriad Realization Detectors, which could potentially contain coding bugs, with a single Generic Realization Detector to map patch packages to packages on the target system. It looks at the package name, package version, and package architecture fields (in pkginfo) for each package in the patch, and compares them to the same values for the packages installed on the target system. If they match, the patch is recommended, else not. Guess what, this is exactly how 'patchadd' decides whether a patch is applicable or not when installing a patch. It's also how 'pca' determines which patches to apply.
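Here's a minimal Python sketch of that matching rule, not the actual Generic Realization Detector code. The 'Pkg' type, the function name, and the sample package data are hypothetical. It also assumes that one matching package is enough for the patch to be considered applicable; the description above doesn't say whether every package in the patch must match.

```python
from typing import NamedTuple


class Pkg(NamedTuple):
    name: str     # pkginfo PKG field
    version: str  # pkginfo VERSION field
    arch: str     # pkginfo ARCH field


def patch_applies(patch_pkgs, installed_pkgs):
    """True if any package in the patch matches an installed package
    on name, version, and architecture (assumed rule: one match suffices)."""
    installed = set(installed_pkgs)
    return any(pkg in installed for pkg in patch_pkgs)


# Illustrative data only; not taken from a real system or patch.
x86_system = [Pkg("SUNWexample", "11.10.0,REV=2005.01.21", "i386")]
sparc_patch = [Pkg("SUNWexample", "11.10.0,REV=2005.01.21", "sparc")]
x86_patch = [Pkg("SUNWexample", "11.10.0,REV=2005.01.21", "i386")]

print(patch_applies(sparc_patch, x86_system))
print(patch_applies(x86_patch, x86_system))
```

With a single rule like this, a SPARC patch can't be recommended for an x86 system, because the architecture field never matches.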
A few specialist Realization Detectors remain for a small number of patches which require special handling.
The changes to 'smpatch' and Update Manager should dramatically improve the reliability of these tools and the accuracy of their patching recommendations.
One remaining distinction between 'smpatch' / Update Manager and 'pca' is that 'pca' "knows" about all current Sun patches via the patchdiag.xref file, whereas 'smpatch' / Update Manager "knows" about all patches containing a 'patchinfo' file, including older patch revisions. All Solaris OS and Java Enterprise System (middleware) patches contain a 'patchinfo' file. These account for 49% of patches. For patching the Solaris OS, the tools should produce similar results. A decision was made not to "auto-include" all other patches for 'smpatch' and Update Manager, as it was felt that the explicit step of the patch creator including a non-blank PATCH_CORRECTS realization detector specification line in the 'patchinfo' file to signal that the patch was suitable for patch automation was potentially useful. (Don't worry about what value the PATCH_CORRECTS field has. This is overridden by the Generic Realization Detector in the vast majority of cases. It has no meaning from a customer perspective.)
This enhancement is not an attempt to undermine 'pca'. It's simply to improve 'smpatch' and Update Manager. I will continue to work closely with Martin Paul to give him heads-ups on any initiative which may impact 'pca' and resolve any issues with patchdiag.xref.
One thing I want to do, when I can free up some resources, is a comparative study of the patching recommendations of the various available patch automation tools: 'smpatch' / Update Manager, 'pca', UCE (a.k.a. Sun Connection Satellite), xVM Ops Center\*, and TLP (Traffic Light Patching). TLP is used by Sun Proactive Services to provide tailored patching solutions for customers in conjunction with SRAS (Sun Risk Analysis Service) and the EIS (Enterprise Installation Standards) methodology. The aim is to ensure that the patching recommendations of the various tools are coherent and consistent, with the higher value tools providing more sophisticated analysis. It's part of my efforts to co-ordinate patching improvements to improve our customers' patching experience.
\*xVM OC also utilizes the monthly EIS patch "baselines".
Same Patch Entitlement policy, new Patch Entitlement implementation
Solaris changed its business model a few years ago from selling Solaris and providing patches for free to a model of giving away the software releases for free and charging for patches.
The policy is that patches delivering new security fixes will remain free to all customers, irrespective of whether or not they have a support contract, but most other patches require that customers have a valid support contract to access them. (See my earlier blog entry on the subject.)
All fixes will be available for free in the next Solaris Update release (and OpenSolaris), so customers not willing to pay for a support contract can still get the fixes by installing or upgrading to the next Solaris Update release. They'll just need to wait for it to ship. Alternatively, they can use OpenSolaris.
This policy is not changing.
What is changing is the implementation of patch entitlement to ensure it matches the policy. Currently, circa 60% of Solaris patches are free, including most of the key patches. Under the new entitlement implementation, 18% of Solaris patches will remain free, including the specific revision of all Solaris patches which include new security fixes. The rest will require a valid support contract to access.
Any of the following support contracts will provide access to all Solaris patches and patch clusters: a Solaris subscription, a Software Support Contract, a Sun System Service Plan for Solaris, a Sun Spectrum Storage Plan, or a Sun Spectrum Enterprise Service Plan. Since the names of the support contracts change from time-to-time, this list may change.
The new implementation will roll out in phases, starting this month. The roll-out should be transparent to customers with valid support contracts.
Patch signing certificate renewal
The signing certificate used to sign Sun patches expires shortly. A new signing certificate will be rolled out in January and instructions provided on how to adopt it.
Customers who download the unsigned patch versions will not need to take any action.
The "SplitGate" source code management model we first introduced in Solaris 10 8/07 (Update 4) has dramatically improved Solaris 10 patch quality. A side-effect of the "SplitGate" model is that base PatchIDs (the first 6 digits) change at the end of each Update release. See my earlier Solaris 10 Kernel PatchID Sequence posting.
In the "SplitGate" model, when building an Update release, we effectively have two parallel source code gates. One, called the Sustaining Gate, contains just the bug fixes we need to release to customers in patches asynchronous to the Update release. The other, called the Update Gate, contains a superset of the Sustaining Gate as well as new features and less critical bug fixes which will be released as part of the Update release.
The two gates remain separate (split) for the duration of the Update release build process. Once the Update release has reached release quality, the Update Gate is promoted to become the new Sustaining Gate and the process repeats. Since the Update Gate is always a strict superset of the Sustaining Gate, no regressions should result from the promotion of the Update Gate to become the new Sustaining Gate. Each patch in the old Sustaining Gate is obsoleted by a corresponding patch from the Update Gate which has accumulated its contents. When the Update is released, these new PatchIDs are released to SunSolve. This is why you see the base PatchIDs changing after each Update release.
If the Update Gate patch doesn't contain any additional code changes over the corresponding Sustaining Gate patch, then there's no need for customers to install the new Update Gate patch. Such patches are called "accumulation-only" patches and can be identified as they have a different base PatchID (the first 6 digits) but don't contain any additional CR numbers over the Sustaining patch which they obsolete.
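The check described above is easy to sketch in Python. The PatchIDs and CR numbers below are made up for illustration, and the function names are hypothetical; in practice the CR list would come from each patch's README.

```python
def base_id(patch_id):
    """Return the base PatchID: the 6 digits before the revision."""
    base, _, _revision = patch_id.partition("-")
    return base


def is_accumulation_only(new_id, new_crs, old_id, old_crs):
    """True if the new patch has a different base PatchID but adds no
    CR numbers beyond those in the Sustaining patch it obsoletes."""
    return base_id(new_id) != base_id(old_id) and set(new_crs) <= set(old_crs)


# Made-up PatchIDs and CR numbers, for illustration only.
sustaining_crs = ["6000001", "6000002"]

print(is_accumulation_only("900020-01", ["6000001", "6000002"],
                           "900010-05", sustaining_crs))
print(is_accumulation_only("900021-01", ["6000001", "6000002", "6000003"],
                           "900010-05", sustaining_crs))
```

The first comparison is an "accumulation-only" patch that needn't be installed; the second adds a new CR, so it carries real code changes.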
Sun releases these "accumulation-only" patches because some customers insist that all of the PatchIDs pre-applied into a Solaris Update release image also be available from SunSolve.