Friday Jun 19, 2009

Zones Parallel Patching versus Update On Attach: When to use which one?

The Zones Parallel Patching enhancement for the Solaris 10 patch utilities was released this week, giving customers a choice of how to improve zones patching performance.

In the Zones "Update On Attach" section of a previous blog posting, I mentioned that the Zones "Update On Attach" feature could also be used to improve Zones patching performance.

Zones Parallel Patching is a true patching solution utilizing the 'patchadd' utility.  

Zones "Update On Attach", by contrast, uses zones functionality similar to that used during zones creation to provide a pseudo-patching solution that does not utilize 'patchadd'.

So which one to choose?

Let's look at the two options in more detail:

Zones Parallel Patching

Zones Parallel Patching is an enhancement to the standard Solaris 10 patch utilities and is delivered in the patch utilities patch, 119254-66 (SPARC) and 119255-66 (x86).

Simply install this patch, set the maximum number of non-global zones to be patched in parallel in the config file /etc/patch/pdo.conf, and away you go.
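As a rough illustration of that configuration step, here is a minimal Python sketch that rewrites the parallelism setting in the text of pdo.conf. It assumes the setting is a single `num_proc=<n>` line, so check the comments in the file delivered by the patch before relying on it. The helper name `set_num_proc` and the choice of the smaller of the zone count and the on-line CPU count are illustrative assumptions, not official guidance.

```python
import re


def set_num_proc(conf_text: str, zones: int, online_cpus: int) -> str:
    """Return pdo.conf text with the parallelism setting updated.

    Assumption: the file holds a 'num_proc=<n>' line. The value chosen here,
    min(zones, online_cpus), is only a starting point for your own testing.
    """
    value = max(1, min(zones, online_cpus))
    new_line = f"num_proc={value}"
    pattern = r"^num_proc=.*$"
    if re.search(pattern, conf_text, flags=re.MULTILINE):
        # Replace the existing setting and leave comments untouched.
        return re.sub(pattern, new_line, conf_text, flags=re.MULTILINE)
    # No existing setting: append one after whatever is already there.
    return conf_text.rstrip("\n") + "\n" + new_line + "\n"
```

You would read /etc/patch/pdo.conf, pass its contents through the function, and write the result back as root.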

It works for all Solaris 10 systems. 

It also works well in conjunction with higher level patch automation tools such as xVM Ops Center. 

It can dramatically improve zones patching performance by patching non-global zones in parallel.  The global zone is still patched first.

While the performance gain is dependent on a number of factors, including the number of non-global zones, the number of on-line CPUs, the speed of the system, the I/O configuration of the system, etc., a performance gain of ca. 300% can typically be expected for patching the non-global zones, e.g. on a T2000 with 5 sparse root non-global zones.
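To see where the ceiling on that gain lies, here is a toy model in Python. It assumes every non-global zone takes the same time to patch and that zones are patched in equal-sized batches, which is a simplification of my own rather than how the patch utilities schedule work. Real systems fall short of this bound because of the factors listed above, I/O contention in particular.

```python
import math


def ideal_speedup(zones: int, parallel: int) -> float:
    """Idealised speedup for the non-global zone phase of patching.

    Serial patching takes `zones` time units; parallel patching takes one
    unit per batch of up to `parallel` zones. Global zone time is excluded
    because the global zone is still patched first, on its own.
    """
    if zones < 1 or parallel < 1:
        raise ValueError("zones and parallel must both be at least 1")
    batches = math.ceil(zones / parallel)
    return zones / batches
```

For 5 zones patched 5 at a time the model gives an upper bound of 5x, so a measured gain of around 3x on a real system is plausible once contention is taken into account.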

See my previous Zones Parallel Patching blog entry for further information.

Since it's a pure enhancement to 'patchadd', it's normal 'patchadd' functionality.  You can subsequently remove patches using 'patchrm', etc.  Nothing has changed except that it's now much faster to patch non-global zones with Zones Parallel Patching invoked.

Zones "Update On Attach"

The primary purpose of Zones "Update on Attach" is Zones migration from one server to another.  

For example, a database instance in a non-global zone hosted on a server has grown to the extent that the Sys Admin wants to transfer it to a better spec'd server which can better handle the workload.   The Sys Admin can detach it from the old server (e.g. a Sun4u) and reattach it to the new server (e.g. a Sun4v) using Zones "Update On Attach".   This will bring the OS Software level on the non-global zone up to the same level as the new server's global zone.

Zones "Update On Attach" can certainly be used for patching but there are limitations you need to be aware of as outlined below.

For example, detach the non-global zones from a system, apply a bunch of patches to the global zone, reattach the non-global zones using "Update On Attach" and voilà, the non-global zones will be brought up to the same software level as the global zone (for OS type packages), effectively patching the non-global zones without using 'patchadd' at all.   This is typically even faster than using Zones Parallel Patching.  But there are limitations to this approach which users must be aware of (see below).

My senior engineer, Enda O'Connor, has just published an interesting article on "The Zones Update on Attach Feature and Patching in the Solaris 10 OS".

Zones "Update On Attach" limitations as a patching aid

Zones "Update On Attach" only works for packages which are SUNW_PKG_ALLZONES=true - i.e. typically OS level packages, and not application packages.
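If you want to check which of your packages fall into that category, the flag lives in each package's pkginfo file as a KEY=value line. The following Python sketch parses pkginfo text and reports whether the package is marked all-zones. Treating a missing SUNW_PKG_ALLZONES entry as false is my understanding of the default, and the application package name in the usage note is made up for illustration.

```python
def parse_pkginfo(text: str) -> dict:
    """Parse KEY=value lines from pkginfo text into a dict."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        params[key.strip()] = value.strip().strip('"')
    return params


def is_allzones(pkginfo_text: str) -> bool:
    """True if the package is marked SUNW_PKG_ALLZONES=true.

    Assumption: an absent parameter is treated as false.
    """
    value = parse_pkginfo(pkginfo_text).get("SUNW_PKG_ALLZONES", "false")
    return value.lower() == "true"
```

For example, `is_allzones("PKG=SUNWcsr\nSUNW_PKG_ALLZONES=true\n")` is true, while a hypothetical application package `"PKG=EXAMPLEapp\nSUNW_PKG_ALLZONES=false\n"` is not, so "Update On Attach" would not be expected to update it.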

So when to use Zones Parallel Patching in 'patchadd' and when to use Zones "Update On Attach"?

Here's what my senior engineer, Enda O'Connor, says:

"The Zones Update on Attach Feature and Patching in the Solaris 10 OS document may help customers understand how the technology works, applying a cluster via patching and via zones Update On Attach is not quite the same really.

It really depends on the patches being applied, i.e. applying a firefox patch via Update On Attach would not work if you wanted it to apply to the global zone and all non-global zones as well.

One has to understand how Update On Attach works and then apply that to the list of patches to see if it gets them to a desirable state.

There is no black or white answer here.

I'd recommend Zones Parallel Patching using 'patchadd' as it has a known outcome all the time, whereas Update On Attach makes its own internal determination based on a number of things that can vary from system to system (e.g. inherited directories).

But if time to patch is critical then if the customer does proper testing to validate things, and are happy with the results, then by all means use Update On Attach.

But using Update On Attach without:

1. Understanding how it determines what packages to update

2. Inspecting the patches being applied

...will most likely lead to grief at some point."

And my other senior engineer, Ed Clark, says:

"In terms of giving guidance on which technology to use, there are a number of considerations -- two of these considerations are:

1. Using Update On Attach to update sparse zones can require significantly more disk storage space than would be needed by applying patches with 'patchadd' (3-4 times as much space would not be uncommon I think), due to Update On Attach copying fully populated global zone 'undo' files into the non-global zones, as opposed to having patchadd build sparsely populated 'undo' files in the non-global zones.

2. If a customer is really concerned about the ability to back out patches reliably, then 'patchadd' is a lower risk option than Update On Attach -- 'patchrm' of a patch from a non-global zone that has a copy of the global zone's 'undo' pkg data (as is the case after Update On Attach) may potentially have unexpected side effects." [although we have yet to see any actual cases of negative results from this.]

Conclusion

In general, we recommend using the Zones Parallel Patching enhancement in the patch utilities rather than the Zones "Update On Attach" feature as Zones Parallel Patching is standard patching functionality, only faster, whereas Zones "Update On Attach" is really designed for migrating zones from one server to another and was not primarily designed to speed up patching.

Because Zones "Update On Attach" uses Zones functionality similar to the zone creation functionality, rather than 'patchadd' functionality, limitations exist on what will be patched (typically the OS but not applications) and there's the potential for anomalies around things like the "undo" files which would be used by 'patchrm' if patches applied using Zones "Update On Attach" were subsequently removed from the non-global zones using 'patchrm' (although we have yet to see any actual cases of serious issues resulting from this).

So in patching situations where time is absolutely critical, Zones "Update On Attach" may provide a good option, as long as it's well tested in the customer environment prior to deployment on production systems.

Remember too, Live Upgrade is also your friend in such situations, enabling you to patch an inactive boot environment while the system is still in production.   So a combination of Live Upgrade and Zones Parallel Patching would be ideal.

I hope you find this helpful!

Best Wishes,

Gerry.

Thursday Apr 02, 2009

Patching Zones goes Zoom!

My colleague, Jeff Victor, has written an excellent and informative blog posting on how to improve patching performance, "patching zones goes zoom!".   Enjoy!

Thursday Dec 04, 2008

Patching enhancements and other stuff

New title, same role, same me

I was promoted to Director, Software Patch Services in September.  The last couple of months have been quite hectic, as I've suddenly got a whole new bunch of buddies in Marketing and elsewhere who want some of my time.  That's a good thing, and I believe it will help me to drive and co-ordinate improvements to the patching experience for you, our customers.

Resources are limited and, as always, I'm interested in getting your thoughts as to what areas I should concentrate on next.  

Some of the stuff we're currently working on is outlined below as well as other information which I hope you will find useful.

Solaris 10 10/08 Patch Bundle

The Solaris 10 10/08 Patch Bundle, which delivers the equivalent set of patches to the Solaris 10 10/08 (Update 6) release image, is now available from SunSolve.  See my blog entry below on the Solaris 10 5/08 (Update 5) Patch Bundle for further information on why we produce it, what it contains, why you might wish to use it, how to download it, etc.

Recommended and Sun Alert patch cluster contents updated

I discussed the purpose of, and difference between, the Solaris Recommended and Sun Alert patch clusters in a previous blog posting. To recap:

The "Recommended" Cluster contains the latest revision of any Solaris OS patch which addresses a Sun Alert issue.  That is, a fix for a Security, Data Corruption, or System Availability issue.  The cluster also contains the latest revision of the patch utility patches to ensure correct patch application and any patch required by any other patch in the cluster.

The Sun Alert Cluster is newer, and contains the minimum revision of any Solaris OS patch which addresses a Sun Alert issue. The cluster also contains the latest revision of the patch utility patches to ensure correct patch application and any patch required by any other patch in the cluster.  Therefore, the Sun Alert Cluster provides the minimum amount of change to fix all Solaris OS Sun Alert issues. 

Both clusters are updated whenever a new patch meeting their inclusion criteria is released.  The Sun Alert Cluster changes less frequently than the "Recommended" Cluster as it contains only what is really needed to address Sun Alert issues and apply the patches.
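The difference between the two selection rules can be sketched in a few lines of Python. This is my own simplification: it assumes patches are cumulative, so the minimum revision that covers every Sun Alert fix in a patch is the highest revision that introduced one, and it ignores the patch utility patches and dependency patches that both clusters also carry. The function name, data layout, and PatchIDs are hypothetical.

```python
def cluster_picks(revisions: dict) -> tuple:
    """Pick a revision per patch for each cluster.

    revisions maps a base PatchID to a list of (revision, fixes_sun_alert)
    pairs. Returns (recommended, sun_alert) dicts keyed by base PatchID.
    """
    recommended, sun_alert = {}, {}
    for base, revs in revisions.items():
        alert_revs = [rev for rev, fixes in revs if fixes]
        if not alert_revs:
            continue  # never addressed a Sun Alert, so in neither cluster
        # "Recommended": latest revision of the patch, whatever it added.
        recommended[base] = f"{base}-{max(rev for rev, _ in revs):02d}"
        # Sun Alert: lowest revision that already holds every Sun Alert fix.
        sun_alert[base] = f"{base}-{max(alert_revs):02d}"
    return recommended, sun_alert
```

With a hypothetical patch 123456 whose revision 02 fixed a Sun Alert issue and whose revision 03 did not, the "Recommended" pick is 123456-03 and the Sun Alert pick is 123456-02, which is why the Sun Alert Cluster changes less often.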

One of my team members has been reconciling the cluster contents against the Sun Alert reports and the cluster contents have been updated as a result.  Some issues were found, largely to do with patches for things like GNOME which are also part of the Solaris OS.  A process has been put in place to ensure the cluster contents match the patches specified in the Sun Alert reports.

Keeping as up to date as possible with the SunAlert or Recommended Cluster contents is advisable.   Remember also to keep firmware up to date.

BTW: The monthly EIS (Enterprise Installation Standards) patch baseline is based upon the Recommended Cluster contents but also includes ca. 150 additional patches to address irritants which are not Sun Alert fixes and includes patches for SunCluster, SunVTS, etc.  The monthly EIS patch baselines are available through xVM Ops Center and Sun Proactive Services.

I am planning to merge the Recommended and Sun Alert patch clusters into a single cluster using the Sun Alert cluster criteria as having two very similar clusters tends to confuse customers unnecessarily.  

I also intend to merge the two cluster pages on SunSolve as one is essentially a better formatted subset of the other.

ZFS and Zones features fully contained in patches

As I've mentioned previously, there's effectively a single customer visible code branch for each Solaris named release.  That means that there's one set of patches for all of Solaris 10, a separate set for Solaris 9, and a separate set for Solaris 8.  Within a named release, e.g. Solaris 10, the same set of patches will apply to any of the Solaris 10 releases, from the original Solaris 10 3/05 release right up to the current Solaris 10 10/08 (Update 6) release.  This simplifies System Administration and enables Sun to provide very long term support at reasonable cost for each Solaris named release. 

A consequence of effectively having a single code branch for each Solaris named release is that any change to pre-existing packages will be delivered in patch format.

New features are typically only added to the current Solaris named release, which is currently Solaris 10.  (They are also available via OpenSolaris.)

This means that if new features don't add any new packages, then the entire feature functionality is fully available in patches.  Customers can utilize the new features by simply applying the appropriate patches to their existing Solaris 10 system.  This is the case with all current Zones and ZFS\* functionality, including neat features like ZFS Root, ZFS Boot, and Zones "Update on Attach".

Other features which deliver new packages are only available from the Solaris Update release in which they were first included.  So, for example, if a new package was first delivered in Solaris 10 8/07 (Update 4), then a customer wishing to use that feature would need to install or upgrade to the Solaris 10 8/07 (Update 4) or subsequent update release image.   Such features are not available in patches.

\*OK, we cheated with ZFS.  ZFS does deliver new packages, but they are streamed into existence from a patch.  This type of patch is called a "genesis" patch, but they are hard to perfect, so we don't intend to release any more "genesis" patches.

Improving Zones Patching Performance

Zones Parallel Patching

My team has been working with those awfully nice folks in the Sustaining organization to deliver a Zones Parallel Patching enhancement to the patch utilities to dramatically improve Zones patching performance.  We have a fully stable prototype which has been given to selected Beta customers to trial. 

For a simple T2000 with 5 sparse non-global zones, the performance improvement is >3x.  On systems with optimized I/O (as Zones patching is primarily I/O bound), we expect the performance improvement to be even better.  A configuration file will allow users to select how many Zones to patch in parallel.  This will typically equate to the number of processors or threads available on the target system.

The general release of this feature is planned for April 2009.

Zones "Update on Attach" 

The Kernel patch associated with Solaris 10 10/08 (Update 6), 137137-09 (SPARC) / 137138-09 (x86), contains some cool new features, such as ZFS Root, ZFS Boot, and Zones "Update on Attach".  Beware, installing this patch requires significant free disk space!  See Sun Alert http://sunsolve.sun.com/search/document.do?assetkey=1-66-246207-1

Zones "Update on Attach" is a very cool feature indeed.

For example, if the patch level of non-global Zones is out-of-sync with respect to the global Zone, e.g. because the non-global Zones ran out of disk space during patch application, Zones "Update on Attach" provides a very neat way to bring the Zones back into sync.  Simply detach the affected non-global Zones, apply Kernel patch 137137-09 (SPARC) / 137138-09 (x86) to the global Zone, and reattach the affected non-global Zones using 'zoneadm -z <zone-name> attach -u'.  The non-global Zones will be automagically updated to the same patch level as the global Zone.  Neat!

There are other interesting possibilities.  For example, detach all non-global Zones, apply an arbitrary set of patches to the global Zone (including 13713[78]-09), and reattach the non-global Zones using 'zoneadm -z <zone-name> attach -u'.  Voilà!  The non-global Zones will be automagically updated with all of the patches applied to the global Zone.  Way neat!  And more importantly, way faster than even the Zones Parallel Patching solution we're working on.  And even better, it's available now!  This could be a key solution for customers having difficulty completing patching updates on Zones systems during tight maintenance windows.
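The detach, patch, reattach sequence can be laid out as a dry run before you touch a real system. The Python sketch below only builds the list of commands in order and runs nothing. The zone names and patch directory are hypothetical, the zones are assumed to be halted already, and 'patchadd' is assumed to be given the path of an unpacked patch directory.

```python
def update_on_attach_plan(zones: list, patch_dirs: list) -> list:
    """Build the ordered command list for patching via "Update on Attach".

    Nothing is executed; each command is returned as an argument list so it
    can be reviewed, logged, or handed to subprocess.run() later.
    """
    # 1. Detach every non-global zone (assumed halted beforehand).
    plan = [["zoneadm", "-z", zone, "detach"] for zone in zones]
    # 2. Patch the global zone only, since no zones are attached.
    plan += [["patchadd", patch_dir] for patch_dir in patch_dirs]
    # 3. Reattach with -u so each zone is updated to the global zone's level.
    plan += [["zoneadm", "-z", zone, "attach", "-u"] for zone in zones]
    return plan


if __name__ == "__main__":
    for cmd in update_on_attach_plan(["zone1", "zone2"], ["/var/tmp/137137-09"]):
        print(" ".join(cmd))
```

Printing the plan first makes it easy to confirm that every detached zone has a matching 'attach -u' before anything is run for real.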

We are working to explore potential caveats.  For example, when a patch is applied using 'patchadd' to a non-global zone, an "Undo.Z" file containing the data necessary to back out the patch is created specifically for each non-global zone to which the patch is applied.   Using Zones "Update on Attach" to patch non-global Zones will cause the "Undo.Z" file from the global Zone to be propagated to the non-global Zones.  This could theoretically cause issues if the patch is subsequently backed out (e.g. data from global Zone config files could potentially be merged into non-global Zone config files during patch backout), although we've never actually encountered such an issue.  BTW: The same caveat applies to creating non-global Zones after the global Zone has been patched.  Again, we have yet to see this causing an actual issue, so it appears to be more of a theoretical caveat than a practical issue.

Improvements to 'smpatch' and Update Manager

The way the PatchPro analysis engine for 'smpatch' and Update Manager used to work was fine in theory, but in practice was what I call "a process with too many moving parts".   Too many steps had to happen correctly for the overall result to be correct.  In Six Sigma terms, there was too much error opportunity.  Occasionally, it would end up recommending a SPARC patch for an x86 system or a Solaris 8 patch for a Solaris 10 system.  Not surprisingly, its reputation suffered.

I'm pleased to say that a major overhaul to dramatically simplify the back end processing of 'smpatch' and Update Manager has just been rolled out by their engineering team.  The way 'smpatch' and Update Manager work is that Realization Detector(s) are associated with each patch.  These Realization Detectors determine whether it's appropriate to recommend a patch for application on a target system.  In the vast majority of cases, the Realization Detectors are simply comparing the packages contained in the patch to the packages installed on the system to see if the patch is applicable.  The enhancement is to replace these myriad Realization Detectors, which could potentially contain coding bugs, with a single Generic Realization Detector to map patch packages to packages on the target system.  It looks at the package name, package version, and package architecture fields (in pkginfo) for each package in the patch, and compares them to the same values for the packages installed on the target system.  If they match, the patch is recommended, else not.  Guess what, this is exactly how 'patchadd' decides whether a patch is applicable or not when installing a patch.  It's also how 'pca' works too in determining which patches to apply.
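The matching rule is simple enough to sketch in Python. This is only an illustration of the comparison described above, not the detector's actual code: the `Pkg` type and function name are hypothetical, and treating a patch as applicable when at least one of its packages matches an installed package is my assumption.

```python
from typing import Iterable, NamedTuple


class Pkg(NamedTuple):
    """The three pkginfo fields compared: PKG, VERSION and ARCH."""
    name: str
    version: str
    arch: str


def patch_applies(patch_pkgs: Iterable[Pkg], installed: Iterable[Pkg]) -> bool:
    """True if any package in the patch matches an installed package exactly.

    A match requires the same name, version and architecture, which is what
    stops a SPARC patch being offered to an x86 system.
    """
    installed_set = set(installed)
    return any(pkg in installed_set for pkg in patch_pkgs)
```

For example, a patch carrying `Pkg("SUNWcsr", "11.10.0", "sparc")` would not be recommended on a system whose installed SUNWcsr has architecture `i386`, even though the name and version agree.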

A few specialist Realization Detectors remain for a small number of patches which require special handling.

The changes to 'smpatch' and Update Manager should dramatically improve the reliability of these tools and the accuracy of their patching recommendations.

One remaining distinction between 'smpatch' / Update Manager and 'pca' is that 'pca' "knows" about all current Sun patches via the patchdiag.xref file, whereas 'smpatch' / Update Manager "knows" about all patches containing a 'patchinfo' file, including older patch revisions.  All Solaris OS and Java Enterprise System (middleware) patches contain a 'patchinfo' file.  These account for 49% of patches.  For patching the Solaris OS, the tools should produce similar results.  A decision was made not to "auto-include" all other patches for 'smpatch' and Update Manager, as it was felt that the explicit step of the patch creator including a non-blank PATCH_CORRECTS realization detector specification line in the 'patchinfo' file to signal that the patch was suitable for patch automation was potentially useful.  (Don't worry about what value the PATCH_CORRECTS field has.  This is overridden by the Generic Realization Detector in the vast majority of cases.  It has no meaning from a customer perspective.)

This enhancement is not an attempt to undermine 'pca'.  It's simply to improve 'smpatch' and Update Manager.  I will continue to work closely with Martin Paul to give him heads-ups on any initiative which may impact 'pca' and resolve any issues with patchdiag.xref.

One thing I want to do when I can free up some resources is a comparative study of the patching recommendations of the various available patch automation tools: 'smpatch' / Update Manager, 'pca', UCE (a.k.a. Sun Connection Satellite), xVM Ops Center\*, and TLP (Traffic Light Patching).  TLP is used by Sun Proactive Services to provide tailored patching solutions for customers in conjunction with SRAS (Sun Risk Analysis Service) and the EIS (Enterprise Installation Standards) methodology.  The aim is to ensure that the patching recommendations of the various tools are coherent and consistent, with the higher value tools providing more sophisticated analysis.  It's part of my efforts to co-ordinate patching improvements to improve our customers' patching experience.

\*xVM OC also utilizes the monthly EIS patch "baselines".

Same Patch Entitlement policy, new Patch Entitlement implementation

Solaris changed its business model a few years ago from selling Solaris and providing patches for free to a model of giving away the software releases for free and charging for patches. 

The policy is that patches delivering new security fixes will remain free to all customers, irrespective of whether or not they have a support contract, but most other patches require that customers have a valid support contract to access them.  (See my earlier blog entry on the subject.)

All fixes will be available for free in the next Solaris Update release (and OpenSolaris), so customers not willing to pay for a support contract can still get the fixes by installing or upgrading to the next Solaris Update release.  They'll just need to wait for it to ship.  Alternatively, they can use OpenSolaris.

This policy is not changing.

What is changing is the implementation of patch entitlement to ensure it matches the policy.  Currently, circa 60% of Solaris patches are free, including most of the key patches.  Under the new entitlement implementation, 18% of Solaris patches will remain free, including the specific revision of all Solaris patches which include new security fixes.  The rest will require a valid support contract to access. 

Any of the following support contracts will provide access to all Solaris patches and patch clusters: a Solaris subscription, a Software Support Contract, a Sun System Service Plan for Solaris, a Sun Spectrum Storage Plan, or a Sun Spectrum Enterprise Service Plan.  Since the names of the support contracts change from time-to-time, this list may change.

The new implementation will roll out in Phases, starting this month.  The roll-out should be transparent to customers with valid support contracts.

Patch signing certificate renewal

The signing certificate used to sign Sun patches expires shortly.  A new signing certificate will be rolled out in January and instructions provided on how to adopt it.

Customers who download the unsigned patch versions will not need to take any action.

"Accumulation-only" patches

The "SplitGate" source code management model we first introduced in Solaris 10 8/07 (Update 4) has dramatically improved Solaris 10 patch quality.  A side-effect of the "SplitGate" model is that base PatchIDs (the first 6 digits) change at the end of each Update release.  See my earlier Solaris 10 Kernel PatchID Sequence posting.

In the "SplitGate" model, when building an Update release, we effectively have two parallel source code gates, one called the Sustaining Gate containing just the bug fixes we need to release to customers in patches asynchronous to the Update release, and the other called the Update Gate containing a superset of the Sustaining Gate as well as new features and less critical bug fixes which will be released as part of the Update release.

The two gates remain separate (split) for the duration of the Update release build process.  Once the Update release has reached release quality, the Update Gate is promoted to become the new Sustaining Gate and the process repeats.  Since the Update Gate is always a strict superset of the Sustaining Gate, no regressions should result from the promotion of the Update Gate to become the new Sustaining Gate.  Each patch in the old Sustaining Gate is obsoleted by a corresponding patch from the Update Gate which has accumulated its contents.  When the Update is released, these new PatchIDs are released to SunSolve.  This is why you see the base PatchIDs changing after each Update release. 

If the Update Gate patch doesn't contain any additional code changes over the corresponding Sustaining Gate patch, then there's no need for customers to install the new Update Gate patch.  Such patches are called "accumulation-only" patches and can be identified as they have a different base PatchID (the first 6 digits) but don't contain any additional CR numbers over the Sustaining patch which they obsolete.
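That identification rule translates directly into a small check. The Python sketch below is illustrative only: the function names are my own, and the PatchIDs and CR numbers in the usage note are made up.

```python
def base_id(patch_id: str) -> str:
    """Return the base PatchID, i.e. the digits before the revision suffix."""
    return patch_id.split("-")[0]


def is_accumulation_only(new_id: str, new_crs: set,
                         old_id: str, old_crs: set) -> bool:
    """True if the new patch merely re-issues the patch it obsoletes.

    The base PatchID must differ, and the new patch must not list any CR
    number that the obsoleted Sustaining patch did not already contain.
    """
    return base_id(new_id) != base_id(old_id) and set(new_crs) <= set(old_crs)
```

With a hypothetical Sustaining patch 123456-05 containing CRs {6000001, 6000002}, a new patch 234567-01 listing the same two CRs is accumulation-only and can be skipped, whereas one that also lists 6000003 carries new code and is worth installing.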

The reason Sun releases these "accumulation-only" patches is because some customers insist that all of the PatchIDs pre-applied into a Solaris Update release image be also available from SunSolve.

About

This blog is to inform customers about patching best practice, feature enhancements, and key issues. The views expressed on this blog are my own and do not necessarily reflect the views of Oracle. The Documents contained within this site may include statements about Oracle's product development plans. Many factors can materially affect these plans and the nature and timing of future product releases. Accordingly, this Information is provided to you solely for information only, is not a commitment to deliver any material code, or functionality, and SHOULD NOT BE RELIED UPON IN MAKING PURCHASING DECISIONS. The development, release, and timing of any features or functionality described remains at the sole discretion of Oracle. THIS INFORMATION MAY NOT BE INCORPORATED INTO ANY CONTRACTUAL AGREEMENT WITH ORACLE OR ITS SUBSIDIARIES OR AFFILIATES. ORACLE SPECIFICALLY DISCLAIMS ANY LIABILITY WITH RESPECT TO THIS INFORMATION. ~~~~~~~~~~~~ Gerry Haskins, Director, Software Lifecycle Engineer
