Potentially Problematic Solaris 10 patches

As mentioned in previous blog postings, when applying patches to a live boot environment, the Solaris 'patchadd' utility may end up invoking objects which it has just patched while installing the remainder of the patch or patches.  This can leave the system in an inconsistent state during patching, as the new objects may be incompatible with objects already loaded in memory.  This is especially true in a Zones environment, where 'patchadd' calls the Zones utilities in order to patch the non-global Zone(s) after patching the Global Zone.

This wasn't a problem in Solaris 8 or Solaris 9, as the amount of code change delivered in patches was limited.  But due to the large features included in Solaris 10 Updates, this became a problem for Solaris 10.

For example, a newly patched library may make a system call that passes 5 arguments while the old Kernel, not the newly applied Kernel, is still running.  If the old Kernel expects to receive only 3 arguments for that system call, problems are likely to result.

Deferred Activation Patching addresses this issue for new patches (see previous postings below).  However, it is not possible to retrofit Deferred Activation Patching into previously released patches, so customers must follow the Special Install Instructions listed in the patch READMEs of some older Solaris 10 patches to avoid such problems.

118844 (x86) Library compatibility

When patching a Solaris 10 x86 live boot environment, Kernel patch 118844-19 or higher must be active to ensure compatibility with library changes provided in subsequent patches.

Therefore, if you are patching an old Solaris 10 x86 system which is below this Kernel patch level, you will need to reboot the system after applying 118844-19 or higher. 
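The revision check described above can be sketched in a few lines of Python.  This is only an illustration: the function names are hypothetical, and the list of patch IDs is assumed to have been gathered by the reader and to reflect the patch level of the currently running Kernel.

```python
def parse_patch_id(patch_id):
    """Split an ID such as '118844-19' into (base, revision) integers."""
    base, _, rev = patch_id.strip().partition("-")
    return int(base), int(rev)


def kernel_patch_too_old(active_ids, base=118844, min_rev=19):
    """Return True if no revision of `base` at or above `min_rev` is present.

    `active_ids` is an assumed input: patch IDs active on the running system.
    """
    revs = [rev for b, rev in map(parse_patch_id, active_ids) if b == base]
    return max(revs, default=-1) < min_rev


# True means: apply 118844-19 or higher and reboot before patching further.
print(kernel_patch_too_old(["118844-14", "119255-06"]))  # True
print(kernel_patch_too_old(["118844-20"]))               # False
```

A result of True corresponds to the case above where the system must be rebooted after applying 118844-19 or higher.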

For example, 118844-20 is included in the Solaris 10 x86 Recommended and Sun Alert Clusters on SunSolve to fulfill this purpose.  See the CLUSTER_README files for further information.

118844 (x86) NewBoot (GRUB)

Solaris 10 x86 Kernel patch 118844-21 includes the GNU GRand Unified Bootloader (GRUB) architecture.

Please follow the appropriate system specific instructions specified in SunAlert http://sunsolve.sun.com/search/document.do?assetkey=1-26-102087-1 . 

Failure to follow these instructions might result in the system failing to boot.

118833-36 (SPARC) / 118855-36 (x86)

118833-36 (SPARC) and 118855-36 (x86) are the Solaris 10 Kernel patches associated with the Solaris 10 11/06 (Update 3) release.  They contain significant amounts of code change.

To avoid the system getting into an inconsistent state during patching, lofs loopback mounts are used in the patch's scripts to mount the original objects over the newly installed objects.  This ensures that any objects invoked by the patch utilities during patch application will be the old versions and will be consistent with processes loaded in memory.  Upon reboot, the lofs mounts get torn down, exposing the patched objects for use.  As a safety precaution, code in the patch's scripts overlay mounts the patch utilities with a no-op object to ensure the system is rebooted before further patches can be applied.  This is to prevent subsequent patches from potentially patching the lofs mounted objects rather than the underlying patched objects.

This concept of utilizing lofs loopback file system mounts was later formalized as Deferred Activation Patching, which is now implemented in the Solaris patch utilities patch.  Deferred Activation Patching is described in a previous posting below.  In the Deferred Activation Patching implementation, the solution has been enhanced to enable further patches to be applied without having to first reboot.  This is achieved by applying all subsequent patches affecting the same objects in implicit Deferred Activation Patching mode (i.e. they are also applied utilizing loopback mounts).

Customers are strongly advised to read and follow the Special Install Instructions in the README files in patches 118833-36 (SPARC) and 118855-36 (x86).  These are the most complex patches which Sun has ever released.

When is an obsoleted patch not fully obsolete?

Answer: When removing it from a patch set would introduce a circular dependency between the remaining patches. 

A circular dependency is where Patch A requires Patch B to be installed first and Patch B requires Patch A to be installed first.  Catch 22.  Neither patch can be installed.

The situation is obvious where only 2 patches are involved, but it more typically arises where 3 or more patches are involved, as newer patches accumulate and obsolete older patches.

For example, if Patch B requires Patch C, and Patch A requires Patch B and Patch A also obsoletes patch C, then if all three patches are present, the following is a valid install order:

   Patch C

   Patch B

   Patch A

However, if Patch C is removed from the patch set because it has been obsoleted and therefore considered no longer needed (it has been accumulated by Patch A so all dependencies on Patch C would normally be resolved by installing Patch A), then there's no valid install order as Patch A requires Patch B to be installed first and Patch B requires Patch A to be installed first.
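The Patch A / B / C example above can be reproduced with a small ordering sketch.  This is not how 'patchadd' is implemented; it simply assumes a tool that first discards obsoleted patches, redirects any requirement on them to the obsoleting patch, and then looks for an install order.  The function names are made up for illustration.

```python
def install_order(requires):
    """Return a valid install order, or None if the requirements are circular.

    `requires` maps each patch to the set of patches that must go on first.
    """
    remaining = {p: set(deps) for p, deps in requires.items()}
    order = []
    while remaining:
        ready = sorted(p for p, deps in remaining.items() if not deps)
        if not ready:
            return None  # every remaining patch is waiting on another one
        for p in ready:
            order.append(p)
            del remaining[p]
        for deps in remaining.values():
            deps.difference_update(ready)
    return order


def drop_obsolete(requires, obsoletes):
    """Discard obsoleted patches; requirements on them point at the obsoleter."""
    replaced_by = {old: new for new, olds in obsoletes.items() for old in olds}
    return {
        p: {replaced_by.get(d, d) for d in deps} - {p}
        for p, deps in requires.items()
        if p not in replaced_by
    }


requires = {"A": {"B"}, "B": {"C"}, "C": set()}
obsoletes = {"A": {"C"}}

print(install_order(requires))                            # ['C', 'B', 'A']
print(install_order(drop_obsolete(requires, obsoletes)))  # None
```

With all three patches present the order C, B, A is found.  Once Patch C is discarded, Patch B's requirement resolves to Patch A while Patch A still requires Patch B, so no order exists.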

Unfortunately, the first thing almost any patch tool does when processing patches is to discard obsolete patches.  This is true of 'patchadd -M' and most higher level patch automation tools.

Normally, this isn't a problem, as audits are in place to catch such issues during the patch creation and test processes.  Typically, if such a patch relationship exists, Patches A and B would be accumulated together into a single patch.

However, in Solaris 10, this wasn't possible with Zones patch 122660-10 (SPARC) / 122661-08 (x86) as outlined below. 

While this situation has only occurred once, and every effort will be made to prevent recurrences, there's no guarantee that similar complex patch relationships won't arise in the future.  Therefore, patch automation tools and home-grown customer patching processes may need to cope with situations where "obsoleted" patches may still need to be installed.

Zones patch 122660-10 (SPARC) / 122661-08 (x86) 

Zones patch 122660-10 (SPARC) and 122661-08 (x86) must be applied before Kernel patch 120011-14 (SPARC) and 120012-14 (x86) can be applied due to CR 6471974 "zoneadm mount mishandles shared file systems".

CR 6471974 is fixed in 122660-10 (SPARC) / 122661-08 (x86).

Customers with Zones environments where non-global zones include a non-IPD entry that references a file system shared between two boot environments are affected by CR 6471974 unless they install 122660-10 (SPARC) / 122661-08 (x86).

Such customers cannot alt mount their zones.

On SPARC, 122660-10 is also obsoleted by 120011-14 and on x86, 122661-08 is obsoleted by 120012-14.

The problem is that we need the fix for CR 6471974 in place before Kernel patch 120011-14 (SPARC) / 120012-14 (x86) can be applied to such systems.

So we need the fix which is contained in the Kernel patch which has accumulated and obsoleted the zones patch before we can add the Kernel patch.  Catch 22. 

We could not fix the problem by checking for the presence of the Zones patch in the prepatch script of Kernel patch 120011-14 (SPARC) / 120012-14 (x86) as patchadd would fail to alt mount the zones, and therefore never get as far as running the Kernel prepatch script.

Also, we could not have the Kernel patch require the Zones patch 122660-10 / 122661-08 directly as it already accumulated it.

To fix the problem, we had to take the unusual step of creating a "zones indirection" patch, 125547-02 (SPARC) and 125548-02 (x86), which includes a prepatch script to check that the Zones patch is installed on the target system and exit gracefully if it is not.  120011-14 (SPARC) / 120012-14 (x86) require the respective "zones indirection" patch, which in turn ensures Zones patch 122660-10 (SPARC) / 122661-08 (x86) is installed.
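The effect of the indirection patch can be modelled roughly as follows, using the SPARC patch IDs.  This is a simplified sketch, not the actual prepatch script: it matches exact patch IDs rather than "this revision or higher", and it treats the indirection patch as having no formal requirements of its own, which is an assumption made for the example.

```python
# Formal "requires" relationships (simplified model, SPARC IDs from the text).
REQUIRES = {
    "120011-14": ["125547-02"],  # Kernel patch requires the indirection patch
    "125547-02": [],             # assumed: only its prepatch check matters
    "122660-10": [],
}
# What the indirection patch's prepatch script looks for on the target.
PREPATCH_NEEDS = {"125547-02": "122660-10"}


def apply_patches(patch_list, installed=()):
    """Apply patches in order; stop at the first one that cannot go on."""
    installed = list(installed)
    for patch in patch_list:
        missing = [r for r in REQUIRES.get(patch, []) if r not in installed]
        needed = PREPATCH_NEEDS.get(patch)
        if needed and needed not in installed:
            missing.append(needed)
        if missing:
            return installed, f"{patch} not applied: missing {', '.join(missing)}"
        installed.append(patch)
    return installed, "ok"


# Zones patch dropped from the list: patching stops before the Kernel patch.
print(apply_patches(["125547-02", "120011-14"]))
# Zones patch installed first: everything applies.
print(apply_patches(["122660-10", "125547-02", "120011-14"]))
```

In the first call the run stops at 125547-02, so the Kernel patch is never attempted on a system that lacks the Zones patch.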

122660-10 (SPARC) and 122661-08 (x86) are included in the Solaris 10 SPARC and x86 Recommended and Sun Alert patch clusters available from SunSolve and are automatically installed by the cluster_install script.

However, if a customer is not using the patch clusters and instead provides a patch list including 122660-10 (SPARC) or 122661-08 (x86) to 'patchadd -M' to install, it will be discarded by patchadd during its patch install ordering process as it recognizes that these patches are obsolete and hence assumes they are no longer needed.

Many higher level patch automation tools are likely to make the same assumption.  A number of them have been modified to correctly handle this situation.

If a customer encounters this issue, simply install 122660-10 (SPARC) or 122661-08 (x86) separately, and then apply the rest of the patch set as normal.

Comments:

What a circus act. The README for 118833-36 is embarrassing. Stop ignoring your non-deterministic patching process by telling your customers to use live upgrade because live upgrade is as much a disaster as the rest of Solaris 10. Solaris 10 truly is an alpha-quality pre-rerelease of Solaris 11.

Posted by conzyor34 on March 06, 2008 at 01:31 PM GMT #

118833-36 (SPARC) / 118855-36 (x86) was somewhat less than fun. On a box with 20 or zones, it takes forever to patch in single user mode because of having to start and stop each zone twice. The usual way of making it less painful is keeping all the zones booted in single user as well saving the overhead. With this patch that didn't work. Only after shutting down all the zones did the patch apply. With pretty much every other patch depending on this one, it kept us out of sync for much too long.

Posted by Mads on March 07, 2008 at 12:14 AM GMT #

About

This blog is to inform customers about patching best practice, feature enhancements, and key issues. The views expressed on this blog are my own and do not necessarily reflect the views of Oracle. The Documents contained within this site may include statements about Oracle's product development plans. Many factors can materially affect these plans and the nature and timing of future product releases. Accordingly, this Information is provided to you solely for information only, is not a commitment to deliver any material code, or functionality, and SHOULD NOT BE RELIED UPON IN MAKING PURCHASING DECISIONS. The development, release, and timing of any features or functionality described remains at the sole discretion of Oracle. THIS INFORMATION MAY NOT BE INCORPORATED INTO ANY CONTRACTUAL AGREEMENT WITH ORACLE OR ITS SUBSIDIARIES OR AFFILIATES. ORACLE SPECIFICALLY DISCLAIMS ANY LIABILITY WITH RESPECT TO THIS INFORMATION. ~~~~~~~~~~~~ Gerry Haskins, Director, Software Lifecycle Engineer
