Wednesday Dec 17, 2008

Definitive interpretation of the "rebootimmediate" and "reconfigimmediate" patch flags

The following is now available as Infodoc 249046:

What follows is an open letter to customers in response to customer confusion over how to handle the "rebootimmediate" and "reconfigimmediate" flags specified in some patches.

The READMEs of patch clusters which contain such patches clearly state that, during a patching session, a reboot is only required in exceptional and documented circumstances.  Nevertheless, it has come to my attention that some customers are initiating reboots after applying every single patch in a patch set which specifies such flags.  Not surprisingly, such customers are concerned at the length of time this takes.

Open Letter with definitive interpretation of the "rebootimmediate" and "reconfigimmediate" patch flags

To whom it may concern,

Summary: When patching a live boot environment, it is usually OK to apply any number of patches before performing a single reboot at the end, even if multiple patches specify "rebootimmediate" or "reconfigimmediate".  On the rare occasions when this is found not to be possible, specifically for 118833-36 (SPARC), 118855-36 (x86), and 118844-14+ (x86), code will typically be inserted into the relevant patches to prevent the application of further patches which could cause problems.  Use of Live Upgrade to patch an inactive boot environment is recommended as it avoids the need for interim reboots for even these atypical patches.  Details below.

The "reboot" metadata flags which may be contained in the patch 'pkginfo' file(s) have the following meaning:

rebootafter - a reboot is required to activate some of the content delivered in the patch, but the system remains in a consistent state until the reboot is performed.

reconfigafter - a reconfiguration reboot is required to activate some of the content in the patch, but the system remains in a consistent state until the reconfiguration reboot is performed.

rebootimmediate - the system is in a potentially inconsistent state until the system is rebooted.  The objects applied in the patch are potentially inconsistent with processes running in memory.  Normal production must not be resumed until a reboot takes place to bring the system back into a fully consistent state.  However, since the footprint of the patch utilities is relatively small, it is normally OK to continue to apply further patches before initiating the reboot.   In cases where this is not OK, the patch in question will typically contain additional code to prevent further patches from being applied until the reboot takes place\*.  Since the system is in a potentially inconsistent state, it's advisable to avoid running any additional processes until the reboot takes place.  If patch automation tools are being used to apply "rebootimmediate" or "reconfigimmediate" patches, it's up to the automation tools' QA to ensure that their additional code footprint does not hit the potential inconsistent system state when applying such patches.

reconfigimmediate - exactly the same as rebootimmediate, except a reconfiguration reboot is required.
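
As a rough illustration of how these four flags combine over a patching session, the sketch below reduces the flags of every patch applied to a single end-of-session action.  The function name `session_reboot_plan` and the keys of the returned dictionary are hypothetical and are not part of the Solaris patch utilities; only the four flag names come from the definitions above.  It assumes that one reconfiguration reboot also satisfies any plain reboot requirement, and it does not model the atypical patches which block further patching.

```python
# Hypothetical helper for illustration; not part of patchadd or any Solaris tool.
REBOOT_FLAGS = {
    "rebootafter",
    "reconfigafter",
    "rebootimmediate",
    "reconfigimmediate",
}


def session_reboot_plan(flags_per_patch):
    """Reduce the reboot flags of all patches applied in one session
    to a single action to take at the end of the session."""
    seen = set()
    for flags in flags_per_patch:
        seen.update(f for f in flags if f in REBOOT_FLAGS)

    if not seen:
        return {"reboot": "none", "hold_production": False}

    # Any reconfig* flag means the final reboot must be a reconfiguration reboot
    # (assumption: a reconfiguration reboot also covers a plain reboot).
    needs_reconfig = any(f.startswith("reconfig") for f in seen)
    # Any *immediate flag means the system is potentially inconsistent until
    # the reboot, so normal production should not resume before it.
    hold = any(f.endswith("immediate") for f in seen)

    return {
        "reboot": "reconfiguration" if needs_reconfig else "normal",
        "hold_production": hold,
    }
```

For example, `session_reboot_plan([["rebootafter"], ["reconfigimmediate"], []])` asks for one reconfiguration reboot at the end of the session, with production held until it completes.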

\*This is the case with Kernel patch 118833-36 (SPARC) / 118855-36 (x86), whose patch scripts replace 'patchadd' with a no-op telling the user to reboot the system.  The only other known reboot required before further patching can be done is specific to x86, and only if the system is running at a Kernel patch level below 118844-14.  A later revision of 118844, e.g. 118844-20, needs to be applied and the system rebooted to ensure the Kernel running in memory is compatible with library changes supplied in the libc patch 121208-02.  The prepatch script in 121208-02 and -03, and 118855-xx which obsoletes it, contains code to ensure 118844-14 or later is installed and active on the system.  (BTW, 118844-14 wasn't released. 118844-20 is recommended to fulfill the libc compatibility requirement.)
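
The "118844-14 or later" requirement is a comparison on the revision suffix of a patch ID.  The sketch below shows only that comparison; it is not the code in the prepatch script.  The function names are hypothetical, and the sketch deliberately ignores two things the real check has to deal with: whether the patch is active in the running Kernel, and patches which obsolete the required one.

```python
def parse_patch_id(patch_id):
    """Split a patch ID such as '118844-20' into (base, revision)."""
    base, sep, rev = patch_id.partition("-")
    if not sep or not base.isdigit() or not rev.isdigit():
        raise ValueError(f"not a patch ID: {patch_id!r}")
    return base, int(rev)


def meets_minimum(installed_ids, required):
    """Return True if any installed patch has the same base ID as
    `required` and an equal or later revision."""
    req_base, req_rev = parse_patch_id(required)
    for patch_id in installed_ids:
        base, rev = parse_patch_id(patch_id)
        if base == req_base and rev >= req_rev:
            return True
    return False
```

With this sketch, a system listing 118844-20 satisfies a requirement of 118844-14, while one listing only 118844-13 does not.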

UPDATE, Jan 20, 2009: Murphy's Law strikes again!  There's currently an issue, CR 6704883, with the "Sun Fibre Channel Device Drivers" patches 125184-05, -06, -07, and -08 (SPARC) and 125185-05, -06, -07, and -08 (x86) as described in Sun Alert 238630.  The fix for this issue is in rev-09 of the patches, which is currently available as a T-Patch and will be released shortly.  Rev-09 of the patches uses modloading in its prepatch script to avoid the issue.  In the meantime, a workaround is to apply the affected patches last, immediately prior to rebooting the system.  The patches in the Solaris 10 10/08 patch bundle were specifically ordered to avoid this issue.  Where such issues are found, SunAlerts are published and the issue fixed.
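
For anyone scripting their own patch order, the "apply the affected patches last" workaround amounts to a stable reordering of the patch list.  The following is a minimal sketch, not how the Solaris 10 10/08 patch bundle was actually ordered; the function name is hypothetical, and it assumes the list contains no other ordering dependencies that moving these patches would break.

```python
import re

# Revisions -05 to -08 of the two patch IDs named in the update above.
AFFECTED = re.compile(r"^(125184|125185)-0[5-8]$")


def apply_affected_last(patch_order):
    """Move the affected patches to the end of the list.

    sorted() is stable, so the relative order of all other patches
    (and of the affected patches among themselves) is preserved."""
    return sorted(patch_order, key=lambda patch_id: bool(AFFECTED.match(patch_id)))
```

Rev-09 does not match the pattern, so it stays where it was in the list.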

Remember, patches can be downloaded and installed individually.  Therefore, each patch which requires a reboot must specify the reboot requirements.  But if patches are installed collectively in the same patching session, for example, as part of a patch cluster, then the install instructions contained in the cluster README file take precedence - e.g. that reboots are only required \*during\* patching sessions for the specific cases mentioned above.

Since the above patches were created, a significant enhancement has been made to the Solaris patch utilities called Deferred Activation Patching.  This enhancement is not retrospective, so the above historical problematic patches remain.

Deferred Activation Patching

The problem with the above atypical patches is that the new code they deliver may be invoked by the original patchadd code and the utilities it calls \*during\* patch installation.  A patch may patch many packages.  The packages are applied in alphabetic order.  In a Zones environment, the patch is applied to the global zone first, then to each non-global zone.

In the case of 118833-36 (SPARC) / 118855-36 (x86), the new versions of libraries delivered in the patch could be invoked by patchadd and are potentially incompatible with the processes running in memory.

The solution devised in the patch scripts contained in 118833-36 (SPARC) / 118855-36 (x86) is to overlay mount the old objects on top of the newly laid down objects using the loopback filesystem (lofs).  This ensures that the system remains in a consistent state \*during\* the patch process as the old library versions which are compatible with what's running in memory will be called.

To prevent further patches which patch the same objects as 118833-36 (SPARC) / 118855-36 (x86) from patching the overlay mounted objects instead of the patched objects, 118833-36 (SPARC) / 118855-36 (x86) replace 'patchadd' with a no-op telling the customer to reboot the system before applying any further patches.

During reboot, the loopback filesystem mounts are torn down exposing the patched objects.  Further patching can now continue as the system is in a fully consistent state.
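
The sequence described above (lay down the new objects, overlay the old ones, block further patching, tear down the overlays at reboot) can be pictured with a toy model.  This is purely conceptual: it uses plain dictionaries rather than real lofs mounts, the class and method names are hypothetical, and the object path in the usage note is made up.

```python
class ToySystem:
    """Toy model of the overlay approach; no real mounts are involved."""

    def __init__(self, objects):
        self.on_disk = dict(objects)   # path -> version written to disk
        self.overlay = {}              # path -> old version mounted on top
        self.patchadd_enabled = True

    def apply_overlay_patch(self, new_objects):
        if not self.patchadd_enabled:
            raise RuntimeError("reboot required before applying further patches")
        for path, new_version in new_objects.items():
            if path in self.on_disk:
                # Keep the old version visible to running processes.
                self.overlay[path] = self.on_disk[path]
            self.on_disk[path] = new_version
        # Models 'patchadd' being replaced with a no-op until the reboot.
        self.patchadd_enabled = False

    def visible(self, path):
        """Version a process sees when it opens `path`."""
        if path in self.overlay:
            return self.overlay[path]
        return self.on_disk[path]

    def reboot(self):
        # Overlays are torn down, exposing the patched objects.
        self.overlay.clear()
        self.patchadd_enabled = True
```

After `apply_overlay_patch({"/lib/libexample.so": "new"})` on a system created with `{"/lib/libexample.so": "old"}`, `visible()` still returns the old version; only after `reboot()` does it return the new one.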

This loopback filesystem mount solution is the basis of Deferred Activation Patching.  After patch 118833-36 (SPARC) / 118855-36 (x86) was released, the solution was perfected and moved to the patch utilities.  The few patches which require application using Deferred Activation Patching specify the SUNW_PATCH_SAFE_MODE=true flag in their pkginfo files.  The solution was enhanced so that any subsequent patch applied prior to a reboot of the system, which patches the same objects as a patch explicitly specifying Deferred Activation Patching, will itself be automatically applied in Deferred Activation Patching mode.   This is known as implicit Deferred Activation Patching and enables other patches to be applied on top of a patch applied using Deferred Activation Patching without the need for an intervening reboot.  When a patch specifying Deferred Activation Patching mode is applied to a system, the user will see lots of loopback filesystem mounts on the system until such time as the reboot takes place.  Upon reboot, the loopback filesystem mounts are torn down, exposing the newly patched objects.
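
To make the explicit and implicit cases concrete, here is a minimal sketch of the decision.  It is not the patch utilities' implementation: the function names are hypothetical, the pkginfo parsing handles only simple KEY=value lines, and "objects patched in Deferred Activation Patching mode since the last reboot" is represented as a plain collection of paths supplied by the caller.

```python
def parse_pkginfo(text):
    """Parse simple KEY=value lines from the text of a pkginfo file."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        params[key.strip()] = value.strip().strip('"').strip("'")
    return params


def explicit_dap(pkginfo_texts):
    """True if any pkginfo file in the patch sets SUNW_PATCH_SAFE_MODE=true."""
    return any(
        parse_pkginfo(text).get("SUNW_PATCH_SAFE_MODE") == "true"
        for text in pkginfo_texts
    )


def use_dap(pkginfo_texts, patch_objects, overlaid_objects):
    """Decide whether a patch would be applied in Deferred Activation mode.

    overlaid_objects: paths already patched in this mode since the last reboot.
    """
    if explicit_dap(pkginfo_texts):
        return True
    # Implicit case: the patch touches an object that is already overlaid.
    return bool(set(patch_objects) & set(overlaid_objects))
```

A patch with no flag of its own still returns True from `use_dap` when one of its objects appears in `overlaid_objects`, which is the implicit case described above.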

Kernel patch 12001[12]-14 which is included in Solaris 10 8/07 (Update 4), Kernel patch 12712[78]-11 which is included in Solaris 10 5/08 (Update 5), and Kernel patch 13713[78]-09 which is included in Solaris 10 10/08 (Update 6), are currently the only patches which specify application in Deferred Activation Patching mode.  Future Kernel patches included in future Solaris 10 Update releases are the likely candidates to require application using Deferred Activation Patching.

With the introduction of Deferred Activation Patching, it is highly unlikely that future patches will require an interim reboot before further patches can be applied.

The problem of the system getting into an inconsistent state \*during\* patching (which Deferred Activation Patching resolves) can only occur when patching a live boot environment.  It arises when newly patched objects, which are incompatible with the processes running in memory, are invoked before the system is rebooted.

To avoid this and other issues, Sun strongly recommends the use of Live Upgrade to patch (or upgrade) an inactive boot environment, which dramatically reduces the risk and downtime associated with patching.  For example, even though Deferred Activation Patching resolves the inconsistency issue, patching a live boot environment takes time and the system is out of production.

Using Live Upgrade, the inactive boot environment is patched, potentially while the system is still in production.  Issues such as those described above with Kernel patch 118833-36 (SPARC) / 118855-36 (x86), and 118844-20 (x86) simply don't apply when patching an inactive boot environment, because there is no interaction between the objects being patched and the processes running in memory: all the calls patchadd makes are to the objects on the live partition, not to the patched objects on the inactive partition.  A single reboot is required to boot into the new boot environment.

Another advantage of Live Upgrade is that if a problem arises with the new boot environment for whatever reason, the user can simply reboot back into the old boot environment to enable production to resume and the issues with the now inactive boot environment can be resolved later.

Best Wishes,

Gerry Haskins
Director, Software Patch Services



