Thursday Dec 20, 2012

pkg fix is my friend - a followup

We bloggers appreciate questions and comments about what we post, whether privately in email or attached as comments to some article. In my last post, a reader asked a set of questions that were so good, I didn't want them to get lost down in the comments section. A big thanks to David Lange for asking these questions. I shall try to answer them here (perhaps with a bit more detail than you might have wanted).

Does the pkg fix reinstall binaries if the hash or chksum doesn't match?

Yes, it does. Let's actually see this in action, and then we will take a look at where it is getting the information required to correct the error.

Since I'm working on a series of Solaris 11 Automated Installer (AI) How To articles, installadm seems a good choice to damage, courtesy of the random number generator.

# ls /sbin/install*
/sbin/install             /sbin/installadm-convert  /sbin/installf
/sbin/installadm          /sbin/installboot         /sbin/installgrub

# cd /sbin
# mv installadm installadm-

# dd if=/dev/random of=/sbin/installadm bs=8192 count=32
0+32 records in
0+32 records out

# ls -la installadm*
-rw-r--r--   1 root     root       33280 Dec 18 18:50 installadm
-r-xr-xr-x   1 root     bin        12126 Dec 17 08:36 installadm-
-r-xr-xr-x   1 root     bin        74910 Dec 17 08:36 installadm-convert
OK, that should do it. Unless I am terribly unlucky, those random bytes will produce something that doesn't match the stored hash value of the installadm binary.
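If the file size looks odd given bs=8192 and count=32, note that dd reported "0+32 records in", meaning all 32 reads came back partial. A quick sketch of the arithmetic, using only the numbers from the dd and ls output above (the variable names are mine):

```shell
# Numbers taken from the dd and ls output above.
requested=$((8192 * 32))       # bytes asked for: bs * count
delivered=33280                # size of the damaged file according to ls
per_read=$((delivered / 32))   # bytes actually returned by each partial read

echo "requested=$requested delivered=$delivered per_read=$per_read"
# prints: requested=262144 delivered=33280 per_read=1040
```

Either way, 33280 random bytes are more than enough to break the hash.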

This time, I will begin the repair process with a pkg verify, just to see what is broken.

# pkg verify installadm
PACKAGE                                                                 STATUS 
pkg://solaris/install/installadm                                         ERROR

	file: usr/sbin/installadm
		Group: 'root (0)' should be 'bin (2)'
		Mode: 0644 should be 0555
		Size: 33280 bytes should be 12126
		Hash: 2e862c7ebd5dce82ffd1b30c666364f23e9118b5 
                     should be 68374d71b9cb91b458a49ec104f95438c9a149a7
For clarity, I have removed all of the compiled Python module errors from the output. Most of these have been corrected in Solaris 11.1, but you may still see them occasionally when running pkg verify.

Since we have a real package error, let's correct it.

# pkg fix installadm
Verifying: pkg://solaris/install/installadm                     ERROR          

	file: usr/sbin/installadm
		Group: 'root (0)' should be 'bin (2)'
		Mode: 0644 should be 0555
		Size: 33280 bytes should be 12126
		Hash: 2e862c7ebd5dce82ffd1b30c666364f23e9118b5 
                     should be 68374d71b9cb91b458a49ec104f95438c9a149a7
Created ZFS snapshot: 2012-12-19-00:51:00
Repairing: pkg://solaris/install/installadm                  
                                                                               

DOWNLOAD                                  PKGS       FILES    XFER (MB)
Completed                                  1/1       24/24      0.1/0.1

PHASE                                        ACTIONS
Update Phase                                   24/24 

PHASE                                          ITEMS
Image State Update Phase                         2/2 
We can now run installadm as if it had never been damaged.
# installadm list

Service Name     Alias Of       Status  Arch   Image Path 
------------     --------       ------  ----   ---------- 
default-i386     solaris11-i386 on      x86    /install/solaris11-i386
solaris11-i386   -              on      x86    /install/solaris11-i386
solaris11u1-i386 -              on      x86    /install/solaris11u1-i386
Oh, if you are wondering about that hash, it is a SHA-1 checksum.
# digest -a sha1 /usr/sbin/installadm
68374d71b9cb91b458a49ec104f95438c9a149a7
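The same comparison can be sketched by hand. This is only an illustration, not how IPS does it internally. The helper names sha1_of and check_sha1 are hypothetical, and the fallbacks to sha1sum or openssl are there only for systems that lack the Solaris digest command.

```shell
# Print the SHA-1 of a file using whichever tool is available.
sha1_of() {
    if command -v digest >/dev/null 2>&1; then
        digest -a sha1 "$1"                          # Solaris
    elif command -v sha1sum >/dev/null 2>&1; then
        sha1sum "$1" | awk '{print $1}'              # GNU coreutils
    else
        openssl dgst -sha1 "$1" | awk '{print $NF}'  # last resort
    fi
}

# Compare a file against an expected hash; returns 0 only on a match.
check_sha1() {
    actual=$(sha1_of "$1")
    if [ "$actual" = "$2" ]; then
        echo "OK: $1"
    else
        echo "MISMATCH: $1 is $actual, should be $2"
        return 1
    fi
}
```

For example, check_sha1 /usr/sbin/installadm 68374d71b9cb91b458a49ec104f95438c9a149a7 should report OK on the repaired system shown above.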

If so does IPS keep the installation binaries in a depot or have to point to the originating depot to fix the problem?

IPS does keep a local cache of package attributes. Before diving into the details, be aware that some, if not all, of them are private to the current implementation of IPS and can change in the future. Always consult the command and configuration file man pages before relying on any of this in scripts. In this case, the relevant information is in pkg(5) (i.e. man -s 5 pkg).

Our first step is to identify which publisher has provided the package that is currently installed. In my case, there is only one (solaris), but in a large and mature enterprise deployment, there could be many publishers.

# pkg info installadm
          Name: install/installadm
       Summary: installadm utility
   Description: Automatic Installation Server Setup Tools
      Category: System/Administration and Configuration
         State: Installed
     Publisher: solaris
       Version: 0.5.11
 Build Release: 5.11
        Branch: 0.175.0.0.0.2.1482
Packaging Date: October 19, 2011 12:26:24 PM 
          Size: 1.04 MB
          FMRI: pkg://solaris/install/installadm@0.5.11,5.11-0.175.0.0.0.2.1482:20111019T122624Z
From this we have learned that the actual package name is install/installadm and the publisher is, in fact, solaris. We have also learned that this version of installadm comes from the original Solaris 11 GA release (5.11-0.175.0.0). That lets us go take a look at some of the configuration files (private interface warning still in effect).

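Note: Package names contain slashes (/), which the file system would treat as directory delimiters, so IPS stores them URL-encoded on disk, with each slash written as %2F. The comma and colon in a version string are encoded the same way, as %2C and %3A.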
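As a rough illustration of that encoding, here is a sketch that handles only the characters that show up in these names (%, /, comma and colon). The function name ips_encode is hypothetical, and this is not a complete URL encoder.

```shell
# Percent-encode the characters that appear in IPS package names and
# versions: '%' first (so it is not double-encoded), then '/', ',' and ':'.
ips_encode() {
    printf '%s\n' "$1" |
        sed -e 's/%/%25/g' -e 's|/|%2F|g' -e 's/,/%2C/g' -e 's/:/%3A/g'
}

ips_encode "install/installadm"
# prints: install%2Finstalladm
ips_encode "0.5.11,5.11-0.175.0.0.0.2.1482:20111019T122624Z"
# prints: 0.5.11%2C5.11-0.175.0.0.0.2.1482%3A20111019T122624Z
```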

# cd /var/pkg/publisher/solaris/pkg/install%2Finstalladm
# ls -la
drwxr-xr-x   2 root     root           4 Dec 18 00:55 .
drwxr-xr-x 818 root     root         818 Dec 17 08:36 ..
-rw-r--r--   1 root     root       25959 Dec 17 08:36
            0.5.11%2C5.11-0.175.0.0.0.2.1482%3A20111019T122624Z
-rw-r--r--   1 root     root       26171 Dec 18 00:55
            0.5.11%2C5.11-0.175.0.13.0.3.0%3A20121026T213106Z
The file 0.5.11%2C5.11-0.175.0.0.0.2.1482%3A20111019T122624Z is the one we are interested in, since it matches the FMRI of the installed version.
# digest -a sha1 /usr/sbin/installadm
68374d71b9cb91b458a49ec104f95438c9a149a7

# grep 68374d71b9cb91b458a49ec104f95438c9a149a7 *
file 68374d71b9cb91b458a49ec104f95438c9a149a7
chash=a5c14d2f8cc854dbd4fa15c3121deca6fca64515 group=bin mode=0555 
owner=root path=usr/sbin/installadm pkg.csize=3194 pkg.size=12126

That's how IPS knows our version of installadm has been tampered with. Because the damage goes beyond file attributes, IPS has to download a new copy of the damaged files, in this case from the solaris publisher (or one of its mirrors). To keep from making things worse, it also takes a snapshot of the current boot environment first, in case things go terribly wrong - which they do not.
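To make the comparison concrete, here is a small sketch that pulls individual attributes out of a file action line like the one grep returned. The helper name action_attr is hypothetical, and it assumes simple key=value pairs with no embedded spaces, which holds for this line but not for every action in a manifest.

```shell
# Print the value of attribute $2 (a key=value pair) from the action line in $1.
action_attr() {
    printf '%s\n' "$1" | tr ' ' '\n' | sed -n "s/^$2=//p"
}

# Action line copied from the manifest output above, joined onto one line.
action='file 68374d71b9cb91b458a49ec104f95438c9a149a7 chash=a5c14d2f8cc854dbd4fa15c3121deca6fca64515 group=bin mode=0555 owner=root path=usr/sbin/installadm pkg.csize=3194 pkg.size=12126'

action_attr "$action" group      # prints: bin
action_attr "$action" mode       # prints: 0555
action_attr "$action" pkg.size   # prints: 12126
```

These are the "should be" values that pkg verify reported for group, mode and size.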

Armed with this information, we can use some other IPS features, such as searching by binary hash.

# pkg search -r 68374d71b9cb91b458a49ec104f95438c9a149a7
INDEX                                    ACTION VALUE               PACKAGE
68374d71b9cb91b458a49ec104f95438c9a149a7 file   usr/sbin/installadm 
                 pkg:/install/installadm@0.5.11-0.175.0.0.0.2.1482
... or by name
# pkg search -r installadm
INDEX       ACTION VALUE                      PACKAGE
basename    dir    usr/lib/installadm         pkg:/install/installadm@0.5.11-0.175.0.0.0.2.1482
basename    dir    var/installadm             pkg:/install/installadm@0.5.11-0.175.0.0.0.2.1482
basename    file   usr/sbin/installadm        pkg:/install/installadm@0.5.11-0.175.0.0.0.2.1482
pkg.fmri    set    solaris/install/installadm pkg:/install/installadm@0.5.11-0.175.0.0.0.2.1482
pkg.summary set    installadm utility         pkg:/install/installadm@0.5.11-0.175.0.0.0.2.1482
And finally...
# pkg contents -m installadm

..... lots of output truncated ......

file 68374d71b9cb91b458a49ec104f95438c9a149a7 chash=a5c14d2f8cc854dbd4fa15c3121deca6fca64515 
group=bin mode=0555 owner=root path=usr/sbin/installadm pkg.csize=3194 pkg.size=12126
There is our information, this time from a public and stable interface. Now you know not only where IPS caches the information, but also a predictable way to retrieve it, should you ever need to do so.

As with the verify and fix operations, this is much more helpful than the SVR4 packaging commands in Solaris 10 and earlier.

Given that customers might come up with their own ideas of keeping pkgs at various levels, could they be shooting themselves in the foot and creating such a customized OS that it causes problems?

Stephen Hahn has written quite a bit on the origins of IPS, both on his archived Sun blog as well as on the OpenSolaris pkg project page. While it is a fascinating and useful read, the short answer is that IPS helps prevent this from happening - certainly much more so than with the previous packaging system.

The assistance comes in several ways.

Full packages: Since IPS delivers full packages only, that eliminates one of the most confusing and frustrating aspects of the legacy Solaris packaging system. Every time you update a package with IPS, you get a complete version of the software, the way it was assembled and tested at Oracle (and presumably other publishers as well). No more patch order files and, perhaps more important, no more complicated scripts to automate the patching process.

Dependencies: A rich dependency mechanism allows the package maintainer to guarantee that other related software is at a compatible version. This includes incorporations, which protect large groups of software, such as the basic desktop, GNOME, auto-install and the userland tools. Although not a part of dependencies, facets allow for the control of optional software components - locales being a good example.

Boot environments: Solaris 10 system administrators can enjoy many of the benefits of IPS boot environment integration by using Live Upgrade and ZFS as a root file system. IPS takes this to the next level by automatically performing important operations, such as upgrading the pkg package when needed or taking a snapshot before performing any risky actions.

Expanding your question just a bit, IPS provides one new capability that should make updates much more predictable. If there is some specific component that an application requires, its version can be locked within a range. Here is an example, albeit a rather contrived one.

# pkg list -af jre-6
NAME (PUBLISHER)                                  VERSION                    IFO
runtime/java/jre-6                                1.6.0.37-0.175.1.2.0.3.0   ---
runtime/java/jre-6                                1.6.0.35-0.175.1.0.0.24.1  ---
runtime/java/jre-6                                1.6.0.35-0.175.0.11.0.4.0  ---
runtime/java/jre-6                                1.6.0.33-0.175.0.10.0.2.0  ---
runtime/java/jre-6                                1.6.0.33-0.175.0.9.0.2.0   ---
runtime/java/jre-6                                1.6.0.32-0.175.0.8.0.4.0   ---
runtime/java/jre-6                                1.6.0.0-0.175.0.0.0.2.0    i--
Suppose that we have an application that is tied to version 1.6.0.0 of the Java runtime. You can freeze it at that version and IPS will refuse any update that would change it. In this example, an attempt to upgrade to SRU8 (which introduces version 1.6.0.32 of jre-6) will fail.
# pkg freeze -c "way cool demonstration of IPS" jre-6@1.6.0.0
runtime/java/jre-6 was frozen at 1.6.0.0

# pkg list -af jre-6
NAME (PUBLISHER)                                  VERSION                    IFO
runtime/java/jre-6                                1.6.0.37-0.175.1.2.0.3.0   ---
runtime/java/jre-6                                1.6.0.35-0.175.1.0.0.24.1  ---
runtime/java/jre-6                                1.6.0.35-0.175.0.11.0.4.0  ---
runtime/java/jre-6                                1.6.0.33-0.175.0.10.0.2.0  ---
runtime/java/jre-6                                1.6.0.33-0.175.0.9.0.2.0   ---
runtime/java/jre-6                                1.6.0.32-0.175.0.8.0.4.0   ---
runtime/java/jre-6                                1.6.0.0-0.175.0.0.0.2.0    if-

# pkg update --be-name s11ga-sru08  entire@0.5.11-0.175.0.8
What follows is a lengthy set of complaints about not being able to satisfy all of the constraints, conveniently pointing back to our frozen package.

But wait, there's more. IPS can figure out the latest update it can apply that satisfies the frozen package constraint. In this example, it should find SRU7.

# pkg update --be-name s11ga-sru07
            Packages to update:  89
       Create boot environment: Yes
Create backup boot environment:  No

DOWNLOAD                                  PKGS       FILES    XFER (MB)
Completed                                89/89   3909/3909  135.7/135.7

PHASE                                        ACTIONS
Removal Phase                                720/720 
Install Phase                                889/889 
Update Phase                               5066/5066 

PHASE                                          ITEMS
Package State Update Phase                   178/178 
Package Cache Update Phase                     89/89 
Image State Update Phase                         2/2 

A clone of solaris exists and has been updated and activated.
On the next boot the Boot Environment s11ga-sru07 will be
mounted on '/'.  Reboot when ready to switch to this updated BE.


---------------------------------------------------------------------------
NOTE: Please review release notes posted at:

http://www.oracle.com/pls/topic/lookup?ctx=E23824&id=SERNS
---------------------------------------------------------------------------
When the system is rebooted, a quick look shows that we are indeed running with SRU7.

Perhaps we were too restrictive in locking down jre-6 to version 1.6.0.0. In this example, we will loosen the constraint to any 1.6.0 version, but prohibit upgrades that change it to 1.6.1. Note that I did not have to unfreeze the package as a new pkg freeze will replace the preceding one.

# pkg freeze jre-6@1.6.0
runtime/java/jre-6 was frozen at 1.6.0

# pkg list -af jre-6
NAME (PUBLISHER)                                  VERSION                    IFO
runtime/java/jre-6                                1.6.0.37-0.175.1.2.0.3.0   -f-
runtime/java/jre-6                                1.6.0.35-0.175.1.0.0.24.1  -f-
runtime/java/jre-6                                1.6.0.35-0.175.0.11.0.4.0  -f-
runtime/java/jre-6                                1.6.0.33-0.175.0.10.0.2.0  -f-
runtime/java/jre-6                                1.6.0.33-0.175.0.9.0.2.0   -f-
runtime/java/jre-6                                1.6.0.32-0.175.0.8.0.4.0   -f-
runtime/java/jre-6                                1.6.0.0-0.175.0.0.0.2.0    if-
This shows that all versions are available for upgrade (i.e., they all satisfy the frozen package constraint).
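The matching rule can be approximated as a prefix test on the dot-separated version: a freeze at 1.6.0 accepts 1.6.0.anything but not 1.6.1. The sketch below only illustrates that idea and is not IPS's implementation; satisfies_freeze is a hypothetical helper name.

```shell
# Return 0 if package version $1 satisfies a freeze at $2, i.e. $2 is a
# dot-separated prefix of the component version of $1.
satisfies_freeze() {
    ver=${1%%-*}      # drop the branch: 1.6.0.37-0.175.1.2.0.3.0 -> 1.6.0.37
    case "$ver" in
        "$2" | "$2".*) return 0 ;;
        *)             return 1 ;;
    esac
}

satisfies_freeze 1.6.0.37-0.175.1.2.0.3.0 1.6.0   && echo "1.6.0.37 allowed by 1.6.0"
satisfies_freeze 1.6.0.32-0.175.0.8.0.4.0 1.6.0.0 || echo "1.6.0.32 blocked by 1.6.0.0"
```

The second call mirrors the earlier failure: with the freeze at 1.6.0.0, SRU8's 1.6.0.32 does not qualify.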

Once again, IPS gives us a wonderful capability that is missing in the legacy packaging system.

When you perform a pkg update on a system are we guaranteed a highly tested configuration that has gone thru multiple regression tests?

Short answer: yes.

For the details, I will turn your attention to our friend, Gerry Haskins, and his two blogs: The Patch Corner (Solaris 10 and earlier) and Solaris 11 Maintenance Lifecycle. Both are excellent reads and I encourage everybody to add them to their RSS reader of choice.

Of particular note is Gerry's presentation, Solaris 11 Customer Maintenance Lifecycle, which goes into some great detail about patches, upgrades and the like. If you dig back to around the time that Solaris 10 9/10 (u9) was released, you will find links to a pair of interesting documents titled Oracle Integrated Stack - Complete, Trusted Enterprise Solutions and Trust Your Enterprise Deployments to the Oracle Product Stack: The integrated platform that's been developed, tested and certified to get the job done. These documents describe several test environments, including the Oracle Certification Environment (OCE) and Oracle Automated Stress Test (OAST). All Solaris 10 patches and Solaris 11 package updates (including Oracle Solaris Cluster) are put through these tests prior to release. The result is a higher confidence that patches will not introduce stability or performance problems, negating the old practice of putting a release or patch bundle on the shelf while somebody else finds all of the problems. Local testing on your own equipment is still a necessary practice, but you are able to move more quickly to a new release thanks to these additional testing environments.

If I am allowed to ask a follow up question, it would be something like, "what can I do proactively to keep my system as current as possible and reduce the risks of bad patch or package interactions?"

That is where the Critical Patch Updates (CPU) come into play. Solaris 11 Support Repository Updates (SRU) come out approximately once per month. Every third one (generally) is special and becomes the CPU for Solaris. If you have a regular cadence for applying CPUs or Patch Set Updates (PSU) for your other Oracle software, choose the corresponding SRU that has been designated as that quarter's CPU. You can find this information in My Oracle Support (MOS), on the Oracle Technology Network (OTN), or just read Gerry's blog in mid January, April, July and October.

Thanks again to David Lange for asking such good questions. I hope the answers helped.

Tuesday Dec 11, 2012

Solaris 11 pkg fix is my new friend

While putting together some examples of the Solaris 11 Automated Installer (AI), I managed to really mess up my system, to the point where AI was completely unusable. This was my fault: a combination of unfortunate incidents left some remnants that were causing problems, so I tried to clean things up. Unsuccessfully. Perhaps that was a bad idea (OK, it was a terrible idea), but this is Solaris 11 and there are a few more tricks in the sysadmin toolbox.

Here's what I did.

# rm -rf /install/*
# rm -rf /var/ai

# installadm create-service -n solaris11-x86 --imagepath /install/solaris11-x86 \
                 -s solaris-auto-install@5.11-0.175.0

Warning: Service svc:/network/dns/multicast:default is not online.
   Installation services will not be advertised via multicast DNS.

Creating service from: solaris-auto-install@5.11-0.175.0
DOWNLOAD                                PKGS         FILES    XFER (MB)   SPEED
Completed                                1/1       130/130  264.4/264.4    0B/s

PHASE                                          ITEMS
Installing new actions                       284/284
Updating package state database                 Done 
Updating image state                            Done 
Creating fast lookup database                   Done 
Reading search index                            Done 
Updating search index                            1/1 

Creating i386 service: solaris11-x86

Image path: /install/solaris11-x86
So far so good. Then comes an oops.....
setup-service[168]: cd: /var/ai//service/.conf-templ: [No such file or directory]
                                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This is where you generally say a few things to yourself, and then promise to quit deleting configuration files and directories when you don't know what you are doing. Then you recall that the new Solaris 11 packaging system has some ability to correct common mistakes (like the one I just made). Let's give it a try.
# pkg fix installadm
Verifying: pkg://solaris/install/installadm                     ERROR
        dir: var/ai
                Group: 'root (0)' should be 'sys (3)'
        dir: var/ai/ai-webserver
                Missing: directory does not exist
        dir: var/ai/ai-webserver/compatibility-configuration
                Missing: directory does not exist
        dir: var/ai/ai-webserver/conf.d
                Missing: directory does not exist
        dir: var/ai/image-server
                Group: 'root (0)' should be 'sys (3)'
        dir: var/ai/image-server/cgi-bin
                Missing: directory does not exist
        dir: var/ai/image-server/images
                Group: 'root (0)' should be 'sys (3)'
        dir: var/ai/image-server/logs
                Missing: directory does not exist
        dir: var/ai/profile
                Missing: directory does not exist
        dir: var/ai/service
                Group: 'root (0)' should be 'sys (3)'
        dir: var/ai/service/.conf-templ
                Missing: directory does not exist
        dir: var/ai/service/.conf-templ/AI_data
                Missing: directory does not exist
        dir: var/ai/service/.conf-templ/AI_files
                Missing: directory does not exist
        file: var/ai/ai-webserver/ai-httpd-templ.conf
                Missing: regular file does not exist
        file: var/ai/service/.conf-templ/AI.db
                Missing: regular file does not exist
        file: var/ai/image-server/cgi-bin/cgi_get_manifest.py
                Missing: regular file does not exist
Created ZFS snapshot: 2012-12-11-21:09:53
Repairing: pkg://solaris/install/installadm                  
Creating Plan (Evaluating mediators): |

DOWNLOAD                                PKGS         FILES    XFER (MB)   SPEED
Completed                                1/1           3/3      0.0/0.0    0B/s

PHASE                                          ITEMS
Updating modified actions                      16/16
Updating image state                            Done 
Creating fast lookup database                   Done 
In just a few moments, IPS found the missing files and incorrect ownerships/permissions. Instead of reinstalling the system, or falling back to an earlier Live Upgrade boot environment, I was able to create my AI services and now all is well.
# installadm create-service -n solaris11-x86 --imagepath /install/solaris11-x86 \
                   -s solaris-auto-install@5.11-0.175.0
Warning: Service svc:/network/dns/multicast:default is not online.
   Installation services will not be advertised via multicast DNS.

Creating service from: solaris-auto-install@5.11-0.175.0
DOWNLOAD                                PKGS         FILES    XFER (MB)   SPEED
Completed                                1/1       130/130  264.4/264.4    0B/s

PHASE                                          ITEMS
Installing new actions                       284/284
Updating package state database                 Done 
Updating image state                            Done 
Creating fast lookup database                   Done 
Reading search index                            Done 
Updating search index                            1/1 

Creating i386 service: solaris11-x86

Image path: /install/solaris11-x86

Refreshing install services
Warning: mDNS registry of service solaris11-x86 could not be verified.

Creating default-i386 alias

Setting the default PXE bootfile(s) in the local DHCP configuration
to:
bios clients (arch 00:00):  default-i386/boot/grub/pxegrub


Refreshing install services
Warning: mDNS registry of service default-i386 could not be verified.

# installadm create-service -n solaris11u1-x86 --imagepath /install/solaris11u1-x86 \
                    -s solaris-auto-install@5.11-0.175.1
Warning: Service svc:/network/dns/multicast:default is not online.
   Installation services will not be advertised via multicast DNS.

Creating service from: solaris-auto-install@5.11-0.175.1
DOWNLOAD                                PKGS         FILES    XFER (MB)   SPEED
Completed                                1/1       514/514  292.3/292.3    0B/s

PHASE                                          ITEMS
Installing new actions                       661/661
Updating package state database                 Done 
Updating image state                            Done 
Creating fast lookup database                   Done 
Reading search index                            Done 
Updating search index                            1/1 

Creating i386 service: solaris11u1-x86

Image path: /install/solaris11u1-x86

Refreshing install services
Warning: mDNS registry of service solaris11u1-x86 could not be verified.

# installadm list

Service Name    Alias Of      Status  Arch   Image Path 
------------    --------      ------  ----   ---------- 
default-i386    solaris11-x86 on      i386   /install/solaris11-x86
solaris11-x86   -             on      i386   /install/solaris11-x86
solaris11u1-x86 -             on      i386   /install/solaris11u1-x86


This is way way better than pkgchk -f in Solaris 10. I'm really beginning to like this new IPS packaging system.
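For the script-minded, the recovery boils down to "verify, and fix only if something is wrong". Here is a minimal sketch, assuming pkg verify exits non-zero when it finds problems (check pkg(1) on your release before relying on that). The name verify_and_fix is hypothetical, and both commands need the usual privileges.

```shell
# Verify a package and repair it only if verification reports a problem.
# Assumes "pkg verify" returns a non-zero exit status on errors.
verify_and_fix() {
    if pkg verify "$1"; then
        echo "$1: nothing to repair"
    else
        pkg fix "$1"
    fi
}
```

Running verify_and_fix installadm would have covered the mess I made above, although pkg fix on its own does the verification step too.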
About

Bob Netherton is a Principal Sales Consultant for the North American Commercial Hardware group, specializing in Solaris, Virtualization and Engineered Systems. Bob is also a contributing author of Solaris 10 Virtualization Essentials.

This blog will contain information about all three, but primarily focused on topics for Solaris system administrators.
