Monday Dec 20, 2010

Finish Scripts and First-Boot Services with Bootable AI

In my last couple of blogs, I have talked about using the Automated Installer, specifically using Bootable AI.  We talked about creating a manifest to guide the installation and we talked about creating a system configuration manifest to configure the system identity.  The piece that has not yet been addressed, from the perspective of a long-time Jumpstart user, is how to do the final customization via finish scripts.

In general, there is no notion of an install script bundled into a package when using the Image Packaging System in Solaris 11 Express, as there is with SVR4 packages in Solaris 10.  However, a package can install a service that carries out the same function.  Its operation is a bit more controlled, since it has to have its dependencies satisfied and can run in a reduced-privilege environment if needed.  Additionally, many of the actions typically scripted into installation scripts, such as creating users, groups, links, and directories, are built-in actions of the packaging system.
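For example, instead of a postinstall script that runs useradd and ln, an IPS package manifest can deliver these directly as actions.  The lines below are an illustrative sketch - the myapp names are hypothetical, but the action syntax follows pkg(5):

dir path=opt/myapp mode=0755 owner=root group=bin
group groupname=myapp gid=201
user username=myapp uid=201 group=myapp home-dir=/export/home/myapp login-shell=/usr/bin/bash
link path=usr/bin/myapp target=../../opt/myapp/bin/myapp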

So, the question arises of how to use IPS to add our own packages to a system, whether at installation time or later, and how to perform the necessary first-boot customizations on a system we are installing.  The requirement to create our own packages comes from the fact that there is no way to deliver content to the system being installed during the installation process except through the AI manifest - and that means IPS packages.  In Jumpstart, a variable set at installation pointed to a particular NFS-mounted directory where install scripts could reside.  This was all well and good so long as you could mount that directory.  When it was not available, you were left again with creating and delivering SVR4 packages via the Jumpstart profile.  So, the situation is not far different from Solaris 10 and earlier; the syntax is slightly different, and a different network protocol delivers the payload.

Why Use a Local Repository

There are two main reasons to create a local repository for use by AI and IPS.  First, you might choose to replicate the Solaris repository rather than make a home-run through your corporate network firewall to Oracle for every package installation on every server.  Performance and control are clearly going to be better and more deterministic by locating the data closer to where you plan to use it.  The second reason to create a local repository would be to host your own locally provided packages - whether developed locally or provided by an ISV.

The question then arises whether to combine both of these into the same repository.  My personal opinion is that it is better to keep them separate.  Just as it is a good practice to keep the binaries and files of the OS separate from those locally created or provided by applications on the disk, it seems a good idea to keep the repositories separate.  That does not mean that multiple systems are needed to host the data, however.  Multiple repository services can be hosted on the same system on different network ports, pointing to different directories on the disk.  Think of it like hosting multiple web sites with different ports and different htdocs directories.
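As a sketch of what that looks like, a second depot can be run as an additional instance of the pkg/server service.  The instance name, path, and port here are hypothetical:

# svccfg -s application/pkg/server add isv-pkgs
# svccfg -s application/pkg/server:isv-pkgs addpg pkg application
# svccfg -s application/pkg/server:isv-pkgs setprop pkg/inst_root = astring: "/var/repos/isv/repo"
# svccfg -s application/pkg/server:isv-pkgs setprop pkg/port = count: 9001
# svcadm refresh application/pkg/server:isv-pkgs
# svcadm enable application/pkg/server:isv-pkgs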

Rather than go through all the details of creating a local mirror of the Solaris repository, I will refer you to Brian Leonard's blog on that topic.  Here, he talks about creating a local mirror from the repository ISO.  He also shows how you can create a file-based repository for local use.

For much of the rest of this exercise, I am relying on an article in the OpenSolaris Migration Hub that talks about creating a First Boot Service, which in turn references a page on creating a local package repository.  It is well worth reading through these two pages.

Setting Up A Local Repository

So far, we have avoided the use of any sort of install server.  However, to have a local repository available during installation, this becomes a necessity.  So, pick a server to be used as a package repository.

Rather than clone the Solaris repository, we will create a new, empty repository to fill with our own packages.  We will configure the necessary SMF properties to enable the repository.  And then we will fill it with the packages that we need to deploy.

On the server that will host the repository, select a location in its filesystem and set it aside for this function.  A good practice is to create a dedicated ZFS filesystem for the repository.  That way, you can enable compression on the repository and easily control its size through quotas and reservations.  In this example, we will just create a ZFS filesystem within the root pool; often you will have a separate pool for this sort of function.

# zfs create -p -o mountpoint=/var/repos/mycompany.com/repo -o compression=on rpool/repos/mycompany.com/repo
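If you want to cap how much space the repository can consume, or guarantee space for it, set a quota or reservation on the new filesystem.  The sizes here are just examples:

# zfs set quota=20g rpool/repos/mycompany.com/repo
# zfs set reservation=10g rpool/repos/mycompany.com/repo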

Next, we will need to set the SMF properties to support the repository.  The service application/pkg/server is responsible for managing the actual package depot process.  As such, it refers to properties to locate the repository on disk, establish what port to use, etc.

The property pkg/inst_root specifies where on the repository server's disk the repo resides.  

# svccfg -s application/pkg/server setprop pkg/inst_root=/var/repos/mycompany.com/repo

pkg/readonly specifies whether or not the repository can be updated.  For a cloned Solaris repository that you never publish into directly, this should be set to true.  Since we are about to publish our own packages into this repository with pkgsend over HTTP, which requires a writable depot, we set it to false for now and will switch it to true once publishing is done.

# svccfg -s application/pkg/server setprop pkg/readonly=false

pkg/prefix specifies the name that the repository will take when specified as a publisher by a client system.  pkg/port specifies the port where the repository will answer.

# svccfg -s application/pkg/server setprop pkg/prefix=local-pkgs
# svccfg -s application/pkg/server setprop pkg/port=9000


Once the properties are set, refresh and enable the service.

# svcadm refresh application/pkg/server
# svcadm enable application/pkg/server
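To confirm that the depot came online with the properties we set, check the service state and read the properties back.  You can also point a browser at http://localhost:9000 to browse the still-empty repository.

# svcs application/pkg/server
# svcprop -p pkg/inst_root -p pkg/port -p pkg/prefix application/pkg/server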


Creating a First-Boot Package

Now that the repository has been created, we need to create a package to go into that repository.  Since there are no post-install scripts with the Image Packaging System, we will create an SMF service that will be automatically enabled so that it will run when the system boots.  One technique used with Jumpstart was to install a script into /etc/rc3.d that would run late in the boot sequence and would then remove itself so that it would only run on the first boot.  We will take a similar path with our first-boot service.  We will have it disable itself so that it doesn't continue to run on each boot.  There are two parts to creating this simple package.  First, we have to create the manifest for the service, and second, we have to create the script that will be used as the start method within the service.  This area is covered in more depth in the OpenSolaris Migration Hub paper on Creating a First Boot Service.  In fact, we will use the manifest from that paper.

Creating the Manifest

We will use the manifest from Creating a First Boot Service directly and call it finish-script-manifest.xml.  The main points to see here are:
  • the service is enabled automatically when we import the manifest
  • the service is dependent on svc:/milestone/multi-user so that it won't run until the system is in the Solaris 11 Express equivalent of run level 3.
  • the script /usr/bin/finish-script.sh, which we will provide, is going to be run as the start method when the service begins.

finish-script-manifest.xml:

<?xml version="1.0"?>
<!DOCTYPE service_bundle SYSTEM "/usr/share/lib/xml/dtd/service_bundle.dtd.1">

<service_bundle type='manifest' name='Finish:finish-script'>

<service
    name='system/finish-script'
    type='service'
    version='1'>

   <create_default_instance enabled='true' />

    <single_instance />

    <dependency name='autofs' grouping='require_all' restart_on='none' type='service'>
        <service_fmri value='svc:/system/filesystem/autofs:default' />
    </dependency>

<dependency name='multi-user' grouping='require_all' restart_on='none' type='service'>
      <service_fmri value='svc:/milestone/multi-user:default' />
   </dependency>


<exec_method
        type='method'
        name='start'
        exec='/usr/bin/finish-script.sh'
        timeout_seconds='0'>
    </exec_method>


   <exec_method
        type='method'
        name='stop'
        exec=':true'
        timeout_seconds='0'>
    </exec_method>

        <property_group name='startd' type='framework'>
                <propval name='duration' type='astring' value='transient' />
        </property_group>

</service>
</service_bundle>


Creating the Script

In this example, we will create a trivial finish script.  All it will do is log that it has run and then disable itself.  You could go so far as to have the finish script uninstall itself.  However, rather than do that we will just disable the service.  Certainly, you could have a much more expansive finish script, with multiple files and multiple functions.  Our script is short and simple:

finish-script.sh:

#!/usr/bin/bash
# Disable this service so that it runs only on the first boot
svcadm disable finish-script
# Alternatively, the package could remove itself entirely:
#pkg uninstall pkg:/finish
# Leave a record showing that the script ran
echo "Completed Finish Script" > /var/tmp/finish_log.$$
exit 0
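After the first boot of an installed client, it is easy to confirm that the service ran once and shut itself off:

# svcs -a | grep finish-script
# cat /var/tmp/finish_log.*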


Adding Packages to Repository

Now that we have created the finish script and the manifest for the first-boot service, we have to insert these into the package repository that we created earlier.  Take a look at the pkgsend man page for a lot more details about how all of this works.  It's possible with pkgsend to add SVR4 package bundles into the repository, as well as tarballs and directories full of files.  Since our package is simple, we will just insert each file individually.

When we open the package in the repository, we have to specify the version number.  Take a good look at the pkg(5) man page to understand the version numbering and the various actions that could be part of the package.  Since I have been working on this script, I have decided that it is version 0.3.  We start by opening the package for insertion.  Then, we add each file to the package in the repository, specifying file ownership, permissions, and path.  Once all the pieces have been added, we close the package and the FMRI for the package is returned.

# eval `pkgsend -s http://localhost:9000/ open finish@0.3`
# pkgsend -s http://localhost:9000/ add file finish-script-manifest.xml mode=0555 owner=root group=bin path=/var/svc/manifest/system/finish-script-manifest.xml restart_fmri=svc:/system/manifest-import:default
# pkgsend -s http://localhost:9000/ add file finish-script.sh mode=0555 owner=root group=bin path=/usr/bin/finish-script.sh
# pkgsend -s http://localhost:9000/ close
PUBLISHED
pkg://local-pkgs/finish@0.3,5.11:20101220T174741Z
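Since we left pkg/readonly set to false so that pkgsend could publish over HTTP, this is a good point to flip the depot back to read-only so the repository cannot be modified accidentally:

# svccfg -s application/pkg/server setprop pkg/readonly=true
# svcadm refresh application/pkg/server
# svcadm restart application/pkg/server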


Updating the AI Manifest

Now that the repository is populated, we can add it as a publisher to verify that it has the contents we expect.  On the package server itself, we can add this publisher.  Remember that we set the prefix for the publisher to local-pkgs and that we specified it should run on port 9000.  This name could be anything that makes sense for your enterprise - the company domain, or something else that identifies the packages as local rather than part of Solaris, is a good choice.

# pkg set-publisher -p http://localhost:9000 local-pkgs
pkg set-publisher:
  Added publisher(s): local-pkgs
# pkg list -n "pkg://local-pkgs/*"
NAME (PUBLISHER)                                    VERSION         STATE      UFOXI
finish (local-pkgs)                                 0.3             known      -----
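To double-check the ownership, modes, and paths we specified with pkgsend, the raw manifest can be retrieved straight from the repository with pkg contents:

# pkg contents -r -m finish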


Now that we know the repository is on-line and contains the packages that we want to deploy, we have to update the AI manifest to reflect both the new publisher and the new packages to install.  First, to update the publisher, beneath the section that specifies the default publisher, add a second source for this new publisher:

      <source>
        <publisher name="solaris">
          <origin name="http://pkg.oracle.com/solaris/release/"/>
        </publisher>
      </source>
      <source>
         <publisher name="local-pkgs">
           <origin name="http://my-local-repository-server:9000/"/>
         </publisher>
      </source>


Then, add the packages to the list of packages to install.  Depending on how you named your package, you may not need to specify the repository for installation.  IPS will search the path of repositories to find each package.  However, it's not a bad idea to be specific.

      <software_data action="install" type="IPS">
        <name>pkg:/entire</name>
        <name>pkg:/server_install</name>
        <name>pkg://local-pkgs/finish</name>


Now, when you install the system using this manifest, in addition to the regular Solaris bits being installed and the system configuration services executing, the finish script included in your package will be run on the first boot.  Since the script called by the service turns itself off, it will not continue to run on subsequent boots.  You can do whatever additional configuration you might need.  But, before you spend a long time converting all of your old Jumpstart scripts into a first-boot service, take a look at the built-in capabilities of AI and of Solaris 11 Express in general.  It may be that much of the work you had to code yourself before is no longer required.  For example, IP interface configuration is simple and persistent with ipadm.  Other new functions in Solaris 11 Express remove the need to write custom code, too.  But for the cases where you need custom code, this sort of first-boot service gives you a hook so that you can do what you need.
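As a quick illustration of that last point, a static address that once took edits to several /etc files is now two persistent commands.  This is just a sketch - the interface name and address are placeholders:

# ipadm create-if xnf0
# ipadm create-addr -T static -a local=192.168.100.101/24 xnf0/v4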

Using System Configuration Manifests with Bootable AI

In my last blog, I talked about how to configure a manifest for a bootable AI installation.  The main thing there was how to select which packages to install.  This time we are going to talk about how to handle AI's version of sysidcfg and configuring system identity at install-time.

In a Jumpstart world, many of the things that make up a system's identity - hostname, network configuration, timezone, naming services, etc. - can be configured at installation time by providing a sysidcfg file.  Alternately, an interactive dialog starts and prompts the installer for this sort of information.

The System Configuration manifest provides this same sort of information in an Automated Installer world.  The documentation for AI shows how to create either a separate or an embedded SC manifest to be served by an AI server.  When using Bootable AI, the SC manifest needs to be embedded within the AI manifest.  The SC manifest, whether embedded or not, is basically an XML document that is providing a bunch of properties for SMF services that are going to run on the first system boot to help complete the system configuration.  Some of the main tasks that can be completed in the SC manifest are:

  • Identify and configure the administrative "first user" created at install time.
  • Specify a root password and whether root is a role or a standard user.
  • Configure the timezone, hostname, keyboard maps, and terminal type.
  • Specify whether the network should be configured automatically or manually.
  • Configure network settings, including DNS, for manually configured networks.

But, in the end, all of this is just setting SMF properties, so it's pretty straightforward.  It appears as a large service_bundle with properties for multiple SMF services.

As far as including the SC manifest information in the bootable AI manifest, the SC manifest is essentially embedded into the AI manifest as a large comment.  Don't be put off by the comment notation.  This whole section is passed on to SMF to assign the necessary service properties.

In order to explain the various sections of properties, I will just annotate an updated SC manifest.  In this manifest, I will specify some of the more common configuration settings you might use. 

The whole SC embedded manifest is identified within the AI manifest with the tag sc_embedded_manifest.  See the Automated Installer Guide for more details on the rest of the options to this tag.  The two lines following the sc_embedded_manifest tag are just the top part of the stand-alone SC manifest XML document.  Look in the default AI manifest for exact placement of this section.

    <sc_embedded_manifest name="AI">
      <!-- <?xml version='1.0'?>
      <!DOCTYPE service_bundle SYSTEM "/usr/share/lib/xml/dtd/service_bundle.dtd.1">    

The rest of the SC manifest sets up service properties for a service bundle named "system configuration".

      <service_bundle type="profile" name="system configuration">

The service system/install/config is responsible for doing some of the basic configuration actions at install-time, such as setting up the first, or admin, user, setting a root password, and giving the system a name.  The property group "user_account" specifies how the first user, used for administration, should be configured.  You can specify here the username (name="login"), an encrypted password, the GECOS field information (name="description"), as well as the UID and GID for the account.  Note that the default password supplied for the first user (by default, named jack) in the default manifest is "jack".

Special note should be made of the property "roles".  Recall that in Solaris 11 Express, root is no longer a regular login user, but becomes a role.  Therefore, in order to be able to assume the root role for administrative functions, this first user needs to be given the root role.  Other roles can also be specified here as needed.  Also notice that the profile "Primary Administrator" is no longer assigned to this first user, as was done in OpenSolaris.  Additional properties around roles, profiles, authorizations, etc. may be assigned.  See the Automated Installer Guide for details.

        <service name="system/install/config" version="1" type="service">
          <instance name="default" enabled="true">
            <property_group name="user_account" type="application">
              <propval name="login" type="astring" value="sewr"/>
              <propval name="password" type="astring" value="9Nd/cwBcNWFZg"/>
              <propval name="description" type="astring" value="default_user"/>
              <propval name="shell" type="astring" value="/usr/bin/bash"/>
              <propval name="uid" type='count' value='27589'/>
              <propval name="gid" type='count' value='10'/>
              <propval name="type" type="astring" value="normal"/>
              <propval name="roles" type="astring" value="root"/>
            </property_group>

As with Jumpstart, it is possible to specify a root password at install time.  The encrypted string for the root password is given here as the password property.  If no new password is supplied, the default root password at install-time is "solaris".  Also note here that root is created as a role rather than a regular login user.

            <property_group name="root_account" type="application">
                <propval name="password" type="astring" value="$5$dnRfcZs$Hx4aBQ161Uvn9ZxJFKMdRiy8tCf4gMT2s2rtkFba2y4"/>
                <propval name="type" type="astring" value="role"/>
            </property_group>
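The password value must be a standard crypt(3C) string, just like the password field in /etc/shadow.  One way to generate a SHA-256 ($5$) hash is to let the system's crypt do the work through Perl - the password and salt below are placeholders:

# perl -e 'print crypt("MyNewPassw0rd", q{$5$H5s8a7dK$}), "\n"'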

A few other housekeeping properties can also be set here for the system/install/config service.  These include the local timezone and the hostname (/etc/nodename) for the system.

            <property_group name="other_sc_params" type="application">
              <propval name="timezone" type="astring" value="US/Eastern"/>
              <propval name="hostname" type="astring" value="myfavoritehostname"/>
            </property_group>
          </instance>
        </service>

The system/console-login service establishes the login service for the console. Here you can specify the terminal type to be used for the console.

        <service name="system/console-login" version="1" type="service">
          <property_group name="ttymon" type="application">
            <propval name="terminal_type" type="astring" value="xterms"/>
          </property_group>
        </service>

The service system/keymap establishes what sort of keyboard input is to be expected on the system.

        <service name='system/keymap' version='1' type='service'>
          <instance name='default' enabled='true'>
            <property_group name='keymap' type='system'>
              <propval name='layout' type='astring' value='US-English'/>
            </property_group>
          </instance>
        </service>

By default, Solaris 11 Express enables NWAM (Network Auto-Magic) to automatically configure the primary network interface.  NWAM activates a primary network interface for the system, whether wired or wireless, monitors its availability, and tries to restore network connectivity if it goes away.  Most people would say that its behavior is best suited for mobile or desktop systems, and it functions well in that space.  It includes the ability to have profiles that guide its behavior in a variety of networked environments.  NWAM relies on DHCP to get an available IP address and other data needed to configure the network.

In the default AI profile, the network/physical:nwam service instance is enabled and the network/physical:default service instance is disabled.  In most server configurations, static addressing and configuration might be more desirable.  In that case, you can do as we have below and switch which service instance is enabled and which is disabled by default.

        <service name="network/physical" version="1" type="service">
          <instance name="nwam" enabled="false"/>
          <instance name="default" enabled="true"/>
        </service>

In the case where we are doing static network configuration, we will rely on the network/install service to set up our networks.  The properties and values used here correspond to arguments to the ipadm command, new in Solaris 11 Express.  ipadm is used to configure and tune IP interfaces.  See its man page for details on syntax.

In this case, we are setting up a single IPv4 network interface (xnf0), giving it a static IP address and netmask, and specifying a default route.

        <service name="network/install" version="1" type="service">
          <instance name="default" enabled="true">
            <property_group name="install_ipv4_interface" type="application">
              <propval name="name" type="astring" value="xnf0/v4"/>
              <propval name="address_type" type="astring" value="static"/>
              <propval name="static_address" type="net_address_v4" value="192.168.100.101/24"/>
              <propval name="default_route" type="net_address_v4" value="192.168.100.1"/>
            </property_group>
          </instance>
        </service>

As with Jumpstart using sysidcfg, it is possible to set up DNS information at install-time.  Note that only DNS and not NIS or LDAP naming services can be set up this way.  The System Administration Guide: Naming and Directory Services manual discusses how to configure these naming services.  NIS+ is no longer supported in Solaris 11 Express.

The network/dns/install service is used to set up DNS at install-time.  For this, we specify the regular sorts of data that will populate the /etc/resolv.conf file: nameservers, domain, and a domain name search path.  Some of these data items take multiple values, so lists of values are used, as shown below.

        <service name="network/dns/install" version="1" type="service">
          <instance name="default" enabled="true">
            <property_group name="install_props" type="application">
              <property name="nameserver" type="net_address">
                <net_address_list>
                  <value_node value="1xx.xxx.xxx.zz"/>
                  <value_node value="1xx.xxx.xxx.yy"/>
                  <value_node value="1xx.xxx.xxx.xx"/>
                </net_address_list>
              </property>
              <propval name="domain" type="astring" value="us.warble.com"/>
              <property name="search" type="astring">
                <astring_list>
                  <value_node value="us.warble.com"/>
                  <value_node value="garble.com"/>
                  <value_node value="mfg.garble.com"/>
                </astring_list>
              </property>
            </property_group>
          </instance>
        </service>

And we close out the service bundle and the embedded SC manifest.

      </service_bundle>
      -->
    </sc_embedded_manifest>

So, by building a custom AI manifest with its embedded SC manifest, you can accomplish the same sorts of install-time configuration of a system as you could with Jumpstart and sysidcfg, without having to build any complex finish scripts or do any extra coding.  This approach makes it possible to have a repeatable methodology for creating the administrative user, with known, standard credentials, and for configuring the base system networks and naming services.

Saturday May 09, 2009

Reminder of Jumpstart Survey

Last week, I blogged about a Jumpstart Survey.  I've gotten good comments and some responses to the survey.  It's been a week, but I want to collect some more responses before posting an analysis.  Take a look at my previous blog and fill out the survey or comment on the blog.  I will summarize and report in another week or so.

Wednesday Apr 29, 2009

How do you use Jumpstart?

Jumpstart is the technology within Solaris that allows a system to be remotely installed across a network. This feature has been in the OS for a long, long time, dating to the start of Solaris 2.0, I believe. With Jumpstart, the system to be installed, the Jumpstart client, contacts a Jumpstart server to be installed across the network. This is a huge simplification, since there are nuances to how to set all of this up. Your best bet is to check the Solaris 10 Installation Guide: Network Based Installations and the Solaris 10 Installation Guide: Custom Jumpstart and Advanced Installations.

Jumpstart makes use of rules to decide how to install a particular system, based on its architecture, network connectivity, hostname, disk and memory capacity, or any of a number of other parameters. The rules select a profile that determines what will be installed on that system and where it will come from. Scripts can be inserted before and after the installation for further customization. To help manage the profiles and post-installation customization, Mike Ramchand has produced a fabulous tool, the Jumpstart Enterprise Toolkit (JET).
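To make those pieces concrete, here is a minimal sketch of a rules entry and the profile it selects - the hostname, profile name, and finish script name are hypothetical.  Each rules line names a match condition, an optional begin script, a profile, and an optional finish script:

# rules
hostname webserver01  -  web_prof  set_root_pw.fin

# web_prof
install_type    initial_install
system_type     standalone
partitioning    default
cluster         SUNWCuser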

My Questions for You

As a long time Solaris admin, I have been a fan of Jumpstart for years and years. As an SE visiting many cool companies, I have seen people do really interesting things with Jumpstart. I want to capture how people use Jumpstart in the real world - not just the world of those who create the product. I know that people use the tools we create in new and unique ways that we would never imagine.

For example, I once installed 600 systems with SunOS 4.1.4 in less than a week using Jumpstart - remember that Jumpstart never supported SunOS 4.1.4.

But, I am not just looking for the weird stories. I want to know what Jumpstart features you use. I'll follow this up with extra, detailed questions around Jumpstart Flash, WAN Boot, DHCP vs. RARP. But I want to start with just some basics about Jumpstart.

Lacking a polling mechanism here at blogs.sun.com, you can just enter your responses as a comment. Or you can answer these questions at SurveyMonkey here. Or drop me a note at scott.dickson at sun.com.

  1. How do you install Solaris systems in your environment?
    1. I use Jumpstart
    2. I use DVD or CD media
    3. I do something else - please tell me about it
  2. Do you have a system for automating your Jumpstart configurations?
    1. Yes, we have written our own
    2. Yes, we use JET
    3. Yes, we use xVM OpCenter
    4. No, we do interactive installations via Jumpstart. We just use Jumpstart to get the bits to the client.
  3. What system architectures do you support with Jumpstart?
    1. SPARC
    2. x86
  4. Do you use a sysidcfg file to answer the system identification questions - hostname, network, IP address, naming service, etc?
    1. No, I answer these interactively
    2. Yes, I hand-craft a sysidcfg file
    3. Yes, but it is created via the Jumpstart automation tools
  5. Do you use WANboot? I'll follow up with more questions on this at a later time.
    1. What's Wanboot?
    2. I have heard of it, but have never used it
    3. We rely on Wanboot
  6. Do you use Jumpstart Flash? More questions on this later, too
    1. Never heard of it
    2. We sometimes use Flash
    3. We live and breathe Flash
  7. What sort of rules do you include in your rules file?
    1. We do interactive installations and don't use a rules file
    2. We use the rules files generated by our automation tools, like JET
    3. We have a common rules file for all Jumpstarts based on hostname
    4. We use not only hostnames but also other parameters to determine which rule to use for installation
  8. Do you use begin scripts?
    1. No
    2. We use them to create derived profiles for installation
    3. We use them some other way
  9. Do you use finish scripts?
    1. No
    2. We use the finish scripts created by our automation
    3. We use finish scripts to do some minor cleanup
    4. We do extensive post-installation customization via finish scripts. If so, please tell me about it.
  10. Do you customize the list of packages to be installed via Jumpstart?
    1. No
    2. Somewhat
    3. Not only do we customize the list of packages, but we create custom packages for our installation

Friday Dec 05, 2008

Flashless System Cloning with ZFS

Ancient History

Gather round kiddies and let Grandpa tell you a tale of how we used to clone systems before we had Jumpstart and Flash, when we had to carry water in leaky buckets 3 miles through snow up to our knees, uphill both ways.

Long ago, a customer of mine needed to deploy 600(!) SPARCstation 5 desktops all running SunOS 4.1.4. Even then, this was an old operating system, since Solaris 2.6 had recently been released. But it was what their application required. And we only had a few days to build and deploy these systems.

Remember that Jumpstart did not exist for SunOS 4.1.4, and Flash did not exist for Solaris 2.6. So, our approach was to build a system, a golden image, the way we wanted it deployed, and then use ufsdump to save the contents of the filesystems. Then, we were able to use Jumpstart from a Solaris 2.6 server to boot each of these workstations. Instead of having a Jumpstart profile, we only used a finish script that partitioned the disks and restored the ufsdump images. So Jumpstart just provided us a clean way to boot these systems and apply the scripts we wanted to them.

Solaris 10 10/08, ZFS, Jumpstart and Flash

Now, we have a bit of a similar situation. Solaris 10 10/08 introduces ZFS boot to Solaris, something that many of my customers have been anxiously awaiting for some time. A system can be deployed using Jumpstart and the ZFS boot environment created as a part of the Jumpstart process.

But. There's always a but, isn't there?

But, at present, Flash archives are not supported (and in fact do not work) as a way to install into a ZFS boot environment, either via Jumpstart or via Live Upgrade. Turns out, they use the same mechanism under the covers for this. This is CR 6690473.

So, how can I continue to use Jumpstart to deploy systems, and continue to use something akin to Flash archives to speed and simplify the process?

Turns out the lessons we learned years ago can be used, more or less. Combine the idea of the ufsdump with some of the ideas that Bob Netherton recently blogged about (Solaris and OpenSolaris coexistence in the same root zpool), and you can get to a workaround that might be useful enough to get you through until Flash really is supported with ZFS root.

Build a "Golden Image" System

The first step, as with Flash, is to construct a system that you want to replicate. The caveat here is that you use ZFS for the root of this system. For this example, I have left /var as part of the root filesystem rather than a separate dataset, though this process could certainly be tweaked to accommodate a separate /var.

Once the system to be cloned has been built, you save an image of the system. Rather than using flarcreate, you will create a ZFS send stream and capture this in a file. Then move that file to the jumpstart server, just as you would with a flash archive.

In this example, the ZFS bootfs has the default name - rpool/ROOT/s10s_u6wos_07.


golden# zfs snapshot rpool/ROOT/s10s_u6wos_07@flar
golden# zfs send -v rpool/ROOT/s10s_u6wos_07@flar > s10s_u6wos_07_flar.zfs
golden# scp s10s_u6wos_07_flar.zfs js-server:/flashdirectory

How do I get this on my new server?

Now, we have to figure out how to have this ZFS send stream restored on the new clone systems. We would like to take advantage of the fact that Jumpstart will create the root pool for us, along with the dump and swap volumes, and will set up all of the needed bits for the booting from ZFS. So, let's install the minimum Solaris set of packages just to get these side effects.

Then, we will use Jumpstart finish scripts to create a fresh ZFS dataset and restore our saved image into it. Since this new dataset will contain the old identity of the original system, we have to reset our system identity. But once we do that, we are good to go.

So, set up the cloned system as you would for a hands-free Jumpstart. Be sure to specify the sysid_config and install_config bits in /etc/bootparams. The manual Solaris 10 10/08 Installation Guide: Custom JumpStart and Advanced Installations covers how to do this. We add to the rules file a finish script (I called mine loadzfs in this case) that will do the heavy lifting. Once Jumpstart installs Solaris according to the profile provided, it then runs the finish script to finish up the installation.
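For reference, the resulting client entry in /etc/bootparams looks something like the sketch below - the hostnames and paths are hypothetical, and add_install_client normally builds this line for you:

clone01 root=js-server:/export/install/Solaris_10/Tools/Boot \
    install=js-server:/export/install boottype=:in \
    sysid_config=js-server:/export/jumpstart \
    install_config=js-server:/export/jumpstart rootopts=:rsize=8192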

Here is the Jumpstart profile I used. This is a basic profile that installs the base, required Solaris packages into a ZFS pool mirrored across two drives.


install_type    initial_install
cluster         SUNWCreq
system_type     standalone
pool            rpool auto auto auto mirror c0t0d0s0 c0t1d0s0
bootenv         installbe bename s10u6_req

The finish script is a little more interesting since it has to create the new ZFS dataset, set the right properties, fill it up, reset the identity, etc. Below is the finish script that I used.


#!/bin/sh -x

# TBOOTFS is a temporary dataset used to receive the stream
TBOOTFS=rpool/ROOT/s10u6_rcv

# NBOOTFS is the final name for the new ZFS dataset
NBOOTFS=rpool/ROOT/s10u6f

MNT=/tmp/mntz
FLAR=s10s_u6wos_07_flar.zfs
NFS=serverIP:/export/solaris/Solaris10/flash

# Mount directory where archive (send stream) exists
mkdir ${MNT}
mount -o ro -F nfs ${NFS} ${MNT}

# Create file system to receive ZFS send stream &
# receive it.  This creates a new ZFS snapshot that
# needs to be promoted into a new filesystem
zfs create ${TBOOTFS}
zfs set canmount=noauto ${TBOOTFS}
zfs set compression=on ${TBOOTFS}
zfs receive -vF ${TBOOTFS} < ${MNT}/${FLAR}

# Create a writeable filesystem from the received snapshot
zfs clone ${TBOOTFS}@flar ${NBOOTFS}

# Make the new filesystem the top of the stack so it is not dependent
# on other filesystems or snapshots
zfs promote ${NBOOTFS}

# Don't automatically mount this new dataset, but allow it to be mounted
# so we can finalize our changes.
zfs set canmount=noauto ${NBOOTFS}
zfs set mountpoint=${MNT} ${NBOOTFS}

# Unmount the NFS share and mount the newly created replica
# filesystem in its place so we can reset the system identity
umount ${MNT}
zfs mount ${NBOOTFS}

# This section essentially forces sysidtool to reset system identity at
# the next boot.
touch /a/${MNT}/reconfigure
touch /a/${MNT}/etc/.UNCONFIGURED
rm /a/${MNT}/etc/nodename
rm /a/${MNT}/etc/.sysIDtool.state
cp ${SI_CONFIG_DIR}/sysidcfg /a/${MNT}/etc/sysidcfg

# Now that we have finished tweaking things, unmount the new filesystem
# and make it ready to become the new root.
zfs umount ${NBOOTFS}
zfs set mountpoint=/ ${NBOOTFS}
zpool set bootfs=${NBOOTFS} rpool

# Get rid of the leftovers
zfs destroy ${TBOOTFS}
zfs destroy ${NBOOTFS}@flar

When we jumpstart the system, Solaris is installed, but it really isn't used. Then, we load from the send stream a whole new OS dataset, make it bootable, set our identity in it, and use it. When the system is booted, Jumpstart still takes care of updating the boot archives in the new bootfs.
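Once the clone boots, you can verify that the received dataset really is the active root:

# zpool get bootfs rpool
# zfs list -r rpool/ROOT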

On the whole, this is a lot more work than Flash, and is really not as flexible or as complete. But hopefully, until Flash is supported with a ZFS root and Jumpstart, this might at least give you an idea of how you can replicate systems and do installations that do not have to revert back to package-based installation.

Many people use Flash as a form of disaster recovery. I think that this same approach might be used there as well. Still not as clean or complete as Flash, but it might work in a pinch.

So, what do you think? I would love to hear comments on this as a stop-gap approach.
