Saturday Feb 28, 2015

One image for native zones, kernel zones, ldoms, metal, ...

In my previous post, I described how to convert a global zone to a non-global zone using a unified archive.  Since then, I've fielded a few questions about whether this same approach can be used to create a master image that is used to install Solaris regardless of virtualization type (including no virtualization).  The answer is: of course!  That was one of the key goals of the project that invented unified archives.

In my earlier example, I was focused on preserving the identity and other aspects of the global zone and knew I had only one place that I planned to deploy it.  Hence, I chose to skip media creation (--exclude-media) and used a recovery archive (-r).  To generate a unified archive of a global zone that is ideal for use as an image for installing to another global zone or native zone, just use a simpler command line.

root@global# archiveadm create /path/to/golden-image.uar

Notice that by using fewer options we get something that is more usable.

What's different about this image compared to the one created in the previous post?

  • This archive as an embedded AI iso that will be used if you install a kernel zone from it.  That is, zoneadm -z kzname install -a /path/to/golden-image.uar will boot from that embedded AI image and perform an automated install from that archive.
  • This archive only contains the active boot environment and other ZFS snapshots are not archived.
  • This archive has been stripped of its identity.  When installing, you either need to provide a sysconfig profile or interactively configure the system or zone on the console or zone console on the first post-installation boot.

Friday Feb 20, 2015

global to non-global conversion with multiple zpools

Suppose you have a global zone with multiple zpools that you would like to convert into a native zone.  You can do that, thanks to unified archives (introduced in Solaris 11.2) and dataset aliasing (introduced in Solaris 11.0).  The source system looks like this:
root@buzz:~# zoneadm list -cv
  ID NAME             STATUS      PATH                         BRAND      IP
   0 global           running     /                            solaris    shared
root@buzz:~# zpool list
rpool  15.9G  4.38G  11.5G  27%  1.00x  ONLINE  -
tank   1008M    93K  1008M   0%  1.00x  ONLINE  -
root@buzz:~# df -h /tank
Filesystem             Size   Used  Available Capacity  Mounted on
tank                   976M    31K       976M     1%    /tank
root@buzz:~# cat /tank/README
this is tank
Since we are converting a system rather than cloning it, we want to use a recovery archive and use the -r option.  Also, since the target is a native zone, there's no need for the unified archive to include media.
root@buzz:~# archiveadm create --exclude-media -r /net/kzx-02/export/uar/p2v.uar
Initializing Unified Archive creation resources...
Unified Archive initialized: /net/kzx-02/export/uar/p2v.uar
Logging to: /system/volatile/archive_log.1014
Executing dataset discovery...
Dataset discovery complete
Preparing archive system image...
Beginning archive stream creation...
Archive stream creation complete
Beginning final archive assembly...
Archive creation complete
Now we will go to the global zone that will have the zone installed.  First, we must configure the zone.  The archive contains a zone configuration that is almost correct, but needs a little help because archiveadm(1M) doesn't know the particulars of where you will deploy it.

Most examples that show configuration of a zone from an archive show the non-interactive mode.  Here we use the interactive mode.
root@vzl-212:~# zonecfg -z p2v
Use 'create' to begin configuring a new zone.
zonecfg:p2v> create -a /net/kzx-02/export/uar/p2v.uar
After the create command completes (in a fraction of a second) we can see the configuration that was embedded in the archive.  I've trimmed out a bunch of uninteresting stuff from the anet interface.
zonecfg:p2v> info
zonename: p2v
zonepath.template: /system/zones/%{zonename}
zonepath: /system/zones/p2v
brand: solaris
autoboot: false
autoshutdown: shutdown
ip-type: exclusive
[max-lwps: 40000]
[max-processes: 20000]
        linkname: net0
        lower-link: auto
        name: zonep2vchk-num-cpus
        type: string
        value: "original system had 4 cpus: consider capped-cpu (ncpus=4.0) or dedicated-cpu (ncpus=4)"
        name: zonep2vchk-memory
        type: string
        value: "original system had 2048 MB RAM and 2047 MB swap: consider capped-memory (physical=2048M swap=4095M)"
        name: zonep2vchk-net-net0
        type: string
        value: "interface net0 has lower-link set to 'auto'.  Consider changing to match the name of a global zone link."
        name: __change_me__/tank
        alias: tank
        name: zone.max-processes
        value: (priv=privileged,limit=20000,action=deny)
        name: zone.max-lwps
        value: (priv=privileged,limit=40000,action=deny)
In this case, I want to be sure that the zone's network uses a particular global zone interface, so I need to muck with that a bit.
zonecfg:p2v> select anet linkname=net0
zonecfg:p2v:anet> set lower-link=stub0
zonecfg:p2v:anet> end
The zpool list output in the beginning of this post showed that the system had two ZFS pools: rpool and tank.  We need to tweak the configuration to point the tank virtual ZFS pool to the right ZFS file system.  The name in the dataset resource refers to the location in the global zone.  This particular system has a zpool named export - a more basic Solaris installation would probably need to use rpool/export/....  The alias in the dataset resource needs to match the name of the secondary ZFS pool in the archive.
zonecfg:p2v> select dataset alias=tank
zonecfg:p2v:dataset> set name=export/tank/%{zonename}
zonecfg:p2v:dataset> info
        name.template: export/tank/%{zonename}
        name: export/tank/p2v
        alias: tank
zonecfg:p2v:dataset> end
zonecfg:p2v> exit
I did something tricky above - I used a template property to make it easier to clone this zone configuration and have the dataset name point at a different dataset.

Let's try an installation.  NOTE: Before you get around to booting the new zone, be sure the old system is offline else you will have IP address conflicts.
root@vzl-212:~# zoneadm -z p2v install -a /net/kzx-02/export/uar/p2v.uar
could not verify zfs dataset export/tank/p2v: filesystem does not exist
zoneadm: zone p2v failed to verify
Oops.  I forgot to create the dataset.  Let's do that.  I use -o zoned=on to prevent the dataset from being mounted in the global zone.  If you forget that, it's no biggy - the system will fix it for you soon enough.
root@vzl-212:~# zfs create -p -o zoned=on export/tank/p2v
root@vzl-212:~# zoneadm -z p2v install -a /net/kzx-02/export/uar/p2v.uar
The following ZFS file system(s) have been created:
Progress being logged to /var/log/zones/zoneadm.20150220T060031Z.p2v.install
    Installing: This may take several minutes...
 Install Log: /system/volatile/install.5892/install_log
 AI Manifest: /tmp/manifest.p2v.YmaOEl.xml
    Zonename: p2v
Installation: Starting ...
        Commencing transfer of stream: 0f048163-2943-cde5-cb27-d46914ec6ed3-0.zfs to rpool/VARSHARE/zones/p2v/rpool
        Commencing transfer of stream: 0f048163-2943-cde5-cb27-d46914ec6ed3-1.zfs to export/tank/p2v
        Completed transfer of stream: '0f048163-2943-cde5-cb27-d46914ec6ed3-1.zfs' from file:///net/kzx-02/export/uar/p2v.uar
        Completed transfer of stream: '0f048163-2943-cde5-cb27-d46914ec6ed3-0.zfs' from file:///net/kzx-02/export/uar/p2v.uar
        Archive transfer completed
        Changing target pkg variant. This operation may take a while
Installation: Succeeded
      Zone BE root dataset: rpool/VARSHARE/zones/p2v/rpool/ROOT/solaris-recovery
                     Cache: Using /var/pkg/publisher.
Updating image format
Image format already current.
  Updating non-global zone: Linking to image /.
Processing linked: 1/1 done
  Updating non-global zone: Syncing packages.
No updates necessary for this image. (zone:p2v)
  Updating non-global zone: Zone updated.
                    Result: Attach Succeeded.
        Done: Installation completed in 165.355 seconds.
  Next Steps: Boot the zone, then log into the zone console (zlogin -C)
              to complete the configuration process.
Log saved in non-global zone as /system/zones/p2v/root/var/log/zones/zoneadm.20150220T060031Z.p2v.install
root@vzl-212:~# zoneadm -z p2v boot
After booting we see that everything in the zone is in order.
root@vzl-212:~# zlogin p2v
[Connected to zone 'p2v' pts/3]
Oracle Corporation      SunOS 5.11      11.2    September 2014
root@buzz:~# svcs -x
root@buzz:~# zpool list
rpool  99.8G  66.3G  33.5G  66%  1.00x  ONLINE  -
tank    199G  49.6G   149G  24%  1.00x  ONLINE  -
root@buzz:~# df -h /tank
Filesystem             Size   Used  Available Capacity  Mounted on
tank                   103G    31K       103G     1%    /tank
root@buzz:~# cat /tank/README
this is tank
root@buzz:~# zonename
Happy p2v-ing!  Or rather, g2ng-ing.

Wednesday Jan 14, 2015

stamping out web servers

This is a continuation of a series of posts.  While this one may be interesting all on its own, you may want to start from the top to get the context.

The diagram above shows one global zone with a few zones in it.  That's not very exciting in a world where we need to rapidly provision new instances that are preconfigured and as hack-proof as we can make them.  This post will show how to create a unified archive that includes the kernel zone configuration and content that makes for a hard-to-compromise web server.  I'd like to say impossible, but history has shown us that software has bugs that affects everyone across the industry.

We'll start of by configuring and installing a kernel zone called web.  It will have two automatic networks, each attached to the appropriate etherstubs.  Notice the use of template properties - using %{zonename} and %{id} make it so that we don't have to futz with so much of the configuration when we configure the next zone based on this one.

root@global:~# zonecfg -z web
Use 'create' to begin configuring a new zone.
zonecfg:web> create -t SYSsolaris-kz
zonecfg:web> select device id=0
zonecfg:web:device> set storage=dev:zvol/dsk/zones/%{zonename}/disk%{id}
zonecfg:web:device> end
zonecfg:web> select anet id=0
zonecfg:web:anet> set lower-link=balstub0
zonecfg:web:anet> set allowed-address=
zonecfg:web:anet> set configure-allowed-address=true
zonecfg:web:anet> end
zonecfg:web> add anet
zonecfg:web:anet> set lower-link=internalstub0
zonecfg:web:anet> set allowed-dhcp-cids=%{zonename}
zonecfg:web:anet> end
zonecfg:web> info
zonename: web
brand: solaris-kz
autoboot: false
autoshutdown: shutdown
hostid: 0xdf87388
        lower-link: balstub0
        configure-allowed-address: true
        defrouter not specified
        allowed-dhcp-cids not specified
        link-protection: "mac-nospoof, ip-nospoof"
        id: 0
        lower-link: internalstub0
        allowed-address not specified
        configure-allowed-address: true
        defrouter not specified
        allowed-dhcp-cids.template: %{zonename}
        allowed-dhcp-cids: web
        link-protection: mac-nospoof
        id: 1
        match not specified
        storage.template: dev:zvol/dsk/zones/%{zonename}/disk%{id}
        storage: dev:zvol/dsk/zones/web/disk0
        id: 0
        bootpri: 0
        physical: 2G
zonecfg:web> exit
root@global:~# zoneadm -z web install
Progress being logged to /var/log/zones/zoneadm.20150114T193808Z.web.install
pkg cache: Using /var/pkg/publisher.
 Install Log: /system/volatile/install.4391/install_log
 AI Manifest: /tmp/zoneadm3808.vTayai/devel-ai-manifest.xml
  SC Profile: /usr/share/auto_install/sc_profiles/enable_sci.xml
Installation: Starting ...

        Creating IPS image
        Installing packages from:
                origin:  file:///export/repo/11.2/repo/
        The following licenses have been accepted and not displayed.
        Please review the licenses for the following packages post-install:
        Package licenses may be viewed using the command:
          pkg info --license <pkg_fmri>

DOWNLOAD                                PKGS         FILES    XFER (MB)   SPEED
Completed                            451/451   63686/63686  579.9/579.9    0B/s

PHASE                                          ITEMS
Installing new actions                   86968/86968
Updating package state database                 Done
Updating package cache                           0/0
Updating image state                            Done
Creating fast lookup database                   Done
Installation: Succeeded
        Done: Installation completed in 431.445 seconds.

root@global:~# zoneadm -z web boot
root@global:~# zlogin -C web
        Perform sysconfig.  Allow networking to be configured automatically.
        ~~.   (one ~ for ssh, one for zlogin -C)
root@global:~# zlogin web

At this point, networking inside the zone should look like this:

root@web:~# ipadm
NAME              CLASS/TYPE STATE        UNDER      ADDR
lo0               loopback   ok           --         --
   lo0/v4         static     ok           --
   lo0/v6         static     ok           --         ::1/128
net0              ip         ok           --         --
   net0/v4        inherited  ok           --
net1              ip         ok           --         --
   net1/v4        dhcp       ok           --
   net1/v6        addrconf   ok           --         <IPv6addr>

Configure the NFS mounts for web content (/web) and IPS repos (/repo).

root@web:~# cat >> /etc/vfstab - /repo      nfs     -       yes     - -       /web    nfs     -       yes     -
root@web:~# svcadm enable -r nfs/client

Now, update the pkg image configuration so that it uses the repository from the correct path. 

root@web:~# pkg set-publisher -O file:///repo/ solaris

Update the apache configuration so that it looks to /web for the document root. 

root@web:~# vi /etc/apache2/2.2/httpd.conf

This involves (at a minimum) changing DocumentRoot to "/web" and changing the <Directory "/var/apache/2.2/htdocs"> line to <Directory "/web">.  Your needs will be different and probably more complicated.  This is not an Apache tutorial and I'm not qualified to give it.  After modifying the configuration file, start the web server.

root@web:~# svcadm enable -r svc:/network/http:apache22

This is a good time to do any other configuration (users, other software, etc.) that you need.  If you did the changes above really quickly, you may also want to wait for first boot tasks like man-index to complete.  Allowing it to complete now will mean that it doesn't need to be redone for every instance of this zone that you create.

Since this is a type of a zone that shouldn't need to have its configuration changed a whole lot, let's use the immutable global zone (IMGZ) feature to lock down the web zone.  Note that we use IMGZ inside a kernel zone because a kernel zone is another global zone.

root@web:~# zonecfg -z global set file-mac-profile=fixed-configuration
updating /platform/sun4v/boot_archive
root@web:~# init 6

Back in the global zone, we are ready to create a clone archive once the zone reboots.

root@global:~# archiveadm create -z web /export/web.uar
Initializing Unified Archive creation resources...
Unified Archive initialized: /export/web.uar
Logging to: /system/volatile/archive_log.6835
Executing dataset discovery...
Dataset discovery complete
Creating install media for zone(s)...
Media creation complete
Preparing archive system image...
Beginning archive stream creation...
Archive stream creation complete
Beginning final archive assembly...
Archive creation complete

Now that the web clone unified archive has been created, it can be used on this machine or any other with a similar global zone configuration (etherstubs of same names, dhcp server, same nfs exports, etc.) to quickly create new web servers that fit the model described in the diagram at the top of this post.  To create the free kernel zone:

root@global:~# zonecfg -z free
Use 'create' to begin configuring a new zone.
zonecfg:free> create -a /export/web.uar
zonecfg:free> select anet id=0
zonecfg:free:anet> set allowed-address=
zonecfg:free:anet> end
zonecfg:free> select capped-memory
zonecfg:free:capped-memory> set physical=4g
zonecfg:free:capped-memory> end
zonecfg:free> add virtual-cpu
zonecfg:free:virtual-cpu> set ncpus=2
zonecfg:free:virtual-cpu> end
zonecfg:free> exit

If I were doing this for a purpose other than this blog post, I would have also created a sysconfig profile and passed it to zoneadm install.  This would have made the first boot completely hands-off.

root@global:~# zoneadm -z free install -a /export/web.uar
root@global:~# zoneadm -z free boot
root@global:~# zlogin -C free
[Connected to zone 'free' console]
   run sysconfig because I didn't do zoneadm install -c sc_profile.xml
SC profile successfully generated as:

Once we log into free, we see that there's no more setup to do.

root@free:~# df -h -F nfs
Filesystem             Size   Used  Available Capacity  Mounted on
                       194G    36G       158G    19%    /repo
                       158G    31K       158G     1%    /web
root@free:~# ipadm
NAME              CLASS/TYPE STATE        UNDER      ADDR
lo0               loopback   ok           --         --
   lo0/v4         static     ok           --
   lo0/v6         static     ok           --         ::1/128
net0              ip         ok           --         --
   net0/v4        inherited  ok           --
net1              ip         ok           --         --
   net1/v4        dhcp       ok           --
   net1/v6        addrconf   ok           --         fe80::8:20ff:fed0:5eb/10

Nearly identical steps can be taken with the deployment of premium.  The key difference there is that we are dedicating two cores (add dedicated-cpu; set cores=...) rather than allocating to virtual-cpus (add virtual-cpu; set ncpus=...).  That is, no one else can use any of the cpus on premium's  cores but free has to compete with the rest of the system for cpu time.

root@global:~# psrinfo -t
socket: 0
  core: 201457665
    cpus: 0-7
  core: 201654273
    cpus: 8-15
  core: 201850881
    cpus: 16-23
  core: 202047489
    cpus: 24-31
root@global:~# zonecfg -z premium
zonecfg:premium> create -a /export/web.uar
zonecfg:premium> select anet id=0
zonecfg:premium:anet> set allowed-address=
zonecfg:premium:anet> end
zonecfg:premium> select capped-memory
zonecfg:premium:capped-memory> set physical=8g
zonecfg:premium:capped-memory> end
zonecfg:premium> add dedicated-cpu
zonecfg:premium:dedicated-cpu> set cores=201850881,202047489
zonecfg:premium:dedicated-cpu> end
zonecfg:premium> exit

The install and boot of premium will then be the same as that of free.  After both zones are up, we can see that psrinfo reports the number of cores for premium but not for free.

root@global:~# zlogin free psrinfo -pv
The physical processor has 2 virtual processors (0-1)
  SPARC-T5 (chipid 0, clock 3600 MHz)
root@global:~# zlogin free prtconf | grep Memory
Memory size: 4096 Megabytes
root@global:~# zlogin premium psrinfo -pv
The physical processor has 2 cores and 16 virtual processors (0-15)
  The core has 8 virtual processors (0-7)
  The core has 8 virtual processors (8-15)
    SPARC-T5 (chipid 0, clock 3600 MHz)
root@global:~# zlogin premium prtconf | grep Memory
Memory size: 8192 Megabytes

That's enough for this post.  Next time, we'll get teeter going.

in-the-box NFS and pkg repository

This is the third in a series of short blog entries.  If you are new to the series, I suggest you start from the top.

As shown in our system diagram, the kernel zones have no direct connection to the outside world.  This will make it quite hard for them to apply updates.  To get past that we will set up a pkg repository in the global zone and export it via NFS to the zones.  I won't belabor the topic of a Local IPS Repositories, because our fine doc writers have already covered that.

As a quick summary, I first created a zfs file system for the repo.  On this system, export is a separate pool with its topmost dataset mounted at /export.  By default /export is the rpool/export dataset - you may need to adjust commands to match your system.

root@global:~# zfs create -p export/repo/11.2

I then followed the procedure in MOS Doc ID 1928542.1 for creating a Solaris 11.2 repository, including all of the SRUs.  That resulted in having a repo with the solaris publisher at /export/repo/11.2/repo.

Since I have a local repo for the kernel zones to use, I figured the global zone may as well use it too.

root@global:~# pkg set-publisher -O  file:///export/repo/11.2/repo/ solaris

To make this publisher accessible (read-only) to the zones on the network, it needs to be NFS exported.

root@global:~# share -F nfs -o ro=@ /export/repo/11.2/repo

Now I'll get ahead of myself a bit - I've not actually covered the installation of the free or premium zones yet.  Let's pretend we have a kernel zone called web and we want the repository to be accessible at /repo in web.

root@global:~# zlogin web
root@web:~# vi /etc/vfstab
   (add an entry)
root@web:~# grep /repo /etc/vfstab - /repo      nfs     -       yes     -
root@web:~# svcadm enable -r nfs/client

If svc:/network/nfs/client was already enabled, use mount /repo instead of svcadm enable.  Once /repo is mounted, update the solaris publisher.

root@web:~# pkg set-publisher -O  file:///repo/ solaris

In this example, we also want to have some content shared from the global zone into each of the web zones.  To make that possible:

root@global:~# zfs create export/web
root@global:~# share -F nfs -o ro=@ /export/web

Again, this is exported read-only to the zones.  Adjust for your own needs.

That's it for this post.  Next time we'll create a unified archive that can be used for quickly stamping out lots of web zones.

in-the-box networking

In my previous post, I described a scenario where a couple networks are needed to shuffle NFS and web traffic between a few zones.  In this post, I'll describe the configuration of the networking.  As a reminder, here's the configuration we are after.

The green ( network is used for the two web server zones that need to connect services in the global zone.  The red ( network is used for communication between the load balancer and the web servers.  The basis for this simplistic in-the-box network is an etherstub.

The red network is a bit simpler than the green network, so we'll start with that.

root@global:~# dladm create-etherstub balstub0

That's it!  The (empty) network has been created by simply creating an etherstub.  As zones are configured to use balstub0 as the lower-link in their anet resources, they will attach to this network.

The green network is just a little bit more involved because there will be services (DHCP and NFS) in the global zone that will use this network.

root@global:~# dladm create-etherstub internalstub0
root@global:~# dladm create-vnic -l internalstub0 internal0
root@global:~# ipadm create-ip internal0
root@global:~# ipadm create-addr -T static -a internal0

That wasn't a lot harder.  What we did here was create an etherstub named internalstub0.  On top of it, we created a vnic called internal0, attached an IP interface onto it, then set a static IP address.

As was mentioned in the introductory post, we'll have DHCP manage the IP address allocation for zones that use internalstub0.  Setup of that is pretty straight-forward too.

root@global:~# cat > /etc/inet/dhcpd4.conf
default-lease-time 86400;
log-facility local7;
subnet netmask {
    option broadcast-address;
root@global:~# svcadm enable svc:/network/dhcp/server:ipv4

The real Solaris blogs junkie will recognize this as a simplified version of something from the Maine office.

kernel zones vs. fs resources

Traditionally, zones have had a way to perform loopback mounts of global zone file systems using zonecfg fs resources with fstype=lofs.  Since kernel zones run a separate kernel, lofs is not really an option.  Other file systems can be safely presented to kernel zones by delegating the devices, so the omission of fs resources is not as bad as it may initially sound.

For those that really need something that works like a lofs fs resource, there's a way to simulate the functionality with NFS.  Consider the following system.

It has a native zone, teeter, that has load balancing software in it and two kernel zones, free and premium. The idea behind this contrived example is that freeloaders get directed to the small kernel zone and the paying customers use the resources of the bigger zone.  Neither free nor premium are directly connected to the outside world.  They use NFS served from the global zone for web content and package repositories.  When it comes time to update the content seen by all of the web zones, the admin only needs to update it in one place.  It is quite acceptable for this one place to be on the NFS server - that is, in the global zone.

To simplify the configuration of each zone, all network configuration is performed in the global zone with zonecfg.  For example, this is the configuration of free.

root@global:~# zonecfg -z free export
create -b
set brand=solaris-kz
set autoboot=false
set autoshutdown=shutdown
set hostid=0x19df23f0
add anet
set lower-link=internalstub0
set configure-allowed-address=true
set link-protection=mac-nospoof
set mac-address=auto
set id=1
add anet
set lower-link=balstub0
set allowed-address=
set configure-allowed-address=true
set link-protection=mac-nospoof
set mac-address=auto
set id=0

This example shows two ways to handle the network configuration from the global zone. The green network is configured using internalstub0 as the lower link. A DHCP server is configured in the global zone to dynamically assign addresses to all the zones that have a vnic on that network. The red network uses balstub0 as the lower link.  Because the load balancer will need to be configured to use specific IP addresses for free and premium zones, allocating these addresses from a dynamic address range seems prone to troubles.

In the next few blog entries, I'll cover the following topics:

Monday Jun 02, 2014

Oops, I left my kernel zone configuration behind!

Most people use boot environments to move in one direction.  A system starts with an initial installation and from time to time new boot environments are created - typically as a result of pkg update - and then the new BE is booted.  This post is of little interest to those people as no hackery is needed.  This post is about some mild hackery.

During development, I commonly test different scenarios across multiple boot environments.  Many times, those tests aren't related to the act of configuring or installing zone and I so it's kinda handy to avoid the effort involved of zone configuration and installation.  A somewhat common order of operations is like the following:

# beadm create -e golden -a test1
# reboot

Once the system is running in the test1 BE, I install a kernel zone.

# zonecfg -z a178 create -t SYSsolaris-kz
# zoneadm -z a178 install

Time passes, and I do all kinds of stuff to the test1 boot environment and want to test other scenarios in a clean boot environment.  So then I create a new one from my golden BE and reboot into it.

# beadm create -e golden -a test2
# reboot

Since the test2 BE was created from the golden BE, it doesn't have the configuration for the kernel zone that I configured and installed.  Getting that zone over to the test2 BE is pretty easy.  My test1 BE is really known as s11fixes-2.

root@vzl-212:~# beadm mount s11fixes-2 /mnt
root@vzl-212:~# zonecfg -R /mnt -z a178 export | zonecfg -z a178 -f -
root@vzl-212:~# beadm unmount s11fixes-2
root@vzl-212:~# zoneadm -z a178 attach
root@vzl-212:~# zoneadm -z a178 boot

On the face of it, it would seem as though it would have been easier to just use zonecfg -z a178 create -t SYSolaris-kz within the test2 BE to get the new configuration over.  That would almost work, but it would have left behind the encryption key required for access to host data and any suspend image.  See solaris-kz(5) for more info on host data.  I very commonly have more complex configurations that contain many storage URIs and non-default resource controls.  Retyping them would be rather tedious.

Thursday May 08, 2014

No time like the future

Zones have forever allowed different time zones.  Kernel zones kicks that up to 11 (or is that 11.2?) with the ability to have an entirely different time in the zone.  To be clear, this only works with kernel zones.  You can see from the output below that the zone in use has brand solaris-kz.

root@kzx-05:~# zoneadm -z junk list -v
  ID NAME             STATUS      PATH                         BRAND      IP    
  12 junk             running     -                            solaris-kz excl  

By default, the clocks between the global zone and a kernel zone are in sync.  We'll use console logging to show that...

root@kzx-05:~# zlogin -C junk
[Connected to zone 'junk' console]

vzl-178 console login: root
Last login: Fri Apr 18 13:58:01 on console
Oracle Corporation      SunOS 5.11      11.2    April 2014
root@vzl-178:~# date "+%Y-%m-%d %T"
2014-04-18 13:58:57
root@vzl-178:~# exit

vzl-178 console login: ~.
[Connection to zone 'junk' console closed]
root@kzx-05:~# tail /var/log/zones/junk.console 
2014-04-18 13:58:45 vzl-178 console login: root
2014-04-18 13:58:46 Password: 
2014-04-18 13:58:48 Last login: Fri Apr 18 13:58:01 on console
2014-04-18 13:58:48 Oracle Corporation      SunOS 5.11      11.2    April 2014
2014-04-18 13:58:48 root@vzl-178:~# date "+%Y-%m-%d %T"
2014-04-18 13:58:57 2014-04-18 13:58:57
2014-04-18 13:58:57 root@vzl-178:~# exit
2014-04-18 13:59:04 logout
2014-04-18 13:59:04 
2014-04-18 13:59:04 vzl-178 console login: root@kzx-05:~# 

Notice that the time stamp on the log matches what we see in the output of date.  Now let's pretend that it is next year.

root@kzx-05:~# zlogin junk
root@vzl-178:~# date  0101002015
Thursday, January  1, 2015 12:20:00 AM PST

And let's be sure that it still thinks it's 2015 in the zone:

root@kzx-05:~# date; zlogin junk date; date
Friday, April 18, 2014 02:05:53 PM PDT
Thursday, January  1, 2015 12:20:18 AM PST
Friday, April 18, 2014 02:05:54 PM PDT

And the time offset survives a reboot.

root@kzx-05:~# zoneadm -z junk reboot
root@kzx-05:~# date; zlogin junk date; date
Friday, April 18, 2014 02:09:18 PM PDT
Thursday, January  1, 2015 12:23:43 AM PST
Friday, April 18, 2014 02:09:18 PM PDT

So, what's happening under the covers?  When the date is set in the kernel zone, the offset between the kernel zone's clock and the global zone's clock is stored in the kernel zone's host data.  See solaris-kz(5) for a description of host data.  Whenever a kernel zone boots, the kernel zone's clock is initialized based on this offset.

Wednesday May 07, 2014

Unified Archives

I had previously mentioned that I've spent a bit of time working on Unified Archives - Solaris 11's answer to Flash Archives.  I was going to write up a blog entry or two on it, but it turns out the staff at the Maine office has already done a bang-up job of that.

And, of course, you should check out the related video.

Zones Console Logs

You know how there's that thing that you've been meaning to do for a long time but never quite get to it?  And then one day something just pushes you over the edge?  This is a short story of being pushed over the edge.

As I was working on kernel zones, I rarely used zlogin -C to get to the console.  And then I'd get a panic.  If the panic happened early enough in boot that the dump device wasn't enabled, I'd lose all traces of what went wrong.  That pushed me over the edge - time to implement console logging!

With Solaris 11.2, there's now a zone console log for all zone brands: you can find it at /var/log/zones/zonename.console.

root@kzx-05:~# zoneadm -z junk boot
root@kzx-05:~# tail /var/log/zones/junk.console 
2014-04-18 12:59:08 syncing file systems... done
2014-04-18 12:59:11 
2014-04-18 12:59:11 [NOTICE: Zone halted]
2014-04-18 13:00:56 
2014-04-18 13:00:56 [NOTICE: Zone booting up]
2014-04-18 13:00:59 Boot device: disk0  File and args: 
2014-04-18 13:00:59 reading module /platform/i86pc/amd64/boot_archive...done.
2014-04-18 13:00:59 reading kernel file /platform/i86pc/kernel/amd64/unix...done.
2014-04-18 13:01:00 SunOS Release 5.11 Version 11.2 64-bit
2014-04-18 13:01:00 Copyright (c) 1983, 2014, Oracle and/or its affiliates. All rights reserved.

The output above will likely generate a few questions.  Let me try to answer those.

Will this log passwords typed on the console?

Generally, passwords are entered in a way that they aren't echoed to the terminal.  The console log only contains the characters written from the zone to the terminal - that is it logs the echoes.  Characters you type that are never printed are never logged.

Can just anyone read the console log file?

No.  You need to be root in the global zone to read the console log file.

How is log rotation handled?

Rules have been added to logadm.conf(4) to handle weekly log rotation.

I see a time stamp, but what time zone is that?

All time stamps are in the same time zone as svc:/system/zones:default.  That should be the same as is reported by:

root@kzx-05:~# svcprop -p timezone/localtime timezone

Really, what does the time stamp mean?

It was the time that the first character of the line was written to the terminal.  This means that if a line contains a shell prompt, the time is the time that the prompt was printed, not the time that the person finished entering a command.

I see other files in /var/log/zones.  What are they?

zonename.messages contains various diagnostic information from zoneadmd.  If all goes well, you will never need to look at that.

zoneadm.* contains log files from attach, install, clone, and uninstall operations.  These files have existed since Solaris 11 first launched.

Tuesday May 06, 2014

A tour of a kernel zone

In my earlier post, I showed how to configure and install a kernel zone.  In this post, we'll take a look at this kernel zone.

The kernel zone was installed within an LDom on a T5-4.

root@vzl-212:~# prtdiag -v | head -2
System Configuration:  Oracle Corporation  sun4v SPARC T5-4
Memory size: 65536 Megabytes
root@vzl-212:~# psrinfo | wc -l

The kernel zone was configured with:

 root@vzl-212:~# zonecfg -z myfirstkz create -t SYSsolaris-kz

Let's take a look at the resulting configuration.

root@vzl-212:~# zonecfg -z myfirstkz info | cat -n
     1    zonename: myfirstkz
     2    brand: solaris-kz
     3    autoboot: false
     4    autoshutdown: shutdown
     5    bootargs: 
     6    pool: 
     7    scheduling-class: 
     8    hostid: 0x2b2044c5
     9    tenant: 
    10    anet:
    11        lower-link: auto
    12        allowed-address not specified
    13        configure-allowed-address: true
    14        defrouter not specified
    15        allowed-dhcp-cids not specified
    16        link-protection: mac-nospoof
    17        mac-address: auto
    18        mac-prefix not specified
    19        mac-slot not specified
    20        vlan-id not specified
    21        priority not specified
    22        rxrings not specified
    23        txrings not specified
    24        mtu not specified
    25        maxbw not specified
    26        rxfanout not specified
    27        vsi-typeid not specified
    28        vsi-vers not specified
    29        vsi-mgrid not specified
    30        etsbw-lcl not specified
    31        cos not specified
    32        evs not specified
    33        vport not specified
    34        id: 0
    35    device:
    36        match not specified
    37        storage: dev:/dev/zvol/dsk/rpool/VARSHARE/zones/myfirstkz/disk0
    38        id: 0
    39        bootpri: 0
    40    capped-memory:
    41        physical: 2G
    42    suspend:
    43        path: /system/zones/myfirstkz/suspend
    44        storage not specified
    45    keysource:
    46        raw redacted

There are a number of things to notice in this configuration.

  • No zonepath.  Kernel zones install into a real or virtual disks - quite like the way that logical domains install into real or virtual disks.  The virtual disk(s) that contain the root zfs pool are specified by one or more device resources that contain a bootpri property (line 39).  By default, a kernel zone's root disk is a 16 GB zfs volume in the global zone's root zfs pool.  There's more about this in the solaris-kz(5) man page.  It's never been a good idea to directly copy things into a zone's zonepath.  With kernel zones that just doesn't work.
  • The device resource accepts storage URI's (line 37).  See suri(5).  Storage URI's were introduced in Solaris 11.1 in support of Zones on Shared Storage (rootzpool and zpool resources).  This comes in really handy when a kernel zone is installed on external storage and may be migrated between hosts from time to time.
  • The device resource has an id property (line 38).  This means that this disk will be instance 0 of zvblk - which will translate into it being c1d0.  We'll see more of that in a bit.
  • The anet resource has an id property (line 34).  This means that this anet will be instance 0 of zvnet - which will normally be seen as net0.  Again, more of that in a bit.
  • A memory resource control, capped-memory, is set by default (lines 40 - 41).  In the solaris or solaris10 brand, this would mean that rcapd is used to soft limit the amount of physical memory a zone can use.  Kernel zones are different.  Not only is this a hard limit on the amount of physical memory that the kernel zone can use - the memory is immediately allocated and reserved as the zone boots.
  • A suspend resource is present, which defines a location for to write out a suspend file when zoneadm -z zonename suspend is invoked.
  • The keysource resource is used for an encryption key that is used to encrypt suspend images and host data.  solaris-kz(5) has more info on this.

There are several things not shown here that may also be of interest:

  • Previously, autoshutdown (line 4) allowed halt and shutdown as values.  It now also supports suspend for kernel zones only.  As you may recall, autoshutdown is used by svc:/system/zones:default when it is transitioning from online to offline.  If set to halt, the zone (kernel or otherwise) is brought down abruptly.  If set to shutdown, a graceful shutdown is performed.  Now, if a kernel zone has it set to suspend, the kernel zone will be suspended as svc:/system/zones:default goes offline.  When zoneadm boot is issued for a suspended zone, the zone is resumed.
  • If there are multiple device resources that have bootpri set (i.e. bootable devices), zoneadm install will add all of the boot devices to a mirrored root zpool.

From the earlier blog entry, this kernel zone was booted and sysconfig was performed.  Let's look inside.

To get into the zone, you can use zlogin just like you do with any other zone.

root@vzl-212:~# zlogin myfirstkz
[Connected to zone 'myfirstkz' pts/3]
Oracle Corporation      SunOS 5.11      11.2    April 2014

As I alluded to above, a kernel zone gets a fixed amount of memory.  The value shown above matches the value shown in the capped-memory resource in the zone configuration.

root@myfirstkz:~# prtconf | grep ^Memory
Memory size: 2048 Megabytes

By default, a kernel zone gets one virtual cpu.  You can adjust this with the virtual-cpu or dedicated-cpu zonecfg resources.  See solaris-kz(5).

root@myfirstkz:~# psrinfo
0       on-line   since 04/18/2014 22:39:22

Because a kernel zone runs its own kernel, it does not require that packages are in sync between the global zone and the kernel zone.  Notice that the pkg publisher output does not say (syspub) - the kernel zone and the global zone can even use different publishers for the solaris repository.  As SRU's and updates start to roll out you will see that you can independently update the global zone and the kernel zones on it.

root@myfirstkz:~# pkg publisher
solaris                     origin   online F

Because a kernel zone runs its own kernel, it considers itself to be a global zone.

root@myfirstkz:~# zonename

The root disk that I mentioned above shows up at c1d0.

root@myfirstkz:~# format
Searching for disks...done

       0. c1d0 <kz-vDisk-ZVOL-16.00GB>
Specify disk (enter its number): ^D

And the anet shows up as net0 using physical device zvnet0.

root@myfirstkz:~# dladm show-phys
LINK              MEDIA                STATE      SPEED  DUPLEX    DEVICE
net0              Ethernet             up         1000   full      zvnet0

Let's jump on the console and see what happens when bad things happen...

root@myfirstkz:~# logout

[Connection to zone 'myfirstkz' pts/3 closed]

root@vzl-212:~# zlogin -C myfirstkz
[Connected to zone 'myfirstkz' console]

myfirstkz console login: root
Apr 18 23:47:06 myfirstkz login: ROOT LOGIN /dev/console
Last login: Fri Apr 18 23:32:28 on kz/term
Oracle Corporation      SunOS 5.11      11.2    April 2014
root@myfirstkz:~# dtrace -wn 'BEGIN { panic() }'
dtrace: description 'BEGIN ' matched 1 probe

panic[cpu0]/thread=c4001afbd720: dtrace: panic action at probe dtrace:::BEGIN (ecb c400123381e0)

000002a10282acd0 dtrace:dtrace_probe+c54 (252acb8f029b3, 0, 0, 33fe, c4001b75e000, 103215b2)
  %l0-3: 0000c400123381e0 0000c40019b82340 00000000000013fc 0000c40016889740
  %l4-7: 0000c4001bc00000 0000c40019b82370 0000000000000003 000000000000ff00
000002a10282af10 dtrace:dtrace_state_go+4ac (c40019b82340, 100, 0, c40019b82370, 16, 702a7040)
  %l0-3: 0000000000030000 0000000010351580 0000c4001b75e000 00000000702a7000
  %l4-7: 0000000000000000 0000000df8475800 0000000000030d40 00000000702a6c00
000002a10282aff0 dtrace:dtrace_ioctl+ad8 (2c, 612164be40, 2a10282bacc, 202003, c400162fcdc0, 64747201)
  %l0-3: 000000006474720c 0000c40019b82340 000002a10282b1a4 00000000ffffffff
  %l4-7: 00000000702a6ee8 00000000702a7100 0000000000000b18 0000000000000180
000002a10282b8a0 genunix:fop_ioctl+d0 (c40019647a40, 0, 612164be40, 202003, c400162fcdc0, 2a10282bacc)
  %l0-3: 000000006474720c 0000000000202003 0000000001374f2c 0000c40010d84180
  %l4-7: 0000000000000000 0000000000000000 00000000000000c0 0000000000000000
000002a10282b970 genunix:ioctl+16c (3, 6474720c, 612164be40, 3, 1fa5ac, 0)
  %l0-3: 0000c4001a5ea958 0000000010010000 0000000000002003 0000000000000000
  %l4-7: 0000000000000003 0000000000000004 0000000000000000 0000000000000000

syncing file systems... done
dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel sections: zfs
 0:04  90% done (kernel)
 0:05 100% done (zfs)
100% done: 127783 (kernel) + 12950 (zfs) pages dumped, dump succeeded

[NOTICE: Zone rebooting]
NOTICE: Entering OpenBoot.
NOTICE: Fetching Guest MD from HV.
NOTICE: Starting additional cpus.
NOTICE: Initializing LDC services.
NOTICE: Probing PCI devices.
NOTICE: Finished PCI probing.

SPARC T5-4, No Keyboard
Copyright (c) 1998, 2014, Oracle and/or its affiliates. All rights reserved.
OpenBoot 4.36.0, 2.0000 GB memory available, Serial #723535045.
Ethernet address 0:0:0:0:0:0, Host ID: 2b2044c5.

Boot device: disk0  File and args: 
SunOS Release 5.11 Version 11.2 64-bit
Copyright (c) 1983, 2014, Oracle and/or its affiliates. All rights reserved.
Hostname: myfirstkz
Apr 18 23:48:44 myfirstkz savecore: System dump time: Fri Apr 18 23:47:42 2014
Apr 18 23:48:44 myfirstkz savecore: Saving compressed system crash dump files in directory /var/crash

myfirstkz console login: Apr 18 23:49:02 myfirstkz savecore: Decompress all crash dump files with '(cd /var/crash && savecore -v 0)' or individual files with 'savecore -vf /var/crash/vmdump{,-<secname>}.0'

EVENT-TIME: Fri Apr 18 23:49:07 CDT 2014
PLATFORM: SPARC-T5-4, CSN: unknown, HOSTNAME: myfirstkz
SOURCE: software-diagnosis, REV: 0.1
EVENT-ID: f4c0d684-da80-425f-e45c-97bd0239b154
DESC: The system has rebooted after a kernel panic.

After disconnecting from the console (~.) I was back at the global zone root prompt.  The global zone didn't panic - the kernel zone did.

root@vzl-212:~# uptime; zlogin myfirstkz uptime
  9:53pm  up  8:03,  2 users,  load average: 0.03, 0.12, 0.08
 11:52pm  up 5 min(s),  0 users,  load average: 0.04, 0.26, 0.15

That's the end of this tour.  Thanks for coming, and please come again!

Tuesday Apr 29, 2014

Need another disk in your zone? No problem!

Solaris 11.2 allows you to modify your zone configuration and apply those changes without a reboot.  Check this out:

First, I want to show that there are no disk devices present in the zone.

root@vzl-212:~# zlogin z1 'find /dev/*dsk'

Next, I modify the static zone configuration to add a access to all partitions of a particular disk.  This has no effect on the running zone.

root@vzl-212:~# zonecfg -z z1
zonecfg:z1> add device
zonecfg:z1:device> set match=/dev/*dsk/c0t600144F0DBF8AF19000052EC175B0004d0*
zonecfg:z1:device> end
zonecfg:z1> exit

I then use zoneadm to apply the changes in the zone configuration to the running zone.

root@vzl-212:~# zoneadm -z z1 apply
zone 'z1': Checking: Adding device match=/dev/*dsk/c0t600144F0DBF8AF19000052EC175B0004d0*
zone 'z1': Applying the changes

And now, the zone can see the devices. 

root@vzl-212:~# zlogin z1 'find /dev/*dsk'

If I wanted to just temporarily modify the configuration, there's an option to just modify the running configuration. For example, if I plan on doing some maintenance on my ZFS Storage Appliance that hosts the LUN I allocated above, I may want to be sure that the zone can't see it for a bit.  That's easy enough.

Here I use zonecfg's -r option to modify the running configuration.

root@vzl-212:~# zonecfg -z z1 -r
zonecfg:z1> info device
    match: /dev/*dsk/c0t600144F0DBF8AF19000052EC175B0004d0*
    storage not specified
    allow-partition not specified
    allow-raw-io not specified
zonecfg:z1> remove device
zonecfg:z1> info device
zonecfg:z1> commit
zone 'z1': Checking: Removing device match=/dev/*dsk/c0t600144F0DBF8AF19000052EC175B0004d0*
zone 'z1': Applying the changes
zonecfg:z1> exit

Without the -r option, zonecfg displays the on-disk configuration.

root@vzl-212:~# zonecfg -z z1 info device
    match: /dev/*dsk/c0t600144F0DBF8AF19000052EC175B0004d0*
    storage not specified
    allow-partition not specified
    allow-raw-io not specified

root@vzl-212:~# zonecfg -z z1 -r info device

The running configuration reflects the contents of /dev/dsk and /dev/rdsk inside the zone.

root@vzl-212:~# zlogin z1 'find /dev/*dsk'

When it is time to revert back to the on-disk configuration, simply apply the on-disk configuration and the device tree inside the zone reverts to the on-disk configuration.

root@vzl-212:~# zoneadm -z z1 apply
zone 'z1': Checking: Adding device match=/dev/*dsk/c0t600144F0DBF8AF19000052EC175B0004d0*
zone 'z1': Applying the changes

root@vzl-212:~# zlogin z1 'find /dev/*dsk'

Live Zone Reconfiguration is not limited to just device resources, it works with most other resources as well.  See the Live Zones Reconfiguration section in zones(5).

Solaris 11.2 Beta zones man pages

For those of you trying out the latest zones features, beware that many of the man page updates did not make it into the beta build, so man(1) will not tell you the full story.  The man page updates are in the beta documentation, linked below for your convenience.

The zones book has been split apart and a new book has been introduced for kernel zones:

Install a kernel zone in 3 steps

One of the shiniest new features in Oracle Solaris 11.2 is Kernel Zones.  Kernel Zones provide the familiarity of zones while providing independent kernels.  This means that it's now possible to have zones that run different patch levels, act as CIFS servers, load kernel modules, etc.  So, let's get to installing a kernel zone.

If you have installed any other zone on Solaris before, this will look quite familiar.  After all, it is  just another zone, right?

For this procedure to work, there are some prerequisites that shouldn't be much of a problem in a production environment, but are a bit of a problem if your normal playground is VirtualBox or the 6 year old server you found on the loading dock.

Step 1: Configure

root@vzl-212:~# zonecfg -z myfirstkz create -t SYSsolaris-kz

Step 2: Install

root@vzl-212:~# zoneadm -z myfirstkz install
Progress being logged to /var/log/zones/zoneadm.20140419T032707Z.myfirstkz.install
pkg cache: Using /var/pkg/publisher.
 Install Log: /system/volatile/install.5368/install_log
 AI Manifest: /tmp/zoneadm4798.dAaO7j/devel-ai-manifest.xml
  SC Profile: /usr/share/auto_install/sc_profiles/enable_sci.xml
Installation: Starting ...

        Creating IPS image
        Installing packages from:
                origin:  http://ipkg/solaris11/dev/
        The following licenses have been accepted and not displayed.
        Please review the licenses for the following packages post-install:
        Package licenses may be viewed using the command:
          pkg info --license <pkg_fmri>

DOWNLOAD                                PKGS         FILES    XFER (MB)   SPEED
Completed                            549/549   76929/76929  680.9/680.9  8.4M/s

PHASE                                          ITEMS
Installing new actions                   104278/104278
Updating package state database                 Done 
Updating package cache                           0/0 
Updating image state                            Done 
Creating fast lookup database                   Done 
Installation: Succeeded
        Done: Installation completed in 438.132 seconds.

Step 3: Celebrate!

At this point the kernel zone is installed and ready for boot. 

root@vzl-212:~# zoneadm -z myfirstkz boot
root@vzl-212:~# zlogin -C myfirstkz
[Connected to zone 'myfirstkz' console]
Loading smf(5) service descriptions: 220/220

Because a sysconfig profile was not provided during installation, sysconfig(1M) will ask a few things on first boot.

                           System Configuration Tool
     System Configuration Tool enables you to specify the following            
     configuration parameters for your newly-installed Oracle Solaris 11       
     - system hostname, network, time zone and locale, date and time, user     
       and root accounts, name services, keyboard layout, support
     System Configuration Tool produces an SMF profile file in
     How to navigate through this tool:
     - Use the function keys listed at the bottom of each screen to move       
       from screen to screen and to perform other operations.
     - Use the up/down arrow keys to change the selection or to move           
       between input fields.
     - If your keyboard does not have function keys, or they do not            
       respond, press ESC; the legend at the bottom of the screen will         
       change to show the ESC keys for navigation and other functions.         
  F2_Continue  F6_Help  F9_Quit

If you've read this far into this entry, you know how to take it from here.

Solaris 11.2 Beta Opens Today

The Oracle Solaris 11.2 Open Beta has begun.  Please grab the bits and let us know what you think.

As I mentioned a while back, I've spent a lot of my time over the last couple of years on Kernel Zones and Unified Archives.  This has had overlap with other prominent projects like integration of OpenStack in Solaris and lesser known ones such as zonecfg template properties, and zone console logging.  In the coming days, watch this space for blog entries that should help you explore some of advances that Solaris 11.2 offers.



  • Mike Gerdts - Principal Software Engineer, Solaris Virtualization
  • More coming soon!


« July 2015