
Recent Posts

Live storage migration for kernel zones

From time to time every sysadmin realizes that something that is consuming a bunch of storage is sitting in the wrong place.  This could be because of a surprise conversion of proof of concept into proof of production or something more positive like ripping out old crusty storage for a nice new Oracle ZFS Storage Appliance. When you use kernel zones with Solaris 11.3, storage migration gets a lot easier. As our fine manual says:

The Oracle Solaris 11.3 release introduces the Live Zone Reconfiguration feature for Oracle Solaris Kernel Zones. With this feature, you can reconfigure the network and the attached devices of a running kernel zone. Because the configuration changes are applied immediately without requiring a reboot, there is zero downtime service availability within the zone. You can use the standard zone utilities such as zonecfg and zoneadm to administer the Live Zone Reconfiguration.

Well, we can combine this with other excellent features of Solaris to have no-outage storage migrations, even of the root zpool. In this example, I have a kernel zone that was created with something like:

root@global:~# zonecfg -z kz1 create -t SYSsolaris-kz
root@global:~# zoneadm -z kz1 install -c <scprofile.xml>

That happened several weeks ago and now I really wish that I had installed it using an iSCSI LUN from my ZFS Storage Appliance. We can fix that with no outage. First, I'll update the zone's configuration to add a bootable iSCSI disk.

root@global:~# zonecfg -z kz1
zonecfg:kz1> add device
zonecfg:kz1:device> set storage=iscsi://zfssa/luname.naa.600144F0DBF8AF19000053879E9C0009
zonecfg:kz1:device> set bootpri=0
zonecfg:kz1:device> end
zonecfg:kz1> exit

Next, I tell the system to add that disk to the running kernel zone.

root@global:~# zoneadm -z kz1 apply
zone 'kz1': Checking: Adding device storage=iscsi://zfssa/luname.naa.600144F0DBF8AF19000053879E9C0009
zone 'kz1': Applying the changes

Let's be sure we can see it and look at the current rpool layout.  Notice that this kernel zone is running Solaris 11.2 - I only need to have Solaris 11.3 in the global zone.

root@global:~# zlogin kz1
[Connected to zone 'kz1' pts/2]
Oracle Corporation      SunOS 5.11      11.2    May 2015
You have new mail.
root@kz1:~# format
Searching for disks...done

AVAILABLE DISK SELECTIONS:
       0. c1d0 <kz-vDisk-ZVOL-16.00GB>
          /kz-devices@ff/disk@0
       1. c1d1 <SUN-ZFS Storage 7120-1.0-120.00GB>
          /kz-devices@ff/zvblk@1
Specify disk (enter its number): ^D
root@kz1:~# zpool status rpool
  pool: rpool
 state: ONLINE
  scan: none requested
config:
        NAME    STATE     READ WRITE CKSUM
        rpool   ONLINE       0     0     0
          c1d0  ONLINE       0     0     0
errors: No known data errors

Now, zpool replace can be used to migrate the root pool over to the new storage.

root@kz1:~# zpool replace rpool c1d0 c1d1
Make sure to wait until resilver is done before rebooting.
root@kz1:~# zpool status rpool
  pool: rpool
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function in a degraded state.
action: Wait for the resilver to complete.
        Run 'zpool status -v' to see device specific details.
  scan: resilver in progress since Thu Jul 30 05:47:50 2015
    4.39G scanned
    143M resilvered at 24.7M/s, 3.22% done, 0h2m to go
config:
        NAME           STATE     READ WRITE CKSUM
        rpool          DEGRADED     0     0     0
          replacing-0  DEGRADED     0     0     0
            c1d0       ONLINE       0     0     0
            c1d1       DEGRADED     0     0     0  (resilvering)
errors: No known data errors

After a couple minutes, that completes.

root@kz1:~# zpool status rpool
  pool: rpool
 state: ONLINE
  scan: resilvered 4.39G in 0h2m with 0 errors on Thu Jul 30 05:49:57 2015
config:
        NAME    STATE     READ WRITE CKSUM
        rpool   ONLINE       0     0     0
          c1d1  ONLINE       0     0     0
errors: No known data errors
root@kz1:~# zpool list
NAME    SIZE  ALLOC   FREE  CAP  DEDUP  HEALTH  ALTROOT
rpool  15.9G  4.39G  11.5G  27%  1.00x  ONLINE  -

You may have noticed in the format output that I'm replacing a 16 GB zvol with a 120 GB disk.  However, the size of the zpool reported above doesn't reflect that it's on a bigger disk.  Let's fix that by setting the autoexpand property.

root@kz1:~# zpool get autoexpand rpool
NAME   PROPERTY    VALUE  SOURCE
rpool  autoexpand  off    default
root@kz1:~# zpool set autoexpand=on rpool
root@kz1:~# zpool list
NAME   SIZE  ALLOC  FREE  CAP  DEDUP  HEALTH  ALTROOT
rpool  120G  4.39G  115G   3%  1.00x  ONLINE  -

To finish this off, all we need to do is remove the old disk from the kernel zone's configuration.  This happens back in the global zone.

root@global:~# zonecfg -z kz1
zonecfg:kz1> info device
device 0:
        match not specified
        storage: iscsi://zfssa/luname.naa.600144F0DBF8AF19000053879E9C0009
        id: 1
        bootpri: 0
device 1:
        match not specified
        storage.template: dev:/dev/zvol/dsk/%{global-rootzpool}/VARSHARE/zones/%{zonename}/disk%{id}
        storage: dev:/dev/zvol/dsk/rpool/VARSHARE/zones/kz1/disk0
        id: 0
        bootpri: 0
zonecfg:kz1> remove device id=0
zonecfg:kz1> exit

Now, let's apply that configuration. To show what it does, I run format in kz1 before and after applying the configuration.

root@global:~# zlogin kz1 format </dev/null
Searching for disks...done

AVAILABLE DISK SELECTIONS:
       0. c1d0 <kz-vDisk-ZVOL-16.00GB>
          /kz-devices@ff/disk@0
       1. c1d1 <SUN-ZFS Storage 7120-1.0-120.00GB>
          /kz-devices@ff/zvblk@1
Specify disk (enter its number): 
root@global:~# zoneadm -z kz1 apply
zone 'kz1': Checking: Removing device storage=dev:/dev/zvol/dsk/rpool/VARSHARE/zones/kz1/disk0
zone 'kz1': Applying the changes
root@global:~# zlogin kz1 format </dev/null
Searching for disks...done

AVAILABLE DISK SELECTIONS:
       0. c1d1 <SUN-ZFS Storage 7120-1.0-120.00GB>
          /kz-devices@ff/zvblk@1
Specify disk (enter its number): 
root@global:~#

At this point the live (no outage) storage migration is complete and it is safe to destroy the old disk (rpool/VARSHARE/zones/kz1/disk0).

root@global:~# zfs destroy rpool/VARSHARE/zones/kz1/disk0


Multi-CPU bindings for Solaris Project

Traditionally, assigning specific processes to a certain set of CPUs has beendone by using processor sets (and resource pools). This is quite useful, but itrequires the hard partitioning of processors in the system. That means, we can'trestrict process A to run on CPUs 1,2,3 and process B to run on CPUs 3,4,5,because these partitions overlap. There is another way to assign CPUs to processes, which is called processor affinity, or Multi-CPU binding (MCB for short). Oracle Solaris 11.2 introduced MCB binding, as described in pbind(1M) and processor_affinity(2). With the release of Oracle Solaris 11.3, we have a new interface to assign/modify/remove MCBs, via Solaris project. Briefly, a Solaris project is a collection of processes with predefined attributes. These attributes include various resource controls one can apply to processes that belong to the project. For more details, see projects(4) and resource_controls(5). What's new is that MCB becomes simply another resource control we can manage through Solaris projects. We start by making a new project with MCB property. We assume that we haveenough privilege to create a project and there's no project called test-projectin the system, and all CPUs described by project.mcb.cpus entry exist in thesystem and online. We also assume that the listed cpus are in the resource poolto which current zone is bound. For manipulating project, we use standardcommand line tools projadd(1M)/projdel(1M)/projmod(1M). root@sqkx4450-1:~# projects -l test-projectprojects: project "test-project" does not existroot@sqkx4450-1:~# projadd -K project.mcb.cpus=0,3-5,9-11 -K project.mcb.flags=weak -K project.pool=pool_default test-projectroot@sqkx4450-1:~# projects -l test-projecttest-project projid : 100 comment: "" users : (none) groups : (none) attribs: project.mcb.cpus=0,3-5,9-11 project.mcb.flags=weak project.pool=pool_default This means that processes in test-project will be weakly bound to CPUs 0,3,4,5,9,10,11. (Note: For the concept of strong/weak binding, seeprocessor_affinity(2). In short, strong binding guarantees that processes willrun ONLY on designated CPUs, while weak binding does not have such a guarantee.) Next thing is to assign some processes to test-project. If we know PIDs oftarget processes, it can be done by newtask(1). root@sqkx4450-1:~# newtask -c 4156 -p test-projectroot@sqkx4450-1:~# newtask -c 4170 -p test-projectroot@sqkx4450-1:~# newtask -c 4184 -p test-project Let's check the result by using the following command. root@sqkx4450-1:~# pbind -q -i projid 100pbind(1M): pid 4156 weakly bound to processor(s) 0 3 4 5 9 10 11.pbind(1M): pid 4170 weakly bound to processor(s) 0 3 4 5 9 10 11.pbind(1M): pid 4184 weakly bound to processor(s) 0 3 4 5 9 10 11. Good. Now suppose we want to change the binding type to strong binding. In that case, all we need to do is change the value of project.mcb.flags to "strong", or even delete the project.mcb.flag key, because the default value is set to "strong". root@sqkx4450-1:~# projmod -s -K project.mcb.flags=strong test-projectroot@sqkx4450-1:~# projects -l test-projecttest-project projid : 100 comment: "" users : (none) groups : (none) attribs: project.mcb.cpus=0,3-5,9-11 project.mcb.flags=strong project.pool=pool_default Things look good, but... root@sqkx4450-1:~# pbind -q -i projid 100pbind(1M): pid 4156 weakly bound to processor(s) 0 3 4 5 9 10 11.pbind(1M): pid 4170 weakly bound to processor(s) 0 3 4 5 9 10 11.pbind(1M): pid 4184 weakly bound to processor(s) 0 3 4 5 9 10 11. Nothing changed actually! 
WARNING: By default, projmod(1M) only modifies project configuration file, but do not attempt to apply it to its processes. To do that, use the "-A" option. root@sqkx4450-1:~# projmod -A test-projectroot@sqkx4450-1:~# pbind -q -i projid 100pbind(1M): pid 4156 strongly bound to processor(s) 0 3 4 5 9 10 11.pbind(1M): pid 4170 strongly bound to processor(s) 0 3 4 5 9 10 11.pbind(1M): pid 4184 strongly bound to processor(s) 0 3 4 5 9 10 11. Now, suppose we want to change the list of CPUs, but oops, we made some typos. root@sqkx4450-1:~# projmod -s -K project.mcb.cpus=0,3-5,13-17 -A test-projectprojmod: Updating project test-project succeeded with following warning message.WARNING: Following ids of cpus are not found in the system:16-17root@sqkx4450-1:~# projects -l test-projecttest-project projid : 100 comment: "" users : (none) groups : (none) attribs: project.mcb.cpus=0,3-5,13-17 project.mcb.flags=strong project.pool=pool_default Our system has CPUs 0 to 15, not up to 17. In that case, we get some warnings. But the command succeeded anyway. The command simply ignores missing CPUs. root@sqkx4450-1:~# pbind -q -i projid 100pbind(1M): pid 4156 strongly bound to processor(s) 0 3 4 5 13 14 15.pbind(1M): pid 4170 strongly bound to processor(s) 0 3 4 5 13 14 15.pbind(1M): pid 4184 strongly bound to processor(s) 0 3 4 5 13 14 15. And one more thing: If you want to check the validity of project file only, use projmod(1M) without any options. root@sqkx4450-1:~# projmodprojmod: Validation warning on line 6, WARNING: Following ids of cpus are not found in the system:16-17 But projmod is not tolerant if it can't find any CPUs at all. root@sqkx4450-1:~# projmod -s -K project.mcb.cpus=17-20 -A test-projectprojmod: WARNING: Following ids of cpus are not found in the system:17-20projmod: ERROR: All of given multi-CPU binding (MCB) ids are not found in the system: project.mcb.cpus=17-20root@sqkx4450-1:~# projects -l test-projecttest-project projid : 100 comment: "" users : (none) groups : (none) attribs: project.mcb.cpus=0,3-5,13-17 project.mcb.flags=strong project.pool=pool_default Now we see ERROR. It's something that actually fails the command. Please read the error message carefully when you see it. Note that project configuration is not updated also. Before moving to next topic, one small but important tip. How do we clear MCBfrom a project? Set the value of project.mcb.cpus to "none" and removeproject.mcb.flags if there is. root@sqkx4450-1:~# projects -l test-projecttest-project projid : 100 comment: "" users : (none) groups : (none) attribs: project.mcb.cpus=none project.pool=pool_defaultroot@sqkx4450-1:~# projmod -A test-projectroot@sqkx4450-1:~# pbind -q -i projid 100root@sqkx4450-1:~# Let's move on to a little bit of advanced usage. In Oracle Solaris systems, aswell as other systems, CPUs are grouped in certain units. Currently there are'cores', 'sockets', 'processor-groups' and 'lgroups'. Utilizing these units canimprove performance aided by hardware design. (I have less familiarity withthose topics, so look at the following post about lgroups: Locality Group Observability on Solaris.) MCB for projects supports all of these CPU structures. The usage is simple. Just change "project.mcb.cpus" to "project.mcb.cores", "project.mcb.sockets", "project.mcb.pgs", or "project.mcb.lgroups". Note: To get information about CPU structures on a given system, use following commands. 
"psrinfo -t" gives information about cpu/core/socket structure, "pginfo" gives information about processor groups, and "lgrpinfo -c" gives information about lgroups. root@sqkx4450-1:~# projects -l test-projecttest-project projid : 100 comment: "" users : (none) groups : (none) attribs: project.mcb.sockets=1 project.pool=pool_defaultroot@sqkx4450-1:~# projmod -A test-projectroot@sqkx4450-1:~# pbind -q -i projid 100pbind(1M): pid 4156 strongly bound to processor(s) 1 5 9 13.pbind(1M): pid 4170 strongly bound to processor(s) 1 5 9 13.pbind(1M): pid 4184 strongly bound to processor(s) 1 5 9 13. These examples explain the basics of MCB for projects. For more details, you can refer to the appropriate man pages. But, let me briefly summarize some features we didn't explain here. And, final warning: Many features we used in this post are not supported on Oracle Solaris 11.2, even those not related to MCB directly. 1. newtask(1) also utilizes projects. When we set MCB for a project in theproject configuration file, an unprivileged user that is a member of project canuse newtask(1) to put new or his/her existing processes in it. 2. For Solaris projects APIs, look at libproject(3LIB). Warning: some features work only for 64-bit version of the library for now. 3. There are many other existing attributes of project. Combining them with MCB usually causes no problems, but there is one exception: project.pool. Ignoring all the detail, there's only one important guideline when using both project.pool and project.mcb.(cpus|cores|sockets|pgs|lgroups): all the CPUs in project.mcb.(cpus|cores|sockets|pgs|lgroups) should reside in the project.pool. When we don't specifiy project.pool and use project.mcb.(cpus|cores|sockets|pgs|lgroups), the system ASSUMES that project.pool is the default pool of the current zone. In this case, when we try to apply the project's attributes to processes, we'll see following warning message. root@sqkx4450-1:~# projects -l test-projecttest-project projid : 100 comment: "" users : (none) groups : (none) attribs: project.mcb.cpus=0,3-5,9-11 project.mcb.flags=weakroot@sqkx4450-1:~# projmod -A test-projectprojmod: Updating project test-project succeeded with following warning message.WARNING: We bind the target project to the default pool of the zone because an multi-CPU binding (MCB) entry exists. Man page references.    General information:        Project file configuration: project(4)        How to manage resource control by project: resource_controls(5)     Project utilization:        Get information of projects: projects(1)        Manage projects: projadd(1M) / projdel(1M) / projmod(1M)        Assign a process to project: newtask(1)        project control APIs: libproject(3LIB)    Existing interfaces dealing MCB:        command line interface: pbind(1M)        system call interface: processor_affinity(2)    Processor information:        psrinfo(1M) / pginfo(1M) / lgrpinfo(1M)


Secure multi-threaded live migration for kernel zones

As mentioned in the What's New document, Solaris 11.3 now supports live migration for kernel zones!  Let's try it out. As mentioned in the fine manual, live migration requires the use of zones on shared storage (ZOSS) and a few other things. In Solaris 11.2, we could use logical units (i.e. fibre channel) or iscsi.  Always living on the edge, I decide to try out the new ZOSS NFS feature.  Since the previous post did such a great job of explaining how to set it up, I won't go into the details.  Here's what my zone configuration looks like: zonecfg:mig1> infozonename: mig1brand: solaris-kz...anet 0: ...device 0:match not specifiedstorage.template: nfs://zoss:zoss@kzx-05/zones/zoss/%{zonename}.disk%{id}storage: nfs://zoss:zoss@kzx-05/zones/zoss/mig1.disk0id: 0bootpri: 0virtual-cpu:ncpus: 4capped-memory:physical: 4Gkeysource:raw redacted And the zone is running. root@vzl-216:~# zoneadm -z mig1 list -sNAME             STATUS           AUXILIARY STATE                               mig1             running                                         In order for live migration to work, the kz-migr and rad:remote services need to be online.  They are disabled by default. # svcadm enable -s svc:/system/rad:remote svc:/network/kz-migr:stream# svcs svc:/system/rad:remote svc:/network/kz-migr:streamSTATE          STIME    FMRIonline          6:40:12 svc:/network/kz-migr:streamonline          6:40:12 svc:/system/rad:remote While these services are only needed on the remote end, I enable them on both sides because there's a pretty good chance that I will migrate kernel zones in both directions.  Now we are ready to perform the migration.  I'm migrating mig1 from vzl-216 to vzl-212.  Both vzl-216 and vzl-212 are logical domains on T5's. root@vzl-216:~# zoneadm -z mig1 migrate vzl-212Password: zoneadm: zone 'mig1': Importing zone configuration.zoneadm: zone 'mig1': Attaching zone.zoneadm: zone 'mig1': Booting zone in 'migrating-in' mode.zoneadm: zone 'mig1': Checking migration compatibility.zoneadm: zone 'mig1': Starting migration.zoneadm: zone 'mig1': Suspending zone on source host.zoneadm: zone 'mig1': Waiting for migration to complete.zoneadm: zone 'mig1': Migration successful.zoneadm: zone 'mig1': Halting and detaching zone on source host. Afterwards, we see that the zone is now configured on vzl-216 and running on vzl-212. root@vzl-216:~# zoneadm -z mig1 list -sNAME             STATUS           AUXILIARY STATE                               mig1             configured                                    root@vzl-212:~# zoneadm -z mig1 list -sNAME             STATUS           AUXILIARY STATE                               mig1             running                 Ok, cool.  But what really happened?  During the migration, I was also running tcptop, one of our demo dtrace scripts.  Unfortunately, it doesn't print the pretty colors: I added those so we can see what's going on. root@vzl-216:~# dtrace -s /usr/demo/dtrace/tcptop.dSampling... 
Please wait....2015 Jul  9 06:50:30,  load: 0.10,  TCPin:      0 Kb,  TCPout:      0 Kb  ZONE    PID LADDR           LPORT RADDR           RPORT      SIZE     0    613 10.134.18.216      22 10.134.18.202   48168       112     0   2640 10.134.18.216   60773 10.134.18.212   12302       137     0    613 10.134.18.216      22 10.134.18.202   60194       3362015 Jul  9 06:50:35,  load: 0.10,  TCPin:      0 Kb,  TCPout: 832420 Kb  ZONE    PID LADDR           LPORT RADDR           RPORT      SIZE     0    613 10.134.18.216      22 10.134.18.202   48168       208     0   2640 10.134.18.216   60773 10.134.18.212   12302       246     0    613 10.134.18.216      22 10.134.18.202   60194       480     0   2640 10.134.18.216   45661 10.134.18.212    8102      8253     0   2640 10.134.18.216   41441 10.134.18.212    8102 418467721     0   2640 10.134.18.216   59051 10.134.18.212    8102 459765481...2015 Jul  9 06:50:50,  load: 0.41,  TCPin:      1 Kb,  TCPout: 758608 Kb  ZONE    PID LADDR           LPORT RADDR           RPORT      SIZE     0   2640 10.134.18.216   60773 10.134.18.212   12302       388     0    613 10.134.18.216      22 10.134.18.202   60194       544     0    613 10.134.18.216      22 10.134.18.202   48168       592     0   2640 10.134.18.216   45661 10.134.18.212    8102    119032     0   2640 10.134.18.216   59051 10.134.18.212    8102 151883984     0   2640 10.134.18.216   41441 10.134.18.212    8102 6204496802015 Jul  9 06:50:55,  load: 0.48,  TCPin:      0 Kb,  TCPout:      0 Kb  ZONE    PID LADDR           LPORT RADDR           RPORT      SIZE     0    613 10.134.18.216      22 10.134.18.202   60194       736^C In the first sample, we see that vzl-216 (10.134.18.216) has established a RAD connection to vzl-212.  We know it is RAD because it is over port 12302.  RAD is used to connect the relevant zone migration processes on the two machines.  One connection between the zone migration processes is used for orchestrating various aspects of the migration.  There are two others that are used for synchronizing the memory between the machines.  In each of the samples, there is also some traffic from each of a couple ssh sessions I have between vzl-216 and another machine. As the amount of kernel zone memory increases, the number of connections will also increase.  Currently that scaling factor is one connection per 2 GB of kernel zone memory, with an upper limit based on the number of CPUs in the machine.  The scaling is limited by the number of CPUs because each connection corresponds to a sending and a receiving thread. Those threads are responsible for encrypting and decrypting the traffic.  The multiple connections can work nicely with IPMP's outbound load sharing and/or link aggregations to spread the load across multiple physical network links. The algorithm for selecting the number of connections may change from time to time, so don't be surprised if your observations don't match what is shown above. All of the communication between the two machines is encrypted.  The RAD connection (in this case) is encrypted with TLS, as described in rad(1M).  This RAD connection supports a series of calls that are used to negotiate various things, including encryption parameters for connections to kz-migr (port 8102).  You have control over the encryption algorithm used with the -c <cipher> option to zoneadm migrate.  
You can see the list of available ciphers with: root@vzl-216:~# zoneadm -z mig1 migrate -c list vzl-216Password: source ciphers: aes-128-ccm aes-128-gcm nonedestination ciphers: aes-128-ccm aes-128-gcm none If for some reason you don't want to use encryption, you can use migrate -c none.  There's not much reason to do that, though.  The default encryption, aes-128-ccm, makes use of hardware crypto instructions found in all of the SPARC and x86 processors that are supported with kernel zones.  In tests, I regularly saturated a 10 gigabit link while migrating a single kernel zone. One final note.... If you don't like typing the root password every time you migrate, you can also set up key-based authentication between the two machines.  In that case, you will use a command like: # zoneadm -z <zone> migrate ssh://<remotehost> Happy secure live migrating! 
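As an aside to the connection-scaling discussion above, here is a rough back-of-the-envelope sketch of the stated rule of thumb (about one memory-sync connection per 2 GB of kernel zone memory, capped by the host's CPU count). This is only an illustration of the heuristic as described, not the actual algorithm, which may change from release to release.

# illustrative shell sketch, not the real implementation
zone_mem_gb=4                       # the kernel zone's memory size in GB
ncpu=$(psrinfo | wc -l)             # CPUs on the source host
conns=$(( (zone_mem_gb + 1) / 2 ))  # roughly one connection per 2 GB
[ $conns -gt $ncpu ] && conns=$ncpu # capped by the CPU count
[ $conns -lt 1 ] && conns=1
echo "expect roughly $conns memory-sync connection(s) to port 8102"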


Kernel zone suspend now goes zoom!

Solaris 11.2 had the rather nice feature that you can have kernel zones automatically suspend and resume across global zone reboots.  We've made some improvements in this area in Solaris 11.3 to help in the cases where more CPU cycles could make suspend and resume go faster. As a recap, automatic suspend/resume of kernel zones across global zone reboots can be accomplished by having a suspend resource, setting autoboot=true and autoshutdown=suspend. # zonecfg -z kz1zonecfg:kz1> set autoboot=truezonecfg:kz1> set autoshutdown=suspendzonecfg:kz1> exitzonecfg:kz1:suspend> infosuspend:path.template: /export/%{zonename}.suspendpath: /export/kz1.suspendstorage not specified When a graceful reboot is performed (that is, shutdown -r or init 6) svc:/system/zones:default will suspend the zone as it shuts down and resume it as the system boots.  Obviously, reading from memory and writing to disk would have the inclination to saturate the disk bandwidth.  To create a more balanced system, the suspend image is compressed.  While this greatly slows down the write rate, several kernel zones that were concurrently suspending would still saturate available bandwidth in typical configurations.  More balanced and faster - good, right? Well, this more balanced system came at a cost.  When suspending one zone the performance was not so great.  For example, a basic kernel zone with 2 GB of RAM on a T5 ldom shows: # tail /var/log/zones/kz1.messages...2015-07-08 12:33:15 notice: NOTICE: Zone suspending2015-07-08 12:33:39 notice: NOTICE: Zone haltedroot@vzl-212:~# ls -lh /export/kz1.suspend-rw-------   1 root     root        289M Jul  8 12:33 /export/kz1.suspend# bc -l289 / 2412.04166666666666666666 Yikes - 12 MB/s to disk.  During this time, I used prstat -mLc -n 5 1 and iostat -xzn and could see that the compression thread in zoneadmd was using 100% of a CPU and the disk had idle times then spurts of being busy as zfs flushed out each transaction group.  Note that this rate of 12 MB/s is artificially low because some other things are going on before and after writing the suspend file that may take up to a couple of seconds. I then updated my system to the Solaris 11.3 beta release and tried again.  This time things look better. # zoneadm -z kz1 suspend# tail /var/log/zones/kz1.messages...2015-07-08 12:59:49 info: Processing command suspend flags 0x0 from ruid/euid/suid 0/0/0 pid 31412015-07-08 12:59:49 notice: NOTICE: Zone suspending2015-07-08 12:59:58 info: Processing command halt flags 0x0 from ruid/euid/suid 0/0/0 pid 02015-07-08 12:59:58 notice: NOTICE: Zone halted# ls -lh /export/kz1.suspend -rw-------   1 root     root        290M Jul  8 12:59 /export/kz1.suspend# echo 290 / 9 | bc -l32.22222222222222222222 That's better, but not great.  Remember what I said about the rate being artificially low above?  While writing the multi-threaded suspend/resume support, I also created some super secret debug code that gives more visibility into the rate.  That shows: Suspend raw: 1043 MB in 5.9 sec 177.5 MB/sSuspend compressed: 289 MB in 5.9 sec 49.1 MB/sSuspend raw-fast-fsync: 1043 MB in 3.5 sec 299.1 MB/sSuspend compressed-fast-fsync: 289 MB in 3.5 sec 82.8 MB/s What this is telling me is that my kernel zone with 2 GB of RAM actually had 1043 MB that actually needed to be suspended - the remaining was blocks of zeroes.  The total suspend time was 5.9 seconds, giving a read from memory rate of 177.5 MB/s and write to disk rate of 49.1 MB/s.  
The -fsync lines are saying that if suspend didn't fsync(3C) the suspend file before returning, it would have completed in 3.5 seconds, giving a suspend rate of 82.8 MB/s.  That's looking better. In another experiment, we aim to make the storage not be the limiting factor. This time, let's do 16 GB of RAM and write the suspend image to /tmp. # zonecfg -z kz1 infozonename: kz1brand: solaris-kzautoboot: trueautoshutdown: suspend...virtual-cpu:ncpus: 12capped-memory:physical: 16Gsuspend:path: /tmp/kz1.suspendstorage not specified To ensure that most of the RAM wasn't just blocks of zeroes (and as such wouldn't be in the suspend file), I created a tar file of /usr in kz1's /tmp and made copies of it until the kernel zone's RAM was rather full. This time around, we are seeing that we are able to write the 15 GB of active memory in 52.5 seconds.  Notice that this is roughly 15x the amount of memory in only double the time from our Solaris 11.2 baseline. Suspend raw: 15007 MB in 52.5 sec 286.1 MB/sSuspend compressed: 5416 MB in 52.5 sec 103.3 MB/s While the focus of this entry has been multi-threaded compression during suspend, it's also worth noting that: The suspend image is also encrypted. If someone gets a copy of the suspend image, it doesn't mean that they can read the guest memory.  Oh, and the encryption is multi-threaded as well. Decryption is also multi-threaded. And so is uncompression.  The parallel compression and uncompression code is freely available, even. :) The performance numbers here should be taken with a grain of salt.  Many factors influence the actual rate you will see.  In particular: Different CPUs have very different performance characteristics. If the zone has a dedicated-cpu resource, only the CPU's that are dedicated to the zone will be used for compression and encryption. More CPUs tend to go faster, but only to a certain point. Various types of storage will perform vastly differently. When many zones are suspending or resuming at the same time, they will compete for resources. And one last thing... for those of you that are too impatient to wait until Solaris 11.3 to try this out, it is actually in Solaris 11.2 SRU 8 and later.
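One loose end: the zonecfg transcript near the top of this post jumps straight into the suspend resource scope. For reference, a minimal sketch of the full setup might look like the following; the path template matches the one shown above, and you may need select suspend instead of add suspend if your zone's template already defines the resource.

# zonecfg -z kz1
zonecfg:kz1> set autoboot=true
zonecfg:kz1> set autoshutdown=suspend
zonecfg:kz1> add suspend
zonecfg:kz1:suspend> set path=/export/%{zonename}.suspend
zonecfg:kz1:suspend> end
zonecfg:kz1> exit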


Shared Storage on NFS for Kernel Zones

In Solaris 11.2, zones could be installed on shared storage (ZOSS) using iSCSI devices.  With Solaris 11.3 Beta, the shared storage for kernel zones can also be placed on NFS files.

To set up an NFS SURI (storage URI), you'll need to identify the NFS host, share, and path where the file will be placed, and the user and group allowed to access the file.  The file does not need to exist, but the parent directory of the file must exist.  The user and group are specified so a user can control access to their zone storage via NFS. Then in the zone configuration, you can set up a device (including a boot device) using an NFS SURI that looks like:

    nfs://user:group@host/NFS_share/path_to_file

If the file does not yet exist, you'll need to specify a size. Here's my setup of a 16g file for the zone root on an NFS share "/test" on system "sys1", owned by user "user1". My NFS server has this mode/owner for the directory /test/z1kz:

# ls -ld /test/z1kz
drwx------   2 user1  staff          4 Jun 12 12:36 /test/z1kz

In zonecfg for the kernel zone "z1kz", select device 0 (the boot device) and set storage and create-size:

zonecfg:z1kz> select device 0
zonecfg:z1kz:device> set storage=nfs://user1:staff@sys1/test/z1kz/z1kz_root
zonecfg:z1kz:device> set create-size=16g
zonecfg:z1kz:device> end
zonecfg:z1kz> info device
device 0:
        match not specified
        storage: nfs://user1:staff@sys1/test/z1kz_root
        create-size: 16g
        id: 0
        bootpri: 0
zonecfg:z1kz> commit

To add another device to this kernel zone, do:

zonecfg:z1kz> add device
zonecfg:z1kz:device> set storage=nfs://user1:staff@sys1/test/z1kz/z1kz_disk1
zonecfg:z1kz:device> set create-size=8g
zonecfg:z1kz:device> end
zonecfg:z1kz> commit

When installing the kernel zone, use the "-x storage-create-missing" option to create the NFS files owned by user1:staff.

# zoneadm -z z1kz install -x storage-create-missing
<output deleted>
#

On my NFS server:

# ls -l /test/z1kz
total 407628
-rw-------   1 user1  staff    8589934592 Jun 12 12:36 z1kz_disk1
-rw-------   1 user1  staff    17179869184 Jun 12 12:43 z1kz_root

When the zone is uninstalled, the "-x force-storage-destroy-all" option will be needed to destroy the NFS files z1kz_root and z1kz_disk1.  If that option isn't used, the NFS files will still exist on the NFS server after the zone uninstall.
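For completeness, here is a hedged sketch of the server-side preparation implied above - the parent directory has to exist and be owned by the user named in the SURI before the install. The exact share options are an assumption; restrict access to suit your environment.

# mkdir -p /test/z1kz
# chown user1:staff /test/z1kz
# chmod 700 /test/z1kz
# share -F nfs /test/z1kz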


One image for native zones, kernel zones, ldoms, metal, ...

In my previous post, I described how to convert a global zone to a non-global zone using a unified archive.  Since then, I've fielded a few questions about whether this same approach can be used to create a master image that is used to install Solaris regardless of virtualization type (including no virtualization).  The answer is: of course!  That was one of the key goals of the project that invented unified archives.

In my earlier example, I was focused on preserving the identity and other aspects of the global zone and knew I had only one place that I planned to deploy it.  Hence, I chose to skip media creation (--exclude-media) and used a recovery archive (-r).  To generate a unified archive of a global zone that is ideal for use as an image for installing to another global zone or native zone, just use a simpler command line.

root@global# archiveadm create /path/to/golden-image.uar

Notice that by using fewer options we get something that is more usable. What's different about this image compared to the one created in the previous post?

This archive has an embedded AI ISO that will be used if you install a kernel zone from it.  That is, zoneadm -z kzname install -a /path/to/golden-image.uar will boot from that embedded AI image and perform an automated install from that archive.

This archive only contains the active boot environment; other ZFS snapshots are not archived.

This archive has been stripped of its identity.  When installing, you either need to provide a sysconfig profile or interactively configure the system or zone on the console (or zone console) on the first post-installation boot.
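Since the example above only shows the kernel zone install path, here is a hedged sketch of deploying the same archive to a native (solaris-brand) zone; the zone name and paths are illustrative, and a fuller walkthrough appears in the g2ng post below.

root@global:~# zonecfg -z newzone create -a /path/to/golden-image.uar
root@global:~# zoneadm -z newzone install -a /path/to/golden-image.uar
root@global:~# zoneadm -z newzone boot
root@global:~# zlogin -C newzone    (complete sysconfig, since the image has no identity)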


global to non-global conversion with multiple zpools

Suppose you have a global zone with multiple zpools that you would like to convert into a native zone.  You can do that, thanks to unified archives (introduced in Solaris 11.2) and dataset aliasing (introduced in Solaris 11.0).  The source system looks like this: root@buzz:~# zoneadm list -cv  ID NAME             STATUS      PATH                         BRAND      IP   0 global           running     /                            solaris    sharedroot@buzz:~# zpool listNAME    SIZE  ALLOC   FREE  CAP  DEDUP  HEALTH  ALTROOTrpool  15.9G  4.38G  11.5G  27%  1.00x  ONLINE  -tank   1008M    93K  1008M   0%  1.00x  ONLINE  -root@buzz:~# df -h /tankFilesystem             Size   Used  Available Capacity  Mounted ontank                   976M    31K       976M     1%    /tankroot@buzz:~# cat /tank/READMEthis is tank Since we are converting a system rather than cloning it, we want to use a recovery archive and use the -r option.  Also, since the target is a native zone, there's no need for the unified archive to include media. root@buzz:~# archiveadm create --exclude-media -r /net/kzx-02/export/uar/p2v.uarInitializing Unified Archive creation resources...Unified Archive initialized: /net/kzx-02/export/uar/p2v.uarLogging to: /system/volatile/archive_log.1014Executing dataset discovery...Dataset discovery completePreparing archive system image...Beginning archive stream creation...Archive stream creation completeBeginning final archive assembly...Archive creation complete Now we will go to the global zone that will have the zone installed.  First, we must configure the zone.  The archive contains a zone configuration that is almost correct, but needs a little help because archiveadm(1M) doesn't know the particulars of where you will deploy it. Most examples that show configuration of a zone from an archive show the non-interactive mode.  Here we use the interactive mode. root@vzl-212:~# zonecfg -z p2vUse 'create' to begin configuring a new zone.zonecfg:p2v> create -a /net/kzx-02/export/uar/p2v.uar After the create command completes (in a fraction of a second) we can see the configuration that was embedded in the archive.  I've trimmed out a bunch of uninteresting stuff from the anet interface. zonecfg:p2v> infozonename: p2vzonepath.template: /system/zones/%{zonename}zonepath: /system/zones/p2vbrand: solarisautoboot: falseautoshutdown: shutdownbootargs:file-mac-profile:pool:limitpriv:scheduling-class:ip-type: exclusivehostid:tenant:fs-allowed:[max-lwps: 40000][max-processes: 20000]anet:        linkname: net0        lower-link: auto    [snip]attr:        name: zonep2vchk-num-cpus        type: string        value: "original system had 4 cpus: consider capped-cpu (ncpus=4.0) or dedicated-cpu (ncpus=4)"attr:        name: zonep2vchk-memory        type: string        value: "original system had 2048 MB RAM and 2047 MB swap: consider capped-memory (physical=2048M swap=4095M)"attr:        name: zonep2vchk-net-net0        type: string        value: "interface net0 has lower-link set to 'auto'.  Consider changing to match the name of a global zone link."dataset:        name: __change_me__/tank        alias: tankrctl:        name: zone.max-processes        value: (priv=privileged,limit=20000,action=deny)rctl:        name: zone.max-lwps        value: (priv=privileged,limit=40000,action=deny) In this case, I want to be sure that the zone's network uses a particular global zone interface, so I need to muck with that a bit. 
zonecfg:p2v> select anet linkname=net0zonecfg:p2v:anet> set lower-link=stub0zonecfg:p2v:anet> end The zpool list output in the beginning of this post showed that the system had two ZFS pools: rpool and tank.  We need to tweak the configuration to point the tank virtual ZFS pool to the right ZFS file system.  The name in the dataset resource refers to the location in the global zone.  This particular system has a zpool named export - a more basic Solaris installation would probably need to use rpool/export/....  The alias in the dataset resource needs to match the name of the secondary ZFS pool in the archive. zonecfg:p2v> select dataset alias=tankzonecfg:p2v:dataset> set name=export/tank/%{zonename}zonecfg:p2v:dataset> infodataset:        name.template: export/tank/%{zonename}        name: export/tank/p2v        alias: tankzonecfg:p2v:dataset> endzonecfg:p2v> exit I did something tricky above - I used a template property to make it easier to clone this zone configuration and have the dataset name point at a different dataset. Let's try an installation.  NOTE: Before you get around to booting the new zone, be sure the old system is offline else you will have IP address conflicts. root@vzl-212:~# zoneadm -z p2v install -a /net/kzx-02/export/uar/p2v.uarcould not verify zfs dataset export/tank/p2v: filesystem does not existzoneadm: zone p2v failed to verify Oops.  I forgot to create the dataset.  Let's do that.  I use -o zoned=on to prevent the dataset from being mounted in the global zone.  If you forget that, it's no biggy - the system will fix it for you soon enough. root@vzl-212:~# zfs create -p -o zoned=on export/tank/p2vroot@vzl-212:~# zoneadm -z p2v install -a /net/kzx-02/export/uar/p2v.uarThe following ZFS file system(s) have been created:    rpool/VARSHARE/zones/p2vProgress being logged to /var/log/zones/zoneadm.20150220T060031Z.p2v.install    Installing: This may take several minutes... Install Log: /system/volatile/install.5892/install_log AI Manifest: /tmp/manifest.p2v.YmaOEl.xml    Zonename: p2vInstallation: Starting ...        Commencing transfer of stream: 0f048163-2943-cde5-cb27-d46914ec6ed3-0.zfs to rpool/VARSHARE/zones/p2v/rpool        Commencing transfer of stream: 0f048163-2943-cde5-cb27-d46914ec6ed3-1.zfs to export/tank/p2v        Completed transfer of stream: '0f048163-2943-cde5-cb27-d46914ec6ed3-1.zfs' from file:///net/kzx-02/export/uar/p2v.uar        Completed transfer of stream: '0f048163-2943-cde5-cb27-d46914ec6ed3-0.zfs' from file:///net/kzx-02/export/uar/p2v.uar        Archive transfer completed        Changing target pkg variant. This operation may take a whileInstallation: Succeeded      Zone BE root dataset: rpool/VARSHARE/zones/p2v/rpool/ROOT/solaris-recovery                     Cache: Using /var/pkg/publisher.Updating image formatImage format already current.  Updating non-global zone: Linking to image /.Processing linked: 1/1 done  Updating non-global zone: Syncing packages.No updates necessary for this image. (zone:p2v)  Updating non-global zone: Zone updated.                    Result: Attach Succeeded.        Done: Installation completed in 165.355 seconds.  Next Steps: Boot the zone, then log into the zone console (zlogin -C)              to complete the configuration process.Log saved in non-global zone as /system/zones/p2v/root/var/log/zones/zoneadm.20150220T060031Z.p2v.installroot@vzl-212:~# zoneadm -z p2v boot After booting we see that everything in the zone is in order. 
root@vzl-212:~# zlogin p2v[Connected to zone 'p2v' pts/3]Oracle Corporation      SunOS 5.11      11.2    September 2014root@buzz:~# svcs -xroot@buzz:~# zpool listNAME    SIZE  ALLOC   FREE  CAP  DEDUP  HEALTH  ALTROOTrpool  99.8G  66.3G  33.5G  66%  1.00x  ONLINE  -tank    199G  49.6G   149G  24%  1.00x  ONLINE  -root@buzz:~# df -h /tankFilesystem             Size   Used  Available Capacity  Mounted ontank                   103G    31K       103G     1%    /tankroot@buzz:~# cat /tank/READMEthis is tankroot@buzz:~# zonenamep2vroot@buzz:~# Happy p2v-ing!  Or rather, g2ng-ing.


fronting isolated zones

This is a continuation of a series of posts.  While this one may be interesting all on its own, you may want to start from the top to get the context. In this post, we'll create teeter - the load balancer.  This zone will be a native (solaris brand) zone.  The intent of this arrangement is to make it so that paying customers get served by the zone named premium and the freeloaders have to scrape by with free.  Since that logic is clearly highly dependent on each webapp, I'll take the shortcut of having a more simplistic load balancer. Once again, we'll configure the zone's networking from the global zone.  This time around both networks get a static configuration - one attached to the red (192.168.1.0/24) network and the other attached to the global zone's first network interface. root@global:~# zonecfg -z teeterUse 'create' to begin configuring a new zone.zonecfg:teeter> createzonecfg:teeter> set zonepath=/zones/%{zonename}zonecfg:teeter> select anet linkname=net0zonecfg:teeter:anet> set lower-link=balstub0zonecfg:teeter:anet> set allowed-address=192.168.1.1/24zonecfg:teeter:anet> set configure-allowed-address=truezonecfg:teeter:anet> endzonecfg:teeter> add anetzonecfg:teeter:anet> set lower-link=net0zonecfg:teeter:anet> set allowed-address=10.134.17.196/24zonecfg:teeter:anet> set defrouter=10.134.17.1zonecfg:teeter:anet> set configure-allowed-address=truezonecfg:teeter:anet> endzonecfg:teeter> exitroot@global:~# zoneadm -z teeter installThe following ZFS file system(s) have been created:    zones/teeterProgress being logged to /var/log/zones/zoneadm.20150114T222949Z.teeter.install       Image: Preparing at /zones/teeter/root....Log saved in non-global zone as /zones/teeter/root/var/log/zones/zoneadm.20150114T222949Z.teeter.install root@global:~# zoneadm -z teeter bootroot@global:~# zlogin -C teeter   sysconfig, again.  Gee, I really should have created a sysconfig.xml...  In a solaris zone, there are no dependencies that bring in the apache web server, so that needs to be installed. root@vzl-212:~# zlogin teeter[Connected to zone 'teeter' pts/3]Oracle Corporation    SunOS 5.11    11.2    December 2014root@teeter:~# pkg install apache-22... Once the web server is installed, we'll configure a simple load balancer using mod_proxy_balancer. root@teeter:~# cd /etc/apache2/2.2/conf.d/root@teeter:/etc/apache2/2.2/conf.d# cat > mod_proxy_balancer.conf <<EOF<Proxy balancer://mycluster>BalancerMember http://192.168.1.3:80BalancerMember http://192.168.1.4:80</Proxy>ProxyPass /test balancer://mycluster EOFroot@teeter:/etc/apache2/2.2/conf.d# svcadm enable apache22 To see if this is working, we will use a symbolic link on the NFS server to point to a unique file on each of the web servers.  Unless you are trying to paste your output into Oracle's blogging software, you won't need to define $download as I did. root@global:~# ln -s /tmp/hostname /export/web/hostnameroot@global:~# zlogin free 'hostname > /tmp/hostname'root@global:~# zlogin premium 'hostname > /tmp/hostname'root@global:~# download=cu; download=${download}rlroot@global:~# $download -s http://10.134.17.196/test/hostnamepremiumroot@global:~# $download http://10.134.17.196/test/hostnamefreeroot@global:~# for i in {1..100}; do \    $download -s http://10.134.17.196/test/hostname; done| sort | uniq -c  50 free  50 premium This concludes this series.  Surely there are things that I've glossed over and many more interesting things I could have done.  Please leave comments with any questions and I'll try to fill in the details.


stamping out web servers

This is a continuation of a series of posts.  While this one may be interesting all on its own, you may want to start from the top to get the context. The diagram above shows one global zone with a few zones in it.  That's not very exciting in a world where we need to rapidly provision new instances that are preconfigured and as hack-proof as we can make them.  This post will show how to create a unified archive that includes the kernel zone configuration and content that makes for a hard-to-compromise web server.  I'd like to say impossible, but history has shown us that software has bugs that affects everyone across the industry. We'll start of by configuring and installing a kernel zone called web.  It will have two automatic networks, each attached to the appropriate etherstubs.  Notice the use of template properties - using %{zonename} and %{id} make it so that we don't have to futz with so much of the configuration when we configure the next zone based on this one. root@global:~# zonecfg -z webUse 'create' to begin configuring a new zone.zonecfg:web> create -t SYSsolaris-kzzonecfg:web> select device id=0zonecfg:web:device> set storage=dev:zvol/dsk/zones/%{zonename}/disk%{id}zonecfg:web:device> endzonecfg:web> select anet id=0zonecfg:web:anet> set lower-link=balstub0zonecfg:web:anet> set allowed-address=192.168.1.2/24zonecfg:web:anet> set configure-allowed-address=truezonecfg:web:anet> endzonecfg:web> add anetzonecfg:web:anet> set lower-link=internalstub0zonecfg:web:anet> set allowed-dhcp-cids=%{zonename}zonecfg:web:anet> endzonecfg:web> infozonename: webbrand: solaris-kzautoboot: falseautoshutdown: shutdownbootargs:pool:scheduling-class:hostid: 0xdf87388tenant:anet:        lower-link: balstub0        allowed-address: 192.168.1.2/24        configure-allowed-address: true        defrouter not specified        allowed-dhcp-cids not specified        link-protection: "mac-nospoof, ip-nospoof" ...        id: 0anet:        lower-link: internalstub0        allowed-address not specified        configure-allowed-address: true        defrouter not specified        allowed-dhcp-cids.template: %{zonename}        allowed-dhcp-cids: web        link-protection: mac-nospoof ...        id: 1device:        match not specified        storage.template: dev:zvol/dsk/zones/%{zonename}/disk%{id}        storage: dev:zvol/dsk/zones/web/disk0        id: 0        bootpri: 0capped-memory:        physical: 2Gzonecfg:web> exitroot@global:~# zoneadm -z web installProgress being logged to /var/log/zones/zoneadm.20150114T193808Z.web.installpkg cache: Using /var/pkg/publisher. Install Log: /system/volatile/install.4391/install_log AI Manifest: /tmp/zoneadm3808.vTayai/devel-ai-manifest.xml  SC Profile: /usr/share/auto_install/sc_profiles/enable_sci.xmlInstallation: Starting ...        Creating IPS image        Installing packages from:            solaris                origin:  file:///export/repo/11.2/repo/        The following licenses have been accepted and not displayed.        
Please review the licenses for the following packages post-install:          consolidation/osnet/osnet-incorporation        Package licenses may be viewed using the command:          pkg info --license <pkg_fmri>DOWNLOAD                                PKGS         FILES    XFER (MB)   SPEEDCompleted                            451/451   63686/63686  579.9/579.9    0B/sPHASE                                          ITEMSInstalling new actions                   86968/86968Updating package state database                 DoneUpdating package cache                           0/0Updating image state                            DoneCreating fast lookup database                   DoneInstallation: Succeeded        Done: Installation completed in 431.445 seconds.root@global:~# zoneadm -z web bootroot@global:~# zlogin -C web        Perform sysconfig.  Allow networking to be configured automatically.        ~~. (one ~ for ssh, one for zlogin -C)root@global:~# zlogin web At this point, networking inside the zone should look like this: root@web:~# ipadmNAME CLASS/TYPE STATE UNDER ADDRlo0 loopback ok -- -- lo0/v4 static ok -- 127.0.0.1/8 lo0/v6 static ok -- ::1/128net0 ip ok -- -- net0/v4 inherited ok -- 192.168.1.2/24net1 ip ok -- -- net1/v4 dhcp ok -- 192.168.0.2/24 net1/v6 addrconf ok -- <IPv6addr> Configure the NFS mounts for web content (/web) and IPS repos (/repo). root@web:~# cat >> /etc/vfstab192.168.0.1:/export/repo/11.2/repo - /repo      nfs     -       yes     -192.168.0.1:/export/web -       /web    nfs     -       yes     -^Droot@web:~# svcadm enable -r nfs/client Now, update the pkg image configuration so that it uses the repository from the correct path.  root@web:~# pkg set-publisher -O file:///repo/ solaris Update the apache configuration so that it looks to /web for the document root.  root@web:~# vi /etc/apache2/2.2/httpd.conf This involves (at a minimum) changing DocumentRoot to "/web" and changing the <Directory "/var/apache/2.2/htdocs"> line to <Directory "/web">.  Your needs will be different and probably more complicated.  This is not an Apache tutorial and I'm not qualified to give it.  After modifying the configuration file, start the web server. root@web:~# svcadm enable -r svc:/network/http:apache22 This is a good time to do any other configuration (users, other software, etc.) that you need.  If you did the changes above really quickly, you may also want to wait for first boot tasks like man-index to complete.  Allowing it to complete now will mean that it doesn't need to be redone for every instance of this zone that you create. Since this is a type of a zone that shouldn't need to have its configuration changed a whole lot, let's use the immutable global zone (IMGZ) feature to lock down the web zone.  Note that we use IMGZ inside a kernel zone because a kernel zone is another global zone. root@web:~# zonecfg -z global set file-mac-profile=fixed-configurationupdating /platform/sun4v/boot_archiveroot@web:~# init 6 Back in the global zone, we are ready to create a clone archive once the zone reboots. 
root@global:~# archiveadm create -z web /export/web.uarInitializing Unified Archive creation resources...Unified Archive initialized: /export/web.uarLogging to: /system/volatile/archive_log.6835Executing dataset discovery...Dataset discovery completeCreating install media for zone(s)...Media creation completePreparing archive system image...Beginning archive stream creation...Archive stream creation completeBeginning final archive assembly...Archive creation complete Now that the web clone unified archive has been created, it can be used on this machine or any other with a similar global zone configuration (etherstubs of same names, dhcp server, same nfs exports, etc.) to quickly create new web servers that fit the model described in the diagram at the top of this post.  To create the free kernel zone: root@global:~# zonecfg -z freeUse 'create' to begin configuring a new zone.zonecfg:free> create -a /export/web.uarzonecfg:free> select anet id=0zonecfg:free:anet> set allowed-address=192.168.1.3/24zonecfg:free:anet> endzonecfg:free> select capped-memoryzonecfg:free:capped-memory> set physical=4gzonecfg:free:capped-memory> endzonecfg:free> add virtual-cpuzonecfg:free:virtual-cpu> set ncpus=2zonecfg:free:virtual-cpu> endzonecfg:free> exit If I were doing this for a purpose other than this blog post, I would have also created a sysconfig profile and passed it to zoneadm install.  This would have made the first boot completely hands-off. root@global:~# zoneadm -z free install -a /export/web.uar...root@global:~# zoneadm -z free bootroot@global:~# zlogin -C free[Connected to zone 'free' console]   run sysconfig because I didn't do zoneadm install -c sc_profile.xmlSC profile successfully generated as:/etc/svc/profile/sysconfig/sysconfig-20150114-210014/sc_profile.xml... Once we log into free, we see that there's no more setup to do. root@free:~# df -h -F nfsFilesystem             Size   Used  Available Capacity  Mounted on192.168.0.1:/export/repo/11.2/repo                       194G    36G       158G    19%    /repo192.168.0.1:/export/web                       158G    31K       158G     1%    /webroot@free:~# ipadmNAME              CLASS/TYPE STATE        UNDER      ADDRlo0               loopback   ok           --         --   lo0/v4         static     ok           --         127.0.0.1/8   lo0/v6         static     ok           --         ::1/128net0              ip         ok           --         --   net0/v4        inherited  ok           --         192.168.1.3/24net1              ip         ok           --         --   net1/v4        dhcp       ok           --         192.168.0.3/24   net1/v6        addrconf   ok           --         fe80::8:20ff:fed0:5eb/10 Nearly identical steps can be taken with the deployment of premium.  The key difference there is that we are dedicating two cores (add dedicated-cpu; set cores=...) rather than allocating to virtual-cpus (add virtual-cpu; set ncpus=...).  That is, no one else can use any of the cpus on premium's  cores but free has to compete with the rest of the system for cpu time. 
root@global:~# psrinfo -tsocket: 0  core: 201457665    cpus: 0-7  core: 201654273    cpus: 8-15  core: 201850881    cpus: 16-23  core: 202047489    cpus: 24-31root@global:~# zonecfg -z premiumzonecfg:premium> create -a /export/web.uarzonecfg:premium> select anet id=0zonecfg:premium:anet> set allowed-address=192.168.1.4/24zonecfg:premium:anet> endzonecfg:premium> select capped-memoryzonecfg:premium:capped-memory> set physical=8gzonecfg:premium:capped-memory> endzonecfg:premium> add dedicated-cpuzonecfg:premium:dedicated-cpu> set cores=201850881,202047489zonecfg:premium:dedicated-cpu> endzonecfg:premium> exit The install and boot of premium will then be the same as that of free.  After both zones are up, we can see that psrinfo reports the number of cores for premium but not for free. root@global:~# zlogin free psrinfo -pvThe physical processor has 2 virtual processors (0-1)  SPARC-T5 (chipid 0, clock 3600 MHz)root@global:~# zlogin free prtconf | grep MemoryMemory size: 4096 Megabytesroot@global:~# zlogin premium psrinfo -pvThe physical processor has 2 cores and 16 virtual processors (0-15)  The core has 8 virtual processors (0-7)  The core has 8 virtual processors (8-15)    SPARC-T5 (chipid 0, clock 3600 MHz)root@global:~# zlogin premium prtconf | grep MemoryMemory size: 8192 Megabytes That's enough for this post.  Next time, we'll get teeter going.
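One loose end from the install step above: the post notes that creating a sysconfig profile up front would make the first boot completely hands-off. A hedged sketch of that variant (the profile path is illustrative) would be:

root@global:~# sysconfig create-profile -o /export/sc_profile.xml
root@global:~# zoneadm -z free install -a /export/web.uar -c /export/sc_profile.xml
root@global:~# zoneadm -z free boot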


in-the-box NFS and pkg repository

This is the third in a series of short blog entries.  If you are new to the series, I suggest you start from the top.

As shown in our system diagram, the kernel zones have no direct connection to the outside world.  This will make it quite hard for them to apply updates.  To get past that we will set up a pkg repository in the global zone and export it via NFS to the zones.  I won't belabor the topic of local IPS repositories, because our fine doc writers have already covered that.

As a quick summary, I first created a zfs file system for the repo.  On this system, export is a separate pool with its topmost dataset mounted at /export.  By default /export is the rpool/export dataset - you may need to adjust commands to match your system.

root@global:~# zfs create -p export/repo/11.2

I then followed the procedure in MOS Doc ID 1928542.1 for creating a Solaris 11.2 repository, including all of the SRUs.  That resulted in having a repo with the solaris publisher at /export/repo/11.2/repo. Since I have a local repo for the kernel zones to use, I figured the global zone may as well use it too.

root@global:~# pkg set-publisher -O file:///export/repo/11.2/repo/ solaris

To make this publisher accessible (read-only) to the zones on the 192.168.0.0/24 network, it needs to be NFS exported.

root@global:~# share -F nfs -o ro=@192.168.0.0/24 /export/repo/11.2/repo

Now I'll get ahead of myself a bit - I've not actually covered the installation of the free or premium zones yet.  Let's pretend we have a kernel zone called web and we want the repository to be accessible at /repo in web.

root@global:~# zlogin web
root@web:~# vi /etc/vfstab   (add an entry)
root@web:~# grep /repo /etc/vfstab
192.168.0.1:/export/repo/11.2/repo - /repo      nfs     -       yes     -
root@web:~# svcadm enable -r nfs/client

If svc:/network/nfs/client was already enabled, use mount /repo instead of svcadm enable.  Once /repo is mounted, update the solaris publisher.

root@web:~# pkg set-publisher -O file:///repo/ solaris

In this example, we also want to have some content shared from the global zone into each of the web zones.  To make that possible:

root@global:~# zfs create export/web
root@global:~# share -F nfs -o ro=@192.168.0.0/24 /export/web

Again, this is exported read-only to the zones.  Adjust for your own needs.

That's it for this post.  Next time we'll create a unified archive that can be used for quickly stamping out lots of web zones.

in-the-box networking

In my previous post, I described a scenario where a couple networks are needed to shuffle NFS and web traffic between a few zones.  In this post, I'll describe the configuration of the networking.  As a reminder, here's the configuration we are after.

The green (192.168.0.0/24) network is used by the two web server zones that need to connect to services in the global zone.  The red (192.168.1.0/24) network is used for communication between the load balancer and the web servers.  The basis for this simplistic in-the-box network is an etherstub.

The red network is a bit simpler than the green network, so we'll start with that.

root@global:~# dladm create-etherstub balstub0

That's it!  The (empty) network has been created by simply creating an etherstub.  As zones are configured to use balstub0 as the lower-link in their anet resources, they will attach to this network.

The green network is just a little bit more involved because there will be services (DHCP and NFS) in the global zone that will use this network.

root@global:~# dladm create-etherstub internalstub0
root@global:~# dladm create-vnic -l internalstub0 internal0
root@global:~# ipadm create-ip internal0
root@global:~# ipadm create-addr -T static -a 192.168.0.1/24 internal0
internal0/v4

That wasn't a lot harder.  What we did here was create an etherstub named internalstub0.  On top of it, we created a vnic called internal0, attached an IP interface onto it, then set a static IP address.

As was mentioned in the introductory post, we'll have DHCP manage the IP address allocation for zones that use internalstub0.  Setup of that is pretty straightforward too.

root@global:~# cat > /etc/inet/dhcpd4.conf
default-lease-time 86400;
log-facility local7;
subnet 192.168.0.0 netmask 255.255.255.0 {
    range 192.168.0.2 192.168.0.254;
    option broadcast-address 192.168.0.255;
}
^D
root@global:~# svcadm enable svc:/network/dhcp/server:ipv4

The real Solaris blogs junkie will recognize this as a simplified version of something from the Maine office.
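A quick way to confirm the in-the-box plumbing looks right (my addition; output omitted since it varies by system):

root@global:~# dladm show-etherstub
root@global:~# dladm show-vnic
root@global:~# ipadm show-addr internal0/v4

show-etherstub should list balstub0 and internalstub0, show-vnic should show internal0 riding on internalstub0, and show-addr should report the 192.168.0.1/24 address we just assigned.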

kernel zones vs. fs resources

Traditionally, zones have had a way to perform loopback mounts of global zone file systems using zonecfg fs resources with fstype=lofs.  Since kernel zones run a separate kernel, lofs is not really an option.  Other file systems can be safely presented to kernel zones by delegating the devices, so the omission of fs resources is not as bad as it may initially sound.

For those that really need something that works like a lofs fs resource, there's a way to simulate the functionality with NFS.  Consider the following system.  It has a native zone, teeter, that has load balancing software in it and two kernel zones, free and premium.

The idea behind this contrived example is that freeloaders get directed to the small kernel zone and the paying customers use the resources of the bigger zone.  Neither free nor premium is directly connected to the outside world.  They use NFS served from the global zone for web content and package repositories.  When it comes time to update the content seen by all of the web zones, the admin only needs to update it in one place.  It is quite acceptable for this one place to be on the NFS server - that is, in the global zone.

To simplify the configuration of each zone, all network configuration is performed in the global zone with zonecfg.  For example, this is the configuration of free.

root@global:~# zonecfg -z free export
create -b
set brand=solaris-kz
set autoboot=false
set autoshutdown=shutdown
set hostid=0x19df23f0
add anet
set lower-link=internalstub0
set configure-allowed-address=true
set link-protection=mac-nospoof
set mac-address=auto
set id=1
end
add anet
set lower-link=balstub0
set allowed-address=192.168.1.3/24
set configure-allowed-address=true
set link-protection=mac-nospoof
set mac-address=auto
set id=0
end
...

This example shows two ways to handle the network configuration from the global zone.

The green network is configured using internalstub0 as the lower link.  A DHCP server is configured in the global zone to dynamically assign addresses to all the zones that have a vnic on that network.

The red network uses balstub0 as the lower link.  Because the load balancer will need to be configured to use specific IP addresses for the free and premium zones, allocating these addresses from a dynamic address range seems prone to troubles.

In the next few blog entries, I'll cover the following topics:

- Configuration of the in-the-box networking.  This will include the red and green networks and DHCP.
- Configuration of in-the-box NFS for web content and pkg repositories.
- Use of a Unified Archive to make it easy to stamp out preconfigured locked-down web servers.
- Fronting web servers with a load balancer.

Oops, I left my kernel zone configuration behind!

Most people use boot environments to move in one direction.  A system starts with an initial installation and from time to time new boot environments are created - typically as a result of pkg update - and then the new BE is booted.  This post is of little interest to those people as no hackery is needed.  This post is about some mild hackery.

During development, I commonly test different scenarios across multiple boot environments.  Many times, those tests aren't related to the act of configuring or installing zones, so it's kinda handy to avoid the effort involved in zone configuration and installation.  A somewhat common order of operations is like the following:

# beadm create -e golden -a test1
# reboot

Once the system is running in the test1 BE, I install a kernel zone.

# zonecfg -z a178 create -t SYSsolaris-kz
# zoneadm -z a178 install

Time passes, and I do all kinds of stuff to the test1 boot environment and want to test other scenarios in a clean boot environment.  So then I create a new one from my golden BE and reboot into it.

# beadm create -e golden -a test2
# reboot

Since the test2 BE was created from the golden BE, it doesn't have the configuration for the kernel zone that I configured and installed.  Getting that zone over to the test2 BE is pretty easy.  My test1 BE is really known as s11fixes-2.

root@vzl-212:~# beadm mount s11fixes-2 /mnt
root@vzl-212:~# zonecfg -R /mnt -z a178 export | zonecfg -z a178 -f -
root@vzl-212:~# beadm unmount s11fixes-2
root@vzl-212:~# zoneadm -z a178 attach
root@vzl-212:~# zoneadm -z a178 boot

On the face of it, it would seem as though it would have been easier to just use zonecfg -z a178 create -t SYSsolaris-kz within the test2 BE to get the new configuration over.  That would almost work, but it would have left behind the encryption key required for access to host data and any suspend image.  See solaris-kz(5) for more info on host data.  I very commonly have more complex configurations that contain many storage URIs and non-default resource controls.  Retyping them would be rather tedious.
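A related trick worth knowing (my addition, not part of the workflow above): zonecfg can export a configuration to a file, which makes a handy backup that any boot environment can consume later.  The path below is just an example.

root@vzl-212:~# zonecfg -z a178 export -f /root/a178.cfg
root@vzl-212:~# zonecfg -z a178 -f /root/a178.cfg      (later, in a BE that lacks the config)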

No time like the future

Zones have forever allowed different time zones.  Kernel zones kick that up to 11 (or is that 11.2?) with the ability to have an entirely different time in the zone.  To be clear, this only works with kernel zones.  You can see from the output below that the zone in use has brand solaris-kz.

root@kzx-05:~# zoneadm -z junk list -v
  ID NAME             STATUS      PATH                         BRAND      IP
  12 junk             running     -                            solaris-kz excl

By default, the clocks between the global zone and a kernel zone are in sync.  We'll use console logging to show that...

root@kzx-05:~# zlogin -C junk
[Connected to zone 'junk' console]
vzl-178 console login: root
Password:
Last login: Fri Apr 18 13:58:01 on console
Oracle Corporation      SunOS 5.11      11.2    April 2014
root@vzl-178:~# date "+%Y-%m-%d %T"
2014-04-18 13:58:57
root@vzl-178:~# exit
logout
vzl-178 console login: ~.
[Connection to zone 'junk' console closed]
root@kzx-05:~# tail /var/log/zones/junk.console
2014-04-18 13:58:45 vzl-178 console login: root
2014-04-18 13:58:46 Password:
2014-04-18 13:58:48 Last login: Fri Apr 18 13:58:01 on console
2014-04-18 13:58:48 Oracle Corporation      SunOS 5.11      11.2    April 2014
2014-04-18 13:58:48 root@vzl-178:~# date "+%Y-%m-%d %T"
2014-04-18 13:58:57 2014-04-18 13:58:57
2014-04-18 13:58:57 root@vzl-178:~# exit
2014-04-18 13:59:04 logout
2014-04-18 13:59:04
2014-04-18 13:59:04 vzl-178 console login:
root@kzx-05:~#

Notice that the time stamp on the log matches what we see in the output of date.  Now let's pretend that it is next year.

root@kzx-05:~# zlogin junk
root@vzl-178:~# date 0101002015
Thursday, January  1, 2015 12:20:00 AM PST

And let's be sure that it still thinks it's 2015 in the zone:

root@kzx-05:~# date; zlogin junk date; date
Friday, April 18, 2014 02:05:53 PM PDT
Thursday, January  1, 2015 12:20:18 AM PST
Friday, April 18, 2014 02:05:54 PM PDT

And the time offset survives a reboot.

root@kzx-05:~# zoneadm -z junk reboot
root@kzx-05:~# date; zlogin junk date; date
Friday, April 18, 2014 02:09:18 PM PDT
Thursday, January  1, 2015 12:23:43 AM PST
Friday, April 18, 2014 02:09:18 PM PDT

So, what's happening under the covers?  When the date is set in the kernel zone, the offset between the kernel zone's clock and the global zone's clock is stored in the kernel zone's host data.  See solaris-kz(5) for a description of host data.  Whenever a kernel zone boots, the kernel zone's clock is initialized based on this offset.
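If you later want the kernel zone's clock back in sync with the global zone, one low-tech approach (my addition; expect to be off by the second or so that command execution takes, so use NTP inside the zone if you need real accuracy) is to feed the global zone's current time back into date in the zone:

root@kzx-05:~# zlogin junk date $(date +%m%d%H%M%y)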

Zones Console Logs

You know how there's that thing that you've been meaning to do for a long time but never quite get to it?  And then one day something just pushes you over the edge?  This is a short story of being pushed over the edge.

As I was working on kernel zones, I rarely used zlogin -C to get to the console.  And then I'd get a panic.  If the panic happened early enough in boot that the dump device wasn't enabled, I'd lose all traces of what went wrong.  That pushed me over the edge - time to implement console logging!

With Solaris 11.2, there's now a zone console log for all zone brands: you can find it at /var/log/zones/zonename.console.

root@kzx-05:~# zoneadm -z junk boot
root@kzx-05:~# tail /var/log/zones/junk.console
2014-04-18 12:59:08 syncing file systems... done
2014-04-18 12:59:11
2014-04-18 12:59:11 [NOTICE: Zone halted]
2014-04-18 13:00:56
2014-04-18 13:00:56 [NOTICE: Zone booting up]
2014-04-18 13:00:59 Boot device: disk0 File and args:
2014-04-18 13:00:59 reading module /platform/i86pc/amd64/boot_archive...done.
2014-04-18 13:00:59 reading kernel file /platform/i86pc/kernel/amd64/unix...done.
2014-04-18 13:01:00 SunOS Release 5.11 Version 11.2 64-bit
2014-04-18 13:01:00 Copyright (c) 1983, 2014, Oracle and/or its affiliates. All rights reserved.

The output above will likely generate a few questions.  Let me try to answer those.

Will this log passwords typed on the console?
Generally, passwords are entered in a way that they aren't echoed to the terminal.  The console log only contains the characters written from the zone to the terminal - that is, it logs the echoes.  Characters you type that are never printed are never logged.

Can just anyone read the console log file?
No.  You need to be root in the global zone to read the console log file.

How is log rotation handled?
Rules have been added to logadm.conf(4) to handle weekly log rotation.

I see a time stamp, but what time zone is that?
All time stamps are in the same time zone as svc:/system/zones:default.  That should be the same as is reported by:

root@kzx-05:~# svcprop -p timezone/localtime timezone
US/Pacific

Really, what does the time stamp mean?
It was the time that the first character of the line was written to the terminal.  This means that if a line contains a shell prompt, the time is the time that the prompt was printed, not the time that the person finished entering a command.

I see other files in /var/log/zones.  What are they?
zonename.messages contains various diagnostic information from zoneadmd.  If all goes well, you will never need to look at that.  zoneadm.* contains log files from attach, install, clone, and uninstall operations.  These files have existed since Solaris 11 first launched.
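One usage note of my own: since the console log is a plain text file, you can watch a zone's console during boot without actually attaching to it, for example:

root@kzx-05:~# tail -f /var/log/zones/junk.console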

A tour of a kernel zone

In my earlier post, I showed how to configure and install a kernel zone.  In this post, we'll take a look at this kernel zone.

The kernel zone was installed within an LDom on a T5-4.

root@vzl-212:~# prtdiag -v | head -2
System Configuration:  Oracle Corporation  sun4v SPARC T5-4
Memory size: 65536 Megabytes
root@vzl-212:~# psrinfo | wc -l
      32

The kernel zone was configured with:

root@vzl-212:~# zonecfg -z myfirstkz create -t SYSsolaris-kz

Let's take a look at the resulting configuration.

root@vzl-212:~# zonecfg -z myfirstkz info | cat -n
     1  zonename: myfirstkz
     2  brand: solaris-kz
     3  autoboot: false
     4  autoshutdown: shutdown
     5  bootargs:
     6  pool:
     7  scheduling-class:
     8  hostid: 0x2b2044c5
     9  tenant:
    10  anet:
    11      lower-link: auto
    12      allowed-address not specified
    13      configure-allowed-address: true
    14      defrouter not specified
    15      allowed-dhcp-cids not specified
    16      link-protection: mac-nospoof
    17      mac-address: auto
    18      mac-prefix not specified
    19      mac-slot not specified
    20      vlan-id not specified
    21      priority not specified
    22      rxrings not specified
    23      txrings not specified
    24      mtu not specified
    25      maxbw not specified
    26      rxfanout not specified
    27      vsi-typeid not specified
    28      vsi-vers not specified
    29      vsi-mgrid not specified
    30      etsbw-lcl not specified
    31      cos not specified
    32      evs not specified
    33      vport not specified
    34      id: 0
    35  device:
    36      match not specified
    37      storage: dev:/dev/zvol/dsk/rpool/VARSHARE/zones/myfirstkz/disk0
    38      id: 0
    39      bootpri: 0
    40  capped-memory:
    41      physical: 2G
    42  suspend:
    43      path: /system/zones/myfirstkz/suspend
    44      storage not specified
    45  keysource:
    46      raw redacted

There are a number of things to notice in this configuration.

- No zonepath.  Kernel zones install into real or virtual disks - quite like the way that logical domains install into real or virtual disks.  The virtual disk(s) that contain the root zfs pool are specified by one or more device resources that contain a bootpri property (line 39).  By default, a kernel zone's root disk is a 16 GB zfs volume in the global zone's root zfs pool.  There's more about this in the solaris-kz(5) man page.  It's never been a good idea to directly copy things into a zone's zonepath.  With kernel zones that just doesn't work.
- The device resource accepts storage URIs (line 37).  See suri(5).  Storage URIs were introduced in Solaris 11.1 in support of Zones on Shared Storage (rootzpool and zpool resources).  This comes in really handy when a kernel zone is installed on external storage and may be migrated between hosts from time to time.
- The device resource has an id property (line 38).  This means that this disk will be instance 0 of zvblk - which will translate into it being c1d0.  We'll see more of that in a bit.
- The anet resource has an id property (line 34).  This means that this anet will be instance 0 of zvnet - which will normally be seen as net0.  Again, more of that in a bit.
- A memory resource control, capped-memory, is set by default (lines 40 - 41).  In the solaris or solaris10 brand, this would mean that rcapd is used to soft limit the amount of physical memory a zone can use.  Kernel zones are different.  Not only is this a hard limit on the amount of physical memory that the kernel zone can use - the memory is immediately allocated and reserved as the zone boots.
- A suspend resource is present, which defines a location to write out a suspend image when zoneadm -z zonename suspend is invoked.
- The keysource resource is used for an encryption key that is used to encrypt suspend images and host data.  solaris-kz(5) has more info on this.
There are several things not shown here that may also be of interest:

- Previously, autoshutdown (line 4) allowed halt and shutdown as values.  It now also supports suspend for kernel zones only.  As you may recall, autoshutdown is used by svc:/system/zones:default when it is transitioning from online to offline.  If set to halt, the zone (kernel or otherwise) is brought down abruptly.  If set to shutdown, a graceful shutdown is performed.  Now, if a kernel zone has it set to suspend, the kernel zone will be suspended as svc:/system/zones:default goes offline.  When zoneadm boot is issued for a suspended zone, the zone is resumed.
- If there are multiple device resources that have bootpri set (i.e. bootable devices), zoneadm install will add all of the boot devices to a mirrored root zpool.

From the earlier blog entry, this kernel zone was booted and sysconfig was performed.  Let's look inside.  To get into the zone, you can use zlogin just like you do with any other zone.

root@vzl-212:~# zlogin myfirstkz
[Connected to zone 'myfirstkz' pts/3]
Oracle Corporation      SunOS 5.11      11.2    April 2014
root@myfirstkz:~#

As I alluded to above, a kernel zone gets a fixed amount of memory.  The value reported by prtconf matches the value shown in the capped-memory resource in the zone configuration.

root@myfirstkz:~# prtconf | grep ^Memory
Memory size: 2048 Megabytes

By default, a kernel zone gets one virtual cpu.  You can adjust this with the virtual-cpu or dedicated-cpu zonecfg resources.  See solaris-kz(5).  (A quick sketch of raising the virtual cpu count appears at the end of this post.)

root@myfirstkz:~# psrinfo
0       on-line   since 04/18/2014 22:39:22

Because a kernel zone runs its own kernel, it does not require that packages are in sync between the global zone and the kernel zone.  Notice that the pkg publisher output does not say (syspub) - the kernel zone and the global zone can even use different publishers for the solaris repository.  As SRUs and updates start to roll out you will see that you can independently update the global zone and the kernel zones on it.

root@myfirstkz:~# pkg publisher
PUBLISHER                   TYPE     STATUS P LOCATION
solaris                     origin   online F http://internal-ips-repo.example.com/

Because a kernel zone runs its own kernel, it considers itself to be a global zone.

root@myfirstkz:~# zonename
global

The root disk that I mentioned above shows up at c1d0.

root@myfirstkz:~# format
Searching for disks...done
AVAILABLE DISK SELECTIONS:
       0. c1d0 <kz-vDisk-ZVOL-16.00GB>
          /kz-devices@ff/disk@0
Specify disk (enter its number): ^D

And the anet shows up as net0 using physical device zvnet0.

root@myfirstkz:~# dladm show-phys
LINK              MEDIA                STATE      SPEED  DUPLEX    DEVICE
net0              Ethernet             up         1000   full      zvnet0

Let's jump on the console and see what happens when bad things happen...
root@myfirstkz:~# logout[Connection to zone 'myfirstkz' pts/3 closed]root@vzl-212:~# zlogin -C myfirstkz[Connected to zone 'myfirstkz' console]myfirstkz console login: rootPassword: Apr 18 23:47:06 myfirstkz login: ROOT LOGIN /dev/consoleLast login: Fri Apr 18 23:32:28 on kz/termOracle Corporation      SunOS 5.11      11.2    April 2014root@myfirstkz:~# dtrace -wn 'BEGIN { panic() }'dtrace: description 'BEGIN ' matched 1 probepanic[cpu0]/thread=c4001afbd720: dtrace: panic action at probe dtrace:::BEGIN (ecb c400123381e0)000002a10282acd0 dtrace:dtrace_probe+c54 (252acb8f029b3, 0, 0, 33fe, c4001b75e000, 103215b2)  %l0-3: 0000c400123381e0 0000c40019b82340 00000000000013fc 0000c40016889740  %l4-7: 0000c4001bc00000 0000c40019b82370 0000000000000003 000000000000ff00000002a10282af10 dtrace:dtrace_state_go+4ac (c40019b82340, 100, 0, c40019b82370, 16, 702a7040)  %l0-3: 0000000000030000 0000000010351580 0000c4001b75e000 00000000702a7000  %l4-7: 0000000000000000 0000000df8475800 0000000000030d40 00000000702a6c00000002a10282aff0 dtrace:dtrace_ioctl+ad8 (2c, 612164be40, 2a10282bacc, 202003, c400162fcdc0, 64747201)  %l0-3: 000000006474720c 0000c40019b82340 000002a10282b1a4 00000000ffffffff  %l4-7: 00000000702a6ee8 00000000702a7100 0000000000000b18 0000000000000180000002a10282b8a0 genunix:fop_ioctl+d0 (c40019647a40, 0, 612164be40, 202003, c400162fcdc0, 2a10282bacc)  %l0-3: 000000006474720c 0000000000202003 0000000001374f2c 0000c40010d84180  %l4-7: 0000000000000000 0000000000000000 00000000000000c0 0000000000000000000002a10282b970 genunix:ioctl+16c (3, 6474720c, 612164be40, 3, 1fa5ac, 0)  %l0-3: 0000c4001a5ea958 0000000010010000 0000000000002003 0000000000000000  %l4-7: 0000000000000003 0000000000000004 0000000000000000 0000000000000000syncing file systems... donedumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel sections: zfs 0:04  90% done (kernel) 0:05 100% done (zfs)100% done: 127783 (kernel) + 12950 (zfs) pages dumped, dump succeededrebooting...Resetting...[NOTICE: Zone rebooting]NOTICE: Entering OpenBoot.NOTICE: Fetching Guest MD from HV.NOTICE: Starting additional cpus.NOTICE: Initializing LDC services.NOTICE: Probing PCI devices.NOTICE: Finished PCI probing.SPARC T5-4, No KeyboardCopyright (c) 1998, 2014, Oracle and/or its affiliates. All rights reserved.OpenBoot 4.36.0, 2.0000 GB memory available, Serial #723535045.Ethernet address 0:0:0:0:0:0, Host ID: 2b2044c5.Boot device: disk0  File and args: SunOS Release 5.11 Version 11.2 64-bitCopyright (c) 1983, 2014, Oracle and/or its affiliates. All rights reserved.Hostname: myfirstkzApr 18 23:48:44 myfirstkz savecore: System dump time: Fri Apr 18 23:47:42 2014Apr 18 23:48:44 myfirstkz savecore: Saving compressed system crash dump files in directory /var/crashmyfirstkz console login: Apr 18 23:49:02 myfirstkz savecore: Decompress all crash dump files with '(cd /var/crash && savecore -v 0)' or individual files with 'savecore -vf /var/crash/vmdump{,-<secname>}.0'SUNW-MSG-ID: SUNOS-8000-KL, TYPE: Defect, VER: 1, SEVERITY: MajorEVENT-TIME: Fri Apr 18 23:49:07 CDT 2014PLATFORM: SPARC-T5-4, CSN: unknown, HOSTNAME: myfirstkzSOURCE: software-diagnosis, REV: 0.1EVENT-ID: f4c0d684-da80-425f-e45c-97bd0239b154DESC: The system has rebooted after a kernel panic. After disconnecting from the console (~.) I was back at the global zone root prompt.  The global zone didn't panic - the kernel zone did. 
root@vzl-212:~# uptime; zlogin myfirstkz uptime
  9:53pm  up  8:03,  2 users,  load average: 0.03, 0.12, 0.08
 11:52pm  up 5 min(s),  0 users,  load average: 0.04, 0.26, 0.15

That's the end of this tour.  Thanks for coming, and please come again!
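As promised above, here's a minimal sketch (my addition, not part of the original tour) of raising the virtual cpu count from the default of one.  The count of 4 is arbitrary, and a reboot is the simple way to make it take effect:

root@vzl-212:~# zonecfg -z myfirstkz "add virtual-cpu; set ncpus=4; end"
root@vzl-212:~# zoneadm -z myfirstkz reboot

After the reboot, psrinfo inside the zone should report four virtual processors.  dedicated-cpu works similarly, but binds specific cpus or cores rather than just setting a count.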

Need another disk in your zone? No problem!

Solaris 11.2 allows you to modify your zone configuration and apply those changes without a reboot.  Check this out:

First, I want to show that there are no disk devices present in the zone.

root@vzl-212:~# zlogin z1 'find /dev/*dsk'
/dev/dsk
/dev/rdsk

Next, I modify the static zone configuration to add access to all partitions of a particular disk.  This has no effect on the running zone.

root@vzl-212:~# zonecfg -z z1
zonecfg:z1> add device
zonecfg:z1:device> set match=/dev/*dsk/c0t600144F0DBF8AF19000052EC175B0004d0*
zonecfg:z1:device> end
zonecfg:z1> exit

I then use zoneadm to apply the changes in the zone configuration to the running zone.

root@vzl-212:~# zoneadm -z z1 apply
zone 'z1': Checking: Adding device match=/dev/*dsk/c0t600144F0DBF8AF19000052EC175B0004d0*
zone 'z1': Applying the changes

And now, the zone can see the devices.

root@vzl-212:~# zlogin z1 'find /dev/*dsk'
/dev/dsk
/dev/dsk/c0t600144F0DBF8AF19000052EC175B0004d0
/dev/dsk/c0t600144F0DBF8AF19000052EC175B0004d0s0
/dev/dsk/c0t600144F0DBF8AF19000052EC175B0004d0s1
/dev/dsk/c0t600144F0DBF8AF19000052EC175B0004d0s2
/dev/dsk/c0t600144F0DBF8AF19000052EC175B0004d0s3
/dev/dsk/c0t600144F0DBF8AF19000052EC175B0004d0s4
/dev/dsk/c0t600144F0DBF8AF19000052EC175B0004d0s5
/dev/dsk/c0t600144F0DBF8AF19000052EC175B0004d0s6
/dev/rdsk
/dev/rdsk/c0t600144F0DBF8AF19000052EC175B0004d0
/dev/rdsk/c0t600144F0DBF8AF19000052EC175B0004d0s0
/dev/rdsk/c0t600144F0DBF8AF19000052EC175B0004d0s1
/dev/rdsk/c0t600144F0DBF8AF19000052EC175B0004d0s2
/dev/rdsk/c0t600144F0DBF8AF19000052EC175B0004d0s3
/dev/rdsk/c0t600144F0DBF8AF19000052EC175B0004d0s4
/dev/rdsk/c0t600144F0DBF8AF19000052EC175B0004d0s5
/dev/rdsk/c0t600144F0DBF8AF19000052EC175B0004d0s6

If I wanted to just temporarily modify the configuration, there's an option to just modify the running configuration.  For example, if I plan on doing some maintenance on my ZFS Storage Appliance that hosts the LUN I allocated above, I may want to be sure that the zone can't see it for a bit.  That's easy enough.  Here I use zonecfg's -r option to modify the running configuration.

root@vzl-212:~# zonecfg -z z1 -r
zonecfg:z1> info device
device:
    match: /dev/*dsk/c0t600144F0DBF8AF19000052EC175B0004d0*
    storage not specified
    allow-partition not specified
    allow-raw-io not specified
zonecfg:z1> remove device
zonecfg:z1> info device
zonecfg:z1> commit
zone 'z1': Checking: Removing device match=/dev/*dsk/c0t600144F0DBF8AF19000052EC175B0004d0*
zone 'z1': Applying the changes
zonecfg:z1> exit

Without the -r option, zonecfg displays the on-disk configuration.

root@vzl-212:~# zonecfg -z z1 info device
device:
    match: /dev/*dsk/c0t600144F0DBF8AF19000052EC175B0004d0*
    storage not specified
    allow-partition not specified
    allow-raw-io not specified
root@vzl-212:~# zonecfg -z z1 -r info device
root@vzl-212:~#

The running configuration reflects the contents of /dev/dsk and /dev/rdsk inside the zone.

root@vzl-212:~# zlogin z1 'find /dev/*dsk'
/dev/dsk
/dev/rdsk

When it is time to revert back to the on-disk configuration, simply apply the on-disk configuration and the device tree inside the zone reverts to match it.
root@vzl-212:~# zoneadm -z z1 applyzone 'z1': Checking: Adding device match=/dev/*dsk/c0t600144F0DBF8AF19000052EC175B0004d0*zone 'z1': Applying the changesroot@vzl-212:~# zlogin z1 'find /dev/*dsk'/dev/dsk/dev/dsk/c0t600144F0DBF8AF19000052EC175B0004d0/dev/dsk/c0t600144F0DBF8AF19000052EC175B0004d0s0/dev/dsk/c0t600144F0DBF8AF19000052EC175B0004d0s1/dev/dsk/c0t600144F0DBF8AF19000052EC175B0004d0s2/dev/dsk/c0t600144F0DBF8AF19000052EC175B0004d0s3/dev/dsk/c0t600144F0DBF8AF19000052EC175B0004d0s4/dev/dsk/c0t600144F0DBF8AF19000052EC175B0004d0s5/dev/dsk/c0t600144F0DBF8AF19000052EC175B0004d0s6/dev/rdsk/dev/rdsk/c0t600144F0DBF8AF19000052EC175B0004d0/dev/rdsk/c0t600144F0DBF8AF19000052EC175B0004d0s0/dev/rdsk/c0t600144F0DBF8AF19000052EC175B0004d0s1/dev/rdsk/c0t600144F0DBF8AF19000052EC175B0004d0s2/dev/rdsk/c0t600144F0DBF8AF19000052EC175B0004d0s3/dev/rdsk/c0t600144F0DBF8AF19000052EC175B0004d0s4/dev/rdsk/c0t600144F0DBF8AF19000052EC175B0004d0s5/dev/rdsk/c0t600144F0DBF8AF19000052EC175B0004d0s6 Live Zone Reconfiguration is not limited to just device resources, it works with most other resources as well.  See the Live Zones Reconfiguration section in zones(5).
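Live Zone Reconfiguration covers more than disks.  As a hedged sketch (my addition; the etherstub name stub0 is hypothetical and the confirmation output is omitted), adding a network interface to a running solaris zone looks much the same as the device example above:

root@vzl-212:~# zonecfg -z z1 -r
zonecfg:z1> add anet
zonecfg:z1:anet> set lower-link=stub0
zonecfg:z1:anet> end
zonecfg:z1> commit
zonecfg:z1> exit

As with the device example, commit in the -r session applies the change to the running zone; the on-disk configuration is untouched unless you make the same change without -r.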

Install a kernel zone in 3 steps

One of the shiniest new features in Oracle Solaris 11.2 is Kernel Zones.  Kernel Zones provide the familiarity of zones while providing independent kernels.  This means that it's now possible to have zones that run different patch levels, act as CIFS servers, load kernel modules, etc.  So, let's get to installing a kernel zone.

If you have installed any other zone on Solaris before, this will look quite familiar.  After all, it is just another zone, right?  For this procedure to work, there are some prerequisites that shouldn't be much of a problem in a production environment, but are a bit of a problem if your normal playground is VirtualBox or the 6 year old server you found on the loading dock.

Step 1: Configure

root@vzl-212:~# zonecfg -z myfirstkz create -t SYSsolaris-kz

Step 2: Install

root@vzl-212:~# zoneadm -z myfirstkz install
Progress being logged to /var/log/zones/zoneadm.20140419T032707Z.myfirstkz.install
pkg cache: Using /var/pkg/publisher.
 Install Log: /system/volatile/install.5368/install_log
 AI Manifest: /tmp/zoneadm4798.dAaO7j/devel-ai-manifest.xml
  SC Profile: /usr/share/auto_install/sc_profiles/enable_sci.xml
Installation: Starting ...
        Creating IPS image
        Installing packages from:
            solaris
                origin:  http://ipkg/solaris11/dev/
        The following licenses have been accepted and not displayed.
        Please review the licenses for the following packages post-install:
          consolidation/osnet/osnet-incorporation
        Package licenses may be viewed using the command:
          pkg info --license <pkg_fmri>
DOWNLOAD                                PKGS         FILES    XFER (MB)   SPEED
Completed                            549/549   76929/76929  680.9/680.9  8.4M/s
PHASE                                          ITEMS
Installing new actions                   104278/104278
Updating package state database                 Done
Updating package cache                           0/0
Updating image state                            Done
Creating fast lookup database                   Done
Installation: Succeeded
        Done: Installation completed in 438.132 seconds.

Step 3: Celebrate!

At this point the kernel zone is installed and ready for boot.

root@vzl-212:~# zoneadm -z myfirstkz boot
root@vzl-212:~# zlogin -C myfirstkz
[Connected to zone 'myfirstkz' console]
Loading smf(5) service descriptions: 220/220...

Because a sysconfig profile was not provided during installation, sysconfig(1M) will ask a few things on first boot.

                           System Configuration Tool

     System Configuration Tool enables you to specify the following
     configuration parameters for your newly-installed Oracle Solaris 11
     system:
     - system hostname, network, time zone and locale, date and time, user
       and root accounts, name services, keyboard layout, support

     System Configuration Tool produces an SMF profile file in
     /etc/svc/profile/sysconfig/sysconfig-20140419-034040.

     How to navigate through this tool:
     - Use the function keys listed at the bottom of each screen to move
       from screen to screen and to perform other operations.
     - Use the up/down arrow keys to change the selection or to move
       between input fields.
     - If your keyboard does not have function keys, or they do not
       respond, press ESC; the legend at the bottom of the screen will
       change to show the ESC keys for navigation and other functions.
     F2_Continue  F6_Help  F9_Quit

If you've read this far into this entry, you know how to take it from here.
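About those prerequisites: kernel zones need CPU, firmware, and memory support from the host.  A quick way to check (my addition; as I understand it, virtinfo(1M) on Solaris 11.2 lists the virtualization environments the machine supports, so treat this as a hint and consult the docs for the authoritative list):

root@vzl-212:~# virtinfo

If kernel-zone shows up as a supported environment, you should be in good shape to follow the three steps above.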

How I spent my summer instead of vacationing

I've been pretty quiet around here lately, mainly because I've been heads down working on Unified Archives and Kernel Zones.  Markus has started to let the cat out of the bag...

1. kernel zones: With kernel zones customers will have the option to run different kernel patch levels across different zones while maintaining the simplicity of zones management. We'll also be able to do live migration of kernel zones. All of that across HW platforms, i.e. kernel zones will be available on both SPARC as well as x86. Key benefits of kernel zones:
x Low overhead (Lots of optimizations because we run Solaris on Solaris)
x Unique security features: Ability to make them immutable by locking down the root file system
x Integration with the Solaris resource management capabilities: CPU, memory, I/O, networking
x Fully compatible with OVM SPARC as well as native and S10 branded zones
x Comprehensive SDN capabilities: Distributed Virtual Switch and VxLAN capabilities

2. Unified Template Builder: This will allow customers to go from any-to-any of the following: Bare metal, OVM, kernel zone, native zone. For instance: You'll be able to take a zones image and re-deploy it as a bare metal image, kernel zone or ldom. Or vice versa! Pretty powerful, huh? Unified templates also provide us with a great foundation to distribute key Oracle applications as a shrink-wrapped, pre-installed, pre-tuned and configured image where customers can specify at install time whether to turn them into a bare metal image, a zone, a kernel zone or an OVM

Cold storage migration with Zones on Shared Storage

A question on the Solaris Zones forum inspired this entry in a place that perhaps more people will see it.  The goal in this exercise is to migrate a ZOSS (zones on shared storage) zone from a mirrored rootzpool to a raidz rootzpool.

WARNING:  I've used lofi devices as the backing store for my zfs pools.  This configuration will not survive a reboot and should not be used in the real world.  Use real disks when you do this for any zones that matter.

My starting configuration is:

# zonecfg -z stuff info rootzpool
rootzpool:
    storage: dev:lofi/1
    storage: dev:lofi/2

My zone, stuff, is installed and not running.

# zoneadm -z stuff list -v
  ID NAME             STATUS     PATH                           BRAND    IP
   - stuff            installed  /zones/stuff                   solaris  excl

I need to prepare the new storage by creating a zpool with the desired layout.

# zpool create newpool raidz /dev/lofi/3 /dev/lofi/4 /dev/lofi/5 /dev/lofi/6 /dev/lofi/7
# zpool status newpool
  pool: newpool
 state: ONLINE
  scan: none requested
config:

    NAME             STATE     READ WRITE CKSUM
    newpool          ONLINE       0     0     0
      raidz1-0       ONLINE       0     0     0
        /dev/lofi/3  ONLINE       0     0     0
        /dev/lofi/4  ONLINE       0     0     0
        /dev/lofi/5  ONLINE       0     0     0
        /dev/lofi/6  ONLINE       0     0     0
        /dev/lofi/7  ONLINE       0     0     0

errors: No known data errors

Next, migrate the data.  Remember, the zone is not running at this point.  We can use zfs list to figure out the name of the zpool mounted at the zonepath.

# zfs list -o name,mountpoint,mounted /zones/stuff
NAME         MOUNTPOINT    MOUNTED
stuff_rpool  /zones/stuff      yes
# zfs snapshot -r stuff_rpool@migrate
# zfs send -R stuff_rpool@migrate | zfs recv -u -F newpool

The -u option was used with zfs receive so that it didn't try to mount the zpool's root file system at the zonepath when it completed.  The -F option was used to allow it to wipe out anything that happens to exist in the top-level dataset in the destination zpool.

Now, we are ready to switch which pool is in the zone configuration.  To do that, we need to detach the zone, modify the configuration, and then attach it.  Prior to attaching, we also need to ensure that newpool is exported.

# zoneadm -z stuff detach
Exported zone zpool: stuff_rpool
# zpool export newpool
# zonecfg -z stuff
zonecfg:stuff> info rootzpool
rootzpool:
    storage: dev:lofi/1
    storage: dev:lofi/2
zonecfg:stuff> remove rootzpool
zonecfg:stuff> add rootzpool
zonecfg:stuff:rootzpool> add storage dev:lofi/3
zonecfg:stuff:rootzpool> add storage dev:lofi/4
zonecfg:stuff:rootzpool> add storage dev:lofi/5
zonecfg:stuff:rootzpool> add storage dev:lofi/6
zonecfg:stuff:rootzpool> add storage dev:lofi/7
zonecfg:stuff:rootzpool> end
zonecfg:stuff> exit

In the commands above, I was quite happy that zonecfg allows the up arrow or ^P to select the previous command.  Each instance of add storage was just four keystrokes (^P, backspace, number, enter).

# zoneadm -z stuff attach
Imported zone zpool: stuff_rpool
Progress being logged to /var/log/zones/zoneadm.20130430T144419Z.stuff.attach
    Installing: Using existing zone boot environment
      Zone BE root dataset: stuff_rpool/rpool/ROOT/solaris
                     Cache: Using /var/pkg/publisher.
  Updating non-global zone: Linking to image /.
Processing linked: 1/1 done
  Updating non-global zone: Auditing packages.
No updates necessary for this image.
  Updating non-global zone: Zone updated.
        Result: Attach Succeeded.
Log saved in non-global zone as /zones/stuff/root/var/log/zones/zoneadm.20130430T144419Z.stuff.attach
# zpool status stuff_rpool
  pool: stuff_rpool
 state: ONLINE
  scan: none requested
config:

    NAME             STATE     READ WRITE CKSUM
    stuff_rpool      ONLINE       0     0     0
      raidz1-0       ONLINE       0     0     0
        /dev/lofi/3  ONLINE       0     0     0
        /dev/lofi/4  ONLINE       0     0     0
        /dev/lofi/5  ONLINE       0     0     0
        /dev/lofi/6  ONLINE       0     0     0
        /dev/lofi/7  ONLINE       0     0     0

errors: No known data errors

At this point the storage has been migrated.  You can boot the zone and move on to the next task.  You probably want to use zfs destroy -r stuff_rpool@migrate once you are sure you don't need to revert to the old storage.  Until you delete it (or the source zpool) you can use zfs send -I to send just the differences back to the old pool.  That's left as an exercise for the reader.
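For completeness, the lofi devices used as backing store above would have been set up with something like the following (my reconstruction; the file path and 1g size are arbitrary, and as warned earlier this setup does not survive a reboot).  lofiadm -a prints the /dev/lofi/N device it creates, and that device is what the zpool and rootzpool commands refer to:

# mkfile 1g /export/lofi/disk3
# lofiadm -a /export/lofi/disk3
/dev/lofi/3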

Linux to Solaris @ Morgan Stanley

I came across this blog entry and the accompanying presentation by Robert Milkowski about his experience switching from Linux to Oracle Solaris 11 for a distributed OpenAFS file serving environment at Morgan Stanley.

If you are an IT manager, the presentation will show you:

- Running Solaris with a support contract can cost less than running Linux (even without a support contract) because of technical advantages of Solaris.
- IT departments can benefit from hiring computer scientists into Systems Programmer or similar roles.  Their computer science background should be nurtured so that they can continue to deliver value (savings and opportunity) to the business as technology advances.

If you are a sysadmin, developer, or somewhere in between, the presentation will show you:

- A presentation that explains your technical analysis can be very influential.
- Learning and using the non-default options of an OS can make all the difference as to whether one OS is better suited than another.  For example, see the graphs on slides 3 - 5.  The ZFS default is to not use compression.  (A quick example of turning it on follows at the end of this post.)
- When trying to convince those that hold the purse strings that your technical direction should be taken, the financial impact can be the part that closes the deal.  See slides 6, 9, and 10.  Sometimes reducing rack space requirements can be the biggest impact because it may stave off or completely eliminate the need for facilities growth.
- DTrace can be used to shine light on performance problems that may be suspected but not diagnosed.  It is quite likely that these problems have existed in OpenAFS for a decade or more.  DTrace made diagnosis possible.
- DTrace can be used to create performance analysis tools without modifying the source of software that is under analysis.  See slides 29 - 32.
- Microstate accounting, visible in the prstat output on slide 37, can be used to quickly draw focus to problem areas that affect CPU saturation.  Note that prstat without -m gives a time-decayed moving average that is not nearly as useful.
- Instruction level probes (slides 33 - 34) are a super-easy way to identify which part of a function is hot.
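As referenced above, enabling ZFS compression is a one-liner.  This isn't from the presentation - just a generic illustration, with rpool/export standing in for whatever dataset holds your data:

# zfs get compression rpool/export
# zfs set compression=on rpool/export

compression=on selects the lightweight lzjb algorithm; gzip levels are also available if you're willing to trade CPU for space.  Keep in mind that only blocks written after the property is set are compressed.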

What I learned about lofi today

As I was digging into some other things today, I realized that lofiadm is not needed in common use cases.  As mount(1M) says:

     For file system types that support it, a file can be mounted
     directly as a file system by specifying the full path to the
     file as the special argument. In such a case, the nosuid
     option is enforced. If specific file system support for such
     loopback file mounts is not present, you can still use
     lofiadm(1M) to mount a file system image. In this case, no
     special options are enforced.

That is, you can do this:

root@global# lofiadm
Block Device             File                           Options
root@global# mount -F hsfs `pwd`/sol-10-u9-ga-x86-dvd.iso /mnt
root@global# df -h /mnt
Filesystem             Size   Used  Available Capacity  Mounted on
/ws/media/solaris/sol-10-u9-ga-x86-dvd.iso
                       2.0G   2.0G         0K   100%    /mnt
root@global# lofiadm
Block Device             File                           Options
/dev/lofi/1              /ws/media/solaris/sol-10-u9-ga-x86-dvd.iso     -

When I unmount it, the lofi device goes away as well.

root@global# umount /mnt
root@global# lofiadm
Block Device             File                           Options

Note that this was on Solaris 11 - I don't believe that this feature was backported to Solaris 10.
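For contrast, the traditional two-step dance that this feature saves you from looks roughly like this (my sketch, using the same ISO path as above):

root@global# lofiadm -a /ws/media/solaris/sol-10-u9-ga-x86-dvd.iso
/dev/lofi/1
root@global# mount -F hsfs -o ro /dev/lofi/1 /mnt
...
root@global# umount /mnt
root@global# lofiadm -d /dev/lofi/1

Note that with the explicit lofiadm approach the lofi device does not go away on unmount; you have to remember the lofiadm -d step yourself.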

Automating custom software installation in a zone

In Solaris 11, the internals of zone installation are quite different than they were in Solaris 10.  This difference allows the administrator far greater control of what software is installed in a zone.  The rules in Solaris 10 are simple and inflexible: if it is installed in the global zone and is not specifically excluded by package metadata from being installed in a zone, it is installed in the zone.  In Solaris 11, the rules are still simple, but are much more flexible: the packages you tell it to install and the packages on which they depend will be installed.

So, where does the default list of packages come from?  From the AI (auto installer) manifest, of course.  The default AI manifest is /usr/share/auto_install/manifest/zone_default.xml.  Within that file you will find:

            <software_data action="install">
                <name>pkg:/group/system/solaris-small-server</name>
            </software_data>

So, the default installation will install pkg:/group/system/solaris-small-server.  Cool.  What is that?  You can figure out what is in the package by looking for it in the repository with your web browser (click the manifest link), or use pkg(1).  In this case, it is a group package (pkg:/group/), so we know that it just has a bunch of dependencies to name the packages that it really wants installed.

$ pkg contents -t depend -o fmri -s fmri -r solaris-small-server
FMRI
compress/bzip2
compress/gzip
compress/p7zip
...
terminal/luit
terminal/resize
text/doctools
text/doctools/ja
text/less
text/spelling-utilities
web/wget

If you would like to see the entire manifest from the command line, use pkg contents -r -m solaris-small-server.

Let's suppose that you want to install a zone that also has mercurial and a full-fledged installation of vim rather than just the minimal vim-core that is part of solaris-small-server.  That's pretty easy.

First, copy the default AI manifest somewhere where you will edit it and make it writable.

# cp /usr/share/auto_install/manifest/zone_default.xml ~/myzone-ai.xml
# chmod 644 ~/myzone-ai.xml

Next, edit the file, changing the software_data section as follows:

            <software_data action="install">
                <name>pkg:/group/system/solaris-small-server</name>
                <name>pkg:/developer/versioning/mercurial</name>
                <name>pkg:/editor/vim</name>
            </software_data>

To figure out the names of the packages, either search the repository using your browser, or use a command like pkg search hg.

Now we are all ready to install the zone.  If it has not yet been configured, that must be done as well.

# zonecfg -z myzone 'create; set zonepath=/zones/myzone'
# zoneadm -z myzone install -m ~/myzone-ai.xml
A ZFS file system has been created for this zone.
Progress being logged to /var/log/zones/zoneadm.20111113T004303Z.myzone.install
       Image: Preparing at /zones/myzone/root.
 Install Log: /system/volatile/install.15496/install_log
 AI Manifest: /tmp/manifest.xml.XfaWpE
  SC Profile: /usr/share/auto_install/sc_profiles/enable_sci.xml
    Zonename: myzone
Installation: Starting ...
              Creating IPS image
              Installing packages from:
                  solaris
                      origin: http://localhost:1008/solaris/54453f3545de891d4daa841ddb3c844fe8804f55/
DOWNLOAD                                PKGS       FILES    XFER (MB)
Completed                            169/169 34047/34047  185.6/185.6
PHASE                                        ACTIONS
Install Phase                            46498/46498
PHASE                                          ITEMS
Package State Update Phase                   169/169
Image State Update Phase                         2/2
Installation: Succeeded
        Note: Man pages can be obtained by installing pkg:/system/manual
 done.
        Done: Installation completed in 531.813 seconds.
  Next Steps: Boot the zone, then log into the zone console (zlogin -C) to complete the configuration process.
Log saved in non-global zone as /zones/myzone/root/var/log/zones/zoneadm.20111113T004303Z.myzone.install

Now, for a few things that I've seen people trip over:

- Ignore that bit about man pages - it's wrong.  Man pages are already installed so long as the right facet is set properly.  And that's a topic for another blog entry.
- If you boot the zone then just use zlogin myzone, you will see that services you care about haven't started and that svc:/milestone/config:default is starting.  That is because you have not yet logged into the console with zlogin -C myzone.
- If the zone has been booted for more than a very short while when you first connect to the zone console, it will seem like the console is hung.  That's not really the case - hit ^L (control-L) to refresh the sysconfig(1M) screen that is prompting you for information.
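Once the zone is booted and configured, a quick way to confirm the extra packages made it in (my addition; output omitted because versions vary):

# zlogin myzone pkg list mercurial vim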

These are 11 of my favorite things!

With today's launch of Solaris 11, I felt it would be good to introduce 11 of my favorite things about zones in Solaris 11.

1. Minimized by default.  The default zone installation size is about 420 MB and has a very strong selection of the tools you expect in a modern UNIX installation.  It's also very simple to find (e.g. pkg search -r hg) and install (e.g. pkg install mercurial) that tool that you use all the time that isn't in the default installation.
2. It's easy to have different packages in the global zone and in each non-global zone.  In fact, the default is to have the solaris-large-server group package installed in the global zone and the solaris-small-server installed in each zone.
3. Zone boot environments, synchronized with global zone boot environments.
4. Immutable zones allow you to turn all or part of a zone read-only.  Even a bad guy (or a mostly good guy doing something bad) running as root cannot add, delete, or modify protected files.
5. Dedicated IP stack by default, with automatic configuration with anets.
6. The same zfs dataset layout in the global zone and non-global zones.
7. Dataset aliasing.
8. The automated installer configuration (AI manifest) used to install a global zone and a non-global zone are almost the same (and you can install zones automatically as part of a global zone installation).
9. Performance monitoring with zonestat (see the example below).
10. Easy p2v and v2v using zfs send streams.
11. solaris10 branded zones allow you to easily begin to take advantage of some Solaris 11 goodies while still running Solaris 10.

In the coming weeks, I'll talk about each of these and show how you can put them together in interesting ways to solve the types of problems that you see in the real world.  If there's something you would really like me to dig into first, let me know.
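Here's the zonestat invocation referenced in item 9 - nothing fancy, just a summary of CPU, memory, and network use per zone every 5 seconds until you interrupt it (my addition; see zonestat(1) for the many reporting and filtering options):

root@global:~# zonestat 5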
