Friday Jul 22, 2016

Setting up Owncloud on Solaris

I recently had this private little project to try out Owncloud and Nextcloud for personal use.  But since I tried it on Solaris, I thought I might as well share a short summary here for whoever might find it useful.

To deploy either Owncloud or Nextcloud on Solaris, you generally follow the commandline installation instructions.  They are very short and straightforward.  In general, use the Linux manual installation for guidance. However, there are a few Solaris specifics like package dependencies, which are not documented.  Here's what you'll need to do:

  • I installed in a non-global zone (planning to make it immutable once it's all up and running).  To resolve all the dependencies, you'll need to install these packages right after deploying the empty zone (not sure I need all those apache packages...):
  • Make sure your zone has internet access and DNS resolution.  It will need it to use the Owncloud/Nextcloud appstore.
  • It is easiest to install and run Owncloud/Nextcloud as webservd, since then you don't have to bother with tweaking apache into using a different user.
  • You'll need to enable a few extensions for php.  You do this in /etc/php/5.6/conf.d/extensions.ini.  Here are the ones I enabled; I'm not sure I need them all...
  • Create a config file for the mysql extension in /etc/php/5.6/conf.d/mysql.ini.  I took the example from the Admin Guide.
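    The mysql.ini I ended up with looks roughly like this (a sketch from memory of the Admin Guide example; the exact extension list and socket path are assumptions and may differ on your system):

    ```ini
    ; /etc/php/5.6/conf.d/mysql.ini  (sketch - adjust to your system)
    extension=mysql.so
    extension=mysqli.so
    extension=pdo_mysql.so
    mysql.default_socket=/tmp/mysql.sock
    mysqli.default_socket=/tmp/mysql.sock
    pdo_mysql.default_socket=/tmp/mysql.sock
    ```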
  • I wanted to have a separate ZFS dataset for the software, the data and the mysql database.  This would give me snapshot capability as well as write access to the data once the zone is immutable.
    • Delegate a ZFS dataset to the zone.
      zonecfg -z nextcloud info dataset
      	name: datapool/nextcloud
      	alias: nextcloud
    • Create some filesystems in the dataset to host software, data and database
      root@nextcloud:~# zfs list -r nextcloud
      nextcloud          243M  2.52T  38.6K  /nextcloud
      nextcloud/apache  38.0K  2.52T  38.0K  /nextcloud/apache
      nextcloud/data    17.5M  2.52T  17.5M  /nextcloud/server/nextcloud/data
      nextcloud/mysql    146M  2.52T   146M  /nextcloud/mysql
      nextcloud/server  79.2M  2.52T  79.2M  /nextcloud/server
    • Change the mysql default to point to the new location:
      svccfg -s mysql:version_56 setprop mysql/data=/nextcloud/mysql/data 
      svccfg -s mysql:version_56 refresh
  • Now just follow the Admin Guide to create the mysql database:
    svcadm enable mysql
    mysqladmin -u root password "secret"
    mysql -u root -p
    mysql> create user 'admin'@'localhost' identified by 'secret';
    Query OK, 0 rows affected (0.25 sec)
    mysql> create database if not exists nextcloud ;
    Query OK, 1 row affected (0.00 sec)
    mysql> GRANT ALL PRIVILEGES ON nextcloud.* TO 'admin'@'localhost' identified by 'secret';
    Query OK, 0 rows affected (0.00 sec)
  • And finally, perform the installation:
    php occ maintenance:install --database "mysql" --database-name "nextcloud" --database-user "admin" --database-pass "secret"\
    --admin-user "admin" --admin-pass "secret"
  • The rest is no different to the Linux installation.  You'll need to configure apache to serve the application.  Don't forget to do this with SSL if you're actually running this on the internet!
  • Don't forget to tighten file security as described in the Admin Guide!
  • Once done, I turned my zone immutable for additional security.  For this to work, I had to redirect the apache logs to a writable directory, so I created another zfs dataset in the nextcloud pool and had apache send its logs there.  To turn immutability on, just do
    zoneadm -z nextcloud halt
    zonecfg -z nextcloud set file-mac-profile=fixed-configuration
    zoneadm -z nextcloud boot

Have fun!

Tuesday Apr 26, 2016

Socket, Core, Strand - Where are my Zones?

Consolidation using Solaris Zones is widely adopted.  In many cases, people run all the zones on all available CPUs, which is great for overall utilization.  In such a case, Solaris does all the scheduling, taking care that the best CPU is chosen for each process and that all resources are distributed fairly amongst all applications.  However, there are cases where you would want to dedicate a certain set of CPUs to one or more zones.  For example to deal with license restrictions or to create a more strict separation between different workloads.  This separation is achieved either by using the "dedicated-cpu" setting in the zone's configuration, or by binding the zone to an existing resource pool, which in turn contains a processor set.  The technology in both cases is the same, since in the case of "dedicated-cpu", Solaris automatically creates a temporary resource pool when the zone is started.  The effect of using a processor set is that the CPUs assigned to it are available exclusively to the zones associated with this set.  This means that these zones can use exactly those CPUs - not more, not less.  Anything else running on the system (the global zone and any other zones) can no longer be executed on these CPUs.

In this article, I'll discuss (and hopefully answer) the question of which CPUs to include in such a processor set, and how to figure out which zones currently run on which CPUs.

To avoid unnecessary confusion, let me define a few terms first, since there are multiple names in use for the various concepts:

  • A CPU is a processor, consisting of one or more cores, cache and optionally some IO controllers and/or memory controllers.
  • A Core is one computation or execution unit on a CPU.  (Not to be confused with the pipelines that it contains.)
  • A Strand is an entry point into a core, which makes the core's services available to the operating system.

For example, a SPARC M7 CPU consists of 32 cores.  Each core provides 8 strands, so an M7 CPU provides 32*8=256 strands to the OS.  The OS treats each of these strands as a fully-fledged execution unit and therefore shows 256 "CPUs".
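Given this numbering, and assuming the strands of each core are numbered contiguously (as on the M7), mapping a strand ID back to its core and CPU is simple integer arithmetic.  A quick sketch with a hypothetical strand ID:

```shell
# Map a strand ID back to its core and CPU, assuming the M7 layout
# (32 cores per CPU, 8 strands per core) and contiguous numbering.
# The strand ID below is just an example.
strands_per_core=8
cores_per_cpu=32
strand=200
core=$(( strand / strands_per_core ))
cpu=$(( strand / (strands_per_core * cores_per_cpu) ))
echo "strand $strand -> core $core on cpu $cpu"
```

On other CPUs the constants differ, and the firmware may number strands differently, which is why querying kstat (as shown below) is the reliable way to find the mapping.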

All modern multi-core CPUs include multiple levels of caches.  The L3 cache is usually shared by all cores.  L2 and L1 caches are closer to the cores.  They are smaller but faster and often dedicated to one or a small number of cores.  (The M7 CPU applies different strategies, but each core owns its own, exclusive L1 cache.)  Now, if multiple strands of the same core are used by the same process (or application), this can lead to relatively high hit rates in these caches.  If, on the other hand, different processes use the same core, there will be competition for the little cache space, overwriting each other's entries.  We call this behavior "cache thrashing".  Solaris does a good job trying to prevent this.  However, when using many zones, it is common to assign different zones to different sets of cores.  Use whole cores (complete sets of 8 strands) to avoid sharing of cores between zones or applications.  This also makes the most sense with regards to license capping, since you usually license your application by the number of cores.

So how can you make sure that your zones are bound correctly to whole, exclusive cores?

Solaris knows about the relation between strands, cores and CPUs (as well as the memory hierarchy, which I'll not cover here).  You can query this relation using kstat.  For historical reasons (dating from the days before multi-core, multi-strand CPUs), Solaris uses the term "CPU" for what we now call a strand:

root@mars:~# kstat -m cpu_info -s core_id -i 150
module: cpu_info                        instance: 150   
name:   cpu_info150                     class:    misc
        core_id                         18

root@mars:~# kstat -m cpu_info -s chip_id -i 150
module: cpu_info                        instance: 150   
name:   cpu_info150                     class:    misc
        chip_id                         1

In the above example, the "cpu" with id 150 is a strand of core 18, which belongs to CPU 1.  You can discover all available strands and CPUs like this.
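A script can discover the whole topology by parsing this output.  Here's a minimal sketch (not the actual zonecores code) that extracts the mapping for one strand, using a captured sample in place of a live kstat call:

```shell
# Extract core_id and chip_id for one strand from kstat-style output.
# On a live system you would feed this from:  kstat -m cpu_info -i 150
kstat_out='module: cpu_info                        instance: 150
name:   cpu_info150                     class:    misc
        core_id                         18
        chip_id                         1'

summary=$(echo "$kstat_out" | awk '
  /instance:/ { strand = $NF }   # the strand ("CPU") instance number
  /core_id/   { core   = $NF }   # core this strand belongs to
  /chip_id/   { chip   = $NF }   # physical CPU (socket)
  END { printf "strand %s is on core %s of chip %s", strand, core, chip }')
echo "$summary"
```

Looping this over all cpu_info instances gives you the complete strand/core/socket map.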

Usually, when you configure a processor set for a resource pool, you just tell it the minimum and maximum number of strands it should contain (where min=max is quite common). Optionally, you can also specify specific CPU-IDs (strands) or, since Solaris 11.2, core IDs.  The commands to do this are "pooladm" and "poolcfg".  (There is also the command "psrset", but it only creates a processor set, not a resource pool, and is not permanent, so needs to be run after every reboot.)  I already described the use of these commands a while ago.  Now, to figure out which strands, cores or CPUs are assigned to a specific zone, you'd need to use kstat to find the association between strand IDs in your processor set and the corresponding cores and CPUs.  Done manually, that's a little painful, which is why I wrote a little script to do this for you:

root@mars:~# ./zonecores -h
usage: zonecores [-Sscl] 
       -S report whole Socket use
       -s report shared use
       -c report whole core use
       -l list cpu overview

With the "-l" commandline option, it will give you an overview of the available CPUs and which zones are running on them.  Here's an example from a SPARC system with two 16-core CPUs:

root@mars:~# ./zonecores -l
# Socket, Core, Strand and Zone Overview
Socket Core Strands Zones
0      0    0,1,2,3,4,5,6,7 db2,
0      1    8,9,10,11,12,13,14,15 db2,
0      2    16,17,18,19,20,21,22,23 none
0      3    24,25,26,27,28,29,30,31 db2,
0      4    32,33,34,35,36,37,38,39 db2,
0      5    40,41,42,43,44,45,46,47 db2,
0      6    48,49,50,51,52,53,54,55 db2,
0      7    56,57,58,59,60,61,62,63 coreshare,db1,
0      8    64,65,66,67,68,69,70,71 db2,
0      9    72,73,74,75,76,77,78,79 none
0     10    80,81,82,83,84,85,86,87 none
0     11    88,89,90,91,92,93,94,95 none
0     12    96,97,98,99,100,101,102,103 none
0     13    104,105,106,107,108,109,110,111 none
0     14    112,113,114,115,116,117,118,119 none
0     15    120,121,122,123,124,125,126,127 none
1     16    128,129,130,131,132,133,134,135 none
1     17    136,137,138,139,140,141,142,143 none
1     18    144,145,146,147,148,149,150,151 none
1     19    152,153,154,155,156,157,158,159 none
1     20    160,161,162,163,164,165,166,167 none
1     21    168,169,170,171,172,173,174,175 none
1     22    176,177,178,179,180,181,182,183 none
1     23    184,185,186,187,188,189,190,191 none
1     24    192,193,194,195,196,197,198,199 none
1     25    200,201,202,203,204,205,206,207 none
1     26    208,209,210,211,212,213,214,215 none
1     27    216,217,218,219,220,221,222,223 none
1     28    224,225,226,227,228,229,230,231 none
1     29    232,233,234,235,236,237,238,239 none
1     30    240,241,242,243,244,245,246,247 db2,
1     31    248,249,250,251,252,253,254,255 none

Using the options -S and -c, you can check whether your zones use whole sockets (-S) or whole cores (-c).   With -s you can check whether or not several zones share one or more cores, which can be intentional or not, depending on the use case.  Here's an example with various pools and zones on the same system as above:

root@mars:~# ./zonecores -Ssc
# Checking Socket Affinity (16 cores per socket)
INFO - Zone db2 using 2 sockets for 8 cores.
OK - Zone db1 using 1 sockets for 1 cores.
OK - Zone capped7 using default pool.
OK - Zone coreshare using 1 sockets for 1 cores.
# Checking Core Resource Sharing
OK - Core 0 used by only one zone.
OK - Core 1 used by only one zone.
OK - Core 3 used by only one zone.
OK - Core 30 used by only one zone.
OK - Core 4 used by only one zone.
OK - Core 5 used by only one zone.
OK - Core 6 used by only one zone.
INFO - Core 7 used by 2 zones!
-> coreshare
-> db1
OK - Core 8 used by only one zone.
# Checking Whole Core Assignments
OK - Zone db2 using all 8 strands of core 0.
OK - Zone db2 using all 8 strands of core 1.
OK - Zone db2 using all 8 strands of core 3.
OK - Zone db2 using all 8 strands of core 30.
OK - Zone db2 using all 8 strands of core 4.
OK - Zone db2 using all 8 strands of core 5.
FAIL - only 7 strands of core 6 in use for zone db2.
FAIL - only 1 strands of core 8 in use for zone db2.
OK - Zone db1 using all 8 strands of core 7.
OK - Zone coreshare using all 8 strands of core 7.

Info: 1 instances of core sharing found.
Info: 1 instances of socket spanning found.
Warning: 2 issues found with whole core assignments.

While this mostly speaks for itself, here are some comments:

  • Zone db1 uses a resource pool with 8 strands from one core.
  • Zone coreshare also uses that same pool.
  • Zone db2 uses a resource pool with 64 strands coming from cores from two different CPUs.  It only uses 7 of the 8 strands from core 6, while the 8th strand comes from core 8.  This is probably not intentional.  It would make more sense to use all 8 strands from the same core to avoid cache sharing and reduce the number of cores to license by one.   It might also be beneficial to use all 8 cores from the same CPU.  In this case, Solaris would attempt to allocate memory local to that CPU to avoid remote memory access.
  • Zone capped7 is configured with the option "capped-cpu: ncpus=7".  This is enforced via the zone.cpu-cap resource control, with the zone running on all available CPUs in the default pool.
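For illustration, the whole-core check essentially counts, per zone and core, how many of the core's 8 strands are in the zone's processor set.  A simplified sketch with hypothetical input shaped like the results above (not the actual zonecores code):

```shell
# Flag partial core use.  Input lines are "core strands_in_use zone",
# a hypothetical stand-in for what the real script derives from kstat
# and the processor set configuration.
usage='6 7 db2
7 8 db1
8 1 db2'

report=$(echo "$usage" | while read core n zone; do
  if [ "$n" -eq 8 ]; then
    echo "OK - Zone $zone using all 8 strands of core $core."
  else
    echo "FAIL - only $n strands of core $core in use for zone $zone."
  fi
done)
echo "$report"
```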

The script is available for download here: zonecores

I also wrote a more detailed discussion of all of this, with examples of how to reconfigure your pools, in MOS DocID 2116794.1


Tuesday Oct 01, 2013

CPU-DR for Zones

In my last entry, I described how to change the memory configuration of a running zone.  The natural next question is of course, if that also works with CPUs that have been assigned to a zone.  The answer, of course, is "yes".

You might wonder why that would be necessary in the first place.  After all, there's the Fair Share Scheduler, that's extremely capable of managing zones' CPU usage.  However, there are reasons to assign dedicated CPU resources to zones, licensing is one, SLAs with specified CPU requirements another.  In such cases, you configure a fixed amount of CPUs (more precisely, strands) for a zone.  Being able to change this configuration on the fly then becomes desirable.  I'll show how to do that in this blog entry.

In general, there are two ways to assign exclusive CPUs to a zone.  The classic approach is by using a resource pool with an associated processor set.  One or more zones can then be bound to that pool.  The easier solution is to use the parameter "dedicated-cpu" directly when configuring the zone.  In this second case, Solaris will create a temporary pool to manage these resources.  So effectively, the implementation is the same in both cases.  Which makes it clear how to change the CPU configuration in both cases: By changing the pool.  If you do this in the classical approach, the change to the pool will be persistent.  If working with the temporary pool created for the zone, you will also need to change the zone's configuration if you want the change to survive a zone restart.

If you configured your zone with "dedicated-cpu", the temporary pool (and also the temporary processor set that goes along with it) will usually be called "SUNWtmp_<zonename>".   If not, you'll know the name of the pool...  In both cases, everything else is the same:

Let's assume a zone called orazone, currently configured with 1 CPU.  It's to be assigned a second CPU.  The current pool configuration is like this:
root@benjaminchen:~# pooladm                

system default
	string	system.comment 
	int	system.version 1
	boolean	system.bind-default true
	string	system.poold.objectives wt-load

	pool pool_default
		int	pool.sys_id 0
		boolean	pool.active true
		boolean	pool.default true
		int	pool.importance 1
		string	pool.comment 
		pset	pset_default

	pool SUNWtmp_orazone
		int	pool.sys_id 5
		boolean	pool.active true
		boolean	pool.default false
		int	pool.importance 1
		string	pool.comment 
		boolean	pool.temporary true
		pset	SUNWtmp_orazone

	pset pset_default
		int	pset.sys_id -1
		boolean	pset.default true
		uint	pset.min 1
		uint	pset.max 65536
		string	pset.units population
		uint	pset.load 687
		uint	pset.size 3
		string	pset.comment 

			int	cpu.sys_id 1
			string	cpu.comment 
			string	cpu.status on-line

			int	cpu.sys_id 3
			string	cpu.comment 
			string	cpu.status on-line

			int	cpu.sys_id 2
			string	cpu.comment 
			string	cpu.status on-line

	pset SUNWtmp_orazone
		int	pset.sys_id 2
		boolean	pset.default false
		uint	pset.min 1
		uint	pset.max 1
		string	pset.units population
		uint	pset.load 478
		uint	pset.size 1
		string	pset.comment 
		boolean	pset.temporary true

			int	cpu.sys_id 0
			string	cpu.comment 
			string	cpu.status on-line
As we can see in the definition of pset SUNWtmp_orazone, it has been assigned CPU #0.  To add another CPU to this pool, you'll need these two commands:
root@benjaminchen:~# poolcfg -dc 'modify pset SUNWtmp_orazone \
                     (uint pset.max=2)' 
root@benjaminchen:~# poolcfg -dc 'transfer to pset \
                     SUNWtmp_orazone (cpu 1)'

To remove that CPU from the pool again, use these:

root@benjaminchen:~# poolcfg -dc 'transfer to pset pset_default \
                     (cpu 1)'
root@benjaminchen:~# poolcfg -dc 'modify pset SUNWtmp_orazone \
                     (uint pset.max=1)' 

That's it.   If you've used "dedicated-cpu" for your zone's configuration, you'll need to change that before the next reboot.  If not, you'd have to use the pool name you assigned to the zone.
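If the zone was configured with "dedicated-cpu", the persistent counterpart of the change above would be a zonecfg edit like this (a sketch, following the same example):

```
root@benjaminchen:~# zonecfg -z orazone
zonecfg:orazone> select dedicated-cpu
zonecfg:orazone:dedicated-cpu> set ncpus=2
zonecfg:orazone:dedicated-cpu> end
zonecfg:orazone> commit
zonecfg:orazone> exit
```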


Monday Aug 19, 2013

Memory-DR for Zones

Zones allow you to limit their memory consumption.  The usual way to configure this is with the zone parameter "capped-memory" and its three sub-values "physical", "swap" and "locked".  "Physical" corresponds to the resource control "zone.max-rss", which is actual main memory.  "Swap" corresponds to "zone.max-swap", which is swap space, and "locked" corresponds to "zone.max-locked-memory", which is non-pageable memory, typically shared memory segments.  Swap and locked memory are hard limits that can't be exceeded.  RSS (physical memory) is not quite as hard a limit, since it is enforced by rcapd.  This daemon will try to page out those memory pages that are beyond the allowed amount of memory and are least active.  Depending on the activity of the processes in question, this is more or less successful, but will always result in paging activity.  This will slow down the memory-hungry processes in that zone.

If you change any of these values using zonecfg, these changes will only be in effect after a reboot of the zone.  This is not as dynamic as one might be used to from the LDoms world.  But it can be, as I'd like to show you in a small example:

Let's assume a little zone with a memory configuration like this:

root@benjaminchen:~# zonecfg -z orazone info capped-memory
    physical: 512M
    [swap: 256M]
    [locked: 512M]

To change these values while the zone is in operation, you need to interact with two different sub-systems.   For physical memory, we'll need to talk to rcapd.  For swap and locked memory, we need prctl for the normal resource controls.  So, if I wanted to double all three limits for my zone, I'd need these commands:

root@benjaminchen:~# prctl -n zone.max-swap -v 512m -r -i zone orazone
root@benjaminchen:~# prctl -n zone.max-locked-memory -v 1g -r -i zone orazone
root@benjaminchen:~# rcapadm -z orazone -m 1g

These new values will be effective immediately - for rcapd after the next reconfigure-interval.  You can also change this interval with rcapadm.  Note that these changes are not persistent - if you reboot your zone, it will fall back to whatever was configured with zonecfg.  So to have both - persistent changes and immediate effect, you'll need to touch both tools.
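The persistent side of the same change is the usual zonecfg edit, which only takes effect at the next zone reboot (a sketch for the example zone; the values mirror the doubled limits above):

```
root@benjaminchen:~# zonecfg -z orazone
zonecfg:orazone> select capped-memory
zonecfg:orazone:capped-memory> set physical=1g
zonecfg:orazone:capped-memory> set swap=512m
zonecfg:orazone:capped-memory> set locked=1g
zonecfg:orazone:capped-memory> end
zonecfg:orazone> commit
zonecfg:orazone> exit
```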



Wednesday Jun 12, 2013

Growing the root pool

Some small in-between laptop experiences...  I finally decided to throw away that other OS (I used it so rarely that I regularly had to use the password reset procedure...).  That gave me another 50g of valuable laptop disk space - fortunately on the right part of the disk.  So in theory, all I'd have to do is resize the Solaris partition, tell ZFS about it and be happy...  Of course, there are the usual pitfalls.

To avoid confusion, much of this is x86 related.  On normal SPARC servers, you don't have any of the problems for which I describe solutions here...

First of all, you should *not* try to resize the partition that hosts your rpool while Solaris is up and running.  It works, but there are nicer ways to do a shutdown.  (What happens is that fdisk will not only create the new partition, but also write a default label into that partition, which means that ZFS will not find its slice, which will make Solaris very unresponsive...)  The right way to do this is to boot off something else (PXE, USB, DVD, whatever) and then change the partition size.  Once that's done, re-create the slice for the ZFS rpool.  The important part is to use the very same starting cylinder.  The length, naturally, will be larger.  (At least, I had to do that, since the original zpool lived in a slice.)

After that, it's back to the book:  Boot Solaris and choose one of "zpool set autoexpand=on rpool" or "zpool online -e rpool c0t0d0s0" and there you go - 50g more space.

Did I forget to mention that I actually did a full backup before all of this?  I must be getting old...

Tuesday Apr 17, 2012

Solaris Zones: Virtualization that Speeds up Benchmarks

One of the first questions that typically comes up when I talk to customers about virtualization is the overhead involved.  Now we all know that virtualization with hypervisors comes with an overhead of some sort.  We should also all know that exactly how big that overhead is depends on the type of workload as much as it depends on the hypervisor used.  While there have been attempts to create standard benchmarks for this, quantifying hypervisor overhead is still mostly hidden in the mists of marketing and benchmark uncertainty.  However, what always raises eyebrows is when I come to Solaris Zones (called Containers in Solaris 10) as an alternative to hypervisor virtualization.  Since Zones are, greatly simplified, nothing more than a group of Unix processes contained by a set of rules which are enforced by the Solaris kernel, it is quite evident that there can't be much overhead involved.  Nevertheless, since many people think in hypervisor terms, there is almost always some doubt about this claim of zero overhead.  And as much as I find the explanation with technical details compelling, I also understand that seeing is so much better than believing.  So - look and see:

The Oracle benchmark teams are so convinced of the advantages of Solaris Zones that they actually use them in the configurations for public benchmarking.  Solaris resource management will also work in a non-Zones environment, but Zones make it just so much easier to handle, especially with some of the more complex benchmark configurations.  There are numerous benchmark publications available using Solaris Containers, dating back to the days of the T5440.  Some recent examples, all of them world records, are:

The use of Solaris Zones is documented in all of these benchmark publications.

The benchmarking team also published a blog entry detailing how they make use of resource management with Solaris Zones to actually increase application performance.  That almost asks for calling this "negative overhead", if the term weren't somewhat misleading.

So, if you ever need to substantiate why Solaris Zones have no virtualization overhead, point to these (and probably some more) published benchmarks.

Monday Mar 19, 2012

Setting up a local AI server - easy with Solaris 11

Many things are new in Solaris 11, Autoinstall is one of them.  If, like me, you've known Jumpstart for the last 2 centuries or so, you'll have to start from scratch.  Well, almost, as the concepts are similar, and it's not all that difficult.  Just new.

I wanted to have an AI server that I could use for demo purposes, on the train if need be.  That answers the question of hardware requirements: portable.  But let's start at the beginning.

First, you need an OS image, of course.  In the new world of Solaris 11, it is now called a repository.  The original can be downloaded from the Solaris 11 page at Oracle.   What you want is the "Oracle Solaris 11 11/11 Repository Image", which comes in two parts that can be combined using cat.  MD5 checksums for these (and all other downloads from that page) are available closer to the top of the page.

With that, building the repository is quick and simple:

# zfs create -o mountpoint=/export/repo rpool/ai/repo
# zfs create rpool/ai/repo/sol11
# mount -o ro -F hsfs /tmp/sol-11-1111-repo-full.iso /mnt
# rsync -aP /mnt/repo /export/repo/sol11
# umount /mnt
# pkgrepo rebuild -s /export/repo/sol11/repo
# zfs snapshot rpool/ai/repo/sol11@fcs
# pkgrepo info -s  /export/repo/sol11/repo
PUBLISHER PACKAGES STATUS           UPDATED
solaris   4292     online           2012-03-12T20:47:15.378639Z
That's all there is to it.  Note the snapshot, just to be on the safe side.  You never know when one will come in handy.  To use this repository, you could just add it as a file-based publisher:
# pkg set-publisher -g file:///export/repo/sol11/repo solaris
Since I want to access this repository over a (virtual) network, I'll now quickly activate the repository service:
# svccfg -s application/pkg/server \
setprop pkg/inst_root=/export/repo/sol11/repo
# svccfg -s application/pkg/server setprop pkg/readonly=true
# svcadm refresh application/pkg/server
# svcadm enable application/pkg/server

That's all you need - now point your browser to http://localhost/ to view your beautiful repository-server. Step 1 is done.  All of this, by the way, is nicely documented in the README file that's contained in the repository image.

Of course, we already have updates to the original release.  You can find them in MOS in the Oracle Solaris 11 Support Repository Updates (SRU) Index.  You can simply add these to your existing repository or create separate repositories for each SRU.  The individual SRUs are self-sufficient and incremental - SRU4 includes all updates from SRU2 and SRU3.  With ZFS, you can also get both: A full repository with all updates and at the same time incremental ones up to each of the updates:

# mount -o ro -F hsfs /tmp/sol-11-1111-sru4-05-incr-repo.iso /mnt
# pkgrecv -s /mnt/repo -d /export/repo/sol11/repo '*'
# umount /mnt
# pkgrepo rebuild -s /export/repo/sol11/repo
# zfs snapshot rpool/ai/repo/sol11@sru4
# zfs set snapdir=visible rpool/ai/repo/sol11
# svcadm restart svc:/application/pkg/server:default
The normal repository is now updated to SRU4.  Thanks to the ZFS snapshots, there is also a valid repository of Solaris 11 11/11 without the update located at /export/repo/sol11/.zfs/snapshot/fcs . If you like, you can also create another repository service for each update, running on a separate port.

But now let's continue with the AI server.  Just a little bit of reading in the documentation makes it clear that we will need to run a DHCP server for this.  Since I already have one active (for my SunRay installation) and since it's a good idea to have these kinds of services separate anyway, I decided to create this in a Zone.  So, let's create one first:

# zfs create -o mountpoint=/export/install rpool/ai/install
# zfs create -o mountpoint=/zones rpool/zones
# zonecfg -z ai-server
zonecfg:ai-server> create
create: Using system default template 'SYSdefault'
zonecfg:ai-server> set zonepath=/zones/ai-server
zonecfg:ai-server> add dataset
zonecfg:ai-server:dataset> set name=rpool/ai/install
zonecfg:ai-server:dataset> set alias=install
zonecfg:ai-server:dataset> end
zonecfg:ai-server> commit
zonecfg:ai-server> exit
# zoneadm -z ai-server install
# zoneadm -z ai-server boot ; zlogin -C ai-server
Give it a hostname and IP address at first boot, and there's the Zone.  As its publisher for Solaris packages, it will be bound to the system publisher of the global zone.  The /export/install filesystem, of course, is intended to be used by the AI server.  Let's configure it now:
# zlogin ai-server
root@ai-server:~# pkg install install/installadm
root@ai-server:~# installadm create-service -n x86-fcs -a i386 \
-s pkg://solaris/install-image/solaris-auto-install@5.11,5.11- \
-d /export/install/fcs -i -c 3

With that, the core AI server is already done.  What happened here?  First, I installed the AI server software.  IPS makes that nice and easy.  If necessary, it'll also pull in the required DHCP-Server and anything else that might be missing.  Watch out for that DHCP server software.  In Solaris 11, there are two different versions.  There's the one you might know from Solaris 10 and earlier, and then there's a new one from ISC.  The latter is the one we need for AI.  The SMF service names of both are very similar.  The "old" one is "svc:/network/dhcp-server:default". The ISC-server comes with several SMF-services. We at least need "svc:/network/dhcp/server:ipv4". 

The command "installadm create-service" creates the installation service.  It's called "x86-fcs", serves the "i386" architecture and gets its boot image from the repository of the system publisher, using version 5.11,5.11-, which is Solaris 11 11/11.  (The option "-a i386" in this example is optional, since the install server itself runs on an x86 machine.)  The boot environment for clients is created in /export/install/fcs, and the DHCP server is configured for 3 IP addresses starting at the address given with "-i".  This configuration is stored in a very human-readable form in /etc/inet/dhcpd4.conf.  An AI service for SPARC systems could be created in the very same way, using "-a sparc" as the architecture option.

Now we would be ready to register and install the first client.  It would be installed with the default "solaris-large-server" using the publisher "" and would query its configuration interactively at first boot.  This makes it very clear that an AI server is really only a boot server.  The true source of the packages to install can be a different server.  Since I don't like these defaults for my demo setup, I did some extra config work for my clients.

The configuration of a client is controlled by manifests and profiles.  The manifest controls which packages are installed and how the filesystems are laid out.  In that, it's very much like the old "rules.ok" file in Jumpstart.  Profiles contain additional configuration like root passwords, primary user account, IP addresses, keyboard layout etc.  Hence, profiles are very similar to the old sysidcfg file.

The easiest way to get your hands on a manifest is to ask the AI server we just created to give us its default one.  Then modify that to our liking and give it back to the install server to use:

root@ai-server:~# mkdir -p /export/install/configs/manifests
root@ai-server:~# cd /export/install/configs/manifests
root@ai-server:~# installadm export -n x86-fcs -m orig_default \
-o orig_default.xml
root@ai-server:~# cp orig_default.xml s11-fcs.small.local.xml
root@ai-server:~# vi s11-fcs.small.local.xml
root@ai-server:~# more s11-fcs.small.local.xml
<!DOCTYPE auto_install SYSTEM "file:///usr/share/install/ai.dtd.1">
  <ai_instance name="S11 Small fcs local">
        <zpool name="rpool" is_root="true">
          <filesystem name="export" mountpoint="/export"/>
          <filesystem name="export/home"/>
          <be name="solaris"/>
    <software type="IPS">
          <!-- Specify locales to install -->
          <facet set="false">facet.locale.*</facet>
          <facet set="true"></facet>
          <facet set="true">facet.locale.de_DE</facet>
          <facet set="true">facet.locale.en</facet>
          <facet set="true">facet.locale.en_US</facet>
        <publisher name="solaris">
          <origin name=""/>
        <!--
        By default the latest build available, in the specified IPS
        repository, is installed.  If another build is required, the
        build number has to be appended to the 'entire' package in the
        following form: ...
        -->

      <software_data action="install">

root@ai-server:~# installadm create-manifest -n x86-fcs -d \
-f ./s11-fcs.small.local.xml 
root@ai-server:~# installadm list -m -n x86-fcs
Manifest             Status    Criteria 
--------             ------    -------- 
S11 Small fcs local  Default   None
orig_default         Inactive  None

The major points in this new manifest are:

  • Install "solaris-small-server"
  • Install a few locales less than the default.  I'm not that fluent in French or Japanese...
  • Use my own package service as publisher, running on IP address
  • Install the initial release of Solaris 11:  pkg:/entire@0.5.11,5.11-
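To make the last point concrete, the pinned 'entire' package would appear in the software_data section of the manifest roughly like this - the exact version string and package list here are assumptions, so check your repository for the correct FMRI:

```
<software_data action="install">
  <!-- pin the installation to a specific build by versioning
       the 'entire' incorporation (version string is an assumption) -->
  <name>pkg:/entire@0.5.11,5.11-0.175.0</name>
  <name>pkg:/group/system/solaris-small-server</name>
</software_data>
```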

Using a similar approach, I'll create a default profile interactively and use it as a template for a few customized building blocks, each defining a part of the overall system configuration.  The modular approach makes it easy to configure numerous clients later on:

root@ai-server:~# mkdir -p /export/install/configs/profiles
root@ai-server:~# cd /export/install/configs/profiles
root@ai-server:~# sysconfig create-profile -o default.xml
root@ai-server:~# cp default.xml general.xml; cp default.xml mars.xml
root@ai-server:~# cp default.xml user.xml
root@ai-server:~# vi general.xml mars.xml user.xml
root@ai-server:~# more general.xml mars.xml user.xml
<!DOCTYPE service_bundle SYSTEM "/usr/share/lib/xml/dtd/service_bundle.dtd.1">
<service_bundle type="profile" name="sysconfig">
  <service version="1" type="service" name="system/timezone">
    <instance enabled="true" name="default">
      <property_group type="application" name="timezone">
        <propval type="astring" name="localtime" value="Europe/Berlin"/>
  <service version="1" type="service" name="system/environment">
    <instance enabled="true" name="init">
      <property_group type="application" name="environment">
        <propval type="astring" name="LANG" value="C"/>
  <service version="1" type="service" name="system/keymap">
    <instance enabled="true" name="default">
      <property_group type="system" name="keymap">
        <propval type="astring" name="layout" value="US-English"/>
  <service version="1" type="service" name="system/console-login">
    <instance enabled="true" name="default">
      <property_group type="application" name="ttymon">
        <propval type="astring" name="terminal_type" value="vt100"/>
  <service version="1" type="service" name="network/physical">
    <instance enabled="true" name="default">
      <property_group type="application" name="netcfg">
        <propval type="astring" name="active_ncp" value="DefaultFixed"/>
  <service version="1" type="service" name="system/name-service/switch">
    <property_group type="application" name="config">
      <propval type="astring" name="default" value="files"/>
      <propval type="astring" name="host" value="files dns"/>
      <propval type="astring" name="printer" value="user files"/>
    <instance enabled="true" name="default"/>
  <service version="1" type="service" name="system/name-service/cache">
    <instance enabled="true" name="default"/>
  <service version="1" type="service" name="network/dns/client">
    <property_group type="application" name="config">
      <property type="net_address" name="nameserver">
          <value_node value=""/>
    <instance enabled="true" name="default"/>
<!DOCTYPE service_bundle SYSTEM "/usr/share/lib/xml/dtd/service_bundle.dtd.1">
<service_bundle type="profile" name="sysconfig">
  <service version="1" type="service" name="network/install">
    <instance enabled="true" name="default">
      <property_group type="application" name="install_ipv4_interface">
        <propval type="astring" name="address_type" value="static"/>
        <propval type="net_address_v4" name="static_address" 
        <propval type="astring" name="name" value="net0/v4"/>
        <propval type="net_address_v4" name="default_route" 
      <property_group type="application" name="install_ipv6_interface">
        <propval type="astring" name="stateful" value="yes"/>
        <propval type="astring" name="stateless" value="yes"/>
        <propval type="astring" name="address_type" value="addrconf"/>
        <propval type="astring" name="name" value="net0/v6"/>
  <service version="1" type="service" name="system/identity">
    <instance enabled="true" name="node">
      <property_group type="application" name="config">
        <propval type="astring" name="nodename" value="mars"/>
<!DOCTYPE service_bundle SYSTEM "/usr/share/lib/xml/dtd/service_bundle.dtd.1">
<service_bundle type="profile" name="sysconfig">
  <service version="1" type="service" name="system/config-user">
    <instance enabled="true" name="default">
      <property_group type="application" name="root_account">
        <propval type="astring" name="login" value="root"/>
        <propval type="astring" name="password" 
        <propval type="astring" name="type" value="role"/>
      <property_group type="application" name="user_account">
        <propval type="astring" name="login" value="stefan"/>
        <propval type="astring" name="password" 
        <propval type="astring" name="type" value="normal"/>
        <propval type="astring" name="description" value="Stefan Hinker"/>
        <propval type="count" name="uid" value="12345"/>
        <propval type="count" name="gid" value="10"/>
        <propval type="astring" name="shell" value="/usr/bin/bash"/>
        <propval type="astring" name="roles" value="root"/>
        <propval type="astring" name="profiles" value="System Administrator"/>
        <propval type="astring" name="sudoers" value="ALL=(ALL) ALL"/>
root@ai-server:~# installadm create-profile -n x86-fcs -f general.xml
root@ai-server:~# installadm create-profile -n x86-fcs -f user.xml
root@ai-server:~# installadm create-profile -n x86-fcs -f mars.xml \
-c ipv4=
root@ai-server:~# installadm list -p

Service Name  Profile     
------------  -------     
x86-fcs       general.xml

root@ai-server:~# installadm list -n x86-fcs -p

Profile      Criteria 
-------      -------- 
general.xml  None
mars.xml     ipv4 =
user.xml     None

Here's the idea behind these files:

  • "general.xml" contains settings valid for all my clients.  Stuff like DNS servers, for example, which in my case will always be the same.
  • "user.xml" only contains user definitions.  That is, a root password and a primary user.
    Both of these profiles will be valid for all clients (for now).
  • "mars.xml" defines network settings for an individual client.  This profile is associated with an IP address.  For this to work, I'll have to tweak the DHCP settings in the next step:
root@ai-server:~# installadm create-client -e 08:00:27:AA:3D:B1 -n x86-fcs
root@ai-server:~# vi /etc/inet/dhcpd4.conf
root@ai-server:~# tail -5 /etc/inet/dhcpd4.conf
host 080027AA3DB1 {
  hardware ethernet 08:00:27:AA:3D:B1;
  filename "01080027AA3DB1";
}

This completes the client preparations.  I manually added the IP address for mars to /etc/inet/dhcpd4.conf; this is needed for the "mars.xml" profile.  Disabling arbitrary DHCP replies will quiet this DHCP server, making my life in a shared environment a lot more peaceful ;-)
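As an aside, the boot file name in that DHCP entry is derived mechanically from the client's MAC address: drop the colons and prefix the ARP hardware type "01".  A quick shell sketch of that mapping:

```shell
#!/bin/sh
# Derive the PXE boot file name from a MAC address:
# strip the colons, then prefix the ARP hardware type "01".
mac="08:00:27:AA:3D:B1"
bootfile="01$(echo "$mac" | tr -d ':')"
echo "$bootfile"    # 01080027AA3DB1
```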

Note: The above example shows the configuration for x86 clients.  SPARC clients have a slightly different entry in the DHCP config file, again with some manual tweaking to give my client a fixed IP address:

subnet netmask {
  option broadcast-address;
  option routers;
}

class "SPARC" {
  match if not (substring(option vendor-class-identifier, 0, 9) = "PXEClient");
  filename "";
}

host sparcy {
   hardware ethernet 00:14:4f:fb:52:3c ;
   fixed-address ;
}
Now, I of course want this installation to be completely hands-off.  For this to work, I'll need to modify the GRUB boot menu for this client slightly.  You can find it in /etc/netboot.  "installadm create-client" will create a new boot menu for every client, identified by the client's MAC address.  The template for this can be found in a subdirectory with the name of the install service, /etc/netboot/x86-fcs in our case.  If you don't want to change this manually for every client, modify that template to your liking instead.
root@ai-server:~# cd /etc/netboot
root@ai-server:~# cp menu.lst.01080027AA3DB1
root@ai-server:~# vi menu.lst.01080027AA3DB1
root@ai-server:~# diff menu.lst.01080027AA3DB1
< default=1
< timeout=10
> default=0
> timeout=30
root@ai-server:~# more menu.lst.01080027AA3DB1

title Oracle Solaris 11 11/11 Text Installer and command line
	kernel$ /x86-fcs/platform/i86pc/kernel/$ISADIR/unix -B install_media=htt
	module$ /x86-fcs/platform/i86pc/$ISADIR/boot_archive

title Oracle Solaris 11 11/11 Automated Install
	kernel$ /x86-fcs/platform/i86pc/kernel/$ISADIR/unix -B install=true,inst
	module$ /x86-fcs/platform/i86pc/$ISADIR/boot_archive

Now just boot the client off the network using PXE boot.  For my demo purposes, that's a VirtualBox client, of course.  Again, if this were a SPARC system, you'd instead type "boot net:dhcp - install" at the ok prompt and then just watch the installation.

That's all there is to it.  And despite the fact that this blog entry is a little longer - that wasn't so hard now, was it?

Mittwoch Feb 29, 2012

Solaris Fingerprint Database - How it's done in Solaris 11

Many remember the Solaris Fingerprint Database.  It was a great tool to verify the integrity of a Solaris binary.  Unfortunately, it went away with the rest of SunSolve and was not revived in its replacement, "My Oracle Support".  Here's the good news:  It's back for Solaris 11, and it's better than ever!

It is now totally integrated with IPS... [Read More]

Montag Feb 20, 2012

Solaris 11 submitted for EAL4+ certification

Solaris 11 has been submitted for certification under the Canadian Common Criteria Scheme at level EAL4+.  It will be certified against the protection profile "Operating System Protection Profile (OS PP)" as well as the extensions

  • Advanced Management (AM)
  • Extended Identification and Authentication (EIA)
  • Labeled Security (LS)
  • Virtualization (VIRT)

EAL4+ is the highest level typically achievable for commercial software, and is the highest level mutually recognized by 26 countries, including Germany and the USA.  Completion of the certification lies in the hands of the certification authority.

You can check the current status of this certification (as well as other certified Oracle software) on the page Oracle Security Evaluations.

Freitag Okt 07, 2011

Solaris 11 Launch

There have been many questions and rumors about the upcoming launch of Solaris 11.  Now it's out:  Watch the webcast on

November 9, 2011
at 10am ET

Be invited to join!

(I hope to get around to summarizing all the OpenWorld announcements, especially around T4, soon...)

Montag Aug 01, 2011

Oracle Solaris Studio 12.3 Beta Program

The beta program for Oracle Solaris Studio 12.3 is now open for participation.  Anyone willing to test the newest compiler and developer tools is welcome to join.  You can expect performance improvements over earlier versions of Studio as well as over GCC, which should make testing worth your while.

Happy testing!

Mittwoch Jun 08, 2011

Erasing disks securely

Actually, both the question and the answer are old and well known.  However, these things tend to be forgotten and pop up as questions from time to time.  Hence a little reminder for all of us:

Solaris makes it easy to erase a disk so that the data can't be restored, even with sophisticated methods.  The "analyze/purge" subcommand of format(1M) does it all for you.  It will overwrite the selected area of your disk (usually s2) a total of four times with different patterns to achieve this.  Of course, depending on the size of the disk, this might take a while.  But it's secure enough to comply with the Department of Defense (DoD) disk wiping standard 5220.22-M.  Note however that as of June 28, 2007, overwriting in general is no longer accepted as a method to securely erase data.  Here is a link to the relevant DSS publication.
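From memory, such a session looks roughly like this - the prompts are paraphrased, so treat this as a sketch rather than a verbatim transcript:

```
# format
(select the disk from the menu)
format> analyze
analyze> purge
(format warns that all data in the selected area will be destroyed,
 then makes the four overwrite passes)
```

Remember to select the right disk first - there is no undo here.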

Some more details are here:

Note that this method does not apply to SSDs of all kind!  And of course, to avoid any risk of losing your data with your disk, simply encrypt it!  It's quite easy using ZFS or Oracle TDE :-)

Update 2015-05-29:

  • The link to the original DoD standard doesn't work anymore and has been replaced by a link to Wikipedia.
  • Here's an additional link to a more recent NIST publication.
  • Note that with modern drives, destroying data with OS or application level tools will not satisfy higher security requirements.  The sector management of these drives might make defective sectors with sensitive data unavailable to such tools - but not to more intrusive methods of active data recovery.  If you want to protect against those, physical destruction is your only reliable option.

Update 2015-09-29:

This is my final comment on this matter:

  • If you are worried about the data on storage devices you no longer use, physical destruction of those devices is the only truly secure option.
  • Encrypt your data right from the start to avoid this issue.  Encryption is easily and in many cases freely available.  If you don't care enough about your data to encrypt it, you are unlikely to worry about data on decommissioned storage devices.
  • If you are worried enough not to trust encryption, no erasing technique will be good enough to satisfy your requirements.  And the cost of physically destroying those devices will not matter to you.

Donnerstag Mrz 31, 2011

What's up with Solaris 11?

Interested in the upcoming Solaris 11?  What will be the highlights?  What exactly is the new packaging format, how does the new installer work?  What do the analysts think?

All this will be covered in the Solaris Online Forum on April 14, starting at 9 am PST.  This will be a live event where you can ask questions. (A recording will be available afterwards.)  Speakers are all high level members of development and product management.

All further details can be found at the registration page.

Mittwoch Nov 24, 2010

Encrypting Your Filesystem with ZFS and AES128

ZFS filesystem encryption is finally available in Solaris 11 Express.  This closes a gap in Solaris that hurt all those that carried their data around with them.  But of course there are many good reasons to encrypt data living on disks well secured in a datacenter.  After all, they will all leave the datacenter in one way or another eventually...

Enough introduction, here's how simple this is:

  1. You will need to upgrade the zpool intended to host the encrypted filesystem to version 30.  Issue a simple "zpool upgrade <poolname>".  Of course, you can skip this step on a newly installed Solaris 11 Express.
  2. Now create a new filesystem, with encryption enabled: zfs create -o encryption=on <poolname/newfs>
    The command will interactively prompt for a passphrase, which will be used to generate the key for this filesystem.  You're done!  Note that you cannot encrypt an already existing filesystem.  Of course there are several more options for how and where to store the key.  Just have a look at the manpage :-)

Likewise, you also have a choice of three different key lengths for AES, the algorithm used for encryption.  The default used for "encryption=on" is AES-128 in CCM mode, but you can also choose the longer 192- or 256-bit keys.  While developing ZFS crypto, there was some discussion about which default key length to choose.  AES-128 was chosen for two reasons:  First, the 128-bit variant is faster than the longer key lengths, especially without hardware acceleration such as that available in the SPARC T2/T3 and Intel 5600 chips.  Second, there is new research, including successful attacks on AES-256 and AES-192, that requires a search of only 2^39.  These attacks don't work against AES-128, which is therefore, as of today, not only faster but also more secure than the variants with longer keys.
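For illustration, here's roughly what the default and an explicit 256-bit key length look like on the command line - the pool and filesystem names are made up for this example:

```
# zfs create -o encryption=on rpool/export/secret
Enter passphrase for 'rpool/export/secret':
Enter again:
# zfs create -o encryption=aes-256-ccm rpool/export/topsecret
# zfs get encryption rpool/export/secret rpool/export/topsecret
```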

More details about ZFS Crypto in the ZFS Admin Guide.

Dienstag Sep 14, 2010

How auto_reg really works in Solaris 10 09/10

The newest update of Solaris 10 (09/10) brings a new feature called auto registration.  This can be automated using the new "auto_reg" option in the sysidcfg file.  Or rather, it can be sometimes.  Due to an already known bug, this new parameter is ignored by the GUI installer, which will query you for the registration details no matter what you put in the sysidcfg file.  The GUI installer usually runs if you have a screen (and video card) attached to the system.  On headless servers, the text installer runs, which correctly acts on the "auto_reg" settings.

As a workaround for workstations, use "boot net - install nowin" instead of the usual "boot net - install", and you're all set.
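For completeness, here's a sysidcfg sketch with the new keyword - only the auto_reg line matters here, the other entries are just typical context, and you should confirm the accepted values in sysidcfg(4):

```
# excerpt of a sysidcfg file (sketch)
auto_reg=disable
system_locale=C
terminal=vt100
timezone=Europe/Berlin
```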

Many thanks to Peter Tribble, who suffered through this and eventually found the solution for me.


The views expressed on this blog are my own and do not necessarily reflect the views of Oracle.