
If you run two or three solaris branded (aka native) zones on a Solaris host, it will likely just work without you having to think about memory, CPU count, or disk space. However, if you plan on running, say, 10, 50, 100, or more native zones on a single Solaris host, you should really know what to expect with regard to the resource consumption of a single native zone, and thus how to set up the host system to run many of them.
This article provides basic information on suggested minimums for the resources a Solaris host needs to run native zones. Note that as every set-up is unique, nothing is set in stone and the admin will likely need to adjust these recommendations to the actual environment.
As any software nowadays should run under some kind of virtualization, we will assume the host is a Kernel Zone and will make relevant suggestions based on that. You can of course run your native zones on a bare metal system if you prefer. In that case, the flexibility around adding and removing host resources on the fly is limited in contrast to a Kernel Zone, or a Logical Domain on SPARC.
TL;DR
If you want to run native zones on a Solaris host, we recommend the following minimums:
- Reserve at least 8GB of memory for the host itself (kernel, ZFS, userland). We assume swap is set to its installation default of 1GB or 2GB.
- For each native zone expected to run, add 300MB of memory on top of the 8GB assigned to the host. Increase that estimate according to what the zone will actually be used for.
- Use at least 4 CPUs for the host system; that is also the default for a Kernel Zone installation. Add more as needed. See the notes below on time-outs on boot. If you plan on using the pool properties or the dedicated-cpu resources with your zones, your set-up will of course be more complex.
- Set config/concurrent-boot-shutdown for the svc:/system/zones:default service to the number of CPUs the host runs with, to limit the number of zones allowed to boot up in parallel. E.g., if your host has 8 CPUs (that would be virtual-cpu/ncpus=8 for a Kernel Zone configuration), we recommend doing the following inside the host system:
svccfg -s system/zones:default setprop config/concurrent-boot-shutdown=8
svccfg -s system/zones:default setprop config/concurrent-suspend-resume=8
svcadm refresh system/zones:default
In general, if you do not limit zone boot parallelism and there are not enough resources to boot all zones at once, various services inside the zones may end up in the maintenance state due to time-outs. It is difficult to provide more precise information as every environment is different, but setting the concurrent boot/shutdown limit to the CPU count is conservative enough and seems to work in general. Alternatively, add more CPUs to the host. To learn about actual service time-outs inside the native zones, check the zone logs under /var/log/zones on the host system.
- In default zone installs, we recommend giving each zone at least 2GB of disk space, and more if you expect the host to be upgraded, as each zone will be upgraded into its own cloned zone boot environment.
- Later, when updating the host via pkg, using the update -C0 option may not work, as every pkg process instance running in parallel on behalf of each zone uses a significant amount of memory, which also depends on how many system versions your IPS publisher contains. If you want a parallel zone upgrade, start with e.g. update -C4 and experiment with greater numbers (see the example below). If unsure, do not use the -C option, in which case pkg update will upgrade the zones one by one.
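As a rough illustration only (the concurrency value of 4 is just the suggested starting point; tune it to your memory and repository size), such a parallel update could be started with:
root# pkg update -C4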
See the Monitoring section below on how to monitor per-zone resource consumption.
Using these recommendations, as an example, to run 50 native zones on a single Kernel Zone host, use 8GB plus 50 * 300MB (15GB), that is 23GB of memory in total. Add 100GB (50 * 2GB) of storage on top of what you plan for the host itself.
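If you prefer to script this back-of-the-envelope sizing, a tiny sketch of the same arithmetic (N being the planned number of zones) could be:
N=50
echo "memory: $((8 + (N * 300 + 1023) / 1024)) GB"
echo "extra storage: $((N * 2)) GB"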
Memory
Solaris can run on as little as 2GB of memory, but what you can do with such a system is then limited. For example, you would probably run out of memory when updating such a machine, as the pkg graph resolver uses information on all system SRU releases in the package repository and that may need a significant amount of memory.
You need at least 5GB of memory to install Solaris but you can later change that, even on the fly for Kernel Zones. We know that customers run Solaris on 3GB Kernel Zones without any issues, but they limit the number of Solaris SRUs present in their local repositories so that pkg update can still be executed.
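As a sketch of such an on-the-fly change (the zone name mykz and the 23g value are illustrative, and live memory reconfiguration must be supported by your release), the memory of a running Kernel Zone can be raised from the bare-metal host like this:
bare-metal# zonecfg -z mykz -r 'select capped-memory; set physical=23g; end; commit'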
With native zones, we recommend running Solaris on at least 8GB of memory. A default native zone installation with the small server package may use about 250-300MB of memory once it is fully booted, which is why we recommend 8GB plus an extra 300MB for each native zone running. If you install minimal zones (see below), count with at least 200MB of memory per zone.
CPU resources
CPU-wise, the most stressful action is likely booting up all the zones when the host boots. While a native zone is OS level virtualization (i.e. a container) and shares the kernel with its host, each zone runs a full userland environment, including SMF and the sstore daemon. For that reason, we recommend limiting the parallelism on boot (and resume) to the number of CPUs available, as shown in the TL;DR section.
Experiment with your environment. You may find that you can boot up 80 zones fully in parallel on just an 8 CPU host without any issues, but that also depends, for example, on how fast your storage is. If your zone service instances randomly end up in the maintenance state on host boot/reboot, and the zone log /var/log/zones/zonename.messages shows time-out error messages (or warnings), it is an indication that you either need more CPU resources or need to limit the zone boot parallelism.
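A quick way to look for those messages on the host, with tzone1 as an illustrative zone name, could be:
root# grep -iE 'time-?out' /var/log/zones/tzone1.messages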
To identify zone service instances in maintenance, run svcs -xv on the host. Individual zone instances are called svc:/system/zones/zone:zonename. For example, the following output shows zone tzone1 in a good state:
# svcs svc:/system/zones/zone:tzone1
STATE          STIME                FMRI
online         2023-01-03T07:11:14  svc:/system/zones/zone:tzone1
What actually drives the state of the zone service instance is the state of the svc:/milestone/goals:default service inside the zone. By default, it only depends on svc:/milestone/multi-user-server:default. See Zones Delegated Restarter and SMF Goals to learn more.
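To inspect what the goals service of a particular zone depends on, one option (tzone1 again being an illustrative zone name) is:
root# zlogin tzone1 svcs -d svc:/milestone/goals:default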
With the zone boot parallelism limited to 8 on our 8 vCPU SPARC S7 Kernel Zone with 32GB of memory (8 + 0.3 * 80), it took about 13 minutes to get all 80 zones to the online state. Without limiting the parallelism, it was slightly faster and took about 11 minutes. If we added 4 more vCPUs to make it 12 in total, all zones were online within 8 minutes with full parallelism, even on our 1Gbit iSCSI network. However, we do recommend being more conservative and setting concurrent-boot-shutdown, see the TL;DR section.
After the host boots up with all the zones, you could even have 80 zones sitting idle, sharing 1 CPU, but that is just an academic example:
bare-metal# zonecfg -z mykz -r 'select virtual-cpu; set ncpus=1; end; commit'
Checking: Modifying virtual-cpu ncpus=1
Applying the changes
bare-metal# zlogin mykz-host
mykz-host# time { seq -f "%02.0f" 1 80 | while read i; do zlogin sb$i uptime </dev/null; done; }
...
real 2m2.871s
user 0m1.637s
sys 0m2.103s
Note that such an environment would likely be using its 1 CPU to the fullest at all times, as the 80 sstore daemons periodically collecting statistics need some CPU resources even if the systems are otherwise idle.
Storage
Zone dataset usage
Aside from the kernel, drivers, and some other packages missing from a native zone IPS image, a native zone installation is a full Solaris system. The default installation uses the group/system/solaris-small-server package. As of SRU51, such a native zone installation takes about 1.1GB, and it does not share its storage with other zones or the host (as was possible on Solaris 10). Therefore, we recommend at least 2GB per zone. To install more zones, plan for a multiple of one zone's storage needs. Many customers use the group/system/solaris-minimal-server package instead, and then only add what is essential to them. To install with a non-default group package, you need to provide the package manifest to the zoneadm install command via the -m manifest option. See the You can go minimal section below to learn more.
When you upgrade the host, each zone must be upgraded as well since, by design, any native zone must run the exact same system software version as its host. Like the host, each zone is upgraded into a newly cloned zone boot environment (ZBE). The space needed for the upgrade depends on the actual delta between the two Solaris versions because internally ZFS is used, with its snapshots and clones.
For example, the delta when upgrading from SRU44 to SRU45 was about 700MB. Upgrading further to SRU51 then added another 1GB (a greater delta across multiple SRUs).
root@114sru44:~# zfs list -o name,used -s used rpool/VARSHARE/zones/tzone2
NAME                         USED
rpool/VARSHARE/zones/tzone2  1.12G
root@114sru45:~# zfs list -o name,used -s used rpool/VARSHARE/zones/tzone2
NAME                         USED
rpool/VARSHARE/zones/tzone2  1.82G
root@114sru51:~# zfs list -o name,used -s used rpool/VARSHARE/zones/tzone2
NAME                         USED
rpool/VARSHARE/zones/tzone2  2.94G
To save storage, you can also set the host rpool to use compression and/or deduplication.
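For example (a sketch only; the properties are inherited by child datasets and only affect newly written data, and deduplication pays off only if you have enough memory for the dedup table):
root# zfs set compression=on rpool
root# zfs set dedup=on rpool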
It is also good practice to periodically remove old SRU BEs (via beadm destroy be-name). With the BE removal, all ZBEs linked to that BE will also be removed.
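For instance (the BE name below is purely illustrative; beadm list shows the boot environments actually present, and beadm destroy asks for confirmation before removing one):
root# beadm list
root# beadm destroy old-sru-be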
With the zpool property autoexpand set, you can increase a Kernel Zone's storage on the fly. That works both for shared storage and for the default local ZFS volume storage. With that, you can start with a small KZ host volume and only expand it when needed.
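A rough sketch of that, assuming the Kernel Zone is backed by the default local ZFS volume (the dataset path and the new size are illustrative; check the zone's configured storage, e.g. via zonecfg -z mykz info device, for the real path):
mykz-host# zpool set autoexpand=on rpool
bare-metal# zfs set volsize=200g rpool/VARSHARE/zones/mykz/disk0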
On storage speed
Obviously, the faster the better, especially with more zones running. However, with iSCSI storage in our lab as an example, booting up our 8 vCPU SPARC S7 Kernel Zone with 80 zones on a 10Gb and on a 1Gb iSCSI SAN did not show any noticeable difference in boot times.
If you hit any time-out issues when booting up a large number of native zones, storage performance should definitely be checked as well.
You can go minimal
If you start with a minimal zone installation, as many of our customers do, as of SRU51 it consumes about 800MB of storage and 200MB of memory upon boot. With a large number of zones running, that can save a significant amount of resources.
To install a minimal native zone, you can do the following:
root# cp /usr/share/auto_install/manifest/zone_default.xml .
root# gsed -i -e 's/small-server/minimal-server/' zone_default.xml
root# zonecfg -z minimal create
root# zoneadm -z minimal install -m zone_default.xml
Monitoring resource consumption
You can use zonestat to monitor resource use per zone. In the default view, you get CPU, memory, and network statistics. You need to run it with an interval specification, e.g. zonestat 2. See also the -n and -r options. To see all you can monitor, check out the output of zonestat -r all 1 1.
An example of zonestat output on a SPARC Kernel Zone with several minimal-server group package zones installed and already running for a few days:
root# zonestat 1 1
Collecting data for first interval...
Interval: 1, Duration: 0:00:01
SUMMARY Cpus/Online: 8/8 PhysMem: 34.0G VirtMem: 35.0G
----------CPU---------- --PhysMem-- --VirtMem-- --PhysNet--
ZONE USED %PART STLN %STLN USED %USED USED %USED PBYTE %PUSE
[total] 0.05 0.69% 0.00 0.00% 11.9G 35.2% 12.4G 35.6% 0 0.00%
[system] 0.00 0.05% 0.00 0.00% 8528M 24.4% 8855M 24.7% - -
global 0.03 0.42% - - 1569M 4.50% 1616M 4.51% 0 0.00%
sb01 0.00 0.01% - - 218M 0.62% 232M 0.64% 0 0.00%
sb02 0.00 0.02% - - 213M 0.61% 231M 0.64% 0 0.00%
sb03 0.00 0.02% - - 218M 0.62% 230M 0.64% 0 0.00%
sb04 0.00 0.02% - - 221M 0.63% 231M 0.64% 0 0.00%
sb05 0.00 0.02% - - 219M 0.63% 231M 0.64% 0 0.00%
sb06 0.00 0.02% - - 220M 0.63% 233M 0.65% 0 0.00%
sb07 0.00 0.02% - - 219M 0.63% 232M 0.64% 0 0.00%
sb08 0.00 0.02% - - 217M 0.62% 231M 0.64% 0 0.00%
sb09 0.00 0.02% - - 217M 0.62% 227M 0.63% 0 0.00%
sb10 0.00 0.01% - - 217M 0.62% 225M 0.62% 0 0.00%
Conclusion
This blog post contains very basic information on what should be considered when you plan on running a large number of native zones on a single Solaris host. While we provide concrete numbers to start with, please keep in mind that you may need to adjust those to fit your specific environment.
