Friday May 23, 2014

Overview of Solaris Zones Security Models

Over the years of explaining the security model of Solaris Zones and LDOMs to customers' "security people", I've encountered two basic schools of thought.  The first is "shared kernel bad"; the second is "shared kernel good".

Which camp is right?  Both are, because each model has its advantages.

If you have a shared kernel, then the policy engine has more information about what is going on and can make more informed access and data flow decisions.  However, if an exploit happens at the kernel level, it has the potential to impact multiple (or all) guests.

If you have separate kernels, then a kernel-level exploit should only impact that single guest, unless it then results in a VM breakout.

Solaris non-global zones fall into the "shared kernel" style.  Solaris Zones are included in the Solaris 11 Common Criteria Evaluation for the virtualisation extension (VIRT) to the OSPP.  Solaris Zones are also the foundation of our multi-level Trusted Extensions feature and are used for separation of classified data by many government/military deployments around the world.

LDOMs are "separate kernel", but LDOMs are also unlike hypervisors in the x86 world because we can shut down the underlying control domain OS (assuming the guests have another path to the IO they need, either on their own or from root-domains or io-domains).  So LDOMs can be deployed in a way that makes them more protected from a VM breakout being used to cause a reconfiguration of resources.  The Solaris 11 CC evaluation still applies to Solaris instances running in an LDOM regardless of whether they are the control-domain, io-domain, root-domain or guest-domain.

Solaris 11.2 introduces a new brand of zone called "Kernel Zones".  They look like native Zones, you configure them like native Zones, and they actually are Solaris zones, but with the ability to run a separate kernel.  Having their own kernel means that they support suspend/resume independent of the host OS, which gives us warm migration.  In particular, "Kernel Zones" are not like popular x86 hypervisors such as VirtualBox, Xen or VMware.

General purpose hypervisors, like VirtualBox, support multiple different guest operating systems so they have to virtualise the hardware. Some hypervisors (most type 2 but even some type 1) support multiple different host operating systems for providing the services as well. So this means the guest can only assume virtualised hardware.

Solaris Kernel Zones are different: the zone's kernel knows it is running on Solaris as the "host", and the Solaris global zone "host" also knows that the kernel zone is Solaris.  This means we get to make more informed decisions about resources, general requirements and security policy.  Even with this, we can still host Solaris 11.2 kernel zones on the internal development release of Solaris 12 and vice versa.

Note that what follows is an outline of implementation details that are subject to change at any time: The kernel of a Solaris Kernel Zone is represented as a userland process in a Solaris non-global zone.  That non-global zone is configured with less privilege than a normal non-global zone would have, and it is always configured as an immutable zone.  So if there happened to be an exploit of the guest kernel that resulted in a VM breakout, you would end up in an immutable non-global zone with lowered privilege.

This means that we can have the advantages of both the "shared kernel" and "separate kernel" security models with Solaris Kernel Zones, and we keep the management simplicity of traditional Solaris Zones (# zonecfg -z mykz 'create -t SYSsolaris-kz' && zoneadm -z mykz install && zoneadm -z mykz boot).

If you want even more layers of protection on SPARC it is possible to host Kernel Zones inside a guest LDOM.

Tuesday Apr 29, 2014

Using /etc/system.d rather than /etc/system to package your Solaris kernel config

The request for an easy way to package Solaris kernel configuration (/etc/system basically) came up both via the Solaris Customer Advisory Board meetings and via requests from customers with early access to Solaris 11.2 through the Platinum Customer Program.  I also had another fix for the Solaris Cryptographic Framework that I needed to implement to stop cryptoadm(1M) from writing to /etc/system (some of the background to why that is needed is in my recent blog post about FIPS 140-2).

So /etc/system.d was born.  My initial plan for the implementation was to read the "fragment" files directly from the kernel.  However, that is very complex to do at the time we need to read them, since it happens (in kernel boot time scales) eons before we have the root file system mounted.  We can, however, read from a well known file name that is in the boot archive.

The way I ended up implementing this is that during boot archive creation (either from manually running 'bootadm update-archive', or as a result of BE or packaging operations, or just a system reboot) we assemble the content of /etc/system.d into a single well known file, /etc/system.d/.self-assembly (which is considered a Private interface).  We read the files in /etc/system.d/ in C locale collation order and ignore all files that start with a "." character.  This ensures that the assembly is predictable and consistent across all systems.

I then had to choose whether /etc/system.d or /etc/system "wins" if a variable happens to get set in both.  The decision was that /etc/system is read second and thus wins; this preserves existing behaviours.
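The "read second, so it wins" rule can be sketched in a few lines of Python.  This is only an illustration of the precedence, not the kernel's parser: the helper names are hypothetical, only simple 'set name=value' lines are handled, and the tunable names are made up.  The real parser also reports duplicates rather than silently overriding them.

```python
def parse_settings(text):
    """Parse simplified 'set name=value' lines (hypothetical helper).

    Lines starting with '*' are treated as comments. Anything that is
    not a 'set' line is ignored in this sketch.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("*"):
            continue
        if line.startswith("set "):
            name, _, value = line[4:].partition("=")
            settings[name.strip()] = value.strip()
    return settings


def effective_settings(assembled_text, etc_system_text):
    """Apply the assembled fragments first, then /etc/system on top."""
    merged = parse_settings(assembled_text)
    # /etc/system is read second, so its values replace earlier ones.
    merged.update(parse_settings(etc_system_text))
    return merged


if __name__ == "__main__":
    fragments = "* from a fragment\nset example:limit=10\nset example:other=1\n"
    main_file = "set example:limit=20\n"
    print(effective_settings(fragments, main_file))
```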

I also enhanced the diagnostic output produced when the system file parser detects duplication, so that it indicates which file caused the issue.  When bootadm creates the .self-assembly file it includes START/END comment markers so that you will be able to easily determine which file from /etc/system.d delivered a given setting.
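To make the assembly step concrete, here is a minimal Python sketch of the idea: skip dot files, sort the remaining names byte-wise (which is what C locale collation amounts to), and wrap each fragment in START/END comment lines.  It is not the bootadm implementation; the function names are hypothetical and the exact marker text is an assumption, since the real format is a Private detail.

```python
import os
from pathlib import Path


def assemble(fragments):
    """Join fragment bodies (a name -> text mapping) into one text."""
    # Ignore dot files, then sort names by their raw bytes, which
    # matches C locale collation (plain byte-wise comparison).
    names = sorted(
        (name for name in fragments if not name.startswith(".")),
        key=os.fsencode,
    )
    parts = []
    for name in names:
        body = fragments[name]
        if body and not body.endswith("\n"):
            body += "\n"
        # Marker wording is illustrative only; '*' starts a comment line.
        parts.append(f"* START {name}\n{body}* END {name}\n")
    return "".join(parts)


def read_fragments(directory):
    """Read every regular file in a directory into a name -> text mapping."""
    return {
        path.name: path.read_text()
        for path in Path(directory).iterdir()
        if path.is_file()
    }
```

Something like assemble(read_fragments("/etc/system.d")) would then give a preview of the ordering on a given system.  Because upper case sorts before lower case in the C locale, a file named "A" is assembled before one named "a".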

So now you can much more easily deliver any Solaris kernel customisations you need by using IPS to deliver fragments (one line or many) into /etc/system.d/ instead of attempting to modify /etc/system via first boot SMF services or other scripting.  This also means they apply on the first boot of the image after install.

So how do I pick which file name in /etc/system.d/ to use so that it doesn't clash with other people's?  The recommendation (which will be documented in the man pages and in /etc/system itself) is to use the full name of the IPS package (with '/' replaced by ':') as the prefix or name of any files you deliver to /etc/system.d/.
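As a small illustration of that naming rule, the following Python helper turns a package name into a fragment file name.  The helper and the package name used here are hypothetical, and it assumes the name is given without any pkg:/ scheme or publisher prefix.

```python
def fragment_name(package_name):
    """Map an IPS package name to a fragment file name ('/' becomes ':')."""
    # Assumes a plain package name with no scheme or publisher prefix.
    return package_name.strip("/").replace("/", ":")


if __name__ == "__main__":
    # 'site/config/kernel-tuning' is a made-up package name.
    print("/etc/system.d/" + fragment_name("site/config/kernel-tuning"))
```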

As part of the same change I updated cryptoadm(1M) and dtrace(1M) to no longer write to /etc/system but instead write to files in /etc/system.d/ and I followed my own advice on file naming!

Information on how to get the Solaris 11.2 Beta is available from this OTN page.

Note that this particular change came in after the Solaris 11.2 Beta build was closed so you won't see this in Solaris 11.2 Beta (which is build 37).



