Migratory FrankenZones

Sometimes I wonder too much. This is one of those times.

Solaris 10 11/06 introduced the ability to migrate a Solaris non-global zone from one computer to another. Use of this feature is supported for two computers which have the same CPU type and substantially similar package sets and patch levels. (Note that non-global zones are also called simply 'zones' or, more officially, 'Solaris Containers.')

But I wondered... what would happen if you migrated a zone from a SPARC system to an x86/x64 system (or vice versa)? Would it work?

Theoretically, it depends on how hardware-dependent Solaris is. With a few minor exceptions, Solaris has only one source-code base, compiled for each hardware architecture on which it runs. The exceptions are things like device drivers and other components which operate hardware directly. But none of those components are part of a zone... (eerie foreshadowing music plays in the background)

Of course, programs compiled for one architecture won't run on another one. If a zone contains a binary program and you move the zone to a computer with a different CPU type, that program will not run. I wondered: do zones include binary programs?

The answer to that question is "it depends." A sparse-root zone, which is the default type, does not include binary programs except for a few in /etc/fs, /etc/lp/alerts and /etc/security/lib which are no longer used and didn't belong there in the first place. In fact, when a zone is not running, it is just a bunch of configuration files of these types:

  • ASCII
  • directory
  • symbolic link
  • empty file
  • FIFO
  • executable shell script
All of those file types are portable from one Solaris system to another, regardless of CPU type. So, theoretically, it might be possible to move a sparse-root zone from any Solaris 10 system to any other, without regard to CPU type.
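One way to check that claim on a given zone is to scan its zonepath for files that begin with the ELF magic number. The following is only a minimal sketch, not something from the original experiment: the function names `is_elf` and `list_elf` are made up for illustration, and it assumes a POSIX-style shell (ksh or bash, rather than the legacy Solaris /bin/sh).

```shell
# is_elf FILE: succeed if FILE starts with the 4-byte ELF magic
# number (0x7f followed by the letters E, L, F).
is_elf() {
    magic=$(dd if="$1" bs=4 count=1 2>/dev/null)
    [ "$magic" = "$(printf '\177ELF')" ]
}

# list_elf DIR: print each regular file under DIR that looks like a
# compiled (ELF) binary. Names containing newlines are not handled.
list_elf() {
    find "$1" -type f -print | while IFS= read -r f; do
        if is_elf "$f"; then
            printf '%s\n' "$f"
        fi
    done
}

# Example (zonepath taken from this article):
#   list_elf /zones/roots/bennu
```

If the output is empty, or lists only the few leftover binaries mentioned earlier, the zone's own files should not care which CPU type they land on.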

In addition, when a zone is booted, a few loopback mounts (see lofs(7FS)) are created from the global zone into the non-global zone. They include directories like /usr and /sbin - the directories that actually contain the Solaris programs. Those loopback mounts make all of the operating system's programs available to a zone's processes when the zone is running.
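To see those loopback mounts for a running zone, you could filter the global zone's mount table for lofs entries under the zone's root. This is a rough sketch under a few assumptions: the helper name `lofs_mounts` is hypothetical, the input is in /etc/mnttab format (special, mount point, fstype, options, time), and the awk in use accepts `-v` (on Solaris 10 that would be nawk or /usr/xpg4/bin/awk).

```shell
# lofs_mounts MNTTAB ZONEROOT: print "source -> mountpoint" for each
# lofs entry in a mnttab-format file whose mount point is under ZONEROOT.
lofs_mounts() {
    awk -v root="$2" '
        $3 == "lofs" && index($2, root "/") == 1 { print $1 " -> " $2 }
    ' "$1"
}

# Example (zone root path assumed from this article's zonepath):
#   lofs_mounts /etc/mnttab /zones/roots/bennu/root
```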

Although the information about the mount points moves with a migrating (sparse-root) zone, the contents of those mount points don't move with the zone... (there's that music again)

On the other hand, a whole-root zone contains its own copy of almost all files in a Solaris instance, including all of the binary programs. Because of that, a whole-root zone cannot be moved to a system with a different CPU type.

To test the migration of a sparse-root zone across CPU types, I created a zone on a SPARC system and used the steps shown in my "How to Move a Container" guide to move it to an x86 system. Note that in step 2 of the section "Move the Container", pax(1) is used to ensure that there are no endian issues.

The original zone had this configuration on an Ultra 30 workstation:

sparc-global# zonecfg -z bennu
zonecfg:bennu> create
zonecfg:bennu> set zonepath=/zones/roots/bennu
zonecfg:bennu> add net
zonecfg:bennu:net> set physical=hme0
zonecfg:bennu:net> set address=192.168.0.31
zonecfg:bennu:net> end
zonecfg:bennu> exit
exit

When configuring the new zone, you must specify any hardware differences. In my case, the NIC on the original system was hme0. On the destination system (a Toshiba Tecra M2) it was e1000g0. I chose to keep the same IP address for simplicity. After moving the archive file to the Tecra and unpacking it into /zones/roots/phoenix, it was time to configure the new zone. The zonecfg session for the new zone looked like this:

x86-global# zonecfg -z phoenix
zonecfg:phoenix> create -a /zones/roots/phoenix
zonecfg:phoenix> select net physical=hme0
zonecfg:phoenix:net> set physical=e1000g0
zonecfg:phoenix:net> end
zonecfg:phoenix> exit
exit

Because the new configuration names the destination system's NIC, zoneadm applies the appropriate network settings when the zone boots.
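The same configuration change could be scripted rather than typed interactively. Here is a small sketch that only generates the zonecfg subcommands; the function name `zonecfg_commands` and the file /tmp/phoenix.cfg are hypothetical, and I have not verified this exact command file on a live system.

```shell
# zonecfg_commands ZONEPATH OLD_NIC NEW_NIC: print zonecfg subcommands
# that recreate a zone from its detached files and rename its NIC.
zonecfg_commands() {
    cat <<EOF
create -a $1
select net physical=$2
set physical=$3
end
commit
EOF
}

# Example, feeding the result to zonecfg's -f (command file) option:
#   zonecfg_commands /zones/roots/phoenix hme0 e1000g0 > /tmp/phoenix.cfg
#   zonecfg -z phoenix -f /tmp/phoenix.cfg
```

Keeping the NIC names as parameters makes the hardware difference between the two systems the only thing that changes from one migration to the next.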

The zoneadm(1M) command is used to attach a zone's detached files to their new computer as a new zone. When used to attach a zone, the zoneadm(1M) command compares the zone's package and patch information - generated when detaching the zone - to the package and patch information of the new host for the zone. Unfortunately (for this situation) patch numbers are different for SPARC and x86 systems. As you might guess, attaching a zone which was first created on a SPARC system, to an x86 system, caused zoneadm to emit numerous complaints, including:

These packages installed on the source system are inconsistent with this system:
(SPARC-specific packages)
...
These packages installed on this system were not installed on the source system:
(x86-specific packages)
...
These patches installed on the source system are inconsistent with this system:
        118367-04: not installed
(other SPARC-specific patches)
...
These patches installed on this system were not installed on the source system:
        118668-10
(other x86-specific patches)
...
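The comparison behind those messages is, conceptually, a set difference between two lists. The sketch below illustrates the idea with sort and comm; it is not how zoneadm is actually implemented, and the function name `compare_lists` and the one-ID-per-line input files are assumptions made for the example.

```shell
# compare_lists SOURCE TARGET: given two files with one package or
# patch ID per line, report the IDs that appear on only one side.
compare_lists() {
    src_sorted=$(mktemp) || return 1
    tgt_sorted=$(mktemp) || return 1
    sort -u "$1" > "$src_sorted"
    sort -u "$2" > "$tgt_sorted"
    # comm -23: lines only in the first file; comm -13: only in the second.
    comm -23 "$src_sorted" "$tgt_sorted" | sed 's/^/source only: /'
    comm -13 "$src_sorted" "$tgt_sorted" | sed 's/^/target only: /'
    rm -f "$src_sorted" "$tgt_sorted"
}

# Example (hypothetical files holding patch IDs from each system):
#   compare_lists sparc-patches.txt x86-patches.txt
```

Because SPARC and x86 patches are numbered differently, nearly every patch ends up on one of the two "only" lists when the systems differ in CPU type.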

If zoneadm detects sufficient differences in packages and patches, it does not attach the zone. Fortunately, for situations like this, when you know what you are doing (or pretend that you do...) and are willing to create a possibly unsupported configuration, the 'attach' sub-command to zoneadm has its own -F flag. The use of that flag tells zoneadm to attach the zone even if there are package and/or patch inconsistencies.

After forcing the attachment, the zone boots correctly. It uses programs in the loopback-mounted file systems /usr, /lib and /sbin. Other applications could be loopback-mounted into /opt as long as that loopback mount is modified, if necessary, when the zone is attached to the new system.
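Putting the pieces together, the overall sequence might look roughly like the following. This is a dry-run sketch that only prints commands: the function name `migration_plan`, the archive path, and the exact pax options are my assumptions for illustration and may differ from the steps in the guide.

```shell
# migration_plan SRC_ZONE DST_ZONE PARENT_DIR ARCHIVE: print (do not run)
# the commands for a detach / archive / forced-attach migration.
migration_plan() {
    src=$1 dst=$2 parent=$3 archive=$4
    cat <<EOF
# on the source system
zoneadm -z $src halt
zoneadm -z $src detach
(cd $parent && pax -w -f $archive $src)
# copy $archive to the destination system, then on the destination:
(cd $parent && pax -r -p e -f $archive && mv $src $dst)
# configure $dst with zonecfg, adjusting the NIC, then:
zoneadm -z $dst attach -F
zoneadm -z $dst boot
EOF
}

# Example:
#   migration_plan bennu phoenix /zones/roots /tmp/bennu.pax
```

Printing the plan first gives you a chance to review it, which seems prudent given that the forced attach skips the package and patch checks.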

Conclusion

This little experiment has shown that it is possible to move a Solaris zone from a computer with one CPU type to a Solaris computer that has a different CPU type. I have not shown that it is wise to do so, nor that an arbitrary application will run correctly.

My goals were:

  • depict the light weight of zones, especially showing that an unbooted sparse-root zone is merely a collection of hardware-independent configuration files
  • show how easy it is to create and migrate a zone
  • expand the boundaries of knowledge about zones
  • explore the possibility of designing an infrastructure that is not based on the assumption that a workload must be limited to one CPU architecture. Without that limitation, an application could "float" from system to system as needed, regardless of CPU type.

Comments:

Nicely stated, Jeff. You can move NG-zones between architectures, but why would you if your application won't run?
...
What about services that are part of the Solaris OS, such as named (DNS server) or samba (SMB server) or apache (do I need to tell anyone here what apache serves? OK, web pages)? I can see the utility of being able to get services back up and running on available hardware (even if it is a different CPU architecture) as long as the application is supported (and installed) there.
...
What about bigger services? I installed the Sun Java Application Server 9 Update 1 for the first time the other day (in a NG-zone) and it appears to run entirely within Java (a ps -ef does not show any binaries out of the install directory running, just two java processes: one for the admin server and one for domain1).
...
Just thinking aloud and food for thought...

Posted by Paul Kraus on June 03, 2007 at 11:43 AM EDT #

About

Jeff Victor writes this blog to help you understand Oracle's Solaris and virtualization technologies.

The views expressed on this blog are my own and do not necessarily reflect the views of Oracle.
