Containers on NFS?

I always thought I needed a good story as a reason to start writing a blog, so I hadn't joined the many friends who have been blogging for quite some time. Now I've found a good reason. By the way, for those of you who like short blog entries: I definitely don't plan to have entries this long too often either ...

I guess an introduction is in order for my first Sun blog. My name is Joost Pronk van Hoogeveen (yes, I had to write my full surname on every sheet of every exam I ever took), although I normally only use Joost Pronk to make it easier for myself and others. I work in the Solaris group on the resource/workload management tools (Solaris Resource Management, Solaris Zones, Solaris Containers, Solaris Container Manager, ...). I expect that's probably what I'll blog about most... Oh, and I'm Dutch, so I might throw some of that in too.

Anyway, a few weeks ago I was at a Sun customer conference discussing the fact that we currently don't support running Solaris Containers on NFS. More precisely, we don't support a Solaris Zone with its root filesystem on an NFS-mounted directory.

Solaris Containers

Before I go further I should discuss the Solaris Container versus Solaris Zones nomenclature. In short: Solaris Containers = Solaris Zones + Solaris Resource Management.

The longer version: Solaris Containers is the group of technologies that allows administrators to create separate environments for their applications in Solaris. Solaris Zones is the technology that lets the admin create separate logical namespace environments, and Solaris Resource Management (RM) lets them assign amounts of system resources to their applications. To create a complete "Container" you'd want to use both technologies and not just one. Solaris Zones is really new and cool and gets most of the limelight, but a good container has the RM stuff configured too. When talking about it I generally default to the "bigger" Solaris Container so as not to lock out the use of RM, but sometimes you need to zoom in to be precise, and then I'll talk about an individual piece. That piece is often a zone, because that's "the new thing" people want info about.

Anyway back to the story ...

So, for a whole bunch of reasons, running a zone's root filesystem on an NFS-mounted directory doesn't work today. Most of them have to do with the fact that the global zone (which makes the initial NFS mount) has a different IP address than the non-global zone (which then wants to use this mount), and credentials, and the possible use of Kerberos, and different domain names, and ...

What if the non-global zone doesn't see any of this because it's behind some other mechanism?

At breakfast a customer (Thomas Nau from the University of Ulm) and I were talking about this when he said: "Why don't we use lofi?" So we did it right then and there with two laptops, and it works (big grin).

So here's how we did it

First let me say this is a workaround hack. We didn't do anything illegal, and all the interfaces we used are regular Solaris interfaces, but it comes dangerously close to the "don't do this at home, folks" category. So think twice before you use it in a production environment, but it's definitely fun to do.

In short, what we do is: mount an NFS filesystem; create a file; lofiadm this file; newfs the new lofi block device; mount the device on the zonepath location; define a zone with the correct zonepath; install and boot the zone. (Sounds straightforward, right?)
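The steps above can be sketched as one dry-run script. This is only an outline under assumptions: the run and setup_zone_on_nfs helpers and the variable names are made up for illustration, the one-line zonecfg form is assumed to be equivalent to an interactive session, and the net resource is left out because the address is not given in this entry. By default it only prints the commands.

```shell
#!/bin/sh
# Dry-run sketch of the seven steps. Prints the commands by default;
# set DRY_RUN=0 to really run them (as root, in the global zone).
# run() and setup_zone_on_nfs() are hypothetical helper names.
DRY_RUN=${DRY_RUN:-1}
NFS_SRC=${NFS_SRC:-nfsserv:/export/zones}
NFS_MNT=${NFS_MNT:-/zonemount}
IMG=${IMG:-$NFS_MNT/zone1_file}
LOFI_DEV=${LOFI_DEV:-/dev/lofi/1}
ZONE=${ZONE:-zone1}
ZONEPATH=${ZONEPATH:-/zones/$ZONE}

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

setup_zone_on_nfs() {
    run mkdir -p "$NFS_MNT" &&
    run mount "$NFS_SRC" "$NFS_MNT" &&       # 1. mount the NFS filesystem
    run mkfile 300M "$IMG" &&                # 2. create the backing file
    run lofiadm -a "$IMG" "$LOFI_DEV" &&     # 3. attach it as a lofi block device
    run newfs "$LOFI_DEV" &&                 # 4. build a filesystem (asks y/n)
    run mkdir -p "$ZONEPATH" &&
    run mount "$LOFI_DEV" "$ZONEPATH" &&     # 5. mount it on the zonepath
    run chmod 700 "$ZONEPATH" &&
    # 6. define the zone; assumed one-line form, no net resource configured
    run zonecfg -z "$ZONE" "create; set zonepath=$ZONEPATH" &&
    run zoneadm -z "$ZONE" install &&        # 7. install and boot
    run zoneadm -z "$ZONE" boot
}

setup_zone_on_nfs
```

Reading the printed plan before setting DRY_RUN=0 is the point of the design: every step touches something that is awkward to undo by hand.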

Before we start

Make sure you have an NFS share that you can mount:

root@nfsserv # share
-               /export   rw   ""
root@nfsserv # pwd

Mount the filesystem and create a file

This means create a mount point (/zonemount), mount the filesystem on this, create the file:

root@zoneserver # mkdir /zonemount

root@zoneserver # mount nfsserv:/export/zones /zonemount
root@zoneserver # df -k
Filesystem            kbytes    used   avail capacity  Mounted on
/dev/dsk/c0t0d0s0    8242397 5397348 2762626    67%    /
/devices                   0       0       0     0%    /devices
ctfs                       0       0       0     0%    /system/contract
proc                       0       0       0     0%    /proc
mnttab                     0       0       0     0%    /etc/mnttab
swap                  443904     400  443504     1%    /etc/svc/volatile
objfs                      0       0       0     0%    /system/object
fd                         0       0       0     0%    /dev/fd
swap                  443560      56  443504     1%    /var/run
swap                  446320    2816  443504     1%    /tmp
nfsserv:/export/zones 17088954 3549610 13368455    21%    /zonemount

root@zoneserver # cd /zonemount
root@zoneserver # mkfile 300M zone1_file
root@zoneserver # ls -l
total 614720
-rw-------   1 root     root     314572800 Jun  1  2005 zone1_file

lofiadm a new device and newfs this device

Now that we have a 300MB file, we can use lofiadm(1M) to create a new block device in /dev/lofi. In this case we choose to specify which device (/dev/lofi/1) we want. Once we've done that, we use newfs(1M) to put a filesystem on our device/file:

root@zoneserver # lofiadm -a /zonemount/zone1_file /dev/lofi/1
root@zoneserver # newfs /dev/lofi/1
newfs: construct a new file system /dev/rlofi/1: (y/n)? y
/dev/rlofi/1:   614400 sectors in 1024 cylinders of 1 tracks, 600 sectors
        300.0MB in 64 cyl groups (16 c/g, 4.69MB/g, 2240 i/g)
super-block backups (for fsck -F ufs -o b=#) at:
 32, 9632, 19232, 28832, 38432, 48032, 57632, 67232, 76832, 86432,
 518432, 528032, 537632, 547232, 556832, 566432, 576032, 585632, 595232,

Create a mount point and mount the new filesystem on it

Now that we have the device, we need to mount it where we plan to have our zonepaths. So first we mkdir(1) a new mount point, /zones/zone1, and then mount the new filesystem on top of it:

root@zoneserver # mkdir /zones/zone1
root@zoneserver # mount /dev/lofi/1 /zones/zone1
root@zoneserver # chmod 700 /zones/zone1

Note: Because we created the filesystem ourselves instead of zoneadm doing it for us, we need to use chmod(1) to make the directory accessible only to the global root user.
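As a quick cross-check of the sizes in the ls and newfs listings above, the arithmetic can be redone in the shell. The 512-byte sector size is an assumption (it is the usual one); the cylinder, track and sector counts are taken from the newfs output.

```shell
# 300 MB expressed in bytes and in 512-byte sectors (sector size assumed).
bytes=$((300 * 1024 * 1024))
sectors=$((bytes / 512))
# Geometry reported by newfs: 1024 cylinders, 1 track, 600 sectors per track.
geometry=$((1024 * 1 * 600))
echo "bytes=$bytes sectors=$sectors geometry=$geometry"
# prints: bytes=314572800 sectors=614400 geometry=614400
```

Both figures agree with the listings: ls reports 314572800 bytes and newfs reports 614400 sectors.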

Configure and Install the zone

So now we'll quickly create a fairly standard non-global zone and tell it to install:

root@zoneserver # zonecfg -z zone1
zone1: No such zone configured
Use 'create' to begin configuring a new zone.
zonecfg:zone1> create
zonecfg:zone1> set zonepath=/zones/zone1
zonecfg:zone1> add net
zonecfg:zone1:net> set physical=hme0
zonecfg:zone1:net> set address=
zonecfg:zone1:net> end
zonecfg:zone1> verify
zonecfg:zone1> exit

root@zoneserver # zoneadm -z zone1 install
Preparing to install zone <zone1>.
Creating list of files to copy from the global zone.
Copying <2568> files to the zone.
Initializing zone product registry.
Determining zone package initialization order.
Preparing to initialize <946> packages on the zone.
Initialized <946> packages on zone.                                
Zone <zone1> is initialized.
The file  contains a log of the zone installation.
root@zoneserver # zoneadm list -cv
  ID NAME             STATUS         PATH                          
   0 global           running        /                             
   - zone1            installed      /zones/zone1                  

root@zoneserver # df -k
Filesystem            kbytes    used   avail capacity  Mounted on
/dev/dsk/c0t0d0s0    8242397 5397356 2762618    67%    /
/devices                   0       0       0     0%    /devices
ctfs                       0       0       0     0%    /system/contract
proc                       0       0       0     0%    /proc
mnttab                     0       0       0     0%    /etc/mnttab
swap                  442520     400  442120     1%    /etc/svc/volatile
objfs                      0       0       0     0%    /system/object
fd                         0       0       0     0%    /dev/fd
swap                  442176      56  442120     1%    /var/run
swap                  445136    3016  442120     1%    /tmp
nfsserv:/export/zones   17088954 3549875 13368190    21%    /zonemount
/dev/lofi/1           288239   64799  194617    25%    /zones/zone1

root@zoneserver # zoneadm -z zone1 boot
root@zoneserver # zoneadm list -cv
  ID NAME             STATUS         PATH                          
   0 global           running        /                             
   1 zone1            running        /zones/zone1                  

Voila! It works

Note: the df(1M) output above now shows both mounts, and the zone install used only 64799 KB of the 288239 KB available on our lofi device-file-thing.
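A small worked example of those df figures: 64799 KB out of 288239 KB is about 22%, yet df prints 25%. My reading, which is an assumption rather than something stated in the output, is that df computes capacity as used / (used + avail), and that the missing space is the reserve UFS keeps back (minfree), which here comes to almost exactly 10% of the filesystem.

```shell
# Figures from the df -k line for /dev/lofi/1, all in KB.
total=288239
used=64799
avail=194617

# Space that df counts as neither used nor available.
reserved=$((total - used - avail))
# Share of the whole filesystem in use (integer division, rounds down).
pct_of_total=$((used * 100 / total))
# used / (used + avail), rounded to the nearest percent.
capacity=$(( (used * 100 + (used + avail) / 2) / (used + avail) ))

echo "reserved=${reserved}KB used/total=${pct_of_total}% used/(used+avail)=${capacity}%"
# prints: reserved=28823KB used/total=22% used/(used+avail)=25%
```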

Why this works

So why does this work? Well, all of the NFS credential stuff that would normally break booting a non-global zone from NFS is now hidden from the zone. All of the NFS-related work is done by the global zone with its own credentials; the NFS server never sees any of the local zone's files, nor does it need to know which IP addresses the non-global zone is using. On the other side, the non-global zone never really knows its files are not on the local system: all its files are encapsulated in the filesystem living in the lofi block device. That this block device is actually a file on an NFS mount is not visible to the zone, nor does it care.

So can I use it?

Well, that's up to you. First, as you can see above, this is a fairly laborious process, and the more steps there are, the more opportunities to mess up. Of course you can automate most of this, but the debuggability and measurability for someone who hasn't seen your setup still aren't very high.
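One piece that seems worth automating is re-attaching everything after a reboot, since lofiadm(1M) notes that file-to-device associations do not persist across reboots. The sketch below is an assumption-laden outline, not a tested procedure: the helper names are invented, it assumes the NFS filesystem is already mounted on /zonemount, and it only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run sketch: re-attach or release the lofi-backed zone.
# run(), reattach_zone() and release_zone() are hypothetical helper names.
DRY_RUN=${DRY_RUN:-1}
IMG=${IMG:-/zonemount/zone1_file}
LOFI_DEV=${LOFI_DEV:-/dev/lofi/1}
ZONE=${ZONE:-zone1}
ZONEPATH=${ZONEPATH:-/zones/$ZONE}

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# After a reboot: map the file again, mount it, boot the zone.
reattach_zone() {
    run lofiadm -a "$IMG" "$LOFI_DEV" &&
    run mount "$LOFI_DEV" "$ZONEPATH" &&
    run zoneadm -z "$ZONE" boot
}

# Reverse order: halt the zone, unmount, remove the lofi mapping.
release_zone() {
    run zoneadm -z "$ZONE" halt &&
    run umount "$ZONEPATH" &&
    run lofiadm -d "$LOFI_DEV"
}

case "${1:-}" in
    reattach) reattach_zone ;;
    release)  release_zone ;;
    *)        echo "usage: reattach | release" ;;
esac
```

Keeping the two directions in one file makes it harder for the order of operations to drift apart, which is where a setup like this tends to go wrong.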

And then there is the version compliance. Having the zone root on NFS practically begs you to use it to migrate zones from one system to another, but be aware that every zone has its package and patch list/revision embedded in it. So when you move zones around, they should only move to a system with an equal patch level, or you could get very strange behavior. What's going on there is worth a whole series of blogs, and this entry is getting pretty long as it is...
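Before moving a zone, one rough way to check for an "equal patch level" is to dump the patch list on both systems and compare the two files. The sketch below only does the comparison; compare_lists is a made-up helper, and producing the lists with something like showrev -p on each host is my assumption about a reasonable input, not a procedure from this entry.

```shell
#!/bin/sh
# compare_lists A B: print lines that occur in only one of the two files.
# Lines only in A are prefixed "<", lines only in B are prefixed ">".
# Returns 0 when the lists match and 1 when they differ.
# (Hypothetical helper; the input could be e.g. "showrev -p" output
# saved on each host.)
compare_lists() {
    a_sorted=$(mktemp) || return 2
    b_sorted=$(mktemp) || return 2
    sort -u "$1" > "$a_sorted"
    sort -u "$2" > "$b_sorted"
    diffs=$(
        comm -23 "$a_sorted" "$b_sorted" | sed 's/^/< /'
        comm -13 "$a_sorted" "$b_sorted" | sed 's/^/> /'
    )
    rm -f "$a_sorted" "$b_sorted"
    [ -z "$diffs" ] && return 0
    printf '%s\n' "$diffs"
    return 1
}
```

An empty result only says the two lists are identical; it does not prove a move is safe.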

Anyway, have fun.


I like your NFS solution. I am going to test it for my own purposes. Another limitation with zones is related to /dev/ip. The restricted permissions will not allow in.rarpd to run. I sent out a question on the Sun forums website. The basic questions are: 1. Is it possible to create Network Server zones? 2. What is the workaround that will enable me to utilize in.rarpd on the various zones that are to be created? Have you found or heard of a workaround for creating boot servers in zones? Thanks, Robert

Posted by Robert Richardson on November 17, 2005 at 02:52 AM PST #

Awesome! I was just trying Zones on NFS and discovered that zone creation now fails when using an NFS mount...

This solution looks good, however moving the NFS zone to a new machine will be a challenge

Anyone know why Zones on NFS is not allowed in the first place?

Posted by dan on April 13, 2008 at 04:04 AM PDT #

I'm trying to figure out an inexpensive (hence USB external drive) way to create a "logical SAN", I guess some way to simulate dual-pathed drives in a "cheap" drive..... Of course, one cannot assume that all SANs are limited to ethernet and/or to fiber....that would be a mockery of technology.....there might be a way to create logical drives in a drive-space, such logical drives have gotta be dual pathed.....I am able to create a makefile, create a virtual device entry for it and losf mount it. I'd like to have a logical drive that is multipathed.....some vmware tool may well do this, but, it may not be free....

perhaps virtualization, zoning and containers will be well served by the functionality of a multipathed virtual drive.

Posted by Kartik Vashishta on January 05, 2009 at 09:24 AM PST #

First, thank you for your blog entry! I have this configured in my lab and the installation works perfectly.

I'd like to take the config a step further and "export" the lofi device to a second host for a cheap-n-easy DR solution, but I can't figure out how to share the device between two hosts.

I've tried detaching the lofi device and re-attaching on the second host, but when I try I get the error "not this file system type". I've tried just about everything I can think of, but I've hit a wall.

I appreciate any help you can offer.

Posted by ken bessler on September 03, 2009 at 07:08 AM PDT #
