Thursday Aug 24, 2006

Setting a filesystem quota on a zone



For the zone and/or ZFS aficionado this is probably old news. For those new to Solaris, it's pretty straightforward. While zones leverage Solaris resource controls, controlling disk space in an administratively efficient manner was challenging ... until recently.

ZFS sets quotas at the filesystem level of granularity. Sssooo, why not give each zone its own ZFS filesystem? Creating and managing a ZFS filesystem is easy. With this approach, we also get the added benefit of managing each zone's ZFS properties independently, such as enabling compression (or potentially encryption in the future) on a zone-by-zone basis.

Assuming a zfs pool named 'zonepool':

## First, the ZFS setup
# zfs create zonepool/zone1
## I'm a stickler about my zone mountpoints :)
# zfs set mountpoint=/zones/zone1 zonepool/zone1
# zfs set quota=5G zonepool/zone1
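## Optionally, set per-zone ZFS properties, such as compression
# zfs set compression=on zonepool/zone1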

## And now for the zone setup
# zonecfg -z zone1
Use 'create' to begin configuring a new zone.
zonecfg:zone1> create
zonecfg:zone1> set zonepath=/zones/zone1
....

I haven't tried applying this to cloned zones yet. My bet is that if a zone is on a cloned filesystem, the 5 gigs of disk will be above and beyond the disk space "used" by a cloned dataset, not inclusive of it. I did do a test a while back on a zone with both a quota and compression enabled, and the quota is applied to the post-compressed data, not the pre-compressed data. Makes sense since that is the true on-disk utilization.

Note, using ZFS delegated administration with zonecfg's "add dataset" command, users' home directories, if managed locally in the zone, can also have per-user ZFS attributes applied (compression, quotas, etc.). However, I suspect NFS mounts of ZFS-backed home directories will be of more benefit in most scenarios.
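For reference, delegating a dataset to a zone takes only a few commands. A sketch, assuming a zonepool/zone1/home dataset (the dataset name is my own):

## Create the dataset and delegate it to the zone
# zfs create zonepool/zone1/home
# zonecfg -z zone1
zonecfg:zone1> add dataset
zonecfg:zone1:dataset> set name=zonepool/zone1/home
zonecfg:zone1:dataset> end
zonecfg:zone1> commit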

Of course, after writing 95% of this blog entry, I ran across this howto while looking for a link. Wouldn't be the first time. What the heck, I'll post this entry anyway :)


Thursday Aug 17, 2006

Debugging zones using dtrace



I got an email from a Sun customer having problems with zones asking for ideas as to why the host was performing so slowly with a boatload of zones. That customer did notice svc.configd in each zone consuming quite a bit of CPU time and disabled as many services as he felt safe in disabling, yet the problem persisted.

The bad news is that I missed the email and it lay idle in my rather large inbox for far too long before responding (apologies Mr. Customer). The good news is that the customer figured out the problem leveraging dtrace. 

I must say that I've enjoyed cramming wwwaayyy too many zones in wwaayyy too small a space. It was a fun thing for my feeble mind to do and I learned quite a bit along the way. When, from a zones perspective, the Ultra 10 was nearing the Twilight Zone, I wanted to figure out what the bottleneck was. I turned to DTrace. What's even better is that this customer leveraged the same approach to help debug his problem.

Due to a (logged) "headless system" bug in Nevada, the Xserver was trying to continually restart. Multiply that by N number of zones and you can probably see the performance impact. Think over 100,000 times a minute - on a low-end system. Unfortunately, the issue was not being captured by prstat, much like Bryan's well-known stock ticker experience. DTrace can make problems such as this apparent quickly.

Update: More accurately, the cde-login service, as opposed to the Xserver, was trying to continually restart.
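For the curious, here is the flavor of one-liner that makes this kind of restart storm jump out (a sketch; the exact script the customer used is his own):

## Count successful exec()s by zone and command; Ctrl-C prints the aggregation
# dtrace -n 'proc:::exec-success { @[zonename, curpsinfo->pr_psargs] = count(); }'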

This customer experience brings up an unfortunate pattern (sysadmin anti-pattern?). This is not the first customer that has run into the Solaris services problem. Many customers new to Solaris don't know what services should be running, so unless it's obvious they leave (mostly unneeded) services running. In this case, it was cde-login (per zone). To me, this experience reinforces the value of the Secure by Default OpenSolaris project, which makes it easy to disable all services (sans ssh) by default and then enable only the ones you need.


Wednesday Aug 16, 2006

Auditing Zones

Yes, I'm still paranoid. Having the X2100 out on the web is like being a fish in the ocean, and I'm at the bottom of the food chain. At some point I'll start creating some services, but I have this constant "look over my shoulder" gut feeling. I've got the web server up and running, but it's disabled at the moment :)

Securing one's only child server is like Fight Club. The first rule of Fight Club is to not talk about Fight Club. However, I'm willing to take one for the team. I've got ipfilters running, and zones are running all necessary services with any unneeded services disabled. The global zone is doing pretty much nothing. Logging is enabled ... everywhere.

With the help of Sun Blueprints (here and here for example) and others (here and here), I thought I would utilize BART. However, I've only got one server. Unless someone wants to donate a second server and the monthly hosting fees, I don't have the luxury of auditing (BARTing) my system over ssh (Update: anyone want to take a guess as to why we don't call it File Audit and Reporting Tool?). So I took a related route by making the global zone the "master" system, sans ssh. BART is run from the global zone and audits the local zones from the global zone. While listening to the news last night, I threw together a script (below) to automate the auditing, with cron running the script at regular intervals. If you follow the blueprints, you can create a separate account with just enough privileges to run BART, but nothing more.
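For reference, the cron wiring is a one-liner. A sketch, assuming the script below is saved as /var/bart/zone-audit.sh (the path and name are my own):

## Audit all zones nightly at 03:15, from the crontab of the BART account
15 3 * * * /var/bart/zone-audit.sh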

One feature I want to add is a second cron script to scan the "reports" directory, and email (via "zlogin mailzone mail ...") any potential issues. Before that, I have to write some code to clean up old reports & daily manifests.


#!/usr/bin/bash

#
# BART_DIR: top level directory for bart input and output files
#

BART_DIR=/var/bart

#
# RULES_DIR. While it is an option to have one rules file for all
# zones, this script copies a rules file template, which can later be
# customized for each zone
#

RULES_DIR=${BART_DIR}/rules

#
# CONTROL_MANIFEST_DIR contains the "control" manifest: the baseline
# manifest for each zone.
#

CONTROL_MANIFEST_DIR=${BART_DIR}/manifests


#
# DAILY_DIR contains manifests created daily, with a subdirectory for each
# zone.
#

DAILY_DIR=${BART_DIR}/daily

#
#  REPORT_DIR, with a sub-directory for each zone, contains the daily
#  comparisons between the control manifest and the daily manifest.
#  These files will show file modifications.
#

REPORT_DIR=${BART_DIR}/reports

#
# RULES_TEMPLATE. Default rules file that will be utilized by default for
# each zone.
#

RULES_TEMPLATE=${RULES_DIR}/rules.template

#
# The bart binary.
#

BART=/usr/bin/bart

#
# List of all zones on the host, running or not
#

zones=`/usr/sbin/zoneadm list -cp | cut -d':' -f 2`

#
# Pre-create the various directories. This requires "/var/bart" (by default)
# to exist and have appropriate permissions.
#

if [ ! -d ${RULES_DIR} ];
then
   mkdir -p ${RULES_DIR}
fi

if [ ! -d ${DAILY_DIR} ];
then
   mkdir -p ${DAILY_DIR}
fi

if [ ! -d ${CONTROL_MANIFEST_DIR} ];
then
   mkdir -p ${CONTROL_MANIFEST_DIR}
fi

if [ ! -d ${REPORT_DIR} ];
then
   mkdir -p ${REPORT_DIR}
fi

#
# For each zone, run bart. The first time through, create the control
# manifest. Run this script before putting any newly created zone on
# the network. Creating a control manifest of a compromised zone doesn't
# do much good :)
#

for zone in ${zones}
do
#
# Get the base directory of the zone. Wish there were a formal CLI
# way of doing this (zoneadm get property would be nice). If you have
# a better way of doing this, ping me.
#

   zonepath=`grep "^${zone}:" /etc/zones/index | cut -d':' -f 3`

#
# Pre-create various "file" and "directory" variables before the heavy lifting.
#
   CONTROL_MANIFEST_FILE=${CONTROL_MANIFEST_DIR}/${zone}.control.manifest
   RULE_FILE=${RULES_DIR}/${zone}.rules
   DATE=`date '+%m_%d_%y'`
   DAILY_TEST_MANIFEST=${DAILY_DIR}/${zone}/${DATE}.manifest
   REPORT_FILE=${REPORT_DIR}/${zone}/${DATE}.report

#
# Determining the base directory for the global zone is different than
# for local zones.
#

   if [ "${zone}" = "global" ];
   then
       BASE_ZONE_DIR=/
   else 
       BASE_ZONE_DIR=${zonepath}/root
   fi


#
# If the control file doesn't exist, create it. If it does exist, create
# the daily manifest and generate a report.
#

   if [ ! -f ${CONTROL_MANIFEST_FILE} ];
   then
      if [ ! -f ${RULES_TEMPLATE} ];
      then
         echo "Rules template does not exist, please create ${RULES_TEMPLATE}"
         exit 1
      fi

      #
      # If a rules file doesn't exist, copy the template
      #

      if [ ! -f ${RULE_FILE} ];
      then
         cp ${RULES_TEMPLATE} ${RULE_FILE}
      fi

      ${BART} create -R ${BASE_ZONE_DIR} -r ${RULE_FILE} > ${CONTROL_MANIFEST_FILE}
   else
      #
      # Create the zone's report and daily sub directory
      #
      if [ ! -d ${REPORT_DIR}/${zone} ];
      then
          mkdir -p ${REPORT_DIR}/${zone}
      fi

      if [ ! -d ${DAILY_DIR}/${zone} ];
      then
          mkdir -p ${DAILY_DIR}/${zone}
      fi

      #
      # Generate the daily manifest
      #

      ${BART} create -R ${BASE_ZONE_DIR} -r ${RULE_FILE} > ${DAILY_TEST_MANIFEST}

      #
      #
      # Generate the daily report by comparing the control manifest with the
      # daily manifest
      #

      ${BART} compare -r ${RULE_FILE} ${CONTROL_MANIFEST_FILE} \
          ${DAILY_TEST_MANIFEST} > ${REPORT_FILE}
   fi
done


Per the script, here is a sample "rules.template" file. Customize to your needs.
/usr/sbin
/usr/bin
/etc
CHECK all
IGNORE dirmtime

Monday Aug 14, 2006

ZFS @ UUASC



Last night I presented ZFS to the Unix Users Association of Southern California, Orange County Chapter (UUASC). Lots of good questions. Inquiring minds wanted to know. What really stokes me about ZFS is the simplicity (and the zone snapshots, of course). ZFS makes what are traditionally difficult operations pretty darn trivial.
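To make "trivial" concrete, here is the demo I usually lead with (device names are placeholders): a redundant pool and a mounted, growable filesystem in two commands, with no format/newfs/vfstab dance.

# zpool create tank mirror c1t0d0 c1t1d0
# zfs create tank/home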

Thanks to the UUASC for the opportunity to share, and sorry I couldn't make it to beer.


Update: Wish I had this in my hip pocket for last night :)

Tuesday Aug 08, 2006

Securing my X2100

I've been getting paranoid. Me-thinks too paranoid for my own good. I've been spending a bit of time securing my server. This is a good thing to do when there are thousands of bad, bad dudes (and dudettes) trying to hack into systems. What have I done so far?

First off, I unplugged the server from the network. Next, I powered it off. I am just now starting to feel safe. Wait, this is going to make writing network services pretty difficult. Sigh. Power it on. Plug it in to the network. Now what?

First, I didn't plumb any interfaces. Setup begins while logged in to the console.

Step #1 was to disable a good chunk of unnecessary-for-my-needs services in the global zone (svccfg apply /var/svc/profile/generic_limited_net.xml). We're not quite Secure by Default yet, so I had to disable some additional services as well, such as sendmail.

Step #2: Configure IP Filter.  Block all incoming traffic ("block in all"). Then enable traffic on an as-needed basis. For the global zone, block all if you can.
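For reference, a minimal /etc/ipf/ipf.conf along those lines (a sketch; the interface name and the lone ssh rule are placeholders for your own policy):

## Default deny inbound; allow ssh in; keep state on outbound traffic
block in all
pass in quick on nge0 proto tcp from any to any port = 22 keep state
pass out quick on nge0 all keep state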

Step #3: Create a user for me and assign some roles to myself. On my system, I'm a stud. But not too studly. Can't let it go to my head. Or weaken security.

Step #4: Plumb the interface. Set up the Sun Update Connection to get security patches pushed down. Reboot (kernel patch). Instead of waiting for the polling interval, I opened up a can of /usr/lib/patch/swupas on my system to sync the files I selected in the Sun Update Connection portal. I'll follow up with more on the Sun Update Connection later. Some patches had to be installed manually :( Wish I could use the Sun Update Connection in its acronym form, but I don't think marketing accounted for that ...

Step #5: Create a zone. As I've mentioned before, the default configuration should utilize zones with no services running in the global zone. Just my opinion.

Step #5.1: Apply Step #1 in context of Step #5.

Step #6: Install a name server in the local zone: named -t [directory] -u [user]. By specifying the "chroot" directory and user, there's a bit more security, not to mention the SMF script limits the privileges available to the service.

Step #7: TBD. I am not done with security and I am open to suggestions to take it a step further. Security is not my forte. Some thoughts are additional minimization and potentially BART.

Not sure what I want to install first. Web Server? Portal Server? Java CAPS? N1 SPS? Sigh, too many choices. I'm a kid in a bit-candy store. I'm leaning Portal. That will front-end everything else.

Monday Jul 24, 2006

X2100 running Solaris 10 Update 2



Matt and I headed down to the data center today to get our respective X2100s up and running. Matt's using the SMDC serial console redirect. I gave up on that and decided to use the serial port. I still use the SMDC to power cycle the box among other things, just not console redirect.

You may be wondering why I made this choice. There are a couple of reasons. My provider has given me various ways to access the box. The terminal server is by far the simplest. I can get to it at any time and from any OS I have booted. With IPMI console redirect, I have to have a different tunneling client and IPMI running. That's not a good combination for my Solaris X86 client.

Matt and I have been having some problems installing Solaris 10 X86 remotely over IPMI. You have to modify asy.conf to configure the serial driver to talk over the IPMI port. A (default) bootable CD image doesn't do that. There may be a way to accomplish this and I am open to suggestions, but given our time constraint we installed his box with a local console. Even with the fix to bug id 6337341 in Solaris 10 Update 2, Matt still had to disable bge in /etc/system or else he would lose connectivity to the console via IPMI.

Once up and running, the IPMI remote console works fine. It's the bootstrap problem that was an issue. Keep in mind that the X2100 is designed to be an HPC compute node.

I sliced my two drives with swap and two 10GB UFS (boot) slices each to facilitate Live Upgrade, plus a 50GB ZFS slice mirrored across the two disks. At some point I'll mirror the UFS boot slice. I look forward to a bootable ZFS root slice.
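For reference, the ZFS mirror is a one-liner (slice names are placeholders for my actual layout):

# zpool create tank mirror c0t0d0s7 c0t1d0s7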


Friday Jul 21, 2006

Sun Web Server 6.1 SSL acceleration on T1000/T2000



I can't say that security is my area of expertise. I have this nasty habit of trusting people. Therefore, when it comes to setting things up, I need help. Get this: I even RTFM.

You may or may not know this, but the T1 chip (T1000/T2000 servers to date) supports crypto acceleration natively, and we have a handy blueprint which walks through the setup/configuration. When combined with a couple of blog entries (here and here), I am starting to "get it". That being said, here is a set of end-to-end steps to get it all up and running with a self-signed certificate:

## Set up password file
# echo YOUR_PASSWORD > /tmp/password.txt
# chmod 600 /tmp/password.txt

## Create certificate store
# /opt/SUNWwbsvr/bin/https/admin/bin/certutil -N \
   -P https-web.West.Sun.COM-web- \
   -d /opt/SUNWwbsvr/alias \
   -f /tmp/password.txt

## Create a self-signed certificate and store it in the certificate store
## When run, Select 1 (SSL client), then 9 (other), then "y".
## (Jyri describes how to do this via a Certificate Authority)
# /opt/SUNWwbsvr/bin/https/admin/bin/certutil -S -x \
   -P https-web.West.Sun.COM-web- \
   -d /opt/SUNWwbsvr/alias \
   -f /tmp/password.txt \
   -n Server-Cert \
   -s "CN=web.West.Sun.COM,C=US" \
   -t u,u,u -m 12345 -v 99 -5

## Enable the Sun Metaslot
# /opt/SUNWwbsvr/bin/https/admin/bin/modutil \
   -dbdir /opt/SUNWwbsvr/alias \
   -dbprefix https-web.West.Sun.COM-web- \
   -nocertdb \
   -disable "Solaris Cryptographic Framework"
# /opt/SUNWwbsvr/bin/https/admin/bin/modutil \
   -dbdir /opt/SUNWwbsvr/alias \
   -dbprefix https-web.West.Sun.COM-web- \
   -nocertdb \
   -enable "Solaris Cryptographic Framework" \
   -slot "Sun Metaslot"

## Export the certificate and key from the internal store to a PKCS#12 formatted file
# /opt/SUNWwbsvr/bin/https/admin/bin/pk12util \
   -o /tmp/cert.p12 \
   -d /opt/SUNWwbsvr/alias \
   -n Server-Cert \
   -P https-web.West.Sun.COM-web-

## Import the certificate and key into the Sun Metaslot
# /opt/SUNWwbsvr/bin/https/admin/bin/pk12util \
   -i /tmp/cert.p12 \
   -d /opt/SUNWwbsvr/alias \
   -h "Sun Metaslot" \
   -P https-web.West.Sun.COM-web-

## Ensure the web server user can utilize the keystore.
## I'm not entirely sure of what the security implications
## are here, but the web server (webservd) couldn't open
## the store without doing this. I probably should have tried
## to import the key into the keystore as webservd ...
# chown -R webservd:webservd /.sunw

## Run these steps if you want the web server to start up on boot.
## Otherwise the user is prompted to enter the keystore password
# echo internal:`cat /tmp/password.txt` \
   > /opt/SUNWwbsvr/https-web.West.Sun.COM/config/password.conf
# chmod 400 /opt/SUNWwbsvr/https-web.West.Sun.COM/config/password.conf


## Clean up
# rm /tmp/password.txt
# rm /tmp/cert.p12
Hope this helps.


Wednesday Jul 12, 2006

ZFS @ UUASC



I'm scheduled to deliver a ZFS presentation at the Orange County chapter of the UUASC, and owe Rabbs an overview and Bio.

Overview
The primary goals of ZFS are data integrity and simplicity. In short, to "End the Suffering". I'll lift the rest from the OpenSolaris ZFS home page:

ZFS is a new kind of file system that provides simple administration, transactional semantics, end-to-end data integrity, and immense scalability. ZFS is not an incremental improvement to existing technology; it is a fundamentally new approach to data management. We've blown away 20 years of obsolete assumptions, eliminated complexity at the source, and created a storage system that's actually a pleasure to use.

ZFS Features

  • Pooled Storage Model
  • Always consistent on disk
  • Protection from data corruption
  • Live data scrubbing
  • Instantaneous snapshots and clones
  • Fast native backup and restore
  • Highly scalable
  • Built in compression
  • Simplified administration model

Additional information is available on the "What is ZFS" page.

I believe I'll have somewhere between 60 and 90 minutes. The temporal breakdown will be 25% to 33% demonstration, with the remainder dedicated to discussion. During the discussion, I'll be leveraging the blogosphere (ex: Eric, Derek, Ricardo, and DragonFly).

What I have found exceptionally useful, which will come as no surprise to many of you, is the ability to snapshot/clone zones. Huge time saver. Yeah, I'll be demoing that as well using Nevada build 41 (or later).


Bio
I wrote a Bio for my last UUASC presentation. However, that Bio was much more Java-focused. Here is an update to that Bio:

John has been with Sun Microsystems for over 9 years. He is a Technical Specialist in the Client Solutions Organization's Software Practice, where he focuses primarily on the Java Enterprise System and Solaris. John's personal interests are in distributed computing (SOA is the current industry focus), virtualization and "participating" in the "Participation Age".

Friday Jul 07, 2006

Installing N1SPS in a Zone



Following up on yesterday's post, I thought I would share the steps I took to get the N1 Service Provisioning System up and running in a local zone.

Global zone
The following steps should be run from the global zone.
  • Create a whole root zone. I've called mine "sps".
  • Run "modload /kernel/sys/semsys"

Local zone
The following steps should be run from the local (sps) zone.

You may or may not have to do this depending on your setup, but N1 SPS requires a minimum amount of IPC resources. I think this is primarily due to the bundled Postgres database. FYI, these steps are documented in the installation guide. No special sauce applied.
  • Install guide steps
  • projmod -a -K "project.max-shm-memory=(priv,512mb,deny)" default
  • projmod -a -K "project.max-sem-ids=(priv,32,deny)" default 
  • projmod -a -K "process.max-sem-nsems=(priv,17,deny)" default 
  • prctl -n project.max-shm-memory -v 536870912 -r -i project 1 
  • prctl -n project.max-sem-ids -v 32 -r -i project 1 
  • prctl -n process.max-sem-nsems -v 17 -r -i process $$
I screwed myself by not running projmod. I have a bad habit of editing files using vi instead of running command line tools. That wasted an hour or two.

Here's the other (abstract) steps I followed:
  • I created a Solaris user/group: n1sps/n1sps
  • ran installer (cr_ms_solaris_x86_pkg_5.2.sh)
    • I chose ssh, SSL/HTTPS, create keystore later 
  • Because I chose https, I had to create a keystore
    • /usr/jdk/j2sdk1.4.2_06/bin/keytool -genkey -alias tomcat -keyalg RSA -keystore SPS_HOME/server/tomcat/keystore -storepass [YOUR_KEYSTORE_PASSWORD]
      • Your JDK path may vary
    • chmod 600 keystore
    • chown n1sps:n1sps keystore
  • SPS_HOME/server/bin/crkeys -epass -password [YOUR_KEYSTORE_PASSWORD] 
  • copy the resulting text and paste into tomcat's server.xml (search for "keystore" in server.xml)
  • Started the server: su - n1sps -c "SPS_HOME/server/bin/cr_server start"
I now have SPS running in a local zone that has the ability to communicate with an agent running in the global zone. That agent will be responsible for provisioning other zones along with JES components. Perhaps this will be another blog entry.


Wednesday Jun 14, 2006

Automating zone and application provisioning



Over the last few months I've re-provisioned the zones on my laptop multiple times. Approximately 3 times. The first was my original installation. The second was because of my defragmented Solaris partition. The last was due to moving my N-Tier HA Web application setup in 5 zones from the external USB ZFS drive to my 2nd internal ZFS drive. Wish I'd known about zone migration being in build 41 before I manually re-created the zones! Sigh.

Now that's just me. 3 installs. I wonder how many times, across the Sun field folks, that N-Tier setup has been provisioned? How much time would have been saved if the process was automated? At some point the same folks will repeat the process when Glassfish gets enterprise (High Availability) capabilities. From a different perspective, how many times was provisioning not done due to the effort and time involved?

[Context Switch] I really dig creating demos. A picture is worth a thousand words. A screenshot is worth 1K of ASCII. A demo is worth 1000 screenshots, and definitely much better than a screenshot of a demo.

[Context Switch] I've been watching  N1 at work within a customer environment. Provisioning bare metal with Solaris 10. Creating zones. Deploying an N-Tier architecture. Deploying applications too. All via point-and-click. Repeatable. Predictable. Auditable. Of course, there is work behind the scenes to build the "plans" (sequence of provisioning rules/commands) to provision the stack, but it's a one-time hit.

[Bringing it all together] Wouldn't it be nice if those of us in the "field" at Sun used N1 to provision demos to our laptops? Yeah, I could try tar. Or flar. Yeah, I could use VMware, if VMware supported Solaris 10 X86 as a host OS (Grrrr). Or QEMU, if the QEMU accelerator supported Solaris. Regardless, I think I would use N1 anyway. Why? Because I want to learn more. Because N1 is more flexible than VMware: where VMware provisions static images, N1 allows parameterization. Actually, VMware and N1 are complementary with some overlap (N1 can be used to provision the N-Tiers within 1 or more VMware virtual machines, for example).

So I am thinking of building a machine or two to host N1, most likely just the N1 Service Provisioning System. From there I can host the Java Enterprise System bits, which get downloaded and customized to a fresh zone created on a Solaris 10 laptop, according to an N1 "plan". Of course, finding the hardware to host the N1 software could be problematic. It'll require calling in some favors. Wait ... Uhhh ... I'm in the red on the favor balance thing ... Hmmm ... Bah, what's one more favor!?

If anyone at Sun is interested in such a scenario, lemme know. Could use some help. And some hardware :) Expect to blog about it.

Monday Jun 12, 2006

Grid-enabling Zones

While trying to get the details on the new zones snapshot capability on my Nevada build 41 laptop, I ran the ol' trusty gender-neutral "man" command. I missed it the first time, but not the second. New to the build are "attach" and "detach":

detach
         Detach the specified zone. Detaching a zone is the first
         step  in  moving  a zone from one system to another. The
         full procedure to migrate a zone is  that  the  zone  is
         detached,  the  zonepath  directory  is moved to the new
         host, and then the zone is attached  on  the  new  host.
         Once  the zone is detached, it is left in the configured
         state. If you try to install or clone  to  a  configured
         zone  that  has been detached, you will receive an error
         message and the install or clone subcommand will not  be
         allowed to proceed.

attach [-F]
        The  attach  subcommand  takes  a  zone  that  has  been
         detached  from  one system and attaches the zone on to a
         new system. Therefore, the detach subcommand must be run
         before  the  "attach"  can  take  place.  The zone being
         attached must first be configured using the zonecfg (see
         zonecfg(1M))  command. Once you have the new zone in the
         configured state, use the attach subcommand  to  set  up
         the  zone  root instead of installing. The -F option can
         be used to force the zone  into  the  "installed"  state
         with no validation. This option should be used with care
         since it can leave the zone in an unsupportable state if
         it  was  moved  from  a source system to a target system
         that is unable to properly host the zone.

With attach and detach, a zone can be detached, migrated to a different host, and then attached. Note, this is not an on-the-fly running zone pause and resume capability. The zone must be shut down before being detached. As one engineer put it, it's more like a "move" of a zone than a "migration".
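In practice, the move looks something like this (the zone name and paths are my own; "create -a" picks up the configuration stored in the detached zonepath):

## On the source host
# zoneadm -z zone1 halt
# zoneadm -z zone1 detach

## Move /zones/zone1 to the target host (a no-op on shared storage)

## On the target host
# zonecfg -z zone1 'create -a /zones/zone1'
# zoneadm -z zone1 attach
# zoneadm -z zone1 boot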

One idea is to install the zone with its mount point on an NFS file system or shared SAN (e.g. QFS). With a shared network filesystem, any system in the grid (running essentially the same OS image) can boot the migrated zone. I recall reading this request on the zones forum roughly 18 months ago.

I have to do a bit more investigation to find out exactly how close we are to actually putting a zone on a grid in such a manner, but at a minimum the foundation is being laid.

I'll have to test this out on my laptop. Thank goodness I have my 2nd internal drive installed on my laptop. That's another 100GB of zones and zfs space to play (ahem - work) with.

Oxymoron: Doing 90% of one's distributed computing on a single laptop :)

Saturday Jun 10, 2006

Running a defragmented Solaris



It didn't take too long to get my Solaris image back up and running after the defragmentation of my Solaris partition. Turns out my backup was only 2 months out of date, and the breadth of things that had changed was minimal. Most of what I do is stored "out there" on the 'Net, which transfers much of the onus of backups onto others :)

Unfortunately I've been too busy to keep up with what's been going on over at OpenSolaris, but it looks like they have been busy. I have already installed 5 zones using the new zoneadm "clone" capability. Love it. The look & feel of Gnome has been updated, although it's unclear if Gnome itself has been updated. ZFS has been updated. Not sure exactly what has changed, but it warned me about wanting to upgrade the pool. Eric covers zfs modifications requiring a "zpool upgrade". The comments say build 42 will incorporate those changes and I'm running build 41. Hmmm. Either way, I ran "zpool upgrade" and all is fine.

Installing build 41 from scratch, as opposed to my usual Live Upgrade, wasn't all that bad an idea. It forced me to refresh certain applications and services that didn't get backed up, such as the iwi wireless driver. My laptop would occasionally hang, and I think iwi was to blame. Reading the notes, I saw a fix that may have addressed the issue. I also updated inetmenu.

There is a downside, though. My USB thumb drive stopped working due to a supposed bug in the scsa2usb driver. Thanks to Kesari, I have a to-be-installed update.

Thursday Jun 08, 2006

Defragmenting my Solaris partition



Today began by adding a second internal disk to my Toshiba Tecra M2 laptop to do some more ZFS work. The USB drive was a bit undesirable from the performance perspective and externally clunky, but very usable. I was doing the partitioning under Windows because I had to run a customer's Windows application. While I was in the disk management tool, I thought I would defragment my Windows partition since the defragmentation tool showed a color-coded mess.

What the hell was I thinking? Not only did it defragment my Windows partition, it blew away the boot block. Crap. I boot off my Knoppix CD and activate the partition. Reboot. Up comes Windows. Hmmm, yeah, the boot block is hosed. I boot Solaris off CD and figure I'll just mount the Solaris partition to run installgrub.

No matter what I did, I couldn't get the UFS filesystem to mount. I backed up the drive roughly 2 months ago, so I didn't lose everything.

Right now I'm installing Nevada build 41 over the old build. Note-to-self, use the second internal drive to back up the first internal drive. Regularly.

Wednesday Jun 07, 2006

Setting up DNS



Now that my X2100 is up and running, I am slowly getting it fully operational as I get bits and pieces of time. In the past I have relied upon others to provide me DNS services. I'm up for learning something new, so I thought I'd take a whack at it myself this time. The further I dig into this, the more I question the wisdom of the decision :)

DNS has been at the core of the Internet for decades. Solaris 10 ships with BIND 9, and given its history I am sure BIND 9 is reliable and scalable. However, IMHO, it's not the easiest service to set up. The file formats leave a lot to be desired. Thanks to the tip from non-blogging-heathen co-worker Roy, I am using h2n, which is making the effort both simpler and a better learning experience. BTW, I think I have the O'Reilly DNS & Bind book at home. Although I am out on travel, I can picture the book somewhere in the lllooonnnnggg row of O'Reilly books on my bookshelf.
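For the curious, the h2n invocation ends up being refreshingly short. A sketch with a placeholder domain and network (check the h2n documentation for the full option set):

# h2n -d example.com -n 192.168.1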

Does anyone have thoughts about configuring Bind through webmin? Any other ideas?




Sunday Jun 04, 2006

Managing Zones using ZFS



Martin has a really useful blog entry on using a new Nevada build 40 feature for cloning a zone. No need to hack it anymore, it's simply there.

I have found this capability, even through hacking, to be invaluable. While it saves me hours of cumulative time waiting, it also enables me to concentrate on whatever the problem is at hand. For example, I deployed an N-Tier Highly Available (sans a single laptop) application using zones on my laptop in under an hour. That included zone creation, product installation and configuration, etc.
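For reference, my cloning workflow looks roughly like this (zone names are my own, and the config edit is a manual step):

## Reuse zone1's configuration for zone2, then clone
# zonecfg -z zone1 export > /tmp/zone2.cfg
## (edit /tmp/zone2.cfg: new zonepath, new IP address)
# zonecfg -z zone2 -f /tmp/zone2.cfg
# zoneadm -z zone2 clone zone1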

I'll tell you where I am having a problem, though. Getting my head around which zone does what and then understanding the zfs/zone dependency tree. For example, at the peak I had roughly 20 zones cloned and slightly tweaked to meet my needs. Grokking which zone had what tweak was problematic. The temporary solution was to create (using the hack) snapshots with names such as "appserver@TWEAKED_THIS" and "appserver@after_TWEAKED_THIS_and_then_tweaked_that". I would get a few levels deep. Then, of course, when it comes time to clean it all up, it's not easy (that I know of) to find out which cloned zones depend on what other cloned zones. There is, of course, the zfs "origin" property, but I wish there were a zfs command to recursively print out the entire dependency tree from pool root down to leaf nodes. There is "zfs list -r", but it seems to stop at snapshots (IIRC) and doesn't pick up the clones of snapshots. The origin property does, though. A perl or shell script could probably do it with some thought.
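As a first cut (a sketch, not the recursive tree printer I actually want; the pool name is a placeholder), listing every dataset alongside its origin at least exposes the clone edges:

# zfs list -H -o name,origin -r zonepool | awk '$2 != "-" { print $2 " -> " $1 }'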

I haven't tried build 40 yet, so some of this may be addressed.

