Friday Nov 08, 2013

Who keeps removing that file?

Over the years, I've run into many cases where some file gets removed and there's no obvious culprit.  With DTrace, it's fairly easy to figure out who is doing it:

#!/usr/sbin/dtrace -wqs

syscall::unlinkat:entry
/cleanpath(copyinstr(arg1)) == "/dev/null"/
{
        stop();
        printf("%s[%d] stopped before removing /dev/null\n", execname, pid);
        system("ptree %d; pstack %d", pid, pid);
}

That script will stop the process trying to remove /dev/null before it does it.  You can allow it to continue by restarting (unstopping?) the command with prun(1) or killing it with kill -9.  If you want the command to continue automatically after getting the ptree and pstack output, you can add "; prun %d" and another pid argument to the system() call.
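
For example, here's a sketch of that self-resuming variant (same probe and predicate as above; the only change is the extra prun in the system() action):

#!/usr/sbin/dtrace -wqs

syscall::unlinkat:entry
/cleanpath(copyinstr(arg1)) == "/dev/null"/
{
        stop();
        printf("%s[%d] stopped before removing /dev/null\n", execname, pid);
        /* gather the diagnostics, then let the process carry on by itself */
        system("ptree %d; pstack %d; prun %d", pid, pid, pid);
}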

Wednesday Oct 30, 2013

Hardware refresh of Solaris 10 systems? Try this!

I've been seeing quite an uptick in people wanting to install Solaris 11 when they do hardware refreshes.  I applaud that effort - Solaris 11 (and 11.1) have great improvements that advance the state of the art and make the best use of the latest hardware.

Sometimes, however, you really don't want to disturb the OS or upgrade to a later version of an application that is certified with Solaris 11.  That's a great use for Solaris 10 Zones.  If you are already using Solaris Cluster, or would like more protection as you put more eggs in an ever-growing basket, check out solaris10 Brand Zone Clusters.
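
As a rough illustration (my sketch, not part of the original recommendation), moving an existing Solaris 10 system into a solaris10 branded zone on new hardware generally looks like this - the zone name and archive path are hypothetical, and the archive would typically be created with flarcreate on the source system:

# zonecfg -z s10app 'create -t SYSsolaris10; set zonepath=/zones/s10app'
# zoneadm -z s10app install -a /net/archserver/export/s10app.flar -u
# zoneadm -z s10app boot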

Sunday Sep 29, 2013

How I spent my summer instead of vacationing

I've been pretty quiet around here lately, mainly because I've been heads down working on Unified Archives and Kernel Zones. Markus has started to let the cat out of the bag...

1. kernel zones: With kernel zones customers will have the option to run different kernel patch levels across different zones while maintaining the simplicity of zones management. We'll also be able to do live migration of kernel zones. All of that across HW platforms, i.e. kernel zones will be available on both SPARC and x86. Key benefits of kernel zones:
  • Low overhead (Lots of optimizations because we run Solaris on Solaris)
  • Unique security features: Ability to make them immutable by locking down the root file system
  • Integration with the Solaris resource management capabilities: CPU, memory, I/O, networking
  • Fully compatible with OVM SPARC as well as native and S10 branded zones
  • Comprehensive SDN capabilities: Distributed Virtual Switch and VxLAN capabilities

2. Unified Template Builder: This will allow customers to go from any-to-any of the following: Bare metal, OVM, kernel zone, native zone. For instance: You'll be able to take a zones image and re-deploy it as a bare metal image, kernel zone or ldom. Or vice versa! Pretty powerful, huh? Unified templates also provide us with a great foundation to distribute key Oracle applications as a shrink-wrapped, pre-installed, pre-tuned and configured image where customers can specify at install time whether to turn them into a bare metal image, a zone, a kernel zone or an OVM.

Tuesday Apr 30, 2013

Cold storage migration with Zones on Shared Storage

A question on the Solaris Zones forum inspired this entry; answering it here means that perhaps more people will see it.  The goal in this exercise is to migrate a ZOSS (zones on shared storage) zone from a mirrored rootzpool to a raidz rootzpool.

WARNING:  I've used lofi devices as the backing store for my zfs pools.  This configuration will not survive a reboot and should not be used in the real world.  Use real disks when you do this for any zones that matter.
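
For reference, a sketch of how throwaway lofi devices like these can be created (the file names are hypothetical; repeat for as many devices as you need):

# mkfile 300m /var/tmp/zossdisk1
# lofiadm -a /var/tmp/zossdisk1
/dev/lofi/1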

My starting configuration is:

# zonecfg -z stuff info rootzpool
rootzpool:
    storage: dev:lofi/1
    storage: dev:lofi/2

My zone, stuff, is installed and not running.

# zoneadm -z stuff list -v
  ID NAME             STATUS     PATH                           BRAND    IP    
   - stuff            installed /zones/stuff                   solaris  excl  

I need to prepare the new storage by creating a zpool with the desired layout.

# zpool create newpool raidz /dev/lofi/3 /dev/lofi/4 /dev/lofi/5 /dev/lofi/6 /dev/lofi/7
# zpool status newpool
  pool: newpool
 state: ONLINE
  scan: none requested
config:

    NAME             STATE     READ WRITE CKSUM
    newpool          ONLINE       0     0     0
      raidz1-0       ONLINE       0     0     0
        /dev/lofi/3  ONLINE       0     0     0
        /dev/lofi/4  ONLINE       0     0     0
        /dev/lofi/5  ONLINE       0     0     0
        /dev/lofi/6  ONLINE       0     0     0
        /dev/lofi/7  ONLINE       0     0     0

errors: No known data errors

Next, migrate the data.  Remember, the zone is not running at this point.  We can use zfs list to figure out the name of the zpool mounted at the zonepath.

# zfs list -o name,mountpoint,mounted /zones/stuff
NAME         MOUNTPOINT    MOUNTED
stuff_rpool  /zones/stuff      yes

# zfs snapshot -r stuff_rpool@migrate

# zfs send -R stuff_rpool@migrate | zfs recv -u -F newpool

The -u option was used with zfs receive so that it didn't try to mount the zpool's root file system at the zonepath when it completed.  The -F option was used to allow it to wipe out anything that happens to exist in the top-level dataset in the destination zpool.

Now, we are ready to switch which pool is in the zone configuration.  To do that, we need to detach the zone, modify the configuration, and then attach it.  Prior to attaching, we also need to ensure that newpool is exported.

# zoneadm -z stuff detach
Exported zone zpool: stuff_rpool
# zpool export newpool
# zonecfg -z stuff
zonecfg:stuff> info rootzpool
rootzpool:
    storage: dev:lofi/1
    storage: dev:lofi/2
zonecfg:stuff> remove rootzpool
zonecfg:stuff> add rootzpool
zonecfg:stuff:rootzpool> add storage dev:lofi/3
zonecfg:stuff:rootzpool> add storage dev:lofi/4
zonecfg:stuff:rootzpool> add storage dev:lofi/5
zonecfg:stuff:rootzpool> add storage dev:lofi/6
zonecfg:stuff:rootzpool> add storage dev:lofi/7
zonecfg:stuff:rootzpool> end
zonecfg:stuff> exit

In the commands above, I was quite happy that zonecfg allows the up arrow or ^P to select the previous command.  Each instance of add storage was just four keystrokes (^P, backspace, number, enter).

# zoneadm -z stuff attach
Imported zone zpool: stuff_rpool
Progress being logged to /var/log/zones/zoneadm.20130430T144419Z.stuff.attach
    Installing: Using existing zone boot environment
      Zone BE root dataset: stuff_rpool/rpool/ROOT/solaris
                     Cache: Using /var/pkg/publisher.
  Updating non-global zone: Linking to image /.
Processing linked: 1/1 done
  Updating non-global zone: Auditing packages.
No updates necessary for this image.

  Updating non-global zone: Zone updated.
                    Result: Attach Succeeded.
Log saved in non-global zone as /zones/stuff/root/var/log/zones/zoneadm.20130430T144419Z.stuff.attach
# zpool status stuff_rpool
  pool: stuff_rpool
 state: ONLINE
  scan: none requested
config:

    NAME             STATE     READ WRITE CKSUM
    stuff_rpool      ONLINE       0     0     0
      raidz1-0       ONLINE       0     0     0
        /dev/lofi/3  ONLINE       0     0     0
        /dev/lofi/4  ONLINE       0     0     0
        /dev/lofi/5  ONLINE       0     0     0
        /dev/lofi/6  ONLINE       0     0     0
        /dev/lofi/7  ONLINE       0     0     0

errors: No known data errors

At this point the storage has been migrated.  You can boot the zone and move on to the next task.
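
Booting it is the usual:

# zoneadm -z stuff boot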

You probably want to use zfs destroy -r stuff_rpool@migrate once you are sure you don't need to revert to the old storage.  Until you delete it (or the source zpool), you can use zfs send -I to send just the differences back to the old pool.  That's left as an exercise for the reader.
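
If you do try that exercise, a sketch of how it might look - assuming the old pool has been imported again under a hypothetical name oldpool and you take a fresh snapshot to mark the state worth copying back:

# zfs snapshot -r stuff_rpool@sync
# zfs send -R -I @migrate stuff_rpool@sync | zfs recv -u -F oldpool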


Thursday Oct 25, 2012

Linux to Solaris @ Morgan Stanley

I came across this blog entry and the accompanying presentation by Robert Milkowski about his experience switching from Linux to Oracle Solaris 11 for a distributed OpenAFS file serving environment at Morgan Stanley.

If you are an IT manager, the presentation will show you:

  • Running Solaris with a support contract can cost less than running Linux (even without a support contract) because of technical advantages of Solaris.
  • IT departments can benefit from hiring computer scientists into Systems Programmer or similar roles.  Their computer science background should be nurtured so that they can continue to deliver value (savings and opportunity) to the business as technology advances.

If you are a sysadmin, developer, or somewhere in between, the presentation will show you:

  • A presentation that explains your technical analysis can be very influential.
  • Learning and using the non-default options of an OS can make all the difference as to whether one OS is better suited than another.  For example, see the graphs on slides 3 - 5.  The ZFS default is to not use compression (see the sketch after this list).
  • When trying to convince those that hold the purse strings that your technical direction should be taken, the financial impact can be the part that closes the deal.  See slides 6, 9, and 10.  Sometimes reducing rack space requirements can be the biggest impact because it may stave off or completely eliminate the need for facilities growth.
  • DTrace can be used to shine light on performance problems that may be suspected but not diagnosed.  It is quite likely that these problems have existed in OpenAFS for a decade or more.  DTrace made diagnosis possible.
  • DTrace can be used to create performance analysis tools without modifying the source of software that is under analysis.  See slides 29 - 32.
  • Microstate accounting, visible in the prstat output on slide 37, can be used to quickly draw focus to problem areas that affect CPU saturation.  Note that prstat without -m gives a time-decayed moving average that is not nearly as useful.
  • Instruction-level probes (slides 33 - 34) are a super-easy way to identify which part of a function is hot.
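
To make that compression point concrete, here is a minimal sketch of flipping the non-default setting on a ZFS dataset (the dataset name is made up; this is my illustration, not something from the slides):

# zfs set compression=on tank/afs
# zfs get compression,compressratio tank/afs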


Thursday Dec 01, 2011

What I learned about lofi today

As I was digging into some other things today, I realized that lofiadm is not needed in common use cases.  As mount(1M) says:

     For file system types that support it, a file can be mounted
     directly as a file system by specifying the full path to the
     file as the special argument. In such  a  case,  the  nosuid
     option is enforced. If specific file system support for such
     loopback file mounts is  not  present,  you  can  still  use
     lofiadm(1M)  to  mount a file system image. In this case, no
     special options are enforced.

That is, you can do this:

root@global# lofiadm
Block Device             File                           Options

root@global# mount -F hsfs `pwd`/sol-10-u9-ga-x86-dvd.iso /mnt

root@global# df -h /mnt
Filesystem             Size   Used  Available Capacity  Mounted on
/ws/media/solaris/sol-10-u9-ga-x86-dvd.iso
                       2.0G   2.0G         0K   100%    /mnt

root@global# lofiadm
Block Device             File                           Options
/dev/lofi/1              /ws/media/solaris/sol-10-u9-ga-x86-dvd.iso     -

When I unmount it, the lofi device goes away as well.

root@global# umount /mnt

root@global# lofiadm
Block Device             File                           Options

Note that this was on Solaris 11 - I don't believe that this feature was backported to Solaris 10.
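
For comparison, a sketch of the traditional two-step dance that the man page alludes to, using the same image:

root@global# lofiadm -a /ws/media/solaris/sol-10-u9-ga-x86-dvd.iso
/dev/lofi/1
root@global# mount -F hsfs /dev/lofi/1 /mnt
root@global# umount /mnt
root@global# lofiadm -d /dev/lofi/1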

Thursday Nov 17, 2011

Come see me @LISA

LISA '11 is just around the corner and once again includes an Oracle Solaris Summit the day before the main conference.  Please come to the summit as my esteemed colleagues and I introduce many of the great improvements found in Solaris 11.  I'll be giving a talk on Zones.

Even with a full day to talk about Solaris 11, we certainly won't be able to get into enough depth in the areas that concern you the most.  To get some face time with Oracle engineers, stop by the Oracle demo booth - I'll be there Wednesday from 2:00 - 4:00.

Saturday Nov 12, 2011

Automating custom software installation in a zone

In Solaris 11, the internals of zone installation are quite different from what they were in Solaris 10.  This difference gives the administrator far greater control over what software is installed in a zone.  The rules in Solaris 10 are simple and inflexible: if a package is installed in the global zone and is not specifically excluded by package metadata from being installed in a zone, it is installed in the zone.  In Solaris 11, the rules are still simple, but much more flexible: the packages you tell it to install, and the packages on which they depend, are what get installed.

So, where does the default list of packages come from?  From the AI (auto installer) manifest, of course.  The default AI manifest is /usr/share/auto_install/manifest/zone_default.xml.  Within that file you will find:

            <software_data action="install">
                <name>pkg:/group/system/solaris-small-server</name>
            </software_data>

So, the default installation will install pkg:/group/system/solaris-small-server.  Cool.  What is that?  You can figure out what is in the package by looking for it in the repository with your web browser (click the manifest link), or use pkg(1).  In this case, it is a group package (pkg:/group/), so we know that it just has a bunch of dependencies naming the packages that it really wants installed.

$ pkg contents -t depend -o fmri -s fmri -r solaris-small-server
FMRI
compress/bzip2
compress/gzip
compress/p7zip
...
terminal/luit
terminal/resize
text/doctools
text/doctools/ja
text/less
text/spelling-utilities
web/wget

If you would like to see the entire manifest from the command line, use pkg contents -r -m solaris-small-server.

Let's suppose that you want to install a zone that also has mercurial and a full-fledged installation of vim rather than just the minimal vim-core that is part of solaris-small-server.  That's pretty easy.

First, copy the default AI manifest somewhere where you will edit it and make it writable.

# cp /usr/share/auto_install/manifest/zone_default.xml ~/myzone-ai.xml
# chmod 644 ~/myzone-ai.xml 

Next, edit the file, changing the software_data section as follows:

            <software_data action="install">
                <name>pkg:/group/system/solaris-small-server</name>
                <name>pkg:/developer/versioning/mercurial</name>
                <name>pkg:/editor/vim</name>
            </software_data>

To figure out the names of the packages, either search the repository using your browser, or use a command like pkg search hg.
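
A hedged aside: before editing the manifest, it can be reassuring to confirm the exact FMRIs from the command line, along these lines:

$ pkg search -r hg
$ pkg list -a editor/vim developer/versioning/mercurial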

Now we are all ready to install the zone.  If it has not yet been configured, that must be done as well.

# zonecfg -z myzone 'create; set zonepath=/zones/myzone'
# zoneadm -z myzone install -m ~/myzone-ai.xml 
A ZFS file system has been created for this zone.
Progress being logged to /var/log/zones/zoneadm.20111113T004303Z.myzone.install
       Image: Preparing at /zones/myzone/root.

 Install Log: /system/volatile/install.15496/install_log
 AI Manifest: /tmp/manifest.xml.XfaWpE
  SC Profile: /usr/share/auto_install/sc_profiles/enable_sci.xml
    Zonename: myzone
Installation: Starting ...

              Creating IPS image
              Installing packages from:
                  solaris
                      origin:  http://localhost:1008/solaris/54453f3545de891d4daa841ddb3c844fe8804f55/
               
DOWNLOAD                                  PKGS       FILES    XFER (MB)
Completed                              169/169 34047/34047  185.6/185.6

PHASE                                        ACTIONS
Install Phase                            46498/46498 

PHASE                                          ITEMS
Package State Update Phase                   169/169 
Image State Update Phase                         2/2 
Installation: Succeeded

        Note: Man pages can be obtained by installing pkg:/system/manual

 done.

        Done: Installation completed in 531.813 seconds.


  Next Steps: Boot the zone, then log into the zone console (zlogin -C)

              to complete the configuration process.

Log saved in non-global zone as /zones/myzone/root/var/log/zones/zoneadm.20111113T004303Z.myzone.install

Now, for a few things that I've seen people trip over:

  • Ignore that bit about man pages - it's wrong.  Man pages are already installed so long as the right facet is set properly (see the sketch after this list).  The full story is a topic for another blog entry.
  • If you boot the zone then just use zlogin myzone, you will see that services you care about haven't started and that svc:/milestone/config:default is starting.  That is because you have not yet logged into the console with zlogin -C myzone.
  • If the zone has been booted for more than a very short while when you first connect to the zone console, it will seem like the console is hung.  That's not really the case - hit ^L (control-L) to refresh the sysconfig(1M) screen that is prompting you for information.
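
On that facet point, a hedged hint ahead of that future entry: I believe the relevant facet is facet.doc.man, and you can inspect and set it inside the zone along these lines:

# zlogin myzone pkg facet
# zlogin myzone pkg change-facet facet.doc.man=true
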
About

I'm a Principal Software Engineer in the Solaris Zones team. In this blog, I'll talk about zones, how they interact with other parts of Solaris, and related topics.
