Wednesday Dec 10, 2008

Massively porting FOSS for OpenSolaris 2008.11 /pending and /contrib repositories

Today is the official release of OpenSolaris 2008.11, including commercial support.

Along with OpenSolaris 2008.11 we're also publishing new repositories full of various open source software built and packaged for OpenSolaris:

  • A pending repository with 1,708 FOSS pkgs today, and many more coming. This is "pending" in that we want to promote the packages in it to the contrib repository.
  • A contrib repository with 154 FOSS pkgs today, and many more coming soon.
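
For reference, adding these repositories as IPS publishers on a 2008.11 system looked something like the following (the URLs are as I remember them from that era; check the SW Porters community pages for the current ones):

  $ pfexec pkg set-publisher -O http://pkg.opensolaris.org/pending pending
  $ pfexec pkg set-publisher -O http://pkg.opensolaris.org/contrib contrib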

These packages came from two related OpenSolaris projects in the OpenSolaris software porters community.

The two projects focus on different goals; here I describe the work we did on the PkgFactory/Roboporter project. Our primary goal is to port and package FOSS for OpenSolaris as quickly as possible. We do not yet focus much on proper integration with OpenSolaris, such as making sure that the FOSS we package is properly integrated with RBAC, SMF, and the Solaris audit facilities, with manpages placed in the correct sections, etcetera. We do intend, though, to get close enough to proper integration that the most valuable packages can then be polished off manually, put through the ARC and c-team processes, and pushed to the /dev repository.

Note, by the way, that the /pending and /contrib repositories are open to all contributors. The processes for contributing packages to these repositories are described in the SW Porters community pages, so if you want to make sure your favorite FOSS is included, you can always do it yourself!

The 154 packages in /contrib are a representative subset of the 1,708 packages in /pending, which in turn are a representative subset of some 10,000 FOSS pkgs that we had in a project-private repository. That's right, 10,000, which we built in a matter of just a few weeks. [NOTE: Most, but not all, of the 1,708 packages in /pending and the 154 in /contrib came from the pkgfactory project.]

The project began with Doug Leavitt doing incredible automation of: a) searching for and downloading spec files from SFE, and similar sources from Ubuntu and other Linux packaging repositories, and b) building them on Solaris. (b) is particularly interesting, but I'll let Doug blog about that. With Doug's efforts we had over 12,000 packages in a project-private IPS repository, and the next step was to clean things up and cut the list down to something that we could reasonably test and push to /pending and /contrib. That's where Baban Kenkre and I jumped in.

To come up with that 1,708-package list we first removed all the Perl5 CPAN material from the list of 12,000, then wrote a utility to look for conflicts between our repository, the Solaris WOS, and OpenSolaris. It turned out we had many conflicts even within our own repository (some 2,000 pkgs were removed as a result, if I remember correctly, after removing the Perl5 packages). Then we got down and dirty and did as much [very light-weight] testing as we could.

What's really interesting here is that the tool we wrote to look for conflicts turned out to be really useful in general. That's because it loads package information from our project's repo, the SVR4 Solaris WOS, and OpenSolaris into a SQLite3 database, and analyzes the data to some degree. What's really useful about this is that with little knowledge of SQL we could run many ad-hoc queries that helped a lot when it came to whittling down our package list and testing. For example, getting a list of all executables in /bin and /usr/sbin that are delivered by our package factory and that have manpages was trivial, and quite useful: I could read the manpages in one terminal and try the executables in another, which made the light-weight testing go much faster than it would otherwise have. We ran lots of ad-hoc queries against this little database, the kinds of queries that would have required significantly more scripting without a database; SQL is a very powerful language!
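The real tool's schema was never published, so the table and column names below (a single hypothetical pkg_files table) are illustrative only, but here is a minimal Python/SQLite3 sketch of the kind of ad-hoc query described above:

```python
# Sketch: find executables under /usr/bin or /usr/sbin whose package
# also delivers a manpage. Schema and data are hypothetical examples.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pkg_files (pkg TEXT, path TEXT);
INSERT INTO pkg_files VALUES
  ('SUNWfoo', '/usr/bin/foo'),
  ('SUNWfoo', '/usr/share/man/man1/foo.1'),
  ('SUNWbar', '/usr/sbin/bar');          -- bar ships no manpage
""")

rows = conn.execute("""
  SELECT f.pkg, f.path
    FROM pkg_files f
   WHERE (f.path LIKE '/usr/bin/%' OR f.path LIKE '/usr/sbin/%')
     AND EXISTS (SELECT 1 FROM pkg_files m
                  WHERE m.pkg = f.pkg
                    AND m.path LIKE '/usr/share/man/%')
""").fetchall()
print(rows)  # only SUNWfoo's /usr/bin/foo qualifies
```

With the data loaded once, variations (conflicting paths across repositories, packages with no manpages at all, and so on) are each a one-line change to the query rather than a new script.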

That's it for now. We'll blog more later. In the meantime, check out the /pending and /contrib repositories. We hope you're pleased. And keep in mind that what you see there is mostly the result of just a few weeks of PkgFactory project work, so you can expect: a) higher quality as we improve our integration techniques and tools, and b) more, many, many more packages as we move forward. Our two projects' ultimate goal is to package for OpenSolaris all of the useful, redistributable FOSS that you can find on SourceForge and elsewhere.

Tuesday Apr 25, 2006

ZFS and the automounter

So, you have ZFS, and you create lots and lots of filesystems, all the time, any time you like.

Problem: new filesystems don't show up in /net (-hosts special automount map) once a host's /net entry has been automounted.

Solution #1: wait for OpenSolaris to gain support for client-side crossing of NFSv4 server-side mount-points.

Solution #2: replace the -hosts map with an executable automount map that generates hierarchical automount map entries based on showmount -e output.

The script is pretty simple, but there is one problem: the automounter barfs at entries longer than 4095 characters (sigh).


#!/bin/ksh
#
# Executable automount map to replace -hosts: given a hostname in $1,
# emit a hierarchical map entry built from the host's showmount -e output.

exec 2>/dev/null

if [[ $# -eq 0 ]]; then
	logger -p daemon.error -t "${0##*/}" "Incorrect usage by automountd!"
	exit 1
fi

entry="/ $1:/"
# In ksh the last stage of a pipeline runs in the current shell,
# so $entry accumulates across loop iterations.
showmount -e "$1" | sort | grep '^/' | while read p junk; do
	entry="$entry $p $1:$p"
done

print "$entry"
logger -p daemon.debug -t "${0##*/}" "$entry"
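
To hook it in: the automounter treats a map file with its execute bit set as an executable map and runs it with the lookup key (here, the hostname) as its argument, so you replace the -hosts entry in /etc/auto_master with something like the following (the install path is just an example):

  /net	/usr/local/sbin/net_map	-nosuid,nobrowse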

Wednesday Dec 07, 2005

Comparing ZFS to the 4.4BSD LFS

Remember the 4.4BSD Log Structured Filesystem? I do. I've been thinking about how the two compare recently. Let's take a look.

First, let's recap how LFS works. Open your trusty copies of "The Design and Implementation of the 4.4BSD Operating System" to chapter 8, section 3 -- or go look at the Wikipedia entry for LFS, or read any of the LFS papers referenced therein. Go ahead, take your time, this blog entry will wait for you.

As you can see, LFS is organized on disk as a sequence of large segments, each segment consisting of smaller chunks, the whole forming a log file of sorts. Each small chunk consists of data and meta-data blocks that were modified or added at the time the chunk was written. LFS, like ZFS, never overwrites data, excepting superblocks, so LFS does, by definition, copy-on-write, just like ZFS. In order to be able to find inodes whose block locations have changed, LFS maintains a mapping of inode numbers to block addresses in a regular file called the "ifile."

That should be enough of a recap. As for ZFS, I assume the reader has been reading the same ZFS blog entries I have been reading. (By the way, I'm not a ZFS developer. I only looked at ZFS source code for the first time two days ago.)

So, let's compare the two filesystems, starting with generic aspects of transactional filesystems:

  • LFS writes data and meta-data blocks every time it needs to fsync()/sync()
  • Whereas ZFS need only write data blocks and entries in its intent log (ZIL)

This is very important. The ZIL is a compression technique that allows ZFS to safely defer many writes that LFS could not. Most LFS meta-data writes are very redundant, after all: writing to one block of a file implies writing new indirect blocks, a new inode, a new data block for the ifile, new indirect blocks for the ifile, and a new ifile inode -- but all of these writes are easily summarized as "wrote block X to replace block Y of the file whose inode number is Z."

Of course, ZFS can't ward off meta-data block writes forever, but it can safely defer them with its ZIL, and in the process stands a good chance of being able to coalesce related ZIL entries.
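
To make the redundancy concrete, here is a toy count of the blocks an LFS-style layout must write for a single one-block file update; the numbers (e.g., two levels of indirect blocks) are illustrative assumptions, not measurements of any real filesystem:

```python
# Toy model: blocks written to log one 1-block update of a large file
# in an LFS-style layout (indirect-block depth is an assumed parameter).
def lfs_blocks_written(indirect_levels=2):
    n = 1                  # the new data block itself
    n += indirect_levels   # new indirect blocks up the file's block tree
    n += 1                 # the file's new inode
    n += 1                 # new ifile data block holding that inode's address
    n += indirect_levels   # new indirect blocks for the ifile
    n += 1                 # the ifile's new inode
    return n

print(lfs_blocks_written())  # 8 blocks to record one block's worth of change
```

A ZIL record, by contrast, summarizes the same change in a single log entry, with the meta-data updates deferred and potentially coalesced.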

  • LFS needs copying garbage collection, a process that involves both searching for garbage and relocating the live data around it
  • Whereas where LFS sees garbage, ZFS sees snapshots and clones

The Wikipedia LFS entry says this about LFS' lack of snapshots and clones: "LFS does not allow snapshotting or versioning, even though both features are trivial to implement in general on log-structured file systems." I'm not sure I'd say "trivial," but, certainly easier than in more traditional filesystems.

  • LFS has an ifile to track inode number to block address mappings. It has to, else how to COW inodes?
  • ZFS has a meta-dnode; all objects in ZFS are modelled as a "dnode," so dnodes underlie inodes, or rather "znodes," and znode numbers are dnode numbers. Dnode-number-to-block-address mappings are kept in a ZFS filesystem's object set's meta-dnode, much as inode-to-block-address mappings are kept in the LFS ifile.

It's worth noting that ZFS uses dnodes for many purposes besides implementing regular files and directories; some such uses do not require many of the meta-data items associated with regular files and directories, such as ctime/mtime/atime, and so on. Thus "everything is a dnode" is more space efficient than "everything is a file" (in LFS the "ifile" is a regular file).

  • LFS does COW (copy-on-write)
  • ZFS does COW too


  • LFS stops there
  • Whereas ZFS uses its COW nature to improve on RAID-5 by avoiding the need to read-modify-write, so RAID-z goes faster than RAID-5. Of course, ZFS also integrates volume management.

Besides features associated with transactional filesystems, and besides volume management, ZFS provides many other features, such as:

  • data and meta-data checksums to protect against bitrot
  • extended attributes
  • NFSv4-style ACLs
  • ease of use
  • and so on

I'm an engineer at Oracle (erstwhile Sun), where I've been since 2002, working on Sun_SSH, Solaris Kerberos, Active Directory interoperability, Lustre, and misc. other things.

