Diskset Import - An Introduction to the source

One of my most significant contributions to Solaris 10 (along with Steve Peng)
was adding support for import/export of disksets to SVM. So, what is
import/export of disksets? Simply put, you've got a bunch of disks
encapsulated in an SVM diskset, and you want to disconnect them from one
host, connect them to a different host, and get your SVM configuration
back. Why might you want to do this? You might be consolidating your storage,
or, in case of a disaster, you might need to move your storage from one
server to another.

SVM stores its configuration information for the local set in a regular
metadb (the one shown by metadb(1M) without arguments). The diskset-related
configuration is stored in a diskset metadb (the one shown by the
'metadb -s <diskset>' command) that resides on most (if not all) of the disks
that are part of that diskset. Additionally, the local set metadb has knowledge
about the disksets, including information on where to find the diskset metadbs.

The problem with moving storage from one server to another is that you lose
the local metadb and thus don't know where to find the diskset metadbs (and the
associated configuration). To implement diskset import, we needed to figure out
which of the recently connected disks in the target system have a diskset
metadb on them, read the configuration in from that metadb, and populate the
kernel structures with that configuration information. That was the
scope of the problem in a nutshell.

We started out by writing the code to scan the disks for diskset metadbs
(entirely in userland). If you want to follow the conversation with code
references, pull up metaimport.c. This is essentially the source of
metaimport(1M). The code starts out by scanning the available set of disks and
pruning the disks that are in use; then, for each drive that's left, it
calls meta_get_set_info, which is the heart of the scanning code. It checks
whether a diskset metadb exists on the passed-in disk and, if one exists,
reads it in and does a sanity check on the metadata it read. It also does the
work of figuring out the new disk names: a disk named c1t1d1 in the source
system might be named c2t2d22 in the target system, and the related metadata
in the diskset metadb needs to be corrected to reflect the new state of
affairs. Upon its return, meta_get_set_info has a list of disks that comprise
a diskset.
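
The scan-and-prune step described above can be sketched in a few lines of C.
This is only an illustration, not the metaimport.c code: the disk_t type and
the collect_set_disks function are hypothetical, and the sketch assumes each
candidate disk has already been probed so that we know whether it is in use
and which diskset (if any) its on-disk metadb names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical description of one disk seen on the target host. */
typedef struct {
    const char *name;      /* name on the target host, e.g. "c2t2d22" */
    bool in_use;           /* assumed result of an earlier in-use check */
    const char *set_name;  /* diskset named by the on-disk metadb, or NULL */
} disk_t;

/*
 * Collect the names of the disks that belong to set_name.
 * Disks that are in use are pruned first, then disks without a
 * diskset metadb, and the remaining disks are kept if the set matches.
 * Returns the number of names stored in out (at most max_out).
 */
size_t
collect_set_disks(const disk_t *disks, size_t ndisks, const char *set_name,
    const char **out, size_t max_out)
{
    size_t found = 0;

    for (size_t i = 0; i < ndisks && found < max_out; i++) {
        if (disks[i].in_use)
            continue;               /* prune disks that are in use */
        if (disks[i].set_name == NULL)
            continue;               /* no diskset metadb on this disk */
        if (strcmp(disks[i].set_name, set_name) == 0)
            out[found++] = disks[i].name;
    }
    return found;
}
```

Calling collect_set_disks with an array of probed disks and a set name fills
out with the target-host names of that set's disks. The real code does much
more per disk (reading and sanity-checking the metadb, fixing up names), which
this sketch leaves out.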

Once we've built up the list of disksets and the disks that comprise each
of those disksets, we pass all of this information to meta_imp_set, which does
the real work of populating the information in the kernel via ioctls. The
MD_DB_USEDEV ioctl creates the kernel structures (akin to what happens when
creating the initial configuration). The MD_IOCIMP_LOAD ioctl then snarfs in
the detailed configuration; the heart of this code is in md_imp_snarf_set.
The ops vector for each of the modules (stripe, mirror, etc.) was expanded to
include an import op. So, for example, the stripe ops vector now looked
something like this:

    md_ops_t stripe_md_ops = {
        stripe_open,            /* open */
        stripe_close,           /* close */
        md_stripe_strategy,     /* strategy */
        NULL,                   /* print */
        stripe_dump,            /* dump */
        NULL,                   /* read */
        NULL,                   /* write */
        md_stripe_ioctl,        /* stripe_ioctl, */
        stripe_snarf,           /* stripe_snarf */
        stripe_halt,            /* stripe_halt */
        NULL,                   /* aread */
        NULL,                   /* awrite */
        stripe_imp_set,         /* import set */
        /* ... */
    };

The import op for each of the modules handled creating detailed configuration
as well as updating out-of-date information.
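
To make the idea of an ops vector with an import entry concrete, here is a
minimal sketch of how such a table of function pointers can be walked to call
each module's import op. Everything in it is made up for illustration
(demo_ops_t, the demo_* functions, and the convention that an import op
returns the number of records it handled); it is not the md_ops_t definition
or the kernel's dispatch code.

```c
#include <stddef.h>

/* Hypothetical, heavily reduced ops vector: a name and an import op. */
typedef struct {
    const char *name;
    int (*imp_set)(int setno);   /* import op; NULL if the module has none */
} demo_ops_t;

/* Stand-in import ops; each returns how many records it "imported". */
static int
demo_stripe_imp_set(int setno)
{
    (void) setno;
    return 2;
}

static int
demo_mirror_imp_set(int setno)
{
    (void) setno;
    return 1;
}

static const demo_ops_t demo_modules[] = {
    { "stripe", demo_stripe_imp_set },
    { "mirror", demo_mirror_imp_set },
    { "legacy", NULL },          /* a module without an import op */
};

/* Call the import op of every module that has one; return the total. */
int
demo_import_all(int setno)
{
    int total = 0;
    size_t nmods = sizeof (demo_modules) / sizeof (demo_modules[0]);

    for (size_t i = 0; i < nmods; i++) {
        if (demo_modules[i].imp_set == NULL)
            continue;
        total += demo_modules[i].imp_set(setno);
    }
    return total;
}
```

The appeal of this layout is that the caller never needs to know which
metadevice types exist; adding import support to another module only means
filling in one more slot in that module's vector.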

md_imp_snarf_set calls the import op for each of the modules that appear in
the diskset configuration. So, if there's a stripe in the diskset configuration,
stripe_imp_set gets called, and so on. Subsequently, the code does exactly
what its comments say :)

    /*
     * Fixup
     * (1) locator block
     * (2) locator name block if necessary
     * (3) master block
     * (4) directory block
     * calls appropriate writes to push changes out
     */
    if ((err = md_imp_db(setno)) != 0)
        goto cleanup;

    /*
     * Create set in MD_LOCAL_SET
     */
    if ((err = md_imp_create_set(setno)) != 0)
        goto cleanup;

It fixes up another set of out-of-date information and creates the appropriate
structures in the local set to inform the local set about the diskset
configuration and where to find it. That's it, we're done with our job in the
kernel and we return to userland.

In userland, the only other thing that needs to be done is to inform the
RPC daemon that stores the knowledge about disksets (rpc.metad) about the
existence of this imported set. This is accomplished via the clnt_resnarf_set
routine.

So there you have it - a 15,000 ft overview of the implementation of diskset
import.
