Wednesday Mar 10, 2010

Monitoring the Sun Storage 7000 Appliance from Oracle Grid Control

Over the past few months I've blogged on various monitoring and alerting topics for Sun Storage 7000 Appliances. Aside from my favorite of those posts (Tweeting your Sun Storage 7000 Appliance Alerts), the culmination of this monitoring work is now available as the Sun Storage 7000 Management Plug-in for Oracle Enterprise Manager 10g Grid Controller 1.0, for use with the just-shipped 2010.Q1 software release for the Sun Storage 7000 Appliance Family. Phew, that's a bit of a mouthful for a title, but I'll just refer to it as the SS7000MPfOEMGC, does that help? Well, maybe not ;-)

Sun Storage 7000 Management Plug-in for Oracle Enterprise Manager 10g Grid Controller creates a coupling between the enterprise-wide monitoring provided by Oracle Grid Control and the monitoring and analytics provided by Sun Storage 7000 Appliances. If you are not familiar with Oracle Grid Control, there is a nice write-up within the Installation and Configuration Guide for Oracle Grid Control. In a nutshell, Oracle Grid Control aids in monitoring your vertical data center rather than simply being an aggregation of horizontal health information. The documentation describes it as software and the infrastructure it runs on but I would simply call it a "Vertical Data Center Monitoring Environment".

The goal of the plug-in to Oracle Grid Control is to facilitate a Database Administrator in their use of Sun Storage 7000 Appliances without attempting to reproduce the world-class analytics available within Sun Storage 7000 Appliances. In other words, the goal is to create a bridge between the world of Database Administration and the world of Storage Administration with just enough information so the two worlds can have dialog about the environment. Specifically, the Plug-in for Sun Storage 7000 Appliances is targeted at the following tasks:


  • Connecting Database deployments with Sun Storage 7000 resources that provide storage infrastructure to the database
  • Understanding the performance metrics of a Database from the perspective of the Appliance (what cache resources are being used for a database, which network resources are in use and the performance they deliver, and how various storage abstractions are being used by the database)
  • Providing a Federated view of Sun Storage 7000 Appliances deployed in the environment (including storage profiles and capacities, network information and general accounting information about the appliances)
  • Providing detailed performance metrics for use in initial service delivery diagnostics (these metrics are used to have more detailed conversations with the Storage Administrator when additional diagnostics are required)

Let's take a look at one of the more interesting scenarios as a simple way of showing the plug-in at work rather than reproducing the entire Installation Guide in blog-form.

Download the Plug-in for Sun Storage 7000 Appliances, Unzip the downloaded file, and read the Installation Guide included with the plug-in.

Follow the instructions for installing, deploying the plug-in to agents, and adding instances of Sun Storage 7000 Appliances to the environment for monitoring. Each instance added takes about 60 minutes to fully populate with information. This is simply the nature of a polling environment: the plug-in collects data sets that don't change often on a 60-minute interval and data sets that do change frequently on a 10-minute interval.

Once data is funneling in, all of the standard appliance-centric views of the information are available (including the individual metrics that the plug-in collects) as well as a view of some of the important high-level information presented on the home page for an instance (provided you are using Oracle Grid Control 10.2.0.5). Here is a view of a single appliance instance's home page:

Looking into the Metrics collected for an appliance brings you to standard displays of single metrics (as shown below) or tables of related metrics (all standard navigation in Oracle Grid Control for plug-in components).

Included in the plug-in for Sun Storage 7000 Appliances are 5 reports. Of these reports, 3 run against a single instance of a Sun Storage 7000 Appliance and are available from both the context of the single instance and the Oracle Grid Control Reports Tab while 2 run against all monitored instances of Sun Storage 7000 Appliances and are only available from the Reports Tab. Among the 5 Reports are 2 that combine information about Databases deployed against NFS mount points and Sun Storage 7000 Appliances that export those NFS mount points. The two reports are:


  • Database to Appliance Mapping Report - Viewable from a single target instance or the Reports Tab, this report shows databases deployed against NFS shares from a single Sun Storage 7000 Target Instance
  • Federated Database to Appliance Mapping Report - Viewable only from the Reports Tab, this report shows databases deployed against NFS shares from all monitored Sun Storage 7000 Appliances

Looking at the "Master" (top-level) Database to Appliance Mapping Report (shown below) you will see a "Filter" (allowing you to scope the information in the table to a single Database SID) and a table that correlates the filtered Database SID to Network File Systems shared by specific appliances along with the Storage IP Address that the share is accessed through, the appliance's Storage Network Interface and the name that the appliance is referred to as throughout this Grid Control instance.

From the Master report, 4 additional links are provided to more detailed information that is filtered to the appliance abstraction that is used by the Database SID. The links in the columns navigate in the following way:


  • Database Link - This link takes the viewer to a drill-down table that shows all of the files deployed on the shares identified in the first table. With this detail report, an administrator can see exactly what files are deployed where. The table also contains the three links identified next.
  • Network File System - Takes the viewer down to a detailed report showing metadata about the share created on the appliance, how the cache is used (ARC and L2ARC) for this share and general capacity information for the share.
  • Storage IP Address - Takes the viewer to the Metric Details that relate to the appliance configuration (serial number, model, etc...).
  • Storage Network Interface - Takes the viewer to metadata about the network interface as well as reports on the Network Interface KB/sec and NFS Operations Per Second (combined with the NFS Operations Per Second that are allocated to serving the share that the database resides on).

The detail reports for the Network File System and Storage Network Interface (neither of which is directly accessible from the Reports Tab) use a combination of current metrics and graphical time-axis data, as shown in the following report:

Wherever applicable, the Detail Reports drill further into Metric Details (that could also be accessed through an appliance instance target home page).

It is important to note that several of these reports combine a substantial amount of data into a single page. This approach can create rather lengthy report generation times (in worst-case scenarios, up to 5 minutes). It is always possible to view individual metrics through the monitoring home page. Because metric navigation is much more focused and relates to a single metric, it always performs faster and is preferred unless the viewer is looking for a more complex assembly of information. With the reports, an administrator can view network performance and storage performance side by side, which may be more helpful in diagnosing service delivery issues than navigating through single metric data points.

In addition to the substantial number of collected metrics, the plug-in generates several alerts based on appliance thresholds that can be crossed throughout the operation of target appliances.

Conclusion


Oracle Grid Control gives a fully integrated view of the "Vertical" data center, combining software infrastructure with hardware infrastructure (including storage appliances). Sun Storage 7000 Management Plug-in for Oracle Enterprise Manager 10g Grid Controller 1.0 presents Sun Storage 7000 Appliances within the vertical context and presents metrics and reports tailored specifically towards Sun Storage 7000 Appliances as viewed by a Database Administrator. For more information on the plug-in and software discussed in this entry:

Wednesday Dec 16, 2009

The SNMP Service on a Sun Storage 7000 Appliance

Without a doubt, SNMP rules the playground in terms of monitoring hardware assets, and many software assets, in a data center monitoring ecosystem. It is the single biggest integration technology I'm asked about and that I've encountered when discussing monitoring with customers.

Why does SNMP have such amazing staying power?


  • It's extensible (vendors can provide MIBs and extend existing MIBs)
  • It's simple (hierarchical data rules and really it boils down to GET, SET, TRAP)
  • It's ubiquitous (monitoring tools accept SNMP, systems deliver SNMP)
  • It operates on two models, real time (traps) and polling (get)
  • It has aged gracefully (the security extensions in v3 did not destroy its propagation)

To keep the discussion of SNMP support in the Sun Storage 7000 Appliances relatively succinct, I am going to tackle this in two separate posts. This first post shows how to enable SNMP and what you get "out of the box" once it's enabled. The next post discusses how to deliver more information via SNMP (alerts with more information and threshold violations).

To get more information on SNMP on the Sun Storage 7000 and to download the MIBs that will be discussed here, go to the Help Wiki on a Sun Storage 7000 Appliance (or the simulator):


  • SNMP - https://[hostname]:215/wiki/index.php/Configuration:Services:SNMP

Also, as I work at Sun Microsystems, Inc., all of my examples of walking MIBs on a Sun Storage 7000 Appliance or receiving traps will be from a Solaris-based system. There are plenty of free / open source / trial packages for other Operating System platforms so you will have to adapt this content appropriately for your platform.

One more note as I progress in this series, all of my examples are from the CLI or from scripts, so you won't find many pretty pictures in the series :-)

Enabling SNMP on the Sun Storage 7000 Appliance gives you the ability to:


  • Receive traps (delivered via Sun's Fault Manager (FM) MIB)
  • GET system information (MIB-II System, MIB-II Interfaces, Sun Enterprise MIB)
  • GET information customized to the appliance (using the Sun Storage AK MIB)

Enabling alerts (covered in the next article) extends the SNMP support by delivering targeted alerts via the AK MIB itself.

Enable SNMP


The first thing we'll want to do is log into a target Sun Storage 7000 Appliance via SSH and check if SNMP is enabled.


aie-7110j:> configuration services snmp
aie-7110j:configuration services snmp> ls
Properties:
<status> = disabled
community = public
network =
syscontact =
trapsinks =

aie-7110j:configuration services snmp>

Here you can see it is currently disabled and that we have to set up all of the SNMP parameters. The most common community string to this day is "public", and as we will not be changing system information via SNMP we will keep it. The "network" parameter to use for us is 0.0.0.0/0, which allows access to the MIB from any network. Finally, I will add a single trapsink so that any traps get sent to my management host. The last step shown is to enable the service once the parameters are committed.


aie-7110j:configuration services snmp> set network=0.0.0.0/0
network = 0.0.0.0/0 (uncommitted)
aie-7110j:configuration services snmp> set syscontact="Paul Monday"
syscontact = Paul Monday (uncommitted)
aie-7110j:configuration services snmp> set trapsinks=10.9.166.33
trapsinks = 10.9.166.33 (uncommitted)
aie-7110j:configuration services snmp> commit
aie-7110j:configuration services snmp> enable
aie-7110j:configuration services snmp> show
Properties:
<status> = online
community = public
network = 0.0.0.0/0
syscontact = Paul Monday
trapsinks = 10.9.166.33

From the appliance perspective we are now up and running!
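As a quick check from the management station that the agent is answering, the MIB-II system group is a good first query. Here is a minimal sketch using the Net-SNMP command-line tools bundled with Solaris (the /usr/sfw/bin path is an assumption from my environment; adjust for your platform):


-bash-3.00# /usr/sfw/bin/snmpget -c public -v 2c aie-7110j sysName.0 sysContact.0
-bash-3.00# /usr/sfw/bin/snmpwalk -c public -v 2c aie-7110j system

The first command should echo back the appliance's name and the syscontact value committed above; the walk of the system group confirms the MIB-II basics are being served.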

Get the MIBs and Install Them


As previously mentioned, all of the MIBs that are unique to the Sun Storage 7000 Appliance are also distributed with the appliance. Go to the Help Wiki and download them, then move them to the appropriate location for monitoring.

On the Solaris system I'm using, that location is /etc/sma/snmp/mibs. Be sure to browse the MIB for appropriate tables or continue to look at the Help Wiki as it identifies relevant OIDs that we'll be using below.
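For reference, here is a minimal sketch of dropping the MIBs into place and verifying that they parse. The file names are assumptions based on the MIB module names, and snmptranslate comes with the same Net-SNMP tools used below:


# copy the MIBs downloaded from the appliance Help Wiki into the Net-SNMP MIB directory
cp SUN-AK-MIB.mib SUN-FM-MIB.mib /etc/sma/snmp/mibs/

# sanity check: resolve a symbolic name from the AK MIB to its numeric OID
/usr/sfw/bin/snmptranslate -m +SUN-AK-MIB -IR -On sunAkShareName

If the translate call prints an OID rather than an error, the tools will be able to use symbolic names in the walks that follow.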

Walking and GETting Information via the MIBs


Using standard SNMP operations, you can retrieve quite a bit of information. As an example from the management station, we will retrieve a list of shares available from the system using snmpwalk:


-bash-3.00# ./snmpwalk -c public -v 2c isv-7110h sunAkShareName
SUN-AK-MIB::sunAkShareName.1 = STRING: pool-0/MMC/deleteme
SUN-AK-MIB::sunAkShareName.2 = STRING: pool-0/MMC/data
SUN-AK-MIB::sunAkShareName.3 = STRING: pool-0/TestVarious/filesystem1
SUN-AK-MIB::sunAkShareName.4 = STRING: pool-0/oracle_embench/oralog
SUN-AK-MIB::sunAkShareName.5 = STRING: pool-0/oracle_embench/oraarchive
SUN-AK-MIB::sunAkShareName.6 = STRING: pool-0/oracle_embench/oradata
SUN-AK-MIB::sunAkShareName.7 = STRING: pool-0/AnotherProject/NoCacheFileSystem
SUN-AK-MIB::sunAkShareName.8 = STRING: pool-0/AnotherProject/simpleFilesystem
SUN-AK-MIB::sunAkShareName.9 = STRING: pool-0/default/test
SUN-AK-MIB::sunAkShareName.10 = STRING: pool-0/default/test2
SUN-AK-MIB::sunAkShareName.11 = STRING: pool-0/EC/tradetest
SUN-AK-MIB::sunAkShareName.12 = STRING: pool-0/OracleWork/simpleExport

Next, I can use snmpget to obtain a mount point for the first share:

-bash-3.00# ./snmpget -c public -v 2c isv-7110h sunAkShareMountpoint.1
SUN-AK-MIB::sunAkShareMountpoint.1 = STRING: /export/deleteme

It is also possible to get a list of problems on the system identified by problem code:

-bash-3.00# ./snmpwalk -c public -v 2c isv-7110h sunFmProblemUUID
SUN-FM-MIB::sunFmProblemUUID."91e97860-f1d1-40ef-8668-dc8fb85679bb" = STRING: "91e97860-f1d1-40ef-8668-dc8fb85679bb"

And then turn around and retrieve the associated knowledge article identifier:

-bash-3.00# ./snmpget -c public -v 2c isv-7110h sunFmProblemCode.\"91e97860-f1d1-40ef-8668-dc8fb85679bb\"
SUN-FM-MIB::sunFmProblemCode."91e97860-f1d1-40ef-8668-dc8fb85679bb" = STRING: AK-8000-86

The FM-MIB does not contain information on severity, but using the problem UUID I can SSH into the system and retrieve that information:

isv-7110h:> maintenance logs select fltlog select uuid="91e97860-f1d1-40ef-8668-dc8fb85679bb"
isv-7110h:maintenance logs fltlog entry-005> ls
Properties:
timestamp = 2009-12-15 05:55:37
uuid = 91e97860-f1d1-40ef-8668-dc8fb85679bb
desc = The service processor needs to be reset to ensure proper functioning.
type = Major Defect

isv-7110h:maintenance logs fltlog entry-005>

Take time to inspect the MIBs through your MIB Browser to understand all of the information available. I tend to shy away from using SNMP for getting system information and instead write scripts and workflows, since much more information is available directly on the system; I'll cover this in a later article.

Receive the Traps


Trap receiving on Solaris is a piece of cake, at least for demonstration purposes. What you choose to do with the traps is a whole different process. Each tool has its own trap monitoring facilities that will hand you the fields in different ways. For this example, Solaris just dumps the traps to the console.

Locate the "snmptrapd" binary on your Solaris system and start monitoring:


-bash-3.00# cd /usr/sfw/sbin
-bash-3.00# ./snmptrapd -P
2009-12-16 09:27:47 NET-SNMP version 5.0.9 Started.

From there you can wait for something bad to go wrong with your system or you can provoke it yourself. Fault Management can be a bit difficult to provoke intentionally, since things one thinks would provoke a fault are actually administrator activities. Pulling a disk drive is very different from a SMART drive error on a disk drive. Similarly, pulling a Power Supply is different from tripping over a power cord and yanking it out. The former is not a fault, since it is a complex operation requiring an administrator to unseat the power supply (or disk), whereas the latter occurs out in the wild all the time.

Here are some examples of FM traps I've received through this technique using various "malicious" techniques on a lab system ;-)

Here is an FM Trap when I "accidentally" tripped over a power cord in the lab. Be careful when you do this so you don't pull the system off the shelf if it is not racked properly (note that I formatted this a little bit from the raw output):


2009-11-16 12:25:34 isv-7110h [172.20.67.78]:
DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (1285895753) 148 days, 19:55:57.53
SNMPv2-MIB::snmpTrapOID.0 = OID: SUN-FM-MIB::sunFmProblemTrap
SUN-FM-MIB::sunFmProblemUUID."2c7ff987-6248-6f40-8dbc-f77f22ce3752" = STRING: "2c7ff987-6248-6f40-8dbc-f77f22ce3752"
SUN-FM-MIB::sunFmProblemCode."2c7ff987-6248-6f40-8dbc-f77f22ce3752" = STRING: SENSOR-8000-3T
SUN-FM-MIB::sunFmProblemURL."2c7ff987-6248-6f40-8dbc-f77f22ce3752" = STRING: http://sun.com/msg/SENSOR-8000-3T

Notice again that I have a sunFmProblemUUID that I can use to shell into the system and obtain more details (similar to what was shown in the last section). Again, the next article will contain an explanation of Alerts. Using the AK MIB and Alerts, we can get many more details pushed out to us via an SNMP Trap, and we have finer-grained control over which alerts get pushed.

Here, I purchased a very expensive fan stopper-upper device from a fellow tester. It was quite pricey; it turns out it is also known as a "Twist Tie". Do NOT do this at home, seriously, the decreased air flow through the system can cause hiccups.


DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (1285889746) 148 days, 19:54:57.46
SNMPv2-MIB::snmpTrapOID.0 = OID: SUN-FM-MIB::sunFmProblemTrap
SUN-FM-MIB::sunFmProblemUUID."cf480476-51b7-c53a-bd07-c4df59030284" = STRING: "cf480476-51b7-c53a-bd07-c4df59030284"
SUN-FM-MIB::sunFmProblemCode."cf480476-51b7-c53a-bd07-c4df59030284" = STRING: SENSOR-8000-26
SUN-FM-MIB::sunFmProblemURL."cf480476-51b7-c53a-bd07-c4df59030284" = STRING: http://sun.com/msg/SENSOR-8000-26

You will receive many, many other traps throughout the day, including Enterprise MIB traps letting you know when the system starts up, among other activities.

Wrap it Up


In this article, I illustrated enabling the SNMP Service on the Sun Storage 7000 Appliance via an SSH session. I also showed some basic MIB walking and traps that you'll receive once SNMP is enabled.

This is really simply the "start" of the information we can push through the SNMP pipe from a system. In the next article I'll show how to use Alerts on the system with the SNMP pipe so you can have more control over the events on a system that you wish to be notified about.

Tuesday Aug 04, 2009

Sun Storage 7000 as an Administrator Development Platform

The Sun Storage 7000 Family of Appliances breaks ground in manageability and transparency through an amazing amount of analytics information provided to administrators as well as a highly customizable and extensible management environment that resides on the system. The "Workflow", delivered in the latest release of appliance software, is of particular interest to those of us responsible for "integrating" the Sun Storage 7000 into a management ecosystem, bundling pieces of management logic for use by our peers and reproducing management logic (such as configuration and environmental setup) on several systems at a time.

A workflow is a parameterized piece of logic that is uploaded to a Sun Storage 7000 where it remains resident and is then run via the BUI, CLI or remotely via a shell. The logic within the workflow is programmed in JavaScript (resident on the Sun Storage 7000) and interacts with the system's management shell via "run" commands or built-ins that interact with the current administrative context.

A workflow can do anything that an administrator could do via the CLI, but in a nicely bundled and parameterized way. Here are a few things I've done with workflows:


  • gather information about the appliance and reformat it to make it digestible by a higher-level tool
  • retrieve sets of analytics data and turn them into different-sized chunks (instead of a 1-second interval, give me a 60-second interval as an average along with the min and max during the interval) and reformat them to make them easy to digest
  • manage the lifecycle of shares (create, manage settings and delete) that are common across appliances
  • manage network settings
  • create a set of worksheets on every appliance in the network

The opportunities for automation are endless, only bounded by the needs of the administrator in their efforts to integrate the appliances within the management ecosystem.

There is substantial documentation on the appliance's Help Wiki, but for clarity, here is a very simple workflow that lists, for every filesystem share, the value of an attribute given as input to the workflow:


  • Input: attribute name (same as the attribute in the CLI)
  • Output: CSV format: project,sharename,attribute (one line for each share)
  • Behavior Notes: a listed attribute that is not valid will return NA in the column (this could be moved to parameter verification but will serve to illustrate exception handling). Also, some properties return empty values because the value is actually inherited from the project context.

Since this is a relatively "short" example, I will simply put the code here with comments and then add additional information afterwards. Note the use of JavaScript functions (such as printToString) as well as the most important element, the definition of the variable "workflow".

/* The printed headers, one will be added with the property name */
var headerList = new Array(
    "Project",
    "Share"
);

/* A function to print the array into a string for display */
function printToString(csvToPrint){
    var csvAsString = "";
    for(var i=0 ; i < csvToPrint.length ; i++) {
        csvAsString = csvAsString + csvToPrint[i];
        // do not finish with an end of line marker
        if(i != csvToPrint.length-1) csvAsString = csvAsString + "\n";
    }
    return csvAsString;
}

/* This is a required structure for the workflow, it identifies the name, parameters
   and the function to execute when it is run */
var workflow = {
    name: 'Get Filesystem Attribute',
    origin: 'Sun Microsystems, Inc.',
    description: 'Prints a Property for all Shares',
    parameters: {
        property : {
            label: 'Filesystem Property',
            type: 'String'
        }
    },
    execute:
        function (params) {
            // prepare the output arrays
            var csvContents = new Array();
            var currentRow = 0;
            headerList[2] = params.property;
            csvContents[0] = headerList;
            currentRow++;

            // go to the root context to start navigation
            run('cd /');
            run('shares');

            // get a list of all of the projects on the system
            var projects = list();

            // navigate through each project
            for(var i=0 ; i < projects.length ; i++) {
                run('select '+projects[i]);

                // get a list of all shares
                var shares = list();

                // go into the context of each share
                for(var j=0 ; j < shares.length ; j++) {
                    run('select '+shares[j]);
                    var filesystem = true;
                    var mountPoint = "";
                    try {
                        mountPoint = get('mountpoint');
                    } catch (err) {
                        // will end up here if "mountpoint" does not exist, not a filesystem
                        filesystem = false;
                    }
                    if(filesystem) {
                        var currentRowContents = new Array();
                        currentRowContents[0] = projects[i];
                        currentRowContents[1] = shares[j];
                        try {
                            var propertyValue = get(params.property);
                            currentRowContents[2] = ""+propertyValue;
                        } catch (err) {
                            currentRowContents[2] = "NA";
                        }
                        csvContents[currentRow] = currentRowContents;
                        currentRow++;
                    }
                    run('cd ..');
                }

                run('cd ..');
            }

            var newCsvAsString = printToString(csvContents);

            return (newCsvAsString);
        }
};

While the bulk of the example is standard JavaScript, the workflow structure is the part that must adhere to the appliance's conventions. Here are the important properties:


  • name - The name that the workflow will be identified by within the BUI or CLI
  • origin - The author of the workflow, can also be used to minimize name collisions
  • description - A description of the contents of the workflow, displayed in the BUI or CLI
  • parameters - A list of parameters with types (the types supported are listed in the documentation)
  • execute - The function that gets executed when the workflow is run (there are more advanced ways of identifying the execution code than are shown here)

The code itself interacts with the system to get a list of the projects on the system, then a list of the shares within each project. The mountpoint property is ONLY available on filesystems, so if getting that property raises an error we know we do not have a filesystem and skip processing of it (it is most likely an iSCSI LUN).

To upload the workflow, cut/paste the text above and put it in a file. Log into a Sun Storage 7000 Appliance with the latest software and go to Maintenance / Workflows. Click the "+" sign to add a workflow and identify the location of the file. The syntax is error checked on upload, then you will see it listed. Workflows can also be uploaded from the CLI.

Here is what a run of the workflow from the CLI looks like:


isv-7110h:maintenance workflows> ls
Properties:
showhidden = false

Workflows:

WORKFLOW       NAME                       OWNER   SETID   ORIGIN
workflow-004   Get Filesystem Attribute   root    false   Sun Microsystems, Inc.

isv-7110h:maintenance workflows> select workflow-004
isv-7110h:maintenance workflow-004> ls
Properties:
name = Get Filesystem Attribute
description = Prints a Property for all Shares
owner = root
origin = Sun Microsystems, Inc.
setid = false

isv-7110h:maintenance workflow-004> execute
isv-7110h:maintenance workflow-004 execute (uncommitted)> ls
Properties:
property = (unset)

isv-7110h:maintenance workflow-004 execute (uncommitted)> set property=space_total
property = space_total
isv-7110h:maintenance workflow-004 execute (uncommitted)> commit

Project,Share,space_total
AnotherProject,NoCacheFileSystem,53928
AnotherProject,simpleFilesystem,53928
OracleWork,simpleExport,53928
TestVarious,filesystem1,53928
default,test,448116
default,test2,5368709120
isv-7110h:maintenance workflow-004>
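Since the appliance CLI is also reachable over ssh, the same run can be driven non-interactively from a management host. Here is a minimal sketch, assuming the workflow was uploaded as workflow-004 (as above) and that ssh access for root is set up; the exact batching behavior may differ slightly on your software release:


ssh root@isv-7110h <<'EOF'
maintenance workflows
select workflow-004
execute
set property=space_total
commit
EOF

The CSV output comes back on stdout, which makes it easy to feed the result into a higher-level tool or a cron job.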

While the example is simple, hopefully it illustrates that this is the start of workflow capabilities, not the entirety of them. The workflow can create management structures (like new shares and worksheets), delete them, modify them, and even enable and disable services.

Workflows make the Sun Storage 7000 an Administrator Development Platform. Try it out in the Sun Unified Storage Simulator if you don't have an appliance at your fingertips!

Thursday Mar 27, 2008

National Archives, PASIG, a little Vacation

My family and I took a brief vacation this weekend and made our way to Washington D.C. for a little R & R. We enjoyed 2 and a half days of sights, tours, history and we even squeezed in a little time for the pool. For those of you that have been to D.C. (or live there), you know that 2 and a half days only allowed us to scratch the surface of the United States cultural base that is alive as well as preserved in the city (and often within a few blocks of the National Mall).

There are so many thought provoking and emotional moments as you move around that after two and a half days I found myself almost completely wrung out. We saved the Congressional Gardens with the Vietnam Memorial, World War II Memorial, Lincoln Memorial, Korean Memorial, and the others for the last day. The artistry and the thought that went into these memorials is astounding and the emotions that they pull out of you put you into knots.

I won't list everything we did on the whole journey over the weekend. For my youngest son, going up the Washington Memorial (and our need to start standing in line for tickets at 6:30am) will probably be the most impacting moments. For Shaun, hopefully the Vietnam Memorial and the Pederson House. For me, who knows, the Bill of Rights, the Constitution, the Declaration of Independence, the Magna Carta...simply amazing, but overall I can't name a single moment that wasn't worth its weight in gold.

Professionally though, the National Archives had to be one of the most thought provoking of our stops.

Here is this large building, with all of these physical manifestations of our history on display and in vaults around the building. The Constitution, the Declaration of Independence, the Bill of Rights, the Emancipation Proclamation and more than I could ever list here. It was "Magna Carta Days" at the National Archives as well and one of the four remaining copies of the 1297 Magna Carta from King Edward I was on display.

Here are a few thoughts that went through my head:


  • With infinite, perfect copies of digital content, what makes a digital entity "unique" and "awe inspiring"?
  • How is our country going to preserve and revere a digital creation 710 years from now?
  • How do you know what digital creations are worth preserving since you can hit a button and destroy them so easily?

The first of these seems entirely out of place with storage technology, but when you stand in front of the Declaration of Independence it makes you wonder what digital content could actually have this impact on a person and how you would embody that digital content. The record companies are struggling with it as well. In addition to digital download content, the companies are trying out releases on USB thumb drives as well as larger packages and "Deluxe" sets. Books are going to struggle through the same revolution (as magazines already have) of being bundled as bits with little or no "branding" or "artistry" about the packaging. How does one recreate that sense of uniqueness when the content is merely a bunch of bits that gets flattened into 10 songs amongst 8,000 on an iPod? Really, what "value" do those songs and books have anymore when they can be passed around at will and are part of a great "torrent" of traffic into and out of our computers? Something really has to "stick" to remain on "top of our stereo" these days.

And as for the Declaration of Independence...what a clean and simple document. The document itself hung in windows and is incredibly faded and worn down. Only after time passed did our country seek to formally preserve it for posterity. Perhaps we caught it in time to save it from deteriorating any more. But one still has to ask, is it the single original document that retains the significance, or is it the content that remains significant? If it is the content, we wouldn't store the original in a huge underground vault and protect it as well as Vice President Cheney, would we?

Having seen the original, I would have to argue that there is something incredibly unique about it, it actually holds more reverence (for lack of a better word) than one of the many copies of it. So how does one reproduce that "reverence" in a digital world?

If that is not enough to think about, we have to think about digital preservation. The Magna Carta of 1297 has withstood time for 710 years and is in wonderful shape. What digital storage technology today do we have that can withstand decay for that length of time (of course, one could argue that some rock etchings have withstood time for thousands of years)? Let's put this in perspective: today's disk drives and SSDs are generally spec'd for 5 years. If I want to preserve my family's pictures for 710 years, I would have to ensure the data was migrated 142 times. Hmmm, I'm not sure if my kids and their kids and their kids are up for that.

It appears that CDs and DVDs may have a lifespan of around 50-200 years if you preserve them properly. That is getting pretty reasonable...of course, they haven't been around for 50 to 200 years so they are certainly not battle tested like carving on a good rock. The National Institute of Standards and Technology appears to be looking heavily into the longevity of optical recording media. DLT appears to have a shelf-life of around 30 years if preserved properly.

Let's say, hypothetically, that you solve the problem of the storage media (perhaps a self migrating technology in a box that guarantees infinite lifespan and that, itself, produces the new disks and technology to ensure fresh DVDs are always built). Now you have two additional challenges (at least):


  • Maintaining the integrity of the data (how do I ensure that the data that is NOW on the DVD is the original data?)
  • Maintaining the ability for outsiders to inspect and recall the stored information

The first of these seems obvious, but is actually quite difficult. Checksums can be overcome with time (imagine compute power in 700 years!) and we can't guarantee that the keepers of the information will not have a vested interest in changing the contents of the information. We see governments attempting to re-write history all the time, don't we?

Let's take a simpler example of what happens when "a" byte disappears. Recall Neil Armstrong's famous quote: "One small step for man ... ". Well, after a lot of CPU cycles and speculation and conspiracy theories, it turns out that we now believe that Neil Armstrong said: "One small step for a man...". It is a fundamentally different statement (though there is no less historical impact). This data is only 40 years old, but consider the angst in trying to prove whether or not "a" was a part of the quote. What happens when a government deliberately alters, say, the digital equivalent of an "original" 2nd Bill of Rights written in 2030?

One more thought for the day, since I really do have to work and if you have made it this far, it is my duty to you to free you of my ramblings.

We know for a fact that the English (dare I say...United States dialect) language is evolving. Even after 200 years there are phrases and semantics and constructs in the Declaration of Independence that require quite a bit of research for the common US citizen. Take the following paragraph:

He is at this time transporting large Armies of foreign Mercenaries to compleat the works of death, desolation and tyranny, already begun with circumstances of Cruelty & perfidy scarcely paralleled in the most barbarous ages, and totally unworthy the Head of a civilized nation.

There is the obvious use of the word perfidy, a word that has since all but disappeared from common speech in the United States.

Looking deeper at the paragraph we see evolution in spelling (compleat). There is also a fascinating use of capitalization throughout the Declaration of Independence. The study and usage of capitalization alone could be worth the creation of long research papers.

What does this tell us? The content and meaning of a work often lie in the context and times in which the work was created. How does one retain this context, language, and ability to read the content over 700 years? This is not a small problem at all. There are entire cultures lost or in the process of being lost as the language and the context are lost; consider the United States' own Anasazi culture as an example.

Computer dialects (protocols, standards, information models, etc...) are themselves subject to evolution and are even more fragile than spoken language itself. A change in capitalization in an XML model may break the ability of pre-existing programs to read and migrate information, resulting in lost information. Once you break a program from 200 years prior, how much expertise will still exist to maintain and fix that program?

Crazy things to think about. Personally, I believe we are in a fragile place in our history where we could lose decades of historical information as we transition between written works and digital works. As part of my night job I'm trying to get more involved in the Sun Preservation and Archiving Special Interest Group (PASIG) to learn more about what our customers are doing in this area. I'm also trying to reorganize my own home "infrastructure" to be more resilient for the long run to ensure that my family's history does not disappear with my computers.

There are significant challenges in the computer industry all over, but preservation of history is one that our children and our children's children will judge us with. USB thumb drives will come and go, but hopefully our generation's digital treasures will not go to the grave with us.

Monday Mar 03, 2008

Computational Photography and Storage

There is a great article on CNet's news.com about computational photography, "Photo industry braces for another revolution". It is basically about Photography 2.0. The first wave of digital photography seeks to reproduce film-based photography as well as it can. Photography 2.0 advances the hardware while using the higher processing power within the camera to take advantage of that new hardware, replace hardware functionality with software functionality, or bring image detection and manipulation capabilities that are not possible in the hardware space.

There are a few developments worthy of note, and all of them involve bringing more CPU capabilities into the camera:


  • Panoramic photography - I enjoy these types of scenes (one shown below), though I don't think they are the future of photography at all
  • Depth of field and 3-D photography - There is an excellent example of this in the CNet article. Personally, depth of field is arguably one of the most difficult techniques to master since it is purely 4-dimensional using our current lenses (decreasing the aperture size increases the time of exposure and brings more depth into focus, etc...)

There are many other ideas in the article...detecting smiles (an extension of this is closed or open eyes), better light detection, self-correcting for stabilization (this is done with high priced hardware today in Image Stabilized lenses), etc... Clearly a Photography 2.0 revolution is in the works.

Photography 2.0 is really the same trend we see in the storage business...Storage 2.0. There are simple changes in the industry, like the incredible increase in CPU driving software RAID into storage stacks again. A huge benefit with software RAID is the decrease in hardware costs that it drives. This is very similar to the Photography 2.0 concept of moving image stabilization out of the hardware (the lenses) and into the software.

Storage 2.0 also brings us projects like this one: Project Royal Jelly. Project Royal Jelly encompasses two important pieces: one is the implementation of a standard access model for fixed-content information; the second is the insertion of execution code between the storage API and the spinning rust. The ability to "extend" a storage appliance (or device) via a standard API will allow us to leverage the proliferation of these inexpensive and high-powered CPUs. A common use-case for an execution environment embedded in a storage device would be an image repository or a video repository. Every image submitted goes through a series of conversions: different image formats, different image sizes (thumbnail, Small, Medium, Large), and often a series of color adjustments. Documents go through similar transformations: a PDF may have different formats created (HTML primarily), the document will be indexed, larger chunks will be extracted into a variety of metadata databases for quick views, etc...
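To make that concrete, here is the kind of derivation pipeline an ingest hook might run, sketched with the ImageMagick command-line tools from a shell. This is purely illustrative: Project Royal Jelly defines its own execution environment and APIs, and the file names here are hypothetical.


# derive the renditions an image repository typically wants alongside the original
convert original.jpg -resize 150x150 thumbnail.jpg        # thumbnail
convert original.jpg -resize 800x800 medium.jpg           # web-sized copy
convert original.jpg -quality 85 -strip recompressed.jpg  # recompressed, metadata stripped
convert original.jpg original.png                         # alternate format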

These transformations can arguably be the responsibility of the storage platform rather than the application, especially when they can be considered part of an archiving operation. While indexing and manipulation could be considered a higher tier, storage tiering and taking advantage of storage utilities could also benefit from a standard storage execution platform. Vendors could easily insert logic onto storage platforms to "move" data and evolve a storage platform in place rather than authoring applications that have to operate outside of the storage platform.

Just some Monday morning musings...have a great week.

Friday Feb 15, 2008

Why one bit matters.

Sometimes I wonder why I'm in the field of storage. It's not glamorous. It's JBODs, RAID arrays, HBAs, expanders, spinning rust, and all of those things wrapped into enclosures with lots of fans humming. My background is varied: I wrote a file system for my Master's, I worked on one of the biggest Java Business Frameworks ever (the SanFrancisco Project at IBM), and I've danced between the application and infrastructure space more than once.

I often think about my "ideal" job; I've even pondered it here on my blog...and take note, the new Jack Johnson CD is very good and I am ripping it to 8-track real soon now. Personally, I love the field of digital preservation, XAM is a step in the right direction, and long term digital archives are important to people-kind.

But still, this storage business, there is something to it.

I watched my friend get their eyes lasered to correct their vision this week. While I was watching, I was able to sit with one of the assistants and pepper her with questions; it is an astounding process. Basically, as I understand it, the doctors use the scanners and computers to:


  • map the surface of each eye
  • analyze the surface to understand why the vision is incorrect
  • create several corrective treatments
  • the doctor looks at the corrective treatments and adds their wisdom to make the right decision (a lot goes into this, like the health of the patient, the age, their profession, whatever...)
  • the doctor may tweak the map of places that need adjustments
  • the updated map is loaded into the "laser"
  • the patient comes in, gets prepped, the doctor aims the laser and sets the program loose
  • the "laser" jumps around the eye zapping away
  • the doctor reassembles the eye
  • the patient goes home

Coolness. But then the geek in me took over: I asked what I could about the machine, backup generators, power, moving the data, mapping the eye, etc... But my head kept thinking about the storage and computer software.

What if a bit is wrong? What if the bits are stored away but due to some battery backup cache being down, it doesn't really get stored and the out of date map is actually in place? What if one tiny point "ages" and becomes rust and there is no checksumming to see it "rotted"? These are people's eyes, you know? Would you want to be the storage vendor that supplied storage that messed up someone's eye because you didn't get the signal / noise ratio on the cabling right?

I've been thinking a lot about digital photography lately as well. While it's not people's eyes, it is still an incredibly fragile process. In fact, many of the world's best photographers still do not use digital, and for very good reason. Even when you purchase photographs, you pay a premium price for pictures that have not gone through the digitization process.

Think about this, if a person takes a picture, the CCD (or whatever they are these days) takes the light and transfers it to a memory card. The memory card gets transferred to a laptop hard drive (in my case), a variety of backups are made and I move many of the pictures to SmugMug.

That's a lot of storage along the way. Now, let's say (God forbid), my house burns down. I get my pictures back from SmugMug and one of my pictures has a bit that rotted away.

Now, that is one tiny bit of imperfection to some people. To a professional, that picture is no longer an original. At that point, you have to decide to toss away your artistic integrity and photoshop the point to be like the ones near to it, or just toss the picture from your portfolio. Either way, the picture is never the same.

How would you like to be the one that sold the storage unit that allowed the bit to rot, or be stored incorrectly, or archived incorrectly, and destroyed that person's memory, that one perfect picture that was meant to be a keepsake forever?

Well, when you think about it, building storage units and management for those storage units is probably not as glamorous as owning the software or companies that specialize in photo archiving, or "lasering" people's eyes, or storing original recordings for artists, or archives of space travel. But those folks have to pick storage units from a company...and if you are the company they pick and you fulfill your moral responsibility to supply checksumming in your file systems, and well-tested storage that may occasionally be late to market to ensure that a memory is not lost or an eye doesn't get fried...you know, that's pretty rewarding.

Cheers to all of my co-workers at Sun who believe storage is more than a spinning drive or a paycheck.

Friday May 25, 2007

Storage Remote Monitoring...got that...

One of my many projects is to tackle the product-side architecture for Remote Monitoring of our storage systems. Remote Monitoring is a fascinating problem to solve for many, many reasons:


  • There are different ways to break the problem up, each being pursued with almost religious fanaticism, but each having its place depending on the customer's needs
  • It is a cross-organizational solution (at least within Sun)
  • It has a classic separation of responsibilities in its architecture
  • It solves real problems for customers and for our own company
  • It is conceptually simple, yet extremely difficult to get right

The problem at hand was to create a built-in remote monitoring solution for our midrange storage systems. Our NAS Product Family and anything being managed by our Common Array Manager was a good start. Our CAM software alone covers our Sun StorageTek 6130, 6140, 6540, 2530, and 2540 Arrays. Our high-end storage already has a level of remote monitoring and we already have a solution to do remote monitoring of "groups" of systems via a service appliance, so our solution was targeted directly at monitoring individual systems with a built in solution.

This remote monitoring solution is focused on providing you with a valuable service: "Auto Service Request", ASR. The Remote Monitoring Web Site has a great definition of ASR: Uses fault telemetry to automatically initiate a service request and begin the problem resolution process as soon as a problem occurs. This focus gives us the ability to trim down the information being sent to Sun to faults, it also gives you a particular value...it tightens up the service pipeline to get you what you need in a timely manner.

For example, if a serious fault occurs in your system (one that would typically involve Sun Services), we will have a case generated for you within a few minutes...typically less than 15.

The information flow with the "built in" Remote Monitoring is only towards Sun Microsystems (we heard you with security!). If you, the customer, want to work with us remotely to resolve the problem, a second solution known as Shared Shell is in place. With this solution, we work cooperatively with you so that you can collaborate with us to resolve problems.

Remember though, I'm an engineer, so let's get back to the problem...building Remote Monitoring.

The solution is a classic separation of concerns. Here are the major architectural components:


  • REST-XML API
  • HTTPS protocol for connectivity
  • Security (user-based and non-repudiation) via Authentication and Public / Private Key Pairs
  • Information Producer (the product installed at the customer site)
  • Information Consumer (the service information processor that turns events into cases)
  • Routing Infrastructure

The REST-XML API gives us a common information model that abstracts away implementation details yet gives all of the organizations involved in information production and consumption a common language. The relatively tight XML Schema also gives an easily testable output for the product without having to actually deliver telemetry in the early stages of implementation. Further, the backend can easily mock up messages to test their implementation without a product being involved. Early in the implementation we cranked out a set of messages that were common to some of the arrays and sent them to the programmers on the back end, and the teams then worked independently on their implementations. When we brought the teams back together, things went off without much of a hiccup, though we did find places where the XML Schema was too tight or too loose for one of the parties, so you do still have to talk. The format also helps us bring teams on board quickly...give them an XSD and tell them to come back later.

Here is an example of a message (real data removed...). Keep in mind there are multiple layers of security to protect this information from prying eyes. We've kept the data to a minimum, just the data we need to help us determine if a case needs to be created and what parts we probably need to ship out:


<?xml version="1.0" encoding="UTF-8"?>
<message xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="message.xsd">
<site-id>paul</site-id>
<message-uuid>uuid:xxxxx</message-uuid>
<message-time timezone="America/Denver">2005-11-22T12:10:11</message-time>
<system-id>SERIAL</system-id>
<asset-id>UNIQUENUMBER</asset-id>
<product-id>uniqueproductnumber</product-id>
<product-name>Sun StorageTek 6130</product-name>
<event>
<primary-event-information>
<message-id>STK-8000-5H</message-id>
<event-uuid>uuid:00001111</event-uuid>
<event-time timezone="America/Denver">2005-11-22T12:10:11</event-time>
<severity>Critical</severity>
<component>
<hardware-component>
<name>RAID</name>
</hardware-component>
</component>
<summary>Critical: Controller 0 write-cache is disabled</summary>
<description>Ctlr 0 Battery Pack Low Power</description>
</primary-event-information>
</event>
</message>
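
Because the schema is fairly tight, a producer can validate a message like the one above before any telemetry ever leaves the lab. Here is a minimal sketch with xmllint, assuming the message and schema have been saved locally as message.xml and message.xsd:


# validate the telemetry message against the schema; --noout suppresses echoing the document
xmllint --noout --schema message.xsd message.xml

A clean "message.xml validates" is a cheap gate to put in front of any code that emits telemetry.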

Use of XML gives us the ability to be very tight with the use of tags and to enforce particular values, like severity, across the product lines.

The format above is heavily influenced by our Fault Management Architecture, though an FMA implementation is not required.

What we've found is that good diagnostics on a device (and FMA helps with this) yields a quick assembly of the information we need and fewer events that are not directly translated into cases. FMA and "self healing" provide an exceptional foundation for remote monitoring with a heavy reduction in "noise".

The rest of the architecture (the services that produce, consume, secure, and transport the information) is handed off to the implementors! The product figures out how to do diagnostics and output the XML via HTTPS to services at Sun Microsystems. Another team deploys services in the data center for security and registration (there are additional XML formats, authentication capabilities and POST headers for this part of the workflow). Another team deploys a service to receive the telemetry, check the signature on the telemetry for non-repudiation purposes, process it, filter it, and create a case.
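
For a feel of the transport leg, the POST itself is nothing exotic. Here is a hedged sketch with curl; the endpoint, certificate file, and headers are placeholders rather than the real Sun service, and the actual registration and authentication exchange adds steps not shown here:


# post a telemetry message over HTTPS (endpoint and credentials are hypothetical)
curl -X POST https://telemetry.example.com/v1/messages \
     --cacert sun-service-ca.pem \
     -H 'Content-Type: text/xml' \
     --data-binary @message.xml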

There are additional steps that each product needs to go through, such as communicating across organizations the actual message-ids that a device can send and what should happen if that message-id is received.

In the end, the centerpiece of the architecture is the information and the language that all teams communicate with. Isn't this the case with any good architecture? Choose the interfaces and the implementations will follow.

Keep in mind, this remote monitoring solution is secure end to end. Further, remote monitoring is only one piece of the broader services portfolio...I'm just particularly excited about this since I was privileged to have worked with a great, cross-organizational team to get it done! The team included Mike Monahan (who KICKS BUTT), Wayne Seltzer, Bill Masters, Todd Sherman, Mark Vetter, Jim Kremer, Pat Ryan and many others (I hope I didn't forget any). There are also lots of folks that were pivotal in getting this done that we lost along the way (Kathy MacDougall I hope you are doing well as well as Mike Harding!).

This post has been a long time in coming! Enjoy!

Friday Jan 12, 2007

Wired: One Giant Screwup for Mankind

Several weeks ago I blogged about data loss and taking the long view when it comes to data retention. This month's Wired magazine has an article entitled One Giant Screwup for Mankind that illustrates the need for taking a long view of data retention policies. It also brings up an interesting point about our current trend of digitizing and chopping our digital content up into lossy data compression formats (like 128kbps MP3s).

Apparently, the grainy images of the original moon landing that we see on TV ("one small step for [a] man...") are not the original images and sound! The engineers were forced to create a smaller format for transmission from the moon to earth: 320 scan lines at 10 frames per second transmitted at 500 kHz. This stream was received at 3 tracking stations, pushed to a central location, recorded on media and converted to the broadcast rate of 525 scan lines at 30 frames per second transmitted at 4.5 MHz. This is essentially 3 transmissions (camera to tracking station, tracking station to central site, central site to TV) and 2 conversions (camera to moon/earth broadcast, moon/earth broadcast to TV). Between the reception of the data and conversion to the TV format, the quality was greatly reduced! The engineers noticed that the broadcast images were not as crisp as what SHOULD have been in the original format. In fact, they could verify this with pictures of the monitors in the conversion room. So, the engineers tried to find the tape that the original data was recorded on so they could recover the full quality images.

Gone, lost, disappeared.

Just as I mentioned previously though, the engineers had TWO problems they had to work on:
- Getting and retaining the equipment they could use to recover the original data (remember, we have gone through multiple media formats since the 60s)
- Locating the original tapes used for recording the data stream prior to conversion to the television signal format

I won't tell you how it's going; you have to read Wired to find out. But this does bring up an excellent example of:
- Why a company that has record retention requirements of over 7 years must put in place a comprehensive policy to not only record the information and store it, but also retain the equipment that can read that data and write it to a new format. Some companies, instead of storing the components to read/write data, will enact a policy to migrate the data to the current media format every 7 years or less.
- Why a company should consider the effect on history of losing their data if their retention policy is less than 7 years or not explicitly stated. For example, is there a retention policy at our record companies for all of the garage band tape recordings they've received? If there isn't, how are we going to retain this valuable piece of American History and Culture? The record companies have a historical responsibility to record and maintain these.

More interestingly to me for this blog post is the problems with the data conversion process itself. Recall I'm a big vinyl fan at this point. Vinyl and analog recordings provide a warm and continuous signal whereas digital chops that up into many slices. Further, when compressing information for our MP3s we actually lose data. Depending on the number of Kbps you use, the data loss can be very noticeable in certain types of music.

Many download services also do not provide lossless downloads.

In the coming year we will see 1 Terabyte desktop drives. I am convinced that we will start seeing more pervasive use of lossless compression. Still, it begs the question, will our original data remain intact? Are we losing important historical data and content quality through the conversion to digital and then using lossy compression techniques because we feel the quality is "good enough"? I have every reason to believe that as we start merging technology with our bodies and brains, our senses will become more and more aware of the lossy compression techniques used in the late 90's and early 2000's. Even without computer enhancement, our brains are adapting to the saturation of media and information in a way that previous generations would be astounded at.

The only question to our kids who will have the heightened senses through the merging of technology with our human anatomy will be "How much quality did my parents compromise and lose for the sake of their convenience...and how much of it will we be able to recover to enjoy their creativity to its fullest potential?". So, be sure to save those original recordings...especially if you are the owner of the Beatles recordings.

btw, does anyone REALLY agree with releasing a Beatles album that does not adhere to the group's original music scores but is instead a mashup? Should content created by a team of people in a specific way be rebuilt to fulfill someone else's vision? What if our future generation actually thinks that these songs were originally mashed up, are we changing history? I agree with mashups and especially for content that is INTENDED to be mashed up, but I believe we should be very careful with taking original content and mashing it up to be something not intended by the author (though I do like the version of the Elvis tune at the beginning of the NBC show Las Vegas :-)

- Gotta run!
