Wednesday Dec 16, 2009

The SNMP Service on a Sun Storage 7000 Appliance

Without a doubt, SNMP rules the playground in terms of monitoring hardware assets, and many software assets, in a data center monitoring ecosystem. It is the single biggest integration technology I'm asked about and that I've encountered when discussing monitoring with customers.

Why does SNMP have such amazing staying power?

  • It's extensible (vendors can provide MIBs and extend existing MIBs)
  • It's simple (hierarchical data rules and really it boils down to GET, SET, TRAP)
  • It's ubiquitous (monitoring tools accept SNMP, systems deliver SNMP)
  • It operates on two models, real time (traps) and polling (get)
  • It has aged gracefully (security extensions in v3 did not destroy its propagation)

To keep the SNMP support in the Sun Storage 7000 Appliances relatively succinct, I am going to tackle this in two separate posts. This first post shows how to enable SNMP and what you get "out of the box" once it's enabled. The next post discusses how to deliver more information via SNMP (alerts with more information and threshold violations).

To get more information on SNMP on the Sun Storage 7000 and to download the MIBs that will be discussed here, go to the Help Wiki on a Sun Storage 7000 Appliance (or the simulator):

  • SNMP - https://[hostname]:215/wiki/index.php/Configuration:Services:SNMP

Also, as I work at Sun Microsystems, Inc., all of my examples of walking MIBs on a Sun Storage 7000 Appliance or receiving traps will be from a Solaris-based system. There are plenty of free / open source / trial packages for other Operating System platforms so you will have to adapt this content appropriately for your platform.

One more note as I progress in this series: all of my examples are from the CLI or from scripts, so you won't find many pretty pictures in the series :-)

Enabling SNMP on the Sun Storage 7000 Appliance gives you the ability to:

  • Receive traps (delivered via Sun's Fault Manager (FM) MIB)
  • GET system information (MIB-II System, MIB-II Interfaces, Sun Enterprise MIB)
  • GET information customized to the appliance (using the Sun Storage AK MIB)

Enabling alerts (covered in the next article) extends the SNMP support by delivering targeted alerts via the AK MIB itself.

Enable SNMP

The first thing we'll want to do is log into a target Sun Storage 7000 Appliance via SSH and check if SNMP is enabled.

aie-7110j:> configuration services snmp
aie-7110j:configuration services snmp> ls
<status> = disabled
community = public
network =
syscontact =
trapsinks =

aie-7110j:configuration services snmp>

Here you can see it is currently disabled and that we have to set up all of the SNMP parameters. The most common community string to this day is "public" and as we will not be changing system information via SNMP we will keep it. The "network" parameter to use for us is, this allows access to the MIB from any network. Finally, I will add a single trapsink so that any traps get sent to my management host. The last step shown is to enable the service once the parameters are committed.

aie-7110j:configuration services snmp> set network=
network = (uncommitted)
aie-7110j:configuration services snmp> set syscontact="Paul Monday"
syscontact = Paul Monday (uncommitted)
aie-7110j:configuration services snmp> set trapsinks=
trapsinks = (uncommitted)
aie-7110j:configuration services snmp> commit
aie-7110j:configuration services snmp> enable
aie-7110j:configuration services snmp> show
<status> = online
community = public
network =
syscontact = Paul Monday
trapsinks =

From the appliance perspective we are now up and running!

Get the MIBs and Install Them

As previously mentioned, all of the MIBs that are unique to the Sun Storage 7000 Appliance are also distributed with the appliance. Go to the Help Wiki and download them, then move them to the appropriate location for monitoring.

On the Solaris system I'm using, that location is /etc/sma/snmp/mibs. Be sure to browse the MIB for appropriate tables or continue to look at the Help Wiki as it identifies relevant OIDs that we'll be using below.

Walking and GETting Information via the MIBs

Using standard SNMP operations, you can retrieve quite a bit of information. As an example from the management station, we will retrieve a list of shares available from the system using snmpwalk:

-bash-3.00# ./snmpwalk -c public -v 2c isv-7110h sunAkShareName
SUN-AK-MIB::sunAkShareName.1 = STRING: pool-0/MMC/deleteme
SUN-AK-MIB::sunAkShareName.2 = STRING: pool-0/MMC/data
SUN-AK-MIB::sunAkShareName.3 = STRING: pool-0/TestVarious/filesystem1
SUN-AK-MIB::sunAkShareName.4 = STRING: pool-0/oracle_embench/oralog
SUN-AK-MIB::sunAkShareName.5 = STRING: pool-0/oracle_embench/oraarchive
SUN-AK-MIB::sunAkShareName.6 = STRING: pool-0/oracle_embench/oradata
SUN-AK-MIB::sunAkShareName.7 = STRING: pool-0/AnotherProject/NoCacheFileSystem
SUN-AK-MIB::sunAkShareName.8 = STRING: pool-0/AnotherProject/simpleFilesystem
SUN-AK-MIB::sunAkShareName.9 = STRING: pool-0/default/test
SUN-AK-MIB::sunAkShareName.10 = STRING: pool-0/default/test2
SUN-AK-MIB::sunAkShareName.11 = STRING: pool-0/EC/tradetest
SUN-AK-MIB::sunAkShareName.12 = STRING: pool-0/OracleWork/simpleExport

Next, I can use snmpget to obtain a mount point for the first share:

-bash-3.00# ./snmpget -c public -v 2c isv-7110h sunAkShareMountpoint.1
SUN-AK-MIB::sunAkShareMountpoint.1 = STRING: /export/deleteme
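
If you want to feed this output into a script rather than read it by eye, the lines are regular enough to split apart. Here is a minimal sketch in JavaScript, assuming the output has the same "MIB::object.index = TYPE: value" shape as the listings above. The function names parseSnmpLine and parseSnmpOutput are my own for illustration and are not part of any SNMP package.

```javascript
// Parse one line of snmpwalk/snmpget output, for example
//   SUN-AK-MIB::sunAkShareName.1 = STRING: pool-0/MMC/deleteme
// Returns null for lines that do not have that shape.
function parseSnmpLine(line) {
  var match = /^([\w-]+)::(\w+)\.(\S+) = (\w+): ?(.*)$/.exec(line.trim());
  if (!match) return null;
  return {
    mib: match[1],     // e.g. SUN-AK-MIB
    object: match[2],  // e.g. sunAkShareName
    index: match[3],   // e.g. 1
    type: match[4],    // e.g. STRING
    value: match[5]    // e.g. pool-0/MMC/deleteme
  };
}

// Parse a whole block of output, dropping lines that do not match.
function parseSnmpOutput(text) {
  var records = [];
  var lines = text.split("\n");
  for (var i = 0; i < lines.length; i++) {
    var record = parseSnmpLine(lines[i]);
    if (record !== null) records.push(record);
  }
  return records;
}

// Sample lines copied from the walk above.
var sample = [
  "SUN-AK-MIB::sunAkShareName.1 = STRING: pool-0/MMC/deleteme",
  "SUN-AK-MIB::sunAkShareName.2 = STRING: pool-0/MMC/data",
  "SUN-AK-MIB::sunAkShareName.3 = STRING: pool-0/TestVarious/filesystem1"
].join("\n");

var shares = parseSnmpOutput(sample);
for (var k = 0; k < shares.length; k++) {
  console.log(shares[k].index + "," + shares[k].value);
}
```

The index from each record is what you would append to another column name, such as sunAkShareMountpoint, to look up the matching row.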

It is also possible to get a list of problems on the system identified by problem code:

-bash-3.00# ./snmpwalk -c public -v 2c isv-7110h sunFmProblemUUID
SUN-FM-MIB::sunFmProblemUUID."91e97860-f1d1-40ef-8668-dc8fb85679bb" = STRING: "91e97860-f1d1-40ef-8668-dc8fb85679bb"

And then turn around and retrieve the associated knowledge article identifier:

-bash-3.00# ./snmpget -c public -v 2c isv-7110h sunFmProblemCode.\"91e97860-f1d1-40ef-8668-dc8fb85679bb\"
SUN-FM-MIB::sunFmProblemCode."91e97860-f1d1-40ef-8668-dc8fb85679bb" = STRING: AK-8000-86

The FM-MIB does not contain information on severity, but using the problem code I can SSH into the system and retrieve that information:

isv-7110h:> maintenance logs select fltlog select uuid="91e97860-f1d1-40ef-8668-dc8fb85679bb"
isv-7110h:maintenance logs fltlog entry-005> ls
timestamp = 2009-12-15 05:55:37
uuid = 91e97860-f1d1-40ef-8668-dc8fb85679bb
desc = The service processor needs to be reset to ensure proper functioning.
type = Major Defect

isv-7110h:maintenance logs fltlog entry-005>

Take time to inspect the MIBs through your MIB Browser to understand all of the information available. I tend to shy away from using SNMP for getting system information and instead write scripts and workflows, as much more information is available directly on the system. I'll cover this in a later article.

Receive the Traps

Trap receiving on Solaris is a piece of cake, at least for demonstration purposes. What you choose to do with the traps is a whole different process. Each tool has its own trap monitoring facilities that will hand you the fields in different ways. For this example, Solaris just dumps the traps to the console.

Locate the "snmptrapd" binary on your Solaris system and start monitoring:

-bash-3.00# cd /usr/sfw/sbin
-bash-3.00# ./snmptrapd -P
2009-12-16 09:27:47 NET-SNMP version 5.0.9 Started.

From there you can wait for something to go wrong with your system or you can provoke it yourself. Fault Management can be a bit difficult to provoke intentionally, since things one thinks would provoke a fault are actually administrator activities. Pulling a disk drive is very different from a SMART drive error on a disk drive. Similarly, pulling a Power Supply is different from tripping over a power cord and yanking it out. The former is not a fault since it is a complex operation requiring an administrator to unseat the power supply (or disk), whereas the latter occurs out in the wild all the time.

Here are some examples of FM traps I've received through this technique using various "malicious" techniques on a lab system ;-)

Here is an FM Trap when I "accidentally" tripped over a power cord in the lab. Be careful when you do this so you don't pull the system off the shelf if it is not racked properly (note that I formatted this a little bit from the raw output):

2009-11-16 12:25:34 isv-7110h []:
DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (1285895753) 148 days, 19:55:57.53
SNMPv2-MIB::snmpTrapOID.0 = OID: SUN-FM-MIB::sunFmProblemTrap
SUN-FM-MIB::sunFmProblemUUID."2c7ff987-6248-6f40-8dbc-f77f22ce3752" = STRING: "2c7ff987-6248-6f40-8dbc-f77f22ce3752"
SUN-FM-MIB::sunFmProblemCode."2c7ff987-6248-6f40-8dbc-f77f22ce3752" = STRING: SENSOR-8000-3T
SUN-FM-MIB::sunFmProblemURL."2c7ff987-6248-6f40-8dbc-f77f22ce3752" = STRING:

Notice again that I have a SunFmProblemUUID that I can turn around and shell into the system to obtain more details (similarly to what was shown in the last section). Again, the next article will contain an explanation of Alerts. Using the AK MIB and Alerts, we can get many more details pushed out to us via an SNMP Trap, and we have finer granularity as to the alerts that get pushed.
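
To make that round trip scriptable, a trap handler only needs to pull the UUID and problem code out of the varbinds. The following is a rough sketch, assuming the trap text looks like the snmptrapd output shown above. parseTrap is a hypothetical helper of mine, and the CLI line it builds simply mirrors the fltlog command used in the previous section.

```javascript
// Extract the interesting fields from the text of one FM trap.
// Lines that are not "MIB::name = TYPE: value" (such as the timestamp
// header) are skipped.
function parseTrap(text) {
  var trap = {};
  var lines = text.split("\n");
  for (var i = 0; i < lines.length; i++) {
    var m = /^([\w-]+)::([^\s=]+) = (?:(\w+): ?)?(.*)$/.exec(lines[i].trim());
    if (!m) continue;
    var name = m[2];
    var value = m[4].replace(/^"|"$/g, "");  // strip surrounding quotes
    if (name === "snmpTrapOID.0") trap.trapOid = value;
    else if (name.indexOf("sunFmProblemUUID.") === 0) trap.uuid = value;
    else if (name.indexOf("sunFmProblemCode.") === 0) trap.code = value;
    else if (name.indexOf("sunFmProblemURL.") === 0) trap.url = value;
  }
  return trap;
}

// The power cord trap from above.
var trapText = [
  "2009-11-16 12:25:34 isv-7110h []:",
  "DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (1285895753) 148 days, 19:55:57.53",
  "SNMPv2-MIB::snmpTrapOID.0 = OID: SUN-FM-MIB::sunFmProblemTrap",
  "SUN-FM-MIB::sunFmProblemUUID.\"2c7ff987-6248-6f40-8dbc-f77f22ce3752\" = STRING: \"2c7ff987-6248-6f40-8dbc-f77f22ce3752\"",
  "SUN-FM-MIB::sunFmProblemCode.\"2c7ff987-6248-6f40-8dbc-f77f22ce3752\" = STRING: SENSOR-8000-3T",
  "SUN-FM-MIB::sunFmProblemURL.\"2c7ff987-6248-6f40-8dbc-f77f22ce3752\" = STRING:"
].join("\n");

var trap = parseTrap(trapText);

// Build the CLI command that retrieves the matching fault log entry.
var lookupCommand = 'maintenance logs select fltlog select uuid="' + trap.uuid + '"';
console.log(trap.code);
console.log(lookupCommand);
```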

Here, I purchased a very expensive fan stopper-upper device from a fellow tester. It was quite pricey, and it turns out it is also known as a "Twist Tie". Do NOT do this at home. Seriously, the decreased air flow through the system can cause hiccups in your system.

DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (1285889746) 148 days, 19:54:57.46
SNMPv2-MIB::snmpTrapOID.0 = OID: SUN-FM-MIB::sunFmProblemTrap
SUN-FM-MIB::sunFmProblemUUID."cf480476-51b7-c53a-bd07-c4df59030284" = STRING: "cf480476-51b7-c53a-bd07-c4df59030284"
SUN-FM-MIB::sunFmProblemCode."cf480476-51b7-c53a-bd07-c4df59030284" = STRING: SENSOR-8000-26
SUN-FM-MIB::sunFmProblemURL."cf480476-51b7-c53a-bd07-c4df59030284" = STRING:

You will receive many, many other traps throughout the day including the Enterprise MIB letting us know when the system starts up or any other activities.

Wrap it Up

In this article, I illustrated enabling the SNMP Service on the Sun Storage 7000 Appliance via an SSH session. I also showed some basic MIB walking and traps that you'll receive once SNMP is enabled.

This is really just the "start" of the information we can push through the SNMP pipe from a system. In the next article I'll show how to use Alerts on the system with the SNMP pipe so you can have more control over the events on a system that you wish to be notified about.

Thursday Dec 10, 2009

Monitoring the Sun Storage 7000 Appliance

Over the past several months I've been working on integrating our Sun Storage 7000 Appliances into monitoring products from other companies. The monitoring work I'm doing is a combination of software writing (via a plug-in for a data center monitoring product that will see its release in conjunction with our next Sun Storage 7000 Appliance Software Release) and "consulting" with our customers directly about monitoring the appliances they install after purchase.

The Sun Storage 7000 Appliance comes with a variety of mechanisms for monitoring:
- SNMP (via several different MIBs using traps or GETs)
- Email Alerts
- Remote Syslog

A variety of software and hardware faults delivered internal to the system as Fault Management Architecture (FMA) events get pushed to the monitoring environment via the above mechanisms.

As valuable as these capabilities are, customers always have more advanced monitoring needs that require customization of the environment. Some customers want to tune the information available for significant digits, get more significant digits than we surface in the CLI, or gather data from our industry leading analytics capabilities delivered with the appliance. Some may want to integrate with an ITIL-style Configuration Management Database, others may want to create a billing system based on user capacity and accounting for levels of service (guaranteed space, thin-provisioned space, etc...).

All of these customizations can easily be achieved using simple SSH navigation of the appliance's environment or more advanced manipulation of the environment using the embedded JavaScript environment on each Sun Storage 7000 Appliance via scripts or Workflows.

Over the next few weeks, I'm going through my Email Archives (not a pretty sight to be honest) and I'm going to mine the greatest hits as I've sent out information to specific audiences on monitoring boxes and customizing the environment based on specific monitoring application use cases. Other articles will be focused on how I achieved the monitoring environment for the upcoming plug-in that will hit the download center with the next software release.

With all of that lead-in, I am going to kick off my monitoring guidance with what I tell everyone right out of the chute, "Use the Built-in Sun Storage 7000 Appliance Help Wiki to get up to speed on these topics and get the latest information". After all, this blog post will age with each release of the Sun Storage 7000 Appliance whereas the Help Wiki is updated with each release.

On a running Sun Storage 7000 Appliance, use the following URLs (substituting the address of the appliance where I put [hostname]):

  • SNMP - https://[hostname]:215/wiki/index.php/Configuration:Services:SNMP
  • Alerts - https://[hostname]:215/wiki/index.php/Configuration:Alerts
  • Scripting - https://[hostname]:215/wiki/index.php/User_Interface:CLI:Scripting
  • Workflows - https://[hostname]:215/wiki/index.php/Maintenance:Workflows

You can download the latest Sun Storage 7000 Appliance Storage Simulator and follow these instructions as well.

In case the pages have moved, be sure to use the Search feature in the Help Text that comes with the Wiki.

There are always cases where customers want more hardcore examples of each of the above, tailored to their environments, or a slightly different take on learning these topics. And that, my friends, is what the next few weeks will be about. I'll give more examples and approaches, similar to what I did with my Fun with the FMA and SNMP article.

Tuesday Aug 04, 2009

Sun Storage 7000 as an Administrator Development Platform

The Sun Storage 7000 Family of Appliances breaks ground in manageability and transparency through an amazing amount of analytics information provided to administrators as well as a highly customizable and extensible management environment that resides on the system. The "Workflow", delivered in the latest release of appliance software, is of particular interest to those of us responsible for "integrating" the Sun Storage 7000 into a management ecosystem, bundling pieces of management logic for use by our peers and reproducing management logic (such as configuration and environmental setup) on several systems at a time.

A workflow is a parameterized piece of logic that is uploaded to a Sun Storage 7000 where it remains resident and is then run via the BUI, CLI or remotely via a shell. The logic within the workflow is programmed in JavaScript (resident on the Sun Storage 7000) and interacts with the system's management shell via "run" commands or built-ins that interact with the current administrative context.

A workflow can do anything that an administrator could do via the CLI, but in a nicely bundled and parameterized way. Here are a few things I've done with workflows:

  • gather information about the appliance and reformat it to make it digestible by a higher-level tool
  • retrieve sets of analytics data and turn them into different sized chunks (instead of 1 second interval give me a 60 second interval as an average as well as the min and max during the interval) and reformat it to make it easy to digest
  • manage the lifecycle of shares (create, manage settings and delete) that are common across appliances
  • manage network settings
  • create a set of worksheets on every appliance in the network
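
As an illustration of the second item, here is a small sketch of the re-chunking step on its own: it takes an array of one-second samples and reduces it to one average, minimum and maximum per interval. The summarize function and the sample data are hypothetical and run in any JavaScript engine. Fetching the real samples from the appliance's analytics is a separate step not shown here.

```javascript
// Reduce per-second samples to one summary per chunk of chunkSize samples.
function summarize(samples, chunkSize) {
  var chunks = [];
  for (var start = 0; start < samples.length; start += chunkSize) {
    var chunk = samples.slice(start, start + chunkSize);
    var sum = 0;
    var min = chunk[0];
    var max = chunk[0];
    for (var i = 0; i < chunk.length; i++) {
      sum += chunk[i];
      if (chunk[i] < min) min = chunk[i];
      if (chunk[i] > max) max = chunk[i];
    }
    chunks.push({
      start: start,                 // offset of the first sample in the chunk
      average: sum / chunk.length,  // the last chunk may be shorter
      min: min,
      max: max
    });
  }
  return chunks;
}

// Two minutes of made-up one-second samples: 0, 1, 2, ... 119.
var samples = [];
for (var s = 0; s < 120; s++) samples.push(s);

var summary = summarize(samples, 60);
for (var c = 0; c < summary.length; c++) {
  console.log([summary[c].start, summary[c].average, summary[c].min, summary[c].max].join(","));
}
```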

The opportunities for automation are endless, only bounded by the needs of the administrator in their efforts to integrate the appliances within the management ecosystem.

There is substantial documentation on the appliance's Help Wiki, but for clarity, here is a very simple workflow that will list the attribute of a filesystem that is given as input to the workflow:

  • Input: attribute name (same as the attribute in the CLI)
  • Output: CSV format: project,sharename,attribute (one line for each share)
  • Behavior Notes: a listed attribute that is not valid will return NA in the column (this could be moved to parameter verification but will serve to illustrate exception handling). Also, some properties return empty values because the value was actually inherited from the project context.

Since this is a relatively "short" example, I will simply put the code here with comments and then add additional information afterwards. Note the use of JavaScript functions (such as printToString) as well as the most important element, the definition of the variable "workflow".

/* The printed headers; a third one is added with the property name.
   The two labels here are assumed from the output format described above. */
var headerList = new Array("project", "sharename");

/* A function to print the array into a string for display */
function printToString(csvToPrint) {
    var csvAsString = "";
    for (var i = 0; i < csvToPrint.length; i++) {
        // each row is an array; concatenating it joins its fields with commas
        csvAsString = csvAsString + csvToPrint[i];
        // do not finish with an end of line marker
        if (i != csvToPrint.length - 1) csvAsString = csvAsString + "\n";
    }
    return csvAsString;
}

/* This is a required structure for the workflow, it identifies the name, parameters
   and the function to execute when it is run */
var workflow = {
    name: 'Get Filesystem Attribute',
    origin: 'Sun Microsystems, Inc.',
    description: 'Prints a Property for all Shares',
    parameters: {
        property: {
            label: 'Filesystem Property',
            type: 'String'
        }
    },
    execute: function (params) {
        // prepare the output arrays; row 0 holds the headers
        var csvContents = new Array();
        var currentRow = 0;
        headerList[2] = params.property;
        csvContents[0] = headerList;
        currentRow++;

        // go to the root context to start navigation, then into the
        // shares context, where the projects are listed
        run('cd /');
        run('shares');

        // get a list of all of the projects on the system
        var projects = list();

        // navigate through each project
        for (var i = 0; i < projects.length; i++) {
            run('select ' + projects[i]);

            // get a list of all shares
            var shares = list();

            // go into the context of each share
            for (var j = 0; j < shares.length; j++) {
                run('select ' + shares[j]);
                var filesystem = true;
                var mountPoint = "";
                try {
                    mountPoint = get('mountpoint');
                } catch (err) {
                    // will end up here if "mountpoint" does not exist, not a filesystem
                    filesystem = false;
                }
                if (filesystem) {
                    var currentRowContents = new Array();
                    currentRowContents[0] = projects[i];
                    currentRowContents[1] = shares[j];
                    try {
                        var propertyValue = get(params.property);
                        currentRowContents[2] = "" + propertyValue;
                    } catch (err) {
                        // the requested property is not valid for this share
                        currentRowContents[2] = "NA";
                    }
                    csvContents[currentRow] = currentRowContents;
                    currentRow++;
                }
                // back out of the share
                run('cd ..');
            }
            // back out of the project
            run('cd ..');
        }

        var newCsvAsString = printToString(csvContents);

        return (newCsvAsString);
    }
};

While the bulk of the example is standard JavaScript, the workflow structure is where there must be adherence. Here are the important properties:

  • name - The name that the workflow will be identified by within the BUI or CLI
  • origin - The author of the workflow, can also be used to minimize name collisions
  • description - A description of the contents of the workflow, displayed in the BUI or CLI
  • parameters - A list of parameters with types (the types supported are listed in the documentation)
  • execute - The function that gets executed when the workflow is run (there are more advanced ways of identifying the execution code than are shown here)

The code itself interacts with the system to get a list of the projects on the system, then a list of the shares within each project. The mountpoint property is ONLY available on filesystems, so if reading it raises a property error we know that we do not have a filesystem and skip processing of it (it is most likely an iSCSI LUN).
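
The "is it a filesystem?" test can be tried away from the appliance by stubbing the three built-ins the workflow relies on. The sketch below is only that: fakeAppliance, the stubbed run, list and get, and the share names are all invented for illustration, and the real built-ins support far more than these stubs. The stub's root context stands in for wherever the projects are listed on a real system.

```javascript
// Hypothetical in-memory stand-in for an appliance: project -> share -> properties.
var fakeAppliance = {
  "default": {
    "test": { mountpoint: "/export/test" },
    "lun0": { volsize: "10G" }  // no mountpoint, so it behaves like a LUN
  }
};

var context = [];  // current position: [], [project] or [project, share]

// Minimal stubs for the built-ins used by the workflow.
function run(command) {
  if (command === "cd /") context = [];
  else if (command === "cd ..") context.pop();
  else if (command.indexOf("select ") === 0) context.push(command.slice(7));
  else throw new Error("unsupported command: " + command);
}

function list() {
  if (context.length === 0) return Object.keys(fakeAppliance);
  return Object.keys(fakeAppliance[context[0]]);
}

function get(property) {
  var share = fakeAppliance[context[0]][context[1]];
  if (!(property in share)) throw new Error("no such property: " + property);
  return share[property];
}

// A share counts as a filesystem only if reading "mountpoint" does not throw.
function isFilesystem() {
  try {
    get("mountpoint");
    return true;
  } catch (err) {
    return false;
  }
}

// Walk every project and share, keeping only the filesystems.
function filesystemNames() {
  var names = [];
  run("cd /");
  var projects = list();
  for (var i = 0; i < projects.length; i++) {
    run("select " + projects[i]);
    var shares = list();
    for (var j = 0; j < shares.length; j++) {
      run("select " + shares[j]);
      if (isFilesystem()) names.push(projects[i] + "/" + shares[j]);
      run("cd ..");
    }
    run("cd ..");
  }
  return names;
}

var found = filesystemNames();
console.log(found.join("\n"));
```

Using the exception as the signal keeps the loop simple, at the cost of treating any failure to read mountpoint as "not a filesystem".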

To upload the workflow, cut/paste the text above and put it in a file. Log into a Sun Storage 7000 Appliance with the latest software and go to Maintenance / Workflows. Click the "+" sign to add a workflow and identify the location of the file. The syntax is error checked on upload, then you will see it listed. Workflows can also be uploaded from the CLI.

Here is what a run of the workflow from the CLI looks like:

isv-7110h:maintenance workflows> ls
showhidden = false


workflow-004 Get Filesystem Attribute root false Sun Microsystems, Inc.

isv-7110h:maintenance workflows> select workflow-004
isv-7110h:maintenance workflow-004> ls
name = Get Filesystem Attribute
description = Prints a Property for all Shares
owner = root
origin = Sun Microsystems, Inc.
setid = false

isv-7110h:maintenance workflow-004> execute
isv-7110h:maintenance workflow-004 execute (uncommitted)> ls
property = (unset)

isv-7110h:maintenance workflow-004 execute (uncommitted)> set property=space_total
property = space_total
isv-7110h:maintenance workflow-004 execute (uncommitted)> commit

isv-7110h:maintenance workflow-004>

While the example is simple, hopefully it illustrates that this is the start of workflow capabilities, not the entirety of them. The workflow can create management structures (like new shares and worksheets), delete them, modify them, and even enable and disable services.

Workflows make the Sun Storage 7000 an Administrator Development Platform. Try it out in the Sun Unified Storage Simulator if you don't have an appliance at your fingertips!

Tuesday Jul 07, 2009

Vertical and Horizontal Mobility

As far back as I can remember I've been fascinated by the ability to move from location to location to get my work done as well as being able to move from device to device. Horizontal mobility (the ability to move across geography) via virtually ubiquitous wireless Internet access is a reality.

This blog post originated in my house, continued between programming sessions at the Highlands Ranch Public Library and then over lunch at The Corner Bakery with only periodic "sleep" modes on my laptop disrupting the work. Horizontal mobility and the ability to work and communicate from anywhere in the country (and often overseas) is so effective, our customers, peers and even our management hierarchies in many industries may not even know where you are geographically located day to day. Have you ever had a conversation start with "Where are you working from?".

Horizontal Mobility shouldn't be news to anyone. I even carry a 3G modem with me so I can work in the infrequent times that I can't find a wireless network. But that is getting more and more rare; consider that I had two "Facebook Friends" update their status from an airplane this week.

While horizontal mobility is in our lives to stay, what about Vertical Mobility ... what I would call the ability to move from device to device to achieve a task or handle content? I should be careful to note, when I say device to device, I mean wildly different classes of devices, such as a phone and a laptop, or an Amazon Kindle, a phone and a laptop, or a TV and a computer and a phone. My assumption several years ago, almost a decade now, was that we would move away from the desktop computers and at some point our kids would simply have purpose-built devices giving access to appropriate content for the device ... I jokingly called this a "Fully Integrated Lifestyle" to a few friends the other week but the more I think about it, that is truly what Vertical Mobility is all about.

Are we there yet?

Absolutely. Over the past few weeks I bought an Amazon Kindle and an iPhone so I could revisit the mobile lifestyle. I last visited it about 3 years ago as I architected a Sun Storage Management Portal. At that time, I was playing with re-rendering portlets from HTML / XML to WML. Basically, I would create alternate views of information based on the device being used to view the information (MVC, Model View Controller).

Today, our devices are so powerful and usability of the devices has hit such a high level that alternate views of information are often not required (though I would argue that good vertical mobility does take into account the device). The image above is a fully dynamic view of our Sun Storage 7000 Appliance taken from my iPhone VPN'd to work.

Horizontal and Vertical Mobility

What other devices can be combined? My favorite for vacation is taking TV shows from my Tivo (like Spongebob Squarepants) and moving them to the iPod so my son can watch them while we are stuck in an airport.

The Amazon Kindle has even joined the party. I purchase books on my Kindle and can read them on my iPhone via the Amazon Kindle downloadable iPhone application.

It is truly an amazing world we live in. What's the infrastructure at work here? The cloud is here, the network continues to be built out through our carriers and most of all, content providers that are becoming more aware of the opportunities that abound in content delivery to these devices and, of course, standards around content structure, authentication and authorization, and every other technology that gives us Horizontal and Vertical Mobility and the amazing ability to live a Fully Integrated Lifestyle.

Ok, time's up ... back to my design document ... I'll work on it from my Mac today :-)

Friday Jun 20, 2008

When the clouds disappear...not always a sunny day

I build, maintain and pay for hosting for my friend's charity web site, Play for a Heart. For the past several years I've had a host provider where I would deploy the very simple web site I constructed for her. The host had become my "cloud" in the stormy skies of Cloud Computing.

This cloud kept drifting along, the price was right, it was somewhat reliable, it was ... easy.

I went to the web site on a Saturday morning, it was gone. Somewhere over the course of the last few months the company had turned into a shell of the host provider it formerly was. There was no tech response over the weekend and the forums were more or less ghost towns. I made a post and the only responses were "Leave this host as fast as you can". The outage continued on through the weekend and extended to a large percentage of the host's own sales and marketing sites.

Finally, late Sunday I couldn't take it anymore, I pulled the plug and moved to GoDaddy. The move had nothing to do with Danica Patrick. Here was my very simple logic:

  • Java Support / PHP Support / MySQL Support
  • Able to host multiple domains with a single hosting account
  • Resilient enough to support the onslaught after a Super Bowl commercial
  • Good recommendations and community
  • Great price

And I switched...

Lucky for me I made my friend's site as basic as I could.

  • Very simple HTML
  • Slideshow objects embedded from another Cloud Application (SmugMug), so that the links moved right over (and I only put these on after she had enough sponsors that I couldn't figure out how to keep the sponsor page clean)
  • No applications embedded directly from my low-budget hosting solution
  • Domain Names purchased from Yahoo and GoDaddy (this became a HUGE win as the host records would have been locked had I purchased the domain names with this smaller company)

My web site was moved to GoDaddy and back up and running in, literally, 1.5 hours and my exit was complete from my previous host.

Still, the entire experience left me very shaken over Cloud Computing. I've come up with the following set of thoughts when it comes to attaching my digital life to the clouds:

  • Many companies literally own the information you create or have very liberal rights to that information (Always read the License Agreements)
  • Many companies have no exit strategy for your information (once your data is captured or created in the online application it cannot be extracted...this is especially the case in social networking infrastructure)
  • Because of the nature of the cloud, you have no guarantees that many companies that host the applications you depend on are even viable (check the business model and financials if you are tying your life to the cloud)
  • Several companies' entire business models are centered on analyzing the information you give to them and monetizing it (in many cases "personalization" in your eyes is really "targeting and demographics" in the company's eyes)
  • Standard Platforms, Standard APIs, Standard Information Models are at the heart of a successful cloud (not fog), this allows you a better chance at having tools to import / export information and interact with that information...and in the case of building an application, it is critical that these standards are the core of the application so that you can get out of your cloud as quickly as you got into it and at the first sign of turbulence
  • In lieu of standards (even de facto), Open APIs, Open Source, Open Architecture, and a good open license can be a huge help...especially with a robust developer community. This assures infrastructure can live beyond the life of any individual cloud if it should disappear.
  • Buyer beware

Well...I'm a huge, huge advocate of Cloud Computing. Our own site, Amazon EC2 and S3, they are all leading the wave...and in 10 years it will just be the way it is done. But until then, there will certainly be some growing pains, probably more in the small-company space than in the enterprise space...but they will be there.

Standards groups, de facto standards, open source cloud infrastructure, entry and exit for information, information licensing, information security and transparent motivation for collection of your information will go a long way to determining whether there is longevity in the sunny skies for cloud computing.

Friday Apr 25, 2008

Paul [Hearts] eLOM

For the past week or so I've been on one of those "get this done" projects from my boss. It's tough to "get this done" when you are commuting though (or when the lab is the same temperature and wind speed as the top of Loveland Pass). My project involves 5 Sun Fire x4150s (and will involve another set of servers next week).

I did have to go into the office for a few days to pull memory in and out of them, do some DVD swapping on the machines, check cables, and (of course) get the 8 x 73GB 15K SAS drives swapped in (all of my machines need the same drives and speed).

During the setup, Terry Hill allocated IP addresses for the service processor eLOM (embedded Lights Out Manager) and configured the IP addresses within the BIOS. Keep in mind when you look at the back of a Sun system with eLOM, you see the standard ethernet ports (typically 4 on this size box), but you also see an additional port that's offset from the others and labeled "Service Processor". It is next to the Serial Console port. You can see it here, on the left side of the back of the system.

(The x4150 eLOM documentation has more details about configuring eLOM).

Once the network is configured, you type in the web address of your x4150 service processor ( and up pops the eLOM Web User Interface. You can do this from anywhere (even home :-)

Now that I'm in the eLOM, I can do all sorts of things. I primarily use the Web interface. With it I can do all of the power resets I need to do (including having it come straight up to BIOS). I can also launch a console through the magic of Java. When you press the button to get a console redirection, a tiny Java Web Start application comes down to your machine and connects back to the server. With it I can do all sorts of cool things like handle BIOS, take console options on my Operating Systems (I use Solaris, OpenSolaris, RHEL and Windows on my machines), and so on... I am also able to monitor event logs and ensure the components are operating as I expect (heck, I can even flash LEDs to say Hi to all of those frozen lab folks who haven't set up their eLOM).

My biggest complaint with eLOM isn't even with eLOM, it's with Mac OS X. I have to tell you, I am not happy with their Java support at all. The eLOM console redirection doesn't work (strange Apple-only Java problem). I went to to get a Mac OSX updated luck, have to go to Apple. Go to Apple, they are hung up on Java 1.5. I found the Java Platform Standard Edition 6 Press Announcement, it's been out since December 2006.

I found a Java Platform Standard Edition 6 Developer Release at Apple but it only runs on Leopard and I haven't upgraded (and I'm not). So, I'm stuck with the Mac...can't use it in our labs to support my systems. I know people love their Macs but some days it reminds me a little too much of OS/2. Great promise with a lack of application support.

I bounced to another platform (with its own Java 1.5 that actually works); the JVMs on my Solaris machines work fine, as do the ones on my Windows laptop and home systems.

In the end...lovin' the eLOM!

Monday Apr 21, 2008


Have you ever been attacked by a plastic grocery bag on the way to work? One of those ones that float through traffic freaking you out? And then it seems to somehow manage to go under your car and pop out the other side only to terrorize yet another driver! Kai and I did some calculations with the bags.

Each bag is a bit over a 15 inch square.

If I didn't use my cloth bags and reuse the plastic "overspill" bags, I would use about 10 per week (why they pack a single bag of potato chips in another plastic bag all by itself is pretty far beyond my comprehension). So, I work my butt off to save at least 5 plastic bags a week (usually I can save all 10). It's probably reasonable that every family could cut about 5 bags off their plastic bag diet if they haven't already.

Saving my 5 bags each week saves about 260 bags per year (5 bags * 52 weeks). Laid end to end, that's a strip about 3,900 inches long (260 bags * 15 inches) and 15 inches wide, which is about 325 feet (3,900 inches / 12 inches per foot). Each year saves about enough bags to go across an American football field and an end zone.

(Yes, that's the high school rugby team playing on the football field...I know they don't need a bunch of plastic grocery bags flying across the field...and they do, quite often).

Let's say all of the families in Highlands Ranch (about 20,000) stopped using about 5 bags a week too. That would be 325 feet * 20,000 families = 6,500,000 feet (15" wide). At 5,280 feet per mile, that's about 1,231 miles of bags (15" wide).
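For anyone who wants to check the bag math above, here is a minimal sketch in Java. The class and method names (BagStrip, feetPerFamilyPerYear, milesPerYear) are made up for illustration, and the inputs are just the rough estimates from this post (5 bags a week, 15 inch bags, about 20,000 families).

```java
// Sketch of the bag-strip arithmetic; all names here are hypothetical.
final class BagStrip {
    static final double INCHES_PER_FOOT = 12.0;
    static final double FEET_PER_MILE = 5280.0;

    /** Length in feet of one family's saved bags for a year, laid end to end. */
    static double feetPerFamilyPerYear(int bagsSavedPerWeek, double bagLengthInches) {
        int bagsPerYear = bagsSavedPerWeek * 52;
        return bagsPerYear * bagLengthInches / INCHES_PER_FOOT;
    }

    /** Length in miles of the strip saved by a number of families in a year. */
    static double milesPerYear(int families, int bagsSavedPerWeek, double bagLengthInches) {
        return families * feetPerFamilyPerYear(bagsSavedPerWeek, bagLengthInches) / FEET_PER_MILE;
    }

    public static void main(String[] args) {
        // Estimates from the post: 5 bags a week, 15 inch bags, 20,000 families.
        System.out.printf("One family: %.0f feet per year%n", feetPerFamilyPerYear(5, 15.0));
        System.out.printf("20,000 families: %.0f miles per year%n", milesPerYear(20_000, 5, 15.0));
    }
}
```

Changing the bags-per-week or family count makes it easy to try the same estimate for another town.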

The folks of Highlands Ranch could save a strip of plastic bags that runs from Denver to San Francisco (as the Google Map flies).


Well, that's a lot of bags. Who knows the countless number of car accidents that would be avoided if we all saved a few plastic bags. Reuse the ones you have and get, recycle them, and in so many cases...don't even use them in the first place.

While thinking about the surface area that these bags cover is convenient and somewhat astounding, also keep in mind:

  • Energy is used to produce these bags
  • The bags are comprised of materials like plastics and oils
  • The bags take up space in landfills (not much...but it is space)
  • The bags won't decompose for 100s or 1000s of years

Next time just think: "Do I really need that bag or can I just carry my sack of potatoes?"

Monday Oct 29, 2007

Street Direction Savvy vs. Being a Good Driver

Whenever I'm doing architecture or talking to managers and such about projects, I always have a collection of metaphors in my head. I've always wanted to pull this particular metaphor out at a party but I've never had the right moment, so I decided to put a blog together about it since it is one of the architecture tools I carry around in my toolbox.

As with all metaphors, there is the danger you take it too far so...let me know if I've done that :-)

In practicing architecture and implementation, and especially in discussing project planning with people, I encounter situations where a distinction is necessary between Street Direction Savvy (best thing I could come up with) and what I could probably term as simply Being a Good Driver. What's the difference, you may ask?

In the real world, we spend a lot of time teaching our kids the mechanics of driving things: a bike, a skateboard, skis, and eventually a car (by the way, I believe the legal driving age should be 31). The parallel in programming is a person that has spent a lot of time learning the mechanics of a language or a specific part of a platform: Java 2 Enterprise Edition, Web Tier User Interfaces with AJAX, Database Engineering (by the way, I believe the legal age for use of Java 2 Enterprise Edition should also be 31...IT'S A JOKE).

(Image linked from eHow)

One thing you quickly realize when you start to drive is that the technical ability to drive something does not make you an expert in getting from Point A to Point B. What's worse, the shortest path from Point A to Point B is almost always mired in complexities such as what you are driving (using a bike to get between points may yield an entirely different route as compared to a car) and what time of day it is (traffic often sends you onto side streets at particular times in the day). Further, what you know how to drive often forms your decisions on how to get between the points (knowing how to drive a bike and then learning how to drive a car yields some remarkably inefficient ways to get between points...often including driving through a neighbor's yard, which is frowned upon when you have 4 wheels and 2,000 pounds added to your vehicle).

As a result, you develop street savvy that goes with the context of what you are driving. Bringing this back to engineering, I like to think that architecture is all about street savvy and the better you know how the architecture is to be applied (how to drive) the deeper an architecture can go. Not only can I provide a map to the engineering teams, but I can also tell them specific platform decisions rather than just wave some boxes and requirements of those boxes at them.

Interestingly, some people never learn how to drive things but they have remarkable street savvy. Other times we are asked to have street savvy but we don't know how to drive what the other person is driving...I was asked by one of the Project Blackbox drivers how to get from Point A to Point B and I was like "are you driving your semi or your car?".

This happens all the time as an architect. You end up applying the knowledge of something else that you learned how to drive that is similar. For example, an architect with experience in Swing and Designing User Interfaces may have a valid estimate for a team that is building an AJAX user interface. More important, I feel, is that the architect has to be sure to inform people of the differences in various tiers and platforms. Many times you are asked about an individual and whether they could fill a role on another team. People with driving savvy in one tier are often not useful in another tier if they haven't learned quite a bit about street savvy. Someone that has spent their entire life writing scripts in Perl may not be helpful building an AJAX Web User Interface, yet it is a time-honored tradition that management asks whether the Perl programmer can fill a role that needs immediate filling in the UI tier.

More than anything, I believe you are morally responsible for the information you give to people. So why not be transparent when you tell a person that your estimate to get from Point A to Point B is based on a map built for a bike rather than a map built for a car? Also, when asked about resources, feel free to ask questions about the person if you are not fully informed...if you've known someone only in the position of a database engineer, ask if they have experience in the user interface tier and what type of projects they've worked on. Ask if they understand the abstractness of languages and not just a language; this can be a big help...we all know that someone that understands a clutch and how to drive a particular manual transmission will probably be quicker on the uptake of another manual transmission since they get why they are pressing the mysterious third pedal down.

The metaphor works if you think about it. Enjoy...and let me know if there is a party around that I can use it at or if I can refine it a bit more.

Thursday Oct 11, 2007

Sha - XAM (Fall SNW)

My team delivered an end to end prototype of the Version 0.65 XAM API (XAM is the eXtensible Access Method) to the SNIA XAM SDK TWG today for use at Storage Networking World in Dallas next week.

It's pretty exciting: our Sun StorageTek 5800 Storage System already has a client API available from the Sun Download Center and is in the process of moving to OpenSolaris as a first-class project.

XAM helps to standardize the architecture of the API as well as specific language bindings for the API. With the architecture and API specifications at 0.65 to 0.69, there are obviously going to be changes to the architecture and APIs as they move closer to release. Still, it is a great start with obvious promise.

The API itself is worked on by a broad range of companies that provide storage solutions (EMC, HP, not to mention Sun Microsystems). Application authors are also a part of the TWG membership to ensure there is an end-to-end view of the architecture. The XAM Architecture itself provides a top-end API for application vendors as well as a bottom-end API for participation by storage vendors. Within the architecture are query capabilities and standardized properties that are a part of content-addressable storage systems.

I know this part may bore a bunch of folks, but here is an example of submitting a query that returns a list of all XSet objects (objects that contain metadata and data) residing on a storage system.

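// Query text: select the XUID of every XSet on the storage system
String sQueryCommand = "select .xset.xuid";
// 'sys' is assumed to be an XAM system handle obtained earlier (not shown)
XSet qXSet = sys.createXSet(XSet.MODE_UNRESTRICTED);
// Mark this XSet as a query job
qXSet.createProperty(XSet.XAM_JOB_COMMAND, true, XSet.XAM_JOB_QUERY);
// Open the stream that carries the query text
XStream vXStream = qXSet.createXStream(XSet.XAM_JOB_QUERY_COMMAND, true, "text/plain");
// (writing sQueryCommand to vXStream and closing it is not shown in this listing)
System.out.println("Closed query stream");
// Commit the query XSet; the returned XUID identifies it
XUID queryXuid = qXSet.commit();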

The results get stored in the qXSet and can be read. The application can then go back and talk to each separate XSet for more information about it or to grab content from the object.

So, as you're milling around the demo area next week at SNW, look for one of these:

and be sure to drop in on Mark Carlson's XAM talk on Monday from 10:15 to 11am.

Wednesday Sep 12, 2007

Wi-Fi Wiki Space

All right, I'm over my frustration from yesterday's Starbucks incident. I realized that the search for free Wi-Fi hotspots is always full of adventure. Tables are always full at the good Wi-Fi spots and in other places the Wi-Fi seems crippled or simply well out of range due to the size of the location.

I'm a big fan of Free, as in: why not create a place where folks can participate in creating an up-to-date listing of hot spots? I know even some towns are one big hot spot (I believe Fort Collins in Colorado is one of these, but I'm not going to plug it in until I've tested it in a few weeks).

As such, I've created the Free Wi-Fi Wiki. This can help some of us frequent travelers or work-from-home people get out and participate in the world and embrace those that are bringing the network to the people :-)

I know, you've found a "secret" free hotspot and you don't want everyone to know about it since you have your own comfy chair and you don't want someone to spill coffee on it or break the springs. Well, you don't "have" to give it up. BUT, I would ask that if you ever do use the Wi-Fi Wiki, you find a free hotspot to plug in...even if you keep your favorite a secret ;-)

Monday Jul 16, 2007

Amazon Unbox - Utter and Complete Irritation

I gave Amazon Unbox the old 2.0 college try over the past week out of desperation. The promise remains, but I am completely and utterly disappointed in Amazon Unbox rentals. The summary:

  • While the software installed smoothly, my rental didn't even register with my player for over 24 hours!
  • It would have been faster to order a Netflix movie through the mail than it took me to receive the Unbox download for my rental once it started
  • The 24 hours to watch your rental once you start is utterly foolish

Here's the whole story. I was stuck in my hospital room with only the Season 1 House DVD to keep me company and a wireless connection. Since I had a fever and a chest tube I wasn't really going to spend any time working (sorry...). My friend offered me her account to rent a movie from Amazon Unbox.

I downloaded the software and installed it, no problem...oh, I had to reboot before it ran but I've been there before with Amazon (see a prior post).

I launched and found "The Bourne Identity" and hit the rental button. The terms were interesting: I have 30 days to watch it once it's downloaded, but I have to watch the whole thing in 24 hours once I start. No problem, I'm in a hospital bed.

The download didn't even REGISTER with the software for over a day, then once it started downloading, it took over 36 hours to download! UNBELIEVABLE. By the time it downloaded I was SO frustrated I didn't even watch it at the hospital.

So, I started watching it last night...I'm still in recovery so I only got about 30 minutes in and had to tune out to catch some rest. I ran up to my room to beat the 24 hour deadline tonight, and I seemed to have made it. At the 1:20 mark (a 1:40 long movie), my screen saver kicked in. Ouch, I flicked the mouse, no picture...sound, but no picture. I played around to no avail so I restarted Amazon Unbox. My 24 hours had expired!!!!!!! My video was gone with 20 minutes left to watch.


The worst part? When I was getting the details on the rental times for this blog post I "1 clicked" and re-rented the @)(#$&*#&$ movie for another 3.99. GAGGHHHHHHHHHHHHH.

Amazon Unbox, I have tried you twice. This time you completely and utterly failed me when I needed you. To be honest, I don't really give three strikes to things. I registered my Tivo with your service but I am going to have to pass until I find better rental terms and a guaranteed delivery time. If you can't beat a Netflix Rental by snail mail, you really don't deserve to be in this game and, further, if you are going to force me to watch a movie in my hectic life within 24 hours of when I first press play, you have SERIOUSLY miscalculated my lifestyle.

The worst part of this experience? DUDE, I WAS DEPENDING ON YOU TO ENTERTAIN ME WHEN I WAS KNOCKED OUT IN A HOSPITAL ROOM, it was your PERFECT chance to shine...and you struck out in so many ways.

Control Panel / Add or Remove Programs / Amazon Unbox Video / Change/Remove

To reiterate, I won't try you on my Tivo and your software is gone. Here is what you have to change before I install you again:

  • Guaranteed delivery time (sorry, I know it's hard, but you have ways to calculate it from the overall connection)
  • A "panic" button if your download doesn't start
  • 30 days to download and watch the program, no 1 day limit once it's started...end of story

Thursday May 31, 2007

What is a Blueprint?

Besides being a drawing that was traditionally "blue" in color, the notion of a Blueprint in software has a decent Wikipedia definition:

Software blueprinting processes advocate containing inspirational activity (problem solving) as much as possible to the early stages of a project in the same way that the construction blueprint captures the inspirational activity of the construction architect. Following the blueprinting phase only procedural activity (following prescribed steps) is required.

Of course, Thomas Edison may argue with this, saying that "Genius is one percent inspiration and ninety-nine percent perspiration", which would raise the question of why blueprint at all.

My interpretation of a "Blueprint" after much reading and many years in the "(software) architecture" field is simply: A blueprint documents the set of requirements and constraints that need to be fulfilled for a particular solution along with the materials and instructions that implementors would need to solve for the requirements consistently and repeatably. A "good" blueprint will go further to enumerate requirements that the blueprint does not address as well as the specification for materials so that an implementor could safely choose other materials and instructions than were originally specified in the blueprint.

Where blueprints often come into heated debate is in "What form does a blueprint take?" as well as "How deep should the requirements be?". There really is not a simple answer here. In physical construction, the blueprint is a time-honored carbon-like thing with strict rules around elevations and material specifications...I remember this from my 11th grade architecture class and lots of head scratching when I have tried to build out a basement. For software and systems, the answer becomes more complex and is often determined by the tier or layer of the application for which the architect is building the blueprint, as well as the intended audience for the blueprint. A GUI application may have wireframes and look a lot like a physical construction diagram, with border widths and behaviors of the areas of the form. A model or data access tier will likely have table specifications and/or UML modeling involved. A "system" or a "server" (single box) will have a detailed schematic, much like a building architect's blueprint.

Things become more complex with blueprints for solutions that incorporate multiple audiences and types of components (software, systems, cables, cards, etc...). A blueprint for a SAN or a Storage Server would have deployment diagrams, hardware component specifications and software specifications involved.

The blueprint documentation itself must spend considerable time on the requirements and specifications for which the blueprint was constructed and will, hopefully, create a linkage between the requirements and specifications and the materials and instructions used to construct the solution. This documentation helps a user determine if the blueprint meets their needs and where they should concentrate on modifying the blueprint if it is a close match.

For example, I may build a system blueprint for a storage grid built with off-the-shelf servers. In the requirements and specifications, I concentrate on a few simple requirements:

  • I must be able to add capacity without downtime, from 100MB to 1 Zettabyte
  • All storage must be available in a single namespace
  • All storage is represented as files
  • A storage node must be able to fail in place

I now give you a blueprint for this solution that gives you information about the hardware to use, the software to use, the cabling, the configuration, and more. It looks fantastic.

You easily reproduce the solution from the blueprint. BUT, the result is that you get 1MB throughput with a 5 second latency on a file lookup. This would be the equivalent of giving you instructions to build a house, but not telling you the house is 5" with a total of two square feet of space. Ouch. 1% inspiration, 99% perspiration.
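The storage grid example can be sketched in a few lines of Java. This is only an illustration of the idea that a "good" blueprint enumerates what it does not address: the Blueprint class, the silentOn method, and the requirement strings are all hypothetical, not part of any Sun BluePrints tooling.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of a blueprint; none of these names come from a real library.
final class BlueprintCheck {

    /** A blueprint lists what it addresses and, if it is a "good" one, what it does not. */
    static final class Blueprint {
        final Set<String> addressed;
        final Set<String> explicitlyNotAddressed;

        Blueprint(List<String> addressed, List<String> explicitlyNotAddressed) {
            this.addressed = new HashSet<>(addressed);
            this.explicitlyNotAddressed = new HashSet<>(explicitlyNotAddressed);
        }

        /** Needs the blueprint is silent on: neither addressed nor called out as out of scope. */
        List<String> silentOn(List<String> needs) {
            List<String> silent = new ArrayList<>();
            for (String need : needs) {
                if (!addressed.contains(need) && !explicitlyNotAddressed.contains(need)) {
                    silent.add(need);
                }
            }
            return silent;
        }
    }

    /** The storage grid blueprint from the example, which has no out-of-scope list. */
    static Blueprint storageGrid() {
        return new Blueprint(
            Arrays.asList("add capacity without downtime", "single namespace",
                          "storage represented as files", "node can fail in place"),
            Collections.<String>emptyList());
    }

    public static void main(String[] args) {
        List<String> myNeeds = Arrays.asList("single namespace", "throughput", "lookup latency");
        // The blueprint says nothing about performance, so those needs surface as surprises.
        System.out.println("Blueprint is silent on: " + storageGrid().silentOn(myNeeds));
    }
}
```

Had the blueprint listed throughput and lookup latency as explicitly not addressed, the reader would at least have known to ask before building.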

Check out some of Sun's Blueprints; they exist for software and hardware as well as solutions. Maybe they'll solve some of your needs. Further, with a little collaboration, maybe we can make the blueprints better by being broader or narrower...or simply more complete, with better specifications in areas to aid you in selecting different components or meeting different requirements.

Here are a few interesting blueprints for you to take a look at:

Here are some locations to find blueprints:

  • Sun BluePrints Program - Self-described as "Technical best practices, derived from the real-world experience of Sun experts, for addressing a well-defined problem with Sun Solutions."
  • Ajax BluePrints

Finally, a Java Blueprint Community is available on

Friday Jul 28, 2006

A Bazaar and RESTful Week

The past two weeks I've had a great opportunity to take a few breaths from my role as a Storage Management Software Architect. My background is more in general software and, as such, my breather consisted of FINALLY reading "The Cathedral and the Bazaar" and taking the time to read the fine paper from Roy Thomas Fielding on "Architectural Styles and the Design of Network-based Software Architectures". The former is the definitive book on Open Source Software (OSS) and the latter is the definitive guide to the REST architecture style, the style that guides the development of the World Wide Web (WWW).

First, the book. When The Schwartz first announced giving away our software for free and then open sourcing much of it, I definitely raised an eyebrow. Still, after chatting with folks in marketing and thinking about it, the move made a lot of sense. "The Cathedral and the Bazaar" goes to great lengths to discuss the economics of OSS in terms of amortization of resources as well as the economics of OSS for those that use it as well as help develop it. It is a great book from both a historical perspective, as well as a current perspective on how Sun's strategy can benefit both consumers and Sun Microsystems.

Although the current version of the book has undergone updates, many of the examples are from the days of Netscape open sourcing the browser and Linux as it started to penetrate the Fortune 500. Still, a lot can be learned from the book, though it's important to keep it in perspective as one viewpoint.

If you haven't read the book, it is a collection of several essays addressing various aspects of OSS. In the namesake essay, "The Cathedral and the Bazaar", the author presents a set of lessons and examples to take away. My favorite was #2, "Good programmers know what to write. Great ones know what to rewrite (and reuse)". There is DEFINITELY not as much reuse as there could be, though I've seen it increase substantially. In the project I last architected, I presented a set of rules to the team members very early on and kept them on our community blog. The first rule was "We are lazy", implying we first looked for code to reuse and co-opt from other projects. Another rule was that "we are application developers, not infrastructure developers". This is important: software engineers LOVE to write infrastructure. It's the proverbial insect light...we get all googly eyed over database access tiers and web servers and before you know it...ZAP...the project is toast when you think you've invented a new OR mapping paradigm.

Overall, lots of lessons to ponder in the book. There are a few too many "I"s for my liking, but it is an I/blogging/soap box sort of world these days anyway ;-) There are definitely a few gaps in the essays, hopefully I can squirrel away enough time to finish a paper I started while I was reading the book...stay tuned :-)

Onto the REST. REST is an architecture style. What is an architecture style???? It's basically a group of architectural properties that define a distinctive, identifiable style of architecture. Mr. Fielding does the best job I've ever seen of delving into architecture styles and exploring a nice subset of properties with their advantages and disadvantages. This lays a foundation for the presentation of the Representational State Transfer (REST) Architecture Style. The first half of the paper is basically a lesson in architecture, the second half of the paper discusses REST with examples from the Web.

When people discuss REST, they use the WWW as the definitive example. And, based on the history that Fielding gives, there is good reason...REST was derived from early (relatively undocumented) Web architectures. Now, REST is used to help guide the evolution of the Web and extensions to HTTP. Perhaps the best part of the paper, to me, came after the foundation was laid and REST was presented, when the author went through a variety of "Architectural Lessons" learned. Within this section, the author describes the advantages of a Network-based API as opposed to API libraries, why HTTP is not RPC, and a missing piece of HTTP (matching responses to requests) due to the fact that HTTP is synchronous as opposed to an asynchronous model that you would expect after reading about the REST architecture style.

The paper is a great read for a variety of audiences (new architects trying to understand the implications of "Architecture Style" and how to apply various styles, and people trying to wrap their head around how the Web works), and the author even weaves in references to why frames (remember frames?) are dangerous and misplaced based on the REST architecture style.

Both the book ("The Cathedral and the Bazaar") and the paper ("Architectural Styles and the Design of Network-based Software Architectures") are must reads.

Hmmm, could be some crazy weeks coming up :-) Armed with some new perspectives I should be able to go into them satisfied that I have knocked a couple of things off my "ToDo" list that should have been taken care of years ago...where does all the time go?

Thursday Jun 23, 2005

Is the Development Environment a Valid Architecture Constraint?

Hypothetical Problem: You're an architect trying to produce an architecture that meets some set of requirements that will be turned into a product. You realize you have limited resources and time to build the product prior to doing the architecture. Where should you fit the constraint on time and resources?

The question is...something has to give...doesn't it? Here are some options I can think of off the top of my head:

  • The architect works in a "clean room" without considering resources and hands the architecture off to implement. The management team must, in turn, recognize what cannot be built and make choices to lessen requirements, thus incurring additional architecture spirals.
  • The architect works in a "clean room" and produces a roadmap for all of the requirements to be implemented; the management team then takes responsibility for carving things up into resources and making product release cutoffs.
  • The architect builds an architecture, then takes into consideration resources to determine reuse, off the shelf implementations, etc... to streamline development of the architecture.
  • The architect puts resources as a constraint PRIOR to building the architecture, possibly influencing how requirements are interpreted and met within the architecture.

The latter two are interesting because the architect takes responsibility for delivery of the product by fitting the architecture around the constraint. The former solutions are also interesting because the buck gets passed for a product slip. How much responsibility should an architect have for delivering a product based on the architecture?

My opinion? Probably what most people would consider a "true" architect opinion..."it depends" on the context of the product and architecture development as well as how the architect is positioned in the company. So, my style is to consider the constraint early in the architecture and style the architecture around the constraint. That's a difficult choice to make though, because a team could be replaced making the assumptions of the talent to implement the architecture invalid.

Also, there seem to be so many types of projects that there is no "definitive" answer here. I do think architects should make a conscious decision about the "target" engineering team that can implement the architecture. It's much like a business plan: why should someone invest in the architecture if you haven't made the case that the ROI is there? It is the architect's responsibility to make this case, at least to give the management team SOME idea of whether the ROI will be in the ballpark if the architecture is implemented with the architect's recommended resources...

Of course, my view on this changes all the time, so I'll add a big IMHO on this one...do you have any thoughts?



