Thursday Oct 08, 2009

Starting at the Top...

To back up a couple steps and frame these entries a bit... There are three basic categories of efficiency in the datacenter: Supply, Demand, and Business (or Process). All three, and all of the subcategories under them, should be tracked.

Obviously, supply components have cost. Reducing the consumption and the waste of the supply components is often the focus of efficiency efforts. We see tons of marketing and geek-journal information about new, super-efficient power distribution systems and transformer-less datacenter designs, and there is much effort in the industry to make these pieces less wasteful. For those who haven't really dabbled in the field, you would be amazed at how much power is wasted between "the pole and the plug". UPS systems, PDUs, power conversions, etc. all mean loss. Loss means power that you are paying for that never makes it to a useful function of processing.
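To put a number on that kind of loss, here is a minimal sketch of the arithmetic. The per-stage efficiencies are made-up illustration values, not measurements from any real datacenter; the point is only that losses multiply along the chain.

```python
from math import prod

def delivered_fraction(stage_efficiencies):
    """Fraction of purchased power that survives every stage in the chain."""
    return prod(stage_efficiencies)

# Hypothetical per-stage efficiencies, for illustration only.
stages = {"transformer": 0.98, "UPS": 0.90, "PDU": 0.97}

fraction = delivered_fraction(stages.values())
print(f"{fraction:.1%} of purchased power reaches the plug")
print(f"{1 - fraction:.1%} is lost between the pole and the plug")
```

Swap in measured values for your own distribution chain; each additional conversion stage is one more multiplier below 1.0.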

The downside: the supply categories are difficult and usually expensive to change (including labor and asset categories). The real efficiency gains that are often overlooked or given less priority are in demand and business. Demand includes the workloads and software running in your IT environment. Can you imagine the cost savings if you were able to "compress" your consumption by 30%? Maybe 50%? Capacity management, virtualization, and less obvious things (to most people) like fixing bad and under-performing code can be huge wins.

In my experience, customers (and "efficiency consultants") rarely look at the business processes, goals, and overall flow that drive the processing in the first place. Often the flow of demands, and the scoping of demands from the business, can have a huge impact on consumption. Do you really need 50,000 batch jobs that represent every report that has ever been run on the system? Do you really need to save those results? Do you really need to run and distribute all those reports, and to how many people? How many manual processes are involved in your IT management where the answer is always "yes", or the process is mostly repeatable and could be automated?

Examining supply, demand and business in a structured fashion, and asking "why are we doing that anyway?" can have huge returns with minimal investment. There is always "low hanging fruit" in efficiency. It is just plain dumb to keep throwing money away for the sake of tradition, habit, and legacy operations.


Thursday Oct 01, 2009

Sometimes it is the Obvious Things...

Slight tangent, but still on the efficiency theme. Last year, I was doing some work with a large Internet retailer, and came across a few epiphanies that apply to these monologues.

We were addressing storage, with a view toward a "private storage cloud" as the intended goal. This customer was very forward-thinking in much of their environment, with excellent processes in place for most of their IT environment. Some pieces seemed to be stuck in the 90's though, and were based on "traditional operating practices" rather than the realities of their business.

Simple example: What happens when the table spaces of your database are running out of room? Traditionally (and in this customer's environment), a monitoring agent on the server sent an alarm to the service desk, and opened a trouble ticket. The ticket was then assigned (by hand) to one of the DBAs, who could go into the database management interface and "add tablespace" using some available disk space. What happens when you run out of available disk space? A trouble ticket is opened for the storage group to allocate another chunk of storage to the system. What happens when the storage array starts running short of available chunks? A trouble ticket is opened for the manager of the storage group to either add capacity through a "Capacity on Demand" contract, order an upgrade, or purchase an additional storage array.

What's wrong with this picture? Nothing according to the classic flow of IT management. In reality, there are way too many people doing manual processes in this flow. Simple business question: If the system taking orders from your customers is running out of tablespace to store the orders, are you really ever going to say no to adding more disk space? No? Then why do we have a series of binary decisions and checkpoints on the way to satisfying the demand?

Automation and self-service are key components of the cloud "ilities", but can also stand alone as an efficiency play in almost every IT environment. We often execute traditional processes and practices rather than mapping the real business needs and constraints against the technology capabilities. In this example, a few scripts to automate adding tablespaces, creating new tablespaces, adding filesystems, adding and assigning storage chunks, and pre-provisioning those storage chunks in a standardized fashion saved countless hours of human intervention. Each of those activities is simple and repeatable, and the scripting reduces the opportunities for error. Throw in informational alerts to the interested parties, instead of trouble tickets requiring action, and the efficiency of this little piece of the puzzle is greatly improved.
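The decision logic behind that kind of automation is small. The following is a minimal sketch, not the customer's actual scripts: the `Capacity` fields, the thresholds, and the action names are all hypothetical, and a real implementation would call the database and storage tooling instead of returning strings.

```python
from dataclasses import dataclass

@dataclass
class Capacity:
    tablespace_free_pct: float  # free space left inside the tablespace
    filesystem_free_gb: float   # unallocated space on the host filesystem
    array_free_chunks: int      # pre-provisioned chunks left on the array

def plan_actions(cap, min_free_pct=15.0, extend_gb=10.0, chunk_reserve=4):
    """Return the ordered list of automated actions for one capacity reading."""
    if cap.tablespace_free_pct >= min_free_pct:
        return []  # nothing to do yet
    actions = []
    chunks_left = cap.array_free_chunks
    if cap.filesystem_free_gb < extend_gb:
        if chunks_left == 0:
            # The one case that still needs a human: nothing left to hand out.
            return ["escalate_no_storage_available"]
        actions += ["assign_storage_chunk", "grow_filesystem"]
        chunks_left -= 1
    # The business answer is always "yes", so extend and inform, not ask.
    actions += ["extend_tablespace", "send_informational_alert"]
    if chunks_left < chunk_reserve:
        actions.append("alert_storage_manager_to_procure")
    return actions

print(plan_actions(Capacity(5.0, 2.0, 4)))
```

Only the procurement alert and the empty-array case reach a person; everything else is an action plus an informational message.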

Some parts do remain mostly "traditional", such as the procurement of new storage assets, but even those business functions are streamlined as part of this evolution. Once IT realizes that the business needs follow a predictable pattern of simple actions and reactions, automation becomes a simple task. After all, what Internet retailer wants a potential customer to click on the "check out now" button and get some weird error message because of a storage shortage, when they just spent an hour filling the cart with all of those items that they need? It isn't just a lost transaction, it could be a lost customer.

Well, back to my day job for now.


Wednesday Sep 30, 2009

Private Clouds, the Evolutionary Approach

Continuing with the ramblings of my last entry, since I am up late with children and their dozens of excuses as to why they are not asleep...

Now that we have defined the "ilities" that we want from our private cloud efforts, we can examine each of them and look for obvious opportunities with high returns. People cost, IT CAPEX, IT OPEX, energy costs, reducing operational complexities, improving service levels, reducing risk, and any other opportunities that we can target and quantify. One major rule here is that when we pick a target and an approach, we must also have a "SMART" set of goals in place.

For the .21% of the readers who have never heard the SMART acronym before, it stands for Specific, Measurable, Attainable, Realistic, and Timely. In other words, for every action that we plan to take, or every improvement that we want to deploy, we must have a measurable set of criteria for success. It amazes me how many IT managers do not know the "average utilization of server systems in the data layer during peak shift". Yeah, that is pretty darn specific, but ask yourself, do you know what your company's utilization is during the prime workday cycles? Bingo. We need a baseline for whatever metrics we choose to measure success for each project and change to our IT operations.

Sidenote: If the answer to the previous question was "Yes", and the utilization is anywhere above 30% during workday peak shift hours, I am impressed.
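For anyone who wants to answer that question, a baseline can be as simple as averaging the utilization samples that fall inside the peak shift. This is a minimal sketch; the 08:00 to 18:00 weekday window and the sample values are assumptions for illustration, and the samples would normally come from whatever monitoring tool is already collecting them.

```python
from datetime import datetime
from statistics import mean

def peak_shift_average(samples, start_hour=8, end_hour=18):
    """Average utilization (%) of weekday samples taken in [start_hour, end_hour)."""
    peak = [
        pct
        for ts, pct in samples
        if ts.weekday() < 5 and start_hour <= ts.hour < end_hour
    ]
    return mean(peak) if peak else None

# Hypothetical (timestamp, CPU utilization %) samples.
samples = [
    (datetime(2009, 9, 28, 3, 0), 4.0),    # Monday, off hours
    (datetime(2009, 9, 28, 10, 0), 22.0),  # Monday, peak shift
    (datetime(2009, 9, 28, 14, 0), 18.0),  # Monday, peak shift
    (datetime(2009, 10, 3, 11, 0), 2.0),   # Saturday
]

baseline = peak_shift_average(samples)
print(f"peak-shift average utilization: {baseline:.1f}%")
```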

So where are the obvious targets? I have already hit on one of them, system utilization and idle processing cycles. Systems consume electricity and generate heat (those servers are actually very efficient space heaters), resulting in cooling requirements and air circulation requirements, and odds are that a majority of the processing potential is not being used for processing.

Consolidation? Maybe. Capacity Planning? Definitely. Capacity Management? Absolutely! Consolidation is a valid target project, but is usually approached as a one-time, isolated event. Consolidation does not necessarily change the behavior that caused over-sizing to begin with, or help when workloads are seasonal or sporadic. These variable workloads most often result in systems that are sized for "peak load", with lots of idle cycles during off hours and off-days (and sometimes off-months).

The first step to a consolidation is capacity planning, including the key step of generating a baseline of capacity and consumption. If, instead of treating this as a one time event, we start monitoring, reporting, and trending on capacity and consumption, we have now stepped into the realm of Capacity Management. We can watch business cycles, transactional trends, traffic patterns, and system loads and project the processing needs in advance of growth and demands. What a concept.
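Trending does not have to start with an expensive tool. Here is a minimal sketch that fits a straight line to equally spaced consumption samples and projects when the trend reaches a capacity limit. The straight-line model and the sample numbers are assumptions for illustration; real workloads with seasonal cycles need more than this.

```python
def linear_trend(values):
    """Least-squares slope and intercept for equally spaced samples."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean

def periods_until(values, limit):
    """Periods after the last sample until the trend reaches limit, or None if not growing."""
    slope, intercept = linear_trend(values)
    if slope <= 0:
        return None
    crossing = (limit - intercept) / slope
    return max(0.0, crossing - (len(values) - 1))

# Hypothetical monthly peak utilization (%) for one system.
monthly_peak = [40, 44, 48, 52, 56, 60]
print(periods_until(monthly_peak, limit=80))
```

Run against a rolling window of samples, this turns a one-time capacity plan into an ongoing projection.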

Now imagine a world where we could dynamically allocate CPU resources on-demand, juggle workloads between systems with little or no downtime, and use systems of differing capacity to service workloads with differing demands. Wow. That sounds like one of those "ilities" that we were promised with that "Cloud" concept. Dynamic resource allocation and resource sharing, possibly with multi-tenancy to maximize utilization of compute resources. Yep. Sure is. Ignoring the "Cloud" word, let's look at how we can implement this "Cloud-like capability" into our existing IT environment without bringing in a forklift to replace all of our systems and networks, and spending billions.

Breaking down those technology pieces necessary to execute against that plan, we need Capacity Management (TeamQuest, BMC, Tivoli, pick your tool that does capacity and service level management). The tool doesn't matter. The process, the knowledge generated, and the proactive view of the business matter. Caveat: Define your needs and goals *before* buying tools that you will never fully implement or utilize!

So now we know what our hour-by-hour, day-by-day needs are, and can recognize and trend consumption. We can even start to predict consumption and run some "what if" scenarios. The next step is dynamic capacity, which in this context includes Resource Sharing, Dynamic Allocation, Physical Abstraction (maybe), Automation (hopefully, to some degree), and Multi-Tenancy from our right hand "Business Drivers" column from my last blahg entry. Sure, we can juggle and migrate these workloads and systems by hand, but the complexity and risk of moving those applications around is ridiculous. We need a layer of physical abstraction in order to move workloads around, and stop thinking of "systems" as a box running an application.

There are many ways to do this, so pick the solution and products that best fit your IT world. You can create "application containers", or standard operating environments for your applications, and juggle the "personalities" running in the physical machines. Not easy. Most apps will likely not move in easily. Still a good goal to reduce variance and complexity in your environment. In this case, not a quick hit, as you will end up touching and changing most of your applications.

The obvious answer (to me and 99.6% of the geeks reading this) is to employ virtualization to de-couple the application from the operating environment, and the operating environment from the physical hardware (and network, and storage). Solaris Containers, LDOMs, VMware, Xen, xVM software in OpenSolaris, Citrix, fast deployment and management tools, the options and combinations are all over the map. The deciding factors will be cost, capabilities, management tools (monitoring, reporting, and intelligence), and support of your operational and application needs. The right answer is very often a combination of several technology pieces, with a unifying strategy to accomplish the technical and business goals within the constraints of your business. There are many of us geeky types that can help to define the technology pieces to accomplish business goals. Defining those business goals, drivers, and constraints is the hard part, and must be done in IT, "the business", and across the corporate organization that will be impacted and serviced.

There, we have some significant pieces of the "private cloud" puzzle in place, and if the server systems were severely under-utilized, and we were able to move a significant number of them into our new "managed, dynamic capacity" environment, we should be able to realize power, cooling, and perhaps even license cost savings to balance the cost of implementation. One interesting note here, if I have "too many servers with too many idle cycles" in my datacenter, why should a vendor come in leading with a new rack full of new servers? Just wondering. Personally, I would prefer to invest in a strategy, develop a plan, identify my needs and the metrics that I would like improved, and then, maybe, invest in technology towards those goals.

Just the late night ramblings of an old IT guy.

Next entry will likely talk more about the metrics of "how much are we saving", and get back to those SMART goals.


Unicorns, Leprechauns, and Private Clouds...

Numerous conversations and customer projects over the past few weeks have motivated me to exit from the travelogue and world adventures entries here, and get back to geeky writing for a bit.

Yes, cloud. We all "get it". We all see Amazon and the others in the public cloud space doing really cool things. Somewhere along the way, some of the message got scrambled a bit though. With any luck, I'll clear up some of the confusion, or at least plant some seeds of thought and maybe even some debate with a couple monologues.

Non-controversial: Let's talk about three kinds of clouds. Most folks in the industry agree that there are Public clouds like Amazon's AWS, Joyent's Public Cloud, and GoGrid. That one is easy. In theory, there are "private clouds", where the cloud exists within the IT organization of a customer (note that I did not say "within the four walls of the datacenter"), and "hybrid clouds" that allow a private compute infrastructure to "spill over" to a public cloud, as expandable capacity, disaster recovery, or dynamic infrastructure.

No hate mail so far? Good, I'm on a roll.

So do private clouds exist? Maybe. If we hop into the WayBack Machine, John Gage said it best: "The Network is the Computer." Let's dive a little deeper into what makes a computing infrastructure a "cloud":

Unfortunately, most people start on the left side, with the technical details. This rarely results in a productive discussion, unless the center and right columns are agreed on first. We put a man on the moon, I think we can solve content structure and locking/concurrency in distributed and flexible applications. We don't want to start a cloud discussion with APIs, protocols, and data formats. We want to start with business drivers, and justifiable benefits to the business in costs, value, and security.

The center column describes at a very high level, the goals of implementing a "private cloud", while the right column lists the control points where a private cloud architecture would create efficiencies and other benefits. If we can agree that the center column is full of good things that we would all like to see improved, we can apply the business drivers to our IT environment and business environment to start looking for change. All changes will be cost/benefit, and many will be business process versus technical implementation conflicts. For example, is your business ready to give all IT assets to "the cloud", and start doing charge-backs? In many corporations, the business unit or application owner acquires and maintains computing and storage assets. In order for a shared environment to work, everyone must "share" the compute resources, and generally pay for the usage of them on a consumption basis. This is just one example of where the business could conflict with the nirvana of a private cloud. You can imagine trying to tell a business application owner who just spent $5M on IT assets that those assets now belong to the IT department, and that they will be charged for usage of those assets now.

So is private cloud impossible? No. Is private cloud achievable? Probably. Does private cloud fit your current business processes and needs? Probably not. Do the benefits of trying to get there outweigh the hassles and heartaches of trying to fit this square peg into the octagon shaped hole without using a large hammer? Most definitely.

Some attributes and motivators for cloud computing have huge benefits, especially on the financial side of the equation. Virtualization definitely has some great benefits, lowering migration downtime requirements, offering live migration capabilities, enabling new and innovative disaster recovery capabilities, and allowing workloads to "balance" more dynamically than ever before. Capacity planning has always been a black art in the datacenter, and every model only survives as long as the workloads are predictable and stable. How many applications have you seen in your datacenter that don't bloat and grow? How many businesses actually want their number of customers and transactions to remain stable? Not many, at least not many that survive very long.

So, to wrap up this piece (leaving much detail and drama for future blahg entries)... Private clouds probably don't come in a box, or have a list of part numbers. Business processes and profiles need to adjust to accommodate the introduction of cloud enabling technologies and processes. Everyone, from the CIO, to IT, to legal, to business application owners, has to buy into the vision and work together to "get more cloud-like". And finally, the business discussion, business drivers, and evolutionary plan must be reasonably solid before any pile of hardware, software, and magic cloud devices is ordered. The datacenter will go through an evolution to become more "private cloud", while a revolution will likely mean huge up front costs, indeterminate complexity and implementation drama, and questionable real results.


Thursday Apr 23, 2009

More from Beijing...

A couple more pics from my week in Beijing that might be interesting. Since I mentioned our new, fancy caffeinated beverage machine here in the Solution Center, here is a picture of the beast:

And here is another little item that I found interesting. This is a picture that I took of a sign posted in my flex office space. Apparently if I leave my bicycle in the flex office space, it will be removed, along with any decor that happens to clash with the office themes and decoration. I will have to keep my strange and loud ties out of view! Just to put this in context, the "swanky" flex spaces are about 1.5m of desk space with power and network. Unless you put the bicycle on the desk, I don't see how storing one in the flex space would be possible. Maybe we can put some bike racks hanging from the ceiling?


Friday Nov 07, 2008

SunOS Rises Again, Better than Ever!

After 15 months of bozos whacking away with power tools...

YAY!! And yes, my 2008 Hybrid Mercury Mariner has "Solaris" license plates. And yes, that is a 1975 Bricklin SV1 Gullwing in the background, and no, I haven't touched it in 3 years, and yes, if you'd like to take it off my hands, I would entertain offers.


Back to the real world...

I spent the past couple of months working on a project that had way too many lawyers involved. I didn't want to blog about it, as dealing with lawyers and "content review" for everything I decided to write during the project would have made my head explode.

Now that the project is finished, and the intellectual property sharks are done reviewing and blessing things, I feel a bit more open about sharing my experiences. I did get to work with a great geek who also happens to be an actor. I learned a ton about storage and cloud-like things using virtualization layers.

Since the project finished, I have been working with the xVM Server folks on the Early Access Program, details here and here.

I'll throw some screen shots and info up soon. For now, I have a borrowed, single CPU, dual core, 8GB memory x2200 M2, sitting in a datacenter 1500 miles away to run my tests on. As of today, it is running the xVM Server software (EA2), with Solaris 10 update 6, a Solaris Express Community Edition release (Nevada 101a), Windows Server 2003 x64 Enterprise Edition, and Windows Server 2008 x64 Datacenter Edition. All on the same machine. All managed from a single desktop filled with console windows. My Windows Server guest systems even have remote console working, with decent performance to my desktop at home over VPN.



Friday Aug 22, 2008

Wikis for Dummies...

The folks at The CommonCraft Show have produced a bunch of interesting videos in a series called "in Plain English", explaining how technology (and Zombies) work. Very Cool stuff:



Wednesday Aug 13, 2008

Tupperware comes in sets...

Continuing where I left off, the previous blahg entries addressed installation of the Solaris 8 branded container. Those pieces covered the mechanics of the container itself. One of the key architectural decisions in this process was "where do we put the stuff?". Not just mountpoints and filesystems, we already covered that, but what pieces go on local disk storage, and what pieces go on the shared SAN storage?

Since the objective is to eventually integrate into a failover scenario, we looked at two options here. Each one has benefits and can supply a capability to our final solution. In the first case, we want to fail a container over to an alternate host system. In the second case, we want to fail a container over to an alternate datacenter. Think of these two as "Business Continuity" and "Disaster Recovery".

In the Business Continuity case, the capability to do "rolling upgrades" as part of the solution would be a huge added bonus. We decided to put the zone itself on local disk storage, and the application data on the shared SAN storage. This allows us to "upgrade" a container, roll the application in, and still maintain a "fallback" configuration in case the upgrade causes problems, with minimal downtime. Accomplishing this requires two copies of the container. Application data "rollback" and "fallback" scenarios are satisfied with the shared SAN storage itself through snapshots and point in time copies.

Similar to a cluster failover pair, both zones have their own patch levels and configurations, and a shared IP address can be used for accessing application services. Only one zone can be “live” at any time as these two zones are actually copies of the same zone “system”.

To migrate the branded container to another host system, the zone must be halted and detached, and the shared SAN storage volumes must be unmounted from the original host system:
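A minimal sketch of that source-side sequence, written as a dry-run command builder so that nothing is executed. The zone name and mountpoint are hypothetical, and releasing the volumes from the volume manager is site-specific, so it is left out.

```python
def detach_commands(zone, shared_mounts):
    """Ordered commands to halt and detach a zone, then unmount its shared storage."""
    commands = [
        ["zoneadm", "-z", zone, "halt"],
        ["zoneadm", "-z", zone, "detach"],
    ]
    # Unmount the shared SAN filesystems once the zone no longer uses them.
    for mountpoint in shared_mounts:
        commands.append(["umount", mountpoint])
    return commands

# Hypothetical zone name and shared mountpoint, for illustration only.
for argv in detach_commands("s8zone", ["/z/s8zone_data"]):
    print(" ".join(argv))
```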

The detach operation saves information about the container and its configuration in an XML file in the ZONEPATH (/zones/[zonename] in our configuration). This will allow the container to be created on the target system with minimal manual configuration through zonecfg.

The detached container’s filesystem can now be safely copied to the new target system. The filesystem will be backed up and then restored on to the target system. There are many utilities that can create and extract backup images; this example uses the “pax” utility. A pax archive can preserve information about the filesystem, including ACLs, permissions, creation, access, and modification times, and most types of “special files” that are persistent. Make sure that there is enough space on both the source system and the target system to hold the pax archive (/path/to/[zonename].pax in the example) as the image may be several gigabytes in size. Some warnings could be seen during the pax archive creation. Some transient special files cannot be archived, but will be re-created on the target system when the zone boots.
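As a rough sketch of the archive step: the helper below estimates how large the zone filesystem is before archiving, and builds a pax write command using only the portable `-w` and `-f` options. The paths are hypothetical, any platform-specific pax options for ACLs or extended attributes are deliberately left out, and the command is only printed, not run.

```python
import os

def tree_size_bytes(root):
    """Sum the sizes of all files under root, without following symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            total += os.lstat(os.path.join(dirpath, name)).st_size
    return total

def pax_create_command(archive_path):
    """pax command to run from inside the zonepath; archives the current directory."""
    return ["pax", "-w", "-f", archive_path, "."]

# Hypothetical archive location, for illustration only.
print(" ".join(pax_create_command("/path/to/s8zone.pax")))
```

Comparing `tree_size_bytes` for the zonepath against the free space where the archive will land is a cheap check before starting a multi-gigabyte copy.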

On the target system, the zone filesystem space must be configured and mounted, and have “700” permissions with owner root. The /zones loopback mount must also be in place, just as in the source system.

Since the zone filesystem is not on shared storage, and will remain local to the target system, the “mount at boot” option can be set to “yes”.

Storage for the applications and data should now be imported and mounted on the target system to replicate the configuration of the source system. All mountpoints, loopback filesystems, and targets of the “add fs” components of the zone must be replicated. Once the filesystems are mounted into the global zone, the zone pax archive can be extracted. Again, care must be taken to make sure that there is sufficient space on the zone filesystem for the extraction:
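A minimal sketch of the space check and the extraction command. The 20% headroom factor and the archive path are assumptions for illustration; `-p e` asks pax to preserve everything it can (ownership, modes, and times) on extraction, and the command is only printed here.

```python
import shutil

def has_room(target_dir, required_bytes, headroom=1.2):
    """True if target_dir's filesystem has required_bytes free, plus some headroom."""
    return shutil.disk_usage(target_dir).free >= required_bytes * headroom

def pax_extract_command(archive_path):
    """pax command to run from inside the (empty) zone filesystem on the target."""
    return ["pax", "-r", "-f", archive_path, "-p", "e"]

# Hypothetical archive location, for illustration only.
print(" ".join(pax_extract_command("/path/to/s8zone.pax")))
```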

The filesystem of the zone is now in place, but the zone is not yet configured into the target system. The zone must be created, modified as necessary (e.g., different network adapter hardware or device naming), and “attached” to the new host system. As a sanity check, it is highly recommended that the /usr/lib/brand/solaris8/s8_p2v command be run against the new zone to make sure that the new system “accepts” the attach of a zone created elsewhere:

The “attach” command may fail with messages about patch version conflicts, as well as extra or missing patches. Even though this is a full root zone, the detach/attach functionality makes sure that the host systems are equivalent. Some patches will be missing or extra in some cases, especially where the machine types or CPU types are different (sun4u, sun4v, HBA types and models, Ethernet adapter hardware differences, etc.). It is possible to normalize all patch versions and instances across systems of different configurations and architectures, but this involves significant effort and planning, and has no real effect on the operation of the hosting systems or the hosted zones (patching software that will never run on a given machine).

Once all errors and warnings are accounted for as “accepted deltas” or resolved, a failed attach can be forced:
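A minimal sketch of the target-side steps as a dry-run command builder: create the zone configuration from the detached zonepath, then attach, with `-F` added only when the attach is being forced. The zone name and zonepath are hypothetical, and any zonecfg changes for different network hardware are left out.

```python
def attach_commands(zone, zonepath, force=False):
    """Ordered commands to configure a detached zone on a new host and attach it."""
    attach = ["zoneadm", "-z", zone, "attach"]
    if force:
        # Skips the patch/package validation; use only for accepted deltas.
        attach.append("-F")
    return [
        ["zonecfg", "-z", zone, f"create -a {zonepath}"],
        attach,
    ]

# Hypothetical zone name and zonepath, for illustration only.
for argv in attach_commands("s8zone", "/zones/s8zone", force=True):
    print(argv)
```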

Zone migration can be toggled between the machines by halting the zone, detaching the zone, moving the shared SAN storage into the target system, attaching the zone and booting the zone. Once the zone has been installed, configured, and booted on both systems, there is no need to use the s8_p2v function for migration. Strictly speaking, the “detach/attach” function is not necessary since the zone itself resides locally, and is not actually migrating, but it does provide an extra layer of protection on the non-active machine to keep the halted zone from being booted while the shared storage is not active. By setting the zone state to “detached”, the zone will not boot unless the “attach” command is executed first, providing the check for the shared SAN storage configured with the “add fs” zone configuration.

Pretty simple, huh? In fact, if you look at the above diagram, it looks mysteriously like the functionality of a cluster failover. Once we modeled and tested these actions by hand, we integrated the pair of containers into Veritas Cluster Server and managed the zones through the VCS GUI. Online, offline, failover... It all just works. Very cool stuff.


Tuesday Aug 12, 2008

Tupperware footnote...

A couple folks have emailed me asking about my VxVM and VxFS in this physical to virtual conversion. As I have blahg'd before, separation is a good thing. In this case, we had Veritas Volume Manager and Veritas Filesystem on the source machine, and on the target machine. Volume Management and Filesystem Management should live within the Global Zone, not in local zones (or non-global zones as they are officially called). Mixing "system" activities within zones is a BadThing[tm], especially when the zone is a branded container. Trying to use Solaris 8 filesystem utilities and disk volume management utility binaries (even through the branded containers software) against a Solaris 10 system, kernel, and possibly even a hardware architecture (sun4v) unknown to a Solaris 8 operating system is a dangerous path to walk.

Definitely too much risk in there for me to even attempt to whack it into working. :)

We installed our Volume Management (VxVM) and Filesystem (VxFS) on the Solaris 10 target host system using the Veritas Foundation Suite (5.0 maintenance pack 1 Rolling Patch 4). All of the storage goodies were installed and configured as local objects on the Solaris 10 host system, and mounted under the /z/[zonename]_[function] pathnames as described earlier. The lofs loopback mounts and zonecfg "add fs" pieces mapped them into the places that we wanted them to be, just providing "disk space" to the Solaris 8 branded containers. We did use zonecfg "add fs" with a type of vxfs, and it worked as advertised. In the end, we decided that the VxFS pieces are a "system function" and should be mounted in the Global Zone under /z for simplicity and consistency.

Who knows, at some point we might even use ZFS instead of the VxFS in this configuration (more on that in a later blahg entry), and this allows us to keep the "zone space" filesystem and storage agnostic.


Monday Aug 11, 2008

Tupperware isn't the only Branded Container in town...

For this project, two Solaris 8 branded containers were installed on the test systems using "flar" images. The containers were created “unconfigured” so that IP addresses could be assigned to avoid conflict with the source systems. Instructions from the Solaris 8 branded containers administration guide were used to install and configure the test containers. The docs in the administration guide were excellent, with lots of "type this" and "you should see this" guidance.

We had two test scenarios that we wanted to play with: (1) Branded containers on SAN shared storage, and (2) branded containers hosted on local storage with SAN shared storage for application data. There are advantages and disadvantages for both, and situations where each has significant value. I'll get into the details of that in a future blahg entry.

In order to separate the "storage administration" from the “zone administration” function, storage was configured with device mounts within /z, regardless of the type (SAN, NAS or internal) of storage being utilized:

The storage for the container was mounted to the /zones/[zonename]/ path using loopback filesystems (lofs). This mount was created for testing using the command line:

So that the mount could persist for reboots, an entry was added to /etc/vfstab:
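A minimal sketch of both pieces, the loopback mount command and the matching vfstab line. The /z/s8zone_root and /zones/s8zone paths are hypothetical names following the /z/[zonename]_[function] convention; the field order is the standard vfstab layout (device to mount, device to fsck, mount point, FS type, fsck pass, mount at boot, mount options).

```python
def lofs_mount_command(source, mountpoint):
    """Command line for a one-off loopback mount on Solaris."""
    return ["mount", "-F", "lofs", source, mountpoint]

def vfstab_entry(source, mountpoint, mount_at_boot=False):
    """One tab-separated /etc/vfstab line for a lofs mount."""
    fields = [
        source,      # device to mount
        "-",         # device to fsck (not applicable for lofs)
        mountpoint,  # mount point
        "lofs",      # FS type
        "-",         # fsck pass
        "yes" if mount_at_boot else "no",
        "-",         # mount options
    ]
    return "\t".join(fields)

# Hypothetical paths, for illustration only.
print(" ".join(lofs_mount_command("/z/s8zone_root", "/zones/s8zone")))
print(vfstab_entry("/z/s8zone_root", "/zones/s8zone"))
```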

Note that the “mount at boot” option is set to “no” in this example. This is to allow for zones installed on SAN shared storage volumes to migrate back and forth between the systems. Zones installed on local, internal storage will use “yes” for the “mount at boot” option. In the shared SAN storage zones, the filesystems must be mounted as part of the zone management when rebooting, or attaching a zone into a new host:

The basic Solaris 8 branded container was created with the zonecfg command on the first Solaris 10 host system. The basic Solaris 8 branded zone contains a zonepath for the zone to be installed within, and a network interface for network access:

Extra filesystems for administrative tools and applications are mounted into the branded container from the global zone mountpoints with the “add fs” command within zonecfg. The loopback filesystem (lofs) is used to allow the Solaris 10 global zone to manage devices, volume management, filesystems, and mountpoints, and “project” that filesystem space into the branded container. While it is possible to pass physical devices in to a container, this is not a recommended architecture and should be avoided whenever possible, especially in the case of branded containers, where device and filesystem management would be running under the brand, while the kernel being leveraged would be of a different OS release. This loopback filesystem is defined during the zone configuration with:

Because the Solaris 10 Global Zone (GZ) can run on machines that don’t support native Solaris 8, some applications and software packages could become confused about the architecture of the underlying hardware and cause problems. The Solaris 8 branded container is shielding us from the hardware platform hosting the Solaris 10 GZ, so it is recommended that we set a machine architecture of “sun4u” (Sun UltraSPARC) for the branded container so that the hardware platform is essentially hidden from the operating environment within the container. The “machine” attribute can be assigned within the zone configuration with:

Other filesystems can be added, additional network interfaces can be defined, and other system attributes can be assigned to the branded container as needed. Once the container has been configured within zonecfg, the configuration should be verified and committed to save the data.

The branded container is now configured and ready for installation of an image (flar) from a physical system. The Solaris 8 system image is created using the flarcreate command. Make sure that the source system is up to speed on patches, especially those related to patching, flash archives (flar) and the "df" patch (137749) that I mentioned in an earlier blahg entry.
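A hedged sketch of the archive step, run on the Solaris 8 source system. The archive name and the NFS destination are hypothetical, and the script only prints the command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run by default: print each command. Set DRY_RUN=0 to execute it.
run() {
    if [ "${DRY_RUN:-1}" = 0 ]; then "$@"; else echo "$*"; fi
}

# -n names the archive, -S skips the disk space check, -c compresses it.
# Arguments: archive name, archive file to write.
create_flar() {
    run flarcreate -S -c -n "$1" "$2"
}

# Hypothetical name and destination; the file needs to be reachable
# from the Solaris 10 host that will install the zone.
create_flar s8-system /net/jumpstart/export/flar/s8-system.flar
```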

The configured zone can now be installed using the flar image of the physical Solaris 8 based system:
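Roughly like this, as a sketch with a hypothetical zone name and archive path. The script prints the command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run by default: print each command. Set DRY_RUN=0 to execute it.
run() {
    if [ "${DRY_RUN:-1}" = 0 ]; then "$@"; else echo "$*"; fi
}

# Install the zone from a flash archive of the physical system.
# -u runs sys-unconfig on the installed image; -p could be used instead
# to preserve the source system's identity. -a points at the archive.
# Arguments: zone name, archive file.
install_from_flar() {
    run zoneadm -z "$1" install -u -a "$2"
}

install_from_flar s8zone /net/jumpstart/export/flar/s8-system.flar
```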

Once the zone has been installed, the “P2V” utility must be run to turn the physical machine configuration into a virtual machine configuration within the zone:
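A sketch of that step, assuming the utility lives where the Solaris 8 Containers documentation puts it (/usr/lib/brand/solaris8/s8_p2v) and using a hypothetical zone name. The script prints the command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run by default: print each command. Set DRY_RUN=0 to execute it.
run() {
    if [ "${DRY_RUN:-1}" = 0 ]; then "$@"; else echo "$*"; fi
}

# Run from the global zone against the installed (not yet booted) zone.
# Argument: zone name.
p2v_zone() {
    run /usr/lib/brand/solaris8/s8_p2v "$1"
}

p2v_zone s8zone
```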

The Solaris 8 branded container can now be booted. There will likely be some system startup scripts, tools, and applications that will give warning messages and errors on boot up. Remember the nature of the zone / host system relationship and decide what needs to run in a zone and what functionality should remain in the host system global zone. The zone will boot “un-configured”, and ask for hostname, console terminal type, name service information, and network configurations on the first boot. As an alternative, a “sysidcfg” file can be copied into the /zones/[zonename]/root/etc/ directory before the first boot to allow the zone to auto-configure with sysid upon first boot.
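For the sysidcfg route, a minimal sketch follows. The hostname, terminal type, and timezone are hypothetical, and the exact set of keywords accepted can vary with the Solaris 8 update level, so treat the contents as a starting point to check against the sysidcfg man page.

```shell
#!/bin/sh
# Print a minimal sysidcfg for the zone's first boot.
# Arguments: hostname, terminal type, timezone.
make_sysidcfg() {
    cat <<EOF
system_locale=C
terminal=$2
network_interface=primary {
    hostname=$1
}
name_service=NONE
security_policy=NONE
timezone=$3
EOF
}

# Hypothetical values. On the real host the output would be redirected to
# /zones/s8zone/root/etc/sysidcfg before the first boot.
make_sysidcfg s8zone vt100 US/Eastern
```

Anything left out of the file (the root password, for instance) should still be prompted for on the console. The zone is then booted with "zoneadm -z s8zone boot" and watched with "zlogin -C s8zone".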

That's it. Other than fixing up the system startup scripts (/etc/rc*.d and /etc/init.d), and attaching a copy of the source system's SAN attached data, the move is done and ready to be tested. The really cool part of this is that it just works. We were expecting some application issues and possibly some "speed of light" problems, but everything just worked. Obviously some things had to be adjusted in the branded container, such as disabling the Veritas Volume Manager startup and some hardware inventory scripts that used "prtconf" to collect information, but these were identified early, and several reboots sorted out the symptoms from the zone's boot messages on the zlogin console.

More details later about migration of the zone between systems, various storage configurations that we tested, and some other (hopefully) interesting thoughts.


Monday Jul 28, 2008

Separation is a Good Thing...

One of the interesting side effects of using containers/zones is that it provides a layer of separation for administration functions. Along the way, we have an opportunity to look at other separations that we can provide between the various IT organizations contributing to the "service" being provided by the system.

As an example, how many of us IT types hear "I need the root password" from the database or application folks on a regular basis? Well, unless you have implemented RBAC (Role Based Access Control), and have trained your developers and DBAs to the point that they know what to ask for, this is probably a common request. I am a lot more comfortable as an administrator giving administrative access to a zone than I am giving it within the Global Zone. I know that if they totally roach things in the local zone, I can still edit files and move things around from the Global Zone to fix things up, and the system is not "down" or in a state where I need to boot from DVD or Jumpstart and twiddle bits.

Implementing zones gives us a chance to integrate several areas of separation, providing simpler administration. As one example, I like to separate "disk/storage administration" from system/zone administration. I mount the (non-ZFS) storage under /z and then use the loopback filesystem (lofs) to make my zonepath. For example:

      lofs mounted on /zones/[zonename]

      zonecfg "add fs" with mountpoint=/opt/oracle

      zonecfg "add fs" with mountpoint=/data/oracle

Obviously this is even easier with ZFS, but I might get to that another day in another blahg entry. Using the resource mountpoint and the zonename in the filesystem information allows me to use grep in interesting ways with "df" and "mount" to more easily track down where things are being mounted and used.
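As a sketch of the grep trick, assuming the storage is mounted under /z with names of the form /z/[zonename]_[resource] (the sample lines and zone names below are made up):

```shell
#!/bin/sh
# Keep only the lines of "df" or "mount" output that belong to one zone,
# relying on the zone name being embedded in every /z mountpoint.
# Argument: zone name. Reads the listing on standard input.
zone_mounts() {
    grep "/z/${1}_"
}

# Sample mount-style lines standing in for real output (hypothetical paths).
printf '%s\n' \
    '/z/s8zone_root on /dev/dsk/c2t0d0s0 read/write' \
    '/zones/s8zone on /z/s8zone_root read/write' \
    '/zones/s8zone/root/opt/oracle on /z/s8zone_opt_oracle read/write' \
    '/z/otherzone_root on /dev/dsk/c2t1d0s0 read/write' |
    zone_mounts s8zone
```

On a live system the same filter would be fed from "mount" or "df -k" in the global zone. Because the lofs mounts list their /z source, one pattern picks up both the underlying storage and everything projected into the zone.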

I also try to avoid using devices within zones. This is especially important when I am using branded containers, as it creates interesting dependencies between the Solaris 8 software and the Solaris 10 kernel interfaces. One prime example is Veritas Filesystem (vxfs) and Veritas Volume Manager (vxvm). Often, when migrating a physical Solaris 8 machine into a branded container, admins will try to move administrative functions into the container and treat it as a real machine. Just say no. There are huge benefits in moving "administrative functions" into the Global Zone, and leaving the application functionality within the branded container. Manage your local storage, SAN, and volume resources from the GZ, and just present "disk space" into the local zones.

Other administrative functions that are much easier to keep in the GZ include backups, network administration, configuration management, and performance and capacity management. I find a lot of effort being spent on trying to "shoehorn" these functions into a container, when logically they belong in the "system", or GZ. Separating the "system" (there is only one, and it is a shared resource) and the "application" (there can be many, and they consume the "system") has huge benefits in the ongoing administration and maintenance efforts. This is more of a "how to think" problem than a "what to think" education.

Just my opinions, your mileage may vary, objects in mirror may be closer than they appear, etc. Feel free to comment, debate, or object in the comments below.


Saturday Jul 26, 2008

Branded Container Hurdle...

We ran into an interesting hurdle with our branded containers project. Once we had everything up and running, we decided to test patching the global zone (Solaris 10) and patching within the local zone (Solaris 8). We downloaded both "Recommended Patch Clusters" from SunSolve, and applied them.

The Solaris 10 patch installation went without a hitch (as expected), but adding patches to the Solaris 8 branded container failed. The error message from patchadd was that there wasn't enough disk space to install the patches. None of the patches would install. We did some checking, and "df -h" showed that we had 22GB of free disk space on the partition containing the zone. Weird. So we looked at patchadd, and it was running "df" on the /var/sadm directory to see if there was enough space to hold the files necessary to back out the patches. "df -k /var/sadm" failed with a message "special device and mount point are the same for loopback file system". So "df -k" worked fine, but "df -k [something]" would fail. Interesting.

I popped a note off to the Branded Containers team at Sun, and received an answer within 10 minutes. Apparently I was running into a known (and fixed) bug:

Patch Id: 137749-01

fixes BugID 6653043 ...


When running A solaris 8 zone within Migration Assistant (etude), 
it is possible to have a loopback filesystem mount in which the 
underlying mount is not visible in the zone.

The actual issue is addressed very simply in the Solaris 10 df 
by saving the best match while doing the walk backwards through 
the mount table. 

This bug manifests itself if your branded zone is using filesystems that have loopback filesystems (lofs) underneath. Since we were mounting our disk space under /z/[zonename]_root and then using lofs to mount it into the zonepath (/zones/[zonename]), this one bit us. Even our application and data spaces were mounted under /z (with zonename and resource name included in the mountpoint name) and then imported into the zones with zonecfg "add fs". A little more research revealed that the error message that we were receiving was actually removed from the Solaris 8 source code as part of the fix.

-       if (EQ(match->mte_mount->mnt_fstype, MNTTYPE_LOFS) &&
-               EQ(match->mte_mount->mnt_mountp,
-               match->mte_mount->mnt_special))
-               errmsg(ERR_FATAL,
-       "special device and mount point are the same for loopback file system");

Now comes a more interesting question: if I need a patch to make adding patches work, how can I install that patch? The answer is to use the "patchadd -d" option, which tells patchadd not to save the old versions of the files being patched (in this case, just the "df" binary).
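A sketch of that step, run inside the Solaris 8 branded zone. The directory where the patch was unpacked is hypothetical, and the script prints the command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run by default: print each command. Set DRY_RUN=0 to execute it.
run() {
    if [ "${DRY_RUN:-1}" = 0 ]; then "$@"; else echo "$*"; fi
}

# -d tells patchadd not to back up the files being replaced, which avoids
# the failing "df" space check but means the patch cannot be backed out.
# Argument: directory containing the unpacked patch.
apply_df_patch() {
    run patchadd -d "$1"
}

# Hypothetical location of the unpacked patch.
apply_df_patch /var/tmp/137749-01
```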

Interesting note, patch 137749-01 is not contained in the latest Solaris 8 recommended patch cluster, or in the Solaris 8 Branded Containers package. This patch must be downloaded separately and applied by hand to the Solaris 8 branded container.

One of the Branded Containers support team guys did file an RFE to get the patch included in the software distribution for Solaris 8 Branded Containers for a future release.

Hopefully there is enough info in this blahg to help others who might Google the error message or symptom, and provide a quick path to the answer.


Friday Jul 18, 2008

Branded zones are your friend...

I'm working with a different customer for a couple of weeks. This is a large financial/insurance company going through a fairly common set of issues. The primary issue that I am here to help with is that a lot of their environment is running on Solaris 8, and they are finding it very difficult to justify buying "old school" hardware to expand the Solaris 8 server farm. Until they can update the applications and complete testing cycles, the new hardware and OS features aren't an option. Until now.

We are doing Physical machine to Virtual machine (P2V) work to re-host some of those Solaris 8 workloads into Solaris 8 branded containers (zones) on a Solaris 10 host. Lots of advantages here. We can now run on up-to-date hardware (testing now with a T5220). We can take advantage of ZFS, DTrace, performance improvements, and all those other Solaris 10 features that didn't exist in Solaris 8. Best of all, we can use the zones as a development, test, and migration tool moving forward to bring these working environments up to current releases of software applications, tools, and operating environment without having to spawn off even more machines for the migrations. The applications running in Solaris 8 branded zones now on host X will become Solaris 10 native zones at some point, running the updated applications and services.

There are several blahg entries to come... Doing the P2V to host the Solaris 8 "system" into a branded zone. Migrating the branded zone back and forth between physical machines (think clustering and hardware upgrades/service windows). Integration of this work within a SAN environment with BCVs (for backup services) and volume management. Cloning / copying the production zones into development and test environments. There are tons of possibilities in this kind of architecture, and a few gotchas and constraints to go along with them. I'll cover some of the key points over the next week or so.

Since I don't have a SPARC machine in my hotel room, I created "Solaris 10 branded containers" on x86. This enables me to play around under VirtualBox on my laptop from a hotel room to model and test. Yeah, this isn't a supported function at all, but it did enable me to learn a lot about how zones (and particularly branded zones) work. Key features here include being able to install a zone from a flar or ufsdump image of a physical system or VirtualBox VM (P2V for Solaris 10 x86) from the zonecfg interface, and being able to emulate the SAN "attach/import/export/detach" functionality by moving my virtual disks that contain the branded zones between virtual machines.

Yeah, a lot learned this week, and not enough time to write up the details just yet.

So here it is, the end of the first week, and we have:

  • Installed two systems with Solaris 10, patched

  • Installed the branded containers software

  • Created, mounted, and configured our SAN based storage

  • Created, configured, installed, and verified a pair of Solaris 8 branded containers from production system flars (unconfig'd and preserved)

  • Used zoneadm detach / attach and appropriate storage magic to move zones between the physical machines

  • Learned *a lot* about how zones and branded zones function with the global zone, device mounts, storage devices, etc.
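The detach / attach move in the list above could be sketched roughly as follows. The zone name and mountpoints are hypothetical, the bare "mount [mountpoint]" calls assume matching /etc/vfstab entries on both hosts, and the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run by default: print each command. Set DRY_RUN=0 to execute it.
run() {
    if [ "${DRY_RUN:-1}" = 0 ]; then "$@"; else echo "$*"; fi
}

# On the host giving up the zone.
# Arguments: zone name, zonepath, storage mountpoint under /z.
release_zone() {
    run zoneadm -z "$1" halt
    run zoneadm -z "$1" detach
    run umount "$2"
    run umount "$3"
}

# On the host receiving the zone, once the SAN volume is visible there.
# The "create -a" step is only needed the first time a host sees the zone;
# after that the configuration already exists and attach is enough.
adopt_zone() {
    run mount "$3"
    run mount "$2"
    run zonecfg -z "$1" create -a "$2"
    run zoneadm -z "$1" attach
}

release_zone s8zone /zones/s8zone /z/s8zone_root
adopt_zone s8zone /zones/s8zone /z/s8zone_root
```

The "storage magic" in between (making the SAN volume visible on the other host) depends entirely on the site, so it is left out here.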

Wow. All that in a week. This team rocks. Oh yeah, so does VirtualBox, Solaris 10, zones, branded zones, and my favorite debugging tool, Google. :)


Friday Jul 11, 2008

Patches?! We don't need no steenking patches...

"Badges? We ain't got no badges. We don't need no badges. I don't have to show you any stinking badges!"

--Gold Hat, as played by Alfonso Bedoya
"The Treasure of the Sierra Madre" (1948)

I have been using my VirtualBox hosted JET server to install my VMs in VirtualBox for a couple weeks now. I have been using Solaris 10 update 5, and haven't worried about new patches. Until now.

Apparently the JET package from BigAdmin expects patch clusters/bundles to be in a directory called "10_x86_Recommended" with a patch_order file sitting amongst the individual patch directories (e.g. 109773-10/). The new patch clusters from SunSolve (using the 0508 extracted zips as an example) look like:

      Copyright_chunk2        \*

The three options are pretty simple. We could run the bundle's install script as a post-install task after the first reboot in this section of the template. Don't forget to include an NFS mount of the patch bundle directory in your custom script, and read the README file to make sure everything will work. The script asks for a password before installing, to make sure you have read the README.

# Scripts to be run on the client at the end of the build
# The scripts must be placed in the directory
#       /opt/jet/Clients/
# and will be copied to the client.
# If you want to run custom scripts during the Jumpstart
# phase, use the custom_scripts_f variable below.
# Custom scripts at subsequent boots
#  denotes the boot number you want the action to be performed.
# You can create new variables for boot levels 2,3,4 etc.
# n means after the last reboot. i.e. last.
# m means n-1. i.e. before the last reboot. Use m if you need to
# guarantee a reboot after the action is performed.

The second option would be to define the patch bundle as a custom patch set in this section of the template:

# Custom patch sets... create a directory in the patch 
# directory named after the set, and put a patch_order 
# file in it, along with the patches...
# (Space seperated list of patch set names)
# N.B. as a side effect, if a directory exists under the 
# patch set dir named
#      after the OS, (uname -r), the subdirectory will be 
#      used instead of the main patchset directory
#       i.e /export/install/patches/patchset/5.8 takes
#           preference over /export/install/patches/patchset

The third option (that I chose) was to make my patch tree appear to be what JET was expecting, using this section of the template:

# N.B. Unless you need to point this client at alternate 
#      media for patches and packages that is not held on 
#      this server, please skip this section!
# productdir    is where to find the products. This should 
#               be a URI style path, i.e. 
#               nfs:// 
#               If the server is the JumpStart server, then 
#               it should just be specified as a normal path.
# patchdir      is where to find the patches. Same format 
#               as productdir.


In order for this to work, the patches have to be in the "10_x86_Recommended" directory (discovered through trial and error, and looking at the logs on the installed VM), accompanied by a patch_order file.

ayrton# cd /export/install/patches

ayrton# ls

ayrton# ln -s patchbundle_0508_x86/Patches 10_x86_Recommended

ayrton# cd patchbundle_0508_x86/Patches

ayrton# ln -s ../patch_order patch_order

So now there is a 10_x86_Recommended directory (symlink) visible in /export/install/patches that contains the patch directories and a patch_order file. As always, your mileage may vary, no warranty expressed or implied, but it worked for me.

PS - Kudos to Mike Ramchand for finding my real answer. Apparently the 0508 bundle is supposed to bring you up to the level of the 05/08 Solaris update release. The real patch cluster (with the correct directory structure) is on SunSolve under "Solaris 10 x86", buried down in the list underneath the "Solaris 10 x86 0508 Patch Bundle" entry. Oops. Oh well, now we have two choices, but the obvious "best answer" is to just install the update 5 Solaris image in the first place from your JET server.




