Tuesday Jan 02, 2007

Virtualisation - Hype, Habit or Helpful

From the conversations that I have with my colleagues at Sun Microsystems, with friends and colleagues in the industry and with customers, there is certainly plenty of hype surrounding virtualisation. It even has its own acronym - V12N - there are 12 letters between the V and the N in virtualisation, just as there are 18 between the I and the N in internationalisation - I18N. In common with anything that achieves this level of hype, we often lose track of the real problem that we are trying to solve, and we also run the risk that many people lose interest, believing it to be nothing but hype. A colleague of mine once said that the problem with technologists is that we spend the minimum possible time analysing the problem until we find a tool or technology that might solve it, and then focus on deploying that tool as the panacea for all problems that look similar. I like to believe that some of us, at least some of the time, can step beyond this statement, but let's face it, we have all been guilty of it at one point in our career.

So let's take a step back and look at the problem which virtualisation is trying to solve, and also take a look around at what other tools we might deploy in the fight, before we come back to virtualisation and where it has a role to play.

The real problem that we are trying to address is utilisation, or to be more precise the lack of it. This under-utilisation of IT resources is driving a complete over-use of power (both for the systems themselves and for cooling these systems), which at the moment comes largely from non-renewable sources of energy. So one solution might be to find a renewable, infinite source of power that does not affect climate change! This I think may be the subject for someone else's blog. The other problem that this under-utilisation is driving is a physical space issue. Finally there is the cost of servicing and supporting these largely under-utilised resources.

When I first started my interest in technology and computing, the idea of NOT having to share the resources that you used simply did not exist. The cost of acquisition of the resources meant that this was not economically viable, and nobody expected to have a “Personal Computer”. Yes, I know that this dates me. Today it is no longer the cost of acquisition (which has fallen so low as to be almost irrelevant) that is the issue but the cost of ownership: system management, maintenance, space, power and cooling. The fall in the cost of acquisition has meant that our expectation of accessing computing resources has changed. The mind-set now is that I must own the computing resources that I use. This applies at all levels from the data centre to the end user. Many end users will now be screaming that this is not true: "I use many network-based services and resources, such as e-mail, calendar, database, web, etc., that I do not own" - bear with me for a minute or two longer.

The expectation of the desktop user is that the way to access the network is with their own computing device (laptop, PC, PDA) - look at the large “My Computer” label in current versions of Windows if you still don't believe me. This behaviour leads to some of the lowest utilisation levels around: even for a "power user" less than 5%, and for most users less than 1%. Before you start screaming about these numbers being too low (my load monitor averages >70%), think about how many hours your desktop or laptop spends switched off, on standby or, worse still, just running a screensaver. Then think of how much load the machine has for the majority of the time that we are using it. We have designed a solution that is scoped for a peak load condition that will NEVER EVER happen! All this to ensure that on the odd occasion that we need all the resources in our machine they are guaranteed to be available.
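To show where numbers of this order come from, here is a rough Python sketch of the arithmetic. The inputs (hours in use per week and average load while in use) are illustrative assumptions of mine, not measurements, and the function name is made up for this example.

```python
HOURS_PER_WEEK = 24 * 7


def effective_utilisation(hours_active_per_week, avg_load_while_active):
    """Fraction of a machine's total weekly capacity that is actually used.

    Both arguments are hypothetical inputs for this sketch:
    hours_active_per_week  - hours the machine is really working for its user
    avg_load_while_active  - average CPU load (0.0 to 1.0) during those hours
    """
    if not 0 <= hours_active_per_week <= HOURS_PER_WEEK:
        raise ValueError("hours_active_per_week must be between 0 and 168")
    if not 0.0 <= avg_load_while_active <= 1.0:
        raise ValueError("avg_load_while_active must be between 0.0 and 1.0")
    return (hours_active_per_week / HOURS_PER_WEEK) * avg_load_while_active


# Assumed profiles, chosen only to illustrate the calculation.
typical_user = effective_utilisation(40, 0.04)  # office hours, light load
power_user = effective_utilisation(50, 0.15)    # longer hours, heavier load

print(f"typical user: {typical_user:.2%}")
print(f"power user:   {power_user:.2%}")
```

The point of the sketch is that the load monitor only tells you the second factor, and only while you are looking at it; the hours the machine spends idle or switched off drag the overall figure down.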

Now let's move into the data centre. A similar but slightly less out-of-control condition occurs here. In order to manage resource allocation and guarantee service levels we use the same mentality to deploy applications or services and ensure that they have their own infrastructure. While the service might be shared between multiple users, the infrastructure that delivers the service is rarely shared between services. Estimates for utilisation in the data centre vary between 20% and 50% but rarely exceed that.

So how do we try to address this problem? Well, basically we need to start changing the mind-set to one of sharing resources. This is where the habitual comes in. The biggest issue that we face is to break the bad habits that have led to the current state of affairs.

One way is to take a physical machine, divide it into a number of virtual machines and allocate these virtual machines to users. Virtualisation. As I mentioned, this is by no means the only approach, and it is not really breaking the habit, just virtualising the problem. Why not just share the system? Windows, Linux and Solaris all now have the ability to support multiple users sharing the same system. Provided that applications are written in such a way that they expect to see multiple users and multiple instances of the application on the same system, then you can share. All you need to ensure is that appropriate resource management is in place and that there is appropriate security between users. Solaris Resource Manager and Solaris security and privileges provide such a set of mechanisms. Taking this a step further, Containers in Solaris 10 provide a way of dividing a single OS instance so that it can look to the application layer like multiple instances of the OS. This provides a very lightweight (in both performance and licensing - it is free with Solaris 10, which is itself free) way of having multiple instances of the same OS, and because it uses a single underlying OS instance the cost and complexity of OS management, maintenance and patching is significantly reduced.

Another technique is the physical isolation of hardware resources that can be managed dynamically without the need for system restarts or reboots. This type of technique has been deployed in the higher-end Sun machines since 1997 in the form of domains. With the recently announced Logical Domains (LDOMs) within the T1000/T2000 family of servers, this hardware isolation is now brought inside a single CPU and offered at the hardware threading level. LDOMs also bring the approach within the reach of cheaper horizontally scalable systems rather than leaving it purely the reserve of larger vertically scalable SMP systems.

The next approach is the hypervisor or virtual machines offered by Xen or VMware for x86 based architectures. Similarly, at a hardware level, Logical Domains provide a hypervisor for SPARC based CMT systems without the cost (no licensing required) and performance overhead of an additional software layer. These techniques allow you the flexibility to install multiple different OS versions or types on the same machine. For example, VMware and Xen will support Linux, Windows and Solaris amongst others, and LDOMs support BSD, Linux and Solaris.

So far we have talked about dividing resources between multiple tasks at a single point in time, but what about sharing resources over time? Particularly with horizontally scaling applications, a system built for peak load will have a number of systems idle for long periods of time. Why not reuse these for something else? If the peak loads for various services can be mapped to different times then this re-provisioning of systems becomes an option, and not a hypervisor VM in sight. To do this you need to start to look across services - sharing within a company or, god forbid, sharing between organisations. This was the concept behind the dollar per CPU per hour utility computing service from Sun. Much has been learnt from building this service about managing multi-tenancy of a pool of resources, and we are currently using that experience to help large organisations deploy the same approach within their own infrastructure to provide resource sharing and billing between departments in the same organisation.
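As a back-of-the-envelope illustration of sharing over time, the Python sketch below compares giving every service its own peak-sized farm with running one shared pool sized for the combined peak. The service names, the demand figures and the function names are all invented for the example, and it deliberately ignores re-provisioning time and safety headroom.

```python
def dedicated_servers(demand_by_service):
    """Servers needed when each service owns enough kit for its own peak."""
    return sum(max(profile) for profile in demand_by_service.values())


def shared_pool_servers(demand_by_service):
    """Servers needed when one pool is sized for the combined peak demand."""
    profiles = list(demand_by_service.values())
    if not profiles:
        return 0
    if len({len(profile) for profile in profiles}) != 1:
        raise ValueError("all demand profiles must cover the same time slots")
    # Add up demand across services for each time slot, then take the worst slot.
    return max(sum(slot) for slot in zip(*profiles))


# Hypothetical demand (in servers) for four six-hour blocks of a day.
demand = {
    "web_shop":      [2, 10, 8, 3],
    "batch_reports": [9, 1, 1, 6],
    "payroll":       [1, 2, 6, 1],
}

print("dedicated:", dedicated_servers(demand))
print("shared:   ", shared_pool_servers(demand))
```

The saving only appears when the peaks fall in different time slots; if every service peaks at the same moment the two numbers are identical, which is exactly why you need to look across services before deciding what to share.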

So what are the conclusions of this rant? First, don't get CAUGHT UP IN THE HYPE of virtualisation. Take a step back and define the problem you want to address and PARAMETERISE THAT PROBLEM. Do not get caught up in selecting tools and technology until you have your problem statement defined. Do NOT UNDERESTIMATE the cultural and behavioural change that is required to yield benefits from this approach. This will be by far the hardest problem to solve, rather than products, technology or process. An essential first step in improving utilisation, desktop or data centre, is strong standardisation and governance. Without this you do not have the option to share, since everything is bespoke.

Virtualisation is one weapon in the armoury but by no means is it the only one.

We must solve the problem of utilisation in order to prevent the continual drive for new technology coupled with reducing utilisation driving the power and cooling requirements to such a point that it destroys the planet on which we need to house this equipment.



