Sunday Aug 30, 2009

Three dimensions of Virtualisation

Another piece of what I hope is wisdom, coming from my last three months of customer conversations, is that virtualisation has three dimensions.

We use virtualisation to make large systems small; I call this “Atomisation”. We can also use virtualisation technologies to make many components seem as one, which is of critical use for horizontally scalable services; I call this “Aggregation”. The third dimension is “Longevity”. Maybe I should play around with “Age” as a word, so each dimension has a mnemonic starting with “A”, but the point is that by using a Type II hypervisor, one can protect old software against platform innovation and continue to run it until its business case changes or expires.


How new is Cloud Computing?

I have spoken to several of Sun's customers over the last 3 months about Cloud Computing and have often used the following quote.

“When we build a distributed computing platform and run one application on it, we call this HPC; when we build a distributed computing platform and run many copies of one application on it, we call this Web 2.0; and when we build a distributed computing platform and run many applications on it, we call it Cloud Computing.”

Who said it? Me!

It's not quite true, but the difference between the platforms is not necessarily as great as some might like to make it seem. Web 2.0 platforms are rarely as economic as running many copies of one application, but theirs is a pretty small portfolio, often supporting only one end-user application. I accept that elasticity and metering are important, unsolved, or not well solved problems in the cloud world, but I think the quote is worth publishing here and repeating, and it offers insights into planning and evolving the next generation of IT platforms.


Wednesday Mar 18, 2009


There is a conversation on the Google Groups cloud computing list about Cisco's plans to enter the server market, kicked off by this article at Business Week.

The dimension only just missed in that conversation is the opportunity to get design synergies on the hardware between networking and systems. Why do large scale users have to buy switches and servers as separate procurements? Perhaps the next stage is to migrate the network functionality to a software appliance, so one buys a box and then decides what to do with it. (I know that a switch needs a lot of ports where a non-switch system only needs two, but modern blade systems are modularising this design area as well.)

The interesting questions then left are whether the data centre, or network, can consolidate to one cabling standard and performance, and when the need for separate networking (or interconnect) technologies between CPUs and systems will decline. (If ever?)

I know some computer scientists thinking about tomorrow's problems are interested in this sort of thinking.


Tuesday Nov 25, 2008

Managing Tomorrow's Cloud

An off-agenda session on Cloud Computing, kicked off by William Fellows of the 451 Group. I quite like his stacks, both of functionality, illustrating what needs to be done, and of the evolution of the cloud from its partly failed predecessors. The discussion then moved to management, with contributions from IRMOS and the Autonomic Internet project, which sounds a bit IBM'ish but isn't. There's obviously some thinking going on about service management for clouds and networks, looking at life cycle issues (is this just job management? probably not, because of birth and death), self-functioning, SLAs and QoS issues. It seems to me that Robert Holt's experimentation with SMF is exactly the right thing to do. The features that Sun's Service Management Facility adds to the operating system are a foundation on which a number of capabilities can be built that meet the needs of cloud managers. The BREIN project says about itself,

"BREIN takes the e-business concept developed in recent Grid research projects, namely the concept of so-called "dynamic virtual organisations" towards a more business-centric model, by enhancing the system with methods from artificial intelligence, intelligent systems, semantic web etc."

I love the etc. It always makes you think people know exactly what they're doing. They have published a white paper here.... Despite this, these projects and this approach might well enable automated SLA negotiation. Can we create a semweb for SLAs? It has always been the case that the sustaining and management science comes after the invention stage, but this was a jolly interesting session, addressing issues identified as crucial by both myself and colleagues at Sun and by leading industry commentators. If we don't, or can't, automate this stuff, we are going to run out of people.


Thursday Nov 06, 2008

What will the Cloud do?

I was pointed at the Eucalyptus project, an open-source software infrastructure for implementing "cloud computing" on clusters, by a colleague and decided I needed to check out Amazon first. Several colleagues have given me this advice, but has the university really written an open source grid platform conforming to Amazon's EC2 APIs?

If so, it's a fascinating example of the speed of commoditisation. It raises the question of where the value is in building clouds. If you can't innovate above the system components, where can you innovate? It's obviously pointless to copy what Google did 10 years ago, and if the assembly is available in open source you should probably use it. The space left by Amazon for a competitive threat is that they major on Infrastructure as a Service, although of course, given the operating systems available, you can quickly turn it into a platform. I have just checked Amazon's EC2 page, and they now offer a database query interface to their storage solution. The space left is to offer higher levels of abstraction, specifically by offering Java, Python or Ruby space to customers, and this is what Sun's Project Caroline does. Sun also innovates at the system, silicon and software layers. IT systems are not really commodities, and sedimentation means they will continue to change; the industry still needs innovators. IT isn't done yet.


Billing for Clouds

When considering some of the issues related to building private clouds, the "Usage to Billing" problem was raised, and I was reminded of Emlyn Pagden's Blue Print the Utility Model - Part II. I had been consulting with a mid-sized European investment bank, and discussed the architectural problem with them, and with Emlyn. It's been a while since I read Emlyn's paper, but he took the architectural decomposition

  • Measurement - what are people using
  • Aggregation/Mediation - across the whole estate
  • Allocation - how many charges have they incurred
  • Invoicing - give us our money

and built a reference implementation using Solaris Resource Manager and accounting functionality and some third party products. At the time, he was working for a team that wanted to sell third party software; he had no engineering resources and thus a propensity to use third party software before building significant scripted functionality. With different resources and motivations, the reference implementation might look quite different, but the paper, which was based on a real prototype, exposes a working solution. I suspect that not all the companies he mentions still exist, or remain in the "Systems Management" business. However, the decomposition should allow easy replacement, and the advances in SOA may make this easier to do.
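The decomposition can be sketched as a small pipeline. The record shapes, project names and rates below are hypothetical, my own illustration rather than anything from Emlyn's paper, but they show how the first three stages compose (we agreed invoicing belongs elsewhere):

```python
from collections import defaultdict

# Hypothetical usage records, as a metering agent (Solaris extended
# accounting, say) might emit them: (project, resource, quantity).
usage_records = [
    ("trading", "cpu_hours", 12.0),
    ("trading", "gb_stored", 50.0),
    ("risk",    "cpu_hours", 3.5),
    ("trading", "cpu_hours", 8.0),
]

# Illustrative tariff, set for cost recovery rather than profit.
tariff = {"cpu_hours": 0.5, "gb_stored": 0.25}

def aggregate(records):
    """Aggregation/Mediation: roll raw measurements up across the estate."""
    totals = defaultdict(float)
    for project, resource, qty in records:
        totals[(project, resource)] += qty
    return totals

def allocate(totals, tariff):
    """Allocation: turn aggregated usage into charges per project."""
    charges = defaultdict(float)
    for (project, resource), qty in totals.items():
        charges[project] += qty * tariff[resource]
    return dict(charges)

charges = allocate(aggregate(usage_records), tariff)
print(charges)  # {'trading': 22.5, 'risk': 1.75}
```

The value of the decomposition is visible even in a toy: each stage only sees the output of the one before it, so a real estate could swap the measurement layer, or hand the charges to an ERP system for invoicing, without touching the rest.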

One of the key problems that inhibit adoption of these solutions is that end-user IT departments are cost centres and financially aim to spend or underspend their budgets. Their outgoing charging tariffs are based on cost recovery and they don't care how busy what they supply is; they have to charge for what they supply. If they don't do this they make a loss, and the CIO gets fired.
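The cost-recovery problem is simple arithmetic; the numbers below are invented, but they show why a cost centre's tariff ignores utilisation in a way a usage-billed utility cannot:

```python
# Hypothetical numbers: a cost-centre IT department must recover its
# whole budget from whatever it supplies, however lightly used.
budget = 1_200_000.0        # annual cost to recover
servers_supplied = 400      # units charged out, busy or idle

tariff_per_server = budget / servers_supplied
print(tariff_per_server)    # 3000.0 per server, independent of utilisation

# Seen through a usage-billed lens, an idle estate under cost recovery
# looks expensive: only the busy fraction is doing chargeable work.
busy_fraction = 0.25
effective_cost_per_busy_server = tariff_per_server / busy_fraction
print(effective_cost_per_busy_server)  # 12000.0
```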

Neither he nor I experimented with testing this on a grid, and it might involve having a global /etc/project name space across the whole cloud, but with VirtualBox testing these things becomes easier. Sadly I have picked up enough projects from this trip already, but now I need to build a grid on a laptop.

We both agreed that the invoicing function was best left to the ERP system. Private cloud builders may not need to produce an invoice, since they may not be using real money, but they will have to make some entries in the financials systems, either cost-relief transfers or something similar. Also, new start-ups of public clouds may wish to look at Openbravo, an open source ERP package.


Talking about Cloud Computing

The current technical state of systems, storage and networking, and specifically the cost of broadband networking, has created a tipping point. Over the last 10 years, organisations and people have been learning to build new distributed computing server complexes. It may be too late to copy the leaders, but certain design criteria and the regulatory constraints may mean that there is a slower commercial adoption cycle. The privacy, availability and response time requirements for businesses are all different. In my mind, it's commercial adoption that turns grids into clouds.

"One class of grid is where we locate one application, which has many identical parts, on a distributed computing platform, and we call this HPC; where we locate many copies of one application, be it Apache, Glassfish or MySQL, on a distributed computing platform, we call it Web 2.0; and when we locate many applications on a distributed computing platform, we call it Cloud Computing"

Dave Levy

It's commerce that has the need for the cloud, because commercial organisations usually have a large portfolio of applications, some of which behave like HPC and some of which behave like Web 2.0, and it's the economics of utility that drives this. Sun's ERP solutions have leveraged our product portfolio and Moore's law to become a tiny fraction of Sun's IT estate, with the community infrastructure and the design support solutions, implemented on Web 2.0 and HPC grids, now dominating Sun's internal network in terms of cycles, storage and cost.

Admittedly, there are other aspects of what makes a cloud different from the payroll bureau of thirty years ago.

Data centres are expensive and, as we have been discovering over the last few years, are best built for purpose. Building and running data centres also benefits from specialisation. In "The Big Switch", Nicholas Carr argues that the efficiencies of the plant apply to IT. (I'm really going to have to read it.) Historically, applications developers have tightly coupled their code with an operating system image, specifying the version, library installs, package cluster and patch state. This is beginning to end. Developers want to, and do, develop to new contracts, be it Java, Python or another run time. Also, with virtualisation technology such as VirtualBox and VMware, deployers can build their utility plant and take an application appliance with an integrated OS and applications run time; this allows developers to choose whether to use modern dynamic runtimes or to tightly integrate their code with the environment.

A second driver is the amount of data coming on-line. This cornucopia of data is enabling and creating new applications, of which internet search is an obvious one. Google scans the web, but many companies, and increasingly social networks, are scanning their storage to discover new valuable pieces of information. Internet scale also means the "clever people work elsewhere" rule of life is generating new questions. The growing number of devices attached to the internet is also discovering and delivering new digital facts. The evolution of the internet of things will make the growth in data explosive, so it's a good time to be introducing a new disruptive storage capability and economics. The need to analyse this massive new data source is what's driving the emergence of Hadoop and Map/Reduce. Only parallel computing is capable of getting information out of the data in any reasonable time. A fascinating proof point is documented on the NYT blog, where Derek Gottfrid shows how he used Amazon's cloud offerings to convert the NYT's 4TB archive into PDF using Hadoop. I'd hate to think how long it might have taken using traditional techniques.
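The map/reduce model behind Hadoop can be sketched in a few lines. This toy word count is my own illustration, not the NYT's job; the point is that the map phase runs independently per document, which is what lets the work spread across a grid:

```python
from collections import defaultdict
from itertools import chain

def map_phase(document):
    """Map: emit a (key, value) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]

def reduce_phase(pairs):
    """Reduce: sum the values for each key, as a reducer would per partition."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

# Each document can be mapped on a different node; only the shuffled
# pairs meet at the reducers, which is what makes the model parallel.
documents = ["the cloud", "the grid and the cloud"]
pairs = chain.from_iterable(map_phase(d) for d in documents)
counts = reduce_phase(pairs)
print(counts)  # {'the': 3, 'cloud': 2, 'grid': 1, 'and': 1}
```

Hadoop adds the hard parts this sketch omits, such as the distributed shuffle, fault tolerance and data locality, but the programming contract is exactly these two functions.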

One tendency I have observed from my work over the last year is that building grids is no longer hard, and, most dramatically, Amazon and Google are turning their grids to applications hosting. A number of public sector research institutions have also been building publicly available grids for a while, although they tend to share amongst themselves. In the public sector world at least, they have begun to address the question of grid interoperability, and everyone is looking at how to 'slice' resource chunks out of the grid for users, on demand of course.

In the commercial world, the competitive positioning of various players has led to them competing with different services at different levels of abstraction. The offerings of Google's App Engine and Amazon's EC2 are quite different. Sun believes that cloud computing offerings now need to organise above the OS level, and that developers don't want to worry about the operating system, merely their run time execution environment. This is only possible because modern development and runtime environments can protect developers from both the CPU architecture and now the operating system implementation. I know that, as I search for a new solution for the services I run on my Qube, I'm happy to configure the applications and their backups, but I don't want to worry about disk reliability and other system services.

Jim Baty made the comment that we're entering a Web 3.0 world which is chmod 777 for everyone. :)

So the economics are compelling, the state of technology is right, developers are ready to leave these decisions behind and the first movers are moving.

Can and will Sun play a role in this next stage of the maturing of IT?

This article is, I hope, the first of two, written from notes made during a presentation by Jim Baty, Chief Architect, Sun Global Sales and Services, Scott Matton, one of the senior architects in GSS, and Lew Tucker, VP & CTO. The article is back-dated to about the time of occurrence.

Monday Nov 03, 2008

Building new age clouds

Sohrab Modi introduced three presentations from Sun Labs, on Hadoop & HBase, and Project Celeste. He also pointed us at something else, which I have downloaded and shall let you know how it goes.




