Talking about Cloud Computing
By DaveLevy on Nov 06, 2008
The current technical state of systems, storage and networking, and specifically the cost of broadband networking, has created a tipping point. Over the last 10 years, organisations and people have been learning to build new distributed computing server complexes. It may be too late to copy the leaders, but certain design criteria and regulatory constraints may mean that commercial adoption follows a slower cycle. The privacy, availability and response time requirements for businesses are all different. In my mind, it's commercial adoption that turns grids into clouds.
"One class of grid is where we locate one application, which has many identical parts, on a distributed computing platform, and we call this HPC; where we locate many copies of one application, be it Apache, Glassfish or MySQL, on a distributed computing platform, we call it web 2.0; and when we locate many applications on a distributed computing platform, we call it cloud computing."
It's commerce that has the need for the cloud, because commercial organisations usually have a large portfolio of applications, some of which behave like HPC and some of which behave like web 2.0, and it's the economics of utility that drives this. Sun's ERP solutions have leveraged our product portfolio and Moore's law to become a tiny fraction of Sun's IT estate, while the community infrastructure and the design support solutions, implemented on web 2.0 and HPC grids, now dominate Sun's internal network in terms of cycles, storage and cost.
Admittedly, there are other aspects of what makes a cloud different from the payroll bureau of thirty years ago.
Data centres are expensive and, as we have discovered over the last few years, they are best built for purpose. Building and running data centres also benefits from specialisation. In "The Big Switch", Nicholas Carr argues that the efficiencies of utility plant apply to IT. (I'm really going to have to read it.) Historically, applications developers have tightly coupled their code with an operating system image, specifying the version, library installs, package cluster and patch state. This is beginning to end. Developers want to, and do, develop to new contracts, be it Java, Python or another runtime. Also, with virtualisation technology such as VirtualBox and VMware, deployers can build their utility plant and take an application appliance with an integrated OS and applications runtime; this allows developers to choose whether to use modern dynamic runtimes or to tightly integrate their code with the environment.
A second driver is the amount of data coming on-line. This cornucopia of data is enabling and creating new applications, of which internet search is an obvious one. Google scans the web, but many companies, and increasingly social networks, are scanning their storage to discover valuable new pieces of information. Internet scale also means the "clever people work elsewhere" rule of life is generating new questions. The growing number of devices attached to the internet is also discovering and delivering new digital facts. The evolution of the internet of things will make the growth in data explosive, so it's a good time to be introducing disruptive new storage capability and economics. The need to analyse this massive new data source is what's driving the emergence of Hadoop and map/reduce. Only parallel computing is capable of getting information out of the data in any reasonable time. A fascinating proof point is documented on the NYT blog, where Derek Gottfrid shows how he used Amazon's cloud offerings to convert the NYT's 4TB archive into PDFs using Hadoop. I'd hate to think how long it might have taken using traditional techniques.
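To make the map/reduce idea concrete, here is a toy sketch of the programming model in plain Python, not Hadoop itself: a map phase emits key/value pairs, a shuffle groups them by key, and a reduce phase aggregates each group. The word-count example and all the function names are illustrative assumptions, not part of any Hadoop API; on a real grid each phase would run in parallel across many machines.

```python
from collections import defaultdict

def map_phase(doc):
    # Mapper: emit a (word, 1) pair for every word in the document.
    return [(word.lower(), 1) for word in doc.split()]

def shuffle(pairs):
    # Shuffle: group all emitted values by key, as the framework would
    # between the map and reduce phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    # Reducer: collapse each key's values into a single count.
    return key, sum(values)

docs = ["the cloud", "the grid and the cloud"]
pairs = [pair for doc in docs for pair in map_phase(doc)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
# counts == {'the': 3, 'cloud': 2, 'grid': 1, 'and': 1}
```

The point of the model is that the mapper and reducer are pure functions over independent chunks of data, which is exactly what lets a framework like Hadoop fan the work out across thousands of nodes.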
One tendency I have observed from my work over the last year is that building grids is no longer hard, and most dramatically Amazon and Google are turning their grids to applications hosting. A number of public sector research institutions have also been building publicly available grids for a while, although they tend to share amongst themselves. In the public sector world at least, they have begun to address the question of grid interoperability, and everyone is looking at how to 'slice' resource chunks out of the grid for users, on demand of course.
In the commercial world, the competitive positioning of the various players has led them to compete with different services at different levels of abstraction. Google's App Engine and Amazon's EC2, for instance, are quite different offerings. Sun believes that cloud computing offerings now need to organise above the OS level and that developers don't want to worry about the operating system, merely their runtime execution environment. This is only possible because modern development and runtime environments can insulate developers from both the CPU architecture and now the operating system implementation. I know that as I search for a new solution for the services I run on my Qube, I'm happy to configure the applications and their backups, but I don't want to worry about disk reliability and other system services.
Jim Baty made the comment that we're entering a Web 3.0 world which is chmod 777 for everyone.
So the economics are compelling, the state of technology is right, developers are ready to leave these decisions behind and the first movers are moving.
Can and will Sun play a role in this next stage of the maturing of IT?
This article is, I hope, the first of two, written from notes made during a presentation by Jim Baty, Chief Architect, Sun Global Sales and Services, Scott Matton, one of the senior architects in GSS, and Lew Tucker, VP & CTO of Network.com. The article is backdated to about the time of occurrence.