The current technical state of systems, storage and networking, and
specifically the cost of broadband networking, has created a tipping point.
Over the last 10 years, organisations and people have been learning to build
new distributed computing server complexes. It may be too late to copy the
leaders, but certain design criteria and regulatory constraints may mean
that there is a slower commercial adoption cycle. The privacy, availability and
response time requirements of businesses are all different. In my mind, it's
commercial adoption that turns grids into clouds.
"One class of grid is where we locate one application,
which has many identical parts, on a distributed computing platform and we call
this HPC; where we locate many copies of one application, be it Apache,
GlassFish or MySQL, on a distributed computing platform we call it web 2.0; and
when we locate many applications on a distributed computing platform we call it
a cloud."
It's commerce that has the need for the Cloud, because businesses usually have
a large portfolio of applications, some of which behave like HPC and some of
which behave like Web 2.0, and it's the economics of utility that drives this.
Sun's ERP solutions have leveraged our product portfolio and Moore's law to
become a tiny fraction of Sun's IT estate, while the community infrastructure
and the design support solutions, implemented on web 2.0 and HPC grids,
now dominate Sun's internal network in terms of cycles, storage and cost.
Admittedly, there are other aspects of what makes a cloud different from the
payroll bureau of thirty years ago.
Data Centres are expensive and, as we have been discovering over the last few
years, they are best built for purpose. Building and running Data Centres also
benefits from 'specialisation'. In "The Big Switch", Nicholas Carr argues that
the efficiencies of the plant apply to IT. (I'm really going to have to read it.)
Historically, application developers have tightly coupled their code with an
operating system image, specifying the version, library installs, package
cluster and patch state. This is beginning to end. Developers want to, and do,
develop to new contracts, be it Java, Python or another runtime. Also, with
virtualisation technology such as VirtualBox and VMware, deployers can build
their utility plant and take an application appliance with an integrated OS and
application runtime. This allows developers to choose whether to use modern
dynamic runtimes or to tightly integrate their code with the environment.
A second driver is the amount of data coming on-line. This cornucopia of
data is enabling and creating new applications, of which internet search is an
obvious one. Google scans the web, but many companies and, increasingly, social
networks are scanning their storage to discover new valuable pieces of
information. Internet scale also means the "clever people work
elsewhere" rule of life is generating new questions. The growing number of
devices attached to the internet is also discovering and delivering new
digital facts. The evolution of the internet of things will make the growth in
data explosive, so it's a good time to be introducing a new disruptive storage
capability and economics. The need to analyse this massive new data source is
what's driving the emergence of Hadoop and MapReduce. Only parallel computing
is capable of getting information out of the data in any reasonable time. A
fascinating proof point is documented on the NYT Blog, where Derek Gottfrid
shows how he used Amazon's cloud offerings to convert the NYT's 4TB archive
into PDF files using Hadoop. I'd hate to think how long it might have taken
using traditional techniques.
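To make the MapReduce idea a little more concrete, here is a minimal sketch in Python. It is not Hadoop and not what the NYT ran; it only illustrates the pattern of splitting data into independent chunks, processing each chunk separately (map) and then combining the partial results (reduce). The function names and the three-line sample "archive" are my own assumptions for illustration, and threads stand in for the separate machines a real grid would use.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def map_chunk(chunk):
    """Map step: count the words in one independent chunk of text."""
    return Counter(chunk.lower().split())


def merge_counts(left, right):
    """Reduce step: combine two partial word counts into one."""
    return left + right


def word_count(chunks, workers=4):
    """Apply the map step to every chunk concurrently, then reduce."""
    # Threads keep this sketch self-contained; on a real grid each chunk
    # would be shipped to a different machine.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_chunk, chunks))
    return reduce(merge_counts, partials, Counter())


# Hypothetical sample data standing in for a large archive.
archive = [
    "grids become clouds",
    "clouds need data",
    "data needs grids",
]
totals = word_count(archive)
print(totals.most_common(3))
```

Because no chunk depends on any other, adding more workers (or more machines) shortens the map phase, which is why this style of processing suits very large data sets.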
One tendency I have observed from my work over the last year is that
building grids is no longer hard, and most dramatically Amazon and Google are
turning their grids to application hosting. A number of public sector
research institutions have also been building publicly available grids for a
while, although they tend to share amongst themselves. In the public sector
world at least, they have begun to address the question of grid
interoperability, and everyone is looking at how to 'slice' resource chunks
out of the grid for users, on demand of course.
In the commercial world the competitive positioning of various players has
led to them competing with different services and different levels of
abstraction. The offerings of Google's "App Engine" and Amazon's "EC2"
are quite different. Sun believes that cloud computing
offerings need to organise above the OS level now and that developers don't
want to worry about the operating system, merely their runtime execution
environment. This is only possible because modern development and runtime
environments can protect developers from both the CPU architecture and now the
operating system implementation. I know that as I search for a new solution for
the services I run on my Qube, I'm happy to configure the applications and
their backups, but I don't want to worry about disk reliability and other
system services.
Jim Baty made the comment that we're entering a Web 3.0 world which is chmod 777 for everyone.
So the economics are compelling, the state of technology is right, developers are ready to leave these decisions behind and the first movers are moving.
Can and will Sun play a role in this next stage of the maturing of IT?
This article is, I hope, the first of two, written from notes made during a presentation by Jim Baty, Chief Architect, Sun Global Sales and Services; Scott Matton, one of the senior architects in GSS; and Lew Tucker, VP & CTO of Network.com. The article is backdated to about the time of the presentation.