By jasoncatsun on Jan 28, 2010
Please see my continued blog at http://archmatters.wordpress.com
Thanks for reading,
We've just published a new paper that explores risk and cloud application optimization. Does it make sense to refactor an existing application? Should I make it run on the cloud, or optimize it for the cloud? What is cloud computing application utopia?
You can find it here...
Table of Contents...
Benefits of cloud computing
Risks of cloud computing
Compatible with the cloud
Runs in the cloud
Optimized for the cloud
Example: two-tier Web service
Optimize through refactoring
Example: enterprise database cluster
Optimize through refactoring further
Seems like "DI" is everywhere today. It has been a set of core principles at Sun since 2006, helping us move customers toward cloud-like and cloud-enabled environments, and it continues in practice today. I wanted to review some of the principles we started with back in 2006 -- ones that I still believe are as relevant today as they were then.
You can also listen to a podcast at http://blogs.sun.com/jasoncatsun/entry/defining_dynamic_infrastructure_a_podcast linked in a prior blog.
So what are some of these principles? With Sun's DI we started with four basic ones...
USE AS MUCH AUTOMATION AS YOU NEED.
Whether it's bare-metal provisioning or data-center-wide orchestration (a la OpenDI -- see http://opendi.kenai.com and now http://veriscale-manager.kenai.com), there are always tradeoffs: spin-up time, complexity, etc.
START WITH STANDARDS AND DEFINE YOUR OPERATING CONTRACTS.
Standards are important even in the cloud. The fact that public cloud infrastructure provides a strict model and operating contract is evidence of that. Standards also translate to the run-time environment and the definition of the "stack." See "Standard Operating Environment" in Sun's DC reference guide.
THE NETWORK IS CRITICAL -- AND NOW THE STORAGE NETWORK!
For me and some others -- the network has always been the computer. In dynamic data center environments its role cannot be taken for granted; it's critical to operations. A solid network infrastructure and architecture that supports both static and dynamic services, such as Sun's Service Delivery Network Architecture, is one way to get there.
MODULAR DEPLOYMENT AND INFRASTRUCTURE SCALING -- OR PODS.
Constraints are a given -- and they play a critical role in defining the deployment architecture. The deployment platform is now inseparable from the hardware itself as we move towards highly virtualized infrastructure. Being able to define the characteristics of the platform is critical, and leads to a well-defined scaling and capacity model that can be versioned and changed over time.
Better yet, how can this deployment architecture "snap in" to your data center as you need to grow? In-rack networking? Pre-racked/pre-cabled? Etc.
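To make the pod idea above concrete, here is a minimal sketch of a versioned pod definition with a derived capacity model. All names and numbers here are hypothetical illustrations, not Sun's actual Veriscale model:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PodSpec:
    """A versioned, repeatable unit of deployment (a 'pod')."""
    version: str
    racks: int
    servers_per_rack: int
    vms_per_server: int  # assumes a virtualized platform

    def vm_capacity(self) -> int:
        # Capacity falls out of the pod definition itself.
        return self.racks * self.servers_per_rack * self.vms_per_server

# Capacity planning becomes arithmetic over pod counts:
pod_v1 = PodSpec(version="1.0", racks=2, servers_per_rack=20, vms_per_server=8)

def pods_needed(spec: PodSpec, vms_required: int) -> int:
    # Ceiling division: you grow by snapping in whole pods.
    return -(-vms_required // spec.vm_capacity())
```

Because the pod spec is just data, it can be versioned and changed over time -- a `PodSpec("2.0", ...)` with denser servers changes the capacity model without changing the planning logic.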
Sun is including the DI concepts as part of its refresh and introduction of the Veriscale architecture platform (follow on Twitter). Look for more soon -- but you can bet these principles are still important!
I'll be presenting at TSW 09 -- Here's the abstract...
* An understanding of the impact of cloud-like infrastructures
* How we must change our support models
* How we design better products in a cloud-like world
Why You Need to Attend This Session:
In this economically challenging time, technology services providers must make decisions on how to best invest for the future while maintaining service levels today. Supporting services that run on the cloud will be different from supporting those of the past. This session will help you balance these investments going forward.
This session will discuss how dynamic infrastructure, cloud computing, and the increasing use of virtualization and automation impact support, monitoring, operations, and the general serviceability of cloud-like infrastructure. As data centers continue to grow larger and become more complicated, what techniques can and should be used to see the important events and issues that impact service levels? Is that server going down "architecturally significant," or should it fail in place? How does this impact organizational issues and operations? What are some of the patterns being used to support large-scale dynamic systems, and how can you implement them in your data center or support organization? How do we develop better products and services to help us see the "forest for the trees"?
One thing is for sure -- the cloud computing model has affected IT. Some of that change is still occurring, some of it is maturing, and other areas seem stuck -- and may never be solved. But I consider the three factors below to be influencers beyond the cloud buzz and hype:
1) the effect on time to market: users of IT won't wait 3 weeks to get a VM or hardware installed, much less their "stack" configured
2) packaging: speaking of stacks -- virtual machine images have won out as the preferred packaging element, for good or bad
3) control of resources: this will continue to be a struggle between organizational units within an enterprise, but the developer is gaining more control -- IT administration may be able to wiggle some back if it deals with #1 and #2 above
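Factors 1-3 can be sketched together as a toy, in-memory model of a self-service, API-driven provisioning flow. The `CloudAPI` class and its method names are invented for illustration and do not represent any real provider's interface:

```python
import uuid

class CloudAPI:
    """Toy in-memory model of a self-service provisioning API
    (illustrative only -- not any real provider's interface)."""
    def __init__(self):
        self.instances = {}

    def launch(self, image_id: str) -> str:
        # Factor 2: the VM image *is* the packaging -- the whole
        # "stack" comes up preconfigured, no multi-week install queue.
        instance_id = str(uuid.uuid4())
        self.instances[instance_id] = {"image": image_id, "state": "running"}
        return instance_id

    def terminate(self, instance_id: str) -> None:
        self.instances[instance_id]["state"] = "terminated"

# Factors 1 and 3: the developer drives the API directly --
# seconds of spin-up, no ticket to IT administration.
api = CloudAPI()
vm = api.launch("my-app-stack-v1")
```

The point is the shape of the interaction, not the implementation: once provisioning is an API call against a prepackaged image, both the 3-week wait and the control struggle shift.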
So where are we on the cloud and the data center beyond these key elements? The cloud is changing, and will continue to change, how we view data centers and the services deployed within them. A new model is emerging in which concerns around business and IT risk sit at the forefront rather than after the fact, now that "provisioning" and time to market are largely solved by the adoption of cloud-like models. As IT users continue to adopt these models (API-driven, self-provisioning, VMIs, etc.), the data center as we know it will start to take a more distributed form.
Many large-scale enterprises are concerned but have much of this under control in their own data centers or via hosting providers. Services around these core elements may grow and shrink, but the core systems remain pretty stable -- think OSS/BSS (BILLING!) for a telecom. You don't mess with billing.
But what about application distribution and provisioning for my platforms -- phones, home media devices, video, etc.? I'll be able to bill you -- we covered that in risk profile 3. But how do I scale, offer new, distinctive services, and do so quickly and globally?
-- Managed risk profile
-- High level of compliance/reporting
-- Less refactoring because of risk
-- Tried and true -- appliances, vertically scaled, etc.
-- "Platinum" support
Or at least I'll have some of my services on the cloud -- either privately within my own DCs or, increasingly, in someone else's. My core data platforms will be in my own data centers, under my compliance rules and corporate governance laws, but I might have some flexibility (and needs to scale beyond the core) that pushes me toward different models of service delivery. Maybe I'm running out of space/cooling -- the data center TTL (time to live). Take a look at NCompass Inc.'s health checkup -- what's your DC's TTL?
You will require the flexibility to host applications and their data across data centers, providers, and continents. Layer 2 starts to become more dynamic. Apps are updated, scaled, etc. dynamically. The risk profile is lower since much of the critical data is held back in layer 3.
While layer 3 may be risk-averse, built on solid solutions, layer 2 and above may have more flexibility. Since global reach and scale are among its attributes, it may make more sense here to look at open source (license-friendly) solutions that can be more agile and are certainly more cost-effective.
-- Global scale/reliability factor
-- Dynamic -- the ability to change overall footprints dynamically
-- Replicated & repeatable -- "virtual appliances"
-- Replicas/copies of data (COHERENCY)
-- "Bronze" support contracts, more open source
Layer 1 is a content delivery network -- think Akamai -- but I wonder how this changes if you have many more data centers, etc., in the mix at layer 2. Functionally, though, it is a caching layer for content that needs fast access and global availability.
The client layer is increasingly important -- GSLB and DNS as a strategy are limited at best. Better are P2P technologies that may help with service discovery and quality of service. We are starting to see this tier changing in terms of IDEs and other "management"-related products -- loose coupling of resources, discover/re-discover versus hard-coded IPs, etc. What else needs to happen here? Thoughts?
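One sketch of the loose-coupling idea -- discover/re-discover on each connection instead of hard-coding an IP -- is a toy registry like the one below. Real systems would use DNS SRV records, a directory service, or a P2P overlay; the names here are hypothetical:

```python
import random

class Registry:
    """Toy service registry -- a stand-in for DNS SRV, a directory
    service, or P2P discovery (illustration only)."""
    def __init__(self):
        self.services = {}  # service name -> set of endpoints

    def register(self, name, endpoint):
        self.services.setdefault(name, set()).add(endpoint)

    def deregister(self, name, endpoint):
        self.services.get(name, set()).discard(endpoint)

    def discover(self, name):
        live = self.services.get(name, set())
        if not live:
            raise LookupError(f"no live endpoints for {name}")
        return random.choice(sorted(live))

# The client re-discovers on every connect, so endpoints can come
# and go underneath it -- no hard-coded IPs anywhere:
reg = Registry()
reg.register("billing", "10.0.0.5:8080")
reg.register("billing", "10.0.0.6:8080")
reg.deregister("billing", "10.0.0.5:8080")  # instance retired
endpoint = reg.discover("billing")
```

The design choice that matters is that clients hold a *name*, not an address; the binding from name to address is re-resolved late, which is what makes the infrastructure underneath free to be dynamic.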
Thoughts from Jason Carolan -- Distinguished Engineer @ Sun and Global Systems Engineering Director - http://twitter.com/jtcarolan - http://archmatters.wordpress.com