The VeriScale Architecture for a Scalable, Elastic, Self-provisioning Datacenter
By blueprints on Jul 20, 2009
By Mikael Lofstrand
The modern datacenter is evolving into the cloud computing model, where networking, platform, storage, and software infrastructure are provisioned as services that can scale up or down on demand. This model allows the datacenter to be viewed as a collection of managed application services that are deployed automatically, while utilizing the underlying services. Providing sufficient elasticity and scalability for the rapidly evolving needs of the datacenter requires these collections of automatically-managed services to scale efficiently, and with essentially no limits. Sun calls this truly scalable approach a VeriScale architecture.
Beyond elasticity and scalability, the architecture must meet further requirements to be useful in practice:
- Providing a simple means to deploy components with ease so that the operational IT environment is efficient
- Supporting a rapid development cycle to service new requirements and deploy new services
- Avoiding reliance on centralized provisioning systems since they inherently have breaking points that limit scalability
- Consisting of self-contained components that include a fully-operational, self-sufficient software stack with applications, virtualization technologies, storage, and networking for modularity and ease of deployment
- Enabling services to be self-provisioned, while providing cost-efficient, scalable, and rapid deployments with standard technologies
This blog provides an overview of Sun’s VeriScale architecture and its defining principles, and demonstrates how it fulfils these requirements.
VeriScale functional components
Sun’s VeriScale architecture consists of a set of functional components, including:
- The point of delivery (POD) is part of Sun’s scalable dynamic infrastructure suite (DIS) and represents a self-contained network service stack that includes both software and infrastructure devices. Each POD provides a defined functional capacity, and PODs can vary in size. Each POD is delivered ready to be plugged into the network and to participate in the system.
- The base static platform consists of the interconnected nodes and storage devices that are physically wired and remain statically connected. These infrastructure components are virtualized in higher layers that can then be dynamically reconfigured as needed. The components that are used for this layer are defined per required cost, performance, availability, interfaces, etc. The base static platform can be implemented as a flat network where virtual LANs (VLANs) can exist anywhere on the platform or as a point of delivery.
- The Sun service delivery network (SDN) architecture provides a set of network connectivity, routing, load balancing, and security mechanisms that combine to form a flexible network infrastructure design framework. This architecture provides high performance, scalability, availability, security, flexibility, and manageability for datacenter infrastructure. The SDN methodology enables a common approach to designing network architectures, and provides a common set of tools that help ensure proper architectural design decisions and trade-offs.
- The target nodes are the actual compute hosts that run the applications.
- The services encapsulate the business functionality provisioned at the point of delivery. These can be infrastructure services or applications, and they include the networking logic they require.
The functional components are managed through OpenSolaris Dynamic Service Containers (DSC) and are automatically deployed with the networking logic encapsulated into the services components, leveraging the SDN architecture.
The VeriScale unit of management is a service — not individual functional components. Networking functions are embedded within services so that services are fully functional, self-contained, self-sufficient, and self-provisioned networked entities.
The traditional approach to automation is to have centralized provisioning systems that deploy individual services onto target systems. This approach limits scalability since regardless of the size of the centralized system, there is always a limit to its capacity and throughput, and it can break if there is a demand peak for its services that outstrips its capacity. In contrast, the VeriScale architecture assigns the deployment logic to the target systems, creating an environment where the targets are self-updating and pull components from the appropriate repositories. This strategy improves scalability since the provisioning workload is handled by each target system and not by a centralized resource. Contrast this approach with the very limited level of network automation in traditional architectures and the true power of VeriScale is immediately apparent.
OpenSolaris Dynamic Service Containers
OpenSolaris DSC is a distributed resource and service management system designed for simplicity and scalability, and is suited to lightweight applications with simple installation and configuration requirements. OpenSolaris DSC tracks what is going on in the network as a whole, maintains multiple hosts that run the OpenSolaris DSC services and software, and creates and destroys payloads. OpenSolaris DSC implements a pull model that uses resources on the target nodes for provisioning, avoiding the need for a centralized provisioning system with its associated resource bottlenecks. In this model, the resources available for provisioning grow as the number of target nodes increases. Performance limitations are then tied only to the load capacity of each individual target node, not to the number of nodes.
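The scaling argument for the pull model can be made concrete with a rough back-of-the-envelope calculation. The sketch below is illustrative only: the function names, the concurrency limit of the central server, and the per-install time are assumptions rather than figures from OpenSolaris DSC, and it assumes the repository can serve payload downloads fast enough not to become a bottleneck itself.

```python
import math


def central_provisioning_minutes(num_targets: int,
                                 server_concurrency: int,
                                 install_minutes: float) -> float:
    """Push model: one central server drives at most `server_concurrency`
    installs at a time, so targets are provisioned in batches."""
    batches = math.ceil(num_targets / server_concurrency)
    return batches * install_minutes


def pull_provisioning_minutes(num_targets: int,
                              install_minutes: float) -> float:
    """Pull model: every target installs itself using its own resources,
    so all installs run in parallel regardless of how many targets exist."""
    return install_minutes if num_targets > 0 else 0.0


# Hypothetical numbers: 1000 targets, 5 minutes per install,
# and a central server that can drive 20 installs concurrently.
print(central_provisioning_minutes(1000, 20, 5.0))
print(pull_provisioning_minutes(1000, 5.0))
```

Under these assumed numbers the central server needs 50 batches, while the pull model finishes in the time of a single install, because the provisioning work grows with the number of nodes available to do it.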
The OpenSolaris DSC consists of the following components:
- The registry is a repository for information that includes the different elements required for the deployment of services across the scalable architecture. In particular the registry includes the necessary information needed to download and install a payload. Other information includes data used to communicate with the service instances to configure them, the network and capacity information for specific hardware platforms, and other administrative and operational data.
- The repository is a simple, passive storage facility for the payloads.
- Node controllers locate the registry, pull their initial configuration data from it, and continuously query it for changes to the desired state of service definitions. In addition, node controllers analyze the suitability of a given POD to host a required workload. When a service is provisioned, a node controller offers to host a workload, downloads and installs the needed payload, starts the service, and updates the registry with its state.
- Payloads hold the information and data that the node controller needs to install and run a workload. Examples of payload contents include an install script, a binary, base configuration, and additional content. The payload can also include a load balancer and its initial configuration.
- Nodes are aggregates consisting of computing, storage, and communication devices. These aggregates have a pre-defined application capacity that simplifies the resource management model, and allows applications to be dynamically re-provisioned as their needs grow. Nodes are added to an already-provisioned application when its capacity is exhausted, or reassigned when they are no longer needed. Alternatively, the applications can be re-deployed to larger or smaller environments as needed.
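The interaction between these components can be sketched as a small reconciliation loop. This is a minimal sketch under stated assumptions, not the OpenSolaris DSC API: the `Registry` and `NodeController` classes, their fields, and the use of a plain dictionary as the repository are all hypothetical names chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Registry:
    """Hypothetical registry: desired state plus reported state."""
    desired: dict[str, int] = field(default_factory=dict)      # service -> wanted instances
    payload_key: dict[str, str] = field(default_factory=dict)  # service -> repository key
    hosts: dict[str, set[str]] = field(default_factory=dict)   # service -> hosting nodes


@dataclass
class NodeController:
    """Hypothetical node controller that pulls work instead of being pushed to."""
    name: str
    capacity: int                                   # workloads this node can host
    running: dict[str, bytes] = field(default_factory=dict)

    def poll(self, registry: Registry, repository: dict[str, bytes]) -> list[str]:
        """Compare desired and reported state, and host what is still missing."""
        started = []
        for service, wanted in registry.desired.items():
            hosts = registry.hosts.setdefault(service, set())
            if self.name in hosts or len(hosts) >= wanted:
                continue                            # nothing to do for this service
            if len(self.running) >= self.capacity:
                break                               # node is full, make no offer
            payload = repository[registry.payload_key[service]]  # pull the payload
            self.running[service] = payload         # stands in for install and start
            hosts.add(self.name)                    # report state back to the registry
            started.append(service)
        return started


registry = Registry(desired={"web": 2}, payload_key={"web": "web-1.0"})
repository = {"web-1.0": b"install script and content"}
for node in (NodeController("a", 1), NodeController("b", 1), NodeController("c", 1)):
    print(node.name, node.poll(registry, repository))
```

In this sketch the repository stays passive, as described above: it is only ever read from, and all decision logic runs on the nodes.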
Network optimization through distributing networking logic to the PODs
To achieve the flexibility needed for a dynamic, scalable, load-balanced environment, it is necessary to abstract the destination address by using indirection and resource pooling. Load-balanced networks are normally implemented with a central load balancer that redirects traffic to its destination, so that requestors only need to know about the load balancer. This approach results in sub-optimal routing, since the data is routed via the load balancer even if there is a more direct route to the destination. In addition, if the virtualized services are deployed on a specific hardware platform, that platform must be replicated to scale.
In contrast, with OpenSolaris DSC based load-balancing, when a client connects to a virtual service through a virtual address, the load-balancer re-directs the request to a server from a resource-pool available for the requested virtual service. Scaling is implemented by adding resources to the pool. The node controller queries the registry for the information it needs. This approach enables the automatic update of the configuration of the resource pool for the load balancer.
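The virtual-address indirection and the registry-driven pool update can be sketched as follows. The `VirtualService` class, the dictionary used as a registry, the addresses, and the round-robin policy are assumptions made for illustration; the source does not specify how OpenSolaris DSC selects a server from the pool.

```python
from __future__ import annotations


class VirtualService:
    """Hypothetical virtual service: clients see one virtual address,
    requests are spread over whatever servers are currently in the pool."""

    def __init__(self, virtual_address: str) -> None:
        self.virtual_address = virtual_address
        self._pool: list[str] = []
        self._next = 0

    def refresh(self, registry: dict[str, list[str]]) -> None:
        """Re-read the pool membership for this virtual address from the registry."""
        self._pool = list(registry.get(self.virtual_address, []))

    def route(self) -> str:
        """Pick the next server in round-robin order (assumed policy)."""
        if not self._pool:
            raise RuntimeError("no servers available for " + self.virtual_address)
        server = self._pool[self._next % len(self._pool)]
        self._next += 1
        return server


registry = {"10.0.0.100:80": ["web1", "web2"]}
service = VirtualService("10.0.0.100:80")
service.refresh(registry)
print([service.route() for _ in range(4)])

# Scaling up is just a registry change followed by a refresh.
registry["10.0.0.100:80"].append("web3")
service.refresh(registry)
print(sorted({service.route() for _ in range(3)}))
```

Clients never learn the names of individual servers, so the pool can grow or shrink without any change on the client side.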
In the VeriScale Architecture, load balancing can be further optimized if the networking logic is implemented locally in the service’s POD and treated as part of the application. The load-balancing networking logic in the POD scales automatically as payloads are added to scale the services in the POD. The networking logic is embedded in the payload for the applications, which enables the communication to the next destination. (Example: A web server communicates with a locally installed load-balancer, on the same server, which re-directs the traffic to a data store outside the server hardware.) This will provide elastic scale for the networking logic which follows the scaling of the applications. Additional optimization can be achieved if response times are measured, making it possible to optimize the network routing on the fly and determine the optimal path through the network to the target dynamically.
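One way to picture response-time-driven routing in a locally installed load balancer is shown below. It is a sketch under assumptions: the `LocalBalancer` class, the backend names, and the use of an exponentially weighted moving average are illustrative choices, not a description of how VeriScale measures or reacts to response times.

```python
from __future__ import annotations


class LocalBalancer:
    """Hypothetical load balancer co-located with the application.
    It prefers the backend with the lowest smoothed response time."""

    def __init__(self, backends: list[str], alpha: float = 0.5) -> None:
        self.alpha = alpha  # weight given to the newest measurement
        # None means "not measured yet"; such backends are tried first.
        self.latency: dict[str, float | None] = {b: None for b in backends}

    def pick(self) -> str:
        """Return an unmeasured backend if any, else the fastest known one."""
        def score(backend: str) -> float:
            value = self.latency[backend]
            return -1.0 if value is None else value
        return min(self.latency, key=score)

    def record(self, backend: str, seconds: float) -> None:
        """Fold a measured response time into the moving average."""
        old = self.latency[backend]
        if old is None:
            self.latency[backend] = seconds
        else:
            self.latency[backend] = (1 - self.alpha) * old + self.alpha * seconds


balancer = LocalBalancer(["store-a", "store-b"])
balancer.record("store-a", 0.200)
balancer.record("store-b", 0.050)
print(balancer.pick())             # store-b is currently faster
balancer.record("store-b", 0.950)  # store-b slows down
print(balancer.pick())             # traffic shifts to store-a
```

Because each server carries its own balancer instance, the number of balancers grows with the number of application instances rather than being sized separately.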
The use of software components that are co-located with applications (e.g., the distribution of the registry and its logic to the POD) helps reduce network traffic and, as a result, the number of physical devices on the network. At the same time, latency can be improved as there are fewer devices, fewer communication hops, and optimized communication paths.
Service delivery network (SDN) architecture and VeriScale automation
Sun’s service delivery network architecture is a domain-specific, modular, and flexible logical model and language for providing a service-oriented view of the network architecture. The SDN architecture provides architectural guidelines, while the actual architecture implemented is designed based upon specific requirements.
The SDN architecture consists of different service domains. Each service domain is a grouping of similar services, called service instances, that include the specific attributes needed for a service to be reachable over the network. For example, a Web server is a service instance in the service domain called Web serving. The Web-serving service domain provides services as a single entity, and consists of a subnet in a virtual LAN that includes multiple Web servers grouped for the purpose of load balancing.
Each service domain is addressable by clients that connect to the service instance and not to a specific server. The service domains are distinguished by their different characteristics: protocols, ports, health monitoring implementation, and security requirements. A separate management network is included in every instance of an SDN architecture. The management network enforces the security requirements of the service domains. A management domain can manage many service domains, while a service domain has only one management domain. At the same time, the modular nature of the SDN architecture enables the addition of security modules anywhere in the architecture.
In the VeriScale architecture, the SDN provides the glue between the applications and network, and is essential to achieving network automation. In this context, the VeriScale architecture defines service domains that are grouped into service modules, while a service module is a collection of service domains that have a specific purpose — normally an application composed of a collection of software components. These service modules are designed to be easily replicated and distributed, allowing the application to scale on demand.
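The relationships between service instances, service domains, management domains, and service modules can be written down as a small data model. The class and field names below are assumptions made for illustration and do not come from the SDN specification; the `replicate` helper is one hypothetical way to express that a service module is designed to be copied.

```python
from __future__ import annotations

from dataclasses import dataclass, field, replace


@dataclass
class ServiceDomain:
    """A group of similar service instances reachable as a single entity."""
    name: str
    protocol: str
    port: int
    management_domain: str           # exactly one management domain per service domain
    instances: list[str] = field(default_factory=list)


@dataclass
class ServiceModule:
    """A collection of service domains with one purpose, typically an application."""
    name: str
    domains: list[ServiceDomain] = field(default_factory=list)

    def replicate(self, suffix: str) -> ServiceModule:
        """Return a copy with the same domain definitions and no instances yet."""
        copies = [
            replace(d, name=f"{d.name}-{suffix}", instances=[])
            for d in self.domains
        ]
        return ServiceModule(f"{self.name}-{suffix}", copies)


web = ServiceDomain("web-serving", "http", 80, "mgmt-1", ["web1", "web2"])
shop = ServiceModule("shop", [web])
shop2 = shop.replicate("2")
print(shop2.name, [d.name for d in shop2.domains])
```

Keeping the management domain as a single field on each service domain mirrors the rule that one management domain may manage many service domains, while each service domain has only one.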
Two practical illustrations
The advantages of the VeriScale architecture can be illustrated by describing Web server deployment and load balancing in a traditional versus a VeriScale datacenter.
Deploying Web servers
In the traditional datacenter, when a Web server is deployed, a provisioning server connects to the target node, uploads the Web server application, performs the installation, and uploads content. Manual intervention or difficult-to-maintain semi-automatic scripts are commonly needed to configure the network appropriately for the added service. Clearly, if there is a sudden need to allocate several dozen or, for that matter, several thousand Web servers, the dependence of a traditional datacenter on centralized resources means that it would struggle to service such a request.
In the VeriScale architecture, allocating the POD to the Web serving service domain causes the POD to pull the appropriate Web server payload from the repository. For this purpose, the POD uses its own local registry, which is pre-configured and automatically maintained, to determine all the network information it needs. In this way, the POD can provision itself without centralized, potentially scarce resources and without manual intervention. Assuming the PODs are available and activated, servicing the request for thousands of new Web servers would be as simple as servicing the request for a single Web server.
Load balancing
In the traditional datacenter, dedicated load-balancing devices distribute requests from a client to a server in a pool of resources. The load balancers publish a virtual service with one unique, virtual IP address and all requests are directed to it, while each server in the resource pool handles the requests forwarded to it by the load balancers. Unfortunately, this stand-alone load-balancing and topology control functionality suffers from the bottleneck created by the load balancers, which limits the capacity of the network services and requires the periodic addition of dedicated load balancers.
In the VeriScale architecture, the load-balancing functionality is implemented in each POD and may be distributed onto every server. As a result, topology control and load distribution are decentralized and do not require dedicated devices, and their capacity does not need to be managed separately.
From the primary requirement of the VeriScale architecture, scalability, follows its primary characteristic: POD self-sufficiency. Put another way, each service is encapsulated within a payload with the full range of capabilities required. Once delivered to any suitably capable POD, these payloads, whether applications, application platforms, or entire virtual machines (VMs), can configure themselves and perform the function they were designed for with no support from central resources. These capabilities help enable the creation of elastic service domains that can rapidly scale up or down on demand, limited only by the availability of hardware resources.