Zones? Clusters? Clustering zones? Zoneclusters?
By Karoly Vegh-Oracle on Feb 24, 2012
Everyone values zones, Solaris' built-in OS virtualization. They have a near-zero footprint. Their administration can be delegated. They have their own boot environments. They are easily cloned with ZFS snapshots, etc. They also integrate cleanly with Solaris Cluster in different ways - this post should shed some light on the different options, and provide an example of zoneclusters.
In this post I will:
I. Explain the two ways to create HA services with zones on Solaris Cluster
II. Show you a quick walkthrough of setting up zoneclusters
III. Envision a large platform with LDOMs, zoneclusters, central monitoring.
I. Part one, the two ways to integrate zones with Solaris Cluster
Solaris Cluster has two ways to integrate Zones in a HA environment:
- Failover Zones: In this case we talk about a single zone that is monitored by the cluster. When the service is switched between nodes, the zone itself is halted, detached from the global zone, moved over to another physical clusternode, attached to the Solaris instance in the global zone there, and booted up.
- This is a fairly simple and straightforward setup: you simply put your zone - containing the application and the data - on shared storage, and application startup and shutdown are handled by the zone's startup and shutdown.
- On the other hand, the moving zone itself is a SPOF (single point of failure). If it has any issues booting, the service is down. If you do a rolling upgrade of your cluster - upgrade node A, then fail over and attach the zone from node B - the service in the zone is down while the upgrade-on-attach procedure is running, and should that procedure fail for some reason, your fallback is endangered, since the zone upgrade has already started. Any rollback and fallback mechanism to the original global zone extends your maintenance window and hence your downtime.
- Zoneclusters: The easiest way to define zoneclusters is to compare them to the zones they are built on: a global cluster is to a zonecluster what a global zone is to a non-global zone. A global cluster is the cluster you always have installed in the global zone, with global zones as clusternodes. A zonecluster is a zone-level cluster and has zones as clusternodes. Every zonecluster runs within a global cluster, and you can have several independent zoneclusters on one global cluster. Zoneclusters have their own cluster stack: within them you can create resources and resource groups, add or remove nodes, evacuate all nodes, shut down the cluster, etc. - all without affecting the other zoneclusters or the global cluster. See? Just like zones.
- Having two (or more) static (non-moving) zones as clusternodes has its advantages. Your application fails over from zone to zone, removing the complexity of upgrade-on-attach and eliminating a single zone as a SPOF. Your maintenance windows are easier to plan: during an update your service keeps running, and - should anything go wrong - you can fail your service back to a zonenode that hasn't been altered yet.
- Of course, complexity doesn't simply disappear, it just changes form: you now have several clusters to manage. The advantage is that you can delegate the administration of the dozens of zoneclusters to your customers, just like you can delegate the administration of zones to zoneadmins. This isn't just a cloud solution anymore, but a cloud without a SPOF, with self-service and built-in HA.
II. Part two, presenting and setting up a zonecluster
I will assume that you already have an installed, initialized, running global cluster. I actually considered including the global cluster setup procedure here, but that really is beyond the scope of this post - should a cluster-setup post help you get started with clustering, let us know in the comments and we can create a separate one. For this demo I run Solaris 11 in VirtualBox instances; the whole magic runs on my laptop - that is, it could run on yours too.
So, you have your global cluster running, and want to start with zoneclusters. What you will need is the clzonecluster command to create one. My cluster looks like this:
kvegh@sc4A:~$ clinfo && clnode list
sc4B
sc4A
kvegh@sc4A:~$ clresourcegroup list
kvegh@sc4A:~$
kvegh@sc4A:~$ zoneadm list -cv
  ID NAME     STATUS     PATH         BRAND    IP
   0 global   running    /            solaris  shared
   1 zc2      running    /zones/zc2   solaris  shared
   2 zc1      running    /zones/zc1   solaris  shared
kvegh@sc4A:~$
...that is, I have a two-node global cluster consisting of the nodes sc4A and sc4B (Solaris Cluster 4, node A and B), with no resourcegroups in the global cluster, but with two non-global zones. On node B I have a very similar zone list:
kvegh@sc4B:~$ zoneadm list -cv
  ID NAME     STATUS     PATH         BRAND    IP
   0 global   running    /            solaris  shared
   1 zc1      running    /zones/zc1   solaris  shared
   2 zc2      running    /zones/zc2   solaris  shared
kvegh@sc4B:~$
As you probably already assume, zc1 and zc2 are two zoneclusters, each having a zone (that is, a node) on both nodes of the global cluster. And you're right:
kvegh@sc4A:~$ clzonecluster list
zc1
zc2
kvegh@sc4A:~$ clzonecluster status

=== Zone Clusters ===

--- Zone Cluster Status ---

Name   Node Name   Zone HostName   Status   Zone Status
----   ---------   -------------   ------   -----------
zc1    sc4B        sc4B-zc1        Online   Running
       sc4A        sc4A-zc1        Online   Running
zc2    sc4A        sc4A-zc2        Online   Running
       sc4B        sc4B-zc2        Online   Running
Now, I told you that I have no clusterresourcegroups defined in the global cluster, but I do have one configured in the zonecluster "zc2":
kvegh@sc4A:~$ sudo zlogin zc2
[Connected to zone 'zc2' pts/2]
Oracle Corporation      SunOS 5.11      11.0    December 2011
You have mail.
root@sc4A-zc2:~# zonename
zc2
root@sc4A-zc2:~# clzonecluster list
zc2
root@sc4A-zc2:~# clrg status

=== Cluster Resource Groups ===

Group Name   Node Name   Suspended   Status
----------   ---------   ---------   ------
apache_rg    sc4A-zc2    No          Online
             sc4B-zc2    No          Offline
To sum it up: on the global two-node cluster (sc4A and sc4B) there are two zoneclusters defined, zc1 and zc2. Within the zonecluster zc2 - in its zone node that runs in the global zone of sc4A - there is an apache resourcegroup online that can readily be switched with the usual cluster commands to the other node of the zonecluster, running on the other node of the global cluster.
In the other configured zonecluster, zc1, there are no resources configured at all, and its administrator could perform all the usual cluster actions, like "clnode evacuate" or "cluster shutdown", without disturbing zc2.
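Switching that apache resourcegroup works with the same commands you would use on a global cluster, just run from within a zone node of zc2. A minimal sketch, run as root inside sc4A-zc2 (the resourcegroup and node names match the demo above):

```shell
# move apache_rg to the other zonecluster node and check the result
clrg switch -n sc4B-zc2 apache_rg
clrg status apache_rg

# meanwhile, the zc1 administrator can act completely independently, e.g.
# (from within a zc1 zone node, without touching zc2):
#   clnode evacuate sc4A-zc1
```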
Now, let's configure and install a zonecluster. This is very similar to configuring and installing zones, the only difference being that the cluster does the zone setup for you, on all nodes:
kvegh@sc4A:~$ pfexec clzc configure newZC
newZC: No such zone cluster configured
Use 'create' to begin configuring a new zone cluster.
clzc:newZC> create
clzc:newZC> add node
clzc:newZC:node> set physical-host=sc4A
clzc:newZC:node> set hostname=sc4A-newZC
clzc:newZC:node> add net
clzc:newZC:node:net> set physical=sc_ipmp0
clzc:newZC:node:net> set address=192.168.56.163/24
clzc:newZC:node:net> end
clzc:newZC:node> end
clzc:newZC> add node
clzc:newZC:node> set physical-host=sc4B
clzc:newZC:node> set hostname=sc4B-newZC
clzc:newZC:node> add net
clzc:newZC:node:net> set physical=sc_ipmp0
clzc:newZC:node:net> set address=192.168.56.183/24
clzc:newZC:node:net> end
clzc:newZC:node> end
clzc:newZC> set zonepath=/zones/newZC
clzc:newZC> verify
clzc:newZC> exit
kvegh@sc4A:~$
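If your application needs shared data, you can also hand storage to the zonecluster in the same configure session, for example by delegating a ZFS dataset to it. A sketch of the extra steps - the pool/dataset name "tank/appdata" is made up for illustration, not part of this demo:

```shell
clzc:newZC> add dataset
clzc:newZC:dataset> set name=tank/appdata
clzc:newZC:dataset> end
```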
Having configured the zonecluster, I could now start the installation of its zone-nodes, but oops! For some reason I have no access to an IPS repo. That isn't so great, since in S11 zones are no longer created from the global zone's packages, but installed directly from the pkg repository. There is an alternative though: if you already have a zonecluster installed, you can clone the new one from it via ZFS clones, just like you can clone zones:
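For completeness: with a reachable IPS repository, installing the configured zonecluster on all of its nodes would be a single command - this is the standard path I couldn't take in this demo:

```shell
# installs the zone on every configured node of the zonecluster
pfexec clzc install newZC
```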
kvegh@sc4A:~$ clzc halt zc1
Waiting for zone halt commands to complete on all the nodes of the zone cluster "zc1"...
kvegh@sc4A:~$ clzc status

=== Zone Clusters ===

--- Zone Cluster Status ---

Name    Node Name   Zone HostName   Status    Zone Status
----    ---------   -------------   ------    -----------
zc1     sc4B        sc4B-zc1        Offline   Installed
        sc4A        sc4A-zc1        Offline   Installed
zc2     sc4A        sc4A-zc2        Online    Running
        sc4B        sc4B-zc2        Online    Running
newZC   sc4A        sc4A-newZC      Offline   Configured
        sc4B        sc4B-newZC      Offline   Configured

kvegh@sc4A:~$ /usr/cluster/bin/clzc clone -Z newZC -v zc1
Waiting for zone clone commands to complete on all the nodes of the zone cluster "newZC"...
kvegh@sc4A:~$
kvegh@sc4A:~$ clzc status newZC

=== Zone Clusters ===

--- Zone Cluster Status ---

Name    Node Name   Zone HostName   Status    Zone Status
----    ---------   -------------   ------    -----------
newZC   sc4A        sc4A-newZC      Offline   Installed
        sc4B        sc4B-newZC      Offline   Installed

kvegh@sc4A:~$
Of course the zones have been installed too, named after the zonecluster:
kvegh@sc4A:~$ zoneadm list -cv
  ID NAME     STATUS      PATH           BRAND    IP
   0 global   running     /              solaris  shared
   1 zc2      running     /zones/zc2     solaris  shared
   2 zc1      running     /zones/zc1     solaris  shared
   3 newZC    installed   /zones/newZC   solaris  shared
kvegh@sc4A:~$

At this point, after starting up the newly created zonecluster newZC, you will have to configure the OS in those zones: either with "zlogin -C newZC" on both nodes, or by feeding the zones a prepared system profile for automatic self-configuration, just like at any zone installation. After that you can zlogin to your zonecluster nodes and create your HA services, manage cluster resources and resourcegroups, switch them between the zonenodes, create affinities, restart the whole zonecluster - just like with any global cluster you are used to, and all without affecting the other zoneclusters.
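Once the zones are configured, building an HA service inside the new zonecluster looks just like on a global cluster. A sketch of booting newZC and creating an apache resourcegroup similar to the one in zc2 - the logical hostname "apache-lh", the resource names and the Bin_dir path are assumptions for illustration, not taken from this demo:

```shell
# from the global cluster: boot the zonecluster on all of its nodes
pfexec clzc boot newZC

# then, inside a newZC zone node, as root:
clrt register SUNW.apache                  # register the apache resource type
clrg create apache_rg                      # create the resourcegroup
clreslogicalhostname create -g apache_rg -h apache-lh apache-lh-rs
clrs create -g apache_rg -t SUNW.apache \
    -p Bin_dir=/usr/apache2/2.2/bin apache-rs
clrg online -eM apache_rg                  # manage, enable and bring online
```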
III. Part three, the vision of a great platform
Now lean back, and imagine a platform:
- This platform shall consist of 4+1 T4 servers using LDOMs, each with two I/O service domains that cooperate for multipathing, and two guest domains: one running Solaris 10 with zoneclusters and one running Solaris 11 with zoneclusters.
- On the 4 boxes you create a 4-node S10 global cluster in the S10 LDOMs and a 4-node S11 global cluster in the S11 LDOMs, deploying your applications to the platform of their choice.
- You create 2-node zoneclusters within your 4-node global clusters. One 2-node ZC is for scalable applications (Webserver, DB RAC, loadbalanced Application servers, etc.), and another 2-node ZC is for failover applications (running in active-standby mode).
- Having assigned a role to 4 of the 5 servers, you decide that the 5th node should be a general HW-standby node, to fail services over to in case one of the 4 active servers experiences HW issues.
- Now, to manage this platform you can use Enterprise Manager Ops Center to deploy/patch the OS, manage virtualization, get an overview of your clusters, and monitor utilization.
- Leverage the T4 capabilities: encrypt everything in HW - your ZFS, your database data - and offload your SSL encryption. Use Solaris Cluster to live-migrate your LDOMs, if you please.
With all this SPARC and Solaris 11 goodness (encryption, ZFS, virtual networking, resource management, bootenvironments), combined with zoneclusters you can build true cloud platforms, with builtin HA, self-service, application separation and high server utilization. Not to mention the sky-high coolness factor of a platform like this :)
Documentation I have used:
Oracle Solaris Cluster Software Installation Guide
The Solaris Cluster 4 Documentation Collection
- In Solaris 10 with Solaris Cluster 3.3 there was a third option to integrate zones with the cluster, but in SC4 this isn't supported anymore, hence I did not mention it.
- Solaris Cluster runs on both SPARC and x86.
- Solaris Cluster 3.x supports Solaris 10; Solaris Cluster 4.0 runs on Solaris 11.
- You can run zoneclusters with S10 and SC3.x too.