Zones? Clusters? Clustering zones? Zoneclusters?

Everyone values zones, Solaris' built-in OS virtualization: they have a near-zero footprint, their administration can be delegated, they have their own boot environments, and they are easily cloned with ZFS snapshots. They also integrate cleanly with Solaris Cluster in different ways - this post should shed some light on the different options and provide an example of zoneclusters. 

In this post I will: 
I. Explain the two ways to create HA services with zones on Solaris Cluster 
II. Show you a quick walkthrough of setting up zoneclusters
III. Envision a large platform with LDOMs, zoneclusters, central monitoring. 


I. Part one, the two ways to integrate zones with Solaris Cluster

Solaris Cluster has two ways to integrate zones into an HA environment:  

  • Failover Zones: Here a single zone is monitored by the cluster. When the service is switched between nodes, the zone itself is halted, detached from the global zone, moved over to another physical cluster node, attached to the Solaris instance in the global zone there, and booted up. 
    • This is a fairly simple and straightforward setup: you put your zone on shared storage containing both the application and its data, and application startup and shutdown are handled by the zone's startup and shutdown.  
    • On the other hand, the moving zone itself is a SPOF (single point of failure): if it has any issue booting, the service is down. If you do a rolling upgrade of your cluster - upgrade node A, fail over and attach the zone from node B - the service in the zone is down while the upgrade-on-attach procedure runs, and should it not work out for some reason, your fallback is endangered, since the zone upgrade has already started. Any rollback and fallback mechanism to the original global zone extends your maintenance window and hence your downtime. 

  • Zoneclusters: The easiest way to define zoneclusters is to compare them to the zones they are built on: a global cluster is to a zonecluster what a global zone is to a non-global zone. A global cluster is the cluster you always have installed in the global zone, with global zones as cluster nodes. A zonecluster is a zone-level cluster and has zones as its cluster nodes. Every zonecluster runs in a global cluster, and you can have several independent zoneclusters on one global cluster. Zoneclusters have their own cluster stack: within them you can create resources and resource groups, add or remove nodes, evacuate all nodes, shut down the cluster, etc. - all without affecting the other zoneclusters or the global cluster. See? Just like zones. 
    • Having two (or more) static (non-moving) zones as cluster nodes has its advantages: application failover happens from zone to zone, removing the complexity of upgrade-on-attach and eliminating a single zone as a SPOF. Your maintenance windows are easier to plan, your service keeps running during an update, and - should anything go wrong - you can fall your service back to a zone-node that hasn't been altered yet. 
    • Of course, complexity doesn't simply disappear, it just changes form: you now have several clusters to manage. The advantage is that you can delegate the administration of the dozens of zoneclusters to your customers, just like you can delegate the administration of zones to zone administrators. This isn't just a cloud solution anymore, but a cloud without a SPOF, with self-service and built-in HA. 
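To make the failover-zone mechanics above concrete: the move is essentially a halt/detach/attach sequence on shared storage, which the cluster's HA-zones agent automates. A sketch, with a hypothetical zone name:

```shell
# On the node currently hosting the zone (node A), a failover amounts to:
zoneadm -z myzone halt          # stop the zone and the services in it
zoneadm -z myzone detach        # release it from node A's zone index
# ...the cluster switches the shared storage holding the zonepath to node B...

# On node B:
zoneadm -z myzone attach -u     # attach; -u triggers update-on-attach
zoneadm -z myzone boot          # the service is down until this completes
```

The service outage spans the whole sequence, which is exactly the rolling-upgrade risk described above: once `attach -u` has started modifying the zone, falling back to the original node is no longer trivial.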


II. Part two, presenting and setting up a zonecluster

I will assume that you already have an installed, initialized, running global cluster. I did consider including the global cluster setup procedure here, but that really is beyond the scope of this post - should a cluster-setup post help you get started with clustering, let us know in the comments and we can create a separate post. For this demo I run Solaris 11 in VirtualBox instances; the whole magic runs on my laptop - that is, it could run on yours too. 

So, you have your global cluster running, and want to start with zoneclusters. What you will need is the clzonecluster command to create one. My cluster looks like this: 

kvegh@sc4A:~$ clinfo && clnode list 
sc4B
sc4A
kvegh@sc4A:~$ clresourcegroup list 
kvegh@sc4A:~$ 
kvegh@sc4A:~$ zoneadm list -cv 
  ID NAME             STATUS     PATH                      BRAND    IP   
   0 global           running    /                         solaris  shared
   1 zc2              running    /zones/zc2                solaris  shared
   2 zc1              running    /zones/zc1                solaris  shared
kvegh@sc4A:~$ 

...that is, I have a two-node global cluster consisting of the nodes sc4A and sc4B (Solaris Cluster 4, node A and B), no resource groups in the global cluster, but two non-global zones. On node B I have a very similar zone list: 

kvegh@sc4B:~$ zoneadm list -cv 
  ID NAME             STATUS     PATH                      BRAND    IP    
   0 global           running    /                         solaris  shared
   1 zc1              running    /zones/zc1                solaris  shared
   2 zc2              running    /zones/zc2                solaris  shared
kvegh@sc4B:~$ 

As you probably already assume, zc1 and zc2 are two zoneclusters, each having a zone (that is, a node) on both nodes of the global cluster. And you're right: 

kvegh@sc4A:~$ clzonecluster list
zc1
zc2
kvegh@sc4A:~$ clzonecluster status 

=== Zone Clusters ===

--- Zone Cluster Status ---

Name    Node Name   Zone HostName    Status    Zone Status
----    ---------   -------------    ------    -----------
zc1     sc4B        sc4B-zc1         Online    Running
        sc4A        sc4A-zc1         Online    Running

zc2     sc4A        sc4A-zc2         Online    Running
        sc4B        sc4B-zc2         Online    Running

kvegh@sc4A:~$ 

Now, I told you that I have no clusterresourcegroups defined in the global cluster, but I do have one configured in the zonecluster "zc2": 

kvegh@sc4A:~$ sudo zlogin zc2 
[Connected to zone 'zc2' pts/2]
Oracle Corporation      SunOS 5.11      11.0    December 2011
You have mail.
root@sc4A-zc2:~# zonename
zc2
root@sc4A-zc2:~# clzonecluster list
zc2
root@sc4A-zc2:~# clrg status 

=== Cluster Resource Groups ===

Group Name    Node Name    Suspended   Status
----------    ---------    ---------   ------
apache_rg     sc4A-zc2     No          Online
              sc4B-zc2     No          Offline

root@sc4A-zc2:~# 

To sum it up: on the global two-node cluster (sc4A and sc4B) there are two zoneclusters defined, zc1 and zc2. In the global zone sc4A, within the zone zc2 - which is a node of the zonecluster zc2 - an Apache resource group is running that can readily be switched with the usual cluster commands to the other node of the zonecluster, running on the other node of the global cluster. 

In the other configured zonecluster, zc1, there are no resources configured at all, and its administrator could perform all the usual cluster actions, like "clnode evacuate" or "cluster shutdown", without disturbing zc2. 
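Both operations described above use the ordinary cluster commands from within a zone-node. A sketch, using the names from the listings:

```shell
# Inside a zone-node of zonecluster zc2: switch the Apache resource
# group to the other zonecluster node, then verify:
clrg switch -n sc4B-zc2 apache_rg
clrg status apache_rg

# Inside a zone-node of zonecluster zc1: evacuate this node's
# resource groups - zc2 and the global cluster are unaffected:
clnode evacuate sc4A-zc1
```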

Now, let's configure and install a zonecluster. This is very similar to configuring and installing zones, with the only difference that setting up the zones is done by the cluster: 

kvegh@sc4A:~$ pfexec clzc configure newZC
newZC: No such zone cluster configured
Use 'create' to begin configuring a new zone cluster.
clzc:newZC> create
clzc:newZC> add node
clzc:newZC:node> set physical-host=sc4A
clzc:newZC:node> set hostname=sc4A-newZC
clzc:newZC:node> add net
clzc:newZC:node:net> set physical=sc_ipmp0
clzc:newZC:node:net> set address=192.168.56.163/24
clzc:newZC:node:net> end
clzc:newZC:node> end
clzc:newZC> add node
clzc:newZC:node> set physical-host=sc4B
clzc:newZC:node> set hostname=sc4B-newZC
clzc:newZC:node> add net
clzc:newZC:node:net> set physical=sc_ipmp0
clzc:newZC:node:net> set address=192.168.56.183/24
clzc:newZC:node:net> end
clzc:newZC:node> end
clzc:newZC> set zonepath=/zones/newZC
clzc:newZC> verify
clzc:newZC> exit
kvegh@sc4A:~$
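The same configuration can also be scripted non-interactively with a command file, analogous to "zonecfg -f" (I'm assuming clzonecluster's -f option here; check the clzonecluster(1CL) man page for your release):

```shell
# newZC.cmd - a command file replaying the interactive session above
cat > /tmp/newZC.cmd <<'EOF'
create
set zonepath=/zones/newZC
add node
set physical-host=sc4A
set hostname=sc4A-newZC
add net
set physical=sc_ipmp0
set address=192.168.56.163/24
end
end
add node
set physical-host=sc4B
set hostname=sc4B-newZC
add net
set physical=sc_ipmp0
set address=192.168.56.183/24
end
end
commit
EOF

pfexec clzc configure -f /tmp/newZC.cmd newZC
```

This is handy when you stamp out many similar zoneclusters and want the configuration under version control.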

Having configured the zonecluster, I could now start the installation of its zone-nodes - but oops! For some reason I have no access to an IPS repository. That isn't so great, since in Solaris 11 zones are no longer created from the global zone's packages but installed directly from the pkg repository. There is an alternative: if you already have an installed zonecluster, you can clone the new one from it via ZFS clones, just like you can clone zones: 

kvegh@sc4A:~$ clzc halt zc1 
Waiting for zone halt commands to complete on all the nodes of the zone cluster "zc1"...
kvegh@sc4A:~$ clzc status 
=== Zone Clusters ===
--- Zone Cluster Status ---

Name     Node Name    Zone HostName   Status    Zone Status
----     ---------    -------------   ------    -----------
zc1      sc4B         sc4B-zc1        Offline   Installed
         sc4A         sc4A-zc1        Offline   Installed

zc2      sc4A         sc4A-zc2        Online    Running
         sc4B         sc4B-zc2        Online    Running

newZC    sc4A         sc4A-newZC      Offline   Configured
         sc4B         sc4B-newZC      Offline   Configured

kvegh@sc4A:~$ /usr/cluster/bin/clzc clone -Z newZC -v zc1 
Waiting for zone clone commands to complete on all the nodes of the zone cluster "newZC"...
kvegh@sc4A:~$
kvegh@sc4A:~$ clzc status newZC
=== Zone Clusters ===
--- Zone Cluster Status ---

Name     Node Name    Zone HostName   Status    Zone Status
----     ---------    -------------   ------    -----------
newZC    sc4A         sc4A-newZC      Offline   Installed
         sc4B         sc4B-newZC      Offline   Installed

kvegh@sc4A:~$ 

Of course the zones have been installed too, named after the zonecluster: 

kvegh@sc4A:~$ zoneadm list -cv 
  ID NAME             STATUS     PATH                      BRAND    IP   
   0 global           running    /                         solaris  shared
   1 zc2              running    /zones/zc2                solaris  shared
   2 zc1              running    /zones/zc1                solaris  shared
   3 newZC            installed  /zones/newZC              solaris  shared
kvegh@sc4A:~$ 

At this point, after starting up the newly created zonecluster newZC, you will have to configure the OS in those zones, either with "zlogin -C newZC" on both nodes or with a prepared system profile file that you feed to the zones for automatic self-configuration, just like at any zone installation. After that you can zlogin to your zonecluster nodes and create your HA services, manage cluster resources and resource groups, switch them between the zone-nodes, create affinities, restart the whole zonecluster - just like with any global cluster you are used to, and without affecting the other zoneclusters. 
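For the hands-off variant, a system profile can be generated once and reused. A sketch, assuming the Solaris 11 sysconfig tool; the install-time profile option is taken from the OSC documentation and should be verified for your release:

```shell
# Generate a reusable system configuration profile interactively, once:
sysconfig create-profile -o /tmp/sc_profile.xml

# Boot the zonecluster; without a profile, each zone-node waits for
# manual configuration on its console:
pfexec clzc boot newZC
zlogin -C newZC            # run on each global-cluster node

# Alternatively, feed the profile at install time so the zone-nodes
# configure themselves:
# pfexec clzc install -c /tmp/sc_profile.xml newZC
```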


III. Part three, the vision of a great platform

Now lean back, and imagine a platform: 

  • This platform shall consist of 4+1 T4 servers using LDOMs, each with two I/O service domains cooperating via multipathing, and two guest domains: one running Solaris 10 with zoneclusters and one running Solaris 11 with zoneclusters. 
  • On the 4 boxes you create a 4-node S10 global cluster in the S10 LDOMs and a 4-node S11 global cluster in the S11 LDOMs, deploying your applications to the platform of their choice.  
  • You create 2-node zoneclusters within your 4-node global clusters: one 2-node ZC for scalable applications (web servers, DB RAC, load-balanced application servers, etc.) and another 2-node ZC for failover applications (running in active-standby mode). 
  • Having assigned a role to 4 of the 5 servers, you decide that the 5th node should be a general HW standby node, to fail services over to in case one of the other 4 active servers experiences HW issues. 
  • To manage this platform, you can use Enterprise Manager Ops Center to deploy and patch the OS, manage virtualization, get an overview of your clusters, and monitor utilization. 
  • Leverage the T4 capabilities: encrypt everything in HW - your ZFS, your database data - and offload your SSL encryption. Use Solaris Cluster to live-migrate your LDOMs, if you please. 


With all this SPARC and Solaris 11 goodness (encryption, ZFS, virtual networking, resource management, boot environments) combined with zoneclusters, you can build true cloud platforms with built-in HA, self-service, application separation and high server utilization. Not to mention the sky-high coolness factor of a platform like this :) 

Documentation I have used:  
Oracle Solaris Cluster Software Installation Guide
The Solaris Cluster 4 Documentation Collection

Footnotes:
- In Solaris 10 with Solaris Cluster 3.3 there was a third option to integrate zones with the cluster, but it is no longer supported in SC4, hence I did not cover it. 
- Solaris Cluster runs on both SPARC and x86.
- Solaris Cluster 3.x supports Solaris 10; Solaris Cluster 4.0 runs on Solaris 11.
- You can run zoneclusters with S10 and SC3.x too. 

Comments:

Hello! Thank you for this wonderful article on the topic of Zone Clusters. You mentioned that if there was interest in your "global cluster setup procedure" to indicate that desire here. As I am working on an implementation of Oracle RAC in a Solaris10 Update10 Cluster 3.3 5/11 Zone Cluster, your process and procedures would be very helpful!
Thanks!!

Posted by guest on March 26, 2012 at 11:50 PM CEST #

Hi,

thanks for the kind words.

As for the global cluster setup, I was actually referring to installing the clustersoftware and initializing the clusterinstance on Solaris 11 and Solaris Cluster 4, but you have mentioned Solaris Cluster 3.3 on S10.

Is SC4 on S11 of interest too?

wbr,

charlie

Posted by charlie on March 27, 2012 at 07:31 AM CEST #

Thanks for your very informative article.
But I have one doubt, you were saying in footnote that

"In Solaris 10 with Solaris Cluster 3.3 there was a third option to integrate zones with the cluster, but in SC4 this isn't supported anymore, hence I did not mention it."

Will you please tell us what was that third option and what caused it to be removed from Solaris Cluster 4 and Solaris 11?

And one more doubt: if zones are already installed and running, how can we use these already running zones in a zone cluster, instead of creating new zones with the clzc command?

regards
Solfan

Posted by guest on May 31, 2012 at 11:29 AM CEST #

Hi Solfan,

In Solaris Cluster 3.x the third option is to create non-moving zones manually on all the cluster nodes and define those as nodes for the cluster resource group in this form: host1:zone1, host2:zone2, etc. With this you can switch your cluster resource groups from one zone on one host *into* another zone on another host. We call this the zonenodes method.

The difference from zoneclusters is that when you create a zonecluster, the cluster itself creates/installs the zones on the target hosts with the brand "cluster" and the same zonename, centrally managing their configuration from the CCR, and those zoneclusters represent *separate* zone-based clusters with most of the features a global cluster has.

The zonenodes method has been dropped because the two methods were very comparable and overlapping; maintaining both only brought complexity when both fulfill the same goals. Zoneclusters are the more sophisticated solution, hence they were picked as the implementation of choice to run (and switch!) resources between zones.
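For reference, the dropped zonenodes approach looked roughly like this in SC 3.x (host, zone and resource group names are hypothetical):

```shell
# SC 3.3 'zonenodes': a resource group whose nodelist entries are
# host:zone pairs, so a service switches *into* static zones:
clresourcegroup create -n host1:zoneA,host2:zoneB my_rg

# Switch the service from the zone on host1 into the zone on host2:
clresourcegroup switch -n host2:zoneB my_rg
```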

wbr,

charlie

Posted by Karoly Vegh on July 05, 2012 at 12:07 PM CEST #

Very informative and useful article. I am working with a customer who has a number of older clustered SPARC servers (about 100) in two data centers and I have been promoting the idea of moving them to a new clustered enviroment as you described in Part III. Was wondering if you could suggest how long we should spend in planning, implementation and testing as well as what would be the best approach in training.

Cheers ... Graham

Posted by guest on July 26, 2012 at 04:43 PM CEST #

Hi Graham,

That is somewhat scarce information to answer the question responsibly, let me try to give some general guidelines:

- First things first: Training. I hope I'm wrong, but I assume that the customer is not yet running productively with Solaris 11. For building a new platform I highly recommend going for the Solaris 11 release, it is a great update, has been around for a while, brings lots of innovation, especially for consolidation projects. There are several trainings on S11 to attend: http://education.oracle.com/pls/web_prod-plq-dad/db_pages.getpage?page_id=402&p_nl=ORSL&p_org_id=46777&lang=US

- Running Solaris 11 means running Solaris Cluster 4.

- Running T4 servers (https://blogs.oracle.com/orasysat/entry/the_sparc_t4_servers_have) enables the customer with LDoms, of course, and great HW-speed encryption features too. Not sure if the customer already runs LDoms, again, consider trainings, if not yet.

Now, I can't estimate how much testing you need, but I, for one would setup a smaller version of the target platform in the test/development environment, and test all the features that are going to be needed (will your applications be able to benefit from HW-encryption? Will you need to use LDom LiveMigration? Do ZoneClusters fit your needs? Do you still need Solaris 10 in separate LDoms, or do you need Solaris 10 branded zones running on top of Solaris 11? ( https://blogs.oracle.com/orasysat/entry/shall_i_use_zones_or ) Do you need lots of LDoms with virtualized I/O interfaces or will you want to go with direct I/O? Will you use policy-based dynamic resource reconfiguration between LDoms? Will you use the LDoms as software license boundaries? What level of availability do you need, do you need separate ServiceDomains within a box? Are you running the cluster to move the LDoms or do you run your clusters within the LDoms? Will you use Ops Center as a management tool? Will you be migrating (physical-to-virtual migration, virtual-to-virtual migrations) existing instances to run on the new HW without modification? ...and so forth, and so forth.)

I very much recommend involving ACS (Advanced Customer Services) from the vendor side for the planning of the platform, those are experienced engineers and can lead the project well.

Otherwise the direction is great, the T4s are great consolidation servers exactly for the OVM for SPARC (LDoms) features and their excellent single thread performance and dynamical threading - not to mention the ease of older system migration to them.

HTH,

charlie

Posted by Karoly Vegh on July 29, 2012 at 08:10 PM CEST #

Hi,
What is correct procedure to patch Recommended and cluster patch configured with zone cluster running Solaris 10 with SC3.X?

Posted by guest on September 03, 2012 at 04:12 PM CEST #

Hi,

Thanks for your kind reply, I agree, Zoneclusters is the good way to manage zone resources/zones. But what about already up and running Zones? Any ways to add these already running zones to 'zonecluster'? What I meant is that, we can create new Zonecluster 'NZC', but instead of installing new zones, can we integrate these already running zones to "NZC". Is there any means to change our already running zones to become 'cluster' branded zone, because there is no Zonenodes method is available now.

One more doubt, in our example it is seen that

root@sc4A-zc2:~# clzonecluster list
zc2

here sc4A-zc2 is also a member of Zonecluster zc1, but why it is not showing zc1, in output it only shows zc2,

Thanks and regards
Solfan

Posted by guest on September 11, 2012 at 01:16 PM CEST #

Hi,

Would you tell me how can we configure Zonecluster with the Zones already running. I don't want to go with the Zone creation process all over again and I want to use the existing zone itself into a zonecluster.

Kindly suggest.

Posted by guest on October 25, 2012 at 10:21 AM CEST #

"What is correct procedure to patch Recommended and cluster patch configured with zone cluster running Solaris 10 with SC3.X?"

From the top of my head:

- Always follow the documentation
- I'd recommend using Live Upgrade
- definitely read the patch cluster READMEs

the docs can be found here:

http://docs.oracle.com/cd/E19680-01/html/821-1256/gcssh.html#scrolltoc

HTH,

charlie

Posted by Karoly Vegh on October 25, 2012 at 07:30 PM CEST #

"Would you tell me how can we configure Zonecluster with the Zones already running. I don't want to go with the Zone creation process all over again and I want to use the existing zone itself into a zonecluster."

I am afraid that is not a supported scenario; both the Oracle Solaris Cluster 4.0 and 3.3 documentation state:

"[...]
Conversion to a zone-cluster node – You cannot add to a zone cluster a non-global zone that resides outside that zone cluster. You must use only the clzonecluster command to add new nodes to a zone cluster.
[...]"

See:

http://docs.oracle.com/cd/E23623_01/html/E23437/babccjcd.html#ghbmi
and
http://docs.oracle.com/cd/E18728_01/html/821-2845/babccjcd.html#ghbmi

wbr,

charlie

Posted by Karoly Vegh on October 25, 2012 at 08:19 PM CEST #

"Is there any means to change our already running zones to become 'cluster' branded zone, because there is no Zonenodes method is available now."

Hi Solfan,

The answer is unfortunately, no, see my other reply. Also, in S11 and OSC4 the zonecluster zones aren't anymore branded "cluster".

In an internal discussion we came to the conclusion that it would be a complex procedure to check/control all the possible zone configurations against the allowed zonecluster zone configurations and application installations (is the data on shared storage? Are the binaries? Are they shared or dual-installed? Can the single zone be cloned? Are the services started by SMF or rc scripts? etc.)
The new installation guarantees cluster consistency and a clean (and repeatable) installation without a bumpy ride.

I do understand though that this means additional effort on the system administrator's side, however probably not much more than we would now assume for a conversion.

HTH,

charlie

Posted by Karoly Vegh on October 25, 2012 at 08:28 PM CEST #

"...
root@sc4A-zc2:~# clzonecluster list
zc2

here sc4A-zc2 is also a member of Zonecluster zc1, but why it is not showing zc1, in output it only shows zc2
..."

Hi Solfan,

actually the setup is like this:

Physical node 1: sc4A
Physical node 2: sc4B
The zonename of zonecluster 1 is: zc1
The zonename of zonecluster 2 is: zc2
The hostname within the zone of zonecluster 1 on physical node 1: sc4A-zc1
The hostname within the zone of zonecluster 1 on physical node 2: sc4B-zc1
The hostname within the zone of zonecluster 2 on physical node 1: sc4A-zc2
The hostname within the zone of zonecluster 2 on physical node 2: sc4B-zc2

That is, if I login to sc4A-zc2, which is the hostname within the zone running on physical node 1 (sc4A), and there, within the zone, that is, within the zonecluster "zc2" I list all the zoneclusters, I will only see zc2 - because zc1 is a completely separate zonecluster with completely separate zones.

I guess the names aren't too explanatory, let me know if the explanation is still unclear.

Posted by Karoly Vegh on October 25, 2012 at 08:35 PM CEST #

Thanks for the quick update Karoly Vegh!!!

If Solaris Cluster do not support Zone Cluster from existing zone, I have to stick with ZoneNodes method only. I am going to migrate zones into a cluster setup and the zones from existing setup can be migrated by V2V process. The zpool & logical ip can be configured as a fail-over resources between zones.

Definitely Oracle should implement zonenodes method into Solaris Cluster 4 aswell as it helps to migrate Solaris 10 machine into a cluster with P2V process as a zone very easily.

Thanks,
Arul

Posted by A M on October 25, 2012 at 09:52 PM CEST #

Arul,

Now that Oracle Solaris Cluster 4.1 has been released, it is possible to run Solaris 10 branded zoneclusters on top of Solaris 11.1 and OSC 4.1.

By default, this still would require you to setup the zones manually, but there is an option ("-a", IIRC) where after having configured the zonecluster, at installation time you can attach an archive, that has been created from an existing zone.

This is not quite the same as what you want, but it could eventually help you.
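A rough sketch of that archive-based path (the option letters follow my recollection above, so treat this as an assumption to verify against the OSC 4.1 documentation; names and paths are hypothetical):

```shell
# Create an archive from an existing Solaris 10 zone, then configure a
# solaris10-branded zonecluster and install a node from that archive:
pfexec clzc configure s10zc     # set the solaris10 brand in the session
pfexec clzc install -a /export/archives/s10zone.flar -n sc4A s10zc
```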

Posted by Karoly Vegh on October 30, 2012 at 11:05 AM CET #

Thanks so much for your post. So very informative. I'm running SC3.x to failover LDOMS, but just needed a single branded zone. I installed this zone within an LDOM but I'm getting a P2V error when I try to failover the LDOM another node. Is this not a supported configuration? Are my only choices to run zoneclusters or zonenodes?

Thanks,
G

Posted by guest on April 24, 2013 at 05:00 PM CEST #

G:
About the failover issue with the LDOM: I'm afraid I don't quite get the question nor the setup. Have you opened up a SR?

Posted by Karoly Vegh on September 05, 2013 at 03:55 PM CEST #

About

This is the Technology Blog of the Oracle Hardware Presales Consultant Team in Vienna, Austria. We post about our technology fields: server- and storage hardware, operating system technologies, virtualization, clustering, datacenter management and performance tuning possibilities.
