Wednesday Aug 24, 2016

HA for Oracle VM Server for SPARC: How to Avoid Cold Switchover When a Live Migration Fails

Oracle VM Server for SPARC provides the ability to split a single physical system into multiple, independent virtual systems, called logical domains, with a subset of re-configurable resources such as memory, vCPU, and cryptographic units (MAU). The HA for Oracle VM Server for SPARC data service for Oracle Solaris Cluster provides a seamless mechanism for orderly startup, shutdown, fault monitoring, and automatic failover of the Oracle VM Server for SPARC logical domain service.

One of the salient features of Oracle VM Server for SPARC guest domains is live migration, which allows you to transfer a running guest domain between different physical machines while maintaining continuous service to clients. For example, an administrator could relocate a guest domain to another cluster node without disruption to clients, thereby freeing up the node for a planned outage.

Typically, Oracle Solaris Cluster resource types perform a switchover of a resource group from one node to another by stopping the application on the source node and then starting it on the target node. However, for a resource group containing a resource of type SUNW.ldom, if the Migration_type extension property is set to MIGRATE, a typical switchover is not performed. Instead, the data service attempts to live-migrate the guest domain to the target node.
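For example, on an existing resource named ldom-rs (the example resource name used later in this post), live migration behavior might be enabled as follows; this is only a sketch, and the guest domain must also meet the other live migration requirements described in the data service guide:
$ /usr/cluster/bin/clresource set -p Migration_type=MIGRATE ldom-rs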

Figure 1 - Uninterrupted clients during a successful live migration
In order to live-migrate a guest domain managed by a resource of type SUNW.ldom, assume the root role or a role that provides solaris.cluster.modify and solaris.cluster.admin authorizations, and execute the following on any one node:
$ /usr/cluster/bin/clresourcegroup switch -n target_node ldom-rg

However, due to the dynamic nature of guest domains and possible changes to the system configuration, there can be cases where live migration of a guest domain does not complete successfully. For example, the target node resources, such as vCPU or memory, might not meet the minimum requirements for the guest domain to start on the target node. Alternatively, a change in the authentication mechanism might prevent a guest domain from starting on the target node.
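Before initiating a switchover, it can therefore help to check the unallocated resources on the target node's control domain. As a simple sketch, assuming the root role or a role that has been assigned the LDoms Management profile on the target node, the following lists the resources that are currently free:
$ /usr/sbin/ldm list-devices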

Currently, the HA for Oracle VM Server for SPARC data service reacts to a failed live migration by reverting to a cold switchover between the nodes of the resource group. In such a scenario, live migration is canceled, followed by a stop of the guest domain on the source node and then a start of the guest domain on the target node. However, that behavior is disruptive to the clients.


Figure 2 - Clients are interrupted during a failed live migration.

Before performing live migration, ensure that the migration will succeed by performing a dry-run migration of the guest domain. Do this before executing a switchover of a resource group containing a resource of type SUNW.ldom for which Migration_type=MIGRATE is set. To perform a dry-run migration, assume the root role or a role that has been assigned the LDoms Management profile and execute the following command:
$ /usr/sbin/ldm migrate-domain -n -c domain-name target_host
$ echo $?
0

Starting with Oracle Solaris Cluster 4.3 SRU 4, which introduces version 10 of the SUNW.ldom resource type, a resource of that type and version can abort the switchover of the resource group if live migration fails, leaving the guest domain running on the source node. During a planned outage, this prevents any interruption to clients. To enable this feature, set the new extension property Abort_migration to TRUE. When it is set, a switchover of the resource group succeeds only if live migration succeeds; otherwise, the ongoing resource group switchover is interrupted and the guest domain is kept in the active state on the source node. During this period, the resource and resource group states change transiently before returning to the Online state. Throughout this transient state change, the guest domain remains active and continues to provide services to clients.
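For example, while the switchover is in progress you can observe the transient state changes and confirm that the guest domain remains active on the source node; ldom-rg and domain-name are the example names used elsewhere in this post:
$ /usr/cluster/bin/clresourcegroup status ldom-rg
$ /usr/sbin/ldm list domain-name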



To migrate an existing resource of type SUNW.ldom to resource type version 10 and set the Abort_migration extension property, assume the root role or a role that provides solaris.cluster.modify and solaris.cluster.admin authorizations and execute the following on any one node:
$ /opt/SUNWscxvm/util/rt_upgrade ldom-rs
$ /usr/cluster/bin/clresource set -p Abort_migration=TRUE ldom-rs
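To confirm the new resource type version and the Abort_migration setting, a quick check such as the following can be used (output is omitted here):
$ /usr/cluster/bin/clresource show -p Type_version ldom-rs
$ /usr/cluster/bin/clresource show -p Abort_migration ldom-rs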


For more information, refer to the following resources:

Oracle Solaris Cluster Data Service for Oracle VM Server for SPARC Guide
http://docs.oracle.com/cd/E56676_01/html/E56924/index.html

SUNW.ldom(5) Man Page
https://docs.oracle.com/cd/E56676_01/html/E56746/sunw.ldom-5.html

Oracle Solaris Cluster 4.3 Reference Manual
https://docs.oracle.com/cd/E56676_01/html/E56745/clrg-1cl.html

SSL Live Migration for HA for Oracle VM Server for SPARC
https://blogs.oracle.com/SC/entry/ha_ldom_live_migration_using

Configuring a Data Service for Oracle VM Server for SPARC by Using a Graphical Wizard
https://blogs.oracle.com/SC/entry/configuring_data_service_for_oracle

Tapan Avasthi
Oracle Solaris Cluster Engineering

Tuesday Jul 26, 2016

SSL Live Migration for HA for Oracle VM Server for SPARC

As detailed in this article, the HA for Oracle VM Server for SPARC data service for Oracle Solaris Cluster can be used to support enhanced availability of Oracle VM Server for SPARC. This high availability (HA) agent can control and manage a guest domain as a "black box." It can fail over the guest domain in case of failure, but it can also use the domain migration procedures to perform a managed switchover.

Up to this point, using the HA for Oracle VM Server for SPARC service to orchestrate the guest domain migration required providing the Oracle Solaris Cluster software with administrative credentials for the control domains. Starting with Oracle Solaris Cluster 4.3 SRU 3, a resource of type SUNW.ldom version 8 or later can live-migrate a guest domain by using SSL (Secure Sockets Layer) certificates that have been set up to establish a trust relationship between the different Oracle VM Server for SPARC control domains, thereby enhancing the security of the system.

To enable live migration of a guest domain by using SSL certificate based authentication, you must first configure the SSL certificates by referring to the version-specific Administration Guide for Oracle VM Server for SPARC.

Resource type SUNW.ldom version 8 introduces the extension property Use_SSL_Certificate, which can be tuned to enable or disable SSL certificate-based authentication for live migration. By default, Use_SSL_Certificate=FALSE is set, which disables SSL certificate-based authentication. Use_SSL_Certificate=TRUE can be set at any time to enable it.
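For example, to check the current value on an existing resource (ldom-rs is the example resource name used below), you might run:
$ /usr/cluster/bin/clresource show -p Use_SSL_Certificate ldom-rs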

In order to migrate an existing resource of type SUNW.ldom to resource type version 8, assume the root role or a role that provides solaris.cluster.modify and solaris.cluster.admin authorizations, and execute the following on any one node:
$ /usr/cluster/bin/clresource set -p TYPE_VERSION=8 ldom-rs

The steps below briefly describe how to set up the SSL certificates for a guest domain that is managed by Oracle Solaris Cluster Data Service for Oracle VM Server for SPARC 3.3 on a three-node cluster (node1, node2, node3), and leverage the configured SSL certificates for live migration of the guest domain.

Perform the following on each node where a guest domain could be managed by a resource of type SUNW.ldom version 8 or later. Setting up SSL certificates for a guest domain requires root privilege.

1. Create the /var/share/ldomsmanager/trust directory, if it does not already exist.
root@node1:~# /usr/bin/mkdir -p /var/share/ldomsmanager/trust
root@node2:~# /usr/bin/mkdir -p /var/share/ldomsmanager/trust
root@node3:~# /usr/bin/mkdir -p /var/share/ldomsmanager/trust


2. Securely copy the remote ldmd certificate to the local ldmd trusted certificate directory.

root@node1:~# /usr/bin/scp \
root@node2.example.com:/var/share/ldomsmanager/server.crt \
/var/share/ldomsmanager/trust/node2.pem

root@node1:~# /usr/bin/scp \
root@node3.example.com:/var/share/ldomsmanager/server.crt \
/var/share/ldomsmanager/trust/node3.pem


root@node2:~# /usr/bin/scp \
root@node1.example.com:/var/share/ldomsmanager/server.crt \
/var/share/ldomsmanager/trust/node1.pem

root@node2:~# /usr/bin/scp \
root@node3.example.com:/var/share/ldomsmanager/server.crt \
/var/share/ldomsmanager/trust/node3.pem


root@node3:~# /usr/bin/scp \
root@node1.example.com:/var/share/ldomsmanager/server.crt \
/var/share/ldomsmanager/trust/node1.pem

root@node3:~# /usr/bin/scp \
root@node2.example.com:/var/share/ldomsmanager/server.crt \
/var/share/ldomsmanager/trust/node2.pem



3. Create a symbolic link from the certificate in the ldmd trusted certificate directory to the /etc/certs/CA/ directory.
root@node1:~# /usr/bin/ln -s /var/share/ldomsmanager/trust/node2.pem \
/etc/certs/CA/

root@node1:~# /usr/bin/ln -s /var/share/ldomsmanager/trust/node3.pem \
/etc/certs/CA/


root@node2:~# /usr/bin/ln -s /var/share/ldomsmanager/trust/node1.pem \
/etc/certs/CA/

root@node2:~# /usr/bin/ln -s /var/share/ldomsmanager/trust/node3.pem \
/etc/certs/CA/


root@node3:~# /usr/bin/ln -s /var/share/ldomsmanager/trust/node1.pem \
/etc/certs/CA/

root@node3:~# /usr/bin/ln -s /var/share/ldomsmanager/trust/node2.pem \
/etc/certs/CA/


4. Restart the svc:/system/ca-certificates service.
root@node1:~# /usr/sbin/svcadm restart svc:/system/ca-certificates
root@node2:~# /usr/sbin/svcadm restart svc:/system/ca-certificates
root@node3:~# /usr/sbin/svcadm restart svc:/system/ca-certificates

5. Verify that the configuration is operational.
root@node1:~# /usr/bin/openssl verify /etc/certs/CA/node2.pem
/etc/certs/CA/node2.pem: OK
root@node1:~# /usr/bin/openssl verify /etc/certs/CA/node3.pem
/etc/certs/CA/node3.pem: OK
root@node2:~# /usr/bin/openssl verify /etc/certs/CA/node1.pem
/etc/certs/CA/node1.pem: OK
root@node2:~# /usr/bin/openssl verify /etc/certs/CA/node3.pem
/etc/certs/CA/node3.pem: OK
root@node3:~# /usr/bin/openssl verify /etc/certs/CA/node1.pem
/etc/certs/CA/node1.pem: OK
root@node3:~# /usr/bin/openssl verify /etc/certs/CA/node2.pem
/etc/certs/CA/node2.pem: OK

6. Restart the ldmd daemon.
root@node1:~# /usr/sbin/svcadm restart svc:/ldoms/ldmd:default
root@node2:~# /usr/sbin/svcadm restart svc:/ldoms/ldmd:default
root@node3:~# /usr/sbin/svcadm restart svc:/ldoms/ldmd:default


7. If the guest domain is already managed by a resource of type SUNW.ldom version 8, set the Use_SSL_Certificate extension property to TRUE.
Assume the root role or a role that provides solaris.cluster.modify and solaris.cluster.admin authorizations, and execute the following on any one node:
$ /usr/cluster/bin/clresource set -p Use_SSL_Certificate=TRUE ldom-rs

For a new resource of type SUNW.ldom version 8 or later, ensure that live migration of the guest domain, using SSL certificate authentication, is successful before setting Use_SSL_Certificate=TRUE and enabling the resource.

If you are configuring the SSL certificates for a guest domain that is already running, verify that a dry-run live migration, using SSL certificate authentication, is successful before setting Use_SSL_Certificate=TRUE.

Assume the root role or a role that has been assigned the "LDoms Management" profile and execute the following command:
$ /usr/sbin/ldm migrate-domain -n -c domain-name target_host
$ echo $?
0


For more information, refer to the following resources:

Oracle Solaris Cluster Data Service for Oracle VM Server for SPARC Guide
http://docs.oracle.com/cd/E56676_01/html/E56924/index.html

SUNW.ldom(5) Man Page
https://docs.oracle.com/cd/E56676_01/html/E56746/sunw.ldom-5.html

Oracle VM Server for SPARC 3.4 Administration Guide

http://docs.oracle.com/cd/E69554_01/html/E69557/index.html

Oracle VM Server for SPARC 3.3 Administration Guide
https://docs.oracle.com/cd/E62357_01/html/E62358/index.html

Oracle VM Server for SPARC 3.2 Administration Guide
https://docs.oracle.com/cd/E48724_01/html/E48732/index.html

Secure administration of Oracle VM Server for SPARC
https://blogs.oracle.com/jsavit/entry/secure_administration_of_oracle_vm


Tapan Avasthi
Oracle Solaris Cluster Engineering

Tuesday Jul 19, 2016

Oracle E-Business 12.2 support on Oracle Solaris Cluster

We are very pleased to announce that Oracle Solaris Cluster 4.3 SRU 3 on Oracle Solaris 11 introduces support for Oracle E-Business Suite 12.2.

In particular, starting with Oracle E-Business Suite 12.2.4, Oracle Applications DBA (AD) and Oracle E-Business Suite Technology Stack (TXK) Delta 6, Oracle E-Business Suite 12.2 can now be managed by Oracle Solaris Cluster 4.3 SRU 3.

One advantage of deploying Oracle E-Business Suite 12.2 with Oracle Solaris Cluster is the ability to install the Primary Application Tier and the associated WebLogic Administration Server on a logical host.

This means that if the physical node hosting the Primary Application Tier and the associated WebLogic Administration Server fails, Oracle Solaris Cluster will fail over the Primary Application Tier to another node.

Oracle Solaris Cluster will detect a node failure within seconds and will automatically fail over the logical host to another node, where the Primary Application Tier services are automatically started again. Typically, the WebLogic Administration Server is available again within 2-3 minutes after a node failure. Without Oracle Solaris Cluster providing high availability for Oracle E-Business Suite 12.2, you would need to recover the Primary Application Tier and WebLogic Administration Server from a reliable backup.

Please note that if the physical node hosting the Primary Application Tier and the associated WebLogic Administration Server fails, patching would not be possible until the WebLogic Administration Server is available again.

The following diagram shows a typical Oracle E-Business 12.2 deployment on Oracle Solaris Cluster 4.3 with Oracle Solaris 11, using Oracle Solaris Zone Clusters.


This deployment example is based on the information available in the “Deployment Option with Single Web Entry Point and Multiple Managed Servers” section of the My Oracle Support (MOS) note, Using Load-Balancers with Oracle E-Business Suite Release 12.2 (Doc ID 1375686.1), with the modification that the Admin Server and its Node Manager are running on the Web Entry Point server.

It is important to note that the Primary Application Tier and the associated WebLogic Administration Server have been installed on the logical host primary-lh. This enables Oracle Solaris Cluster, in the event of a node failure, to fail over the Primary Application Tier services to another node, where those services are automatically started again.
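As a sketch of how the logical host is made highly available, a logical hostname resource for primary-lh might be created along the following lines; the resource group name ebs-rg and the resource name primary-lh-rs are illustrative assumptions, and the full configuration is described in the data service guide:
$ /usr/cluster/bin/clreslogicalhostname create -g ebs-rg -h primary-lh primary-lh-rs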

For more information, please refer to the following:

How to Deploy Oracle RAC on an Exclusive-IP Oracle Solaris Zones Cluster

http://www.oracle.com/technetwork/articles/servers-storage-admin/rac-excl-ip-zone-cluster-2341657.html#About

Oracle Solaris Cluster Data Service for Oracle E-Business Suite as of Release 12.2 Guide

http://docs.oracle.com/cd/E56676_01/html/E60641/index.html

Neil Garthwaite
Oracle Solaris Cluster Engineering

Wednesday Aug 13, 2014

Support for Kernel Zones with the Oracle Solaris Cluster 4.2 Data Service for Oracle Solaris Zones

The Oracle Solaris Cluster Data Service for Oracle Solaris Zones is enhanced to support Oracle Solaris Kernel Zones (also called solaris-kz branded zones) with Oracle Solaris 11.2.

This data service provides high availability for Oracle Solaris Zones through three components in a failover or multi-master configuration:

  • sczbt: The orderly booting, shutdown and fault monitoring of an Oracle Solaris zone.
  • sczsh: The orderly startup, shutdown and fault monitoring of an application within the Oracle Solaris zone (managed by sczbt), using scripts or commands.
  • sczsmf: The orderly startup, shutdown and fault monitoring of an Oracle Solaris Service Management Facility (SMF) service within the Oracle Solaris zone managed by sczbt.

With Oracle Solaris Cluster 4.0 and 4.1, the sczbt component supports cold migration (boot and shutdown) of solaris and solaris10 branded zones.

The sczbt component now additionally supports cold and warm migration for kernel zones on Oracle Solaris 11.2. Warm migration (suspend and resume of a kernel zone) allows you to minimize planned downtime when a cluster node is overloaded or needs to be shut down for maintenance.

By deploying kernel zones under the control of the sczbt data service, system administrators can provide a highly available virtual machine with the added flexibility of consolidating multiple zones with separate kernel patch levels on a single system, while at the same time consolidating multiple workloads onto a server very efficiently.

The three data service components are now implemented as their own dedicated resource types as follows:

  • sczbt - ORCL.ha-zone_sczbt
  • sczsh - ORCL.ha-zone_sczsh
  • sczsmf - ORCL.ha-zone_sczsmf
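After the resource types have been registered on the cluster (for example, by the component register scripts), they can be confirmed as follows:
$ /usr/cluster/bin/clresourcetype list | grep ha-zone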

Resource configuration is still done by amending the component configuration file and providing it to the component register script. There is no longer a need to configure or maintain a parameter file for the sczbt or sczsh component.
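As a rough sketch of that flow, assuming a kernel zone named kz1 and configuration variable names carried over from earlier sczbt releases (RS, RG, Zonename, Zonebrand, and a migration type setting), registration might look like the following; consult the data service guide for the authoritative variable names and script location:
root@node1:~# cp /opt/SUNWsczone/sczbt/util/sczbt_config /var/tmp/sczbt_config.kz1
root@node1:~# vi /var/tmp/sczbt_config.kz1   (set RS=kz1-rs, RG=kz1-rg, Zonename="kz1", Zonebrand="solaris-kz", Migrationtype="warm")
root@node1:~# /opt/SUNWsczone/sczbt/util/sczbt_register -f /var/tmp/sczbt_config.kz1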

If existing deployments of the data service components are upgraded to Oracle Solaris Cluster 4.2, there is no requirement to re-register the resources. The previous SUNW.gds based component resources continue to run unchanged.

For more details, please refer to the Oracle Solaris Cluster Data Service for Oracle Solaris Zones Guide.

Thorsten Früauf
Oracle Solaris Cluster Engineering

Thursday Aug 07, 2014

Oracle's Siebel 8.2.2 support on Oracle Solaris Cluster software

Swathi Devulapalli

The Oracle Solaris Cluster 4.1 data service for Siebel on SPARC now supports Oracle's Siebel 8.2.2 version. It is now possible to configure the Siebel Gateway Server and Siebel Server components of Siebel 8.2.2 software for failover/high availability.

What it is

Siebel is a widely used CRM solution that delivers a combination of transactional, analytical, and engagement features to manage customer-facing operations. Siebel 8.2.2 is the first Siebel version that is certified and released for Oracle Solaris 11 software on the SPARC platform.

The Oracle Solaris Cluster 4.1 SRU 3 software on Oracle Solaris 11 provides a high availability (HA) data service for Siebel 8.2.2. The Oracle Solaris Cluster data service for Siebel provides fault monitoring and automatic failover of the Siebel application. The data service makes two essential components of the Siebel application highly available: the Siebel Gateway Server and the Siebel Server. A resource of type SUNW.sblgtwy monitors the Siebel Gateway Server and a resource of type SUNW.sblsrvr monitors the Siebel Server. Oracle Solaris Cluster 4.1 SRU 3 on Oracle Solaris 11 extends the support of this data service to Siebel 8.2.2.

With the support of Siebel 8.2.2 on Oracle Solaris 11, the features of Oracle Solaris 11 and Oracle Solaris Cluster 4.1 are available in the Siebel 8.2.2 HA agent. The HA solution for the Siebel stack can be configured on a complete Oracle product stack, with an Oracle Solaris Cluster HA solution available in each tier of the stack: for example, the Oracle Solaris Cluster HA Oracle agent in the database tier, the Oracle Solaris Cluster HA Siebel agent in the application tier, and the Oracle Solaris Cluster HA Oracle iPlanet Web Server agent in the web tier.

What’s new?

1. A new extension property Siebel_version has been introduced. This property indicates the Siebel server version number, for example, 8.2.2.

The following examples illustrate the use of the Siebel_version property when creating the Siebel Gateway Server and Siebel Server resources.

Creation of Siebel Gateway Server resource:
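A minimal sketch of such a command, assuming a resource group named siebel-rg and a resource named siebel-gtwy-rs; additional extension properties, such as the Siebel installation paths, would normally also be required, as described in the data service guide:
$ /usr/cluster/bin/clresource create -g siebel-rg -t SUNW.sblgtwy \
-p Siebel_version=8.2.2 siebel-gtwy-rs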


Creation of Siebel Server resource:
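Similarly, a sketch for the Siebel Server resource, assuming the same resource group and a resource named siebel-srvr-rs:
$ /usr/cluster/bin/clresource create -g siebel-rg -t SUNW.sblsrvr \
-p Siebel_version=8.2.2 siebel-srvr-rs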


2. Encryption facility for the HA-Siebel configuration files

The HA Siebel solution uses the database user/password and the Siebel user/password to execute the start, stop, and monitor methods. These passwords are stored in the scsblconfig and scgtwyconfig files located under the Siebel Server installation directory and the Siebel Gateway Server installation directory, respectively. The new HA-Siebel data service provides an option to encrypt these files; the agent decrypts them before use.

The Oracle Solaris Cluster administrator encrypts the configuration files following the steps provided in the HA-Siebel document. The HA-Siebel agent decrypts these files and uses the entries while executing the start, stop and monitor methods.

For detailed information on the configuration of encrypted files, refer to Configuring the Oracle Solaris Cluster Data Service for Siebel (Article 1509776.1) posted on My Oracle Support at http://support.oracle.com. You must have an Oracle support contract to access the site. Alternatively, you can also refer to the Oracle Solaris Cluster Data Service for Siebel Guide.
