By Dmitry Nefedkin on Dec 29, 2010
A common way to increase a cluster's capacity is to add new machines to host additional server instances. You can do this by manually installing the WebLogic binaries on the new host and using the pack/unpack commands to add a managed server there. But with Enterprise Manager Grid Control 11gR1 (EMGC) there is another way: the Fusion Middleware Domain Scale Up procedure. I'm going to show you how it works.
Here is a picture of my medrec_oradb WebLogic domain, which is registered in EMGC. It contains an admin server and a cluster, MedRecCluster, with a single managed server, MS1. Both the admin and managed servers are on the same host, oel46-vmware, which is a virtual machine with OEL 4.6 that runs inside our Oracle VM infrastructure.
And here are the application deployments. Note that a couple of applications are deployed to the cluster.
First of all, I have to prepare a new machine that will host the new managed server of my cluster. I created a new VM with OEL 5.4 using the corresponding Oracle VM template, available on the Oracle E-Delivery site for Oracle Linux and Oracle VM, and named it wls1032.
The next step is to install the Oracle EM Grid Control 11gR1 Agent on this new host. You can download it from the OTN page and install it manually, or you can use the Agent Installation deployment procedure available in EMGC (Deployments->Agent Installation->Install Agent). Either way, when your agent is up and running on the new machine, you will see it in the EMGC Console in the Targets->Hosts subtab.
Now we are ready to scale up our WebLogic domain. Click the Deployments tab in Oracle Enterprise Manager Grid Control, and then click Deployment Procedures. Select the Fusion Middleware Domain Scale Up procedure from the list, and click Schedule Deployment. The first page of the FMW Domain Scale Up Wizard is displayed and you can proceed with the deployment process.
Select the domain from the list, enter the working directory on the admin server host, and fill in the WebLogic credentials for the administration server console and the OS credentials for the admin server host. Click the Next button.
The next step allows you to configure your domain. To add a new managed server to the cluster, select the cluster in the tree and click the Add Server button. Select the newly added server in the tree, choose the target host, and enter the configuration details of your managed server. You can also add new machine and node manager details. Please note that you cannot change the values in the Domain Location and Fusion Middleware Home fields, so these locations on the target host will be the same as on the admin server host. The working directory on the target host should have enough free space to store the FMW home binaries and the domain configuration files. In my experience, the working directories should have at least 3 GB of free space. The last thing to fill in is the OS credentials for the target host.
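Because the working directory has to hold both the clone archive and the unpacked binaries, it can save a retry to check free space on each host before scheduling the procedure. The sketch below is only an illustration: it assumes a POSIX df, the function name has_free_space is hypothetical, and the 3 GB default is the rule of thumb mentioned here rather than a documented requirement.

```shell
# Succeeds if the filesystem holding directory $1 has at least $2 GB free.
# The default of 3 GB is a rule of thumb, not an official requirement.
has_free_space() {
    dir=$1
    min_gb=${2:-3}
    # df -P prints one POSIX-format line per filesystem;
    # with -k, column 4 is the available space in 1024-byte blocks.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ -n "$avail_kb" ] && [ "$avail_kb" -ge $((min_gb * 1024 * 1024)) ]
}
```

For example, has_free_space /u01/work || echo "not enough space", where /u01/work is a placeholder for whatever working directory you entered in the wizard.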
The next step allows you to schedule the execution of the procedure. In my example it is started immediately.
You can track the status of the procedure execution by selecting Deployments->Deployment Procedures->Procedure Completion Status in the EMGC Console.
As you can see in the picture below, the procedure consists of many steps, and I'm going to share my experience with the issues that I hit at some of them. Please keep in mind that you can always continue the execution from the last successfully completed step by clicking the Retry button.
- The Check OUI Prerequisites step may fail if the target host does not pass the prerequisite checks for WebLogic Server installation, such as the amount of RAM, the Linux packages installed, etc.
- The Create FMW Clone Archive step may fail if you do not have enough free space in the working directory on the administration server host.
- The Transfer cloning archive to targets step may fail if the EMGC agents on the admin server host or on the target host are not secured. You should secure the agent by issuing the ./emctl secure agent command from the $AGENT_HOME/bin directory and entering the agent registration password.
- Both the Transfer cloning archive to targets and Apply Clone at target hosts steps may fail if you do not have enough free space in the working directory on the target host.
- The most complicated issue I had was at the Run Inventory Collection step. The step failed, and I noticed that the agent on the target server had also failed with the following error in the $AGENT_HOME/sysman/log/emagent.trc log file:
Response received: 500|ORA-20603: The timezone of the multiagent target (/Farm_Localhost_MedRec_medrec_oradb/medrec_oradb,weblogic_domain)is not consistent with the timezone (America/Los_Angeles) reported by other agents.
2010-12-28 11:50:34,310 Thread-2838952848 ERROR upload: 1 Failure(s) in a row or XML error for A0000008.xml, retcode = -6, we give up
2010-12-28 11:50:35,552 Thread-2838952848 WARN upload: FxferSend: received fatal error in header from repository: https://oel46-vmware:1159/em/upload
FATAL_ERROR::500|ORA-20603: The timezone of the multiagent target (/Farm_Localhost_MedRec_medrec_oradb/medrec_oradb,weblogic_domain)is not consistent with the timezone (America/Los_Angeles) reported by other agents.
2010-12-28 11:50:35,552 Thread-2838952848 ERROR upload: number of fatal error exceeds the limit 3
2010-12-28 11:50:35,552 Thread-2838952848 ERROR upload: agent will shutdown now
2010-12-28 11:50:35,552 Thread-2838952848 ERROR : Signalled to Exit with status 55. Too many fatal upload failures
2010-12-28 11:50:35,552 Thread-2838952848 ERROR upload: 1 Failure(s) in a row or XML error for A0000008.xml, retcode = -6, we give up
2010-12-28 11:50:35,552 Thread-3044607680 ERROR main: EMAgent abnormal terminating
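When an agent dies like this, the useful detail is buried in a long trace file. As a rough sketch, the hypothetical helper below pulls the timezone that the repository expects out of any ORA-20603 lines. It assumes the message wording shown in the log above; the function name expected_timezones is made up for illustration.

```shell
# Print each distinct timezone that ORA-20603 errors in trace file $1
# report as the one "reported by other agents".
expected_timezones() {
    grep 'ORA-20603' "$1" \
        | sed -n 's/.*not consistent with the timezone (\([^)]*\)).*/\1/p' \
        | sort -u
}
```

Run as expected_timezones $AGENT_HOME/sysman/log/emagent.trc, it prints one line per distinct timezone named in those errors.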
I checked the timezone of my domain target inside the EMGC repository:
where target_type = 'weblogic_domain'
and display_name = 'medrec_oradb'
"TIMEZONE_REGION"
Then I checked the timezone of my agents, and indeed, they differed:
select target_name, timezone_region
where type_display_name = 'Agent'
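With more than a couple of agents, spotting the odd one out by eye gets tedious. A minimal sketch, assuming you have saved the agent query's result as whitespace-separated "target_name timezone_region" pairs, one per line; that input format and the function name mismatched_agents are assumptions for illustration only.

```shell
# Read "target_name timezone_region" pairs on stdin and print the names
# of the targets whose timezone differs from the expected one ($1).
mismatched_agents() {
    awk -v expected="$1" 'NF >= 2 && $2 != expected { print $1 }'
}
```

Piping the saved result through mismatched_agents America/Los_Angeles lists only the agents that need attention.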
So I had to change the timezone on the wls1032 host and propagate this change to the agent and to the EMGC repository. Here were the steps:
- issued the system-config-date command on wls1032.imc.fors.ru and set the timezone to "America/Los_Angeles"
- propagated the change to the agent by executing the ./emctl resetTZ agent command from the $AGENT_HOME/bin directory
- connected to EMGC repository as sysman and executed the following PL/SQL block:
After that I had to clear the pending uploads on wls1032.imc.fors.ru:
rm -r $AGENT_HOME/sysman/emd/state/*
rm -r $AGENT_HOME/sysman/emd/collection/*
rm -r $AGENT_HOME/sysman/emd/upload/*
$AGENT_HOME/bin/emctl start agent
$AGENT_HOME/bin/emctl clearstate agent
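One caveat with the rm commands above: if AGENT_HOME happens to be unset in the current shell, they expand to paths directly under /. A slightly more defensive variant, as a sketch: the function name clear_agent_dirs is made up for illustration, and it uses rm -rf rather than rm -r so that an already empty directory is not treated as an error.

```shell
# Remove pending state, collection and upload files under agent home $1.
# Refuses to run when $1 is empty or does not contain sysman/emd,
# so an unset AGENT_HOME cannot turn into paths under /.
clear_agent_dirs() {
    agent_home=$1
    emd_dir="$agent_home/sysman/emd"
    if [ -z "$agent_home" ] || [ ! -d "$emd_dir" ]; then
        echo "not an agent home: '$agent_home'" >&2
        return 1
    fi
    for sub in state collection upload; do
        # Delete the directory contents but keep the directory itself.
        rm -rf "$emd_dir/$sub"/*
    done
}
```

Calling clear_agent_dirs "$AGENT_HOME" would stand in for the three rm lines; the two emctl commands are still run afterwards as shown.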
The last part of this solution was to resync the agent in the EMGC console by clicking the Agent Resynchronization button (please leave the "Unblock agent on successful completion of agent resynchronization" checkbox checked on the next screen).
After that I issued the ./emctl upload command from $AGENT_HOME/bin on the wls1032 host. My previous error disappeared, but I caught another one:
EMD upload error: Failed to upload file A0000004.xml: HTTP error.
Response received: ERROR-400|Data will be rejected for upload from agent 'https://wls1032.imc.fors.ru:3872/emd/main/', max size limit for direct load exceeded [7544731/5242880]
So the XML file being uploaded was about 7 MB (7544731 bytes), and the limit on the OMS was 5 MB (5242880 bytes).
To increase the maximum file size limit to 20 MB, I had to connect to the OMS host and execute the following commands from the $OMS_HOME/bin directory:
./emctl set property -name em.loader.maxDirectLoadFileSz -value 20971520 -module emoms
./emctl stop oms
./emctl start oms
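The property value is given in bytes, so it is easy to get the number wrong. The arithmetic behind 20971520, and a way to pull both sizes out of the rejection message, can be sketched as follows; the helper names mb_to_bytes and direct_load_sizes are hypothetical, and the pattern assumes the message wording shown above.

```shell
# Convert a size in MB to bytes (1 MB = 1024 * 1024 bytes).
mb_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

# Read an OMS rejection message on stdin and print "<actual> <limit>",
# both in bytes, taken from the trailing "[actual/limit]" part.
direct_load_sizes() {
    sed -n 's|.*direct load exceeded \[\([0-9]*\)/\([0-9]*\)].*|\1 \2|p'
}
```

Here mb_to_bytes 20 gives 20971520, the value passed to emctl set property, and mb_to_bytes 5 gives 5242880, the limit quoted in the error.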
After that I issued the ./emctl upload command from $AGENT_HOME/bin on wls1032 one more time, and it completed successfully. The agent uploaded the configuration information to the EMGC repository, and I was able to see the results of my WebLogic domain scale-up in the EMGC Console.
So now the WebLogic cluster contains 2 managed servers located on different hosts.
This powerful feature of the Enterprise Manager Grid Control is a part of the WebLogic Server Management Pack Enterprise Edition.