Monday Feb 24, 2014

Monitoring Archive Area % Used on Cluster Databases


One of the most critical events to monitor on an Oracle Database is your archive area. If the archive area fills up, your database will halt until it can continue to archive the redo logs. If your archive destination is set to a file system, then the Archive Area % Used metric is often the best way to go. This metric allows you to monitor a particular file system for the percentage of space that has been used. However, there are a couple of things to be aware of for this critical metric.

Cluster Database vs. Database Instance

You will notice in EM 12c, the Archive Area metric exists on both the Cluster Database and the Database Instance targets. The majority of Cluster Databases (RAC) are built against database best practices which indicate that the Archive destination should be shared read/write between all instances. The purpose for this is that in case of recovery, any instance can perform the recovery and has all necessary archive logs to do so. Monitoring this destination for a Cluster Database at the instance level caused duplicate alerts and notifications, as both instances would hit the Warning/Critical threshold for Archive Area % Used within minutes of each other. To eliminate duplicate notifications, the Archive Area % Used metric for Cluster Databases was introduced. This allows the archive destination to be monitored at a database level, much like tablespaces are monitored in a RAC database.

In the Database Instance (RAC Instance) target, you will notice the Archive Area % Used metric collection schedule is set to Disabled.

If you have a RAC database and you do not share archive destinations between instances, you will want to Disable the Cluster Database metric, and enable the Database Instance metric to ensure that each destination is monitored individually.
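
For a quick manual check outside of EM, the same idea can be sketched in shell. This is only an illustration and not how EM computes Archive Area % Used: the function name archive_area_status, the 80/90 thresholds and the path /u01/arch are assumptions, and the function simply classifies the capacity column that `df -P` reports.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of EM: classify file system usage against
# warning/critical thresholds, reading `df -P` output on stdin.
# Assumes mount points contain no spaces.
archive_area_status() {
  local warn="$1" crit="$2"
  awk -v warn="$warn" -v crit="$crit" '
    NR > 1 {                           # skip the df header line
      pct = $5; sub(/%/, "", pct)      # column 5 of df -P is Capacity, e.g. 83%
      status = "OK"
      if (pct + 0 >= crit + 0)      status = "CRITICAL"
      else if (pct + 0 >= warn + 0) status = "WARNING"
      print $6, pct, status            # mount point, percent used, status
    }'
}

# Example (path is an assumption):
#   df -P /u01/arch | archive_area_status 80 90
```

With unshared archive destinations, the same check would be run once per instance, each against its own destination.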


Friday Oct 04, 2013

Java Heap Size Settings For Enterprise Manager 12c

This post provides an update to a previous post (Oracle Enterprise Manager 12c Configuration Best Practices (Part 1 of 3)) on how to increase the Java heap size for an OMS running release 12cR3. The entire series can be found in the My Oracle Support note titled Oracle Enterprise Manager 12c Configuration Best Practices [1553342.1].

Increase JAVA Heap Size

For larger enterprises, there may be a need to increase the amount of memory used for the OMS. One symptom of this condition is “sluggish” performance on the OMS. If it is determined that the OMS needs more memory, increase the Java heap size parameters. However, it is very important to increase this parameter incrementally and be careful not to consume all of the memory on the server. Also, Java does not always perform better with more memory.

Verify:  The parameters for the java heap size are stored in the following file:

<MW_HOME>/user_projects/domains/GCDomain/bin/startEMServer.sh

Recommendation: If you have more than 250 agents, increase the -Xmx parameter, which specifies the maximum size for the Java heap, to 2 GB. As the number of agents grows, it can be incrementally increased. Note: Do not increase this beyond 4 GB without contacting Oracle. Change only the -Xmx value in the line containing USER_MEM_ARGS="-Xms256m -Xmx1740m …options…" as seen in the example below. Do not change the Xms or MaxPermSize values. Note: Change both lines as seen below. The second occurrence will be used if running in debug mode.

Steps to modify the Java setting for versions prior to 12cR3 (12.1.0.3)

Before

if [ "${SERVER_NAME}" != "EMGC_ADMINSERVER" ] ; then
  USER_MEM_ARGS="-Xms256m -Xmx1740m -XX:MaxPermSize=768M -XX:-DoEscapeAnalysis -XX:+UseCodeCacheFlushing -XX:ReservedCodeCacheSize=100M -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+CMSClassUnloadingEnabled"
  if [ "${JAVA_VENDOR}" = "Sun" ] ; then
    if [ "${PRODUCTION_MODE}" = "" ] ; then
      USER_MEM_ARGS="-Xms256m -Xmx1740m -XX:MaxPermSize=768M -XX:-DoEscapeAnalysis -XX:+UseCodeCacheFlushing -XX:ReservedCodeCacheSize=100M -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+CMSClassUnloadingEnabled -XX:CompileThreshold=8000 -XX:PermSize=128m"
    fi
  fi
  export USER_MEM_ARGS
fi

After

if [ "${SERVER_NAME}" != "EMGC_ADMINSERVER" ] ; then
  USER_MEM_ARGS="-Xms256m -Xmx2560m -XX:MaxPermSize=768M -XX:-DoEscapeAnalysis -XX:+UseCodeCacheFlushing -XX:ReservedCodeCacheSize=100M -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+CMSClassUnloadingEnabled"
  if [ "${JAVA_VENDOR}" = "Sun" ] ; then
    if [ "${PRODUCTION_MODE}" = "" ] ; then
      USER_MEM_ARGS="-Xms256m -Xmx2560m -XX:MaxPermSize=768M -XX:-DoEscapeAnalysis -XX:+UseCodeCacheFlushing -XX:ReservedCodeCacheSize=100M -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+CMSClassUnloadingEnabled -XX:CompileThreshold=8000 -XX:PermSize=128m"
    fi
  fi
  export USER_MEM_ARGS
fi

Steps to modify the Java setting for version 12.1.0.3

emctl set property -name JAVA_EM_MEM_ARGS -value "<value>"
emctl stop oms -all
emctl start oms

Please note that this value is seeded in emgc.properties and is used to start the OMS. Set this property carefully: the OMS can fail to start if the value is not specified correctly. Below is an example of the command:

emctl set property -name JAVA_EM_MEM_ARGS -value "-Xms256m -Xmx2048m -XX:MaxPermSize=768M -XX:-DoEscapeAnalysis -XX:+UseCodeCacheFlushing -XX:ReservedCodeCacheSize=100M -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+CMSClassUnloadingEnabled"
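
Because a malformed value can keep the OMS from starting, it may help to build the new string from the current one rather than retyping it. The following is a minimal sketch under stated assumptions: set_xmx is a hypothetical helper, not an Oracle tool, and it changes only the -Xmx token while refusing anything above the 4 GB limit mentioned earlier.

```shell
#!/usr/bin/env bash
# Hypothetical helper: replace only the -Xmx value in a JVM argument string.
# Rejects non-numeric sizes and anything above 4096 MB.
set_xmx() {
  local args="$1" mb="$2"
  case "$mb" in
    ''|*[!0-9]*) echo "heap size must be a whole number of MB" >&2; return 1 ;;
  esac
  if [ "$mb" -gt 4096 ]; then
    echo "refusing -Xmx above 4096m" >&2
    return 1
  fi
  # -Xms is untouched because the pattern requires the literal text -Xmx.
  printf '%s\n' "$args" | sed -E "s/-Xmx[0-9]+[kKmMgG]?/-Xmx${mb}m/"
}

# Example:
#   set_xmx "-Xms256m -Xmx1740m -XX:MaxPermSize=768M" 2048
```

The printed string could then be reviewed and passed as the -value argument of the emctl set property command.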

 
    

Monday Aug 19, 2013

Simplified Agent and Plug-in Deployment

On your site of hundreds or thousands of hosts have you had to patch agents immediately as they get deployed?  For this reason I’ve always been a big fan of cloning an agent that has the required plug-ins and all the recommended core agent and plug-in patches, then using that clone for all new agent deployments. With Oracle Enterprise Manager 12c this got even easier as you can now clone the agent using the console “Add Host” method. You still have to rely on the EM users to use the clone. The one problem I have with cloning is that you have to have a reference target for each platform that you support. If you have a consolidated environment and only have Linux x64, this may not be a problem. If you are managing a typical data center with a mixture of platforms, it can become quite the maintenance nightmare just to maintain your golden images. You must update golden image agents whenever you get a new patch (generic or platform specific) for the agent or plug-in, and recreate the clone for each platform. Typically, I find people create a clone for their most common platforms, and forget about the rest. That means, maybe 80% of their agents meet their standard patch requirements and plug-ins upon deployment, but the other 20% have to be patched post-deploy, or worse – never get patched!

Deployed agents and plug-ins can be patched easily using EM Patches & Updates, but what about the agents still getting deployed or upgraded? Wouldn’t it be nice if they got patched as part of the deployment or upgrade? This article will show you two new features in EM 12.1.0.3 (EM 12cR3) that will help you deploy the most current agent and plug-in versions. Whether you have 100s or 1000s of agents to manage, reducing maintenance and keeping the agents up to date is an important task, and being able to deploy or upgrade to a fully patched agent will save you a lot of time and effort.


Thursday Jun 27, 2013

Oracle Enterprise Manager 12c Configuration Best Practices (Part 3 of 3)

This is part 3 of a three-part blog series that summarizes the most commonly implemented configuration changes to improve performance and operation of a large Enterprise Manager 12c environment. A “large” environment is categorized by the number of agents, targets and users. See the Oracle Enterprise Manager Cloud Control Advanced Installation and Configuration Guide chapter on Sizing for more details on sizing your environment properly.

  • Part 1 of this series covered recommended configuration changes for the OMS and Repository
  • Part 2 covered recommended changes for the Weblogic server
  • Part 3 covers general configuration recommendations and a few known issues

The entire series can be found in the My Oracle Support note titled Oracle Enterprise Manager 12c Configuration Best Practices [1553342.1].

Configuration Recommendations

Configure E-Mail Notifications for EM related Alerts

In some environments, the notifications for events for different target types may be sent to different support teams (i.e. notifications on host targets may be sent to a platform support team). However, the EM application administrators should be well informed of any alerts or problems seen on the EM infrastructure components.


Recommendation: Create a new Incident rule for monitoring all EM components and setup the notifications to be sent to the EM administrator(s). The notification methods available can create or update an incident, send an email or forward to an event connector. To setup the incident rule set follow the steps below. Note that each individual rule in the rule set can have different actions configured.


1.  To create an incident rule for monitoring the EM components, click on Setup / Incidents / Incident Rules. On the All Enterprise Rules page, click on the out-of-box rule called “Incident management Ruleset for all targets” and then click on the Actions drop down list and select “Create Like Rule Set…”

2. For the rule set name, enter a name such as MTM Ruleset. Under the Targets tab, select "Specified targets" and select "Targets" from the Add drop down list. Click on the green "+" sign. Click on the drop down arrow for Target Type and deselect all target types except "EM Service" and "OMS and Repository". Click "Search". Select the targets returned and click "Select".


3. Click on the Rules tab. To edit a rule, click on the rule name and click on Edit.

4. Modify the following rules (names for rules in 12.1.0.3 are in parentheses if they have changed):

a. Incident creation Rule for metric alerts (Create incident for critical metric alerts)

i. Leave the Type set as is but change the Severity to add Warning by clicking on the drop down list and selecting “Warning”. Click Next.

ii.  Add or modify the actions as required (i.e. add email notifications). Click Continue and then click Next.

iii. Leave the Name and description the same and click Next.

iv. Click Continue on the Review page.

b. Incident creation Rule for target unreachable.

i.   Leave the Type set as is but change the Target type to add EM Service and OMS and Repository by clicking on the drop down list and selecting both "EM Service" and "OMS and Repository". Click Next.

ii.  Add or modify the actions as required (i.e. add email notifications). Click Continue and then click Next.

iii. Leave the Name and description the same and click Next.

iv. Click Continue on the Review page.

5. Modify the actions for any other rule as required and be sure to click the “Save” push button to save the rule set or all changes will be lost.

Configure Out-of-Band Notifications for EM Agent

Out-of-Band notifications act as a backup when there’s a complete EM outage or a repository database issue. This is configured on the agent of the OMS server and can be used to send emails or execute another script that would create a trouble ticket. It will send notifications about the following issues:

  • Repository Database down
  • All OMS are down
  • Repository side collection job that is broken or has an invalid schedule
  • Notification job that is broken or has an invalid schedule

Recommendation: To setup Out-of-Band Notifications, refer to the MOS note “How To Setup Out Of Bound Email Notification In 12c” (Doc ID 1472854.1)

Modify the Performance Test for the EM Console Service

The EM Console Service has an out-of-box defined performance test that will be run to determine the status of this service. The test issues a request via an HTTP method to a specific URL. By default, the HTTP method used for this test is GET, but for performance reasons it should be changed to HEAD. The URL used for this request is set to point to a specific OMS server by default. If a multi-OMS system has been implemented and the OMS servers are behind a load balancer, then the URL used by EM in notifications and by this EM Service test must be modified to point to the load balancer name instead of a specific server name. If this is not done and a portion of the infrastructure is down, then the EM Console Service will show as down because this test will fail.

Recommendation: Modify the HTTP Method for the EM Console Service test and the URL if required following the detailed steps below.

Setting the Console URL if a multi-OMS system is implemented:

1.  Click on Setup / Manage Cloud Control / Health Overview

2.  Click on the "Add" push button next to Console URL.


3.  Type in the URL and click OK.


Modifying the HTTP Method for the EM Console Service test:

1.  Click on Targets / Services. From the list of services, click on the EM Console Service.

2. On the EM Console Service page, click on the Test Performance tab.

3.  At the bottom of the page, click on the Web Transaction test called EM Console Service Test

4.  Click on the Service Tests and Beacons breadcrumb near the top of the page.

5.  Under the Service Tests section, make sure the EM Console Service Test is selected and click on the Edit push button.

6.  Under the Transaction section, make sure the Access Logout page transaction is selected and click on the Edit push button

7.  Under the Request section, change the HTTP Method from the default of GET to the recommended value of HEAD. The URL in this section should point to the load balancer name instead of a specific server name if multi-OMSes have been implemented and the Console URL was set according to the steps above.
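
To see why HEAD is cheaper than GET, note that a HEAD response carries only the status line and headers, with no page body. The sketch below is not the EM beacon; head_status and is_up are hypothetical helper names, and the URL in the example comment is a placeholder for your load balancer address.

```shell
#!/usr/bin/env bash
# Hypothetical probe: issue a HEAD request and print only the HTTP status code.
head_status() {
  # -I sends HEAD, -s silences progress output, -o discards the headers,
  # -w prints the numeric status code, --max-time bounds the wait.
  curl -s -I -o /dev/null -w '%{http_code}' --max-time 10 "$1"
}

# Treat 2xx and 3xx as "up"; anything else, including 000 when the
# connection fails, counts as "down".
is_up() {
  case "$1" in
    2??|3??) return 0 ;;
    *)       return 1 ;;
  esac
}

# Example (URL is a placeholder):
#   is_up "$(head_status "https://<load_balancer_name>/em")" && echo up || echo down
```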

Check for Known Issues

Job Purge Repository Job is Shown as Down

This issue occurs after upgrading EM from 12c to 12cR2. On the Repository page under Setup → Manage Cloud Control → Repository, the job called “Job Purge” is shown as down and the Next Scheduled Run is blank. Also, repvfy reports that this is a missing DBMS_SCHEDULER job. NOTE: This issue is fixed in version 12.1.0.3.

Recommendation: In EM 12cR2, apply_purge_policies has been moved from the MGMT_JOB_ENGINE package to the EM_JOB_PURGE package. To remove this error, execute the command below:

$ repvfy verify core -test 2 -fix

To confirm that the issue is resolved, execute:

$ repvfy verify core -test 2

It can also be verified by refreshing the Job Service page in EM and checking the status of the job; it should now be Up.

Configure the Listener Targets in EM with the Listener Password (where required)

In a RAC environment, typically the grid home and rdbms homes are owned by different OS users. The listener always runs from the grid home. Only the listener process owner can query or change the listener properties. The listener uses a password to allow other OS users (e.g. the agent user) to query the listener process for parameters. EM has a default listener target metric that will query these properties. If the agent is not permitted to do this, a TNS incident (TNS-1190) is logged in the listener’s log file, and EM will report this error every time it is encountered there. This means that the listener targets in EM also need to have this password set; not doing so will cause many TNS-1190 incidents.

Recommendation: Set a listener password and include it in the configuration of the listener targets in EM

For steps on setting the listener passwords, see MOS notes: 260986.1 , 427422.1

Thursday Jun 20, 2013

Oracle Enterprise Manager 12c Configuration Best Practices (Part 2 of 3)

This is part 2 of a three-part blog series that summarizes the most commonly implemented configuration changes to improve performance and operation of a large Enterprise Manager 12c environment. A “large” environment is categorized by the number of agents, targets and users. See the Oracle Enterprise Manager Cloud Control Advanced Installation and Configuration Guide chapter on Sizing for more details on sizing your environment properly.

  • Part 1 of this series covered recommended configuration changes for the OMS and Repository
  • Part 2 covers recommended changes for the Weblogic server
  • Part 3 will cover general configuration recommendations and a few known issues

The entire series can be found in the My Oracle Support note titled Oracle Enterprise Manager 12c Configuration Best Practices [1553342.1].

WebLogic Server Recommendations

Stuck Thread Max Time

By design, WLS will ping applications and wait for a response for up to the value of Stuck Thread Max Time, which is set to 600 seconds by default. This is a heartbeat to ensure that a particular thread is not stuck. EM, on the other hand, will keep threads running as long as there is work in the queue, and they will not respond to a heartbeat. This is expected behavior for both EM and WebLogic Server; however, it will cause WLS to time out and raise an error, which creates an incident within EM. If this parameter is not increased, the number of incidents created by this WLS error can be significant. Please note, an enhancement bug has been created requesting that EM install out of the box with a higher value for this parameter.


Recommendation: To assist in reducing these errors, increase the stuck thread timeout in the Admin server as per the steps below. Note that this will reduce the number of above alerts but may not remove them completely.

1. Log onto the WLS Admin server.

2. Click on Environment in the top right side menu and expand Servers. Click on one of the OMS server names.

3. Click on the Tuning tab in the middle window and then on Lock and Edit under the Change Center (top left).

4. Change the value for Stuck Thread Max Time to 1800.


5. Save and Activate the change. This will require a restart of the OMS server for it to go into effect and will need to be repeated for all servers in the Admin Console (i.e. OMS servers and ADMINSERVER) but only needs to be done once per site/domain. If the environment contains standby OMS servers, repeat these steps for all standby OMS servers and the ADMINSERVER although a reboot is not required for the standby OMS servers as they are not running.

Modify Log Settings

The default severity setting for logging information in the WebLogic Server is set at a level that will create excessive logging data. These settings should be set to a higher severity level.

Recommendation: To modify these settings, follow the steps below:

1.  Log onto the WLS Admin server.

2.  Click on Environment in the top right side menu and expand Servers. Click on the first OMS server.


3. Click on the Logging tab in the middle window and then on the Lock and Edit under the Change Center (top left).




4. Expand the Advanced option at the bottom of the page.


5. Change the Minimum log severity from Info to Warning.


6. Change the Domain Log Broadcaster Severity Level from Notice to Error.


7. Save and Activate the change. This does not require a restart of the OMS server for it to go into effect but will need to be repeated for all servers in the Admin Console (i.e. OMS servers and ADMINSERVER). This change only needs to be done once per site/domain. If the environment contains standby OMS servers, repeat these steps for all standby OMS servers and the ADMINSERVER.

Thursday Jun 13, 2013

Oracle Enterprise Manager 12c Configuration Best Practices (Part 1 of 3)

The objective of this three-part blog series is to summarize the most commonly implemented configuration changes to improve performance and operation of a large Enterprise Manager 12c environment. A “large” environment is categorized by the number of agents, targets and users. See the Oracle Enterprise Manager Cloud Control Advanced Installation and Configuration Guide chapter on Sizing for more details on sizing your environment properly.

  • Part 1 of this series covers recommended configuration changes for the OMS and Repository
  • Part 2 will cover recommended changes for the Weblogic server
  • Part 3 will cover general configuration recommendations and a few known issues

The entire series can be found in the My Oracle Support note titled Oracle Enterprise Manager 12c Configuration Best Practices [1553342.1].

OMS Recommendations

Increase JAVA Heap Size

For larger enterprises, there may be a need to increase the amount of memory used for the OMS. One symptom of this condition is “sluggish” performance on the OMS. If it is determined that the OMS needs more memory, increase the Java heap size parameters. However, it is very important to increase this parameter incrementally and be careful not to consume all of the memory on the server. Also, Java does not always perform better with more memory.

Verify:  The parameters for the java heap size are stored in the following file:

<MW_HOME>/user_projects/domains/GCDomain/bin/startEMServer.sh

Recommendation: If you have more than 250 agents, increase the -Xmx parameter, which specifies the maximum size for the Java heap, to 2 GB. As the number of agents grows, it can be incrementally increased. Note: Do not increase this beyond 4 GB without contacting Oracle. Change only the -Xmx value in the line containing USER_MEM_ARGS="-Xms256m -Xmx1740m …options…" as seen in the example below. Do not change the Xms or MaxPermSize values. Note: Change both lines as seen below. The second occurrence will be used if running in debug mode.

Before

if [ "${SERVER_NAME}" != "EMGC_ADMINSERVER" ] ; then
  USER_MEM_ARGS="-Xms256m -Xmx1740m -XX:MaxPermSize=768M -XX:-DoEscapeAnalysis -XX:+UseCodeCacheFlushing -XX:ReservedCodeCacheSize=100M -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+CMSClassUnloadingEnabled"
  if [ "${JAVA_VENDOR}" = "Sun" ] ; then
    if [ "${PRODUCTION_MODE}" = "" ] ; then
      USER_MEM_ARGS="-Xms256m -Xmx1740m -XX:MaxPermSize=768M -XX:-DoEscapeAnalysis -XX:+UseCodeCacheFlushing -XX:ReservedCodeCacheSize=100M -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+CMSClassUnloadingEnabled -XX:CompileThreshold=8000 -XX:PermSize=128m"
    fi
  fi
  export USER_MEM_ARGS
fi

After

if [ "${SERVER_NAME}" != "EMGC_ADMINSERVER" ] ; then
  USER_MEM_ARGS="-Xms256m -Xmx2560m -XX:MaxPermSize=768M -XX:-DoEscapeAnalysis -XX:+UseCodeCacheFlushing -XX:ReservedCodeCacheSize=100M -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+CMSClassUnloadingEnabled"
  if [ "${JAVA_VENDOR}" = "Sun" ] ; then
    if [ "${PRODUCTION_MODE}" = "" ] ; then
      USER_MEM_ARGS="-Xms256m -Xmx2560m -XX:MaxPermSize=768M -XX:-DoEscapeAnalysis -XX:+UseCodeCacheFlushing -XX:ReservedCodeCacheSize=100M -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+CMSClassUnloadingEnabled -XX:CompileThreshold=8000 -XX:PermSize=128m"
    fi
  fi
  export USER_MEM_ARGS
fi
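
Since both USER_MEM_ARGS lines have to be changed, a quick check after editing can catch an occurrence that was missed or mistyped. This is a minimal sketch, assuming each assignment starts on one line of the file; xmx_per_assignment is a hypothetical helper name. It prints the -Xmx token found on each assignment line, or MISSING when none is recognized, for example when a typographic dash was pasted in place of the ASCII hyphen.

```shell
#!/usr/bin/env bash
# Hypothetical check: for every USER_MEM_ARGS="..." line in the given file,
# print its -Xmx token, or MISSING if no well-formed -Xmx option is found.
xmx_per_assignment() {
  awk '/USER_MEM_ARGS="/ {
         if (match($0, /-Xmx[0-9]+[kKmMgG]?/))
           print substr($0, RSTART, RLENGTH)
         else
           print "MISSING"
       }' "$1"
}

# Example:
#   xmx_per_assignment <MW_HOME>/user_projects/domains/GCDomain/bin/startEMServer.sh
```

Two identical values in the output suggest both occurrences were updated consistently.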

Repository Recommendations

Repvfy execute optimize

This command can be executed to establish a baseline and set the environment to the “recommended” values based on the configuration of that environment.  The following command will check the existing settings and modify them if needed.

$ repvfy execute optimize

This command does several things some of which include the following:

1. Internal task system:

  • Verify there are at least 2 short running and 2 long running worker threads
  • Verify that the availability worker threads are disabled since these threads are now obsolete

2. Repository settings:

  • Set the retention time for the MGMT_SYSTEM_ERROR_LOG table to 7 days (unless this setting has already been changed)
  • Disable PL/SQL and metric tracing to reduce logging when not necessary
  • Recompile any invalid SYSMAN objects

3. Target system:

  • Tune the PING grace period to allow the OMS to wait a longer period of time after startup before checking the heartbeat of the agents

Increase Task Workers

Task worker threads are used to pick up tasks from the dbms_scheduler jobs queue based on their type. These jobs are used to calculate metrics, roll up metrics for clusters and provide the self-monitoring metrics for EM. Tasks are defined as short or long. Many larger systems require more than one short and one long task worker to do the housekeeping jobs in a timely manner without creating a backlog. The recommendation is to have at least 2 short-running worker threads and 2 long-running worker threads.

Verify:  To determine if you have a backlog:

 $ repvfy verify repository -test 1001

If you have a backlog, execute the command below to gather more details on the performance data for the task workers.

 $ repvfy dump task_health

Recommendation:  If the output from the dump task_health indicates a backlog, execute the following statement to set the recommended number of task workers for both short running tasks (type 0) and long running tasks (type 1).  This will increase the settings to the recommended settings for your environment (this command is not necessary if you already ran it on your environment from the first recommended step above).

 $ repvfy execute optimize

If, after applying the recommended settings, the site has grown to such a size that there is still a task worker backlog, use this routine to increase the number of workers above 2:

$ sqlplus /nolog
 SQL> connect SYSMAN;
 SQL> exec gc_diag2_ext.SetWorkerCounts(<number>);

The number can be 3 or 4 (the routine will not accept values larger than 4). If you need to go higher than 4, contact Oracle Support.
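
If the call is scripted, the accepted range can be enforced before anything reaches SQL*Plus. The following is a small sketch; worker_count_sql is a hypothetical helper that only prints the statement for review and does not connect to the repository.

```shell
#!/usr/bin/env bash
# Hypothetical guard: print the SetWorkerCounts call only for the values
# described above (3 or 4); reject everything else.
worker_count_sql() {
  case "$1" in
    3|4) printf 'exec gc_diag2_ext.SetWorkerCounts(%s);\n' "$1" ;;
    *)   echo "worker count must be 3 or 4; contact Oracle Support to go higher" >&2
         return 1 ;;
  esac
}

# Example:
#   worker_count_sql 3
```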

Increase Ping Grace Period

Upon system startup, the OMS must ping each agent to get a current heartbeat and update the availability state for all the agents.  In systems with 100’s or 1000’s of agents, this can take longer.   By increasing the grace period for the ping/heartbeat system to kick in and contact Agents we allow more time for the agents to start uploading first. 

Recommendation:  Execute the following statement. This command will evaluate the system and set the appropriate value for the Ping Grace Period to give the majority of the agents a chance to begin their upload upon system startup (this command is not necessary if you already ran it on your environment).

$ repvfy execute optimize 

If after an OMS restart, you still see a high number of pending agents for a prolonged period of time, this value may need to be set higher.  Execute the following statement and contact Oracle Support, providing the output from the dump ping_health command.

$ repvfy dump ping_health

Wednesday Apr 03, 2013

Relocating targets with EM 12c

Multi-Agent targets

Some targets (like RAC databases, clusters, FMW domains etc) are considered 'clustered' and have failover built into them 'by design'.

Enterprise Manager handles those targets in a special way:

  • They are marked as 'multi-Agent' targets: They are discovered on all Agents of the 'cluster' or 'set of hosts' they can run on.
  • The OMS will decide which Agent will do the actual 'monitoring' of that target in question. (OMS mediation)
  • If that Agent goes down or becomes 'unavailable', the OMS will choose another Agent from the discovered set to take responsibility of that target and continue the monitoring.

For these targets, the 'relocate_target' functionality should not be used, since the OMS will take care of the failover, and move the monitoring to a 'surviving Agent' in case a failover is needed.
Forcing a target to get moved to another Agent should also not be done with 'relocate' functionality, since the target is in almost all cases linked to other targets (like CRS or cluster targets) which have to have known associations with these targets.

To see which targets are considered 'OMS mediated', run this query in the repository:

  SELECT host_name, entity_type, entity_name
  FROM   em_manageable_entities
  WHERE  manage_status   = 2  -- Managed
    AND  promote_status  = 3  -- Promoted
    AND  monitoring_mode = 1  -- OMS mediated
  ORDER BY host_name, entity_type, entity_name
  ;

To be able to see the list of Agents that have discovered these OMS mediated targets, and can assume monitoring of them, use these REPVFY commands:

  • To see which Agents can monitor an OMS mediated target:
    $ repvfy show master_agent -name "<name_of_target>" -type "<type_of_target>"
  • To see the last 10 Agent failovers for a given target:
    $ repvfy show master_agent_history -name "<name_of_target>" -type "<type_of_target>"
  • To see which Agents can monitor the various components of a Database Machine (Exadata):
    $ repvfy show exadata_master_agent -name "<name_of_the_dbmachinetarget>"
  • For debugging/maintenance purposes, a special routine exists in REPVFY to force an OMS mediated failover:
    SQL> exec gc_diag3_ext.ForceFailover('<name_of_target>','<type_of_target>');

For more information on EMDIAG, see: 421053.1: EMDIAG Master Index

Manual relocation of targets

In case a 'regular' target needs to get moved to another Agent, a special EMCLI verb exists to move the definition and monitoring settings of a target from one Agent to another Agent:

  $ emcli relocate_targets \
          -src_agent=<source_agent_target_name> \
          -dest_agent=<dest_agent_target_name> \
          -target_name=<name_of_target_to_be_relocated> \
          -target_type=<type_of_target_to_be_relocated> \
          -copy_from_src -force=yes

Gotchas:

  • For targets that have known 'associations' (services like FMW, EBiz, etc) the '-force' flag will move all related targets together with the main service to the new Agent.
  • For those targets that have monitoring settings that are host/Agent specific, the values of those properties will have to get updated when the target moves
    $ emcli relocate_targets <...options...> -changed_param=<propName>:<propValue>
  • If there is 'clock skew' between the source and the destination Agent, the availability of the target might be impacted when the target gets moved from the old to the new Agent.
    To force the new time of the target, the 'ignoreTimeSkew' parameter can be used, to make the repository 'accept' the 'older' time from the new Agent:
    $ emcli relocate_targets <...options...> -ignoreTimeSkew=yes
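
When several targets have to be moved, it can help to generate the command text first and review it before running anything. The following is a minimal dry-run sketch: relocate_cmd is a hypothetical helper, the values in the example comment are placeholders, and only the flags shown above are used.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run helper: print an emcli relocate_targets command for
# review instead of executing it. Arguments: source Agent, destination Agent,
# target name, target type.
relocate_cmd() {
  local src="$1" dest="$2" name="$3" type="$4"
  printf 'emcli relocate_targets -src_agent="%s" -dest_agent="%s" -target_name="%s" -target_type="%s" -copy_from_src -force=yes\n' \
    "$src" "$dest" "$name" "$type"
}

# Example (all values are placeholders):
#   relocate_cmd host1.example.com:3872 host2.example.com:3872 orcl oracle_database
```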

Automated relocation of targets

For cold failover clusters (CFC), the EMCLI way of moving a target from one Agent to another Agent will not work because of the interactive nature of EMCLI (and the password requirement).

For those setups, there is an EMCTL command to take ownership of a target.

  • An Agent can only assume control over a target.
    It can not give a target 'away' or push a target on another Agent
  • For security reasons, the list of Agents that can 'assume' control over a given target needs to be registered first in the repository.
  • For every target requiring automated failover (emctl failover), run this EMCLI command once to setup the list of possible Agents:
       $ emcli set_standby_agent \
               -src_agent=<source_agent_target_name> \
               -dest_agent=<dest_agent_target_name> \
               -target_name=<name_of_target_to_be_relocated> \
               -target_type=<type_of_target_to_be_relocated>
    
  • If more than 2 Agents are needed, run multiple EMCLI commands for each target, each time with a new Agent specified for the '-dest_agent' parameter.
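
The one-command-per-standby-Agent pattern from the list above can be sketched as a loop. This is an illustration only: standby_cmds is a hypothetical helper that prints the commands for review rather than running them, and it uses only the set_standby_agent flags shown above.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run helper: print one set_standby_agent command per
# candidate standby Agent for a single target.
# Arguments: source Agent, target name, target type, then one or more
# destination Agents.
standby_cmds() {
  local src="$1" name="$2" type="$3"
  shift 3
  local dest
  for dest in "$@"; do
    printf 'emcli set_standby_agent -src_agent="%s" -dest_agent="%s" -target_name="%s" -target_type="%s"\n' \
      "$src" "$dest" "$name" "$type"
  done
}

# Example (all values are placeholders):
#   standby_cmds node1:3872 mydb oracle_database node2:3872 node3:3872
```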

Once the setup has been done, an Agent can take control over a target by running this command:

   $ emctl relocate_target agent "<name_of_target>" "<type_of_target>"

The cluster scripts that run before and after the failover of a node can then be enhanced with this EMCTL command, so that the Agent on the new (surviving) node assumes control over the desired targets.
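
As a rough sketch of what such a cluster hook might look like, the script below walks a list of target name/type pairs and issues the relocate_target command for each one. The helper names run and take_over_targets and the DRY_RUN switch are assumptions made for illustration. By default the commands are only printed, so the logic can be checked on a node without touching the Agent.

```sh
#!/bin/sh
# Print the command when DRY_RUN is 1 (the default); execute it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# Let the local Agent assume control over each listed target.
# Usage: take_over_targets <name> <type> [<name> <type> ...]
take_over_targets() {
    while [ "$#" -ge 2 ]; do
        # Stop at the first failure so the cluster script can react to it.
        run emctl relocate_target agent "$1" "$2" || return 1
        shift 2
    done
}
```

From the post-failover script on the surviving node, this could be called as DRY_RUN=0 take_over_targets "<name_of_target>" "<type_of_target>", once the standby Agents have been registered as described above.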

There is no built-in way in EM today to visualize or retrieve the failover configuration of a target.
To be able to see the setup done for a particular target, the following commands have been added to REPVFY:

  • To see which Agents can 'assume' control over a target:
    $ repvfy show agent_failover -name "<name_of_target>" -type "<type_of_target>"

For more information on EMDIAG, see: 421053.1: EMDIAG Master Index

Stay Connected with Oracle Enterprise Manager:

Twitter | Facebook | YouTube | Linkedin | Newsletter

Using Advanced Notifications in Oracle Enterprise Manager 12c

When using an enterprise monitoring tool such as Oracle Enterprise Manager 12c, one of the most critical components is notification. Once an alert or issue has been identified, how do you tell the right people at the right time? Most enterprises use e-mail or open a trouble ticket. As you can imagine, no two enterprises are the same when it comes to their tools and processes. Many customers use one of the more common and well known trouble ticketing systems but quite a few use non-standard or custom (homegrown) trouble ticketing systems. Some customers have special routing requirements or corporate standards and have custom applications which handle all emailing functions instead of directly emailing using an SMTP server.

Oracle Enterprise Manager 12c can handle all of these situations by utilizing one of the various notification methods provided: E-mail, 3rd party connectors and advanced notification methods. There are three types of advanced notifications: SNMP, OS Command and PL/SQL. This blog will introduce you to the OS Command and PL/SQL notification methods available in EM 12c and provide an example of using a custom OS script for notifications.

Advanced Notification Methods: OS Command and PL/SQL

With the advanced notification methods, you can write a notification directly to a table or an OS log for further processing or push a notification to any trouble ticketing system using their command line tools providing the data and variables from EM 12c as input.  This method is used by some customers whose corporate standard requires that all alerts be written to a log file, which the ticketing system can poll on a regular basis for alerts. Additionally, advanced notifications allow you to call a procedure whose interface is PL/SQL or access additional data in the Enterprise Manager repository.  For example, for a database alert on % of Processes or Sessions Used, you could use PL/SQL to notify the proper application team (stored in a custom target property) that they have too many sessions.   To create and configure advanced notifications, the user must have Super Admin privileges.
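
To illustrate the log-polling pattern mentioned above, here is a minimal sketch of how a ticketing-side job might pick up only the events written since its last run. It is an assumption about how such a poller could work, not part of EM: the function name poll_new_events and the state file are hypothetical, and it simply remembers how many lines of the log it has already seen.

```sh
#!/bin/sh
# Print the lines added to a log file since the previous call.
# The number of lines already processed is kept in a small state file.
# Usage: poll_new_events <log_file> <state_file>
poll_new_events() {
    log_file=$1
    state_file=$2
    seen=0
    if [ -f "$state_file" ]; then
        seen=$(cat "$state_file")
    fi
    total=$(wc -l < "$log_file" | tr -d ' ')
    # If the log was truncated or rotated, start again from the top.
    if [ "$total" -lt "$seen" ]; then
        seen=0
    fi
    if [ "$total" -gt "$seen" ]; then
        tail -n "+$((seen + 1))" "$log_file"
    fi
    echo "$total" > "$state_file"
}
```

Run from a scheduler, each call would hand the new lines to whatever command the ticketing system provides for creating tickets.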

Creating OS or PL/SQL Scripts

When creating an advanced notification the first step is to create the OS or PL/SQL script. It’s highly recommended to include debugging and logging information so you can fully understand what information EM 12c is passing and assist in troubleshooting.  When using an OS Command in a multi-OMS environment, the script needs to reside on all OMS servers or preferably in a shared location. If you plan to use PL/SQL, the procedure must be created in the repository database before configuring it as a notification method.   Of course, any custom objects created in the repository should be created under a separate schema and privileges granted to the SYSMAN user. 

The Notifications chapter of the Oracle Enterprise Manager Cloud Control Administrator’s Guide has detailed examples of OS scripts and PL/SQL that can be used in different situations.  This chapter also provides detailed information on passing information to the OS or PL/SQL script and on troubleshooting.
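
One simple way to follow the debugging advice above is to have the OS script record its whole environment each time it is called, since the event data reaches the script as environment variables. The snippet below is a minimal sketch under that assumption. The function name dump_event_env and the default log path are hypothetical, and because the dump may contain sensitive values it is best kept to test systems.

```sh
#!/bin/sh
# Hypothetical debug log location; override by exporting DEBUG_LOG.
DEBUG_LOG=${DEBUG_LOG:-/tmp/em_notify_debug.log}

# Append a timestamped, sorted dump of the current environment to a file.
# Usage: dump_event_env <file>
dump_event_env() {
    {
        echo "=== notification received: $(date) ==="
        env | sort
        echo
    } >> "$1"
}

# In a notification script, call this first:
#   dump_event_env "$DEBUG_LOG"
```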

Creating Custom Notification Methods

After defining the OS Command or PL/SQL Script, you need to add it to EM 12c as a notification method.  In this example, we will create a simple OS script that logs events to a log file, which can be further processed by a custom ticketing system.

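#!/bin/ksh
# Append each EM event to a log file for pickup by the ticketing system.
LOG_FILE=/tmp/event.log

# The log file must already exist; otherwise exit with a non-zero status.
if test -f "$LOG_FILE"
then
   echo "$TARGET_NAME $MESSAGE $EVENT_REPORTED_TIME" >> "$LOG_FILE"
else
   exit 100
fi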

To create a notification method, log in as a user with Super Admin privileges and select Setup / Notifications / Notification Methods.  Under the Scripts and SNMP Traps section, click the Add drop-down box, select OS Command, and click Go.


Enter a name and provide the fully qualified script location.  Use the Test button to validate and click Save.


Adding Notification Methods to Incident Rules

To receive notifications, you will need to create an Incident Rule set and select relevant targets and events to notify and then select the OS Command advanced notification method we created earlier. In this section we will go through the steps to create a simple incident rule.  For full details on how to configure your Incident Rules, see the Oracle Enterprise Manager Cloud Control Administrator’s Guide.

Go to Setup / Incidents / Incident Rules.   At this point you can either select a rule set to edit, or create a new rule set.  In this example we are going to create a new rule set by clicking Create.


On the first screen, enter a user-friendly name and description. Since this example uses an advanced notification, choose Enterprise.


Since we only want notifications on our Production targets, we’re going to narrow down the list of targets. On the Targets tab, select Specific targets, select Groups in the drop-down box and click Add. Search for the desired group and select it.


Click on the Rules tab, and select Create.


Select the type of rule to create. In this example we are using Incoming events and updates to events.


Enter filter criteria. In this example we are filtering on all Metric Alerts with Severity Warning and Critical. Click Next.


On the Add Actions screen you will define the action that you wish the notification to perform.  Click Add.


In the Create Incident or Update Incident section you can choose to create an Incident, assign Incidents to an administrator, set priority/status or escalate (if update is selected).  Under the Notifications section, you can select Basic Notifications to send an e-mail or page to a particular user or users, on top of any methods you might select in the Advanced Notifications section.  In Advanced Notifications you will see the method we previously created, select that method and click Continue.



Click Next.


Provide a user-friendly name and a description for your rule, then click Next.



Review the details of your rule and click Continue.


Note the rules are not saved until the Save button is clicked. Click OK.


Review your rule set and click Save.


Click OK.


Validating Incident Rules

Once you have configured your rule set, you should trigger a critical alert on one of the targets included in the rule set. Verify that the rule set triggers the notification method you created and logs the event in the log file.  It's helpful to keep the e-mail option turned on during testing so you can validate when the rule was triggered as well. 

Identify a target in the group you configured notifications for in the previous section. For this example, we will use a database instance and trigger a Process Limit Usage (%) event. From Oracle Database / Monitoring / All Metrics, we’ve identified a metric whose thresholds we can lower to trigger an event. Click on Process Limit Usage (%) to drill down to this metric. Notice our Real Time Value of Processes is 16.4.


Click on Modify Thresholds to set thresholds.


Set the Warning and Critical thresholds lower than the Real Time Value we made note of earlier. Ours was 16.4, so we set Warning to 5 and Critical to 10. In the Occurrences before Alert field, you may wish to change the value to 1. Since the collection frequency is 10 minutes, with occurrences set to 3 you would have to wait 30 minutes for the alert to trigger and another 30 minutes for it to clear. Click Save Thresholds.


Close the confirmation window.


Navigate to Oracle Database / Monitoring / Incident Manager. Since we have the out-of-box rule set enabled we get an Incident for every critical metric alert and we can see the incident for Process Limit in the Unacknowledged incidents view. If you have disabled the out-of-box rule set or don’t see the incident, check the Events without incidents view. Select the appropriate line and in the lower pane click on the Events tab.


On the Events tab click on the Message link to drill into the event details.


Here you’ll see the notification method was called under the Last Comment field. From here, click on Updates tab to see more details.


In the Updates tab, you will see the details of the alert and the notification method that was called to run our /tmp/event_log.sh script.


Finally, we can check the /tmp/event.log file and see the information that was reported.   Be sure to go back and set your threshold to its regular value to clear the false alert you triggered!


Summary

The options for notifications in Oracle Enterprise Manager 12c are very flexible. You can choose e-mail, one of the available connectors to integrate with a 3rd party ticketing system, or use one of the advanced notification methods (SNMP, OS Commands or PL/SQL). In this blog, we’ve shown you how to create an advanced OS Command notification method, how to configure an incident rule set to call that method and how to validate by triggering an alert. Once you are familiar with how to create advanced notification methods and rule sets, you can customize your notifications to suit your needs. 

For additional information on configuring your environment for enterprise monitoring, see the whitepaper Strategies for Scalable, Smarter Monitoring using Oracle Enterprise Manager Cloud Control 12c.  


Network Ports Used in Oracle Enterprise Manager 12c

When planning and configuring your Oracle Enterprise Manager 12c implementation, you will have many infrastructure considerations. One of the most often discussed pieces is the network ports that are used and how to configure load balancers, firewalls and ACLs for communication.

This blog post will help identify the typical default port and range for each component, how to identify it and how to modify the port usage.

To modify most ports during installation, select the Advanced Installation and set the appropriate ports on the Port Configuration Details screen.


Once the system is installed, you can use the following EMCTL or OMSVFY commands to validate components and port assignment:

$ emctl status oms -details
$ omsvfy show opmn
$ omsvfy show ports

To verify if a port is free, run the following command:

On Unix:
$ netstat -an | grep <port_no>

On Microsoft Windows:
>netstat -an|findstr <port_no>
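
A plain grep on the port number also matches longer numbers that contain it (4889 matches 48890, for example). The sketch below is one possible refinement for Unix, not an Oracle-provided check: the function name port_is_listening is hypothetical, and it assumes netstat prints the local address as host:port or host.port followed by whitespace, with LISTEN later on the same line.

```sh
#!/bin/sh
# Read "netstat -an" output on stdin and succeed if a socket is listening on
# the given port. The port must end an address (":4889" or ".4889"), so a
# search for 4889 does not match 48890.
# Usage: netstat -an | port_is_listening <port_no>
port_is_listening() {
    grep -E "[.:]$1[[:space:]]+.*LISTEN" > /dev/null
}

# Example:
#   if netstat -an | port_is_listening 4889; then
#       echo "port 4889 is in use"
#   else
#       echo "port 4889 appears to be free"
#   fi
```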

For more information on OMSVFY (part of the EMDIAG toolkit) see MOS Note 421053.1: EMDIAG Troubleshooting Kits Master Index

External Ports

These ports will be used in every Enterprise Manager 12c installation and will require firewall and/or ACL modifications if your network is restricted.  These are also the components that will be added to your load balancer configuration.

| Default Port | Range | Component | Usage | Modify |
|---|---|---|---|---|
| 4889 | 4889 – 4898 | Enterprise Manager OHS Upload HTTP | Agent communication to OMS (unsecure). Used in load balancer. | To modify after install, follow notes 1381030.1 and 1385776.1. Requires changes on all Agents. |
| 1159 | 1159, 4899 – 4908 | Enterprise Manager OHS Upload HTTP SSL | Agent communication to OMS (secure). Used in load balancer. | To modify after install, follow notes 1381030.1 and 1385776.1. Requires changes on all Agents. |
| 7788 | 7788 – 7798 | Enterprise Manager OHS Central Console HTTP (Apache/UI) | Web browser connecting to Cloud Control Console (unsecure). Used in load balancer and for EM CLI. | To modify after install, follow note 1381030.1. |
| 7799 | 7799 – 7809 | Enterprise Manager OHS Central Console HTTP SSL (Apache/UI) | Web browser connecting to Cloud Control Console (secure). Used in load balancer and for EM CLI. | To modify after install, follow note 1381030.1. |
| 7101 | 7101 – 7200 | EM Domain WebLogic Admin Server HTTP SSL Port | Cloud Control Admin Server. | To modify after install, follow note 1109638.1. |
| 3872 | 3872, 1830 – 1849 | Cloud Control Agent | Only the OMS will connect to this port, to either report changes in the monitoring, submit jobs, or to request real-time statistics. | Port can be provided during Agent install. To change it later, see the Cloud Control Agent note below the table. |
| 1521* | Depends on Listener Configuration | Database Targets - SQL*Net Listener | For Repository database, only the OMS will connect to store management data from the agents. For all monitored target databases OMS will retrieve information requested by browser clients. | See the SQL*Net Listener note below the table. |
| 7101 | 7101 – 7200 | FMW Targets – Admin Console | Outgoing from OMS, used for managing FMW targets. | To modify after install, follow note 1109638.1. |
| NA | NA | ICMP | Outgoing from OMS to host servers if the Agent is unreachable. Validates if server is up or down. | NA |

Modify (3872, Cloud Control Agent):

If the agent port needs to be changed at a later date, this can be done with the following command on the agent:
emctl setproperty agent -name EMD_URL -value https://hostname.domain:port/emd/main/

This will allow the agent to run on the new port; however, the target does not get renamed, so it continues to show the original port.

Modify (1521*, SQL*Net Listener):

To modify this port for the repository database:

Change the listener.ora file for the EM repository. Restart the listener. Then for every OMS machine using that repository run the following:

emctl stop oms
emctl config oms -store_repos_details -repos_conndesc <connect descriptor of database> -repos_user sysman
emctl start oms
emctl config emrep -agent <agent name> -conn_desc <connect descriptor of database>

To modify this port for monitored targets, change the listener configuration on the target, then update Monitoring Configuration in EM.
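
For the Agent port change described above, a typo in the EMD_URL value is easy to make. The helper below is a small sketch that assembles the URL in the format shown (https://hostname.domain:port/emd/main/) and rejects values that are not valid TCP port numbers. The function name build_emd_url and the host name in the example are hypothetical.

```sh
#!/bin/sh
# Build an EMD_URL value from a host name and a port, after a basic sanity
# check that the port is a number between 1 and 65535.
# Usage: build_emd_url <hostname.domain> <port>
build_emd_url() {
    host=$1
    port=$2
    case $port in
        ''|*[!0-9]*)
            echo "invalid port: $port" >&2
            return 1
            ;;
    esac
    if [ "$port" -lt 1 ] || [ "$port" -gt 65535 ]; then
        echo "port out of range: $port" >&2
        return 1
    fi
    printf 'https://%s:%s/emd/main/\n' "$host" "$port"
}

# Example (made-up host name):
#   emctl setproperty agent -name EMD_URL -value "$(build_emd_url agent01.example.com 1830)"
```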

Internal Ports

These ports are required for internal Enterprise Manager communication and typically do not require additional firewall/ACL configuration.

| Default Port | Range | Component | Usage | Modify |
|---|---|---|---|---|
| 7201 | 7201 – 7300 | EM Domain WebLogic Managed Server HTTP Port | Used for Fusion Middleware communication. | Configured during installation |
| 7301 | 7301 – 7400 | EM Domain WebLogic Managed Server HTTP SSL Port | Used for Fusion Middleware communication. | Configured during installation |
| 7401 | 7401 – 7500 | Node Manager HTTP SSL Port | Used for Fusion Middleware communication. | Configured during installation |
| 6702 | 6100 – 6199 | Oracle Notification Server (OPMN) Local | Ports used by OPMN can be verified from opmn.xml (see the OPMN note below the table). | See the OPMN note below the table. |
| 6703 | 6200 – 6201 | Oracle Notification Server (OPMN) Remote | Ports used by OPMN can be verified from opmn.xml (see the OPMN note below the table). | See the OPMN note below the table. |

OPMN (Local and Remote):

Ports used by OPMN can be verified from <MW_HOME>/gc_inst/WebTierIH1/config/OPMN/opmn/opmn.xml:

<debug comp="" rotation-size="1500000"/>
<notification-server interface="any">
<port local="6700" remote="6701"/>

Modify the opmn.xml to use free ports as below:

1. Stop OMS

2. Take a backup of the existing opmn.xml and ports.prop in the <MW_HOME>/gc_inst/WebTierIH1/config/OPMN/opmn directory.

3. Edit the opmn.xml file: under the <notification-server> element, change the local / remote port as necessary to a free port and save the file.

4. Edit the ports.prop file and modify the remote / local port parameters as necessary and save the file.

5. Start the OMS
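
To check which OPMN ports are currently configured before and after editing, the port element can be pulled out of opmn.xml with sed. This is only a sketch: the function name opmn_ports is hypothetical, and it assumes the local and remote attributes appear in the order shown in the excerpt above.

```sh
#!/bin/sh
# Print the local and remote OPMN ports found in an opmn.xml file.
# Assumes a line of the form: <port local="NNNN" remote="NNNN" .../>
# Usage: opmn_ports <path_to_opmn.xml>
opmn_ports() {
    sed -n 's/.*<port local="\([0-9][0-9]*\)" remote="\([0-9][0-9]*\)".*/local=\1 remote=\2/p' "$1"
}

# Example:
#   opmn_ports "<MW_HOME>/gc_inst/WebTierIH1/config/OPMN/opmn/opmn.xml"
```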

Optional

These ports are required only if certain components are used; firewall/ACL changes may be needed.

| Default Port | Range | Component | Usage | Modify |
|---|---|---|---|---|
| 443 | | Secure web connection (https - 443) to updates.oracle.com, support.oracle.com, ccr.oracle.com, login.oracle.com, aru-akam.oracle.com | Outgoing from OMS, used for online communication with Oracle for OCM, MOS, Patching, Self-Updates, ASR | Proxy settings defined via the UI (Setup -> Proxy Settings). Do not use the OMS parameters! |
| 51099 | | Application Dependency and Performance RMI Registry Port | ADP | Configured during installation |
| 55003 | | Application Dependency and Performance Java Provider Port | ADP | Configured during installation |
| 55000 | | Application Dependency and Performance Remote Service Controller Port | ADP | Configured during installation |
| 4210 | | Listen | ADP | Configured during installation |
| 4211 | | SSL Listen Port | ADP | Configured during installation |
| 3800 | | JVM Managed Server Listen | JVM | Configured during installation |
| 3801 | | JVM Managed Server SSL Listen | JVM | Configured during installation |
| 9701 | 9701 – 49152 | BI Publisher HTTP | BI Publisher | During install, can be modified with the configureBIP script. Post-install, can be modified per Note 1524248.1 |
| 9702 | 9701 – 49152 | BI Publisher HTTP SSL Port | BI Publisher | During install, can be modified with the configureBIP script. Post-install, can be modified per Note 1524248.1 |


About

bocadmin_ww
