Oracle Enterprise Manager 12c Configuration Best Practices (Part 1 of 3)

The objective of this three-part blog series is to summarize the most commonly implemented configuration changes that improve the performance and operation of a large Enterprise Manager 12c environment. A “large” environment is characterized by its number of agents, targets and users. See the Sizing chapter of the Oracle Enterprise Manager Cloud Control Advanced Installation and Configuration Guide for more details on sizing your environment properly.

  • Part 1 of this series covers recommended configuration changes for the OMS and Repository
  • Part 2 will cover recommended changes for the WebLogic server
  • Part 3 will cover general configuration recommendations and a few known issues

The entire series can be found in the My Oracle Support note titled Oracle Enterprise Manager 12c Configuration Best Practices [1553342.1].

OMS Recommendations

Increase Java Heap Size

For larger enterprises, the OMS may need more memory.  One symptom of this condition is “sluggish” performance on the OMS.  If you determine that the OMS needs more memory, increase the Java heap size parameters.  However, it is very important to increase the heap incrementally and to be careful not to consume all of the memory on the server.  Also, Java does not always perform better with more memory.

Verify:  The parameters for the Java heap size are stored in the following file:

<MW_HOME>/user_projects/domains/GCDomain/bin/startEMServer.sh
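
Before changing anything, it can help to see which -Xmx values the script currently carries. The following is a minimal sketch, not an Oracle-supplied tool: the function name show_xmx is hypothetical, and the usage line assumes that MW_HOME is set as an environment variable pointing at your middleware home.

```shell
#!/bin/sh
# show_xmx FILE
# Print each -Xmx setting found in FILE, prefixed with its line number.
# Relies on grep -o, which GNU and BSD grep both provide.
show_xmx() {
    grep -n -o -e '-Xmx[0-9][0-9]*[kKmMgG]' "$1"
}

# Example call (MW_HOME is an assumption about your shell environment):
# show_xmx "$MW_HOME/user_projects/domains/GCDomain/bin/startEMServer.sh"
```

Each occurrence is printed on its own line, so a script with two USER_MEM_ARGS assignments should report two values.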

Recommendation:  If you have more than 250 agents, increase the -Xmx parameter, which specifies the maximum size of the Java heap, to 2 GB (the example below uses 2560m).  As the number of agents grows, it can be increased incrementally.  Note:  Do not increase it beyond 4 GB without contacting Oracle.  Change only the -Xmx value in the line containing USER_MEM_ARGS="-Xms256m -Xmx1740m …options…", as seen in the example below.  Do not change the -Xms or MaxPermSize values.  Note:  Change both occurrences, as seen below.  The second occurrence will be used if running in debug mode.

Before

if [ "${SERVER_NAME}" != "EMGC_ADMINSERVER" ] ; then
  USER_MEM_ARGS="-Xms256m -Xmx1740m -XX:MaxPermSize=768M -XX:-DoEscapeAnalysis -XX:+UseCodeCacheFlushing -XX:ReservedCodeCacheSize=100M -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+CMSClassUnloadingEnabled"
  if [ "${JAVA_VENDOR}" = "Sun" ] ; then
    if [ "${PRODUCTION_MODE}" = "" ] ; then
      USER_MEM_ARGS="-Xms256m -Xmx1740m -XX:MaxPermSize=768M -XX:-DoEscapeAnalysis -XX:+UseCodeCacheFlushing -XX:ReservedCodeCacheSize=100M -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+CMSClassUnloadingEnabled -XX:CompileThreshold=8000 -XX:PermSize=128m"
    fi
  fi
  export USER_MEM_ARGS
fi

After

if [ "${SERVER_NAME}" != "EMGC_ADMINSERVER" ] ; then
  USER_MEM_ARGS="-Xms256m -Xmx2560m -XX:MaxPermSize=768M -XX:-DoEscapeAnalysis -XX:+UseCodeCacheFlushing -XX:ReservedCodeCacheSize=100M -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+CMSClassUnloadingEnabled"
  if [ "${JAVA_VENDOR}" = "Sun" ] ; then
    if [ "${PRODUCTION_MODE}" = "" ] ; then
      USER_MEM_ARGS="-Xms256m -Xmx2560m -XX:MaxPermSize=768M -XX:-DoEscapeAnalysis -XX:+UseCodeCacheFlushing -XX:ReservedCodeCacheSize=100M -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+CMSClassUnloadingEnabled -XX:CompileThreshold=8000 -XX:PermSize=128m"
    fi
  fi
  export USER_MEM_ARGS
fi
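
If you prefer to script the edit rather than make it by hand, the change can be sketched roughly as follows. This is an illustration only: the function name set_xmx is hypothetical, and it assumes that -Xmx sits on the same line as USER_MEM_ARGS= (as in the listings above). It saves a backup copy first and touches nothing but the -Xmx value.

```shell
#!/bin/sh
# set_xmx FILE NEW_VALUE
# Replace the -Xmx value on every line that contains USER_MEM_ARGS=.
# A backup is written to FILE.bak; -Xms and MaxPermSize are not modified.
set_xmx() {
    file=$1
    new=$2
    # Accept only sizes such as 2048m, 2560m or 2g.
    if ! printf '%s\n' "$new" | grep -Eq '^[0-9]+[mMgG]$'; then
        echo "invalid heap size: $new" >&2
        return 1
    fi
    cp -p "$file" "$file.bak" || return 1
    # Read from the backup and overwrite the original in place, which keeps
    # the original file's permissions (the script must stay executable).
    sed "/USER_MEM_ARGS=/ s/-Xmx[0-9][0-9]*[kKmMgG]/-Xmx$new/" "$file.bak" > "$file"
}

# Example call (MW_HOME is an assumption about your shell environment):
# set_xmx "$MW_HOME/user_projects/domains/GCDomain/bin/startEMServer.sh" 2560m
```

The sketch does not enforce the 4 GB ceiling mentioned above, so check the value you pass in. JVM arguments are read at startup, so the new heap size applies the next time the OMS is started.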

Repository Recommendations

Repvfy execute optimize

This command can be executed to establish a baseline and set the environment to the “recommended” values based on the configuration of that environment.  The following command will check the existing settings and modify them if needed.

$ repvfy execute optimize

This command performs several actions, including the following:

1. Internal task system:

  • Verify there are at least 2 short running and 2 long running worker threads
  • Verify that the availability worker threads are disabled since these threads are now obsolete

2. Repository settings:

  • Set the retention time for the MGMT_SYSTEM_ERROR_LOG table to 7 days (unless this setting has already been changed)
  • Disable PL/SQL and metric tracing to reduce logging when not necessary
  • Recompile any invalid SYSMAN objects

3. Target system:

  • Tune the PING grace period to allow the OMS to wait a longer period of time after startup before checking the heartbeat of the agents

Increase Task Workers

Task worker threads pick up tasks from the dbms_scheduler jobs queue based on their type.  These jobs calculate metrics, roll up metrics for clusters and provide the self-monitoring metrics for EM.  Tasks are defined as short or long.  Many larger systems require more than one short task worker and more than one long task worker to complete the housekeeping jobs in a timely manner without creating a backlog.  The recommendation is to have at least 2 short-running worker threads and 2 long-running worker threads.

Verify:  To determine if you have a backlog:

 $ repvfy verify repository -test 1001

If you have a backlog, execute the command below to gather more details on the performance data for the task workers.

 $ repvfy dump task_health

Recommendation:  If the output from the dump task_health indicates a backlog, execute the following statement to set the recommended number of task workers for both short running tasks (type 0) and long running tasks (type 1).  This will increase the settings to the recommended settings for your environment (this command is not necessary if you already ran it on your environment from the first recommended step above).

 $ repvfy execute optimize

If, after applying the recommended settings, the site has grown to such a size that there is still a task worker backlog, use this routine to increase the number of workers above 2:

$ sqlplus /nolog
 SQL> connect SYSMAN;
 SQL> exec gc_diag2_ext.SetWorkerCounts(<number>);

The number can be 3 or 4 (the routine will not accept values larger than 4). If you need to go higher than 4, contact Oracle Support.
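
Because only 3 and 4 are accepted, a small guard can catch a mistyped value before it reaches SQL*Plus. The sketch below is illustrative: worker_count_sql is a hypothetical helper that only prints the statement shown above for a valid count, and it does not connect to the repository itself.

```shell
#!/bin/sh
# worker_count_sql N
# Print the SetWorkerCounts statement for N, accepting only 3 or 4.
worker_count_sql() {
    case $1 in
        3|4)
            printf 'exec gc_diag2_ext.SetWorkerCounts(%s);\n' "$1"
            ;;
        *)
            echo "worker count must be 3 or 4; contact Oracle Support for more" >&2
            return 1
            ;;
    esac
}

# Example call: print the statement, then paste it into a SYSMAN session.
# worker_count_sql 3
```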

Increase Ping Grace Period

Upon system startup, the OMS must ping each agent to get a current heartbeat and update the availability state for all the agents.  In systems with hundreds or thousands of agents, this can take a long time.  Increasing the grace period before the ping/heartbeat system starts contacting agents allows more time for the agents to begin uploading first.

Recommendation:  Execute the following statement. This command will evaluate the system and set the appropriate value for the Ping Grace Period to give the majority of the agents a chance to begin their upload upon system startup (this command is not necessary if you already ran it on your environment).

$ repvfy execute optimize 

If you still see a high number of pending agents for a prolonged period of time after an OMS restart, this value may need to be set higher.  Execute the following statement and contact Oracle Support, providing the output from the dump ping_health command.

$ repvfy dump ping_health
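
Since the dump output has to be handed to Oracle Support, it is convenient to capture it in a file. The following is a minimal sketch under stated assumptions: save_dump is a hypothetical wrapper, repvfy is assumed to be on your PATH, and the file naming scheme is only a suggestion. It works the same way for the task_health dump mentioned earlier.

```shell
#!/bin/sh
# save_dump NAME [OUTDIR]
# Run "repvfy dump NAME" and store its output in a timestamped log file.
# Prints the path of the file that was written.
save_dump() {
    name=$1
    outdir=${2:-.}
    mkdir -p "$outdir" || return 1
    out="$outdir/${name}_$(date +%Y%m%d_%H%M%S).log"
    if ! repvfy dump "$name" > "$out" 2>&1; then
        echo "repvfy dump $name failed, see $out" >&2
        return 1
    fi
    printf '%s\n' "$out"
}

# Example calls:
# save_dump ping_health /tmp/em_diag
# save_dump task_health /tmp/em_diag
```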
