The objective of this three-part blog series is to summarize the most commonly implemented configuration changes to improve performance and operation of a large Enterprise Manager 12c environment. A “large” environment is categorized by the number of agents, targets and users. See the Oracle Enterprise Manager Cloud Control Advanced Installation and Configuration Guide chapter on Sizing for more details on sizing your environment properly.
The entire series can be found in the My Oracle Support note titled Oracle Enterprise Manager 12c Configuration Best Practices [1553342.1].
Larger enterprises may need to increase the amount of memory available to the OMS. One symptom of insufficient memory is sluggish OMS performance. If you determine that the OMS needs more memory, increase the Java heap size parameters. Increase the heap incrementally, and take care not to consume all of the memory on the server. Note also that Java does not always perform better with more memory.
Verify: The parameters for the Java heap size are stored in the following file:
<MW_HOME>/user_projects/domains/GCDomain/bin/startEMServer.sh
Recommendation: If you have more than 250 agents, increase the -Xmx parameter, which specifies the maximum size of the Java heap, to 2 GB (the example below uses 2560m). As the number of agents grows, the value can be increased incrementally. Note: Do not increase it beyond 4 GB without contacting Oracle. Change only the -Xmx value in the line containing USER_MEM_ARGS="-Xms256m -Xmx1740m …options…" as seen in the example below. Do not change the Xms or MaxPermSize values. Note: Change both lines as seen below. The second occurrence is used when running in debug mode.
Before
if [ "${SERVER_NAME}" != "EMGC_ADMINSERVER" ] ; then
  USER_MEM_ARGS="-Xms256m -Xmx1740m -XX:MaxPermSize=768M -XX:-DoEscapeAnalysis -XX:+UseCodeCacheFlushing -XX:ReservedCodeCacheSize=100M -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+CMSClassUnloadingEnabled"
  if [ "${JAVA_VENDOR}" = "Sun" ] ; then
    if [ "${PRODUCTION_MODE}" = "" ] ; then
      USER_MEM_ARGS="-Xms256m -Xmx1740m -XX:MaxPermSize=768M -XX:-DoEscapeAnalysis -XX:+UseCodeCacheFlushing -XX:ReservedCodeCacheSize=100M -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+CMSClassUnloadingEnabled -XX:CompileThreshold=8000 -XX:PermSize=128m"
    fi
  fi
  export USER_MEM_ARGS
fi
After
if [ "${SERVER_NAME}" != "EMGC_ADMINSERVER" ] ; then
  USER_MEM_ARGS="-Xms256m -Xmx2560m -XX:MaxPermSize=768M -XX:-DoEscapeAnalysis -XX:+UseCodeCacheFlushing -XX:ReservedCodeCacheSize=100M -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+CMSClassUnloadingEnabled"
  if [ "${JAVA_VENDOR}" = "Sun" ] ; then
    if [ "${PRODUCTION_MODE}" = "" ] ; then
      USER_MEM_ARGS="-Xms256m -Xmx2560m -XX:MaxPermSize=768M -XX:-DoEscapeAnalysis -XX:+UseCodeCacheFlushing -XX:ReservedCodeCacheSize=100M -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+CMSClassUnloadingEnabled -XX:CompileThreshold=8000 -XX:PermSize=128m"
    fi
  fi
  export USER_MEM_ARGS
fi
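The edit above can also be scripted. The following is a minimal sketch, assuming a POSIX shell with sed available; the helper name set_max_heap is hypothetical and is not part of Enterprise Manager. It rewrites every -Xmx value in the file (so both occurrences change together), leaves -Xms and MaxPermSize untouched, and keeps a .bak copy of the original file.

```shell
#!/bin/sh
# Hypothetical helper: set the -Xmx value in a startEMServer.sh-style file.
# Usage: set_max_heap <file> <new_value>   (for example: 2560m)
set_max_heap() {
    file=$1
    new_value=$2
    # Reject empty values and values containing anything other than
    # digits and the letters m or g.
    case $new_value in
        ''|*[!0-9mMgG]*)
            echo "invalid heap size: $new_value" >&2
            return 1
            ;;
    esac
    # Keep a copy of the original so the change can be rolled back.
    cp "$file" "$file.bak" || return 1
    # Replace only the value that follows -Xmx, on every line where it occurs.
    sed "s/-Xmx[0-9][0-9]*[mMgG]/-Xmx${new_value}/g" "$file.bak" > "$file"
}
```

Assuming MW_HOME is set in your shell, a call might look like: set_max_heap "$MW_HOME/user_projects/domains/GCDomain/bin/startEMServer.sh" 2560m. The helper does not enforce the 4 GB ceiling mentioned above, so check the value before running it.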
The repvfy execute optimize command can be run to establish a baseline and set the environment to the “recommended” values based on its configuration. The command checks the existing settings and modifies them if needed.
$ repvfy execute optimize
This command does several things, including changes in the following areas:
1. Internal task system:
2. Repository settings:
3. Target system:
Task worker threads pick up tasks from the dbms_scheduler jobs queue based on their type. These jobs calculate metrics, roll up metrics for clusters and provide the self-monitoring metrics for EM. Tasks are defined as short or long. Many larger systems need more than one short task worker and more than one long task worker to complete the housekeeping jobs in a timely manner without creating a backlog. The recommendation is to have at least 2 short-running worker threads and 2 long-running worker threads.
Verify: To determine if you have a backlog:
$ repvfy verify repository -test 1001
If you have a backlog, execute the command below to gather more details on the performance data for the task workers.
$ repvfy dump task_health
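If you want to keep the output of both checks for later comparison, the two commands above can be wrapped in a small function. This is only a sketch: it assumes repvfy is on the PATH, and the function name, output directory and log file names are hypothetical.

```shell
#!/bin/sh
# Sketch: save the backlog check and the task health dump to log files.
# Usage: collect_task_diagnostics [output_dir]
collect_task_diagnostics() {
    out_dir=${1:-./repvfy_logs}
    mkdir -p "$out_dir" || return 1
    # Backlog check (test 1001), as given in the text above.
    repvfy verify repository -test 1001 > "$out_dir/verify_1001.log" 2>&1
    # Detailed performance data for the task workers.
    repvfy dump task_health > "$out_dir/task_health.log" 2>&1
    echo "Saved output to $out_dir"
}
```

The function makes no attempt to interpret the output; reviewing the logs for a backlog is still a manual step.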
Recommendation: If the output from dump task_health indicates a backlog, execute the following command to set the number of task workers for both short-running tasks (type 0) and long-running tasks (type 1) to the recommended values for your environment. This command is not necessary if you already ran it as part of the first recommended step above.
$ repvfy execute optimize
If, after applying the recommended settings, the site has grown to such a size that there is still a task worker backlog, use this routine to increase the number of workers above 2:
$ sqlplus /nolog
SQL> connect SYSMAN;
SQL> exec gc_diag2_ext.SetWorkerCounts(<number>);
The number can be 3 or 4 (the routine will not accept values larger than 4). If you need to go higher than 4, contact Oracle Support.
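To avoid passing a value the routine will reject, the range check can be done before connecting. The sketch below only prints the statement to run at the SQL> prompt; the function name worker_count_sql is hypothetical, and the accepted range of 3 to 4 is taken from the text above.

```shell
#!/bin/sh
# Sketch: print the SetWorkerCounts call for a valid worker count,
# and refuse any value other than 3 or 4.
worker_count_sql() {
    case $1 in
        3|4)
            printf 'exec gc_diag2_ext.SetWorkerCounts(%s);\n' "$1"
            ;;
        *)
            echo "worker count must be 3 or 4; contact Oracle Support for higher values" >&2
            return 1
            ;;
    esac
}
```

For example, worker_count_sql 3 prints the statement with 3 filled in, which can then be pasted into the SQL*Plus session after connecting as SYSMAN.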
Upon system startup, the OMS must ping each agent to get a current heartbeat and update the availability state for all the agents. In systems with hundreds or thousands of agents, this can take a long time. Increasing the grace period before the ping/heartbeat system kicks in and contacts the agents allows more time for the agents to start uploading first.
Recommendation:
Execute the following statement. This command will evaluate the system and set the appropriate value for the Ping Grace Period to give the majority of the agents a chance to begin their upload upon system startup (this command is not necessary if you already ran it on your environment).
$ repvfy execute optimize
If, after an OMS restart, you still see a high number of pending agents for a prolonged period of time, this value may need to be set higher. Execute the following command and contact Oracle Support, providing the output from the dump ping_health command.
$ repvfy dump ping_health