Thursday Feb 27, 2014

Fast Recovery Area for Archive Destination

If you are using the Fast Recovery Area (FRA) for the archive destination and the destination is set to USE_DB_RECOVERY_FILE_DEST, you may notice that the Archive Area % Used metric no longer triggers. Instead, the Recovery Area % Used metric triggers when the area reaches the Warning threshold of 85% full and the Critical threshold of 97% full. Because this metric is controlled by server-side database thresholds, it cannot be modified by Enterprise Manager (see MOS Note 428473.1 for more information). Thresholds of 85/97 are not sufficient for some larger, busier databases: they may not give you enough time to kick off a backup and clear enough logs before the archiver hangs. If you need different thresholds, you can create a Metric Extension (ME) and set thresholds to your desired values. This blog post walks through an example of creating an ME to monitor archive on FRA destinations. For more information on MEs and how they can be used, refer to the Oracle Enterprise Manager Cloud Control Administrator's Guide.
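As a rough illustration of the kind of check a custom-threshold metric performs, the Python sketch below computes a percent-used figure and compares it with thresholds lower than 85/97. The function names, the 70/85 default thresholds, and the choice to subtract reclaimable space are assumptions made for this example, not the definition of the built-in metric. In practice the three inputs could come from a query against a view such as V$RECOVERY_FILE_DEST.

```python
def fra_percent_used(space_limit: int, space_used: int, space_reclaimable: int = 0) -> float:
    """Return the percentage of the recovery area in use, net of reclaimable space.

    All three arguments are byte counts. Subtracting reclaimable space is an
    assumption of this sketch, not necessarily what the built-in metric does.
    """
    if space_limit <= 0:
        raise ValueError("space_limit must be positive")
    return round(100.0 * (space_used - space_reclaimable) / space_limit, 2)


def classify(percent_used: float, warning: float = 70.0, critical: float = 85.0) -> str:
    """Map a percent-used value to a severity using custom (hypothetical) thresholds."""
    if percent_used >= critical:
        return "CRITICAL"
    if percent_used >= warning:
        return "WARNING"
    return "CLEAR"
```

For example, `classify(fra_percent_used(100, 80, 5))` evaluates 75% used against the 70/85 thresholds and reports a warning well before the 85% default would.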


Tuesday Feb 18, 2014

Monitoring Archive Area % Used on Cluster Databases

One of the most critical events to monitor on an Oracle Database is your archive area. If the archive area fills up, your database will halt until it can continue to archive the redo logs. If your archive destination is set to a file system, then the Archive Area % Used metric is often the best way to go. This metric allows you to monitor a particular file system for the percentage of space that has been used. However, there are a couple of things to be aware of for this critical metric.

Cluster Database vs. Database Instance

You will notice in EM 12c that the Archive Area metric exists on both the Cluster Database and the Database Instance targets. The majority of Cluster Databases (RAC) are built according to database best practices, which indicate that the archive destination should be shared read/write between all instances. The purpose is that, in case of recovery, any instance can perform the recovery and has all the necessary archive logs to do so. Monitoring this destination for a Cluster Database at the instance level causes duplicate alerts and notifications, as both instances hit the Warning/Critical threshold for Archive Area % Used within minutes of each other. To eliminate duplicate notifications, the Archive Area % Used metric for Cluster Databases was introduced. This allows the archive destination to be monitored at the database level, much like tablespaces are monitored in a RAC database.

In the Database Instance (RAC Instance) target, you will notice the Archive Area % Used metric collection schedule is set to Disabled.

If you have a RAC database and you do not share archive destinations between instances, you will want to Disable the Cluster Database metric, and enable the Database Instance metric to ensure that each destination is monitored individually.
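The decision described above can be sketched as a small Python helper. This is only an illustration of the rule of thumb: the function name, the instance names, and the paths in the usage note are hypothetical, and nothing here calls Enterprise Manager.

```python
def monitoring_level(instance_destinations: dict[str, str]) -> str:
    """Suggest where to enable the Archive Area % Used metric.

    instance_destinations maps each RAC instance name to its archive
    destination. Returns "cluster" when every instance shares one
    destination (one alert per database), otherwise "instance" (each
    destination is monitored individually).
    """
    if not instance_destinations:
        raise ValueError("at least one instance is required")
    distinct = set(instance_destinations.values())
    return "cluster" if len(distinct) == 1 else "instance"
```

With two instances both archiving to a shared path such as `/u01/arch`, the helper suggests the Cluster Database metric; if each instance archives to its own path, it suggests the Database Instance metric.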

Tuesday Mar 12, 2013

Monitoring virtualization targets in Oracle Enterprise Manager 12c

Contributed by Sampanna Salunke, Principal Member of Technical Staff, Enterprise Manager

For monitoring any target instance in Oracle Enterprise Manager 12c, you would typically go to the target home page and click on the target menu to navigate to:

  • Monitoring->All Metrics page to view all the collected metrics
  • Monitoring->Metric and Collection Settings to set thresholds and/or modify collection frequencies of metrics
The thresholds and collection frequencies modified affect only the target instance that you are making changes to.

However, some virtualization targets need to be monitored and managed differently because of changes made to the way data is collected and thresholds/collection frequencies are applied. Such target types include:

  • Oracle VM Server
  • Oracle VM Guest

To minimize the number of connections made to Oracle VM Manager to collect data for virtualization targets, the performance metrics for Oracle VM Server and Oracle VM Guest targets are “bulk-collected” at the Oracle VM Server Pool level. This means that thresholds and collection frequencies of Oracle VM Server and Oracle VM Guest metrics need to be set on the Oracle VM Server Pool that they belong to. For example, if a user wants to set thresholds on the “Oracle VM Server Load: CPU Utilization” metric for an Oracle VM Server target, the sequence of steps to be performed is:

1. Navigate to the homepage of the Oracle VM Server Pool target that the Oracle VM Server target belongs to

2. Click on the target menu->Monitoring->Metric and Collection Settings

3. Expand the view option to “All Metrics” if required, and find the “Oracle VM Server Load” metric and change the thresholds or collection frequency of "CPU Utilization" as required.

Note that any changes made at the Oracle VM Server Pool for a “bulk collected” metric affect all the targets for which the metric is applicable in the server pool. In this example, since the user modified the “Oracle VM Server Load: CPU Utilization” threshold, the change is applied to all the Oracle VM Server targets in the server pool sg-pool1.

To summarize: the difference between “traditional” monitoring and “bulk-collected” monitoring is that the thresholds and collection frequencies of metrics are modified at the parent target, and the changes made are applied to all the child targets for which the metrics are applicable. However, data and alerts uploaded continue to appear as normal against the child target.
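The parent/child relationship can be modeled in a few lines of Python. This is a conceptual sketch only: the `ServerPool` class, its methods, and the server names are hypothetical and do not correspond to any Enterprise Manager API. It shows thresholds being stored once on the pool while severities are still reported per child.

```python
from dataclasses import dataclass, field


@dataclass
class ServerPool:
    """Hypothetical model of a parent target that owns thresholds for its children."""
    name: str
    servers: list[str] = field(default_factory=list)
    # metric name -> (warning, critical), set once at the pool level
    thresholds: dict[str, tuple[float, float]] = field(default_factory=dict)

    def set_threshold(self, metric: str, warning: float, critical: float) -> None:
        self.thresholds[metric] = (warning, critical)

    def evaluate(self, metric: str, values: dict[str, float]) -> dict[str, str]:
        """Apply the pool-level thresholds to every child; results are keyed by child."""
        warning, critical = self.thresholds[metric]
        result = {}
        for server in self.servers:
            value = values[server]
            if value >= critical:
                result[server] = "CRITICAL"
            elif value >= warning:
                result[server] = "WARNING"
            else:
                result[server] = "CLEAR"
        return result
```

A single `set_threshold` call on the pool changes the outcome for every server in it, which mirrors why a change made at the Oracle VM Server Pool affects all of its Oracle VM Server targets.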


Thursday Nov 03, 2011

Alert Monitoring and Problem Notification in Oracle Enterprise Manager Ops Center

Oracle Enterprise Manager Ops Center provides full lifecycle management of your Oracle hardware and operating systems, including your virtual environments. A significant portion of any given asset's lifecycle is spent in daily operations and when things are running smoothly, there isn't much for an administrator to do. When things go awry, it's critical to know what happened and why as quickly as possible. Oracle Enterprise Manager Ops Center provides alert monitoring and problem notification and management capabilities to enable you to do just that. I'll walk you through a quick and simple example of how you can use these features and hopefully it will spark ideas of how you can implement even more interesting solutions using the same basic steps.

The first step is to tune your monitoring rules. Each type of asset has a default set of monitoring rules that are applied when the asset is first managed. Rules can be managed on individual assets via their Monitoring tab, or by applying Monitoring Profiles to individual assets or groups of assets. Monitoring rules can be configured to raise alerts when, for example, a monitored attribute exceeds a threshold value for a selectable period of time. For more details on how to configure your monitoring rules, please see section 9 of the Advanced User's Guide, available by clicking on the Help link from within the browser user interface. If you update monitoring rules in a Monitoring Profile, be sure to apply that profile to your desired assets so that the changes take effect. For this example I have set a very short window for the CPU Usage attribute to generate an alert after only 1 minute of high CPU utilization, as shown in the screenshot below.
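The idea of alerting only after a threshold has been exceeded for a period of time can be sketched in Python as follows. This is an illustration of the general technique, not how Ops Center evaluates its rules; the function name and the sample format are assumptions.

```python
def sustained_breach(samples, threshold, window_seconds):
    """Return the timestamp at which an alert would be raised, or None.

    samples is a time-ordered list of (timestamp_seconds, value) pairs.
    An alert is raised once the value has stayed above the threshold
    continuously for at least window_seconds; any sample at or below
    the threshold resets the timer.
    """
    breach_start = None
    for timestamp, value in samples:
        if value > threshold:
            if breach_start is None:
                breach_start = timestamp  # first sample of a new breach
            if timestamp - breach_start >= window_seconds:
                return timestamp
        else:
            breach_start = None  # breach interrupted, start over
    return None
```

With a 60-second window, CPU samples that stay above the threshold from second 30 through second 90 raise an alert at second 90, while a brief spike that drops back within the window raises none.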

When an alert is generated, a new problem is created if none is already open for the issue. Otherwise the alert is added to the existing problem. Problems aggregate alerts and annotations together and provide the opportunity to assign and track resolution. Any users whose Notification Profile is defined to receive notification of the problem will get an email or page with the pertinent details. The image below shows how you might subscribe the root user to email notification of all problems of WARNING or higher severity.
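The way alerts roll up into problems can be modeled with a short Python sketch. The `Problem` and `ProblemTracker` classes are hypothetical and only mirror the behavior described above: an alert opens a new problem unless an open one already exists for the same asset and issue.

```python
from dataclasses import dataclass, field


@dataclass
class Problem:
    """Hypothetical problem record that aggregates alerts for one issue on one asset."""
    asset: str
    issue: str
    alerts: list = field(default_factory=list)
    is_open: bool = True


class ProblemTracker:
    def __init__(self):
        self.problems = []

    def record_alert(self, asset: str, issue: str, message: str) -> Problem:
        """Attach the alert to the matching open problem, or open a new one."""
        for problem in self.problems:
            if problem.is_open and problem.asset == asset and problem.issue == issue:
                problem.alerts.append(message)
                return problem
        problem = Problem(asset, issue, [message])
        self.problems.append(problem)
        return problem
```

Recording two CPU alerts for the same host yields one problem with two alerts; once that problem is closed, the next alert for the same issue opens a new problem.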

Problems can be managed holistically from the Message Center in the top of the left-hand navigation panel or they can be viewed for individual assets by selecting the Problems tab. When looking at an open problem, icons along the top allow you to see existing alerts and annotations, to add an annotation, to assign the problem to a user or to take action on the problem, as shown in the screenshot below.

Annotations can be simple textual comments or suggested actions which can include the execution of an existing Operational Plan. For more detail on how to use Operational Plans, see section 11 of the Advanced User's Guide. For this example, I created a simple Operational Plan to execute a prstat. Be sure to select the appropriate Subtype, in this case a Global Zone.

When adding an annotation to a problem, you can optionally select the checkbox at the bottom of the window in order to save that annotation to the Problems Knowledge Base and associate it to future problems of the same type and severity as shown below.

When an annotation has been saved to the Problems Knowledge Base, it can be edited to include additional severities and can also be changed to execute automatically when a future problem is initially created, as shown below. For more detail on the Problems Knowledge Base, please refer to section 10 of the Advanced User's Guide.

When a new problem is detected, the newly added Automated Action will execute the associated Operational Plan and attach the output as an annotation to the problem. To demonstrate this in action, I executed several 'dd' commands on the host to force excessive CPU usage. In this case, the prstat output shows the high CPU usage of the processes that were running at the time that the alert was generated, even though they lasted only a few minutes.

This is clearly a simple example and would not suffice to capture very short-lived processes, but it illustrates the possibilities available. The automatic action could have been a more in-depth data-gathering script using DTrace, or could even have made system changes, depending on the real scenario it was built to address. I hope this quick walk-through has provoked thoughts of how you might implement Alert Monitoring and Problem Notification and Management in your enterprise using Oracle Enterprise Manager Ops Center.



