By Scott McNeil on Jan 15, 2013
The following is a guest blog contributed by Anand Ranganathan, Product Manager for Oracle Products and Solutions at NetApp.
As a DBA managing databases running on storage systems, some of your major challenges are how do you:
If you are planning to run, or are currently running, your Oracle databases on NetApp storage systems, then the NetApp Storage System Plug-in for Oracle Enterprise Manager 12c resolves the above challenges by:
NetApp built the plug-in using the Enterprise Manager 12c Extensibility Kit. The kit not only helped us develop screens with a rich look and feel, but also let us leverage Cloud Control’s powerful monitoring and event management features for monitoring NetApp storage systems.
The NetApp Storage System Plug-in for Oracle Enterprise Manager 12c is free and is listed at the Enterprise Manager Extensibility Exchange. The plug-in has achieved Oracle Validated Integration, which gives customers confidence that it:
Here is a screenshot of the dashboard of a NetApp storage system monitored with Enterprise Manager 12c via the NetApp plug-in.
Customers and partners can visit the NetApp Communities site to watch a demo and download the plug-in. If you are a partner, visit the Oracle PartnerNetwork Validated Integration Knowledge Zone to learn more about the validation program.
Oracle recently announced an incremental release of Enterprise Manager 12c called Enterprise Manager 12c Release 2 (EM12c R2), which includes several exciting new features (Press announcement). Right before the official release, we upgraded an internal production site from EM 12c R1 to EM 12c R2 and had an extremely pleasant experience. Let me share a few key takeaways as well as a few tips from this upgrade exercise.
I - Why Should You Upgrade To Enterprise Manager 12c Release 2
While an upgrade is usually recommended primarily to take advantage of the latest features (which is valid for this upgrade as well), I found several other compelling reasons purely from a deployment perspective.
Note: BP1 patches are not mandatory to upgrade to the EM12c R2 release.
EM 12c R2 provides an excellent opportunity to standardize your Cloud Control environment (OMS, repository and agents) and plug-ins on the latest versions in a single shot.
Plug-in Upgrade or Migrate
II - Few Tips To Remember
In my last post (blog link) I shared a few tips and tricks from my experience applying the Bundle Patch. Recently I upgraded the same site to EM 12c R2 and found a few points that you must take note of while planning this upgrade. The tips below are also applicable to EM 12c R1 environments that do not have Bundle Patch 1 patches applied.
We are very excited about this latest release and look forward to hearing any feedback from your upgrade experience!
A recording of this community call is now available here:
With Enterprise Manager Ops Center 12c, you can provision, patch, monitor and manage Oracle Solaris 11 instances. To do this, Ops Center creates and maintains a Solaris 11 Image Packaging System (IPS) repository on the Enterprise Controller. During the Enterprise Controller configuration, you can load repository content directly from Oracle's Support Web site and subsequently synchronize the repository as new content becomes available.
Of course, you can also use Solaris 11 ISO images to create and update your Ops Center repository. There are a few excellent reasons for doing this:
This demo will show you how to use Solaris 11 ISO images to set up and update your Ops Center repository.
This tip assumes that you've already installed the Enterprise Controller on a Solaris 11 OS instance and that you're ready for post-install configuration.
In addition, there are specific Ops Center and OS version requirements depending on which version of Solaris 11 you plan to install. You can get full details about the requirements in the Release Notes for Ops Center 12c update 2.
Additional information is available in the Ops Center update 2 Readme document.
The Oracle Web site provides a number of download links for official Solaris 11 images. Among those links is a two-part downloadable repository image, which provides repository content for Solaris 11 SPARC and X86 architectures. In this case, I used the Solaris 11 11/11 image.
First, navigate to the Oracle Web site and accept the OTN License agreement:
Next, download both parts of the Solaris 11 repository image. I used the Solaris 11 11/11 image, and have provided the URLs here:
Finally, use the cat command to generate an ISO image you can use to create your repository:
# cat sol-11-1111-repo-full.iso-a sol-11-1111-repo-full.iso-b > sol-11-1111-repo-full.iso
The process is very similar if you plan to set up a Solaris 11.1 release in Ops Center. In that case, navigate to the Solaris 11 download page, accept the license agreement and download both parts of the Solaris 11.1 repository image. Use the cat command to create a single ISO image for Solaris 11.1.
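If you would rather script the join step, for example on a download host without a shell, here is a minimal Python sketch that does the same job as the cat command. The function name join_iso_parts is my own for illustration; the file names are the ones used above.

```python
import shutil
from pathlib import Path


def join_iso_parts(parts, target):
    """Concatenate the downloaded parts, in the order given, into one ISO file.

    Returns the size of the resulting file in bytes.
    """
    target = Path(target)
    with target.open("wb") as out:
        for part in parts:
            with Path(part).open("rb") as src:
                # Copy in 1 MiB chunks so large parts never sit fully in memory.
                shutil.copyfileobj(src, out, length=1024 * 1024)
    return target.stat().st_size
```

For the Solaris 11 11/11 image, the call would be join_iso_parts(["sol-11-1111-repo-full.iso-a", "sol-11-1111-repo-full.iso-b"], "sol-11-1111-repo-full.iso"). The order of the parts matters, exactly as it does with cat.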
Once you have created the Solaris 11 ISO file, use the mount command to attach it to your local filesystem. After the image has been mounted, you can browse the repository from the ./repo subdirectory, and use the pkgrepo command to verify that Solaris 11 recognizes the content:
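As a rough sketch of the mount-and-verify step, the snippet below wraps the two commands in Python. It assumes the standard Solaris 11 invocations (mount -F hsfs for an ISO file and pkgrepo info -s for a repository summary), an absolute path to the ISO, and /mnt as the mount point; the function names are hypothetical, and you should adjust paths to your own environment.

```python
import subprocess


def repo_check_commands(iso_path, mount_point="/mnt"):
    """Return the mount and verification commands as argument lists.

    iso_path is assumed to be an absolute path to the joined ISO file.
    """
    return [
        # Attach the ISO image to the local file system.
        ["mount", "-F", "hsfs", iso_path, mount_point],
        # Ask pkgrepo to summarize the repository under ./repo.
        ["pkgrepo", "info", "-s", f"{mount_point}/repo"],
    ]


def run_repo_check(iso_path, mount_point="/mnt"):
    """Run each command in order, stopping at the first failure.

    Needs root privileges on a Solaris 11 system.
    """
    for cmd in repo_check_commands(iso_path, mount_point):
        subprocess.run(cmd, check=True)
```

Keeping the command list separate from the runner makes it easy to print and review the commands before executing anything as root.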
When you have confirmed the repository is available, you can use the image to create the Enterprise Controller repository. The operation will be slightly different depending on whether you configure Ops Center for Connected or Disconnected Mode operation.
For connected mode operation, specify the mounted ./repo directory in step 4.1 of the configuration wizard, replacing the default Web-based URL. Since you're synchronizing from an OS repository image, you don't need to specify a key or certificate for the operation.
For disconnected mode configuration, specify the Solaris 11 directory along with the path to the disconnected mode bundle downloaded by running the Ops Center harvester script:
Ops Center will run a job to import package content from the mounted ISO image. A synchronization job can take several hours to run – in my case, the job ran for 3 hours, 22 minutes on a SunFire X4200 M2 server.
During the job, Ops Center performs three important tasks:
- Synchronizes all content from the image and refreshes the repository
- Updates the IPS publisher information
- Creates OS Provisioning profiles and policies based on the content
When the job is complete, you can unmount the ISO image from your Enterprise Controller. At that time, you can view the repository contents in your Ops Center Solaris 11 library. For the Solaris 11 11/11 release, you should see 8,668 packages and patches in the contents.
You should also see default deployment plans for Solaris 11 provisioning. As part of the repository import, Ops Center generates plans and profiles for desktop, small and large servers for the SPARC and X86 architectures.
It's possible to use the same approach to upgrade your Ops Center repository to a Solaris 11 Support Repository Update, or SRU. Each SRU provides packages and updates to Solaris 11 - for example, SRU 8.5 provided the packages for Oracle VM Server for SPARC 2.2.
SRUs are available for download as ISO images from My Oracle Support, under document ID 1372094.1. The document provides download links for all SRUs which have been released by Oracle for Solaris 11. SRUs are cumulative, so later versions include the packages from earlier SRUs.
After downloading an ISO image for an SRU, you can mount it to your local filesystem using a mount command similar to the one shown for Solaris 11 11/11.
When the ISO image is mounted to the file system, you can perform the Add Content action from the Solaris 11 Library to synchronize packages and patches from the mounted image. I used the same mount point, so the repository URL was file:///mnt/repo once again:
After the synchronization of an SRU is complete, you can verify its content in the Solaris 11 library using the search function. The version pattern is 0.175.0.#, where the # is the same value as the SRU.
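To make the version pattern concrete, here is a small Python sketch that builds the search string for a given SRU and filters a list of version strings with it. The function names and the sample version strings in the usage note are made up for illustration; only the 0.175.0.# pattern comes from the text above.

```python
import re


def sru_version_pattern(sru):
    """Build the search string for an SRU number, following 0.175.0.#."""
    if sru < 0:
        raise ValueError("SRU number must not be negative")
    return f"0.175.0.{sru}"


def versions_in_sru(versions, sru):
    """Keep only the version strings that carry the given SRU's pattern."""
    pattern = re.escape(sru_version_pattern(sru))
    # The lookarounds stop SRU 1 from also matching SRU 10, 11, and so on.
    matcher = re.compile(rf"(?<![0-9.]){pattern}(?![0-9])")
    return [v for v in versions if matcher.search(v)]
```

For example, versions_in_sru(["0.5.11-0.175.0.1.0.4.0", "0.5.11-0.175.0.10.0.2.0"], 1) keeps only the first string, because the second belongs to SRU 10.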
In this example, I upgraded to SRU 1. The update job ran in just under 8 minutes, and a quick search shows that 22 software components were added to the repository:
It's also possible to search for "Support Repository Update" to confirm the SRU was successfully added to the repository. Details on any of the update content are available by clicking the "View Details" button under the Packages/Patches entry.
Enterprise Manager Ops Center 12c provides significant monitoring capabilities, combined with very flexible incident management. These capabilities even extend to monitoring the file systems associated with Solaris or Linux assets. Depending on your needs you can monitor and manage incidents, or you can fine tune alert monitoring rules to specific file systems.
This article will show you how to use Ops Center 12c to
The Libraries tab provides basic, device-level information about the storage associated with an OS instance. This tab shows you the local file system associated with the instance and any shared storage libraries mounted by Ops Center.
More detailed information about file system storage is available under the Analytics tab under the sub-tab named Charts. Here, you can select and display the individual mount points of an OS, and export the utilization data if desired:
In this example, the OS instance has a basic root file partition and several NFS directories. Each file system mount point can be independently chosen for display in the Ops Center chart.
Every asset managed by Ops Center has a "monitoring policy", which determines what represents a reportable issue with the asset. The policy is made up of a set of monitoring rules, where each rule describes
When the conditions are met, Ops Center sends a notification and creates an incident.
By default, OS instances have three monitoring rules associated with file systems:
You can view these rules in the Monitoring tab for an OS:
Of course, the drawback of the default monitoring rules is that they apply to every file system associated with an OS instance. As a result, any issue with NAS accessibility or disk utilization will trigger an incident. This can cause incidents for file systems to be reported multiple times if the same shared storage is used by many assets, as shown in this screen shot:
Depending on the level of control you'd like, there are a number of ways to fine tune incident reporting.
Note that any changes to an asset's monitoring policy will detach it from the default, creating a new monitoring policy for the asset. If you'd like, you can extract a monitoring policy from an asset, which allows you to save it and apply the customized monitoring profile to other OS assets.
In some cases, you may want to modify the basic conditions for incident reporting in your file system. The changes you make to a default monitoring rule will apply to all of the file systems associated with your operating system. Selecting the File Systems Used Space Percentage entry and clicking the "Edit Alert Monitoring Rule Parameters" button opens a pop-up dialog which allows you to modify the rule.
The first screen lets you decide when you will check for file system usage, and how long you will wait before opening an incident in Ops Center. By default, Ops Center monitors continuously and reports disk utilization issues which exist for more than 15 minutes.
The second screen lets you define actual threshold values. By default, Ops Center opens a Warning level incident if utilization rises above 80%, and a Critical level incident for utilization above 95%.
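The way the two screens work together can be sketched in a few lines of Python. This is only an illustration of the default values described above (15 minutes, 80%, 95%); the class and method names are my own, and the real evaluation logic inside Ops Center may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class UsedSpaceRule:
    warning_pct: float = 80.0    # Warning incident above this utilization
    critical_pct: float = 95.0   # Critical incident above this utilization
    min_duration_min: int = 15   # the issue must persist longer than this

    def evaluate(self, used_pct: float, minutes_sustained: float) -> Optional[str]:
        """Return the incident severity, or None if no incident should open."""
        if minutes_sustained <= self.min_duration_min:
            return None  # condition has not existed long enough yet
        if used_pct > self.critical_pct:
            return "Critical"
        if used_pct > self.warning_pct:
            return "Warning"
        return None
```

With the defaults, UsedSpaceRule().evaluate(85, 20) yields a Warning, while the same utilization sustained for only 10 minutes yields nothing. Raising min_duration_min is a simple way to ignore short-lived spikes.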
If you'd rather not report file system incidents, you can disable the monitoring rules altogether. In this case, you can select the monitoring rules and click the "Disable Alert Monitoring Rule(s)" button to open the pop-up confirmation dialog.
Like the first solution, this option affects all file system monitoring. It allows you to completely disable incident reporting for NAS library status or file system space consumption.
If you'd like to have the greatest flexibility when monitoring file systems, you can create entirely new rules. Clicking the "Add Alert Monitoring Rule" (the icon with the green plus sign) opens a wizard which allows you to define a new rule.
This rule will be based on a threshold, and will be used to monitor operating system assets. We'd like to add a rule to track disk utilization for a specific file system - the /nfs-guest directory. To do this, we specify the following attribute
The value of name in the attribute allows us to define a specific NFS shared directory or file system. In the case of this OS, we could have chosen any of the values shown in the File Systems Utilization chart at the beginning of this article.
usedSpacePercentage lets us define a threshold based on the percentage of total disk space used. There are a number of other values that we could use for threshold-based monitoring of FileSystemUsages, including
The final sections of the screen allow us to determine when to monitor for disk usage, and how long to wait after utilization reaches a threshold before creating an incident. The next screen lets us define the threshold values and severity levels for the monitoring rule:
If historical data is available, Ops Center will display it on the screen. Clicking the Apply button will create the new monitoring rule and activate it in your monitoring policy.
If you combine this with one of the previous solutions, you can precisely define which file systems will generate incidents and notifications. For example, this monitoring policy has the default "File System Used Space Percentage" rule disabled, but the new rule reports ONLY on utilization for the /nfs-guest directory.
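The effect of combining a disabled default rule with a file-system-specific rule can be modeled as follows. This is a simplified sketch, not Ops Center's implementation: the class, the custom rule's name and its 80/95 thresholds are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileSystemRule:
    name: str
    warning_pct: float
    critical_pct: float
    file_system: Optional[str] = None  # None means "every file system"
    enabled: bool = True

    def severity(self, mount_point: str, used_pct: float) -> Optional[str]:
        """Return the incident severity this rule raises, or None."""
        if not self.enabled:
            return None
        if self.file_system is not None and self.file_system != mount_point:
            return None
        if used_pct > self.critical_pct:
            return "Critical"
        if used_pct > self.warning_pct:
            return "Warning"
        return None


def open_incidents(policy, usage):
    """List (rule name, mount point, severity) for every rule that fires."""
    incidents = []
    for mount_point, used_pct in usage.items():
        for rule in policy:
            level = rule.severity(mount_point, used_pct)
            if level is not None:
                incidents.append((rule.name, mount_point, level))
    return incidents


# Default rule disabled; hypothetical custom rule scoped to /nfs-guest.
policy = [
    FileSystemRule("File System Used Space Percentage", 80.0, 95.0, enabled=False),
    FileSystemRule("nfs-guest used space", 80.0, 95.0, file_system="/nfs-guest"),
]
```

Given usage figures for several mount points, open_incidents(policy, usage) reports only on /nfs-guest, no matter how full the other file systems are. Re-enabling the default rule brings the other mount points back and also makes /nfs-guest report twice, which is why the two changes are best made together.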
The Exchange offers a searchable listing of Enterprise Manager entities. Today it’s stocked with plug-ins and connectors for Enterprise Manager 12c and 11g. Anyone (partners, customers, ACE community members) can post an entity, subject to approval of course. So in addition to plug-ins and connectors, the Exchange will have best practices, deployment procedures, templates, and essentially any Enterprise Manager entity that’s relevant.
The Exchange provides Development Resources to guide contributors in the creation of plug-ins and connectors. A Community Resources page features plug-ins validated through the Oracle Validated Integration program as well as some other contributions important to customers. You can also discover ways to get more involved with Enterprise Manager through the user and partner communities.
The Exchange was announced in the October 2nd Enterprise Manager Partner Press Release and is being presented at Oracle OpenWorld 2012 during the following sessions:
• “Using Oracle Enterprise Manager to Manage Your Own Private Cloud” General Session – Tuesday Oct 2nd
• “Managing Heterogeneous Environments with Oracle Enterprise Manager” Conference Session – Tuesday Oct 2nd
• “Using Management Already Built into Oracle Products: Oracle Enterprise Manager” Oracle Partner Network Exchange Session – Wednesday Oct 3rd
Check it out at http://www.oracle.com/goto/emextensibility, and let us know what you think by posting a comment below or clicking the "Forum" button at the Exchange itself.
Enterprise Manager Ops Center 12c recently released an upgrade for Solaris Agent Controllers. In this week's blog post, we'll show you how to upgrade agent controllers.
Detailed instructions about upgrading Agent Controllers are available in the product documentation here. This blog post uses an Enterprise Controller which is configured for connected mode operation. If you'd like to apply the agent update in a disconnected installation, additional instructions are available here.
Step 1: Download Agent Controller Updates
With a connected mode Ops Center installation, you can check for product updates at any time by selecting the Enterprise Controller from the left-hand Administration navigation tab.
Select the right-hand Action link “Ops Center Downloads” to open a pop-up dialog displaying any new product updates. In this example, the Enterprise Controller has already been upgraded to the latest version (Update 1, also shown as build version 2076) so only the Agent Controller updates will appear.
There are three updates available: one for Solaris 10 X86, one for Solaris 8-10 SPARC, and one for all versions of Solaris 11. Note that the last update in the screen shot is the Solaris 11 update; for details on any of the downloads, place your mouse over the information icon under the details column for a pop-up text region.
Select the software to download and click the Next button to display the Ops Center license agreement.
Review and click the check box to accept the license agreement, then click the Next button to begin downloading the software.
The status screen shows the current download status. If desired, you can perform the downloads as a background job. Simply click the check box, then click the Next button to proceed to the summary screen.
The summary screen shows the updates to be downloaded as well as the current status. Clicking the Finish button will close the dialog and return to the Browser UI. The download job will continue to run in Ops Center and progress can still be viewed from the jobs menu at the bottom of the browser window.
Step 2: Check the Version of Existing Agent Controllers
After the download job completes, you can check the availability of agent updates as well as the current versions of your Agent Controllers from the left-hand Assets navigation tab.
Select “Operating Systems” from the pull-down tab to display only OS assets. Next, select “Solaris” in the left-hand tab to display the Solaris assets. Finally, select the Summary tab in the center display panel to show which versions of agent controllers are installed in your data center.
Notice that a few of the OS assets are not displayed in the Agent Controllers tab. Ops Center will not display OS instances which do not have an Agent Controller installation. This includes Enterprise Controllers and Proxy Controllers (unless the agent has been activated on the OS instance) and OS instances using agentless management.
For Agent Controllers which support an update, the version of agent software (in this example, 2083) appears to the right of the currently installed version.
Step 3: Upgrade Your Agent Controllers
If desired, you can upgrade agent controllers from the previous screen by selecting the desired systems and clicking the upgrade button. Alternatively, you can click the link “Upgrade All Agent Controllers” in the right-hand Actions menu:
In either case, a pop-up dialog lets you start the upgrade process. The first screen in the dialog lets you choose the upgrade method:
Ops Center provides three ways to upgrade agent controllers:
After selecting the upgrade method, click the Next button to proceed to the summary screen. Click the Finish button to close the pop-up dialog and start the upgrade job for the agent controllers.
The upgrade job runs a series of tasks in parallel, and will upgrade all agents which have been selected. Once the job completes, the OS instances in your data center will be upgraded and running the latest version of Agent Controller software.
To try and make the new features a bit more understandable, I’ll be writing a number of blog entries over the coming months to highlight just some of my favourite new features for EM12c. From an administrator’s perspective, one of those standout features (and the subject of today’s entry) has to be incident management.
The goal of incident management is to enable administrators to monitor and resolve service disruptions that may be occurring in their data centre as quickly and efficiently as possible. Instead of managing the numerous discrete individual events that may be raised as the result of any of these service disruptions, we want to manage a smaller number of more meaningful incidents, and to manage them based on business priority across the lifecycle of those incidents.
To do this, Enterprise Manager now provides a centralized incident console called Incident Manager that will enable the administrator to track, diagnose, and resolve incidents, as well as providing features to help rectify the root causes of recurrent incidents. Incident Manager also directly leverages Oracle’s own expertise via My Oracle Support knowledge base articles and documentation to enable administrators to accelerate the process of diagnosing and resolving incidents and problems. Finally, Incident Manager also offers the ability to do lifecycle operations for incidents, so you can assign ownership of an incident to a specific user, acknowledge an incident, set priority for an incident, track an incident’s status, escalate an incident or suppress it so you can defer it to a later time. You can also raise notifications on an incident or open a helpdesk ticket via the helpdesk connectors.
Enterprise Manager continues to be the primary tool for managing and monitoring the Oracle data center, so it manages and monitors Oracle applications as well as the application stack from presentation layer to middleware, databases to hosts and the operating system, as well as non-Oracle technology. When Enterprise Manager detects issues in any of this infrastructure, it raises events. Sample events might be:
1. Metric alerts (for example, CPU utilization or tablespace usage alerts) where a critical threshold you set has been crossed
2. Job events – events are raised by the job system for job statuses that you specify, for example an event is raised to signal the failure of a job.
3. Standards violations – if you are using compliance standards and any of the targets that are being monitored violate any of the compliance standards, then a standards violation event could be raised.
4. Availability events – if a target is down and Enterprise Manager detects that, an availability event that the target is down can be raised
5. Other events – there are other types of events that occur as well
All these events signal particular issues have occurred in the managed data centre. As an administrator, you really want to be able to determine which of these events are significant. From these significant events, you then want to be able to correlate discrete events that are related to the same underlying issue, so you in fact have to manage a smaller number of significant incidents.
An incident could then be defined as an object containing a significant event (such as a target being down, for example) or it could be a combination of events that all relate to the same issue (for example, running out of space could be detected by Enterprise Manager as separate events raised from the database, host and storage target types). For example, you may have a performance incident that amalgamates a number of performance events, another incident related to space, and a different incident based on availability problems.
Sound good? OK, so how do we do this? Well, events are significant occurrences in your IT infrastructure that Enterprise Manager detects and raises. Each event has a set of attributes: what type of event it is, the severity (fatal, critical and so on), the object or entity on which the event is raised (typically a target, but it can also be a job or some other object), the message associated with the event, the timestamp at which it occurred, and the functional category (such as availability or security).
Some examples of the different types of events include:
· Target availability: raised when a target is down or has gone into an agent unreachable state.
· Metric alert: raised when a metric crosses its threshold.
· Job status change: raised, for example, when a job fails.
· Compliance standard rule: raised when a compliance standard rule is violated.
· Metric evaluation: raised when there is an error with the evaluation of a metric.
· Other events such as SLA Alert, High Availability and Compliance Standard Score violation can also be raised, and of course, users can cause an event to be raised.
Associated with these event types are event severities. The first of these, “Fatal”, is a new severity level in Enterprise Manager specifically associated with the target availability event type for when the target is down. Critical and warning events have the same meaning as they had in previous releases, and then we have the Advisory level. Typically, this is associated with non-service-impacting events such as compliance standard violation events. The informational level is an event severity used to indicate simply that an event has occurred, but there is no need to do anything about it.
As we discussed previously, an actual incident will contain one or more events. Let’s look at the details of an incident with one event. For example, Figure 1 shows us an availability event:
Figure 1: Incident with one event
The event signals that the database DB1 is down and includes a timestamp of when the event was raised. Because this is a target availability event and the database is down, the severity is marked as Fatal. An incident can be created for that event, so the incident contains only one event. In order to manage and track the resolution of the incident, the incident has other attributes such as owner (the Enterprise Manager user that is working on the incident), status, incident severity (which is based on the event severity), priority and a comment field.
Many incidents will instead contain multiple events, where those events are related and point to the same underlying cause. In the example shown in Figure 2, we have two metric alert events on a host target, a memory utilization metric alert event and a CPU utilization metric alert event, because the host is starting to suffer from heavy load. We have a warning severity memory utilization metric alert event, and a short time later a critical severity CPU utilization metric alert event.
An incident can be created containing both events in order to manage and track the resolution of the incident. In the current release, the administrator needs to manually combine events into an incident in the Enterprise Manager console (the automatic grouping of related events into an incident is a future enhancement). Again, we have additional attributes associated with the incident like we had in the previous example. Enterprise Manager automatically assigns the incident severity, based on the worst case event severity of all the events contained in the incident. Since the worst event severity is Critical, the incident severity is also set to Critical. Finally, the incident has a summary which is a short description of what the incident is about. The individual events are indicating the machine load is high so you can set the summary to that. Alternatively, you can set the incident summary to be the same as the event messages.
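The worst-case rule is easy to picture in code. The sketch below ranks the five severity levels named earlier and picks the highest one; the numeric ranks are my own assumption for illustration, since only the names and their relative seriousness are described here.

```python
from enum import IntEnum


class Severity(IntEnum):
    # Ranks are illustrative; a higher number means a more serious event.
    INFORMATIONAL = 1
    ADVISORY = 2
    WARNING = 3
    CRITICAL = 4
    FATAL = 5


def incident_severity(event_severities):
    """Return the worst-case severity across all events in an incident.

    An incident contains one or more events, so an empty list is an error
    (max() raises ValueError).
    """
    return max(event_severities)
```

For the host example, incident_severity([Severity.WARNING, Severity.CRITICAL]) returns Severity.CRITICAL, matching the severity assigned to the incident above.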
If you are using one of the helpdesk connectors to interface to a helpdesk system, an incident might also result in a helpdesk ticket which can allow the helpdesk analyst to work on the ticket. Within Enterprise Manager, we’ll be able to track both the ticket number and the status of that particular ticket.
A problem is the underlying root cause of an incident. In Enterprise Manager terms, a problem is specifically related to either an Automatic Diagnostic Repository (ADR) incident or Oracle software incident. Enterprise Manager will automatically create a problem whenever it detects an ADR incident has been raised. An ADR incident can be thought of as a critical Oracle software problem where the resolution of the software problem typically involves contacting Oracle Support, opening a service request and possibly receiving a patch for that problem.
Whenever an ADR incident is raised, we generate one incident in Enterprise Manager for that ADR incident, and we also automatically generate a problem as well. All the ADR incidents that have the same problem signature (that is, the same root cause) will be linked into a single problem object. The administrator can manage the problem in Incident Manager in the same way as you would manage an incident, so you can assign an owner to the problem, track the resolution and so on. In addition, there are in-context links to Support Workbench functionality which allows the administrator to package the diagnostic material, open a service request and view the status of diagnostic activity such as the SR number and ultimately bug number (if one is generated) within the user interface.
Figure 3 shows a diagrammatic example of how incidents and problems are related. Two ADR incidents have occurred; in this example, they are two ORA-600 errors in my database. Both of these incidents are of critical severity. Enterprise Manager automatically creates a problem containing those incidents. Within the Incident Manager interface you can link to the Support Workbench to open a service request which you can then track from Incident Manager.
Figure 3: Incidents and problems
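The relationship in Figure 3 amounts to grouping ADR incidents by their problem signature. Here is a minimal Python sketch of that idea; the function name and the (incident id, signature) tuple layout are hypothetical, and this is not how Enterprise Manager stores the data internally.

```python
from collections import defaultdict


def group_into_problems(adr_incidents):
    """Link ADR incidents that share a problem signature into one problem.

    adr_incidents is an iterable of (incident_id, problem_signature) pairs.
    Returns a dict mapping each signature to the list of incident ids.
    """
    problems = defaultdict(list)
    for incident_id, signature in adr_incidents:
        problems[signature].append(incident_id)
    return dict(problems)
```

Two incidents with the same ORA-600 signature end up under a single problem, so the administrator assigns an owner and tracks the service request once rather than per incident.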
So now you have an understanding of the terminology and relationships between these terms, what’s next? Well, the next thing to understand is just how you deal with these incidents. That will be the topic of my next blog, so stay tuned for more!
Contributed by Pete Sharman, Principal Product Manager, Oracle Enterprise Manager
Oracle Enterprise Manager Ops Center 12c Update 1 was released earlier this month. Eran Steiner, Technical Architect, Oracle Enterprise Manager, adds some additional information and best practices about upgrading to Ops Center 12c Update 1 in this blog. Eran hosted a call to provide an overview of Oracle Enterprise Manager Ops Center 12c Update 1 and answer any questions. The recording of this call is available here and the presentation can be downloaded here.