By Rodney Lindner-Oracle on Sep 10, 2014
Modifying Monitoring Policies
Ops Center ships with default monitoring policies for each type of asset it manages and monitors. These policies are specific to each asset type. In the real world, they act only as a starting point, and you will need to customize them to suit your own environment. Most customizations can be done in the BUI (Browser User Interface), which is covered in the manuals and other blogs on this site, but occasionally you will need to manually edit the underlying XML of the default policies to get the customization you require. This blog entry covers how to do that.
In the BUI, you can easily copy these default policies and then modify them to suit your own environment.
You can do the following modifications in the BUI:
- enable/disable monitoring rules
- add a new monitoring rule
- delete an existing monitoring rule
- modify the thresholds/severities/triggers for most alert rules
Modifications are normally done by highlighting the rule, clicking the edit icon, making your changes and then clicking the apply button. Remember that once you have made all the rule changes, the policy should be applied/reapplied to your target assets. Most rules are editable in this way.
However, not all rules can be edited in the BUI. A rule like "Operating System Reachability" cannot be edited from the BUI; it must be changed manually by editing the underlying XML. You can identify such rules by the fact that no edit icon is available when the rule is selected, as is the case for the "Operating System Reachability" alert rule.
Only the Ops Center factory default policies (the product's standard default policies) can be edited by modifying the XML on the filesystem. When you modify a policy in the BUI, Ops Center copies the default policy to a custom policy, and it is that copy you edit. These custom policies are stored in the database, not as XML on the filesystem. This means that if you want to change one of these non-editable rules, you must manually edit the factory default policy first. Then make a copy of the policy to create a custom policy and, if required, re-apply any additional customizations in the BUI, so that your new policy absorbs the manual modifications.
While the default values are normally sufficient for most customers, I had a request from a customer who wanted to change the "Operating System Reachability" severity from Warning (the default) to Critical. He considered this to be an important event that needed to be alerted at a higher level so that it would grab the attention of his administration staff. Below is the procedure for how to achieve such a modification.
Manually Modifying the Default Alert Severity
As part of a standard install, Ops Center will create an alert of severity Warning if it loses connectivity with an operating system (a Solaris 8/9 OS or a Solaris 10/11 global zone).
This will create an alert with the description "The asset can no longer be reached".
So here is the procedure for changing the default alert severity for the "Operating System Reachability" alert from Warning to Critical. Be aware that there is a different alert for "Non-global zone Reachability", which will not be covered here, but modifying it, or other alerts, would follow a similar procedure.
We will be modifying the XML files for the default monitoring policies. These can be found at /var/opt/sun/xvm/monitoringprofiles on your EC (Enterprise Controller).
root@ec:/var/opt/sun/xvm/monitoringprofiles# ls
Chassis.xml            MSeriesDomain.xml                 ScCluster.xml
CiscoSwitch.xml        NasLibrary.xml                    ScNode.xml
Cloud.xml              NonGlobalZone.xml                 ScZoneClusterGroup.xml
ExadataCell.xml        OperatingSystem.xml               ScZoneClusterNode.xml
FileServer.xml         OvmGuest.xml                      Server.xml
GlobalZone.xml         OvmHost.xml                       Storage.xml
IscsiStorageArray.xml  OvmManager.xml                    Switch.xml
LDomGuest.xml          PDU.xml                           Tenancy.xml
LDomHost.xml           RemoteOracleEngineeredSystem.xml  VirtualPool.xml
LocalLibrary.xml       SanLibrary.xml
MSeriesChassis.xml     SanStorageArray.xml
root@ec:/var/opt/sun/xvm/monitoringprofiles#
Follow the steps below to modify the monitoring policy:
In the BUI, identify which policies you want to modify. Look at an asset in the BUI and select the "Monitoring" tab. At the top of the screen, you will see which monitoring policy (Alert Monitoring Rules) it is running. In this case, the policy is called "OC - Global Zone", which corresponds to the "GlobalZone.xml" file.
Alternatively, log on to the EC and grep for the alert rule name.
# grep "Operating System Reachability" *
GlobalZone.xml: <name>Operating System Reachability</name>
OperatingSystem.xml: <name>Operating System Reachability</name>
#
In this case, we will want to change "OC - Operating System" and "OC - Global Zone" policies, as they both have the "Operating System Reachability" rule, so we will be editing both the "GlobalZone.xml" and "OperatingSystem.xml" files.
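The grep step above can also be scripted if you need to check several rule names. The following is a minimal Python sketch, not part of Ops Center; the function name `policies_with_rule` is hypothetical, and it assumes the rule name appears in the files exactly as `<name>...</name>`, as the grep output shows.

```python
from pathlib import Path


def policies_with_rule(profile_dir, rule_name):
    """Return the sorted names of *.xml files that contain <name>rule_name</name>.

    This is a plain text search, like grep; it does not parse the XML.
    """
    needle = f"<name>{rule_name}</name>"
    return sorted(
        path.name
        for path in Path(profile_dir).glob("*.xml")
        if needle in path.read_text(encoding="utf-8", errors="replace")
    )


# Example call on the EC (directory taken from this blog entry):
#   policies_with_rule("/var/opt/sun/xvm/monitoringprofiles",
#                      "Operating System Reachability")
```

Because it only matches `*.xml`, any backup copies with a different suffix are left out of the result.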
Make a backup copy of any XML file you modify (in case you mess something up).
# pwd
/var/opt/sun/xvm/monitoringprofiles
# cp OperatingSystem.xml OperatingSystem.xml.orig
# cp GlobalZone.xml GlobalZone.xml.orig
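If you script the backup, it is worth refusing to overwrite an existing backup, so that a second run cannot replace the pristine original with an already modified file. A minimal Python sketch follows; `backup_policy` is a hypothetical helper name, and the ".orig" suffix simply mirrors the cp commands above.

```python
import shutil
from pathlib import Path


def backup_policy(xml_path, suffix=".orig"):
    """Copy xml_path to xml_path + suffix and return the backup path.

    Raises FileExistsError instead of overwriting an earlier backup.
    """
    src = Path(xml_path)
    dst = src.with_name(src.name + suffix)
    if dst.exists():
        raise FileExistsError(f"backup already exists: {dst}")
    shutil.copy2(src, dst)  # copy2 also preserves timestamps and permissions
    return dst


# Example call (file name taken from this blog entry):
#   backup_policy("/var/opt/sun/xvm/monitoringprofiles/GlobalZone.xml")
```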
Edit each file and look for the rule name:
<monitor>
    <enabled>true</enabled>
    <monitorType>Reachability</monitorType>
    <name>Operating System Reachability</name>
    <parameter>
        <name>unreachable.duration.minutes.WARNING</name>
        <value>3</value>
    </parameter>
</monitor>
and change "unreachable.duration.minutes.WARNING" to "unreachable.duration.minutes.CRITICAL".
<monitor>
    <enabled>true</enabled>
    <monitorType>Reachability</monitorType>
    <name>Operating System Reachability</name>
    <parameter>
        <name>unreachable.duration.minutes.CRITICAL</name>
        <value>3</value>
    </parameter>
</monitor>
Repeat for the other file(s).
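If you have several files to change, the edit can be scripted rather than done by hand. The sketch below is one possible approach in Python, not an Ops Center tool. The function name `change_rule_severity` is hypothetical, and it assumes that each rule sits in a plain `<monitor>...</monitor>` block with no attributes and no nesting, as in the fragment shown above. It works on the text rather than re-serializing the XML, so the rest of the file is left byte-for-byte unchanged.

```python
import re

# Assumption: rules are un-nested <monitor>...</monitor> blocks without attributes.
MONITOR_BLOCK = re.compile(r"<monitor>.*?</monitor>", re.DOTALL)


def change_rule_severity(xml_text, rule_name, old="WARNING", new="CRITICAL"):
    """Rename unreachable.duration.minutes.<old> to .<new> inside the named rule only.

    Returns (new_text, number_of_monitor_blocks_changed).
    """
    rule_tag = f"<name>{rule_name}</name>"
    old_param = f"<name>unreachable.duration.minutes.{old}</name>"
    new_param = f"<name>unreachable.duration.minutes.{new}</name>"
    changed = 0

    def fix(match):
        nonlocal changed
        block = match.group(0)
        # Only touch the block that belongs to the requested rule.
        if rule_tag in block and old_param in block:
            changed += 1
            return block.replace(old_param, new_param)
        return block

    return MONITOR_BLOCK.sub(fix, xml_text), changed


# Example use, after taking a backup of the file:
#   text = open("GlobalZone.xml", encoding="utf-8").read()
#   text, n = change_rule_severity(text, "Operating System Reachability")
#   if n == 1:
#       open("GlobalZone.xml", "w", encoding="utf-8").write(text)
```

Checking the returned count before writing guards against silently saving a file in which nothing, or more than expected, was changed.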
Make a backup copy of your modified XML files as these files may be overwritten during an upgrade process.
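After an upgrade, you can compare those saved copies against the live files to see whether anything was overwritten. A minimal Python sketch, assuming you keep the modified copies in a separate directory; both the function name `overwritten_policies` and the `saved_dir` location are hypothetical.

```python
import filecmp
from pathlib import Path


def overwritten_policies(profile_dir, saved_dir):
    """Return names of saved *.xml policies that are missing or differ in profile_dir."""
    changed = []
    for saved in sorted(Path(saved_dir).glob("*.xml")):
        live = Path(profile_dir) / saved.name
        # shallow=False forces a byte-by-byte comparison of the contents.
        if not live.is_file() or not filecmp.cmp(live, saved, shallow=False):
            changed.append(saved.name)
    return changed


# Example call (the saved_dir path is only an illustration):
#   overwritten_policies("/var/opt/sun/xvm/monitoringprofiles",
#                        "/root/monitoringprofiles.modified")
```

An empty list means every saved file still matches the live copy; any name returned needs the manual edit re-applied.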
Now restart the EC so that the new monitoring policies are re-read.
Now apply the new policy to the hosts that should receive the updated rule.
Check the Message Center in the Navigation panel and you will see that your alert has now changed from "Warning" to "Critical".
As a best practice, now use the BUI to copy the new policies (OC - Global Zone and OC - Operating System) to your own custom policies, adding any additional rule modifications. Copying the new OC policy to a custom policy saves it into the database, so it will not be overwritten by any subsequent Ops Center upgrade. Remember to apply the custom policy to your asset(s) or asset groups.
It is good practice to keep the name of the source policy in the name of your custom policy. It will make your life easier if you ever get confused about which policy applies to which type of asset or if you want to go back to the original source policy.
If you want your new custom policy to be automatically applied when you discover/provision a new asset, you will need to select the policy and click the "Set as Default Policy" action for that asset class.
The green tick on the icon indicates that a policy is the default for that asset class.
You have now successfully modified the default alert severity for an alert that could not be modified in the BUI.
Senior IT/Product Architect
Systems Management - Ops Center Engineering