Keeping track of File System Utilization in Ops Center 12c
By S Stelting on Nov 07, 2012
Enterprise Manager Ops Center 12c provides significant monitoring capabilities, combined with very flexible incident management. These capabilities even extend to monitoring the file systems associated with Solaris or Linux assets. Depending on your needs you can monitor and manage incidents, or you can fine tune alert monitoring rules to specific file systems.
This article will show you how to use Ops Center 12c to
- Track file system utilization
- Adjust file system monitoring rules
- Disable file system rules
- Create custom monitoring rules
A recording of this community call is now available here:
Monitoring File Systems for OS Assets
The Libraries tab provides basic, device-level information about the storage associated with an OS instance. This tab shows you the local file system associated with the instance and any shared storage libraries mounted by Ops Center.
More detailed information about file system storage is available under the Analytics tab under the sub-tab named Charts. Here, you can select and display the individual mount points of an OS, and export the utilization data if desired:
In this example, the OS instance has a basic root file partition and several NFS directories. Each file system mount point can be independently chosen for display in the Ops Center chart.
File Systems and Incident Reporting
Every asset managed by Ops Center has a "monitoring policy", which determines what represents a reportable issue with the asset. The policy is made up of a bunch of monitoring rules, where each rule describes
- An attribute to monitor
- The conditions which represent an issue
- The level or levels of severity for the issue
When the conditions are met, Ops Center sends a notification and creates an incident.
By default, OS instances have three monitoring rules associated with file systems:
- File System Reachability: Triggers an incident if a file system is not reachable
- NAS Library Status: Triggers an incident for a value of "WARNING" or "DEGRADED" for a NAS-based file system
- File System Used Space Percentage: Triggers an incident when file system utilization grows beyond defined thresholds
You can view these rules in the Monitoring tab for an OS:
Of course, the default monitoring rules is that they apply to every file system associated with an OS instance. As a result, any issue with NAS accessibility or disk utilization will trigger an incident. This can cause incidents for file systems to be reported multiple times if the same shared storage is used by many assets, as shown in this screen shot:
Depending on the level of control you'd like, there are a number of ways to fine tune incident reporting.
Note that any changes to an asset's monitoring policy will detach it from the default, creating a new monitoring policy for the asset. If you'd like, you can extract a monitoring policy from an asset, which allows you to save it and apply the customized monitoring profile to other OS assets.
Solution #1: Modify the Reporting Thresholds
In some cases, you may want to modify the basic conditions for incident reporting in your file system. The changes you make to a default monitoring rule will apply to all of the file systems associated with your operating system. Selecting the File Systems Used Space Percentage entry and clicking the "Edit Alert Monitoring Rule Parameters" button opens a pop-up dialog which allows you to modify the rule.
The first screen lets you decide when you will check for file system usage, and how long you will wait before opening an incident in Ops Center. By default, Ops Center monitors continuously and reports disk utilization issues which exist for more than 15 minutes.
The second screen lets you define actual threshold values. By default, Ops Center opens a Warning level incident is utilization rises above 80%, and a Critical level incident for utilization above 95%
Solution #2: Disable Incident Reporting for File System
If you'd rather not report file system incidents, you can disable the monitoring rules altogether. In this case, you can select the monitoring rules and click the "Disable Alert Monitoring Rule(s)" button to open the pop-up confirmation dialog.
Like the first solution, this option affects all file system monitoring. It allows you to completely disable incident reporting for NAS library status or file system space consumption.
Solution #3: Create New Monitoring Rules for Specific File Systems
If you'd like to have the greatest flexibility when monitoring file systems, you can create entirely new rules. Clicking the "Add Alert Monitoring Rule" (the icon with the green plus sign) opens a wizard which allows you to define a new rule.
This rule will be based on a threshold, and will be used to monitor operating system assets. We'd like to add a rule to track disk utilization for a specific file system - the /nfs-guest directory. To do this, we specify the following attribute
The value of name in the attribute allows us to define a specific NFS shared directory or file system... in the case of this OS, we could have chosen any of the values shown in the File Systems Utilization chart at the beginning of this article.
usedSpacePercentage lets us define a threshold based on the percentage of total disk space used. There are a number of other values that we could use for threshold-based monitoring of FileSystemUsages, including
The final sections of the screen allow us to determine when to monitor for disk usage, and how long to wait after utilization reaches a threshold before creating an incident. The next screen lets us define the threshold values and severity levels for the monitoring rule:
If historical data is available, Ops Center will display it in the screen. Clicking the Apply button will create the new monitoring rule and active it in your monitoring policy.
If you combine this with one of the previous solutions, you can precisely define which file systems will generate incidents and notifications. For example, this monitoring policy has the default "File System Used Space Percentage" rule disabled, but the new rule reports ONLY on utilization for the /nfs-guest directory.