It is highly desirable to monitor the Exadata compute node disks for failures or degraded performance. Using the Enterprise Manager metric extension functionality, compute nodes can be monitored for these conditions and an alert raised when an issue occurs. The following steps guide you through the process.
1. First, a root monitoring credential set must be created. Log in to the OMS using emcli:
$ ./emcli login -username=sysman
Enter password :
2. Create the credential set:
$ ./emcli create_credential_set -set_name=root_monitoring -target_type=cluster -supported_cred_types=HostCreds -description=root_monitoring -monitoring
Create credential set for target type host
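Steps 1 and 2 can also be wrapped in a small script with basic error checking. The following is a minimal sketch rather than a required part of the procedure: the `EMCLI` variable and the function name `create_root_monitoring_set` are hypothetical, the `-monitoring` flag is included on the assumption that your emcli version expects it for monitoring credential sets, and `emcli login` will still prompt for the sysman password.

```shell
#!/bin/sh
# Path to the emcli executable; override EMCLI if it lives elsewhere
# (hypothetical variable, not something emcli itself reads).
EMCLI="${EMCLI:-./emcli}"

# Log in to the OMS, then create the root_monitoring credential set.
# Stops at the first failing command and returns its exit status.
create_root_monitoring_set() {
  "$EMCLI" login -username=sysman || return $?
  "$EMCLI" create_credential_set \
    -set_name=root_monitoring \
    -target_type=cluster \
    -supported_cred_types=HostCreds \
    -monitoring || return $?
}
```

To use it, source the file from the directory that contains emcli and call `create_root_monitoring_set`; a non-zero return value means one of the two commands failed.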
3. Next, log in to EM and go to the monitoring credentials page to set up credentials for a test target:
Setup --> Security --> Monitoring Credentials
Select Cluster and then click the "Manage Monitoring Credentials" button.
Find the target you want to test on with the credential set defined in step 2 (in this case root_monitoring).
Highlight the credential set and click the "Set Credentials" button. Enter the credentials and use the "Test and Save" button to ensure they are correctly defined.
4. Next, create the metric extension.
5. On the General Properties screen, set the following:
Target Type "Host"
Display Name "Compute Node Disk Monitoring"
Adapter "OS Command - Multiple Columns"
Data Collection "Enabled"
Repeat Every "5 Minutes"
Use of Metric Data "Alerting and Historical Trending"
Upload Interval "1 Collections"
Select the "Next" button.
6. Now create the script to run on the agent.
On your local machine, create a file called megaclicommand.sh that contains the following:
/opt/MegaRAID/MegaCli/MegaCli64 AdpAllInfo -aALL | grep "Virtual Drives" -A 6 | grep -w 'Degraded\|Critical\|Offline\|Failed' | sed 's/Degraded/Virtual Drives Degrades/g' | sed 's/Offline/Virtual Drives Offline/g' | sed 's/Critical Disks/Critical Physical Disks/g' | sed 's/Failed Disks/Failed Physical Disks/g'
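To see what this pipeline keeps and how it relabels the rows, it can be run against canned text on a machine that has no MegaCli installed. The sketch below is illustrative only: the sample text is a hypothetical rendering of the "Virtual Drives" section of the `AdpAllInfo` output (the real layout may differ between MegaCli versions), the function names are made up, and `grep -E` with `|` is used as a portable equivalent of the `\|` alternation in the script.

```shell
#!/bin/sh
# Hypothetical sample of the "Virtual Drives" section of MegaCli64 AdpAllInfo
# output, used only so the filter can be exercised without the hardware.
sample_output() {
cat <<'EOF'
                Virtual Drives
                ================
Virtual Drives    : 1
  Degraded        : 0
  Offline         : 0
Physical Devices  : 5
  Disks           : 4
  Critical Disks  : 0
  Failed Disks    : 0
EOF
}

# Same filtering and relabelling as megaclicommand.sh, reading from stdin.
filter_disk_status() {
  grep -A 6 "Virtual Drives" \
    | grep -E -w 'Degraded|Critical|Offline|Failed' \
    | sed 's/Degraded/Virtual Drives Degrades/g' \
    | sed 's/Offline/Virtual Drives Offline/g' \
    | sed 's/Critical Disks/Critical Physical Disks/g' \
    | sed 's/Failed Disks/Failed Physical Disks/g'
}

sample_output | filter_disk_status
```

With this sample, only the four status rows survive the filter, each relabelled and still carrying its `: 0` count; the header, separator, and total rows are dropped.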
7. On the "Edit Storage Server Disk Status (ME$Storage_Server_Disk_Status) v1 : Adapter" page enter the following
8. In the "Upload Custom Files" section:
Select the "Upload" button and select the file created in step 6.
Click OK and, once back on the "Create New : Adapter" page, select the "Next" button.
9. On the "Create New : Columns" page, create two columns.
Column one should be set up as:
Display Name "Type"
Column Type "Key Column"
Value Type "String"
Metric Category "Fault"
Column two should be set up as:
Display Name "Value"
Column Type "Data Column"
Value Type "Number"
Metric Category "Fault"
Comparison Operator ">"
After setting up the two columns, select the "Next" button.
10. On the Credentials screen:
Select the “Specify Credential Set” radio button.
In the drop-down box, select the credential set created in step 2.
Click the "Next" button.
11. On the “Create New : Test” page:
Add a target to test with in the “Test Targets” section.
Click the “Run Test” button and ensure that the results are displayed properly in the “Test Results” box.
The results should be similar to the following:
Virtual Drives Degrades 0
Virtual Drives Offline 0
Critical Physical Disks 0
Failed Physical Disks 0
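Because the "Value" column is compared with the ">" operator, any count above the configured threshold raises an alert. The same comparison can be sketched by hand against rows in the format shown above. The function name `check_disk_rows` and the default threshold of 0 are assumptions made for illustration; the thresholds configured in the metric extension itself may differ.

```shell
#!/bin/sh
# Read "Type Value" rows on stdin, print every row whose numeric value is
# greater than the threshold (first argument, default 0), and return 1 if
# at least one such row was found, 0 otherwise.
check_disk_rows() {
  threshold="${1:-0}"
  awk -v t="$threshold" '
    NF >= 2 && $NF ~ /^[0-9]+$/ {
      value = $NF + 0
      type = $0
      sub(/[ \t]+[0-9]+[ \t]*$/, "", type)   # drop the trailing value
      sub(/^[ \t]+/, "", type)               # drop leading indentation
      if (value > t + 0) {
        printf "ALERT: %s = %d\n", type, value
        bad = 1
      }
    }
    END { exit (bad ? 1 : 0) }
  '
}
```

For example, piping the four rows above into `check_disk_rows 0` prints nothing and returns 0, while a row such as `Failed Physical Disks 2` would be reported and make the function return 1.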
12. Next, the metric extension must be saved as a deployable draft. This is done from the main metric extension page and allows the metric to be deployed to targets for testing. At this stage, however, only the developer has access to publish the metric. After satisfactory testing is completed, the metric is published, again from the main metric extension page.
To ensure that administrators are notified if the metric extension fails, an incident rule should be created.
1. To begin, navigate to the Incident Rules home page.
Use the Setup button in the upper right-hand corner of the Enterprise Overview page.
Now click the “Create Rule Set...” button.
2. On the Create Rule Set screen, enter the following information:
Name: Whatever the rule set should be called, e.g. Metric Collection Error Rule
Enabled Box: Should be checked
Applies To: Targets
Select the “All Targets of Type” radio button at the bottom of the screen, then choose Host in the drop-down box.
3. Now select the “Rules” tab at the bottom of the screen.
4. Choose the "Create..." button in the middle of the screen.
5. In the “Select Type of Rule to Create” pop-up box, select the “Incoming events and updates to events” radio button. Click the Continue button.
6. On the “Create New Rule: Select Events” screen, check the Type check box. In the drop-down, select “Metric Extension Update”. Click the Next button.
7. On the “Add Conditional Actions” page, you can specify the conditions that must be met for the action to apply, whether an incident should be created, and email notifications. Specify the desired properties and select the Continue button.
8. If no additional rules are required, select the Next button on the “Create New Rule: Add Actions” page.
9. On the next screen, either accept the default rule name or specify the desired name.
10. On the “Create New Rule : Review” page, ensure everything looks correct and select the Continue button.
11. Lastly, click the “Save” button to complete the setup.
12. The metric can now be deployed to the desired target by selecting the “Deploy to Targets” option from the “Actions” drop-down button on the Metric Extensions page.