Enhancing OCI metrics and creating Alerts using Logging Analytics

July 28, 2023 | 7 minute read
Kumar Varun
Product Management - Logging Analytics

Guest author - Alexandru-Adrian Birzu, Master Principal Tech Cloud Specialist (Observability & Management)

As businesses increasingly migrate to the cloud, having robust monitoring and alerting mechanisms becomes crucial to ensure the performance, security, and stability of cloud infrastructure. Oracle Cloud Infrastructure (OCI) offers a powerful Monitoring service that allows users to track and analyze various metrics related to their resources. However, some scenarios may require additional customization to optimize metrics and create more insightful alerts.

This guide explains how to enhance an OCI metric and create alerts using Logging Analytics. The tutorial focuses on the "DNS Query Count" metric, as it was implemented for a large banking customer, but the same principles apply to other metrics in the Monitoring service, whether you have 10 metrics or more than 500. In this case, the enhancement adds the DNS Zone Name to the metric.

Figure 1:  Email alarm with DNS Zone name

How the Zone Name becomes available as a new dimension in the Monitoring service

Alerting becomes crucial when the DNS Query Count exceeds a specific threshold, which can signal an attack or another abnormal situation. By default, the metric identifies the zone only by its resource OCID, so the alert does not show which zone is affected. The solution below makes the Zone Name available as a new dimension in the Monitoring service.

Figure 2:  Alarm and Metrics workflow

 

Figure 3:  Create a query in OCI Monitoring for DNS Query Count

 

Step 1: Creating a Stream for Metrics Ingestion

The first step is to create a Stream that will ingest and export the monitoring metrics. Streams in OCI allow seamless data transfer between services, making it an ideal choice for our purpose.

Step 2: Configuring Service Connector Hub

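Next, create a connector in Service Connector Hub that copies the metrics from the Monitoring service into the Stream. The connector acts as a bridge between services and keeps the data flowing without custom code. A second connector, configured in Step 5, moves the data from the Stream to Logging Analytics.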
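To check that metrics are actually reaching the Stream, it can help to decode a message by hand. The following is a minimal sketch, assuming the message value is base64-encoded JSON, which is how the Streaming API returns message values. The function name and the sample message are hypothetical and built locally, so the sketch runs without an OCI account.

```python
import base64
import json


def decode_stream_value(encoded_value: str) -> dict:
    """Decode one base64-encoded stream message value into a metric dict."""
    raw = base64.b64decode(encoded_value)
    return json.loads(raw.decode("utf-8"))


# Hypothetical metric message, shaped like the oci_dns sample in Step 3.
sample_metric = {
    "namespace": "oci_dns",
    "name": "DNSQueryCount",
    "dimensions": {"resourceId": "ocid1.dns-zone.oc1..example"},
    "datapoints": [{"timestamp": 1687769100000, "value": 26, "count": 1}],
}

# Simulate what a consumer would receive from the Stream.
encoded = base64.b64encode(json.dumps(sample_metric).encode("utf-8")).decode("ascii")

metric = decode_stream_value(encoded)
print(metric["name"], metric["datapoints"][0]["value"])
```

In a real check, the encoded value would come from a message read from the Stream instead of the locally built sample.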

Step 3: Creating a Log Source and Parser in Logging Analytics

In this step, create a new Log Source and Parser in Logging Analytics Administration. The Parser interprets the log data and extracts the relevant fields from it. You need a sample log to create the JSON Parser. Two samples follow: the first is an oci_dns metric for this use case, and the second is an oci_computeagent metric that you can use as a model for other use cases.

{
  "namespace": "oci_dns",
  "resourceGroup": null,
  "compartmentId": "ocid1.compartment.oc1..aaaaYourIDhere",
  "name": "DNSQueryCount",
  "dimensions": {
    "resourceId": "ocid1.dns-zone.oc1..YourIDhere"
  },
  "metadata": {},
  "datapoints": [
    {
      "timestamp": 1687769100000,
      "value": 26,
      "count": 1
    }
  ]
}

 

 

{
  "namespace": "oci_computeagent",
  "resourceGroup": null,
  "compartmentId": "ocid1.compartment.oc1..aaaaaaaagYourIDhere",
  "name": "LoadAverage",
  "dimensions": {
    "instancePoolId": "Default",
    "resourceDisplayName": "Arkime",
    "faultDomain": "FAULT-DOMAIN-1",
    "resourceId": "ocid1.instance.oc1.eu-frankfurt-1.aYourIDhere",
    "availabilityDomain": "NoEK:EU-FRANKFURT-1-AD-1",
    "imageId": "ocid1.image.oc1.eu-frankfurt-1.aaaaaaYourIDhere",
    "region": "eu-frankfurt-1",
    "shape": "VM.Standard.E4.Flex"
  },
  "metadata": {
    "displayName": "Load Average",
    "unit": "NumberOfProcesses"
  },
  "datapoints": [
    {
      "timestamp": 1687513887000,
      "value": 0,
      "count": 1
    },
    {
      "timestamp": 1687513897000,
      "value": 0,
      "count": 1
    },
    {
      "timestamp": 1687513907000,
      "value": 0,
      "count": 1
    },
    {
      "timestamp": 1687513917010,
      "value": 0,
      "count": 1
    },
    {
      "timestamp": 1687513927009,
      "value": 0,
      "count": 1
    },
    {
      "timestamp": 1687513937001,
      "value": 0,
      "count": 1
    }
  ]
}

 

In this scenario, I used the extended metrics from the oci_computeagent namespace as the parser sample, because their dimensions carry multiple values. I then mapped each JSON path to a field. The datapoint values matter most here, so create the field for them with the Float data type, which allows the values to be used in computations. When creating custom fields, also determine whether the field in question is a Multi-Valued Field.
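To make the mapping concrete, here is a minimal sketch of the kind of flattening the parser performs: one record per datapoint, each dimension as its own field, and the datapoint value stored as a float. This is an illustration only, not the Logging Analytics parser itself. The function name and the output field names are hypothetical.

```python
from datetime import datetime, timezone


def flatten_metric(metric: dict) -> list:
    """Return one flat record per datapoint, with the value as a float."""
    records = []
    for point in metric.get("datapoints", []):
        record = {
            "namespace": metric["namespace"],
            "name": metric["name"],
            # Metric timestamps are epoch milliseconds.
            "timestamp": datetime.fromtimestamp(
                point["timestamp"] / 1000, tz=timezone.utc
            ).isoformat(),
            # Float, so the value can be summed or averaged later.
            "value": float(point["value"]),
            "count": point["count"],
        }
        # Each dimension becomes its own field on the record.
        for key, val in metric.get("dimensions", {}).items():
            record["dimensions." + key] = val
        records.append(record)
    return records


dns_metric = {
    "namespace": "oci_dns",
    "name": "DNSQueryCount",
    "dimensions": {"resourceId": "ocid1.dns-zone.oc1..YourIDhere"},
    "datapoints": [{"timestamp": 1687769100000, "value": 26, "count": 1}],
}

records = flatten_metric(dns_metric)
print(records[0])
```

A metric with several datapoints, such as the oci_computeagent sample, yields several records that share the same dimension fields.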

Step 4: Defining Custom Fields in Logging Analytics

Custom fields are essential for processing and analyzing data effectively. Create custom fields using the "+" option near the Field name or through the Administration section.

Step 5: Configure Another Service Connector Hub for Data transfer

After creating custom fields, the next step is to set up another connector in Service Connector Hub to move the data from Streaming to Logging Analytics, using the Log Source created in Step 3. With all of this in place, after a few minutes the metric data points appear as logs in Logging Analytics.

Step 6: Create a Lookup for Mapping OCID to Zone Name

In this step, we address the challenge of identifying DNS Zones by creating a lookup that maps the OCID of each DNS zone to its Zone Name. The query in Step 7 relies on this mapping. Upload a CSV file with the OCID in the first column and the zone name in the second column. For multiple zones, add one row per zone, using the same delimiter throughout.
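The lookup file can be written by hand, but generating it avoids delimiter mistakes. The sketch below builds the CSV with Python's standard csv module. The zone OCIDs and names are hypothetical placeholders. The header row and its column names are assumptions, chosen to match the lookup columns referenced by the query in Step 7.

```python
import csv
import io

# Hypothetical zones; replace with your own zone OCIDs and names.
zones = {
    "ocid1.dns-zone.oc1..exampleA": "example.com",
    "ocid1.dns-zone.oc1..exampleB": "internal.example.com",
}


def build_lookup_csv(zone_names: dict) -> str:
    """Return CSV text with one row per zone: OCID first, zone name second."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Assumed header names, matching the fields used in the Step 7 query.
    writer.writerow(["Monitoring.Dimensions.ResourceID", "Zone Name"])
    for ocid, name in zone_names.items():
        writer.writerow([ocid, name])
    return buffer.getvalue()


csv_text = build_lookup_csv(zones)
print(csv_text)
```

Writing csv_text to a file gives the CSV to upload as the lookup.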

Step 7: Craft a Query to Display Total DNS Queries

Now that we have the lookup in place, create a query that displays the total number of DNS queries per zone, replacing the Resource ID with the Zone Name from the lookup. The result of this query is the data that the detection rule in Step 8 sends back to Monitoring.

Figure 4:  Query creation for DNS zones
'Log Source' = OCI_Monitoring and Monitoring_Metric_Name = DNSQueryCount
| lookup table = DNS select 'Monitoring.Dimensions.ResourceID',
   'Zone Name' using Monitoring.Dimensions.ResourceID = 'Monitoring.Dimensions.ResourceID'
| stats sum(Monitoring_Datapoints_Value) as 'DNS Queries' by 'Zone Name'
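
For readers less familiar with the query language, the logic of the query above can be sketched in plain Python: look up the zone name for each record's resource ID, then sum the datapoint values per zone. This is an illustration of the computation, not how Logging Analytics executes it. The record layout, the sample data, and the function name are hypothetical, and records without a lookup match are simply skipped here, which may differ from the service's behavior.

```python
from collections import defaultdict

# Hypothetical lookup: zone OCID to zone name.
lookup = {
    "ocid1.dns-zone.oc1..exampleA": "example.com",
    "ocid1.dns-zone.oc1..exampleB": "internal.example.com",
}

# Hypothetical parsed log records, one per metric datapoint.
records = [
    {"resource_id": "ocid1.dns-zone.oc1..exampleA", "value": 26.0},
    {"resource_id": "ocid1.dns-zone.oc1..exampleA", "value": 12.0},
    {"resource_id": "ocid1.dns-zone.oc1..exampleB", "value": 5.0},
    {"resource_id": "ocid1.dns-zone.oc1..unknown", "value": 3.0},
]


def dns_queries_by_zone(records: list, lookup: dict) -> dict:
    """Sum datapoint values per zone name, like 'stats sum(...) by Zone Name'."""
    totals = defaultdict(float)
    for record in records:
        zone_name = lookup.get(record["resource_id"])
        if zone_name is None:
            continue  # simplification: ignore zones missing from the lookup
        totals[zone_name] += record["value"]
    return dict(totals)


totals = dns_queries_by_zone(records, lookup)
print(totals)
```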


Step 8: Create a Detection Rule and Alarm

With the computed data available, we can create a Detection Rule that will send the data back to Monitoring, allowing us to set up an Alarm for timely notifications. These alerts will serve as valuable indicators of potential issues or anomalies, with custom fields populated from the lookup.

Check that the required policies are in place, and specify where you want to send the metrics. For the policies, see "Enable Access to Logging Analytics and Its Resources" in the Oracle documentation (oracle.com).

Figure 5:  Alert creation of DNS query count

 

Congratulations! The alarm is created, and you will receive the notification with the proper Zone Name.

 

Figure 6:  Alert with zone names from Logging Analytics

 

Figure 7:  Email alarm with DNS Zone name

 

In conclusion, managing multiple DNS Zones within Oracle Cloud Infrastructure Monitoring can present unique challenges if you have advanced requirements. However, with the help of extended metrics, Logging Analytics, customized fields, and a well-structured lookup, it is possible to devise a streamlined solution for monitoring and alerting needs.

 

Resources

For further insights into the realm of Oracle Cloud Infrastructure's Observability & Management solutions, please refer to the links below:

Oracle Cloud Infrastructure Monitoring

Oracle Cloud Infrastructure Logging Analytics

Oracle Cloud Infrastructure Streaming

Oracle Cloud Infrastructure Connector Hub

These resources offer a wealth of information and a detailed guide to Oracle's suite of observability tools, paving the way for effective and efficient cloud infrastructure management.  Happy monitoring!

 

 
