Implementing an observability strategy requires collecting IT environment data such as metrics, logs, events, or traces. Solid observability means using your data and assets wisely so that you can predict outages, excessive resource use, or poor application performance before they happen.

Being proactive is a must in a distributed environment, and saving resources directly affects your budget. OCI Observability & Management (O&M) provides a strategy to make day-to-day activities easier for both DBAs and IT managers. DBAs should be able to provision PDBs, plug and unplug them, and create test and dev environments with the click of a button. IT managers should be able to forecast the budget based on current resource utilization and trend analysis.

O&M services improve communication between teams and departments by offering single-pane-of-glass options that optimize their interaction. They also make resource needs evident by correlating costs with each resource. Unlike basic monitoring tools, O&M solutions can forecast resource utilization and notify customers days before a resource outage. Here is an example of a CPU usage forecast across all databases.

 

Figure 1:  Forecasting CPU usage

 

For Engineered Systems environments, you can reduce outages by clustering events and data by similarity, identifying critical patterns, and comparing trends to prevent error propagation.

 

Figure 2:  Clustering events and data to prevent error propagation
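
As a rough illustration of clustering events by similarity, the Python sketch below groups messages whose word sets overlap strongly (Jaccard similarity). The greedy single-pass approach, the 0.6 threshold, and the sample messages are assumptions chosen for brevity. They do not describe how Logging Analytics clusters events.

```python
import re


def tokens(message):
    """Lowercased word set with digits masked, so 'disk 3' and 'disk 7' match."""
    return frozenset(re.sub(r"\d+", "#", message.lower()).split())


def jaccard(a, b):
    """Share of tokens two messages have in common (0.0 to 1.0)."""
    return len(a & b) / len(a | b) if a | b else 1.0


def cluster_events(messages, min_similarity=0.6):
    """Greedy single-pass clustering: each message joins the first cluster
    whose representative is similar enough, otherwise it starts a new one."""
    clusters = []  # list of (representative_tokens, member_messages)
    for msg in messages:
        t = tokens(msg)
        for rep, members in clusters:
            if jaccard(rep, t) >= min_similarity:
                members.append(msg)
                break
        else:
            clusters.append((t, [msg]))
    return [members for _, members in clusters]


# Hypothetical event messages, for illustration only.
events = [
    "Cell disk 3 read latency high",
    "Cell disk 7 read latency high",
    "Flash cache write error on cell 2",
    "Cell disk 12 read latency high",
    "Flash cache write error on cell 5",
]
for group in sorted(cluster_events(events), key=len, reverse=True):
    print(len(group), "x", group[0])
```

Collapsing five raw events into two patterns is what makes a recurring problem stand out before it spreads.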

 

In this blog, we explain how to leverage the different Oracle tools and services for your O&M needs, specifically for database environments. 

 

Typical database environments to leverage O&M tools


Observability for on-premises/hybrid

In a hybrid scenario, some databases are on-premises and some are in OCI.

Figure 3:  Hybrid scenario example 

In a hybrid environment, with on-premises and Cloud databases, Enterprise Manager (EM) is the optimal solution for these reasons:

  • EM is on-premises, so it makes the discovery of targets easy
  • EM has a rich interface to monitor and administer database targets
  • You benefit from the existing EM topology without rediscovering it from scratch, because the targets have already been discovered in the Enterprise Manager logs

Consider enriching your Observability platform by extending EM with OCI Ops Insights and Logging Analytics services capabilities, such as:

  • Resource Usage Forecast for budget allocation
  • Utilization trend by department/team for chargeback
  • Resource utilization limits early warning
  • SQL degradation analysis
  • Database Alerts and trace log management for root cause analysis

You can also choose to deploy EM on an OCI Compute VM and use the full stack provided in OCI Marketplace. As with an on-premises EM, the OMS and Repository maintenance remains under your control.

Observability for ExaCC

Everything is in your data center, similar to the example shown below:

Figure 4: On-premises example

 

In this scenario, database management is limited to the database layer. However, EM capabilities can be enriched with OCI Ops Insights, Exadata Insights, and Logging Analytics services for the following benefits:

  • Resource Usage Forecast for budget allocation
  • Utilization trend by VM cluster/department/team for chargeback
  • Resource utilization limits early warning
  • SQL degradation analysis
  • Database Alert and trace log management for root cause analysis (importing application logs is also suggested)

Observability for OCI ExaCS/DBCS

In this case, everything runs in OCI.

Figure 5:  Observability for OCI ExaCS/DBCS

Use the native OCI services to save on maintenance for EM components.

Native OCI services like Database Management and Ops Insights are agentless (they can be enabled without an agent) for OCI-based Oracle databases. The Logging Analytics service can be used with Management Agent for security reasons, and it allows you to ingest any file or data from database tables in addition to predefined log sources.

 

Provisioning and database resource management 

Provisioning involves resource creation, termination, definition, assignment, and all other activities that allow the creation and use of a database. This also includes setting resource limits when several teams share resources. The following solutions address provisioning needs:

| Service/Product Name | Target | Description | Additional resources |
| --- | --- | --- | --- |
| Enterprise Manager Cloud Management Pack | On-premises, Autonomous, DBCS, ExaCS, ExaCC | Cloud Management Pack for Oracle Database delivers capabilities spanning the entire database lifecycle. Cloud administrators can set up the Self-Service Portal to identify pooled resources, configure role-based access, define service catalogs, and configure chargeback plans. | Enterprise Manager Cloud Management Pack |
| OCI DBaaS/Exa service console | Autonomous, DBCS, ExaCS, ExaCC | It is possible to use the OCI DBaaS/Exa service console to provision Oracle databases in OCI. | OCI DBaaS/Exa service console for provisioning |
| OCI Resource Manager | Autonomous, DBCS, ExaCS, ExaCC | Resource Manager is an Oracle Cloud Infrastructure service that automates the provisioning of your Oracle Cloud Infrastructure resources. Using Terraform, Resource Manager lets you install, configure, and manage resources through the "infrastructure-as-code" model. | OCI Resource Manager |

 

Real-time monitoring

Real-time monitoring means continuously delivering updated data about systems, processes, or events. Such monitoring streams information at zero or low latency, so there is minimal delay between data collection and analysis. It enables quick detection of anomalies, performance issues, and critical events.
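
The following Python sketch shows one simple way to flag anomalies in a stream of metric samples: compare each new value with the mean and standard deviation of the preceding window. The window size, the z-score limit, and the sample values are assumptions for illustration. The OCI services listed below apply their own detection logic.

```python
from collections import deque
from statistics import mean, stdev


def detect_anomalies(samples, window=5, z_limit=3.0):
    """Yield (index, value) for samples that deviate from the mean of the
    previous `window` samples by more than z_limit standard deviations."""
    recent = deque(maxlen=window)
    for i, value in enumerate(samples):
        if len(recent) == window:
            mu, sigma = mean(recent), stdev(recent)
            if sigma > 0 and abs(value - mu) / sigma > z_limit:
                yield i, value
        recent.append(value)


# Hypothetical CPU % samples arriving one at a time.
cpu = [41, 43, 42, 44, 42, 43, 95, 44, 42]
print(list(detect_anomalies(cpu)))
```

Because each sample is evaluated as it arrives, the spike is reported immediately instead of at the end of a batch. That immediacy is the point of low-latency monitoring.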

Oracle provides the following products/services for real-time monitoring:

| Service/Product Name | Target | Description | Additional resources |
| --- | --- | --- | --- |
| Monitoring | Autonomous, DBCS, ExaCS, ExaCC | OCI Monitoring collects metrics from OCI PaaS/IaaS services. It is enabled by default for all OCI services. | List of metrics collected by default |
| OCI DBaaS/Exa service console | Autonomous, DBCS, ExaCS, ExaCC | The service console offers graphs and basic information about critical metrics such as CPU, memory, and storage. | OCI DBaaS/Exa service console for DBCS and ExaCS; ExaCC Resource Usage Tracking |
| Enterprise Manager – ADDM | On-premises, Autonomous, DBCS, ExaCS, ExaCC | Customer-managed solution, on-premises or in OCI. It is usually used for diagnostics and administration, but it also provides real-time metrics. | List of metrics collected by Enterprise Manager |
| OCI Database Management (opt to Enterprise Manager) | On-premises, DBCS, ExaCS | OCI-managed service that provides complete monitoring, management, and performance tuning of databases across the hybrid fleet. | List of metrics collected by OCI Database Management |
| Stack Monitoring | On-premises, Autonomous, DBCS | Stack Monitoring lets you proactively monitor an application and its underlying tech stack, including application servers, databases, and hosts. Extend monitoring by creating custom metrics with Metric Extensions. | Stack Monitoring for Oracle Database; Creating custom metrics using Metric Extensions |
| OCI Logging Analytics | On-premises, Autonomous, DBCS | It is possible to create metrics based on the alert and trace logs. OCI Logging Analytics offers a list of predefined labels based on alert and trace messages. | Enhancing OCI metrics and creating Alerts using Logging Analytics |
| Third-Party Tools – Service Connector Hub | Autonomous, DBCS, ExaCS, ExaCC | OCI offers full O&M capabilities, but customers who want to use their own tools can integrate them through Service Connector Hub. | Service Connector Hub |

 

Performance and Tuning solutions to evaluate database systems

Performance diagnostics and tuning are critical for any IT environment. For database systems, this activity can be proactive or reactive. For example, when the workload increases, it is critical to tune the affected subsystem promptly, because the extra load can otherwise end in an outage. Performance tuning can be done at the system resource level (CPU, memory, and storage utilization) or at a higher level, such as SQL statement response time. It is important to analyze and correlate the information to correctly identify the bottleneck. Here are Oracle solutions to help with performance evaluation and tuning:

| Service/Product Name | Target | Description | Additional resources |
| --- | --- | --- | --- |
| Enterprise Manager – Diagnostic and Tuning Pack | On-premises, Autonomous, DBCS, ExaCS, ExaCC | Customer-managed solution enabled at the database level, on-premises or in OCI. It provides functionality such as Tuning Advisor and top SQL statement detection. | Performance and Tuning Management Pack |
| OCI Database Management – PerfHub (opt to Enterprise Manager) | On-premises, Autonomous, DBCS, ExaCS | OCI-managed service that offers the same performance and tuning capabilities as the Enterprise Manager Performance and Tuning Pack, but as a completely managed solution. | Database Management Performance Hub |
| Ops Insights SQL Insights and Capacity Planning | On-premises, Autonomous, DBCS, ExaCS, ExaCC (by Enterprise Manager) | OCI Ops Insights allows tracking of metric charts and data collection. It allows the correlation of resources from different infrastructure layers and the prediction of high resource utilization. | OCI Ops Insights SQL; OCI Ops Insights Capacity planning |
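
To illustrate the SQL-level evaluation described in this section, the Python sketch below compares current response times against a baseline and lists the statements that have slowed down. The SQL identifiers, the timings, and the 1.5x factor are hypothetical, and the median comparison is an assumption. It is not the method used by Performance Hub or Ops Insights.

```python
from statistics import median


def degraded_statements(baseline_ms, current_ms, factor=1.5):
    """Return {sql_id: (baseline_median, current_median)} for statements whose
    current median response time exceeds the baseline median by `factor`."""
    result = {}
    for sql_id, current in current_ms.items():
        base = baseline_ms.get(sql_id)
        if not base or not current:
            continue  # no baseline or no current samples to compare
        b, c = median(base), median(current)
        if c > b * factor:
            result[sql_id] = (b, c)
    return result


# Hypothetical response times in milliseconds, keyed by a made-up SQL id.
baseline = {"sql_a": [10, 12, 11], "sql_b": [200, 210, 190]}
current = {"sql_a": [30, 28, 35], "sql_b": [205, 198, 202], "sql_c": [5, 6]}
print(degraded_statements(baseline, current))
```

Using the median keeps a single slow execution from triggering a false alarm. Statements without a baseline, such as `sql_c` here, are skipped and not flagged.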

 

Database fleet administration solutions

Database administrators are responsible for the management and maintenance of company databases. Their duties include maintaining adherence to a data management policy and ensuring that these essential systems are functional. Activities include instance start and stop, database backup and restore, key management, and resource assignment up to a fixed storage level.

The following Oracle tools and solutions are key for a database fleet admin:

| Service/Product Name | Target | Description | Additional resources |
| --- | --- | --- | --- |
| Enterprise Manager | On-premises, Autonomous, DBCS, ExaCS, ExaCC | Customer-managed solution on-premises or in OCI. It provides administration tools from the database to the infrastructure level. | Enterprise Manager |
| OCI Database Management (opt to Enterprise Manager) | On-premises, Autonomous, DBCS, ExaCS | OCI-managed service. It offers a comprehensive list of administrative capabilities, and new ones are introduced in an agile manner. | Database Management |
| OCI DBaaS/Exa service console | Autonomous, DBCS, ExaCS, ExaCC | The OCI DBaaS/Exa service console is embedded in all Cloud PaaS services. It supports basic tasks such as starting, stopping, and terminating instances, backing up and restoring, and handling connections and wallets. | OCI DBaaS/Exa service console |

 

Patching a database fleet safely

Patching is one of the important stages of the product lifecycle. It enables you to keep the software product updated with bug fixes. Oracle releases several types of patches periodically to maintain a secure and bug-free stack. Although OCI simplifies it, patching has always been a challenging phase of the lifecycle because it is complex, time-consuming, and can involve downtime. These Oracle tools/services can reduce the risk and allow database admins to safely patch their database fleet.

| Service/Product Name | Target | Description | Additional resources |
| --- | --- | --- | --- |
| Enterprise Manager Lifecycle Management Pack | On-premises, Autonomous, DBCS, ExaCS, ExaCC | Database Lifecycle Management Pack supports the entire patch management lifecycle, including patch advisories, pre-deployment analysis, rollout, and reporting. It is linked with My Oracle Support to provide a synchronized view of available and recommended patches. It also manages drift and version comparison. | Enterprise Manager Lifecycle Management Pack |
| OCI DBaaS/Exa service console | Autonomous, DBCS, ExaCS, ExaCC | It is possible to use the OCI DBaaS/Exa service console to patch OCI databases and other OCI services. | Patching Oracle database |
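
The drift and version comparison mentioned above can be pictured with a small Python sketch. It treats the most common patch level in the fleet as the standard and reports the databases that differ from it. The database names and version strings are hypothetical, and choosing the most common level as the standard is an assumption. In practice the target level would come from your patching policy.

```python
from collections import Counter


def find_patch_drift(fleet):
    """Given {database_name: patch_level}, treat the most common patch level
    as the fleet standard and return (standard, {database: level}) for the
    databases that differ from it."""
    if not fleet:
        return None, {}
    standard, _ = Counter(fleet.values()).most_common(1)[0]
    drift = {db: level for db, level in fleet.items() if level != standard}
    return standard, drift


# Hypothetical fleet inventory, for illustration only.
fleet = {"sales": "19.21", "hr": "19.21", "finance": "19.18", "test01": "19.21"}
standard, drift = find_patch_drift(fleet)
print("standard:", standard)
print("drifted:", drift)
```

A report like this narrows a fleet-wide patch rollout down to the databases that actually need attention.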

 

Cost control and chargeback provide visibility into IT services

Cost control is the practice of identifying and reducing business expenses to increase profits, and it starts with the budgeting process. Cost control is an important factor in maintaining and growing profitability. IT chargeback can provide greater visibility into the costs of IT services and infrastructure usage, enabling organizations to identify opportunities for cost optimization and reduce wasteful spending. Cost control and chargeback are important topics for any company that adopts the cloud or takes on a new FinOps challenge. In this scenario, savings in consumption are directly connected to the business.

| Service/Product Name | Target | Description | Additional resources |
| --- | --- | --- | --- |
| Enterprise Manager Chargeback | On-premises, Autonomous, DBCS, ExaCS, ExaCC | Customer-managed solution on-premises or in OCI. It offers deep drill-down and metric correlation. | Enterprise Manager Chargeback |
| Ops Insights Capacity Planning | On-premises, Autonomous, DBCS, ExaCS, ExaCC (by Enterprise Manager) | OCI-managed service that can predict resource consumption for one year. Using tagging and other built-in grouping functionality, it is possible to associate the forecast and consumption with specific departments, users, applications, and/or VM clusters. | Ops Insights Capacity Planning |
| Cost Analysis | Autonomous, DBCS, ExaCS, ExaCC | Cost Analysis is an easy-to-use visualization tool to help you track and optimize your Oracle Cloud Infrastructure spending. It lets you generate charts and download accurate, reliable tabular reports of aggregated cost data. Using tagging, it is also possible to associate the forecast and the consumption with a specific VM cluster or DBCS (no DB or PDB visibility). | OCI Cost Analysis |
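
Tag-based chargeback, as described in the table above, amounts to grouping cost records by a tag value. The Python sketch below shows the idea on hypothetical records. The record layout, the tag key `department`, and the resource names are assumptions and do not reflect the format of OCI cost reports.

```python
from collections import defaultdict


def chargeback_by_tag(usage_records, tag_key="department"):
    """Sum cost per value of tag_key; resources without the tag are grouped
    under 'untagged' so that unallocated spend stays visible."""
    totals = defaultdict(float)
    for record in usage_records:
        owner = record.get("tags", {}).get(tag_key, "untagged")
        totals[owner] += record["cost"]
    return dict(totals)


# Hypothetical cost records, for illustration only.
records = [
    {"resource": "vmcluster1", "cost": 120.0, "tags": {"department": "sales"}},
    {"resource": "dbcs-hr", "cost": 80.0, "tags": {"department": "hr"}},
    {"resource": "vmcluster2", "cost": 40.5, "tags": {"department": "sales"}},
    {"resource": "dbcs-tmp", "cost": 10.0, "tags": {}},
]
print(chargeback_by_tag(records))
```

Keeping an explicit "untagged" bucket is a deliberate choice: it shows how much spend cannot yet be charged back and points to the resources that still need tags.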

 

Troubleshooting to identify root cause and prevent outages

Database issues can happen on several levels. To identify root causes, it is important to be able to correlate resources, drill down into the issues, and analyze trends in the systems. The root cause can be an application, so it is important to also have visibility into application information. Troubleshooting also helps avoid error propagation, which is why it is important to notice issues as early as possible.

All these Oracle tools are important for troubleshooting and preventing outages:

| Service/Product Name | Target | Description | Additional resources |
| --- | --- | --- | --- |
| Enterprise Manager | On-premises, Autonomous, DBCS, ExaCS, ExaCC | Customer-managed solution on-premises or in OCI. It offers deep drill-down and metric correlation. For example, it can retrieve the top SQL statement or the blocking session. It is possible to see and manage tablespaces and data files, users, and DB parameters. Enterprise Manager lets you drill down from the database to the physical host or the user domain (depending on which system you are monitoring). | Enterprise Manager |
| OCI Database Management (opt to Enterprise Manager) | On-premises, Autonomous, DBCS, ExaCS | OCI-managed service that lets you drill down and correlate metrics and data from different layers. There is built-in integration with other O&M services (for example, Ops Insights). For example, it is possible to compare SQL statement response time with the baseline to check whether there has been a performance degradation. | Database Management |
| Logging Analytics | On-premises, Autonomous, DBCS, ExaCS, ExaCC | OCI Logging Analytics can handle log events generated by all software applications and infrastructure in the cloud or on-premises. For databases, log message severity is pre-classified based on Oracle expert experience. It is possible to set alerts for critical events to be proactive in case of issues. For example, 1407 labels are defined for DB Audit logs, 767 labels for DB Audit (db tbl), 124 labels for DB Alert, and 43 labels for DB Trace, among others. | OCI Logging Analytics; OCI Logging Analytics for Exa |
| Ops Insights | On-premises, Autonomous, DBCS, ExaCS, ExaCC (by Enterprise Manager) | OCI Ops Insights allows tracking of metric charts and data collection. It allows correlating resources from different infrastructure layers. It is possible to set an early warning alert to know days in advance if systems are running out of resources, which helps prevent outages. | OCI Ops Insights; OCI Exadata Insights |
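
The idea of pre-classified log severity can be sketched in a few lines of Python: each rule maps a message pattern to a severity and a label, and the first matching rule wins. The three rules and the sample lines below are hypothetical and far smaller than the predefined label sets that Logging Analytics provides. The severities assigned here are illustrative choices, not Oracle's classification.

```python
import re

# Hypothetical label rules, ordered from most to least specific.
LABEL_RULES = [
    (re.compile(r"ORA-0*600\b"), "critical", "internal error"),
    (re.compile(r"ORA-0*1555\b"), "warning", "snapshot too old"),
    (re.compile(r"ORA-\d+"), "error", "other ORA error"),
]


def label_line(line):
    """Return (severity, label) from the first rule that matches the line."""
    for pattern, severity, label in LABEL_RULES:
        if pattern.search(line):
            return severity, label
    return "info", "unlabeled"


# Hypothetical alert log lines, for illustration only.
sample_lines = [
    "ORA-00600: internal error code",
    "ORA-01555: snapshot too old",
    "ORA-12345: some other error",
    "Thread 1 advanced to log sequence 42",
]
for line in sample_lines:
    print(label_line(line), line)
```

Once every line carries a severity, alerting on critical events becomes a simple filter, which is what makes proactive notification practical.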

 

Oracle offers several tools to manage and monitor large database fleets. It is important to extend EM with the O&M diagnostic capabilities offered in OCI, such as forecasting and pattern identification. To explore more and perform hands-on lab activities on Oracle Cloud for free, use an Oracle Cloud Free Tier account.

Resources: