Oracle Cloud Infrastructure (OCI) OpenSearch is a managed service that makes it easy for customers to ingest, search, visualize, and analyze data in near-real time. With OpenSearch, you search an index rather than the underlying data directly. OCI OpenSearch has many use cases, but applications typically use it to search large datasets and to run log analytics for quick investigation and debugging.
OCI OpenSearch is derived from the Apache 2.0-licensed Elasticsearch 7.10.2 and Kibana 7.10.2. This community-driven, open source search and analytics suite is built on Apache Lucene, a Java-based search and indexing library. OpenSearch has the following primary use cases:
Application performance monitoring
The following architecture consists of a simplified multitier, highly available application. The frontend application is deployed on an Oracle Container Engine for Kubernetes (OKE) cluster. The backend application is deployed on-premises in a customer-managed Kubernetes cluster. Fluentbit and Fluentd are deployed as the log collector and log aggregator, respectively.
The end user accesses the frontend application deployed on the Kubernetes cluster in the cloud environment.
The request is received on the active load balancer, and an application pod processes it.
To fulfill the request, the frontend application pod fetches user data from the backend Kubernetes cluster.
Fluentbit runs as a DaemonSet on the Kubernetes cluster deployed on OCI to collect logs from all the pods and push them to Fluentd.
Fluentd runs as a Deployment on the Kubernetes cluster deployed on OCI to aggregate the logs received from Fluentbit and ingest them into the OpenSearch endpoint.
Fluentd runs as a DaemonSet on the Kubernetes cluster deployed on the customer premises to collect logs from all the pods and ingest them into the OpenSearch endpoint.
If an incident occurs, the application owner can log in to the OCI OpenSearch dashboard and search for error codes.
OCI OpenSearch fetches the matching search results in near-real time.
The application owner identifies a potential incident and can take corrective action on the specific node, pod, or application.
Using the OCI OpenSearch service dashboard, the user can build graphical representations of the data as artifacts related to the incident.
The OCI OpenSearch REST API is accessible on port 9200.
The OCI OpenSearch dashboards are accessible on port 5601.
Figure 1. Reference architecture for log ingestion
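To make the ingestion step more concrete, the following minimal sketch shows the kind of request a log aggregator ultimately sends to the REST API on port 9200: a newline-delimited body for the `_bulk` API. It is an illustration, not the Fluentd output plugin itself. The endpoint hostname, the index name `app-logs`, and the record fields are hypothetical, and authentication is omitted.

```python
import json
from urllib import request


def to_bulk_ndjson(index, records):
    """Serialize log records into the newline-delimited body used by the _bulk API."""
    lines = []
    for record in records:
        # Each document is preceded by an action line naming the target index.
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(record))
    if not lines:
        return ""
    # The bulk body must end with a trailing newline.
    return "\n".join(lines) + "\n"


def build_bulk_request(endpoint, body):
    """Prepare (but do not send) a POST request to the cluster's _bulk endpoint."""
    return request.Request(
        url=f"{endpoint}/_bulk",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )


# Hypothetical values, for illustration only.
ENDPOINT = "https://opensearch.example.internal:9200"
records = [
    {"log": "GET /cart 500 error", "kubernetes": {"pod_name": "frontend-abc"}},
    {"log": "GET /health 200 ok", "kubernetes": {"pod_name": "frontend-abc"}},
]

body = to_bulk_ndjson("app-logs", records)
req = build_bulk_request(ENDPOINT, body)
print(req.get_method(), req.full_url)
print(body, end="")
```

The request is only constructed here. Sending it with `urllib.request.urlopen(req)` would require a reachable cluster and valid credentials.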
Fluentbit is a fast, highly scalable logging and metrics processor and forwarder. Its lightweight, asynchronous design optimizes resource usage: CPU, memory, disk I/O, and network. Fluentbit is suitable for highly distributed environments where limited capacity and reduced overhead (memory and CPU) are major considerations. It has a smaller ecosystem than Fluentd. Inputs include syslog, TCP, systemd/journald, CPU, memory, and disk.
Fluentd is an open source data collector that lets you unify data collection and consumption for better use and understanding of data. It has developed a rich ecosystem of more than 700 plugins that extend its functionality. Fluentd uses disk or memory for buffering and queuing to handle transmission failures or data overload, and it supports multiple configuration options to ensure a more resilient data pipeline.
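The buffering and retry idea can be sketched in a few lines. The following toy model is not Fluentd's implementation, and the class name and its parameters are invented for illustration. It groups records into chunks, queues them in memory, and retries a failed send with exponentially growing wait times, keeping the chunk buffered if the retries run out.

```python
from collections import deque


class BufferedForwarder:
    """Toy model of buffered forwarding with retry; not Fluentd's actual code."""

    def __init__(self, send, chunk_size=2, max_retries=3, base_wait=1.0):
        self.send = send              # callable that raises ConnectionError on failure
        self.chunk_size = chunk_size
        self.max_retries = max_retries
        self.base_wait = base_wait
        self.queue = deque()          # full chunks waiting to be flushed, oldest first
        self.current = []             # chunk currently being filled
        self.waits = []               # backoff intervals a real forwarder would sleep

    def emit(self, record):
        """Add a record; move the chunk to the queue once it is full."""
        self.current.append(record)
        if len(self.current) >= self.chunk_size:
            self.queue.append(self.current)
            self.current = []

    def flush(self):
        """Send queued chunks in order; return False if a chunk exhausts its retries."""
        while self.queue:
            chunk = self.queue[0]
            for attempt in range(self.max_retries):
                try:
                    self.send(chunk)
                    self.queue.popleft()
                    break
                except ConnectionError:
                    # Exponential backoff: base, 2 * base, 4 * base, ...
                    self.waits.append(self.base_wait * 2 ** attempt)
            else:
                return False  # chunk stays buffered for a later flush
        return True


delivered = []
failures = iter([True, True])  # simulate an endpoint that rejects the first two sends


def flaky_send(chunk):
    if next(failures, False):
        raise ConnectionError("endpoint unavailable")
    delivered.extend(chunk)


forwarder = BufferedForwarder(flaky_send)
for line in ["a", "b", "c", "d"]:
    forwarder.emit(line)
ok = forwarder.flush()
print(ok, delivered, forwarder.waits)
```

The sketch records the backoff intervals instead of sleeping so that it runs instantly. A real pipeline would wait between attempts and could spill chunks to disk rather than hold them in memory.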
Because the logs are published in near-real time on an OpenSearch dashboard, we can search for specific information. In the following example, I search for the keyword “error” within a specific time frame on a configurable timeline. As the results are populated, I can identify the node IP and pod ID to mitigate the issue.
Figure 2. OpenSearch dashboard with "error" as the search criteria
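The same search can also be expressed against the REST API. The sketch below builds a query body that matches a keyword within a time range and then pulls distinct node and pod identifiers out of a response. The field names `log`, `@timestamp`, `kubernetes.host`, and `kubernetes.pod_id` are assumptions about how the collectors label records and may differ in your pipeline. The sample response is hand-written for illustration, not real cluster output.

```python
import json


def build_error_query(keyword, start, end, size=50):
    """Build a _search body: keyword match on the log line, filtered by time range."""
    return {
        "size": size,
        "sort": [{"@timestamp": {"order": "desc"}}],
        "query": {
            "bool": {
                "must": [{"match": {"log": keyword}}],
                "filter": [{"range": {"@timestamp": {"gte": start, "lte": end}}}],
            }
        },
    }


def affected_pods(response):
    """Collect distinct (node, pod) pairs from a _search response body."""
    pairs = set()
    for hit in response.get("hits", {}).get("hits", []):
        k8s = hit.get("_source", {}).get("kubernetes", {})
        node, pod = k8s.get("host"), k8s.get("pod_id")
        if node is not None and pod is not None:
            pairs.add((node, pod))
    return sorted(pairs)


query = build_error_query("error", "now-15m", "now")
print(json.dumps(query, indent=2))

# Hand-written sample shaped like a _search response, for illustration only.
sample_response = {
    "hits": {
        "hits": [
            {"_source": {"log": "500 error", "kubernetes": {"host": "10.0.1.15", "pod_id": "pod-1"}}},
            {"_source": {"log": "db error", "kubernetes": {"host": "10.0.1.15", "pod_id": "pod-1"}}},
            {"_source": {"log": "timeout error", "kubernetes": {"host": "10.0.1.22", "pod_id": "pod-7"}}},
        ]
    }
}
print(affected_pods(sample_response))
```

The query body would be sent as JSON in a POST to the index's `_search` path on port 9200. The time bounds use date math relative to `now`, but absolute timestamps work as well.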
OCI OpenSearch helps you investigate an issue in near-real time. With it, the application owner can make proactive decisions to improve the end user experience. Using the visualize feature, you can tell a story about your data and focus on what matters for identifying issues. Build your own customized solution using the Oracle Cloud Free Tier or a 30-day free trial, which includes US$300 in credit to get you started with a range of services, including compute, storage, and networking.
For more information on the configuration of log ingestion, see the following resources: