In today’s digital world, customers expect applications to be always available and responsive, and to provide a superior end-user experience. As the first gateway between users and an application, load balancers are a critical piece of any scalable application infrastructure. An unhealthy or improperly configured load balancer can cause degraded user experiences like higher latency, reachability errors, or, much worse, an application outage, which often leads to customer churn and lost business. It's imperative to have meaningful metrics on your load balancer that can provide insights on the health of your application and help remediate issues faster.
The Oracle Cloud Infrastructure Load Balancing service provides a set of metrics that help you proactively monitor the health and load of your Oracle load balancer infrastructure. These metrics measure the number and type of connections, the HTTP responses, and the quantity of data managed by your load balancer. They are statistics calculated from relevant data points as an ordered set of time-series data, and they are grouped by component: load balancer, listener, and backend set.
The service metrics are an integral part of the Oracle Cloud Infrastructure Monitoring service and are automatically available for any load balancer that you create in your tenancy. You don't need to enable monitoring on the resource to get these metrics. In the Oracle Cloud Infrastructure Console, you can view the metrics details for load balancers in your compartment by selecting Monitoring > Service Metrics from the navigation menu, selecting your compartment, and then selecting oci_lbaas from the Metric Namespace menu.
Figure 1: Load Balancing Service Metrics
You can further filter or group the service metrics by dimensions such as availability domain, backend set name, listener name, region, or OCID. You do this by adding a dimension filter under the Dimensions option in the Service Metrics page.
Figure 2: Filtering Based on Metric Dimensions
The metric charts refresh automatically every minute. You can change the chart interval to one minute, five minutes, or one hour, and you can change the aggregate statistic (Rate, Mean, Sum, and so on) by choosing an option from the Statistic menu.
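The metric name, interval, dimension filter, and statistic that you pick in the Console correspond to a single query expression. The following is a minimal sketch of how such an expression could be composed, assuming the Monitoring Query Language form Metric[interval]{dimension = "value"}.statistic(). The helper build_mql is hypothetical, and the metric name HttpResponses502, the dimension key backendSetName, and the value example-backend-set are illustrative assumptions that you should check against the Load Balancing Metrics documentation.

```python
def build_mql(metric, interval="1m", statistic="sum", dimensions=None):
    """Compose a metric query string (hypothetical helper, not part of any SDK)."""
    dimension_filter = ""
    if dimensions:
        # Each dimension becomes name = "value"; several filters are comma-separated.
        pairs = ", ".join(
            f'{name} = "{value}"' for name, value in dimensions.items()
        )
        dimension_filter = "{" + pairs + "}"
    return f"{metric}[{interval}]{dimension_filter}.{statistic}()"


# Assumed metric and dimension names, used only for illustration.
query = build_mql(
    "HttpResponses502",
    interval="5m",
    statistic="sum",
    dimensions={"backendSetName": "example-backend-set"},
)
print(query)
# HttpResponses502[5m]{backendSetName = "example-backend-set"}.sum()
```

Keeping the query as a plain string makes it easy to reuse the same expression on a chart and in an alarm.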
You can also view the metrics for a load balancer by navigating to the details page for the load balancer and accessing the Metrics tab under Resources. Similarly, you can view the metrics for a specific backend set by navigating to the Metrics tab on the backend set's details page. The Load Balancing service metrics are also available through the Monitoring API endpoint. You can use the Monitoring API to manage metric queries, alarms, and the performance of your load balancing resources.
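When you query metrics through the Monitoring API instead of the Console, you supply the namespace, a query expression, and a time window. The sketch below only builds such a request body and does not call the service. The helper name summarize_request_body is hypothetical, and the field names (namespace, query, startTime, endTime) reflect our reading of the API's metric-data summary request, so treat them as assumptions and confirm them in the Monitoring API reference.

```python
import json
from datetime import datetime, timedelta, timezone


def summarize_request_body(query, end_time, lookback_minutes=60,
                           namespace="oci_lbaas"):
    """Build a JSON-serializable request body (field names are assumptions)."""
    end_utc = end_time.astimezone(timezone.utc)
    start_utc = end_utc - timedelta(minutes=lookback_minutes)
    timestamp_format = "%Y-%m-%dT%H:%M:%SZ"  # RFC 3339 style, UTC
    return {
        "namespace": namespace,
        "query": query,
        "startTime": start_utc.strftime(timestamp_format),
        "endTime": end_utc.strftime(timestamp_format),
    }


# Sample window: the hour ending at an arbitrary, fixed UTC time.
body = summarize_request_body(
    "HttpResponses502[1m].sum()",
    datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
)
print(json.dumps(body, indent=2))
```

You would pass a body like this to the SDK or REST client of your choice, along with the compartment that contains the load balancer.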
One of the key objectives of our Monitoring service is to deliver metrics that provide actionable insights that enable you to deliver a great digital experience for your end users. For example, you can use the load balancing metrics to understand your baseline performance metrics, such as the average/peak-time traffic trends over time. The metrics can also be used as a demand signal for business decisions such as future-capacity planning.
Let's walk through an example scenario. You are the head of operations for a travel website hosted on Oracle Cloud Infrastructure. Your business has been running a social media campaign for a summer travel sale. You have been tasked to ensure that users have a great digital experience and that no business is lost because of application infrastructure issues. You wonder, can load balancing metrics help in this scenario? Absolutely! You can leverage the Inbound Requests, Active Connections, and Bytes Received load balancing metrics, in addition to your compute metrics, to gather insights on incoming traffic patterns and predict load balancer/compute capacity needs. The service metrics enable you to make data-driven decisions and dynamically adjust to the changing needs of your application infrastructure.
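To make the capacity-planning idea concrete, here is a small sketch that summarizes a series of request counts into an average, a peak, and a rough capacity target. The sample numbers are invented for illustration, and the helper traffic_baseline and its headroom factor of 1.5 are hypothetical choices, not a recommendation from the Load Balancing service.

```python
from statistics import mean


def traffic_baseline(samples, headroom=1.5):
    """Summarize (timestamp, request_count) samples into baseline figures."""
    counts = [count for _, count in samples]
    peak_time, peak = max(samples, key=lambda sample: sample[1])
    return {
        "average": mean(counts),
        "peak": peak,
        "peak_time": peak_time,
        # Rough target: observed peak plus a safety margin (assumed factor).
        "suggested_capacity": peak * headroom,
    }


# Invented inbound-request counts for one campaign day.
samples = [("09:00", 1200), ("12:00", 3400), ("15:00", 2600), ("18:00", 4800)]
summary = traffic_baseline(samples)
print(summary)
```

In practice you would feed this kind of calculation with data points exported from the Inbound Requests or Active Connections metrics rather than hand-typed values.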
Apart from proactive monitoring and management, load balancing metrics also help you to identify, isolate, and troubleshoot issues with your load balancer infrastructure. In this example scenario, you are deploying a new web application, ociexample.com, with an Oracle Cloud Infrastructure public load balancer as the front end in your development environment. However, when you try to access the application, you see an HTTP 502 response on the browser. Let's explore how load balancing metrics can help you troubleshoot this issue.
When you browse to the load-balanced IP address, you see a 502 Bad Gateway error.
You can confirm this behavior by running a curl test: curl -v http://ociexample.com
    > GET / HTTP/1.1
    > Host: 220.127.116.11
    > User-Agent: curl/7.54.0
    > Accept: */*
    >
    < HTTP/1.1 502 Bad Gateway
    < Content-Type: text/html
    < Content-Length: 161
    < Connection: keep-alive
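If you repeat this test often, it can help to pull the status code out of the verbose output automatically. The following sketch assumes that you have captured the output of curl -v as text, and that response lines are prefixed with "<" as in the output above. The function name response_status is hypothetical.

```python
import re


def response_status(verbose_output):
    """Return (status_code, reason) from the first response status line, or None."""
    for line in verbose_output.splitlines():
        # Response headers in curl's verbose output start with "<".
        match = re.match(r"^<\s*HTTP/\d(?:\.\d)?\s+(\d{3})\s*(.*)$", line.strip())
        if match:
            return int(match.group(1)), match.group(2).strip()
    return None


sample = "\n".join([
    "> GET / HTTP/1.1",
    "> Accept: */*",
    ">",
    "< HTTP/1.1 502 Bad Gateway",
    "< Content-Type: text/html",
])
print(response_status(sample))
# (502, 'Bad Gateway')
```

A 502 returned by the load balancer, as opposed to an error page from the application itself, is a hint to look at the backend servers next.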
In the Oracle Cloud Infrastructure Console, navigate to Monitoring > Service Metrics. Select your compartment and select oci_lbaas as the metric namespace. You will notice that an HTTP 502 response appears for each curl or browser test.
Navigate to the Load Balancer Details page, and note that the load balancer backend set health is critical.
If you run the same curl test against the IP address of the instance, you get the following error:
    Connection failed
    connect to 18.104.22.168 port 80 failed: Connection refused
    Failed to connect to 22.214.171.124 port 80: Connection refused
    Closing connection 0
However, if you log in to the backend instance via SSH and run curl -v http://127.0.0.1/ there, you get an HTTP 200 OK response.
    HTTP/1.1 200 OK
    Server: Apache/2.4.6 ()
    Accept-Ranges: bytes
    Content-Length: 5 charset=UTF-8
In this scenario, the host firewall on the instance is blocking inbound traffic on port 80 from outside the instance, so only local requests succeed.
To resolve this issue, open port 80 on the host firewall by running firewall-offline-cmd --add-port=80/tcp, and then restart the firewall service by running systemctl restart firewalld so that the change takes effect.
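After you change the firewall, you can confirm from another machine that the port now accepts connections. The following is a minimal sketch of a TCP reachability check that uses only the Python standard library. The function name port_is_open and the three-second timeout are our own choices for illustration.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds, else False."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, port_is_open("ociexample.com", 80) should return False while the firewall blocks the port and True after the port is open. Note that this check confirms only that something accepts TCP connections on the port, not that the web server returns a healthy response.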
The Monitoring service provides alarm and notification functionality that is tightly integrated with the metrics. We recommend setting up alarms and notifications so that you are notified promptly when metrics deviate from your baseline. Let's walk through the steps to create an alarm and a notification for the HTTP 502 responses in our previous example.
On the metric chart for which you want to create an alarm, click Options and then select Create an Alarm on this Query.
Select a name, severity, and message body for the alarm message.
Keep the Compartment, Metric Namespace, and Metric Name values the same, but adjust the interval and statistic as needed. You can optionally set up a metrics dimension, such as region, to filter the alarm. Create a trigger rule condition to enable firing of the alarm when the condition is met.
Create a notification, which is the most common approach to managing alarms. You can add a list of recipients to a notification, and those recipients are emailed a notification in the event of an alarm. The Monitoring service also supports a native integration with PagerDuty, which allows companies to configure services, on-call rotations, acknowledgment requirements, and escalation rules for inbound notifications.
Click Save alarm.
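To reason about a trigger rule before you save it, you can replay it against sample data. The sketch below is a local simulation under stated assumptions, not the Monitoring service's implementation: it treats each value as the per-interval sum of HTTP 502 responses, and it fires when the value exceeds a threshold for a number of consecutive intervals. The function alarm_fires and its trigger_delay parameter are hypothetical names.

```python
def alarm_fires(datapoints, threshold=0, trigger_delay=1):
    """Return True if values exceed threshold for trigger_delay consecutive intervals."""
    consecutive = 0
    for value in datapoints:
        # Count the current run of breaching intervals; reset on a healthy one.
        consecutive = consecutive + 1 if value > threshold else 0
        if consecutive >= trigger_delay:
            return True
    return False


# Invented per-minute counts of 502 responses.
print(alarm_fires([0, 0, 3, 0]))                   # True
print(alarm_fires([0, 2, 0, 4], trigger_delay=2))  # False
```

Requiring more than one consecutive breaching interval reduces noise from isolated errors, at the cost of a slower notification.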
We recommend that you use the Monitoring service and Load Balancing service metrics to monitor any critical application that you are delivering, whether it's hosted solely in Oracle Cloud Infrastructure or across your hybrid environment. Load Balancing metrics will be extended to include more metrics across the Load Balancing infrastructure. If there is a specific metric or integration that you would like us to support, let us know.
For more information, see Monitoring Overview and Load Balancing Metrics in the Oracle Cloud Infrastructure documentation. If you haven't tried Oracle Cloud Infrastructure yet, you can try it for free.