In this blog, we investigate how to tune Oracle Linux (OL) to maximize Redis throughput on Ampere's A1 [1] Arm-based two-socket systems in Oracle Cloud Infrastructure (OCI).
Based on our testing of Redis on A1, we recommend:
- Disabling wakeup preemption, for example by selecting the throughput-performance tuned profile.
- Keeping Transparent Huge Pages (THP) enabled to minimize the cost of page faults.
In the rest of the blog, we provide reasons for these recommendations.
Redis [2] (REmote DIctionary Server) is an in-memory, open-source data structure store used as a database, cache, and message broker. Redis has two components:
- A server (redis-server) that holds the data structures in memory and serves requests.
- A client (such as redis-cli or a client library) that connects to the server and issues commands.
Using the open-source memtier [3] Redis benchmark and its included scripts, we set up and ran the tests on 8-core VM and 160-core bare-metal A1 systems, using throughput (operations/second) as the performance metric. Moreover, we ran a single Redis instance with these parameters:
--test-time 300 --pipeline=100 --ratio 1:10 --clients=25 --run-count=1 --data-size-range=10240-1048576
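For reference, a complete invocation with these parameters might look like the sketch below; the server address, port, and thread count shown are illustrative assumptions rather than part of our configuration.

# Sketch: run memtier_benchmark against a local Redis server.
# 127.0.0.1:6379 and --threads=4 are assumptions; the remaining flags are the parameters listed above.
memtier_benchmark -s 127.0.0.1 -p 6379 --threads=4 \
  --test-time 300 --pipeline=100 --ratio 1:10 --clients=25 \
  --run-count=1 --data-size-range=10240-1048576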
We should note that, for better resource utilization, we have also run with 4 and 80 Redis instances on the 8-core VM and 160-core bare-metal systems, respectively, and have observed similar results.
In other, unrelated research, we have observed that some workloads are affected by two kernel scheduling parameters: sched_latency_ns and sched_wakeup_granularity_ns. We experimented and observed that Redis is also affected by them. The kernel documentation [4] describes sched_latency_ns as the targeted preemption latency for CPU-bound tasks, and sched_wakeup_granularity_ns as the wake-up preemption granularity: increasing it reduces wake-up preemption (and the disturbance of already-running tasks), while lowering it makes it more likely that a newly woken task preempts the running one.
Since the default sched_wakeup_granularity_ns is less than half of sched_latency_ns, wakeup preemption is enabled by default on A1 instances in OCI. In order to see the impact of disabling wakeup preemption, we gradually increased the value of sched_wakeup_granularity_ns until it became larger than half of sched_latency_ns. The results are shown in the following graph.
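The values in effect on a given instance can be read with sysctl before changing anything, for example:

sysctl kernel.sched_latency_ns
sysctl kernel.sched_wakeup_granularity_ns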
From the graph, two things stand out:
- Throughput improves as sched_wakeup_granularity_ns is increased.
- The largest gains appear once the value exceeds half of sched_latency_ns, i.e., once wakeup preemption is effectively disabled.
These observations suggest disabling wakeup preemption by setting sched_wakeup_granularity_ns to a very large value. However, setting the parameter to extremely large values is not advisable, as it may have an undesirable impact on other applications running on the system. Instead of setting these parameters directly, it is simpler to choose a tuned profile that enables or disables wakeup preemption. To this end, we found that the default tuned profiles – oci-rps-xps oci-busy-polling oci-cpu-power oci-nic – enable preemption (Table 1), whereas the throughput-performance tuned profile disables it (Table 2). Table 1 shows the default values of the two parameters for different VM sizes. For the default case, although the individual values of sched_latency_ns and sched_wakeup_granularity_ns change with the number of Oracle CPUs (OCPUs), the ratio between them remains the same (6:1).
The individual parameters and tuned profiles can be set using the following commands:
# Set the wake-up granularity directly (where $x is the desired value in nanoseconds)
sudo sysctl -w kernel.sched_wakeup_granularity_ns=$x
# Select the profile that disables wakeup preemption
sudo tuned-adm profile throughput-performance
# Revert to the default OCI profiles, which enable wakeup preemption
sudo tuned-adm profile oci-rps-xps oci-busy-polling oci-cpu-power oci-nic
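After switching profiles, the active profile and the resulting scheduler settings can be verified, for example:

# Show the currently active tuned profile
tuned-adm active
# Confirm the scheduler parameters picked up by the profile
sysctl kernel.sched_latency_ns kernel.sched_wakeup_granularity_ns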
So, why does disabling preemption help Redis? In the default case, with wakeup preemption enabled, a client that becomes runnable can kick a server process off a CPU before the server has finished its time slice. With preemption disabled, the client is not scheduled on that CPU until the server process is done with its current task. By disabling preemption, we therefore reduce the number of context switches and the cost associated with them. The following graphs show the reduction in the number of context switches (and the improvement in throughput) when preemption is disabled. The data in these graphs was collected with perf stat [5]. To identify the root cause of the performance difference, we also looked at the perf profiles of kernel instructions. In the lower-performance case, time is spent in 'swapper' (the idle task), resulting in idle cycles.
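As a sketch, context-switch counts like the ones in these graphs can be collected with perf stat [5]; the 300-second window matches the benchmark run time, and counting system-wide (-a) is an assumption.

# Count context switches (and CPU migrations) across all CPUs for 300 seconds
sudo perf stat -a -e context-switches,cpu-migrations -- sleep 300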
When Transparent Huge Pages (THP) are enabled on our system, Redis outputs the following warning about their negative impact:
WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo madvise > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled (set to 'madvise' or 'never').
However, in our experiments, we have not observed any negative performance impact from enabling THP. THP helps workloads with large working sets run more efficiently by reducing the number of pages required to map their memory. Therefore, we recommend keeping THP enabled to reduce the cost of servicing page faults by using large memory pages.
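One way to check and change the THP setting on Oracle Linux is through sysfs, as sketched below; note that, as the Redis warning above points out, the setting must be re-applied at boot (for example from /etc/rc.local) to persist.

# Show the current THP mode; the active mode is shown in brackets
cat /sys/kernel/mm/transparent_hugepage/enabled
# Enable THP system-wide (our recommendation for this workload)
echo always | sudo tee /sys/kernel/mm/transparent_hugepage/enabled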
In this blog, we provided tuning recommendations for improving Redis performance on Ampere's A1 instances in OCI. Specifically, we showed that disabling wakeup preemption by selecting the throughput-performance tuned profile, and minimizing page faults by using Transparent Huge Pages (THP), made performance 1.6x faster than with the default settings.