Introduction
Definitive statistics are hard to come by, but cgroup v2 adoption continues to increase. Most Linux distros now default to cgroup v2 [1], and older long-term-stable releases that defaulted to cgroup v1 are slowly being retired. Large, complex applications that heavily relied on cgroup v1 have started the transition to cgroup v2. These applications have to contend with new requirements and constraints imposed by cgroup v2:
- Unified hierarchy – All the cgroup controllers are now mounted under a single cgroup hierarchy. (In cgroup v1, each controller was typically mounted as a separate path under /sys/fs/cgroup.)
- Leaf-node rule – In cgroup v1, processes could be located anywhere within the cgroup hierarchy, and processes could be siblings to cgroups. In cgroup v2, processes can only exist in leaf nodes of the hierarchy, and the kernel rigorously enforces this.
- Single-writer rule – To further simplify the operating system’s management of the cgroup hierarchy, the single-writer rule was adopted. Each cgroup is to be managed by a single task, and by default on Oracle Linux (and other distros) this task is systemd. Users/Applications can request to manage a subset of the cgroup hierarchy via delegated cgroups. Failure to obey the single-writer rule (by modifying a cgroup managed by systemd) could result in systemd reverting the changes.
- Renaming of settings – Cgroup v1 settings and names were inconsistent across controllers. Cgroup v2 standardized the naming across the controllers.
- Removal of arcane features – Early cgroup v1 development resulted in many obscure and confusing settings. Many of these settings were intentionally not forward-ported to cgroup v2 and thus do not have an equivalent v2 setting.
The libcgroup abstraction layer was added to help enterprise applications migrate to cgroup v2 as seamlessly as possible. libcgroup abstracts the details of the underlying cgroup hierarchy and allows users to keep using their existing cgconfig.conf and their existing calls to cgget, cgroup_get_cgroup(), etc. By leveraging these new features in libcgroup, enterprise applications can run the same executable on a cgroup v1 or a cgroup v2 system; no recompiling, no ugly #ifdefs, just a single executable. The C APIs, CLI tools, and Python bindings provide the same flexibility and power, and users can utilize them to write tools and management scripts that behave consistently across cgroup v1 and v2 systems.
Exploring abstraction layer support with a use case
To demonstrate how the libcgroup abstraction layer can help, let’s consider a simple scenario in which an admin who has been working with cgroup v1 systems is getting used to the cgroup v2 world. Consider the classic cgroup use case, where the admin is trying to ration CPU time between users and the allocation depends on each user’s role. On a cgroup v1 system, this is done with the cpu.shares setting, a relative weight whose default is 1024. Setting it to 512 gives the tasks in the user’s cgroup half the weight of a sibling cgroup left at the default, which this article treats as a 50% allocation of CPU time. The equivalent cgroup v2 setting is cpu.weight, whose default is 100, so the matching value is 50. Without libcgroup, the admin would need to execute:
# echo '50' > /sys/fs/cgroup/cgroupA/cpu.weight (on cgroup v2 system)
# echo '512' > /sys/fs/cgroup/cpu/cgroupA/cpu.shares (on cgroup v1 system)
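Without libcgroup, a program that manages this setting has to carry the version-specific knowledge itself. The minimal sketch below shows what that looks like for the file path alone. The helper name cpu_setting_path() is hypothetical and not part of libcgroup; the two path layouts are the ones used in the commands above.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of libcgroup: builds the path of the cpu
 * setting file for a cgroup, following the two layouts shown above.
 * cgroup_version is 1 or 2. Returns 0 on success, -1 on bad input or when
 * buf is too small to hold the path.
 */
int cpu_setting_path(int cgroup_version, const char *cgroup_name,
                     char *buf, size_t len)
{
    int n;

    if (cgroup_name == NULL || strlen(cgroup_name) == 0 || buf == NULL)
        return -1;

    if (cgroup_version == 1)
        /* v1: the cpu controller has its own hierarchy under /sys/fs/cgroup */
        n = snprintf(buf, len, "/sys/fs/cgroup/cpu/%s/cpu.shares", cgroup_name);
    else if (cgroup_version == 2)
        /* v2: a single unified hierarchy and a renamed setting */
        n = snprintf(buf, len, "/sys/fs/cgroup/%s/cpu.weight", cgroup_name);
    else
        return -1;

    /* snprintf reports the length it needed; treat truncation as an error */
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Every caller would also have to pick the right value scale for the chosen file, which is exactly the bookkeeping the abstraction layer is meant to remove.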
Solution 1 – Using the command line cgxset tool
In the above scenario, the admin can instead use the cgxset tool provided by libcgroup, without needing to remember the different cpu controller settings of each cgroup version. The following command line works on both cgroup v1 and cgroup v2 setups:
$ sudo cgxset -1 -r cpu.shares=512 cgroupA (on cgroup v2 system)
On a cgroup v2 system, the above command achieves the same result as on a cgroup v1 system: it sets 50% of the CPU cycles for the cgroupA cgroup. Let’s break down the command line options:
- -1 indicates that the controller setting that follows is a cgroup v1 setting.
- -r cpu.shares=512 is the cpu controller setting and 512 is the value passed to the controller.
- cgroupA is the cgroup to which the controller setting is applied.
The cgxset tool internally translates the cgroup v1 cpu controller setting cpu.shares to the cgroup v2 setting cpu.weight and also converts the value 512 (50% CPU time) into 50. The two settings use different scales: cpu.weight defaults to 100 and accepts values from 1 to 10000, whereas cpu.shares defaults to 1024. Similarly, the admin could use:
$ sudo cgxset -2 -r cpu.weight=50 cgroupA
The above command can be used on a cgroup v1 system to achieve the same result. libcgroup provides tools that serve as a Swiss Army knife in an admin’s toolbox for working with cgroup v1 and v2, and they can help manage complex hierarchies with just a few commands.
The above example works on libcgroup v3.0 and later. It’s always recommended to use the most current release (which is v3.1.0 as of the writing of this blog).
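The value conversion in this example (512 becoming 50) can be sketched as a simple rescaling. This is a minimal illustration that assumes linear scaling by the ratio of the two default values, 1024 for cpu.shares and 100 for cpu.weight; it reproduces the numbers in this article, but the helper names are hypothetical and the actual libcgroup implementation may differ.

```c
/*
 * Assumption: the conversion is a linear rescaling by the ratio of the
 * default values of the two settings. These helpers are hypothetical and
 * are not libcgroup functions.
 */
#define V1_SHARES_DEFAULT 1024L
#define V2_WEIGHT_DEFAULT 100L

/* Convert a cgroup v1 cpu.shares value to a cgroup v2 cpu.weight value. */
long shares_to_weight(long shares)
{
    /* Integer division truncates any fractional part */
    return shares * V2_WEIGHT_DEFAULT / V1_SHARES_DEFAULT;
}

/* Convert a cgroup v2 cpu.weight value to a cgroup v1 cpu.shares value. */
long weight_to_shares(long weight)
{
    return weight * V1_SHARES_DEFAULT / V2_WEIGHT_DEFAULT;
}
```

Under that assumption, shares_to_weight(512) gives 50 and weight_to_shares(50) gives 512, matching the two cgxset invocations in this section.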
On a side note, there is another mode, called cgroup hybrid mode, in which cgroup v1 and cgroup v2 controllers can co-exist. In this mode, each cgroup controller is available as either v1 or v2, but not both. libcgroup provides the cgget tool, which reports the current cgroup setup mode when passed the -m option.
$ cgget -m
Legacy Mode (Cgroup v1 only).
The above output means that all of the supported controllers are mounted in cgroup v1 mode. cgget also provides another interesting option, -c, that displays the version of every controller.
The above example works on libcgroup v3.1.0 and later.
$ cgget -c
#Controller     Version
name=systemd    1
devices         1
debug           1
misc            1
blkio           1
memory          1
hugetlb         1
freezer         1
net_cls         1
net_prio        1
perf_event      1
rdma            1
cpu             1
cpuacct         1
pids            1
cpuset          1
Here every controller is a cgroup v1 controller.
The above example works on libcgroup v3.1.0 and later.
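The relationship between the per-controller versions and the setup mode can be sketched as follows. This is a simplified illustration, not how cgget is implemented: the enum and function names are hypothetical, and it classifies the mode purely from a list of controller versions such as the one printed by cgget -c, which may not cover every hybrid configuration.

```c
#include <stddef.h>

/* Hypothetical names for this sketch; these are not libcgroup identifiers. */
enum setup_mode { MODE_UNKNOWN, MODE_LEGACY, MODE_HYBRID, MODE_UNIFIED };

/*
 * Classify the cgroup setup from an array of controller versions (1 or 2).
 * All v1 means legacy, all v2 means unified, a mix means hybrid.
 */
enum setup_mode classify_mode(const int *versions, size_t count)
{
    int saw_v1 = 0, saw_v2 = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        if (versions[i] == 1)
            saw_v1 = 1;
        else if (versions[i] == 2)
            saw_v2 = 1;
    }

    if (saw_v1 && saw_v2)
        return MODE_HYBRID;
    if (saw_v1)
        return MODE_LEGACY;
    if (saw_v2)
        return MODE_UNIFIED;

    /* No controllers, or no recognizable versions */
    return MODE_UNKNOWN;
}
```

For the listing above, where every controller reports version 1, this sketch would return MODE_LEGACY, which agrees with the "Legacy Mode" output of cgget -m.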
Solution 2 – using cgconfig.conf
libcgroup allows users to define complex cgroup hierarchies in cgconfig.conf, including mounting multiple hierarchies, creating nested cgroups, and setting controller values for each cgroup in the nested hierarchy. The file uses a simple syntax and is parsed during system boot. This is handy in large system farms, where admins can take the cgconfig.conf file they defined for cgroup v1 and use it on cgroup v2 systems without any changes. The libcgroup config parser that reads cgconfig.conf* auto-translates the settings based on the underlying cgroup mode; for example, on a cgroup v2 system it translates cgroup v1 settings by setting up the cgroups, their controllers, and the controller settings in their cgroup v2 form. The following cgconfig.conf file would work on both cgroup v1 and cgroup v2 systems:
group userA {
    cpu {
        cpu.shares = 512;
    }
}
*Not all controller mappings are implemented yet; this is a work in progress.
The above example works on libcgroup v3.1.0 and later.
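The name translation that such a parser has to perform can be pictured as a lookup table from v1 setting names to v2 setting names. The sketch below is only an illustration under that assumption: the struct, table, and function names are hypothetical, and the entries are kernel-level equivalents of cpu controller settings chosen for the example. Apart from cpu.shares, this article does not state which settings libcgroup translates.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry pairing a v1 setting name with its v2 name. */
struct setting_map {
    const char *v1_name;
    const char *v2_name;
};

/* Illustrative entries only; not a copy of libcgroup's internal table. */
static const struct setting_map cpu_map[] = {
    { "cpu.shares",        "cpu.weight" },
    { "cpu.cfs_quota_us",  "cpu.max" },   /* quota and period share one v2 file */
    { "cpu.cfs_period_us", "cpu.max" },
};

/* Return the v2 name for a v1 setting, or NULL when none is known. */
const char *v1_to_v2_name(const char *v1_name)
{
    size_t i;

    if (v1_name == NULL)
        return NULL;

    for (i = 0; i < sizeof(cpu_map) / sizeof(cpu_map[0]); i++) {
        if (strcmp(cpu_map[i].v1_name, v1_name) == 0)
            return cpu_map[i].v2_name;
    }

    return NULL;
}
```

A NULL result corresponds to the case described in the introduction, where a v1 setting was intentionally not carried forward and has no v2 equivalent.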
Solution 3 – using the C API
libcgroup provides a large collection of APIs for working with cgroups, and the abstraction layer adds cgroup_convert_cgroup() to that list. It programmatically converts controller settings from v1 to v2 and vice versa. This API is very helpful for applications that link with the libcgroup library, including short purpose-built programs such as one that creates cgroups and sets their CPU share value, which the admin in our example might use.
int cgroup_convert_cgroup(struct cgroup * const out_cgroup, enum cg_version_t out_version, const struct cgroup * const in_cgroup, enum cg_version_t in_version);
Let’s break down the arguments required by the API:
- out_cgroup – struct cgroup, holding the translated controller settings
- out_version – the version of the controller settings expected in out_cgroup after translation
- in_cgroup – the struct cgroup, holding the original controller settings
- in_version – original version of controller settings of in_cgroup
Let’s expand on the idea of a short program that an admin can use to create a cgroup and set cpu.shares on a cgroup v2 system.
Please see compilation instructions at the top of the following sample code:
/*
** Copyright (c) 2023, Oracle and/or its affiliates.
**
** The Universal Permissive License (UPL), Version 1.0
**
** Subject to the condition set forth below, permission is hereby granted to any
** person obtaining a copy of this software, associated documentation and/or data
** (collectively the "Software"), free of charge and under any and all copyright
** rights in the Software, and any and all patent rights owned or freely
** licensable by each licensor hereunder covering either (i) the unmodified
** Software as contributed to or provided by such licensor, or (ii) the Larger
** Works (as defined below), to deal in both
**
** (a) the Software, and
** (b) any piece of software and/or hardware listed in the lrgrwrks.txt file if
** one is included with the Software (each a "Larger Work" to which the Software
** is contributed by such licensors),
**
** without restriction, including without limitation the rights to copy, create
** derivative works of, display, perform, and distribute the Software and make,
** use, sell, offer for sale, import, export, have made, and have sold the
** Software and the Larger Work(s), and to sublicense the foregoing rights on
** either these or other terms.
**
** This license is subject to the following condition:
** The above copyright notice and either this complete permission notice or at
** a minimum a reference to the UPL must be included in all copies or
** substantial portions of the Software.
**
** THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
** IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
** FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
** AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
** LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
** OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
** SOFTWARE.
*/
/*
To compile this source please generate a Makefile with the following contents
(each recipe line must start with a tab character):
------------ start --------------------
all: set_cpu_shares
set_cpu_shares:
	gcc -o $@ set_cpu_shares.c -lcgroup
clean:
	rm set_cpu_shares
------------ end --------------------
And then run make
$ make
*/
#include <stdio.h>
#include <libcgroup.h>

#define CTRL_NAME "cpu"
#define CTRL_SETTING "cpu.shares"

int main(int argc, char **argv)
{
    /* Initialized to NULL so the cleanup path is safe on early errors */
    struct cgroup *in_cgrp = NULL, *out_cgrp = NULL;
    struct cgroup_controller *cgrp_ctrl;
    enum cg_version_t in_ver, out_ver;
    char *cgrp_name, *ctrl_val;
    int ret = 1;

    /* Both the cgroup name and the cpu.shares value are required */
    if (argc < 3) {
        fprintf(stderr, "usage: %s <cgroup name> <cpu value>\n", argv[0]);
        return 1;
    }

    cgrp_name = argv[1];
    ctrl_val = argv[2];

    ret = cgroup_init();
    if (ret) {
        fprintf(stderr, "cgroup initialization failed\n");
        return 1;
    }

    /* The setting is supplied in cgroup v1 terms and converted to cgroup v2 */
    in_ver = CGROUP_V1;
    out_ver = CGROUP_V2;

    in_cgrp = cgroup_new_cgroup(cgrp_name);
    if (!in_cgrp) {
        fprintf(stderr, "failed to create cgroup %s\n", cgrp_name);
        return 1;
    }

    cgrp_ctrl = cgroup_add_controller(in_cgrp, CTRL_NAME);
    if (!cgrp_ctrl) {
        fprintf(stderr, "failed to add controller %s\n", CTRL_NAME);
        ret = 1;
        goto err;
    }

    ret = cgroup_add_value_string(cgrp_ctrl, CTRL_SETTING, ctrl_val);
    if (ret) {
        fprintf(stderr, "failed to add setting %s\n", CTRL_SETTING);
        goto err;
    }

    out_cgrp = cgroup_new_cgroup(cgrp_name);
    if (!out_cgrp) {
        fprintf(stderr, "failed to create cgroup %s\n", cgrp_name);
        ret = 1;
        goto err;
    }

    /* Translate the v1 settings in in_cgrp into v2 settings in out_cgrp */
    ret = cgroup_convert_cgroup(out_cgrp, out_ver, in_cgrp, in_ver);
    if (ret) {
        fprintf(stderr, "failed to convert %s from %d version to %d version\n",
                cgrp_name, in_ver, out_ver);
        goto err;
    }

    /* Create the cgroup on the system with the translated settings */
    ret = cgroup_create_cgroup(out_cgrp, 0);
    if (ret)
        fprintf(stderr, "failed to create cgroup %s:%s\n", cgrp_name,
                cgroup_strerror(ret));

err:
    cgroup_free(&in_cgrp);
    cgroup_free(&out_cgrp);

    return ret;
}
Summary
The abstraction layer is one of the many features that libcgroup provides for interacting with cgroups, via APIs or tools. This is the first of a multi-part series exploring the tools and features of libcgroup.
References
- Releases when distros first defaulted to cgroup v2 – Oracle Linux 9 (released 2022), RHEL 9 (released 2022), Ubuntu Server (released 2021)
- https://github.com/libcgroup/libcgroup
- https://systemd.io/CGROUP_DELEGATION/
- https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/admin-guide/cgroup-v2.rst
- https://docs.oracle.com/en/operating-systems/oracle-linux/9/relnotes9.0/ol9-NewFeaturesandChanges.html#ol9-features-kernel
- https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html-single/9.0_release_notes/index#enhancement_kernel
- https://lists.ubuntu.com/archives/ubuntu-devel/2021-August/041598.html
- man cgget
- man cgxset
- man 5 cgconfig.conf