
  • February 11, 2020

Libcgroup in the Twenty-First Century

Scott Michael
Director, Software Development

In this blog post, Oracle Linux kernel developer Tom Hromatka writes about the new testing frameworks, continuous integration and code coverage capabilities that have been added to libcgroup.

In 2008 libcgroup was created to simplify how users interact with and manage cgroups. At the time, only cgroups v1 existed, the libcgroup source was hosted in a Subversion repository on SourceForge, and System V still ruled the universe.

Fast forward to today and the landscape is changing quickly. To pave the way for cgroups v2 support in libcgroup, we have added unit tests, functional tests, continuous integration, code coverage, and more.

Unit Testing

In May 2019 we added the googletest unit testing framework to libcgroup. libcgroup has many large, monolithic functions that perform the bulk of the cgroup management logic, and adding cgroup v2 support to these complex functions could easily introduce regressions. To combat this, we plan to add tests before adding cgroup v2 support.

Functional Testing

In June 2019 we added a functional test framework to libcgroup. The functional test framework consists of several Python classes that either represent cgroup data or can be used to manage cgroups and the system. Tests were added to libcgroup years ago, but they have proven difficult to run and maintain because they are destructive to the host system's cgroup hierarchy. With the advent of containers, this problem can easily be avoided.

The functional test framework utilizes LXC containers and the LXD interfaces to encapsulate the tests. Running the tests within a container provides a safe environment where cgroups can be created, deleted, and modified in an easily reproducible setting, without destructively modifying the host's cgroup hierarchy.
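
Under the hood, every command a test issues is wrapped in an lxc exec invocation so it executes inside the container rather than on the host. Here is a minimal sketch of that idea in Python. The Run helper and RunError exception are loosely modeled on the framework described in this post (the names appear in the failure example further down), but the implementation shown is an illustrative assumption, not the actual framework code:

#!/usr/bin/env python3
# Illustrative sketch only - not the actual framework code.
# Runs a command inside the LXD test container via 'lxc exec' and raises
# a RunError (carrying return code, stdout, and stderr) if the command fails.

import subprocess

CONTAINER = 'TestLibcg'  # container name taken from the failure example below


class RunError(Exception):
    def __init__(self, command, ret, stdout, stderr):
        super().__init__('command failed: {}'.format(command))
        self.command = command
        self.ret = ret
        self.stdout = stdout
        self.stderr = stderr


class Run:
    @staticmethod
    def run(command):
        # Prefix the command with 'sudo lxc exec <container> --' so it
        # executes inside the container, not on the host.
        full_cmd = ['sudo', 'lxc', 'exec', CONTAINER, '--'] + command

        proc = subprocess.run(full_cmd, capture_output=True)

        # The failure example below shows a RunError with ret = 0 but a
        # non-empty stderr, so this sketch treats stderr output as a failure
        # as well as a non-zero return code.
        if proc.returncode != 0 or proc.stderr:
            raise RunError(full_cmd, proc.returncode, proc.stdout, proc.stderr)

        return proc.stdout.decode('utf-8')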

libcgroup's functional tests are quick and easy to write and provide concise and informative feedback on the status of the run.
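
As a rough illustration only, a test along the lines of 001-cgget-basic_cgget.py might look something like the sketch below. The Config and Cgroup helpers are hypothetical stand-ins for the framework's Python classes described above; the real class interfaces may differ:

#!/usr/bin/env python3
# Illustrative sketch of a functional test - not the framework's actual code.
# The Config and Cgroup helpers below are hypothetical stand-ins for the
# framework's Python classes that manage the container and the cgroups in it.

import sys

from cgroup import Cgroup   # hypothetical wrapper around cgcreate/cgset/cgget
from config import Config   # hypothetical container/test configuration

CGNAME = '001cgget'
SETTING = 'cpu.shares'
VALUE = '512'


def setup(config):
    # Create the cgroup inside the container and set the value under test
    Cgroup.create(config, 'cpu', CGNAME)
    Cgroup.set(config, CGNAME, SETTING, VALUE)


def test(config):
    # Read the setting back with cgget and verify the value round-tripped
    out = Cgroup.get(config, CGNAME, SETTING)
    if VALUE not in out:
        return False, 'cgget returned {}, expected {}'.format(out, VALUE)

    return True, None


def teardown(config):
    Cgroup.delete(config, 'cpu', CGNAME)


def main():
    config = Config()
    setup(config)
    try:
        result, cause = test(config)
    finally:
        teardown(config)

    if not result:
        print(cause)
    return 0 if result else 1


if __name__ == '__main__':
    sys.exit(main())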

Here's a simple example of a successful test run:

$ ./001-cgget-basic_cgget.py
-----------------------------------------------------------------
Test Results:
    Run Date:                     Dec 02 17:54:28
    Passed:                             1 test(s)
    Skipped:                            0 test(s)
    Failed:                             0 test(s)
-----------------------------------------------------------------
Timing Results:
    Test                               Time (sec)
    ---------------------------------------------------------
    setup                                    5.02
    001-cgget-basic_cgget.py                 0.76
    teardown                                 0.00
    ---------------------------------------------------------
    Total Run Time                           5.79

And here's an example of where something went wrong. In this case I have artificially caused the Run() class to raise an exception early in the test run. The framework reports the test and the exact command that failed. The return code, stdout, and stderr from the failing command are also reported to facilitate debugging. And of course the log file contains a chronological history of the entire test run to further help in troubleshooting the root cause.

$ ./001-cgget-basic_cgget.py
-----------------------------------------------------------------
Test Results:
    Run Date:                     Dec 02 18:11:47
    Passed:                             0 test(s)
    Skipped:                            0 test(s)
    Failed:                             1 test(s)
        Test:               001-cgget-basic_cgget.py - RunError:
    command = ['sudo', 'lxc', 'exec', 'TestLibcg', '--', '/home/thromatka/git/libcgroup/src/tools/cgset', '-r', 'cpu.shares=512', '001cgget']
    ret = 0
    stdout = b''
    stderr = b'I artificially injected this exception'
-----------------------------------------------------------------

Continuous Integration and Code Coverage

In September 2019 we added continuous integration and code coverage to libcgroup. libcgroup's GitHub repository is now linked with Travis CI to automatically configure the library, build the library, run the unit tests, and run the functional tests every time a commit is pushed to the repo. If the tests pass, Travis CI invokes coveralls.io to gather code coverage metrics. The continuous integration status and the current code coverage percentage are prominently displayed on the GitHub source repository.

Currently all two :) tests are passing, and code coverage is at 16%. Many more tests are in progress, so expect to see these numbers improve significantly in the next few months.

Future Work

Ironically, after all these changes, we're now nearly ready to start the "real work."

A loose roadmap of our upcoming improvements:

  • Add an "ignore" rule to cgrulesengd. (While not directly related to the cgroup v2 work, this new ignore rule will heavily utilize the testing capabilities outlined above)
  • Add a ton more tests - both unit and functional
  • Add cgroup v2 support to our functional testing framework. I have a really rough prototype working, but I think automating it will require help from the Travis CI development team
  • Add cgroup v2 capabilities to libcgroup utilities like cgget, cgset, etc.
  • Design and implement a cgroup abstraction layer that will abstract away all of the gory detail differences between cgroup v1 and cgroup v2
