Writing kernel tests with the new Kernel Test Framework (KTF)

In this blog, Oracle Linux kernel developers Alan Maguire and Knut Omang explain how to write Kernel Test Framework tests.

KTF is available as a standalone git repository, but we are also working to offer it as a patch set for integration into the kernel. Read more about KTF in our introductory blog post, here:

https://blogs.oracle.com/linux/oracles-new-kernel-test-framework-for-linux-v2 

Writing new KTF (Kernel Test Framework) tests

Here we describe how to use KTF to write some tests. The neat thing about KTF is that it lets us test kernel code directly in kernel context, so we have a lot of control over the environment in which our tests run.

We're going to write some tests for a key abstraction in Linux kernel networking, "struct sk_buff". The sk_buff (socket buffer) is the structure used to store packet data as it moves through the networking stack. For an excellent introduction, see

http://vger.kernel.org/~davem/skb_data.html

In fact we're going to base our tests around some of the descriptions there, by creating/manipulating/freeing skbs and asking questions like

  • what is the state of an sk_buff when it is first allocated?
  • what about when we reserve space for new packet headers, or add tailroom?

...etc. My hope is that we can show that adding tests is in fact a great way to understand an API. If we can formalize the guarantees of the API such that we can write tests to validate them, we've come a long way in understanding it.
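To make those questions concrete before we touch the kernel, here is a minimal user-space sketch of the four buffer pointers described in the introduction linked above. Everything named model_* is hypothetical and exists only for illustration; the real sk_buff has many more fields, and the real allocator rounds the buffer up, so its tailroom can be larger than the size requested.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical user-space model of the four sk_buff buffer pointers. */
struct model_skb {
        unsigned char *head;    /* start of the allocated buffer */
        unsigned char *data;    /* start of the packet data */
        unsigned char *tail;    /* end of the packet data */
        unsigned char *end;     /* end of the allocated buffer */
        size_t len;             /* bytes of packet data (tail - data) */
};

/* Allocate a buffer of 'size' bytes; data and tail both start at head. */
static struct model_skb *model_skb_alloc(size_t size)
{
        struct model_skb *skb = malloc(sizeof(*skb));

        if (!skb)
                return NULL;
        skb->head = malloc(size ? size : 1);
        if (!skb->head) {
                free(skb);
                return NULL;
        }
        skb->data = skb->tail = skb->head;
        skb->end = skb->head + size;
        skb->len = 0;
        return skb;
}

static size_t model_headroom(const struct model_skb *skb)
{
        return (size_t)(skb->data - skb->head);
}

static size_t model_tailroom(const struct model_skb *skb)
{
        return (size_t)(skb->end - skb->tail);
}

/* Reserve headroom in an empty buffer: data and tail advance together. */
static void model_reserve(struct model_skb *skb, size_t n)
{
        assert(skb->len == 0 && n <= model_tailroom(skb));
        skb->data += n;
        skb->tail += n;
}

/* Append n bytes of payload: tail advances; returns where to write them. */
static unsigned char *model_put(struct model_skb *skb, size_t n)
{
        unsigned char *old_tail = skb->tail;

        assert(n <= model_tailroom(skb));
        skb->tail += n;
        skb->len += n;
        return old_tail;
}

/* Prepend an n-byte header: data moves back into the headroom. */
static unsigned char *model_push(struct model_skb *skb, size_t n)
{
        assert(n <= model_headroom(skb));
        skb->data -= n;
        skb->len += n;
        return skb->data;
}

static void model_skb_free(struct model_skb *skb)
{
        if (!skb)
                return;
        free(skb->head);
        free(skb);
}
```

In this model, a freshly allocated 256-byte buffer has no headroom and 256 bytes of tailroom; reserving 32 bytes, putting 100 and pushing 20 leaves len at 120 with 12 bytes of headroom. These are exactly the kind of guarantees we want our kernel tests to pin down.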

Brief Introduction to KTF

KTF allows us to test both exported and un-exported kernel interfaces in test cases that are added in a dedicated test module. We can make assertions about state during these test cases, and the results are communicated to userspace via netlink sockets. On the userspace side, KTF is used in conjunction with the googletest framework. While KTF supports hybrid user- and kernel-mode tests, here we will focus on kernel-only tests.

Creating our project

First let's grab a copy of KTF and build it. We keep separate source and build trees, and because KTF builds kernel modules, each build is specific to a kernel version and needs the matching kernel-uek-devel package. We build googletest from source.

Note: these instructions are for Oracle Linux; some package names etc may differ for other distros. Full instructions can be found in the doc/installation.txt file in KTF. We use Knut's version of googletest as it includes assertion counting and better test case naming.

Building googletest

# yum install cmake3
# cd ~
# mkdir -p src build/`uname -r`
# cd src
# git clone https://github.com/knuto/googletest.git
# cd ~/build/`uname -r`
# mkdir googletest
# cd googletest
# cmake3 ~/src/googletest/ -DBUILD_SHARED_LIBS=ON
# make
# sudo make install

Building KTF

We need the kernel-uek-devel, cpp and libnl3-devel packages to build KTF. Once KTF is built, we insert the kernel module.

# sudo yum install kernel-uek-devel cpp libnl3-devel
# cd ~/src
# git clone https://github.com/oracle/ktf
# cd ktf
# autoreconf
# cd ~/build/`uname -r`
# mkdir ktf
# cd ktf
# PKG_CONFIG_PATH=/usr/local/lib64/pkgconfig ~/src/ktf/configure KVER=`uname -r`
# make
# sudo make install
# sudo insmod kernel/ktf.ko

Creating our new test suite

Getting started here is easy; Knut created a "ktfnew" program to populate a new suite:

# ~/src/ktf/scripts/ktfnew -p ~/src skbtest
Creating a new project under ~/src/skbtest

Let's see what we got!

# ls ~/src/skbtest
ac          autom4te.cache  configure.ac  m4           Makefile.in
aclocal.m4  configure       kernel        Makefile.am

The kernel subdir is where we will add tests to our "skbtest" module, and it has already been populated with a file:

# ls ~/src/skbtest/kernel
Makefile.in  skbtest.c

skbtest.c is a simple module with one test "t1" in test set "simple" which evaluates a true expression via the EXPECT_TRUE() macro. The module init function adds the test via the ADD_TEST(name) macro.

The ASSERT_*() and EXPECT_*() macros test conditions; if a condition does not hold, the test fails. A failed ASSERT is fatal to the test case, which stops executing at that point, so when there is cleanup to do we use the ASSERT_*_GOTO() variants, which take a label to jump to on failure. We will see more examples of this later on.

#include <linux/module.h>
#include "ktf.h"

MODULE_LICENSE("GPL");

KTF_INIT();

TEST(simple, t1)
{
        EXPECT_TRUE(true);
}

static void add_tests(void)
{
        ADD_TEST(t1);
}

static int __init skbtest_init(void)
{
        add_tests();
        return 0;
}
static void __exit skbtest_exit(void)
{
        KTF_CLEANUP();
}

module_init(skbtest_init);
module_exit(skbtest_exit);
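The difference between a check that lets the test carry on and one that jumps to a cleanup label can be modelled in plain user-space C. The sketch below assumes EXPECT-style checks are non-fatal, as they are in googletest. The MODEL_* macros are hypothetical stand-ins written for this illustration; they are not KTF's implementation.

```c
#include <stdio.h>

/* Hypothetical stand-ins for illustration; these are not KTF's macros. */
static int model_failures;

/* Non-fatal check: record the failure and carry on. */
#define MODEL_EXPECT_TRUE(cond)                                         \
        do {                                                            \
                if (!(cond)) {                                          \
                        model_failures++;                               \
                        fprintf(stderr, "Failure '%s'\n", #cond);       \
                }                                                       \
        } while (0)

/* Fatal check: record the failure, then jump to the cleanup label. */
#define MODEL_ASSERT_TRUE_GOTO(cond, label)                             \
        do {                                                            \
                if (!(cond)) {                                          \
                        model_failures++;                               \
                        fprintf(stderr, "Failure '%s'\n", #cond);       \
                        goto label;                                     \
                }                                                       \
        } while (0)

/* Returns how many steps of the "test" were reached. */
static int model_run(int value)
{
        int steps = 0;

        MODEL_EXPECT_TRUE(value > 0);                   /* continues on failure */
        steps++;
        MODEL_ASSERT_TRUE_GOTO(value > 10, done);       /* skips the rest on failure */
        steps++;
done:
        return steps;
}
```

With this model, model_run(20) reaches both steps, while model_run(5) stops after the first step because the fatal check jumps straight to "done".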

So we're ready to start adding our tests!

Before we do anything else, let's ensure we track our progress with git.

We remove "configure" because we don't want to track it in git; it is generated by "autoreconf".

# cd ~/src/skbtest
# rm configure
# git init .
# git add ac aclocal.m4  configure.ac kernel/ m4 Makefile.*
# git commit -a -m "initial commit"

The first thing we need to do is ensure that our tests have access to the skb interfaces, so we add the following to skbtest.c:

#include <linux/skbuff.h>

Next, let's add a simple test that makes assertions about skb state after allocation.

/**
 * alloc_skb_sizes()
 *
 * ensure initial skb state is as expected for allocations of various sizes.
 *  - head == data
 *  - end >= tail + size
 *  - len == data_len == 0
 *  - nr_frags == 0
 *
 **/
TEST(skb, alloc_skb_sizes)
{
        unsigned int i, sizes[] = { 127, 260, 320, 550, 1028, 2059 };
        struct sk_buff *skb = NULL;

        for (i = 0; i < ARRAY_SIZE(sizes); i++) {
                skb = alloc_skb(sizes[i], GFP_KERNEL);

                ASSERT_ADDR_NE_GOTO(skb, 0, done);
                ASSERT_ADDR_EQ_GOTO(skb->head, skb->data, done);
                /*
                 * skb->end will be aligned and include overhead of shared
                 * info.
                 */
                ASSERT_TRUE_GOTO(skb->end >= skb->tail + sizes[i], done);
                ASSERT_TRUE_GOTO(skb->tail == skb->data - skb->head, done);
                ASSERT_TRUE_GOTO(skb->len == 0, done);
                ASSERT_TRUE_GOTO(skb->data_len == 0, done);
                ASSERT_TRUE_GOTO(skb_shinfo(skb)->nr_frags == 0, done);
                kfree_skb(skb);
                skb = NULL;
        }

done:
        kfree_skb(skb);
}

static void add_tests(void)
{
        ADD_TEST(alloc_skb_sizes);
}

If one of our ASSERT_ macros fails, we will goto "done", and we clean up there by freeing the skb. Ensuring tests tidy up after themselves is important as we don't want our tests to induce memory leaks!
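The cleanup pattern used here is worth a closer look: we free inside the loop, reset the pointer to NULL, and free again at the label. Below is a small user-space sketch of the same shape, with hypothetical model_* helpers and counters so the bookkeeping can be checked. A size above "limit" stands in for a failed assertion.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical counters so the sketch can check its own bookkeeping. */
static int model_allocs, model_frees;

static void *model_alloc_buf(size_t size)
{
        void *p = malloc(size);

        if (p)
                model_allocs++;
        return p;
}

/* Like kfree_skb(), freeing NULL is a no-op. */
static void model_free_buf(void *p)
{
        if (!p)
                return;
        model_frees++;
        free(p);
}

/*
 * Allocate each size in turn; bail out to 'done' when a size exceeds
 * 'limit' (standing in for a failed assertion). Returns 0 on success.
 */
static int model_check_sizes(const size_t *sizes, size_t n, size_t limit)
{
        void *buf = NULL;
        int ret = -1;
        size_t i;

        for (i = 0; i < n; i++) {
                buf = model_alloc_buf(sizes[i]);
                if (!buf || sizes[i] > limit)
                        goto done;      /* buf is still owned here */
                model_free_buf(buf);
                buf = NULL;             /* prevents a double free at 'done' */
        }
        ret = 0;
done:
        model_free_buf(buf);
        return ret;
}
```

Whether the loop completes or bails out part-way, every allocation is freed exactly once. Resetting the pointer to NULL after each in-loop free is what makes the final free at the label safe.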

Now we build and run our test.

Building and running our test

Here we build our test kernel module. Since we installed ktf/googletest in /usr/local, we need to tell configure to look there.

# cd ~/src/skbtest
# autoreconf
# cd ~/build/`uname -r`
# mkdir skbtest
# cd skbtest
# ~/src/skbtest/configure KVER=`uname -r` --prefix=/usr/local --libdir=/usr/local/lib64 --with-ktf=/usr/local
# make
# sudo make install

Now let's load our test module (we loaded ktf above) and run the tests:

# sudo insmod kernel/skbtest.ko
# sudo LD_LIBRARY_PATH=/usr/local/lib64 /usr/local/bin/ktfrun
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from skb
[ RUN      ] skb.alloc_skb_sizes
[       OK ] skb.alloc_skb_sizes, 42 assertions (0 ms)
[----------] 1 test from skb (0 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (0 ms total)
[  PASSED  ] 1 test.

Error injection

Now the above admittedly looks pretty dull. However it's worth emphasizing something before we move on.

This code actually ran in-kernel! With a lot of pain, it would be possible to hack up a user-space equivalent test, but it would require adding definitions for kmalloc, kmem_cache_alloc etc. Here we test the code in the same environment in which it is run, with no caveats or special-purpose environments. This makes KTF execution pretty unique; no need for extensive stubbing, we're testing the code as-is.

Next we're going to try and inject an error and see how skb allocation behaves in low-memory conditions.

KTF allows us to intercept a function's entry and return, and to alter the result, via kprobes (specifically, kretprobes for return values). To catch a return value we declare:

KTF_RETURN_PROBE(function_name, function_handler)
{
    void *retval = (void *)KTF_RETURN_VALUE();
    ...
    KTF_SET_RETURN_VALUE(newvalue);
    return 0;
}

We get the intended return value with KTF_RETURN_VALUE(), and we can set our own via KTF_SET_RETURN_VALUE(). Note that the handler itself should always return 0; the value that the function we're probing actually returns is set by KTF_SET_RETURN_VALUE(). If the intercepted value is a memory allocation that we are about to hide from the caller, we should free it, otherwise our test will induce a memory leak!

However we face a few problems with this sort of error injection.

First, the kmem_cache used - skbuff_head_cache - is not exported as a symbol, so how do we access it in order to kmem_cache_free() our skb memory?

Luckily, ktf has a handy function for cases like this - ktf_find_symbol(). We pass in the module name (NULL in this case because it's a core kernel variable) and the symbol name, and we get back the address of the symbol. Remember though that this is essentially &skbuff_head_cache, so we need to dereference it before use.

Second, we don't want to fail skb allocations for everyone as that will kill our network access etc. So by recording the task_struct * for the test in alloc_skb_nomem_task, we can limit the damage to our test thread.
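The extra level of indirection in the symbol lookup is easy to get wrong, so here is a small user-space sketch of it. The symbol table, model_find_symbol() and the other model_* names are hypothetical stand-ins; they only mimic the behaviour described above, where the lookup yields the address of the variable rather than the pointer stored in it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins: a "cache" type and a private pointer to one. */
struct model_cache {
        const char *name;
};

static struct model_cache model_head_cache_obj = { "model_head_cache" };
static struct model_cache *model_head_cache = &model_head_cache_obj;

/* A tiny symbol table mapping names to the addresses of variables. */
struct model_symbol {
        const char *name;
        void *addr;
};

static struct model_symbol model_symtab[] = {
        { "model_head_cache", &model_head_cache },
};

/* Returns the address of the named variable, or NULL if unknown. */
static void *model_find_symbol(const char *name)
{
        size_t i;

        for (i = 0; i < sizeof(model_symtab) / sizeof(model_symtab[0]); i++)
                if (strcmp(model_symtab[i].name, name) == 0)
                        return model_symtab[i].addr;
        return NULL;
}

/* Look up the variable, then dereference once to get the pointer it holds. */
static struct model_cache *model_lookup_cache(const char *name)
{
        struct model_cache **cachep = model_find_symbol(name);

        if (!cachep || !*cachep)
                return NULL;
        return *cachep;
}
```

The lookup result has type "pointer to pointer to cache", so both the lookup and the stored pointer need a NULL check before the cache can be used.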

Here's what the test looks like in full:

struct task_struct *alloc_skb_nomem_task;

KTF_RETURN_PROBE(kmem_cache_alloc_node, kmem_cache_alloc_nodehandler)
{
        struct sk_buff *retval = (void *)KTF_RETURN_VALUE();
        struct kmem_cache **cache;

        /* We only want alloc failures for this task! */
        if (alloc_skb_nomem_task != current)
                return 0;

        /* skbuff_head_cache is private to skbuff.c */
        cache = ktf_find_symbol(NULL, "skbuff_head_cache");
        if (!cache || !*cache || !retval)
                return 0;

        kmem_cache_free(*cache, retval);
        KTF_SET_RETURN_VALUE(0);

        return 0;
}

/**
 * alloc_skb_nomem()
 *
 * Ensure that in the face of allocation failures (kmem cache alloc of the
 * skb) alloc_skb() behaves sensibly and returns NULL.
 **/
TEST(skb, alloc_skb_nomem)
{
        struct sk_buff *skb = NULL;

        alloc_skb_nomem_task = current;

        ASSERT_INT_EQ_GOTO(KTF_REGISTER_RETURN_PROBE(kmem_cache_alloc_node,
                           kmem_cache_alloc_nodehandler), 0, done);

        skb = alloc_skb(128, GFP_KERNEL);
        ASSERT_ADDR_EQ_GOTO(skb, 0, done);

done:
        /* Stop injecting failures even if an assertion above failed. */
        alloc_skb_nomem_task = NULL;
        KTF_UNREGISTER_RETURN_PROBE(kmem_cache_alloc_node,
                                    kmem_cache_alloc_nodehandler);
        kfree_skb(skb);
}

static void add_tests(void)
{
        ADD_TEST(alloc_skb_sizes);
        ADD_TEST(alloc_skb_nomem);
}

Let's run it!

# sudo LD_LIBRARY_PATH=/usr/local/lib64 /usr/local/bin/ktfrun
[==========] Running 2 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 2 tests from skb
[ RUN      ] skb.alloc_skb_nomem
[       OK ] skb.alloc_skb_nomem, 2 assertions (27 ms)
[ RUN      ] skb.alloc_skb_sizes
[       OK ] skb.alloc_skb_sizes, 42 assertions (0 ms)
[----------] 2 tests from skb (27 ms total)

[----------] Global test environment tear-down
[==========] 2 tests from 1 test case ran. (27 ms total)
[  PASSED  ] 2 tests.

Neat! Our error injection must have worked since alloc_skb() returned NULL, and we also cleaned up the memory that was really allocated but we pretended wasn't.
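The shape of that error injection (see the real return value, free the allocation, report NULL, and do so only for one task) can be sketched in user space without any kprobes. All names below are hypothetical; model_current_task stands in for "current", and the handler is simply called after the real allocator rather than attached by a probe.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for 'current': an id for the running task. */
static int model_current_task;
/* Task whose allocations should fail; 0 means inject nothing. */
static int model_nomem_task;
/* Allocations handed out and not yet freed. */
static int model_live_allocs;

static void *model_real_alloc(size_t size)
{
        void *p = malloc(size);

        if (p)
                model_live_allocs++;
        return p;
}

static void model_real_free(void *p)
{
        if (!p)
                return;
        model_live_allocs--;
        free(p);
}

/*
 * Return handler: sees the value the real function produced and may
 * replace it through *retval. Its own return value is always 0.
 */
static int model_return_handler(void **retval)
{
        /* Only fail allocations for the chosen task. */
        if (model_nomem_task != model_current_task || !*retval)
                return 0;

        model_real_free(*retval);       /* avoid leaking the real allocation */
        *retval = NULL;                 /* the caller now sees a failure */
        return 0;
}

/* What the caller observes: the real call followed by the handler. */
static void *model_probed_alloc(size_t size)
{
        void *ret = model_real_alloc(size);

        model_return_handler(&ret);
        return ret;
}
```

With model_nomem_task set to the current task, model_probed_alloc() returns NULL and the live allocation count stays at zero. Any other task still gets its memory.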

alloc_skb and bad skb sizes

Next we might wonder what happens when we pass alloc_skb() an invalid size. But what is an invalid size? 0? UINT_MAX? Let's try a test where we pass in 0 and UINT_MAX and expect alloc_skb() to fail:

TEST(skb, alloc_skb_invalid_sizes)
{
        unsigned int i, sizes[] = { 0, UINT_MAX };
        struct sk_buff *skb = NULL;

        for (i = 0; i < ARRAY_SIZE(sizes); i++) {
                skb = alloc_skb(sizes[i], GFP_KERNEL);

                ASSERT_ADDR_EQ_GOTO(skb, 0, done);
        }
done:
        kfree_skb(skb);
}

Build again, and let's see what happens:

# sudo LD_LIBRARY_PATH=/usr/local/lib64 ktfrun 
[==========] Running 3 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 3 tests from skb
[ RUN      ] skb.alloc_skb_invalid_sizes
/var/tmp/build/4.14.35+/skbtest/kernel/skbtest.c:113: Failure
Assertion '(u64)(skb)==(u64)(0)' failed: (u64)(skb)==0xffffa07b53ca6c00, (u64)(0)==0x0
[  FAILED  ] skb.alloc_skb_invalid_sizes, where GetParam() = "alloc_skb_invalid_sizes" (19 ms)
[ RUN      ] skb.alloc_skb_nomem
[       OK ] skb.alloc_skb_nomem, 2 assertions (23 ms)
[ RUN      ] skb.alloc_skb_sizes
[       OK ] skb.alloc_skb_sizes, 2 assertions (0 ms)
[----------] 3 tests from skb (42 ms total)

[----------] Global test environment tear-down
[==========] 3 tests from 1 test case ran. (42 ms total)
[  PASSED  ] 2 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] skb.alloc_skb_invalid_sizes, where GetParam() = "alloc_skb_invalid_sizes"

 1 FAILED TEST

Okay, so that failed, which means our allocation succeeded; why? Taking a closer look at alloc_skb(), there's no bar on 0 values. What about UINT_MAX? That shouldn't work, right? Actually it does! Looking at the code, the size value that gets passed in has sizeof(struct skb_shared_info) etc. added to it, so the value simply overflows. What's interesting is that we end up with an skb that violates our initial state expectations. Let's demonstrate that by adding 0 and UINT_MAX to the "sizes" array in our valid skb alloc test "alloc_skb_sizes":

unsigned int i, sizes[] = { 0, 127, 260, 320, 550, 1028, 2059, UINT_MAX };

Rebuilding and running we see this:

# sudo LD_LIBRARY_PATH=/usr/local/lib64 ktfrun 
[==========] Running 3 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 3 tests from skb
[ RUN      ] skb.alloc_skb_invalid_sizes
[       OK ] skb.alloc_skb_invalid_sizes, 2 assertions (0 ms)
[ RUN      ] skb.alloc_skb_nomem
[       OK ] skb.alloc_skb_nomem, 2 assertions (23 ms)
[ RUN      ] skb.alloc_skb_sizes
/var/tmp/build/4.14.35+/skbtest/kernel/skbtest.c:45: Failure
Failure '(skb->end >= skb->tail + sizes[i])' occurred 
[  FAILED  ] skb.alloc_skb_sizes, where GetParam() = "alloc_skb_sizes" (15 ms)
[----------] 3 tests from skb (38 ms total)

So if we pass UINT_MAX to alloc_skb() we end up with a broken skb, in that skb->end isn't pointing where it should be.
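The overflow is easy to reproduce with a little arithmetic. The sketch below assumes the requested size is rounded up to a cache line and then has a shared-info overhead added, all in unsigned int arithmetic. The 64-byte cache line and 320-byte overhead are assumptions chosen for illustration, not values read from a particular kernel build.

```c
#include <limits.h>

/* Assumed values for illustration only. */
#define MODEL_CACHE_LINE        64u
#define MODEL_SHINFO_SIZE       320u

/* Round x up to a multiple of the cache line; wraps like the real thing. */
static unsigned int model_align(unsigned int x)
{
        return (x + (MODEL_CACHE_LINE - 1u)) & ~(MODEL_CACHE_LINE - 1u);
}

/* Bytes the model would actually ask the allocator for. */
static unsigned int model_alloc_size(unsigned int requested)
{
        return model_align(requested) + model_align(MODEL_SHINFO_SIZE);
}
```

Under these assumptions a request for 127 bytes becomes 128 + 320 = 448 bytes, as expected. A request for UINT_MAX wraps to 0 during the round-up, so only the 320 bytes of overhead are requested; the allocation succeeds, but the buffer is far smaller than the caller asked for. UINT_MAX >> 1 does not wrap, so the allocator sees a genuinely huge request.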

Seems like there could be some range checking here, but alloc_skb() is such a hot codepath it's likely the pragmatic argument that "no-one should allocate dumb-sized skbs" wins. We can modify our test to use "safer" bad sizes for now:

TEST(skb, alloc_skb_invalid_sizes)
{
        /* We cannot just use UINT_MAX here as the "size" argument passed in
         * has sizeof(struct skb_shared_info) etc added to it; let's settle for
         * UINT_MAX >> 1, UINT_MAX >> 2, etc.
         */
        unsigned int i, sizes[] = { UINT_MAX >> 1, UINT_MAX >> 2 };
        struct sk_buff *skb = NULL;

        for (i = 0; i < ARRAY_SIZE(sizes); i++) {
                skb = alloc_skb(sizes[i], GFP_KERNEL);

                ASSERT_ADDR_EQ_GOTO(skb, 0, done);
        }
done:
        kfree_skb(skb);
}

In general the skb interfaces assume the data they are provided is sensible, but we've just learned what can happen when it isn't! Writing tests is a great way to learn about an API.
