A while back, I posted a blog series on BPF, including some suggestions about setting up a BPF development environment. Much has changed since then in terms of BPF features, so it's worth revisiting how BPF applications are developed now.
The key change - at least for the BPF projects I work on - is that libbpf has become central to BPF application development. Why?
The bpf() system call (SYS_bpf) is the Swiss Army knife of BPF: among other things, it lets you load a BPF program into the kernel, where it is verified, create the associated maps, and attach the program to its target. When these basic tasks were all that was needed, they could be carried out using simple wrappers around the system call.
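To make "simple wrappers" concrete, here is a minimal sketch of what such code tends to look like, using only the kernel's <linux/bpf.h> header and no libbpf. The helper names (sys_bpf, init_array_map_attr, create_array_map, load_trivial_prog) are made up for illustration, and the choice of an array map and a socket filter program is just an assumption to keep the example small.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/bpf.h>

/* glibc has no bpf() wrapper, so invoke the system call directly. */
static long sys_bpf(int cmd, union bpf_attr *attr, unsigned int size)
{
    return syscall(__NR_bpf, cmd, attr, size);
}

/* Fill in the attributes for a small array map (hypothetical helper). */
static void init_array_map_attr(union bpf_attr *attr, uint32_t value_size,
                                uint32_t max_entries)
{
    assert(attr != NULL);
    memset(attr, 0, sizeof(*attr));     /* unused fields must be zero */
    attr->map_type = BPF_MAP_TYPE_ARRAY;
    attr->key_size = sizeof(uint32_t);  /* array maps take 4-byte keys */
    attr->value_size = value_size;
    attr->max_entries = max_entries;
}

/* Create the map: returns a map fd, or -1 with errno set. */
static int create_array_map(uint32_t value_size, uint32_t max_entries)
{
    union bpf_attr attr;

    init_array_map_attr(&attr, value_size, max_entries);
    return (int)sys_bpf(BPF_MAP_CREATE, &attr, sizeof(attr));
}

/* Load a two-instruction "return 0" program; the kernel verifies it on load. */
static int load_trivial_prog(void)
{
    struct bpf_insn insns[] = {
        { .code = BPF_ALU64 | BPF_MOV | BPF_K, .dst_reg = BPF_REG_0, .imm = 0 },
        { .code = BPF_JMP | BPF_EXIT },
    };
    union bpf_attr attr;

    memset(&attr, 0, sizeof(attr));
    attr.prog_type = BPF_PROG_TYPE_SOCKET_FILTER;
    attr.insns = (uint64_t)(uintptr_t)insns;
    attr.insn_cnt = sizeof(insns) / sizeof(insns[0]);
    attr.license = (uint64_t)(uintptr_t)"GPL";

    /* Returns a program fd, or -1 with errno set. */
    return (int)sys_bpf(BPF_PROG_LOAD, &attr, sizeof(attr));
}
```

Both calls return a file descriptor on success. On many systems they fail with EPERM unless the caller is root or has the appropriate capabilities, so callers should check errno rather than assume success.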
However, BPF programs do a lot more now, and a huge amount of interplay between userspace and the kernel is required to set up a program to run. libbpf solves a bunch of these problems for you. I'll try to describe a few of them here.
First, though, a note on libbpf availability: you can either build libbpf yourself from https://github.com/libbpf/libbpf or check for a packaged version; on Oracle Linux, libbpf[-devel] is available for OL7 and OL8.
BPF can communicate with userspace by sending perf events from a BPF program; these populate a ring buffer that can then be polled using epoll, such that one system call can process multiple events. The BPF ring buffer is a newer mechanism that makes creating a single stream of events easier, and on the BPF side it has a reserve/commit model that lets a program reserve space for an event, fill it in place, and then commit it.
Setting up perf events is simple; it's just a matter of:
calling perf_buffer__new() with the BPF perf map fd and a set of options specifying the callbacks to invoke when events arrive or are lost
calling perf_buffer__poll() to retrieve events (it uses epoll_wait() under the hood)
Rolling your own version of this is inadvisable; the ring buffer semantics would have to be reimplemented from scratch, not a worthwhile effort!
Add to this that perf events/ring buffer events are really the most flexible ways to communicate with userspace (though map updates/bpf_trace_printk() often suffice for simple cases), and libbpf becomes vital!
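To illustrate just the epoll part of the story - one system call reporting several buffers with pending data - here is a small standalone sketch. It is not libbpf's implementation: ordinary pipes stand in for the per-CPU perf buffer file descriptors, and NR_BUFS and poll_all_buffers() are names invented for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/epoll.h>

#define NR_BUFS 3   /* hypothetical: one stand-in fd per CPU */

/*
 * Register the read end of each pipe with one epoll instance, make the
 * first nr_ready of them readable, then collect every ready fd with a
 * single epoll_wait() call. Returns the number of ready fds, or -1.
 */
static int poll_all_buffers(int nr_ready)
{
    int pipes[NR_BUFS][2];
    struct epoll_event ev, events[NR_BUFS];
    int epfd, i, n = -1, opened = 0;

    assert(nr_ready >= 0 && nr_ready <= NR_BUFS);

    epfd = epoll_create1(0);
    if (epfd < 0)
        return -1;

    for (i = 0; i < NR_BUFS; i++) {
        if (pipe(pipes[i]) < 0)
            goto out;
        opened++;
        memset(&ev, 0, sizeof(ev));
        ev.events = EPOLLIN;
        ev.data.u32 = (unsigned int)i;  /* remember which buffer this is */
        if (epoll_ctl(epfd, EPOLL_CTL_ADD, pipes[i][0], &ev) < 0)
            goto out;
    }

    /* Simulate events arriving on the first nr_ready buffers. */
    for (i = 0; i < nr_ready; i++)
        if (write(pipes[i][1], "x", 1) != 1)
            goto out;

    /* One system call reports every buffer that has data pending. */
    n = epoll_wait(epfd, events, NR_BUFS, 0);
    for (i = 0; i < n; i++)
        printf("buffer %u has data\n", events[i].data.u32);
out:
    for (i = 0; i < opened; i++) {
        close(pipes[i][0]);
        close(pipes[i][1]);
    }
    close(epfd);
    return n;
}
```

A real consumer would additionally have to map each buffer, parse the records in it and handle wrap-around and lost events, which is the part libbpf takes off your hands.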
libbpf contains a lot of code to simplify program attach; in most cases it simply involves ensuring your program section has the relevant name, and the rest of the work is done for you. For example, SEC("tracepoint/net/netif_receive_skb") tells libbpf the program type and attach target, and bpf_program__attach_tracepoint() - when passed the tracepoint category and tracepoint name - takes care of the attach.
In more recent kernels/libbpf, an attachment creates a bpf_link, an abstraction of that attachment which allows for detachment regardless of attach type; bpf_link__destroy() destroys the link and deals with the attach-type-specific detach. BPF links also have associated IDs which can be observed via bpftool; this allows us to better understand the current BPF attach state.
Creating a BPF program requires a large toolset including LLVM and clang, which take up a fair bit of disk space. That being the case, we'd like to compile our BPF program once and then be able to run it on multiple systems with different kernel versions.
However, since BPF programs may access kernel data structures, and the offsets of fields within those data structures can change, a program compiled for version X may not work for version Y; it could fail or, worse, silently process incorrect data because a field offset in a structure has changed.
Enter Compile Once - Run Everywhere (CO-RE). When we annotate field accesses in our BPF program appropriately, the compiler emits relocations for those accesses alongside the BPF bytecode. These can be thought of as placeholders that get filled in when the program is loaded on the kernel in question. libbpf can use these, plus type information about the running kernel (provided via the BPF Type Format, BTF), to substitute the appropriate offsets. There is even support for representing kernel CONFIG parameters as external variables so that BPF programs can behave differently on kernels compiled with different features.
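The placeholder idea can be modelled in a few lines of ordinary C. This is a toy userspace model, not how libbpf or BTF are implemented: the two task structures, the field tables standing in for BTF, and the relocate()/read_u32() helpers are all invented for the example. In real BPF C code the annotation is typically done with helpers such as BPF_CORE_READ() from libbpf's bpf_core_read.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Two hypothetical kernel versions: v2 inserted `uid` before `pid`. */
struct task_v1 { uint32_t flags; uint32_t pid; };
struct task_v2 { uint32_t flags; uint32_t uid; uint32_t pid; };

/* Stand-in for BTF: a kernel's field name -> offset table. */
struct field_info { const char *name; size_t offset; };

static const struct field_info kernel_v1_types[] = {
    { "flags", offsetof(struct task_v1, flags) },
    { "pid",   offsetof(struct task_v1, pid) },
};
static const struct field_info kernel_v2_types[] = {
    { "flags", offsetof(struct task_v2, flags) },
    { "uid",   offsetof(struct task_v2, uid) },
    { "pid",   offsetof(struct task_v2, pid) },
};

/* Stand-in for a relocation: a named access with a patchable offset. */
struct field_reloc { const char *name; size_t offset; };

/* "Load time": resolve the placeholder against the target's type info. */
static int relocate(struct field_reloc *r, const struct field_info *types,
                    size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(types[i].name, r->name) == 0) {
            r->offset = types[i].offset;
            return 0;
        }
    }
    return -1;  /* field does not exist on this kernel */
}

/* "Run time": read a u32 at whatever offset the record currently holds. */
static uint32_t read_u32(const void *base, const struct field_reloc *r)
{
    uint32_t val;

    memcpy(&val, (const char *)base + r->offset, sizeof(val));
    return val;
}
```

Suppose a record for "pid" still carries the v1 offset and is used against a task_v2 with uid 1000 and pid 42: read_u32() silently returns 1000, the uid. After relocate() is run against kernel_v2_types, the same read returns 42.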
Note that libbpf is not the whole solution here; CO-RE relocations also require a recent LLVM and Clang (to emit the relocations at compile time) and a kernel built with BTF information (so that the right offsets can be filled in at program load time). We will discuss BTF more in a future blog post.
Similarly, when we want to run a program on multiple systems with different kernel versions, certain BPF features may or may not be available. libbpf has functions to probe which program types, map types and helpers are available.
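The underlying idea of a probe is simply "try it and see". The sketch below shows that idea for map types using the raw system call; it is not libbpf's probing code, and probe_map_type() and describe_probe() are hypothetical names. In a real application the libbpf probe functions (for example libbpf_probe_bpf_map_type() in recent versions) are the better choice.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/bpf.h>

/*
 * Try to create a throwaway map of the given type.
 * Returns 1 if creation succeeded, 0 if the kernel rejected the request
 * with EINVAL (unknown type or unsuitable parameters), or a negative
 * errno when the answer is unknown, e.g. -EPERM without privileges.
 */
static int probe_map_type(enum bpf_map_type type, uint32_t key_size,
                          uint32_t value_size, uint32_t max_entries)
{
    union bpf_attr attr;
    long fd;

    memset(&attr, 0, sizeof(attr));  /* unused fields must be zero */
    attr.map_type = type;
    attr.key_size = key_size;
    attr.value_size = value_size;
    attr.max_entries = max_entries;

    fd = syscall(__NR_bpf, BPF_MAP_CREATE, &attr, sizeof(attr));
    if (fd >= 0) {
        close((int)fd);              /* only the outcome matters */
        return 1;
    }
    return errno == EINVAL ? 0 : -errno;
}

/* Turn a probe result into a short human-readable verdict. */
static const char *describe_probe(int result)
{
    if (result > 0)
        return "supported";
    if (result == 0)
        return "not supported";
    return "unknown";
}
```

For instance, describe_probe(probe_map_type(BPF_MAP_TYPE_HASH, 4, 4, 1)) reports whether a small hash map can be created. Each map type has its own parameter rules, so a probe has to pass values that are valid for the type in question; otherwise an EINVAL says nothing about whether the type itself exists.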
bpftool can generate a "skeleton", a header file that defines a number of easy ways to load, attach and access BPF programs and their data; libbpf is still under the hood for this (it's statically linked into bpftool), but the skeleton may be worth investigating for simpler programs. There are many BPF skeleton examples in tools/testing/selftests/bpf/prog_tests, and they're easy to spot: they #include a ".skel.h" file. One of the nicest features of the skeleton is that it embeds the bytecode for your BPF program, so it can be bundled with the userspace side. Usually we have to maintain a userspace program and a BPF .o file, with the former loading the latter. With the skeleton, however, we can simply ship the userspace program, since the BPF bytecode is embedded in the skeleton header file. This is neat as it means there is only one file to package or copy around.
With the above, we can see that libbpf provides many sophisticated mechanisms, either directly or via programs that use it under the hood such as bpftool. These will likely only multiply in the future. If you are developing a BPF application, libbpf is really a must these days!