Here we continue our series of BPF blog entries by looking at Oracle Linux specifically, and describing how it facilitates BPF use and development.

BPF – a means of observing or changing Linux kernel system behaviour using programs that are verified to be safe and then injected into the kernel – has become a key technology in Linux. The Oracle Linux team work on finding ways to facilitate BPF – trying to make life easier both for developers who want to write BPF programs, and for users who want to use the BPF-based tools they build. We also add kernel features to improve BPF, often as a result of interacting with internal/external teams using BPF and hitting roadblocks. Because BPF is such a fast-moving technology, and because it relies both on kernel features to enable various BPF behaviours and user-space tools to build BPF programs, observe them and so on, it is important to make sure that the overall experience is smooth.

The Oracle Linux team have focused on a few different aspects of this:

  • BPF and the kernel: what kernel options are used, and how are these facilitated?
  • BPF and building programs: what tools are needed for this? A sample Makefile is also provided.
  • BPF-based tools: which tools that use BPF are available?

Note that the package versions mentioned are the latest at the time of writing, but will change over time.

BPF and the kernel

The following BPF-related options are set for our Oracle Linux UEK7 kernel release, which is a 5.15 LTS-based kernel.

CONFIG_BPF=y
CONFIG_HAVE_EBPF_JIT=y
CONFIG_BPF_SYSCALL=y
CONFIG_BPF_JIT=y
CONFIG_BPF_JIT_ALWAYS_ON=y
CONFIG_BPF_JIT_DEFAULT_ON=y
CONFIG_BPF_UNPRIV_DEFAULT_OFF=y
CONFIG_BPF_LSM=y
CONFIG_CGROUP_BPF=y
CONFIG_NETFILTER_XT_MATCH_BPF=m
CONFIG_NET_CLS_BPF=m
CONFIG_NET_ACT_BPF=m
CONFIG_BPF_STREAM_PARSER=y
CONFIG_LWTUNNEL_BPF=y
CONFIG_BPF_EVENTS=y
CONFIG_BPF_KPROBE_OVERRIDE=y
CONFIG_DEBUG_INFO_BTF=y (see below for more)

For debug kernels, the following additional option is set (to help run BPF selftests):

CONFIG_TEST_BPF=m 
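To compare these settings against a kernel you are running, something like the following sketch can help. It assumes the kernel config file is installed as /boot/config-$(uname -r), which is typical but not guaranteed, and the function name bpf_kconfig is just an illustrative choice.

```shell
#!/bin/sh
# Print the enabled BPF/BTF-related options from a kernel config file.
# $1: path to the config file.
#     Assumption: defaults to /boot/config-$(uname -r) if not given.
bpf_kconfig() {
    config=${1:-/boot/config-$(uname -r)}
    # Match CONFIG_<...>BPF<...>= or CONFIG_<...>BTF<...>= lines only;
    # commented-out "# CONFIG_... is not set" lines are skipped.
    grep -E '^CONFIG_[A-Z0-9_]*(BPF|BTF)[A-Z0-9_]*=' "$config"
}
```

Running bpf_kconfig with no arguments lists the options for the running kernel; pass a path to inspect a different config file.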

BPF Type Format (BTF) Availability

A key enabler for BPF users is having kernel BPF Type Format (BTF) information. This information is important for many aspects of BPF, most critically BPF tracing programs and Compile Once – Run Everywhere (CO-RE). BTF provides descriptions of types, functions and kernel variables, which play critical roles in tracing, verification and so on.

If you want to generate BTF for your own kernel build, you need to specify CONFIG_DEBUG_INFO_BTF=y and make sure the build process has access to a recent version of pahole (available from the dwarves package built from https://github.com/acmel/dwarves).

BPF tracing uses BTF descriptions of functions and types to allow us to instrument code meaningfully, and Compile Once – Run Everywhere is enabled because the info about types allows a pre-compiled BPF program to run on multiple kernel versions. Data structure offsets and the like are adjusted to fit the running kernel using information gathered from BTF, in a relocation process that is driven by libbpf.

We are providing BPF Type Format information for both core kernel and modules for the UEK7 (v5.15-based) kernel and for more recent updates of the UEK6U3 (5.4-based) kernel.

BTF availability can be easily checked on a running system; check what is in /sys/kernel/btf:

  • if that directory does not exist, BTF is not present.
  • if it just contains vmlinux, you have core kernel BTF
  • …and if it also contains entries for modules, module BTF is present also.
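The three cases above can be sketched as a small shell function. The function name btf_status and its output strings are illustrative choices, not part of any tool; the directory argument exists only so the check can be pointed somewhere other than /sys/kernel/btf.

```shell
#!/bin/sh
# Report which kind of BTF a system exposes.
# $1: BTF directory (defaults to /sys/kernel/btf).
# Prints one of: none, core, core+modules.
btf_status() {
    dir=${1:-/sys/kernel/btf}
    if [ ! -d "$dir" ] || [ ! -e "$dir/vmlinux" ]; then
        # No directory (or no vmlinux entry): no kernel BTF.
        echo "none"
    elif [ "$(ls "$dir" | grep -c -v '^vmlinux$')" -gt 0 ]; then
        # Entries other than vmlinux are per-module BTF.
        echo "core+modules"
    else
        echo "core"
    fi
}
```

Calling btf_status with no arguments checks the running system.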

Compiling an out-of-tree module to include BTF

If you are compiling an out-of-tree module against one of these BTF-capable kernels, you may wish to include module BTF that describes the module's data structures, functions, etc. Doing this is valuable as it allows BPF observability tools which rely on BTF to work on the module.

The following steps are required:

  • ensure as usual you have installed the kernel-uek-devel package associated with that kernel version.
  • check you have an up-to-date version of pahole from the dwarves/libdwarves packages (available in the OL8 appstream); it will do the BTF generation.
  • BTF module information is created relative to the associated kernel (vmlinux) BTF, so you need base vmlinux BTF in order for module BTF to be generated. So on a BTF-capable kernel, you can create a dummy vmlinux containing BTF via
    dd if=/sys/kernel/btf/vmlinux of=/usr/src/kernels/`uname -r`/vmlinux 
  • Now the module build should succeed as usual and generate BTF. To test, load the module and check that /sys/kernel/btf/<module_name> is present. To inspect the BTF generated for your module, try running
    bpftool btf dump -B /sys/kernel/btf/vmlinux file /sys/kernel/btf/<module_name>

This means “using base kernel BTF from vmlinux, dump module BTF”. BTF is designed such that module BTF is encoded relative to the base kernel (vmlinux) BTF.
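The vmlinux staging step above can be wrapped so that it is safe to re-run and does not overwrite a vmlinux that is already in place. This is only a sketch: the function name stage_vmlinux_btf and its messages are made up for illustration, and the default paths are the ones used in the dd command above.

```shell
#!/bin/sh
# Copy kernel BTF into the kernel build directory as "vmlinux" so that
# the out-of-tree module build can find base BTF.
# $1: BTF source (defaults to /sys/kernel/btf/vmlinux)
# $2: kernel build directory (defaults to /usr/src/kernels/$(uname -r))
stage_vmlinux_btf() {
    src=${1:-/sys/kernel/btf/vmlinux}
    kdir=${2:-/usr/src/kernels/$(uname -r)}
    if [ ! -r "$src" ]; then
        echo "no kernel BTF at $src" >&2
        return 1
    fi
    if [ -e "$kdir/vmlinux" ]; then
        # Leave any existing vmlinux alone.
        echo "kept existing $kdir/vmlinux"
        return 0
    fi
    dd if="$src" of="$kdir/vmlinux" 2>/dev/null && echo "created $kdir/vmlinux"
}
```

With the defaults, writing into /usr/src/kernels will normally need root privileges.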

It is important to stress that out-of-tree modules can still be compiled without BTF; however, you will see this message:

Skipping BTF generation for <module_name> due to unavailability of vmlinux

Note that we also continue to provide Compact Type Format (CTF) information. It plays a similar role for DTrace that BTF does for BPF, and indeed CTF was the starting point for BTF design. There is a great presentation on the history available at http://vger.kernel.org/~acme/perf/btf-perf-pahole-lsfmm-san-juan-2019/#/.

BPF tooling – compiling your program

A key enabler of BPF program compilation (and runtime) is libbpf. An up-to-date libbpf (v0.6) is available here for OL8:

    https://yum.oracle.com/repo/OracleLinux/OL8/UEKR7/x86_64/

…and here for OL7:

    https://yum.oracle.com/repo/OracleLinux/OL7/UEKR6/x86_64/

Both libbpf and libbpf-devel are needed at compile time; only libbpf is needed at runtime, for program loading.

bpftool is important when building your program. It can be used for skeleton generation (“gen skeleton”), which creates a header file containing simplified access functions for your BPF object along with an embedded bytecode representation, avoiding the need to ship a separate BPF object. bpftool v5.15 is available at

    https://yum.oracle.com/repo/OracleLinux/OL8/UEKR7/x86_64/

Finally, you need to compile your source code into BPF bytecode. The great thing is that Oracle Linux now has two options for this! It is now possible to build BPF programs using gcc, via the binutils-bpf-unknown-none and gcc-bpf-unknown-none packages found in the ol8_developer repo.

LLVM and clang (v12, supporting Compile Once – Run Everywhere) are also available from OL8 appstream:

    https://yum.oracle.com/repo/OracleLinux/OL8/appstream/x86_64/index.html
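Before attempting a build, it can be handy to confirm that the tools discussed in this section are actually on your PATH. The helper below is a minimal sketch; the name missing_tools is hypothetical, and the list of commands you pass it is up to you.

```shell
#!/bin/sh
# Print each named command that cannot be found on PATH.
# Prints nothing if every command is available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}
```

For the clang-based flow, a call such as missing_tools clang llvm-strip bpftool pahole lists whichever of those commands still need installing.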

Putting the pieces together – a modern BPF Makefile

Here we show a simple Makefile that illustrates the uses of the above to build a BPF program.

It:

  • uses “bpftool btf dump” to create a vmlinux.h using BTF on the local system;
  • uses “bpftool gen skeleton” to generate a BPF skeleton from the BPF object;
  • relies on libbpf[-devel] for building, libbpf for running.

# Copyright (c) 2022, Oracle and/or its affiliates.
#
# Note: recipe lines must begin with a tab character.

SRCARCH := $(shell uname -m | sed -e s/i.86/x86/ -e s/x86_64/x86/ \
                                  -e /arm64/!s/arm.*/arm/ -e s/sa110/arm/ \
                                  -e s/aarch64.*/arm64/ )
CLANG ?= clang
LLC ?= llc
LLVM_STRIP ?= llvm-strip
BPFTOOL ?= bpftool
INSTALL ?= install

BPF_INCLUDE := /usr/local/include
INCLUDES := -I. -I$(BPF_INCLUDE) -I../include/uapi
CFLAGS := -g -Wall
VMLINUX_BTF_PATH := /sys/kernel/btf/vmlinux

# Build with V=1 to echo the clean/install commands.
# The QUIET_* variables are not defined here and expand to nothing.
ifeq ($(V),1)
Q =
else
Q = @
MAKEFLAGS += --no-print-directory
submake_extras := feature_display=0
endif

.DELETE_ON_ERROR:

# PROG must be defined before it is used in any target name below.
PROG := helloworld

.PHONY: all clean install

all: $(PROG)

clean:
	$(call QUIET_CLEAN, $(PROG))
	$(Q)$(RM) *.o
	$(Q)$(RM) *.skel.h vmlinux.h

install: $(PROG)
	$(Q)$(INSTALL) -m 0755 -d $(DESTDIR)$(prefix)/sbin
	$(Q)$(INSTALL) $(PROG) $(DESTDIR)$(prefix)/sbin

# Link the user-space loader against libbpf.
$(PROG): $(PROG).o
	$(QUIET_LINK)$(CC) $(CFLAGS) $^ -lbpf -o $@

# The user-space object includes the generated skeleton header.
$(PROG).o: $(PROG).skel.h         \
	   $(PROG).bpf.o

%.skel.h: %.bpf.o
	$(QUIET_GEN)$(BPFTOOL) gen skeleton $< > $@

# Compile the BPF program, then strip DWARF (BTF is kept).
$(PROG).bpf.o: $(PROG).bpf.c vmlinux.h
	$(QUIET_GEN)$(CLANG) -g -D__TARGET_ARCH_$(SRCARCH) -O2 -target bpf \
		$(INCLUDES) -c $(PROG).bpf.c -o $@ &&                   \
	$(LLVM_STRIP) -g $@

%.o: %.c
	$(QUIET_CC)$(CC) $(CFLAGS) $(INCLUDES) -c $(filter %.c,$^) -o $@

# Generate kernel type definitions from the running kernel's BTF.
vmlinux.h:
	$(QUIET_GEN)$(BPFTOOL) btf dump file $(VMLINUX_BTF_PATH) format c > $@
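
To see what the Makefile boils down to, the sketch below prints (but does not run) the main commands in the order make would issue them. The function names srcarch and bpf_build_steps are illustrative only, the include paths are trimmed to -I. for brevity, and cc stands in for $(CC).

```shell
#!/bin/sh
# Map a `uname -m` machine name to the suffix used in
# -D__TARGET_ARCH_<arch>, mirroring the Makefile's SRCARCH expression.
srcarch() {
    printf '%s\n' "$1" | sed -e 's/i.86/x86/' -e 's/x86_64/x86/' \
        -e '/arm64/!s/arm.*/arm/' -e 's/sa110/arm/' \
        -e 's/aarch64.*/arm64/'
}

# Print the build steps for program $1 on machine $2
# (machine defaults to the output of `uname -m`).
bpf_build_steps() {
    prog=$1
    arch=$(srcarch "${2:-$(uname -m)}")
    echo "bpftool btf dump file /sys/kernel/btf/vmlinux format c > vmlinux.h"
    echo "clang -g -D__TARGET_ARCH_${arch} -O2 -target bpf -I. -c ${prog}.bpf.c -o ${prog}.bpf.o"
    echo "bpftool gen skeleton ${prog}.bpf.o > ${prog}.skel.h"
    echo "cc -g -Wall -I. -c ${prog}.c -o ${prog}.o"
    echo "cc -g -Wall ${prog}.o -lbpf -o ${prog}"
}
```

For example, bpf_build_steps helloworld x86_64 prints five commands, with the clang line carrying -D__TARGET_ARCH_x86. The ordering matters: vmlinux.h must exist before the BPF object is compiled, and the skeleton header must exist before the user-space source that includes it is compiled.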

Some BPF-based tools

bpftool – as mentioned previously – is used to

  • monitor BPF programs, showing what programs are loaded and where they are attached
  • monitor BPF maps
  • attach, detach and pin BPF programs
  • generate BPF skeletons from BPF programs (see above)

The latest 5.15 version is available in the UEKR7 repository.

Likewise, the BPF-based DTrace – v2.0 – is available from

    https://yum.oracle.com/repo/OracleLinux/OL8/UEKR7/x86_64/

…or from the UEKR6 repository for UEK6.

More details are available here: https://docs.oracle.com/en/operating-systems/oracle-linux/dtrace-relnotes/

bcc, bcc-tools and libbpf-tools are all built from the bcc project:

    https://github.com/iovisor/bcc/

For OL8, v0.23 is available in the UEKR7 repository:

    https://yum.oracle.com/repo/OracleLinux/OL8/UEKR7/x86_64/

Jonah Palmer wrote an excellent blog series on using bcc; the first entry is here:

    https://blogs.oracle.com/linux/post/intro-to-bcc-1

Summary

As we have seen, a lot goes into enabling BPF use and development: kernel options, BTF availability, build tooling and BPF-based tools. Having a BPF-friendly environment makes a system much more amenable to observability. We will focus on this topic in the next blog post.

Previous Blogs