Here we continue our series of BPF blog entries by looking at Oracle Linux specifically, and describing how it facilitates BPF use and development.
BPF - a means of observing or changing Linux kernel system behaviour using programs that are verified to be safe and then injected into the kernel - has become a key technology in Linux. The Oracle Linux team work on finding ways to facilitate BPF - trying to make life easier both for developers who want to write BPF programs, and for users who want to use the BPF-based tools they build. We also add kernel features to improve BPF, often as a result of interacting with internal/external teams using BPF and hitting roadblocks. Because BPF is such a fast-moving technology, and because it relies both on kernel features to enable various BPF behaviours and user-space tools to build BPF programs, observe them and so on, it is important to make sure that the overall experience is smooth.
The Oracle Linux team have focused on a few different aspects of this:
Note that the latest package versions are correct at the time of writing, but will change over time.
The following BPF-related options are set for our Oracle Linux UEK7 kernel release - it is a 5.15 LTS-based kernel.
CONFIG_BPF=y
CONFIG_HAVE_EBPF_JIT=y
CONFIG_BPF_SYSCALL=y
CONFIG_BPF_JIT=y
CONFIG_BPF_JIT_ALWAYS_ON=y
CONFIG_BPF_JIT_DEFAULT_ON=y
CONFIG_BPF_UNPRIV_DEFAULT_OFF=y
CONFIG_BPF_LSM=y
CONFIG_CGROUP_BPF=y
CONFIG_NETFILTER_XT_MATCH_BPF=m
CONFIG_NET_CLS_BPF=m
CONFIG_NET_ACT_BPF=m
CONFIG_BPF_STREAM_PARSER=y
CONFIG_LWTUNNEL_BPF=y
CONFIG_BPF_EVENTS=y
CONFIG_BPF_KPROBE_OVERRIDE=y
CONFIG_DEBUG_INFO_BTF=y (see below for more)
For debug kernels, the following additional option is set (to help run BPF selftests):
CONFIG_TEST_BPF=m
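To check which of these options a given kernel was built with, something like the following shell sketch could be used. The function name `check_bpf_config` is hypothetical, and the `/boot/config-$(uname -r)` path in the usage comment is an assumption about where the kernel config file is installed.

```shell
#!/bin/sh
# check_bpf_config FILE OPTION...
# Report, for each OPTION, the value it has in kernel config FILE
# (for example "y" or "m"), or note that it is not set.
check_bpf_config() {
    config_file=$1
    shift
    for opt in "$@"; do
        # Matches lines such as CONFIG_BPF=y; the trailing "=" keeps
        # CONFIG_BPF from also matching CONFIG_BPF_SYSCALL.
        value=$(sed -n "s/^${opt}=//p" "$config_file")
        if [ -n "$value" ]; then
            echo "${opt}=${value}"
        else
            echo "${opt} is not set"
        fi
    done
}

# Usage (config path is an assumption, adjust for your system):
# check_bpf_config "/boot/config-$(uname -r)" CONFIG_BPF CONFIG_BPF_SYSCALL CONFIG_DEBUG_INFO_BTF
```

Taking the config file as an argument keeps the check usable against a kernel other than the running one.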
Firstly, a key enabler for BPF users is having kernel BPF Type Format information. This information is really important for many of the aspects of BPF, most critically BPF tracing programs and Compile Once - Run Everywhere (CO-RE). BTF provides descriptions of types, functions and kernel variables which play critical roles in tracing, verification etc.
If you want to generate BTF for your own kernel build, you need to specify CONFIG_DEBUG_INFO_BTF=y and make sure the build process has access to a recent version of pahole (available from the dwarves package built from https://github.com/acmel/dwarves).
BPF tracing uses BTF descriptions of functions and types to allow us to instrument code meaningfully, and Compile Once - Run Everywhere is enabled because the info about types allows a pre-compiled BPF program to run on multiple kernel versions. Data structure offsets and the like are adjusted to fit the running kernel using information gathered from BTF, in a relocation process that is driven by libbpf.
We are providing BPF Type Format information for both core kernel and modules for the UEK7 (v5.15-based) kernel and for more recent updates of the UEK6U3 (5.4-based) kernel.
BTF availability can be easily checked on a running system; check what is in /sys/kernel/btf:
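As a rough sketch, the check can be scripted as below. The helper name `has_btf` is hypothetical; the only thing it relies on is that the kernel exposes one file per BTF object (vmlinux plus any loaded modules with BTF) under /sys/kernel/btf.

```shell
#!/bin/sh
# has_btf NAME [DIR]
# Succeed if BTF for NAME ("vmlinux" or a module name) is readable
# under DIR, which defaults to /sys/kernel/btf.
has_btf() {
    name=$1
    btf_dir=${2:-/sys/kernel/btf}
    [ -r "$btf_dir/$name" ]
}

# Usage:
# ls /sys/kernel/btf                      # list everything that has BTF
# has_btf vmlinux && echo "kernel BTF is available"
```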
If you are compiling an out-of-tree module against one of these BTF-capable kernels, you may wish to include module BTF that describes the module data structure, functions, etc. Doing this is valuable as it allows BPF observability tools which rely on BTF to work on the module.
The following steps are required:
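Install pahole from the dwarves/libdwarves packages (available in the OL8 appstream); it will do the BTF generation.
Copy the running kernel's BTF to the location where the module build looks for vmlinux:
dd if=/sys/kernel/btf/vmlinux of=/usr/src/kernels/`uname -r`/vmlinux
Once the module has been built and loaded, its BTF can be dumped with:
bpftool btf dump -B /sys/kernel/btf/vmlinux file /sys/kernel/btf/<module_name>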
This means “using base kernel BTF from vmlinux, dump module BTF”. BTF is designed such that module BTF is encoded relative to the base kernel (vmlinux) BTF.
It is important to stress that out-of-tree modules can still be compiled without BTF; however, you will see this message:
Skipping BTF generation for <module_name> due to unavailability of vmlinux
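One way to avoid that message is to wrap the copy step in a small helper that refuses to overwrite an existing vmlinux. This is only a sketch: the function name `stage_vmlinux_btf` is hypothetical, and the default paths are simply the ones used in the dd command above.

```shell
#!/bin/sh
# stage_vmlinux_btf [SRC] [DEST]
# Copy kernel BTF from SRC to DEST so that an out-of-tree module build
# can find it. Defaults mirror the dd command shown earlier.
stage_vmlinux_btf() {
    src=${1:-/sys/kernel/btf/vmlinux}
    dest=${2:-/usr/src/kernels/$(uname -r)/vmlinux}
    if [ ! -r "$src" ]; then
        echo "no BTF at $src" >&2
        return 1
    fi
    if [ -e "$dest" ]; then
        # Do not clobber a vmlinux that is already in place.
        echo "$dest already exists, leaving it alone"
        return 0
    fi
    dd if="$src" of="$dest" 2>/dev/null
}

# Usage (writing under /usr/src/kernels typically needs root):
# stage_vmlinux_btf
```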
Note that we also continue to provide Compact Type Format (CTF) information - it plays a similar role for DTrace that BTF does for BPF, and indeed CTF was the starting point for BTF design. There is a great presentation on the history available at http://vger.kernel.org/~acme/perf/btf-perf-pahole-lsfmm-san-juan-2019/#/.
A key enabler of BPF program compilation (and runtime) is libbpf. An up-to-date libbpf (v0.6) is available here for OL8:
https://yum.oracle.com/repo/OracleLinux/OL8/UEKR7/x86_64/
…and here for OL7:
https://yum.oracle.com/repo/OracleLinux/OL7/UEKR6/x86_64/
Both libbpf and libbpf-devel are needed at compile time; only libbpf is needed at runtime, for program loading.
bpftool is important when building your program. It can be used for skeleton generation - “gen skel” - which creates a header file containing simplified access functions for your BPF object along with an embedded bytecode representation, avoiding the need to ship a separate BPF object. v5.15 is available at
https://yum.oracle.com/repo/OracleLinux/OL8/UEKR7/x86_64/
Finally, you need to compile your source code into BPF bytecode. The great thing is that Oracle Linux now has two options for this! It is now possible to build BPF programs using gcc, via the binutils-bpf-unknown-none and gcc-bpf-unknown-none packages found in the ol8_developer repo.
LLVM and clang (v12, supporting Compile Once - Run Everywhere) are also available from OL8 appstream:
https://yum.oracle.com/repo/OracleLinux/OL8/appstream/x86_64/index.html
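With two toolchains available, a build script may want to pick whichever is installed. The sketch below does that by searching PATH. The helper name `first_available` is hypothetical, and the gcc binary name `bpf-unknown-none-gcc` in the usage comment is an assumption based on the usual cross-toolchain naming; check what the package actually installs. Note also that the two compilers do not take identical flags (for example, `-target bpf` is a clang option).

```shell
#!/bin/sh
# first_available CMD...
# Print the first command name from the argument list that is found
# in PATH; fail if none of them is installed.
first_available() {
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd"
            return 0
        fi
    done
    return 1
}

# Usage (the gcc binary name is an assumption):
# BPF_CC=$(first_available clang bpf-unknown-none-gcc) \
#     || echo "no BPF-capable compiler found" >&2
```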
Here we show a simple Makefile that illustrates the uses of the above to build a BPF program.
It:
# Copyright (c) 2022, Oracle and/or its affiliates.

SRCARCH := $(shell uname -m | sed -e s/i.86/x86/ -e s/x86_64/x86/ \
				  -e /arm64/!s/arm.*/arm/ -e s/sa110/arm/ \
				  -e s/aarch64.*/arm64/ )

CLANG ?= clang
LLC ?= llc
LLVM_STRIP ?= llvm-strip
BPFTOOL ?= bpftool
INSTALL ?= install
BPF_INCLUDE := /usr/local/include
INCLUDES := -I. -I$(BPF_INCLUDE) -I../include/uapi
CFLAGS := -g -Wall

VMLINUX_BTF_PATH := /sys/kernel/btf/vmlinux

ifeq ($(V),1)
Q =
else
Q = @
MAKEFLAGS += --no-print-directory
submake_extras := feature_display=0
endif

.DELETE_ON_ERROR:

# Name of the program to build; defined before the rules that use it.
PROG := helloworld

.PHONY: all clean install

all: $(PROG)

clean:
	$(call QUIET_CLEAN, $(PROG))
	$(Q)$(RM) *.o
	$(Q)$(RM) *.skel.h vmlinux.h

install: $(PROG)
	$(Q)$(INSTALL) -m 0755 -d $(DESTDIR)$(prefix)/sbin
	$(Q)$(INSTALL) $(PROG) $(DESTDIR)$(prefix)/sbin

# Link the user-space loader against libbpf.
$(PROG): $(PROG).o
	$(QUIET_LINK)$(CC) $(CFLAGS) $^ -lbpf -o $@

# The user-space program includes the generated skeleton header.
$(PROG).o: $(PROG).skel.h \
	   $(PROG).bpf.o

# Generate the skeleton header from the compiled BPF object.
%.skel.h: %.bpf.o
	$(QUIET_GEN)$(BPFTOOL) gen skeleton $< > $@

# Compile the BPF program to BPF bytecode with clang.
$(PROG).bpf.o: vmlinux.h
	$(QUIET_GEN)$(CLANG) -g -D__TARGET_ARCH_$(SRCARCH) -O2 -target bpf \
		$(INCLUDES) -c $(PROG).bpf.c -o $@

%.o: %.c
	$(QUIET_CC)$(CC) $(CFLAGS) $(INCLUDES) -c $(filter %.c,$^) -o $@

# Generate kernel type definitions from the running kernel's BTF.
vmlinux.h:
	$(QUIET_GEN)$(BPFTOOL) btf dump file $(VMLINUX_BTF_PATH) format c > $@
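To make the order of operations in the Makefile explicit, here is a dry-run sketch in shell that prints the equivalent commands without running them. The function names `bpf_target_arch` and `bpf_build_steps` are hypothetical, and the include paths are simplified to `-I.` for brevity; the sed expression is the one the Makefile uses to compute SRCARCH.

```shell
#!/bin/sh
# bpf_target_arch MACHINE
# Map a `uname -m` value to the suffix used in -D__TARGET_ARCH_<arch>,
# using the same sed expression as the Makefile's SRCARCH.
bpf_target_arch() {
    echo "$1" | sed -e 's/i.86/x86/' -e 's/x86_64/x86/' \
        -e '/arm64/!s/arm.*/arm/' -e 's/sa110/arm/' \
        -e 's/aarch64.*/arm64/'
}

# bpf_build_steps PROG [MACHINE]
# Print (not run) the build steps for PROG in dependency order.
# MACHINE defaults to the output of `uname -m`.
bpf_build_steps() {
    prog=$1
    arch=$(bpf_target_arch "${2:-$(uname -m)}")
    # 1. Kernel type definitions from BTF.
    echo "bpftool btf dump file /sys/kernel/btf/vmlinux format c > vmlinux.h"
    # 2. BPF program compiled to bytecode.
    echo "clang -g -D__TARGET_ARCH_${arch} -O2 -target bpf -I. -c ${prog}.bpf.c -o ${prog}.bpf.o"
    # 3. Skeleton header generated from the BPF object.
    echo "bpftool gen skeleton ${prog}.bpf.o > ${prog}.skel.h"
    # 4. User-space program compiled against the skeleton, then linked with libbpf.
    echo "cc -g -Wall -I. -c ${prog}.c -o ${prog}.o"
    echo "cc -g -Wall ${prog}.o -lbpf -o ${prog}"
}

# Usage:
# bpf_build_steps helloworld
```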
bpftool - as mentioned previously - is used to generate skeletons and to dump BTF.
The latest 5.15 version is available in the UEKR7 repository.
Likewise, the BPF-based DTrace - v2.0 - is available from
https://yum.oracle.com/repo/OracleLinux/OL8/UEKR7/x86_64/
…or from the UEKR6 repository for UEK6.
More details are available here: https://docs.oracle.com/en/operating-systems/oracle-linux/dtrace-relnotes/
bcc, bcc-tools and libbpf-tools are all built from the bcc project:
https://github.com/iovisor/bcc/
For OL8, v0.23 is available in the UEKR7 repository:
https://yum.oracle.com/repo/OracleLinux/OL8/UEKR7/x86_64/
Jonah Palmer wrote an excellent blog series on using bcc; the first entry is here:
https://blogs.oracle.com/linux/post/intro-to-bcc-1
So we see there are many things to do to enable BPF use and development. Having a BPF-friendly environment makes a system much more amenable to observability. We will focus on this topic in the next blog post.