eBPF instructions disassembled from bpftool and llvm-objdump are different

Why is the output of `bpftool prog dump xlated` a bit different from the output of `llvm-objdump -d`? What does `xlated` (translated instructions) mean? How does the kernel rewrite the bytecode?

Answer:

Why is the output of `bpftool prog dump xlated` a bit different from the output of `llvm-objdump -d`?

When you load an eBPF program into the kernel, your loader application usually takes an ELF file and extracts the eBPF bytecode instructions from it. Before sending them to the kernel, the loader performs a number of ELF relocations (for eBPF maps or for CO-RE, for example), so the bytecode that you actually load into the kernel is already slightly different from the instructions stored in your ELF object file.

Furthermore, the kernel verifier modifies the bytecode again. What `bpftool prog dump xlated` shows is the result of both sets of changes: the eBPF program as stored in the kernel, after ELF relocations (in user space) and kernel rewrites, but before JIT compilation. This is why it looks slightly different from the output of `llvm-objdump -d` on your ELF file.

What does `xlated` (translated instructions) mean?

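See above: it means that the eBPF bytecode instructions have been adjusted by the kernel verifier, either for efficiency (inlining a helper call, for example) or because the rewrite is required for the program to work at all (context accesses must be converted to accesses to the real kernel structures, for example).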

How does the kernel rewrite the bytecode?

There are a number of steps involved here. You'd need to go and read the verifier code to get all the details. Some of them include:

- Updating context accesses (e.g. converting accesses to the `struct __sk_buff` used when writing networking programs for TC into accesses to the actual `struct sk_buff` used in the kernel)
- Inlining map access (loading directly from the map address instead of doing a more costly call to `bpf_map_lookup_elem()`)
- Inlining some other helper functions
- Removing unreachable instructions
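
To make the last item more concrete, here is a toy model in Python of removing unreachable instructions and fixing up jump offsets afterwards. It is only a sketch: the `(kind, off)` instruction tuples and the function names are made up for illustration, and it uses plain control-flow reachability, whereas the real verifier decides what is dead while it walks the program. The one real convention it keeps is that an eBPF jump at index `i` with offset `off` lands at `i + 1 + off`.

```python
from typing import List, Tuple

# Hypothetical toy instruction: (kind, jump offset).
# kind is "alu", "ja" (unconditional jump), "jcond" or "exit".
Insn = Tuple[str, int]


def successors(insns: List[Insn], i: int) -> List[int]:
    """Indices that can execute right after instruction i."""
    kind, off = insns[i]
    if kind == "exit":
        return []
    if kind == "ja":
        return [i + 1 + off]
    if kind == "jcond":
        return [i + 1, i + 1 + off]  # fall through, or take the branch
    return [i + 1]


def remove_unreachable(insns: List[Insn]) -> List[Insn]:
    """Drop instructions not reachable from index 0 and re-aim the jumps.

    Assumes a well-formed program (every jump target is in range).
    """
    reachable = set()
    stack = [0]
    while stack:
        i = stack.pop()
        if i in reachable or not 0 <= i < len(insns):
            continue
        reachable.add(i)
        stack.extend(successors(insns, i))

    kept = sorted(reachable)
    new_index = {old: new for new, old in enumerate(kept)}

    result = []
    for old in kept:
        kind, off = insns[old]
        if kind in ("ja", "jcond"):
            # Recompute the offset relative to the new positions.
            off = new_index[old + 1 + off] - new_index[old] - 1
        result.append((kind, off))
    return result


if __name__ == "__main__":
    program = [("alu", 0), ("ja", 1), ("alu", 0), ("exit", 0)]
    print(remove_unreachable(program))
```

In the sample program the instruction at index 2 is skipped by the jump at index 1, so it disappears and the jump offset shrinks accordingly. This is one reason why instruction indices and jump offsets in the xlated dump may not match those in the object file.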

This is in addition to the changes performed in user space before loading the program, for example, replacing references to eBPF maps with file descriptors to the actual maps.
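
As a rough illustration of that user-space step, the Python sketch below patches a 64-bit immediate load so that it refers to a map file descriptor. The instruction layout (8 bytes: opcode, registers, 16-bit offset, 32-bit immediate), the opcode `0x18` and `BPF_PSEUDO_MAP_FD = 1` come from the eBPF instruction encoding. The helper names and the file descriptor value `5` are assumptions made up for this example, and a real loader such as libbpf does considerably more work.

```python
import struct

BPF_LD_IMM64 = 0x18      # BPF_LD | BPF_DW | BPF_IMM, occupies two 8-byte slots
BPF_PSEUDO_MAP_FD = 1    # src_reg value meaning "imm is a map file descriptor"
INSN_SIZE = 8


def encode(opcode: int, dst: int, src: int, off: int, imm: int) -> bytes:
    """Pack one little-endian eBPF instruction slot."""
    return struct.pack("<BBhi", opcode, (src << 4) | dst, off, imm)


def decode(raw: bytes):
    """Unpack one slot into (opcode, dst_reg, src_reg, off, imm)."""
    opcode, regs, off, imm = struct.unpack("<BBhi", raw)
    return opcode, regs & 0x0F, regs >> 4, off, imm


def relocate_map_ref(insns: bytearray, insn_index: int, map_fd: int) -> None:
    """Hypothetical helper: point an ld_imm64 at an already created map."""
    start = insn_index * INSN_SIZE
    opcode, dst, _src, off, _imm = decode(bytes(insns[start:start + INSN_SIZE]))
    if opcode != BPF_LD_IMM64:
        raise ValueError("relocation target is not an ld_imm64 instruction")
    insns[start:start + INSN_SIZE] = encode(
        opcode, dst, BPF_PSEUDO_MAP_FD, off, map_fd
    )


if __name__ == "__main__":
    # As stored in the ELF file: r1 = 0 (placeholder), second slot all zero.
    program = bytearray(encode(BPF_LD_IMM64, 1, 0, 0, 0) + encode(0, 0, 0, 0, 0))
    relocate_map_ref(program, 0, map_fd=5)  # 5 is a made-up file descriptor
    print(decode(bytes(program[:INSN_SIZE])))
```

The placeholder instruction is what `llvm-objdump -d` sees in the object file. The kernel never receives that version, because the loader patches it before the program is loaded, and the verifier then replaces the file descriptor with a reference to the map itself.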
