Why is the output of bpftool prog dump xlated a bit different from the output of llvm-objdump -d? What does xlated (translated instructions) mean? How does the kernel rewrite the bytecode?
CodePudding user response:
Why is the output of bpftool prog dump xlated a bit different from the output of llvm-objdump -d?
When you load an eBPF program into the kernel, your loader application usually takes an ELF file and extracts the eBPF bytecode instructions from it. Before sending them to the kernel, it performs a number of ELF relocations (for eBPF maps, or for CO-RE, for example), which means that the bytecode you actually load into the kernel is already slightly different from the instructions stored in your ELF object file.
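To make the map relocation step concrete, here is a minimal sketch of what a loader does to one instruction. It assumes a little-endian host and models only the 8-byte instruction layout; the function name relocate_map_load and the descriptor value 5 are made up for illustration, and a real loader such as libbpf handles many more relocation kinds.

```python
import struct

# Constants from the eBPF instruction set (see linux/bpf.h).
BPF_LD_IMM64 = 0x18      # BPF_LD | BPF_IMM | BPF_DW: 64-bit immediate load, two 8-byte slots
BPF_PSEUDO_MAP_FD = 1    # src_reg value meaning "imm holds a map file descriptor"
# One instruction slot: opcode, dst/src register nibbles, offset, imm (little-endian host assumed).
INSN = struct.Struct("<BBhi")


def relocate_map_load(code: bytes, insn_index: int, map_fd: int) -> bytes:
    """Return a copy of `code` whose ld_imm64 at `insn_index` refers to `map_fd`.

    Hypothetical helper, not a libbpf API.
    """
    pos = insn_index * INSN.size
    opcode, regs, off, _imm = INSN.unpack_from(code, pos)
    if opcode != BPF_LD_IMM64:
        raise ValueError("relocation does not point at a 64-bit immediate load")
    dst_reg = regs & 0x0F                      # low nibble is dst_reg
    regs = (BPF_PSEUDO_MAP_FD << 4) | dst_reg  # high nibble is src_reg
    patched = bytearray(code)
    INSN.pack_into(patched, pos, opcode, regs, off, map_fd)
    return bytes(patched)


# As stored in the ELF file: "r1 = 0 ll" with a placeholder immediate,
# followed by the second slot that holds the upper 32 bits.
elf_code = INSN.pack(BPF_LD_IMM64, 0x01, 0, 0) + INSN.pack(0, 0, 0, 0)

# 5 is a made-up descriptor; a real loader gets it back when it creates the map.
loaded_code = relocate_map_load(elf_code, 0, map_fd=5)
print(elf_code.hex(), "->", loaded_code.hex())
```

llvm-objdump -d shows the first form, with the placeholder. The kernel receives the second form, and the verifier then replaces the descriptor with a reference to the map itself, which is what ends up in the xlated dump.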
Furthermore, the verifier in the kernel further modifies the bytecode. What you can dump with bpftool prog dump xlated is the result of those different changes: the eBPF program as stored in the kernel, before JIT-compilation, but after ELF relocations (in user space) and kernel rewrites. This is why it looks slightly different from the output of llvm-objdump -d on your ELF file.
What does xlated (translated instructions) mean?
See the above - it means that the eBPF bytecode instructions have been adjusted by the kernel verifier, either for efficiency, or simply because it needs to.
How does the kernel rewrite the bytecode?
There are a number of steps involved here. You'd need to go and read the verifier code to get all the details. Some of them include:
- Updating the context access (e.g. converting accesses to the struct __sk_buff used when writing networking programs for TC into accesses to the actual struct sk_buff used in the kernel)
- Inlining map access (loading directly from the map address instead of doing a more costly call to bpf_map_lookup_elem())
- Inlining some other helper functions
- Removing unreachable instructions
This is in addition to the changes performed in user space before loading the program, for example, replacing references to eBPF maps with file descriptors to the actual maps.