Error: RESOURCE_LEAK (CWE-772):
dwarves-1.21/btf_loader.c:293: alloc_fn: Storage is returned from allocation function "type__new".
dwarves-1.21/btf_loader.c:293: var_assign: Assigning: "enumeration" = storage returned from "type__new(DW_TAG_enumeration_type, tp->name_off, ((*tp).size ? (*tp).size * 8U : 32UL))".
dwarves-1.21/btf_loader.c:315: noescape: Resource "enumeration" is not freed or pointed-to in "enumeration__delete".
dwarves-1.21/btf_loader.c:316: leaked_storage: Variable "enumeration" going out of scope leaks the storage it points to.
# 314| out_free:
# 315| enumeration__delete(enumeration, btfe->priv);
# 316|-> return -ENOMEM;
# 317| }
# 318|
Error: RESOURCE_LEAK (CWE-772):
dwarves-1.21/ctf_loader.c:398: alloc_fn: Storage is returned from allocation function "type__new".
dwarves-1.21/ctf_loader.c:398: var_assign: Assigning: "enumeration" = storage returned from "type__new(DW_TAG_enumeration_type, ctf__get32(ctf, &tp->base.ctf_name), (size ?: 32UL))".
dwarves-1.21/ctf_loader.c:421: noescape: Resource "enumeration" is not freed or pointed-to in "enumeration__delete".
dwarves-1.21/ctf_loader.c:422: leaked_storage: Variable "enumeration" going out of scope leaks the storage it points to.
# 420| out_free:
# 421| enumeration__delete(enumeration, ctf->priv);
# 422|-> return -ENOMEM;
# 423| }
# 424|
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When the CTF and later the BTF loaders were implemented they didn't use
obstacks. Over time some functions, like type__delete(),
class__delete() and enumeration__delete(), came to be shared, which can
lead either to crashes, by corrupting the obstack when its requirements
are not followed, or to leaks. To avoid such corruption, stop using
obstacks.
There is a penalty, but I think it's not worth the complexity to keep
using them.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
BTF is currently generated for functions that are in the ftrace list
or are extern.
A recent use case also needs BTF generated for functions included in
an allowlist.
In particular, the kernel commit:
e78aea8b2170 ("bpf: tcp: Put some tcp cong functions in allowlist for bpf-tcp-cc")
allows bpf programs to directly call a few tcp cc kernel functions. Those
kernel functions are currently allowed only if CONFIG_DYNAMIC_FTRACE
is set, to ensure they are in the ftrace list, but this kconfig dependency
is unnecessary.
Those kernel functions are specified under an ELF section .BTF_ids.
There was an earlier attempt [0] to add another filter for the functions in
the .BTF_ids section. That discussion concluded that the ftrace filter
should be removed instead.
This patch removes the ftrace filter and its related functions.
Number of BTF FUNC with and without is_ftrace_func():
My kconfig in x86: 40643 vs 46225
Jiri reported on arm: 25022 vs 55812
[0]: https://lore.kernel.org/dwarves/20210423213728.3538141-1-kafai@fb.com/
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Tested-by: Nathan Chancellor <nathan@kernel.org> # build
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Jiri Slaby <jirislaby@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: kernel-team@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Two dates have invalid days of the week. Fix that.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Cc: dwarves@vger.kernel.org
Cc: kernel-team@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently, when DWARF5 is enabled in kernel, DEBUG_INFO_BTF needs to be
disabled. I hacked the kernel to enable DEBUG_INFO_BTF like:
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -286,7 +286,6 @@ config DEBUG_INFO_DWARF5
bool "Generate DWARF Version 5 debuginfo"
depends on GCC_VERSION >= 50000 || CC_IS_CLANG
depends on CC_IS_GCC || $(success,$(srctree)/scripts/test_dwarf5_support.sh $(CC) $(CLANG_FLAGS))
- depends on !DEBUG_INFO_BTF
help
and tried DWARF5 with latest trunk clang, thin-LTO and no LTO.
In both cases, I got a few additional failures like:
$ ./test_progs -n 55/2
...
libbpf: extern (var ksym) 'bpf_prog_active': failed to find BTF ID in kernel BTF(s).
libbpf: failed to load object 'kfunc_call_test_subprog'
libbpf: failed to load BPF skeleton 'kfunc_call_test_subprog': -22
test_subprog:FAIL:skel unexpected error: 0
#55/2 subprog:FAIL
Here, bpf_prog_active is a percpu global variable that pahole is supposed
to put into BTF, but it is not there.
Further analysis shows this is due to an encoding difference between DWARF4
and DWARF5. In DWARF5, a new section .debug_addr and several new ops,
e.g. DW_OP_addrx, are introduced. The operand of DW_OP_addrx is an index
into the .debug_addr section, starting from the offset encoded with
DW_AT_addr_base in DW_TAG_compile_unit.
For the above 'bpf_prog_active' example, with DWARF4, we have
0x02281a96: DW_TAG_variable
DW_AT_name ("bpf_prog_active")
DW_AT_decl_file ("/home/yhs/work/bpf-next/include/linux/bpf.h")
DW_AT_decl_line (1170)
DW_AT_decl_column (0x01)
DW_AT_type (0x0226d171 "int")
DW_AT_external (true)
DW_AT_declaration (true)
0x02292f04: DW_TAG_variable
DW_AT_specification (0x02281a96 "bpf_prog_active")
DW_AT_decl_file ("/home/yhs/work/bpf-next/kernel/bpf/syscall.c")
DW_AT_decl_line (45)
DW_AT_location (DW_OP_addr 0x28940)
For DWARF5, we have
0x0138b0a1: DW_TAG_variable
DW_AT_name ("bpf_prog_active")
DW_AT_type (0x013760b9 "int")
DW_AT_external (true)
DW_AT_decl_file ("/home/yhs/work/bpf-next/kernel/bpf/syscall.c")
DW_AT_decl_line (45)
DW_AT_location (DW_OP_addrx 0x16)
This patch added support for DW_OP_addrx. With the patch, the above
failing bpf selftest and other similar failed selftests succeeded.
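A minimal sketch of the DW_OP_addrx lookup described above, assuming a raw
little-endian copy of the .debug_addr section is at hand. The function name
and parameters are hypothetical and are not taken from the dwarves sources;
addr_base stands for the value of DW_AT_addr_base of the CU.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Resolve a DW_OP_addrx operand (an index) to an address.
 * Assumptions: debug_addr holds the raw section contents, data is
 * little-endian, addr_size is 4 or 8.
 * Returns false if the entry would fall outside the section.
 */
bool addrx_resolve(const uint8_t *debug_addr, size_t len, uint64_t addr_base,
		   uint64_t index, unsigned int addr_size, uint64_t *addr)
{
	if (addr_size != 4 && addr_size != 8)
		return false;
	if (addr_base > len || index > (len - addr_base) / addr_size)
		return false;

	uint64_t off = addr_base + index * addr_size;

	if (len - off < addr_size)
		return false;

	uint64_t v = 0;

	/* Assemble the address byte by byte, least significant first. */
	for (unsigned int i = 0; i < addr_size; i++)
		v |= (uint64_t)debug_addr[off + i] << (8 * i);

	*addr = v;
	return true;
}
```

The index is scaled by the address size and added to the base, so the
operand 0x16 in the dump above would select the 23rd entry after the base.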
Signed-off-by: Yonghong Song <yhs@fb.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Bill Wendling <morbo@google.com>
Cc: David Blaikie <dblaikie@gmail.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Cc: kernel-team@fb.com
Link: https://lore.kernel.org/r/20210403184158.2834387-1-yhs@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
With the latest bpf-next built with clang LTO (thin or full), I hit one
test failure:
$ ./test_progs -t tcp
...
libbpf: extern (func ksym) 'tcp_slow_start': func_proto [23] incompatible with kernel [115303]
libbpf: failed to load object 'bpf_cubic'
libbpf: failed to load BPF skeleton 'bpf_cubic': -22
test_cubic:FAIL:bpf_cubic__open_and_load failed
#9/2 cubic:FAIL
...
The failure is caused by the bpf program's 'tcp_slow_start' func
signature being different from the one in vmlinux BTF. The bpf program
uses the following signature:
extern __u32 tcp_slow_start(struct tcp_sock *tp, __u32 acked);
which is identical to the kernel definition in linux:include/net/tcp.h:
u32 tcp_slow_start(struct tcp_sock *tp, u32 acked);
while the vmlinux BTF definition looks like:
[115303] FUNC_PROTO '(anon)' ret_type_id=0 vlen=2
'tp' type_id=39373
'acked' type_id=18
[115304] FUNC 'tcp_slow_start' type_id=115303 linkage=static
The above is dumped with `bpftool btf dump file vmlinux`.
You can see the ret_type_id is 0 and this caused the problem.
Looking at dwarf, we have:
0x11f2ec67: DW_TAG_subprogram
DW_AT_low_pc (0xffffffff81ed2330)
DW_AT_high_pc (0xffffffff81ed235c)
DW_AT_frame_base ()
DW_AT_GNU_all_call_sites (true)
DW_AT_abstract_origin (0x11f2ed66 "tcp_slow_start")
...
0x11f2ed66: DW_TAG_subprogram
DW_AT_name ("tcp_slow_start")
DW_AT_decl_file ("/home/yhs/work/bpf-next/net/ipv4/tcp_cong.c")
DW_AT_decl_line (392)
DW_AT_prototyped (true)
DW_AT_type (0x11f130c2 "u32")
DW_AT_external (true)
DW_AT_inline (DW_INL_inlined)
We have a subprogram which has an abstract_origin pointing to the
subprogram prototype with the return type. The current one-pass recoding
cannot easily resolve this, since at the time 0x11f2ec67 is recoded,
the return type in 0x11f2ed66 has not been resolved yet.
To simplify the implementation, I just added another pass to go through all
functions after the recoding pass. This should resolve the above issue.
With this patch, among a total of 250999 functions in vmlinux, 4821
functions need return type adjustment from type id 0 to the correct values.
The above failed bpf selftest passes too.
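The extra pass can be pictured with the following sketch. It is a simplified
model, not the dwarf_loader code: 'struct func', NO_ORIGIN and
fixup_ret_types() are hypothetical names, and functions are kept in a flat
array where 'origin' is the index of the abstract origin entry.

```c
#include <stddef.h>
#include <stdint.h>

#define NO_ORIGIN ((size_t)-1)

/* Hypothetical, simplified stand-in for a loaded function entry. */
struct func {
	uint32_t ret_type;	/* 0 means void or "not known yet" */
	size_t	 origin;	/* index of the abstract origin, or NO_ORIGIN */
};

/*
 * Second pass: runs only after every entry went through the first
 * (recoding) pass, so the origin's return type is final by now.
 * Returns how many entries were adjusted.
 */
size_t fixup_ret_types(struct func *funcs, size_t n)
{
	size_t fixed = 0;

	for (size_t i = 0; i < n; i++) {
		struct func *f = &funcs[i];

		if (f->ret_type != 0 || f->origin == NO_ORIGIN || f->origin >= n)
			continue;
		if (funcs[f->origin].ret_type != 0) {
			f->ret_type = funcs[f->origin].ret_type;
			fixed++;
		}
	}
	return fixed;
}
```

Doing this in a separate loop avoids any dependency on the order in which
the concrete and the abstract entries appear in the DWARF data.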
Committer testing:
Before:
$ pfunct tcp_slow_start
void tcp_slow_start(struct tcp_sock * tp, u32 acked);
$
$ pfunct --prototypes /sys/kernel/btf/vmlinux > before
$ head before
int fb_is_primary_device(struct fb_info * info);
int arch_resume_nosmt(void);
int relocate_restore_code(void);
int arch_hibernation_header_restore(void * addr);
int get_e820_md5(struct e820_table * table, void * buf);
int arch_hibernation_header_save(void * addr, unsigned int max_size);
int pfn_is_nosave(long unsigned int pfn);
int swsusp_arch_resume(void);
int amd_bus_cpu_online(unsigned int cpu);
void pci_enable_pci_io_ecs(void);
$
After:
$ pfunct -F btf ../build/bpf_clang_thin_lto/vmlinux -f tcp_slow_start
u32 tcp_slow_start(struct tcp_sock * tp, u32 acked);
$
$ pfunct -F btf --prototypes ../build/bpf_clang_thin_lto/vmlinux > after
$
$ head after
int fb_is_primary_device(struct fb_info * info);
int arch_resume_nosmt(void);
int relocate_restore_code(void);
int arch_hibernation_header_restore(void * addr);
int get_e820_md5(struct e820_table * table, void * buf);
int arch_hibernation_header_save(void * addr, unsigned int max_size);
int pfn_is_nosave(long unsigned int pfn);
int swsusp_arch_resume(void);
int amd_bus_cpu_online(unsigned int cpu);
void pci_enable_pci_io_ecs(void);
$
$ diff -u before after | grep ^+ | wc -l
1604
$
$ diff -u before after | grep tcp_slow_start
-void tcp_slow_start(struct tcp_sock * tp, u32 acked);
+u32 tcp_slow_start(struct tcp_sock * tp, u32 acked);
$
$ diff -u before after | grep ^[+-] | head
--- before 2021-04-02 11:35:15.578160795 -0300
+++ after 2021-04-02 11:33:34.204847317 -0300
-void set_bf_sort(const struct dmi_system_id * d);
+int set_bf_sort(const struct dmi_system_id * d);
-void raw_pci_write(unsigned int domain, unsigned int bus, unsigned int devfn, int reg, int len, u32 val);
-void raw_pci_read(unsigned int domain, unsigned int bus, unsigned int devfn, int reg, int len, u32 * val);
+int raw_pci_write(unsigned int domain, unsigned int bus, unsigned int devfn, int reg, int len, u32 val);
+int raw_pci_read(unsigned int domain, unsigned int bus, unsigned int devfn, int reg, int len, u32 * val);
-void xen_find_device_domain_owner(struct pci_dev * dev);
+int xen_find_device_domain_owner(struct pci_dev * dev);
$
The same results are obtained if using /sys/kernel/btf/vmlinux after
rebooting with the kernel built from the ../build/bpf_clang_thin_lto/vmlinux
file used in the above 'after' examples.
Signed-off-by: Yonghong Song <yhs@fb.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: David Blaikie <dblaikie@gmail.com>
Link: https://lore.kernel.org/bpf/82dfd420-96f9-aedc-6cdc-bf20042455db@fb.com/
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Bill Wendling <morbo@google.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Cc: kernel-team@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Commit 39227909db checked compilation flags to see whether the binary
is built with LTO or not (-flto).
Currently, for clang LTO builds, the default setting won't put compilation
flags in dwarf due to size concerns.
David Blaikie suggested in [1] to scan through .debug_abbrev for
DW_FORM_ref_addr, which should be much faster than scanning through CU's.
This patch implements this suggestion and replaces the previous
compilation flag matching approach. The overhead of this approach
does seem to be manageable.
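As an illustration of why such a scan is cheap, here is a sketch that walks
a raw copy of .debug_abbrev looking for DW_FORM_ref_addr. It is not the code
from this patch, which may well use libdw helpers instead; the function
names are hypothetical. The layout assumed is the standard one: each
declaration is a ULEB128 code, a ULEB128 tag, a one byte children flag and
(attribute, form) ULEB128 pairs terminated by (0, 0), with
DW_FORM_implicit_const carrying an extra LEB128 value.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FORM_REF_ADDR		0x10	/* DW_FORM_ref_addr */
#define FORM_IMPLICIT_CONST	0x21	/* DW_FORM_implicit_const */

/* Decode one ULEB128 value, advancing *p. False on truncated input. */
static bool read_uleb128(const uint8_t **p, const uint8_t *end, uint64_t *out)
{
	uint64_t v = 0;
	unsigned int shift = 0;

	while (*p < end) {
		uint8_t b = *(*p)++;

		if (shift < 64)
			v |= (uint64_t)(b & 0x7f) << shift;
		shift += 7;
		if (!(b & 0x80)) {
			*out = v;
			return true;
		}
	}
	return false;
}

/* True if any abbreviation in the section uses DW_FORM_ref_addr. */
bool abbrev_has_ref_addr(const uint8_t *buf, size_t len)
{
	const uint8_t *p = buf, *end = buf + len;

	while (p < end) {
		uint64_t code, tag, attr, form, value;

		if (!read_uleb128(&p, end, &code))
			return false;
		if (code == 0)		/* end of one table, try the next */
			continue;
		if (!read_uleb128(&p, end, &tag) || p >= end)
			return false;
		p++;			/* skip the has-children byte */

		for (;;) {
			if (!read_uleb128(&p, end, &attr) ||
			    !read_uleb128(&p, end, &form))
				return false;
			if (attr == 0 && form == 0)
				break;
			if (form == FORM_REF_ADDR)
				return true;
			/* The constant lives in the abbrev itself, skip it. */
			if (form == FORM_IMPLICIT_CONST &&
			    !read_uleb128(&p, end, &value))
				return false;
		}
	}
	return false;
}
```

The scan stops at the first hit, which is consistent with the early bail
out reported for the LTO case in the measurements.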
I made a change to measure the overhead of cus__merging_cu():
@@ -2650,7 +2652,15 @@ static int cus__load_module(struct cus *cus, struct conf_load *conf,
}
}
- if (cus__merging_cu(dw)) {
+ bool do_merging;
+ struct timeval start, end;
+ gettimeofday(&start, NULL);
+ do_merging = cus__merging_cu(dw);
+ gettimeofday(&end, NULL);
+ fprintf(stderr, "%ld %ld -> %ld %ld\n", start.tv_sec, start.tv_usec,
+ end.tv_sec, end.tv_usec);
+
+ if (do_merging) {
res = cus__merge_and_process_cu(cus, conf, mod, dw, elf, filename,
build_id, build_id_len,
type_cu ? &type_dcu : NULL);
For LTO vmlinux, the cus__merging_cu() check takes 130us out of a total
"pahole -J vmlinux" time of 65sec, as the function bails out early after
detecting a merging CU condition.
For non-LTO vmlinux, the cus__merging_cu() check takes ~171368us out of a
total pahole time of 36sec, roughly 0.5% overhead.
[1] https://lore.kernel.org/bpf/20210328064121.2062927-1-yhs@fb.com/
Suggested-by: David Blaikie <blaikie@google.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Bill Wendling <morbo@google.com>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: David Blaikie <dblaikie@gmail.com>
Cc: Fāng-ruì Sòng <maskray@google.com>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Cc: kernel-team@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For vmlinux built with clang thin-LTO or LTO (Link Time Optimization),
there exist cross-CU type references. For example, this can happen:
compile unit 1:
tag 10: type A
compile unit 2:
...
refer to type A (tag 10 in compile unit 1)
I only checked a few but have seen type A may be a simple type like
"unsigned char" or a complex type like an array of base types.
To resolve this issue, the DW_AT_producer attribute of the first few
DW_TAG_compile_unit entries is checked. If the binary is built with clang
LTO, all debuginfo DWARF CU's will be merged into one pahole CU, which will
resolve the above cross-CU tag reference issue. To test whether a binary
is built with clang LTO or not, the DW_AT_producer string of the first 5
debuginfo CU's is checked for "clang version" and "-flto".
More than one CU is checked because a few linux objects disable LTO for
various reasons.
Merging CU's will create a single CU with lots of types, tags and
functions. For example with clang thin-LTO built vmlinux, I saw 9M
entries in types table, 5.2M in tags table. The below are pahole
wallclock time for different hashbits:
command line: time pahole -J vmlinux
# of hashbits wallclock time in seconds
15 460
16 255
17 131
18 97
19 75
20 69
21 64
22 62
23 58
24 64
The problem is with hashtags__find(), esp. the loop
uint32_t bucket = hashtags__fn(id);
const struct hlist_head *head = hashtable + bucket;
hlist_for_each_entry(tpos, pos, head, hash_node) {
if (tpos->id == id)
return tpos;
}
Say we have 9M types and (1 << 15) buckets, that means each bucket will
have roughly 275 elements (9M / 32768). So each successful lookup will
traverse about 137 iterations of the loop on average.
If we have 1 << 21 buckets, then each bucket will have about 4 elements,
and the average number of loop iterations for hashtags__find() will be 2.
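The bucket arithmetic can be double checked with a few lines of C. This is
only a back-of-the-envelope helper with a hypothetical name, assuming
entries spread evenly over the buckets; hashbits must be below 64.

```c
#include <stdint.h>

/* Average number of entries per bucket for a table of 2^hashbits buckets. */
double hashtable_load(uint64_t entries, unsigned int hashbits)
{
	return (double)entries / (double)((uint64_t)1 << hashbits);
}
```

A lookup that finds its entry walks about half of a chain on average, so
halving the value returned here approximates the loop iterations per
successful hashtags__find() call.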
Note that the number of hashbits 24 makes performance worse than 23. The
reason could be that 23 hashbits can cover 8M buckets (close to 9M for
the number of entries in types table). Higher number of hash bits
allocates more memory and becomes less cache efficient compared to 23
hashbits.
This patch picks 21 hashbits as the starting value and will try to
allocate memory based on that. If memory allocation fails, we will go
with fewer hashbits until we reach 15 hashbits, which is the default for
the non merge-CU case.
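A sketch of that fallback, assuming the table is an array of bucket head
pointers. START_BITS, DEFAULT_BITS, hashtable_alloc() and limited_calloc()
are hypothetical names used only for this illustration; the allocator is
passed in so that a failure can be simulated.

```c
#include <stddef.h>
#include <stdlib.h>

#define START_BITS	21	/* first attempt in the merged CU case */
#define DEFAULT_BITS	15	/* default for the non merge-CU case */

typedef void *(*alloc_fn)(size_t nmemb, size_t size);

/*
 * Try 2^21 buckets first, then halve the table on each failure until the
 * default size is reached. On success *bits tells which size was used.
 */
void *hashtable_alloc(alloc_fn alloc, unsigned int *bits)
{
	for (unsigned int b = START_BITS; b >= DEFAULT_BITS; b--) {
		void *table = alloc((size_t)1 << b, sizeof(void *));

		if (table != NULL) {
			*bits = b;
			return table;
		}
	}
	return NULL;
}

/* Stand-in for a memory constrained system: refuses more than 2^17 buckets. */
void *limited_calloc(size_t nmemb, size_t size)
{
	if (nmemb > ((size_t)1 << 17))
		return NULL;
	return calloc(nmemb, size);
}
```

Passing calloc itself as the allocator gives the normal behaviour, where
the first attempt with 21 bits is expected to succeed on most machines.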
Committer notes:
To test this we need this patch to be applied on bpf-next/master:
https://lore.kernel.org/bpf/20210328064121.2062927-1-yhs@fb.com/
Signed-off-by: Yonghong Song <yhs@fb.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Bill Wendling <morbo@google.com>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Cc: kernel-team@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Both cus__load_debug_types() and cus__load_module() create new cu's
followed by initialization. The initialization code is identical, so
let us refactor it into a common function, which can also be used later
when dealing with merging cu's.
Signed-off-by: Yonghong Song <yhs@fb.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Bill Wendling <morbo@google.com>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Cc: kernel-team@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently, the types/tags hash table has a fixed HASHTAGS__BITS = 15.
That means the number of buckets will be 1UL << 15 = 32768.
In my experiments, a thin-LTO built vmlinux has roughly 9M entries
in the types table and 5.2M entries in the tags table. So the number
of buckets is too small for an efficient lookup. This patch
refactors the code to allow the number of buckets to be changed.
In addition, the hashtags__fn(key) return value is currently
assigned to a uint16_t. Change it to uint32_t, as in a later patch
the number of hashtag bits can be increased to more than 16.
Committer notes:
Since we keep a pointer to the 'struct dwarf_cu' in cu->priv and now we
want to release the hashtables it contains, we need to make it also be
dynamically allocated, otherwise tools such as 'codiff' will fail when
calling cus__delete() -> cu__delete() -> dwarf_cu__delete().
Signed-off-by: Yonghong Song <yhs@fb.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Bill Wendling <morbo@google.com>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Cc: kernel-team@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
By default, pahole makes use only of BTF features introduced with kernel
v5.2. Features that are added later need to be turned on with explicit
feature flags, such as --btf_gen_floats. According to [1], this will
hinder the people who generate BTF for kernels externally (e.g. for old
kernels to support BPF CO-RE).
Introduce --btf_gen_all that allows using all BTF features supported
by pahole.
[1] https://lore.kernel.org/dwarves/CAEf4Bzbyugfb2RkBkRuxNGKwSk40Tbq4zAvhQT8W=fVMYWuaxA@mail.gmail.com/
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Some BPF programs compiled on s390 fail to load, because s390
arch-specific linux headers contain float and double types.
Fix as follows:
- Make the DWARF loader fill base_type.float_type.
- Introduce the --btf_gen_floats command-line parameter, so that
pahole could be used to build both the older and the newer kernels.
- libbpf introduced the support for the floating-point types in commit
986962fade5, so update the libbpf submodule to that version and use
the new btf__add_float() function in order to emit the floating-point
types when not in the compatibility mode.
- Make the BTF loader recognize the new BTF kind.
Example of the resulting entry in the vmlinux BTF:
[7164] FLOAT 'double' size=8
when building with:
LLVM_OBJCOPY=${OBJCOPY} ${PAHOLE} -J ${1} --btf_gen_floats
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The assert macro is compiled out with NDEBUG which can lead to an unused
variable warning if the variable is only read in the assert. This is
seen just here:
dwarf_loader.c:957:17: error: unused variable 'tag' [-Werror,-Wunused-variable]
const uint16_t tag = dwarf_tag(die);
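The pattern behind the warning, and one common way to silence it, can be
reproduced in isolation. This is a sketch under assumptions:
fake_dwarf_tag() is a hypothetical stand-in for dwarf_tag(), and the
(void) cast is just one possible remedy, not necessarily the one applied
in dwarf_loader.c.

```c
#include <assert.h>
#include <stdint.h>

#define TAG_SUBPROGRAM 0x2e	/* numeric value of DW_TAG_subprogram */

/* Hypothetical stand-in for dwarf_tag(die): low 16 bits hold the tag. */
static uint16_t fake_dwarf_tag(uint32_t die)
{
	return (uint16_t)(die & 0xffff);
}

uint32_t process_function(uint32_t die)
{
	const uint16_t tag = fake_dwarf_tag(die);

	/* With -DNDEBUG this line expands to nothing... */
	assert(tag == TAG_SUBPROGRAM);
	/* ...so without this cast 'tag' would have no reader left. */
	(void)tag;

	return die >> 16;
}
```

Building this with -DNDEBUG -Wall -Werror and removing the cast should
bring back the unused variable error quoted above.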
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently when processing a DWARF function, we check its entrypoint
against ftrace addresses, assuming that the ftrace address matches with
the function's entrypoint.
This is not the case on some architectures as reported by Nathan
when building kernel on arm [1].
Fix the check to take into account the whole function, not just the
entrypoint.
Most of the is_ftrace_func code was contributed by Andrii.
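The idea of matching against the whole function instead of just its
entrypoint can be sketched as a range query over a sorted address list.
This is an illustration only: range_has_addr() is a hypothetical name and
the real is_ftrace_func() may be organized differently.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * addrs must be sorted in ascending order.
 * Returns true if at least one address lies in [start, start + size).
 */
bool range_has_addr(const uint64_t *addrs, size_t n, uint64_t start,
		    uint64_t size)
{
	size_t lo = 0, hi = n;

	/* Binary search for the first address that is >= start. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (addrs[mid] < start)
			lo = mid + 1;
		else
			hi = mid;
	}

	/* addrs[lo] >= start here, so the subtraction cannot wrap. */
	return lo < n && addrs[lo] - start < size;
}
```

With an entrypoint-only check, a function starting at 0x2000 whose ftrace
address is 0x2008 would be missed, while the range check accepts it.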
[1] https://lore.kernel.org/bpf/20210209034416.GA1669105@ubuntu-m3-large-x86/
Committer notes:
Test comments by Nathan:
"I did several builds with CONFIG_DEBUG_INFO_BTF enabled (arm64, ppc64le,
and x86_64) and saw no build errors. I did not do any runtime testing."
Test comments by Sedat:
Linux v5.11-rc7+ and LLVM/Clang v12.0.0-rc1 on x86 (64bit)
Reported-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Hao Luo <haoluo@google.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@chromium.org>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This hashing function[1] produces better hash table bucket
distributions. The original hashing function always produced zeros in
the three least significant bits. The new hashing function gives a
modest performance boost:
Original: 0:11.373s
New: 0:11.110s
for a performance improvement of ~2%.
[1] From the hash function used in libbpf.
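For reference, a multiplicative hash in the style of libbpf's hash_bits()
looks like the sketch below. That this is the exact form adopted by the
patch is an assumption; hash_bits64() is a hypothetical name and 'bits'
must be between 1 and 32 here.

```c
#include <stdint.h>

/* 2^64 divided by the golden ratio, the usual Fibonacci hashing multiplier. */
#define GOLDEN_RATIO_64 11400714819323198485ULL

/* Mix the key and keep the requested number of upper bits. */
uint32_t hash_bits64(uint64_t key, unsigned int bits)
{
	return (uint32_t)((key * GOLDEN_RATIO_64) >> (64 - bits));
}
```

Taking the upper bits of the product means that every bit of the key can
influence the bucket number, including when the low bits of the keys are
always zero.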
Committer notes:
Bill found the suboptimality of the hash function being used, Andrii
suggested using the libbpf one, which ended up being better.
Signed-off-by: Bill Wendling <morbo@google.com>
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
clang with dwarf5 may generate non-regular int base type, i.e., not a
signed/unsigned char/short/int/longlong/__int128. Such base types are
often used to describe how an actual parameter or variable is generated.
For example,
0x000015cf: DW_TAG_base_type
DW_AT_name ("DW_ATE_unsigned_1")
DW_AT_encoding (DW_ATE_unsigned)
DW_AT_byte_size (0x00)
0x00010ed9: DW_TAG_formal_parameter
DW_AT_location (DW_OP_lit0,
DW_OP_not,
DW_OP_convert (0x000015cf) "DW_ATE_unsigned_1",
DW_OP_convert (0x000015d4) "DW_ATE_unsigned_8",
DW_OP_stack_value)
DW_AT_abstract_origin (0x00013984 "branch")
What it does is take a literal "0", apply a "not" operation, and then
convert the result to a one-bit unsigned int and then to an 8-bit unsigned
int.
Another example,
0x000e97e4: DW_TAG_base_type
DW_AT_name ("DW_ATE_unsigned_24")
DW_AT_encoding (DW_ATE_unsigned)
DW_AT_byte_size (0x03)
0x000f88f8: DW_TAG_variable
DW_AT_location (indexed (0x3c) loclist = 0x00008fb0:
[0xffffffff82808812, 0xffffffff82808817):
DW_OP_breg0 RAX+0,
DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64",
DW_OP_convert (0x000e97df) "DW_ATE_unsigned_8",
DW_OP_stack_value,
DW_OP_piece 0x1,
DW_OP_breg0 RAX+0,
DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64",
DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32",
DW_OP_lit8,
DW_OP_shr,
DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32",
DW_OP_convert (0x000e97e4) "DW_ATE_unsigned_24",
DW_OP_stack_value,
DW_OP_piece 0x3
......
At one point, a right shift by 8 happens and the result is converted to
32-bit unsigned int and then to 24-bit unsigned int.
BTF does not need any of this DW_OP_* information, and such non-regular
int types will cause libbpf to emit errors.
Let us sanitize them to generate BTF acceptable to libbpf and kernel.
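What "regular" means here can be written down as a small predicate. This
is a sketch under assumptions: the set of accepted sizes (1, 2, 4, 8 and 16
bytes) follows the list of regular types given above, and the fallback of 4
bytes in sanitized_int_byte_size() is purely illustrative, not necessarily
what the encoder emits. Both function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Regular int sizes: a power of two between 1 and 16 bytes (up to __int128). */
bool int_byte_size_is_regular(uint32_t byte_size)
{
	return byte_size != 0 && byte_size <= 16 &&
	       (byte_size & (byte_size - 1)) == 0;
}

/* Illustrative fallback only: replace odd sizes with a plain 4 byte int. */
uint32_t sanitized_int_byte_size(uint32_t byte_size)
{
	return int_byte_size_is_regular(byte_size) ? byte_size : 4;
}
```

Both examples above are caught: DW_ATE_unsigned_1 has a byte size of 0 and
DW_ATE_unsigned_24 has a byte size of 3.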
Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Mark Wielaard <mark@klomp.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This reverts commit 82749180b2.
Getting in the way of releasing 1.20, breaking the build of a dwarves
rpm when a libbpf package is installed in a fedora 33 system:
In file included from /home/acme/rpmbuild/BUILD/dwarves-1.20/strings.c:7:
/home/acme/rpmbuild/BUILD/dwarves-1.20/pahole_strings.h:9:10: fatal error: bpf/btf.h: No such file or directory
9 | #include <bpf/btf.h>
| ^~~~~~~~~~~
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In d783117162 ("dwarf_loader: Handle DWARF5 DW_TAG_call_site like
DW_TAG_GNU_call_site") we forgot to handle that GNU extension promotion
to be a standard, fix it.
Cc: Mark Wielaard <mark@klomp.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This made the build fail on libbpf CI on systems where
DW_FORM_implicit_const isn't defined, so do it conditionally.
Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Building on fedora rawhide gets us:
/home/acme/git/pahole/dtagnames.c:17:16: error: ‘mallinfo’ is deprecated [-Werror=deprecated-declarations]
17 | struct mallinfo m = mallinfo();
| ^~~~~~~~
In file included from /home/acme/git/pahole/dtagnames.c:10:
/usr/include/malloc.h:118:24: note: declared here
118 | extern struct mallinfo mallinfo (void) __THROW __MALLOC_DEPRECATED;
| ^~~~~~~~
cc1: all warnings being treated as errors
glibc-2.32.9000-26.fc34.x86_64
So stop using it, it was just for debugging/assessing memory usage.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When calling cmake on the build dir we got this on fedora rawhide (cmake 3.19.4):
CMake Deprecation Warning at CMakeLists.txt:2 (cmake_minimum_required):
Compatibility with CMake < 2.8.12 will be removed from a future version of
CMake.
Update the VERSION argument <min> value or use a ...<max> suffix to tell
CMake that the project does not need compatibility with older versions.
So bump it.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In function ‘enumeration__calc_prefix’,
inlined from ‘enumeration__calc_prefix’ at /home/acme/git/pahole/dwarves.c:1661:6:
/home/acme/git/pahole/dwarves.c:1683:38: warning: ‘strndup’ specified bound 2147483647 exceeds source size 1 [-Wstringop-overread]
1683 | enumeration->member_prefix = strndup(curr_name, common_part);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
$ gcc --version | head -1
gcc (GCC) 11.0.0 20210123 (Red Hat 11.0.0-0)
$
So check if we actually found the common part, even with that meaning
the enumeration has no entries.
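The guard can be illustrated with a standalone version of the prefix
calculation. This is a sketch, not the code in dwarves.c: common_prefix()
is a hypothetical helper working on a plain array of names, and it copies
the prefix with malloc() and memcpy() instead of strndup() to stay within
ISO C.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Returns a newly allocated copy of the prefix shared by all names, or
 * NULL when there are no names or no common part was found.
 */
char *common_prefix(const char *const *names, size_t n)
{
	if (n == 0)
		return NULL;	/* nothing to compare: no prefix */

	size_t common = strlen(names[0]);

	for (size_t i = 1; i < n && common > 0; i++) {
		size_t j = 0;

		while (j < common && names[i][j] == names[0][j])
			j++;
		common = j;
	}

	if (common == 0)
		return NULL;	/* only copy when something was found */

	char *prefix = malloc(common + 1);

	if (prefix != NULL) {
		memcpy(prefix, names[0], common);
		prefix[common] = '\0';
	}
	return prefix;
}
```

The caller owns the returned string and is expected to free() it.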
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
DW_TAG_call_site and DW_TAG_call_site_parameter are the standardized
DWARF5 versions of DW_TAG_GNU_call_site and
DW_TAG_GNU_call_site_parameter. Handle them the same way (which is by
ignoring them).
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1922698
Signed-off-by: Mark Wielaard <mark@klomp.org>
Cc: dwarves@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Noticed with selftest/bpf/test_verifier_log failing on v5.11-rc5 with a
newer gcc.
It looks like we don't handle DW_FORM_implicit_const when computing
the byte offset (when handling DW_AT_data_member_location). It was
used for some struct members in my vmlinux, so we got zero for the byte
offset and that created another unique struct.
With this patch I no longer see any struct duplication and
test_verifier_log is working for me, although I could not reproduce the
error before.
Reported-by: Paul Moore <paul@paul-moore.com>
Reported-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: bpf <bpf@vger.kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This appeared in DWARF4 but is supported only in gcc's -gdwarf-5,
support it in a way that makes the output be the same for both cases:
$ gcc -gdwarf-4 -c examples/dwarf5/bf.c
$ pahole bf.o
struct pea {
long int a:1; /* 0: 0 8 */
long int b:1; /* 0: 1 8 */
long int c:1; /* 0: 2 8 */
/* XXX 29 bits hole, try to pack */
/* Bitfield combined with next fields */
int after_bitfield; /* 4 4 */
/* size: 8, cachelines: 1, members: 4 */
/* sum members: 4 */
/* sum bitfield members: 3 bits, bit holes: 1, sum bit holes: 29 bits */
/* last cacheline: 8 bytes */
};
$ gcc -gdwarf-5 -c examples/dwarf5/bf.c
$ pahole bf.o
struct pea {
long int a:1; /* 0: 0 8 */
long int b:1; /* 0: 1 8 */
long int c:1; /* 0: 2 8 */
/* XXX 29 bits hole, try to pack */
/* Bitfield combined with next fields */
int after_bitfield; /* 4 4 */
/* size: 8, cachelines: 1, members: 4 */
/* sum members: 4 */
/* sum bitfield members: 3 bits, bit holes: 1, sum bit holes: 29 bits */
/* last cacheline: 8 bytes */
};
$
Now with an integer before the bitfield:
$ cat examples/dwarf5/bf.c
struct pea {
int before_bitfield;
long a:1, b:1, c:1;
int after_bitfield;
} p;
$ gcc -gdwarf-4 -c examples/dwarf5/bf.c
$ pahole bf.o
struct pea {
int before_bitfield; /* 0 4 */
/* Bitfield combined with previous fields */
long int a:1; /* 0:32 8 */
long int b:1; /* 0:33 8 */
long int c:1; /* 0:34 8 */
/* XXX 29 bits hole, try to pack */
int after_bitfield; /* 8 4 */
/* size: 16, cachelines: 1, members: 5 */
/* sum members: 8 */
/* sum bitfield members: 3 bits, bit holes: 1, sum bit holes: 29 bits */
/* padding: 4 */
/* last cacheline: 16 bytes */
};
$ gcc -gdwarf-5 -c examples/dwarf5/bf.c
$ pahole bf.o
struct pea {
int before_bitfield; /* 0 4 */
/* Bitfield combined with previous fields */
long int a:1; /* 0:32 8 */
long int b:1; /* 0:33 8 */
long int c:1; /* 0:34 8 */
/* XXX 29 bits hole, try to pack */
int after_bitfield; /* 8 4 */
/* size: 16, cachelines: 1, members: 5 */
/* sum members: 8 */
/* sum bitfield members: 3 bits, bit holes: 1, sum bit holes: 29 bits */
/* padding: 4 */
/* last cacheline: 16 bytes */
};
$
And an array of long integers at the start, before the combination of an
integer with a long integer bitfield:
$ cat examples/dwarf5/bf.c
struct pea {
long array[3];
int before_bitfield;
long a:1, b:1, c:1;
int after_bitfield;
} p;
$ gcc -gdwarf-4 -c examples/dwarf5/bf.c
$ pahole bf.o
struct pea {
long int array[3]; /* 0 24 */
int before_bitfield; /* 24 4 */
/* Bitfield combined with previous fields */
long int a:1; /* 24:32 8 */
long int b:1; /* 24:33 8 */
long int c:1; /* 24:34 8 */
/* XXX 29 bits hole, try to pack */
int after_bitfield; /* 32 4 */
/* size: 40, cachelines: 1, members: 6 */
/* sum members: 32 */
/* sum bitfield members: 3 bits, bit holes: 1, sum bit holes: 29 bits */
/* padding: 4 */
/* last cacheline: 40 bytes */
};
$ gcc -gdwarf-5 -c examples/dwarf5/bf.c
$ pahole bf.o
struct pea {
long int array[3]; /* 0 24 */
int before_bitfield; /* 24 4 */
/* Bitfield combined with previous fields */
long int a:1; /* 24:32 8 */
long int b:1; /* 24:33 8 */
long int c:1; /* 24:34 8 */
/* XXX 29 bits hole, try to pack */
int after_bitfield; /* 32 4 */
/* size: 40, cachelines: 1, members: 6 */
/* sum members: 32 */
/* sum bitfield members: 3 bits, bit holes: 1, sum bit holes: 29 bits */
/* padding: 4 */
/* last cacheline: 40 bytes */
};
$
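The offsets printed above can be reproduced from a bit offset counted from
the start of the struct with the following sketch. It only shows
arithmetic that is consistent with the listings, assuming a little-endian
target; it is not the dwarf_loader implementation, and the names are
hypothetical. storage_bytes is the size of the bitfield's base type and
must not be zero.

```c
#include <stdint.h>

struct bitfield_pos {
	uint32_t byte_offset;	/* start of the storage unit, in bytes */
	uint32_t bit_offset;	/* position inside the storage unit, in bits */
};

/* Split a struct relative bit offset into "byte:bit" as printed above. */
struct bitfield_pos bitfield_pos_from_bits(uint64_t data_bit_offset,
					   uint32_t storage_bytes)
{
	uint64_t storage_bits = (uint64_t)storage_bytes * 8;
	struct bitfield_pos pos = {
		.byte_offset = (uint32_t)(data_bit_offset / storage_bits *
					  storage_bytes),
		.bit_offset  = (uint32_t)(data_bit_offset % storage_bits),
	};

	return pos;
}
```

For the last example, member 'a' starts 224 bits into the struct (24 bytes
of array plus 4 bytes of before_bitfield), which with an 8 byte 'long'
gives 24:32, matching the output.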
Reported-by: Mark Wielaard <mark@klomp.org>
Tested-by: "Daniel P. Berrangé" <berrange@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1919965
Link: https://lore.kernel.org/dwarves/20210128121122.GA775562@kernel.org/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Using the newly added __attr_offset(), so that we try to read it, if it
isn't there, then we don't need to use dwarf_hasattr() for setting
member->is_static.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Sometimes we need to check if an attribute is present and, if so, use its
result, so split the part that acts just on the Dwarf_Attribute from
attr_offset() and allow using it directly via __attr_offset().
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For very large ELF objects (with many sections), a symbol's st_shndx can
hold the special value SHN_XINDEX (65535).
This patch adds code to detect the optional extended section index
table and uses it to resolve the symbol's section index.
Add an elf_symtab__for_each_symbol_index macro that returns the symbol's
section index and use it in the collect functions.
Tested by running pahole on kernel compiled with:
make KCFLAGS="-ffunction-sections -fdata-sections" -j$(nproc) vmlinux
and ensure FUNC records are generated and match normal build (without
above KCFLAGS).
The bpf selftests also passed, and the generated kernel BTF is the same
as without the patch.
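A minimal sketch of the lookup described above, not the code from this
patch. The helper name sym_sec_index() and the flat xindex array are
hypothetical; the sketch assumes the extended section index table has
already been read into an array with one 32-bit entry per symbol.

```c
#include <stddef.h>
#include <stdint.h>

/* Same value as SHN_XINDEX in <elf.h>, under a local name so the
 * sketch builds without the ELF headers. */
#define SKETCH_SHN_XINDEX 0xffffu

/*
 * Return the real section index for symbol number sym_idx.
 *
 * st_shndx is the 16-bit field from the symbol table entry. When it
 * holds the escape value, the real index is taken from the extended
 * table entry with the same symbol number (hypothetical layout).
 */
static uint32_t sym_sec_index(uint16_t st_shndx, size_t sym_idx,
			      const uint32_t *xindex, size_t xindex_cnt)
{
	if (st_shndx != SKETCH_SHN_XINDEX)
		return st_shndx;

	/* Escape value but no usable table: report section 0 (undefined). */
	if (xindex == NULL || sym_idx >= xindex_cnt)
		return 0;

	return xindex[sym_idx];
}
```

Ordinary indexes pass through untouched, so callers don't need to care
whether the object has an extended table at all.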
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Hao Luo <haoluo@google.com>
Cc: Joe Lawrence <joe.lawrence@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@chromium.org>
Cc: Mark Wielaard <mjw@redhat.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Use elf_getshdrstrndx() to cover the case where the ELF header
'e_shstrndx' field contains the special value SHN_XINDEX so that we get
the proper string table index.
This is necessary to handle files with over 65536 sections, such as when
building the kernel with -f[function|data]-sections. Other cases may
include FG-ASLR and LTO builds.
With so many sections, ELF uses the extended section index table to
hold the values of some of the indexes, and extra code is needed to
retrieve them.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Hao Luo <haoluo@google.com>
Cc: Joe Lawrence <joe.lawrence@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@chromium.org>
Cc: Mark Wielaard <mjw@redhat.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When processing kernel images built by clang we can find some functions
without a name, which causes pahole to segfault.
Add extra checks to make sure the function's name is always defined
before using it.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Tested-by: Tom Stellard <tstellar@redhat.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a new CMake option, LIBBPF_EMBEDDED, to switch between the
embedded version and the system version (searched via pkg-config)
of libbpf. Set the embedded version as the default.
Signed-off-by: Luca Boccassi <bluca@debian.org>
Cc: dwarves@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adjust pahole logic of skipping any per-CPU symbol with offset 0, which is
especially bad for kernel modules, because it most certainly skips the very
first per-CPU variable.
Instead, collect per-CPU ELF symbols with offset 0, but for the non-kernel
module case do an extra check that the ELF symbol name and the DWARF variable
name match. Because of a bug where the DWARF name of a variable is sometimes
NULL, this check is necessarily too pessimistic (e.g., on my vmlinux image the
fixed_percpu_data variable is still not emitted due to its missing DWARF
name), but it allows emitting data for all module per-CPU variables.
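A rough sketch of the check described above, under stated assumptions:
the function name keep_percpu_var() and its parameters are made up for
illustration and are not the encoder's actual interface.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Decide whether a per-CPU variable should be kept (hypothetical helper).
 *
 * Non-zero offsets and kernel modules are accepted as-is. For a zero
 * offset in the non-module case, the ELF symbol name and the DWARF
 * variable name have to match; a missing (NULL) DWARF name is treated
 * pessimistically as a mismatch.
 */
static bool keep_percpu_var(bool is_kmod, uint64_t offset,
			    const char *elf_name, const char *dwarf_name)
{
	if (is_kmod || offset != 0)
		return true;

	if (elf_name == NULL || dwarf_name == NULL)
		return false;

	return strcmp(elf_name, dwarf_name) == 0;
}
```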
Fixes: f3d9054ba8 ("btf_encoder: Teach pahole to store percpu variables in vmlinux BTF.")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Hao Luo <haoluo@google.com>
Cc: kernel-team@fb.com
Link: https://lore.kernel.org/r/20201211041139.589692-3-andrii@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix pahole's logic for determining per-CPU variables. For vmlinux,
btfe->percpu_base_addr is always 0, so it didn't matter at which point it was
subtracted to get the offset that was later matched against the corresponding
ELF symbol.
For kernel modules, though, the situation is different. A kernel module's
per-CPU data section has a non-zero offset, which is taken into account in all
DWARF variable address calculations. For such cases, it's important to subtract
the section offset (btfe->percpu_base_addr) before the ELF symbol lookup is
performed.
This patch also records per-CPU data section size and uses it for early
filtering of non-per-CPU variables by their address.
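To illustrate the ordering, here is a small sketch, assuming the section
base address and size are already known. The helper name percpu_offset()
is hypothetical; only the idea of filtering by section range and
subtracting the base before the lookup comes from the description above.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Filter a DWARF variable address against the per-CPU data section and
 * turn it into a section-relative offset (hypothetical helper).
 *
 * Returns false when addr lies outside [base, base + size), i.e. when
 * the variable can't be a per-CPU one. On success *off holds the value
 * that would then be matched against the ELF symbol.
 */
static bool percpu_offset(uint64_t addr, uint64_t base, uint64_t size,
			  uint64_t *off)
{
	if (addr < base || addr - base >= size)
		return false;

	*off = addr - base;
	return true;
}
```

With base == 0 (the vmlinux case) the subtraction is a no-op, which is
why the ordering only shows up as a problem for kernel modules.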
Fixes: 2e719cca66 ("btf_encoder: revamp how per-CPU variables are encoded")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: kernel-team@fb.com
Cc: Hao Luo <haoluo@google.com>
Link: https://lore.kernel.org/r/20201211041139.589692-2-andrii@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Replace `%lx' for addr (uint64_t) with PRIx64. `%ld' for seek_bytes
(off_t) is replaced with PRIx64 too, as it is in the other places where
it's printed.
Fixes these error messages on i586 and arm-32:
btf_encoder.c:445:52: error: format '%lx' expects argument of type 'long unsigned int', but argument 3 has type 'uint64_t'
btf_encoder.c:687:54: error: format '%lx' expects argument of type 'long unsigned int', but argument 4 has type 'uint64_t'
btf_encoder.c:695:71: error: format '%lx' expects argument of type 'long unsigned int', but argument 4 has type 'uint64_t'
btf_encoder.c:708:88: error: format '%lx' expects argument of type 'long unsigned int', but argument 4 has type 'uint64_t'
pahole.c:1872:20: error: format '%ld' expects argument of type 'long int', but argument 4 has type 'off_t'
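For reference, a minimal standalone example of the portable form. The
helper name format_addr() is made up for this note and covers only the
uint64_t case:

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a 64-bit address as hex. PRIx64 expands to the right length
 * modifier for uint64_t on the target ("lx" where long is 64 bits,
 * "llx" on typical 32-bit targets), so the same line compiles cleanly
 * on both.
 */
static int format_addr(char *buf, size_t len, uint64_t addr)
{
	return snprintf(buf, len, "0x%" PRIx64, addr);
}
```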
Signed-off-by: Vitaly Chikunov <vt@altlinux.org>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add support for detecting kernel module ftrace addresses and use
them as a filter for detected functions.
For kernel modules the ftrace addresses are stored in the __mcount_loc
section. Add code that detects this section and reads
its data into an array, which is then processed as a filter by
the current code.
There's one tricky point with kernel modules wrt the Elf object,
which we get from the dwfl_module_getelf function. This function
performs all possible relocations, including those for the __mcount_loc
section.
So the addrs array contains relocated values, which we need to take
into account when we compare them to function values, which
are relative to their sections.
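One way to picture the comparison is sketched below. It assumes the
adjustment is done by adding the function's section address to its
section-relative value before the range check; the helper name and
signature are illustrative only.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Check whether any ftrace address falls inside a function
 * (hypothetical helper).
 *
 * addrs[] holds already relocated values. For a kernel module the
 * function value is relative to its section, so the section address
 * sh_addr is added first to make both sides comparable. For vmlinux
 * both are final addresses and are compared directly.
 */
static bool func_has_ftrace_addr(uint64_t func_addr, uint64_t func_size,
				 uint64_t sh_addr, bool kmod,
				 const uint64_t *addrs, size_t count)
{
	uint64_t start = kmod ? func_addr + sh_addr : func_addr;
	uint64_t end = start + func_size;

	for (size_t i = 0; i < count; i++) {
		if (addrs[i] >= start && addrs[i] < end)
			return true;
	}
	return false;
}
```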
With this change, for example for the xfs.ko module in my kernel
config, I'm getting a slightly bigger number of functions:
before: 2373, after: 2601
The ftrace's available_filter_functions still shows 2701, but
it includes functions like:
suffix_kstrtoint.constprop.0
xchk_btree_check_minrecs.isra.0
xfs_ascii_ci_compname.part.0
which are not part of the DWARF data; the rest match the BTF functions.
Because of the malfunctioning DWARF declaration tag, the 'before'
functions also contain functions that are not part of the module.
The 'after' functions contain only functions that are traceable
and part of xfs.ko.
Despite filtering out some declarations, this change also adds
static functions, hence the total number of functions is bigger.
Committer notes:
Andrii test notes:
<quote>
I've tested locally on bpf_testmod that I'm adding to selftests in [0].
All worked well. I changed the test function from global to non-inlined
static, and BTF had it. And the tests passed. So LGTM.
[0] https://patchwork.kernel.org/user/todo/netdevbpf/?series=395715&delegate=121173&state=*
</>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Hao Luo <haoluo@google.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We can't assume the address size is always the size of unsigned long; we
have to use the ELF's address size directly.
Change the 'addrs' array to __u64 and convert 32-bit address values when
copying from the ELF section.
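A sketch of such a conversion, assuming the section data is a packed
array of addresses in host byte order. The function name widen_addrs()
and its parameters are hypothetical, and uint64_t stands in for __u64:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Copy addresses of addr_size bytes (4 or 8) from raw section data into
 * a newly allocated 64-bit array. Returns NULL on a bad address size or
 * allocation failure; the caller frees the result.
 */
static uint64_t *widen_addrs(const void *data, size_t data_size,
			     size_t addr_size, size_t *count)
{
	const unsigned char *p = data;
	uint64_t *addrs;
	size_t n, i;

	if (addr_size != 4 && addr_size != 8)
		return NULL;

	n = data_size / addr_size;
	addrs = calloc(n ? n : 1, sizeof(*addrs));
	if (addrs == NULL)
		return NULL;

	for (i = 0; i < n; i++) {
		if (addr_size == 4) {
			uint32_t v;

			/* memcpy avoids unaligned reads from the raw buffer */
			memcpy(&v, p + i * 4, sizeof(v));
			addrs[i] = v; /* zero-extend to 64 bits */
		} else {
			memcpy(&addrs[i], p + i * 8, sizeof(addrs[i]));
		}
	}

	*count = n;
	return addrs;
}
```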
Committer notes:
Jiri tested this by:
<quote>
So to test this I built 32bit vmlinux and used 64bit pahole
to generate BTF data on both vmlinux and modules, which I
thought was valid use case.
</>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Hao Luo <haoluo@google.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>