Commit Graph

152 Commits

Author SHA1 Message Date
Arnaldo Carvalho de Melo 188f25b741 btf_encoder: Move add_enum() from btf_elf to btf encode_enum()
As it is all that it needs, rename from the __add_ pattern to __encode
to avoid a clash with libbpf btf__add_array() API.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-04 17:28:56 -03:00
Arnaldo Carvalho de Melo f82bd80a5c btf_encoder: Move add_struct() from btf_elf to btf encode_struct()
As it is all that it needs, rename from the __add_ pattern to __encode
to avoid a clash with libbpf btf__add_array() API.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-04 17:28:56 -03:00
Arnaldo Carvalho de Melo 1216aebd27 btf_encoder: Move add_array() from btf_elf to btf encode_array()
As it is all that it needs, rename from the __add_ pattern to __encode
to avoid a clash with libbpf btf__add_array() API.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-04 17:28:56 -03:00
Arnaldo Carvalho de Melo a96c91261e btf_encoder: Move add_ref_type() from btf_elf to btf encode_ref_type()
As it is all that it needs. Rename from __add to __encode prefix for
consistency, as we have btf__add_array() and thus need to use a
different name for arrays, so do it here as well.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-04 17:28:56 -03:00
Arnaldo Carvalho de Melo a97ed8080a btf_encoder: Move add_member() from btf_elf to btf encode_member()
As it is all that it needs. Rename from __add to __encode prefix for
consistency, as we have btf__add_array() and thus need to use a
different name for arrays, so do it here as well.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-04 17:28:56 -03:00
Arnaldo Carvalho de Melo 56e0c2d4cc btf_encoder: Move add_base_type() from btf_elf to btf encode_base_type()
As it is all that it needs. Rename from __add to __encode prefix for
consistency, as we have btf__add_array() and thus need to use a
different name for arrays, so do it here as well.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-04 17:28:56 -03:00
Arnaldo Carvalho de Melo f93e05d8bd btf_encoder: Pass the base BTF object to the BTF encoder
We'll get rid of the 'base_btf' global variable in libbtf.c, so stop
using it in the BTF encoder.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-04 17:28:56 -03:00
Arnaldo Carvalho de Melo 89be5646a0 pahole: Allow encoding BTF into a detached file
Previously the newly encoded BTF info was stored into a ELF section in
the file where the DWARF info was obtained, but it is useful to just
dump it into a separate file, do it.

  $ ls -la vmlinux.btf
  ls: cannot access 'vmlinux.btf': No such file or directory
  $ pahole -j vmlinux.btf vmlinux
  $ ls -la vmlinux.btf
  -rw-r-----. 1 acme acme 4630082 Jun  1 16:15 vmlinux.btf
  $ pahole -C list_head ./vmlinux.btf
  struct list_head {
  	struct list_head *         next;                 /*     0     8 */
  	struct list_head *         prev;                 /*     8     8 */

  	/* size: 16, cachelines: 1, members: 2 */
  	/* last cacheline: 16 bytes */
  };
  acme@toolbox pahole]$ pahole -C raw_spinlock_t ./vmlinux.btf
  typedef struct raw_spinlock raw_spinlock_t;
  acme@toolbox pahole]$ pahole -EC raw_spinlock ./vmlinux.btf
  struct raw_spinlock {
  	/* typedef arch_spinlock_t */ struct qspinlock {
  		union {
  			/* typedef atomic_t */ struct {
  				int counter;                                                  /*     0     4 */
  			} val;                                                                /*     0     4 */
  			struct {
  				/* typedef u8 -> __u8 */ unsigned char locked;                /*     0     1 */
  				/* typedef u8 -> __u8 */ unsigned char pending;               /*     1     1 */
  			};                                                                    /*     0     2 */
  			struct {
  				/* typedef u16 -> __u16 */ short unsigned int locked_pending; /*     0     2 */
  				/* typedef u16 -> __u16 */ short unsigned int tail;           /*     2     2 */
  			};                                                                    /*     0     4 */
  		};                                                                            /*     0     4 */
  	} raw_lock;                                                                           /*     0     4 */

  	/* size: 4, cachelines: 1, members: 1 */
  	/* last cacheline: 4 bytes */
  };
  ⬢[acme@toolbox pahole]$

Requested-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-04 17:28:54 -03:00
Andrii Nakryiko 0d17503db0 btf_encoder: fix and complete filtering out zero-sized per-CPU variables
btf_encoder is ignoring zero-sized per-CPU ELF symbols, but the same has to be
done for DWARF variables when matching them with ELF symbols. This is due to
zero-sized DWARF variables matching unrelated (non-zero-sized) variable that
happens to be allocated at the exact same address, leading to a lot of
confusion in BTF.

See [0] for when this causes big problems.

  [0] https://lore.kernel.org/bpf/CAEf4BzZ0-sihSL-UAm21JcaCCY92CqfNxycHRZYXcoj8OYb=wA@mail.gmail.com/

Committer notes:

Kept the {} around the if block with more than one line, which
simplifies the original patch by just removing that assignment
to the 'dwarf_name' variable.

Reported-by: Michal Suchánek <msuchanek@suse.de>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Cc: kernel-team@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-27 11:56:45 -03:00
Martin KaFai Lau 58a98f76ac btf: Remove ftrace filter
BTF is currently generated for functions that are in ftrace list
or extern.

A recent use case also needs BTF generated for functions included in
allowlist.

In particular, the kernel commit:

  e78aea8b2170 ("bpf: tcp: Put some tcp cong functions in allowlist for bpf-tcp-cc")

allows bpf program to directly call a few tcp cc kernel functions. Those
kernel functions are currently allowed only if CONFIG_DYNAMIC_FTRACE
is set to ensure they are in the ftrace list but this kconfig dependency
is unnecessary.

Those kernel functions are specified under an ELF section .BTF_ids.
There was an earlier attempt [0] to add another filter for the functions in
the .BTF_ids section.  That discussion concluded that the ftrace filter
should be removed instead.

This patch is to remove the ftrace filter and its related functions.

Number of BTF FUNC with and without is_ftrace_func():
My kconfig in x86: 40643 vs 46225
Jiri reported on arm: 25022 vs 55812

[0]: https://lore.kernel.org/dwarves/20210423213728.3538141-1-kafai@fb.com/

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Tested-by: Nathan Chancellor <nathan@kernel.org> # build
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Jiri Slaby <jirislaby@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: kernel-team@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-05-08 11:26:29 -03:00
Jiri Olsa 8e1f8c904e btf_encoder: Match ftrace addresses within ELF functions
Currently when processing a DWARF function, we check its entrypoint
against ftrace addresses, assuming that the ftrace address matches with
the function's entrypoint.

This is not the case on some architectures as reported by Nathan
when building kernel on arm [1].

Fix the check to take into account the whole function, not just the
entrypoint.

Most of the is_ftrace_func code was contributed by Andrii.

[1] https://lore.kernel.org/bpf/20210209034416.GA1669105@ubuntu-m3-large-x86/

Committer notes:

Test comments by Nathan:

"I did several builds with CONFIG_DEBUG_INFO_BTF enabled (arm64, ppc64le,
 and x86_64) and saw no build errors. I did not do any runtime testing."

Test comments by Sedat:

Linux v5.11-rc7+ and LLVM/Clang v12.0.0-rc1 on x86 (64bit)

Reported-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Hao Luo <haoluo@google.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@chromium.org>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-17 09:44:25 -03:00
Arnaldo Carvalho de Melo 7943374ac5 Revert "libbpf: allow to use packaged version"
This reverts commit 82749180b2.

Getting in the way of releasing 1.20, breaking the build of a dwarves
rpm when a libbpf package is installed in a fedora 33 system:

In file included from /home/acme/rpmbuild/BUILD/dwarves-1.20/strings.c:7:
/home/acme/rpmbuild/BUILD/dwarves-1.20/pahole_strings.h:9:10: fatal error: bpf/btf.h: No such file or directory
    9 | #include <bpf/btf.h>
      |          ^~~~~~~~~~~

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-03 21:45:00 -03:00
Jiri Olsa 1bb49897dd bpf_encoder: Translate SHN_XINDEX in symbol's st_shndx values
For very large ELF objects (with many sections), we could get special
value SHN_XINDEX (65535) for symbol's st_shndx.

This patch is adding code to detect the optional extended section index
table and use it to resolve symbol's section index.

Adding elf_symtab__for_each_symbol_index macro that returns symbol's
section index and usign it in collect functions.

Tested by running pahole on kernel compiled with:

  make KCFLAGS="-ffunction-sections -fdata-sections" -j$(nproc) vmlinux

and ensure FUNC records are generated and match normal build (without
above KCFLAGS).

Also bpf selftest passed and generated kernel BTF, is same as without
the patch.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Hao Luo <haoluo@google.com>
Cc: Joe Lawrence <joe.lawrence@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@chromium.org>
Cc: Mark Wieelard <mjw@redhat.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-01-26 22:36:36 -03:00
Jiri Olsa e32b9800e6 btf_encoder: Add extra checks for symbol names
When processing kernel images built by clang we can find some functions
without a name, which causes pahole to segfault.

Add extra checks to make sure we always have function's name defined
before using it.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Tested-by: Tom Stellard <tstellar@redhat.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-01-21 16:37:23 -03:00
Luca Boccassi 82749180b2 libbpf: allow to use packaged version
Add a new CMake option, LIBBPF_EMBEDDED, to switch between the
embedded version and the system version (searched via pkg-config)
of libbpf. Set the embedded version as the default.

Signed-off-by: Luca Boccassi <bluca@debian.org>
Cc: dwarves@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-01-21 10:26:25 -03:00
Andrii Nakryiko b688e35970 btf_encoder: fix skipping per-CPU variables at offset 0
Adjust pahole logic of skipping any per-CPU symbol with offset 0, which is
especially bad for kernel modules, because it most certainly skips the very
first per-CPU variable.

Instead, do collect per-CPU ELF symbol with 0 offset, but do extra check for
non-kernel module case by verifying that ELF symbol name and DWARF variable
name match. Due to the bug of DWARF name of variable sometimes being NULL,
this is necessarily too pessimistic check (e.g., on my vmlinux image,
fixed_percpu_data variable is still not emitted due to missing DWARF variable
name), it allows to emit data for all module per-CPU variables.

Fixes: f3d9054ba8 ("btf_encoder: Teach pahole to store percpu variables in vmlinux BTF.")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Hao Luo <haoluo@google.com>
Cc: kernel-team@fb.com
Link: https://lore.kernel.org/r/20201211041139.589692-3-andrii@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-12-14 10:42:40 -03:00
Andrii Nakryiko 8c009d6ce7 btf_encoder: fix BTF variable generation for kernel modules
Fix pahole's logic for determining per-CPU variables. For vmlinux,
btfe->percpu_base_addr is always 0, so it didn't matter at which point to
subtract it to get offset that later was matched against corresponding ELF
symbol.

For kernel module, though, the situation is different. Kernel module's per-CPU
data section has non-zero offset, which is taken into account in all DWARF
variable addresses calculation. For such cases, it's important to subtract
section offset (btfe->percpu_base_addr) before ELF symbol look up is
performed.

This patch also records per-CPU data section size and uses it for early
filtering of non-per-CPU variables by their address.

Fixes: 2e719cca66 ("btf_encoder: revamp how per-CPU variables are encoded")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: kernel-team@fb.com
Cc: Hao Luo <haoluo@google.com>
Link: https://lore.kernel.org/r/20201211041139.589692-2-andrii@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-12-14 10:42:08 -03:00
Vitaly Chikunov b94e97e015 dwarves: Fix compilation on 32-bit architectures
Replace `%lx' for addr (uint64_t) with PRIx64. `%ld' for seek_bytes
(off_t) is replaced with PRIx64 too, likewise in other places it's
printed.

Fixes these error messages on i586 and arm-32:

  btf_encoder.c:445:52: error: format '%lx' expects argument of type 'long unsigned int', but argument 3 has type 'uint64_t'
  btf_encoder.c:687:54: error: format '%lx' expects argument of type 'long unsigned int', but argument 4 has type 'uint64_t'
  btf_encoder.c:695:71: error: format '%lx' expects argument of type 'long unsigned int', but argument 4 has type 'uint64_t'
  btf_encoder.c:708:88: error: format '%lx' expects argument of type 'long unsigned int', but argument 4 has type 'uint64_t'
  pahole.c:1872:20: error: format '%ld' expects argument of type 'long int', but argument 4 has type 'off_t'

Signed-off-by: Cc: Vitaly Chikunov <vt@altlinux.org>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-12-07 12:55:36 -03:00
Jiri Olsa 17df51c700 btf_encoder: Detect kernel module ftrace addresses
Add support to detect kernel module ftrace addresses and use
it as filter for detected functions.

For kernel modules the ftrace addresses are stored in __mcount_loc
section. Adding the code that detects this section and reads
its data into array, which is then processed as filter by
current code.

There's one tricky point with kernel modules wrt Elf object,
which we get from dwfl_module_getelf function. This function
performs all possible relocations, including __mcount_loc
section.

So addrs array contains relocated values, which we need take
into account when we compare them to functions values which
are relative to their sections.

With this change for example for xfs.ko module in my kernel
config, I'm getting slightly bigger number of functions:

  before: 2373, after: 2601

The ftrace's available_filter_functions still shows 2701, but
it includes functions like:

  suffix_kstrtoint.constprop.0
  xchk_btree_check_minrecs.isra.0
  xfs_ascii_ci_compname.part.0

which are not part of dwarf data, the rest matches BTF functions.

Because of the malfunction DWARF's declaration tag, the 'before'
functions contain also functions that are not part of the module.
The 'after' functions contain only functions that are traceable
and part of xfs.ko.

Despite filtering out some declarations, this change also adds
static functions, hence the total number of functions is bigger.

Committer notes:

Andrii test notes:

<quote>
I've tested locally on bpf_testmod that I'm adding to selftests in [0].
All worked well. I changed the test function from global to non-inlined
static, and BTF had it. And the tests passed. So LGTM.

  [0] https://patchwork.kernel.org/user/todo/netdevbpf/?series=395715&delegate=121173&state=*
</>

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Hao Luo <haoluo@google.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-12-07 12:53:11 -03:00
Jiri Olsa 06ca639505 btf_encoder: Use address size based on ELF's class
We can't assume the address size is always size of unsigned long, we
have to use directly the ELF's address size.

Change the 'addrs' array to __u64 and convert 32 bit address values when
copying from ELF section.

Committer notes:

Jiri tested this by:

<quote>
So to test this I built 32bit vmlinux and used 64bit pahole
to generate BTF data on both vmlinux and modules, which I
thought was valid use case.
</>

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Hao Luo <haoluo@google.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-12-07 12:49:49 -03:00
Jiri Olsa aff60970d1 btf_encoder: Factor filter_functions function
Reorder the filter_functions function so we can add processing of kernel
modules in following patch.

There's no functional change intended.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Hao Luo <haoluo@google.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-12-07 12:48:55 -03:00
Jiri Olsa b3dd4f3c3d btf_encoder: Use better fallback message
Using more suitable fallback message for the case when the ftrace filter
can't be used because of missing symbols.

Committer notes:

Before:

  vmlinux not detected, falling back to dwarf data

Now:

  ftrace symbols not detected, falling back to DWARF data

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Requested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-11-17 14:55:46 -03:00
Jiri Olsa d06048c530 btf_encoder: Move btf_elf__verbose/btf_elf__force setup
With introduction of collect_symbols function, we moved the percpu
variables code before btf_elf__verbose/btf_elf__force setup, so they
don't have any effect in that code anymore.

Also btf_elf__verbose is used in code that prepares ftrace filter for
functions generations, also called within collect_symbols function.

Moving btf_elf__verbose/btf_elf__force setup early in the cu__encode_btf
function, so we can get verbose messages and see the effect of the force
option.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-11-16 16:45:40 -03:00
Jiri Olsa 8156bec8ae btf_encoder: Fix function generation
Current conditions for picking up function records break BTF data on
some gcc versions.

Some function records can appear with no arguments but with declaration
tag set, so moving the 'fn->declaration' in front of other checks.

Then checking if argument names are present and finally checking ftrace
filter if it's present. If ftrace filter is not available, using the
external tag to filter out non external functions.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-11-16 09:02:34 -03:00
Jiri Olsa d0cd007339 btf_encoder: Generate also .init functions
Currently we skip functions under .init* sections, Removing the .init*
section check, BTF now contains also functions from .init* sections.

Andrii's explanation from email:

> ...                  I think we should just drop the __init check and
> include all the __init functions into BTF. There could be cases where
> we'd need to attach BPF programs to __init functions (e.g., bpf_lsm
> security cases), so having BTFs for those FUNCs are necessary as well.
> Ftrace currently disallows that, but it's only because no user-space
> application has a way to attach probes early enough. This might change
> in the future, so there is no need to invent special mechanisms now
> for bpf_iter function preservation. Let's just include all __init
> functions in BTF.

It's over ~2000 functions on my .config:

   $ bpftool btf dump file ./vmlinux | grep 'FUNC ' | wc -l
   41505
   $ bpftool btf dump file /sys/kernel/btf/vmlinux | grep 'FUNC ' | wc -l
   39256

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-11-16 09:02:21 -03:00
Andrii Nakryiko ace05ba941 btf: Add support for split BTF loading and encoding
Add support for generating split BTF, in which there is a designated base
BTF, containing a base set of types, and a split BTF, which extends main BTF
with extra types, that can reference types and strings from the main BTF.

This is going to be used to generate compact BTFs for kernel modules, with
vmlinux BTF being a main BTF, which all kernel modules are based off of.

These changes rely on patch set [0] to be present in libbpf submodule.

  [0] https://patchwork.kernel.org/project/netdevbpf/list/?series=377859&state=*

Committer notes:

Fixed up wrt ARGP_numeric_version and added a man page entry.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Cc: kernel-team@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-11-11 09:12:44 -03:00
Andrii Nakryiko 94a7535939 btf_encoder: Fix array index type numbering
Take into account type ID offset, accumulated from previous CUs, when
calculating a new type ID for the generated array index type.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Cc: kernel-team@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-11-11 08:23:40 -03:00
Jiri Olsa 5a22c2de79 btf_encoder: Change functions check due to broken dwarf
We need to generate just single BTF instance for the function, while
DWARF data contains multiple instances of DW_TAG_subprogram tag.

Unfortunately we can no longer rely on DW_AT_declaration tag
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97060)

Instead we apply following checks:

  - argument names are defined for the function

  - there's symbol and address defined for the function

  - function is generated only once

Also because we want to follow kernel's ftrace traceable functions, this
patchset is adding extra check that the function is one of the ftrace's
functions.

All ftrace functions addresses are stored in vmlinux binary within
symbols:

  __start_mcount_loc
  __stop_mcount_loc

During object preparation code we read those addresses, sort them and
use them as filter for all detected dwarf functions.

We also filter out functions within .init section, ftrace is doing that
in runtime. At the same time we keep functions from
.init.bpf.preserve_type, because they are needed in BTF.

I can still see several differences to ftrace functions in
/sys/kernel/debug/tracing/available_filter_functions file:

  - available_filter_functions includes modules

  - available_filter_functions includes functions like:
      __acpi_match_device.part.0.constprop.0
      acpi_ns_check_sorted_list.constprop.0
      acpi_os_unmap_generic_address.part.0
      acpiphp_check_bridge.part.0

    which are not part of dwarf data

  - BTF includes multiple functions like:

      __clk_register_clkdev
      clk_register_clkdev

    which share same code so they appear just as single function
    in available_filter_functions, but dwarf keeps track of both
    of them

  - BTF includes iterator functions, which do not make it to
    available_filter_functions

With this change I'm getting 38384 BTF functions, which when added above
functions to consideration gives same amount of functions in
available_filter_functions.

The patch still keeps the original function filter condition (that uses
current fn->declaration check) in case the object does not contain
*_mcount_loc symbol -> object is not vmlinux.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Frank Ch. Eigler <fche@redhat.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Mark Wieelard <mjw@redhat.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-11-09 14:27:11 -03:00
Jiri Olsa 7b1af3f484 btf_encoder: Move find_all_percpu_vars in generic collect_symbols
Move find_all_percpu_vars() under generic collect_symbols() that walks
over symbols and calls collect_percpu_var().

We will add another collect function that needs to go through all the
symbols, so it's better we go through them just once.

There's no functional change intended.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Hao Luo <haoluo@google.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Frank Ch. Eigler <fche@redhat.com>
Cc: Mark Wieelard <mjw@redhat.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-11-09 14:25:49 -03:00
Hao Luo 863e6f0f2c btf_encoder: Check var type after checking var addr.
Commit 2e719cca66 ("btf_encoder: revamp how per-CPU variables are
encoded") adds percpu_var_exists() to filter out the symbols that are
not percpu var. However, the check comes after checking the var's type.
There can be symbols that are of zero type. If we hit that, btf_encoder
will not work without '--btf_encode_force'.  So we should check
percpu_var_exists before checking var's type.

Tested:

  haoluo@haoluo:~/kernel/tip$ gcc --version
  gcc (GCC) 10.2.0
  Copyright (C) 2020 Free Software Foundation, Inc.
  This is free software; see the source for copying conditions.  There is NO
  warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Before:

  haoluo@haoluo:~/kernel/tip$ make clean -s
  haoluo@haoluo:~/kernel/tip$ make -j 32 -s
    LINK     resolve_btfids
  error: found variable in CU 'kernel/bpf/btf.c' that has void type
  Encountered error while encoding BTF.
  FAILED: load BTF from vmlinux: Unknown error -2make: *** [Makefile:1164: vmlinux] Error 255

After:

  haoluo@haoluo:~/kernel/tip$ make clean -s
  haoluo@haoluo:~/kernel/tip$ make -j 32 -s
    LINK     resolve_btfids
  haoluo@haoluo:~/kernel/tip$

Fixes: 2e719cc ("btf_encoder: revamp how per-CPU variables are encoded")
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reported-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Hao Luo <haoluo@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Cc: dwarves@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-27 08:52:42 -03:00
Andrii Nakryiko 8cac1c54c8 btf_encoder: Ignore zero-sized ELF symbols
It's legal for ELF symbol to have size 0, if it's size is unknown or
unspecified. Instead of erroring out, just ignore such symbols, as they can't
be a valid per-CPU variable anyways.

Reported-by: Érico Rolim <erico.erc@gmail.com>
Reported-by: Jiri Slaby <jirislaby@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Tested-by: Jiri Slaby <jirislaby@kernel.org>
Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1177921
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-23 13:37:21 -03:00
Andrii Nakryiko 2e719cca66 btf_encoder: revamp how per-CPU variables are encoded
Right now to encode per-CPU variables in BTF, pahole iterates complete vmlinux
symbol table for each CU. There are 2500 CUs for a typical kernel image.
Overall, to encode 287 per-CPU variables pahole spends more than 10% of its CPU
budget, this is incredibly wasteful.

This patch revamps how this is done. Now it pre-processes symbol table once
before any of per-CU processing starts. It remembers each per-CPU variable
symbol, including its address, size, and name. Then during processing each CU,
binary search is used to correlate DWARF variable with per-CPU symbols and
figure out if variable belongs to per-CPU data section. If the match is found,
BTF_KIND_VAR is emitted and var_secinfo is recorded, just like before. At the
very end, after all CUs are processed, BTF_KIND_DATASEC is emitted with sorted
variables.

This change makes per-CPU variables generation overhead pretty negligible and
returns back about 10% of CPU usage.

Performance counter stats for './pahole -J /home/andriin/linux-build/default/vmlinux':

BEFORE:
      19.160149000 seconds user
       1.304873000 seconds sys

         24,114.05 msec task-clock                #    0.999 CPUs utilized
                83      context-switches          #    0.003 K/sec
                 0      cpu-migrations            #    0.000 K/sec
           622,417      page-faults               #    0.026 M/sec
    72,897,315,125      cycles                    #    3.023 GHz                      (25.02%)
   127,807,316,959      instructions              #    1.75  insn per cycle           (25.01%)
    29,087,179,117      branches                  # 1206.234 M/sec                    (25.01%)
       464,105,921      branch-misses             #    1.60% of all branches          (25.01%)
    30,252,119,368      L1-dcache-loads           # 1254.543 M/sec                    (25.01%)
     1,156,336,207      L1-dcache-load-misses     #    3.82% of all L1-dcache hits    (25.05%)
       343,373,503      LLC-loads                 #   14.240 M/sec                    (25.02%)
        12,044,977      LLC-load-misses           #    3.51% of all LL-cache hits     (25.01%)

      24.136198321 seconds time elapsed

      22.729693000 seconds user
       1.384859000 seconds sys

AFTER:
      16.781455000 seconds user
       1.343956000 seconds sys

         23,398.77 msec task-clock                #    1.000 CPUs utilized
                86      context-switches          #    0.004 K/sec
                 0      cpu-migrations            #    0.000 K/sec
           622,420      page-faults               #    0.027 M/sec
    68,395,641,468      cycles                    #    2.923 GHz                      (25.05%)
   114,241,327,034      instructions              #    1.67  insn per cycle           (25.01%)
    26,330,711,718      branches                  # 1125.303 M/sec                    (25.01%)
       465,926,869      branch-misses             #    1.77% of all branches          (25.00%)
    24,662,984,772      L1-dcache-loads           # 1054.029 M/sec                    (25.00%)
     1,054,052,064      L1-dcache-load-misses     #    4.27% of all L1-dcache hits    (25.00%)
       340,970,622      LLC-loads                 #   14.572 M/sec                    (25.00%)
        16,032,297      LLC-load-misses           #    4.70% of all LL-cache hits     (25.03%)

      23.402259654 seconds time elapsed

      21.916437000 seconds user
       1.482826000 seconds sys

Committer testing:

  $ grep 'model name' -m1 /proc/cpuinfo
  model name	: AMD Ryzen 9 3900X 12-Core Processor
  $

Before:

  $ perf stat -r5 pahole -J vmlinux

   Performance counter stats for 'pahole -J vmlinux' (5 runs):

            9,730.28 msec task-clock:u              #    0.998 CPUs utilized            ( +-  0.54% )
                   0      context-switches:u        #    0.000 K/sec
                   0      cpu-migrations:u          #    0.000 K/sec
             353,854      page-faults:u             #    0.036 M/sec                    ( +-  0.00% )
      39,721,726,459      cycles:u                  #    4.082 GHz                      ( +-  0.07% )  (83.33%)
         626,010,654      stalled-cycles-frontend:u #    1.58% frontend cycles idle     ( +-  0.91% )  (83.33%)
       7,518,333,691      stalled-cycles-backend:u  #   18.93% backend cycles idle      ( +-  0.56% )  (83.33%)
      85,477,123,093      instructions:u            #    2.15  insn per cycle
                                                    #    0.09  stalled cycles per insn  ( +-  0.02% )  (83.34%)
      19,346,085,683      branches:u                # 1988.235 M/sec                    ( +-  0.03% )  (83.34%)
         237,291,787      branch-misses:u           #    1.23% of all branches          ( +-  0.15% )  (83.33%)

              9.7465 +- 0.0524 seconds time elapsed  ( +-  0.54% )

  $

After:

  $ perf stat -r5 pahole -J vmlinux

   Performance counter stats for 'pahole -J vmlinux' (5 runs):

            8,953.80 msec task-clock:u              #    0.998 CPUs utilized            ( +-  0.09% )
                   0      context-switches:u        #    0.000 K/sec
                   0      cpu-migrations:u          #    0.000 K/sec
             353,855      page-faults:u             #    0.040 M/sec                    ( +-  0.00% )
      35,775,730,539      cycles:u                  #    3.996 GHz                      ( +-  0.07% )  (83.33%)
         579,534,836      stalled-cycles-frontend:u #    1.62% frontend cycles idle     ( +-  2.21% )  (83.33%)
       5,719,840,144      stalled-cycles-backend:u  #   15.99% backend cycles idle      ( +-  0.93% )  (83.33%)
      73,035,744,786      instructions:u            #    2.04  insn per cycle
                                                    #    0.08  stalled cycles per insn  ( +-  0.02% )  (83.34%)
      16,798,017,844      branches:u                # 1876.077 M/sec                    ( +-  0.05% )  (83.33%)
         237,777,143      branch-misses:u           #    1.42% of all branches          ( +-  0.15% )  (83.34%)

             8.97077 +- 0.00803 seconds time elapsed  ( +-  0.09% )

  $

Indeed, about 10% shaved, not bad.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Hao Luo <haoluo@google.com>
Cc: Oleg Rombakh <olegrom@google.com>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Cc: kernel-team@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-09 12:57:35 -03:00
Andrii Nakryiko 3c913e18b2 btf_encoder: Fix emitting __ARRAY_SIZE_TYPE__ as index range type
Fix the logic of determining if __ARRAY_SIZE_TYPE__ needs to be emitted.
Previously, such type could be emitted unnecessarily due to some
particular CU not having an int type in it. That would happen even if
there was no array type in that CU. Fix it by keeping track of 'int'
type across CUs and only emitting __ARRAY_SIZE_TYPE__ if a given CU has
array type, but we still haven't found 'int' type.

Testing against vmlinux shows that now there are no __ARRAY_SIZE_TYPE__
integers emitted.

Committer testing:

  $ pahole -J vmlinux
  $ ./btfdiff vmlinux
  $

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Cc: kernel-team@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-09 12:49:40 -03:00
Andrii Nakryiko 48efa92933 btf_encoder: Use libbpf APIs to encode BTF type info
Switch to use libbpf's BTF writing APIs to encode BTF. This reconciles
btf_elf's use of internal struct btf from libbpf for both loading and
encoding BTF type info. This change also saves a considerable amount of
memory used for DWARF to BTF conversion due to avoiding extra memory
copy between gobuffers and libbpf's struct btf. Now that pahole uses
libbpf's struct btf, it's possible to further utilize libbpf's features
and APIs, e.g., for handling endianness conversion, for dumping raw BTF
type info during encoding. These features might be implemented in the
follow up patches.

Committer notes:

Built with 'cmake -DCMAKE_BUILD_TYPE=Release'

Before:

  $ cp ~/git/build/bpf-next-v5.9.0-rc8+/vmlinux .
  $ perf stat -r5 pahole -J vmlinux

   Performance counter stats for 'pahole -J vmlinux' (5 runs):

           10,065.20 msec task-clock:u              #    0.998 CPUs utilized            ( +-  0.68% )
                   0      context-switches:u        #    0.000 K/sec
                   0      cpu-migrations:u          #    0.000 K/sec
             514,596      page-faults:u             #    0.051 M/sec                    ( +-  0.00% )
      40,098,447,225      cycles:u                  #    3.984 GHz                      ( +-  0.26% )  (83.33%)
         547,247,149      stalled-cycles-frontend:u #    1.36% frontend cycles idle     ( +-  2.00% )  (83.33%)
       6,493,462,167      stalled-cycles-backend:u  #   16.19% backend cycles idle      ( +-  1.53% )  (83.33%)
      86,338,929,286      instructions:u            #    2.15  insn per cycle
                                                    #    0.08  stalled cycles per insn  ( +-  0.01% )  (83.34%)
      19,859,060,127      branches:u                # 1973.043 M/sec                    ( +-  0.02% )  (83.33%)
         288,389,742      branch-misses:u           #    1.45% of all branches          ( +-  0.13% )  (83.33%)

             10.0831 +- 0.0683 seconds time elapsed  ( +-  0.68% )

  $

After:

  $ perf stat -r5 pahole -J vmlinux

   Performance counter stats for 'pahole -J vmlinux' (5 runs):

           10,043.94 msec task-clock:u              #    0.998 CPUs utilized            ( +-  0.69% )
                   0      context-switches:u        #    0.000 K/sec
                   0      cpu-migrations:u          #    0.000 K/sec
             412,035      page-faults:u             #    0.041 M/sec                    ( +-  0.00% )
      39,985,610,202      cycles:u                  #    3.981 GHz                      ( +-  0.18% )  (83.33%)
         657,352,766      stalled-cycles-frontend:u #    1.64% frontend cycles idle     ( +-  2.79% )  (83.33%)
       7,387,740,861      stalled-cycles-backend:u  #   18.48% backend cycles idle      ( +-  1.65% )  (83.33%)
      85,926,053,845      instructions:u            #    2.15  insn per cycle
                                                    #    0.09  stalled cycles per insn  ( +-  0.04% )  (83.34%)
      19,428,047,875      branches:u                # 1934.305 M/sec                    ( +-  0.05% )  (83.33%)
         240,156,838      branch-misses:u           #    1.24% of all branches          ( +-  0.14% )  (83.34%)

             10.0609 +- 0.0696 seconds time elapsed  ( +-  0.69% )

  $

  $ ./btfdiff vmlinux
  $

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Cc: kernel-team@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-09 12:49:37 -03:00
Hao Luo c815d26689 btf_encoder: Handle DW_TAG_variable that has DW_AT_specification
It is found on gcc 8.2 that global percpu variables generate the
following dwarf entry in the cu where the variable is defined[1].

Take the global variable "bpf_prog_active" defined in
kernel/bpf/syscall.c as an example. The debug info for syscall.c has two
dwarf entries for "bpf_prog_active".

 > readelf -wi kernel/bpf/syscall.o

0x00013534:   DW_TAG_variable
                 DW_AT_name      ("bpf_prog_active")
                 DW_AT_decl_file
("/data/users/yhs/work/net-next/include/linux/bpf.h")
                 DW_AT_decl_line (1074)
                 DW_AT_decl_column       (0x01)
                 DW_AT_type      (0x000000d6 "int")
                 DW_AT_external  (true)
                 DW_AT_declaration       (true)

0x00021a25:   DW_TAG_variable
                 DW_AT_specification     (0x00013534 "bpf_prog_active")
                 DW_AT_decl_file
("/data/users/yhs/work/net-next/kernel/bpf/syscall.c")
                 DW_AT_decl_line (43)
                 DW_AT_location  (DW_OP_addr 0x0)

Note that second DW_TAG_variable entry contains specification that
points to the first entry. This causes problem for btf_encoder when
encoding global variables. The tag generated for the second entry
doesn't have the type and scope info. Therefore the BTF VARs encoded
using this tag has incorrect type_id and scope.

As fix, when creating variable, examine the dwarf entry. If it has
a DW_AT_specification, store the referred struct variable in a 'spec'
field. When encoding VARs, check this 'spec', if it's non-empty, follow
the pointer to use the referred var.

 [1] https://www.mail-archive.com/netdev@vger.kernel.org/msg348144.html

Tested: Tested using gcc 4.9 and gcc 8.2. The types and scopes of global
 vars are now generated correctly.

 [21] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
 [21102] VAR 'bpf_prog_active' type_id=21, linkage=global-alloc

Signed-off-by: Hao Luo <haoluo@google.com>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: dwarves@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-01 15:22:41 -03:00
Hao Luo da4ad2f650 btf_encoder: Allow disabling BTF var encoding.
A new feature was introduced in commit f3d9054ba8 ("btf_encoder: Teach
pahole to store percpu variables in vmlinux BTF.") which encodes kernel
percpu variables into BTF. Add a flag --skip_encoding_btf_vars to allow
users to toggle this feature off, so that the rollout of pahole v1.18
can be protected by potential bugs in this feature.

Committer notes:

Added missing man page entry.

Signed-off-by: Hao Luo <haoluo@google.com>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Yonghong Song <yhs@fb.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-21 17:00:16 -03:00
Hao Luo f3d9054ba8 btf_encoder: Teach pahole to store percpu variables in vmlinux BTF.
On SMP systems, the global percpu variables are placed in a special
'.data..percpu' section, which is stored in a segment whose initial
address is set to 0, the addresses of per-CPU variables are relative
positive addresses [1].

This patch extracts these variables from vmlinux and places them with
their type information in BTF. More specifically, when BTF is encoded,
we find the index of the '.data..percpu' section and then traverse the
symbol table to find those global objects which are in this section.
For each of these objects, we push a BTF_KIND_VAR into the types buffer,
and a BTF_VAR_SECINFO into another buffer, percpu_secinfo. When all the
CUs have finished processing, we push a BTF_KIND_DATASEC into the
btfe->types buffer, followed by the percpu_secinfo's content.

In a v5.8-rc3 linux kernel, I was able to extract 288 such variables.
The build time overhead is small and the space overhead is also small.
See testings below.

A found variable can be invalid in two ways:

 - Its name found in elf_sym__name is invalid.
 - Its size identified by elf_sym__size is 0.

In either case, the BTF containing such symbols will be rejected by the
BTF verifier. Normally we should not see such symbols. But if one is
seen during BTF encoding, the encoder will exit with error. An new flag
'-j' (or '--force') is implemented to help testing, which skips the
invalid symbols and force emit a BTF.

Testing:

- vmlinux size has increased by ~12kb.
  Before:
   $ readelf -SW vmlinux | grep BTF
   [25] .BTF              PROGBITS        ffffffff821a905c 13a905c 2d2bf8 00
  After:
   $ pahole -J vmlinux
   $ readelf -SW vmlinux  | grep BTF
   [25] .BTF              PROGBITS        ffffffff821a905c 13a905c 2d5bca 00

- Common global percpu VARs and DATASEC are found in BTF section.
  $ bpftool btf dump file vmlinux | grep runqueues
  [14152] VAR 'runqueues' type_id=13778, linkage=global-alloc

  $ bpftool btf dump file vmlinux | grep 'cpu_stopper'
  [17582] STRUCT 'cpu_stopper' size=72 vlen=5
  [17601] VAR 'cpu_stopper' type_id=17582, linkage=static

  $ bpftool btf dump file vmlinux | grep ' DATASEC '
  [63652] DATASEC '.data..percpu' size=179288 vlen=288

- Tested bpf selftests.

- pahole exits with error if an invalid symbol is seen during encoding,
  make -f Makefile -j 36 -s
  PAHOLE: Error: Found symbol of zero size when encoding btf (sym: 'yyy', cu: 'xxx.c').
  PAHOLE: Error: Use '-j' or '--force_emit' to ignore such symbols and force emit the btf.
  scripts/link-vmlinux.sh: line 137: 2475712 Segmentation fault      LLVM_OBJCOPY=${OBJCOPY} ${PAHOLE} -J ${1}

- With the flag '-j' or '--force', the invalid symbols are ignored.

- Further in verbose mode and with '-j' or '--force' set, a warning is generated:
  PAHOLE: Warning: Found symbol of zero size when encoding btf, ignored (sym: 'yyy', cu: 'xxx.c').
  PAHOLE: Warning: Found symbol of invalid name when encoding btf, ignored (sym: 'zzz', cu: 'sss.c').

References:
 [1] https://lwn.net/Articles/531148/

Signed-off-by: Hao Luo <haoluo@google.com>
Tested-by: Andrii Nakryiko <andriin@fb.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Oleg Rombakh <olegrom@google.com>
Cc: dwarves@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-05 14:51:17 -03:00
Andrii Nakryiko 3c5f2a224a btf_encoder: Preserve and encode exported functions as BTF_KIND_FUNC
Add encoding of DWARF's DW_TAG_subprogram_type into BTF's BTF_KIND_FUNC
(plus corresponding BTF_KIND_FUNC_PROTO). Only exported functions are
converted for now. This allows to capture all the exported kernel
functions, same subset that's exposed through /proc/kallsyms.

Committer testing:

Before:

  $ readelf -SW vmlinux  | grep BTF
    [78] .BTF              PROGBITS        0000000000000000 26a27da9 1e5543 00      0   0  1
  $

After:

  $ pahole -J vmlinux
  $ readelf -SW vmlinux  | grep BTF
    [78] .BTF              PROGBITS        0000000000000000 26a27da9 2d5f47 00      0   0  1
  $

  >>> 0x2d5f47 - 0x1e5543
  985604

The kernel has a lot of functions! :-)

  $ pahole -VJ vmlinux > /tmp/pahole-btf-encoding-verbose-output.txt

  $ grep -w FUNC /tmp/pahole-btf-encoding-verbose-output.txt | wc -l
  22871
  [acme@quaco pahole]$ grep -w FUNC /tmp/pahole-btf-encoding-verbose-output.txt | tail
  [4511543] FUNC copy_from_user_nmi type_id=4511542
  [4512934] FUNC memcpy_page_flushcache type_id=4512933
  [4512936] FUNC __memcpy_flushcache type_id=4512935
  [4512938] FUNC __copy_user_flushcache type_id=4512937
  [4512940] FUNC arch_wb_cache_pmem type_id=4512939
  [4512942] FUNC mcsafe_handle_tail type_id=4512941
  [4512944] FUNC copy_user_handle_tail type_id=4512943
  [4512946] FUNC clear_user type_id=4512945
  [4512948] FUNC __clear_user type_id=4512947
  [4512950] FUNC memcpy type_id=4512949
  $ grep -w FUNC_PROTO /tmp/pahole-btf-encoding-verbose-output.txt | tail
  [4512902] FUNC_PROTO (anon) return=4511725 args=(4512097 (anon), 4511544 (anon))
  [4512933] FUNC_PROTO (anon) return=0 args=(4511598 to, 4511725 page, 4511610 offset, 4511610 len)
  [4512935] FUNC_PROTO (anon) return=0 args=(4511638 _dst, 4511759 _src, 4511610 size)
  [4512937] FUNC_PROTO (anon) return=4511585 args=(4511638 dst, 4511759 src, 4511552 size)
  [4512939] FUNC_PROTO (anon) return=0 args=(4511638 addr, 4511610 size)
  [4512941] FUNC_PROTO (anon) return=4511544 args=(4511598 to, 4511598 from, 4511552 len)
  [4512943] FUNC_PROTO (anon) return=4511544 args=(4511598 to, 4511598 from, 4511552 len)
  [4512945] FUNC_PROTO (anon) return=4511544 args=(4511638 to, 4511544 n)
  [4512947] FUNC_PROTO (anon) return=4511544 args=(4511638 addr, 4511544 size)
  [4512949] FUNC_PROTO (anon) return=4511638 args=(4511638 p, 4511759 q, 4511591 size)
  $ grep -w FUNC_PROTO /tmp/pahole-btf-encoding-verbose-output.txt |grep 4511542
  [4511542] FUNC_PROTO (anon) return=4510159 args=(4510254 to, 4510374 from, 4510159 n)
  $

With a little change to pdwtags to see DW_TAG_subroutine_type, which is
what BTF's KIND_FUNC_PROTO maps to, we see some of those last
prototypes:

[acme@quaco pahole]$ pdwtags -F btf vmlinux  | grep '()(' | tail
void ()(struct insn * insn); /* size: 45404744 */
int ()(struct insn * insn); /* size: 4 */
void ()(struct insn * insn, const void  * kaddr, int buf_len, int x86_64); /* size: 45405032 */
long unsigned int ()(const char  * purpose); /* size: 8 */
void ()(char * to, struct page * page, size_t offset, size_t len); /* size: 45405864 */
void ()(void * _dst, const void  * _src, size_t size); /* size: 45406200 */
long int ()(void * dst, const void  * src, unsigned int size); /* size: 8 */
long unsigned int ()(char * to, char * from, unsigned int len); /* size: 8 */
long unsigned int ()(void * to, long unsigned int n); /* size: 8 */
long unsigned int ()(void * addr, long unsigned int size); /* size: 8 */
[acme@quaco pahole]$

I.e.:

  [4512941] FUNC_PROTO (anon) return=4511544 args=(4511598 to, 4511598 from, 4511552 len)

gets decoded by pdwtags as:

  long unsigned int ()(char * to, char * from, unsigned int len); /* size: 8 */

  $ grep '\[\(4511544\|4511598\|4511550\|4511552\)\]' /tmp/pahole-btf-encoding-verbose-output.txt
  [4511544] INT long unsigned int size=8 bit_offset=0 nr_bits=64 encoding=(none)
  [4511550] INT char size=1 bit_offset=0 nr_bits=8 encoding=(none)
  [4511552] INT unsigned int size=4 bit_offset=0 nr_bits=32 encoding=(none)
  [4511598] PTR (anon) type_id=4511550
  $

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Tested-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: dwarves@vger.kernel.org
Cc: kernel-team@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-05 09:27:58 -03:00
Andrii Nakryiko c0fdc5e685 dwarf_loader: Use DWARF recommended uniform bit offset scheme
Use uniform bit offset scheme as described in DWARF standard (though
apparently not really followed by major compilers), in which bit offset
is a natural extension of byte offset in both big- and little-endian
architectures.

BEFORE:

1. Bit offsets for little-endian are output as offsets from highest-order bit
of underlying int to highest-order bit of bitfield, so double-backwards for
little-endian arch and counter to how byte offsets are used, which point to
lowest-order bit of underlying type. This makes first bitfield to have bit
offset 27, instead of natural 0.

2. Bit offsets for big-endian are output as expected, by referencing
highest-order bit offset from highest-order bit of underlying int. This is
natural for big-endian platform, e.g., first bitfield has bit offset of 0.

3. Big-endian target also has problem with determining bit holes, because bit
positions have to be calculated differently for little- and big-endian
platforms and previous commit changed pahole logic to follow little-endian
semantics.

4. BTF encoder outputs uniform bit offset for both little- and big-endian
format (following DWARF's recommended bit offset scheme)

5. BTF loader, though, follows DWARF loader's format and outputs little-endian
bit offsets "double-backwards".

  $ gcc -g dwarf_test.c -o dwarf_test
  $ pahole -F dwarf dwarf_test
  struct S {
          int                        j:5;                  /*     0:27  4 */
          int                        k:6;                  /*     0:21  4 */
          int                        m:5;                  /*     0:16  4 */
          int                        n:8;                  /*     0: 8  4 */

          /* size: 4, cachelines: 1, members: 4 */
          /* bit_padding: 8 bits */
          /* last cacheline: 4 bytes */
  };

  $ pahole -JV dwarf_test
  File dwarf_test:
  [1] STRUCT S kind_flag=1 size=4 vlen=4
          j type_id=2 bitfield_size=5 bits_offset=0
          k type_id=2 bitfield_size=6 bits_offset=5
          m type_id=2 bitfield_size=5 bits_offset=11
          n type_id=2 bitfield_size=8 bits_offset=16
  [2] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED

  $ pahole -F btf dwarf_test
  struct S {
          int                        j:5;                  /*     0:27  4 */
          int                        k:6;                  /*     0:21  4 */
          int                        m:5;                  /*     0:16  4 */
          int                        n:8;                  /*     0: 8  4 */

          /* size: 4, cachelines: 1, members: 4 */
          /* bit_padding: 8 bits */
          /* last cacheline: 4 bytes */
  };

  $ aarch64-linux-gnu-gcc -mbig-endian -g -c dwarf_test.c -o dwarf_test.be
  $ pahole -F dwarf dwarf_test.be
  struct S {

          /* XXX 27 bits hole, try to pack */

          int                        j:5;                  /*     0: 0  4 */

          /* XXX 245 bits hole, try to pack */

          int                        k:6;                  /*     0: 5  4 */

          /* XXX 245 bits hole, try to pack */

          int                        m:5;                  /*     0:11  4 */

          /* XXX 243 bits hole, try to pack */

          int                        n:8;                  /*     0:16  4 */

          /* size: 4, cachelines: 1, members: 4 */
          /* bit holes: 4, sum bit holes: 760 bits */
          /* bit_padding: 16 bits */
          /* last cacheline: 4 bytes */

          /* BRAIN FART ALERT! 4 bytes != 24 (member bits) + 0 (byte holes) + 760 (bit holes), diff = -768 bits */
  };

  $ pahole -JV dwarf_test.be
  File dwarf_test.be:
  [1] STRUCT S kind_flag=1 size=4 vlen=4
          j type_id=2 bitfield_size=5 bits_offset=0
          k type_id=2 bitfield_size=6 bits_offset=5
          m type_id=2 bitfield_size=5 bits_offset=11
          n type_id=2 bitfield_size=8 bits_offset=16
  [2] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED

  $ pahole -F btf dwarf_test.be
  struct S {

          /* XXX 27 bits hole, try to pack */

          int                        j:5;                  /*     0: 0  4 */

          /* XXX 245 bits hole, try to pack */

          int                        k:6;                  /*     0: 5  4 */

          /* XXX 245 bits hole, try to pack */

          int                        m:5;                  /*     0:11  4 */

          /* XXX 243 bits hole, try to pack */

          int                        n:8;                  /*     0:16  4 */

          /* size: 4, cachelines: 1, members: 4 */
          /* bit holes: 4, sum bit holes: 760 bits */
          /* bit_padding: 16 bits */
          /* last cacheline: 4 bytes */

          /* BRAIN FART ALERT! 4 bytes != 24 (member bits) + 0 (byte holes) + 760 (bit holes), diff = -768 bits */
  };

AFTER:

1. Same output for little- and big-endian binaries, both for BTF and DWARF
loader.

2. For little-endian target, bit offsets are natural extensions of byte offset,
counting from lowest-order bit of underlying int to lowest-order bit of a
bitfield.

3. BTF encoder still emits correct and natural bit offsets (for both binaries).

4. No more BRAIN FART ALERTs for big-endian.

  $ pahole -F dwarf dwarf_test
  struct S {
          int                        j:5;                  /*     0: 0  4 */
          int                        k:6;                  /*     0: 5  4 */
          int                        m:5;                  /*     0:11  4 */
          int                        n:8;                  /*     0:16  4 */

          /* size: 4, cachelines: 1, members: 4 */
          /* bit_padding: 8 bits */
          /* last cacheline: 4 bytes */
  };

  $ pahole -JV dwarf_test
  File dwarf_test:
  [1] STRUCT S kind_flag=1 size=4 vlen=4
          j type_id=2 bitfield_size=5 bits_offset=0
          k type_id=2 bitfield_size=6 bits_offset=5
          m type_id=2 bitfield_size=5 bits_offset=11
          n type_id=2 bitfield_size=8 bits_offset=16
  [2] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED

  $ pahole -F btf dwarf_test
  struct S {
          int                        j:5;                  /*     0: 0  4 */
          int                        k:6;                  /*     0: 5  4 */
          int                        m:5;                  /*     0:11  4 */
          int                        n:8;                  /*     0:16  4 */

          /* size: 4, cachelines: 1, members: 4 */
          /* bit_padding: 8 bits */
          /* last cacheline: 4 bytes */
  };

  $ pahole -F dwarf dwarf_test.be
  struct S {
          int                        j:5;                  /*     0: 0  4 */
          int                        k:6;                  /*     0: 5  4 */
          int                        m:5;                  /*     0:11  4 */
          int                        n:8;                  /*     0:16  4 */

          /* size: 4, cachelines: 1, members: 4 */
          /* bit_padding: 8 bits */
          /* last cacheline: 4 bytes */
  };

  $ pahole -JV dwarf_test.be
  File dwarf_test.be:
  [1] STRUCT S kind_flag=1 size=4 vlen=4
          j type_id=2 bitfield_size=5 bits_offset=0
          k type_id=2 bitfield_size=6 bits_offset=5
          m type_id=2 bitfield_size=5 bits_offset=11
          n type_id=2 bitfield_size=8 bits_offset=16
  [2] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED

  $ pahole -F btf dwarf_test.be
  struct S {
          int                        j:5;                  /*     0: 0  4 */
          int                        k:6;                  /*     0: 5  4 */
          int                        m:5;                  /*     0:11  4 */
          int                        n:8;                  /*     0:16  4 */

          /* size: 4, cachelines: 1, members: 4 */
          /* bit_padding: 8 bits */
          /* last cacheline: 4 bytes */
  };

FOR REFERENCE. Relevant parts of DWARF output from GCC (clang outputs exactly
the same data) for both little- and big-endian binaries:

  $ readelf -wi dwarf_test
  Contents of the .debug_info section:
  <snip>
   <1><2d>: Abbrev Number: 2 (DW_TAG_structure_type)
      <2e>   DW_AT_name        : S
      <30>   DW_AT_byte_size   : 4
      <31>   DW_AT_decl_file   : 1
      <32>   DW_AT_decl_line   : 1
      <33>   DW_AT_decl_column : 8
      <34>   DW_AT_sibling     : <0x71>
   <2><38>: Abbrev Number: 3 (DW_TAG_member)
      <39>   DW_AT_name        : j
      <3b>   DW_AT_decl_file   : 1
      <3c>   DW_AT_decl_line   : 2
      <3d>   DW_AT_decl_column : 6
      <3e>   DW_AT_type        : <0x71>
      <42>   DW_AT_byte_size   : 4
      <43>   DW_AT_bit_size    : 5
      <44>   DW_AT_bit_offset  : 27
      <45>   DW_AT_data_member_location: 0
   <2><46>: Abbrev Number: 3 (DW_TAG_member)
      <47>   DW_AT_name        : k
      <49>   DW_AT_decl_file   : 1
      <4a>   DW_AT_decl_line   : 3
      <4b>   DW_AT_decl_column : 6
      <4c>   DW_AT_type        : <0x71>
      <50>   DW_AT_byte_size   : 4
      <51>   DW_AT_bit_size    : 6
      <52>   DW_AT_bit_offset  : 21
      <53>   DW_AT_data_member_location: 0
   <2><54>: Abbrev Number: 3 (DW_TAG_member)
      <55>   DW_AT_name        : m
      <57>   DW_AT_decl_file   : 1
      <58>   DW_AT_decl_line   : 4
      <59>   DW_AT_decl_column : 6
      <5a>   DW_AT_type        : <0x71>
      <5e>   DW_AT_byte_size   : 4
      <5f>   DW_AT_bit_size    : 5
      <60>   DW_AT_bit_offset  : 16
      <61>   DW_AT_data_member_location: 0
   <2><62>: Abbrev Number: 3 (DW_TAG_member)
      <63>   DW_AT_name        : n
      <65>   DW_AT_decl_file   : 1
      <66>   DW_AT_decl_line   : 5
      <67>   DW_AT_decl_column : 6
      <68>   DW_AT_type        : <0x71>
      <6c>   DW_AT_byte_size   : 4
      <6d>   DW_AT_bit_size    : 8
      <6e>   DW_AT_bit_offset  : 8
      <6f>   DW_AT_data_member_location: 0
   <2><70>: Abbrev Number: 0
   <1><71>: Abbrev Number: 4 (DW_TAG_base_type)
      <72>   DW_AT_byte_size   : 4
      <73>   DW_AT_encoding    : 5        (signed)
      <74>   DW_AT_name        : int
  <snip>

  $ readelf -wi dwarf_test.be
  Contents of the .debug_info section:
  <snip>
   <1><2d>: Abbrev Number: 2 (DW_TAG_structure_type)
      <2e>   DW_AT_name        : S
      <30>   DW_AT_byte_size   : 4
      <31>   DW_AT_decl_file   : 1
      <32>   DW_AT_decl_line   : 1
      <33>   DW_AT_sibling     : <0x6c>
   <2><37>: Abbrev Number: 3 (DW_TAG_member)
      <38>   DW_AT_name        : j
      <3a>   DW_AT_decl_file   : 1
      <3b>   DW_AT_decl_line   : 2
      <3c>   DW_AT_type        : <0x6c>
      <40>   DW_AT_byte_size   : 4
      <41>   DW_AT_bit_size    : 5
      <42>   DW_AT_bit_offset  : 0
      <43>   DW_AT_data_member_location: 0
   <2><44>: Abbrev Number: 3 (DW_TAG_member)
      <45>   DW_AT_name        : k
      <47>   DW_AT_decl_file   : 1
      <48>   DW_AT_decl_line   : 3
      <49>   DW_AT_type        : <0x6c>
      <4d>   DW_AT_byte_size   : 4
      <4e>   DW_AT_bit_size    : 6
      <4f>   DW_AT_bit_offset  : 5
      <50>   DW_AT_data_member_location: 0
   <2><51>: Abbrev Number: 3 (DW_TAG_member)
      <52>   DW_AT_name        : m
      <54>   DW_AT_decl_file   : 1
      <55>   DW_AT_decl_line   : 4
      <56>   DW_AT_type        : <0x6c>
      <5a>   DW_AT_byte_size   : 4
      <5b>   DW_AT_bit_size    : 5
      <5c>   DW_AT_bit_offset  : 11
      <5d>   DW_AT_data_member_location: 0
   <2><5e>: Abbrev Number: 3 (DW_TAG_member)
      <5f>   DW_AT_name        : n
      <61>   DW_AT_decl_file   : 1
      <62>   DW_AT_decl_line   : 5
      <63>   DW_AT_type        : <0x6c>
      <67>   DW_AT_byte_size   : 4
      <68>   DW_AT_bit_size    : 8
      <69>   DW_AT_bit_offset  : 16
      <6a>   DW_AT_data_member_location: 0
  <snip>

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Mark Wielaard <mark@klomp.org>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: dwarves@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-29 15:55:37 -03:00
Arnaldo Carvalho de Melo 5375d06faf dwarves: Introduce type_id_t for use with the type IDs
This is just a prep patch, marking uint16_t IDs as type_id_t, that
points to uint16_t, so no change in the resulting code.

Cc: Andrii Nakryiko <andriin@fb.com>
Tested-by:Andrii Nakryiko <andrii.nakryiko@gmail.com>
Link: https://lore.kernel.org/bpf/CAEf4Bzb0SpvXdDKMMnUof==kp4Y0AP54bKFjeCzX_AsmDm7k7g@mail.gmail.com/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-11 11:44:53 -03:00
Andrii Nakryiko a9afcc65fc btf_encoder: Don't special case packed enums
BTF data can represent packed enums correctly without any special
handling from pahole side. Previously pahole's own `enum vscope` would
be omitted causing problems.

Original commit tried to generate correct struct bitfield member type if
the member is an enum. This was dated before kind_flag implementation.
Later, kind_flag support was added and now pahole always generates BTF
with kind_flag = 1 for structures with bitfield, where bitfield size is
encoded in btf_member, so this workaround is not needed any more.
Removing this "hack" makes handling it easier to handle packed enums
correctly.

Repro:

  $ cat test/packed_enum.c
  enum packed_enum {
          VALUE1,
          VALUE2,
          VALUE3
  } __attribute__((packed));

  struct s {
          int x;
          enum packed_enum e;
          int y;
  };

  int main()
  {
          struct s s;
          return 0;
  }

  $ gcc -g -c test/packed_enum.c -o test/packed_enum.o

  $ ~/local/pahole/build/pahole -JV test/packed_enum.o
  File test/packed_enum.o:
  [1] INT (anon) size=1 bit_offset=0 nr_bits=8 encoding=SIGNED
  [2] STRUCT s kind_flag=0 size=12 vlen=3
          x type_id=3 bits_offset=0
          e type_id=1 bits_offset=32
          y type_id=3 bits_offset=64
  [3] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED

  $ ~/local/pahole/build/pahole -F dwarf test/packed_enum.o
  struct s {
          int                        x;                    /*     0     4 */
          enum packed_enum           e;                    /*     4     1 */

          /* XXX 3 bytes hole, try to pack */

          int                        y;                    /*     8     4 */

          /* size: 12, cachelines: 1, members: 3 */
          /* sum members: 9, holes: 1, sum holes: 3 */
          /* last cacheline: 12 bytes */
  };

  $ ~/local/pahole/build/pahole -F btf test/packed_enum.o
  struct s {
          int                        x;                    /*     0     4 */
          nameless base type!        e;                    /*     4     1 */

          /* XXX 3 bytes hole, try to pack */

          int                        y;                    /*     8     4 */

          /* size: 12, cachelines: 1, members: 3 */
          /* sum members: 9, holes: 1, sum holes: 3 */
          /* last cacheline: 12 bytes */
  };

Notice how pahole's log doesn't have a mention of encoding 'packed_enum'
(anonymous integer is generated instead), which causes 'nameless base
type!' output above.

Fix this change:

  $ ~/local/pahole/build/pahole -JV test/packed_enum.o
  File test/packed_enum.o:
  [1] ENUM packed_enum size=1 vlen=3
          VALUE1 val=0
          VALUE2 val=1
          VALUE3 val=2
  [2] STRUCT s kind_flag=0 size=12 vlen=3
          x type_id=3 bits_offset=0
          e type_id=1 bits_offset=32
          y type_id=3 bits_offset=64
  [3] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED

  $ ~/local/pahole/build/pahole -F btf test/packed_enum.o
  struct s {
          int                        x;                    /*     0     4 */
          enum packed_enum           e;                    /*     4     1 */

          /* XXX 3 bytes hole, try to pack */

          int                        y;                    /*     8     4 */

          /* size: 12, cachelines: 1, members: 3 */
          /* sum members: 9, holes: 1, sum holes: 3 */
          /* last cacheline: 12 bytes */
  };

  $ PAHOLE=~/local/pahole/build/pahole ./btfdiff test/packed_enum.o

Also verified on pahole, kernel and glibc:

  $ PAHOLE=~/local/pahole/build/pahole ./btfdiff ~/local/btf/pahole.debug
  $ PAHOLE=~/local/pahole/build/pahole ./btfdiff ~/local/btf/libc-2.28.so.debug
  $ PAHOLE=~/local/pahole/build/pahole ./btfdiff ~/local/btf/vmlinux4

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Fixes: b18354f64c ("btf: Generate correct struct bitfield member types")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-26 11:32:32 -03:00
Arnaldo Carvalho de Melo fe4e1f799c btf_elf: Rename btf_elf__free() to btf_elf__delete()
That is the idiom for free its members and then free itself, 'free' is
just to free its members.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-14 17:06:40 -03:00
Arnaldo Carvalho de Melo 6780c4334d btf: Rename 'struct btf' to 'struct btf_elf'
So that we don't clash with libbpf's 'struct btf', in time more internal
state now in 'struct btf_elf' will refer to the equivalent internal
state in libbpf's 'struct btf', as they have lots in common.

Requested-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Acked-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Martin Lau <kafai@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-14 17:06:23 -03:00
Andrii Nakryiko ca86e9416b pahole: use btf.h directly from libbpf
Now that libbpf is a submodule, we don't need to copy/paste btf.h header
with BTF type definitions.

This is a first step in migrating parts of libbtf, btf_encoder and
btf_loader to use libbpf and starting to use btf__dedup().

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: dwarves@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-11 12:57:06 -03:00
Martin Lau a58c746c4c Fixup copyright notices for BTF files authored by Facebook engineers
Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Domenico Andreoli <domenico.andreoli@linux.com>
Cc: Yonghong Song <yhs@fb.com>
Signed-off-by: Martin Lau <kafai@fb.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-01-18 20:34:05 -03:00
Domenico Andreoli e714d2eaa1 Adopt SPDX-License-Identifier
Signed-off-by: Domenico Andreoli <domenico.andreoli@linux.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-01-18 15:41:48 -03:00
Yonghong Song 3aa3fd506e btf: add func_proto support
Two new btf kinds, BTF_KIND_FUNC and BTF_KIND_FUNC_PROTO,
have been added in kernel since
  https://patchwork.ozlabs.org/cover/1000176/
to support better func introspection.

Currently, for a DW_TAG_subroutine_type dwarf type,
a simple "void *" is generated instead of real subroutine type.

This patch teaches pahole to generate BTF_KIND_FUNC_PROTO
properly. After this patch, pahole should have complete
type coverage for C frontend with types a bpf program cares.

For example,
  $ cat t1.c
  typedef int __int32;
  struct t1 {
    int a1;
    int (*f1)(char p1, __int32 p2);
  } g1;
  $ cat t2.c
  typedef int __int32;
  struct t2 {
    int a2;
    int (*f2)(char q1, __int32 q2, ...);
    int (*f3)();
  } g2;
  int main() { return 0; }
  $ gcc -O2 -o t1 -g t1.c t2.c
  $ pahole -JV t1
  File t1:
  [1] TYPEDEF __int32 type_id=2
  [2] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED
  [3] STRUCT t1 kind_flag=0 size=16 vlen=2
        a1 type_id=2 bits_offset=0
        f1 type_id=6 bits_offset=64
  [4] FUNC_PROTO (anon) return=2 args=(5 (anon), 1 (anon))
  [5] INT char size=1 bit_offset=0 nr_bits=8 encoding=(none)
  [6] PTR (anon) type_id=4
  [7] TYPEDEF __int32 type_id=8
  [8] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED
  [9] STRUCT t2 kind_flag=0 size=24 vlen=3
        a2 type_id=8 bits_offset=0
        f2 type_id=12 bits_offset=64
        f3 type_id=14 bits_offset=128
  [10] FUNC_PROTO (anon) return=8 args=(11 (anon), 7 (anon), vararg)
  [11] INT char size=1 bit_offset=0 nr_bits=8 encoding=(none)
  [12] PTR (anon) type_id=10
  [13] FUNC_PROTO (anon) return=8 args=(vararg)
  [14] PTR (anon) type_id=13
  $

In the above example, type [4], [10] and [13] represent the
func_proto types.

BTF_KIND_FUNC, which represents a real subprogram, is not generated in
this patch and will be considered later.

Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@fb.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-20 11:33:35 -03:00
Yonghong Song 8630ce4042 btf: fix struct/union/fwd types with kind_flag
This patch fixed two issues with BTF. One is related to struct/union
bitfield encoding and the other is related to forward type.

Issue #1 and solution:
======================

Current btf encoding of bitfield follows what pahole generates.
For each bitfield, pahole will duplicate the type chain and
put the bitfield size at the final int or enum type.
Since the BTF enum type cannot encode bit size,
commit b18354f64c ("btf: Generate correct struct bitfield
member types") workarounds the issue by generating
an int type whenever the enum bit size is not 32.

The above workaround is not ideal as we lost original type
in BTF. Another undesiable fact is the type duplication
as the pahole duplicates the type chain.

To fix this issue, this patch implemented a compatible
change for BTF struct type encoding:
  . the bit 31 of type->info, previously reserved,
    now is used to indicate whether bitfield_size is
    encoded in btf_member or not.
  . if bit 31 of struct_type->info is set,
    btf_member->offset will encode like:
      bit 0 - 23: bit offset
      bit 24 - 31: bitfield size
    if bit 31 is not set, the old behavior is preserved:
      bit 0 - 31: bit offset

So if the struct contains a bit field, the maximum bit offset
will be reduced to (2^24 - 1) instead of MAX_UINT. The maximum
bitfield size will be 255 which is enough for today as maximum
bitfield in compiler can be 128 where int128 type is supported.

A new global, no_bitfield_type_recode, is introduced and which
will be set to true if BTF encoding is enabled. This global
will prevent pahole duplicating the bitfield types to avoid
type duplication in BTF.

Issue #2 and solution:
======================

Current forward type in BTF does not specify whether the original
type is struct or union. This will not work for type pretty print
and BTF-to-header-file conversion as struct/union must be specified.

To fix this issue, similar to issue #1, type->info bit 31
is used. If the bit is set, it is union type. Otherwise, it is
a struct type.

Examples:
=========

  -bash-4.4$ cat t.c
  struct s;
  union u;
  typedef int ___int;
  enum A { A1, A2, A3 };
  struct t {
	  int a[5];
	  ___int b:4;
	  volatile enum A c:4;
	  struct s *p1;
	  union u *p2;
  } g;
  -bash-4.4$ gcc -c -O2 -g t.c

Without this patch:

  $ pahole -JV t.o
  [1] TYPEDEF ___int type_id=2
  [2] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED
  [3] ENUM A size=4 vlen=3
        A1 val=0
        A2 val=1
        A3 val=2
  [4] STRUCT t size=40 vlen=5
        a type_id=5 bits_offset=0
        b type_id=13 bits_offset=160
        c type_id=15 bits_offset=164
        p1 type_id=9 bits_offset=192
        p2 type_id=11 bits_offset=256
  [5] ARRAY (anon) type_id=2 index_type_id=2 nr_elems=5
  [6] INT sizetype size=8 bit_offset=0 nr_bits=64 encoding=(none)
  [7] VOLATILE (anon) type_id=3
  [8] FWD s type_id=0
  [9] PTR (anon) type_id=8
  [10] FWD u type_id=0
  [11] PTR (anon) type_id=10
  [12] INT int size=1 bit_offset=0 nr_bits=4 encoding=(none)
  [13] TYPEDEF ___int type_id=12
  [14] INT (anon) size=1 bit_offset=0 nr_bits=4 encoding=SIGNED
  [15] VOLATILE (anon) type_id=14

With this patch:

  $ pahole -JV t.o
  File t.o:
  [1] TYPEDEF ___int type_id=2
  [2] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED
  [3] ENUM A size=4 vlen=3
        A1 val=0
        A2 val=1
        A3 val=2
  [4] STRUCT t kind_flag=1 size=40 vlen=5
        a type_id=5 bitfield_size=0 bits_offset=0
        b type_id=1 bitfield_size=4 bits_offset=160
        c type_id=7 bitfield_size=4 bits_offset=164
        p1 type_id=9 bitfield_size=0 bits_offset=192
        p2 type_id=11 bitfield_size=0 bits_offset=256
  [5] ARRAY (anon) type_id=2 index_type_id=2 nr_elems=5
  [6] INT sizetype size=8 bit_offset=0 nr_bits=64 encoding=(none)
  [7] VOLATILE (anon) type_id=3
  [8] FWD s struct
  [9] PTR (anon) type_id=8
  [10] FWD u union
  [11] PTR (anon) type_id=10

The fix removed the type duplication, preserved the enum type for the
bitfield, and have correct struct/union information for the forward
type.

Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@fb.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-20 11:27:20 -03:00
Andrii Nakryiko 65bd17abc7 btf: Allow multiple cu's in dwarf->btf conversion
Currently, the pahole dwarf->btf conversion only supports one
compilation unit. This is not ideal since we would like using pahole to
generate BTF for vmlinux which has a lot of compilation units.

This patch added support to process multiple compilation units per ELF
file. Multiple ELF files are also supported properly.

The following is a demonstration example:
  -bash-4.4$ cat t1.c
  struct t1 {
    int a1;
  } g1;
  int main(void) { return 0; }
  -bash-4.4$ cat t2.c
  struct t2 {
    char a2;
  } g2;
  int main() { return 0; }
  -bash-4.4$ cat t3.c
  struct t3 {
    unsigned char a1:4;
  } g1;
  int main(void) { return 0; }
  -bash-4.4$ cat t4.c
  struct t4 {
    volatile char a4;
  } g2;
  int main() { return 0; }
  -bash-4.4$ gcc -O2 -o t1 -g t1.c t2.c
  -bash-4.4$ gcc -O2 -o t3 -g t3.c t4.c

Note that both the binary "t1" and "t3" have two compilation units in
their respective dwarf debug_info sections. The following is the pahole
verbose output for BTF conversion for these two binaries.

  -bash-4.4$ pahole -JV t1 t3
  File t1:
  [1] STRUCT t1 size=4 vlen=1
        a1 type_id=2 bits_offset=0
  [2] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED
  [3] STRUCT t2 size=1 vlen=1
        a2 type_id=4 bits_offset=0
  [4] INT char size=1 bit_offset=0 nr_bits=8 encoding=(none)
  [5] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED

  File t3:
  [1] STRUCT t3 size=1 vlen=1
        a1 type_id=3 bits_offset=0
  [2] INT unsigned char size=1 bit_offset=0 nr_bits=8 encoding=(none)
  [3] INT unsigned char size=1 bit_offset=0 nr_bits=4 encoding=(none)
  [4] INT (anon) size=4 bit_offset=0 nr_bits=32 encoding=(none)
  [5] STRUCT t4 size=1 vlen=1
        a4 type_id=6 bits_offset=0
  [6] VOLATILE (anon) type_id=7
  [7] INT char size=1 bit_offset=0 nr_bits=8 encoding=(none)
  [8] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-20 10:44:48 -03:00
Yonghong Song b18354f64c btf: Generate correct struct bitfield member types
For int types, the correct type size will be generated.  For enum types,
if the bit size is not 32, current BTF enum cannot represent it so a
signed int type will be generated.

For the following example:

  $ cat test.c
  enum A { A1, A2, A3 };
  struct t {
    enum A a:3;
    volatile enum A b:4;
  } g;
  $ gcc -c -g -O2 test.c

Without this patch, we will have:

  $ pahole -JV test.o
  [1] ENUM A size=4 vlen=3
        A1 val=0
        A2 val=1
        A3 val=2
  [2] STRUCT t size=4 vlen=2
        a type_id=4 bits_offset=0
        b type_id=6 bits_offset=3
  [3] VOLATILE (anon) type_id=1
  [4] ENUM A size=1 vlen=3
        A1 val=0
        A2 val=1
        A3 val=2
  [5] ENUM A size=1 vlen=3
        A1 val=0
        A2 val=1
        A3 val=2
  [6] VOLATILE (anon) type_id=5
  [7] INT (anon) size=4 bit_offset=0 nr_bits=32 encoding=(none)

There are two issues in the above. The struct "t" member "a" points to
type [4]. But the nr_bits is lost in type [4].

The same for type [5] which is for struct "t" member "b".

Since BTF ENUM type cannot encode nr_bits, this patch fixed the issue by
generating a BTF INT type if the ENUM type number of bits in pahole is
not 32.

With this patch, the incorrect member nr_bits issue is fixed as below:

  $ pahole -JV test.o
  [1] ENUM A size=4 vlen=3
        A1 val=0
        A2 val=1
        A3 val=2
  [2] STRUCT t size=4 vlen=2
        a type_id=4 bits_offset=0
        b type_id=6 bits_offset=3
  [3] VOLATILE (anon) type_id=1
  [4] INT (anon) size=1 bit_offset=0 nr_bits=3 encoding=SIGNED
  [5] INT (anon) size=1 bit_offset=0 nr_bits=4 encoding=SIGNED
  [6] VOLATILE (anon) type_id=5
  [7] INT (anon) size=4 bit_offset=0 nr_bits=32 encoding=(none)

Signed-off-by: Yonghong Song <yhs@fb.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Cc: Alexei Starovoitov <ast@fb.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-03 12:04:46 -03:00
Yonghong Song 0d2511fd1d btf: Fix bitfield encoding
The btf bitfield encoding is broken.

For the following example:

  -bash-4.2$ cat t.c
  struct t {
     int a:2;
     int b:1;
     int :3;
     int c:1;
     int d;
     char e:1;
     char f:1;
     int g;
  };
  void test(struct t *t) {
     return;
  }
  -bash-4.2$ clang -S -g -emit-llvm t.c

The output for bpf "little and big" endian results with pahole dwarf2btf
conversion:

  -bash-4.2$ llc -march=bpfel -mattr=dwarfris -filetype=obj t.ll
  -bash-4.2$ pahole -JV t.o
  [1] PTR (anon) type_id=2
  [2] STRUCT t size=16 vlen=7
        a type_id=5 bits_offset=30
        b type_id=6 bits_offset=29
        c type_id=6 bits_offset=25
        d type_id=3 bits_offset=32
        e type_id=7 bits_offset=71
        f type_id=7 bits_offset=70
        g type_id=3 bits_offset=96
  [3] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED
  [4] INT char size=1 bit_offset=0 nr_bits=8 encoding=(none)
  [5] INT int size=1 bit_offset=0 nr_bits=2 encoding=(none)
  [6] INT int size=1 bit_offset=0 nr_bits=1 encoding=(none)
  [7] INT char size=1 bit_offset=0 nr_bits=1 encoding=(none)
  -bash-4.2$ llc -march=bpfeb -mattr=dwarfris -filetype=obj t.ll
  -bash-4.2$ pahole -JV t.o
  [1] PTR (anon) type_id=2
  [2] STRUCT t size=16 vlen=7
        a type_id=5 bits_offset=0
        b type_id=6 bits_offset=2
        c type_id=6 bits_offset=6
        d type_id=3 bits_offset=32
        e type_id=7 bits_offset=64
        f type_id=7 bits_offset=65
        g type_id=3 bits_offset=96
  [3] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED
  [4] INT char size=1 bit_offset=0 nr_bits=8 encoding=(none)
  [5] INT int size=1 bit_offset=0 nr_bits=2 encoding=(none)
  [6] INT int size=1 bit_offset=0 nr_bits=1 encoding=(none)
  [7] INT char size=1 bit_offset=0 nr_bits=1 encoding=(none)

The BTF struct member bits_offset counts bits from the beginning of the
containing entity regardless of endianness, similar to what
DW_AT_bit_offset from DWARF4 does. Such counting is equivalent to the
big endian conversion in the above.

But the little endian conversion is not correct since dwarf generates
DW_AT_bit_offset based on actual bit position in the little endian
architecture.  For example, for the above struct member "a", the dwarf
would generate DW_AT_bit_offset=30 for little endian, and
DW_AT_bit_offset=0 for big endian.

This patch fixed the little endian structure member bits_offset problem
with proper calculation based on dwarf attributes.

With the fix, we get:

  -bash-4.2$ llc -march=bpfel -mattr=dwarfris -filetype=obj t.ll
  -bash-4.2$ pahole -JV t.o
    [1] STRUCT t size=16 vlen=7
        a type_id=5 bits_offset=0
        b type_id=6 bits_offset=2
        c type_id=6 bits_offset=6
        d type_id=2 bits_offset=32
        e type_id=7 bits_offset=64
        f type_id=7 bits_offset=65
        g type_id=2 bits_offset=96
    [2] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED
    [3] INT char size=1 bit_offset=0 nr_bits=8 encoding=(none)
    [4] PTR (anon) type_id=1
    [5] INT int size=1 bit_offset=0 nr_bits=2 encoding=(none)
    [6] INT int size=1 bit_offset=0 nr_bits=1 encoding=(none)
    [7] INT char size=1 bit_offset=0 nr_bits=1 encoding=(none)
  -bash-4.2$ llc -march=bpfeb -mattr=dwarfris -filetype=obj t.ll
  -bash-4.2$ pahole -JV t.o
  [1] PTR (anon) type_id=2
  [2] STRUCT t size=16 vlen=7
        a type_id=5 bits_offset=0
        b type_id=6 bits_offset=2
        c type_id=6 bits_offset=6
        d type_id=3 bits_offset=32
        e type_id=7 bits_offset=64
        f type_id=7 bits_offset=65
        g type_id=3 bits_offset=96
  [3] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED
  [4] INT char size=1 bit_offset=0 nr_bits=8 encoding=(none)
  [5] INT int size=1 bit_offset=0 nr_bits=2 encoding=(none)
  [6] INT int size=1 bit_offset=0 nr_bits=1 encoding=(none)
  [7] INT char size=1 bit_offset=0 nr_bits=1 encoding=(none)
  -bash-4.2$

For both little endian and big endian, we have correct and
same bits_offset for struct members.

We could fix pos->bit_offset, but pos->bit_offset will be inconsistent
to pos->bitfield_offset in the meaning and pos->bitfield_offset is used
to print out pahole data structure:

  -bash-4.2$ llc -march=bpfel -mattr=dwarfris -filetype=obj t.ll
  -bash-4.2$ /bin/pahole t.o
  struct t {
        int                        a:2;                  /*     0:30  4 */
        int                        b:1;                  /*     0:29  4 */
        int                        c:1;                  /*     0:25  4 */
  .....

So this patch just made the change in btf specific routines.

Signed-off-by: Yonghong Song <yhs@fb.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Cc: Alexei Starovoitov <ast@fb.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-09-17 11:44:58 -03:00
Martin KaFai Lau 68645f7fac btf: Add BTF support
This patch introduces BPF Type Format (BTF).

BTF (BPF Type Format) is the meta data format which describes
the data types of BPF program/map.  Hence, it basically focus
on the C programming language which the modern BPF is primary
using.  The first use case is to provide a generic pretty print
capability for a BPF map.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-07-25 14:42:06 -03:00