Commit Graph

1982 Commits

Author SHA1 Message Date
Arnaldo Carvalho de Melo 742f04f89d emit: Search for data structures using their type in addition to their name
We may have, say, both a typedef and a struct with the same name, and
sometimes we need to emit both to reflect types found in the Linux
kernel that use:

typedef struct foo {
	...
} foo;

So we need both 'struct foo' and 'typedef foo'.
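
As a minimal sketch of why both spellings can coexist: in C, struct tags
and typedef names live in separate name spaces, so 'struct foo' and 'foo'
may refer to the same type. The member and the two accessor functions
are illustrative assumptions, not code from the kernel or from pahole.

```c
#include <assert.h>
#include <stddef.h>

/* The tag "foo" and the typedef name "foo" do not clash. */
typedef struct foo {
	int value;	/* illustrative member */
} foo;

/* Refers to the type through its struct tag. */
static int get_via_tag(const struct foo *p)
{
	assert(p != NULL);
	return p->value;
}

/* Refers to the very same type through the typedef name. */
static int get_via_typedef(const foo *p)
{
	assert(p != NULL);
	return p->value;
}
```

A header that uses either spelling therefore needs both declarations to
be emitted before it compiles.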

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-07 16:13:27 -03:00
Arnaldo Carvalho de Melo 32cc148172 fprintf: Consider enumerations without members as forward declarations
To avoid emitting:

  enum x86_intercept_stage {
  };

Which isn't compilable.

The DWARF info for this enum in the Linux kernel has the declaration
flag set, but somehow that flag is not available when loading from BTF.

So do the best we can at enumeration__fprintf() time, that is,
print zero-member enumerations as forward declarations.
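
The idea can be sketched as follows. This is not the pahole
implementation: struct enum_desc and enum_desc__format() are hypothetical
names, and the output goes to a caller-supplied buffer rather than to a
FILE pointer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified description of an enumeration. */
struct enum_desc {
	const char *name;
	const char *const *members;	/* enumerator names */
	size_t nr_members;
};

/* Formats the enum into buf; output is truncated if buf is too small.
 * Returns the length of the resulting string. */
static size_t enum_desc__format(const struct enum_desc *e, char *buf, size_t size)
{
	size_t len;

	assert(e != NULL && buf != NULL && size > 0);

	/* No members: emit a forward declaration, not "enum x { };" */
	if (e->nr_members == 0) {
		snprintf(buf, size, "enum %s;\n", e->name);
		return strlen(buf);
	}

	snprintf(buf, size, "enum %s {\n", e->name);
	for (size_t i = 0; i < e->nr_members; i++) {
		len = strlen(buf);
		snprintf(buf + len, size - len, "\t%s,\n", e->members[i]);
	}
	len = strlen(buf);
	snprintf(buf + len, size - len, "};\n");
	return strlen(buf);
}
```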

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-03 11:30:34 -03:00
Arnaldo Carvalho de Melo 6afc296eeb emit: Fix printing typedef of nameless struct/union
E.g:

  typedef struct {
          __u8                       b[16];                /*     0    16 */

          /* size: 16, cachelines: 1, members: 1 */
          /* last cacheline: 16 bytes */
  } guid_t;
  typedef guid_t efi_guid_t;

Before we were not emitting the guid_t typedef, fix it.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-02 17:39:36 -03:00
Arnaldo Carvalho de Melo 49a2dd6577 fprintf: Check if conf->conf_fprintf is not NULL when resolving cacheline_size
There are tools that don't set conf_load->conf_fprintf, like codiff, so
check for that in dwarves__resolve_cacheline_size().

Cc: Douglas Raillard <douglas.raillard@arm.com>
Fixes: 772725a77d ("dwarves_fprintf: Move cacheline_size into struct conf_fprintf")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-02 17:20:00 -03:00
Arnaldo Carvalho de Melo 46cec35ff0 fprintf: Fix division by zero for uninitialized conf_fprintf->cacheline_size field
tl;dr:

  gdb pfunct
  (gdb) run --compile tcp.o
  Program received signal SIGFPE, Arithmetic exception.
  0x00007ffff7f18551 in class__fprintf_cacheline_boundary (conf=0x7fffffffda10, offset=0, fp=0x7ffff7e17520 <_IO_2_1_stdout_>) at /var/home/acme/git/pahole/dwarves_fprintf.c:1319
  1319		uint32_t cacheline = offset / conf->cacheline_size;
  (gdb) bt
  #0  0x00007ffff7f18551 in class__fprintf_cacheline_boundary (conf=0x7fffffffda10, offset=0, fp=0x7ffff7e17520 <_IO_2_1_stdout_>) at /var/home/acme/git/pahole/dwarves_fprintf.c:1319
  #1  0x00007ffff7f16af2 in class_member__fprintf (member=0x45de10, union_member=false, type=0x45dfb0, cu=0x435a40, conf=0x7fffffffda10, fp=0x7ffff7e17520 <_IO_2_1_stdout_>) at /var/home/acme/git/pahole/dwarves_fprintf.c:869
  #2  0x00007ffff7f1717b in struct_member__fprintf (member=0x45de10, type=0x45dfb0, cu=0x435a40, conf=0x7fffffffda10, fp=0x7ffff7e17520 <_IO_2_1_stdout_>) at /var/home/acme/git/pahole/dwarves_fprintf.c:983
  #3  0x00007ffff7f1945c in __class__fprintf (class=0x45dcc0, cu=0x435a40, conf=0x7fffffffdbb0, fp=0x7ffff7e17520 <_IO_2_1_stdout_>) at /var/home/acme/git/pahole/dwarves_fprintf.c:1583
  #4  0x00007ffff7f1a6bd in tag__fprintf (tag=0x45dcc0, cu=0x435a40, conf=0x7fffffffdc70, fp=0x7ffff7e17520 <_IO_2_1_stdout_>) at /var/home/acme/git/pahole/dwarves_fprintf.c:1906
  #5  0x00007ffff7fbf022 in type__emit (tag=0x45dcc0, cu=0x435a40, prefix=0x0, suffix=0x0, fp=0x7ffff7e17520 <_IO_2_1_stdout_>) at /var/home/acme/git/pahole/dwarves_emit.c:333
  #6  0x00007ffff7fbed3d in tag__emit_definitions (tag=0x6b21e0, cu=0x435a40, emissions=0x408300 <emissions>, fp=0x7ffff7e17520 <_IO_2_1_stdout_>) at /var/home/acme/git/pahole/dwarves_emit.c:265
  #7  0x00007ffff7fbef45 in type__emit_definitions (tag=0x6b20c0, cu=0x435a40, emissions=0x408300 <emissions>, fp=0x7ffff7e17520 <_IO_2_1_stdout_>) at /var/home/acme/git/pahole/dwarves_emit.c:315
  #8  0x00007ffff7fbed15 in tag__emit_definitions (tag=0x6b3b40, cu=0x435a40, emissions=0x408300 <emissions>, fp=0x7ffff7e17520 <_IO_2_1_stdout_>) at /var/home/acme/git/pahole/dwarves_emit.c:264
  #9  0x00007ffff7fbef45 in type__emit_definitions (tag=0x6b31d0, cu=0x435a40, emissions=0x408300 <emissions>, fp=0x7ffff7e17520 <_IO_2_1_stdout_>) at /var/home/acme/git/pahole/dwarves_emit.c:315
  #10 0x00007ffff7fbed15 in tag__emit_definitions (tag=0x4cb920, cu=0x435a40, emissions=0x408300 <emissions>, fp=0x7ffff7e17520 <_IO_2_1_stdout_>) at /var/home/acme/git/pahole/dwarves_emit.c:264
  #11 0x00007ffff7fbef45 in type__emit_definitions (tag=0x4cb7d0, cu=0x435a40, emissions=0x408300 <emissions>, fp=0x7ffff7e17520 <_IO_2_1_stdout_>) at /var/home/acme/git/pahole/dwarves_emit.c:315
  #12 0x0000000000403592 in function__emit_type_definitions (func=0x738ad0, cu=0x435a40, fp=0x7ffff7e17520 <_IO_2_1_stdout_>) at /var/home/acme/git/pahole/pfunct.c:353
  #13 0x0000000000403670 in function__show (func=0x738ad0, cu=0x435a40) at /var/home/acme/git/pahole/pfunct.c:371
  #14 0x00000000004038e9 in cu_function_iterator (cu=0x435a40, cookie=0x0) at /var/home/acme/git/pahole/pfunct.c:404
  #15 0x00007ffff7f1296b in cus__for_each_cu (cus=0x4369e0, iterator=0x403869 <cu_function_iterator>, cookie=0x0, filter=0x0) at /var/home/acme/git/pahole/dwarves.c:1919
  #16 0x000000000040432a in main (argc=3, argv=0x7fffffffe1f8) at /var/home/acme/git/pahole/pfunct.c:776
  (gdb) p conf->cacheline_size
  $2 = 0

We need to pass a conf_fprintf pointer down the chain starting at
function__emit_type_definitions(), i.e. dwarves_emit.c needs to receive
the printing configuration. Right now type__emit() synthesizes a
conf_fprintf without initializing conf_fprintf->cacheline_size, which
ends up in a division by zero.

As a quicker fix, just add a helper that checks if the field is zero and,
if so, uses the conf_fprintf__defaults.cacheline_size field that is being
initialized by all tools via:

  dwarves__resolve_cacheline_size(&conf_load, 0);
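
A minimal sketch of such a fallback helper, under stated assumptions:
the struct is trimmed to one field, the default of 64 is only an
illustrative value, and effective_cacheline_size() and
offset_to_cacheline() are hypothetical names, not necessarily the ones
used in dwarves_fprintf.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down printing configuration. */
struct conf_fprintf {
	uint32_t cacheline_size;
};

/* Stand-in for the defaults every tool initializes at startup;
 * 64 is an assumed, illustrative value. */
static struct conf_fprintf conf_fprintf__defaults = { .cacheline_size = 64 };

/* Falls back to the defaults when conf is NULL or the field is unset. */
static uint32_t effective_cacheline_size(const struct conf_fprintf *conf)
{
	if (conf == NULL || conf->cacheline_size == 0)
		return conf_fprintf__defaults.cacheline_size;
	return conf->cacheline_size;
}

/* Cacheline index of a member offset; the divisor cannot be zero as
 * long as the defaults themselves are non-zero. */
static uint32_t offset_to_cacheline(const struct conf_fprintf *conf, uint32_t offset)
{
	uint32_t size = effective_cacheline_size(conf);

	assert(size != 0);
	return offset / size;
}
```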

Fixes: 772725a77d ("dwarves_fprintf: Move cacheline_size into struct conf_fprintf")
Cc: Douglas Raillard <douglas.raillard@arm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-01-28 17:24:30 -03:00
Kui-Feng Lee 73383b3a39 libbpf: Update libbpf to the latest git HEAD
Replace deprecated APIs with new ones.

Signed-off-by: Kui-Feng Lee <kuifeng@fb.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Link: https://lore.kernel.org/r/20220126192039.2840752-5-kuifeng@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-01-28 16:30:55 -03:00
Kui-Feng Lee 2135275318 pahole: Use per-thread btf instances to avoid mutex locking
Create an instance of btf for each worker thread, and add type info to
the local btf instance in the steal-function of pahole without
acquiring a mutex.  Once all worker threads have finished, merge the
per-thread btf instances into the primary btf instance.
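
The pattern can be sketched with a toy container in place of a real btf
instance. Everything here is hypothetical (struct type_store, its
functions and MAX_TYPES), and the threads themselves are left out so the
sketch stays standalone: the point is only that each worker writes to
its own store, and merging happens once, after the workers are done.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TYPES 64

/* Hypothetical stand-in for a btf instance: just a list of type ids. */
struct type_store {
	int ids[MAX_TYPES];
	size_t nr;
};

/* Called by exactly one worker on its own store, so no lock is needed. */
static int type_store__add(struct type_store *store, int id)
{
	assert(store != NULL);
	if (store->nr == MAX_TYPES)
		return -1;
	store->ids[store->nr++] = id;
	return 0;
}

/* Run once, after all workers have finished: append every per-worker
 * store to the primary one, in worker order. */
static int type_store__merge(struct type_store *primary,
			     const struct type_store *locals, size_t nr_locals)
{
	for (size_t i = 0; i < nr_locals; i++)
		for (size_t j = 0; j < locals[i].nr; j++)
			if (type_store__add(primary, locals[i].ids[j]) != 0)
				return -1;
	return 0;
}
```

The trade-off is extra memory for the per-worker stores plus a serial
merge step, in exchange for no lock contention while encoding.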

Committer testing:

Results with no multithreading, and without further DWARF loading
improvements (not loading things that won't be converted to BTF, etc),
i.e. using pahole 1.21:

  # perf stat -r5 pahole --btf_encode /tmp/vmlinux ; btfdiff /tmp/vmlinux

   Performance counter stats for 'pahole --btf_encode /tmp/vmlinux' (5 runs):

            6,317.41 msec task-clock                #    0.985 CPUs utilized            ( +-  1.07% )
                  80      context-switches          #   12.478 /sec                     ( +- 15.25% )
                   1      cpu-migrations            #    0.156 /sec                     ( +-111.36% )
             535,890      page-faults               #   83.585 K/sec                    ( +-  0.00% )
      29,789,308,790      cycles                    #    4.646 GHz                      ( +-  0.46% )  (83.33%)
          97,696,165      stalled-cycles-frontend   #    0.33% frontend cycles idle     ( +-  4.05% )  (83.34%)
         145,554,652      stalled-cycles-backend    #    0.49% backend cycles idle      ( +- 21.53% )  (83.33%)
      78,215,192,264      instructions              #    2.61  insn per cycle
                                                    #    0.00  stalled cycles per insn  ( +-  0.05% )  (83.33%)
      18,141,376,637      branches                  #    2.830 G/sec                    ( +-  0.06% )  (83.33%)
         148,826,657      branch-misses             #    0.82% of all branches          ( +-  0.65% )  (83.34%)

              6.4129 +- 0.0682 seconds time elapsed  ( +-  1.06% )

  #

Now with pahole 1.23, with just parallel DWARF loading + trimmed DWARF
loading (skipping DWARF tags that won't be converted to BTF, etc):

  $ perf stat -r5 pahole -j --btf_encode /tmp/vmlinux

   Performance counter stats for 'pahole -j --btf_encode /tmp/vmlinux' (5 runs):

           10,828.98 msec task-clock:u              #    3.539 CPUs utilized            ( +-  0.94% )
                   0      context-switches:u        #    0.000 /sec
                   0      cpu-migrations:u          #    0.000 /sec
             105,407      page-faults:u             #    9.895 K/sec                    ( +-  0.15% )
      24,774,029,571      cycles:u                  #    2.326 GHz                      ( +-  0.50% )  (83.49%)
          76,895,232      stalled-cycles-frontend:u #    0.31% frontend cycles idle     ( +-  4.84% )  (83.50%)
          24,821,768      stalled-cycles-backend:u  #    0.10% backend cycles idle      ( +-  3.66% )  (83.11%)
      69,891,360,588      instructions:u            #    2.83  insn per cycle
                                                    #    0.00  stalled cycles per insn  ( +-  0.10% )  (83.20%)
      16,966,456,889      branches:u                #    1.593 G/sec                    ( +-  0.21% )  (83.41%)
         131,923,443      branch-misses:u           #    0.78% of all branches          ( +-  0.82% )  (83.42%)

              3.0600 +- 0.0140 seconds time elapsed  ( +-  0.46% )

  $

It is a bit better not to use a bare -j, which uses all the CPU threads
in the machine, and instead to use just the number of non-hyperthreading
cores, which in this machine, a Ryzen 5950x, is 16:

  $ perf stat -r5 pahole -j16 --btf_encode /tmp/vmlinux

   Performance counter stats for 'pahole -j16 --btf_encode /tmp/vmlinux' (5 runs):

           10,075.46 msec task-clock:u              #    3.431 CPUs utilized            ( +-  0.49% )
                   0      context-switches:u        #    0.000 /sec
                   0      cpu-migrations:u          #    0.000 /sec
              90,777      page-faults:u             #    8.983 K/sec                    ( +-  0.16% )
      22,611,016,624      cycles:u                  #    2.237 GHz                      ( +-  0.93% )  (83.34%)
          55,760,536      stalled-cycles-frontend:u #    0.24% frontend cycles idle     ( +-  2.35% )  (83.25%)
          15,985,651      stalled-cycles-backend:u  #    0.07% backend cycles idle      ( +-  8.79% )  (83.33%)
      68,976,319,497      instructions:u            #    2.96  insn per cycle
                                                    #    0.00  stalled cycles per insn  ( +-  0.34% )  (83.39%)
      16,770,540,533      branches:u                #    1.659 G/sec                    ( +-  0.31% )  (83.35%)
         128,220,385      branch-misses:u           #    0.76% of all branches          ( +-  0.77% )  (83.37%)

              2.9365 +- 0.0284 seconds time elapsed  ( +-  0.97% )

  $

Then with parallel DWARF loading + parallel BTF encoding (this patch):

  $ perf stat -r5 pahole -j --btf_encode /tmp/vmlinux

   Performance counter stats for 'pahole -j --btf_encode /tmp/vmlinux' (5 runs):

           11,063.29 msec task-clock:u              #    6.389 CPUs utilized            ( +-  0.79% )
                   0      context-switches:u        #    0.000 /sec
                   0      cpu-migrations:u          #    0.000 /sec
             163,263      page-faults:u             #   14.840 K/sec                    ( +-  0.48% )
      41,892,887,608      cycles:u                  #    3.808 GHz                      ( +-  0.96% )  (83.41%)
         197,163,158      stalled-cycles-frontend:u #    0.47% frontend cycles idle     ( +-  3.23% )  (83.46%)
         114,187,423      stalled-cycles-backend:u  #    0.27% backend cycles idle      ( +- 16.57% )  (83.43%)
      74,053,722,204      instructions:u            #    1.78  insn per cycle
                                                    #    0.00  stalled cycles per insn  ( +-  0.18% )  (83.37%)
      17,848,238,467      branches:u                #    1.622 G/sec                    ( +-  0.10% )  (83.27%)
         180,232,427      branch-misses:u           #    1.01% of all branches          ( +-  0.86% )  (83.16%)

              1.7316 +- 0.0301 seconds time elapsed  ( +-  1.74% )

  $

Again it is better not to use all the CPU threads via a bare -j:

  $ perf stat -r5 pahole -j16 --btf_encode /tmp/vmlinux

   Performance counter stats for 'pahole -j16 --btf_encode /tmp/vmlinux' (5 runs):

            6,626.33 msec task-clock:u              #    4.421 CPUs utilized            ( +-  0.82% )
                   0      context-switches:u        #    0.000 /sec
                   0      cpu-migrations:u          #    0.000 /sec
             140,919      page-faults:u             #   21.240 K/sec                    ( +-  1.03% )
      26,085,701,848      cycles:u                  #    3.932 GHz                      ( +-  1.20% )  (83.38%)
          98,962,246      stalled-cycles-frontend:u #    0.37% frontend cycles idle     ( +-  3.47% )  (83.41%)
         102,762,088      stalled-cycles-backend:u  #    0.39% backend cycles idle      ( +- 17.95% )  (83.38%)
      71,193,141,569      instructions:u            #    2.69  insn per cycle
                                                    #    0.00  stalled cycles per insn  ( +-  0.14% )  (83.33%)
      17,166,459,728      branches:u                #    2.587 G/sec                    ( +-  0.15% )  (83.27%)
         150,984,525      branch-misses:u           #    0.87% of all branches          ( +-  0.61% )  (83.34%)

              1.4989 +- 0.0113 seconds time elapsed  ( +-  0.76% )

  $
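
For reference, the elapsed times reported above translate into these
speedups over the 6.4129 second single-threaded baseline: about 2.10x
(-j, parallel loading only), 2.18x (-j16, parallel loading only), 3.70x
(-j, with this patch) and 4.28x (-j16, with this patch). The helper is a
trivial sketch of that arithmetic; speedup() is a hypothetical name.

```c
#include <assert.h>

/* Speedup of a run relative to a baseline, both in elapsed seconds. */
static double speedup(double baseline_secs, double secs)
{
	assert(secs > 0.0);
	return baseline_secs / secs;
}
```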

Minor tweaks to reduce the patch size, things like avoiding moving the
pthread_mutex_lock(&btf_lock) to after a comment, etc.

Signed-off-by: Kui-Feng Lee <kuifeng@fb.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Link: https://lore.kernel.org/r/20220126192039.2840752-4-kuifeng@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-01-28 16:30:54 -03:00
Kui-Feng Lee 96d2c5c323 dwarf_loader: Prepare and pass per-thread data to worker threads
Add interfaces to allow users of dwarf_loader to prepare and pass
per-thread data to steal-functions running on worker threads.
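
One possible shape for such an interface, as a sketch only: struct
loader_ops, loader__run() and the example callbacks are hypothetical
names, and the worker threads are simulated by a loop that hands units
out round-robin.

```c
#include <assert.h>
#include <stddef.h>

#define NR_WORKERS 2

/* Hypothetical callback table for users of the loader. */
struct loader_ops {
	/* Fills thr_data[i] with the private data for worker i. */
	int (*threads_prepare)(size_t nr_threads, void **thr_data);
	/* Invoked for each unit with the owning worker's private data. */
	int (*steal)(int unit_id, void *thr_data);
};

/* Drives the callbacks. A real loader would call steal() from
 * NR_WORKERS threads; here the workers are simulated in a loop. */
static int loader__run(const struct loader_ops *ops, const int *units, size_t nr_units)
{
	void *thr_data[NR_WORKERS] = { NULL };

	assert(ops != NULL && ops->steal != NULL);
	if (ops->threads_prepare && ops->threads_prepare(NR_WORKERS, thr_data) != 0)
		return -1;

	for (size_t i = 0; i < nr_units; i++)
		if (ops->steal(units[i], thr_data[i % NR_WORKERS]) != 0)
			return -1;
	return 0;
}

/* Example user: each worker counts the units it was handed. */
static int counters[NR_WORKERS];

static int example_prepare(size_t nr_threads, void **thr_data)
{
	for (size_t i = 0; i < nr_threads && i < NR_WORKERS; i++) {
		counters[i] = 0;
		thr_data[i] = &counters[i];
	}
	return 0;
}

static int example_steal(int unit_id, void *thr_data)
{
	(void)unit_id;		/* unused in this example */
	++*(int *)thr_data;	/* touches only this worker's counter */
	return 0;
}
```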

Signed-off-by: Kui-Feng Lee <kuifeng@fb.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Link: https://lore.kernel.org/r/20220126192039.2840752-3-kuifeng@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-01-28 16:30:10 -03:00
Kui-Feng Lee 724c8fddd7 dwarf_loader: Receive per-thread data on worker threads
Add arguments to steal and thread_exit callbacks of conf_load to
receive per-thread data.

Signed-off-by: Kui-Feng Lee <kuifeng@fb.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Link: https://lore.kernel.org/r/20220126192039.2840752-2-kuifeng@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-01-28 16:19:29 -03:00
Arnaldo Carvalho de Melo 2f7d61b2bf core: Define DW_TAG_skeleton_unit if not available on current dwarf.h
We use this in both dwarf_loader.c and fprintf.c, so define it in
dwarves.h, which is included by both.
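
A sketch of such a fallback definition. The value 0x4a is the one
quoted elsewhere in this log; tag__is_skeleton_unit() is a hypothetical
helper added only to exercise the macro. Note that #ifndef only sees
preprocessor macros, so a guard like this cannot detect a dwarf.h that
provides the name as an enum constant.

```c
#include <assert.h>
#include <stdbool.h>

/* Fallback for dwarf.h versions without the DWARF 5 tags; a real user
 * would include <dwarf.h> before this point. */
#ifndef DW_TAG_skeleton_unit
#define DW_TAG_skeleton_unit 0x4a
#endif

static_assert(DW_TAG_skeleton_unit == 0x4a, "unexpected DW_TAG_skeleton_unit value");

/* Hypothetical helper: true when the tag is the skeleton unit. */
static bool tag__is_skeleton_unit(unsigned int tag)
{
	return tag == DW_TAG_skeleton_unit;
}
```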

Reported-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/all/YbkTAPn3EEu6BUYR@archlinux-ax161
Cc: Domenico Andreoli <domenico.andreoli@linux.com>
Cc: Douglas RAILLARD <douglas.raillard@arm.com>
Cc: Ilya Leoshkevich <iii@linux.ibm.com>
Cc: Jan Engelhardt <jengelh@inai.de>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Matteo Croce <mcroce@microsoft.com>
Cc: Matthias Schwarzott <zzam@gentoo.org>
Cc: Yonghong Song <yhs@fb.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-17 15:50:17 -03:00
Arnaldo Carvalho de Melo c2b7b8c208 pahole: Prep 1.23
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-08 09:19:32 -03:00
Douglas RAILLARD 54ae2f7f5e Revert "fprintf: Allow making struct/enum/union anonymous"
This reverts commit 7c5e35b63b.

Dropped since it could not cope with recursive types. A new attempt will
be made on 1.24.

Signed-off-by: Douglas RAILLARD <douglas.raillard@arm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-08 08:52:51 -03:00
Douglas RAILLARD 69fb1861de Revert "pahole: Add --inner_anon option"
This reverts commit 005236c3e4.

Dropped since it could not cope with recursive types. A new attempt will
be made on 1.24.

Signed-off-by: Douglas RAILLARD <douglas.raillard@arm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-08 08:52:10 -03:00
Douglas Raillard 005236c3e4 pahole: Add --inner_anon option
Allow making the inner struct/enum/union anonymous. This permits using
the header to inspect pointer values using -E, without having to care
about avoiding duplicate type definitions such as:

    struct foo { ... };
    struct bar {
        struct foo {
            ....
        } a;
    };

With --inner_anon, the conflict between the two definitions of struct
foo is gone:

    struct foo { ... };
    struct bar {
        struct {
            ....
        } a;
    };
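
A compilable sketch of the anonymised form, using the same members as
the committer test in this message. In C a struct defined inside another
struct is visible in the enclosing scope, so compilers following pre-C23
rules reject a second 'struct foo { ... }' there as a redefinition;
dropping the inner tag avoids the clash. The static_assert is an added
sanity check, not pahole output.

```c
#include <assert.h>
#include <stddef.h>

struct foo {
	int  a;
	char b;
};

/* The inner struct has no tag, so it cannot clash with struct foo. */
struct bar {
	struct /* foo */ {
		int  a;
		char b;
	} c;
	int d;
};

/* The anonymous copy is expected to keep the layout of struct foo. */
static_assert(sizeof(((struct bar *)0)->c) == sizeof(struct foo),
	      "inner anonymous struct should match struct foo");
```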

Committer testing:

  $ cat inner_anon.c

  struct foo {
  	int  a;
  	char b;
  };

  struct bar {
          struct foo c;
  	int	   d;
  } bla;
  $ gcc -g -c inner_anon.c   -o inner_anon.o

No expansion:

  $ pahole inner_anon.o
  struct foo {
  	int                        a;                    /*     0     4 */
  	char                       b;                    /*     4     1 */

  	/* size: 8, cachelines: 1, members: 2 */
  	/* padding: 3 */
  	/* last cacheline: 8 bytes */
  };
  struct bar {
  	struct foo                 c;                    /*     0     8 */

  	/* XXX last struct has 3 bytes of padding */

  	int                        d;                    /*     8     4 */

  	/* size: 12, cachelines: 1, members: 2 */
  	/* paddings: 1, sum paddings: 3 */
  	/* last cacheline: 12 bytes */
  };

Expanding types:

  $ pahole -E inner_anon.o
  struct foo {
  	int                        a;                                                    /*     0     4 */
  	char                       b;                                                    /*     4     1 */

  	/* size: 8, cachelines: 1, members: 2 */
  	/* padding: 3 */
  	/* last cacheline: 8 bytes */
  };
  struct bar {
  	struct foo {
  		int                a;                                                    /*     0     4 */
  		char               b;                                                    /*     4     1 */
  	}c; /*     0     8 */

  	/* XXX last struct has 3 bytes of padding */

  	int                        d;                                                    /*     8     4 */

  	/* size: 12, cachelines: 1, members: 2 */
  	/* paddings: 1, sum paddings: 3 */
  	/* last cacheline: 12 bytes */
  };

Anonymising the inner struct:

  $ pahole -E --inner_anon inner_anon.o
  struct foo {
  	int                        a;                                                    /*     0     4 */
  	char                       b;                                                    /*     4     1 */

  	/* size: 8, cachelines: 1, members: 2 */
  	/* padding: 3 */
  	/* last cacheline: 8 bytes */
  };
  struct bar {
  	struct /* foo */ {
  		int                a;                                                    /*     0     4 */
  		char               b;                                                    /*     4     1 */
  	}c; /*     0     8 */

  	/* XXX last struct has 3 bytes of padding */

  	int                        d;                                                    /*     8     4 */

  	/* size: 12, cachelines: 1, members: 2 */
  	/* paddings: 1, sum paddings: 3 */
  	/* last cacheline: 12 bytes */
  };

Signed-off-by: Douglas Raillard <douglas.raillard@arm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
[ Added man page entry for --inner_anon, refreshed the patch to cope with the btf_tag series ]
Link: https://lore.kernel.org/all/20211019100724.325570-3-douglas.raillard@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-07 14:55:16 -03:00
Douglas Raillard 7c5e35b63b fprintf: Allow making struct/enum/union anonymous
Allow making inner structs, enums and unions anonymous, so that when using
-E to expand types we don't end up with multiple definitions for
expanded inner structs, allowing the resulting expanded struct to be
compilable.

Signed-off-by: Douglas Raillard <douglas.raillard@arm.com>
[ Applied it manually to cover some fuzz due to other patches ]
Link: https://lore.kernel.org/all/20211019100724.325570-2-douglas.raillard@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-26 15:36:44 -03:00
Yonghong Song d99d551930 btf_encoder: Support btf_type_tag attribute
  [$ ~] cat t.c
  #define __tag1 __attribute__((btf_type_tag("tag1")))
  #define __tag2 __attribute__((btf_type_tag("tag2")))
  int __tag1 * __tag1 __tag2 *g __attribute__((section(".data..percpu")));
  [$ ~] clang -O2 -g -c t.c
  [$ ~] pahole -JV t.o
  Found per-CPU symbol 'g' at address 0x0
  Found 1 per-CPU variables!
  File t.o:
  [1] TYPE_TAG tag1 type_id=5
  [2] TYPE_TAG tag2 type_id=1
  [3] PTR (anon) type_id=2
  [4] TYPE_TAG tag1 type_id=6
  [5] PTR (anon) type_id=4
  [6] INT int size=4 nr_bits=32 encoding=SIGNED
  search cu 't.c' for percpu global variables.
  Variable 'g' from CU 't.c' at address 0x0 encoded
  [7] VAR g type=3 linkage=1
  [8] DATASEC .data..percpu size=8 vlen=1
          type=7 offset=0 size=8
  [$ ~]

You can see that, for the source:

  int __tag1 * __tag1 __tag2 *g __attribute__((section(".data..percpu")));

the following type chain is generated:

  var -> ptr -> tag2 -> tag1 -> ptr -> tag1 -> int
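
The chain can be reproduced by following the type_id links in the
'pahole -JV' listing above. The table below is a hand-copied,
simplified view of that listing (the VAR entry's type=3 is stored in the
same field), and struct type_entry and format_chain() are hypothetical
names used only for this sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Label plus the id of the referenced type (0 ends the chain). */
struct type_entry {
	const char *label;
	int type_id;
};

/* Index 0 is unused so that indexes match the BTF ids in the listing. */
static const struct type_entry types[] = {
	[1] = { "tag1", 5 },
	[2] = { "tag2", 1 },
	[3] = { "ptr",  2 },
	[4] = { "tag1", 6 },
	[5] = { "ptr",  4 },
	[6] = { "int",  0 },
	[7] = { "var",  3 },
};

/* Follows type_id links starting at id, writing "a -> b -> c" to buf. */
static void format_chain(int id, char *buf, size_t size)
{
	const int nr_types = sizeof(types) / sizeof(types[0]);

	assert(buf != NULL && size > 0);
	buf[0] = '\0';
	while (id != 0) {
		size_t len = strlen(buf);

		assert(id > 0 && id < nr_types);
		snprintf(buf + len, size - len, "%s%s",
			 len ? " -> " : "", types[id].label);
		id = types[id].type_id;
	}
}
```

Starting at id 7, the VAR entry, walks the whole chain down to the int.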

The following shows that the pahole option "--skip_encoding_btf_type_tag"
can be used to prevent BTF_KIND_TYPE_TAG generation:

  [$ ~] pahole -JV t.o --skip_encoding_btf_type_tag
  Found per-CPU symbol 'g' at address 0x0
  Found 1 per-CPU variables!
  File t.o:
  [1] PTR (anon) type_id=2
  [2] PTR (anon) type_id=3
  [3] INT int size=4 nr_bits=32 encoding=SIGNED
  search cu 't.c' for percpu global variables.
  Variable 'g' from CU 't.c' at address 0x0 encoded
  [4] VAR g type=1 linkage=1
  [5] DATASEC .data..percpu size=8 vlen=1
          type=4 offset=0 size=8
  [$ ~]

Committer testing:

  $ rm -f t.o; clang -O2 -g -c t.c
  $ llvm-dwarfdump t.o
  t.o:	file format elf64-x86-64

  .debug_info contents:
  0x00000000: Compile Unit: length = 0x0000005e, format = DWARF32, version = 0x0004, abbr_offset = 0x0000, addr_size = 0x08 (next unit at 0x00000062)

  0x0000000b: DW_TAG_compile_unit
                DW_AT_producer	("clang version 14.0.0 (https://github.com/llvm/llvm-project 0d3add216f04b99ed1db1a05c39975d4a9c83e6b)")
                DW_AT_language	(DW_LANG_C99)
                DW_AT_name	("t.c")
                DW_AT_stmt_list	(0x00000000)
                DW_AT_comp_dir	("/var/home/acme/git/pahole")

  0x0000001e:   DW_TAG_variable
                  DW_AT_name	("g")
                  DW_AT_type	(0x00000033 "int **")
                  DW_AT_external	(true)
                  DW_AT_decl_file	("/var/home/acme/git/pahole/t.c")
                  DW_AT_decl_line	(3)
                  DW_AT_location	(DW_OP_addr 0x0)

  0x00000033:   DW_TAG_pointer_type
                  DW_AT_type	(0x0000004b "int *")

  0x00000038:     DW_TAG_LLVM_annotation
                    DW_AT_name	("btf_type_tag")
                    DW_AT_const_value	("tag1")

  0x00000041:     DW_TAG_LLVM_annotation
                    DW_AT_name	("btf_type_tag")
                    DW_AT_const_value	("tag2")

  0x0000004a:     NULL

  0x0000004b:   DW_TAG_pointer_type
                  DW_AT_type	(0x0000005a "int")

  0x00000050:     DW_TAG_LLVM_annotation
                    DW_AT_name	("btf_type_tag")
                    DW_AT_const_value	("tag1")

  0x00000059:     NULL

  0x0000005a:   DW_TAG_base_type
                  DW_AT_name	("int")
                  DW_AT_encoding	(DW_ATE_signed)
                  DW_AT_byte_size	(0x04)

  0x00000061:   NULL
  $ pahole -JV t.o
  Found per-CPU symbol 'g' at address 0x0
  Found 1 per-CPU variables!
  File t.o:
  [1] TYPE_TAG tag1 type_id=5
  [2] TYPE_TAG tag2 type_id=1
  [3] PTR (anon) type_id=2
  [4] TYPE_TAG tag1 type_id=6
  [5] PTR (anon) type_id=4
  [6] INT int size=4 nr_bits=32 encoding=SIGNED
  search cu 't.c' for percpu global variables.
  Variable 'g' from CU 't.c' at address 0x0 encoded
  [7] VAR g type=3 linkage=1
  [8] DATASEC .data..percpu size=8 vlen=1
  	type=7 offset=0 size=8
  $ pahole -JV t.o --skip_encoding_btf_type_tag
  Found per-CPU symbol 'g' at address 0x0
  Found 1 per-CPU variables!
  File t.o:
  [1] PTR (anon) type_id=2
  [2] PTR (anon) type_id=3
  [3] INT int size=4 nr_bits=32 encoding=SIGNED
  search cu 't.c' for percpu global variables.
  Variable 'g' from CU 't.c' at address 0x0 encoded
  [4] VAR g type=1 linkage=1
  [5] DATASEC .data..percpu size=8 vlen=1
  	type=4 offset=0 size=8
  $

Signed-off-by: Yonghong Song <yhs@fb.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Cc: kernel-team@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-23 20:37:58 -03:00
Arnaldo Carvalho de Melo 3da248c328 man pages: Add missing --skip_encoding_btf_decl_tag entry
In the past we saw the value of being able to disable specific features
due to problems in their implementation, allowing users to use a subset
of the functionality, without the problematic parts.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-23 20:37:58 -03:00
Arnaldo Carvalho de Melo a58ecca0a8 man pages: Add missing --skip_encoding_btf_type_tag entry
In the past we saw the value of being able to disable specific features
due to problems in their implementation, allowing users to use a subset
of the functionality, without the problematic parts.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-23 20:37:58 -03:00
Yonghong Song b488c8d328 dwarf_loader: Support btf_type_tag attribute
LLVM patches ([1] for clang, [2] and [3] for BPF backend)
added support for btf_type_tag attributes. The following is
an example:

  [$ ~] cat t.c
  #define __tag1 __attribute__((btf_type_tag("tag1")))
  #define __tag2 __attribute__((btf_type_tag("tag2")))
  int __tag1 * __tag1 __tag2 *g __attribute__((section(".data..percpu")));
  [$ ~] clang -O2 -g -c t.c
  [$ ~] llvm-dwarfdump --debug-info t.o
  t.o:    file format elf64-x86-64
  ...
  0x0000001e:   DW_TAG_variable
                  DW_AT_name      ("g")
                  DW_AT_type      (0x00000033 "int **")
                  DW_AT_external  (true)
                  DW_AT_decl_file ("/home/yhs/t.c")
                  DW_AT_decl_line (3)
                  DW_AT_location  (DW_OP_addr 0x0)
  0x00000033:   DW_TAG_pointer_type
                  DW_AT_type      (0x0000004b "int *")
  0x00000038:     DW_TAG_LLVM_annotation
                    DW_AT_name    ("btf_type_tag")
                    DW_AT_const_value     ("tag1")
  0x00000041:     DW_TAG_LLVM_annotation
                    DW_AT_name    ("btf_type_tag")
                    DW_AT_const_value     ("tag2")
  0x0000004a:     NULL
  0x0000004b:   DW_TAG_pointer_type
                  DW_AT_type      (0x0000005a "int")
  0x00000050:     DW_TAG_LLVM_annotation
                    DW_AT_name    ("btf_type_tag")
                    DW_AT_const_value     ("tag1")
  0x00000059:     NULL
  0x0000005a:   DW_TAG_base_type
                  DW_AT_name      ("int")
                  DW_AT_encoding  (DW_ATE_signed)
                  DW_AT_byte_size (0x04)
  0x00000061:   NULL

From the above example, you can see that DW_TAG_pointer_type may contain
one or more DW_TAG_LLVM_annotation btf_type_tag tags.  If
DW_TAG_LLVM_annotation tags are present inside DW_TAG_pointer_type, for
BTF encoding, pahole will need to follow [3] to generate a type chain
like:

  var -> ptr -> tag2 -> tag1 -> ptr -> tag1 -> int

This patch implements dwarf_loader support. If a pointer type contains
DW_TAG_LLVM_annotation tags, a new type btf_type_tag_ptr_type is
created, which stores the pointer tag itself and all its
DW_TAG_LLVM_annotation tags.  During the recoding stage, the type chain
is formed as in the above example.
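
A sketch of how the chain in the example relates to the DWARF order of
the annotations. The ordering rule used here (the last annotation of a
pointer comes first in the chain) is read off the single example above
and is an assumption; [3] is the authoritative description. struct
ptr_level and format_ptr_chain() are hypothetical names, not the
dwarf_loader data structures.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One DW_TAG_pointer_type and its btf_type_tag values, in DWARF order. */
struct ptr_level {
	const char *const *tags;
	size_t nr_tags;
};

/* Appends s to the string in buf, truncating if buf is too small. */
static void append(char *buf, size_t size, const char *s)
{
	size_t len = strlen(buf);

	snprintf(buf + len, size - len, "%s", s);
}

/* For each pointer level emit "ptr" followed by its tags, last
 * annotation first, then finish with the base type name. */
static void format_ptr_chain(const struct ptr_level *levels, size_t nr_levels,
			     const char *base, char *buf, size_t size)
{
	assert(levels != NULL && base != NULL && buf != NULL && size > 0);
	buf[0] = '\0';
	for (size_t i = 0; i < nr_levels; i++) {
		append(buf, size, "ptr -> ");
		for (size_t j = levels[i].nr_tags; j-- > 0; ) {
			append(buf, size, levels[i].tags[j]);
			append(buf, size, " -> ");
		}
	}
	append(buf, size, base);
}
```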

An option "--skip_encoding_btf_type_tag" is added to disable
this new functionality.

  [1] https://reviews.llvm.org/D111199
  [2] https://reviews.llvm.org/D113222
  [3] https://reviews.llvm.org/D113496

Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Cc: kernel-team@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-23 20:37:51 -03:00
Yonghong Song a0cc68687f dutil: Move DW_TAG_LLVM_annotation definition to dutil.h
Move the DW_TAG_LLVM_annotation definition from dwarf_loader.c to dutil.h
as it will be used later by btf_encoder.c.  There is no functional
change in this patch.

Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Cc: kernel-team@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-23 20:37:43 -03:00
Yonghong Song 76401e9e46 libbpf: Sync with latest libbpf repo to pick support for BTF_KIND_TYPE_TAG
Sync up to commit 94a49850c5ee61ea ("Makefile: enforce gnu89 standard").

This is needed to support BTF_KIND_TYPE_TAG.

Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Cc: kernel-team@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-23 20:37:35 -03:00
Arnaldo Carvalho de Melo 0135ccd632 dwarf_loader: Warn about DW_TAG_skeleton_unit and give a workaround
  $ pahole ~/c/split/foo.o
  WARNING: DW_TAG_skeleton_unit used, please look for a .dwo file and use it instead.
           A future version of pahole will support do this automagically.
  $

Reported-by: https://twitter.com/trass3r
Link: https://github.com/acmel/dwarves/issues/23
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-12 15:20:09 -03:00
Arnaldo Carvalho de Melo 433dc780ca fprintf: Add DWARF5 tags added in elfutils 0.170
Now we know what that 0x4a thing is:

  $ pahole ~/c/split/foo.o
  die__process: DW_TAG_compile_unit, DW_TAG_type_unit or DW_TAG_partial_unit expected got skeleton_unit (0x4a)!
  $

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-12 15:16:51 -03:00
Arnaldo Carvalho de Melo 7af9ed4aed dwarf_loader: Print the hexadecimal value for unexpected tags in die__process()
So that we can get it from user reports, i.e. instead of:

  die__process: DW_TAG_compile_unit, DW_TAG_type_unit or DW_TAG_partial_unit expected got INVALID

We now get:

  die__process: DW_TAG_compile_unit, DW_TAG_type_unit or DW_TAG_partial_unit expected got INVALID (0x4a)

That we can then look in dwarf.h and notice that there is this new:

     DW_TAG_skeleton_unit = 0x4a,

Now let's go support it...
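
As a rough illustration of the message format only (this is not the actual
die__process() code), the sketch below appends the raw tag value in
hexadecimal to whatever name was resolved. The helper name
format_unexpected_tag() is hypothetical.

```c
#include <stdio.h>

/*
 * Hypothetical helper: write "<name> (0x<tag>)" into buf so that the numeric
 * tag survives in user reports even when the name lookup yields "INVALID".
 * Returns what snprintf() returns: the length the full string needs.
 */
int format_unexpected_tag(char *buf, size_t size, const char *name,
			  unsigned int tag)
{
	return snprintf(buf, size, "%s (%#x)", name, tag);
}
```

Called with "INVALID" and 0x4a this yields "INVALID (0x4a)", the suffix shown
in the second message above.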

Reported-by: https://twitter.com/trass3r
Link: https://github.com/acmel/dwarves/issues/23
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-12 15:02:28 -03:00
Yonghong Song ec62499774 btf_encoder: generate BTF_KIND_DECL_TAGs for typedef btf_decl_tag attributes
Emit BTF BTF_KIND_DECL_TAGs for btf_decl_tag attributes attached to
typedef declarations. The following is a simple example:
  $ cat t.c
    #define __tag1 __attribute__((btf_decl_tag("tag1")))
    #define __tag2 __attribute__((btf_decl_tag("tag2")))
    typedef struct { int a; int b; } __t __tag1 __tag2;
    __t g;
  $ clang -O2 -g -c t.c
  $ pahole -JV t.o
    btf_encoder__new: 't.o' doesn't have '.data..percpu' section
    Found 0 per-CPU variables!
    File t.o:
    [1] TYPEDEF __t type_id=2
    [2] STRUCT (anon) size=8
            a type_id=3 bits_offset=0
            b type_id=3 bits_offset=32
    [3] INT int size=4 nr_bits=32 encoding=SIGNED
    [4] DECL_TAG tag1 type_id=1 component_idx=-1
    [5] DECL_TAG tag2 type_id=1 component_idx=-1

Signed-off-by: Yonghong Song <yhs@fb.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Cc: kernel-team@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-11 09:31:31 -03:00
Yonghong Song 468b4196f6 dwarf_loader: support typedef DW_TAG_LLVM_annotation
LLVM commit [1] added support for the btf_decl_tag attribute on typedef
declarations. As a result, a DW_TAG_LLVM_annotation tag may now appear
inside a DWARF typedef declaration tag.

Kernel support for BTF_KIND_DECL_TAG on typedefs is introduced in [2].
No additional libbpf change is needed, as the existing libbpf
BTF_KIND_DECL_TAG support is generic enough to cover the new typedef
use cases.

This patch adds parsing of DW_TAG_LLVM_annotation for DWARF typedef
declarations.

  $ cat t.c
  $ clang -O2 -g -c t.c
  $ llvm-dwarfdump --debug-info t.o
    ......
    0x00000033:   DW_TAG_typedef
                    DW_AT_type      (0x00000051 "structure ")
                    DW_AT_name      ("__t")
                    DW_AT_decl_file ("/home/yhs/t.c")
                    DW_AT_decl_line (3)

    0x0000003e:     DW_TAG_LLVM_annotation
                      DW_AT_name    ("btf_decl_tag")
                      DW_AT_const_value     ("tag1")

    0x00000047:     DW_TAG_LLVM_annotation
                      DW_AT_name    ("btf_decl_tag")
                      DW_AT_const_value     ("tag2")

    0x00000050:     NULL

Previously, pahole would issue a warning if a typedef tag contained
any child tag. This warning is removed since it no longer holds.
Note that the DWARF standard doesn't prevent a typedef declaration
tag from having nested tags. If in the future we need to process
any other tag inside a typedef tag, we can just add code to process it.

  [1] https://reviews.llvm.org/D110127
  [2] https://lore.kernel.org/bpf/20211021195628.4018847-1-yhs@fb.com

Signed-off-by: Yonghong Song <yhs@fb.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Cc: kernel-team@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-11-11 09:31:27 -03:00
Douglas Raillard 696c621804 btf_loader: Use cacheline size to infer alignment
When the alignment is larger than natural, it is very likely that the
source code was using the cacheline size. Therefore, use the cacheline
size when it would only result in increasing the alignment.

Committer tests:

This is one of the cases where this heuristic works well, 'struct Qdisc'
in the Linux kernel:

  --- /tmp/btfdiff.dwarf.pXdgRU   2021-10-28 10:22:11.738200232 -0300
  +++ /tmp/btfdiff.btf.bkDkdf     2021-10-28 10:22:11.925205061 -0300
  @@ -107,7 +107,7 @@ struct Qdisc {
          /* XXX 24 bytes hole, try to pack */

          /* --- cacheline 2 boundary (128 bytes) --- */
  -       struct sk_buff_head        gso_skb __attribute__((__aligned__(64))); /*   128    24 */
  +       struct sk_buff_head        gso_skb __attribute__((__aligned__(32))); /*   128    24 */
          struct qdisc_skb_head      q;                    /*   152    24 */
          struct gnet_stats_basic_packed bstats;           /*   176    16 */
          /* --- cacheline 3 boundary (192 bytes) --- */

With this patch both DWARF and BTF generated output have the same
alignment.
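
A minimal sketch of the heuristic described above, under stated assumptions:
the function name and parameters are hypothetical, and the real
class__infer_alignment() may differ in detail. It takes an alignment already
inferred from the member offset and upgrades it to the cacheline size only
when the offset is compatible and the alignment would increase.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: `inferred` is the alignment deduced from the member
 * offset (0 means "no attribute needed"), `natural` is the natural alignment
 * of the member type. Returns the alignment to print.
 */
uint32_t prefer_cacheline_alignment(uint32_t inferred, uint32_t natural,
				    uint32_t byte_offset,
				    uint32_t cacheline_size)
{
	assert(cacheline_size != 0);

	/* Not larger than natural: nothing suggests an explicit attribute. */
	if (inferred <= natural)
		return inferred;

	/* Only ever increase the alignment, and only if the offset agrees. */
	if (cacheline_size > inferred && byte_offset % cacheline_size == 0)
		return cacheline_size;

	return inferred;
}
```

For gso_skb in the diff above (offset 128, inferred alignment 32, a 64 byte
cacheline, and assuming a natural alignment of 8) this returns 64, which is
what the DWARF output shows.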

Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Douglas Raillard <douglas.raillard@arm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: dwarves@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-28 10:22:17 -03:00
Douglas Raillard 48f4086b76 btf_loader: Propagate struct conf_load
Give access to struct conf_load in class__infer_alignment.

Signed-off-by: Douglas Raillard <douglas.raillard@arm.com>
Cc: dwarves@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-28 10:18:59 -03:00
Douglas Raillard 772725a77d dwarves_fprintf: Move cacheline_size into struct conf_fprintf
Remove the global variable and turn it into a member in struct
conf_fprintf, so that it can be used by other parts of the code.

Signed-off-by: Douglas Raillard <douglas.raillard@arm.com>
Cc: dwarves@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-28 10:17:59 -03:00
Arnaldo Carvalho de Melo cdd088c05c btfdiff: Suppress alignment tags with BTF as well as with DWARF
Now that the alignment attributes are being inferred from BTF we need to
suppress them in btfdiff, as we can't infer them in some cases, like when
the field is naturally aligned.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-28 09:37:26 -03:00
Douglas Raillard 836c139fdf btf_loader: Infer alignment info
BTF does not carry alignment information, but it carries the offset in
structs. This allows inferring the original alignment, yielding a C
header dump that is not identical to the original C code, but is
guaranteed to lead to the same memory layout.

This allows using the output of pahole in another program to poke at
memory, with the assurance that we will not read garbage.

Note: Since the alignment is inferred from the offset, it sometimes
happens that the offset was already correctly aligned, which means the
inferred alignment will be smaller than in the original source. This
does not impact the ability to read existing structs, but it could
impact creating such structs if other client code expects a higher
alignment than the one exposed in the generated header.
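
One way to picture the inference, as a sketch rather than the code in this
patch: given the end of the previous member and the natural alignment of the
current one, look for the smallest larger power of two that places the member
exactly at its recorded offset. All names below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Round x up to the next multiple of a (a must be non-zero). */
uint32_t round_up_to(uint32_t x, uint32_t a)
{
	assert(a != 0);
	return (x + a - 1) / a * a;
}

/*
 * Hypothetical helper: prev_end is the first byte after the previous member,
 * natural is the natural alignment of the member type (a power of two).
 * Returns 0 when natural alignment already explains byte_offset, otherwise
 * the smallest larger power of two that puts the member at byte_offset, or 0
 * if no such alignment exists.
 */
uint32_t infer_member_alignment(uint32_t byte_offset, uint32_t prev_end,
				uint32_t natural)
{
	assert(natural != 0 && (natural & (natural - 1)) == 0);

	if (round_up_to(prev_end, natural) == byte_offset)
		return 0;

	for (uint32_t a = natural * 2; a != 0 && a <= byte_offset; a *= 2) {
		if (round_up_to(prev_end, a) == byte_offset)
			return a;
	}
	return 0;
}
```

A member at offset 128 that follows a member ending at 104 gets an inferred
alignment of 32. If the previous member had ended at 128, the result would be
0 even if the source asked for a larger alignment, which is the limitation the
note above describes.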

Signed-off-by: Douglas Raillard <douglas.raillard@arm.com>
Cc: dwarves@vger.kernel.org
[ Split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-27 15:45:47 -03:00
Douglas Raillard 4db65fe0cd core: Export tag__natural_alignment()
We'll use it in the BTF loader.

Signed-off-by: Douglas Raillard <douglas.raillard@arm.com>
[ Split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-27 15:44:31 -03:00
Arnaldo Carvalho de Melo 43e8216c25 fprintf: Fix __attribute__((__aligned__(N)) handling for struct members
We just need to record if we printed it for a member and if so, deduce
that from the number of spaces left to print before the end of line
comment (offset, size).

Fixes: a59459bb80 ("fprintf: Account inline type __aligned__ member types for spacing")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-27 15:44:10 -03:00
Yonghong Song c52f6421f2 btf: Rename btf_tag to btf_decl_tag
Kernel commit ([1]) renamed btf_tag to btf_decl_tag for uapi btf.h and
the libbpf APIs. The reason is that a new clang attribute, btf_type_tag,
is introduced ([2]).  Renaming btf_tag to btf_decl_tag makes it easier to
distinguish from btf_type_tag.

I also pulled in the latest libbpf repo since it contains the renamed
libbpf API function btf__add_decl_tag().

  [1] https://lore.kernel.org/bpf/20211012164838.3345699-1-yhs@fb.com/
  [2] https://reviews.llvm.org/D111199

Signed-off-by: Yonghong Song <yhs@fb.com>
[ Minor fixups to cope with --skip_missing ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-27 09:35:23 -03:00
Domenico Andreoli 3433c67bbd manpages: Minor fixes
A typo, some escaping for paths and the header for an option.

Signed-off-by: Domenico Andreoli <domenico.andreoli@linux.com>
Cc: dwarves@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-27 09:35:23 -03:00
Douglas Raillard e975d0fba8 btf_loader: Refactor class__fixup_btf_bitfields
Refactor class__fixup_btf_bitfields to remove a "continue" statement, to
prepare the ground for alignment fixup that is relevant for some types
matching:

    type->tag != DW_TAG_base_type && type->tag != DW_TAG_enumeration_type

Committer testing:

btfdiff passes for a x86_64 kernel built with gcc and for a clang
thin-LTO vmlinux build.

Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Douglas Raillard <douglas.raillard@arm.com>
Cc: dwarves@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-26 11:29:55 -03:00
Douglas Raillard 5282feee6d pahole: Add --skip_missing option
Add a --skip_missing option that allows pahole to keep going in case one
of the types passed to -C (e.g. via a file) does not exist.

This is useful for introspection software, such as tools for debugging
kernel modules, that can handle various kernel configurations and versions
in which some recently added types are missing. The consumer of the header
becomes responsible for gating the uses of the type with #ifdef
CONFIG_XXX, rather than pahole bailing out on the first unknown type.
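
The control flow can be sketched as follows. This is an illustration with a
hard-coded list of known types, not pahole's lookup code; every identifier in
it is hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for the set of types found in the object file. */
static const char *known_types[] = { "tcp_splice_state", "list_head" };

/* Names as they would be passed on the command line. */
const char *const example_request[] = {
	"tcp_splice_state", "xxfrm_policy_queue", "list_head",
};

static bool type_exists(const char *name)
{
	for (size_t i = 0; i < sizeof(known_types) / sizeof(known_types[0]); i++)
		if (strcmp(known_types[i], name) == 0)
			return true;
	return false;
}

/*
 * Walk the requested names in order. Without skip_missing the first unknown
 * name stops the walk; with it, the name is reported and the walk continues.
 * Returns how many types would have been printed.
 */
int print_requested_types(const char *const *names, size_t nr_names,
			  bool skip_missing)
{
	int printed = 0;

	for (size_t i = 0; i < nr_names; i++) {
		if (!type_exists(names[i])) {
			fprintf(stderr, "type '%s' not found\n", names[i]);
			if (skip_missing)
				continue;
			break;
		}
		printed++;
	}
	return printed;
}
```

With example_request, the walk prints one type without skip_missing and two
with it, mirroring the before and after runs below.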

Committer testing:

Before:

  $ pahole tcp_splice_state,xxfrm_policy_queue,list_head tcp.o
  struct tcp_splice_state {
  	struct pipe_inode_info *   pipe;                 /*     0     8 */
  	size_t                     len;                  /*     8     8 */
  	unsigned int               flags;                /*    16     4 */

  	/* size: 24, cachelines: 1, members: 3 */
  	/* padding: 4 */
  	/* last cacheline: 24 bytes */
  };
  pahole: type 'xxfrm_policy_queue' not found
  $

After:

  $ pahole --help |& grep skip
        --skip=COUNT           Skip COUNT input records
        --skip_encoding_btf_tag   Do not encode TAGs in BTF.
        --skip_encoding_btf_vars   Do not encode VARs in BTF.
        --skip_missing         skip missing types passed to -C rather than stop
  $ pahole --skip_missing tcp_splice_state,xxfrm_policy_queue,list_head tcp.o
  struct tcp_splice_state {
  	struct pipe_inode_info *   pipe;                 /*     0     8 */
  	size_t                     len;                  /*     8     8 */
  	unsigned int               flags;                /*    16     4 */

  	/* size: 24, cachelines: 1, members: 3 */
  	/* padding: 4 */
  	/* last cacheline: 24 bytes */
  };
  struct list_head {
  	struct list_head *         next;                 /*     0     8 */
  	struct list_head *         prev;                 /*     8     8 */

  	/* size: 16, cachelines: 1, members: 2 */
  	/* last cacheline: 16 bytes */
  };
  pahole: type 'xxfrm_policy_queue' not found
  $

Signed-off-by: Douglas Raillard <douglas.raillard@arm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: dwarves@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-26 11:29:55 -03:00
Douglas Raillard 6931e393f8 fprintf: Fix nested struct printing wrt attributes
This code:

    struct X {
       struct {
       } __attribute__((foo)) x __attribute__((bar));
    }

Was wrongly printed as:

    struct X {
       struct {
       } x __attribute__((foo)) __attribute__((bar));
    }

This unfortunately matters a lot, since "bar" is supposed to apply to
"x", but "foo" to typeof(x). In the wrong form, both apply to "x",
leading to e.g. an incorrect layout for the __aligned__ attribute.
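
To make the layout difference concrete, here is a small example assuming GCC
or Clang attribute semantics on a typical target. The struct names are made
up for illustration.

```c
#include <stddef.h>

/*
 * aligned(8) written after the closing brace applies to the anonymous type,
 * so the type itself is padded to 8 bytes.
 */
struct on_type {
	struct {
		char d;
	} __attribute__((__aligned__(8))) x;
	char e;
};

/*
 * aligned(8) written after the member name applies to the member only:
 * x starts on an 8-byte boundary but its type is still 1 byte long.
 */
struct on_member {
	struct {
		char d;
	} x __attribute__((__aligned__(8)));
	char e;
};

_Static_assert(offsetof(struct on_type, e) == 8, "type is padded to 8 bytes");
_Static_assert(offsetof(struct on_member, e) == 1, "type stays 1 byte long");
```

The member "e" lands at a different offset in each form, so moving the
attribute from the type to the member changes the memory layout.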

Signed-off-by: Douglas Raillard <douglas.raillard@arm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-26 11:29:55 -03:00
Ilya Leoshkevich 16a7acaba4 btf_encoder: Fix handling of percpu symbols on s390
pahole does not generate VARs for percpu symbols on s390. A percpu
symbol definition on a typical x86_64 kernel looks like this:

  [33] .data..percpu     PROGBITS         0000000000000000  01c00000
                                          ^^^^^^^^^^^^^^^^ sh_addr
  LOAD           0x0000000001c00000 0x0000000000000000 0x000000000286f000
                                    ^^^^^^^^^^^^^^^^^^ p_vaddr
 13559: 000000000001ba50     4 OBJECT  LOCAL  DEFAULT   33 cpu_profile_flip
        ^^^^^^^^^^^^^^^^ st_value

Most importantly, .data..percpu's sh_addr is 0, and this is what pahole
is currently assuming. However, on s390 this is different:

   [37] .data..percpu     PROGBITS         00000000019cd000  018ce000
                                           ^^^^^^^^^^^^^^^^ sh_addr
  LOAD           0x000000000136e000 0x000000000146d000 0x000000000146d000
                                    ^^^^^^^^^^^^^^^^^^ p_vaddr
80377: 0000000001ba1440     4 OBJECT  WEAK   DEFAULT   37 cpu_profile_flip
       ^^^^^^^^^^^^^^^^ st_value

Fix by restructuring the code to always use section-relative offsets for
symbols. Change the comment to focus on this invariant.
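
The invariant can be sketched as a single subtraction, assuming st_value holds
an address as in the vmlinux listings above. The helper name is hypothetical
and this is not the btf_encoder code itself.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: turn a symbol's st_value into an offset inside its
 * section by subtracting the section's sh_addr. When sh_addr is 0 (the
 * x86_64 case above) the result equals st_value.
 */
uint64_t section_relative_offset(uint64_t st_value, uint64_t sh_addr)
{
	assert(st_value >= sh_addr);
	return st_value - sh_addr;
}
```

With the s390 values above, 0x1ba1440 - 0x19cd000 gives 0x1d4440, while the
x86_64 symbol keeps its value 0x1ba50 because sh_addr is 0.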

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-26 11:29:55 -03:00
Ilya Leoshkevich 3cde0135ca dwarf_loader: Fix heap overflow when accessing variable specification
Variables can be allocated with or without specification, however,
tag__recode_dwarf_type() always tries accessing it, leading to heap read
overflows and subsequent logic bugs.

Fix by introducing a bit that tracks whether or not specification is
present.
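
A sketch of the idea with made-up types (these are not the dwarves
structures): the smaller allocation carries a bit saying whether the extended
layout with the specification field was allocated, and readers check that bit
before touching the field.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Illustrative base record; the bit is set only for the extended layout. */
struct var_info {
	unsigned int has_specification:1;
	uint32_t     type;
};

/* Extended layout: the base record followed by the specification. */
struct var_info_with_spec {
	struct var_info base;
	uint32_t	specification;
};

struct var_info *var_info_new(uint32_t type, bool with_spec,
			      uint32_t specification)
{
	struct var_info *v;

	if (with_spec) {
		struct var_info_with_spec *vs = calloc(1, sizeof(*vs));

		if (vs == NULL)
			return NULL;
		vs->specification = specification;
		v = &vs->base;
	} else {
		v = calloc(1, sizeof(*v));
		if (v == NULL)
			return NULL;
	}
	v->has_specification = with_spec;
	v->type = type;
	return v;
}

/* Read the specification only if it was allocated. */
bool var_info_specification(const struct var_info *v, uint32_t *out)
{
	if (!v->has_specification)
		return false;	/* the smaller allocation has no such field */
	*out = ((const struct var_info_with_spec *)v)->specification;
	return true;
}
```

Without the bit, the cast in var_info_specification() would read past the end
of the smaller allocation, which is the kind of heap read overflow described
above.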

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-26 11:29:55 -03:00
Arnaldo Carvalho de Melo a9c99e9881 dwarves: Introduce conf_load->thread_exit() callback
Will be called when a thread exits, initially only in the DWARF loader,
so that pahole can call the btf_encoder associated with the exiting
thread to do the dedup as the last step done in parallel.

Then we'll iterate the btf_encoders list and combine everything into the
first btf_encoder instance, which then gets written to disk.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-14 17:37:25 -03:00
Arnaldo Carvalho de Melo cc6c7d473d Update libbpf to get API to combine BTF
I.e. the one in:

 13ebb60ab66799ab libbpf: Add API that copies all BTF types from one BTF object to another

This will be used to parallelize the BTF encoding phase.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-10-14 16:27:07 -03:00
Yonghong Song e38e89e853 btf_encoder: Generate BTF_KIND_TAG from llvm annotations
The following is an example with latest upstream clang:

  $ cat t.c
  #define __tag1 __attribute__((btf_tag("tag1")))
  #define __tag2 __attribute__((btf_tag("tag2")))

  struct t {
          int a:1 __tag1;
          int b __tag2;
  } __tag1 __tag2;

  int g __tag1 __attribute__((section(".data..percpu")));

  int __tag1 foo(struct t *a1, int a2 __tag2) {
    return a1->b + a2 + g;
  }

  $ clang -O2 -g -c t.c
  $ pahole -JV t.o
  Found per-CPU symbol 'g' at address 0x0
  Found 1 per-CPU variables!
  Found 1 functions!
  File t.o:
  [1] INT int size=4 nr_bits=32 encoding=SIGNED
  [2] PTR (anon) type_id=3
  [3] STRUCT t size=8
        a type_id=1 bitfield_size=1 bits_offset=0
        b type_id=1 bitfield_size=0 bits_offset=32
  [4] TAG tag1 type_id=3 component_idx=0
  [5] TAG tag2 type_id=3 component_idx=1
  [6] TAG tag1 type_id=3 component_idx=-1
  [7] TAG tag2 type_id=3 component_idx=-1
  [8] FUNC_PROTO (anon) return=1 args=(2 a1, 1 a2)
  [9] FUNC foo type_id=8
  [10] TAG tag2 type_id=9 component_idx=1
  [11] TAG tag1 type_id=9 component_idx=-1
  search cu 't.c' for percpu global variables.
  Variable 'g' from CU 't.c' at address 0x0 encoded
  [12] VAR g type=1 linkage=1
  [13] TAG tag1 type_id=12 component_idx=-1
  [14] DATASEC .data..percpu size=4 vlen=1
        type=12 offset=0 size=4
  $ ...

With additional option --skip_encoding_btf_tag, pahole doesn't
generate BTF_KIND_TAGs any more.

  $ pahole -JV --skip_encoding_btf_tag t.o
  Found per-CPU symbol 'g' at address 0x0
  Found 1 per-CPU variables!
  Found 1 functions!
  File t.o:
  [1] INT int size=4 nr_bits=32 encoding=SIGNED
  [2] PTR (anon) type_id=3
  [3] STRUCT t size=8
        a type_id=1 bitfield_size=1 bits_offset=0
        b type_id=1 bitfield_size=0 bits_offset=32
  [4] FUNC_PROTO (anon) return=1 args=(2 a1, 1 a2)
  [5] FUNC foo type_id=4
  search cu 't.c' for percpu global variables.
  Variable 'g' from CU 't.c' at address 0x0 encoded
  [6] VAR g type=1 linkage=1
  [7] DATASEC .data..percpu size=4 vlen=1
        type=6 offset=0 size=4
  $ ...

Signed-off-by: Yonghong Song <yhs@fb.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Cc: kernel-team@fb.com
Link: https://lore.kernel.org/r/20210922021332.2287418-1-yhs@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-27 17:59:40 -03:00
Yonghong Song aa8c494e65 dwarf_loader: Parse DWARF tag DW_TAG_LLVM_annotation
Parse the DWARF tag DW_TAG_LLVM_annotation. Only record annotations named
btf_tag, which correspond to btf_tag attributes in C code. Such
information will be used later by the btf_encoder for BTF conversion.

The LLVM implementation only supports btf_tag annotations on
struct/union, func, func parameter and variable ([1]).  So we only check
existence of corresponding DW tags in these places.

A flag, "--skip_encoding_btf_tag", is introduced in case this feature
needs to be disabled for whatever reason.

 [1] https://reviews.llvm.org/D106614

Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Cc: kernel-team@fb.com
Link: https://lore.kernel.org/r/20210922021326.2287095-1-yhs@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-27 17:06:56 -03:00
Matteo Croce 3d20210d84 CMakeList.txt: Don't download libbpf source when system library is used
The build system always downloads the libbpf submodule, regardless of
whether we're using the embedded or the system version.
Download the libbpf source only if we're using the embedded one.

Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-27 17:04:53 -03:00
Yonghong Song 38fad22d66 libbpf: Get latest libbpf
Latest upstream LLVM now supports emitting btf_tag to DWARF ([1]) and the
kernel support for btf_tag has also landed ([2]). Sync with latest libbpf
which has btf_tag support. Next step will be to implement dwarf -> btf
conversion for btf_tag.

 [1] https://reviews.llvm.org/D106621
 [2] https://lore.kernel.org/bpf/20210914223015.245546-1-yhs@fb.com

Signed-off-by: Yonghong Song <yhs@fb.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Cc: kernel-team@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-09-18 18:29:39 -03:00
Matteo Croce 8843109995 CMakeList.txt: Make python optional
ostra-cg, which requires python, is installed in the destination dir.
Make it optional for embedded distributions which don't have the
python interpreter available.

Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-30 15:57:14 -03:00
Arnaldo Carvalho de Melo f02af2553e pahole: Prep 1.22
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-23 09:48:58 -03:00
Arnaldo Carvalho de Melo 40a40df961 core: Bump the chunk size for ptr_table uses in types, tags, functions tables
On a:

  $ grep "model name" /proc/cpuinfo
  model name	: Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz
  model name	: Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz
  model name	: Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz
  model name	: Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz
  $

Before, using 256 as the chunk size:

  $ perf stat -r5 pahole -j --btf_encode vmlinux

   Performance counter stats for 'pahole -j --btf_encode vmlinux' (5 runs):

         8,336.57 msec task-clock:u        # 2.649 CPUs utilized   ( +-  0.19% )
                0      context-switches:u  # 0.000 /sec
                0      cpu-migrations:u    # 0.000 /sec
           69,028      page-faults:u       # 8.260 K/sec           ( +-  0.03% )
   28,799,380,143      cycles:u            # 3.446 GHz             ( +-  0.06% )
   66,068,272,802      instructions:u      # 2.29  insn per cycle  ( +-  0.00% )
   15,801,729,716      branches:u          # 1.891 G/sec           ( +-  0.00% )
      134,370,099      branch-misses:u     # 0.85% of all branches ( +-  0.07% )

          3.14696 +- 0.00527 seconds time elapsed  ( +-  0.17% )

  $

After bumping it to 1024:

  $ perf stat -r5 pahole -j --btf_encode vmlinux

   Performance counter stats for 'pahole -j --btf_encode vmlinux' (5 runs):

         8,255.93 msec task-clock:u        # 2.635 CPUs utilized   ( +-  0.03% )
                0      context-switches:u  # 0.000 /sec
                0      cpu-migrations:u    # 0.000 /sec
           68,597      page-faults:u       # 8.312 K/sec           ( +-  0.04% )
   28,504,209,806      cycles:u            # 3.454 GHz             ( +-  0.03% )
   66,067,020,098      instructions:u      # 2.32  insn per cycle  ( +-  0.00% )
   15,802,624,183      branches:u          # 1.915 G/sec           ( +-  0.00% )
      133,542,603      branch-misses:u     # 0.85% of all branches ( +-  0.13% )

          3.13324 +- 0.00205 seconds time elapsed  ( +-  0.07% )

  $

And 2048:

  $ perf stat -r10 pahole -j --btf_encode vmlinux

   Performance counter stats for 'pahole -j --btf_encode vmlinux' (10 runs):

         8,237.37 msec task-clock:u        # 2.635 CPUs utilized   ( +-  0.02% )
                0      context-switches:u  # 0.000 /sec
                0      cpu-migrations:u    # 0.000 /sec
           68,643      page-faults:u       # 8.331 K/sec           ( +-  0.06% )
   28,447,701,874      cycles:u            # 3.453 GHz             ( +-  0.02% )
   66,077,728,879      instructions:u      # 2.32  insn per cycle  ( +-  0.00% )
   15,806,113,927      branches:u          # 1.918 G/sec           ( +-  0.00% )
      132,811,965      branch-misses:u     # 0.84% of all branches ( +-  0.11% )

         3.125675 +- 0.000905 seconds time elapsed  ( +-  0.03% )

  $

Value chosen using:

  $ pahole --ptr_table_stats --btf_encode vmlinux
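
For readers unfamiliar with the knob being tuned, here is a sketch of a
pointer table that grows by a fixed chunk. It is an illustration with
hypothetical names, not the dwarves ptr_table implementation, and the realloc
counter exists only for this example.

```c
#include <stdint.h>
#include <stdlib.h>

/* Illustrative growable pointer table. */
struct ptr_table_sketch {
	void	 **entries;
	uint32_t nr_entries;
	uint32_t allocated_entries;
	uint32_t nr_reallocs;	/* bookkeeping for this example only */
};

/*
 * Append ptr, growing the array by `chunk` slots whenever it is full.
 * Returns the index used, or -1 on allocation failure.
 */
long ptr_table_sketch_add(struct ptr_table_sketch *pt, void *ptr,
			  uint32_t chunk)
{
	if (pt->nr_entries == pt->allocated_entries) {
		uint32_t new_size = pt->allocated_entries + chunk;
		void **grown = realloc(pt->entries, new_size * sizeof(void *));

		if (grown == NULL)
			return -1;
		pt->entries = grown;
		pt->allocated_entries = new_size;
		pt->nr_reallocs++;
	}
	pt->entries[pt->nr_entries] = ptr;
	return pt->nr_entries++;
}

/* Fill a fresh table with n entries and report how many reallocs it took. */
uint32_t reallocs_for(uint32_t n, uint32_t chunk)
{
	struct ptr_table_sketch pt = { 0 };

	for (uint32_t i = 0; i < n; i++)
		if (ptr_table_sketch_add(&pt, NULL, chunk) < 0)
			break;
	free(pt.entries);
	return pt.nr_reallocs;
}
```

In this sketch a table of 10000 entries needs 40 reallocations with a chunk
of 256 and 5 with a chunk of 2048. Whether that translates into a measurable
gain depends on the workload, as the perf numbers above show.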

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-20 16:40:27 -03:00
Arnaldo Carvalho de Melo 9f0809e6a8 pahole: Introduce --ptr_table_stats
Useful while developing to help in tuning the ptr tables (types, tags,
functions, maybe some more in the future).

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-20 16:40:27 -03:00