Arnaldo Carvalho de Melo
7a8e75cd9a
elfcreator: elfcreator_copy_scn() doesn't need the 'elf' arg
...
Not used at all, remove it.
Cc: Peter Jones <pjones@redhat.com>
Fixes: 29ef465cd8
("Add scncopy - like object copy but tries not to change section content")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-20 16:40:27 -03:00
Arnaldo Carvalho de Melo
3925a5bd53
syscse: zero_extend() doesn't need a 'cu' arg
...
All tags now have a char pointer for their strings, so we don't need the
cu to get the strings table.
Found while building with clang to prep 1.22.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-20 16:40:27 -03:00
Arnaldo Carvalho de Melo
21b2933f01
pahole: Fix signedness of ternary expression operator
...
To address this clang warning:
/var/home/acme/git/pahole/pahole.c: In function ‘type__instance_read_once’:
/var/home/acme/git/pahole/pahole.c:1933:78: warning: operand of ‘?:’ changes signedness from ‘int’ to ‘uint32_t’ {aka ‘unsigned int’} due to unsignedness of other operand [-Wsign-compare]
1933 | return fread(instance->instance, instance->type->size, 1, fp) != 1 ? -1 : instance->type->size;
Fixes: e3e5a4626c
("pahole: Make sure the header is read only once")
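The warning class being fixed can be illustrated with a minimal sketch; the function and its types below are stand-ins, not the commit's actual code. In `cond ? -1 : u32val`, the usual arithmetic conversions turn -1 into a large unsigned value, so an explicit cast is needed to keep the error value signed:

```c
#include <assert.h>
#include <stdint.h>

/* Illustration only: widening both ?: arms to a signed type before mixing
 * them keeps -1 meaningful instead of wrapping to UINT32_MAX. */
static int64_t read_size_or_error(int ok, uint32_t size)
{
	/* cast so both ?: arms share the signed type int64_t */
	return ok ? (int64_t)size : -1;
}
```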
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-20 16:40:27 -03:00
Arnaldo Carvalho de Melo
4e11c13895
ctracer: Remove a bunch of unused 'cu' pointers
...
All tags now have a char pointer for their strings, so we don't need the
cu to get the strings table.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-20 16:40:27 -03:00
Arnaldo Carvalho de Melo
54c1e93b8e
pahole: Use the 'prototypes' parameter in prototypes__load()
...
It was using &class_names directly while it was also being passed as the
'prototypes' argument; use the argument instead.
Fixes: 823739b56f
("pahole: Convert class_names into a list of struct prototypes")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-20 16:40:27 -03:00
Arnaldo Carvalho de Melo
8b495918e6
codiff: class__find_pair_member() doesn't need 'cu' args
...
All tags now have a char pointer for their strings, so we don't need the
cu to get the strings table.
Found while building with clang to prep 1.22.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-20 16:40:27 -03:00
Arnaldo Carvalho de Melo
057be3d993
core: class__find_member_by_name() doesn't need a cu pointer
...
All tags now have a char pointer for their strings, so we don't need the
cu to get the strings table.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-20 16:40:27 -03:00
Arnaldo Carvalho de Melo
ce9de90364
core: Document type->node member usage
...
Right now it's used just when we emit types, so we can reuse it for
instances, to handle different types with the same name in different CUs
in pahole.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-20 16:40:27 -03:00
Arnaldo Carvalho de Melo
cead526d6b
core: Fix nnr_members typo on 'struct type' comment docs
...
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-20 16:40:27 -03:00
Arnaldo Carvalho de Melo
7cfc9be1f2
man-pages: Improve the --nr_methods/-m pahole man page entry
...
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-20 16:40:27 -03:00
Arnaldo Carvalho de Melo
3895127ce6
pahole: Clarify that currently --nr_methods doesn't work together with -C
...
It should, as it's natural to do:
$ pahole --nr_methods -C sock
And have it traverse all functions in all compilation units and show how
many of them have 'struct sock *' as one of their arguments. More
changes are needed to have this in place, and meanwhile it is easy
enough to do:
$ pahole --nr_methods | grep -w sock
sock 1005
$
And with BTF, it's super fast too.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-20 16:40:27 -03:00
Arnaldo Carvalho de Melo
2ea46285ac
pahole: No need to store the class name in 'struct structure'
...
Since we already store the 'struct class' it comes from, and class->name
is now a string, there is no point in storing a duplicate name.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-20 16:40:27 -03:00
Arnaldo Carvalho de Melo
4d8551396d
pahole: Multithreaded DWARF loading requires elfutils >= 0.178
...
According to Mark Wielaard and as per testing, elfutils' libdw version
must be at least 0.178 for multithreaded DWARF loading.
Check for that, emit a warning and continue using just a single thread.
This allows asking for multithreading in things like the Linux kernel
makefiles while still working on older systems, such as centos:7, where
the elfutils version is 0.176.
Mark also provided this info for people using centos:7 (and
equivalents):
''Note that on centos7 if you install centos-release-scl you can get the
various devtoolset packages that do contain newer gcc and elfutils. The
latest are devtoolset-10-gcc (gcc-10.2.1) and devtoolset-10-elfutils-devel
(elfutils-0.182).
After installing you can use them with "scl enable devtoolset-10 bash"
which sets up the environment with the new devtools as default.''
A quick attempt at using a lock around all libdw functions ended up
being too heavy a hammer, making the multithreaded DWARF loader worse
than using just a single thread.
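The fallback policy described above can be sketched as below; the function name and the plain integer version are illustrative stand-ins (a real build-time check could use elfutils' own version macros instead):

```c
#include <assert.h>
#include <stdio.h>

/* Hedged sketch: if libdw is older than 0.178, warn and fall back to a
 * single thread instead of failing the whole run. */
static int choose_nr_threads(unsigned int elfutils_minor, int requested)
{
	if (elfutils_minor < 178) {
		fprintf(stderr,
			"multithreaded DWARF loading needs elfutils >= 0.178, using one thread\n");
		return 1;
	}
	return requested;
}
```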
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-20 16:40:27 -03:00
Arnaldo Carvalho de Melo
e57e23c72a
btf_encoder: Add methods to maintain a list of btf encoders
...
We'll have one per thread and then at the end combine and dedup them one
last time.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-20 16:40:27 -03:00
Arnaldo Carvalho de Melo
e9b83dba79
list: Adopt list_next_entry() from the Linux kernel
...
We'll use it to traverse the list of opaque btf_encoder entries in
pahole.
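A sketch of the helper being adopted; the container_of()/list_entry() definitions follow the Linux kernel convention, and 'struct item' is a hypothetical example type:

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

/* container_of() recovers the enclosing struct from a member pointer;
 * list_next_entry() then steps to the entry after 'pos'. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))
#define list_entry(ptr, type, member) container_of(ptr, type, member)
#define list_next_entry(pos, member) \
	list_entry((pos)->member.next, __typeof__(*(pos)), member)

struct item { int val; struct list_head node; };
```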
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-20 16:40:27 -03:00
Arnaldo Carvalho de Melo
6edae3e768
dwarf_loader: Make hash table size default to 12, faster than 15
...
This is the sweet spot for recent kernels: the default is 15 in the
tests below, and changing it to 12 reduces the elapsed time, so make it
the new default.
$ grep "model name" /proc/cpuinfo
model name : Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz
model name : Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz
model name : Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz
model name : Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz
$
$ sudo perf stat -d -r5 pahole -j --btf_encode_detached vmlinux-j.btf vmlinux
Performance counter stats for 'pahole -j --btf_encode_detached vmlinux-j.btf vmlinux' (5 runs):
8,101.71 msec task-clock # 2.752 CPUs utilized ( +- 0.06% )
1,682 context-switches # 207.610 /sec ( +- 0.98% )
5 cpu-migrations # 0.592 /sec ( +- 15.31% )
68,870 page-faults # 8.501 K/sec ( +- 0.02% )
29,205,269,606 cycles # 3.605 GHz ( +- 0.05% )
63,448,636,788 instructions # 2.17 insn per cycle ( +- 0.00% )
15,127,493,299 branches # 1.867 G/sec ( +- 0.00% )
120,362,476 branch-misses # 0.80% of all branches ( +- 0.11% )
13,967,000,698 L1-dcache-loads # 1.724 G/sec ( +- 0.00% )
375,052,289 L1-dcache-load-misses # 2.69% of all L1-dcache accesses ( +- 0.03% )
91,506,061 LLC-loads # 11.295 M/sec ( +- 0.10% )
27,905,809 LLC-load-misses # 30.50% of all LL-cache accesses ( +- 0.16% )
2.94445 +- 0.00188 seconds time elapsed ( +- 0.06% )
$ sudo perf stat -d -r5 pahole --hashbits 12 -j --btf_encode_detached vmlinux-j.btf vmlinux
Performance counter stats for 'pahole --hashbits 12 -j --btf_encode_detached vmlinux-j.btf vmlinux' (5 runs):
7,681.15 msec task-clock # 2.702 CPUs utilized ( +- 0.05% )
1,660 context-switches # 216.114 /sec ( +- 1.02% )
3 cpu-migrations # 0.365 /sec ( +- 13.36% )
67,794 page-faults # 8.826 K/sec ( +- 0.05% )
27,692,748,327 cycles # 3.605 GHz ( +- 0.04% )
63,041,363,409 instructions # 2.28 insn per cycle ( +- 0.00% )
15,063,798,404 branches # 1.961 G/sec ( +- 0.00% )
127,461,737 branch-misses # 0.85% of all branches ( +- 0.11% )
13,974,527,710 L1-dcache-loads # 1.819 G/sec ( +- 0.00% )
364,775,664 L1-dcache-load-misses # 2.61% of all L1-dcache accesses ( +- 0.01% )
83,685,127 LLC-loads # 10.895 M/sec ( +- 0.14% )
19,073,967 LLC-load-misses # 22.79% of all LL-cache accesses ( +- 0.30% )
2.842468 +- 0.000561 seconds time elapsed ( +- 0.02% )
$ sudo perf stat -d -r5 pahole -j --btf_encode_detached vmlinux-j.btf vmlinux-5.14.0-0.rc1.20210714git40226a3d96ef.18.fc35.x86_64
Performance counter stats for 'pahole -j --btf_encode_detached vmlinux-j.btf vmlinux-5.14.0-0.rc1.20210714git40226a3d96ef.18.fc35.x86_64' (5 runs):
9,512.30 msec task-clock # 2.741 CPUs utilized ( +- 0.54% )
1,964 context-switches # 206.469 /sec ( +- 2.60% )
7 cpu-migrations # 0.736 /sec ( +- 37.25% )
81,611 page-faults # 8.579 K/sec ( +- 0.08% )
34,294,568,812 cycles # 3.605 GHz ( +- 0.53% )
72,897,384,015 instructions # 2.13 insn per cycle ( +- 0.15% )
17,386,180,039 branches # 1.828 G/sec ( +- 0.15% )
136,142,139 branch-misses # 0.78% of all branches ( +- 1.06% )
16,020,787,096 L1-dcache-loads # 1.684 G/sec ( +- 0.19% )
430,392,585 L1-dcache-load-misses # 2.69% of all L1-dcache accesses ( +- 0.37% )
107,401,567 LLC-loads # 11.291 M/sec ( +- 0.30% )
35,172,977 LLC-load-misses # 32.75% of all LL-cache accesses ( +- 0.48% )
3.4710 +- 0.0243 seconds time elapsed ( +- 0.70% )
$ sudo perf stat -d -r5 pahole --hashbits 12 -j --btf_encode_detached vmlinux-j.btf vmlinux-5.14.0-0.rc1.20210714git40226a3d96ef.18.fc35.x86_64
Performance counter stats for 'pahole --hashbits 12 -j --btf_encode_detached vmlinux-j.btf vmlinux-5.14.0-0.rc1.20210714git40226a3d96ef.18.fc35.x86_64' (5 runs):
8,929.50 msec task-clock # 2.700 CPUs utilized ( +- 0.04% )
1,907 context-switches # 213.539 /sec ( +- 0.68% )
4 cpu-migrations # 0.426 /sec ( +- 30.46% )
80,661 page-faults # 9.033 K/sec ( +- 0.03% )
32,213,009,827 cycles # 3.607 GHz ( +- 0.03% )
72,345,614,657 instructions # 2.25 insn per cycle ( +- 0.00% )
17,290,227,666 branches # 1.936 G/sec ( +- 0.00% )
142,108,954 branch-misses # 0.82% of all branches ( +- 0.09% )
15,998,190,852 L1-dcache-loads # 1.792 G/sec ( +- 0.00% )
417,872,772 L1-dcache-load-misses # 2.61% of all L1-dcache accesses ( +- 0.02% )
98,061,829 LLC-loads # 10.982 M/sec ( +- 0.24% )
24,750,223 LLC-load-misses # 25.24% of all LL-cache accesses ( +- 0.17% )
3.30670 +- 0.00185 seconds time elapsed ( +- 0.06% )
$
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-20 16:40:27 -03:00
Arnaldo Carvalho de Melo
d2d83be1e2
pahole: Allow tweaking the size of the loader hash tables
...
This allows experimenting with different sizes as time goes by and the
number of symbols in the kernel grows.
The current default, 15, is suboptimal for the Fedora Rawhide kernel; we
can do better using 12.
Default: 15:
$ sudo ~acme/bin/perf stat -d -r5 pahole -j --btf_encode_detached vmlinux-j.btf vmlinux
Performance counter stats for 'pahole -j --btf_encode_detached vmlinux-j.btf vmlinux' (5 runs):
8,107.73 msec task-clock # 2.749 CPUs utilized ( +- 0.05% )
1,723 context-switches # 212.562 /sec ( +- 1.86% )
5 cpu-migrations # 0.641 /sec ( +- 46.07% )
68,802 page-faults # 8.486 K/sec ( +- 0.05% )
29,221,590,880 cycles # 3.604 GHz ( +- 0.04% )
63,438,138,612 instructions # 2.17 insn per cycle ( +- 0.00% )
15,125,172,105 branches # 1.866 G/sec ( +- 0.00% )
119,983,284 branch-misses # 0.79% of all branches ( +- 0.06% )
13,964,248,638 L1-dcache-loads # 1.722 G/sec ( +- 0.00% )
375,110,346 L1-dcache-load-misses # 2.69% of all L1-dcache accesses( +- 0.01% )
91,712,402 LLC-loads # 11.312 M/sec ( +- 0.14% )
28,025,289 LLC-load-misses # 30.56% of all LL-cache accesses ( +- 0.23% )
2.94980 +- 0.00193 seconds time elapsed ( +- 0.07% )
$
New default, to be set in an upcoming patch, 12:
$ sudo ~acme/bin/perf stat -d -r5 pahole --hashbits=12 -j --btf_encode_detached vmlinux-j.btf vmlinux
Performance counter stats for 'pahole --hashbits=12 -j --btf_encode_detached vmlinux-j.btf vmlinux' (5 runs):
7,687.31 msec task-clock # 2.704 CPUs utilized ( +- 0.02% )
1,677 context-switches # 218.126 /sec ( +- 0.70% )
4 cpu-migrations # 0.468 /sec ( +- 18.84% )
67,827 page-faults # 8.823 K/sec ( +- 0.03% )
27,711,744,058 cycles # 3.605 GHz ( +- 0.02% )
63,032,539,630 instructions # 2.27 insn per cycle ( +- 0.00% )
15,062,001,666 branches # 1.959 G/sec ( +- 0.00% )
127,728,818 branch-misses # 0.85% of all branches ( +- 0.07% )
13,972,184,314 L1-dcache-loads # 1.818 G/sec ( +- 0.00% )
364,962,883 L1-dcache-load-misses # 2.61% of all L1-dcache accesses( +- 0.02% )
83,969,109 LLC-loads # 10.923 M/sec ( +- 0.13% )
19,141,055 LLC-load-misses # 22.80% of all LL-cache accesses ( +- 0.25% )
2.842440 +- 0.000952 seconds time elapsed ( +- 0.03% )
$ sudo ~acme/bin/perf stat -d -r5 pahole --hashbits=11 -j --btf_encode_detached vmlinux-j.btf vmlinux
Performance counter stats for 'pahole --hashbits=11 -j --btf_encode_detached vmlinux-j.btf vmlinux' (5 runs):
7,704.29 msec task-clock # 2.702 CPUs utilized ( +- 0.05% )
1,676 context-switches # 217.515 /sec ( +- 1.04% )
2 cpu-migrations # 0.286 /sec ( +- 17.01% )
67,813 page-faults # 8.802 K/sec ( +- 0.05% )
27,786,710,102 cycles # 3.607 GHz ( +- 0.05% )
63,027,795,038 instructions # 2.27 insn per cycle ( +- 0.00% )
15,066,316,987 branches # 1.956 G/sec ( +- 0.00% )
130,431,772 branch-misses # 0.87% of all branches ( +- 0.20% )
13,981,516,517 L1-dcache-loads # 1.815 G/sec ( +- 0.00% )
369,525,466 L1-dcache-load-misses # 2.64% of all L1-dcache accesses( +- 0.03% )
83,328,524 LLC-loads # 10.816 M/sec ( +- 0.27% )
18,704,020 LLC-load-misses # 22.45% of all LL-cache accesses ( +- 0.18% )
2.85109 +- 0.00281 seconds time elapsed ( +- 0.10% )
$ sudo ~acme/bin/perf stat -d -r5 pahole --hashbits=8 -j --btf_encode_detached vmlinux-j.btf vmlinux
Performance counter stats for 'pahole --hashbits=8 -j --btf_encode_detached vmlinux-j.btf vmlinux' (5 runs):
8,190.55 msec task-clock # 2.774 CPUs utilized ( +- 0.03% )
1,607 context-switches # 196.226 /sec ( +- 0.67% )
3 cpu-migrations # 0.317 /sec ( +- 15.38% )
67,869 page-faults # 8.286 K/sec ( +- 0.05% )
29,511,213,192 cycles # 3.603 GHz ( +- 0.02% )
63,347,196,598 instructions # 2.15 insn per cycle ( +- 0.00% )
15,198,023,498 branches # 1.856 G/sec ( +- 0.00% )
131,113,100 branch-misses # 0.86% of all branches ( +- 0.14% )
14,118,162,884 L1-dcache-loads # 1.724 G/sec ( +- 0.00% )
422,048,384 L1-dcache-load-misses # 2.99% of all L1-dcache accesses( +- 0.01% )
105,878,910 LLC-loads # 12.927 M/sec ( +- 0.05% )
21,022,664 LLC-load-misses # 19.86% of all LL-cache accesses ( +- 0.20% )
2.952678 +- 0.000858 seconds time elapsed ( +- 0.03% )
$ sudo ~acme/bin/perf stat -d -r5 pahole --hashbits=13 -j --btf_encode_detached vmlinux-j.btf vmlinux
Performance counter stats for 'pahole --hashbits=13 -j --btf_encode_detached vmlinux-j.btf vmlinux' (5 runs):
7,728.71 msec task-clock # 2.707 CPUs utilized ( +- 0.07% )
1,661 context-switches # 214.887 /sec ( +- 0.70% )
2 cpu-migrations # 0.259 /sec ( +- 22.36% )
67,893 page-faults # 8.785 K/sec ( +- 0.04% )
27,874,322,843 cycles # 3.607 GHz ( +- 0.07% )
63,079,425,815 instructions # 2.26 insn per cycle ( +- 0.00% )
15,067,279,408 branches # 1.950 G/sec ( +- 0.00% )
125,706,874 branch-misses # 0.83% of all branches ( +- 1.00% )
13,967,177,801 L1-dcache-loads # 1.807 G/sec ( +- 0.00% )
363,566,754 L1-dcache-load-misses # 2.60% of all L1-dcache accesses( +- 0.02% )
86,583,482 LLC-loads # 11.203 M/sec ( +- 0.13% )
20,629,871 LLC-load-misses # 23.83% of all LL-cache accesses ( +- 0.21% )
2.85551 +- 0.00124 seconds time elapsed ( +- 0.04% )
$
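How the hash-bits setting sizes the table can be sketched as follows; the multiplicative hash mirrors the Linux-kernel style hash_64() that such hash tables commonly use (the constant is the kernel's GOLDEN_RATIO_64), shown here as an assumption about the loader's hashing, not its exact code:

```c
#include <assert.h>
#include <stdint.h>

#define GOLDEN_RATIO_64 0x61C8864680B583EBull

/* Fold a 64-bit key into the top 'bits' bits: with bits = 12 the table
 * has 1 << 12 = 4096 buckets instead of 1 << 15 = 32768. */
static inline uint32_t hash_64(uint64_t val, unsigned int bits)
{
	return val * GOLDEN_RATIO_64 >> (64 - bits);
}
```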
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-20 16:40:27 -03:00
Arnaldo Carvalho de Melo
ff7bd7083f
core: Allow sizing the loader hash table
...
For now this only applies to the DWARF loader, for experimenting as time
passes and kernels grow bigger or gain more symbols.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-20 16:40:27 -03:00
Arnaldo Carvalho de Melo
3068ff36b7
hash: Remove unused hash_32(), hash_ptr()
...
We're only using hash_64(), so ditch unused parts.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-20 16:40:27 -03:00
Arnaldo Carvalho de Melo
8eebf70d05
dwarf_loader: Use a per-CU frontend cache for the latest lookup result
...
Using a debug patch I found that for the Linux kernel (a vmlinux from
Fedora Rawhide) we get this number of cache hits:
nr_saved_lookups=2661460
$ grep "model name" /proc/cpuinfo
model name : Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz
model name : Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz
model name : Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz
model name : Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz
$
Before:
$ perf stat -d -r1 pahole -j --btf_encode_detached vmlinux-5.14.0-0.rc1.20210714git40226a3d96ef.18.fc35.x86_64-j.btf vmlinux-5.14.0-0.rc1.20210714git40226a3d96ef.18.fc35.x86_64
Performance counter stats for 'pahole -j --btf_encode_detached vmlinux-5.14.0-0.rc1.20210714git40226a3d96ef.18.fc35.x86_64-j.btf vmlinux-5.14.0-0.rc1.20210714git40226a3d96ef.18.fc35.x86_64':
9,515.95 msec task-clock:u # 2.731 CPUs utilized
0 context-switches:u # 0.000 /sec
0 cpu-migrations:u # 0.000 /sec
81,634 page-faults:u # 8.579 K/sec
33,468,454,452 cycles:u # 3.517 GHz
72,279,667,117 instructions:u # 2.16 insn per cycle
17,256,208,904 branches:u # 1.813 G/sec
132,775,067 branch-misses:u # 0.77% of all branches
15,840,427,579 L1-dcache-loads:u # 1.665 G/sec
417,209,398 L1-dcache-load-misses:u # 2.63% of all L1-dcache accesses
105,099,756 LLC-loads:u # 11.045 M/sec
35,027,985 LLC-load-misses:u # 33.33% of all LL-cache accesses
3.484851710 seconds time elapsed
9.353155000 seconds user
0.190730000 seconds sys
$
After:
$ perf stat -d -r1 pahole -j --btf_encode_detached \
vmlinux-5.14.0-0.rc1.20210714git40226a3d96ef.18.fc35.x86_64-j.btf \
vmlinux-5.14.0-0.rc1.20210714git40226a3d96ef.18.fc35.x86_64
Performance counter stats for 'pahole -j --btf_encode_detached vmlinux-5.14.0-0.rc1.20210714git40226a3d96ef.18.fc35.x86_64-j.btf vmlinux-5.14.0-0.rc1.20210714git40226a3d96ef.18.fc35.x86_64':
9,416.17 msec task-clock:u # 2.744 CPUs utilized
0 context-switches:u # 0.000 /sec
0 cpu-migrations:u # 0.000 /sec
81,461 page-faults:u # 8.651 K/sec
33,330,006,641 cycles:u # 3.540 GHz
72,301,897,397 instructions:u # 2.17 insn per cycle
17,263,694,358 branches:u # 1.833 G/sec
133,414,373 branch-misses:u # 0.77% of all branches
15,860,141,450 L1-dcache-loads:u # 1.684 G/sec
418,816,079 L1-dcache-load-misses:u # 2.64% of all L1-dcache accesses
104,960,787 LLC-loads:u # 11.147 M/sec
34,629,758 LLC-load-misses:u # 32.99% of all LL-cache accesses
3.431376846 seconds time elapsed
9.294489000 seconds user
0.146507000 seconds sys
$
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-20 16:40:27 -03:00
Arnaldo Carvalho de Melo
a2f1e69848
core: Use obstacks: take 2
...
Allow asking for obstacks to be used. For use cases like the BTF
encoder, everything is allocated sequentially and then freed in one go
at cu__delete() time, so obstacks are applicable and provide a good
speedup:
$ grep "model name" /proc/cpuinfo
model name : Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz
model name : Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz
model name : Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz
model name : Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz
$
Before:
$ perf stat -r5 pahole -j --btf_encode_detached vmlinux-j.btf vmlinux
Performance counter stats for 'pahole -j --btf_encode_detached vmlinux-j.btf vmlinux' (5 runs):
10,445.75 msec task-clock:u # 2.864 CPUs utilized ( +- 0.08% )
0 context-switches:u # 0.000 /sec
0 cpu-migrations:u # 0.000 /sec
761,926 page-faults:u # 72.941 K/sec ( +- 0.00% )
31,946,591,661 cycles:u # 3.058 GHz ( +- 0.05% )
69,103,520,880 instructions:u # 2.16 insn per cycle ( +- 0.00% )
16,353,763,143 branches:u # 1.566 G/sec ( +- 0.00% )
122,309,098 branch-misses:u # 0.75% of all branches ( +- 0.12% )
3.64689 +- 0.00437 seconds time elapsed ( +- 0.12% )
$ perf record --call-graph lbr pahole -j --btf_encode_detached vmlinux-j.btf vmlinux
[ perf record: Woken up 52 times to write data ]
[ perf record: Captured and wrote 13.151 MB perf.data (43058 samples) ]
$
$ perf report --no-children
Samples: 43K of event 'cycles:u', Event count (approx.): 31938442091
Overhead Command Shared Object Symbol
+ 22.98% pahole libdw-0.185.so [.] __libdw_find_attr
+ 6.69% pahole libdwarves.so.1.0.0 [.] cu__hash.isra.0
+ 5.82% pahole libdwarves.so.1.0.0 [.] hashmap__insert
+ 5.16% pahole libc.so.6 [.] __libc_calloc
+ 5.01% pahole libdwarves.so.1.0.0 [.] btf_dedup_is_equiv
+ 3.39% pahole libc.so.6 [.] _int_malloc
+ 2.82% pahole libc.so.6 [.] __strcmp_avx2
+ 2.22% pahole libdw-0.185.so [.] __libdw_form_val_compute_len
+ 2.13% pahole libdw-0.185.so [.] dwarf_attr
+ 2.08% pahole [unknown] [k] 0xffffffffa0e010a7
+ 1.98% pahole libdwarves.so.1.0.0 [.] dwarf_cu__find_type_by_ref
+ 1.98% pahole libdwarves.so.1.0.0 [.] btf__dedup
+ 1.92% pahole libc.so.6 [.] pthread_rwlock_unlock@@GLIBC_2.34
+ 1.92% pahole libdwarves.so.1.0.0 [.] btf__add_field
+ 1.92% pahole libdwarves.so.1.0.0 [.] list__for_all_tags
+ 1.61% pahole libdwarves.so.1.0.0 [.] btf_encoder__encode_cu
+ 1.49% pahole libdwarves.so.1.0.0 [.] die__process_class
+ 1.44% pahole libc.so.6 [.] pthread_rwlock_tryrdlock@@GLIBC_2.34
+ 1.24% pahole libdw-0.185.so [.] dwarf_siblingof
+ 1.18% pahole libdwarves.so.1.0.0 [.] btf_dedup_ref_type
+ 1.12% pahole libdwarves.so.1.0.0 [.] strs_hash_fn
+ 1.11% pahole libdwarves.so.1.0.0 [.] attr_numeric
+ 1.01% pahole libdwarves.so.1.0.0 [.] tag__size
After:
$ perf stat -r5 pahole -j --btf_encode_detached vmlinux-j.btf vmlinux
Performance counter stats for 'pahole -j --btf_encode_detached vmlinux-j.btf vmlinux' (5 runs):
8,114.11 msec task-clock:u # 2.747 CPUs utilized ( +- 0.09% )
0 context-switches:u # 0.000 /sec
0 cpu-migrations:u # 0.000 /sec
68,792 page-faults:u # 8.478 K/sec ( +- 0.05% )
28,705,283,249 cycles:u # 3.538 GHz ( +- 0.09% )
63,013,653,035 instructions:u # 2.20 insn per cycle ( +- 0.00% )
15,039,319,384 branches:u # 1.853 G/sec ( +- 0.00% )
118,272,350 branch-misses:u # 0.79% of all branches ( +- 0.41% )
2.95368 +- 0.00221 seconds time elapsed ( +- 0.07% )
$
$ perf record --call-graph lbr pahole -j --btf_encode_detached vmlinux-j.btf vmlinux
[ perf record: Woken up 40 times to write data ]
[ perf record: Captured and wrote 10.426 MB perf.data (33733 samples) ]
$
$ perf report --no-children
Samples: 33K of event 'cycles:u', Event count (approx.): 28860426071
Overhead Command Shared Object Symbol
+ 26.10% pahole libdw-0.185.so [.] __libdw_find_attr
+ 6.13% pahole libdwarves.so.1.0.0 [.] cu__hash.isra.0
+ 5.83% pahole libdwarves.so.1.0.0 [.] hashmap__insert
+ 5.52% pahole libdwarves.so.1.0.0 [.] btf_dedup_is_equiv
+ 3.04% pahole libc.so.6 [.] __strcmp_avx2
+ 2.45% pahole libdw-0.185.so [.] __libdw_form_val_compute_len
+ 2.31% pahole libdwarves.so.1.0.0 [.] btf__dedup
+ 2.30% pahole libdw-0.185.so [.] dwarf_attr
+ 2.19% pahole libc.so.6 [.] pthread_rwlock_unlock@@GLIBC_2.34
+ 2.08% pahole libdwarves.so.1.0.0 [.] list__for_all_tags
+ 2.07% pahole libdwarves.so.1.0.0 [.] dwarf_cu__find_type_by_ref
+ 1.96% pahole libdwarves.so.1.0.0 [.] btf__add_field
+ 1.67% pahole libc.so.6 [.] pthread_rwlock_tryrdlock@@GLIBC_2.34
+ 1.63% pahole libdwarves.so.1.0.0 [.] btf_encoder__encode_cu
+ 1.52% pahole libdwarves.so.1.0.0 [.] die__process_class
+ 1.51% pahole libdwarves.so.1.0.0 [.] attr_type
+ 1.36% pahole libdwarves.so.1.0.0 [.] btf_dedup_ref_type
+ 1.32% pahole libdwarves.so.1.0.0 [.] strs_hash_fn
+ 1.25% pahole libdw-0.185.so [.] dwarf_siblingof
+ 1.24% pahole libdwarves.so.1.0.0 [.] namespace__recode_dwarf_types
+ 1.17% pahole libdwarves.so.1.0.0 [.] attr_numeric
+ 1.16% pahole libdwarves.so.1.0.0 [.] dwarf_cu__init
+ 1.03% pahole libdwarves.so.1.0.0 [.] tag__init
+ 1.01% pahole libdwarves.so.1.0.0 [.] tag__size
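The allocation pattern that makes obstacks a fit here can be sketched with glibc's &lt;obstack.h&gt; API; the demo function is hypothetical, not dwarves code:

```c
#include <assert.h>
#include <obstack.h>
#include <stdlib.h>

/* obstack grows by chunks obtained through these user-supplied macros */
#define obstack_chunk_alloc malloc
#define obstack_chunk_free free

/* Allocate sequentially, then release everything in one call, which is
 * what happens per CU at cu__delete() time. */
static int obstack_demo(void)
{
	struct obstack ob;
	int i, *p = NULL;

	obstack_init(&ob);
	for (i = 0; i < 1000; i++) {	/* many small sequential allocations */
		p = obstack_alloc(&ob, sizeof(*p));
		*p = i;
	}
	i = *p;
	obstack_free(&ob, NULL);	/* free the whole obstack at once */
	return i;
}
```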
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-20 16:40:27 -03:00
Arnaldo Carvalho de Melo
dca86fb8c2
dwarf_loader: Add comment on why we can't ignore lexblocks
...
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-20 16:40:27 -03:00
Arnaldo Carvalho de Melo
9d0a7ee0c3
pahole: Ignore DW_TAG_label when encoding BTF
...
These will not be used, so don't waste cycles/memory parsing them:
$ grep "model name" /proc/cpuinfo
model name : Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz
model name : Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz
model name : Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz
model name : Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz
$
Before:
$ perf stat -r5 pahole -j --btf_encode_detached=vmlinux-j.btf -F dwarf vmlinux
Performance counter stats for 'pahole -j --btf_encode_detached=vmlinux-j.btf -F dwarf vmlinux' (5 runs):
10,487.54 msec task-clock:u # 2.855 CPUs utilized ( +- 0.31% )
0 context-switches:u # 0.000 /sec
0 cpu-migrations:u # 0.000 /sec
762,431 page-faults:u # 72.699 K/sec ( +- 0.00% )
31,994,949,358 cycles:u # 3.051 GHz ( +- 0.09% )
69,129,157,311 instructions:u # 2.16 insn per cycle ( +- 0.00% )
16,359,974,001 branches:u # 1.560 G/sec ( +- 0.00% )
122,800,385 branch-misses:u # 0.75% of all branches ( +- 0.23% )
3.67286 +- 0.00917 seconds time elapsed ( +- 0.25% )
$
After:
$ perf stat -r5 pahole -j --btf_encode_detached=vmlinux-j.btf -F dwarf vmlinux
Performance counter stats for 'pahole -j --btf_encode_detached=vmlinux-j.btf -F dwarf vmlinux' (5 runs):
10,431.47 msec task-clock:u # 2.865 CPUs utilized ( +- 0.04% )
0 context-switches:u # 0.000 /sec
0 cpu-migrations:u # 0.000 /sec
761,982 page-faults:u # 73.046 K/sec ( +- 0.00% )
31,885,756,148 cycles:u # 3.057 GHz ( +- 0.04% )
69,103,456,079 instructions:u # 2.17 insn per cycle ( +- 0.00% )
16,353,867,606 branches:u # 1.568 G/sec ( +- 0.00% )
122,023,818 branch-misses:u # 0.75% of all branches ( +- 0.09% )
3.64095 +- 0.00194 seconds time elapsed ( +- 0.05% )
$
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-20 16:40:27 -03:00
Arnaldo Carvalho de Melo
d40c5f1e20
core: Allow ignoring DW_TAG_label
...
The BTF encoder doesn't use this information, so there is no need to
parse it.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-20 16:40:27 -03:00
Arnaldo Carvalho de Melo
51ba831929
pahole: Ignore DW_TAG_inline_expansion when encoding BTF
...
XXX: for now leave this commented out; see the comments in the source
code.
These will not be used, so don't waste cycles/memory parsing them:
$ grep "model name" /proc/cpuinfo
model name : Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz
model name : Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz
model name : Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz
model name : Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz
$
Before:
$ perf stat -r5 pahole -j --btf_encode_detached=vmlinux-j.btf -F dwarf vmlinux
Performance counter stats for 'pahole -j --btf_encode_detached=vmlinux-j.btf -F dwarf vmlinux' (5 runs):
10,973.13 msec task-clock:u # 2.906 CPUs utilized ( +- 0.13% )
0 context-switches:u # 0.000 /sec
0 cpu-migrations:u # 0.000 /sec
793,927 page-faults:u # 72.352 K/sec ( +- 0.00% )
33,585,562,298 cycles:u # 3.061 GHz ( +- 0.17% )
72,687,766,428 instructions:u # 2.16 insn per cycle ( +- 0.15% )
17,198,056,478 branches:u # 1.567 G/sec ( +- 0.16% )
129,011,360 branch-misses:u # 0.75% of all branches ( +- 0.53% )
3.7760 +- 0.0158 seconds time elapsed ( +- 0.42% )
$
After:
$ perf stat -r5 pahole -j --btf_encode_detached=vmlinux-j.btf -F dwarf vmlinux
Performance counter stats for 'pahole -j --btf_encode_detached=vmlinux-j.btf -F dwarf vmlinux' (5 runs):
10,487.54 msec task-clock:u # 2.855 CPUs utilized ( +- 0.31% )
0 context-switches:u # 0.000 /sec
0 cpu-migrations:u # 0.000 /sec
762,431 page-faults:u # 72.699 K/sec ( +- 0.00% )
31,994,949,358 cycles:u # 3.051 GHz ( +- 0.09% )
69,129,157,311 instructions:u # 2.16 insn per cycle ( +- 0.00% )
16,359,974,001 branches:u # 1.560 G/sec ( +- 0.00% )
122,800,385 branch-misses:u # 0.75% of all branches ( +- 0.23% )
3.67286 +- 0.00917 seconds time elapsed ( +- 0.25% )
$
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-20 16:40:25 -03:00
Arnaldo Carvalho de Melo
9038638891
core: Allow ignoring DW_TAG_inline_expansion
...
The BTF encoder doesn't use this information, so there is no need to
parse it.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-20 16:39:31 -03:00
Arnaldo Carvalho de Melo
20757745f0
pahole: Allow encoding BTF with parallel DWARF loading
...
By adding a lock to serialize access to btf_encoder__encode_cu().
This works and allows a speedup in BTF encoding, but it's too brute
force; the right thing to do is to have per-thread BTF encoders and then
merge everything in a last pass at the end.
But pick the low hanging fruit now.
On a machine with 4 cores, no HT:
$ grep "model name" -m1 /proc/cpuinfo
model name : Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz
$
Non-parallel:
$ perf stat -r5 pahole --btf_encode_detached=vmlinux.btf vmlinux
Performance counter stats for 'pahole --btf_encode_detached=vmlinux.btf vmlinux' (5 runs):
8,580.19 msec task-clock:u # 1.000 CPUs utilized ( +- 0.08% )
0 context-switches:u # 0.000 /sec
0 cpu-migrations:u # 0.000 /sec
795,451 page-faults:u # 92.708 K/sec ( +- 0.00% )
29,151,924,821 cycles:u # 3.398 GHz ( +- 0.11% )
70,947,245,709 instructions:u # 2.43 insn per cycle ( +- 0.00% )
16,791,160,182 branches:u # 1.957 G/sec ( +- 0.00% )
120,793,994 branch-misses:u # 0.72% of all branches ( +- 1.04% )
8.58192 +- 0.00686 seconds time elapsed ( +- 0.08% )
$
Parallel:
$ perf stat -r5 pahole --btf_encode_detached=vmlinux-j.btf -j vmlinux
Performance counter stats for 'pahole --btf_encode_detached=vmlinux-j.btf -j vmlinux' (5 runs):
10,962.45 msec task-clock:u # 2.914 CPUs utilized ( +- 0.15% )
0 context-switches:u # 0.000 /sec
0 cpu-migrations:u # 0.000 /sec
793,915 page-faults:u # 72.421 K/sec ( +- 0.00% )
33,552,130,646 cycles:u # 3.061 GHz ( +- 0.16% )
72,778,320,572 instructions:u # 2.17 insn per cycle ( +- 0.12% )
17,220,541,136 branches:u # 1.571 G/sec ( +- 0.13% )
129,353,767 branch-misses:u # 0.75% of all branches ( +- 0.48% )
3.7614 +- 0.0141 seconds time elapsed ( +- 0.38% )
$
That 'CPUs utilized' figure should go all the way to 4 when we
parallelize the BTF encoding.
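The serialization amounts to the pattern below; 'struct cu' and the encode function are stubs standing in for the real encoder, so the names are illustrative:

```c
#include <assert.h>
#include <pthread.h>

struct cu { int id; };

static int nr_encoded;	/* stand-in for shared BTF encoder state */

static int encode_cu_stub(struct cu *cu)
{
	return ++nr_encoded;	/* mutates shared state, hence the lock */
}

static pthread_mutex_t btf_lock = PTHREAD_MUTEX_INITIALIZER;

/* Serialize: only one DWARF-loader thread at a time may run the encoder. */
static int encode_cu_locked(struct cu *cu)
{
	pthread_mutex_lock(&btf_lock);
	int err = encode_cu_stub(cu);
	pthread_mutex_unlock(&btf_lock);
	return err;
}
```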
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-20 16:39:02 -03:00
Arnaldo Carvalho de Melo
5a85d9a450
core: Zero out unused entries when extending ptr_table array in ptr_table__add()
...
Otherwise we may end up accessing invalid pointers and crashing.
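The bug class being fixed can be sketched like this; the struct layout and function name are illustrative approximations of the growth path, not the actual ptr_table code:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct ptr_table { void **entries; unsigned int allocated; };

/* realloc() leaves the newly grown tail uninitialized; zero it so later
 * reads of not-yet-filled slots see NULL instead of garbage pointers. */
static int ptr_table__grow(struct ptr_table *pt, unsigned int new_size)
{
	void **entries = realloc(pt->entries, new_size * sizeof(void *));

	if (entries == NULL)
		return -1;
	memset(entries + pt->allocated, 0,
	       (new_size - pt->allocated) * sizeof(void *));
	pt->entries = entries;
	pt->allocated = new_size;
	return 0;
}
```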
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-20 16:39:02 -03:00
Arnaldo Carvalho de Melo
d133569bd0
pahole: No need to read DW_AT_alignment when encoding BTF
...
No need to read DW_AT_alignment, as it is not used in BTF encoding.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-20 16:38:58 -03:00
Arnaldo Carvalho de Melo
21a41e5386
dwarf_loader: Allow asking not to read the DW_AT_alignment attribute
...
This attribute isn't present in most types or struct members, which ends
up making dwarf_attr() call libdw_find_attr(), which in turn does a
linear search over all of a DIE's attributes.
We don't use it in the BTF encoder, so there is no point in reading it.
This will be used in pahole in the following cset.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-20 16:38:09 -03:00
Arnaldo Carvalho de Melo
1ef1639039
dwarf_loader: Do not look for non-C DWARF attributes in C CUs
...
Avoid looking for attributes that don't apply to the C language, such
as DW_AT_virtuality (virtual, pure_virtual), DW_AT_accessibility
(public, protected, private) and DW_AT_const_value.
Looking for those attributes in class_member__new() makes
libdw_find_attr() linearly search all of a DIE's attributes, which
shows up in profiles.
Before:
$ perf stat -r5 pahole --btf_encode_detached=vmlinux.btf -j vmlinux
Performance counter stats for 'pahole --btf_encode_detached=vmlinux.btf -j vmlinux' (5 runs):
11,239.99 msec task-clock:u # 2.921 CPUs utilized ( +- 0.08% )
0 context-switches:u # 0.000 /sec
0 cpu-migrations:u # 0.000 /sec
793,897 page-faults:u # 70.631 K/sec ( +- 0.00% )
34,593,518,484 cycles:u # 3.078 GHz ( +- 0.05% )
75,592,805,563 instructions:u # 2.19 insn per cycle ( +- 0.00% )
17,923,046,622 branches:u # 1.595 G/sec ( +- 0.00% )
131,080,371 branch-misses:u # 0.73% of all branches ( +- 0.18% )
3.84794 +- 0.00327 seconds time elapsed ( +- 0.09% )
$
After:
$ perf stat -r5 pahole --btf_encode_detached=vmlinux.btf -j vmlinux
Performance counter stats for 'pahole --btf_encode_detached=vmlinux.btf -j vmlinux' (5 runs):
11,178.28 msec task-clock:u # 2.929 CPUs utilized ( +- 0.12% )
0 context-switches:u # 0.000 /sec
0 cpu-migrations:u # 0.000 /sec
793,890 page-faults:u # 71.021 K/sec ( +- 0.00% )
34,378,886,265 cycles:u # 3.076 GHz ( +- 0.13% )
75,523,849,140 instructions:u # 2.20 insn per cycle ( +- 0.12% )
17,907,573,910 branches:u # 1.602 G/sec ( +- 0.12% )
130,137,529 branch-misses:u # 0.73% of all branches ( +- 0.50% )
3.8165 +- 0.0137 seconds time elapsed ( +- 0.36% )
$
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-12 09:41:13 -03:00
Arnaldo Carvalho de Melo
88265eab35
core: Add cu__is_c() to check if the CU language is C
...
We'll use this to avoid looking for attributes that don't apply to the
C language, such as DW_AT_virtuality (virtual, pure_virtual) and
DW_AT_accessibility (public, protected, private).
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-12 09:41:13 -03:00
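A sketch of such a helper, with the language codes matching the `DW_LANG_*` values from `<dwarf.h>` and `struct cu` trimmed down to just the field the check needs:

```c
#include <assert.h>

#define DW_LANG_C89	0x0001
#define DW_LANG_C	0x0002
#define DW_LANG_C99	0x000c
#define DW_LANG_C11	0x001d

struct cu {
	int language;	/* from the CU's DW_AT_language attribute */
};

/* Returns true if the compile unit was built from C sources, so callers
 * can skip C++-only attributes like DW_AT_virtuality/DW_AT_accessibility. */
static int cu__is_c(const struct cu *cu)
{
	switch (cu->language) {
	case DW_LANG_C89:
	case DW_LANG_C:
	case DW_LANG_C99:
	case DW_LANG_C11:
		return 1;
	default:
		return 0;
	}
}
```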
Arnaldo Carvalho de Melo
1caed1c443
dwarf_loader: Add a lock around dwarf_decl_file() and dwarf_decl_line() calls
...
These calls end up racing on a tsearch() call, probably on some libdw
cache that gets updated/looked up by concurrent pahole threads (-j N).
This cures the crash below; a patch for libdw will be cooked up and sent.
(gdb) run -j -I -F dwarf vmlinux > /dev/null
Starting program: /var/home/acme/git/pahole/build/pahole -j -I -F dwarf vmlinux > /dev/null
warning: Expected absolute pathname for libpthread in the inferior, but got .gnu_debugdata for /lib64/libpthread.so.0.
warning: File "/usr/lib64/libthread_db-1.0.so" auto-loading has been declined by your `auto-load safe-path' set to "$debugdir:$datadir/auto-load".
warning: Unable to find libthread_db matching inferior's thread library, thread debugging will not be available.
[New LWP 844789]
[New LWP 844790]
[New LWP 844791]
[New LWP 844792]
[New LWP 844793]
[New LWP 844794]
[New LWP 844795]
[New LWP 844796]
[New LWP 844797]
[New LWP 844798]
[New LWP 844799]
[New LWP 844800]
[New LWP 844801]
[New LWP 844802]
[New LWP 844803]
[New LWP 844804]
[New LWP 844805]
[New LWP 844806]
[New LWP 844807]
[New LWP 844808]
[New LWP 844809]
[New LWP 844810]
[New LWP 844811]
[New LWP 844812]
[New LWP 844813]
[New LWP 844814]
Thread 2 "pahole" received signal SIGSEGV, Segmentation fault.
[Switching to LWP 844789]
0x00007ffff7dfa321 in ?? () from /lib64/libc.so.6
(gdb) bt
#0 0x00007ffff7dfa321 in ?? () from /lib64/libc.so.6
#1 0x00007ffff7dfa4bb in ?? () from /lib64/libc.so.6
#2 0x00007ffff7f5eaa6 in __libdw_getsrclines (dbg=0x4a7f90, debug_line_offset=10383710, comp_dir=0x7ffff3c29f01 "/var/home/acme/git/build/v5.13.0-rc6+", address_size=address_size@entry=8, linesp=linesp@entry=0x7fffcfe04ba0, filesp=filesp@entry=0x7fffcfe04ba8)
at dwarf_getsrclines.c:1129
#3 0x00007ffff7f5ed14 in dwarf_getsrclines (cudie=cudie@entry=0x7fffd210caf0, lines=lines@entry=0x7fffd210cac0, nlines=nlines@entry=0x7fffd210cac8) at dwarf_getsrclines.c:1213
#4 0x00007ffff7f64883 in dwarf_decl_file (die=<optimized out>) at dwarf_decl_file.c:66
#5 0x0000000000425f24 in tag__init (tag=0x7fff0421b710, cu=0x7fffcc001e40, die=0x7fffd210cd30) at /var/home/acme/git/pahole/dwarf_loader.c:476
#6 0x00000000004262ec in namespace__init (namespace=0x7fff0421b710, die=0x7fffd210cd30, cu=0x7fffcc001e40, conf=0x475600 <conf_load>) at /var/home/acme/git/pahole/dwarf_loader.c:576
#7 0x00000000004263ac in type__init (type=0x7fff0421b710, die=0x7fffd210cd30, cu=0x7fffcc001e40, conf=0x475600 <conf_load>) at /var/home/acme/git/pahole/dwarf_loader.c:595
#8 0x00000000004264d1 in type__new (die=0x7fffd210cd30, cu=0x7fffcc001e40, conf=0x475600 <conf_load>) at /var/home/acme/git/pahole/dwarf_loader.c:614
#9 0x0000000000427ba6 in die__create_new_typedef (die=0x7fffd210cd30, cu=0x7fffcc001e40, conf=0x475600 <conf_load>) at /var/home/acme/git/pahole/dwarf_loader.c:1212
#10 0x0000000000428df5 in __die__process_tag (die=0x7fffd210cd30, cu=0x7fffcc001e40, top_level=1, fn=0x45cee0 <__FUNCTION__.10> "die__process_unit", conf=0x475600 <conf_load>) at /var/home/acme/git/pahole/dwarf_loader.c:1823
#11 0x0000000000428ea1 in die__process_unit (die=0x7fffd210cd30, cu=0x7fffcc001e40, conf=0x475600 <conf_load>) at /var/home/acme/git/pahole/dwarf_loader.c:1848
#12 0x0000000000429e45 in die__process (die=0x7fffd210ce20, cu=0x7fffcc001e40, conf=0x475600 <conf_load>) at /var/home/acme/git/pahole/dwarf_loader.c:2311
#13 0x0000000000429ecb in die__process_and_recode (die=0x7fffd210ce20, cu=0x7fffcc001e40, conf=0x475600 <conf_load>) at /var/home/acme/git/pahole/dwarf_loader.c:2326
#14 0x000000000042a9d6 in dwarf_cus__create_and_process_cu (dcus=0x7fffffffddc0, cu_die=0x7fffd210ce20, pointer_size=8 '\b') at /var/home/acme/git/pahole/dwarf_loader.c:2644
#15 0x000000000042ab28 in dwarf_cus__process_cu_thread (arg=0x7fffffffddc0) at /var/home/acme/git/pahole/dwarf_loader.c:2687
#16 0x00007ffff7ed6299 in start_thread () from /lib64/libpthread.so.0
#17 0x00007ffff7dfe353 in ?? () from /lib64/libc.so.6
(gdb)
(gdb) fr 2
1085
(gdb) list files_lines_compare
1086 static int
1087 files_lines_compare (const void *p1, const void *p2)
1088 {
1089 const struct files_lines_s *t1 = p1;
1090 const struct files_lines_s *t2 = p2;
1091
1092 if (t1->debug_line_offset < t2->debug_line_offset)
(gdb)
1093 return -1;
1094 if (t1->debug_line_offset > t2->debug_line_offset)
1095 return 1;
1096
1097 return 0;
1098 }
1099
1100 int
1101 internal_function
1102 __libdw_getsrclines (Dwarf *dbg, Dwarf_Off debug_line_offset,
(gdb) list __libdw_getsrclines
1100 int
1101 internal_function
1102 __libdw_getsrclines (Dwarf *dbg, Dwarf_Off debug_line_offset,
1103 const char *comp_dir, unsigned address_size,
1104 Dwarf_Lines **linesp, Dwarf_Files **filesp)
1105 {
1106 struct files_lines_s fake = { .debug_line_offset = debug_line_offset };
1107 struct files_lines_s **found = tfind (&fake, &dbg->files_lines,
1108 files_lines_compare);
1109 if (found == NULL)
(gdb)
1110 {
1111 Elf_Data *data = __libdw_checked_get_data (dbg, IDX_debug_line);
1112 if (data == NULL
1113 || __libdw_offset_in_section (dbg, IDX_debug_line,
1114 debug_line_offset, 1) != 0)
1115 return -1;
1116
1117 const unsigned char *linep = data->d_buf + debug_line_offset;
1118 const unsigned char *lineendp = data->d_buf + data->d_size;
1119
(gdb)
1120 struct files_lines_s *node = libdw_alloc (dbg, struct files_lines_s,
1121 sizeof *node, 1);
1122
1123 if (read_srclines (dbg, linep, lineendp, comp_dir, address_size,
1124 &node->lines, &node->files) != 0)
1125 return -1;
1126
1127 node->debug_line_offset = debug_line_offset;
1128
1129 found = tsearch (node, &dbg->files_lines, files_lines_compare);
(gdb)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-12 09:41:13 -03:00
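The workaround is just one mutex serializing the non-thread-safe libdw calls. A sketch, where `decl_file_unlocked()` stands in for `dwarf_decl_file()` (the real fix wraps both `dwarf_decl_file()` and `dwarf_decl_line()` with one such lock in the DWARF loader):

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t libdw__lock = PTHREAD_MUTEX_INITIALIZER;

/* Stand-in for dwarf_decl_file(), which isn't thread-safe because it
 * caches results in the Dwarf handle via tsearch(). */
static const char *decl_file_unlocked(int die)
{
	(void)die;
	return "kernel/fork.c";	/* placeholder result */
}

/* Serialized wrapper: only one loader thread at a time touches the
 * tsearch()-backed cache, curing the SIGSEGV seen with -j N. */
static const char *decl_file(int die)
{
	const char *file;

	pthread_mutex_lock(&libdw__lock);
	file = decl_file_unlocked(die);
	pthread_mutex_unlock(&libdw__lock);
	return file;
}
```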
Arnaldo Carvalho de Melo
dd13708f2f
btfdiff: Use multithreaded DWARF loading
...
There are quite a few cases of types with the same name. I'll add a
--exclude-types option to filter those, and study BTF dedup to see what
it does in this case.
$ btfdiff vmlinux
--- /tmp/btfdiff.dwarf.BgsYYn 2021-07-06 17:03:07.471814114 -0300
+++ /tmp/btfdiff.btf.Ene2Ug 2021-07-06 17:03:07.714819609 -0300
@@ -23627,12 +23627,15 @@ struct deadline_data {
};
struct debug_buffer {
ssize_t (*fill_func)(struct debug_buffer *); /* 0 8 */
- struct ohci_hcd * ohci; /* 8 8 */
+ struct usb_bus * bus; /* 8 8 */
struct mutex mutex; /* 16 32 */
size_t count; /* 48 8 */
- char * page; /* 56 8 */
+ char * output_buf; /* 56 8 */
+ /* --- cacheline 1 boundary (64 bytes) --- */
+ size_t alloc_size; /* 64 8 */
- /* size: 64, cachelines: 1, members: 5 */
+ /* size: 72, cachelines: 2, members: 6 */
+ /* last cacheline: 8 bytes */
};
struct debug_reply_data {
struct ethnl_reply_data base; /* 0 8 */
@@ -47930,11 +47933,12 @@ struct intel_community {
/* last cacheline: 32 bytes */
};
struct intel_community_context {
- u32 * intmask; /* 0 8 */
- u32 * hostown; /* 8 8 */
+ unsigned int intr_lines[16]; /* 0 64 */
+ /* --- cacheline 1 boundary (64 bytes) --- */
+ u32 saved_intmask; /* 64 4 */
- /* size: 16, cachelines: 1, members: 2 */
- /* last cacheline: 16 bytes */
+ /* size: 68, cachelines: 2, members: 2 */
+ /* last cacheline: 4 bytes */
};
struct intel_early_ops {
resource_size_t (*stolen_size)(int, int, int); /* 0 8 */
@@ -52600,64 +52604,19 @@ struct irqtime {
/* size: 24, cachelines: 1, members: 4 */
/* last cacheline: 24 bytes */
};
-struct irte {
- union {
- struct {
- __u64 present:1; /* 0: 0 8 */
- __u64 fpd:1; /* 0: 1 8 */
- __u64 __res0:6; /* 0: 2 8 */
- __u64 avail:4; /* 0: 8 8 */
- __u64 __res1:3; /* 0:12 8 */
- __u64 pst:1; /* 0:15 8 */
- __u64 vector:8; /* 0:16 8 */
- __u64 __res2:40; /* 0:24 8 */
- }; /* 0 8 */
- struct {
- __u64 r_present:1; /* 0: 0 8 */
- __u64 r_fpd:1; /* 0: 1 8 */
- __u64 dst_mode:1; /* 0: 2 8 */
- __u64 redir_hint:1; /* 0: 3 8 */
- __u64 trigger_mode:1; /* 0: 4 8 */
- __u64 dlvry_mode:3; /* 0: 5 8 */
- __u64 r_avail:4; /* 0: 8 8 */
- __u64 r_res0:4; /* 0:12 8 */
- __u64 r_vector:8; /* 0:16 8 */
- __u64 r_res1:8; /* 0:24 8 */
- __u64 dest_id:32; /* 0:32 8 */
- }; /* 0 8 */
- struct {
- __u64 p_present:1; /* 0: 0 8 */
- __u64 p_fpd:1; /* 0: 1 8 */
- __u64 p_res0:6; /* 0: 2 8 */
- __u64 p_avail:4; /* 0: 8 8 */
- __u64 p_res1:2; /* 0:12 8 */
- __u64 p_urgent:1; /* 0:14 8 */
- __u64 p_pst:1; /* 0:15 8 */
- __u64 p_vector:8; /* 0:16 8 */
- __u64 p_res2:14; /* 0:24 8 */
- __u64 pda_l:26; /* 0:38 8 */
- }; /* 0 8 */
- __u64 low; /* 0 8 */
- }; /* 0 8 */
- union {
- struct {
- __u64 sid:16; /* 8: 0 8 */
- __u64 sq:2; /* 8:16 8 */
- __u64 svt:2; /* 8:18 8 */
- __u64 __res3:44; /* 8:20 8 */
- }; /* 8 8 */
- struct {
- __u64 p_sid:16; /* 8: 0 8 */
- __u64 p_sq:2; /* 8:16 8 */
- __u64 p_svt:2; /* 8:18 8 */
- __u64 p_res3:12; /* 8:20 8 */
- __u64 pda_h:32; /* 8:32 8 */
- }; /* 8 8 */
- __u64 high; /* 8 8 */
- }; /* 8 8 */
-
- /* size: 16, cachelines: 1, members: 2 */
- /* last cacheline: 16 bytes */
+union irte {
+ u32 val; /* 0 4 */
+ struct {
+ u32 valid:1; /* 0: 0 4 */
+ u32 no_fault:1; /* 0: 1 4 */
+ u32 int_type:3; /* 0: 2 4 */
+ u32 rq_eoi:1; /* 0: 5 4 */
+ u32 dm:1; /* 0: 6 4 */
+ u32 rsvd_1:1; /* 0: 7 4 */
+ u32 destination:8; /* 0: 8 4 */
+ u32 vector:8; /* 0:16 4 */
+ u32 rsvd_2:8; /* 0:24 4 */
+ } fields; /* 0 4 */
};
struct irte_ga {
union irte_ga_lo lo; /* 0 8 */
@@ -66862,12 +66821,13 @@ struct netlbl_domhsh_tbl {
/* last cacheline: 16 bytes */
};
struct netlbl_domhsh_walk_arg {
- struct netlbl_audit * audit_info; /* 0 8 */
- u32 doi; /* 8 4 */
+ struct netlink_callback * nl_cb; /* 0 8 */
+ struct sk_buff * skb; /* 8 8 */
+ u32 seq; /* 16 4 */
- /* size: 16, cachelines: 1, members: 2 */
+ /* size: 24, cachelines: 1, members: 3 */
/* padding: 4 */
- /* last cacheline: 16 bytes */
+ /* last cacheline: 24 bytes */
};
struct netlbl_dommap_def {
u32 type; /* 0 4 */
@@ -72907,20 +72867,16 @@ struct pci_raw_ops {
/* last cacheline: 16 bytes */
};
struct pci_root_info {
- struct list_head list; /* 0 16 */
- char name[12]; /* 16 12 */
-
- /* XXX 4 bytes hole, try to pack */
-
- struct list_head resources; /* 32 16 */
- struct resource busn; /* 48 64 */
- /* --- cacheline 1 boundary (64 bytes) was 48 bytes ago --- */
- int node; /* 112 4 */
- int link; /* 116 4 */
+ struct acpi_pci_root_info common; /* 0 56 */
+ struct pci_sysdata sd; /* 56 40 */
+ /* --- cacheline 1 boundary (64 bytes) was 32 bytes ago --- */
+ bool mcfg_added; /* 96 1 */
+ u8 start_bus; /* 97 1 */
+ u8 end_bus; /* 98 1 */
- /* size: 120, cachelines: 2, members: 6 */
- /* sum members: 116, holes: 1, sum holes: 4 */
- /* last cacheline: 56 bytes */
+ /* size: 104, cachelines: 2, members: 5 */
+ /* padding: 5 */
+ /* last cacheline: 40 bytes */
};
struct pci_root_res {
struct list_head list; /* 0 16 */
@@ -76415,25 +76371,66 @@ struct pmc_dev {
/* XXX 4 bytes hole, try to pack */
- void * regmap; /* 8 8 */
+ void * regbase; /* 8 8 */
const struct pmc_reg_map * map; /* 16 8 */
struct dentry * dbgfs_dir; /* 24 8 */
- bool init; /* 32 1 */
+ int pmc_xram_read_bit; /* 32 4 */
- /* size: 40, cachelines: 1, members: 5 */
- /* sum members: 29, holes: 1, sum holes: 4 */
- /* padding: 7 */
- /* last cacheline: 40 bytes */
+ /* XXX 4 bytes hole, try to pack */
+
+ struct mutex lock; /* 40 32 */
+ /* --- cacheline 1 boundary (64 bytes) was 8 bytes ago --- */
+ bool check_counters; /* 72 1 */
+
+ /* XXX 7 bytes hole, try to pack */
+
+ u64 pc10_counter; /* 80 8 */
+ u64 s0ix_counter; /* 88 8 */
+ int num_lpm_modes; /* 96 4 */
+ int lpm_en_modes[8]; /* 100 32 */
+
+ /* XXX 4 bytes hole, try to pack */
+
+ /* --- cacheline 2 boundary (128 bytes) was 8 bytes ago --- */
+ u32 * lpm_req_regs; /* 136 8 */
+
+ /* size: 144, cachelines: 3, members: 12 */
+ /* sum members: 125, holes: 4, sum holes: 19 */
+ /* last cacheline: 16 bytes */
};
struct pmc_reg_map {
- const struct pmc_bit_map * d3_sts_0; /* 0 8 */
- const struct pmc_bit_map * d3_sts_1; /* 8 8 */
- const struct pmc_bit_map * func_dis; /* 16 8 */
- const struct pmc_bit_map * func_dis_2; /* 24 8 */
- const struct pmc_bit_map * pss; /* 32 8 */
+ const struct pmc_bit_map * * pfear_sts; /* 0 8 */
+ const struct pmc_bit_map * mphy_sts; /* 8 8 */
+ const struct pmc_bit_map * pll_sts; /* 16 8 */
+ const struct pmc_bit_map * * slps0_dbg_maps; /* 24 8 */
+ const struct pmc_bit_map * ltr_show_sts; /* 32 8 */
+ const struct pmc_bit_map * msr_sts; /* 40 8 */
+ const struct pmc_bit_map * * lpm_sts; /* 48 8 */
+ const u32 slp_s0_offset; /* 56 4 */
+ const int slp_s0_res_counter_step; /* 60 4 */
+ /* --- cacheline 1 boundary (64 bytes) --- */
+ const u32 ltr_ignore_offset; /* 64 4 */
+ const int regmap_length; /* 68 4 */
+ const u32 ppfear0_offset; /* 72 4 */
+ const int ppfear_buckets; /* 76 4 */
+ const u32 pm_cfg_offset; /* 80 4 */
+ const int pm_read_disable_bit; /* 84 4 */
+ const u32 slps0_dbg_offset; /* 88 4 */
+ const u32 ltr_ignore_max; /* 92 4 */
+ const u32 pm_vric1_offset; /* 96 4 */
+ const int lpm_num_maps; /* 100 4 */
+ const int lpm_res_counter_step_x2; /* 104 4 */
+ const u32 lpm_sts_latch_en_offset; /* 108 4 */
+ const u32 lpm_en_offset; /* 112 4 */
+ const u32 lpm_priority_offset; /* 116 4 */
+ const u32 lpm_residency_offset; /* 120 4 */
+ const u32 lpm_status_offset; /* 124 4 */
+ /* --- cacheline 2 boundary (128 bytes) --- */
+ const u32 lpm_live_status_offset; /* 128 4 */
+ const u32 etr3_offset; /* 132 4 */
- /* size: 40, cachelines: 1, members: 5 */
- /* last cacheline: 40 bytes */
+ /* size: 136, cachelines: 3, members: 27 */
+ /* last cacheline: 8 bytes */
};
struct pmic_table {
int address; /* 0 4 */
@@ -114574,12 +114571,18 @@ struct urb {
/* last cacheline: 56 bytes */
};
struct urb_priv {
- int num_tds; /* 0 4 */
- int num_tds_done; /* 4 4 */
- struct xhci_td td[]; /* 8 0 */
+ struct ed * ed; /* 0 8 */
+ u16 length; /* 8 2 */
+ u16 td_cnt; /* 10 2 */
- /* size: 8, cachelines: 1, members: 3 */
- /* last cacheline: 8 bytes */
+ /* XXX 4 bytes hole, try to pack */
+
+ struct list_head pending; /* 16 16 */
+ struct td * td[]; /* 32 0 */
+
+ /* size: 32, cachelines: 1, members: 5 */
+ /* sum members: 28, holes: 1, sum holes: 4 */
+ /* last cacheline: 32 bytes */
};
struct usb2_lpm_parameters {
unsigned int besl; /* 0 4 */
$
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-12 09:41:13 -03:00
Arnaldo Carvalho de Melo
f95f783849
btfdiff: Use --sort for pretty printing from both BTF and DWARF
...
$ btfdiff vmlinux
$
As expected, no change: both sort to the same output. Now let's add
--jobs to the DWARF case.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-12 09:41:13 -03:00
Arnaldo Carvalho de Melo
3e1c7a2077
pahole: Introduce --sort
...
To ask for sorting output, initially by name.
This is needed in 'btfdiff' to diff the output of 'pahole -F dwarf
--jobs N', where N threads go on consuming DWARF compile units and
pretty printing them, producing non-deterministic output.
So we need to sort the output for both BTF and DWARF, and then diff
them.
This is still not enough for some cases where different types have the
same name, such as "usb_priv", which exists in multiple DWARF compile
units; the first one processed "wins", i.e. is the only one considered.
I have to look at how BTF handles this to adopt a similar algorithm and
keep btfdiff usable as a regression test for the BTF and DWARF loaders
and the BTF encoder.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-12 09:41:13 -03:00
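The core of --sort is collecting the per-CU results and ordering them by type name before printing, so -j N runs produce deterministic, diffable output. A sketch of that step, with 'struct structure' simplified to just a name:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct structure {
	const char *name;
};

static int structure__compare(const void *a, const void *b)
{
	const struct structure *sa = a, *sb = b;

	return strcmp(sa->name, sb->name);
}

/* Sort all collected structures by name; printing happens only after
 * every loader thread has finished, making the output order stable. */
static void structures__sort(struct structure *entries, size_t nr)
{
	qsort(entries, nr, sizeof(entries[0]), structure__compare);
}
```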
Arnaldo Carvalho de Melo
967290bc71
pahole: Store the class id in 'struct structure' as well
...
Needed to defer printing classes until we have them all sorted by name
with the upcoming 'pahole --sort' option, which is needed to make it
possible to compare 'pahole -F btf' with 'pahole -F dwarf -j', as the
multithreaded DWARF loader will not produce classes in a deterministic
order. This is needed for 'btfdiff'.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-12 09:41:13 -03:00
Arnaldo Carvalho de Melo
2b45e1b6d0
dwarf_loader: Defer freeing libdw Dwfl handler
...
So that 'pahole --sort -F dwarf' can defer printing all classes until
it has all of them processed and sorted.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-12 09:41:13 -03:00
Arnaldo Carvalho de Melo
35845e7e41
core: Provide a way to store per loader info in cus and an exit function
...
So that loaders such as the DWARF one can store the DWARF handler
(Dwfl) there. It needs to stay live while tools use the core tags
(struct class, struct union, struct tag, etc.) because they point to
strings managed by Dwfl, so we have to defer dwfl_end() until tools
are done processing the core tags.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-12 09:41:13 -03:00
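A sketch of the hook's shape: the loader parks its long-lived state (for DWARF, the Dwfl handle) in the cus and registers an exit function that the core calls only when tools are done with the tags. Field and function names here are illustrative, not pahole's exact API:

```c
#include <assert.h>
#include <stddef.h>

struct cus {
	void *priv;				/* per-loader state */
	void (*loader_exit)(struct cus *cus);	/* deferred teardown */
};

static int exit_calls;

/* Where the DWARF loader would call dwfl_end() on the handle kept alive
 * while tools were still dereferencing Dwfl-managed strings. */
static void dwarf_loader__exit(struct cus *cus)
{
	cus->priv = NULL;
	exit_calls++;
}

/* The core invokes the loader's exit hook only at teardown time. */
static void cus__delete(struct cus *cus)
{
	if (cus->loader_exit)
		cus->loader_exit(cus);
}
```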
Arnaldo Carvalho de Melo
5365c45177
pahole: Keep class + cu in tree of structures
...
We'll use it for ordering by name.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-12 09:41:13 -03:00
Arnaldo Carvalho de Melo
fb99cad539
dwarf_loader: Parallel DWARF loading
...
Tested so far with a typical Linux kernel vmlinux file.
Testing it:
⬢[acme@toolbox pahole]$ perf stat -r5 pahole -F dwarf vmlinux > /dev/null
Performance counter stats for 'pahole -F dwarf vmlinux' (5 runs):
5,675.97 msec task-clock:u # 1.000 CPUs utilized ( +- 0.36% )
0 context-switches:u # 0.000 /sec
0 cpu-migrations:u # 0.000 /sec
736,865 page-faults:u # 129.898 K/sec ( +- 0.00% )
21,921,617,854 cycles:u # 3.864 GHz ( +- 0.23% ) (83.34%)
206,308,275 stalled-cycles-frontend:u # 0.95% frontend cycles idle ( +- 4.59% ) (83.33%)
2,186,772,169 stalled-cycles-backend:u # 10.02% backend cycles idle ( +- 0.46% ) (83.33%)
62,272,507,248 instructions:u # 2.85 insn per cycle
# 0.03 stalled cycles per insn ( +- 0.03% ) (83.34%)
14,967,758,961 branches:u # 2.639 G/sec ( +- 0.03% ) (83.33%)
65,688,710 branch-misses:u # 0.44% of all branches ( +- 0.29% ) (83.33%)
5.6750 +- 0.0203 seconds time elapsed ( +- 0.36% )
⬢[acme@toolbox pahole]$ perf stat -r5 pahole -F dwarf -j12 vmlinux > /dev/null
Performance counter stats for 'pahole -F dwarf -j12 vmlinux' (5 runs):
18,015.77 msec task-clock:u # 7.669 CPUs utilized ( +- 2.49% )
0 context-switches:u # 0.000 /sec
0 cpu-migrations:u # 0.000 /sec
739,157 page-faults:u # 40.726 K/sec ( +- 0.01% )
26,673,502,570 cycles:u # 1.470 GHz ( +- 0.44% ) (83.12%)
734,106,744 stalled-cycles-frontend:u # 2.80% frontend cycles idle ( +- 2.30% ) (83.65%)
2,258,159,917 stalled-cycles-backend:u # 8.60% backend cycles idle ( +- 1.51% ) (83.62%)
63,347,827,742 instructions:u # 2.41 insn per cycle
# 0.04 stalled cycles per insn ( +- 0.03% ) (83.32%)
15,242,840,672 branches:u # 839.841 M/sec ( +- 0.03% ) (83.22%)
73,860,851 branch-misses:u # 0.48% of all branches ( +- 0.51% ) (83.09%)
2.349 +- 0.116 seconds time elapsed ( +- 4.93% )
⬢[acme@toolbox pahole]$
Since this is done in 12 threads and pahole prints as it finishes
processing each CU, the output is no longer deterministically the same
across runs.
I'll add a mode where one can ask for the structures to be kept in a
data structure and sorted before printing, so that btfdiff can use it
with -j and continue working.
Also, since it prints the first struct with a given name, and there are
multiple structs with the same name in the kernel, we get differences
even when we ask just for the sizes (so that we get just one line per
struct):
⬢[acme@toolbox pahole]$ pahole -F dwarf --sizes vmlinux > /tmp/pahole--sizes.txt
⬢[acme@toolbox pahole]$ pahole -F dwarf -j12 --sizes vmlinux > /tmp/pahole--sizes-j12.txt
⬢[acme@toolbox pahole]$ diff -u /tmp/pahole--sizes.txt /tmp/pahole--sizes-j12.txt | head
--- /tmp/pahole--sizes.txt 2021-07-01 21:56:49.260958678 -0300
+++ /tmp/pahole--sizes-j12.txt 2021-07-01 21:57:00.322209241 -0300
@@ -1,20 +1,9 @@
-list_head 16 0
-hlist_head 8 0
-hlist_node 16 0
-callback_head 16 0
-file_system_type 72 1
-qspinlock 4 0
-qrwlock 8 0
⬢[acme@toolbox pahole]$
We can't compare it that way, so let's sort both and then try again:
⬢[acme@toolbox pahole]$ sort /tmp/pahole--sizes.txt > /tmp/pahole--sizes.txt.sorted
⬢[acme@toolbox pahole]$ sort /tmp/pahole--sizes-j12.txt > /tmp/pahole--sizes-j12.txt.sorted
⬢[acme@toolbox pahole]$ diff -u /tmp/pahole--sizes.txt.sorted /tmp/pahole--sizes-j12.txt.sorted
--- /tmp/pahole--sizes.txt.sorted 2021-07-01 21:57:13.841515467 -0300
+++ /tmp/pahole--sizes-j12.txt.sorted 2021-07-01 21:57:16.771581840 -0300
@@ -1116,7 +1116,7 @@
child_latency_info 48 1
chipset 32 1
chksum_ctx 4 0
-chksum_desc_ctx 4 0
+chksum_desc_ctx 2 0
cipher_alg 32 0
cipher_context 16 0
cipher_test_sglists 1184 0
@@ -1589,7 +1589,7 @@
ddebug_query 40 0
ddebug_table 40 1
deadline_data 120 1
-debug_buffer 72 0
+debug_buffer 64 0
debugfs_blob_wrapper 16 0
debugfs_devm_entry 16 0
debugfs_fsdata 48 1
@@ -3291,7 +3291,7 @@
integrity_sysfs_entry 32 0
intel_agp_driver_description 24 1
intel_community 96 1
-intel_community_context 68 0
+intel_community_context 16 0
intel_early_ops 16 0
intel_excl_cntrs 536 0
intel_excl_states 260 0
@@ -3619,7 +3619,7 @@
irqtime 24 0
irq_work 24 0
ir_table 16 0
-irte 4 0
+irte 16 0
irte_ga 16 0
irte_ga_hi 8 0
irte_ga_lo 8 0
@@ -4909,7 +4909,7 @@
pci_platform_pm_ops 64 0
pci_pme_device 24 0
pci_raw_ops 16 0
-pci_root_info 104 0
+pci_root_info 120 1
pci_root_res 80 0
pci_saved_state 64 0
pciserial_board 24 0
@@ -5132,10 +5132,10 @@
pmc_clk 24 0
pmc_clk_data 24 0
pmc_data 16 0
-pmc_dev 144 4
+pmc_dev 40 1
pm_clk_notifier_block 32 0
pm_clock_entry 40 0
-pmc_reg_map 136 0
+pmc_reg_map 40 0
pmic_table 12 0
pm_message 4 0
pm_nl_pernet 80 1
@@ -6388,7 +6388,7 @@
sw842_hlist_node2 24 0
sw842_hlist_node4 24 0
sw842_hlist_node8 32 0
-sw842_param 59496 2
+sw842_param 48 1
swait_queue 24 0
swait_queue_head 24 1
swap_cgroup 2 0
@@ -7942,7 +7942,7 @@
uprobe_trace_entry_head 8 0
uprobe_xol_ops 32 0
urb 184 0
-urb_priv 32 1
+urb_priv 8 0
usb2_lpm_parameters 8 0
usb3_lpm_parameters 16 0
usb_anchor 56 0
⬢[acme@toolbox pahole]$
I'll check them one by one, but they look legit.
Now to fiddle with thread affinities, and then move to threaded BTF
encoding, which in a first test with a single btf_lock in the pahole
stealer ended up producing corrupt BTF, valid only up to a point.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-12 09:41:13 -03:00
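The -j N fan-out boils down to worker threads pulling the next CU from a shared cursor under a mutex and processing it independently. A minimal sketch, with names modeled on the loader but simplified, and the `processed[]` bookkeeping replacing the real per-CU DIE processing:

```c
#include <assert.h>
#include <pthread.h>

#define NR_CUS 64

struct dwarf_cus {
	pthread_mutex_t lock;
	int next_cu;		/* shared cursor into the CU list */
	int processed[NR_CUS];
};

/* Hand out the next unprocessed CU, or -1 when the list is drained. */
static int dwarf_cus__next_cu(struct dwarf_cus *dcus)
{
	int cu;

	pthread_mutex_lock(&dcus->lock);
	cu = dcus->next_cu < NR_CUS ? dcus->next_cu++ : -1;
	pthread_mutex_unlock(&dcus->lock);
	return cu;
}

static void *dwarf_cus__process_cu_thread(void *arg)
{
	struct dwarf_cus *dcus = arg;
	int cu;

	while ((cu = dwarf_cus__next_cu(dcus)) != -1)
		dcus->processed[cu]++;	/* stands in for per-CU DIE processing */
	return NULL;
}

/* Returns 1 if every CU was processed exactly once by the thread pool. */
static int dwarf_cus__process(struct dwarf_cus *dcus, int nr_threads)
{
	pthread_t threads[16];
	int i, ok = 1;

	if (nr_threads > 16)
		return 0;
	for (i = 0; i < nr_threads; i++)
		if (pthread_create(&threads[i], NULL,
				   dwarf_cus__process_cu_thread, dcus) != 0)
			return 0;
	for (i = 0; i < nr_threads; i++)
		pthread_join(threads[i], NULL);
	for (i = 0; i < NR_CUS; i++)
		ok &= dcus->processed[i] == 1;
	return ok;
}
```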
Arnaldo Carvalho de Melo
75d4748861
pahole: Disable parallel BTF encoding for now
...
Introduce parallel DWARF loading first, test it, then move on to using
it together with BTF encoding.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-12 09:41:13 -03:00
Arnaldo Carvalho de Melo
1c60f71daa
pahole: Add locking for the structures list and rbtree
...
Prep work for multithreaded DWARF loading, when there will be concurrent
access to this data structure.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-12 09:41:13 -03:00
Arnaldo Carvalho de Melo
46ad8c0158
dwarf_loader: Introduce 'dwarf_cus' to group all the DWARF specific per-cus state
...
Will help with reuse in the upcoming multithreaded mode.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-12 09:41:13 -03:00
Arnaldo Carvalho de Melo
d963af9fd8
dwarf_loader: Factor common bits for creating and processing CU
...
Will be used for the multithreaded loading.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-12 09:41:13 -03:00
Arnaldo Carvalho de Melo
0c5bf70cc1
fprintf: class__vtable_fprintf() doesn't need a 'cu' arg
...
Another simplification made possible by using a plain char string
instead of string_t, which was only needed in the core as prep work
for CTF encoding.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-12 09:41:13 -03:00
Arnaldo Carvalho de Melo
38ff86b149
fprintf: string_type__fprintf() doesn't need a 'cu' arg
...
Another simplification made possible by using a plain char string
instead of string_t, which was only needed in the core as prep work
for CTF encoding.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-12 09:41:13 -03:00
Arnaldo Carvalho de Melo
a75c342ac2
core: Ditch tag__free_orig_info(), unused
...
Since we stopped using per-CU obstacks we don't need it. If we ever
want it back, we can do per-thread obstacks.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-12 09:41:13 -03:00
Arnaldo Carvalho de Melo
80fe32fd29
core: variable__name() doesn't need a 'cu' arg
...
Another simplification made possible by using a plain char string
instead of string_t, which was only needed in the core as prep work
for CTF encoding.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-12 09:41:13 -03:00
Arnaldo Carvalho de Melo
caa219dffc
core: base_type__name() doesn't need a 'cu' arg
...
Another simplification made possible by using a plain char string
instead of string_t, which was only needed in the core as prep work
for CTF encoding.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-12 09:41:13 -03:00