btf_encoder: Move libbtf.c to btf_encoder.c, the only user of its functions
...
All those functions now operate on a 'struct btf_encoder' object, so there
is no need to make them visible outside the btf_encoder.c source file.
Move them all there and make them static.
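As a minimal sketch of what internal linkage buys, assuming nothing about the real code: 'struct encoder_sketch' and both functions below are hypothetical names made up for illustration, not pahole's. The static helper is only reachable from this translation unit, so the compiler sees every caller and is free to inline it and drop the out-of-line copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an encoder object; the field is illustrative only. */
struct encoder_sketch {
	uint32_t nr_types;
};

/*
 * 'static' gives this helper internal linkage: no other source file can
 * call it, so the compiler knows all its callers and may inline it.
 */
static uint32_t encoder_sketch__next_id(struct encoder_sketch *enc)
{
	assert(enc != NULL);
	return ++enc->nr_types;
}

/* The only externally visible function in this sketch. */
uint32_t encoder_sketch__add_two(struct encoder_sketch *enc)
{
	encoder_sketch__next_id(enc);
	return encoder_sketch__next_id(enc);
}
```

Whether a given helper is actually inlined depends on the compiler and the optimization level; the sketch only shows the linkage change that makes it possible.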
This leads to some savings, as the compiler is free to optimize further,
for instance by inlining functions used in just one place:
Before, for encoding then reading we have:
⬢[acme@toolbox pahole]$ rm -f vmlinux.btf ; perf stat -r5 pahole -j vmlinux.btf vmlinux && perf stat -r5 btfdiff vmlinux vmlinux.btf
Performance counter stats for 'pahole -j vmlinux.btf vmlinux' (5 runs):
8,546.56 msec task-clock:u # 0.989 CPUs utilized ( +- 0.71% )
0 context-switches:u # 0.000 /sec
0 cpu-migrations:u # 0.000 /sec
775,699 page-faults:u # 89.802 K/sec ( +- 0.00% )
34,082,471,148 cycles:u # 3.946 GHz ( +- 0.22% ) (83.33%)
636,039,662 stalled-cycles-frontend:u # 1.87% frontend cycles idle ( +- 1.69% ) (83.33%)
4,895,524,778 stalled-cycles-backend:u # 14.38% backend cycles idle ( +- 2.10% ) (83.33%)
77,379,632,646 instructions:u # 2.27 insn per cycle
# 0.07 stalled cycles per insn ( +- 0.04% ) (83.33%)
18,185,560,802 branches:u # 2.105 G/sec ( +- 0.03% ) (83.34%)
149,715,849 branch-misses:u # 0.82% of all branches ( +- 0.15% ) (83.34%)
8.6412 +- 0.0612 seconds time elapsed ( +- 0.71% )
Performance counter stats for 'btfdiff vmlinux vmlinux.btf' (5 runs):
7,168.97 msec task-clock:u # 1.016 CPUs utilized ( +- 0.50% )
0 context-switches:u # 0.000 /sec
0 cpu-migrations:u # 0.000 /sec
727,965 page-faults:u # 103.257 K/sec ( +- 0.00% )
27,339,019,686 cycles:u # 3.878 GHz ( +- 0.17% ) (83.28%)
511,689,773 stalled-cycles-frontend:u # 1.88% frontend cycles idle ( +- 1.84% ) (83.34%)
3,677,090,126 stalled-cycles-backend:u # 13.53% backend cycles idle ( +- 1.47% ) (83.35%)
66,182,032,226 instructions:u # 2.44 insn per cycle
# 0.06 stalled cycles per insn ( +- 0.02% ) (83.35%)
15,747,149,247 branches:u # 2.234 G/sec ( +- 0.02% ) (83.36%)
98,013,024 branch-misses:u # 0.62% of all branches ( +- 0.21% ) (83.33%)
7.0554 +- 0.0357 seconds time elapsed ( +- 0.51% )
⬢[acme@toolbox pahole]$
Then, with this patch:
⬢[acme@toolbox pahole]$ rm -f vmlinux.btf ; perf stat -r5 pahole -j vmlinux.btf vmlinux && perf stat -r5 btfdiff vmlinux vmlinux.btf
Performance counter stats for 'pahole -j vmlinux.btf vmlinux' (5 runs):
8,280.48 msec task-clock:u # 0.975 CPUs utilized ( +- 0.72% )
0 context-switches:u # 0.000 /sec
0 cpu-migrations:u # 0.000 /sec
775,699 page-faults:u # 91.481 K/sec ( +- 0.00% )
33,265,078,702 cycles:u # 3.923 GHz ( +- 0.32% ) (83.32%)
725,690,346 stalled-cycles-frontend:u # 2.16% frontend cycles idle ( +- 1.76% ) (83.34%)
4,803,211,469 stalled-cycles-backend:u # 14.33% backend cycles idle ( +- 2.43% ) (83.34%)
77,162,277,929 instructions:u # 2.30 insn per cycle
# 0.07 stalled cycles per insn ( +- 0.06% ) (83.34%)
18,139,715,894 branches:u # 2.139 G/sec ( +- 0.03% ) (83.34%)
149,609,552 branch-misses:u # 0.82% of all branches ( +- 0.16% ) (83.33%)
8.4921 +- 0.0630 seconds time elapsed ( +- 0.74% )
Performance counter stats for 'btfdiff vmlinux vmlinux.btf' (5 runs):
7,018.11 msec task-clock:u # 1.013 CPUs utilized ( +- 0.68% )
0 context-switches:u # 0.000 /sec
0 cpu-migrations:u # 0.000 /sec
727,949 page-faults:u # 105.207 K/sec ( +- 0.00% )
26,632,191,985 cycles:u # 3.849 GHz ( +- 0.31% ) (83.35%)
496,648,058 stalled-cycles-frontend:u # 1.87% frontend cycles idle ( +- 2.02% ) (83.29%)
3,437,243,040 stalled-cycles-backend:u # 12.92% backend cycles idle ( +- 0.90% ) (83.33%)
66,192,034,237 instructions:u # 2.49 insn per cycle
# 0.05 stalled cycles per insn ( +- 0.03% ) (83.34%)
15,750,883,004 branches:u # 2.276 G/sec ( +- 0.03% ) (83.35%)
97,544,298 branch-misses:u # 0.62% of all branches ( +- 0.12% ) (83.36%)
6.9247 +- 0.0478 seconds time elapsed ( +- 0.69% )
⬢[acme@toolbox pahole]$
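To put the 'seconds time elapsed' figures above side by side, the relative change can be computed with a small helper. This is only a sketch: 'pct_reduction' is a hypothetical name, and the inputs are the four elapsed times reported by perf stat above.

```c
#include <assert.h>

/*
 * Hypothetical helper: percentage by which 'after' is smaller than
 * 'before'. A negative result would mean a slowdown.
 */
static double pct_reduction(double before, double after)
{
	assert(before > 0.0);
	return (before - after) / before * 100.0;
}
```

With the numbers above, pct_reduction(8.6412, 8.4921) is about 1.7% for the encoding run and pct_reduction(7.0554, 6.9247) is about 1.85% for btfdiff, i.e. only a few times larger than the reported +- 0.5% to 0.7% run-to-run variation.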
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>