bfd/
* elf64-ppc.c (ppc64_elf_check_relocs): Set has_gotrel for
R_PPC64_GOT_PCREL34.
(xlate_pcrel_opt): New function.
(ppc64_elf_edit_toc): Handle R_PPC64_GOT_PCREL34.
(ppc64_elf_relocate_section): Edit GOT indirect to GOT relative
for R_PPC64_GOT_PCREL34. Implement R_PPC64_PCREL_OPT optimisation.
ld/
* testsuite/ld-powerpc/pcrelopt.s,
* testsuite/ld-powerpc/pcrelopt.d,
* testsuite/ld-powerpc/pcrelopt.sec: New test.
* testsuite/ld-powerpc/powerpc.exp: Run it.
When optimising inline plt calls to __tls_get_addr without tls marker
relocs, ld should zap any toc restore insn after the bctrl, to stop a
load-hit-store stall.
* elf64-ppc.c (ppc64_elf_relocate_section <tls_ldgd_opt>): When
editing an old-style __tls_get_addr call, replace a toc restore
insn with a nop.
All of the backend relocate_section functions that interpret reloc
numbers assuming the input file is of the expected type (ie. same as
output or very similar) really ought to be checking input file type.
Not many do, and those that do currently just assert. This patch
replaces the assertion with a more graceful exit.
PR 23980
* elf32-i386.c (elf_i386_relocate_section): Exit with wrong format
error rather than asserting input file is as expected.
* elf32-s390.c (elf_s390_relocate_section): Likewise.
* elf32-sh.c (sh_elf_relocate_section): Likewise.
* elf32-xtensa.c (elf_xtensa_relocate_section): Likewise.
* elf64-ppc.c (ppc64_elf_relocate_section): Likewise.
* elf64-s390.c (elf_s390_relocate_section): Likewise.
* elf64-x86-64.c (elf_x86_64_relocate_section): Likewise.
* elf32-ppc.c (ppc_elf_relocate_section): Exit with wrong format
error if input file is not ppc32 ELF.
IFUNC resolvers must always be called via their global entry point.
They will be called from ld.so rather than from the local executable.
PR 23937
bfd/
* elf64-ppc.c (write_plt_relocs_for_local_syms): Don't add local
entry offset for ifuncs.
ld/
* testsuite/ld-powerpc/pr23937.d,
* testsuite/ld-powerpc/pr23937.s: New test.
* testsuite/ld-powerpc/powerpc.exp: Run it.
This PR shows a fuzzed binary triggering a segfault via a bad
relocation in .debug_line. It turns out that unlike normal
relocations applied to a section, the linker applies those with
symbols from discarded sections via _bfd_clear_contents without
checking that the relocation is within the section bounds. The same
thing now happens when reading debug sections since commit
a4cd947aca, the PR23425 fix.
PR 23770
PR 23425
* reloc.c (_bfd_clear_contents): Replace "location" param with
"buf" and "off". Bounds check "off". Return status.
* cofflink.c (_bfd_coff_generic_relocate_section): Update
_bfd_clear_contents call.
* elf-bfd.h (RELOC_AGAINST_DISCARDED_SECTION): Likewise.
* elf32-arc.c (elf_arc_relocate_section): Likewise.
* elf32-i386.c (elf_i386_relocate_section): Likewise.
* elf32-metag.c (metag_final_link_relocate): Likewise.
* elf32-nds32.c (nds32_elf_get_relocated_section_contents): Likewise.
* elf32-ppc.c (ppc_elf_relocate_section): Likewise.
* elf32-visium.c (visium_elf_relocate_section): Likewise.
* elf64-ppc.c (ppc64_elf_relocate_section): Likewise.
* elf64-x86-64.c *(elf_x86_64_relocate_section): Likewise.
* libbfd-in.h (_bfd_clear_contents): Update prototype.
* libbfd.h: Regenerate.
This patch uses the newly defined high-part REL16 relocs to emit
relocations on the notoc stubs as we already do for other stubs.
* elf64-ppc.c (num_relocs_for_offset): New function.
(emit_relocs_for_offset): New function.
(use_global_in_relocs): New function, split out from..
(ppc_build_one_stub): ..here. Output relocations for notoc stubs.
(ppc_size_one_stub): Calculate reloc count for notoc stubs.
(ppc64_elf_size_stubs): Don't count undefined syms in stub_globals.
This patch rearranges ppc_size_one_stub to make it a little easier to
compare against ppc_build_one_stub, and makes a few other random
changes that might help for future maintenance. There should be no
functional changes here.
The patch also fixes code examples in comments. A couple of "ori"
instructions lacked the source register operand, and "@high" is the
correct reloc modifier to use in a sequence building a 64-bit value.
(@hi reports overflow of a 32-bit signed value.)
* elf64-ppc.c: Correct _notoc stub comments.
(ppc_build_one_stub): Simplify output of branch for notoc
long branch stub. Don't include label offset of 8 bytes in
"off" calculation for notoc plt stub. Don't emit insns to get pc.
(build_offset): Emit insns to get pc here instead.
(size_offset): Add 4 extra insns.
(plt_stub_size): Adjust for "off" and size_offset changes.
(ppc_size_one_stub): Rearrange code into a switch, duplicating
some to better match ppc_build_one_stub.
The "-fPIC" and "-mcmodel=small" parts of these messages isn't always
true, so lets dispense with that and just report the type of stub
causing trouble.
* elf64-ppc.c (ppc64_elf_relocate_section): Revise "call lacks
nop" error message.
These take up far too many lines in the files. This patch introduces
a replacement for the HOWTO macro that simplifies the relow howto
initialization. Apart from the two relocs mentioned in the ChangeLog,
no relocation howto is changed.
* elf64-ppc.c (HOW): Define.
(ONES): Delete.
(ppc64_elf_howto_raw): Use HOW to initialize entries.
* elf32-ppc.c (HOW): Define.
(ppc_elf_howto_raw): Use HOW to initialize entries, updating
R_PPC_VLE_REL15 and R_PPC_VLE_REL24 to use bitpos=0.
ppc_stub_long_branch_notoc will never need more than a 32-bit offset
for the r12 offset since the stub target must be in range of a
branch instruction.
* elf64-ppc.c: Correct ppc_stub_long_branch_notoc example.
Formatting.
This patch generates EH info for the new _notoc linkage stubs, to
support unwinding from asynchronous signal handlers. Unwinding
through the __tls_get_addr_opt stub was already supported, but that
was just a single stub. With multiple stubs the EH opcodes need to be
emitted and sized when iterating over stubs, so this is done when
emitting and sizing the stub code. Emitting the CIEs and FDEs is done
when sizing the stubs, as we did before in order to have the linker
generated FDEs indexed in .eh_frame_hdr. I moved the final tweaks to
FDEs from ppc64_elf_finish_dynamic_sections to ppc64_elf_build_stubs
simply because it's tidier to be done with them at that point.
bfd/
* elf64-ppc.c (struct map_stub): Delete tls_get_addr_opt_bctrl.
Add lr_restore, eh_size and eh_base.
(eh_advance, eh_advance_size): New functions.
(build_tls_get_addr_stub): Emit EH info for stub.
(ppc_build_one_stub): Likewise for _notoc stubs.
(ppc_size_one_stub): Size EH info for stub.
(group_sections): Init new map_stub fields.
(stub_eh_frame_size): Delete.
(ppc64_elf_size_stubs): Size EH info for stubs. Set up dummy EH
program for stubs.
(ppc64_elf_build_stubs): Reinit new map_stub fields. Set FDE
offset to stub section here..
(ppc64_elf_finish_dynamic_sections): ..rather than here.
ld/
* testsuite/ld-powerpc/notoc.s: Generate some cfi.
* testsuite/ld-powerpc/notoc.d: Adjust.
* testsuite/ld-powerpc/notoc.wf: New file.
* testsuite/ld-powerpc/powerpc.exp: Run "ext" and "notoc" tests
as run_ld_link_tests rather than run_dump_test.
This patch fixes a bug in the handling of the __tls_get_addr_opt
stub. Calls via this stub don't have a toc restoring instruction
following the "bl", and the stub itself doesn't have an initial toc
save instruction. Thus it is incorrect to skip over the first
instruction when a __tls_get_addr call is marked with a tocsave
reloc.
* elf64-ppc.c (ppc64_elf_relocate_section): Don't skip first
instruction of __tls_get_addr_opt stub.
(plt_stub_size): Omit ALWAYS_EMIT_R2SAVE condition when
dealing with __tls_get_addr_opt stub.
(build_tls_get_addr_stub, ppc_size_one_stub): Likewise.
R_PPC64_REL24_NOTOC is used on calls like "bl foo@notoc" to tell the
linker that linkage stubs for PLT calls or long branches can't use r2
for pic addressing. Instead, new stubs that generate pc-relative
addresses are used. One complication is that pc-relative offsets to
the PLT may need to be 64-bit in large programs, in contrast to the
toc-relative addressing used by older PLT linkage stubs where a 32-bit
offset is sufficient until the PLT itself exceeds 2G in size.
.eh_frame info to cover the _notoc stubs is yet to be implemented.
bfd/
* elf64-ppc.c (ADDI_R12_R11, ADDI_R12_R12, LIS_R12),
(ADDIS_R12_R11, ORIS_R12_R12_0, ORI_R12_R12_0),
(SLDI_R12_R12_32, LDX_R12_R11_R12, ADD_R12_R11_R12): Define.
(ppc64_elf_howto_raw): Add R_PPC64_REL24_NOTOC entry.
(ppc64_elf_reloc_type_lookup): Support R_PPC64_REL24_NOTOC.
(ppc_stub_type): Add ppc_stub_long_branch_notoc,
ppc_stub_long_branch_both, ppc_stub_plt_branch_notoc,
ppc_stub_plt_branch_both, ppc_stub_plt_call_notoc, and
ppc_stub_plt_call_both.
(is_branch_reloc): Add R_PPC64_REL24_NOTOC.
(build_offset, size_offset): New functions.
(plt_stub_size): Support plt_call_notoc and plt_call_both.
(ppc_build_one_stub, ppc_size_one_stub): Support new stubs.
(toc_adjusting_stub_needed): Handle R_PPC64_REL24_NOTOC.
(ppc64_elf_size_stubs): Likewise, and new stubs.
(ppc64_elf_build_stubs, ppc64_elf_relocate_section): Likewise.
* reloc.c: Add BFD_RELOC_PPC64_REL24_NOTOC.
* bfd-in2.h: Regenerate.
* libbfd.h: Regenerate.
gas/
* config/tc-ppc.c (ppc_elf_suffix): Support @notoc.
(ppc_force_relocation, ppc_fix_adjustable): Handle REL24_NOTOC.
ld/
* testsuite/ld-powerpc/ext.d,
* testsuite/ld-powerpc/ext.s,
* testsuite/ld-powerpc/ext.lnk,
* testsuite/ld-powerpc/notoc.d,
* testsuite/ld-powerpc/notoc.s: New tests.
* testsuite/ld-powerpc/powerpc.exp: Run them.
Not a lot is conveyed by putting _r2off in a stub symbol that can't be
seen by inspecting the stub code or the toc restoring instruction
immediately after a call via such a stub. Also, we don't distinguish
plt_call stub symbols from plt_call_r2save stub symbols, so this patch
makes long branch and plt branch stub symbols consistent with that
decision.
bfd/
* elf64-ppc.c (ppc_build_one_stub): Lose "_r2off" in stub symbols.
ld/
* testsuite/ld-powerpc/elfv2exe.d: Adjust for stub symbol change.
* testsuite/ld-powerpc/tocopt6.d: Likewise.
This patch sets stub_offset in ppc_size_one_stub rather than in
ppc_build_one_stub. That allows the plt stub alignment to be done in
just ppc_size_one_stub rather than both functions. The patch also
corrects the place where the alignment was done, fixing a possible
error in .eh_frame data, and tidies some offset calculations.
bfd/
* elf64-ppc.c (plt_stub_pad): Delay plt_stub_size call until needed.
(ppc_build_one_stub): Don't set stub_offset, instead assert that
it is sane. Don't adjust stub_offset for alignment. Adjust size
calculation. Use "targ" temp when calculating offsets.
(ppc_size_one_stub): Set stub_offset here. Use "targ" temp when
calculating offsets. Adjust for alignment before setting
tls_get_addr_opt_bctrl.
ld/
* testsuite/ld-powerpc/powerpc.exp: Run tlsopt5 with plt alignment.
* testsuite/ld-powerpc/tlsopt5.s: Add extra call.
* testsuite/ld-powerpc/tlsopt5.wf: Adjust expected output.
* testsuite/ld-powerpc/tlsopt5.d: Likewise.
This adds support for ".localentry 1", a new st_other
STO_PPC64_LOCAL_MASK encoding that signifies a function with a single
entry point like ".localentry 0", but unlike a ".localentry 0"
function does not preserve r2.
include/
* elf/ppc64.h: Specify byte offset to local entry for values
of two to six in STO_PPC64_LOCAL_MASK. Clarify r2 return
value for such functions when entering via global entry point.
Specify meaning of a value of one in STO_PPC64_LOCAL_MASK.
bfd/
* elf64-ppc.c (ppc64_elf_size_stubs): Use a ppc_stub_long_branch_r2off
for calls to symbols with STO_PPC64_LOCAL_MASK bits set to 1.
gas/
* config/tc-ppc.c (ppc_elf_localentry): Allow .localentry values
of 1 and 7 to directly set value into STO_PPC64_LOCAL_MASK bits.
ld/testsuite/
* ld-powerpc/elfv2.s: Add .localentry f5,1 testcase.
* ld-powerpc/elfv2exe.d: Update.
* ld-powerpc/elfv2so.d: Update.
Fixes a number of build errors like the following
.../elf32-arm.c: In function 'elf32_arm_nabi_write_core_note':
.../elf32-arm.c:2177: error: #pragma GCC diagnostic not allowed inside functions
.../elf32-arm.c:2186: error: #pragma GCC diagnostic not allowed inside functions
See the comment in diagnostics.h.
include/
* diagnostics.h: Comment on macro usage.
bfd/
* elf32-arm.c (elf32_arm_nabi_write_core_note): Don't use
DIAGNOTIC_PUSH and DIAGNOSTIC_POP unconditionally.
* elf32-ppc.c (ppc_elf_write_core_note): Likewise.
* elf32-s390.c (elf_s390_write_core_note): Likewise.
* elf64-ppc.c (ppc64_elf_write_core_note): Likewise.
* elf64-s390.c (elf_s390_write_core_note): Likewise.
* elfxx-aarch64.c (_bfd_aarch64_elf_write_core_note): Likewise.
And report the two input files that are incompatible rather than
reporting that an input file is incompatible with the output.
bfd/
* elf-bfd.h (_bfd_elf_ppc_merge_fp_attributes): Update prototype.
* elf32-ppc.c (_bfd_elf_ppc_merge_fp_attributes): Return error
on mismatch. Remove "warning: " from messages. Track last bfd
used to set tags.
(ppc_elf_merge_obj_attributes): Likewise. Handle status from
_bfd_elf_ppc_merge_fp_attributes.
* elf64-ppc.c (ppc64_elf_merge_private_bfd_data): Handle status
from _bfd_elf_ppc_merge_fp_attributes.
ld/
* testsuite/ld-powerpc/attr-gnu-4-12.d: Update expected output.
* testsuite/ld-powerpc/attr-gnu-4-13.d: Likewise.
* testsuite/ld-powerpc/attr-gnu-4-21.d: Likewise.
* testsuite/ld-powerpc/attr-gnu-4-23.d: Likewise.
* testsuite/ld-powerpc/attr-gnu-4-31.d: Likewise.
* testsuite/ld-powerpc/attr-gnu-4-32.d: Likewise.
* testsuite/ld-powerpc/attr-gnu-8-23.d: Likewise.
* testsuite/ld-powerpc/attr-gnu-12-21.d: Likewise.
.gnu.attributes entries from linker input files are merged to the
output file, the output having the union of compatible input
attributes. Incompatible attributes generally cause a linker error
and no output. However in some cases only a warning is emitted, and
one of the incompatible input attributes is passed on to the output.
PowerPC tends to emit warnings rather than errors, and the output
takes the first input attribute. For example, if we have two input
files with Tag_GNU_Power_ABI_FP, the first with a value signifying
"double-precision hard float, IBM long double", the second with a
value signifying "double-precision hard float, IEEE long double",
we'll get a warning about incompatible long double types and the
output will say "double-precision hard float, IBM long double".
The output attribute of course isn't correct. It would be correct to
specify "IBM and IEEE long double", but we don't have a way to
represent that currently. While it would be possible to extend the
encoding, there isn't much gain in doing so. A shared library
providing support for both long double types should link against
objects using either long double type without warning or error. That
is what you'd get if such a shared library had no Tag_GNU_Power_ABI_FP
attribute.
So this patch provides a way for the backend to omit .gnu.attributes
tags from the output.
* elf-bfd.h (ATTR_TYPE_FLAG_ERROR, ATTR_TYPE_HAS_ERROR): Define.
* elf-attrs.c (is_default_attr): Handle ATTR_TYPE_HAS_ERROR.
* elf32-ppc.c (_bfd_elf_ppc_merge_fp_attributes): Use
ATTR_TYPE_FLAG_INT_VAL. Set ATTR_TYPE_HAS_ERROR on finding
incompatible attribute.
(ppc_elf_merge_obj_attributes): Likewise. Return
_bfd_elf_merge_object_attributes result.
* elf64-ppc.c (ppc64_elf_merge_private_bfd_data): Return
_bfd_elf_merge_object_attributes result.
https://sourceware.org/ml/binutils/2013-05/msg00271.html was supposed
to banish "file format is ambiguous" errors for ELF. It didn't,
because the code supposedly detecting formats that implement
match_priority didn't work. That was due to not placing all matching
targets into the vector of matching targets. ELF objects should all
match the generic ELF target (priority 2), plus one or more machine
specific targets (priority 1), and perhaps a single machine specific
target with OS/ABI set (priority 0, best match). So the armel object
in the testcase actually matches elf32-littlearm,
elf32-littlearm-symbian, and elf32-littlearm-vxworks (all priority 1),
and elf32-little (priority 2). As the PR reported, elf32-little
wasn't seen as matching. Fixing that part of the problem wasn't too
difficult but matching the generic ELF target as well as the ARM ELF
targets resulted in ARM testsuite failures.
These proved to be the annoying reordering of stubs that occurs from
time to time due to the stub names containing the section id.
Matching another target causes more sections to be created in
elf_object_p. If section ids change, stub names change, which results
in different hashing and can therefore result in different hash table
traversal and stub creation order. That particular problem is fixed
by resetting section_id to the initial state before attempting each
target match, and taking a snapshot of its value after a successful
match.
PR 22458
* format.c (struct bfd_preserve): Add section_id.
(bfd_preserve_save, bfd_preserve_restore): Save and restore
_bfd_section_id.
(bfd_reinit): Set _bfd_section_id.
(bfd_check_format_matches): Put all matches of any priority into
matching_vector. Save initial section id and start each attempted
match at that section id.
* libbfd-in.h (_bfd_section_id): Declare.
* section.c (_bfd_section_id): Rename from section_id and make
global. Adjust uses.
(bfd_get_next_section_id): Delete.
* elf64-ppc.c (ppc64_elf_setup_section_lists): Replace use of
bfd_get_section_id with _bfd_section_id.
* libbfd.h: Regenerate.
* bfd-in2.h: Regenerate.
Two of the gcc ifunc tests fail for ppc32, due to my pr22374 fix being
a little too enthusiastic in trimming PLT entries. ppc64 doesn't have
the same failures because ppc64_elf_check_relocs happens to set
needs_plt for any ifunc reloc.
PR 23123
PR 22374
* elf32-ppc.c (ppc_elf_adjust_dynamic_symbol): Don't drop plt
relocs for ifuncs.
* elf64-ppc.c (ppc64_elf_adjust_dynamic_symbol): Comment fixes.
If you create an ifunc using GCC's __attribute__ ifunc, like:
extern int gnu_ifunc (int arg);
static int gnu_ifunc_target (int arg) { return 0; }
__typeof (gnu_ifunc) *gnu_ifunc_resolver (unsigned long hwcap) { return gnu_ifunc_target; }
__typeof (gnu_ifunc) gnu_ifunc __attribute__ ((ifunc ("gnu_ifunc_resolver")));
then you end up with two (function descriptor) symbols, one for the
ifunc itself, and another for the resolver:
(...)
12: 0000000000020060 104 FUNC GLOBAL DEFAULT 18 gnu_ifunc_resolver
(...)
16: 0000000000020060 104 GNU_IFUNC GLOBAL DEFAULT 18 gnu_ifunc
(...)
Both ifunc and resolver symbols have the same address/value, so
ppc64_elf_get_synthetic_symtab only creates a synthetic text symbol
for one of them. In the case above, it ends up being created for the
resolver, only:
(gdb) maint print msymbols
(...)
[ 7] t 0x980 .frame_dummy section .text
[ 8] T 0x9e4 .gnu_ifunc_resolver section .text
[ 9] T 0xa58 __glink_PLTresolve section .text
(...)
GDB needs to know when a program stepped into an ifunc resolver, so
that it can know whether to step past the resolver into the target
function without the user noticing. The way GDB does it is my
checking whether the current PC points to an ifunc symbol (since
resolver and ifunc have the same address by design).
The problem is then that ppc64_elf_get_synthetic_symtab never creates
the synchetic symbol for the ifunc, so GDB stops stepping at the
resolver (in a test added by the following patch):
(gdb) step
gnu_ifunc_resolver (hwcap=21) at gdb/testsuite/gdb.base/gnu-ifunc-lib.c:33
33 {
(gdb) FAIL: gdb.base/gnu-ifunc.exp: resolver_attr=1: resolver_debug=1: final_debug=0: step
After this commit, we get:
[ 8] i 0x9e4 .gnu_ifunc section .text
[ 9] T 0x9e4 .gnu_ifunc_resolver section .text
And stepping an ifunc call takes to the final function:
(gdb) step
0x00000000100009e8 in .final ()
(gdb) PASS: gdb.base/gnu-ifunc.exp: resolver_attr=1: resolver_debug=1: final_debug=0: step
An alternative to touching bfd I considered was for GDB to check
whether there's an ifunc data symbol / function descriptor that points
to the current PC, whenever the program stops, but discarded it
because we'd have to do a linear scan over .opd over an over to find a
matching function descriptor for the current PC. At that point I
considered caching that info, but quickly dismissed it as then that
has no advantage (memory or performance) over just creating the
synthetic ifunc text symbol in the first place.
I ran the binutils and ld testsuites on PPC64 ELFv1 (machine gcc110 on
the GCC compile farm), and saw no regressions. This commit is part of
a GDB patch series that includes GDB tests that fail without this fix.
bfd/ChangeLog:
2018-04-26 Pedro Alves <palves@redhat.com>
* elf64-ppc.c (ppc64_elf_get_synthetic_symtab): Don't consider
ifunc and non-ifunc symbols duplicates.
max-page-size only matters for demand paged executables or shared
libraries, and the ideal size is the largest value used by your
operating system. Values larger than necessary just waste file space
and memory. common-page-size also affects file and memory size,
trading a possible small increase in file size for a decrease in
memory size when the operating system is using a common-page-size
page. With a powerpc max-page-size of 64k and common-page-size of 4k
many executables will use no more memory pages when the system page
size is 4k than an executable linked with -z max-page-size=0x1000,
yet will still run on a system using 64k pages. However, when running
on a system using 64k pages relro protection will not be completely
effective.
Due to the relro problem, powerpc binutils has been using a default
common-page-size of 64k since 2014-12-18 (git commit 04c6a44c7),
leading to complaints about increased file and memory sizes. People
not using relro do have a valid reason to complain..
So this patch introduces an extra back-end value to use as the default
for common-page-size when generating relro executables, and enables
the support for powerpc. Non relro executables will now be generated
with a default common-page-size of 4k.
bfd/
* elf-bfd.h (struct elf_backend_data): Add relropagesize.
* elfxx-target.h (ELF_RELROPAGESIZE): Provide default and
sanity test.
(elfNN_bed): Init relropagesize.
* bfd.c (bfd_emul_get_commonpagesize): Add boolean param to
select relropagesize.
* elf32-ppc.c (ELF_COMMONPAGESIZE): Define as 0x1000.
(ELF_RELROPAGESIZE): Define as ELF_MAXPAGESIZE.
(ELF_MINPAGESIZE): Don't define.
* elf64-ppc.c (ELF_COMMONPAGESIZE): Define as 0x1000.
(ELF_RELROPAGESIZE): Define as ELF_MAXPAGESIZE.
* bfd-in2.h: Regenerate.
ld/
* ldmain.c (main): Move config.maxpagesize and
config.commonpagesize initialization to..
* ldemul.c (after_parse_default): ..here.
* testsuite/ld-powerpc/ppc476-shared.d: Pass -z common-page-size.
* testsuite/ld-powerpc/ppc476-shared2.d: Likewise.
This patch adds the analysis part of PLT call optimization, enabling
the code added with the previous patch that actually performs the
optimization.
Gold support is not available yet.
bfd/
* elf64-ppc.c (struct _ppc64_elf_section_data): Add has_pltcall field.
(struct ppc_link_hash_table): Add can_convert_all_inline_plt.
(ppc64_elf_check_relocs): Set has_pltcall.
(ppc64_elf_adjust_dynamic_symbol): Discard some PLT entries.
(ppc64_elf_inline_plt): New function.
(ppc64_elf_size_dynamic_sections): Discard some PLT entries for locals.
* elf64-ppc.h (ppc64_elf_inline_plt): Declare.
* elf32-ppc.c (has_pltcall): Define.
(struct ppc_elf_link_hash_table): Add can_convert_all_inline_plt.
(ppc_elf_check_relocs): Set has_pltcall.
(ppc_elf_inline_plt): New function.
(ppc_elf_adjust_dynamic_symbol): Discard some PLT entries.
(ppc_elf_size_dynamic_sections): Likewise.
* elf32-ppc.h (ppc_elf_inline_plt): Declare.
ld/
* emultempl/ppc64elf.em (no_inline_plt): New var.
(ppc_before_allocation): Call ppc64_elf_inline_plt.
(enum ppc64_opt): Add OPTION_NO_INLINE_OPT.
(PARSE_AND_LIST_LONGOPTS, PARSE_AND_LIST_OPTIONS,
PARSE_AND_LIST_ARGS_CASES): Handle --no-inline-optimize.
* emultemps/ppc32elf.em (no_inline_opt): New var.
(prelim_size_sections): New function, extracted from..
(ppc_before_allocation): ..here. Call ppc_elf_inline_plt.
(enum ppc32_opt): Add OPTION_NO_INLINE_OPT.
(PARSE_AND_LIST_LONGOPTS, PARSE_AND_LIST_OPTIONS,
PARSE_AND_LIST_ARGS_CASES): Handle --no-inline-optimize.
In addition to the existing relocs we need two more to mark all
instructions in the call sequence, PLTCALL on the call itself (plus
the toc restore insn for ppc64), and PLTSEQ on others. All
relocations in a particular sequence have the same symbol.
Example ppc64 ELFv2 assembly:
.reloc .,R_PPC64_PLTSEQ,puts
std 2,24(1)
addis 12,2,puts@plt@ha # .reloc .,R_PPC64_PLT16_HA,puts
ld 12,puts@plt@l(12) # .reloc .,R_PPC64_PLT16_LO_DS,puts
.reloc .,R_PPC64_PLTSEQ,puts
mtctr 12
.reloc .,R_PPC64_PLTCALL,puts
bctrl
ld 2,24(1)
Example ppc32 -fPIC assembly:
addis 12,30,puts+32768@plt@ha # .reloc .,R_PPC_PLT16_HA,puts+0x8000
lwz 12,12,puts+32768@plt@l # .reloc .,R_PPC_PLT16_LO,puts+0x8000
.reloc .,R_PPC_PLTSEQ,puts+32768
mtctr 12
.reloc .,R_PPC_PLTCALL,puts+32768
bctrl
Marking sequences like this allows the linker to convert them to nops
and a direct call if the target symbol turns out to be local.
When the call is __tls_get_addr, each relocation shown above is paired
with an R_PPC*_TLSLD or R_PPC*_TLSGD reloc to additionally mark the
sequence for possible TLS optimization. The TLSLD or TLSGD relocs are
emitted first.
include/
* elf/ppc.h (R_PPC_PLTSEQ, R_PPC_PLTCALL): Define.
* elf/ppc64.h (R_PPC64_PLTSEQ, R_PPC64_PLTCALL): Define.
bfd/
* elf32-ppc.c (ppc_elf_howto_raw): Add PLTSEQ and PLTCALL howtos.
(is_plt_seq_reloc): New function.
(ppc_elf_check_relocs): Handle PLTSEQ and PLTCALL relocs.
(ppc_elf_tls_optimize): Handle inline plt call sequence.
(ppc_elf_relax_section): Handle PLTCALL reloc.
(ppc_elf_relocate_section): Nop out inline plt call sequence when
resolving locally.
* elf64-ppc.c (ppc64_elf_howto_raw): Add R_PPC64_PLTSEQ and
R_PPC64_PLTCALL entries. Comment R_PPC64_TOCSAVE.
(has_tls_get_addr_call): Correct comment.
(is_branch_reloc): Add PLTCALL.
(is_plt_seq_reloc): New function.
(ppc64_elf_check_relocs): Handle PLT16_LO_DS reloc. Set
has_tls_reloc for R_PPC64_TLSGD and R_PPC64_TLSLD. Create plt
entry for R_PPC64_PLTCALL.
(ppc64_elf_tls_optimize): Handle inline plt call sequence.
(ppc_type_of_stub): Handle PLTCALL reloc.
(toc_adjusting_stub_needed): Likewise.
(ppc64_elf_relocate_section): Set "can_plt_call" for PLTCALL
reloc insn. Nop out inline plt call sequence when resolving
locally. Handle __tls_get_addr inline plt call optimization.
elfcpp/
* powerpc.h (R_POWERPC_PLTSEQ, R_POWERPC_PLTCALL): Define.
gold/
* powerpc.cc (Target_powerpc::Track_tls::maybe_skip_tls_get_addr_call):
Handle inline plt sequence relocs.
(Stub_table::Plt_stub_key::Plt_stub_key): Likewise.
(Target_powerpc::Scan::reloc_needs_plt_for_ifunc): Likewise.
(Target_powerpc::Relocate::relocate): Likewise.
Necessary if gcc is to use PLT16 relocs to implement -mlongcall, and
there isn't a good technical reason why local symbols should be
excluded from PLT16 support. Non-ifunc local symbol PLT entries go in
a separate section to other PLT entries. In a fixed position
executable they won't need to be relocated, and in a PIE or shared
library I chose to not implement lazy relocation.
bfd/
* elf64-ppc.c (LOCAL_PLT_ENTRY_SIZE): Define.
(struct ppc_stub_hash_entry): Add symtype field.
(PLT_KEEP): Define.
(struct ppc_link_hash_table): Add pltlocal and relpltlocal.
(create_linkage_sections): Create pltlocal and relpltlocal.
(ppc64_elf_check_relocs): Allow PLT relocs on local symbols.
Set PLT_KEEP.
(ppc64_elf_adjust_dynamic_symbol): Keep PLT entries for inline calls.
(allocate_dynrelocs): Allocate pltlocal and relpltlocal.
(ppc64_elf_size_dynamic_sections): Size pltlocal and relpltlocal.
Keep PLT entries for inline calls against locals.
(ppc_build_one_stub): Use pltlocal as appropriate.
(ppc_size_one_stub): Likewise.
(ppc64_elf_size_stubs): Set symtype.
(build_global_entry_stubs_and_plt): Init pltlocal and write
relpltlocal for globals.
(write_plt_relocs_for_local_syms): Likewise for local syms.
(ppc64_elf_relocate_section): Support PLT for local syms.
* elf32-ppc.c (PLT_KEEP): Define.
(struct ppc_elf_link_hash_table): Add pltlocal and relpltlocal.
(ppc_elf_create_glink): Create pltlocal and relpltlocal.
(ppc_elf_check_relocs): Allow PLT relocs on local symbols.
Set PLT_KEEP. Adjust update_local_sym_info call.
(ppc_elf_adjust_dynamic_symbol): Keep PLT entries for inline calls.
(allocate_dynrelocs): Allocate pltlocal and relpltlocal.
(ppc_elf_size_dynamic_sections): Size pltlocal and relpltlocal.
(ppc_elf_relocate_section): Support PLT16 relocs for local syms.
(write_global_sym_plt): Init pltlocal and write relpltlocal.
(ppc_finish_symbols): Likewise for locals.
ld/
* emulparams/elf32ppc.sh (OTHER_RELRO_SECTIONS_2): Add .branch_lt.
(OTHER_GOT_RELOC_SECTIONS): Add .rela.branch_lt.
* testsuite/ld-powerpc/elfv2so.d: Update for symbol/stub reordering.
* testsuite/ld-powerpc/relbrlt.d: Likewise.
* testsuite/ld-powerpc/relbrlt.s: Likewise.
* testsuite/ld-powerpc/tlsso.r: Likewise.
* testsuite/ld-powerpc/tlstocso.r: Likewise.
gold/
* powerpc.cc (Target_powerpc::lplt_): New variable.
(Target_powerpc::lplt_section): Associated accessor.
(Target_powerpc::plt_off): Handle local non-ifunc symbols.
(Target_powerpc::make_lplt_section): New function.
(Target_powerpc::make_local_plt_entry): New function.
(Powerpc_relobj::do_relocate_sections): Write out lplt.
(Output_data_plt_powerpc::first_plt_entry_offset): Zero for lplt.
(Output_data_plt_powerpc::add_local_entry): New function.
(Output_data_plt_powerpc::do_write): Ignore lplt.
(Target_powerpc::make_iplt_section): Make lplt first.
(Target_powerpc::make_brlt_section): Make .branch_lt relro.
(Target_powerpc::Scan::local): Handle PLT16 relocs.
The current scheme where we output PLT relocs for global symbols in
finish_dynamic_symbol, and PLT relocs for local symbols when
outputting stubs does not work if PLT entries are to be used for
inline PLT sequences against non-dynamic globals or local symbols.
bfd/
* elf64-ppc.c (ppc_build_one_stub): Move output of PLT relocs
for local symbols to..
(write_plt_relocs_for_local_syms): ..here. New function.
(ppc64_elf_finish_dynamic_symbol): Move output of PLT relocs for
global symbols to..
(build_global_entry_stubs_and_plt): ..here. Rename from
build_global_entry_stubs.
(ppc64_elf_build_stubs): Always call build_global_entry_stubs_and_plt.
Call write_plt_relocs_for_local_syms.
* elf32-ppc.c (get_sym_h): New function.
(ppc_elf_relax_section): Use get_sym_h.
(ppc_elf_relocate_section): Move output of PLT relocs and glink
stubs for local symbols to..
(ppc_finish_symbols): ..here. New function.
(ppc_elf_finish_dynamic_symbol): Move output of PLT relocs for
global syms to..
(write_global_sym_plt): ..here. New function.
* elf32-ppc.h (ppc_elf_modify_segment_map): Delete attribute.
(ppc_finish_symbols): Declare.
ld/
* ppc32elf.em (ppc_finish): Call ppc_finish_symbols.
The PowerPC64 ELFv2 ABI and the PowerPC SysV ABI support a number of
relocations that can be used to create and access a PLT entry.
However, the relocs are not well defined. The PLT16 family of relocs
talk about "the section offset or address of the procedure linkage
table entry". It's plain that we do need a relative address when PIC
as otherwise we'd have dynamic text relocations, but "section offset"
doesn't specify which section. The most obvious one, ".plt", isn't
that useful because there is no readily available way of addressing
the start of the ".plt" section. Much more useful would be "the
GOT/TOC-pointer relative offset of the procedure linkage table entry",
and I suppose you could argue that is a "section offset" of sorts.
For PowerPC64 it is better to use the same TOC-pointer relative
addressing even when non-PIC, since ".plt" may be located outside the
range of a 32-bit address. However, for ppc32 we do want an absolute
address when non-PIC as a GOT pointer may not be set up. Also, for
ppc32 PIC we have a similar situation to R_PPC_PLTREL24 in that the
GOT pointer is set to a location in the .got2 section and we need to
specify the .got2 offset in the PLT16 reloc addend.
This patch supports PLT16 relocations using these semantics. This is
not an ABI change for ppc32 since the relocations were not previously
supported by GNU ld, but is for ppc64 where some of the PLT16 relocs
were supported. I'm not particularly concerned since the old ppc64
PLT16 reloc semantics made them almost completely useless.
bfd/
* elf32-ppc.c (ppc_elf_check_relocs): Handle PLT16 relocs.
(ppc_elf_relocate_section): Likewise.
* elf64-ppc.c (ppc64_elf_check_relocs): Handle PLT16_LO_DS.
(ppc64_elf_relocate_section): Likewise. Correct PLT16
resolution to plt entry relative to toc pointer.
gold/
* powerpc.cc (Target_powerpc::plt_off): New functions.
(is_plt16_reloc): New function.
(Stub_table::plt_off): Use Target_powerpc::plt_off.
(Stub_table::plt_call_size): Use plt_off.
(Stub_table::do_write): Likewise.
(Target_powerpc::Scan::get_reference_flags): Return RELATIVE_REF
for PLT16 relocations.
(Target_powerpc::Scan::reloc_needs_plt_for_ifunc): Return true
for PLT16 relocations.
(Target_powerpc::Scan::global): Make a PLT entry for PLT16 relocations.
(Target_powerpc::Relocate::relocate): Support PLT16 relocations.
(Powerpc_scan_relocatable_reloc::global_strategy): Return RELOC_SPECIAL
for ppc32 plt16 relocs.
It is possible to construct indirect calls to __tls_get_addr in
assembly that confuse TLS optimization. (PowerPC gcc doesn't support
such calls, ignoring -mlongcall for __tls_get_addr.) This patch fixes
the problem by requiring a TLSLD or TLSGD marker reloc before any insn
in an indirect call to __tls_get_addr will be optimized. They also
need additional marker relocs defined in a later patch, so don't
expect the optimization to work just yet. The point here is to
prevent mis-optimization of indirect calls without any marker relocs.
The presense of a marker reloc is tracked by a new bit in the tls_mask
field of ppc_link_hash_entry and the corresponding lgot_masks unsigned
char array for local symbols. Since the field is only 8 bits, we've
run out of space. However, tracking TLS use for variables, and
tracking IFUNC for functions are independent, and bits can be reused.
TLS_TLS is always set for TLS usage, so can be used to select the
meaning of the other bits. This patch does that even for elf32-ppc.c
which hasn't yet run out of space in the field.
* elf64-ppc.c (TLS_TLS, TLS_GD, TLS_LD, TLS_TPREL, TLS_DTPREL,
TLS_TPRELGD, TLS_EXPLICIT): Renumber. Test TLS_TLS throughout
file when other TLS flags are tested in a mask.
(TLS_MARK, NON_GOT): Define.
(PLT_IFUNC): Redefine, and test TLS_TLS throughout file as well.
(update_local_sym_info): Don't create got entry when NON_GOT.
(ppc64_elf_check_relocs): Pass NON_GOT with PLT_IFUNC.
Set TLS_MARK.
(get_tls_mask): Do toc lookup if tls_mask is just TLS_MARK.
(ppc64_elf_relocate_section): Likewise.
(ppc64_elf_tls_optimize): Don't attempt to optimize indirect
__tls_get_addr calls lacking a marker reloc.
* elf32-ppc.c (TLS_TLS, TLS_GD, TLS_LD, TLS_TPREL, TLS_DTPREL,
TLS_TPRELGD): Renumber. Update comment.
(TLS_MARK, NON_GOT): Define.
(PLT_IFUNC): Redefine, and test TLS_TLS throughout file as well.
(update_local_sym_info): Don't create got entry when NON_GOT.
(ppc_elf_check_relocs): Pass NON_GOT with PLT_IFUNC.
Set TLS_MARK.
(ppc_elf_tls_optimize): Don't attempt to optimize indirect
__tls_get_addr calls lacking a marker reloc.
STT_FILE and a bunch of other symbol types aren't proper symbols to
mark the start of a function's code.
* elf64-ppc.c (ppc64_elf_get_synthetic_symtab): Trim uninteresting
symbols. Use size_t counts. Delete redundant opd test.
Commit f15d0b545b trimmed some unnecessary TPREL relocs, but missed
changing another place where they are allocated.
* elf64-ppc.c (ppc_size_one_stub): Fix comment typo.
(ppc64_elf_layout_multitoc): Allocate relocs for tprel as we
do in size_dynamic_sections.
This calculation in relocate_section
if (stub_entry->stub_type == ppc_stub_save_res)
relocation += (stub_sec->output_offset
+ stub_sec->output_section->vma
+ stub_sec->size - htab->sfpr->size
- htab->sfpr->output_offset
- htab->sfpr->output_section->vma);
to adjust from the original out-of-line save/restore function address
in sfpr to a copy at the end of stub_sec goes wrong when stub_sec is
padded, because the copy is no longer at the end of stub_sec. The
solution is to pad before copying sfpr, so the copy is always at the
end of stub_sec.
* elf64-ppc.c (sfpr_define): Adjust for stub_sec size having
sfpr size added before defining alias symbols.
(ppc64_elf_build_stubs): Add stub section padding before
copying sfpr contents and defining save/restore alias symbols.
The GNU coding standard says error messages should be of the form
program:sourcefile:lineno: message
or
program: message
and
"The string message should not begin with a capital letter when it
follows a program name and/or file name, because that isn’t the
beginning of a sentence. (The sentence conceptually starts at the
beginning of the line.) Also, it should not end with a period."
This patch does that for ppc, and removes some British spelling.
I've also switched some error output from using the linker callback
einfo to _bfd_error_handler, due to improved compilation time
argument checking now done for the latter function.
bfd/
* elf32-ppc.c: Standardize error/warning messages. Use
_bfd_error_handler rather than einfo when einfo features not used.
* elf64-ppc.c: Likewise.
ld/
* testsuite/ld-powerpc/attr-gnu-12-21.d: Update.
* testsuite/ld-powerpc/attr-gnu-4-12.d: Update.
* testsuite/ld-powerpc/attr-gnu-4-13.d: Update.
* testsuite/ld-powerpc/attr-gnu-4-21.d: Update.
* testsuite/ld-powerpc/attr-gnu-4-23.d: Update.
* testsuite/ld-powerpc/attr-gnu-4-31.d: Update.
* testsuite/ld-powerpc/attr-gnu-4-32.d: Update.
* testsuite/ld-powerpc/attr-gnu-8-23.d: Update.