This patch rearranges ppc_size_one_stub to make it a little easier to
compare against ppc_build_one_stub, and makes a few other random
changes that might help for future maintenance. There should be no
functional changes here.
The patch also fixes code examples in comments. A couple of "ori"
instructions lacked the source register operand, and "@high" is the
correct reloc modifier to use in a sequence building a 64-bit value.
(@hi reports overflow of a 32-bit signed value.)
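As a rough illustration of the difference between the two modifiers, the sketch below models the 16-bit fields used when building a 64-bit value with a lis/ori/sldi/oris/ori sequence. The function names are made up for this example and are not taken from elf64-ppc.c. The bits selected by @hi and @high are the same; the only difference modelled here is the signed 32-bit overflow check that @hi carries.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative field extractors; names are hypothetical.  */
uint16_t field_l (uint64_t x)       { return x & 0xffff; }
uint16_t field_high (uint64_t x)    { return (x >> 16) & 0xffff; }
uint16_t field_higher (uint64_t x)  { return (x >> 32) & 0xffff; }
uint16_t field_highest (uint64_t x) { return (x >> 48) & 0xffff; }

/* @hi selects the same bits as @high, but the whole value must fit
   in a signed 32-bit quantity or an overflow is reported.  */
bool
hi_overflows (uint64_t x)
{
  int64_t v = (int64_t) x;
  return v < INT32_MIN || v > INT32_MAX;
}

/* Model of: lis; ori; sldi 32; oris; ori.  */
uint64_t
build64 (uint64_t x)
{
  /* lis: sign-extended immediate shifted left 16.  */
  uint64_t r = (uint64_t) (int64_t) (int16_t) field_highest (x) << 16;
  r |= field_higher (x);			/* ori  */
  r <<= 32;					/* sldi */
  r |= (uint64_t) field_high (x) << 16;		/* oris */
  r |= field_l (x);				/* ori  */
  return r;
}
```

Any value with bits set above bit 31 makes hi_overflows return true, which is why @high is the one to use in the middle of a 64-bit sequence.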
* elf64-ppc.c: Correct _notoc stub comments.
(ppc_build_one_stub): Simplify output of branch for notoc
long branch stub. Don't include label offset of 8 bytes in
"off" calculation for notoc plt stub. Don't emit insns to get pc.
(build_offset): Emit insns to get pc here instead.
(size_offset): Add 4 extra insns.
(plt_stub_size): Adjust for "off" and size_offset changes.
(ppc_size_one_stub): Rearrange code into a switch, duplicating
some to better match ppc_build_one_stub.
The "-fPIC" and "-mcmodel=small" parts of these messages aren't always
true, so let's dispense with them and just report the type of stub
causing trouble.
* elf64-ppc.c (ppc64_elf_relocate_section): Revise "call lacks
nop" error message.
These take up far too many lines in the files. This patch introduces
a replacement for the HOWTO macro that simplifies the reloc howto
initialization. Apart from the two relocs mentioned in the ChangeLog,
no relocation howto is changed.
* elf64-ppc.c (HOW): Define.
(ONES): Delete.
(ppc64_elf_howto_raw): Use HOW to initialize entries.
* elf32-ppc.c (HOW): Define.
(ppc_elf_howto_raw): Use HOW to initialize entries, updating
R_PPC_VLE_REL15 and R_PPC_VLE_REL24 to use bitpos=0.
ppc_stub_long_branch_notoc will never need more than a 32-bit offset
for the r12 offset since the stub target must be in range of a
branch instruction.
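The reasoning can be checked with a little arithmetic. A PowerPC I-form branch carries a 24-bit word offset, so its byte displacement is a signed 26-bit multiple of 4. The sketch below, with hypothetical helper names, shows that anything reachable by such a branch trivially fits in 32 bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Displacement reachable by a "b" instruction: -0x2000000 .. 0x1fffffc,
   word aligned.  Helper names are illustrative only.  */
bool
in_branch_range (int64_t off)
{
  return off >= -0x2000000 && off <= 0x1fffffc && (off & 3) == 0;
}

/* True if OFF can be represented as a signed 32-bit value.  */
bool
fits_signed_32 (int64_t off)
{
  return off >= INT32_MIN && off <= INT32_MAX;
}
```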
* elf64-ppc.c: Correct ppc_stub_long_branch_notoc example.
Formatting.
This patch generates EH info for the new _notoc linkage stubs, to
support unwinding from asynchronous signal handlers. Unwinding
through the __tls_get_addr_opt stub was already supported, but that
was just a single stub. With multiple stubs the EH opcodes need to be
emitted and sized when iterating over stubs, so this is done when
emitting and sizing the stub code. Emitting the CIEs and FDEs is done
when sizing the stubs, as we did before in order to have the linker
generated FDEs indexed in .eh_frame_hdr. I moved the final tweaks to
FDEs from ppc64_elf_finish_dynamic_sections to ppc64_elf_build_stubs
simply because it's tidier to be done with them at that point.
bfd/
* elf64-ppc.c (struct map_stub): Delete tls_get_addr_opt_bctrl.
Add lr_restore, eh_size and eh_base.
(eh_advance, eh_advance_size): New functions.
(build_tls_get_addr_stub): Emit EH info for stub.
(ppc_build_one_stub): Likewise for _notoc stubs.
(ppc_size_one_stub): Size EH info for stub.
(group_sections): Init new map_stub fields.
(stub_eh_frame_size): Delete.
(ppc64_elf_size_stubs): Size EH info for stubs. Set up dummy EH
program for stubs.
(ppc64_elf_build_stubs): Reinit new map_stub fields. Set FDE
offset to stub section here..
(ppc64_elf_finish_dynamic_sections): ..rather than here.
ld/
* testsuite/ld-powerpc/notoc.s: Generate some cfi.
* testsuite/ld-powerpc/notoc.d: Adjust.
* testsuite/ld-powerpc/notoc.wf: New file.
* testsuite/ld-powerpc/powerpc.exp: Run "ext" and "notoc" tests
as run_ld_link_tests rather than run_dump_test.
This patch fixes a bug in the handling of the __tls_get_addr_opt
stub. Calls via this stub don't have a toc restoring instruction
following the "bl", and the stub itself doesn't have an initial toc
save instruction. Thus it is incorrect to skip over the first
instruction when a __tls_get_addr call is marked with a tocsave
reloc.
* elf64-ppc.c (ppc64_elf_relocate_section): Don't skip first
instruction of __tls_get_addr_opt stub.
(plt_stub_size): Omit ALWAYS_EMIT_R2SAVE condition when
dealing with __tls_get_addr_opt stub.
(build_tls_get_addr_stub, ppc_size_one_stub): Likewise.
R_PPC64_REL24_NOTOC is used on calls like "bl foo@notoc" to tell the
linker that linkage stubs for PLT calls or long branches can't use r2
for pic addressing. Instead, new stubs that generate pc-relative
addresses are used. One complication is that pc-relative offsets to
the PLT may need to be 64-bit in large programs, in contrast to the
toc-relative addressing used by older PLT linkage stubs where a 32-bit
offset is sufficient until the PLT itself exceeds 2G in size.
.eh_frame info to cover the _notoc stubs is yet to be implemented.
bfd/
* elf64-ppc.c (ADDI_R12_R11, ADDI_R12_R12, LIS_R12),
(ADDIS_R12_R11, ORIS_R12_R12_0, ORI_R12_R12_0),
(SLDI_R12_R12_32, LDX_R12_R11_R12, ADD_R12_R11_R12): Define.
(ppc64_elf_howto_raw): Add R_PPC64_REL24_NOTOC entry.
(ppc64_elf_reloc_type_lookup): Support R_PPC64_REL24_NOTOC.
(ppc_stub_type): Add ppc_stub_long_branch_notoc,
ppc_stub_long_branch_both, ppc_stub_plt_branch_notoc,
ppc_stub_plt_branch_both, ppc_stub_plt_call_notoc, and
ppc_stub_plt_call_both.
(is_branch_reloc): Add R_PPC64_REL24_NOTOC.
(build_offset, size_offset): New functions.
(plt_stub_size): Support plt_call_notoc and plt_call_both.
(ppc_build_one_stub, ppc_size_one_stub): Support new stubs.
(toc_adjusting_stub_needed): Handle R_PPC64_REL24_NOTOC.
(ppc64_elf_size_stubs): Likewise, and new stubs.
(ppc64_elf_build_stubs, ppc64_elf_relocate_section): Likewise.
* reloc.c: Add BFD_RELOC_PPC64_REL24_NOTOC.
* bfd-in2.h: Regenerate.
* libbfd.h: Regenerate.
gas/
* config/tc-ppc.c (ppc_elf_suffix): Support @notoc.
(ppc_force_relocation, ppc_fix_adjustable): Handle REL24_NOTOC.
ld/
* testsuite/ld-powerpc/ext.d,
* testsuite/ld-powerpc/ext.s,
* testsuite/ld-powerpc/ext.lnk,
* testsuite/ld-powerpc/notoc.d,
* testsuite/ld-powerpc/notoc.s: New tests.
* testsuite/ld-powerpc/powerpc.exp: Run them.
Not a lot is conveyed by putting _r2off in a stub symbol that can't be
seen by inspecting the stub code or the toc restoring instruction
immediately after a call via such a stub. Also, we don't distinguish
plt_call stub symbols from plt_call_r2save stub symbols, so this patch
makes long branch and plt branch stub symbols consistent with that
decision.
bfd/
* elf64-ppc.c (ppc_build_one_stub): Lose "_r2off" in stub symbols.
ld/
* testsuite/ld-powerpc/elfv2exe.d: Adjust for stub symbol change.
* testsuite/ld-powerpc/tocopt6.d: Likewise.
This patch sets stub_offset in ppc_size_one_stub rather than in
ppc_build_one_stub. That allows the plt stub alignment to be done in
just ppc_size_one_stub rather than both functions. The patch also
corrects the place where the alignment was done, fixing a possible
error in .eh_frame data, and tidies some offset calculations.
bfd/
* elf64-ppc.c (plt_stub_pad): Delay plt_stub_size call until needed.
(ppc_build_one_stub): Don't set stub_offset, instead assert that
it is sane. Don't adjust stub_offset for alignment. Adjust size
calculation. Use "targ" temp when calculating offsets.
(ppc_size_one_stub): Set stub_offset here. Use "targ" temp when
calculating offsets. Adjust for alignment before setting
tls_get_addr_opt_bctrl.
ld/
* testsuite/ld-powerpc/powerpc.exp: Run tlsopt5 with plt alignment.
* testsuite/ld-powerpc/tlsopt5.s: Add extra call.
* testsuite/ld-powerpc/tlsopt5.wf: Adjust expected output.
* testsuite/ld-powerpc/tlsopt5.d: Likewise.
This adds support for ".localentry 1", a new st_other
STO_PPC64_LOCAL_MASK encoding that signifies a function with a single
entry point like ".localentry 0", but unlike a ".localentry 0"
function does not preserve r2.
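For reference, here is a minimal sketch of how the three-bit field might be interpreted. The macro and function names are invented for the example (the real definitions live in include/elf/ppc64.h), and the field position is my assumption about the st_other layout. A value of 7 is not modelled, since its meaning is not described here.

```c
/* Hypothetical names; the 3-bit field is assumed to sit in the top
   bits of st_other.  */
#define SK_LOCAL_BIT  5
#define SK_LOCAL_MASK (7u << SK_LOCAL_BIT)

/* Extract the 3-bit local entry field from st_other.  */
unsigned int
sk_local_entry_field (unsigned char st_other)
{
  return (st_other & SK_LOCAL_MASK) >> SK_LOCAL_BIT;
}

/* Byte offset from global to local entry point for FIELD in 0..6:
   0 and 1 give 0, then 2..6 give 4, 8, 16, 32, 64.  */
unsigned int
sk_local_entry_offset (unsigned int field)
{
  return ((1u << field) >> 2) << 2;
}

/* A value of one: single entry point, but r2 is not preserved.  */
int
sk_single_entry_clobbers_r2 (unsigned int field)
{
  return field == 1;
}
```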
include/
* elf/ppc64.h: Specify byte offset to local entry for values
of two to six in STO_PPC64_LOCAL_MASK. Clarify r2 return
value for such functions when entering via global entry point.
Specify meaning of a value of one in STO_PPC64_LOCAL_MASK.
bfd/
* elf64-ppc.c (ppc64_elf_size_stubs): Use a ppc_stub_long_branch_r2off
for calls to symbols with STO_PPC64_LOCAL_MASK bits set to 1.
gas/
* config/tc-ppc.c (ppc_elf_localentry): Allow .localentry values
of 1 and 7 to directly set value into STO_PPC64_LOCAL_MASK bits.
ld/testsuite/
* ld-powerpc/elfv2.s: Add .localentry f5,1 testcase.
* ld-powerpc/elfv2exe.d: Update.
* ld-powerpc/elfv2so.d: Update.
Fixes a number of build errors like the following:
.../elf32-arm.c: In function 'elf32_arm_nabi_write_core_note':
.../elf32-arm.c:2177: error: #pragma GCC diagnostic not allowed inside functions
.../elf32-arm.c:2186: error: #pragma GCC diagnostic not allowed inside functions
See the comment in diagnostics.h.
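The general shape of the fix is to expand the diagnostic macros only on compilers that both need the suppression and accept the pragma inside a function body. The sketch below uses stand-in macros (SK_DIAG_*) and a made-up function rather than the real diagnostics.h names, and the version test is an assumption chosen for illustration.

```c
#include <string.h>

/* Stand-ins for the diagnostics.h macros; only active on a GCC new
   enough to know -Wstringop-truncation, which is also new enough to
   allow the pragma inside a function.  */
#if defined (__GNUC__) && !defined (__clang__) && __GNUC__ >= 8
# define SK_DIAG_PUSH _Pragma ("GCC diagnostic push")
# define SK_DIAG_POP _Pragma ("GCC diagnostic pop")
# define SK_DIAG_IGNORE_TRUNC \
   _Pragma ("GCC diagnostic ignored \"-Wstringop-truncation\"")
#else
# define SK_DIAG_PUSH
# define SK_DIAG_POP
# define SK_DIAG_IGNORE_TRUNC
#endif

/* Fill a fixed-width note field that need not be NUL terminated.  */
void
sk_fill_note_name (char field[16], const char *name)
{
  memset (field, 0, 16);
  SK_DIAG_PUSH
  SK_DIAG_IGNORE_TRUNC
  strncpy (field, name, 16);
  SK_DIAG_POP
}
```

On older compilers the macros expand to nothing, so no pragma ever appears inside the function.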
include/
* diagnostics.h: Comment on macro usage.
bfd/
* elf32-arm.c (elf32_arm_nabi_write_core_note): Don't use
DIAGNOSTIC_PUSH and DIAGNOSTIC_POP unconditionally.
* elf32-ppc.c (ppc_elf_write_core_note): Likewise.
* elf32-s390.c (elf_s390_write_core_note): Likewise.
* elf64-ppc.c (ppc64_elf_write_core_note): Likewise.
* elf64-s390.c (elf_s390_write_core_note): Likewise.
* elfxx-aarch64.c (_bfd_aarch64_elf_write_core_note): Likewise.
And report the two input files that are incompatible rather than
reporting that an input file is incompatible with the output.
bfd/
* elf-bfd.h (_bfd_elf_ppc_merge_fp_attributes): Update prototype.
* elf32-ppc.c (_bfd_elf_ppc_merge_fp_attributes): Return error
on mismatch. Remove "warning: " from messages. Track last bfd
used to set tags.
(ppc_elf_merge_obj_attributes): Likewise. Handle status from
_bfd_elf_ppc_merge_fp_attributes.
* elf64-ppc.c (ppc64_elf_merge_private_bfd_data): Handle status
from _bfd_elf_ppc_merge_fp_attributes.
ld/
* testsuite/ld-powerpc/attr-gnu-4-12.d: Update expected output.
* testsuite/ld-powerpc/attr-gnu-4-13.d: Likewise.
* testsuite/ld-powerpc/attr-gnu-4-21.d: Likewise.
* testsuite/ld-powerpc/attr-gnu-4-23.d: Likewise.
* testsuite/ld-powerpc/attr-gnu-4-31.d: Likewise.
* testsuite/ld-powerpc/attr-gnu-4-32.d: Likewise.
* testsuite/ld-powerpc/attr-gnu-8-23.d: Likewise.
* testsuite/ld-powerpc/attr-gnu-12-21.d: Likewise.
.gnu.attributes entries from linker input files are merged to the
output file, the output having the union of compatible input
attributes. Incompatible attributes generally cause a linker error
and no output. However in some cases only a warning is emitted, and
one of the incompatible input attributes is passed on to the output.
PowerPC tends to emit warnings rather than errors, and the output
takes the first input attribute. For example, if we have two input
files with Tag_GNU_Power_ABI_FP, the first with a value signifying
"double-precision hard float, IBM long double", the second with a
value signifying "double-precision hard float, IEEE long double",
we'll get a warning about incompatible long double types and the
output will say "double-precision hard float, IBM long double".
The output attribute of course isn't correct. It would be correct to
specify "IBM and IEEE long double", but we don't have a way to
represent that currently. While it would be possible to extend the
encoding, there isn't much gain in doing so. A shared library
providing support for both long double types should link against
objects using either long double type without warning or error. That
is what you'd get if such a shared library had no Tag_GNU_Power_ABI_FP
attribute.
So this patch provides a way for the backend to omit .gnu.attributes
tags from the output.
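A much simplified model of the idea follows. The struct and function names are hypothetical and do not match the BFD obj_attribute API; real merging works on bit fields within the tag value rather than on whole values. The point is only that an attribute flagged as in error is treated like a default one and so is not written.

```c
#include <stdbool.h>

/* Hypothetical, simplified attribute record.  */
struct sk_attr
{
  bool has_error;	/* Set when inputs disagreed.  */
  unsigned int value;	/* Zero means "not specified".  */
};

/* Merge one input value into OUT.  Returns false on a mismatch so
   the caller can report it.  */
bool
sk_merge (struct sk_attr *out, unsigned int in)
{
  if (in == 0 || out->has_error)
    return true;
  if (out->value == 0)
    {
      out->value = in;
      return true;
    }
  if (out->value != in)
    {
      out->has_error = true;
      return false;
    }
  return true;
}

/* Default attributes are not emitted; neither are flagged ones.  */
bool
sk_omit_from_output (const struct sk_attr *a)
{
  return a->has_error || a->value == 0;
}
```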
* elf-bfd.h (ATTR_TYPE_FLAG_ERROR, ATTR_TYPE_HAS_ERROR): Define.
* elf-attrs.c (is_default_attr): Handle ATTR_TYPE_HAS_ERROR.
* elf32-ppc.c (_bfd_elf_ppc_merge_fp_attributes): Use
ATTR_TYPE_FLAG_INT_VAL. Set ATTR_TYPE_HAS_ERROR on finding
incompatible attribute.
(ppc_elf_merge_obj_attributes): Likewise. Return
_bfd_elf_merge_object_attributes result.
* elf64-ppc.c (ppc64_elf_merge_private_bfd_data): Return
_bfd_elf_merge_object_attributes result.
https://sourceware.org/ml/binutils/2013-05/msg00271.html was supposed
to banish "file format is ambiguous" errors for ELF. It didn't,
because the code supposedly detecting formats that implement
match_priority didn't work. That was due to not placing all matching
targets into the vector of matching targets. ELF objects should all
match the generic ELF target (priority 2), plus one or more machine
specific targets (priority 1), and perhaps a single machine specific
target with OS/ABI set (priority 0, best match). So the armel object
in the testcase actually matches elf32-littlearm,
elf32-littlearm-symbian, and elf32-littlearm-vxworks (all priority 1),
and elf32-little (priority 2). As the PR reported, elf32-little
wasn't seen as matching. Fixing that part of the problem wasn't too
difficult but matching the generic ELF target as well as the ARM ELF
targets resulted in ARM testsuite failures.
These proved to be the annoying reordering of stubs that occurs from
time to time due to the stub names containing the section id.
Matching another target causes more sections to be created in
elf_object_p. If section ids change, stub names change, which results
in different hashing and can therefore result in different hash table
traversal and stub creation order. That particular problem is fixed
by resetting section_id to the initial state before attempting each
target match, and taking a snapshot of its value after a successful
match.
PR 22458
* format.c (struct bfd_preserve): Add section_id.
(bfd_preserve_save, bfd_preserve_restore): Save and restore
_bfd_section_id.
(bfd_reinit): Set _bfd_section_id.
(bfd_check_format_matches): Put all matches of any priority into
matching_vector. Save initial section id and start each attempted
match at that section id.
* libbfd-in.h (_bfd_section_id): Declare.
* section.c (_bfd_section_id): Rename from section_id and make
global. Adjust uses.
(bfd_get_next_section_id): Delete.
* elf64-ppc.c (ppc64_elf_setup_section_lists): Replace use of
bfd_get_section_id with _bfd_section_id.
* libbfd.h: Regenerate.
* bfd-in2.h: Regenerate.
Two of the gcc ifunc tests fail for ppc32, due to my pr22374 fix being
a little too enthusiastic in trimming PLT entries. ppc64 doesn't have
the same failures because ppc64_elf_check_relocs happens to set
needs_plt for any ifunc reloc.
PR 23123
PR 22374
* elf32-ppc.c (ppc_elf_adjust_dynamic_symbol): Don't drop plt
relocs for ifuncs.
* elf64-ppc.c (ppc64_elf_adjust_dynamic_symbol): Comment fixes.
If you create an ifunc using GCC's __attribute__ ifunc, like:
extern int gnu_ifunc (int arg);
static int gnu_ifunc_target (int arg) { return 0; }
__typeof (gnu_ifunc) *gnu_ifunc_resolver (unsigned long hwcap) { return gnu_ifunc_target; }
__typeof (gnu_ifunc) gnu_ifunc __attribute__ ((ifunc ("gnu_ifunc_resolver")));
then you end up with two (function descriptor) symbols, one for the
ifunc itself, and another for the resolver:
(...)
12: 0000000000020060 104 FUNC GLOBAL DEFAULT 18 gnu_ifunc_resolver
(...)
16: 0000000000020060 104 GNU_IFUNC GLOBAL DEFAULT 18 gnu_ifunc
(...)
Both ifunc and resolver symbols have the same address/value, so
ppc64_elf_get_synthetic_symtab only creates a synthetic text symbol
for one of them. In the case above, it ends up being created for the
resolver, only:
(gdb) maint print msymbols
(...)
[ 7] t 0x980 .frame_dummy section .text
[ 8] T 0x9e4 .gnu_ifunc_resolver section .text
[ 9] T 0xa58 __glink_PLTresolve section .text
(...)
GDB needs to know when a program stepped into an ifunc resolver, so
that it can know whether to step past the resolver into the target
function without the user noticing. The way GDB does it is by
checking whether the current PC points to an ifunc symbol (since
resolver and ifunc have the same address by design).
The problem is then that ppc64_elf_get_synthetic_symtab never creates
the synthetic symbol for the ifunc, so GDB stops stepping at the
resolver (in a test added by the following patch):
(gdb) step
gnu_ifunc_resolver (hwcap=21) at gdb/testsuite/gdb.base/gnu-ifunc-lib.c:33
33 {
(gdb) FAIL: gdb.base/gnu-ifunc.exp: resolver_attr=1: resolver_debug=1: final_debug=0: step
After this commit, we get:
[ 8] i 0x9e4 .gnu_ifunc section .text
[ 9] T 0x9e4 .gnu_ifunc_resolver section .text
And stepping an ifunc call takes you to the final function:
(gdb) step
0x00000000100009e8 in .final ()
(gdb) PASS: gdb.base/gnu-ifunc.exp: resolver_attr=1: resolver_debug=1: final_debug=0: step
An alternative to touching bfd I considered was for GDB to check
whether there's an ifunc data symbol / function descriptor that points
to the current PC, whenever the program stops, but discarded it
because we'd have to do a linear scan over .opd over and over to find a
matching function descriptor for the current PC. At that point I
considered caching that info, but quickly dismissed it as then that
has no advantage (memory or performance) over just creating the
synthetic ifunc text symbol in the first place.
I ran the binutils and ld testsuites on PPC64 ELFv1 (machine gcc110 on
the GCC compile farm), and saw no regressions. This commit is part of
a GDB patch series that includes GDB tests that fail without this fix.
bfd/ChangeLog:
2018-04-26 Pedro Alves <palves@redhat.com>
* elf64-ppc.c (ppc64_elf_get_synthetic_symtab): Don't consider
ifunc and non-ifunc symbols duplicates.
max-page-size only matters for demand paged executables or shared
libraries, and the ideal size is the largest value used by your
operating system. Values larger than necessary just waste file space
and memory. common-page-size also affects file and memory size,
trading a possible small increase in file size for a decrease in
memory size when the operating system is using a common-page-size
page. With a powerpc max-page-size of 64k and common-page-size of 4k
many executables will use no more memory pages when the system page
size is 4k than an executable linked with -z max-page-size=0x1000,
yet will still run on a system using 64k pages. However, when running
on a system using 64k pages relro protection will not be completely
effective.
Due to the relro problem, powerpc binutils has been using a default
common-page-size of 64k since 2014-12-18 (git commit 04c6a44c7),
leading to complaints about increased file and memory sizes. People
not using relro do have a valid reason to complain.
So this patch introduces an extra back-end value to use as the default
for common-page-size when generating relro executables, and enables
the support for powerpc. Non relro executables will now be generated
with a default common-page-size of 4k.
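In outline, the default selection might look like the sketch below. The struct and function are illustrative rather than the actual elf_backend_data layout or bfd_emul_get_commonpagesize signature; the powerpc numbers are the ones given above (64k max-page-size, 4k common-page-size, relro page size equal to max-page-size).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-target page size record.  */
struct sk_page_sizes
{
  uint64_t maxpagesize;
  uint64_t commonpagesize;
  uint64_t relropagesize;
};

/* powerpc values as described in this patch.  */
const struct sk_page_sizes sk_ppc_sizes = { 0x10000, 0x1000, 0x10000 };

/* Default common-page-size: the relro value when generating a relro
   executable, otherwise the ordinary common page size.  */
uint64_t
sk_default_common_page_size (const struct sk_page_sizes *p, bool relro)
{
  return relro ? p->relropagesize : p->commonpagesize;
}
```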
bfd/
* elf-bfd.h (struct elf_backend_data): Add relropagesize.
* elfxx-target.h (ELF_RELROPAGESIZE): Provide default and
sanity test.
(elfNN_bed): Init relropagesize.
* bfd.c (bfd_emul_get_commonpagesize): Add boolean param to
select relropagesize.
* elf32-ppc.c (ELF_COMMONPAGESIZE): Define as 0x1000.
(ELF_RELROPAGESIZE): Define as ELF_MAXPAGESIZE.
(ELF_MINPAGESIZE): Don't define.
* elf64-ppc.c (ELF_COMMONPAGESIZE): Define as 0x1000.
(ELF_RELROPAGESIZE): Define as ELF_MAXPAGESIZE.
* bfd-in2.h: Regenerate.
ld/
* ldmain.c (main): Move config.maxpagesize and
config.commonpagesize initialization to..
* ldemul.c (after_parse_default): ..here.
* testsuite/ld-powerpc/ppc476-shared.d: Pass -z common-page-size.
* testsuite/ld-powerpc/ppc476-shared2.d: Likewise.
This patch adds the analysis part of PLT call optimization, enabling
the code added with the previous patch that actually performs the
optimization.
Gold support is not available yet.
bfd/
* elf64-ppc.c (struct _ppc64_elf_section_data): Add has_pltcall field.
(struct ppc_link_hash_table): Add can_convert_all_inline_plt.
(ppc64_elf_check_relocs): Set has_pltcall.
(ppc64_elf_adjust_dynamic_symbol): Discard some PLT entries.
(ppc64_elf_inline_plt): New function.
(ppc64_elf_size_dynamic_sections): Discard some PLT entries for locals.
* elf64-ppc.h (ppc64_elf_inline_plt): Declare.
* elf32-ppc.c (has_pltcall): Define.
(struct ppc_elf_link_hash_table): Add can_convert_all_inline_plt.
(ppc_elf_check_relocs): Set has_pltcall.
(ppc_elf_inline_plt): New function.
(ppc_elf_adjust_dynamic_symbol): Discard some PLT entries.
(ppc_elf_size_dynamic_sections): Likewise.
* elf32-ppc.h (ppc_elf_inline_plt): Declare.
ld/
* emultempl/ppc64elf.em (no_inline_plt): New var.
(ppc_before_allocation): Call ppc64_elf_inline_plt.
(enum ppc64_opt): Add OPTION_NO_INLINE_OPT.
(PARSE_AND_LIST_LONGOPTS, PARSE_AND_LIST_OPTIONS,
PARSE_AND_LIST_ARGS_CASES): Handle --no-inline-optimize.
* emultempl/ppc32elf.em (no_inline_opt): New var.
(prelim_size_sections): New function, extracted from..
(ppc_before_allocation): ..here. Call ppc_elf_inline_plt.
(enum ppc32_opt): Add OPTION_NO_INLINE_OPT.
(PARSE_AND_LIST_LONGOPTS, PARSE_AND_LIST_OPTIONS,
PARSE_AND_LIST_ARGS_CASES): Handle --no-inline-optimize.
In addition to the existing relocs we need two more to mark all
instructions in the call sequence, PLTCALL on the call itself (plus
the toc restore insn for ppc64), and PLTSEQ on others. All
relocations in a particular sequence have the same symbol.
Example ppc64 ELFv2 assembly:
.reloc .,R_PPC64_PLTSEQ,puts
std 2,24(1)
addis 12,2,puts@plt@ha # .reloc .,R_PPC64_PLT16_HA,puts
ld 12,puts@plt@l(12) # .reloc .,R_PPC64_PLT16_LO_DS,puts
.reloc .,R_PPC64_PLTSEQ,puts
mtctr 12
.reloc .,R_PPC64_PLTCALL,puts
bctrl
ld 2,24(1)
Example ppc32 -fPIC assembly:
addis 12,30,puts+32768@plt@ha # .reloc .,R_PPC_PLT16_HA,puts+0x8000
lwz 12,puts+32768@plt@l(12) # .reloc .,R_PPC_PLT16_LO,puts+0x8000
.reloc .,R_PPC_PLTSEQ,puts+32768
mtctr 12
.reloc .,R_PPC_PLTCALL,puts+32768
bctrl
Marking sequences like this allows the linker to convert them to nops
and a direct call if the target symbol turns out to be local.
When the call is __tls_get_addr, each relocation shown above is paired
with an R_PPC*_TLSLD or R_PPC*_TLSGD reloc to additionally mark the
sequence for possible TLS optimization. The TLSLD or TLSGD relocs are
emitted first.
include/
* elf/ppc.h (R_PPC_PLTSEQ, R_PPC_PLTCALL): Define.
* elf/ppc64.h (R_PPC64_PLTSEQ, R_PPC64_PLTCALL): Define.
bfd/
* elf32-ppc.c (ppc_elf_howto_raw): Add PLTSEQ and PLTCALL howtos.
(is_plt_seq_reloc): New function.
(ppc_elf_check_relocs): Handle PLTSEQ and PLTCALL relocs.
(ppc_elf_tls_optimize): Handle inline plt call sequence.
(ppc_elf_relax_section): Handle PLTCALL reloc.
(ppc_elf_relocate_section): Nop out inline plt call sequence when
resolving locally.
* elf64-ppc.c (ppc64_elf_howto_raw): Add R_PPC64_PLTSEQ and
R_PPC64_PLTCALL entries. Comment R_PPC64_TOCSAVE.
(has_tls_get_addr_call): Correct comment.
(is_branch_reloc): Add PLTCALL.
(is_plt_seq_reloc): New function.
(ppc64_elf_check_relocs): Handle PLT16_LO_DS reloc. Set
has_tls_reloc for R_PPC64_TLSGD and R_PPC64_TLSLD. Create plt
entry for R_PPC64_PLTCALL.
(ppc64_elf_tls_optimize): Handle inline plt call sequence.
(ppc_type_of_stub): Handle PLTCALL reloc.
(toc_adjusting_stub_needed): Likewise.
(ppc64_elf_relocate_section): Set "can_plt_call" for PLTCALL
reloc insn. Nop out inline plt call sequence when resolving
locally. Handle __tls_get_addr inline plt call optimization.
elfcpp/
* powerpc.h (R_POWERPC_PLTSEQ, R_POWERPC_PLTCALL): Define.
gold/
* powerpc.cc (Target_powerpc::Track_tls::maybe_skip_tls_get_addr_call):
Handle inline plt sequence relocs.
(Stub_table::Plt_stub_key::Plt_stub_key): Likewise.
(Target_powerpc::Scan::reloc_needs_plt_for_ifunc): Likewise.
(Target_powerpc::Relocate::relocate): Likewise.
Necessary if gcc is to use PLT16 relocs to implement -mlongcall, and
there isn't a good technical reason why local symbols should be
excluded from PLT16 support. Non-ifunc local symbol PLT entries go in
a separate section to other PLT entries. In a fixed position
executable they won't need to be relocated, and in a PIE or shared
library I chose to not implement lazy relocation.
bfd/
* elf64-ppc.c (LOCAL_PLT_ENTRY_SIZE): Define.
(struct ppc_stub_hash_entry): Add symtype field.
(PLT_KEEP): Define.
(struct ppc_link_hash_table): Add pltlocal and relpltlocal.
(create_linkage_sections): Create pltlocal and relpltlocal.
(ppc64_elf_check_relocs): Allow PLT relocs on local symbols.
Set PLT_KEEP.
(ppc64_elf_adjust_dynamic_symbol): Keep PLT entries for inline calls.
(allocate_dynrelocs): Allocate pltlocal and relpltlocal.
(ppc64_elf_size_dynamic_sections): Size pltlocal and relpltlocal.
Keep PLT entries for inline calls against locals.
(ppc_build_one_stub): Use pltlocal as appropriate.
(ppc_size_one_stub): Likewise.
(ppc64_elf_size_stubs): Set symtype.
(build_global_entry_stubs_and_plt): Init pltlocal and write
relpltlocal for globals.
(write_plt_relocs_for_local_syms): Likewise for local syms.
(ppc64_elf_relocate_section): Support PLT for local syms.
* elf32-ppc.c (PLT_KEEP): Define.
(struct ppc_elf_link_hash_table): Add pltlocal and relpltlocal.
(ppc_elf_create_glink): Create pltlocal and relpltlocal.
(ppc_elf_check_relocs): Allow PLT relocs on local symbols.
Set PLT_KEEP. Adjust update_local_sym_info call.
(ppc_elf_adjust_dynamic_symbol): Keep PLT entries for inline calls.
(allocate_dynrelocs): Allocate pltlocal and relpltlocal.
(ppc_elf_size_dynamic_sections): Size pltlocal and relpltlocal.
(ppc_elf_relocate_section): Support PLT16 relocs for local syms.
(write_global_sym_plt): Init pltlocal and write relpltlocal.
(ppc_finish_symbols): Likewise for locals.
ld/
* emulparams/elf32ppc.sh (OTHER_RELRO_SECTIONS_2): Add .branch_lt.
(OTHER_GOT_RELOC_SECTIONS): Add .rela.branch_lt.
* testsuite/ld-powerpc/elfv2so.d: Update for symbol/stub reordering.
* testsuite/ld-powerpc/relbrlt.d: Likewise.
* testsuite/ld-powerpc/relbrlt.s: Likewise.
* testsuite/ld-powerpc/tlsso.r: Likewise.
* testsuite/ld-powerpc/tlstocso.r: Likewise.
gold/
* powerpc.cc (Target_powerpc::lplt_): New variable.
(Target_powerpc::lplt_section): Associated accessor.
(Target_powerpc::plt_off): Handle local non-ifunc symbols.
(Target_powerpc::make_lplt_section): New function.
(Target_powerpc::make_local_plt_entry): New function.
(Powerpc_relobj::do_relocate_sections): Write out lplt.
(Output_data_plt_powerpc::first_plt_entry_offset): Zero for lplt.
(Output_data_plt_powerpc::add_local_entry): New function.
(Output_data_plt_powerpc::do_write): Ignore lplt.
(Target_powerpc::make_iplt_section): Make lplt first.
(Target_powerpc::make_brlt_section): Make .branch_lt relro.
(Target_powerpc::Scan::local): Handle PLT16 relocs.
The current scheme where we output PLT relocs for global symbols in
finish_dynamic_symbol, and PLT relocs for local symbols when
outputting stubs does not work if PLT entries are to be used for
inline PLT sequences against non-dynamic globals or local symbols.
bfd/
* elf64-ppc.c (ppc_build_one_stub): Move output of PLT relocs
for local symbols to..
(write_plt_relocs_for_local_syms): ..here. New function.
(ppc64_elf_finish_dynamic_symbol): Move output of PLT relocs for
global symbols to..
(build_global_entry_stubs_and_plt): ..here. Rename from
build_global_entry_stubs.
(ppc64_elf_build_stubs): Always call build_global_entry_stubs_and_plt.
Call write_plt_relocs_for_local_syms.
* elf32-ppc.c (get_sym_h): New function.
(ppc_elf_relax_section): Use get_sym_h.
(ppc_elf_relocate_section): Move output of PLT relocs and glink
stubs for local symbols to..
(ppc_finish_symbols): ..here. New function.
(ppc_elf_finish_dynamic_symbol): Move output of PLT relocs for
global syms to..
(write_global_sym_plt): ..here. New function.
* elf32-ppc.h (ppc_elf_modify_segment_map): Delete attribute.
(ppc_finish_symbols): Declare.
ld/
* ppc32elf.em (ppc_finish): Call ppc_finish_symbols.
The PowerPC64 ELFv2 ABI and the PowerPC SysV ABI support a number of
relocations that can be used to create and access a PLT entry.
However, the relocs are not well defined. The PLT16 family of relocs
talk about "the section offset or address of the procedure linkage
table entry". It's plain that we do need a relative address when PIC
as otherwise we'd have dynamic text relocations, but "section offset"
doesn't specify which section. The most obvious one, ".plt", isn't
that useful because there is no readily available way of addressing
the start of the ".plt" section. Much more useful would be "the
GOT/TOC-pointer relative offset of the procedure linkage table entry",
and I suppose you could argue that is a "section offset" of sorts.
For PowerPC64 it is better to use the same TOC-pointer relative
addressing even when non-PIC, since ".plt" may be located outside the
range of a 32-bit address. However, for ppc32 we do want an absolute
address when non-PIC as a GOT pointer may not be set up. Also, for
ppc32 PIC we have a similar situation to R_PPC_PLTREL24 in that the
GOT pointer is set to a location in the .got2 section and we need to
specify the .got2 offset in the PLT16 reloc addend.
This patch supports PLT16 relocations using these semantics. This is
not an ABI change for ppc32 since the relocations were not previously
supported by GNU ld, but is for ppc64 where some of the PLT16 relocs
were supported. I'm not particularly concerned since the old ppc64
PLT16 reloc semantics made them almost completely useless.
bfd/
* elf32-ppc.c (ppc_elf_check_relocs): Handle PLT16 relocs.
(ppc_elf_relocate_section): Likewise.
* elf64-ppc.c (ppc64_elf_check_relocs): Handle PLT16_LO_DS.
(ppc64_elf_relocate_section): Likewise. Correct PLT16
resolution to plt entry relative to toc pointer.
gold/
* powerpc.cc (Target_powerpc::plt_off): New functions.
(is_plt16_reloc): New function.
(Stub_table::plt_off): Use Target_powerpc::plt_off.
(Stub_table::plt_call_size): Use plt_off.
(Stub_table::do_write): Likewise.
(Target_powerpc::Scan::get_reference_flags): Return RELATIVE_REF
for PLT16 relocations.
(Target_powerpc::Scan::reloc_needs_plt_for_ifunc): Return true
for PLT16 relocations.
(Target_powerpc::Scan::global): Make a PLT entry for PLT16 relocations.
(Target_powerpc::Relocate::relocate): Support PLT16 relocations.
(Powerpc_scan_relocatable_reloc::global_strategy): Return RELOC_SPECIAL
for ppc32 plt16 relocs.
It is possible to construct indirect calls to __tls_get_addr in
assembly that confuse TLS optimization. (PowerPC gcc doesn't support
such calls, ignoring -mlongcall for __tls_get_addr.) This patch fixes
the problem by requiring a TLSLD or TLSGD marker reloc before any insn
in an indirect call to __tls_get_addr will be optimized. They also
need additional marker relocs defined in a later patch, so don't
expect the optimization to work just yet. The point here is to
prevent mis-optimization of indirect calls without any marker relocs.
The presence of a marker reloc is tracked by a new bit in the tls_mask
field of ppc_link_hash_entry and the corresponding lgot_masks unsigned
char array for local symbols. Since the field is only 8 bits, we've
run out of space. However, tracking TLS use for variables, and
tracking IFUNC for functions are independent, and bits can be reused.
TLS_TLS is always set for TLS usage, so can be used to select the
meaning of the other bits. This patch does that even for elf32-ppc.c
which hasn't yet run out of space in the field.
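To show how one bit can select the meaning of the others, here is a small sketch. The bit values below are illustrative only and are not the renumbered values from the patch; the point is that every test of a shared bit must also test the selector bit.

```c
#include <stdbool.h>

/* Illustrative values, not those used in elf64-ppc.c.  */
#define SK_TLS_TLS   0x01	/* Selector: entry describes TLS use.  */
#define SK_TLS_GD    0x02	/* Meaningful only with SK_TLS_TLS set.  */
#define SK_PLT_IFUNC 0x02	/* Same bit, with SK_TLS_TLS clear.  */

/* True if MASK records a TLS GD reference.  */
bool
sk_is_tls_gd (unsigned char mask)
{
  return (mask & (SK_TLS_TLS | SK_TLS_GD)) == (SK_TLS_TLS | SK_TLS_GD);
}

/* True if MASK records an ifunc: shared bit set, selector clear.  */
bool
sk_is_ifunc (unsigned char mask)
{
  return (mask & (SK_TLS_TLS | SK_PLT_IFUNC)) == SK_PLT_IFUNC;
}
```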
* elf64-ppc.c (TLS_TLS, TLS_GD, TLS_LD, TLS_TPREL, TLS_DTPREL,
TLS_TPRELGD, TLS_EXPLICIT): Renumber. Test TLS_TLS throughout
file when other TLS flags are tested in a mask.
(TLS_MARK, NON_GOT): Define.
(PLT_IFUNC): Redefine, and test TLS_TLS throughout file as well.
(update_local_sym_info): Don't create got entry when NON_GOT.
(ppc64_elf_check_relocs): Pass NON_GOT with PLT_IFUNC.
Set TLS_MARK.
(get_tls_mask): Do toc lookup if tls_mask is just TLS_MARK.
(ppc64_elf_relocate_section): Likewise.
(ppc64_elf_tls_optimize): Don't attempt to optimize indirect
__tls_get_addr calls lacking a marker reloc.
* elf32-ppc.c (TLS_TLS, TLS_GD, TLS_LD, TLS_TPREL, TLS_DTPREL,
TLS_TPRELGD): Renumber. Update comment.
(TLS_MARK, NON_GOT): Define.
(PLT_IFUNC): Redefine, and test TLS_TLS throughout file as well.
(update_local_sym_info): Don't create got entry when NON_GOT.
(ppc_elf_check_relocs): Pass NON_GOT with PLT_IFUNC.
Set TLS_MARK.
(ppc_elf_tls_optimize): Don't attempt to optimize indirect
__tls_get_addr calls lacking a marker reloc.
STT_FILE and a bunch of other symbol types aren't proper symbols to
mark the start of a function's code.
* elf64-ppc.c (ppc64_elf_get_synthetic_symtab): Trim uninteresting
symbols. Use size_t counts. Delete redundant opd test.
Commit f15d0b545b trimmed some unnecessary TPREL relocs, but missed
changing another place where they are allocated.
* elf64-ppc.c (ppc_size_one_stub): Fix comment typo.
(ppc64_elf_layout_multitoc): Allocate relocs for tprel as we
do in size_dynamic_sections.
This calculation in relocate_section
if (stub_entry->stub_type == ppc_stub_save_res)
relocation += (stub_sec->output_offset
+ stub_sec->output_section->vma
+ stub_sec->size - htab->sfpr->size
- htab->sfpr->output_offset
- htab->sfpr->output_section->vma);
to adjust from the original out-of-line save/restore function address
in sfpr to a copy at the end of stub_sec goes wrong when stub_sec is
padded, because the copy is no longer at the end of stub_sec. The
solution is to pad before copying sfpr, so the copy is always at the
end of stub_sec.
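A simplified model of the two orderings, written as a sketch in C, shows why the order matters. The ex_ helpers and the idea of a single alignment value are assumptions made for the example; only the "size minus sfpr size" assumption comes from the calculation quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Round OFF up to a multiple of ALIGN, which must be a power of two.  */
static uint64_t
ex_align_up (uint64_t off, uint64_t align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return (off + align - 1) & ~(align - 1);
}

/* Where relocate_section assumes the sfpr copy starts: SFPR_SIZE bytes
   before the end of the stub section.  */
static uint64_t
ex_assumed_copy_start (uint64_t final_size, uint64_t sfpr_size)
{
  return final_size - sfpr_size;
}

/* Old order: append the copy, then pad.  Returns the final section size
   and stores the real start of the copy in *COPY_START.  */
static uint64_t
ex_copy_then_pad (uint64_t stubs, uint64_t sfpr_size, uint64_t align,
                  uint64_t *copy_start)
{
  *copy_start = stubs;
  return ex_align_up (stubs + sfpr_size, align);
}

/* New order: pad, then append the copy, so the copy ends the section.  */
static uint64_t
ex_pad_then_copy (uint64_t stubs, uint64_t sfpr_size, uint64_t align,
                  uint64_t *copy_start)
{
  *copy_start = ex_align_up (stubs, align);
  return *copy_start + sfpr_size;
}
```

With 100 bytes of stubs, a 40 byte sfpr copy and 32 byte alignment, the old order puts the copy at offset 100 while the calculation assumes 120; the new order puts it at 128, which is what the calculation assumes.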
* elf64-ppc.c (sfpr_define): Adjust for stub_sec size having
sfpr size added before defining alias symbols.
(ppc64_elf_build_stubs): Add stub section padding before
copying sfpr contents and defining save/restore alias symbols.
The GNU coding standard says error messages should be of the form
program:sourcefile:lineno: message
or
program: message
and
"The string message should not begin with a capital letter when it
follows a program name and/or file name, because that isn’t the
beginning of a sentence. (The sentence conceptually starts at the
beginning of the line.) Also, it should not end with a period."
This patch does that for ppc, and removes some British spelling.
I've also switched some error output from using the linker callback
einfo to _bfd_error_handler, due to improved compilation time
argument checking now done for the latter function.
bfd/
* elf32-ppc.c: Standardize error/warning messages. Use
_bfd_error_handler rather than einfo when einfo features not used.
* elf64-ppc.c: Likewise.
ld/
* testsuite/ld-powerpc/attr-gnu-12-21.d: Update.
* testsuite/ld-powerpc/attr-gnu-4-12.d: Update.
* testsuite/ld-powerpc/attr-gnu-4-13.d: Update.
* testsuite/ld-powerpc/attr-gnu-4-21.d: Update.
* testsuite/ld-powerpc/attr-gnu-4-23.d: Update.
* testsuite/ld-powerpc/attr-gnu-4-31.d: Update.
* testsuite/ld-powerpc/attr-gnu-4-32.d: Update.
* testsuite/ld-powerpc/attr-gnu-8-23.d: Update.
This reverts most of commit 1be5d8d3bb.
Left in place are addition of --no-plt-align to some ppc32 ld tests
and the ld.texinfo --no-plt-thread-safe fix.
This fixes a "bug" in that nops emitted as part of code optimization
were being relocated. As it happens the relocation value was always
zero so the nop wasn't changed. Whew! I've also moved the use of
"howto" later since I was caught out in some recent code changes with
the howto not matching r_type.
* elf64-ppc.c (ppc64_elf_relocate_section): Don't relocate nops
emitted for toc sequence optimization. Set and use "howto" later.
I would like to fix instances of the following warning when building
with clang with no special CFLAGS other than -g3 -O0.
/home/emaisin/src/binutils-gdb/bfd/elflink.c:5425:45: error: performing pointer arithmetic on a null pointer has undefined behavior [-Werror,-Wnull-pointer-arithmetic]
return (struct elf_link_hash_entry *) 0 - 1;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^
Replacing those with "(struct elf_link_hash_entry *) -1" gets rid of the
warning. I wanted to check that it didn't change the resulting code, so
I tried to build this:
$ cat test.c
int *before()
{
return (int *) 0 - 1;
}
int *after()
{
return (int *) - 1;
}
$ gcc -c test.c -g
$ objdump -d test.o
test.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <before>:
0: 55 push %rbp
1: 48 89 e5 mov %rsp,%rbp
4: 48 c7 c0 fc ff ff ff mov $0xfffffffffffffffc,%rax
b: 5d pop %rbp
c: c3 retq
000000000000000d <after>:
d: 55 push %rbp
e: 48 89 e5 mov %rsp,%rbp
11: 48 c7 c0 ff ff ff ff mov $0xffffffffffffffff,%rax
18: 5d pop %rbp
19: c3 retq
This shows that the previous code doesn't actually return -1 as the
function documentation says, but the new one does, so it's kind of a
bugfix.
bfd/
* elf64-ppc.c (ppc64_elf_archive_symbol_lookup): Avoid pointer
arithmetic on NULL pointer.
* elflink.c (_bfd_elf_archive_symbol_lookup,
elf_link_add_archive_symbols): Likewise.
ld/
* ldexp.c (fold_name, exp_fold_tree_1): Avoid pointer arithmetic
on NULL pointer.
https://bugzilla.redhat.com/show_bug.cgi?id=1523457
I haven't analyzed this myself, I'm relying on Nick's excellent
analysis. What I believe is happening is that after some number of
stub sizing iterations, a long-branch stub needs to be converted to a
plt-branch, but either due to stub alignment or other stubs shrinking
in size, the stub group section size doesn't change.
That means we exit from ppc64_elf_size_stubs after sizing with an
incorrect layout, in fact the additional .branch_lt entry overlays
.got! Since .TOC. is normally set to .got + 0x8000 the stub sizing
code decides that entry is within +/-32k of the TOC pointer and so a
three insn stub is sufficient. When we come to build the stubs using
a correct non-overlaying layout, a four insn plt-branch stub is
generated and the stub group size doesn't match that calculated
earlier.
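The failure mode above suggests that the termination test for sizing has to look at .branch_lt as well as the stub group sections. A minimal sketch in C, using a hypothetical ex_layout struct in place of the linker's real state:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of the sizes produced by one sizing pass.  */
struct ex_layout
{
  size_t stub_group_size;  /* total size of a stub group section */
  size_t branch_lt_size;   /* size of .branch_lt */
};

/* Sizing may stop only when neither size changed in the last pass.  */
static bool
ex_sizing_converged (const struct ex_layout *prev,
                     const struct ex_layout *cur)
{
  assert (prev != NULL && cur != NULL);
  return (prev->stub_group_size == cur->stub_group_size
          && prev->branch_lt_size == cur->branch_lt_size);
}
```

Comparing only stub_group_size would report convergence in exactly the case described: a long-branch stub became a plt-branch stub, .branch_lt grew, and the stub group happened to keep its size.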
* elf64-ppc.c (ppc64_elf_size_stubs): Iterate sizing when
.branch_lt changes size.
Asking for ppc32 plt call stubs to be aligned at 32 byte boundaries
didn't quite work. For ld.bfd they were spaced 32 bytes apart, but
only started on a 16 byte boundary. ld.gold also didn't get it right.
Finding that bug made me check over the ppc64 plt stub alignment,
where I found that negative values for alignment (meaning align to
minimize boundary crossing) were not accepted. Since no one has
complained about that, I guess I could have removed the feature from
ld.bfd documentation, but I've opted instead to correct the code.
I've also added an optional alignment parameter for ppc32
--plt-align, for some consistency with gold and ppc64 ld.bfd.
bfd/
* elf32-ppc.c (ppc_elf_create_glink): Correct alignment of .glink.
* elf64-ppc.c (ppc64_elf_size_stubs): Handle negative plt_stub_align.
(ppc64_elf_build_stubs): Likewise.
gold/
* powerpc.cc (param_plt_align): New function supplying default
--plt-align values. Use it..
(Stub_table::plt_call_align): ..here, and..
(Output_data_glink::global_entry_align): ..here.
(Stub_table::stub_align): Correct 32-bit minimum alignment.
ld/
* emultempl/ppc32elf.em: Support optional --plt-align arg.
* emultempl/ppc64elf.em: Support negative --plt-align arg.
This is in preparation for the next patch adding Spectre variant 2
mitigation for PowerPC and PowerPC64. Besides tidying code involved
in stub output (to reduce the number of places where bctr is output),
the patch adds some user visible features:
1) PowerPC64 ELFv2 global entry stubs now are aligned under the
control of --plt-align, with a default alignment of 32 bytes.
2) PowerPC64 __glink_PLTresolve is no longer padded out with nops.
3) PowerPC32 PLT stubs are aligned under the control of --plt-align,
with the default alignment being 16 bytes as before.
4) The PowerPC32 branch/nop table emitted before __glink_PLTresolve
is now smaller in many cases. It was sized incorrectly when the
__tls_get_addr_opt stub was used, and unnecessarily included space
for local ifuncs.
bfd/
* elf32-ppc.c (GLINK_ENTRY_SIZE): Add parameters, handle
__tls_get_addr_opt, and alignment sizing.
(TLS_GET_ADDR_GLINK_SIZE): Delete.
(is_nonpic_glink_stub): Don't use GLINK_ENTRY_SIZE.
(ppc_elf_get_synthetic_symtab): Recognize stubs spaced at 4, 6,
or 8 insns.
(ppc_elf_link_hash_table_create): Init new ppc_elf_params field.
(allocate_dynrelocs): Use new GLINK_ENTRY_SIZE.
(ppc_elf_size_dynamic_sections): Likewise. Size branch table
by PLT reloc count.
(write_glink_stub): Handle __tls_get_addr_opt stub.
Pad out to size given by GLINK_ENTRY_SIZE.
(ppc_elf_relocate_section): Adjust write_glink_stub call.
(ppc_elf_finish_dynamic_symbol): Likewise.
(ppc_elf_finish_dynamic_sections): Write PLTresolve without using
insn array since so many need rewriting.
* elf32-ppc.h (struct ppc_elf_params): Add plt_stub_align.
* elf64-ppc.c (GLINK_PLTRESOLVE_SIZE): Rename from
GLINK_CALL_STUB_SIZE. Add htab param and evaluate to size without
nops. Adjust all uses.
(ppc64_elf_get_synthetic_symtab): Don't use GLINK_CALL_STUB_SIZE
in glink_vma calculation.
(struct ppc_link_hash_table): Add global_entry section pointer.
(create_linkage_sections): Create separate section for global
entry stubs.
(PPC_LO, PPC_HI, PPC_HA): Move earlier.
(size_global_entry_stubs): Handle sizing for aligned stubs.
(ppc64_elf_size_dynamic_sections): Handle global_entry alloc,
and don't stash end of glink branch table in rawsize.
(ppc_build_one_stub): Rewrite stub size calculations.
(build_global_entry_stubs): Use new section.
(ppc64_elf_build_stubs): Don't pad __glink_PLTresolve with nops.
Build lazy link stubs out to end of section. Build global entry
stubs in new section.
gold/
* options.h (plt_align): Support for PowerPC32 too.
* powerpc.cc (Stub_table::stub_align): Heed --plt-align for 32-bit.
(Stub_table::plt_call_size, branch_stub_size): Tidy.
(Stub_table::plt_call_align): Implement using stub_align.
(Output_data_glink::global_entry_align): New function.
(Output_data_glink::global_entry_off): New function.
(Output_data_glink::global_entry_address): Use global_entry_off.
(Output_data_glink::pltresolve_size): New function, replacing
pltresolve_size_ constant. Update all uses.
(Output_data_glink::add_global_entry): Align offset.
(Output_data_glink::set_final_data_size): Use global_entry_align.
(Stub_table::do_write): Don't pad __glink_PLTresolve with nops.
Tidy stub output. Use global_entry_off.
ld/
* emultempl/ppc32elf.em (params): Init new field.
(enum ppc32_opt): New enum to define OPTION_* values. Add
OPTION_PLT_ALIGN and OPTION_NO_PLT_ALIGN.
(PARSE_AND_LIST_LONGOPTS): Handle new options.
(PARSE_AND_LIST_ARGS_CASES): Likewise.
(PARSE_AND_LIST_OPTIONS): Likewise. Break up help output.
* emultempl/ppc64elf.em (ppc_add_stub_section): Init alignment
correctly for negative --plt-stub-align.
* testsuite/ld-powerpc/elfv2exe.d,
* testsuite/ld-powerpc/elfv2so.d,
* testsuite/ld-powerpc/relbrlt.d,
* testsuite/ld-powerpc/relbrlt.s,
* testsuite/ld-powerpc/tlsexe.d,
* testsuite/ld-powerpc/tlsexe.r,
* testsuite/ld-powerpc/tlsexe32.d,
* testsuite/ld-powerpc/tlsexe32.g,
* testsuite/ld-powerpc/tlsexe32.r,
* testsuite/ld-powerpc/tlsexetoc.d,
* testsuite/ld-powerpc/tlsexetoc.r,
* testsuite/ld-powerpc/tlsopt5_32.d,
* testsuite/ld-powerpc/tlsso.d,
* testsuite/ld-powerpc/tlstocso.d: Update for changed stub order.
PowerPC64 has its own mark_dynamic_ref, which needs the same change as
made by d664fd41e1 to the generic ELF version. Some other targets
discard more than just .data, so allow for that too in expected ld
messages.
bfd/
PR ld/22649
* elf64-ppc.c (ppc64_elf_gc_mark_dynamic_ref): Ignore dynamic
references on forced local symbols.
ld/
PR ld/22649
* testsuite/ld-elf/pr22649.msg: Allow other messages.
* testsuite/ld-elf/shared.exp: Check that --gc-sections is
supported before running ld/22649 tests.
Past tense is wrong for a comment before some action.
* elf32-ppc.c (ppc_elf_adjust_dynamic_symbol): Comment tidy.
* elf64-ppc.c (ppc64_elf_adjust_dynamic_symbol): Likewise.
* elfnn-aarch64.c (elfNN_aarch64_adjust_dynamic_symbol): Likewise.
In early October, HJ Lu added support for a number of targets to "Dump
dynamic relocation in read-only section with minfo". This extends
that support to more targets, displays the symbol involved, and splits
the existing function that sets TEXTREL into a "readonly_dynrelocs"
and "maybe_set_textrel" function. I'll need "readonly_dynrelocs" if I
ever get around to fixing "pr22374 function pointer initialization"
fails.
am33_2.0, arc, bfin, hppa64, mn10300, and nios2 fail to mark a binary
needing text relocations with DT_TEXTREL. That's not good. xtensa also
fails to do so but complains about "dangerous relocation: dynamic
relocation in read-only section" so I reckon that is fine and have
marked the test as an xfail. The other targets need maintainer
attention.
Curiously, the map file dump wasn't added for x86, so the map test
currently fails on x86. It also fails on alpha, am33_2.0, arc, bfin,
hppa64, ia64, m68k, mips, mn10300, nios2, score and vax. cris
complains with "tmpdir/textrel.o, section .rodata: relocation
R_CRIS_32 should not be used in a shared object; recompile with -fPIC"
so I've marked it as an xfail.
bfd/
* elf32-hppa.c (maybe_set_textrel): Print symbol for map file output.
* elf32-ppc.c (maybe_set_textrel): Likewise.
* elf64-ppc.c (maybe_set_textrel): Likewise.
* elf32-arm.c (readonly_dynrelocs): New function.
(maybe_set_textrel): New function, replacing..
(elf32_arm_readonly_dynrelocs): ..this.
* elf32-lm32.c (readonly_dynrelocs): New function.
(maybe_set_textrel): New function, replacing old version of..
(readonly_dynrelocs): ..this.
* elf32-m32r.c (readonly_dynrelocs): New function.
(maybe_set_textrel): New function, replacing old version of..
(readonly_dynrelocs): ..this.
* elf32-metag.c (readonly_dynrelocs): New function.
(maybe_set_textrel): New function, replacing old version of..
(readonly_dynrelocs): ..this.
* elf32-nds32.c: Delete unnecessary forward declarations.
(readonly_dynrelocs): New function.
(maybe_set_textrel): New function, replacing old version of..
(readonly_dynrelocs): ..this.
* elf32-or1k.c (readonly_dynrelocs): New function.
(maybe_set_textrel): New function, replacing old version of..
(readonly_dynrelocs): ..this.
* elf32-s390.c (readonly_dynrelocs): New function.
(maybe_set_textrel): New function, replacing old version of..
(readonly_dynrelocs): ..this.
* elf32-sh.c (readonly_dynrelocs): New function.
(maybe_set_textrel): New function, replacing old version of..
(readonly_dynrelocs): ..this.
* elf32-tic6x.c (readonly_dynrelocs): New function.
(maybe_set_textrel): New function, replacing..
(elf32_tic6x_readonly_dynrelocs): ..this.
* elf32-tilepro.c (readonly_dynrelocs): New function.
(maybe_set_textrel): New function, replacing old version of..
(readonly_dynrelocs): ..this.
* elf64-s390.c (readonly_dynrelocs): New function.
(maybe_set_textrel): New function, replacing old version of..
(readonly_dynrelocs): ..this.
* elfnn-aarch64.c (readonly_dynrelocs): New function.
(maybe_set_textrel): New function, replacing..
(aarch64_readonly_readonly_dynrelocs): ..this.
* elfnn-riscv.c (readonly_dynrelocs): New function.
(maybe_set_textrel): New function, replacing old version of..
(readonly_dynrelocs): ..this.
* elfxx-sparc.c (readonly_dynrelocs): New function.
(maybe_set_textrel): New function, replacing old version of..
(readonly_dynrelocs): ..this.
* elfxx-tilegx.c (readonly_dynrelocs): New function.
(maybe_set_textrel): New function, replacing old version of..
(readonly_dynrelocs): ..this.
ld/
* testsuite/ld-elf/shared.exp: Run new textrel tests.
* testsuite/ld-elf/textrel.map: New file.
* testsuite/ld-elf/textrel.rd: New file.
* testsuite/ld-elf/textrel.s: New file.
* testsuite/ld-elf/textrel.warn: New file.
This cleans up yet more craziness with non_got_ref.
PR 22533
* elf32-hppa.c (elf32_hppa_copy_indirect_symbol): Don't do anything
special with non_got_ref for weak aliases.
(elf32_hppa_check_relocs): Tweak setting of non_got_ref.
(elf32_hppa_adjust_dynamic_symbol): When initialising weak aliases,
don't uselessly copy non_got_ref. Clear dyn_relocs instead if
strong symbol is allocated in dynbss. Tidy comments.
(elf32_hppa_relocate_section): Comment fix.
* elf32-ppc.c (ppc_elf_copy_indirect_symbol): Don't do anything
special with non_got_ref for weak aliases.
(ppc_elf_adjust_dynamic_symbol): When initialising weak aliases,
don't uselessly copy non_got_ref. Clear dyn_relocs instead if
strong symbol is allocated in dynbss. Tidy comments.
* elf64-ppc.c (ppc64_elf_copy_indirect_symbol): Don't do anything
special with non_got_ref for weak aliases.
(ppc64_elf_adjust_dynamic_symbol): When initialising weak aliases,
don't uselessly copy non_got_ref. Clear dyn_relocs instead if
strong symbol is allocated in dynbss. Tidy comments.
Now that u.alias is circular, weakref just duplicates its function.
Also, function symbols shouldn't be on the alias list so there is no
need to use alias_readonly_dynrelocs with them.
* elf64-ppc.c (struct ppc_link_hash_entry): Delete weakref field.
(ppc64_elf_copy_indirect_symbol): Don't set weakref.
(alias_readonly_dynrelocs): Use u.alias rather than weakref.
(ppc64_elf_adjust_dynamic_symbol): Don't use
alias_readonly_dynrelocs for function symbols.
This makes the elf_link_hash_entry weakdef field, currently used to
point from a weak symbol to a strong alias, a circular list so that
all aliases can be found from any of them. A new flag, is_weakalias,
distinguishes the weak symbol from a strong alias, and is used in all
places where we currently test u.weakdef != NULL.
With the original u.weakdef handling it was possible to have two or
more weak symbols pointing via u.weakdef to a strong definition.
Obviously that situation can't map to a circular list; one or more of
the weak symbols must point at another weak alias rather than the
strong definition. To handle that, I've added an accessor function to
return the strong definition.
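The accessor can be pictured with a cut-down structure. The sketch below is in C and uses a hypothetical ex_sym struct rather than the real struct elf_link_hash_entry; it assumes the list contains exactly one entry with is_weakalias clear, otherwise the walk would not terminate.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-in for struct elf_link_hash_entry.  */
struct ex_sym
{
  struct ex_sym *alias;       /* next symbol on the circular alias list */
  unsigned int is_weakalias;  /* 1 for a weak alias, 0 for the strong def */
};

/* Walk the circular list from H until the strong definition is found.  */
static struct ex_sym *
ex_weakdef (struct ex_sym *h)
{
  assert (h != NULL);
  while (h->is_weakalias)
    h = h->alias;
  return h;
}
```

With two weak aliases w1 and w2 linked as w1 -> w2 -> strong -> w1, the walk from either weak symbol ends at the strong definition even though w1 points at another weak alias.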
* elf-bfd.h (struct elf_link_hash_entry): Add is_weakalias.
Rename u.weakdef to u.alias and update comment.
(weakdef): New static inline function.
* elflink.c (bfd_elf_record_link_assignment): Test is_weakalias
rather than u.weakdef != NULL, and use weakdef function.
(_bfd_elf_adjust_dynamic_symbol): Likewise.
(_bfd_elf_fix_symbol_flags): Likewise. Clear is_weakalias on
all aliases if def has been overridden in a regular object, not
u.weakdef.
(elf_link_add_object_symbols): Delete new_weakdef flag. Test
is_weakalias and use weakdef. Set is_weakalias and circular
u.alias. Update comments.
(_bfd_elf_gc_mark_rsec): Test is_weakalias rather than
u.weakdef != NULL and use weakdef function.
* elf-m10300.c (_bfd_mn10300_elf_adjust_dynamic_symbol): Test
is_weakalias rather than u.weakdef != NULL and use weakdef
function. Assert that def is strong defined.
* elf32-arc.c (elf_arc_adjust_dynamic_symbol): Likewise.
* elf32-arm.c (elf32_arm_adjust_dynamic_symbol): Likewise.
* elf32-bfin.c (elf32_bfinfdpic_adjust_dynamic_symbol): Likewise.
(bfin_adjust_dynamic_symbol): Likewise.
* elf32-cr16.c (_bfd_cr16_elf_adjust_dynamic_symbol): Likewise.
* elf32-cris.c (elf_cris_adjust_dynamic_symbol): Likewise.
* elf32-frv.c (elf32_frvfdpic_adjust_dynamic_symbol): Likewise.
* elf32-hppa.c (elf32_hppa_adjust_dynamic_symbol): Likewise.
* elf32-i370.c (i370_elf_adjust_dynamic_symbol): Likewise.
* elf32-lm32.c (lm32_elf_adjust_dynamic_symbol): Likewise.
* elf32-m32r.c (m32r_elf_adjust_dynamic_symbol): Likewise.
* elf32-m68k.c (elf_m68k_adjust_dynamic_symbol): Likewise.
* elf32-metag.c (elf_metag_adjust_dynamic_symbol): Likewise.
* elf32-microblaze.c (microblaze_elf_adjust_dynamic_symbol): Likewise.
* elf32-nds32.c (nds32_elf_adjust_dynamic_symbol): Likewise.
* elf32-nios2.c (nios2_elf32_adjust_dynamic_symbol): Likewise.
* elf32-or1k.c (or1k_elf_adjust_dynamic_symbol): Likewise.
* elf32-ppc.c (ppc_elf_adjust_dynamic_symbol): Likewise.
* elf32-s390.c (elf_s390_adjust_dynamic_symbol): Likewise.
* elf32-score.c (s3_bfd_score_elf_adjust_dynamic_symbol): Likewise.
* elf32-score7.c (s7_bfd_score_elf_adjust_dynamic_symbol): Likewise.
* elf32-sh.c (sh_elf_adjust_dynamic_symbol): Likewise.
* elf32-tic6x.c (elf32_tic6x_adjust_dynamic_symbol): Likewise.
* elf32-tilepro.c (tilepro_elf_gc_mark_hook): Likewise.
(tilepro_elf_adjust_dynamic_symbol): Likewise.
* elf32-vax.c (elf_vax_adjust_dynamic_symbol): Likewise.
* elf32-xtensa.c (elf_xtensa_adjust_dynamic_symbol): Likewise.
* elf64-alpha.c (elf64_alpha_adjust_dynamic_symbol): Likewise.
* elf64-hppa.c (elf64_hppa_adjust_dynamic_symbol): Likewise.
* elf64-ia64-vms.c (elf64_ia64_adjust_dynamic_symbol): Likewise.
* elf64-ppc.c (ppc64_elf_gc_mark_hook): Likewise.
(ppc64_elf_adjust_dynamic_symbol): Likewise.
* elf64-s390.c (elf_s390_adjust_dynamic_symbol): Likewise.
* elf64-sh64.c (sh64_elf64_adjust_dynamic_symbol): Likewise.
* elfnn-aarch64.c (elfNN_aarch64_adjust_dynamic_symbol): Likewise.
* elfnn-ia64.c (elfNN_ia64_adjust_dynamic_symbol): Likewise.
* elfnn-riscv.c (riscv_elf_adjust_dynamic_symbol): Likewise.
* elfxx-mips.c (_bfd_mips_elf_adjust_dynamic_symbol): Likewise.
* elfxx-sparc.c (_bfd_sparc_elf_gc_mark_hook): Likewise.
(_bfd_sparc_elf_adjust_dynamic_symbol): Likewise.
* elfxx-tilegx.c (tilegx_elf_gc_mark_hook): Likewise.
(tilegx_elf_adjust_dynamic_symbol): Likewise.
* elfxx-x86.c (_bfd_x86_elf_adjust_dynamic_symbol): Likewise.
The fix for the PR is to not use input_section->output_section->owner
to get to the output bfd, but use the output bfd directly since it is
available nowadays in struct bfd_link_info.
I thought it worth warning when non-empty dynamic sections are
discarded too, which meant a tweak to one of the ld tests to avoid the
warning.
bfd/
PR 22431
* elf64-ppc.c (ppc64_elf_size_dynamic_sections): Warn on discarding
non-empty dynamic section.
(ppc_build_one_stub): Take elf_gp from output bfd, not output
section owner.
(ppc_size_one_stub, ppc64_elf_next_toc_section): Likewise.
ld/
* testsuite/ld-elf/note-3.t: Don't discard .got.
There is code in bfd/elf-eh-frame.c and ld/emultempl/elf32.em that
checks for the presence of eh_frame info by testing for a section
named .eh_frame sized more than 8 bytes. The size test is to exclude
a zero terminator. A similar check in elf64-ppc.c wrongly just tested
for non-zero size before creating the linker generated .eh_frame
describing plt call and other linkage stubs. The intention was to not
generate that info unless there was some user .eh_frame. (No user
.eh_frame implies the user doesn't care about exception handling.)
Because the test in elf64-ppc.c was wrong, ld generated the stub
.eh_frame just on finding a zero .eh_frame terminator in crtend.o, but
didn't generate the corresponding .eh_frame_hdr.
* elf64-ppc.c (ppc64_elf_size_stubs): Correct test for user
.eh_frame info.
This patch was aimed at a FIXME in elf32-hppa.c, the ludicrous and
confusing fact that non_got_ref after adjust_dynamic_relocs in that
backend means precisely the inverse of what it means before
adjust_dynamic_relocs. Before, when non_got_ref is set it means there
are dynamic relocs, after, if non_got_ref is clear it means "keep
dynamic relocs" and later, "has dynamic relocs". There is a reason
why it was done that way. Some symbols that may have dynamic
relocations pre-allocated in check_relocs turn out to not be dynamic,
and then are not seen by the backend adjust_dynamic_symbols. We want
those symbols to lose their dynamic relocs when non-pic, so it's handy
that non_got_ref means the opposite after adjust_dynamic_relocs. But
it's really confusing.
Most other targets, like ppc32, don't always set non_got_ref on
non-GOT references that have dynamic relocations. This is because the
primary purpose of non_got_ref before adjust_dynamic_relocs is to flag
symbols that might need to be copied to .dynbss, and there are
relocation types that may require dyn_relocs but clearly cannot have
symbols copied into .dynbss, for example, TLS relocations.
Why do we need a flag after adjust_dynamic_relocs to say "keep
dynamic relocations"? Well, you can discard most unwanted dyn_relocs
in the backend adjust_dynamic_relocs, and for those symbols that
aren't seen by the backend adjust_dynamic_relocs, in
allocate_dynrelocs based on a flag set by adjust_dynamic_relocs,
dynamic_adjusted. That doesn't solve all our difficulties though.
relocate_section needs to know whether a symbol has dyn_relocs, and
many targets transfer dyn_relocs to a weakdef if the symbol has one.
The transfer means relocate_section can't test dyn_relocs itself and
the weakdef field has been overwritten by that time. So non_got_ref
is used to flag "this symbol has dynamic relocations" for
relocate_section.
Confused still? Well, let's hope the comments I've added help clarify
things. The patch also fixes a case where we might wrongly emit
dynamic relocations in an executable for common and undefined symbols.
* elf32-hppa.c (elf32_hppa_adjust_dynamic_symbol): Set non_got_ref
to keep dyn_relocs, clear to discard. Comment.
(allocate_dynrelocs): Always clear non_got_ref when clearing
dyn_relocs in non-pic case. Invert non_got_ref test. Also test
dynamic_adjusted and ELF_COMMON_DEF_P. Move code deleting
dyn_relocs on undefined syms to handle for non-pic too.
(elf32_hppa_relocate_section): Simplify test for non-pic dyn relocs.
* elf32-ppc.c (ppc_elf_adjust_dynamic_symbol): Set non_got_ref
to keep dyn_relocs, clear to discard. Comment.
(allocate_dynrelocs): Always clear non_got_ref when clearing
dyn_relocs in non-pic case. Invert non_got_ref test. Also test
dynamic_adjusted and ELF_COMMON_DEF_P. Move code deleting
dyn_relocs on undefined syms to handle for non-pic too.
(ppc_elf_relocate_section): Simplify test for non-pic dyn relocs.
* elf64-ppc.c (ppc64_elf_adjust_dynamic_symbol): Discard
dyn_relocs here. Don't bother setting non_got_ref. Comment.
(allocate_dynrelocs): Delete special handling of non-pic ELFv2
ifuncs. Move code deleting dyn_relocs on undefined symbols to
handle for non-pic too. Don't test non_got_ref. Do test
dynamic_adjusted and ELF_COMMON_DEF_P.
This patch removes unnecessary GOT IE TLS relocations in PIEs. Useful
with --no-tls-optimize, or with an enormous TLS segment. With the
default --tls-optimize in effect, IE code sequences will be edited to
LE under the same circumstances in which we can remove the GOT reloc.
* elf32-ppc.c (got_entries_needed, got_relocs_needed): New functions.
(allocate_dynrelocs, ppc_elf_size_dynamic_sections): Use them here.
(ppc_elf_relocate_section): Don't output a dynamic relocation
for IE GOT entries in an executable.
* elf64-ppc.c (allocate_got): Trim unnecessary TPREL relocs.
(ppc64_elf_size_dynamic_sections): Likewise.
(ppc64_elf_relocate_section): Likewise.
PowerPC64 lacked the mapfile textrel warning on finding dynamic relocs
in read-only sections. This patch adds it, and tidies the
readonly_dynrelocs interface. PowerPC doesn't need a SEC_ALLOC test
because !SEC_ALLOC sections are excluded by check_relocs so will never
have dyn_relocs.
* elf32-ppc.c (readonly_dynrelocs): Delete info param. Update all
callers. Don't bother with SEC_ALLOC test. Return section pointer.
Move minfo call to..
(maybe_set_textrel): ..here.
* elf64-ppc.c (readonly_dynrelocs): Return section pointer.
(maybe_set_textrel): Call minfo to print textrel warning to map file.
We don't need a PLT entry when function pointer initialization in a
read/write section is the only reference to a given function symbol.
This patch prevents the unnecessary PLT entry, and ensures no dynamic
relocs are emitted when UNDEFWEAK_NO_DYNAMIC_RELOC says so.
bfd/
PR 22374
* elf32-ppc.c (ppc_elf_adjust_dynamic_symbol): Don't create a plt
entry when just a dynamic reloc can serve. Ensure no dynamic
relocations when UNDEFWEAK_NO_DYNAMIC_RELOC by setting non_got_ref.
Expand and move the non_got_ref comment.
* elf64-ppc.c (ppc64_elf_adjust_dynamic_symbol): Likewise.
ld/
* testsuite/ld-powerpc/ambiguousv2.d: Remove FIXME.
polyml produces object files with the wrong OS/ABI for hppa-linux.
This, along with the fact that elf32-hppa.c is using the strictest
backend relocs_compatible, results in wrong merging of ELF symbols.
So, remove the relocs_compatible check in _bfd_elf_merge_symbol.
_bfd_elf_merge_symbol is only called nowadays from within blocks
protected by is_elf_hash_table, so "we are doing an ELF link" as the
removed comment says, is true.
Also relax relocs_compatible for hppa and powerpc. relocs_compatible
is used for more than just merging symbols, as the name suggests.
This allows objects that are in fact reasonably compatible to be
linked.
PR 22300
* elflink.c (_bfd_elf_merge_symbol): Remove relocs_compatible check.
* elf32-hppa.c (elf_backend_relocs_compatible): Define.
* elf32-ppc.c (elf_backend_relocs_compatible): Define.
* elf64-ppc.c (elf_backend_relocs_compatible): Define.
Move UNDEFWEAK_NO_DYNAMIC_RELOC to elf-bfd.h so that it can be used by
other ELF linker backends.
* elf32-ppc.c (UNDEFWEAK_NO_DYNAMIC_RELOC): Moved to ...
* elf-bfd.h (UNDEFWEAK_NO_DYNAMIC_RELOC): Here.
* elf64-ppc.c (UNDEFWEAK_NO_DYNAMIC_RELOC): Removed.
check_relocs was setting up some data used by the --gc-sections
gc_mark_hook. If we change ld to run check_relocs after gc_sections
that data needs to be set up elsewhere. Done by this patch in the
backend check_directives function (ppc64_elf_before_check_relocs).
* elf64-ppc.c (ppc64_elf_before_check_relocs): Set sec_type for
.opd whenever .opd is present and non-zero size. Move code
setting abiversion to/from output file earlier. Only set
u.opd.func_sec when --gc-sections. Read relocs and set up
u.opd.func_sec values here..
(ppc64_elf_check_relocs): ..rather than here. Simplify opd
section tests.
(ppc64_elf_edit_opd): Don't set sec_type for .opd here.
After the PR 21411 fix, the linker generated .eh_frame for ppc64 glink
can be edited by the generic code. The sequence of events goes
something like:
1) Some object file adds .eh_frame aligned to 8, making the output
.eh_frame aligned to at least 8, so linker generated .eh_frame FDE
is padded to an 8 byte boundary.
2) All .eh_frame past the glink .eh_frame is garbage collected.
3) Generic code detects that last FDE (the glink .eh_frame) doesn't
need to be padded to an 8 byte boundary, reducing size from 88 to
84.
4) elf64-ppc.c check fails.
PR 21441
* elf64-ppc.c (ppc64_elf_build_stubs): Don't check glink_eh_frame
size.
This changes the PowerPC64 --plt-align option to perform the usual
alignment of code as suggested by its name, as well as the previous
behaviour of padding so as to reduce boundary crossing. The old
behaviour is available by passing a negative parameter.
The default is also changed to align plt stub code to 32 byte
boundaries, the point being to get better bctr branch prediction
on power8 and power9 hardware.
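The two modes can be sketched as a single padding function. The C below is a simplified model with a made-up signature (ex_plt_stub_pad takes the log2 alignment, the stub offset and the stub size directly); it is not a copy of plt_stub_pad.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes of padding to insert before a stub of STUB_SIZE bytes placed at
   offset STUB_OFF.  ALIGN_LOG2 >= 0 aligns the stub start to
   1 << ALIGN_LOG2.  ALIGN_LOG2 < 0 pads only when that avoids crossing
   one more 1 << -ALIGN_LOG2 boundary than the stub size makes
   unavoidable.  */
static uint64_t
ex_plt_stub_pad (int align_log2, uint64_t stub_off, uint64_t stub_size)
{
  int shift = align_log2 < 0 ? -align_log2 : align_log2;
  assert (shift < 32 && stub_size > 0);
  uint64_t align = (uint64_t) 1 << shift;
  uint64_t mask = ~(align - 1);
  uint64_t misalign = stub_off & (align - 1);

  if (align_log2 >= 0)
    return misalign != 0 ? align - misalign : 0;

  /* Compare the distance between the aligned blocks holding the first
     and last byte with the minimum possible for this stub size.  */
  if (((stub_off + stub_size - 1) & mask) - (stub_off & mask)
      > ((stub_size - 1) & mask))
    return align - misalign;
  return 0;
}
```

For a 16 byte stub at offset 40, a parameter of 5 pads by 24 bytes to reach offset 64. With a parameter of -5, the same stub at offset 8 needs no padding because it fits inside one 32 byte block, while at offset 24 it is padded by 8 bytes so that it no longer straddles offset 32.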
bfd/
* elf64-ppc.c (plt_stub_pad): Handle positive and negative
plt_stub_align.
ld/
* ld.texinfo (--plt-align): Describe new behaviour of option.
* emultempl/ppc64elf.em (params): Default plt_stub_align to 5.
* testsuite/ld-powerpc/powerpc.exp: Pass --no-plt-align for
selected tests.
* testsuite/ld-powerpc/relbrlt.d: Pass --no-plt-align.
* testsuite/ld-powerpc/elfv2so.d: Adjust expected output.
In the TLS GD/LD to LE optimization, ld replaces a sequence like
addi 3,2,x@got@tlsgd R_PPC64_GOT_TLSGD16 x
bl __tls_get_addr(x@tlsgd) R_PPC64_TLSGD x
R_PPC64_REL24 __tls_get_addr
nop
with
addis 3,13,x@tprel@ha R_PPC64_TPREL16_HA x
addi 3,3,x@tprel@l R_PPC64_TPREL16_LO x
nop
When the tprel offset is small, this can be further optimized to
nop
addi 3,13,x@tprel
nop
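"Small" here means the whole tprel offset fits the signed 16-bit immediate of addi, so the high-adjusted part added by addis is zero. A sketch of that test in C, with hypothetical helper names:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* High-adjusted 16 bits of V, as applied by an addis that is paired
   with a sign-extending 16-bit low part.  */
static uint16_t
ex_ha (int64_t v)
{
  return (uint16_t) ((((uint64_t) v + 0x8000) >> 16) & 0xffff);
}

/* True when TPREL lies in [-0x8000, 0x7fff], so the addis would add
   zero and can become a nop.  */
static bool
ex_tprel_fits_addi (int64_t tprel)
{
  bool fits = (uint64_t) tprel + 0x8000 < 0x10000;
  /* Sanity check: a fitting offset always has a zero high part.  */
  assert (!fits || ex_ha (tprel) == 0);
  return fits;
}
```

The range test is done on the full value rather than on ex_ha alone, because a very large offset can also have zero in those 16 bits without fitting the addi immediate.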
bfd/
* elf64-ppc.c (struct ppc_link_hash_table): Add do_tls_opt.
(ppc64_elf_tls_optimize): Set it.
(ppc64_elf_relocate_section): Nop addis on TPREL16_HA, and convert
insn on TPREL16_LO and TPREL16_LO_DS relocs to use r13 when
addis would add zero.
* elf32-ppc.c (struct ppc_elf_link_hash_table): Add do_tls_opt.
(ppc_elf_tls_optimize): Set it.
(ppc_elf_relocate_section): Nop addis on TPREL16_HA, and convert
insn on TPREL16_LO relocs to use r2 when addis would add zero.
gold/
* powerpc.cc (Target_powerpc::Relocate::relocate): Nop addis on
TPREL16_HA, and convert insn on TPREL16_LO and TPREL16_LO_DS
relocs to use r2/r13 when addis would add zero.
ld/
* testsuite/ld-powerpc/tls.s: Add calls with tls markers.
* testsuite/ld-powerpc/tls32.s: Likewise.
* testsuite/ld-powerpc/powerpc.exp: Run tls marker tests.
* testsuite/ld-powerpc/tls.d: Adjust for TPREL16_HA/LO optimization.
* testsuite/ld-powerpc/tlsexe.d: Likewise.
* testsuite/ld-powerpc/tlsexetoc.d: Likewise.
* testsuite/ld-powerpc/tlsld.d: Likewise.
* testsuite/ld-powerpc/tlsmark.d: Likewise.
* testsuite/ld-powerpc/tlsopt4.d: Likewise.
* testsuite/ld-powerpc/tlstoc.d: Likewise.
There isn't a good reason for ld.bfd to behave differently from gold
in the code generated by TLS GD/LD to LE optimization.
bfd/
* elf64-ppc.c (ppc64_elf_relocate_section): When optimizing
__tls_get_addr call sequences to LE, don't move the addi down
to the nop. Replace the bl with addi and leave the nop alone.
ld/
* testsuite/ld-powerpc/tls.d: Update.
* testsuite/ld-powerpc/tlsexe.d: Update.
* testsuite/ld-powerpc/tlsexetoc.d: Update.
* testsuite/ld-powerpc/tlsld.d: Update.
* testsuite/ld-powerpc/tlsmark.d: Update.
* testsuite/ld-powerpc/tlsopt4.d: Update.
* testsuite/ld-powerpc/tlstoc.d: Update.
Tidy how these are handled in PIEs.
* elf32-ppc.c (must_be_dyn_reloc): Use bfd_link_dll. Comment.
(ppc_elf_check_relocs): Only set DF_STATIC_TLS in shared libs.
(ppc_elf_relocate_section): Comment fix.
* elf64-ppc.c (must_be_dyn_reloc): Use bfd_link_dll. Comment.
(ppc64_elf_check_relocs): Only set DF_STATIC_TLS in shared libs.
Support dynamic relocs for TPREL16 when non-pic too.
(dec_dynrel_count): Adjust TPREL16 handling as per check_relocs.
(ppc64_elf_relocate_section): Support dynamic relocs for TPREL16
when non-pic too.
..if they have dynamic relocs. An undefined symbol in a PIC object
that finds no definition ought to become dynamic in order to support
--allow-shlib-undefined, but there is nothing in the generic ELF
linker code to do this if the reference isn't via the GOT or PLT. (An
initialized function pointer is an example.) So it falls to backend
code to ensure the symbol is made dynamic.
PR 21988
* elf64-ppc.c (ensure_undef_dynamic): Rename from
ensure_undefweak_dynamic. Handle undefined too.
* elf32-ppc.c (ensure_undef_dynamic): Likewise.
* elf32-hppa.c (ensure_undef_dynamic): Likewise.
(allocate_dynrelocs): Discard undefined non-default visibility
relocs first. Make undefined syms dynamic. Tidy goto.
This makes ld warn about --plt-localentry if a version of glibc
without the necessary ld.so checks is detected, and revises the
documentation.
bfd/
* elf64-ppc.c (ppc64_elf_tls_setup): Warn on --plt-localentry
without ld.so checks.
gold/
* powerpc.cc (Target_powerpc::scan_relocs): Warn on --plt-localentry
without ld.so checks.
ld/
* ld.texinfo (plt-localentry): Revise.
The big comment in ppc64_elf_tls_setup says why. I've also added some
code to the bfd linker that catches the -lpthread -lc symbol
differences and disables generation of optimized call stubs even when
--plt-localentry is activated. Gold doesn't yet have that.
PR 21847
bfd/
* elf64-ppc.c (struct ppc_link_hash_entry): Add non_zero_localentry.
(ppc64_elf_merge_symbol): Set non_zero_localentry.
(is_elfv2_localentry0): Test non_zero_localentry.
(ppc64_elf_tls_setup): Default to --no-plt-localentry.
gold/
* powerpc.cc (Target_powerpc::scan_relocs): Default to
--no-plt-localentry.
ld/
* ld.texinfo (plt-localentry): Document.
Since the __tls_get_addr_opt stub saves LR and makes a call, eh_frame
info should be generated to describe how to unwind through the stub.
The patch also changes the way the backend iterates over stubs, from
looking at all sections in stub_bfd to which all dynamic sections are
attached as well, to iterating over the group list, which gets just
the stub sections. Most binaries will have just one or two stub
groups, so this is a little faster.
bfd/
* elf64-ppc.c (struct map_stub): Add tls_get_addr_opt_bctrl.
(stub_eh_frame_size): New function.
(ppc_size_one_stub): Set group tls_get_addr_opt_bctrl.
(group_sections): Init group tls_get_addr_opt_bctrl.
(ppc64_elf_size_stubs): Update sizing and initialization of
.eh_frame. Iteration over stubs via group list.
(ppc64_elf_build_stubs): Iterate over stubs via group list.
(ppc64_elf_finish_dynamic_sections): Update finalization of
.eh_frame.
ld/
* testsuite/ld-powerpc/tlsopt5.s: Add cfi.
* testsuite/ld-powerpc/tlsopt5.d: Update.
* testsuite/ld-powerpc/tlsopt5.wf: New file.
* testsuite/ld-powerpc/powerpc.exp: Perform new tlsopt5 test.
My PPC64_OPT_LOCALENTRY patch of June 1, git commit f378ab099d, and
the later gold change, git commit 7ee7ff7015, added an insn in
__glink_PLTresolve which needs a corresponding adjustment in the
eh_frame info for asynchronous exceptions to unwind correctly.
It would have been OK for both ABIs to use +5 for the advance before
restore of LR, since we can put the DW_CFA_restore_extended on any
insn after the actual restore and before the r12/r0 copy is clobbered,
but it's slightly better to delay as much as possible. There are
then more addresses where fewer CFA program insns are executed.
bfd/
* elf64-ppc.c (ppc64_elf_size_stubs): Correct advance to
restore of LR.
gold/
* powerpc.cc (glink_eh_frame_fde_64v2): Correct advance to
restore of LR.
(glink_eh_frame_fde_64v1): Advance to restore of LR at latest
possible insn.
My 2017-01-24 patch (commit f0158f44) wrongly applied an optimization
of GOT entries for the __tls_get_addr_opt stub, to shared libraries.
When the TLS segment layout is known, as it is for the executable and
shared libraries loaded at initial program start, powerpc supports a
__tls_get_addr optimization. On the first call to __tls_get_addr for
a given __tls_index GOT entry, the DTPMOD word is set to zero and the
DTPREL word to the thread pointer offset to the thread variable. This
allows the __tls_get_addr_opt stub to return that value immediately
without making a call into glibc for any subsequent __tls_get_addr
calls using that __tls_index GOT entry.
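As a rough model of the fast path described above, consider the C
sketch below. The real stub is a handful of machine instructions; the
struct layout, the function names and the 0x40 offset chosen by the
stand-in slow path are all assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical mirror of a __tls_index GOT entry: two adjacent words.  */
struct tls_index
{
  uint64_t dtpmod;  /* Module id, or zero once the entry is optimized.  */
  uint64_t dtprel;  /* Offset in module, or thread pointer offset.  */
};

/* Counts calls into the stand-in slow path.  */
static int slow_calls;

/* Stand-in for the call into glibc.  It pretends the variable lives
   at thread pointer offset 0x40 and rewrites the GOT entry in the way
   the text describes for the first call.  */
static uint64_t
model_slow_path (struct tls_index *ti, uint64_t tp)
{
  slow_calls++;
  ti->dtpmod = 0;
  ti->dtprel = 0x40;
  return tp + ti->dtprel;
}

/* Model of the stub: return immediately when DTPMOD is zero,
   otherwise make the real call.  */
static uint64_t
model_tls_get_addr_opt (struct tls_index *ti, uint64_t tp)
{
  if (ti->dtpmod == 0)
    return tp + ti->dtprel;
  return model_slow_path (ti, tp);
}
```

Calling model_tls_get_addr_opt twice on the same entry reaches the
slow path only once; the second call takes the early return.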
That's all fine, but I thought I'd be clever and when the thread
variable is local, set up the GOT entry as if __tls_get_addr had
already been called. Which is good only for the executable, since ld
cannot know the TLS layout for shared libraries.
Of course, if this only applies to executables there isn't much point
to the optimization. Normally, GD and LD code in an executable will
be converted to IE or LE, losing the __tls_get_addr call. So the only
time it will trigger is with --no-tls-optimize. Thus, revert all
support.
* elf64-ppc.c (ppc64_elf_relocate_section): Don't optimize
__tls_index GOT entries when using __tls_get_addr_opt stub.
* elf32-ppc.c (ppc_elf_relocate_section): Likewise.
These don't need a following nop. Also, a localentry:0 plt call
marked with an R_PPC64_TOCSAVE reloc should ignore the tocsave.
There's no need to save r2 in the prologue for such calls.
* elf64-ppc.c (ppc64_elf_size_stubs): Test for localentry:0 plt
calls before tocsave calls.
(ppc64_elf_relocate_section): Allow localentry:0 plt calls without
following nop.
ELFv2 functions with localentry:0 are those with a single entry point,
ie. global entry == local entry, and that have no requirement on r2 or
r12, and guarantee r2 is unchanged on return. Such an external
function can be called via the PLT without saving r2 or restoring it
on return, avoiding a common load-hit-store for small functions. The
optimization is attractive. The TOC pointer load-hit-store is a major
reason why calls to small functions that need no register saves, or
with shrink-wrap, no register saves on a fast path, are slow on
powerpc64le.
To be safe, this optimization needs ld.so support to check that the
run-time matches link-time function implementation. If a function
in a shared library with st_other localentry non-zero is called
without saving and restoring r2, r2 will be trashed on return, leading
to segfaults. For that reason the optimization does not happen for
weak functions since a weak definition is a fairly solid hint that the
function will likely be overridden. I'm also not enabling the
optimization by default unless glibc-2.26 is detected, which should
have the ld.so checks implemented.
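For illustration, the st_other test behind this can be sketched as
follows. The STO_PPC64_LOCAL_* values are meant to match those in
include/elf/ppc64.h; the two functions are hypothetical and far
simpler than is_elfv2_localentry0, which looks at more than the two
conditions modelled here.

```c
#include <stdint.h>

#define STO_PPC64_LOCAL_BIT  5
#define STO_PPC64_LOCAL_MASK (7u << STO_PPC64_LOCAL_BIT)

/* Byte offset from the global to the local entry point, as encoded
   in the three localentry bits of st_other.  */
static unsigned
local_entry_offset (unsigned char st_other)
{
  unsigned bits = (st_other & STO_PPC64_LOCAL_MASK) >> STO_PPC64_LOCAL_BIT;
  return ((1u << bits) >> 2) << 2;
}

/* Simplified candidate test: a non-weak function whose localentry
   bits are all zero.  Weak functions are excluded because they are
   likely to be overridden at run time.  */
static int
plt_call_may_skip_r2_save (unsigned char st_other, int is_weak)
{
  return !is_weak && (st_other & STO_PPC64_LOCAL_MASK) == 0;
}
```

A typical function with a two-insn global entry prologue has the
bits set to 3, giving an offset of 8, and so is not a candidate.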
bfd/
* elf64-ppc.c (struct ppc_link_hash_table): Add has_plt_localentry0.
(ppc64_elf_merge_symbol_attribute): Merge localentry bits from
dynamic objects.
(is_elfv2_localentry0): New function.
(ppc64_elf_tls_setup): Default params->plt_localentry0.
(plt_stub_size): Adjust size for tls_get_addr_opt stub.
(build_tls_get_addr_stub): Use a simpler stub when r2 is not saved.
(ppc64_elf_size_stubs): Leave stub_type as ppc_stub_plt_call for
optimized localentry:0 stubs.
(ppc64_elf_build_stubs): Save r2 in ELFv2 __glink_PLTresolve.
(ppc64_elf_relocate_section): Leave nop unchanged for optimized
localentry:0 stubs.
(ppc64_elf_finish_dynamic_sections): Set PPC64_OPT_LOCALENTRY in
DT_PPC64_OPT.
* elf64-ppc.h (struct ppc64_elf_params): Add plt_localentry0.
include/
* elf/ppc64.h (PPC64_OPT_LOCALENTRY): Define.
ld/
* emultempl/ppc64elf.em (params): Init plt_localentry0 field.
(enum ppc64_opt): New, replacing OPTION_* defines. Add
OPTION_PLT_LOCALENTRY, and OPTION_NO_PLT_LOCALENTRY.
(PARSE_AND_LIST_*): Support --plt-localentry and --no-plt-localentry.
* testsuite/ld-powerpc/elfv2so.d: Update.
* testsuite/ld-powerpc/powerpc.exp (TLS opt 5): Use --no-plt-localentry.
* testsuite/ld-powerpc/tlsopt5.d: Update.
dynamic_ref_after_ir_def is a little odd compared to other symbol
flags in that as the name suggests, it is set only for certain
references after a definition. It turns out that setting a flag for
any non-ir reference from a dynamic object can be used to solve the
problem for which this flag was invented, which I think is cleaner.
This patch does that, and sets non_ir_ref only for regular object
references.
include/
* bfdlink.h (struct bfd_link_hash_entry): Update non_ir_ref
comment. Rename dynamic_ref_after_ir_def to non_ir_ref_dynamic.
ld/
* plugin.c (is_visible_from_outside): Use non_ir_ref_dynamic.
(plugin_notice): Set non_ir_ref for references from regular
objects, non_ir_ref_dynamic for references from dynamic objects.
bfd/
* elf64-ppc.c (add_symbol_adjust): Transfer non_ir_ref_dynamic.
* elflink.c (elf_link_add_object_symbols): Update to use
non_ir_ref_dynamic.
(elf_link_input_bfd): Test non_ir_ref_dynamic in addition to
non_ir_ref.
* linker.c (_bfd_generic_link_add_one_symbol): Likewise.
This patch fixes a number of cases where -z nodynamic-undefined-weak
was not effective in preventing dynamic relocations or linkage stubs.
* elf32-ppc.c (UNDEFWEAK_NO_DYNAMIC_RELOC): Define.
(ppc_elf_select_plt_layout, ppc_elf_tls_setup): Use it.
(ppc_elf_adjust_dynamic_symbol, allocate_dynrelocs): Likewise.
(ppc_elf_relocate_section): Likewise. Delete silly optimisation
for undef and undefweak dyn_relocs.
* elf64-ppc.c (UNDEFWEAK_NO_DYNAMIC_RELOC): Define.
(ppc64_elf_adjust_dynamic_symbol, ppc64_elf_tls_setup): Use it.
(allocate_got, allocate_dynrelocs): Likewise.
(ppc64_elf_relocate_section): Likewise.
This patch fixes an assumption made by code that runs for objcopy and
strip, that SHT_REL/SHT_RELA sections are always named starting with a
.rel/.rela prefix. I'm also modifying the interface for
elf_backend_get_reloc_section, so any backend function just needs to
handle name mapping.
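A minimal sketch of the "don't blindly skip the prefix" idea, assuming
a hypothetical helper that maps a reloc section name to the name of
the section it applies to. It is not the code added to elf.c.

```c
#include <stddef.h>
#include <string.h>

#define SHT_RELA 4
#define SHT_REL  9

/* Return the part of NAME following the .rel/.rela prefix expected
   for a section of the given TYPE, or NULL when TYPE is not a reloc
   section type or NAME does not start with that prefix.  */
static const char *
reloc_target_name (const char *name, unsigned type)
{
  const char *prefix;
  size_t len;

  if (type == SHT_RELA)
    prefix = ".rela";
  else if (type == SHT_REL)
    prefix = ".rel";
  else
    return NULL;

  len = strlen (prefix);
  if (strncmp (name, prefix, len) != 0)
    return NULL;  /* Unusual name: no mapping by prefix.  */
  return name + len;
}
```

Returning NULL for an unexpected name leaves the caller free to fall
back on some other lookup instead of indexing past the string start.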
PR 21412
* elf-bfd.h (struct elf_backend_data <get_reloc_section>): Change
parameters and comment.
(_bfd_elf_get_reloc_section): Delete.
(_bfd_elf_plt_get_reloc_section): Declare.
* elf.c (_bfd_elf_plt_get_reloc_section, elf_get_reloc_section):
New functions. Don't blindly skip over assumed .rel/.rela prefix.
Extracted from..
(_bfd_elf_get_reloc_section): ..here. Delete.
(assign_section_numbers): Call elf_get_reloc_section.
* elf64-ppc.c (elf_backend_get_reloc_section): Define.
* elfxx-target.h (elf_backend_get_reloc_section): Update.
-z nodynamic-undefined-weak is only implemented for x86. (The sparc
backend has some support code but doesn't enable the option by
including ld/emulparams/dynamic_undefined_weak.sh, and since the
support looks like it may be broken I haven't enabled it.) This patch
adds the complementary -z dynamic-undefined-weak, extends both options
to affect building of shared libraries as well as executables, and
adds support for the option on powerpc.
include/
* bfdlink.h (struct bfd_link_info <dynamic_undefined_weak>):
Revise comment.
bfd/
* elflink.c (_bfd_elf_adjust_dynamic_symbol): Hide undefweak
or make dynamic for info->dynamic_undefined_weak 0 and 1.
* elf32-ppc.c: Formatting.
(ensure_undefweak_dynamic): Don't make dynamic when
info->dynamic_undefined_weak is zero.
(allocate_dynrelocs): Discard undefweak dyn_relocs for
info->dynamic_undefined_weak. Discard undef dyn_relocs when
not default visibility. Discard undef and undefweak
dyn_relocs earlier.
(ppc_elf_relocate_section): Adjust to suit.
* elf64-ppc.c: Formatting.
(ensure_undefweak_dynamic): Don't make dynamic when
info->dynamic_undefined_weak is zero.
(allocate_dynrelocs): Discard undefweak dyn_relocs for
info->dynamic_undefined_weak. Discard them earlier.
ld/
* ld.texinfo (dynamic-undefined-weak): Document.
(nodynamic-undefined-weak): Document that this option now can
be used with shared libs.
* emulparams/dynamic_undefined_weak.sh: Support -z
dynamic-undefined-weak.
* emulparams/elf32ppccommon.sh: Include dynamic_undefined_weak.sh.
* testsuite/ld-undefined/weak-undef.exp (undef_weak_so),
(undef_weak_exe): New. Use them. Add -z dynamic-undefined-weak
and -z nodynamic-undefined-weak tests.
* Makefile.am: Update powerpc dependencies.
* Makefile.in: Regenerate.
* elf32-arm.c (arm_type_of_stub): Supply missing args to "long
branch veneers" error. Fix double space and format message.
* elf32-avr.c (avr_add_stub): Do not pass NULL as %B arg.
* elf64-ppc.c (tocsave_find): Supply missing %B arg.
If you should somehow link non-pic objects into a PIE or shared
library, resulting in an object with DT_TEXTREL (text relocations)
set, and your executable or shared library also contains GNU indirect
functions, then you're in trouble. To apply dynamic relocations
ld.so will make the text segment writable. On most systems this will
make the text segment non-executable, which will then result in a
segfault when ld.so tries to run ifunc resolvers when applying
relocations against ifuncs.
This patch teaches PowerPC ld to detect the situation, and warn.
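The shape of such a check might look like the C sketch below. The two
flag names come from the ChangeLog entry; the enum, the function and
in particular the mapping of the "maybe" flag to a warning rather
than an error are assumptions made for illustration.

```c
/* Hypothetical result of the final DT_TEXTREL check.  */
enum ifunc_textrel_diag
{
  DIAG_NONE,
  DIAG_WARNING,
  DIAG_ERROR
};

/* HAS_TEXTREL is non-zero when the output needs DT_TEXTREL.  The
   other two arguments model the hash table flags that are set while
   emitting dynamic relocations to ifuncs.  */
static enum ifunc_textrel_diag
check_ifunc_textrel (int has_textrel,
                     int local_ifunc_resolver,
                     int maybe_local_ifunc_resolver)
{
  if (!has_textrel)
    return DIAG_NONE;
  if (local_ifunc_resolver)
    return DIAG_ERROR;    /* A resolver is known to be in this object.  */
  if (maybe_local_ifunc_resolver)
    return DIAG_WARNING;  /* A resolver might be in this object.  */
  return DIAG_NONE;
}
```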
* elf64-ppc.c (struct ppc_link_hash_table): Add
local_ifunc_resolver and maybe_local_ifunc_resolver.
(ppc_build_one_stub): Set flags on emitting dynamic
relocation to ifunc.
(ppc64_elf_relocate_section): Likewise.
(ppc64_elf_finish_dynamic_symbol): Likewise.
(ppc64_elf_finish_dynamic_sections): Error on DT_TEXTREL with
local dynamic relocs to ifuncs.
* elf32-ppc.c (struct ppc_elf_link_hash_table): Add
local_ifunc_resolver and maybe_local_ifunc_resolver.
(ppc_elf_relocate_section): Set flag on emitting dynamic
relocation to ifuncs.
(ppc_elf_finish_dynamic_symbol): Likewise.
(ppc_elf_finish_dynamic_sections): Error on DT_TEXTREL with local
dynamic relocs to ifuncs.
ppc64_elf_relocate_section lacked a check which meant that it emitted
dynamic relocs against a hidden undefweak symbol for which no dynamic
relocs had been allocated.
PR 21224
PR 20519
* elf64-ppc.c (ppc64_elf_relocate_section): Add missing
dyn_relocs check.
In the last patch I said "The patch also fixes overflow checking".
In fact, there wasn't anything wrong with the previous code. So,
revert that change. The new checks are OK too, so this is just a
tidy.
* elf64-ppc.c (ppc64_elf_ha_reloc): Revert last change.
(ppc64_elf_relocate_section): Likewise.
This came up because I was looking at ld/tmpdir/addpcis.o and noticed
the odd addends on REL16DX_HA. They ought to both be -4. The error
crept in due to the REL16DX_HA howto being pc-relative (as indeed it should
be), and code at gas/write.c:1001 after this comment
/* Make it pc-relative. If the back-end code has not
selected a pc-relative reloc, cancel the adjustment
we do later on all pc-relative relocs. */
*not* cancelling the pc-relative adjustment. So I've made a dummy
non-relative split reloc so that the generic code handles this, rather
than attempting to add hacks later in md_apply_fix which would not be
very robust. Having the new internal reloc also makes it easy to
support
addpcis rx,sym@ha
as an equivalent to
addpcis rx,(sym-0f)@ha
0:
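The -4 addend follows from addpcis adding its operand to the address
of the next instruction. A small C sketch of the arithmetic, assuming
4-byte instructions and the usual S + A - P form of a pc-relative
reloc; the function names are made up for illustration.

```c
#include <stdint.h>

/* Next instruction address for a 4-byte insn at address P.  */
static uint64_t
nia_of (uint64_t p)
{
  return p + 4;
}

/* Value computed for a pc-relative reloc: S + A - P.  */
static int64_t
pcrel_value (uint64_t s, int64_t a, uint64_t p)
{
  return (int64_t) (s + (uint64_t) a - p);
}

/* What "(sym-0f)" evaluates to when label 0 directly follows the
   addpcis at address P.  */
static int64_t
sym_minus_label (uint64_t s, uint64_t p)
{
  return (int64_t) (s - nia_of (p));
}
```

For any S and P, pcrel_value (S, -4, P) equals sym_minus_label (S, P),
which is why both forms of the operand want the same addend.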
The patch also fixes overflow checking, which must test whether the
addi will overflow too since @l relocs don't have any overflow check.
Lastly, since I was poking at md_apply_fix, I arranged to have the
generic gas/write.c code emit errors for subtraction expressions where
we lack reloc support.
include/
* elf/ppc64.h (R_PPC64_16DX_HA): New. Expand fake reloc comment.
* elf/ppc.h (R_PPC_16DX_HA): Likewise.
bfd/
* reloc.c (BFD_RELOC_PPC_16DX_HA): New.
* elf64-ppc.c (ppc64_elf_howto_raw <R_PPC64_16DX_HA>): New howto.
(ppc64_elf_reloc_type_lookup): Translate new bfd reloc.
(ppc64_elf_ha_reloc): Correct overflow test on REL16DX_HA.
(ppc64_elf_relocate_section): Likewise.
* elf32-ppc.c (ppc_elf_howto_raw <R_PPC_16DX_HA>): New howto.
(ppc_elf_reloc_type_lookup): Translate new bfd reloc.
(ppc_elf_check_relocs): Handle R_PPC_16DX_HA to pacify gcc.
* libbfd.h: Regenerate.
* bfd-in2.h: Regenerate.
gas/
* config/tc-ppc.c (md_assemble): Use BFD_RELOC_PPC_16DX_HA for addpcis.
(md_apply_fix): Remove fx_subsy check. Move code converting to
pcrel reloc earlier and handle BFD_RELOC_PPC_16DX_HA. Remove code
emitting errors on seeing fx_pcrel set on unexpected relocs, as
that is done now by the generic code via..
* config/tc-ppc.h (TC_FORCE_RELOCATION_SUB_LOCAL): ..this. Define.
(TC_VALIDATE_FIX_SUB): Define.
ld/
* testsuite/ld-powerpc/addpcis.d: Define ext1 and ext2 at
limits of addpcis range.
I'd made this dynamic section read-only so a flag test distinguished
it from .dynbss, but like any other .data.rel.ro section it really
should be marked read-write. (It is read-only after relocation, not
before.) When using the standard linker scripts this usually doesn't
matter since the output section is among other read-write sections and
not page aligned. However, it might matter in the extraordinary case
of the dynamic section being the only .data.rel.ro section with the
output section just happening to be page aligned and a multiple of a
page in size. In that case the output section would be read-only, and
live in its own read-only PT_LOAD segment, which is incorrect.
* elflink.c (_bfd_elf_create_dynamic_sections): Don't make
dynamic .data.rel.ro read-only.
* elf32-arm.c (elf32_arm_finish_dynamic_symbol): Compare section
rather than section flags when deciding where copy reloc goes.
* elf32-cris.c (elf_cris_finish_dynamic_symbol): Likewise.
* elf32-hppa.c (elf32_hppa_finish_dynamic_symbol): Likewise.
* elf32-i386.c (elf_i386_finish_dynamic_symbol): Likewise.
* elf32-metag.c (elf_metag_finish_dynamic_symbol): Likewise.
* elf32-microblaze.c (microblaze_elf_finish_dynamic_symbol): Likewise.
* elf32-nios2.c (nios2_elf32_finish_dynamic_symbol): Likewise.
* elf32-or1k.c (or1k_elf_finish_dynamic_symbol): Likewise.
* elf32-ppc.c (ppc_elf_finish_dynamic_symbol): Likewise.
* elf32-s390.c (elf_s390_finish_dynamic_symbol): Likewise.
* elf32-tic6x.c (elf32_tic6x_finish_dynamic_symbol): Likewise.
* elf32-tilepro.c (tilepro_elf_finish_dynamic_symbol): Likewise.
* elf64-ppc.c (ppc64_elf_finish_dynamic_symbol): Likewise.
* elf64-s390.c (elf_s390_finish_dynamic_symbol): Likewise.
* elf64-x86-64.c (elf_x86_64_finish_dynamic_symbol): Likewise.
* elfnn-aarch64.c (elfNN_aarch64_finish_dynamic_symbol): Likewise.
* elfnn-riscv.c (riscv_elf_finish_dynamic_symbol): Likewise.
* elfxx-mips.c (_bfd_mips_vxworks_finish_dynamic_symbol): Likewise.
* elfxx-sparc.c (_bfd_sparc_elf_finish_dynamic_symbol): Likewise.
* elfxx-tilegx.c (tilegx_elf_finish_dynamic_symbol): Likewise.
Remove an inconsistency in BFD linker error messages across the PowerPC
backends, where in the presence of line information the `%P: %H:' format
sequence makes the first error message produced for any given function
different from subsequent ones.
Taking the `ld/testsuite/ld-powerpc/tocopt7.s' test case source as an
example and the `powerpc-linux' target we have:
$ as -gdwarf2 -o tocopt.o -a64 tocopt.s
$ ld -o tocopt -melf64ppc tocopt.o
ld: tocopt.o: In function `_start':
tocopt.s:35:(.text+0x14): toc optimization is not supported for 0x3fa00000 instruction.
ld: tocopt.s:49:(.text+0x34): toc optimization is not supported for 0x3fa00000 instruction.
$
where the first error message does not have the source file name
prefixed with the linker program executable's name, i.e. `ld:', whereas
the second error message does, as would any subsequent one.
This is because with a multiple-line error message such as `%H' produces,
`%P' only prints the program executable's name on the first line and not
any later ones. Also the PowerPC backend is the only part of BFD which
uses `%P' along with one of the clever `%C', `%D', `%G', `%H' format
specifiers. And last but not least this breaks a GNU Coding Standard's
requirement that error messages from compilers should look like this:
source-file-name:lineno: message
also quoted in `vfinfo' code handling these specifiers.
Convert `%P: %H:' to `%H:' in error messages across the PowerPC backends
then, yielding:
$ as -gdwarf2 -o tocopt.o -a64 tocopt.s
$ ld -o tocopt -melf64ppc tocopt.o
tocopt.o: In function `_start':
tocopt.s:35:(.text+0x14): toc optimization is not supported for 0x3fa00000 instruction.
tocopt.s:49:(.text+0x34): toc optimization is not supported for 0x3fa00000 instruction.
$
instead, making it consistent and matching the GNU Coding Standard's
requirement.
bfd/
* elf32-ppc.c (ppc_elf_check_relocs): Use `%H:' rather than
`%P: %H:' with `info->callbacks->einfo'.
(ppc_elf_relocate_section): Likewise.
* elf64-ppc.c (ppc64_elf_check_relocs): Likewise.
(ppc64_elf_edit_toc): Likewise.
(ppc64_elf_relocate_section): Likewise.