In the TLS GD/LD to LE optimization, ld replaces a sequence like
addi 3,2,x@got@tlsgd R_PPC64_GOT_TLSGD16 x
bl __tls_get_addr(x@tlsgd) R_PPC64_TLSGD x
R_PPC64_REL24 __tls_get_addr
nop
with
addis 3,13,x@tprel@ha R_PPC64_TPREL16_HA x
addi 3,3,x@tprel@l R_PPC64_TPREL16_LO x
nop
When the tprel offset is small, this can be further optimized to
nop
addi 3,13,x@tprel
nop
bfd/
* elf64-ppc.c (struct ppc_link_hash_table): Add do_tls_opt.
(ppc64_elf_tls_optimize): Set it.
(ppc64_elf_relocate_section): Nop addis on TPREL16_HA, and convert
insn on TPREL16_LO and TPREL16_LO_DS relocs to use r13 when
addis would add zero.
* elf32-ppc.c (struct ppc_elf_link_hash_table): Add do_tls_opt.
(ppc_elf_tls_optimize): Set it.
(ppc_elf_relocate_section): Nop addis on TPREL16_HA, and convert
insn on TPREL16_LO relocs to use r2 when addis would add zero.
gold/
* powerpc.cc (Target_powerpc::Relocate::relocate): Nop addis on
TPREL16_HA, and convert insn on TPREL16_LO and TPREL16_LO_DS
relocs to use r2/r13 when addis would add zero.
ld/
* testsuite/ld-powerpc/tls.s: Add calls with tls markers.
* testsuite/ld-powerpc/tls32.s: Likewise.
* testsuite/ld-powerpc/powerpc.exp: Run tls marker tests.
* testsuite/ld-powerpc/tls.d: Adjust for TPREL16_HA/LO optimization.
* testsuite/ld-powerpc/tlsexe.d: Likewise.
* testsuite/ld-powerpc/tlsexetoc.d: Likewise.
* testsuite/ld-powerpc/tlsld.d: Likewise.
* testsuite/ld-powerpc/tlsmark.d: Likewise.
* testsuite/ld-powerpc/tlsopt4.d: Likewise.
* testsuite/ld-powerpc/tlstoc.d: Likewise.
There isn't a good reason for ld.bfd to behave differently from gold
in the code generated by TLS GD/LD to LE optimization.
bfd/
* elf64-ppc.c (ppc64_elf_relocate_section): When optimizing
__tls_get_addr call sequences to LE, don't move the addi down
to the nop. Replace the bl with addi and leave the nop alone.
ld/
* testsuite/ld-powerpc/tls.d: Update.
* testsuite/ld-powerpc/tlsexe.d: Update.
* testsuite/ld-powerpc/tlsexetoc.d: Update.
* testsuite/ld-powerpc/tlsld.d: Update.
* testsuite/ld-powerpc/tlsmark.d: Update.
* testsuite/ld-powerpc/tlsopt4.d: Update.
* testsuite/ld-powerpc/tlstoc.d: Update.
Tidy how these are handled in PIEs.
* elf32-ppc.c (must_be_dyn_reloc): Use bfd_link_dll. Comment.
(ppc_elf_check_relocs): Only set DF_STATIC_TLS in shared libs.
(ppc_elf_relocate_section): Comment fix.
* elf64-ppc.c (must_be_dyn_reloc): Use bfd_link_dll. Comment.
(ppc64_elf_check_relocs): Only set DF_STATIC_TLS in shared libs.
Support dynamic relocs for TPREL16 when non-pic too.
(dec_dynrel_count): Adjust TPREL16 handling as per check_relocs.
(ppc64_elf_relocate_section): Support dynamic relocs for TPREL16
when non-pic too.
..if they have dynamic relocs. An undefined symbol in a PIC object
that finds no definition ought to become dynamic in order to support
--allow-shlib-undefined, but there is nothing in the generic ELF
linker code to do this if the reference isn't via the GOT or PLT. (An
initialized function pointer is an example.) So it falls to backend
code to ensure the symbol is made dynamic.
PR 21988
* elf64-ppc.c (ensure_undef_dynamic): Rename from
ensure_undefweak_dynamic. Handle undefined too.
* elf32-ppc.c (ensure_undef_dynamic): Likewise.
* elf32-hppa.c (ensure_undef_dynamic): Likewise.
(allocate_dynrelocs): Discard undefined non-default visibility
relocs first. Make undefined syms dynamic. Tidy goto.
This makes ld warn about --plt-localentry if a version of glibc
without the necessary ld.so checks is detected, and revises the
documentation.
bfd/
* elf64-ppc.c (ppc64_elf_tls_setup): Warn on --plt-localentry
without ld.so checks.
gold/
* powerpc.cc (Target_powerpc::scan_relocs): Warn on --plt-localentry
without ld.so checks.
ld/
* ld.texinfo (plt-localentry): Revise.
The big comment in ppc64_elf_tls_setup says why. I've also added some
code to the bfd linker that catches the -lpthread -lc symbol
differences and disable generation of optimized call stubs even when
--plt-localentry is activated. Gold doesn't yet have that.
PR 21847
bfd/
* elf64-ppc.c (struct ppc_link_hash_entry): Add non_zero_localentry.
(ppc64_elf_merge_symbol): Set non_zero_localentry.
(is_elfv2_localentry0): Test non_zero_localentry.
(ppc64_elf_tls_setup): Default to --no-plt-localentry.
gold/
* powerpc.cc (Target_powerpc::scan_relocs): Default to
--no-plt-localentry.
ld/
* ld.texinfo (plt-localentry): Document.
Since the __tls_get_addr_opt stub saves LR and makes a call, eh_frame
info should be generated to describe how to unwind through the stub.
The patch also changes the way the backend iterates over stubs, from
looking at all sections in stub_bfd to which all dynamic sections are
attached as well, to iterating over the group list, which gets just
the stub sections. Most binaries will have just one or two stub
groups, so this is a little faster.
bfd/
* elf64-ppc.c (struct map_stub): Add tls_get_addr_opt_bctrl.
(stub_eh_frame_size): New function.
(ppc_size_one_stub): Set group tls_get_addr_opt_bctrl.
(group_sections): Init group tls_get_addr_opt_bctrl.
(ppc64_elf_size_stubs): Update sizing and initialization of
.eh_frame. Iteration over stubs via group list.
(ppc64_elf_build_stubs): Iterate over stubs via group list.
(ppc64_elf_finish_dynamic_sections): Update finalization of
.eh_frame.
ld/
* testsuite/ld-powerpc/tlsopt5.s: Add cfi.
* testsuite/ld-powerpc/tlsopt5.d: Update.
* testsuite/ld-powerpc/tlsopt5.wf: New file.
* testsuite/ld-powerpc/powerpc.exp: Perform new tlsopt5 test.
My PPC64_OPT_LOCALENTRY patch of June 1, git commit f378ab099d, and
the later gold change, git commit 7ee7ff7015, added an insn in
__glink_PLTresolve which needs a corresponding adjustment in the
eh_frame info for asynchronous exceptions to unwind correctly.
It would have been OK for both ABIs to use +5 for the advance before
restore of LR, since we can put the DW_CFA_restore_extended on any
insn after the actual restore and before the r12/r0 copy is clobbered,
but it's slightly better to delay as much as possible. There are
then more addresses where fewer CFA program insns are executed.
bfd/
* elf64-ppc.c (ppc64_elf_size_stubs): Correct advance to
restore of LR.
gold/
* powerpc.cc (glink_eh_frame_fde_64v2): Correct advance to
restore of LR.
(glink_eh_frame_fde_64v1): Advance to restore of LR at latest
possible insn.
My 2017-01-24 patch (commit f0158f44) wrongly applied an optimization
of GOT entries for the __tls_get_addr_opt stub, to shared libraries.
When the TLS segment layout is known, as it is for the executable and
shared libraries loaded at initial program start, powerpc supports a
__tls_get_addr optimization. On the first call to __tls_get_addr for
a given __tls_index GOT entry, the DTPMOD word is set to zero and the
DTPREL word to the thread pointer offset to the thread variable. This
allows the __tls_get_addr_opt stub to return that value immediately
without making a call into glibc for any subsequent __tls_get_addr
calls using that __tls_index GOT entry.
That's all fine, but I thought I'd be clever and when the thread
variable is local, set up the GOT entry as if __tls_get_addr had
already been called. Which is good only for the executable, since ld
cannot know the TLS layout for shared libraries.
Of course, if this only applies to executables there isn't much point
to the optimization. Normally, GD and LD code in an executable will
be converted to IE or LE, losing the __tls_get_addr call. So the only
time it will trigger is with --no-tls-optimize. Thus, revert all
support.
* elf64-ppc.c (ppc64_elf_relocate_section): Don't optimize
__tls_index GOT entries when using __tls_get_addr_opt stub.
* elf32-ppc.c (ppc_elf_relocate_section): Likewise.
These don't need a following nop. Also, a localentry:0 plt call
marked with an R_PPC64_TOCSAVE reloc should ignore the tocsave.
There's no need to save r2 in the prologue for such calls.
* elf64-ppc.c (ppc64_elf_size_stubs): Test for localentry:0 plt
calls before tocsave calls.
(ppc64_elf_relocate_section): Allow localentry:0 plt calls without
following nop.
ELFv2 functions with localentry:0 are those with a single entry point,
ie. global entry == local entry, and that have no requirement on r2 or
r12, and guarantee r2 is unchanged on return. Such an external
function can be called via the PLT without saving r2 or restoring it
on return, avoiding a common load-hit-store for small functions. The
optimization is attractive. The TOC pointer load-hit-store is a major
reason why calls to small functions that need no register saves, or
with shrink-wrap, no register saves on a fast path, are slow on
powerpc64le.
To be safe, this optimization needs ld.so support to check that the
run-time matches link-time function implementation. If a function
in a shared library with st_other localentry non-zero is called
without saving and restoring r2, r2 will be trashed on return, leading
to segfaults. For that reason the optimization does not happen for
weak functions since a weak definition is a fairly solid hint that the
function will likely be overridden. I'm also not enabling the
optimization by default unless glibc-2.26 is detected, which should
have the ld.so checks implemented.
bfd/
* elf64-ppc.c (struct ppc_link_hash_table): Add has_plt_localentry0.
(ppc64_elf_merge_symbol_attribute): Merge localentry bits from
dynamic objects.
(is_elfv2_localentry0): New function.
(ppc64_elf_tls_setup): Default params->plt_localentry0.
(plt_stub_size): Adjust size for tls_get_addr_opt stub.
(build_tls_get_addr_stub): Use a simpler stub when r2 is not saved.
(ppc64_elf_size_stubs): Leave stub_type as ppc_stub_plt_call for
optimized localentry:0 stubs.
(ppc64_elf_build_stubs): Save r2 in ELFv2 __glink_PLTresolve.
(ppc64_elf_relocate_section): Leave nop unchanged for optimized
localentry:0 stubs.
(ppc64_elf_finish_dynamic_sections): Set PPC64_OPT_LOCALENTRY in
DT_PPC64_OPT.
* elf64-ppc.h (struct ppc64_elf_params): Add plt_localentry0.
include/
* elf/ppc64.h (PPC64_OPT_LOCALENTRY): Define.
ld/
* emultempl/ppc64elf.em (params): Init plt_localentry0 field.
(enum ppc64_opt): New, replacing OPTION_* defines. Add
OPTION_PLT_LOCALENTRY, and OPTION_NO_PLT_LOCALENTRY.
(PARSE_AND_LIST_*): Support --plt-localentry and --no-plt-localentry.
* testsuite/ld-powerpc/elfv2so.d: Update.
* testsuite/ld-powerpc/powerpc.exp (TLS opt 5): Use --no-plt-localentry.
* testsuite/ld-powerpc/tlsopt5.d: Update.
dynamic_ref_after_ir_def is a little odd compared to other symbol
flags in that as the name suggests, it is set only for certain
references after a definition. It turns out that setting a flag for
any non-ir reference from a dynamic object can be used to solve the
problem for which this flag was invented, which I think is a cleaner.
This patch does that, and sets non_ir_ref only for regular object
references.
include/
* bfdlink.h (struct bfd_link_hash_entry): Update non_ir_ref
comment. Rename dynamic_ref_after_ir_def to non_ir_ref_dynamic.
ld/
* plugin.c (is_visible_from_outside): Use non_ir_ref_dynamic.
(plugin_notice): Set non_ir_ref for references from regular
objects, non_ir_ref_dynamic for references from dynamic objects.
bfd/
* elf64-ppc.c (add_symbol_adjust): Transfer non_ir_ref_dynamic.
* elflink.c (elf_link_add_object_symbols): Update to use
non_ir_ref_dynamic.
(elf_link_input_bfd): Test non_ir_ref_dynamic in addition to
non_ir_ref.
* linker.c (_bfd_generic_link_add_one_symbol): Likewise.
This patch fixes a number of cases where -z nodynamic-undefined-weak
was not effective in preventing dynamic relocations or linkage stubs.
* elf32-ppc.c (UNDEFWEAK_NO_DYNAMIC_RELOC): Define.
(ppc_elf_select_plt_layout, ppc_elf_tls_setup): Use it.
(ppc_elf_adjust_dynamic_symbol, allocate_dynrelocs): Likewise.
(ppc_elf_relocate_section): Likewise. Delete silly optimisation
for undef and undefweak dyn_relocs.
* elf64-ppc.c (UNDEFWEAK_NO_DYNAMIC_RELOC): Define.
(ppc64_elf_adjust_dynamic_symbol, ppc64_elf_tls_setup): Use it.
(allocate_got, allocate_dynrelocs): Likewise.
(ppc64_elf_relocate_section): Likewise.
This patch fixes an assumption made by code that runs for objcopy and
strip, that SHT_REL/SHR_RELA sections are always named starting with a
.rel/.rela prefix. I'm also modifying the interface for
elf_backend_get_reloc_section, so any backend function just needs to
handle name mapping.
PR 21412
* elf-bfd.h (struct elf_backend_data <get_reloc_section>): Change
parameters and comment.
(_bfd_elf_get_reloc_section): Delete.
(_bfd_elf_plt_get_reloc_section): Declare.
* elf.c (_bfd_elf_plt_get_reloc_section, elf_get_reloc_section):
New functions. Don't blindly skip over assumed .rel/.rela prefix.
Extracted from..
(_bfd_elf_get_reloc_section): ..here. Delete.
(assign_section_numbers): Call elf_get_reloc_section.
* elf64-ppc.c (elf_backend_get_reloc_section): Define.
* elfxx-target.h (elf_backend_get_reloc_section): Update.
-z nodynamic-undefined-weak is only implemented for x86. (The sparc
backend has some support code but doesn't enable the option by
including ld/emulparams/dynamic_undefined_weak.sh, and since the
support looks like it may be broken I haven't enabled it.) This patch
adds the complementary -z dynamic-undefined-weak, extends both options
to affect building of shared libraries as well as executables, and
adds support for the option on powerpc.
include/
* bfdlink.h (struct bfd_link_info <dynamic_undefined_weak>):
Revise comment.
bfd/
* elflink.c (_bfd_elf_adjust_dynamic_symbol): Hide undefweak
or make dynamic for info->dynamic_undefined_weak 0 and 1.
* elf32-ppc.c:Formatting.
(ensure_undefweak_dynamic): Don't make dynamic when
info->dynamic_undefined_weak is zero.
(allocate_dynrelocs): Discard undefweak dyn_relocs for
info->dynamic_undefined_weak. Discard undef dyn_relocs when
not default visibility. Discard undef and undefweak
dyn_relocs earlier.
(ppc_elf_relocate_section): Adjust to suit.
* elf64-ppc.c: Formatting.
(ensure_undefweak_dynamic): Don't make dynamic when
info->dynamic_undefined_weak is zero.
(allocate_dynrelocs): Discard undefweak dyn_relocs for
info->dynamic_undefined_weak. Discard them earlier.
ld/
* ld.texinfo (dynamic-undefined-weak): Document.
(nodynamic-undefined-weak): Document that this option now can
be used with shared libs.
* emulparams/dynamic_undefined_weak.sh: Support -z
dynamic-undefined-weak.
* emulparams/elf32ppccommon.sh: Include dynamic_undefined_weak.sh.
* testsuite/ld-undefined/weak-undef.exp (undef_weak_so),
(undef_weak_exe): New. Use them. Add -z dynamic-undefined-weak
and -z nodynamic-undefined-weak tests.
* Makefile.am: Update powerpc dependencies.
* Makefile.in: Regenerate.
* elf32-arm.c (arm_type_of_stub): Supply missing args to "long
branch veneers" error. Fix double space and format message.
* elf32-avr.c (avr_add_stub): Do not pass NULL as %B arg.
* elf64-ppc.c (tocsave_find): Supply missing %B arg.
If you should somehow link non-pic objects into a PIE or shared
library, resulting in an object with DT_TEXTREL (text relocations)
set, and your executable or shared library also contains GNU indirect
functions, then you're in trouble. To apply dynamic relocations
ld.so will make the text segment writable. On most systems this will
make the text segment non-executable, which will then result in a
segfault when ld.so tries to run ifunc resolvers when applying
relocations against ifuncs.
This patch teaches PowerPC ld to detect the situation, and warn.
* elf64-ppc.c (struct ppc_link_hash_table): Add
local_ifunc_resolver and maybe_local_ifunc_resolver.
(ppc_build_one_stub): Set flags on emitting dynamic
relocation to ifunc.
(ppc64_elf_relocate_section): Likewise.
(ppc64_elf_finish_dynamic_symbol): Likewise.
(ppc64_elf_finish_dynamic_sections): Error on DT_TEXTREL with
local dynamic relocs to ifuncs.
* elf32-ppc.c (struct ppc_elf_link_hash_table): Add
local_ifunc_resolver and maybe_local_ifunc_resolver.
(ppc_elf_relocate_section): Set flag on emitting dynamic
relocation to ifuncs.
(ppc_elf_finish_dynamic_symbol): Likewise.
(ppc_elf_finish_dynamic_sections): Error on DT_TEXTREL with local
dynamic relocs to ifuncs.
ppc64_elf_relocate_section lacked a check which meant that it emitted
dynamic relocs against a hidden undefweak symbol for which no dynamic
relocs had been allocated.
PR 21224
PR 20519
* elf64-ppc.c (ppc64_elf_relocate_section): Add missing
dyn_relocs check.
In the last patch I said "The patch also fixes overflow checking".
In fact, there wasn't anything wrong with the previous code. So,
revert that change. The new checks are OK too, so this is just a
tidy.
* elf64-ppc.c (ppc64_elf_ha_reloc): Revert last change.
(ppc64_elf_relocate_section): Likewise.
This came up because I was looking at ld/tmpdir/addpcis.o and noticed
the odd addends on REL16DX_HA. They ought to both be -4. The error
crept in due REL16DX_HA howto being pc-relative (as indeed it should
be), and code at gas/write.c:1001 after this comment
/* Make it pc-relative. If the back-end code has not
selected a pc-relative reloc, cancel the adjustment
we do later on all pc-relative relocs. */
*not* cancelling the pc-relative adjustment. So I've made a dummy
non-relative split reloc so that the generic code handles this, rather
than attempting to add hacks later in md_apply_fix which would not be
very robust. Having the new internal reloc also makes it easy to
support
addpcis rx,sym@ha
as an equivalent to
addpcis rx,(sym-0f)@ha
0:
The patch also fixes overflow checking, which must test whether the
addi will overflow too since @l relocs don't have any overflow check.
Lastly, since I was poking at md_apply_fix, I arranged to have the
generic gas/write.c code emit errors for subtraction expressions where
we lack reloc support.
include/
* elf/ppc64.h (R_PPC64_16DX_HA): New. Expand fake reloc comment.
* elf/ppc.h (R_PPC_16DX_HA): Likewise.
bfd/
* reloc.c (BFD_RELOC_PPC_16DX_HA): New.
* elf64-ppc.c (ppc64_elf_howto_raw <R_PPC64_16DX_HA>): New howto.
(ppc64_elf_reloc_type_lookup): Translate new bfd reloc.
(ppc64_elf_ha_reloc): Correct overflow test on REL16DX_HA.
(ppc64_elf_relocate_section): Likewise.
* elf32-ppc.c (ppc_elf_howto_raw <R_PPC_16DX_HA>): New howto.
(ppc_elf_reloc_type_lookup): Translate new bfd reloc.
(ppc_elf_check_relocs): Handle R_PPC_16DX_HA to pacify gcc.
* libbfd.h: Regenerate.
* bfd-in2.h: Regenerate.
gas/
* config/tc-ppc.c (md_assemble): Use BFD_RELOC_PPC_16DX_HA for addpcis.
(md_apply_fix): Remove fx_subsy check. Move code converting to
pcrel reloc earlier and handle BFD_RELOC_PPC_16DX_HA. Remove code
emiiting errors on seeing fx_pcrel set on unexpected relocs, as
that is done now by the generic code via..
* config/tc-ppc.h (TC_FORCE_RELOCATION_SUB_LOCAL): ..this. Define.
(TC_VALIDATE_FIX_SUB): Define.
ld/
* testsuite/ld-powerpc/addpcis.d: Define ext1 and ext2 at
limits of addpcis range.
I'd made this dynamic section read-only so a flag test distinguished
it from .dynbss, but like any other .data.rel.ro section it really
should be marked read-write. (It is read-only after relocation, not
before.) When using the standard linker scripts this usually doesn't
matter since the output section is among other read-write sections and
not page aligned. However, it might matter in the extraordinary case
of the dynamic section being the only .data.rel.ro section with the
output section just happening to be page aligned and a multiple of a
page in size. In that case the output section would be read-only, and
live it its own read-only PT_LOAD segment, which is incorrect.
* elflink.c (_bfd_elf_create_dynamic_sections): Don't make
dynamic .data.rel.ro read-only.
* elf32-arm.c (elf32_arm_finish_dynamic_symbol): Compare section
rather than section flags when deciding where copy reloc goes.
* elf32-cris.c (elf_cris_finish_dynamic_symbol): Likewise.
* elf32-hppa.c (elf32_hppa_finish_dynamic_symbol): Likewise.
* elf32-i386.c (elf_i386_finish_dynamic_symbol): Likewise.
* elf32-metag.c (elf_metag_finish_dynamic_symbol): Likewise.
* elf32-microblaze.c (microblaze_elf_finish_dynamic_symbol): Likewise.
* elf32-nios2.c (nios2_elf32_finish_dynamic_symbol): Likewise.
* elf32-or1k.c (or1k_elf_finish_dynamic_symbol): Likewise.
* elf32-ppc.c (ppc_elf_finish_dynamic_symbol): Likewise.
* elf32-s390.c (elf_s390_finish_dynamic_symbol): Likewise.
* elf32-tic6x.c (elf32_tic6x_finish_dynamic_symbol): Likewise.
* elf32-tilepro.c (tilepro_elf_finish_dynamic_symbol): Likewise.
* elf64-ppc.c (ppc64_elf_finish_dynamic_symbol): Likewise.
* elf64-s390.c (elf_s390_finish_dynamic_symbol): Likewise.
* elf64-x86-64.c (elf_x86_64_finish_dynamic_symbol): Likewise.
* elfnn-aarch64.c (elfNN_aarch64_finish_dynamic_symbol): Likewise.
* elfnn-riscv.c (riscv_elf_finish_dynamic_symbol): Likewise.
* elfxx-mips.c (_bfd_mips_vxworks_finish_dynamic_symbol): Likewise.
* elfxx-sparc.c (_bfd_sparc_elf_finish_dynamic_symbol): Likewise.
* elfxx-tilegx.c (tilegx_elf_finish_dynamic_symbol): Likewise.
Remove an inconsistency in BFD linker error messages across the PowerPC
backends, where in the presence of line information the `%P: %H:' format
sequence makes the first error message produced for any given function
different from subsequent ones.
Taking the `ld/testsuite/ld-powerpc/tocopt7.s' test case source as an
example and the `powerpc-linux' target we have:
$ as -gdwarf2 -o tocopt.o -a64 tocopt.s
$ ld -o tocopt -melf64ppc tocopt.o
ld: tocopt.o: In function `_start':
tocopt.s:35:(.text+0x14): toc optimization is not supported for 0x3fa00000 instruction.
ld: tocopt.s:49:(.text+0x34): toc optimization is not supported for 0x3fa00000 instruction.
$
where the first error message does not have the source file name
prefixed with the linker program executable's name, i.e. `ld:', whereas
the second error message does, as would any subsequent.
This is because with a multiple-line error message such as `%H' produces
`%P' only prints the program executable's name on the first line and not
any later ones. Also the PowerPC backend is the only part of BFD which
uses `%P' along with one of the clever `%C', `%D', `%G', `%H' format
specifiers. And last but not least this breaks a GNU Coding Standard's
requirement that error messages from compilers should look like this:
source-file-name:lineno: message
also quoted in `vfinfo' code handling these specifiers.
Convert `%P: %H:' to `%H:' in error messages across the PowerPC backends
then, yielding:
$ as -gdwarf2 -o tocopt.o -a64 tocopt.s
$ ld -o tocopt -melf64ppc tocopt.o
tocopt.o: In function `_start':
tocopt.s:35:(.text+0x14): toc optimization is not supported for 0x3fa00000 instruction.
tocopt.s:49:(.text+0x34): toc optimization is not supported for 0x3fa00000 instruction.
$
instead, making it consistent and matching the GNU Coding Standard's
requirement.
bfd/
* elf32-ppc.c (ppc_elf_check_relocs): Use `%H:' rather than
`%P: %H:' with `info->callbacks->einfo'.
(ppc_elf_relocate_section): Likewise.
* elf64-ppc.c (ppc64_elf_check_relocs): Likewise.
(ppc64_elf_edit_toc): Likewise.
(ppc64_elf_relocate_section): Likewise.
Complement commit 81ff47b3a546 ("PR ld/20828: Fix linker script symbols
wrongly forced local with section GC") and move the symbol sweep stage
of section GC from `elf_gc_sweep' to `bfd_elf_size_dynamic_sections',
avoiding the need to clear the `forced_local' marker, problematic for
targets that have special processing in their `elf_backend_hide_symbol'
handler. Set `mark' instead in `bfd_elf_record_link_assignment' and,
matching changes from commit 3bd43ebcb602 ("ld --gc-sections fail with
__tls_get_addr_opt"), also in PowerPC `__tls_get_addr_opt' handling
code, removing a:
FAIL: PR ld/20828 dynamic symbols with section GC (version script)
test suite failure with the `score-elf' target.
The rationale is it is enough if symbols are swept at the beginning of
`bfd_elf_size_dynamic_sections' as it is only in this function that the
size of the GOT, the dynamic symbol table and other dynamic sections is
determined, which will depend on the number of symbols making it to the
dynamic symbol table. It is also appropriate to do the sweep at this
point as it is already after any changes have been made to symbols with
`bfd_elf_record_link_assignment', and not possible any earlier as calls
to that function are only made just beforehand -- barring audit entry
processing -- via `gld${EMULATION_NAME}_find_statement_assignment'
invoked from `gld${EMULATION_NAME}_before_allocation' which is the ELF
handler for `ldemul_before_allocation'.
bfd/
PR ld/20828
* elflink.c (bfd_elf_record_link_assignment): Revert last
change and don't ever clear `forced_local'. Set `mark'
unconditionally.
(elf_gc_sweep_symbol_info, elf_gc_sweep_symbol): Reorder within
file.
(elf_gc_sweep): Move the call to `elf_gc_sweep_symbol'...
(bfd_elf_size_dynamic_sections): ... here.
* elf32-ppc.c (ppc_elf_tls_setup): Don't clear `forced_local'
and set `mark' instead in `__tls_get_addr_opt' processing.
* elf64-ppc.c (ppc64_elf_tls_setup): Likewise.
This patch fixes a number of issues with powerpc dynamic relocations.
1) Both ppc and ppc64 were emitting more dynamic symbols and
relocations than necessary, due to not supporting static linker
resolution of tls_index entries for __tls_get_addr_opt. This meant
that any @got@tlsgd or @got@tlsld reloc needed to make their symbols
dynamic and generate dptmod and dtprel relocs for the dynamic linker.
That would have been passable, but what happened was that practically
all @got relocations resulted in their symbols being made dynamic and
dynamic relocations emitted against the GOT entries. (Mostly visible
on ppc32 executables since ppc64 gcc really only uses @got style
relocs for TLS.)
2) The PowerOpen syntax was not supported with __tls_get_addr_opt.
DTPMOD/DTPREL relocs on tls_index TOC entries did not use the trick of
forcing dynamic symbols and relocations so those entries always
resulted in the full __tls_get_addr processing. gcc doesn't use the
PowerOpen syntax for TLS, and normally such code would be optimized to
TLS IE or LE so the impact of missing this support was minimal.
3) In an executable, relocations against GNU indirect functions always
used the value of their PLT stub. While this is correct, it is
better in some cases to use a dynamic relocation. An extra dynamic
relocation can mean that calls via function pointers need not bounce
through the PLT stub at runtime.
The patch also tidies the PLT handling code in ppc32
allocate_dynrelocs. Allocating PLT entries after other dynamic relocs
allows the PLT loop to omit special handling for undefined weak
symbols, and that in turn allows the loop to be simplified.
bfd/
* elf32-ppc.c (ppc_elf_adjust_dynamic_symbol): Merge two cases
where dynamic relocs are preferable. Allow ifunc too.
(ensure_undefweak_dynamic): New function.
(allocate_dynrelocs): Use it here. Move plt handling last and
don't make symbols dynamic, simplifying loop. Only make undef
weak symbols with GOT entries dynamic. Correct condition
for GOT relocs. Handle dynamic relocs on ifuncs. Correct
comments. Remove goto.
(ppc_elf_relocate_section): Correct test for using dynamic
symbol on GOT relocs. Rearrange test for emitting GOT relocs
to suit. Set up explicit tls_index entries and implicit GOT
tls_index entries resolvable at link time for
__tls_get_addr_opt. Simplify test to clear mem for prelink.
* elf64-ppc.c (allocate_got): Correct condition for GOT relocs.
(ensure_undefweak_dynamic): New function.
(allocate_dynrelocs): Use it here. Only make undef weak symbols
with GOT entries dynamic. Remove unnecessary test of
WILL_CALL_FINISH_DYNAMIC_SYMBOL in PLT handling.
(ppc64_elf_relocate_section): Correct test for using dynamic
symbol on GOT relocs. Rearrange test for emitting GOT relocs
to suit. Set up explicit tls_index entries and implicit GOT
tls_index entries resolvable at link time for __tls_get_addr_opt.
Simplify expression to clear mem for prelink.
ld/
* testsuite/ld-powerpc/tlsexe.r: Update for fewer dynamic relocs
and symbols.
* testsuite/ld-powerpc/tlsexe.d: Likewise.
* testsuite/ld-powerpc/tlsexe.g: Likewise.
A while ago HJ fixed PR ld/18720 with commit 6e33951ed, which, among
other things, modified _bfd_elf_link_hash_copy_indirect to not copy
ref_dynamic, ref_regular, ref_regular_nonweak, non_got_ref, needs_plt
and pointer_equality_needed when setting up an indirect non-versioned
symbol pointing to a non-default versioned symbol. I didn't notice at
the time, but the pr18720 testcase fails on hppa-linux with
"internal error, aborting at binutils-gdb-2.28/bfd/elf32-hppa.c:3933
in elf32_hppa_relocate_section".
Now hppa-linux creates entries in the plt even for local functions, if
they are referenced using plabel (function pointer) relocations. So
needs_plt is set for foo when processing pr18720a.o. When the aliases
in pr28720b.o are processed, we get an indirection from foo to
foo@FOO, but don't copy needs_plt. Since foo@FOO is the "real" symbol
that is used after that point, no plt entry is made for foo and we
bomb when relocating the plabel.
As shown by the hppa-linux scenario, needs_plt should be copied even
for non-default versioned symbols. I believe all of the others ought
to be copied too, with the exception of ref_dynamic. Not copying
ref_dynamic is right because if a shared lib references "foo" it
should not be satisfied by any non-default version "foo@FOO".
* elflink.c (_bfd_elf_link_hash_copy_indirect): Only omit
copying one flag, ref_dynamic, when versioned_hidden.
* elf64-ppc.c (ppc64_elf_copy_indirect_symbol): Likewise.
* elf32-hppa.c (elf32_hppa_copy_indirect_symbol): Use same
logic for copying weakdef flags. Copy plabel flag and merge
tls_type.
* elf32-i386.c (elf_i386_copy_indirect_symbol): Use same logic
for copying weakdef flags.
* elf32-ppc.c (ppc_elf_copy_indirect_symbol): Likewise.
* elf32-s390.c (elf_s390_copy_indirect_symbol): Likewise.
* elf32-sh.c (sh_elf_copy_indirect_symbol): Likewise.
* elf64-s390.c (elf_s390_copy_indirect_symbol): Likewise.
* elfnn-ia64.c (elfNN_ia64_hash_copy_indirect): Likewise.
* elf64-x86-64.c (elf_x86_64_copy_indirect_symbol): Likewise.
Simplify.
Variables defined in shared libraries are copied into an executable's
.bss section when code in the executable is non-PIC and thus would
require dynamic text relocations to access the variable directly in
the shared library. Recent x86 toolchains also copy variables into
the executable to gain a small speed improvement.
The problem is that if the variable was originally read-only, the copy
in .bss is writable, potentially opening a security hole. This patch
cures that problem by putting the copy in a section that becomes
read-only after ld.so relocation, provided -z relro is in force.
The patch also fixes a microblaze linker segfault on attempting to
use dynamic bss variables.
bfd/
PR ld/20995
* elf-bfd.h (struct elf_link_hash_table): Add sdynrelro and
sreldynrelro.
(struct elf_backend_data): Add want_dynrelro.
* elfxx-target.h (elf_backend_want_dynrelro): Define.
(elfNN_bed): Update initializer.
* elflink.c (_bfd_elf_create_dynamic_sections): Create
sdynrelro and sreldynrelro sections.
* elf32-arm.c (elf32_arm_adjust_dynamic_symbol): Place variables
copied into the executable from read-only sections into sdynrelro.
(elf32_arm_size_dynamic_sections): Handle sdynrelro.
(elf32_arm_finish_dynamic_symbol): Select sreldynrelro for
dynamic relocs in sdynrelro.
(elf_backend_want_dynrelro): Define.
* elf32-hppa.c (elf32_hppa_adjust_dynamic_symbol)
(elf32_hppa_size_dynamic_sections, elf32_hppa_finish_dynamic_symbol)
(elf_backend_want_dynrelro): As above.
* elf32-i386.c (elf_i386_adjust_dynamic_symbol)
(elf_i386_size_dynamic_sections, elf_i386_finish_dynamic_symbol)
(elf_backend_want_dynrelro): As above.
* elf32-metag.c (elf_metag_adjust_dynamic_symbol)
(elf_metag_size_dynamic_sections, elf_metag_finish_dynamic_symbol)
(elf_backend_want_dynrelro): As above.
* elf32-microblaze.c (microblaze_elf_adjust_dynamic_symbol)
(microblaze_elf_size_dynamic_sections)
(microblaze_elf_finish_dynamic_symbol)
(elf_backend_want_dynrelro): As above.
* elf32-nios2.c (nios2_elf32_finish_dynamic_symbol)
(nios2_elf32_adjust_dynamic_symbol)
(nios2_elf32_size_dynamic_sections)
(elf_backend_want_dynrelro): As above.
* elf32-or1k.c (or1k_elf_finish_dynamic_symbol)
(or1k_elf_adjust_dynamic_symbol, or1k_elf_size_dynamic_sections)
(elf_backend_want_dynrelro): As above.
* elf32-ppc.c (ppc_elf_adjust_dynamic_symbol)
(ppc_elf_size_dynamic_sections, ppc_elf_finish_dynamic_symbol)
(elf_backend_want_dynrelro): As above.
* elf32-s390.c (elf_s390_adjust_dynamic_symbol)
(elf_s390_size_dynamic_sections, elf_s390_finish_dynamic_symbol)
(elf_backend_want_dynrelro): As above.
* elf32-tic6x.c (elf32_tic6x_adjust_dynamic_symbol)
(elf32_tic6x_size_dynamic_sections)
(elf32_tic6x_finish_dynamic_symbol)
(elf_backend_want_dynrelro): As above.
* elf32-tilepro.c (tilepro_elf_adjust_dynamic_symbol)
(tilepro_elf_size_dynamic_sections)
(tilepro_elf_finish_dynamic_symbol)
(elf_backend_want_dynrelro): As above.
* elf64-ppc.c (ppc64_elf_adjust_dynamic_symbol)
(ppc64_elf_size_dynamic_sections, ppc64_elf_finish_dynamic_symbol)
(elf_backend_want_dynrelro): As above.
* elf64-s390.c (elf_s390_adjust_dynamic_symbol)
(elf_s390_size_dynamic_sections, elf_s390_finish_dynamic_symbol)
(elf_backend_want_dynrelro): As above.
* elf64-x86-64.c (elf_x86_64_adjust_dynamic_symbol)
(elf_x86_64_size_dynamic_sections)
(elf_x86_64_finish_dynamic_symbol)
(elf_backend_want_dynrelro): As above.
* elfnn-aarch64.c (elfNN_aarch64_adjust_dynamic_symbol)
(elfNN_aarch64_size_dynamic_sections)
(elfNN_aarch64_finish_dynamic_symbol)
(elf_backend_want_dynrelro): As above.
* elfnn-riscv.c (riscv_elf_adjust_dynamic_symbol)
(riscv_elf_size_dynamic_sections, riscv_elf_finish_dynamic_symbol)
(elf_backend_want_dynrelro): As above.
* elfxx-mips.c (_bfd_mips_elf_adjust_dynamic_symbol)
(_bfd_mips_elf_size_dynamic_sections)
(_bfd_mips_vxworks_finish_dynamic_symbol): As above.
* elfxx-sparc.c (_bfd_sparc_elf_adjust_dynamic_symbol)
(_bfd_sparc_elf_size_dynamic_sections)
(_bfd_sparc_elf_finish_dynamic_symbol): As above.
* elfxx-tilegx.c (tilegx_elf_adjust_dynamic_symbol)
(tilegx_elf_size_dynamic_sections)
(tilegx_elf_finish_dynamic_symbol): As above.
* elf32-mips.c (elf_backend_want_dynrelro): Define.
* elf64-mips.c (elf_backend_want_dynrelro): Define.
* elf32-sparc.c (elf_backend_want_dynrelro): Define.
* elf64-sparc.c (elf_backend_want_dynrelro): Define.
* elf32-tilegx.c (elf_backend_want_dynrelro): Define.
* elf64-tilegx.c (elf_backend_want_dynrelro): Define.
* elf32-microblaze.c (microblaze_elf_adjust_dynamic_symbol): Tidy.
(microblaze_elf_size_dynamic_sections): Handle sdynbss.
* elf32-nios2.c (nios2_elf32_size_dynamic_sections): Make use
of linker shortcuts to dynamic sections rather than comparing
names. Correctly set "got" flag.
ld/
PR ld/20995
* testsuite/ld-arm/farcall-mixed-app-v5.d: Update to suit changed
stub hash table traversal caused by section id increment. Accept
the previous output too.
* testsuite/ld-arm/farcall-mixed-app.d: Likewise.
* testsuite/ld-arm/farcall-mixed-lib-v4t.d: Likewise.
* testsuite/ld-arm/farcall-mixed-lib.d: Likewise.
* testsuite/ld-elf/pr20995a.s, * testsuite/ld-elf/pr20995b.s,
* testsuite/ld-elf/pr20995.r: New test.
* testsuite/ld-elf/elf.exp: Run it.
Recognize power9 and a few other insns from older machines. Fixes
linker complaints like "toc optimization is not supported for
0xf4090002 instruction". 0xf4090002 is stxsd v0,0(r9)
bfd/
* elf64-ppc.c (ok_lo_toc_insn): Add r_type param. Recognize
lq,lfq,lxv,lxsd,lxssp,lfdp,stq,stfq,stxv,stxsd,stxssp,stfdp.
Don't match lmd and stmd.
ld/
* testsuite/ld-powerpc/tocopt7.s,
* testsuite/ld-powerpc/tocopt7.out,
* testsuite/ld-powerpc/tocopt7.d: New test.
* testsuite/ld-powerpc/tocopt8.s,
* testsuite/ld-powerpc/tocopt8.d: New test.
* testsuite/ld-powerpc/powerpc.exp: Run them.
Lots of fixes for the compatibility code that handles linking of
-mcall-aixdesc code (or that generated by 12 year old gcc) with
current ELFv1 ABI code.
1) A reference to a dot-symbol in an object file wasn't satisfied by a
function descriptor in later object files.
2) The as-needed code had bit-rotted; Shared libs now need a strong
reference to be counted as needed.
3) --gc-sections involving dot-symbols was broken, needing
func_desc_adjust to be run early and lots of other fixes.
bfd/
* elf64-ppc.c (struct ppc_link_hash_entry): Delete "was_undefined".
(struct ppc_link_hash_table): Delete "twiddled_syms". Add
"need_func_desc_adj".
(lookup_fdh): Link direct fdh sym via oh field and set flags.
(make_fdh): Make strong and weak undefined function descriptor
symbols.
(ppc64_elf_merge_symbol): New function.
(elf_backend_merge_symbol): Define.
(ppc64_elf_archive_symbol_lookup): Don't test undefweak for fake
function descriptors.
(add_symbol_adjust): Don't twiddle symbols to undefweak.
Propagate more ref flags to function descriptor symbol. Make
some function descriptor symbols dynamic.
(ppc64_elf_before_check_relocs): Only run add_symbol_adjust for
ELFv1. Set need_func_desc_adj. Don't fix undefs list.
(ppc64_elf_check_relocs): Set non_ir_ref for descriptors.
Don't call lookup_fdh here.
(ppc64_elf_gc_sections): New function.
(bfd_elf64_bfd_gc_sections): Define.
(ppc64_elf_gc_mark_hook): Mark descriptor.
(func_desc_adjust): Don't make fake function descriptor syms strong
here. Exit earlier on non-dotsyms. Take note of elf.dynamic
flag when deciding whether a dynamic function descriptor might
be needed. Transfer elf.dynamic and set elf.needs_plt. Move
plt regardless of visibility. Make descriptor dynamic if
entry sym is dynamic, not for other cases.
(ppc64_elf_func_desc_adjust): Don't run func_desc_adjust if
already done.
(ppc64_elf_edit_opd): Use oh field rather than lookup_fdh.
(ppc64_elf_size_stubs): Likewise.
(ppc_build_one_stub): Don't clear was_undefined. Only set sym
undefweak if stub symbol is defined.
(undo_symbol_twiddle, ppc64_elf_restore_symbols): Delete.
* elf64-ppc.h (ppc64_elf_restore_symbols): Don't declare.
ld/
* emultempl/ppc64elf.em (gld${EMULATION_NAME}_finish): Don't call
ppc64_elf_restore_symbols.
* testsuite/ld-powerpc/dotsym1.d: New.
* testsuite/ld-powerpc/dotsym2.d: New.
* testsuite/ld-powerpc/dotsym3.d: New.
* testsuite/ld-powerpc/dotsym4.d: New.
* testsuite/ld-powerpc/dotsymref.s: New.
* testsuite/ld-powerpc/nodotsym.s: New.
* testsuite/ld-powerpc/powerpc.exp: Run new tests.
It's possible but unlikely that an indirect symbol points at a warning
symbol.
* elf64-ppc.c (add_symbol_adjust): Correct order of tests for
warning and indirect symbols.
As per _bfd_elf_link_hash_copy_indirect.
* elf64-ppc.c (ppc64_elf_copy_indirect_symbol): Don't copy dynamic
flags when direct symbol is versioned_hidden.
The PR20886 binary is large enough that there are two stub sections
servicing .text (which is 88M). It so happens that between one
iteration of sizing and the next that one stub section grows while
the other shrinks. Since one section is always growing, the loop
never terminates.
This patch changes the algorithm to not update previous size on
shrinking, once we go past a certain number of iterations.
PR ld/20886
* elf64-ppc.c (ppc64_elf_size_stubs): Make rawsize max size seen
on any pass past STUB_SHRINK_ITER.
It makes just a little more sense to use input_bfd when retrieving
insns for relocation, since the relocations match the endianness of
the input bfd.
* elf32-ppc.c (ppc64_elf_relocate_section): Calculate d_offset for
input_bfd. Replace occurrences of output_bfd as bfd_get_32 and
bfd_put_32 param with input_bfd.
* elf32-ppc.c (ppc_elf_relocate_section): Likewise. Also
ppc_elf_vle_split16 param.
(ppc_elf_vle_split16): Rename output_bfd param to input_bfd.
This patch extends Tag_GNU_Power_ABI_FP to cover long double ABIs,
makes the assembler warn about undefined tag values, and removes
similar warnings from the linker. I think it is better to not
warn in the linker about undefined tag values as future extensions to
the tags then won't result in likely bogus warnings. This is
consistent with the fact that an older linker won't warn on an
entirely new tag.
include/
* elf/ppc.h (Tag_GNU_Power_ABI_FP): Comment.
bfd/
* elf-bfd.h (_bfd_elf_ppc_merge_fp_attributes): Declare.
* elf32-ppc.c (_bfd_elf_ppc_merge_fp_attributes): New function.
(ppc_elf_merge_obj_attributes): Use it. Don't copy first file
attributes, merge them. Don't warn about undefined tag bits,
or copy unknown values to output.
* elf64-ppc.c (ppc64_elf_merge_private_bfd_data): Call
_bfd_elf_ppc_merge_fp_attributes.
binutils/
* readelf.c (display_power_gnu_attribute): Catch truncated section
for all powerpc attributes. Display long double ABI. Don't
capitalize words, except for names. Show known bits of tag values
when some unknown bits are present. Whitespace fixes.
gas/
* config/tc-ppc.c (ppc_elf_gnu_attribute): New function.
(md_pseudo_table <ELF>): Handle "gnu_attribute".
ld/
* testsuite/ld-powerpc/attr-gnu-4-4.s: Delete.
* testsuite/ld-powerpc/attr-gnu-4-14.d: Delete.
* testsuite/ld-powerpc/attr-gnu-4-24.d: Delete.
* testsuite/ld-powerpc/attr-gnu-4-34.d: Delete.
* testsuite/ld-powerpc/attr-gnu-4-41.d: Delete.
* testsuite/ld-powerpc/attr-gnu-4-32.d: Adjust expected warning.
* testsuite/ld-powerpc/attr-gnu-8-23.d: Likewise.
* testsuite/ld-powerpc/attr-gnu-4-01.d: Adjust expected output.
* testsuite/ld-powerpc/attr-gnu-4-02.d: Likewise.
* testsuite/ld-powerpc/attr-gnu-4-03.d: Likewise.
* testsuite/ld-powerpc/attr-gnu-4-10.d: Likewise.
* testsuite/ld-powerpc/attr-gnu-4-11.d: Likewise.
* testsuite/ld-powerpc/attr-gnu-4-20.d: Likewise.
* testsuite/ld-powerpc/attr-gnu-4-22.d: Likewise.
* testsuite/ld-powerpc/attr-gnu-4-33.d: Likewise.
* testsuite/ld-powerpc/attr-gnu-8-11.d: Likewise.
* testsuite/ld-powerpc/powerpc.exp: Don't run deleted tests.