This reverts most of commit 1be5d8d3bb.
Left in place are addition of --no-plt-align to some ppc32 ld tests
and the ld.texinfo --no-plt-thread-safe fix.
Asking for ppc32 plt call stubs to be aligned at 32 byte boundaries
didn't quite work. For ld.bfd they were spaced 32 bytes apart, but
only started on a 16 byte boundary. ld.gold also didn't get it right.
Finding that bug made me check over the ppc64 plt stub alignment,
where I found that negative values for alignment (meaning align to
minimize boundary crossing) were not accepted. Since no one has
complained about that, I guess I could have removed the feature from
ld.bfd documentation, but I've opted instead to correct the code.
I've also added an optional alignment parameter for ppc32
--plt-align, for some consistency with gold and ppc64 ld.bfd.
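To illustrate the negative-alignment semantics (a sketch only; the
function and its name are hypothetical, not the actual ld.bfd or gold
code): a positive value n starts every stub on a 2^n boundary, while a
negative value pads a stub to the next 2^|n| boundary only when the
stub would otherwise cross one.

    #include <stdint.h>   /* uint64_t */

    /* Hypothetical sketch of plt call stub alignment handling.  */
    static uint64_t
    align_stub (uint64_t off, uint64_t stub_size, int plt_align)
    {
      if (plt_align >= 0)
        /* Positive n: start each stub on a 2^n boundary.  */
        return (off + (1ULL << plt_align) - 1) & -(1ULL << plt_align);

      uint64_t boundary = 1ULL << -plt_align;
      /* Negative n: pad only when this stub would cross a 2^n boundary,
         minimizing boundary crossings without aligning every stub.  */
      if ((off & (boundary - 1)) + stub_size > boundary)
        off = (off + boundary - 1) & -boundary;
      return off;
    }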
bfd/
* elf32-ppc.c (ppc_elf_create_glink): Correct alignment of .glink.
* elf64-ppc.c (ppc64_elf_size_stubs): Handle negative plt_stub_align.
(ppc64_elf_build_stubs): Likewise.
gold/
* powerpc.cc (param_plt_align): New function supplying default
--plt-align values. Use it..
(Stub_table::plt_call_align): ..here, and..
(Output_data_glink::global_entry_align): ..here.
(Stub_table::stub_align): Correct 32-bit minimum alignment.
ld/
* emultempl/ppc32elf.em: Support optional --plt-align arg.
* emultempl/ppc64elf.em: Support negative --plt-align arg.
This is in preparation for the next patch adding Spectre variant 2
mitigation for PowerPC and PowerPC64. Besides tidying code involved
in stub output (to reduce the number of places where bctr is output),
the patch adds some user visible features:
1) PowerPC64 ELFv2 global entry stubs now are aligned under the
control of --plt-align, with a default alignment of 32 bytes.
2) PowerPC64 __glink_PLTresolve is no longer padded out with nops.
3) PowerPC32 PLT stubs are aligned under the control of --plt-align,
with the default alignment being 16 bytes as before.
4) The PowerPC32 branch/nop table emitted before __glink_PLTresolve
is now smaller in many cases. It was sized incorrectly when the
__tls_get_addr_opt stub was used, and unnecessarily included space
for local ifuncs.
bfd/
* elf32-ppc.c (GLINK_ENTRY_SIZE): Add parameters, handle
__tls_get_addr_opt, and alignment sizing.
(TLS_GET_ADDR_GLINK_SIZE): Delete.
(is_nonpic_glink_stub): Don't use GLINK_ENTRY_SIZE.
(ppc_elf_get_synthetic_symtab): Recognize stubs spaced at 4, 6,
or 8 insns.
(ppc_elf_link_hash_table_create): Init new ppc_elf_params field.
(allocate_dynrelocs): Use new GLINK_ENTRY_SIZE.
(ppc_elf_size_dynamic_sections): Likewise. Size branch table
by PLT reloc count.
(write_glink_stub): Handle __tls_get_addr_opt stub.
Pad out to size given by GLINK_ENTRY_SIZE.
(ppc_elf_relocate_section): Adjust write_glink_stub call.
(ppc_elf_finish_dynamic_symbol): Likewise.
(ppc_elf_finish_dynamic_sections): Write PLTresolve without using
insn array since so many need rewriting.
* elf32-ppc.h (struct ppc_elf_params): Add plt_stub_align.
* elf64-ppc.c (GLINK_PLTRESOLVE_SIZE): Rename from
GLINK_CALL_STUB_SIZE. Add htab param and evaluate to size without
nops. Adjust all uses.
(ppc64_elf_get_synthetic_symtab): Don't use GLINK_CALL_STUB_SIZE
in glink_vma calculation.
(struct ppc_link_hash_table): Add global_entry section pointer.
(create_linkage_sections): Create separate section for global
entry stubs.
(PPC_LO, PPC_HI, PPC_HA): Move earlier.
(size_global_entry_stubs): Handle sizing for aligned stubs.
(ppc64_elf_size_dynamic_sections): Handle global_entry alloc,
and don't stash end of glink branch table in rawsize.
(ppc_build_one_stub): Rewrite stub size calculations.
(build_global_entry_stubs): Use new section.
(ppc64_elf_build_stubs): Don't pad __glink_PLTresolve with nops.
Build lazy link stubs out to end of section. Build global entry
stubs in new section.
gold/
* options.h (plt_align): Support for PowerPC32 too.
* powerpc.cc (Stub_table::stub_align): Heed --plt-align for 32-bit.
(Stub_table::plt_call_size, branch_stub_size): Tidy.
(Stub_table::plt_call_align): Implement using stub_align.
(Output_data_glink::global_entry_align): New function.
(Output_data_glink::global_entry_off): New function.
(Output_data_glink::global_entry_address): Use global_entry_off.
(Output_data_glink::pltresolve_size): New function, replacing
pltresolve_size_ constant. Update all uses.
(Output_data_glink::add_global_entry): Align offset.
(Output_data_glink::set_final_data_size): Use global_entry_align.
(Stub_table::do_write): Don't pad __glink_PLTresolve with nops.
Tidy stub output. Use global_entry_off.
ld/
* emultempl/ppc32elf.em (params): Init new field.
(enum ppc32_opt): New enum to define OPTION_* values. Add
OPTION_PLT_ALIGN and OPTION_NO_PLT_ALIGN.
(PARSE_AND_LIST_LONGOPTS): Handle new options.
(PARSE_AND_LIST_ARGS_CASES): Likewise.
(PARSE_AND_LIST_OPTIONS): Likewise. Break up help output.
* emultempl/ppc64elf.em (ppc_add_stub_section): Init alignment
correctly for negative plt_stub_align.
* testsuite/ld-powerpc/elfv2exe.d,
* testsuite/ld-powerpc/elfv2so.d,
* testsuite/ld-powerpc/relbrlt.d,
* testsuite/ld-powerpc/relbrlt.s,
* testsuite/ld-powerpc/tlsexe.d,
* testsuite/ld-powerpc/tlsexe.r,
* testsuite/ld-powerpc/tlsexe32.d,
* testsuite/ld-powerpc/tlsexe32.g,
* testsuite/ld-powerpc/tlsexe32.r,
* testsuite/ld-powerpc/tlsexetoc.d,
* testsuite/ld-powerpc/tlsexetoc.r,
* testsuite/ld-powerpc/tlsopt5_32.d,
* testsuite/ld-powerpc/tlsso.d,
* testsuite/ld-powerpc/tlstocso.d: Update for changed stub order.
We never need to resolve_forwards() a symbol found by hash table
lookup, such as target->tls_get_addr_opt(), but we do potentially need
to do so for random symbols seen on relocs. So these calls were in the
wrong order, resulting in missing stubs and an assertion failure.
PR 22602
* powerpc.cc (Target_powerpc::Branch_info::mark_pltcall): Resolve
forwards before replacing __tls_get_addr.
(Target_powerpc::Branch_info::make_stub): Likewise.
The fix for PR 19291 broke some other cases where -r is used with scripts,
as reported in PR 22266. The original fix for PR 22266 ended up breaking
many cases for REL targets, where the addends are stored in the section data,
and are not being adjusted properly.
The problem was basically that in a relocatable output file (ET_REL),
symbol values are supposed to be relative to the start address of their
section. Usually in a relocatable file, all sections start at 0, so the
failure to get this right is often irrelevant, but with a linker script,
we occasionally see an output section whose starting address is not 0,
and gold would occasionally write a symbol with its relocated value instead
of its section-relative value.
This patch reverts the recent fix for PR 22266 as well as my original fix
for PR 19291. The original fix moved the symbol value adjustment to
write_local_symbols, but neglected to undo a few places where the adjustment
was also being applied, resulting in an occasional double adjustment. The
more recent fix removed those other adjustments, but then failed to
re-account for the adjustment when rewriting the relocations on REL targets.
With the old attempts reverted, we now apply the symbol value adjustment to
the one case that had been missed (non-section symbols in merge sections).
But now we also need to account for the adjustment when rewriting the addends
for RELA relocations.
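The testsuite change exercises this with a script rule that places an
output section at a non-zero address; a minimal example of the kind of
script involved (illustrative, not the actual pr22266_script.t):

    SECTIONS
    {
      /* A non-zero start address means section-relative symbol values
         differ from relocated values in the -r output.  */
      .rodata 0x1000 : { *(.rodata*) }
    }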
gold/
PR gold/19291
PR gold/22266
* object.cc (Sized_relobj_file::compute_final_local_value_internal):
Revert changes from 2017-11-08 patch. Adjust symbol value in
relocatable links for non-section symbols.
(Sized_relobj_file::compute_final_local_value): Revert changes from
2017-11-08 patch.
(Sized_relobj_file::do_finalize_local_symbols): Likewise.
(Sized_relobj_file::write_local_symbols): Revert changes from
2015-11-25 patch.
* object.h (Sized_relobj_file::compute_final_local_value_internal):
Revert changes from 2017-11-08 patch.
* powerpc.cc (Target_powerpc::relocate_relocs): Adjust addend for
relocatable links.
* target-reloc.h (relocate_relocs): Adjust addend for relocatable links.
* testsuite/pr22266_a.c (hello): New function.
* testsuite/pr22266_main.c (main): Add test for merge sections.
* testsuite/pr22266_script.t: Add rule for .rodata.
Fixes a thinko. Given code that puts variables into the TOC (a bad
idea, but some see the TOC as a small data section) this bug could
result in an attempt to optimize a sequence that should not be
optimized.
* powerpc.cc (Target_powerpc::Scan::local): Correct dst_off
calculation for TOC16 relocs.
(Target_powerpc::Scan::global): Likewise.
gcc doesn't emit stack notes for ELFv1, since ELFv1 never needs an
executable stack. Note that ELFv1 is usually big-endian and ELFv2
little-endian, but the ABI is really orthogonal to endianness.
* powerpc.cc (Target_powerpc<64,*>::powerpc_info): Set
is_default_stack_executable false.
ppc32, like many targets, defines the address of a function as the PLT
call stub code for functions referenced but not defined in a non-PIC
executable. ppc32 gold, unlike other targets, inherits the ppc64
multiple stub capability for dealing with very large binaries where
one set of stubs can't be reached from all code locations. This means
there can be multiple choices of address for a function, which might
cause function pointer comparison failures. So for ppc32, make
non-branch references always use the first stub group.
(PowerPC64 ELFv1 is always PIC so doesn't need to define the address
of an external function as the PLT stub. PowerPC64 ELFv2 needs a
special set of global entry stubs to serve as the address of external
functions, so it too is not affected by this bug.)
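A hypothetical example of the pointer comparison at issue (not from
the testsuite): foo is defined in a shared library and this executable
is built non-PIC, so &foo resolves to a PLT call stub address.

    extern void foo (void);

    int
    is_foo (void (*p) (void))
    {
      /* If another object's reference to foo resolved to a stub in a
         different stub group, this could wrongly be false.  */
      return p == foo;
    }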
* powerpc.cc (Target_powerpc::Branch_info::make_stub): Put
stubs for ppc32 non-branch relocs in first stub table.
(Target_powerpc::Relocate::relocate): Resolve similarly.
In the TLS GD/LD to LE optimization, ld replaces a sequence like
    addi 3,2,x@got@tlsgd         R_PPC64_GOT_TLSGD16   x
    bl __tls_get_addr(x@tlsgd)   R_PPC64_TLSGD         x
                                 R_PPC64_REL24         __tls_get_addr
    nop
with
    addis 3,13,x@tprel@ha        R_PPC64_TPREL16_HA    x
    addi 3,3,x@tprel@l           R_PPC64_TPREL16_LO    x
    nop
When the tprel offset is small (ie. x@tprel@ha is zero because the
offset fits in a signed 16-bit immediate), this can be further
optimized to
    nop
    addi 3,13,x@tprel
    nop
bfd/
* elf64-ppc.c (struct ppc_link_hash_table): Add do_tls_opt.
(ppc64_elf_tls_optimize): Set it.
(ppc64_elf_relocate_section): Nop addis on TPREL16_HA, and convert
insn on TPREL16_LO and TPREL16_LO_DS relocs to use r13 when
addis would add zero.
* elf32-ppc.c (struct ppc_elf_link_hash_table): Add do_tls_opt.
(ppc_elf_tls_optimize): Set it.
(ppc_elf_relocate_section): Nop addis on TPREL16_HA, and convert
insn on TPREL16_LO relocs to use r2 when addis would add zero.
gold/
* powerpc.cc (Target_powerpc::Relocate::relocate): Nop addis on
TPREL16_HA, and convert insn on TPREL16_LO and TPREL16_LO_DS
relocs to use r2/r13 when addis would add zero.
ld/
* testsuite/ld-powerpc/tls.s: Add calls with tls markers.
* testsuite/ld-powerpc/tls32.s: Likewise.
* testsuite/ld-powerpc/powerpc.exp: Run tls marker tests.
* testsuite/ld-powerpc/tls.d: Adjust for TPREL16_HA/LO optimization.
* testsuite/ld-powerpc/tlsexe.d: Likewise.
* testsuite/ld-powerpc/tlsexetoc.d: Likewise.
* testsuite/ld-powerpc/tlsld.d: Likewise.
* testsuite/ld-powerpc/tlsmark.d: Likewise.
* testsuite/ld-powerpc/tlsopt4.d: Likewise.
* testsuite/ld-powerpc/tlstoc.d: Likewise.
This implements the special __tls_get_addr_opt call stub for powerpc
gold that returns __thread variable addresses without actually making
a call to __tls_get_addr in most cases. Shared libraries that are
loaded at program load time (ie. dlopen is not used) have a known
layout for their __thread variables, and thus DTPMOD64/DTPREL64 pairs
describing those variables can be set up by ld.so for the
__tls_get_addr_opt call stub fast exit.
Ref https://sourceware.org/ml/libc-alpha/2015-03/msg00626.html
I really, really wish I'd used a differently versioned __tls_get_addr
symbol than the base symbol to indicate glibc support for the
optimized call, rather than having glibc export __tls_get_addr_opt. A
lot of the messing around here, flipping symbols from __tls_get_addr
to __tls_get_addr_opt, is caused by that decision. About the only
benefit is that a user can see at a glance that their disassembled
code is calling __tls_get_addr via the fancy call stub.. Anyway, we
need references to __tls_get_addr to seem like they were to
__tls_get_addr_opt, and in cases like the tsan interceptor, a
definition of __tls_get_addr to seem like one of __tls_get_addr_opt
as well. That's the reason for Symbol::clear_in_reg and
Symbol_table::clone, and why symbols are substituted in Scan::global
and other places dealing with dynamic linking.
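For reference, the fast path of the call stub assembled from the insn
constants named in the ChangeLog below looks roughly like this (a
sketch reconstructed from those constant names; the exact ordering and
the slow-path tail that saves LR and calls __tls_get_addr are
omitted):

    ld 11,0(3)      # DTPMOD64 word of the tls_index GOT entry
    ld 12,8(3)      # DTPREL64 word
    mr 0,3          # save the argument in case we must call after all
    cmpdi 11,0      # zero dtpmod means ld.so filled in a static offset
    add 3,12,13     # r13 is the thread pointer
    beqlr           # fast return with the variable's address in r3
    mr 3,0          # otherwise restore the argument and call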
elfcpp/
* elfcpp.h (DT_PPC_OPT): Define.
* powerpc.h (PPC_OPT_TLS): Define.
gold/
* options.h (tls_get_addr_optimize): New option.
* symtab.h (Symbol::clear_in_reg, clone): New functions.
(Sized_symbol::clone): New function.
(Symbol_table::clone): New function.
* resolve.cc (Symbol::clone, Sized_symbol::clone): New functions.
* powerpc.cc (Target_powerpc::has_tls_get_addr_opt_,
tls_get_addr_, tls_get_addr_opt_): New vars.
(Target_powerpc::tls_get_addr_opt, tls_get_addr,
is_tls_get_addr_opt, replace_tls_get_addr,
set_has_tls_get_addr_opt, stk_linker): New functions.
(Target_powerpc::Track_tls::maybe_skip_tls_get_addr_call): Add
target param. Update callers. Compare symbols rather than names.
(Target_powerpc::do_define_standard_symbols): Init tls_get_addr_
and tls_get_addr_opt_.
(Target_powerpc::Branch_info::mark_pltcall): Translate tls_get_addr
sym to tls_get_addr_opt.
(Target_powerpc::Branch_info::make_stub): Likewise.
(Stub_table::define_stub_syms): Likewise.
(Target_powerpc::Scan::global): Likewise.
(Target_powerpc::Relocate::relocate): Likewise.
(add_3_12_2, add_3_12_13, bctrl, beqlr, cmpdi_11_0, cmpwi_11_0,
ld_11_1, ld_11_3, ld_12_3, lwz_11_3, lwz_12_3, mr_0_3, mr_3_0,
mtlr_11, std_11_1): New constants.
(Stub_table::eh_frame_added_): Delete.
(Stub_table::tls_get_addr_opt_bctrl_, plt_fde_len_, plt_fde_): New vars.
(Stub_table::init_plt_fde): New functions.
(Stub_table::add_eh_frame, replace_eh_frame): Move definition out
of line. Init and use plt_fde_.
(Stub_table::plt_call_size): Return size for tls_get_addr stub.
Extract alignment code to..
(Stub_table::plt_call_align): ..this new function. Adjust all callers.
(Stub_table::add_plt_call_entry): Set has_tls_get_addr_opt and
tls_get_addr_opt_bctrl, and align after that.
(Stub_table::do_write): Write out tls_get_addr stub.
(Target_powerpc::do_finalize_sections): Emit DT_PPC_OPT
PPC_OPT_TLS/PPC64_OPT_TLS bit.
(Target_powerpc::Relocate::relocate): Don't check for or modify
nop following bl for tls_get_addr stub.
This patch provides a flag for PowerPC64 ELFv2 use in class Symbol,
and modifies Sized_target::resolve to return whether the symbol has
been resolved. If not, normal processing continues. I use this for
PowerPC64 ELFv2 to keep track of whether a symbol has any definition
with non-zero localentry, in order to disable --plt-localentry for
that symbol.
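Schematically, the resolve.cc change makes generic resolution look
like this (a sketch, not the exact code):

      Sized_target<size, big_endian>* target
        = parameters->sized_target<size, big_endian>();
      // A true return means the target fully resolved the symbol;
      // ppc64 mostly just records localentry state and returns false.
      if (target->has_resolve()
          && target->resolve(to, sym, object, version))
        return;
      // ... normal symbol resolution continues here ...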
PR 21847
* powerpc.cc (Target_powerpc::is_elfv2_localentry0): Test
non_zero_localentry.
(Target_powerpc::resolve): New function.
(powerpc_info): Set has_resolve for 64-bit.
* target.h (Sized_target::resolve): Return bool.
* resolve.cc (Symbol_table::resolve): Continue with normal
processing when target resolve returns false.
* symtab.h (Symbol::non_zero_localentry, set_non_zero_localentry):
New accessors.
(Symbol::non_zero_localentry_): New flag bit.
* symtab.cc (Symbol::init_fields): Init non_zero_localentry_.
There is a very small but non-zero probability that a stub group
contains stubs on one relax pass, but does not on the next. In that
case we would get an FDE covering a zero length address range.
(Actually, it's even worse. Alignment padding for stubs can mean the
address for the non-existent stubs is past the end of the original
section to which stubs are attached, and due to the way
do_plt_fde_location calculates the length we can get a negative
length.) Fixing this properly requires removing the FDE.
Also, I have been implementing the __tls_get_addr_opt support for
gold, and that stub needs something other than the default FDE. The
necessary FDE will depend on the offset to the __tls_get_addr_opt
stub, which of course can change during relaxation. That means at the
very least, rewriting the FDE on each pass, possibly changing the FDE
size. I think that is better done by completely recreating PLT
eh_frame FDEs.
* ehframe.cc (Fde::operator==): New.
(Cie::remove_fde, Eh_frame::remove_ehframe_for_plt): New.
* ehframe.h (Fde::operator==): Declare.
(Cie::remove_fde, Eh_frame::remove_ehframe_for_plt): Likewise.
* layout.cc (Layout::remove_eh_frame_for_plt): New.
* layout.h (Layout::remove_eh_frame_for_plt): Declare.
* powerpc.cc (Target_powerpc::do_relax): Remove old eh_frame FDEs.
(Stub_table::add_eh_frame): Delete eh_frame_added_ condition.
Don't add eh_frame for empty stub section.
(Stub_table::remove_eh_frame): New.
This adds a --no-tls-optimize option for people who want to keep
__tls_get_addr calls in an executable rather than optimizing such code
sequences to IE/LE.
Also tidy some formatting errors, rename a variable to better reflect
its use, and tweak two functions that create pairs of GOT entries to
first check whether the GOT entry already exists before potentially
inserting the GOT header via reserve(2). Without the check it is possible
to waste one GOT entry.
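The GOT pair tweak is just a reordering, roughly (a sketch with
hypothetical accessor names):

      // Checking first avoids reserve(2) inserting the GOT header and
      // wasting a slot when the pair already exists.
      if (!gsym->has_got_offset(got_type))
        {
          got->reserve(2);
          // ... add the GOT entry pair and its dynamic relocs ...
        }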
* options.h (no_tls_optimize): New powerpc option.
* powerpc.cc (Target_powerpc::abiversion, set_abiversion): Formatting.
(Target_powerpc::stk_toc): Formatting, fix comment.
(Target_powerpc::Track_tls::tls_get_addr_state): Rename from
tls_get_addr.
(Target_powerpc::optimize_tls_gd, optimize_tls_ld, optimize_tls_ie):
Return TLSOPT_NONE when !tls_optimize.
(Target_powerpc::add_global_pair_with_rel): Check
for existing reloc before reserving.
(Target_powerpc::add_local_tls_pair): Likewise.
This makes ld warn about --plt-localentry if a version of glibc
without the necessary ld.so checks is detected, and revises the
documentation.
bfd/
* elf64-ppc.c (ppc64_elf_tls_setup): Warn on --plt-localentry
without ld.so checks.
gold/
* powerpc.cc (Target_powerpc::scan_relocs): Warn on --plt-localentry
without ld.so checks.
ld/
* ld.texinfo (plt-localentry): Revise.
The big comment in ppc64_elf_tls_setup says why. I've also added some
code to the bfd linker that catches the -lpthread -lc symbol
differences and disables generation of optimized call stubs even when
--plt-localentry is activated. Gold doesn't yet have that.
PR 21847
bfd/
* elf64-ppc.c (struct ppc_link_hash_entry): Add non_zero_localentry.
(ppc64_elf_merge_symbol): Set non_zero_localentry.
(is_elfv2_localentry0): Test non_zero_localentry.
(ppc64_elf_tls_setup): Default to --no-plt-localentry.
gold/
* powerpc.cc (Target_powerpc::scan_relocs): Default to
--no-plt-localentry.
ld/
* ld.texinfo (plt-localentry): Document.
My PPC64_OPT_LOCALENTRY patch of June 1, git commit f378ab099d, and
the later gold change, git commit 7ee7ff7015, added an insn in
__glink_PLTresolve which needs a corresponding adjustment in the
eh_frame info for asynchronous exceptions to unwind correctly.
It would have been OK for both ABIs to use +5 for the advance before
restore of LR, since we can put the DW_CFA_restore_extended on any
insn after the actual restore and before the r12/r0 copy is clobbered,
but it's slightly better to delay as much as possible. There are
then more addresses where fewer CFA program insns are executed.
bfd/
* elf64-ppc.c (ppc64_elf_size_stubs): Correct advance to
restore of LR.
gold/
* powerpc.cc (glink_eh_frame_fde_64v2): Correct advance to
restore of LR.
(glink_eh_frame_fde_64v1): Advance to restore of LR at latest
possible insn.
elfcpp/
* elfcpp.h (DT_PPC64_OPT): Define.
* powerpc.h (PPC64_OPT_TLS, PPC64_OPT_MULTI_TOC,
PPC64_OPT_LOCALENTRY): Define.
gold/
* options.h (General_options): Add plt_localentry.
* powerpc.cc (Target_powerpc::st_other): New function.
(Target_powerpc::plt_localentry0_, plt_localentry0_init_,
has_localentry0_): New vars.
(Target_powerpc::plt_localentry0, set_has_localentry0,
is_elfv2_localentry0): New functions.
(Target_powerpc::Branch_info::mark_pltcall): Don't set tocsave or
return true for localentry:0 calls.
(Stub_table::Plt_stub_ent::localentry0_): New var.
(Stub_table::add_plt_call_entry): Set localentry0_ and has_localentry0_.
Don't set r2save_ for localentry:0 calls.
(Output_data_glink::do_write): Save r2 in __glink_PLTresolve for elfv2.
(Target_powerpc::scan_relocs): Default plt_localentry0_.
(Target_powerpc::do_finalize_sections): Set DT_PPC64_OPT.
(Target_powerpc::Relocate::relocate): Don't require nop following
calls for localentry:0 plt calls, and don't change nop.
This adds support to gold for the tocsave relocs already supported by
ld.bfd. R_PPC64_TOCSAVE relocs are part of a scheme to move r2 saves
to the prologue of a function rather than having one in each plt call
stub. We
don't want a compiler to always emit the r2 save, as this would be
wasted if the calls turned out to be local. See the tocsave*.s in
ld/testsuite/ld-powerpc/.
* powerpc.cc (Target_powerpc::tocsave_loc_): New var.
(Target_powerpc::mark_pltcall, add_tocsave, tocsave_loc): New functions.
(Target_powerpc::Branch_info::tocsave_): New var.
(Target_powerpc::Branch_info::mark_pltcall): New function.
(Target_powerpc::Branch_info::make_stub): Pass tocsave_ to
add_plt_call_entry.
(Stub_table::Plt_stub_ent): Make public. Add r2save_.
(Stub_table::add_plt_call_entry): Add bool tocsave_ param. Set
r2save_.
(Stub_table::find_plt_call_entry): Return Plt_stub_ent*. Adjust
use throughout.
(Stub_table::do_write): Conditionally output r2 save in plt stubs.
(Target_powerpc::Scan::local): Handle R_PPC64_TOCSAVE.
(Target_powerpc::Scan::global): Likewise.
(Target_powerpc::Relocate::relocate): Skip r2 save in plt call stub
with tocsave reloc. Replace header tocsave nop with r2 save.
* symtab.h (struct Symbol_location_hash): Make public.
I was lazy when adding indx_ to Plt_stub_ent. The field isn't part of
the key, so ought to be part of the mapped type. Make it so.
* powerpc.cc (Plt_stub_key): Rename from Plt_stub_ent. Remove indx_.
(Plt_stub_key_hash): Rename from Plt_stub_ent_hash.
(struct Plt_stub_ent): New.
(Plt_stub_entries): Map from Plt_stub_key to Plt_stub_ent. Adjust
use throughout file.
If two objects are compiled with -fPIC or -fPIE and call the same
function, two different PLT entries are created, one for each object,
but the same stub symbol name is used for both.
* powerpc.cc (Stub_table::define_stub_syms): Always include object's
uniq_ value.
This implements editing of TOC indirect code to TOC relative in gold.
It doesn't yet trim off the unused TOC entries.
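Roughly, the edit (done in Relocate::relocate per the ChangeLog below)
turns a load of an address from its TOC entry into an insn computing
the address directly (a sketch; registers and symbols illustrative):

    ld 9,x@toc(2)       # load &var from the TOC entry for it
becomes
    addi 9,2,var@toc    # compute &var directly off r2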
* powerpc.cc (class Powerpc_copy_relocs): New.
(Powerpc_copy_relocs::emit): New function.
(Powerpc_relobj::relatoc_, toc_, no_toc_opt_): New variables.
(Powerpc_relobj::toc_shndx, set_no_toc_opt, no_toc_opt): New inlines.
(Powerpc_relobj::do_relocate_sections): New function.
(Powerpc_relobj::make_toc_relative): Likewise.
(Powerpc_relobj::do_find_special_sections): Stash away .rela.toc
and .toc too.
(ok_lo_toc_insn): Move earlier, and handle more insns.
(Target_powerpc::Scan::local): If optimizing toc accesses, set
no_toc_opt for entries we can't edit. Check insn validity.
Emit "toc optimization is not supported" warning, downgraded
from error.
(Target_powerpc::Scan::global): Likewise.
(Target_powerpc::Relocate::relocate): Edit TOC indirect code
to TOC relative. Don't emit "toc optimization is not supported"
error here.
The --secure-plt option is added just to be accepted and ignored. gcc
since 2015-10-21, when configured with --enable-secureplt, passes this
option to the linker.
As powerpc gold cannot link --bss-plt code successfully, gold needs to
accept the option or the gcc specs file needs to be changed.
The patch also makes gold detect --bss-plt code and error out rather
than producing a binary that crashes.
* options.h: Add --secure-plt option.
* powerpc.cc (Target_powerpc::Scan::local): Detect and error
on -fPIC -mbss-plt code.
(Target_powerpc::Scan::global): Likewise.
Plus some paranoia in symval_for_branch. We shouldn't get there with
dynamic symbols, but if we ever did, the static_cast to Powerpc_relobj
would be wrong.
* powerpc.cc: Use shorter equivalent elfcpp typedef for
Reltype and reloc_size throughout.
(Target_powerpc::symval_for_branch): Exclude dynamic symbols.
(Target_powerpc::Scan::local): Use local var r_sym.
(Target_powerpc::Scan::global): Likewise.
(Target_powerpc::Relocate::relocate): Delete shadowing r_sym.
A branch in a non-exec section that needs a stub can lead to this
assertion.
* powerpc.cc (Powerpc_relobj::stub_table): Return NULL rather
than asserting.
Adds a new option, defaulting to off, that allows a group of stubs to
serve multiple output sections. Prior to this patch powerpc gold
allowed this unconditionally, which is a little unsafe with clever
code that discards/reuses sections at runtime.
* options.h (--stub-group-multi): New PowerPC option.
* powerpc.cc (Stub_control): Add multi_os_ var and param
to constructor. Sort start_ var later. Comment State.
(Stub_control::can_add_to_stub_group): Heed multi_os_.
(Target_powerpc::group_sections): Update.
Gold attaches stubs to an existing section in contrast to ld.bfd which
inserts a new section for stubs. If we want stubs before branches,
then the stubs must be added to the previous section. Adding to the
previous section is a disaster if there is a large gap between the
previous section and the group.
PR gold/20878
* powerpc.cc (Stub_control): Replace stubs_always_before_branch_
with stubs_always_after_branch_, group_end_addr_ with
group_start_addr_.
(Stub_control::can_add_to_stub_group): Rewrite to suit scanning
sections by increasing address.
(Target_powerpc::group_sections): Scan that way. Delete corner
case.
* options.h (--stub-group-size): Update help string.
Some more debug output, and a little hardening.
* powerpc.cc (Stub_table_owner): Provide constructor.
(Powerpc_relobj::set_stub_table): Resize, filling with -1.
(Target_powerpc::Branch_info::make_stub): Provide target debug
output on returning false.
This patch adds a little more debug output, and replaces two variables
with one, tracking current max group size by group_size_ rather than
by has14_.
* powerpc.cc (class Stub_control): Delete stub14_group_size_
and has14_. Add group_size_.
(Stub_control::can_add_to_stub_group): Adjust to suit. Print
debug info when switching to adding sections before stubs.
This patch rewrites the rather obscure can_add_to_stub_group, fixing
a problem with the handling of sections containing conditional
external branches. When a section group contains any such section,
the group size needs to be limited to a much smaller size than groups
with only non-conditional external branches.
PR 20523
* powerpc.cc (class Stub_control): Add has14_. Comment owner_.
(Stub_control::can_add_to_stub_group): Correct grouping of
sections containing 14-bit external branches. When returning
false, set state_ to reflect the fact that we have one section
for the next group. Rewrite most of function for clarity.
Add and expand comments.
(Target_powerpc::do_relax): Print stub group size retry in hex.
gold/ChangeLog
2016-08-26 Han Shen <shenhan@google.com>
* powerpc.cc (Stub_table::min_size_threshold_): New member to
limit size.
(Stub_table::set_min_size_threshold): New member function.
(Stub_table::set_address_and_size): Add code to only allow size
increase.
(Target_powerpc::do_relax): Add code to record last size.
This tightens the condition under which ld optimizes PIC entry code
to non-PIC.
bfd/
* elf64-ppc.c (ppc64_elf_relocate_section): Further restrict
ELFv2 entry optimization.
gold/
* powerpc.cc (relocate): Further restrict ELFv2 entry optimization.
For MIPS-64, the r_info field in the relocation format is
replaced by several individual fields, including r_sym and
r_type. To enable support for this format, I've refactored
target-independent code to remove almost all uses of the r_info
field. (I've left alone a couple of routines used only for
incremental linking, which I can update if/when the MIPS target
adds support for incremental linking.)
For routines that are already templated on a Classify_reloc class
(namely, gc_process_relocs, relocate_section, and
relocate_relocs), I've extended the Classify_reloc interface to
include sh_type (which no longer needs to be a separate template
parameter) as well as get_r_sym() and get_r_type() methods for
extracting the r_sym and r_type fields. For
scan_relocatable_relocs, I've extended the
Default_scan_relocatable_relocs class by converting it to a class
template with Classify_reloc as a template parameter. For the
remaining routines that need to access r_sym, I've added a
virtual Target::get_r_sym() method with an override for the MIPS
target.
In elfcpp, I've added Mips64_rel, etc., accessor classes and
corresponding internal data structures. The MIPS target uses
these new classes within its own Mips_classify_reloc class.
The Mips64_ accessor classes also expose the r_ssym, r_type2,
and r_type3 fields from the relocation.
These changes should be functionally the same for all but the
MIPS target.
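The shape of the extended Classify_reloc interface, as used by the
default implementation (an outline only; member names follow the
ChangeLog entries below):

      template<int sh_type_, int size, bool big_endian>
      class Default_classify_reloc
      {
       public:
        typedef typename Reloc_types<sh_type_, size, big_endian>::Reloc
            Reltype;
        static const int sh_type = sh_type_;
        static unsigned int
        get_r_sym(const Reltype* reloc);
        static unsigned int
        get_r_type(const Reltype* reloc);
      };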
elfcpp/
* elfcpp.h (Mips64_rel, Mips64_rel_write): New classes.
(Mips64_rela, Mips64_rela_write): New classes.
* elfcpp_internal.h (Mips64_rel_data, Mips64_rela_data): New structs.
gold/
* gc.h (get_embedded_addend_size): Remove sh_type parameter.
(gc_process_relocs): Remove sh_type template parameter.
Use Classify_reloc to access r_sym, r_type, and r_addend fields.
* object.h (Sized_relobj_file::split_stack_adjust): Add target
parameter.
(Sized_relobj_file::split_stack_adjust_reltype): Likewise.
* reloc-types.h (Reloc_types::copy_reloc_addend): Remove SHT_REL and
SHT_RELA specializations.
* reloc.cc (Emit_relocs_strategy): Rename and move to target-reloc.h.
(Sized_relobj_file::emit_relocs_scan): Call Target::emit_relocs_scan().
(Sized_relobj_file::emit_relocs_scan_reltype): Remove.
(Sized_relobj_file::split_stack_adjust): Add target parameter.
Adjust all callers.
(Sized_relobj_file::split_stack_adjust_reltype): Likewise. Call
Target::get_r_sym() to get r_sym field from relocations.
(Track_relocs::next_symndx): Call Target::get_r_sym().
* target-reloc.h (scan_relocs): Remove sh_type template parameter;
add Classify_reloc template parameter. Use for accessing r_sym and
r_type.
(relocate_section): Likewise.
(Default_classify_reloc): New class (renamed and moved from reloc.cc).
(Default_scan_relocatable_relocs): Remove sh_type template parameter.
(Default_scan_relocatable_relocs::Reltype): New typedef.
(Default_scan_relocatable_relocs::reloc_size): New const.
(Default_scan_relocatable_relocs::sh_type): New const.
(Default_scan_relocatable_relocs::get_r_sym): New method.
(Default_scan_relocatable_relocs::get_r_type): New method.
(Default_emit_relocs_strategy): New class.
(scan_relocatable_relocs): Replace sh_type template parameter with
Scan_relocatable_relocs class. Use it to access r_sym and r_type
fields.
(relocate_relocs): Replace sh_type template parameter with
Classify_reloc class. Use it to access r_sym and r_type fields.
* target.h (Target::is_call_to_non_split): Replace r_type parameter
with pointer to relocation. Adjust all callers.
(Target::do_is_call_to_non_split): Likewise.
(Target::emit_relocs_scan): New virtual method.
(Sized_target::get_r_sym): New virtual method.
* target.cc (Target::do_is_call_to_non_split): Replace r_type parameter
with pointer to relocation.
* aarch64.cc (Target_aarch64::emit_relocs_scan): New method.
(Target_aarch64::Relocatable_size_for_reloc): Remove.
(Target_aarch64::gc_process_relocs): Use Default_classify_reloc.
(Target_aarch64::scan_relocs): Likewise.
(Target_aarch64::relocate_section): Likewise.
(Target_aarch64::Relocatable_size_for_reloc::get_size_for_reloc):
Remove.
(Target_aarch64::scan_relocatable_relocs): Use Default_classify_reloc.
(Target_aarch64::relocate_relocs): Use Default_classify_reloc.
* arm.cc (Target_arm::Arm_scan_relocatable_relocs): Remove sh_type
template parameter.
(Target_arm::emit_relocs_scan): New method.
(Target_arm::Relocatable_size_for_reloc): Replace with...
(Target_arm::Classify_reloc): ...this.
(Target_arm::gc_process_relocs): Use Classify_reloc.
(Target_arm::scan_relocs): Likewise.
(Target_arm::relocate_section): Likewise.
(Target_arm::scan_relocatable_relocs): Likewise.
(Target_arm::relocate_relocs): Likewise.
* i386.cc (Target_i386::emit_relocs_scan): New method.
(Target_i386::Relocatable_size_for_reloc): Replace with...
(Target_i386::Classify_reloc): ...this.
(Target_i386::gc_process_relocs): Use Classify_reloc.
(Target_i386::scan_relocs): Likewise.
(Target_i386::relocate_section): Likewise.
(Target_i386::scan_relocatable_relocs): Likewise.
(Target_i386::relocate_relocs): Likewise.
* mips.cc (Mips_scan_relocatable_relocs): Remove sh_type template
parameter.
(Mips_reloc_types): New class template.
(Mips_classify_reloc): New class template.
(Target_mips::Reltype): New typedef.
(Target_mips::Relatype): New typedef.
(Target_mips::emit_relocs_scan): New method.
(Target_mips::get_r_sym): New method.
(Target_mips::Relocatable_size_for_reloc): Replace with
Mips_classify_reloc.
(Target_mips::copy_reloc): Use Mips_classify_reloc.
(Target_mips::gc_process_relocs): Likewise.
(Target_mips::scan_relocs): Likewise.
(Target_mips::relocate_section): Likewise.
(Target_mips::scan_relocatable_relocs): Likewise.
(Target_mips::relocate_relocs): Likewise.
(mips_get_size_for_reloc): New function, factored out from
Relocatable_size_for_reloc::get_size_for_reloc.
(Target_mips::Scan::local): Use Mips_classify_reloc.
(Target_mips::Scan::global): Likewise.
(Target_mips::Relocate::relocate): Likewise.
* powerpc.cc (Target_powerpc::emit_relocs_scan): New method.
(Target_powerpc::Relocatable_size_for_reloc): Remove.
(Target_powerpc::gc_process_relocs): Use Default_classify_reloc.
(Target_powerpc::scan_relocs): Likewise.
(Target_powerpc::relocate_section): Likewise.
(Powerpc_scan_relocatable_reloc): Convert to class template.
(Powerpc_scan_relocatable_reloc::Reltype): New typedef.
(Powerpc_scan_relocatable_reloc::reloc_size): New const.
(Powerpc_scan_relocatable_reloc::sh_type): New const.
(Powerpc_scan_relocatable_reloc::get_r_sym): New method.
(Powerpc_scan_relocatable_reloc::get_r_type): New method.
(Target_powerpc::scan_relocatable_relocs): Use
Powerpc_scan_relocatable_reloc.
(Target_powerpc::relocate_relocs): Use Default_classify_reloc.
* s390.cc (Target_s390::emit_relocs_scan): New method.
(Target_s390::Relocatable_size_for_reloc): Remove.
(Target_s390::gc_process_relocs): Use Default_classify_reloc.
(Target_s390::scan_relocs): Likewise.
(Target_s390::relocate_section): Likewise.
(Target_s390::Relocatable_size_for_reloc::get_size_for_reloc):
Remove.
(Target_s390::scan_relocatable_relocs): Use Default_classify_reloc.
(Target_s390::relocate_relocs): Use Default_classify_reloc.
* sparc.cc (Target_sparc::emit_relocs_scan): New method.
(Target_sparc::Relocatable_size_for_reloc): Remove.
(Target_sparc::gc_process_relocs): Use Default_classify_reloc.
(Target_sparc::scan_relocs): Likewise.
(Target_sparc::relocate_section): Likewise.
(Target_sparc::Relocatable_size_for_reloc::get_size_for_reloc):
Remove.
(Target_sparc::scan_relocatable_relocs): Use Default_classify_reloc.
(Target_sparc::relocate_relocs): Use Default_classify_reloc.
* tilegx.cc (Target_tilegx::emit_relocs_scan): New method.
(Target_tilegx::Relocatable_size_for_reloc): Remove.
(Target_tilegx::gc_process_relocs): Use Default_classify_reloc.
(Target_tilegx::scan_relocs): Likewise.
(Target_tilegx::relocate_section): Likewise.
(Target_tilegx::Relocatable_size_for_reloc::get_size_for_reloc):
Remove.
(Target_tilegx::scan_relocatable_relocs): Use Default_classify_reloc.
(Target_tilegx::relocate_relocs): Use Default_classify_reloc.
* x86_64.cc (Target_x86_64::emit_relocs_scan): New method.
(Target_x86_64::Relocatable_size_for_reloc): Remove.
(Target_x86_64::gc_process_relocs): Use Default_classify_reloc.
(Target_x86_64::scan_relocs): Likewise.
(Target_x86_64::relocate_section): Likewise.
(Target_x86_64::Relocatable_size_for_reloc::get_size_for_reloc):
Remove.
(Target_x86_64::scan_relocatable_relocs): Use Default_classify_reloc.
(Target_x86_64::relocate_relocs): Use Default_classify_reloc.
* testsuite/testfile.cc (Target_test::emit_relocs_scan): New method.
In a fixed position executable, the entry code does not need to be
PIC and can thus lose a dependency on r12.
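Roughly, the ELFv2 global entry prologue changes from the PC-relative
form to an absolute one (a sketch of the edit; the precise conditions
are checked in relocate):

    0:  addis 2,12,(.TOC.-0b)@ha
        addi 2,2,(.TOC.-0b)@l
becomes
        lis 2,.TOC.@ha
        addi 2,2,.TOC.@l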
* powerpc.cc (Target_powerpc::Relocate::relocate): Edit ELFv2
entry code.
(Target_powerpc::relocate_relocs): Edit relocs to suit.