See https://sourceware.org/ml/binutils/2016-07/msg00091.html
This patch stop --gc-sections elf_gc_sweep_symbol localizing symbols
that ought to remain global.
The difficulty with always descending into output section statements
is that symbols defined by the script in such statements don't have
a bfd section when lang_do_assignments runs early in the link process.
There are two approaches to curing this problem. Either we can
create the bfd section early, or we can use a special section. This
patch takes the latter approach and uses bfd_und_section. (Creating
bfd sections early results in changed output section order, and thus
lots of testsuite failures. You can't create all output sections
early to ensure proper ordering as KEEP then stops empty sections
from being stripped.)
The wrinkle with this approach is that some code that runs at
gc-sections time needs to be made aware of the odd defined symbols
using bfd_und_section.
bfd/
* elf64-x86-64.c (elf_x86_64_convert_load_reloc): Handle symbols
defined temporarily with bfd_und_section.
* elflink.c (_bfd_elf_gc_keep): Don't set SEC_KEEP for bfd_und_section.
* elfxx-mips.c (mips_elf_local_pic_function_p): Exclude defined
symbols with bfd_und_section.
ld/
* ldlang.c (lang_do_assignments_1): Descend into output section
statements that do not yet have bfd sections. Set symbol section
temporarily for symbols defined in such statements to the undefined
section. Don't error on data or reloc statements until final phase.
* ldexp.c (exp_fold_tree_1 <etree_assign>): Handle bfd_und_section
in expld.section.
* testsuite/ld-mmix/bpo-10.d: Adjust.
* testsuite/ld-mmix/bpo-11.d: Adjust.
Changes the result of ld expressions that were previously plain
numbers to be an absolute address, in the same circumstances where
numbers are treated as absolute addresses.
* ld.texinfo (Expression Section): Update result of arithmetic
expressions.
* ldexp.c (arith_result_section): New function.
(fold_binary): Use it.
Commit b751e639 regressed arm linux kernel builds, that have an
ASSERT (((__hyp_idmap_text_end - (__hyp_idmap_text_start
& ~ (((0x1 << 0xc) - 0x1))))
<= (0x1 << 0xc)), HYP init code too big or misaligned)
Due to some insanity in ld expression evaluation, the integer values
0x1 and 0xc above are treated as absolute addresses (ie. they have an
associated section, *ABS*, see exp_fold_tree_1 case etree_value) while
the expression (0x1 << 0xc) has a plain number result. The left hand
side of the inequality happens to evaluate to a "negative" .text
section relative value. Comparing a section relative value against an
absolute value works since the section relative value is first
converted to absolute. Comparing a section relative value against a
number just compares the offsets, which fails since the "negative"
offset is really a very large positive number.
This patch works around the problem by folding integer expressions, so
the assert again becomes
ASSERT (((__hyp_idmap_text_end - (__hyp_idmap_text_start
& 0xfffffffffffff000))
<= 0x1000), HYP init code too big or misaligned)
* ldexp.c (exp_value_fold): New function.
(exp_unop, exp_binop, exp_trinop): Use it.
Folding a constant expression early can lead to loss of tokens, eg.
ABSOLUTE, that are significant in ld's horrible context sensitive
expression evaluation. Also, MAXPAGESIZE and other "constants" may
not have taken values specified on the command line, leading to the
wrong value being cached.
* ldexp.c (exp_unop, exp_binop, exp_trinop, exp_nameop): Don't
fold expression.
* testsuite/ld-elf/maxpage3b.d: Expect correct maxpagesize.
Makes these symbols defined before bfd_elf_size_dynamic_sections, to
avoid horrible hacks elsewhere. The exp_fold_tree undefweak change
is necessary to define undefweak symbols early too. The comment was
wrong. PROVIDE in fact defines undefweak symbols, via
bfd_elf_record_link_assignment.
PR ld/19175
* ldlang.c (lang_insert_orphan): Evaluate __start_* and __stop_*
symbol PROVIDE expressions.
* ldexp.c (exp_fold_tree_1 <etree_provide>): Define undefweak
references.
Giving linker script symbols defined outside of output sections a
section-relative value early, leads to them being used in expressions
as if they were defined inside an output section. This can mean loss
of the section VMA, and wrong results.
ld/
PR ld/18963
* ldexp.h (struct ldexp_control): Add rel_from_abs.
(ldexp_finalize_syms): Declare.
* ldexp.c (new_rel_from_abs): Keep absolute for expressions
outside of output section statements. Set rel_from_abs.
(make_abs, exp_fold_tree, exp_fold_tree_no_dot): Clear rel_from_abs.
(struct definedness_hash_entry): Add final_sec, and comment.
(update_definedness): Set final_sec.
(set_sym_sections, ldexp_finalize_syms): New functions.
* ldlang.c (lang_process): Call ldexp_finalize_syms.
ld/testsuite
PR ld/18963
* ld-scripts/pr18963.d,
* ld-scripts/pr18963.t: New test.
* ld-scripts/expr.exp: Run it.
* ld-elf/provide-hidden-2.ld: Explicitly make "dot" absolute.
* ld-mips-elf/gp-hidden.sd: Don't care about _gp section.
* ld-mips-elf/no-shared-1-n32.d: Don't care about symbol shown at
start of .data section.
* ld-mips-elf/no-shared-1-n64.d: Likewise.
* ld-mips-elf/no-shared-1-o32.d: Likewise.
bfd/
* elf64-ppc.c (ppc64_elf_func_desc_adjust): Don't redefine .TOC.
if already defined, and set linker_def.
(ppc64_elf_set_toc): Use .TOC. value if defined other than by
the backend.
ld/
* ldexp.c (exp_fold_tree_1): Clear linker_def on symbol assignment.
Reverts a2c59f28 and e474ab13. Since the unary form of ALIGN only
references "dot" implicitly, there isn't really a strong argument for
making ALIGN use a relative value when inside an output section.
* ldexp.c (align_dot_val): Delete.
(fold_unary <ALIGN_K, NEXT>): Revert 2015-07-10 change.
(is_align_conditional): Revert 2015-07-20 change.
(exp_fold_tree_1): Likewise, but keep expanded comment.
* scripttempl/elf.sc (.ldata, .bss): Revert 2015-07-20 change.
* ld.texinfo (<ALIGN>): Correct description.
a2c59f28 changed the way the unary ALIGN behaved inside output sections,
resulting in cris-elf testsuite regressions. This patch pads out .bss
in the same manner as it was prior to the ALIGN change.
* scripttempl/elf.sc (.ldata, .bss): Align absolute value of dot.
* ldexp.c (is_align_conditional): Handle binary ALIGN.
(exp_fold_tree_1): Move code setting SEC_KEEP for assignments to
dot inside output sections. Handle absolute expressions.
Inside output sections, ALIGN(.,x) uses a section-relative value for
dot. The unary ALIGN always used the absolute value of dot.
* ldexp.c (align_dot_val): New function.
(fold_unary <ALIGN_K, NEXT>): Use it.
The linker tries to put the end of the last section in the relro
segment exactly on a page boundary, because the relro segment itself
must end on a page boundary. If for any reason this can't be done,
padding is inserted. Since the end of the relro segment is typically
between .got and .got.plt, padding effectively increases the size of
the GOT. This isn't nice for targets and code models with limited GOT
addressing.
The problem with the current code is that it doesn't cope very well
with aligned sections in the relro segment. When making .got aligned
to a 256 byte boundary for PowerPC64, I found that often the initial
alignment attempt failed and the fallback attempt to be less than
adequate. This is a particular problem for PowerPC64 since the
distance between .got and .plt affects the size of plt call stubs,
leading to "stubs don't match calculated size" errors.
So this rewrite takes a direct approach to calculating a new relro
base. Starting from the last section in the segment, we calculate
where it must start to position its end on the boundary, or as near as
possible considering alignment requirements. The new start then
becomes the goal for the previous section to end, and so on for all
sections. This of course ignores the possibility that user scripts
will place . = ALIGN(xxx); in the relro segment, or provide section
address expressions. In those cases we might fail, but the old code
probably did too, and a fallback is provided.
ld/
* ldexp.h (struct ldexp_control): Delete dataseg.min_base. Add
data_seg.relro_offset.
* ldexp.c (fold_binary <DATA_SEGMENT_ALIGN>): Don't set min_base.
(fold_binary <DATA_SEGMENT_RELRO_END>): Do set relro_offset.
* ldlang.c (lang_size_sections): Rewrite code adjusting relro
segment base to line up last section on page boundary.
ld/testsuite/
* ld-x86-64/pr18176.d: Update.
Adjusting the start of the relro segment in order to make it end
exactly on a page boundary runs into difficulties when sections in the
relro segment are aligned; Adjusting the start by (next_page - end)
sometimes results in more than that adjustment occurring at the end,
overrunning the page boundary. So when that occurs we try a new lower
start position by masking the adjusted start with the maximum section
alignment. However, we didn't consider that this masked start address
may in fact be before the initial relro base, which is silly since
that can only increase padding at the relro end.
I've also moved some calculations closer to where they are used, and
comments closer to the relevant statements.
* ldlang.c (lang_size_sections): When alignment of sections
results in relro base adjustment being too large, don't go lower
than the initial value.
* ldexp.c (fold_binary <DATA_SEGMENT_RELRO_END>): Comment.
* scripttempl/elf.sc (DATA_SEGMENT_ALIGN): Omit SEGMENT_SIZE
alignment when SEGMENT_SIZE is the same as MAXPAGESIZE.
This patch fixes PR 4643 by allowing symbols in the LENGTH and ORIGIN
fields of MEMORY regions. Previously, only constants and constant
expressions are allowed.
For the AVR target, this helps define memory constraints more
accurately (per device), without having to create a ton of device
specific linker scripts.
ld/
PR 4643
* ldexp.c (fold_name): Fold LENGTH only after
lang_first_phase_enum.
* ldgram.y (memory_spec): Don't evaluate ORIGIN and LENGTH
rightaway.
* ldlang.h (struct memory_region_struct): Add origin_exp and
length_exp fields.
* ldlang.c (lang_do_memory_regions): New.
(lang_memory_region_lookup): Initialize origin_exp and
length_exp fields.
(lang_process): Call lang_do_memory_regions.
ld/testsuite/
* ld-scripts/memory.t: Define new symbol tred.
* ld-scripts/memory_sym.t: New.
* ld-scripts/script.exp: Perform MEMORY with symbols test, and
conditionally check values of linker symbols.
or maybe not just yet, but this is better than a FIXME.
* ldexp.c (update_definedness): Return false if script symbol is
redefining a strong symbol in an object.
(exp_fold_tree_1 <etree_assign>): Set up for reporting a multiple
definition error, but for now leave disabled.
Trying to use the SEC_LINKER_CREATED section flag to determine whether
a symbol is linker defined fails to work on targets like alpha that
define special SEC_COMMON sections. These might contain symbols that
originated in an object file.
include/
* bfdlink.h (struct bfd_link_hash_entry): Comment non_ir_ref. Add
linker_def.
bfd/
* elflink.c (_bfd_elf_define_linkage_sym): Set linker_def.
* linker.c (_bfd_generic_link_add_one_symbol): Clear linker_def
for CDEF, DEF, DEFW, COM.
ld/
* ldexp.c (exp_fold_tree_1 <etree_provide>): Test linker_def.
ld/testsuite/
* ld-powerpc/sdabase.s,
* ld-powerpc/sdabase.t,
* ld-powerpc/sdabase.d: New test.
* ld-powerpc/sdabase2.t,
* ld-powerpc/sdabase2.d: New test.
* ld-powerpc/powerpc.exp: Run them.
The old code missed testing bfd_link_hash_undefweak, and wrongly
excluded bfd_link_hash_common symbols. It is also clearer to invert
the set of enum bfd_link_hash_type values tested.
bfd_link_hash_indirect and bfd_link_hash_warning will never appear
here.
* ldexp.c (update_definedness): Correct logic setting by_object.
This moves support code for DEFINED to ldexp.c where it is used,
losing the lang_ prefix on identifiers. Two new functions are needed
to initialize and clean up to hash table, but other than that there
are no functional changes here.
* ldexp.c (struct definedness_hash_entry, definedness_table)
(definedness_newfunc, symbol_defined, update_definedness): Move
and rename from..
* ldlang.h (struct lang_definedness_hash_entry): ..here,..
* ldlang.c (lang_definedness_table, lang_definedness_newfunc)
(lang_symbol_defined, lang_update_definedness): ..and here.
* ldexp.c (ldexp_init, ldexp_finish): New functions, extracted from..
* ldlang.c (lang_init, lang_finish): ..here.
* ldexp.h (ldexp_init, ldexp_finish): Declare.
* ldlang.h (lang_symbol_defined, lang_update_definedness): Delete.
* ldmain.c (main): Call ldexp_init and ldexp_finish.
An assignment to dot in an output section that allocates space of
course keeps the output section. Here, I'm changing the behaviour for
assignments that don't allocate space. The idea is not so much to
allow people to force output of an empty section with ". = .", but
to fix cases where an otherwise empty section has padding added by an
alignment expression that changes with relaxation or .eh_frame
editing. Such a section might have zero size before relaxation and so
be stripped incorrectly.
ld/
* ld.texinfo (Output Section Discarding): Mention assigning to dot
as a way of keeping otherwise empty sections.
* ldexp.c (is_dot, is_value, is_sym_value, is_dot_ne_0,
is_dot_plus_0, is_align_conditional): New predicates.
(exp_fold_tree_1): Set SEC_KEEP when assigning to dot inside an
output section, except for some special cases.
* scripttempl/elfmicroblaze.sc: Use canonical form to align at
end of .heap and .stack.
ld/testsuite/
* ld-shared/elf-offset.ld: Align end of .bss with canonical form
of ALIGN that allows an empty .bss to be removed.
* ld-arm/arm-dyn.ld: Likewise.
* ld-arm/arm-lib.ld: Likewise.
* ld-elfvsb/elf-offset.ld: Likewise.
* ld-mips-elf/mips-dyn.ld: Likewise.
* ld-mips-elf/mips-lib.ld: Likewise.
* ld-arm/arm-no-rel-plt.ld: Remove duplicate ALIGN.
* ld-powerpc/vle-multiseg-1.ld: Remove ALIGN at start of section.
ALIGN address of section instead.
* ld-powerpc/vle-multiseg-2.ld: Likewise.
* ld-powerpc/vle-multiseg-3.ld: Likewise.
* ld-powerpc/vle-multiseg-4.ld: Likewise.
* ld-powerpc/vle-multiseg-6.ld: Likewise.
* ld-scripts/empty-aligned.d: Check section headers not program
headers. Remove xfail and notarget.
* ld-scripts/empty-aligned.t: Use canonical ALIGN for end of .text2.
Modifies ld machinery tracking linker script assignments to notice all
assignments, not just those symbols mentioned in DEFINED().
ld/
PR ld/14962
* ldlang.h (struct lang_definedness_hash_entry): Add by_object and
by_script. Make iteration a single bit field.
(lang_track_definedness, lang_symbol_definition_iteration): Delete.
(lang_symbol_defined): Declare.
* ldlang.c (lang_statement_iteration): Expand comment a little.
(lang_init <lang_definedness_table>): Make it bigger.
(lang_track_definedness, lang_symbol_definition): Delete.
(lang_definedness_newfunc): Update.
(lang_symbol_defined): New function.
(lang_update_definedness): Create entries here. Do track whether
script definition of symbol is valid, even when also defined in
an object file.
* ldexp.c (fold_name <DEFINED>): Update.
(fold_name <NAME>): Allow self-assignment for absolute symbols
defined in a linker script.
ld/testsuite/
* ld-scripts/pr14962-2.d,
* ld-scripts/pr14962-2.t: New test.
* ld-scripts/expr.exp: Run it.
* ldgram.y: Likewise
* ldlex.l: Likewise
* NEWS: Mention the new feature.
* ld.texinfo: Document the new feature.
* ld-scripts/log2.exp: New: Run the new log2 test.
* ld-scripts/log2.s: Source for the new test.
* ld-scripts/log2.t: Linker script for new test.
* ldexp.h (struct ldexp_control): Add "assign_name".
* ldexp.c (fold_name <NAME>): Compare and clear assign_name on match.
(exp_fold_tree_1): Remove existing code testing for self assignment.
Instead set and test expld.assign_name.
* ldlang.c (scan_for_self_assignment): Delete.
(print_assignment): Instead set and test expld.assign_name.
* ldexp.c (fold_name): Ignore undefined symbols when assigning to
dot in mark phase.
(exp_fold_tree_1): Evaluate assignment to dot expressions even when
discarding result, for side effects. Fix typo in error message.
* ldlang.c (lang_process): Rerun lang_do_assignments before
starting garbage collection.
* ldexp.c (fold_name): Generate a reloc for defined symbols
found without an associated output section during the mark phase.
(exp_fold_tree_1): Continue processing an expression, even if we
are unable to fold it, if we are in the first two evaluation
phases.
* ldexp.h (enum lang_phase_type): Add descriptions of the phases.
* ld-gc/pr13683.c: New test source file.
* ld-gc/pr13683.d: New test control and output file.
* ld-gc/gc.exp: Run the pr13683 test.
* ld-cris/tls-gc-68: Update expected symbol table dump.
* ld-cris/tls-gc-69: Likewise.
* ld-cris/tls-gc-70: Likewise.
* ld-cris/tls-gc-71: Likewise.
* ld-cris/tls-gc-75: Likewise.
* ld-cris/tls-gc-76.d: Likewise.
* ld-cris/tls-gc-79.d: Likewise.
* ld.h (parsing_defsym): Delete.
* ldexp.c (exp_intop, exp_bigintop, exp_relop): Set type.filename.
(fold_binary, fold_name, exp_fold_tree_1, exp_get_vma, exp_get_fill,
exp_get_abs_int): Add tree arg for %S in error messages. Don't
fudge lineno.
(exp_binop, exp_unop, exp_nameop, exp_assop, exp_assert): Copy
type.filename from sub-tree.
(exp_trinop): Likewise, and use "cond" rather than "lhs".
* ldexp.h (node_type): Add filename field to struct.
* ldfile.c (ldfile_input_filename): Delete. Remove all refs.
* ldfile.h (ldfile_input_filename): Delete.
* ldgram.y (phdr_type, phdr_qualifiers, yyerror): Add NULL arg for
%S in error messages.
* ldemul.c (syslib_default, hll_default): Likewise.
* ldlang.c (lang_memory_region_lookup, lang_memory_region_alias,
lang_get_regions, lang_new_phdr): Likewise.
(lang_size_sections_1): Pass addr_tree for %S.
* ldlex.h (lex_redirect): Update prototype.
(ldlex_filename): Declare.
* ldlex.l (<EOF>): Don't set ldfile_input_filename.
(lex_redirect): Add fake_filename and count params. Push
fake_filename to file_name_stack and init lineno from count.
(ldlex_filename): New function.
(lex_warn_invalid): Use above.
* ldmain.c (main): Update lex_redirect call.
* ldmisc.c (vfinfo <%S>): Take file name and line number from
etree_type arg, or use current if arg is NULL.
* lexsup.c (parsing_defsym): Delete.
(parse_args <OPTION_DEFSYM>): Update lex_redirect call.
* ldexp.h (enum phase_enum): Comment. Add exp_dataseg_done.
* ldexp.c (fold_unary <DATA_SEGMENT_END>): Rearrange code. Test
for exp_dataseg_done rather than expld.phase == lang_final_phase_enum
to detect when we've finished sizing sections.
(fold_binary <DATA_SEGMENT_ALIGN>): Likewise.
(fold_binary <DATA_SEGMENT_RELRO_END>): Likewise. Also test
that we are not inside an output section statement.
* ldlang.c (lang_size_sections): Set exp_dataseg_done on exit if
not exp_dataseg_relro_adjust or exp_dataseg_adjust. Don't set
lang_final_phase_enum here.
(lang_process): Set lang_final_phase_enum here.