This may no longer be necessary with the current version
of the SVE patches, but it does at least make things consistent
with the TYPE_MODE/SET_TYPE_MODE split.
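As an illustration of the accessor split, here is a minimal sketch. The
types decl_node and machine_mode below are simplified stand-ins, and
decl_mode is a hypothetical helper; none of this is the real tree.h
definition. In the sketch, reads go through a by-value accessor and writes
go through SET_DECL_MODE. Whether the real DECL_MODE is restricted to an
rvalue is not stated above and is only an assumption here.

```cpp
// Simplified stand-ins; GCC's real machine_mode and tree nodes differ.
enum machine_mode { VOIDmode, SImode, DImode };

struct decl_node {
  machine_mode mode;
};

// Hypothetical helper: returning by value means DECL_MODE (x) = m
// does not compile, so every write must use SET_DECL_MODE.
inline machine_mode decl_mode(const decl_node *node) { return node->mode; }

#define DECL_MODE(NODE) (decl_mode (NODE))
#define SET_DECL_MODE(NODE, MODE) ((NODE)->mode = (MODE))
```

A caller that used to write "DECL_MODE (decl) = mode" would then write
"SET_DECL_MODE (decl, mode)" instead.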
gcc/ada/
2016-11-16 Richard Sandiford <richard.sandiford@arm.com>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
* gcc-interface/utils.c (create_label_decl): Use SET_DECL_MODE.
gcc/c/
2016-11-16 Richard Sandiford <richard.sandiford@arm.com>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
* c-decl.c (merge_decls): Use SET_DECL_MODE.
(make_label, finish_struct): Likewise.
gcc/cp/
2016-11-16 Richard Sandiford <richard.sandiford@arm.com>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
* class.c (finish_struct_bits): Use SET_DECL_MODE.
(build_base_field_1, layout_class_type, finish_struct_1): Likewise.
* decl.c (make_label_decl): Likewise.
* pt.c (tsubst_decl): Likewise.
gcc/fortran/
2016-11-16 Richard Sandiford <richard.sandiford@arm.com>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
* trans-common.c (build_common_decl): Use SET_DECL_MODE.
* trans-decl.c (gfc_build_label_decl): Likewise.
* trans-types.c (gfc_get_array_descr_info): Likewise.
gcc/lto/
2016-11-16 Richard Sandiford <richard.sandiford@arm.com>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
* lto.c (offload_handle_link_vars): Use SET_DECL_MODE.
gcc/
2016-11-16 Richard Sandiford <richard.sandiford@arm.com>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
* tree.h (SET_DECL_MODE): New macro.
* cfgexpand.c (avoid_deep_ter_for_debug): Use SET_DECL_MODE.
(expand_gimple_basic_block): Likewise.
* function.c (split_complex_args): Likewise.
* ipa-prop.c (ipa_modify_call_arguments): Likewise.
* omp-simd-clone.c (ipa_simd_modify_stmt_ops): Likewise.
* stor-layout.c (layout_decl, relayout_decl): Likewise.
(finish_bitfield_representative): Likewise.
* tree.c (make_node_stat): Likewise.
* tree-inline.c (remap_ssa_name): Likewise.
(tree_function_versioning): Likewise.
* tree-into-ssa.c (rewrite_debug_stmt_uses): Likewise.
* tree-sra.c (sra_ipa_reset_debug_stmts): Likewise.
* tree-ssa-ccp.c (optimize_atomic_bit_test_and): Likewise.
* tree-ssa-loop-ivopts.c (remove_unused_ivs): Likewise.
* tree-ssa.c (insert_debug_temp_for_var_def): Likewise.
* tree-streamer-in.c (unpack_ts_decl_common_value_fields): Likewise.
* varasm.c (make_debug_expr_from_rtl): Likewise.
libcc1/
2016-11-16 Richard Sandiford <richard.sandiford@arm.com>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
* plugin.cc (plugin_build_add_field): Use SET_DECL_MODE.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r242585
This patch makes scheduling not reorder prologue insns relative to
epilogue insns and vice versa. This fixes PR78029.
The problem in that PR:
We have two insns, in this order:
(insn/f 300 299 267 8 (set (reg:DI 65 lr)
(reg:DI 0 0)) 579 {*movdi_internal64}
(expr_list:REG_DEAD (reg:DI 0 0)
(expr_list:REG_CFA_RESTORE (reg:DI 65 lr)
(nil))))
...
(insn/f 310 268 134 8 (set (mem/c:DI (plus:DI (reg/f:DI 1 1)
(const_int 144 [0x90])) [6 S8 A8])
(reg:DI 0 0)) 579 {*movdi_internal64}
(expr_list:REG_DEAD (reg:DI 0 0)
(expr_list:REG_CFA_OFFSET (set (mem/c:DI (plus:DI (reg/f:DI 1 1)
(const_int 144 [0x90])) [6 S8 A8])
(reg:DI 65 lr))
(nil))))
and sched swaps them (when compiling for power6, it tries to put memory
stores together, so insn 310 is moved up past 300 to go together with
some other store). But the REG_CFA_RESTORE and REG_CFA_OFFSET cannot be
swapped (they both say where the orig value of LR now lives).
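The bookkeeping that prevents such a swap can be modelled as follows. This
is a toy sketch, not the sched-deps.c code: insns are reduced to an index
and a kind, and a dependency is just a pair of indices. Only the three
field names are taken from the ChangeLog below; Kind, Deps and
logue_dependencies are hypothetical names for illustration.

```cpp
#include <utility>
#include <vector>

enum class Kind { Other, Prologue, Epilogue };

struct Deps {
  std::vector<int> last_prologue;   // most recent run of prologue insns
  std::vector<int> last_epilogue;   // most recent run of epilogue insns
  bool last_logue_was_epilogue = false;
};

// Returns (earlier, later) pairs: "later" must not be scheduled before
// "earlier".  Prologue insns depend on the last run of epilogue insns,
// and epilogue insns depend on the last run of prologue insns.
std::vector<std::pair<int, int>>
logue_dependencies(const std::vector<Kind> &insns) {
  Deps deps;
  std::vector<std::pair<int, int>> out;
  for (int i = 0; i < static_cast<int>(insns.size()); ++i) {
    if (insns[i] == Kind::Prologue) {
      for (int e : deps.last_epilogue)
        out.push_back({e, i});
      // A prologue insn following epilogue insns starts a new run.
      if (deps.last_logue_was_epilogue)
        deps.last_prologue.clear();
      deps.last_prologue.push_back(i);
      deps.last_logue_was_epilogue = false;
    } else if (insns[i] == Kind::Epilogue) {
      for (int p : deps.last_prologue)
        out.push_back({p, i});
      // An epilogue insn following prologue insns starts a new run.
      if (!deps.last_logue_was_epilogue)
        deps.last_epilogue.clear();
      deps.last_epilogue.push_back(i);
      deps.last_logue_was_epilogue = true;
    }
  }
  return out;
}
```

In this model, insn 300 (a restore, so an epilogue insn) followed by insn
310 (a save, so a prologue insn) yields one dependency from the first to
the second, which is what stops the store from moving up.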
PR rtl-optimization/78029
* function.c (prologue_contains, epilogue_contains): New functions.
(record_prologue_seq, record_epilogue_seq): New functions.
* function.h (prologue_contains, epilogue_contains,
record_prologue_seq, record_epilogue_seq): New declarations.
* sched-deps.c (sched_analyze_insn): Make dependencies to prevent
mixing prologue and epilogue insns.
(init_deps): Initialize the new fields in struct deps_desc.
* sched-int.h (struct deps_desc): New fields last_prologue,
last_epilogue, and last_logue_was_epilogue.
* shrink-wrap.c (emit_common_heads_for_components): Record all
emitted prologue and epilogue insns.
(emit_common_tails_for_components): Ditto.
(insert_prologue_epilogue_for_components): Ditto.
From-SVN: r241650
PR77962 shows Go failing on 32-bit x86. This happens because the i386
port requires the split stack prologue to be created before the normal
prologue, and my previous patch changed it to be the other way around.
This patch changes it back. Things will be exactly as before for targets
that do not do shrink-wrapping for separate components. For targets
that *do* support it, all three prologue/epilogue creation functions
will now be called twice for functions that have anything wrapped
separately (instead of just the prologue created twice).
PR bootstrap/77962
* function.c (thread_prologue_and_epilogue_insns): Call all
make_*logue_seq in the same order as traditional. Call them
all a second time if shrink_wrapped_separate.
From-SVN: r241135
This is the main substance of this patch series.
Instead of doing all of the prologue and epilogue in one spot, it often
is better to do components of it at different places, so that they are
executed less frequently.
What exactly is a component is completely up to the target; this code
treats it all abstractly, and uses hooks for the target to handle the
more concrete things. Commonly there is one component for each callee-
saved register, for example.
Components can be executed more than once per function execution. This
pass makes sure that a component's epilogue is not called more often
than the corresponding prologue has been, at any point in time; that the
prologue is called more often, wherever the prologue's effect is needed;
and that the epilogue is called as often as the prologue has been, when
the function exits. It does this by first deciding which blocks need
which components active, and then placing prologue and epilogue
components to make that exactly true.
Deciding what blocks should run with a certain component active so that
the total cost of executing the prologues (and epilogues) is optimal, is
not a computationally feasible problem. Instead, for each basic block,
we estimate the cost of putting a prologue right before the block, and
if that is cheaper than the total cost of putting prologues optimally
(according to the estimated cost) in the dominator subtrees strictly
dominated by this first block, place it at the first block instead.
This simple procedure places the components optimally for any dominator
subtree where the root node's cost does not depend on anything outside
its subtree.
The cost is the execution frequency of all edges into the block coming
from blocks that do not have this component active. The estimated cost
is the execution frequency of the block, minus the execution frequency
of any backedges (which by definition are coming from subtrees, so if
the "head" block gets a prologue, the source block of any backedge has
that component active as well).
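The placement heuristic for a single component can be sketched as a walk
over the dominator tree. This is a simplified model under stated
assumptions, not the shrink-wrap.c implementation: Block, clear_subtree and
place_prologue are hypothetical names, frequencies are plain integers, and
only the estimated cost (block frequency minus backedge frequency) is used.

```cpp
#include <vector>

struct Block {
  long frequency;             // estimated execution frequency of the block
  long backedge_frequency;    // frequency of backedges entering the block
  bool needs_component;       // block itself needs the component active
  std::vector<int> children;  // children in the dominator tree
  bool has_prologue;          // result: prologue placed before this block
};

// Remove placements below B once B itself gets the prologue.
void clear_subtree(std::vector<Block> &blocks, int b) {
  for (int c : blocks[b].children) {
    blocks[c].has_prologue = false;
    clear_subtree(blocks, c);
  }
}

// Returns the estimated cost of the placement chosen for B's subtree.
long place_prologue(std::vector<Block> &blocks, int b) {
  long own_cost = blocks[b].frequency - blocks[b].backedge_frequency;
  long subtree_cost = 0;
  for (int c : blocks[b].children)
    subtree_cost += place_prologue(blocks, c);

  // Place the prologue here if the block needs it, or if doing so is
  // cheaper than the placements chosen in the dominated subtrees.
  if (blocks[b].needs_component || own_cost < subtree_cost) {
    clear_subtree(blocks, b);
    blocks[b].has_prologue = true;
    return own_cost;
  }
  return subtree_cost;
}
```

For a loop whose body needs the component, the loop header's estimated
cost excludes the backedge, so the prologue moves out of the body and in
front of the header.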
Currently, the epilogues are placed as late as possible, given the
constraints. This does not matter for execution cost, but we could
save a little bit of code size by placing the epilogues in a smarter
way. This is a possible future optimisation.
Now all that is left is inserting prologues and epilogues on all edges
that jump into resp. out of the "active" set of blocks. Often we need
to insert some components' prologues (or epilogues) on all edges into
(or out of) a block. In theory cross-jumping can unify all such, but
in practice that often fails; besides, that is a lot of work. So in
this case we insert the prologue and epilogue components at the "head"
or "tail" of a block, instead.
As a final optimisation, if a block needs a prologue and its immediate
dominator has the block as a post-dominator, that immediate dominator
gets the prologue as well.
* function.c (thread_prologue_and_epilogue_insns): Call
try_shrink_wrapping_separate. Compute the prologue_seq afterwards,
if it has possibly changed. Compute the split_prologue_seq and
epilogue_seq later, too.
* shrink-wrap.c: #include cfgbuild.h and insn-config.h.
(dump_components): New function.
(struct sw): New struct.
(SW): New function.
(init_separate_shrink_wrap): New function.
(fini_separate_shrink_wrap): New function.
(place_prologue_for_one_component): New function.
(spread_components): New function.
(disqualify_problematic_components): New function.
(emit_common_heads_for_components): New function.
(emit_common_tails_for_components): New function.
(insert_prologue_epilogue_for_components): New function.
(try_shrink_wrapping_separate): New function.
* shrink-wrap.h: Declare try_shrink_wrapping_separate.
From-SVN: r241063
On x86, interrupt handlers are only called by processors which push
interrupt data onto the stack at the address where the normal return
address would be. Since interrupt handlers must access interrupt data via
pointers so that they can update it, the pointer argument is passed as
"argument pointer - word".
TARGET_FUNCTION_INCOMING_ARG defines how callee sees its argument.
Normally it returns REG, NULL, or CONST_INT. This patch adds arbitrary
address computation based on hard register, which can be forced into a
register, to the list.
When copying an incoming argument onto stack, assign_parm_setup_stack
has:
if (argument in memory)
copy argument in memory to stack
else
move argument to stack
Since an arbitrary address computation may be passed as an argument, we
change it to:
if (argument in memory)
copy argument in memory to stack
else
{
if (argument isn't in register)
force argument into a register
move argument to stack
}
* function.c (assign_parm_setup_stack): Force source into a
register if needed.
* target.def (function_incoming_arg): Update documentation to
allow arbitrary address computation based on hard register.
* doc/tm.texi: Regenerated.
Co-Authored-By: Julia Koval <julia.koval@intel.com>
From-SVN: r237037
This patch restructures how the prologues/epilogues are inserted. Sibcalls
that run without prologue are now handled in shrink-wrap.c; it communicates
what is already handled by setting the EDGE_IGNORE flag. The
try_shrink_wrapping function then doesn't need to be passed the bb_flags
anymore.
* function.c (make_epilogue_seq): Remove epilogue_end parameter.
(thread_prologue_and_epilogue_insns): Remove bb_flags. Restructure
code. Ignore sibcalls on EDGE_IGNORE edges.
* shrink-wrap.c (handle_simple_exit): New function. Set EDGE_IGNORE
on edges for sibcalls that run without prologue. The rest of the
function is combined from...
(fix_fake_fallthrough_edge): ... this, and ...
(try_shrink_wrapping): ... a part of this. Remove the bb_with
function argument, make it a local variable.
From-SVN: r236491
It failed for targets that have an eh_return pattern with a splitter
gated by epilogue_done.
* function.c (thread_prologue_and_epilogue_insns): Move the
"goto epilogue_done" one block later.
From-SVN: r236441
Make new functions make_split_prologue_seq, make_prologue_seq, and
make_epilogue_seq.
* function.c (make_split_prologue_seq, make_prologue_seq,
make_epilogue_seq): New functions, factored out from...
(thread_prologue_and_epilogue_insns): Here.
From-SVN: r236373
We should do CLEANUP_EXPENSIVE after shrink-wrapping, because shrink-
wrapping creates constructs that CLEANUP_EXPENSIVE can optimise, and
nothing runs CLEANUP_EXPENSIVE later.
* function.c (rest_of_handle_thread_prologue_and_epilogue): Call
cleanup_cfg with CLEANUP_EXPENSIVE after shrink-wrapping instead
of before. Add a comment.
From-SVN: r236372
Now that cfgcleanup knows how to optimize with return statements, the
epilogue insertion code doesn't have to deal with it itself anymore.
* function.c (emit_use_return_register_into_block): Delete.
(gen_return_pattern): Delete.
(emit_return_into_block): Delete.
(active_insn_between): Delete.
(convert_jumps_to_returns): Delete.
(emit_return_for_exit): Delete.
(thread_prologue_and_epilogue_insns): Delete all code dealing with
simple_return for shrink-wrapped blocks.
* shrink-wrap.c (try_shrink_wrapping): Insert simple_return at the
end of blocks that need one.
(get_unconverted_simple_return): Delete.
(convert_to_simple_return): Delete.
* shrink-wrap.h (get_unconverted_simple_return): Delete declaration.
(convert_to_simple_return): Ditto.
From-SVN: r235905
Fix this warning:
../../../src/gcc/function.c: In function ‘void locate_and_pad_parm(machine_mode, tree, int, int, int, tree, args_size*, locate_and_pad_arg_data*)’:
../../../src/gcc/function.c:4123:2: error: statement is indented as if it were guarded by... [-Werror=misleading-indentation]
{
^
../../../src/gcc/function.c:4119:7: note: ...this ‘if’ clause, but it is not
if (initial_offset_ptr->var)
^
gcc/ChangeLog:
* function.c (locate_and_pad_parm): Fix indentation.
From-SVN: r231518
PR middle-end/68291
PR middle-end/68292
* cfgexpand.c (set_rtl): Always accept PARALLELs with BLKmode for
SSA names based on RESULT_DECLs.
* function.c (expand_function_start): Do not create BLKmode REGs
for GIMPLE registers when coalescing is enabled.
From-SVN: r231372
Storing a register in memory as a full word and then accessing the
same memory address under a smaller-than-word mode amounts to
right-shifting of the register word on big endian machines. So, if
BLOCK_REG_PADDING chooses upward padding for BYTES_BIG_ENDIAN, and
we're copying from the entry_parm REG directly to a pseudo, bypassing
any stack slot, perform the shifting explicitly.
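The equivalence can be checked with a small host-independent sketch that
models big-endian memory as a byte array. The helper names store_be64 and
load_be are hypothetical and exist only for this illustration.

```cpp
#include <cstdint>

// Store WORD into BUF in big-endian order (most significant byte first).
void store_be64(uint64_t word, unsigned char buf[8]) {
  for (int i = 0; i < 8; ++i)
    buf[i] = static_cast<unsigned char>((word >> (8 * (7 - i))) & 0xff);
}

// Load an NBYTES-wide big-endian value starting at the same address.
uint64_t load_be(const unsigned char *buf, int nbytes) {
  uint64_t value = 0;
  for (int i = 0; i < nbytes; ++i)
    value = (value << 8) | buf[i];
  return value;
}
```

Reading N bytes at the start of the stored word gives the same value as
shifting the word right by (8 - N) * 8 bits, which is the shift that has
to be emitted explicitly once the stack slot is bypassed.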
This fixes the miscompile of function_return_val_10 in
gcc.target/aarch64/aapcs64/func-ret-4.c for target aarch64_be-elf
introduced in the first patch for 67753.
for gcc/ChangeLog
PR rtl-optimization/67753
PR rtl-optimization/64164
* function.c (assign_parm_setup_block): Right-shift
upward-padded big-endian args when bypassing the stack slot.
From-SVN: r230985
In assign_parm_setup_block, the copy of args in PARALLELs from
entry_parm to stack_parm is deferred to the parm conversion insn seq,
but the copy from stack_parm to target_reg was inserted in the normal
copy seq, that is executed before the conversion insn seq. Oops.
We could do away with the need for an actual stack_parm in general,
which would have avoided the need for emitting the copy to target_reg
in the conversion seq, but at least on pa, due to the need for stack
to copy between SI and SF modes, it seems like using the reserved
stack slot is beneficial, so I put in logic to use a pre-reserved
stack slot when there is one, and emit the copy to target_reg in the
conversion seq if stack_parm was set up there.
for gcc/ChangeLog
PR rtl-optimization/67753
PR rtl-optimization/64164
* function.c (assign_parm_setup_block): Avoid allocating a
stack slot if we don't have an ABI-reserved one. Emit the
copy to target_reg in the conversion seq if the copy from
entry_parm is in it too. Don't use the conversion seq to copy
a PARALLEL to a REG or a CONCAT.
From-SVN: r229840
* cgraphunit.c (cgraph_node::expand_thunk): Call
allocate_struct_function before init_function_start.
(cgraph_node::expand): Use push_cfun and pop_cfun.
* config/i386/i386.c (ix86_code_end): Call
allocate_struct_function before init_function_start.
* config/rs6000/rs6000.c (rs6000_code_end): Likewise.
* function.c (init_function_start): Move preamble to all
callers.
* passes.c (do_per_function_toporder): Use push_cfun and pop_cfun.
(execute_one_pass): Handle newly added TODO_discard_function.
(execute_pass_list_1): Terminate if cfun is NULL.
(execute_pass_list): Do not push and pop cfun, expect that
cfun is set.
* tree-pass.h (TODO_discard_function): Define.
From-SVN: r229764
for gcc/ChangeLog
PR middle-end/67766
* function.c (expand_function_end): Move return value
promotion past the handling of PARALLELs and CONCATs.
From-SVN: r228651
Revert the fragile and complicated changes to assign_parms designed to
enable it to use RTL assignments chosen by cfgexpand, and instead have
enable it to use RTL assigments chosen by cfgexpand, and instead have
cfgexpand use the RTL assignments by assign_parms, keying them off of
the default defs that are now necessarily introduced for each parm and
result. The possible lack of a default def was already a problem, and
the fallbacks in place were not enough, as shown by PR67312. We now
have checking asserts in set_rtl that verify that we're assigning to
each var a piece of RTL that matches the expectations set forth by
use_register_for_decl.
for gcc/ChangeLog
PR rtl-optimization/64164
PR tree-optimization/67312
PR middle-end/67340
PR middle-end/67490
PR bootstrap/67597
* cfgexpand.c (parm_in_stack_slot_p): Remove.
(ssa_default_def_partition): Remove.
(get_rtl_for_parm_ssa_default_def): Remove.
(set_rtl): Check that RTL assignments match expectations.
Loop on SUBREGs, CONCATs and PARALLELs subexprs. Set only the
default def location for params and results. Record SSA names
or types in REG and MEM attrs, respectively.
(set_parm_rtl): New.
(expand_one_ssa_partition): Drop logic that assigned MEMs with
unassigned addresses.
(adjust_one_expanded_partition_var): Don't accept NULL RTL on
deferred stack alloc vars.
(expand_used_vars): Skip partitions holding parm default defs.
Move adjust_one_expanded_partition_var loop...
(pass_expand::execute): ... here. Drop redundant assert.
Adjust comments before the final loop over all ssa names.
Require assigned rtl of parms and results to match exactly.
Reset its attributes to match them, not any other variables in
the same partition.
(expand_debug_expr): Use entry value for PARM's default defs
only iff they have zero nondebug uses.
* cfgexpand.h (parm_in_stack_slot_p): Remove.
(get_rtl_for_parm_ssa_default_def): Remove.
(set_parm_rtl): Declare.
* doc/invoke.texi: Improve wording.
* explow.c (promote_decl_mode): Fix promote_function_mode for
result decls not by reference.
(promote_ssa_mode): Disregard BLKmode from promote_decl, and
bypass TYPE_MODE to get the actual vector mode.
* function.c: Include tree-dfa.h. Revert 2015-08-14's and
2015-08-19's changes as follows. Drop include of
basic-block.h and df.h.
(rtl_for_parm): Remove.
(maybe_reset_rtl_for_parm): Remove.
(parm_in_unassigned_mem_p): Remove.
(use_register_for_decl): Add logic for RESULT_DECLs matching
assign_parms' behavior.
(split_complex_args): Revert.
(assign_parms_augmented_arg_list): Revert. Add comment
referencing the logic above.
(assign_parm_adjust_stack_rtl): Revert.
(assign_parm_setup_block): Revert. Use set_parm_rtl instead
of SET_DECL_RTL. Set up a REG if the parm demands so.
(assign_parm_setup_reg): Revert. Consolidated SET_DECL_RTL
calls into a single set_parm_rtl. Set up a temporary RTL
temporarily for expand_assignment.
(assign_parm_setup_stack): Revert. Use set_parm_rtl.
(assign_parms_unsplit_complex): Revert. Use set_parm_rtl.
(assign_bounds): Revert.
(assign_parms): Revert. Use set_parm_rtl.
(allocate_struct_function): Relayout result and parms of
non-abstract functions.
(expand_function_start): Revert. Use set_parm_rtl. If the
result is not a hard reg, create a pseudo from the promoted
mode of the default def. Promote static chain mode.
* tree-outof-ssa.c (remove_ssa_form): Drop unused
partition_has_default_def. Set up
partitions_for_parm_default_defs.
(finish_out_of_ssa): Remove partition_has_default_def.
Release partitions_for_parm_default_defs.
* tree-outof-ssa.h (struct ssaexpand): Remove
partition_has_default_def. Add
partitions_for_parm_default_defs.
* tree-ssa-coalesce.c: Include tree-dfa.h, tm_p.h and
stor-layout.h.
(build_ssa_conflict_graph): Fix conflict-detection of default
defs of even unused default defs of params and results.
(for_all_parms): New.
(create_default_def): New.
(register_default_def): New.
(coalesce_with_default): New.
(create_outofssa_var_map): Create default defs for all parms
and results, and register their partitions. Add GIMPLE_RETURN
operands as coalesce candidates with results. Add default
defs of each parm or result as coalesce candidates with its
other defs. Mark each result def, and each default def of
parms, as used_in_copy.
(gimple_can_coalesce_p): Call it. Call use_register_for_decl
with the ssa names, even anonymous ones. Drop
parm_in_stack_slot_p calls. Require same signedness and
alignment.
(coalesce_ssa_name): Add coalesce candidates for all defs of
each parm and result, even unused ones.
(parm_default_def_partition_arg): New type.
(set_parm_default_def_partition): New.
(get_parm_default_def_partitions): New.
* tree-ssa-coalesce.h (get_parm_default_def_partitions): New.
* tree-ssa-live.c (partition_view_init): Regard unused defs of
parms and results as used.
(verify_live_on_entry): Don't error out just because they're
not live.
for gcc/testsuite/ChangeLog
PR rtl-optimization/64164
PR tree-optimization/67312
* gcc.dg/pr67312.c: New. From Zdenek Sojka.
* gcc.target/i386/stackalign/return-4.c: Add -O.
From-SVN: r228175
The caller of try_shrink_wrapping wants to be returned a single edge to
put the prologue on. To make that work even if there are multiple edges
(all pointing to the PRO block) that need the prologue, add a new block
that becomes the destination of all such edges, and then jumps to PRO.
In the general case, some edges to PRO will need to be redirected, and
not all edges *can* be redirected. This adds a can_get_prologue function
that detects such cases. This then happily can also handle the "prologue
clobbers some reg that is live on the edge we want to insert it on" case.
Not all EDGE_CROSSING edges can be redirected, so handle those the same
as EDGE_COMPLEX edges.
2015-09-22 Segher Boessenkool <segher@kernel.crashing.org>
* function.c (thread_prologue_and_epilogue_insns): Delete
orig_entry_edge argument to try_shrink_wrapping.
* shrink-wrap.c (can_get_prologue): New function.
(can_dup_for_shrink_wrapping): Also handle EDGE_CROSSING.
(try_shrink_wrapping): Delete orig_entry_edge argument. Use
can_get_prologue where needed. Remove code that finds a single
edge for the prologue. Remove code that tests if any reg clobbered
by the prologue is live on the prologue edge. Remove code that finds
the new prologue edge after duplicating blocks. Make a new prologue
block and edge.
* shrink-wrap.h (try_shrink_wrapping): Delete orig_entry_edge argument.
From-SVN: r228022
optabs.[hc] is a bit of a behemoth. It includes basic functions for querying
what a target can do, related tree- and gimple-level query functions,
related rtl-level query functions, and the functions that actually
generate code. Some gimple optimisations therefore need:
#include "insn-config.h"
#include "expmed.h"
#include "dojump.h"
#include "explow.h"
#include "emit-rtl.h"
#include "varasm.h"
#include "stmt.h"
#include "expr.h"
purely to query whether the target has support for a particular operation.
This patch splits optabs up as follows:
- optabs-query.[hc]: IL-independent functions for querying what a target
can do natively.
- optabs-tree.[hc]: tree and gimple query functions (an extension of
optabs-query.[hc]).
- optabs-libfuncs.[hc]: optabs-specific libfuncs (an extension of
libfuncs.h)
- optabs.h: For now includes optabs-query.h and optabs-libfuncs.h.
Only two files outside optabs need to include both optabs.h and
optabs-tree.h: expr.c and function.c. I think that's expected given
that both are related to expand.
It might be good to split optabs.h further, but this is already quite
a big patch.
I changed can_conditionally_move_p from returning an int to returning
a bool and fixed a few formatting glitches. There should be no other
changes to the functions themselves.
gcc/
* Makefile.in (OBJS): Add optabs-libfuncs.o, optabs-query.o
and optabs-tree.o.
(GTFILES): Replace optabs.c with optabs-libfuncs.c.
* genopinit.c (main): Add an include guard to insn-opinit.h.
Protect the rtx_code parts with NUM_RTX_CODE.
* optabs.h: Split parts out to...
* optabs-libfuncs.h, optabs-query.h, optabs-tree.h: ...these
new files.
* optabs.c: Split parts out to...
* optabs-libfuncs.c, optabs-query.c, optabs-tree.c: ...these
new files.
* cilk-common.c: Include optabs-query.h rather than optabs.h.
* fold-const.c: Likewise.
* target-globals.c: Likewise.
* tree-if-conv.c: Likewise.
* tree-ssa-forwprop.c: Likewise.
* tree-ssa-loop-prefetch.c: Likewise.
* tree-ssa-math-opts.c: Include optabs-tree.h rather than
optabs.h. Remove unnecessary include files.
* tree-ssa-phiopt.c: Likewise.
* tree-ssa-reassoc.c: Likewise.
* tree-switch-conversion.c: Likewise.
* tree-vect-data-refs.c: Likewise.
* tree-vect-generic.c: Likewise.
* tree-vect-loop.c: Likewise.
* tree-vect-patterns.c: Likewise.
* tree-vect-slp.c: Likewise.
* tree-vect-stmts.c: Likewise.
* tree-vrp.c: Likewise.
* toplev.c: Include optabs-query.h and optabs-libfuncs.h
rather than optabs.h.
* expr.c: Include optabs-tree.h.
* function.c: Likewise.
From-SVN: r227865
With the new shrink-wrap algorithm, blocks reachable both with and
without prologue are duplicated, and their incoming edges are then
distributed accordingly. So we need to call fixup_partitions.
2015-09-16 Segher Boessenkool <segher@kernel.crashing.org>
PR bootstrap/67587
* function.c (rest_of_handle_thread_prologue_and_epilogue): Call
fixup_partitions.
From-SVN: r227827
Defer stack slot address assignment for all parms that can't live in
pseudos, and accept pseudo assignments in assign_parm_setup_block.
for gcc/ChangeLog
PR rtl-optimization/64164
* cfgexpand.c (parm_maybe_byref_p): Renamed to...
(parm_in_stack_slot_p): ... this. Disregard mode, what
matters is whether the parm will live in a pseudo or a stack
slot.
(expand_one_ssa_partition): Deal with params without a default
def. Disregard mode.
* cfgexpand.h: Renamed function declaration.
* tree-ssa-coalesce.c: Adjust.
* function.c (split_complex_args): Allocate stack slot for
unassigned parms before splitting.
(parm_in_unassigned_mem_p): New. Use it instead of
parm_maybe_byref_p throughout this file.
(assign_parm_setup_block): Use it. Accept pseudos in the
expand-assigned rtl.
(assign_parm_setup_reg): Drop BLKmode requirement.
(assign_parm_setup_stack): Allocate and fill in the address of
unassigned MEM parms.
From-SVN: r227015
for gcc/ChangeLog
PR rtl-optimization/64164
PR bootstrap/66978
PR middle-end/66983
PR rtl-optimization/67000
PR middle-end/67034
PR middle-end/67035
* Makefile.in (OBJS): Drop tree-ssa-copyrename.o.
* tree-ssa-copyrename.c: Removed.
* opts.c (default_options_table): Drop -ftree-copyrename. Add
-ftree-coalesce-vars.
* passes.def: Drop all occurrences of pass_rename_ssa_copies.
* common.opt (ftree-copyrename): Ignore.
(ftree-coalesce-inlined-vars): Likewise.
* doc/invoke.texi: Remove the ignored options above.
* gimple-expr.h (gimple_can_coalesce_p): Move declaration
* tree-ssa-coalesce.h: ... here.
* tree-ssa-uncprop.c: Include tree-ssa-coalesce.h and other
headers required by it.
* gimple-expr.c (gimple_can_coalesce_p): Allow coalescing
across variables when flag_tree_coalesce_vars. Check register
use and promoted modes to allow coalescing. Do not coalesce
maybe-byref parms with SSA_NAMEs of other variables, or
anonymous SSA_NAMEs. Moved to tree-ssa-coalesce.c.
* tree-ssa-live.c (struct tree_int_map_hasher): Move along
with its member functions to tree-ssa-coalesce.c.
(var_map_base_init): Likewise. Renamed to
compute_samebase_partition_bases.
(partition_view_normal): Drop want_bases parameter.
(partition_view_bitmap): Likewise.
* tree-ssa-live.h: Adjust declarations.
* tree-ssa-coalesce.c: Include explow.h and cfgexpand.h.
(build_ssa_conflict_graph): Process PARM_ and RESULT_DECLs's
default defs at the entry point.
(dump_part_var_map): New.
(compute_optimized_partition_bases): New, called by...
(coalesce_ssa_name): ... when flag_tree_coalesce_vars, instead
of compute_samebase_partition_bases. Adjust.
* alias.c (nonoverlapping_memrefs_p): Disregard gimple-regs.
* cfgexpand.c (leader_merge, parm_maybe_byref_p): New.
(ssa_default_def_partition): New.
(get_rtl_for_parm_ssa_default_def): New.
(align_local_variable, add_stack_var): Support anonymous SSA
names.
(defer_stack_allocation): Likewise. Declare earlier.
(set_rtl): Merge exprs and attrs, even for MEMs and non-SSA
vars. Update DECL_RTL for PARM_DECLs and RESULT_DECLs too.
Do not record deferred-allocation marker in
SA.partition_to_pseudo.
(expand_stack_vars): Adjust check for the marker in it.
(expand_one_stack_var_at): Handle anonymous SSA_NAMEs. Drop
redundant MEM attr setting.
(expand_one_stack_var_1): Handle anonymous SSA_NAMEs. Renamed
from...
(expand_one_stack_var): ... this. New wrapper to check and
skip already expanded SSA partitions.
(record_alignment_for_reg_var): New, factored out of...
(expand_one_var): ... this.
(expand_one_ssa_partition): New.
(adjust_one_expanded_partition_var): New.
(expand_one_register_var): Check and skip already expanded SSA
partitions.
(expand_used_vars): Don't create DECLs for anonymous SSA
names. Expand all SSA partitions, then adjust all SSA names.
(pass::execute): Replace the loops that set
SA.partition_to_pseudo from partition leaders and cleared
DECL_RTL for multi-location variables, and that which used to
rename vars and set attrs, with one that clears DECL_RTL and
checks that PARMs and RESULTs default_defs match DECL_RTL.
* cfgexpand.h (get_rtl_for_parm_ssa_default_def): Declare.
* emit-rtl.c: Include stor-layout.h.
(set_reg_attrs_for_parm): Handle NULL decl.
(set_reg_attrs_for_decl_rtl): Take mode from expression if
it's not a DECL.
* stmt.c (emit_case_decision_tree): Pass it the SSA_NAME
rather than its possibly-NULL DECL.
* explow.c (promote_ssa_mode): New.
* explow.h (promote_ssa_mode): Declare.
* expr.c (expand_expr_real_1): Handle anonymous SSA_NAMEs.
(read_complex_part): Export.
* expr.h (read_complex_part): Declare.
* cfgexpand.h (parm_maybe_byref_p): Declare.
* function.c: Include cfgexpand.h.
(use_register_for_decl): Handle SSA_NAMEs, anonymous or not.
(use_register_for_parm_decl): Wrapper for the above to
special-case the result_ptr.
(rtl_for_parm): Ditto for get_rtl_for_parm_ssa_default_def.
(split_complex_args): Take assign_parm_data_all argument.
Pass it to rtl_for_parm. Set up rtl and context for split
args. Reset complex parm before fetching its default decl
rtl.
(assign_parms_unsplit_complex): Use the default-def complex
parm rtl if it matches the components.
(assign_parms_augmented_arg_list): Adjust.
(maybe_reset_rtl_for_parm): Reset DECL_RTL of parms with
multiple locations. Recognize split complex args.
(assign_parm_adjust_stack_rtl): Add all and parm arguments,
for rtl_for_parm. For SSA-assigned parms, zero stack_parm.
(assign_parm_setup_block): Prefer SSA-assigned location, and
fill in its address if the memory location of a maybe-byref
parm was not assigned by cfgexpand.
(assign_parm_setup_reg): Likewise. Adjust its mode as
needed. Use entry_parm for equiv if stack_parm is NULL. Make
sure passed_pointer parms don't need conversion. Copy address
or value as needed.
(assign_parm_setup_stack): Prefer SSA-assigned location.
(assign_parms): Maybe reset DECL_RTL of params. Adjust stack
rtl before testing for pointer bounds. Special-case result_ptr.
(expand_function_start): Maybe reset DECL_RTL of result.
Prefer SSA-assigned location for result and static chain.
Factor out DECL_RESULT and SET_DECL_RTL. Convert static chain
to Pmode if needed, from H.J. Lu <hongjiu.lu@intel.com>.
* tree-outof-ssa.c (insert_value_copy_on_edge): Handle
anonymous SSA names. Use promote_ssa_mode.
(get_temp_reg): Likewise.
(remove_ssa_form): Adjust.
* stor-layout.c (layout_decl): Don't set mem attributes of
non-MEMs.
* var-tracking.c (dataflow_set_clear_at_call): Take call_insn
and get its reg_usage for reg invalidation.
(compute_bb_dataflow): Pass it insn.
(emit_notes_in_bb): Likewise.
for gcc/testsuite/ChangeLog
* gcc.dg/guality/pr54200.c: Add -fno-tree-coalesce-vars.
* gcc.dg/ssp-1.c: Make counter a register.
* gcc.dg/ssp-2.c: Likewise.
* gcc.dg/torture/parm-coalesce.c: New.
From-SVN: r226901
PR middle-end/64744
PR middle-end/48470
PR middle-end/43404
* cfgexpand.c (expand_one_var): Add check if stack is going to
be used in naked function.
* expr.c (expand_expr_addr_expr_1): Remove excess checking
whether expression should not reside in MEM.
* function.c (use_register_for_decl): Do not use registers for
non-register things (volatile, float, BLKmode) in naked functions.
PR middle-end/64744
PR middle-end/48470
PR middle-end/43404
* gcc.target/arm/pr43404.c: New testcase.
* gcc.target/arm/pr48470.c: New testcase.
* gcc.target/arm/pr64744-1.c: New testcase.
* gcc.target/arm/pr64744-2.c: New testcase.
From-SVN: r226528
for gcc/ChangeLog
PR rtl-optimization/64164
* Makefile.in (OBJS): Drop tree-ssa-copyrename.o.
* tree-ssa-copyrename.c: Removed.
* opts.c (default_options_table): Drop -ftree-copyrename. Add
-ftree-coalesce-vars.
* passes.def: Drop all occurrences of pass_rename_ssa_copies.
* common.opt (ftree-copyrename): Ignore.
(ftree-coalesce-inlined-vars): Likewise.
* doc/invoke.texi: Remove the ignored options above.
* gimple-expr.h (gimple_can_coalesce_p): Move declaration...
* tree-ssa-coalesce.h: ... here.
* tree-ssa-uncprop.c: Include tree-ssa-coalesce.h and other
headers required by it.
* gimple-expr.c (gimple_can_coalesce_p): Allow coalescing
across variables when flag_tree_coalesce_vars. Check register
use and promoted modes to allow coalescing. Moved to
tree-ssa-coalesce.c.
* tree-ssa-live.c (struct tree_int_map_hasher): Move along
with its member functions to tree-ssa-coalesce.c.
(var_map_base_init): Likewise. Renamed to
compute_samebase_partition_bases.
(partition_view_normal): Drop want_bases parameter.
(partition_view_bitmap): Likewise.
* tree-ssa-live.h: Adjust declarations.
* tree-ssa-coalesce.c: Include explow.h.
(build_ssa_conflict_graph): Process PARM_ and RESULT_DECLs'
default defs at the entry point.
(dump_part_var_map): New.
(compute_optimized_partition_bases): New, called by...
(coalesce_ssa_name): ... when flag_tree_coalesce_vars, instead
of compute_samebase_partition_bases. Adjust.
* alias.c (nonoverlapping_memrefs_p): Disregard gimple-regs.
* cfgexpand.c (leader_merge): New.
(get_rtl_for_parm_ssa_default_def): New.
(set_rtl): Merge exprs and attrs, even for MEMs and non-SSA
vars. Update DECL_RTL for PARM_DECLs and RESULT_DECLs too.
(expand_one_stack_var_at): Handle anonymous SSA_NAMEs. Drop
redundant MEM attr setting.
(expand_one_stack_var_1): Handle anonymous SSA_NAMEs. Renamed
from...
(expand_one_stack_var): ... this. New wrapper to check and
skip already expanded SSA partitions.
(record_alignment_for_reg_var): New, factored out of...
(expand_one_var): ... this.
(expand_one_ssa_partition): New.
(adjust_one_expanded_partition_var): New.
(expand_one_register_var): Check and skip already expanded SSA
partitions.
(expand_used_vars): Don't create DECLs for anonymous SSA
names. Expand all SSA partitions, then adjust all SSA names.
(pass::execute): Replace the loops that set
SA.partition_to_pseudo from partition leaders and cleared
DECL_RTL for multi-location variables, and that which used to
rename vars and set attrs, with one that clears DECL_RTL and
checks that PARMs and RESULTs default_defs match DECL_RTL.
* cfgexpand.h (get_rtl_for_parm_ssa_default_def): Declare.
* emit-rtl.c (set_reg_attrs_for_parm): Handle NULL decl.
* explow.c (promote_ssa_mode): New.
* explow.h (promote_ssa_mode): Declare.
* expr.c (expand_expr_real_1): Handle anonymous SSA_NAMEs.
* function.c: Include cfgexpand.h.
(use_register_for_decl): Handle SSA_NAMEs, anonymous or not.
(use_register_for_parm_decl): Wrapper for the above to
special-case the result_ptr.
(rtl_for_parm): Ditto for get_rtl_for_parm_ssa_default_def.
(split_complex_args): Take assign_parm_data_all argument.
Pass it to rtl_for_parm. Set up rtl and context for split
args.
(assign_parms_augmented_arg_list): Adjust.
(maybe_reset_rtl_for_parm): Reset DECL_RTL of parms with
multiple locations. Recognize split complex args.
(assign_parm_adjust_stack_rtl): Add all and parm arguments,
for rtl_for_parm. For SSA-assigned parms, zero stack_parm.
(assign_parm_setup_block): Prefer SSA-assigned location.
(assign_parm_setup_reg): Likewise. Use entry_parm for equiv
if stack_parm is NULL.
(assign_parm_setup_stack): Prefer SSA-assigned location.
(assign_parms): Maybe reset DECL_RTL of params. Adjust stack
rtl before testing for pointer bounds. Special-case result_ptr.
(expand_function_start): Maybe reset DECL_RTL of result.
Prefer SSA-assigned location for result and static chain.
Factor out DECL_RESULT and SET_DECL_RTL.
* tree-outof-ssa.c (insert_value_copy_on_edge): Handle
anonymous SSA names. Use promote_ssa_mode.
(get_temp_reg): Likewise.
(remove_ssa_form): Adjust.
* stor-layout.c (layout_decl): Don't set mem attributes of
non-MEMs.
* var-tracking.c (dataflow_set_clear_at_call): Take call_insn
and get its reg_usage for reg invalidation.
(compute_bb_dataflow): Pass it insn.
(emit_notes_in_bb): Likewise.
for gcc/testsuite/ChangeLog
* gcc.dg/guality/pr54200.c: Add -fno-tree-coalesce-vars.
* gcc.dg/ssp-1.c: Make counter a register.
* gcc.dg/ssp-2.c: Likewise.
* gcc.dg/torture/parm-coalesce.c: New.
From-SVN: r226113
2015-07-08 Kito Cheng <kito.cheng@gmail.com>
* function.c (stack_protect_epilogue): Use if rather than switch to
check targetm.have_stack_protect_test.
From-SVN: r225599