PR middle-end/78507
PR middle-end/78510
PR middle-end/78517
* match.pd ((cond (cmp (convert1? @1) @3) (convert2? @1) @2)): Use
cmp directly, rather than cmp_code. Initialize code to ERROR_MARK
and set it to result code if transformation is valid. Use code EQ
directly in last simplification case.
gcc/testsuite
PR middle-end/78507
PR middle-end/78510
PR middle-end/78517
* g++.dg/torture/pr78507.C: New test.
* gcc.dg/torture/pr78510.c: New test.
* gcc.dg/torture/pr78517.c: New test.
From-SVN: r242874
2016-11-25 Thomas Preud'homme <thomas.preudhomme@arm.com>
gcc/
PR tree-optimization/77673
* tree-ssa-math-opts.c (struct symbolic_number): Add new src field.
(init_symbolic_number): Initialize src field from src parameter.
(perform_symbolic_merge): Select most dominated statement as the
source statement. Set src field of resulting n structure from the
input src with the lowest address.
(find_bswap_or_nop): Rename source_stmt into ins_stmt.
(bswap_replace): Rename src_stmt into ins_stmt. Initially get source
of load from src field rather than insertion statement. Cancel
optimization if statement analyzed is not dominated by the insertion
statement.
(pass_optimize_bswap::execute): Rename src_stmt to ins_stmt. Compute
dominance information.
gcc/testsuite/
PR tree-optimization/77673
* gcc.dg/pr77673.c: New test.
From-SVN: r242869
"unpredictable" for EXCESS_PRECISION_TYPE_STANDARD
gcc/
PR target/78509
* config/i386/i386.c (i386_excess_precision): Do not return
FLT_EVAL_METHOD_UNPREDICTABLE when "type" is
EXCESS_PRECISION_TYPE_STANDARD.
* target.def (excess_precision): Document that targets should
not return FLT_EVAL_METHOD_UNPREDICTABLE when "type" is
EXCESS_PRECISION_TYPE_STANDARD or EXCESS_PRECISION_TYPE_FAST.
Fix typo in first sentence.
* doc/tm.texi: Regenerate.
From-SVN: r242866
2016-11-25 Richard Biener <rguenther@suse.de>
PR tree-optimization/78396
* tree-vectorizer.c (vectorize_loops): When the if-converted
body contains masked loads or stores do not attempt to
basic-block-vectorize it.
From-SVN: r242865
The previous code processed the users of a stack slot in order of
decreasing size and allocated the slot based on the first user.
This seems a bit dangerous, since the ordering is based on the
mode of the biggest reference while the allocation is based also
on the size of the register itself (which I think could be larger).
That scheme doesn't scale well to polynomial sizes, since there's
no guarantee that the order of the sizes is known at compile time.
This patch instead records an upper bound on the size required
by all users of a slot. It also records the maximum alignment
requirement.
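The bookkeeping described above can be sketched in plain C. This is only an illustration, not the lra-spills.c code: the field names align and size come from the ChangeLog entry, while struct slot_info, slot_init and slot_add_user are hypothetical names, and the units (bytes and bits) are assumptions.

```c
/* Hypothetical per-slot record.  The "align" and "size" names follow
   the ChangeLog; everything else here is an assumption.  */
struct slot_info
{
  unsigned int align;  /* Largest alignment any user needs (assumed bits).  */
  unsigned int size;   /* Upper bound on any user's size (assumed bytes).  */
};

/* Start with no requirements recorded.  */
static void
slot_init (struct slot_info *slot)
{
  slot->align = 0;
  slot->size = 0;
}

/* Fold one more user into the slot.  The order in which users are
   added no longer matters, because only the maxima are kept.  */
static void
slot_add_user (struct slot_info *slot, unsigned int size,
               unsigned int align)
{
  if (size > slot->size)
    slot->size = size;
  if (align > slot->align)
    slot->align = align;
}
```

Adding users of (size 4, align 32), (size 16, align 64) and (size 8, align 128) in any order leaves the slot with size 16 and align 128, so the allocation no longer depends on sorting the users by size.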
gcc/
2016-11-15 Richard Sandiford <richard.sandiford@arm.com>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
* function.h (spill_slot_alignment): Declare.
* function.c (spill_slot_alignment): New function.
* lra-spills.c (slot): Add align and size fields.
(assign_mem_slot): Use them in the call to assign_stack_local.
(add_pseudo_to_slot): Update the fields.
(assign_stack_slot_num_and_sort_pseudos): Initialise the fields.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r242863
Previously decimal floating-point types were created and laid
out as binary floating-point types, then the caller changed
the mode to a decimal mode later. The problem with that
approach is that not all targets support an equivalent binary
floating-point mode. When they didn't, we would give the
type BLKmode and lay it out as a zero-sized type.
This probably had no effect in practice. If a target doesn't
support a binary mode then it's unlikely to support the decimal
equivalent either. However, with the stricter mode checking
added by later patches, we would assert if a scalar floating-
point type didn't have a scalar floating-point mode.
gcc/
2016-11-16 Richard Sandiford <richard.sandiford@arm.com>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
* stor-layout.c (layout_type): Allow the caller to set the mode of
a float type. Only choose one here if the mode is still VOIDmode.
* tree.c (build_common_tree_nodes): Set the type mode of decimal
floats before calling layout_type.
* config/rs6000/rs6000.c (rs6000_init_builtins): Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r242862
This loop:
/* Make sure the tail invocation of this function does not refer
to local variables. */
FOR_EACH_LOCAL_DECL (cfun, idx, var)
{
if (TREE_CODE (var) != PARM_DECL
&& auto_var_in_fn_p (var, cfun->decl)
&& (ref_maybe_used_by_stmt_p (call, var)
|| call_may_clobber_ref_p (call, var)))
return;
}
triggered even for local variables that are passed by value.
This meant that we didn't allow local aggregates to be passed
to a sibling call but did (for example) allow global aggregates
to be passed.
I think the loop is really checking for indirect references,
so should be able to skip any variables that never have their
address taken.
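A minimal example of the kind of code affected, assuming nothing about the contents of the actual new test: struct pair, sum_pair and make_and_sum are hypothetical names. The local aggregate is passed by value and never has its address taken, so the call is the sort of candidate the description above is about.

```c
/* Hypothetical aggregate type used only for illustration.  */
struct pair
{
  int a;
  int b;
};

/* Hypothetical callee; noinline keeps the call from being inlined so
   that it remains a call in the caller.  */
__attribute__ ((noinline)) int
sum_pair (struct pair p)
{
  return p.a + p.b;
}

int
make_and_sum (int x)
{
  /* "local" is an automatic variable whose address is never taken;
     it is only passed by value.  */
  struct pair local = { x, x + 1 };

  /* The call is in tail position: its result is returned directly.  */
  return sum_pair (local);
}
```

Whether a sibling call is actually emitted here still depends on the target and optimization level.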
gcc/
* tree-tailcall.c (find_tail_calls): Allow calls to reference
local variables if all references are known to be direct.
gcc/testsuite/
* gcc.dg/tree-ssa/tailcall-8.c: New test.
From-SVN: r242860
The smaller int size for the avr target breaks the test's
expectation on the number of iterations. The failure goes
away if 32-bit ints are used in place of a plain int.
Fix by conditionally defining int32_t as a typedef of __INT32_TYPE__
for targets with int size < 4, and then using int32_t everywhere.
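A sketch of the conditional typedef idea, not the contents of pr64277.c. It relies on the GCC predefined macros __SIZEOF_INT__ and __INT32_TYPE__. The name test_int32 is hypothetical and stands in for the int32_t of the description, to avoid clashing with <stdint.h>. The #else branch and the count_iterations helper are assumptions added for illustration.

```c
/* Pick a type that is 32 bits wide even on targets with a 16-bit int.
   "test_int32" is a hypothetical stand-in for int32_t.  */
#if __SIZEOF_INT__ < 4
typedef __INT32_TYPE__ test_int32;
#else
typedef int test_int32;   /* Assumption: plain int is already wide enough.  */
#endif

/* Hypothetical loop whose trip count needs more than 16 bits of
   range: with a 16-bit int the bound 100000 would not fit.  */
static test_int32
count_iterations (test_int32 limit)
{
  test_int32 n = 0;
  for (test_int32 i = 0; i < limit; i += 1000)
    n++;
  return n;
}
```

With this typedef, count_iterations (100000) performs 100 iterations regardless of the width of plain int.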
gcc/testsuite
2016-11-25 Senthil Kumar Selvaraj <senthil_kumar.selvaraj@atmel.com>
* gcc.dg/pr64277.c: Use __INT32_TYPE__ for targets
with sizeof(int) < 4.
From-SVN: r242859
2016-11-25 Jakub Jelinek <jakub@redhat.com>
Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org>
PR middle-end/78501
* tree-vrp.c (extract_range_basic): Check that ptrdiff_type_node is
non-null and that its precision matches the precision of lhs's type.
Co-Authored-By: Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org>
From-SVN: r242858
2016-11-24 Vladimir Makarov <vmakarov@redhat.com>
PR rtl-optimization/77541
* lra-constraints.c (struct input_reload): Add field match_p.
(get_reload_reg): Check modes of input reloads to generate unique
value reload pseudo.
(match_reload): Add input reload pseudo for the current insn.
2016-11-24 Vladimir Makarov <vmakarov@redhat.com>
PR rtl-optimization/77541
* gcc.target/i386/pr77541.c: New.
From-SVN: r242848
* common/config/sparc/sparc-common.c (sparc_option_optimization_table):
Enable REE at -O2 and higher.
* config/sparc/sparc.c (sparc_option_override): Disable it by default
in 32-bit mode.
From-SVN: r242841
PR target/48863
PR inline-asm/70184
* tree-ssa-ter.c (temp_expr_table): Add reg_vars_cnt field.
(new_temp_expr_table): Initialise reg_vars_cnt.
(free_temp_expr_table): Release reg_vars_cnt.
(process_replaceable): Add reg_vars_cnt argument, set reg_vars_cnt
field of TAB.
(find_replaceable_in_bb): Use the above to record register variable
write occurrences and cancel replacement across them.
* gcc.target/arm/pr48863.c: New test.
From-SVN: r242840
PR rtl-optimization/78437
* ree.c (get_uses): New function.
(combine_reaching_defs): When a copy is needed, return false if any
reaching use of the source register reads it in a mode larger than
the mode it is set in and WORD_REGISTER_OPERATIONS is true.
From-SVN: r242839
2016-11-24 Richard Biener <rguenther@suse.de>
PR tree-optimization/71595
* cfgloopmanip.h (remove_path): Add irred_invalidated and
loop_closed_ssa_invalidated parameters, defaulted to NULL.
* cfgloopmanip.c (remove_path): Likewise, pass them along to
called functions. Only fix irred flags if the caller didn't
request state.
* tree-ssa-loop-ivcanon.c (unloop_loops): Use add_bb_to_loop.
(unloop_loops): Pass irred_invalidated and loop_closed_ssa_invalidated
to remove_path.
* gcc.dg/torture/pr71595.c: New testcase.
From-SVN: r242835
PR rtl-optimization/78120
* ifcvt.c (noce_conversion_profitable_p): Check original cost in all
cases, and additionally test against max_seq_cost for speed
optimization.
(noce_process_if_block): Compute an estimate for the original cost when
optimizing for speed, using the minimum of then and else block costs.
testsuite/
PR rtl-optimization/78120
* gcc.target/i386/pr78120.c: New test.
From-SVN: r242834
PR middle-end/78429
* tree.h (wi::fits_to_boolean_p): New predicate.
(wi::fits_to_tree_p): Use it for boolean types.
* tree.c (int_fits_type_p): Likewise.
From-SVN: r242829
* print-tree.c (struct bucket): Remove.
(print_node): Add new argument which drives whether a tree node
is printed briefly or not.
(debug_tree): Replace a custom hash table with hash_set<T>.
* print-tree.h (print_node): Add the argument.
From-SVN: r242820
gcc/
PR target/78458
* config/rs6000/rs6000.h (HARD_REGNO_CALLER_SAVE_MODE): Return MODE
if it is at least NREGS wide.
gcc/testsuite/
PR target/78458
* gcc.target/powerpc/pr78458.c: New.
From-SVN: r242818
Given my previous fix for a missing insn pattern for e500, building
glibc runs into an assembler error "Error: operand out of range (256
is not between 0 and 248)". This comes from an insn:
(insn 115 1209 1210 (set (reg:DF 27 27 [orig:294 _129 ] [294])
(subreg:DF (mem/c:TI (plus:SI (reg/f:SI 1 1)
(const_int 256 [0x100])) [14 %sfp+256 S16 A128]) 0)) 1909 {*frob_df_ti}
(nil))
This patch adjusts the offset handling for TImode - and TDmode and
PTImode in case such subregs can arise for them - to be the same as
for TFmode, so that proper SPE offset checks are made in the
TARGET_E500_DOUBLE case.
This allows the glibc build to complete. Testing shows 372 FAILs
across the gcc, g++ and libstdc++ testsuites; more cleanup is
certainly needed, but this gets to the point where the toolchain at
least builds so it's possible to compare test results when fixing
bugs.
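The offset check at issue can be sketched as follows. This is an illustration, not the rs6000.c code. The 0 to 248 range comes from the assembler message quoted above; the multiple-of-8 requirement, the treatment of a 16-byte access as two 8-byte halves, and the names spe_offset_ok and spe_offset_ok_16 are assumptions.

```c
#include <stdbool.h>

/* True if OFFSET is usable for a single 8-byte SPE access.  Assumed
   rule: a multiple of 8 in the range 0..248, i.e. only bits 3..7 may
   be set.  */
static bool
spe_offset_ok (long offset)
{
  return (offset & ~0xf8L) == 0;
}

/* True if a 16-byte value at OFFSET can be accessed as two 8-byte
   halves.  Both OFFSET and OFFSET + 8 must be valid (assumption).  */
static bool
spe_offset_ok_16 (long offset)
{
  return spe_offset_ok (offset) && spe_offset_ok (offset + 8);
}
```

Under these assumptions an offset of 248 is fine for one 8-byte access but not for a 16-byte value, because the second half would sit at 256, which is the out-of-range value the assembler reported.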
* config/rs6000/rs6000.c (rs6000_legitimate_offset_address_p): For
TARGET_E500_DOUBLE, handle TDmode, TImode and PTImode the same as
TFmode, IFmode and KFmode.
From-SVN: r242814
Building glibc for powerpc-linux-gnuspe --enable-e500-double, given
the patch <https://gcc.gnu.org/ml/gcc-patches/2016-11/msg02404.html>
applied, fails with errors such as:
../sysdeps/ieee754/ldbl-128ibm/s_modfl.c: In function '__modfl':
../sysdeps/ieee754/ldbl-128ibm/s_modfl.c:91:1: error: unrecognizable insn:
}
^
(insn 31 30 32 2 (set (reg:DF 203)
(subreg:DF (reg:TI 202) 8)) "../sysdeps/ieee754/ldbl-128ibm/s_modfl.c":44 -1
(nil))
../sysdeps/ieee754/ldbl-128ibm/s_modfl.c:91:1: internal compiler error: in extract_insn, at recog.c:2311
This patch adds an insn pattern similar to various patterns already
present to handle extracting such a subreg. This allows the glibc
build to get further, until it runs into an assembler error for which
I have another patch.
gcc:
* config/rs6000/spe.md (*frob_<SPE64:mode>_ti_8): New insn
pattern.
gcc/testsuite:
* gcc.c-torture/compile/20161123-1.c: New test.
From-SVN: r242813