2017-12-04 Richard Biener <rguenther@suse.de>
* tree-vectorizer.h (check_reduction_path): Declare.
* tree-vect-loop.c (check_reduction_path): New function, split out
from ...
(vect_is_simple_reduction): ... here.
* gimple-loop-interchange.cc: Include tree-vectorizer.h.
(loop_cand::analyze_iloop_reduction_var): Use single_imm_use.
Properly check for a supported reduction operation and a
valid expression if the reduction covers multiple stmts.
(prepare_perfect_loop_nest): Simpify allocation.
(pass_linterchange::execute): Likewise.
* gcc.dg/tree-ssa/loop-interchange-1.c: Add fast-math flags.
* gcc.dg/tree-ssa/loop-interchange-1b.c: New test variant.
* gcc.dg/tree-ssa/loop-interchange-4.c: XFAIL.
From-SVN: r255383
2017-11-28 Bin Cheng <bin.cheng@arm.com>
* Makefile.in (gimple-loop-interchange.o): New object file.
* common.opt (floop-interchange): Reuse the option from graphite.
* doc/invoke.texi (-floop-interchange): Ditto. New document.
* gimple-loop-interchange.cc: New file.
* params.def (PARAM_LOOP_INTERCHANGE_MAX_NUM_STMTS): New parameter.
(PARAM_LOOP_INTERCHANGE_STRIDE_RATIO): New parameter.
* passes.def (pass_linterchange): New pass.
* timevar.def (TV_LINTERCHANGE): New time var.
* tree-pass.h (make_pass_linterchange): New declaration.
* tree-ssa-loop-ivcanon.c (create_canonical_iv): Change to external
interchange. Record IV before/after increment in new parameters.
* tree-ssa-loop-ivopts.h (create_canonical_iv): New declaration.
gcc/testsuite
2017-11-28 Bin Cheng <bin.cheng@arm.com>
* gcc.dg/tree-ssa/loop-interchange-1.c: New test.
* gcc.dg/tree-ssa/loop-interchange-2.c: New test.
* gcc.dg/tree-ssa/loop-interchange-3.c: New test.
* gcc.dg/tree-ssa/loop-interchange-4.c: New test.
* gcc.dg/tree-ssa/loop-interchange-5.c: New test.
* gcc.dg/tree-ssa/loop-interchange-6.c: New test.
* gcc.dg/tree-ssa/loop-interchange-7.c: New test.
* gcc.dg/tree-ssa/loop-interchange-8.c: New test.
* gcc.dg/tree-ssa/loop-interchange-9.c: New test.
* gcc.dg/tree-ssa/loop-interchange-10.c: New test.
* gcc.dg/tree-ssa/loop-interchange-11.c: New test.
From-SVN: r255207
2017-11-28 Bin Cheng <bin.cheng@arm.com>
* tree-ssa-dce.c (simple_dce_from_worklist): Move and rename from
tree-ssa-pre.c::remove_dead_inserted_code.
* tree-ssa-dce.h: New file.
* tree-ssa-pre.c (tree-ssa-dce.h): Include new header file.
(remove_dead_inserted_code): Move and rename to function
tree-ssa-dce.c::simple_dce_from_worklist.
(pass_pre::execute): Update use.
From-SVN: r255206
PR c++/81675
* cp-gimplify.c (cp_fold) <case COND_EXPR>: Don't return immediately
for VOID_TYPE_P COND_EXPRs, instead fold the operands and if op0 is
INTEGER_CST, ensure that both op1 and op2 are non-NULL and fall
through into normal folding, otherwise just rebuild x if any op
changed.
* g++.dg/warn/pr81675.C: New test.
From-SVN: r255167
bootstrap-ubsan shows:
gcc/hash-map.h:277:19: runtime error: member access within null pointer of type 'struct hash_map'
Fix the issue by returning early.
From-SVN: r255166
According to the description of inssp instruction from Intel CET it
adusts the shadow stack pointer (ssp) only by value in the range of
[0..255]. As a number of adjustment could be greater than 255 there
should be a loop generated to adjust ssp.
gcc/
* config/i386/i386.md: Add a loop with incssp.
* testsuite/gcc.target/i386/cet-sjlj-1.c: Fix test.
* testsuite/gcc.target/i386/cet-sjlj-4.c: Likewise.
From-SVN: r255164
PR debug/81307
* dbxout.c (lastlineno): New variable.
(dbx_debug_hooks): Use dbxout_switch_text_section as
switch_text_section debug hook.
(dbxout_function_end): Switch to current_function_section
rather than function_section. If crtl->has_bb_partition,
output just one N_FUN, depending on in_cold_section_p.
(dbxout_source_line): Remember last lineno in lastlineno.
(dbxout_switch_text_section): New function.
(dbxout_function_decl): Adjust dbxout_block caller.
(dbx_block_with_cold_children): New function.
(dbxout_block): Return true if any LBRAC/RBRAC have been
emitted. Use dbx_block_with_cold_children at depth == 0
in second partition. Add PARENT_BLOCKNUM argument, pass
it optionally adjusted to children. Output LBRAC/RBRAC
around recursive call only if the block is in the current
partition, if not and anything was output, emit empty
range LBRAC/RBRAC.
* final.c (final_scan_insn): Compute cold_function_name
before calling switch_text_section debug hook. Call
that hook even if dwarf2out_do_frame if not emitting
dwarf debug info.
* g++.dg/debug/debug9.C: Remove -fno-reorder-blocks-and-partition
workaround.
From-SVN: r255161
PR rtl-optimization/81553
* combine.c (simplify_if_then_else): In (if_then_else COND (OP Z C1) Z)
to (OP Z (mult COND (C1 * STORE_FLAG_VALUE))) optimization, if OP
is a shift where C1 has different mode than the whole shift, use C1's
mode for MULT rather than the shift's mode.
* gcc.c-torture/compile/pr81553.c: New test.
From-SVN: r255150
PR target/82848
* config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Don't fold
builtins not enabled in the currently selected ISA.
* gcc.target/powerpc/pr82848.c: New test.
From-SVN: r255148
PR fortran/81304
* trans-openmp.c (gfc_trans_omp_array_reduction_or_udr): Set
attr.implicit_type in intrinsic_sym to avoid undesirable warning.
* testsuite/libgomp.fortran/pr81304.f90: New test.
From-SVN: r255144
This patch implements the some of the division optimizations discussed in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71026.
The division reciprocal optimization now handles divisions by squares:
x / (y * y) -> x * (1 / y) * (1 / y)
This requires at least one more division by y before it triggers - the
3 divisions of (1/ y) are then CSEd into a single division. Overall
this changes 1 division into 1 multiply, which is generally much faster.
2017-11-24 Jackson Woodruff <jackson.woodruff@arm.com>
gcc/
PR tree-optimization/71026
* tree-ssa-math-opts (is_division_by_square, is_square_of): New.
(insert_reciprocals): Change to insert reciprocals before a division
by a square and to insert the square of a reciprocal.
(execute_cse_reciprocals_1): Change to consider division by a square.
(register_division_in): Add importance parameter.
testsuite/
PR tree-optimization/71026
* gfortran.dg/extract_recip_1.f: New test.
* gcc.dg/extract_recip_3.c: New test.
* gcc.dg/extract_recip_4.c: New test.
From-SVN: r255141
* tree-object-size.c (pass_through_call): Use gimple_call_return_flags
ERF_RETURN*ARG* for builtins other than BUILT_IN_ASSUME_ALIGNED,
check for the latter with gimple_call_builtin_p. Do not handle
BUILT_IN_STPNCPY_CHK which is not a pass through call.
* gcc.dg/builtin-object-size-18.c: New test.
From-SVN: r255133