PR tree-optimization/68599
* loop-init.c (rtl_loop_init): Set LOOPS_HAVE_RECORDED_EXITS
in call to loop_optimizer_init.
* loop-iv.c (get_simple_loop_desc): Only allow unsafe loop
optimization to drop the assumptions/infinite notations if
the loop has a single exit.
From-SVN: r231231
As Bernd requested, this patch adds "This pattern cannot FAIL" to the
documentation of optabs that came to be mapped to internal functions.
For consistency I did the same for optabs that were already being
used for internal functions.
Many of the optabs weren't documented in the first place, so I added
entries for the missing ones. Also, there were some inaccuracies in
the documentation of the rounding optabs. The bitcount optabs said
that operand 0 has mode @var{m} and that operand 1 is under target
control, whereas it should be the other way around.
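As a source-level analogy for why the two operands need not share a mode
(an illustration only, not part of the patch): GCC's bit-counting builtins
take a 64-bit argument but return a plain int. The wrapper names below
are hypothetical.

```c
/* Hypothetical wrappers around GCC's bit-counting builtins.  The input
   is 64 bits wide while the result is a plain int, so the result does
   not have to be as wide as the value being examined.  */
int
count_set_bits64 (unsigned long long x)
{
  return __builtin_popcountll (x);
}

/* One plus the index of the least significant set bit of X, or 0 if
   X is 0.  */
int
first_set_bit64 (long long x)
{
  return __builtin_ffsll (x);
}
```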
Tested on x86_64-linux-gnu.
gcc/
* doc/md.texi (vec_load_lanes@var{m}@var{n}): Document that
the pattern cannot FAIL.
(vec_store_lanes@var{m}@var{n}): Likewise.
(maskload@var{m}@var{n}): Likewise.
(maskstore@var{m}@var{n}): Likewise. Fix a cut-&-paste error
in the name of the pattern.
(rsqrt@var{m}2): Document that mode m must be a scalar or vector
floating-point mode and that all operands have that mode.
(fmin@var{m}3, fmax@var{m}3): Likewise. Document that the
pattern cannot FAIL.
(sqrt@var{m}2): Document that mode m must be a scalar or vector
floating-point mode, that all operands have that mode, and that
the patterns cannot FAIL. Remove previous documentation referring
to @code{double} and @code{float}.
(fmod@var{m}3, remainder@var{m}3, cos@var{m}2, sin@var{m}2)
(sincos@var{m}3, log@var{m}2, pow@var{m}3, atan2@var{m}3)
(copysign@var{m}3): Likewise.
(exp@var{m}2): Likewise. Explicitly state the base.
(floor@var{m}2): As for sqrt@var{m}2, but also specify the operands.
(btrunc@var{m}2, rint@var{m}2): Likewise.
(round@var{m}2): Likewise. Fix incorrect description of rounding
effect.
(ceil@var{m}2): As for round@var{m}2.
(nearbyint@var{m}2): As for floor@var{m}2, but also mention that
the instruction must not raise an inexact condition.
(scalb@var{m}3): Document previously-undocumented pattern.
(ldexp@var{m}3, tan@var{m}2, asin@var{m}2, acos@var{m}2)
(atan@var{m}2, expm1@var{m}2, exp10@var{m}2, exp2@var{m}2)
(log1p@var{m}2, log10@var{m}2, log2@var{m}2, logb@var{m}2)
(significand@var{m}2): Likewise.
(ffs@var{m}2): Fix the description of the modes, so that operand 1 has
mode m and operand 0 is defined more freely. Document that @var{m}
can be a scalar or vector integer mode and that the pattern is not
allowed to FAIL.
(clz@var{m}2, ctz@var{m}2, popcount@var{m}2, parity@var{m}2): Likewise.
(clrsb@var{m}2): Likewise, except that the description of the
mode was missing in this case.
From-SVN: r231230
All current uses of builtin_reciprocal convert 1.0/sqrt into rsqrt.
This patch adds an rsqrt optab and associated internal function for
that instead. We can then pick up the vector forms of rsqrt automatically,
fixing an AArch64 regression from my internal_fn patches.
With that change, builtin_reciprocal only needs to handle target-specific
built-in functions. I've restricted the hook to those since, if we need
a reciprocal of another standard function later, I think there should be
a strong preference for adding a new optab and internal function for it,
rather than hiding the code in a backend.
Three targets implement builtin_reciprocal: aarch64, i386 and rs6000.
i386 and rs6000 already used the obvious rsqrt<mode>2 pattern names
for the instructions, so they pick up the new code automatically.
aarch64 needs a slight rename.
mn10300 is unusual in that its native operation is rsqrt, and
sqrt is approximated as 1.0/rsqrt. The port also uses rsqrt<mode>2
for the rsqrt pattern, so after the patch we now pick it up as a native
operation.
Two other ports define rsqrt patterns: sh and v850. AFAICT these
patterns aren't currently used, but I think the patch does what the
authors of the patterns would have expected. There's obviously some
risk of fallout though.
Tested on x86_64-linux-gnu, aarch64-linux-gnu, arm-linux-gnueabihf
(as a target without the hooks) and powerpc64-linux-gnu.
gcc/
* internal-fn.def (RSQRT): New function.
* optabs.def (rsqrt_optab): New optab.
* doc/md.texi (rsqrtM2): Document.
* target.def (builtin_reciprocal): Replace gcall argument with
a function decl. Restrict hook to machine functions.
* doc/tm.texi: Regenerate.
* targhooks.h (default_builtin_reciprocal): Update prototype.
* targhooks.c (default_builtin_reciprocal): Likewise.
* tree-ssa-math-opts.c: Include internal-fn.h.
(internal_fn_reciprocal): New function.
(pass_cse_reciprocals::execute): Call it, and build a call to an
internal function on success. Only call targetm.builtin_reciprocal
for machine functions.
* config/aarch64/aarch64-protos.h (aarch64_builtin_rsqrt): Remove
second argument.
* config/aarch64/aarch64-builtins.c (aarch64_expand_builtin_rsqrt):
Rename aarch64_rsqrt_<mode>2 to rsqrt<mode>2.
(aarch64_builtin_rsqrt): Remove md_fn argument and only handle
machine functions.
* config/aarch64/aarch64.c (use_rsqrt_p): New function.
(aarch64_builtin_reciprocal): Replace gcall argument with a
function decl. Use use_rsqrt_p. Remove optimize_size check.
Only handle machine functions. Update call to aarch64_builtin_rsqrt.
(aarch64_optab_supported_p): New function.
(TARGET_OPTAB_SUPPORTED_P): Define.
* config/aarch64/aarch64-simd.md (aarch64_rsqrt_<mode>2): Rename to...
(rsqrt<mode>2): ...this.
* config/i386/i386.c (use_rsqrt_p): New function.
(ix86_builtin_reciprocal): Replace gcall argument with a
function decl. Use use_rsqrt_p. Remove optimize_insn_for_size_p
check. Only handle machine functions.
(ix86_optab_supported_p): Handle rsqrt_optab.
* config/rs6000/rs6000.c (TARGET_OPTAB_SUPPORTED_P): Define.
(rs6000_builtin_reciprocal): Replace gcall argument with a
function decl. Remove optimize_insn_for_size_p check.
Only handle machine functions.
(rs6000_optab_supported_p): New function.
From-SVN: r231229
PR rtl-optimization/68624
* ifcvt.c (noce_try_cmove_arith): Check clobbers of temp regs in both
blocks if they exist and simplify the logic choosing the order to emit
them in.
* gcc.c-torture/execute/pr68624.c: New test.
From-SVN: r231226
2015-12-03 Richard Biener <rguenther@suse.de>
PR tree-optimization/66051
* tree-vect-slp.c (vect_build_slp_tree_1): Remove restriction
on load group size. Do not pass in vectorization_factor.
(vect_transform_slp_perm_load): Do not require any permute support.
(vect_build_slp_tree): Do not pass in vectorization factor.
(vect_analyze_slp_instance): Do not compute vectorization
factor estimate. Use vector size instead of vectorization factor
estimate to split store groups for BB vectorization.
* gcc.dg/vect/slp-42.c: New testcase.
From-SVN: r231225
2015-12-03 Richard Biener <rguenther@suse.de>
PR tree-optimization/68639
* tree-vect-data-refs.c (dr_group_sort_cmp): Split groups
belonging to different loops.
(vect_analyze_data_ref_accesses): Likewise.
* gfortran.fortran-torture/compile/pr68639.f90: New testcase.
From-SVN: r231220
* ipa-pure-const.c (ignore_edge): Rename to ...
(ignore_edge_for_nothrow) ... this one; also ignore edges to
interposable functions or ones that cannot throw.
(propagate_nothrow): Fix handling of availability.
From-SVN: r231218
PR preprocessor/57580
* c-ppoutput.c (print): Change printed field to bool.
Move src_file last for smaller padding.
(init_pp_output): Set print.printed to false instead of 0.
(scan_translation_unit): Fix up formatting. Set print.printed
to true after printing something other than newline.
(scan_translation_unit_trad): Set print.printed to true instead of 1.
(maybe_print_line_1): Set print.printed to false instead of 0.
(print_line_1): Likewise.
(do_line_change): Set print.printed to true instead of 1.
(cb_define, dump_queued_macros, cb_include, cb_def_pragma,
dump_macro): Set print.printed to false after printing newline.
* c-c++-common/cpp/pr57580.c: New test.
* c-c++-common/gomp/pr57580.c: New test.
From-SVN: r231213
From ISL's documentation, isl_ast_op_zdiv_r is equal to zero iff the remainder
on integer division is zero. Generate a modulo operation for it.
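A minimal sketch of that lowering in plain C, assuming a nonzero divisor.
The function name is hypothetical and this is not the code in
binary_op_to_tree; it only shows that a modulo is zero exactly when the
division leaves no remainder.

```c
#include <assert.h>

/* Hypothetical stand-in for the value generated for a zdiv_r node:
   zero iff NUM is exactly divisible by DEN.  */
long
zdiv_r_as_modulo (long num, long den)
{
  assert (den != 0);
  return num % den;
}
```

Callers are only expected to compare the result against zero, so the sign
that C gives the remainder for negative operands does not matter here.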
* graphite-isl-ast-to-gimple.c (binary_op_to_tree): Handle isl_ast_op_zdiv_r.
(gcc_expression_from_isl_expr_op): Same.
* gcc.dg/graphite/id-28.c: New.
Co-Authored-By: Sebastian Pop <s.pop@samsung.com>
From-SVN: r231212
On the testcase we used to generate code in the function entry bb_0,
and that choked the cfg verifier.
* graphite-isl-ast-to-gimple.c (copy_bb_and_scalar_dependences): Check
that insertion point is still in the region.
* gfortran.dg/graphite/id-26.f03: New.
Co-Authored-By: Sebastian Pop <s.pop@samsung.com>
From-SVN: r231211
In case ISL did some loop peeling, like this:
S_8(0);
for (int c1 = 1; c1 <= 5; c1 += 1) {
S_8(c1);
}
S_8(6);
we should not copy loop-phi nodes in S_8(0) or in S_8(6).
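To make the shape concrete, here is a small C sketch, assuming the unpeeled
loop ran c1 from 0 to 6. The tracing S_8 below is a hypothetical stand-in
for the real statement. Both versions visit the same iterations, but only
the middle five calls in the peeled version sit inside a loop, which is why
the first and last copies have no loop-phi nodes to copy.

```c
#include <assert.h>

#define TRIP_COUNT 7

/* Hypothetical stand-in for statement S_8: record the iteration value.  */
static void
S_8 (int *trace, int *len, int i)
{
  assert (*len < TRIP_COUNT);
  trace[(*len)++] = i;
}

/* The loop before peeling (assumed shape).  Returns the trace length.  */
int
run_original (int *trace)
{
  int len = 0;
  for (int c1 = 0; c1 <= 6; c1 += 1)
    S_8 (trace, &len, c1);
  return len;
}

/* The peeled form from the ISL AST: first and last iterations are
   straight-line code outside the loop.  */
int
run_peeled (int *trace)
{
  int len = 0;
  S_8 (trace, &len, 0);
  for (int c1 = 1; c1 <= 5; c1 += 1)
    S_8 (trace, &len, c1);
  S_8 (trace, &len, 6);
  return len;
}
```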
PR tree-optimization/68550
* graphite-isl-ast-to-gimple.c (copy_loop_phi_nodes): Add dump.
(copy_bb_and_scalar_dependences): Do not code generate loop peeled
statements.
* gfortran.dg/graphite/pr68550-1.f90: New.
* gfortran.dg/graphite/pr68550-2.f90: New.
Co-Authored-By: Sebastian Pop <s.pop@samsung.com>
From-SVN: r231206
* configure.ac: Check assembler support for R_PPC64_ENTRY relocation.
* configure: Regenerate.
* config.in: Regenerate.
* config/rs6000/rs6000.c (rs6000_global_entry_point_needed_p): New
function.
(rs6000_output_function_prologue): Use it instead of checking
cfun->machine->r2_setup_needed. Use internal labels instead of
GNU as local label extension. Handle ELFv2 large code model.
(rs6000_output_mi_thunk): Do not set cfun->machine->r2_setup_needed.
(rs6000_elf_declare_function_name): Handle ELFv2 large code model.
From-SVN: r231202
* cp-gimplify.c (cp_fold_maybe_rvalue, cp_fold_rvalue): New.
(c_fully_fold): Use cp_fold_rvalue.
(cp_fold): Use them for rvalue operands.
From-SVN: r231197
PR c/68162 reports a spurious warning about incompatible types
involving arrays of const double, constructed in one place using a
typedef for const double and in another place literally using const
double.
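A sketch of the kind of code involved, using hypothetical names; it is not
necessarily the reported reproducer or the new test. The array type is
built once through a typedef and once literally, and the two are compatible
in C, so no diagnostic is expected when one is passed where the other is
declared.

```c
#include <stddef.h>

/* Hypothetical typedef naming a qualified element type.  */
typedef const double cdouble;

/* Takes a pointer to an array of three const double, spelled literally.  */
double
sum3 (const double (*row)[3])
{
  double total = 0.0;
  for (size_t i = 0; i < 3; i++)
    total += (*row)[i];
  return total;
}

/* The same array type, constructed through the typedef.  */
static cdouble weights[3] = { 1.0, 2.0, 3.0 };

double
sum_weights (void)
{
  /* &weights has type cdouble (*)[3], which is compatible with
     const double (*)[3].  */
  return sum3 (&weights);
}
```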
The problem is that the array of the typedef was incorrectly
constructed without a TYPE_MAIN_VARIANT being an array of unqualified
elements as it should be (though it seems some more recent change
resulted in this producing incorrect diagnostics, likely the support
for C++-style handling of arrays of qualified type). This patch fixes
the logic in grokdeclarator to determine first_non_attr_kind, which is
used to determine whether it is necessary to use the TYPE_MAIN_VARIANT
of the type in the declaration specifiers.
However, fixing that logic introduces a failure of
gcc.dg/debug/dwarf2/pr47939-4.c, a test introduced along with
first_non_attr_kind. Thus, it is necessary to track the original
qualified typedef when qualifying an array type, to use it rather than
a newly-constructed type, to avoid regressing regarding typedef names
in debug info. This is done along lines I suggested in
<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=47939#c6>: track the
original type and the number of levels of array indirection at which
it appears, and, in possibly affected cases, pass extra arguments to
c_build_qualified_type (with default arguments to avoid needing to
pass those extra arguments explicitly everywhere). Given Richard's
recent fix to dwarf2out.c, this allows the C bug to be fixed without
causing debug information regressions.
Bootstrapped with no regressions on x86_64-pc-linux-gnu.
gcc/c:
PR c/68162
* c-decl.c (grokdeclarator): Set first_non_attr_kind before
following link from declarator to next declarator. Track original
qualified type and pass it to c_build_qualified_type.
* c-typeck.c (c_build_qualified_type): Add arguments
orig_qual_type and orig_qual_indirect.
gcc/c-family:
PR c/68162
* c-common.h (c_build_qualified_type): Add extra default
arguments.
gcc/cp:
PR c/68162
* tree.c (c_build_qualified_type): Add extra arguments.
gcc/testsuite:
PR c/68162
* gcc.dg/pr68162-1.c: New test.
From-SVN: r231194
While enabling graphite in -O3 we found a Fortran testcase that fails
because the max of the type domain is -1. We used to add that as a constraint
on the elements accessed by the array, leading to an infeasible constraint:
0 <= i <= -1. That constraint causes the data reference to be dropped, because
it says that no elements of the array are accessed.
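The feasibility condition can be sketched in plain C as follows. The helper
names are hypothetical and this is not the body of bounds_are_valid; it
only shows why a range such as 0 <= i <= -1 describes no elements at all.

```c
#include <stdbool.h>

/* Hypothetical helper: an index range [low, high] is feasible only if
   it contains at least one value.  */
bool
index_range_feasible_p (long low, long high)
{
  return low <= high;
}

/* Number of elements in [low, high], or 0 for an infeasible range.  */
unsigned long
index_range_size (long low, long high)
{
  if (!index_range_feasible_p (low, high))
    return 0;
  return (unsigned long) high - (unsigned long) low + 1;
}
```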
* graphite-dependences.c (scop_get_reads): Add extra dumps.
(scop_get_must_writes): Same.
(scop_get_may_writes): Same.
(compute_deps): Same.
* graphite-sese-to-poly.c (bounds_are_valid): New.
(pdr_add_data_dimensions): Call bounds_are_valid.
* gfortran.dg/graphite/run-id-3.f90: New.
Co-Authored-By: Sebastian Pop <s.pop@samsung.com>
From-SVN: r231191
PR c++/68290
* constraint.cc (make_constrained_auto): Move to...
* pt.c (make_auto_1): Add set_canonical parameter and set
TYPE_CANONICAL on the type only if it is true.
(make_decltype_auto): Adjust call to make_auto_1.
(make_auto): Likewise.
(splice_late_return_type): Likewise.
(make_constrained_auto): ...here. Call make_auto_1 instead of
make_auto and pass false. Set TYPE_CANONICAL directly.
From-SVN: r231189
gcc/ChangeLog:
* dwarf2out.c (dwarf2out_var_location): In addition to notes,
process indirect calls whose target is compile-time known.
Enhance pattern matching to get the SYMBOL_REF they embed.
(gen_subprogram_die): Handle such calls.
* final.c (final_scan_insn): For call instructions, invoke the
var_location debug hook only after the call has been emitted.
From-SVN: r231185
2015-12-02 Tom de Vries <tom@codesourcery.com>
* gimplify.c (enum gimplify_omp_var_data): Add enum value
GOVD_MAP_FORCE.
(oacc_default_clause): Fix default for scalars in oacc kernels.
(gimplify_adjust_omp_clauses_1): Handle GOVD_MAP_FORCE.
* c-c++-common/goacc/kernels-default-2.c: New test.
* c-c++-common/goacc/kernels-default.c: New test.
From-SVN: r231183
2015-12-02 Tom de Vries <tom@codesourcery.com>
* omp-low.c (install_var_field, scan_sharing_clauses): Add and handle
parameter base_pointers_restrict.
(omp_target_base_pointers_restrict_p): New function.
(scan_omp_target): Call scan_sharing_clauses with base_pointers_restrict
arg.
* c-c++-common/goacc/kernels-alias-2.c: New test.
* c-c++-common/goacc/kernels-alias-3.c: New test.
* c-c++-common/goacc/kernels-alias-4.c: New test.
* c-c++-common/goacc/kernels-alias-5.c: New test.
* c-c++-common/goacc/kernels-alias-6.c: New test.
* c-c++-common/goacc/kernels-alias-7.c: New test.
* c-c++-common/goacc/kernels-alias-8.c: New test.
* c-c++-common/goacc/kernels-alias.c: New test.
From-SVN: r231182
2015-12-02 Richard Biener <rguenther@suse.de>
* tree.h (tree_invariant_p): Declare.
* tree.c (tree_invariant_p): Export.
* genmatch.c (dt_simplify::gen_1): For GENERIC code-gen never
create SAVE_EXPRs but reject patterns if we would need to.
From-SVN: r231178