2019-10-17 Yuliang Wang <yuliang.wang@arm.com>
gcc/
* config/aarch64/aarch64-sve2.md (aarch64_sve2_eor3<mode>)
(aarch64_sve2_nor<mode>, aarch64_sve2_nand<mode>)
(aarch64_sve2_bsl<mode>, aarch64_sve2_nbsl<mode>)
(aarch64_sve2_bsl1n<mode>, aarch64_sve2_bsl2n<mode>):
New combine patterns.
* config/aarch64/iterators.md (BSL_DUP): New int iterator for the
above.
(bsl_1st, bsl_2nd, bsl_dup, bsl_mov): Attributes for the above.
gcc/testsuite/
* gcc.target/aarch64/sve2/eor3_1.c: New test.
* gcc.target/aarch64/sve2/nlogic_1.c: As above.
* gcc.target/aarch64/sve2/nlogic_2.c: As above.
* gcc.target/aarch64/sve2/bitsel_1.c: As above.
* gcc.target/aarch64/sve2/bitsel_2.c: As above.
* gcc.target/aarch64/sve2/bitsel_3.c: As above.
* gcc.target/aarch64/sve2/bitsel_4.c: As above.
From-SVN: r277110
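As a rough guide to what the new patterns compute, the following is a
minimal scalar sketch of the bitwise operations named in the entry above.
All function names are hypothetical and chosen for this illustration only.
The operand roles for the BSL variants follow the usual reading of the
instruction names and should be treated as an assumption rather than a
statement of the exact RTL patterns.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar reference semantics.  All names here are hypothetical.  */

/* EOR3: three-way exclusive OR.  */
uint64_t ref_eor3 (uint64_t a, uint64_t b, uint64_t c)
{
  return a ^ b ^ c;
}

uint64_t ref_nor (uint64_t a, uint64_t b) { return ~(a | b); }
uint64_t ref_nand (uint64_t a, uint64_t b) { return ~(a & b); }

/* BSL: take the bits of A where MASK is set and the bits of B elsewhere.  */
uint64_t ref_bsl (uint64_t a, uint64_t b, uint64_t mask)
{
  return (a & mask) | (b & ~mask);
}

/* NBSL: inverted result of BSL.  */
uint64_t ref_nbsl (uint64_t a, uint64_t b, uint64_t mask)
{
  return ~ref_bsl (a, b, mask);
}

/* BSL1N: BSL with the first input inverted (assumed operand roles).  */
uint64_t ref_bsl1n (uint64_t a, uint64_t b, uint64_t mask)
{
  return ref_bsl (~a, b, mask);
}

/* BSL2N: BSL with the second input inverted (assumed operand roles).  */
uint64_t ref_bsl2n (uint64_t a, uint64_t b, uint64_t mask)
{
  return ref_bsl (a, ~b, mask);
}

/* A loop of the general shape a vectorizer could reduce to one
   three-way XOR per vector.  Whether a given compiler does so is
   not guaranteed by this sketch.  */
void eor3_loop (uint64_t *restrict dst, const uint64_t *a,
                const uint64_t *b, const uint64_t *c, int n)
{
  for (int i = 0; i < n; i++)
    dst[i] = ref_eor3 (a[i], b[i], c[i]);
}
```

Note that NAND and NOR fall out of the inverted select when an input is
reused as the mask: ref_nbsl (a, b, b) equals ref_nand (a, b), and
ref_nbsl (a, b, a) equals ref_nor (a, b).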
gcc/ChangeLog:
2019-10-17 Andre Vieira <andre.simoesdiasvieira@arm.com>
* tree-vect-loop.c (vect_analyze_loop_2): Use same condition to decide
when to use versioning threshold.
From-SVN: r277105
gcc/ChangeLog:
2019-10-17 Andre Vieira <andre.simoesdiasvieira@arm.com>
* tree-vect-loop.c (determine_peel_for_niter): New function containing
code outlined from ...
(vect_analyze_loop_2): ... here.
From-SVN: r277103
https://gcc.gnu.org/ml/gcc-patches/2019-10/msg01283.html
* decl.c (builtin_function_1): Merge into ...
(cxx_builtin_function): ... here. Nadger the decl before maybe
copying it. Set the context.
(cxx_builtin_function_ext_scope): Push to top level, then call
cxx_builtin_function.
From-SVN: r277102
2019-10-17 Richard Biener <rguenther@suse.de>
PR tree-optimization/92129
* tree-vect-loop.c (vectorizable_reduction): Also fail
on GIMPLE_SINGLE_RHS.
From-SVN: r277094
PR tree-optimization/92056
* tree-object-size.c (cond_expr_object_size): Return early if then_
processing resulted in unknown size.
* gcc.c-torture/compile/pr92056.c: New test.
From-SVN: r277093
PR tree-optimization/92115
* tree-ssa-ifcombine.c (ifcombine_ifandif): Force condition into
temporary if it could trap.
* gcc.dg/pr92115.c: New test.
From-SVN: r277092
2019-10-17 Richard Biener <rguenther@suse.de>
PR debug/91887
* dwarf2out.c (gen_formal_parameter_die): Also try to match
context_die against a DW_TAG_GNU_formal_parameter_pack parent.
* g++.dg/debug/dwarf2/pr91887.C: New testcase.
From-SVN: r277090
I've found this stale reference while looking at cp-gimplify.c. tree-gimple.c
no longer exists and its contents were merged into gimple.c.
Seems obvious enough.
gcc/cp/ChangeLog:
2019-10-16 Luis Machado <luis.machado@linaro.org>
* cp-gimplify.c: Fix reference to non-existing tree-gimple.c file.
From-SVN: r277089
This finishes part 1 of the 2-part patch submitted by Andrew Burgess on Aug 19.
This adds the argument registers but not t0 (aka x5) to SIBCALL_REGS. It
also adds the missing riscv_regno_to_class change.
Tested with cross riscv32-elf and riscv64-linux toolchain build and check.
There were no regressions. I see about a 0.01% code size reduction for the
C and libstdc++ libraries.
gcc/
* config/riscv/riscv.h (REG_CLASS_CONTENTS): Add argument passing
regs to SIBCALL_REGS.
* config/riscv/riscv.c (riscv_regno_to_class): Change argument
passing regs to SIBCALL_REGS.
Co-Authored-By: Jim Wilson <jimw@sifive.com>
From-SVN: r277082
The Arm port is failing bootstrap because GCC is now warning about an
uninitialized array.
The code is complex enough that I certainly can't be sure the compiler
is wrong, so perhaps the best fix here is just to memset the entire
array before use.
* config/arm/arm.c (neon_valid_immediate): Clear bytes before use.
From-SVN: r277073
* config/mips/mips.c (mips_expand_builtin_insn): Force the
operands which correspond to the same input-output register to
have the same pseudo assigned to them.
* gcc.target/mips/msa-dpadd-dpsub.c: New test.
From-SVN: r277071
In aarch64_classify_symbol symbols are allowed large offsets on relocations.
This means the offset can use all of the +/-4GB offset, leaving no offset
available for the symbol itself. This results in relocation overflow and
link-time errors for simple expressions like &global_array + 0xffffff00.
To avoid this, unless the offset_within_block_p is true, limit the offset
to +/-1MB so that the symbol needs to be within a 3.9GB offset from its
references. For the tiny code model use a 64KB offset, allowing most of
the 1MB range for code/data between the symbol and its references.
gcc/
* config/aarch64/aarch64.c (aarch64_classify_symbol):
Apply reasonable limit to symbol offsets.
gcc/testsuite/
* gcc.target/aarch64/symbol-range.c: Improve testcase.
* gcc.target/aarch64/symbol-range-tiny.c: Likewise.
From-SVN: r277068
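The offset rule described for aarch64_classify_symbol above can be modelled
as follows. This is a minimal sketch, not the code in aarch64.c: the enum,
the function name and the within_block parameter (standing in for the
offset_within_block_p check) are hypothetical, and treating the limits as
inclusive is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the two code models discussed above.  */
enum code_model { MODEL_TINY, MODEL_SMALL };

/* Return true if OFFSET may be folded into the symbol relocation.
   WITHIN_BLOCK models the case where the offset is known to stay
   inside the object the symbol refers to.  */
bool offset_allowed (enum code_model model, int64_t offset, bool within_block)
{
  if (within_block)
    return true;

  /* 64KB for the tiny code model, 1MB otherwise.  */
  int64_t limit = (model == MODEL_TINY) ? 0x10000 : 0x100000;
  return offset >= -limit && offset <= limit;
}
```

Under this model the offset in &global_array + 0xffffff00 is rejected
unless it is known to lie within the array, while small offsets such as
4096 are accepted in both code models.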
2019-10-16 Richard Biener <rguenther@suse.de>
* tree-vect-loop.c (vect_valid_reduction_input_p): Remove.
(vect_is_simple_reduction): Delay checking to
vectorizable_reduction and relax the checking.
(vectorizable_reduction): Check we have a simple use. Check
for bogus condition reductions.
* tree-vect-stmts.c (vect_transform_stmt): Make sure we
are looking at the last stmt in a pattern sequence when
filling in backedge PHI values.
* gcc.dg/vect/vect-cond-reduc-3.c: New testcase.
* gcc.dg/vect/vect-cond-reduc-4.c: Likewise.
From-SVN: r277067
In PR70010, a function is marked with target(no-vsx) to disable VSX code
generation. To avoid VSX code generation, this function should not be
inlined into a VSX function. To fix the bug, when checking whether the
caller's ISA flags support the callee's ISA flags, we just need to add a
test that enforces that the caller's ISA flags match exactly the callee's
flags, for those flags that were explicitly set in the callee. If the
caller has no target attribute, the options from the command line are
used.
gcc/
2019-10-16 Peter Bergner <bergner@linux.ibm.com>
Jiufu Guo <guojiufu@linux.ibm.com>
PR target/70010
* config/rs6000/rs6000.c (rs6000_can_inline_p): Prohibit inlining if
the callee explicitly disables some isa_flags the caller is using.
gcc/testsuite/
2019-10-16 Peter Bergner <bergner@linux.ibm.com>
Jiufu Guo <guojiufu@linux.ibm.com>
PR target/70010
* gcc.target/powerpc/pr70010.c: New test.
* gcc.target/powerpc/pr70010-1.c: New test.
* gcc.target/powerpc/pr70010-2.c: New test.
* gcc.target/powerpc/pr70010-3.c: New test.
* gcc.target/powerpc/pr70010-4.c: New test.
Co-Authored-By: Jiufu Guo <guojiufu@linux.ibm.com>
From-SVN: r277065
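The inlining check described for PR70010 above can be sketched as two mask
comparisons. This is an illustration under stated assumptions: the flag
bits and the function name are hypothetical, not the real rs6000 masks or
the body of rs6000_can_inline_p.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical ISA flag bits for this sketch only.  */
#define FLAG_ALTIVEC (UINT64_C (1) << 0)
#define FLAG_VSX     (UINT64_C (1) << 1)

/* CALLER_ISA and CALLEE_ISA are the enabled flags of each function.
   CALLEE_EXPLICIT marks the flags the callee set explicitly, whether
   it turned them on or off.  */
bool can_inline (uint64_t caller_isa, uint64_t callee_isa,
                 uint64_t callee_explicit)
{
  /* Existing rule: the caller must enable everything the callee uses.  */
  bool caller_supports_callee = (caller_isa & callee_isa) == callee_isa;

  /* Added rule: for explicitly set callee flags, the caller must match
     exactly, so an explicit "off" in the callee blocks a caller that
     has the flag on.  */
  bool explicit_flags_match
    = (caller_isa & callee_explicit) == (callee_isa & callee_explicit);

  return caller_supports_callee && explicit_flags_match;
}
```

With these definitions, a caller built with VSX enabled cannot inline a
callee that explicitly disabled VSX, but a caller without VSX still can.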
This patch adds extra vector modes that represent a half, quarter or
eighth of what an SVE vector can hold. This is useful for describing
the memory vector involved in an extending load or truncating store.
It might also be useful in future for representing "unpacked" SVE
registers, i.e. registers that contain values in the low bits of a
wider containing element.
The new modes could have the same width as an Advanced SIMD mode for
certain -msve-vector-bits=N options, so we need to ensure that they
come later in the mode list and that Advanced SIMD modes always "win".
2019-10-16 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* genmodes.c (mode_data::order): New field.
(blank_mode): Update accordingly.
(VECTOR_MODES_WITH_PREFIX): Add an order parameter.
(make_vector_modes): Likewise.
(VECTOR_MODES): Update use accordingly.
(cmp_modes): Sort by the new order field ahead of sorting by size.
* config/aarch64/aarch64-modes.def (VNx2QI, VNx2HI, VNx2SI)
(VNx4QI, VNx4HI, VNx8QI): New partial vector modes.
* config/aarch64/aarch64.c (VEC_PARTIAL): New flag value.
(aarch64_classify_vector_mode): Handle the new partial modes.
(aarch64_vl_bytes): New function.
(aarch64_hard_regno_nregs): Use it instead of BYTES_PER_SVE_VECTOR
when counting the number of registers in an SVE mode.
(aarch64_class_max_nregs): Likewise.
(aarch64_hard_regno_mode_ok): Don't allow partial vectors
in registers yet.
(aarch64_classify_address): Treat partial vectors analogously
to full vectors.
(aarch64_print_address_internal): Consolidate the printing of
MUL VL addresses, using aarch64_vl_bytes as the number of
bytes represented by "VL".
(aarch64_vector_mode_supported_p): Reject partial vector modes.
From-SVN: r277062
I'd used known_lt when converting these conditions to poly_int,
but on reflection that was a bad choice. The code isn't just
doing a range check; it specifically needs constants that will
fit in a certain encoding.
2019-10-16 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* config/aarch64/aarch64.c (aarch64_layout_frame): Use is_constant
rather than known_lt when choosing frame layouts.
From-SVN: r277061
This patch adds an assert that all the individual *_adjust allocations
add up to the full frame size. With that safety net, it seemed slightly
clearer to use crtl->outgoing_args_size as the final adjustment where
appropriate, to match what's used in the comments.
This is a bit of overkill on its own, but I need to add more cases for SVE.
2019-10-16 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* config/aarch64/aarch64.c (aarch64_layout_frame): Assert
that all the adjustments add up to the full frame size.
Use crtl->outgoing_args_size directly as the final adjustment
where appropriate.
From-SVN: r277060
Using the full path "cfun->machine->frame" in aarch64_layout_frame
led to awkward formatting in some follow-on patches, so it seemed
worth using a local reference instead.
2019-10-16 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* config/aarch64/aarch64.c (aarch64_layout_frame): Use a local
"frame" reference instead of always referring directly to
"cfun->machine->frame".
From-SVN: r277059
Clang doesn't support __is_same_as but provides __is_same instead.
Restore the original implementation (pre r276891) when neither of those
built-ins is available.
* include/bits/c++config (_GLIBCXX_BUILTIN_IS_SAME_AS): Define to
one of __is_same_as or __is_same when available.
* include/std/concepts (__detail::__same_as): Use std::is_same_v.
* include/std/type_traits (is_same) [_GLIBCXX_BUILTIN_IS_SAME_AS]:
Use new macro instead of __is_same_as.
(is_same) [!_GLIBCXX_BUILTIN_IS_SAME_AS]: Restore partial
specialization.
(is_same_v) [_GLIBCXX_BUILTIN_IS_SAME_AS]: Use new macro.
(is_same_v) [!_GLIBCXX_BUILTIN_IS_SAME_AS]: Use std::is_same.
From-SVN: r277058
This patch makes value_range_base::set convert POLY_INT_CST bounds
into the worst-case INTEGER_CST bounds. The main case in which this
gives useful ranges is a lower bound of A + B * X becoming A when B >= 0.
E.g.:
[32 + 16X, 100] -> [32, 100]
[32 + 16X, 32 + 16X] -> [32, MAX]
But the same thing can be useful for the upper bound with negative
X coefficients.
2019-10-16 Richard Sandiford <richard.sandiford@arm.com>
gcc/
PR middle-end/92033
* poly-int.h (constant_lower_bound_with_limit): New function.
(constant_upper_bound_with_limit): Likewise.
* doc/poly-int.texi: Document them.
* tree-vrp.c (value_range_base::set): Convert POLY_INT_CST bounds
into the worst-case INTEGER_CST bounds.
From-SVN: r277056
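The bound conversion in the PR92033 entry above can be worked through with
a small model. This is a sketch under assumptions: the struct, the helper
names and the two-coefficient form c0 + c1 * X (with X >= 0 unknown at
compile time) are hypothetical stand-ins, not the poly-int.h interface.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical value of the form c0 + c1 * X, where X >= 0 is only
   known at run time.  */
struct poly2 { int64_t c0; int64_t c1; };

struct poly2 make_poly (int64_t c0, int64_t c1)
{
  struct poly2 p = { c0, c1 };
  return p;
}

/* Constant lower bound of P, falling back to LIMIT when no better
   constant is known.  C0 is a valid lower bound only if the value
   cannot decrease with X and is not already below LIMIT.  */
int64_t lower_bound_with_limit (struct poly2 p, int64_t limit)
{
  if (p.c1 >= 0 && p.c0 >= limit)
    return p.c0;
  return limit;
}

/* Constant upper bound of P, falling back to LIMIT.  C0 is a valid
   upper bound only if the value cannot increase with X.  */
int64_t upper_bound_with_limit (struct poly2 p, int64_t limit)
{
  if (p.c1 <= 0 && p.c0 <= limit)
    return p.c0;
  return limit;
}
```

Using INT64_MIN and INT64_MAX as the limits, the range [32 + 16X, 100]
becomes [32, 100], and [32 + 16X, 32 + 16X] becomes [32, INT64_MAX],
matching the two examples in the description.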
2019-10-16 Feng Xue <fxue@os.amperecomputing.com>
PR ipa/91088
* doc/invoke.texi (ipa-max-param-expr-ops): Document new option.
* params.def (PARAM_IPA_MAX_PARAM_EXPR_OPS): New.
* ipa-predicate.h (struct expr_eval_op): New struct.
(expr_eval_ops): New typedef.
(struct condition): Add type and param_ops fields, remove size field.
(add_condition): Replace size parameter with type parameter, add
param_ops parameter.
* ipa-predicate.c (expr_eval_ops_equal_p): New function.
(predicate::add_clause): Add comparisons on type and param_ops.
(dump_condition): Add debug dump for param_ops.
(remap_after_inlining): Adjust call arguments to add_condition.
(add_condition): Replace size parameter with type parameter, add
param_ops parameter. Unshare constant value used in conditions.
* ipa-fnsummary.c (evaluate_conditions_for_known_args): Fold
parameter expressions using param_ops.
(decompose_param_expr): New function.
(set_cond_stmt_execution_predicate): Use call to decompose_param_expr
to replace call to unmodified_parm_or_parm_agg_item.
(set_switch_stmt_execution_predicate): Likewise.
(will_be_nonconstant_expr_predicate): Likewise. Replace usage of size
with type.
(inline_read_section): Read param_ops from summary stream.
(ipa_fn_summary_write): Write param_ops to summary stream.
2019-10-16 Feng Xue <fxue@os.amperecomputing.com>
PR ipa/91088
* gcc.dg/ipa/pr91088.c: New test.
* gcc.dg/ipa/pr91089.c: Add sub-test for range analysis.
* g++.dg/tree-ssa/ivopts-3.C: Force a function to be noinline.
From-SVN: r277054
As PR92107 shows, genattrtab doesn't parenthesize expressions correctly
(or at all, even). This fixes it.
PR rtl-optimization/92107
* genattrtab.c (write_attr_value) <do_operator>: Parenthesize the
expression written.
From-SVN: r277023
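To illustrate why the missing parentheses in PR92107 matter, here is a
minimal sketch of a text emitter for binary expressions. It is not the
genattrtab code; both function names are hypothetical, and the example
expression (1 + 2) * 3 is chosen only for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write "LHS OP RHS" into BUF, wrapped in parentheses if PARENTHESIZE
   is nonzero.  */
void emit_binary (char *buf, size_t size, const char *lhs, char op,
                  const char *rhs, int parenthesize)
{
  if (parenthesize)
    snprintf (buf, size, "(%s %c %s)", lhs, op, rhs);
  else
    snprintf (buf, size, "%s %c %s", lhs, op, rhs);
}

/* Emit the nested expression (1 + 2) * 3 by emitting the sum first and
   then using its text as the left operand of the product.  Returns a
   pointer to a static buffer.  */
const char *emit_product_of_sum (int parenthesize)
{
  static char inner[32];
  static char outer[64];

  emit_binary (inner, sizeof inner, "1", '+', "2", parenthesize);
  emit_binary (outer, sizeof outer, inner, '*', "3", parenthesize);
  return outer;
}
```

Without parentheses the emitted text is "1 + 2 * 3", which C parses as
1 + (2 * 3). With them it is "((1 + 2) * 3)", which preserves the
structure of the original expression.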
* config/pa/fptr.c (_dl_read_access_allowed): Change argument to
unsigned int. Adjust callers.
(__canonicalize_funcptr_for_compare): Change plabel type to volatile
unsigned int *. Load relocation offset before function pointer.
Add barrier to ensure ordering.
From-SVN: r277015
2019-10-15 Andrew Pinski <apinski@marvell.com>
* gcc.c-torture/compile/20191015-1.c: New test.
* gcc.c-torture/compile/20191015-2.c: New test.
From-SVN: r277011
This updates the description of the support for fix and continue
debugging.
gcc/ChangeLog:
2019-10-15 Iain Sandoe <iain@sandoe.co.uk>
* config/darwin.c: Update description of fix and continue.
From-SVN: r277010
The use of default_binds_local_p had got out of sync with the varasm
changes; this restores the call to be direct. In practice, we add some
further tests to determine local binding - but this callback is used for
the initial assessments made by default_encode_section_info().
gcc/ChangeLog:
2019-10-15 Iain Sandoe <iain@sandoe.co.uk>
* config/darwin.c (darwin_binds_local_p): Update to call
default_binds_local_p_3 () directly. Amend comments.
From-SVN: r277009