As part of the change to larger character lengths, the string copy
algorithm was temporarily pessimized to get around some spurious
-Wstringop-overflow warnings. Having tried a number of variations of
this algorithm, I have managed to get it down to one spurious warning
in the testsuite, which appears only with -O1. This patch reinstates
the optimized variant and modifies this one testcase to ignore the
warning.
Regtested on x86_64-pc-linux-gnu.
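For reference, Fortran character assignment copies the source into the
destination, truncating it when the source is longer and padding the
tail with blanks when it is shorter. The following is only a minimal C
sketch of that behaviour for single-byte characters, with memset doing
the padding as the ChangeLog entry for fill_with_spaces describes. The
helper name fortran_string_copy and its signature are hypothetical; this
is not the code gfortran actually emits.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: models Fortran CHARACTER assignment for
   single-byte characters.  Not the code emitted by gfortran.  */
static void
fortran_string_copy (char *dest, size_t dlen, const char *src, size_t slen)
{
  if (slen >= dlen)
    {
      /* Source is at least as long as the destination: copy a
         truncated prefix.  memmove tolerates overlapping buffers.  */
      memmove (dest, src, dlen);
    }
  else
    {
      /* Source is shorter: copy all of it, then blank-pad the tail
         with a single memset instead of a byte-by-byte loop.  */
      memmove (dest, src, slen);
      memset (dest + slen, ' ', dlen - slen);
    }
}
```

When both lengths are compile-time constants, each call has a constant
size, which is the situation in which a compiler can most easily expand
memmove and memset inline.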
gcc/fortran/ChangeLog:
2018-01-31 Janne Blomqvist <jb@gcc.gnu.org>
PR fortran/78534
* trans-expr.c (fill_with_spaces): Use memset instead of
generating loop.
(gfc_trans_string_copy): Improve opportunity to use builtins with
constant lengths.
gcc/testsuite/ChangeLog:
2018-01-31 Janne Blomqvist <jb@gcc.gnu.org>
PR fortran/78534
* gfortran.dg/allocate_deferred_char_scalar_1.f03: Prune
-Wstringop-overflow warnings due to spurious warning with -O1.
* gfortran.dg/char_cast_1.f90: Update dump scan pattern.
* gfortran.dg/transfer_intrinsic_1.f90: Likewise.
From-SVN: r257233
This test has been failing since forever; it has never passed AFAIK.
The PR details the vectoriser deficiency.
I propose we xfail this with a reference to the PR.
PR tree-optimization/64946
* gcc.target/aarch64/vect-abs-compile.c: XFAIL byte and half-word
scan-assembler checks.
From-SVN: r257225
PR rtl-optimization/84071
* combine.c (record_dead_and_set_regs_1): Record the source unmodified
for a paradoxical SUBREG on a WORD_REGISTER_OPERATIONS target.
From-SVN: r257224
The 'aux' variable attribute is used to directly access the auxiliary
register space from C.
gcc/
2018-01-31 Claudiu Zissulescu <claziss@synopsys.com>
* config/arc/arc.c (arc_handle_aux_attribute): New function.
(arc_attribute_table): Add 'aux' attribute.
(arc_in_small_data_p): Consider aux like variables.
(arc_is_aux_reg_p): New function.
(arc_asm_output_aligned_decl_local): Ignore 'aux' like variables.
(arc_get_aux_arg): New function.
(prepare_move_operands): Handle aux-register access.
* doc/extend.texi (ARC Variable attributes): Add subsection.
testsuite/
2018-01-31 Claudiu Zissulescu <claziss@synopsys.com>
* gcc.target/arc/taux-1.c: New test.
* gcc.target/arc/taux-2.c: Likewise.
From-SVN: r257223
The _Uncached type qualifier can be used to bypass the cache without
resorting to declaring variables as volatile.
gcc/
2018-01-31 Claudiu Zissulescu <claziss@synopsys.com>
* config/arc/arc-protos.h (arc_is_uncached_mem_p): Function proto.
* config/arc/arc.c (arc_handle_uncached_attribute): New function.
(arc_attribute_table): Add 'uncached' attribute.
(arc_print_operand): Print '.di' flag for uncached memory
accesses.
(arc_in_small_data_p): Do not consider for small data the uncached
types.
(arc_is_uncached_mem_p): New function.
* config/arc/predicates.md (compact_store_memory_operand): Check
for uncached memory accesses.
(nonvol_nonimm_operand): Likewise.
* gcc/doc/extend.texi (ARC Type Attribute): New subsection.
gcc/testsuite
2018-01-31 Claudiu Zissulescu <claziss@synopsys.com>
* gcc.target/arc/uncached.c: New test.
From-SVN: r257222
PR preprocessor/69869
* traditional.c (skip_macro_block_comment): Return bool, true if
the macro block comment is unterminated.
(copy_comment): Use return value from skip_macro_block_comment instead
of always false.
* gcc.dg/cpp/trad/pr69869.c: New test.
From-SVN: r257220
Function_type and Backend_function_type have different backend
representations, so they should not be identical. Otherwise it
confuses Type::type_btypes map.
Reviewed-on: https://go-review.googlesource.com/90975
From-SVN: r257216
PR debug/84131
* trans-array.c (gfc_get_descriptor_offsets_for_info): Set *data_off
to DATA_FIELD's offset rather than OFFSET_FIELD's offset.
From-SVN: r257212
2018-01-30 Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/84133
* frontend-passes.c (matmul_to_var_expr): Return early if
in association list.
(inline_matmul_assign): Likewise.
2018-01-30 Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/84133
* gfortran.dg/inline_matmul_21.f90: New test case.
From-SVN: r257206
2018-01-30 Vladimir Makarov <vmakarov@redhat.com>
PR target/84112
* lra-constraints.c (curr_insn_transform): Process AND in the
address.
2018-01-30 Vladimir Makarov <vmakarov@redhat.com>
PR target/84112
* pr84112.c: New.
From-SVN: r257204
PR rtl-optimization/83986
* sched-deps.c (sched_analyze_insn): For frame related insns, add anti
dependence against last_pending_memory_flush in addition to
pending_jump_insns.
* gcc.dg/pr83986.c: New test.
From-SVN: r257203
If there are copies between the GIMPLE_PHI at the loop body and the
increment that reaches it (presumably through a back edge), still
regard it as a simple_iv_increment, so that we won't consider the
value in the back edge eligible for forwprop. Doing so would risk
making the phi node and the incremented value live at the same time
within the loop, where they would conflict, and would force the phi
node to be preserved for propagated uses after the loop.
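The check described above can be sketched on a toy model: starting from
the operand of the increment, walk back through plain copies until a
PHI is found, then confirm that the PHI receives the increment's result
on its back edge. The types and the name is_simple_iv_increment below
are hypothetical and far simpler than GCC's gimple representation; this
only illustrates the "skip intervening copies" idea, assuming a single
operand per statement.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model of SSA definitions.  These types are hypothetical and are
   not GCC's gimple data structures.  */
enum def_kind { DEF_PHI, DEF_COPY, DEF_PLUS, DEF_OTHER };

struct def
{
  enum def_kind kind;
  const struct def *operand;    /* DEF_COPY / DEF_PLUS: the value read.  */
  const struct def *back_edge;  /* DEF_PHI: value arriving on the back edge.  */
};

/* Return true if STMT is an increment whose operand reaches a PHI,
   possibly through a chain of copies, and that PHI takes STMT's
   result on its back edge.  */
static bool
is_simple_iv_increment (const struct def *stmt)
{
  if (stmt == NULL || stmt->kind != DEF_PLUS)
    return false;

  /* Skip intervening copies between the increment and the PHI.  */
  const struct def *d = stmt->operand;
  while (d != NULL && d->kind == DEF_COPY)
    d = d->operand;

  return d != NULL && d->kind == DEF_PHI && d->back_edge == stmt;
}
```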
for gcc/ChangeLog
PR tree-optimization/81611
* tree-ssa-dom.c (simple_iv_increment_p): Skip intervening
copies.
From-SVN: r257194
This patch xfails a few test cases on powerpc64 that fail after r256380
due to a longstanding issue with floating-point compares.
See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58684 for more information.
2018-01-30 Bill Seurer <seurer@linux.vnet.ibm.com>
PR 58684
* gcc/testsuite/gcc.dg/torture/inf-compare-1.c: Add xfail.
* gcc/testsuite/gcc.dg/torture/inf-compare-2.c: Add xfail.
* gcc/testsuite/gcc.dg/torture/inf-compare-3.c: Add xfail.
* gcc/testsuite/gcc.dg/torture/inf-compare-4.c: Add xfail.
From-SVN: r257190
PR tree-optimization/84111
* tree-ssa-loop-ivcanon.c (tree_unroll_loops_completely_1): Skip
inner loops added during recursion, as they don't have up-to-date
SSA form.
* gcc.c-torture/compile/pr84111.c: New test.
From-SVN: r257188
PR lto/83954
* lto-symtab.c (warn_type_compatibility_p): Silence false positive
for type match warning on arrays of pointers.
* gcc.dg/lto/pr83954.h: New testcase.
* gcc.dg/lto/pr83954_0.c: New testcase.
* gcc.dg/lto/pr83954_1.c: New testcase.
From-SVN: r257183
2018-01-30 Richard Biener <rguenther@suse.de>
PR tree-optimization/83008
* tree-vect-slp.c (vect_analyze_slp_cost_1): Properly cost
invariant and constant vector uses in stmts when they need
more than one stmt.
From-SVN: r257181
sve/extract_[12].c were relying on the target-independent optimisation
that removes a redundant vec_select, so that we don't end up with
things like:
dup v0.4s, v0.4s[0]
...use s0...
But that optimisation rightly doesn't trigger for big-endian targets,
because GCC expects lane 0 to be in the high part of the register
rather than the low part.
SVE breaks this assumption -- see the comment at the head of
aarch64-sve.md for details -- so the optimisation is valid for
both endiannesses. Long term, we probably need some kind of target
hook to make GCC aware of this.
But there's another problem with the current extract pattern: it doesn't
tell the register allocator how cheap an extraction of lane 0 is with
tied registers. It seems better to split the lane 0 case out into
its own pattern and use tied operands for the FPR<-SIMD case,
so that using different registers has the cost of an extra reload.
I think we want this for both endiannesses, regardless of the hook
described above.
Also, the gen_lowpart in this pattern fails for aarch64_be due to
TARGET_CAN_CHANGE_MODE_CLASS restrictions, so the patch uses gen_rtx_REG
instead. We're only creating this rtl in order to print it, so there's
no need for anything fancier.
2018-01-30 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* config/aarch64/aarch64-sve.md (*vec_extract<mode><Vel>_0): New
pattern.
(*vec_extract<mode><Vel>_v128): Require a nonzero lane number.
Use gen_rtx_REG rather than gen_lowpart.
Reviewed-by: James Greenhalgh <james.greenhalgh@arm.com>
From-SVN: r257178
LRA was using a subreg offset of 0 whenever constraints matched
two operands with different modes. That leads to an invalid offset
(and ICE) on big-endian targets if one of the modes is narrower
than a word. E.g. if a (reg:SI X) is matched to a (reg:QI Y),
the big-endian subreg should be (subreg:QI (reg:SI X) 3) rather
than (subreg:QI (reg:SI X) 0).
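The arithmetic behind that example can be sketched as follows. This is
a simplified stand-in, not GCC's subreg_lowpart_offset: the function
name lowpart_byte_offset is hypothetical, and it assumes that byte and
word endianness agree and that the outer mode is no wider than the
inner mode.

```c
#include <stdbool.h>

/* Hypothetical, simplified model of a lowpart subreg byte offset.
   Assumes byte and word endianness agree and
   OUTER_BYTES <= INNER_BYTES.  */
static unsigned int
lowpart_byte_offset (unsigned int outer_bytes, unsigned int inner_bytes,
                     bool big_endian)
{
  /* On big-endian targets the least significant bytes sit at the
     highest byte offsets; on little-endian targets they sit at 0.  */
  return big_endian ? inner_bytes - outer_bytes : 0;
}
```

With a 1-byte QImode value inside a 4-byte SImode register, this gives
offset 3 on a big-endian target and offset 0 on a little-endian one,
matching the subregs quoted above.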
But this raises the issue of what the behaviour should be when the
matched operands occupy different numbers of registers. Should the
register numbers match, or should the locations of the lsbs match?
Although the documentation isn't clear, reload went for the second
interpretation (which seems the most natural to me):
/* On a REG_WORDS_BIG_ENDIAN machine, point to the last register of a
multiple hard register group of scalar integer registers, so that
for example (reg:DI 0) and (reg:SI 1) will be considered the same
register. */
So I think this means that we can/must use the lowpart offset
unconditionally, rather than trying to separate out the multi-register
case. This also matches the LRA handling of constant integers, which
already uses lowpart subregs.
The patch fixes gcc.target/aarch64/sve/extract_[34].c for aarch64_be.
2018-01-30 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* lra-constraints.c (match_reload): Use subreg_lowpart_offset
rather than 0 when creating partial subregs.
From-SVN: r257177
2018-01-30 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* vec-perm-indices.c (vec_perm_indices::series_p): Give examples
of usage.
From-SVN: r257176
This test fails to optimise away the PLUS reduction in the loop on arm
targets when vectorisation is unavailable because the target lacks SIMD
instructions.
From reading the logs and the PR I gather that the presence or absence
of SIMD affects whether this test passes on other targets as well, as
evidenced by the long list of xfail targets.
This list looks quite unwieldy to me, but here is a patch adding
non-NEON arm to that list.
* gcc.dg/tree-ssa/ssa-dom-cse-2.c: XFAIL on !arm_neon arm targets.
From-SVN: r257175
PR testsuite/81010
* gcc.target/powerpc/pr56605.c: Update various dg- directives to
better match other tests which require vsx. Verify the zero
extension is part of the test in the combiner dump.
From-SVN: r257172
CL 84555 added support for the SuperH architecture, but didn't add the
randomTrap definition to be used for the getrandom syscall on Linux.
Add it now.
Reviewed-on: https://go-review.googlesource.com/90535
From-SVN: r257171
2018-01-29 Michael Meissner <meissner@linux.vnet.ibm.com>
PR target/81550
* config/rs6000/rs6000.c (rs6000_setup_reg_addr_masks): If DFmode
and SFmode can go in Altivec registers (-mcpu=power7 for DFmode,
-mcpu=power8 for SFmode) don't set the PRE_INCDEC or PRE_MODIFY
flags. This restores the settings used before 2017-07-24.
Turning off pre increment/decrement/modify allows IVOPTS to
optimize DF/SF loops where the index is an int.
From-SVN: r257166