2015-12-14 Richard Biener <rguenther@suse.de>
PR tree-optimization/68775
* tree-vect-slp.c (vect_build_slp_tree): Make sure to apply
operand swapping even when replacing the op with scalars.
From-SVN: r231617
2015-12-14 Thomas Preud'homme <thomas.preudhomme@arm.com>
PR testsuite/68629
* lib/target-supports.exp (check_effective_target_cilkplus): Also
check that compiling with -fcilkplus does not give an error.
* c-c++-common/attr-simd-3.c: Require cilkplus effective target.
From-SVN: r231605
VTA's cselib expression hashing compares expressions with the same
hash before adding them to the hash table. When there is a collision
involving a self-referencing expression, we could get infinite
recursion, in spite of the cycle breakers already in place. The
problem is currently latent in the trunk, because by chance we don't
get a collision.
Such value cycles are often introduced by reverse_op; most often,
they're indirect, and then value canonicalization takes care of the
cycle, but if the reverse operation simplifies to the original value,
we used to issue a (plus V (const_int 0)), because at some point
adding a plain value V to a location list as a reverse_op equivalence
caused other problems.
This dummy zero, in turn, caused the value canonicalizer to not fully
realize the equivalence, leading to more complex graphs and,
occasionally, to infinite recursion when comparing such
value-plus-zero expressions recursively.
Simply using V solves the infinite recursion from the PR testcase:
the extra equivalence and the preexisting value canonicalization
together prevent the recursion, which the unrecognized equivalence
did not. However, it exposed another infinite recursion in
memrefs_conflict_p. get_addr had a cycle breaker in place to skip RTL
referencing values introduced after the one we're examining, but it
wouldn't break the cycle if the value itself appeared in the
expression being examined.
After removing the dummy zero above, this kind of cycle in the
equivalence graph is no longer introduced by VTA itself, but dummy
zeros are also present in generated code, such as in 32-bit x86's
pro_epilogue_adjust_stack_si_add epilogue insn, generated as part of
the builtin longjmp in _Unwind_RaiseException when building libgcc's
unwind-dw2.o. So, break the recursion cycle for them too.
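The cutoff described above can be illustrated with a toy model. This is only a sketch under stated assumptions: the toy_expr type, its uid field, and toy_refs_value_at_or_after are hypothetical names invented here, not GCC's RTL or cselib data structures, and the actual change in alias.c may be expressed differently. The point it shows is that a check which also treats the examined value itself (not only newer values) as a reason to stop will reject an expression such as (plus V (const_int 0)) when V is the value being examined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy expression codes; not GCC's rtx codes.  */
enum toy_code { TOY_CONST, TOY_VALUE, TOY_PLUS };

/* Hypothetical toy expression node.  */
struct toy_expr
{
  enum toy_code code;
  int uid;                       /* Only meaningful for TOY_VALUE.  */
  const struct toy_expr *op[2];  /* Only meaningful for TOY_PLUS.  */
};

/* Return true if EXPR mentions a value whose uid is MIN_UID or greater,
   i.e. the value being examined itself or any value created after it.
   A caller expanding equivalences would skip such expressions rather
   than recurse into them.  */
bool
toy_refs_value_at_or_after (const struct toy_expr *expr, int min_uid)
{
  assert (expr != NULL);
  switch (expr->code)
    {
    case TOY_VALUE:
      return expr->uid >= min_uid;
    case TOY_PLUS:
      return (toy_refs_value_at_or_after (expr->op[0], min_uid)
              || toy_refs_value_at_or_after (expr->op[1], min_uid));
    default:
      return false;
    }
}
```

With a strict "greater than" comparison in the TOY_VALUE case, an expression containing the examined value itself would pass the check, which is the situation the paragraph above describes.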
for gcc/ChangeLog
PR debug/67355
* var-tracking.c (reverse_op): Don't add dummy zero to reverse
ops that simplify back to the original value.
* alias.c (refs_newer_value_p): Cut off recursion for
expressions containing the original value.
From-SVN: r231599
[PATCH][PR target/19201] Peephole to improve clearing items in structure for m68k
* config/m68k/m68k.md (load feeding clear byte): New peephole2.
* gcc.target/m68k/pr19201.c: New test.
From-SVN: r231597
2015-12-13 Tom de Vries <tom@codesourcery.com>
* tree-ssa-structalias.c (find_func_clobbers): Handle sizes and kinds
parameters of GOACC_parallel.
From-SVN: r231595
gcc:
PR sanitizer/68418
* c-family/c-ubsan.c (ubsan_instrument_shift): Disable
sanitization of left shifts for wrapping signed types as well.
gcc/testsuite:
PR sanitizer/68418
* gcc.dg/ubsan/c99-wrapv-shift-1.c,
gcc.dg/ubsan/c99-wrapv-shift-2.c: New testcases.
From-SVN: r231582
Function-scope (or narrower-scope) static variables, as well as others
not placed on the stack, should also not have any effect on the stack
alignment. I first noticed the issue with Linux's dynamic_pr_debug()
construct, which uses an 8-byte aligned sub-file-scope local variable.
According to my checking, the bad behavior started with 4.6.x (4.5.3 was
still okay), and the generated code got quite a bit worse as of 4.9.0.
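A minimal sketch of the kind of construct involved, assuming GCC's aligned attribute. The function name bump_and_locate_counter is made up for illustration, and this is not the kernel's dynamic_pr_debug() code. The static object below lives in static storage, so its alignment request says nothing about how the enclosing function's stack frame needs to be aligned.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: bump a function-scope static that requests
   8-byte alignment and return its address.  The object is not placed
   on the stack, so its alignment is a property of static storage.  */
uintptr_t
bump_and_locate_counter (void)
{
  static long long counter __attribute__ ((aligned (8)));

  counter++;
  /* The static itself must honor the requested alignment.  */
  assert ((uintptr_t) &counter % 8 == 0);
  return (uintptr_t) &counter;
}
```

Every call returns the same address, since there is a single object in static storage and no per-call stack slot for it.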
gcc/
2015-12-11 Jan Beulich <jbeulich@suse.com>
* cfgexpand.c (expand_one_var): Exit early for static and
external variables when adjusting stack alignment.
gcc/testsuite/
2015-12-11 Jan Beulich <jbeulich@suse.com>
* gcc.c-torture/execute/stkalign.c: New.
From-SVN: r231569
libmpx/
2015-12-11 Tsvetkova Alexandra <aleksandra.tsvetkova@intel.com>
* mpxrt/Makefile.am (libmpx_la_LDFLAGS): Add -version-info
option.
* libmpxwrap/Makefile.am (libmpx_la_LDFLAGS): Likewise and
fix include path.
* libmpx/Makefile.in: Regenerate.
* mpxrt/Makefile.in: Regenerate.
* libmpxwrap/Makefile.in: Regenerate.
* mpxrt/libtool-version: New version.
* libmpxwrap/libtool-version: Likewise.
* mpxrt/libmpx.map: Add new version and a new symbol.
* mpxrt/mpxrt.h: New file.
* mpxrt/mpxrt.c (NUM_L1_BITS): Moved to mpxrt.h.
(REG_IP_IDX): Moved to mpxrt.h.
(REX_PREFIX): Moved to mpxrt.h.
(XSAVE_OFFSET_IN_FPMEM): Moved to mpxrt.h.
(MPX_L1_SIZE): Moved to mpxrt.h.
* libmpxwrap/mpx_wrappers.c (mpx_pointer): New type.
(mpx_bt_entry): New type.
(alloc_bt): New function.
(get_bt): New function.
(copy_if_possible): New function.
(copy_if_possible_from_end): New function.
(move_bounds): New function.
(__mpx_wrapper_memmove): Use move_bounds to copy bounds.
gcc/testsuite/
2015-12-11 Tsvetkova Alexandra <aleksandra.tsvetkova@intel.com>
* gcc.target/i386/mpx/memmove-1.c: New test.
* gcc.target/i386/mpx/memmove-2.c: New test.
From-SVN: r231565
gcc/ChangeLog
* config/s390/s390.md ("movstr", "*movstr"): Fix warning.
("movstr<P:mode>"): New indirect expanders used by "movstr".
gcc/testsuite/ChangeLog
* gcc.target/s390/md/movstr-1.c: New test.
* gcc.target/s390/s390.exp: Add subdir md.
Do not run hotpatch tests twice.
From-SVN: r231557
2015-12-11 Richard Biener <rguenther@suse.de>
* lto-streamer.h (lto_simple_header_with_strings): Remove
main_size field already in lto_simple_header.
From-SVN: r231555
gcc/
* config/i386/i386.c (ix86_get_mask_mode): Use scalar
modes for 32 and 16 byte boolean vectors when possible.
gcc/testsuite/
* gcc.dg/vect/vect-32-chars.c: New test.
From-SVN: r231553
After shrink-wrapping has found the "tightest fit" for where to place
the prologue, it tries to move it earlier (so that frame saves are run
earlier) -- but without copying any more basic blocks.
Unfortunately a candidate block we select can be inside a loop, and we
will still allow it (because the loop always exits via our previously
chosen block). We can do that just fine if we make a duplicate of the
block, but we do not want to here.
So we need to detect this situation. We can place the prologue at a
previous block PRE only if PRE dominates every block reachable from
it, because then we will never need to duplicate that block (it will
always be executed with the prologue).
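The condition above can be sketched on a toy control-flow graph. This is an illustration under stated assumptions, not the code in shrink-wrap.c: struct toy_cfg, mark_reachable and dominates_all_reachable are hypothetical names, and the sketch uses plain reachability instead of GCC's dominance information. It relies on the fact that PRE dominates a block B exactly when B cannot be reached from the entry block once PRE is removed from the graph.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_BLOCKS 16

/* Hypothetical toy CFG: succ[i][j] is true when there is an edge
   from block i to block j.  */
struct toy_cfg
{
  int n_blocks;
  bool succ[MAX_BLOCKS][MAX_BLOCKS];
};

/* Mark in SEEN every block reachable from START (START included)
   without passing through AVOID.  Pass -1 as AVOID to avoid nothing.  */
static void
mark_reachable (const struct toy_cfg *cfg, int start, int avoid, bool *seen)
{
  int stack[MAX_BLOCKS];
  int sp = 0;

  memset (seen, 0, MAX_BLOCKS * sizeof (bool));
  if (start == avoid)
    return;
  seen[start] = true;
  stack[sp++] = start;
  while (sp > 0)
    {
      int b = stack[--sp];
      for (int s = 0; s < cfg->n_blocks; s++)
        if (cfg->succ[b][s] && s != avoid && !seen[s])
          {
            /* Each block is pushed at most once, so the stack
               cannot overflow.  */
            seen[s] = true;
            stack[sp++] = s;
          }
    }
}

/* Return true if PRE dominates every block reachable from PRE, i.e. no
   such block can also be reached from ENTRY by a path that bypasses PRE.  */
bool
dominates_all_reachable (const struct toy_cfg *cfg, int entry, int pre)
{
  bool from_pre[MAX_BLOCKS];
  bool bypassing_pre[MAX_BLOCKS];

  assert (cfg->n_blocks <= MAX_BLOCKS);
  mark_reachable (cfg, pre, -1, from_pre);
  mark_reachable (cfg, entry, pre, bypassing_pre);

  for (int b = 0; b < cfg->n_blocks; b++)
    if (b != pre && from_pre[b] && bypassing_pre[b])
      return false;
  return true;
}
```

For a loop 0 -> 1 -> 2 -> 1 with exit 2 -> 3, block 2 fails the check because the loop header 1 is reachable both from 2 and directly from the entry, while block 1 passes it.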
2015-12-11 Segher Boessenkool <segher@kernel.crashing.org>
PR rtl-optimization/67778
PR rtl-optimization/68634
* shrink-wrap.c (try_shrink_wrapping): Add a comment about why we want
to put the prologue earlier. When determining if an earlier block is
suitable, make sure it dominates every block reachable from it.
From-SVN: r231552