vect_transform_loop has to reduce three iteration counts by
the vectorisation factor: nb_iterations_upper_bound,
nb_iterations_likely_upper_bound and nb_iterations_estimate.
All three are latch execution counts rather than loop body
execution counts. The calculations were taking that into
account for the first two, but not for nb_iterations_estimate.
This patch updates the way the calculations are done to fix
this and to add a bit more commentary about what is going on.
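As a rough illustration of the latch-count arithmetic described above, here is a minimal sketch. It is not the code in tree-vect-loop.c: the helper name is hypothetical, plain integers stand in for GCC's wide-int types, and any adjustment for peeling is ignored.

```cpp
#include <cassert>

// Hypothetical helper, not part of GCC: convert a bound on the number of
// latch executions of a scalar loop into the corresponding bound for the
// vectorised loop.  A latch count of N means the loop body runs N + 1 times.
long
vector_latch_bound (long scalar_latch_bound, long vf)
{
  assert (scalar_latch_bound >= 0 && vf > 0);
  // +1 converts the latch count to a body iteration count.
  long scalar_body_iters = scalar_latch_bound + 1;
  // Each vector iteration covers VF scalar iterations.
  long vector_body_iters = scalar_body_iters / vf;
  // -1 converts the body iteration count back to a latch count.
  return vector_body_iters - 1;
}
```

With a scalar latch bound of 16 and a vectorisation factor of 4, the body runs 17 times, so the vector body runs 4 times and the new latch bound is 3. Dividing the latch count directly would give 4.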
gcc/
2016-11-15 Richard Sandiford <richard.sandiford@arm.com>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
* tree-vect-loop.c (vect_transform_loop): Protect the updates of
all three iteration counts with an any_* test. Use a single update
for each count. Fix the calculation of nb_iterations_estimate.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r242475
The old code still built thanks to the brackets in the definition
of XVECEXP.
gcc/
* config/arc/arc.c (arc_loop_hazard): Add missing brackets.
From-SVN: r242473
The test assumes short is always smaller than int, and therefore does not
expect a warning when the logical operands are of type short and int.
This isn't true for AVR, where short and int are the same size, so
the warning triggers for that case as well.
Fix by explicitly typedef'ing __INT32_TYPE__ for int and __INT16_TYPE__ for short
if the target's int size is less than 4 bytes.
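The approach can be sketched as follows. The typedef names are hypothetical and the real test may spell them differently; __SIZEOF_INT__, __INT16_TYPE__ and __INT32_TYPE__ are the compiler's predefined macros.

```cpp
// Hypothetical type names, chosen for illustration only.
#if __SIZEOF_INT__ < 4
typedef __INT16_TYPE__ test_short;
typedef __INT32_TYPE__ test_int;
#else
typedef short test_short;
typedef int test_int;
#endif

// The property the test relies on: the "short" type is strictly narrower
// than the "int" type, whatever the target's native sizes are.
static_assert (sizeof (test_short) < sizeof (test_int),
               "test_short must be narrower than test_int");
```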
gcc/testsuite/
2016-11-16 Senthil Kumar Selvaraj <senthil_kumar.selvaraj@atmel.com>
* c-c++-common/Wlogical-op-1.c: Use __INT{16,32}_TYPE__ instead
of {short,int} if __SIZEOF_INT__ is less than 4 bytes.
From-SVN: r242472
PR target/78364
* config/arm/arm.md (*extv_reg): Restrict operands 2 and 3 to the
proper ranges for an SBFX instruction.
(extzv_t2): Likewise for UBFX.
From-SVN: r242471
2016-11-16 Richard Biener <rguenther@suse.de>
PR tree-optimization/78348
* tree-loop-distribution.c (enum partition_kind): Add PKIND_MEMMOVE.
(generate_memcpy_builtin): Honor PKIND_MEMCPY on the partition.
(classify_partition): Set PKIND_MEMCPY if dependence analysis
revealed no dependency, PKIND_MEMMOVE otherwise.
* gcc.dg/tree-ssa/ldist-24.c: New testcase.
From-SVN: r242470
PR sanitizer/77823
* ubsan.c (ubsan_build_overflow_builtin): Add DATAP argument, if
it points to non-NULL tree, use it instead of ubsan_create_data.
(instrument_si_overflow): Handle vector signed integer overflow
checking.
* ubsan.h (ubsan_build_overflow_builtin): Add DATAP argument.
* tree-vrp.c (simplify_internal_call_using_ranges): Punt for
vector IFN_UBSAN_CHECK_*.
* internal-fn.c (expand_addsub_overflow): Add DATAP argument,
pass it through to ubsan_build_overflow_builtin.
(expand_neg_overflow, expand_mul_overflow): Likewise.
(expand_vector_ubsan_overflow): New function.
(expand_UBSAN_CHECK_ADD, expand_UBSAN_CHECK_SUB,
expand_UBSAN_CHECK_MUL): Use it for vector arithmetics.
(expand_arith_overflow): Adjust expand_*_overflow callers.
* c-c++-common/ubsan/overflow-vec-1.c: New test.
* c-c++-common/ubsan/overflow-vec-2.c: New test.
From-SVN: r242469
2016-11-15 Bernd Edlinger <bernd.edlinger@hotmail.de>
* genattrtab.c (attr_rtx_1): Avoid allocating new rtx objects.
Clear ATTR_CURR_SIMPLIFIED_P for re-used binary rtx objects.
Use DEF_ATTR_STRING for string arguments. Use RTL_HASH for
integer arguments. Only set ATTR_PERMANENT_P on newly hashed
rtx when all sub-rtx are also permanent.
(attr_eq): Simplify.
(attr_copy_rtx): Remove.
(make_canonical, get_attr_value): Use attr_equal_p.
(copy_boolean): Rehash NOT.
(simplify_test_exp_in_temp,
optimize_attrs): Remove call to attr_copy_rtx.
(attr_alt_intersection, attr_alt_union,
attr_alt_complement, mk_attr_alt): Rehash EQ_ATTR_ALT.
(make_automaton_attrs): Use attr_eq.
From-SVN: r242460
* doc/xml/manual/intro.xml: Document LWG 2770 status. Remove entries
for 2742 and 2748.
* doc/html/*: Regenerate.
* include/std/utility (__tuple_size_cv_impl): New helper to safely
detect tuple_size<T>::value, as per LWG 2770.
(tuple_size<cv T>): Adjust partial specializations to derive from
__tuple_size_cv_impl.
* testsuite/20_util/tuple/cv_tuple_size.cc: Test SFINAE-friendliness.
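A minimal sketch of what SFINAE-friendliness means here, assuming a C++17 library that implements LWG 2770. The detection trait and the not_tuple_like type are hypothetical and are not taken from the testsuite.

```cpp
#include <tuple>
#include <type_traits>

// Hypothetical detection trait, not part of libstdc++: true if
// std::tuple_size<T>::value is a valid expression.
template<typename T, typename = void>
struct has_tuple_size_value : std::false_type { };

template<typename T>
struct has_tuple_size_value<T,
                            std::void_t<decltype (std::tuple_size<T>::value)>>
  : std::true_type { };

// Hypothetical type with no tuple_size specialization.
struct not_tuple_like { };

// cv-qualified tuples still report a size...
static_assert (has_tuple_size_value<const std::tuple<int, char>>::value,
               "const tuple has a tuple_size value");
// ...while asking about a non-tuple-like type is a soft failure.
static_assert (!has_tuple_size_value<const not_tuple_like>::value,
               "no hard error for types without a tuple_size");
```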
From-SVN: r242452
When constructing a ?: or fold expression that requires a third
expression, only the first and second were explicitly checked to
not be NULL. Since the third expression is also required in these
constructs it needs to be explicitly checked and rejected when missing.
Otherwise the demangler will crash once it tries to d_print the
NULL component. Added two examples to demangle-expected of strings
that would crash before this fix.
Found by American Fuzzy Lop (afl) fuzzer.
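The kind of check involved can be sketched like this. It is a simplified model with hypothetical names, not the demangler's actual data structures or functions.

```cpp
#include <cassert>
#include <cstddef>

// Simplified, hypothetical stand-in for a demangler component with three
// operands.
struct component
{
  const component *first;
  const component *second;
  const component *third;
};

// Hypothetical helper: fill in *RESULT and return true only if all three
// operands are present.  Rejecting a missing third operand here means the
// printing code never sees a NULL component.
bool
make_trinary (component *result, const component *first,
              const component *second, const component *third)
{
  assert (result != NULL);
  if (first == NULL || second == NULL || third == NULL)
    return false;
  result->first = first;
  result->second = second;
  result->third = third;
  return true;
}
```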
From-SVN: r242451
In various situations the cplus_demangle () function could read past the
end of input causing crashes. Add checks in various places to not advance
the demangle string location and fail early when end of string is reached.
Add various examples of input strings to the testsuite that would crash
test-demangle before the fixes.
Found by using the American Fuzzy Lop (afl) fuzzer.
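A minimal sketch of the "fail early and don't advance" pattern, using a hypothetical helper rather than the actual code in cplus-dem.c:

```cpp
#include <cassert>

// Hypothetical helper: consume EXPECTED from the string at *POS.
// Returns false without moving *POS if the end of the string has been
// reached or the next character does not match, so the caller can never
// step past the terminating NUL.
bool
consume_char (const char **pos, char expected)
{
  assert (pos != nullptr && *pos != nullptr);
  if (**pos == '\0' || **pos != expected)
    return false;
  ++*pos;
  return true;
}
```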
libiberty/ChangeLog:
* cplus-dem.c (demangle_signature): After 'H', template function,
no success and don't advance position if end of string reached.
(demangle_template): After 'z', template name, return zero on
premature end of string.
(gnu_special): Guard strchr against searching for zero characters.
(do_type): If member, only advance mangled string when 'F' found.
* testsuite/demangle-expected: Add examples of strings that could
crash the demangler by reading past end of input.
From-SVN: r242450
* gcc.target/i386/funcspec-56.inc: New file.
* gcc.target/i386/funcspec-5.c: Include funcspec-56.inc. Remove
common 32-bit and 64-bit function specific options.
* gcc.target/i386/funcspec-6.c: Ditto.
From-SVN: r242448
Several definitions of INCOMING_RETURN_ADDR_RTX used
gen_rtx_REG (VOIDmode, ...), which with later patches
would trip an assert. This patch converts them to use
Pmode instead.
gcc/
2016-11-15 Richard Sandiford <richard.sandiford@arm.com>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
* config/i386/i386.h (INCOMING_RETURN_ADDR_RTX): Use Pmode instead
of VOIDmode.
* config/ia64/ia64.h (INCOMING_RETURN_ADDR_RTX): Likewise.
* config/iq2000/iq2000.h (INCOMING_RETURN_ADDR_RTX): Likewise.
* config/m68k/m68k.h (INCOMING_RETURN_ADDR_RTX): Likewise.
* config/microblaze/microblaze.h (INCOMING_RETURN_ADDR_RTX): Likewise.
* config/mips/mips.h (INCOMING_RETURN_ADDR_RTX): Likewise.
* config/mn10300/mn10300.h (INCOMING_RETURN_ADDR_RTX): Likewise.
* config/nios2/nios2.h (INCOMING_RETURN_ADDR_RTX): Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r242447
Using MEM_SIZE is more general, since it copes with cases where
targets are forced to use BLKmode references for whatever reason.
gcc/
2016-11-15 Richard Sandiford <richard.sandiford@arm.com>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
* dce.c (check_argument_store): Pass the size instead of
the memory reference.
(find_call_stack_args): Pass MEM_SIZE to check_argument_store.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r242446
After simplifying the operands of a PLUS, canon_rtx checked only
for cases in which one of the simplified operands was a constant,
falling back to gen_rtx_PLUS otherwise. This left the PLUS in a
non-canonical order if one of the simplified operands was
(plus (reg R1) (const_int X)); we'd end up with:
    (plus (plus (reg R1) (const_int X)) (reg R2))
rather than:
    (plus (plus (reg R1) (reg R2)) (const_int X))
Fixing this exposed new DSE opportunities on spu-elf in
gcc.c-torture/execute/builtins/strcat-chk.c but otherwise
it doesn't seem to have much practical effect.
gcc/
2016-11-15 Richard Sandiford <richard.sandiford@arm.com>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
* alias.c (canon_rtx): Use simplify_gen_binary.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r242445
LOAD_EXTEND_OP only applies to scalar integer modes that are narrower
than a word. However, callers weren't consistent about which of these
checks they made beforehand, and also weren't consistent about whether
"smaller" was based on (bit)size or precision (IMO it's the latter).
This patch adds a wrapper to try to make the macro easier to use.
LOAD_EXTEND_OP is often used to disable transformations that aren't
beneficial when extends from memory are free, so being stricter about
the check accidentally exposed more optimisation opportunities.
"SUBREG_BYTE (...) == 0" and subreg_lowpart_p are implied by
paradoxical_subreg_p, so the patch also removes some redundant tests.
The patch doesn't change reload, since different checks could have
unforeseen consequences.
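The shape of such a wrapper can be modelled as below. This is a self-contained sketch: the types, the word size and the target hook are hypothetical stand-ins for GCC's machine modes, BITS_PER_WORD and the LOAD_EXTEND_OP macro.

```cpp
#include <cassert>

// Hypothetical stand-in for the relevant rtx codes.
enum extend_code { UNKNOWN, ZERO_EXTEND, SIGN_EXTEND };

// Hypothetical stand-in for a machine mode.
struct mode_info
{
  bool scalar_int_p;  // true for scalar integer modes
  int precision;      // precision in bits
};

// Assumed word size for this example.
const int bits_per_word = 32;

// Stand-in for a target's LOAD_EXTEND_OP; this example assumes a target
// that zero-extends every narrow load.
extend_code
target_load_extend_op (const mode_info &)
{
  return ZERO_EXTEND;
}

// The wrapper: consult the target only for scalar integer modes whose
// precision is narrower than a word, and report UNKNOWN otherwise.
extend_code
load_extend_op (const mode_info &mode)
{
  assert (mode.precision > 0);
  if (mode.scalar_int_p && mode.precision < bits_per_word)
    return target_load_extend_op (mode);
  return UNKNOWN;
}
```

Putting both checks in one place means callers no longer need to repeat them, which is the inconsistency the message describes.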
gcc/
2016-11-15 Richard Sandiford <richard.sandiford@arm.com>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
* rtl.h (load_extend_op): Declare.
* rtlanal.c (load_extend_op): New function.
(nonzero_bits1): Use it.
(num_sign_bit_copies1): Likewise.
* cse.c (cse_insn): Likewise.
* fold-const.c (fold_single_bit_test): Likewise.
(fold_unary_loc): Likewise.
* fwprop.c (free_load_extend): Likewise.
* postreload.c (reload_cse_simplify_set): Likewise.
(reload_cse_simplify_operands): Likewise.
* combine.c (try_combine): Likewise.
(simplify_set): Likewise. Remove redundant SUBREG_BYTE and
subreg_lowpart_p checks.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r242444
simplify_shift_const_1 handles both shifts of scalars by scalars
and shifts of vectors by scalars. For vectors this means that
each element is shifted by the same amount.
However:
(a) the two cases weren't always distinguished, so we'd try
things for vectors that only made sense for scalars.
(b) a lot of the range and bitcount checks were based on the
bitsize or precision of the full shifted operand, rather
than the mode of each element.
Fixing (b) accidentally exposed more optimisation opportunities,
although that wasn't the point of the patch.
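To illustrate point (b), here is a small sketch with hypothetical names; it is not the logic in combine.c. For a vector of four 8-bit elements, a shift count of 12 is out of range for each element even though it is less than the 32 bits of the full operand.

```cpp
#include <cassert>

// Hypothetical description of a shifted operand: NUNITS elements of
// UNIT_BITS bits each.  A scalar has nunits == 1.
struct operand_mode
{
  int unit_bits;
  int nunits;
};

// Range check based on the element (inner) width.
bool
shift_count_in_range (const operand_mode &mode, int count)
{
  assert (mode.unit_bits > 0 && mode.nunits > 0);
  return count >= 0 && count < mode.unit_bits;
}

// For comparison: the same check based on the width of the full operand,
// which is too permissive for vectors.
bool
shift_count_in_range_full_width (const operand_mode &mode, int count)
{
  assert (mode.unit_bits > 0 && mode.nunits > 0);
  return count >= 0 && count < mode.unit_bits * mode.nunits;
}
```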
gcc/
2016-11-15 Richard Sandiford <richard.sandiford@arm.com>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
* combine.c (simplify_shift_const_1): Use the number of bits
in the inner mode to determine the range of the shift.
When handling shifts of vectors, skip any rules that apply
only to scalars.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r242442
The old assignment to bitwidth was before we handled VOIDmode with:
  if (mode == VOIDmode)
    mode = GET_MODE (x);
so when VOIDmode was specified we would always use:
  if (bitwidth < GET_MODE_PRECISION (GET_MODE (x)))
    {
      num0 = cached_num_sign_bit_copies (x, GET_MODE (x),
                                         known_x, known_mode, known_ret);
      return MAX (1,
                  num0 - (int) (GET_MODE_PRECISION (GET_MODE (x)) - bitwidth));
    }
For a zero bitwidth this always returns 1 (which is the most
pessimistic result).
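The arithmetic can be checked with a reduction of the quoted fragment to plain integers. The function name and parameters below are hypothetical; only the MAX expression mirrors the fragment.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical reduction of the quoted fragment: NUM0 is the number of
// sign-bit copies known in the full-precision value, PRECISION is the
// precision of that value and BITWIDTH is the width being asked about.
int
narrowed_sign_bit_copies (int num0, int precision, int bitwidth)
{
  assert (num0 >= 1 && num0 <= precision && bitwidth < precision);
  return std::max (1, num0 - (precision - bitwidth));
}
```

Since num0 can never exceed the precision, a bitwidth of zero makes the second operand of the maximum zero or negative, so the result is always 1.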
gcc/
2016-11-15 Richard Sandiford <richard.sandiford@arm.com>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
* rtlanal.c (num_sign_bit_copies1): Calculate bitwidth after
handling VOIDmode.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r242440
* include/std/variant: Remove variant<T&>, variant<void>, variant<> support
to rebase on the post-Issaquah design.
* testsuite/20_util/variant/compile.cc: Likewise.
From-SVN: r242437
gcc/
* config/mips/mips.c (mips16_emit_constants): Emit `consttable'
insn at the beginning of the constant pool.
(mips_insert_insn_pseudos): New function.
(mips_machine_reorg2): Call it.
* config/mips/mips.md (unspec): Add UNSPEC_CONSTTABLE and
UNSPEC_INSN_PSEUDO enum values.
(insn_pseudo, consttable): New insns.
gcc/testsuite/
* gcc.target/mips/insn-casesi.c: New test case.
* gcc.target/mips/insn-pseudo-1.c: New test case.
* gcc.target/mips/insn-pseudo-2.c: New test case.
* gcc.target/mips/insn-pseudo-3.c: New test case.
* gcc.target/mips/insn-pseudo-4.c: New test case.
* gcc.target/mips/insn-tablejump.c: New test case.
From-SVN: r242424
* doc/xml/manual/intro.xml: Document LWG 2742 status.
* doc/html/*: Regenerate.
* include/bits/basic_string.h
(basic_string(const T&, size_type, size_type, const Allocator&)): Add
constructor for substring of basic_string_view, as per LWG 2742 but
with additional constraint to fix ambiguity.
* testsuite/21_strings/basic_string/cons/char/9.cc: New test.
* testsuite/21_strings/basic_string/cons/wchar_t/9.cc: New test.
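A short usage sketch of the substring constructor, assuming a C++17 library. The wrapper function is hypothetical and exists only to give the example a name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical wrapper: build a std::string from at most N characters of
// SV starting at POS, using the basic_string constructor that takes a
// string_view-convertible object plus a position and a length.
std::string
substring_of_view (std::string_view sv, std::size_t pos, std::size_t n)
{
  assert (pos <= sv.size ());
  return std::string (sv, pos, n);
}
```

For example, substring_of_view ("hello world", 6, 5) yields "world".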
From-SVN: r242416
* doc/xml/manual/intro.xml: Document LWG 2748 status.
* include/std/optional (optional<T>::swap): Use is_nothrow_swappable_v
for exception specification.
(swap(optional<T>&, optional<T>&)): Disable when T is not swappable.
* testsuite/20_util/optional/swap/2.cc: New test.
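The observable effect can be sketched with a few compile-time checks, assuming a C++17 library that implements LWG 2748. The not_movable type is hypothetical and is not taken from the testsuite.

```cpp
#include <optional>
#include <type_traits>
#include <utility>

// Hypothetical type that can be neither moved nor copied, and therefore
// cannot be swapped.
struct not_movable
{
  not_movable () = default;
  not_movable (not_movable &&) = delete;
};

// An optional of a swappable type is itself swappable...
static_assert (std::is_swappable_v<std::optional<int>>,
               "optional<int> is swappable");
// ...but swap is disabled when the contained type is not swappable.
static_assert (!std::is_swappable_v<std::optional<not_movable>>,
               "optional<not_movable> is not swappable");
// The member swap does not throw when swapping the contained type cannot.
static_assert (noexcept (std::declval<std::optional<int> &> ()
                           .swap (std::declval<std::optional<int> &> ())),
               "optional<int>::swap is noexcept");
```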
From-SVN: r242415