This removes debug-bind resets aka
# DEBUG b = NULL
when the reset variable is otherwise unused. I've gathered statistics
for a single TU, fold-const.ii which at -O2 -g shows
28 ssa "dead debug bind reset" 1
34 einline "dead debug bind reset" 340
54 release_ssa "dead debug bind reset" 176
54 release_ssa "live debug bind reset of dead var" 4
86 inline "dead debug bind reset" 5131
86 inline "live debug bind reset of dead var" 61
241 optimized "dead debug bind reset" 970
241 optimized "live debug bind reset of dead var" 287
where "live debug bind reset of dead var" means the variable is unused
but there were debug binds with a value for them and
"dead debug bind reset" means the variable is unused and there were
only debug bind resets (each reset of the same variable is counted
for both counters). This shows A considerable amount of dead stmts
removed esp. after IPA inlining.
2020-05-12 Richard Biener <rguenther@suse.de>
* tree-ssa-live.c (remove_unused_locals): Remove dead debug
bind resets.
The testcase only works for little-endian, mark it so.
2020-05-12 Richard Biener <rguenther@suse.de>
PR middle-end/94988
* gcc.dg/torture/pr94988.c: Disable runtime test for
* non-little-endian.
gcc/ChangeLog:
2020-05-12 Jozef Lawrynowicz <jozef.l@mittosystems.com>
* config/msp430/msp430-protos.h (msp430_output_aligned_decl_common):
Update prototype to include "local" argument.
* config/msp430/msp430.c (msp430_output_aligned_decl_common): Add
"local" argument. Handle local common decls.
* config/msp430/msp430.h (ASM_OUTPUT_ALIGNED_DECL_COMMON): Adjust
msp430_output_aligned_decl_common call with 0 for "local" argument.
(ASM_OUTPUT_ALIGNED_DECL_LOCAL): Define.
gcc/testsuite/ChangeLog:
2020-05-12 Jozef Lawrynowicz <jozef.l@mittosystems.com>
* gcc.c-torture/execute/noinit-attribute.c: Skip for msp430
in the large memory model.
This fixes an oversight in the new canonicalization code for packable
types: it does not take into account the scalar storage order.
PR ada/95035
* gcc-interface/utils.c (packable_type_hasher::equal): Also compare
the scalar storage order.
(hash_packable_type): Also hash the scalar storage order.
(hash_pad_type): Likewise.
This moves EDGE_DFS_BACK to the appropriate edge when the split
edge had it set.
2020-05-12 Richard Biener <rguenther@suse.de>
* cfghooks.c (split_edge): Preserve EDGE_DFS_BACK if set.
The r11-15 change broke this testcase, as it now asserts type is equal to
the type of the DECL_VALUE_EXPR, but for DECL_OMP_PRIVATIZED_MEMBER artificial
vars mapping to bitfields it wasn't. Fixed by changing the
DECL_OMP_PRIVATIZED_MEMBER var type in that case.
2020-05-12 Jakub Jelinek <jakub@redhat.com>
PR c++/95063
* pt.c (tsubst_decl): Deal with DECL_OMP_PRIVATIZED_MEMBER for
a bit-field.
* g++.dg/gomp/pr95063.C: New test.
This third patch of three actually fixes the PR. We were using
8-bit BIT_FIELD_REFs to access single-bit elements, and multiplying
the vector index by 8 bits rather than 1 bit.
2020-05-12 Richard Sandiford <richard.sandiford@arm.com>
gcc/
PR tree-optimization/94980
* tree-vect-generic.c (expand_vector_comparison): Use
vector_element_bits_tree to get the element size in bits,
rather than using TYPE_SIZE.
(expand_vector_condition, vector_element): Likewise.
gcc/testsuite/
PR tree-optimization/94980
* gcc.target/i386/pr94980.c: New test.
This patch makes build_replicated_const take the number of bits
in VALUE rather than calculating the width from the element type.
The callers can then use vector_element_bits to calculate the
correct element size from the vector type.
2020-05-12 Richard Sandiford <richard.sandiford@arm.com>
gcc/
PR tree-optimization/94980
* tree-vect-generic.c (build_replicated_const): Take the number
of bits as a parameter, instead of the type of the elements.
(do_plus_minus): Update accordingly, using vector_element_bits
to calculate the correct number of bits.
(do_negate): Likewise.
A lot of code that wants to know the number of bits in a vector
element gets that information from the element's TYPE_SIZE,
which is always equal to TYPE_SIZE_UNIT * BITS_PER_UNIT.
This doesn't work for SVE and AVX512-style packed boolean vectors,
where several elements can occupy a single byte.
This patch introduces a new pair of helpers for getting the true
(possibly sub-byte) size. I made a token attempt to convert obvious
element size calculations, but I'm sure I missed some.
2020-05-12 Richard Sandiford <richard.sandiford@arm.com>
gcc/
PR tree-optimization/94980
* tree.h (vector_element_bits, vector_element_bits_tree): Declare.
* tree.c (vector_element_bits, vector_element_bits_tree): New.
* match.pd: Use the new functions instead of determining the
vector element size directly from TYPE_SIZE(_UNIT).
* tree-vect-data-refs.c (vect_gather_scatter_fn_p): Likewise.
* tree-vect-patterns.c (vect_recog_mask_conversion_pattern): Likewise.
* tree-vect-stmts.c (vect_is_simple_cond): Likewise.
* tree-vect-generic.c (expand_vector_piecewise): Likewise.
(expand_vector_conversion): Likewise.
(expand_vector_addition): Likewise for a TYPE_SIZE_UNIT used as
a divisor. Convert the dividend to bits to compensate.
* tree-vect-loop.c (vectorizable_live_operation): Call
vector_element_bits instead of open-coding it.
This attempts to implement what the OpenMP 5.0 spec in declare target section
says as ammended by the 5.1 changes so far (related to device_type(host)), except
that it doesn't have the device(ancestor: ...) handling yet because we do not
support it yet, and I've left so far out the except lambda note, because I need
that clarified.
2020-05-12 Jakub Jelinek <jakub@redhat.com>
* omp-offload.h (omp_discover_implicit_declare_target): Declare.
* omp-offload.c: Include context.h.
(omp_declare_target_fn_p, omp_declare_target_var_p,
omp_discover_declare_target_fn_r, omp_discover_declare_target_var_r,
omp_discover_implicit_declare_target): New functions.
* cgraphunit.c (analyze_functions): Call
omp_discover_implicit_declare_target.
* testsuite/libgomp.c/target-39.c: New test.
This canonicalizes those to a constant literal.
2020-05-12 Richard Biener <rguenther@suse.de>
* gimple-fold.c (maybe_canonicalize_mem_ref_addr): Canonicalize
literal constant &MEM[..] to a constant literal.
Since we apply SM to an edge which exits multiple loops we have
to make sure to commit insertions on it immediately since otherwise
store order is not preserved.
2020-05-12 Richard Biener <rguenther@suse.de>
PR tree-optimization/95045
* dbgcnt.def (lim): Add debug-counter.
* tree-ssa-loop-im.c: Include dbgcnt.h.
(find_refs_for_sm): Use lim debug counter for store motion
candidates.
(do_store_motion): Rename form store_motion. Commit edge
insertions...
(store_motion_loop): ... here.
(tree_ssa_lim): Adjust.
* include/bits/atomic_base.h (atomic_flag): Implement test member
function.
* include/std/version: Define __cpp_lib_atomic_flag_test.
* testsuite/29_atomics/atomic_flag/test/explicit.cc: New file.
* testsuite/29_atomics/atomic_flag/test/implicit.cc: New file.
Changes to the built-in specification occurred after early patches
added support for these. The name of vec_clzm became vec_cntlzm,
and vec_ctzm became vec_cnttzm. Four of the overloaded forms of
vec_gnb were removed, and the fourth argument redefined as an
unsigned int, not an unsigned char. This patch reflects those
changes in the code and test cases. Eight of the vec_gnb test
cases are removed as a result.
[gcc]
2020-05-11 Kelvin Nilsen <kelvin@gcc.gnu.org>
* config/rs6000/altivec.h (vec_clzm): Rename to vec_cntlzm.
(vec_ctzm): Rename to vec_cnttzm.
* config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin):
Change fourth operand for vec_ternarylogic to require
compatibility with unsigned SImode rather than unsigned QImode.
* config/rs6000/rs6000-call.c (altivec_overloaded_builtins):
Remove overloaded forms of vec_gnb that are no longer needed.
* doc/extend.texi (PowerPC AltiVec Built-in Functions Available
for a Future Architecture): Replace vec_clzm with vec_cntlzm;
replace vec_ctzm with vec_cntlzm; remove four unwanted forms of
vec_gnb; move vec_ternarylogic documentation into this section
and replace const unsigned char with const unsigned int as its
fourth argument.
[gcc/testsuite]
2020-05-11 Kelvin Nilsen <kelvin@gcc.gnu.org>
* gcc.target/powerpc/vec-clzm-0.c: Rename to...
* gcc.target/powerpc/vec-cntlzm-0.c: ...this.
* gcc.target/powerpc/vec-clzm-1.c: Rename to...
* gcc.target/powerpc/vec-cntlzm-1.c: ...this.
* gcc.target/powerpc/vec-ctzm-0.c: Rename to...
* gcc.target/powerpc/vec-cnttzm-0.c: ...this.
* gcc.target/powerpc/vec-ctzm-1.c: Rename to...
* gcc.target/powerpc/vec-cnttzm-1.c: ...this.
* gcc.target/powerpc/vec-gnb-8.c: Rename to...
* gcc.target/powerpc/vec-gnb-0.c: ...this, deleting the old file.
* gcc.target/powerpc/vec-gnb-9.c: Rename to...
* gcc.target/powerpc/vec-gnb-1.c: ...this, deleting the old file.
* gcc.target/powerpc/vec-gnb-10.c: Rename to...
* gcc.target/powerpc/vec-gnb-2.c: ...this, deleting the old file.
* gcc.target/powerpc/vec-gnb-3.c: Delete.
* gcc.target/powerpc/vec-gnb-4.c: Delete.
* gcc.target/powerpc/vec-gnb-5.c: Delete.
* gcc.target/powerpc/vec-gnb-6.c: Delete.
* gcc.target/powerpc/vec-gnb-7.c: Delete.
Add support for xxgenpcv[dw]m, along with individual and overloaded
built-in functions for access.
[gcc]
2020-05-11 Carl Love <cel@us.ibm.com>
* config/rs6000/altivec.h (vec_genpcvm): New #define.
* config/rs6000/rs6000-builtin.def (XXGENPCVM_V16QI): New built-in
instantiation.
(XXGENPCVM_V8HI): Likewise.
(XXGENPCVM_V4SI): Likewise.
(XXGENPCVM_V2DI): Likewise.
(XXGENPCVM): New overloaded built-in instantiation.
* config/rs6000/rs6000-call.c (altivec_overloaded_builtins): Add
entries for FUTURE_BUILTIN_VEC_XXGENPCVM.
(altivec_expand_builtin): Add special handling for
FUTURE_BUILTIN_VEC_XXGENPCVM.
(builtin_function_type): Add handling for
FUTURE_BUILTIN_XXGENPCVM_{V16QI,V8HI,V4SI,V2DI}.
* config/rs6000/vsx.md (VSX_EXTRACT_I4): New mode iterator.
(UNSPEC_XXGENPCV): New constant.
(xxgenpcvm_<mode>_internal): New insn.
(xxgenpcvm_<mode>): New expansion.
* doc/extend.texi: Add documentation for vec_genpcvm built-ins.
[gcc/testsuite]
2020-05-11 Carl Love <cel@us.ibm.com>
* gcc.target/powerpc/xxgenpc-runnable.c: New.
The expected result of TestCallersNilPointerPanic has changed in
GoLLVM. This CL makes some elements of the expected result optional
so that this test passes in both gccgo and GoLLVM.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/230138
Add the xxeval insn and access it via the vec_ternarylogic built-in
function. As part of this, add support to the built-in function
infrastructure for functions that take four arguments.
[gcc]
2020-05-11 Kelvin Nilsen <wschmidt@linux.ibm.com>
* config/rs6000/altivec.h (vec_ternarylogic): New #define.
* config/rs6000/altivec.md (UNSPEC_XXEVAL): New constant.
(xxeval): New insn.
* config/rs6000/predicates.md (u8bit_cint_operand): New predicate.
* config/rs6000/rs6000-builtin.def: Add handling of new macro
RS6000_BUILTIN_4.
(BU_FUTURE_V_4): New macro. Use it.
(BU_FUTURE_OVERLOAD_4): Likewise.
* config/rs6000/rs6000-c.c (altivec_build_resolved_builtin): Add
handling for quaternary built-in functions.
(altivec_resolve_overloaded_builtin): Add special-case handling
for __builtin_vec_xxeval.
* config/rs6000/rs6000-call.c: Add handling of new macro
RS6000_BUILTIN_4 in initialization of rs6000_builtin_info,
bdesc0_arg, bdesc1_arg, bdesc2_arg, bdesc_3arg,
bdesc_altivec_preds, bdesc_abs, and bdesc_htm arrays.
(altivec_overloaded_builtins): Add definitions for
FUTURE_BUILTIN_VEC_XXEVAL.
(bdesc_4arg): New array.
(htm_expand_builtin): Add handling for quaternary built-in
functions.
(rs6000_expand_quaternop_builtin): New function.
(rs6000_expand_builtin): Add handling for quaternary built-in
functions.
(rs6000_init_builtins): Initialize builtin_mode_to_type entries
for unsigned QImode and unsigned HImode.
(builtin_quaternary_function_type): New function.
(rs6000_common_init_builtins): Add handling of quaternary
operations.
* config/rs6000/rs6000.h (RS6000_BTC_QUATERNARY): New defined
constant.
(RS6000_BTC_PREDICATE): Change value of constant.
(RS6000_BTC_ABS): Likewise.
(rs6000_builtins): Add support for new macro RS6000_BUILTIN_4.
* doc/extend.texi (PowerPC AltiVec Built-In Functions Available
for a Future Architecture): Add description of vec_ternarylogic
built-in function.
[gcc/testsuite]
2020-05-11 Kelvin Nilsen <wschmidt@linux.ibm.com>
* gcc.target/powerpc/vec-ternarylogic-0.c: New.
* gcc.target/powerpc/vec-ternarylogic-1.c: New.
* gcc.target/powerpc/vec-ternarylogic-10.c: New.
* gcc.target/powerpc/vec-ternarylogic-2.c: New.
* gcc.target/powerpc/vec-ternarylogic-3.c: New.
* gcc.target/powerpc/vec-ternarylogic-4.c: New.
* gcc.target/powerpc/vec-ternarylogic-5.c: New.
* gcc.target/powerpc/vec-ternarylogic-6.c: New.
* gcc.target/powerpc/vec-ternarylogic-7.c: New.
* gcc.target/powerpc/vec-ternarylogic-8.c: New.
* gcc.target/powerpc/vec-ternarylogic-9.c: New.
Add support for new scalar instructions for counting leading or
trailing zeros under control of a bitmask.
[gcc]
2020-05-11 Kelvin Nilsen <kelvin@gcc.gnu.org>
* config/rs6000/rs6000-builtin.def (__builtin_cntlzdm): New
built-in function definition.
(__builtin_cnttzdm): Likewise.
* config/rs6000/rs6000.md (UNSPEC_CNTLZDM): New constant.
(UNSPEC_CNTTZDM): Likewise.
(cntlzdm): New insn.
(cnttzdm): Likewise.
* doc/extend.texi (Basic PowerPC Built-in Functions available for
a Future Architecture): Add descriptions of __builtin_cntlzdm and
__builtin_cnttzdm functions.
[gcc/testsuite]
2020-05-11 Kelvin Nilsen <kelvin@gcc.gnu.org>
* gcc.target/powerpc/cntlzdm-0.c: New test.
* gcc.target/powerpc/cntlzdm-1.c: New test.
* gcc.target/powerpc/cnttzdm-0.c: New test.
* gcc.target/powerpc/cnttzdm-1.c: New test.
The resolution of comment CA104 clarifies that we need to do direct
substitution of constraints in order to determine which member template
corresponds to an explicit specialization.
gcc/cp/ChangeLog
2020-05-11 Jason Merrill <jason@redhat.com>
Resolve C++20 NB comment CA104
* pt.c (determine_specialization): Compare constraints for
specialization of member template of class instantiation.
While looking at 92583/92654 it occurred to me that typename types needed
the same fix. So extract_locals_r also needs to see the TYPE_CONTEXT of a
TYPENAME_TYPE. But it must not look through a typedef.
Most tree walking in the front end wants to walk through the syntactic form
of a type of expression, and doesn't care about the type referred to by a
typedef. But min_vis_r does care.
gcc/cp/ChangeLog
2020-05-11 Jason Merrill <jason@redhat.com>
PR c++/92583
PR c++/92654
* tree.c (cp_walk_subtrees): Stop at typedefs.
Handle TYPENAME_TYPE here.
* pt.c (find_parameter_packs_r): Not here.
(for_each_template_parm_r): Clear *walk_subtrees.
* decl2.c (min_vis_r): Look through typedefs.
This improves the diagnostic from
error: could not convert ‘((A<>*)(void)0)->A<>::e’ from
‘<unresolved overloaded function type>’ to ‘bool’
to
error: cannot convert ‘A<>::e’ from type ‘void (A<>::)()’ to type ‘bool’
gcc/cp/ChangeLog
2020-05-11 Jason Merrill <jason@redhat.com>
* call.c (implicit_conversion_error): Split out from...
(perform_implicit_conversion_flags): ...here.
(build_converted_constant_expr_internal): Use it.
We were incorrectly accepting the use of 'this' at parse time and then
crashing when we tried to instantiate it. It is invalid because 'this' is
not in scope until after the function-cv-quals. So let's hoist setting
current_class_ptr up from cp_parser_late_return_type_opt into
cp_parser_direct_declarator where it can work for noexcept as well.
gcc/cp/ChangeLog
2020-05-11 Jason Merrill <jason@redhat.com>
PR c++/90748
* parser.c (inject_parm_decls): Set current_class_ptr here.
(cp_parser_direct_declarator): And here.
(cp_parser_late_return_type_opt): Not here.
(cp_parser_noexcept_specification_opt): Nor here.
(cp_parser_exception_specification_opt)
(cp_parser_late_noexcept_specifier): Remove unneeded parameters.
The fix for PR 93499 introduced a too strict check in gfc_divide
that could trigger errors in the early parsing phase. Relax the
check and defer to a later stage.
gcc/fortran/
2020-05-11 Harald Anlauf <anlauf@gmx.de>
PR fortran/95053
* arith.c (gfc_divide): Do not error out if operand 2 is
non-numeric. Defer checks to later stage.
gcc/testsuite/
2020-05-11 Harald Anlauf <anlauf@gmx.de>
PR fortran/95053
* gfortran.dg/pr95053.f: New test.
If a program has no other dependencies on libstdc++, we shouldn't require it
just for __cxa_pure_virtual, which is only there to give a prettier
diagnostic before crashing the program; resolving the reference to NULL will
also crash, just without the diagnostic.
gcc/cp/ChangeLog
2020-05-11 Jason Merrill <jason@redhat.com>
* decl.c (cxx_init_decl_processing): Call declare_weak for
__cxa_pure_virtual.
We weren't printing the condition and message of a STATIC_ASSERT.
It's also unnecessary to duplicate the code for instantiating a
STATIC_ASSERT between tsubst_expr and instantiate_class_template_1.
gcc/cp/ChangeLog
2020-05-11 Jason Merrill <jason@redhat.com>
* pt.c (instantiate_class_template_1): Call tsubst_expr for
STATIC_ASSERT member.
* ptree.c (cxx_print_xnode): Handle STATIC_ASSERT.
We walk the lambda captures in cp_walk_subtrees, so we don't also need to
walk them here.
gcc/cp/ChangeLog
2020-05-11 Jason Merrill <jason@redhat.com>
* pt.c (find_parameter_packs_r) [LAMBDA_EXPR]: Remove redundant
walking of capture list.
This flag is redundant with the explicit_targs field in the overload
candidate information.
gcc/cp/ChangeLog
2020-05-11 Jason Merrill <jason@redhat.com>
* cp-tree.h (LOOKUP_EXPLICIT_TMPL_ARGS): Remove.
* call.c (build_new_function_call): Don't set it.
(build_new_method_call_1): Likewise.
(build_over_call): Check cand->explicit_targs instead.
If we put the SAVE_EXPR for a VLA size inside the MINUS_EXPR rather than
outside, it will work better with constant folding.
The equivalent change was made in the C front-end in 2004, in commit
r0-64535-g8b0b9aefd29dfe6398857bcf5628662e2f0e21f6
gcc/cp/ChangeLog
2020-05-11 Jason Merrill <jason@redhat.com>
* decl.c (compute_array_index_type_loc): Stabilize before building
the MINUS_EXPR.
There's no need to warn that a deprecated function uses a deprecated type,
that just adds noise. We were preventing that in start_decl, but that
didn't help member declarations that go through grokfield. So handle it in
grokdeclarator instead, which is shared between them.
gcc/cp/ChangeLog
2020-05-11 Jason Merrill <jason@redhat.com>
* decl.c (grokdeclarator): Adjust deprecated_state here.
(start_decl): Not here.
Add the new vector centrifuge-doubleword instruction and built-in
function access.
[gcc]
2020-05-11 Kelvin Nilsen <kelvin@gcc.gnu.org>
* config/rs6000/altivec.h (vec_cfuge): New #define.
* config/rs6000/altivec.md (UNSPEC_VCFUGED): New constant.
(vcfuged): New insn.
* config/rs6000/rs6000-builtin.def (__builtin_altivec_vcfuged):
New built-in function.
* config/rs6000/rs6000-call.c (builtin_function_type): Add
handling for FUTURE_BUILTIN_VCFUGED case.
* doc/extend.texi (PowerPC AltiVec Built-in Functions Available
for a Future Architecture): Add description of vec_cfuge built-in
function.
[gcc/testsuite]
2020-05-11 Kelvin Nilsen <kelvin@gcc.gnu.org>
* gcc.target/powerpc/vec-cfuged-0.c: New test.
* gcc.target/powerpc/vec-cfuged-1.c: New test.
Add the centifuge-doubleword instruction and built-in access.
[gcc]
2020-05-11 Kelvin Nilsen <kelvin@gcc.gnu.org>
* config/rs6000/rs6000-builtin.def (BU_FUTURE_MISC_0): New
#define.
(BU_FUTURE_MISC_1): Likewise.
(BU_FUTURE_MISC_2): Likewise.
(BU_FUTURE_MISC_3): Likewise.
(__builtin_cfuged): New built-in function definition.
* config/rs6000/rs6000.md (UNSPEC_CFUGED): New constant.
(cfuged): New insn.
* doc/extend.texi (Basic PowerPC Built-in Functions Available for
a Future Architecture): New subsubsection.
[gcc/testsuite]
2020-05-11 Kelvin Nilsen <kelvin@gcc.gnu.org>
* gcc.target.powerpc/cfuged-0.c: New test.
* gcc.target.powerpc/cfuged-1.c: New test.
This rejects lattice changes from one constant to another.
2020-05-11 Richard Biener <rguenther@suse.de>
PR tree-optimization/95049
* tree-ssa-sccvn.c (set_ssa_val_to): Reject lattice transition
between different constants.
* gcc.dg/torture/pr95049.c: New testcase.
AVX512-style masks and SVE-style predicates can be difficult
to debug in gimple dumps, since the types are printed like this:
vector(4) <unnamed type> foo;
Some important details are hidden by that <unnamed type>,
such as the number of bits in an element and whether the type
is signed or unsigned.
This patch uses an ad-hoc syntax for printing unnamed
boolean types. Normal frontend ones should be handled
by the earlier TYPE_NAME code.
2020-05-11 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-pretty-print.c (dump_generic_node): Handle BOOLEAN_TYPEs.
Add support for the vgnb instruction, which gathers every Nth bit
per vector element.
[gcc]
2020-05-11 Kelvin Nilsen <kelvin@gcc.gnu.org>
Bill Schmidt <wschmidt@linux.ibm.com>
* config/rs6000/altivec.h (vec_gnb): New #define.
* config/rs6000/altivec.md (UNSPEC_VGNB): New constant.
(vgnb): New insn.
* config/rs6000/rs6000-builtin.def (BU_FUTURE_OVERLOAD_1): New
#define.
(BU_FUTURE_OVERLOAD_2): Likewise.
(BU_FUTURE_OVERLOAD_3): Likewise.
(__builtin_altivec_gnb): New built-in function.
(__buiiltin_vec_gnb): New overloaded built-in function.
* config/rs6000/rs6000-call.c (altivec_overloaded_builtins):
Define overloaded forms of __builtin_vec_gnb.
(rs6000_expand_binop_builtin): Add error checking for 2nd argument
of __builtin_vec_gnb.
(builtin_function_type): Mark return value and arguments unsigned
for FUTURE_BUILTIN_VGNB.
* doc/extend.texi (PowerPC AltiVec Built-in Functions Available
for a Future Architecture): Add description of vec_gnb built-in
function.
[gcc/testsuite]
2020-05-11 Kelvin Nilsen <kelvin@gcc.gnu.org>
Bill Schmidt <wschmidt@linux.ibm.com>
* gcc.target/powerpc/vec-gnb-0.c: New test.
* gcc.target/powerpc/vec-gnb-1.c: New test.
* gcc.target/powerpc/vec-gnb-10.c: New test.
* gcc.target/powerpc/vec-gnb-2.c: New test.
* gcc.target/powerpc/vec-gnb-3.c: New test.
* gcc.target/powerpc/vec-gnb-4.c: New test.
* gcc.target/powerpc/vec-gnb-5.c: New test.
* gcc.target/powerpc/vec-gnb-6.c: New test.
* gcc.target/powerpc/vec-gnb-7.c: New test.
* gcc.target/powerpc/vec-gnb-8.c: New test.
* gcc.target/powerpc/vec-gnb-9.c: New test.
Add support for the vpdepd and vpextd instructions that perform
vector parallel bit deposit and vector parallel bit extract.
[gcc]
2020-05-11 Kelvin Nilsen <kelvin@gcc.gnu.org>
Bill Schmidt <wschmidt@linux.ibm.com>
* config/rs6000/altivec.h (vec_pdep): New macro implementing new
built-in function.
(vec_pext): Likewise.
* config/rs6000/altivec.md (UNSPEC_VPDEPD): New constant.
(UNSPEC_VPEXTD): Likewise.
(vpdepd): New insn.
(vpextd): Likewise.
* config/rs6000/rs6000-builtin.def (__builtin_altivec_vpdepd): New
built-in function.
(__builtin_altivec_vpextd): Likewise.
* config/rs6000/rs6000-call.c (builtin_function_type): Add
handling for FUTURE_BUILTIN_VPDEPD and FUTURE_BUILTIN_VPEXTD
cases.
* doc/extend.texi (PowerPC Altivec Built-in Functions Available
for a Future Architecture): Add description of vec_pdep and
vec_pext built-in functions.
[gcc/testsuite]
2020-05-11 Kelvin Nilsen <kelvin@gcc.gnu.org>
* gcc.target/powerpc/vec-pdep-0.c: New.
* gcc.target/powerpc/vec-pdep-1.c: New.
* gcc.target/powerpc/vec-pext-0.c: New.
* gcc.target/powerpc/vec-pext-1.c: New.
Add support for new vclzdm and vctzdm vector instructions that
count leading and trailing zeros under control of a mask.
[gcc]
2020-05-11 Kelvin Nilsen <kelvin@gcc.gnu.org>
Bill Schmidt <wschmidt@linux.ibm.com>
* config/rs6000/altivec.h (vec_clzm): New macro.
(vec_ctzm): Likewise.
* config/rs6000/altivec.md (UNSPEC_VCLZDM): New constant.
(UNSPEC_VCTZDM): Likewise.
(vclzdm): New insn.
(vctzdm): Likewise.
* config/rs6000/rs6000-builtin.def (BU_FUTURE_V_0): New macro.
(BU_FUTURE_V_1): Likewise.
(BU_FUTURE_V_2): Likewise.
(BU_FUTURE_V_3): Likewise.
(__builtin_altivec_vclzdm): New builtin definition.
(__builtin_altivec_vctzdm): Likewise.
* config/rs6000/rs6000-c.c (rs6000_target_modify_macros): Cause
_ARCH_PWR_FUTURE macro to be defined if OPTION_MASK_FUTURE flag is
set.
* config/rs6000/rs6000-call.c (builtin_function_type): Set return
value and parameter types to be unsigned for VCLZDM and VCTZDM.
* config/rs6000/rs6000.c (rs6000_builtin_mask_calculate): Add
support for TARGET_FUTURE flag.
* config/rs6000/rs6000.h (RS6000_BTM_FUTURE): New macro constant.
* doc/extend.texi (PowerPC Altivec Built-in Functions Available
for a Future Architecture): New subsubsection.
[gcc/testsuite]
2020-05-11 Kelvin Nilsen <kelvin@gcc.gnu.org>
* gcc.target/powerpc/vec-clzm-0.c: New test.
* gcc.target/powerpc/vec-clzm-1.c: New test.
* gcc.target/powerpc/vec-ctzm-0.c: New test.
* gcc.target/powerpc/vec-ctzm-1.c: New test.
This enhances store-order preserving store motion to handle the case
of non-invariant dependent stores in the sequence of unconditionally
executed stores on exit by re-issueing them as part of the sequence
of stores on the exit. This fixes the observed regression of
gcc.target/i386/pr64110.c which relies on store-motion of 'b'
for a loop like
for (int i = 0; i < j; ++i)
*b++ = x;
where for correctness we now no longer apply store-motion. With
the patch we emit the correct
tem = b;
for (int i = 0; i < j; ++i)
{
tem = tem + 1;
*tem = x;
}
b = tem;
*tem = x;
preserving the original order of stores. A testcase reflecting
the miscompilation done by earlier GCC is added as well.
This also fixes the reported ICE in PR95025 and adds checking code
to catch it earlier - the issue was not-supported refs propagation
leaving stray refs in the sequence.
2020-05-11 Richard Biener <rguenther@suse.de>
PR tree-optimization/94988
PR tree-optimization/95025
* tree-ssa-loop-im.c (seq_entry): Make a struct, add from.
(sm_seq_push_down): Take extra parameter denoting where we
moved the ref to.
(execute_sm_exit): Re-issue sm_other stores in the correct
order.
(sm_seq_valid_bb): When always executed, allow sm_other to
prevail inbetween sm_ord and record their stored value.
(hoist_memory_references): Adjust refs_not_supported propagation
and prune sm_other from the end of the ordered sequences.
* gcc.dg/torture/pr94988.c: New testcase.
* gcc.dg/torture/pr95025.c: Likewise.
* gcc.dg/torture/pr95045.c: Likewise.
* g++.dg/asan/pr95025.C: New testcase.