Changes to the built-in specification occurred after early patches
added support for these. The name of vec_clzm became vec_cntlzm,
and vec_ctzm became vec_cnttzm. Four of the overloaded forms of
vec_gnb were removed, and the fourth argument redefined as an
unsigned int, not an unsigned char. This patch reflects those
changes in the code and test cases. Eight of the vec_gnb test
cases are removed as a result.
[gcc]
2020-05-11 Kelvin Nilsen <kelvin@gcc.gnu.org>
* config/rs6000/altivec.h (vec_clzm): Rename to vec_cntlzm.
(vec_ctzm): Rename to vec_cnttzm.
* config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin):
Change fourth operand for vec_ternarylogic to require
compatibility with unsigned SImode rather than unsigned QImode.
* config/rs6000/rs6000-call.c (altivec_overloaded_builtins):
Remove overloaded forms of vec_gnb that are no longer needed.
* doc/extend.texi (PowerPC AltiVec Built-in Functions Available
for a Future Architecture): Replace vec_clzm with vec_cntlzm;
replace vec_ctzm with vec_cntlzm; remove four unwanted forms of
vec_gnb; move vec_ternarylogic documentation into this section
and replace const unsigned char with const unsigned int as its
fourth argument.
[gcc/testsuite]
2020-05-11 Kelvin Nilsen <kelvin@gcc.gnu.org>
* gcc.target/powerpc/vec-clzm-0.c: Rename to...
* gcc.target/powerpc/vec-cntlzm-0.c: ...this.
* gcc.target/powerpc/vec-clzm-1.c: Rename to...
* gcc.target/powerpc/vec-cntlzm-1.c: ...this.
* gcc.target/powerpc/vec-ctzm-0.c: Rename to...
* gcc.target/powerpc/vec-cnttzm-0.c: ...this.
* gcc.target/powerpc/vec-ctzm-1.c: Rename to...
* gcc.target/powerpc/vec-cnttzm-1.c: ...this.
* gcc.target/powerpc/vec-gnb-8.c: Rename to...
* gcc.target/powerpc/vec-gnb-0.c: ...this, deleting the old file.
* gcc.target/powerpc/vec-gnb-9.c: Rename to...
* gcc.target/powerpc/vec-gnb-1.c: ...this, deleting the old file.
* gcc.target/powerpc/vec-gnb-10.c: Rename to...
* gcc.target/powerpc/vec-gnb-2.c: ...this, deleting the old file.
* gcc.target/powerpc/vec-gnb-3.c: Delete.
* gcc.target/powerpc/vec-gnb-4.c: Delete.
* gcc.target/powerpc/vec-gnb-5.c: Delete.
* gcc.target/powerpc/vec-gnb-6.c: Delete.
* gcc.target/powerpc/vec-gnb-7.c: Delete.
Add support for xxgenpcv[dw]m, along with individual and overloaded
built-in functions for access.
[gcc]
2020-05-11 Carl Love <cel@us.ibm.com>
* config/rs6000/altivec.h (vec_genpcvm): New #define.
* config/rs6000/rs6000-builtin.def (XXGENPCVM_V16QI): New built-in
instantiation.
(XXGENPCVM_V8HI): Likewise.
(XXGENPCVM_V4SI): Likewise.
(XXGENPCVM_V2DI): Likewise.
(XXGENPCVM): New overloaded built-in instantiation.
* config/rs6000/rs6000-call.c (altivec_overloaded_builtins): Add
entries for FUTURE_BUILTIN_VEC_XXGENPCVM.
(altivec_expand_builtin): Add special handling for
FUTURE_BUILTIN_VEC_XXGENPCVM.
(builtin_function_type): Add handling for
FUTURE_BUILTIN_XXGENPCVM_{V16QI,V8HI,V4SI,V2DI}.
* config/rs6000/vsx.md (VSX_EXTRACT_I4): New mode iterator.
(UNSPEC_XXGENPCV): New constant.
(xxgenpcvm_<mode>_internal): New insn.
(xxgenpcvm_<mode>): New expansion.
* doc/extend.texi: Add documentation for vec_genpcvm built-ins.
[gcc/testsuite]
2020-05-11 Carl Love <cel@us.ibm.com>
* gcc.target/powerpc/xxgenpc-runnable.c: New.
The expected result of TestCallersNilPointerPanic has changed in
GoLLVM. This CL makes some elements of the expected result optional
so that this test passes in both gccgo and GoLLVM.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/230138
Add the xxeval insn and access it via the vec_ternarylogic built-in
function. As part of this, add support to the built-in function
infrastructure for functions that take four arguments.
[gcc]
2020-05-11 Kelvin Nilsen <wschmidt@linux.ibm.com>
* config/rs6000/altivec.h (vec_ternarylogic): New #define.
* config/rs6000/altivec.md (UNSPEC_XXEVAL): New constant.
(xxeval): New insn.
* config/rs6000/predicates.md (u8bit_cint_operand): New predicate.
* config/rs6000/rs6000-builtin.def: Add handling of new macro
RS6000_BUILTIN_4.
(BU_FUTURE_V_4): New macro. Use it.
(BU_FUTURE_OVERLOAD_4): Likewise.
* config/rs6000/rs6000-c.c (altivec_build_resolved_builtin): Add
handling for quaternary built-in functions.
(altivec_resolve_overloaded_builtin): Add special-case handling
for __builtin_vec_xxeval.
* config/rs6000/rs6000-call.c: Add handling of new macro
RS6000_BUILTIN_4 in initialization of rs6000_builtin_info,
bdesc0_arg, bdesc1_arg, bdesc2_arg, bdesc_3arg,
bdesc_altivec_preds, bdesc_abs, and bdesc_htm arrays.
(altivec_overloaded_builtins): Add definitions for
FUTURE_BUILTIN_VEC_XXEVAL.
(bdesc_4arg): New array.
(htm_expand_builtin): Add handling for quaternary built-in
functions.
(rs6000_expand_quaternop_builtin): New function.
(rs6000_expand_builtin): Add handling for quaternary built-in
functions.
(rs6000_init_builtins): Initialize builtin_mode_to_type entries
for unsigned QImode and unsigned HImode.
(builtin_quaternary_function_type): New function.
(rs6000_common_init_builtins): Add handling of quaternary
operations.
* config/rs6000/rs6000.h (RS6000_BTC_QUATERNARY): New defined
constant.
(RS6000_BTC_PREDICATE): Change value of constant.
(RS6000_BTC_ABS): Likewise.
(rs6000_builtins): Add support for new macro RS6000_BUILTIN_4.
* doc/extend.texi (PowerPC AltiVec Built-In Functions Available
for a Future Architecture): Add description of vec_ternarylogic
built-in function.
[gcc/testsuite]
2020-05-11 Kelvin Nilsen <wschmidt@linux.ibm.com>
* gcc.target/powerpc/vec-ternarylogic-0.c: New.
* gcc.target/powerpc/vec-ternarylogic-1.c: New.
* gcc.target/powerpc/vec-ternarylogic-10.c: New.
* gcc.target/powerpc/vec-ternarylogic-2.c: New.
* gcc.target/powerpc/vec-ternarylogic-3.c: New.
* gcc.target/powerpc/vec-ternarylogic-4.c: New.
* gcc.target/powerpc/vec-ternarylogic-5.c: New.
* gcc.target/powerpc/vec-ternarylogic-6.c: New.
* gcc.target/powerpc/vec-ternarylogic-7.c: New.
* gcc.target/powerpc/vec-ternarylogic-8.c: New.
* gcc.target/powerpc/vec-ternarylogic-9.c: New.
Add support for new scalar instructions for counting leading or
trailing zeros under control of a bitmask.
[gcc]
2020-05-11 Kelvin Nilsen <kelvin@gcc.gnu.org>
* config/rs6000/rs6000-builtin.def (__builtin_cntlzdm): New
built-in function definition.
(__builtin_cnttzdm): Likewise.
* config/rs6000/rs6000.md (UNSPEC_CNTLZDM): New constant.
(UNSPEC_CNTTZDM): Likewise.
(cntlzdm): New insn.
(cnttzdm): Likewise.
* doc/extend.texi (Basic PowerPC Built-in Functions available for
a Future Architecture): Add descriptions of __builtin_cntlzdm and
__builtin_cnttzdm functions.
[gcc/testsuite]
2020-05-11 Kelvin Nilsen <kelvin@gcc.gnu.org>
* gcc.target/powerpc/cntlzdm-0.c: New test.
* gcc.target/powerpc/cntlzdm-1.c: New test.
* gcc.target/powerpc/cnttzdm-0.c: New test.
* gcc.target/powerpc/cnttzdm-1.c: New test.
The resolution of comment CA104 clarifies that we need to do direct
substitution of constraints in order to determine which member template
corresponds to an explicit specialization.
gcc/cp/ChangeLog
2020-05-11 Jason Merrill <jason@redhat.com>
Resolve C++20 NB comment CA104
* pt.c (determine_specialization): Compare constraints for
specialization of member template of class instantiation.
While looking at 92583/92654 it occurred to me that typename types needed
the same fix. So extract_locals_r also needs to see the TYPE_CONTEXT of a
TYPENAME_TYPE. But it must not look through a typedef.
Most tree walking in the front end wants to walk through the syntactic form
of a type of expression, and doesn't care about the type referred to by a
typedef. But min_vis_r does care.
gcc/cp/ChangeLog
2020-05-11 Jason Merrill <jason@redhat.com>
PR c++/92583
PR c++/92654
* tree.c (cp_walk_subtrees): Stop at typedefs.
Handle TYPENAME_TYPE here.
* pt.c (find_parameter_packs_r): Not here.
(for_each_template_parm_r): Clear *walk_subtrees.
* decl2.c (min_vis_r): Look through typedefs.
This improves the diagnostic from
error: could not convert ‘((A<>*)(void)0)->A<>::e’ from
‘<unresolved overloaded function type>’ to ‘bool’
to
error: cannot convert ‘A<>::e’ from type ‘void (A<>::)()’ to type ‘bool’
gcc/cp/ChangeLog
2020-05-11 Jason Merrill <jason@redhat.com>
* call.c (implicit_conversion_error): Split out from...
(perform_implicit_conversion_flags): ...here.
(build_converted_constant_expr_internal): Use it.
We were incorrectly accepting the use of 'this' at parse time and then
crashing when we tried to instantiate it. It is invalid because 'this' is
not in scope until after the function-cv-quals. So let's hoist setting
current_class_ptr up from cp_parser_late_return_type_opt into
cp_parser_direct_declarator where it can work for noexcept as well.
gcc/cp/ChangeLog
2020-05-11 Jason Merrill <jason@redhat.com>
PR c++/90748
* parser.c (inject_parm_decls): Set current_class_ptr here.
(cp_parser_direct_declarator): And here.
(cp_parser_late_return_type_opt): Not here.
(cp_parser_noexcept_specification_opt): Nor here.
(cp_parser_exception_specification_opt)
(cp_parser_late_noexcept_specifier): Remove unneeded parameters.
The fix for PR 93499 introduced a too strict check in gfc_divide
that could trigger errors in the early parsing phase. Relax the
check and defer to a later stage.
gcc/fortran/
2020-05-11 Harald Anlauf <anlauf@gmx.de>
PR fortran/95053
* arith.c (gfc_divide): Do not error out if operand 2 is
non-numeric. Defer checks to later stage.
gcc/testsuite/
2020-05-11 Harald Anlauf <anlauf@gmx.de>
PR fortran/95053
* gfortran.dg/pr95053.f: New test.
If a program has no other dependencies on libstdc++, we shouldn't require it
just for __cxa_pure_virtual, which is only there to give a prettier
diagnostic before crashing the program; resolving the reference to NULL will
also crash, just without the diagnostic.
gcc/cp/ChangeLog
2020-05-11 Jason Merrill <jason@redhat.com>
* decl.c (cxx_init_decl_processing): Call declare_weak for
__cxa_pure_virtual.
We weren't printing the condition and message of a STATIC_ASSERT.
It's also unnecessary to duplicate the code for instantiating a
STATIC_ASSERT between tsubst_expr and instantiate_class_template_1.
gcc/cp/ChangeLog
2020-05-11 Jason Merrill <jason@redhat.com>
* pt.c (instantiate_class_template_1): Call tsubst_expr for
STATIC_ASSERT member.
* ptree.c (cxx_print_xnode): Handle STATIC_ASSERT.
We walk the lambda captures in cp_walk_subtrees, so we don't also need to
walk them here.
gcc/cp/ChangeLog
2020-05-11 Jason Merrill <jason@redhat.com>
* pt.c (find_parameter_packs_r) [LAMBDA_EXPR]: Remove redundant
walking of capture list.
This flag is redundant with the explicit_targs field in the overload
candidate information.
gcc/cp/ChangeLog
2020-05-11 Jason Merrill <jason@redhat.com>
* cp-tree.h (LOOKUP_EXPLICIT_TMPL_ARGS): Remove.
* call.c (build_new_function_call): Don't set it.
(build_new_method_call_1): Likewise.
(build_over_call): Check cand->explicit_targs instead.
If we put the SAVE_EXPR for a VLA size inside the MINUS_EXPR rather than
outside, it will work better with constant folding.
The equivalent change was made in the C front-end in 2004, in commit
r0-64535-g8b0b9aefd29dfe6398857bcf5628662e2f0e21f6
gcc/cp/ChangeLog
2020-05-11 Jason Merrill <jason@redhat.com>
* decl.c (compute_array_index_type_loc): Stabilize before building
the MINUS_EXPR.
There's no need to warn that a deprecated function uses a deprecated type,
that just adds noise. We were preventing that in start_decl, but that
didn't help member declarations that go through grokfield. So handle it in
grokdeclarator instead, which is shared between them.
gcc/cp/ChangeLog
2020-05-11 Jason Merrill <jason@redhat.com>
* decl.c (grokdeclarator): Adjust deprecated_state here.
(start_decl): Not here.
Add the new vector centrifuge-doubleword instruction and built-in
function access.
[gcc]
2020-05-11 Kelvin Nilsen <kelvin@gcc.gnu.org>
* config/rs6000/altivec.h (vec_cfuge): New #define.
* config/rs6000/altivec.md (UNSPEC_VCFUGED): New constant.
(vcfuged): New insn.
* config/rs6000/rs6000-builtin.def (__builtin_altivec_vcfuged):
New built-in function.
* config/rs6000/rs6000-call.c (builtin_function_type): Add
handling for FUTURE_BUILTIN_VCFUGED case.
* doc/extend.texi (PowerPC AltiVec Built-in Functions Available
for a Future Architecture): Add description of vec_cfuge built-in
function.
[gcc/testsuite]
2020-05-11 Kelvin Nilsen <kelvin@gcc.gnu.org>
* gcc.target/powerpc/vec-cfuged-0.c: New test.
* gcc.target/powerpc/vec-cfuged-1.c: New test.
Add the centifuge-doubleword instruction and built-in access.
[gcc]
2020-05-11 Kelvin Nilsen <kelvin@gcc.gnu.org>
* config/rs6000/rs6000-builtin.def (BU_FUTURE_MISC_0): New
#define.
(BU_FUTURE_MISC_1): Likewise.
(BU_FUTURE_MISC_2): Likewise.
(BU_FUTURE_MISC_3): Likewise.
(__builtin_cfuged): New built-in function definition.
* config/rs6000/rs6000.md (UNSPEC_CFUGED): New constant.
(cfuged): New insn.
* doc/extend.texi (Basic PowerPC Built-in Functions Available for
a Future Architecture): New subsubsection.
[gcc/testsuite]
2020-05-11 Kelvin Nilsen <kelvin@gcc.gnu.org>
* gcc.target.powerpc/cfuged-0.c: New test.
* gcc.target.powerpc/cfuged-1.c: New test.
This rejects lattice changes from one constant to another.
2020-05-11 Richard Biener <rguenther@suse.de>
PR tree-optimization/95049
* tree-ssa-sccvn.c (set_ssa_val_to): Reject lattice transition
between different constants.
* gcc.dg/torture/pr95049.c: New testcase.
AVX512-style masks and SVE-style predicates can be difficult
to debug in gimple dumps, since the types are printed like this:
vector(4) <unnamed type> foo;
Some important details are hidden by that <unnamed type>,
such as the number of bits in an element and whether the type
is signed or unsigned.
This patch uses an ad-hoc syntax for printing unnamed
boolean types. Normal frontend ones should be handled
by the earlier TYPE_NAME code.
2020-05-11 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-pretty-print.c (dump_generic_node): Handle BOOLEAN_TYPEs.
Add support for the vgnb instruction, which gathers every Nth bit
per vector element.
[gcc]
2020-05-11 Kelvin Nilsen <kelvin@gcc.gnu.org>
Bill Schmidt <wschmidt@linux.ibm.com>
* config/rs6000/altivec.h (vec_gnb): New #define.
* config/rs6000/altivec.md (UNSPEC_VGNB): New constant.
(vgnb): New insn.
* config/rs6000/rs6000-builtin.def (BU_FUTURE_OVERLOAD_1): New
#define.
(BU_FUTURE_OVERLOAD_2): Likewise.
(BU_FUTURE_OVERLOAD_3): Likewise.
(__builtin_altivec_gnb): New built-in function.
(__buiiltin_vec_gnb): New overloaded built-in function.
* config/rs6000/rs6000-call.c (altivec_overloaded_builtins):
Define overloaded forms of __builtin_vec_gnb.
(rs6000_expand_binop_builtin): Add error checking for 2nd argument
of __builtin_vec_gnb.
(builtin_function_type): Mark return value and arguments unsigned
for FUTURE_BUILTIN_VGNB.
* doc/extend.texi (PowerPC AltiVec Built-in Functions Available
for a Future Architecture): Add description of vec_gnb built-in
function.
[gcc/testsuite]
2020-05-11 Kelvin Nilsen <kelvin@gcc.gnu.org>
Bill Schmidt <wschmidt@linux.ibm.com>
* gcc.target/powerpc/vec-gnb-0.c: New test.
* gcc.target/powerpc/vec-gnb-1.c: New test.
* gcc.target/powerpc/vec-gnb-10.c: New test.
* gcc.target/powerpc/vec-gnb-2.c: New test.
* gcc.target/powerpc/vec-gnb-3.c: New test.
* gcc.target/powerpc/vec-gnb-4.c: New test.
* gcc.target/powerpc/vec-gnb-5.c: New test.
* gcc.target/powerpc/vec-gnb-6.c: New test.
* gcc.target/powerpc/vec-gnb-7.c: New test.
* gcc.target/powerpc/vec-gnb-8.c: New test.
* gcc.target/powerpc/vec-gnb-9.c: New test.
Add support for the vpdepd and vpextd instructions that perform
vector parallel bit deposit and vector parallel bit extract.
[gcc]
2020-05-11 Kelvin Nilsen <kelvin@gcc.gnu.org>
Bill Schmidt <wschmidt@linux.ibm.com>
* config/rs6000/altivec.h (vec_pdep): New macro implementing new
built-in function.
(vec_pext): Likewise.
* config/rs6000/altivec.md (UNSPEC_VPDEPD): New constant.
(UNSPEC_VPEXTD): Likewise.
(vpdepd): New insn.
(vpextd): Likewise.
* config/rs6000/rs6000-builtin.def (__builtin_altivec_vpdepd): New
built-in function.
(__builtin_altivec_vpextd): Likewise.
* config/rs6000/rs6000-call.c (builtin_function_type): Add
handling for FUTURE_BUILTIN_VPDEPD and FUTURE_BUILTIN_VPEXTD
cases.
* doc/extend.texi (PowerPC Altivec Built-in Functions Available
for a Future Architecture): Add description of vec_pdep and
vec_pext built-in functions.
[gcc/testsuite]
2020-05-11 Kelvin Nilsen <kelvin@gcc.gnu.org>
* gcc.target/powerpc/vec-pdep-0.c: New.
* gcc.target/powerpc/vec-pdep-1.c: New.
* gcc.target/powerpc/vec-pext-0.c: New.
* gcc.target/powerpc/vec-pext-1.c: New.
Add support for new vclzdm and vctzdm vector instructions that
count leading and trailing zeros under control of a mask.
[gcc]
2020-05-11 Kelvin Nilsen <kelvin@gcc.gnu.org>
Bill Schmidt <wschmidt@linux.ibm.com>
* config/rs6000/altivec.h (vec_clzm): New macro.
(vec_ctzm): Likewise.
* config/rs6000/altivec.md (UNSPEC_VCLZDM): New constant.
(UNSPEC_VCTZDM): Likewise.
(vclzdm): New insn.
(vctzdm): Likewise.
* config/rs6000/rs6000-builtin.def (BU_FUTURE_V_0): New macro.
(BU_FUTURE_V_1): Likewise.
(BU_FUTURE_V_2): Likewise.
(BU_FUTURE_V_3): Likewise.
(__builtin_altivec_vclzdm): New builtin definition.
(__builtin_altivec_vctzdm): Likewise.
* config/rs6000/rs6000-c.c (rs6000_target_modify_macros): Cause
_ARCH_PWR_FUTURE macro to be defined if OPTION_MASK_FUTURE flag is
set.
* config/rs6000/rs6000-call.c (builtin_function_type): Set return
value and parameter types to be unsigned for VCLZDM and VCTZDM.
* config/rs6000/rs6000.c (rs6000_builtin_mask_calculate): Add
support for TARGET_FUTURE flag.
* config/rs6000/rs6000.h (RS6000_BTM_FUTURE): New macro constant.
* doc/extend.texi (PowerPC Altivec Built-in Functions Available
for a Future Architecture): New subsubsection.
[gcc/testsuite]
2020-05-11 Kelvin Nilsen <kelvin@gcc.gnu.org>
* gcc.target/powerpc/vec-clzm-0.c: New test.
* gcc.target/powerpc/vec-clzm-1.c: New test.
* gcc.target/powerpc/vec-ctzm-0.c: New test.
* gcc.target/powerpc/vec-ctzm-1.c: New test.
This enhances store-order preserving store motion to handle the case
of non-invariant dependent stores in the sequence of unconditionally
executed stores on exit by re-issueing them as part of the sequence
of stores on the exit. This fixes the observed regression of
gcc.target/i386/pr64110.c which relies on store-motion of 'b'
for a loop like
for (int i = 0; i < j; ++i)
*b++ = x;
where for correctness we now no longer apply store-motion. With
the patch we emit the correct
tem = b;
for (int i = 0; i < j; ++i)
{
tem = tem + 1;
*tem = x;
}
b = tem;
*tem = x;
preserving the original order of stores. A testcase reflecting
the miscompilation done by earlier GCC is added as well.
This also fixes the reported ICE in PR95025 and adds checking code
to catch it earlier - the issue was not-supported refs propagation
leaving stray refs in the sequence.
2020-05-11 Richard Biener <rguenther@suse.de>
PR tree-optimization/94988
PR tree-optimization/95025
* tree-ssa-loop-im.c (seq_entry): Make a struct, add from.
(sm_seq_push_down): Take extra parameter denoting where we
moved the ref to.
(execute_sm_exit): Re-issue sm_other stores in the correct
order.
(sm_seq_valid_bb): When always executed, allow sm_other to
prevail inbetween sm_ord and record their stored value.
(hoist_memory_references): Adjust refs_not_supported propagation
and prune sm_other from the end of the ordered sequences.
* gcc.dg/torture/pr94988.c: New testcase.
* gcc.dg/torture/pr95025.c: Likewise.
* gcc.dg/torture/pr95045.c: Likewise.
* g++.dg/asan/pr95025.C: New testcase.
gcc/fortran/
2020-05-07 Tobias Burnus <tobias@codesourcery.com>
PR fortran/94672
* trans.h (gfc_conv_expr_present): Add use_saved_decl=false argument.
* trans-expr.c (gfc_conv_expr_present): Likewise; use DECL directly
and only if use_saved_decl is true, use the actual PARAM_DECL arg (saved
descriptor).
* trans-array.c (gfc_trans_dummy_array_bias): Set local 'arg.0'
variable to NULL if 'arg' is not present.
* trans-openmp.c (gfc_omp_check_optional_argument): Simplify by checking
'arg.0' instead of the true PARM_DECL.
(gfc_omp_finish_clause): Remove setting 'arg.0' to NULL.
gcc/testsuite/
2020-05-07 Jakub Jelinek <jakub@redhat.com>
Tobias Burnus <tobias@codesourcery.com>
PR fortran/94672
* gfortran.dg/gomp/pr94672.f90: New.
* gfortran.dg/missing_optional_dummy_6a.f90: Update scan-tree.
In the testcase for PR94991, we are doing FAIL for scalar floating move expand
pattern since TARGET_FLOAT is false with option -mgeneral-regs-only. But move
expand pattern cannot fail. It would be better to replace the FAIL with code
that bitcasts to the equivalent integer mode using gen_lowpart.
2020-05-11 Felix Yang <felix.yang@huawei.com>
gcc/
PR target/94991
* config/aarch64/aarch64.md (mov<mode>):
Bitcasts to the equivalent integer mode using gen_lowpart
instead of doing FAIL for scalar floating point move.
gcc/testsuite/
PR target/94991
* gcc.target/aarch64/mgeneral-regs_5.c: New test.
Given the C code:
unsigned long long inv(unsigned a, unsigned b, unsigned c)
{
return a ? b : ~c;
}
Prior to this patch, AArch64 GCC at -O2 generates:
inv:
cmp w0, 0
mvn w2, w2
csel w0, w1, w2, ne
ret
and after applying the patch, we get:
inv:
cmp w0, 0
csinv w0, w1, w2, ne
ret
The new pattern also catches the optimization for the symmetric case where the
body of foo reads a ? ~b : c.
Similarly, with the following code:
unsigned long long neg(unsigned a, unsigned b, unsigned c)
{
return a ? b : -c;
}
GCC at -O2 previously gave:
neg:
cmp w0, 0
neg w2, w2
csel w0, w1, w2, ne
but now gives:
neg:
cmp w0, 0
csneg w0, w1, w2, ne
ret
with the corresponding code for the symmetric case as above.
2020-05-11 Alex Coplan <alex.coplan@arm.com>
gcc/
* config/aarch64/aarch64.c (aarch64_if_then_else_costs): Add case
to correctly calculate cost for new pattern (*csinv3_uxtw_insn3).
* config/aarch64/aarch64.md (*csinv3_utxw_insn1): New.
(*csinv3_uxtw_insn2): New.
(*csinv3_uxtw_insn3): New.
* config/aarch64/iterators.md (neg_not_cs): New.
gcc/testsuite/
* gcc.target/aarch64/csinv-neg.c: New test.
- The testcase uses the -fgnu-tm option but does not ensure that support
is enabled. This patch adds the test to the testcase.
* gcc/testsuite/g++.dg/ipa/pr94856.C: Require fgnu-tm.
Enable V2SFmode vectorization and vectorize V2SFmode PLUS,
MINUS, MULT, MIN and MAX operations using XMM registers.
To avoid unwanted secondary effects (e.g. exceptions), load values
to XMM registers using MOVQ that clears high bits of the XMM
register outside V2SFmode.
The compiler now vectorizes e.g.:
float r[2], a[2], b[2];
void
test_plus (void)
{
for (int i = 0; i < 2; i++)
r[i] = a[i] + b[i];
}
to:
movq a(%rip), %xmm0
movq b(%rip), %xmm1
addps %xmm1, %xmm0
movlps %xmm0, r(%rip)
ret
gcc/ChangeLog:
PR target/95046
* config/i386/i386.c (ix86_vector_mode_supported_p):
Vectorize 3dNOW! vector modes for TARGET_MMX_WITH_SSE.
* config/i386/mmx.md (*mov<mode>_internal): Do not set
mode of alternative 13 to V2SF for TARGET_MMX_WITH_SSE.
(mmx_addv2sf3): Change operand predicates from
nonimmediate_operand to register_mmxmem_operand.
(addv2sf3): New expander.
(*mmx_addv2sf3): Add SSE/AVX alternatives. Change operand
predicates from nonimmediate_operand to register_mmxmem_operand.
Enable instruction pattern for TARGET_MMX_WITH_SSE.
(mmx_subv2sf3): Change operand predicate from
nonimmediate_operand to register_mmxmem_operand.
(mmx_subrv2sf3): Ditto.
(subv2sf3): New expander.
(*mmx_subv2sf3): Add SSE/AVX alternatives. Change operand
predicates from nonimmediate_operand to register_mmxmem_operand.
Enable instruction pattern for TARGET_MMX_WITH_SSE.
(mmx_mulv2sf3): Change operand predicates from
nonimmediate_operand to register_mmxmem_operand.
(mulv2sf3): New expander.
(*mmx_mulv2sf3): Add SSE/AVX alternatives. Change operand
predicates from nonimmediate_operand to register_mmxmem_operand.
Enable instruction pattern for TARGET_MMX_WITH_SSE.
(mmx_<code>v2sf3): Change operand predicates from
nonimmediate_operand to register_mmxmem_operand.
(<code>v2sf3): New expander.
(*mmx_<code>v2sf3): Add SSE/AVX alternatives. Change operand
predicates from nonimmediate_operand to register_mmxmem_operand.
Enable instruction pattern for TARGET_MMX_WITH_SSE.
(mmx_ieee_<ieee_maxmin>v2sf3): Ditto.
testsuite/ChangeLog:
PR target/95046
* gcc.target/i386/pr95046-1.c: New test.
This change is from a patch developed for gcc-5. The code
has moved on since then requiring a change to interface.c
2020-05-11 Janus Weil <janus@gcc.gnu.org>
Dominique d'Humieres <dominiq@lps.ens.fr>
gcc/fortran/
PR fortran/59107
* gfortran.h: Rename field resolved as resolve_symbol_called
and assign two 2 bits instead of 1.
* interface.c (check_dtio_interface1): Use new field name.
(gfc_find_typebound_dtio_proc): Use new field name.
* resolve.c (gfc_resolve_intrinsic): Replace check of the formal
field with resolve_symbol_called is at least 2, if it is not
set the field to 2. (resolve_typebound_procedure): Use new field
name. (resolve_symbol): Use new field name and check whether it
is at least 1, if it is not set the field to 1.
2020-05-11 Mark Eggleston <markeggleston@gcc.gnu.org>
gcc/testsuite/
PR fortran/59107
* gfortran.dg/pr59107.f90: New test.
Use determine_value_range to get value range info for fold convert expressions
with internal operation PLUS_EXPR/MINUS_EXPR/MULT_EXPR when not overflow on
wrapping overflow inner type. i.e.:
(long unsigned int)((unsigned int)n * 10 + 1)
=>
(long unsigned int)n * (long unsigned int)10 + (long unsigned int)1
With this patch for affine combination, load/store motion could detect
more address refs independency and promote some memory expressions to
registers within loop.
PS: Replace the previous "(T1)(X + CST) as (T1)X - (T1)(-CST))"
to "(T1)(X + CST) as (T1)X + (T1)(CST))" for wrapping overflow.
Bootstrap and regression tested pass on Power8-LE.
gcc/ChangeLog
2020-05-11 Xiong Hu Luo <luoxhu@linux.ibm.com>
PR tree-optimization/83403
* tree-affine.c (expr_to_aff_combination): Replace SSA_NAME with
determine_value_range, Add fold conversion of MULT_EXPR, fix the
previous PLUS_EXPR.
gcc/testsuite/ChangeLog
2020-05-11 Xiong Hu Luo <luoxhu@linux.ibm.com>
PR tree-optimization/83403
* gcc.dg/tree-ssa/pr83403-1.c: New test.
* gcc.dg/tree-ssa/pr83403-2.c: New test.
* gcc.dg/tree-ssa/pr83403.h: New header.
Avoids race condition when checking for an iterator to be singular or
to be comparable to another iterator.
* src/c++/debug.cc
(_Safe_sequence_base::_M_attach_single): Set attached iterator
sequence pointer and version.
(_Safe_sequence_base::_M_detach_single): Reset detached iterator.
(_Safe_iterator_base::_M_attach): Remove attached iterator sequence
pointer and version asignments.
(_Safe_iterator_base::_M_attach_single): Likewise.
(_Safe_iterator_base::_M_detach_single): Remove detached iterator
reset.
(_Safe_iterator_base::_M_singular): Use atomic load to access parent
sequence.
(_Safe_iterator_base::_M_can_compare): Likewise.
(_Safe_iterator_base::_M_get_mutex): Likewise.
(_Safe_local_iterator_base::_M_attach): Remove attached iterator container
pointer and version assignments.
(_Safe_local_iterator_base::_M_attach_single): Likewise.
(_Safe_unordered_container_base::_M_attach_local_single):
Set attached iterator container pointer and version.
(_Safe_unordered_container_base::_M_detach_local_single): Reset detached
iterator.