Issue an error where an array is used before its definition
instead of an ICE.
2020-06-22 Steven G. Kargl <kargl@gcc.gnu.org>
gcc/fortran/
PR fortran/95585
* check.c (gfc_check_reshape): Add check for a value when
the symbol has an attribute flavor FL_PARAMETER.
2020-06-22 Mark Eggleston <markeggleston@gcc.gnu.org>
gcc/testsuite/
PR fortran/95585
* gfortran.dg/pr95585.f90: New test.
Messages in gfc_arith_error contain gcc internal format specifiers
which should be enclosed in G_() in order to be correctly translated.
2020-06-22 Mark Eggleston <markeggleston@gcc.gnu.org>
gcc/fortran/
PR fortran/42693
* arith.c (gfc_arith_error): Enclose strings in G_() instead
of _().
This fixes the vectorized stmt placement compute for the case of
external defs.
2020-06-22 Richard Biener <rguenther@suse.de>
PR tree-optimization/95770
* tree-vect-slp.c (vect_schedule_slp_instance): Also consider
external defs.
* gcc.dg/pr95770.c: New testcase.
2020-06-22 Jakub Jelinek <jakub@redhat.com>
* omp-general.c (omp_extract_for_data): For triangular loops with
all loop invariant expressions constant where the innermost loop is
executed at least once compute number of iterations at compile time.
- Normalizing the arch string helps the multi-lib handling, e.g. rv64gc and
rv64g_c are both valid and name the same arch, but the latter would confuse
the detection of multi-lib. Normalizing earlier resolves this issue.
gcc/ChangeLog:
* config/riscv/riscv.h (ASM_SPEC): Remove riscv_expand_arch call.
(DRIVER_SELF_SPECS): New.
- g++ complains about too few arguments for the frflags builtin, with a
message like the one below:
error: too few arguments to function 'unsigned int __builtin_riscv_frflags(void)'
- However, no arguments are needed. The error occurs because we declare the
function type with a VOID argument, which the C++ front end seems to treat as
a required argument when GCC tries to resolve the function.
gcc/ChangeLog
* config/riscv/riscv-builtins.c (RISCV_FTYPE_NAME0): New.
(RISCV_FTYPE_ATYPES0): New.
(riscv_builtins): Using RISCV_USI_FTYPE for frflags.
* config/riscv/riscv-ftypes.def: Remove VOID argument.
gcc/testsuite/ChangeLog
* g++.target/riscv/frflags.C: New.
This patch adds the ability to configure GCC on AIX to build as a
64 bit application and to build target libraries as "FAT" libraries in both
32 bit and 64 bit mode.
The patch adds makefile fragment hooks to target libraries that allow
them to include target-specific rules. The target-specific rules for
AIX place both 32 bit and 64 bit objects and shared objects
in archives at the top-level, not multilib subdirectories. The
multilibs are built in subdirectories, but must be combined during the
last parts of the target library build process. Because of the way
that GCC bootstrap works, the libraries must be combined during each of
the multiple stages of GCC bootstrap, not solely when installed in the
final destination, so that the libraries are correct at the end of
each target library build stage. An install recipe alone is not sufficient.
gcc/ChangeLog
2020-06-21 David Edelsohn <dje.gcc@gmail.com>
* config.gcc: Use t-aix64, biarch64 and default64 for cpu_is_64bit.
* config/rs6000/aix72.h (ASM_SPEC): Remove aix64 option.
(ASM_SPEC32): New.
(ASM_SPEC64): New.
(ASM_CPU_SPEC): Remove vsx and altivec options.
(CPP_SPEC_COMMON): Rename from CPP_SPEC.
(CPP_SPEC32): New.
(CPP_SPEC64): New.
(CPLUSPLUS_CPP_SPEC): Rename to CPLUSPLUS_CPP_SPEC_COMMON.
(TARGET_DEFAULT): Only define if not BIARCH.
(LIB_SPEC_COMMON): Rename from LIB_SPEC.
(LIB_SPEC32): New.
(LIB_SPEC64): New.
(LINK_SPEC_COMMON): Rename from LINK_SPEC.
(LINK_SPEC32): New.
(LINK_SPEC64): New.
(STARTFILE_SPEC): Add 64 bit version of crtcxa and crtdbase.
(ASM_SPEC): Define 32 and 64 bit alternatives using DEFAULT_ARCH64_P.
(CPP_SPEC): Same.
(CPLUSPLUS_CPP_SPEC): Same.
(LIB_SPEC): Same.
(LINK_SPEC): Same.
(SUBTARGET_EXTRA_SPECS): Add new 32/64 specs.
* config/rs6000/defaultaix64.h: New file.
* config/rs6000/t-aix64: New file.
libgcc/ChangeLog
2020-06-21 David Edelsohn <dje.gcc@gmail.com>
* config.host (extra_parts): Add crtcxa_64 and crtdbase_64.
* config/rs6000/t-aix-cxa: Explicitly compile 32 bit with -maix32
and 64 bit with -maix64.
* config/rs6000/t-slibgcc-aix: Remove extra @multilib_dir@ level.
Build and install AIX-style FAT libraries.
libgomp/ChangeLog
2020-06-21 David Edelsohn <dje.gcc@gmail.com>
* Makefile.am (tmake_file): Build and install AIX-style FAT libraries.
* Makefile.in: Regenerate.
* configure.ac (tmake_file): Substitute.
* configure: Regenerate.
* configure.tgt (powerpc-ibm-aix*): Define tmake_file.
* config/t-aix: New file.
libstdc++-v3/ChangeLog
2020-06-21 David Edelsohn <dje.gcc@gmail.com>
* Makefile.am (tmake_file): Build and install AIX-style FAT libraries.
* Makefile.in: Regenerate.
* configure.ac (tmake_file): Substitute.
* configure: Regenerate.
* configure.host (aix*): Define tmake_file.
* config/os/aix/t-aix: New file.
libatomic/ChangeLog
2020-06-21 David Edelsohn <dje.gcc@gmail.com>
* Makefile.am (tmake_file): Build and install AIX-style FAT libraries.
* Makefile.in: Regenerate.
* configure.ac (tmake_file): Substitute.
* configure: Regenerate.
* configure.tgt (powerpc-ibm-aix*): Define tmake_file.
* config/t-aix: New file.
libgfortran/ChangeLog
2020-06-21 David Edelsohn <dje.gcc@gmail.com>
* Makefile.am (tmake_file): Build and install AIX-style FAT libraries.
* Makefile.in: Regenerate.
* configure.ac (tmake_file): Substitute.
* configure: Regenerate.
* configure.host: Add system configury stanza. Define tmake_file.
* config/t-aix: New file.
Add the Matrix-Multiply Assist (MMA) built-ins. The MMA accumulators are
INOUT operands for most MMA instructions, but they are also very expensive
to move around. For this reason, we have implemented a built-in API where
the accumulators are passed using pass-by-reference/pointers, so the user
won't use one accumulator as input and another as output, which would entail
a lot of copies. However, using pointers gives us poor code generation
when we expand the built-ins at normal expand time. We therefore expand
the MMA built-ins early into gimple, converting the pass-by-reference calls
to an internal built-in that uses pass-by-value calling convention, where
we can enforce the input and output accumulators are the same. This gives
us much better code generation.
2020-06-20 Peter Bergner <bergner@linux.ibm.com>
gcc/
* config/rs6000/predicates.md (mma_assemble_input_operand): New.
* config/rs6000/rs6000-builtin.def (BU_MMA_1, BU_MMA_V2, BU_MMA_3,
BU_MMA_5, BU_MMA_6, BU_VSX_1): Add support macros for defining MMA
built-in functions.
(ASSEMBLE_ACC, ASSEMBLE_PAIR, DISASSEMBLE_ACC, DISASSEMBLE_PAIR,
PMXVBF16GER2, PMXVBF16GER2NN, PMXVBF16GER2NP, PMXVBF16GER2PN,
PMXVBF16GER2PP, PMXVF16GER2, PMXVF16GER2NN, PMXVF16GER2NP,
PMXVF16GER2PN, PMXVF16GER2PP, PMXVF32GER, PMXVF32GERNN,
PMXVF32GERNP, PMXVF32GERPN, PMXVF32GERPP, PMXVF64GER, PMXVF64GERNN,
PMXVF64GERNP, PMXVF64GERPN, PMXVF64GERPP, PMXVI16GER2, PMXVI16GER2PP,
PMXVI16GER2S, PMXVI16GER2SPP, PMXVI4GER8, PMXVI4GER8PP, PMXVI8GER4,
PMXVI8GER4PP, PMXVI8GER4SPP, XVBF16GER2, XVBF16GER2NN, XVBF16GER2NP,
XVBF16GER2PN, XVBF16GER2PP, XVCVBF16SP, XVCVSPBF16, XVF16GER2,
XVF16GER2NN, XVF16GER2NP, XVF16GER2PN, XVF16GER2PP, XVF32GER,
XVF32GERNN, XVF32GERNP, XVF32GERPN, XVF32GERPP, XVF64GER, XVF64GERNN,
XVF64GERNP, XVF64GERPN, XVF64GERPP, XVI16GER2, XVI16GER2PP, XVI16GER2S,
XVI16GER2SPP, XVI4GER8, XVI4GER8PP, XVI8GER4, XVI8GER4PP, XVI8GER4SPP,
XXMFACC, XXMTACC, XXSETACCZ): Add MMA built-ins.
* config/rs6000/rs6000.c (rs6000_emit_move): Use CONST_INT_P.
Allow zero constants.
(print_operand) <case 'A'>: New output modifier.
(rs6000_split_multireg_move): Add support for inserting accumulator
priming and depriming instructions. Add support for splitting an
assemble accumulator pattern.
* config/rs6000/rs6000-call.c (mma_init_builtins, mma_expand_builtin,
rs6000_gimple_fold_mma_builtin): New functions.
(RS6000_BUILTIN_M): New macro.
(def_builtin): Handle RS6000_BTC_QUAD and RS6000_BTC_PAIR attributes.
(bdesc_mma): Add new MMA built-in support.
(htm_expand_builtin): Use RS6000_BTC_OPND_MASK.
(rs6000_invalid_builtin): Add handling of RS6000_BTM_FUTURE and
RS6000_BTM_MMA.
(rs6000_builtin_valid_without_lhs): Handle RS6000_BTC_VOID attribute.
(rs6000_gimple_fold_builtin): Call rs6000_builtin_is_supported_p
and rs6000_gimple_fold_mma_builtin.
(rs6000_expand_builtin): Call mma_expand_builtin.
Use RS6000_BTC_OPND_MASK.
(rs6000_init_builtins): Adjust comment. Call mma_init_builtins.
(htm_init_builtins): Use RS6000_BTC_OPND_MASK.
(builtin_function_type): Handle VSX_BUILTIN_XVCVSPBF16 and
VSX_BUILTIN_XVCVBF16SP.
* config/rs6000/rs6000.h (RS6000_BTC_QUINARY, RS6000_BTC_SENARY,
RS6000_BTC_OPND_MASK, RS6000_BTC_QUAD, RS6000_BTC_PAIR,
RS6000_BTC_QUADPAIR, RS6000_BTC_GIMPLE): New defines.
(RS6000_BTC_PREDICATE, RS6000_BTC_ABS, RS6000_BTC_DST,
RS6000_BTC_TYPE_MASK, RS6000_BTC_ATTR_MASK): Adjust values.
* config/rs6000/mma.md (MAX_MMA_OPERANDS): New define_constant.
(UNSPEC_MMA_ASSEMBLE_ACC, UNSPEC_MMA_PMXVBF16GER2,
UNSPEC_MMA_PMXVBF16GER2NN, UNSPEC_MMA_PMXVBF16GER2NP,
UNSPEC_MMA_PMXVBF16GER2PN, UNSPEC_MMA_PMXVBF16GER2PP,
UNSPEC_MMA_PMXVF16GER2, UNSPEC_MMA_PMXVF16GER2NN,
UNSPEC_MMA_PMXVF16GER2NP, UNSPEC_MMA_PMXVF16GER2PN,
UNSPEC_MMA_PMXVF16GER2PP, UNSPEC_MMA_PMXVF32GER,
UNSPEC_MMA_PMXVF32GERNN, UNSPEC_MMA_PMXVF32GERNP,
UNSPEC_MMA_PMXVF32GERPN, UNSPEC_MMA_PMXVF32GERPP,
UNSPEC_MMA_PMXVF64GER, UNSPEC_MMA_PMXVF64GERNN,
UNSPEC_MMA_PMXVF64GERNP, UNSPEC_MMA_PMXVF64GERPN,
UNSPEC_MMA_PMXVF64GERPP, UNSPEC_MMA_PMXVI16GER2,
UNSPEC_MMA_PMXVI16GER2PP, UNSPEC_MMA_PMXVI16GER2S,
UNSPEC_MMA_PMXVI16GER2SPP, UNSPEC_MMA_PMXVI4GER8,
UNSPEC_MMA_PMXVI4GER8PP, UNSPEC_MMA_PMXVI8GER4,
UNSPEC_MMA_PMXVI8GER4PP, UNSPEC_MMA_PMXVI8GER4SPP,
UNSPEC_MMA_XVBF16GER2, UNSPEC_MMA_XVBF16GER2NN,
UNSPEC_MMA_XVBF16GER2NP, UNSPEC_MMA_XVBF16GER2PN,
UNSPEC_MMA_XVBF16GER2PP, UNSPEC_MMA_XVF16GER2, UNSPEC_MMA_XVF16GER2NN,
UNSPEC_MMA_XVF16GER2NP, UNSPEC_MMA_XVF16GER2PN, UNSPEC_MMA_XVF16GER2PP,
UNSPEC_MMA_XVF32GER, UNSPEC_MMA_XVF32GERNN, UNSPEC_MMA_XVF32GERNP,
UNSPEC_MMA_XVF32GERPN, UNSPEC_MMA_XVF32GERPP, UNSPEC_MMA_XVF64GER,
UNSPEC_MMA_XVF64GERNN, UNSPEC_MMA_XVF64GERNP, UNSPEC_MMA_XVF64GERPN,
UNSPEC_MMA_XVF64GERPP, UNSPEC_MMA_XVI16GER2, UNSPEC_MMA_XVI16GER2PP,
UNSPEC_MMA_XVI16GER2S, UNSPEC_MMA_XVI16GER2SPP, UNSPEC_MMA_XVI4GER8,
UNSPEC_MMA_XVI4GER8PP, UNSPEC_MMA_XVI8GER4, UNSPEC_MMA_XVI8GER4PP,
UNSPEC_MMA_XVI8GER4SPP, UNSPEC_MMA_XXMFACC, UNSPEC_MMA_XXMTACC): New.
(MMA_ACC, MMA_VV, MMA_AVV, MMA_PV, MMA_APV, MMA_VVI4I4I8,
MMA_AVVI4I4I8, MMA_VVI4I4I2, MMA_AVVI4I4I2, MMA_VVI4I4,
MMA_AVVI4I4, MMA_PVI4I2, MMA_APVI4I2, MMA_VVI4I4I4,
MMA_AVVI4I4I4): New define_int_iterator.
(acc, vv, avv, pv, apv, vvi4i4i8, avvi4i4i8, vvi4i4i2,
avvi4i4i2, vvi4i4, avvi4i4, pvi4i2, apvi4i2, vvi4i4i4,
avvi4i4i4): New define_int_attr.
(*movpxi): Add zero constant alternative.
(mma_assemble_pair, mma_assemble_acc): New define_expand.
(*mma_assemble_acc): New define_insn_and_split.
(mma_<acc>, mma_xxsetaccz, mma_<vv>, mma_<avv>, mma_<pv>, mma_<apv>,
mma_<vvi4i4i8>, mma_<avvi4i4i8>, mma_<vvi4i4i2>, mma_<avvi4i4i2>,
mma_<vvi4i4>, mma_<avvi4i4>, mma_<pvi4i2>, mma_<apvi4i2>,
mma_<vvi4i4i4>, mma_<avvi4i4i4>): New define_insn.
* config/rs6000/rs6000.md (define_attr "type"): New type mma.
* config/rs6000/vsx.md (UNSPEC_VSX_XVCVBF16SP): New.
(UNSPEC_VSX_XVCVSPBF16): Likewise.
(XVCVBF16): New define_int_iterator.
(xvcvbf16): New define_int_attr.
(vsx_<xvcvbf16>): New define_insn.
* doc/extend.texi: Document the mma built-ins.
Add the new -mmma option as well as the initial MMA support, which includes
the target specific __vector_pair and __vector_quad types, the POImode and
PXImode partial integer modes they are mapped to, and their associated
move patterns. Support for the restrictions on the registers these modes
can be assigned to has also been added.
2020-06-20 Peter Bergner <bergner@linux.ibm.com>
Michael Meissner <meissner@linux.ibm.com>
gcc/
* config/rs6000/mma.md: New file.
* config/rs6000/rs6000-c.c (rs6000_target_modify_macros): Define
__MMA__ for mma.
* config/rs6000/rs6000-call.c (rs6000_init_builtins): Add support
for __vector_pair and __vector_quad types.
* config/rs6000/rs6000-cpus.def (OTHER_FUTURE_MASKS): Add
OPTION_MASK_MMA.
(POWERPC_MASKS): Likewise.
* config/rs6000/rs6000-modes.def (OI, XI): New integer modes.
(POI, PXI): New partial integer modes.
* config/rs6000/rs6000.c (TARGET_INVALID_CONVERSION): Define.
(rs6000_hard_regno_nregs_internal): Use VECTOR_ALIGNMENT_P.
(rs6000_hard_regno_mode_ok_uncached): Likewise.
Add support for POImode being allowed in VSX registers and PXImode
being allowed in FP registers.
(rs6000_modes_tieable_p): Adjust comment.
Add support for POImode and PXImode.
(rs6000_debug_reg_global) <print_tieable_modes>: Add OImode, POImode,
XImode, PXImode, V2SImode, V2SFmode and CCFPmode.
(rs6000_setup_reg_addr_masks): Use VECTOR_ALIGNMENT_P.
Set up appropriate addr_masks for vector pair and vector quad addresses.
(rs6000_init_hard_regno_mode_ok): Add support for vector pair and
vector quad registers. Setup reload handlers for POImode and PXImode.
(rs6000_builtin_mask_calculate): Add support for RS6000_BTM_MMA.
(rs6000_option_override_internal): Error if -mmma is specified
without -mcpu=future.
(rs6000_slow_unaligned_access): Use VECTOR_ALIGNMENT_P.
(quad_address_p): Change size test to less than 16 bytes.
(reg_offset_addressing_ok_p): Add support for ISA 3.1 vector pair
and vector quad instructions.
(avoiding_indexed_address_p): Likewise.
(rs6000_emit_move): Disallow POImode and PXImode moves involving
constants.
(rs6000_preferred_reload_class): Prefer VSX registers for POImode
and FP registers for PXImode.
(rs6000_split_multireg_move): Support splitting POImode and PXImode
move instructions.
(rs6000_mangle_type): Adjust comment. Add support for mangling
__vector_pair and __vector_quad types.
(rs6000_opt_masks): Add entry for mma.
(rs6000_builtin_mask_names): Add RS6000_BTM_MMA and RS6000_BTM_FUTURE.
(rs6000_function_value): Use VECTOR_ALIGNMENT_P.
(address_to_insn_form): Likewise.
(reg_to_non_prefixed): Likewise.
(rs6000_invalid_conversion): New function.
* config/rs6000/rs6000.h (MASK_MMA): Define.
(BIGGEST_ALIGNMENT): Set to 512 if MMA support is enabled.
(VECTOR_ALIGNMENT_P): New helper macro.
(ALTIVEC_VECTOR_MODE): Use VECTOR_ALIGNMENT_P.
(RS6000_BTM_MMA): Define.
(RS6000_BTM_COMMON): Add RS6000_BTM_MMA and RS6000_BTM_FUTURE.
(rs6000_builtin_type_index): Add RS6000_BTI_vector_pair and
RS6000_BTI_vector_quad.
(vector_pair_type_node): New.
(vector_quad_type_node): New.
* config/rs6000/rs6000.md: Include mma.md.
(define_mode_iterator RELOAD): Add POI and PXI.
* config/rs6000/t-rs6000 (MD_INCLUDES): Add mma.md.
* config/rs6000/rs6000.opt (-mmma): New.
* doc/invoke.texi: Document -mmma.
The actual issue is that (in the testcase) std::nothrow is not
available. So update the handling of the get-return-on-alloc-fail
to include the possibility that std::nothrow might not be
available.
gcc/cp/ChangeLog:
PR c++/95505
* coroutines.cc (morph_fn_to_coro): Update handling of
get-return-object-on-allocation-fail and diagnose missing
std::nothrow.
gcc/testsuite/ChangeLog:
PR c++/95505
* g++.dg/coroutines/pr95505.C: New test.
P2113 from the last C++ meeting clarified that we only compare constraints
on functions or function templates that have equivalent template parameters
and function parameters.
I'm not currently implementing the complicated handling of reversed
comparison operators here; thinking about it now, it seems like a lot of
complexity to support a very weird usage. If I write two similar comparison
operators to be distinguished by their constraints, why would I write one
reversed? If they're two unrelated operators, they're very unlikely to be
similar enough for the complexity to help. I've started a discussion on the
committee reflector about changing these rules.
This change breaks some greedy_ops tests in libstdc++ that were relying on
comparing constraints on unrelated templates, which seems pretty clearly
wrong, so I'm removing those tests for now.
gcc/cp/ChangeLog:
* call.c (joust): Only compare constraints for non-template
candidates with matching parameters.
* pt.c (tsubst_pack_expansion): Fix getting a type parameter
pack.
(more_specialized_fn): Only compare constraints for candidates with
matching parameters.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/concepts-return-req1.C: Expect error.
* g++.dg/cpp2a/concepts-p2113a.C: New test.
* g++.dg/cpp2a/concepts-p2113b.C: New test.
libstdc++-v3/ChangeLog:
* testsuite/24_iterators/move_iterator/rel_ops_c++20.cc:
Remove greedy_ops tests.
* testsuite/24_iterators/reverse_iterator/rel_ops_c++20.cc:
Remove greedy_ops tests.
With submodules and equivalence declarations, name mangling may result in
long internal symbols overflowing internal buffers. We now check that
we do not exceed the enlarged buffer sizes.
gcc/fortran/
PR fortran/95707
* gfortran.h (gfc_common_head): Enlarge buffer.
* trans-common.c (gfc_sym_mangled_common_id): Enlarge temporary
buffers, and add check on length of mangled name to prevent
overflow.
With submodules, name mangling of character pointer declarations produces long
internal symbols that overflowed a static internal buffer. Adjust the buffer
size.
gcc/fortran/
PR fortran/95688
* iresolve.c (gfc_get_string): Enlarge static buffer size.
With submodules and PDTs, name mangling of interfaces may result in long
internal symbols overflowing a previously static internal buffer. We now
set the buffer size dynamically.
gcc/fortran/
PR fortran/95687
* class.c (get_unique_type_string): Return a string with dynamic
length.
(get_unique_hashed_string, gfc_hash_value): Use dynamic result
from get_unique_type_string instead of static buffer.
With submodules, name mangling of interfaces may result in long internal
symbols overflowing an internal buffer. We now check that we do not
exceed the enlarged buffer size.
gcc/fortran/
PR fortran/95689
* interface.c (check_sym_interfaces): Enlarge temporary buffer,
and add check on length of mangled name to prevent overflow.
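The kind of length check described in the buffer-overflow fixes above can be
sketched as follows. This is a minimal illustration only, not the gfortran
code: the names kMaxSymbolLen and mangle_into, the buffer size and the format
string are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical size limit, chosen only for this illustration.
constexpr std::size_t kMaxSymbolLen = 63;

// Format a mangled-style name into a fixed buffer.  Returns false instead
// of silently truncating when the result does not fit.
inline bool mangle_into(char (&buf)[kMaxSymbolLen + 1],
                        const std::string &module,
                        const std::string &name)
{
  // snprintf never writes past the buffer and returns the length the
  // complete string would have had, so the overflow case is detectable.
  int n = std::snprintf(buf, sizeof buf, "__%s_MOD_%s",
                        module.c_str(), name.c_str());
  return n >= 0 && static_cast<std::size_t>(n) < sizeof buf;
}
```

A caller would treat a false return as an internal error rather than continue
with a truncated symbol.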
EQUIVALENCE objects are subject to constraints listed in the Fortran 2018
standard, section 8.10.1.1. These constraints are to be checked
also for CLASS variables.
gcc/fortran/
PR fortran/95587
* match.c (gfc_match_equivalence): Check constraints on
EQUIVALENCE objects also for CLASS variables.
popcount[45]ll require __builtin_popcountll, but the test can succeed
without libcall through expand_doubleword_popcount. However the Tree-SSA
optimization requires recognition of POPCOUNT. This patch limits the test
to lp64 for the targets that fall through the cracks and were not
caught by the dg-require-effective-target popcountll.
gcc/testsuite/ChangeLog
2020-06-19 David Edelsohn <dje.gcc@gmail.com>
* gcc.dg/tree-ssa/popcount4ll.c: Add target lp64.
* gcc.dg/tree-ssa/popcount5ll.c: Same.
Implementing P2085, another refinement to the operator<=> specification from
the Prague meeting. It was deemed desirable to be able to have a non-inline
defaulted definition of a comparison operator just like you can with other
defaulted functions.
gcc/cp/ChangeLog:
* method.c (early_check_defaulted_comparison): Allow defaulting
comparison outside class. Complain if non-member operator isn't a
friend.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/spaceship-friend1.C: New test.
* g++.dg/cpp2a/spaceship-err4.C: Adjust diagnostic.
The pattern "(x | y) - y" can be optimized to the simpler "(x & ~y)" andn
pattern.
Bootstrapped and tested on aarch64-none-linux-gnu.
gcc/ChangeLog:
PR tree-optimization/94880
* match.pd (A | B) - B -> (A & ~B): New simplification.
gcc/testsuite/ChangeLog:
PR tree-optimization/94880
* gcc.dg/tree-ssa/pr94880.c: New Test.
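The identity behind this simplification can be checked with a small sketch.
The function names below are hypothetical and are not part of the patch or
its testcase. The reasoning is that (x | y) has every bit of y set, so
subtracting y never borrows and clears exactly those bits.

```cpp
#include <cassert>
#include <cstdint>

// Form before the simplification: set the bits of y, then subtract y.
constexpr std::uint32_t or_minus(std::uint32_t x, std::uint32_t y)
{
  return (x | y) - y;
}

// Form after the simplification: clear the bits of y in x.
constexpr std::uint32_t and_not(std::uint32_t x, std::uint32_t y)
{
  return x & ~y;
}

// Compare both forms over all pairs of 8-bit inputs.
inline void check_all_8bit_inputs()
{
  for (std::uint32_t x = 0; x < 256; ++x)
    for (std::uint32_t y = 0; y < 256; ++y)
      assert(or_minus(x, y) == and_not(x, y));
}
```

The exhaustive loop only covers 8-bit values. The no-borrow argument is what
extends the identity to wider types.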
This properly handles a lane permutation in scalar costing.
For the current only use this doesn't matter much but with
permutes that change the number of lanes it will eventually
ICE.
2020-06-19 Richard Biener <rguenther@suse.de>
* tree-vect-slp.c (vect_bb_slp_scalar_cost): Adjust
for lane permutations.
A small tweak to the implementation of __includes, which in my
application saves 20% of the running time. I noticed it because using
range-v3 was giving unexpected performance gains.
Some of the gain comes from pulling the 2 calls ++__first1 out of the
condition so there is just one call. And most of the gain comes from
replacing the resulting
    if (__comp(__first1, __first2))
      ;
    else
      ++__first2;
with
    if (!__comp(__first1, __first2))
      ++__first2;
I was very surprised that the code ended up being so different for such
a change, and I still don't really understand where the extra time is
going...
Anyway, while I blame the compiler for not generating very good code
with the current implementation, I believe the change can be seen as a
simplification.
libstdc++-v3/ChangeLog:
* include/bits/stl_algo.h (__includes): Simplify the code.
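The loop shape described above (a single ++first1 per iteration and a negated
comparison guarding ++first2) can be sketched as below. This is an
illustration under assumptions, not the libstdc++ source: includes_sketch and
contains_all are hypothetical names, and the sketch compares dereferenced
values with operator< instead of calling __comp on iterators.

```cpp
#include <cassert>
#include <vector>

// Returns true if every element of the sorted range [first2, last2) also
// appears in the sorted range [first1, last1).
template <typename It1, typename It2>
bool includes_sketch(It1 first1, It1 last1, It2 first2, It2 last2)
{
  while (first1 != last1 && first2 != last2)
    {
      if (*first2 < *first1)
        return false;      // *first2 cannot occur later in range 1
      if (!(*first1 < *first2))
        ++first2;          // equivalent elements: consume from range 2
      ++first1;            // the single increment of first1
    }
  return first2 == last2;
}

// Convenience wrapper for sorted vectors of int.
inline bool contains_all(const std::vector<int> &a, const std::vector<int> &b)
{
  return includes_sketch(a.begin(), a.end(), b.begin(), b.end());
}
```

Both input ranges are assumed to be sorted in ascending order, as
std::includes requires.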
I missed that indeed SLP permutation code generation can end up
referring to a non-last vectorized stmt in the last SLP_TREE_VEC_STMTS
element as an optimization. So walk them all.
2020-06-19 Richard Biener <rguenther@suse.de>
PR tree-optimization/95761
* tree-vect-slp.c (vect_schedule_slp_instance): Walk all
vectorized stmts for finding the last one.
* gcc.dg/torture/pr95761.c: New testcase.
The attached patch changes the code generated for
std::optional<std::array<int,1024>>f(){return{};}
from
movq $0, (%rdi)
movq %rdi, %r8
leaq 8(%rdi), %rdi
xorl %eax, %eax
movq $0, 4084(%rdi)
movq %r8, %rcx
andq $-8, %rdi
subq %rdi, %rcx
addl $4100, %ecx
shrl $3, %ecx
rep stosq
movq %r8, %rax
or with different tuning
subq $8, %rsp
movl $4100, %edx
xorl %esi, %esi
call memset
addq $8, %rsp
to the much shorter
movb $0, 4096(%rdi)
movq %rdi, %rax
i.e. the same as the nullopt constructor.
The constructor was already non-trivial, so we don't lose that. It passes the
testsuite without regression, but there is no new testcase to verify the
better codegen.
libstdc++-v3/ChangeLog:
* include/std/optional (optional()): Explicitly define it.
2020-06-19 Eric Botcazou <ebotcazou@adacore.com>
gcc/ada/
* gcc-interface/trans.c (lvalue_required_for_attribute_p): Do not deal
with 'Pos or 'Val.
(Attribute_to_gnu): Likewise.
* gcc-interface/utils.c (create_field_decl): Small formatting fix.
2020-06-19 Eric Botcazou <ebotcazou@adacore.com>
gcc/ada/
* gcc-interface/trans.c (adjust_for_implicit_deref): Delete.
(maybe_implicit_deref): Likewise.
(Attribute_to_gnu): Replace calls to maybe_implicit_deref by calls
to maybe_padded_object.
(Call_to_gnu): Likewise.
(gnat_to_gnu) <N_Indexed_Component>: Likewise.
<N_Slice>: Likewise.
<N_Selected_Component>: Likewise.
<N_Free_Statement>: Remove call to adjust_for_implicit_deref and
manually make sure that the designated type is complete.
* gcc-interface/utils2.c (build_simple_component_ref): Add comment.
2020-06-19 Eric Botcazou <ebotcazou@adacore.com>
gcc/ada/
* gcc-interface/decl.c (gnat_to_gnu_param): Tidy up.
(gnat_to_gnu_subprog_type): For a variadic C function, do not
build unnamed parameters and do not add final void node.
* gcc-interface/misc.c: Include snames.h.
* gcc-interface/trans.c (Attribute_to_gnu): Tidy up.
(Call_to_gnu): Implement support for unnamed parameters in a
variadic C function.
* gcc-interface/utils.c: Include snames.h.
(copy_type): Tidy up.
2020-06-19 Justin Squirek <squirek@adacore.com>
gcc/ada/
* lib.adb (Check_Same_Extended_Unit): Add check to determine if
the body for the subunits exists in the same file as their
specifications.
2020-06-19 Eric Botcazou <ebotcazou@adacore.com>
gcc/ada/
* exp_aggr.adb (In_Place_Assign_OK): In an allocator context,
check the bounds of an array aggregate against those of the
designated type, except if the latter is unconstrained.
2020-06-19 Eric Botcazou <ebotcazou@adacore.com>
gcc/ada/
* sem_ch3.adb (Is_Visible_Component): Reason only on the private
status of the original type in an instance body.
2020-06-19 Eric Botcazou <ebotcazou@adacore.com>
gcc/ada/
* sem_res.adb (Resolve_Qualified_Expression): Do not override the
type of the node when it is unconstrained if it is for an allocator.
2020-06-19 Eric Botcazou <ebotcazou@adacore.com>
gcc/ada/
* sem_res.adb (Resolve_Allocator): Call Resolve_Qualified_Expression
on the qualified expression, if any, instead of doing an incomplete
type resolution manually.
(Resolve_Qualified_Expression): Apply predicate check to operand.
2020-06-19 Eric Botcazou <ebotcazou@adacore.com>
gcc/ada/
* sem_ch4.adb (Analyze_Selected_Component): In an instance body,
also invoke Find_Component_In_Instance on the parent subtype of
a derived tagged type immediately visible. Remove obsolete case.
2020-06-19 Eric Botcazou <ebotcazou@adacore.com>
gcc/ada/
* exp_attr.adb (Get_Integer_Type): Return the largest supported
unsigned integer type if need be.
2020-06-19 Justin Squirek <squirek@adacore.com>
gcc/ada/
* sem_warn.adb (Warn_On_Known_Condition): Add general sanity
check that asserts the original source node being checked
contains an entity. If not, it could be the result of special
case expansion for type conversions.
2020-06-19 Ed Schonberg <schonberg@adacore.com>
gcc/ada/
* sem_ch6.adb (Analyze_Expression_Function): Do not indicate
that the function has a completion if it appears within a Ghost
generic package.
2020-06-19 Javier Miranda <miranda@adacore.com>
gcc/ada/
* exp_ch3.ads (Ensure_Activation_Chain_And_Master): New
subprogram.
* exp_ch3.adb (Ensure_Activation_Chain_And_Master): New
subprogram that factorizes code.
(Expand_N_Object_Declaration): Call new subprogram.
* sem_ch6.adb (Analyze_Function_Return): When returning a
build-in-place unconstrained array type, defer the full analysis
of the returned object to avoid generating the corresponding
constrained subtype; otherwise the bounds would be created on
the stack and a dangling reference would be returned pointing to
the bounds.