PR tree-optimization/91885
* gcc.dg/pr91885.c (__int64_t): Change from long to long long.
(__uint64_t): Change from unsigned long to unsigned long long.
From-SVN: r276178
It was easier to add the SVE ACLE support without enumerating every
function at build time. This in turn meant that it was easier if the
SVE builtins occupied a distinct numberspace from the existing AArch64
ones, which *are* enumerated at build time. This patch therefore
divides the built-in function codes into "major" and "minor" codes.
At present the major code is just "general", but the SVE patch will add
"SVE" as well.
Also, it was convenient to put the SVE ACLE support in its own file,
so the patch makes aarch64.c provide the frontline target hooks directly,
forwarding to the other files for the real work.
The reason for organising the files this way is that aarch64.c needs
to define the target hook macros whatever happens, and having aarch64.c
macros forward to aarch64-builtins.c functions and aarch64-builtins.c
functions forward to the SVE file seemed a bit indirect. Doing things
the way the patch does them puts aarch64-builtins.c and the SVE code on
more of an equal footing.
The aarch64_(general_)gimple_fold_builtin change is mostly just
reindentation.
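For illustration, a minimal standalone sketch of the resulting encoding is
below; the enum and the SHIFT/CLASS constants are named after the declarations
in the ChangeLog, but the concrete values and the helper functions are
assumptions made up for this sketch, not the definitions that land in
aarch64-protos.h.

enum aarch64_builtin_class
{
  AARCH64_BUILTIN_GENERAL,   /* the existing AArch64 built-ins */
  AARCH64_BUILTIN_SVE        /* added later by the SVE ACLE patch */
};

/* Illustrative values: the major (class) code lives in the low bits,
   the per-class minor code in the remaining bits.  */
#define AARCH64_BUILTIN_SHIFT 1
#define AARCH64_BUILTIN_CLASS ((1u << AARCH64_BUILTIN_SHIFT) - 1)

static inline unsigned int
combine_builtin_code (enum aarch64_builtin_class bclass, unsigned int subcode)
{
  return (subcode << AARCH64_BUILTIN_SHIFT) | (unsigned int) bclass;
}

static inline enum aarch64_builtin_class
builtin_code_class (unsigned int code)
{
  return (enum aarch64_builtin_class) (code & AARCH64_BUILTIN_CLASS);
}

static inline unsigned int
builtin_code_subcode (unsigned int code)
{
  return code >> AARCH64_BUILTIN_SHIFT;
}

aarch64.c can then dispatch on the class and forward the subcode to
aarch64-builtins.c or to the SVE file.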
2019-09-27 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* config/aarch64/aarch64-protos.h (aarch64_builtin_class): New enum.
(AARCH64_BUILTIN_SHIFT, AARCH64_BUILTIN_CLASS): New constants.
(aarch64_gimple_fold_builtin, aarch64_mangle_builtin_type)
(aarch64_fold_builtin, aarch64_init_builtins, aarch64_expand_builtin)
(aarch64_builtin_decl, aarch64_builtin_rsqrt): Delete.
(aarch64_general_mangle_builtin_type, aarch64_general_init_builtins)
(aarch64_general_fold_builtin, aarch64_general_gimple_fold_builtin)
(aarch64_general_expand_builtin, aarch64_general_builtin_decl)
(aarch64_general_builtin_rsqrt): Declare.
* config/aarch64/aarch64-builtins.c (aarch64_general_add_builtin):
New function.
(aarch64_mangle_builtin_type): Rename to...
(aarch64_general_mangle_builtin_type): ...this.
(aarch64_init_fcmla_laneq_builtins, aarch64_init_simd_builtins)
(aarch64_init_crc32_builtins, aarch64_init_builtin_rsqrt)
(aarch64_init_pauth_hint_builtins, aarch64_init_tme_builtins): Use
aarch64_general_add_builtin instead of add_builtin_function.
(aarch64_init_builtins): Rename to...
(aarch64_general_init_builtins): ...this. Use
aarch64_general_add_builtin instead of add_builtin_function.
(aarch64_builtin_decl): Rename to...
(aarch64_general_builtin_decl): ...this and remove the unused
arguments.
(aarch64_expand_builtin): Rename to...
(aarch64_general_expand_builtin): ...this and remove the unused
arguments.
(aarch64_builtin_rsqrt): Rename to...
(aarch64_general_builtin_rsqrt): ...this.
(aarch64_fold_builtin): Rename to...
(aarch64_general_fold_builtin): ...this. Take the function subcode
and return type as arguments. Remove the "ignored" argument.
(aarch64_gimple_fold_builtin): Rename to...
(aarch64_general_gimple_fold_builtin): ...this. Take the function
subcode and gcall as arguments, and return the new function call.
* config/aarch64/aarch64.c (aarch64_init_builtins)
(aarch64_fold_builtin, aarch64_gimple_fold_builtin)
(aarch64_expand_builtin, aarch64_builtin_decl): New functions.
(aarch64_builtin_reciprocal): Call aarch64_general_builtin_rsqrt
instead of aarch64_builtin_rsqrt.
(aarch64_mangle_type): Call aarch64_general_mangle_builtin_type
instead of aarch64_mangle_builtin_type.
From-SVN: r276177
For SVE, we'd like the frontends to check calls to target-specific
built-in functions in the same way that they already do for "normal"
builtins. This patch adds a target hook for that and extends
check_builtin_function_arguments accordingly.
A slight complication is that when TARGET_RESOLVE_OVERLOADED_BUILTIN
has resolved an overload, it can use build_function_call_vec to build
the call to the underlying non-overloaded function decl. This in
turn coerces the arguments to the function type and then calls
check_builtin_function_arguments to check the final call. If the
target does find a problem in this final call, it can be useful
to refer to the original overloaded function decl in diagnostics,
since that's what the user wrote.
The patch therefore passes the original decl as a final optional
parameter to build_function_call_vec.
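As a rough sketch of the intent, a target's checker might look like the
function below; the hook signature here is an assumption for illustration (the
real one is whatever target.def declares), and example_check_builtin_call is a
made-up name.  The point is that the checker can diagnose against the original
overloaded decl rather than the resolved one.

static bool
example_check_builtin_call (location_t loc, tree fndecl, tree orig_fndecl,
                            unsigned int nargs, tree *args)
{
  /* Hypothetical requirement: at least one argument.  */
  if (nargs == 0)
    {
      /* Name what the user actually wrote: the overloaded decl passed
         down from build_function_call_vec, if any.  */
      error_at (loc, "%qD requires at least one argument",
                orig_fndecl ? orig_fndecl : fndecl);
      return false;
    }
  (void) args;
  return true;
}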
2019-09-27 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* target.def (check_builtin_call): New target hook.
* doc/tm.texi.in (TARGET_CHECK_BUILTIN_CALL): New @hook.
* doc/tm.texi: Regenerate.
gcc/c-family/
* c-common.h (build_function_call_vec): Take the original
function decl as an optional final parameter.
(check_builtin_function_arguments): Take the original function decl.
* c-common.c (check_builtin_function_arguments): Likewise.
Handle all built-in functions, not just BUILT_IN_NORMAL ones.
Use targetm.check_builtin_call to check BUILT_IN_MD functions.
gcc/c/
* c-typeck.c (build_function_call_vec): Take the original function
decl as an optional final parameter. Pass all built-in calls to
check_builtin_function_arguments.
gcc/cp/
* cp-tree.h (build_cxx_call): Take the original function decl
as an optional final parameter.
(cp_build_function_call_vec): Likewise.
* call.c (build_cxx_call): Likewise. Pass all built-in calls to
check_builtin_function_arguments.
* typeck.c (build_function_call_vec): Take the original function
decl as an optional final parameter and pass it to
cp_build_function_call_vec.
(cp_build_function_call_vec): Take the original function
decl as an optional final parameter and pass it to build_cxx_call.
From-SVN: r276176
The then/else order of the VEC_COND_EXPRs created by
vect_create_epilog_for_reduction needs to line up with the
main VEC_COND_EXPR.
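A hedged example of the kind of conditional reduction involved (not the PR
testcase itself): the loop below keeps the index of the last match, so the
vectorized body selects between the new candidate and the previous reduction
value with a VEC_COND_EXPR, and the epilogue must keep the same then/else arm
order when it extracts the final value.

int
last_match (const int *a, int n, int key)
{
  int last = -1;
  for (int i = 0; i < n; i++)
    if (a[i] == key)
      last = i;   /* conditionally updated reduction */
  return last;
}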
2019-09-27 Richard Sandiford <richard.sandiford@arm.com>
gcc/
PR tree-optimization/91909
* tree-vect-loop.c (vect_create_epilog_for_reduction): Take a
reduc_index parameter. When handling COND_REDUCTION, make sure
that the reduction phi operand is in the correct arm of the
VEC_COND_EXPR.
(vectorizable_reduction): Pass reduc_index to the above.
From-SVN: r276175
This patch adds combining support for SVE2's shift-right accumulate
instructions.
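A hedged sketch of the kind of loop this targets (the exact shape of the new
test and the options are assumptions; something like -O2 -march=armv8-a+sve2
would be needed):

void
shracc (int *restrict a, const int *restrict b, int n)
{
  for (int i = 0; i < n; i++)
    a[i] += b[i] >> 4;   /* shift right, then accumulate: SSRA candidate */
}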
2019-09-27 Yuliang Wang <yuliang.wang@arm.com>
gcc/
* config/aarch64/aarch64-sve2.md (aarch64_sve2_sra<mode>):
New combine pattern.
gcc/testsuite/
* gcc.target/aarch64/sve2/shracc_1.c: New test.
From-SVN: r276174
Zero-sized fields do not get processed by finish_record_type: they're
removed from the field list before and reinserted after, so their
DECL_SIZE_UNIT remains unset, causing the translation of assignment
statements with use_memset_p, in quite unusual circumstances, to use a
NULL_TREE as the memset length. This patch sets DECL_SIZE_UNIT for
zero-sized fields in the language-specific layout, since they don't go
through the language-independent layout.
for gcc/ada/ChangeLog
* gcc-interface/decl.c (components_to_record): Set
DECL_SIZE_UNIT for zero-sized fields.
From-SVN: r276173
* charset.c (UCS_LIMIT): New macro.
(ucn_valid_in_identifier): Use it instead of a hardcoded constant.
(_cpp_valid_ucn): Issue a pedantic warning for UCNs larger than
UCS_LIMIT outside of identifiers in C and in C++2a or later.
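For illustration (a hedged example, not taken from the patch): UCS_LIMIT
corresponds to the top of the Unicode range, U+10FFFF, so a UCN beyond it in a
string literal is now diagnosed with -pedantic in C and in C++2a or later.

/* Code point above U+10FFFF: accepted, but now draws a pedantic warning
   outside identifiers.  */
const char *s = "\U00110000";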
From-SVN: r276167
Xtensa hwloop_optimize segfaults when a zero-overhead loop is about to be
inserted as the first instruction of the function.
Insert the zero-overhead loop instruction into a new basic block before
the loop when the basic block that precedes the loop is empty.
2019-09-26 Max Filippov <jcmvbkbc@gmail.com>
gcc/
* config/xtensa/xtensa.c (hwloop_optimize): Insert zero overhead
loop instruction into new basic block before the loop when basic
block that precedes the loop is empty.
gcc/testsuite/
* gcc.target/xtensa/pr91880.c: New test case.
* gcc.target/xtensa/xtensa.exp: New test suite.
From-SVN: r276166
We can use the mode iterators directly with an @pattern to avoid the
need for an expander that was only there to pass the mode through.
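A hedged, caller-side illustration: with an '@' name such as
"@load_macho_picbase_<mode>", the generated gen function takes the iterator's
mode as an explicit argument, so callers pass Pmode instead of selecting an
SImode or DImode expander (the snippet is schematic GCC-internal code, not a
verbatim excerpt):

static void
example_emit_picbase (rtx label)
{
  /* One '@' define_insn replaces the old per-mode expanders.  */
  emit_insn (gen_load_macho_picbase (Pmode, label));
}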
gcc/ChangeLog:
2019-09-26 Iain Sandoe <iain@sandoe.co.uk>
* config/rs6000/darwin.md: Replace the expanders for
load_macho_picbase and reload_macho_picbase with use of '@'
in their respective define_insns.
(nonlocal_goto_receiver): Pass Pmode to gen_reload_macho_picbase.
* config/rs6000/rs6000-logue.c (rs6000_emit_prologue): Pass
Pmode to gen_load_macho_picbase.
* config/rs6000/rs6000.md: Likewise.
From-SVN: r276159
2019-09-25 Richard Biener <rguenther@suse.de>
PR tree-optimization/91896
* tree-vect-loop.c (vectorizable_reduction): The single
def-use cycle optimization cannot apply when there's more
than one pattern stmt involved.
* gcc.dg/torture/pr91896.c: New testcase.
From-SVN: r276158
2019-09-26 Richard Biener <rguenther@suse.de>
* tree-vect-loop.c (vect_analyze_loop_operations): Also call
vectorizable_reduction for vect_double_reduction_def.
(vect_transform_loop): Likewise.
(vect_create_epilog_for_reduction): Move double-reduction
PHI creation and preheader argument setting of PHIs ...
(vectorizable_reduction): ... here. Also process
vect_double_reduction_def PHIs, creating the vectorized
PHI nodes, remembering the scalar adjustment computed for
the epilogue in STMT_VINFO_REDUC_EPILOGUE_ADJUSTMENT.
Remember the original reduction code in STMT_VINFO_REDUC_CODE.
* tree-vectorizer.c (vec_info::new_stmt_vec_info):
Initialize STMT_VINFO_REDUC_CODE.
* tree-vectorizer.h (_stmt_vec_info::reduc_epilogue_adjustment): New.
(_stmt_vec_info::reduc_code): Likewise.
(STMT_VINFO_REDUC_EPILOGUE_ADJUSTMENT): Likewise.
(STMT_VINFO_REDUC_CODE): Likewise.
From-SVN: r276150
When -march=native is passed through host_detect_local_cpu to the backend,
its expansion overrides all -march options that come after it on the
command line.  That means
$ gcc -march=native -march=armv8-a
is treated as
$ gcc -march=armv8-a -march=native
Prune joined switches with Negative and RejectNegative to allow
-march=armv8-a to override a previous -march=native on the command line.
This is the same fix as was applied for i386 in SVN revision 269164 but for
aarch64 and arm.
2019-09-26 Matt Turner <mattst88@gmail.com>
PR driver/69471
* config/aarch64/aarch64.opt (march=): Add Negative(march=).
(mtune=): Add Negative(mtune=).
(mcpu=): Add Negative(mcpu=).
* config/arm/arm.opt: Likewise.
From-SVN: r276148
This patch implements some more SIMD32 intrinsics; these ones have a DImode result and addend.
Apart from that there's nothing too exciting about them.
Bootstrapped and tested on arm-none-linux-gnueabihf.
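A hedged usage sketch, assuming the ACLE prototypes from arm_acle.h (int16x2_t
operands, 64-bit accumulator) and a target with the SIMD32/DSP instructions
enabled:

#include <arm_acle.h>
#include <stdint.h>

int64_t
dual_mac (int16x2_t a, int16x2_t b, int64_t acc)
{
  /* Dual signed 16x16 multiply with 64-bit accumulate (SMLALD):
     acc + a[low] * b[low] + a[high] * b[high].  */
  return __smlald (a, b, acc);
}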
* config/arm/arm.md (arm_<simd32_op>): New define_insn.
* config/arm/arm_acle.h (__smlald, __smlaldx, __smlsld, __smlsldx):
Define.
* config/arm/arm_acle_builtins.def: Define builtins for the above.
* config/arm/iterators.md (SIMD32_DIMODE): New int_iterator.
(simd32_op): Handle the above.
* config/arm/unspecs.md: Define unspecs for the above.
* gcc.target/arm/acle/simd32.c: Update test.
From-SVN: r276147
This patch is part of a series to implement the SIMD32 ACLE intrinsics [1].
The interesting parts implementation-wise involve adding support for setting and reading
the Q bit for saturation and the GE-bits for the packed SIMD instructions.
That will come in a later patch.
For now, this patch implements the other intrinsics that don't need anything special;
just a mapping from arm_acle.h function to builtin to RTL expander+unspec.
I've compressed as many as I could with iterators so that we end up needing only 3
new define_insns.
Bootstrapped and tested on arm-none-linux-gnueabihf.
[1] https://developer.arm.com/docs/101028/latest/data-processing-intrinsics
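As an example of the sort of mapping involved (a hedged sketch using the ACLE
prototypes; the option needed to enable SIMD32 is target-dependent):

#include <arm_acle.h>

uint8x4_t
sat_add4 (uint8x4_t a, uint8x4_t b)
{
  /* Four lanes of unsigned saturating byte addition (UQADD8).  */
  return __uqadd8 (a, b);
}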
* config/arm/arm.md (arm_<simd32_op>): New define_insn.
(arm_<sup>xtb16): Likewise.
(arm_usada8): Likewise.
* config/arm/arm_acle.h (__qadd8, __qsub8, __shadd8, __shsub8,
__uhadd8, __uhsub8, __uqadd8, __uqsub8, __qadd16, __qasx, __qsax,
__qsub16, __shadd16, __shasx, __shsax, __shsub16, __uhadd16, __uhasx,
__uhsax, __uhsub16, __uqadd16, __uqasx, __uqsax, __uqsub16, __sxtab16,
__sxtb16, __uxtab16, __uxtb16): Define.
* config/arm/arm_acle_builtins.def: Define builtins for the above.
* config/arm/unspecs.md: Define unspecs for the above.
* config/arm/iterators.md (SIMD32_NOGE_BINOP): New int_iterator.
(USXTB16): Likewise.
(simd32_op): New int_attribute.
(sup): Handle UNSPEC_SXTB16, UNSPEC_UXTB16.
* doc/sourcebuild.texi (arm_simd32_ok): Document.
* lib/target-supports.exp
(check_effective_target_arm_simd32_ok_nocache): New procedure.
(check_effective_target_arm_simd32_ok): Likewise.
(add_options_for_arm_simd32): Likewise.
* gcc.target/arm/acle/simd32.c: New test.
From-SVN: r276146
My recent assemble_real patch (r275873) meant that we now output
negative FP16 constants in the same way as we'd output an integer
subreg of them. This patch updates gcc.target/arm/fp16-* accordingly.
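Concretely (a hedged illustration assuming the IEEE binary16 encoding the
scans rely on):

/* -2.0 in binary16 is sign=1, exponent=10000, mantissa=0, i.e. 0xC000;
   emitted as an integer that is now the signed -16384 rather than the
   unsigned 49152.  */
__fp16 x = -2.0;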
2019-09-26 Richard Sandiford <richard.sandiford@arm.com>
gcc/testsuite/
* gcc.target/arm/fp16-compile-alt-3.c: Expect (__fp16) -2.0
to be written as a negative short rather than a positive one.
* gcc.target/arm/fp16-compile-ieee-3.c: Likewise.
From-SVN: r276145
The PLUS handling in aarch64_rtx_costs only checked for nonnegative
constants, meaning that simple immediate subtractions like:
(set (reg R1) (plus (reg R2) (const_int -8)))
had a cost of two instructions.
2019-09-26 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* config/aarch64/aarch64.c (aarch64_rtx_costs): Use
aarch64_plus_immediate rather than aarch64_uimm12_shift
to test for valid PLUS immediates.
From-SVN: r276140
gcc/fortran/ChangeLog:
PR fortran/91426
* error.c (curr_diagnostic): New static variable.
(gfc_report_diagnostic): New static function.
(gfc_warning): Replace call to diagnostic_report_diagnostic with
call to gfc_report_diagnostic.
(gfc_format_decoder): Colorize the text of %L and %C to match the
colorization used by diagnostic_show_locus.
(gfc_warning_now_at): Replace call to diagnostic_report_diagnostic with
call to gfc_report_diagnostic.
(gfc_warning_now): Likewise.
(gfc_warning_internal): Likewise.
(gfc_error_now): Likewise.
(gfc_fatal_error): Likewise.
(gfc_error_opt): Likewise.
(gfc_internal_error): Likewise.
From-SVN: r276132
Hi,
Martin and his clang warnings discovered that I forgot to remove a
static inline function and a variable when ripping out the old IPA-SRA
from tree-sra.c and both are now unused. Thus I am doing that now
with the patch below which I will commit as obvious (after including
it in a round of a bootstrap and testing on an x86_64-linux).
Thanks,
Martin
2019-09-25 Martin Jambor <mjambor@suse.cz>
* tree-sra.c (no_accesses_p): Remove.
(no_accesses_representant): Likewise.
From-SVN: r276128
We're somewhat inconsistent in arm_neon.h when it comes to using the implementation namespace for local
identifiers. This means things like:
#define hash_abcd 0
#define hash_e 1
#define wk 2
#include "arm_neon.h"
uint32x4_t
foo (uint32x4_t a, uint32_t b, uint32x4_t c)
{
  return vsha1cq_u32 (a, b, c);
}
don't compile.
This patch fixes these issues throughout the whole of arm_neon.h.
Bootstrapped and tested on aarch64-none-linux-gnu.
The advsimd-intrinsics.exp tests pass just fine.
From-SVN: r276125
2019-09-25 Richard Biener <rguenther@suse.de>
PR tree-optimization/91896
* tree-vect-loop.c (vectorizable_reduction): The single
def-use cycle optimization cannot apply when there's more
than one pattern stmt involved.
* gcc.dg/torture/pr91896.c: New testcase.
From-SVN: r276123
The DCache clean and ICache invalidation requirements for instruction/data
coherence are discoverable through new fields in CTR_EL0.
Let's support the two bits: when they are set, the corresponding DCache
clean or ICache invalidation instructions are unnecessary and can be
skipped.
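A hedged sketch of the idea (IDC is CTR_EL0 bit 28 and DIC bit 29 per the Arm
ARM; the function names and structure here are illustrative, not the exact
libgcc code):

static inline unsigned long
read_ctr_el0 (void)
{
  unsigned long ctr;
  __asm__ volatile ("mrs %0, ctr_el0" : "=r" (ctr));
  return ctr;
}

/* Schematic __aarch64_sync_cache_range-style routine: skip the per-line
   maintenance loops when CTR_EL0 says they are not required.  */
void
example_sync_cache_range (const char *base, const char *end)
{
  unsigned long ctr = read_ctr_el0 ();
  int idc = (ctr >> 28) & 1;   /* DCache clean not required for coherence  */
  int dic = (ctr >> 29) & 1;   /* ICache invalidation not required  */

  if (!idc)
    {
      /* ... "dc cvau" over [base, end) ... */
    }
  if (!dic)
    {
      /* ... "ic ivau" over [base, end), plus the usual barriers ... */
    }
  (void) base;
  (void) end;
}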
2019-09-25 Shaokun Zhang <zhangshaokun@hisilicon.com>
* config/aarch64/sync-cache.c (__aarch64_sync_cache_range): Add support for
CTR_EL0.IDC and CTR_EL0.DIC.
From-SVN: r276122
The break here was skipping over the code that sets EXPR_LOCATION on the
call expressions, for no good reason.
* parser.c (cp_parser_postfix_expression): Do set location of
dependent member call.
From-SVN: r276112
This switches the picbase load and reload patterns to use the 'P' mode
iterator instead of writing an SI and DI pattern for each.
gcc/ChangeLog:
2019-09-24 Iain Sandoe <iain@sandoe.co.uk>
* config/rs6000/rs6000.md (load_macho_picbase_<mode>): New, using
the 'P' mode iterator, replacing the (removed) SI and DI variants.
(reload_macho_picbase_<mode>): Likewise.
From-SVN: r276107