This patch adds the plumbing for the ACLE intrinsics that set the GE bits in
APSR. These are special SIMD instructions in Armv6 that pack bytes or
halfwords into the 32-bit general-purpose registers and set the GE bits in
APSR to indicate if some of the "lanes" of the result have overflowed or
have some other instruction-specific property.
These bits can then be used by the SEL instruction (accessed through the
__sel intrinsic) to select lanes for further processing.
This situation is similar to the Q-setting intrinsics: we have to track
the GE fake register, detect when a function reads it through __sel, and
restrict existing patterns that may generate GE-clobbering instructions from
straight-line C code when reading the GE bits matters.
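To make the lane and GE-bit semantics concrete, here is a minimal portable C model of one GE-setting operation (UADD8) and of SEL. The names model_uadd8 and model_sel are hypothetical and are not the ACLE intrinsics or anything in this patch; the GE state is passed explicitly instead of living in APSR.

```c
#include <stdint.h>

/* Hypothetical model of UADD8: add four unsigned byte lanes and record,
   per lane, whether the sum carried out of 8 bits (that lane's GE bit).  */
static uint32_t model_uadd8(uint32_t a, uint32_t b, unsigned *ge)
{
    uint32_t result = 0;
    *ge = 0;
    for (int lane = 0; lane < 4; lane++) {
        unsigned sum = ((a >> (8 * lane)) & 0xffu) + ((b >> (8 * lane)) & 0xffu);
        if (sum > 0xffu)
            *ge |= 1u << lane;              /* GE[lane] set on carry-out */
        result |= (uint32_t)(sum & 0xffu) << (8 * lane);
    }
    return result;
}

/* Hypothetical model of SEL: take each byte lane from a if the matching
   GE bit is set, otherwise from b.  */
static uint32_t model_sel(uint32_t a, uint32_t b, unsigned ge)
{
    uint32_t result = 0;
    for (int lane = 0; lane < 4; lane++) {
        uint32_t mask = (uint32_t)0xffu << (8 * lane);
        result |= ((ge >> lane) & 1u) ? (a & mask) : (b & mask);
    }
    return result;
}
```

In the model the dependency between the two calls is explicit through the ge argument. In the compiler that dependency is what the apsrge fake register has to carry.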
* config/arm/aout.h (REGISTER_NAMES): Add apsrge.
* config/arm/arm.md (APSRGE_REGNUM): Define.
(arm_<simd32_op>): New define_insn.
(arm_sel): Likewise.
* config/arm/arm.h (FIXED_REGISTERS): Add entry for apsrge.
(CALL_USED_REGISTERS): Likewise.
(REG_ALLOC_ORDER): Likewise.
(FIRST_PSEUDO_REGISTER): Update value.
(ARM_GE_BITS_READ): Define.
* config/arm/arm.c (arm_conditional_register_usage): Clear
APSRGE_REGNUM from operand_reg_set.
(arm_ge_bits_access): Define.
* config/arm/arm-builtins.c (arm_check_builtin_call): Handle
ARM_BUILTIN_sel.
* config/arm/arm-protos.h (arm_ge_bits_access): Declare prototype.
* config/arm/arm-fixed.md (add<mode>3): Convert to define_expand.
FAIL if ARM_GE_BITS_READ.
(*arm_add<mode>3): New define_insn.
(sub<mode>3): Convert to define_expand. FAIL if ARM_GE_BITS_READ.
(*arm_sub<mode>3): New define_insn.
* config/arm/arm_acle.h (__sel, __sadd8, __ssub8, __uadd8, __usub8,
__sadd16, __sasx, __ssax, __ssub16, __uadd16, __uasx, __usax,
__usub16): Define.
* config/arm/arm_acle_builtins.def: Define builtins for the above.
* config/arm/iterators.md (SIMD32_GE): New int_iterator.
(simd32_op): Handle the above.
* config/arm/unspecs.md (UNSPEC_GE_SET): Define.
(UNSPEC_SEL, UNSPEC_SADD8, UNSPEC_SSUB8, UNSPEC_UADD8, UNSPEC_USUB8,
UNSPEC_SADD16, UNSPEC_SASX, UNSPEC_SSAX, UNSPEC_SSUB16, UNSPEC_UADD16,
UNSPEC_UASX, UNSPEC_USAX, UNSPEC_USUB16): Define.
* gcc.target/arm/acle/simd32.c: Update test.
* gcc.target/arm/acle/simd32_sel.c: New test.
From-SVN: r277917
This patch implements some more Q-setting intrinsics from the SMLA* group.
These can set the saturation bit on overflow in the accumulation step.
As before, these also have non-Q-setting RTL forms for when the Q-bit read
is not needed.
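As an illustration of "overflow in the accumulation step", the following is a small portable C model of an SMLABB-style operation. It is a sketch only: model_smlabb, low_half and model_q are hypothetical names, and the sticky Q bit is modelled as a plain global rather than a bit in APSR.

```c
#include <stdint.h>

static int model_q;   /* hypothetical stand-in for the sticky Q bit */

/* Sign-extend the bottom halfword of v without relying on
   implementation-defined narrowing conversions.  */
static int32_t low_half(int32_t v)
{
    uint32_t u = (uint32_t)v & 0xffffu;
    return (u & 0x8000u) ? (int32_t)u - 0x10000 : (int32_t)u;
}

/* Multiply the bottom halfwords of x and y, then add acc.  The result
   wraps (it is not saturated); only the Q flag records the overflow.  */
static int32_t model_smlabb(int32_t x, int32_t y, int32_t acc)
{
    int32_t product = low_half(x) * low_half(y);   /* cannot overflow */
    int64_t wide = (int64_t)product + acc;
    if (wide > INT32_MAX) {
        model_q = 1;
        wide -= 4294967296LL;                      /* wrap modulo 2^32 */
    } else if (wide < INT32_MIN) {
        model_q = 1;
        wide += 4294967296LL;
    }
    return (int32_t)wide;
}
```

When nothing in the function reads model_q, the flag update is dead and only the multiply-add matters, which is the case the non-Q-setting RTL forms cover.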
* config/arm/arm.md (arm_smlabb_setq): New define_insn.
(arm_smlabb): New define_expand.
(*maddhisi4tb): Rename to...
(maddhisi4tb): ... This.
(*maddhisi4tt): Rename to...
(maddhisi4tt): ... This.
(arm_smlatb_setq): New define_insn.
(arm_smlatb): New define_expand.
(arm_smlatt_setq): New define_insn.
(arm_smlatt): New define_expand.
(arm_<smlaw_op><add_clobber_name>_insn): New define_insn.
(arm_<smlaw_op>): New define_expand.
* config/arm/arm_acle.h (__smlabb, __smlatb, __smlabt, __smlatt,
__smlawb, __smlawt): Define.
* config/arm/arm_acle_builtins.def: Define builtins for the above.
* config/arm/iterators.md (SMLAWBT): New int_iterator.
(smlaw_op): New int_attribute.
* config/arm/unspecs.md (UNSPEC_SMLAWB, UNSPEC_SMLAWT): Define.
* gcc.target/arm/acle/dsp_arith.c: Update test.
From-SVN: r277916
This patch implements some more Q-bit-setting intrinsics from ACLE.
With the plumbing from patch 1 in place they are a simple builtin->RTL
affair.
* config/arm/arm.md (arm_<ss_op>): New define_expand.
(arm_<ss_op><add_clobber_q_name>_insn): New define_insn.
* config/arm/arm_acle.h (__qadd, __qsub, __qdbl): Define.
* config/arm/arm_acle_builtins.def: Add builtins for qadd, qsub.
* config/arm/iterators.md (SSPLUSMINUS): New code iterator.
(ss_op): New code_attr.
* gcc.target/arm/acle/dsp_arith.c: New test.
From-SVN: r277915
This patch adds the plumbing for and an implementation of the saturation
intrinsics from ACLE, in particular the __ssat, __usat intrinsics.
These intrinsics set the Q sticky bit in APSR if an overflow occurred.
ACLE allows the user to read that bit (within the same function, it's not
defined across function boundaries) using the __saturation_occurred
intrinsic and reset it using __set_saturation_occurred.
Thus, if the user cares about the Q bit they would be using a flow such as:
__set_saturation_occurred (0); // reset the Q bit
...
__ssat (...) // Do some calculations involving __ssat
...
if (__saturation_occurred ()) // if Q bit set handle overflow
...
For the implementation this has a few implications:
* We must track the Q-setting side-effects of these instructions to make
sure saturation reading/writing intrinsics are ordered properly.
This is done by introducing a new "apsrq" register (and associated
APSRQ_REGNUM) in a similar way to the "fake" cc register.
* The RTL patterns coming out of these intrinsics can have two forms:
one where they set the APSRQ_REGNUM and one where they don't.
Which one is used depends on whether the function cares about reading the Q
flag. This is detected using the TARGET_CHECK_BUILTIN_CALL hook on the
__saturation_occurred, __set_saturation_occurred occurrences.
If no Q-flag read is present in the function we'll use the simpler
non-Q-setting form to allow for more aggressive scheduling and such.
If a Q-bit read is present then the Q-setting form is emitted.
To avoid adding two patterns for each intrinsic to the MD file we make
use of define_subst to auto-generate the Q-setting forms.
* Some existing patterns already produce instructions that may clobber the
Q bit, but they don't model it (as we didn't care about that bit up till
now).
Since these patterns can be generated from straight-line C code they can
affect the Q-bit reads from intrinsics. Therefore they have to be disabled
when
a Q-bit read is present. These are mostly patterns in arm-fixed.md that are
not very common anyway, but there are also a couple of widening
multiply-accumulate patterns in arm.md that can set the Q-bit during
accumulation.
There are more Q-setting intrinsics in ACLE, but these will be
implemented in a more mechanical fashion once the infrastructure in this
patch goes in.
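The reset / saturate / test flow and the "sticky" behaviour of the Q bit described above can be sketched in portable C. This is only a model under stated assumptions: the model_* functions and the sat_q variable are hypothetical names, not the ACLE intrinsics or anything added by this patch.

```c
#include <stdint.h>

static int sat_q;   /* hypothetical stand-in for the sticky Q bit */

static void model_set_saturation_occurred(int value) { sat_q = (value != 0); }
static int  model_saturation_occurred(void)          { return sat_q; }

/* Saturate x to a signed value of the given width (bits in 1..32).
   The flag is only ever set here, never cleared, so it stays set until
   model_set_saturation_occurred (0) resets it.  */
static int32_t model_ssat(int32_t x, unsigned bits)
{
    int64_t max = ((int64_t)1 << (bits - 1)) - 1;
    int64_t min = -max - 1;
    if (x > max) { sat_q = 1; return (int32_t)max; }
    if (x < min) { sat_q = 1; return (int32_t)min; }
    return x;
}
```

Because a later in-range model_ssat call leaves the flag set, the order of the saturating operations relative to the reset and the read is observable. That ordering is what the apsrq register lets the compiler track.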
* config/arm/aout.h (REGISTER_NAMES): Add apsrq.
* config/arm/arm.md (APSRQ_REGNUM): Define.
(add_setq): New define_subst.
(add_clobber_q_name): New define_subst_attr.
(add_clobber_q_pred): Likewise.
(maddhisi4): Change to define_expand. Split into mult and add if
ARM_Q_BIT_READ.
(arm_maddhisi4): New define_insn.
(*maddhisi4tb): Disable for ARM_Q_BIT_READ.
(*maddhisi4tt): Likewise.
(arm_ssat): New define_expand.
(arm_usat): Likewise.
(arm_get_apsr): New define_insn.
(arm_set_apsr): Likewise.
(arm_saturation_occurred): New define_expand.
(arm_set_saturation): Likewise.
(*satsi_<SAT:code>): Rename to...
(satsi_<SAT:code><add_clobber_q_name>): ... This.
(*satsi_<SAT:code>_shift): Disable for ARM_Q_BIT_READ.
* config/arm/arm.h (FIXED_REGISTERS): Mark apsrq as fixed.
(CALL_USED_REGISTERS): Mark apsrq.
(FIRST_PSEUDO_REGISTER): Update value.
(REG_ALLOC_ORDER): Add APSRQ_REGNUM.
(machine_function): Add q_bit_access.
(ARM_Q_BIT_READ): Define.
* config/arm/arm.c (TARGET_CHECK_BUILTIN_CALL): Define.
(arm_conditional_register_usage): Clear APSRQ_REGNUM from
operand_reg_set.
(arm_q_bit_access): Define.
* config/arm/arm-builtins.c: Include stringpool.h.
(arm_sat_binop_imm_qualifiers,
arm_unsigned_sat_binop_unsigned_imm_qualifiers,
arm_sat_occurred_qualifiers, arm_set_sat_qualifiers): Define.
(SAT_BINOP_UNSIGNED_IMM_QUALIFIERS,
UNSIGNED_SAT_BINOP_UNSIGNED_IMM_QUALIFIERS, SAT_OCCURRED_QUALIFIERS,
SET_SAT_QUALIFIERS): Likewise.
(arm_builtins): Define ARM_BUILTIN_SAT_IMM_CHECK.
(arm_init_acle_builtins): Initialize __builtin_sat_imm_check.
Handle 0 argument expander.
(arm_expand_acle_builtin): Handle ARM_BUILTIN_SAT_IMM_CHECK.
(arm_check_builtin_call): Define.
* config/arm/arm.md (ssmulsa3, usmulusa3, usmuluha3,
arm_ssatsihi_shift, arm_usatsihi): Disable when ARM_Q_BIT_READ.
* config/arm/arm-protos.h (arm_check_builtin_call): Declare prototype.
(arm_q_bit_access): Likewise.
* config/arm/arm_acle.h (__ssat, __usat, __ignore_saturation,
__saturation_occurred, __set_saturation_occurred): Define.
* config/arm/arm_acle_builtins.def: Define builtins for ssat, usat,
saturation_occurred, set_saturation_occurred.
* config/arm/unspecs.md (UNSPEC_Q_SET): Define.
(UNSPEC_APSR_READ): Likewise.
(VUNSPEC_APSR_WRITE): Likewise.
* config/arm/arm-fixed.md (ssadd<mode>3): Convert to define_expand.
(*arm_ssadd<mode>3): New define_insn.
(sssub<mode>3): Convert to define_expand.
(*arm_sssub<mode>3): New define_insn.
(ssmulsa3): Convert to define_expand.
(*arm_ssmulsa3): New define_insn.
(usmulusa3): Convert to define_expand.
(*arm_usmulusa3): New define_insn.
(ssmulha3): FAIL if ARM_Q_BIT_READ.
(arm_ssatsihi_shift, arm_usatsihi): Disable for ARM_Q_BIT_READ.
* config/arm/iterators.md (qaddsub_clob_q): New mode attribute.
* gcc.target/arm/acle/saturation.c: New test.
* gcc.target/arm/acle/sat_no_smlatb.c: Likewise.
* lib/target-supports.exp (check_effective_target_arm_qbit_ok_nocache):
Define.
(check_effective_target_arm_qbit_ok): Likewise.
(add_options_for_arm_qbit): Likewise.
From-SVN: r277914
2019-11-07 Martin Liska <mliska@suse.cz>
PR c++/92354
* cgraph.c (delete_function_version): Clear global
variable version_info_node if equal to deleted
function.
2019-11-07 Martin Liska <mliska@suse.cz>
PR c++/92354
* g++.target/i386/pr92354.C: New test.
From-SVN: r277913
2019-11-07 Martin Liska <mliska@suse.cz>
* merge.sh: Update to use llvm-project git repository.
* all source files: Merge from upstream
82588e05cc32bb30807e480abd4e689b0dee132a.
From-SVN: r277909
gcc/
Support 64-bit double and 64-bit long double configurations.
PR target/92055
* config.gcc (tm_defines) [avr]: Set from --with-double=,
--with-long-double=.
* config/avr/t-multilib: Remove.
* config/avr/t-avr: Output of genmultilib.awk is now fully
dynamically generated and no longer part of the repo.
(HAVE_DOUBLE_MULTILIB, HAVE_LONG_DOUBLE_MULTILIB): New variables.
Pass them down to...
* config/avr/genmultilib.awk: ...here and handle them.
* gcc/config/avr/avr.opt (-mdouble=, avr_double): New option and var.
(-mlong-double=, avr_long_double): New option and var.
* common/config/avr/avr-common.c (opts.h, diagnostic.h): Include.
(TARGET_OPTION_OPTIMIZATION_TABLE) <-mdouble=, -mlong-double=>:
Set default as requested by --with-double=.
(TARGET_HANDLE_OPTION): Define to this...
(avr_handle_option): ...new hook worker.
* config/avr/avr.h (DOUBLE_TYPE_SIZE): Define to avr_double.
(LONG_DOUBLE_TYPE_SIZE): Define to avr_long_double.
(avr_double_lib): New proto for spec function.
(EXTRA_SPEC_FUNCTIONS) <double-lib>: Add.
(DRIVER_SELF_SPECS): Call %:double-lib.
* config/avr/avr.c (avr_option_override): Assert
sizeof(long double) >= sizeof(double) for the target.
* config/avr/avr-c.c (avr_cpu_cpp_builtins)
[__HAVE_DOUBLE_MULTILIB__, __HAVE_LONG_DOUBLE_MULTILIB__]
[__HAVE_DOUBLE64__, __HAVE_DOUBLE32__, __DEFAULT_DOUBLE__=]
[__HAVE_LONG_DOUBLE64__, __HAVE_LONG_DOUBLE32__]
[__HAVE_LONG_DOUBLE_IS_DOUBLE__, __DEFAULT_LONG_DOUBLE__=]:
New built-in define depending on --with-double=, --with-long-double=.
* config/avr/driver-avr.c (avr_double_lib): New spec function.
* doc/invoke.texi (AVR Options) <-mdouble=,-mlong-double=>: Doc.
* doc/install.texi (Cross-Compiler-Specific Options)
<--with-double=, --with-long-double=>: Doc.
libgcc/
Support 64-bit double and 64-bit long double configurations.
PR target/92055
* config/avr/t-avr (HOST_LIBGCC2_CFLAGS): Only add -DF=SF if
long double is a 32-bit type.
* config/avr/t-avrlibc: Copy double64 and long-double64
multilib(s) from the vanilla one.
* config/avr/t-copy-libgcc: New Makefile snip.
From-SVN: r277908
2019-11-06 Jerry DeLisle <jvdelisle@gcc.gnu.org>
PR fortran/90374
* io.c (check_format): Allow zero width for D, E, EN, and ES
specifiers as default and when -std=f2018 is given. Retain
existing errors when using the -fdec family of flags.
* libgfortran/io/format.c (parse_format_list): Relax format checking for
zero width as default and when -std=f2018.
* io/format.h (format_token): Move definition to io.h.
* io/io.h (format_token): Add definition here to allow access to
this definition at higher levels. Rename the declaration of
write_real_g0 to write_real_w0 and add a new format_token
argument, allowing higher level functions to pass in the
token for handling of g0 vs the other zero width specifiers.
* io/transfer.c (formatted_transfer_scalar_write): Add checks for
zero width and call write_real_w0 to handle it.
* io/write.c (write_real_g0): Remove.
(write_real_w0): Add new, same as previous write_real_g0 except
check format token to handle the g0 case.
* gfortran.dg/fmt_error_10.f: Modify for new constraints.
* gfortran.dg/fmt_error_7.f: Add dg-options "-std=f95".
* gfortran.dg/fmt_error_9.f: Modify for new constraints.
* gfortran.dg/fmt_zero_width.f90: New test.
From-SVN: r277905
gcc/testsuite/ChangeLog:
2019-11-07 Xiong Hu Luo <luoxhu@linux.ibm.com>
* gcc.target/powerpc/pr72804.c: Move inline options from
dg-require-effective-target to dg-options.
From-SVN: r277904
This patch is another piece of preparation for C2x attributes support.
C2x attributes require unbounded lookahead in the parser, because the
token sequence '[[' that starts a C2x attribute is also valid in
Objective-C in some of the same contexts, so it is necessary to see
whether the matching ']]' are consecutive tokens or not to determine
whether those tokens start an attribute.
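The lookahead test described here can be sketched in a few lines of C. This is a deliberately simplified illustration, not the parser code from this patch series: the token kinds and the starts_attribute function are hypothetical, and only square brackets are tracked.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical token kinds; everything that is not a square bracket is
   lumped into T_OTHER for this sketch.  */
enum tok { T_OPEN_SQUARE, T_CLOSE_SQUARE, T_OTHER };

/* Return true if toks begins with '[' '[' and the ']' that closes the
   inner '[' is immediately followed by another ']'.  The scan may have
   to look arbitrarily far ahead, hence "unbounded lookahead".  */
static bool starts_attribute(const enum tok *toks, size_t n)
{
    if (n < 4 || toks[0] != T_OPEN_SQUARE || toks[1] != T_OPEN_SQUARE)
        return false;
    int depth = 0;
    for (size_t i = 0; i < n; i++) {
        if (toks[i] == T_OPEN_SQUARE) {
            depth++;
        } else if (toks[i] == T_CLOSE_SQUARE) {
            depth--;
            if (depth == 1)     /* this ']' closes the inner '[' */
                return i + 1 < n && toks[i + 1] == T_CLOSE_SQUARE;
        }
    }
    return false;               /* ran out of tokens before a match */
}
```

With this model, a sequence shaped like [[deprecated]] is accepted, while an Objective-C nested message send shaped like [[obj foo] bar] is rejected because the two closing brackets are not adjacent.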
Unbounded lookahead means lexing an unbounded number of tokens before
they are parsed. c_lex_one_token does various context-sensitive
processing of tokens that cannot be done at that lookahead time,
because it depends on information (such as whether particular
identifiers are typedefs) that may be different at the time it is
relevant than at the time the lookahead is needed (recall that more or
less arbitrary C code, including declarations and statements, can
appear inside expressions in GNU C).
Most of that context-sensitive processing is not a problem, simply
because it is not needed for lookahead purposes so can be deferred
until the tokens lexed during lookahead are parsed. However, the
earliest piece of context-sensitive processing is the handling of
string literals based on flags passed to c_lex_with_flags, which
determine whether adjacent literals are concatenated and whether
translation to the execution character set occurs.
Because the choice of whether to translate to the execution character
set is context-sensitive, this means that unbounded lookahead requires
the C parser to move to the approach used by the C++ parser, where
string literals are generally not translated or concatenated from
within c_lex_with_flags, but only later in the parser once it knows
whether translation is needed. (Translation requires the tokens in
their form before concatenation.)
Thus, this patch makes that change to the C parser. Flags in the
parser are still used for two special cases similar to C++: the
handling of an initial #pragma pch_preprocess, and arranging for
strings inside attributes not to be translated (the latter is made
more logically correct by saving and restoring the flags, as in the
C++ parser, rather than assuming that the state outside the attribute
was always to translate string literals, which might not be the case
in corner cases involving declarations and attributes inside
attributes).
The consequent change to pragma_lex to use c_parser_string_literal
makes it disallow wide strings and disable translation in that
context, which also follows C++ and is more logically correct than the
previous state without special handling in that regard. Translation
to the execution character set is always disabled when string
constants are handled in the GIMPLE parser.
Although the handling of strings is now a lot closer to that in C++,
there are still some differences, in particular regarding the handling
of locations. See c-c++-common/Wformat-pr88257.c, which has different
expected multiline diagnostic output for C and C++, for example; I'm
not sure whether the C or C++ output is better there (C++ has a more
complete range than C, C mentions a macro definition location that C++
doesn't), but I tried to keep the locations the same as those
previously used by the C front end, as far as possible, to minimize
the testsuite changes needed, rather than possibly making them closer
to those used with C++.
The only changes needed for tests of user-visible diagnostics were for
the wording of one diagnostic changing to match C++ (as a consequence
of having a check for wide strings based on a flag in a general
string-handling function rather than in a function specific to asm).
However, although locations are extremely similar to what they were
before, I couldn't make them completely identical in all cases. (My
understanding of the implementation reason for the differences is as
follows: lex_string uses src_loc from each cpp_token; the C parser is
using the virtual location from cpp_get_token_with_location as called
by c_lex_with_flags, and while passing that through
linemap_resolve_location with LRK_MACRO_DEFINITION_LOCATION, as this
patch does, produces something very close to what lex_string uses,
it's not completely identical in some cases.)
This results in changes being needed to two of the gcc.dg/plugin tests
that use a plugin to test details of how string locations are handled.
Because the tests being changed are for ICEs and the only change is to
the details of the particular non-user-visible error that code gives
in cases it can't handle (one involving __FILE__, one involving a
string literal from stringizing), I think it's OK to change that
non-user-visible error and that the new errors are no worse than the
old ones. So these particular errors are now different for C and C++
(some other messages in those tests already had differences between C
and C++).
Bootstrapped with no regressions on x86_64-pc-linux-gnu.
gcc/c:
* c-parser.c (c_parser): Remove lex_untranslated_string. Add
lex_joined_string and translate_strings_p.
(c_lex_one_token): Pass 0 or C_LEX_STRING_NO_JOIN to
c_lex_with_flags.
(c_parser_string_literal): New function.
(c_parser_static_assert_declaration_no_semi): Use
c_parser_string_literal. Do not set lex_untranslated_string.
(c_parser_asm_string_literal): Use c_parser_string_literal.
(c_parser_simple_asm_expr): Do not set lex_untranslated_string.
(c_parser_gnu_attributes): Set and restore translate_strings_p
instead of lex_untranslated_string.
(c_parser_asm_statement): Do not set lex_untranslated_string.
(c_parser_asm_operands): Likewise.
(c_parser_has_attribute_expression): Set and restore
translate_strings_p instead of lex_untranslated_string.
(c_parser_postfix_expression): Use c_parser_string_literal.
(pragma_lex): Likewise.
(c_parser_pragma_pch_preprocess): Set lex_joined_string.
(c_parse_file): Set translate_strings_p.
* gimple-parser.c (c_parser_gimple_postfix_expression)
(c_parser_gimple_or_rtl_pass_list): Use c_parser_string_literal.
* c-parser.h (c_parser_string_literal): Declare function.
gcc/testsuite:
* gcc.dg/asm-wide-1.c, gcc.dg/diagnostic-token-ranges.c,
gcc.dg/plugin/diagnostic-test-string-literals-1.c,
gcc.dg/plugin/diagnostic-test-string-literals-2.c: Update expected
diagnostics.
From-SVN: r277903
ISO C++ paper D1907R1 proposes "structural type" as an alternative to the
current notion of "strong structural equality", which has various problems.
I'm implementing it to give people a chance to try it.
The build_base_field changes are to make it easier for structural_type_p to
see whether a base is private or protected.
* tree.c (structural_type_p): New.
* pt.c (invalid_nontype_parm_type_p): Use it.
* class.c (build_base_field_1): Take binfo. Copy TREE_PRIVATE.
(build_base_field): Pass binfo.
From-SVN: r277902
Here unify was getting confused by the VIEW_CONVERT_EXPR we add in
finish_id_expression_1 to make class NTTP const when they're used in an
expression.
Tested x86_64-pc-linux-gnu, applying to trunk.
* pt.c (unify): Handle VIEW_CONVERT_EXPR.
From-SVN: r277901
gcc/cp/
2019-11-06 Andrew Sutton <asutton@lock3software.com>
* constraint.cc (build_parameter_mapping): Use
current_template_parms when the declaration is not available.
(norm_info::norm_info): Make explicit.
(normalize_constraint_expression): Factor into a separate overload
that takes arguments, and use that in the original function.
(tsubst_nested_requirement): Use satisfy_constraint instead of
trying to evaluate this as a constant expression.
(finish_nested_requirement): Keep the normalized constraint and the
original normalization arguments with the requirement.
(diagnose_nested_requirement): Use satisfy_constraint. Tentatively
implement more comprehensive diagnostics, but do not enable.
* parser.c (cp_parser_requires_expression): Relax requirement that
requires-expressions can live only inside templates.
* pt.c (any_template_parm_r): Look into type of PARM_DECL.
2019-11-06 Jason Merrill <jason@redhat.com>
* pt.c (use_pack_expansion_extra_args_p): Still do substitution if
all packs are simple pack expansions.
(add_extra_args): Check that the extra args aren't dependent.
gcc/testsuite/
* lib/prune.exp: Ignore "in requirements" in diagnostics.
* g++.dg/cpp2a/requires-18.C: New test.
* g++.dg/cpp2a/requires-19.C: New test.
From-SVN: r277900
When multiple hard registers are required to hold the frame pointer,
ensure that the registers after the first are marked as non-allocatable,
live and eliminable as well.
2019-11-07 Kwok Cheung Yeung <kcy@codesourcery.com>
gcc/
* ira.c (setup_alloc_regs): Setup no_unit_alloc_regs for
frame pointer in multiple registers.
(ira_setup_eliminable_regset): Setup eliminable_regset,
ira_no_alloc_regs and regs_ever_live for frame pointer in
multiple registers.
From-SVN: r277895
The test works by checking that a known framework path is accessible
when the '-F' option is given. We need a framework path that
exists across a range of Darwin versions and is parsable by GCC. This
adjusts the test to use a header path that exists and is parsable from
Darwin9 through Darwin19.
gcc/testsuite/ChangeLog:
2019-11-06 Iain Sandoe <iain@sandoe.co.uk>
* gcc.dg/framework-1.c: Adjust test header path.
From-SVN: r277892
No real use cases have ever arisen for constraints on non-templated
functions, and handling of them has never been entirely clear, so the
committee agreed to accept this national body comment proposing that we
remove them.
* decl.c (grokfndecl): Reject constraints on non-templated function.
From-SVN: r277891
* ggc-common.c (ggc_prune_overhead_list): Do not delete surviving
allocations.
* mem-stats.h (mem_alloc_description<T>::release_object_overhead):
Do not silently ignore summary corruptions.
From-SVN: r277890
Now that operator<=> is supported, these operators can be generated by
the compiler.
* include/bits/iterator_concepts.h (unreachable_sentinel_t): Remove
redundant equality operators.
* testsuite/util/testsuite_iterators.h (test_range::sentinel):
Likewise.
From-SVN: r277888
This change lets grok_op_properties print its useful "ISO C++ prohibits
overloading operator ?:" message instead of the cryptic error message about
a missing type-specifier before '?' token.
2019-11-06 Matthias Kretz <m.kretz@gsi.de>
* parser.c (cp_parser_operator): Parse operator?: as an
attempt to overload the conditional operator.
From-SVN: r277887
With a later patch I saw a case in which we peeled a single iteration
for gaps but didn't need to peel further iterations to make up a full
vector. We then tried to vectorise the single-iteration epilogue.
2019-11-06 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vect-loop.c (vect_analyze_loop): Only try to vectorize
the epilogue if there are peeled iterations for it to handle.
From-SVN: r277886
With later patches, we're able to vectorise the epilogues of these tests
on AArch64 and so get two instances of "vectorizing stmts using SLP".
Although it would be possible with a bit of effort to predict when
this happens, it doesn't seem important whether we get 1 vs. 2
occurrences. All that matters is zero vs. nonzero.
2019-11-06 Richard Sandiford <richard.sandiford@arm.com>
gcc/testsuite/
* gcc.dg/vect/slp-9.c: Use scan-tree-dump rather than
scan-tree-dump-times.
* gcc.dg/vect/slp-widen-mult-s16.c: Likewise.
* gcc.dg/vect/slp-widen-mult-u8.c: Likewise.
From-SVN: r277881
The number of iterations of an epilogue loop is always smaller than the
VF of the main loop. vect_analyze_loop_costing was taking this into
account when deciding whether the loop is cheap enough to vectorise,
but that has no effect with the unlimited cost model. We need to use
a separate check for correctness as well.
This can happen if the sizes returned by autovectorize_vector_sizes
happen to be out of order, e.g. because the target prefers smaller
vectors. It can also happen with later patches if two vectorisation
attempts happen to end up with the same VF.
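The arithmetic behind this check can be written down as a small C sketch. The function names are hypothetical, and the exact condition used in vect_analyze_loop_2 is an assumption inferred from the description above, not copied from the patch.

```c
#include <stdbool.h>

/* Scalar iterations left over once the main vector loop has consumed
   full groups of main_vf iterations; always smaller than main_vf.  */
static unsigned epilogue_iterations(unsigned n, unsigned main_vf)
{
    return n % main_vf;
}

/* Assumed form of the correctness check: an unmasked epilogue loop needs
   at least epilogue_vf iterations to run once, so its VF must be smaller
   than the main loop's VF unless the epilogue is fully masked.  */
static bool epilogue_vf_is_usable(unsigned epilogue_vf, unsigned main_vf,
                                  bool fully_masked)
{
    return fully_masked || epilogue_vf < main_vf;
}
```

For example, with a main VF of 8 at most 7 iterations remain, so an unmasked epilogue with VF 8 could never execute a full vector iteration.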
2019-11-06 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vect-loop.c (vect_analyze_loop_2): When vectorizing an
epilogue loop, make sure that the VF is small enough or that
the epilogue loop can be fully-masked.
From-SVN: r277880
Once vect_analyze_loop has found a valid loop_vec_info X, we carry
on searching for alternatives if (1) X doesn't satisfy simdlen or
(2) we want to vectorize the epilogue of X. I have a patch that
optionally adds a third reason: we want to see if there are cheaper
alternatives to X.
This patch restructures vect_analyze_loop so that it's easier
to add more reasons for continuing. There's supposed to be no
behavioural change.
If we wanted to, we could allow vectorisation of epilogues once
loop->simdlen has been reached by changing "loop->simdlen" to
"simdlen" in the new vect_epilogues condition. That should be
a separate change though.
2019-11-06 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vect-loop.c (vect_analyze_loop): Break out of the main
loop when we've finished, rather than returning directly from
the loop. Use a local variable to track whether we're still
searching for the preferred simdlen. Make vect_epilogues
record whether the next iteration should try to treat the
loop as an epilogue.
From-SVN: r277879
Currently, code has to check __ARC_FPU_SP__ || __ARC_FPU_DP__ to detect
hard float, and the inverse of that to detect soft float.
So define a single convenience macro for each case.
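A short sketch of how user code might look before and after this change. The two function names are hypothetical. The "unknown" fallback is only there so the snippet also compiles with compilers that do not define the new macros, in which case the older-style check reports "soft" simply because the FPU macros are undefined.

```c
/* Older style: combine the two FPU macros by hand.  */
static const char *float_abi_old_style(void)
{
#if defined(__ARC_FPU_SP__) || defined(__ARC_FPU_DP__)
    return "hard";
#else
    return "soft";
#endif
}

/* With the convenience macros added by this patch.  */
static const char *float_abi_new_style(void)
{
#if defined(__ARC_HARD_FLOAT__)
    return "hard";
#elif defined(__ARC_SOFT_FLOAT__)
    return "soft";
#else
    return "unknown";   /* compiler does not define the new macros */
#endif
}
```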
gcc/
xxxx-xx-xx Vineet Gupta <vgupta@synopsys.com>
* config/arc/arc-c.c (arc_cpu_cpp_builtins): Add
__arc_hard_float__, __ARC_HARD_FLOAT__,
__arc_soft_float__, __ARC_SOFT_FLOAT__.
From-SVN: r277878
This was first submitted many years ago
https://gcc.gnu.org/ml/gcc-patches/2010-10/msg02468.html
The command line option -fcallgraph-info is added and makes the
compiler generate another output file (xxx.ci) for each compilation
unit (or LTO partition), which is a valid VCG file (you can launch
your favorite VCG viewer on it unmodified) and contains the "final"
callgraph of the unit. "final" is a bit of a misnomer as this is
actually the callgraph at RTL expansion time, but since most
high-level optimizations are done at the Tree level and RTL doesn't
usually fiddle with calls, it's final in almost all cases. Moreover,
the nodes can be decorated with additional info: -fcallgraph-info=su
adds stack usage info and -fcallgraph-info=da dynamic allocation info.
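As a hedged illustration, a translation unit like the following contains both a direct call and a dynamically sized object, which are the kinds of things the option is described as recording. The file name example.c and the function names are hypothetical, and no particular .ci output is claimed here.

```c
/* Possible invocations, using only the forms named above:
     gcc -c -fcallgraph-info example.c
     gcc -c -fcallgraph-info=su example.c
     gcc -c -fcallgraph-info=da example.c  */
#include <stddef.h>

/* Callee: fill the buffer and return the sum of what was written.  */
static unsigned fill_and_sum(unsigned char *buf, size_t n)
{
    unsigned sum = 0;
    for (size_t i = 0; i < n; i++) {
        buf[i] = (unsigned char)i;
        sum += buf[i];
    }
    return sum;
}

/* Caller: the variable-length array is a dynamic stack allocation,
   and the call below is an edge in the unit's callgraph.  */
unsigned use_dynamic_buffer(size_t n)
{
    unsigned char buf[n ? n : 1];
    return fill_and_sum(buf, n);
}
```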
for gcc/ChangeLog
From Eric Botcazou <ebotcazou@adacore.com>, Alexandre Oliva <oliva@adacore.com>
* common.opt (-fcallgraph-info[=]): New option.
* doc/invoke.texi (Developer options): Document it.
* opts.c (common_handle_option): Handle it.
* builtins.c (expand_builtin_alloca): Record allocation if
-fcallgraph-info=da.
* calls.c (expand_call): If -fcallgraph-info, record the call.
(emit_library_call_value_1): Likewise.
* flag-types.h (enum callgraph_info_type): New type.
* explow.c: Include stringpool.h.
(set_stack_check_libfunc): Set SET_SYMBOL_REF_DECL on the symbol.
* function.c (allocate_stack_usage_info): New.
(allocate_struct_function): Call it for -fcallgraph-info.
(prepare_function_start): Call it otherwise.
(record_final_call, record_dynamic_alloc): New.
* function.h (struct callinfo_callee): New.
(CALLEE_FROM_CGRAPH_P): New.
(struct callinfo_dalloc): New.
(struct stack_usage): Add callees and dallocs.
(record_final_call, record_dynamic_alloc): Declare.
* gimplify.c (gimplify_decl_expr): Record dynamically-allocated
object if -fcallgraph-info=da.
* optabs-libfuncs.c (build_libfunc_function): Keep SYMBOL_REF_DECL.
* print-tree.h (print_decl_identifier): Declare.
(PRINT_DECL_ORIGIN, PRINT_DECL_NAME, PRINT_DECL_UNIQUE_NAME): New.
* print-tree.c: Include print-tree.h.
(print_decl_identifier): New function.
* toplev.c: Include print-tree.h.
(callgraph_info_file): New global variable.
(callgraph_info_external_printed): Likewise.
(output_stack_usage): Rename to...
(output_stack_usage_1): ... this. Make it static, add cf
parameter. If -fcallgraph-info=su, print stack usage to cf.
If -fstack-usage, use print_decl_identifier for
pretty-printing.
(INDIRECT_CALL_NAME): New.
(dump_final_node_vcg_start): New.
(dump_final_callee_vcg, dump_final_node_vcg): New.
(output_stack_usage): New.
(lang_dependent_init): Open and start file if
-fcallgraph-info. Allocate callgraph_info_external_printed.
(finalize): If callgraph_info_file is not null, finish it,
close it, and release callgraph_info_external_printed.
for gcc/ada/ChangeLog
* gcc-interface/misc.c (callgraph_info_file): Delete.
Co-Authored-By: Alexandre Oliva <oliva@adacore.com>
From-SVN: r277876
OpenACC (cf. OpenACC 2.7, section 2.9.11. "reduction clause";
this was first clarified by OpenACC 2.6) requires that, if a
variable is used in reduction clauses on two nested loops, then
there must be reduction clauses for that variable on all loops
that are nested in between the two loops and all these reduction
clauses must use the same operator.
This commit introduces a check for that property which reports
warnings if it is violated.
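A minimal sketch of the nesting rule, assuming a three-deep loop nest. The function name is hypothetical. Without -fopenacc the pragmas are ignored and the function is plain C.

```c
/* The variable sum is reduced on the outermost and innermost loops, so
   the loop in between must carry a reduction clause for sum with the
   same operator.  Dropping the clause on the j loop is the kind of
   nesting the new check is described as warning about.  */
int count_cells(int n)
{
    int sum = 0;
#pragma acc parallel loop reduction(+:sum)
    for (int i = 0; i < n; i++) {
#pragma acc loop reduction(+:sum)
        for (int j = 0; j < n; j++) {
#pragma acc loop reduction(+:sum)
            for (int k = 0; k < n; k++)
                sum += 1;
        }
    }
    return sum;
}
```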
2019-11-06 Gergö Barany <gergo@codesourcery.com>
Frederik Harwath <frederik@codesourcery.com>
Thomas Schwinge <thomas@codesourcery.com>
gcc/
* omp-low.c (struct omp_context): New fields
local_reduction_clauses, outer_reduction_clauses.
(new_omp_context): Initialize these.
(scan_sharing_clauses): Record reduction clauses on OpenACC constructs.
(scan_omp_for): Check reduction clauses for incorrect nesting.
gcc/testsuite/
* c-c++-common/goacc/nested-reductions-warn.c: New test.
* c-c++-common/goacc/nested-reductions.c: New test.
* gfortran.dg/goacc/nested-reductions-warn.f90: New test.
* gfortran.dg/goacc/nested-reductions.f90: New test.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/par-loop-comb-reduction-1.c:
Add expected warnings about missing reduction clauses.
* testsuite/libgomp.oacc-c-c++-common/par-loop-comb-reduction-2.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/par-loop-comb-reduction-3.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/par-loop-comb-reduction-4.c:
Likewise.
Reviewed-by: Thomas Schwinge <thomas@codesourcery.com>
From-SVN: r277875
-finline-functions has been enabled by default at -O2 since r276469, so
update the test cases to use -fno-inline-functions.
c11-atomic-exec-5.c still hits an LRA ICE on BE systems (PR92090).
This commit is NOT a fix for the bug and so it must NOT be closed.
gcc/testsuite/ChangeLog:
2019-11-06 Xiong Hu Luo <luoxhu@linux.ibm.com>
PR92090
* gcc.target/powerpc/pr72804.c: Add -fno-inline-functions --param
max-inline-insns-single-O2=200.
* gcc.target/powerpc/pr79439-1.c: Add -fno-inline-functions.
* gcc.target/powerpc/vsx-builtin-7.c: Likewise.
From-SVN: r277872
The combine pass is perfectly happy if a splitter splits to just one
instruction (instead of two).
* doc/md.texi (Insn Splitting): Fix combiner documentation.
From-SVN: r277866
There are three major pieces to this support: scalar operator<=>,
synthesis of comparison operators, and rewritten/reversed overload
resolution (e.g. a < b becomes 0 < (b <=> a)).
Unlike other defaulted functions, where we use synthesized_method_walk to
semi-simulate what the definition of the function will be like, this patch
determines the characteristics of a comparison operator by trying to define
it.
My handling of non-dependent rewritten operators in templates can still use
some work: build_min_non_dep_op_overload can't understand the rewrites and
crashes, so I'm avoiding it for now by clearing *overload. This means we'll
do name lookup again at instantiation time, which can incorrectly produce a
different result. I'll poke at this more in stage 3.
I'm leaving out a fourth section ("strong structural equality") even though
I've implemented it, because it seems likely to change radically tomorrow.
Thanks to Tim van Deurzen and Jakub for implementing lexing of the <=>
operator, and Jonathan for the initial <compare> header.
gcc/cp/
* cp-tree.h (struct lang_decl_fn): Add maybe_deleted bitfield.
(DECL_MAYBE_DELETED): New.
(enum special_function_kind): Add sfk_comparison.
(LOOKUP_REWRITTEN, LOOKUP_REVERSED): New.
* call.c (struct z_candidate): Add rewritten and reversed methods.
(add_builtin_candidate): Handle SPACESHIP_EXPR.
(add_builtin_candidates): Likewise.
(add_candidates): Don't add a reversed candidate if the parms are
the same.
(add_operator_candidates): Split out from build_new_op_1. Handle
rewritten and reversed candidates.
(add_candidate): Swap conversions of reversed candidate.
(build_new_op_1): Swap them back. Build a second operation for
rewritten candidates.
(extract_call_expr): Handle rewritten calls.
(same_fn_or_template): New.
(joust): Handle rewritten and reversed candidates.
* class.c (add_implicitly_declared_members): Add implicit op==.
(classtype_has_op, classtype_has_defaulted_op): New.
* constexpr.c (cxx_eval_binary_expression): Handle SPACESHIP_EXPR.
(cxx_eval_constant_expression, potential_constant_expression_1):
Likewise.
* cp-gimplify.c (genericize_spaceship): New.
(cp_genericize_r): Use it.
* cp-objcp-common.c (cp_common_init_ts): Handle SPACESHIP_EXPR.
* decl.c (finish_function): Handle deleted function.
* decl2.c (grokfield): SET_DECL_FRIEND_CONTEXT on defaulted friend.
(mark_used): Check DECL_MAYBE_DELETED. Remove assumption that
defaulted functions are non-static members.
* error.c (dump_expr): Handle SPACESHIP_EXPR.
* method.c (type_has_trivial_fn): False for sfk_comparison.
(enum comp_cat_tag, struct comp_cat_info_t): New types.
(comp_cat_cache): New array variable.
(lookup_comparison_result, lookup_comparison_category)
(is_cat, cat_tag_for, spaceship_comp_cat)
(spaceship_type, genericize_spaceship)
(common_comparison_type, early_check_defaulted_comparison)
(comp_info, build_comparison_op): New.
(synthesize_method): Handle sfk_comparison. Handle deleted.
(get_defaulted_eh_spec, maybe_explain_implicit_delete)
(explain_implicit_non_constexpr, implicitly_declare_fn)
(defaulted_late_check, defaultable_fn_check): Handle sfk_comparison.
* name-lookup.c (get_std_name_hint): Add comparison categories.
* tree.c (special_function_p): Add sfk_comparison.
* typeck.c (cp_build_binary_op): Handle SPACESHIP_EXPR.
2019-11-05 Tim van Deurzen <tim@kompiler.org>
Add new tree code for the spaceship operator.
gcc/cp/
* cp-tree.def: Add new tree code.
* operators.def: New binary operator.
* parser.c: Add new token and tree code.
libcpp/
* cpplib.h: Add spaceship operator for C++.
* lex.c: Implement conditional lexing of spaceship operator for C++20.
2019-11-05 Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/
* libsupc++/compare: New header.
* libsupc++/Makefile.am (std_HEADERS): Add compare.
* include/std/version: Define __cpp_lib_three_way_comparison.
* include/std/functional: #include <compare>.
From-SVN: r277865
While working on C++20 operator<=>, I noticed that build_new_op_1 was doing
too much conversion when a built-in candidate was selected; the standard
says it should only perform user-defined conversions, and then leave the
normal operator semantics to handle any standard conversions. This is
important for operator<=> because a comparison of two different unscoped
enums is ill-formed; if we promote the enums to int here, cp_build_binary_op
never gets to see the original operand types, so we can't give the error.
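To see why early promotion hides the problem, the following sketch (not
taken from the patch; Color, Fruit and the two helper functions are made-up
names) checks the operand types before and after integral promotion. Unary
plus is used here only as a convenient way to apply the promotions.

```cpp
#include <cassert>
#include <type_traits>

// Two unrelated unscoped enums; hypothetical names for illustration.
enum Color { red, green };
enum Fruit { apple, pear };

// Before promotion the operands have distinct enumeration types.
bool same_type_before_promotion()
{
  return std::is_same<decltype(red), decltype(apple)>::value;
}

// After integral promotion both operands are plain int, so the
// information that they came from different enums is gone.
bool same_type_after_promotion()
{
  return std::is_same<decltype(+red), decltype(+apple)>::value;
}

static_assert(std::is_same<decltype(+red), int>::value,
              "a small unscoped enum promotes to int");
```

If the operands reach the code that diagnoses mixed-enum comparisons only
after this promotion, both look like int and there is nothing left to
diagnose.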
I'm also disabling -Wmaybe-uninitialized for expmed.c to avoid the bootstrap
failure from the last time I applied this patch.
* call.c (build_new_op_1): Don't apply any standard conversions to
the operands of a built-in operator. Don't suppress conversions in
cp_build_unary_op.
* typeck.c (cp_build_unary_op): Do integral promotions for enums.
PR tree-optimization/91825
* expmed.c: Reduce -Wmaybe-uninitialized to warning.
From-SVN: r277864
My operator<=> patch wants to split up build_new_op_1, which makes using a
tree array as well as the vec inconvenient. build_new_op_1 already has a
vec, and build_conditional_expr_1 can release its vec right away, so this
doesn't increase garbage at all.
* call.c (build_builtin_candidate): Take args in a vec.
(add_builtin_candidate, add_builtin_candidates): Likewise.
(build_conditional_expr_1, build_new_op_1): Adjust.
From-SVN: r277863
Add wrappers for lookup_qualified_name and build_x_binary_op to make calling
them more convenient in places, and a function named contextual_conv_bool
for places that want contextual conversion to bool.
I noticed that we weren't showing the declaration location when we complain
about a call to a non-constexpr function where a constant expression is
required.
If maybe_instantiate_noexcept doesn't actually instantiate, there's no
reason for it to mess with clones.
* constexpr.c (explain_invalid_constexpr_fn): Show location of fn.
* pt.c (maybe_instantiate_noexcept): Only update clones if we
instantiated.
* typeck.c (contextual_conv_bool): New.
* name-lookup.c (lookup_qualified_name): Add wrapper overload taking
C string rather than identifier.
* parser.c (cp_parser_userdef_numeric_literal): Use it.
* rtti.c (emit_support_tinfos): Use it.
* cp-tree.h (ovl_op_identifier): Change to inline functions.
(build_x_binary_op): Add wrapper with fewer parms.
From-SVN: r277862
The RISC-V backend wants to use a libcall when optimizing for size if
more than 6 instructions are needed. However, emit_move_complex asks for
no libcalls. This case requires 8 insns for rv64 and 16 insns for rv32,
so we get fallback code that emits a loop. commit_one_edge_insertion
doesn't allow code inserted for a phi node on an edge to end with a
branch, and so this triggers an assertion. This problem goes away if
we allow libcalls when optimizing for size, which gives the code the
RISC-V backend wants, and avoids triggering the assert.
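The decision described above can be modelled with a small stand-alone
sketch. This is a toy model, not GCC code: Strategy, choose_expansion,
expand_before_patch and expand_after_patch are hypothetical names, the
threshold of 6 comes from the description above, and the behaviour when
optimizing for speed is an assumption made only to keep the model complete.

```cpp
#include <cassert>

// Hypothetical outcomes of expanding a block move in this toy model.
enum class Strategy { InlineInsns, Libcall, Loop };

// Toy model of the backend's choice when expanding a block move.
Strategy choose_expansion(int insns_needed, bool for_speed,
                          bool libcall_allowed)
{
  const int size_limit = 6;  // from the description above
  if (for_speed || insns_needed <= size_limit)
    return Strategy::InlineInsns;  // assumption for the speed case
  // Optimizing for size and too many insns: prefer a libcall, and
  // fall back to a loop if the caller forbade libcalls.
  return libcall_allowed ? Strategy::Libcall : Strategy::Loop;
}

// Before the patch the caller never allowed libcalls.
Strategy expand_before_patch(int insns_needed, bool for_speed)
{
  return choose_expansion(insns_needed, for_speed, false);
}

// After the patch libcalls are forbidden only when optimizing for speed.
Strategy expand_after_patch(int insns_needed, bool for_speed)
{
  return choose_expansion(insns_needed, for_speed, !for_speed);
}
```

In this model the rv64 case (8 insns, optimizing for size) yields the loop
fallback before the patch and a libcall after it, which matches the
behaviour described in the text.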
gcc/
PR middle-end/92263
* expr.c (emit_move_complex): Only use BLOCK_OP_NO_LIBCALL when
optimize_insn_for_speed_p is true.
gcc/testsuite/
PR middle-end/92263
* gcc.dg/pr92263.c: New.
From-SVN: r277861
While looking at CA378 I noticed that we weren't properly diagnosing two of
the three erroneous lines in the example.
* decl2.c (mark_used): Diagnose use of a function with unsatisfied
constraints here.
* typeck.c (cp_build_function_call_vec): Not here.
From-SVN: r277860