The interface of an intrinsic procedure is automatically explicit.
Do not write it to the module file, to avoid spurious ambiguity errors on USE.
gcc/fortran/ChangeLog:
PR fortran/63797
* module.c (write_symtree): Do not write interface of intrinsic
procedure to module file for F2003 and newer.
gcc/testsuite/ChangeLog:
PR fortran/63797
* gfortran.dg/pr63797.f90: New test.
Co-authored-by: Paul Thomas <pault@gcc.gnu.org>
For z10 and newer, inner loops are completely unrolled, which means store
motion is not applied. Reverting max-completely-peeled-insns to the
default value fixes these testcases.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/pr83403-1.c: Revert
max-completely-peeled-insns to the default value on IBM Z.
* gcc.dg/tree-ssa/pr83403-2.c: Likewise.
Here, reduced_constant_expression_p is incorrectly returning true for a
partially initialized array CONSTRUCTOR (in C++20 mode) because when the
CONSTRUCTOR_NO_CLEARING flag is set, the predicate doesn't check that
the CONSTRUCTOR spans the entire array like it does for class CONSTRUCTORs.
This patch adds a dedicated loop for the array case that simultaneously
verifies the CONSTRUCTOR spans the entire array and is made up of valid
constant expressions.
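As a rough illustration of that combined check, here is a minimal sketch outside the compiler. It is not the code in constexpr.c: `Entry`, `covers_whole_array` and the `is_constant` flag are hypothetical stand-ins for CONSTRUCTOR elements (single indices or index ranges) and for the recursive constant test, and the entries are assumed to be sorted by index.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one CONSTRUCTOR element: it initializes the
// inclusive index range [lo, hi] with a single value.
struct Entry {
  std::size_t lo;
  std::size_t hi;
  bool is_constant;  // stand-in for "the value is a reduced constant"
};

// One pass that checks both properties at once: the entries (assumed to be
// sorted by index) leave no gap in [0, nelts), and every value is constant.
bool covers_whole_array(const std::vector<Entry>& entries, std::size_t nelts) {
  std::size_t expected = 0;  // next index that must be initialized
  for (const Entry& e : entries) {
    if (e.lo != expected || e.hi < e.lo || !e.is_constant)
      return false;
    expected = e.hi + 1;
  }
  return expected == nelts;
}
```

A partially initialized array corresponds to the case where the loop ends with `expected` smaller than `nelts`.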
gcc/cp/ChangeLog:
PR c++/99700
* constexpr.c (reduced_constant_expression_p): For array
CONSTRUCTORs, use a dedicated loop that additionally verifies
the CONSTRUCTOR spans the entire array.
gcc/testsuite/ChangeLog:
PR c++/99700
* g++.dg/cpp2a/constexpr-init21.C: New test.
The testcase used to be compiled at -O2 by GCC8 and earlier to:
f1:
neg w1, w0, asr 16
and w1, w1, 65535
orr w0, w1, w0, lsl 16
ret
f2:
neg w1, w0
extr w0, w1, w0, 16
ret
but since GCC9 (r9-3594 for f1 and r9-6926 for f2) we compile it into:
f1:
mov w1, w0
sbfx x0, x1, 16, 16
neg w0, w0
bfi w0, w1, 16, 16
ret
f2:
neg w1, w0
sbfx x0, x0, 16, 16
bfi w0, w1, 16, 16
ret
instead, i.e. one insn longer each. With this patch we get:
f1:
mov w1, w0
neg w0, w1, asr 16
bfi w0, w1, 16, 16
ret
f2:
neg w1, w0
extr w0, w1, w0, 16
ret
i.e. identical f2 and same number of insns as in GCC8 in f1.
The combiner unfortunately doesn't try splitters when doing 2 -> 1
combination, so it can't be implemented as combine splitters, but
it could be implemented as define_insn_and_split if desirable.
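For reference, source along the following lines would be consistent with the assembly above. This is a reconstruction that assumes a struct of two 16-bit fields passed and returned in w0 on a little-endian target; it is not copied from pr100075.c.

```cpp
#include <cstdint>

struct S {
  std::int16_t x;  // low half of w0 under the assumed layout
  std::int16_t y;  // high half of w0 under the assumed layout
};

// Negate the second field and swap the two halves.
S f1(S p) { return S{static_cast<std::int16_t>(-p.y), p.x}; }

// Swap the two halves and negate the first field.
S f2(S p) { return S{p.y, static_cast<std::int16_t>(-p.x)}; }
```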
2021-04-16 Jakub Jelinek <jakub@redhat.com>
PR target/100075
* config/aarch64/aarch64.md (*neg_asr_si2_extr, *extrsi5_insn_di): New
define_insn patterns.
* gcc.target/aarch64/pr100075.c: New test.
This patch fixes a regression introduced by the rtl-ssa patches.
It was seen on HPPA but it might be latent elsewhere.
The problem is that the traditional way of expanding an untyped_call
is to emit sequences like:
(call (mem (symbol_ref "foo")))
(set (reg pseudo1) (reg result1))
...
(set (reg pseudon) (reg resultn))
The ABI specifies that result1..resultn are clobbered by the call but
nothing in the RTL indicates that result1..resultn are the results
of the call. Normally, using a clobbered value gives undefined results,
but in this case the results are well-defined and matter for correctness.
This seems like a niche case, so I think it would be better to mark
it explicitly rather than try to detect it heuristically.
Note that in expand_builtin_apply we already have an rtx_insn *,
so it doesn't matter whether we call emit_call_insn or emit_insn.
Calling emit_insn seems more natural now that the gen_* call
has been split out. It also matches later code in the function.
gcc/
PR rtl-optimization/98689
* reg-notes.def (UNTYPED_CALL): New note.
* combine.c (distribute_notes): Handle it.
* emit-rtl.c (try_split): Likewise.
* rtlanal.c (rtx_properties::try_to_add_insn): Likewise. Assume
that calls with the note implicitly set all return value registers.
* builtins.c (expand_builtin_apply): Add a REG_UNTYPED_CALL
to untyped_calls.
This patch fixes a GCC 11 regression caused by the rtl-ssa code.
Normally we treat calls as containing a potential set of a global
register, but DF makes a sensible exception for the stack pointer:
  if (i == STACK_POINTER_REGNUM)
    /* The stack ptr is used (honorarily) by a CALL insn. */
    df_ref_record (DF_REF_BASE, collection_rec, regno_reg_rtx[i],
                   NULL, bb, insn_info, DF_REF_REG_USE,
                   DF_REF_CALL_STACK_USAGE | flags);
  else if (global_regs[i])
    {
      /* Calls to const functions cannot access any global registers and
         calls to pure functions cannot set them. All other calls may
         reference any of the global registers, so they are recorded as
         used. */
The only DF definition of SP was therefore the one in the entry block.
However, the rtlanal.c rtx_properties code (wrongly) assumed that calls
also clobbered the global SP. This led to multiple definitions of SP
when we only expected one.
This patch tightens the rtlanal.c handling of global registers
to match the DF approach.
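The resulting rules can be summarised with a small model. This is only a sketch of the decision table described in this message; `CallKind`, `global_reg_access` and the flag names are hypothetical and do not appear in rtlanal.c.

```cpp
enum class CallKind { Const, Pure, Other };

// Bit flags describing how a call is assumed to touch a global register.
constexpr unsigned kNoAccess = 0;
constexpr unsigned kRead = 1u << 0;
constexpr unsigned kWrite = 1u << 1;

// Hypothetical model of the per-register decision for a call insn.
unsigned global_reg_access(CallKind kind, bool is_stack_pointer) {
  if (is_stack_pointer)
    return kNoAccess;  // a stack pointer marked global is ignored here
  switch (kind) {
    case CallKind::Const:
      return kNoAccess;  // const calls cannot access global registers
    case CallKind::Pure:
      return kRead;      // pure calls can only read them
    case CallKind::Other:
      return kRead | kWrite;
  }
  return kRead | kWrite;  // conservative fallback
}
```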
gcc/
PR rtl-optimization/99596
* rtlanal.c (rtx_properties::try_to_add_insn): Don't add global
register accesses for const calls. Assume that pure functions
can only read from global registers. Ignore cases in which
the stack pointer has been marked global.
gcc/testsuite/
PR rtl-optimization/99596
* gcc.target/arm/pr99596.c: New test.
Commit r11-8168 changed the word ordering of a warning in order to
make the text more consistent. Unfortunately, it neglected to update
some filters in the testsuite that are intended to strip such warnings
when we try to partially override the user-supplied command-line
options.
This patch rectifies this and also fixes some patterns that were
incorrectly specified in the first place.
gcc/testsuite:
PR target/100067
* g++.target/arm/arm.exp (dg_runtest_extra_prunes): Update prune
template.
* gcc.target/arm/arm.exp (dg_runtest_extra_prunes): Likewise.
* g++.target/arm/mve.exp (dg_runtest_extra_prunes): Likewise. Fix
missing quotes around switch names.
* gcc.target/arm/mve/mve.exp (dg_runtest_extra_prunes): Likewise.
The following testcase ICEs because DCE is disabled, so there are dead
stmts in the loop (in theory they could also become dead only shortly
before if-conv through some optimization). ifcvt goes through all stmts
in the loop and if-converts them into .COND_DIV etc. internal fn calls
in the copy of the loop meant for vectorization only. The loop is
successfully vectorized, but the particular .COND_* call is not, because
it isn't a live statement. The scalar .COND_* then remains in the IL until
expansion, where it ICEs because these ifns only support vectors and not
scalars.
These ifns are similar to .MASK_{LOAD,STORE} in this behavior.
One possible fix could be to expand scalar versions of them during
expansion, basically undoing what if-conv did to create them, i.e.
expand them as the lhs = else; if (mask) { lhs = statement; } or so.
For .MASK_LOAD we have code to replace them in vect_transform_loop already
though (not needed for .MASK_STORE, as stores should be always live
and thus always vectorized), so this patch instead replaces .COND_*
similarly to .MASK_LOAD in that loop, with the small difference
that lhs = .MASK_LOAD (...); is replaced by lhs = 0; while
lhs = .COND_* (..., else_arg); is replaced by lhs = else_arg.
The statement must be dead, otherwise it would be vectorized, so I think
it is not a big deal we don't turn it back into multiple basic blocks etc.
(and it might be not possible to do that at that point).
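To make the replacement concrete, here is a scalar sketch of what a conditional division means and what the dead call is turned into. The function names are hypothetical; only the semantics (operate when the mask is set, otherwise yield else_arg) are taken from the description above.

```cpp
// Scalar meaning of lhs = .COND_DIV (mask, a, b, else_arg): divide when the
// mask is set, otherwise yield else_arg.
int cond_div(bool mask, int a, int b, int else_arg) {
  return mask ? a / b : else_arg;
}

// What the dead scalar call is replaced with: lhs = else_arg. This differs
// from cond_div whenever the mask is set, which is only acceptable because
// nothing reads lhs.
int dead_cond_replacement(bool /*mask*/, int /*a*/, int /*b*/, int else_arg) {
  return else_arg;
}
```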
2021-04-16 Jakub Jelinek <jakub@redhat.com>
PR target/99767
* tree-vect-loop.c (vect_transform_loop): Don't remove just
dead scalar .MASK_LOAD calls, but also dead .COND_* calls - replace
them by their last argument.
* gcc.target/aarch64/pr99767.c: New test.
The requires-clause parsing has code that suggests wrapping non-primary
expressions in (). If it parses a primary expression and sees that it is
followed by ++, --, ., ( or ->, among other things, it tries to reparse
it as an assignment expression or similar and, if that works, suggests
wrapping it in parens.
When the requires-clause comes after <typename T> etc. in a lambda, there
is already an exception to that, as ( can occur in a valid C++20 expression
there - starting the parameters of the lambda.
In C++23 another case can occur: as the parameters with the ()s can be
omitted, requires C can be followed immediately by -> which starts a
trailing return type. Even in that case, we don't want to parse that
as C->...
2021-04-16 Jakub Jelinek <jakub@redhat.com>
PR c++/99850
* parser.c (cp_parser_constraint_requires_parens) <case CPP_DEREF>:
If lambda_p, return pce_ok instead of pce_maybe_postfix.
* g++.dg/cpp23/lambda-specifiers2.C: New test.
The following testcase ICEs in tsubst_decomp_names because the assumption
that the structured binding artificial var is followed in DECL_CHAIN by
the corresponding structured binding vars is violated.
I've tracked it to extract_locals* which is done for the constexpr
IF_STMT. extract_locals_r when it sees a DECL_EXPR adds that decl
into a hash set so that such decls aren't returned from extract_locals*,
but in the case of a structured binding that just means the artificial var
and not the vars corresponding to structured binding identifiers.
The following patch fixes it by pushing not just the artificial var
for structured bindings but also the other vars.
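The kind of code involved looks roughly like the sketch below: a structured binding declared inside an if constexpr in a generic lambda. It is written for illustration only (it needs C++17), is not taken from pr99833.C, and is not claimed to reproduce the ICE.

```cpp
#include <utility>

// A generic lambda whose body declares a structured binding inside the
// taken branch of an if constexpr.
template <typename T>
int sum_of_pair(T value) {
  auto add = [](auto p) {
    if constexpr (sizeof(p) > 0) {
      // One DECL_EXPR here stands for an artificial variable plus the
      // two named bindings.
      auto [first, second] = p;
      return first + second;
    } else {
      return 0;
    }
  };
  return add(value);
}
```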
2021-04-16 Jakub Jelinek <jakub@redhat.com>
PR c++/99833
* pt.c (extract_locals_r): When handling DECL_EXPR of a structured
binding, add to data.internal also all corresponding structured
binding decls.
* g++.dg/cpp1z/pr99833.C: New test.
* g++.dg/cpp2a/pr99833.C: New test.
For z10 and newer, inner loops are completely unrolled, which leaves no
inner loops to jam and makes this testcase fail. Reverting
max-completely-peel-times to the default value fixes this testcase.
gcc/testsuite/ChangeLog:
* gcc.dg/unroll-and-jam.c: Revert max-completely-peel-times to
the default value on IBM Z.
The new testcase was breaking because constexpr evaluation was simplifying
Bar{Baz<42>{}} to Bar{empty}, but then we weren't treating them as
equivalent. Poking at this revealed that the code for eliding trailing
zero-initialization in class non-type template argument mangling was pretty
broken, including the test, mangle71.
I dealt with the FIXME to support RANGE_EXPR, and fixed the confusion
between a list-initialized temporary mangled as written (i.e. in the
signature of a function template) and a template parameter object mangled as
the value representation of the object. I'm distinguishing between these
using COMPOUND_LITERAL_P. A later patch will adjust the use of
COMPOUND_LITERAL_P to be more useful for this distinction, but it works now
for distinguishing these cases in mangling.
gcc/cp/ChangeLog:
PR c++/100079
* cp-tree.h (first_field): Declare.
* mangle.c (range_expr_nelts): New.
(write_expression): Improve class NTTP mangling.
* pt.c (get_template_parm_object): Clear TREE_HAS_CONSTRUCTOR.
* tree.c (zero_init_expr_p): Improve class NTTP handling.
* decl.c: Adjust comment.
gcc/testsuite/ChangeLog:
PR c++/100079
* g++.dg/abi/mangle71.C: Fix expected mangling.
* g++.dg/abi/mangle77.C: New test.
* g++.dg/cpp2a/nontype-class-union1.C: Likewise.
* g++.dg/cpp2a/nontype-class-equiv1.C: Removed.
* g++.dg/cpp2a/nontype-class44.C: New test.
Resolves:
PR c/99420 - bogus -Warray-parameter on a function redeclaration in function scope
PR c/99972 - missing -Wunused-result on a call to a locally redeclared warn_unused_result function
gcc/c/ChangeLog:
PR c/99420
PR c/99972
* c-decl.c (pushdecl): Always propagate type attribute.
gcc/testsuite/ChangeLog:
PR c/99420
PR c/99972
* gcc.dg/Warray-parameter-9.c: New test.
* gcc.dg/Wnonnull-6.c: New test.
* gcc.dg/Wreturn-type3.c: New test.
* gcc.dg/Wunused-result.c: New test.
* gcc.dg/attr-noreturn.c: New test.
* gcc.dg/attr-returns-nonnull.c: New test.
Unfortunately it appears that this PR is on nobody's radar.
Xfailing it to get an arguably artificial zero regression
state (since T0=2007-01-05) helps my autotester.
Caveat: the pass/fail state of this test, as long as stack
alignment isn't adjusted, is dependent on the alignment of
the stack at the entry of main, so depending on the target,
e.g. the size and number of environment variables at
invocation time can affect the result (including simulator
runs where environment variables are propagated to the
target).
gcc/testsuite:
PR middle-end/84877
* gcc.dg/pr84877.c: Xfail for cris-*-*.
When calling a static member function we still need to evaluate an explicit
object argument. But we don't want to force a load of the entire object
if the argument is volatile, so we take its address. If as a result it no
longer has any side-effects, we don't need to evaluate it after all.
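A small sketch of the two situations, using hypothetical names (`Device`, `pick`) and run-time calls rather than the constexpr testcase: the object expression of a static member call is still evaluated when it has side effects, while a plain volatile object needs no load at all.

```cpp
struct Device {
  static int id() { return 7; }
};

volatile Device dev;
int evaluations = 0;

// Object expression with a side effect: it must still be evaluated.
volatile Device& pick() {
  ++evaluations;
  return dev;
}

// Plain volatile object: nothing needs to be loaded from it to call id().
int id_via_object() { return dev.id(); }

// The call to pick() happens even though id() is static.
int id_via_call() { return pick().id(); }
```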
gcc/cp/ChangeLog:
PR c++/80456
* call.c (build_new_method_call_1): Check again for side-effects
with a volatile object.
gcc/testsuite/ChangeLog:
PR c++/80456
* g++.dg/cpp0x/constexpr-volatile3.C: New test.
Here instantiating the noexcept-specifier for bar<void>() means
instantiating A<void>::value, which complains about the conversion from 0 to
int* in the default argument of foo. Since my patch for PR99583, printing
the error context involves looking at C<void>::type, which again wants to
instantiate A<void>::value, which breaks. For now at least, let's break
this recursion by avoiding looking into the noexcept-specifier in
find_typenames, and limit that to just the uses_parameter_packs case that
PR99583 cares about.
gcc/cp/ChangeLog:
PR c++/100101
PR c++/99583
* pt.c (find_parameter_packs_r) [FUNCTION_TYPE]: Walk into
TYPE_RAISES_EXCEPTIONS here.
* tree.c (cp_walk_subtrees): Not here.
gcc/testsuite/ChangeLog:
PR c++/100101
* g++.dg/cpp0x/noexcept67.C: New test.
Without this I see a number of tests failing when -m32 is used.
libstdc++-v3/ChangeLog:
* testsuite/lib/dg-options.exp (add_options_for_libatomic): Also
add libatomic options for 32-bit sparc*-*-linux-gnu.
My patch for 99478 relied on local_variables_forbidden_p for distinguishing
between a template parameter and its default argument, but that flag wasn't
set for a default type template-argument.
gcc/cp/ChangeLog:
PR c++/100091
PR c++/99478
* parser.c (cp_parser_default_type_template_argument): Set
parser->local_variables_forbidden_p.
gcc/testsuite/ChangeLog:
PR c++/100091
* g++.dg/cpp2a/lambda-uneval15.C: New test.
The changes for PR libstdc++/64735 mean that libsupc++ functions might
now depend on the __exchange_and_add and __atomic_add functions defined
in config/cpu/*/atomicity.h which is not compiled into libsupc++. This
causes a link failure for some targets when trying to use libsupc++
without the rest of libstdc++.
This patch simply moves the definitions of those functions into
libsupc++ so that they are available there.
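Since the new test lives under exception_ptr, the affected code is presumably the reference counting behind std::exception_ptr; that link is an assumption here, not something this message states. A minimal sketch of such a use, with a hypothetical function name and not taken from 96657.cc:

```cpp
#include <exception>

// Copying an exception_ptr shares ownership of the exception object; the
// copy keeps it alive after the original handle has been destroyed.
bool copy_keeps_exception_alive() {
  std::exception_ptr copy;
  {
    std::exception_ptr original = std::make_exception_ptr(42);
    copy = original;
  }
  try {
    std::rethrow_exception(copy);
  } catch (int value) {
    return value == 42;
  }
  return false;
}
```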
libstdc++-v3/ChangeLog:
PR libstdc++/96657
* libsupc++/Makefile.am: Add atomicity.cc here.
* src/c++98/Makefile.am: Remove it from here.
* libsupc++/Makefile.in: Regenerate.
* src/c++98/Makefile.in: Regenerate.
* testsuite/18_support/exception_ptr/96657.cc: New test.
This patch follows on from a previous one and adds -mtune=generic
to the SVE ACLE assembler tests. These tests are pure assembly
tests (execution tests are elsewhere) and they already require
dg-additional-options to be used to add new options. We therefore
don't need aarch64-with-arch-dg-options.
gcc/testsuite/
* g++.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp: Add
-mtune=generic to the SVE flags.
* g++.target/aarch64/sve2/acle/aarch64-sve2-acle-asm.exp: Likewise.
* gcc.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp: Likewise.
* gcc.target/aarch64/sve2/acle/aarch64-sve2-acle-asm.exp: Likewise.
A lot of the SVE assembly tests are for generic-tuned SVE codegen
and so can fail when run on a toolchain configured with --with-cpu.
This could easily be solved by forcing -mtune=generic, like we already
do for -moverride=tune=none. However, the testsuite also has some
useful execution tests that it would be better to run with as
few flag changes as possible. Also, the flags in $sve_flags are
printed as part of the test results, so each change to $sve_flags
results in a change to the test summaries.
This patch instead intercepts dg-options and tailors the list
of additional options based on what the test is trying to do.
It also gets rid of DEFAULT_CFLAGS, which are never useful
for these tests.
gcc/testsuite/
* lib/gcc-defs.exp (aarch64-arch-dg-options): New procedure.
(aarch64-with-arch-dg-options): Likewise.
* g++.target/aarch64/sve/aarch64-sve.exp: Run the tests inside
aarch64-with-arch-dg-options. Move the default architecture
flags to the final dg-runtest argument.
* gcc.target/aarch64/sve/aarch64-sve.exp: Likewise. Dispense with
DEFAULT_CFLAGS.
* gcc.target/aarch64/sve2/aarch64-sve2.exp: Likewise.
The test works with -m32 or -mx32 the same as it does with -m64,
so it should be enabled for i?86-*-* x86_64-*-* targets;
x86_64-*-* alone is never right.
2021-04-15 Jakub Jelinek <jakub@redhat.com>
PR testsuite/100073
* gcc.dg/pr86058.c: Enable also on i?86-*-*.
This adds a deprecation note to the undocumented gimple-builder.h
API only used by asan and sancov.
2021-04-15 Richard Biener <rguenther@suse.de>
* gimple-builder.h: Add deprecation note.
<arm_neon.h> types are distinct from GNU vector types in at least
their mangling. However, there used to be nothing explicit in the
VECTOR_TYPE itself to indicate the difference: we simply treated them
as distinct TYPE_MAIN_VARIANTs. This caused problems like the ones
reported in PR95726.
The fix for that PR was to add type attributes to the <arm_neon.h>
types, in order to maintain the distinction between them and GNU
vectors. However, this in turn caused PR98852, where cp_common_type
would merge the type attributes from the two source types and attach
the result to the common type. For example:
unsigned vector with no attribute + signed vector with attribute X
would get converted to:
unsigned vector with attribute X
That isn't what we want in this case, since X describes the mangling
of the original type. But even if we dropped the mangling from X and
worked it out from context, we would still have a situation in which
the common type was provably distinct from both of the source types:
it would take its <arm_neon.h>-ness from one side and its signedness
from the other. I guess there are other cases where the common type
doesn't match either side, but I'm not sure it's the obvious behaviour
here. It's also different from GCC 10.1 and earlier, where the unsigned
vector “won” in its original form.
This patch instead merges only the attributes that don't affect type
identity. For now I've restricted it to vector types, since we're so
close to GCC 11, but it might make sense to use this elsewhere.
I've tried to audit the C and target-specific attributes to look for
other types that might be affected by this, but I couldn't see any.
The closest was s390_vector_bool, but the handler for that attribute
changes the type node and drops the attribute itself
(*no_add_attrs = true).
gcc/
PR c++/98852
* attribs.h (restrict_type_identity_attributes_to): Declare.
* attribs.c (restrict_type_identity_attributes_to): New function.
gcc/cp/
PR c++/98852
* typeck.c (merge_type_attributes_from): New function.
(cp_common_type): Use it for vector types.
<arm_neon.h> types are distinct from GNU vector types in at least
their mangling. However, there used to be nothing explicit in the
VECTOR_TYPE itself to indicate the difference: we simply treated them
as distinct TYPE_MAIN_VARIANTs. This caused problems like the ones
reported in PR95726.
The fix for that PR was to add type attributes to the <arm_neon.h>
types, in order to maintain the distinction between them and GNU
vectors. However, this in turn caused PR98852, where c_common_type
would unconditionally drop the attributes on the source types.
This meant that:
<arm_neon.h> vector + <arm_neon.h> vector
had a GNU vector type rather than an <arm_neon.h> vector type.
See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96377#c2 for
Jakub's analysis of the history of this c_common_type code.
TBH I'm not sure which case the build_type_attribute_variant
code is handling, but I think we should at least avoid dropping
attributes that affect type identity.
I've tried to audit the C and target-specific attributes to look
for other types that might be affected by this, but I couldn't
see any. We are only dealing with:
gcc_assert (code1 == VECTOR_TYPE || code1 == COMPLEX_TYPE
|| code1 == FIXED_POINT_TYPE || code1 == REAL_TYPE
|| code1 == INTEGER_TYPE);
which excludes most affects_type_identity attributes. The closest
was s390_vector_bool, but the handler for that attribute changes
the type node and drops the attribute itself (*no_add_attrs = true).
I put the main list handling into a separate function
(remove_attributes_matching) because a later patch will need it
for something else.
gcc/
PR c/98852
* attribs.h (affects_type_identity_attributes): Declare.
* attribs.c (remove_attributes_matching): New function.
(affects_type_identity_attributes): Likewise.
gcc/c/
PR c/98852
* c-typeck.c (c_common_type): Do not drop attributes that
affect type identity.
gcc/testsuite/
PR c/98852
* gcc.target/aarch64/advsimd-intrinsics/pr98852.c: New test.
Before the combiner added 2 to 2 combinations, the following testcase
functions were all compiled into 2 instructions: a zero/sign extension or
an and, followed by an orr with lsl. E.g. for the first function
Trying 7 -> 8:
7: r96:SI=r94:SI<<0xb
8: r95:SI=r96:SI|r94:SI
REG_DEAD r96:SI
REG_DEAD r94:SI
Successfully matched this instruction:
(set (reg:SI 95)
(ior:SI (ashift:SI (reg/v:SI 94 [ i ])
(const_int 11 [0xb]))
(reg/v:SI 94 [ i ])))
is the important successful try_combine and so we end up with
and w0, w0, 255
orr w0, w0, w0, lsl 11
in the body.
With 2 to 2 combination, before that can trigger, another successful
combination:
Trying 2 -> 7:
2: r94:SI=zero_extend(x0:QI)
REG_DEAD x0:QI
7: r96:SI=r94:SI<<0xb
is replaced with:
(set (reg/v:SI 94 [ i ])
(zero_extend:SI (reg:QI 0 x0 [ i ])))
and
(set (reg:SI 96)
(and:SI (ashift:SI (reg:SI 0 x0 [ i ])
(const_int 11 [0xb]))
(const_int 522240 [0x7f800])))
and in the end results in 3 instructions in the body:
and w1, w0, 255
ubfiz w0, w0, 11, 8
orr w0, w0, w1
The following combine splitters help undo that when the combiner tries to
combine 3 instructions: the zero/sign extend or and, the other insn
from the 2 to 2 combination ([us]bfiz) and the logical op. The CPUs
don't have an insn to do everything in one op, but we can split it
back into the zero/sign extend or and followed by logical with lsl.
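Judging from the RTL above (zero_extend of a QImode argument, a shift by 11 and an ior), the first function is presumably of the following shape. The name is hypothetical and the source is inferred from the dump, not copied from pr100056.c.

```cpp
// 0xff << 11 is the AND mask that shows up after the 2 -> 2 combination.
static_assert((0xff << 11) == 0x7f800, "mask matches const_int 522240");

// Zero-extend an 8-bit value, then OR it with itself shifted left by 11.
int or_shift_u8(unsigned char i) { return i | (i << 11); }
```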
2021-04-15 Jakub Jelinek <jakub@redhat.com>
PR target/100056
* config/aarch64/aarch64.md (*<LOGICAL:optab>_<SHIFT:optab><mode>3):
Add combine splitters for *<LOGICAL:optab>_ashl<mode>3 with
ZERO_EXTEND, SIGN_EXTEND or AND.
* gcc.target/aarch64/pr100056.c: New test.
2021-04-15 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/99307
* symbol.c: Remove trailing white space.
* trans-array.c (gfc_trans_create_temp_array): Create a class
temporary for class expressions and assign the new descriptor
to the data field.
(build_class_array_ref): If the class expr can be extracted,
then use that for 'decl'. Class function results are reliably
handled this way. Call gfc_find_and_cut_at_last_class_ref to
eliminate largely redundant code. Remove dead code and recast
the rest of the code to extract 'decl' for remaining cases.
Call gfc_build_spanned_array_ref.
(gfc_alloc_allocatable_for_assignment): Use class descriptor
element length for 'elemsize1'. Eliminate repeat set of dtype
for class expressions.
* trans-expr.c (gfc_find_and_cut_at_last_class_ref): Include
additional code from build_class_array_ref, and use optional
gfc_typespec pointer argument.
(gfc_trans_scalar_assign): Make use of pre and post blocks for
all class expressions.
* trans.c (get_array_span): For unlimited polymorphic exprs
multiply the span by the value of the _len field.
(gfc_build_spanned_array_ref): New function.
(gfc_build_array_ref): Call gfc_build_spanned_array_ref and
eliminate repeated code.
* trans.h: Add arg to gfc_find_and_cut_at_last_class_ref and
add prototype for gfc_build_spanned_array_ref.
Regarding test gcc.dg/pr93210.c, on different targets GIMPLE code may
slightly differ which is why the scan-tree-dump-times directive may
fail. For example, for a RETURN_EXPR on x86_64 we have
return 0x11100f0e0d0c0a090807060504030201;
whereas on IBM Z the first operand is a RESULT_DECL like
<retval> = 0x102030405060708090a0c0d0e0f1011;
return <retval>;
gcc/testsuite/ChangeLog:
* gcc.dg/pr93210.c: Adapt regex in order to also support a
RESULT_DECL as an operand for a RETURN_EXPR.
PR99929 is one of those “how did we get away with this for so long”
bugs: the equality routines weren't checking whether two variable-length
CONST_VECTORs had the same encoding. This meant that:
{ 1, 0, 0, 0, 0, 0, ... }
would appear to be equal to:
{ 1, 0, 1, 0, 1, 0, ... }
since both are represented using the elements { 1, 0 }.
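The confusion is easy to reproduce in a toy model. The sketch below uses hypothetical names and only handles encodings with at most two elements per pattern; it assumes encodings are canonical, so that equal vectors have equal shapes.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model of a variable-length constant vector with at most two
// encoded elements per pattern (no stepped sequences).
struct EncodedVec {
  unsigned npatterns;
  unsigned nelts_per_pattern;
  std::vector<int> encoded;  // npatterns * nelts_per_pattern values
};

// Element i of the conceptually unbounded vector.
int element(const EncodedVec& v, std::size_t i) {
  std::size_t pattern = i % v.npatterns;
  std::size_t index = i / v.npatterns;
  if (index >= v.nelts_per_pattern)
    index = v.nelts_per_pattern - 1;  // the last encoded value repeats
  return v.encoded[index * v.npatterns + pattern];
}

// Comparing only the encoded values is the bug described above.
bool same_encoded_values(const EncodedVec& a, const EncodedVec& b) {
  return a.encoded == b.encoded;
}

// The shape of the encoding has to match as well.
bool same_vector(const EncodedVec& a, const EncodedVec& b) {
  return a.npatterns == b.npatterns &&
         a.nelts_per_pattern == b.nelts_per_pattern &&
         a.encoded == b.encoded;
}
```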
gcc/
PR rtl-optimization/99929
* rtl.h (same_vector_encodings_p): New function.
* cse.c (exp_equiv_p): Check that CONST_VECTORs have the same encoding.
* cselib.c (rtx_equal_for_cselib_1): Likewise.
* jump.c (rtx_renumbered_equal_p): Likewise.
* lra-constraints.c (operands_match_p): Likewise.
* reload.c (operands_match_p): Likewise.
* rtl.c (rtx_equal_p_cb, rtx_equal_p): Likewise.
gcc/testsuite/
* gcc.target/aarch64/sve/pr99929_1.c: New file.
* gcc.target/aarch64/sve/pr99929_2.c: Likewise.
Looking at PR99929 showed that we weren't dumping enough information
about variable-length CONST_VECTORs. Something like:
(const_vector:VNx4SI [(const_int 1) (const_int 0)])
could be either:
(a) 1, 0, 1, 0, repeating
(b) 1 followed by all zeros
This patch adds more information to the dumps. There are four cases:
(a) above:
  (const_vector:VNx4SI repeat [
      (const_int 1)
      (const_int 0)
    ])
(b) above:
  (const_vector:VNx4SI [
      (const_int 1)
      repeat [
        (const_int 0)
      ]
    ])
a single stepped sequence:
  (const_vector:VNx4SI [
      (const_int 0)
      stepped [
        (const_int 1)
        (const_int 2)
      ]
    ])
interleaved stepped sequences:
  (const_vector:VNx4SI [
      (const_int 0)
      (const_int 40)
      stepped (interleave 2) [
        (const_int 1)
        (const_int 41)
        (const_int 2)
        (const_int 42)
      ]
    ])
There are probably better syntaxes, but hopefully this is at least
an improvement on the status quo.
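For readers unfamiliar with the stepped forms, the following sketch shows how the two stepped dumps above expand into elements. `SteppedVec` and `element` are hypothetical names, and the model assumes exactly three encoded values per pattern.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model: npatterns interleaved patterns, each with three
// encoded values (a first value, then two values that fix the step).
struct SteppedVec {
  unsigned npatterns;
  std::vector<long> encoded;  // npatterns * 3 values, interleaved
};

// Element i of the conceptually unbounded vector.
long element(const SteppedVec& v, std::size_t i) {
  std::size_t pattern = i % v.npatterns;
  std::size_t index = i / v.npatterns;
  if (index < 3)
    return v.encoded[index * v.npatterns + pattern];
  long second = v.encoded[1 * v.npatterns + pattern];
  long third = v.encoded[2 * v.npatterns + pattern];
  long step = third - second;
  // Continue the arithmetic series past the last encoded value.
  return third + static_cast<long>(index - 2) * step;
}
```

With the single stepped sequence {0, 1, 2} this continues as 3, 4, 5 and so on; with the interleaved example it continues as 3, 43, 4, 44 and so on.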
gcc/
* print-rtl.c (rtx_writer::print_rtx_operand_codes_E_and_V): Print
more information about variable-length CONST_VECTORs.
My patch for PR93085 didn't consider that a default template argument can
also make a template dependent.
gcc/cp/ChangeLog:
PR c++/100078
PR c++/93085
* pt.c (uses_outer_template_parms): Also look at default
template argument.
gcc/testsuite/ChangeLog:
PR c++/100078
* g++.dg/template/dependent-tmpl2.C: New test.
N2253 allowed referring to non-static data members without an object in
unevaluated operands like that of sizeof, but in a constant-expression
context like an array bound or template argument within such an unevaluated
operand we do actually need a value, so that permission cannot apply.
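A short sketch of the permitted use, with a hypothetical struct; it is not taken from uneval1.C.

```cpp
#include <cstddef>

struct Point {
  int x;
  double y;
};

// Allowed since N2253: naming a non-static data member without an object
// inside an unevaluated operand. Only the type of the member matters.
constexpr std::size_t x_size = sizeof(Point::x);
constexpr std::size_t y_size = sizeof(Point::y);

// By contrast, using Point::x where its value is required, for example as
// an array bound inside the sizeof operand, has no object to read from and
// is rejected.
```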
gcc/cp/ChangeLog:
PR c++/93314
* semantics.c (finish_id_expression_1): Clear cp_unevaluated_operand
for a non-static data member in a constant-expression.
gcc/testsuite/ChangeLog:
PR c++/93314
* g++.dg/parse/uneval1.C: New test.
When splitting the live range of a hard reg, LRA actually splits the
multi-register containing the hard reg. So we need to check the biggest
used mode of the hard reg for a paradoxical subreg when the natural and
the biggest modes are ordered.
gcc/ChangeLog:
PR rtl-optimization/100066
* lra-constraints.c (split_reg): Check paradoxical_subreg_p for
ordered modes when choosing splitting mode for hard reg.
gcc/testsuite/ChangeLog:
PR rtl-optimization/100066
* gcc.target/i386/pr100066.c: New.
PR99246 is about a case in which we failed to handle a CONST_VECTOR
with NELTS_PER_PATTERN==2, i.e. a vector with a “foreground” sequence
of N elements followed by a repeating “background” sequence of N elements.
At the moment, it's difficult to produce these vectors directly,
but I'm hoping that for GCC 12 we'll do more folding, which will
in turn make this easier to test and easier to optimise. Until then,
the patch simply relies on the testcase in the PR.
gcc/
PR target/99246
* config/aarch64/aarch64.c (aarch64_expand_sve_const_vector_sel):
New function.
(aarch64_expand_sve_const_vector): Use it for nelts_per_pattern==2.
gcc/testsuite/
PR target/99246
* gcc.target/aarch64/sve/acle/general/pr99246.c: New test.
This fixes the error checking for two of the vector builtins which
accept irregular (e.g. non-contiguous) ranges of values.
gcc/ChangeLog:
* config/s390/s390-builtins.def (O_M5, O_M12, ...): Add new macros
for mask operand types.
(s390_vec_permi_s64, s390_vec_permi_b64, s390_vec_permi_u64)
(s390_vec_permi_dbl, s390_vpdi): Use the M5 type for the immediate
operand.
(s390_vec_msum_u128, s390_vmslg): Use the M12 type for the
immediate operand.
* config/s390/s390.c (s390_const_operand_ok): Check the new
operand types and generate a list of valid values.
gcc/testsuite/ChangeLog:
* gcc.target/s390/zvector/imm-range-error-1.c: New test.
* gcc.target/s390/zvector/vec_msum_u128-1.c: New test.
This allows target platforms that have D support files to define their
own target-specific information keys.
gcc/ChangeLog:
* doc/tm.texi: Regenerate.
* doc/tm.texi.in (D language and ABI): Add @hook for
TARGET_D_REGISTER_OS_TARGET_INFO.
gcc/d/ChangeLog:
* d-target.cc (Target::_init): Call new targetdm hook to register OS
specific target info keys.
* d-target.def (d_register_os_target_info): New hook.
In the testcase ref11.C below, during deduction for the call f(a),
uses_deducible_template_parms returns false for the dependent
specialization A<V> because the generic template argument V here is
wrapped in an implicit INDIRECT_REF (formed from template_parm_to_arg).
Since uses_deducible_template_parms returns false, unify_one_argument
exits early without ever attempting to deduce 'n' for 'V'. This patch
fixes this by making deducible_expression look through such implicit
INDIRECT_REFs.
gcc/cp/ChangeLog:
PR c++/83476
PR c++/99885
* pt.c (deducible_expression): Look through implicit
INDIRECT_REFs as well.
gcc/testsuite/ChangeLog:
PR c++/83476
PR c++/99885
* g++.dg/cpp1z/class-deduction85.C: New test.
* g++.dg/template/ref11.C: New test.
Now that all dependencies on these flags have been removed, there's no
need to test and set them.
gcc/d/ChangeLog:
* d-builtins.cc (d_add_builtin_version): Remove all setting of
target-specific global.params.
* typeinfo.cc (create_typeinfo): Don't add argType fields to
TypeInfo_Struct.
This both prevents it from being called twice for declarations that
are defined, and fixes an issue where variables defined in the
compilation get one kind of linkage (weak), and the same variables
declared via declare_extern_var get another (extern).
gcc/d/ChangeLog:
PR d/99914
* decl.cc (DeclVisitor::visit (StructDeclaration *)): Don't set
DECL_INSTANTIATED on static initializer declarations.
(DeclVisitor::visit (ClassDeclaration *)): Likewise.
(DeclVisitor::visit (EnumDeclaration *)): Likewise.
(d_finish_decl): Move call to set_linkage_for_decl to...
(declare_extern_var): ...here.
This replaces the use of the D front-end `is64bit' parameter in
determining whether to insert the "stdcall" function attribute.
It is also used to determine whether `extern(System)' should be the same
as `extern(Windows)' in the implementation of Target::systemLinkage.
gcc/ChangeLog:
* config/i386/i386-d.c (ix86_d_has_stdcall_convention): New function.
* config/i386/i386-protos.h (ix86_d_has_stdcall_convention): Declare.
* config/i386/i386.h (TARGET_D_HAS_STDCALL_CONVENTION): Define.
* doc/tm.texi: Regenerate.
* doc/tm.texi.in (D language and ABI): Add @hook for
TARGET_D_HAS_STDCALL_CONVENTION.
gcc/d/ChangeLog:
* d-target.cc (Target::systemLinkage): Return LINKwindows if
d_has_stdcall_convention applies to LINKsystem.
* d-target.def (d_has_stdcall_convention): New hook.
* types.cc (TypeVisitor::visit (TypeFunction *)): Insert "stdcall"
function attribute if d_has_stdcall_convention applies to LINKwindows.
This adjusts GIMPLE verification with respect to the VEC_COND_EXPR
changes forcing a split-out condition.
2021-04-14 Richard Biener <rguenther@suse.de>
* tree-cfg.c (verify_gimple_assign_ternary): Verify that
VEC_COND_EXPRs have a gimple_val condition.
* tree-ssa-propagate.c (valid_gimple_rhs_p): VEC_COND_EXPR
can no longer have a GENERIC condition.