These have been misdocumented since C++98 POD was split into C++11 trivial
and standard-layout in r149721.
PR c++/59426
gcc/ChangeLog:
* doc/extend.texi: Refer to __is_trivial instead of __is_pod.
We weren't rejecting a concept declared with multiple template
parameter lists.
PR c++/105067
gcc/cp/ChangeLog:
* pt.cc (finish_concept_definition): Check that a concept is
declared with exactly one template parameter list.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/concepts-err4.C: New test.
Here during declaration matching for the two constrained template
friends, we crash from maybe_substitute_reqs_for because the second
friend doesn't yet have DECL_TEMPLATE_INFO set (we're being called
indirectly from push_template_decl).
As far as I can tell, this situation happens only when declaring a
constrained template friend within a non-template class (as in the
testcase), in which case the substitution would be a no-op anyway.
So this patch rearranges maybe_substitute_reqs_for to gracefully
handle missing DECL_TEMPLATE_INFO by just skipping the substitution.
PR c++/105064
gcc/cp/ChangeLog:
* constraint.cc (maybe_substitute_reqs_for): Don't assume
DECL_TEMPLATE_INFO is available.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/concepts-friend9.C: New test.
Currently we have:
...
$ gcc --target-help 2>&1 | egrep "misa|mptx"
-misa= Specify the version of the ptx ISA to use.
-mptx= Specify the version of the ptx version to use.
Known PTX ISA versions (for use with the -misa= option):
Known PTX versions (for use with the -mptx= option):
...
As reported in PR104818, the "version of the ptx version" doesn't make much
sense.
Furthermore, the description of misa (and 'Known ISA versions') is misleading
because it does not specify the version of the PTX ISA, but rather the PTX ISA
target architecture.
Fix this by printing instead:
...
$ gcc --target-help 2>&1 | egrep "misa|mptx"
-misa= Specify the PTX ISA target architecture to use.
-mptx= Specify the PTX ISA version to use.
Known PTX ISA target architectures (for use with the -misa= option):
Known PTX ISA versions (for use with the -mptx= option):
...
Tested on nvptx.
gcc/ChangeLog:
2022-03-28 Tom de Vries <tdevries@suse.de>
PR target/104818
* config/nvptx/gen-opt.sh (ptx_isa): Improve help text.
* config/nvptx/nvptx-gen.opt: Regenerate.
* config/nvptx/nvptx.opt (misa, mptx, ptx_version): Improve help text.
* config/nvptx/t-nvptx (s-nvptx-gen-opt): Add missing dependency on
gen-opt.sh.
For PR104008 we thought it might be enough to keep strip_typedefs from
removing this alias template specialization, but this PR demonstrates that
other parts of the compiler also need to know to consider it dependent.
So, this patch changes complex_alias_template_p to no longer consider
template parameters used when their only use appears in a pack expansion,
unless they are the parameter packs being expanded.
To do that I also needed to change it to use cp_walk_tree instead of
for_each_template_parm. It occurs to me that find_template_parameters
should probably also use cp_walk_tree, but I'm not messing with that now.
PR c++/105003
PR c++/104008
PR c++/102869
gcc/cp/ChangeLog:
* pt.cc (complex_alias_template_r): walk_tree callback, replacing
uses_all_template_parms_r, complex_pack_expansion_r.
(complex_alias_template_p): Adjust.
* tree.cc (strip_typedefs): Revert r12-7710 change.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/variadic-alias6.C: New test.
* g++.dg/cpp0x/variadic-alias7.C: New test.
PR analyzer/104308 reports that when -Wanalyzer-use-of-uninitialized-value
complains about certain memmove operations where the source is
uninitialized, the diagnostic uses UNKNOWN_LOCATION:
In function 'main':
cc1: warning: use of uninitialized value '*(short unsigned int *)&s + 1' [CWE-457] [-Wanalyzer-use-of-uninitialized-value]
'main': event 1
|
|pr104308.c:5:8:
| 5 | char s[5]; /* { dg-message "region created on stack here" } */
| | ^
| | |
| | (1) region created on stack here
|
'main': event 2
|
|cc1:
| (2): use of uninitialized value '*(short unsigned int *)&s + 1' here
|
The issue is that gimple_fold_builtin_memory_op converts a memmove to:
_3 = MEM <unsigned short> [(char * {ref-all})_1];
MEM <unsigned short> [(char * {ref-all})&s] = _3;
but only sets the location of the 2nd stmt, not the 1st.
Fixed thusly, giving:
pr104308.c: In function 'main':
pr104308.c:6:3: warning: use of uninitialized value '*(short unsigned int *)&s + 1' [CWE-457] [-Wanalyzer-use-of-uninitialized-value]
6 | memmove(s, s + 1, 2); /* { dg-warning "use of uninitialized value" } */
| ^~~~~~~~~~~~~~~~~~~~
'main': events 1-2
|
| 5 | char s[5]; /* { dg-message "region created on stack here" } */
| | ^
| | |
| | (1) region created on stack here
| 6 | memmove(s, s + 1, 2); /* { dg-warning "use of uninitialized value" } */
| | ~~~~~~~~~~~~~~~~~~~~
| | |
| | (2) use of uninitialized value '*(short unsigned int *)&s + 1' here
|
One side-effect of this change is a change in part of the output of
gcc.dg/uninit-40.c from:
uninit-40.c:47:3: warning: ‘*(long unsigned int *)(&u[1][0][0])’ is used uninitialized [-Wuninitialized]
47 | __builtin_memcpy (&v[1], &u[1], sizeof (V));
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
uninit-40.c:45:5: note: ‘*(long unsigned int *)(&u[1][0][0])’ was declared here
45 | V u[2], v[2];
| ^
to:
uninit-40.c:47:3: warning: ‘u’ is used uninitialized [-Wuninitialized]
47 | __builtin_memcpy (&v[1], &u[1], sizeof (V));
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
uninit-40.c:45:5: note: ‘u’ declared here
45 | V u[2], v[2];
| ^
What's happening is that pass "early_uninit"(29)'s call to
maybe_warn_operand is guarded by this condition:
1051 else if (gimple_assign_load_p (stmt)
1052 && gimple_has_location (stmt))
Before the patch, the stmt:
_3 = MEM <unsigned long> [(char * {ref-all})&u + 8B];
has no location, and so early_uninit skips this operand at line
1052 above. Later, pass "uninit"(217) tests the var_decl "u$8", and
emits a warning for it.
With the patch, the stmt has a location, and so early_uninit emits a
warning for "u" and sets a NW_UNINIT warning suppression at that
location. Later, pass "uninit"(217)'s test of "u$8" is rejected
due to that per-location suppression of uninit warnings, from the
earlier warning.
gcc/ChangeLog:
PR analyzer/104308
* gimple-fold.cc (gimple_fold_builtin_memory_op): When optimizing
to loads then stores, set the location of the new load stmt.
gcc/testsuite/ChangeLog:
PR analyzer/104308
* gcc.dg/analyzer/pr104308.c: New test.
* gcc.dg/uninit-40.c (foo): Update expression in expected message.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
This test ICEd after the constexpr new patch (r10-3661) because alloc_call
had a NOP_EXPR around it; fixed by moving the NOP_EXPR to alloc_expr. And
the PR pointed out that the size_t cookie might need more alignment, so I
fix that as well.
PR c++/102071
gcc/cp/ChangeLog:
* init.cc (build_new_1): Include cookie in alignment. Omit
constexpr wrapper from alloc_call.
gcc/testsuite/ChangeLog:
* g++.dg/cpp1z/aligned-new9.C: New test.
When setting up the hidden namespace-scope decl for a local extern, we also
need to set its visibility.
PR c++/103291
gcc/cp/ChangeLog:
* name-lookup.cc (push_local_extern_decl_alias): Call
determine_visibility.
gcc/testsuite/ChangeLog:
* g++.dg/ext/visibility/visibility-local-extern1.C: New test.
When building a deduction guide from the Test constructor, we need to
rewrite the use of _dummy into a dependent reference, i.e. Test<T>::template
_dummy. We were using SCOPE_REF for both type and non-type templates; we
need to use UNBOUND_CLASS_TEMPLATE for type templates.
PR c++/102123
gcc/cp/ChangeLog:
* pt.cc (tsubst_copy): Use make_unbound_class_template for rewriting
a type template reference.
gcc/testsuite/ChangeLog:
* g++.dg/cpp1z/class-deduction110.C: New test.
Here, we were wrongly thinking that (const Options&)Widget<T>::c_options is
not value-dependent because neither the type nor the (value of) c_options
are dependent, but since we're binding it to a reference we also need to
consider that it has a value-dependent address.
PR c++/103968
gcc/cp/ChangeLog:
* pt.cc (value_dependent_expression_p): Check
has_value_dependent_address for conversion to reference.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/alias-decl-mem1.C: New test.
More quirks of rewriting member references to dependent references for
CTAD. A reference to a member of dependent scope is definitely dependent.
And since r11-7044, tsubst_baselink builds a SCOPE_REF, so
tsubst_qualified_id should just use it.
PR c++/103943
gcc/cp/ChangeLog:
* pt.cc (tsubst_qualified_id): Handle getting SCOPE_REF from
tsubst_baselink.
(instantiation_dependent_scope_ref_p): Check dependent_scope_p.
gcc/testsuite/ChangeLog:
* g++.dg/cpp1z/class-deduction109.C: New test.
When make_base_init_ok changes a call to a complete constructor into a call
to a base constructor, we were never marking the base ctor as used, so it
didn't get emitted.
PR c++/102045
gcc/cp/ChangeLog:
* call.cc (make_base_init_ok): Call make_used.
gcc/testsuite/ChangeLog:
* g++.dg/cpp1z/aggr-base12.C: New test.
My implementation of union non-type template arguments in r11-2016 broke
braced casts of union type, because they are still in syntactic (undigested)
form.
PR c++/104847
gcc/cp/ChangeLog:
* mangle.cc (write_expression): Don't write a union designator when
undigested.
gcc/testsuite/ChangeLog:
* g++.dg/abi/mangle-union1.C: New test.
This was breaking because when we stripped the 't' typedef in s<t<Args>...>
to be s<Args...>, the TYPE_MAIN_VARIANT of "Args..." was still
"t<Args>...", because type pack expansions are treated as types. Fixed by
using the right function to copy a "type".
PR c++/99445
PR c++/103769
gcc/cp/ChangeLog:
* tree.cc (strip_typedefs): Use build_distinct_type_copy.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/variadic-alias5.C: New test.
When building an nvptx offloading configuration on openSUSE Leap 15.3, the
site script /usr/share/site/x86_64-unknown-linux-gnu is activated, setting
libexecdir to ${exec_prefix}/lib rather than ${exec_prefix}/libexec:
...
| # If user did not specify libexecdir, set the correct target:
| # Nor FHS nor openSUSE allow prefix/libexec. Let's default to prefix/lib.
|
| if test "$libexecdir" = '${exec_prefix}/libexec' ; then
| libexecdir='${exec_prefix}/lib'
| fi
...
However, in libgomp libgomp/plugin/configfrag.ac we hardcode libexec:
...
# Configure additional search paths.
if test x"$tgt_dir" != x; then
offload_additional_options="$offload_additional_options \
-B$tgt_dir/libexec/gcc/\$(target_alias)/\$(gcc_version) \
-B$tgt_dir/bin"
...
Fix this by using /$(libexecdir:\$(exec_prefix)/%=%)/ instead of /libexec/.
Tested on x86_64-linux with nvptx accelerator.
libgomp/ChangeLog:
2022-03-28 Tom de Vries <tdevries@suse.de>
* plugin/configfrag.ac: Use /$(libexecdir:\$(exec_prefix)/%=%)/
instead of /libexec/.
* configure: Regenerate.
The following makes sure to annotate the tests generated by
switch lowering bit-clustering with locations which otherwise
can be completely lost even at -O0.
2022-03-28 Richard Biener <rguenther@suse.de>
PR tree-optimization/105070
* tree-switch-conversion.h
(bit_test_cluster::hoist_edge_and_branch_if_true): Add location
argument.
* tree-switch-conversion.cc
(bit_test_cluster::hoist_edge_and_branch_if_true): Annotate
cond with location.
(bit_test_cluster::emit): Annotate all generated expressions
with location.
pinsrw is available for both reg and mem operand under sse2.
pextrw requires sse4.1 for mem operands.
The patch change attr "isa" for pinsrw mem alternative from sse4_noavx
to noavx, will enable below optimization.
- movzwl (%rdi), %eax
pxor %xmm1, %xmm1
- pinsrw $0, %eax, %xmm1
+ pinsrw $0, (%rdi), %xmm1
movdqa %xmm1, %xmm0
gcc/ChangeLog:
PR target/105066
* config/i386/sse.md (vec_set<mode>_0): Change attr "isa" of
alternative 4 from sse4_noavx to noavx.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr105066.c: New test.
The recent change didn't initialize comp_step while previously we used
XCNEW to allocate it.
I think RS_ANY is better than RS_INTERNAL (== 0) as the default.
2022-03-28 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/105056
* tree-predcom.cc (component::component): Initialize also comp_step.
Since AVX512VL and AVX512BW are required for AVX512 VPSHUFB, replace the
"Yv" register constraint with the "Yw" register constraint.
gcc/
PR target/105068
* config/i386/sse.md (*ssse3_pshufbv8qi3): Replace "Yv" with
"Yw".
gcc/testsuite/
PR target/105068
* gcc.target/i386/pr105068.c: New test.
Because this adds a new class template called std::unexpected, we have
to stop declaring the std::unexpected() function (which was deprecated
in C++11 and removed in C++17).
libstdc++-v3/ChangeLog:
* doc/doxygen/user.cfg.in: Add new header.
* include/Makefile.am: Likewise.
* include/Makefile.in: Regenerate.
* include/precompiled/stdc++.h: Add new header.
* include/std/version (__cpp_lib_expected): Define.
* libsupc++/exception [__cplusplus > 202002] (unexpected)
(unexpected_handler, set_unexpected): Do not declare for C++23.
* include/std/expected: New file.
* testsuite/20_util/expected/assign.cc: New test.
* testsuite/20_util/expected/cons.cc: New test.
* testsuite/20_util/expected/illformed_neg.cc: New test.
* testsuite/20_util/expected/observers.cc: New test.
* testsuite/20_util/expected/requirements.cc: New test.
* testsuite/20_util/expected/swap.cc: New test.
* testsuite/20_util/expected/synopsis.cc: New test.
* testsuite/20_util/expected/unexpected.cc: New test.
* testsuite/20_util/expected/version.cc: New test.
My recent testcase for PR c++/84964.C stress tests the middle-end by
attempting to pass a UINT_MAX sized structure on the stack. Although
my fix to PR84964 avoids the ICE after sorry on x86_64 and similar
targets, a related issue still exists on powerpc64 (and similar
ACCUMULATE_OUTGOING_ARGS/ARGS_GROW_DOWNWARD targets) which don't
issue a "sorry, unimplemented" message, but instead ICE elsewhere.
After attempting several alternate fixes, the simplest solution is
to just defensively check in mark_stack_region_used that the upper
bound of the region lies within the allocated stack_usage_map
array, which is of size highest_outgoing_arg_in_use. When this isn't
the case, the code now follows the same path as for variable sized
regions, and uses stack_usage_watermark rather than a map.
2022-03-26 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
PR middle-end/104885
* calls.cc (mark_stack_region_used): Check that the region
is within the allocated size of stack_usage_map.
The following testcase ICEs on aarch64-linux with -g and
assembles with a warning otherwise, because it emits
ldrb w0,[x0,16]!
instruction which sets the x0 register multiple times.
Due to disabled DCE (from -Og) we end up before REE with:
(insn 12 39 13 2 (set (reg:SI 1 x1 [orig:93 _2 ] [93])
(zero_extend:SI (mem/c:QI (pre_modify:DI (reg/f:DI 0 x0 [114])
(plus:DI (reg/f:DI 0 x0 [114])
(const_int 16 [0x10]))) [1 u128_1+0 S1 A128]))) "pr103775.c":5:35 117 {*zero_extendqisi2_aarch64}
(expr_list:REG_INC (reg/f:DI 0 x0 [114])
(nil)))
(insn 13 12 14 2 (set (reg:DI 0 x0 [orig:112 _2 ] [112])
(zero_extend:DI (reg:SI 1 x1 [orig:93 _2 ] [93]))) "pr103775.c":5:16 111 {*zero_extendsidi2_aarch64}
(nil))
which is valid but not exactly efficient as x0 is dead after the
insn that auto-increments it. REE turns it into:
(insn 12 39 44 2 (set (reg:DI 0 x0)
(zero_extend:DI (mem/c:QI (pre_modify:DI (reg/f:DI 0 x0 [114])
(plus:DI (reg/f:DI 0 x0 [114])
(const_int 16 [0x10]))) [1 u128_1+0 S1 A128]))) "pr103775.c":5:35 119 {*zero_extendqidi2_aarch64}
(expr_list:REG_INC (reg/f:DI 0 x0 [114])
(nil)))
(insn 44 12 14 2 (set (reg:DI 1 x1)
(reg:DI 0 x0)) "pr103775.c":5:35 -1
(nil))
which is invalid because it sets x0 multiple times, one
in SET_DEST of the PATTERN and once in PRE_MODIFY.
As perhaps other passes than REE might suffer from it, IMHO it is better
to reject this during change validation.
2022-03-26 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/103775
* recog.cc (check_invalid_inc_dec): New function.
(insn_invalid_p): Return 1 if REG_INC operand overlaps
any stored REGs.
* gcc.dg/pr103775.c: New test.
When an if-stmt is determined to be non-constant because both of its
branches are non-constant, we issue a somewhat generic error which,
since the error also points to the 'if' token, misleadingly suggests
the condition is at fault:
constexpr-105050.C:8:3: error: expression ‘<statement>’ is not a constant expression
8 | if (p != q && *p < 0)
| ^~
This patch clarifies the error message to instead read:
constexpr-105050.C:8:3: error: neither branch of ‘if’ is a constant expression
8 | if (p != q && *p < 0)
| ^~
PR c++/105050
gcc/cp/ChangeLog:
* constexpr.cc (potential_constant_expression_1) <case IF_STMT>:
Clarify error message when a if-stmt is non-constant because its
branches are non-constant.
gcc/testsuite/ChangeLog:
* g++.dg/cpp1y/constexpr-105050.C: New test.
Here when constructing the builtin operator->* candidate set according
to the available conversion functions for the operand types, we end up
considering a candidate with C1=T (through B's dependent conversion
function) and C2=F, during which we crash from DERIVED_FROM_P because
dependent_type_p sees a TEMPLATE_TYPE_PARM outside of a template
context.
Sidestepping the question of whether we should be considering such a
dependent conversion function here in the first place, it seems futile
to test DERIVED_FROM_P for anything other than an actual class type, so
this patch fixes this ICE by simply guarding the DERIVED_FROM_P test
with CLASS_TYPE_P instead of MAYBE_CLASS_TYPE_P.
PR c++/103455
gcc/cp/ChangeLog:
* call.cc (add_builtin_candidate) <case MEMBER_REF>: Test
CLASS_TYPE_P instead of MAYBE_CLASS_TYPE_P.
gcc/testsuite/ChangeLog:
* g++.dg/overload/builtin6.C: New test.
Since KL instructions have no AVX512 version, replace the "v" register
constraint with the "x" register constraint.
PR target/105058
* config/i386/sse.md (loadiwkey): Replace "v" with "x".
(aes<aesklvariant>u8): Likewise.
Since PHADDW/PHADDD/PHADDSW/PHSUBW/PHSUBD/PHSUBSW/PSIGNB/PSIGNW/PSIGND
have no AVX512 version, replace the "Yv" register constraint with the
"x" register constraint.
PR target/105052
* config/i386/sse.md (ssse3_ph<plusminus_mnemonic>wv4hi3):
Replace "Yv" with "x".
(ssse3_ph<plusminus_mnemonic>dv2si3): Likewise.
(ssse3_psign<mode>3): Likewise.
In r12-7809-g5f6197d7c197f9d2b7fb2e1a19dac39a023755e8 I added an
optimization to avoid tracking the state of certain memory regions
in the store.
Unfortunately, I didn't cover every way in which
store::get_or_create_cluster can be called for a base region, leading
to assertion failure ICEs in -fanalyzer on certain function calls
with certain params.
I've worked through all uses of store::get_or_create_cluster and found
four places where the assertion could fire.
This patch fixes them, and adds regression tests where possible.
gcc/analyzer/ChangeLog:
PR analyzer/105057
* store.cc (binding_cluster::make_unknown_relative_to): Reject
attempts to create a cluster for untracked base regions.
(store::set_value): Likewise.
(store::fill_region): Likewise.
(store::mark_region_as_unknown): Likewise.
gcc/testsuite/ChangeLog:
PR analyzer/105057
* gcc.dg/analyzer/fread-2.c: New test, as a regression test for
ICE in store::set_value on untracked base region.
* gcc.dg/analyzer/memset-2.c: Likewise, for ICE in
store::fill_region.
* gcc.dg/analyzer/strcpy-2.c: Likewise, for ICE in
store::mark_region_as_unknown.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Jonathan reported on IRC that we don't parse
__builtin_bit_cast (type, val).field
etc.
The problem is that for these 2 builtins we return from
cp_parser_postfix_expression instead of setting postfix_expression
to the cp_build_* value and falling through into the postfix regression
suffix handling loop.
2022-03-26 Jakub Jelinek <jakub@redhat.com>
* parser.cc (cp_parser_postfix_expression)
<case RID_BILTIN_CONVERTVECTOR, case RID_BUILTIN_BIT_CAST>: Don't
return cp_build_{vec,convert,bit_cast} result right away, instead
set postfix_expression to it and break.
* c-c++-common/builtin-convertvector-3.c: New test.
* g++.dg/cpp2a/bit-cast15.C: New test.
gcc:
* reload.cc (find_reloads): Align comment with code where
considering the intersection of register classes then tweaking the
regclass for the current alternative or rejecting it.
This patch updates the POWER testsuite test cases using -mcpu= and -mtune=
to use the preferred -mdejagnu-cpu= and -mdejagnu-tune= options. This also
obviates the need for the dg-skip-if directive, since the user cannot
override the -mcpu= value being used to compile the test case.
2022-03-25 Peter Bergner <bergner@linux.ibm.com>
gcc/testsuite/
* g++.dg/pr65240-1.C: Use -mdejagnu-cpu=. Remove dg-skip-if.
* g++.dg/pr65240-2.C: Likewise.
* g++.dg/pr65240-3.C: Likewise.
* g++.dg/pr65240-4.C: Likewise.
* g++.dg/pr65242.C: Likewise.
* g++.dg/pr67211.C: Likewise.
* g++.dg/pr69667.C: Likewise.
* g++.dg/pr71294.C: Likewise.
* g++.dg/pr84279.C: Likewise.
* g++.dg/torture/ppc-ldst-array.C: Likewise.
* gfortran.dg/nint_p7.f90: Likewise.
* gfortran.dg/pr102860.f90: Likewise.
* gcc.target/powerpc/fusion.c: Use -mdejagnu-cpu= and -mdejagnu-tune=.
* gcc.target/powerpc/fusion2.c: Likewise.
* gcc.target/powerpc/int_128bit-runnable.c: Use -mdejagnu-cpu=.
* gcc.target/powerpc/test_mffsl.c: Likewise.
* gfortran.dg/pr47614.f: Likewise.
* gfortran.dg/pr58968.f: Likewise.
This reverts commit r12-1434-g046a3beb1673bf to fix PR target/104882.
As discussed in the PR, it turns out that the MVE ISA has no natural
mapping with GCC's vec_pack_trunc / vec_unpack standard patterns, unlike
Neon or SVE for instance.
This patch also adds the executable testcase provided in the PR.
This test passes at -O3 because the generated code does not need
to use the pack/unpack patterns, hence the use of -O2 which now
triggers vectorization since a few months ago.
2022-03-18 Christophe Lyon <christohe.lyon@arm.com>
PR target/104882
Revert
2021-06-11 Christophe Lyon <christophe.lyon@linaro.org>
gcc/
* config/arm/mve.md (mve_vec_unpack<US>_lo_<mode>): Delete.
(mve_vec_unpack<US>_hi_<mode>): Delete.
(@mve_vec_pack_trunc_lo_<mode>): Delete.
(mve_vmovntq_<supf><mode>): Remove '@' prefix.
* config/arm/neon.md (vec_unpack<US>_hi_<mode>): Move back
from vec-common.md.
(vec_unpack<US>_lo_<mode>): Likewise.
(vec_pack_trunc_<mode>): Rename from
neon_quad_vec_pack_trunc_<mode>.
* config/arm/vec-common.md (vec_unpack<US>_hi_<mode>): Delete.
(vec_unpack<US>_lo_<mode>): Delete.
(vec_pack_trunc_<mode>): Delete.
PR target/104882
gcc/testsuite/
* gcc.target/arm/simd/mve-vclz.c: Update expected results.
* gcc.target/arm/simd/mve-vshl.c: Likewise.
* gcc.target/arm/simd/mve-vec-pack.c: Delete.
* gcc.target/arm/simd/mve-vec-unpack.c: Delete.
* gcc.target/arm/simd/pr104882.c: New test.
LRA removes insn modifying sp for given PR test set. We should also have
checked living hard regs to prevent this. The patch fixes this.
gcc/ChangeLog:
PR middle-end/104971
* lra-lives.cc (process_bb_lives): Check hard_regs_live for hard
regs to clear remove_p flag.
When we optimize permutations in a reduction chain we have to
be careful to select the correct live-out stmt, otherwise the
reduction result will be unused and the retained scalar code will
execute only the number of vector iterations.
2022-03-25 Richard Biener <rguenther@suse.de>
PR tree-optimization/105053
* tree-vect-loop.cc (vect_create_epilog_for_reduction): Pick
the correct live-out stmt for a reduction chain.
* g++.dg/vect/pr105053.cc: New testcase.
I started looking into this PR because in GCC 4.9 we were able to
detect the invalid
struct alignas(void) S{};
but I broke it in r210262.
It's ill-formed code in C++:
[dcl.align]/3: "An alignment-specifier of the form alignas(type-id) has
the same effect as alignas(alignof(type-id))", and [expr.align]/1:
"The operand shall be a type-id representing a complete object type,
or an array thereof, or a reference to one of those types." and void
is not a complete type.
It's also invalid in C:
6.7.5: _Alignas(type-name) is equivalent to _Alignas(_Alignof(type-name))
6.5.3.4: "The _Alignof operator shall not be applied to a function type
or an incomplete type."
We have a GNU extension whereby we treat sizeof(void) as 1, but I assume
it doesn't apply to alignof, at least in C++. However, __alignof__(void)
is still accepted with a -Wpedantic warning.
We still say "invalid application of 'alignof'" rather than 'alignas' in the
void diagnostic, but I felt that fixing that may not be suitable as part of
this patch. The "incomplete type" diagnostic still always prints
'__alignof__'.
PR c++/104944
gcc/cp/ChangeLog:
* typeck.cc (cxx_sizeof_or_alignof_type): Diagnose alignof(void).
(cxx_alignas_expr): Call cxx_sizeof_or_alignof_type with
complain == true.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/alignas20.C: New test.
When a display manager is running on an nvidia card, all CUDA kernel launches
get a 5 seconds watchdog timer.
Consequently, when running the libgomp testsuite with nvptx accelerator and
GOMP_NVPTX_JIT=-O0 we run into a few FAILs like this:
...
libgomp: cuStreamSynchronize error: the launch timed out and was terminated
FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/parallel-dims.c \
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O0 \
execution test
...
Fix this by scaling down the failing test-cases by default, and reverting to
the original behaviour for GCC_TEST_RUN_EXPENSIVE=1.
Tested on x86_64-linux with nvptx accelerator.
libgomp/ChangeLog:
2022-03-25 Tom de Vries <tdevries@suse.de>
PR libgomp/105042
* testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Reduce
execution time.
* testsuite/libgomp.oacc-c-c++-common/vred2d-128.c: Same.
* testsuite/libgomp.oacc-fortran/parallel-dims.f90: Same.