For array-descriptor vars, the descriptor is assigned to a temporary. However,
this failed when the clause's argument was in turn in a data-sharing clause
as the outer context's VALUE_EXPR wasn't used.
gcc/ChangeLog:
* omp-low.cc (lower_omp_target): Fix use_device_{addr,ptr} with list
item that is in an outer data-sharing clause.
libgomp/ChangeLog:
* testsuite/libgomp.fortran/use_device_addr-5.f90: New test.
This patch fixes an oversight whereby we treated >= as the end of
a template argument. This causes problems in C++14, because in
cp_parser_template_argument we go different ways for C++14 and C++17:
/* It must be a non-type argument. In C++17 any constant-expression is
allowed. */
if (cxx_dialect > cxx14)
goto general_expr;
so in this testcase in C++14 we get "N" as the template argument but in
C++17 it is the whole "N >= 5" expression. So in C++14 the remaining
">= 5" triggered the newly-added diagnostic.
PR c++/105436
gcc/cp/ChangeLog:
* parser.cc (cp_parser_next_token_ends_template_argument_p): Don't
return true for CPP_GREATER_EQ.
gcc/testsuite/ChangeLog:
* g++.dg/parse/template31.C: New test.
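For the template-argument fix above, a minimal sketch of the kind of code involved may help. The names `A` and `at_least_five` are illustrative assumptions and are not taken from the new test; the point is only that the whole expression `N >= 5` is the template argument, so `>=` must not be treated as ending it.

```cpp
// Illustrative only: A and at_least_five are hypothetical names.
template <bool B>
struct A
{
  static constexpr bool value = B;
};

// The template argument is the whole expression "N >= 5"; the ">="
// token does not terminate the argument.
template <int N>
constexpr bool at_least_five ()
{
  return A<N >= 5>::value;
}

static_assert (at_least_five<7> (), "7 >= 5");
static_assert (!at_least_five<3> (), "3 < 5");
```

Writing the argument as `A<(N >= 5)>` is an equivalent spelling that avoids any reliance on how the parser splits the tokens.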
This removes the __array_traits::_S_ref and __array_traits::_S_ptr
accessors, which only exist to make the special case of std::array<T, 0>
syntactically well-formed.
By changing the empty type used as the std::array<T, 0>::_M_elems data
member to support operator[] and conversion to a pointer, we can write
code using the natural syntax. The indirection through _S_ref and
_S_ptr is removed for the common case, and a function call is only used
for the special case of zero-size arrays.
The invalid member access for zero-sized arrays is changed to use
__builtin_trap() instead of a null dereference. This guarantees a
runtime error if it ever gets called, instead of undefined behaviour
that is likely to get optimized out as unreachable.
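The idea can be sketched roughly as follows. This is not the libstdc++ code: `array_traits` and `tiny_array` are hypothetical stand-ins, and `__builtin_trap` is assumed to be available (it is a GCC and Clang builtin).

```cpp
#include <cstddef>

// Hypothetical stand-ins for the library internals.
template <typename T, std::size_t N>
struct array_traits
{
  using type = T[N];          // ordinary built-in array for N > 0
};

template <typename T>
struct array_traits<T, 0>
{
  // Empty type that still supports the array-like syntax.
  struct type
  {
    // Indexing an empty array is invalid: trap at run time instead of
    // dereferencing a null pointer.
    T& operator[] (std::size_t) const noexcept { __builtin_trap (); }
    // Conversion to pointer yields a null pointer.
    constexpr operator T* () const noexcept { return nullptr; }
  };
};

template <typename T, std::size_t N>
struct tiny_array
{
  typename array_traits<T, N>::type elems;

  // The same natural syntax works for both the array and the empty type.
  T& operator[] (std::size_t i) noexcept { return elems[i]; }
  T* data () noexcept { return elems; }   // array decay or operator T*
};
```

With this shape, `tiny_array<int, 3>` indexes its built-in array directly, and only `tiny_array<int, 0>` goes through a member function call.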
libstdc++-v3/ChangeLog:
PR libstdc++/104719
* include/std/array (__array_traits::_S_ref): Remove.
(__array_traits::_S_ptr): Remove.
(__array_traits<T, 0>::_Type): Define operator[] and operator T*
to provide an array-like API.
(array::_AT_Type): Remove public typedef.
(array::operator[], array::at, array::front, array::back): Use
index operator to access _M_elems instead of _S_ref.
(array::data): Use implicit conversion from _M_elems to pointer.
(swap(array&, array&)): Use __enable_if_t helper.
(get<I>): Use index operator to access _M_elems.
* testsuite/23_containers/array/tuple_interface/get_neg.cc:
Adjust dg-error line numbers.
Jakub pointed out that cdtor_label is unnecessary; we should get all the
desired semantics with a normal return.
gcc/cp/ChangeLog:
* cp-tree.h (struct language_function): Remove x_cdtor_label.
(cdtor_label, LABEL_DECL_CDTOR): Remove.
* constexpr.cc (returns): Don't check LABEL_DECL_CDTOR.
(cxx_eval_constant_expression): Don't call returns.
* decl.cc (check_goto): Don't check cdtor_label.
(start_preparsed_function): And don't set it.
(finish_constructor_body, finish_destructor_body): Remove.
(finish_function_body): Don't call them.
* typeck.cc (check_return_expr): Handle cdtor_returns_this here.
* semantics.cc (finish_return_stmt): Not here.
When pattern recognition fails to sanitize all defs of a mask
producing operation and the respective def is external or constant
we end up trying to produce a VECTOR_BOOLEAN_TYPE_P constructor
which in turn ends up exposing stmts like
<signed-boolean:1> _135 = _49 ? -1 : 0;
which isn't handled well in followup SLP and generates awful code.
We do rely heavily on pattern recognition to sanitize mask vs.
data uses of bools but that fails here which means we also should
fail vectorization. That avoids ICEing because of such stmts
and it also avoids generating weird code which makes the
vectorization not profitable.
The following patch simply disallows external VECTOR_BOOLEAN_TYPE_P
defs and arranges the promote to external code to instead promote
mask uses to extern (that's just a short-cut here).
I've also looked at aarch64 with SVE and a fixed vector length, using
the gcc.target/i386/pr101636.c testcase. I see similar vectorization
(using <signed-boolean:4>) there but it's hard to decide whether the
old, the new or no vectorization is better for this. The code
generated with traditional integer masks isn't as awkward but we
still get the != 0 promotion done for each scalar element which
doesn't look like intended - this operation should be visible upfront.
That also means some cases will now become a missed optimization
that needs to be fixed by bool pattern recognition. But that can
possibly be delayed to GCC 13.
2022-02-22 Richard Biener <rguenther@suse.de>
PR tree-optimization/104658
* tree-vect-slp.cc (vect_slp_convert_to_external): Do not
create VECTOR_BOOLEAN_TYPE_P extern defs. Reset the vector
type on nodes we promote.
(vectorizable_bb_reduc_epilogue): Deal with externalized
root.
* tree-vect-stmts.cc (vect_maybe_update_slp_op_vectype): Do
not allow VECTOR_BOOLEAN_TYPE_P extern defs.
* gcc.target/i386/pr104658.c: New testcase.
The testcase shows that we can end up with a contiguous access across
loop iterations but by means of permutations the elements accessed
might only cover parts of a vector. In this case we end up with
GROUP_GAP == 0 but still need to avoid accessing excess elements
in the last loop iterations. Peeling for gaps is designed to cover
this but a single scalar iteration might not cover all of the excess
elements. The following ensures peeling for gaps is done in this
situation and when that isn't sufficient because we need to peel
more than one iteration (gcc.dg/vect/pr103116-2.c), fail the SLP
vectorization.
2022-05-04 Richard Biener <rguenther@suse.de>
PR tree-optimization/103116
* tree-vect-stmts.cc (get_group_load_store_type): Handle the
case we need peeling for gaps even though GROUP_GAP is zero.
* gcc.dg/vect/pr103116-1.c: New testcase.
* gcc.dg/vect/pr103116-2.c: Likewise.
In PR105049 we had
return VIEW_CONVERT_EXPR<U>( VEC_PERM_EXPR < {<<< Unknown tree: compound_literal_expr
V D.1984 = { 0 }; >>>, { 0 }} , {<<< Unknown tree: compound_literal_expr
V D.1985 = { 0 }; >>>, { 0 }} , { 0, 0 } > & {(short int) SAVE_EXPR <c>, (short int) SAVE_EXPR <c>});
where we gimplify the init CTORs to
_1 = {{ 0 }, { 0 }};
_2 = {{ 0 }, { 0 }};
instead of to vector constants. The following makes sure to simplify the
CTORs to VECTOR_CSTs during gimplification by re-ordering the simplification
to after CTOR flag recomputation and gimplification of the elements.
2022-03-25 Richard Biener <rguenther@suse.de>
* gimplify.cc (gimplify_init_constructor): First gimplify,
then simplify the result to a VECTOR_CST.
Somebody complained on IRC that when writing a new backend one can get
an error while compiling build/gencondmd.cc.
The problem is that when the host compiler is g++ 3 or later (or when
bootstrapping), we compile it with g++ -std=c++11 -pedantic and
the generated insn_conditions array contains pairs
{ "cond", __builtin_constant_p (cond) ? (int) (cond) : -1 },
where cond is some non-trivial instruction condition. Now if a target
uses "" for all the conditions (admittedly unlikely for non-trivial
target), the initializer for insn_conditions[] is {} and that is
pedantically rejected because C++ doesn't support zero-sized arrays.
The following patch fixes that by adding an artificial termination
element and skipping it during the walk.
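A minimal sketch of the sentinel approach, assuming a simplified table layout: `c_test`, `only_sentinel`, `two_conds` and `count_real` are illustrative names here, and the condition strings are made up.

```cpp
#include <cstddef>

// Simplified stand-in for one entry of the generated table.
struct c_test
{
  const char *expr;
  int value;
};

// With no real conditions the initializer would be "{}", a zero-sized
// array that -pedantic rejects.  The trailing sentinel keeps the array
// non-empty.
static const c_test only_sentinel[] = {
  { nullptr, -1 }   // artificial terminator, never a real condition
};

static const c_test two_conds[] = {
  { "TARGET_A", 1 },    // made-up condition strings
  { "TARGET_B", -1 },
  { nullptr, -1 }       // artificial terminator
};

// Walk only the real entries: N - 1 skips the terminator at the end.
template <std::size_t N>
std::size_t count_real (const c_test (&table)[N])
{
  std::size_t n = 0;
  for (std::size_t i = 0; i < N - 1; i++)
    if (table[i].expr != nullptr)
      n++;
  return n;
}
```

Here `count_real (only_sentinel)` visits no entries at all, which corresponds to a target whose conditions are all empty.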
2022-05-04 Jakub Jelinek <jakub@redhat.com>
* genconditions.cc (write_conditions): Append a { nullptr, -1 }
element at the end of insn_conditions.
(write_writer): Use ARRAY_SIZE (insn_conditions) - 1 instead of
ARRAY_SIZE (insn_conditions).
This simple patch avoids the ICE described in the PR:
internal compiler error: in simd_valid_immediate, at config/arm/arm.cc:12866
with an early exit from simd_valid_immediate if we are trying to
handle a vector of booleans and MVE is not enabled.
We still get an ICE when compiling the existing
gcc.dg/rtl/arm/mve-vxbi.c without -march=armv8.1-m.main+mve:
error: unrecognizable insn:
(insn 7 5 8 2 (set (reg:V4BI 114)
(const_vector:V4BI [
(const_int 1 [0x1])
(const_int 0 [0]) repeated x2
(const_int 1 [0x1])
])) -1
(nil))
during RTL pass: ira
but there's little we can do since the testcase explicitly creates
vectors of booleans which do need MVE.
That is the reason why I do not add a testcase.
2022-04-19 Christophe Lyon <christophe.lyon@arm.com>
PR target/104662
* config/arm/arm.cc (simd_valid_immediate): Exit when input is a
vector of booleans and MVE is not enabled.
On the following testcase, we emit deprecated warnings or unavailable errors
even on mere declarations of those lambdas (the dg-bogus directives), while
IMHO we should emit them only when something actually calls those lambdas.
The following patch temporarily disables those diagnostics during
maybe_add_lambda_conv_op.
P2173R1 also says that ambiguity between attribute-specifier-seq at the
end of requires-clause and attribute-specifier-seq from lambda-expression
should be resolved to attribute-specifier-seq for the latter. Do we need
to do anything about that? I mean, can a valid requires-clause end with
an attribute-specifier-seq? Say operator int [[]] is a valid primary
expression, but requires operator int [[]] isn't valid, nor is
requires operator int, no?
2022-05-04 Jakub Jelinek <jakub@redhat.com>
* lambda.cc: Include decl.h.
(maybe_add_lambda_conv_op): Temporarily override deprecated_state to
UNAVAILABLE_DEPRECATED_SUPPRESS.
* g++.dg/cpp23/lambda-attr1.C: New test.
* g++.dg/cpp23/lambda-attr2.C: New test.
Support a change in libsanitizer, which now reports:
READ of size 4 at 0xffffffffc3d4 tags: 02/01(00) (ptr/mem) in thread T0
So 'tags' now contains 3 entries instead of 2.
gcc/testsuite/ChangeLog:
* c-c++-common/hwasan/alloca-outside-caught.c: Update dg-output.
* c-c++-common/hwasan/heap-overflow.c: Likewise.
* c-c++-common/hwasan/hwasan-thread-access-parent.c: Likewise.
* c-c++-common/hwasan/large-aligned-1.c: Likewise.
* c-c++-common/hwasan/stack-tagging-basic-1.c: Likewise.
Currently when we cannot move debug stmts from a forwarder to the
destination block we drop/reset them. But in some cases, as for
the testcase, we can move them to the predecessor when that has
a single successor and we can insert after the last stmt of the
block. That allows us to preserve debug info here.
2022-04-05 Richard Biener <rguenther@suse.de>
PR debug/105158
* tree-cfgcleanup.cc (move_debug_stmts_from_forwarder):
Move debug stmts to the predecessor if moving to the
destination is not possible.
(remove_forwarder_block): Adjust.
(remove_forwarder_block_with_phi): Likewise.
Here since finish_non_static_data_member isn't SFINAE enabled, we
incorrectly emit an error when considering the first overload rather
than silently discarding it:
sfinae33.C: In substitution of ‘template<class T> A<T::value> f() [with T = B]’:
sfinae33.C:11:7: required from here
sfinae33.C:5:31: error: invalid use of non-static data member ‘B::value’
5 | template<class T> A<T::value> f();
| ^
This patch makes the function SFINAE enabled in the usual way: give it a
complain parameter, check it before emitting an error, and pass it through
appropriately.
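The "usual way" can be illustrated with a toy model. Everything below is a hypothetical stand-in: GCC's real type is tsubst_flags_t with flags such as tf_error, and the real functions operate on trees rather than booleans.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for the complain flags.
enum complain_flags { cf_none = 0, cf_error = 1 };

static std::vector<std::string> diagnostics;   // collected error messages

// Returns false on failure.  An error is recorded only when the caller
// asked for diagnostics, which makes the function usable during
// speculative (SFINAE) substitution.
bool check_member_use (bool is_static_member,
                       complain_flags complain = cf_error)
{
  if (!is_static_member)
    {
      if (complain & cf_error)
        diagnostics.push_back ("invalid use of non-static data member");
      return false;
    }
  return true;
}

// Overload resolution tries the first candidate quietly and discards it
// if it fails, falling back to the second one without any error.
int pick_overload (bool first_is_valid)
{
  if (check_member_use (first_is_valid, cf_none))
    return 1;
  return 2;
}
```

The defaulted parameter keeps existing callers unchanged, while callers inside template substitution pass their own flags through.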
PR c++/105351
gcc/cp/ChangeLog:
* cp-tree.h (finish_non_static_data_member): Add defaulted
complain parameter.
* pt.cc (tsubst_copy_and_build): Pass complain to
finish_non_static_data_member.
* semantics.cc (finish_non_static_data_member): Respect complain
parameter.
(finish_qualified_id_expr): Pass complain to
finish_non_static_data_member.
gcc/testsuite/ChangeLog:
* g++.dg/template/sfinae33.C: New test.
Update the match rules to accommodate the non-standard libgcc function
names for the PRU backend.
gcc/testsuite/ChangeLog:
* gcc.c-torture/compile/attr-complex-method-2.c: Accept both __divdc3
and __gnu_divdc3 as valid libgcc function names.
* gcc.dg/complex-6.c: Ditto for __mulsc3.
* gcc.dg/complex-7.c: Ditto for __muldc3.
Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
The memchr test cases expect padding to be present in structures. But
this is not true for targets which pack by default. Skip these test
cases in order to avoid static assert errors when checking field offsets.
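As a rough illustration of the assumption involved (the struct below is hypothetical and not copied from the tests), whether padding exists between two members can be checked like this:

```cpp
#include <cstddef>

// Hypothetical struct, similar in spirit to what such tests declare.
struct with_gap
{
  char c;
  int i;
};

// True when the compiler inserted padding between c and i.  On targets
// that pack structures by default this is false, so an unconditional
// static_assert on the padded offset would fail there.
constexpr bool has_padding_after_c ()
{
  return offsetof (with_gap, i) > offsetof (with_gap, c) + sizeof (char);
}
```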
gcc/testsuite/ChangeLog:
* gcc.dg/memchr.c: Skip for default_packed targets.
* gcc.dg/memcmp-3.c: Ditto.
Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
Place markers in test case to handle targets which pack structures by
default. Validated on pru-none-elf.
gcc/testsuite/ChangeLog:
* gcc.dg/Wattributes-8.c: Add annotations for default_packed
targets.
Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
The PRU target defines DI patterns for logical ALU operations.
gcc/testsuite/ChangeLog:
* gcc.dg/lower-subreg-1.c: Skip for PRU.
Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
Access to arbitrary stack frames is not supported on PRU.
gcc/testsuite/ChangeLog:
* gcc.dg/Wno-frame-address.c: Skip for PRU target.
Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
This patch fixes PR tree-optimization/102950, which is a P2 regression,
by providing better range bounds for BIT_XOR_EXPR, BIT_AND_EXPR and
BIT_IOR_EXPR on signed integer types. In general terms, any binary
bitwise operation on sign-extended or zero-extended integer types will
produce results that are themselves sign-extended or zero-extended.
More precisely, we can derive signed bounds from the number of leading
redundant sign bit copies, from the equation:
clrsb(X op Y) >= min (clrsb (X), clrsb(Y))
and from the property that for any (signed or unsigned) range [lb, ub],
clrsb([lb, ub]) >= min (clrsb(lb), clrsb(ub)).
These can be used to show that [-1, 0] op [-1, 0] is [-1, 0] or that
[-128, 127] op [-128, 127] is [-128, 127], even when tracking nonzero
bits would result in VARYING (as every bit can be 0 or 1). This is
equivalent to determining the minimum type precision in which the
operation can be performed then sign extending the result.
One additional refinement is to observe that X ^ Y can never be
zero if the ranges of X and Y don't overlap, i.e. X can't be equal
to Y.
Previously, the expression "(int)(char)a ^ 233" in the PR was considered
VARYING, but with the above changes now has the range [-256, -1][1, 255],
which is sufficient to optimize away the call to foo.
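The bound can be checked by brute force on small operands. The sketch below is not the range-op.cc implementation; the helper names are made up, clrsb is written out by hand for 32-bit values, and the last function assumes a signed 8-bit char.

```cpp
#include <algorithm>
#include <cstdint>

// Number of leading redundant sign bits of a 32-bit value.
int clrsb32 (std::int32_t x)
{
  std::uint32_t u = static_cast<std::uint32_t> (x);
  std::uint32_t sign = u >> 31;
  int n = 0;
  for (int bit = 30; bit >= 0 && ((u >> bit) & 1u) == sign; --bit)
    ++n;
  return n;
}

// Check clrsb (X op Y) >= min (clrsb (X), clrsb (Y)) for op in &, |, ^
// over all operand pairs in [lo, hi].
bool clrsb_bound_holds (int lo, int hi)
{
  for (int x = lo; x <= hi; ++x)
    for (int y = lo; y <= hi; ++y)
      {
        int m = std::min (clrsb32 (x), clrsb32 (y));
        if (clrsb32 (x & y) < m || clrsb32 (x | y) < m
            || clrsb32 (x ^ y) < m)
          return false;
      }
  return true;
}

// The expression from the PR, assuming a signed 8-bit char: every
// result lies in [-256, -1] or [1, 255], so it is never zero.
bool xor_233_in_range ()
{
  for (int a = -128; a <= 127; ++a)
    {
      int r = static_cast<int> (static_cast<signed char> (a)) ^ 233;
      bool neg = r >= -256 && r <= -1;
      bool pos = r >= 1 && r <= 255;
      if (!neg && !pos)
        return false;
    }
  return true;
}
```

Both endpoints of [-128, 127] have 24 redundant sign bits, so any bitwise result of two such values also has at least 24, which is the [-128, 127] claim above.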
2022-05-03 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
PR tree-optimization/102950
* range-op.cc (wi_optimize_signed_bitwise_op): New function to
determine bounds of bitwise operations on signed types.
(operator_bitwise_and::wi_fold): Call the above function.
(operator_bitwise_or::wi_fold): Likewise.
(operator_bitwise_xor::wi_fold): Likewise. Additionally, the
result can't be zero if the operands can't be equal.
gcc/testsuite/ChangeLog
PR tree-optimization/102950
* gcc.dg/pr102950.c: New test case.
* gcc.dg/tree-ssa/evrp10.c: New test case.
Current host tools mark some additional symbols as 'no dead strip' and also
expose one additional group to the linker. This does not affect older Darwin
versions or x86_64, but omitting these changes results in link errors for
aarch64.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/ChangeLog:
* config/darwin.cc (darwin_label_is_anonymous_local_objc_name): Make
protocol class methods linker-visible.
gcc/objc/ChangeLog:
* objc-next-runtime-abi-02.cc (next_runtime_abi_02_protocol_decl): Do
not dead-strip the runtime meta-data symbols.
(build_v2_classrefs_table): Likewise.
(build_v2_protocol_list_address_table): Likewise.
The floating-point overloads of from_chars are only declared if
_GLIBCXX_HAVE_USELOCALE is #defined as nonzero. That's exposed from
charconv as __cpp_lib_to_chars >= 201611L, so guard the test body with
that.
for libstdc++-v3/ChangeLog
PR c++/105324
* testsuite/20_util/from_chars/pr105324.cc: Guard test body
with conditional for floating-point overloads of from_chars.
Optimize _mm_storeu_si16 to use MOVD from a SSE to an integer register
instead of PEXTRW from a low word of the SSE register to an integer reg.
Avoid the transformation when optimizing for size for targets without
TARGET_INTER_UNIT_MOVES_FROM_VEC capability, where the transformation
results in two moves via a memory location.
2022-05-03 Uroš Bizjak <ubizjak@gmail.com>
gcc/ChangeLog:
PR target/105079
* config/i386/sse.md (*vec_extract<mode>_0_mem): New pre-reload
define_insn_and_split pattern.
gcc/testsuite/ChangeLog:
PR target/105079
* gcc.target/i386/pr105079.c: New test.
* gcc.target/i386/pr95483-1.c (dg-options): Use -msse4.1.
It seems the license header was omitted when dfp.m4 was originally
contributed in 2010 (commit 3c39bca6bb, r0-102573 or svn r163815).
This copies the license from libdecnumber/configure.ac since dfp.m4
was originally extracted from that file.
2022-04-29 Christophe Lyon <christophe.lyon@arm.com>
config/
* dfp.m4: Add license header.
2022-05-03 Richard Biener <rguenther@suse.de>
PR middle-end/105083
* tree-scalar-evolution.cc (scev_initialize): Verify we
have appropriate loop state.
* tree-ssa-dce.cc (perform_tree_ssa_dce): Re-order SCEV and
loop init and finalization.
The flag_var_tracking reset in finish_options doesn't match the
condition in process_options, in particular we fail to reset it
when the option was specified on the command line. The following
fixes this and also alters the debug info level guard to match
the one in process_options.
2022-05-03 Richard Biener <rguenther@suse.de>
PR middle-end/105461
* opts.cc (finish_options): Match the condition to
disable flag_var_tracking to that of process_options.
* gcc.dg/pr105461.c: New testcase.
When some code was moved from process_options to finish_options,
uses of OPTION_SET_P were not replaced with references to the
opts_set option set. The following fixes this.
2022-05-03 Richard Biener <rguenther@suse.de>
* opts.cc: #undef OPTION_SET_P.
(finish_options): Use opts_set instead of OPTION_SET_P.
The following fixes missing handling of non-integer mode but
masked (SVE or MVE) compares in vector lowering by using the
appropriate mask element width to extract the components and
adjust the index.
2022-04-29 Richard Biener <rguenther@suse.de>
PR tree-optimization/105394
* tree-vect-generic.cc (expand_vector_condition): Adjust
comp_width for non-integer mode masks as well.
gcc.dg/vect/costmodel/ppc/costmodel-vect-31a.c covers ppc variants
that accept and reject misaligned accesses. The message that it
expects for rejection was removed in the gcc-11 development cycle by
commit r11-1969. The patch adjusted multiple tests to use the message
introduced in r11-1945, but missed this one.
for gcc/testsuite/ChangeLog
* gcc.dg/vect/costmodel/ppc/costmodel-vect-31a.c: Update
the expected message for the case in which unaligned accesses
are not allowed.
On PR102629 I noticed that we were giving the entire lambda as the location
for this template-id.
gcc/cp/ChangeLog:
* pt.cc (tsubst_copy_and_build) [TEMPLATE_ID_EXPR]: Copy location.
(do_auto_deduction): Use expr location.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/lambda-pack-init7.C: Check column number.
While looking at PR96645 I noticed that while we were diagnosing names
changing meaning in the full class context, we weren't doing this for
lookups in nested class bodies.
Note that this breaks current range-v3; I've submitted a pull request to fix
its violation of the rule.
gcc/cp/ChangeLog:
* class.cc (maybe_note_name_used_in_class): Note in all enclosing
classes. Remember location of use.
(note_name_declared_in_class): Adjust.
gcc/testsuite/ChangeLog:
* g++.dg/lookup/name-clash13.C: New test.
* g++.dg/lookup/name-clash14.C: New test.
* g++.dg/lookup/name-clash15.C: New test.
* g++.dg/lookup/name-clash16.C: New test.
This makes sure to not consider calls to builtin decls with
mismatching arguments as inexpensive.
2022-04-13 Richard Biener <rguenther@suse.de>
* tree-scalar-evolution.cc (expression_expensive_p):
Never consider mismatched calls as cheap.
The following extends SLP discovery to handle swapped operands
in comparisons.
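A hypothetical input of the kind this is about (not the testcase added by the patch) has lanes that perform the same comparison but spell it with the operands in opposite order:

```cpp
// Illustrative only: every lane computes a[i] < b[i], but the odd
// lanes write it as b[i] > a[i].
void compare_lanes (const int *a, const int *b, bool *out)
{
  out[0] = a[0] < b[0];
  out[1] = b[1] > a[1];   // a[1] < b[1] with operands swapped
  out[2] = a[2] < b[2];
  out[3] = b[3] > a[3];   // likewise
}

// Reference check: all four lanes agree with the unswapped form.
bool lanes_match_reference (const int *a, const int *b)
{
  bool out[4];
  compare_lanes (a, b, out);
  for (int i = 0; i < 4; ++i)
    if (out[i] != (a[i] < b[i]))
      return false;
  return true;
}
```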
2022-05-02 Richard Biener <rguenther@suse.de>
PR tree-optimization/104240
* tree-vect-slp.cc (op1_op0_map): New.
(vect_get_operand_map): Handle compares.
(vect_build_slp_tree_1): Support swapped operands for
tcc_comparison.
* gcc.dg/vect/bb-slp-pr104240.c: New testcase.
As with std::isdigit in r12-6281-gc83ecfbe74a5cf, we shouldn't be using
std::tolower in <charconv> either.
PR libstdc++/103911
libstdc++-v3/ChangeLog:
* src/c++17/floating_from_chars.cc (find_end_of_float): Accept
two delimiters for the exponent part in the form of a possibly
NULL string of length two. Don't use std::tolower.
(pattern): Adjust calls to find_end_of_float accordingly.
The hexfloat parser for binary32/64 added in r12-6645-gcc3bf3404e4b1c
overlooked that the exponent part can also begin with an uppercase 'P'.
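One way to accept both cases without std::tolower is to pass the two allowed delimiters explicitly. The helpers below are a hypothetical sketch and not the libstdc++ functions:

```cpp
#include <cstddef>
#include <cstring>

// Return a pointer to the first character in [first, last) that matches
// one of the two exponent delimiters in EXP (for example "eE" or "pP"),
// or LAST if there is none.  No locale-dependent call is involved.
const char *
find_exponent_start (const char *first, const char *last, const char *exp)
{
  for (; first != last; ++first)
    if (*first == exp[0] || *first == exp[1])
      return first;
  return last;
}

// Convenience wrapper for NUL-terminated input: offset of the exponent
// delimiter, or -1 if the string has none.
std::ptrdiff_t
exponent_offset (const char *s, const char *exp)
{
  const char *last = s + std::strlen (s);
  const char *p = find_exponent_start (s, last, exp);
  return p == last ? -1 : p - s;
}
```

For hexadecimal input the delimiter pair has to be "pP", since 'e' and 'E' are themselves hex digits.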
PR libstdc++/105441
libstdc++-v3/ChangeLog:
* src/c++17/floating_from_chars.cc (__floating_from_chars_hex):
Also accept 'P' as the start of the exponent.
* testsuite/20_util/from_chars/7.cc: Add corresponding testcase.
The following testcase fails -fcompare-debug on aarch64-linux. The problem
is that for the n variable we create a varpool node, then remove it again
because the var isn't really used, but it keeps being referenced in debug
stmts/insns with -g. Later during sched1 pass we ask whether the n var
can be modified through some store to an anchored variable and with -g
we create a new varpool node for it again just so that we can find its
ultimate alias target. Even later on we create some cgraph node for the
loop parallelization, but as there has been an extra varpool node creation
in between, we get higher node->order with -g than without.
The patch fixes that by throwing variables without varpool nodes away
during expansion time; they are very unlikely to actually end up with
useful debug info anyway.
I've bootstrapped/regtested the following on x86_64-linux and i686-linux,
then bootstrapped with the patch reverted, reapplied the patch and did make
cc1plus in stage3. The debug section sizes are identical, and .debug_info
and .debug_loc are identical too, so I think we don't lose any debug info
through it.
So at least on cc1plus it makes no difference.
2022-05-02 Jakub Jelinek <jakub@redhat.com>
PR debug/105415
* cfgexpand.cc (expand_debug_expr): Don't make_decl_rtl_for_debug
if there is no symtab node for the VAR_DECL.
* gcc.dg/pr105415.c: New test.