Jonathan reported on IRC that we don't parse
__builtin_bit_cast (type, val).field
etc.
The problem is that for these 2 builtins we return from
cp_parser_postfix_expression instead of setting postfix_expression
to the cp_build_* value and falling through into the postfix expression
suffix handling loop.
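The control-flow difference can be sketched with a toy parser. This is only an illustration under stated assumptions, not GCC's cp_parser code: Cursor, parse_postfix and the early_return switch are hypothetical names. The result of the builtin is kept in a local and control falls through to a loop that consumes `.field` suffixes, instead of returning right away.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token cursor for illustration; this is not GCC's cp_parser.
struct Cursor {
  std::vector<std::string> toks;
  std::size_t pos;

  bool at(const std::string &t) const {
    return pos < toks.size() && toks[pos] == t;
  }
};

// Parses `name ( arg )` followed by any number of `. field` suffixes.
// With early_return set, it mimics the old behaviour: the result of the
// builtin is returned immediately and the suffix tokens stay unconsumed.
std::string parse_postfix(Cursor &c, bool early_return) {
  assert(c.pos + 4 <= c.toks.size());
  std::string name = c.toks[c.pos++];
  assert(c.at("("));
  ++c.pos;
  std::string arg = c.toks[c.pos++];
  assert(c.at(")"));
  ++c.pos;

  std::string postfix_expression = name + "(" + arg + ")";
  if (early_return)
    return postfix_expression;

  // Fall through into the suffix loop instead of returning.
  while (c.at(".")) {
    ++c.pos;                         // consume '.'
    assert(c.pos < c.toks.size());   // a field name must follow
    postfix_expression += "." + c.toks[c.pos++];
  }
  return postfix_expression;
}
```

With early_return set, the `.` and the field name are left in the token stream for the caller, which is roughly how the original failure to parse the suffix came about.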
2022-03-26 Jakub Jelinek <jakub@redhat.com>
* parser.cc (cp_parser_postfix_expression)
<case RID_BUILTIN_CONVERTVECTOR, case RID_BUILTIN_BIT_CAST>: Don't
return cp_build_{vec_convert,bit_cast} result right away, instead
set postfix_expression to it and break.
* c-c++-common/builtin-convertvector-3.c: New test.
* g++.dg/cpp2a/bit-cast15.C: New test.
gcc:
* reload.cc (find_reloads): Align comment with code where
considering the intersection of register classes then tweaking the
regclass for the current alternative or rejecting it.
This patch updates the POWER testsuite test cases using -mcpu= and -mtune=
to use the preferred -mdejagnu-cpu= and -mdejagnu-tune= options. This also
obviates the need for the dg-skip-if directive, since the user cannot
override the -mcpu= value being used to compile the test case.
2022-03-25 Peter Bergner <bergner@linux.ibm.com>
gcc/testsuite/
* g++.dg/pr65240-1.C: Use -mdejagnu-cpu=. Remove dg-skip-if.
* g++.dg/pr65240-2.C: Likewise.
* g++.dg/pr65240-3.C: Likewise.
* g++.dg/pr65240-4.C: Likewise.
* g++.dg/pr65242.C: Likewise.
* g++.dg/pr67211.C: Likewise.
* g++.dg/pr69667.C: Likewise.
* g++.dg/pr71294.C: Likewise.
* g++.dg/pr84279.C: Likewise.
* g++.dg/torture/ppc-ldst-array.C: Likewise.
* gfortran.dg/nint_p7.f90: Likewise.
* gfortran.dg/pr102860.f90: Likewise.
* gcc.target/powerpc/fusion.c: Use -mdejagnu-cpu= and -mdejagnu-tune=.
* gcc.target/powerpc/fusion2.c: Likewise.
* gcc.target/powerpc/int_128bit-runnable.c: Use -mdejagnu-cpu=.
* gcc.target/powerpc/test_mffsl.c: Likewise.
* gfortran.dg/pr47614.f: Likewise.
* gfortran.dg/pr58968.f: Likewise.
This reverts commit r12-1434-g046a3beb1673bf to fix PR target/104882.
As discussed in the PR, it turns out that the MVE ISA has no natural
mapping with GCC's vec_pack_trunc / vec_unpack standard patterns, unlike
Neon or SVE for instance.
This patch also adds the executable testcase provided in the PR.
This test passes at -O3 because the generated code does not need
to use the pack/unpack patterns, hence the use of -O2, which has
triggered vectorization for the past few months.
2022-03-18 Christophe Lyon <christophe.lyon@arm.com>
PR target/104882
Revert
2021-06-11 Christophe Lyon <christophe.lyon@linaro.org>
gcc/
* config/arm/mve.md (mve_vec_unpack<US>_lo_<mode>): Delete.
(mve_vec_unpack<US>_hi_<mode>): Delete.
(@mve_vec_pack_trunc_lo_<mode>): Delete.
(mve_vmovntq_<supf><mode>): Remove '@' prefix.
* config/arm/neon.md (vec_unpack<US>_hi_<mode>): Move back
from vec-common.md.
(vec_unpack<US>_lo_<mode>): Likewise.
(vec_pack_trunc_<mode>): Rename from
neon_quad_vec_pack_trunc_<mode>.
* config/arm/vec-common.md (vec_unpack<US>_hi_<mode>): Delete.
(vec_unpack<US>_lo_<mode>): Delete.
(vec_pack_trunc_<mode>): Delete.
PR target/104882
gcc/testsuite/
* gcc.target/arm/simd/mve-vclz.c: Update expected results.
* gcc.target/arm/simd/mve-vshl.c: Likewise.
* gcc.target/arm/simd/mve-vec-pack.c: Delete.
* gcc.target/arm/simd/mve-vec-unpack.c: Delete.
* gcc.target/arm/simd/pr104882.c: New test.
LRA removes an insn that modifies sp for the test case in the PR. We should
also have checked the live hard regs to prevent this. The patch fixes this.
gcc/ChangeLog:
PR middle-end/104971
* lra-lives.cc (process_bb_lives): Check hard_regs_live for hard
regs to clear remove_p flag.
When we optimize permutations in a reduction chain we have to
be careful to select the correct live-out stmt, otherwise the
reduction result will be unused and the retained scalar code will
execute only the number of vector iterations.
2022-03-25 Richard Biener <rguenther@suse.de>
PR tree-optimization/105053
* tree-vect-loop.cc (vect_create_epilog_for_reduction): Pick
the correct live-out stmt for a reduction chain.
* g++.dg/vect/pr105053.cc: New testcase.
I started looking into this PR because in GCC 4.9 we were able to
detect the invalid
struct alignas(void) S{};
but I broke it in r210262.
It's ill-formed code in C++:
[dcl.align]/3: "An alignment-specifier of the form alignas(type-id) has
the same effect as alignas(alignof(type-id))", and [expr.align]/1:
"The operand shall be a type-id representing a complete object type,
or an array thereof, or a reference to one of those types." and void
is not a complete type.
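As a small illustration of the [dcl.align]/3 equivalence quoted above (a sketch only; ByType, ByValue and same_alignment are made-up names), the two spellings below give the same alignment, while the void form has no alignof to fall back on:

```cpp
#include <cassert>
#include <cstddef>

struct alignas(double) ByType {};            // alignas(type-id) form
struct alignas(alignof(double)) ByValue {};  // spelled-out equivalent

// struct alignas(void) S {};  // ill-formed: void is not a complete object type

// Returns true when both spellings yield the alignment of double.
bool same_alignment() {
  return alignof(ByType) == alignof(ByValue)
         && alignof(ByType) == alignof(double);
}
```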
It's also invalid in C:
6.7.5: _Alignas(type-name) is equivalent to _Alignas(_Alignof(type-name))
6.5.3.4: "The _Alignof operator shall not be applied to a function type
or an incomplete type."
We have a GNU extension whereby we treat sizeof(void) as 1, but I assume
it doesn't apply to alignof, at least in C++. However, __alignof__(void)
is still accepted with a -Wpedantic warning.
We still say "invalid application of 'alignof'" rather than 'alignas' in the
void diagnostic, but I felt that fixing that may not be suitable as part of
this patch. The "incomplete type" diagnostic still always prints
'__alignof__'.
PR c++/104944
gcc/cp/ChangeLog:
* typeck.cc (cxx_sizeof_or_alignof_type): Diagnose alignof(void).
(cxx_alignas_expr): Call cxx_sizeof_or_alignof_type with
complain == true.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/alignas20.C: New test.
When a display manager is running on an nvidia card, all CUDA kernel launches
get a 5-second watchdog timer.
Consequently, when running the libgomp testsuite with nvptx accelerator and
GOMP_NVPTX_JIT=-O0 we run into a few FAILs like this:
...
libgomp: cuStreamSynchronize error: the launch timed out and was terminated
FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/parallel-dims.c \
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O0 \
execution test
...
Fix this by scaling down the failing test-cases by default, and reverting to
the original behaviour for GCC_TEST_RUN_EXPENSIVE=1.
Tested on x86_64-linux with nvptx accelerator.
libgomp/ChangeLog:
2022-03-25 Tom de Vries <tdevries@suse.de>
PR libgomp/105042
* testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Reduce
execution time.
* testsuite/libgomp.oacc-c-c++-common/vred2d-128.c: Same.
* testsuite/libgomp.oacc-fortran/parallel-dims.f90: Same.
We have
return VIEW_CONVERT_EXPR<U>( VEC_PERM_EXPR < {<<< Unknown tree: compound_literal_expr
V D.1984 = { 0 }; >>>, { 0 }} , {<<< Unknown tree: compound_literal_expr
V D.1985 = { 0 }; >>>, { 0 }} , { 0, 0 } > & {(short int) SAVE_EXPR <c>, (short int) SAVE_EXPR <c>});
where we gimplify the init CTORs to
_1 = {{ 0 }, { 0 }};
_2 = {{ 0 }, { 0 }};
instead of to vector constants. That later runs into a bug in
uniform_vector_p which doesn't handle CTORs of vector elements
correctly.
The following adjusts uniform_vector_p to handle CTORs of vector
elements.
2022-03-25 Richard Biener <rguenther@suse.de>
PR middle-end/105049
* tree.cc (uniform_vector_p): Recurse for VECTOR_CST or
CONSTRUCTOR first elements.
* gcc.dg/pr105049.c: New testcase.
This used to work long ago but broke at some point.
gcc/c-family/
* c-ada-spec.cc (dump_ada_import): Deal with the "section" attribute
(dump_ada_node) <POINTER_TYPE>: Do not modify and pass the name, but
the referenced type instead. Deal with the anonymous original type
of a typedef'ed type. In the actual access case, follow the chain
of external subtypes.
<TYPE_DECL>: Tidy up control flow.
On the gfortran.dg/pr103691.f90 testcase the Fortran FE emits
static real(kind=4) a[0] = {[0 ... -1]=2.0e+0};
That is an invalid RANGE_EXPR where the maximum is smaller than the minimum.
The following patch fixes that. If TYPE_MAX_VALUE is smaller than
TYPE_MIN_VALUE, the array is empty and so doesn't need any initializer,
if the two are equal, we don't need to bother with a RANGE_EXPR and
can just use that INTEGER_CST as the index and finally for the 2+ values
in the range it uses a RANGE_EXPR as before.
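The three cases can be sketched as follows. This is a simplified stand-in, not the trans-array.cc code: InitIndex and classify_index_range are hypothetical names, and plain long values stand in for the TYPE_MIN_VALUE / TYPE_MAX_VALUE trees.

```cpp
#include <cassert>

// Hypothetical result of deciding how to index the initializer.
enum class InitIndex {
  None,    // empty array: no initializer needed
  Single,  // one element: use the lone index directly
  Range    // two or more elements: use a [min ... max] range
};

// min_value / max_value stand in for TYPE_MIN_VALUE / TYPE_MAX_VALUE.
InitIndex classify_index_range(long min_value, long max_value) {
  if (max_value < min_value)
    return InitIndex::None;
  if (max_value == min_value)
    return InitIndex::Single;
  return InitIndex::Range;
}
```

The `[0 ... -1]` range from the testcase falls into the first case.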
2022-03-25 Jakub Jelinek <jakub@redhat.com>
PR fortran/103691
* trans-array.cc (gfc_conv_array_initializer): If TYPE_MAX_VALUE is
smaller than TYPE_MIN_VALUE (i.e. empty array), ignore the
initializer; if TYPE_MIN_VALUE is equal to TYPE_MAX_VALUE, use just
the TYPE_MIN_VALUE as index instead of RANGE_EXPR.
With TeX output ("make pdf"), @gccoptlist's content ends up in a single
line such that TeX does not find the matching '@end ignore' for the
'@ignore' block – failing with a runaway error. Solution is to move
the @ignore block after the closing '}'.
(Follow up to r12-7808-g319ba7e241e7e21f9eb481f075310796f13d2035 )
gcc/
PR analyzer/103533
* doc/invoke.texi (Static Analyzer Options): Move
@ignore block after @gccoptlist's '}' for 'make pdf'.
PR analyzer/104954 tracks that -fanalyzer was taking a very long time
on a particular source file in the Linux kernel:
drivers/gpu/drm/amd/display/dc/calcs/dce_calcs.c
One issue occurs with the repeated use of dynamic debug lines e.g. via
the DC_LOG_BANDWIDTH_CALCS macro, such as in print_bw_calcs_dceip in
drivers/gpu/drm/amd/display/dc/calcs/calcs_logger.h:
DC_LOG_BANDWIDTH_CALCS("#####################################################################");
DC_LOG_BANDWIDTH_CALCS("struct bw_calcs_dceip");
DC_LOG_BANDWIDTH_CALCS("#####################################################################");
[...snip dozens of lines...]
DC_LOG_BANDWIDTH_CALCS("[bw_fixed] dmif_request_buffer_size: %d",
bw_fixed_to_int(dceip->dmif_request_buffer_size));
When this is configured to use __dynamic_pr_debug, each of these becomes
code like:
do {
static struct _ddebug __attribute__((__aligned__(8)))
__attribute__((__section__("__dyndbg"))) __UNIQUE_ID_ddebug277 = {
[...snip...]
};
if (arch_static_branch(&__UNIQUE_ID_ddebug277.key, false))
__dynamic_pr_debug(&__UNIQUE_ID_ddebug277, [...the message...]);
} while (0);
The analyzer was naively seeing each call to __dynamic_pr_debug, noting
that the __UNIQUE_ID_nnnn object escapes. At each call, as successive
__UNIQUE_ID_nnnn objects escape, there are N escaped objects, and thus N
need clobbering, and so we have O(N^2) clobbering of escaped objects overall,
leading to huge amounts of pointless work: print_bw_calcs_data has 225
uses of DC_LOG_BANDWIDTH_CALCS, many of which are in loops.
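To get a feel for the quadratic growth, here is a deliberately simplified model (an assumption for illustration, not analyzer code; total_clobbers is a made-up name): each call lets one more object escape and then clobbers every object that has escaped so far.

```cpp
#include <cassert>

// Counts clobber operations when the k-th call clobbers all k objects
// that have escaped up to and including that call.
unsigned long total_clobbers(unsigned long n_calls) {
  unsigned long escaped = 0;
  unsigned long clobbers = 0;
  for (unsigned long call = 0; call < n_calls; ++call) {
    ++escaped;            // this call's __UNIQUE_ID object escapes
    clobbers += escaped;  // every escaped object gets clobbered
  }
  return clobbers;        // equals n_calls * (n_calls + 1) / 2
}
```

Under this model a single straight-line pass over 225 uses already gives total_clobbers(225) == 25425, before accounting for any loops.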
This patch adds a way to identify declarations that aren't interesting
to the analyzer, so that we don't attempt to create binding_clusters
for them (i.e. we don't store any state for them in our state objects).
This is implemented by adding a new region::tracked_p, implemented for
declarations by walking the existing IPA data the first time the
analyzer sees a declaration, setting it to false for global vars that
have no loads/stores/aliases, and "sufficiently safe" address-of
ipa-refs.
The patch gives a large speedup of -fanalyzer on the above kernel
source file:
                            Before  After
Total cc1 wallclock time:     180s    36s
analyzer wallclock time:      162s    17s
% spent in analyzer:           90%    47%
gcc/analyzer/ChangeLog:
PR analyzer/104954
* analyzer.opt (-fdump-analyzer-untracked): New option.
* engine.cc (impl_run_checkers): Handle it.
* region-model-asm.cc (region_model::on_asm_stmt): Don't attempt
to clobber regions with !tracked_p ().
* region-model-manager.cc (dump_untracked_region): New.
(region_model_manager::dump_untracked_regions): New.
(frame_region::dump_untracked_regions): New.
* region-model.h (region_model_manager::dump_untracked_regions):
New decl.
* region.cc (ipa_ref_requires_tracking): New.
(symnode_requires_tracking_p): New.
(decl_region::calc_tracked_p): New.
* region.h (region::tracked_p): New vfunc.
(frame_region::dump_untracked_regions): New decl.
(class decl_region): Note that this is also used for SSA names.
(decl_region::decl_region): Initialize m_tracked.
(decl_region::tracked_p): New.
(decl_region::calc_tracked_p): New decl.
(decl_region::m_tracked): New.
* store.cc (store::get_or_create_cluster): Assert that we
don't try to create clusters for base regions that aren't
trackable.
(store::mark_as_escaped): Don't mark base regions that we're not
tracking.
gcc/ChangeLog:
PR analyzer/104954
* doc/invoke.texi (Static Analyzer Options): Add
-fdump-analyzer-untracked.
gcc/testsuite/ChangeLog:
PR analyzer/104954
* gcc.dg/analyzer/asm-x86-dyndbg-1.c: New test.
* gcc.dg/analyzer/asm-x86-dyndbg-2.c: New test.
* gcc.dg/analyzer/many-unused-locals.c: New test.
* gcc.dg/analyzer/untracked-1.c: New test.
* gcc.dg/analyzer/unused-local-1.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
I have changed employers and need to withdraw as SLSR maintainer for now.
Adding myself under a new email address under the DCO section. Thanks!
2022-03-24 Bill Schmidt <bill.schmidt@gmail.com>
* MAINTAINERS: Change my information.
Since r9-6073 cxx_eval_store_expression preevaluates the value to
be stored, and that revealed a crash where a template code (here,
code=IMPLICIT_CONV_EXPR) leaks into cxx_eval*.
It happens because we're performing build_vec_init while processing
a template, which calls get_temp_regvar which creates an INIT_EXPR.
This INIT_EXPR's RHS contains an rvalue conversion so we create an
IMPLICIT_CONV_EXPR. Its operand is not type-dependent and the whole
INIT_EXPR is not type-dependent. So we call build_non_dependent_expr
which, with -fchecking=2, calls fold_non_dependent_expr. At this
point the expression still has an IMPLICIT_CONV_EXPR, which ought to
be handled in instantiate_non_dependent_expr_internal. However,
tsubst_copy_and_build doesn't handle INIT_EXPR; it will just call
tsubst_copy which does nothing when args is null. So we fail to
replace the IMPLICIT_CONV_EXPR and ICE.
The problem is that we call build_vec_init in a template in the
first place. We can avoid doing so by checking p_t_d before
calling build_aggr_init in check_initializer.
PR c++/104284
gcc/cp/ChangeLog:
* decl.cc (check_initializer): Don't call build_aggr_init in
a template.
gcc/testsuite/ChangeLog:
* g++.dg/cpp1y/constexpr-104284-1.C: New test.
* g++.dg/cpp1y/constexpr-104284-2.C: New test.
* g++.dg/cpp1y/constexpr-104284-3.C: New test.
* g++.dg/cpp1y/constexpr-104284-4.C: New test.
With the changes for PR81359 and PR88368 to make get_nsdmi errors be treated
as substitution failure, we have the problem that if we check
std::is_default_constructible for a complete class that still has unparsed
default member initializers, we get an answer (false) that will be wrong
once the DMIs have been parsed. The traits avoid this problem for regular
incomplete classes by giving an error if the operand is incomplete; we
should do the same if get_nsdmi is going to fail due to unparsed DMI.
PR c++/96645
gcc/cp/ChangeLog:
* cp-tree.h (type_has_default_ctor_to_be_synthesized): Declare.
* class.cc (type_has_default_ctor_to_be_synthesized): New.
(type_has_non_user_provided_default_constructor_1): Support it.
(type_has_non_user_provided_default_constructor): Now a wrapper.
* method.cc (complain_about_unparsed_dmi): New.
(constructible_expr): Call it.
gcc/testsuite/ChangeLog:
* g++.dg/ext/is_constructible3.C: Expect error.
* g++.dg/ext/is_constructible7.C: New test.
This is a crash where a FIX_TRUNC_EXPR gets into tsubst_copy_and_build
where it hits gcc_unreachable ().
The history of tsubst_copy_and_build/FIX_TRUNC_EXPR is such that it
was introduced in r181478, but it did the wrong thing, whereupon it
was turned into gcc_unreachable () in r258821 (see this thread:
<https://gcc.gnu.org/pipermail/gcc-patches/2018-March/495853.html>).
In a template, we should never create a FIX_TRUNC_EXPR (that's what
conv_unsafe_in_template_p is for). But in this test we are NOT in
a template when we call digest_nsdmi_init which ends up calling
convert_like, converting 1.0e+0 to int, so convert_to_integer_1
gives us a FIX_TRUNC_EXPR.
But then when we get to parsing f's parameters, we are in a template
when processing decltype(Helpers{}), and since r268321, when the
compound literal isn't instantiation-dependent and the type isn't
type-dependent, finish_compound_literal falls back to the normal
processing, so it calls digest_init, which does fold_non_dependent_init
and since the FIX_TRUNC_EXPR isn't dependent, we instantiate and
therefore crash in tsubst_copy_and_build.
The fateful call to fold_non_dependent_init comes from massage_init_elt.
We shouldn't be calling f_n_d_i on the result of get_nsdmi. This we can
avoid by eschewing calling f_n_d_i on CONSTRUCTORs; their elements have
already been folded.
PR c++/102990
gcc/cp/ChangeLog:
* typeck2.cc (massage_init_elt): Avoid folding CONSTRUCTORs.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/nsdmi-template22.C: New test.
* g++.dg/cpp0x/nsdmi-template23.C: New test.
Here we weren't respecting SFINAE when evaluating a call to a consteval
function, which caused us to reject the new testcase below. This patch
fixes this by making build_over_call use the SFINAE-friendly version of
cxx_constant_value.
This change causes us to no longer diagnose ahead of time a couple of
non-constant non-dependent consteval calls in consteval-if2.C with
-fchecking=2. These errors were apparently coming from the call to
fold_non_dependent_expr in build_non_dependent_expr (for the RHS of the +=)
despite complain=tf_none being passed. Now that build_over_call respects
the value of complain during constant evaluation of a consteval call,
the errors are gone.
That the errors are also gone without -fchecking=2 is a regression caused
by r12-7264-gc19f317a78c0e4 and is the subject of PR104620. As described
in comment #5, I think it's basically an accident that we were diagnosing
these two calls correctly before r12-7264, so perhaps we can live without
these errors for GCC 12. Thus this patch just XFAILs the two tests.
PR c++/104620
gcc/cp/ChangeLog:
* call.cc (build_over_call): Use cxx_constant_value_sfinae
instead of cxx_constant_value to evaluate a consteval call.
* constexpr.cc (cxx_constant_value_sfinae): Add decl parameter
and pass it to cxx_eval_outermost_constant_expr.
* cp-tree.h (cxx_constant_value_sfinae): Add decl parameter.
* pt.cc (fold_targs_r): Pass NULL_TREE as decl parameter to
cxx_constant_value_sfinae.
gcc/testsuite/ChangeLog:
* g++.dg/cpp23/consteval-if2.C: XFAIL two dg-error tests where
the argument to the non-constant non-dependent consteval call is
wrapped by NON_DEPENDENT_EXPR.
* g++.dg/cpp2a/consteval30.C: New test.
The copies of identifiers, intended to associate hardening SSA
temporaries to the original variables they refer to, end up causing
-fcompare-debug to fail, because DECL_UIDs are not identical, and the
nouid flag used in compare-debug dumps doesn't affect the uids in
naked identifiers, so the divergence becomes apparent.
This patch drops the naked identifiers. Though somewhat desirable,
they're not necessary.
for gcc/ChangeLog
PR debug/104564
* gimple-harden-conditionals.cc (detach_value): Keep temps
anonymous.
for gcc/testsuite/ChangeLog
PR debug/104564
* c-c++-common/torture/harden-comp.c: Adjust.
* c-c++-common/torture/harden-cond.c: Adjust.
If we harden a compare at the end of a block with an edge to the
abnormal dispatch block, it won't have a single successor. Arrange to
split the block at its final stmt so as to have a single succ.
for gcc/ChangeLog
PR middle-end/104975
* gimple-harden-conditionals.cc
(pass_harden_compares::execute): Force split in case of
multiple edges.
for gcc/testsuite/ChangeLog
PR middle-end/104975
* gcc.dg/pr104975.c: New.
On nvptx (using a Quadro K2000 with driver 470.103.01) I ran into this:
...
FAIL: gcc.dg/atomic/stdatomic-flag-2.c -O1 execution test
...
which minimized to:
...
#include <stdatomic.h>
atomic_flag a = ATOMIC_FLAG_INIT;
int main () {
if ((atomic_flag_test_and_set) (&a))
__builtin_abort ();
return 0;
}
...
The atomic_flag_test_and_set is implemented using __atomic_test_and_set_1,
which corresponds to the "word-sized compare-and-swap loop" version of
libat_test_and_set in libatomic/tas_n.c.
The semantics of a test-and-set is that the return value is "true if and only
if the previous contents were 'set'".
But the code uses:
...
return woldval != 0;
...
which means it doesn't look only at the byte that was either set or not set,
but at the entire word.
Fix this by using instead:
...
return (woldval & ((UTYPE) ~(UTYPE) 0 << shift)) != 0;
...
Tested on nvptx.
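The difference between the two checks can be reproduced in isolation. The sketch below is not the tas_n.c code: it assumes a 32-bit word holding four one-byte flags, the typedefs and function names are stand-ins, and the extra cast to the word type is added here only to keep the shift in unsigned arithmetic.

```cpp
#include <cassert>
#include <cstdint>

// Stand-ins for illustration: a one-byte flag inside a 32-bit word.
typedef std::uint8_t UTYPE;
typedef std::uint32_t UWORD;

// Old check: true if any byte of the containing word is nonzero.
bool was_set_buggy(UWORD woldval, unsigned shift) {
  (void) shift;  // the flag's position is ignored, which is the bug
  return woldval != 0;
}

// Fixed check: only the byte at bit offset `shift` is examined.
bool was_set_fixed(UWORD woldval, unsigned shift) {
  UWORD mask = (UWORD) (UTYPE) ~(UTYPE) 0 << shift;
  return (woldval & mask) != 0;
}
```

With a neighbouring byte set (woldval 0x00ff0000) and the flag at shift 0, the old check reports the flag as previously set while the masked check does not.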
libatomic/ChangeLog:
2022-03-24 Tom de Vries <tdevries@suse.de>
PR target/105011
* tas_n.c (libat_test_and_set): Fix return value.
On Tue, Mar 22, 2022 at 05:51:58PM +0100, Jakub Jelinek via Gcc wrote:
> I guess it would be nice to include the testcases we are talking about,
> like { float x; int : 0; float y; } and { float x; int : 0; } and
> { int : 0; float x; } into compat.exp testsuite so that we see ABI
> differences in compat testing.
Here is a patch that does that. It uses the struct-layout-1* framework,
but isn't generated because we don't want in this case pseudo-random
structure layouts, but particular ones we know cause or could cause problems
on some targets. If other problematic cases are discovered, we can add
further ones.
Tested on x86_64-linux with:
make check-gcc check-g++ RUNTESTFLAGS='ALT_CC_UNDER_TEST=gcc ALT_CXX_UNDER_TEST=g++ compat.exp=pr102*'
and with
make check-gcc check-g++ RUNTESTFLAGS='compat.exp=pr102*'
The former as expected has:
FAIL: gcc.dg/compat/pr102024 c_compat_x_tst.o-c_compat_y_alt.o execute
FAIL: gcc.dg/compat/pr102024 c_compat_x_alt.o-c_compat_y_tst.o execute
These fail because on x86_64 we've changed the C ABI but kept the C++ ABI here.
On rs6000, for example, it should instead be the corresponding g++.dg tests
that fail (all assuming the alt gcc/g++ is GCC 4.5 through 11).
2022-03-24 Jakub Jelinek <jakub@redhat.com>
PR target/102024
* gcc.dg/compat/pr102024_main.c: New test.
* gcc.dg/compat/pr102024_test.h: New test.
* gcc.dg/compat/pr102024_x.c: New test.
* gcc.dg/compat/pr102024_y.c: New test.
* g++.dg/compat/pr102024_main.C: New test.
* g++.dg/compat/pr102024_test.h: New test.
* g++.dg/compat/pr102024_x.C: New test.
* g++.dg/compat/pr102024_y.C: New test.
As mentioned in the PR, operand_equal_p already contains some hacks so that
it can be called already on pre-instantiation C++ trees from templates,
but the recent change to compare DECL_FIELD_OFFSET in the COMPONENT_REF
case broke this. Many such COMPONENT_REFs are already punted on earlier
because they have NULL TREE_TYPE, but in this case the code knows what
type they have but still uses an IDENTIFIER_NODE as second operand
of COMPONENT_REF (I think SCOPE_REF is something that could be used too).
The following patch looks at those DECL_FIELD_*OFFSET fields only if
both field[01] args are FIELD_DECLs and otherwise keeps it to the
earlier OP_SAME (1) check that guards this whole block.
2022-03-24 Jakub Jelinek <jakub@redhat.com>
PR c++/105035
* fold-const.cc (operand_equal_p) <case COMPONENT_REF>: If either
field0 or field1 is not a FIELD_DECL, return false.
* g++.dg/warn/Wduplicated-cond2.C: New test.
When the serial port is closed, we need to ensure that the port handle is
properly reset for it to be detected as closed.
gcc/ada/
PR ada/104767
* libgnat/g-sercom__mingw.adb (Close): Reset port handle to -1.
* libgnat/g-sercom__linux.adb (Close): Likewise.
When changing the predcom pass to use auto_vec leaks were introduced by
failing to replace deallocation with C++ delete. The following does
this. It also fixes leaks in vectorization and range folding.
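Why the allocation style matters can be shown with a small stand-in (a sketch, not the tree-predcom.cc code: Buffer plays the role of auto_vec, and Chain, live_buffers and the helper functions are hypothetical). An object with a non-trivial member only releases that member's storage if its destructor runs, which C++ delete guarantees and a C-style deallocation does not.

```cpp
#include <cassert>
#include <vector>

static int live_buffers = 0;

// Stand-in for auto_vec: owns storage that is released in its destructor.
struct Buffer {
  std::vector<int> data;
  Buffer() : data(16) { ++live_buffers; }
  ~Buffer() { --live_buffers; }
};

// Stand-in for a predcom chain with a non-trivial member.
struct Chain {
  Buffer refs;
};

// Number of live buffers while a Chain created with new exists.
int live_while_allocated() {
  Chain *c = new Chain;
  int n = live_buffers;
  delete c;  // runs ~Chain and therefore ~Buffer
  return n;
}

// Number of live buffers after new/delete; zero means nothing leaked.
int live_after_delete() {
  Chain *c = new Chain;
  delete c;
  return live_buffers;
}
```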
2022-03-24 Richard Biener <rguenther@suse.de>
* tree-predcom.cc (chain::chain): Add CTOR.
(component::component): Likewise.
(pcom_worker::release_chain): Use delete.
(release_components): Likewise.
(pcom_worker::filter_suitable_components): Likewise.
(pcom_worker::split_data_refs_to_components): Use new.
(make_invariant_chain): Likewise.
(make_rooted_chain): Likewise.
(pcom_worker::combine_chains): Likewise.
* tree-vect-loop.cc (vect_create_epilog_for_reduction):
Make sure to release previously constructed scalar_results.
* tree-vect-stmts.cc (vectorizable_load): Use auto_vec
for vec_offsets.
* vr-values.cc (simplify_using_ranges::~simplify_using_ranges):
Release m_flag_set_edges.
Limit object size computation only to the simple case where access
attribute has been explicitly specified. The object passed to
__builtin_dynamic_object_size could either be a pointer or a VLA whose
size has been described using access attribute.
Further, return a valid size only if the object is a void * pointer or
points to (or is a VLA of) a type that has a constant size.
gcc/ChangeLog:
PR tree-optimization/104970
* tree-object-size.cc (parm_object_size): Restrict size
computation scenarios to explicit access attributes.
gcc/testsuite/ChangeLog:
PR tree-optimization/104970
* gcc.dg/builtin-dynamic-object-size-0.c (test_parmsz_simple2,
test_parmsz_simple3, test_parmsz_extern, test_parmsz_internal,
test_parmsz_internal2, test_parmsz_internal3): New tests.
(main): Use them.
Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>
C++14 to C++20 apparently should allow extern thread_local declarations in
constexpr functions, however useless they are there (because accessing
such vars is not valid in a constant expression, except perhaps in
sizeof/decltype).
P2242 changed that for C++23 to passing through the declaration, but
https://cplusplus.github.io/CWG/issues/2552.html
was filed for it yesterday.
The following patch implements the proposed wording of CWG 2552 in addition
to fixing the C++14 - C++20 handling bug.
If you'd like instead to keep the current pedantic C++23 wording for now,
that would mean taking out the first hunk (cxx_eval_constant_expression) and
g++.dg/cpp23/constexpr-nonlit2.C hunk.
2022-03-24 Jakub Jelinek <jakub@redhat.com>
PR c++/104994
* constexpr.cc (cxx_eval_constant_expression): Don't diagnose passing
through extern thread_local declarations. Change wording from
declaration to definition.
(potential_constant_expression_1): Don't diagnose extern thread_local
declarations. Change wording from declared to defined.
* decl.cc (start_decl): Likewise.
* g++.dg/diagnostic/constexpr1.C: Change expected diagnostic wording
from declared to defined.
* g++.dg/cpp23/constexpr-nonlit1.C: Likewise.
(garply): Change dg-error into dg-bogus.
* g++.dg/cpp23/constexpr-nonlit2.C: Change expected diagnostic wording
from declaration to definition.
* g++.dg/cpp23/constexpr-nonlit6.C: Change expected diagnostic wording
from declared to defined.
* g++.dg/cpp23/constexpr-nonlit7.C: New test.
* g++.dg/cpp2a/constexpr-try5.C: Change expected diagnostic wording
from declared to defined.
* g++.dg/cpp2a/consteval3.C: Likewise.
If an overloaded built-in function instance requires a data
type which isn't defined on the target, its fntype is
initialized as NULL. This patch takes this possibility
into account in function find_instance, as shown in
PR104967.
PR target/104967
gcc/ChangeLog:
* config/rs6000/rs6000-c.cc (find_instance): Skip instances with null
function types.
PR analyzer/104979 reports a leak false positive when handling an
interprocedural return to a caller:
LHS = CALL(ARGS);
where the LHS is a certain non-trivial compound expression.
The root cause is that parts of the LHS were being erroneously
evaluated with respect to the stack frame of the called function,
rather than that of the caller. When LHS contained a local variable
within the caller as part of certain nested expressions, this local
variable was looked for within the called frame, rather than that of the
caller. This lookup in the wrong stack frame led to the local variable
being treated as uninitialized, and thus the write to LHS was considered
as writing to a garbage location, leading to the return value being
lost, and thus being considered as a leak.
The region_model code uses the analyzer's path_var class to try to
extend the tree type with stack depth information. Based on the above,
I think that the path_var class is fundamentally broken, but it's used
in a few other places in the analyzer, so I don't want to rip it out
until the next stage 1.
In the meantime, this patch reworks how region_model::pop_frame works so
that the destination region for an interprocedural return value is
computed after the frame is popped, so that the region_model has the
stack frame for the *caller* at that point. Doing so fixes the issue.
I attempted a more ambitious fix which moved the storing of the return
svalue into the destination region from region_model::pop_frame into
region_model::update_for_return_gcall, with pop_frame returning the
return svalue. Unfortunately, this regressed g++.dg/analyzer/pr93212.C,
which returns a pointer into a stale frame.
unbind_region_and_descendents and poison_any_pointers_to_descendents are
only set up to poison regions with bindings into the stale frame, not
individual svalues, and updating that became more invasive than I'm
comfortable with in stage 4.
The patch also adds assertions to verify that we have the correct
function when looking up locals/SSA names in a stack frame. There
doesn't seem to be a general-purpose way to get at the function of an
SSA name, so the assertions go from SSA name to def-stmt to basic_block,
and from there use the analyzer's supergraph to get the function from
the basic_block. If there's a simpler way to do this, please let me know.
gcc/analyzer/ChangeLog:
PR analyzer/104979
* engine.cc (impl_run_checkers): Create the engine after the
supergraph, and pass the supergraph to the engine.
* region-model.cc (region_model::get_lvalue_1): Pass ctxt to
frame_region::get_region_for_local.
(region_model::update_for_return_gcall): Pass the lvalue for the
result to pop_frame as a tree, rather than as a region.
(region_model::pop_frame): Update for above change, determining
the destination region after the frame is popped and thus with
respect to the caller frame rather than the called frame.
Likewise, set the value of the region to the return value after
the frame is popped.
(engine::engine): Add supergraph pointer.
(selftest::test_stack_frames): Set the DECL_CONTEXT of PARM_DECLs.
(selftest::test_get_representative_path_var): Likewise.
(selftest::test_state_merging): Likewise.
* region-model.h (region_model::pop_frame): Convert first param
from a const region * to a tree.
(engine::engine): Add param "sg".
(engine::m_sg): New field.
* region.cc: Include "analyzer/sm.h" and
"analyzer/program-state.h".
(frame_region::get_region_for_local): Add "ctxt" param.
Add assertions that VAR_DECLs are locals, and that expr is for the
correct function.
* region.h (frame_region::get_region_for_local): Add "ctxt" param.
gcc/testsuite/ChangeLog:
PR analyzer/104979
* gcc.dg/analyzer/boxed-malloc-1-29.c: Deleted test, moving the
now fixed test_29 to...
* gcc.dg/analyzer/boxed-malloc-1.c: ...here.
* gcc.dg/analyzer/stale-frame-1.c: Add test coverage.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Patrick suggested a way to implement the designated-init handling without
(temporarily) modifying the CONSTRUCTOR being reshaped.
PR c++/103337
gcc/cp/ChangeLog:
* decl.cc (reshape_single_init): New.
(reshape_init_class): Use it.
Checking dependent_type_p avoids needing to walk the overloads in cases
where it would not be possible to find a dependent using.
PR c++/105006
gcc/cp/ChangeLog:
* name-lookup.cc (lookup_using_decl): Check that scope is
a dependent type before looking for dependent using.
MinGW does not like a call to 'stat' for './' via gfc_do_check_include_dir.
Solution: Only append '/' when concatenating the path with the filename.
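The idea can be sketched as below. This is an illustration, not the scanner.cc code, and join_include_path is a hypothetical helper: the stored directory carries no trailing separator, so it can be passed to 'stat' unchanged, and the '/' appears only when the full file name is built.

```cpp
#include <cassert>
#include <string>

// The directory is stored without a trailing '/'; the separator is
// added only at the point where directory and file name are joined.
std::string join_include_path(const std::string &dir,
                              const std::string &file) {
  return dir + "/" + file;
}
```

For example, join_include_path(".", "foo.inc") yields "./foo.inc" while the directory itself stays ".".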
gcc/fortran/ChangeLog:
PR fortran/103560
* scanner.cc (add_path_to_list): Don't append '/' to the
save include path.
(open_included_file): Use '/' in concatenating path + file name.
* module.cc (gzopen_included_file_1): Likewise.
gcc/testsuite/ChangeLog:
PR fortran/103560
* gfortran.dg/include_14.f90: Update dg-warning.
* gfortran.dg/include_17.f90: Likewise.
* gfortran.dg/include_18.f90: Likewise.
* gfortran.dg/include_6.f90: Update dg-*.
The following extends the heuristical memcpy folding path with the
ability to use misaligned accesses on strict-alignment targets just
like the size-based path does. That avoids regressing the following
testcase on arm
uint64_t bar64(const uint8_t *rData1)
{
  uint64_t buffer;
  memcpy(&buffer, rData1, sizeof(buffer));
  return buffer;
}
when r12-3482-g5f6a6c91d7c592 is reverted.
2022-03-23 Richard Biener <rguenther@suse.de>
PR target/102125
* gimple-fold.cc (gimple_fold_builtin_memory_op): Allow the
use of movmisalign when either the source or destination
decl is properly aligned.
form_threads_from_copies processes a sorted array of copies, skipping
those with the same thread and conflicting threads and merging the
first non-conflicting ones. After that it terminates the loop and
gathers the remaining elements of the array, skipping same thread
copies, re-starting the process. For a large number of copies this
gathering of the rest takes considerable time and it also appears
pointless. The following simply continues processing the array
which should be equivalent as far as I can see.
This takes form_threads_from_copies off the profile radar from
previously taking ~50% of the compile-time.
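The single-pass shape described above can be sketched as follows. This is
a simplified model under stated assumptions, not the ira-color.cc code:
CopySketch, form_threads_sketch and the nodes_conflict predicate are
hypothetical, threads are plain integer ids, and the conflict test is a
naive scan over all members.

```cpp
#include <functional>
#include <vector>

// Hypothetical stand-in for a copy between two allocnos (node indices).
struct CopySketch { int first; int second; };

// Walk the already-sorted copies once. Copies whose ends share a thread or
// whose threads conflict are skipped; otherwise the two threads are merged
// and processing simply continues with the next copy.
std::vector<int> form_threads_sketch(
    int n_nodes,
    const std::vector<CopySketch>& sorted_copies,
    const std::function<bool(int, int)>& nodes_conflict)
{
  std::vector<int> thread(n_nodes);
  for (int i = 0; i < n_nodes; ++i)
    thread[i] = i;  // every node starts in its own thread

  for (const CopySketch& cp : sorted_copies) {
    const int a = thread[cp.first];
    const int b = thread[cp.second];
    if (a == b)
      continue;  // same thread: nothing to do

    // Two threads conflict if any pair of their members conflicts.
    bool conflict = false;
    for (int x = 0; x < n_nodes && !conflict; ++x)
      for (int y = 0; y < n_nodes && !conflict; ++y)
        if (thread[x] == a && thread[y] == b && nodes_conflict(x, y))
          conflict = true;
    if (conflict)
      continue;  // conflicting threads: skip this copy

    for (int& t : thread)
      if (t == b)
        t = a;  // merge thread b into thread a
  }
  return thread;
}
```

No tail of the array is gathered or re-sorted between merges; each copy is
looked at exactly once in sorted order.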
2022-03-23 Richard Biener <rguenther@suse.de>
PR rtl-optimization/105028
* ira-color.cc (form_threads_from_copies): Remove unnecessary
copying of the sorted_copies tail.
Here, DECL_DEPENDENT_P was false for the second using because Row<eT> is
"the current instantiation", so lookup succeeds. But since Row itself has a
dependent using-decl for operator(), the set of functions imported by the
second using is dependent, so we should set the flag.
PR c++/105006
gcc/cp/ChangeLog:
* name-lookup.cc (lookup_using_decl): Set DECL_DEPENDENT_P if lookup
finds a dependent using.
gcc/testsuite/ChangeLog:
* g++.dg/template/using30.C: New test.
gcc/analyzer/ChangeLog:
PR analyzer/105017
* sm-taint.cc (taint_diagnostic::subclass_equal_p): Check
m_has_bounds as well as m_arg.
(tainted_allocation_size::subclass_equal_p): Chain up to base
class implementation. Also check m_mem_space.
(tainted_allocation_size::emit): Add note showing stack-based vs
heap-based allocations.
gcc/testsuite/ChangeLog:
PR analyzer/105017
* gcc.dg/analyzer/taint-alloc-1.c: Add expected messages relating
to heap vs stack.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
gcc/analyzer/ChangeLog:
PR analyzer/104997
* diagnostic-manager.cc (diagnostic_manager::add_diagnostic):
Convert return type from "void" to "bool", reporting success vs
failure to caller, for both overloads.
* diagnostic-manager.h (diagnostic_manager::add_diagnostic):
Likewise.
* engine.cc (impl_region_model_context::warn): Propagate return
value from diagnostic_manager::add_diagnostic.
gcc/testsuite/ChangeLog:
PR analyzer/104997
* gcc.dg/analyzer/write-to-string-literal-4-disabled.c: New test,
adapted from write-to-string-literal-4.c.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Our std::bit_cast was relying on the compiler to check for errors inside
__builtin_bit_cast, instead of checking them as constraints. That means
std::bit_cast was not SFINAE-friendly.
This fix uses a requires-clause, so for old versions of Clang without
concepts support the function will still be unconstrained. At some point
in the future we can remove the #ifdef __cpp_concepts check and rely on
all compilers having full concepts support in C++20 mode.
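To illustrate what SFINAE-friendly means here, a minimal sketch follows.
It is not the libstdc++ implementation: sketch_bit_cast, can_bit_cast and
kSketchConstrained are hypothetical names, the exact constraints are an
assumption modelled on the requirements of std::bit_cast, and
__builtin_bit_cast is assumed to be available (recent GCC or Clang).

```cpp
#include <cstdint>
#include <type_traits>
#include <utility>

#ifdef __cpp_concepts
constexpr bool kSketchConstrained = true;
#else
constexpr bool kSketchConstrained = false;
#endif

// Sketch only: with concepts the requirements are part of the declaration,
// so an invalid call is a substitution failure, not a hard error in the body.
template<typename To, typename From>
#ifdef __cpp_concepts
  requires (sizeof(To) == sizeof(From))
    && std::is_trivially_copyable_v<To>
    && std::is_trivially_copyable_v<From>
#endif
constexpr To sketch_bit_cast(const From& from) noexcept
{
  return __builtin_bit_cast(To, from);
}

// Detection idiom: true if a call to sketch_bit_cast<To>(From) is well-formed
// as far as the declaration is concerned.
template<typename To, typename From, typename = void>
struct can_bit_cast : std::false_type { };

template<typename To, typename From>
struct can_bit_cast<To, From,
    std::void_t<decltype(sketch_bit_cast<To>(std::declval<const From&>()))>>
  : std::true_type { };
```

With the requires-clause active, can_bit_cast<std::uint64_t, float>::value
is false. Without it the trait wrongly reports true, because the size
mismatch is only diagnosed once the function body is instantiated.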
libstdc++-v3/ChangeLog:
PR libstdc++/105027
* include/std/bit (bit_cast): Add constraints.
* testsuite/26_numerics/bit/bit.cast/105027.cc: New test.
gcc/ChangeLog:
* config/rs6000/rs6000-c.cc (altivec_resolve_overloaded_builtin):
Use %qs in format.
* config/rs6000/rs6000.cc (rs6000_option_override_internal):
Reword the error message.
Some C++17 and C++20 feature test macros are only defined in <version>
for hosted builds, even though the features are supported for
freestanding.
All C++23 feature test macros are defined in <version> for freestanding,
but most of the features are only supported for hosted.
libstdc++-v3/ChangeLog:
* include/std/version [!_GLIBCXX_HOSTED]
(__cpp_lib_hardware_interference_size): Define for freestanding.
(__cpp_lib_bit_cast): Likewise.
(__cpp_lib_is_layout_compatible): Likewise.
(__cpp_lib_is_pointer_interconvertible): Likewise.
(__cpp_lib_adaptor_iterator_pair_constructor): Do not define for
freestanding.
(__cpp_lib_invoke_r): Likewise.
(__cpp_lib_ios_noreplace): Likewise.
(__cpp_lib_monadic_optional): Likewise.
(__cpp_lib_move_only_function): Likewise.
(__cpp_lib_spanstream): Likewise.
(__cpp_lib_stacktrace): Likewise.
(__cpp_lib_string_contains): Likewise.
(__cpp_lib_string_resize_and_overwrite): Likewise.
(__cpp_lib_to_underlying): Likewise.
We use either condition variables or futexes to implement atomic waits,
so we can't do it in freestanding. This is non-conforming, so should be
revisited later, probably by making freestanding atomic waiting
operations spin without ever blocking.
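A spinning wait of the kind suggested above might look roughly like this.
It is only a sketch of one possible direction, not code from libstdc++;
spin_wait_sketch is a hypothetical name.

```cpp
#include <atomic>

// Hypothetical sketch: wait until the atomic no longer holds 'old' by
// re-loading it in a loop. It never blocks, so it needs neither futexes
// nor condition variables.
template<typename T>
T spin_wait_sketch(const std::atomic<T>& obj, T old,
                   std::memory_order order = std::memory_order_seq_cst)
{
  T cur = obj.load(order);
  while (cur == old)
    cur = obj.load(order);  // busy-wait until another thread stores a new value
  return cur;
}
```

The cost is that a waiting thread burns CPU time instead of sleeping.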
Reviewed-by: Thomas Rodgers <trodgers@redhat.com>
libstdc++-v3/ChangeLog:
PR libstdc++/105021
* include/bits/atomic_base.h [!_GLIBCXX_HOSTED]: Do not include
<bits/atomic_wait.h> for freestanding.
This test is dg-do run and invokes UB when these rotate functions
are called with 0 as the second argument. Some other tests do the same,
but they are dg-do compile only and do not even call those functions,
so IMHO it doesn't matter that they are only well defined for [1,127]
and not [0,127].
The following patch fixes it; we pattern recognize both forms as rotates
and emit identical assembly.
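The same idiom on a 64-bit type, as a small sketch (the testcase itself
works on 128-bit types with -i&127; the 64-bit functions below are
illustrative and not part of the test):

```cpp
#include <cstdint>

// The form (x >> i) | (x << (64 - i)) shifts by 64 when i == 0, which is
// undefined. Masking the negated count gives 0 for i == 0 and 64 - i for
// i in [1,63], so both shifts stay in range.
inline std::uint64_t rotr64(std::uint64_t x, unsigned i)
{
  return (x >> i) | (x << (-i & 63));  // valid for i in [0,63]
}

inline std::uint64_t rotl64(std::uint64_t x, unsigned i)
{
  return (x << i) | (x >> (-i & 63));  // valid for i in [0,63]
}
```

For i == 0 both shifts are by zero and the result is x itself.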
2022-03-23 Jakub Jelinek <jakub@redhat.com>
PR target/102986
* gcc.target/i386/sse2-v1ti-shift-3.c (rotr_v1ti, rotl_v1ti, rotr_ti,
rotl_ti): Use -i&127 instead of 128-i to avoid UB on i == 0.
gcc/lto/ChangeLog:
PR middle-end/104285
* lto-partition.cc (maybe_rewrite_identifier): Use get_identifier
for the returned string to be usable as hash key.
(validize_symbol_for_target): Hence, use return value directly.
(privatize_symbol_name_1): Track maybe_rewrite_identifier renames.
* lto.cc (offload_handle_link_vars): Move function up before ...
(do_whole_program_analysis): Call it after static renamings.
(lto_main): Move call after static renamings.
libgomp/ChangeLog:
PR middle-end/104285
* testsuite/libgomp.c++/target-same-name-2-a.C: New test.
* testsuite/libgomp.c++/target-same-name-2-b.C: New test.
* testsuite/libgomp.c++/target-same-name-2.C: New test.
* testsuite/libgomp.c-c++-common/target-same-name-1-a.c: New test.
* testsuite/libgomp.c-c++-common/target-same-name-1-b.c: New test.
* testsuite/libgomp.c-c++-common/target-same-name-1.c: New test.