gcc/ada/
* sem_ch3.adb (Analyze_Object_Declaration): If the type is an
Unchecked_Union, and the expression is an aggregate, complete
the analysis and resolution of the aggregate, and treat like a
regular object declaration, instead of as a renaming
declaration.
gcc/ada/
* exp_ch9.adb (Is_Potentially_Large_Family): Add documentation.
(Actual_Index_Expression): Use Entry_Index_Type.
(Build_Entry_Count_Expression): Likewise.
(Build_Find_Body_Index): Likewise.
(Collect_Entry_Families): Likewise. Use directly the bounds of
the index type to find out whether the family is large.
(Entry_Index_Expression): Likewise.
gcc/ada/
* exp_aggr.adb (Aggr_Assignment_OK_For_Backend): Move to library
level and use a new predicate Is_OK_Aggregate to recognize the
aggregates suitable for direct assignment by the back-end.
(Convert_Array_Aggr_In_Allocator): If neither in CodePeer mode nor
generating C code, generate a direct assignment instead of further
expanding if Aggr_Assignment_OK_For_Backend returns true.
gcc/ada/
* sem_aux.adb: Add a with clause for Nlists.
(Nearest_Ancestor): Test for the case of concurrent
types (testing for both Is_Concurrent_Type and
Is_Concurrent_Record_Type), and return the first ancestor in the
Interfaces list if present (otherwise will return Empty if no
interfaces).
* sem_ch13.adb (Build_Predicate_Functions): Add a ??? comment
about missing handling for adding predicates when they can be
inherited from multiple progenitors.
The following patch adds support for three-input addition instructions to
the nvptx backend. The PTX ISA's "vadd.u32.u32.u32.add d, a, b, c"
instruction effectively implements 32-bit d = a+b+c, and the
"vsub.u32.u32.u32 d,a,b,c" instruction provides 32-bit d = (a-b)+c.
The hope is that these mnemonics help ptxas generate the low-level
hardware's IADD3 instruction.
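
As a rough illustration (not taken from the new tests, whose contents are
not shown here), these are the kinds of source-level expressions the two
patterns are meant to cover. The helper names add3 and sub_add are made up
for this sketch; whether a given build actually emits vadd/vsub for them
depends on the backend and on ptxas.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: 32-bit d = a + b + c, the shape *vadd_addsi4 targets.
constexpr std::uint32_t add3(std::uint32_t a, std::uint32_t b, std::uint32_t c)
{
  return a + b + c;
}

// Hypothetical helper: 32-bit d = (a - b) + c, the shape *vsub_addsi4 targets.
constexpr std::uint32_t sub_add(std::uint32_t a, std::uint32_t b, std::uint32_t c)
{
  return (a - b) + c;
}

// Unsigned 32-bit arithmetic wraps modulo 2^32 in both forms.
static_assert(add3(1, 2, 3) == 6, "plain three-input add");
static_assert(sub_add(10, 4, 1) == 7, "subtract then add");
static_assert(add3(0xffffffffu, 1, 1) == 1, "wraps modulo 2^32");
```
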
2020-07-06 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog:
* config/nvptx/nvptx.md (*vadd_addsi4): New instruction.
(*vsub_addsi4): New instruction.
gcc/testsuite/ChangeLog:
* gcc.target/nvptx/vadd_add.c: New test.
* gcc.target/nvptx/vsub_add.c: New test.
Combine likes to change a zero-extension / and + shift as seen
in the test-case source to a logical shift followed by an and of
the shifted mask, like:
lsrq 1,r0
and.d 0x7f,r0
This was observed in the hot loop of coremark crcu16 and crcu32,
when doing other changes affecting instruction selection. While
fixable by other means (like instruction costs or combine
patches), I wanted to break this out from those "other means".
The similarity to extant peephole optimizations is not
deliberate.
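
The two forms compute the same value, which is why combine is free to pick
either one. A small sketch of the equivalence for a byte-sized
zero-extension and a shift count of 1, matching the 0x7f mask in the
assembly above; the function names are invented for this example and are
not from the test-case.

```cpp
#include <cassert>

// Form as written in source: zero-extend the low byte, then shift right.
constexpr unsigned extend_then_shift(unsigned x)
{
  return static_cast<unsigned char>(x) >> 1;
}

// Form produced by combine: logical shift first, then AND with the
// shifted mask (0xff >> 1 == 0x7f).
constexpr unsigned shift_then_mask(unsigned x)
{
  return (x >> 1) & 0x7f;
}

// Check that both forms agree for every possible low byte, with some
// arbitrary bits set above the byte to show they are ignored.
bool forms_agree_for_all_bytes()
{
  for (unsigned b = 0; b < 256; ++b)
    {
      unsigned x = 0xabcd00u | b;
      if (extend_then_shift(x) != shift_then_mask(x))
        return false;
    }
  return true;
}
```
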
I noticed some paths to other peephole2 test-cases have changed
due to moves and renaming, so I updated them.
gcc:
* config/cris/cris.md (movulsr): New peephole2.
gcc/testsuite:
* gcc.target/cris/peep2-movulsr.c: New test.
Yet another misnumbering of operands: the asserted non-overlap
would be the only benign operands overlap. "Suddenly" exposed
by g++.dg/cpp0x/pr81325.C when testing unrelated changes
affecting register allocation.
To wit, operands 2 and 1 are the only ones that are safe for
overlap, it's only that it doesn't seem to make much sense to
write the address of the atomic data as the atomic data.
gcc:
* config/cris/sync.md ("cris_atomic_fetch_<atomic_op_name><mode>_1"):
Correct gcc_assert of overlapping operands.
The code in cris_select_cc_mode for selecting CC_NZmode was
partly inconsistent with the comment and partly seemed
ambiguous. I couldn't find a reason why I qualified selection
of CC_NZmode on the setting operation once a matching user was
spotted, so I just removed that. The cris.c update was due to
observing the new test-case failing; the CC_NZmode compare
wasn't eliminated.
The recently re-instated adds/addu/subs/subu/bound patterns are
rewritten to replace the use of match_operator with iterators.
gcc:
* config/cris/cris.c (cris_select_cc_mode): Always return
CC_NZmode for matching comparisons. Clarify comments.
* config/cris/cris-modes.def: Clarify mode comment.
* config/cris/cris.md (plusminus, plusminusumin, plusumin): New
code iterators.
(addsub, addsubbo, nd): New code iterator attributes.
("*<addsub><su>qihi"): Rename from "*extopqihi". Use code
iterator constructs instead of match_operator constructs.
("*<addsubbo><su><nd><mode>si<setnz>"): Similar from
"*extop<mode>si<setnz>".
("*add<su>qihi_swap"): Similar from "*addxqihi_swap".
("*<addsubbo><su><nd><mode>si<setnz>_swap"): Similar from
"*extop<mode>si<setnz>_swap".
gcc/testsuite:
* gcc.target/cris/pr93372-39.c: New test.
When cleaning out the multitude of patterns with unknown
coverage, this one went the way of the bathwater. Its use is
barely common enough to mark when diffing libgcc, and has a
minimal impact on performance-testsuites. Anyway, reinstated
with a couple of test-cases. It's suboptimal of gcc-core not to
make use of the SImode pattern when performing HImode; see the
FIXME (which is actually also reinstated).
This version uses match_operator for continuity, but will be
replaced with a version making use of iterators (like it does
for the mode).
gcc:
* config/cris/cris.md ("*extopqihi", "*extop<mode>si<setnz>_swap")
("*extop<mode>si<setnz>", "*addxqihi_swap"): Reinstate.
gcc/testsuite:
* gcc.target/cris/pr93372-36.c, gcc.target/cris/pr93372-37.c,
gcc.target/cris/pr93372-38.c: New tests.
Apart from calling gfc_compare_interfaces to check interfaces against
global identifiers, this also sets and checks a few sym->error flags
to avoid duplicate error messages. I thought about issuing errors
on mismatched interfaces, but when the procedure is not invoked,
a warning should be enough to alert the user.
gcc/fortran/ChangeLog:
PR fortran/27318
* frontend-passes.c (check_against_globals): New function.
(gfc_check_externals): Split; also invoke check_against_globals
via gfc_traverse_ns.
(gfc_check_externals0): Recursive part formerly in
gfc_check_externals.
* resolve.c (resolve_global_procedure): Set sym->error on
interface mismatch.
* symbol.c (ambiguous_symbol): Check for, and set sym->error.
gcc/testsuite/ChangeLog:
PR fortran/27318
* gfortran.dg/error_recovery_1.f90: Adjust test case.
* gfortran.dg/use_15.f90: Likewise.
* gfortran.dg/interface_47.f90: New test.
The test was committed with a placeholder name; this
renames it as described in the PR.
gcc/testsuite/ChangeLog:
* g++.dg/coroutines/pr9xxxx-mismatched-traits-and-promise-prev.C: Moved to...
* g++.dg/coroutines/pr94760-mismatched-traits-and-promise-prev.C: ...here.
The GIMPLE store merging pass doesn't merge STRING_CSTs in the general
case, although they are accepted by native_encode_expr; the reason is
that the pass only works with integral modes, i.e. with chunks whose
size is a power of two.
There are two possible ways of extending it to handle STRING_CSTs:
1) lift the condition of integral modes and treat STRING_CSTs as
other _CST nodes but with arbitrary size; 2) implement a specific
and separate handling for STRING_CSTs.
The attached patch implements 2) for the following reasons: on the
one hand, even in Ada where character strings are first-class citizens,
cases where merging STRING_CSTs with other *_CST nodes would be possible
are quite rare in practice; on the other hand, string concatenations
happen more naturally and frequently thanks to the "&" operator, giving
rise to merging opportunities.
gcc/ChangeLog:
* gimple-fold.c (gimple_fold_builtin_memory_op): Fold calls that
were initially created for the assignment of a variable-sized
object and whose source is now a string constant.
* gimple-ssa-store-merging.c (struct merged_store_group): Document
STRING_CST for rhs_code field.
Add string_concatenation boolean field.
(merged_store_group::merged_store_group): Initialize it as well as
bit_insertion here.
(merged_store_group::do_merge): Set it upon seeing a STRING_CST.
Also set bit_insertion here upon seeing a BIT_INSERT_EXPR.
(merged_store_group::apply_stores): Clear it for small regions.
Do not create a power-of-2-sized buffer if it is still true.
And do not set bit_insertion here again.
(encode_tree_to_bitpos): Deal with BLKmode for the expression.
(merged_store_group::can_be_merged_into): Deal with STRING_CST.
(imm_store_chain_info::coalesce_immediate_stores): Set bit_insertion
to true after changing MEM_REF stores into BIT_INSERT_EXPR stores.
(count_multiple_uses): Return 0 for STRING_CST.
(split_group): Do not split the group for a string concatenation.
(imm_store_chain_info::output_merged_store): Constify and rename
some local variables. Build an array type as destination type
for a string concatenation, as well as a zero mask, and call
build_string to build the source.
(lhs_valid_for_store_merging_p): Return true for VIEW_CONVERT_EXPR.
(pass_store_merging::process_store): Accept STRING_CST on the RHS.
* gimple.h (gimple_call_alloca_for_var_p): New accessor function.
* gimplify.c (gimplify_modify_expr_to_memcpy): Set alloca_for_var.
* tree.h (CALL_ALLOCA_FOR_VAR_P): Document it for BUILT_IN_MEMCPY.
gcc/testsuite/ChangeLog:
* gnat.dg/opt87.adb: New test.
* gnat.dg/opt87_pkg.ads: New helper.
* gnat.dg/opt87_pkg.adb: Likewise.
PR 96040 revealed that IPA-SRA, when checking whether an intended split is
the same as the one in a called function, does not also check if the
types match, and the transformation code does not handle any resulting
type mismatches. This patch simply avoids the split in the case
of mismatches, so that we do not have to be careful about invalid
floating-point values being passed in floating point registers and
related issues.
gcc/ChangeLog:
2020-07-03 Martin Jambor <mjambor@suse.cz>
PR ipa/96040
* ipa-sra.c (all_callee_accesses_present_p): Do not accept type
mismatched accesses.
gcc/testsuite/ChangeLog:
2020-07-03 Martin Jambor <mjambor@suse.cz>
PR ipa/96040
* gcc.dg/ipa/pr96040.c: New test.
As done for 'GOMP_MAP_FROM', also for 'GOMP_MAP_FORCE_FROM' we should only
'gomp_copy_dev2host' if 'n->refcount == 0'.
This had gotten altered in commit 378da98fcc
(r279621) "OpenACC reference count overhaul".
libgomp/
* oacc-mem.c (goacc_exit_data_internal): Revert always-copyfrom
behavior for 'GOMP_MAP_FORCE_FROM'.
* testsuite/libgomp.oacc-c-c++-common/pr92843-1.c: Adjust XFAIL.
This had gotten added in commit 378da98fcc
(r279621) "OpenACC reference count overhaul", but it doesn't have any use in
OpenACC.
libgomp/
* oacc-mem.c (goacc_exit_data_internal): Remove
'GOMP_MAP_ALWAYS_FROM' handling.
2020-07-01 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog:
* config/nvptx/nvptx.md (popcount<mode>2): New instructions.
(mulhishi3, mulsidi3, umulhisi3, umulsidi3): New instructions.
gcc/testsuite/ChangeLog:
* gcc.target/nvptx/popc-1.c: New test.
* gcc.target/nvptx/popc-2.c: New test.
* gcc.target/nvptx/popc-3.c: New test.
* gcc.target/nvptx/mul-wide.c: New test.
* gcc.target/nvptx/umul-wide.c: New test.
The following avoids leaving slp_def as passed to vect_is_simple_use
by reference uninitialized.
2020-07-03 Richard Biener <rguenther@suse.de>
PR tree-optimization/96037
* tree-vect-stmts.c (vect_is_simple_use): Initialize *slp_def.
We were costing the scalar pattern stmts rather than the scalar
original stmt and also not appropriately looking at the pattern
stmt for whether the stmt is vectorized.
2020-07-03 Richard Biener <rguenther@suse.de>
* tree-vect-slp.c (vect_bb_slp_scalar_cost): Cost the
original non-pattern stmts, look at the pattern stmt
vectorization status.
* gcc.dg/vect/costmodel/x86_64/costmodel-vect-slp-2.c: New
testcase.
These aren't real in-order instructions, because the ISA can't do that
quickly, but a means to allow regular out-of-order reductions when that's
good enough, but the middle-end doesn't know so.
gcc/
* config/gcn/gcn-valu.md (fold_left_plus_<mode>): New.
This provides helpers to insert stmts on region entry abstracted
from loop/basic-block split out from vec_init_vector and used
from the SLP constant code generation path. The SLP constant
code generation path is also changed to avoid needless SSA
copying since we can store VECTOR_CSTs directly in the vectorized
defs array, improving the IL from the vectorizer.
2020-07-03 Richard Biener <rguenther@suse.de>
* tree-vectorizer.h (vec_info::insert_on_entry): New.
(vec_info::insert_seq_on_entry): Likewise.
* tree-vectorizer.c (vec_info::insert_on_entry): Implement.
(vec_info::insert_seq_on_entry): Likewise.
* tree-vect-stmts.c (vect_init_vector_1): Use
vec_info::insert_on_entry.
(vect_finish_stmt_generation): Set modified bit after
adjusting VUSE.
* tree-vect-slp.c (vect_create_constant_vectors): Simplify
by using vec_info::insert_seq_on_entry and bypassing
vec_init_vector.
(vect_schedule_slp_instance): Deal with all-constant
children later.
This patch addresses the ICE in gcc.dg/attr-vector_size.c during
make -k check on nvptx-none. The actual ICE looks like:
testsuite/gcc.dg/attr-vector_size.c:29:1: internal compiler error: \
in tree_to_shwi, at tree.c:7321
0xf53bf2 tree_to_shwi(tree_node const*)
../../gcc/gcc/tree.c:7321
0xff1969 nvptx_vector_alignment
../../gcc/gcc/config/nvptx/nvptx.c:5105
The problem is that the caller has ensured that TYPE_SIZE(type) is
representable as an unsigned HOST_WIDE_INT, but nvptx_vector_alignment is
accessing it as a signed HOST_WIDE_INT which overflows in pathological
conditions. Amongst those pathological conditions is that a TYPE_SIZE of
zero can sometimes reach this function, prior to an error being emitted.
Making sure the result is not less than the mode's alignment and not greater
than BIGGEST_ALIGNMENT fixes the ICEs, and generates the expected
compile-time error messages.
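
A minimal sketch of the clamping described above, outside of GCC. The names
clamp_vector_alignment, fits_signed_hwi and kBiggestAlignment are
hypothetical, and the value 64 is only an assumed stand-in for
BIGGEST_ALIGNMENT; the real function works on trees and target macros.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Assumed stand-in for BIGGEST_ALIGNMENT (in bits); illustrative only.
constexpr std::uint64_t kBiggestAlignment = 64;

// A size that fits an unsigned 64-bit value need not fit a signed one,
// which is the situation in which a signed accessor would fail.
constexpr bool fits_signed_hwi(std::uint64_t v)
{
  return v <= static_cast<std::uint64_t>(INT64_MAX);
}

// Clamp the alignment derived from the type size: never above the
// biggest alignment, never below the mode's own alignment.
// Assumes mode_align_bits <= kBiggestAlignment.
std::uint64_t clamp_vector_alignment(std::uint64_t type_size_bits,
                                     std::uint64_t mode_align_bits)
{
  std::uint64_t align = std::min(type_size_bits, kBiggestAlignment);
  return std::max(align, mode_align_bits);
}
```

With these assumptions, a TYPE_SIZE of zero yields the mode's alignment
rather than zero, and an enormous size is capped at the upper bound.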
Tested on --target=nvptx-none, with a "make" and "make check" which results
in four fewer unexpected failures and three more expected passes.
2020-07-03 Roger Sayle <roger@nextmovesoftware.com>
Tom de Vries <tdevries@suse.de>
gcc/ChangeLog:
PR target/90932
* config/nvptx/nvptx.c (nvptx_vector_alignment): Use tree_to_uhwi
to access TYPE_SIZE (type). Return at least the mode's alignment.
Some testcases specifically test for negative line numbers. Those tests with
bare line numbers may be parsed incorrectly by Tcl/Expect as invalid options.
This patch encloses the negative numbers in braces so that they are
recognized as an optional parameter.
gcc/testsuite/ChangeLog
2020-07-02 David Edelsohn <dje.gcc@gmail.com>
* gcc.dg/fixits-pr84852-1.c: Enclose negative line number in braces.
* gcc.dg/fixits-pr84852-2.c: Same.
* gcc.dg/pr89410-1.c: Same.
* gcc.dg/pr89410-2.c: Same.
This test checks a conversion which only exists in C++98 and won't
compile since C++11. It uses { dg-options "-std=gnu++98" } so that it is
explicitly run in C++98 mode. This change also adds a target selector so
that the test will be skipped if the dg-options directive is filtered
out or overridden.
libstdc++-v3/ChangeLog:
* testsuite/27_io/basic_ios/conv/voidptr.cc: Add c++98_only
target selector.
These tests verify that including C++11 headers fails to compile in
C++98 mode. They use { dg-options "-std=gnu++98" } so that they are
explicitly run in C++98 mode. This change also adds a target selector so
that the tests will be skipped even if the dg-options directive is
filtered out or overridden. This is in preparation for a desired future
change where tests do not use -std options, so that they can be tested
with e.g. --target_board=unix\"{-std=gnu++17,-std=gnu++20}\"
In some cases the dg-options and dg-do directives need to be reordered,
so that the -std=gnu++98 option is already added to the options before
the target selector is checked.
libstdc++-v3/ChangeLog:
* testsuite/18_support/headers/cstdalign/std_c++0x_neg.cc: Add
c++98_only target selector.
* testsuite/18_support/headers/cstdbool/std_c++0x_neg.cc:
Likewise.
* testsuite/18_support/headers/cstdint/std_c++0x_neg.cc:
Likewise.
* testsuite/18_support/headers/new/synopsis_cxx98.cc: Likewise.
* testsuite/19_diagnostics/headers/system_error/std_c++0x_neg.cc:
Likewise.
* testsuite/20_util/headers/type_traits/std_c++0x_neg.cc:
Likewise.
* testsuite/23_containers/headers/array/std_c++0x_neg.cc:
Likewise.
* testsuite/23_containers/headers/tuple/std_c++0x_neg.cc:
Likewise.
* testsuite/23_containers/headers/unordered_map/std_c++0x_neg.cc:
Likewise.
* testsuite/23_containers/headers/unordered_set/std_c++0x_neg.cc:
Likewise.
* testsuite/26_numerics/headers/ccomplex/std_c++0x_neg.cc:
Likewise.
* testsuite/26_numerics/headers/cfenv/std_c++0x_neg.cc:
Likewise.
* testsuite/26_numerics/headers/cmath/c99_classification_macros_c++98.cc:
Likewise.
* testsuite/26_numerics/headers/ctgmath/std_c++0x_neg.cc:
Likewise.
* testsuite/26_numerics/headers/random/std_c++0x_neg.cc:
Likewise.
* testsuite/27_io/headers/cinttypes/std_c++0x_neg.cc: Likewise.
* testsuite/28_regex/headers/regex/std_c++0x_neg.cc: Likewise.
* testsuite/29_atomics/headers/atomic/std_c++0x_neg.cc:
Likewise.
* testsuite/30_threads/headers/condition_variable/std_c++0x_neg.cc:
Likewise.
* testsuite/30_threads/headers/future/std_c++0x_neg.cc:
Likewise.
* testsuite/30_threads/headers/mutex/std_c++0x_neg.cc: Likewise.
* testsuite/30_threads/headers/thread/std_c++0x_neg.cc:
Likewise.
PR libstdc++/91807
* include/std/variant
(_Copy_assign_base::operator=(const _Copy_assign_base&):
Do the move-assignment from a temporary so that the temporary
is constructed with an explicit index.
* testsuite/20_util/variant/91807.cc: New.
When recovering from an error, a NULL pointer dereference could occur.
Check for that situation and punt.
gcc/fortran/
PR fortran/93423
* resolve.c (resolve_symbol): Avoid NULL pointer dereference.
When declaring a polymorphic variable that is not a dummy, allocatable or
pointer, an ICE occurred due to a NULL pointer dereference. Check for
that situation and punt.
gcc/fortran/
PR fortran/93337
* class.c (gfc_find_derived_vtab): Punt if name is not set.
These tests fail with AIX double double. Use different floating point
values that behave less surprisingly.
libstdc++-v3/ChangeLog:
PR libstdc++/91153
PR target/93224
* testsuite/29_atomics/atomic_float/1.cc: Use different values
for tests.
* testsuite/29_atomics/atomic_ref/float.cc: Likewise.
Jakub's partial implementation of consteval virtual had trouble with the
current ABI requirement that we omit the vtable slot for a consteval virtual
function; it's difficult to use the normal code for constant evaluation and
also magically make the slots disappear if the vtables get written out. I
notice that Clang trunk also doesn't implement that requirement, and it
seems unnecessary to me; I expect consteval virtual functions to be
extremely rare, so it should be fine to just give them a vtable slot as
normal but put zero in it if the vtable gets emitted. I've commented as
much to the ABI committee.
One of Jakub's testcases points out that we weren't handling thunks in
our constexpr virtual handling; that is fixed here as well.
Incidentally, being able to use C++11 range-for definitely simplified
clear_consteval_vfns.
gcc/c-family/ChangeLog:
* c-cppbuiltin.c (c_cpp_builtins): Define __cpp_consteval.
gcc/cp/ChangeLog:
* decl.c (grokfndecl): Allow consteval virtual.
* search.c (check_final_overrider): Check consteval mismatch.
* constexpr.c (cxx_eval_thunk_call): New.
(cxx_eval_call_expression): Call it.
* cvt.c (cp_get_fndecl_from_callee): Handle FDESC_EXPR.
* decl2.c (mark_vtable_entries): Track vtables with consteval.
(maybe_emit_vtables): Pass consteval_vtables through.
(clear_consteval_vfns): Replace consteval with nullptr.
(c_parse_final_cleanups): Call it.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/consteval-virtual1.C: New test.
* g++.dg/cpp2a/consteval-virtual2.C: New test.
* g++.dg/cpp2a/consteval-virtual3.C: New test.
* g++.dg/cpp2a/consteval-virtual4.C: New test.
* g++.dg/cpp2a/consteval-virtual5.C: New test.
Co-authored-by: Jakub Jelinek <jakub@redhat.com>
This guards externalizing a SLP node when it fails to code generate
to actually have scalar defs we can use. It also makes failure
to do so not fell the whole SLP instance but instead try this again
on the parent.
2020-07-02 Richard Biener <rguenther@suse.de>
PR tree-optimization/96028
* tree-vect-slp.c (vect_slp_convert_to_external): Make sure
we have scalar stmts to use.
(vect_slp_analyze_node_operations): When analyzing a child
failed try externalizing the parent node.
The mechanism generating debug info for removed parameters did not
adjust index of the argument in the call statement to take into
account extra arguments IPA-SRA might have produced when splitting a
structure. This patch addresses that omission and stops gdb from
showing an incorrect value for the removed parameter; it says "value
optimized out" instead. The guality testcase will end up as
UNSUPPORTED in the results which is how Richi told me on IRC we deal
with this.
It is possible to generate debug info to actually show the value of
the removed parameter but so far my approaches to do just that seem
too controversial
(https://gcc.gnu.org/pipermail/gcc-patches/2020-May/546705.html), so
before I come up with something better I'd like to push this to master
and the gcc-10 branch in time for the GCC 10.2 release.
gcc/ChangeLog:
2020-07-01 Martin Jambor <mjambor@suse.cz>
PR debug/95343
* ipa-param-manipulation.c (ipa_param_adjustments::modify_call): Adjust
argument index if necessary.
gcc/testsuite/ChangeLog:
2020-07-01 Martin Jambor <mjambor@suse.cz>
PR debug/95343
* gcc.dg/guality/pr95343.c: New test.
gcc/ChangeLog:
PR middle-end/95830
* tree-vect-generic.c (expand_vector_condition): Forward declaration.
(expand_vector_comparison): Do not expand a comparison if all
uses are consumed by a VEC_COND_EXPR.
(expand_vector_operation): Change void return type to bool.
(expand_vector_operations_1): Pass dce_ssa_names.
Bootstrap with musl libc fails with numerous "missing sentinel in
function call" errors. This is because musl defines NULL as 0L for C++,
but gcc requires sentinel value to be a pointer or __null.
Jonathan Wakely says:
To be really safe during stage 1, GCC should not use NULL as a
pointer sentinel in C++ code anyway.
The bootstrap compiler could define it to 0 or 0u, neither of which
is guaranteed to be OK to pass as a varargs sentinel where a null
pointer is expected. Any of (void*)0 or (void*)NULL or nullptr
would be safe.
While it is possible to fix this by replacing NULL sentinels with
nullptrs, such approach would generate backporting conflicts, therefore
simply redefine NULL to nullptr at the end of system.h, where it would
not confuse system headers.
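
To make the sentinel issue concrete, here is a small sketch of a variadic
function that relies on a null pointer terminator. count_strings is a
made-up helper, not a GCC function. Passing a plain integer 0 or 0L as the
last argument would not be guaranteed to be read back as a null pointer,
whereas nullptr or an explicitly cast null pointer is.

```cpp
#include <cassert>
#include <cstdarg>

// Hypothetical variadic helper: counts the strings passed before the
// null pointer sentinel that must terminate the argument list.
int count_strings(const char *first, ...)
{
  int n = 0;
  va_list ap;
  va_start(ap, first);
  for (const char *s = first; s != nullptr; s = va_arg(ap, const char *))
    ++n;
  va_end(ap);
  return n;
}

// Both sentinels below are pointer-typed, so they are safe to pass
// through the ellipsis.
int demo_nullptr_sentinel()
{
  return count_strings("a", "b", nullptr);
}

int demo_cast_sentinel()
{
  return count_strings("a", "b", "c", static_cast<const char *>(nullptr));
}
```
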
gcc/ChangeLog:
2020-06-30 Ilya Leoshkevich <iii@linux.ibm.com>
PR bootstrap/95700
* system.h (NULL): Redefine to nullptr.
Use of _() to enclose string literals assigned to arrays is not
portable. Use pointer instead.
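
A sketch of the difference, assuming a configuration in which _() expands
to a function call. The translate function, the _ macro definition and the
hint text are stand-ins invented for this example; they are not the
definitions or strings used in check.c.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for a gettext-style lookup; returns its argument.
const char *translate(const char *msgid)
{
  return msgid;
}

// Assumed expansion of _() for this sketch.
#define _(msgid) translate(msgid)

// Not portable: an array cannot be initialized from a function call.
//   const char hint[] = _(" [example hint]");

// Portable: a pointer can hold whatever _() returns.
const char *example_hint()
{
  const char *hint = _(" [example hint]");
  return hint;
}
```
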
2020-07-02 Mark Eggleston <markeggleston@gcc.gnu.org>
gcc/fortran/
PR fortran/52279
* check.c (gfc_invalid_boz): Change array declaration for
hint into a pointer.
The following testcase ICEs, because during the cfg cleanup, we see:
switch (i$e_11) <default: <L12> [33.33%], case -3: <lab2> [33.33%], case 0: <L10> [33.33%], case 2: <lab2> [33.33%]>
...
lab2:
__builtin_unreachable ();
where lab2 is FORCED_LABEL. The way it works, we go through the case labels
and when we reach the first one that points to gimple_seq_unreachable*
basic block, we remove the edge (if any) from the switch bb to the bb
containing the label and bbs reachable only through that edge we've just
removed. Once we do that, we must throw away all other cases that use
the same label (or some other labels from the same bb we've removed the edge
to and the bb). To avoid quadratic behavior, this is not done by walking
all remaining cases immediately before removing, but only when processing
them later.
For normal labels this works fine: if the label is in a deleted bb, it will
have NULL label_to_block and we handle that case, or, if the unreachable bb
has some other edge to it, only the edge will be removed and not the bb,
and again, find_edge will not find the edge and we only remove the case.
And if a label would be to some other block, that other block wouldn't have
been removed earlier because there would be still an edge from the switch
block.
Now, FORCED_LABEL (and I think DECL_NONLOCAL too) break this, because
those labels aren't removed, but instead moved to some surrounding basic
block. So, when we later process those, when their gimple_seq_unreachable*
basic block is removed, label_to_block will return some unrelated block
(in the testcase the switch bb), so we decide to keep the case which doesn't
seem to be unreachable, but we don't really have an edge from the switch
block to the block the label got moved to.
I thought first about punting in gimple_seq_unreachable* on
FORCED_LABEL/DECL_NONLOCAL labels, but that might penalize even code that
doesn't care, so this instead just makes sure that for
FORCED_LABEL/DECL_NONLOCAL labels that are being removed (and thus moved
randomly) we remember in a hash_set the fact that those labels should be
treated as removed for the purpose of the optimization, and later on
handle those labels that way.
2020-07-02 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/95857
* tree-cfg.c (group_case_labels_stmt): When removing an unreachable
base_bb, remember all forced and non-local labels on it and later
treat those as if they have NULL label_to_block. Formatting fix.
Fix a comment typo.
* gcc.dg/pr95857.c: New test.
This fixes lane extraction for internal def vectorized shifts
with an effective scalar shift operand by always using lane zero
of the first vector stmt.
It also fixes a SLP build issue noticed on the testcase where
we end up building unary vector ops with the only operand built
from scalars, which isn't profitable by itself. The exception
is for stores.
2020-07-02 Richard Biener <rguenther@suse.de>
PR tree-optimization/96022
* tree-vect-stmts.c (vectorizable_shift): Only use the
first vector stmt when extracting the scalar shift amount.
* tree-vect-slp.c (vect_build_slp_tree_2): Also build unary
nodes with all-scalar children from scalars but not stores.
(vect_analyze_slp_instance): Mark the node not failed.
* g++.dg/vect/pr96022.cc: New testcase.
In the test case for PR95961, vectorization factor computed
by vect_determine_vectorization_factor is [8,8]. But this is
updated to [1,1] later by vect_update_vf_for_slp. When we call
vect_get_num_vectors in vect_enhance_data_refs_alignment, the number
of scalars which is based on the vectorization factor is not a multiple
of the number of elements in the vector type. This leads to
the ICE. This isn't a simple stream of contiguous vector accesses.
It's hard to predict from the available information how many vector
accesses we'll actually need per iteration. As discussed, here we
should use the number of scalars instead of the number of vectors as
an upper bound for the loop saving info about DR in the hash table.
2020-07-02 Felix Yang <felix.yang@huawei.com>
gcc/
PR tree-optimization/95961
* tree-vect-data-refs.c (vect_enhance_data_refs_alignment): Use the
number of scalars instead of the number of vectors as an upper bound
for the loop saving info about DR in the hash table. Remove unused
local variables.
gcc/testsuite/
PR tree-optimization/95961
* gcc.target/aarch64/sve/pr95961.c: New test.
The OpenMP 5 standard requires that if some loop in OpenMP loop nest refers
to some outer loop's iterator variable, then the subtraction of the multiplication
factors for the outer iterator multiplied by the outer increment modulo the
inner increment is 0. For loops with non-constants in any of these we can't
diagnose it; it would be a task for something like -fsanitize=openmp,
but if all these are constant, we can diagnose it.
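
For the all-constant case, the condition can be sketched as a plain
predicate. This reads m1 and m2 as the outer-iterator multipliers in the
inner loop's lower and upper bounds, which is an interpretation of the
names used in the ChangeLog entry; valid_nonrect_step is a hypothetical
helper, not code from omp-expand.c.

```cpp
#include <cassert>

// Returns true when ((m2 - m1) * incr_outer) % incr == 0, i.e. when the
// steps are acceptable under the rule described above.
// Assumes incr != 0.
constexpr bool valid_nonrect_step(long long m1, long long m2,
                                  long long incr_outer, long long incr)
{
  return ((m2 - m1) * incr_outer) % incr == 0;
}

// Outer: i += 2; inner: j = i .. 2 * i, j += 4  ->  (2 - 1) * 2 % 4 == 2.
static_assert(!valid_nonrect_step(1, 2, 2, 4), "would be diagnosed");

// Same bounds with j += 2  ->  (2 - 1) * 2 % 2 == 0.
static_assert(valid_nonrect_step(1, 2, 2, 2), "accepted");
```
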
2020-07-02 Jakub Jelinek <jakub@redhat.com>
* omp-expand.c (expand_omp_for): Diagnose non-rectangular loops with
invalid steps - ((m2 - m1) * incr_outer) % incr must be 0 in valid
OpenMP non-rectangular loops. Use XALLOCAVEC.
* c-c++-common/gomp/loop-7.c: New test.
Such problematic components can be specified by means of a component
clause but they cannot be fully supported by the type system. They
had initially been forbidden, then we decided to accept them by working
around the type system, but this is very fragile and, for example, any
static aggregate is guaranteed to trigger an ICE with the current
implementation.
We now reject them again, except if the -gnatd.K switch is passed.
gcc/ada/ChangeLog:
* debug.adb (d.K): Document new usage.
* fe.h (Debug_Flag_Dot_KK): Declare.
* gcc-interface/decl.c (gnat_to_gnu_field): Give an error when the
component overlaps with the parent subtype, except with -gnatd.K.