If we call omp_add_variable, following omp_notice_variable will already find it
on that construct and not go through outer constructs, the following patch fixes that.
Note, this still doesn't follow OpenMP 5.0 semantics on target combined with other
constructs with reduction/lastprivate/linear clauses, will handle that for GCC11.
2020-02-06 Jakub Jelinek <jakub@redhat.com>
PR libgomp/93515
* gimplify.c (gimplify_scan_omp_clauses) <do_notice>: If adding
shared clause, call omp_notice_variable on outer context if any.
The ARM Exception Handling ABI requires personality functions in
phase1 to initialize barrier_cache before returning
_URC_HANDLER_FOUND, and we don't.
Although our own ARM personality function does not use barrier_cache
at all, other languages' ARM personality functions, during phase2, are
allowed and expected to test barrier_cache.sp to check whether the
handler frame was reached, which implies that personality functions is
in charge of the frame, and the remaining fields of barrier_cache hold
whatever values it put there in phase1.
Since we did not set barrier_cache.sp, an earlier exception, already
handled by a non-Ada handler and then released, may have its storage
reused for a new exception, that phase1 matches to an Ada frame, but
if that leaves barrier_cache.sp alone, the phase2 personality function
that handled the earlier exception, upon reaching the frame that
handled the earlier exception, may believe the information in
barrier_cache applies to the current exception. The C++ personality
function, for example, would take the information in the barrier_cache
and end up activating the handler that handled the earlier exception:
try {
throw 1;
} catch (int i) {
std::cout << "caught " << i << " by c++" << std::endl;
}
raise_ada_exception (); // might loop back to the handler above
for gcc/ada/ChangeLog
* raise-gcc.c (personality_body) [__ARM_EABI_UNWINDER__]:
Initialize barrier_cache.sp when ending phase1.
We should be able to assume that a template instantiation or other COMDAT
has non-zero address even if MAKE_DECL_ONE_ONLY for the target sets
DECL_WEAK and we haven't yet decided to emit a definition in this
translation unit.
PR c++/92003
* symtab.c (symtab_node::nonzero_address): A DECL_COMDAT decl has
non-zero address even if weak and not yet defined.
In unevaluated context, we only substitute a single PARM_DECL, not the
entire chain, but the handling of an empty pack expansion was missing that
check.
PR c++/93140
* pt.c (tsubst_decl) [PARM_DECL]: Check cp_unevaluated_operand in
handling of TREE_CHAIN for empty pack.
Now that we have post epilogue_completed split point for all
optimization levels, we can simplify post epilogue_completed splitters
considerably. If corresponding define_peephole2 pattern fails to
allocate a temporary register (or if peephole2 pass isn't run at all),
we can now always split invalid RTX after epilogue_completed is set.
Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
* config/i386/i386.md (*pushdi2_rex64 peephole2): Remove.
(*pushdi2_rex64 peephole2): Unconditionally split after
epilogue_completed.
(*ashl<mode>3_doubleword): Ditto.
(*<shift_insn><mode>3_doubleword): Ditto.
In C++ we weren't calling mark_exp_read on the __builtin_convertvector first
argument. I guess it could misbehave even with lambda implicit captures.
Fixed by calling decay_conversion on the argument, we use the argument as
rvalue so we want the standard lvalue to rvalue conversions, but as the
argument must be a vector type, e.g. integral promotions aren't really
needed.
2020-02-05 Jakub Jelinek <jakub@redhat.com>
PR c++/93557
* semantics.c (cp_build_vec_convert): Call decay_conversion on arg
prior to passing it to c_build_vec_convert.
* c-c++-common/Wunused-var-17.c: New test.
Since reshape_init_array_1 can now reuse a single constructor for
an array of non-aggregate type, we might run into a scenario where
we reuse a constructor with TREE_SIDE_EFFECTS. This broke this test
because we have something like { { expr } } and we try to reshape it,
so we recurse on the inner CONSTRUCTOR, reuse an existing CONSTRUCTOR
with TREE_SIDE_EFFECTS, and then ICE on the discrepancy because the
outermost CONSTRUCTOR doesn't have TREE_SIDE_EFFECTS. In this case
EXPR was a call to an operator function so TREE_SIDE_EFFECTS should
be set. Naturally one would want to fix this by calling
recompute_constructor_flags in an appropriate place so that the flags
on the CONSTRUCTORs match. The appropriate place would be at the end
of reshape_init, but this breaks initlist109.C: there we are dealing
with { { TARGET_EXPR <{}> } } where the outermost { } is TREE_CONSTANT
but the inner { } is not, so recompute_constructor_flags would clear
the constant flag in the outermost { }. Seems resonable but it upsets
check_initializer which then complains about "non-constant in-class
initialization invalid for static member". TARGET_EXPRs are always
created with TREE_SIDE_EFFECTS on, but that is mutually exclusive
with TREE_CONSTANT. So we're in a bind.
Fixed by not reusing a CONSTRUCTOR that has TREE_SIDE_EFFECTS; in the
grand scheme of things it isn't measurable: it only affects ~3 tests
in the testsuite.
PR c++/93559 - ICE with CONSTRUCTOR flags verification.
* decl.c (reshape_init_array_1): Don't reuse a CONSTRUCTOR with
TREE_SIDE_EFFECTS.
* g++.dg/cpp0x/initlist119.C: New test.
* g++.dg/cpp0x/initlist120.C: New test.
In the testcase, since there's no declaration of T, ref_view(T) declares a
non-static data member T of type ref_view, the same type as its enclosing
class. Then when we try to do C++20 aggregate class template argument
deduction we recursively try to adjust the braced-init-list to match the
template class definition until we run out of stack.
Fixed by rejecting the template data member.
PR c++/92593
* decl.c (grokdeclarator): Reject field of current class type even
in a template.
* testsuite/lib/libgomp.exp
(check_effective_target_offload_target_nvptx): Pass flags as 'options'
and not as 'source' argument to libgomp_target_compile.
The G++ bug has been fixed for a couple of months so we can remove these
workarounds that define alias templates in terms of constrained class
templates. We can just apply constraints directly to alias templates as
specified in the C++20 working draft.
* include/bits/iterator_concepts.h (iter_reference_t)
(iter_rvalue_reference_t, iter_common_reference_t, indirect_result_t):
Remove workarounds for PR c++/67704.
* testsuite/24_iterators/aliases.cc: New test.
The analyzer recognizes __analyzer_dump_exploded_nodes as a "magic"
function for use in DejaGnu tests: at the end of the pass, it issues
a warning at each such call, dumping the count of exploded nodes seen at
the call, which can be checked in test cases via dg-warning directives,
along with the IDs of the enodes (which is helpful when debugging).
My intent was to give a way of testing the results of the state-merging
code.
The state-merging code can generate duplicate exploded nodes at a point
when state merging occurs, taking a pair of enodes from the worklist
that share a program_point and sufficiently similar state. For these
cases it generates a merged state, and adds edges from those enodes to
the merged-state enode (potentially a new or a pre-existing enode); the
input enodes don't have process_node called on them.
This means that at a CFG join point there can be an unpredictable number
of enodes that we don't care about, where the precise number depends on
the details of the state-merger code, immediately followed by a more
predictable number that we do care about.
I've been papering over this in the analyzer DejaGnu tests somewhat
by adding pairs of __analyzer_dump_exploded_nodes calls at CFG join
points, where the output at the first call is somewhat arbitrary, and
the second has the number we care about; the first number tends to
change "at random" as I tweak the state merging code, in ways that
aren't interesting, but require the tests to be updated.
See e.g. gcc.dg/analyzer/paths-6.c which had:
__analyzer_dump_exploded_nodes (0); /* { dg-warning "2 exploded nodes" } */
// FIXME: the above can vary between 2 and 3 exploded nodes
__analyzer_dump_exploded_nodes (0); /* { dg-warning "1 exploded node" } */
This patch remedies this situation by tracking which enodes are
processed, and which are merely "merger" enodes. It updates the
output for __analyzer_dump_exploded_nodes so that count of enodes
only includes the *processed* enodes, and that the IDs are split
into "processed" and "merger" enodes.
The patch simplifies the testsuite by eliminating the redundant calls
described above; the example above becomes:
__analyzer_dump_exploded_nodes (0); /* { dg-warning "1 processed enode" } */
where the output in question is now:
warning: 1 processed enode: [EN: 94] merger(s): [EN: 93]
The patch also adds various checks on the status of enodes, to ensure
e.g. that each enode is processed at most once.
gcc/analyzer/ChangeLog:
* engine.cc (exploded_node::dump_dot): Show merger enodes.
(worklist::add_node): Assert that the node's m_status is
STATUS_WORKLIST.
(exploded_graph::process_worklist): Likewise for nodes from the
worklist. Set status of merged nodes to STATUS_MERGER.
(exploded_graph::process_node): Set status of node to
STATUS_PROCESSED.
(exploded_graph::dump_exploded_nodes): Rework handling of
"__analyzer_dump_exploded_nodes", splitting enodes by status into
"processed" and "merger", showing the count of just the processed
enodes at the call, rather than the count of all enodes.
* exploded-graph.h (exploded_node::status): New enum.
(exploded_node::exploded_node): Initialize m_status to
STATUS_WORKLIST.
(exploded_node::get_status): New getter.
(exploded_node::set_status): New setter.
(exploded_node::m_status): New field.
gcc/ChangeLog:
* doc/analyzer.texi
(Special Functions for Debugging the Analyzer): Update description
of __analyzer_dump_exploded_nodes.
gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/data-model-1.c: Update for changed output to
__analyzer_dump_exploded_nodes, dropping redundant call at merger.
* gcc.dg/analyzer/data-model-7.c: Likewise.
* gcc.dg/analyzer/loop-2.c: Update for changed output format.
* gcc.dg/analyzer/loop-2a.c: Likewise.
* gcc.dg/analyzer/loop-4.c: Likewise.
* gcc.dg/analyzer/loop.c: Likewise.
* gcc.dg/analyzer/malloc-paths-10.c: Likewise; drop redundant
call at merger.
* gcc.dg/analyzer/malloc-vs-local-1a.c: Likewise.
* gcc.dg/analyzer/malloc-vs-local-1b.c: Likewise.
* gcc.dg/analyzer/malloc-vs-local-2.c: Likewise.
* gcc.dg/analyzer/malloc-vs-local-3.c: Likewise.
* gcc.dg/analyzer/paths-1.c: Likewise.
* gcc.dg/analyzer/paths-1a.c: Likewise.
* gcc.dg/analyzer/paths-2.c: Likewise.
* gcc.dg/analyzer/paths-3.c: Likewise.
* gcc.dg/analyzer/paths-4.c: Update for changed output format.
* gcc.dg/analyzer/paths-5.c: Likewise.
* gcc.dg/analyzer/paths-6.c: Likewise; drop redundant calls
at merger.
* gcc.dg/analyzer/paths-7.c: Likewise.
* gcc.dg/analyzer/torture/conditionals-2.c: Update for changed
output format.
* gcc.dg/analyzer/zlib-1.c: Likewise; drop redundant calls.
* gcc.dg/analyzer/zlib-5.c: Update for changed output format.
As mentioned in the PR, the CLOBBERs in vzeroupper are added there even for
registers that aren't ever live in the function before and break the
prologue/epilogue expansion with ms ABI (normal ABIs are fine, as they
consider all [xyz]mm registers call clobbered, but the ms ABI considers
xmm0-15 call used but the bits above low 128 ones call clobbered).
The following patch fixes it by not adding the clobbers during vzeroupper
pass (before pro_and_epilogue), but adding them for -fipa-ra purposes only
during the final output. Perhaps we could add some CLOBBERs early (say for
df_regs_ever_live_p regs that aren't live in the live_regs bitmap, or
depending on the ABI either add all of them immediately, or for ms ABI add
CLOBBERs for xmm0-xmm5 if they don't have a SET) and add the rest later.
And the addition could be perhaps done at other spots, e.g. in an
epilogue_completed guarded splitter.
2020-02-05 Jakub Jelinek <jakub@redhat.com>
PR target/92190
* config/i386/i386-features.c (ix86_add_reg_usage_to_vzeroupper): Only
include sets and not clobbers in the vzeroupper pattern.
* config/i386/sse.md (*avx_vzeroupper): Require in insn condition that
the parallel has 17 (64-bit) or 9 (32-bit) elts.
(*avx_vzeroupper_1): New define_insn_and_split.
* gcc.target/i386/pr92190.c: New test.
The problem is that x86 is the only target that defines HAVE_ATTR_length and
doesn't schedule any splitting pass at -O0 after pro_and_epilogue.
So, either we go back to handling this during vzeroupper output
(unconditionally, rather than flag_ipa_ra guarded), or we need to tweak the
split* passes for x86.
Seems there are 5 split passes, split1 is run unconditionally before reload,
split2 is run for optimize > 0 or STACK_REGS (x86) after ra but before
epilogue_completed, split3 is run before regstack for STACK_REGS and
optimize and -fno-schedule-insns2, split4 is run before sched2 if sched2 is
run and split5 is run before shorten_branches if HAVE_ATTR_length and not
STACK_REGS.
2020-02-05 Jakub Jelinek <jakub@redhat.com>
PR target/92190
* recog.c (pass_split_after_reload::gate): For STACK_REGS targets,
don't run when !optimize.
(pass_split_before_regstack::gate): For STACK_REGS targets, run even
when !optimize.
We're now consistently building SLP operations with only
scalar defs from scalars which makes the testcase no longer
testing multiplication vectorization. The following smuggles
in a constant making the vector variant profitable for SLP build.
2020-02-05 Richard Biener <rguenther@suse.de>
PR testsuite/92177
* gcc.dg/vect/bb-slp-22.c: Adjust.
This adds guards to genmatch generated code before accessing call
expression or stmt arguments that might be out of bounds when
the user provided bogus prototypes for what we consider builtins.
2020-02-05 Richard Biener <rguenther@suse.de>
PR middle-end/90648
* genmatch.c (dt_node::gen_kids_1): Emit number of argument
checks before matching calls.
* gcc.dg/pr90648.c: New testcase.
Makes some parameters const in libiberty's hashtab library.
include/ChangeLog:
* hashtab.h (htab_remove_elt): Make a parameter const.
(htab_remove_elt_with_hash): Likewise.
libiberty/ChangeLog:
* hashtab.c (htab_remove_elt): Make a parameter const.
(htab_remove_elt_with_hash): Likewise.
The testcases ICE because when processing the declare simd inbranch,
we don't create the i == 0 clone as it already exists, which means
clone_info->nargs is not adjusted, but we then rely on it being adjusted
when trying other clones.
2020-02-05 Jakub Jelinek <jakub@redhat.com>
PR middle-end/93555
* omp-simd-clone.c (expand_simd_clones): If simd_clone_mangle or
simd_clone_create failed when i == 0, adjust clone->nargs by
clone->inbranch.
* c-c++-common/gomp/pr93555-1.c: New test.
* c-c++-common/gomp/pr93555-2.c: New test.
* gfortran.dg/gomp/pr93555.f90: New test.
These changes are needed for some of the tests in the constrained algorithm
patch, because they use move_iterator with an uncopyable output_iterator. The
other changes described in the paper are already applied, it seems.
libstdc++-v3/ChangeLog:
* include/bits/stl_iterator.h (move_iterator::move_iterator): Move __i
when initializing _M_current.
(move_iterator::base): Split into two overloads differing in
ref-qualifiers as in P1207R4 for C++20.
gcc/cp
* coroutines.cc (build_co_await): Call convert_from_reference
to wrap co_await_expr with indirect_ref which avoid
reference/non-reference type confusion.
(co_await_expander): Sink to call_expr if await_resume
is wrapped by indirect_ref.
gcc/testsuite
* g++.dg/coroutines/co-await-14-return-ref-to-auto.C: New test.
Here, push_tinst_level refused to push into the scope of Foo::Foo
because it was triggered from the ill-formed function fun. But we didn't
check the return value and tried to pop the un-pushed level.
PR c++/93551
* constraint.cc (satisfy_declaration_constraints): Check return
value of push_tinst_level.
Value-initialization is importantly different from {}-initialization for
this testcase, where the former calls the deleted S constructor and the
latter initializes S happily.
PR c++/90951
* constexpr.c (cxx_eval_array_reference): {}-initialize missing
elements instead of value-initializing them.
Here, we were going down the wrong path in perform_member_init because of
the incorrect parens around the mem-initializer for the array. And then
cxx_eval_vec_init_1 didn't know what to do with a CONSTRUCTOR as the
initializer. The latter issue was a straightforward fix, but I also wanted
to fix us silently accepting the parens, which led to factoring out handling
of TREE_LIST and flexarrays. The latter led to adjusting the expected
behavior on flexary29.C: we should complain about the initializer, but not
complain about a missing initializer.
As I commented on PR 92812, in this process I noticed that we weren't
handling C++20 parenthesized aggregate initialization as a mem-initializer.
So my TREE_LIST handling includes a commented out section that should
probably be part of a future fix for that issue; with it uncommented we
continue to crash on the testcase in C++20 mode, but should instead complain
about the braced-init-list not being a valid initializer for an A.
PR c++/86917
* init.c (perform_member_init): Simplify.
* constexpr.c (cx_check_missing_mem_inits): Allow uninitialized
flexarray.
(cxx_eval_vec_init_1): Handle CONSTRUCTOR.
Fix some failures on xstormy16-elf:
gcc.dg/analyzer/data-model-1.c (test for warnings, line 595)
gcc.dg/analyzer/data-model-1.c (test for warnings, line 642)
gcc.dg/analyzer/data-model-1.c (test for warnings, line 690)
gcc.dg/analyzer/data-model-1.c (test for warnings, line 738)
due to:
warning: overflow in conversion from ‘long int’ to ‘int’ changes
value from ‘100024’ to ‘-31048’ [-Woverflow]
20 | p[0].x = 100024;
| ^~~~~~
gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/data-model-1.c (struct coord): Convert fields
from int to long.
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=243681 reports a build
failure with clang 9.0.1:
gcc/analyzer/engine.cc:2971:13: error:
reinterpret_cast from 'nullptr_t' to 'function *' is not allowed
v.m_fun = reinterpret_cast<function *> (NULL);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
engine.cc:2983:21: error:
reinterpret_cast from 'nullptr_t' to 'function *' is not allowed
return v.m_fun == reinterpret_cast<function *> (NULL);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The casts appears to be unnecessary; eliminate them.
gcc/analyzer/ChangeLog:
PR analyzer/93543
* engine.cc (pod_hash_traits<function_call_string>::mark_empty):
Eliminate reinterpret_cast.
(pod_hash_traits<function_call_string>::is_empty): Likewise.
This adds back a folding that worked in GCC 4.5 times by amending
the pattern that handles other cases of address vs. SSA name
comparisons.
2020-02-04 Richard Biener <rguenther@suse.de>
PR tree-optimization/93538
* match.pd (addr EQ/NE ptr): Amend to handle &ptr->x EQ/NE ptr.
* gcc.dg/tree-ssa/forwprop-38.c: New testcase.
The macro that is defined is _GLIBCXX_NOT_FN_CALL_OP but the macro that
was named in the #undef directive was _GLIBCXX_NOT_FN_CALL. This fixes
the #undef.
* include/std/functional (_GLIBCXX_NOT_FN_CALL_OP): Un-define after
use.
The requirements for this function are only that the deleter is
swappable, but we incorrectly require that the element type is complete
and that the deleter can be swapped using std::swap (which requires it
to be move cosntructible and move assignable).
The fix is to add __uniq_ptr_impl::swap which swaps the pointer and
deleter individually, instead of using the generic std::swap on the
tuple containing them.
PR libstdc++/93562
* include/bits/unique_ptr.h (__uniq_ptr_impl::swap): Define.
(unique_ptr::swap, unique_ptr<T[], D>::swap): Call it.
* testsuite/20_util/unique_ptr/modifiers/93562.cc: New test.
Add forgotten gcc/testsuite/c-c++-common/gomp/has-include-1.c.
2020-02-04 Jakub Jelinek <jakub@redhat.com>
* macro.c (builtin_has_include): Diagnose __has_include* use outside
of preprocessing directives.
* c-c++-common/cpp/has-include-1.c: New test.
* c-c++-common/cpp/has-include-next-1.c: New test.
* c-c++-common/gomp/has-include-1.c: New test.
The standard says http://eel.is/c++draft/cpp.cond#7.sentence-2 that
__has_include can't appear at arbitrary places in the source. As we have
not recognized __has_include* outside of preprocessing directives in the
past, accepting it there now would be a regression. The patch does still
allow it in #define if it is then used in preprocessing directives, I guess
that use isn't strictly valid either, but clang seems to accept it.
2020-02-04 Jakub Jelinek <jakub@redhat.com>
* macro.c (builtin_has_include): Diagnose __has_include* use outside
of preprocessing directives.
* c-c++-common/cpp/has-include-1.c: New test.
* c-c++-common/cpp/has-include-next-1.c: New test.
* c-c++-common/gomp/has-include-1.c: New test.
Some of the following testcases ICE, because one of the cpp_get_token
calls in builtin_has_include reads the CPP_EOF token but the caller isn't
aware that CPP_EOF has been reached and will do another cpp_get_token.
get_token_no_padding is something that is use by the
has_attribute/has_builtin callbacks, which will first peek and will not
consume CPP_EOF (but will consume other tokens). The !SEEN_EOL ()
check on the other side doesn't work anymore and isn't really needed,
as we don't consume the EOF. The change adds one further error to the
pr88974.c testcase, if we wanted just one error per __has_include,
we could add some boolean whether we've emitted errors already and
only emit the first one we encounter (not implemented).
2020-02-04 Jakub Jelinek <jakub@redhat.com>
PR preprocessor/93545
* macro.c (cpp_get_token_no_padding): New function.
(builtin_has_include): Use it instead of cpp_get_token. Don't check
SEEN_EOL.
* c-c++-common/cpp/pr88974.c: Expect another diagnostics during error
recovery.
* c-c++-common/cpp/pr93545-1.c: New test.
* c-c++-common/cpp/pr93545-2.c: New test.
* c-c++-common/cpp/pr93545-3.c: New test.
* c-c++-common/cpp/pr93545-4.c: New test.
If the user's coroutine return type omits the mandatory promise
type then we will currently restate that error each time we see
a coroutine keyword, which doesn't provide any new information.
This suppresses all but the first instance in each coroutine.
gcc/cp/ChangeLog:
2020-02-04 Iain Sandoe <iain@sandoe.co.uk>
* coroutines.cc (find_promise_type): Delete unused forward
declaration.
(struct coroutine_info): Add a bool for no promise type error.
(coro_promise_type_found_p): Only emit the error for a missing
promise once in each affected coroutine.
gcc/testsuite/ChangeLog:
2020-02-04 Iain Sandoe <iain@sandoe.co.uk>
* g++.dg/coroutines/coro-missing-promise.C: New test.
Redundant store removal in FRE was restricted for correctness reasons.
The following extends correctness fixes required to memcpy/aggregate
copy translation. The main change is that we no longer insert
references rewritten to cover such aggregate copies into the hashtable
but the original one.
2020-02-04 Richard Biener <rguenther@suse.de>
PR tree-optimization/91123
* tree-ssa-sccvn.c (vn_walk_cb_data::finish): New method.
(vn_walk_cb_data::last_vuse): New member.
(vn_walk_cb_data::saved_operands): Likewsie.
(vn_walk_cb_data::~vn_walk_cb_data): Release saved_operands.
(vn_walk_cb_data::push_partial_def): Use finish.
(vn_reference_lookup_2): Update last_vuse and use finish if
we've saved operands.
(vn_reference_lookup_3): Use finish and update calls to
push_partial_defs everywhere. When translating through
memcpy or aggregate copies save off operands and alias-set.
(eliminate_dom_walker::eliminate_stmt): Restore VN_WALKREWRITE
operation for redundant store removal.
* gcc.dg/tree-ssa/ssa-fre-85.c: New testcase.
The PR shows that code generation ends up pessimized by the new
canonicalization rules that end up nailing do-not-care elements
to specific values making it hard to generate good code later.
The temporary solution is to avoid this for the cases we also
obviously know the canonicalization will create more GIMPLE stmts than
before.
2020-02-04 Richard Biener <rguenther@suse.de>
PR tree-optimization/92819
* tree-ssa-forwprop.c (simplify_vector_constructor): Avoid
generating more stmts than before.
* gcc.target/i386/pr92819.c: New testcase.
* gcc.target/i386/pr92803.c: Adjust.