The PLT is volatile. On PowerPC it is a bss style section which the
dynamic loader initialises to point at resolver stubs (called glink on
PowerPC64) to support lazy resolution of function addresses. The
first call to a given function goes via the dynamic loader symbol
resolver, which updates the PLT entry for that function and calls the
function. The second call, if there is one and we don't have a
multi-threaded race, will use the updated PLT entry and thus avoid
the relatively slow symbol resolver path.
Calls via the PLT are like calls via a function pointer, except that
no initialised function pointer is volatile like the PLT. All
initialised function pointers are resolved at program startup to point
at the function or are left as NULL. There is no support for lazy
resolution of any user visible function pointer.
So why does any of this matter to gcc? Well, normally the PLT call
mechanism happens entirely behind gcc's back, but since we implemented
inline PLT calls (effectively putting the PLT code stub that loads the
PLT entry inline and making that code sequence scheduled), the load of
the PLT entry is visible to gcc. That load then is subject to gcc
optimization, for example in
  /* -S -mcpu=future -mpcrel -mlongcall -O2.  */
  int foo (int);
  void bar (void)
  {
    while (foo(0))
      foo (99);
  }
we see the PLT load for foo being hoisted out of the loop and stashed
in a call-saved register. If that happens to be the first call to
foo, then the stashed value is that for the resolver stub, and every
call to foo in the loop will then go via the slow resolver path. Not
a good idea. Also, if foo turns out to be a local function and the
linker replaces the PLT calls with direct calls to foo then gcc has
just wasted a call-saved register.
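To make the cost concrete, here is a rough user-space model of the
situation, not the real PLT mechanism.  All names (plt_slot,
resolver_stub, real_foo, run_hoisted, run_reloaded) are hypothetical:
a function pointer stands in for the PLT entry and initially points
at a stub that patches the slot on first use.

```cpp
#include <cassert>

// Hypothetical model: one "PLT slot" for foo, starting out at a resolver stub.
using fn_t = int (*)(int);

static int resolver_calls = 0;  // trips through the slow resolver path
static int remaining = 3;       // model foo(0) returns nonzero three times

static int real_foo(int x) {
  if (x == 0)
    return remaining-- > 0;
  return x;
}

static int resolver_stub(int x);

// volatile: every call through the slot must reload it.
static fn_t volatile plt_slot = resolver_stub;

static int resolver_stub(int x) {
  ++resolver_calls;
  plt_slot = real_foo;          // lazy resolution: patch the slot
  return real_foo(x);
}

void reset() {
  plt_slot = resolver_stub;
  resolver_calls = 0;
  remaining = 3;
}

// Models the hoisted load: the slot is read once and cached, so the
// cached value is still the resolver stub on every iteration.
int run_hoisted() {
  fn_t cached = plt_slot;
  while (cached(0))
    cached(99);
  return resolver_calls;
}

// Models a volatile PLT load: the slot is reloaded for each call, so
// only the very first call goes via the resolver.
int run_reloaded() {
  while (plt_slot(0))
    plt_slot(99);
  return resolver_calls;
}
```

In this model the hoisted variant takes the resolver path on all seven
calls, while the reloading variant takes it once.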
This patch teaches gcc that the PLT loads are volatile. The change
doesn't affect other loads of function pointers and thus has no effect
on normal indirect function calls. Note that because the
"optimization" this patch prevents can only occur over function calls,
the only place gcc can stash PLT loads is in call-saved registers or
in other memory. I'm reasonably confident that this change will be
neutral or positive for the "ld -z now" case where the PLT is not
volatile, in code where there is any register pressure. Even if gcc
could be taught to recognise cases where the PLT is resolved, you'd
need to discount use of registers to cache PLT loads by some factor
involving the chance that those calls would be converted to direct
calls.
PR target/94145
* config/rs6000/rs6000.c (rs6000_longcall_ref): Use unspec_volatile
for PLT16_LO and PLT_PCREL.
* config/rs6000/rs6000.md (UNSPEC_PLT16_LO, UNSPEC_PLT_PCREL): Remove.
(UNSPECV_PLT16_LO, UNSPECV_PLT_PCREL): Define.
(pltseq_plt16_lo_, pltseq_plt_pcrel): Use unspec_volatile.
With the PR94346 fix in, we can revert the attr-copy-2.C workaround.
2020-03-27 Jakub Jelinek <jakub@redhat.com>
PR c++/94339
* g++.dg/ext/attr-copy-2.C: Revert the last changes.
gcc/c-family/ChangeLog:
PR c++/94346
* c-attribs.c (handle_copy_attribute): Avoid passing expressions
to decl_attributes. Make handling of different kinds of entities
more robust.
gcc/c-c++-common/ChangeLog:
PR c++/94346
* c-c++-common/attr-copy.c: New test.
The iterative addition of 8 and 16 bit vectors has left the mode iterators in a
bit of a mess. Also, the original names were rather verbose leading to
formatting difficulties.
This patch renames all the vector modes such that they are shorter and tidier.
It does not change the output machine description at all.
2020-03-27 Andrew Stubbs <ams@codesourcery.com>
gcc/
* config/gcn/gcn-valu.md:
(VEC_SUBDWORD_MODE): Rename to V_QIHI throughout.
(VEC_1REG_MODE): Delete.
(VEC_1REG_ALT): Delete.
(VEC_ALL1REG_MODE): Rename to V_1REG throughout.
(VEC_1REG_INT_MODE): Delete.
(VEC_ALL1REG_INT_MODE): Rename to V_INT_1REG throughout.
(VEC_ALL1REG_INT_ALT): Rename to V_INT_1REG_ALT throughout.
(VEC_2REG_MODE): Rename to V_2REG throughout.
(VEC_REG_MODE): Rename to V_noHI throughout.
(VEC_ALLREG_MODE): Rename to V_ALL throughout.
(VEC_ALLREG_ALT): Rename to V_ALL_ALT throughout.
(VEC_ALLREG_INT_MODE): Rename to V_INT throughout.
(VEC_INT_MODE): Delete.
(VEC_FP_MODE): Rename to V_FP throughout and move to top.
(VEC_FP_1REG_MODE): Rename to V_FP_1REG throughout and move to top.
(FP_MODE): Delete and replace with FP throughout.
(FP_1REG_MODE): Delete and replace with FP_1REG throughout.
(VCMP_MODE): Rename to V_noQI throughout and move to top.
(VCMP_MODE_INT): Rename to V_INT_noQI throughout and move to top.
* config/gcn/gcn.md (FP): New mode iterator.
(FP_1REG): New mode iterator.
-Wredundant-tags doesn't consider type declarations that are also
the first uses of the type, such as in 'void f (struct S);' and
issues false positives for those.  According to the reporter, that's
making it harder to use the warning to clean up LibreOffice.
The attached patch extends -Wredundant-tags to avoid these false
positives by relying on the same class_decl_loc_t::class2loc mapping
as -Wmismatched-tags. The patch also improves the detection
of both issues in template declarations.
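A small sketch of the distinction, with hypothetical names (S,
set_value, get_value, tags_demo); it is not one of the new tests.  In
the first declaration the class-key is what introduces S, so it cannot
be dropped; in the later one S is already declared and the key is the
kind of redundancy the warning is aimed at.

```cpp
#include <cassert>

// First use of S: nothing named S is in scope yet, so 'struct' is what
// declares it here.  Dropping the tag would not compile.
void set_value(struct S *s, int v);

struct S { int value; };

// S is already declared at this point, so the class-key below could be
// removed without changing the meaning.
int get_value(const struct S *s) { return s->value; }

void set_value(S *s, int v) { s->value = v; }

int tags_demo() {
  S s{0};
  set_value(&s, 5);
  return get_value(&s);
}
```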
gcc/cp/ChangeLog
2020-03-27 Martin Sebor <msebor@redhat.com>
PR c++/94078
PR c++/93824
PR c++/93810
* cp-tree.h (most_specialized_partial_spec): Declare.
* parser.c (cp_parser_elaborated_type_specifier): Distinguish alias
from declarations.
(specialization_of): New function.
(cp_parser_check_class_key): Move code...
(class_decl_loc_t::add): ...to here. Add parameters. Avoid issuing
-Wredundant-tags on first-time declarations in other declarators.
Correct handling of template specializations.
(class_decl_loc_t::diag_mismatched_tags): Also expect to be called
when -Wredundant-tags is enabled. Use primary template or partial
specialization as the guide for uses of implicit instantiations.
* pt.c (most_specialized_partial_spec): Declare extern.
gcc/testsuite/ChangeLog
2020-03-27 Martin Sebor <msebor@redhat.com>
PR c++/94078
PR c++/93824
PR c++/93810
* g++.dg/warn/Wmismatched-tags-3.C: New test.
* g++.dg/warn/Wmismatched-tags-4.C: New test.
* g++.dg/warn/Wmismatched-tags-5.C: New test.
* g++.dg/warn/Wmismatched-tags-6.C: New test.
* g++.dg/warn/Wredundant-tags-3.C: Remove xfails.
* g++.dg/warn/Wredundant-tags-6.C: New test.
* g++.dg/warn/Wredundant-tags-7.C: New test.
Following DR2061, 'namespace F' looks for 'F's inside inline namespaces.
That can result in ambiguous lookups that we failed to diagnose early enough,
leading us to push a new namespace and ICE later. Diagnose the ambiguity
earlier, and then pick one.
PR c++/94257
* name-lookup.c (push_namespace): Triage ambiguous lookups that
contain namespaces.
Fixes to exploded_path::feasible_p exposed a pre-existing bug
with pointer NULL-ness for pointers to symbolic_region.
symbolic_region has an "m_possibly_null" flag which if set means
that a region_svalue pointing to that region is treated as possibly
NULL. Adding a constraint of "!= NULL" on an edge records that
the pointer is non-NULL, but doesn't affect other pointers (e.g.
if the first is a void *, but the other pointers are cast to other
pointer types). This showed up in the tests
gcc.dg/analyzer/data-model-5b.c and -5c.c, which malloc a buffer
and test for NULL, but then cast that to a struct * and later test
that struct *: a path for the first test being non-NULL and the
second being NULL was erroneously found to be feasible.
This patch clears the m_possibly_null flag when a "!= NULL" constraint
is added, fixing that erroneous path (but not yet fixing the false
positive in the above tests, which seems to go on to hit a different
issue). It also adds the field to dumps.
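The shape of code involved looks roughly like the following.  This is
an illustrative sketch with hypothetical names (record,
make_record_value), not a copy of data-model-5b.c: a buffer is tested
for NULL as a void *, then cast to a struct pointer and tested again.
A path on which the first test sees non-NULL and the second sees NULL
cannot happen at run time.

```cpp
#include <cassert>
#include <cstdlib>

struct record { int value; };

// Returns 42 on success, -1 if allocation failed.  The -2 branch is the
// infeasible path: buf is already known to be non-NULL when r is tested.
int make_record_value() {
  void *buf = std::malloc(sizeof(record));
  if (buf == nullptr)
    return -1;                              // first test, on the void *

  record *r = static_cast<record *>(buf);
  if (r == nullptr)
    return -2;                              // second test, on the cast pointer

  r->value = 42;
  int v = r->value;
  std::free(buf);
  return v;
}
```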
gcc/analyzer/ChangeLog:
* program-state.cc (selftest::test_program_state_dumping): Update
expected dump to include symbolic_region's possibly_null field.
* region-model.cc (symbolic_region::print_fields): New vfunc
implementation.
(region_model::add_constraint): Clear m_possibly_null from
symbolic_regions now known to be non-NULL.
(selftest::test_malloc_constraints): New selftest.
(selftest::analyzer_region_model_cc_tests): Call it.
* region-model.h (region::dyn_cast_symbolic_region): Add non-const
overload.
(symbolic_region::dyn_cast_symbolic_region): Implement it.
(symbolic_region::print_fields): New vfunc override decl.
gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/data-model-5b.c: Add xfail for new false
positive leak.
* gcc.dg/analyzer/data-model-5c.c: Likewise.
* gcc.dg/analyzer/malloc-5.c: New test.
This patch extends -fdump-analyzer-supergraph so that rather than just
dumping a DUMP_BASE_NAME.supergraph.dot at the start of analysis, it
also dumps a DUMP_BASE_NAME.supergraph-eg.dot at the end.
The new dump file contains a concise dump of the exploded_graph,
organized with respect to the supergraph and its statements. The
exploded nodes are colorized to show sm-state, but no other state
is shown. Per exploded_node saved_diagnostics are also shown,
along with feasibility of the paths to reach them.
I've been finding this a useful way of tracking down issues in
exploded_graphs that are sufficiently large that the output of
-fdump-analyzer-exploded-graph becomes unwieldy.
The patch extends feasibility-testing so that if the exploded_path
for a saved_diagnostic is found to be infeasible, the reason is
saved and written into the saved_diagnostic, so it can be shown in the
dump. I've found this very useful when tracking down feasibility
issues.
I'm keeping the initial dump file as it's useful when tracking down
ICEs within the analyzer (which would stop the second dump file being
written).
gcc/analyzer/ChangeLog:
* analyzer.h (class feasibility_problem): New forward decl.
* diagnostic-manager.cc (saved_diagnostic::saved_diagnostic):
Initialize new fields m_status, m_epath_length, and m_problem.
(saved_diagnostic::~saved_diagnostic): Delete m_problem.
(dedupe_candidate::dedupe_candidate): Convert "sd" param from a
const ref to a mutable ptr.
(dedupe_winners::add): Convert "sd" param from a const ref to a
mutable ptr. Record the length of the exploded_path. Record the
feasibility/infeasibility of sd into sd, capturing a
feasibility_problem when feasible_p fails, and storing it in sd.
(diagnostic_manager::emit_saved_diagnostics): Update for pass by
ptr rather than by const ref.
* diagnostic-manager.h (class saved_diagnostic): Add new enum
status. Add fields m_status, m_epath_length and m_problem.
(saved_diagnostic::set_feasible): New member function.
(saved_diagnostic::set_infeasible): New member function.
(saved_diagnostic::get_feasibility_problem): New accessor.
(saved_diagnostic::get_status): New accessor.
(saved_diagnostic::set_epath_length): New member function.
(saved_diagnostic::get_epath_length): New accessor.
* engine.cc: Include "gimple-pretty-print.h".
(exploded_path::feasible_p): Add OUT param and, if non-NULL, write
a new feasibility_problem to it on failure.
(viz_callgraph_node::dump_dot): Convert begin_tr calls to
begin_trtd. Convert end_tr calls to end_tdtr.
(class exploded_graph_annotator): New subclass of dot_annotator.
(impl_run_checkers): Add a second -fdump-analyzer-supergraph dump
after the analysis runs, using exploded_graph_annotator, dumping
to DUMP_BASE_NAME.supergraph-eg.dot.
* exploded-graph.h (exploded_node::get_dot_fillcolor): Make
public.
(exploded_path::feasible_p): Add OUT param.
(class feasibility_problem): New class.
* state-purge.cc (state_purge_annotator::add_node_annotations):
Return a bool, add a "within_table" param.
(print_vec_of_names): Convert begin_tr calls to begin_trtd.
Convert end_tr calls to end_tdtr.
(state_purge_annotator::add_stmt_annotations): Add "within_row"
param.
* state-purge.h (state_purge_annotator::add_node_annotations):
Return a bool, add a "within_table" param.
(state_purge_annotator::add_stmt_annotations): Add "within_row"
param.
* supergraph.cc (supernode::dump_dot): Call add_node_annotations
twice: as before, passing false for "within_table", then again
with true when within the TABLE element. Convert some begin_tr
calls to begin_trtd, and some end_tr calls to end_tdtr.
Repeat each add_stmt_annotations call, distinguishing between
calls that add TRs and those that add TDs to an existing TR.
Add a call to add_after_node_annotations.
* supergraph.h (dot_annotator::add_node_annotations): Add a
"within_table" param.
(dot_annotator::add_stmt_annotations): Add a "within_row" param.
(dot_annotator::add_after_node_annotations): New vfunc.
gcc/ChangeLog:
* doc/invoke.texi (-fdump-analyzer-supergraph): Document that this
now emits two .dot files.
* graphviz.cc (graphviz_out::begin_tr): Only emit a TR, not a TD.
(graphviz_out::end_tr): Only close a TR, not a TD.
(graphviz_out::begin_td): New.
(graphviz_out::end_td): New.
(graphviz_out::begin_trtd): New, replacing the old implementation
of graphviz_out::begin_tr.
(graphviz_out::end_tdtr): New, replacing the old implementation
of graphviz_out::end_tr.
* graphviz.h (graphviz_out::begin_td): New decl.
(graphviz_out::end_td): New decl.
(graphviz_out::begin_trtd): New decl.
(graphviz_out::end_tdtr): New decl.
gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/dot-output.c: Check that
dot-output.c.supergraph-eg.dot is valid.
gcc/analyzer/ChangeLog:
* diagnostic-manager.cc (dedupe_winners::add): Show the
exploded_node index in the log messages.
(diagnostic_manager::emit_saved_diagnostics): Log a summary of
m_saved_diagnostics at entry.
This avoids completing types for DINFO_LEVEL_TERSE by using
the should_emit_struct_debug machinery.
2020-03-27 Richard Biener <rguenther@suse.de>
PR debug/94273
* dwarf2out.c (should_emit_struct_debug): Return false for
DINFO_LEVEL_TERSE.
* g++.dg/debug/pr94273.C: New testcase.
This fixes a (harmless) use of a not re-initialized curr_order.
2020-03-27 Richard Biener <rguenther@suse.de>
PR tree-optimization/94352
* tree-ssa-propagate.c (ssa_prop_init): Move seeding of the
worklist ...
(ssa_propagation_engine::ssa_propagate): ... here after
initializing curr_order.
As PR90332 shows, eliminating the scalar epilogue peeling for gaps
currently requires a vec_init optab that builds the vector from two
half-size vector modes.  On Power, we don't support vector modes like
V8QI, so we can't support optabs like vec_initv16qiv8qi.  But we can
use an existing scalar mode like DI to initialise the desired vector
mode.  This patch extends the existing support to do that for Power;
evaluated on Power9, it gives the expected 1.9% speedup on SPEC2017
525.x264_r.
As Richi suggested, add a function vector_vector_composition_type
to refactor the existing related code and make further use of it.
Bootstrapped/regtested on powerpc64le-linux-gnu (LE) P8 and P9,
as well as x86_64-redhat-linux.
gcc/ChangeLog
2020-03-27 Kewen Lin <linkw@gcc.gnu.org>
PR tree-optimization/90332
* tree-vect-stmts.c (vector_vector_composition_type): New function.
(get_group_load_store_type): Adjust to call vector_vector_composition_type,
extend it to construct with scalar types.
(vectorizable_load): Likewise.
PR fortran/93363
* resolve.c (resolve_assoc_var): Reject association to DT and
function name.
PR fortran/93363
* gfortran.dg/associate_51.f90: Fix test case.
* gfortran.dg/associate_53.f90: New.
The following testcase FAILs -fcompare-debug, because if we emit a
-Wreturn-local-addr warning, we tsubst decltype in order to print the
warning and as that function could throw, set_flags_from_callee during that
sets cp_function_chain->can_throw and later on we don't set TREE_NOTHROW
on foo. While with -w or -Wno-return-local-addr, tsubst isn't called during
the warning_at, cp_function_chain->can_throw is kept clear and TREE_NOTHROW
is set on foo.
It isn't just a matter of the warning though, in
int foo ();
int bar () { return sizeof (foo ()); }
int baz () { return sizeof (int); }
I don't really see why we should mark only baz as TREE_NOTHROW and not bar
too, when neither can really throw.
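The operand of sizeof is unevaluated, which a small run-time check can
show.  In this sketch foo is given a body and a call counter (both
additions for illustration only) so that a call would be observable:

```cpp
#include <cassert>

static int foo_calls = 0;

// A definition is supplied here only so that a call would be visible.
int foo() { ++foo_calls; return 0; }

// sizeof's operand is unevaluated: foo is never called from bar.
int bar() { return sizeof(foo()); }
int baz() { return sizeof(int); }

int sizeof_demo() {
  int same = (bar() == baz());
  return same && foo_calls == 0;   // 1 if no call to foo happened
}
```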
2020-03-27 Jakub Jelinek <jakub@redhat.com>
PR c++/94326
* call.c (set_flags_from_callee): Don't update
cp_function_chain->can_throw or current_function_returns_abnormally
if cp_unevaluated_operand.
* g++.dg/other/pr94326.C: New test.
My recent change to get_narrower/warnings_for_convert_and_check broke
the following testcase, warnings_for_convert_and_check is upset that
expr is a COMPOUND_EXPR with INTEGER_CST at the rightmost operand, while
result is a COMPOUND_EXPR with a NOP_EXPR of INTEGER_CST at the rightmost
operand, it expects such conversions to be simplified.
The easiest fix seems to be to handle COMPOUND_EXPRs in ocp_convert too,
by converting the rightmost operand and recreating COMPOUND_EXPR(s) if that
changed.
The attr-copy-2.C change is a workaround for PR94346, where we now ICE on
the testcase, while previously we'd ICE only if it contained a comma
expression at the outer level rather than cast of a COMPOUND_EXPR to
something. I'll defer that to Martin.
2020-03-27 Jakub Jelinek <jakub@redhat.com>
PR c++/94339
* cvt.c (ocp_convert): Handle COMPOUND_EXPR by recursion on the second
operand and creating a new COMPOUND_EXPR if anything changed.
* g++.dg/other/pr94339.C: New test.
* g++.dg/ext/attr-copy-2.C: Comment out failing tests due to PR94346.
This patch removes all debug insns from DDG analysis. It fixes bootstrap
comparison failure on powerpc64le when running with -fmodulo-sched enabled.
* ddg.c (create_ddg_dep_from_intra_loop_link): Remove assertions.
(create_ddg_dep_no_link): Likewise.
(add_cross_iteration_register_deps): Move debug instruction check.
Other minor refactoring.
(add_intra_loop_mem_dep): Do not check for debug instructions.
(add_inter_loop_mem_dep): Likewise.
(build_intra_loop_deps): Likewise.
(create_ddg): Do not include debug insns into the graph.
* ddg.h (struct ddg): Remove num_debug field.
* modulo-sched.c (doloop_register_get): Adjust condition.
(res_MII): Remove DDG num_debug field usage.
(sms_schedule_by_order): Use assertion against debug insns.
(ps_has_conflicts): Drop debug insn check.
testsuite:
* gcc.c-torture/execute/pr70127-debug-sms.c: New test.
* gcc.dg/torture/pr87197-debug-sms.c: New test.
This came up on the C++ core list recently.  We don't diagnose the case
when 'template' is followed by a destructor name, which is not permitted
by [temp.names]/5.
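For illustration (this is not the new test, and the names counted,
destroy_object and destroy_demo are made up), an explicit destructor
call is written without the keyword; the rejected spelling is left in
a comment:

```cpp
#include <cassert>
#include <new>

struct counted {
  static int destroyed;
  ~counted() { ++destroyed; }
};
int counted::destroyed = 0;

template <typename T>
void destroy_object(T *p) {
  p->~T();               // OK: plain pseudo-destructor call
  // p->template ~T();   // 'template' before a destructor name: now an error
}

int destroy_demo() {
  alignas(counted) unsigned char storage[sizeof(counted)];
  counted *p = new (storage) counted;   // construct in place
  destroy_object(p);                    // destroy explicitly, exactly once
  return counted::destroyed;
}
```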
PR c++/94336 - template keyword accepted before destructor names.
* parser.c (cp_parser_unqualified_id): Give an error when 'template'
is followed by a destructor name.
* g++.dg/template/template-keyword2.C: New test.
This simplifies conditions that test both value_dependent_expression_p and
type_dependent_expression_p, since the former predicate now subsumes the latter.
gcc/cp/ChangeLog:
* decl.c (compute_array_index_type_loc): Remove redundant
type_dependent_expression_p check that is subsumed by
value_dependent_expression_p.
* decl2.c (is_late_template_attribute): Likewise.
* pt.c (uses_template_parms): Likewise.
(dependent_template_arg_p): Likewise.
In order for the test output to work we need to include
cstdio.
2020-03-27 Iain Sandoe <iain@sandoe.co.uk>
* g++.dg/coroutines/torture/symmetric-transfer-00-basic.C:
Add <cstdio>.
Consider
  template <typename T> class A {
    template <typename U> class B {
      void fn(typename A<T>::B<U>);
    };
  };
which is rejected with
error: 'typename A<T>::B' names 'template<class T> template<class U> class A<T>::B', which is not a type
whereas clang/icc/msvc accept it.
"typename A<T>::B<U>" is a typename-specifier. Sadly, our comments
don't mention it anywhere, because the typename-specifier wasn't in C++11;
it was only added to the language in N1376. Instead, we handle it as
an elaborated-type-specifier (not a problem thus far). So we get to
cp_parser_nested_name_specifier_opt which has a loop that breaks if we
don't see a < or ::, but that means we can -- tentatively -- parse even
B<U> which is not a nested-name-specifier (it doesn't end with a ::).
I think this should compile because [temp.names]/4 says: "In a qualified-id
used as the name in a typename-specifier, elaborated-type-specifier,
using-declaration, or class-or-decltype, an optional keyword template
appearing at the top level is ignored.", added in DR 1710. Also see
DR 1812.
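For comparison, the spelling with an explicit 'template' keyword in
the typename-specifier should be accepted with or without this change.
A minimal sketch, using struct rather than class so the members are
accessible, and with hypothetical names unwrap and unwrap_demo:

```cpp
#include <cassert>

template <typename T> struct A {
  template <typename U> struct B { U value; };
};

// The 'template' keyword spells out that B names a member template of
// the dependent type A<T>.
template <typename T, typename U>
U unwrap(typename A<T>::template B<U> b) { return b.value; }

long unwrap_demo() {
  A<int>::B<long> b{7};
  return unwrap<int, long>(b);   // T and U are not deducible here
}
```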
This issue on its own is not a significant problem or a regression.
However, in C++20, the typename here becomes optional, and so this test
is rejected in C++20, but accepted in C++17:
  template <typename T> class A {
    template <typename U> class B {
      void fn(A<T>::B<U>);
    };
  };
Here we morph A<T>::B<U> into a typename-specifier, but that happens
in cp_parser_simple_type_specifier and we never handle it as above.
To fake the template keyword I'm afraid we need to use cp_parser_template_id
with template_keyword_p=true as in the patch below. The tricky thing
is to avoid breaking concepts.
To handle DR 1710, I made cp_parser_nested_name_specifier_opt assume that
when we're naming a type, the template keyword is present, too. That
revealed a bug: DR 1710 also says that the template keyword can be followed
by an alias template, but we weren't prepared to handle that. alias-decl?.C
exercise this.
gcc/cp:
DR 1710
PR c++/94057 - template keyword in a typename-specifier.
* parser.c (check_template_keyword_in_nested_name_spec): New.
(cp_parser_nested_name_specifier_opt): Implement DR1710, optional
'template'. Call check_template_keyword_in_nested_name_spec.
(cp_parser_simple_type_specifier): Assume that a <
following a qualified-id in a typename-specifier begins
a template argument list.
gcc/testsuite:
DR 1710
PR c++/94057 - template keyword in a typename-specifier.
* g++.dg/cpp1y/alias-decl1.C: New test.
* g++.dg/cpp1y/alias-decl2.C: New test.
* g++.dg/cpp1y/alias-decl3.C: New test.
* g++.dg/parse/missing-template1.C: Update dg-error.
* g++.dg/parse/template3.C: Likewise.
* g++.dg/template/error4.C: Likewise.
* g++.dg/template/meminit2.C: Likewise.
* g++.dg/template/dependent-name5.C: Likewise.
* g++.dg/template/dependent-name7.C: New test.
* g++.dg/template/dependent-name8.C: New test.
* g++.dg/template/dependent-name9.C: New test.
* g++.dg/template/dependent-name10.C: New test.
* g++.dg/template/dependent-name11.C: New test.
* g++.dg/template/dependent-name12.C: New test.
* g++.dg/template/dependent-name13.C: New test.
* g++.dg/template/dr1794.C: New test.
* g++.dg/template/dr314.C: New test.
* g++.dg/template/dr1710.C: New test.
* g++.dg/template/dr1710-2.C: New test.
* g++.old-deja/g++.pt/crash38.C: Update dg-error.
Although the note in the text [expr.await] / 5.1.1 is not normative,
it is asserted by users that an implementation that is unable to
perform unlimited symmetric transfers is not terribly useful.
This relates to the following circumstance:
  try {
    users-function-body:
    {
      ....
      { some suspend context
        continuation_handle = await_suspend (another handle);
        continuation_handle.resume ();
        'return' (actually a suspension operation).
      }
    }
  } catch (...) {}
The call to 'continuation_handle.resume ()' needs to be a tail-
call in order that an arbitrary number of coroutines can be handled
in this manner. There are two issues with this:
1. That the user's function body is wrapped in a try/catch block and
one cannot tail-call from within those.
2. That GCC doesn't usually produce tail-calls when the optimisation
level is < O2.
After considerable discussion at WG21 meetings, it has been determined
that the intent is that the operation behaves as if the resume call is
executed in the context of the caller.
So, we can remap the fragment above like this:
  {
    void_coroutine_handle continuation;
    try {
      users-function-body:
      {
        ....
        { some suspend context
          continuation = await_suspend (another handle);
          <scope exit without cleanup> symmetric_transfer;
        }
      }
    } catch (...) {}
  symmetric_transfer:
    continuation.resume(); [tail call] [must tail call]
  }
Thus we take the call outside the try-catch block which solves
issue (1) and mark it as a tail call and as "must tail call" for
correctness which solves (2).
As bonuses, since we no longer need to differentiate handle types
returned from await_suspend() methods, nor do we need to keep them
in the coroutine frame, since they are ephemeral, we save entries in
the frame and reduce some code too.
gcc/cp/ChangeLog:
2020-03-26 Iain Sandoe <iain@sandoe.co.uk>
* coroutines.cc (coro_init_identifiers): Initialize an identifier
for the coroutine handle 'address' method name.
(struct coro_aw_data): Add fields to cover the continuations.
(co_await_expander): Determine the kind of await_suspend in use.
If we have the case that returns a continuation handle, then save
this and make the target for 'scope exit without cleanup' be the
continuation resume label.
(expand_co_awaits): Remove.
(struct suspend_point_info): Remove fields that kept the returned
await_suspend handle type.
(transform_await_expr): Remove code tracking continuation handles.
(build_actor_fn): Add the continuation handle as an actor-function
scope var. Build the symmetric transfer continuation point. Call
the tree walk for co_await expansion directly, rather than via a
trivial shim function.
(register_await_info): Remove fields tracking continuation handles.
(get_await_suspend_return_type): Remove.
(register_awaits): Remove code tracking continuation handles.
(morph_fn_to_coro): Remove code tracking continuation handles.
gcc/testsuite/ChangeLog:
2020-03-26 Iain Sandoe <iain@sandoe.co.uk>
* g++.dg/coroutines/torture/co-ret-09-bool-await-susp.C: Amend
to n4849 behaviour.
* g++.dg/coroutines/torture/symmetric-transfer-00-basic.C: New
test.
The standard now calls up a revised mechanism to handle exceptions
where exceptions thrown by the await_resume () method of the
initial suspend expression are considered in the same manner as
exceptions thrown by the user-authored function body.
This implements [dcl.fct.def.coroutine] / 5.3.
gcc/cp/ChangeLog:
2020-03-26 Iain Sandoe <iain@sandoe.co.uk>
* coroutines.cc (co_await_expander): If we are expanding the
initial await expression, set a boolean flag to show that we
have now reached the initial await_resume() method call.
(expand_co_awaits): Handle the 'initial await resume called' flag.
(build_actor_fn): Insert the initial await expression into the
start of the user-authored function-body. Handle the 'initial await
resume called' flag.
(morph_fn_to_coro): Initialise the 'initial await resume called'
flag. Modify the unhandled exception catch clause to recognise
exceptions that occur before the initial await_resume() and re-
throw them.
gcc/testsuite/ChangeLog:
2020-03-26 Iain Sandoe <iain@sandoe.co.uk>
* g++.dg/coroutines/torture/exceptions-test-01-n4849-a.C: New test.
The recent patch to convert all thumb1 code in libgcc to unified syntax
omitted the conditional code that is used only when building the library
for minimal size. This patch fixes this case.
I've also fixed the COND macro so that a single definition is always used
that is for unified syntax. This eliminates a warning that is now being
seen from the assembler when compiling the ieee fp support code.
PR target/94220
* config/arm/lib1funcs.asm (COND): Use a single definition for
unified syntax.
(aeabi_uidivmod): Unified syntax when optimizing Thumb for size.
(aeabi_idivmod): Likewise.
(divsi3_skip_div0_test): Likewise.
The following testcase FAILs since recently when the C++ FE started calling
protected_set_expr_location more often.
With -g, it is called on a STATEMENT_LIST that contains a DEBUG_BEGIN_STMT
and CLEANUP_POINT_EXPR, and as STATEMENT_LISTs have !CAN_HAVE_LOCATION_P,
nothing is set. Without -g, it is called instead on the CLEANUP_POINT_EXPR
directly and changes its location.
The following patch recurses on the single non-DEBUG_BEGIN_STMT statement
of a STATEMENT_LIST if any to make the two behave the same.
2020-03-26 Jakub Jelinek <jakub@redhat.com>
PR debug/94323
* tree.c (protected_set_expr_location): Recurse on STATEMENT_LIST
that contains exactly one non-DEBUG_BEGIN_STMT statement.
* g++.dg/debug/pr94323.C: New test.
The following testcase FAILs, because gimplify_body adds a GIMPLE_NOP only
when there are no statements in the function and with -g there is a
DEBUG_BEGIN_STMT, so it doesn't add it and due to -fno-tree-dce that never
gets removed afterwards. Similarly, if the body seq after gimplification
contains some DEBUG_BEGIN_STMTs plus a single gbind, then we could behave
differently between -g0 and -g, by using that gbind as the body in the -g0
case and not in the -g case.
This patch fixes that by ignoring DEBUG_BEGIN_STMTs (other debug stmts can't
appear at this point yet thankfully) during decisions and if we pick the
single gbind and there are DEBUG_BEGIN_STMTs next to it, it moves them into
the gbind.
While debugging this, I found also a bug in the gimple_seq_last_nondebug_stmt
function, for a seq that has a single non-DEBUG_BEGIN_STMT statement
followed by one or more DEBUG_BEGIN_STMTs it would return NULL rather than
the first statement.
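The intended behaviour of the two helpers can be modelled on a plain
vector.  This is a simplified sketch and not the gimple_seq API: stmt,
first_nondebug, last_nondebug and last_nondebug_demo are hypothetical
stand-ins.

```cpp
#include <cassert>
#include <vector>

// Simplified stand-in for a statement: an id plus a "debug" marker.
struct stmt { int id; bool is_debug; };

// First statement that is not a debug statement, or nullptr.
const stmt *first_nondebug(const std::vector<stmt> &seq) {
  for (const stmt &s : seq)
    if (!s.is_debug)
      return &s;
  return nullptr;
}

// Last statement that is not a debug statement, or nullptr.  Walking
// back over trailing debug statements must still find an earlier
// non-debug one, even if that is the first element.
const stmt *last_nondebug(const std::vector<stmt> &seq) {
  for (auto it = seq.rbegin(); it != seq.rend(); ++it)
    if (!it->is_debug)
      return &*it;
  return nullptr;
}

// One non-debug statement followed by two debug statements: the case
// for which the old function returned NULL.
int last_nondebug_demo() {
  std::vector<stmt> seq = {{1, false}, {2, true}, {3, true}};
  const stmt *s = last_nondebug(seq);
  return s ? s->id : -1;
}
```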
2020-03-26 Jakub Jelinek <jakub@redhat.com>
PR debug/94281
* gimple.h (gimple_seq_first_nondebug_stmt): New function.
(gimple_seq_last_nondebug_stmt): Don't return NULL if seq contains
a single non-debug stmt followed by one or more debug stmts.
* gimplify.c (gimplify_body): Use gimple_seq_first_nondebug_stmt
instead of gimple_seq_first_stmt, use gimple_seq_first_nondebug_stmt
and gimple_seq_last_nondebug_stmt instead of gimple_seq_first and
gimple_seq_last to check if outer_stmt gbind could be reused and
if yes and it is surrounded by any debug stmts, move them into the
gbind body.
* g++.dg/debug/pr94281.C: New test.
The standard says: "A function is user-provided if it is user-declared and
not explicitly defaulted or deleted on its first declaration."
I don't see anything about function templates having different rules here,
but user_provided_p does return true for all TEMPLATE_DECLs.
The following patch fixes it by treating as user-provided only templates
that aren't deleted.
2020-03-26 Jakub Jelinek <jakub@redhat.com>
PR c++/81349
* class.c (user_provided_p): Use STRIP_TEMPLATE instead of returning
true for all TEMPLATE_DECLs.
* g++.dg/cpp1z/pr81349.C: New test.
The following testcase FAILs with -fcompare-debug. The problem is that
the C++ FE initially uses IF_STMTs, tcc_statement which default to
TREE_SIDE_EFFECTS set, but later on is genericized into COND_EXPRs,
tcc_expression which default to TREE_SIDE_EFFECTS ored from all 3 operands.
Furthermore, with -g we emit by default DEBUG_BEGIN_STMTs (TREE_SIDE_EFFECTS
clear) and so end up with a STATEMENT_LIST containing DEBUG_BEGIN_STMT
+ e.g. the IF_STMT, while with -g0 we would end up with just the IF_STMT
alone and in that case there is no STATEMENT_LIST wrapping it.
Now, the STATEMENT_LIST has TREE_SIDE_EFFECTS set to match the IF_STMT,
but if none of the 3 operands (condition and both branches) have
TREE_SIDE_EFFECTS, genericize_if_stmt will replace the IF_STMT with
COND_EXPR without TREE_SIDE_EFFECTS, but with -g only STATEMENT_LIST
wrapping it will keep TREE_SIDE_EFFECTS. Then during gimplification,
shortcut_cond_expr checks TREE_SIDE_EFFECTS of the operands and as it
is different between -g and -g0, will generate different code.
The following patch attempts to fix this by clearing TREE_SIDE_EFFECTS
on STATEMENT_LISTs that initially have it set and contain only
DEBUG_BEGIN_STMT or at most one other statement that lost TREE_SIDE_EFFECTS
during the genericization.
2020-03-26 Jakub Jelinek <jakub@redhat.com>
PR c++/94272
* cp-gimplify.c (cp_genericize_r): Handle STATEMENT_LIST.
* g++.dg/debug/pr94272.C: New test.
With this simple patch, on i686-linux and x86_64-linux with -m32 (no change
for -m64), the find_base_term visited_vals.length () > 100 statistics
changed (fbt is before this patch, fbt2 with this patch):
for k in '' '1$'; do for i in 32 64; do for j in fbt fbt2; do \
echo -n "$j $i $k "; LC_ALL=C grep ^$i.*"$k" $j | wc -l; done; done; done
fbt 32 5313355
fbt2 32 4229854
fbt 64 217523
fbt2 64 217523
fbt 32 1$ 1296
fbt2 32 1$ 407
fbt 64 1$ 0
fbt2 64 1$ 0
For frame_pointer_needed functions, we need to wait until we see the
fp_setter insn in the prologue at which point we disassociate the fp based
VALUEs from sp based VALUEs, but for !frame_pointer_needed functions,
we IMHO don't need to wait for anything. For ACCUMULATE_OUTGOING_ARGS it isn't
IMHO worth doing anything, as there is a single sp adjustment and so there
is no risk of many thousands deep VALUE chains, but for
!ACCUMULATE_OUTGOING_ARGS the sp keeps changing constantly.
2020-03-26 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/92264
* var-tracking.c (add_stores): Call cselib_set_value_sp_based even
for sp based values in !frame_pointer_needed
&& !ACCUMULATE_OUTGOING_ARGS functions.
In the testcase for PR94269, widening_mul moves two multiply instructions
from outside the loop to inside the loop, merging each of them with a
separate add instruction. This increases the cost of the loop. As with
FMA detection in the same pass, simply restrict the ops to be defined in
the same basic block to avoid possibly moving a multiply to a different
block with a higher execution frequency.
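As a rough illustration of the code shape involved (this is not the
PR94269 testcase; the function name and operands are made up), consider
two widening multiplies computed once before a loop, each feeding a
separate add inside it. Fusing each multiply with its add would repeat
the multiplication on every iteration.

```cpp
#include <cassert>

// Hypothetical example: p and q are int * int -> long long products
// defined outside the loop; each is consumed by its own add inside it.
long long sum_with_products(const long long* v, int n,
                            int a, int b, int c, int d) {
  long long p = (long long)a * b;  // multiply #1, outside the loop
  long long q = (long long)c * d;  // multiply #2, outside the loop
  long long s = 0;
  for (int i = 0; i < n; ++i) {
    long long x = v[i] + p;  // add #1, in the loop's basic block
    long long y = v[i] + q;  // add #2, in the loop's basic block
    s += x ^ y;
  }
  return s;
}
```

Here the multiplies and the adds are defined in different basic blocks,
so with the restriction described above they would be left alone.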
2020-03-26 Felix Yang <felix.yang@huawei.com>
PR tree-optimization/94269
* tree-ssa-math-opts.c (convert_plusminus_to_widen): Restrict this
operation to single basic block.
* gcc.dg/pr94269.c: New test.
These tests were supposed to be committed as part of r278904 (aka
b789efeae8) but I didn't 'git add' them.
* testsuite/30_threads/shared_timed_mutex/try_lock_until/1.cc: New
test.
* testsuite/30_threads/shared_timed_mutex/try_lock_until/2.cc: New
test.
For C++20 the wait_until members of mutexes and condition variables are
required to be ill-formed if given a clock that doesn't meet the
requirements for a clock type. To implement that requirement this patch
adds static assertions using the chrono::is_clock trait, and defines
that trait.
To avoid expensive checks for the common cases, the trait (and
associated variable template) are explicitly specialized for the
standard clock types.
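A minimal sketch of how such a trait and check could look follows. It
is not the libstdc++ implementation: the sketch namespace, the exact
set of members tested, and deadline_passed are assumptions made for
illustration, and it is written against C++17 (std::void_t) rather
than the real C++20 chrono::is_clock.

```cpp
#include <cassert>
#include <chrono>
#include <type_traits>

namespace sketch {  // hypothetical namespace, not part of libstdc++

// Primary template: by default a type is not a clock.
template<typename T, typename = void>
struct is_clock : std::false_type {};

// Detected case: the nested types, is_steady and now() are all present.
template<typename T>
struct is_clock<T, std::void_t<typename T::rep, typename T::period,
                               typename T::duration, typename T::time_point,
                               decltype(T::is_steady), decltype(T::now())>>
  : std::true_type {};

// Explicit specialization for a standard clock, so the member checks
// above never need to be instantiated for it.
template<>
struct is_clock<std::chrono::steady_clock> : std::true_type {};

template<typename T>
inline constexpr bool is_clock_v = is_clock<T>::value;

// A type with only some of the required members; not a clock.
struct not_a_clock { using rep = int; };

// Stand-in for a wait_until-style member: it rejects invalid clock
// types at compile time before doing anything with the time point.
template<typename Clock, typename Duration>
bool deadline_passed(const std::chrono::time_point<Clock, Duration>& t) {
  static_assert(is_clock_v<Clock>, "Clock must meet the clock requirements");
  return Clock::now() >= t;
}

} // namespace sketch
```

With this sketch, instantiating deadline_passed with a time_point whose
clock lacks the required members fails the static assertion, which is
the ill-formed behaviour the paragraph describes.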
This also moves the filesystem::__file_clock type from <filesystem> to
<chrono>, so that chrono::file_clock and chrono::file_time can be
defined in <chrono> as required.
* include/bits/fs_fwd.h (filesystem::__file_clock): Move to ...
* include/std/chrono (filesystem::__file_clock): Here.
(filesystem::__file_clock::from_sys, filesystem::__file_clock::to_sys):
Define public member functions for C++20.
(is_clock, is_clock_v): Define traits for C++20.
* include/std/condition_variable (condition_variable::wait_until): Add
check for valid clock.
* include/std/future (_State_baseV2::wait_until): Likewise.
* include/std/mutex (__timed_mutex_impl::_M_try_lock_until): Likewise.
* include/std/shared_mutex (shared_timed_mutex::try_lock_shared_until):
Likewise.
* include/std/thread (this_thread::sleep_until): Likewise.
* testsuite/30_threads/condition_variable/members/2.cc: Qualify
slow_clock with new namespace.
* testsuite/30_threads/condition_variable/members/clock_neg.cc: New
test.
* testsuite/30_threads/condition_variable_any/members/clock_neg.cc:
New test.
* testsuite/30_threads/future/members/clock_neg.cc: New test.
* testsuite/30_threads/recursive_timed_mutex/try_lock_until/3.cc:
Qualify slow_clock with new namespace.
* testsuite/30_threads/recursive_timed_mutex/try_lock_until/clock_neg.cc:
New test.
* testsuite/30_threads/shared_future/members/clock_neg.cc: New
test.
* testsuite/30_threads/shared_lock/locking/clock_neg.cc: New test.
* testsuite/30_threads/shared_timed_mutex/try_lock_until/clock_neg.cc:
New test.
* testsuite/30_threads/timed_mutex/try_lock_until/3.cc: Qualify
slow_clock with new namespace.
* testsuite/30_threads/timed_mutex/try_lock_until/4.cc: Likewise.
* testsuite/30_threads/timed_mutex/try_lock_until/clock_neg.cc: New
test.
* testsuite/30_threads/unique_lock/locking/clock_neg.cc: New test.
* testsuite/std/time/traits/is_clock.cc: New test.
* testsuite/util/slow_clock.h (slow_clock): Move to __gnu_test
namespace.
The following testcase ICEs, because arm_gen_dicompare_reg creates invalid
RTL which then propagates into DEBUG_INSNs and ICEs while handling them.
The problem is that this function emits
(insn 18 17 19 2 (set (reg:CC_DNE 100 cc)
(compare (ior:SI (ne:SI (subreg:SI (reg:DI 129) 0)
(subreg:SI (reg:DI 114 [ _2 ]) 0))
(ne:SI (subreg:SI (reg:DI 129) 4)
(subreg:SI (reg:DI 114 [ _2 ]) 4)))
(const_int 0 [0]))) "pr94292.c":7:11 325 {*cmp_ior}
(nil))
and the invalid thing is that the COMPARE has VOIDmode. Setting a
non-VOIDmode SET_DEST to VOIDmode SET_SRC is only valid if the SET_SRC is
CONST_INT/CONST_DOUBLE.
The following patch fixes it by giving the COMPARE the same mode as it gives
to the SET_DEST cc register.
2020-03-25 Jakub Jelinek <jakub@redhat.com>
PR target/94292
* config/arm/arm.c (arm_gen_dicompare_reg): Set mode of COMPARE to
mode rather than VOIDmode.
* gcc.dg/pr94292.c: New test.
gcc/testsuite/ChangeLog:
PR middle-end/94004
* gcc.dg/Walloca-larger-than-3.c: New test.
* gcc.dg/Walloca-larger-than-3.h: New test header.
* gcc.dg/Wvla-larger-than-4.c: New test.
gcc/ChangeLog:
PR middle-end/94004
* gimple-ssa-warn-alloca.c (pass_walloca::execute): Issue warnings
even for alloca calls resulting from system macro expansion.
Include inlining context in all warnings.