A previous change to simplify LRA introduced in 11b809 (From-SVN: r279550)
disabled hard register splitting for -O0. This causes a problem on aarch64 in
cases where parameters are passed in multiple registers (in the bug report an OI
passed in 2 V4SI registers). This is mandated by the AAPCS.
gcc/ChangeLog:
2020-01-29 Joel Hutton <Joel.Hutton@arm.com>
PR target/93221
* ira.c (ira): Revert use of simplified LRA algorithm.
gcc/testsuite/ChangeLog:
2020-01-29 Joel Hutton <Joel.Hutton@arm.com>
PR target/93221
* gcc.target/aarch64/pr93221.c: New test.
The __3way_builtin_ptr_cmp concept can use three_way_comparable_with to
check whether <=> is valid. Doing that makes it obvious that the
disjunction on compare_three_way::operator() is redundant, because
the second constraint subsumes the first.
The workaround for PR c++/91073 can also be removed as that bug is fixed
now.
* libsupc++/compare (__detail::__3way_builtin_ptr_cmp): Use
three_way_comparable_with.
(__detail::__3way_cmp_with): Remove workaround for fixed bug.
(compare_three_way::operator()): Remove redundant constraint from
requires-clause.
(__detail::_Synth3way::operator()): Use three_way_comparable_with
instead of workaround.
* testsuite/18_support/comparisons/object/93479.cc: Prune extra
output due to simplified constraints on compare_three_way::operator().
Currently types that cannot be compared using <=> but which are
convertible to pointers will be compared by converting to pointers
first. They should not be comparable.
PR libstdc++/93479
* libsupc++/compare (__3way_builtin_ptr_cmp): Require <=> to be valid.
* testsuite/18_support/comparisons/object/93479.cc: New test.
* testsuite/std/ranges/access/end.cc: Do not assume test_range::end()
returns the same type as test_range::begin(). Add comments.
* testsuite/std/ranges/access/rbegin.cc: Likewise.
* testsuite/std/ranges/access/rend.cc: Likewise.
* testsuite/std/ranges/range.cc: Do not assume the sentinel for
test_range is the same as its iterator type.
* testsuite/util/testsuite_iterators.h (test_range::sentinel): Add
operator- overloads to satisfy sized_sentinel_for when the iterator
satisfies random_access_iterator.
2020-01-29 Martin Jambor <mjambor@suse.cz>
PR tree-optimization/92706
* tree-sra.c (struct access): Fields first_link, last_link,
next_queued and grp_queued renamed to first_rhs_link, last_rhs_link,
next_rhs_queued and grp_rhs_queued respectively, new fields
first_lhs_link, last_lhs_link, next_lhs_queued and grp_lhs_queued.
(struct assign_link): Field next renamed to next_rhs, new field
next_lhs. Updated comment.
(work_queue_head): Renamed to rhs_work_queue_head.
(lhs_work_queue_head): New variable.
(add_link_to_lhs): New function.
(relink_to_new_repr): Also relink LHS lists.
(add_access_to_work_queue): Renamed to add_access_to_rhs_work_queue.
(add_access_to_lhs_work_queue): New function.
(pop_access_from_work_queue): Renamed to
pop_access_from_rhs_work_queue.
(pop_access_from_lhs_work_queue): New function.
(build_accesses_from_assign): Also add links to LHS lists and to LHS
work_queue.
(child_would_conflict_in_lacc): Renamed to
child_would_conflict_in_acc. Adjusted parameter names.
(create_artificial_child_access): New parameter set_grp_read, use it.
(subtree_mark_written_and_enqueue): Renamed to
subtree_mark_written_and_rhs_enqueue.
(propagate_subaccesses_across_link): Renamed to
propagate_subaccesses_from_rhs.
(propagate_subaccesses_from_lhs): New function.
(propagate_all_subaccesses): Also propagate subaccesses from LHSs to
RHSs.
testsuite/
* gcc.dg/tree-ssa/pr92706-1.c: New test.
2020-01-29 Martin Jambor <mjambor@suse.cz>
PR tree-optimization/92706
* tree-sra.c (struct access): Adjust comment of
grp_total_scalarization.
(find_access_in_subtree): Look for single children spanning an entire
access.
(scalarizable_type_p): Allow register accesses, adjust callers.
(completely_scalarize): Remove function.
(scalarize_elem): Likewise.
(create_total_scalarization_access): Likewise.
(sort_and_splice_var_accesses): Do not track total scalarization
flags.
(analyze_access_subtree): New parameter totally, adjust to new meaning
of grp_total_scalarization.
(analyze_access_trees): Pass new parameter to analyze_access_subtree.
(can_totally_scalarize_forest_p): New function.
(create_total_scalarization_access): Likewise.
(create_total_access_and_reshape): Likewise.
(total_should_skip_creating_access): Likewise.
(totally_scalarize_subtree): Likewise.
(analyze_all_variable_accesses): Perform total scalarization after
subaccess propagation using the new functions above.
(initialize_constant_pool_replacements): Output initializers by
traversing the access tree.
testsuite/
* gcc.dg/tree-ssa/pr92706-2.c: New test.
* gcc.dg/guality/pr59776.c: Xfail tests for s2.g.
2020-01-29 Martin Jambor <mjambor@suse.cz>
* tree-sra.c (verify_sra_access_forest): New function.
(verify_all_sra_access_forests): Likewise.
(create_artificial_child_access): Set parent.
(analyze_all_variable_accesses): Call the verifier.
* cgraph.c (cgraph_edge::resolve_speculation): Only lookup direct edge
if called on indirect edge.
(cgraph_edge::redirect_call_stmt_to_callee): Lookup indirect edge of
speculative call if needed.
* gcc.dg/tree-prof/indir-call-prof-2.c: New testcase.
Add full support for the OpenACC 2.6 acc_get_property and
acc_get_property_string functions to the libgomp GCN plugin.
libgomp/
* plugin-gcn.c (struct agent_info): Add fields "name" and
"vendor_name" ...
(GOMP_OFFLOAD_init_device): ... and init from here.
(struct hsa_context_info): Add field "driver_version_s" ...
(init_hsa_contest): ... and init from here.
(GOMP_OFFLOAD_openacc_get_property): Replace stub with a proper
implementation.
* testsuite/libgomp.oacc-c-c++-common/acc_get_property.c:
Enable test execution for amdgcn and host offloading targets.
* testsuite/libgomp.oacc-fortran/acc_get_property.f90: Likewise.
* testsuite/libgomp.oacc-c-c++-common/acc_get_property-aux.c
(expect_device_properties): Split function into ...
(expect_device_string_properties): ... this new function ...
(expect_device_memory): ... and this new function.
* testsuite/libgomp.oacc-c-c++-common/acc_get_property-gcn.c:
Add test.
If the typeinfo decls appear in OpenMP default(none) regions, as we no longer
predetermine const with no mutable members, they are diagnosed as errors,
but it isn't something the users can actually provide explicit sharing for in
the clauses.
2020-01-29 Jakub Jelinek <jakub@redhat.com>
PR c++/91118
* cp-gimplify.c (cxx_omp_predetermined_sharing): Return
OMP_CLAUSE_DEFAULT_SHARED for typeinfo decls.
* g++.dg/gomp/pr91118-1.C: New test.
* g++.dg/gomp/pr91118-2.C: New test.
As the testcase shows, some EXEC_OACC_* codes weren't handled in
oacc_code_to_statement. Fixed thusly.
2020-01-29 Jakub Jelinek <jakub@redhat.com>
PR fortran/93463
* openmp.c (oacc_code_to_statement): Handle
EXEC_OACC_{ROUTINE,UPDATE,WAIT,CACHE,{ENTER,EXIT}_DATA,DECLARE}.
* gfortran.dg/goacc/pr93463.f90: New test.
All that is really needed is make sure you #include "diagnostic-core.h"
before including pretty-print.h. By including
diagnostic-core.h first, you do:
and then pretty-print.h will do:
If instead pretty-print.h is included first, then it will use __gcc_diag__
instead of __gcc_tdiag__ and thus will assume %E/%D etc. can't be handled.
2020-01-29 Jakub Jelinek <jakub@redhat.com>
* analyzer.h (PUSH_IGNORE_WFORMAT, POP_IGNORE_WFORMAT): Remove.
* constraint-manager.cc: Include diagnostic-core.h before graphviz.h.
(range::dump, equiv_class::print): Don't use PUSH_IGNORE_WFORMAT or
POP_IGNORE_WFORMAT.
* state-purge.cc: Include diagnostic-core.h before
gimple-pretty-print.h.
(state_purge_annotator::add_node_annotations, print_vec_of_names):
Don't use PUSH_IGNORE_WFORMAT or POP_IGNORE_WFORMAT.
* region-model.cc: Move diagnostic-core.h include before graphviz.h.
(path_var::dump, svalue::print, constant_svalue::print_details,
region::dump_to_pp, region::dump_child_label, region::print_fields,
map_region::print_fields, map_region::dump_dot_to_pp,
map_region::dump_child_label, array_region::print_fields,
array_region::dump_dot_to_pp): Don't use PUSH_IGNORE_WFORMAT or
POP_IGNORE_WFORMAT.
With SLP now being a graph with shared nodes across instances we have
to make sure to compute the load permutation of nodes once, not
overwriting the result of earlier analysis.
2020-01-28 Richard Biener <rguenther@suse.de>
PR tree-optimization/93428
* tree-vect-slp.c (vect_build_slp_tree_2): Compute the load
permutation when the load node is created.
(vect_analyze_slp_instance): Re-use it here.
* gcc.dg/torture/pr93428.c: New testcase.
A return statement in a discarded statement is not used for return type
deduction, but we still want to do deduction for a return statement in a
lambda in a discarded statement.
PR c++/93442
* parser.c (cp_parser_lambda_expression): Clear in_discarded_stmt.
My patch for PR 91476 worked for decls that are implicitly comdat/weak due
to C++ linkage rules, but broke variables explicitly marked weak.
PR c++/93477
PR c++/91476
* decl2.c (copy_linkage): Do copy DECL_ONE_ONLY and DECL_WEAK.
Comments 11-16 within PR analyzer/93316 discuss an ICE in some setjmp
tests seen on AIX and powerpc-darwin9.
The issue turned out to be an implicit assumption that longjmp is
marked "noreturn". There are two places in engine.cc where the code
attempted to locate the longjmp GIMPLE_CALL by finding the final stmt
within its supernode, in one place casting it via "as_a <gcall *>",
in the other using it as the location_t of the
"rewinding from longjmp..." event.
When longjmp isn't marked noreturn, its basic block and hence supernode
can have additional stmts after the longjmp; in the setjmp-3.c case
this was a GIMPLE_RETURN, leading to a ICE when casting this to a
GIMPLE_CALL.
This patch fixes the two places in question to use the
rewind_info_t::get_longjmp_call
member function introduced by 342e14ffa3,
fixing the ICE (and ensuring the correct location_t is used for events).
gcc/analyzer/ChangeLog:
PR analyzer/93316
* engine.cc (rewind_info_t::update_model): Get the longjmp call
stmt via get_longjmp_call () rather than assuming it is the last
stmt in the longjmp's supernode.
(rewind_info_t::add_events_to_path): Get the location_t for the
rewind_from_longjmp_event via get_longjmp_call () rather than from
the supernode's get_end_location ().
2020-01-28 Vladimir Makarov <vmakarov@redhat.com>
PR rtl-optimization/93272
* ira-lives.c (process_out_of_region_eh_regs): New function.
(process_bb_node_lives): Call it.
This patch started as work to resole Richard's comment on quadratic lookups
in resolve_speculation. While doing it I however noticed multiple problems
in the new speuclative call code which made the patch quite big. In
particular:
1) Before applying speculation we consider only targets with at lest
probability 1/2.
If profile is sane at most two targets can have probability greater or
equal to 1/2. So the new multi-target speculation code got enabled only
in very special scenario when there ae precisely two target with precise
probability 1/2 (which is tested by the single testcase).
As a conseuqence the multiple target logic got minimal test coverage and
this made us to miss several ICEs.
2) Profile updating in profile merging, tree-inline and indirect call
expansion was wrong which led to inconsistent profiles (as already seen
on the testcase).
3) Code responsible to turn speculative call to direct call was broken for
anything with more than one target.
4) There were multiple cases where call_site_hash went out of sync which
eventually leads to an ICE..
5) Some code expects that all speculative call targets forms a sequence in
the callee linked list but there is no code to maintain that invariant
nor a verifier.
Fixing this it became obvious that the current API of speculative_call_info is
not useful because it really builds on fact tht there are precisely three
components (direct call, ref and indirect call) in every speculative call
sequence. I ended up replacing it with iterator API for direct call
(first_speculative_call_target, next_speculative_call_target) and accessors for
the other coponents updating comment in cgraph.h.
Finally I made the work with call site hash more effetive by updating edge
manipulation to keep them in sequence. So first one can be looked up from the
hash and then they can be iterated by callee.
There are other things that can be improved (for example the speculation should
start with most common target first), but I will try to keep that for next
stage1. This patch is mostly about getting rid of ICE and profile corruption
which is a regression from GCC 9.
Honza
gcc/ChangeLog:
2020-01-28 Jan Hubicka <hubicka@ucw.cz>
PR lto/93318
* cgraph.c (cgraph_add_edge_to_call_site_hash): Update call site
hash only when edge is first within the sequence.
(cgraph_edge::set_call_stmt): Update handling of speculative calls.
(symbol_table::create_edge): Do not set target_prob.
(cgraph_edge::remove_caller): Watch for speculative calls when updating
the call site hash.
(cgraph_edge::make_speculative): Drop target_prob parameter.
(cgraph_edge::speculative_call_info): Remove.
(cgraph_edge::first_speculative_call_target): New member function.
(update_call_stmt_hash_for_removing_direct_edge): New function.
(cgraph_edge::resolve_speculation): Rewrite to new API.
(cgraph_edge::speculative_call_for_target): New member function.
(cgraph_edge::make_direct): Rewrite to new API; fix handling of
multiple speculation targets.
(cgraph_edge::redirect_call_stmt_to_callee): Likewise; fix updating
of profile.
(verify_speculative_call): Verify that targets form an interval.
* cgraph.h (cgraph_edge::speculative_call_info): Remove.
(cgraph_edge::first_speculative_call_target): New member function.
(cgraph_edge::next_speculative_call_target): New member function.
(cgraph_edge::speculative_call_target_ref): New member function.
(cgraph_edge;:speculative_call_indirect_edge): New member funtion.
(cgraph_edge): Remove target_prob.
* cgraphclones.c (cgraph_node::set_call_stmt_including_clones):
Fix handling of speculative calls.
* ipa-devirt.c (ipa_devirt): Fix handling of speculative cals.
* ipa-fnsummary.c (analyze_function_body): Likewise.
* ipa-inline.c (speculation_useful_p): Use new speculative call API.
* ipa-profile.c (dump_histogram): Fix formating.
(ipa_profile_generate_summary): Watch for overflows.
(ipa_profile): Do not require probablity to be 1/2; update to new API.
* ipa-prop.c (ipa_make_edge_direct_to_target): Update to new API.
(update_indirect_edges_after_inlining): Update to new API.
* ipa-utils.c (ipa_merge_profiles): Rewrite merging of speculative call
profiles.
* profile-count.h: (profile_probability::adjusted): New.
* tree-inline.c (copy_bb): Update to new speculative call API; fix
updating of profile.
* value-prof.c (gimple_ic_transform): Rename to ...
(dump_ic_profile): ... this one; update dumping.
(stream_in_histogram_value): Fix formating.
(gimple_value_profile_transformations): Update.
gcc/testsuite/ChangeLog:
2020-01-28 Jan Hubicka <hubicka@ucw.cz>
* g++.dg/tree-prof/indir-call-prof.C: Update template.
* gcc.dg/tree-prof/crossmodule-indircall-1.c: Add more targets.
* gcc.dg/tree-prof/crossmodule-indircall-1a.c: Add more targets.
* gcc.dg/tree-prof/indir-call-prof.c: Update template.
When I implemented the [over.match.ref] rule that a reference conversion
function needs to match l/rvalue of the target reference type it changed our
handling of this testcase. It seems to me that our current behavior is what
the standard says, but it doesn't seem desirable, and all the other
compilers have our old behavior. So let's limit the change to non-templates
until there's some clarification from the committee.
PR c++/90546
* call.c (build_user_type_conversion_1): Allow a template conversion
returning an rvalue reference to bind directly to an lvalue.
This patch started as work to resole Richard's comment on quadratic lookups
in resolve_speculation. While doing it I however noticed multiple problems
in the new speuclative call code which made the patch quite big. In
particular:
1) Before applying speculation we consider only targets with at lest
probability 1/2.
If profile is sane at most two targets can have probability greater or
equal to 1/2. So the new multi-target speculation code got enabled only
in very special scenario when there ae precisely two target with precise
probability 1/2 (which is tested by the single testcase).
As a conseuqence the multiple target logic got minimal test coverage and
this made us to miss several ICEs.
2) Profile updating in profile merging, tree-inline and indirect call
expansion was wrong which led to inconsistent profiles (as already seen
on the testcase).
3) Code responsible to turn speculative call to direct call was broken for
anything with more than one target.
4) There were multiple cases where call_site_hash went out of sync which
eventually leads to an ICE..
5) Some code expects that all speculative call targets forms a sequence in
the callee linked list but there is no code to maintain that invariant
nor a verifier.
Fixing this it became obvious that the current API of speculative_call_info is
not useful because it really builds on fact tht there are precisely three
components (direct call, ref and indirect call) in every speculative call
sequence. I ended up replacing it with iterator API for direct call
(first_speculative_call_target, next_speculative_call_target) and accessors for
the other coponents updating comment in cgraph.h.
Finally I made the work with call site hash more effetive by updating edge
manipulation to keep them in sequence. So first one can be looked up from the
hash and then they can be iterated by callee.
There are other things that can be improved (for example the speculation should
start with most common target first), but I will try to keep that for next
stage1. This patch is mostly about getting rid of ICE and profile corruption
which is a regression from GCC 9.
gcc/ChangeLog:
PR lto/93318
* cgraph.c (cgraph_add_edge_to_call_site_hash): Update call site
hash only when edge is first within the sequence.
(cgraph_edge::set_call_stmt): Update handling of speculative calls.
(symbol_table::create_edge): Do not set target_prob.
(cgraph_edge::remove_caller): Watch for speculative calls when updating
the call site hash.
(cgraph_edge::make_speculative): Drop target_prob parameter.
(cgraph_edge::speculative_call_info): Remove.
(cgraph_edge::first_speculative_call_target): New member function.
(update_call_stmt_hash_for_removing_direct_edge): New function.
(cgraph_edge::resolve_speculation): Rewrite to new API.
(cgraph_edge::speculative_call_for_target): New member function.
(cgraph_edge::make_direct): Rewrite to new API; fix handling of
multiple speculation targets.
(cgraph_edge::redirect_call_stmt_to_callee): Likewise; fix updating
of profile.
(verify_speculative_call): Verify that targets form an interval.
* cgraph.h (cgraph_edge::speculative_call_info): Remove.
(cgraph_edge::first_speculative_call_target): New member function.
(cgraph_edge::next_speculative_call_target): New member function.
(cgraph_edge::speculative_call_target_ref): New member function.
(cgraph_edge;:speculative_call_indirect_edge): New member funtion.
(cgraph_edge): Remove target_prob.
* cgraphclones.c (cgraph_node::set_call_stmt_including_clones):
Fix handling of speculative calls.
* ipa-devirt.c (ipa_devirt): Fix handling of speculative cals.
* ipa-fnsummary.c (analyze_function_body): Likewise.
* ipa-inline.c (speculation_useful_p): Use new speculative call API.
* ipa-profile.c (dump_histogram): Fix formating.
(ipa_profile_generate_summary): Watch for overflows.
(ipa_profile): Do not require probablity to be 1/2; update to new API.
* ipa-prop.c (ipa_make_edge_direct_to_target): Update to new API.
(update_indirect_edges_after_inlining): Update to new API.
* ipa-utils.c (ipa_merge_profiles): Rewrite merging of speculative call
profiles.
* profile-count.h: (profile_probability::adjusted): New.
* tree-inline.c (copy_bb): Update to new speculative call API; fix
updating of profile.
* value-prof.c (gimple_ic_transform): Rename to ...
(dump_ic_profile): ... this one; update dumping.
(stream_in_histogram_value): Fix formating.
(gimple_value_profile_transformations): Update.
gcc/testsuite/ChangeLog:
* g++.dg/tree-prof/indir-call-prof.C: Update template.
* gcc.dg/tree-prof/crossmodule-indircall-1.c: Add more targets.
* gcc.dg/tree-prof/crossmodule-indircall-1a.c: Add more targets.
* gcc.dg/tree-prof/indir-call-prof.c: Update template.
Changed in v2:
- rename from warning_with_metadata_at to warning_meta
- fix test plugins
While C++ can have overloads, xgettext can't deal with overloads that have
different argument positions, leading to two failures in "make gcc.pot":
emit_diagnostic_valist used incompatibly as both
--keyword=emit_diagnostic_valist:4
--flag=emit_diagnostic_valist:4:gcc-internal-format and
--keyword=emit_diagnostic_valist:5
--flag=emit_diagnostic_valist:5:gcc-internal-format
warning_at used incompatibly as both
--keyword=warning_at:3 --flag=warning_at:3:gcc-internal-format and
--keyword=warning_at:4 --flag=warning_at:4:gcc-internal-format
The emit_diagnostic_valist overload isn't used anywhere (I think it's
a leftover from an earlier iteration of the analyzer patch kit).
The warning_at overload is used throughout the analyzer but nowhere else.
Ideally I'd like to consolidate this argument with something
constructable in various ways:
- from a metadata and an int or
- from an int (or, better an "enum opt_code"),
so that the overload happens when implicitly choosing the ctor, but
that feels like stage 1 material.
In the meantime, fix xgettext by deleting the unused overload and
renaming the used one.
gcc/analyzer/ChangeLog:
* region-model.cc (poisoned_value_diagnostic::emit): Update for
renaming of warning_at overload to warning_meta.
* sm-file.cc (file_leak::emit): Likewise.
* sm-malloc.cc (double_free::emit): Likewise.
(possible_null_deref::emit): Likewise.
(possible_null_arg::emit): Likewise.
(null_deref::emit): Likewise.
(null_arg::emit): Likewise.
(use_after_free::emit): Likewise.
(malloc_leak::emit): Likewise.
(free_of_non_heap::emit): Likewise.
* sm-sensitive.cc (exposure_through_output_file::emit): Likewise.
* sm-signal.cc (signal_unsafe_call::emit): Likewise.
* sm-taint.cc (tainted_array_index::emit): Likewise.
gcc/ChangeLog:
* diagnostic-core.h (warning_at): Rename overload to...
(warning_meta): ...this.
(emit_diagnostic_valist): Delete decl of overload taking
diagnostic_metadata.
* diagnostic.c (emit_diagnostic_valist): Likewise for defn.
(warning_at): Rename overload taking diagnostic_metadata to...
(warning_meta): ...this.
gcc/testsuite/ChangeLog:
* gcc.dg/plugin/diagnostic_plugin_test_metadata.c: Update for
renaming of warning_at overload to warning_meta.
* gcc.dg/plugin/diagnostic_plugin_test_paths.c: Likewise.
PR fortran/93461
* trans.h: Increase GFC_MAX_MANGLED_SYMBOL_LEN to
GFC_MAX_SYMBOL_LEN*3+5 to allow for inclusion of submodule name,
plus the "." between module and submodule names.
* gfortran.dg/pr93461.f90: New test.
Increase length of char variables "parent1" and "parent2" in
set_syms_host_assoc() to allow them to hold concatenated module +
submodule names.
PR fortran/93473
* parse.c: Increase length of char variables to allow them to hold
a concatenated module + submodule name.
* gfortran.dg/pr93473.f90: New test.
The clever hack of '#define __has_include __has_include' breaks -dD
and -fdirectives-only, because that emits definitions. This turns
__has_include into a proper builtin macro. Thus it's never emitted
via -dD, and because use outside of directive processing is undefined,
we can just expand it anywhere.
PR preprocessor/93452
* internal.h (struct spec_nodes): Drop n__has_include{,_next}.
* directives.c (lex_macro_node): Don't check __has_include redef.
* expr.c (eval_token): Drop __has_include eval.
(parse_has_include): Move to ...
* macro.c (builtin_has_include): ... here.
(_cpp_builtin_macro_text): Eval __has_include{,_next}.
* include/cpplib.h (enum cpp_builtin_type): Add BT_HAS_INCLUDE{,_NEXT}.
* init.c (builtin_array): Add them.
(cpp_init_builtins): Drop __has_include{,_next} init here ...
* pch.c (cpp_read_state): ... and here.
* traditional.c (enum ls): Drop has_include states ...
(_cpp_scan_out_logical_line): ... and here.
We just need to handle the exception specification like other properties of
a function typedef.
PR c++/90731
* decl.c (grokdeclarator): Propagate eh spec from typedef.
gcc/fortran/
* gfortran.h (gfc_symbol): Add comp_mark bitfield.
* openmp.c (resolve_omp_clauses): Disallow mixed component and
full-derived-type accesses to the same variable within a single
directive.
libgomp/
* testsuite/libgomp.oacc-fortran/deep-copy-2.f90: Remove test from here.
* testsuite/libgomp.oacc-fortran/deep-copy-3.f90: Don't use mixed
component/non-component variable refs in a single directive.
* testsuite/libgomp.oacc-fortran/classtypes-1.f95: Likewise.
gcc/testsuite/
* gfortran.dg/goacc/deep-copy-2.f90: Move test here (from libgomp
testsuite). Make a compilation test, and expect rejection of mixed
component/non-component accesses.
* gfortran.dg/goacc/mapping-tests-1.f90: New test.
It's wrong to assume that clock_gettime is unavailable on any *-*-linux*
target that doesn't have glibc 2.17 or later. Use a generic test instead
of using __GLIBC_PREREQ. Only do that test when is_hosted=yes so that we
don't get an error for cross targets without a working linker.
This ensures that C library's clock_gettime will be used on non-glibc
targets, instead of an incorrect syscall to SYS_clock_gettime.
PR libstdc++/93325
* acinclude.m4 (GLIBCXX_ENABLE_LIBSTDCXX_TIME): Use AC_SEARCH_LIBS for
clock_gettime instead of explicit glibc version check.
* configure: Regenerate.
Autopar was doing clique bookkeeping too early when creating destination
functions but then later introducing new cliques via versioning loops.
The following moves the bookkeeping to the actual outlining process.
2020-01-28 Richard Biener <rguenther@suse.de>
PR tree-optimization/93439
* tree-parloops.c (create_loop_fn): Move clique bookkeeping...
* tree-cfg.c (move_sese_region_to_fn): ... here.
(verify_types_in_gimple_reference): Verify used cliques are
tracked.
* gfortran.dg/graphite/pr93439.f90: New testcase.
There are
static void
parse_mtune_ctrl_str (bool dump)
{
if (!ix86_tune_ctrl_string)
return;
parse_mtune_ctrl_str is only called from set_ix86_tune_features, which
is only called from ix86_function_specific_restore and
ix86_option_override_internal. parse_mtune_ctrl_str shouldn't use
ix86_tune_ctrl_string which is defined with global_options. Instead,
opts should be passed to parse_mtune_ctrl_str.
PR target/91399
* config/i386/i386-options.c (set_ix86_tune_features): Add an
argument of a pointer to struct gcc_options and pass it to
parse_mtune_ctrl_str.
(ix86_function_specific_restore): Pass opts to
set_ix86_tune_features.
(ix86_option_override_internal): Likewise.
(parse_mtune_ctrl_str): Add an argument of a pointer to struct
gcc_options and use it for x_ix86_tune_ctrl_string.
This change makes sure that if the driver is invoked with
"-mcode-density" flag, then the assembler will receive it too.
Note Claudiu Zissulescu:
This is an old patch of which I forgot to add the test.
testsuite/
2019-09-03 Sahahb Vahedi <shahab@synopsys.com>
* gcc.target/arc/code-density-flag.c: New test.
In the gcc.target/aarch64/lsl_asr_sbfiz.c part of this PR, we have:
Failed to match this instruction:
(set (reg:SI 95)
(ashift:SI (subreg:SI (sign_extract:DI (subreg:DI (reg:SI 97) 0)
(const_int 3 [0x3])
(const_int 0 [0])) 0)
(const_int 19 [0x13])))
If we perform the natural simplification to:
(set (reg:SI 95)
(ashift:SI (sign_extract:SI (reg:SI 97)
(const_int 3 [0x3])
(const_int 0 [0])) 0)
(const_int 19 [0x13])))
then the pattern matches. And it turns out that we do have a
simplification like that already, but it would only kick in for
extractions from a reg, not a subreg. E.g.:
(set (reg:SI 95)
(ashift:SI (subreg:SI (sign_extract:DI (reg:DI X)
(const_int 3 [0x3])
(const_int 0 [0])) 0)
(const_int 19 [0x13])))
would simplify to:
(set (reg:SI 95)
(ashift:SI (sign_extract:SI (subreg:SI (reg:DI X) 0)
(const_int 3 [0x3])
(const_int 0 [0])) 0)
(const_int 19 [0x13])))
IMO the subreg case is even more obviously a simplification
than the bare reg case, since the net effect is to remove
either one or two subregs, rather than simply change the
position of a subreg/truncation.
However, doing that regressed gcc.dg/tree-ssa/pr64910-2.c
for -m32 on x86_64-linux-gnu, because we could then simplify
a :HI zero_extract to a :QI one. The associated *testqi_ext_3
pattern did already seem to want to handle QImode extractions:
"ix86_match_ccmode (insn, CCNOmode)
&& ((TARGET_64BIT && GET_MODE (operands[2]) == DImode)
|| GET_MODE (operands[2]) == SImode
|| GET_MODE (operands[2]) == HImode
|| GET_MODE (operands[2]) == QImode)
but I'm not sure how often the QI case would trigger in practice,
since the zero_extract mode was restricted to HI and above. I checked
the other x86 patterns and couldn't see any other instances of this.
2020-01-28 Richard Sandiford <richard.sandiford@arm.com>
gcc/
PR rtl-optimization/87763
* simplify-rtx.c (simplify_truncation): Extend sign/zero_extract
simplification to handle subregs as well as bare regs.
* config/i386/i386.md (*testqi_ext_3): Match QI extracts too.
gcc.dg/pr56350.c started ICEing for SVE in GCC 10 because we
pattern-matched a division reduction:
a /= 8;
into a signed shift with division semantics:
... = IFN_SDIV_POW2 (..., 3);
whereas the reduction code expected it still to be a gassign.
One fix would be to check for a reduction in the pattern matcher
(but current patterns don't generally do that). Another would be
to fail gracefully for reductions involving calls. Since we can't
vectorise the reduction either way, and probably have a better shot
with the shift form, this patch goes for the "fail gracefully" approach.
2020-01-28 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vect-loop.c (vectorizable_reduction): Fail gracefully
for reduction chains that (now) include a call.
PR fortran/93464
* openmp.c (gfc_omp_check_optional_argument): Avoid ICE when
DECL_LANG_SPECIFIC and GFC_DESCRIPTOR_TYPE_P but not
GFC_DECL_SAVED_DESCRIPTOR as for local allocatable character vars.
PR fortran/93464
* gfortran.dg/goacc/pr93464.f90: New.
For the 2s failures in the PR, we have a V4SF VEC_PERM_EXPR in
which the first two elements are duplicates of one element and
the other two are don't-care:
v4sf_b = VEC_PERM_EXPR <v4sf_a, v4sf_a, { 1, 1, ?, ? }>;
The heuristic was to extend this with a blend:
v4sf_b = VEC_PERM_EXPR <v4sf_a, v4sf_a, { 1, 1, 2, 3 }>;
but it seems better to extend a partial duplicate to a full duplicate:
v4sf_b = VEC_PERM_EXPR <v4sf_a, v4sf_a, { 1, 1, 1, 1 }>;
Obviously this is still just a heuristic though.
I wondered whether to restrict this to two elements or more
but couldn't find any examples in which it made a difference.
Either way should be fine for the purposes of fixing this PR.
2020-01-28 Richard Sandiford <richard.sandiford@arm.com>
gcc/
PR tree-optimization/92822
* tree-ssa-forwprop.c (simplify_vector_constructor): When filling
out the don't-care elements of a vector whose significant elements
are duplicates, make the don't-care elements duplicates too.
predcom has the following code to stop one rogue load from
interfering with other store-load opportunities:
/* If A is read and B write or vice versa and there is unsuitable
dependence, instead of merging both components into a component
that will certainly not pass suitable_component_p, just put the
read into bad component, perhaps at least the write together with
all the other data refs in it's component will be optimizable. */
But when store-store commoning was added later, this had the effect
of ignoring loads that occur between two candidate stores.
There is code further up to handle loads and stores with unknown
dependences:
/* Don't do store elimination if there is any unknown dependence for
any store data reference. */
if ((DR_IS_WRITE (dra) || DR_IS_WRITE (drb))
&& (DDR_ARE_DEPENDENT (ddr) == chrec_dont_know
|| DDR_NUM_DIST_VECTS (ddr) == 0))
eliminate_store_p = false;
But the store-load code above skips loads for *known* dependences
if (a) the load has already been marked "bad" or (b) the data-ref
machinery knows the dependence distance, but determine_offsets
can't handle the combination.
(a) happens to be the problem in the testcase, but a different
sequence could have given (b) instead. We have writes to individual
fields of a structure and reads from the whole structure. Since
determine_offsets requires the types to be the same, it returns false
for each such read/write combination.
This patch records which components have had loads removed and
prevents store-store commoning for them. It's a bit too pessimistic,
since there shouldn't be a problem if a "bad" load dominates all stores
in a component. But (a) we can't AFAIK use pcom_stmt_dominates_stmt_p
here and (b) the handling for that case would probably need to be
removed again if we handled more exotic cases in future.
2020-01-28 Richard Sandiford <richard.sandiford@arm.com>
gcc/
PR tree-optimization/93434
* tree-predcom.c (split_data_refs_to_components): Record which
components have had aliasing loads removed. Prevent store-store
commoning for all such components.
gcc/testsuite/
PR tree-optimization/93434
* gcc.c-torture/execute/pr93434.c: New test.