Commit Graph

189865 Commits

Jason Merrill 37326651b4 c++: check constexpr constructor body
The implicit constexpr patch revealed that our checks for constexpr
constructors that could possibly produce a constant value (which
otherwise are IFNDR) were failing to look at most of the function body.
Fixing that required some library tweaks.

gcc/cp/ChangeLog:

	* constexpr.c (maybe_save_constexpr_fundef): Also check whether the
	body of a constructor is potentially constant.

libstdc++-v3/ChangeLog:

	* src/c++17/memory_resource.cc: Add missing constexpr.
	* include/experimental/internet: Only mark copy constructor
	as constexpr with __cpp_constexpr_dynamic_alloc.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp1y/constexpr-89285-2.C: Expect error.
	* g++.dg/cpp1y/constexpr-89285.C: Adjust error.
2021-11-15 02:50:45 -05:00
Jason Merrill daa9c6b015 c++: is_this_parameter and coroutines proxies
Compiling coroutines/pr95736.C with the implicit constexpr patch broke
because is_this_parameter didn't recognize the coroutines proxy for 'this'.

gcc/cp/ChangeLog:

	* semantics.c (is_this_parameter): Check DECL_HAS_VALUE_EXPR_P
	instead of is_capture_proxy.
2021-11-15 02:50:26 -05:00
Jason Merrill bd95d75f34 c++: c++20 constexpr default ctor and array init
The implicit constexpr patch revealed that marking the constructor in the
PR70690 testcase as constexpr made the bug reappear, because build_vec_init
assumed that a constexpr default constructor initialized the whole object,
so it was equivalent to value-initialization.  But this is no longer true in
C++20.

	PR c++/70690

gcc/cp/ChangeLog:

	* init.c (build_vec_init): Check default_init_uninitialized_part in
	C++20.

gcc/testsuite/ChangeLog:

	* g++.dg/init/array41a.C: New test.
2021-11-15 02:49:51 -05:00
Jason Merrill 4df7f8c798 c++: don't do constexpr folding in unevaluated context
The implicit constexpr patch revealed that we were doing constant evaluation
of arbitrary expressions in unevaluated contexts, leading to failure when we
tried to evaluate e.g. a call to declval.  This is wrong more generally;
only manifestly-constant-evaluated expressions should be evaluated within
an unevaluated operand.

Making this change revealed a case we were failing to mark as manifestly
constant-evaluated.

gcc/cp/ChangeLog:

	* constexpr.c (maybe_constant_value): Don't evaluate
	in an unevaluated operand unless manifestly const-evaluated.
	(fold_non_dependent_expr_template): Likewise.
	* decl.c (compute_array_index_type_loc): This context is
	manifestly constant-evaluated.
2021-11-15 02:45:48 -05:00
Jason Merrill 267318a285 c++: constexpr virtual and vbase thunk
C++20 allows virtual functions to be constexpr.  I don't think that calling
through a pointer to a vbase subobject is supposed to work in a constant
expression, since an object with virtual bases can't be constant, but the
call shouldn't ICE.

gcc/cp/ChangeLog:

	* constexpr.c (cxx_eval_thunk_call): Error instead of ICE
	on vbase thunk to constexpr function.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp2a/constexpr-virtual20.C: New test.
2021-11-15 02:30:26 -05:00
Hans-Peter Nilsson adcfd2c45c gcc.dg/uninit-pred-9_b.c: Correct last adjustment for cris-elf
The change at r12-4790 should have done the same change for
CRIS as was done for powerpc64*-*-*.  (Probably MMIX too but
that may have to wait until the next weekend.)

gcc/testsuite:
	* gcc.dg/uninit-pred-9_b.c: Correct last adjustment, for CRIS.
2021-11-15 07:59:16 +01:00
Maciej W. Rozycki 3e09331f6a VAX: Implement the `-mlra' command-line option
Add the `-mlra' command-line option for the VAX target, with the
usual semantics of enabling Local Register Allocation, off by default.

LRA remains unstable with the VAX target, with numerous ICEs throughout
the testsuite and worse code produced overall where compilation does
succeed.  However, the presence of a command-line option to enable it
makes it easier to experiment with, as the compiler does not have to be
rebuilt to flip between the old reload pass and LRA.

	gcc/
	* config/vax/vax.c (vax_lra_p): New prototype and function.
	(TARGET_LRA_P): Wire it.
	* config/vax/vax.opt (mlra): New option.
	* doc/invoke.texi (Option Summary, VAX Options): Document the
	new option.
2021-11-15 03:14:31 +00:00
GCC Administrator b85a03ae11 Daily bump. 2021-11-15 00:16:20 +00:00
Andrew Pinski 09f33d12b5 [Committed] Move some testcases to torture from tree-ssa
While writing up some testcases, I noticed some newer testcases
just had "dg-do compile/run" on them with dg-options of either -O1
or -O2.  Since it is always better to run them over all optimization
levels, I put them in gcc.c-torture/compile or gcc.c-torture/execute.

Committed after testing to make sure the testcases pass.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/pr100278.c: Move to ...
	* gcc.c-torture/compile/pr100278.c: Here.
	Remove dg-do and dg-options.
	* gcc.dg/tree-ssa/pr101189.c: Move to ...
	* gcc.c-torture/compile/pr101189.c: Here.
	Remove dg-do and dg-options.
	* gcc.dg/tree-ssa/pr100453.c: Move to ...
	* gcc.c-torture/execute/pr100453.c: Here.
	Remove dg-do and dg-options.
	* gcc.dg/tree-ssa/pr101335.c: Move to ...
	* gcc.c-torture/execute/pr101335.c: Here.
	Remove dg-do and dg-options.
2021-11-15 00:02:18 +00:00
Jan Hubicka a34edf9a3e Track nondeterminism and interposable calls in ipa-modref
Adds tracking of two new flags in ipa-modref: nondeterministic and
calls_interposable.  The first is set when a function does something that
is not guaranteed to be the same if run again (a volatile memory access,
volatile asm, or external function call).  The second is set if the
function calls something that does not bind to the current def.

nondeterministic enables ipa-modref to discover looping pure/const
functions, and it now discovers 138 of them during the cc1plus link
(which about doubles the number of such functions detected late).  We
can, however, do more:

 1) We can extend FRE to eliminate redundant calls.
    I filed PR103168 for that.
    A common case is inline functions that are not autodetected as
    ECF_CONST just because they do not bind to a local def; these can be
    easily handled.
    More tricky is to use the modref summary to check what memory
    locations are read.
 2) DSE can eliminate redundant stores

The calls_interposable flag currently also improves tree-ssa-structalias
on functions that do not bind to the current def, since
reads_global_memory is now not cleared by interposable functions.

gcc/ChangeLog:

	* ipa-modref.h (struct modref_summary): Add nondeterministic
	and calls_interposable flags.
	* ipa-modref.c (modref_summary::modref_summary): Initialize new flags.
	(modref_summary::useful_p): Check new flags.
	(struct modref_summary_lto): Add nondeterministic and
	calls_interposable flags.
	(modref_summary_lto::modref_summary_lto): Initialize new flags.
	(modref_summary_lto::useful_p): Check new flags.
	(modref_summary::dump): Dump new flags.
	(modref_summary_lto::dump): Dump new flags.
	(ignore_nondeterminism_p): New function.
	(merge_call_side_effects): Merge new flags.
	(process_fnspec): Likewise.
	(analyze_load): Volatile access is nondeterministic.
	(analyze_store): Likewise.
	(analyze_stmt): Volatile ASM is nondeterministic.
	(analyze_function): Clear new flags.
	(modref_summaries::duplicate): Duplicate new flags.
	(modref_summaries_lto::duplicate): Duplicate new flags.
	(modref_write): Stream new flags.
	(read_section): Stream new flags.
	(propagate_unknown_call): Update new flags.
	(modref_propagate_in_scc): Propagate new flags.
	* tree-ssa-alias.c (ref_maybe_used_by_call_p_1): Check
	calls_interposable.
	* tree-ssa-structalias.c (determine_global_memory_access):
	Likewise.
2021-11-15 00:10:06 +01:00
Maciej W. Rozycki 3057f1ab73 VAX: Add the `setmemhi' instruction
The MOVC5 machine instruction has `memset' semantics if encoded with a
zero source length[1]:

"4. MOVC5 with a zero source length operand is the preferred way
    to fill a block of memory with the fill character."

Use that instruction to implement the `setmemhi' standard pattern.  Use
the AP register in the register deferred mode for the source address to
yield the shortest possible encoding of the otherwise unused operand,
observing that the address is never dereferenced if the source length is
zero.

The use of this instruction yields steadily better performance, at least
with the Mariah VAX implementation, for a variable-length `memset' call
expanded inline as a single MOVC5 operation compared to an equivalent
libcall invocation:

Length:   1, time elapsed:  0.971789 (builtin),  2.847303 (libcall)
Length:   2, time elapsed:  0.907904 (builtin),  2.728259 (libcall)
Length:   3, time elapsed:  1.038311 (builtin),  2.917245 (libcall)
Length:   4, time elapsed:  0.775305 (builtin),  2.686088 (libcall)
Length:   7, time elapsed:  1.112331 (builtin),  2.992968 (libcall)
Length:   8, time elapsed:  0.856882 (builtin),  2.764885 (libcall)
Length:  15, time elapsed:  1.256086 (builtin),  3.096660 (libcall)
Length:  16, time elapsed:  1.001962 (builtin),  2.888131 (libcall)
Length:  31, time elapsed:  1.590456 (builtin),  3.774164 (libcall)
Length:  32, time elapsed:  1.288909 (builtin),  3.629622 (libcall)
Length:  63, time elapsed:  3.430285 (builtin),  5.269789 (libcall)
Length:  64, time elapsed:  3.265147 (builtin),  5.113156 (libcall)
Length: 127, time elapsed:  6.438772 (builtin),  8.268305 (libcall)
Length: 128, time elapsed:  6.268991 (builtin),  8.114557 (libcall)
Length: 255, time elapsed: 12.417338 (builtin), 14.259678 (libcall)

(times given in seconds per 1000000 `memset' invocations for the given
length made in a loop).  It is clear from these figures that the
hardware coalesces consecutive bytes rather than naively copying them
one by one, as for lengths that are powers of 2 the figures are
consistently lower than those for the respective next lower lengths.

The use of MOVC5 also requires at least 4 bytes less in terms of machine
code as it avoids encoding the address of `memset' needed for the CALLS
instruction used to make a libcall, as well as extra PUSHL instructions
needed to pass arguments to the call as those can be encoded directly as
the respective operands of the MOVC5 instruction.

It is perhaps worth noting too that for constant lengths we prefer to
emit up to 5 individual MOVx instructions rather than a single MOVC5
instruction to clear memory and for consistency we copy this behavior
here for filling memory with another value too, even though there may be
a performance advantage with a string copy in comparison to a piecemeal
copy, e.g.:

Length:  40, time elapsed:  2.183192 (string),   2.638878 (piecemeal)

But this is something for another change as it will have to be carefully
evaluated.

[1] DEC STD 032-0 "VAX Architecture Standard", Digital Equipment
    Corporation, A-DS-EL-00032-00-0 Rev J, December 15, 1989, Section
    3.10 "Character-String Instructions", p. 3-163

	gcc/
	* config/vax/vax.h (SET_RATIO): New macro.
	* config/vax/vax.md (UNSPEC_SETMEM_FILL): New constant.
	(setmemhi): New expander.
	(setmemhi1): New insn and splitter.
	(*setmemhi1): New insn.

	gcc/testsuite/
	* gcc.target/vax/setmem.c: New test.
2021-11-14 21:01:51 +00:00
François Dumont e9a53a4f76 libstdc++: [_GLIBCXX_DEBUG] Remove _Safe_container<>::_M_safe()
_GLIBCXX_DEBUG container code cleanup to get rid of
_Safe_container<>::_M_safe() and just use _Safe:: calls, which use normal
inheritance.  Also remove several usages of _M_base(), which can most of
the time be omitted and sometimes be replaced with explicit _Base::
calls.

libstdc++-v3/ChangeLog:

	* include/debug/safe_container.h (_Safe_container<>::_M_safe): Remove.
	* include/debug/deque (deque::operator=(initializer_list<>)): Replace
	_M_base() call with _Base:: call.
	(deque::operator[](size_type)): Likewise.
	* include/debug/forward_list (forward_list(forward_list&&, const allocator_type&)):
	Remove _M_safe() and _M_base() calls.
	(forward_list::operator=(initializer_list<>)): Remove _M_base() calls.
	(forward_list::splice_after, forward_list::merge): Likewise.
	* include/debug/list (list(list&&, const allocator_type&)):
	Remove _M_safe() and _M_base() calls.
	(list::operator=(initializer_list<>)): Remove _M_base() calls.
	(list::splice, list::merge): Likewise.
	* include/debug/map.h (map(map&&, const allocator_type&)):
	Remove _M_safe() and _M_base() calls.
	(map::operator=(initializer_list<>)): Remove _M_base() calls.
	* include/debug/multimap.h (multimap(multimap&&, const allocator_type&)):
	Remove _M_safe() and _M_base() calls.
	(multimap::operator=(initializer_list<>)): Remove _M_base() calls.
	* include/debug/set.h (set(set&&, const allocator_type&)):
	Remove _M_safe() and _M_base() calls.
	(set::operator=(initializer_list<>)): Remove _M_base() calls.
	* include/debug/multiset.h (multiset(multiset&&, const allocator_type&)):
	Remove _M_safe() and _M_base() calls.
	(multiset::operator=(initializer_list<>)): Remove _M_base() calls.
	* include/debug/string (basic_string(basic_string&&, const allocator_type&)):
	Remove _M_safe() and _M_base() calls.
	(basic_string::operator=(initializer_list<>)): Remove _M_base() call.
	(basic_string::operator=(const _CharT*), basic_string::operator=(_CharT)): Likewise.
	(basic_string::operator[](size_type), basic_string::operator+=(const basic_string&)):
	Likewise.
	(basic_string::operator+=(const _Char*), basic_string::operator+=(_CharT)): Likewise.
	* include/debug/unordered_map (unordered_map(unordered_map&&, const allocator_type&)):
	Remove _M_safe() and _M_base() calls.
	(unordered_map::operator=(initializer_list<>), unordered_map::merge):
	Remove _M_base() calls.
	(unordered_multimap(unordered_multimap&&, const allocator_type&)):
	Remove _M_safe() and _M_base() calls.
	(unordered_multimap::operator=(initializer_list<>), unordered_multimap::merge):
	Remove _M_base() calls.
	* include/debug/unordered_set (unordered_set(unordered_set&&, const allocator_type&)):
	Remove _M_safe() and _M_base() calls.
	(unordered_set::operator=(initializer_list<>), unordered_set::merge):
	Remove _M_base() calls.
	(unordered_multiset(unordered_multiset&&, const allocator_type&)):
	Remove _M_safe() and _M_base() calls.
	(unordered_multiset::operator=(initializer_list<>), unordered_multiset::merge):
	Remove _M_base() calls.
	* include/debug/vector (vector(vector&&, const allocator_type&)):
	Remove _M_safe() and _M_base() calls.
	(vector::operator=(initializer_list<>)): Remove _M_base() calls.
	(vector::operator[](size_type)): Likewise.
2021-11-14 21:55:01 +01:00
Jan Hubicka 64f3e71c30 Extend modref to track kills
This patch adds kill tracking to ipa-modref.  This is represented by an
array of accesses to memory locations that are known to be overwritten
by the function.

gcc/ChangeLog:

2021-11-14  Jan Hubicka  <hubicka@ucw.cz>

	* ipa-modref-tree.c (modref_access_node::update_for_kills): New
	member function.
	(modref_access_node::merge_for_kills): Likewise.
	(modref_access_node::insert_kill): Likewise.
	* ipa-modref-tree.h (modref_access_node::update_for_kills,
	modref_access_node::merge_for_kills, modref_access_node::insert_kill):
	Declare.
	(modref_access_node::useful_for_kill): New member function.
	* ipa-modref.c (modref_summary::useful_p): Release useless kills.
	(lto_modref_summary): Add kills.
	(modref_summary::dump): Dump kills.
	(record_access): Add modref_access_node parameter.
	(record_access_lto): Likewise.
	(merge_call_side_effects): Merge kills.
	(analyze_call): Add ALWAYS_EXECUTED param and pass it around.
	(struct summary_ptrs): Add always_executed field.
	(analyze_load): Update.
	(analyze_store): Update; record kills.
	(analyze_stmt): Add always_executed; record kills in clobbers.
	(analyze_function): Track always_executed.
	(modref_summaries::duplicate): Duplicate kills.
	(update_signature): Release kills.
	* ipa-modref.h (struct modref_summary): Add kills.
	* tree-ssa-alias.c (alias_stats): Add kill stats.
	(dump_alias_stats): Dump kill stats.
	(store_kills_ref_p): Break out from ...
	(stmt_kills_ref_p): Use it; handle modref info based kills.

gcc/testsuite/ChangeLog:

2021-11-14  Jan Hubicka  <hubicka@ucw.cz>

	* gcc.dg/tree-ssa/modref-dse-3.c: New test.
2021-11-14 18:49:15 +01:00
Aldy Hernandez 8a601f9bc4 Remove gcc.dg/pr103229.c
gcc/testsuite/ChangeLog:

	* gcc.dg/pr103229.c: Removed.
2021-11-14 16:17:36 +01:00
Aldy Hernandez a7ef5da3a9 Do not pass NULL to memset in ssa_global_cache.
The code computing ranges in PHIs in the path solver reuses the
temporary ssa_global_cache by calling its clear method.  Calling it on
an empty cache causes us to call memset with NULL.

Tested on x86-64 Linux.

gcc/ChangeLog:

	PR tree-optimization/103229
	* gimple-range-cache.cc (ssa_global_cache::clear): Do not pass
	null value to memset.

gcc/testsuite/ChangeLog:

	* gcc.dg/pr103229.c: New test.
2021-11-14 14:13:55 +01:00
Martin Liska 5a6100a255 tsan: remove not needed -ldl in options
gcc/testsuite/ChangeLog:

	* c-c++-common/tsan/free_race.c: Remove unnecessary -ldl.
	* c-c++-common/tsan/free_race2.c: Likewise.
2021-11-14 13:54:32 +01:00
Jan Hubicka a29174904b Cleanup tree-ssa-alias and tree-ssa-dse use of modref summary
Move the code getting a tree op from an access_node and stmt to a common
place.  I also commonized the logic to build an ao_ref.  While I was at
it I also replaced FOR_EACH_* with range-based for loops, since they
read better.

gcc/ChangeLog:

2021-11-14  Jan Hubicka  <hubicka@ucw.cz>

	* ipa-modref-tree.c (modref_access_node::get_call_arg): New member
	function.
	(modref_access_node::get_ao_ref): Likewise.
	* ipa-modref-tree.h (modref_access_node::get_call_arg): Declare.
	(modref_access_node::get_ao_ref): Declare.
	* tree-ssa-alias.c (modref_may_conflict): Use new accessors.
	* tree-ssa-dse.c (dse_optimize_call): Use new accessors.

gcc/testsuite/ChangeLog:

2021-11-14  Jan Hubicka  <hubicka@ucw.cz>

	* c-c++-common/asan/null-deref-1.c: Update template.
	* c-c++-common/tsan/free_race.c: Update template.
	* c-c++-common/tsan/free_race2.c: Update template.
	* gcc.dg/ipa/ipa-sra-4.c: Update template.
2021-11-14 12:01:41 +01:00
GCC Administrator a8029add30 Daily bump. 2021-11-14 00:16:23 +00:00
Jan Hubicka 6471396dec Fix bug in ipa-pure-const and add debug counters
gcc/ChangeLog:

	PR lto/103211
	* dbgcnt.def (ipa_attr): New counters.
	* ipa-pure-const.c: Include dbgcnt.c
	(ipa_make_function_const): Use debug counter.
	(ipa_make_function_pure): Likewise.
	(propagate_pure_const): Fix bug in my previous change.
2021-11-14 00:48:32 +01:00
Jan Hubicka e30bf33044 More ipa-modref-tree.h cleanups
Move access dumping to a member function and clean up formatting.

gcc/ChangeLog:

2021-11-13  Jan Hubicka  <hubicka@ucw.cz>

	* ipa-modref-tree.c (modref_access_node::range_info_useful_p):
	Offline from ipa-modref-tree.h.
	(modref_access_node::dump): Move from ipa-modref.c; make member
	function.
	* ipa-modref-tree.h (modref_access_node::range_info_useful_p.
	modref_access_node::dump): Declare.
	* ipa-modref.c (dump_access): Remove.
	(dump_records): Update.
	(dump_lto_records): Update.
	(record_access): Update.
	(record_access_lto): Update.
2021-11-13 23:18:38 +01:00
Jan Hubicka 5aa91072e2 Implement DSE of dead functions calls storing memory.
gcc/ChangeLog:

2021-11-13  Jan Hubicka  <hubicka@ucw.cz>

	* ipa-modref.c (modref_summary::modref_summary): Clear new flags.
	(modref_summary::dump): Dump try_dse.
	(modref_summary::finalize): Add FUN attribute; compute try-dse.
	(analyze_function): Update.
	(read_section): Update.
	(update_signature): Update.
	(pass_ipa_modref::execute): Update.
	* ipa-modref.h (struct modref_summary):
	* tree-ssa-alias.c (ao_ref_init_from_ptr_and_range): Export.
	* tree-ssa-alias.h (ao_ref_init_from_ptr_and_range): Declare.
	* tree-ssa-dse.c (dse_optimize_call): New function.
	(dse_optimize_stmt): Use it.

gcc/testsuite/ChangeLog:

2021-11-13  Jan Hubicka  <hubicka@ucw.cz>

	* g++.dg/cpp1z/inh-ctor23.C: Fix template.
	* g++.dg/ipa/ipa-icf-4.C: Fix template.
	* gcc.dg/tree-ssa/modref-dse-1.c: New test.
	* gcc.dg/tree-ssa/modref-dse-2.c: New test.
2021-11-13 22:25:23 +01:00
Jan Hubicka af47f22fd5 Fix checking disabled build.
gcc/ChangeLog:

2021-11-13  Jan Hubicka  <hubicka@ucw.cz>

	* ipa-modref-tree.c: Move #if CHECKING_P to proper place.
2021-11-13 20:45:26 +01:00
Xi Ruoyao 04c5a91d06 fixincludes: simplify handling for access() failure [PR21283, PR80047]
POSIX says:

    On some implementations, if buf is a null pointer, getcwd() may obtain
    size bytes of memory using malloc(). In this case, the pointer returned
    by getcwd() may be used as the argument in a subsequent call to free().
    Invoking getcwd() with buf as a null pointer is not recommended in
    conforming applications.

This produces an error building GCC with --enable-werror-always:

    ../../../fixincludes/fixincl.c: In function ‘process’:
    ../../../fixincludes/fixincl.c:1356:7: error: argument 1 is null but
    the corresponding size argument 2 value is 4096 [-Werror=nonnull]

POSIX suggests calling getcwd() with progressively larger buffers until
it no longer fails with [ERANGE].  However, it's highly unlikely that
this error-handling path is ever taken.

So we can simplify it instead of writing too much code.  We give up on
using getcwd(), because `make` will output a `Leaving directory ...`
message containing the path to the cwd when we call abort().

fixincludes/ChangeLog:

	PR other/21823
	PR bootstrap/80047
	* fixincl.c (process): Simplify the handling for highly
	  unlikely access() failure, to avoid using non-standard
	  extensions.
2021-11-14 02:32:25 +08:00
Jan Hubicka a246d7230b modref_access_node cleanup
Move the member functions of modref_access_node from ipa-modref-tree.h
to ipa-modref-tree.c, since they have become long and are not fitting
for inlines anyway.  I also cleaned up the interface by adding a static
insert method (which handles inserting accesses into a vector and
optimizing them), which makes it possible to hide most of the
interval-merging interface as private.

Honza

gcc/ChangeLog:

	* ipa-modref-tree.h
	(struct modref_access_node): Move longer member functions to
	ipa-modref-tree.c
	(modref_ref_node::try_merge_with): Turn into modref_access_node member
	function.
	* ipa-modref-tree.c (modref_access_node::contains): Move here
	from ipa-modref-tree.h.
	(modref_access_node::update): Likewise.
	(modref_access_node::merge): Likewise.
	(modref_access_node::closer_pair_p): Likewise.
	(modref_access_node::forced_merge): Likewise.
	(modref_access_node::update2): Likewise.
	(modref_access_node::combined_offsets): Likewise.
	(modref_access_node::try_merge_with): Likewise.
	(modref_access_node::insert): Likewise.
2021-11-13 18:27:18 +01:00
Jan Hubicka e0040bc3d9 Add finalize method to modref summary.
gcc/ChangeLog:

	* ipa-modref.c (modref_summary::global_memory_read_p): Remove.
	(modref_summary::global_memory_written_p): Remove.
	(modref_summary::dump): Dump new flags.
	(modref_summary::finalize): New member function.
	(analyze_function): Call it.
	(read_section): Call it.
	(update_signature): Call it.
	(pass_ipa_modref::execute): Call it.
	* ipa-modref.h (struct modref_summary): Remove
	global_memory_read_p and global_memory_written_p.
	Add global_memory_read, global_memory_written.
	* tree-ssa-structalias.c (determine_global_memory_access):
	Update.
2021-11-13 18:21:12 +01:00
Jan Hubicka 2af63f0f53 Whitelist type attributes for function signature change
gcc/ChangeLog:

	* ipa-fnsummary.c (compute_fn_summary): Use type_attribute_allowed_p.
	* ipa-param-manipulation.c
	(ipa_param_adjustments::type_attribute_allowed_p):
	New member function.
	(drop_type_attribute_if_params_changed_p): New function.
	(build_adjusted_function_type): Use it.
	* ipa-param-manipulation.h: Add type_attribute_allowed_p.
2021-11-13 15:46:57 +01:00
David Malcolm b9365b9321 analyzer: add four new taint-based warnings
The initial commit of the analyzer in GCC 10 had a single warning,
  -Wanalyzer-tainted-array-index
and required manually enabling the taint checker with
-fanalyzer-checker=taint (due to scaling issues).

This patch extends the taint detection to add four new taint-based
warnings:

  -Wanalyzer-tainted-allocation-size
     for e.g. attacker-controlled malloc/alloca
  -Wanalyzer-tainted-divisor
     for detecting where an attacker can inject a divide-by-zero
  -Wanalyzer-tainted-offset
     for attacker-controlled pointer offsets
  -Wanalyzer-tainted-size
     for e.g. attacker-controlled memset

and rewords all the warnings to talk about "attacker-controlled" values
rather than "tainted" values.

Unfortunately I haven't yet addressed the scaling issues, so all of
these still require -fanalyzer-checker=taint (in addition to -fanalyzer).

gcc/analyzer/ChangeLog:
	* analyzer.opt (Wanalyzer-tainted-allocation-size): New.
	(Wanalyzer-tainted-divisor): New.
	(Wanalyzer-tainted-offset): New.
	(Wanalyzer-tainted-size): New.
	* engine.cc (impl_region_model_context::get_taint_map): New.
	* exploded-graph.h (impl_region_model_context::get_taint_map):
	New decl.
	* program-state.cc (sm_state_map::get_state): Call
	alt_get_inherited_state.
	(sm_state_map::impl_set_state): Modify states within
	compound svalues.
	(program_state::impl_call_analyzer_dump_state): Undo casts.
	(selftest::test_program_state_1): Update for new context param of
	create_region_for_heap_alloc.
	(selftest::test_program_state_merging): Likewise.
	* region-model-impl-calls.cc (region_model::impl_call_alloca):
	Likewise.
	(region_model::impl_call_calloc): Likewise.
	(region_model::impl_call_malloc): Likewise.
	(region_model::impl_call_operator_new): Likewise.
	(region_model::impl_call_realloc): Likewise.
	* region-model.cc (region_model::check_region_access): Call
	check_region_for_taint.
	(region_model::get_representative_path_var_1): Handle binops.
	(region_model::create_region_for_heap_alloc): Add "ctxt" param and
	pass it to set_dynamic_extents.
	(region_model::create_region_for_alloca): Likewise.
	(region_model::set_dynamic_extents): Add "ctxt" param and use it
	to call check_dynamic_size_for_taint.
	(selftest::test_state_merging): Update for new context param of
	create_region_for_heap_alloc.
	(selftest::test_malloc_constraints): Likewise.
	(selftest::test_malloc): Likewise.
	(selftest::test_alloca): Likewise for create_region_for_alloca.
	* region-model.h (region_model::create_region_for_heap_alloc): Add
	"ctxt" param.
	(region_model::create_region_for_alloca): Likewise.
	(region_model::set_dynamic_extents): Likewise.
	(region_model::check_dynamic_size_for_taint): New decl.
	(region_model::check_region_for_taint): New decl.
	(region_model_context::get_taint_map): New vfunc.
	(noop_region_model_context::get_taint_map): New.
	* sm-taint.cc: Remove include of "diagnostic-event-id.h"; add
	includes of "gimple-iterator.h", "tristate.h", "selftest.h",
	"ordered-hash-map.h", "cgraph.h", "cfg.h", "digraph.h",
	"analyzer/supergraph.h", "analyzer/call-string.h",
	"analyzer/program-point.h", "analyzer/store.h",
	"analyzer/region-model.h", and "analyzer/program-state.h".
	(enum bounds): Move to top of file.
	(class taint_diagnostic): New.
	(class tainted_array_index): Convert to subclass of taint_diagnostic.
	(tainted_array_index::emit): Add CWE-129.  Reword warning to use
	"attacker-controlled" rather than "tainted".
	(tainted_array_index::describe_state_change): Move to
	taint_diagnostic::describe_state_change.
	(tainted_array_index::describe_final_event): Reword to use
	"attacker-controlled" rather than "tainted".
	(class tainted_offset): New.
	(class tainted_size): New.
	(class tainted_divisor): New.
	(class tainted_allocation_size): New.
	(taint_state_machine::alt_get_inherited_state): New.
	(taint_state_machine::on_stmt): In assignment handling, remove
	ARRAY_REF handling in favor of check_region_for_taint.  Add
	detection of tainted divisors.
	(taint_state_machine::get_taint): New.
	(taint_state_machine::combine_states): New.
	(region_model::check_region_for_taint): New.
	(region_model::check_dynamic_size_for_taint): New.
	* sm.h (state_machine::alt_get_inherited_state): New.

gcc/ChangeLog:
	* doc/invoke.texi (Static Analyzer Options): Add
	-Wno-analyzer-tainted-allocation-size,
	-Wno-analyzer-tainted-divisor, -Wno-analyzer-tainted-offset, and
	-Wno-analyzer-tainted-size to list.  Add
	-Wanalyzer-tainted-allocation-size, -Wanalyzer-tainted-divisor,
	-Wanalyzer-tainted-offset, and -Wanalyzer-tainted-size to list
	of options effectively enabled by -fanalyzer.
	(-Wanalyzer-tainted-allocation-size): New.
	(-Wanalyzer-tainted-array-index): Tweak wording; add link to CWE.
	(-Wanalyzer-tainted-divisor): New.
	(-Wanalyzer-tainted-offset): New.
	(-Wanalyzer-tainted-size): New.

gcc/testsuite/ChangeLog:
	* gcc.dg/analyzer/pr93382.c: Tweak expected wording.
	* gcc.dg/analyzer/taint-alloc-1.c: New test.
	* gcc.dg/analyzer/taint-alloc-2.c: New test.
	* gcc.dg/analyzer/taint-divisor-1.c: New test.
	* gcc.dg/analyzer/taint-1.c: Rename to...
	* gcc.dg/analyzer/taint-read-index-1.c: ...this.  Tweak expected
	wording.  Mark some events as xfail.
	* gcc.dg/analyzer/taint-read-offset-1.c: New test.
	* gcc.dg/analyzer/taint-size-1.c: New test.
	* gcc.dg/analyzer/taint-write-index-1.c: New test.
	* gcc.dg/analyzer/taint-write-offset-1.c: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2021-11-13 09:27:26 -05:00
Jan Hubicka e2dd12ab66 Remember fnspec based EAF flags in modref summary.
gcc/ChangeLog:

	* attr-fnspec.h (attr_fnspec::arg_eaf_flags): Break out from ...
	* gimple.c (gimple_call_arg_flags): ... here.
	* ipa-modref.c (analyze_parms): Record flags known from fnspec.
	(modref_merge_call_site_flags): Use arg_eaf_flags.
2021-11-13 15:20:00 +01:00
Aldy Hernandez b7a23949b0 path solver: Compute all PHI ranges simultaneously.
PHIs must be resolved simultaneously, otherwise we may not pick up the
ranges incoming to the block.

For example.  If we put p3_7 in the cache before all PHIs have been
computed, we will pick up the wrong p3_7 value for p2_17:

    # p3_7 = PHI <1(2), 0(5)>
    # p2_17 = PHI <1(2), p3_7(5)>

This patch delays updating the cache until all PHIs have been
analyzed.

gcc/ChangeLog:

	PR tree-optimization/103222
	* gimple-range-path.cc (path_range_query::compute_ranges_in_phis):
	New.
	(path_range_query::compute_ranges_in_block): Call
	compute_ranges_in_phis.
	* gimple-range-path.h (path_range_query::compute_ranges_in_phis):
	New.

gcc/testsuite/ChangeLog:

	* gcc.dg/pr103222.c: New test.
2021-11-13 14:41:47 +01:00
H.J. Lu 380fc3b69f libsanitizer: Update LOCAL_PATCHES
* LOCAL_PATCHES: Update to the corresponding revision.
2021-11-13 05:17:14 -08:00
H.J. Lu 55b43a22ab libsanitizer: Apply local patches 2021-11-13 05:15:25 -08:00
H.J. Lu 86289a4ff4 libsanitizer: Merge with upstream
Merged revision: 82bc6a094e85014f1891ef9407496f44af8fe442

with the fix for PR sanitizer/102911
2021-11-13 05:15:24 -08:00
Jonathan Wakely a30a2e43e4 libstdc++: Implement std::spanstream for C++23
This implements the <spanstream> header, as proposed for C++23 by P0448R4.

libstdc++-v3/ChangeLog:

	* include/Makefile.am: Add spanstream header.
	* include/Makefile.in: Regenerate.
	* include/precompiled/stdc++.h: Add spanstream header.
	* include/std/version (__cpp_lib_spanstream): Define.
	* include/std/spanstream: New file.
	* testsuite/27_io/spanstream/1.cc: New test.
	* testsuite/27_io/spanstream/version.cc: New test.
2021-11-13 11:45:31 +00:00
Jan Hubicka ecdf414bd8 Enable ipa-sra with fnspec attributes
Enable some ipa-sra on fortran by allowing signature changes on functions
with "fn spec" attribute when ipa-modref is enabled.  This is possible since ipa-modref
knows how to preserve things we trace in fnspec and fnspec generated by fortran forntend
are quite simple and can be analysed automatically now.  To be sure I will also add
code that merge fnspec to parameters.

This unfortunately hits bug in ipa-param-manipulation when we remove parameter
that specifies size of variable length parameter. For this reason I added a hack
that prevent signature changes on such functions and will handle it incrementally.

I tried creating a C testcase, but it is blocked by another problem: we punt
ipa-sra on the access attribute.  This is an optimization regression we ought
to fix, so I filed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103223.

As a followup I will add code classifying the type attributes (we have just a
few) and get stats on the access attribute.

gcc/ChangeLog:

	* ipa-fnsummary.c (compute_fn_summary): Do not give up on signature
	changes on "fn spec" attribute; give up on variadic types.
	* ipa-param-manipulation.c: Include attribs.h.
	(build_adjusted_function_type): New parameter ARG_MODIFIED; if it is
	true remove "fn spec" attribute.
	(ipa_param_adjustments::build_new_function_type): Update.
	(ipa_param_body_adjustments::modify_formal_parameters): Update.
	* ipa-sra.c: Include attribs.h.
	(ipa_sra_preliminary_function_checks): Do not check for TYPE_ATTRIBUTES.
2021-11-13 12:13:42 +01:00
Aldy Hernandez dc777f6b06 path solver: Merge path_range_query constructors.
There's no need for two constructors when we can do it all with one
that defaults to the common behavior:

path_range_query (bool resolve = true, gimple_ranger *ranger = NULL);

Tested on x86-64 Linux.

gcc/ChangeLog:

	* gimple-range-path.cc (path_range_query::path_range_query): Merge
	ctors.
	(path_range_query::import_p): Move from header file.
	(path_range_query::~path_range_query): Adjust for combined ctors.
	* gimple-range-path.h: Merge ctors.
	(path_range_query::import_p): Move to .cc file.
2021-11-13 11:45:31 +01:00
Jan Hubicka 2f3d43a351 Fix wrong code with modref and some builtins.
ipa-modref gets confused by the EAF flags of memcpy because parameter 1
escapes but is used only directly.  In modref we do not track values saved to
memory and thus clear all other flags on each store.  This also needs to
happen when the called function escapes a parameter.

gcc/ChangeLog:

	PR tree-optimization/103182
	* ipa-modref.c (callee_to_caller_flags): Fix merging of flags.
	(modref_eaf_analysis::analyze_ssa_name): Fix merging of flags.
2021-11-13 01:51:25 +01:00
Hans-Peter Nilsson 60f761c7e5 libstdc++: Use GCC_TRY_COMPILE_OR_LINK for getentropy, arc4random
Since r12-5056-g3439657b0286, there has been a regression in
test results: an additional 100 FAILs running the g++ and
libstdc++ testsuites on cris-elf, a newlib target.  The
failures are linker errors, not finding a definition for
getentropy.  It appears newlib has, since 2017-12-03, had
declarations of getentropy and arc4random, and provides an
implementation of arc4random using getentropy, but provides no
definition of getentropy, not even a stub yielding ENOSYS.
This is similar to what it does for many other functions too.

While fixing newlib (like adding said stub) would likely help,
it still leaves older newlib releases hanging.  Thankfully,
the libstdc++ configury test can be improved to try linking
where possible, by using the bespoke GCC_TRY_COMPILE_OR_LINK
instead of AC_TRY_COMPILE.  BTW, I see a lack of consistency:
some tests use AC_TRY_COMPILE and some GCC_TRY_COMPILE_OR_LINK
for no apparent reason, but this commit just amends
r12-5056-g3439657b0286.

libstdc++-v3:
	PR libstdc++/103166
	* acinclude.m4 (GLIBCXX_CHECK_GETENTROPY, GLIBCXX_CHECK_ARC4RANDOM):
	Use GCC_TRY_COMPILE_OR_LINK instead of AC_TRY_COMPILE.
	* configure: Regenerate.
2021-11-13 01:45:06 +01:00
GCC Administrator af2852b9dc Daily bump. 2021-11-13 00:16:39 +00:00
Stafford Horne 1bac7d31a1 or1k: Fix clobbering of _mcount argument if fPIC is enabled
Recently we changed the PROFILE_HOOK _mcount call to pass in the link
register as an argument.  This actually does not work when the _mcount
call uses a PLT because the GOT register setup code ends up getting
inserted before the PROFILE_HOOK and clobbers the link register
argument.

These glibc tests are failing:
  gmon/tst-gmon-pie-gprof
  gmon/tst-gmon-static-gprof

This patch fixes this by saving the instruction that stores the Link
Register to the _mcount argument and then inserting the GOT register setup
instructions after it.

For example:

main.c:

    extern int e;

    int f2(int a) {
      return a + e;
    }

    int f1(int a) {
      return f2 (a + a);
    }

    int main(int argc, char ** argv) {
      return f1 (argc);
    }

Compiled:

    or1k-smh-linux-gnu-gcc -Wall -c -O2 -fPIC -pg -S main.c

Before Fix:

    main:
        l.addi  r1, r1, -16
        l.sw    8(r1), r2
        l.sw    0(r1), r16
        l.addi  r2, r1, 16   # Keeping FP, but not needed
        l.sw    4(r1), r18
        l.sw    12(r1), r9
        l.jal   8            # GOT Setup clobbers r9 (Link Register)
         l.movhi        r16, gotpchi(_GLOBAL_OFFSET_TABLE_-4)
        l.ori   r16, r16, gotpclo(_GLOBAL_OFFSET_TABLE_+0)
        l.add   r16, r16, r9
        l.or    r18, r3, r3
        l.or    r3, r9, r9    # This is not the original LR
        l.jal   plt(_mcount)
         l.nop

        l.jal   plt(f1)
         l.or    r3, r18, r18
        l.lwz   r9, 12(r1)
        l.lwz   r16, 0(r1)
        l.lwz   r18, 4(r1)
        l.lwz   r2, 8(r1)
        l.jr    r9
         l.addi  r1, r1, 16

After the fix:

    main:
        l.addi  r1, r1, -12
        l.sw    0(r1), r16
        l.sw    4(r1), r18
        l.sw    8(r1), r9
        l.or    r18, r3, r3
        l.or    r3, r9, r9    # We now have r9 (LR) set early
        l.jal   8             # Clobbers r9 (Link Register)
         l.movhi        r16, gotpchi(_GLOBAL_OFFSET_TABLE_-4)
        l.ori   r16, r16, gotpclo(_GLOBAL_OFFSET_TABLE_+0)
        l.add   r16, r16, r9
        l.jal   plt(_mcount)
         l.nop

        l.jal   plt(f1)
         l.or    r3, r18, r18
        l.lwz   r9, 8(r1)
        l.lwz   r16, 0(r1)
        l.lwz   r18, 4(r1)
        l.jr    r9
         l.addi  r1, r1, 12

Fixes: 308531d148 ("or1k: Add return address argument to _mcount call")

gcc/ChangeLog:
	* config/or1k/or1k-protos.h (or1k_profile_hook): New function.
	* config/or1k/or1k.h (PROFILE_HOOK): Change macro to reference
	new function or1k_profile_hook.
	* config/or1k/or1k.c (struct machine_function): Add new field
	set_mcount_arg_insn.
	(or1k_profile_hook): New function.
	(or1k_init_pic_reg): Update to inject pic rtx after _mcount arg
	when profiling.
	(or1k_frame_pointer_required): Frame pointer no longer needed
	when profiling.
2021-11-13 07:58:00 +09:00
Jan Hubicka 4d2d5565a0 Fix wrong code with pure functions
I introduced a bug into find_func_aliases_for_call in the handling of pure
functions: instead of being modelled as reading global memory, pure functions
were believed to write global memory.  This results in misoptimization of the
testcase at -O1.

The change to pta-callused.c updates the template for the new behaviour of
the constraint generation.  We copy nonlocal memory to calluse, which is
correct but not strictly necessary, because later we take care to add the
nonlocal_p flag manually.

gcc/ChangeLog:

	PR tree-optimization/103209
	* tree-ssa-structalias.c (find_func_aliases_for_call): Fix
	use of handle_rhs_call.

gcc/testsuite/ChangeLog:

	PR tree-optimization/103209
	* gcc.dg/tree-ssa/pta-callused.c: Update template.
	* gcc.c-torture/execute/pr103209.c: New test.
2021-11-12 23:55:50 +01:00
Aldy Hernandez 264f061997 path solver: Solve PHI imports first for ranges.
PHIs must be resolved first while solving ranges in a block,
regardless of where they appear in the import bitmap.  We went through
a similar exercise for the relational code, but missed these.

Tested on x86-64 & ppc64le Linux.

gcc/ChangeLog:

	PR tree-optimization/103202
	* gimple-range-path.cc
	(path_range_query::compute_ranges_in_block): Solve PHI imports first.
2021-11-12 20:42:56 +01:00
Jan Hubicka b301cb43a7 Fix ipa-pure-const
gcc/ChangeLog:

	* ipa-pure-const.c (propagate_pure_const): Remove redundant check;
	fix call of ipa_make_function_const and ipa_make_function_pure.
2021-11-12 20:15:48 +01:00
David Malcolm 72f1c1c452 analyzer: "__analyzer_dump_state" has no side-effects
gcc/analyzer/ChangeLog:
	* engine.cc (exploded_node::on_stmt_pre): Return when handling
	"__analyzer_dump_state".

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2021-11-12 14:01:36 -05:00
Richard Sandiford 87fcff96db aarch64: Remove redundant costing code
Previous patches made some of the complex parts of the issue rate
code redundant.

gcc/
	* config/aarch64/aarch64.c (aarch64_vector_op::n_advsimd_ops): Delete.
	(aarch64_vector_op::m_seen_loads): Likewise.
	(aarch64_vector_costs::aarch64_vector_costs): Don't push to
	m_advsimd_ops.
	(aarch64_vector_op::count_ops): Remove vectype and factor parameters.
	Remove code that tries to predict different vec_flags from the
	current loop's.
	(aarch64_vector_costs::add_stmt_cost): Update accordingly.
	Remove m_advsimd_ops handling.
2021-11-12 17:33:03 +00:00
Richard Sandiford c6c5c5ebae aarch64: Use new hooks for vector comparisons
Previously we tried to account for the different issue rates of
the various vector modes by guessing what the Advanced SIMD version
of an SVE loop would look like and what its issue rate was likely to be.
We'd then increase the cost of the SVE loop if the Advanced SIMD loop
might issue more quickly.

This patch moves that logic to better_main_loop_than_p, so that we
can compare loops side-by-side rather than having to guess.  This also
means we can apply the issue rate heuristics to *any* vector loop
comparison, rather than just weighting SVE vs. Advanced SIMD.

The actual heuristics are otherwise unchanged.  We're just
applying them in a different place.

gcc/
	* config/aarch64/aarch64.c (aarch64_vector_costs::m_saw_sve_only_op)
	(aarch64_sve_only_stmt_p): Delete.
	(aarch64_vector_costs::prefer_unrolled_loop): New function,
	extracted from adjust_body_cost.
	(aarch64_vector_costs::better_main_loop_than_p): New function,
	using heuristics extracted from adjust_body_cost and
	adjust_body_cost_sve.
	(aarch64_vector_costs::adjust_body_cost_sve): Remove
	advsimd_cycles_per_iter and could_use_advsimd parameters.
	Update after changes above.
	(aarch64_vector_costs::adjust_body_cost): Update after changes above.
2021-11-12 17:33:03 +00:00
Richard Sandiford 2e1886ea06 aarch64: Add vf_factor to aarch64_vec_op_count
-mtune=neoverse-512tvb sets the likely SVE vector length to 128 bits,
but it also takes into account Neoverse V1, which is a 256-bit target.
This patch adds this VF (VL) factor to aarch64_vec_op_count.

gcc/
	* config/aarch64/aarch64.c (aarch64_vec_op_count::m_vf_factor):
	New member variable.
	(aarch64_vec_op_count::aarch64_vec_op_count): Add a parameter for it.
	(aarch64_vec_op_count::vf_factor): New function.
	(aarch64_vector_costs::aarch64_vector_costs): When costing for
	neoverse-512tvb, pass a vf_factor of 2 for the Neoverse V1 version
	of an SVE loop.
	(aarch64_vector_costs::adjust_body_cost): Read the vf factor
	instead of hard-coding 2.
2021-11-12 17:33:02 +00:00
Richard Sandiford a82ffd4361 aarch64: Move cycle estimation into aarch64_vec_op_count
This patch just moves the main cycle estimation routines
into aarch64_vec_op_count.

gcc/
	* config/aarch64/aarch64.c
	(aarch64_vec_op_count::rename_cycles_per_iter): New function.
	(aarch64_vec_op_count::min_nonpred_cycles_per_iter): Likewise.
	(aarch64_vec_op_count::min_pred_cycles_per_iter): Likewise.
	(aarch64_vec_op_count::min_cycles_per_iter): Likewise.
	(aarch64_vec_op_count::dump): Move earlier in file.  Dump the
	above properties too.
	(aarch64_estimate_min_cycles_per_iter): Delete.
	(adjust_body_cost): Use aarch64_vec_op_count::min_cycles_per_iter
	instead of aarch64_estimate_min_cycles_per_iter.  Rely on the dump
	routine to print CPI estimates.
	(adjust_body_cost_sve): Likewise.  Use the other functions above
	instead of doing the work inline.
2021-11-12 17:33:02 +00:00
Richard Sandiford 1a5288fe3d aarch64: Use an array of aarch64_vec_op_counts
-mtune=neoverse-512tvb uses two issue rates, one for Neoverse V1
and one with more generic parameters.  We use both rates when
making a choice between scalar, Advanced SIMD and SVE code.

Previously we calculated the Neoverse V1 issue rates from the
more generic issue rates, but by removing m_scalar_ops and
(later) m_advsimd_ops, it becomes easier to track multiple
issue rates directly.

This patch therefore converts m_ops and (temporarily) m_advsimd_ops
into arrays.

gcc/
	* config/aarch64/aarch64.c (aarch64_vec_op_count): Allow default
	initialization.
	(aarch64_vec_op_count::base_issue_info): Remove handling of null
	issue_infos.
	(aarch64_vec_op_count::simd_issue_info): Likewise.
	(aarch64_vec_op_count::sve_issue_info): Likewise.
	(aarch64_vector_costs::m_ops): Turn into a vector.
	(aarch64_vector_costs::m_advsimd_ops): Likewise.
	(aarch64_vector_costs::aarch64_vector_costs): Add entries to
	the vectors based on aarch64_tune_params.
	(aarch64_vector_costs::analyze_loop_vinfo): Update the pred_ops
	of all entries in m_ops.
	(aarch64_vector_costs::add_stmt_cost): Call count_ops for all
	entries in m_ops.
	(aarch64_estimate_min_cycles_per_iter): Remove issue_info
	parameter and get the information from the ops instead.
	(aarch64_vector_costs::adjust_body_cost_sve): Take a
	aarch64_vec_issue_info instead of a aarch64_vec_op_count.
	(aarch64_vector_costs::adjust_body_cost): Update call accordingly.
	Exit earlier if m_ops is empty for either cost structure.
2021-11-12 17:33:02 +00:00
Richard Sandiford 6756706ea6 aarch64: Use real scalar op counts
Now that vector finish_costs is passed the associated scalar costs,
we can record the scalar issue information while computing the scalar
costs, rather than trying to estimate it while computing the vector
costs.

This simplifies things a little, but the main motivation is to improve
accuracy.

gcc/
	* config/aarch64/aarch64.c (aarch64_vector_costs::m_scalar_ops)
	(aarch64_vector_costs::m_sve_ops): Replace with...
	(aarch64_vector_costs::m_ops): ...this.
	(aarch64_vector_costs::analyze_loop_vinfo): Update accordingly.
	(aarch64_vector_costs::adjust_body_cost_sve): Likewise.
	(aarch64_vector_costs::aarch64_vector_costs): Likewise.
	Initialize m_vec_flags here rather than in add_stmt_cost.
	(aarch64_vector_costs::count_ops): Test for scalar reductions too.
	Allow vectype to be null.
	(aarch64_vector_costs::add_stmt_cost): Call count_ops for scalar
	code too.  Don't require vectype to be nonnull.
	(aarch64_vector_costs::adjust_body_cost): Take the loop_vec_info
	and scalar costs as parameters.  Use the scalar costs to determine
	the cycles per iteration of the scalar loop, then multiply it
	by the estimated VF.
	(aarch64_vector_costs::finish_cost): Update call accordingly.
2021-11-12 17:33:01 +00:00
Richard Sandiford 902b7c9e18 aarch64: Get floatness from stmt_info
This patch gets the floatness of a memory access from the data
reference rather than the vectype.  This makes it more suitable
for use in scalar costing code.

gcc/
	* config/aarch64/aarch64.c (aarch64_dr_type): New function.
	(aarch64_vector_costs::count_ops): Use it rather than the
	vectype to determine floatness.
2021-11-12 17:33:01 +00:00