189701 Commits

Author SHA1 Message Date
Jan Hubicka
e69b7c5779 Fix uninitialized access in merge_call_side_effects
gcc/ChangeLog:

	PR ipa/103262
	* ipa-modref.c (merge_call_side_effects): Fix uninitialized
	access.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/modref-dse-5.c: New test.
2021-11-16 09:15:39 +01:00
Andrew Pinski
3200de91bc tree-optimization: [PR103245] Improve detection of abs pattern using multiplication
So while working on PR 103228 (and a few others), I noticed the testcase for PR 94785
was failing. The problem is that the nop_convert moved from being inside the IOR to be
outside of it. I also noticed the patch for PR 103228 was not needed to reproduce the
issue either.
This patch combines the two patterns for the abs match when using multiplication
and adds a few places where nop_convert is optional.
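
For context, the source forms in question compute an absolute value by multiplying by a sign factor. The following is a minimal sketch; the function names are hypothetical, and it is not claimed that any particular GCC version folds either form to an ABS_EXPR:

```cpp
#include <climits>

// Sign factor built with a shift and an IOR: (x >> 31) is -1 for negative x
// and 0 otherwise, so OR-ing with 1 yields -1 or +1.  Right-shifting a
// negative value is arithmetic in GCC (and required to be so since C++20).
int abs_via_shift_mul(int x) {
  return x * ((x >> (sizeof(int) * CHAR_BIT - 1)) | 1);
}

// Sign factor built with a comparison instead.
int abs_via_cmp_mul(int x) {
  return x * (x < 0 ? -1 : 1);
}
```

Both functions return the magnitude of x for every input except INT_MIN, where the multiplication overflows.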

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

	PR tree-optimization/103245

gcc/ChangeLog:

	* match.pd: Combine the abs pattern matching using multiplication.
	Adding optional nop_convert too.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/pr103245-1.c: New test.
2021-11-16 03:31:57 +00:00
H.J. Lu
074ee8d9a9 Add a missing return when transforming atomic bit test and operations
When failing to transform equivalent, but slightly different cases of
atomic bit test and operations to their canonical forms, return
immediately.

gcc/

	PR middle-end/103268
	* tree-ssa-ccp.c (optimize_atomic_bit_test_and): Add a missing
	return.

gcc/testsuite/

	PR middle-end/103268
	* gcc.dg/pr103268-1.c: New test.
	* gcc.dg/pr103268-2.c: Likewise.
2021-11-15 19:23:58 -08:00
Jim Wilson
a031aaa2ac Update my email address.
* MAINTAINERS: Update my address.
2021-11-15 17:02:40 -08:00
GCC Administrator
e2b57363fc Daily bump.
2021-11-16 00:16:31 +00:00
Jason Merrill
87c2080b05 c++: Add -fimplicit-constexpr
With each successive C++ standard the restrictions on the use of the
constexpr keyword for functions get weaker and weaker; it recently occurred
to me that it is heading toward the same fate as the C register keyword,
which was once useful for optimization but became obsolete.  Similarly, it
seems to me that we should be able to just treat inlines as constexpr
functions and not make people add the extra keyword everywhere.
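
A minimal sketch of what this looks like from the user's side. MAYBE_CONSTEXPR and square are hypothetical names; the only thing taken from this patch is the __cpp_implicit_constexpr macro, which is assumed to be defined when the option is in effect:

```cpp
// With -fimplicit-constexpr an inline function can be used in constant
// expressions without the keyword.  Otherwise fall back to explicit constexpr
// so the snippet builds either way.
#ifdef __cpp_implicit_constexpr
#define MAYBE_CONSTEXPR inline
#else
#define MAYBE_CONSTEXPR constexpr
#endif

MAYBE_CONSTEXPR int square(int x) { return x * x; }

// Requires square() to be usable in a constant expression.
static_assert(square(4) == 16, "square must be constant-evaluable");
```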

There were a lot of testcase changes needed; many disabling errors about
non-constexpr functions that are now constexpr, and many disabling implicit
constexpr so that the tests can check the same thing as before, whether
that's mangling or whatever.

gcc/c-family/ChangeLog:

	* c.opt: Add -fimplicit-constexpr.
	* c-cppbuiltin.c: Define __cpp_implicit_constexpr.
	* c-opts.c (c_common_post_options): Disable below C++14.

gcc/cp/ChangeLog:

	* cp-tree.h (struct lang_decl_fn): Add implicit_constexpr.
	(decl_implicit_constexpr_p): New.
	* class.c (type_maybe_constexpr_destructor): Use
	TYPE_HAS_TRIVIAL_DESTRUCTOR and maybe_constexpr_fn.
	(finalize_literal_type_property): Simplify.
	* constexpr.c (is_valid_constexpr_fn): Check for dtor.
	(maybe_save_constexpr_fundef): Try to set DECL_DECLARED_CONSTEXPR_P
	on inlines.
	(cxx_eval_call_expression): Use maybe_constexpr_fn.
	(maybe_constexpr_fn): Handle flag_implicit_constexpr.
	(var_in_maybe_constexpr_fn): Use maybe_constexpr_fn.
	(potential_constant_expression_1): Likewise.
	(decl_implicit_constexpr_p): New.
	* decl.c (validate_constexpr_redeclaration): Allow change with
	-fimplicit-constexpr.
	(grok_special_member_properties): Use maybe_constexpr_fn.
	* error.c (dump_function_decl): Don't print 'constexpr'
	if it's implicit.
	* Make-lang.in (check-c++-all): Update.

libstdc++-v3/ChangeLog:

	* testsuite/20_util/to_address/1_neg.cc: Adjust error.
	* testsuite/26_numerics/random/concept.cc: Adjust asserts.

gcc/testsuite/ChangeLog:

	* lib/g++-dg.exp: Handle "impcx".
	* lib/target-supports.exp
	(check_effective_target_implicit_constexpr): New.
	* g++.dg/abi/abi-tag16.C:
	* g++.dg/abi/abi-tag18a.C:
	* g++.dg/abi/guard4.C:
	* g++.dg/abi/lambda-defarg1.C:
	* g++.dg/abi/mangle26.C:
	* g++.dg/cpp0x/constexpr-diag3.C:
	* g++.dg/cpp0x/constexpr-ex1.C:
	* g++.dg/cpp0x/constexpr-ice5.C:
	* g++.dg/cpp0x/constexpr-incomplete2.C:
	* g++.dg/cpp0x/constexpr-memfn1.C:
	* g++.dg/cpp0x/constexpr-neg3.C:
	* g++.dg/cpp0x/constexpr-specialization.C:
	* g++.dg/cpp0x/inh-ctor19.C:
	* g++.dg/cpp0x/inh-ctor30.C:
	* g++.dg/cpp0x/lambda/lambda-mangle3.C:
	* g++.dg/cpp0x/lambda/lambda-mangle5.C:
	* g++.dg/cpp1y/auto-fn12.C:
	* g++.dg/cpp1y/constexpr-loop5.C:
	* g++.dg/cpp1z/constexpr-lambda7.C:
	* g++.dg/cpp2a/constexpr-dtor3.C:
	* g++.dg/cpp2a/constexpr-new13.C:
	* g++.dg/cpp2a/constinit11.C:
	* g++.dg/cpp2a/constinit12.C:
	* g++.dg/cpp2a/constinit14.C:
	* g++.dg/cpp2a/constinit15.C:
	* g++.dg/cpp2a/spaceship-constexpr1.C:
	* g++.dg/cpp2a/spaceship-eq3.C:
	* g++.dg/cpp2a/udlit-class-nttp-neg2.C:
	* g++.dg/debug/dwarf2/auto1.C:
	* g++.dg/debug/dwarf2/cdtor-1.C:
	* g++.dg/debug/dwarf2/lambda1.C:
	* g++.dg/debug/dwarf2/pr54508.C:
	* g++.dg/debug/dwarf2/pubnames-2.C:
	* g++.dg/debug/dwarf2/pubnames-3.C:
	* g++.dg/ext/is_literal_type3.C:
	* g++.dg/ext/visibility/template7.C:
	* g++.dg/gcov/gcov-12.C:
	* g++.dg/gcov/gcov-2.C:
	* g++.dg/ipa/devirt-35.C:
	* g++.dg/ipa/devirt-36.C:
	* g++.dg/ipa/devirt-37.C:
	* g++.dg/ipa/devirt-44.C:
	* g++.dg/ipa/imm-devirt-1.C:
	* g++.dg/lookup/builtin5.C:
	* g++.dg/lto/inline-crossmodule-1_0.C:
	* g++.dg/modules/enum-1_a.C:
	* g++.dg/modules/fn-inline-1_c.C:
	* g++.dg/modules/pmf-1_b.C:
	* g++.dg/modules/used-1_c.C:
	* g++.dg/tls/thread_local11.C:
	* g++.dg/tls/thread_local11a.C:
	* g++.dg/tm/pr46653.C:
	* g++.dg/ubsan/pr70035.C:
	* g++.old-deja/g++.other/delete6.C:
	* g++.dg/modules/pmf-1_a.H:
	Adjust for implicit constexpr.
2021-11-15 18:50:07 -05:00
Jason Merrill
29e4163a09 c++: split_nonconstant_init and flexarrays
split_nonconstant_init was doing the wrong thing for both the initialization
and cleanup here; we know the size from the initializer, and we can pass it
along.  This doesn't make the testcase work, since the y destructor is still
broken, but it removes the wrong error for the aggregate initialization.

gcc/cp/ChangeLog:

	* typeck2.c (split_nonconstant_init_1): Handle flexarrays better.

gcc/testsuite/ChangeLog:

	* g++.dg/ext/flexary37.C: Remove expected error.
2021-11-15 18:48:04 -05:00
Siddhesh Poyarekar
323026c7df gimple-fold: Use ranges to simplify strncat and snprintf
Use ranges for lengths and object sizes in strncat and snprintf to
determine if they can be transformed into simpler operations.
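
As an illustration, here is a hypothetical function (not taken from the testsuite) where the bound is not a constant but its range is known. Whether the fold actually fires for this exact shape is an assumption:

```cpp
#include <cstddef>
#include <cstring>

// Appends "hello" to dst.  After the early return, n is known to lie in
// [8, SIZE_MAX], which is at least strlen("hello") == 5, so the bound can
// never truncate the copy and the call behaves like strcat(dst, "hello").
// The caller must provide room for the appended text.
void append_hello(char *dst, std::size_t n) {
  if (n < 8)
    return;
  std::strncat(dst, "hello", n);
}
```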

gcc/ChangeLog:

	* gimple-fold.c (gimple_fold_builtin_strncat): Use ranges to
	determine if it is safe to transform to strcat.
	(gimple_fold_builtin_snprintf): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.dg/fold-stringops-2.c: Define size_t.
	(safe1): Adjust.
	(safe4): New test.
	* gcc.dg/fold-stringops-3.c: New test.

Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>
2021-11-16 04:20:46 +05:30
Siddhesh Poyarekar
cea4dab861 gimple-fold: Use ranges to simplify _chk calls
Instead of comparing LEN and SIZE only if they are constants, use their
ranges to decide if LEN will always be lower than or same as SIZE.
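
The comparison can be modelled as follows. This is a simplified sketch with hypothetical names (Range, len_known_not_above_size), not the code in gimple-fold.c:

```cpp
#include <cstdint>

// Hypothetical stand-in for a value range [lo, hi]; a constant has lo == hi.
struct Range {
  std::uint64_t lo;
  std::uint64_t hi;
};

// True when every possible LEN is lower than or the same as every possible
// SIZE, i.e. the largest LEN does not exceed the smallest SIZE.
bool len_known_not_above_size(Range len, Range size) {
  return len.hi <= size.lo;
}
```

With constants this reduces to the old LEN <= SIZE test; with ranges it also succeeds for a variable LEN whose upper bound is known.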

This change ends up putting the stringop-overflow warning line number
against the strcpy implementation, so adjust the warning check to be
line number agnostic.

gcc/ChangeLog:

	* gimple-fold.c (known_lower): New function.
	(gimple_fold_builtin_strncat_chk,
	gimple_fold_builtin_memory_chk, gimple_fold_builtin_stxcpy_chk,
	gimple_fold_builtin_stxncpy_chk,
	gimple_fold_builtin_snprintf_chk,
	gimple_fold_builtin_sprintf_chk): Use it.

gcc/testsuite/ChangeLog:

	* gcc.dg/Wobjsize-1.c: Make warning change line agnostic.
	* gcc.dg/fold-stringops-2.c: New test.

Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>
2021-11-16 04:20:31 +05:30
Siddhesh Poyarekar
d1753b4be9 gimple-fold: Transform stp*cpy_chk to str*cpy directly
Avoid going through another folding cycle and use the ignore flag to
directly transform BUILT_IN_STPCPY_CHK to BUILT_IN_STRCPY when set,
likewise for BUILT_IN_STPNCPY_CHK to BUILT_IN_STRNCPY.

Dump the transformation in dump_file so that we can verify in tests that
the direct transformation actually happened.

gcc/ChangeLog:

	* gimple-fold.c (dump_transformation): New function.
	(gimple_fold_builtin_stxcpy_chk,
	gimple_fold_builtin_stxncpy_chk): Use it.  Simplify to
	BUILT_IN_STRNCPY if return value is not used.

gcc/testsuite/ChangeLog:

	* gcc.dg/fold-stringops-1.c: New test.

Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>
2021-11-16 04:19:51 +05:30
H.J. Lu
4c19122bf5 Check optab before transforming atomic bit test and operations
Check optab before transforming equivalent, but slightly different cases
of atomic bit test and operations to their canonical forms.
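
For illustration, two source-level spellings of an atomic test-and-set of one bit. Both compute the same result; that these are exactly the shapes the pass distinguishes is an assumption, and the function names are hypothetical:

```cpp
#include <atomic>

// Mask the old value with the same single-bit mask that was OR-ed in.
bool test_and_set_masked(std::atomic<unsigned> &v, unsigned bit) {
  unsigned mask = 1u << bit;
  return (v.fetch_or(mask) & mask) != 0;
}

// Equivalent variant: shift the old value down and test the low bit.
bool test_and_set_shifted(std::atomic<unsigned> &v, unsigned bit) {
  return ((v.fetch_or(1u << bit) >> bit) & 1u) != 0;
}
```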

gcc/

	PR middle-end/103184
	* tree-ssa-ccp.c (optimize_atomic_bit_test_and): Check optab
	before transforming equivalent, but slightly different cases to
	their canonical forms.

gcc/testsuite/

	PR middle-end/103184
	* gcc.dg/pr103184-1.c: New test.
	* gcc.dg/pr103184-2.c: Likewise.
2021-11-15 12:58:56 -08:00
Iain Sandoe
fabe8cc41e IPA: Provide a mechanism to register static DTORs via cxa_atexit.
For at least one target (Darwin) the platform convention is to
register static destructors (i.e. __attribute__((destructor)))
with __cxa_atexit rather than placing them into a list that is
run by some other mechanism.

This patch provides a target hook that allows a target to opt
into this and handling for the process in ipa_cdtor_merge ().

When the mode is enabled (dtors_from_cxa_atexit is set) we:

 * Generate new CTORs to register static destructors with
   __cxa_atexit and add them to the existing list of CTORs;
   we then process the revised CTORs list.

 * We sort the DTORs into priority and then TU order; this
   means that they are registered in that order with
   __cxa_atexit () and therefore will be run in the reverse
   order.

 * Likewise, CTORs are sorted into priority and then TU order,
   which means that they will run in that order.

This matches the behavior of using init/fini (or
mod_init_func/mod_term_func) sections.
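
The ordering rule for DTORs can be sketched as a small model. The Dtor record and exit_run_order are hypothetical, and sorting priorities in ascending order is an assumption of this sketch:

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical record for one static destructor.
struct Dtor {
  int priority;      // destructor priority
  int tu_order;      // position of the defining translation unit
  std::string name;
};

// Returns destructor names in the order they would run at exit: sort by
// priority, then TU order (this is the registration order), then reverse,
// because handlers registered with __cxa_atexit run last-registered-first.
std::vector<std::string> exit_run_order(std::vector<Dtor> dtors) {
  std::stable_sort(dtors.begin(), dtors.end(),
                   [](const Dtor &a, const Dtor &b) {
                     if (a.priority != b.priority)
                       return a.priority < b.priority;
                     return a.tu_order < b.tu_order;
                   });
  std::vector<std::string> order;
  for (auto it = dtors.rbegin(); it != dtors.rend(); ++it)
    order.push_back(it->name);
  return order;
}
```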

This also fixes a bug where Fortran needs a DTOR to be run to
close IO.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>

	PR fortran/102992

gcc/ChangeLog:

	* config/darwin.h (TARGET_DTORS_FROM_CXA_ATEXIT): New.
	* doc/tm.texi: Regenerated.
	* doc/tm.texi.in: Add TARGET_DTORS_FROM_CXA_ATEXIT hook.
	* ipa.c (cgraph_build_static_cdtor_1): Return the built
	function decl.
	(build_cxa_atexit_decl): New.
	(build_dso_handle_decl): New.
	(build_cxa_dtor_registrations): New.
	(compare_cdtor_tu_order): New.
	(build_cxa_atexit_fns): New.
	(ipa_cdtor_merge): If dtors_from_cxa_atexit is set,
	process the DTORs/CTORs accordingly.
	(pass_ipa_cdtor_merge::gate): Also run if
	dtors_from_cxa_atexit is set.
	* target.def (dtors_from_cxa_atexit): New hook.
2021-11-15 19:48:56 +00:00
Iain Sandoe
d3cc82dc9c configure, Darwin: Check ld64 support for -platform-version.
Newer versions of ld64 allow specifying the OS target (e.g.
macos or ios), the version, and the SDK version all in a single
command.  This checks the availability of the command for the
current toolchain.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>

gcc/ChangeLog:

	* config.in: Regenerate.
	* configure: Regenerate.
	* configure.ac: Test ld64 for -platform-version support.
2021-11-15 19:35:10 +00:00
Iain Sandoe
bd5159bdd4 testsuite, Darwin: In tsvc.h, use malloc for Darwin <= 9.
Earlier Darwin versions do not have posix_memalign() but the
malloc implementation is guaranteed to produce memory suitably
aligned for the largest vector type.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>

gcc/testsuite/ChangeLog:

	* gcc.dg/vect/tsvc/tsvc.h: Use malloc for Darwin 9 and
	earlier.
2021-11-15 19:28:07 +00:00
Iain Sandoe
b7f0147833 Ada, Darwin : Use DSYMUTIL_FOR_TARGET in libgnat/gnarl builds.
Most of the time we get away with using the dsymutil that is
installed with the latest Xcode, however for some cross-compilation
cases that does not work.

We now have the ability to specify the correct dsymutil to use for
the toolchain (--with-dsymutil=) and we should use that specified
tool for debug link.  Fixes cross-compilers from x86-64 to powerpc.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>

gcc/ada/ChangeLog:

	* gcc-interface/Makefile.in: Use DSYMUTIL_FOR_TARGET in
	libgnat/libgnarl recipes.
2021-11-15 19:27:24 +00:00
François Dumont
d10b863fa3 libstdc++: Unordered containers merge re-use hash code
When merging two unordered containers with the same hasher, we can re-use the
hash code from the cache, if any.

Also, in the context of the merge operation on multi-containers, use the previous
insert iterator as a hint for the next insert.
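
A short usage sketch of the operation being optimized. merge_into is a hypothetical helper; the observable behavior is that of the standard C++17 merge, and the hash-code re-use is internal to the library:

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>

// Moves every element of src whose key is not already in dst into dst and
// returns how many elements stayed behind in src.  Both sets use the same
// hasher (std::hash<std::string>), the case this change targets.
std::size_t merge_into(std::unordered_set<std::string> &dst,
                       std::unordered_set<std::string> &src) {
  dst.merge(src);  // C++17 node-based merge, no element copies
  return src.size();
}
```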

libstdc++-v3/ChangeLog:

	* include/bits/hashtable_policy.h:
	(_Hash_code_base<>::_M_hash_code(const _Hash&, const _Hash_node_value<_Value, true>&)): New.
	(_Hash_code_base<>::_M_hash_code<_H2>(const _H2&, const _Hash_node_value<>&)): New.
	* include/bits/hashtable.h (_Hashtable<>::_M_merge_unique): Use latter.
	(_Hashtable<>::_M_merge_multi): Likewise.
	* testsuite/23_containers/unordered_multiset/modifiers/merge.cc (test05): New test.
	* testsuite/23_containers/unordered_set/modifiers/merge.cc (test04): New test.
2021-11-15 18:52:07 +01:00
Thomas Schwinge
f861ed8b29 Use 'location_hash' for 'gcc/diagnostic-spec.h:nowarn_map'
Instead of hard-coded '0'/'UINT_MAX', we now use the 'RESERVED_LOCATION_P'
values 'UNKNOWN_LOCATION'/'BUILTINS_LOCATION' as spare values for
'Empty'/'Deleted', and generally simplify the code.

	gcc/
	* diagnostic-spec.h (typedef xint_hash_t)
	(typedef xint_hash_map_t): Replace with...
	(typedef nowarn_map_t): ... this.
	(nowarn_map): Adjust.
	* diagnostic-spec.c (nowarn_map, suppress_warning_at): Likewise.
2021-11-15 17:57:54 +01:00
Thomas Schwinge
bcebd05720 Use 'location_hash' for 'seen_locations' in 'gcc/profile.c:branch_prob'
Follow-up to commit 102fcf94e625a2016a65829c73a42bd6c2420376
"Fix GCOV CFG related issues": considering the current
'int_hash <location_t, 0, 2>', per 'libcpp/include/line-map.h':

      Actual     | Value                         | Meaning
      -----------+-------------------------------+-------------------------------
      0x00000000 | UNKNOWN_LOCATION (gcc/input.h)| Unknown/invalid location.
      -----------+-------------------------------+-------------------------------
      0x00000001 | BUILTINS_LOCATION             | The location for declarations
                 |   (gcc/input.h)               | in "<built-in>"
      -----------+-------------------------------+-------------------------------
      0x00000002 | RESERVED_LOCATION_COUNT       | The first location to be
                 | (also                         | handed out, and the
                 |  ordmap[0]->start_location)   | first line in ordmap 0

... this currently uses value '0' ('UNKNOWN_LOCATION') as the spare value for
'Empty', and value '2' ('RESERVED_LOCATION_COUNT') as the spare value for
'Deleted', which is questionable?

What actually does get put into 'seen_locations' is (mostly...)
restricted/gated by '!RESERVED_LOCATION_P' (which is true unless
'UNKNOWN_LOCATION' or 'BUILTINS_LOCATION'), thus we may simply use
'location_hash'.
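
The reasoning can be checked with a small model. The names mirror the table above, but reserved_location_p is a simplified stand-in for 'RESERVED_LOCATION_P', and Sentinels and sentinels_safe are hypothetical:

```cpp
using location_t = unsigned int;

constexpr location_t UNKNOWN_LOCATION = 0;
constexpr location_t BUILTINS_LOCATION = 1;
constexpr location_t RESERVED_LOCATION_COUNT = 2;

// Simplified model: true only for the two reserved values.
constexpr bool reserved_location_p(location_t loc) {
  return loc < RESERVED_LOCATION_COUNT;
}

// Hypothetical sentinel pair for a hash table keyed by location_t.
struct Sentinels {
  location_t empty;
  location_t deleted;
};

constexpr Sentinels old_sentinels{UNKNOWN_LOCATION, RESERVED_LOCATION_COUNT};
constexpr Sentinels new_sentinels{UNKNOWN_LOCATION, BUILTINS_LOCATION};

// A sentinel pair is safe only if no key that passes the !reserved gate can
// ever equal one of the sentinels.
constexpr bool sentinels_safe(Sentinels s) {
  return reserved_location_p(s.empty) && reserved_location_p(s.deleted);
}

static_assert(!sentinels_safe(old_sentinels), "2 is a real location");
static_assert(sentinels_safe(new_sentinels), "both values are reserved");
```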

	gcc/
	* profile.c (branch_prob): Use 'location_hash' for
	'seen_locations'.
2021-11-15 17:56:49 +01:00
Aldy Hernandez
6c29c9d6a7 Drop tree overflow in irange setter.
Drop meaningless overflow that may creep into the IL.

gcc/ChangeLog:

	PR tree-optimization/103207
	* value-range.cc (irange::set): Drop overflow.

gcc/testsuite/ChangeLog:

	* gcc.dg/pr103207.c: New test.
2021-11-15 17:31:50 +01:00
Tobias Burnus
82ec4cb3c4 Fortran: openmp: Add support for thread_limit clause on target
gcc/fortran/ChangeLog:

	* openmp.c (OMP_TARGET_CLAUSES): Add thread_limit.
	* trans-openmp.c (gfc_split_omp_clauses): Add thread_limit also to
	teams.

libgomp/ChangeLog:

	* testsuite/libgomp.fortran/thread-limit-1.f90: New test.
2021-11-15 15:44:11 +01:00
Jakub Jelinek
b2e1ac5485 testsuite: Add testcase for already fixed PR [PR100469]
This bug introduced in r11-7448-gff92ede8d269375f800e1b347a48f4698874b4a3
has been fixed already by r12-1354-g2d2ed777b23ab6503027039e0adbfe1162f52b2f
aka PR100852 fix.

2021-11-15  Jakub Jelinek  <jakub@redhat.com>

	PR debug/100469
	* g++.dg/opt/pr100469.C: New test.
2021-11-15 14:47:44 +01:00
H.J. Lu
650108971b x86: Add gcc.target/i386/pr103205-2.c
PR target/103205
	* gcc.target/i386/pr103205-2.c: New test.
2021-11-15 05:41:54 -08:00
H.J. Lu
7d768a9d6f libffi: Update LOCAL_PATCHES
Add

commit a91f844ef449d0dd1cf2e0e47b0ade0d8a6304e1
Author: Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>
Date:   Mon Nov 15 10:24:27 2021 +0100

    libffi: Use #define instead of .macro in  src/x86/win64.S [PR102874]

to LOCAL_PATCHES.

	* LOCAL_PATCHES: Add commit a91f844ef44.
2021-11-15 04:56:05 -08:00
Jakub Jelinek
aea7238683 openmp: Add support for thread_limit clause on target
OpenMP 5.1 says that thread_limit clause can also appear on target,
and similarly to teams should affect the thread-limit-var ICV.
On combined target teams, the clause goes to both.

We actually passed thread_limit internally on target already before,
but only used it for gcn/ptx offloading to hint how many threads should be
created and for ptx didn't set thread_limit_var in that case.
Similarly for host fallback.
Also, I found that we weren't copying the args array that contains encoded
thread_limit and num_teams clause for target (etc.) for async target.

2021-11-15  Jakub Jelinek  <jakub@redhat.com>

gcc/
	* gimplify.c (optimize_target_teams): Only add OMP_CLAUSE_THREAD_LIMIT
	to OMP_TARGET_CLAUSES if it isn't there already.
gcc/c-family/
	* c-omp.c (c_omp_split_clauses) <case OMP_CLAUSE_THREAD_LIMIT>:
	Duplicate to both OMP_TARGET and OMP_TEAMS.
gcc/c/
	* c-parser.c (OMP_TARGET_CLAUSE_MASK): Add
	PRAGMA_OMP_CLAUSE_THREAD_LIMIT.
gcc/cp/
	* parser.c (OMP_TARGET_CLAUSE_MASK): Add
	PRAGMA_OMP_CLAUSE_THREAD_LIMIT.
libgomp/
	* task.c (gomp_create_target_task): Copy args array as well.
	* target.c (gomp_target_fallback): Add args argument.
	Set gomp_icv (true)->thread_limit_var if thread_limit is present.
	(GOMP_target): Adjust gomp_target_fallback caller.
	(GOMP_target_ext): Likewise.
	(gomp_target_task_fn): Likewise.
	* config/nvptx/team.c (gomp_nvptx_main): Set
	gomp_global_icv.thread_limit_var.
	* testsuite/libgomp.c-c++-common/thread-limit-1.c: New test.
2021-11-15 13:20:53 +01:00
Aldy Hernandez
fcdf49a0ad Fix PHI ordering problems in the path solver.
After auditing the PHI range calculations, I'm not convinced we've
caught all the corner cases.  They haven't shown up in the wild (yet),
but better safe than sorry.

We shouldn't write anything to the cache or trigger additional
lookups while calculating a PHI, as this may cause ordering problems.
We should resolve the PHI with either the cache as it stands, or by
asking for ranges on entry to the path.  I've documented this.

There was one dubious case where we called fold_range in
ssa_range_in_phi, which mostly by luck wasn't triggering lookups,
because fold_range solves a PHI by calling range_on_edge, which is set
to pick up global ranges by default in path_range_query.  This is
fragile, so I've rewritten the call to explicitly use cached or global
ranges.

Also, the cache should be avoided in ssa_range_in_phi when the arg is
defined in the PHI's block, as not doing so could create an ordering
problem.  We have a similar check when calculating relations in PHIs.

Tested on x86-64 & ppc64le Linux.

gcc/ChangeLog:

	* gimple-range-path.cc (path_range_query::internal_range_of_expr):
	Remove useless code.
	(path_range_query::ssa_defined_in_bb): New.
	(path_range_query::ssa_range_in_phi): Avoid fold_range call that
	could trigger additional lookups.
	Do not use the cache for ARGs defined in this block.
	(path_range_query::compute_ranges_in_block): Use ssa_defined_in_bb.
	(path_range_query::maybe_register_phi_relation): Same.
	(path_range_query::range_of_stmt): Adjust comment.
	* gimple-range-path.h (ssa_defined_in_bb): New.
2021-11-15 13:16:57 +01:00
Aldy Hernandez
540d92ae9b path solver: Default to global range if nothing found.
This has been a long time coming, but we weren't able to make the
change because of some unrelated regressions.

Tested on x86-64 & ppc64le Linux.

gcc/ChangeLog:

	* gimple-range-path.cc (path_range_query::internal_range_of_expr):
	Default to global range if nothing found.

gcc/testsuite/ChangeLog:

	* g++.dg/tree-ssa/pr31146-2.C: Add -fno-thread-jumps.
2021-11-15 13:16:56 +01:00
Richard Biener
220bd61874 tree-optimization/103237 - avoid vectorizing unhandled double reductions
Double reductions which have multiple LC PHIs in the inner loop
are not handled correctly during transformation since those PHIs
are not properly classified as reduction.  The following disables
vectorizing them.

2021-11-15  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/103237
	* tree-vect-loop.c (vect_is_simple_reduction): Fail for
	double reductions with multiple inner loop LC PHI nodes.

	* gcc.dg/torture/pr103237.c: New testcase.
2021-11-15 13:07:57 +01:00
Hongyu Wang
4d281ff7dd PR target/103069: Relax cmpxchg loop for x86 target
From the CPU's point of view, getting a cache line for writing is more
expensive than reading.  See Appendix A.2 Spinlock in:

https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/xeon-lock-scaling-analysis-paper.pdf

The full compare and swap grabs the cache line exclusively and causes
excessive cache line bouncing.

The atomic_fetch_{or,xor,and,nand} builtins generate a cmpxchg loop under
-march=x86-64 like:

	movl	v(%rip), %eax
.L2:
	movl	%eax, %ecx
	movl	%eax, %edx
	orl	$1, %ecx
	lock cmpxchgl	%ecx, v(%rip)
	jne	.L2
	movl	%edx, %eax
	andl	$1, %eax
	ret

To relax the above loop, GCC should first emit a normal load, then check and
jump to .L2 if cmpxchgl may fail. Before jumping to .L2, PAUSE should be
inserted to yield the CPU to another hyperthread and to save power, so the
code is like

.L84:
        movl    (%rdi), %ecx
        movl    %eax, %edx
        orl     %esi, %edx
        cmpl    %eax, %ecx
        jne     .L82
        lock cmpxchgl   %edx, (%rdi)
        jne     .L84
.L82:
        rep nop
        jmp     .L84

This patch adds corresponding atomic_fetch_op expanders to insert load/
compare and pause for all the atomic logic fetch builtins. It also adds the
-mrelax-cmpxchg-loop flag to control whether the relaxed loop is generated.
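
The same idea can be written at source level. This is only a sketch of the technique with hypothetical names (cpu_relax, relaxed_fetch_or); it is not the code the new expanders emit:

```cpp
#include <atomic>

// Spin-wait hint; on x86 this emits PAUSE (encoded as "rep nop").
inline void cpu_relax() {
#if defined(__x86_64__) || defined(__i386__)
  __builtin_ia32_pause();
#endif
}

// fetch_or written as a relaxed compare-and-swap loop: re-read the location
// with a plain load first and only attempt the locked cmpxchg when the value
// still matches the one the new value is computed from.
unsigned relaxed_fetch_or(std::atomic<unsigned> &v, unsigned mask) {
  unsigned expected = v.load(std::memory_order_relaxed);
  for (;;) {
    unsigned current = v.load(std::memory_order_relaxed);
    if (current != expected) {
      // The cmpxchg would fail: back off without taking the line exclusive.
      cpu_relax();
      expected = current;
      continue;
    }
    if (v.compare_exchange_weak(expected, expected | mask))
      return expected;  // old value, as fetch_or returns
    // On failure compare_exchange_weak stored the observed value in expected.
  }
}
```

The bit-test idiom from the first listing then reads `relaxed_fetch_or(v, 1) & 1`.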

gcc/ChangeLog:

	PR target/103069
	* config/i386/i386-expand.c (ix86_expand_atomic_fetch_op_loop):
	New expand function.
	* config/i386/i386-options.c (ix86_target_string): Add
	-mrelax-cmpxchg-loop flag.
	(ix86_valid_target_attribute_inner_p): Likewise.
	* config/i386/i386-protos.h (ix86_expand_atomic_fetch_op_loop):
	New expand function prototype.
	* config/i386/i386.opt: Add -mrelax-cmpxchg-loop.
	* config/i386/sync.md (atomic_fetch_<logic><mode>): New expander
	for SI,HI,QI modes.
	(atomic_<logic>_fetch<mode>): Likewise.
	(atomic_fetch_nand<mode>): Likewise.
	(atomic_nand_fetch<mode>): Likewise.
	(atomic_fetch_<logic><mode>): New expander for DI,TI modes.
	(atomic_<logic>_fetch<mode>): Likewise.
	(atomic_fetch_nand<mode>): Likewise.
	(atomic_nand_fetch<mode>): Likewise.
	* doc/invoke.texi: Document -mrelax-cmpxchg-loop.

gcc/testsuite/ChangeLog:

	PR target/103069
	* gcc.target/i386/pr103069-1.c: New test.
	* gcc.target/i386/pr103069-2.c: Ditto.
2021-11-15 19:09:38 +08:00
Richard Biener
d1ca8aeaf3 tree-optimization/103219 - avoid ICE in unroll-and-jam
For no particularly good reason unroll-and-jam uses single_dom_exit
to determine the exit for the region it wants to run VN on.  That
happens to ICE because of the dominance restriction.  Use single_exit
instead.

2021-11-15  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/103219
	* gimple-loop-jam.c (tree_loop_unroll_and_jam): Use single_exit
	to determine the exit for the VN region.

	* gcc.dg/torture/pr103219.c: New testcase.
2021-11-15 11:10:16 +01:00
Prathamesh Kulkarni
2551cd4f9b [tree-vectorizer.c] Merge pass_vectorize::execute with vectorize_loops and replace occurrences of cfun with function param.
gcc/ChangeLog:
	* tree-ssa-loop.c (pass_vectorize): Move to tree-vectorizer.c.
	(pass_data_vectorize): Likewise.
	(make_pass_vectorize): Likewise.
	* tree-vectorizer.c (vectorize_loops): Merge with
	pass_vectorize::execute and replace cfun occurrences with fun param.
	(adjust_simduid_builtins): Add fun param, replace cfun occurrences with
	fun, and adjust callers appropriately.
	(note_simd_array_uses): Likewise.
	(vect_loop_dist_alias_call): Likewise.
	(set_uid_loop_bbs): Likewise.
	(vect_transform_loops): Likewise.
	(try_vectorize_loop_1): Likewise.
	(try_vectorize_loop): Likewise.
2021-11-15 15:37:36 +05:30
Rainer Orth
a91f844ef4 libffi: Use #define instead of .macro in src/x86/win64.S [PR102874]
The libffi 3.4.2 import badly broke Solaris/x86 bootstrap with the native
assembler:

Assembler:
        "/vol/gcc/src/hg/master/local/libffi/src/x86/win64.S", line 88 :
Illegal mnemonic
        Near line: ".macro epilogue"
        "/vol/gcc/src/hg/master/local/libffi/src/x86/win64.S", line 88 : Syntax
error
        Near line: ".macro epilogue"
        "/vol/gcc/src/hg/master/local/libffi/src/x86/win64.S", line 95 :
Illegal mnemonic
        Near line: ".endm"
        "/vol/gcc/src/hg/master/local/libffi/src/x86/win64.S", line 95 : Syntax
error
        Near line: ".endm"
        "/vol/gcc/src/hg/master/local/libffi/src/x86/win64.S", line 100 :
Illegal mnemonic
        Near line: " epilogue"
        "/vol/gcc/src/hg/master/local/libffi/src/x86/win64.S", line 100 :
Syntax error
        Near line: "epilogue"

Solaris as doesn't support .macro/.endm.

Fixed by using #define instead of the unportable .macro.

Tested on i386-pc-solaris2.11 and x86_64-pc-linux-gnu.

The bug has been reported upstream
(https://github.com/libffi/libffi/issues/665); a corresponding pull
request is also pending (https://github.com/libffi/libffi/pull/669).


2021-10-21  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

	libffi:
	PR libffi/102874
	* src/x86/win64.S (epilogue): Use #define instead of .macro.
2021-11-15 10:24:27 +01:00
Rainer Orth
a68933da01 testsuite: i386: Require dfp in gcc.target/i386/pr101346.c
gcc.target/i386/pr101346.c currently FAILs on Solaris/x86:

FAIL: gcc.target/i386/pr101346.c (test for excess errors)

Excess errors:
/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.target/i386/pr101346.c:6:1:
error: decimal floating-point not supported for this target
/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.target/i386/pr101346.c:7:6:
error: decimal floating-point not supported for this target
/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.target/i386/pr101346.c:9:12:
warning: implicit declaration of function '__builtin_fabsd128'; did you
mean '__builtin_fabsf128'? [-Wimplicit-function-declaration]

Fixed by requiring dfp support.  Tested on i386-pc-solaris2.11 and
x86_64-pc-linux-gnu.


2021-10-20  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

	gcc/testsuite:
	* gcc.target/i386/pr101346.c: Require dfp support.
2021-11-15 10:00:14 +01:00
Jakub Jelinek
625eef42e3 i386: Fix up x86 atomic_bit_test* expanders for !TARGET_HIMODE_MATH [PR103205]
With !TARGET_HIMODE_MATH, the OPTAB_DIRECT expand_simple_binop fails and so
we ICE.  We don't really care if they are done promoted in SImode instead.

2021-11-15  Jakub Jelinek  <jakub@redhat.com>

	PR target/103205
	* config/i386/sync.md (atomic_bit_test_and_set<mode>,
	atomic_bit_test_and_complement<mode>,
	atomic_bit_test_and_reset<mode>): Use OPTAB_WIDEN instead of
	OPTAB_DIRECT.

	* gcc.target/i386/pr103205.c: New test.
2021-11-15 09:30:08 +01:00
Jakub Jelinek
9fa72756d9 libgomp, nvptx: Honor OpenMP 5.1 num_teams lower bound
Here is a PTX implementation of what I was talking about: for
num_teams_upper 0, or whenever num_teams_lower <= num_blocks, the current
implementation is fine.  But if the user explicitly asks for more
teams than we can provide in hardware, we need to stop assuming that
omp_get_team_num () is equal to the hw team id.  Instead we need to use some
team-specific memory (it is .shared for PTX) or, if none is provided, an
array indexed by the hw team id, and run some teams serially within
the same hw thread.
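
A host-side model of the resulting schedule. teams_run_by_block is a hypothetical helper, and advancing the team number by num_blocks on each round is an assumption of this sketch:

```cpp
#include <vector>

// Team numbers executed, in order, by hardware block block_id when num_teams
// teams are requested but only num_blocks blocks exist.
std::vector<unsigned> teams_run_by_block(unsigned block_id, unsigned num_blocks,
                                         unsigned num_teams) {
  std::vector<unsigned> teams;
  if (num_blocks == 0)
    return teams;
  // The first team is the hw team id itself; later teams are run serially by
  // the same block until the requested number of teams is exhausted.
  for (unsigned team = block_id; team < num_teams; team += num_blocks)
    teams.push_back(team);
  return teams;
}
```

With 4 blocks and 10 requested teams, block 1 would run teams 1, 5 and 9 one after another.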

2021-11-15  Jakub Jelinek  <jakub@redhat.com>

	* config/nvptx/team.c (__gomp_team_num): Define as
	__attribute__((shared)) var.
	(gomp_nvptx_main): Initialize __gomp_team_num to 0.
	* config/nvptx/target.c (__gomp_team_num): Declare as
	extern __attribute__((shared)) var.
	(GOMP_teams4): Use __gomp_team_num as the team number instead of
	%ctaid.x.  If first, initialize it to %ctaid.x.  If num_teams_lower
	is bigger than num_blocks, use num_teams_lower teams and arrange for
	bumping of __gomp_team_num if !first and returning false once we run
	out of teams.
	* config/nvptx/teams.c (__gomp_team_num): Declare as
	extern __attribute__((shared)) var.
	(omp_get_team_num): Return __gomp_team_num value instead of %ctaid.x.
2021-11-15 09:20:52 +01:00
Jakub Jelinek
d294459720 libgomp: Add a testcase for omp_get_num_teams inside of target inside of host teams
This is https://github.com/OpenMP/spec/issues/3183
There is an agreement that we should return 1 team inside of target,
even if that target is inside of host teams.  We were doing that
when offloading and not during host fallback, r12-5151 should fix that
even for host fallback.

2021-11-15  Jakub Jelinek  <jakub@redhat.com>

	* testsuite/libgomp.c/teams-5.c: New test.
2021-11-15 08:58:39 +01:00
Jason Merrill
2317082c15 c++: location of lambda object and conversion call
Two things that had poor location info: we weren't giving the TARGET_EXPR
for a lambda object any location, and the call to a conversion function was
getting whatever input_location happened to be.

gcc/cp/ChangeLog:

	* call.c (perform_implicit_conversion_flags): Use the location of
	the argument.
	* lambda.c (build_lambda_object): Set location on the TARGET_EXPR.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp0x/lambda/lambda-switch.C: Adjust expected location.
2021-11-15 02:52:36 -05:00
Jason Merrill
37326651b4 c++: check constexpr constructor body
The implicit constexpr patch revealed that our check for constexpr
constructors that could possibly produce a constant value (which
otherwise are IFNDR) was failing to look at most of the function body.
Fixing that required some library tweaks.

gcc/cp/ChangeLog:

	* constexpr.c (maybe_save_constexpr_fundef): Also check whether the
	body of a constructor is potentially constant.

libstdc++-v3/ChangeLog:

	* src/c++17/memory_resource.cc: Add missing constexpr.
	* include/experimental/internet: Only mark copy constructor
	as constexpr with __cpp_constexpr_dynamic_alloc.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp1y/constexpr-89285-2.C: Expect error.
	* g++.dg/cpp1y/constexpr-89285.C: Adjust error.
2021-11-15 02:50:45 -05:00
Jason Merrill
daa9c6b015 c++: is_this_parameter and coroutines proxies
Compiling coroutines/pr95736.C with the implicit constexpr patch broke
because is_this_parameter didn't recognize the coroutines proxy for 'this'.

gcc/cp/ChangeLog:

	* semantics.c (is_this_parameter): Check DECL_HAS_VALUE_EXPR_P
	instead of is_capture_proxy.
2021-11-15 02:50:26 -05:00
Jason Merrill
bd95d75f34 c++: c++20 constexpr default ctor and array init
The implicit constexpr patch revealed that marking the constructor in the
PR70690 testcase as constexpr made the bug reappear, because build_vec_init
assumed that a constexpr default constructor initialized the whole object,
so it was equivalent to value-initialization.  But this is no longer true in
C++20.

	PR c++/70690

gcc/cp/ChangeLog:

	* init.c (build_vec_init): Check default_init_uninitialized_part in
	C++20.

gcc/testsuite/ChangeLog:

	* g++.dg/init/array41a.C: New test.
2021-11-15 02:49:51 -05:00
Jason Merrill
4df7f8c798 c++: don't do constexpr folding in unevaluated context
The implicit constexpr patch revealed that we were doing constant evaluation
of arbitrary expressions in unevaluated contexts, leading to failure when we
tried to evaluate e.g. a call to declval.  This is wrong more generally;
only manifestly-constant-evaluated expressions should be evaluated within
an unevaluated operand.

Making this change revealed a case we were failing to mark as manifestly
constant-evaluated.

gcc/cp/ChangeLog:

	* constexpr.c (maybe_constant_value): Don't evaluate
	in an unevaluated operand unless manifestly const-evaluated.
	(fold_non_dependent_expr_template): Likewise.
	* decl.c (compute_array_index_type_loc): This context is
	manifestly constant-evaluated.
2021-11-15 02:45:48 -05:00
Jason Merrill
267318a285 c++: constexpr virtual and vbase thunk
C++20 allows virtual functions to be constexpr.  I don't think that calling
through a pointer to a vbase subobject is supposed to work in a constant
expression, since an object with virtual bases can't be constant, but the
call shouldn't ICE.

gcc/cp/ChangeLog:

	* constexpr.c (cxx_eval_thunk_call): Error instead of ICE
	on vbase thunk to constexpr function.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp2a/constexpr-virtual20.C: New test.
2021-11-15 02:30:26 -05:00
Hans-Peter Nilsson
adcfd2c45c gcc.dg/uninit-pred-9_b.c: Correct last adjustment for cris-elf
The change at r12-4790 should have done the same change for
CRIS as was done for powerpc64*-*-*.  (Probably MMIX too but
that may have to wait until the next weekend.)

gcc/testsuite:
	* gcc.dg/uninit-pred-9_b.c: Correct last adjustment, for CRIS.
2021-11-15 07:59:16 +01:00
Maciej W. Rozycki
3e09331f6a VAX: Implement the `-mlra' command-line option
Add the `-mlra' command-line option for the VAX target, with the
usual semantics of enabling Local Register Allocation, off by default.

LRA remains unstable with the VAX target, with numerous ICEs throughout
the testsuite and worse code produced overall where successful, however
the presence of a command line option to enable it makes it easier to
experiment with it as the compiler does not have to be rebuilt to flip
between the old reload and LRA.

	gcc/
	* config/vax/vax.c (vax_lra_p): New prototype and function.
	(TARGET_LRA_P): Wire it.
	* config/vax/vax.opt (mlra): New option.
	* doc/invoke.texi (Option Summary, VAX Options): Document the
	new option.
2021-11-15 03:14:31 +00:00
GCC Administrator
b85a03ae11 Daily bump. 2021-11-15 00:16:20 +00:00
Andrew Pinski
09f33d12b5 [Committed] Move some testcases to torture from tree-ssa
While writing up some testcases, I noticed some newer testcases
just had "dg-do compile/run" on them with dg-options of either -O1
or -O2. Since it is always better to run them over all optimization
levels I put them in gcc.c-torture/compile or gcc.c-torture/execute.

Committed after testing to make sure the testcases pass.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/pr100278.c: Move to ...
	* gcc.c-torture/compile/pr100278.c: Here.
	Remove dg-do and dg-options.
	* gcc.dg/tree-ssa/pr101189.c: Move to ...
	* gcc.c-torture/compile/pr101189.c: Here.
	Remove dg-do and dg-options.
	* gcc.dg/tree-ssa/pr100453.c: Move to ...
	* gcc.c-torture/execute/pr100453.c: Here.
	Remove dg-do and dg-options.
	* gcc.dg/tree-ssa/pr101335.c: Move to ...
	* gcc.c-torture/execute/pr101335.c: Here.
	Remove dg-do and dg-options.
2021-11-15 00:02:18 +00:00
Jan Hubicka
a34edf9a3e Track nondeterminism and interposable calls in ipa-modref
Adds tracking of two new flags in ipa-modref: nondeterministic and
calls_interposable.  The first is set when a function does something that is
not guaranteed to be the same if run again (volatile memory access, volatile
asm or external function call).  The second is set if a function calls
something that does not bind to the current def.
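
A minimal C sketch of the kinds of functions the two flags describe.  All names
here (status_reg, read_status, twice, scale, uses_scale) are hypothetical and
not taken from the patch or its testcases, and whether scale is actually
interposable depends on how the code is built (for example as part of a shared
library with default visibility), which is assumed rather than shown.

```c
/* Hypothetical device register; volatile, so every read is an
   observable access. */
static volatile int status_reg;

/* Reads volatile memory: two calls need not return the same value,
   so a function like this would be flagged nondeterministic. */
static int read_status(void)
{
    return status_reg;
}

/* Depends only on its argument and is local to this unit. */
static int twice(int x)
{
    return 2 * x;
}

/* Non-static definition.  Assumption: when built so that it can be
   interposed, a call to it does not bind to the current definition. */
int scale(int x)
{
    return twice(x) + 1;
}

/* Calls scale; under the assumption above this caller is the kind of
   function for which calls_interposable would be set. */
int uses_scale(int x)
{
    return scale(x);
}
```

Here read_status illustrates the first flag and uses_scale the second; twice is
neither, since it touches no volatile state and calls nothing.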

nondeterministic enables ipa-modref to discover looping pure/const functions
and it now discovers 138 of them during cc1plus link (which about doubles
the number of such functions detected late).  However, we can do more:

 1) We can extend FRE to eliminate redundant calls.
    I filed PR103168 for that.
    A common case is inline functions that are not autodetected as ECF_CONST
    just because they do not bind to local def and can be easily handled.
    More tricky is to use modref summary to check what memory locations are
    read.
 2) DSE can eliminate redundant stores.

The calls_interposable flag currently also improves tree-ssa-structalias
on functions that are not binds_to_current_def since reads_global_memory
is now not cleared by interposable functions.

gcc/ChangeLog:

	* ipa-modref.h (struct modref_summary): Add nondeterministic
	and calls_interposable flags.
	* ipa-modref.c (modref_summary::modref_summary): Initialize new flags.
	(modref_summary::useful_p): Check new flags.
	(struct modref_summary_lto): Add nondeterministic and
	calls_interposable flags.
	(modref_summary_lto::modref_summary_lto): Initialize new flags.
	(modref_summary_lto::useful_p): Check new flags.
	(modref_summary::dump): Dump new flags.
	(modref_summary_lto::dump): Dump new flags.
	(ignore_nondeterminism_p): New function.
	(merge_call_side_effects): Merge new flags.
	(process_fnspec): Likewise.
	(analyze_load): Volatile access is nondeterministic.
	(analyze_store): Likewise.
	(analyze_stmt): Volatile ASM is nondeterministic.
	(analyze_function): Clear new flags.
	(modref_summaries::duplicate): Duplicate new flags.
	(modref_summaries_lto::duplicate): Duplicate new flags.
	(modref_write): Stream new flags.
	(read_section): Stream new flags.
	(propagate_unknown_call): Update new flags.
	(modref_propagate_in_scc): Propagate new flags.
	* tree-ssa-alias.c (ref_maybe_used_by_call_p_1): Check
	calls_interposable.
	* tree-ssa-structalias.c (determine_global_memory_access):
	Likewise.
2021-11-15 00:10:06 +01:00
Maciej W. Rozycki
3057f1ab73 VAX: Add the `setmemhi' instruction
The MOVC5 machine instruction has `memset' semantics if encoded with a
zero source length[1]:

"4. MOVC5 with a zero source length operand is the preferred way
    to fill a block of memory with the fill character."

Use that instruction to implement the `setmemhi' instruction then.  Use
the AP register in the register deferred mode for the source address to
yield the shortest possible encoding of the otherwise unused operand,
observing that the address is never dereferenced if the source length is
zero.
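
As a rough illustration of why a zero source length gives `memset' semantics,
here is a small C model of the data movement only.  It is a sketch under
assumptions: movc5_model and fill_block are hypothetical names, and the model
ignores the condition codes, register side effects and overlap handling of the
real instruction.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of MOVC5 data movement: copy min(srclen, dstlen)
   bytes from src, then pad the rest of the destination with fill. */
static void movc5_model(size_t srclen, const unsigned char *src,
                        unsigned char fill,
                        size_t dstlen, unsigned char *dst)
{
    size_t n = srclen < dstlen ? srclen : dstlen;
    size_t i;

    /* The source is read only when there is something to copy. */
    assert(n == 0 || src != NULL);

    for (i = 0; i < n; i++)
        dst[i] = src[i];
    for (; i < dstlen; i++)
        dst[i] = fill;
}

/* With srclen == 0 the copy loop never runs, the source address is
   never dereferenced, and only the fill loop remains. */
static void fill_block(unsigned char *dst, unsigned char fill, size_t len)
{
    movc5_model(0, NULL, fill, len, dst);
}
```

Calling fill_block(buf, c, n) in this model has the same effect on buf as
memset(buf, c, n), which is the property the expander relies on.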

The use of this instruction yields steadily better performance, at least
with the Mariah VAX implementation, for a variable-length `memset' call
expanded inline as a single MOVC5 operation compared to an equivalent
libcall invocation:

Length:   1, time elapsed:  0.971789 (builtin),  2.847303 (libcall)
Length:   2, time elapsed:  0.907904 (builtin),  2.728259 (libcall)
Length:   3, time elapsed:  1.038311 (builtin),  2.917245 (libcall)
Length:   4, time elapsed:  0.775305 (builtin),  2.686088 (libcall)
Length:   7, time elapsed:  1.112331 (builtin),  2.992968 (libcall)
Length:   8, time elapsed:  0.856882 (builtin),  2.764885 (libcall)
Length:  15, time elapsed:  1.256086 (builtin),  3.096660 (libcall)
Length:  16, time elapsed:  1.001962 (builtin),  2.888131 (libcall)
Length:  31, time elapsed:  1.590456 (builtin),  3.774164 (libcall)
Length:  32, time elapsed:  1.288909 (builtin),  3.629622 (libcall)
Length:  63, time elapsed:  3.430285 (builtin),  5.269789 (libcall)
Length:  64, time elapsed:  3.265147 (builtin),  5.113156 (libcall)
Length: 127, time elapsed:  6.438772 (builtin),  8.268305 (libcall)
Length: 128, time elapsed:  6.268991 (builtin),  8.114557 (libcall)
Length: 255, time elapsed: 12.417338 (builtin), 14.259678 (libcall)

(times given in seconds per 1000000 `memset' invocations for the given
length made in a loop).  It is clear from these figures that hardware
does data coalescence for consecutive bytes rather than naively copying
them one by one, as for lengths that are powers of 2 the figures are
consistently lower than ones for their respective next lower lengths.

The use of MOVC5 also requires at least 4 bytes less in terms of machine
code as it avoids encoding the address of `memset' needed for the CALLS
instruction used to make a libcall, as well as extra PUSHL instructions
needed to pass arguments to the call as those can be encoded directly as
the respective operands of the MOVC5 instruction.

It is perhaps worth noting too that for constant lengths we prefer to
emit up to 5 individual MOVx instructions rather than a single MOVC5
instruction to clear memory and for consistency we copy this behavior
here for filling memory with another value too, even though there may be
a performance advantage with a string copy in comparison to a piecemeal
copy, e.g.:

Length:  40, time elapsed:  2.183192 (string),   2.638878 (piecemeal)

But this is something for another change as it will have to be carefully
evaluated.

[1] DEC STD 032-0 "VAX Architecture Standard", Digital Equipment
    Corporation, A-DS-EL-00032-00-0 Rev J, December 15, 1989, Section
    3.10 "Character-String Instructions", p. 3-163

	gcc/
	* config/vax/vax.h (SET_RATIO): New macro.
	* config/vax/vax.md (UNSPEC_SETMEM_FILL): New constant.
	(setmemhi): New expander.
	(setmemhi1): New insn and splitter.
	(*setmemhi1): New insn.

	gcc/testsuite/
	* gcc.target/vax/setmem.c: New test.
2021-11-14 21:01:51 +00:00
François Dumont
e9a53a4f76 libstdc++: [_GLIBCXX_DEBUG] Remove _Safe_container<>::_M_safe()
_GLIBCXX_DEBUG container code cleanup to get rid of _Safe_container<>::_M_safe() and just
use _Safe:: calls which use normal inheritance. Also remove several usages of _M_base()
which can be omitted most of the time and are sometimes replaced with explicit _Base::
calls.

libstdc++-v3/ChangeLog:

	* include/debug/safe_container.h (_Safe_container<>::_M_safe): Remove.
	* include/debug/deque (deque::operator=(initializer_list<>)): Replace
	_M_base() call with _Base:: call.
	(deque::operator[](size_type)): Likewise.
	* include/debug/forward_list (forward_list(forward_list&&, const allocator_type&)):
	Remove _M_safe() and _M_base() calls.
	(forward_list::operator=(initializer_list<>)): Remove _M_base() calls.
	(forward_list::splice_after, forward_list::merge): Likewise.
	* include/debug/list (list(list&&, const allocator_type&)):
	Remove _M_safe() and _M_base() calls.
	(list::operator=(initializer_list<>)): Remove _M_base() calls.
	(list::splice, list::merge): Likewise.
	* include/debug/map.h (map(map&&, const allocator_type&)):
	Remove _M_safe() and _M_base() calls.
	(map::operator=(initializer_list<>)): Remove _M_base() calls.
	* include/debug/multimap.h (multimap(multimap&&, const allocator_type&)):
	Remove _M_safe() and _M_base() calls.
	(multimap::operator=(initializer_list<>)): Remove _M_base() calls.
	* include/debug/set.h (set(set&&, const allocator_type&)):
	Remove _M_safe() and _M_base() calls.
	(set::operator=(initializer_list<>)): Remove _M_base() calls.
	* include/debug/multiset.h (multiset(multiset&&, const allocator_type&)):
	Remove _M_safe() and _M_base() calls.
	(multiset::operator=(initializer_list<>)): Remove _M_base() calls.
	* include/debug/string (basic_string(basic_string&&, const allocator_type&)):
	Remove _M_safe() and _M_base() calls.
	(basic_string::operator=(initializer_list<>)): Remove _M_base() call.
	(basic_string::operator=(const _CharT*), basic_string::operator=(_CharT)): Likewise.
	(basic_string::operator[](size_type), basic_string::operator+=(const basic_string&)):
	Likewise.
	(basic_string::operator+=(const _CharT*), basic_string::operator+=(_CharT)): Likewise.
	* include/debug/unordered_map (unordered_map(unordered_map&&, const allocator_type&)):
	Remove _M_safe() and _M_base() calls.
	(unordered_map::operator=(initializer_list<>), unordered_map::merge):
	Remove _M_base() calls.
	(unordered_multimap(unordered_multimap&&, const allocator_type&)):
	Remove _M_safe() and _M_base() calls.
	(unordered_multimap::operator=(initializer_list<>), unordered_multimap::merge):
	Remove _M_base() calls.
	* include/debug/unordered_set (unordered_set(unordered_set&&, const allocator_type&)):
	Remove _M_safe() and _M_base() calls.
	(unordered_set::operator=(initializer_list<>), unordered_set::merge):
	Remove _M_base() calls.
	(unordered_multiset(unordered_multiset&&, const allocator_type&)):
	Remove _M_safe() and _M_base() calls.
	(unordered_multiset::operator=(initializer_list<>), unordered_multiset::merge):
	Remove _M_base() calls.
	* include/debug/vector (vector(vector&&, const allocator_type&)):
	Remove _M_safe() and _M_base() calls.
	(vector::operator=(initializer_list<>)): Remove _M_base() calls.
	(vector::operator[](size_type)): Likewise.
2021-11-14 21:55:01 +01:00
Jan Hubicka
64f3e71c30 Extend modref to track kills
This patch adds kill tracking to ipa-modref.  This is represented by an array
of accesses to memory locations that are known to be overwritten by the
function.
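
A minimal C sketch of the situation kill information is meant to describe.
The names (struct pair, reset, caller) are hypothetical and this is not the
new testcase; whether the compiler actually removes the marked store depends
on the optimization level and on the summaries it computes.

```c
struct pair { int a; int b; };

/* Every path through this function stores to p->a, so a summary could
   record p->a as killed (always overwritten).  p->b is stored only
   conditionally and would not qualify. */
static void reset(struct pair *p, int set_b)
{
    p->a = 0;
    if (set_b)
        p->b = 1;
}

static int caller(int set_b)
{
    struct pair p;

    p.a = 42;  /* dead if the callee is known to kill p.a */
    p.b = 7;   /* must stay: reset may leave p.b untouched */
    reset(&p, set_b);
    return p.a + p.b;
}
```

With a kill recorded for p->a, dead store elimination can treat the call to
reset like a plain store to p.a and drop the earlier assignment without
inlining the callee.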

gcc/ChangeLog:

2021-11-14  Jan Hubicka  <hubicka@ucw.cz>

	* ipa-modref-tree.c (modref_access_node::update_for_kills): New
	member function.
	(modref_access_node::merge_for_kills): Likewise.
	(modref_access_node::insert_kill): Likewise.
	* ipa-modref-tree.h (modref_access_node::update_for_kills,
	modref_access_node::merge_for_kills, modref_access_node::insert_kill):
	Declare.
	(modref_access_node::useful_for_kill): New member function.
	* ipa-modref.c (modref_summary::useful_p): Release useless kills.
	(lto_modref_summary): Add kills.
	(modref_summary::dump): Dump kills.
	(record_access): Add modref_access_node parameter.
	(record_access_lto): Likewise.
	(merge_call_side_effects): Merge kills.
	(analyze_call): Add ALWAYS_EXECUTED param and pass it around.
	(struct summary_ptrs): Add always_executed field.
	(analyze_load): Update.
	(analyze_store): Update; record kills.
	(analyze_stmt): Add always_executed; record kills in clobbers.
	(analyze_function): Track always_executed.
	(modref_summaries::duplicate): Duplicate kills.
	(update_signature): Release kills.
	* ipa-modref.h (struct modref_summary): Add kills.
	* tree-ssa-alias.c (alias_stats): Add kill stats.
	(dump_alias_stats): Dump kill stats.
	(store_kills_ref_p): Break out from ...
	(stmt_kills_ref_p): Use it; handle modref info based kills.

gcc/testsuite/ChangeLog:

2021-11-14  Jan Hubicka  <hubicka@ucw.cz>

	* gcc.dg/tree-ssa/modref-dse-3.c: New test.
2021-11-14 18:49:15 +01:00
Aldy Hernandez
8a601f9bc4 Remove gcc.dg/pr103229.c
gcc/testsuite/ChangeLog:

	* gcc.dg/pr103229.c: Removed.
2021-11-14 16:17:36 +01:00