Commit Graph

186260 Commits

Author SHA1 Message Date
Andrew MacLeod 4c85ff7549 Split gimple-range into gimple-range-fold and gimple-range.
Split the fold_using_range functions from gimple-range into gimple-range-fold.
Also move the gimple_range_calc* routines into gimple-range-gori.

	* Makefile.in (OBJS): Add gimple-range-fold.o
	* gimple-range-fold.cc: New.
	* gimple-range-fold.h: New.
	* gimple-range-gori.cc (gimple_range_calc_op1): Move to here.
	(gimple_range_calc_op2): Ditto.
	* gimple-range-gori.h: Move prototypes to here.
	* gimple-range.cc: Adjust include files.
	(fur_source::fur_source): Relocate to gimple-range-fold.cc.
	(fur_source::get_operand): Ditto.
	(fur_source::get_phi_operand): Ditto.
	(fur_source::query_relation): Ditto.
	(fur_source::register_relation): Ditto.
	(class fur_edge): Ditto.
	(fur_edge::fur_edge): Ditto.
	(fur_edge::get_operand): Ditto.
	(fur_edge::get_phi_operand): Ditto.
	(fur_stmt::fur_stmt): Ditto.
	(fur_stmt::get_operand): Ditto.
	(fur_stmt::get_phi_operand): Ditto.
	(fur_stmt::query_relation): Ditto.
	(class fur_depend): Relocate to gimple-range-fold.h.
	(fur_depend::fur_depend): Relocate to gimple-range-fold.cc.
	(fur_depend::register_relation): Ditto.
	(fur_depend::register_relation): Ditto.
	(class fur_list): Ditto.
	(fur_list::fur_list): Ditto.
	(fur_list::get_operand): Ditto.
	(fur_list::get_phi_operand): Ditto.
	(fold_range): Ditto.
	(adjust_pointer_diff_expr): Ditto.
	(gimple_range_adjustment): Ditto.
	(gimple_range_base_of_assignment): Ditto.
	(gimple_range_operand1): Ditto.
	(gimple_range_operand2): Ditto.
	(gimple_range_calc_op1): Relocate to gimple-range-gori.cc.
	(gimple_range_calc_op2): Ditto.
	(fold_using_range::fold_stmt): Relocate to gimple-range-fold.cc.
	(fold_using_range::range_of_range_op): Ditto.
	(fold_using_range::range_of_address): Ditto.
	(fold_using_range::range_of_phi): Ditto.
	(fold_using_range::range_of_call): Ditto.
	(fold_using_range::range_of_builtin_ubsan_call): Ditto.
	(fold_using_range::range_of_builtin_call): Ditto.
	(fold_using_range::range_of_cond_expr): Ditto.
	(fold_using_range::range_of_ssa_name_with_loop_info): Ditto.
	(fold_using_range::relation_fold_and_or): Ditto.
	(fold_using_range::postfold_gcond_edges): Ditto.
	* gimple-range.h: Add gimple-range-fold.h to include files. Change
	GIMPLE_RANGE_STMT_H to GIMPLE_RANGE_H.
	(gimple_range_handler): Relocate to gimple-range-fold.h.
	(gimple_range_ssa_p): Ditto.
	(range_compatible_p): Ditto.
	(class fur_source): Ditto.
	(class fur_stmt): Ditto.
	(class fold_using_range): Ditto.
	(gimple_range_calc_op1): Relocate to gimple-range-gori.h
	(gimple_range_calc_op2): Ditto.
2021-06-23 10:26:16 -04:00
Andrew MacLeod a03e944e92 Do not continue propagating values which cannot be set properly.
If the on-entry cache cannot properly represent a range, do not continue
trying to propagate it.

	PR tree-optimization/101148
	PR tree-optimization/101014
	* gimple-range-cache.cc (ranger_cache::ranger_cache): Adjust.
	(ranger_cache::~ranger_cache): Adjust.
	(ranger_cache::block_range): Check if propagation disallowed.
	(ranger_cache::propagate_cache): Disallow propagation if new value
	can't be stored properly.
	* gimple-range-cache.h (ranger_cache::m_propfail): New member.
2021-06-23 10:26:16 -04:00
Andrew MacLeod ca4d381662 Adjust on_entry cache to indicate if the value was set properly.
* gimple-range-cache.cc (class ssa_block_ranges): Adjust prototype.
	(sbr_vector::set_bb_range): Return true.
	(class sbr_sparse_bitmap): Adjust.
	(sbr_sparse_bitmap::set_bb_range): Return value.
	(block_range_cache::set_bb_range): Return value.
	(ranger_cache::propagate_cache): Use return value to print msg.
	* gimple-range-cache.h (class block_range_cache): Adjust.
2021-06-23 10:24:30 -04:00
Andrew MacLeod 9d674b735f Dump should be read only. Do not trigger new lookups.
* gimple-range.cc (dump_bb): Use range_on_edge from the cache.
2021-06-23 10:24:30 -04:00
Jeff Law 402c818ac0 Use more logicals to eliminate useless test/compare instructions
gcc/
	* config/h8300/logical.md (<code><mode>3<ccnz>): Use <cczn>
	so this pattern can be used for test/compare removal.  Pass
	current insn to compute_logical_op_length and output_logical_op.
	* config/h8300/h8300.c (compute_logical_op_cc): Remove.
	(h8300_and_costs): Add argument to compute_logical_op_length.
	(output_logical_op): Add new argument.  Use it to determine if the
	condition codes are used and adjust the output accordingly.
	(compute_logical_op_length): Add new argument and update length
	computations when condition codes are used.
	* config/h8300/h8300-protos.h (compute_logical_op_length): Update
	prototype.
	(output_logical_op): Likewise.
2021-06-23 10:18:30 -04:00
Uros Bizjak 37e9392536 i386: Add PPERM two-operand 64bit vector permutation [PR89021]
Add emulation of V8QI PPERM permutations for TARGET_XOP target.  Similar
to PSHUFB, the permutation is performed with V16QI PPERM instruction,
where selector is defined in V16QI mode with inactive elements set to 0x80.
Specific to two operand permutations is the remapping of elements from
the second operand (e.g. e[8] -> e[16]), as we have to account for the
inactive elements from the first operand.
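
The remapping described above can be sketched as follows. This is a minimal illustration rather than the code in i386-expand.c: the helper name `make_pperm_selector` and the use of `std::array` are assumptions, and "inactive" is taken to mean the upper eight selector bytes, which fall outside the 64-bit result.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not GCC code: builds a 16-byte selector for a
// two-operand permutation of 8-byte vectors.
// perm[i] in 0..7 picks byte perm[i] of the first operand,
// perm[i] in 8..15 picks byte perm[i] - 8 of the second operand.
std::array<std::uint8_t, 16>
make_pperm_selector(const std::array<std::uint8_t, 8>& perm)
{
  std::array<std::uint8_t, 16> sel{};
  for (std::size_t i = 0; i < 8; ++i) {
    std::uint8_t idx = perm[i] & 0x0f;
    // Each 8-byte operand occupies the low half of a 16-byte register,
    // so second-operand elements start at byte 16 instead of byte 8.
    if (idx >= 8)
      idx = static_cast<std::uint8_t>(idx + 8);
    sel[i] = idx;
  }
  // Selector bytes for the unused upper half of the result are set to 0x80.
  for (std::size_t i = 8; i < 16; ++i)
    sel[i] = 0x80;
  return sel;
}
```

An interleave such as {0, 8, 1, 9, ...} therefore becomes {0, 16, 1, 17, ...} in the low half of the selector.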

2021-06-23  Uroš Bizjak  <ubizjak@gmail.com>

gcc/
	PR target/89021
	* config/i386/i386-expand.c (expand_vec_perm_pshufb):
	Handle 64bit modes for TARGET_XOP.  Use indirect gen_* functions.
	* config/i386/mmx.md (mmx_ppermv64): New insn pattern.
	* config/i386/i386.md (unspec): Move UNSPEC_XOP_PERMUTE from ...
	* config/i386/sse.md (unspec): ... here.
2021-06-23 16:16:18 +02:00
Martin Liska 371c199262 arm: Revert partially ebd5e86c0f
PR target/98636

gcc/ChangeLog:

	* optc-save-gen.awk: Put back arm_fp16_format to
	checked_options.
2021-06-23 15:30:17 +02:00
Patrick Palka 3eecc1db4c c++: CTAD and deduction guide selection [PR86439]
During CTAD, we select the best viable deduction guide using
build_new_function_call, which performs overload resolution on the set
of candidate guides and then forms a call to the guide.  As the PR
points out, this latter step is unnecessary and occasionally incorrect
since a call to the selected guide may be ill-formed, or forming the
call may have side effects such as prematurely deducing the type of a {}.

So this patch introduces a specialized subroutine based on
build_new_function_call that stops short of building a call to the
selected function, and makes do_class_deduction use this subroutine
instead.  And since a call is no longer built, do_class_deduction
doesn't need to set tf_decltype or cp_unevaluated_operand anymore.

This change causes us to reject some container CTAD examples in the
libstdc++ testsuite due to deduction failure for {}, which AFAICT is the
correct behavior.  Previously in e.g. the first removed example

  std::map{{std::pair{1, 2.0}, {2, 3.0}, {3, 4.0}}, {}},

the type of the {} would get deduced to less<int> as a side effect of
forming a call to the chosen guide

  template<typename _Key, typename _Tp, typename _Compare = less<_Key>,
           typename _Allocator = allocator<pair<const _Key, _Tp>>>
      map(initializer_list<pair<_Key, _Tp>>,
          _Compare = _Compare(), _Allocator = _Allocator())
      -> map<_Key, _Tp, _Compare, _Allocator>;

which made later overload resolution for the constructor call
unambiguous.  Now, the type of the {} remains undeduced until
constructor overload resolution, and we complain about ambiguity
for the two equally good constructor candidates

  map(initializer_list<value_type>,
      const _Compare& = _Compare(),
      const allocator_type& = allocator_type())

  map(initializer_list<value_type>, const allocator_type&).

This patch fixes these problematic container CTAD examples by giving
the {} an appropriate concrete type.  Two of these adjusted CTAD
examples (one for std::set and one for std::multiset) end up triggering
an unrelated CTAD bug on trunk, PR101174, so these two adjusted examples
are commented out for now.
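
As a rough illustration of "giving the {} an appropriate concrete type", the sketch below spells out the comparator instead of passing a bare {}. The function name `make_example_map` is hypothetical, and the exact replacement used in the testsuite may differ; C++17 or later is assumed.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <type_traits>
#include <utility>

// The comparator is written explicitly, so neither deduction guide
// selection nor constructor overload resolution has to pick a type
// for an untyped {}.
inline auto make_example_map()
{
  return std::map{{std::pair{1, 2.0}, {2, 3.0}, {3, 4.0}}, std::less<int>{}};
}

// CTAD deduces the default comparator and allocator types here.
static_assert(std::is_same_v<decltype(make_example_map()),
                             std::map<int, double>>);
```

With a typed second argument, the constructor taking an allocator as its second parameter is no longer viable, so the ambiguity described above does not arise.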

	PR c++/86439

gcc/cp/ChangeLog:

	* call.c (print_error_for_call_failure): Constify 'args' parameter.
	(perform_dguide_overload_resolution): Define.
	* cp-tree.h: (perform_dguide_overload_resolution): Declare.
	* pt.c (do_class_deduction): Use perform_dguide_overload_resolution
	instead of build_new_function_call.  Don't use tf_decltype or
	set cp_unevaluated_operand.  Remove unnecessary NULL_TREE tests.

libstdc++-v3/ChangeLog:

	* testsuite/23_containers/map/cons/deduction.cc: Replace ambiguous
	CTAD examples.
	* testsuite/23_containers/multimap/cons/deduction.cc: Likewise.
	* testsuite/23_containers/multiset/cons/deduction.cc: Likewise.
	Mention one of the replaced examples is broken due to PR101174.
	* testsuite/23_containers/set/cons/deduction.cc: Likewise.
	* testsuite/23_containers/unordered_map/cons/deduction.cc: Replace
	ambiguous CTAD examples.
	* testsuite/23_containers/unordered_multimap/cons/deduction.cc:
	Likewise.
	* testsuite/23_containers/unordered_multiset/cons/deduction.cc:
	Likewise.
	* testsuite/23_containers/unordered_set/cons/deduction.cc: Likewise.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp1z/class-deduction88.C: New test.
	* g++.dg/cpp1z/class-deduction89.C: New test.
	* g++.dg/cpp1z/class-deduction90.C: New test.
2021-06-23 08:24:34 -04:00
Uros Bizjak 1e16f2b472 i386: Prevent unwanted combine from LZCNT to BSR [PR101175]
The current RTX pattern for BSR allows the combine pass to convert an LZCNT
insn to BSR.  Note that LZCNT has defined behavior and returns the operand
size when the operand is zero, whereas the result of BSR is undefined then.

Add a BSR specific setting of zero-flag to RTX pattern of BSR insn
in order to avoid matching unwanted combinations.
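
The semantic difference between the two instructions can be modelled portably, as in the sketch below. The names `model_lzcnt32` and `model_bsr32` are hypothetical, and these are plain emulations of the documented results for a 32-bit operand, not intrinsics or GCC internals.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Model of LZCNT: counts leading zero bits and is defined for zero,
// where it returns the operand size (32).
inline unsigned model_lzcnt32(std::uint32_t x)
{
  unsigned n = 0;
  for (std::uint32_t bit = 0x80000000u; bit != 0 && (x & bit) == 0; bit >>= 1)
    ++n;
  return n;
}

// Model of BSR: index of the most significant set bit.  The hardware
// result is undefined for a zero operand, modelled here as an empty optional.
inline std::optional<unsigned> model_bsr32(std::uint32_t x)
{
  if (x == 0)
    return std::nullopt;
  return 31u - model_lzcnt32(x);
}
```

For nonzero operands the two results are related by `lzcnt = 31 - bsr`, which is why a pattern that ignores the zero case lets one be mistaken for the other.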

2021-06-23  Uroš Bizjak  <ubizjak@gmail.com>

gcc/
	PR target/101175
	* config/i386/i386.md (bsr_rex64): Add zero-flag setting RTX.
	(bsr): Ditto.
	(*bsrhi): Remove.
	(clz<mode>2): Update RTX pattern for additions.

gcc/testsuite/

	PR target/101175
	* gcc.target/i386/pr101175.c: New test.
2021-06-23 12:51:32 +02:00
Jonathan Wakely 75404109dc libstdc++: Avoid "__lockable" name defined as macro by newlib
libstdc++-v3/ChangeLog:

	* include/std/mutex (__detail::__try_lock_impl): Rename
	parameter to avoid clashing with newlib's __lockable macro.
	(try_lock): Add 'inline' specifier.
	* testsuite/17_intro/names.cc: Add check for __lockable.
	* testsuite/30_threads/try_lock/5.cc: Add options for pthreads.
2021-06-23 11:05:51 +01:00
Andre Vehreschild da13e4ebeb fortran: Fix deref of optional in gen. code. [PR100337]
gcc/fortran/ChangeLog:

	PR fortran/100337
	* trans-intrinsic.c (conv_co_collective): Check stat for null ptr
	before dereferencing.

gcc/testsuite/ChangeLog:

	PR fortran/100337
	* gfortran.dg/coarray_collectives_17.f90: New test.
2021-06-23 10:17:14 +02:00
Jakub Jelinek 679506c383 openmp: Fix up *_reduction clause handling with UDRs on PARM_DECLs [PR101167]
The following testcase FAILs, because the UDR combiner is invoked incorrectly.
lower_omp_rec_clauses expects that when it sets
DECL_VALUE_EXPR/DECL_HAS_VALUE_EXPR_P
for both the placeholder and the var that everything will be properly
regimplified, but as the variable in question is a PARM_DECL rather than
VAR_DECL, lower_omp_regimplify_p doesn't say that it should be regimplified
and so it is not.

2021-06-23  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/101167
	* omp-low.c (lower_omp_regimplify_p): Regimplify also PARM_DECLs
	and RESULT_DECLs that have DECL_HAS_VALUE_EXPR_P set.

	* testsuite/libgomp.c-c++-common/task-reduction-15.c: New test.
2021-06-23 10:03:28 +02:00
Martin Liska c2124b51a9 contrib: add git-commit-mklog wrapper
contrib/ChangeLog:

	* gcc-git-customization.sh: Use the new wrapper.
	* git-commit-mklog.py: New file.
	* prepare-commit-msg: Support GCC_MKLOG_ARGS.
2021-06-23 09:39:10 +02:00
Kewen Lin 47749c43ac rs6000: Fix typos in float128 ISA3.1 support
The recent float128 ISA3.1 support (r12-1340) has some typos that
make the libgcc build fail when the binutils assembler does not
support Power10 insns.  The error looks like:

Error: invalid switch -mpower10
Error: unrecognized option -mpower10
... [...libgcc/shared-object.mk:14: float128-p10.o] Error 1

This patch makes the following changes:
  - fix test target typo libgcc_cv_powerpc_3_1_float128_hw
    (wrongly written as libgcc_cv_powerpc_float128_hw, so the
     ISA3.1 code was built whenever ISA3.0 was detected).
  - fix test used for libgcc_cv_powerpc_3_1_float128_hw check.
  - fix test option used for libgcc_cv_powerpc_3_1_float128_hw
    check.
  - remove the ISA3.1 related contents from t-float128-hw.
  - add new macro FLOAT128_HW_INSNS_ISA3_1 to differentiate
    ISA3.1 content from ISA3.0 part in ifunc support.

Bootstrapped/regtested on:
  - powerpc64le-linux-gnu P10
  - powerpc64le-linux-gnu P9 (with and without an assembler supporting p10)
  - powerpc64-linux-gnu P8 (with and without an assembler supporting p10)

libgcc/ChangeLog:

	* configure: Regenerate.
	* configure.ac (test for libgcc_cv_powerpc_3_1_float128_hw): Fix
	typos among the name, CFLAGS and the test.
	* config/rs6000/t-float128-hw (fp128_3_1_hw_funcs, fp128_3_1_hw_src,
	fp128_3_1_hw_static_obj, fp128_3_1_hw_shared_obj, fp128_3_1_hw_obj):
	Remove.
	* config/rs6000/t-float128-p10-hw (FLOAT128_HW_INSNS): Append
	macro FLOAT128_HW_INSNS_ISA3_1.
	(FP128_3_1_CFLAGS_HW): Fix option typo.
	* config/rs6000/float128-ifunc.c (SW_OR_HW_ISA3_1): Guard this with
	FLOAT128_HW_INSNS_ISA3_1.
	(__floattikf_resolve): Likewise.
	(__floatuntikf_resolve): Likewise.
	(__fixkfti_resolve): Likewise.
	(__fixunskfti_resolve): Likewise.
	(__floattikf): Likewise.
	(__floatuntikf): Likewise.
	(__fixkfti): Likewise.
	(__fixunskfti): Likewise.
2021-06-22 23:09:30 -05:00
GCC Administrator 419af06a35 Daily bump.
2021-06-23 00:16:28 +00:00
Jonathan Wakely c556596119 libstdc++: Simplify std::try_lock and std::lock further
The std::try_lock and std::lock algorithms can use iteration instead of
recursion when all lockables have the same type and can be held by an
array of unique_lock<L> objects.

By making this change to __detail::__try_lock_impl it also benefits
__detail::__lock_impl, which uses it. For std::lock we can just put the
iterative version directly in std::lock, to avoid making any call to
__detail::__lock_impl.
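
The iterative approach for same-typed lockables might look roughly like the sketch below. It is not the libstdc++ implementation: `try_lock_all` and the single-threaded `flag_lock` type are hypothetical names introduced only to make the behaviour observable.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <mutex>

// Hypothetical single-threaded lockable used only for demonstration.
struct flag_lock {
  bool locked = false;
  void lock() { locked = true; }
  bool try_lock() { if (locked) return false; locked = true; return true; }
  void unlock() { locked = false; }
};

// Tries to lock every lockable in order.  Returns -1 with all of them
// locked on success, otherwise the index of the first one that could
// not be locked, with everything acquired by this call released again.
template <typename L, std::size_t N>
int try_lock_all(const std::array<L*, N>& lockables)
{
  std::array<std::unique_lock<L>, N> held;  // all start out owning nothing
  for (std::size_t i = 0; i < N; ++i) {
    held[i] = std::unique_lock<L>(*lockables[i], std::try_to_lock);
    if (!held[i].owns_lock())
      return static_cast<int>(i);  // destructors unlock held[0..i)
  }
  for (auto& guard : held)
    guard.release();  // success: leave the locks held for the caller
  return -1;
}
```

Holding the guards in an array is what makes a plain loop possible; with heterogeneous lockable types no such array exists, which is where recursion is still needed.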

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>

Co-authored-by: Matthias Kretz <m.kretz@gsi.de>

libstdc++-v3/ChangeLog:

	* include/std/mutex (lock): Replace recursion with iteration
	when lockables all have the same type.
	(__detail::__try_lock_impl): Likewise. Pass lockables as
	parameters, instead of a tuple. Always lock the first one, and
	recurse for the rest.
	(__detail::__lock_impl): Adjust call to __try_lock_impl.
	(__detail::__try_to_lock): Remove.
	* testsuite/30_threads/lock/3.cc: Check that mutexes are locked.
	* testsuite/30_threads/lock/4.cc: Also test non-heterogeneous
	arguments.
	* testsuite/30_threads/unique_lock/cons/60497.cc: Also check
	std::try_lock.
	* testsuite/30_threads/try_lock/5.cc: New test.
2021-06-22 21:17:25 +01:00
Jonathan Wakely b5a29741db libstdc++: Remove garbage collection support for C++23 [P2186R2]
This removes the non-functional garbage collection support from <memory>,
as proposed for C++23 by P2186R2.

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>

libstdc++-v3/ChangeLog:

	* include/std/memory (declare_reachable, undeclare_reachable)
	(declare_no_pointers, undeclare_no_pointers, get_pointer_safety)
	(pointer_safety): Only define for C++11 to C++20 inclusive.
	* testsuite/20_util/pointer_safety/1.cc: Do not run for C++23.
2021-06-22 20:58:43 +01:00
Jonathan Wakely 6c63cb231e libstdc++: Implement LWG 3422 for std::seed_seq
This ensures that the std::seed_seq initializer-list constructor will
not be used for list-initialization unless the initializers in the list
are integers. This allows list-initialization syntax to be used with a
pair of pointers and for that to use the appropriate constructor.

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>

libstdc++-v3/ChangeLog:

	* include/bits/random.h (seed_seq): Constrain initializer-list
	constructor.
	* include/bits/random.tcc (seed_seq): Add template parameter.
	* testsuite/26_numerics/random/seed_seq/cons/default.cc: Check
	for noexcept.
	* testsuite/26_numerics/random/seed_seq/cons/initlist.cc: Check
	constraints.
2021-06-22 20:58:25 +01:00
Sandra Loosemore f61e5d4d8b Fortran: fix sm computation in CFI_allocate [PR93524]
This patch fixes a bug in setting the step multiplier field in the
C descriptor for array dimensions > 2.
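
For a freshly allocated contiguous array, the step multiplier of each dimension is the byte stride between consecutive elements along that dimension. The sketch below shows that recurrence; it is not the libgfortran code, and `contiguous_step_multipliers` is a hypothetical helper that uses `std::size_t` where the C descriptor uses a signed index type.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper: byte strides for a contiguous array stored in
// Fortran (column-major) order.
//   sm[0] = elem_len
//   sm[i] = sm[i-1] * extent[i-1]
template <std::size_t Rank>
std::array<std::size_t, Rank>
contiguous_step_multipliers(std::size_t elem_len,
                            const std::array<std::size_t, Rank>& extents)
{
  std::array<std::size_t, Rank> sm{};
  std::size_t stride = elem_len;
  for (std::size_t i = 0; i < Rank; ++i) {
    sm[i] = stride;        // stride of dimension i
    stride *= extents[i];  // the next dimension steps over a whole slab
  }
  return sm;
}
```

A formula that multiplies only by the immediately preceding extent agrees with this for the first two dimensions and diverges from the third onward, which matches the "dimensions > 2" symptom.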

2021-06-21  Sandra Loosemore  <sandra@codesourcery.com>
	    Tobias Burnus  <tobias@codesourcery.com>

libgfortran/
	PR fortran/93524
	* runtime/ISO_Fortran_binding.c (CFI_allocate): Fix
	sm computation.

gcc/testsuite/
	PR fortran/93524
	* gfortran.dg/pr93524.c: New.
	* gfortran.dg/pr93524.f90: New.
2021-06-22 12:45:47 -07:00
Thomas Rodgers e02840c1a9 libstdc++: Fix for deadlock in std::counting_semaphore [PR100806]
libstdc++-v3/ChangeLog:
	PR libstdc++/100806
	* include/bits/semaphore_base.h (__atomic_semaphore::_M_release):
	Force _M_release() to wake all waiting threads.
	* testsuite/30_threads/semaphore/100806.cc: New test.
2021-06-22 11:06:07 -07:00
David Malcolm ea4e32181d analyzer: fix ICE on malloc/alloca param type mismatch [PR101143]
gcc/analyzer/ChangeLog:
	PR analyzer/101143
	* region-model.cc (compat_types_p): New function.
	(region_model::create_region_for_heap_alloc): Convert assertion to
	an error check.
	(region_model::create_region_for_alloca): Likewise.

gcc/testsuite/ChangeLog:
	PR analyzer/101143
	* gcc.dg/analyzer/pr101143.c: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2021-06-22 13:44:57 -04:00
Sergei Trofimovich 83bd60452d docs: drop unbalanced parenthesis in rtl.texi
gcc/ChangeLog:

	* doc/rtl.texi: drop unbalanced parenthesis.
2021-06-22 18:03:46 +01:00
Richard Biener b4e21c8046 middle-end/101156 - remove not working optimization in gimplification
This removes a premature and non-working optimization from the
gimplifier.  When gimplification is requested not to produce an SSA
name, we try to avoid generating a copy when we did so anyway and
instead replace the LHS of its definition.  But that only works if
there are no uses of the SSA name already, which is something
we cannot easily check, so the following removes said optimization.

Statistics on the whole bootstrap show we hit this optimization
only for libiberty/cp-demangle.c; overall we have 21652112
gimplifications where just 240 copies are elided.  Preserving
the optimization would require scanning the original expression
and the pre and post sequences for SSA names and uses, which seems
excessive to avoid these 240 copies.

2021-06-22  Richard Biener  <rguenther@suse.de>

	PR middle-end/101156
	* gimplify.c (gimplify_expr): Remove premature incorrect
	optimization.

	* gcc.dg/pr101156.c: New testcase.
2021-06-22 15:31:04 +02:00
Jakub Jelinek 3adb9ac662 testsuite: Add testcase for recently fixed PR [PR101159]
On Tue, Jun 22, 2021 at 11:00:51AM +0200, Richard Biener wrote:
> 2021-06-22  Richard Biener  <rguenther@suse.de>
>
>       PR tree-optimization/101159
>       * tree-vect-patterns.c (vect_recog_popcount_pattern): Add
>       missing NULL vectype check.

The following patch adds the testcase for it, IMHO it can't hurt and
from my experience testcases often trigger other bugs later on (rather
than the original bugs reappearing, though even that happens),
and also fixes a couple of typos in the new function.

2021-06-22  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/101159
	* tree-vect-patterns.c (vect_recog_popcount_pattern): Fix some
	comment typos.

	* gcc.c-torture/compile/pr101159.c: New test.
2021-06-22 15:22:51 +02:00
Jakub Jelinek 9b613e825d expand: Fix up empty class return optimization [PR101160]
On Mon, Jun 14, 2021 at 11:24:22PM -0400, Jason Merrill via Gcc-patches wrote:
> The x86_64 psABI says that an empty class isn't passed or returned in memory or
> registers, so we shouldn't set %eax in this function.  Is this a reasonable
> place to implement that?  Another possibility would be to remove the hack to
> prevent i386.c:function_value_64 from returning NULL in this case and fix the
> callers to deal, but that seems like more work.
>
> The df-scan hunk catches the case where we look at a 0-length reg and build
> a range the length of unsigned int, which happened before I changed
> assign_parms to match expand_function_end.

The assign_parms change unfortunately breaks e.g. the following testcase.
The problem is that some passes (e.g. subreg lowering but assign_parms
comments also talk about delayed slot scheduling) rely on crtl->return_rtx
not to contain pseudo registers, and the assign_parms change results
in the pseudo in there not being replaced with a hard register.

The following patch instead clears crtl->return_rtx if a function
returns a TYPE_EMPTY_P structure; that way (use (pseudo)) is not emitted
into the IL and it is treated more like functions returning void.

I've also changed the effective target on the empty-class1.C testcase, so
that it doesn't fail on x86_64-linux with -m32 testing.

2021-06-22  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/101160
	* function.c (assign_parms): For decl_result with TYPE_EMPTY_P type
	clear crtl->return_rtx instead of keeping it referencing a pseudo.

	* g++.target/i386/empty-class1.C: Require lp64 effective target
	instead of x86_64-*-*.
	* g++.target/i386/empty-class2.C: New test.
2021-06-22 15:21:35 +02:00
Jakub Jelinek 92d9c9e705 fold-const: Return corresponding integral type for OFFSET_TYPE in range_check_type [PR101162]
Andrew's recent r12-1608-g2f1686ff70b25fceb04ca2ffc0a450fb682913ef change
to fail verification on various unary and binary operations with OFFSET_TYPE
revealed that e.g. switchconv happily performs multiplications and additions
in OFFSET_TYPE.

2021-06-22  Jakub Jelinek  <jakub@redhat.com>
	    Andrew Pinski  <apinski@marvell.com>

	PR tree-optimization/101162
	* fold-const.c (range_check_type): Handle OFFSET_TYPE like pointer
	types.

	* g++.dg/opt/pr101162.C: New test.
2021-06-22 15:20:14 +02:00
Andrew MacLeod ca1f9f2285 Add relational self-tests.
* range-op.cc (range_relational_tests): New.
	(range_op_tests): Call range_relational_tests.
2021-06-22 08:11:46 -04:00
Andrew MacLeod 0f7ccc063a Add relation between LHS and op1 for casts and copies.
* range-op.cc (operator_cast::lhs_op1_relation): New.
	(operator_identity::lhs_op1_relation): New.
2021-06-22 08:11:45 -04:00
Andrew MacLeod ae6b830f31 Add relation effects between operands to MINUS_EXPR.
* range-op.cc (operator_minus::op1_op2_relation_effect): New.
2021-06-22 08:11:45 -04:00
Andrew MacLeod c526de3f43 Add relations between LHS and op1/op2 for PLUS_EXPR.
* range-op.cc (operator_plus::lhs_op1_relation): New.
	(operator_plus::lhs_op2_relation): New.
2021-06-22 08:11:45 -04:00
Andrew MacLeod a2c9173331 Add relational support to fold_using_range
Enable a relation oracle in ranger, and add full range-op relation support
to fold_using_range.

	* gimple-range-cache.cc (ranger_cache::ranger_cache): Create a
	relation_oracle if dominators exist.
	(ranger_cache::~ranger_cache): Dispose of oracle.
	(ranger_cache::dump_bb): Dump oracle.
	* gimple-range.cc (fur_source::fur_source): New.
	(fur_source::get_operand): Use member query.
	(fur_source::get_phi_operand): Use member query.
	(fur_source::query_relation): New.
	(fur_source::register_dependency): Delete.
	(fur_source::register_relation): New.
	(fur_edge::fur_edge): Adjust.
	(fur_edge::get_phi_operand): Fix comment.
	(fur_edge::query): Delete.
	(fur_stmt::fur_stmt): Adjust.
	(fur_stmt::query): Delete.
	(fur_depend::fur_depend): Adjust.
	(fur_depend::register_relation): New.
	(fur_depend::register_relation): New.
	(fur_list::fur_list): Adjust.
	(fur_list::get_operand): Use member query.
	(fold_using_range::range_of_range_op): Process and query relations.
	(fold_using_range::range_of_address): Adjust dependency call.
	(fold_using_range::range_of_phi): Ditto.
	(gimple_ranger::gimple_ranger): New.  Use ranger_cache oracle.
	(fold_using_range::relation_fold_and_or): New.
	(fold_using_range::postfold_gcond_edges): New.
	* gimple-range.h (class gimple_ranger): Adjust.
	(class fur_source): Adjust members.
	(class fur_stmt): Ditto.
	(class fold_using_range): Ditto.
2021-06-22 08:11:45 -04:00
Andrew MacLeod 80dd13f5c3 Add relational support to range-op.
This patch integrates relations with range-op functionality so that any
known relations can be used to help reduce or resolve ranges.
Initially handle  EQ_EXPR, NE_EXPR, LE_EXPR, LT_EXPR, GT_EXPR and GE_EXPR.
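
How a known relation can resolve a comparison early might be sketched as below. This is a toy model only: the `relation` enum and `fold_less_than` are hypothetical names and do not reflect the actual range-op.cc interfaces.

```cpp
#include <cassert>
#include <optional>

// Hypothetical toy relation kinds between op1 and op2.
enum class relation { none, lt, le, gt, ge, eq, ne };

// Tries to fold `op1 < op2` using only a known relation between the
// operands.  Returns the boolean result if the relation decides it,
// or an empty optional if the operand ranges must still be consulted.
inline std::optional<bool> fold_less_than(relation known)
{
  switch (known) {
    case relation::lt:
      return true;           // op1 < op2 is already known
    case relation::ge:
    case relation::gt:
    case relation::eq:
      return false;          // each of these contradicts op1 < op2
    default:
      return std::nullopt;   // le, ne and none leave the result open
  }
}
```

Only when the relation does not settle the answer would a folder need to fall back to comparing the operand ranges themselves.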

	* range-op.cc (range_operator::wi_fold): Apply relation effect.
	(range_operator::fold_range): Adjust and apply relation effect.
	(*::fold_range): Add relation parameters.
	(*::op1_range): Ditto.
	(*::op2_range): Ditto.
	(range_operator::lhs_op1_relation): New.
	(range_operator::lhs_op2_relation): New.
	(range_operator::op1_op2_relation): New.
	(range_operator::op1_op2_relation_effect): New.
	(relop_early_resolve): New.
	(operator_equal::op1_op2_relation): New.
	(operator_equal::fold_range): Call relop_early_resolve.
	(operator_not_equal::op1_op2_relation): New.
	(operator_not_equal::fold_range): Call relop_early_resolve.
	(operator_lt::op1_op2_relation): New.
	(operator_lt::fold_range): Call relop_early_resolve.
	(operator_le::op1_op2_relation): New.
	(operator_le::fold_range): Call relop_early_resolve.
	(operator_gt::op1_op2_relation): New.
	(operator_gt::fold_range): Call relop_early_resolve.
	(operator_ge::op1_op2_relation): New.
	(operator_ge::fold_range): Call relop_early_resolve.
	* range-op.h (class range_operator): Adjust parameters and methods.
2021-06-22 08:11:44 -04:00
Andrew MacLeod 3aaa69e5f3 Initial value-relation code.
This code provides both an equivalence and a relation oracle which can be
accessed via a range_query object.  This initial code drop includes the
oracles and access to them, but does not utilize them yet.

	* Makefile.in (OBJS): Add value-relation.o.
	* gimple-range.h: Adjust include files.
	* tree-data-ref.c: Adjust include file order.
	* value-query.cc (range_query::get_value_range): Default to no oracle.
	(range_query::query_relation): New.
	(range_query::query_relation): New.
	* value-query.h (class range_query): Adjust.
	* value-relation.cc: New.
	* value-relation.h: New.
2021-06-22 08:11:44 -04:00
Richard Biener a2ef8395fa tree-optimization/101151 - fix irreducible region check for sinking
The check for whether two blocks are in the same irreducible region,
which makes post-dominance checks unreliable, was incomplete:
an irreducible region can contain reducible sub-regions, and
if one block is in the irreducible part and the other is not, the check
still doesn't work as expected.

2021-06-22  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/101151
	* tree-ssa-sink.c (statement_sink_location): Expand irreducible
	region check.

	* gcc.dg/torture/pr101151.c: New testcase.
2021-06-22 12:09:59 +02:00
Jojo R 7822285515 RISC-V: Add tune info for T-HEAD C906.
gcc/
	* config/riscv/riscv.c (thead_c906_tune_info): New.
	(riscv_tune_info_table): Use new tune.
2021-06-22 17:14:25 +08:00
Kito Cheng f0e40ea064 testsuite: Add pthread check to dg-module-cmi for omp module testing
gcc/testsuite:

	* g++.dg/modules/omp-1_a.C: Check pthread is available for
	dg-module-cmi.
	* g++.dg/modules/omp-2_a.C: Ditto.
2021-06-22 17:06:01 +08:00
Richard Biener 7a22d8a764 tree-optimization/101158 - adjust SLP call matching sequence
This moves the check for same operands after verifying we're
facing compatible calls.

2021-06-22  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/101158
	* tree-vect-slp.c (vect_build_slp_tree_1): Move same operand
	checking after checking for matching operation.

	* gfortran.dg/pr101158.f90: New testcase.
2021-06-22 11:01:17 +02:00
Richard Biener a5b773d3f8 tree-optimization/101159 - fix missing NULL check in popcount pattern
This fixes a missing check for a NULL vectype in the new popcount
pattern.

2021-06-22  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/101159
	* tree-vect-patterns.c (vect_recog_popcount_pattern): Add
	missing NULL vectype check.
2021-06-22 11:01:17 +02:00
Richard Biener 26f05f5a82 tree-optimization/101154 - fix out-of-bound access in SLP
This fixes an out-of-bound access of matches.

2021-06-22  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/101154
	* tree-vect-slp.c (vect_build_slp_tree_2): Fix out-of-bound access.
2021-06-22 11:01:17 +02:00
Jakub Jelinek d58a66aa0f i386: Use xor to write zero to memory with -Os even for more than 4 stores [PR11877]
> > 2021-06-20  Roger Sayle  <roger@nextmovesoftware.com>
> >
> > gcc/ChangeLog
> >         PR target/11877
> >         * config/i386/i386.md: New define_peephole2s to shrink writing
> >         1, 2 or 4 consecutive zeros to memory when optimizing for size.

It unfortunately doesn't extend well to larger memory clearing.
Consider e.g.
void
foo (int *p)
{
  p[0] = 0;
  p[7] = 0;
  p[23] = 0;
  p[41] = 0;
  p[48] = 0;
  p[59] = 0;
  p[69] = 0;
  p[78] = 0;
  p[83] = 0;
  p[89] = 0;
  p[98] = 0;
  p[121] = 0;
  p[132] = 0;
  p[143] = 0;
  p[154] = 0;
}
where with the patch we emit:
        xorl    %eax, %eax
        xorl    %edx, %edx
        xorl    %ecx, %ecx
        xorl    %esi, %esi
        xorl    %r8d, %r8d
        movl    %eax, (%rdi)
        movl    %eax, 28(%rdi)
        movl    %eax, 92(%rdi)
        movl    %eax, 164(%rdi)
        movl    %edx, 192(%rdi)
        movl    %edx, 236(%rdi)
        movl    %edx, 276(%rdi)
        movl    %edx, 312(%rdi)
        movl    %ecx, 332(%rdi)
        movl    %ecx, 356(%rdi)
        movl    %ecx, 392(%rdi)
        movl    %ecx, 484(%rdi)
        movl    %esi, 528(%rdi)
        movl    %esi, 572(%rdi)
        movl    %r8d, 616(%rdi)
Here is an incremental patch that emits:
        xorl    %eax, %eax
        movl    %eax, (%rdi)
        movl    %eax, 28(%rdi)
        movl    %eax, 92(%rdi)
        movl    %eax, 164(%rdi)
        movl    %eax, 192(%rdi)
        movl    %eax, 236(%rdi)
        movl    %eax, 276(%rdi)
        movl    %eax, 312(%rdi)
        movl    %eax, 332(%rdi)
        movl    %eax, 356(%rdi)
        movl    %eax, 392(%rdi)
        movl    %eax, 484(%rdi)
        movl    %eax, 528(%rdi)
        movl    %eax, 572(%rdi)
        movl    %eax, 616(%rdi)
instead.

2021-06-22  Jakub Jelinek  <jakub@redhat.com>

	PR target/11877
	* config/i386/i386-protos.h (ix86_last_zero_store_uid): Declare.
	* config/i386/i386-expand.c (ix86_last_zero_store_uid): New variable.
	* config/i386/i386.c (ix86_expand_prologue): Clear it.
	* config/i386/i386.md (peephole2s for 1/2/4 stores of const0_rtx):
	Remove "" from match_operand.  Emit new insns using emit_move_insn and
	set ix86_last_zero_store_uid to INSN_UID of the last store.
	Add peephole2s for 1/2/4 stores of const0_rtx following previous
	successful peep2s.

	* gcc.target/i386/pr11877-2.c: New test.
2021-06-22 10:16:18 +02:00
liuhongt 706533c339 Remove my Write After Approval entry.
ChangeLog:

	* MAINTAINERS: Remove my Write After Approval entry.
2021-06-22 16:09:56 +08:00
Martin Liska 48b312b4ba contrib: fix a flake8 issue
contrib/ChangeLog:

	* mklog.py: Fix flake8 issue.
2021-06-22 09:50:38 +02:00
Martin Liska 8819c82ce8 autofdo: Bump AUTO_PROFILE_VERSION.
gcc/ChangeLog:

	* auto-profile.c (AUTO_PROFILE_VERSION): Bump as string format
	was changed.
2021-06-22 08:54:34 +02:00
Martin Liska 6871b899b8 gcov: update comment about padding
gcc/ChangeLog:

	* gcov-io.h: Remove padding entries.
2021-06-22 08:43:41 +02:00
liuhongt e08a125b20 Add vect_recog_popcount_pattern to handle mismatch between the vectorized popcount IFN and scalar popcount builtin.
The patch removes those promotions and demotions when the backend
supports the direct optab.

For i386: it enables vectorization for vpopcntb/vpopcntw and optimizes
the vpopcntq case.

gcc/ChangeLog:

	PR tree-optimization/97770
	* tree-vect-patterns.c (vect_recog_popcount_pattern):
	New.
	(vect_recog_func vect_vect_recog_func_ptrs): Add new pattern.

gcc/testsuite/ChangeLog:

	PR tree-optimization/97770
	* gcc.target/i386/avx512bitalg-pr97770-1.c: Remove xfail.
	* gcc.target/i386/avx512vpopcntdq-pr97770-1.c: Remove xfail.
2021-06-22 10:40:11 +08:00
liuhongt f51618f301 Optimize vpexpand* to mask mov when mask has all ones in its lower part (including 0 and -1).
gcc/ChangeLog:

	PR target/100267
	* config/i386/i386-builtin.def (BDESC): Adjust builtin name.
	* config/i386/sse.md (<avx512>_expand<mode>_mask): Rename to ..
	(expand<mode>_mask): this ..
	(*expand<mode>_mask): New pre_reload splitter to transform
	v{,p}expand* to vmov* when mask is zero, all ones, or has all
	ones in its lower part, otherwise still generate
	v{,p}expand*.
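
The reason this transformation is valid can be sketched with a scalar model.
The code below is an illustrative assumption, not GCC code: `Vec`,
`expand_model`, `masked_move_model` and `low_ones_mask` are hypothetical
names modelling a 16-lane merge-masked expand and a merge-masked move. When
the mask is 0, all ones, or a contiguous run of ones starting at bit 0, the
k-th active lane is lane k, so both operations give the same result.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Vec = std::array<std::int32_t, 16>;  // hypothetical 16-lane vector model

// Merge-masked expand: active lanes receive consecutive elements of src,
// inactive lanes keep the passthru value.
Vec expand_model(const Vec& passthru, std::uint16_t mask, const Vec& src)
{
    Vec dst = passthru;
    std::size_t next = 0;
    for (std::size_t lane = 0; lane < dst.size(); ++lane)
        if ((mask >> lane) & 1u)
            dst[lane] = src[next++];
    return dst;
}

// Merge-masked move: active lanes receive the same-numbered element of src.
Vec masked_move_model(const Vec& passthru, std::uint16_t mask, const Vec& src)
{
    Vec dst = passthru;
    for (std::size_t lane = 0; lane < dst.size(); ++lane)
        if ((mask >> lane) & 1u)
            dst[lane] = src[lane];
    return dst;
}

// True when mask is 0, all ones, or has ones only in a contiguous low part.
bool low_ones_mask(std::uint16_t mask)
{
    return (mask & (mask + 1u)) == 0;
}
```

For any mask accepted by `low_ones_mask`, the two models agree. For a mask
such as 0x0005 they differ, which is the case where the expand instruction
still has to be generated.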

gcc/testsuite/ChangeLog:

	PR target/100267
	* gcc.target/i386/avx512bw-pr100267-1.c: New test.
	* gcc.target/i386/avx512bw-pr100267-b-2.c: New test.
	* gcc.target/i386/avx512bw-pr100267-d-2.c: New test.
	* gcc.target/i386/avx512bw-pr100267-q-2.c: New test.
	* gcc.target/i386/avx512bw-pr100267-w-2.c: New test.
	* gcc.target/i386/avx512f-pr100267-1.c: New test.
	* gcc.target/i386/avx512f-pr100267-pd-2.c: New test.
	* gcc.target/i386/avx512f-pr100267-ps-2.c: New test.
	* gcc.target/i386/avx512vl-pr100267-1.c: New test.
	* gcc.target/i386/avx512vl-pr100267-pd-2.c: New test.
	* gcc.target/i386/avx512vl-pr100267-ps-2.c: New test.
	* gcc.target/i386/avx512vlbw-pr100267-1.c: New test.
	* gcc.target/i386/avx512vlbw-pr100267-b-2.c: New test.
	* gcc.target/i386/avx512vlbw-pr100267-d-2.c: New test.
	* gcc.target/i386/avx512vlbw-pr100267-q-2.c: New test.
	* gcc.target/i386/avx512vlbw-pr100267-w-2.c: New test.
2021-06-22 09:35:16 +08:00
liuhongt b6efffa552 Fix ICE for vpexpand*.
gcc/ChangeLog

	PR target/100310
	* config/i386/i386-expand.c
	(ix86_expand_special_args_builtin): Keep constm1_operand only
	if it satisfies insn's operand predicate.

gcc/testsuite/ChangeLog

	PR target/100310
	* gcc.target/i386/pr100310.c: New test.
2021-06-22 09:34:47 +08:00
GCC Administrator 2f080224cf Daily bump. 2021-06-22 00:16:29 +00:00
Jonathan Wakely 6cf0040fff libstdc++: Improve std::lock algorithm
The current std::lock algorithm is the one called "persistent" in Howard
Hinnant's https://howardhinnant.github.io/dining_philosophers.html post.
While it tends to perform acceptably fast, it wastes a lot of CPU cycles
by continuously locking and unlocking the uncontended mutexes.
Effectively, it's a spin lock with no back-off.

This replaces it with the one Howard calls "smart and polite". It's
smart, because when a Mi.try_lock() call fails because mutex Mi is
contended, the algorithm reorders the mutexes until Mi is first, then
calls Mi.lock(), to block until Mi is no longer contended.  It's
polite because it uses std::this_thread::yield() between the failed
Mi.try_lock() call and the Mi.lock() call. (In reality it uses
__gthread_yield() directly, because using this_thread::yield() would
require shuffling code around to avoid a circular dependency.)

This version of the algorithm is inspired by some hints from Howard, so
that it has strictly bounded stack usage. As the comment in the code
says:

// This function can recurse up to N levels deep, for N = 1+sizeof...(L1).
// On each recursion the lockables are rotated left one position,
// e.g. depth 0: l0, l1, l2; depth 1: l1, l2, l0; depth 2: l2, l0, l1.
// When a call to l_i.try_lock() fails it recurses/returns to depth=i
// so that l_i is the first argument, and then blocks until l_i is locked.

The 'i' parameter is the desired permutation of the lockables, and the
'depth' parameter is the depth in the call stack of the current
instantiation of the function template. If i == depth then the function
calls l0.lock() and then l1.try_lock()... for each lockable in the
parameter pack l1.  If i > depth then the function rotates the lockables
to the left one place, and calls itself again to go one level deeper.
Finally, if i < depth then the function returns to a shallower depth,
equivalent to a right rotate of the lockables.  When a call to
try_lock() fails, i is set to the index of the contended lockable, so
that the next call to l0.lock() will use the contended lockable as l0.
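
For orientation, the "smart and polite" strategy is easiest to see with only
two lockables. The sketch below is a simplified illustration, not the
recursive N-ary __lock_impl added by this commit. The name
`smart_polite_lock` is hypothetical, and it uses std::this_thread::yield()
where the real code calls __gthread_yield().

```cpp
#include <mutex>
#include <thread>

// Hypothetical two-lockable helper (an assumption for illustration only).
template <typename L0, typename L1>
void smart_polite_lock(L0& l0, L1& l1)
{
    while (true) {
        {
            std::unique_lock<L0> first(l0);   // block on l0
            if (l1.try_lock()) {              // then try the other lockable
                first.release();              // success: keep l0 locked
                return;
            }
        }                                     // l1 was contended: l0 is unlocked here
        std::this_thread::yield();            // polite: yield before blocking
        {
            std::unique_lock<L1> first(l1);   // smart: block on the contended one
            if (l0.try_lock()) {
                first.release();              // success: keep l1 locked
                return;
            }
        }                                     // l0 was contended: l1 is unlocked here
        std::this_thread::yield();
    }
}
```

On return both lockables are held by the caller, and at no point does the
function block on one lockable while holding the other.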

This commit also replaces the std::try_lock implementation details. The
new code is identical in behaviour, but uses a pair of constrained
function templates. This avoids instantiating a class template, and is a
little simpler to call where used in std::__detail::__lock_impl and
std::try_lock.
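
A pair of constrained function templates of this kind might look roughly
like the sketch below. This is an assumption for illustration, not the
libstdc++ source: the names `try_lock_from` and `try_lock_all` are
hypothetical, and enable_if stands in for whatever constraints the real
code uses. As with std::try_lock, the result is -1 on success or the index
of the lockable that could not be locked.

```cpp
#include <cstddef>
#include <mutex>
#include <tuple>
#include <type_traits>

// Overload for the last lockable: nothing further to recurse into.
template <std::size_t Idx, typename... Ls>
std::enable_if_t<(Idx + 1 == sizeof...(Ls)), int>
try_lock_from(std::tuple<Ls&...>& locks)
{
    auto& l = std::get<Idx>(locks);
    return l.try_lock() ? -1 : static_cast<int>(Idx);
}

// Overload for every earlier lockable: lock it, then try the rest.
template <std::size_t Idx, typename... Ls>
std::enable_if_t<(Idx + 1 < sizeof...(Ls)), int>
try_lock_from(std::tuple<Ls&...>& locks)
{
    auto& l = std::get<Idx>(locks);
    std::unique_lock<std::remove_reference_t<decltype(l)>>
        guard(l, std::try_to_lock);
    if (!guard.owns_lock())
        return static_cast<int>(Idx);         // this lockable is contended
    int failed = try_lock_from<Idx + 1>(locks);
    if (failed == -1)
        guard.release();                      // all locked: keep ownership
    return failed;                            // otherwise guard unlocks l
}

// Hypothetical front end: returns -1 if every lockable was locked.
template <typename L0, typename... Ls>
int try_lock_all(L0& l0, Ls&... ls)
{
    auto locks = std::tie(l0, ls...);
    return try_lock_from<0>(locks);
}
```

If any try_lock() call fails, the guards of the earlier lockables unlock
them as the recursion unwinds, so the caller ends up holding either all of
the lockables or none of them.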

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>

libstdc++-v3/ChangeLog:

	* include/std/mutex (__try_to_lock): Move to __detail namespace.
	(struct __try_lock_impl): Replace with ...
	(__detail::__try_lock_impl<Idx>(tuple<Lockables...>&)): New
	function templates to implement std::try_lock.
	(try_lock): Use new __try_lock_impl.
	(__detail::__lock_impl(int, int&, L0&, L1&...)): New function
	template to implement std::lock.
	(lock): Use __lock_impl.
2021-06-21 18:29:58 +01:00
Jason Merrill 7232f7c4c2 expand: empty class return optimization [PR88529]
The x86_64 psABI says that an empty class isn't passed or returned in memory
or registers, so we shouldn't set %eax in this function.
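
A minimal example of the kind of function affected is shown below. It is a
hypothetical illustration, not the testcase added by this commit; `Empty`
and `make_empty` are made-up names.

```cpp
#include <type_traits>

struct Empty {};  // hypothetical empty class: no non-static data members

static_assert(std::is_empty<Empty>::value, "Empty must be an empty class");

// The returned object carries no data, so under the x86_64 psABI rule
// quoted above there is nothing to place in a return register.
Empty make_empty()
{
    return Empty{};
}
```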

The df-scan hunk catches the case where we look at a 0-length reg and build
a range the length of unsigned int, which happened before I changed
assign_parms to match expand_function_end.

	PR target/88529

gcc/ChangeLog:

	* df-scan.c (df_ref_record): Check that regno < endregno.
	* function.c (assign_parms, expand_function_end): Do nothing with a
	TYPE_EMPTY_P result.

gcc/testsuite/ChangeLog:

	* g++.target/i386/empty-class1.C: New test.
2021-06-21 10:50:01 -04:00