194721 Commits

Author SHA1 Message Date
Richard Biener c64ef5cd92 Remove --param max-fsm-thread-length
This removes max-fsm-thread-length which is obsoleted by
max-jump-thread-paths.

	* doc/invoke.texi (max-fsm-thread-length): Remove.
	* params.opt (max-fsm-thread-length): Likewise.
	* tree-ssa-threadbackward.cc
	(back_threader_profitability::profitable_path_p): Do not
	check max-fsm-thread-length.
2022-08-09 10:14:30 +02:00
Richard Biener 409978d58d tree-optimization/106514 - add --param max-jump-thread-paths
The following adds a limit for the exponential greedy search of
the backwards jump threader.  The idea is to limit the search
space in a way that the paths considered are the same if the search
were in BFS order rather than DFS.  In particular it stops considering
incoming edges into a block if the product of the in-degrees of
blocks on the path exceeds the specified limit.
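
As a rough illustration of the limit described above (a sketch only, not the
code in tree-ssa-threadbackward.cc), the search can be modelled as walking a
path backwards while multiplying the in-degrees of the visited blocks.  The
names Block and blocks_explored are hypothetical; max_paths stands in for the
new param.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a basic block: only its number of incoming edges.
struct Block {
  unsigned in_degree;
};

// Walk PATH (ordered from the threaded branch backwards) and return how many
// blocks have their incoming edges considered before the product of the
// in-degrees exceeds MAX_PATHS.  Overflow of the product is ignored here.
std::size_t blocks_explored(const std::vector<Block> &path,
                            std::uint64_t max_paths) {
  assert(max_paths > 0);
  std::uint64_t overall_paths = 1;  // the search starts with a product of 1
  std::size_t explored = 0;
  for (const Block &bb : path) {
    overall_paths *= bb.in_degree;
    if (overall_paths > max_paths)
      break;  // stop considering incoming edges into this block
    ++explored;
  }
  return explored;
}
```

With a limit of 64 and blocks that each have two predecessors, this model
stops after six blocks, since 2^6 == 64 is still within the limit and 2^7 is
not.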

When considering the low stmt copying limit of 7 (or 1 in the size
optimize case) this means the degenerate case with maximum search
space is a sequence of conditions with no actual code

  B1
   |\
   | empty
   |/
  B2
   |\
   ...
  Bn
   |\

GIMPLE_CONDs are costed 2, an equivalent GIMPLE_SWITCH already 4, so
we reach 7 already with 3 middle conditions (B1 and Bn do not count).
The search space would be 2^4 == 16 to reach this.  The FSM threads
historically allowed for a thread length of 10 but is really looking
for a single multiway branch threaded across the backedge.  I've
chosen the default of the new parameter to 64 which effectively
limits the outdegree of the switch statement (the cases reaching the
backedge) to that number (divided by 2 until I add some special
pruning for FSM threads due to the loop header indegree).  The
testcase ssa-dom-thread-7.c requires 56 at the moment (as said,
some special FSM thread pruning of considered edges would bring
it down to half of that), but we now get one more threading
and quite some more in later threadfull.  This testcase seems to
be difficult to check for expected transforms.

The new testcases add the degenerate case we currently thread
(without deciding whether that's a good idea ...) plus one with
an appropriate limit that should prevent the threading.

This obsoletes the mentioned --param max-fsm-thread-length but
I am not removing it as part of this patch.  When the search
space is limited the thread stmt size limit effectively provides
max-fsm-thread-length.

The param with its default does not help PR106514 enough to unleash
path searching with the higher FSM stmt count limit.

	PR tree-optimization/106514
	* params.opt (max-jump-thread-paths): New.
	* doc/invoke.texi (max-jump-thread-paths): Document.
	* tree-ssa-threadbackward.cc (back_threader::find_paths_to_names):
	Honor max-jump-thread-paths, take overall_path argument.
	(back_threader::find_paths): Pass 1 as initial overall_path.

	* gcc.dg/tree-ssa/ssa-thread-16.c: New testcase.
	* gcc.dg/tree-ssa/ssa-thread-17.c: Likewise.
	* gcc.dg/tree-ssa/ssa-dom-thread-7.c: Adjust.
2022-08-09 10:14:30 +02:00
Tobias Burnus 8a16b9f983 OpenMP: Fix folding with simd's linear clause [PR106492]
gcc/ChangeLog:

	PR middle-end/106492
	* omp-low.cc (lower_rec_input_clauses): Add missing folding
	to data type of linear-clause list item.

gcc/testsuite/ChangeLog:

	PR middle-end/106492
	* g++.dg/gomp/pr106492.C: New test.
2022-08-09 07:57:40 +02:00
GCC Administrator 5f17badb64 Daily bump. 2022-08-09 00:16:47 +00:00
Andrew MacLeod ef623bb585 Evaluate condition arguments with the correct type.
Processing of a cond_expr requires that a range of the correct type for the
operands of the cond_expr is passed in.

	PR tree-optimization/106556
	gcc/
	* gimple-range-gori.cc (gori_compute::condexpr_adjust): Use the
	  type of the cond_expr operands being evaluated.

	gcc/testsuite/
	* gfortran.dg/pr106556.f90: New.
2022-08-08 16:08:51 -04:00
Tom Honermann 053876cdbe preprocessor/106426: Treat u8 character literals as unsigned in char8_t modes.
This patch corrects handling of UTF-8 character literals in preprocessing
directives so that they are treated as unsigned types in char8_t enabled
C++ modes (C++17 with -fchar8_t or C++20 without -fno-char8_t). Previously,
UTF-8 character literals were always treated as having the same type as
ordinary character literals (signed or unsigned dependent on target or use
of the -fsigned-char or -funsigned-char options).

	PR preprocessor/106426

gcc/c-family/ChangeLog:
	* c-opts.cc (c_common_post_options): Assign cpp_opts->unsigned_utf8char
	subject to -fchar8_t, -fsigned-char, and/or -funsigned-char.

gcc/testsuite/ChangeLog:
	* g++.dg/ext/char8_t-char-literal-1.C: Check signedness of u8 literals.
	* g++.dg/ext/char8_t-char-literal-2.C: Check signedness of u8 literals.

libcpp/ChangeLog:
	* charset.cc (narrow_str_to_charconst): Set signedness of CPP_UTF8CHAR
	literals based on unsigned_utf8char.
	* include/cpplib.h (cpp_options): Add unsigned_utf8char.
	* init.cc (cpp_create_reader): Initialize unsigned_utf8char.
2022-08-08 19:50:40 +00:00
Tom Honermann 703837b2cc C: Implement C2X N2653 char8_t and UTF-8 string literal changes
This patch implements the core language and compiler dependent library
changes adopted for C2X via WG14 N2653.  The changes include:
- Change of type for UTF-8 string literals from array of const char to
  array of const char8_t (unsigned char).
- A new atomic_char8_t typedef.
- A new ATOMIC_CHAR8_T_LOCK_FREE macro defined in terms of the existing
  __GCC_ATOMIC_CHAR8_T_LOCK_FREE predefined macro.

gcc/ChangeLog:

	* ginclude/stdatomic.h (atomic_char8_t,
	ATOMIC_CHAR8_T_LOCK_FREE): New typedef and macro.

gcc/c/ChangeLog:

	* c-parser.cc (c_parser_string_literal): Use char8_t as the type
	of CPP_UTF8STRING when char8_t support is enabled.
	* c-typeck.cc (digest_init): Allow initialization of an array
	of character type by a string literal with type array of
	char8_t.

gcc/c-family/ChangeLog:

	* c-lex.cc (lex_string, lex_charconst): Use char8_t as the type
	of CPP_UTF8CHAR and CPP_UTF8STRING when char8_t support is
	enabled.
	* c-opts.cc (c_common_post_options): Set flag_char8_t if
	targeting C2x.

gcc/testsuite/ChangeLog:
	* gcc.dg/atomic/c2x-stdatomic-lockfree-char8_t.c: New test.
	* gcc.dg/atomic/gnu2x-stdatomic-lockfree-char8_t.c: New test.
	* gcc.dg/c11-utf8str-type.c: New test.
	* gcc.dg/c17-utf8str-type.c: New test.
	* gcc.dg/c2x-utf8str-type.c: New test.
	* gcc.dg/c2x-utf8str.c: New test.
	* gcc.dg/gnu2x-utf8str-type.c: New test.
	* gcc.dg/gnu2x-utf8str.c: New test.
2022-08-08 19:50:38 +00:00
Iain Buclaw 4b0253b019 d: Fix ICE in in add_stack_var, at cfgexpand.cc:476
The type that triggers the ICE never got completed by the semantic
analysis pass.  Checking for size forces it to be done, or issue a
compile-time error.

	PR d/106555

gcc/d/ChangeLog:

	* d-target.cc (Target::isReturnOnStack): Check for return type size.

gcc/testsuite/ChangeLog:

	* gdc.dg/imports/pr106555.d: New test.
	* gdc.dg/pr106555.d: New test.
2022-08-08 20:27:49 +02:00
François Dumont 01b1afdc35 libstdc++: [_GLIBCXX_DEBUG] Do not consider detached iterators as value-initialized
An attached iterator has its _M_version set to something != 0, the container version. This
value shall be preserved when detaching it so that the iterator does not look like a
value-initialized one.

libstdc++-v3/ChangeLog:

	* include/debug/formatter.h (__singular_value_init): New _Iterator_state enum entry.
	(_Parameter<>(const _Safe_iterator<>&, const char*, _Is_iterator)): Check if iterator
	parameter is value-initialized.
	(_Parameter<>(const _Safe_local_iterator<>&, const char*, _Is_iterator)): Likewise.
	* include/debug/safe_iterator.h (_Safe_iterator<>::_M_value_initialized()): New. Adapt
	checks.
	* include/debug/safe_local_iterator.h (_Safe_local_iterator<>::_M_value_initialized()): New.
	Adapt checks.
	* src/c++11/debug.cc (_Safe_iterator_base::_M_reset): Do not reset _M_version.
	(print_field(PrintContext&, const _Parameter&, const char*)): Adapt state_names.
	* testsuite/23_containers/deque/debug/iterator1_neg.cc: New test.
	* testsuite/23_containers/deque/debug/iterator2_neg.cc: New test.
	* testsuite/23_containers/forward_list/debug/iterator1_neg.cc: New test.
	* testsuite/23_containers/forward_list/debug/iterator2_neg.cc: New test.
	* testsuite/23_containers/forward_list/debug/iterator3_neg.cc: New test.
2022-08-08 20:11:59 +02:00
Andrew Pinski 21c7aab098 Fix middle-end/103645: empty struct store not removed when using compound literal
For compound literals empty struct stores are not removed as they go down a
different path of the gimplifier; trying to optimize the init constructor.
This fixes the problem by not adding the gimple assignment at the end
of gimplify_init_constructor if it was an empty type.

Note this updates gcc.dg/pr87052.c where we had:
const char d[0] = { };
And was expecting a store to d but after this, there is no store
as the decl's type is zero in size.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

	PR middle-end/103645
	* gimplify.cc (gimplify_init_constructor): Don't build/add
	gimple assignment of an empty type.

gcc/testsuite/ChangeLog:
	* gcc.dg/pr87052.c: Update d var to expect nothing.
2022-08-08 08:14:25 -07:00
Tamar Christina 5471f55f00 AArch32: Fix 128-bit sequential consistency atomic operations.
Similar to AArch64 the Arm implementation of 128-bit atomics is broken.

For 128-bit atomics we rely on pthread barriers to correctly guard the address
in the pointer to get correct memory ordering.  However for 128-bit atomics the
address under the lock is different from the original pointer.

This means that one of the values under the atomic operation is not protected
properly and so we fail when the user has requested sequential
consistency as there's no barrier to enforce this requirement.

As such users have resorted to adding an

#ifdef GCC
<emit barrier>
#endif

around the use of these atomics.

This corrects the issue by issuing a barrier only when __ATOMIC_SEQ_CST was
requested.  I have hand verified that the barriers are inserted
for atomic seq cst.
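
A minimal sketch of the idea, assuming the barrier helpers simply test the
requested memory order: the helper names follow the ChangeLog below, but the
bodies and the needs_seq_barrier predicate are illustrative assumptions and
not the contents of host-config.h.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical predicate: only sequential consistency needs the extra barrier.
constexpr bool needs_seq_barrier(std::memory_order order) {
  return order == std::memory_order_seq_cst;
}

// Issue a full fence before the lock-protected operation, but only when the
// caller asked for sequential consistency.
inline void pre_seq_barrier(std::memory_order order) {
  if (needs_seq_barrier(order))
    std::atomic_thread_fence(std::memory_order_seq_cst);
}

// Same decision after the operation.
inline void post_seq_barrier(std::memory_order order) {
  if (needs_seq_barrier(order))
    std::atomic_thread_fence(std::memory_order_seq_cst);
}
```

Weaker orderings take neither fence, so only seq-cst users pay for the
barrier.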

libatomic/ChangeLog:

	PR target/102218
	* config/arm/host-config.h (pre_seq_barrier, post_seq_barrier,
	pre_post_seq_barrier): Require barrier on __ATOMIC_SEQ_CST.
2022-08-08 14:37:42 +01:00
Tamar Christina e6a8ae900b AArch64: Fix 128-bit sequential consistency atomic operations.
The AArch64 implementation of 128-bit atomics is broken.

For 128-bit atomics we rely on pthread barriers to correctly guard the address
in the pointer to get correct memory ordering.  However for 128-bit atomics the
address under the lock is different from the original pointer.

This means that one of the values under the atomic operation is not protected
properly and so we fail when the user has requested sequential
consistency as there's no barrier to enforce this requirement.

As such users have resorted to adding an

#ifdef GCC
<emit barrier>
#endif

around the use of these atomics.

This corrects the issue by issuing a barrier only when __ATOMIC_SEQ_CST was
requested.  To remedy this performance hit I think we should revisit using a
similar approach to out-line-atomics for the 128-bit atomics.

Note that I believe I need the empty file due to the include_next chain but
I am not entirely sure.  I have hand verified that the barriers are inserted
for atomic seq cst.

libatomic/ChangeLog:

	PR target/102218
	* config/aarch64/aarch64-config.h: New file.
	* config/aarch64/host-config.h: New file.
2022-08-08 14:37:00 +01:00
Richard Biener 2a1448f276 lto/106540 - fix LTO tree input wrt dwarf2out_register_external_die
I've revisited the earlier two workarounds for dwarf2out_register_external_die
getting duplicate entries.  It turns out that r11-525-g03d90a20a1afcb
added dref_queue pruning to lto_input_tree but decl reading uses that
to stream in DECL_INITIAL even when in the middle of SCC streaming.
When that SCC then gets thrown away we can end up with debug nodes
registered which isn't supposed to happen.  The following adjusts
the DECL_INITIAL streaming to go the in-SCC way, using lto_input_tree_1,
since no SCCs are expected at this point, just refs.

	PR lto/106540
	PR lto/106334
	* dwarf2out.cc (dwarf2out_register_external_die): Restore
	original assert.
	* lto-streamer-in.cc (lto_read_tree_1): Use lto_input_tree_1
	to input DECL_INITIAL, to avoid committing drefs.
2022-08-08 11:13:13 +02:00
Andrew Pinski 2633c8d8f3 Move testcase gcc.dg/tree-ssa/pr93776.c to gcc.c-torture/compile/pr93776.c
Since this testcase is not exactly SSA specific and it would
be a good idea to compile this at more than just at -O1, moving
it to gcc.c-torture/compile would do that.

Committed as obvious after a test on x86_64-linux-gnu.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/pr93776.c: Moved to...
	* gcc.c-torture/compile/pr93776.c: ...here.
2022-08-07 20:35:06 -07:00
GCC Administrator 37e8e63d3c Daily bump. 2022-08-08 00:16:22 +00:00
Roger Sayle ef54eb74ca [Committed] Add -mno-stv to new gcc.target/i386/cmpti2.c test case.
Adding -march=cascadelake to the command line options of the new cmpti2.c
testcase triggers TImode STV and produces vector code that doesn't match
the scalar implementation that this test was intended to check.  Adding
-mno-stv to the options fixes this.  Committed as obvious.

2022-08-07  Roger Sayle  <roger@nextmovesoftware.com>

gcc/testsuite/ChangeLog
	* gcc.target/i386/cmpti2.c: Add -mno-stv to dg-options.
2022-08-07 22:19:24 +01:00
Jakub Jelinek 1907767735 c++: Add support for __real__/__imag__ modifications in constant expressions [PR88174]
We claim we support P0415R1 (constexpr complex), but e.g.
 #include <complex>

constexpr bool
foo ()
{
  std::complex<double> a (1.0, 2.0);
  a += 3.0;
  a.real (6.0);
  return a.real () == 6.0 && a.imag () == 2.0;
}

static_assert (foo ());

fails with
test.C:12:20: error: non-constant condition for static assertion
   12 | static_assert (foo ());
      |                ~~~~^~
test.C:12:20:   in ‘constexpr’ expansion of ‘foo()’
test.C:8:10:   in ‘constexpr’ expansion of ‘a.std::complex<double>::real(6.0e+0)’
test.C:12:20: error: modification of ‘__real__ a.std::complex<double>::_M_value’ is not a constant expression

The problem is we don't handle REALPART_EXPR and IMAGPART_EXPR
in cxx_eval_store_expression.
The following patch attempts to support it (with a requirement
that those are the outermost expressions, ARRAY_REF/COMPONENT_REF
etc. are just not possible on the result of these, BIT_FIELD_REF
would be theoretically possible if trying to extract some bits
from one part of a complex int, but I don't see how it could appear
in the FE trees).

For these references, the code handles value being COMPLEX_CST,
COMPLEX_EXPR or CONSTRUCTOR_NO_CLEARING empty CONSTRUCTOR (what we use
to represent uninitialized values for C++20 and later) and the
code starts by rewriting it to COMPLEX_EXPR, so that we can freely
adjust the individual parts and later on possibly optimize it back
to COMPLEX_CST if both halves are constant.

2022-08-07  Jakub Jelinek  <jakub@redhat.com>

	PR c++/88174
	* constexpr.cc (cxx_eval_store_expression): Handle REALPART_EXPR
	and IMAGPART_EXPR.  Change ctors from releasing_vec to
	auto_vec<tree *>, adjust all uses.  For !preeval, update ctors
	vector.

	* g++.dg/cpp1y/constexpr-complex1.C: New test.
2022-08-07 10:07:38 +02:00
Roger Sayle a46bca36b7 Allow any immediate constant in *cmp<dwi>_doubleword splitter on x86_64.
This patch tweaks i386.md's *cmp<dwi>_doubleword splitter's predicate to
allow general_operand, not just x86_64_hilo_general_operand, to improve
code generation.  As a general rule, i386.md's _doubleword splitters should
be post-reload splitters that require integer immediate operands to be
x86_64_hilo_int_operand, so that each part is a valid word mode immediate
constant.  As an exception to this rule, doubleword patterns that must be
split before reload, because they require additional scratch registers,
can take advantage of this ability to create new pseudos, to accept
any immediate constant, and call force_reg on the high and/or low parts
if they are not suitable immediate operands in word mode.

The benefit is shown in the new cmpti3.c test case below.

__int128 x;
int foo()
{
    __int128 t = 0x1234567890abcdefLL;
    return x == t;
}

where GCC with -O2 currently generates:

        movabsq $1311768467294899695, %rax
        xorl    %edx, %edx
        xorq    x(%rip), %rax
        xorq    x+8(%rip), %rdx
        orq     %rdx, %rax
        sete    %al
        movzbl  %al, %eax
        ret

but with this patch now generates:

        movabsq $1311768467294899695, %rax
        xorq    x(%rip), %rax
        orq     x+8(%rip), %rax
        sete    %al
        movzbl  %al, %eax
        ret
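
The arithmetic behind both sequences can be sketched in source form as
follows.  This is an illustration only; equal_by_halves is a hypothetical
name, and unsigned __int128 is a GCC/Clang extension, as in the test case
above.

```cpp
#include <cassert>
#include <cstdint>

// Compare two 128-bit values word by word: XOR the low halves, XOR the high
// halves, OR the two results, and test the combined word for zero.
bool equal_by_halves(unsigned __int128 x, unsigned __int128 t) {
  const std::uint64_t x_lo = static_cast<std::uint64_t>(x);
  const std::uint64_t x_hi = static_cast<std::uint64_t>(x >> 64);
  const std::uint64_t t_lo = static_cast<std::uint64_t>(t);
  const std::uint64_t t_hi = static_cast<std::uint64_t>(t >> 64);
  return ((x_lo ^ t_lo) | (x_hi ^ t_hi)) == 0;
}
```

When the high half of the constant is zero, x_hi ^ 0 is just x_hi, which is
why the shorter sequence can OR the high word of x in directly without first
clearing %edx.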

2022-08-07  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	* config/i386/i386.md (*cmp<dwi>_doubleword): Change predicate
	from x86_64_hilo_general_operand to general_operand.  Call
	force_reg on parts that are not x86_64_immediate_operand.

gcc/testsuite/ChangeLog
	* gcc.target/i386/cmpti1.c: New test case.
	* gcc.target/i386/cmpti2.c: Likewise.
	* gcc.target/i386/cmpti3.c: Likewise.
2022-08-07 08:49:48 +01:00
GCC Administrator 019a41a7ce Daily bump. 2022-08-07 00:16:36 +00:00
GCC Administrator 36e96748ed Daily bump. 2022-08-06 00:16:27 +00:00
David Malcolm e1a9168153 New warning: -Wanalyzer-jump-through-null [PR105947]
This patch adds a new warning to -fanalyzer for jumps through NULL
function pointers.

gcc/analyzer/ChangeLog:
	PR analyzer/105947
	* analyzer.opt (Wanalyzer-jump-through-null): New option.
	* engine.cc (class jump_through_null): New.
	(exploded_graph::process_node): Complain about jumps through NULL
	function pointers.

gcc/ChangeLog:
	PR analyzer/105947
	* doc/invoke.texi: Add -Wanalyzer-jump-through-null.

gcc/testsuite/ChangeLog:
	PR analyzer/105947
	* gcc.dg/analyzer/function-ptr-5.c: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2022-08-05 19:45:41 -04:00
Roger Sayle cc01a27db5 middle-end: Allow backend to expand/split double word compare to 0/-1.
This patch to the middle-end's RTL expansion reorders the code in
emit_store_flag_1 so that the backend has more control over how best
to expand/split double word equality/inequality comparisons against
zero or minus one.  With the current implementation, the middle-end
always decides to lower this idiom during RTL expansion using SUBREGs
and word mode instructions, without ever consulting the backend's
machine description.  Hence on x86_64, a TImode comparison against zero
is always expanded as:

(parallel [
  (set (reg:DI 91)
       (ior:DI (subreg:DI (reg:TI 88) 0)
               (subreg:DI (reg:TI 88) 8)))
  (clobber (reg:CC 17 flags))])
(set (reg:CCZ 17 flags)
     (compare:CCZ (reg:DI 91)
                  (const_int 0 [0])))

This patch, which makes no changes to the code itself, simply reorders
the clauses in emit_store_flag_1 so that the middle-end first attempts
expansion using the target's doubleword mode cstore optab/expander,
and only if this fails, falls back to lowering to word mode operations.
On x86_64, this allows the expander to produce:

(set (reg:CCZ 17 flags)
     (compare:CCZ (reg:TI 88)
                  (const_int 0 [0])))

which is a candidate for scalar-to-vector transformations (and
combine simplifications etc.).  On targets that don't define a cstore
pattern for doubleword integer modes, there should be no change in
behaviour.  For those that do, the current behaviour can be restored
(if desired) by restricting the expander/insn to not apply when the
comparison is EQ or NE, and operand[2] is either const0_rtx or
constm1_rtx.

This change just keeps RTL expansion more consistent (in philosophy).
For other doubleword comparisons, such as with operators LT and GT,
or with constants other than zero or -1, the wishes of the backend
are respected, and only if the optab expansion fails are the default
fall-back implementations using narrower integer mode operations
(and conditional jumps) used.
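
For reference, the word mode fall-back for these two special cases can be
sketched in source form as below.  This is an assumption-labelled
illustration of the idiom (IOR of the words against zero, AND of the words
against all-ones), not the code in expmed.cc; the function names are
hypothetical and unsigned __int128 is a GCC/Clang extension.

```cpp
#include <cassert>
#include <cstdint>

// x == 0 holds exactly when the OR of both words is zero.
bool is_zero_by_words(unsigned __int128 v) {
  const std::uint64_t lo = static_cast<std::uint64_t>(v);
  const std::uint64_t hi = static_cast<std::uint64_t>(v >> 64);
  return (lo | hi) == 0;
}

// x == -1 holds exactly when the AND of both words has every bit set.
bool is_minus_one_by_words(unsigned __int128 v) {
  const std::uint64_t lo = static_cast<std::uint64_t>(v);
  const std::uint64_t hi = static_cast<std::uint64_t>(v >> 64);
  return (lo & hi) == ~std::uint64_t{0};
}
```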

2022-08-05  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	* expmed.cc (emit_store_flag_1): Move code to expand double word
	equality and inequality against zero or -1, using word operations,
	to after trying to use the backend's cstore<mode>4 optab/expander.
2022-08-05 21:05:35 +01:00
Jonathan Wakely 58a644cfde libstdc++: Add feature test macro for <experimental/scope>
libstdc++-v3/ChangeLog:

	* include/experimental/scope (__cpp_lib_experimental_scope):
	Define.
	* testsuite/experimental/scopeguard/uniqueres.cc: Check macro.
2022-08-05 15:17:57 +01:00
Jonathan Wakely 29fc5075d7 libstdc++: Implement <experimental/scope> from LFTSv3
libstdc++-v3/ChangeLog:

	* include/Makefile.am: Add new header.
	* include/Makefile.in: Regenerate.
	* include/experimental/scope: New file.
	* testsuite/experimental/scopeguard/uniqueres.cc: New test.
	* testsuite/experimental/scopeguard/exit.cc: New test.
2022-08-05 14:57:31 +01:00
Tamar Christina 1878ab3650 middle-end: Guard value_replacement and store_elim from seeing diamonds.
This excludes value_replacement and store_elim from diamonds as they don't
handle the form properly.

gcc/ChangeLog:

	PR middle-end/106534
	* tree-ssa-phiopt.cc (tree_ssa_phiopt_worker): Guard the
	value_replacement and store_elim from diamonds.
2022-08-05 14:53:28 +01:00
Richard Biener 6ca948264d backthreader dump fix
This fixes odd SUCCEEDED dumps from the backthreader registry that
can happen even though register_jump_thread cancelled the thread
as invalid.

	* tree-ssa-threadbackward.cc (back_threader::maybe_register_path):
	Check whether the registry register_path rejected the path.
	(back_threader_registry::register_path): Return whether
	register_jump_thread succeeded.
2022-08-05 14:32:40 +02:00
Aldy Hernandez 47964e7662 Inline unsupported_range constructor.
An unsupported_range temporary is instantiated in every Value_Range
for completeness' sake and should be mostly a NOP.  However, it's
showing up in the callgrind stats, because it's not inline.  This
fixes the oversight.

	PR tree-optimization/106514

gcc/ChangeLog:

	* value-range.cc (unsupported_range::unsupported_range): Move...
	* value-range.h (unsupported_range::unsupported_range): ...here.
	(unsupported_range::set_undefined): New.
2022-08-05 14:06:36 +02:00
Richard Biener 36bc2a8f24 tree-optimization/106533 - loop distribution of inner loop of nest
Loop distribution currently gives up if the outer loop of a loop
nest it analyzes contains a stmt with side-effects instead of
continuing to analyze the innermost loop.  The following fixes that
by continuing anyway.

	PR tree-optimization/106533
	* tree-loop-distribution.cc (loop_distribution::execute): Continue
	analyzing the inner loops when find_seed_stmts_for_distribution
	fails.

	* gcc.dg/tree-ssa/ldist-39.c: New testcase.
2022-08-05 12:11:46 +02:00
Haochen Gui 4574dad43f rs6000: Correct return value of check_p9modulo_hw_available.
Set the return value to 0 when modulo is supported, and to 1 when not supported.

gcc/testsuite/
	* lib/target-supports.exp (check_p9modulo_hw_available): Correct return
	value.
2022-08-05 10:46:00 +08:00
Andrew Pinski ffe4f55aa1 [RISC-V] Fix 32bit riscv with zbs extension enabled
The problem here was a disconnect between splittable_const_int_operand
predicate and the function riscv_build_integer_1 for 32bits with zbs enabled.
The splittable_const_int_operand predicate had a check for TARGET_64BIT which
was not needed so this patch removed it.

Committed as obvious after a build for riscv32-elf configured with --with-arch=rv32imac_zba_zbb_zbc_zbs.

Thanks,
Andrew Pinski

gcc/ChangeLog:

	* config/riscv/predicates.md (splittable_const_int_operand):
	Remove the check for TARGET_64BIT for single bit const values.
2022-08-04 19:42:42 -07:00
GCC Administrator 4ad52740ba Daily bump. 2022-08-05 00:16:24 +00:00
Eugene Rozenfeld cd093ee468 Add myself as AutoFDO maintainer
ChangeLog:

	* MAINTAINERS: Add myself as AutoFDO maintainer.
2022-08-04 13:38:28 -07:00
Jonathan Wakely 2678386df2 libstdc++: Make std::string_view(Range&&) constructor explicit
The P2499R0 paper was recently approved for C++23.

libstdc++-v3/ChangeLog:

	* include/std/string_view (basic_string_view(Range&&)): Add
	explicit as per P2499R0.
	* testsuite/21_strings/basic_string_view/cons/char/range_c++20.cc:
	Adjust implicit conversions. Check implicit conversions fail.
	* testsuite/21_strings/basic_string_view/cons/wchar_t/range_c++20.cc:
	Likewise.
2022-08-04 19:37:56 +01:00
Jonathan Wakely db33daa467 libstdc++: Add comparisons to std::default_sentinel_t (LWG 3719)
This library defect was recently approved for C++23.

libstdc++-v3/ChangeLog:

	* include/bits/fs_dir.h (directory_iterator): Add comparison
	with std::default_sentinel_t. Remove redundant operator!= for
	C++20.
	* (recursive_directory_iterator): Likewise.
	* include/bits/iterator_concepts.h [!__cpp_lib_concepts]
	(default_sentinel_t, default_sentinel): Define even if concepts
	are not supported.
	* include/bits/regex.h (regex_iterator): Add comparison with
	std::default_sentinel_t. Remove redundant operator!= for C++20.
	(regex_token_iterator): Likewise.
	(regex_token_iterator::_M_end_of_seq()): Add noexcept.
	* testsuite/27_io/filesystem/iterators/lwg3719.cc: New test.
	* testsuite/28_regex/iterators/regex_iterator/lwg3719.cc:
	New test.
	* testsuite/28_regex/iterators/regex_token_iterator/lwg3719.cc:
	New test.
2022-08-04 19:37:56 +01:00
Andrew MacLeod 8e34d92ef2 Loop over intersected bitmaps.
compute_ranges_in_block loops over the import list and then checks the
same bit in exports.  It is more efficient to loop over the intersection
of the 2 bitmaps.
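
A small sketch of the difference, using a single 64-bit word in place of
GCC's bitmap type: the function name is hypothetical, and __builtin_ctzll is
a GCC/Clang builtin used here only to find the lowest set bit.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Visit only the names present in both sets by iterating over the set bits
// of the intersection, instead of walking IMPORTS and testing EXPORTS for
// each bit found.
std::vector<unsigned> names_in_both(std::uint64_t imports,
                                    std::uint64_t exports) {
  std::vector<unsigned> names;
  // both &= both - 1 clears the lowest set bit on each iteration.
  for (std::uint64_t both = imports & exports; both != 0; both &= both - 1)
    names.push_back(static_cast<unsigned>(__builtin_ctzll(both)));
  return names;
}
```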

	PR tree-optimization/106514
	* gimple-range-path.cc (path_range_query::compute_ranges_in_block):
	Use EXECUTE_IF_AND_IN_BITMAP to loop over 2 bitmaps.
2022-08-04 14:21:59 -04:00
Tamar Christina be58bf98e9 middle-end: Simplify subtract where both arguments are being bitwise inverted.
This adds a match.pd rule that drops the bitwise nots when both arguments to a
subtract are inverted, i.e. for:

float g(float a, float b)
{
  return ~(int)a - ~(int)b;
}

we instead generate

float g(float a, float b)
{
  return (int)b - (int)a;
}

We already do a limited version of this from the fold_binary fold functions but
this makes a more general version in match.pd that applies more often.
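
The identity behind the rule follows from ~x == -x - 1 in two's complement,
so ~a - ~b == (-a - 1) - (-b - 1) == b - a.  A quick check, with hypothetical
function names:

```cpp
#include <cassert>

// Left-hand side of the rule: subtract of two bitwise-inverted operands.
constexpr int sub_of_nots(int a, int b) { return ~a - ~b; }

// Right-hand side: the operands swapped, with the inversions dropped.
constexpr int swapped_sub(int a, int b) { return b - a; }

static_assert(sub_of_nots(3, 10) == swapped_sub(3, 10), "~a - ~b == b - a");
```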

gcc/ChangeLog:

	* match.pd: New bit_not rule.

gcc/testsuite/ChangeLog:

	* gcc.dg/subnot.c: New test.
2022-08-04 16:37:25 +01:00
Tamar Christina c832ec4c3e middle-end: Fix phi-ssa assertion triggers. [PR106519]
For the diamond PHI form in tree_ssa_phiopt_worker we need to
extract edge e2 sooner.  This changes it so we extract it at the
same time we determine we have a diamond shape.

gcc/ChangeLog:

	PR middle-end/106519
	* tree-ssa-phiopt.cc (tree_ssa_phiopt_worker): Check final phi edge for
	diamond shapes.

gcc/testsuite/ChangeLog:

	PR middle-end/106519
	* gcc.dg/pr106519.c: New test.
2022-08-04 16:35:31 +01:00
Sam Feifer 39579ba8de match.pd: Add bitwise and pattern [PR106243]
This patch adds a new optimization to match.pd. The pattern, -x & 1,
now gets simplified to x & 1, reducing the number of instructions
produced.
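
Negation is ~x + 1: the inversion flips the low bit and the increment flips
it back, so the low bit of -x always equals the low bit of x.  A small check
of that claim, using unsigned arithmetic to sidestep signed overflow (the
function names are hypothetical):

```cpp
#include <cassert>

// The pattern before simplification.
constexpr unsigned neg_and_one(unsigned x) { return (0u - x) & 1u; }

// The simplified form.
constexpr unsigned and_one(unsigned x) { return x & 1u; }

static_assert(neg_and_one(7u) == and_one(7u), "low bit survives negation");
```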

This patch also adds tests for the optimization rule.

Bootstrapped/regtested on x86_64-pc-linux-gnu.

	PR tree-optimization/106243

gcc/ChangeLog:

	* match.pd (-x & 1): New simplification.

gcc/testsuite/ChangeLog:

	* gcc.dg/pr106243-1.c: New test.
	* gcc.dg/pr106243.c: New test.
2022-08-04 09:35:14 -04:00
Richard Biener d8552eaddc tree-optimization/106521 - unroll-and-jam LC SSA rewrite
The LC SSA rewrite performs SSA verification at start but the VN
run performed on the unrolled-and-jammed body can leave us with
invalid SSA form until CFG cleanup is run.  So make sure we do that
before rewriting into LC SSA.

	PR tree-optimization/106521
	* gimple-loop-jam.cc (tree_loop_unroll_and_jam): Perform
	CFG cleanup manually before rewriting into LC SSA.

	* gcc.dg/torture/pr106521.c: New testcase.
2022-08-04 15:01:38 +02:00
Richard Biener d86d81a449 Backwards threader greedy search TLC
I've tried to understand how the greedy search works seeing the
bitmap dances and the split into resolve_phi.  I've summarized
the intent of the algorithm as

      // For further greedy searching we want to remove interesting
      // names defined in BB but add ones on the PHI edges for the
      // respective edges.

but the implementation differs in detail.  In particular when
there is more than one interesting PHI in BB it seems to only consider
the first for translating defs across edges.  It also only applies
the loop crossing restriction when there is an interesting PHI.

The following preserves the loop crossing restriction to the case
of interesting PHIs but merges resolve_phi back, changing interesting
as outlined with the intent above.  It should get more threading
cases when there are multiple interesting PHI defs in a block.
It might be a bit faster due to less bitmap operations but in the
end the main intent was to make what happens more obvious.

	* tree-ssa-threadbackward.cc (populate_worklist): Remove.
	(back_threader::resolve_phi): Likewise.
	(back_threader::find_paths_to_names): Rewrite greedy search.
2022-08-04 15:01:38 +02:00
Jonathan Wakely 07c7ee4d2d libstdc++: Rename data members of std::unexpected and std::bad_expected_access
The P2549R1 paper was accepted for C++23. I already implemented it for
our <expected>, but I didn't rename the private data members, only the
public member functions. This renames the data members for consistency
with the working draft.

libstdc++-v3/ChangeLog:

	* include/std/expected (unexpected::_M_val): Rename to _M_unex.
	(bad_expected_access::_M_val): Likewise.
2022-08-04 13:10:33 +01:00
Jonathan Wakely 3e9bd6b2b1 libstdc++: Update value of __cpp_lib_ios_noreplace macro
My P2467R1 proposal was accepted for C++23 so there's an official value
for this macro now.

libstdc++-v3/ChangeLog:

	* include/bits/ios_base.h (__cpp_lib_ios_noreplace): Update
	value to 202207L.
	* include/std/version (__cpp_lib_ios_noreplace): Likewise.
	* testsuite/27_io/basic_ofstream/open/char/noreplace.cc: Check
	for new value.
	* testsuite/27_io/basic_ofstream/open/wchar_t/noreplace.cc:
	Likewise.
2022-08-04 13:10:24 +01:00
Jonathan Wakely af98cb88eb libstdc++: Unblock atomic wait on non-futex platforms [PR106183]
When using a mutex and condition variable, the notifying thread needs to
increment _M_ver while holding the mutex lock, and the waiting thread
needs to re-check after locking the mutex. This avoids a missed
notification as described in the PR.

By moving the increment of _M_ver to the base _M_notify we can make the
use of the mutex local to the use of the condition variable, and
simplify the code a little. We can use a relaxed store because the mutex
already provides sequential consistency. Also we don't need to check
whether __addr == &_M_ver because we know that's always true for
platforms that use a condition variable, and so we also know that we
always need to use notify_all() not notify_one().
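
The locking discipline described above can be sketched roughly as follows.
This is an illustration under stated assumptions, not the code in
atomic_wait.h: VersionedWaiter and its members are hypothetical names, and a
relaxed fetch_add stands in for the increment of _M_ver.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for a waiter on a platform without futexes.
class VersionedWaiter {
 public:
  // Notifier: bump the version while holding the mutex, then wake everyone.
  void notify() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      version_.fetch_add(1, std::memory_order_relaxed);
    }
    cv_.notify_all();
  }

  // Waiter: block until the version differs from OLD.  The predicate is
  // evaluated with the mutex held, so an increment cannot slip in between
  // the check and the wait.
  void wait_for_change(unsigned old) {
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [&] {
      return version_.load(std::memory_order_relaxed) != old;
    });
  }

  unsigned version() const { return version_.load(std::memory_order_relaxed); }

 private:
  std::mutex mutex_;
  std::condition_variable cv_;
  std::atomic<unsigned> version_{0};
};
```

If the increment happened outside the lock, a waiter could observe the old
version, then miss the notify_all() before it starts waiting.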

Reviewed-by: Thomas Rodgers <trodgers@redhat.com>

libstdc++-v3/ChangeLog:

	PR libstdc++/106183
	* include/bits/atomic_wait.h (__waiter_pool_base::_M_notify):
	Move increment of _M_ver here.
	[!_GLIBCXX_HAVE_PLATFORM_WAIT]: Lock mutex around increment.
	Use relaxed memory order and always notify all waiters.
	(__waiter_base::_M_do_wait) [!_GLIBCXX_HAVE_PLATFORM_WAIT]:
	Check value again after locking mutex.
	(__waiter_base::_M_notify): Remove increment of _M_ver.
2022-08-04 13:09:39 +01:00
Ulrich Drepper 075683767a Adjust index number of tuple pretty printer
The tuple pretty printer uses 1-based indices which is quite confusing
considering the access to the same values with the std::get functions
uses 0-based indices.  This patch changes the pretty printer since
this is not a guaranteed API.

libstdc++-v3/ChangeLog:

	* python/libstdcxx/v6/printers.py (class StdTuplePrinter): Use
	zero-based indices just like std::get takes.
2022-08-04 13:18:05 +02:00
Ilya Leoshkevich 2f17f489de PR106342 - IBM zSystems: Provide vsel for all vector modes
dg.exp=pr104612.c fails with an ICE on s390x, because copysignv2sf3
produces an insn that vsel<mode> is supposed to recognize, but can't,
because it's not defined for V2SF.  Fix by defining it for all vector
modes supported by copysign<mode>3.

gcc/ChangeLog:

	* config/s390/vector.md (V_HW_FT): New iterator.
	* config/s390/vx-builtins.md (vsel<mode>): Use V_HW_FT instead
	of V_HW.
2022-08-04 12:28:43 +02:00
GCC Administrator 4c23b534d4 Daily bump. 2022-08-04 00:16:49 +00:00
Michael Meissner 1e4a8c782e Do not enable -mblock-ops-vector-pair.
Testing has shown that using the load vector pair and store vector pair
instructions for block moves has some performance issues on power10.

A patch on June 11th modified the code so that GCC would not set
-mblock-ops-vector-pair by default if we are tuning for power10, but it would
set the option if we were tuning for a different machine and have load and store
vector pair instructions enabled.

This patch eliminates the code setting -mblock-ops-vector-pair.  If you want to
generate load vector pair and store vector pair instructions for block moves,
you must use -mblock-ops-vector-pair.

2022-08-03   Michael Meissner  <meissner@linux.ibm.com>

gcc/

	* config/rs6000/rs6000.cc (rs6000_option_override_internal): Remove code
	setting -mblock-ops-vector-pair.
2022-08-03 17:52:31 -04:00
Andrew MacLeod 19ffb35d17 Do not walk equivalence set in path_oracle::killing_def.
When killing a def in the path ranger, there is no need to walk the set
of existing equivalences clearing bits.  An equivalence match requires
that both ssa-names have to be in each other's set.  As killing_def
creates a new empty set containing only the current def, it already
ensures false equivalences won't happen.
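
A toy model of that argument, assuming equivalence sets are kept as plain
std::set objects keyed by an integer name.  The type equiv_sketch and its
members are hypothetical and are not the path_oracle data structures.

```cpp
#include <map>
#include <set>

// Hypothetical model: each name maps to the set of names it is
// believed to be equivalent to.
struct equiv_sketch
{
  std::map<int, std::set<int>> sets;

  // Record a == b by giving every member of the merged set the same set.
  void add_equiv(int a, int b)
  {
    std::set<int> merged{a, b};
    for (int n : {a, b})
      {
        auto it = sets.find(n);
        if (it != sets.end())
          merged.insert(it->second.begin(), it->second.end());
      }
    for (int n : merged)
      sets[n] = merged;
  }

  // A match requires membership in both directions.
  bool equiv_p(int a, int b) const
  {
    if (a == b)
      return true;
    auto ia = sets.find(a);
    auto ib = sets.find(b);
    return ia != sets.end() && ib != sets.end()
           && ia->second.count(b) != 0 && ib->second.count(a) != 0;
  }

  // Kill a definition: give it a fresh set holding only itself.
  // Other sets are deliberately left untouched.
  void killing_def(int a)
  {
    sets[a] = std::set<int>{a};
  }
};
```

After killing_def the other sets may still mention the killed name, but
the two-way membership test in equiv_p can no longer succeed for it.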

	PR tree-optimization/106514
	* value-relation.cc (path_oracle::killing_def): Do not walk the
	  equivalence set clearing bits.
2022-08-03 14:40:55 -04:00
Jose E. Marchesi f0688c82ba testsuite: btf: fix regexps in btf-int-1.c
The regexps in the test btf-int-1.c were not working properly with the
commenting style of at least one target: powerpc64le-linux-gnu.  This
patch changes the test to use better regexps.

Tested in bpf-unknown-none, x86_64-linux-gnu and powerpc64le-linux-gnu.
Pushed to master as obvious.

gcc/testsuite/ChangeLog:

	PR testsuite/106515
	* gcc.dg/debug/btf/btf-int-1.c: Fix regexps in
	scan-assembler-times.
2022-08-03 18:50:05 +02:00
Tamar Christina 9bb19e143c middle-end: Support recognition of three-way max/min.
This patch adds support for three-way min/max recognition in phi-opts.

Concretely for e.g.

#include <stdint.h>

uint8_t three_min (uint8_t xc, uint8_t xm, uint8_t xy) {
    uint8_t xk;
    if (xc < xm) {
        xk = (uint8_t) (xc < xy ? xc : xy);
    } else {
        xk = (uint8_t) (xm < xy ? xm : xy);
    }
    return xk;
}

we generate:

  <bb 2> [local count: 1073741824]:
  _5 = MIN_EXPR <xc_1(D), xy_3(D)>;
  _7 = MIN_EXPR <xm_2(D), _5>;
  return _7;

instead of

  <bb 2>:
  if (xc_2(D) < xm_3(D))
    goto <bb 3>;
  else
    goto <bb 4>;

  <bb 3>:
  xk_5 = MIN_EXPR <xc_2(D), xy_4(D)>;
  goto <bb 5>;

  <bb 4>:
  xk_6 = MIN_EXPR <xm_3(D), xy_4(D)>;

  <bb 5>:
  # xk_1 = PHI <xk_5(3), xk_6(4)>
  return xk_1;

The same function also immediately deals with turning a minimization problem
into a maximization one if the results are inverted.  We do this here since
doing it in match.pd would end up changing the shape of the BBs and adding
additional instructions which would prevent various optimizations from working.
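
As a sanity check on the example above, the branching source form and the
two nested MIN_EXPRs compute the same value.  A minimal sketch; the
function names three_min_branchy and three_min_flat are made up for this
illustration and do not appear in the patch.

```cpp
#include <algorithm>
#include <cstdint>

// Branching form, following the source example in this message.
std::uint8_t three_min_branchy(std::uint8_t xc, std::uint8_t xm,
                               std::uint8_t xy)
{
  if (xc < xm)
    return std::min(xc, xy);
  return std::min(xm, xy);
}

// Straight-line form mirroring the two MIN_EXPRs in the optimized GIMPLE:
//   _5 = MIN_EXPR <xc, xy>;  _7 = MIN_EXPR <xm, _5>;
std::uint8_t three_min_flat(std::uint8_t xc, std::uint8_t xm,
                            std::uint8_t xy)
{
  std::uint8_t t = std::min(xc, xy);
  return std::min(xm, t);
}
```

Both forms return the smallest of the three inputs, which is why the
conditional and the PHI node can be dropped.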

gcc/ChangeLog:

	* tree-ssa-phiopt.cc (minmax_replacement): Optionally search for the phi
	sequence of a three-way conditional.
	(replace_phi_edge_with_variable): Support diamonds.
	(tree_ssa_phiopt_worker): Detect diamond phi structure for three-way
	min/max.
	(strip_bit_not, invert_minmax_code): New.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/split-path-1.c: Disable phi-opts so we don't optimize
	code away.
	* gcc.dg/tree-ssa/minmax-10.c: New test.
	* gcc.dg/tree-ssa/minmax-11.c: New test.
	* gcc.dg/tree-ssa/minmax-12.c: New test.
	* gcc.dg/tree-ssa/minmax-13.c: New test.
	* gcc.dg/tree-ssa/minmax-14.c: New test.
	* gcc.dg/tree-ssa/minmax-15.c: New test.
	* gcc.dg/tree-ssa/minmax-16.c: New test.
	* gcc.dg/tree-ssa/minmax-3.c: New test.
	* gcc.dg/tree-ssa/minmax-4.c: New test.
	* gcc.dg/tree-ssa/minmax-5.c: New test.
	* gcc.dg/tree-ssa/minmax-6.c: New test.
	* gcc.dg/tree-ssa/minmax-7.c: New test.
	* gcc.dg/tree-ssa/minmax-8.c: New test.
	* gcc.dg/tree-ssa/minmax-9.c: New test.
2022-08-03 16:00:39 +01:00