Commit Graph

189969 Commits

Author SHA1 Message Date
John David Anglin
14dd0921fe Fix typo in t-dimode
2021-11-27  John David Anglin  <danglin@gcc.gnu.org>

libgcc/ChangeLog:

	* config/pa/t-dimode (lib2difuncs): Fix typo.
2021-11-27 21:47:47 +00:00
Petter Tomner
1e53408452 jit: Change printf specifiers for size_t to %zu
Change four occurrences of %ld specifier for size_t to %zu for clean 32bit builds.

Signed-off-by
2021-11-27	Petter Tomner	<tomner@kth.se>

gcc/jit/
	* libgccjit.c: %ld -> %zu
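
As an illustration of the specifier change (a minimal sketch, not the code from libgccjit.c; the function name format_size is hypothetical): %zu is the standard conversion for size_t, whereas %ld only matches when size_t happens to have the width of long.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write VALUE into BUF as decimal text.
   %zu is the C99 conversion specifier for size_t, so the format
   matches the argument type on both 32-bit and 64-bit targets.  */
int
format_size (char *buf, size_t buflen, size_t value)
{
  return snprintf (buf, buflen, "%zu", value);
}
```

The return value is the number of characters snprintf would have written, which makes truncation easy to detect for a caller.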
2021-11-27 16:45:41 +01:00
Jakub Jelinek
f7e4f57f1c x86: Fix up x86_{,64_}sh{l,r}d patterns [PR103431]
The following testcase is miscompiled because the x86_{,64_}sh{l,r}d
patterns don't properly describe what the instructions do.  One thing
is left out, in particular that there is initial count &= 63 for
sh{l,r}dq and initial count &= 31 for sh{l,r}d{l,w}.  And another thing
not described properly, in particular the behavior when count (after the
masking) is 0.  The pattern says it is e.g.
res = (op0 << op2) | (op1 >> (64 - op2))
but that triggers UB on op1 >> 64.  For op2 0 we actually want
res = (op0 << op2) | 0
When constants are propagated to these patterns during RTL optimizations,
both such problems trigger wrong-code issues.
This patch represents the patterns as e.g.
res = (op0 << (op2 & 63)) | (unsigned long long) ((uint128_t) op1 >> (64 - (op2 & 63)))
so the pattern includes the initial masking, and op2 == 0 results in
zero being ORed in.
The patch introduces alternate patterns for constant op2 where
simplify-rtx.c will fold those expressions into simple numbers,
and define_insn_and_split pre-reload splitter for how the patterns
looked before into the new form, so that it can pattern match during
combine even computations that assumed the shift amount will be in
the range of 1 .. bitsize-1.
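
The corrected 64-bit pattern can be modelled in plain C roughly as follows. This is a sketch under assumptions: the name shld64_model is hypothetical, and unsigned __int128 is a GCC/Clang extension rather than standard C.

```c
#include <stdint.h>

/* Hypothetical C model of the 64-bit shld semantics described above.
   The count is masked to 0..63 first; op1 is widened to 128 bits so
   that the right shift by (64 - count) stays defined when count is 0,
   in which case it contributes zero to the result.  */
uint64_t
shld64_model (uint64_t op0, uint64_t op1, unsigned int op2)
{
  unsigned int count = op2 & 63;      /* initial masking */
  unsigned __int128 wide = op1;       /* assumption: __int128 is available */
  return (op0 << count) | (uint64_t) (wide >> (64 - count));
}
```

With a count of 0 (or 64, which masks to 0) the model returns op0 unchanged, matching the "| 0" behaviour the message asks for.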

2021-11-27  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/103431
	* config/i386/i386.md (x86_64_shld, x86_shld, x86_64_shrd, x86_shrd):
	Change insn pattern to accurately describe the instructions.
	(*x86_64_shld_1, *x86_shld_1, *x86_64_shrd_1, *x86_shrd_1): New
	define_insn patterns.
	(*x86_64_shld_2, *x86_shld_2, *x86_64_shrd_2, *x86_shrd_2): New
	define_insn_and_split patterns.
	(*ashl<dwi>3_doubleword_mask, *ashl<dwi>3_doubleword_mask_1,
	*<insn><dwi>3_doubleword_mask, *<insn><dwi>3_doubleword_mask_1,
	ix86_rotl<dwi>3_doubleword, ix86_rotr<dwi>3_doubleword): Adjust
	splitters for x86_{,64_}sh{l,r}d pattern changes.

	* gcc.dg/pr103431.c: New test.
2021-11-27 13:02:06 +01:00
Jakub Jelinek
567d5f3d62 bswap: Fix UB in find_bswap_or_nop_finalize [PR103435]
On gcc.c-torture/execute/pr103376.c in the following code we trigger UB
in the compiler.  n->range is 8 because it is 64-bit load and rsize is 0
because it is a bswap sequence with load and known to be 0:
  /* Find real size of result (highest non-zero byte).  */
  if (n->base_addr)
    for (tmpn = n->n, rsize = 0; tmpn; tmpn >>= BITS_PER_MARKER, rsize++);
  else
    rsize = n->range;
The shifts then shift uint64_t by 64 bits.  For this case mask is 0
and we want both *cmpxchg and *cmpnop as 0, the operation can be done as
both nop and bswap and callers will prefer nop.
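
The general shape of such a guard can be sketched like this. It is an illustration only, not the code in gimple-ssa-store-merging.c; keep_low_bytes and its parameters are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical helper: keep only the low (8 - drop_bytes) bytes of VALUE.
   Dropping all 8 bytes would require shifting a uint64_t by 64 bits,
   which is undefined behaviour, so that case is handled explicitly
   and simply yields 0.  */
uint64_t
keep_low_bytes (uint64_t value, unsigned int drop_bytes)
{
  if (drop_bytes >= 8)
    return 0;
  uint64_t mask = UINT64_MAX >> (drop_bytes * 8);
  return value & mask;
}
```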

2021-11-27  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/103435
	* gimple-ssa-store-merging.c (find_bswap_or_nop_finalize): Avoid UB if
	n->range - rsize == 8, just clear both *cmpnop and *cmpxchg in that
	case.
2021-11-27 13:00:55 +01:00
Roger Sayle
d9c8a0238f [Committed] Fix new ivopts-[89].c test cases for -m32.
2021-11-27  Roger Sayle  <roger@nextmovesoftware.com>

gcc/testsuite/ChangeLog
	* gcc.dg/tree-ssa/ivopts-8.c: Fix new test case for -m32.
	* gcc.dg/tree-ssa/ivopts-9.c: Likewise.
2021-11-27 10:13:31 +00:00
GCC Administrator
f4ed2e3ae7 Daily bump.
2021-11-27 00:16:19 +00:00
Martin Jambor
9e2e47391b ipa: Fix CFG fix-up in IPA-CP transform phase (PR 103441)
I forgot that IPA passes before ipa-inline must not return
TODO_cleanup_cfg from their transformation function because ordinary
CFG cleanup does not remove call graph edges associated with removed
call statements but must use
delete_unreachable_blocks_update_callgraph instead.  This patch fixes
that error.

gcc/ChangeLog:

2021-11-26  Martin Jambor  <mjambor@suse.cz>

	PR ipa/103441
	* ipa-prop.c (ipcp_transform_function): Call
	delete_unreachable_blocks_update_callgraph instead of returning
	TODO_cleanup_cfg.
2021-11-27 01:01:46 +01:00
Jonathan Wakely
52b769437a libstdc++: Fix test that fails in C++20 mode
This test was written to verify that the LWG 3265 changes work. But
those changes were superseded by LWG 3435, and the test is now incorrect
according to the current draft. The assignment operator is now
constrained to also require convertibility, which makes the test fail.

Change the Iter type to be convertible from int*, but make it throw an
exception if that conversion is used. Change the test from compile-only
to run, so we verify that the exception isn't thrown.

libstdc++-v3/ChangeLog:

	* testsuite/24_iterators/move_iterator/dr3265.cc: Fix test to
	account for LWG 3435 resolution.
2021-11-26 22:56:51 +00:00
Jonathan Wakely
33adfd0d42 libstdc++: Fix trivial relocation for constexpr std::vector
When implementing constexpr std::vector I added a check for constant
evaluation in vector::_S_use_relocate(), so that we would not try to relocate
trivial objects by using memmove. But I put it in the constexpr function
that decides whether to relocate or not, and calls to that function are
always constant evaluated. This had the effect of disabling relocation
entirely, even in non-constexpr vectors.

This removes the check in _S_use_relocate() and modifies the actual
relocation algorithm, __relocate_a_1, to use the non-trivial
implementation instead of memmove when called during constant
evaluation.

libstdc++-v3/ChangeLog:

	* include/bits/stl_uninitialized.h (__relocate_a_1): Do not use
	memmove during constant evaluation.
	* include/bits/stl_vector.h (vector::_S_use_relocate()): Do not
	check is_constant_evaluated in always-constexpr function.
2021-11-26 22:28:48 +00:00
Jonathan Wakely
76c6be48b7 libstdc++: Remove workaround for FE bug in std::tuple [PR96592]
The FE bug was fixed, so we don't need this workaround now.

libstdc++-v3/ChangeLog:

	PR libstdc++/96592
	* include/std/tuple (tuple::is_constructible): Remove.
2021-11-26 22:26:08 +00:00
Harald Anlauf
4d540c7a4a Fortran: improve check of arguments to the RESHAPE intrinsic
gcc/fortran/ChangeLog:

	PR fortran/103411
	* check.c (gfc_check_reshape): Improve check of size of source
	array for the RESHAPE intrinsic against the given shape when pad
	is not given, and shape is a parameter.  Try other simplifications
	of shape.

gcc/testsuite/ChangeLog:

	PR fortran/103411
	* gfortran.dg/pr68153.f90: Adjust test to improved check.
	* gfortran.dg/reshape_7.f90: Likewise.
	* gfortran.dg/reshape_9.f90: New test.
2021-11-26 21:00:35 +01:00
Iain Sandoe
caa04517e6 libitm: Fix bootstrap for targets without HAVE_ELF_STYLE_WEAKREF.
Recent improvements to null address warnings notice that for
targets that do not support HAVE_ELF_STYLE_WEAKREF the dummy stub
implementation of __cxa_get_globals() means that the address can
never be null.

Fixed by removing the test for such targets.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>

libitm/ChangeLog:

	* eh_cpp.cc (GTM::gtm_thread::init_cpp_exceptions): If the
	target does not support HAVE_ELF_STYLE_WEAKREF then do not
	try to test the __cxa_get_globals against NULL.
2021-11-26 19:40:27 +00:00
Siddhesh Poyarekar
4a2007594c tree-object-size: Abstract object_sizes array
Put all accesses to object_sizes behind functions so that we can add
dynamic capability more easily.

gcc/ChangeLog:

	* tree-object-size.c (object_sizes_grow, object_sizes_release,
	object_sizes_unknown_p, object_sizes_get, object_size_set_force,
	object_sizes_set): New functions.
	(addr_object_size, compute_builtin_object_size,
	expr_object_size, call_object_size, unknown_object_size,
	merge_object_sizes, plus_stmt_object_size,
	cond_expr_object_size, collect_object_sizes_for,
	check_for_plus_in_loops_1, init_object_sizes,
	fini_object_sizes): Adjust.

Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>
2021-11-26 23:33:59 +05:30
Siddhesh Poyarekar
35c8bbe96b tree-object-size: Replace magic numbers with enums
A simple cleanup to allow inserting dynamic size code more easily.
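
A sketch of what replacing magic numbers with an enum can look like. The enumerator and function names here are hypothetical, not those used in tree-object-size.c; the bit meanings follow the documented second argument of __builtin_object_size.

```c
#include <stddef.h>

/* Hypothetical names for the bits of the object-size kind.  */
enum
{
  KIND_SUBOBJECT = 1,   /* bit 0: consider the closest surrounding subobject */
  KIND_MINIMUM = 2,     /* bit 1: compute the minimum rather than the maximum */
  KIND_END = 4          /* one past the largest valid kind (0..3) */
};

/* Value reported when nothing is known about the object:
   (size_t) -1 for maximum-size kinds, 0 for minimum-size kinds.  */
size_t
unknown_size (int kind)
{
  return (kind & KIND_MINIMUM) ? 0 : (size_t) -1;
}
```

Writing kind & KIND_MINIMUM instead of kind & 2 keeps the intent visible at each use.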

gcc/ChangeLog:

	* tree-object-size.c: New enum.
	(object_sizes, computed, addr_object_size,
	compute_builtin_object_size, expr_object_size, call_object_size,
	merge_object_sizes, plus_stmt_object_size,
	collect_object_sizes_for, init_object_sizes, fini_object_sizes,
	object_sizes_execute): Replace magic numbers with enums.

Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>
2021-11-26 23:33:56 +05:30
Roger Sayle
b41be002ed ivopts: Improve code generated for very simple loops.
This patch tidies up the code that GCC generates for simple loops,
by selecting/generating a simpler loop bound expression in ivopts.
The original motivation came from looking at the following loop (from
gcc.target/i386/pr90178.c)

int *find_ptr (int* mem, int sz, int val)
{
  for (int i = 0; i < sz; i++)
    if (mem[i] == val)
      return &mem[i];
  return 0;
}

which GCC currently compiles to:

find_ptr:
        movq    %rdi, %rax
        testl   %esi, %esi
        jle     .L4
        leal    -1(%rsi), %ecx
        leaq    4(%rdi,%rcx,4), %rcx
        jmp     .L3
.L7:    addq    $4, %rax
        cmpq    %rcx, %rax
        je      .L4
.L3:    cmpl    %edx, (%rax)
        jne     .L7
        ret
.L4:    xorl    %eax, %eax
        ret

Notice the relatively complex leal/leaq instructions, that result
from ivopts using the following expression for the loop bound:
inv_expr 2:     ((unsigned long) ((unsigned int) sz_8(D) + 4294967295)
		* 4 + (unsigned long) mem_9(D)) + 4

which results from NITERS being (unsigned int) sz_8(D) + 4294967295,
i.e. (sz - 1), and the logic in cand_value_at determining the bound
as BASE + NITERS*STEP at the start of the final iteration and as
BASE + NITERS*STEP + STEP at the end of the final iteration.

Ideally, we'd like the middle-end optimizers to simplify
BASE + NITERS*STEP + STEP as BASE + (NITERS+1)*STEP, especially
when NITERS already has the form BOUND-1, but with type conversions
and possible overflow to worry about, the above "inv_expr 2" is the
best that can be done by fold (without additional context information).

This patch improves ivopts' cand_value_at by instead of using just
the tree expression for NITERS, passing the data structure that
explains how that expression was derived.  This allows us to peek
under the surface to check that NITERS+1 doesn't overflow, and in
this patch to use the SSA_NAME already holding the required value.

In the motivating loop above, inv_expr 2 now becomes:
(unsigned long) sz_8(D) * 4 + (unsigned long) mem_9(D)
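
The two forms of inv_expr 2 can be compared directly in C. This is an illustrative sketch with hypothetical function names, assuming a 4-byte element size as in the example loop. The forms agree whenever the loop body runs (sz > 0) but differ for sz == 0, which is why fold cannot rewrite one into the other without knowing more about NITERS.

```c
#include <stdint.h>

/* Old form: BASE + NITERS*STEP + STEP, with NITERS computed as
   (unsigned int) sz + 4294967295, i.e. sz - 1 with 32-bit wraparound.  */
uint64_t
bound_old (uint64_t base, int32_t sz)
{
  uint32_t niters = (uint32_t) sz + 4294967295u;
  return ((uint64_t) niters * 4 + base) + 4;
}

/* New form: BASE + sz*STEP, using the original loop bound directly.  */
uint64_t
bound_new (uint64_t base, int32_t sz)
{
  return (uint64_t) (int64_t) sz * 4 + base;
}
```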

And as a result, on x86_64 we now generate:

find_ptr:
        movq    %rdi, %rax
        testl   %esi, %esi
        jle     .L4
        movslq  %esi, %rsi
        leaq    (%rdi,%rsi,4), %rcx
        jmp     .L3
.L7:    addq    $4, %rax
        cmpq    %rcx, %rax
        je      .L4
.L3:    cmpl    %edx, (%rax)
        jne     .L7
        ret
.L4:    xorl    %eax, %eax
        ret

This improvement required one minor tweak to GCC's testsuite for
gcc.dg/wrapped-binop-simplify.c, where we again generate better
code, and therefore no longer find as many optimization opportunities
in later passes (vrp2).

Previously:

void v1 (unsigned long *in, unsigned long *out, unsigned int n)
{
  int i;
  for (i = 0; i < n; i++) {
    out[i] = in[i];
  }
}

on x86_64 generated:
v1:	testl   %edx, %edx
        je      .L1
        movl    %edx, %edx
        xorl    %eax, %eax
.L3:	movq    (%rdi,%rax,8), %rcx
        movq    %rcx, (%rsi,%rax,8)
        addq    $1, %rax
        cmpq    %rax, %rdx
        jne     .L3
.L1:	ret

and now instead generates:
v1:	testl   %edx, %edx
        je      .L1
        movl    %edx, %edx
        xorl    %eax, %eax
        leaq    0(,%rdx,8), %rcx
.L3:	movq    (%rdi,%rax), %rdx
        movq    %rdx, (%rsi,%rax)
        addq    $8, %rax
        cmpq    %rax, %rcx
        jne     .L3
.L1:	ret

2021-11-26  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	* tree-ssa-loop-ivopts.c (cand_value_at): Take a class
	tree_niter_desc* argument instead of just a tree for NITER.
	If we require the iv candidate value at the end of the final
	loop iteration, try using the original loop bound as the
	NITER for sufficiently simple loops.
	(may_eliminate_iv): Update (only) call to cand_value_at.

gcc/testsuite/ChangeLog
	* gcc.dg/wrapped-binop-simplify.c: Update expected test result.
	* gcc.dg/tree-ssa/ivopts-5.c: New test case.
	* gcc.dg/tree-ssa/ivopts-6.c: New test case.
	* gcc.dg/tree-ssa/ivopts-7.c: New test case.
	* gcc.dg/tree-ssa/ivopts-8.c: New test case.
	* gcc.dg/tree-ssa/ivopts-9.c: New test case.
2021-11-26 17:22:10 +00:00
Jonathan Wakely
665f726b8a libstdc++: Ensure dg-add-options comes after dg-options
This is what the docs say is required.

libstdc++-v3/ChangeLog:

	* testsuite/29_atomics/atomic_float/1.cc: Reorder directives.
2021-11-26 15:11:58 +00:00
Jonathan Wakely
0a12bd92d1 libstdc++: Fix dg-do directive for tests supposed to be run
libstdc++-v3/ChangeLog:

	* testsuite/23_containers/unordered_map/modifiers/move_assign.cc:
	Change dg-do compile to run.
	* testsuite/27_io/basic_istream/extractors_character/wchar_t/lwg2499.cc:
	Likewise.
2021-11-26 15:11:58 +00:00
Jonathan Wakely
1ecc9ba578 libstdc++: Remove redundant xfail selectors in dg-do compile tests
An 'xfail' selector means the test is expected to fail at runtime, so is
ignored for a compile-only test. The way to mark a compile-only test as
failing is with dg-error (which these already do).

libstdc++-v3/ChangeLog:

	* testsuite/21_strings/basic_string_view/element_access/char/back_constexpr_neg.cc:
	Remove xfail selector.
	* testsuite/21_strings/basic_string_view/element_access/char/constexpr_neg.cc:
	Likewise.
	* testsuite/21_strings/basic_string_view/element_access/char/front_constexpr_neg.cc:
	Likewise.
	* testsuite/21_strings/basic_string_view/element_access/wchar_t/back_constexpr_neg.cc:
	Likewise.
	* testsuite/21_strings/basic_string_view/element_access/wchar_t/constexpr_neg.cc:
	Likewise.
	* testsuite/21_strings/basic_string_view/element_access/wchar_t/front_constexpr_neg.cc:
	Likewise.
	* testsuite/23_containers/span/101411.cc: Likewise.
	* testsuite/25_algorithms/copy/debug/constexpr_neg.cc: Likewise.
	* testsuite/25_algorithms/copy_backward/debug/constexpr_neg.cc:
	Likewise.
	* testsuite/25_algorithms/equal/constexpr_neg.cc: Likewise.
	* testsuite/25_algorithms/equal/debug/constexpr_neg.cc: Likewise.
	* testsuite/25_algorithms/lower_bound/debug/constexpr_partitioned_neg.cc:
	Likewise.
	* testsuite/25_algorithms/lower_bound/debug/constexpr_partitioned_pred_neg.cc:
	Likewise.
	* testsuite/25_algorithms/lower_bound/debug/constexpr_valid_range_neg.cc:
	Likewise.
	* testsuite/25_algorithms/upper_bound/debug/constexpr_partitioned_neg.cc:
	Likewise.
	* testsuite/25_algorithms/upper_bound/debug/constexpr_partitioned_pred_neg.cc:
	Likewise.
	* testsuite/25_algorithms/upper_bound/debug/constexpr_valid_range_neg.cc:
	Likewise.
2021-11-26 15:11:58 +00:00
Martin Liska
f1ec39c86c d: fix ASAN in option processing
Fixes:

==129444==ERROR: AddressSanitizer: global-buffer-overflow on address 0x00000666ca5c at pc 0x000000ef094b bp 0x7fffffff8180 sp 0x7fffffff8178
READ of size 4 at 0x00000666ca5c thread T0
    #0 0xef094a in parse_optimize_options ../../gcc/d/d-attribs.cc:855
    #1 0xef0d36 in d_handle_optimize_attribute ../../gcc/d/d-attribs.cc:916
    #2 0xef107e in d_handle_optimize_attribute ../../gcc/d/d-attribs.cc:887
    #3 0xff85b1 in decl_attributes(tree_node**, tree_node*, int, tree_node*) ../../gcc/attribs.c:829
    #4 0xef2a91 in apply_user_attributes(Dsymbol*, tree_node*) ../../gcc/d/d-attribs.cc:427
    #5 0xf7b7f3 in get_symbol_decl(Declaration*) ../../gcc/d/decl.cc:1346
    #6 0xf87bc7 in get_symbol_decl(Declaration*) ../../gcc/d/decl.cc:967
    #7 0xf87bc7 in DeclVisitor::visit(FuncDeclaration*) ../../gcc/d/decl.cc:808
    #8 0xf83db5 in DeclVisitor::build_dsymbol(Dsymbol*) ../../gcc/d/decl.cc:146

for the following test-case: gcc/testsuite/gdc.dg/attr_optimize1.d.

gcc/d/ChangeLog:

	* d-attribs.cc (parse_optimize_options): Check index before
	accessing cl_options.
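
The kind of index check described can be sketched as follows. The table and function names are hypothetical stand-ins, not the real cl_options data or the code in d-attribs.cc.

```c
#include <stddef.h>

/* Hypothetical stand-in for an option table and its element count.  */
struct option_desc { const char *name; unsigned int flags; };

static const struct option_desc option_table[] = {
  { "-O2", 1u },
  { "-funroll-loops", 2u },
};
static const size_t option_table_count
  = sizeof option_table / sizeof option_table[0];

/* Return the flags for INDEX, or 0 when INDEX lies outside the table.
   The bounds test happens before the array is read.  */
unsigned int
option_flags (size_t index)
{
  if (index >= option_table_count)
    return 0;
  return option_table[index].flags;
}
```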
2021-11-26 14:55:12 +01:00
Jan Hubicka
2cadaa1f13 Minor ipa-modref tweaks
To make dumps easier to read, modref now dumps the cgraph_node name rather than
the cfun name of the function being analysed.  I also fixed a minor issue with ECF
flags merging when updating the inline summary.

gcc/ChangeLog:

2021-11-26  Jan Hubicka  <hubicka@ucw.cz>

	* ipa-modref.c (analyze_function): Drop parameter F and dump
	cgraph node name rather than cfun name.
	(modref_generate): Update.
	(modref_summaries::insert): Update.
	(modref_summaries_lto::insert): Update.
	(pass_modref::execute): Update.
	(ipa_merge_modref_summary_after_inlining): Improve combining of
	ECF_FLAGS.
2021-11-26 13:54:41 +01:00
Jan Hubicka
906cad89b3 Fix failure in inline-9.c testcase
gcc/testsuite/ChangeLog:

2021-11-26  Jan Hubicka  <hubicka@ucw.cz>

	* gcc.dg/ipa/inline-9.c: Update template.c
2021-11-26 13:49:01 +01:00
Jonathan Wakely
0178b73a02 libstdc++: Move std::to_address tests to more appropriate place
Some of the checks in 20_util/pointer_traits/lwg3545.cc really belong in
20_util/to_address/lwg3545 instead.

This also fixes the ordering of the dg-options and dg-do directives.

libstdc++-v3/ChangeLog:

	* testsuite/20_util/pointer_traits/lwg3545.cc: Move to_address
	tests to ...
	* testsuite/20_util/to_address/lwg3545.cc: ... here. Add -std
	option before checking effective target.
2021-11-26 12:38:35 +00:00
Jan Hubicka
a70faf6e4d Fix handling of in_flags in update_escape_summary_1
update_escape_summary_1 has a thinko: it computes the proper min_flags but then
stores the original value (ignoring whether there was a dereference
in the escape point).

	PR ipa/102943
	* ipa-modref.c (update_escape_summary_1): Fix handling of min_flags.
2021-11-26 13:38:00 +01:00
Jakub Jelinek
8dedf065af c++: Fix up taking address of an immediate function diagnostics [PR102753]
On Wed, Oct 20, 2021 at 07:16:44PM -0400, Jason Merrill wrote:
> or an unevaluated operand, or a subexpression of an immediate invocation.
>
> Hmm...that suggests that in consteval23.C, bar(foo) should also be OK,

The following patch handles that by removing the diagnostics about taking
address of immediate function from cp_build_addr_expr_1, and instead diagnoses
it in cp_fold_r.  To do that with proper locations, the patch attempts to
ensure that ADDR_EXPRs of immediate functions get EXPR_LOCATION set and
adds a PTRMEM_CST_LOCATION for PTRMEM_CSTs.  Also, evaluation of
std::source_location::current() is moved from genericization to cp_fold.

2021-11-26  Jakub Jelinek  <jakub@redhat.com>

	PR c++/102753
	* cp-tree.h (struct ptrmem_cst): Add locus member.
	(PTRMEM_CST_LOCATION): Define.
	* tree.c (make_ptrmem_cst): Set PTRMEM_CST_LOCATION to input_location.
	(cp_expr_location): Return PTRMEM_CST_LOCATION for PTRMEM_CST.
	* typeck.c (build_x_unary_op): Overwrite PTRMEM_CST_LOCATION for
	PTRMEM_CST instead of calling maybe_wrap_with_location.
	(cp_build_addr_expr_1): Don't diagnose taking address of
	immediate functions here.  Instead when taking their address make
	sure the returned ADDR_EXPR has EXPR_LOCATION set.
	(expand_ptrmemfunc_cst): Copy over PTRMEM_CST_LOCATION to ADDR_EXPR's
	EXPR_LOCATION.
	(convert_for_assignment): Use cp_expr_loc_or_input_loc instead of
	EXPR_LOC_OR_LOC.
	* pt.c (tsubst_copy): Use build1_loc instead of build1.  Ensure
	ADDR_EXPR of immediate function has EXPR_LOCATION set.
	* cp-gimplify.c (cp_fold_r): Diagnose taking address of immediate
	functions here.  For consteval if don't walk THEN_CLAUSE.
	(cp_genericize_r): Move evaluation of calls to
	std::source_location::current from here to...
	(cp_fold): ... here.  Don't assert calls to immediate functions must
	be source_location_current_p, instead only constant evaluate
	calls to source_location_current_p.

	* g++.dg/cpp2a/consteval20.C: Add some extra tests.
	* g++.dg/cpp2a/consteval23.C: Likewise.
	* g++.dg/cpp2a/consteval25.C: New test.
	* g++.dg/cpp2a/srcloc20.C: New test.
2021-11-26 10:16:20 +01:00
konglin1
90cb088ece i386: vcvtph2ps and vcvtps2ph should be used to convert _Float16 to SFmode with -mf16c [PR 102811]
Add define_insn extendhfsf2 and truncsfhf2 for target_f16c.

gcc/ChangeLog:

	PR target/102811
	* config/i386/i386.c (ix86_can_change_mode_class): Allow 16 bit data in XMM register
	for TARGET_SSE2.
	* config/i386/i386.md (extendhfsf2): Add extendhfsf2 for TARGET_F16C.
	(extendhfdf2): Restrict extendhfdf for TARGET_AVX512FP16 only.
	(*extendhf<mode>2): Rename from extendhf<mode>2.
	(truncsfhf2): Likewise.
	(truncdfhf2): Likewise.
	(*trunc<mode>2): Likewise.

gcc/testsuite/ChangeLog:

	PR target/102811
	* gcc.target/i386/pr90773-21.c: Allow pextrw instead of movw.
	* gcc.target/i386/pr90773-23.c: Ditto.
	* gcc.target/i386/avx512vl-vcvtps2ph-pr102811.c: New test.
2021-11-26 09:29:10 +08:00
liuhongt
379be00f45 Fix typo in r12-5486.
gcc/ChangeLog:

	PR middle-end/103419
	* match.pd: Fix typo, use the type of second parameter, not
	first one.
2021-11-26 08:59:13 +08:00
GCC Administrator
091ccc066d Daily bump.
2021-11-26 00:16:26 +00:00
Jonathan Wakely
9664c46545 libstdc++: Remove dg-error that no longer happens
There was a c++11_only dg-error in this testcase, for a "body of
constexpr function is not a return statement" diagnostic that was bogus,
but happened because the return statement was ill-formed. A change to
G++ earlier this month means that diagnostic is no longer emitted, so
remove the dg-error.

libstdc++-v3/ChangeLog:

	* testsuite/20_util/tuple/comparison_operators/overloaded2.cc:
	Remove dg-error for C++11_only error.
2021-11-25 23:12:15 +00:00
Jonathan Wakely
b8018e5c5e libstdc++: Make std::pointer_traits SFINAE-friendly [PR96416]
This implements the resolution I'm proposing for LWG 3545, to avoid hard
errors when using std::to_address for types that make pointer_traits
ill-formed.

Consistent with std::iterator_traits, instantiating std::pointer_traits
for a non-pointer type will be well-formed, but give an empty type with
no member types. This avoids the problematic cases for std::to_address.
Additionally, the pointer_to member is now only declared when the
element type is not cv void (and for C++20, when the function body would
be well-formed). The rebind member was already SFINAE-friendly in our
implementation.

libstdc++-v3/ChangeLog:

	PR libstdc++/96416
	* include/bits/ptr_traits.h (pointer_traits): Reimplement to be
	SFINAE-friendly (LWG 3545).
	* testsuite/20_util/pointer_traits/lwg3545.cc: New test.
	* testsuite/20_util/to_address/1_neg.cc: Adjust dg-error line.
	* testsuite/20_util/to_address/lwg3545.cc: New test.
2021-11-25 23:12:14 +00:00
Jan Hubicka
1b0acc4b80 Remove forgotten early return in ipa_value_range_from_jfunc
gcc/ChangeLog:

	* ipa-cp.c (ipa_value_range_from_jfunc): Remove forgotten early return.

gcc/testsuite/ChangeLog:

	* gcc.dg/ipa/inline10.c: New test.
2021-11-25 23:58:48 +01:00
Jonathan Wakely
82c3657dd7 libstdc++: Do not use memset in constexpr calls to ranges::fill_n [PR101608]
libstdc++-v3/ChangeLog:

	PR libstdc++/101608
	* include/bits/ranges_algobase.h (__fill_n_fn): Check for
	constant evaluation before using memset.
	* testsuite/25_algorithms/fill_n/constrained.cc: Check
	byte-sized values as well.
2021-11-25 20:03:13 +00:00
Roger Sayle
6ea5fb3cc7 PR middle-end/103406: Check for Inf before simplifying x-x.
This is a simple one line fix to the regression PR middle-end/103406,
where x - x is being folded to 0.0 even when x is +Inf or -Inf.
In GCC 11 and previously, we'd check whether the type honored NaNs
(which implicitly covered the case where the type honors infinities),
but my patch to test whether the operand could potentially be NaN
failed to also check whether the operand could potentially be Inf.
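
The underlying IEEE 754 behaviour is easy to observe from C. This is a minimal sketch with a hypothetical function name; the volatile qualifier is only there to keep the subtraction from being folded at compile time.

```c
#include <math.h>

/* Compute x - x at run time.  For finite x the result is 0.0,
   but for +Inf or -Inf the result is NaN, so folding x - x
   to 0.0 is only valid when x cannot be infinite (or NaN).  */
double
self_difference (double x)
{
  volatile double v = x;
  return v - v;
}
```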

2021-11-25  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	PR middle-end/103406
	* match.pd (minus @0 @0): Check tree_expr_maybe_infinite_p.

gcc/testsuite/ChangeLog
	PR middle-end/103406
	* gcc.dg/pr103406.c: New test case.
2021-11-25 19:02:06 +00:00
Florian Weimer
9488d24206 libgcc: Split FDE search code from PT_GNU_EH_FRAME lookup
This allows switching to a different implementation for
PT_GNU_EH_FRAME lookup in a subsequent commit.

This moves some of the PT_GNU_EH_FRAME parsing out of the glibc loader
lock that is implied by dl_iterate_phdr.  However, the FDE is already
parsed outside the lock before this change, so this does not introduce
additional crashes in case of a concurrent dlclose.

libgcc/ChangeLog:

	* unwind-dw2-fde-dip.c (struct unw_eh_callback_data): Add hdr.
	Remove func, ret.
	(find_fde_tail): New function.  Split from
	_Unwind_IteratePhdrCallback.  Move the result initialization
	from _Unwind_Find_FDE.
	(_Unwind_Find_FDE): Updated to call find_fde_tail.
2021-11-25 18:43:55 +01:00
Martin Jambor
5bc4cb0412 ipa: Teach IPA-CP transformation about IPA-SRA modifications (PR 103227)
PR 103227 exposed an issue with ordering of transformations of IPA
passes.  IPA-CP can create clones for constants passed by reference
and at the same time IPA-SRA can also decide that the parameter does
not need to be a pointer (or an aggregate) and plan to convert it
into (a) simple scalar(s).  Because no intermediate clone is created
just for the purpose of ordering the transformations and because
IPA-SRA transformation is implemented as part of clone
materialization, the IPA-CP transformation happens only afterwards,
reversing the order of the transformations compared to the ordering of
analyses.

IPA-CP transformation looks at planned substitutions for values passed
by reference or in aggregates but finds that all the relevant
parameters no longer exist.  Currently it then simply gives
up, leading to clones created for no good purpose (and a huge regression
of 548.exchange_r).  This patch teaches it to recognize the situation,
look up the new scalarized parameter and perform value substitution on
it.  On my desktop this has recovered the lost exchange2 run-time (and
some more).

I have disabled IPA-SRA in a Fortran testcase so that the dumping from
the transformation phase can still be matched in order to verify that
IPA-CP understands the IL after verifying that it does the right thing
also with IPA-SRA.

gcc/ChangeLog:

2021-11-23  Martin Jambor  <mjambor@suse.cz>

	PR ipa/103227
	* ipa-prop.h (ipa_get_param): New overload.  Move bits of the existing
	one to the new one.
	* ipa-param-manipulation.h (ipa_param_adjustments): New member
	function get_updated_index_or_split.
	* ipa-param-manipulation.c
	(ipa_param_adjustments::get_updated_index_or_split): New function.
	* ipa-prop.c (adjust_agg_replacement_values): Reimplement, add
	capability to identify scalarized parameters and perform substitution
	on them.
	(ipcp_transform_function): Create descriptors earlier, handle new
	return values of adjust_agg_replacement_values.

gcc/testsuite/ChangeLog:

2021-11-23  Martin Jambor  <mjambor@suse.cz>

	PR ipa/103227
	* gcc.dg/ipa/pr103227-1.c: New test.
	* gcc.dg/ipa/pr103227-3.c: Likewise.
	* gcc.dg/ipa/pr103227-2.c: Likewise.
	* gfortran.dg/pr53787.f90: Disable IPA-SRA.
2021-11-25 18:16:31 +01:00
Aldy Hernandez
415f9ee404 path solver: Revert computation of ranges in gimple order.
Revert the patch below, as it may slow down compilation with large CFGs.

	commit 8acbd7bef6
	Author: Aldy Hernandez <aldyh@redhat.com>
	Date:   Wed Nov 24 09:43:36 2021 +0100

	    path solver: Compute ranges in path in gimple order.

gcc/ChangeLog:

	* gimple-range-path.cc (path_range_query::compute_ranges_defined): Remove.
	(path_range_query::compute_ranges_in_block): Revert to bitmap order.
	* gimple-range-path.h: Remove compute_ranges_defined.
2021-11-25 17:34:31 +01:00
Andrew Stubbs
58d50a5dd6 amdgcn: Fix ICE generating CFI [PR103396]
gcc/ChangeLog:

	PR target/103396
	* config/gcn/gcn.c (move_callee_saved_registers): Ensure that the
	number of spilled registers is counted correctly.
2021-11-25 16:04:00 +00:00
Andrew MacLeod
1598bd47b2 Add the testcase for this PR to the testsuite.
Various ranger-enabled patches like threading and VRP2 can do this now, so add the testcase for posterity.

	gcc/testsuite/
	PR tree-optimization/102648
	* gcc.dg/pr102648.c: New.
2021-11-25 09:02:28 -05:00
Jan Hubicka
a2ae4e9ac3 Initialize node_is_self_scc in ipa_node_params::ipa_node_params
gcc/ChangeLog:

2021-11-25  Jan Hubicka  <hubicka@ucw.cz>

	* ipa-prop.h (ipa_node_params::ipa_node_params): Initialize
	node_is_self_scc.
2021-11-25 14:48:14 +01:00
Andrew MacLeod
661c02e54e Check for equivalences between PHI argument and def.
If a PHI argument on an edge is equivalent with the DEF, then it doesn't
provide any new information, defer processing it unless they are all
equivalences.

	PR tree-optimization/103359
	gcc/
	* gimple-range-fold.cc (fold_using_range::range_of_phi): If arg is
	equivalent to def, don't initially include its range.

	gcc/testsuite/
	* gcc.dg/pr103359.c: New.
2021-11-25 08:44:27 -05:00
Jan Hubicka
f4e470d44e Do not check gimple_static_chain in ref_maybe_used_by_call_p_1
gcc/ChangeLog:

2021-11-25  Jan Hubicka  <hubicka@ucw.cz>

	* tree-ssa-alias.c (ref_maybe_used_by_call_p_1): Do not check
	gimple_static_chain.
2021-11-25 14:44:04 +01:00
Richard Biener
4eda2eee0e Remove dead code and function
The only use of get_alias_symbol is gated by a gcc_unreachable (),
so the following patch gets rid of it.

2021-11-24  Richard Biener  <rguenther@suse.de>

	* cgraphunit.c (symbol_table::output_weakrefs): Remove
	unreachable init.
	(get_alias_symbol): Remove now unused function.
2021-11-25 14:23:44 +01:00
Richard Biener
8addb0b127 Continue RTL verifying in rtl_verify_fallthru
One case used fatal_insn, which does not return.  That is not
intended, as can be seen from the following err = 1.  The following
change refactors this to inline the relevant parts of fatal_insn
instead and continue validating the RTL IL.

2021-11-25  Richard Biener  <rguenther@suse.de>

	* cfgrtl.c (rtl_verify_fallthru): Do not stop verifying
	with fatal_insn.
	(skip_insns_after_block): Remove unreachable break and continue.
2021-11-25 14:23:44 +01:00
Richard Biener
0fdd1804ee Remove never looping loop in label_rtx_for_bb
This refactors the IL "walk" in a way to avoid the loop which will
never iterate.

2021-11-25  Richard Biener  <rguenther@suse.de>

	* cfgexpand.c (label_rtx_for_bb): Remove dead loop construct.
2021-11-25 14:23:44 +01:00
Richard Biener
555b8cc390 Introduce REG_SET_EMPTY_P
This avoids a -Wunreachable-code diagnostic with EXECUTE_IF_*
in case the first iteration will exit the loop.  For the case
in thread_jump using bitmap_empty_p looks preferable so this
adds REG_SET_EMPTY_P to make that available for register sets.

2021-11-25  Richard Biener  <rguenther@suse.de>

	* regset.h (REG_SET_EMPTY_P): New macro.
	* cfgcleanup.c (thread_jump): Use REG_SET_EMPTY_P.
2021-11-25 14:23:44 +01:00
Martin Liska
1167d4890f docs: Add missing @option keyword.
gcc/ChangeLog:

	* doc/invoke.texi: Use @option for -Wuninitialized.
2021-11-25 12:15:20 +01:00
Aldy Hernandez
d1c1919ef8 path solver: Move boolean import code to compute_imports.
In a follow-up patch I will be pruning the set of exported ranges
within blocks to avoid unnecessary work.  In order to do this, all the
interesting SSA names must be in the internal import bitmap ahead of
time.  I had already abstracted them out into compute_imports, but I
missed the boolean code.  This fixes the oversight.

There's a net gain of 25 threadable paths, which is unexpected but
welcome.

Tested on x86-64 & ppc64le Linux.

gcc/ChangeLog:

	PR tree-optimization/103254
	* gimple-range-path.cc (path_range_query::compute_ranges): Move
	exported boolean code...
	(path_range_query::compute_imports): ...here.
2021-11-25 11:52:23 +01:00
Aldy Hernandez
8acbd7bef6 path solver: Compute ranges in path in gimple order.
Andrew's patch for PR103254 papered over some underlying
performance issues in the path solver that I'd like to address.

We are currently solving the SSA names defined in the current block in
bitmap order, which is effectively random order.  This is
causing unnecessary recursion in gori.  This patch changes the order
to gimple order, thus solving dependencies before uses.
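
As a rough illustration of why the visiting order matters, here is a toy model of an on-demand solver, not GCC code.  The struct toy_block type, the solve and max_recursion_depth functions, and the single-dependency layout are all hypothetical and exist only for this sketch.  Solving a name first solves the name it depends on, so visiting definitions before uses keeps the recursion flat.

```c
#include <assert.h>
#include <string.h>

#define N_NAMES 4

/* Hypothetical toy model of one block: dep[i] is the single name that
   name i is computed from, or -1 if it has no in-block dependency.  */
struct toy_block
{
  int dep[N_NAMES];
  int solved[N_NAMES];
  int max_depth;	/* Deepest recursion level reached so far.  */
};

/* Solve NAME on demand, recursing into its dependency first.  */
static void
solve (struct toy_block *b, int name, int depth)
{
  assert (name >= 0 && name < N_NAMES);
  if (b->solved[name])
    return;
  if (depth > b->max_depth)
    b->max_depth = depth;
  if (b->dep[name] >= 0)
    solve (b, b->dep[name], depth + 1);
  b->solved[name] = 1;
}

/* Visit every name in ORDER and report the deepest recursion needed.
   A result of 0 means no name had to recurse into an unsolved one.  */
int
max_recursion_depth (const int dep[N_NAMES], const int order[N_NAMES])
{
  struct toy_block b;
  memcpy (b.dep, dep, sizeof b.dep);
  memset (b.solved, 0, sizeof b.solved);
  b.max_depth = 0;
  for (int i = 0; i < N_NAMES; i++)
    solve (&b, order[i], 0);
  return b.max_depth;
}
```

With dep = { 2, 3, -1, 0 }, visiting by index (0, 1, 2, 3) has to recurse to reach name 2, while visiting in statement order (2, 0, 3, 1) never recurses.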

There is no change in threadable paths with this change.

Tested on x86-64 & ppc64le Linux.

gcc/ChangeLog:

	PR tree-optimization/103254
	* gimple-range-path.cc (path_range_query::compute_ranges_defined): New.
	(path_range_query::compute_ranges_in_block): Move to
	compute_ranges_defined.
	* gimple-range-path.h (compute_ranges_defined): New.
2021-11-25 11:51:21 +01:00
Jakub Jelinek
94912212d3 match.pd: Fix up the recent bitmask_inv_cst_vector_p simplification [PR103417]
The following testcase is miscompiled since the r12-5489-g0888d6bbe97e10
changes.
The simplification triggers on
(x & 4294967040U) >= 0U
and turns it into:
x <= 255U
which is incorrect; it should fold to 1 because unsigned >= 0U is always
true and normally the
/* Non-equality compare simplifications from fold_binary  */
     (if (wi::to_wide (cst) == min)
       (if (cmp == GE_EXPR)
        { constant_boolean_node (true, type); })
simplification folds that, but this simplification was done earlier.

The simplification correctly doesn't include lt, which shouldn't be
handled for the same reason: we'll fold it to 0 elsewhere.

But, IMNSHO while it isn't incorrect to handle le and gt there, it is
unnecessary.  Because (x & cst) <= 0U and (x & cst) > 0U should
never appear, again in
/* Non-equality compare simplifications from fold_binary  */
we have a simplification for it:
       (if (cmp == LE_EXPR)
        (eq @2 @1))
       (if (cmp == GT_EXPR)
        (ne @2 @1))))
This is done for
  (cmp (convert?@2 @0) uniform_integer_cst_p@1)
and so should be done for both integers and vectors.
As the bitmask_inv_cst_vector_p simplification only handles
eq and ne for signed types, I think it can be simplified to just
the following patch.
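
For illustration, the wrong-code issue described above can be reproduced by hand in plain C.  This is a minimal sketch, not the new pr103417.c test; the function names original_cmp, wrongly_folded_cmp and demonstrate are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The comparison as written in the source.  An unsigned value is
   always >= 0U, so this is true for every x.  */
bool
original_cmp (uint32_t x)
{
  return (x & 4294967040U) >= 0U;
}

/* What the faulty simplification turned it into.  This is only true
   for x in the range 0 .. 255.  */
bool
wrongly_folded_cmp (uint32_t x)
{
  return x <= 255U;
}

/* Any x above 255 exposes the difference between the two forms.  */
void
demonstrate (void)
{
  assert (original_cmp (256U));
  assert (!wrongly_folded_cmp (256U));
}
```

The two functions agree for x up to 255 and disagree for everything larger, which is why the fold to x <= 255U is only valid for eq and ne style tests against the mask, not for >= 0U.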

2021-11-25  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/103417
	* match.pd ((X & Y) CMP 0): Only handle eq and ne.  Commonalize
	common tests.

	* gcc.c-torture/execute/pr103417.c: New test.
2021-11-25 10:47:24 +01:00
Jakub Jelinek
531dae29a6 bswap: Improve perform_symbolic_merge [PR103376]
Thinking more about it, perhaps we could do more for BIT_XOR_EXPR.
We could allow masked1 == masked2 case for it, but would need to
do something different than the
  n->n = n1->n | n2->n;
we do on all the bytes together.
In particular, for masked1 == masked2 if masked1 != 0 (well, for 0
both variants are the same) and masked1 != 0xff we would need to
clear the corresponding n->n byte instead of setting it to the input,
as x ^ x = 0 (but if x and y are unknown, the result is unknown
too).  Now, for plus it is much harder, because not only
do we not know the result for non-zero operands, but it can also
modify upper bytes.  So perhaps, when both masked1 and masked2 are
non-zero for the current byte, we could set the resulting byte to 0xff
(unknown) iff the byte above it is 0 in both operands, and set that
upper resulting byte to 0xff too.
Also, even for | we could, instead of returning NULL, just set the
resulting byte to 0xff if the two bytes differ; perhaps it will be
masked off later on.

This patch just punts on plus if both corresponding bytes are non-zero,
otherwise implements the above.
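
The per-byte rules can be sketched in standalone C as follows.  This is a simplified illustration, not the code in gimple-ssa-store-merging.c: the merge_markers function and the merge_op enum are hypothetical, and MARKER_BYTE_UNKNOWN is assumed to be 0xff, as the "0xff (unknown)" wording above suggests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed value, taken from the "0xff (unknown)" wording above.  */
#define MARKER_BYTE_UNKNOWN 0xffu

/* Hypothetical stand-in for the tree codes involved.  */
enum merge_op { OP_IOR, OP_XOR, OP_PLUS };

/* Merge two symbolic numbers N1 and N2 byte by byte.  Each byte is a
   marker: 0 for a zero byte, MARKER_BYTE_UNKNOWN for an unknown byte,
   anything else names a source byte.  Returns false when the merge
   punts; otherwise stores the merged markers in *RESULT.  */
bool
merge_markers (enum merge_op op, uint64_t n1, uint64_t n2, uint64_t *result)
{
  assert (result != 0);
  uint64_t out = n1 | n2;

  for (int shift = 0; shift < 64; shift += 8)
    {
      uint64_t mask = (uint64_t) MARKER_BYTE_UNKNOWN << shift;
      uint64_t masked1 = n1 & mask;
      uint64_t masked2 = n2 & mask;

      /* If either byte is zero, n1 | n2 already holds the answer.  */
      if (masked1 == 0 || masked2 == 0)
	continue;

      /* Plus with two non-zero bytes could carry upwards: punt.  */
      if (op == OP_PLUS)
	return false;

      if (masked1 != masked2)
	out |= mask;		/* Different inputs: byte is unknown.  */
      else if (op == OP_XOR && masked1 != mask)
	out &= ~mask;		/* x ^ x = 0 for a known byte.  */
      /* Otherwise x | x keeps the marker, and unknown stays unknown.  */
    }

  *result = out;
  return true;
}
```

For example, merging 0x0201 with 0x0300 under OP_IOR yields 0xff01: the low byte comes from the first operand and the second byte becomes unknown because the two markers differ.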

2021-11-25  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/103376
	* gimple-ssa-store-merging.c (perform_symbolic_merge): For
	BIT_IOR_EXPR, if masked1 && masked2 && masked1 != masked2, don't
	punt, but set the corresponding result byte to MARKER_BYTE_UNKNOWN.
	For BIT_XOR_EXPR similarly and if masked1 == masked2 and the
	byte isn't MARKER_BYTE_UNKNOWN, set the corresponding result byte to
	0.

	* gcc.dg/optimize-bswapsi-7.c: New test.
2021-11-25 10:38:33 +01:00
Jakub Jelinek
8e86218f05 c++: Return early in apply_late_template_attributes if there are no late attribs [PR101180]
The r12-299-ga0fdff3cf33f7284 change can result in cplus_decl_attributes being called
even if there are no late attributes (but at least one early attribute) in
apply_late_template_attributes.  This patch fixes that: if there are no late
attrs, we return early and only arrange for TYPE_ATTRIBUTES to get the early
attribute list.

2021-11-25  Jakub Jelinek  <jakub@redhat.com>

	PR c++/101180
	* pt.c (apply_late_template_attributes): Return early if there are no
	dependent attributes.
2021-11-25 08:39:35 +01:00