Commit Graph

191471 Commits

Author SHA1 Message Date
Jakub Jelinek
e9bf6d6b0e veclower: Fix up -fcompare-debug issue in expand_vector_comparison [PR104307]
The following testcase fails -fcompare-debug, because expand_vector_comparison
since r11-1786-g1ac9258cca8030745d3c0b8f63186f0adf0ebc27 sets
vec_cond_expr_only when it sees some use other than VEC_COND_EXPR that uses
the lhs in its condition.
Obviously we should ignore debug stmts when doing so, e.g. by not pushing
them to uses.
That would be a 2 liner change, but while looking at it, I'm also worried
about VEC_COND_EXPRs that would use the lhs in more than one operand,
like VEC_COND_EXPR <lhs, lhs, something> or VEC_COND_EXPR <lhs, something, lhs>
(sure, they ought to be folded, but what if they weren't).  Because if
something like that happens, then FOR_EACH_IMM_USE_FAST would push the same
stmt multiple times and expand_vector_condition can return true even when
it modifies it (for vector bool masking).
And lastly, it seems quite wasteful to safe_push statements that will just
cause vec_cond_expr_only = false; and break; in the second loop, both for
cases like 1000 immediate non-VEC_COND_EXPR uses and for cases like
999 VEC_COND_EXPRs with lhs in cond followed by a single non-VEC_COND_EXPR
use.  So this patch only pushes VEC_COND_EXPRs there.
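
The filtering described above can be modelled outside GCC. The Python sketch below is illustrative only: `Stmt`, its `kind` strings and `collect_vec_cond_uses` are hypothetical names, not GCC data structures or the code of the patch. It skips debug statements, records only VEC_COND_EXPR-like uses that mention lhs solely in the condition, and gives up on the first use of any other kind.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stmt:
    """Hypothetical stand-in for a GIMPLE statement (not GCC's real type)."""
    kind: str               # "debug", "vec_cond", or any other string
    operands: tuple = ()    # for "vec_cond": (cond, then_value, else_value)


def collect_vec_cond_uses(lhs, immediate_uses):
    """Return (vec_cond_expr_only, uses) for the statements that use lhs."""
    uses = []
    for stmt in immediate_uses:
        if stmt.kind == "debug":
            # Debug statements must not influence the generated code.
            continue
        if stmt.kind == "vec_cond":
            cond, then_value, else_value = stmt.operands
            if cond == lhs and then_value != lhs and else_value != lhs:
                uses.append(stmt)
                continue
        # Any other use (including a vec_cond that also uses lhs as a value)
        # decides the answer, so there is no point in recording it.
        return False, []
    return True, uses
```

With this model, a list containing one qualifying `vec_cond` statement plus a debug statement gives the same answer as the list without the debug statement, which is the property -fcompare-debug checks.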

2022-02-01  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/104307
	* tree-vect-generic.cc (expand_vector_comparison): Don't push debug
	stmts to uses vector, just set vec_cond_expr_only to false for
	non-VEC_COND_EXPRs instead of pushing them into uses.  Treat
	VEC_COND_EXPRs that use lhs not just in rhs1, but rhs2 or rhs3 too
	like non-VEC_COND_EXPRs.

	* gcc.target/i386/pr104307.c: New test.
2022-02-01 16:02:54 +01:00
Bill Schmidt
7e83607907 rs6000: Don't #ifdef "short" built-in names
It was recently pointed out that we get anomalous behavior when using
__attribute__((target)) to select a CPU.  As an example, when building for
-mcpu=power8 but using __attribute__((target("mcpu=power10"))), it is legal
to call __builtin_vec_mod, but not vec_mod, even though these are
equivalent.  This is because the equivalence is established with a #define
that is guarded by #ifdef _ARCH_PWR10.

This goofy behavior occurs with both the old builtins support and the
new.  One of the goals of the new builtins support was to make sure all
appropriate interfaces are available using __attribute__((target)), so I
failed in this respect.  This patch corrects the problem by removing the
ifdef.  Note that in a few cases we use an ifdef in a way that can't be
overridden by __attribute__((target)), and we need to keep those.  For
example, #ifdef __PPU__ is still appropriate.

2022-01-06  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
	* config/rs6000/rs6000-overload.def (VEC_ABSD): Remove #ifdef token.
	(VEC_BLENDV): Likewise.
	(VEC_BPERM): Likewise.
	(VEC_CFUGE): Likewise.
	(VEC_CIPHER_BE): Likewise.
	(VEC_CIPHERLAST_BE): Likewise.
	(VEC_CLRL): Likewise.
	(VEC_CLRR): Likewise.
	(VEC_CMPNEZ): Likewise.
	(VEC_CNTLZ): Likewise.
	(VEC_CNTLZM): Likewise.
	(VEC_CNTTZM): Likewise.
	(VEC_CNTLZ_LSBB): Likewise.
	(VEC_CNTM): Likewise.
	(VEC_CNTTZ): Likewise.
	(VEC_CNTTZ_LSBB): Likewise.
	(VEC_CONVERT_4F32_8F16): Likewise.
	(VEC_DIV): Likewise.
	(VEC_DIVE): Likewise.
	(VEC_EQV): Likewise.
	(VEC_EXPANDM): Likewise.
	(VEC_EXTRACT_FP_FROM_SHORTH): Likewise.
	(VEC_EXTRACT_FP_FROM_SHORTL): Likewise.
	(VEC_EXTRACTH): Likewise.
	(VEC_EXTRACTL): Likewise.
	(VEC_EXTRACTM): Likewise.
	(VEC_EXTRACT4B): Likewise.
	(VEC_EXTULX): Likewise.
	(VEC_EXTURX): Likewise.
	(VEC_FIRSTMATCHINDEX): Likewise.
	(VEC_FIRSTMACHOREOSINDEX): Likewise.
	(VEC_FIRSTMISMATCHINDEX): Likewise.
	(VEC_FIRSTMISMATCHOREOSINDEX): Likewise.
	(VEC_GB): Likewise.
	(VEC_GENBM): Likewise.
	(VEC_GENHM): Likewise.
	(VEC_GENWM): Likewise.
	(VEC_GENDM): Likewise.
	(VEC_GENQM): Likewise.
	(VEC_GENPCVM): Likewise.
	(VEC_GNB): Likewise.
	(VEC_INSERTH): Likewise.
	(VEC_INSERTL): Likewise.
	(VEC_INSERT4B): Likewise.
	(VEC_LXVL): Likewise.
	(VEC_MERGEE): Likewise.
	(VEC_MERGEO): Likewise.
	(VEC_MOD): Likewise.
	(VEC_MSUB): Likewise.
	(VEC_MULH): Likewise.
	(VEC_NAND): Likewise.
	(VEC_NCIPHER_BE): Likewise.
	(VEC_NCIPHERLAST_BE): Likewise.
	(VEC_NEARBYINT): Likewise.
	(VEC_NMADD): Likewise.
	(VEC_ORC): Likewise.
	(VEC_PDEP): Likewise.
	(VEC_PERMX): Likewise.
	(VEC_PEXT): Likewise.
	(VEC_POPCNT): Likewise.
	(VEC_PARITY_LSBB): Likewise.
	(VEC_REPLACE_ELT): Likewise.
	(VEC_REPLACE_UN): Likewise.
	(VEC_REVB): Likewise.
	(VEC_RINT): Likewise.
	(VEC_RLMI): Likewise.
	(VEC_RLNM): Likewise.
	(VEC_SBOX_BE): Likewise.
	(VEC_SIGNEXTI): Likewise.
	(VEC_SIGNEXTLL): Likewise.
	(VEC_SIGNEXTQ): Likewise.
	(VEC_SLDB): Likewise.
	(VEC_SLV): Likewise.
	(VEC_SPLATI): Likewise.
	(VEC_SPLATID): Likewise.
	(VEC_SPLATI_INS): Likewise.
	(VEC_SQRT): Likewise.
	(VEC_SRDB): Likewise.
	(VEC_SRV): Likewise.
	(VEC_STRIL): Likewise.
	(VEC_STRIL_P): Likewise.
	(VEC_STRIR): Likewise.
	(VEC_STRIR_P): Likewise.
	(VEC_STXVL): Likewise.
	(VEC_TERNARYLOGIC): Likewise.
	(VEC_TEST_LSBB_ALL_ONES): Likewise.
	(VEC_TEST_LSBB_ALL_ZEROS): Likewise.
	(VEC_VEE): Likewise.
	(VEC_VES): Likewise.
	(VEC_VIE): Likewise.
	(VEC_VPRTYB): Likewise.
	(VEC_VSCEEQ): Likewise.
	(VEC_VSCEGT): Likewise.
	(VEC_VSCELT): Likewise.
	(VEC_VSCEUO): Likewise.
	(VEC_VSEE): Likewise.
	(VEC_VSES): Likewise.
	(VEC_VSIE): Likewise.
	(VEC_VSTDC): Likewise.
	(VEC_VSTDCN): Likewise.
	(VEC_VTDC): Likewise.
	(VEC_XL): Likewise.
	(VEC_XL_BE): Likewise.
	(VEC_XL_LEN_R): Likewise.
	(VEC_XL_SEXT): Likewise.
	(VEC_XL_ZEXT): Likewise.
	(VEC_XST): Likewise.
	(VEC_XST_BE): Likewise.
	(VEC_XST_LEN_R): Likewise.
	(VEC_XST_TRUNC): Likewise.
	(VEC_XXPERMDI): Likewise.
	(VEC_XXSLDWI): Likewise.
	(VEC_TSTSFI_EQ_DD): Likewise.
	(VEC_TSTSFI_EQ_TD): Likewise.
	(VEC_TSTSFI_GT_DD): Likewise.
	(VEC_TSTSFI_GT_TD): Likewise.
	(VEC_TSTSFI_LT_DD): Likewise.
	(VEC_TSTSFI_LT_TD): Likewise.
	(VEC_TSTSFI_OV_DD): Likewise.
	(VEC_TSTSFI_OV_TD): Likewise.
	(VEC_VADDCUQ): Likewise.
	(VEC_VADDECUQ): Likewise.
	(VEC_VADDEUQM): Likewise.
	(VEC_VADDUDM): Likewise.
	(VEC_VADDUQM): Likewise.
	(VEC_VBPERMQ): Likewise.
	(VEC_VCLZB): Likewise.
	(VEC_VCLZD): Likewise.
	(VEC_VCLZH): Likewise.
	(VEC_VCLZW): Likewise.
	(VEC_VCTZB): Likewise.
	(VEC_VCTZD): Likewise.
	(VEC_VCTZH): Likewise.
	(VEC_VCTZW): Likewise.
	(VEC_VEEDP): Likewise.
	(VEC_VEESP): Likewise.
	(VEC_VESDP): Likewise.
	(VEC_VESSP): Likewise.
	(VEC_VIEDP): Likewise.
	(VEC_VIESP): Likewise.
	(VEC_VPKSDSS): Likewise.
	(VEC_VPKSDUS): Likewise.
	(VEC_VPKUDUM): Likewise.
	(VEC_VPKUDUS): Likewise.
	(VEC_VPOPCNT): Likewise.
	(VEC_VPOPCNTB): Likewise.
	(VEC_VPOPCNTD): Likewise.
	(VEC_VPOPCNTH): Likewise.
	(VEC_VPOPCNTW): Likewise.
	(VEC_VPRTYBD): Likewise.
	(VEC_VPRTYBQ): Likewise.
	(VEC_VPRTYBW): Likewise.
	(VEC_VRLD): Likewise.
	(VEC_VSLD): Likewise.
	(VEC_VSRAD): Likewise.
	(VEC_VSRD): Likewise.
	(VEC_VSTDCDP): Likewise.
	(VEC_VSTDCNDP): Likewise.
	(VEC_VSTDCNQP): Likewise.
	(VEC_VSTDCNSP): Likewise.
	(VEC_VSTDCQP): Likewise.
	(VEC_VSTDCSP): Likewise.
	(VEC_VSUBECUQ): Likewise.
	(VEC_VSUBEUQM): Likewise.
	(VEC_VSUBUDM): Likewise.
	(VEC_VSUBUQM): Likewise.
	(VEC_VTDCDP): Likewise.
	(VEC_VTDCSP): Likewise.
	(VEC_VUPKHSW): Likewise.
	(VEC_VUPKLSW): Likewise.
2022-02-01 08:55:48 -06:00
Andreas Krebbel
b9ebf6c330 PR101260 regcprop: Add mode change check for copy reg
When propagating a multi-word register into an access with a smaller
mode the can_change_mode backend hook is already consulted for the
original register.  This however is also required for the intermediate
copy in copy_regno which might use a different register class.

gcc/ChangeLog:

	PR rtl-optimization/101260
	* regcprop.cc (maybe_mode_change): Invoke mode_change_ok also for
	copy_regno.

gcc/testsuite/ChangeLog:

	PR rtl-optimization/101260
	* gcc.target/s390/pr101260.c: New testcase.
2022-02-01 13:33:55 +01:00
Xi Ruoyao
34afa19d29 fold-const: do not fold NaN result from non-NaN operands [PR95115]
These operations should raise an invalid operation exception at runtime.
So they should not be folded during compilation unless -fno-trapping-math
is used.
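
As a rough illustration of the rule, the Python sketch below models a constant folder that declines to fold when the result is NaN although neither operand is. `try_fold_binop` and its `trapping_math` parameter are hypothetical names chosen for this sketch; they do not reproduce const_binop.

```python
import math


def try_fold_binop(op, a, b, trapping_math=True):
    """Return the folded constant, or None if the operation must stay for run time."""
    if op == "+":
        result = a + b
    elif op == "-":
        result = a - b
    elif op == "*":
        result = a * b
    else:
        raise ValueError(f"unsupported operator: {op!r}")

    produces_new_nan = math.isnan(result) and not (math.isnan(a) or math.isnan(b))
    if trapping_math and produces_new_nan:
        # At run time this would raise the invalid operation exception,
        # so the fold is refused (models the default, trapping-math mode).
        return None
    return result
```

For example, `try_fold_binop("-", math.inf, math.inf)` returns `None`, while the same call with `trapping_math=False` returns NaN.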

gcc/
	PR middle-end/95115
	* fold-const.cc (const_binop): Do not fold NaN result from
	  non-NaN operands.

gcc/testsuite
	* gcc.dg/pr95115.c: New test.
2022-02-01 18:20:57 +08:00
Tom de Vries
d43fbc7d3f [libgomp, testsuite] Fix insufficient resources in test-cases
When running libgomp test-case broadcast-many.c on an nvptx accelerator
(T400, driver version 470.86), I run into:
...
libgomp: The Nvidia accelerator has insufficient resources to launch \
  'main$_omp_fn$0' with num_workers = 32 and vector_length = 32; \
  recompile the program with 'num_workers = x and vector_length = y' on \
  that offloaded region or '-fopenacc-dim=y' where x * y <= 896.

FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/broadcast-many.c \
  -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  \
  -O0  execution test
...

The error does not occur when using GOMP_NVPTX_JIT=-O0.

Fix this by using 896 / 32 == 28 workers for ACC_DEVICE_TYPE_nvidia.
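
The worker count follows directly from the constraint x * y <= 896 in the error message. A minimal Python sketch of that arithmetic, with `max_num_workers` as a hypothetical helper name:

```python
def max_num_workers(thread_limit, vector_length):
    """Largest num_workers such that num_workers * vector_length <= thread_limit."""
    if vector_length <= 0:
        raise ValueError("vector_length must be positive")
    return thread_limit // vector_length
```

`max_num_workers(896, 32)` gives 28, matching the value used in the fix.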

Likewise for some other test-cases.

Tested libgomp on x86_64 with nvptx accelerator.

libgomp/ChangeLog:

2022-01-27  Tom de Vries  <tdevries@suse.de>

	* testsuite/libgomp.oacc-c-c++-common/broadcast-many.c: Reduce
	num_workers for nvidia accelerator to fix libgomp error 'insufficient
	resources'.
	* testsuite/libgomp.oacc-c-c++-common/par-loop-comb-reduction-4.c:
	Same.
	* testsuite/libgomp.oacc-c-c++-common/reduction-7.c: Same.
2022-02-01 08:15:00 +01:00
Tom de Vries
be362d5e12 [libgomp, testsuite] Reduce recursion depth in declare_target-*.f90
When running the libgomp testsuite with GOMP_NVPTX_JIT=-O0 using an nvptx
accelerator (Nvidia T400, 2GB), I run into:
...
libgomp: cuCtxSynchronize error: unspecified launch failure \
  (perhaps abort was called)

libgomp: cuMemFree_v2 error: unspecified launch failure

libgomp: device finalization failed
FAIL: libgomp.fortran/examples-4/declare_target-1.f90   -O0  execution test
...

The test-case contains:
...
  ! Reduced from 25 to 23, otherwise execution runs out of thread stack on
  ! Nvidia Titan V.
  if (fib (23) /= fib_wrapper (23)) stop 2
...

Fix this by reducing the fib/fib_wrapper argument from 23 to 22.

Same for declare_target-2.f90.

Tested on x86_64 with nvptx accelerator.

libgomp/ChangeLog:

2022-01-27  Tom de Vries  <tdevries@suse.de>

	* testsuite/libgomp.fortran/examples-4/declare_target-1.f90: Reduce
	recursion depth.
	* testsuite/libgomp.fortran/examples-4/declare_target-2.f90: Same.
2022-02-01 08:13:06 +01:00
Tom de Vries
2989516651 [ldist] Don't add lib calls with -fno-tree-loop-distribute-patterns
As mentioned in PR56888 comment 21:
...
-fno-tree-loop-distribute-patterns is the reliable way to not
transform loops into library calls.
...

However, since commit 6f966f0614 ("ldist: Recognize strlen and rawmemchr like
loops") a strlen or rawmemchr library call may be introduced by ldist.

This caused regressions in testcases
gcc.c-torture/execute/builtins/strlen{,-2,-3}.c for nvptx.

Fix this by not calling transform_reduction_loop from
loop_distribution::execute for -fno-tree-loop-distribute-patterns.

Tested regressed test-cases as well as gcc.dg/tree-ssa/ldist-*.c on
nvptx.

gcc/ChangeLog:

2022-01-31  Tom de Vries  <tdevries@suse.de>

	* tree-loop-distribution.cc (generate_reduction_builtin_1): Check for
	-ftree-loop-distribute-patterns.
	(loop_distribution::execute): Don't call transform_reduction_loop for
	-fno-tree-loop-distribute-patterns.

gcc/testsuite/ChangeLog:

2022-01-31  Tom de Vries  <tdevries@suse.de>

	* gcc.dg/tree-ssa/ldist-strlen-4.c: New test.
2022-02-01 08:12:24 +01:00
GCC Administrator
1bb5266257 Daily bump.
2022-02-01 00:16:29 +00:00
Andrew Pinski
691924db0d Fix comment for operand_compare::operand_equal_p.
The OEP_* enums were moved to tree-core.h in
r0-124973-g5e351e960763 but the comment was correct
when it was added to fold-const.h in
r10-4231-g7f4a8ee03d40. This fixes the reference
to the OEP_* enum to reference tree-core.

Committed as obvious after a bootstrap/test on x86_64-linux.

gcc/ChangeLog:

	* fold-const.h (operand_compare::operand_equal_p):
	Fix comment about OEP_* flags.
2022-01-31 23:26:18 +00:00
Ed Smith-Rowland
43ee212764 MAINTAINERS: Update my email and add myself to the DCO list.
ChangeLog:

2022-01-31  Ed Smith-Rowland  <esmithrowland@gmail.com>

	* MAINTAINERS: Update my email and add myself to the DCO list.
2022-01-31 18:05:40 -05:00
Marek Polacek
874ad5d674 c++: ICE with auto[] and VLA [PR102414]
Here we ICE in unify_array_domain when we're trying to deduce the type
of an array, as in

  auto(*p)[i] = (int(*)[i])0;

but unify_array_domain doesn't handle arbitrarily complex bounds.  Another
test is, e.g.,

  auto (*b)[0/0] = &a;

where the type of the array is

  <<< Unknown tree: template_type_parm >>>[0:(sizetype) ((ssizetype) (0 / 0) - 1)]

It seems to me that we need not handle these.

	PR c++/102414
	PR c++/101874

gcc/cp/ChangeLog:

	* decl.cc (create_array_type_for_decl): Use template_placeholder_p.
	Sorry on a variable-length array of auto.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp23/auto-array3.C: New test.
	* g++.dg/cpp23/auto-array4.C: New test.
2022-01-31 15:35:59 -05:00
Marek Polacek
b1a8b92f8f c++: Reject union std::initializer_list [PR102434]
Weird things are going to happen if you define your std::initializer_list
as a union.  In this case, we crash in output_constructor_regular_field.

Let's not allow such a definition in the first place.

	PR c++/102434

gcc/cp/ChangeLog:

	* class.cc (finish_struct): Don't allow union initializer_list.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp0x/initlist128.C: New test.
2022-01-31 15:35:20 -05:00
Patrick Palka
76dc465aaf c++: CTAD for class tmpl defined inside partial spec [PR104294]
Here during deduction guide generation for the nested class template
B<char(int)>::C, the computation of outer_args yields the template
arguments relative to the primary template for B (i.e. {char(int)})
but what we really want is those relative to C's enclosing scope, the
partial specialization of B (i.e. {char, int}).

	PR c++/104294

gcc/cp/ChangeLog:

	* pt.cc (ctor_deduction_guides_for): Correct computation of
	outer_args.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp1z/class-deduction106.C: New test.
2022-01-31 15:27:58 -05:00
Patrick Palka
0eb06ee9a4 c++: CONSTRUCTORs are non-deduced contexts [PR104291]
PR c++/104291

gcc/cp/ChangeLog:

	* pt.cc (for_each_template_parm_r) <case CONSTRUCTOR>: Clear
	walk_subtrees if !include_nondeduced_p.  Simplify given that
	cp_walk_subtrees already walks TYPE_PTRMEMFUNC_FN_TYPE_RAW.

gcc/testsuite/ChangeLog:

	* g++.dg/template/partial20.C: New test.
2022-01-31 14:15:01 -05:00
Jakub Jelinek
2cbe5dd54f rs6000: Fix up build of non-glibc/aix/darwin powerpc* targets [PR104298]
As reported by Martin, while David has added an OPTION_GLIBC define to aix
and Iain to darwin, all the other non-linux targets now fail because
the macro used in rs6000.md isn't defined.

One possibility is to define this macro in option-defaults.h which on rs6000
targets is included last, then we don't need to define it in aix/darwin
headers and for targets using linux.h or linux64.h it will DTRT too.

The other option is the first 2 hunks + changing the 3
   if (!OPTION_GLIBC)
     FAIL;
cases in rs6000.md to e.g.
 #ifdef OPTION_GLIBC
   if (!OPTION_GLIBC)
 #endif
     FAIL;
or to:
 #ifdef OPTION_GLIBC
   if (!OPTION_GLIBC)
 #else
   if (true)
 #endif
     FAIL;
(the latter case if Richi wants to push the -Wunreachable-code changes for
GCC 13).

2022-01-31  Jakub Jelinek  <jakub@redhat.com>

	PR target/104298
	* config/rs6000/aix.h (OPTION_GLIBC): Remove.
	* config/rs6000/darwin.h (OPTION_GLIBC): Likewise.
	* config/rs6000/option-defaults.h (OPTION_GLIBC): Define to 0
	if not already defined.
2022-01-31 20:08:18 +01:00
Martin Sebor
48d3191e7b Constrain PHI handling in -Wuse-after-free [PR104232].
Resolves:
PR middle-end/104232 - spurious -Wuse-after-free after conditional free

gcc/ChangeLog:

	PR middle-end/104232
	* gimple-ssa-warn-access.cc (pointers_related_p): Add argument.
	Handle PHIs.  Add a synonymous overload.
	(pass_waccess::check_pointer_uses): Call pointers_related_p.

gcc/testsuite/ChangeLog:

	PR middle-end/104232
	* g++.dg/warn/Wuse-after-free4.C: New test.
	* gcc.dg/Wuse-after-free-2.c: New test.
	* gcc.dg/Wuse-after-free-3.c: New test.
2022-01-31 12:04:55 -07:00
Martin Liska
31ab99f7c8 contrib: update analyze_brprob_* scripts.
contrib/ChangeLog:

	* analyze_brprob.py: Support more formatted predict.def file.
	* analyze_brprob_spec.py: Improve output and documentation.
2022-01-31 16:42:15 +01:00
Nick Clifton
f10bec5ffa libiberty: Fix infinite recursion in rust demangler.
libiberty/
	PR demangler/98886
	PR demangler/99935
	* rust-demangle.c (struct rust_demangler): Add a recursion
	counter.
	(demangle_path): Increment/decrement the recursion counter upon
	entry and exit.  Fail if the counter exceeds a fixed limit.
	(demangle_type): Likewise.
	(rust_demangle_callback): Initialise the recursion counter,
	disabling if requested by the option flags.
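
The recursion-counter technique named in this ChangeLog entry can be sketched on a toy grammar. The Python code below is not the demangler: `nesting_depth`, the grammar, and the value of `RECURSION_LIMIT` are assumptions made for illustration. It increments a counter on entry, decrements it on every exit path, fails once the counter exceeds the limit, and lets the caller disable the check.

```python
RECURSION_LIMIT = 64  # hypothetical value; the log does not state the real limit


def nesting_depth(text, limit=RECURSION_LIMIT):
    """Parse the toy grammar  N ::= '(' N ')' | 'x'  and return its nesting depth.

    Returns None on malformed input or when recursion exceeds `limit`.
    Passing limit=None disables the recursion check.
    """
    pos = 0
    depth = 0

    def parse():
        nonlocal pos, depth
        depth += 1                      # count on entry
        try:
            if limit is not None and depth > limit:
                return None             # too deep: fail instead of recursing on
            if pos >= len(text):
                return None
            ch = text[pos]
            pos += 1
            if ch == "x":
                return 0
            if ch != "(":
                return None
            inner = parse()
            if inner is None or pos >= len(text) or text[pos] != ")":
                return None
            pos += 1
            return inner + 1
        finally:
            depth -= 1                  # and uncount on every exit path

    result = parse()
    return result if pos == len(text) else None
```

Input nested 100 levels deep is rejected with the default limit but accepted when the check is disabled with `limit=None`.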
2022-01-31 14:33:34 +00:00
Pierre-Marie de Rodat
36c155c893 [Ada] doc/share/conf.py: fix string handling
gcc/ada/

	* doc/share/conf.py: Remove spurious call to ".decode()".
2022-01-31 10:46:27 +00:00
Arnaud Charlet
2dbc237e86 [Ada] Fix up handling of ghost units PR104027 #2
gcc/ada/

	PR ada/104027
	* gnat1drv.adb (Gnat1drv): Only call Exit_Program when not
	generating code, otherwise instead go to End_Of_Program.
2022-01-31 10:46:27 +00:00
Jakub Jelinek
263a5944fc testsuite: Fix up tree-ssa/pr103514.c testcase [PR103514]
> > PR tree-optimization/103514
> >     * match.pd (a & b) ^ (a == b) -> !(a | b): New optimization.
> >     * match.pd (a & b) == (a ^ b) -> !(a | b): New optimization.
> >     * gcc.dg/tree-ssa/pr103514.c: Testcase for this optimization.
> >
> > 1) https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103514
> Note the bug was filed and fixed during stage3, review just didn't happen in
> a reasonable timeframe.
>
> I'm going to ACK this for the trunk and go ahead and commit it for you.

The testcase FAILs on short-circuit targets like powerpc64le-linux.
While the first 2 functions are identical, the last two look like:
  <bb 2> :
  if (a_5(D) != 0)
    goto <bb 3>; [INV]
  else
    goto <bb 4>; [INV]

  <bb 3> :
  if (b_6(D) != 0)
    goto <bb 5>; [INV]
  else
    goto <bb 4>; [INV]

  <bb 4> :

  <bb 5> :
  # iftmp.1_4 = PHI <1(3), 0(4)>
  _1 = a_5(D) == b_6(D);
  _2 = (int) _1;
  _3 = _2 ^ iftmp.1_4;
  _9 = _2 != iftmp.1_4;
  return _9;
instead of the expected:
  <bb 2> :
  _3 = a_8(D) & b_9(D);
  _4 = (int) _3;
  _5 = a_8(D) == b_9(D);
  _6 = (int) _5;
  _1 = a_8(D) | b_9(D);
  _2 = ~_1;
  _7 = (int) _2;
  _10 = ~_1;
  return _10;
so no wonder it doesn't match.  E.g. x86_64-linux will also use jumps
if it isn't just a && b but a && b && c && d (it will do
a & b and c & d tests and jump based on those).

As it is too late to implement this optimization for the short-circuiting
targets as well (not even sure which pass would be best),
this patch just forces non-short-circuiting for the test.

2022-01-31  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/103514
	* gcc.dg/tree-ssa/pr103514.c: Add
	--param logical-op-non-short-circuit=1 to dg-options.
2022-01-31 10:30:58 +01:00
Martin Liska
e97cfaa9f6 d: Fix -Werror=format-diag error.
PR d/104287

gcc/d/ChangeLog:

	* decl.cc (d_finish_decl): Remove trailing dot.
2022-01-31 09:49:41 +01:00
Martin Liska
c99a6eb015 Add mold detection for libs.
libatomic/ChangeLog:

	* acinclude.m4: Detect *_ld_is_mold and use it.
	* configure: Regenerate.

libgomp/ChangeLog:

	* acinclude.m4: Detect *_ld_is_mold and use it.
	* configure: Regenerate.

libitm/ChangeLog:

	* acinclude.m4: Detect *_ld_is_mold and use it.
	* configure: Regenerate.

libstdc++-v3/ChangeLog:

	* acinclude.m4: Detect *_ld_is_mold and use it.
	* configure: Regenerate.
2022-01-31 09:46:44 +01:00
Richard Biener
625f16c798 Fix multiple_of_p behavior with NOP_EXPR
We were passing down the original type to recursive invocations
of multiple_of_p for say (int)(unsigned * unsigned).
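
One way to see why the type matters, offered as an interpretation rather than a description of the patch: unsigned multiplication wraps, so a product of a multiple of 3 need not itself be a multiple of 3 once reduced modulo 2^32. The Python sketch below uses a hypothetical helper `wrapped_product` and assumes 32-bit unsigned arithmetic.

```python
def wrapped_product(a, b, bits=32):
    """Multiply as an unsigned `bits`-wide type would, wrapping modulo 2**bits."""
    return (a * b) % (1 << bits)


# 3 is a multiple of 3, and so is the mathematical product 3 * 0x55555556,
# but the wrapped 32-bit result is not.
EXACT = 3 * 0x55555556
WRAPPED = wrapped_product(3, 0x55555556)
```

Here `EXACT % 3` is 0 while `WRAPPED` is 2, so reasoning that is valid for a non-wrapping type gives the wrong answer when applied to the unsigned operands.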

2022-01-24  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/100499
	* fold-const.cc (multiple_of_p): Pass the correct type of
	the expression to the recursive invocation of multiple_of_p
	for conversions and use CASE_CONVERT.
2022-01-31 09:38:10 +01:00
Eric Botcazou
23987912dd Use V8+ default in 32-bit mode on SPARC64/Linux
This is what has been done for ages on SPARC/Solaris and makes it possible
to use 64-bit atomic instructions even in 32-bit mode.

gcc/
	PR target/104189
	* config/sparc/linux64.h (TARGET_DEFAULT): Add MASK_V8PLUS.
2022-01-31 09:21:48 +01:00
Eric Botcazou
825e5599f3 Add testcase for incorrect optimization in Ada
gcc/testsuite/
	* gnat.dg/div_zero.adb: New test.
2022-01-31 09:15:30 +01:00
Richard Biener
3c7067cc92 Reduce multiple_of_p uses
There are a few cases where we know we're dealing with (poly-)integer
constants, so remove the use of multiple_of_p in those cases to make
the PR100499 fix less impactful.

2022-01-24  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/100499
	* tree-cfg.cc (verify_gimple_assign_ternary): Use multiple_p
	on poly-ints instead of multiple_of_p.
	* tree-ssa.cc (maybe_rewrite_mem_ref_base): Likewise.
	(non_rewritable_mem_ref_base): Likewise.
	(non_rewritable_lvalue_p): Likewise.
	(execute_update_addresses_taken): Likewise.
2022-01-31 08:55:45 +01:00
GCC Administrator
c67ffc256d Daily bump.
2022-01-31 00:16:28 +00:00
Hans-Peter Nilsson
baf98320ac libstdc++ testsuite: Don't run lwg3464.cc tests on simulators
These tests have always been failing for my autotester running a
cris-elf simulator; when unrestrained they take about 20 minutes each,
compared to the (doubled) timeout of 720 seconds, of a total 2h40min
for the whole of the libstdc++-v3 testsuite.  The tests cover counter
overflow and are already disabled for LP64 targets.

	* testsuite/27_io/basic_istream/get/char/lwg3464.cc: Don't run on
	simulator targets.
	* testsuite/27_io/basic_istream/get/wchar_t/lwg3464.cc: Likewise.
2022-01-30 17:51:02 +01:00
GCC Administrator
d1182631ee Daily bump.
2022-01-30 00:16:20 +00:00
Jakub Jelinek
3d41939c87 testsuite: Fix up tree-ssa/divide-7.c testcase [PR95424]
This test fails everywhere, because ? doesn't match literal ?.
It should use \\? instead.  I've also changed those .s in there.
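
The effect of an unescaped ? can be reproduced with any regex engine in which ? is a quantifier. The sketch below uses Python's `re` module rather than the Tcl regexps used by the testsuite, and the pattern and dump strings are made up for illustration; in a Python raw string a single backslash is enough, whereas the testsuite directive needs the doubled form mentioned above.

```python
import re


def dump_matches(pattern, dump_text):
    """Return True if `pattern` matches anywhere in `dump_text`."""
    return re.search(pattern, dump_text) is not None


UNESCAPED = r"a ? b"    # ' ?' means "an optional space", not a literal '?'
ESCAPED = r"a \? b"     # '\?' matches a literal question mark
```

`dump_matches(UNESCAPED, "a ? b")` is false, while `dump_matches(ESCAPED, "a ? b")` is true; the unescaped pattern matches the text "a b" instead.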

2022-01-29  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/95424
	* gcc.dg/tree-ssa/divide-7.c: Fix up regexps in scan-tree-dump{,-not}.
2022-01-29 17:55:51 +01:00
Jakub Jelinek
a154487896 match.pd: Fix up 1 / X for unsigned X optimization [PR104280]
On Fri, Jan 28, 2022 at 11:38:23AM -0700, Jeff Law wrote:
> Thanks.  Given the original submission and most of the review work was done
> prior to stage3 closing, I went ahead and installed this on the trunk.

Unfortunately this breaks quite a lot of things.
The main problem is that GIMPLE allows EQ_EXPR etc. only with BOOLEAN_TYPE
or with TYPE_PRECISION == 1 integral type (or vector boolean).
Violating this causes verification failures in tree-cfg.cc in some cases,
in other cases wrong-code issues because before it is verified we e.g.
transform
1U / x
into
x == 1U
and later into
x (because we assume that == type must be one of the above cases and
when it is the same type as the type of the first operand, for boolean-ish
cases it should be equivalent).

Fixed by changing that
(eq @1 { build_one_cst (type); })
into
(convert (eq:boolean_type_node @1 { build_one_cst (type); }))
Note, I'm not 100% sure if :boolean_type_node is required in that case,
I see some spots in match.pd that look exactly like this, while there is
e.g. (convert (le ...)) that supposedly does the right thing too.
The signed integer 1/X case doesn't need changes, for
(cond (le ...) ...)
le gets correctly boolean_type_node and cond should use type.
I've also reformatted it, some lines were too long, match.pd uses
indentation by 1 column instead of 2 etc.

2022-01-29  Jakub Jelinek  <jakub@redhat.com>
	    Andrew Pinski  <apinski@marvell.com>

	PR tree-optimization/104279
	PR tree-optimization/104280
	PR tree-optimization/104281
	* match.pd (1 / X -> X == 1 for unsigned X): Build eq with
	boolean_type_node and convert to type.  Formatting fixes.

	* gcc.dg/torture/pr104279.c: New test.
	* gcc.dg/torture/pr104280.c: New test.
	* gcc.dg/torture/pr104281.c: New test.
2022-01-29 17:54:43 +01:00
GCC Administrator
f6f2d6cfec Daily bump.
2022-01-29 00:16:22 +00:00
Yoshinori Sato
06995c2958 sh-linux fix target cpu
sh-linux does not support any SH1, nor little-endian SH2A.

gcc

	* config/sh/t-linux (MULTILIB_EXCEPTIONS): Add m1, mb/m1 and m2a.
2022-01-28 17:17:39 -05:00
Navid Rahimi
cb3ac1985a tree-optimization/103514 Missing XOR-EQ-AND Optimization
This patch adds the missed pattern described in bug 103514 [1] to match.pd. [1] also includes a proof of correctness for the patch.

1) https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103514
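
Independently of the proof in the bug report, both identities can be checked exhaustively when a and b are boolean (0 or 1). The Python sketch below does that; the function names are hypothetical and the check says nothing about wider integer operands, for which the identities do not hold in general.

```python
from itertools import product


def xor_of_and_and_eq(a, b):
    """(a & b) ^ (a == b), with the comparison result taken as 0 or 1."""
    return (a & b) ^ int(a == b)


def eq_of_and_and_xor(a, b):
    """(a & b) == (a ^ b), as 0 or 1."""
    return int((a & b) == (a ^ b))


def simplified(a, b):
    """!(a | b), as 0 or 1."""
    return int(not (a | b))


def identities_hold():
    """Check both rewrites for every combination of boolean inputs."""
    return all(
        xor_of_and_and_eq(a, b) == simplified(a, b)
        and eq_of_and_and_xor(a, b) == simplified(a, b)
        for a, b in product((0, 1), repeat=2)
    )
```

`identities_hold()` returns `True`. A non-boolean counterexample such as a = b = 2 shows why the boolean assumption is needed: `xor_of_and_and_eq(2, 2)` is 3, not 0.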

gcc/
	PR tree-optimization/103514
	* match.pd (a & b) ^ (a == b) -> !(a | b): New optimization.
	(a & b) == (a ^ b) -> !(a | b): New optimization.

gcc/testsuite
	* gcc.dg/tree-ssa/pr103514.c: Testcase for this optimization.
2022-01-28 17:13:08 -05:00
Marek Polacek
5d8b422818 doc: Update -Wbidi-chars documentation
gcc/ChangeLog:

	* doc/invoke.texi: Update -Wbidi-chars documentation.
2022-01-28 15:58:41 -05:00
Patrick Palka
e971990cbd c++: bogus warning with value init of const pmf [PR92752]
Here we're emitting a -Wignored-qualifiers warning for an intermediate
compiler-generated cast of nullptr to 'method-type* const' as part of
value initialization of a const pmf.  This patch suppresses the warning
by instead casting to the corresponding unqualified type.

	PR c++/92752

gcc/cp/ChangeLog:

	* typeck.cc (build_ptrmemfunc): Cast a nullptr constant to the
	unqualified pointer type not the qualified one.

gcc/testsuite/ChangeLog:

	* g++.dg/warn/Wignored-qualifiers2.C: New test.

Co-authored-by: Jason Merrill <jason@redhat.com>
2022-01-28 15:41:15 -05:00
Iain Sandoe
3a5fdf986d Darwin, PPC: Fix bootstrap after GLIBC version changes.
A recent patch added tests for OPTION_GLIBC that is defined in
linux.h and linux64.h.  This broke bootstrap for powerpc Darwin.
Fixed by adding a definition to 0 for OPTION_GLIBC.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>

gcc/ChangeLog:

	* config/rs6000/darwin.h (OPTION_GLIBC): Define to 0.
2022-01-28 19:17:16 +00:00
Zhao Wei Liew
c2b610e7c6 match.pd: Simplify 1 / X for integer X [PR95424]
This patch implements an optimization for the following C++ code:

int f(int x) {
    return 1 / x;
}

int f(unsigned int x) {
    return 1 / x;
}

Before this patch, x86-64 gcc -std=c++20 -O3 produces the following assembly:

f(int):
    xor edx, edx
    mov eax, 1
    idiv edi
    ret
f(unsigned int):
    xor edx, edx
    mov eax, 1
    div edi
    ret

In comparison, clang++ -std=c++20 -O3 produces the following assembly:

f(int):
    lea ecx, [rdi + 1]
    xor eax, eax
    cmp ecx, 3
    cmovb eax, edi
    ret
f(unsigned int):
    xor eax, eax
    cmp edi, 1
    sete al
    ret

Clang's output is more efficient as it avoids expensive div operations.

With this patch, GCC now produces the following assembly:

f(int):
    lea eax, [rdi + 1]
    cmp eax, 2
    mov eax, 0
    cmovbe eax, edi
    ret
f(unsigned int):
    xor eax, eax
    cmp edi, 1
    sete al
    ret

which is virtually identical to Clang's assembly output. Any slight differences
in the output for f(int) are possibly related to a different missed optimization.
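
The arithmetic behind the division-free sequences can be checked with a small model. The Python sketch below is an illustration, not the match.pd pattern: it assumes 32-bit operands and C-style truncating division, and `c_div`, `one_over_unsigned` and `one_over_signed` are hypothetical names. The signed variant mirrors the `lea`/`cmp`/`cmovbe` sequence shown above: x + 1, viewed as unsigned, is at most 2 exactly when x is -1, 0 or 1.

```python
def c_div(n, d):
    """Integer division that truncates toward zero, as C's / does."""
    q = abs(n) // abs(d)
    return q if (n < 0) == (d < 0) else -q


def one_over_unsigned(x):
    """1 / x for unsigned x != 0: the quotient is 1 only when x == 1."""
    return 1 if x == 1 else 0


def one_over_signed(x, bits=32):
    """1 / x for signed x != 0: x itself when x is -1 or 1, otherwise 0."""
    as_unsigned = (x + 1) % (1 << bits)   # models the unsigned comparison
    return x if as_unsigned <= 2 else 0
```

Comparing these helpers with `c_div(1, x)` over a range of non-zero inputs is a quick sanity check of the rewrite; x == 0 is excluded because the division is undefined there.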

v2: https://gcc.gnu.org/pipermail/gcc-patches/2022-January/587751.html
Changes from v2:
1. Refactor from using a switch statement to using the built-in
if-else statement.

v1: https://gcc.gnu.org/pipermail/gcc-patches/2022-January/587634.html
Changes from v1:
1. Refactor common if conditions.
2. Use build_[minus_]one_cst (type) to get -1/1 of the correct type.
3. Match only for TRUNC_DIV_EXPR and TYPE_PRECISION (type) > 1.

gcc/ChangeLog:

	PR tree-optimization/95424
	* match.pd: Simplify 1 / X where X is an integer.
2022-01-28 13:36:39 -05:00
Jakub Jelinek
a591c71b41 store-merging: Fix up a -fcompare-debug bug in get_status_for_store_merging [PR104263]
As mentioned in the PR, the following testcase fails, because the last
stmt of a bb with -g is a debug stmt and get_status_for_store_merging
uses gimple_seq_last_stmt (bb_seq (bb)) when testing if it is valid
for store merging.  The debug stmt isn't valid, while a stmt at that
position with -g0 is valid and so the divergence.

As we walk the whole bb already, this patch just remembers the last
non-debug stmt, so that we don't need to skip backwards debug stmts at the
end of the bb to find last real stmt.
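
The idea of remembering the last non-debug statement during the existing walk can be sketched as follows. This is a Python model, not the GCC code: statements are plain strings, and the convention that debug statements start with "# DEBUG" is an assumption made for the example.

```python
def is_debug(stmt):
    """Assumed convention for this sketch: debug statements start with '# DEBUG'."""
    return stmt.startswith("# DEBUG")


def last_non_debug_stmt(block):
    """Walk the block once and remember the last statement that is not a debug stmt."""
    last = None
    for stmt in block:
        if not is_debug(stmt):
            last = stmt
    return last
```

A block with trailing debug statements then yields the same "last" statement as the same block compiled without them, which removes the -g versus -g0 divergence described above.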

2022-01-28  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/104263
	* gimple-ssa-store-merging.cc (get_status_for_store_merging): For
	cfun->can_throw_non_call_exceptions && cfun->eh test whether
	last non-debug stmt in the bb is store_valid_for_store_merging_p
	rather than last stmt.

	* gcc.dg/pr104263.c: New test.
2022-01-28 19:02:26 +01:00
Allan McRae
90c31ff339 testsuite/70230 - fix failures with default SSP
Configuring with --enable-default-ssp triggers various testsuite
failures.  These contain asm statements that are not compatible with
-fstack-protector.  Adding -fno-stack-protector to dg-options to
work around this issue.

Tested on x86_64-linux.

	PR testsuite/70230
	* gcc.dg/asan/use-after-scope-4.c (dg-options): Add
	-fno-stack-protector.
	* gcc.dg/stack-usage-1.c: Likewise
	* gcc.dg/superblock.c: Likewise
	* gcc.target/i386/avx-vzeroupper-17.c: Likewise
	* gcc.target/i386/cleanup-1.c: Likewise
	* gcc.target/i386/cleanup-2.c: Likewise
	* gcc.target/i386/interrupt-redzone-1.c: Likewise
	* gcc.target/i386/interrupt-redzone-2.c: Likewise
	* gcc.target/i386/pr79793-1.c: Likewise
	* gcc.target/i386/pr79793-2.c: Likewise
	* gcc.target/i386/shrink_wrap_1.c: Likewise
	* gcc.target/i386/stack-check-11.c: Likewise
	* gcc.target/i386/stack-check-18.c: Likewise
	* gcc.target/i386/stack-check-19.c: Likewise
	* gcc.target/i386/stackalign/pr88483-1.c: Likewise
	* gcc.target/i386/stackalign/pr88483-2.c: Likewise
	* gcc.target/i386/sw-1.c: Likewise
2022-01-28 12:45:16 -05:00
Martin Liska
3f0fcda37f Remove extra newline in ICE report.
Revert partially what I did in g:76ef38e3178a11e76a66b4d4c0e10e85fe186a45.

gcc/ChangeLog:

	* diagnostic.cc (diagnostic_action_after_output): Remove extra
	newline.
2022-01-28 16:11:33 +01:00
Martin Liska
206222e0ce internal_error - do not use leading capital letter
gcc/ChangeLog:

	* config/rs6000/host-darwin.cc (segv_crash_handler):
	Do not use leading capital letter.
	(segv_handler): Likewise.
	* ipa-sra.cc (verify_splitting_accesses): Likewise.
	* varasm.cc (get_section): Likewise.

gcc/d/ChangeLog:

	* decl.cc (d_finish_decl): Do not use leading capital letter.
2022-01-28 16:08:58 +01:00
Patrick Palka
e272cf95ba c++: var tmpl w/ dependent constrained auto type [PR103341]
When deducing the type of a variable template (or templated static data
member) with a constrained auto type, we might need its template
arguments for satisfaction since the constraint could depend on them.

	PR c++/103341

gcc/cp/ChangeLog:

	* decl.cc (cp_finish_decl): Pass the template arguments of a
	variable template specialization or a templated static data
	member to do_auto_deduction when the auto is constrained.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp2a/concepts-class4.C: New test.
	* g++.dg/cpp2a/concepts-var-templ2.C: New test.
2022-01-28 08:18:28 -05:00
Richard Biener
9ec306582f tree-optimization/104267 - fix external def vector type for call args
The following fixes the vector type registered for external defs
in call arguments when vectorizing with SLP.  We assumed uniform
vectype_in types here but with calls like .COND_MUL we also have
mask arguments which, when invariant or external, need to have
a proper mask vector type.

2022-01-28  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/104267
	* tree-vect-stmts.cc (vectorizable_call): Properly use the
	per-argument determined vector type for externals and
	invariants.
2022-01-28 13:29:56 +01:00
Richard Biener
5b6f04276e tree-optimization/104263 - avoid retaining abnormal edges for non-call/goto stmts
This removes a premature optimization from
gimple_purge_dead_abnormal_call_edges which, after eliding the
last setjmp (or computed goto) statement from a function and
thus clearing cfun->calls_setjmp, leaves us with the abnormal
edges from other calls that are elided for example via inlining
or DCE.  That's a CFG / IL combination that should be impossible
(not addressing the fact that with cfun->calls_setjmp and
cfun->has_nonlocal_label cleared we should not have any abnormal
edge at all).

For the testcase in the PR this means that IPA inlining will
remove the abnormal edges from the block after inlining the call
the edge was coming from.

2022-01-28  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/104263
	* tree-cfg.cc (gimple_purge_dead_abnormal_call_edges):
	Purge edges also when !cfun->has_nonlocal_label
	and !cfun->calls_setjmp.

	* gcc.dg/tree-ssa/inline-13.c: New testcase.
2022-01-28 13:29:37 +01:00
Maciej W. Rozycki
833e651a76 RISC-V: Document `auipc' and `bitmanip' `type' attributes
Document new `auipc' and `bitmanip' `type' attributes added respectively
with commit 88108b27dd ("RISC-V: Add sifive-7 pipeline description.")
and commit 283b1707f2 ("RISC-V: Implement instruction patterns for ZBA
extension.") but not listed so far.

	gcc/
	* config/riscv/riscv.md: Document `auipc' and `bitmanip' `type'
	attributes.
2022-01-28 11:55:12 +00:00
Andre Vehreschild
26e237fb5b Prevent malicious descriptor stacking for scalar components [V2].
gcc/fortran/ChangeLog:

	PR fortran/103790
	* trans-array.cc (structure_alloc_comps): Prevent descriptor
	stacking for non-array data; do not broadcast caf-tokens.
	* trans-intrinsic.cc (conv_co_collective): Prevent generation
	of unused descriptor.

gcc/testsuite/ChangeLog:

	PR fortran/103790
	* gfortran.dg/coarray_collectives_18.f90: New test.
2022-01-28 12:34:17 +01:00
Jakub Jelinek
430dca620f cfgrtl: Fix up locus comparison in unique_locus_on_edge_between_p [PR104237]
The testcase in the PR (not included for the testsuite because we don't
have an (easy) way to -fcompare-debug LTO, we'd need 2 compilations/linking,
one with -g and one with -g0 and -fdump-rtl-final= at the end of lto1
and compare that) has different code generation for -g vs. -g0.

The difference appears during expansion, where we have a goto_locus
that is at -O0 compared to the INSN_LOCATION of the previous and next insn
across an edge.  With -g0 the locations are equal and so no nop is added.
With -g the locations aren't equal and so a nop is added holding that
location.

The reason for the different location is in the way how we stream in
locations by lto1.
We have lto_location_cache::apply_location_cache that is called with some
set of expanded locations, qsorts them, creates location_t's for those
and remembers the last expanded location.
When the expanded_location read in by
lto_location_cache::input_location_and_block is equal to the last expanded
location, it just reuses the last location_t (or adds/changes/removes
LOCATION_BLOCK in it); when it is not, it queues it for the next
apply_location_cache.  Now, when streaming in -g input, we can
see extra locations that don't appear with -g0, and if we are unlucky
enough, those can be sorted last during apply_location_cache and affect
what locations are used from the single entry cache next.
In particular, second apply_location_cache with non-empty loc_cache in
the testcase has 14 locations with -g0 and 16 with -g and those 2 extra
ones sort both last (they are the same).  The last one from -g0 then
appears to be input_location_and_block sourced again, for -g0 triggers
the single entry cache, while for -g it doesn't and so apply_location_cache
will create for it another location_t with the same content.

The following patch fixes it by comparing everything we care about in the
location, in addition to the simple location_t == location_t check.
I think we don't care about the sysp flag for debug info...
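
The idea of comparing location contents rather than only handles can be
sketched as follows.  This is an illustration under stated assumptions, not
the loc_equal added by the patch: `ExpandedLoc`, `Loc` and `loc_equal_sketch`
are hypothetical stand-ins, and which fields matter is assumed here to be
file, line and column.

```cpp
#include <string_view>

// Hypothetical stand-in for an expanded source location.
struct ExpandedLoc {
  std::string_view file;
  int line;
  int column;
};

// Hypothetical stand-in for a compact location handle together with what
// it expands to.  Two different handles may describe the same place.
struct Loc {
  unsigned id;
  ExpandedLoc expanded;
};

// Equal handles are trivially equal; otherwise compare the expanded
// contents field by field, so duplicate handles for the same place
// still compare equal.
constexpr bool loc_equal_sketch(const Loc &a, const Loc &b) {
  if (a.id == b.id)
    return true;
  return a.expanded.line == b.expanded.line
         && a.expanded.column == b.expanded.column
         && a.expanded.file == b.expanded.file;
}
```

With such a comparison, whether a cache happened to hand out one handle or
two for the same place no longer changes the answer.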

2022-01-28  Jakub Jelinek  <jakub@redhat.com>

	PR lto/104237
	* cfgrtl.cc (loc_equal): New function.
	(unique_locus_on_edge_between_p): Use it.
2022-01-28 11:48:18 +01:00
Richard Biener
b500d2591e Make graph dumping work for fn != cfun
The following makes dumping of a function as graph work as intended
when specifying a function other than cfun.  Unfortunately the loop
and the dominance APIs are not set up to work for other functions
than cfun so you won't get any fancy loop dumps but the non-loop
dump works up to reaching mark_dfs_back_edges which I trivially made
function aware and adjusted current callers with a wrapper.

With all this, doing dot-fn id->src_cfun from the debugger when
debugging inlining works.  Previously you got a strange mix of
the src and dest functions visualized ;)
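
The "make it function aware and add a wrapper" pattern can be sketched on a
toy graph.  Everything below is hypothetical (`Graph`, `current_graph`,
`find_back_edges`) and is not GCC's mark_dfs_back_edges; it only shows an
analysis taking its subject explicitly, with a zero-argument overload that
keeps existing callers working against a global default.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a function's CFG; node 0 is the entry.
struct Graph {
  std::vector<std::vector<int>> succs;  // succs[n] lists successors of n
};

// Hypothetical analogue of a global "current function".
Graph *current_graph = nullptr;

// Explicit-argument version: returns every edge (src, dst) whose
// destination is still on the DFS stack when the edge is visited.
std::vector<std::pair<int, int>> find_back_edges(const Graph &g) {
  enum State { Unvisited, OnStack, Done };
  std::vector<State> state(g.succs.size(), Unvisited);
  std::vector<std::pair<int, int>> back;
  std::vector<std::pair<int, std::size_t>> stack;  // (node, next succ index)
  if (g.succs.empty())
    return back;
  state[0] = OnStack;
  stack.push_back({0, 0});
  while (!stack.empty()) {
    int node = stack.back().first;
    std::size_t idx = stack.back().second;
    if (idx == g.succs[node].size()) {
      state[node] = Done;  // all successors handled
      stack.pop_back();
      continue;
    }
    stack.back().second = idx + 1;
    int dst = g.succs[node][idx];
    if (state[dst] == OnStack) {
      back.push_back({node, dst});
    } else if (state[dst] == Unvisited) {
      state[dst] = OnStack;
      stack.push_back({dst, 0});
    }
  }
  return back;
}

// Wrapper for existing callers: operates on the global default.
std::vector<std::pair<int, int>> find_back_edges() {
  assert(current_graph != nullptr);
  return find_back_edges(*current_graph);
}
```

A caller inspecting some other graph passes it directly, while older call
sites keep calling the zero-argument form unchanged.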

2022-01-28  Richard Biener  <rguenther@suse.de>

	* cfganal.h (mark_dfs_back_edges): Provide API with struct
	function argument.
	* cfganal.cc (mark_dfs_back_edges): Take a struct function
	to work on, add a wrapper passing cfun.
	* graph.cc (draw_cfg_nodes_no_loops): Replace stray cfun
	uses with fun which is already passed.
	(draw_cfg_edges): Likewise.
	(draw_cfg_nodes): Do not use draw_cfg_nodes_for_loop
	for fun != cfun.
2022-01-28 11:28:09 +01:00