189068 Commits

Author SHA1 Message Date
Uros Bizjak
b37351e327 i386: Improve workaround for PR82524 LRA limitation [PR85730]
As explained in PR82524, LRA is not able to reload strict_low_part inout
operand with matched input operand. The patch introduces a workaround,
where we allow LRA to generate an instruction with non-matched input operand
which is split post reload to an instruction that inserts non-matched input
operand to an inout operand and the instruction that uses matched operand.

The generated code improves from:

        movsbl  %dil, %edx
        movl    %edi, %eax
        sall    $3, %edx
        movb    %dl, %al

to:

        movl    %edi, %eax
        movb    %dil, %al
        salb    $3, %al

which is still not optimal, but the code is one instruction shorter and
does not use a temporary register.
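
For illustration only, the kind of source operation behind such a sequence can
be sketched as below. This is not the PR testcase; the function name and the
use of a plain 32-bit integer are assumptions made to keep the sketch portable.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not taken from pr85730.c): shift only the low byte
// of a 32-bit value, leaving the upper three bytes untouched.  Writing the
// low part while keeping the rest is what strict_low_part expresses in RTL.
inline std::uint32_t shift_low_byte(std::uint32_t value, unsigned count)
{
    // Take the low byte on its own.
    const std::uint8_t low = static_cast<std::uint8_t>(value);
    // Shift it as an 8-bit quantity; bits shifted past bit 7 are dropped.
    const std::uint8_t shifted = static_cast<std::uint8_t>(low << count);
    // Put the result back into the low part of the original value.
    return (value & 0xFFFFFF00u) | shifted;
}
```

Whether a given compiler emits the three-instruction sequence for this exact
function is not claimed here.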

2021-10-12  Uroš Bizjak  <ubizjak@gmail.com>

gcc/
	PR target/85730
	PR target/82524
	* config/i386/i386.md (*add<mode>_1_slp): Rewrite as
	define_insn_and_split pattern.  Add alternative 1 and split it
	post reload to insert operand 1 into the low part of operand 0.
	(*sub<mode>_1_slp): Ditto.
	(*and<mode>_1_slp): Ditto.
	(*<any_or:code><mode>_1_slp): Ditto.
	(*ashl<mode>3_1_slp): Ditto.
	(*<any_shiftrt:insn><mode>3_1_slp): Ditto.
	(*<any_rotate:insn><mode>3_1_slp): Ditto.
	(*neg<mode>_1_slp): New insn_and_split pattern.
	(*one_cmpl<mode>_1_slp): Ditto.

gcc/testsuite/
	PR target/85730
	PR target/82524
	* gcc.target/i386/pr85730.c: New test.
2021-10-12 18:21:33 +02:00
David Edelsohn
640ae312f1 doc: Update MinGW and mingw-64 download links.
gcc/ChangeLog:

	* doc/install.texi: Update MinGW and mingw-64 Binaries
	download links.
2021-10-12 11:55:45 -04:00
Jonathan Wakely
727137d6ca libstdc++: Fix test that fails for C++20
Also restore the test for 'a < a' that was removed by r12-2537 because
it is ill-formed. We still want to test operator< for tuple, we just
need to not use std::nullptr_t in that tuple type.

libstdc++-v3/ChangeLog:

	* testsuite/20_util/tuple/comparison_operators/overloaded.cc:
	Restore test for operator<.
	* testsuite/20_util/tuple/comparison_operators/overloaded2.cc:
	Adjust expected errors for C++20.
2021-10-12 16:05:15 +01:00
Jonathan Wakely
7481021364 libstdc++: Fix move construction of std::tuple with array elements [PR101960]
The r12-3022 commit only fixed the case where an array is the last
element of the tuple. This fixes the other cases too. We can just define
the move constructor as defaulted, which does the right thing. Changing
the move constructor to be trivial would be an ABI break, but since the
last base class still has a non-trivial move constructor, defining the
derived ones as defaulted doesn't change anything.
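
The ABI argument can be checked with a small sketch. LastBase and Derived are
hypothetical stand-ins for the recursive tuple base classes, not the real
_Tuple_impl types.

```cpp
#include <cassert>
#include <type_traits>

// Stand-in for the last base class, whose move constructor is user-provided
// and therefore non-trivial.
struct LastBase {
    LastBase() = default;
    LastBase(LastBase&&) noexcept {}
};

// Stand-in for a derived level: its move constructor is defaulted, but it
// still has to call LastBase's non-trivial move constructor.
struct Derived : LastBase {
    Derived() = default;
    Derived(Derived&&) = default;
};

static_assert(std::is_move_constructible<Derived>::value,
              "the defaulted move constructor is usable");
static_assert(!std::is_trivially_move_constructible<Derived>::value,
              "defaulting does not make it trivial while the base is non-trivial");
```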

libstdc++-v3/ChangeLog:

	PR libstdc++/101960
	* include/std/tuple (_Tuple_impl(_Tuple_impl&&)): Define as
	defaulted.
	* testsuite/20_util/tuple/cons/101960.cc: Check tuples with
	array elements before the last element.
2021-10-12 16:05:15 +01:00
Jonathan Wakely
d9dfd7ad3e libstdc++: Improve diagnostics for misuses of output iterators
This adds deleted overloads so that the errors for invalid uses of
std::advance and std::distance are easier to understand (see for example
PR 102181).
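
A minimal sketch of the technique, assuming tag dispatch on the iterator
category. The names in namespace sketch are hypothetical and are not the
library's internal functions.

```cpp
#include <cassert>
#include <iterator>
#include <vector>

namespace sketch {

// Normal path: count steps for anything that is at least an input iterator.
template <typename InputIt>
long distance_impl(InputIt first, InputIt last, std::input_iterator_tag)
{
    long n = 0;
    for (; first != last; ++first)
        ++n;
    return n;
}

// Deleted overload: chosen for pure output iterators, so the compiler
// reports a use of a deleted function instead of a failed overload set.
template <typename OutputIt>
void distance_impl(OutputIt, OutputIt, std::output_iterator_tag) = delete;

// Entry point that dispatches on the iterator category.
template <typename It>
long distance(It first, It last)
{
    typedef typename std::iterator_traits<It>::iterator_category category;
    return distance_impl(first, last, category());
}

} // namespace sketch
```

Calling sketch::distance with two std::back_insert_iterator objects would then
be rejected at the deleted overload.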

libstdc++-v3/ChangeLog:

	* include/bits/stl_iterator_base_funcs.h (__advance): Add
	deleted overload to improve diagnostics.
	(__distance): Likewise.
2021-10-12 16:05:15 +01:00
Daniel Le Duc Khoi Nguyen
8226f6383a doc: Fix typos in alloc_size documentation
gcc/
	* doc/extend.texi (Common Variable Attributes): Fix typos in
	alloc_size documentation.
2021-10-12 10:53:23 -04:00
Luís Ferreira
98c0ac7e0d [PATCH v2] libiberty: d-demangle: remove parenthesis where it is not needed
libiberty/
	* d-demangle.c (dlang_parse_qualified): Remove redundant parentheses
	around lhs and rhs of assignments.
2021-10-12 10:40:20 -04:00
Julian Brown
ccfcf08e66 libgomp: Release device lock on cbuf error path
This patch releases the device lock on a sanity-checking error path in
transfer combining (cbuf) handling in libgomp:target.c.  This shouldn't
happen when handling well-formed mapping clauses, but erroneous clauses
can currently cause a hang if the condition triggers.
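
The shape of the fix can be sketched as follows. Device and
copy_host_to_device are hypothetical, and the locking is simplified: in this
sketch the function takes the lock itself, which is an assumption and not how
target.c is organised.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>

// Hypothetical device with a lock and a fixed-size staging buffer.
struct Device {
    std::mutex lock;
    std::size_t cbuf_size;
    Device() : cbuf_size(64) {}
};

// Copies a chunk through the staging buffer.  Returns false when the chunk
// does not fit; the lock is released on that path too, so later cleanup
// code that takes the lock cannot block forever.
bool copy_host_to_device(Device& dev, std::size_t offset, std::size_t len)
{
    dev.lock.lock();
    if (offset > dev.cbuf_size || len > dev.cbuf_size - offset) {
        dev.lock.unlock();   // release before reporting the error
        return false;
    }
    // ... the copy into the staging buffer would happen here ...
    dev.lock.unlock();
    return true;
}
```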

2021-12-10  Julian Brown  <julian@codesourcery.com>

libgomp/
	* target.c (gomp_copy_host2dev): Release device lock on cbuf
	error path.
2021-10-12 06:50:26 -07:00
Richard Biener
d1dcaa3145 tree-optimization/102696 - fix SLP discovery for failed BIT_FIELD_REF
This fixes a forgotten adjustment of matches[] when we fail SLP
discovery.

2021-10-12  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/102696
	* tree-vect-slp.c (vect_build_slp_tree_2): Properly mark
	the tree fatally failed when we reject a BIT_FIELD_REF.

	* g++.dg/vect/pr102696.cc: New testcase.
2021-10-12 14:49:44 +02:00
Richard Biener
9f12a45ef1 tree-optimization/102572 - fix gathers with invariant mask
This fixes the vector def gathering for invariant masks which
failed to pass in the desired vector type resulting in a non-mask
type being generated.

2021-10-12  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/102572
	* tree-vect-stmts.c (vect_build_gather_load_calls): When
	gathering the vectorized defs for the mask pass in the
	desired mask vector type so invariants will be handled
	correctly.

	* g++.dg/vect/pr102572.cc: New testcase.
2021-10-12 14:49:44 +02:00
Tamar Christina
e36206c994 sve: combine inverted masks into NOTs
The following example

void f10(double * restrict z, double * restrict w, double * restrict x,
	 double * restrict y, int n)
{
    for (int i = 0; i < n; i++) {
        z[i] = (w[i] > 0) ? x[i] + w[i] : y[i] - w[i];
    }
}

generates currently:

        ld1d    z1.d, p1/z, [x1, x5, lsl 3]
        fcmgt   p2.d, p1/z, z1.d, #0.0
        fcmgt   p0.d, p3/z, z1.d, #0.0
        ld1d    z2.d, p2/z, [x2, x5, lsl 3]
        bic     p0.b, p3/z, p1.b, p0.b
        ld1d    z0.d, p0/z, [x3, x5, lsl 3]

where a BIC is generated between p1 and p0.  A NOT would be better here,
since it does not require the use of p3 and opens the pattern up to being CSEd.

After this patch using a 2 -> 2 split we generate:

        ld1d    z1.d, p0/z, [x1, x5, lsl 3]
        fcmgt   p2.d, p0/z, z1.d, #0.0
        not     p1.b, p0/z, p2.b

The additional scratch is needed such that we can CSE the two operations.  If
both statements wrote to the same register then CSE won't be able to CSE the
values if there are other statements in between that use the register.

A second pattern is needed to capture the nor case as combine will match the
longest sequence first.  So without this pattern we end up de-optimizing nor
and instead emit two nots.  I did not find a better way to do this.

gcc/ChangeLog:

	* config/aarch64/aarch64-sve.md (*fcm<cmp_op><mode>_bic_combine,
	*fcm<cmp_op><mode>_nor_combine, *fcmuo<mode>_bic_combine,
	*fcmuo<mode>_nor_combine): New.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/sve/pred-not-gen-1.c: New test.
	* gcc.target/aarch64/sve/pred-not-gen-2.c: New test.
	* gcc.target/aarch64/sve/pred-not-gen-3.c: New test.
	* gcc.target/aarch64/sve/pred-not-gen-4.c: New test.
2021-10-12 11:35:45 +01:00
Eric Botcazou
a1a7d09430 Fix PR target/102588
We need a 32-byte wide integer mode (OImode) in order to handle structure
returns in the 64-bit ABI.

gcc/
	PR target/102588
	* config/sparc/sparc-modes.def (OI): New integer mode.
2021-10-12 11:20:42 +02:00
Tobias Burnus
f5a538e164 Fortran version of libgomp.c-c++-common/icv-{3,4}.c
This adds the Fortran testsuite coverage of
omp_{get_max,set_num}_teams and omp_{s,g}et_teams_thread_limit

libgomp/
	* testsuite/libgomp.fortran/icv-3.f90: New.
	* testsuite/libgomp.fortran/icv-4.f90: New.
2021-10-12 10:54:18 +02:00
Tobias Burnus
eb92cd57a1 Fortran: Various CLASS + assumed-rank fixes [PR102541]
Starting point was PR102541, where a previous patch caused an invalid
e->ref access for class. When testing, it turned out that for
CLASS to CLASS the code was never executed - additionally, issues
appeared for optional and a bogus error for -fcheck=all. In particular:

There were a bunch of issues related to optional CLASS, which can have
'attr.dummy' set in CLASS_DATA (sym) - but sometimes also in 'sym'!?!
Additionally, gfc_variable_attr could return pointer = 1 for nonpointers
when the expr is no longer "var" but "var%_data".

	PR fortran/102541

gcc/fortran/ChangeLog:

	* check.c (gfc_check_present): Handle optional CLASS.
	* interface.c (gfc_compare_actual_formal): Likewise.
	* trans-array.c (gfc_trans_g77_array): Likewise.
	* trans-decl.c (gfc_build_dummy_array_decl): Likewise.
	* trans-types.c (gfc_sym_type): Likewise.
	* primary.c (gfc_variable_attr): Fixes for dummy and
	pointer when 'class%_data' is passed.
	* trans-expr.c (set_dtype_for_unallocated, gfc_conv_procedure_call):
	For assumed-rank dummy, fix setting rank for dealloc/notassoc actual
	and setting ubound to -1 for assumed-size actuals.

gcc/testsuite/ChangeLog:

	* gfortran.dg/assumed_rank_24.f90: New test.
2021-10-12 09:56:08 +02:00
Jakub Jelinek
8e1fe3f779 openmp: Avoid calling clear_type_padding_in_mask in the common case where there can't be any padding
We can use the clear_padding_type_may_have_padding_p function, which
is conservative for e.g. RECORD_TYPE/UNION_TYPE, but for the floating and
complex floating types is accurate.  clear_type_padding_in_mask is
more expensive because we need to allocate memory, fill it, call the function
which itself is more expensive and then analyze the memory, so for the
common case of float/double atomics or even long double on most targets
we can avoid that.

2021-10-12  Jakub Jelinek  <jakub@redhat.com>

gcc/
	* gimple-fold.h (clear_padding_type_may_have_padding_p): Declare.
	* gimple-fold.c (clear_padding_type_may_have_padding_p): No longer
	static.
gcc/c-family/
	* c-omp.c (c_finish_omp_atomic): Use
	clear_padding_type_may_have_padding_p.
2021-10-12 09:37:25 +02:00
Jakub Jelinek
4096bf82a0 openmp: Add documentation for omp_{get_max,set_num}_teams and omp_{s,g}et_teams_thread_limit
This patch adds documentation for these new OpenMP 5.1 APIs as well as
two new environment variables - OMP_NUM_TEAMS and OMP_TEAMS_THREAD_LIMIT.

2021-10-12  Jakub Jelinek  <jakub@redhat.com>

	* libgomp.texi (omp_get_max_teams, omp_get_teams_thread_limit,
	omp_set_num_teams, omp_set_teams_thread_limit, OMP_NUM_TEAMS,
	OMP_TEAMS_THREAD_LIMIT): Document.
2021-10-12 09:35:43 +02:00
Jakub Jelinek
de7fa7063e openmp: Fix up warnings on libgomp.info build
When building libgomp documentation, I see
makeinfo --split-size=5000000  -I ../../../libgomp/../gcc/doc/include -I ../../../libgomp -o libgomp.info ../../../libgomp/libgomp.texi
../../../libgomp/libgomp.texi:503: warning: node next `omp_get_default_device' in menu `omp_get_device_num' and in sectioning `omp_get_dynamic' differ
../../../libgomp/libgomp.texi:528: warning: node prev `omp_get_dynamic' in menu `omp_get_device_num' and in sectioning `omp_get_default_device' differ
../../../libgomp/libgomp.texi:560: warning: node next `omp_get_initial_device' in menu `omp_get_level' and in sectioning `omp_get_device_num' differ
../../../libgomp/libgomp.texi:587: warning: node next `omp_get_device_num' in menu `omp_get_dynamic' and in sectioning `omp_get_level' differ
../../../libgomp/libgomp.texi:587: warning: node prev `omp_get_device_num' in menu `omp_get_default_device' and in sectioning `omp_get_initial_device' differ
../../../libgomp/libgomp.texi:615: warning: node prev `omp_get_level' in menu `omp_get_initial_device' and in sectioning `omp_get_device_num' differ
warnings.  This patch fixes those.

2021-10-12  Jakub Jelinek  <jakub@redhat.com>

	* libgomp.texi (omp_get_device_num): Move @node before omp_get_dynamic
	to avoid makeinfo warnings.
2021-10-12 09:34:38 +02:00
Jakub Jelinek
88f5ad524a openmp: Add testsuite coverage for omp_{get_max,set_num}_teams and omp_{s,g}et_teams_thread_limit
This adds (C/C++ only) testsuite coverage for these new OpenMP 5.1 APIs.

2021-10-12  Jakub Jelinek  <jakub@redhat.com>

	* testsuite/libgomp.c-c++-common/icv-3.c: New test.
	* testsuite/libgomp.c-c++-common/icv-4.c: New test.
2021-10-12 09:32:28 +02:00
Jakub Jelinek
342aedf0e5 libgomp: alloc* test fixes [PR102628, PR102668]
As reported, the alloc-9.c test and alloc-{1,2,3}.F* and alloc-11.f90
tests fail on powerpc64-linux with -m32.
The reason why it fails just there is that malloc doesn't guarantee there
128-bit alignment (historically glibc guaranteed 2 * sizeof (void *)
alignment from malloc).

There are two separate issues.
One is a thinko on my side.
In this part of alloc-9.c test (copied to alloc-11.f90), we have
2 allocators, a with pool size 1024B and alignment 16B and default fallback
and a2 with pool size 512B and alignment 32B and a as fallback allocator.
We start at no allocations in both at line 194 and do:
  p = (int *) omp_alloc (sizeof (int), a2);
// This succeeds in a2 and needs 4+overhead bytes (which includes the 32B alignment)
  p = (int *) omp_realloc (p, 420, a, a2);
// This allocates 420 bytes+overhead in a, with 16B alignment and deallocates the above
  q = (int *) omp_alloc (sizeof (int), a);
// This allocates 4+overhead bytes in a, with 16B alignment
  q = (int *) omp_realloc (q, 420, a2, a);
// This allocates 420+overhead in a2 with 32B alignment
  q = (int *) omp_realloc (q, 768, a2, a2);
// This attempts to reallocate, but as there are elevated alignment
// requirements doesn't try to just realloc (even if it wanted to try that
// a2 is almost full, with 512-420-overhead bytes left in it), so it
// tries to alloc in a2, but there is no space left in the pool, falls
// back to a, which already has 420+overhead bytes allocated in it and
// 1024-420-overhead bytes left and so fails too and fails to default
// non-pool allocator that allocates it, but doesn't guarantee alignment
// higher than malloc guarantees.
// But, the test expected 16B alignment.

So, I've slightly lowered the allocation sizes in that part of the test
420->320 and 768 -> 568, so that the last test still fails to allocate
in a2 (568 > 512-320-overhead) but succeeds in a as fallback, which was
the intent of the test.
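
The arithmetic can be checked with a rough model. The per-allocation overhead
is not given above, so kAssumedOverhead is a made-up placeholder, and the pool
is modelled as holding at most one earlier allocation.

```cpp
#include <cassert>
#include <cstddef>

// Assumed bookkeeping overhead per allocation; the real value is an
// implementation detail of libgomp and is only a placeholder here.
const std::size_t kAssumedOverhead = 32;

// True if `request` bytes plus overhead fit in a pool of `pool_size` bytes
// that already holds one earlier allocation of `used` payload bytes.
bool fits_in_pool(std::size_t pool_size, std::size_t used, std::size_t request)
{
    const std::size_t taken = used ? used + kAssumedOverhead : 0;
    return taken <= pool_size
           && request + kAssumedOverhead <= pool_size - taken;
}
```

With the old sizes, 768 bytes fit neither in a2 (512B pool, 420 used) nor in a
(1024B pool, 420 used). With the new sizes, 568 bytes still do not fit in a2
but do fit in a, under the assumed overhead.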

Another thing is that alloc-1.F90 seems to be transcription of
libgomp.c-c++-common/alloc-1.c into Fortran, but alloc-1.c had:
  q = (int *) omp_alloc (768, a2);
  if ((((uintptr_t) q) % 16) != 0)
    abort ();
  q[0] = 7;
  q[767 / sizeof (int)] = 8;
  r = (int *) omp_alloc (512, a2);
  if ((((uintptr_t) r) % __alignof (int)) != 0)
    abort ();
there but Fortran has:
        cq = omp_alloc (768_c_size_t, a2)
        if (mod (transfer (cq, intptr), 16_c_intptr_t) /= 0) stop 12
        call c_f_pointer (cq, q, [768 / c_sizeof (i)])
        q(1) = 7
        q(768 / c_sizeof (i)) = 8
        cr = omp_alloc (512_c_size_t, a2)
        if (mod (transfer (cr, intptr), 16_c_intptr_t) /= 0) stop 13
I'm changing the latter to 4_c_intptr_t because other spots in the
testcase do that, Fortran sadly doesn't have c_alignof, but strictly
speaking it isn't correct, __alignof (int) could be on some architectures
smaller than 4.
So probably alloc-1.F90 etc. should also have
! { dg-additional-sources alloc-7.c }
! { dg-prune-output "command-line option '-fintrinsic-modules-path=.*' is valid for Fortran but not for C" }
and use get__alignof_int.

2021-10-12  Jakub Jelinek  <jakub@redhat.com>

	PR libgomp/102628
	PR libgomp/102668
	* testsuite/libgomp.c-c++-common/alloc-9.c (main): Decrease
	allocation sizes from 420 to 320 and from 768 to 568.
	* testsuite/libgomp.fortran/alloc-11.f90: Likewise.
	* testsuite/libgomp.fortran/alloc-1.F90: Change expected alignment
	for cr from 16 to 4.
2021-10-12 09:30:41 +02:00
Jakub Jelinek
fab2f61dc1 vectorizer: Fix up -fsimd-cost-model= handling
>	* testsuite/libgomp.c++/scan-10.C: Add option -fvect-cost-model=cheap.

I don't think this is the right thing to do.
This just means that at some point between 2013 when -fsimd-cost-model has
been introduced and now -fsimd-cost-model= option at least partially stopped
working properly.
As documented, -fsimd-cost-model= overrides the -fvect-cost-model= setting
for OpenMP simd loops (loop->force_vectorize is true) if specified differently
from default.
In tree-vectorizer.h we have:
static inline bool
unlimited_cost_model (loop_p loop)
{
  if (loop != NULL && loop->force_vectorize
      && flag_simd_cost_model != VECT_COST_MODEL_DEFAULT)
    return flag_simd_cost_model == VECT_COST_MODEL_UNLIMITED;
  return (flag_vect_cost_model == VECT_COST_MODEL_UNLIMITED);
}
and use it in various places, but we also just use flag_vect_cost_model
in lots of places (and in one spot use flag_simd_cost_model, not sure if
we are sure it is a force_vectorize loop or what).

So, IMHO we should change the above inline function to
loop_cost_model and let it return the cost model and then just
reimplement unlimited_cost_model as
return loop_cost_model (loop) == VECT_COST_MODEL_UNLIMITED;
and then adjust the direct uses of the flag and revert these changes.
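
A self-contained sketch of that proposal is shown below. The enum, the struct
and the two flag variables are reduced stand-ins for the GCC internals (their
values are arbitrary), so this is not the code from tree-vectorizer.h.

```cpp
#include <cassert>
#include <cstddef>

// Stand-ins for GCC internals, reduced to what the sketch needs.
enum vect_cost_model {
    VECT_COST_MODEL_DEFAULT,
    VECT_COST_MODEL_UNLIMITED,
    VECT_COST_MODEL_CHEAP
};
struct loop { bool force_vectorize; };
typedef loop *loop_p;

static vect_cost_model flag_vect_cost_model = VECT_COST_MODEL_CHEAP;
static vect_cost_model flag_simd_cost_model = VECT_COST_MODEL_DEFAULT;

// Return the cost model in effect for LOOP: the simd model for OpenMP simd
// loops when it was set explicitly, the general vectorizer model otherwise.
static inline vect_cost_model
loop_cost_model (loop_p loop)
{
  if (loop != NULL && loop->force_vectorize
      && flag_simd_cost_model != VECT_COST_MODEL_DEFAULT)
    return flag_simd_cost_model;
  return flag_vect_cost_model;
}

// Reimplemented on top of loop_cost_model, as proposed above.
static inline bool
unlimited_cost_model (loop_p loop)
{
  return loop_cost_model (loop) == VECT_COST_MODEL_UNLIMITED;
}
```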

2021-10-12  Jakub Jelinek  <jakub@redhat.com>

gcc/
	* tree-vectorizer.h (loop_cost_model): New function.
	(unlimited_cost_model): Use it.
	* tree-vect-loop.c (vect_analyze_loop_costing): Use loop_cost_model
	call instead of flag_vect_cost_model.
	* tree-vect-data-refs.c (vect_enhance_data_refs_alignment): Likewise.
	(vect_prune_runtime_alias_test_list): Likewise.  Also use it instead
	of flag_simd_cost_model.
gcc/testsuite/
	* gcc.dg/gomp/simd-2.c: Remove option -fvect-cost-model=cheap.
	* gcc.dg/gomp/simd-3.c: Likewise.
libgomp/
	* testsuite/libgomp.c/scan-11.c: Remove option -fvect-cost-model=cheap.
	* testsuite/libgomp.c/scan-12.c: Likewise.
	* testsuite/libgomp.c/scan-13.c: Likewise.
	* testsuite/libgomp.c/scan-14.c: Likewise.
	* testsuite/libgomp.c/scan-15.c: Likewise.
	* testsuite/libgomp.c/scan-16.c: Likewise.
	* testsuite/libgomp.c/scan-17.c: Likewise.
	* testsuite/libgomp.c/scan-18.c: Likewise.
	* testsuite/libgomp.c/scan-19.c: Likewise.
	* testsuite/libgomp.c/scan-20.c: Likewise.
	* testsuite/libgomp.c/scan-21.c: Likewise.
	* testsuite/libgomp.c/scan-22.c: Likewise.
	* testsuite/libgomp.c++/scan-9.C: Likewise.
	* testsuite/libgomp.c++/scan-10.C: Likewise.
	* testsuite/libgomp.c++/scan-11.C: Likewise.
	* testsuite/libgomp.c++/scan-12.C: Likewise.
	* testsuite/libgomp.c++/scan-13.C: Likewise.
	* testsuite/libgomp.c++/scan-14.C: Likewise.
	* testsuite/libgomp.c++/scan-15.C: Likewise.
	* testsuite/libgomp.c++/scan-16.C: Likewise.
2021-10-12 09:28:10 +02:00
liuhongt
73c535a00b Support reduc_{plus,smax,smin,umax,umin}_scal_v4qi.
gcc/ChangeLog

	PR target/102483
	* config/i386/i386-expand.c (emit_reduc_half): Handle
	V4QImode.
	* config/i386/mmx.md (reduc_<code>_scal_v4qi): New expander.
	(reduc_plus_scal_v4qi): Ditto.

gcc/testsuite/ChangeLog

	* gcc.target/i386/pr102483.c: New test.
	* gcc.target/i386/pr102483-2.c: New test.
2021-10-12 15:25:08 +08:00
liuhongt
d61ce6ab04 Adjust testcase for O2 vectorization enabling
This issue was observed in rs6000 specific PR102658 as well.

I've looked into it a bit; it's caused by the "conditional store replacement", which
was originally disabled without vectorization, as in the code below.

  /* If either vectorization or if-conversion is disabled then do
     not sink any stores.  */
  if (param_max_stores_to_sink == 0
      || (!flag_tree_loop_vectorize && !flag_tree_slp_vectorize)
      || !flag_tree_loop_if_convert)
    return false;

The new change makes the innermost loop look like

for (int c1 = 0; c1 <= 1499; c1 += 1) {
  if (c1 <= 500) {
     S_10(c0, c1);
  } else {
      S_9(c0, c1);
  }
  S_11(c0, c1);
}

and cannot be split into:

for (int c1 = 0; c1 <= 500; c1 += 1)
  S_10(c0, c1);

for (int c1 = 501; c1 <= 1499; c1 += 1)
  S_9(c0, c1);

So instead of disabling vectorization, could we just disable this cs replacement
with parameter "--param max-stores-to-sink=0"?

I tested this proposal on ppc64le, it should work as well.

2021-10-11  Kewen Lin  <linkw@linux.ibm.com>

libgomp/ChangeLog:

	* testsuite/libgomp.graphite/force-parallel-8.c: Add --param max-stores-to-sink=0.
2021-10-12 15:24:12 +08:00
Paul A. Clarke
82bc9355ee rs6000: Correct several errant dg-require-effective-target
I misspelled the dg-require-effective-target attribute "vsx_hw" in
recent commits, causing the affected tests to fail.  Correct the spelling.

2021-10-11  Paul A. Clarke  <pc@us.ibm.com>

gcc/testsuite
	* gcc.target/powerpc/pr78102.c: Fix dg-require-effective-target.
	* gcc.target/powerpc/sse4_1-packusdw.c: Likewise.
	* gcc.target/powerpc/sse4_1-pmaxsb.c: Likewise.
	* gcc.target/powerpc/sse4_1-pmaxsd.c: Likewise.
	* gcc.target/powerpc/sse4_1-pmaxud.c: Likewise.
	* gcc.target/powerpc/sse4_1-pmaxuw.c: Likewise.
	* gcc.target/powerpc/sse4_1-pminsb.c: Likewise.
	* gcc.target/powerpc/sse4_1-pminsd.c: Likewise.
	* gcc.target/powerpc/sse4_1-pminud.c: Likewise.
	* gcc.target/powerpc/sse4_1-pminuw.c: Likewise.
	* gcc.target/powerpc/sse4_1-pmovsxbd.c: Likewise.
	* gcc.target/powerpc/sse4_1-pmovsxbw.c: Likewise.
	* gcc.target/powerpc/sse4_1-pmovsxwd.c: Likewise.
	* gcc.target/powerpc/sse4_1-pmovzxbd.c: Likewise.
	* gcc.target/powerpc/sse4_1-pmovzxbq.c: Likewise.
	* gcc.target/powerpc/sse4_1-pmovzxbw.c: Likewise.
	* gcc.target/powerpc/sse4_1-pmovzxdq.c: Likewise.
	* gcc.target/powerpc/sse4_1-pmovzxwd.c: Likewise.
	* gcc.target/powerpc/sse4_1-pmovzxwq.c: Likewise.
	* gcc.target/powerpc/sse4_1-pmulld.c: Likewise.
	* gcc.target/powerpc/sse4_2-pcmpgtq.c: Likewise.
	* gcc.target/powerpc/sse4_1-phminposuw.c: Use correct
	dg-require-effective-target.
2021-10-11 22:40:59 -05:00
Paul A. Clarke
29fb1e831b rs6000: Support more SSE4 "cmp", "mul", "pack" intrinsics
Function signatures and decorations match gcc/config/i386/smmintrin.h.

Also, copy tests for:
- _mm_cmpeq_epi64
- _mm_mullo_epi32, _mm_mul_epi32
- _mm_packus_epi32
- _mm_cmpgt_epi64 (SSE4.2)

from gcc/testsuite/gcc.target/i386.

2021-10-11  Paul A. Clarke  <pc@us.ibm.com>

gcc
	* config/rs6000/smmintrin.h (_mm_cmpeq_epi64, _mm_cmpgt_epi64,
	_mm_mullo_epi32, _mm_mul_epi32, _mm_packus_epi32): New.
	* config/rs6000/nmmintrin.h: Copy from i386, tweak to suit.

gcc/testsuite
	* gcc.target/powerpc/pr78102.c: Copy from gcc.target/i386,
	adjust dg directives to suit.
	* gcc.target/powerpc/sse4_1-packusdw.c: Same.
	* gcc.target/powerpc/sse4_1-pcmpeqq.c: Same.
	* gcc.target/powerpc/sse4_1-pmuldq.c: Same.
	* gcc.target/powerpc/sse4_1-pmulld.c: Same.
	* gcc.target/powerpc/sse4_2-pcmpgtq.c: Same.
	* gcc.target/powerpc/sse4_2-check.h: Copy from gcc.target/i386,
	tweak to suit.
2021-10-11 20:26:15 -05:00
Paul A. Clarke
285d75a454 rs6000: Support SSE4.1 "cvt" intrinsics
Function signatures and decorations match gcc/config/i386/smmintrin.h.

Also, copy tests for:
- _mm_cvtepi8_epi16, _mm_cvtepi8_epi32, _mm_cvtepi8_epi64
- _mm_cvtepi16_epi32, _mm_cvtepi16_epi64
- _mm_cvtepi32_epi64
- _mm_cvtepu8_epi16, _mm_cvtepu8_epi32, _mm_cvtepu8_epi64
- _mm_cvtepu16_epi32, _mm_cvtepu16_epi64
- _mm_cvtepu32_epi64

from gcc/testsuite/gcc.target/i386.

sse4_1-pmovsxbd.c, sse4_1-pmovsxbq.c, and sse4_1-pmovsxbw.c were
modified from using "char" types to "signed char" types, because
the default is unsigned on powerpc.
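
Why the type change matters can be seen in a short sketch (the helper names
are hypothetical): widening a plain char gives different results depending on
the target's default signedness, while signed char behaves the same everywhere.

```cpp
#include <cassert>
#include <limits>

// Widen a byte with explicit signedness; the result is the same on all
// targets.
inline int widen_signed(signed char c) { return c; }

// Widen a plain char; for bytes with the top bit set the result depends on
// whether plain char is signed on the target.
inline int widen_plain(char c) { return c; }

// True on targets where plain char is a signed type.
inline bool plain_char_is_signed()
{
    return std::numeric_limits<char>::is_signed;
}
```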

2021-10-11  Paul A. Clarke  <pc@us.ibm.com>

gcc
	* config/rs6000/smmintrin.h (_mm_cvtepi8_epi16, _mm_cvtepi8_epi32,
	_mm_cvtepi8_epi64, _mm_cvtepi16_epi32, _mm_cvtepi16_epi64,
	_mm_cvtepi32_epi64, _mm_cvtepu8_epi16, _mm_cvtepu8_epi32,
	_mm_cvtepu8_epi64, _mm_cvtepu16_epi32, _mm_cvtepu16_epi64,
	_mm_cvtepu32_epi64): New.

gcc/testsuite
	* gcc.target/powerpc/sse4_1-pmovsxbd.c: Copy from gcc.target/i386,
	adjust dg directives to suit.
	* gcc.target/powerpc/sse4_1-pmovsxbq.c: Same.
	* gcc.target/powerpc/sse4_1-pmovsxbw.c: Same.
	* gcc.target/powerpc/sse4_1-pmovsxdq.c: Same.
	* gcc.target/powerpc/sse4_1-pmovsxwd.c: Same.
	* gcc.target/powerpc/sse4_1-pmovsxwq.c: Same.
	* gcc.target/powerpc/sse4_1-pmovzxbd.c: Same.
	* gcc.target/powerpc/sse4_1-pmovzxbq.c: Same.
	* gcc.target/powerpc/sse4_1-pmovzxbw.c: Same.
	* gcc.target/powerpc/sse4_1-pmovzxdq.c: Same.
	* gcc.target/powerpc/sse4_1-pmovzxwd.c: Same.
	* gcc.target/powerpc/sse4_1-pmovzxwq.c: Same.
2021-10-11 20:26:15 -05:00
Paul A. Clarke
1ec08caf7e rs6000: Simplify some SSE4.1 "test" intrinsics
Copy some simple redirections from i386 <smmintrin.h>, for:
- _mm_test_all_zeros
- _mm_test_all_ones
- _mm_test_mix_ones_zeros

2021-10-11  Paul A. Clarke  <pc@us.ibm.com>

gcc
	* config/rs6000/smmintrin.h (_mm_test_all_zeros,
	_mm_test_all_ones, _mm_test_mix_ones_zeros): Rewrite as macro.
2021-10-11 20:26:15 -05:00
Paul A. Clarke
2be6f6d498 rs6000: Support SSE4.1 "min" and "max" intrinsics
Function signatures and decorations match gcc/config/i386/smmintrin.h.

Also, copy tests for _mm_min_epi8, _mm_min_epu16, _mm_min_epi32,
_mm_min_epu32, _mm_max_epi8, _mm_max_epu16, _mm_max_epi32, _mm_max_epu32
from gcc/testsuite/gcc.target/i386.

sse4_1-pmaxsb.c and sse4_1-pminsb.c were modified from using
"char" types to "signed char" types, because the default is unsigned on
powerpc.

2021-10-11  Paul A. Clarke  <pc@us.ibm.com>

gcc
	* config/rs6000/smmintrin.h (_mm_min_epi8, _mm_min_epu16,
	_mm_min_epi32, _mm_min_epu32, _mm_max_epi8, _mm_max_epu16,
	_mm_max_epi32, _mm_max_epu32): New.

gcc/testsuite
	* gcc.target/powerpc/sse4_1-pmaxsb.c: Copy from gcc.target/i386.
	* gcc.target/powerpc/sse4_1-pmaxsd.c: Same.
	* gcc.target/powerpc/sse4_1-pmaxud.c: Same.
	* gcc.target/powerpc/sse4_1-pmaxuw.c: Same.
	* gcc.target/powerpc/sse4_1-pminsb.c: Same.
	* gcc.target/powerpc/sse4_1-pminsd.c: Same.
	* gcc.target/powerpc/sse4_1-pminud.c: Same.
	* gcc.target/powerpc/sse4_1-pminuw.c: Same.
2021-10-11 20:26:14 -05:00
GCC Administrator
732d763847 Daily bump. 2021-10-12 00:17:02 +00:00
Eric Gallager
30cce6f65a Add obj-c++.srcman target to gcc/objcp/Makefile.
Closes #56604

Signed-off-by: Eric Gallager <egallager@gcc.gnu.org>

gcc/objcp/ChangeLog:
	PR objc++/56604
	* Make-lang.in: Add obj-c++.srcman: line.
2021-10-11 16:21:48 -04:00
Jan Hubicka
150493d1fa Revert accidental change in ipa-modref-tree.h
* ipa-modref-tree.h (struct modref_access_node): Revert
	accidental change.
	(struct modref_ref_node): Likewise.
2021-10-11 21:58:43 +02:00
Jonathan Wakely
250ddf4c0b libstdc++: Add wrapper for internal uses of std::terminate
This adds an inline wrapper for std::terminate that doesn't add the
declaration of std::terminate to namespace std. This allows the
library to terminate without including all of <exception>.
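
The mechanism can be sketched with a block-scope function declaration. The
names in namespace lib are hypothetical, and the sketch returns a value so it
can be tested, unlike a real terminate function.

```cpp
#include <cassert>

namespace lib {

// The declaration inside the function body refers to lib::do_stop, but it
// does not add the name do_stop to namespace scope for code that only
// includes this wrapper.
inline int stop_wrapper()
{
    int do_stop();      // block-scope declaration of lib::do_stop
    return do_stop();
}

} // namespace lib

// In real use this definition would live in another translation unit.
namespace lib {
int do_stop() { return 42; }
}
```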

libstdc++-v3/ChangeLog:

	* include/bits/atomic_timed_wait.h: Remove unused header.
	* include/bits/c++config (std::__terminate): Define.
	* include/bits/semaphore_base.h: Remove <exception> and use
	__terminate instead of terminate.
	* include/bits/std_thread.h: Likewise.
	* libsupc++/eh_terminate.cc (std::terminate): Use qualified-id
	to call __cxxabiv1::__terminate.
2021-10-11 20:35:51 +01:00
Jonathan Wakely
247bac507e libstdc++: Simplify std::basic_regex::assign
We know that if __is_contiguous_iterator is true then we have a pointer
or a __normal_iterator that wraps a pointer, so we don't need to use
std::__to_address.

libstdc++-v3/ChangeLog:

	* include/bits/regex.h (basic_regex::assign(Iter, Iter)): Avoid
	std::__to_address by using pointer directly or using base()
	member of __normal_iterator.
2021-10-11 20:35:45 +01:00
Jonathan Wakely
45ba5426c1 libstdc++: Fix std::numeric_limits::lowest() test for strict modes
This test uses std::is_integral to decide whether we are testing an
integral or floating-point type. But that fails for __int128 because
is_integral<__int128> is false in strict modes. By using
numeric_limits::is_integer instead we get the right answer for all types
that have a numeric_limits specialization.

We can also simplify the test by removing the unnecessary tag
dispatching.
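
A sketch of the simplified check, assuming the test compares lowest() with
min() for integers and with -max() for floating-point types. The function name
is hypothetical and this is not the testsuite file.

```cpp
#include <cassert>
#include <limits>

// For integer types lowest() equals min(); for floating-point types it
// equals -max().  The branch is chosen with numeric_limits<T>::is_integer,
// so no tag dispatching is needed.
template <typename T>
bool lowest_is_consistent()
{
    typedef std::numeric_limits<T> limits;
    return limits::is_integer ? limits::lowest() == limits::min()
                              : limits::lowest() == -limits::max();
}
```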

libstdc++-v3/ChangeLog:

	* testsuite/18_support/numeric_limits/lowest.cc: Use
	numeric_limits<T>::is_integer instead of is_integral<T>::value.
2021-10-11 20:34:17 +01:00
Jonathan Wakely
6b6788f8c2 libstdc++: Add valid range assertions to std::basic_regex [PR89927]
This adds some debug assertions to basic_regex. They don't actually
diagnose the error in the PR yet, but I have another patch to make them
more effective.

Also change the __glibcxx_assert(false) consistency checks to include a
string literal that tells the user a bit more about why the process
aborted. We could consider adding a __glibcxx_bug or
__glibcxx_internal_error macro for this purpose, but ideally we'll never
hit such bugs anyway so it shouldn't be needed.
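
One common way to get a message into an assertion is to negate a string
literal. The sketch below uses plain assert and a hypothetical token type;
whether the library spells its checks exactly this way is an assumption.

```cpp
#include <cassert>
#include <string>

enum Token { tok_literal, tok_star, tok_end };

// Returns a printable name for a token.  The statement after the switch is
// a consistency check: a string literal is never null, so !"text" is always
// false and the text appears in the assertion failure message.
inline std::string token_name(Token t)
{
    switch (t) {
    case tok_literal: return "literal";
    case tok_star:    return "star";
    case tok_end:     return "end";
    }
    assert(!"unexpected token kind in token_name");
    return std::string();
}
```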

libstdc++-v3/ChangeLog:

	PR libstdc++/89927
	* include/bits/regex.h (basic_regex(const _Ch_type*, size_t)):
	Add __glibcxx_requires_string_len assertion.
	(basic_regex::assign(InputIterator, InputIterator)): Add
	__glibcxx_requires_valid_range assertion.
	* include/bits/regex_scanner.tcc (_Scanner::_M_advance())
	(_Scanner::_M_scan_normal()): Use string literal in assertions.
2021-10-11 20:34:16 +01:00
Jonathan Wakely
84088dc4bb libstdc++: Fix std::match_results::end() for failed matches [PR102667]
The end() function needs to consider whether the underlying vector is
empty, not whether the match_results object is empty. That's because the
underlying vector will always contain at least three elements for a
match_results object that is "ready". It contains three extra elements
which are stored in the vector but are not considered part of the sequence,
and so should not be part of the [begin(),end()) range.
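
The difference can be shown with a simplified model. MatchResultsModel is
hypothetical; it only mirrors the layout described above (sub-matches followed
by three extra elements) and is not the library class.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified model: a "ready" object stores its sub-matches followed by
// three extra elements; a default-constructed object stores nothing.
struct MatchResultsModel {
    typedef std::vector<int>::const_iterator const_iterator;
    std::vector<int> base;

    // No sub-matches are visible when only the extra elements exist.
    bool empty() const { return base.size() <= 3; }

    const_iterator begin() const { return base.begin(); }

    // Fixed version: look at the underlying vector, so the three extra
    // elements are skipped even for a failed (empty but ready) match.
    const_iterator end() const
    { return base.end() - (base.empty() ? 0 : 3); }

    // Old behaviour: keyed on empty(), which leaves the three extra
    // elements inside [begin(), end()) after a failed match.
    const_iterator buggy_end() const
    { return base.end() - (empty() ? 0 : 3); }
};
```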

libstdc++-v3/ChangeLog:

	PR libstdc++/102667
	* include/bits/regex.h (match_results::empty()): Optimize by
	calling the base function directly.
	(match_results::end()): Check _Base_type::empty() not empty().
	* testsuite/28_regex/match_results/102667.C: New test.
2021-10-11 20:34:16 +01:00
Jan Hubicka
008e7397da Commonize ipa-pta constraint generation for calls
Commonize the three paths that produce constraints for function calls
and make them more flexible, so we can implement new features more easily.  Main
idea is to not special case pure and const since we can now describe all of
pure/const via their EAF flags (implicit_const_eaf_flags and
implicit_pure_eaf_flags) and info on existence of global memory loads/stores in
function which is readily available in the modref tree.

While rewriting the function, I dropped some of the optimizations in the way we
generate constraints. Some of them we may want to add back, but I think the
constraint solver should be fast enough to get rid of them quickly, so it looks
like a bit of premature optimization.

We now always produce one additional PTA variable (callescape) for things that
escape into function call and thus can be stored to parameters or global memory
(if modified). This is no longer the same as global escape in case function is
not reading global memory. It is also not same as call use, since we now
understand the fact that interposable functions may use parameter in a way that
is not relevant for PTA (so we can not optimize out stores initializing the
memory, but we can be safe about the fact that pointers stored do not escape).

Compared to previous code we now handle correctly EAF_NOT_RETURNED in all cases
(previously we did so only when all parameters had the flag) and also handle
NOCLOBBER in more cases (since we make difference between global escape and
call escape). Because I commonized code handling args and static chains, we
could now easily extend modref to also track flags for static chain and return
slot which I plan to do next.

Otherwise I put some effort into producing constraints that produce similar
solutions as before (so it is harder to debug differences). For example if
global memory is written one can simply move callescape to escape rather than
making everything escape by its own constraints, but it affects ipa-pta
testcases.

gcc/ChangeLog:

	* ipa-modref-tree.h (modref_tree::global_access_p): New member
	function.
	* ipa-modref.c:
	(implicit_const_eaf_flags,implicit_pure_eaf_flags,
	ignore_stores_eaf_flags): Move to ipa-modref.h
	(remove_useless_eaf_flags): Remove early exit on NOCLOBBER.
	(modref_summary::global_memory_read_p): New member function.
	(modref_summary::global_memory_written_p): New member function.
	* ipa-modref.h (modref_summary::global_memory_read_p,
	modref_summary::global_memory_written_p): Declare.
	(implicit_const_eaf_flags,implicit_pure_eaf_flags,
	ignore_stores_eaf_flags): Move here.
	* tree-ssa-structalias.c: Include ipa-modref-tree.h, ipa-modref.h
	and attr-fnspec.h.
	(handle_rhs_call): Rewrite.
	(handle_call_arg): New function.
	(determine_global_memory_access): New function.
	(handle_const_call): Remove.
	(handle_pure_call): Remove.
	(find_func_aliases_for_call): Update use of handle_rhs_call.
	(compute_points_to_sets): Handle global memory accesses
	selectively.

gcc/testsuite/ChangeLog:

	* gcc.dg/torture/ssa-pta-fn-1.c: Fix template; add noipa.
	* gcc.dg/tree-ssa/pta-callused.c: Fix template.
2021-10-11 18:43:26 +02:00
Patrick Palka
0de8c2f810 c++: Add testcase for already-fixed PR [PR102643]
Fixed with r12-1744.

	PR c++/102643

gcc/testsuite/ChangeLog:

	* g++.dg/cpp2a/class-deduction-alias11.C: New test.
2021-10-11 12:33:30 -04:00
Diane Meirowitz
1c0a83eff7 doc: improve -fsanitize=undefined description
gcc/ChangeLog:
	* doc/invoke.texi: Add link to UndefinedBehaviorSanitizer
	documentation, mention UBSAN_OPTIONS, similar to what is done
	for AddressSanitizer.
2021-10-11 17:08:06 +01:00
Jonathan Wakely
f858239830 ChangeLog: Remove incorrect PR reference 2021-10-11 16:19:01 +01:00
Richard Biener
338725652f middle-end/102683 - fix .DEFERRED_INIT expansion
This avoids using an integer type for which we don't have an
appropriate mode when expanding .DEFERRED_INIT to a non-memory
entity.

2021-10-11  Richard Biener  <rguenther@suse.de>

	PR middle-end/102683
	* internal-fn.c (expand_DEFERRED_INIT): Check for mode
	availability before building an integer type for storage
	purposes.
2021-10-11 16:41:49 +02:00
Richard Biener
09a0affdb0 middle-end/101480 - overloaded global new/delete
The following fixes the issue of ignoring side-effects on memory
from overloaded global new/delete operators by not marking them
as effectively 'const' apart from other explicitly specified
side-effects.

This will cause

FAIL: g++.dg/warn/Warray-bounds-16.C  -std=gnu++1? (test for excess errors)

because we no longer statically see that the initialization loop
never executes, since the call to operator new can now clobber 'a.m'.
This seems to be an issue with the warning code and/or ranger so
I'm leaving this FAIL to be addressed as a followup.
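
The kind of side effect at stake can be sketched as follows. This is a
hypothetical illustration, not the testcase added by this commit: the names
allocation_count, last_allocation and allocations_observed are made up. A
replaced global operator new updates a global counter, so a caller that reads
the counter before and after a new-expression must see the change. If the
call were treated as 'const', the second read could be folded into the first.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical global state modified by the replaced operator new.
static unsigned long allocation_count = 0;

// Volatile sink so the allocation below cannot be elided.
static void* volatile last_allocation = nullptr;

// User replacement of the global allocation function with a visible
// side effect on global memory.
void* operator new(std::size_t size) {
    ++allocation_count;
    void* p = std::malloc(size != 0 ? size : 1);
    if (p == nullptr)
        throw std::bad_alloc();
    return p;
}

// Matching replacements for the deallocation functions.
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Returns how many times the replaced operator new ran for one
// new-expression, as observed through the global counter.
unsigned long allocations_observed() {
    unsigned long before = allocation_count;
    int* p = new int(42);
    last_allocation = p;                     // keep the allocation alive
    unsigned long after = allocation_count;  // must be re-read after the call
    delete p;
    return after - before;
}
```

A correct compilation has to reload allocation_count after the call, which is
why the operators can no longer be marked as effectively 'const'.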

2021-10-11  Richard Biener  <rguenther@suse.de>

	PR middle-end/101480
	* gimple.c (gimple_call_fnspec): Do not mark operator new/delete
	as const.

	* g++.dg/torture/pr10148.C: New testcase.
2021-10-11 16:20:19 +02:00
Eric Botcazou
a40970cf04 [Ada] Fix problematic import of type-generic GCC atomic builtin
gcc/ada/

	* gcc-interface/gigi.h (resolve_atomic_size): Declare.
	(list_third): New inline function.
	* gcc-interface/decl.c (type_for_atomic_builtin_p): New function.
	(resolve_atomic_builtin): Likewise.
	(gnat_to_gnu_subprog_type): Perform type resolution for most of
	type-generic GCC atomic builtins and give an error for the rest.
	* gcc-interface/utils2.c (resolve_atomic_size): Make public.
2021-10-11 13:38:13 +00:00
Eric Botcazou
4a0d6b70e3 [Ada] Tweak the warning about missing local raises
gcc/ada/

	* gcc-interface/trans.c (gnat_to_gnu) <N_Pop_Constraint_Error_Label>:
	Give the warning only if No_Exception_Propagation is active.
	<N_Pop_Storage_Error_Label>: Likewise.
	<N_Pop_Program_Error_Label>: Likewise.
2021-10-11 13:38:13 +00:00
Eric Botcazou
5ea133c6ce [Ada] Fix for atomic wrongly rejected on object of discriminated type
gcc/ada/

	* gcc-interface/decl.c (promote_object_alignment): Add GNU_SIZE
	parameter and use it for the size of the object if not null.
	(gnat_to_gnu_entity) <E_Variable>: Perform the automatic alignment
	promotion for objects whose nominal subtype is of variable size.
	(gnat_to_gnu_field): Adjust call to promote_object_alignment.
2021-10-11 13:38:13 +00:00
Eric Botcazou
92961bdf2d [Ada] Fix incorrect size for pathological pass-by-copy parameters
gcc/ada/

	* gcc-interface/decl.c (gnat_to_gnu_param): Strip padding types
	only if the size does not change in the process.  Rename local
	variable and add bypass for initialization procedures.
2021-10-11 13:38:13 +00:00
Doug Rupp
547513eeab [Ada] Runtime transition: System.Threads
gcc/ada/

	* libgnat/s-thread.ads: Fix comments.  Remove unused package
	imports.
	(Thread_Body_Exception_Exit): Remove Exception_Occurrence
	parameter.
	(ATSD): Declare type locally.
	* libgnat/s-thread__ae653.adb: Fix comments.  Remove unused
	package imports.  Remove package references to Stack_Limit
	checking.
	(Install_Handler): Remove.
	(Set_Sec_Stack): Likewise.
	(Thread_Body_Enter): Remove calls to Install_Handler and
	Stack_Limit checking.
	(Thread_Body_Exception_Exit): Remove Exception_Occurrence
	parameter.
	(Init_RTS): Call local Get_Sec_Stack.  Remove call to
	Install_Handler.  Remove references to accessors for
	Get_Sec_Stack and Set_Sec_Stack.  Remove OS check.
	(Set_Sec_Stack): Remove.
2021-10-11 13:38:12 +00:00
Piotr Trojanek
a59626c8b8 [Ada] Remove redundant guard in expansion of dispatching calls
gcc/ada/

	* exp_ch3.adb (Make_Predefined_Primitive_Specs,
	Predefined_Primitive_Bodies): Remove guard with restriction
	No_Dispatching_Calls.
2021-10-11 13:38:12 +00:00
Steve Baird
939047f542 [Ada] Valid postconditions incorrectly rejected.
gcc/ada/

	* sem_attr.adb (Analyze_Attribute_Old_Result): Permit an
	attribute reference inside a compiler-generated _Postconditions
	procedure. In this case, Subp_Decl is assigned the declaration
	of the enclosing subprogram.
	* exp_util.adb (Insert_Actions): When climbing up the tree
	looking for an insertion point, do not climb past an
	N_Iterated_Component/Element_Association, since this could
	result in inserting a reference to a loop parameter at a
	location outside of the scope of that loop parameter. On the
	other hand, be careful to preserve existing behavior in the case
	of an N_Component_Association node.
2021-10-11 13:38:12 +00:00
Steve Baird
2ad5d5e3d5 [Ada] Incorrect Dynamic_Predicate results for static arguments
gcc/ada/

	* exp_ch6.adb (Can_Fold_Predicate_Call): Do not attempt folding
	if there is more than one predicate involved. Recall that
	predicate aspect specifications are additive, not overriding, and
	that there are three different predicate
	aspects (Dynamic_Predicate, Static_Predicate, and the
	GNAT-defined Predicate aspect). These various ways of
	introducing multiple predicates are all checked for.  A new
	nested function, Augments_Other_Dynamic_Predicate, is
	introduced.
	* sem_ch4.adb
	(Analyze_Indexed_Component_Form.Process_Function_Call): When
	determining whether a name like "X (Some_Discrete_Type)" might
	be interpreted as a slice, the answer should be "no" if the
	type/subtype name denotes the current instance of type/subtype.
2021-10-11 13:38:12 +00:00
Patrick Bernardi
26a7b2ada5 [Ada] sigset_t is an unsigned long on RTEMS
gcc/ada/

	* libgnarl/s-osinte__rtems.ads: Change sigset_t to an unsigned
	long.
2021-10-11 13:38:12 +00:00