Commit Graph

187449 Commits

Author SHA1 Message Date
Martin Liska 84f906df4f mklog: support '-b c/101343' format.
contrib/ChangeLog:

	* mklog.py: Support additional PRs without PR prefix.
2021-08-10 17:53:48 +02:00
Tobias Burnus 2ba0376ac4 gfortran: Fix in-build-tree testing [PR101305, PR101660]
ISO_Fortran_binding.h is written in the build dir - hence, a previous commit
added it as include directory for in-build-tree testing.  However,
it turned out that -I$specdir/libgfortran interferes with reading .mod files
as they are then no longer regarded as intrinsic modules.  Solution: Create
an extra include/ directory in the libgfortran build dir and copy
ISO_Fortran_binding.h to that directory.  As -B$specdir/libgfortran already
causes gfortran to read that include subdirectory, the -I flag is no longer
needed.

	PR libfortran/101305
	PR fortran/101660
	PR testsuite/101847

libgfortran/ChangeLog:

	* Makefile.am (ISO_Fortran_binding.h): Create include/ in the build dir
	and copy the include file to it.
	(clean-local): Add for removing the 'include' directory.
	* Makefile.in: Regenerate.

gcc/testsuite/ChangeLog:

	* lib/gfortran.exp (gfortran_init): Remove -I$specpath/libgfortran
	from the string used to set GFORTRAN_UNDER_TEST.
2021-08-10 17:26:32 +02:00
H.J. Lu 557d06f8b3 Enable gcc.target/i386/pr88531-1a.c for all targets
PR tree-optimization/101809
	* gcc.target/i386/pr88531-1a.c: Enable for all targets.
2021-08-10 05:30:44 -07:00
Jakub Jelinek 50b5877925 i386: Allow some V32HImode and V64QImode permutations even without AVX512BW [PR80355]
When working on the PR, I've noticed we generate terrible code for
V32HImode or V64QImode permutations for -mavx512f -mno-avx512bw.
Generally we can't do much with such permutations, but since PR68655
we can handle at least some, those expressible using V16SImode or V8DImode
permutations, but that wasn't reachable, because ix86_vectorize_vec_perm_const
didn't even try, it said without TARGET_AVX512BW it can't do anything, and
with it can do everything, no d.testing_p attempts.

This patch makes it try it for TARGET_AVX512F && !TARGET_AVX512BW.

The first hunk is to avoid ICE, expand_vec_perm_even_odd_1 asserts d->vmode
isn't V32HImode because expand_vec_perm_1 for AVX512BW handles already
all permutations, but when we let it through without !TARGET_AVX512BW,
expand_vec_perm_1 doesn't handle it.

If we want, that hunk can be dropped if we implement in
expand_vec_perm_even_odd_1 and its helper the even permutation as
vpmovdw + vpmovdw + vinserti64x4 and odd permutation as
vpsrld $16 + vpsrld $16 + vpmovdw + vpmovdw + vinserti64x4.

2021-08-10  Jakub Jelinek  <jakub@redhat.com>

	PR target/80355
	* config/i386/i386-expand.c (expand_vec_perm_even_odd): Return false
	for V32HImode if !TARGET_AVX512BW.
	(ix86_vectorize_vec_perm_const) <case E_V32HImode, case E_V64QImode>:
	If !TARGET_AVX512BW and TARGET_AVX512F and d.testing_p, don't fail
	early, but actually check the permutation.

	* gcc.target/i386/avx512f-pr80355-2.c: New test.
2021-08-10 12:38:00 +02:00
Richard Biener 08aa0e3d4f tree-optimization/101809 - support emulated gather for double[int]
This adds emulated gather support for index vectors with more
elements than the data vector.  The internal function gather
vectorization code doesn't currently handle this (but the builtin
decl code does).  This allows vectorization of double data gather
with int indexes on 32bit platforms where there isn't an implicit
widening to 64bit present.
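
For illustration only, a loop of the kind described above (double data
gathered through int indexes) might look like the following minimal sketch.
The function name gather_double and its parameters are hypothetical and are
not taken from the commit or its testcase.

```c
/* Hypothetical example, not from the GCC testsuite: gather doubles
   through int indexes.  On a 32-bit target the int index is not
   implicitly widened to 64 bits, so the index vector holds more
   elements than the data vector of the same size.  */
void gather_double(double *restrict out, const double *restrict data,
                   const int *restrict idx, int n)
{
    for (int i = 0; i < n; i++)
        out[i] = data[idx[i]];  /* indexed (gather) load */
}
```

Whether a given compiler actually vectorizes this loop depends on the target
and the options used; the sketch only shows the access pattern.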

2021-08-10  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/101809
	* tree-vect-stmts.c (get_load_store_type): Allow emulated
	gathers with offset vector nunits being a constant multiple
	of the data vector nunits.
	(vect_get_gather_scatter_ops): Use the appropriate nunits
	for the offset vector defs.
	(vectorizable_store): Adjust call to
	vect_get_gather_scatter_ops.
	(vectorizable_load): Likewise.  Handle the case of less
	offset vectors than data vectors.
2021-08-10 12:25:19 +02:00
Jakub Jelinek 7665af0b1a i386: Improve single operand AVX512F permutations [PR80355]
On the following testcase we emit
	vmovdqa32	.LC0(%rip), %zmm1
	vpermd	%zmm0, %zmm1, %zmm0
and
	vmovdqa64	.LC1(%rip), %zmm1
	vpermq	%zmm0, %zmm1, %zmm0
instead of
	vshufi32x4	$78, %zmm0, %zmm0, %zmm0
and
	vshufi64x2	$78, %zmm0, %zmm0, %zmm0
we can emit with the patch.  We have patterns that match two argument
permutations for vshuf[if]*, but for one argument it doesn't trigger.
Either we can add two patterns for that, or we would need to add another
routine to i386-expand.c that would transform under certain condition
these cases to the two argument vshuf*, doing it in sse.md looked simpler.
We don't need this for 32-byte vectors, we already emit single insn
permutation that doesn't need memory op there.

2021-08-10  Jakub Jelinek  <jakub@redhat.com>

	PR target/80355
	* config/i386/sse.md (*avx512f_shuf_<shuffletype>64x2_1<mask_name>_1,
	*avx512f_shuf_<shuffletype>32x4_1<mask_name>_1): New define_insn
	patterns.

	* gcc.target/i386/avx512f-pr80355-1.c: New test.
2021-08-10 11:34:53 +02:00
Jakub Jelinek c40c6a50fd openmp: Add support for declare simd and declare variant in attribute syntax
This patch adds support for declare simd and declare variant in attribute
syntax.  Either in attribute-specifier-seq at the start of declaration, in
that case it has similar restriction to pragma-syntax, that there is a single
function declaration/definition in the declaration, rather than variable
declaration or more than one function declarations or mix of function and
variable declarations.  Or after the declarator id, in that case it applies
just to the single function declaration and the same declaration can have
multiple such attributes.  Or both.

Furthermore, cp_parser_statement has been adjusted so that it doesn't
accept [[omp::directive (parallel)]] etc. before statements that don't
take attributes at all, or where those attributes don't appertain to
the statement but something else (e.g. to label, using directive,
declaration, etc.).

2021-08-10  Jakub Jelinek  <jakub@redhat.com>

gcc/cp/
	* parser.h (struct cp_omp_declare_simd_data): Remove
	in_omp_attribute_pragma and clauses members, add loc and attribs.
	(struct cp_oacc_routine_data): Remove loc member, add clauses
	member.
	* parser.c (cp_finalize_omp_declare_simd): New function.
	(cp_parser_handle_statement_omp_attributes): Mention in
	function comment the function is used also for
	attribute-declaration.
	(cp_parser_handle_directive_omp_attributes): New function.
	(cp_parser_statement): Don't call
	cp_parser_handle_statement_omp_attributes if statement doesn't
	have attribute-specifier-seq at the beginning at all or if
	those attributes don't appertain to the statement.
	(cp_parser_simple_declaration): Call
	cp_parser_handle_directive_omp_attributes and
	cp_finalize_omp_declare_simd.
	(cp_parser_explicit_instantiation): Likewise.
	(cp_parser_init_declarator): Initialize prefix_attributes
	only after parsing declarators.
	(cp_parser_direct_declarator): Call
	cp_parser_handle_directive_omp_attributes and
	cp_finalize_omp_declare_simd.
	(cp_parser_member_declaration): Likewise.
	(cp_parser_single_declaration): Likewise.
	(cp_parser_omp_declare_simd): Don't initialize
	data.in_omp_attribute_pragma, instead initialize
	data.attribs[0] and data.attribs[1].
	(cp_finish_omp_declare_variant): Remove
	in_omp_attribute_pragma argument, instead use
	parser->lexer->in_omp_attribute_pragma.
	(cp_parser_late_parsing_omp_declare_simd): Adjust
	cp_finish_omp_declare_variant caller.  Handle attribute-syntax
	declare simd/variant.
gcc/testsuite/
	* g++.dg/gomp/attrs-1.C (bar): Add missing semicolon after
	[[omp::directive (threadprivate (t2))]].  Add tests with
	if/while/switch after parallel in attribute syntax.
	(corge): Add missing omp:: before directive.
	* g++.dg/gomp/attrs-2.C (bar): Add missing semicolon after
	[[omp::directive (threadprivate (t2))]].
	* g++.dg/gomp/attrs-10.C: New test.
	* g++.dg/gomp/attrs-11.C: New test.
2021-08-10 11:22:33 +02:00
Hongyu Wang c318f8e42b i386: Fix typos in amxbf16 runtime test.
gcc/testsuite/ChangeLog:

	* gcc.target/i386/amxbf16-dpbf16ps-2.c: Fix typos.
2021-08-10 16:20:20 +08:00
Richard Biener 19d1a529fa tree-optimization/101801 - rework generic vector vectorization more
This builds on top of the vect_worthwhile_without_simd_p refactoring
done earlier.  It was wrong in dropping the apparent double checks
for operation support since the optab check can happen with an
integer vector emulation mode and thus succeed but vector lowering
might not actually support the operation on word_mode.

The following patch adds a vect_emulated_vector_p helper and
re-instantiates the check where it was previously.  It also adds
appropriate costing of the scalar stmts emitted by vector lowering
to vectorizable_operation which should be the only place such
operations are synthesized.  I've also cared for the case where
the vector mode is supported but the operation is not (though
I think this will be unlikely given we're talking about plus, minus
and negate).

This fixes the observed FAIL of gcc.dg/tree-ssa/gen-vect-11b.c
with -m32 where we end up vectorizing a multiplication that ends up
being torn down to scalars again by vector lowering.

I'm not super happy about all the other places where we're now
and previously feeding scalar modes to optab checks where we
want to know whether we can vectorize something but well.

2021-09-08  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/101801
	PR tree-optimization/101819
	* tree-vectorizer.h (vect_emulated_vector_p): Declare.
	* tree-vect-loop.c (vect_emulated_vector_p): New function.
	(vectorizable_reduction): Re-instantiate a check for emulated
	operations.
	* tree-vect-stmts.c (vectorizable_shift): Likewise.
	(vectorizable_operation): Likewise.  Cost emulated vector
	operations according to the scalar sequence synthesized by
	vector lowering.
2021-08-10 10:12:39 +02:00
Richard Biener bb169406cd middle-end/101824 - properly handle volatiles in nested fn lowering
When we build the COMPONENT_REF of a formerly volatile local off
the FRAME decl we have to make sure to mark the COMPONENT_REF
as TREE_THIS_VOLATILE.  While the GIMPLE operand scanner looks
at the FIELD_DECL this is not how volatile GENERIC refs work.

2021-08-09  Richard Biener  <rguenther@suse.de>

	PR middle-end/101824
	* tree-nested.c (get_frame_field): Mark the COMPONENT_REF as
	volatile in case the variable was.

	* gcc.dg/tree-ssa/pr101824.c: New testcase.
2021-08-10 09:27:49 +02:00
Martin Uecker 0631faf87a Evaluate arguments of sizeof that are structs of variable size.
Evaluate arguments of sizeof for all types of variable size
and not just for VLAs. This fixes some issues related to
[PR29970] where statement expressions need to be evaluated
so that the size is well defined.
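
As background, the standard C rule for VLAs that this change extends can be
sketched as follows. The example is an assumption-laden illustration, not the
new test: it uses a plain variable length array type rather than a struct of
variable size (a GNU extension), and the helper name sizeof_evaluates_vla is
hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: returns the size computed by sizeof and
   reports through *evaluations how many times the size expression
   of the operand was evaluated.  */
size_t sizeof_evaluates_vla(int *evaluations)
{
    int n = 3;
    /* int[n++] is a variable length array type, so the operand of
       sizeof is evaluated and n is incremented as a side effect.  */
    size_t s = sizeof(int[n++]);
    *evaluations = n - 3;
    return s;
}
```

For a type of constant size the operand would not be evaluated at all; the
commit applies the evaluating behavior to all types of variable size.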

2021-08-10  Martin Uecker  <muecker@gwdg.de>

gcc/c/
	PR c/29970
	* c-typeck.c (c_expr_sizeof_expr): Evaluate
	size expressions for structs of variable size.

gcc/testsuite/
	PR c/29970
	* gcc.dg/vla-stexp-1.c: New test.
2021-08-10 07:49:57 +02:00
H.J. Lu 3d7ccbc1ef x86: Optimize load of const FP all bits set vectors
Check float_vector_all_ones_operand for vector floating-point modes to
optimize load of const floating-point all bits set vectors.

gcc/

	PR target/101804
	* config/i386/constraints.md (BC): Document for integer SSE
	constant all bits set operand.
	(BF): New constraint for const floating-point all bits set
	vectors.
	* config/i386/i386.c (standard_sse_constant_p): Likewise.
	(standard_sse_constant_opcode): Likewise.
	* config/i386/sse.md (sseconstm1): New mode attribute.
	(mov<mode>_internal): Replace BC with <sseconstm1>.

gcc/testsuite/

	PR target/101804
	* gcc.target/i386/avx2-gather-2.c: Pass -march=skylake instead
	of "-mavx2 -mtune=skylake".  Scan vpcmpeqd.
2021-08-09 21:41:35 -07:00
liuhongt 813ccbe9d2 Support cond_ashr/lshr/ashl for vector integer modes under AVX512.
gcc/ChangeLog:

	* config/i386/sse.md (cond_<insn><mode>): New expander.
	(VI248_AVX512VLBW): New mode iterator.
	* config/i386/predicates.md
	(nonimmediate_or_const_vec_dup_operand): New predicate.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/cond_op_shift_d-1.c: New test.
	* gcc.target/i386/cond_op_shift_d-2.c: New test.
	* gcc.target/i386/cond_op_shift_q-1.c: New test.
	* gcc.target/i386/cond_op_shift_q-2.c: New test.
	* gcc.target/i386/cond_op_shift_ud-1.c: New test.
	* gcc.target/i386/cond_op_shift_ud-2.c: New test.
	* gcc.target/i386/cond_op_shift_uq-1.c: New test.
	* gcc.target/i386/cond_op_shift_uq-2.c: New test.
	* gcc.target/i386/cond_op_shift_uw-1.c: New test.
	* gcc.target/i386/cond_op_shift_uw-2.c: New test.
	* gcc.target/i386/cond_op_shift_w-1.c: New test.
	* gcc.target/i386/cond_op_shift_w-2.c: New test.
2021-08-10 09:26:21 +08:00
GCC Administrator 377681505f Daily bump. 2021-08-10 00:16:28 +00:00
Andrew MacLeod c86c95edd1 Ensure toupper and tolower follow the expected pattern.
If the parameter is not compatible with the LHS, assume this is not really a
builtin function to avoid a trap.

	gcc/
	PR tree-optimization/101741
	* gimple-range-fold.cc (fold_using_range::range_of_builtin_call): Check
	type of parameter for toupper/tolower.

	gcc/testsuite/
	* gcc.dg/pr101741.c: New.
2021-08-09 16:24:05 -04:00
Jonathan Wakely f5a2d78072 libstdc++: Reduce use of debug containers in <regex>
The std::regex code uses std::map and std::vector, which means that when
_GLIBCXX_DEBUG is defined it uses the debug versions of those
containers. That no longer compiles, because I changed <regex> to
include <bits/stl_map.h> and <bits/stl_vector.h> instead of <map> and
<vector>, so the debug versions aren't defined, and std::map doesn't
compile. There is also a use of std::stack, which defaults to std::deque
which is the debug deque when _GLIBCXX_DEBUG is defined.

Using std::map, std::vector, and std::deque is probably a mistake, and
we should qualify them with _GLIBCXX_STD_C instead so that the debug
versions aren't used. We do not need the overhead of checking our own
uses of those containers, which should be correct anyway. The exception
is the vector base class of std::match_results, which exposes iterators
to users, so can benefit from debug mode checks for its iterators. For
other accesses to the vector elements, match_results already does its
own checks, so can access the _GLIBCXX_STD_C::vector base class
directly.

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>

libstdc++-v3/ChangeLog:

	* include/bits/regex.h (basic_regex::transform_primary): Use
	_GLIBCXX_STD_C::vector for local variable.
	* include/bits/regex.tcc (__regex_algo_impl): Use reference to
	_GLIBCXX_STD_C::vector base class of match_results.
	* include/bits/regex_automaton.tcc (_StateSeq:_M_clone): Use
	_GLIBCXX_STD_C::map and _GLIBCXX_STD_C::deque for local
	variables.
	* include/bits/regex_compiler.h (_BracketMatcher): Use
	_GLIBCXX_STD_C::vector for data members.
	* include/bits/regex_executor.h (_Executor): Likewise.
	* include/std/regex [_GLIBCXX_DEBUG]: Include <debug/vector>.
2021-08-09 20:46:56 +01:00
François Dumont 1354603bf7 libstdc++: [_GLIBCXX_DEBUG] Avoid allocator operator== when always equal
Use std::allocator_traits::is_always_equal to find out if we need to compare
allocator instances on safe container allocator aware move constructor.

libstdc++-v3/ChangeLog:

	* include/debug/safe_container.h
	(_Safe_container(_Safe_container&&, const _Alloc&, std::true_type)): New.
	(_Safe_container(_Safe_container&&, const _Alloc&, std::false_type)): New.
	(_Safe_container(_Safe_container&&, const _Alloc&)): Use the latter two.
2021-08-09 20:44:58 +02:00
Martin Jambor d55d3f5b04 ipa: Fix testsuite/gcc.dg/ipa/remref-6.c
I forgot to add -fdump-ipa-inline to options of
testsuite/gcc.dg/ipa/remref-6.c and so the dump scan tests were not
PASSing but ended up as UNRESOLVED.  Fixing that revealed that one
of the dumps it was looking for had a double space, so I removed it
too.

gcc/ChangeLog:

2021-08-09  Martin Jambor  <mjambor@suse.cz>

	PR testsuite/101654
	* ipa-prop.c (propagate_controlled_uses): Removed a spurious space.

gcc/testsuite/ChangeLog:

2021-08-09  Martin Jambor  <mjambor@suse.cz>

	PR testsuite/101654
	* gcc.dg/ipa/remref-6.c: Added missing -fdump-ipa-inline option.
2021-08-09 17:36:12 +02:00
Pat Haugen 00eab082e9 Verify destination[source] of a load[store] instruction is a register.
gcc/ChangeLog:

	* config/rs6000/rs6000.c (is_load_insn1): Verify destination is a
	register.
	(is_store_insn1): Verify source is a register.
2021-08-09 10:05:49 -05:00
Uros Bizjak 9d2d660aab i386: Name V2SF logic insns [PR101812]
Name V2SF logic insns, so expand_simple_binop works with V2SF modes.

2021-08-09  Uroš Bizjak  <ubizjak@gmail.com>

gcc/
	PR target/101812
	* config/i386/mmx.md (<any_logic:code>v2sf3):
	Rename from *mmx_<any_logic:code>v2sf3

gcc/testsuite/
	PR target/101812
	* gcc.target/i386/pr101812.c: New test.
2021-08-09 16:39:40 +02:00
Thomas Schwinge 62f01243fb Cross-reference parts adapted in 'gcc/omp-oacc-neuter-broadcast.cc'
gcc/
	* config/nvptx/nvptx.c: Cross-reference parts adapted in
	'gcc/omp-oacc-neuter-broadcast.cc'.
	* omp-low.c: Likewise.
	* omp-oacc-neuter-broadcast.cc: Cross-reference parts adapted from
	the above files.
2021-08-09 15:16:58 +02:00
Julian Brown c408512e1f amdgcn: Enable OpenACC worker partitioning for AMD GCN
gcc/
	* config/gcn/gcn.c (gcn_init_builtins): Override decls for
	BUILT_IN_GOACC_SINGLE_START, BUILT_IN_GOACC_SINGLE_COPY_START,
	BUILT_IN_GOACC_SINGLE_COPY_END and BUILT_IN_GOACC_BARRIER.
	(gcn_goacc_validate_dims): Turn on worker partitioning unconditionally.
	(gcn_fork_join): Update comment.
	* config/gcn/gcn.opt (flag_worker_partitioning): Remove.
	(macc_experimental_workers): Remove unused option.
	libgomp/
	* plugin/plugin-gcn.c (gcn_exec): Change default number of workers to
	16.
	* testsuite/libgomp.oacc-c-c++-common/acc_prof-kernels-1.c
	[acc_device_radeon]: Update.
	* testsuite/libgomp.oacc-c-c++-common/loop-dim-default.c
	[ACC_DEVICE_TYPE_radeon]: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/parallel-dims.c
	[acc_device_radeon]: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/routine-wv-2.c
	[ACC_DEVICE_TYPE_radeon]: Likewise.
	* testsuite/libgomp.oacc-fortran/optional-reduction.f90: XFAIL for
	'openacc_radeon_accel_selected' and '-O0'.
	* testsuite/libgomp.oacc-fortran/reduction-7.f90: Likewise.

Co-Authored-By: Kwok Cheung Yeung <kcy@codesourcery.com>
Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
2021-08-09 15:08:44 +02:00
Julian Brown e2a58ed6dc openacc: Middle-end worker-partitioning support
This patch implements worker-partitioning support in the middle end,
by rewriting gimple. The OpenACC execution model requires that code
can run in either "worker single" mode where only a single worker per
gang is active, or "worker partitioned" mode, where multiple workers
per gang are active. This means we need to do something equivalent
to spawning additional workers when transitioning from worker-single
to worker-partitioned mode. However, GPUs typically fix the number of
threads of invoked kernels at launch time, so we need to do something
with the "extra" threads when they are not wanted.

The scheme used is to conditionalise each basic block that executes
in "worker single" mode for worker 0 only. Conditional branches
are handled specially so "idle" (non-0) workers follow along with
worker 0. On transitioning to "worker partitioned" mode, any variables
modified by worker 0 are propagated to the other workers via GPU shared
memory. Special care is taken for routine calls, writes through pointers,
and so forth, as follows:

  - There are two types of function calls to consider in worker-single
    mode: "normal" calls to maths library routines, etc. are called from
    worker 0 only. OpenACC routines may contain worker-partitioned loops
    themselves, so are called from all workers, including "idle" ones.

  - SSA names set in worker-single mode, but used in worker-partitioned
    mode, are copied to shared memory in worker 0. Other workers retrieve
    the value from the appropriate shared-memory location after a barrier,
    and new phi nodes are introduced at the convergence point to resolve
    the worker 0/other worker copies of the value.

  - Local scalar variables (on the stack) also need special handling. We
    broadcast any variables that are written in the current worker-single
    block, and that are read in any worker-partitioned block.  (This is
    believed to be safe, and is flow-insensitive to ease analysis.)

  - Local aggregates (arrays and composites) on the stack are *not*
    broadcast. Instead we force gimple stmts modifying elements/fields of
    local aggregates into fully-partitioned mode. The RHS of the
    assignment is a scalar, and is thus subject to broadcasting as above.

  - Writes through pointers may affect any local variable that has
    its address taken. We use points-to analysis to determine the set
    of potentially-affected variables for a given pointer indirection.
    We broadcast any such variable which is used in worker-partitioned
    mode, on a per-block basis for any block containing a write through
    a pointer.
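
The neutering and broadcasting scheme described above can be modelled in
plain C. The following is a sequential toy model under stated assumptions,
not the pass itself: the workers are simulated by loops, shared_slot stands
in for GPU shared memory, and the names run_gang and NUM_WORKERS are
hypothetical.

```c
#define NUM_WORKERS 4

/* Hypothetical sequential model of one gang.  Each worker has its
   own private copy of 'base'; out[w] receives worker w's result.  */
void run_gang(int out[NUM_WORKERS])
{
    int base[NUM_WORKERS] = {0};
    int shared_slot;  /* stands in for GPU shared memory */

    /* Worker-single block: conditionalised so that only worker 0
       executes it; the other workers are idle ("neutered").  */
    for (int w = 0; w < NUM_WORKERS; w++)
        if (w == 0)
            base[w] = 10;

    /* Transition to worker-partitioned mode: worker 0 stores the
       modified variable, then (after a barrier on real hardware)
       the other workers load it.  */
    shared_slot = base[0];
    for (int w = 1; w < NUM_WORKERS; w++)
        base[w] = shared_slot;

    /* Worker-partitioned block: every worker is active and sees
       the value written by worker 0.  */
    for (int w = 0; w < NUM_WORKERS; w++)
        out[w] = base[w] + w;
}
```

Without the broadcast step, workers 1 to 3 would still see the initial value
0 in their private copies, which is the problem the scheme addresses.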

Some slides about the implementation (from 2018) are available at:

  https://jtb20.github.io/gcnworkers.pdf

	gcc/
	* Makefile.in (OBJS): Add omp-oacc-neuter-broadcast.o.
	* doc/tm.texi.in (TARGET_GOACC_CREATE_WORKER_BROADCAST_RECORD):
	Add documentation hook.
	* doc/tm.texi: Regenerate.
	* omp-oacc-neuter-broadcast.cc: New file.
	* omp-builtins.def (BUILT_IN_GOACC_BARRIER)
	(BUILT_IN_GOACC_SINGLE_START, BUILT_IN_GOACC_SINGLE_COPY_START)
	(BUILT_IN_GOACC_SINGLE_COPY_END): New builtins.
	* passes.def (pass_omp_oacc_neuter_broadcast): Add pass.
	* target.def (goacc.create_worker_broadcast_record): Add target
	hook.
	* tree-pass.h (make_pass_omp_oacc_neuter_broadcast): Add
	prototype.
	* config/gcn/gcn-protos.h (gcn_goacc_adjust_propagation_record):
	Rename prototype to...
	(gcn_goacc_create_worker_broadcast_record): ... this.
	* config/gcn/gcn-tree.c (gcn_goacc_adjust_propagation_record): Rename
	function to...
	(gcn_goacc_create_worker_broadcast_record): ... this.
	* config/gcn/gcn.c (TARGET_GOACC_ADJUST_PROPAGATION_RECORD):
	Rename to...
	(TARGET_GOACC_CREATE_WORKER_BROADCAST_RECORD): ... this.

Co-Authored-By: Nathan Sidwell <nathan@codesourcery.com> (via 'gcc/config/nvptx/nvptx.c' master)
Co-Authored-By: Kwok Cheung Yeung <kcy@codesourcery.com>
Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
2021-08-09 14:47:42 +02:00
Tejas Belagod e2e0b85c1e PR101609: Use the correct iterator for AArch64 vector right shift pattern
Loops containing long long shifts fail to vectorize due to the vectorizer
not being able to recognize long long right shifts. This is due to a bug
in the iterator used for the vashr and vlshr patterns in aarch64-simd.md.
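
A loop of the affected kind might look like the sketch below. It is an
illustration only; the function name shift_right_by_reg is hypothetical and
the code is not copied from the new testcases.

```c
/* Hypothetical example: element-wise right shift of long long
   values by variable (register) shift amounts.  */
void shift_right_by_reg(long long *restrict dst,
                        const long long *restrict src,
                        const long long *restrict amount, int n)
{
    for (int i = 0; i < n; i++)
        dst[i] = src[i] >> amount[i];
}
```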

2021-08-09  Tejas Belagod  <tejas.belagod@arm.com>

gcc/ChangeLog
	PR target/101609
	* config/aarch64/aarch64-simd.md (vlshr<mode>3, vashr<mode>3): Use
	the right iterator.

gcc/testsuite/ChangeLog
	* gcc.target/aarch64/vect-shr-reg.c: New testcase.
	* gcc.target/aarch64/vect-shr-reg-run.c: Likewise.
2021-08-09 12:54:14 +01:00
Thomas Schwinge 0095afa82a Remove 'gcc/omp-offload.c' from 'GTFILES'
Given that it doesn't contain any 'GTY' markers, no 'gcc/gt-omp-offload.h' file
gets generated (and '#include'd anywhere).

Small fix-up for r243673 (Git commit 629b3d75c8)
"Split omp-low into multiple files".

	gcc/
	* Makefile.in (GTFILES): Remove '$(srcdir)/omp-offload.c'.
2021-08-09 13:40:54 +02:00
Thomas Schwinge 2a700fb8ea Don't consider '-foffload-abi' in 'DEF_GOACC_BUILTIN', 'DEF_GOMP_BUILTIN'
Since Tom's PR64707 commit r220037 (Git commit
1506ae0e1e) "Make fopenmp an LTO option" as well
as PR64672 commit r220038 (Git commit a0c88d0629)
"Make fopenacc an LTO option", we're now actually passing
'-fopenacc'/'-fopenmp' to the 'mkoffload's, which will pass these on to the
offload compilers.

	gcc/
	* builtins.def (DEF_GOACC_BUILTIN, DEF_GOMP_BUILTIN): Don't
	consider '-foffload-abi'.
	* common.opt (-foffload-abi): Remove 'Var', 'Init'.
	* opts.c (common_handle_option) <-foffload-abi> [ACCEL_COMPILER]:
	Ignore.
2021-08-09 13:39:38 +02:00
Thomas Schwinge c523051930 Sanity check that 'Init' doesn't appear without 'Var' in '*.opt' files
... as that doesn't make sense.

    @item Init(@var{value})
    The variable specified by the @code{Var} property should be statically
    initialized to @var{value}.  [...]

	gcc/
	* optc-gen.awk: Sanity check that 'Init' doesn't appear without
	'Var'.
2021-08-09 13:38:14 +02:00
Thomas Schwinge 06870af3e4 [OpenACC] Clean up unused 'BUILT_IN_ACC_GET_DEVICE_TYPE'
Unused as of r229767 (Git commit e50146711b)
"OpenACC reductions".

	gcc/
	* omp-builtins.def (BUILT_IN_ACC_GET_DEVICE_TYPE): Remove.
2021-08-09 13:36:19 +02:00
Thomas Schwinge 7cc85851bc [documentation] No need anymore to "mention ['gt-*.h' file] as a dependency in the 'Makefile'"
... as of r202907 (Git commit b6541edc52)
"remove explicit dependencies".

	gcc/
	* doc/gty.texi (Files): Update.
2021-08-09 13:28:10 +02:00
Thomas Schwinge 67b8443bd1 [documentation] Fix GTY header file example
Fix-up for CVS 'gcc/doc/gty.texi' r1.6 (Subversion r55857, Git
commit cba57c9d40) "Minor doc updates"

	gcc/
	* doc/gty.texi (Files): Fix GTY header file example.
2021-08-09 13:27:54 +02:00
Roger Sayle 848bcda52d Improve handling of unknown sign bit in CCP.
This middle-end patch implements several related improvements to
tree-ssa's conditional (bit) constant propagation pass.  The current
code handling ordered comparisons contains the comment "If the
most significant bits are not known we know nothing" which is not
entirely true [this test even prevents this pass understanding these
comparisons always have a zero or one result].  This patch introduces
a new value_mask_to_min_max helper function, that understands the
different semantics of the most significant bit on signed vs.
unsigned values.  This allows us to generalize ordered comparisons,
GE_EXPR, GT_EXPR, LE_EXPR and LT_EXPR, where the code is tweaked to
correctly handle the potential equal cases.  Then finally support
is added for the related tree codes MIN_EXPR, MAX_EXPR, ABS_EXPR
and ABSU_EXPR.
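
The idea behind deriving bounds from a value/mask pair can be sketched for
32-bit integers as follows. This is a simplified illustration, not the GCC
implementation: it assumes that set bits in mask mark unknown bits, and the
function names bounds_unsigned, bounds_signed and to_signed are hypothetical.

```c
#include <stdint.h>

/* Convert a 32-bit pattern to its two's complement signed value
   without relying on implementation-defined conversions.  */
static int32_t to_signed(uint32_t u)
{
    return u <= INT32_MAX ? (int32_t)u
                          : (int32_t)((int64_t)u - 4294967296LL);
}

/* Unsigned: clearing every unknown bit gives the minimum, setting
   every unknown bit gives the maximum.  */
void bounds_unsigned(uint32_t value, uint32_t mask,
                     uint32_t *min, uint32_t *max)
{
    *min = value & ~mask;
    *max = value | mask;
}

/* Signed: an unknown sign bit works the other way round.  The
   minimum has the sign bit set, the maximum has it clear.  */
void bounds_signed(uint32_t value, uint32_t mask,
                   int32_t *min, int32_t *max)
{
    uint32_t lo = value & ~mask;
    uint32_t hi = value | mask;
    if (mask & 0x80000000u) {
        lo |= 0x80000000u;  /* most negative candidate */
        hi &= 0x7fffffffu;  /* most positive candidate */
    }
    *min = to_signed(lo);
    *max = to_signed(hi);
}
```

With such bounds an ordered comparison can be folded whenever the ranges of
the two operands do not overlap, even if the sign bit is unknown.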

Regression testing revealed three test cases in the testsuite that
were checking for specific optimizations that are now being performed
earlier than expected.  These tests can continue to check their
original transformations by explicitly adding -fno-tree-ccp to their
dg-options (some already specify -fno-ipa-vrp or -fno-tree-forwprop
for the same reason).

2021-08-09  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	* tree-ssa-ccp.c (value_mask_to_min_max): Helper function to
	determine the upper and lower bounds from a mask-value pair.
	(bit_value_unop) [ABS_EXPR, ABSU_EXPR]: Add support for
	absolute value and unsigned absolute value expressions.
	(bit_value_binop):  Initialize *VAL's precision.
	[LT_EXPR, LE_EXPR]: Use value_mask_to_min_max to determine
	upper and lower bounds of operands.  Add LE_EXPR/GE_EXPR
	support when the operands are unknown but potentially equal.
	[MIN_EXPR, MAX_EXPR]: Support minimum/maximum expressions.

gcc/testsuite/ChangeLog
	* gcc.dg/pr68217.c: Add -fno-tree-ccp option.
	* gcc.dg/tree-ssa/vrp24.c: Add -fno-tree-ccp option.
	* g++.dg/ipa/pure-const-3.C: Add -fno-tree-ccp option.
2021-08-09 12:02:53 +01:00
Jonathan Wakely 2eff2a3cb5 libstdc++: Make allocator equality comparable in tests
libstdc++-v3/ChangeLog:

	* testsuite/23_containers/unordered_map/cons/default.cc: Add
	equality comparison operators to allocator.
	* testsuite/23_containers/unordered_set/cons/default.cc:
	Likewise.
2021-08-09 11:43:50 +01:00
Tobias Burnus 527a1cf32c testsuite/lib/gfortran.exp: Add -I for ISO*.h [PR101305, PR101660]
This patch adds -I$specdir/libgfortran to GFORTRAN_UNDER_TEST, when
set by proc gfortran_init. As the $specdir depends on the multilib
setting, it has to be re-set for a different multilib; hence, we track
whether a previous call to gfortran_init set that var or whether it
was set differently.

gcc/testsuite/
	PR libfortran/101305
	PR fortran/101660

	* lib/gfortran.exp (gfortran_init): Add -I $specdir/libgfortran to
	GFORTRAN_UNDER_TEST; update it when set by previous gfortran_init call.
	* gfortran.dg/ISO_Fortran_binding_1.c: Use <...> not "..." for
	ISO_Fortran_binding.h's #include.
	* gfortran.dg/ISO_Fortran_binding_10.c: Likewise.
	* gfortran.dg/ISO_Fortran_binding_11.c: Likewise.
	* gfortran.dg/ISO_Fortran_binding_12.c: Likewise.
	* gfortran.dg/ISO_Fortran_binding_15.c: Likewise.
	* gfortran.dg/ISO_Fortran_binding_16.c: Likewise.
	* gfortran.dg/ISO_Fortran_binding_17.c: Likewise.
	* gfortran.dg/ISO_Fortran_binding_18.c: Likewise.
	* gfortran.dg/ISO_Fortran_binding_3.c: Likewise.
	* gfortran.dg/ISO_Fortran_binding_5.c: Likewise.
	* gfortran.dg/ISO_Fortran_binding_6.c: Likewise.
	* gfortran.dg/ISO_Fortran_binding_7.c: Likewise.
	* gfortran.dg/ISO_Fortran_binding_8.c: Likewise.
	* gfortran.dg/ISO_Fortran_binding_9.c: Likewise.
	* gfortran.dg/PR94327.c: Likewise.
	* gfortran.dg/PR94331.c: Likewise.
	* gfortran.dg/bind_c_array_params_3_aux.c: Likewise.
	* gfortran.dg/iso_fortran_binding_uint8_array_driver.c: Likewise.
	* gfortran.dg/pr93524.c: Likewise.
2021-08-09 12:35:23 +02:00
Bin Cheng a5e78ee60c aarch64: Expand %<w> correctly according to mode iterator
Pattern "*extend<SHORT:mode><GPI:mode>2_aarch64" is duplicated
from the corresponding zero_extend pattern, however %<w> needs
to be expanded according to its mode iterator because the smov
instruction is different to umov.

2021-08-09  Bin Cheng  <bin.cheng@linux.alibaba.com>

gcc/
	* config/aarch64/aarch64.md
	(*extend<SHORT:mode><GPI:mode>2_aarch64): Use %<GPI:w>0.
2021-08-09 17:21:03 +08:00
Jonathan Wright a5e3c1e2c8 testsuite: aarch64: Fix invalid SVE tests
Some scan-assembler tests for SVE code generation were erroneously
split over multiple lines - meaning they became invalid. This patch
gets the tests working again by putting each test on a single line.

The extract_[1234].c tests are corrected to expect that extracted
32-bit values are moved into 'w' registers rather than 'x' registers.

gcc/testsuite/ChangeLog:

2021-08-06  Jonathan Wright  <jonathan.wright@arm.com>

	* gcc.target/aarch64/sve/dup_lane_1.c: Don't split
	scan-assembler tests over multiple lines. Expect 32-bit
	result values in 'w' registers.
	* gcc.target/aarch64/sve/extract_1.c: Likewise.
	* gcc.target/aarch64/sve/extract_2.c: Likewise.
	* gcc.target/aarch64/sve/extract_3.c: Likewise.
	* gcc.target/aarch64/sve/extract_4.c: Likewise.
2021-08-09 09:59:05 +01:00
Jonathan Wright da81e30d21 testsuite: aarch64: Fix failing vector structure tests on big-endian
Recent refactoring of the arm_neon.h header enabled better code
generation for intrinsics that manipulate vector structures. New
tests were also added to verify the benefit of these changes. It now
transpires that the code generation improvements are observed only on
little-endian systems. This patch restricts the code generation tests
to little-endian targets.

gcc/testsuite/ChangeLog:

2021-08-04  Jonathan Wright  <jonathan.wright@arm.com>

	* gcc.target/aarch64/vector_structure_intrinsics.c: Restrict
	tests to little-endian targets.
2021-08-09 09:58:43 +01:00
Hongyu Wang 78be906b26 MAINTAINERS: Add myself for write after approval
ChangeLog:

	* MAINTAINERS (Write After Approval): Add myself.
2021-08-09 09:59:44 +08:00
GCC Administrator 844105d912 Daily bump. 2021-08-09 00:16:32 +00:00
Sergei Trofimovich 5f564fd013 lra: Fix s/otput/output/ typo in debug output
gcc/
	* lra-constraints.c: Fix s/otput/output/ typo.
2021-08-08 21:37:20 +01:00
François Dumont ad9c394114 libstdc++: Fix dg-prune-output assertion message
Since __glibcxx_assert changes in r6b42b5a the generated assertion message
has changed.

libstdc++-v3/ChangeLog:

	* testsuite/25_algorithms/copy/debug/constexpr_neg.cc: Replace 'failed_assertion'
	dg-prune-output reason with 'builtin_unreachable'.
	* testsuite/25_algorithms/copy_backward/debug/constexpr_neg.cc: Likewise.
	* testsuite/25_algorithms/equal/debug/constexpr_neg.cc: Likewise.
	* testsuite/25_algorithms/lower_bound/debug/constexpr_partitioned_neg.cc: Likewise.
	* testsuite/25_algorithms/lower_bound/debug/constexpr_partitioned_pred_neg.cc: Likewise.
	* testsuite/25_algorithms/lower_bound/debug/constexpr_valid_range_neg.cc: Likewise.
	* testsuite/25_algorithms/upper_bound/debug/constexpr_partitioned_neg.cc: Likewise.
	* testsuite/25_algorithms/upper_bound/debug/constexpr_partitioned_pred_neg.cc: Likewise.
	* testsuite/25_algorithms/upper_bound/debug/constexpr_valid_range_neg.cc: Likewise.
2021-08-08 19:12:22 +02:00
Jeff Law fd26ce8398 Fix c6x test compromised by recent improvements to bswap & rotates
gcc/testsuite
	* gcc.target/tic6x/rotdi16-scan.c: Pull rotate into its own function.
2021-08-08 11:20:41 -04:00
Hans-Peter Nilsson e9b639c4b5 libstdc++: Tweak timeout for testsuite/std/ranges/iota/max_size_type.cc
A simulator can easily spend more than 10 minutes running
this test-case, and the default timeout is 5 minutes.
Better allow even slower machines; use 4 as the factor.
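
A minimal sketch of the arithmetic behind that choice, assuming the
factor simply multiplies the default timeout. The 5-minute default and
the factor of 4 come from the message above; the function and constant
names are hypothetical and exist only for illustration.

```python
# Values quoted in the commit message; the names are illustrative only.
DEFAULT_TIMEOUT_MIN = 5      # default testsuite timeout, in minutes
TIMEOUT_FACTOR = 4           # value given to dg-timeout-factor
SIMULATOR_RUNTIME_MIN = 10   # "more than 10 minutes" on a simulator


def effective_timeout(default_min, factor):
    """Return the scaled timeout, assuming a plain multiplication."""
    return default_min * factor


budget = effective_timeout(DEFAULT_TIMEOUT_MIN, TIMEOUT_FACTOR)
print(budget)                           # 20
print(budget > SIMULATOR_RUNTIME_MIN)   # True
```

Under that assumption a factor of 2 would only just reach the 10-minute
mark, which is why a larger factor leaves room for slower machines.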

Regarding relative runtime numbers (very local; mmixware simulator for
mmix-knuth-mmixware): test01 and test05 finish almost instantly; test02
takes about 2 minutes and test03 about 2m30, but test04 itself runs for
more than 6 minutes and so times out.

Not sure if it's better to split up this test, as the excessive
runtime may be unintended, but this seemed simplest.

libstdc++-v3:
	* testsuite/std/ranges/iota/max_size_type.cc: Set
	dg-timeout-factor to 4.
2021-08-08 10:52:50 +02:00
GCC Administrator 7b51202c2a Daily bump. 2021-08-08 00:16:32 +00:00
Ian Lance Taylor 307e0d4036 compiler: support export/import of unsafe.Add/Slice
For golang/go#19367
For golang/go#40481

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/340549
2021-08-07 13:37:58 -07:00
Harald Anlauf cd754efa9a Fortran: ICE with automatic character object, save, and various options
gcc/fortran/ChangeLog:

	PR fortran/68568
	* primary.c (gfc_expr_attr): Variable attribute can only be
	inquired when symtree is non-NULL.
2021-08-07 20:30:32 +02:00
H.J. Lu 6866f4819a Add tests for PR tree-optimization/88531
PR tree-optimization/88531
	* gcc.target/i386/pr88531-1a.c: New test.
	* gcc.target/i386/pr88531-1b.c: Likewise.
	* gcc.target/i386/pr88531-1c.c: Likewise.
	* gcc.target/i386/pr88531-2a.c: Likewise.
	* gcc.target/i386/pr88531-2b.c: Likewise.
	* gcc.target/i386/pr88531-2c.c: Likewise.
2021-08-07 07:34:44 -07:00
GCC Administrator f92f477852 Daily bump. 2021-08-07 00:16:39 +00:00
Martin Sebor 81d6cdd335 Move more code to new gimple-ssa-warn-access pass.
gcc/ChangeLog:

	* builtins.c (expand_builtin_memchr): Move to gimple-ssa-warn-access.cc.
	(expand_builtin_strcat): Same.
	(expand_builtin_stpncpy): Same.
	(expand_builtin_strncat): Same.
	(check_read_access): Same.
	(check_memop_access): Same.
	(expand_builtin_strlen): Move checks to gimple-ssa-warn-access.cc.
	(expand_builtin_strnlen): Same.
	(expand_builtin_memcpy): Same.
	(expand_builtin_memmove): Same.
	(expand_builtin_mempcpy): Same.
	(expand_builtin_strcpy): Same.
	(expand_builtin_strcpy_args): Same.
	(expand_builtin_stpcpy_1): Same.
	(expand_builtin_strncpy): Same.
	(expand_builtin_memset): Same.
	(expand_builtin_bzero): Same.
	(expand_builtin_strcmp): Same.
	(expand_builtin_strncmp): Same.
	(expand_builtin): Remove handlers.
	(fold_builtin_strlen): Add a comment.
	* builtins.h (check_access): Move to gimple-ssa-warn-access.cc.
	* calls.c (maybe_warn_nonstring_arg): Same.
	* diagnostic-spec.c (nowarn_spec_t::nowarn_spec_t): Add warning option.
	* gimple-fold.c (gimple_fold_builtin_strcpy): Pass argument to callee.
	(gimple_fold_builtin_stpcpy): Same.
	* gimple-ssa-warn-access.cc (has_location): New function.
	(get_location): Same.
	(get_callee_fndecl): Same.
	(call_nargs): Same.
	(call_arg): Same.
	(warn_string_no_nul): Define.
	(unterminated_array): Same.
	(check_nul_terminated_array): Same.
	(maybe_warn_nonstring_arg): Same.
	(maybe_warn_for_bound): Same.
	(warn_for_access): Same.
	(check_access): Same.
	(check_memop_access): Same.
	(check_read_access): Same.
	(warn_dealloc_offset): Use helper functions.
	(maybe_emit_free_warning): Same.
	(class pass_waccess): Add members.
	(check_strcat): New function.
	(check_strncat): New function.
	(check_stxcpy): New function.
	(check_stxncpy): New function.
	(check_strncmp): New function.
	(pass_waccess::check_builtin): New function.
	(pass_waccess::check): Call it.
	* gimple-ssa-warn-access.h (warn_string_no_nul): Move here from
	builtins.h.
	(maybe_warn_for_bound): Same.
	(check_access): Same.
	(check_memop_access): Same.
	(check_read_access): Same.
	* pointer-query.h (struct access_data): Define a ctor overload.

gcc/testsuite/ChangeLog:

	* c-c++-common/Wsizeof-pointer-memaccess1.c: Also disable
	-Wstringop-overread.
	* c-c++-common/attr-nonstring-3.c: Adjust pattern of expected message.
	* gcc.dg/Warray-bounds-39.c: Add an xfail due to a known bug.
	* gcc.dg/Wstring-compare-3.c: Also disable -Wstringop-overread.
	* gcc.dg/attr-nonstring-2.c: Adjust pattern of expected message.
	* gcc.dg/attr-nonstring-4.c: Same.
	* gcc.dg/Wstringop-overread-6.c: New test.
	* gcc.dg/sso-14.c: Fix typos to avoid buffer overflow.
2021-08-06 16:08:36 -06:00
Cherry Mui 629b5699fb compiler: make escape analysis more strict about runtime calls
Following the previous CL, list all the expected runtime calls in the
escape analysis, and fail if an unexpected one is seen.
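
The allowlist idea can be sketched roughly as follows. This is not the
gofrontend code (which is C++); the call names, the exception type and
the function are all hypothetical placeholders chosen for illustration.

```python
# Placeholder names only: NOT the real list of runtime calls.
EXPECTED_RUNTIME_CALLS = frozenset({
    "runtime.placeholderA",
    "runtime.placeholderB",
})


class UnexpectedRuntimeCall(Exception):
    """Raised when a call is missing from the allowlist."""


def check_runtime_call(name, expected=EXPECTED_RUNTIME_CALLS):
    # Fail loudly instead of silently assuming a default behaviour.
    if name not in expected:
        raise UnexpectedRuntimeCall(name)
    return name
```

Failing on unknown names means a newly added runtime call has to be
classified explicitly before the analysis accepts it.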

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/340397
2021-08-06 12:37:48 -07:00
Christophe Lyon aff75af3b5 arm: Fix pr69245.c testcase for reorder assembler architecture directives [PR101723]
In gcc.target/arm/pr69245.c, to have a .fpu neon-vfpv4 directive, make
sure code for fn1() is emitted, by removing the static keyword.

Fix a typo in gcc.target/arm/pr69245.c, where \s should be \\s.
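
A rough illustration of why the doubled backslash matters, assuming the
pattern sits in a double-quoted Tcl string. The helper below is a
deliberately simplified model of Tcl backslash substitution (it ignores
escapes such as newline or tab), and both the pattern and the assembler
line are hypothetical, not copied from the test.

```python
import re


def tcl_backslash_subst(text):
    # Simplified model: a backslash followed by any character yields
    # that character, so a doubled backslash yields one backslash.
    out = []
    i = 0
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text):
            out.append(text[i + 1])
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


asm_line = ".fpu neon-vfpv4"  # hypothetical assembler output

wrong = tcl_backslash_subst(r"fpu\sneon")    # regex sees: fpusneon
right = tcl_backslash_subst(r"fpu\\sneon")   # regex sees: fpu\sneon

print(re.search(wrong, asm_line) is None)       # True
print(re.search(right, asm_line) is not None)   # True
```

With a single backslash the regular expression engine never sees the
whitespace class, so the pattern looks for a literal "s" instead.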

2021-08-06  Christophe Lyon  <christophe.lyon@foss.st.com>

	gcc/testsuite/

	PR target/101723
	* gcc.target/arm/pr69245.c: Make sure to emit code for fn1, fix
	typo.
2021-08-06 14:25:47 +00:00