174694 Commits

Author SHA1 Message Date
Claudiu Zissulescu
9ebba06b5b [ARC] Deprecate q-class option.
This option was used to control the short instruction selection.  However,
there is no difference in cycles if we use or not a short instruction,
and always someone wants a smaller program.

gcc/
xxxx-xx-xx  Claudiu Zissulescu  <claziss@synopsys.com>

	* config/arc/arc.c (arc_conditional_register_usage): R0-R3 and
	R12-R15 are always in ARCOMPACT16_REGS register class.
	* config/arc/arc.opt (mq-class): Deprecate.
	* config/arc/constraint.md ("q"): Remove dependency on mq-class
	option.
	* doc/invoke.texi (mq-class): Update text.
	* common/config/arc/arc-common.c (arc_option_optimization_table):
	Update list.

testsuite/
xxxx-xx-xx  Claudiu Zissulescu  <claziss@synopsys.com>

	* gcc.target/arc/nps400-1.c: Update test.
2020-02-13 12:49:12 +02:00
Claudiu Zissulescu
e57764be55 [ARC] Use TARGET_INSN_COST.
TARGET_INSN_COST gives us a better control over the instruction costs
than classical RTX_COSTS.  A simple cost scheme is in place for the
time being, when optimizing for size, the cost is given by the
instruction length. When optimizing for speed, the cost is 1 for any
recognized instruction, and 2 for any load/store instruction.  The
latter one can be overwritten by using cost attribute for an
instruction.  Due to this change, we need to update also a number of
instruction patterns with a new predicate to better reflect the costs.

gcc/
xxxx-xx-xx  Claudiu Zissulescu  <claziss@synopsys.com>

	* config/arc/arc.c (arc_insn_cost): New function.
	(TARGET_INSN_COST): Define.
	* config/arc/arc.md (cost): New attribute.
	(add_n): Use arc_nonmemory_operand.
	(ashlsi3_insn): Likewise, also update constraints.
	(ashrsi3_insn): Likewise.
	(rotrsi3): Likewise.
	(add_shift): Likewise.
	* config/arc/predicates.md (arc_nonmemory_operand): New predicate.

testsuite/
xxxx-xx-xx  Claudiu Zissulescu  <claziss@synopsys.com>

	* gcc.target/arc/or-cnst-size2.c: Update test.
2020-02-13 12:32:05 +02:00
Claudiu Zissulescu
8dca38c43c [ARC] Update mlo/mhi handling when big-endian CPU.
gcc/
xxxx-xx-xx  Claudiu Zissulescu  <claziss@synopsys.com>

	* config/arc/arc.md (mulsidi_600): Correctly select mlo/mhi
	registers.
	(umulsidi_600): Likewise.

testsuite/
xxxx-xx-xx  Claudiu Zissulescu  <claziss@synopsys.com>
	Petro Karashchenko  <petro.karashchenko@ring.com>

	* estsuite/gcc.target/arc/mul64-1.c: New test.
2020-02-13 12:32:05 +02:00
Jakub Jelinek
ae2b8ede40 i386: Fix up _mm*_mask_popcnt_epi* [PR93696]
As mentioned in the PR and as
https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_mask_popcnt_epi
also documents, _mm*_popcnt_epi* intrinsics are consistent with all other
unary AVX512* intrinsics regarding arguments, i.e. the
_mm*_whatever has just single argument (called a in the docs, and __A in the
GCC headers),
_mm*_mask_whatever has 3 arguments (called src, k, a in the docs and
_W, __U, __A in GCC headers) and
_mm*_maskz_whatever 2 arguments (called k, a in the docs and __U, __A in GCC
headers).  Unfortunately, whomever implemented the _mm*_popcnt_epi*
intrinsics got it wrong for the _mm*_mask_popcnt_epi* ones, calling the
args __A, __U, __B and not passing them in the canonical order to the
builtins, making it API incompatible with ICC as well as clang (tested on
godbolts clang 7/8/9/trunk and ICC 19.0.{0,1}, older clang/ICC don't
understand those, so it isn't that it used to be broken even in other
compilers and got changed afterwards).

2020-02-13  Jakub Jelinek  <jakub@redhat.com>

	PR target/93696
	* config/i386/avx512bitalgintrin.h (_mm512_mask_popcnt_epi8,
	_mm512_mask_popcnt_epi16, _mm256_mask_popcnt_epi8,
	_mm256_mask_popcnt_epi16, _mm_mask_popcnt_epi8,
	_mm_mask_popcnt_epi16): Rename __B argument to __A and __A to __W,
	pass __A to the builtin followed by __W instead of __A followed by
	__B.
	* config/i386/avx512vpopcntdqintrin.h (_mm512_mask_popcnt_epi32,
	_mm512_mask_popcnt_epi64): Likewise.
	* config/i386/avx512vpopcntdqvlintrin.h (_mm_mask_popcnt_epi32,
	_mm256_mask_popcnt_epi32, _mm_mask_popcnt_epi64,
	_mm256_mask_popcnt_epi64): Likewise.

	* gcc.target/i386/pr93696-1.c: New test.
	* gcc.target/i386/pr93696-2.c: New test.
	* gcc.target/i386/avx512bitalg-vpopcntw-1.c (TEST): Fix argument order
	of _mm*_mask_popcnt_*.
	* gcc.target/i386/avx512vpopcntdq-vpopcntq-1.c (TEST): Likewise.
	* gcc.target/i386/avx512vpopcntdq-vpopcntd-1.c (TEST): Likewise.
	* gcc.target/i386/avx512bitalg-vpopcntb-1.c (TEST): Likewise.
	* gcc.target/i386/avx512bitalg-vpopcntb.c (foo): Likewise.
	* gcc.target/i386/avx512bitalg-vpopcntbvl.c (foo): Likewise.
	* gcc.target/i386/avx512vpopcntdq-vpopcntd.c (foo): Likewise.
	* gcc.target/i386/avx512bitalg-vpopcntwvl.c (foo): Likewise.
	* gcc.target/i386/avx512bitalg-vpopcntw.c (foo): Likewise.
	* gcc.target/i386/avx512vpopcntdq-vpopcntq.c (foo): Likewise.
2020-02-13 10:43:27 +01:00
Frederik Harwath
2d9eb4e4ca Add ChangeLog entry for my last commit 2020-02-13 10:26:13 +01:00
Frederik Harwath
001ab12e62 openmp: ignore nowait if async execution is unsupported [PR93481]
An OpenMP "nowait" clause on a target construct currently leads to
a call to GOMP_OFFLOAD_async_run in the plugin that is used for
offloading at execution time. The nvptx plugin contains only a stub
of this function that always produces a fatal error if called.

This commit changes the "nowait" implementation to ignore the clause
if the executing device's plugin does not implement GOMP_OFFLOAD_async_run.
The stub in the nvptx plugin is removed which effectively means that
programs containing "nowait" can now be executed with nvptx offloading
as if the clause had not been used.
This behavior is consistent with the OpenMP specification which says that
"[...] execution of the target task *may* be deferred" (emphasis added),
cf. OpenMP 5.0, page 172.

libgomp/

	* plugin/plugin-nvptx.c: Remove GOMP_OFFLOAD_async_run stub.
	* target.c (gomp_load_plugin_for_device): Make "async_run" loading
	optional.
	(gomp_target_task_fn): Assert "devicep->async_run_func".
	(clear_unsupported_flags): New function to remove unsupported flags
	(right now only GOMP_TARGET_FLAG_NOWAIT) that can be be ignored.
	(GOMP_target_ext): Apply clear_unsupported_flags to flags.
	* testsuite/libgomp.c/target-33.c:
	Remove xfail for offload_target_nvptx.
	* testsuite/libgomp.c/target-34.c: Likewise.
2020-02-13 10:18:31 +01:00
Jakub Jelinek
8aba425f4e sccvn: Handle bitfields in vn_reference_lookup_3 [PR93582]
The following patch is first step towards fixing PR93582.
vn_reference_lookup_3 right now punts on anything that isn't byte aligned,
so to be able to lookup a constant bitfield store, one needs to use
the exact same COMPONENT_REF, otherwise it isn't found.

This patch lifts up that that restriction if the bits to be loaded are
covered by a single store of a constant (keeps the restriction so far
for the multiple store case, can tweak that incrementally, but I think
for bisection etc. it is worth to do it one step at a time).

2020-02-13  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/93582
	* fold-const.h (shift_bytes_in_array_left,
	shift_bytes_in_array_right): Declare.
	* fold-const.c (shift_bytes_in_array_left,
	shift_bytes_in_array_right): New function, moved from
	gimple-ssa-store-merging.c, no longer static.
	* gimple-ssa-store-merging.c (shift_bytes_in_array): Move
	to gimple-ssa-store-merging.c and rename to shift_bytes_in_array_left.
	(shift_bytes_in_array_right): Move to gimple-ssa-store-merging.c.
	(encode_tree_to_bitpos): Use shift_bytes_in_array_left instead of
	shift_bytes_in_array.
	(verify_shift_bytes_in_array): Rename to ...
	(verify_shift_bytes_in_array_left): ... this.  Use
	shift_bytes_in_array_left instead of shift_bytes_in_array.
	(store_merging_c_tests): Call verify_shift_bytes_in_array_left
	instead of verify_shift_bytes_in_array.
	* tree-ssa-sccvn.c (vn_reference_lookup_3): For native_encode_expr
	/ native_interpret_expr where the store covers all needed bits,
	punt on PDP-endian, otherwise allow all involved offsets and sizes
	not to be byte-aligned.

	* gcc.dg/tree-ssa/pr93582-1.c: New test.
	* gcc.dg/tree-ssa/pr93582-2.c: New test.
	* gcc.dg/tree-ssa/pr93582-3.c: New test.
2020-02-13 10:04:11 +01:00
Richard Biener
8ea884b85e testsuite/93717 fix up gcc.dg/optimize-bswapsi-2.c for BE
2020-02-13  Richard Biener  <rguenther@suse.de>

	PR testsuite/93717
	* gcc.dg/optimize-bswapsi-2.c: Add BE case.
2020-02-13 09:12:17 +01:00
Jakub Jelinek
dc6d0f89d4 i386: Fix k*shift* intrinsics [PR93673]
As mentioned in the PR, the intrinsics allow counts from 0 to 255, but
we actually reject values from 128 to 255.  That is because QImode
CONST_INTs can be only -128 to 127.  Fixed by using const_0_to_255_operand
and dropping the modes for the operands with those predicates
(the IL actually contains the CONST_INT which has VOIDmode).

2020-02-13  Jakub Jelinek  <jakub@redhat.com>

	PR target/93673
	* config/i386/sse.md (k<code><mode>): Drop mode from last operand and
	use const_0_to_255_operand predicate instead of immediate_operand.
	(avx512dq_fpclass<mode><mask_scalar_merge_name>,
	avx512dq_vmfpclass<mode><mask_scalar_merge_name>,
	vgf2p8affineinvqb_<mode><mask_name>,
	vgf2p8affineqb_<mode><mask_name>): Drop mode from
	const_0_to_255_operand predicated operands.

	* gcc.target/i386/avx512f-pr93673.c: New test.
	* gcc.target/i386/avx512dq-pr93673.c: New test.
	* gcc.target/i386/avx512bw-pr93673.c: New test.
2020-02-13 08:17:07 +01:00
Jakub Jelinek
74ddc9b8e5 testsuite: Fix g++.dg/analyzer/pr93212.C with check-c++-all
The test FAILs with c++11:
.../gcc/testsuite/g++.dg/analyzer/pr93212.C:4:1: error: 'lol' function uses 'auto' type specifier without trailing return type
.../gcc/testsuite/g++.dg/analyzer/pr93212.C:4:1: note: deduced return type only available with '-std=c++14' or '-std=gnu++14'

2020-02-13  Jakub Jelinek  <jakub@redhat.com>

	* g++.dg/analyzer/pr93212.C: Require c++14 rather than c++11.
2020-02-13 08:06:51 +01:00
GCC Administrator
fc7c3d13a8 Daily bump. 2020-02-13 00:16:35 +00:00
Jason Merrill
c2368db567 c++: Fix constexpr if and braced functional cast.
While partially instantiating a generic lambda, we can encounter pack
expansions or constexpr if where we can't actually do the substitution
immediately, and instead remember a partial instantiation context
in *_EXTRA_ARGS.  This includes any local_specializations used in the
pattern or condition.  In this testcase our tree walk wasn't finding the use
of i because we weren't walking into the type of a CONSTRUCTOR.  Fixed by
moving the code for doing that from find_parameter_packs_r into
cp_walk_subtrees.

2020-02-11  Jason Merrill  <jason@redhat.com>

	PR c++/92583
	PR c++/92654
	* tree.c (cp_walk_subtrees): Walk CONSTRUCTOR types here.
	* pt.c (find_parameter_packs_r): Not here.
2020-02-13 00:43:22 +01:00
Iain Sandoe
68bb7e3b9d coroutines: Update to n4849 allocation/deallocation.
This updates the coroutine frame allocation and deallocation usage to
match n4849.

[dcl.fct.def.coroutine] /9, /10, /12.

9 An implementation may need to allocate additional storage for a coroutine.
This storage is known as the coroutine state and is obtained by calling a
non-array allocation function. The allocation function’s name is looked up
in the scope of the promise type. If this lookup fails, the allocation
function’s name is looked up in the global scope. If the lookup finds an
allocation function in the scope of the promise type, overload resolution
is performed on a function call created by assembling an argument list.
The first argument is the amount of space requested, and has type
std::size_t. The lvalues p1 . . . pn are the succeeding [user's function]
arguments. If no viable function is found, overload resolution is performed
again on a function call created by passing just the amount of space required
as an argument of type std::size_t.

10 The unqualified-id get_return_object_on_allocation_failure is looked up in
the scope of the promise type by class member access lookup. If any
declarations are found, then the result of a call to an allocation function
used to obtain storage for the coroutine state is assumed to return nullptr
if it fails to obtain storage, and if a global allocation function is
selected, the ::operator new(size_t, nothrow_t) form is used. The allocation
function used in this case shall have a non-throwing noexcept-specification.
If the allocation function returns nullptr, the coroutine returns control to
the caller of the coroutine and the return value is obtained by a call to
T::get_return_object_on_allocation_failure(), where T is the promise type.

12 The deallocation function’s name is looked up in the scope of the promise
type. If this lookup fails, the deallocation function’s name is looked up in
the global scope. If deallocation function lookup finds both a usual
deallocation function with only a pointer parameter and a usual deallocation
function with both a pointer parameter and a size parameter, then the
selected deallocation function shall be the one with two parameters.
Otherwise, the selected deallocation function shall be the function with one
parameter. If no usual deallocation function is found, the program is ill-
formed. The selected deallocation function shall be called with the address
of the block of storage to be reclaimed as its first argument. If a
deallocation function with a parameter of type std::size_t is used, the size
of the block is passed as the corresponding argument.

gcc/cp/ChangeLog:

2020-02-12 Iain Sandoe <iain@sandoe.co.uk>

	* coroutines.cc (build_actor_fn): Implement deallocation function
	selection per n4849, dcl.fct.def.coroutine bullet 12.
	(morph_fn_to_coro): Implement allocation function selection per
	n4849, dcl.fct.def.coroutine bullets 9 and 10.

2020-02-12 Iain Sandoe <iain@sandoe.co.uk>

	* g++.dg/coroutines/coro1-allocators.h: New.
	* g++.dg/coroutines/coro-bad-alloc-00-bad-op-new.C: New test.
	* g++.dg/coroutines/coro-bad-alloc-01-bad-op-del.C: New test.
	* g++.dg/coroutines/coro-bad-alloc-02-no-op-new-nt.C: New test.
	* g++.dg/coroutines/torture/alloc-00-gro-on-alloc-fail.C: Use new
	coro1-allocators.h header.
	* g++.dg/coroutines/torture/alloc-01-overload-newdel.C: Likewise.
	* g++.dg/coroutines/torture/alloc-02-fail-new-grooaf-check.C: New.
	* g++.dg/coroutines/torture/alloc-03-overload-new-1.C: New test.
	* g++.dg/coroutines/torture/alloc-04-overload-del-use-two-args.C:New.
2020-02-12 23:27:26 +01:00
Jakub Jelinek
1cd9bef89e testsuite: Fix up gcc.target/powerpc/pr93122.c test
The recent renaming of PowerPC -mprefixed-addr option to -mprefixed
has not adjusted the gcc.target/powerpc/pr93122.c test, so it now
FAIL: gcc.target/powerpc/pr93122.c (test for excess errors)
Excess errors:
xgcc: error: unrecognized command-line option '-mprefixed-addr'; did you mean '-mprefixed'?

2020-02-12  Jakub Jelinek  <jakub@redhat.com>

	* gcc.target/powerpc/pr93122.c: Use -mprefixed instead of
	-mprefixed-addr in dg-options.
2020-02-12 23:21:33 +01:00
Jeff Law
bc7ac0a2da Commit correct version of last patch 2020-02-12 14:57:50 -07:00
Jeff Law
985873e508 Combine the two H8 mode shortening peepholes into a single peephole
* config/h8300/h8300.md (comparison shortening peepholes): Use
	a mode iterator to merge the HImode and SImode peepholes.
2020-02-12 14:52:24 -07:00
Patrick Palka
99bbab9f77 libstdc++: Fix LWG issues 3389 and 3390
libstdc++-v3/ChangeLog:

	LWG 3389 and LWG 3390
	* include/bits/stl_iterator.h (move_move_iterator): Use std::move when
	constructing the move_iterator with __i.
	(counted_iterator::counted_iterator): Use std::move when initializing
	M_current with __i.
	* testsuite/24_iterators/counted_iterator/lwg3389.cc: New test.
	* testsuite/24_iterators/move_iterator/lwg3390.cc: New test.
2020-02-12 16:30:19 -05:00
Sandra Loosemore
02ce382cd3 Use a non-empty test program to test ability to link.
On bare-metal targets, I/O support is typically provided by a BSP and
requires a linker script and/or hosting library to be specified on the
linker command line.  Linking an empty program with the default linker
script may succeed, however, which confuses libstdc++ configuration
when programs that probe for the presence of various I/O features fail
with link errors.

2020-02-12  Sandra Loosemore  <sandra@codesourcery.com>

	PR libstdc++/79193
	PR libstdc++/88999

	config/
	* no-executables.m4: Use a non-empty program to test for linker
	support.

	libgcc/
	* configure: Regenerated.

	libgfortran/
	* configure: Regenerated.

	libiberty/
	* configure: Regenerated.

	libitm/
	* configure: Regenerated.

	libobjc/
	* configure: Regenerated.

	libquadmath/
	* configure: Regenerated.

	libssp/
	* configure: Regenerated.

	libstdc++v-3/
	* configure: Regenerated.
2020-02-12 13:22:07 -08:00
Jakub Jelinek
3f3932a0ec real: Fix roundeven on inf/nan [PR93663]
As can be seen in the testcase, roundeven with inf or nan arguments
ICE because of those asserts where nothing prevents from is_halfway_below
being called with those arguments.

The following patch fixes that by just returning false for rvc_inf/rvc_nan
like it returns for rvc_zero, so that we handle roundeven with all those
values as round.  Inf/NaN are not halfway in between two integers...

2020-02-12  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/93663
	* real.c (is_even): Make static.  Function comment fix.
	(is_halfway_below): Make static, don't assert R is not inf/nan,
	instead return false for those.  Small formatting fixes.

	* gcc.dg/torture/builtin-round-roundeven.c (main): Add tests
	for DBL_MAX, inf, their negations and nan.
2020-02-12 22:15:40 +01:00
François Dumont
b32a3f3243 libstdc++: Add missing std:: qualification of a forward call
* include/bits/hashtable.h
	(_Hashtable<>(_Hashtable&&, std::allocator_type&)): Add
	missing std namespace qualification to forward call.
2020-02-12 22:09:41 +01:00
Martin Sebor
0a0de9636d PR middle-end/93646 - confusing -Wstringop-truncation on strncat where -Wstringop-overflow is expected
gcc/ChangeLog:

	PR middle-end/93646
	* tree-ssa-strlen.c (handle_builtin_stxncpy): Rename...
	(handle_builtin_stxncpy_strncat): ...to this.  Change first argument.
	Issue only -Wstringop-overflow strncat, never -Wstringop-truncation.
	(strlen_check_and_optimize_call): Adjust callee name.

gcc/testsuite/ChangeLog:

	PR middle-end/93646
	* gcc.dg/Wstringop-overflow-31.c: New test.
2020-02-12 13:53:49 -07:00
Jeff Law
37462a131c Drop unused comparison shortening pattern and consolidate remaining comparison shortening patterns.
* config/h8300/h8300.md (comparison shortening peepholes): Drop
	(and (xor)) variant.  Combine other two into single peephole.
2020-02-12 12:12:22 -07:00
Wilco Dijkstra
5bfc8303ff [AArch64] Set ctz rtx_cost (PR93565)
Combine sometimes behaves oddly and duplicates ctz to remove an unnecessary
sign extension.  Avoid this by setting the cost for ctz to be higher than
that of a simple ALU instruction.  Deepsjeng performance improves by ~0.6%.

gcc/
	PR rtl-optimization/93565
	* config/aarch64/aarch64.c (aarch64_rtx_costs): Add CTZ costs.

testsuite/
	PR rtl-optimization/93565
	* gcc.target/aarch64/pr93565.c: New test.
2020-02-12 18:23:21 +00:00
Wilco Dijkstra
9921bbf9b2 [AArch64] Improve popcount expansion
The popcount expansion uses umov to extend the result and move it back
to the integer register file.  If we model ADDV as a zero-extending
operation, fmov can be used to move back to the integer side. This
results in a ~0.5% speedup on deepsjeng on Cortex-A57.

A typical __builtin_popcount expansion is now:

	fmov	s0, w0
	cnt	v0.8b, v0.8b
	addv	b0, v0.8b
	fmov	w0, s0

gcc/
	* config/aarch64/aarch64-simd.md
	(aarch64_zero_extend<GPI:mode>_reduc_plus_<VDQV_E:mode>): New pattern.
	* config/aarch64/aarch64.md (popcount<mode>2): Use it instead of
	generating separate ADDV and zero_extend patterns.
	* config/aarch64/iterators.md (VDQV_E): New iterator.

testsuite/
	* gcc.target/aarch64/popcnt2.c: New test.
2020-02-12 18:19:25 +00:00
Jeff Law
e5cc04a73a Clean up dead patterns, splitters, expanders and peepholes on the H8 port.
* config/h8300/h8300.md (cpymemsi, movmd): Remove dead patterns,
	expanders, splits, etc.
	(movmd_internal_<mode>, movmd splitter, movstr, movsd): Likewise.
	(stpcpy_internal_<mode>, stpcpy splitter): Likewise.
	(peepholes to convert QI/HI mode pushes to SI mode pushes): Likewise.
	* config/h8300/h8300.c (h8300_swap_into_er6): Remove unused function.
	(h8300_swap_out_of_er6, h8sx_emit_movmd): Likewise
	* config/h8300/h8300-protos.h (h8300_swap_into_er6): Remove unused
	function prototype.
	(h8300_swap_out_of_er6, h8sx_emit_movmd): Likewise.
2020-02-12 10:35:12 -07:00
Marek Polacek
54947e4db0 c++: Add new test [PR88819]
Fixed by r10-1975-g59febe0ece37bedab7f42ae51b9f2b7a372d2950.

2020-02-12  Marek Polacek  <polacek@redhat.com>

	PR c++/88819
	* g++.dg/cpp2a/nontype-class32.C: New test.
2020-02-12 11:20:07 -05:00
Marek Polacek
e428a9cf85 c++: Fix ICE-on-invalid with broken attribute [PR93684]
We crash when parsing

  [[a::

because we see a CPP_SCOPE and then we're trying to consume a CPP_EOF
token.  So peek before consuming it.

	PR c++/93684 - ICE-on-invalid with broken attribute.
	* parser.c (cp_parser_std_attribute): Peek a token first before
	consuming it.

	* g++.dg/parse/attr4.C: New test.
2020-02-12 10:39:40 -05:00
Jakub Jelinek
62fc0a6ce2 i386: Fix up vec_extract_lo* patterns [PR93670]
The VEXTRACT* insns have way too many different CPUID feature flags (ATT
syntax)
vextractf128 $imm, %ymm, %xmm/mem		AVX
vextracti128 $imm, %ymm, %xmm/mem		AVX2
vextract{f,i}32x4 $imm, %ymm, %xmm/mem {k}{z}	AVX512VL+AVX512F
vextract{f,i}32x4 $imm, %zmm, %xmm/mem {k}{z}	AVX512F
vextract{f,i}64x2 $imm, %ymm, %xmm/mem {k}{z}	AVX512VL+AVX512DQ
vextract{f,i}64x2 $imm, %zmm, %xmm/mem {k}{z}	AVX512DQ
vextract{f,i}32x8 $imm, %zmm, %ymm/mem {k}{z}	AVX512DQ
vextract{f,i}64x4 $imm, %zmm, %ymm/mem {k}{z}	AVX512F

As the testcase shows and the patch too, we didn't get it right in all
cases.

The first hunk is about avx512vl_vextractf128v8s[if] incorrectly
requiring TARGET_AVX512DQ.  The corresponding insn is the first
vextract{f,i}32x4 above, so it requires VL+F, and the builtins have it
correct (TARGET_AVX512VL implies TARGET_AVX512F):
BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_vextractf128v8sf, "__builtin_ia32_extractf32x4_256_mask", IX86_BUILTIN_EXTRACTF32X4_256, UNKNOWN, (int) V4SF_FTYPE_V8SF_INT_V4SF_UQI)
BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_vextractf128v8si, "__builtin_ia32_extracti32x4_256_mask", IX86_BUILTIN_EXTRACTI32X4_256, UNKNOWN, (int) V4SI_FTYPE_V8SI_INT_V4SI_UQI)
We only need TARGET_AVX512DQ for avx512vl_vextractf128v4d[if].

The second hunk is about vec_extract_lo_v16s[if]{,_mask}.  These are using
the vextract{f,i}32x8 insns (AVX512DQ above), but we weren't requiring that,
but instead incorrectly && 1 for non-masked and && (64 == 64 && TARGET_AVX512VL)
for masked insns.  This is extraction from ZMM, so it doesn't need VL for
anything.  The hunk actually only requires TARGET_AVX512DQ when the insn
is masked, if it is not masked, when TARGET_AVX512DQ isn't available we can
use vextract{f,i}64x4 instead which is available already in TARGET_AVX512F
and does the same thing, extracts the low 256 bits from 512 bits vector
(often we split it into just nothing, but there are some special cases like
when using xmm16+ when we can't without AVX512VL).

The last hunk is about vec_extract_lo_v8s[if]{,_mask}.  The non-_mask
suffixed ones are ok already and just split into nothing (lowpart subreg).
The masked ones were incorrectly requiring TARGET_AVX512VL and
TARGET_AVX512DQ, when we only need TARGET_AVX512VL.

2020-02-12  Jakub Jelinek  <jakub@redhat.com>

	PR target/93670
	* config/i386/sse.md (VI48F_256_DQ): New mode iterator.
	(avx512vl_vextractf128<mode>): Use it instead of VI48F_256.  Remove
	TARGET_AVX512DQ from condition.
	(vec_extract_lo_<mode><mask_name>): Use <mask_avx512dq_condition>
	instead of <mask_mode512bit_condition> in condition.  If
	TARGET_AVX512DQ is false, emit vextract*64x4 instead of
	vextract*32x8.
	(vec_extract_lo_<mode><mask_name>): Drop <mask_avx512dq_condition>
	from condition.

	* gcc.target/i386/avx512vl-pr93670.c: New test.
2020-02-12 11:58:35 +01:00
Richard Biener
12c763c68a testsuite/93697 fix inconsistent warning in testcase
The warning was emitted inconsistently on targets, so disable it since
the testcase was for an ICE.

2020-02-12  Richard Biener  <rguenther@suse.de>

	PR testsuite/93697
	* gcc.dg/pr93661.c: Pass -w, remove dg-warning.
2020-02-12 10:03:09 +01:00
Kewen Lin
4d2248bec5 [IRA] Fix PR91052 by skipping multiple_sets insn in combine_and_move_insns
As PR91052's comments show, commit r272731 exposed one issue in function
combine_and_move_insns.  Function combine_and_move_insns perform the
unexpected movement which alter live interval of some register, leading
incorrect value to be used.  See PR91052 for details.

2020-02-12  Kewen Lin  <linkw@gcc.gnu.org>
    PR target/91052
    * ira.c (combine_and_move_insns): Skip multiple_sets def_insn.
2020-02-11 23:22:02 -06:00
David Malcolm
91f993b7e3 analyzer: use ultimate alias target at calls (PR 93288)
PR analyzer/93288 reports an ICE in a C++ testcase when calling a
constructor.

The issue is that when building the supergraph, we encounter the
cgraph edge to "__ct_comp ", the DECL_COMPLETE_CONSTRUCTOR_P, and
this node's DECL_STRUCT_FUNCTION has a NULL CFG, which the analyzer
reads through, leading to the ICE.

This patch reworks function and fndecl lookup at calls throughout the
analyzer so that it looks for the ultimate_alias_target of the callee.
In the case above, this means using the "__ct_base " for the ctor,
which has a CFG, fixing the ICE.

Getting this right allows for some simple C++ cases involving ctors to
work, so the patch also adds some test coverage for that.

gcc/analyzer/ChangeLog:
	PR analyzer/93288
	* analysis-plan.cc (analysis_plan::use_summary_p): Look through
	the ultimate_alias_target when getting the called function.
	* engine.cc (exploded_node::on_stmt): Rename second "ctxt" to
	"sm_ctxt".  Use the region_model's get_fndecl_for_call rather than
	gimple_call_fndecl.
	* region-model.cc (region_model::get_fndecl_for_call): Use
	ultimate_alias_target on fndecl.
	* supergraph.cc (get_ultimate_function_for_cgraph_edge): New
	function.
	(supergraph_call_edge): Use it when rejecting edges without
	functions.
	(supergraph::supergraph): Use it to get the function for the
	cgraph_edge when building interprocedural superedges.
	(callgraph_superedge::get_callee_function):  Use it.
	* supergraph.h (supergraph::get_num_snodes): Make param const.
	(supergraph::function_to_num_snodes_t): Make first type param
	const.

gcc/testsuite/ChangeLog:
	PR analyzer/93288
	* g++.dg/analyzer/malloc.C: Add test coverage for a double-free
	called in a constructor.
	* g++.dg/analyzer/pr93288.C: New test.
2020-02-11 21:06:43 -05:00
Segher Boessenkool
d9e067f98b rs6000: Use strlen instead of sizeof - 1
It is easier to read and understand  strlen ("string")  than it is to
read and understand  sizeof ("string") - 1  .

	* config/rs6000/rs6000.c (rs6000_debug_print_mode): Don't use sizeof
	where strlen is more legible.
	(rs6000_builtin_vectorized_libmass): Ditto.
	(rs6000_print_options_internal): Ditto.
2020-02-12 02:05:00 +00:00
David Malcolm
35e24106fc analyzer: g++ testsuite support
PR analyzer/93288 reports a C++-specific ICE with -fanalyzer.

This patch creates the beginnings of a C++ test suite for the analyzer,
so that there's a place to put test coverage for the fix.
It adds a regression test for PR analyzer/93212, an ICE fixed
in r10-5970-g32077b693df8e3ed0424031a322df23822bf2f7e.

gcc/testsuite/ChangeLog:
	PR analyzer/93212
	* g++.dg/analyzer/analyzer.exp: New subdirectory and .exp suite.
	* g++.dg/analyzer/malloc.C: New test.
	* g++.dg/analyzer/pr93212.C: New test.
2020-02-11 20:58:38 -05:00
GCC Administrator
3889b0cb45 Daily bump. 2020-02-12 00:16:40 +00:00
Jason Merrill
d6ef77e023 c++: Fix implicit friend operator==.
It seems that in writing testcases for the operator<=> proposal I didn't
include any tests for implicitly declared friend operator==, and
consequently it didn't work.

2020-02-11  Jason Merrill  <jason@redhat.com>

	PR c++/93675
	* class.c (add_implicitly_declared_members): Use do_friend.
	* method.c (implicitly_declare_fn): Fix friend handling.
	(decl_remember_implicit_trigger_p): New.
	(synthesize_method): Use it.
	* decl2.c (mark_used): Use it.
2020-02-12 01:07:41 +01:00
Martin Sebor
9a5338e57d PR tree-optimization/93683 - ICE on calloc with unused return value in ao_ref_init_from_ptr_and_size
gcc/testsuite/ChangeLog:

	PR tree-optimization/93683
	* gcc.dg/tree-ssa/ssa-dse-39.c: New test.

gcc/ChangeLog:

	PR tree-optimization/93683
	* tree-ssa-alias.c (stmt_kills_ref_p): Avoid using LHS when not set.
2020-02-11 16:53:13 -07:00
Will Schmidt
ad21e0072e Add ppc_ieee128_ok target-supports proc
Add a target_supports entry to check that the __ieee128 keyword
is understood by the target.
Also add a dg-requires check to the existing pr92796 testcase.

    [testsuite]
	* lib/target-supports.exp (check_effective_target_ppc_ieee128_ok): New.
	* gcc.target/powerpc/pr92796.c: Add a require-effective-target
	statement for ppc_ieee128_ok.
2020-02-11 14:01:59 -06:00
Michael Meissner
7a775242ea Rename -mprefixed-addr to be -mprefixed, and document it.
2020-02-11  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/predicates.md (cint34_operand): Rename the
	-mprefixed-addr option to be -mprefixed.
	* config/rs6000/rs6000-cpus.def (ISA_FUTURE_MASKS_SERVER): Rename
	the -mprefixed-addr option to be -mprefixed.
	(OTHER_FUTURE_MASKS): Likewise.
	(POWERPC_MASKS): Likewise.
	* config/rs6000/rs6000.c (rs6000_option_override_internal): Rename
	the -mprefixed-addr option to be -mprefixed.  Change error
	messages to refer to -mprefixed.
	(num_insns_constant_gpr): Rename the -mprefixed-addr option to be
	-mprefixed.
	(rs6000_legitimate_offset_address_p): Likewise.
	(rs6000_mode_dependent_address): Likewise.
	(rs6000_opt_masks): Change the spelling of "-mprefixed-addr" to be
	"-mprefixed" for target attributes and pragmas.
	(address_to_insn_form): Rename the -mprefixed-addr option to be
	-mprefixed.
	(rs6000_adjust_insn_length): Likewise.
	* config/rs6000/rs6000.h (FINAL_PRESCAN_INSN): Rename the
	-mprefixed-addr option to be -mprefixed.
	(ASM_OUTPUT_OPCODE): Likewise.
	* config/rs6000/rs6000.md (prefixed insn attribute): Rename the
	-mprefixed-addr option to be -mprefixed.
	* config/rs6000/rs6000.opt (-mprefixed): Rename the
	-mprefixed-addr option to be prefixed.  Change the option from
	being undocumented to being documented.
	* doc/invoke.texi (RS/6000 and PowerPC Options): Document the
	-mprefixed option.  Update the -mpcrel documentation to mention
	-mprefixed.
2020-02-11 14:03:16 -05:00
David Malcolm
a60d98890b analyzer: fix ICE due to missing state_change purging (PR 93374)
PR analyzer/93374 reports an ICE within state_change::validate due to an
m_new_sid in a recorded state-change being out of range of the svalues
of the region_model of the new state.

During get_or_create_node we attempt to merge the new state with the
state of each of the existing enodes at the program point (in the
absence of sm-state differences), simplifying the state at each
attempt, and potentially reusing a node if we get a match.

This state-merging invalidates any svalue_ids within any state_change
object.

The root cause is that, although the code was purging any such
svalue_ids for the case where no match was found during merging, it was
failing to purge them for the case where a matching enode *was* found
for the merged state, leading to an invalid state_change along the
exploded_edge to the reused enode.

This patch moves the invalidation code to cover both cases, fixing the
ICE.  It also extends state_change validation so that states are also
checked.

gcc/analyzer/ChangeLog:
	PR analyzer/93374
	* engine.cc (exploded_edge::exploded_edge): Add ext_state param
	and pass it to change.validate.
	(exploded_graph::get_or_create_node): Move purging of change
	svalues to also cover the case of reusing an existing enode.
	(exploded_graph::add_edge): Pass m_ext_state to exploded_edge's
	ctor.
	* exploded-graph.h (exploded_edge::exploded_edge): Add ext_state
	param.
	* program-state.cc (state_change::sm_change::validate): Likewise.
	Assert that m_sm_idx is sane.  Use ext_state to validate
	m_old_state and m_new_state.
	(state_change::validate): Add ext_state param and pass it to
	the sm_change validate calls.
	* program-state.h (state_change::sm_change::validate): Add
	ext_state param.
	(state_change::validate): Likewise.

gcc/testsuite/ChangeLog:
	PR analyzer/93374
	* gcc.dg/analyzer/torture/pr93374.c: New test.
2020-02-11 13:37:09 -05:00
David Malcolm
a0e4929b04 analyzer: fix ICE in "__analyzer_dump_exploded_nodes" on non-empty worklist (PR 93669)
gcc/analyzer/ChangeLog:
	PR analyzer/93669
	* engine.cc (exploded_graph::dump_exploded_nodes): Handle missing
	case of STATUS_WORKLIST in implementation of
	"__analyzer_dump_exploded_nodes".

gcc/testsuite/ChangeLog:
	PR analyzer/93669
	* gcc.dg/analyzer/pr93669.c: New test.
2020-02-11 13:34:08 -05:00
David Malcolm
cd28b75921 analyzer: fix ICE with equiv_class constant (PR 93649)
gcc/analyzer/ChangeLog:
	PR analyzer/93649
	* constraint-manager.cc (constraint_manager::add_constraint): When
	merging equivalence classes and updating m_constant, also update
	m_cst_sid.
	(constraint_manager::validate): If m_constant is non-NULL assert
	that m_cst_sid is non-null and is valid.

gcc/testsuite/ChangeLog:
	PR analyzer/93649
	* gcc.dg/analyzer/torture/pr93649.c: New test.
2020-02-11 13:32:51 -05:00
David Malcolm
5e17c1bdad analyzer.opt: reword descriptions of two dump options (PR 93657)
gcc/analyzer/ChangeLog:
	PR analyzer/93657
	* analyzer.opt (fdump-analyzer): Reword description.
	(fdump-analyzer-stderr): Likewise.
2020-02-11 13:30:42 -05:00
David Malcolm
c46d057f55 analyzer: workaround for nested pp_printf
The dumps from the analyzer sometimes contain garbled output.

The root cause is due to nesting of calls to pp_printf: I'm using
pp_printf with %qT to print types with a PP using default_tree_printer.

default_tree_printer handles 'T' (and various other codes) via
  dump_generic_node (pp, t, 0, TDF_SLIM, 0);
and dump_generic_node can call pp_printf in various ways, leading
to a pp_printf within a pp_printf, and garbled output.

I don't think it's feasible to fix pp_printf to be reentrant, in
stage 4, at least, so for the moment this patch works around it
in the analyzer.

gcc/analyzer/ChangeLog:
	* region-model.cc (print_quoted_type): New function.
	(svalue::print): Use it to replace %qT.
	(region::dump_to_pp): Likewise.
	(region::dump_child_label): Likewise.
	(region::print_fields): Likewise.
2020-02-11 13:28:25 -05:00
Hans-Peter Nilsson
a5e3dd5d2e regalloc/debug: fix buggy print_hard_reg_set
* ira-conflicts.c (print_hard_reg_set): Correct output for sets
including FIRST_PSEUDO_REGISTER - 1.
* ira-color.c (print_hard_reg_set): Ditto.

Before, for a target with FIRST_PSEUDO_REGISTER 20, you'd get "19-18"
for (1<<19).  For (1<<18)|(1<<19), you'd get "18".

I was using ira-conflicts.c:print_hard_reg_set with a local
patch to gdbinit.in in a debug-session, and noticed the
erroneous output.  I see there's an almost identical function in
ira-color.c and on top of that, there's another function by the
same name and with similar semantics in sel-sched-dump.c, but
the last one doesn't try to print ranges.
2020-02-11 18:16:40 +01:00
Will Schmidt
c0e05505ff Tweak testcases for pr70010
[testsuite]

	* gcc.target/powerpc/pr70010-2.c: Add -maltivec.
	* gcc.target/powerpc/pr70010-3.c: Add -maltivec.
2020-02-11 10:27:25 -06:00
Stam Markianos-Wright
f348846e25 [GCC][PATCH][ARM]Add ACLE intrinsics for dot product (vusdot - vector, v<us/su>dot - by element) for AArch32 AdvSIMD ARMv8.6 Extension
This patch adds the ARMv8.6 Extension ACLE intrinsics for dot product
operations (vector/by element) to the ARM back-end.

These are:
usdot (vector), <us/su>dot (by element).

The functions are optional from ARMv8.2-a as -march=armv8.2-a+i8mm and
for ARM they remain optional after as of ARMv8.6-a.

The functions are declared in arm_neon.h, RTL patterns are defined to
generate assembler and tests are added to verify and perform adequate checks.

Regression testing on arm-none-eabi passed successfully.

gcc/ChangeLog:

2020-02-11  Stam Markianos-Wright  <stam.markianos-wright@arm.com>

	* config/arm/arm-builtins.c (enum arm_type_qualifiers):
	(USTERNOP_QUALIFIERS): New define.
	(USMAC_LANE_QUADTUP_QUALIFIERS): New define.
	(SUMAC_LANE_QUADTUP_QUALIFIERS): New define.
	(arm_expand_builtin_args): Add case ARG_BUILTIN_LANE_QUADTUP_INDEX.
	(arm_expand_builtin_1): Add qualifier_lane_quadtup_index.
	* config/arm/arm_neon.h (vusdot_s32): New.
	(vusdot_lane_s32): New.
	(vusdotq_lane_s32): New.
	(vsudot_lane_s32): New.
	(vsudotq_lane_s32): New.
	* config/arm/arm_neon_builtins.def (usdot, usdot_lane,sudot_lane): New.
	* config/arm/iterators.md (DOTPROD_I8MM): New.
	(sup, opsuffix): Add <us/su>.
	* config/arm/neon.md (neon_usdot, <us/su>dot_lane: New.
	* config/arm/unspecs.md (UNSPEC_DOT_US, UNSPEC_DOT_SU): New.

gcc/testsuite/ChangeLog:

2020-02-11  Stam Markianos-Wright  <stam.markianos-wright@arm.com>

	* gcc.target/arm/simd/vdot-2-1.c: New test.
	* gcc.target/arm/simd/vdot-2-2.c: New test.
	* gcc.target/arm/simd/vdot-2-3.c: New test.
	* gcc.target/arm/simd/vdot-2-4.c: New test.
2020-02-11 11:14:07 +00:00
Richard Biener
667afe5a49 tree-optimization/93661 properly guard tree_to_poly_int64
2020-02-11  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/93661
	PR tree-optimization/93662
	* tree-ssa-sccvn.c (vn_reference_lookup_3): Properly guard
	tree_to_poly_int64.
	* tree-sra.c (get_access_for_expr): Likewise.

	* gcc.dg/pr93661.c: New testcase.
2020-02-11 10:56:03 +01:00
Richard Biener
9714f1a70d tree-optimization/93661 properly guard tree_to_poly_int64
2020-02-11  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/93661
	PR tree-optimization/93662
	* tree-ssa-sccvn.c (vn_reference_lookup_3): Properly guard
	tree_to_poly_int64.
	* tree-sra.c (get_access_for_expr): Likewise.

	* gcc.dg/pr93661.c: New testcase.
2020-02-11 10:53:31 +01:00
Jason Merrill
dfffecb802 c++: Fix static initialization from <=>.
Constant evaluation of genericize_spaceship produced a CONSTRUCTOR, which we
then wanted to bind to a reference, which we can't do.  So wrap the result
in a TARGET_EXPR so we get something with an address.

We also need to handle treating the result of cxx_eval_binary_expression as
a glvalue for SPACESHIP_EXPR.

My earlier change to add uid_sensitive to maybe_constant_value was wrong; we
don't even look at the cache when manifestly_const_eval, and I failed to
adjust the later call to cxx_eval_outermost_constant_expr.

gcc/cp/ChangeLog
2020-02-11  Jason Merrill  <jason@redhat.com>

	PR c++/93650
	PR c++/90691
	* constexpr.c (maybe_constant_value): Correct earlier change.
	(cxx_eval_binary_expression) [SPACESHIP_EXPR]: Pass lval through.
	* method.c (genericize_spaceship): Wrap result in TARGET_EXPR.
2020-02-11 09:17:42 +01:00
Patrick Palka
a6ee556c76 c++: Fix return type deduction with an abbreviated function template
This patch fixes two issues with return type deduction in the presence of an
abbreviated function template.

The first issue (PR 69448) is that if a placeholder auto return type contains
any modifiers such as & or *, then the abbreviated function template
compensation in splice_late_return_type does not get performed for the
underlying auto node, leading to incorrect return type deduction.  This happens
because splice_late_return_type does not consider that a placeholder auto return
type might have modifiers.  To fix this it seems we need to look through
modifiers in the return type to obtain the location of the underlying auto node
in order to replace it with the adjusted auto node.  To that end this patch
refactors the utility function find_type_usage to return a pointer to the
matched tree, and uses it to find and replace the underlying auto node.

The second issue (PR 80471) is that the AUTO_IS_DECLTYPE flag is not being
preserved in splice_late_return_type when compensating for an abbreviated
function template, leading to us treating a decltype(auto) return type as if it
was an auto return type.  Fixed by making make_auto_1 set the AUTO_IS_DECLTYPE
flag whenever we're building a decltype(auto) node and adjusting callers
appropriately.  The test for PR 80471 is adjusted to expect the correct
behavior.

gcc/cp/ChangeLog:

	PR c++/69448
	PR c++/80471
	* type-utils.h (find_type_usage): Refactor to take a tree * and to
	return a tree *, and update documentation accordingly.
	* pt.c (make_auto_1): Set AUTO_IS_DECLTYPE when building a
	decltype(auto) node.
	(make_constrained_decltype_auto): No need to explicitly set
	AUTO_IS_DECLTYPE anymore.
	(splice_late_return_type): Use find_type_usage to find and
	replace a possibly nested auto node instead of using is_auto.
	Check test for is_auto into an assert when deciding whether
	to late_return_type.
	(type_uses_auto): Adjust the call to find_type_usage.
	* parser.c (cp_parser_decltype): No need to explicitly set
	AUTO_IS_DECLTYPE anymore.

libcc1/ChangeLog:

	PR c++/69448
	PR c++/80471
	* libcp1plugin.cc (plugin_get_expr_type): No need to explicitly set
	AUTO_IS_DECLTYPE anymore.

gcc/testsuite/ChangeLog:

	PR c++/69448
	PR c++/80471
	* g++.dg/concepts/abbrev3.C: New test.
	* g++.dg/cpp2a/concepts-pr80471.C: Adjust a static_assert to expect the
	correct behavior.
	* g++.dg/cpp0x/auto9.C: Adjust a dg-error directive.
2020-02-10 20:43:53 -05:00