Commit Graph

184000 Commits

Author SHA1 Message Date
Vladimir N. Makarov
8bf983c71e [PR99680] Check empty constraint before using CONSTRAINT_LEN.
It seems CONSTRAINT_LEN treats constraint '\0' as one having length 1.  Therefore we
read after the constraint string.  The patch fixes it.
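
A minimal sketch of the kind of guard described above. It is not the code in lra-constraints.c: `toy_constraint_len` and `is_single_constraint` are hypothetical names, and the toy length function only models a length macro that reports 1 even for the terminating '\0'.

```cpp
#include <cstddef>

// Hypothetical stand-in for a length macro: every character, including
// the terminating '\0', is reported as one unit long.
constexpr std::size_t toy_constraint_len(char /*c*/) { return 1; }

// Returns true when the string holds exactly one constraint.
// The emptiness test comes first and short-circuits, so the second
// operand never indexes past the terminator of an empty string.
constexpr bool is_single_constraint(const char *constraint) {
    return constraint[0] != '\0'
           && constraint[toy_constraint_len(constraint[0])] == '\0';
}
```

Without the first test, an empty string would be indexed at position 1, which is one past its terminator.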

gcc/ChangeLog:

	PR rtl-optimization/99680
	* lra-constraints.c (skip_contraint_modifiers): Rename to skip_constraint_modifiers.
	(process_address_1): Check empty constraint before using
	CONSTRAINT_LEN.
2021-03-20 10:52:10 -04:00
GCC Administrator
5f256a70a0 Daily bump. 2021-03-20 00:16:24 +00:00
Jakub Jelinek
3279a9a5a9 c: Fix up -Wunused-but-set-* warnings for _Atomics [PR99588]
As the following testcases show, compared to -D_Atomic= case we have many
-Wunused-but-set-* warning false positives.
When an _Atomic variable/parameter is read, we call mark_exp_read on it in
convert_lvalue_to_rvalue, but build_atomic_assign does not.
For consistency with the non-_Atomic case where we mark_exp_read the lhs
for lhs op= ... but not for lhs = ..., this patch does that too.
But furthermore we need to pattern match the trees emitted by _Atomic store,
so that _Atomic store itself is not marked as being a variable read, but
when the result of the store is used, we mark it.

2021-03-19  Jakub Jelinek  <jakub@redhat.com>

	PR c/99588
	* c-typeck.c (mark_exp_read): Recognize what build_atomic_assign
	with modifycode NOP_EXPR produces and mark the _Atomic var as read
	if found.
	(build_atomic_assign): For modifycode of NOP_EXPR, use COMPOUND_EXPRs
	rather than STATEMENT_LIST.  Otherwise call mark_exp_read on lhs.
	Set TREE_SIDE_EFFECTS on the TARGET_EXPR.

	* gcc.dg/Wunused-var-5.c: New test.
	* gcc.dg/Wunused-var-6.c: New test.
2021-03-19 22:54:31 +01:00
Joseph Myers
e5d74554b5 Regenerate gcc.pot.
* gcc.pot: Regenerate.
2021-03-19 21:33:05 +00:00
Pat Haugen
e1df2c3436 Add Power10 scheduling description.
2021-03-19  Pat Haugen  <pthaugen@linux.ibm.com>

gcc/

	* config/rs6000/rs6000.c (power10_cost): New.
	(rs6000_option_override_internal): Set Power10 costs.
	(rs6000_issue_rate): Set Power10 issue rate.
	* config/rs6000/power10.md: Rewrite for Power10.
2021-03-19 15:51:22 -05:00
Jonathan Wakely
b8ecdc7727 libstdc++: Add std::is_scoped_enum for C++23
Implement this C++23 feature, as proposed by P1048R1.

This implementation assumes that a C++23 compiler supports concepts
already. I don't see any point in using preprocessor hacks to detect
compilers which define __cplusplus to a post-C++20 value but don't
support concepts yet.
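
For readers without a C++23 library, the property that the trait reports can be approximated as follows. This is an illustrative sketch, not the libstdc++ implementation (which, per the message above, relies on concepts); `is_scoped_enum_sketch`, `scoped_color` and `plain_color` are hypothetical names. It assumes a scoped enumeration is an enum that does not convert implicitly to its underlying type.

```cpp
#include <type_traits>

// Primary template: anything that is not an enum is not a scoped enum.
template <typename T, bool = std::is_enum<T>::value>
struct is_scoped_enum_sketch : std::false_type {};

// Enums: scoped enumerations do not convert implicitly to their
// underlying type, while unscoped ones do.
template <typename T>
struct is_scoped_enum_sketch<T, true>
    : std::integral_constant<
          bool,
          !std::is_convertible<
              T, typename std::underlying_type<T>::type>::value> {};

enum class scoped_color { red, green };  // scoped
enum plain_color { blue, yellow };       // unscoped
```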

libstdc++-v3/ChangeLog:

	* include/std/type_traits (is_scoped_enum): Define.
	* include/std/version (__cpp_lib_is_scoped_enum): Define.
	* testsuite/20_util/is_scoped_enum/value.cc: New test.
	* testsuite/20_util/is_scoped_enum/version.cc: New test.
2021-03-19 20:10:56 +00:00
Thomas Koenig
83855386c4 Add size check to vector-matrix matmul.
It turns out the library version is much faster for vector-matrix
multiplications for large sizes than what inlining can produce.
Use size checks for switching between this and inlining for
that case too.

gcc/fortran/ChangeLog:

	* frontend-passes.c (inline_limit_check): Add rank_a
	argument. If a is rank 1, set the second dimension to 1.
	(inline_matmul_assign): Pass rank_a argument to inline_limit_check.
	(call_external_blas): Likewise.

gcc/testsuite/ChangeLog:

	* gfortran.dg/inline_matmul_6.f90: Adjust count for
	  _gfortran_matmul.
2021-03-19 20:52:20 +01:00
Vladimir N. Makarov
d81019db09 [PR99663] Don't use unknown constraint for address constraint in process_address_1.
s390x has insns using several alternatives with address constraints.  Even
if we don't know at this stage what alternative will be used, we still can say
that it is an address constraint.  So don't use unknown constraint in this
case when there are multiple constraints and/or alternatives.

gcc/ChangeLog:

	PR target/99663
	* lra-constraints.c (process_address_1): Don't use unknown
	constraint for address constraint.

gcc/testsuite/ChangeLog:

	PR target/99663
	* gcc.target/s390/pr99663.c: New.
2021-03-19 15:34:48 -04:00
Jakub Jelinek
82bb66730b c++: Only reject reinterpret casts from pointers to integers for manifestly_const_eval evaluation [PR99456]
My PR82304/PR95307 fix moved reinterpret cast from pointer to integer
diagnostics from cxx_eval_outermost_constant_expr where it caught
invalid code only at the outermost level down into
cxx_eval_constant_expression.
Unfortunately, it regressed the following testcase: we emit worse code,
including dynamic initialization of some vars.
While the initializers are not constant expressions due to the
reinterpret_cast in there, there is no reason not to fold them as an
optimization.

I've tried to make this dependent on !ctx->quiet, but that regressed
two further tests, and on ctx->strict, which regressed other tests,
so this patch bases that on manifestly_const_eval.

The new testcase is now optimized as much as it used to be in GCC 10
and the only regression it causes is an extra -Wnarrowing warning
on vla22.C test on invalid code (which the patch adjusts).

2021-03-19  Jakub Jelinek  <jakub@redhat.com>

	PR c++/99456
	* constexpr.c (cxx_eval_constant_expression): For CONVERT_EXPR from
	INDIRECT_TYPE_P to ARITHMETIC_TYPE_P, when !ctx->manifestly_const_eval
	don't diagnose it, set *non_constant_p nor return t.

	* g++.dg/opt/pr99456.C: New test.
	* g++.dg/ext/vla22.C: Expect a -Wnarrowing warning for c++11 and
	later.
2021-03-19 18:36:56 +01:00
Iain Sandoe
02f305440f Darwin : Fix build failure for powerpc-darwin8 [PR99661].
A hunk had been missed from r11-6417, fixed thus:

gcc/ChangeLog:

	PR target/99661
	* config.gcc (powerpc-*-darwin8): Delete the reference to
	the now removed darwin8.h.
2021-03-19 16:40:11 +00:00
Olivier Hainque
eadb118e36 target/99660 - missing VX_CPU_PREFIX for vxworksae
This fixes an oversight which causes make all-gcc
to fail for --target=*vxworksae or vxworksmils, a regression
introduced by the recent VxWorks7 related updates.

Both AE and MILS variants resort to a common config/vxworksae.h,
which misses a definition of VX_CPU_PREFIX expected by port
specific headers.

The change just provides the missing definition.

2021-03-19  Olivier Hainque  <hainque@adacore.com>

gcc/
	PR target/99660
	* config/vxworksae.h (VX_CPU_PREFIX): Define.
2021-03-19 16:16:39 +00:00
John David Anglin
22d1a90a15 Use memcpy instead of strncpy to avoid error with -Werror=stringop-truncation.
gcc/ChangeLog:

	* config/pa/pa.c (import_milli): Use memcpy instead of strncpy.
2021-03-19 15:57:06 +00:00
Tamar Christina
c3a2bc6daa slp: remove unneeded permute calculation (PR99656)
The attached testcase ICEs because, as you showed on the PR, we have one child
which is an internal with a PERM of EVENEVEN and one with TOP.

The problem is while we can conceptually merge the permute itself into EVENEVEN,
merging the lanes doesn't really make sense.

That said, we no longer even require the merged lanes as we create the permutes
based on the KIND directly.

This patch just removes all of that code.

Unfortunately it still won't vectorize with the cost model enabled due to the
blend that's created combining the load and the external

	note: node 0x51f2ce8 (max_nunits=1, refcnt=1)
	note: op: VEC_PERM_EXPR
	note:       { }
	note:       lane permutation { 0[0] 1[1] }
	note:       children 0x51f23e0 0x51f2578
	note: node 0x51f23e0 (max_nunits=2, refcnt=1)
	note: op template: _16 = REALPART_EXPR <*t1_9(D)>;
	note:       stmt 0 _16 = REALPART_EXPR <*t1_9(D)>;
	note:       stmt 1 _16 = REALPART_EXPR <*t1_9(D)>;
	note:       load permutation { 0 0 }
	note: node (external) 0x51f2578 (max_nunits=1, refcnt=1)
	note:       { _18, _18 }

which is charged the cost of the load-and-split, the cost of the external splat,
and the cost of blending them, while in reality it's just a scalar load and
insert.

The compiler (with the cost model disabled) generates

	ldr     q1, [x19]
	dup     v1.2d, v1.d[0]
	ldr     d0, [x19, 8]
	fneg    d0, d0
	ins     v1.d[1], v0.d[0]

while really it should be

	ldp     d1, d0, [x19]
	fneg    d0, d0
	ins     v1.d[1], v0.d[0]

but that's for another time.

gcc/ChangeLog:

	PR tree-optimization/99656
	* tree-vect-slp-patterns.c (linear_loads_p,
	complex_add_pattern::matches, is_eq_or_top,
	vect_validate_multiplication, complex_mul_pattern::matches,
	complex_fms_pattern::matches): Remove complex_perm_kinds_t.
	* tree-vectorizer.h: (complex_load_perm_t): Removed.
	(slp_tree_to_load_perm_map_t): Use complex_perm_kinds_t instead of
	complex_load_perm_t.

gcc/testsuite/ChangeLog:

	PR tree-optimization/99656
	* gfortran.dg/vect/pr99656.f90: New test.
2021-03-19 14:29:36 +00:00
H.J. Lu
5e2eabe1ee x86: Issue error for return/argument only with function body
If we never generate function body, we shouldn't issue errors for return
nor argument.  Add silent_p to i386 machine_function to avoid issuing
errors for return and argument without function body.

gcc/

	PR target/99652
	* config/i386/i386-options.c (ix86_init_machine_status): Set
	silent_p to true.
	* config/i386/i386.c (init_cumulative_args): Set silent_p to
	false.
	(construct_container): Return early for return and argument
	errors if silent_p is true.
	* config/i386/i386.h (machine_function): Add silent_p.

gcc/testsuite/

	PR target/99652
	* gcc.dg/torture/pr99652-1.c: New test.
	* gcc.dg/torture/pr99652-2.c: Likewise.
	* gcc.target/i386/pr57655.c: Adjusted.
	* gcc.target/i386/pr59794-6.c: Likewise.
	* gcc.target/i386/pr70738-1.c: Likewise.
	* gcc.target/i386/pr96744-1.c: Likewise.
2021-03-19 06:39:51 -07:00
David Malcolm
21d09cb732 analyzer: mark epath_finder with DISABLE_COPY_AND_ASSIGN [PR99614]
cppcheck warns that class epath_finder does dynamic memory allocation, but
is missing a copy constructor and operator=.

This class isn't meant to be copied or assigned, so mark it with
DISABLE_COPY_AND_ASSIGN.
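
A small sketch of what such a marker does for a class that owns heap memory. The macro below is a hypothetical stand-in written for this example (GCC's own DISABLE_COPY_AND_ASSIGN may be defined differently), and `buffer_owner` is an invented class, not epath_finder.

```cpp
#include <type_traits>

// Hypothetical stand-in for a "no copy, no assign" marker macro.
#define SKETCH_DISABLE_COPY_AND_ASSIGN(TYPE) \
    TYPE(const TYPE &) = delete;             \
    TYPE &operator=(const TYPE &) = delete

// A class that allocates in its constructor and frees in its destructor.
// Copying it would leave two objects freeing the same array, so both
// copy operations are deleted.
class buffer_owner {
public:
    buffer_owner() : m_data(new int[16]()) {}
    ~buffer_owner() { delete[] m_data; }
    int *data() { return m_data; }

private:
    SKETCH_DISABLE_COPY_AND_ASSIGN(buffer_owner);
    int *m_data;
};
```

With the copy operations deleted, an accidental copy becomes a compile-time error instead of a double free at run time.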

gcc/analyzer/ChangeLog:
	PR analyzer/99614
	* diagnostic-manager.cc (class epath_finder): Add
	DISABLE_COPY_AND_ASSIGN.
2021-03-19 09:01:57 -04:00
Jakub Jelinek
009528d61c arm: Fix mve_vshlq* [PR99593]
As mentioned in the PR, before the r11-6708-gbfab355012ca0f5219da8beb04f2fdaf757d34b7
change v[al]shr<mode>3 expanders were expanding the shifts by register
to gen_ashl<mode>3_{,un}signed which don't support immediate CONST_VECTOR
shift amounts, but now expand to mve_vshlq_<supf><mode> which does.
The testcase ICEs, because the constraint doesn't match the predicate and
because LRA works solely with the constraints, so it can e.g. from REG_EQUAL
propagate there a CONST_VECTOR which matches the constraint but fails the
predicate and only later on other passes will notice the predicate fails
and ICE.

Fixed by adding a constraint that matches the immediate part of the
predicate.

	PR target/99593
	* config/arm/constraints.md (Ds): New constraint.
	* config/arm/vec-common.md (mve_vshlq_<supf><mode>): Use w,Ds
	constraint instead of w,Dm.

	* g++.target/arm/pr99593.C: New test.
2021-03-19 13:48:44 +01:00
Andrew Stubbs
5cded5aff7 amdgcn: Typo fix
gcc/ChangeLog:

	* config/gcn/gcn.c (gcn_parse_amdgpu_hsa_kernel_attribute): Fix quotes
	in error message.
2021-03-19 10:51:43 +00:00
Matthias Klose
3b0155305e substitute @tie{} with a space for the man pages
contrib/

2021-03-19  Matthias Klose  <doko@ubuntu.com>

	* texi2pod.pl: Substitute @tie{} with a space for the man pages.
2021-03-19 10:03:02 +00:00
Eric Botcazou
af73a8b202 Require linker plugin for another LTO test
If it is not present, fat LTO is generated with an additional warning.

gcc/testsuite/
	* g++.dg/lto/pr89335_0.C: Require the linker plugin.
2021-03-19 09:25:23 +01:00
Eric Botcazou
b980edba50 Fix segfault during encoding of CONSTRUCTORs
The segfault occurs in native_encode_initializer when it is encoding the
CONSTRUCTOR for an array whose lower bound is negative (it's OK in Ada).
The computation of the current position is done in HOST_WIDE_INT and this
does not work for arrays whose original range has a negative lower bound
and a positive upper bound; the computation must be done in sizetype
instead so that it may wrap around.
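
The wrap-around argument can be illustrated with plain unsigned arithmetic. This is a sketch only: `element_offset` is a hypothetical helper, and a 64-bit unsigned integer is assumed here as a stand-in for sizetype.

```cpp
#include <cstdint>

// Byte offset of element `index` in an array whose first index is `low`,
// with both bounds held as unsigned values. A negative lower bound such
// as -2 is stored as a very large unsigned number, and the subtraction
// wraps around modulo 2^64 to give the expected small distance.
constexpr std::uint64_t element_offset(std::uint64_t index,
                                       std::uint64_t low,
                                       std::uint64_t element_size) {
    return (index - low) * element_size;
}
```

For an array indexed from -2 to 3 with 4-byte elements, `element_offset(3, static_cast<std::uint64_t>(-2), 4)` is 20, because index 3 is the sixth element.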

gcc/
	PR middle-end/99641
	* fold-const.c (native_encode_initializer) <CONSTRUCTOR>: For an
	array type, do the computation of the current position in sizetype.
2021-03-19 09:25:23 +01:00
GCC Administrator
287e3e8466 Daily bump. 2021-03-19 00:16:26 +00:00
Marek Polacek
bd9b262fa9 c++: Fix error-recovery with requires expression [PR99500]
This fixes an ICE on invalid code where one of the parameters was
error_mark_node and thus resetting its DECL_CONTEXT crashed.

gcc/cp/ChangeLog:

	PR c++/99500
	* parser.c (cp_parser_requirement_parameter_list): Handle
	error_mark_node.

gcc/testsuite/ChangeLog:

	PR c++/99500
	* g++.dg/cpp2a/concepts-err3.C: New test.
2021-03-18 20:09:44 -04:00
Marek Polacek
96ccb32543 c++: Remove FLOAT_EXPR assert in tsubst.
This assert triggered when pr85013.C was compiled with -fchecking=2
which the usual testing doesn't exercise.  Let's remove it for now
and revisit in GCC 12.

gcc/cp/ChangeLog:

	* pt.c (tsubst_copy_and_build) <case FLOAT_EXPR>: Remove.
2021-03-18 17:20:32 -04:00
Vladimir N. Makarov
a4670f58eb [PR99422] LRA: Use lookup_constraint only for a single constraint in process_address_1.
This is an additional patch for PR99422.  In process_address_1 we
look only at the first constraint in the 1st alternative
and ignore all other possibilities.  As we don't know what
alternative and constraint will be used at this stage, we can be sure
only for a single constraint with one alternative and should use unknown
constraint for all other cases.

gcc/ChangeLog:

	PR target/99422
	* lra-constraints.c (process_address_1): Use lookup_constraint
	only for a single constraint.
2021-03-18 15:59:15 -04:00
Martin Sebor
30b10dacd0 PR middle-end/99502 - missing -Warray-bounds on partial out of bounds
gcc/ChangeLog:

	PR middle-end/99502
	* gimple-array-bounds.cc (inbounds_vbase_memaccess_p): Rename...
	(inbounds_memaccess_p): ...to this.  Check the ending offset of
	the accessed member.

gcc/testsuite/ChangeLog:

	PR middle-end/99502
	* g++.dg/warn/Warray-bounds-22.C: New test.
	* g++.dg/warn/Warray-bounds-23.C: New test.
	* g++.dg/warn/Warray-bounds-24.C: New test.
2021-03-18 13:38:00 -06:00
Marek Polacek
c5e55673b4 c++: Add assert to tsubst.
As discussed in the r11-7709 patch, we can now make sure that tsubst
never sees a FLOAT_EXPR, much like its counterpart FIX_TRUNC_EXPR.

gcc/cp/ChangeLog:

	* pt.c (tsubst_copy_and_build): Add assert.
2021-03-18 14:20:00 -04:00
Andrew Stubbs
55308fc263 amdgcn: Silence warnings in gcn.c
This fixes a few cases of "unquoted identifier or keyword", one "spurious
trailing punctuation sequence", and a "may be used uninitialized".

gcc/ChangeLog:

	* config/gcn/gcn.c (gcn_parse_amdgpu_hsa_kernel_attribute): Add %< and
	  %> quote markers to error messages.
	(gcn_goacc_validate_dims): Likewise.
	(gcn_conditional_register_usage): Remove exclamation mark from error
	message.
	(gcn_vectorize_vec_perm_const): Ensure perm is fully initialized.
2021-03-18 17:38:51 +00:00
Jan Hubicka
ab03c0d575 Fix idiv latencies for znver3
Update costs of integer divides to match actual latencies (the scheduler model
already does the right thing).  It is essentially a no-op, since we end up
expanding idiv for all sensible constants, so this may only end up disabling
vectorization in some cases, but I did not find any such examples.  However, in
general it is better to have actual latencies than random numbers.

gcc/ChangeLog:

2021-03-18  Jan Hubicka  <hubicka@ucw.cz>

	* config/i386/x86-tune-costs.h (struct processor_costs): Fix costs of
	integer divides.
2021-03-18 17:15:34 +01:00
Sinan Lin
d9f0ade001 PR target/99314: Fix integer signedness issue for cpymem pattern expansion.
The third operand of the cpymem pattern is an unsigned HOST_WIDE_INT, but we
were interpreting it as a signed HOST_WIDE_INT.  That is not a problem in
most cases, but when the value is larger than the signed HOST_WIDE_INT
maximum, things can go wrong, since we use that value to calculate the buffer
size.
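
A sketch of why the signedness matters for a length used in size calculations. The helpers are hypothetical and use 64-bit integers as an assumed stand-in for HOST_WIDE_INT; they are not the riscv.c functions.

```cpp
#include <cstdint>
#include <limits>

// True when an unsigned length cannot be represented as a signed 64-bit
// value; reading such a length as signed would make it look negative.
constexpr bool exceeds_signed_range(std::uint64_t length) {
    return length > static_cast<std::uint64_t>(
                        std::numeric_limits<std::int64_t>::max());
}

// Number of word-sized pieces needed to cover `length` bytes, computed
// entirely in unsigned arithmetic so large lengths stay meaningful.
// Assumes word_size is non-zero.
constexpr std::uint64_t words_needed(std::uint64_t length,
                                     std::uint64_t word_size) {
    return length / word_size + (length % word_size != 0 ? 1 : 0);
}
```

A length of 2^63 bytes is valid as unsigned and yields 2^60 eight-byte words, but it lies outside the signed range.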

2021-03-05  Sinan Lin  <sinan@isrc.iscas.ac.cn>
	    Kito Cheng  <kito.cheng@sifive.com>

gcc/ChangeLog:

	* config/riscv/riscv.c (riscv_block_move_straight): Change type
	to unsigned HOST_WIDE_INT for parameter and local variable with
	HOST_WIDE_INT type.
	(riscv_adjust_block_mem): Ditto.
	(riscv_block_move_loop): Ditto.
	(riscv_expand_block_move): Ditto.
2021-03-19 00:04:32 +08:00
Jakub Jelinek
89d44a9f3b testsuite: Fix up strlenopt-80.c on powerpc [PR99636]
Similar issue as in strlenopt-73.c, various spots in this test rely
on MOVE_MAX >= 8, this time it uses a target selector to pick up a couple
of targets, and all of them but powerpc 32-bit satisfy it, but powerpc
32-bit has a MOVE_MAX of just 4.

2021-03-18  Jakub Jelinek  <jakub@redhat.com>

	PR testsuite/99636
	* gcc.dg/strlenopt-80.c: For powerpc*-*-*, only enable for lp64.
2021-03-18 16:14:47 +01:00
Jakub Jelinek
fff9faa790 testsuite: Fix up strlenopt-73.c on powerpc [PR99626]
As mentioned in the testcase as well as in the PR, this testcase relies on
MOVE_MAX being sufficiently large that the memcpy call is folded early into
load + store.  Some popular targets define MOVE_MAX to 8 or even 16 (e.g.
x86_64 or some options on s390x), but many other targets define it to just 4
(e.g. powerpc 32-bit), or even 2.

The testcase has already one test routine guarded on one particular target
with MOVE_MAX 16 (but does it incorrectly, __i386__ is only defined on
32-bit x86 and __SIZEOF_INT128__ is only defined on 64-bit targets), this
patch fixes that, and guards another test that relies on memcpy (, , 8)
being folded that way (which therefore needs MOVE_MAX >= 8) on a couple of
common targets that are known to have such MOVE_MAX.
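
The guard problem can be sketched with two preprocessor conditions. The constant names are hypothetical and the exact conditions in strlenopt-73.c may be spelled differently; the sketch relies only on the statement above that __i386__ is defined on 32-bit x86 and __SIZEOF_INT128__ on 64-bit targets.

```cpp
// Guard in the style described as incorrect: the two macros are not
// expected to be defined together, so this branch is never taken.
#if defined(__i386__) && defined(__SIZEOF_INT128__)
constexpr bool old_style_guard = true;
#else
constexpr bool old_style_guard = false;
#endif

// Guard in the style of the fix: 64-bit x86 with int128 support.
#if defined(__x86_64__) && defined(__SIZEOF_INT128__)
constexpr bool new_style_guard = true;
#else
constexpr bool new_style_guard = false;
#endif
```

With the old condition the guarded test routine is never compiled; with the new one it is compiled on x86_64 targets that provide int128.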

2021-03-18  Jakub Jelinek  <jakub@redhat.com>

	PR testsuite/99626
	* gcc.dg/strlenopt-73.c: Ifdef out test_copy_cond_unequal_length_i64
	on targets other than x86, aarch64, s390 and 64-bit powerpc.  Use
	test_copy_cond_unequal_length_i128 for __x86_64__ with int128 support
	rather than __i386__.
2021-03-18 16:11:46 +01:00
Jeff Law
d186c677e4 Update email address for primary entry
/
	* MAINTAINERS: Update primary entry.
2021-03-18 08:33:20 -06:00
Christophe Lyon
0211fbb610 testsuite: Skip c-c++-common/zero-scratch-regs-10.c on arm
As discussed in PR 97680, -fzero-call-used-regs is not supported on
arm.

Skip this test to avoid failure reports.

2021-03-18  Christophe Lyon  <christophe.lyon@linaro.org>

	gcc/testsuite/
	PR testsuite/97680
	* c-c++-common/zero-scratch-regs-10.c: Skip on arm
2021-03-18 14:26:34 +00:00
Nick Clifton
073595ef13 Fix building the V850 port using recent versions of gcc.
gcc/
	* config/v850/v850.c (construct_restore_jr): Increase static
	 buffer size.
	(construct_save_jarl): Likewise.
	* config/v850/v850.h (DWARF2_DEBUGGING_INFO): Define.
2021-03-18 12:57:25 +00:00
Iain Sandoe
0cc218d42c Objective-C++ : Fix handling of unnamed message parms [PR49070].
When we are parsing an Objective-C++ message, a colon is a valid
terminator for an assignment-expression.  That is:

[receiver meth:x:x:x:x];

Is a valid, if somewhat unreadable, construction; corresponding
to a method declaration like:

- (id) meth:(id)arg0 :(id)arg1 :(id)arg2 :(id)arg3;

Where three of the message params have no selector name.

In fact, although it might be unintentional, Objective-C/C++ can
accept message selectors with all the parms unnamed (this applies
to the clang implementation too, which is taken as the reference
for the language).

For regular C++, the pattern x:x is not valid in that position and
an error is emitted with a fixit for the expected scope token.

If we simply made that error conditional on !c_dialect_objc()
that would regress Objective-C++ diagnostics for cases outside a
message selector, so we add a state flag for this.

gcc/cp/ChangeLog:

	PR objc++/49070
	* parser.c (cp_debug_parser): Add Objective-C++ message
	state flag.
	(cp_parser_nested_name_specifier_opt): Allow colon to
	terminate an assignment-expression when parsing Objective-
	C++ messages.
	(cp_parser_objc_message_expression): Set and clear message
	parsing state on entry and exit.
	* parser.h (struct cp_parser): Add a context flag for
	Objective-C++ message state.

gcc/testsuite/ChangeLog:

	PR objc++/49070
	* obj-c++.dg/pr49070.mm: New test.
	* objc.dg/unnamed-parms.m: New test.
2021-03-18 11:47:27 +00:00
Kyrylo Tkachov
8f0c9d53ef aarch64: Improve generic SVE tuning defaults
This patch adds the recently-added tweak to split some SVE VL-based scalar
operations [1] to the generic tuning used for SVE, as enabled by adding +sve
to the -march flag, for example -march=armv8.2-a+sve.

The recommendation for best performance on a particular CPU remains unchanged:
use the -mcpu option for that CPU, where possible. -mcpu=native makes this
straightforward for native compilation.

The tweak to split out SVE VL-based scalar operations is a consistent win for
the Neoverse V1 CPU and should be neutral for the Fujitsu A64FX. A run of
SPEC2017 on A64FX with this tweak on didn't show any non-noise differences.
It is also expected to be neutral on SVE2 implementations.

Therefore, the patch enables the tweak for generic +sve tuning e.g.
-march=armv8.2-a+sve. No SVE2 CPUs are expected to benefit from it,
therefore the tweak is disabled for generic tuning when +sve2 is in
-march e.g. -march=armv8.2-a+sve2.
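
The rule in the paragraph above can be written as a small predicate. This is only a sketch of the stated policy: `arch_features` and `generic_tuning_splits_sve_vl_ops` are hypothetical names, not the logic of aarch64_override_options_internal.

```cpp
// Hypothetical summary of the -march extensions relevant to the rule.
struct arch_features {
    bool sve;   // +sve present in -march
    bool sve2;  // +sve2 present in -march
};

// Policy as described: enable the tweak for generic tuning when SVE is
// selected, but not when SVE2 is selected as well.
constexpr bool generic_tuning_splits_sve_vl_ops(arch_features f) {
    return f.sve && !f.sve2;
}
```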

The implementation of this approach requires a bit of custom logic in
aarch64_override_options_internal to handle these kinds of
architecture-dependent decisions, but we do believe the user-facing principle
here is important to implement.

In general, for the generic target we're using a decision framework that looks
like:

* If all cores that are known to benefit from an optimization
are of architecture X, and all other cores that implement X or above
are not impacted, or have a very slight impact, we will consider it for
generic tuning for architecture X.
* We will not enable that optimisation for generic tuning for architecture X+1
if no known cores of architecture X+1 or above will benefit.

This framework allows us to improve generic tuning for CPUs of generation X
while avoiding accumulating tweaks for future CPUs of generation X+1, X+2...
that do not need them, and thus avoid even the slight negative effects of
these optimisations if the user is willing to tell us the desired architecture
accurately.

X above can mean either annual architecture updates (Armv8.2-a, Armv8.3-a etc)
or optional architecture extensions (like SVE, SVE2).

[1] http://gcc.gnu.org/g:a65b9ad863c5fc0aea12db58557f4d286a1974d7

gcc/ChangeLog:

	* config/aarch64/aarch64.c (aarch64_adjust_generic_arch_tuning): Define.
	(aarch64_override_options_internal): Use it.
	(generic_tunings): Add AARCH64_EXTRA_TUNE_CSE_SVE_VL_CONSTANTS to
	tune_flags.

gcc/testsuite/ChangeLog:

	* g++.target/aarch64/sve/aarch64-sve.exp: Add -moverride=tune=none to
	sve_flags.
	* g++.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp: Likewise.
	* g++.target/aarch64/sve/acle/aarch64-sve-acle.exp: Likewise.
	* gcc.target/aarch64/sve/aarch64-sve.exp: Likewise.
	* gcc.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp: Likewise.
	* gcc.target/aarch64/sve/acle/aarch64-sve-acle.exp: Likewise.
2021-03-18 09:56:47 +00:00
Martin Liska
3bcf19215d coroutines: init struct members to NULL
gcc/cp/ChangeLog:

	PR c++/99617
	* coroutines.cc (struct var_nest_node): Init then_cl and else_cl
	to NULL.
2021-03-18 10:42:44 +01:00
Jakub Jelinek
57e274408c testsuite: Fix up pr98099.c testcase for big endian [PR98099]
The testcase fails on big-endian without int128 support, because
due to -fsso-struct=big-endian no swapping is needed for big endian.
This patch restricts the testcase to big or little endian (but not pdp)
and uses -fsso-struct=little-endian for big endian, so that it is
swapping everywhere.

2021-03-18  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/98099
	* gcc.dg/pr98099.c: Don't compile the test on pdp endian.
	For big endian use -fsso-struct=little-endian dg-options.
2021-03-18 09:53:24 +01:00
GCC Administrator
19ac7c94b2 Daily bump. 2021-03-18 00:16:24 +00:00
Marek Polacek
40465293cd c++: ICE with real-to-int conversion in template [PR97973]
In this test we are building a call in a template, but since neither
the function nor any of its arguments are dependent, we go down the
normal path in finish_call_expr.  convert_arguments sees that we're
binding a reference to int to double and therein convert_to_integer
creates a FIX_TRUNC_EXPR.  Later, we call check_function_arguments
which folds the arguments, and, in a template, fold_for_warn calls
fold_non_dependent_expr.  But tsubst_copy_and_build should not see
a FIX_TRUNC_EXPR (see the patch discussed in
<https://gcc.gnu.org/pipermail/gcc-patches/2018-March/496183.html>)
or we crash.

So let's not create a FIX_TRUNC_EXPR in a template in the first place
and instead use IMPLICIT_CONV_EXPR.
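
The situation described above has roughly the following shape. This is a guess at the pattern, not the contents of the new test; `read_through_ref` and `call_in_template` are hypothetical names.

```cpp
// Takes a reference to int; passing a double forces a real-to-int
// conversion into a temporary that the reference binds to.
constexpr int read_through_ref(const int &value) { return value; }

// Inside a template, but neither the callee nor the argument depends on
// the template parameter, so the call is processed as a normal call.
template <typename T>
constexpr int call_in_template() {
    double d = 3.9;
    return read_through_ref(d);  // double -> int conversion happens here
}
```

The conversion truncates toward zero, so `call_in_template<void>()` yields 3.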

gcc/cp/ChangeLog:

	PR c++/97973
	* call.c (conv_unsafe_in_template_p): New.
	(convert_like): Use it.

gcc/testsuite/ChangeLog:

	PR c++/97973
	* g++.dg/conversion/real-to-int1.C: New test.
2021-03-17 19:26:25 -04:00
Anthony Sharp
be246ac2d2 c++: Private parent access check for using decls [PR19377]
This bug was already mostly fixed by the patch for PR17314. This
patch continues that by ensuring that where a using decl is used,
causing an access failure to a child class because the using decl is
private, the compiler correctly points to the using decl as the
source of the problem.

gcc/cp/ChangeLog:

2021-03-10  Anthony Sharp  <anthonysharp15@gmail.com>

	* semantics.c (get_class_access_diagnostic_decl): New
	function that examines special cases when a parent
	class causes a private access failure.
	(enforce_access): Slightly modified to call function
	above.

gcc/testsuite/ChangeLog:

2021-03-10  Anthony Sharp  <anthonysharp15@gmail.com>

	* g++.dg/cpp1z/using9.C: New using decl test.

Co-authored-by: Jason Merrill <jason@redhat.com>
2021-03-17 19:11:02 -04:00
Sandra Loosemore
5074c6fa38 nios2: Fix format complaints and similar diagnostics.
The nios2 back end has not been building with newer versions of host
GCC due to several complaints about diagnostic formatting, along with
a couple other warnings.  This patch fixes the errors seen when
building with a host compiler from current mainline head.  I also made
a pass through all the error messages in this file to make them use
more consistent formatting, even where the host compiler was not
specifically complaining.

	gcc/
	* config/nios2/nios2.c (nios2_custom_check_insns): Clean up
	error message format issues.
	(nios2_option_override): Likewise.
	(nios2_expand_fpu_builtin): Likewise.
	(nios2_init_custom_builtins): Adjust to avoid bogus strncpy
	truncation warning.
	(nios2_expand_custom_builtin): More error message format fixes.
	(nios2_expand_rdwrctl_builtin): Likewise.
	(nios2_expand_rdprs_builtin): Likewise.
	(nios2_expand_eni_builtin): Likewise.
	(nios2_expand_builtin): Likewise.
	(nios2_register_custom_code): Likewise.
	(nios2_valid_target_attribute_rec): Likewise.
	(nios2_add_insn_asm): Fix uninitialized variable warning.
2021-03-17 14:41:31 -07:00
Jan Hubicka
bd364aaee3 Enable gather on zen3 hardware.
For TSVC it gets used by 5 benchmarks with the following runtime improvements:

s4114: 1.424 -> 1.209  (84.9017%)
s4115: 2.021 -> 1.065  (52.6967%)
s4116: 1.549 -> 0.854  (55.1323%)
s4117: 1.386 -> 1.193  (86.075%)
vag: 2.741 -> 1.940  (70.7771%)

There is a regression in

s4112: 1.115 -> 1.184  (106.188%)

The internal loop is:

        for (int i = 0; i < LEN_1D; i++) {
            a[i] += b[ip[i]] * s;
        }

(so a standard accumulate and add with indirect addressing)
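
For experimentation, the loop can be wrapped into a self-contained function. This is a sketch: the function name and the use of std::vector are choices made for this example, and the bounds checks are additions that the benchmark loop does not have.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scalar form of the loop: a[i] += b[ip[i]] * s.
// The load b[ip[i]] is the indirect access that a gather instruction
// can perform for several lanes at once.
inline void indirect_multiply_add(std::vector<float> &a,
                                  const std::vector<float> &b,
                                  const std::vector<int> &ip,
                                  float s) {
    assert(a.size() == ip.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        const std::size_t idx = static_cast<std::size_t>(ip[i]);
        assert(idx < b.size());  // also rejects negative indices
        a[i] += b[idx] * s;
    }
}
```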

  40a400:       c5 fe 6f 24 03          vmovdqu (%rbx,%rax,1),%ymm4
  40a405:       c5 fc 28 da             vmovaps %ymm2,%ymm3
  40a409:       48 83 c0 20             add    $0x20,%rax
  40a40d:       c4 e2 65 92 04 a5 00    vgatherdps %ymm3,0x594100(,%ymm4,4),%ymm0
  40a414:       41 59 00
  40a417:       c4 e2 75 a8 80 e0 34    vfmadd213ps 0x5b34e0(%rax),%ymm1,%ymm0
  40a41e:       5b 00
  40a420:       c5 fc 29 80 e0 34 5b    vmovaps %ymm0,0x5b34e0(%rax)
  40a427:       00
  40a428:       48 3d 00 f4 01 00       cmp    $0x1f400,%rax
  40a42e:       75 d0                   jne    40a400 <s4112+0x60>

compared to:

  40a280:       49 63 14 04             movslq (%r12,%rax,1),%rdx
  40a284:       48 83 c0 04             add    $0x4,%rax
  40a288:       c5 fa 10 04 95 00 41    vmovss 0x594100(,%rdx,4),%xmm0
  40a28f:       59 00
  40a291:       c4 e2 71 a9 80 fc 34    vfmadd213ss 0x5b34fc(%rax),%xmm1,%xmm0
  40a298:       5b 00
  40a29a:       c5 fa 11 80 fc 34 5b    vmovss %xmm0,0x5b34fc(%rax)
  40a2a1:       00
  40a2a2:       48 3d 00 f4 01 00       cmp    $0x1f400,%rax
  40a2a8:       75 d6                   jne    40a280 <s4112+0x40>

Looking at instruction latencies:

 - fmadd is 4 cycles
 - vgatherdps is 39

So vgather itself is 4.8 cycles per iteration, and the CPU is probably able to
execute the rest out of order, getting close to 4 cycles per iteration (it can do
2 loads in parallel, one store, and the rest fits easily into execution
resources). That would explain the 20% slowdown.
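
The estimate can be written out as a rough calculation. It assumes that the 39-cycle vgatherdps serves 8 elements per vector iteration and that the scalar loop is bound by the 4-cycle fmadd; both readings are inferred from the text, which rounds the first quotient to 4.8.

```latex
\frac{39\ \text{cycles}}{8\ \text{elements}} \approx 4.9\ \text{cycles per element},
\qquad
\frac{4.9\ \text{cycles}}{4\ \text{cycles}} \approx 1.2
```

A ratio of about 1.2 corresponds to the roughly 20% slowdown mentioned above.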

gimple internal loop is:
  _2 = a[i_38];
  _3 = (long unsigned int) i_38;
  _4 = _3 * 4;
  _5 = ip_18 + _4;
  _6 = *_5;
  _7 = b[_6];
  _8 = _7 * s_19;
  _9 = _2 + _8;
  a[i_38] = _9;
  i_28 = i_38 + 1;
  ivtmp_52 = ivtmp_53 - 1;
  if (ivtmp_52 != 0)
    goto <bb 8>; [98.99%]
  else
    goto <bb 4>; [1.01%]

0x25bac30 a[i_38] 1 times scalar_load costs 12 in body
0x25bac30 *_5 1 times scalar_load costs 12 in body
0x25bac30 b[_6] 1 times scalar_load costs 12 in body
0x25bac30 _7 * s_19 1 times scalar_stmt costs 12 in body
0x25bac30 _2 + _8 1 times scalar_stmt costs 12 in body
0x25bac30 _9 1 times scalar_store costs 16 in body

so 19 cycles estimate of scalar load

0x2668630 a[i_38] 1 times vector_load costs 12 in body
0x2668630 *_5 1 times unaligned_load (misalign -1) costs 12 in body
0x2668630 b[_6] 8 times scalar_load costs 96 in body
0x2668630 _7 * s_19 1 times scalar_to_vec costs 4 in prologue
0x2668630 _7 * s_19 1 times vector_stmt costs 12 in body
0x2668630 _2 + _8 1 times vector_stmt costs 12 in body
0x2668630 _9 1 times vector_store costs 16 in body

so 40 cycles per 8x vectorized body

tsvc.c:3450:27: note:  operating only on full vectors.
tsvc.c:3450:27: note:  Cost model analysis:
  Vector inside of loop cost: 160
  Vector prologue cost: 4
  Vector epilogue cost: 0
  Scalar iteration cost: 76
  Scalar outside cost: 0
  Vector outside cost: 4
  prologue iterations: 0
  epilogue iterations: 0
  Calculated minimum iters for profitability: 1

I think this generally suffers from the GIGO principle.
One problem seems to be that we do not know about fmadd yet and compute it as
two instructions (6 cycles instead of 4). A more important problem is that we do
not account for the parallelism at all.  I do not see how to disable the
vectorization here without bumping gather costs noticeably off reality, and thus
we can probably experiment with this if more similar problems are found.

Icc is also using gather in s1115 and s128.
For s1115 the vectorization does not seem to help and s128 gets slower.

Clang and aocc do not use gathers.

	* config/i386/x86-tune-costs.h (struct processor_costs): Update costs
	of gather to match reality.
	* config/i386/x86-tune.def (X86_TUNE_USE_GATHER): Enable for znver3.
2021-03-17 22:37:11 +01:00
Ian Lance Taylor
f3e9c98a9f compiler: copy receiver argument for go/defer of method call
Test case is https://golang.org/cl/302371.

Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/302270
2021-03-17 12:17:51 -07:00
Iain Sandoe
c86c5195c8 testsuite, Darwin : Fix the asan/strncpy-overflow-1 test.
1. To be more compatible with Linux, Darwin testcases that include
string.h should set _FORTIFY_SOURCE=0 since, otherwise, it will be
defaulted on and the _chk versions of the string builtins will be
used.  This testcase fails otherwise because there's no convenient
way to disable the _chk builtins.

2. The system tool that handles symbolization (atos) is not reliable
with GCC's DWARF-2 output but, fortunately, all the platform
versions that support the current sanitizers are able to handle
dwarf-3 for this testcase.

gcc/testsuite/ChangeLog:

	* c-c++-common/asan/strncpy-overflow-1.c: Add _FORTIFY_SOURCE=0 and
	-gdwarf-3 to the command line options. Adjust the expected line
	numbers for the revised options header.
2021-03-17 19:12:25 +00:00
Iain Sandoe
9c4d77fc1c testsuite, Darwin : Fix match output for asan/memcmp-1.c.
The Darwin part of libasan produces different output for memcmp
cases from other ports.  The GCC implementation does produce the
same output for this test as the clang one (modulo the two points
below).

1. To be more compatible with Linux, Darwin testcases that include
string.h should set _FORTIFY_SOURCE=0 since, otherwise, it will be
defaulted on and the _chk versions of the string builtins will be
used.

2. The system tool that handles symbolization (atos) is not reliable
with GCC's DWARF-2 output but, fortunately, all the platform
versions that support the current sanitizers are able to handle
dwarf-3 for this testcase.

gcc/testsuite/ChangeLog:

	* c-c++-common/asan/memcmp-1.c: Add _FORTIFY_SOURCE=0 and
	-gdwarf-3 to the command line options.  Provide Darwin-
	specific match lines for the expected output.
2021-03-17 19:12:03 +00:00
Kyrylo Tkachov
f7581eb38e aarch64: Fix status return logic in RNG intrinsics
There is a bug with the RNG intrinsics in their return code. The definition says:

"Stores a 64-bit random number into the object pointed to by the argument and returns zero.
If the implementation could not generate a random number within a reasonable period of time
the object pointed to by the input is set to zero and a non-zero value is returned."

This means we should be testing whether to return non-zero with:
CSET W0, EQ
rather than NE.
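
The return-code logic can be modelled in a few lines. This is a toy model, not the builtin expansion: `toy_rng_outcome` and both functions are hypothetical, and the sketch assumes (as the message implies) that the condition tested by EQ holds exactly when no random number could be produced.

```cpp
#include <cstdint>

// Toy model of what the instruction leaves behind.
struct toy_rng_outcome {
    bool eq_condition;    // models the condition tested by "CSET W0, EQ"
    std::uint64_t value;  // random number, or 0 on failure
};

// Fixed behaviour: non-zero status exactly when the EQ condition holds,
// that is, when generation failed.
constexpr int status_fixed(toy_rng_outcome o) {
    return o.eq_condition ? 1 : 0;
}

// Behaviour before the fix: testing NE inverts the status.
constexpr int status_before_fix(toy_rng_outcome o) {
    return o.eq_condition ? 0 : 1;
}
```

In this model a successful call returns 0 only with the EQ-based test, which matches the quoted definition.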

This patch fixes that.

gcc/ChangeLog:

	* config/aarch64/aarch64-builtins.c (aarch64_expand_rng_builtin): Use EQ
	to compare against CC_REG rather than NE.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/acle/rng_2.c: New test.
2021-03-17 18:21:05 +00:00
H.J. Lu
adf14bdbc1 x86: Update 'P' operand modifier for -fno-plt
Update 'P' operand modifier for -fno-plt to support inline assembly
statements.  In 64-bit, we can always load function address with
@GOTPCREL.  In 32-bit, we load function address with @GOT only for
non-PIC since PIC register may not be available at call site.

gcc/

	PR target/99504
	* config/i386/i386.c (ix86_force_load_from_GOT_p): Support
	inline assembly statements.
	(ix86_print_operand): Update 'P' handling for -fno-plt.

gcc/testsuite/

	PR target/99504
	* gcc.target/i386/pr99530-1.c: New test.
	* gcc.target/i386/pr99530-2.c: Likewise.
	* gcc.target/i386/pr99530-3.c: Likewise.
	* gcc.target/i386/pr99530-4.c: Likewise.
	* gcc.target/i386/pr99530-5.c: Likewise.
	* gcc.target/i386/pr99530-6.c: Likewise.
2021-03-17 07:06:10 -07:00
Tamar Christina
39916ceab4 AArch64: Fix -Werror issue in aarch64_simd_clone_compute_vecsize_and_simdlen
g:fcefc59befd396267b824c170b6a37acaf10874e introduced a new variable named
arg_type which shadows the function scoped one.

The function scoped one is now unused and so causes bootstrap to fail due to
-Werror.

This patch removes the unused variable.

gcc/ChangeLog:

	PR target/99542
	* config/aarch64/aarch64.c
	(aarch64_simd_clone_compute_vecsize_and_simdlen): Remove unused var.
2021-03-17 11:12:25 +00:00
GCC Administrator
bc2127767a Daily bump. 2021-03-17 00:16:25 +00:00