Commit Graph

184131 Commits

Author SHA1 Message Date
Christophe Lyon 7c1d6e8999 arm: Fix mult autovectorization pattern for iwmmxt (PR target/99786)
Similarly to other recently-added autovectorization patterns, mult has
been erroneously enabled for iwmmxt. However, V4HI and V2SI modes are
supported, so we make an exception for them.

The new testcase is derived from gcc.dg/ubsan/pr79904.c, with
additional modes added.

I kept dg-do compile because 'assemble' results in error messages from
the assembler, which are not related to this PR:

Error: selected processor does not support `tmcrr wr0,r4,r5' in ARM mode
Error: selected processor does not support `wstrd wr0,[r0]' in ARM mode
Error: selected processor does not support `wldrd wr0,[r0]' in ARM mode
Error: selected processor does not support `wldrd wr2,.L5' in ARM mode
Error: selected processor does not support `wmulul wr0,wr0,wr2' in ARM mode
Error: selected processor does not support `wstrd wr0,[r0]' in ARM mode
Error: selected processor does not support `wldrd wr0,[r0]' in ARM mode
Error: selected processor does not support `wldrd wr2,.L8' in ARM mode
Error: selected processor does not support `wmulwl wr0,wr0,wr2' in ARM mode
Error: selected processor does not support `wstrd wr0,[r0]' in ARM mode

2021-03-29  Christophe Lyon  <christophe.lyon@linaro.org>

	PR target/99786

	gcc/
	* config/arm/vec-common.md (mul<mode>3): Disable on iwMMXT, except
	for V4HI and V2SI.

	gcc/testsuite/
	* gcc.target/arm/pr99786.c: New test.
2021-03-31 13:50:22 +00:00
H.J. Lu bf24f4ec73 x86: Update memcpy/memset inline strategies for Ice Lake
Simplify memcpy and memset inline strategies to avoid branches for
-mtune=icelake:

1. With MOVE_RATIO and CLEAR_RATIO == 17, GCC will use integer/vector
   load and store for up to 16 * 16 (256) bytes when the data size is
   fixed and known.
2. Inline only if data size is known to be <= 256.
   a. Use "rep movsb/stosb" with simple code sequence if the data size
      is a constant.
   b. Use loop if data size is not a constant.
3. Use memcpy/memset library function if data size is unknown or > 256.
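
The three-way split above can be sketched as a small decision function. This is only a simplified model of the list, not the logic of decide_alg in i386-expand.c: the enum, the function name and INLINE_LIMIT are hypothetical, and the choice between by-pieces moves and "rep movsb/stosb" for constant sizes is folded into a single case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names: a simplified model of the strategy list, not
   the code in i386-expand.c.  */
enum copy_strategy
{
  COPY_INLINE_CONSTANT,  /* size is a compile-time constant <= 256 */
  COPY_INLINE_LOOP,      /* size varies but is known to be <= 256 */
  COPY_LIBCALL           /* size unknown or possibly > 256 */
};

#define INLINE_LIMIT 256  /* 16 moves of 16 bytes each, per item 1 */

enum copy_strategy
choose_copy_strategy (bool size_is_constant, bool max_size_known,
                      size_t max_size)
{
  /* A constant size is always a known size.  */
  assert (!size_is_constant || max_size_known);

  if (!max_size_known || max_size > INLINE_LIMIT)
    return COPY_LIBCALL;
  return size_is_constant ? COPY_INLINE_CONSTANT : COPY_INLINE_LOOP;
}
```

Under this model a 64-byte constant copy is expanded inline, a variable copy bounded by 256 bytes becomes a loop, and everything else goes to the library.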

On Ice Lake processor with -march=native -Ofast -flto,

1.  Performance impacts of SPEC CPU 2017 rate are:

500.perlbench_r -0.93%
502.gcc_r        0.36%
505.mcf_r        0.31%
520.omnetpp_r   -0.07%
523.xalancbmk_r -0.53%
525.x264_r      -0.09%
531.deepsjeng_r -0.19%
541.leela_r      0.16%
548.exchange2_r  0.22%
557.xz_r        -1.64%
Geomean         -0.24%

503.bwaves_r    -0.01%
507.cactuBSSN_r  0.00%
508.namd_r       0.12%
510.parest_r     0.07%
511.povray_r     0.29%
519.lbm_r        0.00%
521.wrf_r       -0.38%
526.blender_r    0.16%
527.cam4_r       0.18%
538.imagick_r    0.76%
544.nab_r       -0.84%
549.fotonik3d_r -0.07%
554.roms_r      -0.01%
Geomean          0.02%

2. Significant impacts on eembc benchmarks are:

eembc/nnet_test      9.90%
eembc/mp2decoddata2  16.42%
eembc/textv2data3   -4.86%
eembc/qos            12.90%

gcc/

	* config/i386/i386-expand.c (expand_set_or_cpymem_via_rep):
	For TARGET_PREFER_KNOWN_REP_MOVSB_STOSB, don't convert QImode
	to SImode.
	(decide_alg): For TARGET_PREFER_KNOWN_REP_MOVSB_STOSB, use
	"rep movsb/stosb" only for known sizes.
	* config/i386/i386-options.c (processor_cost_table): Use Ice
	Lake cost for Cannon Lake, Ice Lake, Tiger Lake, Sapphire
	Rapids and Alder Lake.
	* config/i386/i386.h (TARGET_PREFER_KNOWN_REP_MOVSB_STOSB): New.
	* config/i386/x86-tune-costs.h (icelake_memcpy): New.
	(icelake_memset): Likewise.
	(icelake_cost): Likewise.
	* config/i386/x86-tune.def (X86_TUNE_PREFER_KNOWN_REP_MOVSB_STOSB):
	New.

gcc/testsuite/

	* gcc.target/i386/memcpy-strategy-5.c: New test.
	* gcc.target/i386/memcpy-strategy-6.c: Likewise.
	* gcc.target/i386/memcpy-strategy-7.c: Likewise.
	* gcc.target/i386/memcpy-strategy-8.c: Likewise.
	* gcc.target/i386/memset-strategy-3.c: Likewise.
	* gcc.target/i386/memset-strategy-4.c: Likewise.
	* gcc.target/i386/memset-strategy-5.c: Likewise.
	* gcc.target/i386/memset-strategy-6.c: Likewise.
2021-03-31 05:28:32 -07:00
Richard Sandiford 1393938e4c aarch64: Fix target alignment for SVE [PR98119]
The vectoriser supports peeling for alignment using predication:
we move back to the previous aligned boundary and make the skipped
elements inactive in the first loop iteration.  As it happens,
the costs for existing CPUs give an equal cost to aligned and
unaligned accesses, so this feature is rarely used.

However, the PR shows that when the feature was forced on, we were
still trying to align to a full-vector boundary even when using
partial vectors.
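
As a rough scalar illustration of peeling for alignment via predication (not the vectoriser's code): the loop starts at the previous multiple of the vector length and treats the lanes before the first real element as inactive. VL and the function name are hypothetical, and alignment is modelled on the element index rather than on an address.

```c
#include <assert.h>
#include <stddef.h>

#define VL 4  /* hypothetical lanes per vector; real SVE lengths vary */

/* Sum a[first .. first + n - 1] in VL-wide chunks.  Instead of a scalar
   prologue, start at the previous multiple of VL and mask off the lanes
   that lie before FIRST (or at or after the end).  */
long
sum_with_predicated_peel (const int *a, size_t first, size_t n)
{
  assert (a != NULL);

  size_t end = first + n;
  size_t start = first - first % VL;  /* previous aligned boundary */
  long total = 0;

  for (size_t base = start; base < end; base += VL)
    for (size_t lane = 0; lane < VL; lane++)
      {
        size_t i = base + lane;
        int active = i >= first && i < end;  /* the predicate bit */
        if (active)
          total += a[i];
      }
  return total;
}
```

With partial vectors the same idea applies, but the boundary to step back to is that of the smaller vector actually in use, which is what the fix below arranges.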

gcc/
	PR target/98119
	* config/aarch64/aarch64.c
	(aarch64_vectorize_preferred_vector_alignment): Query the size
	of the provided SVE vector; do not assume that all SVE vectors
	have the same size.

gcc/testsuite/
	PR target/98119
	* gcc.target/aarch64/sve/pr98119.c: New test.
2021-03-31 11:26:06 +01:00
Jan Hubicka d7145b4bb6 Small refactoring of cgraph_node::release_body
PR lto/99447
	* cgraph.c (cgraph_node::release_body): Remove all callers and
	references.
	* cgraphclones.c (cgraph_node::materialize_clone): Do not do it here.
	* cgraphunit.c (cgraph_node::expand): And here.
2021-03-31 11:35:29 +02:00
Martin Liska c3c616747a Fix coding style in IPA modref.
gcc/ChangeLog:

	* ipa-modref.c (analyze_ssa_name_flags): Fix coding style
	and one negated condition.
2021-03-31 10:52:22 +02:00
Jakub Jelinek c001c194a2 aarch64: Fix up *add<mode>3_poly_1 [PR99813]
As mentioned in the PR, Uai constraint stands for
aarch64_sve_scalar_inc_dec_immediate
while Uav for
aarch64_sve_addvl_addpl_immediate.
Both *add<mode>3_aarch64 and *add<mode>3_poly_1 patterns use
  * return aarch64_output_sve_scalar_inc_dec (operands[2]);
  * return aarch64_output_sve_addvl_addpl (operands[2]);
in that order, but the former with Uai,Uav order, while the
latter with Uav,Uai instead.  This patch swaps the constraints
so that they match the output.

Co-authored-by: Richard Sandiford <richard.sandiford@arm.com>

2021-03-31  Jakub Jelinek  <jakub@redhat.com>
	    Richard Sandiford  <richard.sandiford@arm.com>

	PR target/99813
	* config/aarch64/aarch64.md (*add<mode>3_poly_1): Swap Uai and Uav
	constraints on operands[2] and similarly 0 and rk constraints
	on operands[1] corresponding to that.

	* g++.target/aarch64/sve/pr99813.C: New test.
2021-03-31 10:46:01 +02:00
Jakub Jelinek a49a96f681 i386, debug: Default to -gdwarf-4 on Windows targets with broken ld.bfd [PR98860]
As mentioned in the PR, before the
https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=ba6eb62ff0ea9843a018cfd7cd06777bd66ae0a0
fix from March 1st, PECOFF ld.bfd didn't know about .debug_loclists,
.debug_rnglists and other debug sections new in DWARF 5.  Unfortunately,
unlike for ELF linkers, that means the sections were placed in wrong
ordering with wrong VMA/LMA, so the resulting executables are apparently
unusable.

As that is a pretty new change, newer than 2.35.2 or 2.36 binutils releases,
the following patch adds a workaround that turns -gdwarf-4 by default
instead of -gdwarf-5 if a broken linker is found at configure time.
Users can still explicitly play with -gdwarf-5 and either use a non-broken
linker or use custom linker scripts for the broken one, but at least
by default it should work.
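
A minimal sketch of the resulting default selection, assuming a hypothetical helper in which 0 means "no explicit -gdwarf-N was given"; the real change lives in ix86_option_override_internal and is keyed on HAVE_LD_BROKEN_PE_DWARF5:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not GCC code.  REQUESTED_VERSION is 0 when the
   user gave no explicit -gdwarf-N option.  */
int
effective_dwarf_version (int requested_version, bool target_pecoff,
                         bool ld_broken_pe_dwarf5)
{
  assert (requested_version >= 0);

  if (requested_version != 0)
    return requested_version;  /* an explicit choice always wins */
  if (target_pecoff && ld_broken_pe_dwarf5)
    return 4;                  /* workaround default */
  return 5;                    /* usual default */
}
```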

2021-03-31  Jakub Jelinek  <jakub@redhat.com>

	PR bootstrap/98860
	* configure.ac (HAVE_LD_BROKEN_PE_DWARF5): New AC_DEFINE if PECOFF
	linker doesn't support DWARF sections new in DWARF5.
	* config/i386/i386-options.c (ix86_option_override_internal): Default
	to dwarf_version 4 if HAVE_LD_BROKEN_PE_DWARF5 for TARGET_PECOFF
	targets.
	* config.in: Regenerated.
	* configure: Regenerated.
2021-03-31 09:11:29 +02:00
Jakub Jelinek 0989e99470 testsuite: Disable zero-scratch-regs-{8, 9, 10, 11}.c on all but ... [PR97680]
Seems the target hook is only defined on
config/i386/i386.c:#undef TARGET_ZERO_CALL_USED_REGS
config/i386/i386.c:#define TARGET_ZERO_CALL_USED_REGS ix86_zero_call_used_regs
config/sparc/sparc.c:#undef TARGET_ZERO_CALL_USED_REGS
config/sparc/sparc.c:#define TARGET_ZERO_CALL_USED_REGS sparc_zero_call_used_regs
but apparently many of the tests actually succeed on various targets that
don't define those hooks.  E.g. I haven't seen them fail on aarch64,
on arm only the -10.c fails, on powerpc*/s390* all {8,9,10,11} fail (plus
5 is skipped on power*-aix*).
On ia64 according to testresults {6,7,8,9,10,11} fail, some with ICEs.
On mipsel according to testresults {9,10,11} fail, some with ICEs.
On nvptx at least 1-9 succeed, 10-11 don't know, don't have assert.h around.

I've kept {5,6,7} with aix,ia64,ia64 skipped because those seem like
outliers, it works pretty much everywhere but on those.
The rest have known good targets.

2021-03-31  Jakub Jelinek  <jakub@redhat.com>

	PR testsuite/97680
	* c-c++-common/zero-scratch-regs-6.c: Skip on ia64.
	* c-c++-common/zero-scratch-regs-7.c: Likewise.
	* c-c++-common/zero-scratch-regs-8.c: Change from dg-skip-if of
	selected unsupported triplets to all targets but selected triplets
	of supported targets.
	* c-c++-common/zero-scratch-regs-9.c: Likewise.
	* c-c++-common/zero-scratch-regs-10.c: Likewise.
	* c-c++-common/zero-scratch-regs-11.c: Likewise.
2021-03-31 08:55:38 +02:00
Patrick Palka a3bf6ce7f2 c++: Adjust mangling of __alignof__ [PR88115]
r11-4926 made __alignof__ get mangled differently from alignof,
encoding __alignof__ as a vendor extended operator.  But this
mangling is problematic for the reasons mentioned in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88115#c6.

This patch changes our mangling of __alignof__ to instead use the
new "vendor extended expression" syntax that's proposed in
https://github.com/itanium-cxx-abi/cxx-abi/issues/112.  Clang does
the same thing already, so after this patch Clang and GCC agree
about the mangling of __alignof__(type) and __alignof__(expr).

gcc/cp/ChangeLog:

	PR c++/88115
	* mangle.c (write_expression): Adjust the mangling of
	__alignof__.

include/ChangeLog:

	PR c++/88115
	* demangle.h (enum demangle_component_type): Add
	DEMANGLE_COMPONENT_VENDOR_EXPR.

libiberty/ChangeLog:

	PR c++/88115
	* cp-demangle.c (d_dump, d_make_comp, d_expression_1)
	(d_count_templates_scopes): Handle DEMANGLE_COMPONENT_VENDOR_EXPR.
	(d_print_comp_inner): Likewise.
	<case DEMANGLE_COMPONENT_EXTENDED_OPERATOR>: Revert r11-4926
	change.
	<case DEMANGLE_COMPONENT_UNARY>: Likewise.
	* testsuite/demangle-expected: Adjust __alignof__ tests.

gcc/testsuite/ChangeLog:

	PR c++/88115
	* g++.dg/cpp0x/alignof7.C: Adjust expected mangling.
2021-03-30 22:57:11 -04:00
Patrick Palka 0bbf0edbfc c++: placeholder type constraint and argument pack [PR99815]
When checking dependence of a placeholder type constraint, if the first
template argument of the constraint is an argument pack, we need to
expand it in order to properly separate the implicit 'auto' argument
from the rest.

gcc/cp/ChangeLog:

	PR c++/99815
	* pt.c (placeholder_type_constraint_dependent_p): Expand
	argument packs to separate the first non-pack argument
	from the rest.

gcc/testsuite/ChangeLog:

	PR c++/99815
	* g++.dg/cpp2a/concepts-placeholder5.C: New test.
2021-03-30 22:54:37 -04:00
GCC Administrator 08d2edae5d Daily bump. 2021-03-31 00:16:31 +00:00
David Malcolm d0b7c82175 analyzer: remove old decl of region::dump_to_pp
This was made redundant in the GCC 11 rewrite of state
(808f4dfeb3).

gcc/analyzer/ChangeLog:
	* region.h (region::dump_to_pp): Remove old decl.
2021-03-30 17:54:36 -04:00
David Malcolm 0f9aa35c79 analyzer: only call get_diagnostic_tree when it's needed
impl_sm_context::get_diagnostic_tree could be expensive, and
I find myself needing to put a breakpoint on it to debug
PR analyzer/99771, so only call it if we're about to use
the result.
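
The pattern is simply to defer the costly query until a diagnostic is certain. A sketch with hypothetical stand-ins (expensive_lookup plays the role of get_diagnostic_tree; none of these names exist in the analyzer):

```c
#include <assert.h>
#include <stddef.h>

int expensive_calls;  /* counts how often the costly lookup runs */

/* Stand-in for an expensive query such as get_diagnostic_tree.  */
const char *
expensive_lookup (int id)
{
  assert (id >= 0);
  expensive_calls++;
  return id % 2 ? "odd" : "even";
}

/* Return a description only when a diagnostic will be emitted, so the
   expensive lookup is skipped on the common no-warning path.  */
const char *
maybe_describe (int id, int will_warn)
{
  if (!will_warn)
    return NULL;
  return expensive_lookup (id);
}
```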

gcc/analyzer/ChangeLog:
	* sm-file.cc (fileptr_state_machine::on_stmt): Only call
	get_diagnostic_tree if the result will be used.
	* sm-malloc.cc (malloc_state_machine::on_stmt): Likewise.
	(malloc_state_machine::on_deallocator_call): Likewise.
	(malloc_state_machine::on_realloc_call): Likewise.
	(malloc_state_machine::on_realloc_call): Likewise.
	* sm-sensitive.cc
	(sensitive_state_machine::warn_for_any_exposure): Likewise.
	* sm-taint.cc (taint_state_machine::on_stmt): Likewise.
2021-03-30 17:51:21 -04:00
David Malcolm a01f5fd710 analyzer testsuite: fix typo
gcc/testsuite/ChangeLog:
	* gcc.dg/analyzer/symbolic-1.c: Fix typo.
2021-03-30 17:50:38 -04:00
Nathan Sidwell 5f3c602725 c++: duplicate const static members [PR 99283]
This is the bug that keeps on giving.  Reducing it has been successful
at hitting other defects. In this case, some more specialization hash
table fun, plus an issue with reading in a definition of a duplicated
declaration.  At least I discovered a null context check is no longer
needed.

	PR c++/99283
	gcc/cp/
	* module.cc (dumper::operator): Make less brittle.
	(trees_out::core_bools): VAR_DECLs always have a context.
	(trees_out::key_mergeable): Use same_type_p for asserting.
	(trees_in::read_var_def): Propagate
	DECL_INITIALIZED_BY_CONSTANT_EXPRESSION_P.
	gcc/testsuite/
	* g++.dg/modules/pr99283-5.h: New.
	* g++.dg/modules/pr99283-5_a.H: New.
	* g++.dg/modules/pr99283-5_b.H: New.
	* g++.dg/modules/pr99283-5_c.C: New.
2021-03-30 09:52:21 -07:00
Jakub Jelinek 953624089b c++: Fix ICE on PTRMEM_CST in lambda in inline var initializer [PR99790]
The following testcase ICEs (since the addition of inline var support),
because the lambda contains PTRMEM_CST but finish_function is called for the
lambda quite early during parsing it (from finish_lambda_function) when
the containing class is still incomplete.  That means that during
genericization cplus_expand_constant keeps the PTRMEM_CST unmodified, but
later nothing lowers it when the class is finalized.
Using sizeof etc. on the class in such contexts is rejected by both g++ and
clang++, and when the PTRMEM_CST appears e.g. in static var initializers
rather than in functions, we handle it correctly because c_parse_final_cleanups
-> lower_var_init will handle those cplus_expand_constant when all classes
are already finalized.

The following patch fixes it by calling cplus_expand_constant again during
gimplification.  As we are now unconditionally unit-at-a-time, I'd think
everything that could be completed will be before we start gimplification.

2021-03-30  Jakub Jelinek  <jakub@redhat.com>

	PR c++/99790
	* cp-gimplify.c (cp_gimplify_expr): Handle PTRMEM_CST.

	* g++.dg/cpp1z/pr99790.C: New test.
2021-03-30 18:15:32 +02:00
Kyrylo Tkachov c277abd9cd aarch64: PR target/99820: Guard on available SVE issue info before using
This fixes a simple segfault ICE when using the use_new_vector_costs tunable with a CPU tuning that it wasn't intended for.
I'm not adding a testcase here as we intend to remove the tunable for GCC 12 anyway (the new costing logic will remain and will benefit
from this extra check, but the -moverride option will no longer exist).

gcc/ChangeLog:

	PR target/99820
	* config/aarch64/aarch64.c (aarch64_analyze_loop_vinfo): Check for
	available issue_info before using it.
2021-03-30 16:42:17 +01:00
Kyrylo Tkachov 19199a6f2b aarch64: PR target/99822 Don't allow zero register in first operand of SUBS/ADDS-immediate
In this PR we end up generating an invalid instruction:
adds x1,xzr,#2

because the pattern accepts zero as an operand in the comparison, but the instruction doesn't.
Fix it by adjusting the predicate and constraints.

gcc/ChangeLog:

	PR target/99822
	* config/aarch64/aarch64.md (sub<mode>3_compare1_imm): Do not allow zero
	in operand 1.

gcc/testsuite/ChangeLog:

	PR target/99822
	* gcc.c-torture/compile/pr99822.c: New test.
2021-03-30 15:43:36 +01:00
luoxhu@cn.ibm.com f64b91568f rs6000: Enable 32bit variable vec_insert [PR99718]
32bit and P7 VSX could also benefit a lot from the variable vec_insert
implementation with shift/insert/shift back method.
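
The shift/insert/shift back idea can be modelled on a plain array. This is a sketch only: LANES and both function names are hypothetical, and the real expanders use permute control vectors from lvsl/lvsr rather than a copy loop.

```c
#include <assert.h>
#include <string.h>

#define LANES 4  /* hypothetical: four 32-bit lanes in a 128-bit vector */

/* Rotate V left by K lanes: afterwards v[i] holds the old v[(i + k) % LANES].  */
void
rotate_lanes_left (int v[LANES], unsigned k)
{
  int tmp[LANES];
  for (unsigned i = 0; i < LANES; i++)
    tmp[i] = v[(i + k) % LANES];
  memcpy (v, tmp, sizeof tmp);
}

/* Insert VALUE at the run-time index IDX using only a fixed-position
   insert plus two rotations.  */
void
insert_variable (int v[LANES], int value, unsigned idx)
{
  assert (idx < LANES);
  rotate_lanes_left (v, idx);          /* element IDX is now lane 0 */
  v[0] = value;                        /* insert at a fixed position */
  rotate_lanes_left (v, LANES - idx);  /* rotate back */
}
```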

2021-03-29  Xionghu Luo  <luoxhu@linux.ibm.com>

	PR target/99718
	* config/rs6000/altivec.md (altivec_lvsl_reg): Change to ...
	(altivec_lvsl_reg_<mode>): ... this.
	(altivec_lvsr_reg): Change to ...
	(altivec_lvsr_reg_<mode>): ... this.
	* config/rs6000/predicates.md (vec_set_index_operand): New.
	* config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin):
	Enable 32bit variable vec_insert for all TARGET_VSX.
	* config/rs6000/rs6000.c (rs6000_expand_vector_set_var_p9):
	Enable 32bit variable vec_insert for p9 and above.
	(rs6000_expand_vector_set_var_p8): Rename to ...
	(rs6000_expand_vector_set_var_p7): ... this.
	(rs6000_expand_vector_set): Use TARGET_VSX and adjust assert
	position.
	* config/rs6000/vector.md (vec_set<mode>): Use vec_set_index_operand.
	* config/rs6000/vsx.md (xl_len_r): Use gen_altivec_lvsl_reg_di and
	gen_altivec_lvsr_reg_di.

gcc/testsuite/
	PR target/99718
	* gcc.target/powerpc/fold-vec-insert-char-p8.c: Update
	instruction counts.
	* gcc.target/powerpc/fold-vec-insert-char-p9.c: Likewise.
	* gcc.target/powerpc/fold-vec-insert-double.c: Likewise.
	* gcc.target/powerpc/fold-vec-insert-float-p8.c: Likewise.
	* gcc.target/powerpc/fold-vec-insert-float-p9.c: Likewise.
	* gcc.target/powerpc/fold-vec-insert-int-p8.c: Likewise.
	* gcc.target/powerpc/fold-vec-insert-int-p9.c: Likewise.
	* gcc.target/powerpc/fold-vec-insert-longlong.c: Likewise.
	* gcc.target/powerpc/fold-vec-insert-short-p8.c: Likewise.
	* gcc.target/powerpc/fold-vec-insert-short-p9.c: Likewise.
	* gcc.target/powerpc/pr79251.p8.c: Likewise.
	* gcc.target/powerpc/pr79251.p9.c: Likewise.
	* gcc.target/powerpc/vsx-builtin-7.c: Likewise.
	* gcc.target/powerpc/pr79251-run.p7.c: New test.
	* gcc.target/powerpc/pr79251.p7.c: New test.
2021-03-30 13:43:21 +00:00
H.J. Lu 5463cee277 x86: Define __rdtsc and __rdtscp as macros
Define __rdtsc and __rdtscp as macros for callers with general-regs-only
target attribute to avoid inline failure with always_inline attribute.

gcc/

	PR target/99744
	* config/i386/ia32intrin.h (__rdtsc): Defined as macro.
	(__rdtscp): Likewise.

gcc/testsuite/

	PR target/99744
	* gcc.target/i386/pr99744-1.c: New test.
2021-03-30 06:29:18 -07:00
Tamar Christina 9c68e2abe2 slp: reject non-multiple of 2 laned SLP trees (PR99825)
TWO_OPERANDS allows any order or number of combinations of + and - operations
but the pattern matcher only supports pairs of operations.

This patch has the pattern matcher for complex numbers reject SLP trees where
the lanes are not a multiple of 2.

gcc/ChangeLog:

	PR tree-optimization/99825
	* tree-vect-slp-patterns.c (vect_check_evenodd_blend):
	Reject non-mult 2 lanes.

gcc/testsuite/ChangeLog:

	PR tree-optimization/99825
	* gfortran.dg/vect/pr99825.f90: New test.
2021-03-30 14:16:03 +01:00
Christophe Lyon 6f93a7c7fc arm: Fix emission of Tag_ABI_VFP_args with MVE and -mfloat-abi=hard (PR target/99773)
When compiling with -mfloat-abi=hard -march=armv8.1-m.main+mve, we
want to emit Tag_ABI_VFP_args even though we are not emitting
floating-point instructions (we need "+mve.fp" for that), because we
use MVE registers to pass FP arguments.

This patch removes the condition on (! TARGET_SOFT_FLOAT) because this
is a case where TARGET_SOFT_FLOAT is true, and TARGET_HARD_FLOAT_ABI
is true too.

2021-03-30  Richard Earnshaw  <rearnsha@arm.com>

	gcc/
	PR target/99773
	* config/arm/arm.c (arm_file_start): Fix emission of
	Tag_ABI_VFP_args attribute.
2021-03-30 13:11:10 +00:00
Kyrylo Tkachov 41d57b2a97 aarch64: Fix gcc.target/aarch64/pr99808.c for ILP32
Fix test for -mabi=ilp32

gcc/testsuite/ChangeLog:

	PR target/99808
	* gcc.target/aarch64/pr99808.c: Use ULL constant suffix.
2021-03-30 14:08:38 +01:00
Richard Biener bd3d919b58 tree-optimization/99824 - avoid excessive integer type precision in VN
VN sometimes builds new integer types to handle accesses where precision
of the access type does not match the access size.  The way
ao_ref_init_from_vn_reference is computing the access size ignores
the access type in case the ref operands have an outermost
COMPONENT_REF which, in case it is an array for example, can be
way larger than the access size.  This can cause us to try
building an integer type with precision larger than WIDE_INT_MAX_PRECISION
eventually leading to memory corruption.

The following adjusts ao_ref_init_from_vn_reference to only lower
access sizes via the outermost COMPONENT_REF but otherwise honor
the access size as specified by the access type.

It also places an assert in integer type building that we remain
in the limits of WIDE_INT_MAX_PRECISION.  I chose the shared code
where we set TYPE_MIN/MAX_VALUE because that will immediately
cross the wide_ints capacity otherwise.
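
In outline, the two changes amount to a "lower only" rule plus a bounds assert. A sketch under stated assumptions: MAX_PRECISION_BITS is a hypothetical stand-in for WIDE_INT_MAX_PRECISION (whose real value depends on the configuration), and both function names are invented for illustration.

```c
#include <assert.h>

/* Hypothetical limit standing in for WIDE_INT_MAX_PRECISION.  */
#define MAX_PRECISION_BITS 512

/* The access type fixes the size; the outermost component may only
   lower it, never raise it.  */
unsigned
access_size_bits (unsigned type_bits, unsigned outer_component_bits)
{
  return outer_component_bits < type_bits ? outer_component_bits : type_bits;
}

/* Integer type building refuses precisions that wide ints cannot hold.  */
unsigned
checked_int_precision (unsigned precision_bits)
{
  assert (precision_bits <= MAX_PRECISION_BITS);
  return precision_bits;
}
```

So a 32-bit access through a large array component stays at 32 bits instead of taking the size of the whole array.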

2021-03-30  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/99824
	* stor-layout.c (set_min_and_max_values_for_integral_type):
	Assert the precision is within the bounds of
	WIDE_INT_MAX_PRECISION.
	* tree-ssa-sccvn.c (ao_ref_init_from_vn_reference): Use
	the outermost component ref only to lower the access size
	and initialize that from the access type.

	* gcc.dg/torture/pr99824.c: New testcase.
2021-03-30 14:00:58 +02:00
Richard Sandiford 48c79f054b aarch64: Tweak post-RA handling of CONST_INT moves [PR98136]
This PR is a regression caused by r8-5967, where we replaced
a call to aarch64_internal_mov_immediate in aarch64_add_offset
with a call to aarch64_force_temporary, which in turn uses the
normal emit_move_insn{,_1} routines.

The problem is that aarch64_add_offset can be called while
outputting a thunk, where we require all instructions to be
valid without splitting.  However, the move expanders were
not splitting CONST_INT moves themselves.

I think the right fix is to make the move expanders work
even in this scenario, rather than require callers to handle
it as a special case.

gcc/
	PR target/98136
	* config/aarch64/aarch64.md (mov<mode>): Pass multi-instruction
	CONST_INTs to aarch64_expand_mov_immediate when called after RA.

gcc/testsuite/
	PR target/98136
	* g++.dg/pr98136.C: New test.
2021-03-30 11:42:50 +01:00
Mihailo Stojanovic cc2fda1328 aarch64: Prevent use of SIMD fcvtz[su] instruction variant with "nosimd"
Currently, SF->SI and DF->DI conversions on AArch64 with the "nosimd"
flag provided sometimes cause the emitting of a vector variant of the
fcvtz[su] instruction (e.g. fcvtzu s0, s0).

This modifies the corresponding pattern to only select the vector
variant of the instruction when generating code with SIMD enabled.

gcc/ChangeLog:

	* config/aarch64/aarch64.md
	(<optab>_trunc<fcvt_target><GPI:mode>2): Set the "arch"
	attribute to disambiguate between SIMD and FP variants of the
	instruction.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/fcvt_nosimd.c: New test.
2021-03-30 11:42:49 +01:00
GCC Administrator 65374af219 Daily bump. 2021-03-30 00:16:29 +00:00
Joseph Myers 8aac913adf Update cpplib sr.po.
* sr.po: Update.
2021-03-29 22:53:22 +00:00
Joseph Myers 318074f335 Update gcc sv.po.
* sv.po: Update.
2021-03-29 22:51:16 +00:00
Eric Botcazou 471babd886 Fix wrong assignment of aggregate to full-access component
This is a regression present on the mainline: the compiler (front-end) fails
to assign an aggregate to a full-access component (i.e. Atomic or VFA) as a
whole if the type of the component is not full access itself.

gcc/ada/
	PR ada/99802
	* freeze.adb (Is_Full_Access_Aggregate): Call Is_Full_Access_Object
	on the name of an N_Assignment_Statement to spot full access.
2021-03-30 00:45:38 +02:00
Martin Sebor af739c8797 PR tree-optimization/61869 - Spurious uninitialized warning
gcc/testsuite/ChangeLog:
	PR tree-optimization/61869
	* gcc.dg/uninit-pr61869.c: New test.
2021-03-29 15:58:01 -06:00
Martin Sebor fecc835e21 PR tree-optimization/61677 - False positive with -Wmaybe-uninitialized
gcc/testsuite/ChangeLog:
	PR tree-optimization/61677
	* gcc.dg/uninit-pr61677.c: New test.
2021-03-29 15:23:03 -06:00
Michael Meissner 645bfc1619 Require GLIBC 2.32 for Decimal/_Float128 conversions.
In the patch that I applied on March 2nd, I had code to provide support for
Decimal/_Float128 conversions if the user did not use at least GLIBC 2.32.  It
did this by using __ibm128 as an intermediate type.  The trouble is __ibm128
cannot represent all of the numbers that _Float128 can, and you lose if you do
this conversion.

This patch removes this support.  The dfp-bit.c functions now call the
__sprintfieee128 and __strtoieee128 functions to do the conversion.  If the
user does not have GLIBC, they will get a linker error that these functions do
not exist.

The float128 support functions are only built into the static libgcc, so there
isn't an issue with having references to __strtoieee128 and __sprintfieee128
with older GLIBC libraries.

As an added bonus, this patch eliminates the __sprintfkf function which
included stdio.h to get a definition for the sprintf library function.  This
allows for building cross compilers without having to have a target stdio.h
available.

libgcc/
2021-03-29  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/t-float128 (fp128_decstr_funcs): Delete.
	(fp128_ppc_funcs): Do not add $(fp128_decstr_funcs).
	(fp128_decstr_objs): Delete.
	* dfp-bit.h: Call __sprintfieee128 to do conversions from
	_Float128 to a Decimal type.  Call __strtoieee128 to do
	conversions from a Decimal type to _Float128.
	* config/rs6000/_sprintfkf.c: Delete file.
	* config/rs6000/_sprintfkf.h: Delete file.
	* config/rs6000/_strtokf.c: Delete file.
	* config/rs6000/_strtokf.h: Delete file.
2021-03-29 16:43:14 -04:00
Martin Sebor 77093a75ca PR tree-optimization/61112 - repeated conditional triggers false positive -Wmaybe-uninitialized
gcc/testsuite/ChangeLog:
	PR tree-optimization/61112
	* gcc.dg/uninit-pr61112.c: New test.
2021-03-29 13:52:53 -06:00
Jan Hubicka 7b6ca93b2d Fix pr99751.c testcase
PR ipa/99751
	* gcc.c-torture/compile/pr99751.c: Rename from ...
	* gcc.c-torture/execute/pr99751.c: ... to this.
2021-03-29 20:59:42 +02:00
Jan Hubicka dd64aaafe6 Fix typo in merge_call_lhs_flags
gcc/ChangeLog:

2021-03-29  Jan Hubicka  <hubicka@ucw.cz>

	* ipa-modref.c (merge_call_lhs_flags): Correct handling of deref.
	(analyze_ssa_name_flags): Fix typo in comment.

gcc/testsuite/ChangeLog:

2021-03-29  Jan Hubicka  <hubicka@ucw.cz>

	* gcc.c-torture/compile/pr99751.c: New test.
2021-03-29 20:09:35 +02:00
Jonathan Wakely 864caa158f Fix PR number in ChangeLog 2021-03-29 17:08:38 +01:00
Jakub Jelinek afa8c67eb9 testsuite: Expect a warning on aarch64 for declare-simd-coarray-lib.f90 [PR93660]
aarch64 currently doesn't support declare simd where the return value and arguments
have different sizes and warns about that case.  This change adds a dg-warning
for that case like various other tests have already.

2021-03-29  Jakub Jelinek  <jakub@redhat.com>

	PR fortran/93660
	* gfortran.dg/gomp/declare-simd-coarray-lib.f90: Expect a mixed size
	declare simd warning on aarch64.
2021-03-29 17:05:47 +02:00
Jonathan Wakely e19afa0645 libstdc++: Adjust link to PSTL upstream (again)
The LLVM project renamed their default branch to 'main'.

libstdc++-v3/ChangeLog:

	* doc/xml/manual/status_cxx2017.xml: Adjust link for PSTL.
	* doc/html/manual/status.html: Regenerate.
2021-03-29 14:14:00 +01:00
Alex Coplan e4005cf871 aarch64: Fix SVE ACLE builtins with LTO [PR99216]
As discussed in the PR, we currently have two different numbering
schemes for SVE builtins: one for C, and one for C++. This is
problematic for LTO, where we end up getting confused about which
intrinsic we're talking about. This patch inserts placeholders into the
registered_functions vector to ensure that there is a consistent
numbering scheme for both C and C++.

We use integer_zero_node as a placeholder node instead of building a
function decl. This is safe because the node is only returned by the
TARGET_BUILTIN_DECL hook, which (on AArch64) is only used for validation
when builtin decls are streamed into lto1.
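
The numbering argument can be shown with a toy registry. Everything here is hypothetical (the struct, MAX_FUNCS, and a NULL name standing in for integer_zero_node); the point is only that a placeholder still consumes a slot.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FUNCS 8  /* hypothetical capacity for this sketch */

struct registry
{
  const char *names[MAX_FUNCS];  /* NULL marks a placeholder slot */
  size_t count;
};

/* Register NAME, or a placeholder when the current language does not
   expose it.  Either way one slot is consumed, so every later function
   gets the same index in both languages.  */
size_t
add_function (struct registry *r, const char *name, int placeholder_p)
{
  assert (r->count < MAX_FUNCS);
  r->names[r->count] = placeholder_p ? NULL : name;
  return r->count++;
}
```

If the C registry adds a direct overload and the C++ registry adds a placeholder in its place, the next function registered lands on the same index in both.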

gcc/ChangeLog:

	PR target/99216
	* config/aarch64/aarch64-sve-builtins.cc
	(function_builder::add_function): Add placeholder_p argument, use
	placeholder decls if this is set.
	(function_builder::add_unique_function): Instead of conditionally adding
	direct overloads, unconditionally add either a direct overload or a
	placeholder.
	(function_builder::add_overloaded_function): Set placeholder_p if we're
	using C++ overloads. Use the obstack for string storage instead
	of relying on the tree nodes.
	(function_builder::add_overloaded_functions): Don't return early for
	m_direct_overloads: we need to add placeholders.
	* config/aarch64/aarch64-sve-builtins.h
	(function_builder::add_function): Add placeholder_p argument.

gcc/testsuite/ChangeLog:

	PR target/99216
	* g++.target/aarch64/sve/pr99216.C: New test.
2021-03-29 12:18:19 +01:00
Richard Biener 8cf2812cfc tree-optimization/99807 - avoid bogus assert with permute SLP node
This avoids asserting anything on the SLP_TREE_REPRESENTATIVE of
an SLP permute node (which shouldn't be there).

2021-03-29  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/99807
	* tree-vect-slp.c (vect_slp_analyze_node_operations_1): Move
	assert below VEC_PERM handling.

	* gfortran.dg/vect/pr99807.f90: New testcase.
2021-03-29 13:13:10 +02:00
Kyrylo Tkachov 37d9074e12 aarch64: PR target/99037 Fix RTL representation in move_lo_quad patterns
This patch fixes the RTL representation of the move_lo_quad patterns to use aarch64_simd_or_scalar_imm_zero
for the zero part rather than a vec_duplicate of zero or a const_int 0.
The expander that generates them is also adjusted so that we use and match the correct const_vector forms throughout.

Co-Authored-By: Jakub Jelinek <jakub@redhat.com>

gcc/ChangeLog:

	PR target/99037
	* config/aarch64/aarch64-simd.md (move_lo_quad_internal_<mode>): Use
	aarch64_simd_or_scalar_imm_zero to match zeroes.  Remove pattern
	matching const_int 0.
	(move_lo_quad_internal_be_<mode>): Likewise.
	(move_lo_quad_<mode>): Update for the above.
	* config/aarch64/iterators.md (VQ_2E): Delete.

gcc/testsuite/ChangeLog:

	PR target/99808
	* gcc.target/aarch64/pr99808.c: New test.
2021-03-29 11:54:57 +01:00
Jakub Jelinek 25e515d219 fold-const: Fix ICE in extract_muldiv_1 [PR99777]
extract_muldiv{,_1} is apparently only prepared to handle scalar integer
operations, the callers ensure it by only calling it if the divisor or
one of the multiplicands is INTEGER_CST and because neither multiplication
nor division nor modulo are really supported e.g. for pointer types, nullptr
type etc.  But the CASE_CONVERT handling doesn't really check if it isn't
a cast from some other type kind, so on the testcase we end up trying to
build MULT_EXPR in POINTER_TYPE which ICEs.  A few years ago Marek has
added ANY_INTEGRAL_TYPE_P checks to two spots, but the code uses
TYPE_PRECISION which means something completely different for vector types,
etc.
So IMNSHO we should just punt on conversions from non-integrals or
non-scalar integrals.

2021-03-29  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/99777
	* fold-const.c (extract_muldiv_1): For conversions, punt on casts from
	types other than scalar integral types.

	* g++.dg/torture/pr99777.C: New test.
2021-03-29 12:35:32 +02:00
Tobias Burnus d579e2e76f libgomp: Fix on_device_arch.c aux-file handling [PR99555]
libgomp/ChangeLog:

	PR target/99555
	* testsuite/lib/on_device_arch.c: Move to ...
	* testsuite/libgomp.c-c++-common/on_device_arch.h: ... here.
	* testsuite/libgomp.fortran/on_device_arch.c: New file;
	#include on_device_arch.h.
	* testsuite/libgomp.c-c++-common/task-detach-6.c: #include
	on_device_arch.h instead of using dg-additional-source.
	* testsuite/libgomp.c/pr99555-1.c: Likewise.
	* testsuite/libgomp.fortran/task-detach-6.f90: Update to use
	on_device_arch.c without relative paths.
2021-03-29 10:40:38 +02:00
GCC Administrator c411011287 Daily bump. 2021-03-29 00:16:20 +00:00
David Edelsohn 499fa254ae aix: TLS DWARF symbol decorations.
GCC currently emits TLS relocation decorations on symbols in DWARF sections.
Recent changes to the AIX linker cause it to reject such symbols.
This patch removes the decorations (@ie, @le, @m) and emits only the
qualified symbol name.

gcc/ChangeLog:

	* config/rs6000/rs6000.c (rs6000_output_dwarf_dtprel): Do not add
	XCOFF TLS reloc decorations.
2021-03-28 17:57:33 -04:00
Gerald Pfeifer d15db0c5f5 doc: Update link to "Memory Model" paper
gcc/ChangeLog:
	* doc/analyzer.texi (Analyzer Internals): Update link to
	"A Memory Model for Static Analysis of C Programs".
2021-03-28 23:34:35 +02:00
François Dumont d04c246cae libstdc++: _GLIBCXX_DEBUG Fix allocator-extended move constructor
libstdc++-v3/ChangeLog:

	* include/debug/forward_list
	(forward_list(forward_list&&, const allocator_type&)): Add noexcept qualification.
	* include/debug/list (list(list&&, const allocator_type&)): Likewise and add
	call to safe container allocator aware move constructor.
	* include/debug/vector (vector(vector&&, const allocator_type&)):
	Fix noexcept qualification.
	* testsuite/23_containers/forward_list/cons/noexcept_move_construct.cc:
	Add allocator-extended move constructor noexcept qualification check.
	* testsuite/23_containers/list/cons/noexcept_move_construct.cc: Likewise.
2021-03-28 22:06:33 +02:00
Christophe Lyon 46720db72c testsuite/arm: Improve scan-assembler in pr96770.c
I'm seeing random scan-assembler-times failures in pr96770.c when LTO is used.

I suspect this is because the \\+4 string matches the LTO sections, sometimes.

This small patch avoids the issue, by matching arr\\+4 instead of \\+4.
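
To see why the longer pattern is more robust, here is a sketch that counts literal substring matches. It is not how scan-assembler-times works (that uses Tcl regexps on the real assembler output), and count_matches is a hypothetical helper.

```c
#include <assert.h>
#include <string.h>

/* Count non-overlapping occurrences of NEEDLE in TEXT.  */
unsigned
count_matches (const char *text, const char *needle)
{
  assert (needle[0] != '\0');

  unsigned n = 0;
  size_t len = strlen (needle);
  for (const char *p = strstr (text, needle); p != NULL;
       p = strstr (p + len, needle))
    n++;
  return n;
}
```

On a made-up input such as "\t.word\tarr+4\n\t.ascii\t\"x+4y\"\n", searching for "+4" also hits the unrelated string data, while "arr+4" matches only the address computation.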

2021-03-28  Christophe Lyon  <christophe.lyon@linaro.org>

	gcc/testsuite/
	PR target/96770
	* gcc.target/arm/pure-code/pr96770.c: Improve scan-assembler-times.
2021-03-28 19:01:24 +00:00
Paul Thomas 297363774e Fortran: Fix problem with runtime pointer check [PR99602].
2021-03-28  Paul Thomas  <pault@gcc.gnu.org>

gcc/fortran/ChangeLog

	PR fortran/99602
	* trans-expr.c (gfc_conv_procedure_call): Use the _data attrs
	for class expressions and detect proc pointer evaluations by
	the non-null actual argument list.

gcc/testsuite/ChangeLog

	PR fortran/99602
	* gfortran.dg/pr99602.f90: New test.
	* gfortran.dg/pr99602a.f90: New test.
	* gfortran.dg/pr99602b.f90: New test.
	* gfortran.dg/pr99602c.f90: New test.
	* gfortran.dg/pr99602d.f90: New test.
2021-03-28 19:39:50 +01:00