182625 Commits

Author SHA1 Message Date
Jakub Jelinek
4ddee425b8 reassoc: Fix reassociation on 32-bit hosts with > 32767 bbs [PR98514]
Apparently reassoc ICEs on large functions (more than 32767 basic blocks
with something to reassociate in those).
The problem is that the pass uses long type to store the ranks, and
the bb ranks are (number of SSA_NAMEs with default defs + 2 + bb->index) << 16,
so with many basic blocks we overflow the ranks and then trip the assertions
that the rank is not negative.

The following patch just uses int64_t instead of long in the pass.
Yes, it means slightly higher memory consumption (one array indexed by
bb->index is twice as large, and one hash_map from trees to the ranks
will grow by 50%), but I think that is better than punting on
reassociation of large functions on 32-bit hosts and making the result
inconsistent, e.g. when cross-compiling.  Given vec.h uses unsigned for
vec element counts, we don't really support more than 4G of SSA_NAMEs or
more than 2G of basic blocks in a function, so even with the << 16 we
can't really overflow the int64_t rank counters.
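
For illustration only, a minimal sketch of the arithmetic described above.
The helper names bb_rank and fits_in_32bit_long are hypothetical and do not
exist in tree-ssa-reassoc.c; int32_t limits stand in for a 32-bit long.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the bb rank formula quoted above, computed in 64 bits:
   (number of default-def SSA_NAMEs + 2 + bb->index) << 16.  */
int64_t
bb_rank (int64_t default_defs, int64_t bb_index)
{
  assert (default_defs >= 0 && bb_index >= 0);
  return (default_defs + 2 + bb_index) << 16;
}

/* Nonzero if RANK would be representable in a 32-bit long.  */
int
fits_in_32bit_long (int64_t rank)
{
  return rank >= INT32_MIN && rank <= INT32_MAX;
}
```

With no default defs, bb index 32765 still fits in 32 bits while 32766 does
not; the largest counts mentioned above (4G SSA_NAMEs, 2G basic blocks) stay
far below the int64_t limit.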

2021-01-05  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/98514
	* tree-ssa-reassoc.c (bb_rank): Change type from long * to
	int64_t *.
	(operand_rank): Change type from hash_map<tree, long> to
	hash_map<tree, int64_t>.
	(phi_rank): Change return type from long to int64_t.
	(loop_carried_phi): Change block_rank variable type from long to
	int64_t.
	(propagate_rank): Change return type, rank parameter type and
	op_rank variable type from long to int64_t.
	(find_operand_rank): Change return type from long to int64_t
	and change slot variable type from long * to int64_t *.
	(insert_operand_rank): Change rank parameter type from long to
	int64_t.
	(get_rank): Change return type and rank variable type from long to
	int64_t.  Use PRId64 instead of ld to print the rank.
	(init_reassoc): Change rank variable type from long to int64_t
	and adjust correspondingly bb_rank and operand_rank initialization.
2021-01-05 16:37:40 +01:00
Jakub Jelinek
576714b309 phiopt: Optimize x < 0 ? ~y : y to (x >> 31) ^ y [PR96928]
As requested in the PR, the one's complement abs can be done more
efficiently without cmov or branching.

I had to change the ifcvt-onecmpl-abs-1.c testcase because we no longer
optimize it in ifcvt.  On x86_64 with -m32 we end up generating exactly the
same code, but with -m64 the code changes as follows:
        movl    %edi, %eax
-       notl    %eax
-       cmpl    %edi, %eax
-       cmovl   %edi, %eax
+       sarl    $31, %eax
+       xorl    %edi, %eax
        ret
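
As a sketch of why the two forms agree (function names here are made up for
the example, and the shift form assumes an arithmetic right shift of negative
values, which is what GCC implements):

```c
#include <assert.h>
#include <stdint.h>

/* The source-level form: x < 0 ? ~y : y.  */
int32_t
select_form (int32_t x, int32_t y)
{
  return x < 0 ? ~y : y;
}

/* The branchless form: x >> 31 is 0 for non-negative x and -1 (all ones)
   for negative x, so the xor either leaves y alone or flips every bit.  */
int32_t
shift_xor_form (int32_t x, int32_t y)
{
  return (x >> 31) ^ y;
}

/* Compare both forms over a handful of boundary values.  */
void
check_shift_xor (void)
{
  const int32_t samples[] = { INT32_MIN, -7, -1, 0, 1, 42, INT32_MAX };
  const int n = sizeof samples / sizeof samples[0];
  for (int i = 0; i < n; i++)
    for (int j = 0; j < n; j++)
      assert (select_form (samples[i], samples[j])
              == shift_xor_form (samples[i], samples[j]));
}
```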

2021-01-05  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/96928
	* tree-ssa-phiopt.c (xor_replacement): New function.
	(tree_ssa_phiopt_worker): Call it.

	* gcc.dg/tree-ssa/pr96928.c: New test.
	* gcc.target/i386/ifcvt-onecmpl-abs-1.c: Remove -fdump-rtl-ce1,
	instead of scanning rtl dump for ifcvt message check assembly
	for xor instruction.
2021-01-05 16:35:22 +01:00
Jakub Jelinek
5ca2400270 match.pd: Improve (A / (1 << B)) -> (A >> B) optimization [PR96930]
The following patch improves the A / (1 << B) -> A >> B simplification,
as seen in the testcase, if there is unnecessary widening for the division,
we just optimize it into a shift on the widened type, but if the lshift
is widened too, there is no reason to do that, we can just shift it in the
original type and convert after.  The tree_nonzero_bits & wi::mask check
already ensures it is fine even for signed values.
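
A small illustration of the idea, not the testcase from the PR: the types
(32-bit narrow, 64-bit wide, both unsigned) and the function names are
assumptions made for this sketch, and the signed case guarded by the
tree_nonzero_bits check is not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Division performed in the widened type, with both A and 1 << B
   extended from 32 bits.  */
uint64_t
div_widened (uint32_t a, uint32_t b)
{
  assert (b < 32);
  return (uint64_t) a / (uint64_t) ((uint32_t) 1 << b);
}

/* The improved form: shift in the original narrow type, convert after.  */
uint64_t
shift_narrow_then_widen (uint32_t a, uint32_t b)
{
  assert (b < 32);
  return (uint64_t) (a >> b);
}
```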

I've split the vr-values optimization into a separate patch as it causes
a small regression on two testcases, but this patch fixes what has been
reported in the PR alone.

2021-01-05  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/96930
	* match.pd ((A / (1 << B)) -> (A >> B)): If A is extended
	from narrower value which has the same type as 1 << B, perform
	the right shift on the narrower value followed by extension.

	* g++.dg/tree-ssa/pr96930.C: New test.
2021-01-05 16:33:29 +01:00
Jakub Jelinek
a7553ad60b store-merging: Handle vector CONSTRUCTORs using bswap [PR96239]
I've tried to add such a helper, but handing over just the analysis and
letting each pass handle it differently seems complicated given the
limitations of the bswap infrastructure.

So, this patch just hooks the optimization also into store-merging so that
the original testcase from the PR can be fixed.

2021-01-05  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/96239
	* gimple-ssa-store-merging.c (maybe_optimize_vector_constructor): New
	function.
	(get_status_for_store_merging): Don't return BB_INVALID for blocks
	with potential bswap optimizable CONSTRUCTORs.
	(pass_store_merging::execute): Optimize vector CONSTRUCTORs with bswap
	if possible.

	* gcc.dg/tree-ssa/pr96239.c: New test.
2021-01-05 16:16:06 +01:00
Jakub Jelinek
f702893787 go: Fix -fgo-embedcfg= option description.
Option descriptions should be terminated with a full stop; the
FAIL: compiler driver --help=go option(s): "^ +-.*[^:.]$" absent from output: "  -fgo-embedcfg=<file>        List embedded files via go:embed"
test even reports that.

2021-01-05  Jakub Jelinek  <jakub@redhat.com>

	* lang.opt (fgo-embedcfg=): Add full stop at the end of description.
2021-01-05 16:13:20 +01:00
Richard Biener
01da03c915 tree-optimization/98381 - fix live bool vector extract
This fixes extraction of live bool vector results for the case of
integer mode vectors.

2021-01-05  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/98381
	* tree.c (vector_element_bits): Properly compute bool vector
	element size.
	* tree-vect-loop.c (vectorizable_live_operation): Properly
	compute the last lane bit offset.
2021-01-05 15:54:42 +01:00
Uros Bizjak
1ff0ddcd8b i386: Prevent spurious FP exceptions with _mm_cvt{,t}ps_pi32 [PR98522]
Prevent spurious FP exceptions with _mm_cvt{,t}ps_pi32 for TARGET_MMX_WITH_SSE
by clearing the top 64 bits of the input XMM register.

2021-01-05  Uroš Bizjak  <ubizjak@gmail.com>

gcc/
	PR target/98522
	* config/i386/sse.md (sse_cvtps2pi): Redefine as define_insn_and_split.
	Clear the top 64 bits of the input XMM register.
	(sse_cvttps2pi): Ditto.

gcc/testsuite

	PR target/98522
	* gcc.target/i386/pr98522.c: New test.
2021-01-05 14:45:28 +01:00
Uros Bizjak
951bdbde6a i386: Add _mm256_cmov_si256 [PR98521]
Add missing _mm256_cmov_si256 intrinsic to xopintrin.h.

2021-01-05  Uroš Bizjak  <ubizjak@gmail.com>

gcc/
	PR target/98521
	* config/i386/xopintrin.h (_mm256_cmov_si256): New.
2021-01-05 14:45:27 +01:00
Nathan Sidwell
6ffaffd5d1 [c++]: Improve module-decl diagnostics [PR 98327]
The diagnostic for a misplaced module decl was essentially 'computer
says no', which isn't the most helpful.  This adjusts it to indicate
what would be acceptable.

	gcc/cp/
	* parser.c (cp_parser_module_declaration): Alter diagnostic
	text to say where it is permissible.
	gcc/testsuite/
	* g++.dg/modules/mod-decl-1.C: Adjust.
	* g++.dg/modules/p0713-2.C: Adjust.
	* g++.dg/modules/p0713-3.C: Adjust.
2021-01-05 05:28:23 -08:00
H.J. Lu
af60b0ec79 x86: Cast to unsigned short first for _mm_extract_pi16
_mm_extract_pi16 is the intrinsic for pextrw, whose result should be
zero-extended, not sign-extended.
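
To illustrate the difference, here is a plain C model of extracting one
16-bit lane; it is not the xmmintrin.h code, and the function names and the
four-element array standing in for an MMX value are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Converting a signed 16-bit lane straight to int sign-extends it.  */
int
extract_sign_extended (const int16_t lanes[4], int n)
{
  assert (n >= 0 && n < 4);
  return lanes[n];
}

/* Casting to unsigned short first gives the zero-extended value,
   matching what pextrw leaves in the destination register.  */
int
extract_zero_extended (const int16_t lanes[4], int n)
{
  assert (n >= 0 && n < 4);
  return (unsigned short) lanes[n];
}

/* Extract lane 1 (which holds -1) with either policy.  */
int
demo_lane_value (int zero_extend)
{
  const int16_t lanes[4] = { 1, -1, 2, 3 };
  return zero_extend ? extract_zero_extended (lanes, 1)
                     : extract_sign_extended (lanes, 1);
}
```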

gcc/

	PR target/98495
	* config/i386/xmmintrin.h (_mm_extract_pi16): Cast to unsigned
	short first.

gcc/testsuite/

	PR target/98495
	* gcc.target/i386/pr98495-1.c: New test.
	* gcc.target/i386/pr98495-2.c: New test.
	* gcc.target/i386/pr98495-3.c: New test.
	* gcc.target/i386/pr98495-4.c: New test.
	* gcc.target/i386/pr98495-5.c: New test.
2021-01-05 05:08:00 -08:00
Claudiu Zissulescu
b679559385 arc: fix accumulator first register.
gcc/
2021-01-05  Claudiu Zissulescu  <claziss@synopsys.com>

	* config/arc/arc.md (maddsidi4_split): Use ACC_REG_FIRST.
	(umaddsidi4_split): Likewise.

Signed-off-by: Claudiu Zissulescu <claziss@synopsys.com>
2021-01-05 14:19:27 +02:00
liuhongt
bea984814c i386: Optimize pmovskb on zero_extend of subreg HI of pmovskb result [PR98461]
The following patch adds define_insn_and_split to optimize

       vpmovmskb       %xmm0, %eax
-       movzwl  %ax, %eax
        notl    %eax

and combine splitter to optimize

        pmovmskb        %xmm0, %eax
-       notl    %eax
-       movzwl  %ax, %eax
+       xorl    $65535, %eax
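
The second transformation relies on the pmovmskb result of an XMM register
having only its low 16 bits set.  A sketch under that assumption (function
names are hypothetical):

```c
#include <assert.h>
#include <stdint.h>

/* notl followed by movzwl: complement, then keep the low 16 bits.  */
uint32_t
not_then_zext16 (uint32_t mask)
{
  assert (mask <= 0xffffu);   /* pmovmskb on XMM yields at most 16 bits */
  return (uint16_t) ~mask;
}

/* The single-instruction replacement: flip only the low 16 bits.  */
uint32_t
xor_65535 (uint32_t mask)
{
  assert (mask <= 0xffffu);
  return mask ^ 65535u;
}
```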

gcc/ChangeLog
	PR target/98461
	* config/i386/sse.md (*sse2_pmovskb_zexthisi): New
	define_insn_and_split for zero_extend of subreg HI of pmovskb
	result.
	(*sse2_pmovskb_zexthisi): Add new combine splitters for
	zero_extend of not of subreg HI of pmovskb result.

gcc/testsuite/ChangeLog
	* gcc.target/i386/sse2-pr98461-2.c: New test.
2021-01-05 19:39:46 +08:00
Richard Sandiford
e8beba1cfc explow, aarch64: Fix force-Pmode-to-mem for ILP32 [PR97269]
This patch fixes a mode/rtx mismatch for ILP32 targets in:

	  mem = force_const_mem (ptr_mode, imm);

where imm can be Pmode rather than ptr_mode.

The patch uses convert_memory_address to convert the Pmode address
to ptr_mode before the call.  However, immediate addresses can in
general contain unspecs, and convert_memory_address wasn't set up
to handle those.

The patch therefore adds some generic unspec handling to
convert_memory_address_addr_space_1.  As the comment says, we can add
a target hook if this behaviour turns out to be wrong for some targets.
But I think what the patch does is a strict improvement over the status
quo: without it, we would try to force the unspec into a register,
but nevertheless wrap the result in a (const ...).  That in turn
would be invalid rtl and seems bound to generate an ICE later.

I tested the explow.c part using -fstack-protector with local hacks
to force SYMBOL_FORCE_TO_MEM for UNSPEC_SALT_ADDR.

Fixes c-c++-common/torture/pr57945.c and various other tests.

gcc/
	PR target/97269
	* explow.c (convert_memory_address_addr_space_1): Handle UNSPECs
	nested in CONSTs.
	* config/aarch64/aarch64.c (aarch64_expand_mov_immediate): Use
	convert_memory_address to convert symbolic immediates to ptr_mode
	before forcing them to memory.
2021-01-05 11:29:10 +00:00
Richard Sandiford
eac8675225 recog: Fix a constrain_operands corner case [PR97144]
aarch64's *add<mode>3_poly_1 has a pattern with the constraints:

  "=...,r,&r"
  "...,0,rk"
  "...,Uai,Uat"

i.e. the penultimate alternative requires operands 0 and 1 to match,
but the final alternative does not allow them to match.

The register allocators dealt with this correctly, and so used
different input and output registers for instructions with Uat
operands.  However, constrain_operands carried the penultimate
alternative's matching rule over to the final alternative,
so it would essentially ignore the earlyclobber.  This in turn
allowed postreload to convert a correct Uat pairing into an
incorrect one.

The fix is simple: recompute the matching information for each
alternative.
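
The shape of the bug can be shown with a deliberately simplified model.
Everything below (the table, the names, the two-alternative setup) is
hypothetical and only mimics the "initialize once" versus "initialize per
alternative" difference; it is not the recog.c code.

```c
#include <assert.h>

#define N_OPERANDS 3
#define N_ALTERNATIVES 2

/* match_of[alt][op] is the operand that OP must match in alternative
   ALT, or -1 for none.  Alternative 0 ties operand 1 to operand 0;
   alternative 1 has no matching constraint at all.  */
static const int match_of[N_ALTERNATIVES][N_OPERANDS] = {
  { -1, 0, -1 },
  { -1, -1, -1 },
};

/* Return the operand that operand 1 is believed to match while the
   final alternative is being checked.  */
int
match_seen_in_last_alternative (int reinit_per_alternative)
{
  int matching[N_OPERANDS];
  for (int op = 0; op < N_OPERANDS; op++)
    matching[op] = -1;

  for (int alt = 0; alt < N_ALTERNATIVES; alt++)
    {
      if (reinit_per_alternative)
        for (int op = 0; op < N_OPERANDS; op++)
          matching[op] = -1;

      for (int op = 0; op < N_OPERANDS; op++)
        if (match_of[alt][op] >= 0)
          matching[op] = match_of[alt][op];
    }
  return matching[1];
}

/* Without the reset, the stale match from alternative 0 leaks through.  */
void
check_model (void)
{
  assert (match_seen_in_last_alternative (0) == 0);
  assert (match_seen_in_last_alternative (1) == -1);
}
```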

gcc/
	PR rtl-optimization/97144
	* recog.c (constrain_operands): Initialize matching_operand
	for each alternative, rather than only doing it once.

gcc/testsuite/
	PR rtl-optimization/97144
	* gcc.c-torture/compile/pr97144.c: New test.
	* gcc.target/aarch64/sve/pr97144.c: Likewise.
2021-01-05 11:18:48 +00:00
Richard Sandiford
8a25be517f rtl-ssa: Fix updates to call clobbers [PR98403]
In the PR, fwprop was changing a call instruction and tripped
an assert when trying to update a list of call clobbers.
There are two ways we could handle this: remove the call clobber
and then add it back, or assume that the clobber will stay in its
current place.

At the moment we don't have enough information to safely move
calls around, so the second approach seems simpler and more
efficient.

gcc/
	PR rtl-optimization/98403
	* rtl-ssa/changes.cc (function_info::finalize_new_accesses): Explain
	why we don't remove call clobbers.
	(function_info::apply_changes_to_insn): Don't attempt to add
	call clobbers here.

gcc/testsuite/
	PR rtl-optimization/98403
	* g++.dg/opt/pr98403.C: New test.
2021-01-05 11:04:15 +00:00
Richard Sandiford
01be45ecce vect: Fix missing alias checks for 128-bit SVE [PR98371]
On AArch64, the vectoriser tries various ways of vectorising with both
SVE and Advanced SIMD and picks the best one.  All other things being
equal, it prefers earlier attempts over later attempts.

The way this works currently is that, once it has a successful
vectorisation attempt A, it analyses all other attempts as epilogue
loops of A:

      /* When pick_lowest_cost_p is true, we should in principle iterate
	 over all the loop_vec_infos that LOOP_VINFO could replace and
	 try to vectorize LOOP_VINFO under the same conditions.
	 E.g. when trying to replace an epilogue loop, we should vectorize
	 LOOP_VINFO as an epilogue loop with the same VF limit.  When trying
	 to replace the main loop, we should vectorize LOOP_VINFO as a main
	 loop too.

	 However, autovectorize_vector_modes is usually sorted as follows:

	 - Modes that naturally produce lower VFs usually follow modes that
	   naturally produce higher VFs.

	 - When modes naturally produce the same VF, maskable modes
	   usually follow unmaskable ones, so that the maskable mode
	   can be used to vectorize the epilogue of the unmaskable mode.

	 This order is preferred because it leads to the maximum
	 epilogue vectorization opportunities.  Targets should only use
	 a different order if they want to make wide modes available while
	 disparaging them relative to earlier, smaller modes.  The assumption
	 in that case is that the wider modes are more expensive in some
	 way that isn't reflected directly in the costs.

	 There should therefore be few interesting cases in which
	 LOOP_VINFO fails when treated as an epilogue loop, succeeds when
	 treated as a standalone loop, and ends up being genuinely cheaper
	 than FIRST_LOOP_VINFO.  */

However, the vectoriser can normally elide alias checks for epilogue
loops, on the basis that the main loop should do them instead.
Converting an epilogue loop to a main loop can therefore cause the alias
checks to be skipped.  (It probably also unfairly penalises the original
loop in the cost comparison, given that one loop will have alias checks
and the other won't.)

As the comment says, we should in principle analyse each vector mode
twice: once as a main loop and once as an epilogue.  However, doing
that up-front would be quite expensive.  This patch instead goes for a
compromise: if an epilogue loop for mode M2 seems better than a main
loop for mode M1, re-analyse with M2 as the main loop.

The patch fixes dg.torture.exp=pr69719.c when testing with
-msve-vector-bits=128.

gcc/
	PR tree-optimization/98371
	* tree-vect-loop.c (vect_reanalyze_as_main_loop): New function.
	(vect_analyze_loop): If an epilogue loop appears to be cheaper
	than the main loop, re-analyze it as a main loop before adopting
	it as a main loop.
2021-01-05 11:03:22 +00:00
Rainer Orth
a20893cf6b build: libcody: Link with -lsocket -lnsl if necessary [PR98316]
With the introduction of C++20 modules and libcody, cc1plus and
cc1objplus gained a dependency on the socket functions.  Before those
were merged into libc in Solaris 11.4, one needed to link with -lsocket -lnsl
on Solaris, so that merge broke the Solaris 11.3 build.

While we already have 4 different checks for those libraries in the
tree, I decided to import autoconf-archive's AX_LIB_SOCKET_NSL macro
instead.  At the same time, the patch only links libcody and the
networking libs where needed (cc1plus, cc1objplus).

Bootstrapped without regressions on i386-pc-solaris2.11 (Solaris 11.3
and 11.4), sparc-sun-solaris2.11, and x86_64-pc-linux-gnu.

2020-12-16  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

	c++tools:
	PR c++/98316
	* configure.ac: Include ../config/ax_lib_socket_nsl.m4.
	(NETLIBS): Determine using AX_LIB_SOCKET_NSL.
	* configure: Regenerate.
	* Makefile.in (NETLIBS): Define.
	(g++-mapper-server$(exeext)): Add $(NETLIBS).

	gcc/objcp:
	PR c++/98316
	* Make-lang.in (cc1objplus$(exeext)): Add $(CODYLIB), $(NETLIBS).

	gcc/cp:
	PR c++/98316
	* Make-lang.in (cc1plus$(exeext)): Add $(CODYLIB), $(NETLIBS).

	gcc:
	PR c++/98316
	* configure.ac (NETLIBS): Determine using AX_LIB_SOCKET_NSL.
	* aclocal.m4, configure: Regenerate.
	* Makefile.in (NETLIBS): Define.
	(BACKEND): Remove $(CODYLIB).

	config:
	PR c++/98316
	* ax_lib_socket_nsl.m4: Import from autoconf-archive.
2021-01-05 11:32:31 +01:00
Jakub Jelinek
4615cde5d7 simplify-rtx: Optimize (x - 1) * y + y [PR98334]
For signed x and y we don't try to optimize (int) (x - 1U) * y + y
into x * y; we can't do that with signed x * y, because the former
is well defined for INT_MIN and -1, while the latter is not.
We could perhaps optimize it during isel or some very late optimization
where we'd magically turn on flag_wrapv, but we don't do that yet.

This patch optimizes it in simplify-rtx.c, such that we can optimize
it during combine.
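
In wrapping (modular) arithmetic, which is what RTL operates on, the
identities hold for all inputs.  A sketch using uint32_t to get wrapping
semantics in C; the function names are made up for the example:

```c
#include <assert.h>
#include <stdint.h>

uint32_t
minus_one_form (uint32_t x, uint32_t y)
{
  return (x - 1u) * y + y;
}

uint32_t
plus_one_form (uint32_t x, uint32_t y)
{
  return (x + 1u) * y - y;
}

uint32_t
product (uint32_t x, uint32_t y)
{
  return x * y;
}

/* Includes the INT_MIN / -1 bit patterns mentioned above.  */
void
check_identities (void)
{
  const uint32_t samples[] = { 0u, 1u, 3u, 7u, 0x7fffffffu,
                               0x80000000u, 0xffffffffu };
  const int n = sizeof samples / sizeof samples[0];
  for (int i = 0; i < n; i++)
    for (int j = 0; j < n; j++)
      {
        uint32_t x = samples[i], y = samples[j];
        assert (minus_one_form (x, y) == product (x, y));
        assert (plus_one_form (x, y) == product (x, y));
      }
}
```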

2021-01-05  Jakub Jelinek  <jakub@redhat.com>

	PR rtl-optimization/98334
	* simplify-rtx.c (simplify_context::simplify_binary_operation_1):
	Optimize (X - 1) * Y + Y to X * Y or (X + 1) * Y - Y to X * Y.

	* gcc.target/i386/pr98334.c: New test.
2021-01-05 10:59:00 +01:00
Bernd Edlinger
6b69738c1e Restore input_location after recursive expand_call_inline
This is just a precautionary fix.

2021-01-05  Bernd Edlinger  <bernd.edlinger@hotmail.de>

	* tree-inline.c (expand_call_inline): Restore input_location.
	Return result from recursive call.
2021-01-05 10:57:54 +01:00
Jerome Lambourg
560d991576 Fix testsuite/g++.dg/cpp1y/constexpr-66093.C execution failure...
The constexpr iteration dereferenced an array element past the end of
the array.


for  gcc/testsuite/ChangeLog

	* g++.dg/cpp1y/constexpr-66093.C: Fix bounds issue.
2021-01-05 04:47:41 -03:00
Ian Lance Taylor
bf183413c6 Go frontend: add -fgo-embedcfg option
This option will be used by the go command to implement go:embed directives,
which are new with the upcoming Go 1.16 release.

	* lang.opt (fgo-embedcfg): New option.
	* go-c.h (struct go_create_gogo_args): Add embedcfg field.
	* go-lang.c (go_embedcfg): New static variable.
	(go_langhook_init): Set go_create_gogo_args embedcfg field.
	(go_langhook_handle_option): Handle OPT_fgo_embedcfg_.
	* gccgo.texi (Invoking gccgo): Document -fgo-embedcfg.
2021-01-04 17:41:16 -08:00
David Malcolm
15af33a880 analyzer: fix ICE with -fsanitize=undefined [PR98293]
-fsanitize=undefined with calls to nonnull functions
creates struct __ubsan_nonnull_arg_data instances
with CONSTRUCTORs for RECORD_TYPEs with NULL index values.
The analyzer was mistakenly using INTEGER_CST for these
fields, leading to ICEs.

Fix the issue by iterating through the fields in the type
for such cases, imitating similar logic in varasm.c's
output_constructor.

gcc/analyzer/ChangeLog:
	PR analyzer/98293
	* store.cc (binding_map::apply_ctor_to_region): When "index" is
	NULL, iterate through the fields for RECORD_TYPEs, rather than
	creating an INTEGER_CST index.

gcc/testsuite/ChangeLog:
	PR analyzer/98293
	* gcc.dg/analyzer/pr98293.c: New test.
2021-01-04 19:20:32 -05:00
GCC Administrator
7e73f51157 Daily bump. 2021-01-05 00:16:42 +00:00
Martin Uecker
a000eb5918 C: Add test for incorrect warning for assignment of certain volatile expressions fixed by commit 58a45ce [PR98029]
2021-01-04  Martin Uecker  <muecker@gwdg.de>

gcc/testsuite/
	PR c/98029
	* gcc.dg/pr98029.c: New test.
2021-01-04 22:53:58 +01:00
Philipp Tomsich
f262a35188 MAINTAINERS: Update my email address.
2021-01-04  Philipp Tomsich  <philipp.tomsich@vrull.eu>

	* MAINTAINERS: Update my email address.
2021-01-04 17:37:54 +01:00
Nathan Sidwell
a5469584f6 c++: Add stdlib module test cases
The remaining modules tests use the std library.  These are those.

	gcc/testsuite/
	* g++.dg/modules/binding-1_a.H: New.
	* g++.dg/modules/binding-1_b.H: New.
	* g++.dg/modules/binding-1_c.C: New.
	* g++.dg/modules/binding-2.H: New.
	* g++.dg/modules/builtin-3_a.C: New.
	* g++.dg/modules/global-2_a.C: New.
	* g++.dg/modules/global-2_b.C: New.
	* g++.dg/modules/global-3_a.C: New.
	* g++.dg/modules/global-3_b.C: New.
	* g++.dg/modules/hello-1_a.C: New.
	* g++.dg/modules/hello-1_b.C: New.
	* g++.dg/modules/iostream-1_a.H: New.
	* g++.dg/modules/iostream-1_b.C: New.
	* g++.dg/modules/part-5_a.C: New.
	* g++.dg/modules/part-5_b.C: New.
	* g++.dg/modules/part-5_c.C: New.
	* g++.dg/modules/stdio-1_a.H: New.
	* g++.dg/modules/stdio-1_b.C: New.
	* g++.dg/modules/string-1_a.H: New.
	* g++.dg/modules/string-1_b.C: New.
	* g++.dg/modules/string-view1.C: New.
	* g++.dg/modules/string-view2.C: New.
	* g++.dg/modules/tinfo-1.C: New.
	* g++.dg/modules/tinfo-2_a.H: New.
	* g++.dg/modules/tinfo-2_b.C: New.
	* g++.dg/modules/tname-spec-1_a.H: New.
	* g++.dg/modules/tname-spec-1_b.C: New.
	* g++.dg/modules/xtreme-header-1.h: New.
	* g++.dg/modules/xtreme-header-1_a.H: New.
	* g++.dg/modules/xtreme-header-1_b.C: New.
	* g++.dg/modules/xtreme-header-1_c.C: New.
	* g++.dg/modules/xtreme-header-2.h: New.
	* g++.dg/modules/xtreme-header-2_a.H: New.
	* g++.dg/modules/xtreme-header-2_b.C: New.
	* g++.dg/modules/xtreme-header-2_c.C: New.
	* g++.dg/modules/xtreme-header-3.h: New.
	* g++.dg/modules/xtreme-header-3_a.H: New.
	* g++.dg/modules/xtreme-header-3_b.C: New.
	* g++.dg/modules/xtreme-header-3_c.C: New.
	* g++.dg/modules/xtreme-header-4.h: New.
	* g++.dg/modules/xtreme-header-4_a.H: New.
	* g++.dg/modules/xtreme-header-4_b.C: New.
	* g++.dg/modules/xtreme-header-4_c.C: New.
	* g++.dg/modules/xtreme-header-5.h: New.
	* g++.dg/modules/xtreme-header-5_a.H: New.
	* g++.dg/modules/xtreme-header-5_b.C: New.
	* g++.dg/modules/xtreme-header-5_c.C: New.
	* g++.dg/modules/xtreme-header-6.h: New.
	* g++.dg/modules/xtreme-header-6_a.H: New.
	* g++.dg/modules/xtreme-header-6_b.C: New.
	* g++.dg/modules/xtreme-header-6_c.C: New.
	* g++.dg/modules/xtreme-header.h: New.
	* g++.dg/modules/xtreme-header_a.H: New.
	* g++.dg/modules/xtreme-header_b.C: New.
	* g++.dg/modules/xtreme-tr1.h: New.
	* g++.dg/modules/xtreme-tr1_a.H: New.
	* g++.dg/modules/xtreme-tr1_b.C: New.
2021-01-04 07:52:21 -08:00
Richard Sandiford
aa204d5118 vect, aarch64: Fix alignment units for IFN_MASK* [PR95401]
The IFN_MASK* functions take two leading arguments: a load or
store pointer and a “cookie”.  The type of the cookie is the
type of the access for TBAA purposes (like for MEM_REFs)
while the value of the cookie is the alignment of the access.
This PR was caused by a disagreement about whether the alignment
is measured in bits or bytes.

It looks like this goes back to PR68786, which made the
vectoriser create its own cookie argument rather than reusing
the one created by ifcvt.  The alignment value of the new cookie
was measured in bytes (as needed by set_ptr_info_alignment)
while the existing code expected it to be measured in bits.
The folds I added for IFN_MASK_LOAD and STORE then made
things worse.

gcc/
	PR tree-optimization/95401
	* config/aarch64/aarch64-sve-builtins.cc
	(gimple_folder::load_store_cookie): Use bits rather than bytes
	for the alignment argument to IFN_MASK_LOAD and IFN_MASK_STORE.
	* gimple-fold.c (gimple_fold_mask_load_store_mem_ref): Likewise.
	* tree-vect-stmts.c (vectorizable_store): Likewise.
	(vectorizable_load): Likewise.

gcc/testsuite/
	PR tree-optimization/95401
	* g++.dg/vect/pr95401.cc: New test.
	* g++.dg/vect/pr95401a.cc: Likewise.
2021-01-04 14:44:21 +00:00
Nathan Sidwell
6288183377 [libcody] Remove some std::move [PR 98368]
Compiling on clang showed a couple of pessimizations.  Fixed thusly.

	libcody/
	* client.cc (Client::ProcessResponse): Remove std::move
	inside ?:
	c++tools/
	* resolver.cc (module_resolver::cmi_response): Remove
	std::move of temporary.
2021-01-04 06:38:52 -08:00
Mateusz Wajchęprzełóż
6bbc196c64 [libcody] Windows absdir fix
An obvious thinko in drive name check :(

	libcody/
	* resolver.cc (IsAbsDir): Fix string indexing.

Signed-off-by: Nathan Sidwell <nathan@acm.org>
2021-01-04 08:59:10 -05:00
Richard Biener
9e79d76a16 tree-optimization/98308 - set vector type for mask of masked load
This makes sure to set the vector type on an invariant mask argument
for a masked load and SLP.

2021-01-04  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/98308
	* tree-vect-stmts.c (vectorizable_load): Set invariant mask
	SLP vectype.

	* gcc.dg/vect/pr98308.c: New testcase.
2021-01-04 14:39:14 +01:00
Jakub Jelinek
24cd9afe61 loop-niter: Recognize popcount idioms even with char, short and __int128 [PR95771]
As the testcase shows, we punt unnecessarily on popcount loop idioms if
the type is smaller than int or larger than long long.
Smaller type than int can be handled by zero-extending the argument to
unsigned int, and types twice as long as long long by doing
__builtin_popcountll on both halves of the __int128.
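
A source-level sketch of the two strategies, assuming GCC or Clang on a
target with a 64-bit long long and __int128 support; the function names are
hypothetical and the code only mirrors the idea, not what
number_of_iterations_popcount emits.

```c
#include <assert.h>

/* Narrow type: zero-extend to unsigned int, then use the int builtin.  */
int
popcount_narrow (unsigned char x)
{
  return __builtin_popcount ((unsigned int) x);
}

/* Type twice as wide as long long: count both halves separately.  */
int
popcount_wide (unsigned __int128 x)
{
  return __builtin_popcountll ((unsigned long long) (x >> 64))
         + __builtin_popcountll ((unsigned long long) x);
}

/* The loop idiom: clear the lowest set bit until nothing is left.  */
int
popcount_loop (unsigned __int128 x)
{
  int n = 0;
  while (x)
    {
      x &= x - 1;
      n++;
    }
  return n;
}

void
check_popcount (void)
{
  unsigned __int128 v = ((unsigned __int128) 1 << 100) | 0xf0u;
  assert (popcount_wide (v) == popcount_loop (v));
  assert (popcount_narrow (0xa5) == popcount_loop (0xa5));
}
```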

2021-01-04  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/95771
	* tree-ssa-loop-niter.c (number_of_iterations_popcount): Handle types
	with precision smaller than int's precision and types with precision
	twice as large as long long.  Formatting fixes.

	* gcc.target/i386/pr95771.c: New test.
2021-01-04 14:36:06 +01:00
Richard Biener
39bd65faee tree-optimization/98464 - replace loop info with avail at uses
This does VN replacement in loop nb_iterations consistent with
the rest of the IL by using availability at the definition site
of uses.

2021-01-04  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/98464
	* tree-ssa-sccvn.c (vn_valueize_for_srt): Rename from ...
	(vn_valueize_wrapper): ... this.  Temporarily adjust vn_context_bb.
	(process_bb): Adjust.

	* g++.dg/opt/pr98464.C: New testcase.
2021-01-04 13:51:56 +01:00
Matthew Malcomson
7f2b731756 docs: Fix wording describing the hwaddress sanitizer
The original documentation added to mention the clash between
-fsanitize=address and -fsanitize=hwaddress used confusing wording trying
to say that -fsanitize=hwaddress is only available on AArch64.

It read as if -fsanitize=address were only supported on AArch64.

This patch fixes that wording by being more explicit.

gcc/ChangeLog:

	PR other/98437
	* doc/invoke.texi (-fsanitize=address): Fix wording describing
	clash with -fsanitize=hwaddress.
2021-01-04 12:06:27 +00:00
Richard Biener
13b80a7d1b tree-optimization/98282 - classify V_C_E<constant> as nary
This avoids running into memory reference code in compute_avail by
properly classifying unfolded reference trees on constants.

2021-01-04  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/98282
	* tree-ssa-sccvn.c (vn_get_stmt_kind): Classify tcc_reference on
	invariants as VN_NARY.

	* g++.dg/opt/pr98282.C: New testcase.
2021-01-04 12:59:44 +01:00
Richard Sandiford
b41e6dd50f aarch64: Improve vcombine codegen [PR89057]
This patch fixes a codegen regression in the handling of things like:

  __temp.val[0]								     \
    = vcombine_##funcsuffix (__b.val[0],				     \
			     vcreate_##funcsuffix (__AARCH64_UINT64_C (0))); \

in the 64-bit vst[234] functions.  The zero was forced into a
register at expand time, and we relied on combine to fuse the
zero and combine back together into a single combinez pattern.
The problem is that the zero could be hoisted before combine
gets a chance to do its thing.

gcc/
	PR target/89057
	* config/aarch64/aarch64-simd.md (aarch64_combine<mode>): Accept
	aarch64_simd_reg_or_zero for operand 2.  Use the combinez patterns
	to handle zero operands.

gcc/testsuite/
	PR target/89057
	* gcc.target/aarch64/pr89057.c: New test.
2021-01-04 11:59:07 +00:00
Richard Sandiford
ba15b0fa0d aarch64: Use the MUL VL form of SVE PRF[BHWD]
The expansions of the svprf[bhwd] instructions weren't taking
advantage of the immediate addressing mode.

gcc/
	* config/aarch64/aarch64.c (offset_6bit_signed_scaled_p): New function.
	(offset_6bit_unsigned_scaled_p): Fix typo in comment.
	(aarch64_sve_prefetch_operand_p): Accept MUL VLs in the range
	[-32, 31].

gcc/testsuite/
	* gcc.target/aarch64/sve/acle/asm/prfb.c: Test for a MUL VL range of
	[-32, 31].
	* gcc.target/aarch64/sve/acle/asm/prfh.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/prfw.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/prfd.c: Likewise.
2021-01-04 11:56:19 +00:00
Richard Biener
0926259f9f tree-optimization/98393 - properly init matches when failing SLP
This zeroes matches when failing SLP discovery because of the
work limit.

2021-01-04  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/98393
	* tree-vect-slp.c (vect_build_slp_tree): Properly zero matches
	when hitting the limit.
2021-01-04 12:13:44 +01:00
Martin Liska
a40718b5fc Convert 2 files to utf8.
libiberty/ChangeLog:

	* strverscmp.c: Convert to utf8 from iso8859.

gcc/testsuite/ChangeLog:

	* README: Convert to utf8 from iso8859.
2021-01-04 11:35:17 +01:00
Martin Liska
ff6b406247 avr.exp: convert Dos newlines to Unix ones
gcc/testsuite/ChangeLog:

	* gcc.target/avr/avr.exp: Run dos2unix on the file.
2021-01-04 11:21:20 +01:00
Richard Biener
8837f82e4b tree-optimization/98291 - allow more SLP vectorization of reductions
When the VF is one, an SLP reduction is in-order and thus we can
vectorize even when the reduction op is not associative.

2021-01-04  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/98291
	* tree-vect-loop.c (vectorizable_reduction): Bypass
	associativity check for SLP reductions with VF 1.

	* gcc.dg/vect/slp-reduc-11.c: New testcase.
	* gcc.dg/vect/vect-reduc-in-order-4.c: Adjust.
2021-01-04 10:47:43 +01:00
Jakub Jelinek
ad64e807ff match.pd: Fold x == ~x to false [PR96782]
x is never equal to ~x, so we can fold such comparisons to constants.
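
The reasoning in one line: x and ~x differ in every bit, so x ^ ~x is all
ones and can never be zero.  A tiny sketch with hypothetical function names:

```c
#include <assert.h>

/* Always 0: x and ~x differ in every bit position.  */
int
equals_own_complement (unsigned int x)
{
  return x == ~x;
}

/* Always all ones, for the same reason.  */
unsigned int
xor_with_complement (unsigned int x)
{
  return x ^ ~x;
}

void
check_complement (void)
{
  const unsigned int samples[] = { 0u, 1u, 0x1234u, 0x80000000u, ~0u };
  for (unsigned int i = 0; i < sizeof samples / sizeof samples[0]; i++)
    {
      assert (!equals_own_complement (samples[i]));
      assert (xor_with_complement (samples[i]) == ~0u);
    }
}
```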

2021-01-04  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/96782
	* match.pd (x == ~x -> false, x != ~x -> true): New simplifications.

	* gcc.dg/tree-ssa/pr96782.c: New test.
2021-01-04 10:37:12 +01:00
Jakub Jelinek
99dee82307 Update copyright years. 2021-01-04 10:26:59 +01:00
Jakub Jelinek
c00e2af363 Add AMD and Ulf Adams as external authors
* update-copyright.py: Add AMD and Ulf Adams as external authors.
2021-01-04 10:25:17 +01:00
Martin Liska
f96f664cf6 Remove duplicate ChangeLog entries.
gcc/fortran/ChangeLog:

	* ChangeLog-2018: Remove duplicate ChangeLog entries.
2021-01-04 10:18:18 +01:00
Jakub Jelinek
2a680610d1 Fix up indentation in update-copyright.py
* update-copyright.py: Use 8 spaces instead of tab to indent.
2021-01-04 10:16:13 +01:00
Martin Liska
cf76bbf8a8 mklog.py: add --update-copyright option
contrib/ChangeLog:

	* mklog.py: Add --update-copyright option which adds:
	"Update copyright years." to ChangeLog files belonging
	to a modified file.
2021-01-04 10:09:07 +01:00
Martin Liska
8869bd0efc gcc-changelog: Ignore copyright years commits.
contrib/ChangeLog:

	* gcc-changelog/git_commit.py: Skip Update copyright
	years commits.
2021-01-04 10:09:07 +01:00
Jakub Jelinek
b4cdbb9335 Remove duplicated ChangeLog entries from po/ChangeLog
to undo broken https://gcc.gnu.org/git/?p=gcc.git;a=blobdiff;f=gcc/po/ChangeLog;h=9f4bf9a8e3a34266e521e24be1adbba52f31e8d3;hp=5f5f8f70e44a374d3a8a615abc6cddc6642982a3;hb=818ab71a415cd234be092111a0aa5e812ec56434;hpb=21fa2a29dc265ab54c957c37d8a9e9ab07d7cd66
change.
2021-01-04 10:08:04 +01:00
Bernd Edlinger
e9f8a554ef Fix -save-temps leaking lto files in /tmp
When linking with -flto and -save-temps, various
temporary files are created in /tmp.
The same happens when invoking the driver with @file
parameter, and using -L or -I options.

gcc:
2021-01-04  Bernd Edlinger  <bernd.edlinger@hotmail.de>

	* collect-utils.c (collect_execute): Check dumppfx.
	* collect2.c (maybe_run_lto_and_relink, do_link): Pass atsuffix
	to collect_execute.
	(do_link): Add new parameter atsuffix.
	(main): Handle -dumpdir option.  Skip one argument for
	-o, -isystem and -B options.
	* gcc.c (make_at_file): New helper function.
	(close_at_file): Use it.

gcc/testsuite:
2021-01-04  Bernd Edlinger  <bernd.edlinger@hotmail.de>

	* gcc.misc-tests/outputs.exp: Adjust testcase.
2021-01-04 10:03:19 +01:00
Jakub Jelinek
c48514bea6 Update Copyright in ChangeLog files
Do this separately from all other Copyright updates, as ChangeLog files
can be modified only separately.
2021-01-04 09:35:45 +01:00