Commit Graph

169423 Commits

Author SHA1 Message Date
GCC Administrator 5fe50e5a5c Daily bump. 2020-04-19 00:17:30 +00:00
GCC Administrator b29ea6948c Daily bump. 2020-04-18 00:17:34 +00:00
H.J. Lu 4a745938b5 x86: Insert ENDBR if function will be called indirectly
Since constant_call_address_operand has

;; Test for a pc-relative call operand
(define_predicate "constant_call_address_operand"
  (match_code "symbol_ref")
{
  if (ix86_cmodel == CM_LARGE || ix86_cmodel == CM_LARGE_PIC
      || flag_force_indirect_call)
    return false;
  if (TARGET_DLLIMPORT_DECL_ATTRIBUTES && SYMBOL_REF_DLLIMPORT_P (op))
    return false;
  return true;
})

even if cgraph_node::get (cfun->decl)->only_called_directly_p () returns
false, the fuction may still be called indirectly.  Copy the logic from
constant_call_address_operand to rest_of_insert_endbranch to insert ENDBR
at function entry if function will be called indirectly.

NB: gcc.target/i386/pr94417-2.c is updated to expect 4 ENDBRs, instead
of 2, since only GCC 10 has the fix for PR target/89355 not to insert
ENDBR after NOTE_INSN_DELETED_LABEL.

gcc/

	Backport from master
	PR target/94417
	* config/i386/i386.c (rest_of_insert_endbranch): Insert ENDBR at
	function entry if function will be called indirectly.

gcc/testsuite/

	Backport from master
	PR target/94417
	* gcc.target/i386/pr94417-1.c: New test.
	* gcc.target/i386/pr94417-2.c: Likewise.
	* gcc.target/i386/pr94417-3.c: Likewise.

(cherry picked from commit c5f3796539)
2020-04-17 15:23:50 -07:00
Kewen Lin 7bce1c7244 Fix PR94443 with gsi_insert_seq_before [PR94443]
This patch is to fix the stupid mistake by using
gsi_insert_seq_before instead of gsi_insert_before.

BTW, the regression testing on one x86_64 machine from CFarm is
unable to reveal it (I guess due to native arch sandybridge?), so I
specified additional option -march=znver2 and verified the coverage.

Bootstrapped/regtested on powerpc64le-linux-gnu (P9) and
x86_64-pc-linux-gnu, also verified the fail cases in related PRs.

Backport from mainline.

  2020-04-03  Kewen Lin  <linkw@gcc.gnu.org>

  gcc/
      PR tree-optimization/94443
      * tree-vect-loop.c (vectorizable_live_operation): Use
      gsi_insert_seq_before to replace gsi_insert_before.

  gcc/testsuite/
      PR tree-optimization/94443
      * gcc.dg/vect/pr94443.c: New test.
2020-04-17 02:51:40 -05:00
Kewen Lin a809efd70d Fix PR94043 by making vect_live_op generate lc-phi
As PR94043 shows, my commit r10-4524 exposed one issue in
vectorizable_live_operation, which inserts one extra BB
before the single exit, leading unexpected operand expansion
and unexpected loop depth assertion.  As Richi suggested,
this patch is to teach vectorizable_live_operation to
generate loop closed phi for vec_lhs, it looks like:
     loop;
     # lhs' = PHI <lhs>
=>
     loop;
     # vec_lhs' = PHI <vec_lhs>
     new_tree = BIT_FIELD_REF <vec_lhs', ...>;
     lhs' = new_tree;

I noticed that there are some SLP cases that have same lhs
and vec_lhs but different offsets, which can make us have
more PHIs for the same vec_lhs there.  But I think it would
be fine since only one of them is actually live, the others
should be eliminated by the following dce.  So the patch
doesn't check whether there is one phi for vec_lhs, just
create one directly instead.

Bootstrapped/regtested on powerpc64le-linux-gnu (LE) P8.

Backport from mainline.

  2020-04-01  Kewen Lin  <linkw@gcc.gnu.org>

  gcc/ChangeLog

      PR tree-optimization/94043
      * tree-vect-loop.c (vectorizable_live_operation): Generate loop-closed
      phi for vec_lhs and use it for lane extraction.

  gcc/testsuite/ChangeLog

      PR tree-optimization/94043
      * gfortran.dg/graphite/vect-pr94043.f90: New test.
2020-04-17 02:51:11 -05:00
GCC Administrator 3c5d3cc15a Daily bump. 2020-04-17 00:17:32 +00:00
Michael Meissner baf3a5a942 Fix target/94557 PowerPC regression on GCC 9 (variable vec_extract)
2020-04-16  Michael Meissner  <meissner@linux.ibm.com>

	PR target/94557
	* config/rs6000/rs6000.c (rs6000_adjust_vec_address): Fix
	regression caused by PR target/93932 backport.  Mask variable
	vector extract index so it does not go beyond the vector when
	extracting a vector element from memory.
2020-04-16 12:49:22 -04:00
Richard Biener 0f1cf13ece middle-end/94479 - fix gimplification of address
When gimplifying an address operand we may expose an indirect
ref via DECL_VALUE_EXPR for example.  This is dealt with in the
code already but it fails to consider that INDIRECT_REFs get
gimplified to MEM_REFs.

Fixed which makes the ICE observed on x86_64-netbsd go away.

2020-04-07  Richard Biener  <rguenther@suse.de>

	PR middle-end/94479
	* gimplify.c (gimplify_addr_expr): Also consider generated
	MEM_REFs.

	* gcc.dg/torture/pr94479.c: New testcase.
2020-04-16 14:29:39 +02:00
GCC Administrator d998a89f9c Daily bump. 2020-04-16 00:17:35 +00:00
Max Filippov 79b5967653 xtensa: backport fix for PR target/94584
Patterns zero_extendhisi2, zero_extendqisi2 and extendhisi2_internal can
load value from memory, but they don't treat volatile memory correctly.
Add %v1 before load instructions to emit 'memw' instruction when
-mserialize-volatile is in effect.

2020-04-15  Max Filippov  <jcmvbkbc@gmail.com>
gcc/
	* config/xtensa/xtensa.md (zero_extendhisi2, zero_extendqisi2)
	(extendhisi2_internal): Add %v1 before the load instructions.

gcc/testsuite/
	* gcc.target/xtensa/pr94584.c: New test.
2020-04-15 14:14:02 -07:00
Max Filippov 20c6c0c8b1 xtensa: backport fix for PR target/91880
Xtensa hwloop_optimize segfaults when zero overhead loop is about to be
inserted as the first instruction of the function.
Insert zero overhead loop instruction into new basic block before the
loop when basic block that precedes the loop is empty.

2020-04-15  Max Filippov  <jcmvbkbc@gmail.com>
gcc/
	* config/xtensa/xtensa.c (hwloop_optimize): Insert zero overhead
	loop instruction into new basic block before the loop when basic
	block that precedes the loop is empty.

gcc/testsuite/
	* gcc.target/xtensa/pr91880.c: New test case.
	* gcc.target/xtensa/xtensa.exp: New test suite.
2020-04-15 14:13:59 -07:00
Uros Bizjak 1eccf99556 i386: Require OPTION_MASK_ISA_SSE2 for __builtin_ia32_movq128 [PR94603]
PR target/94603
	* config/i386/i386-builtin.def (__builtin_ia32_movq128):
	Require OPTION_MASK_ISA_SSE2.

testsuite/ChangeLog:

	PR target/94603
	* gcc.target/i386/pr94603.c: New test.
2020-04-15 22:02:39 +02:00
GCC Administrator 54ab0a7d75 Daily bump. 2020-04-15 00:17:34 +00:00
Thomas König 7c94472580 Backport from trunk of the fix for PR 94270.
2020-04-14  Thomas Koenig  <tkoenig@gcc.gnu.org>

	Backport from trunk
	PR fortran/94270
	* gfortran.dg/warn_unused_dummy_argument_6.f90: New test.

2020-04-14  Thomas Koenig  <tkoenig@gcc.gnu.org>

	PR fortran/94270
	* gfortran.dg/warn_unused_dummy_argument_6.f90: New test.
2020-04-14 16:15:49 +02:00
GCC Administrator 12d027adaf Daily bump. 2020-04-14 00:17:35 +00:00
Thomas Schwinge a99a8431e6 Rename 'libgomp.oacc-c-c++-common/static-dynamic-lifetimes-*' to 'libgomp.oacc-c-c++-common/structured-dynamic-lifetimes-*' [PR92843]
Fix-up for commit be9862dd96 "Test cases for
mixed structured/dynamic data lifetimes with OpenACC [PR92843]": it's
"structured", not "static" data lifetimes/reference counters.

	libgomp/
	PR libgomp/92843
	* testsuite/libgomp.oacc-c-c++-common/static-dynamic-lifetimes-1-lib.c:
	Rename to...
	* testsuite/libgomp.oacc-c-c++-common/structured-dynamic-lifetimes-1-lib.c:
	... this.
	* testsuite/libgomp.oacc-c-c++-common/static-dynamic-lifetimes-1.c:
	Rename to...
	* testsuite/libgomp.oacc-c-c++-common/structured-dynamic-lifetimes-1.c:
	... this.
	* testsuite/libgomp.oacc-c-c++-common/static-dynamic-lifetimes-2-lib.c:
	Rename to...
	* testsuite/libgomp.oacc-c-c++-common/structured-dynamic-lifetimes-2-lib.c:
	... this.
	* testsuite/libgomp.oacc-c-c++-common/static-dynamic-lifetimes-2.c:
	Rename to...
	* testsuite/libgomp.oacc-c-c++-common/structured-dynamic-lifetimes-2.c:
	... this.
	* testsuite/libgomp.oacc-c-c++-common/static-dynamic-lifetimes-3-lib.c:
	Rename to...
	* testsuite/libgomp.oacc-c-c++-common/structured-dynamic-lifetimes-3-lib.c:
	... this.
	* testsuite/libgomp.oacc-c-c++-common/static-dynamic-lifetimes-3.c:
	Rename to...
	* testsuite/libgomp.oacc-c-c++-common/structured-dynamic-lifetimes-3.c:
	... this.
	* testsuite/libgomp.oacc-c-c++-common/static-dynamic-lifetimes-4-lib.c:
	Rename to...
	* testsuite/libgomp.oacc-c-c++-common/structured-dynamic-lifetimes-4-lib.c:
	... this.
	* testsuite/libgomp.oacc-c-c++-common/static-dynamic-lifetimes-4.c:
	Rename to...
	* testsuite/libgomp.oacc-c-c++-common/structured-dynamic-lifetimes-4.c:
	... this.
	* testsuite/libgomp.oacc-c-c++-common/static-dynamic-lifetimes-5-lib.c:
	Rename to...
	* testsuite/libgomp.oacc-c-c++-common/structured-dynamic-lifetimes-5-lib.c:
	... this.
	* testsuite/libgomp.oacc-c-c++-common/static-dynamic-lifetimes-5.c:
	Rename to...
	* testsuite/libgomp.oacc-c-c++-common/structured-dynamic-lifetimes-5.c:
	... this.
	* testsuite/libgomp.oacc-c-c++-common/static-dynamic-lifetimes-6-lib.c:
	Rename to...
	* testsuite/libgomp.oacc-c-c++-common/structured-dynamic-lifetimes-6-lib.c:
	... this.
	* testsuite/libgomp.oacc-c-c++-common/static-dynamic-lifetimes-6.c:
	Rename to...
	* testsuite/libgomp.oacc-c-c++-common/structured-dynamic-lifetimes-6.c:
	... this.
	* testsuite/libgomp.oacc-c-c++-common/static-dynamic-lifetimes-7-lib.c:
	Rename to...
	* testsuite/libgomp.oacc-c-c++-common/structured-dynamic-lifetimes-7-lib.c:
	... this.
	* testsuite/libgomp.oacc-c-c++-common/static-dynamic-lifetimes-7.c:
	Rename to...
	* testsuite/libgomp.oacc-c-c++-common/structured-dynamic-lifetimes-7.c:
	... this.
	* testsuite/libgomp.oacc-c-c++-common/static-dynamic-lifetimes-8-lib.c:
	Rename to...
	* testsuite/libgomp.oacc-c-c++-common/structured-dynamic-lifetimes-8-lib.c:
	... this.
	* testsuite/libgomp.oacc-c-c++-common/static-dynamic-lifetimes-8.c::
	Rename to...
	* testsuite/libgomp.oacc-c-c++-common/structured-dynamic-lifetimes-8.c:
	... this.

(cherry picked from commit af4c92573d)
2020-04-13 08:56:22 +02:00
GCC Administrator a2a0f3ee6f Daily bump. 2020-04-13 00:17:29 +00:00
GCC Administrator 557142474a Daily bump. 2020-04-12 00:17:30 +00:00
Uros Bizjak d2fee90546 i386: Fix REDUC_SSE_SMINMAX_MODE mode conditions.
V4SI, V8HI and V16QI modes of redux_<code>_scal_<mode> expander
expand with SSE2 instructions (PSRLDQ and PCMPGTx) so use
TARGET_SSE2 as relevant mode iterator codition.

	PR target/94494
	* config/i386/sse.md (REDUC_SSE_SMINMAX_MODE): Use TARGET_SSE2
	condition for V4SI, V8HI and V16QI modes.

testsuite/ChangeLog:

	PR target/94494
	* gcc.target/i386/pr94494.c: New test.
2020-04-11 13:25:51 +02:00
Uros Bizjak 59eddd9769 i386: Fix REDUC_SSE_SMINMAX_MODE mode conditions.
V4SI, V8HI and V16QI modes of redux_<code>_scal_<mode> expander
expand with SSE2 instructions (PSRLDQ and PCMPGTx) so use
TARGET_SSE2 as relevant mode iterator codition.

	PR target/94494
	* config/i386/sse.md (REDUC_SSE_SMINMAX_MODE): Use TARGET_SSE2
	condition for V4SI, V8HI and V16QI modes.

testsuite/ChangeLog:

	PR target/94494
	* gcc.target/i386/pr94494.c: New test.
2020-04-11 13:22:52 +02:00
GCC Administrator f41bd52147 Daily bump. 2020-04-11 00:17:29 +00:00
Julian Brown 3c7a476c5a Test cases for mixed structured/dynamic data lifetimes with OpenACC [PR92843]
libgomp/
	PR libgomp/92843
	* testsuite/libgomp.oacc-c-c++-common/static-dynamic-lifetimes-1-lib.c:
	New file.
	* testsuite/libgomp.oacc-c-c++-common/static-dynamic-lifetimes-1.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/static-dynamic-lifetimes-2-lib.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/static-dynamic-lifetimes-2.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/static-dynamic-lifetimes-3-lib.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/static-dynamic-lifetimes-3.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/static-dynamic-lifetimes-4-lib.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/static-dynamic-lifetimes-4.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/static-dynamic-lifetimes-5-lib.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/static-dynamic-lifetimes-5.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/static-dynamic-lifetimes-6-lib.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/static-dynamic-lifetimes-6.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/static-dynamic-lifetimes-7-lib.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/static-dynamic-lifetimes-7.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/static-dynamic-lifetimes-8-lib.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/static-dynamic-lifetimes-8.c:
	Likewise.

(cherry picked from commit be9862dd96)
2020-04-10 16:09:50 +02:00
Claudiu Zissulescu 3e84068b30 arc: Allow more ABIs in GLIBC_DYNAMIC_LINKER
Enable big-endian suffixed dynamic linker per glibc multi-abi support.

And to avoid a future churn and version pairingi hassles, also allow
arc700 although glibc for ARC currently doesn't support it.

gcc/
xxxx-xx-xx  Vineet Gupta <vgupta@synopsys.com>

       * config/arc/linux.h: GLIBC_DYNAMIC_LINKER support BE/arc700
2020-04-10 15:12:58 +03:00
GCC Administrator ffd2a2014d Daily bump. 2020-04-10 00:17:33 +00:00
Michael Meissner 892c755eae Backport PR target/93932 (variable vec_extract) to GCC 9
2020-04-09  Michael Meissner  <meissner@linux.ibm.com>

	Back port from trunk
	2020-02-26  Michael Meissner  <meissner@linux.ibm.com>

	PR target/93932
	* config/rs6000/vsx.md (vsx_extract_<mode>_var, VSX_D iterator):
	Split the insn into two parts.  This insn only does variable
	extract from a register.
	(vsx_extract_<mode>_var_load, VSX_D iterator): New insn, do
	variable extract from memory.
	(vsx_extract_v4sf_var): Split the insn into two parts.  This insn
	only does variable extract from a register.
	(vsx_extract_v4sf_var_load): New insn, do variable extract from
	memory.
	(vsx_extract_<mode>_var, VSX_EXTRACT_I iterator): Split the insn
	into two parts.  This insn only does variable extract from a
	register.
	(vsx_extract_<mode>_var_load, VSX_EXTRACT_I iterator): New insn,
	do variable extract from memory.
2020-04-09 12:25:05 -05:00
GCC Administrator f1a6a1e588 Daily bump. 2020-04-09 00:17:27 +00:00
GCC Administrator 1a2a0af530 Daily bump. 2020-04-08 00:17:30 +00:00
Will Schmidt 9a385bd124 rs6000 pragma fix backport from mainline to gcc-9
2020-04-07  Will Schmidt  <will_schmidt@vnet.ibm.com>

Backport from mainline.
    2020-03-23  Will Schmidt  <will_schmidt@vnet.ibm.com>

    * config/rs6000/rs6000-call.c altivec_init_builtins(): Remove
    code to skip defining builtins based on builtin_mask.

    * gcc.target/powerpc/pragma_power6.c: New.
    * gcc.target/powerpc/pragma_power7.c: New.
    * gcc.target/powerpc/pragma_power8.c: New.
    * gcc.target/powerpc/pragma_power9.c: New.
    * gcc.target/powerpc/pragma_misc9.c: New.
    * gcc.target/powerpc/vsu/pragma_misc9.c: New.
    * gcc.target/powerpc/vsu/vec-all-nez-7.c: Update.
    * gcc.target/powerpc/vsu/vec-any-eqz-7.c: Update.
2020-04-07 16:07:03 -05:00
Jakub Jelinek 14192f1ed4 i386: Fix V{64QI,32HI}mode constant permutations [PR94509]
The following testcases are miscompiled, because expand_vec_perm_pshufb
incorrectly thinks it can use vpshufb instruction for the permutations
when it can't.
The
          if (vmode == V32QImode)
            {
              /* vpshufb only works intra lanes, it is not
                 possible to shuffle bytes in between the lanes.  */
              for (i = 0; i < nelt; ++i)
                if ((d->perm[i] ^ i) & (nelt / 2))
                  return false;
            }
intra-lane check which is correct has been copied and adjusted for 64-byte
modes into:
          if (vmode == V64QImode)
            {
              /* vpshufb only works intra lanes, it is not
                 possible to shuffle bytes in between the lanes.  */
              for (i = 0; i < nelt; ++i)
                if ((d->perm[i] ^ i) & (nelt / 4))
                  return false;
            }
which is not correct, because 64-byte modes have 4 lanes rather than just
two and the above is only testing that the permutation grabs even lane elts
from even lanes and odd lane elts from odd lanes, but not that they are
from the same 256-bit half.

The following patch fixes it by using 3 * nelt / 4 instead of nelt / 4,
so we actually check the most significant 2 bits rather than just one.

2020-04-07  Jakub Jelinek  <jakub@redhat.com>

	PR target/94509
	* config/i386/i386-expand.c (expand_vec_perm_pshufb): Fix the check
	for inter-lane permutation for 64-byte modes.

	* gcc.target/i386/avx512bw-pr94509-1.c: New test.
	* gcc.target/i386/avx512bw-pr94509-2.c: New test.
2020-04-07 21:02:06 +02:00
Jakub Jelinek 0490cb0e61 openmp: Fix parallel master error recovery [PR94512]
We need to set OMP_PARALLEL_COMBINED only if the parsing of omp_master
succeeded, because otherwise there is no nested master construct in the
parallel.

2020-04-07  Jakub Jelinek  <jakub@redhat.com>

	PR c++/94512
	* c-parser.c (c_parser_omp_parallel): Set OMP_PARALLEL_COMBINED
	if c_parser_omp_master succeeded.

	* parser.c (cp_parser_omp_parallel): Set OMP_PARALLEL_COMBINED
	if cp_parser_omp_master succeeded.

	* g++.dg/gomp/pr94512.C: New test.
2020-04-07 21:01:46 +02:00
Jakub Jelinek 7f3ac38b3c aarch64: Fix {ash[lr],lshr}<mode>3 expanders [PR94488]
The following testcase ICEs on aarch64 apparently since the introduction of
the aarch64 port.  The reason is that the {ashl,ashr,lshr}<mode>3 expanders
completely unnecessarily FAIL; if operands[2] is something other than
a CONST_INT or REG or MEM and the middle-end code can't cope with the
pattern giving up in these cases.  All the expanders use general_operand
predicate for the shift amount operand, but then have just a special case
for CONST_INT (if in-bound, emit an immediate shift, otherwise force into
REG), or MEM (force into REG), or REG (that is the case it handles).
In the testcase, operands[2] is a lowpart SUBREG of a REG, which is valid
general_operand.
I don't see any reason what is magic about MEMs that it should be forced
into REG and others like SUBREGs that it shouldn't, there isn't even a
reason to check for !REG_P because force_reg will do nothing if the operand
is already a REG, and otherwise can handle general_operand just fine.

2020-04-07  Jakub Jelinek  <jakub@redhat.com>

	PR target/94488
	* config/aarch64/aarch64-simd.md (ashl<mode>3, lshr<mode>3,
	ashr<mode>3): Force operands[2] into reg whenever it is not CONST_INT.
	Assume it is a REG after that instead of testing it and doing FAIL
	otherwise.  Formatting fix.

	* gcc.c-torture/compile/pr94488.c: New test.
2020-04-07 21:01:46 +02:00
Jakub Jelinek b5039b7259 debug: Improve debug info of c++14 deduced return type [PR94459]
On the following testcase, in gdb ptype S<long>::m1 prints long as return
type, but all the other methods show void instead.
PR53756 added code to add_type_attribute if the return type is
auto/decltype(auto), but we actually should look through references,
pointers and qualifiers.
Haven't included there DW_TAG_atomic_type, because I think at least ATM
one can't use that in C++.  Not sure about DW_TAG_array_type or what else
could be deduced.

> http://eel.is/c++draft/dcl.spec.auto#3 says it has to appear as a
> decl-specifier.
>
> http://eel.is/c++draft/temp.deduct.type#8 lists the forms where a template
> argument can be deduced.
>
> Looks like you are missing arrays, pointers to members, and function return
> types.

2020-04-04  Hannes Domani  <ssbssa@yahoo.de>
	    Jakub Jelinek  <jakub@redhat.com>

	PR debug/94459
	* dwarf2out.c (gen_subprogram_die): Look through references, pointers,
	arrays, pointer-to-members, function types and qualifiers when
	checking if in-class DIE had an 'auto' or 'decltype(auto)' return type
	to emit type again on definition.

	* g++.dg/debug/pr94459.C: New test.

Co-Authored-By: Hannes Domani <ssbssa@yahoo.de>
2020-04-07 21:01:40 +02:00
Jakub Jelinek d1371dbe12 openmp: Fix ICE on #pragma omp parallel master in template [PR94477]
The following testcase ICEs, because for parallel combined with some
other construct we initialize the omp_parallel_combined_clauses pointer
and expect the construct combined with it to clear it after it no longer
needs it, but OMP_MASTER didn't do that.

2020-04-04  Jakub Jelinek  <jakub@redhat.com>

	PR c++/94477
	* pt.c (tsubst_expr) <case OMP_MASTER>: Clear
	omp_parallel_combined_clauses.

	* g++.dg/gomp/pr94477.C: New test.
2020-04-07 21:01:13 +02:00
Jakub Jelinek dbff182984 i386: Fix vph{add,subs?}[wd] 256-bit AVX2 RTL patterns [PR94460]
The following testcase is miscompiled, because the AVX2 patterns don't
describe correctly what the insn does.  E.g. vphaddd with %ymm* operands
(the second pattern) instruction as per:
https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_mm256_hadd_epi32&expand=2941
does { a0+a1, a2+a3, b0+b1, b2+b3, a4+a5, a6+a7, b4+b5, b6+b7 }
but our RTL pattern did
     { a0+a1, a2+a3, a4+a5, a6+a7, b0+b1, b2+b3, b4+b5, b6+b7 }
where the first and last 64 bits are the same and two middle 64 bits
swapped.
https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_mm256_hadd_epi16&expand=2939
similarly, insn does:
     { a0+a1, a2+a3, a4+a5, a6+a7, b0+b1, b2+b3, b4+b5, b6+b7,
       a8+a9, a10+a11, a12+a13, a14+a15, b8+b9, b10+b11, b12+b13, b14+b15 }
but RTL pattern did
     { a0+a1, a2+a3, a4+a5, a6+a7, a8+a9, a10+a11, a12+a13, a14+a15,
       b0+b1, b2+b3, b4+b5, b6+b7, b8+b9, b10+b11, b12+b13, b14+b15 }
again, first and last 64 bits are the same and the two middle 64 bits
swapped.

2020-04-03  Jakub Jelinek  <jakub@redhat.com>

	PR target/94460
	* config/i386/sse.md (avx2_ph<plusminus_mnemonic>wv16hi3,
	avx2_ph<plusminus_mnemonic>dv8si3): Fix up RTL pattern to do
	second half of first lane from first lane of second operand and
	first half of second lane from second lane of first operand.

	* gcc.target/i386/avx2-pr94460.c: New test.
2020-04-07 21:01:13 +02:00
Jakub Jelinek 4486a537f1 objsz: Don't call replace_uses_by on SSA_NAME_OCCURS_IN_ABNORMAL_PHI [PR94423]
The following testcase ICEs because the objsz pass calls replace_uses_by
on SSA_NAME_OCCURS_IN_ABNORMAL_PHI SSA_NAME.  The following patch instead
of that calls replace_call_with_value, which will turn it into
  xyz_123(ab) = 234;

2020-04-01  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/94423
	* tree-object-size.c (pass_object_sizes::execute): Don't call
	replace_uses_by for SSA_NAME_OCCURS_IN_ABNORMAL_PHI lhs, instead
	call replace_call_with_value.

	* gcc.dg/ubsan/pr94423.c: New test.
2020-04-07 21:01:06 +02:00
Jakub Jelinek 8f99f9e6cc fold-const: Fix division folding with vector operands [PR94412]
The following testcase is miscompiled since 4.9, we treat unsigned
vector types as if they were signed and "optimize" negations across it.

2020-03-31  Marc Glisse  <marc.glisse@inria.fr>
	    Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/94412
	* fold-const.c (fold_binary_loc) <case TRUNC_DIV_EXPR>: Use
	ANY_INTEGRAL_TYPE_P instead of INTEGRAL_TYPE_P.

	* gcc.c-torture/execute/pr94412.c: New test.

Co-authored-by: Marc Glisse <marc.glisse@inria.fr>
2020-04-07 21:00:35 +02:00
Jakub Jelinek a6bf0e5fb1 c++: Fix handling of internal fn calls in statement expressions [PR94385]
The following testcase ICEs, because the FE when processing the statement
expression changes the .VEC_CONVERT internal fn CALL_EXPR into .PHI call.
That is because the internal fn call is recorded in the base.u.ifn
field, which overlaps base.u.bits.lang_flag_1 which is used for
STMT_IS_FULL_EXPR_P, so this essentially does ifn |= 2 on little-endian.
STMT_IS_FULL_EXPR_P bit is used in:
cp-gimplify.c-  if (STATEMENT_CODE_P (code))
cp-gimplify.c-    {
cp-gimplify.c-      saved_stmts_are_full_exprs_p = stmts_are_full_exprs_p ();
cp-gimplify.c-      current_stmt_tree ()->stmts_are_full_exprs_p
cp-gimplify.c:        = STMT_IS_FULL_EXPR_P (*expr_p);
cp-gimplify.c-    }
and
pt.c-  if (STATEMENT_CODE_P (TREE_CODE (t)))
pt.c:    current_stmt_tree ()->stmts_are_full_exprs_p = STMT_IS_FULL_EXPR_P (t);
so besides being wrong on some other codes, it actually isn't beneficial at
all to set it on anything else, so the following patch restricts it to
trees with STATEMENT_CODE_P TREE_CODE.

2020-03-30  Jakub Jelinek  <jakub@redhat.com>

	PR c++/94385
	* semantics.c (add_stmt): Only set STMT_IS_FULL_EXPR_P on trees with
	STATEMENT_CODE_P code.

	* c-c++-common/pr94385.c: New test.
2020-04-07 21:00:35 +02:00
Jakub Jelinek 57e276f3e3 Fix vextract* masked patterns [PR93069]
The AVX512F documentation clearly states that in instructions where the
destination is a memory only merging-masking is possible, not zero-masking,
and the assembler enforces that.

The testcase in this patch fails to assemble because of
Error: unsupported masking for `vextracti32x8'
on
        vextracti32x8   $0x0, %zmm1, -64(%rsp){%k1}{z}
For the vector extraction patterns, we apparently have 7 *_maskm patterns
that only accept memory destinations and rtx_equal_p merge-masking source
for it, 7 *<mask_name> corresponding patterns that allow memory destination
only for the non-masked cases (through <store_mask_constraint>), then 2
*<mask_name> patterns (lo ssehalf V16FI and lo ssehalf VI8F_256 ones) which
do allow memory destination even for masked cases and are the cause of the
testsuite failure, because we must not allow C constraint if the destination
is m, and finally one pair of patterns (separate * and *_mask, hi ssehalf
VI4F_256), which has another issue (for which I don't have a testcase
though), where if it would match zero-masking with register destination,
it wouldn't emit the needed {z} into assembly.
The attached patch fixes those 3 issues only, perhaps more suitable for
backporting.

2020-03-30  Jakub Jelinek  <jakub@redhat.com>

	PR target/93069
	* config/i386/sse.md (vec_extract_lo_<mode><mask_name>): Use
	<store_mask_constraint> instead of m in output operand constraint.
	(vec_extract_hi_<mode><mask_name>): Use <mask_operand2> instead of
	%{%3%}.

	* gcc.target/i386/avx512vl-pr93069.c: New test.
	* gcc.dg/vect/pr93069.c: New test.
2020-04-07 21:00:28 +02:00
Jakub Jelinek aa9c08ef97 reassoc: Fix -fcompare-debug bug in reassociate_bb [PR94329]
The following testcase FAILs with -fcompare-debug, because reassociate_bb
mishandles the case when the last stmt in a bb has zero uses.  In that case
reassoc_remove_stmt (like gsi_remove) moves the iterator to the next stmt,
i.e. gsi_end_p is true, which means the code sets the iterator back to
gsi_last_bb.  The problem is that the for loop does gsi_prev on that before
handling the next statement, which means the former penultimate stmt, now
last one, is not processed by reassociate_bb.
Now, with -g, if there is at least one debug stmt at the end of the bb,
reassoc_remove_stmt moves the iterator to that following debug stmt and we
just do gsi_prev and continue with the former penultimate non-debug stmt,
now last non-debug stmt.

The following patch fixes that by not doing the gsi_prev in this case; there
are too many continue; cases, so I didn't want to copy over the gsi_prev to
all of them, so this patch uses a bool for that instead.  The second
gsi_end_p check isn't needed anymore, because when we don't do the
undesirable gsi_prev after gsi = gsi_last_bb, the loop !gsi_end_p (gsi)
condition will catch the removal of the very last stmt from a bb.

2020-03-28  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/94329
	* tree-ssa-reassoc.c (reassociate_bb): When calling reassoc_remove_stmt
	on the last stmt in a bb, make sure gsi_prev isn't done immediately
	after gsi_last_bb.

	* gfortran.dg/pr94329.f90: New test.
2020-04-07 20:59:45 +02:00
Jakub Jelinek 56407bab53 varasm: Fix output_constructor where a RANGE_EXPR index needs to skip some elts [PR94303]
The following testcase is miscompiled, because output_constructor doesn't
output the initializer correctly.  The FE creates {[1...2] = 9} in this
case, and we emit .long 9; long 9; .zero 8 instead of the expected
.zero 8; .long 9; .long 9.  If the CONSTRUCTOR is {[1] = 9, [2] = 9},
output_constructor_regular_field has code to notice that the current
location (local->total_bytes) is smaller than the location we want to write
to (1*sizeof(elt)) and will call assemble_zeros to skip those.  But
RANGE_EXPRs are handled by a different function which didn't do this,
so for RANGE_EXPRs we emitted them properly only if local->total_bytes
was always equal to the location where the RANGE_EXPR needs to start.

2020-03-25  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/94303
	* varasm.c (output_constructor_array_range): If local->index
	RANGE_EXPR doesn't start at the current location in the constructor,
	skip needed number of bytes using assemble_zeros or assert we don't
	go backwards.

	PR middle-end/94303
	* g++.dg/torture/pr94303.C: New test.
2020-04-07 20:59:37 +02:00
Jakub Jelinek 8ea7970c49 if-conv: Delete dead stmts backwards in ifcvt_local_dce [PR94283]
> > This patch caused:
> >
> > gcc /home/marxin/Programming/gcc/gcc/testsuite/gcc.c-torture/compile/990625-2.c -O3 -g -fno-tree-dce -c
> > during GIMPLE pass: ifcvt
> > /home/marxin/Programming/gcc/gcc/testsuite/gcc.c-torture/compile/990625-2.c: In function ‘broken030599’:
> > /home/marxin/Programming/gcc/gcc/testsuite/gcc.c-torture/compile/990625-2.c:2:1: internal compiler error: Segmentation fault
>
> Likely
>
>   /* Delete dead statements.  */
>   gsi = gsi_start_bb (bb);
>   while (!gsi_end_p (gsi))
>     {
>
> needs to instead work back-to-front for debug stmt adjustment to work

Indeed, that seems to work.

2020-03-25  Richard Biener  <rguenther@suse.de>
	    Jakub Jelinek  <jakub@redhat.com>

	PR debug/94283
	* tree-if-conv.c (ifcvt_local_dce): Delete dead statements backwards.

	* gcc.dg/pr94283.c: New test.

Co-authored-by: Richard Biener <rguenther@suse.de>
2020-04-07 20:57:48 +02:00
Jakub Jelinek 4dcfd4e56b if-conv: Fix -fcompare-debug bugs in ifcvt_local_dce [PR94283]
The following testcase shows -fcompare-debug bugs in ifcvt_local_dce,
where the decisions what statements are needed is based also on debug stmt
operands, which is wrong.
So, this patch makes sure to never add debug stmt to the worklist, or never
add an assign to worklist just because it is used in a debug stmt in another
bb.

2020-03-24  Jakub Jelinek  <jakub@redhat.com>

	PR debug/94283
	* tree-if-conv.c (ifcvt_local_dce): For gimple debug stmts, just set
	GF_PLF_2, but don't add them to worklist.  Don't add an assigment to
	worklist or set GF_PLF_2 just because it is used in a debug stmt in
	another bb.  Formatting improvements.

	* gcc.target/i386/pr94283.c: New test.
2020-04-07 20:57:37 +02:00
Jakub Jelinek 4ac9ab60f0 cgraphunit: Avoid code generation differences based on -w/TREE_NO_WARNING [PR94277]
The following testcase FAILs with -fcompare-debug, but not because -g vs.
-g0 would make a difference, but because the second compilation is done with
-w in order not to emit warnings twice and -w seems to affect the *.gkd dump
content.
This is because TREE_NO_WARNING flag, or warn_unused_function does affect
not just whether a warning/pedwarn is printed, but also whether we set
TREE_PUBLIC on such decls.
The following patch makes sure we set it regardless of anything warning
related (TREE_NO_WARNING or warn_unused_function).

2020-03-24  Jakub Jelinek  <jakub@redhat.com>

	PR debug/94277
	* cgraphunit.c (check_global_declaration): For DECL_EXTERNAL and
	non-TREE_PUBLIC non-DECL_ARTIFICIAL FUNCTION_DECLs, set TREE_PUBLIC
	regardless of whether TREE_NO_WARNING is set on it or whether
	warn_unused_function is true or not.

	* gcc.dg/pr94277.c: New test.
2020-04-07 20:55:11 +02:00
Jakub Jelinek f83c2d2991 c: Fix up cfun->function_end_locus on invalid function bodies [PR94239]
Unfortunately the patch broke
+FAIL: gcc.dg/pr20245-1.c (internal compiler error)
+FAIL: gcc.dg/pr20245-1.c (test for excess errors)
+FAIL: gcc.dg/pr28419.c (internal compiler error)
+FAIL: gcc.dg/pr28419.c (test for excess errors)
on some targets (and under valgrind on the rest of them).

Those functions don't have the opening { and so c_parser_compound_statement
returned error_mark_node before initializing *endlocp.
So, either we can initialize it in that case too:
--- gcc/c/c-parser.c    2020-03-20 22:09:39.659411721 +0100
+++ gcc/c/c-parser.c    2020-03-21 09:36:44.455705261 +0100
@@ -5611,6 +5611,8 @@ c_parser_compound_statement (c_parser *p
         if we have just prepared to enter a function body.  */
       stmt = c_begin_compound_stmt (true);
       c_end_compound_stmt (brace_loc, stmt, true);
+      if (endlocp)
+       *endlocp = brace_loc;
       return error_mark_node;
     }
   stmt = c_begin_compound_stmt (true);
or perhaps simpler initialize it to the function_start_locus at the
beginning and have those functions without { have function_start_locus ==
function_end_locus like the __GIMPLE functions (where propagating the
closing } seemed too difficult).

2020-03-23  Jakub Jelinek  <jakub@redhat.com>

	PR gcov-profile/94029
	PR c/94239
	* c-parser.c (c_parser_declaration_or_fndef): Initialize endloc to
	the function_start_locus location.  Don't do that afterwards for the
	__GIMPLE body parsing.
2020-04-07 20:55:11 +02:00
Jakub Jelinek 827e5af19a c: Fix up cfun->function_end_locus from the C FE [PR94029]
On the following testcase we ICE because while
      DECL_STRUCT_FUNCTION (current_function_decl)->function_start_locus
        = c_parser_peek_token (parser)->location;
and similarly DECL_SOURCE_LOCATION (fndecl) is set from some token's
location, the end is set as:
  /* Store the end of the function, so that we get good line number
     info for the epilogue.  */
  cfun->function_end_locus = input_location;
and the thing is that input_location is only very rarely set in the C FE
(the primary spot that changes it is the cb_line_change/fe_file_change).
Which means, e.g. for pretty much all C functions that are on a single line,
function_start_locus column is > than function_end_locus column, and the
testcase even has smaller line in function_end_locus because cb_line_change
isn't performed while parsing multi-line arguments of a function-like macro.

Attached are two possible fixes to achieve what the C++ FE does, in
particular that cfun->function_end_locus is the locus of the closing } of
the function.  The first one updates input_location when we see a closing }
of a compound statement (though any, not just the function body) and thus
input_location in the finish_function call is what we need.
The second instead propagates the location_t from the parsing of the
outermost compound statement (the function body) to finish_function.
The second one is this version.

2020-03-19  Jakub Jelinek  <jakub@redhat.com>

	PR gcov-profile/94029
	* c-tree.h (finish_function): Add location_t argument defaulted to
	input_location.
	* c-parser.c (c_parser_compound_statement): Add endlocp argument and
	set it to the locus of closing } if non-NULL.
	(c_parser_compound_statement_nostart): Return locus of closing }.
	(c_parser_parse_rtl_body): Likewise.
	(c_parser_declaration_or_fndef): Propagate locus of closing } to
	finish_function.
	* c-decl.c (finish_function): Add end_loc argument, use it instead of
	input_location to set function_end_locus.

	* gcc.misc-tests/gcov-pr94029.c: New test.
2020-04-07 20:55:10 +02:00
Jakub Jelinek 484206967f c++: Fix up handling of captured vars in lambdas in OpenMP clauses [PR93931]
Without the parser.c change we were ICEing on the testcase, because while the
uses of the captured vars inside of the constructs were replaced with capture
proxy decls, we didn't do that for decls in OpenMP clauses.

With that fixed, we don't ICE anymore, but the testcase is miscompiled and FAILs
at runtime.  This is because the capture proxy decls have DECL_VALUE_EXPR and
during gimplification we were gimplifying those to their DECL_VALUE_EXPRs.
That is fine for shared vars, but for privatized ones we must not do that.
So that is what the cp-gimplify.c changes do.  Had to add a DECL_CONTEXT check
before calling is_capture_proxy because some VAR_DECLs don't have DECL_CONTEXT
set (yet) and is_capture_proxy relies on that being non-NULL always.

2020-03-19  Jakub Jelinek  <jakub@redhat.com>

	PR c++/93931
	* parser.c (cp_parser_omp_var_list_no_open): Call process_outer_var_ref
	on outer_automatic_var_p decls.
	* cp-gimplify.c (cxx_omp_disregard_value_expr): Return true also for
	capture proxy decls.

	* testsuite/libgomp.c++/pr93931.C: New test.
2020-04-07 20:55:10 +02:00
Jakub Jelinek 8db876e9c0 phiopt: Avoid -fcompare-debug bug in phiopt [PR94211]
Two years ago, I've added support for up to 2 simple preparation statements
in value_replacement, but the
-      && estimate_num_insns (assign, &eni_time_weights)
+      && estimate_num_insns (bb_seq (middle_bb), &eni_time_weights)
change, meant that we compute the cost of all those statements rather than
just the single assign that has been the single supported non-debug
statement in the bb before, doesn't do what I thought would do, gimple_seq
is just gimple * and thus it can't be really overloaded depending on whether
we pass a single gimple * or a whole sequence.  Which means in the last
two years it doesn't count all the statements, but only the first one.
With -g that happens to be a DEBUG_STMT, or it could be e.g. the first
preparation statement which could be much cheaper than the actual assign.

2020-03-19  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/94211
	* tree-ssa-phiopt.c (value_replacement): Use estimate_num_insns_seq
	instead of estimate_num_insns for bb_seq (middle_bb).  Rename
	emtpy_or_with_defined_p variable to empty_or_with_defined_p, adjust
	all uses.

	* gcc.dg/pr94211.c: New test.
2020-04-07 20:55:10 +02:00
Jakub Jelinek 87ce34fa00 c: Handle C_TYPE_INCOMPLETE_VARS even for ENUMERAL_TYPEs [PR94172]
The following testcases ICE, because they contain extern variable
declarations with incomplete enum types that is later completed and after
that those variables are accessed.  The ICEs are because the vars then may have
incorrect DECL_MODE etc., e.g. in the first case the var has SImode
DECL_MODE (the guessed mode for the enum), but the enum then actually has
DImode because its enumerators don't fit into unsigned int.

The following patch fixes it by using C_TYPE_INCOMPLETE_VARS not just on
incomplete struct/union types, but also incomplete enum types.
TYPE_VFIELD can't be used as it is TYPE_MIN_VALUE on ENUMERAL_TYPE,
thankfully TYPE_LANG_SLOT_1 has been used in the C FE only on
FUNCTION_TYPEs.

2020-03-17  Jakub Jelinek  <jakub@redhat.com>

	PR c/94172
	* c-tree.h (C_TYPE_INCOMPLETE_VARS): Define to TYPE_LANG_SLOT_1
	instead of TYPE_VFIELD, and support it on {RECORD,UNION,ENUMERAL}_TYPE.
	(TYPE_ACTUAL_ARG_TYPES): Check that it is only used on FUNCTION_TYPEs.
	* c-decl.c (pushdecl): Push C_TYPE_INCOMPLETE_VARS also to
	ENUMERAL_TYPEs.
	(finish_incomplete_vars): New function, moved from finish_struct.  Use
	relayout_decl instead of layout_decl.
	(finish_struct): Remove obsolete comment about C_TYPE_INCOMPLETE_VARS
	being TYPE_VFIELD.  Use finish_incomplete_vars.
	(finish_enum): Clear C_TYPE_INCOMPLETE_VARS.  Call
	finish_incomplete_vars.
	* c-typeck.c (c_build_qualified_type): Clear C_TYPE_INCOMPLETE_VARS
	also on ENUMERAL_TYPEs.

	* gcc.dg/pr94172-1.c: New test.
	* gcc.dg/pr94172-2.c: New test.
2020-04-07 20:55:09 +02:00
Jakub Jelinek 980a7a0be5 c++: Fix parsing of invalid enum specifiers [PR90995]
The testcase shows some accepts-invalid (the ones without alignas) and
ice-on-invalid-code (the ones with alignas) cases.
If the enum doesn't have an underlying type and is not a definition,
the caller retries to parse it as elaborated type specifier.
E.g. for enum struct S s it will then pedwarn that elaborated type specifier
shouldn't have the struct/class keywords.
The problem is if the enum specifier is not followed by { when it has
underlying type.  In that case we have already called
cp_parser_parse_definitely to end the tentative parsing started at the
beginning of cp_parser_enum_specifier.  But the
cp_parser_error (parser, "expected %<;%> or %<{%>");
doesn't emit any error because the whole function is called from yet another
tentative parse and the caller starts parsing the elaborated type
specifier where the cp_parser_enum_specifier stopped (i.e. after the
underlying type token(s)).  The ultimate caller than commits the tentative
parsing (and even if it wouldn't, it wouldn't know what kind of error
to report).  I think after seeing enum {,struct,class} : type not being
followed by { or ;, there is no reason not to report it right away, as it
can't be valid C++, which is what the patch does.  Not sure if we shouldn't
also return error_mark_node instead of NULL_TREE, so that the caller doesn't
try to parse it as elaborated type specifier (the patch doesn't do that
right now).

Furthermore, while reading the code, I've noticed that
parser->colon_corrects_to_scope_p is saved and set to false at the start
of the function, but not restored back in some cases.  Don't have a testcase
where this would be a problem, but it just seems wrong.  Either we can in
the two spots replace return NULL_TREE; with { type = NULL_TREE; goto out; }
or we could perhaps abuse warning_sentinel or create a special class with
dtor to clean the flag up.

And lastly, I've fixed some formatting issues in the function while reading
it.

2020-03-17  Jakub Jelinek  <jakub@redhat.com>

	PR c++/90995
	* parser.c (cp_parser_enum_specifier): Use temp_override for
	parser->colon_corrects_to_scope_p, replace goto out with return.
	If scoped enum or enum with underlying type is not followed by
	{ or ;, call cp_parser_commit_to_tentative_parse before calling
	cp_parser_error and make sure to return error_mark_node instead of
	NULL_TREE.  Formatting fixes.

	* g++.dg/cpp0x/enum40.C: New test.
2020-04-07 20:55:09 +02:00
Kyrylo Tkachov 470626394a [AArch64] PR target/94518: Fix memmodel index in aarch64_store_exclusive_pair
2020-04-07  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

        PR target/94518
        2019-09-23  Richard Sandiford  <richard.sandiford@arm.com>

        * config/aarch64/atomics.md (aarch64_store_exclusive_pair): Fix
        memmodel index.
2020-04-07 18:10:45 +01:00