Commit Graph

180811 Commits

Author SHA1 Message Date
Andrea Corallo
d65303b699 arm: Add vst1_lane_bf16 + vst1q_lane_bf16 intrinsics
gcc/ChangeLog

2020-10-23  Andrea Corallo  <andrea.corallo@arm.com>

	* config/arm/arm_neon.h (vst1_lane_bf16, vst1q_lane_bf16): Add
	intrinsics.
	* config/arm/arm_neon_builtins.def (STORE1LANE): Add v4bf, v8bf.

gcc/testsuite/ChangeLog

2020-10-23  Andrea Corallo  <andrea.corallo@arm.com>

	* gcc.target/arm/simd/vst1_lane_bf16_1.c: New testcase.
	* gcc.target/arm/simd/vstq1_lane_bf16_indices_1.c: Likewise.
	* gcc.target/arm/simd/vst1_lane_bf16_indices_1.c: Likewise.
2020-11-03 14:21:27 +01:00
Andrea Corallo
c9a0276840 arm: Add vld1_lane_bf16 + vld1q_lane_bf16 intrinsics
gcc/ChangeLog

2020-10-21  Andrea Corallo  <andrea.corallo@arm.com>

	* config/arm/arm_neon_builtins.def: Add to LOAD1LANE v4bf, v8bf.
	* config/arm/arm_neon.h (vld1_lane_bf16, vld1q_lane_bf16): Add
	intrinsics.

gcc/testsuite/ChangeLog

2020-10-21  Andrea Corallo  <andrea.corallo@arm.com>

	* gcc.target/arm/simd/vld1_lane_bf16_1.c: New testcase.
	* gcc.target/arm/simd/vld1_lane_bf16_indices_1.c: Likewise.
	* gcc.target/arm/simd/vld1q_lane_bf16_indices_1.c: Likewise.
2020-11-03 14:19:52 +01:00
Nathan Sidwell
444655b6f0 c++: cp_tree_equal cleanups
A couple of small fixes.  I noticed bind_template_template_parms was
not marking the parm a template parm (this broke some module
handling).  Debugging CALL_EXPR comparisons led me to refactor
cp_tree_equal's CALL_EXPR code (and my recent fix to debug printing of
same).  Finally TREE_VECS are best compared by comp_template_args.  I
recall that last piece being a left over from fixes during gcc-10.
I've been using it on the modules branch since then.

	gcc/cp/
	* tree.c (bind_template_template_parm): Mark the parm as a
	template parm.
	(cp_tree_equal): Refactor CALL_EXPR.  Use comp_template_args for
	TREE_VECs.
2020-11-03 05:16:31 -08:00
Nathan Sidwell
fbc3f84743 c++: rtti cleanups
Here are a few cleanups from the modules branch.  Generally some RAII,
and a bit of lazy namespace pushing.

	gcc/cp/
	* rtti.c (init_rtti_processing): Move var decl to its init.
	(get_tinfo_decl): Likewise.  Break out creation to called helper
	...
	(get_tinfo_decl_direct): ... here.
	(build_dynamic_cast_1): Move var decls to their initializers.
	(tinfo_base_init): Set decl's location to BUILTINS_LOCATION.
	(get_tinfo_desc): Only push ABI namespace when needed.  Set type's
	context.
2020-11-03 05:16:31 -08:00
Nathan Sidwell
918e8b10a7 libcpp: dependency emission tidying
This patch cleans up the interface to the dependency generation a
little.  We now only check the option in one place, and the
cpp_get_deps function returns nullptr if there are no dependencies.  I
also reworded the -MT and -MQ help text to be make agnostic -- as
there are ideas about emitting, say, JSON.

	libcpp/
	* include/mkdeps.h: Include cpplib.h
	(deps_write): Adjust first parm type.
	* mkdeps.c: Include internal.h
	(make_write): Adjust first parm type.  Check phony option
	directly.
	(deps_write): Adjust first parm type.
	* init.c (cpp_read_main_file): Use get_deps.
	* directives.c (cpp_get_deps): Check option before initializing.
	gcc/c-family/
	* c.opt (MQ,MT): Reword description to be make-agnostic.
	gcc/fortran/
	* cpp.c (gfc_cpp_add_dep): Only add dependency if we're recording
	them.
	(gfc_cpp_init): Likewise for target.
2020-11-03 05:16:19 -08:00
Dennis Zhang
f7d6961126 aarch64: ACLE intrinsics convert BF16 to Float32
This patch enables intrinsics to convert BFloat16 scalar and vector
operands to Float32 modes. The intrinsics are implemented by shifting
each BFloat16 item 16 bits to the left using shl/shll/shll2 instructions.
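
The shift-based conversion described above can be sketched in portable C.
This is only an illustration of the idea, not the code added by the patch:
the helper name bf16_bits_to_f32 is hypothetical, and the sketch assumes
that float is IEEE 754 binary32 and that the BFloat16 value is supplied as
its raw 16-bit pattern.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of the patch: widen the raw bits of one
   BFloat16 value to a float by moving them into the top 16 bits of a
   32-bit word.  Assumes float is IEEE 754 binary32.  */
static float bf16_bits_to_f32(uint16_t bf16_bits)
{
    uint32_t wide = (uint32_t)bf16_bits << 16;  /* the "shift left by 16" step */
    float result;

    assert(sizeof result == sizeof wide);
    memcpy(&result, &wide, sizeof result);      /* bit copy, no value conversion */
    return result;
}
```

The conversion is exact because a BFloat16 value has the same sign and
exponent layout as the upper half of a binary32 value; the shift only
appends sixteen zero mantissa bits.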

gcc/ChangeLog:

2020-11-03  Dennis Zhang  <dennis.zhang@arm.com>

	* config/aarch64/aarch64-simd-builtins.def(vbfcvt): New entry.
	(vbfcvt_high, bfcvt): Likewise.
	* config/aarch64/aarch64-simd.md(aarch64_vbfcvt<mode>): New entry.
	(aarch64_vbfcvt_highv8bf, aarch64_bfcvtsf): Likewise.
	* config/aarch64/arm_bf16.h (vcvtah_f32_bf16): New intrinsic.
	* config/aarch64/arm_neon.h (vcvt_f32_bf16): Likewise.
	(vcvtq_low_f32_bf16, vcvtq_high_f32_bf16): Likewise.

gcc/testsuite/ChangeLog

	* gcc.target/aarch64/advsimd-intrinsics/bfcvt-compile.c
	(test_vcvt_f32_bf16, test_vcvtq_low_f32_bf16): New tests.
	(test_vcvtq_high_f32_bf16, test_vcvth_f32_bf16): Likewise.
2020-11-03 13:00:51 +00:00
Richard Biener
9d1b813d0f bootstrap/97666 - fix array of bool allocation
This fixes the bad assumption that sizeof (bool) == 1.
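
As a rough illustration of the kind of fix described, the sketch below
scales both the allocation and the clearing of a bool array by
sizeof (bool) instead of treating the element count as a byte count.
The helper name alloc_cleared_flags is hypothetical, and the code in
tree-vect-slp.c may allocate its array differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: allocate COUNT bool flags and clear them all.
   Both sizes are scaled by sizeof (bool), so the code stays correct on
   hosts where bool is wider than one byte.  */
static bool *alloc_cleared_flags(size_t count)
{
    assert(count > 0);
    bool *flags = malloc(count * sizeof (bool));
    if (flags != NULL)
        memset(flags, 0, count * sizeof (bool));
    return flags;
}
```

Passing just the element count to malloc or memset would happen to work
where sizeof (bool) is 1 and would under-allocate or under-clear elsewhere.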

2020-11-03  Richard Biener  <rguenther@suse.de>

	PR bootstrap/97666
	* tree-vect-slp.c (vect_build_slp_tree_2): Scale
	allocation of skip_args by sizeof (bool).
2020-11-03 13:33:37 +01:00
Richard Biener
ac6affba97 tree-optimization/80928 - SLP vectorize nested loop induction
This adds SLP vectorization of nested inductions.

2020-11-03  Richard Biener <rguenther@suse.de>

	PR tree-optimization/80928
	* tree-vect-loop.c (vectorizable_induction): SLP vectorize
	nested inductions.

	* gcc.dg/vect/vect-outer-slp-2.c: New testcase.
	* gcc.dg/vect/vect-outer-slp-3.c: Likewise.
2020-11-03 13:33:37 +01:00
Uros Bizjak
a562d44924 testsuite: Fix gcc.target/i386/zero-scratch-regs-*.c scan-asm directives
Improve zero-scratch-regs-*.c scan-asm regexps
and add target selectors for 32bit targets.

2020-11-03  Uroš Bizjak  <ubizjak@gmail.com>

gcc/testsuite/ChangeLog:

	* gcc.target/i386/zero-scratch-regs-1.c: Add ia32 target
	selector where appropriate.  Improve scan-assembler regexp.
	* gcc.target/i386/zero-scratch-regs-2.c: Ditto.
	* gcc.target/i386/zero-scratch-regs-3.c: Ditto.
	* gcc.target/i386/zero-scratch-regs-4.c: Ditto.
	* gcc.target/i386/zero-scratch-regs-5.c: Ditto.
	* gcc.target/i386/zero-scratch-regs-6.c: Ditto.
	* gcc.target/i386/zero-scratch-regs-7.c: Ditto.
	* gcc.target/i386/zero-scratch-regs-8.c: Ditto.
	* gcc.target/i386/zero-scratch-regs-9.c: Ditto.
	* gcc.target/i386/zero-scratch-regs-10.c: Ditto.
	* gcc.target/i386/zero-scratch-regs-13.c: Ditto.
	* gcc.target/i386/zero-scratch-regs-14.c: Ditto.
	* gcc.target/i386/zero-scratch-regs-15.c: Ditto.
	* gcc.target/i386/zero-scratch-regs-16.c: Ditto.
	* gcc.target/i386/zero-scratch-regs-17.c: Ditto.
	* gcc.target/i386/zero-scratch-regs-18.c: Ditto.
	* gcc.target/i386/zero-scratch-regs-19.c: Ditto.
	* gcc.target/i386/zero-scratch-regs-20.c: Ditto.
	* gcc.target/i386/zero-scratch-regs-21.c: Ditto.
	* gcc.target/i386/zero-scratch-regs-22.c: Ditto.
	* gcc.target/i386/zero-scratch-regs-23.c: Ditto.
	* gcc.target/i386/zero-scratch-regs-24.c: Ditto.
	* gcc.target/i386/zero-scratch-regs-25.c: Ditto.
	* gcc.target/i386/zero-scratch-regs-26.c: Ditto.
	* gcc.target/i386/zero-scratch-regs-27.c: Ditto.
	* gcc.target/i386/zero-scratch-regs-28.c: Ditto.
	* gcc.target/i386/zero-scratch-regs-29.c: Ditto.
	* gcc.target/i386/zero-scratch-regs-30.c: Ditto.
	* gcc.target/i386/zero-scratch-regs-31.c: Ditto.
2020-11-03 13:08:04 +01:00
Olivier Hainque
87a9861b06 Add missing require-effective-target lto
This prevents failure of an lto test in configurations
missing LTO support, such as VxWorks for kernel mode.

2020-11-02  Olivier Hainque  <hainque@adacore.com>

gcc/testsuite/
	* gcc.dg/tree-ssa/pr71077.c: Add
	dg-require-effective-target lto.
2020-11-03 11:31:27 +00:00
Olivier Hainque
aa23a2dd53 Add dg-require-effective-target fpic to gcc i386 tests
This change adds

 /* { dg-require-effective-target fpic } */

to tests in gcc.target/i386 that do use -fpic or -fPIC
but don't currently query the target support.

This corresponds to what many other fpic tests do
and helps the vxWorks ports at least, as -fpic is
typically not supported in at least one of the two
major modes of such port (kernel vs RTP).

2020-11-03  Olivier Hainque  <hainque@adacore.com>

gcc/testsuite/

	* gcc.target/i386/pr45352-1.c: Add dg-require-effective-target fpic.
	* gcc.target/i386/pr47602.c: Likewise.
	* gcc.target/i386/pr55151.c: Likewise.
	* gcc.target/i386/pr55458.c: Likewise.
	* gcc.target/i386/pr56348.c: Likewise.
	* gcc.target/i386/pr57097.c: Likewise.
	* gcc.target/i386/pr65753.c: Likewise.
	* gcc.target/i386/pr65915.c: Likewise.
	* gcc.target/i386/pr66232-5.c: Likewise.
	* gcc.target/i386/pr66334.c: Likewise.
	* gcc.target/i386/pr66819-2.c: Likewise.
	* gcc.target/i386/pr67265.c: Likewise.
	* gcc.target/i386/pr81481.c: Likewise.
	* gcc.target/i386/pr83994.c: Likewise.
2020-11-03 11:13:11 +00:00
Jan Hubicka
f89dcf9334 Avoid recursion in tree-inline
gcc/ChangeLog:

2020-11-03  Jan Hubicka  <hubicka@ucw.cz>

	PR ipa/97578
	* ipa-inline-transform.c (maybe_materialize_called_clones): New
	function.
	(inline_transform): Use it.

gcc/testsuite/ChangeLog:

2020-11-03  Jan Hubicka  <hubicka@ucw.cz>

	* gcc.c-torture/compile/pr97578.c: New test.
2020-11-03 11:56:05 +01:00
Richard Biener
8414529156 testsuite/97688 - fix check_vect () with __AVX2__
This fixes the cpuid check to always specify a subleaf zero
which is required to detect AVX2 and doesn't hurt for level one.
Without this fix we get zero runtime coverage when -mavx2 is
specified.
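
A minimal sketch of a cpuid query that passes an explicit subleaf,
assuming the <cpuid.h> header shipped with GCC and Clang. The function
name cpu_reports_avx2 is hypothetical and this is not the code in
tree-vect.h. It only looks at the CPUID feature bit; a complete AVX2
check would also confirm operating system support for the wider
register state.

```c
#include <assert.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Hypothetical helper: return 1 if CPUID reports AVX2, 0 otherwise
   (including on targets that are not x86).  */
static int cpu_reports_avx2(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    /* Leaf 7 reads its subleaf from ECX, so subleaf 0 is requested
       explicitly; the AVX2 bit is reported in EBX.  */
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
        return 0;
    return (ebx & bit_AVX2) != 0;
#else
    return 0;
#endif
}
```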

2020-11-03  Richard Biener  <rguenther@suse.de>

	PR testsuite/97688
	* gcc.dg/vect/tree-vect.h (check_vect): Fix the x86 cpuid
	check to always specify subleaf zero.
2020-11-03 11:14:01 +01:00
Richard Biener
f53e9d40de tree-optimization/97678 - fix SLP induction epilogue vectorization
This restores not tracking SLP nodes for induction initial values
in not nested context because this interferes with peeling and
epilogue vectorization.

2020-11-03  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/97678
	* tree-vect-slp.c (vect_build_slp_tree_2): Do not track
	the initial values of inductions when not nested.
	* tree-vect-loop.c (vectorizable_induction): Look at
	PHI node initial values again for SLP and not nested
	inductions.  Handle LOOP_VINFO_MASK_SKIP_NITERS and cost
	invariants.

	* gcc.dg/vect/pr97678.c: New testcase.
2020-11-03 09:56:40 +01:00
Tobias Burnus
0caf400a86 Fortran: Add !GCC$ attributes DEPRECATED
gcc/fortran/ChangeLog:

	* decl.c (ext_attr_list): Add EXT_ATTR_DEPRECATED.
	* gfortran.h (ext_attr_id_t): Ditto.
	* gfortran.texi (GCC$ ATTRIBUTES): Document it.
	* resolve.c (resolve_variable, resolve_function,
	resolve_call, resolve_values): Show -Wdeprecated-declarations warning.
	* trans-decl.c (add_attributes_to_decl): Skip those
	with no middle_end_name.

gcc/testsuite/ChangeLog:

	* gfortran.dg/attr_deprecated.f90: New test.
2020-11-03 09:55:58 +01:00
Uros Bizjak
682ed7ad23 x86: Optimize aes<aeswideklvariant>u8 a bit, fix whitespace
2020-11-03  Uroš Bizjak  <ubizjak@gmail.com>

gcc/

	* config/i386/sse.md (aes<aeswideklvariant>u8):
	Do not use xmm_regs array.  Fix whitespace.
2020-11-03 09:51:01 +01:00
Uros Bizjak
db3f0d218c x86: Fix comment in ix86_expand_builtin
2020-11-03  Uroš Bizjak  <ubizjak@gmail.com>

gcc/

	* config/i386/i386-expand.c (ix86_expand_builtin): Fix comment.
2020-11-03 09:46:59 +01:00
Thomas Schwinge
64dc14b1a7 [OpenACC] Enable inconsistent nested 'reduction' clauses checking for OpenACC 'kernels'
gcc/
	* omp-low.c (scan_omp_for) <OpenACC>: Move earlier inconsistent
	nested 'reduction' clauses checking.
	gcc/testsuite/
	* c-c++-common/goacc/nested-reductions-1-kernels.c: Extend.
	* c-c++-common/goacc/nested-reductions-2-kernels.c: Likewise.
	* gfortran.dg/goacc/nested-reductions-1-kernels.f90: Likewise.
	* gfortran.dg/goacc/nested-reductions-2-kernels.f90: Likewise.
2020-11-03 09:35:33 +01:00
Thomas Schwinge
fedf3e94ef [OpenACC] Split up testcases for inconsistent nested 'reduction' clauses checking
gcc/testsuite/
	* c-c++-common/goacc/nested-reductions.c: Split file into...
	* c-c++-common/goacc/nested-reductions-1-kernels.c: ... this...
	* c-c++-common/goacc/nested-reductions-1-parallel.c: ..., this...
	* c-c++-common/goacc/nested-reductions-1-routine.c: ..., and this.
	* c-c++-common/goacc/nested-reductions-warn.c: Split file into...
	* c-c++-common/goacc/nested-reductions-2-kernels.c: ... this...
	* c-c++-common/goacc/nested-reductions-2-parallel.c: ..., this...
	* c-c++-common/goacc/nested-reductions-2-routine.c: ..., and this.
	* gfortran.dg/goacc/nested-reductions.f90: Split file into...
	* gfortran.dg/goacc/nested-reductions-1-kernels.f90: ... this...
	* gfortran.dg/goacc/nested-reductions-1-parallel.f90: ..., this...
	* gfortran.dg/goacc/nested-reductions-1-routine.f90: ..., and
	this.
	* gfortran.dg/goacc/nested-reductions-warn.f90: Split file into...
	* gfortran.dg/goacc/nested-reductions-2-kernels.f90: ... this...
	* gfortran.dg/goacc/nested-reductions-2-parallel.f90: ..., this...
	* gfortran.dg/goacc/nested-reductions-2-routine.f90: ..., and
	this.
2020-11-03 09:35:33 +01:00
Jonathan Yong
08fca4df1d libstdc++: use lt_host_flags for libstdc++.la
For platforms like Mingw and Cygwin, cygwin refuses to generate the
shared library without using -no-undefined.

Attached patch makes sure the right flags are used, since libtool is
already used to link libstdc++.

libstdc++-v3/ChangeLog:

	* src/Makefile.am (libstdc___la_LINK): Add lt_host_flags.
	* src/Makefile.in: Regenerate.
2020-11-03 08:22:53 +00:00
Thomas Schwinge
41f7f6178e [Fortran] More precise location information for OpenACC 'gang', 'worker', 'vector' clauses with argument [PR92793]
gcc/fortran/
	PR fortran/92793
	* trans-openmp.c (gfc_trans_omp_clauses): More precise location
	information for OpenACC 'gang', 'worker', 'vector' clauses with
	argument.
	gcc/testsuite/
	PR fortran/92793
	* gfortran.dg/goacc/pr92793-1.f90: Adjust.
2020-11-03 09:13:07 +01:00
Thomas Schwinge
beddd1762a [OpenACC] More precise diagnostics for 'gang', 'worker', 'vector' clauses with arguments on 'loop' only allowed in 'kernels' regions
Instead of at the location of the 'loop' directive, 'error_at' the location of
the improper clause, and 'inform' at the location of the enclosing parent
compute construct/routine.

The Fortran testcases come with some XFAILing, to be resolved later.

	gcc/
	* omp-low.c (scan_omp_for) <OpenACC>: More precise diagnostics for
	'gang', 'worker', 'vector' clauses with arguments only allowed in
	'kernels' regions.
	gcc/testsuite/
	* c-c++-common/goacc/pr92793-1.c: Extend.
	* gfortran.dg/goacc/pr92793-1.f90: Likewise.
2020-11-03 09:13:07 +01:00
Kewen Lin
f5e18dd9c7 pass: Run cleanup passes before SLP [PR96789]
As discussed in PR96789, some scalar stmts that passes after SLP can
eliminate still had their costs modeled when trying to SLP, which
could skew the vectorizer's decision.  One typical case is the one
in PR96789 on target Power.

As Richard suggested there, this patch introduces a pass called
pre_slp_scalar_cleanup that runs some secondary cleanup passes,
for now FRE and DSE.  It also introduces a new group of TODO
flags called pending TODO flags.  Unlike normal TODO flags, the
pending TODO flags are passed down in the pipeline until one of
their consumers can perform the requested action.  Consumers
should then clear the flags for the actions that they have taken.
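
The producer/consumer protocol for pending TODO flags might look roughly
like the sketch below. Only the names pending_TODOs and
PENDING_TODO_force_next_scalar_cleanup come from the ChangeLog; the flag
value, the struct fn_state stand-in and both functions are assumptions
made for illustration.

```c
#include <assert.h>

/* Flag name taken from the ChangeLog; the bit value is illustrative.  */
#define PENDING_TODO_force_next_scalar_cleanup (1u << 1)

/* Hypothetical stand-in for the relevant part of struct function.  */
struct fn_state {
    unsigned pending_TODOs;
};

/* Producer: an earlier pass (complete unrolling in the patch) requests
   a later cleanup by setting the pending flag.  */
static void note_loop_unrolled(struct fn_state *fn)
{
    fn->pending_TODOs |= PENDING_TODO_force_next_scalar_cleanup;
}

/* Consumer: runs only when the flag is pending, then clears the flag
   for the action it has taken.  Returns 1 if the cleanup ran.  */
static int maybe_run_scalar_cleanup(struct fn_state *fn)
{
    if (!(fn->pending_TODOs & PENDING_TODO_force_next_scalar_cleanup))
        return 0;
    /* ... the cleanup sub-passes would run here ... */
    fn->pending_TODOs &= ~PENDING_TODO_force_next_scalar_cleanup;
    return 1;
}
```

Because the flag lives in per-function state rather than in a pass's
return value, any number of passes can sit between the producer and the
consumer without having to forward it.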

Some compilation time statistics on all SPEC2017 INT benchmarks were
collected on one Power9 machine for several option sets below:
  A1: -Ofast -funroll-loops
  A2: -O1
  A3: -O1 -funroll-loops
  A4: -O2
  A5: -O2 -funroll-loops

The corresponding compile-time increase is trivial:
  A1       A2       A3        A4        A5
  0.08%    0.00%    -0.38%    -0.10%    -0.05%

Bootstrapped/regtested on powerpc64le-linux-gnu P8.

gcc/ChangeLog:

	PR tree-optimization/96789
	* function.h (struct function): New member unsigned pending_TODOs.
	* passes.c (class pass_pre_slp_scalar_cleanup): New class.
	(make_pass_pre_slp_scalar_cleanup): New function.
	(pass_data_pre_slp_scalar_cleanup): New pass data.
	* passes.def: (pass_pre_slp_scalar_cleanup): New pass, add
	pass_fre and pass_dse as its children.
	* timevar.def (TV_SCALAR_CLEANUP): New timevar.
	* tree-pass.h (PENDING_TODO_force_next_scalar_cleanup): New
	pending TODO flag.
	(make_pass_pre_slp_scalar_cleanup): New declare.
	* tree-ssa-loop-ivcanon.c (tree_unroll_loops_completely_1):
	Once any outermost loop gets unrolled, flag cfun pending_TODOs
	PENDING_TODO_force_next_scalar_cleanup on.

gcc/testsuite/ChangeLog:

	PR tree-optimization/96789
	* gcc.dg/tree-ssa/ssa-dse-28.c: Adjust.
	* gcc.dg/tree-ssa/ssa-dse-29.c: Likewise.
	* gcc.dg/vect/bb-slp-41.c: Likewise.
	* gcc.dg/tree-ssa/pr96789.c: New test.
2020-11-02 20:55:48 -06:00
Martin Storsjö
bd6ecbe48a libgcc: Expose the instruction pointer and stack pointer in SEH _Unwind_Backtrace
Previously, the SEH version of _Unwind_Backtrace did unwind
the stack and call the provided callback function as intended,
but there was little the caller could do within the callback to
actually get any info about that particular level in the unwind.

Set the ra and cfa pointers, which are used by _Unwind_GetIP
and _Unwind_GetCFA, to allow using these functions from the
callback to inspect the state at each stack frame.
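
A caller-side sketch of what this enables, using the generic unwinder
interface declared in <unwind.h>. The names print_frame and
dump_backtrace are hypothetical, and the output depends on the target
and on how the program was built.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <unwind.h>

/* Hypothetical callback: print the instruction pointer and canonical
   frame address of each frame, counting frames through ARG.  */
static _Unwind_Reason_Code print_frame(struct _Unwind_Context *ctx, void *arg)
{
    int *depth = arg;

    printf("#%d ip=%p cfa=%p\n", *depth,
           (void *)(uintptr_t)_Unwind_GetIP(ctx),
           (void *)(uintptr_t)_Unwind_GetCFA(ctx));
    ++*depth;
    return _URC_NO_REASON;  /* keep walking outward */
}

/* Hypothetical helper: walk the current stack and return the number of
   frames the callback was invoked for.  */
static int dump_backtrace(void)
{
    int depth = 0;

    _Unwind_Backtrace(print_frame, &depth);
    return depth;
}
```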

2020-09-08  Martin Storsjö  <martin@martin.st>

	libgcc/
	* unwind-seh.c (_Unwind_Backtrace): Set the ra and cfa pointers
	before calling the callback.
2020-11-03 00:30:35 +00:00
GCC Administrator
18f8fc9329 Daily bump.
2020-11-03 00:16:23 +00:00
Alan Modra
18963d3bee can_implement_as_sibling_call_p REG_PARM_STACK_SPACE check
This moves an #ifdef block of code from calls.c to
targetm.function_ok_for_sibcall.  Only two targets, x86 and rs6000,
define REG_PARM_STACK_SPACE or OUTGOING_REG_PARM_STACK_SPACE macros
that might vary depending on the called function.  Macros like
UNITS_PER_WORD don't change over a function boundary, nor does the
MIPS ABI, nor does TARGET_64BIT on PA-RISC.  Other targets are even
more trivially proven to not need the calls.c code.

Besides cleaning up a small piece of #ifdef code, the motivation for
this patch is to allow tail calls on PowerPC for functions that
require less reg_parm_stack_space than their caller.  The original
code in calls.c only permitted tail calls when exactly equal, but on
PowerPC we can tail call if the callee has less or equal
REG_PARM_STACK_SPACE than the caller, as demonstrated by the
testcase.  So we should use

  /* If reg parm stack space increases, we cannot sibcall.  */
  if (REG_PARM_STACK_SPACE (decl ? decl : fntype)
      > INCOMING_REG_PARM_STACK_SPACE (current_function_decl))

and note the change to use INCOMING_REG_PARM_STACK_SPACE.
REG_PARM_STACK_SPACE has always been wrong there for PowerPC.  See
https://gcc.gnu.org/pipermail/gcc-patches/2014-May/389867.html for why
if you're curious.  Not that it matters, because PowerPC can do
without this check entirely, relying on a stack slot test in generic
code.

a) The generic code checks that arg passing stack in the callee is not
   greater than that in the caller, and,
b) ELFv2 only allocates reg_parm_stack_space when some parameter is
   passed on the stack.
Point (b) means that zero reg_parm_stack_space implies zero stack
space, and non-zero reg_parm_stack_space implies non-zero stack
space.  So the case of 0 reg_parm_stack_space in the caller and 64 in
the callee will be caught by (a).
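
The comparison quoted above reduces to a simple predicate. The sketch
below models it with plain integers; the function name and parameters
are hypothetical and stand in for the values that REG_PARM_STACK_SPACE
and INCOMING_REG_PARM_STACK_SPACE would yield.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the check: a sibling call stays possible as
   long as the callee needs no more reg parm stack space than the
   caller itself received.  */
static bool reg_parm_space_allows_sibcall(unsigned callee_space,
                                          unsigned caller_incoming_space)
{
    /* If reg parm stack space increases, we cannot sibcall.  */
    return callee_space <= caller_incoming_space;
}
```

With this rule the 64-in-caller, 0-in-callee case is accepted, while the
0-in-caller, 64-in-callee case from point (a) is rejected.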

gcc/
	PR middle-end/97267
	* calls.h (maybe_complain_about_tail_call): Declare.
	* calls.c (maybe_complain_about_tail_call): Make global.
	(can_implement_as_sibling_call_p): Delete reg_parm_stack_space
	param.  Adjust caller.  Move REG_PARM_STACK_SPACE check to..
	* config/i386/i386.c (ix86_function_ok_for_sibcall): ..here.

gcc/testsuite/
	PR middle-end/97267
	* gcc.target/powerpc/pr97267.c: New test.
2020-11-03 09:36:40 +10:30
Vladimir N. Makarov
3ceaafc95c Expand reg_equiv when scratches are removed.
gcc/ChangeLog:

	* ira.c (ira_remove_scratches): Rename to remove_scratches.  Make
	it static and returning flag of any change.
	(ira.c): Call ira_expand_reg_equiv in case of removing scratches.
2020-11-02 17:00:40 -05:00
H.J. Lu
6058b874ef x86: Also require MMX for __builtin_ia32_maskmovq
MMX emulation with SSE is implemented at MMX intrinsic level, not at MMX
instruction level.  _mm_maskmove_si64 intrinsic for "MASKMOVQ mm1, mm2"
is emulated with __builtin_ia32_maskmovdqu.  Since SSE "MASKMOVQ mm1, mm2"
builtin function, __builtin_ia32_maskmovq, can't be emulated with XMM
registers, make __builtin_ia32_maskmovq also require MMX instead of SSE
only.

gcc/

	PR target/97140
	* config/i386/i386-expand.c (ix86_expand_builtin): Require MMX
	for __builtin_ia32_maskmovq.

gcc/testsuite/

	PR target/97140
	* gcc.target/i386/pr97140.c: New test.
2020-11-02 13:38:34 -08:00
GCC Administrator
88ce3d5fbb Daily bump.
2020-11-02 20:53:00 +00:00
Martin Sebor
9e3c694afa Correct -Wstringop-overflow and -Wstringop-overread.
gcc/ChangeLog:
	* doc/invoke.texi (-Wstringop-overflow): Correct default setting.
	(-Wstringop-overread): Move past -Wstringop-overflow.
2020-11-02 13:50:35 -07:00
François-Xavier Coudert
034db20e2e gcc: quote characters in texi source
gcc/ChangeLog:

	PR bootstrap/57076
	* Makefile.in (gcc-vers.texi): Quote @, { and }.
2020-11-02 21:15:10 +01:00
Thomas Rodgers
6bcbcea058 libstdc++: Add c++2a <syncstream>
libstdc++-v3/ChangeLog:
	* doc/doxygen/user.cfg.in (INPUT): Add new header.
	* include/Makefile.am (std_headers): Add new header.
	* include/Makefile.in: Regenerate.
	* include/precompiled/stdc++.h: Include new header.
	* include/std/syncstream: New header.
	* include/std/version: Add __cpp_lib_syncbuf.
	* testsuite/27_io/basic_syncbuf/1.cc: New test.
	* testsuite/27_io/basic_syncbuf/2.cc: Likewise.
	* testsuite/27_io/basic_syncbuf/basic_ops/1.cc:
	Likewise.
	* testsuite/27_io/basic_syncbuf/requirements/types.cc:
	Likewise.
	* testsuite/27_io/basic_syncbuf/sync_ops/1.cc:
	Likewise.
	* testsuite/27_io/basic_syncstream/1.cc: Likewise.
	* testsuite/27_io/basic_syncstream/2.cc: Likewise.
	* testsuite/27_io/basic_syncstream/basic_ops/1.cc:
	Likewise.
	* testsuite/27_io/basic_syncstream/requirements/types.cc:
	Likewise.
2020-11-02 10:41:32 -08:00
Nathan Sidwell
d6912d9b17 c++: Fixup some vardecls and whitespace
Move some var decls to their initializers.  Correct some whitespace.

	gcc/cp/
	* decl.c (start_decl_1): Refactor declarations.  Fixup some
	whitespace.
	(lookup_and_check_tag): Fixup some whitespace.
2020-11-02 10:34:31 -08:00
Nathan Sidwell
9757d793f8 c++: refactor duplicate decls
A couple of paths in duplicate decls dealing with templates and
builtins were overly complicated.  Fixing thusly.

	gcc/cp/
	* decl.c (duplicate_decls): Refactor some template & builtin
	handling.
2020-11-02 10:34:31 -08:00
Nathan Sidwell
f915e19e62 c++: Delete unused hash type
Since I redid block-scope extern decls, the need for a uid->decl
hasher has gone away.  Deleting thusly.

	gcc/cp/
	* cp-tree.h (struct cxx_int_tree_map): Delete.
	(struct cxx_int_tree_map_hasher): Delete.
	* cp-gimplify.c (cxx_int_tree_map_hasher::equal): Delete.
	(cxx_int_tree_map_hasher::hash): Delete.
2020-11-02 10:34:31 -08:00
Patrick Palka
bebabf70a0 c++: Don't purge the satisfaction caches
The adoption of P2104 ("Disallow changing concept values") means we can
memoize the result of satisfaction indefinitely and no longer have to
clear the satisfaction caches on various events that would affect
satisfaction.  To that end, this patch removes the invalidation routine
clear_satisfaction_cache and adjusts its callers appropriately.

This provides a large reduction in compile time and memory use in some
cases.  For example, on the libstdc++ test std/ranges/adaptor/join.cc,
compile time and memory usage drops nearly 75%, from 7.5s/770MB to
2s/230MB, with a --enable-checking=release compiler.

gcc/cp/ChangeLog:

	* class.c (finish_struct_1): Don't call clear_satisfaction_cache.
	* constexpr.c (clear_cv_and_fold_caches): Likewise.  Remove bool
	parameter.
	* constraint.cc (clear_satisfaction_cache): Remove definition.
	* cp-tree.h (clear_satisfaction_cache): Remove declaration.
	(clear_cv_and_fold_caches): Remove bool parameter.
	* typeck2.c (store_init_value): Remove argument to
	clear_cv_and_fold_caches.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp2a/concepts-complete1.C: Delete test that became
	ill-formed after P2104.
2020-11-02 13:19:29 -05:00
Carl Love
05161256d3 Add bcd builtins listed in appendix B of the ABI
2020-10-29  Carl Love  <cel@us.ibm.com>

gcc/
	PR target/93449
	* config/rs6000/altivec.h (__builtin_bcdadd, __builtin_bcdadd_lt,
	__builtin_bcdadd_eq, __builtin_bcdadd_gt, __builtin_bcdadd_ofl,
	__builtin_bcdadd_ov, __builtin_bcdsub, __builtin_bcdsub_lt,
	__builtin_bcdsub_eq, __builtin_bcdsub_gt, __builtin_bcdsub_ofl,
	__builtin_bcdsub_ov, __builtin_bcdinvalid, __builtin_bcdmul10,
	__builtin_bcddiv10, __builtin_bcd2dfp, __builtin_bcdcmpeq,
	__builtin_bcdcmpgt, __builtin_bcdcmplt, __builtin_bcdcmpge,
	__builtin_bcdcmple): Add defines.
	* config/rs6000/altivec.md: Add UNSPEC_BCDSHIFT.
	(BCD_TEST): Add le, ge to code iterator.
	Add VBCD mode iterator.
	(bcd<bcd_add_sub>_test, *bcd<bcd_add_sub>_test2,
	bcd<bcd_add_sub>_<code>, bcd<bcd_add_sub>_<code>): Add mode to name.
	Change iterator from V1TI to VBCD.
	(*bcdinvalid_<mode>, bcdshift_v16qi): New define_insn.
	(bcdinvalid_<mode>, bcdmul10_v16qi, bcddiv10_v16qi): New define.
	* config/rs6000/dfp.md (dfp_denbcd_v16qi_inst): New define_insn.
	(dfp_denbcd_v16qi): New define_expand.
	* config/rs6000/rs6000-builtin.def (BU_P8V_MISC_1): New define.
	(BCDADD): Replaced with BCDADD_V1TI and BCDADD_V16QI.
	(BCDADD_LT): Replaced with BCDADD_LT_V1TI and BCDADD_LT_V16QI.
	(BCDADD_EQ): Replaced with BCDADD_EQ_V1TI and BCDADD_EQ_V16QI.
	(BCDADD_GT): Replaced with BCDADD_GT_V1TI and BCDADD_GT_V16QI.
	(BCDADD_OV): Replaced with BCDADD_OV_V1TI and BCDADD_OV_V16QI.
	(BCDSUB_V1TI, BCDSUB_V16QI, BCDSUB_LT_V1TI, BCDSUB_LT_V16QI,
	BCDSUB_LE_V1TI, BCDSUB_LE_V16QI, BCDSUB_EQ_V1TI, BCDSUB_EQ_V16QI,
	BCDSUB_GT_V1TI, BCDSUB_GT_V16QI, BCDSUB_GE_V1TI, BCDSUB_GE_V16QI,
	BCDSUB_OV_V1TI, BCDSUB_OV_V16QI, BCDINVALID_V1TI, BCDINVALID_V16QI,
	BCDMUL10_V16QI, BCDDIV10_V16QI, DENBCD_V16QI): New builtin definitions.
	(BCDADD, BCDADD_LT, BCDADD_EQ, BCDADD_GT, BCDADD_OV, BCDSUB, BCDSUB_LT,
	BCDSUB_LE, BCDSUB_EQ, BCDSUB_GT, BCDSUB_GE, BCDSUB_OV, BCDINVALID,
	BCDMUL10, BCDDIV10, DENBCD): New overload definitions.
	* config/rs6000/rs6000-call.c (P8V_BUILTIN_VEC_BCDADD, P8V_BUILTIN_VEC_BCDADD_LT,
	P8V_BUILTIN_VEC_BCDADD_EQ, P8V_BUILTIN_VEC_BCDADD_GT, P8V_BUILTIN_VEC_BCDADD_OV,
	P8V_BUILTIN_VEC_BCDINVALID, P9V_BUILTIN_VEC_BCDMUL10, P8V_BUILTIN_VEC_DENBCD.
	P8V_BUILTIN_VEC_BCDSUB, P8V_BUILTIN_VEC_BCDSUB_LT, P8V_BUILTIN_VEC_BCDSUB_LE,
	P8V_BUILTIN_VEC_BCDSUB_EQ, P8V_BUILTIN_VEC_BCDSUB_GT, P8V_BUILTIN_VEC_BCDSUB_GE,
	P8V_BUILTIN_VEC_BCDSUB_OV): New overloaded specifications.
	(CODE_FOR_bcdadd): Replaced with CODE_FOR_bcdadd_v16qi and CODE_FOR_bcdadd_v1ti.
	(CODE_FOR_bcdadd_lt): Replaced with CODE_FOR_bcdadd_lt_v16qi and CODE_FOR_bcdadd_lt_v1ti.
	(CODE_FOR_bcdadd_eq): Replaced with CODE_FOR_bcdadd_eq_v16qi and CODE_FOR_bcdadd_eq_v1ti.
	(CODE_FOR_bcdadd_gt): Replaced with CODE_FOR_bcdadd_gt_v16qi and CODE_FOR_bcdadd_gt_v1ti.
	(CODE_FOR_bcdsub): Replaced with CODE_FOR_bcdsub_v16qi and CODE_FOR_bcdsub_v1ti.
	(CODE_FOR_bcdsub_lt): Replaced with CODE_FOR_bcdsub_lt_v16qi and CODE_FOR_bcdsub_lt_v1ti.
	(CODE_FOR_bcdsub_eq): Replaced with CODE_FOR_bcdsub_eq_v16qi and CODE_FOR_bcdsub_eq_v1ti.
	(CODE_FOR_bcdsub_gt): Replaced with CODE_FOR_bcdsub_gt_v16qi and CODE_FOR_bcdsub_gt_v1ti.
	(rs6000_expand_ternop_builtin):  Add CODE_FOR_dfp_denbcd_v16qi to else if.
	* doc/extend.texi: Add documentation for new builtins.

gcc/testsuite/
	* gcc.target/powerpc/bcd-2.c: Add include altivec.h.
	* gcc.target/powerpc/bcd-3.c: Add include altivec.h.
	* gcc.target/powerpc/bcd-4.c: New test.
2020-11-02 11:29:56 -06:00
Nathan Sidwell
0a07912f2d c++: Some additional tests
I created a few tests on the modules branch that are not actually
module-related.  Here they are.

	gcc/testsuite/
	* g++.dg/concepts/pack-1.C: New.
	* g++.dg/lookup/using53.C: Add an enum.
	* g++.dg/template/error25.C: Relax 'export' error check.
2020-11-02 08:57:33 -08:00
Nathan Sidwell
48a201e9bc options: Tiny refactor
This changes more on the modules branch, but let's move the
declaration to the initializer now.

	gcc/c-family/
	* c-opts.c (c_common_post_options): Move var decl to its
	initialization point.
2020-11-02 08:56:39 -08:00
Nathan Sidwell
a0bc61e0b6 core: Synchronize tree-cst & wide-int caching expectations
I fell over an ICE where wide_int_to_type_1's expectations of pointer
value caching didn't match that of cache_integer_cst's behaviour.  I
don't know why it only exhibited on the modules branch, but it seems
pretty wrong.  This patch matches up the behaviours and adds a comment
about that.

	gcc/
	* tree.c (cache_integer_cst): Fixup pointer caching to match
	wide_int_to_type_1's expectations.  Add comment.
2020-11-02 08:56:39 -08:00
Nathan Sidwell
9a2e765d77 core: id_equal should forward
I noticed the two id_equal functions directly called strcmp.  This
changes one of them to call the other with args swapped.

	gcc/
	* tree.h (id_equal): Call the symmetric predicate with swapped
	arguments.
2020-11-02 08:56:39 -08:00
Nathan Sidwell
f8a737930b core: debug-print whole call expr
In debugging some call-expr handling, I got confused because the debug
printer elided NULL call operands.  This changes the printer to display
them as NULL.

	gcc/
	* print-tree.c (print_node): Display all the operands of a call
	expr.
2020-11-02 08:56:38 -08:00
Nathan Sidwell
e9a2e208dd cpplib: Macro use location and comparison
Our macro use hook passes a location, but doesn't receive it from the
using location.  This patch adds the extra location_t parameter and
passes it through.

A second cleanup is breaking out the macro comparison code from the
redefinition warning.  That'll turn out useful for modules.

Finally, there's a filename comparison needed for the location
optimization of rewinding from line 2 (occurs during the emission of
builtin macros).

	libcpp/
	* internal.h (_cpp_notify_macro_use): Add location parm.
	(_cpp_maybe_notify_macro_use): Likewise.
	* directives.c (_cpp_do_file_change): Check we've not changed file
	when optimizing a rewind.
	(do_ifdef): Pass location to _cpp_maybe_notify_macro_use.
	(do_ifndef): Likewise.  Delete obsolete comment about powerpc.
	* expr.c (parse_defined): Pass location to
	_cpp_maybe_notify_macro_use.
	* macro.c (enter_macro_context): Likewise.
	(warn_of_redefinition): Break out helper function.  Call it.
	(compare_macros): New function broken out of warn_of_redefinition.
	(_cpp_new_macro): Zero all fields.
	(_cpp_notify_macro_use): Add location parameter.
2020-11-02 08:56:38 -08:00
Vladimir N. Makarov
1c689b827c Add hint * to 2nd alternative of the 1st scratch in *vsx_extract_<mode>_store_p9.
gcc/ChangeLog:

	* config/rs6000/vsx.md (*vsx_extract_<mode>_store_p9): Add hint *
	to 2nd alternative of the 1st scratch.
2020-11-02 11:12:59 -05:00
Sudakshina Das
ce99142c11 [PATCH] aarch64: Fix PR97638
Currently the testcase in the patch was failing to produce
a 'bti c' at the beginning of the function. This was because
in aarch64_pac_insn_p, we were wrongly returning at the first
check!

2020-10-30  Sudakshina Das  <sudi.das@arm.com>

gcc/ChangeLog:

	PR target/97638
	* config/aarch64/aarch64-bti-insert.c (aarch64_pac_insn_p): Update
	return value on INSN_P check.

gcc/testsuite/ChangeLog:

	PR target/97638
	* gcc.target/aarch64/pr97638.c: New test.
2020-11-02 15:52:22 +00:00
Richard Biener
e881774d0d Rewrite SLP induction vectorization
This rewrites SLP induction vectorization to handle different
inductions in the different SLP lanes.  It also changes SLP
build to represent the initial value (but not the cycle) so
it can be enhanced to handle outer loop vectorization later.
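
For illustration, a loop of the following shape stores two inductions
with different initial values and steps into adjacent elements, so each
SLP lane carries its own induction. This is a hypothetical example, not
the slp-49.c testcase, and whether GCC vectorizes it depends on the
target and options.

```c
#include <assert.h>

#define N 8

/* Hypothetical example: the even elements follow one induction and the
   odd elements follow another.  */
static void fill_two_inductions(int out[2 * N])
{
    int x = 0;   /* lane 0: initial value 0, step 1 */
    int y = 10;  /* lane 1: initial value 10, step 3 */

    for (int i = 0; i < N; i++) {
        out[2 * i] = x;
        out[2 * i + 1] = y;
        x += 1;
        y += 3;
    }
}
```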

Note this FAILs gcc.dg/vect/costmodel/x86_64/costmodel-pr30843.c
because it removes one CSE optimization that no longer works
with non-uniform initial value and step.  I'll see to recover
from this after outer loop vectorization of inductions works.

It might be a bit friendlier to variable-size vectors now
but then we're now building the step vector from scalars ...

2020-11-02  Richard Biener  <rguenther@suse.de>

	* tree.h (build_real_from_wide): Declare.
	* tree.c (build_real_from_wide): New function.
	* tree-vect-slp.c (vect_build_slp_tree_2): Remove
	restriction on induction vectorization, represent
	the initial value.
	* tree-vect-loop.c (vect_model_induction_cost): Inline ...
	(vectorizable_induction): ... here.  Rewrite SLP
	code generation.

	* gcc.dg/vect/slp-49.c: New testcase.
2020-11-02 15:58:14 +01:00
Martin Jambor
86deadf8d3 ipa-cp: New debug counters for IPA-CP
Martin Liška has been asking me to add debug counters to the IPA-CP pass so
that testcase reductions are easier.  The pass already has one for the bit
value propagation, so this patch adds one for value_range propagation
and one for the actual constant propagation.
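
As an illustration of why such counters help with testcase reduction, here is a minimal C++ sketch of the general "check and bump" idea. The class DebugCounter and the function transform_limited are hypothetical names invented for this example; they are not GCC's dbg_cnt implementation or API.

```cpp
#include <vector>

// Hypothetical stand-in for a debug counter; not GCC's dbg_cnt machinery.
class DebugCounter {
public:
    explicit DebugCounter(unsigned limit) : limit_(limit) {}

    // Bump the counter and report whether this event is still within the limit.
    bool check_and_bump() { return ++count_ <= limit_; }

private:
    unsigned limit_;
    unsigned count_ = 0;
};

// Apply a transformation (here: doubling) only while the counter allows it.
// Elements seen after the limit is reached are left untouched.
std::vector<int> transform_limited(const std::vector<int> &in, unsigned limit) {
    DebugCounter counter(limit);
    std::vector<int> out;
    for (int v : in)
        out.push_back(counter.check_and_bump() ? v * 2 : v);
    return out;
}
```

Raising or lowering the limit between runs lets one bisect to the first transformation that changes the observed behaviour.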

gcc/ChangeLog:

2020-10-30  Martin Jambor  <mjambor@suse.cz>

	* dbgcnt.def (ipa_cp_values): New counter.
	(ipa_cp_vr): Likewise.
	* ipa-cp.c (decide_about_value): Check and bump ipa_cp_values debug
	counter.
	(decide_whether_version_node): Likewise.
	(ipcp_store_vr_results): Check and bump ipa_cp_vr debug counter.
2020-11-02 15:43:28 +01:00
Christophe Lyon
637aeb6b8d arm: Fix multiple inheritance thunks for thumb-1 with -mpure-code
When -mpure-code is used, we cannot load delta from code memory (like
we do without -mpure-code).

This patch builds the value of mi_delta into r3 with a series of
movs/adds/lsls.

We also do some cleanup by not emitting the function address and delta
via .word directives at the end of the thunk since we don't use them
with -mpure-code.

No need for new testcases, this bug was already identified by:
g++.dg/ipa/pr46287-3.C
g++.dg/ipa/pr46984.C
g++.dg/opt/thunk1.C
g++.dg/torture/pr46287.C
g++.dg/torture/pr45699.C

2020-11-02  Christophe Lyon  <christophe.lyon@linaro.org>

	gcc/
	* config/arm/arm.c (arm_thumb1_mi_thunk): Build mi_delta in r3 and
	do not emit function address and delta when -mpure-code is used.
2020-11-02 14:40:10 +00:00
Christophe Lyon
c7f49e0579 arm: Call thumb1_gen_const_int from thumb1_movsi_insn
thumb1_movsi_insn used the same algorithm to build a constant in asm
as thumb1_gen_const_int_1 does in RTL. Since the previous patch added
support for asm generation in thumb1_gen_const_int_1, this patch calls
it from thumb1_movsi_insn to avoid duplication.

We need to introduce a new proxy function, thumb1_gen_const_int_print
to select the right template.

This patch also adds a new testcase as the updated alternative is only
used by thumb-1 processors that also support movt/movw.

2020-11-02  Christophe Lyon  <christophe.lyon@linaro.org>

	gcc/
	* config/arm/thumb1.md (thumb1_movsi_insn): Call
	thumb1_gen_const_int_print.
	* config/arm/arm-protos.h (thumb1_gen_const_int_print): Add
	prototype.
	* config/arm/arm.c (thumb1_gen_const_int_print): New.

	gcc/testsuite/
	* gcc.target/arm/pure-code/no-literal-pool-m23.c: New.
2020-11-02 14:39:52 +00:00
Christophe Lyon
011f5e92f8 arm: Improve thumb1_gen_const_int
Enable thumb1_gen_const_int to generate RTL or asm depending on the
context, so that we avoid duplicating code to handle constants in
Thumb-1 with -mpure-code.

Use a template so that the algorithm is effectively shared, and
rely on two classes to handle the actual emission as RTL or asm.
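
The template-plus-two-emitters arrangement can be sketched roughly as follows. This is a simplified illustration, not the code in arm.c: the names AsmEmitter, EvalEmitter and gen_const are hypothetical, EvalEmitter merely evaluates the sequence as a stand-in for RTL emission, and the byte-by-byte algorithm is a naive one that does not include the right-shift and small-value improvements described in this commit.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical emitter that renders each step as Thumb-1 assembly text.
struct AsmEmitter {
    std::ostringstream out;
    void mov(unsigned imm) { out << "movs r3, #" << imm << "\n"; }
    void lsl(unsigned n)   { out << "lsls r3, #" << n << "\n"; }
    void add(unsigned imm) { out << "adds r3, #" << imm << "\n"; }
};

// Hypothetical emitter that evaluates each step; stands in for the RTL side.
struct EvalEmitter {
    std::uint32_t r3 = 0;
    void mov(unsigned imm) { r3 = imm; }
    void lsl(unsigned n)   { assert(n < 32); r3 <<= n; }
    void add(unsigned imm) { r3 += imm; }
};

// Shared algorithm: load the most significant non-zero byte, then shift
// and add the remaining bytes, merging shifts across zero bytes.
template <typename Emitter>
void gen_const(Emitter &e, std::uint32_t value) {
    int top = 3;
    while (top > 0 && ((value >> (8 * top)) & 0xff) == 0)
        --top;
    e.mov((value >> (8 * top)) & 0xff);

    unsigned pending = 0;  // shift amount accumulated across zero bytes
    for (int i = top - 1; i >= 0; --i) {
        pending += 8;
        unsigned byte = (value >> (8 * i)) & 0xff;
        if (byte != 0) {
            e.lsl(pending);
            e.add(byte);
            pending = 0;
        }
    }
    if (pending != 0)
        e.lsl(pending);
}
```

Because gen_const only talks to its Emitter parameter, the same walk over the constant drives both outputs, which is the point of sharing the algorithm through a template.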

The generated sequence is improved to handle right-shiftable and small
values with fewer instructions. We now generate:

128:
        movs    r0, r0, #128
264:
        movs    r3, #33
        lsls    r3, #3
510:
        movs    r3, #255
        lsls    r3, #1
512:
        movs    r3, #1
        lsls    r3, #9
764:
        movs    r3, #191
        lsls    r3, #2
65536:
        movs    r3, #1
        lsls    r3, #16
0x123456:
        movs    r3, #18 ;0x12
        lsls    r3, #8
        adds    r3, #52 ;0x34
        lsls    r3, #8
        adds    r3, #86 ;0x56
0x1123456:
        movs    r3, #137 ;0x89
        lsls    r3, #8
        adds    r3, #26 ;0x1a
        lsls    r3, #8
        adds    r3, #43 ;0x2b
        lsls    r3, #1
0x1000010:
        movs    r3, #16
        lsls    r3, #16
        adds    r3, #1
        lsls    r3, #4
0x1000011:
        movs    r3, #1
        lsls    r3, #24
        adds    r3, #17
-8192:
        movs    r3, #1
        lsls    r3, #13
        rsbs    r3, #0

The patch adds a testcase which does not fully exercise
thumb1_gen_const_int, as other existing patterns already catch small
constants.  These parts of thumb1_gen_const_int are used by
arm_thumb1_mi_thunk.

2020-11-02  Christophe Lyon  <christophe.lyon@linaro.org>

	gcc/
	* config/arm/arm.c (thumb1_const_rtl, thumb1_const_print): New
	classes.
	(thumb1_gen_const_int): Rename to ...
	(thumb1_gen_const_int_1): ... New helper function. Add capability
	to emit either RTL or asm, improve generated code.
	(thumb1_gen_const_int_rtl): New function.
	* config/arm/arm-protos.h (thumb1_gen_const_int): Rename to
	thumb1_gen_const_int_rtl.
	* config/arm/thumb1.md: Call thumb1_gen_const_int_rtl instead
	of thumb1_gen_const_int.

	gcc/testsuite/
	* gcc.target/arm/pure-code/no-literal-pool-m0.c: New.
2020-11-02 14:39:24 +00:00