Commit Graph

161510 Commits

Author SHA1 Message Date
Toon Moene
5f007d14ce invoke.texi: Move -floop-unroll-and-jam documentation directly after that of -floop-interchange.
2018-05-18  Toon Moene  <toon@moene.org>

	* doc/invoke.texi: Move -floop-unroll-and-jam documentation
	directly after that of -floop-interchange. Indicate that both
	options are enabled by default when specifying -O3.

From-SVN: r260352
2018-05-18 09:07:39 +00:00
Kyrylo Tkachov
8364e58b5a [AArch64] Unify vec_set patterns, support floating-point vector modes properly
We've a deficiency in our vec_set family of patterns.
We don't support directly loading a vector lane using LD1 for V2DImode and all the vector floating-point modes.
We do do it correctly for the other integer vector modes (V4SI, V8HI etc) though.

The alternatives on the relevant floating-point patterns only allow a register-to-register INS instruction.
That means if we want to load a value into a vector lane we must first load it into a scalar register and then
perform an INS, which is wasteful.

There is also an explicit V2DI vec_set expander dangling around for no reason that I can see. It seems to do the
exact same things as the other vec_set expanders. This patch removes that.
It now unifies all vec_set expansions into a single "vec_set<mode>" define_expand using the catch-all VALL_F16 iterator. 

With this patch we avoid loading values into scalar registers and then doing an explicit INS on them to move them into
the desired vector lanes. For example for:

typedef float v4sf __attribute__ ((vector_size (16)));
typedef long long v2di __attribute__ ((vector_size (16)));

v2di
foo_v2di (long long *a, long long *b)
{
  v2di res = { *a, *b };
  return res;
}

v4sf
foo_v4sf (float *a, float *b, float *c, float *d)
{
  v4sf res = { *a, *b, *c, *d };
  return res;
}

we currently generate:

foo_v2di:
        ldr     d0, [x0]
        ldr     x0, [x1]
        ins     v0.d[1], x0
        ret

foo_v4sf:
        ldr     s0, [x0]
        ldr     s3, [x1]
        ldr     s2, [x2]
        ldr     s1, [x3]
        ins     v0.s[1], v3.s[0]
        ins     v0.s[2], v2.s[0]
        ins     v0.s[3], v1.s[0]
        ret

but with this patch we generate the much cleaner:
foo_v2di:
        ldr     d0, [x0]
        ld1     {v0.d}[1], [x1]
        ret

foo_v4sf:
        ldr     s0, [x0]
        ld1     {v0.s}[1], [x1]
        ld1     {v0.s}[2], [x2]
        ld1     {v0.s}[3], [x3]
        ret


	* config/aarch64/aarch64-simd.md (vec_set<mode>): Use VALL_F16 mode
	iterator.  Delete separate integer-mode vec_set<mode> expander.
	(aarch64_simd_vec_setv2di): Delete.
	(vec_setv2di): Delete.
	(aarch64_simd_vec_set<mode>): Delete all other patterns with that name.
	Use VALL_F16 mode iterator.  Add LD1 alternative and use vwcore for
	the "w, r" alternative.

	* gcc.target/aarch64/vect-init-ld1.c: New test.

From-SVN: r260351
2018-05-18 08:52:30 +00:00
Martin Liska
eb63c01f65 Radically simplify emission of balanced tree for switch statements.
2018-05-18  Martin Liska  <mliska@suse.cz>

	* passes.def: Add pass_lower_switch and pass_lower_switch_O0.
	* tree-pass.h (make_pass_lower_switch_O0): New function.
	* tree-switch-conversion.c (node_has_low_bound): Remove.
	(node_has_high_bound): Likewise.
	(node_is_bounded): Likewise.
	(class pass_lower_switch): Make it a template type and create
	two instances.
	(pass_lower_switch::execute): Add template argument.
	(make_pass_lower_switch): New function.
	(make_pass_lower_switch_O0): New function.
	(do_jump_if_equal): Remove.
	(emit_case_nodes): Simplify to just handle all 3 cases and leave
	all the hard work to tree optimization passes.
2018-05-18  Martin Liska  <mliska@suse.cz>

	* gcc.dg/tree-ssa/vrp104.c: Adjust dump file that is scanned.
	* gcc.dg/tree-prof/update-loopch.c: Likewise.

From-SVN: r260350
2018-05-18 08:43:19 +00:00
Martin Liska
cdc3b88343 Support lower and upper limit for -fdbg-cnt flag.
2018-05-18  Martin Liska  <mliska@suse.cz>

	* dbgcnt.c (limit_low): Renamed from limit.
	(limit_high): New variable.
	(dbg_cnt_is_enabled): Check for upper limit.
	(dbg_cnt): Adjust dumping.
	(dbg_cnt_set_limit_by_index): Add new argument for high
	value.
	(dbg_cnt_set_limit_by_name): Likewise.
	(dbg_cnt_process_single_pair): Parse new format.
	(dbg_cnt_process_opt): Use strtok.
	(dbg_cnt_list_all_counters): Remove 'value' and add
	'limit_high'.
	* doc/invoke.texi: Document changes.
2018-05-18  Martin Liska  <mliska@suse.cz>

	* gcc.dg/ipa/ipa-icf-39.c: New test.
	* gcc.dg/pr68766.c: Adjust pruned output.
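
The lower and upper limit can be pictured with a minimal C sketch of a single counter. Only the names limit_low and limit_high come from the ChangeLog above; the struct, the function names, and the exact comparison (enabled while limit_low < count <= limit_high) are assumptions made for illustration, not the code in dbgcnt.c.

```c
#include <stdbool.h>

/* Hypothetical stand-in for one debug counter.  Only the field names
   limit_low and limit_high are taken from the ChangeLog entry.  */
struct dbg_counter
{
  unsigned count;       /* number of times the counter has been hit */
  unsigned limit_low;   /* hits up to and including this are disabled */
  unsigned limit_high;  /* hits above this are disabled */
};

/* Assumed semantics: enabled while limit_low < count <= limit_high.  */
bool
dbg_counter_is_enabled (const struct dbg_counter *c)
{
  return c->count > c->limit_low && c->count <= c->limit_high;
}

/* Record one hit and report whether the guarded transformation
   should run for this hit.  */
bool
dbg_counter_hit (struct dbg_counter *c)
{
  c->count++;
  return dbg_counter_is_enabled (c);
}
```

Under these assumptions, a counter with limit_low 2 and limit_high 4 is enabled only for its third and fourth hit, which is what makes a range useful for bisecting a miscompilation.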

From-SVN: r260349
2018-05-18 08:42:15 +00:00
Richard Sandiford
c566cc9f78 Replace FMA_EXPR with one internal fn per optab
There are four optabs for various forms of fused multiply-add:
fma, fms, fnma and fnms.  Of these, only fma had a direct gimple
representation.  For the other three we relied on special pattern-
matching during expand, although tree-ssa-math-opts.c did have
some code to try to second-guess what expand would do.

This patch removes the old FMA_EXPR representation of fma and
introduces four new internal functions, one for each optab.
IFN_FMA is tied to BUILT_IN_FMA* while the other three are
independent directly-mapped internal functions.  It's then
possible to do the pattern-matching in match.pd and
tree-ssa-math-opts.c (via folding) can select the exact
FMA-based operation.
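
As an illustration of the four forms named above, here is a minimal C sketch of source expressions with the same shape as each optab. The function names are hypothetical. Whether a compiler actually fuses each expression into a single instruction depends on the target and on -ffp-contract; that is an assumption here, not something this commit guarantees.

```c
/* Source-level shapes of the four fused multiply-add forms.
   The shape_* names are hypothetical and exist only for this sketch.  */

/* fma:  (a * b) + c  */
double
shape_fma (double a, double b, double c)
{
  return a * b + c;
}

/* fms:  (a * b) - c  */
double
shape_fms (double a, double b, double c)
{
  return a * b - c;
}

/* fnma: -(a * b) + c  */
double
shape_fnma (double a, double b, double c)
{
  return -(a * b) + c;
}

/* fnms: -(a * b) - c  */
double
shape_fnms (double a, double b, double c)
{
  return -(a * b) - c;
}
```

With small integer-valued inputs such as (2.0, 3.0, 4.0) the fused and unfused results coincide, so the sketch shows only the sign conventions, not the single-rounding behaviour of a real fused operation.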

The BRIG & HSA parts are a best guess, but seem relatively simple.

2018-05-18  Richard Sandiford  <richard.sandiford@linaro.org>

gcc/
	* doc/sourcebuild.texi (scalar_all_fma): Document.
	* tree.def (FMA_EXPR): Delete.
	* internal-fn.def (FMA, FMS, FNMA, FNMS): New internal functions.
	* internal-fn.c (ternary_direct): New macro.
	(expand_ternary_optab_fn): Likewise.
	(direct_ternary_optab_supported_p): Likewise.
	* Makefile.in (build/genmatch.o): Depend on case-fn-macros.h.
	* builtins.c (fold_builtin_fma): Delete.
	(fold_builtin_3): Don't call it.
	* cfgexpand.c (expand_debug_expr): Remove FMA_EXPR handling.
	* expr.c (expand_expr_real_2): Likewise.
	* fold-const.c (operand_equal_p): Likewise.
	(fold_ternary_loc): Likewise.
	* gimple-pretty-print.c (dump_ternary_rhs): Likewise.
	* gimple.c (DEFTREECODE): Likewise.
	* gimplify.c (gimplify_expr): Likewise.
	* optabs-tree.c (optab_for_tree_code): Likewise.
	* tree-cfg.c (verify_gimple_assign_ternary): Likewise.
	* tree-eh.c (operation_could_trap_p): Likewise.
	(stmt_could_throw_1_p): Likewise.
	* tree-inline.c (estimate_operator_cost): Likewise.
	* tree-pretty-print.c (dump_generic_node): Likewise.
	(op_code_prio): Likewise.
	* tree-ssa-loop-im.c (stmt_cost): Likewise.
	* tree-ssa-operands.c (get_expr_operands): Likewise.
	* tree.c (commutative_ternary_tree_code, add_expr): Likewise.
	* fold-const-call.h (fold_fma): Delete.
	* fold-const-call.c (fold_const_call_ssss): Handle CFN_FMS,
	CFN_FNMA and CFN_FNMS.
	(fold_fma): Delete.
	* genmatch.c (combined_fn): New enum.
	(commutative_ternary_tree_code): Remove FMA_EXPR handling.
	(commutative_op): New function.
	(commutate): Use it.  Handle more than 2 operands.
	(dt_operand::gen_gimple_expr): Use commutative_op.
	(parser::parse_expr): Allow :c to be used with non-binary
	operators if the commutative operand is known.
	* gimple-ssa-backprop.c (backprop::process_builtin_call_use): Handle
	CFN_FMS, CFN_FNMA and CFN_FNMS.
	(backprop::process_assign_use): Remove FMA_EXPR handling.
	* hsa-gen.c (gen_hsa_insns_for_operation_assignment): Likewise.
	(gen_hsa_fma): New function.
	(gen_hsa_insn_for_internal_fn_call): Use it for IFN_FMA, IFN_FMS,
	IFN_FNMA and IFN_FNMS.
	* match.pd: Add folds for IFN_FMS, IFN_FNMA and IFN_FNMS.
	* gimple-fold.h (follow_all_ssa_edges): Declare.
	* gimple-fold.c (follow_all_ssa_edges): New function.
	* tree-ssa-math-opts.c (convert_mult_to_fma_1): Use the
	gimple_build interface and use follow_all_ssa_edges to fold the result.
	(convert_mult_to_fma): Use direct_internal_fn_supported_p
	instead of checking for optabs directly.
	* config/i386/i386.c (ix86_add_stmt_cost): Recognize FMAs as calls
	rather than FMA_EXPRs.
	* config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Create a
	call to IFN_FMA instead of an FMA_EXPR.

gcc/brig/
	* brigfrontend/brig-function.cc
	(brig_function::get_builtin_for_hsa_opcode): Use BUILT_IN_FMA
	for BRIG_OPCODE_FMA.
	(brig_function::get_tree_code_for_hsa_opcode): Treat BUILT_IN_FMA
	as a call.

gcc/c/
	* gimple-parser.c (c_parser_gimple_postfix_expression): Remove
	__FMA_EXPR handling.

gcc/cp/
	* constexpr.c (cxx_eval_constant_expression): Remove FMA_EXPR handling.
	(potential_constant_expression_1): Likewise.

gcc/testsuite/
	* lib/target-supports.exp (check_effective_target_scalar_all_fma):
	New proc.
	* gcc.dg/fma-1.c: New test.
	* gcc.dg/fma-2.c: Likewise.
	* gcc.dg/fma-3.c: Likewise.
	* gcc.dg/fma-4.c: Likewise.
	* gcc.dg/fma-5.c: Likewise.
	* gcc.dg/fma-6.c: Likewise.
	* gcc.dg/fma-7.c: Likewise.
	* gcc.dg/gimplefe-26.c: Use .FMA instead of __FMA and require
	scalar_all_fma.
	* gfortran.dg/reassoc_7.f: Pass -ffp-contract=off.
	* gfortran.dg/reassoc_8.f: Likewise.
	* gfortran.dg/reassoc_9.f: Likewise.
	* gfortran.dg/reassoc_10.f: Likewise.

From-SVN: r260348
2018-05-18 08:27:58 +00:00
GCC Administrator
a35f9ec2f8 Daily bump.
From-SVN: r260347
2018-05-18 00:16:37 +00:00
Jason Merrill
b2ff74575d line-map.c (linemap_init): Use placement new.
* line-map.c (linemap_init): Use placement new.

	* system.h: #include <new>.

From-SVN: r260343
2018-05-17 19:28:34 -04:00
Jim Wilson
7bbce9b503 RISC-V: Optimize switch with sign-extended index.
gcc/
	* expr.c (do_tablejump): When converting index to Pmode, if we have a
	sign extended promoted subreg, and the range does not have the sign bit
	set, then do a sign extend.

	* config/riscv/riscv.c (riscv_extend_comparands): In unsigned QImode
	test, check for sign extended subreg and/or constant operands, and
	do a sign extend in that case.

	gcc/testsuite/
	* gcc.target/riscv/switch-qi.c: New.
	* gcc.target/riscv/switch-si.c: New.

From-SVN: r260340
2018-05-17 15:37:38 -07:00
Steve Ellcey
4e0684beff thunderx2t99.md (thunderx2t99_ls_both): Delete.
2018-05-17  Steve Ellcey  <sellcey@cavium.com>

	* config/aarch64/thunderx2t99.md (thunderx2t99_ls_both): Delete.
	(thunderx2t99_multiple): Delete pseudo-units from used cpus.
	Add untyped.
	(thunderx2t99_alu_shift): Remove alu_shift_reg, alus_shift_reg.
	Change logics_shift_reg to logics_shift_imm.
	(thunderx2t99_fp_loadpair_basic): Delete.
	(thunderx2t99_fp_storepair_basic): Delete.
	(thunderx2t99_asimd_int): Add neon_sub and neon_sub_q types.
	(thunderx2t99_asimd_polynomial): Delete.
	(thunderx2t99_asimd_fp_simple): Add neon_fp_mul_s_scalar_q
	and neon_fp_mul_d_scalar_q.
	(thunderx2t99_asimd_fp_conv): Add *int_to_fp* types.
	(thunderx2t99_asimd_misc): Delete neon_dup and neon_dup_q.
	(thunderx2t99_asimd_recip_step): Add missing *sqrt* types.
	(thunderx2t99_asimd_lut): Add missing tbl types.
	(thunderx2t99_asimd_ext): Delete.
	(thunderx2t99_asimd_load1_1_mult): Delete.
	(thunderx2t99_asimd_load1_2_mult): Delete.
	(thunderx2t99_asimd_load1_ldp): New.
	(thunderx2t99_asimd_load1): New.
	(thunderx2t99_asimd_load2): Add missing *load2* types.
	(thunderx2t99_asimd_load3): New.
	(thunderx2t99_asimd_load4): New.
	(thunderx2t99_asimd_store1_1_mult): Delete.
	(thunderx2t99_asimd_store1_2_mult): Delete.
	(thunderx2t99_asimd_store2_mult): Delete.
	(thunderx2t99_asimd_store2_onelane): Delete.
	(thunderx2t99_asimd_store_stp): New.
	(thunderx2t99_asimd_store1): New.
	(thunderx2t99_asimd_store2): New.
	(thunderx2t99_asimd_store3): New.
	(thunderx2t99_asimd_store4): New.

From-SVN: r260335
2018-05-17 21:05:46 +00:00
Jerome Lambourg
fcf4f8311e arm_cmse.h (cmse_nsfptr_create, [...]): Remove #include <stdint.h>.
2018-05-17  Jerome Lambourg  <lambourg@adacore.com>

	gcc/
	* config/arm/arm_cmse.h (cmse_nsfptr_create, cmse_is_nsfptr): Remove
	#include <stdint.h>.  Replace intptr_t with __INTPTR_TYPE__.

	libgcc/
	* config/arm/cmse.c (cmse_check_address_range): Replace
	UINTPTR_MAX with __UINTPTR_MAX__ and uintptr_t with __UINTPTR_TYPE__.
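
The substitution can be sketched in C as follows. __INTPTR_TYPE__ is a macro predefined by GCC (and by Clang), so a header can name a pointer-sized integer type without including <stdint.h>. The macro names PTR_CLEAR_LOW_BIT and PTR_LOW_BIT_IS_CLEAR are hypothetical and only illustrate the idea; they are not the definitions in arm_cmse.h.

```c
/* No #include <stdint.h>: the compiler predefines __INTPTR_TYPE__ as
   the type that <stdint.h> would expose as intptr_t.  */

/* Hypothetical helper: clear the least significant bit of pointer P,
   keeping P's own pointer type.  */
#define PTR_CLEAR_LOW_BIT(p) \
  ((__typeof__ (p)) ((__INTPTR_TYPE__) (p) & ~(__INTPTR_TYPE__) 1))

/* Hypothetical helper: nonzero if the least significant bit of P is 0.  */
#define PTR_LOW_BIT_IS_CLEAR(p) \
  (!((__INTPTR_TYPE__) (p) & 1))
```

Using the predefined macro keeps the header from pulling <stdint.h> names into every translation unit that includes it.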

From-SVN: r260330
2018-05-17 16:36:36 +00:00
Pat Haugen
ca7584f79a re PR tree-optimization/85698 (CPU2017 525.x264_r fails starting with r257581)
PR target/85698
	* config/rs6000/rs6000.c (rs6000_output_move_128bit): Check dest operand.

	* gcc.target/powerpc/pr85698.c: New test.


Co-Authored-By: Segher Boessenkool <segher@kernel.crashing.org>

From-SVN: r260329
2018-05-17 16:19:16 +00:00
Jonathan Wakely
079638f924 PR libstdc++/85818 ensure path::preferred_separator is defined
Because path.cc is compiled with -std=gnu++17 the static constexpr
data member is implicitly 'inline' and so no definition gets emitted
unless it gets used in that translation unit. Other translation units
built as C++11 or C++14 still require a namespace-scope definition of
the variable, so mark the definition as used.

	PR libstdc++/85818
	* src/filesystem/path.cc (path::preferred_separator): Add used
	attribute.
	* testsuite/experimental/filesystem/path/preferred_separator.cc: New.

From-SVN: r260326
2018-05-17 16:36:25 +01:00
Jonathan Wakely
ff03245e00 PR libstdc++/85812 fix memory leak in std::make_exception_ptr
PR libstdc++/85812
	* libsupc++/cxxabi_init_exception.h (__cxa_free_exception): Declare.
	* libsupc++/exception_ptr.h (make_exception_ptr) [__cpp_exceptions]:
	Refactor to separate non-throwing and throwing implementations.
	[__cpp_rtti && !_GLIBCXX_HAVE_CDTOR_CALLABI]: Deallocate the memory
	if constructing the object throws.

From-SVN: r260323
2018-05-17 16:03:29 +01:00
Richard Biener
f1bcb061d1 tree-ssa-dse.c (dse_classify_store): Fix iterator increment for pruning loop and prune defs feeding only already...
2018-05-17  Richard Biener  <rguenther@suse.de>

	* tree-ssa-dse.c (dse_classify_store): Fix iterator increment
	for pruning loop and prune defs feeding only already visited PHIs.

From-SVN: r260322
2018-05-17 13:42:21 +00:00
Richard Biener
3f90a68f0f tree-ssa-sccvn.c (vn_reference_lookup_3): Improve memset handling.
2018-05-17  Richard Biener  <rguenther@suse.de>

	* tree-ssa-sccvn.c (vn_reference_lookup_3): Improve memset handling.

	* gcc.dg/tree-ssa/ssa-fre-63.c: New testcase.

From-SVN: r260318
2018-05-17 12:06:44 +00:00
Bin Cheng
bb4e474765 re PR tree-optimization/85793 ([AARCH64] ICE in verify_gimple during GIMPLE pass vect.)
PR tree-optimization/85793
	* tree-vect-stmts.c (vectorizable_load): Handle 1 element-wise load
	for VMAT_ELEMENTWISE.

	gcc/testsuite
	* gcc.dg/vect/pr85793.c: New test.

Co-Authored-By: Richard Biener <rguenther@suse.de>

From-SVN: r260317
2018-05-17 11:25:43 +00:00
Richard Sandiford
e4f81565ce Gimple FE support for internal functions
This patch gets the gimple FE to parse calls to internal functions.
The only non-obvious thing was how the functions should be written
to avoid clashes with real function names.  One option would be to
go the magic number of underscores route, but we already do that for
built-in functions, and it would be good to keep them visually
distinct.  In the end I borrowed the local/internal label convention
from asm and used:

  x = .SQRT (y);

2018-05-17  Richard Sandiford  <richard.sandiford@linaro.org>

gcc/
	* internal-fn.h (lookup_internal_fn): Declare.
	* internal-fn.c (lookup_internal_fn): New function.
	* gimple.c (gimple_build_call_from_tree): Handle calls to
	internal functions.
	* gimple-pretty-print.c (dump_gimple_call): Print "." before
	internal function names.
	* tree-pretty-print.c (dump_generic_node): Likewise.
	* tree-ssa-scopedtables.c (expr_hash_elt::print): Likewise.

gcc/c/
	* gimple-parser.c: Include internal-fn.h.
	(c_parser_gimple_statement): Treat a leading CPP_DOT as a call.
	(c_parser_gimple_call_internal): New function.
	(c_parser_gimple_postfix_expression): Use it to handle CPP_DOT.
	Fix typos in comment.

gcc/testsuite/
	* gcc.dg/gimplefe-28.c: New test.
	* gcc.dg/asan/use-after-scope-9.c: Adjust expected output for
	internal function calls.
	* gcc.dg/goacc/loop-processing-1.c: Likewise.

From-SVN: r260316
2018-05-17 10:52:58 +00:00
Richard Sandiford
eb69361d0c Allow gimple_build with internal functions
This patch makes the function versions of gimple_build and
gimple_simplify take combined_fns rather than built_in_codes,
so that they work with internal functions too.  The old
gimple_builds were unused, so no existing callers need
to be updated.

2018-05-17  Richard Sandiford  <richard.sandiford@linaro.org>

gcc/
	* gimple-fold.h (gimple_build): Make the function forms take
	combined_fn rather than built_in_function.
	(gimple_simplify): Likewise.
	* gimple-match-head.c (gimple_simplify): Likewise.
	* gimple-fold.c (gimple_build): Likewise.
	* tree-vect-loop.c (get_initial_def_for_reduction): Use gimple_build
	rather than gimple_build_call_internal.
	(get_initial_defs_for_reduction): Likewise.
	(vect_create_epilog_for_reduction): Likewise.
	(vectorizable_live_operation): Likewise.

From-SVN: r260315
2018-05-17 10:51:42 +00:00
Martin Liska
40659769b2 Fix GNU coding style for G_.
2018-05-17  Martin Liska  <mliska@suse.cz>

	* gimple-ssa-sprintf.c (format_directive): Do not use
	space in between 'G_' and '('.
2018-05-17  Martin Liska  <mliska@suse.cz>

	* c-warn.c (overflow_warning): Do not use
	space in between 'G_' and '('.
2018-05-17  Martin Liska  <mliska@suse.cz>

	* gcc.dg/plugin/ggcplug.c (plugin_init): Do not use
	space in between 'G_' and '('.

From-SVN: r260314
2018-05-17 10:44:01 +00:00
Jakub Jelinek
78b9544b33 re PR target/85323 (SSE/AVX/AVX512 shift by 0 not optimized away)
PR target/85323
	* config/i386/i386.c (ix86_fold_builtin): Handle masked shifts
	even if the mask is not all ones.

	* gcc.target/i386/pr85323-7.c: New test.
	* gcc.target/i386/pr85323-8.c: New test.
	* gcc.target/i386/pr85323-9.c: New test.

From-SVN: r260313
2018-05-17 12:07:12 +02:00
Jakub Jelinek
6a03477e85 re PR target/85323 (SSE/AVX/AVX512 shift by 0 not optimized away)
PR target/85323
	* config/i386/i386.c (ix86_fold_builtin): Fold shift builtins by
	vector.
	(ix86_gimple_fold_builtin): Likewise.

	* gcc.target/i386/pr85323-4.c: New test.
	* gcc.target/i386/pr85323-5.c: New test.
	* gcc.target/i386/pr85323-6.c: New test.

From-SVN: r260312
2018-05-17 12:01:33 +02:00
Jakub Jelinek
28a8a768eb re PR target/85323 (SSE/AVX/AVX512 shift by 0 not optimized away)
PR target/85323
	* config/i386/i386.c: Include tree-vector-builder.h.
	(ix86_vector_shift_count): New function.
	(ix86_fold_builtin): Fold shift builtins by scalar count.
	(ix86_gimple_fold_builtin): Likewise.

	* gcc.target/i386/pr85323-1.c: New test.
	* gcc.target/i386/pr85323-2.c: New test.
	* gcc.target/i386/pr85323-3.c: New test.

From-SVN: r260311
2018-05-17 11:54:36 +02:00
Jakub Jelinek
4e6a811fad avx512fintrin.h (_mm512_set_epi16, [...]): New intrinsics.
* config/i386/avx512fintrin.h (_mm512_set_epi16, _mm512_set_epi8,
	_mm512_setzero): New intrinsics.

	* gcc.target/i386/avx512f-set-v32hi-1.c: New test.
	* gcc.target/i386/avx512f-set-v32hi-2.c: New test.
	* gcc.target/i386/avx512f-set-v32hi-3.c: New test.
	* gcc.target/i386/avx512f-set-v32hi-4.c: New test.
	* gcc.target/i386/avx512f-set-v32hi-5.c: New test.
	* gcc.target/i386/avx512f-set-v64qi-1.c: New test.
	* gcc.target/i386/avx512f-set-v64qi-2.c: New test.
	* gcc.target/i386/avx512f-set-v64qi-3.c: New test.
	* gcc.target/i386/avx512f-set-v64qi-4.c: New test.
	* gcc.target/i386/avx512f-set-v64qi-5.c: New test.
	* gcc.target/i386/avx512f-setzero-1.c: New test.

From-SVN: r260310
2018-05-17 11:47:52 +02:00
James Greenhalgh
b4e2cd5b9a [patch AArch64] Do not perform a vector splat for vector initialisation if it is not useful
In the testcase in this patch we create an SLP vector with only two
elements. Our current vector initialisation code will first duplicate
the first element to both lanes, then overwrite the top lane with a new
value.

This duplication can be clunky and wasteful.

Better would be to simply use the fact that we will always be
overwriting the remaining bits, and simply move the first element to the correct
place (implicitly zeroing all other bits).

This reduces the code generation for this case, and can allow more
efficient addressing modes, and other second order benefits for AArch64
code which has been vectorized to V2DI mode.

Note that the change is generic enough to catch the case for any vector
mode, but is expected to be most useful for 2x64-bit vectorization.

Unfortunately, on its own, this would cause failures in
gcc.target/aarch64/load_v2vec_lanes_1.c and
gcc.target/aarch64/store_v2vec_lanes.c, which expect to see many more
vec_merge and vec_duplicate for their simplifications to apply. To fix
this, add a special case to the AArch64 code if we are loading from two
memory addresses, and use the load_pair_lanes patterns directly.

We also need a new pattern in simplify-rtx.c:simplify_ternary_operation
to catch:

  (vec_merge:OUTER
     (vec_duplicate:OUTER x:INNER)
     (subreg:OUTER y:INNER 0)
     (const_int N))

And simplify it to:

  (vec_concat:OUTER x:INNER y:INNER) or (vec_concat y x)

This is similar to the existing patterns which are tested in this
function, without requiring the second operand to also be a vec_duplicate. 
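
The two initialisation strategies described at the start of this message can be modelled at source level with GCC's generic vector extension. This is only a sketch: the function names are hypothetical, and the code mirrors the order of operations rather than the RTL that aarch64_expand_vector_init emits.

```c
typedef long long v2di __attribute__ ((vector_size (16)));

/* Old strategy: duplicate the first element into both lanes, then
   overwrite the top lane with the second value.  */
v2di
init_by_splat (long long a, long long b)
{
  v2di res = { a, a };
  res[1] = b;
  return res;
}

/* New strategy: move the first element into lane 0 (the remaining bits
   are zero), then insert the second value into the top lane.  */
v2di
init_by_move (long long a, long long b)
{
  v2di res = { a, 0 };
  res[1] = b;
  return res;
}
```

Both functions return the same vector; the difference is only in the intermediate value, which is why the duplicate can be dropped when every remaining lane is overwritten anyway.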

	* config/aarch64/aarch64.c (aarch64_expand_vector_init): Modify
	code generation for cases where splatting a value is not useful.
	* simplify-rtx.c (simplify_ternary_operation): Simplify
	vec_merge across a vec_duplicate and a paradoxical subreg forming a vector
	mode to a vec_concat.

	* gcc.target/aarch64/vect-slp-dup.c: New.


Co-Authored-By: Kyrylo Tkachov <kyrylo.tkachov@arm.com>

From-SVN: r260309
2018-05-17 09:39:02 +00:00
Paolo Carlini
9b4ef22db8 re PR c++/85713 (ICE in dependent_type_p, at cp/pt.c:24582 on valid code)
2018-05-17  Paolo Carlini  <paolo.carlini@oracle.com>

	PR c++/85713
	* g++.dg/cpp1y/lambda-generic-85713-2.C: New.

From-SVN: r260308
2018-05-17 09:17:56 +00:00
Olga Makhotina
74b2bb19f3 config.gcc: Support "goldmont-plus".
2018-05-17  Olga Makhotina  <olga.makhotina@intel.com>

gcc/

	* config.gcc: Support "goldmont-plus".
	* config/i386/driver-i386.c (host_detect_local_cpu): Detect
	"goldmont-plus".
	* config/i386/i386-c.c (ix86_target_macros_internal): Handle
	PROCESSOR_GOLDMONT_PLUS.
	* config/i386/i386.c (m_GOLDMONT_PLUS): Define.
	(processor_target_table): Add "goldmont-plus".
	(PTA_GOLDMONT_PLUS): Define.
	(ix86_lea_outperforms): Add TARGET_GOLDMONT_PLUS.
	(get_builtin_code_for_version): Handle PROCESSOR_GOLDMONT_PLUS.
	(fold_builtin_cpu): Add M_INTEL_GOLDMONT_PLUS.
	(fold_builtin_cpu): Add "goldmont-plus".
	(ix86_add_stmt_cost): Add TARGET_GOLDMONT_PLUS.
	(ix86_option_override_internal): Add "goldmont-plus".
	* config/i386/i386.h (processor_costs): Define TARGET_GOLDMONT_PLUS.
	(processor_type): Add PROCESSOR_GOLDMONT_PLUS.
	* config/i386/x86-tune.def: Add m_GOLDMONT_PLUS.
	* doc/invoke.texi: Add goldmont-plus as x86 -march=/-mtune= CPU type.

libgcc/

	* config/i386/cpuinfo.h (processor_types): Add INTEL_GOLDMONT_PLUS.
	* config/i386/cpuinfo.c (get_intel_cpu): Detect Goldmont Plus.

gcc/testsuite/

	* gcc.target/i386/builtin_target.c: Test goldmont-plus.
	* gcc.target/i386/funcspec-56.inc: Test arch=goldmont-plus.

From-SVN: r260307
2018-05-17 10:13:23 +02:00
Richard Biener
773d0331f7 re PR tree-optimization/85757 (tree optimizers fail to fully clean up fixed-size memcpy)
2018-05-17  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/85757
	* tree-ssa-dse.c (dse_classify_store): Record a PHI def and
	remove defs that only feed that PHI from further processing.

	* gcc.dg/tree-ssa/ssa-dse-34.c: New testcase.

From-SVN: r260306
2018-05-17 06:57:45 +00:00
GCC Administrator
8ee520219f Daily bump.
From-SVN: r260304
2018-05-17 00:16:17 +00:00
Marek Polacek
0932d398ef re PR c++/85363 (Throwing exception from member constructor (brace initializer vs initializer list))
PR c++/85363
	* call.c (set_flags_from_callee): Handle AGGR_INIT_EXPRs too.
	* tree.c (bot_manip): Call set_flags_from_callee for
	AGGR_INIT_EXPRs too.

	* g++.dg/cpp0x/initlist-throw1.C: New test.
	* g++.dg/cpp0x/initlist-throw2.C: New test.

From-SVN: r260300
2018-05-16 20:37:45 +00:00
Jim Wilson
110fb19f6c RISC-V: Minor pattern name cleanup.
gcc/
	* config/riscv/riscv.md (<optab>si3_mask, <optab>si3_mask_1): Prepend
	asterisk to name.
	(<optab>di3_mask, <optab>di3_mask_1): Likewise.

From-SVN: r260299
2018-05-16 11:37:52 -07:00
Mark Wielaard
bb14f4c6da DWARF: Add header for .debug_str_offsets table for dwarf_version 5.
DWARF5 defines a small header for .debug_str_offsets.  Since we only use
it for split dwarf .dwo files we don't need to keep track of the actual
index offset in an attribute.
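
A minimal sketch of that header, assuming the 32-bit DWARF format (4-byte unit_length and 4-byte string offsets). The struct and function names are hypothetical; the field layout (unit_length, a 2-byte version, 2 bytes of padding) follows the DWARF 5 string offsets table header.

```c
#include <stdint.h>

/* Hypothetical in-memory view of the DWARF 5 .debug_str_offsets header,
   32-bit DWARF format.  */
struct str_offsets_header
{
  uint32_t unit_length;  /* bytes that follow this field */
  uint16_t version;      /* 5 for DWARF 5 */
  uint16_t padding;      /* reserved, zero */
};

/* Build the header for a table holding N_STRINGS 4-byte offsets.  */
struct str_offsets_header
make_str_offsets_header (uint32_t n_strings)
{
  struct str_offsets_header h;
  /* unit_length excludes itself: 2 bytes of version, 2 bytes of padding,
     then one 4-byte offset per indexed string.  */
  h.unit_length = 4 + n_strings * 4;
  h.version = 5;
  h.padding = 0;
  return h;
}
```

The length depends on the number of indexed strings, so that number has to be known before the table is written.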

gcc/ChangeLog

	* dwarf2out.c (count_index_strings): New function.
	(output_indirect_strings): Call count_index_strings and generate
	header for dwarf_version >= 5.

From-SVN: r260298
2018-05-16 18:20:08 +00:00
Mark Wielaard
c0134358c5 DWARF: Emit DWARF5 forms for indirect addresses and string offsets.
We already emit DWARF5 attributes and tables for indirect addresses
and string offsets, but still use GNU forms. Add a new helper function
dwarf_FORM () for emitting the right form.

Currently we only use the uleb128 forms. But DWARF5 also allows
1, 2, 3 and 4 byte forms (DW_FORM_strx[1234] and DW_FORM_addrx[1234])
which might be more space efficient.
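
The idea of such a helper can be sketched as below. The form codes are the published values for DW_FORM_strx, DW_FORM_addrx and the two GNU extension forms; the enum, the function name pick_form, and passing the DWARF version as a parameter are assumptions for this sketch, not the signature of dwarf_FORM () in dwarf2out.c.

```c
/* Form codes for the uleb128 index forms.  */
enum form
{
  FORM_strx = 0x1a,             /* DW_FORM_strx, DWARF 5 */
  FORM_addrx = 0x1b,            /* DW_FORM_addrx, DWARF 5 */
  FORM_GNU_addr_index = 0x1f01, /* DW_FORM_GNU_addr_index */
  FORM_GNU_str_index = 0x1f02   /* DW_FORM_GNU_str_index */
};

/* Hypothetical helper: callers always ask for the DWARF 5 form and get
   the GNU extension form back when targeting an older DWARF version.  */
enum form
pick_form (enum form form, int dwarf_version)
{
  if (dwarf_version >= 5)
    return form;

  switch (form)
    {
    case FORM_addrx:
      return FORM_GNU_addr_index;
    case FORM_strx:
      return FORM_GNU_str_index;
    default:
      return form;
    }
}
```

Routing every use through one function keeps the version check in a single place instead of repeating it at each site listed in the ChangeLog.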

gcc/ChangeLog

	* dwarf2out.c (dwarf_FORM): New function.
	(set_indirect_string): Use dwarf_FORM.
	(reset_indirect_string): Likewise.
	(size_of_die): Likewise.
	(value_format): Likewise.
	(output_die): Likewise.
	(add_skeleton_AT_string): Likewise.
	(output_macinfo_op): Likewise.
	(index_string): Likewise.
	(output_index_string_offset): Likewise.
	(output_index_string): Likewise.

From-SVN: r260297
2018-05-16 18:02:25 +00:00
Carl Love
b958e1c134 rs6000.md (prefetch): Generate ISA 2.06 instructions dcbt and dcbtstt with TH=16 if...
gcc/ChangeLog:

2018-05-16  Carl Love  <cel@us.ibm.com>

	* config/rs6000/rs6000.md (prefetch): Generate ISA 2.06 instructions
	dcbt and dcbtstt with TH=16 if operands[2] is 0 and Power 8 or newer.

From-SVN: r260296
2018-05-16 17:21:04 +00:00
Martin Jambor
73264a8d66 Remove unused function ipa_free_edge_args_substructures
2018-05-16  Martin Jambor  <mjambor@suse.cz>

	* ipa-prop.c (ipa_free_all_edge_args): Remove.
	* ipa-prop.h (ipa_free_all_edge_args): Likewise.

From-SVN: r260295
2018-05-16 18:22:56 +02:00
Carl Love
6747254bba vsx-vector-6-be.c: Remove file.
gcc/testsuite/ChangeLog:

2018-05-16 Carl Love  <cel@us.ibm.com>
	* gcc.target/powerpc/vsx-vector-6-be.c: Remove file.
	* gcc.target/powerpc/vsx-vector-6-be.p7.c: New test file.
	* gcc.target/powerpc/vsx-vector-6-be.p8.c: New test file.
	* gcc.target/powerpc/vsx-vector-6-le.c (dg-final): Update counts for
	xvcmpeqdp., xvcmpgtdp., xvcmpgedp., xxlxor, xvrdpi.

From-SVN: r260294
2018-05-16 16:06:08 +00:00
Wilco Dijkstra
d6e6e8b677 [AArch64] Improve register allocation of fma
This patch improves register allocation of fma by preferring to update the
accumulator register.  This is done by adding fma insns with operand 1 as the
accumulator.  The register allocator considers copy preferences only in operand
order, so if the first operand is dead, it has the highest chance of being
reused as the destination.  As a result code using fma often has a better
register allocation.  Performance of SPECFP2017 improves by over 0.5% on some
implementations, while it had no effect on other implementations.  Fma is more
readable too, in a simple example we now generate:

	fmadd	s16, s2, s1, s16
	fmadd	s7, s17, s16, s7
	fmadd	s6, s16, s7, s6
	fmadd	s5, s7, s6, s5

instead of:

	fmadd	s16, s16, s2, s1
	fmadd	s7, s7, s16, s6
	fmadd	s6, s6, s7, s5
	fmadd	s5, s5, s6, s4

    gcc/
	* config/aarch64/aarch64.md (fma<mode>4): Change into expand pattern.
	(fnma<mode>4): Likewise.
	(fms<mode>4): Likewise.
	(fnms<mode>4): Likewise.
	(aarch64_fma<mode>4): Rename insn, reorder accumulator operand.
	(aarch64_fnma<mode>4): Likewise.
	(aarch64_fms<mode>4): Likewise.
	(aarch64_fnms<mode>4): Likewise.
	(aarch64_fnmadd<mode>4): Likewise.

From-SVN: r260292
2018-05-16 14:33:16 +00:00
Jason Merrill
df0fc585b7 * tree.c (warn_deprecated_use): Return bool. Simplify logic.
From-SVN: r260290
2018-05-16 09:19:56 -04:00
Richard Biener
68435eb293 tree-vectorizer.h (struct stmt_info_for_cost): Add where member.
2018-05-16  Richard Biener  <rguenther@suse.de>

	* tree-vectorizer.h (struct stmt_info_for_cost): Add where member.
	(dump_stmt_cost): Declare.
	(add_stmt_cost): Dump cost we add.
	(add_stmt_costs): New function.
	(vect_model_simple_cost, vect_model_store_cost, vect_model_load_cost):
	No longer exported.
	(vect_analyze_stmt): Adjust prototype.
	(vectorizable_condition): Likewise.
	(vectorizable_live_operation): Likewise.
	(vectorizable_reduction): Likewise.
	(vectorizable_induction): Likewise.
	* tree-vect-loop.c (vect_analyze_loop_operations): Create local
	cost vector to pass to vectorizable_ and record afterwards.
	(vect_model_reduction_cost): Take cost vector argument and adjust.
	(vect_model_induction_cost): Likewise.
	(vectorizable_reduction): Likewise.
	(vectorizable_induction): Likewise.
	(vectorizable_live_operation): Likewise.
	* tree-vect-slp.c (vect_create_new_slp_node): Initialize
	SLP_TREE_NUMBER_OF_VEC_STMTS.
	(vect_analyze_slp_cost_1): Remove.
	(vect_analyze_slp_cost): Likewise.
	(vect_slp_analyze_node_operations): Take visited args and
	a target cost vector.  Avoid processing already visited stmt sets.
	(vect_slp_analyze_operations): Use a local cost vector to gather
	costs and register those of non-discarded instances.
	(vect_bb_vectorization_profitable_p): Use add_stmt_costs.
	(vect_schedule_slp_instance): Remove copying of
	SLP_TREE_NUMBER_OF_VEC_STMTS.  Instead assert that it is not
	zero.
	* tree-vect-stmts.c (record_stmt_cost): Remove path directly
	adding cost.  Record cost entry location.
	(vect_prologue_cost_for_slp_op): Function to compute cost of
	a constant or invariant generated for SLP vect in the prologue,
	split out from vect_analyze_slp_cost_1.
	(vect_model_simple_cost): Make static.  Adjust for SLP costing.
	(vect_model_promotion_demotion_cost): Likewise.
	(vect_model_store_cost): Likewise, make static.
	(vect_model_load_cost): Likewise.
	(vectorizable_bswap): Add cost vector arg and adjust.
	(vectorizable_call): Likewise.
	(vectorizable_simd_clone_call): Likewise.
	(vectorizable_conversion): Likewise.
	(vectorizable_assignment): Likewise.
	(vectorizable_shift): Likewise.
	(vectorizable_operation): Likewise.
	(vectorizable_store): Likewise.
	(vectorizable_load): Likewise.
	(vectorizable_condition): Likewise.
	(vectorizable_comparison): Likewise.
	(can_vectorize_live_stmts): Likewise.
	(vect_analyze_stmt): Likewise.
	(vect_transform_stmt): Adjust calls to vectorizable_*.
	* tree-vectorizer.c: Include gimple-pretty-print.h.
	(dump_stmt_cost): New function.

From-SVN: r260289
2018-05-16 13:08:04 +00:00
Richard Biener
311eb8168e params.def (PARAM_DSE_MAX_ALIAS_QUERIES_PER_STORE): New param.
2018-05-16  Richard Biener  <rguenther@suse.de>

	* params.def (PARAM_DSE_MAX_ALIAS_QUERIES_PER_STORE): New param.
	* doc/invoke.texi (dse-max-alias-queries-per-store): Document.
	* tree-ssa-dse.c: Include tree-ssa-loop.h.
	(check_name): New callback.
	(dse_classify_store): Track cycles via a visited bitmap of PHI
	defs and simplify handling of in-loop and across loop dead stores
	and properly fail for loop-variant refs.  Handle byte-tracking with
	multiple defs.  Use PARAM_DSE_MAX_ALIAS_QUERIES_PER_STORE for
	limiting the walk.

	* gcc.dg/tree-ssa/ssa-dse-32.c: New testcase.
	* gcc.dg/tree-ssa/ssa-dse-33.c: Likewise.
	* gcc.dg/uninit-pr81897-2.c: Use -fno-tree-dse.

From-SVN: r260288
2018-05-16 13:02:27 +00:00
Richard Sandiford
1f3cb66326 Handle vector boolean types when calculating the SLP unroll factor
The SLP unrolling factor is calculated by finding the smallest
scalar type for each SLP statement and taking the number of required
lanes from the vector versions of those scalar types.  E.g. for an
int32->int64 conversion, it's the vector of int32s rather than the
vector of int64s that determines the unroll factor.

We rely on tree-vect-patterns.c to replace boolean operations like:

   bool a, b, c;
   a = b & c;

with integer operations of whatever the best size is in context.
E.g. if b and c are fed by comparisons of ints, a, b and c will become
the appropriate size for an int comparison.  For most targets this means
that a, b and c will end up as int-sized themselves, but on targets like
SVE and AVX512 with packed vector booleans, they'll instead become a
small bitfield like :1, padded to a byte for memory purposes.
The SLP code would then take these scalar types and try to calculate
the vector type for them, causing the unroll factor to be much higher
than necessary.

This patch tries to make the SLP code use the same approach as the
loop vectorizer, by splitting out the code that calculates the
statement vector type and the vector type that should be used for
the number of units.
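
The effect described above can be illustrated with a small model.  This is
only a sketch, not the vectorizer's code: the 128-bit vector width, the
group size of 4, and the helper names (gcd_of, lanes, unroll_factor) are
assumptions made for the example, and the unroll factor is modeled as
lcm (nunits, group_size) / group_size.

```cpp
#include <cassert>

// Greatest common divisor (Euclid), usable in constant expressions.
constexpr unsigned gcd_of(unsigned a, unsigned b) {
    return b == 0 ? a : gcd_of(b, a % b);
}

// Number of lanes a fixed-width vector holds for a given element size.
// Hypothetical helper; both arguments are in bits.
constexpr unsigned lanes(unsigned vector_bits, unsigned element_bits) {
    return vector_bits / element_bits;
}

// Assumed model: unroll until the SLP group fills a whole number of
// vectors, i.e. lcm(nunits, group_size) / group_size, which simplifies
// to nunits / gcd(nunits, group_size).
constexpr unsigned unroll_factor(unsigned nunits, unsigned group_size) {
    return nunits / gcd_of(nunits, group_size);
}

void example() {
    // int32->int64 conversion: the int32 vector decides, 128 / 32 = 4 lanes.
    assert(lanes(128, 32) == 4);
    // A group of 4 statements fills that vector without unrolling.
    assert(unroll_factor(lanes(128, 32), 4) == 1);
    // A boolean treated as a byte-sized scalar gives 16 lanes, so the
    // same group would have to be unrolled 4 times.
    assert(unroll_factor(lanes(128, 8), 4) == 4);
}
```

In this model the byte-sized boolean raises the unroll factor from 1 to 4
even though every other statement in the group works on ints, which is the
inflation the patch is meant to avoid.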

2018-05-16  Richard Sandiford  <richard.sandiford@linaro.org>

gcc/
	* tree-vectorizer.h (vect_get_vector_types_for_stmt): Declare.
	(vect_get_mask_type_for_stmt): Likewise.
	* tree-vect-slp.c (vect_two_operations_perm_ok_p): New function,
	split out from...
	(vect_build_slp_tree_1): ...here.  Use vect_get_vector_types_for_stmt
	to determine the statement's vector type and the vector type that
	should be used for calculating nunits.  Deal with cases in which
	the type has to be deferred.
	(vect_slp_analyze_node_operations): Use vect_get_vector_types_for_stmt
	and vect_get_mask_type_for_stmt to calculate STMT_VINFO_VECTYPE.
	* tree-vect-loop.c (vect_determine_vf_for_stmt_1)
	(vect_determine_vf_for_stmt): New functions, split out from...
	(vect_determine_vectorization_factor): ...here.
	* tree-vect-stmts.c (vect_get_vector_types_for_stmt)
	(vect_get_mask_type_for_stmt): New functions, split out from
	vect_determine_vectorization_factor.

gcc/testsuite/
	* gcc.target/aarch64/sve/vcond_10.c: New test.
	* gcc.target/aarch64/sve/vcond_10_run.c: Likewise.
	* gcc.target/aarch64/sve/vcond_11.c: Likewise.
	* gcc.target/aarch64/sve/vcond_11_run.c: Likewise.

From-SVN: r260287
2018-05-16 11:50:44 +00:00
Richard Biener
c448fedea9 tree-cfg.c (verify_gimple_assign_ternary): Properly verify the [VEC_]COND_EXPR embedded comparison.
2018-05-16  Richard Biener  <rguenther@suse.de>

	* tree-cfg.c (verify_gimple_assign_ternary): Properly
	verify the [VEC_]COND_EXPR embedded comparison.

From-SVN: r260283
2018-05-16 10:22:52 +00:00
Martin Sebor
7ad491c636 PR tree-optimization/85753 - missing -Wrestrict on memcpy into a member array
gcc/ChangeLog:

	PR tree-optimization/85753
	* gimple-ssa-warn-restrict.c (builtin_memref::builtin_memref): Handle
	RECORD_TYPE in addition to ARRAY_TYPE.

gcc/testsuite/ChangeLog:

	PR tree-optimization/85753
	* gcc.dg/Wrestrict-10.c: Adjust.
	* gcc.dg/Wrestrict-16.c: New test.

From-SVN: r260280
2018-05-15 20:30:38 -06:00
Jason Merrill
e4a148963e cp-tree.h (cp_expr): Remove copy constructor.
* cp-tree.h (cp_expr): Remove copy constructor.

	* mangle.c (struct releasing_vec): Declare copy constructor.

From-SVN: r260279
2018-05-15 20:57:56 -04:00
GCC Administrator
67ea8181df Daily bump.
From-SVN: r260277
2018-05-16 00:16:25 +00:00
Jason Merrill
dc5ca6c86f * constexpr.c (cxx_eval_vec_init_1): Pass tf_none if ctx->quiet.
From-SVN: r260273
2018-05-15 17:56:34 -04:00
Jason Merrill
30a52a6d62 PR c++/64372 - CWG 1560, gratuitous lvalue-rvalue conversion in ?:
* call.c (build_conditional_expr_1): Don't force_rvalue when one arm
	is a throw-expression.
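
A minimal sketch of the kind of code this affects, assuming the CWG 1560
resolution makes the conditional expression take the type and value
category of its non-throw arm.  The function name checked_ref is
hypothetical and does not come from the PR.

```cpp
#include <cassert>

// Returns a reference to x, or throws if fail is set.  Binding the
// result to int& requires the conditional expression to be an lvalue,
// which is the case only if the non-throw arm is not forced to an rvalue.
int& checked_ref(bool fail, int& x) {
    return fail ? throw 0 : x;
}

void example() {
    int v = 5;
    checked_ref(false, v) = 9;   // writes through the returned reference
    assert(v == 9);
}
```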

From-SVN: r260272
2018-05-15 17:56:29 -04:00
Martin Sebor
275605696b PR middle-end/85643 - attribute nonstring fails to squash -Wstringop-truncation warning
gcc/ChangeLog:

	PR middle-end/85643
	* calls.c (get_attr_nonstring_decl): Handle MEM_REF.

gcc/testsuite/ChangeLog:

	PR middle-end/85643
	* c-c++-common/attr-nonstring-7.c: New test.

From-SVN: r260271
2018-05-15 15:52:16 -06:00
Jan Hubicka
ab16804487 re PR lto/85583 (lto1: internal compiler error: in lto_balanced_map, at lto/lto-partition.c:833)
PR lto/85583
	* lto-partition.c (account_reference_p): Do not account
	references from aliases; do not account references from
	external initializers.

From-SVN: r260266
2018-05-15 16:39:43 +00:00
Paolo Carlini
5f150326b3 cp-tree.h (DECL_MAYBE_IN_CHARGE_CDTOR_P): New.
2018-05-15  Paolo Carlini  <paolo.carlini@oracle.com>

	* cp-tree.h (DECL_MAYBE_IN_CHARGE_CDTOR_P): New.
	(FOR_EACH_CLONE): Update.
	* decl.c (grokdeclarator): Use it.
	* decl2.c (vague_linkage_p): Likewise.
	* mangle.c (mangle_decl): Likewise.
	* method.c (lazily_declare_fn): Likewise.
	* optimize.c (can_alias_cdtor, maybe_clone_body): Likewise.
	* repo.c (repo_emit_p): Likewise.
	* tree.c (decl_linkage): Likewise.

From-SVN: r260264
2018-05-15 16:03:56 +00:00
Jonathan Wakely
5a7960da41 PR libstdc++/85749 constrain seed sequences for random number engines
Constrain constructors and member functions of random number engines so
that functions taking seed sequences can only be called with types that
meet the seed sequence requirements.
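
The general technique can be sketched as follows.  This is not the
libstdc++ implementation: is_seed_seq_like, void_t_helper and toy_engine
are hypothetical names, and the trait checks only for a nested result_type
and a generate member, which is an assumption about what a usable seed
sequence looks like rather than the full set of requirements.

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#include <type_traits>
#include <utility>

// C++11-compatible stand-in for std::void_t.
template <typename...> struct void_t_helper { using type = void; };

// Hypothetical trait: true if S has result_type and generate(first, last).
template <typename S, typename = void>
struct is_seed_seq_like : std::false_type {};

template <typename S>
struct is_seed_seq_like<S, typename void_t_helper<
    typename S::result_type,
    decltype(std::declval<S&>().generate(std::declval<std::uint32_t*>(),
                                         std::declval<std::uint32_t*>()))
  >::type> : std::true_type {};

// Toy engine with one word of state, for illustration only.
struct toy_engine {
    std::uint32_t state;

    explicit toy_engine(std::uint32_t seed = 1u) : state(seed) {}

    // Participates in overload resolution only for seed-sequence-like
    // types, so it cannot hijack copies from a non-const toy_engine.
    template <typename Sseq,
              typename = typename std::enable_if<
                  is_seed_seq_like<Sseq>::value>::type>
    explicit toy_engine(Sseq& q) { q.generate(&state, &state + 1); }
};

struct not_a_seq {};

static_assert(is_seed_seq_like<std::seed_seq>::value, "seed_seq qualifies");
static_assert(!is_seed_seq_like<int>::value, "int does not qualify");
static_assert(std::is_constructible<toy_engine, std::seed_seq&>::value, "");
static_assert(!std::is_constructible<toy_engine, not_a_seq&>::value, "");
```

Without the enable_if constraint, copying a non-const toy_engine would
select the template constructor (an exact match on Sseq&) and fail to
compile inside its body; with the constraint, the copy constructor is used.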

	PR libstdc++/85749
	* include/bits/random.h (__detail::__is_seed_seq): New SFINAE helper.
	(linear_congruential_engine, mersenne_twister_engine)
	(subtract_with_carry_engine, discard_block_engine)
	(independent_bits_engine, shuffle_order_engine): Use __is_seed_seq to
	constrain function templates taking seed sequences.
	* include/bits/random.tcc (linear_congruential_engine::seed(_Sseq&))
	(mersenne_twister_engine::seed(_Sseq&))
	(subtract_with_carry_engine::seed(_Sseq&)): Change return types to
	match declarations.
	* include/ext/random (simd_fast_mersenne_twister_engine): Use
	__is_seed_seq to constrain function templates taking seed sequences.
	* include/ext/random.tcc (simd_fast_mersenne_twister_engine::seed):
	Change return type to match declaration.
	* testsuite/26_numerics/random/discard_block_engine/cons/seed_seq2.cc:
	New.
	* testsuite/26_numerics/random/independent_bits_engine/cons/
	seed_seq2.cc: New.
	* testsuite/26_numerics/random/linear_congruential_engine/cons/
	seed_seq2.cc: New.
	* testsuite/26_numerics/random/mersenne_twister_engine/cons/
	seed_seq2.cc: New.
	* testsuite/26_numerics/random/pr60037-neg.cc: Adjust dg-error lineno.
	* testsuite/26_numerics/random/shuffle_order_engine/cons/seed_seq2.cc:
	New.
	* testsuite/26_numerics/random/subtract_with_carry_engine/cons/
	seed_seq2.cc: New.
	* testsuite/ext/random/simd_fast_mersenne_twister_engine/cons/
	seed_seq2.cc: New.

From-SVN: r260263
2018-05-15 16:36:46 +01:00