Commit Graph

158606 Commits

Author SHA1 Message Date
Richard Sandiford
f3ff49007a poly_int: expand_expr_real_1
This patch makes the VIEW_CONVERT_EXPR handling in expand_expr_real_1
cope with polynomial type and mode sizes.

2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
	* expr.c (expand_expr_real_1): Use tree_to_poly_uint64
	instead of int_size_in_bytes when handling VIEW_CONVERT_EXPRs
	via stack temporaries.  Treat the mode size as polynomial too.

Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>

From-SVN: r256152
2018-01-03 07:17:52 +00:00
Richard Sandiford
0fd03b44ab poly_int: expand_expr_real_2
This patch makes expand_expr_real_2 cope with polynomial mode sizes
when handling conversions involving a union type.

2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
	* expr.c (expand_expr_real_2): When handling conversions involving
	unions, apply tree_to_poly_uint64 to the TYPE_SIZE rather than
	multiplying int_size_in_bytes by BITS_PER_UNIT.  Treat GET_MODE_BISIZE
	as a poly_uint64 too.

Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>

From-SVN: r256151
2018-01-03 07:17:46 +00:00
Richard Sandiford
c3266d10a4 poly_int: subreg_get_info
This patch makes subreg_get_info handle polynomial sizes.

2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
	* rtlanal.c (subreg_get_info): Handle polynomial mode sizes.

Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>

From-SVN: r256150
2018-01-03 07:17:39 +00:00
Richard Sandiford
fad2288b4b poly_int: REGMODE_NATURAL_SIZE
This patch makes target-independent code that uses REGMODE_NATURAL_SIZE
treat it as a poly_int rather than a constant.

2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
	* combine.c (can_change_dest_mode): Handle polynomial
	REGMODE_NATURAL_SIZE.
	* expmed.c (store_bit_field_1): Likewise.
	* expr.c (store_constructor): Likewise.
	* emit-rtl.c (validate_subreg): Operate on polynomial mode sizes
	and polynomial REGMODE_NATURAL_SIZE.
	(gen_lowpart_common): Likewise.
	* reginfo.c (record_subregs_of_mode): Likewise.
	* rtlanal.c (read_modify_subreg_p): Likewise.

Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>

From-SVN: r256149
2018-01-03 07:17:33 +00:00
Richard Sandiford
07626e49a0 poly_int: expand_vector_ubsan_overflow
This patch makes expand_vector_ubsan_overflow cope with a polynomial
number of elements.

2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
	* internal-fn.c (expand_vector_ubsan_overflow): Handle polynomial
	numbers of elements.

Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>

From-SVN: r256148
2018-01-03 07:17:27 +00:00
Richard Sandiford
d34457c138 poly_int: folding BIT_FIELD_REFs on vectors
This patch makes the:

  (BIT_FIELD_REF CONSTRUCTOR@0 @1 @2)

folder cope with polynomial numbers of elements.

2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
	* match.pd: Cope with polynomial numbers of vector elements.

Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>

From-SVN: r256147
2018-01-03 07:17:18 +00:00
Richard Sandiford
fece509bf1 poly_int: fold_indirect_ref_1
This patch makes fold_indirect_ref_1 handle polynomial offsets in
a POINTER_PLUS_EXPR.  The specific reason for doing this now is
to handle:

 		  (tree_to_uhwi (part_width) / BITS_PER_UNIT
 		   * TYPE_VECTOR_SUBPARTS (op00type));

when TYPE_VECTOR_SUBPARTS becomes a poly_int.

2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
	* fold-const.c (fold_indirect_ref_1): Handle polynomial offsets
	in a POINTER_PLUS_EXPR.

Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>

From-SVN: r256146
2018-01-03 07:17:12 +00:00
Richard Sandiford
d8f860ef70 poly_int: omp-simd-clone.c
This patch adds a wrapper around TYPE_VECTOR_SUBPARTS for omp-simd-clone.c.
Supporting SIMD clones for variable-length vectors is post GCC8 work.

2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
	* omp-simd-clone.c (simd_clone_subparts): New function.
	(simd_clone_init_simd_arrays): Use it instead of TYPE_VECTOR_SUBPARTS.
	(ipa_simd_modify_function_body): Likewise.

Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>

From-SVN: r256145
2018-01-03 07:17:06 +00:00
Richard Sandiford
e112bba2fc poly_int: brig vector elements
This patch adds a brig-specific wrapper around TYPE_VECTOR_SUBPARTS,
since presumably it will never need to support variable vector lengths.

2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/brig/
	* brigfrontend/brig-util.h (gccbrig_type_vector_subparts): New
	function.
	* brigfrontend/brig-basic-inst-handler.cc
	(brig_basic_inst_handler::build_shuffle): Use it instead of
	TYPE_VECTOR_SUBPARTS.
	(brig_basic_inst_handler::build_unpack): Likewise.
	(brig_basic_inst_handler::build_pack): Likewise.
	(brig_basic_inst_handler::build_unpack_lo_or_hi): Likewise.
	(brig_basic_inst_handler::operator ()): Likewise.
	(brig_basic_inst_handler::build_lower_element_broadcast): Likewise.
	* brigfrontend/brig-code-entry-handler.cc
	(brig_code_entry_handler::get_tree_cst_for_hsa_operand): Likewise.
	(brig_code_entry_handler::get_comparison_result_type): Likewise.
	(brig_code_entry_handler::expand_or_call_builtin): Likewise.

Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>

From-SVN: r256144
2018-01-03 07:17:00 +00:00
Richard Sandiford
22afc2b31b poly_int: tree-vect-generic.c
This patch makes tree-vect-generic.c cope with variable-length vectors.
Decomposition is only supported for constant-length vectors, since we
should never generate unsupported variable-length operations.

2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
	* tree-vect-generic.c (nunits_for_known_piecewise_op): New function.
	(expand_vector_piecewise): Use it instead of TYPE_VECTOR_SUBPARTS.
	(expand_vector_addition, add_rshift, expand_vector_divmod): Likewise.
	(expand_vector_condition, vector_element): Likewise.
	(subparts_gt): New function.
	(get_compute_type): Use subparts_gt.
	(count_type_subparts): Delete.
	(expand_vector_operations_1): Use subparts_gt instead of
	count_type_subparts.

Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>

From-SVN: r256143
2018-01-03 07:16:53 +00:00
Richard Sandiford
b064d4f9d6 poly_int: vect_no_alias_p
This patch replaces the two-state vect_no_alias_p with a three-state
vect_compile_time_alias that handles polynomial segment lengths.

2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
	* tree-vect-data-refs.c (vect_no_alias_p): Replace with...
	(vect_compile_time_alias): ...this new function.  Do the calculation
	on poly_ints rather than trees.
	(vect_prune_runtime_alias_test_list): Update call accordingly.

Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>

From-SVN: r256142
2018-01-03 07:16:47 +00:00
Richard Sandiford
dad55d7014 poly_int: two-operation SLP
This patch makes two-operation SLP handle but reject variable-length
vectors.  Adding support for this is a post-GCC8 thing.

2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
	* tree-vect-slp.c (vect_build_slp_tree_1): Handle polynomial
	numbers of units.
	(vect_schedule_slp_instance): Likewise.

Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>

From-SVN: r256141
2018-01-03 07:16:41 +00:00
Richard Sandiford
a23644f23d poly_int: vect_get_constant_vectors
For now, vect_get_constant_vectors can only cope with constant-length
vectors, although a patch after the main SVE submission relaxes this.
This patch adds an appropriate guard for variable-length vectors.
The TYPE_VECTOR_SUBPARTS use in vect_get_constant_vectors will then
have a to_constant call when TYPE_VECTOR_SUBPARTS becomes a poly_int.

2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
	* tree-vect-slp.c (vect_get_and_check_slp_defs): Reject
	constant and extern definitions for variable-length vectors.
	(vect_get_constant_vectors): Note that the number of units
	is known to be constant.

Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>

From-SVN: r256140
2018-01-03 07:16:35 +00:00
Richard Sandiford
062d5ccc11 poly_int: vectorizable_conversion
This patch makes vectorizable_conversion cope with variable-length
vectors.  We already require the number of elements in one vector
to be a multiple of the number of elements in the other vector,
so the patch uses that to choose between widening and narrowing.

2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
	* tree-vect-stmts.c (vectorizable_conversion): Treat the number
	of units as polynomial.  Choose between WIDE and NARROW based
	on multiple_p.

Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>

From-SVN: r256139
2018-01-03 07:16:28 +00:00
Richard Sandiford
cf1b2ba4ea poly_int: vectorizable_simd_clone_call
This patch makes vectorizable_simd_clone_call cope with variable-length
vectors.  For now we don't support SIMD clones for variable-length
vectors; this will be post GCC 8 material.

2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
	* tree-vect-stmts.c (simd_clone_subparts): New function.
	(vectorizable_simd_clone_call): Use it instead of TYPE_VECTOR_SUBPARTS.

Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>

From-SVN: r256138
2018-01-03 07:16:22 +00:00
Richard Sandiford
c7bda0f40e poly_int: vectorizable_call
This patch makes vectorizable_call handle variable-length vectors.
The only substantial change is to use build_index_vector for
IFN_GOMP_SIMD_LANE; this makes no functional difference for
fixed-length vectors.

2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
	* tree-vect-stmts.c (vectorizable_call): Treat the number of
	vectors as polynomial.  Use build_index_vector for
	IFN_GOMP_SIMD_LANE.

Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>

From-SVN: r256137
2018-01-03 07:16:14 +00:00
Richard Sandiford
4d694b27c3 poly_int: vectorizable_load/store
This patch makes vectorizable_load and vectorizable_store cope with
variable-length vectors.  The reverse and permute cases will be
excluded by the code that checks the permutation mask (although a
patch after the main SVE submission adds support for the reversed
case).  Here we also need to exclude VMAT_ELEMENTWISE and
VMAT_STRIDED_SLP, which split the operation up into a constant
number of constant-sized operations.  We also don't try to extend
the current widening gather/scatter support to variable-length
vectors, since SVE uses a different approach.

2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
	* tree-vect-stmts.c (get_load_store_type): Treat the number of
	units as polynomial.  Reject VMAT_ELEMENTWISE and VMAT_STRIDED_SLP
	for variable-length vectors.
	(vectorizable_mask_load_store): Treat the number of units as
	polynomial, asserting that it is constant if the condition has
	already been enforced.
	(vectorizable_store, vectorizable_load): Likewise.

Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>

From-SVN: r256136
2018-01-03 07:16:06 +00:00
Richard Sandiford
fa78079469 poly_int: vectorizable_live_operation
This patch makes vectorizable_live_operation cope with variable-length
vectors.  For now we just handle cases in which we can tell at compile
time which vector contains the final result.

2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
	* tree-vect-loop.c (vectorizable_live_operation): Treat the number
	of units as polynomial.  Punt if we can't tell at compile time
	which vector contains the final result.

Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>

From-SVN: r256135
2018-01-03 07:16:00 +00:00
Richard Sandiford
9fb9293aca poly_int: vectorizable_induction
This patch makes vectorizable_induction cope with variable-length
vectors.  For now we punt on SLP inductions, but patchees after
the main SVE submission add support for those too.

2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
	* tree-vect-loop.c (vectorizable_induction): Treat the number
	of units as polynomial.  Punt on SLP inductions.  Use an integer
	VEC_SERIES_EXPR for variable-length integer reductions.  Use a
	cast of such a series for variable-length floating-point
	reductions.

Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>

From-SVN: r256134
2018-01-03 07:15:54 +00:00
Richard Sandiford
e54dd6d3a7 poly_int: vectorizable_reduction
This patch makes vectorizable_reduction cope with variable-length vectors.
We can handle the simple case of an inner loop reduction for which
the target has native support for the epilogue operation.  For now we
punt on other cases, but patches after the main SVE submission allow
SLP and double reductions too.

2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
	* tree.h (build_index_vector): Declare.
	* tree.c (build_index_vector): New function.
	* tree-vect-loop.c (get_initial_defs_for_reduction): Treat the number
	of units as polynomial, forcibly converting it to a constant if
	vectorizable_reduction has already enforced the condition.
	(vect_create_epilog_for_reduction): Likewise.  Use build_index_vector
	to create a {1,2,3,...} vector.
	(vectorizable_reduction): Treat the number of units as polynomial.
	Choose vectype_in based on the largest scalar element size rather
	than the smallest number of units.  Enforce the restrictions
	relied on above.

Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>

From-SVN: r256133
2018-01-03 07:15:47 +00:00
Richard Sandiford
9031b367ac poly_int: vector_alignment_reachable_p
This patch makes vector_alignment_reachable_p cope with variable-length
vectors.

2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
	* tree-vect-data-refs.c (vector_alignment_reachable_p): Treat the
	number of units as polynomial.

Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>

From-SVN: r256132
2018-01-03 07:15:41 +00:00
Richard Sandiford
86e3672871 poly_int: current_vector_size and TARGET_AUTOVECTORIZE_VECTOR_SIZES
This patch changes the type of current_vector_size to poly_uint64.
It also changes TARGET_AUTOVECTORIZE_VECTOR_SIZES so that it fills
in a vector of possible sizes (as poly_uint64s) instead of returning
a bitmask.  The documentation claimed that the hook didn't need to
include the default vector size (returned by preferred_simd_mode),
but that wasn't consistent with the omp-low.c usage.

2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
	* target.h (vector_sizes, auto_vector_sizes): New typedefs.
	* target.def (autovectorize_vector_sizes): Return the vector sizes
	by pointer, using vector_sizes rather than a bitmask.
	* targhooks.h (default_autovectorize_vector_sizes): Update accordingly.
	* targhooks.c (default_autovectorize_vector_sizes): Likewise.
	* config/aarch64/aarch64.c (aarch64_autovectorize_vector_sizes):
	Likewise.
	* config/arc/arc.c (arc_autovectorize_vector_sizes): Likewise.
	* config/arm/arm.c (arm_autovectorize_vector_sizes): Likewise.
	* config/i386/i386.c (ix86_autovectorize_vector_sizes): Likewise.
	* config/mips/mips.c (mips_autovectorize_vector_sizes): Likewise.
	* omp-general.c (omp_max_vf): Likewise.
	* omp-low.c (omp_clause_aligned_alignment): Likewise.
	* optabs-query.c (can_vec_mask_load_store_p): Likewise.
	* tree-vect-loop.c (vect_analyze_loop): Likewise.
	* tree-vect-slp.c (vect_slp_bb): Likewise.
	* doc/tm.texi: Regenerate.
	* tree-vectorizer.h (current_vector_size): Change from an unsigned int
	to a poly_uint64.
	* tree-vect-stmts.c (get_vectype_for_scalar_type_and_size): Take
	the vector size as a poly_uint64 rather than an unsigned int.
	(current_vector_size): Change from an unsigned int to a poly_uint64.
	(get_vectype_for_scalar_type): Update accordingly.
	* tree.h (build_truth_vector_type): Take the size and number of
	units as a poly_uint64 rather than an unsigned int.
	(build_vector_type): Add a temporary overload that takes
	the number of units as a poly_uint64 rather than an unsigned int.
	* tree.c (make_vector_type): Likewise.
	(build_truth_vector_type): Take the number of units as a poly_uint64
	rather than an unsigned int.

Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>

From-SVN: r256131
2018-01-03 07:15:20 +00:00
Richard Sandiford
87133c45a0 poly_int: get_mask_mode
This patch makes TARGET_GET_MASK_MODE take polynomial nunits and
vector_size arguments.  The gcc_assert in default_get_mask_mode
is now handled by the exact_div call in vector_element_size.

2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
	* target.def (get_mask_mode): Take the number of units and length
	as poly_uint64s rather than unsigned ints.
	* targhooks.h (default_get_mask_mode): Update accordingly.
	* targhooks.c (default_get_mask_mode): Likewise.
	* config/i386/i386.c (ix86_get_mask_mode): Likewise.
	* doc/tm.texi: Regenerate.

Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>

From-SVN: r256130
2018-01-03 07:14:43 +00:00
Richard Sandiford
9d2f08ab97 poly_int: omp_max_vf
This patch makes omp_max_vf return a polynomial vectorization factor.
We then need to be able to stash a polynomial value in
OMP_CLAUSE_SAFELEN_EXPR too:

   /* If max_vf is non-zero, then we can use only a vectorization factor
      up to the max_vf we chose.  So stick it into the safelen clause.  */

For now the cfgloop safelen is still constant though.

2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
	* omp-general.h (omp_max_vf): Return a poly_uint64 instead of an int.
	* omp-general.c (omp_max_vf): Likewise.
	* omp-expand.c (omp_adjust_chunk_size): Update call to omp_max_vf.
	(expand_omp_simd): Handle polynomial safelen.
	* omp-low.c (omplow_simd_context): Add a default constructor.
	(omplow_simd_context::max_vf): Change from int to poly_uint64.
	(lower_rec_simd_input_clauses): Update accordingly.
	(lower_rec_input_clauses): Likewise.

Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>

From-SVN: r256129
2018-01-03 07:14:31 +00:00
Richard Sandiford
c5126ce8ca poly_int: vect_nunits_for_cost
This patch adds a function for getting the number of elements in
a vector for cost purposes, which is always constant.  It makes
it possible for a later patch to change GET_MODE_NUNITS and
TYPE_VECTOR_SUBPARTS to a poly_int.

2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
	* tree-vectorizer.h (vect_nunits_for_cost): New function.
	* tree-vect-loop.c (vect_model_reduction_cost): Use it.
	* tree-vect-slp.c (vect_analyze_slp_cost_1): Likewise.
	(vect_analyze_slp_cost): Likewise.
	* tree-vect-stmts.c (vect_model_store_cost): Likewise.
	(vect_model_load_cost): Likewise.

Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>

From-SVN: r256128
2018-01-03 07:14:24 +00:00
Richard Sandiford
4b6068eadc poly_int: SLP max_units
This match makes tree-vect-slp.c track the maximum number of vector
units as a poly_uint64 rather than an unsigned int.

2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
	* tree-vect-slp.c (vect_record_max_nunits, vect_build_slp_tree_1)
	(vect_build_slp_tree_2, vect_build_slp_tree): Change max_nunits
	from an unsigned int * to a poly_uint64_pod *.
	(calculate_unrolling_factor): New function.
	(vect_analyze_slp_instance): Use it.  Track polynomial max_nunits.

Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>

From-SVN: r256127
2018-01-03 07:14:16 +00:00
Richard Sandiford
d9f21f6acb poly_int: vectoriser vf and uf
This patch changes the type of the vectorisation factor and SLP
unrolling factor to poly_uint64.  This in turn required some knock-on
changes in signedness elsewhere.

Cost decisions are generally based on estimated_poly_value,
which for VF is wrapped up as vect_vf_for_cost.

The patch doesn't on its own enable variable-length vectorisation.
It just makes the minimum changes necessary for the code to build
with the new VF and UF types.  Later patches also make the
vectoriser cope with variable TYPE_VECTOR_SUBPARTS and variable
GET_MODE_NUNITS, at which point the code really does handle
variable-length vectors.

The patch also changes MAX_VECTORIZATION_FACTOR to INT_MAX,
to avoid hard-coding a particular architectural limit.

The patch includes a new test because a development version of the patch
accidentally used file print routines instead of dump_*, which would
fail with -fopt-info.

2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
	* tree-vectorizer.h (_slp_instance::unrolling_factor): Change
	from an unsigned int to a poly_uint64.
	(_loop_vec_info::slp_unrolling_factor): Likewise.
	(_loop_vec_info::vectorization_factor): Change from an int
	to a poly_uint64.
	(MAX_VECTORIZATION_FACTOR): Bump from 64 to INT_MAX.
	(vect_get_num_vectors): New function.
	(vect_update_max_nunits, vect_vf_for_cost): Likewise.
	(vect_get_num_copies): Use vect_get_num_vectors.
	(vect_analyze_data_ref_dependences): Change max_vf from an int *
	to an unsigned int *.
	(vect_analyze_data_refs): Change min_vf from an int * to a
	poly_uint64 *.
	(vect_transform_slp_perm_load): Take the vf as a poly_uint64 rather
	than an unsigned HOST_WIDE_INT.
	* tree-vect-data-refs.c (vect_analyze_possibly_independent_ddr)
	(vect_analyze_data_ref_dependence): Change max_vf from an int *
	to an unsigned int *.
	(vect_analyze_data_ref_dependences): Likewise.
	(vect_compute_data_ref_alignment): Handle polynomial vf.
	(vect_enhance_data_refs_alignment): Likewise.
	(vect_prune_runtime_alias_test_list): Likewise.
	(vect_shift_permute_load_chain): Likewise.
	(vect_supportable_dr_alignment): Likewise.
	(dependence_distance_ge_vf): Take the vectorization factor as a
	poly_uint64 rather than an unsigned HOST_WIDE_INT.
	(vect_analyze_data_refs): Change min_vf from an int * to a
	poly_uint64 *.
	* tree-vect-loop-manip.c (vect_gen_scalar_loop_niters): Take
	vfm1 as a poly_uint64 rather than an int.  Make the same change
	for the returned bound_scalar.
	(vect_gen_vector_loop_niters): Handle polynomial vf.
	(vect_do_peeling): Likewise.  Update call to
	vect_gen_scalar_loop_niters and handle polynomial bound_scalars.
	(vect_gen_vector_loop_niters_mult_vf): Assert that the vf must
	be constant.
	* tree-vect-loop.c (vect_determine_vectorization_factor)
	(vect_update_vf_for_slp, vect_analyze_loop_2): Handle polynomial vf.
	(vect_get_known_peeling_cost): Likewise.
	(vect_estimate_min_profitable_iters, vectorizable_reduction): Likewise.
	(vect_worthwhile_without_simd_p, vectorizable_induction): Likewise.
	(vect_transform_loop): Likewise.  Use the lowest possible VF when
	updating the upper bounds of the loop.
	(vect_min_worthwhile_factor): Make static.  Return an unsigned int
	rather than an int.
	* tree-vect-slp.c (vect_attempt_slp_rearrange_stmts): Cope with
	polynomial unroll factors.
	(vect_analyze_slp_cost_1, vect_analyze_slp_instance): Likewise.
	(vect_make_slp_decision): Likewise.
	(vect_supported_load_permutation_p): Likewise, and polynomial
	vf too.
	(vect_analyze_slp_cost): Handle polynomial vf.
	(vect_slp_analyze_node_operations): Likewise.
	(vect_slp_analyze_bb_1): Likewise.
	(vect_transform_slp_perm_load): Take the vf as a poly_uint64 rather
	than an unsigned HOST_WIDE_INT.
	* tree-vect-stmts.c (vectorizable_simd_clone_call, vectorizable_store)
	(vectorizable_load): Handle polynomial vf.
	* tree-vectorizer.c (simduid_to_vf::vf): Change from an int to
	a poly_uint64.
	(adjust_simduid_builtins, shrink_simd_arrays): Update accordingly.

gcc/testsuite/
	* gcc.dg/vect-opt-info-1.c: New test.

Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>

From-SVN: r256126
2018-01-03 07:14:07 +00:00
Richard Sandiford
fba05d9e9a match.pd handling of three-constant bitops
natch.pd tries to reassociate two bit operations if both of them have
constant operands.  However, with the polynomial integers added later,
there's no guarantee that a bit operation on two integers can be folded
at compile time.  This means that the pattern can trigger for operations
on three constants, and as things stood could endlessly oscillate
between the two associations.

This patch keeps the existing pattern for the normal case of a
non-constant first operand.  When all three operands are constant it
tries to find a pair of constants that do fold.  If none do, it keeps
the original expression as-was.

2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
	* match.pd: Handle bit operations involving three constants
	and try to fold one pair.

Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>

From-SVN: r256125
2018-01-03 07:13:57 +00:00
Richard Sandiford
0f26839a0a Add an alternative vector loop iv mechanism
Normally we adjust the vector loop so that it iterates:

   (original number of scalar iterations - number of peels) / VF

times, enforcing this using an IV that starts at zero and increments
by one each iteration.  However, dividing by VF would be expensive
for variable VF, so this patch adds an alternative in which the IV
increments by VF each iteration instead.  We then need to take care
to handle possible overflow in the IV.

The new mechanism isn't used yet; a later patch replaces the
"if (1)" with a check for variable VF.

2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>

gcc/
	* tree-vect-loop-manip.c: Include gimple-fold.h.
	(slpeel_make_loop_iterate_ntimes): Add step, final_iv and
	niters_maybe_zero parameters.  Handle other cases besides a step of 1.
	(vect_gen_vector_loop_niters): Add a step_vector_ptr parameter.
	Add a path that uses a step of VF instead of 1, but disable it
	for now.
	(vect_do_peeling): Add step_vector, niters_vector_mult_vf_var
	and niters_no_overflow parameters.  Update calls to
	slpeel_make_loop_iterate_ntimes and vect_gen_vector_loop_niters.
	Create a new SSA name if the latter choses to use a ste other
	than zero, and return it via niters_vector_mult_vf_var.
	* tree-vect-loop.c (vect_transform_loop): Update calls to
	vect_do_peeling, vect_gen_vector_loop_niters and
	slpeel_make_loop_iterate_ntimes.
	* tree-vectorizer.h (slpeel_make_loop_iterate_ntimes, vect_do_peeling)
	(vect_gen_vector_loop_niters): Update declarations after above changes.

From-SVN: r256124
2018-01-03 07:13:50 +00:00
Ben Elliston
e50ffab340 Summary: Replace a few instances of 8 leading spaces with horizontal tabs.
From-SVN: r256123
2018-01-03 15:32:45 +11:00
Ben Elliston
ef7d7cf50d config.guess: Import latest version.
* config.guess: Import latest version.
	* config.sub: Likewise.

From-SVN: r256122
2018-01-03 15:25:18 +11:00
Michael Meissner
2d71e7b8d4 rs6000.md (floor<mode>2): Add support for IEEE 128-bit round to integer instructions.
[gcc]
2018-01-02  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* config/rs6000/rs6000.md (floor<mode>2): Add support for IEEE
	128-bit round to integer instructions.
	(ceil<mode>2): Likewise.
	(btrunc<mode>2): Likewise.
	(round<mode>2): Likewise.

[gcc/testsuite]
2018-01-02  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* gcc.target/powerpc/float128-hw2.c: Add tests for ceilf128,
	floorf128, truncf128, and roundf128.
	* gcc.target/powerpc/float128-hw5.c: New tests for _Float128
	optimizations added in match.pd.
	* gcc.target/powerpc/float128-hw6.c: Likewise.
	* gcc.target/powerpc/float128-hw7.c: Likewise.
	* gcc.target/powerpc/float128-hw8.c: Likewise.
	* gcc.target/powerpc/float128-hw9.c: Likewise.
	* gcc.target/powerpc/float128-hw10.c: Likewise.
	* gcc.target/powerpc/float128-hw11.c: Likewise.

From-SVN: r256118
2018-01-03 02:38:09 +00:00
GCC Administrator
50d75500a3 Daily bump.
From-SVN: r256116
2018-01-03 00:16:18 +00:00
Aaron Sawdey
3b0cb1a553 rs6000-string.c (expand_block_move): Allow the use of unaligned VSX load/store on P8/P9.
2018-01-02  Aaron Sawdey  <acsawdey@linux.vnet.ibm.com>

        * config/rs6000/rs6000-string.c (expand_block_move): Allow the use of
        unaligned VSX load/store on P8/P9.
        (expand_block_clear): Allow the use of unaligned VSX
	load/store on P8/P9.

From-SVN: r256112
2018-01-02 17:01:43 -06:00
Bill Schmidt
6012c652c7 rs6000-p8swap.c (swap_feeds_both_load_and_store): New function.
2018-01-02  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* config/rs6000/rs6000-p8swap.c (swap_feeds_both_load_and_store):
	New function.
	(rs6000_analyze_swaps): Mark a web unoptimizable if it contains a
	swap associated with both a load and a store.

From-SVN: r256111
2018-01-02 22:56:45 +00:00
Andrew Waterman
f1bdc63a89 RISC-V: Fix for icache flush issue on multicore processors.
gcc/
	* config/riscv/linux.h (ICACHE_FLUSH_FUNC): New.
	* config/riscv/riscv.md (clear_cache): Use it.

From-SVN: r256109
2018-01-02 12:34:01 -08:00
Artyom Skrobov
a7e92aff74 * web.c: Remove out-of-date comment.
From-SVN: r256106
2018-01-02 12:16:44 -07:00
Richard Sandiford
2bc6986d01 Fix REG_ARGS_SIZE handling when pushing TLS addresses
The new assert in add_args_size_note triggered for gcc.dg/tls/opt-3.c
and others on m68k.  This looks like a pre-existing bug: if we pushed
a value that needs a call to something like __tls_get_addr, we ended
up with two different REG_ARGS_SIZE notes on the same instruction.

It seems to be OK for emit_single_push_insn to push something that
needs a call to __tls_get_addr:

      /* We have to allow non-call_pop patterns for the case
	 of emit_single_push_insn of a TLS address.  */
      if (GET_CODE (pat) != PARALLEL)
	return 0;

so I think the bug is in the way this is handled rather than the fact
that it occurs at all.

If we're pushing a value X that needs a call C to calculate, we'll
add REG_ARGS_SIZE notes to the pushes and pops for C as part of the
call sequence.  Then emit_single_push_insn calls fixup_args_size_notes
on the whole push sequence (the calculation of X, including C,
and the push of X itself).  This is where the double notes came from.
But emit_single_push_insn_1 adjusted stack_pointer_delta *before* the
push, so the notes added for C were relative to the situation after
the future push of X rather than before it.

Presumably this didn't matter in practice because the note added
second tended to trump the note added first.  But code is allowed to
walk REG_NOTES without having to disregard secondary notes.

2018-01-02  Richard Sandiford  <richard.sandiford@linaro.org>

gcc/
	* expr.c (fixup_args_size_notes): Check that any existing
	REG_ARGS_SIZE notes are correct, and don't try to re-add them.
	(emit_single_push_insn_1): Move stack_pointer_delta adjustment to...
	(emit_single_push_insn): ...here.

From-SVN: r256105
2018-01-02 19:14:43 +00:00
Richard Sandiford
cd5ff7bc32 Make CONST_VECTOR_ELT handle implicitly-encoded elements
This patch makes CONST_VECTOR_ELT handle implicitly-encoded elements,
in a similar way to VECTOR_CST_ELT.

2018-01-02  Richard Sandiford  <richard.sandiford@linaro.org>

gcc/
	* rtl.h (CONST_VECTOR_ELT): Redefine to const_vector_elt.
	(const_vector_encoded_nelts): New function.
	(CONST_VECTOR_NUNITS): Redefine to use GET_MODE_NUNITS.
	(const_vector_int_elt, const_vector_elt): Declare.
	* emit-rtl.c (const_vector_int_elt_1): New function.
	(const_vector_elt): Likewise.
	* simplify-rtx.c (simplify_immed_subreg): Avoid taking the address
	of CONST_VECTOR_ELT.

From-SVN: r256104
2018-01-02 18:28:14 +00:00
Richard Sandiford
3d8ca53dd9 Make more use of rtx_vector_builder
This patch makes various bits of CONST_VECTOR-building code use
rtx_vector_builder, operating directly on a specific encoding.

2018-01-02  Richard Sandiford  <richard.sandiford@linaro.org>

gcc/
	* expr.c: Include rtx-vector-builder.h.
	(const_vector_mask_from_tree): Use rtx_vector_builder and operate
	directly on the tree encoding.
	(const_vector_from_tree): Likewise.
	* optabs.c: Include rtx-vector-builder.h.
	(expand_vec_perm_var): Use rtx_vector_builder and create a repeating
	sequence of "u" values.
	* vec-perm-indices.c: Include rtx-vector-builder.h.
	(vec_perm_indices_to_rtx): Use rtx_vector_builder and operate
	directly on the vec_perm_indices encoding.

From-SVN: r256103
2018-01-02 18:28:06 +00:00
Richard Sandiford
3877c56065 New CONST_VECTOR layout
This patch makes CONST_VECTOR use the same encoding as VECTOR_CST.

One problem that occurs in RTL but not at the tree level is that a fair
amount of code uses XVEC and XVECEXP directly on CONST_VECTORs (which is
valid, just with looser checking).  This is complicated by the fact that
vectors are also represented as PARALLELs in some target interfaces,
so using XVECEXP is a good polymorphic way of handling both forms.

Rather than try to untangle all that, the best approach seemed to be to
continue to encode every element in a fixed-length vector.  That way only
target-independent and AArch64 code need to be precise about using
CONST_VECTOR_ELT over XVECEXP.

After this change is no longer valid to modify CONST_VECTORs in-place.
This needed some fix-up in the powerpc backends.

2018-01-02  Richard Sandiford  <richard.sandiford@linaro.org>

gcc/
	* doc/rtl.texi (const_vector): Describe new encoding scheme.
	* Makefile.in (OBJS): Add rtx-vector-builder.o.
	* rtx-vector-builder.h: New file.
	* rtx-vector-builder.c: Likewise.
	* rtl.h (rtx_def::u2): Add a const_vector field.
	(CONST_VECTOR_NPATTERNS): New macro.
	(CONST_VECTOR_NELTS_PER_PATTERN): Likewise.
	(CONST_VECTOR_DUPLICATE_P): Likewise.
	(CONST_VECTOR_STEPPED_P): Likewise.
	(CONST_VECTOR_ENCODED_ELT): Likewise.
	(const_vec_duplicate_p): Check for a duplicated vector encoding.
	(unwrap_const_vec_duplicate): Likewise.
	(const_vec_series_p): Check for a non-duplicated vector encoding.
	Say that the function only returns true for integer vectors.
	* emit-rtl.c: Include rtx-vector-builder.h.
	(gen_const_vec_duplicate_1): Delete.
	(gen_const_vector): Call gen_const_vec_duplicate instead of
	gen_const_vec_duplicate_1.
	(const_vec_series_p_1): Operate directly on the CONST_VECTOR encoding.
	(gen_const_vec_duplicate): Use rtx_vector_builder.
	(gen_const_vec_series): Likewise.
	(gen_rtx_CONST_VECTOR): Likewise.
	* config/powerpcspe/powerpcspe.c: Include rtx-vector-builder.h.
	(swap_const_vector_halves): Take an rtx pointer rather than rtx.
	Build a new vector rather than modifying a CONST_VECTOR in-place.
	(handle_special_swappables): Update call accordingly.
	* config/rs6000/rs6000-p8swap.c: Include rtx-vector-builder.h.
	(swap_const_vector_halves): Take an rtx pointer rather than rtx.
	Build a new vector rather than modifying a CONST_VECTOR in-place.
	(handle_special_swappables): Update call accordingly.

From-SVN: r256102
2018-01-02 18:27:50 +00:00
Richard Sandiford
8eff75e0d2 Use CONST_VECTOR_ELT instead of XVECEXP
This patch replaces target-independent uses of XVECEXP with uses
of CONST_VECTOR_ELT.  This kind of replacement isn't necessary
for code specific to targets other than AArch64.

2018-01-02  Richard Sandiford  <richard.sandiford@linaro.org>

gcc/
	* simplify-rtx.c (simplify_const_binary_operation): Use
	CONST_VECTOR_ELT instead of XVECEXP.

From-SVN: r256101
2018-01-02 18:27:42 +00:00
Richard Sandiford
b00cb3bfa5 Use ssizetype selectors for autovectorised VEC_PERM_EXPRs
The previous patches mean that there's no reason that constant
VEC_PERM_EXPRs need to have the same shape as the data inputs.
This patch makes the autovectoriser use sizetype elements instead,
so that indices don't get truncated for large or variable-length
vectors.

2018-01-02  Richard Sandiford  <richard.sandiford@linaro.org>

gcc/
	* tree-cfg.c (verify_gimple_assign_ternary): Allow the size of
	the selector elements to be different from the data elements
	if the selector is a VECTOR_CST.
	* tree-vect-stmts.c (vect_gen_perm_mask_any): Use a vector of
	ssizetype for the selector.

From-SVN: r256100
2018-01-02 18:27:35 +00:00
Richard Sandiford
d386748304 Use vec_perm_builder::series_p in shift_amt_for_vec_perm_mask
This patch makes shift_amt_for_vec_perm_mask use series_p to check
for the simple case of a natural linear series before falling back
to testing each element individually.  The series_p test works with
variable-length vectors but testing every individual element doesn't.

2018-01-02  Richard Sandiford  <richard.sandiford@linaro.org>

gcc/
	* optabs.c (shift_amt_for_vec_perm_mask): Try using series_p
	before testing each element individually.
	* tree-vect-generic.c (lower_vec_perm): Likewise.

From-SVN: r256099
2018-01-02 18:27:24 +00:00
Richard Sandiford
1a1c441dbe Rework VEC_PERM_EXPR folding
This patch reworks the VEC_PERM_EXPR folding so that more of it
works for variable-length vectors.  E.g. it means that we can
now recognise variable-length permutes that reduce to a single
vector, or cases in which a variable-length permute only needs
one input.  There should be no functional change for fixed-length
vectors.

2018-01-02  Richard Sandiford  <richard.sandiford@linaro.org>

gcc/
	* selftest.h (selftest::vec_perm_indices_c_tests): Declare.
	* selftest-run-tests.c (selftest::run_tests): Call it.
	* vector-builder.h (vector_builder::operator ==): New function.
	(vector_builder::operator !=): Likewise.
	* vec-perm-indices.h (vec_perm_indices::series_p): Declare.
	(vec_perm_indices::all_from_input_p): New function.
	* vec-perm-indices.c (vec_perm_indices::series_p): Likewise.
	(test_vec_perm_12, selftest::vec_perm_indices_c_tests): Likewise.
	* fold-const.c (fold_ternary_loc): Use tree_to_vec_perm_builder
	instead of reading the VECTOR_CST directly.  Detect whether both
	vector inputs are the same before constructing the vec_perm_indices,
	and update the number of inputs argument accordingly.  Use the
	utility functions added above.  Only construct sel2 if we need to.

From-SVN: r256098
2018-01-02 18:27:15 +00:00
Richard Sandiford
d980067b1e Use explicit encodings for simple permutes
This patch makes users of vec_perm_builders use the compressed encoding
where possible.  This means that they work with variable-length vectors.

2018-01-02  Richard Sandiford  <richard.sandiford@linaro.org>

gcc/
	* optabs.c (expand_vec_perm_var): Use an explicit encoding for
	the broadcast of the low byte.
	(expand_mult_highpart): Use an explicit encoding for the permutes.
	* optabs-query.c (can_mult_highpart_p): Likewise.
	* tree-vect-loop.c (calc_vec_perm_mask_for_shift): Likewise.
	* tree-vect-stmts.c (perm_mask_for_reverse): Likewise.
	(vectorizable_bswap): Likewise.
	* tree-vect-data-refs.c (vect_grouped_store_supported): Use an
	explicit encoding for the power-of-2 permutes.
	(vect_permute_store_chain): Likewise.
	(vect_grouped_load_supported): Likewise.
	(vect_permute_load_chain): Likewise.

From-SVN: r256097
2018-01-02 18:27:05 +00:00
Richard Sandiford
736d0f2878 Add a vec_perm_indices_to_tree helper function
This patch adds a function for creating a VECTOR_CST from a
vec_perm_indices, operating directly on the encoding.

2018-01-02  Richard Sandiford  <richard.sandiford@linaro.org>

gcc/
	* vec-perm-indices.h (vec_perm_indices_to_tree): Declare.
	* vec-perm-indices.c (vec_perm_indices_to_tree): New function.
	* tree-ssa-forwprop.c (simplify_vector_constructor): Use it.
	* tree-vect-slp.c (vect_transform_slp_perm_load): Likewise.
	* tree-vect-stmts.c (vectorizable_bswap): Likewise.
	(vect_gen_perm_mask_any): Likewise.

From-SVN: r256096
2018-01-02 18:26:56 +00:00
Richard Sandiford
e3342de49c Make vec_perm_indices use new vector encoding
This patch changes vec_perm_indices from a plain vec<> to a class
that stores a canonicalized permutation, using the same encoding
as for VECTOR_CSTs.  This means that vec_perm_indices now carries
information about the number of vectors being permuted (currently
always 1 or 2) and the number of elements in each input vector.

A new vec_perm_builder class is used to actually build up the vector,
like tree_vector_builder does for trees.  vec_perm_indices is the
completed representation, a bit like VECTOR_CST is for trees.

The patch just does a mechanical conversion of the code to
vec_perm_builder: a later patch uses explicit encodings where possible.

The point of all this is that it makes the representation suitable
for variable-length vectors.  It's no longer necessary for the
underlying vec<>s to store every element explicitly.

In int-vector-builder.h, "using the same encoding as tree and rtx constants"
describes the endpoint -- adding the rtx encoding comes later.

2018-01-02  Richard Sandiford  <richard.sandiford@linaro.org>

gcc/
	* int-vector-builder.h: New file.
	* vec-perm-indices.h: Include int-vector-builder.h.
	(vec_perm_indices): Redefine as an int_vector_builder.
	(auto_vec_perm_indices): Delete.
	(vec_perm_builder): Redefine as a stand-alone class.
	(vec_perm_indices::vec_perm_indices): New function.
	(vec_perm_indices::clamp): Likewise.
	* vec-perm-indices.c: Include fold-const.h and tree-vector-builder.h.
	(vec_perm_indices::new_vector): New function.
	(vec_perm_indices::new_expanded_vector): Update for new
	vec_perm_indices class.
	(vec_perm_indices::rotate_inputs): New function.
	(vec_perm_indices::all_in_range_p): Operate directly on the
	encoded form, without computing elided elements.
	(tree_to_vec_perm_builder): Operate directly on the VECTOR_CST
	encoding.  Update for new vec_perm_indices class.
	* optabs.c (expand_vec_perm_const): Create a vec_perm_indices for
	the given vec_perm_builder.
	(expand_vec_perm_var): Update vec_perm_builder constructor.
	(expand_mult_highpart): Use vec_perm_builder instead of
	auto_vec_perm_indices.
	* optabs-query.c (can_mult_highpart_p): Use vec_perm_builder and
	vec_perm_indices instead of auto_vec_perm_indices.  Use a single
	or double series encoding as appropriate.
	* fold-const.c (fold_ternary_loc): Use vec_perm_builder and
	vec_perm_indices instead of auto_vec_perm_indices.
	* tree-ssa-forwprop.c (simplify_vector_constructor): Likewise.
	* tree-vect-data-refs.c (vect_grouped_store_supported): Likewise.
	(vect_permute_store_chain): Likewise.
	(vect_grouped_load_supported): Likewise.
	(vect_permute_load_chain): Likewise.
	(vect_shift_permute_load_chain): Likewise.
	* tree-vect-slp.c (vect_build_slp_tree_1): Likewise.
	(vect_transform_slp_perm_load): Likewise.
	(vect_schedule_slp_instance): Likewise.
	* tree-vect-stmts.c (perm_mask_for_reverse): Likewise.
	(vectorizable_mask_load_store): Likewise.
	(vectorizable_bswap): Likewise.
	(vectorizable_store): Likewise.
	(vectorizable_load): Likewise.
	* tree-vect-generic.c (lower_vec_perm): Use vec_perm_builder and
	vec_perm_indices instead of auto_vec_perm_indices.  Use
	tree_to_vec_perm_builder to read the vector from a tree.
	* tree-vect-loop.c (calc_vec_perm_mask_for_shift): Take a
	vec_perm_builder instead of a vec_perm_indices.
	(have_whole_vector_shift): Use vec_perm_builder and
	vec_perm_indices instead of auto_vec_perm_indices.  Leave the
	truncation to calc_vec_perm_mask_for_shift.
	(vect_create_epilog_for_reduction): Likewise.
	* config/aarch64/aarch64.c (expand_vec_perm_d::perm): Change
	from auto_vec_perm_indices to vec_perm_indices.
	(aarch64_expand_vec_perm_const_1): Use rotate_inputs on d.perm
	instead of changing individual elements.
	(aarch64_vectorize_vec_perm_const): Use new_vector to install
	the vector in d.perm.
	* config/arm/arm.c (expand_vec_perm_d::perm): Change
	from auto_vec_perm_indices to vec_perm_indices.
	(arm_expand_vec_perm_const_1): Use rotate_inputs on d.perm
	instead of changing individual elements.
	(arm_vectorize_vec_perm_const): Use new_vector to install
	the vector in d.perm.
	* config/powerpcspe/powerpcspe.c (rs6000_expand_extract_even):
	Update vec_perm_builder constructor.
	(rs6000_expand_interleave): Likewise.
	* config/rs6000/rs6000.c (rs6000_expand_extract_even): Likewise.
	(rs6000_expand_interleave): Likewise.

From-SVN: r256095
2018-01-02 18:26:47 +00:00
Richard Sandiford
6da64f1b32 Check whether a vector of QIs can store all indices
The patch to remove the vec_perm_const optab checked whether replacing
a constant permute with a variable permute is safe, or whether it might
truncate the indices.  This patch adds a corresponding check for whether
variable permutes can be lowered to QImode-based permutes.

2018-01-02  Richard Sandiford  <richard.sandiford@linaro.org>

gcc/
	* optabs-query.c (can_vec_perm_var_p): Check whether lowering
	to qimode could truncate the indices.
	* optabs.c (expand_vec_perm_var): Likewise.

From-SVN: r256094
2018-01-02 18:26:35 +00:00
Richard Sandiford
f151c9e141 Remove vec_perm_const optab
One of the changes needed for variable-length VEC_PERM_EXPRs -- and for
long fixed-length VEC_PERM_EXPRs -- is the ability to use constant
selectors that wouldn't fit in the vectors being permuted.  E.g. a
permute on two V256QIs can't be done using a V256QI selector.

At the moment constant permutes use two interfaces:
targetm.vectorizer.vec_perm_const_ok for testing whether a permute is
valid and the vec_perm_const optab for actually emitting the permute.
The former gets passed a vec<> selector and the latter an rtx selector.
Most ports share a lot of code between the hook and the optab, with a
wrapper function for each interface.

We could try to keep that interface and require ports to define wider
vector modes that could be attached to the CONST_VECTOR (e.g. V256HI or
V256SI in the example above).  But building a CONST_VECTOR rtx seems a bit
pointless here, since the expand code only creates the CONST_VECTOR in
order to call the optab, and the first thing the target does is take
the CONST_VECTOR apart again.

The easiest approach therefore seemed to be to remove the optab and
reuse the target hook to emit the code.  One potential drawback is that
it's no longer possible to use match_operand predicates to force
operands into the required form, but in practice all targets want
register operands anyway.

The patch also changes vec_perm_indices into a class that provides
some simple routines for handling permutations.  A later patch will
flesh this out and get rid of auto_vec_perm_indices, but I didn't
want to do all that in this patch and make it more complicated than
it already is.

2018-01-02  Richard Sandiford  <richard.sandiford@linaro.org>

gcc/
	* Makefile.in (OBJS): Add vec-perm-indices.o.
	* vec-perm-indices.h: New file.
	* vec-perm-indices.c: Likewise.
	* target.h (vec_perm_indices): Replace with a forward class
	declaration.
	(auto_vec_perm_indices): Move to vec-perm-indices.h.
	* optabs.h: Include vec-perm-indices.h.
	(expand_vec_perm): Delete.
	(selector_fits_mode_p, expand_vec_perm_var): Declare.
	(expand_vec_perm_const): Declare.
	* target.def (vec_perm_const_ok): Replace with...
	(vec_perm_const): ...this new hook.
	* doc/tm.texi.in (TARGET_VECTORIZE_VEC_PERM_CONST_OK): Replace with...
	(TARGET_VECTORIZE_VEC_PERM_CONST): ...this new hook.
	* doc/tm.texi: Regenerate.
	* optabs.def (vec_perm_const): Delete.
	* doc/md.texi (vec_perm_const): Likewise.
	(vec_perm): Refer to TARGET_VECTORIZE_VEC_PERM_CONST.
	* expr.c (expand_expr_real_2): Use expand_vec_perm_const rather than
	expand_vec_perm for constant permutation vectors.  Assert that
	the mode of variable permutation vectors is the integer equivalent
	of the mode that is being permuted.
	* optabs-query.h (selector_fits_mode_p): Declare.
	* optabs-query.c: Include vec-perm-indices.h.
	(selector_fits_mode_p): New function.
	(can_vec_perm_const_p): Check whether targetm.vectorize.vec_perm_const
	is defined, instead of checking whether the vec_perm_const_optab
	exists.  Use targetm.vectorize.vec_perm_const instead of
	targetm.vectorize.vec_perm_const_ok.  Check whether the indices
	fit in the vector mode before using a variable permute.
	* optabs.c (shift_amt_for_vec_perm_mask): Take a mode and a
	vec_perm_indices instead of an rtx.
	(expand_vec_perm): Replace with...
	(expand_vec_perm_const): ...this new function.  Take the selector
	as a vec_perm_indices rather than an rtx.  Also take the mode of
	the selector.  Update call to shift_amt_for_vec_perm_mask.
	Use targetm.vectorize.vec_perm_const instead of vec_perm_const_optab.
	Use vec_perm_indices::new_expanded_vector to expand the original
	selector into bytes.  Check whether the indices fit in the vector
	mode before using a variable permute.
	(expand_vec_perm_var): Make global.
	(expand_mult_highpart): Use expand_vec_perm_const.
	* fold-const.c: Includes vec-perm-indices.h.
	* tree-ssa-forwprop.c: Likewise.
	* tree-vect-data-refs.c: Likewise.
	* tree-vect-generic.c: Likewise.
	* tree-vect-loop.c: Likewise.
	* tree-vect-slp.c: Likewise.
	* tree-vect-stmts.c: Likewise.
	* config/aarch64/aarch64-protos.h (aarch64_expand_vec_perm_const):
	Delete.
	* config/aarch64/aarch64-simd.md (vec_perm_const<mode>): Delete.
	* config/aarch64/aarch64.c (aarch64_expand_vec_perm_const)
	(aarch64_vectorize_vec_perm_const_ok): Fuse into...
	(aarch64_vectorize_vec_perm_const): ...this new function.
	(TARGET_VECTORIZE_VEC_PERM_CONST_OK): Delete.
	(TARGET_VECTORIZE_VEC_PERM_CONST): Redefine.
	* config/arm/arm-protos.h (arm_expand_vec_perm_const): Delete.
	* config/arm/vec-common.md (vec_perm_const<mode>): Delete.
	* config/arm/arm.c (TARGET_VECTORIZE_VEC_PERM_CONST_OK): Delete.
	(TARGET_VECTORIZE_VEC_PERM_CONST): Redefine.
	(arm_expand_vec_perm_const, arm_vectorize_vec_perm_const_ok): Merge
	into...
	(arm_vectorize_vec_perm_const): ...this new function.  Explicitly
	check for NEON modes.
	* config/i386/i386-protos.h (ix86_expand_vec_perm_const): Delete.
	* config/i386/sse.md (VEC_PERM_CONST, vec_perm_const<mode>): Delete.
	* config/i386/i386.c (ix86_expand_vec_perm_const_1): Update comment.
	(ix86_expand_vec_perm_const, ix86_vectorize_vec_perm_const_ok): Merge
	into...
	(ix86_vectorize_vec_perm_const): ...this new function.  Incorporate
	the old VEC_PERM_CONST conditions.
	* config/ia64/ia64-protos.h (ia64_expand_vec_perm_const): Delete.
	* config/ia64/vect.md (vec_perm_const<mode>): Delete.
	* config/ia64/ia64.c (ia64_expand_vec_perm_const)
	(ia64_vectorize_vec_perm_const_ok): Merge into...
	(ia64_vectorize_vec_perm_const): ...this new function.
	* config/mips/loongson.md (vec_perm_const<mode>): Delete.
	* config/mips/mips-msa.md (vec_perm_const<mode>): Delete.
	* config/mips/mips-ps-3d.md (vec_perm_constv2sf): Delete.
	* config/mips/mips-protos.h (mips_expand_vec_perm_const): Delete.
	* config/mips/mips.c (mips_expand_vec_perm_const)
	(mips_vectorize_vec_perm_const_ok): Merge into...
	(mips_vectorize_vec_perm_const): ...this new function.
	* config/powerpcspe/altivec.md (vec_perm_constv16qi): Delete.
	* config/powerpcspe/paired.md (vec_perm_constv2sf): Delete.
	* config/powerpcspe/spe.md (vec_perm_constv2si): Delete.
	* config/powerpcspe/vsx.md (vec_perm_const<mode>): Delete.
	* config/powerpcspe/powerpcspe-protos.h (altivec_expand_vec_perm_const)
	(rs6000_expand_vec_perm_const): Delete.
	* config/powerpcspe/powerpcspe.c (TARGET_VECTORIZE_VEC_PERM_CONST_OK):
	Delete.
	(TARGET_VECTORIZE_VEC_PERM_CONST): Redefine.
	(altivec_expand_vec_perm_const_le): Take each operand individually.
	Operate on constant selectors rather than rtxes.
	(altivec_expand_vec_perm_const): Likewise.  Update call to
	altivec_expand_vec_perm_const_le.
	(rs6000_expand_vec_perm_const): Delete.
	(rs6000_vectorize_vec_perm_const_ok): Delete.
	(rs6000_vectorize_vec_perm_const): New function.
	(rs6000_do_expand_vec_perm): Take a vec_perm_builder instead of
	an element count and rtx array.
	(rs6000_expand_extract_even): Update call accordingly.
	(rs6000_expand_interleave): Likewise.
	* config/rs6000/altivec.md (vec_perm_constv16qi): Delete.
	* config/rs6000/paired.md (vec_perm_constv2sf): Delete.
	* config/rs6000/vsx.md (vec_perm_const<mode>): Delete.
	* config/rs6000/rs6000-protos.h (altivec_expand_vec_perm_const)
	(rs6000_expand_vec_perm_const): Delete.
	* config/rs6000/rs6000.c (TARGET_VECTORIZE_VEC_PERM_CONST_OK): Delete.
	(TARGET_VECTORIZE_VEC_PERM_CONST): Redefine.
	(altivec_expand_vec_perm_const_le): Take each operand individually.
	Operate on constant selectors rather than rtxes.
	(altivec_expand_vec_perm_const): Likewise.  Update call to
	altivec_expand_vec_perm_const_le.
	(rs6000_expand_vec_perm_const): Delete.
	(rs6000_vectorize_vec_perm_const_ok): Delete.
	(rs6000_vectorize_vec_perm_const): New function.  Remove stray
	reference to the SPE evmerge intructions.
	(rs6000_do_expand_vec_perm): Take a vec_perm_builder instead of
	an element count and rtx array.
	(rs6000_expand_extract_even): Update call accordingly.
	(rs6000_expand_interleave): Likewise.
	* config/sparc/sparc.md (vec_perm_constv8qi): Delete in favor of...
	* config/sparc/sparc.c (sparc_vectorize_vec_perm_const): ...this
	new function.
	(TARGET_VECTORIZE_VEC_PERM_CONST): Redefine.

From-SVN: r256093
2018-01-02 18:26:27 +00:00