This patch changes the number of elements in a vector being built
by a vector_builder from unsigned int to poly_uint64. The case
in which it isn't a constant is the one that motivated adding
the vector encoding in the first place.
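
As a rough illustration of why an encoding is useful when the element
count is not a compile-time constant, the sketch below models a vector
as a number of interleaved patterns, each described by at most three
leading elements. It is a simplified toy, not the vector_builder code;
toy_encoding, toy_elt and toy_expand are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model of a pattern-based vector encoding (hypothetical names).
// The vector is split into NPATTERNS interleaved patterns.  Each pattern
// is described by its first NELTS_PER_PATTERN elements (1, 2 or 3):
//   1: every element of the pattern is the same value
//   2: a first value followed by a repeated second value
//   3: a first value followed by a linear series
struct toy_encoding {
  unsigned npatterns;
  unsigned nelts_per_pattern;
  std::vector<int64_t> encoded;  // npatterns * nelts_per_pattern values
};

// Return element I of the full vector, for any I.  No full length is
// needed, which is what lets one encoding describe a vector whose
// length is only known at run time.
inline int64_t toy_elt(const toy_encoding &e, size_t i) {
  size_t pattern = i % e.npatterns;
  size_t count = i / e.npatterns;  // position within the pattern
  if (count < e.nelts_per_pattern)
    return e.encoded[count * e.npatterns + pattern];
  size_t last = (e.nelts_per_pattern - 1) * e.npatterns + pattern;
  int64_t final_elt = e.encoded[last];
  if (e.nelts_per_pattern <= 2)
    return final_elt;
  // Extend the series using the step between the last two encoded values.
  int64_t step = final_elt - e.encoded[last - e.npatterns];
  return final_elt + step * static_cast<int64_t>(count - 2);
}

// Materialize the first N elements, e.g. once the length is known.
inline std::vector<int64_t> toy_expand(const toy_encoding &e, size_t n) {
  std::vector<int64_t> full;
  for (size_t i = 0; i < n; ++i)
    full.push_back(toy_elt(e, i));
  return full;
}
```

With one three-element pattern {0, 1, 2} the toy describes the series
0, 1, 2, 3, ... for any length; two patterns {0, 1, 2} and {8, 9, 10}
describe an interleave of two such series.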
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* vector-builder.h (vector_builder::m_full_nelts): Change from
unsigned int to poly_uint64.
(vector_builder::full_nelts): Update prototype accordingly.
(vector_builder::new_vector): Likewise.
(vector_builder::encoded_full_vector_p): Handle polynomial full_nelts.
(vector_builder::operator ==): Likewise.
(vector_builder::finalize): Likewise.
* int-vector-builder.h (int_vector_builder::int_vector_builder):
Take the number of elements as a poly_uint64 rather than an
unsigned int.
* vec-perm-indices.h (vec_perm_indices::m_nelts_per_input): Change
from unsigned int to poly_uint64.
(vec_perm_indices::vec_perm_indices): Update prototype accordingly.
(vec_perm_indices::new_vector): Likewise.
(vec_perm_indices::length): Likewise.
(vec_perm_indices::nelts_per_input): Likewise.
(vec_perm_indices::input_nelts): Likewise.
* vec-perm-indices.c (vec_perm_indices::new_vector): Take the
number of elements per input as a poly_uint64 rather than an
unsigned int. Use the original encoding for variable-length
vectors, rather than clamping each individual element.
For the second and subsequent elements in each pattern,
clamp the step and base before clamping their sum.
(vec_perm_indices::series_p): Handle polynomial element counts.
(vec_perm_indices::all_in_range_p): Likewise.
(vec_perm_indices_to_tree): Likewise.
(vec_perm_indices_to_rtx): Likewise.
* tree-vect-stmts.c (vect_gen_perm_mask_any): Likewise.
* tree-vector-builder.c (tree_vector_builder::new_unary_operation)
(tree_vector_builder::new_binary_operation): Handle polynomial
element counts. Return false if we need to know the number
of elements at compile time.
* fold-const.c (fold_vec_perm): Punt if the number of elements
isn't known at compile time.
From-SVN: r256165
This patch changes the vec_perm_indices element type from HOST_WIDE_INT
to poly_int64, so that it can represent indices into a variable-length
vector.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* vec-perm-indices.h (vec_perm_builder): Change element type
from HOST_WIDE_INT to poly_int64.
(vec_perm_indices::element_type): Update accordingly.
(vec_perm_indices::clamp): Handle polynomial element_types.
* vec-perm-indices.c (vec_perm_indices::series_p): Likewise.
(vec_perm_indices::all_in_range_p): Likewise.
(tree_to_vec_perm_builder): Check for poly_int64 trees rather
than shwi trees.
* vector-builder.h (vector_builder::stepped_sequence_p): Handle
polynomial vec_perm_indices element types.
* int-vector-builder.h (int_vector_builder::equal_p): Likewise.
* fold-const.c (fold_vec_perm): Likewise.
* optabs.c (shift_amt_for_vec_perm_mask): Likewise.
* tree-vect-generic.c (lower_vec_perm): Likewise.
* tree-vect-slp.c (vect_transform_slp_perm_load): Likewise.
* config/aarch64/aarch64.c (aarch64_evpc_tbl): Cast d->perm
element type to HOST_WIDE_INT.
From-SVN: r256164
The xsize and ysize arguments to memrefs_conflict_p are encoded such
that:
- 0 means the size is unknown
- >0 means the size is known
- <0 means that the negative of the size is a worst-case size after
alignment
In other words, the sign effectively encodes a boolean; it isn't
meant to be taken literally. With poly_ints these correspond to:
- must_eq (..., 0)
- may_gt (..., 0)
- may_lt (..., 0)
respectively.
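
The mapping can be sketched with a toy two-coefficient value
c0 + c1 * x, where x is a non-negative runtime quantity. This is only
a model of the idea, not the poly_int implementation; toy_poly,
size_kind and classify_size are hypothetical names, and the sketch
assumes that both coefficients of a size have the same sign.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for a polynomial size: the value c0 + c1 * x, where x is
// only known to be a non-negative integer at compile time.
struct toy_poly {
  int64_t c0;
  int64_t c1;
};

// True if the value is zero for every x >= 0.
inline bool must_eq_zero(toy_poly p) { return p.c0 == 0 && p.c1 == 0; }

// True if the value is greater than zero for at least one x >= 0.
inline bool may_gt_zero(toy_poly p) { return p.c0 > 0 || p.c1 > 0; }

// True if the value is less than zero for at least one x >= 0.
inline bool may_lt_zero(toy_poly p) { return p.c0 < 0 || p.c1 < 0; }

enum class size_kind { unknown, known, worst_case_after_alignment };

// Decode the sign convention described above.  Assumes both coefficients
// have the same sign, so at most one of the "may" tests can succeed.
inline size_kind classify_size(toy_poly size) {
  if (must_eq_zero(size))
    return size_kind::unknown;
  if (may_gt_zero(size))
    return size_kind::known;
  return size_kind::worst_case_after_alignment;
}
```

Using the "may" forms matters because a size such as 16 + 16x has no
single sign test on a scalar; the sign is a property of the whole
polynomial.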
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* alias.c (addr_side_effect_eval): Take the size as a poly_int64
rather than an int. Use plus_constant.
(memrefs_conflict_p): Take the sizes as poly_int64s rather than ints.
Take the offset "c" as a poly_int64 rather than a HOST_WIDE_INT.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256163
This patch makes calls.c treat struct_value_size (one of the
operands to a call pattern) as polynomial.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* calls.c (emit_call_1, expand_call): Change struct_value_size from
a HOST_WIDE_INT to a poly_int64.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256162
This patch makes load_register_parameters cope with polynomial sizes.
The requirement here is that any register parameters with non-constant
sizes must either have a specific mode (e.g. a variable-length vector
mode) or must be represented with a PARALLEL. This is in practice
already a requirement for parameters passed in vector registers,
since the default behaviour of splitting parameters into words doesn't
make sense for them.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* calls.c (load_register_parameters): Cope with polynomial
mode sizes. Require a constant size for BLKmode parameters
that aren't described by a PARALLEL. If BLOCK_REG_PADDING
forces a parameter to be padded at the lsb end in order to
fill a complete number of words, require the parameter size
to be ordered wrt UNITS_PER_WORD.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256161
This patch makes alter_reg cope with polynomial mode sizes.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* reload1.c (spill_stack_slot_width): Change element type
from unsigned int to poly_uint64_pod.
(alter_reg): Treat mode sizes as polynomial.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256160
This patch splits out a condition that is common to both push_reload
and reload_inner_reg_of_subreg.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* reload.c (complex_word_subreg_p): New function.
(reload_inner_reg_of_subreg, push_reload): Use it.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256159
This patch makes process_alt_operands check that the mode sizes
are ordered, so that match_reload can validly treat them as subregs
of one another.
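
"Ordered" here means that one size is known to be less than or equal
to the other for every runtime vector length. A minimal sketch of that
test on a toy two-coefficient value (hypothetical names, not the
poly_int API):

```cpp
#include <cassert>
#include <cstdint>

// Toy polynomial size: c0 + c1 * x for an unknown integer x >= 0.
struct toy_poly {
  uint64_t c0;
  uint64_t c1;
};

// True if a <= b for every x >= 0.
inline bool toy_must_le(toy_poly a, toy_poly b) {
  return a.c0 <= b.c0 && a.c1 <= b.c1;
}

// True if the two sizes can be ranked at compile time, i.e. the same one
// is the smaller (or equal) size regardless of x.
inline bool toy_ordered_p(toy_poly a, toy_poly b) {
  return toy_must_le(a, b) || toy_must_le(b, a);
}
```

For example, 16 bytes and 8 + 8x bytes are not ordered: the first is
larger when x is 0 and smaller when x is 2, so neither can safely be
treated as a subreg of the other.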
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* lra-constraints.c (process_alt_operands): Reject matched
operands whose sizes aren't ordered.
(match_reload): Refer to this check here.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256158
This patch makes the mode size assumptions in
expand_ifn_atomic_compare_exchange_into_call a bit more
explicit, so that a later patch can add a to_constant () call.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* builtins.c (expand_ifn_atomic_compare_exchange_into_call): Assert
that the mode size is in the set {1, 2, 4, 8, 16}.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256157
This patch makes the var-tracking.c handling of autoinc addresses
cope with polynomial mode sizes.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* var-tracking.c (adjust_mems): Treat mode sizes as polynomial.
Use plus_constant instead of gen_rtx_PLUS.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256156
PUSH_ROUNDING is difficult to convert to a hook since there is still
a lot of conditional code based on it. It isn't clear that a direct
conversion with checks for null hooks is the right thing to do.
Rather than untangle that, this patch converts all implementations
that do something to out-of-line functions that have the same
interface as a hook would have. This should at least help towards
any future hook conversion.
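
The shape of the conversion can be sketched as follows for a
hypothetical "toy" target whose pushes always move the stack pointer
by an even number of bytes. The names and the rounding rule are
illustrative assumptions and do not correspond to any of the ports
listed below; a plain int64_t stands in for poly_int64.

```cpp
#include <cassert>
#include <cstdint>

// Would live in a hypothetical toy-protos.h: the out-of-line function
// has the interface that a target hook would have.
int64_t toy_push_rounding(int64_t bytes);

// Would live in a hypothetical toy.h: the macro survives, so existing
// "#ifdef PUSH_ROUNDING" code keeps working, but it only forwards.
#define PUSH_ROUNDING(BYTES) toy_push_rounding(BYTES)

// Would live in a hypothetical toy.c: round the push size up to the
// next multiple of 2 bytes (an assumed rule for this toy target).
int64_t toy_push_rounding(int64_t bytes) {
  return (bytes + 1) & ~static_cast<int64_t>(1);
}
```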
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* config/cr16/cr16-protos.h (cr16_push_rounding): Declare.
* config/cr16/cr16.h (PUSH_ROUNDING): Move implementation to...
* config/cr16/cr16.c (cr16_push_rounding): ...this new function.
* config/h8300/h8300-protos.h (h8300_push_rounding): Declare.
* config/h8300/h8300.h (PUSH_ROUNDING): Move implementation to...
* config/h8300/h8300.c (h8300_push_rounding): ...this new function.
* config/i386/i386-protos.h (ix86_push_rounding): Declare.
* config/i386/i386.h (PUSH_ROUNDING): Move implementation to...
* config/i386/i386.c (ix86_push_rounding): ...this new function.
* config/m32c/m32c-protos.h (m32c_push_rounding): Take and return
a poly_int64.
* config/m32c/m32c.c (m32c_push_rounding): Likewise.
* config/m68k/m68k-protos.h (m68k_push_rounding): Declare.
* config/m68k/m68k.h (PUSH_ROUNDING): Move implementation to...
* config/m68k/m68k.c (m68k_push_rounding): ...this new function.
* config/pdp11/pdp11-protos.h (pdp11_push_rounding): Declare.
* config/pdp11/pdp11.h (PUSH_ROUNDING): Move implementation to...
* config/pdp11/pdp11.c (pdp11_push_rounding): ...this new function.
* config/stormy16/stormy16-protos.h (xstormy16_push_rounding): Declare.
* config/stormy16/stormy16.h (PUSH_ROUNDING): Move implementation to...
* config/stormy16/stormy16.c (xstormy16_push_rounding): ...this new
function.
* expr.c (emit_move_resolve_push): Treat the input and result
of PUSH_ROUNDING as a poly_int64.
(emit_move_complex_push, emit_single_push_insn_1): Likewise.
(emit_push_insn): Likewise.
* lra-eliminations.c (mark_not_eliminable): Likewise.
* recog.c (push_operand): Likewise.
* reload1.c (elimination_effects): Likewise.
* rtlanal.c (nonzero_bits1): Likewise.
* calls.c (store_one_arg): Likewise. Require the padding to be
known at compile time.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256155
This patch makes emit_single_push_insn_1 cope with polynomial mode sizes.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* expr.c (emit_single_push_insn_1): Treat mode sizes as polynomial.
Use plus_constant instead of gen_rtx_PLUS.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256154
This trivial patch makes auto-inc-dec.c:set_inc_state take a poly_int64.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* auto-inc-dec.c (set_inc_state): Take the mode size as a poly_int64
rather than an int.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256153
This patch makes the VIEW_CONVERT_EXPR handling in expand_expr_real_1
cope with polynomial type and mode sizes.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* expr.c (expand_expr_real_1): Use tree_to_poly_uint64
instead of int_size_in_bytes when handling VIEW_CONVERT_EXPRs
via stack temporaries. Treat the mode size as polynomial too.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256152
This patch makes expand_expr_real_2 cope with polynomial mode sizes
when handling conversions involving a union type.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* expr.c (expand_expr_real_2): When handling conversions involving
unions, apply tree_to_poly_uint64 to the TYPE_SIZE rather than
multiplying int_size_in_bytes by BITS_PER_UNIT. Treat GET_MODE_BITSIZE
as a poly_uint64 too.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256151
This patch makes subreg_get_info handle polynomial sizes.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* rtlanal.c (subreg_get_info): Handle polynomial mode sizes.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256150
This patch makes target-independent code that uses REGMODE_NATURAL_SIZE
treat it as a poly_int rather than a constant.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* combine.c (can_change_dest_mode): Handle polynomial
REGMODE_NATURAL_SIZE.
* expmed.c (store_bit_field_1): Likewise.
* expr.c (store_constructor): Likewise.
* emit-rtl.c (validate_subreg): Operate on polynomial mode sizes
and polynomial REGMODE_NATURAL_SIZE.
(gen_lowpart_common): Likewise.
* reginfo.c (record_subregs_of_mode): Likewise.
* rtlanal.c (read_modify_subreg_p): Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256149
This patch makes expand_vector_ubsan_overflow cope with a polynomial
number of elements.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* internal-fn.c (expand_vector_ubsan_overflow): Handle polynomial
numbers of elements.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256148
This patch makes the:
(BIT_FIELD_REF CONSTRUCTOR@0 @1 @2)
folder cope with polynomial numbers of elements.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* match.pd: Cope with polynomial numbers of vector elements.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256147
This patch makes fold_indirect_ref_1 handle polynomial offsets in
a POINTER_PLUS_EXPR. The specific reason for doing this now is
to handle:
(tree_to_uhwi (part_width) / BITS_PER_UNIT
* TYPE_VECTOR_SUBPARTS (op00type));
when TYPE_VECTOR_SUBPARTS becomes a poly_int.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* fold-const.c (fold_indirect_ref_1): Handle polynomial offsets
in a POINTER_PLUS_EXPR.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256146
This patch adds a wrapper around TYPE_VECTOR_SUBPARTS for omp-simd-clone.c.
Supporting SIMD clones for variable-length vectors is post GCC8 work.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* omp-simd-clone.c (simd_clone_subparts): New function.
(simd_clone_init_simd_arrays): Use it instead of TYPE_VECTOR_SUBPARTS.
(ipa_simd_modify_function_body): Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256145
This patch adds a brig-specific wrapper around TYPE_VECTOR_SUBPARTS,
since presumably it will never need to support variable vector lengths.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/brig/
* brigfrontend/brig-util.h (gccbrig_type_vector_subparts): New
function.
* brigfrontend/brig-basic-inst-handler.cc
(brig_basic_inst_handler::build_shuffle): Use it instead of
TYPE_VECTOR_SUBPARTS.
(brig_basic_inst_handler::build_unpack): Likewise.
(brig_basic_inst_handler::build_pack): Likewise.
(brig_basic_inst_handler::build_unpack_lo_or_hi): Likewise.
(brig_basic_inst_handler::operator ()): Likewise.
(brig_basic_inst_handler::build_lower_element_broadcast): Likewise.
* brigfrontend/brig-code-entry-handler.cc
(brig_code_entry_handler::get_tree_cst_for_hsa_operand): Likewise.
(brig_code_entry_handler::get_comparison_result_type): Likewise.
(brig_code_entry_handler::expand_or_call_builtin): Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256144
This patch makes tree-vect-generic.c cope with variable-length vectors.
Decomposition is only supported for constant-length vectors, since we
should never generate unsupported variable-length operations.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* tree-vect-generic.c (nunits_for_known_piecewise_op): New function.
(expand_vector_piecewise): Use it instead of TYPE_VECTOR_SUBPARTS.
(expand_vector_addition, add_rshift, expand_vector_divmod): Likewise.
(expand_vector_condition, vector_element): Likewise.
(subparts_gt): New function.
(get_compute_type): Use subparts_gt.
(count_type_subparts): Delete.
(expand_vector_operations_1): Use subparts_gt instead of
count_type_subparts.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256143
This patch replaces the two-state vect_no_alias_p with a three-state
vect_compile_time_alias that handles polynomial segment lengths.
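
The need for a third state can be sketched with toy polynomial values
of the form c0 + c1 * x. This is a simplified model under the
assumption that both segments are non-empty; the names are
hypothetical and the real function works on data references rather
than raw ranges.

```cpp
#include <cassert>
#include <cstdint>

// Toy polynomial value: c0 + c1 * x for an unknown integer x >= 0.
struct toy_poly {
  int64_t c0;
  int64_t c1;
};

inline toy_poly operator+(toy_poly a, toy_poly b) {
  return {a.c0 + b.c0, a.c1 + b.c1};
}

// True if a < b for at least one x >= 0.
inline bool toy_may_lt(toy_poly a, toy_poly b) {
  return a.c0 < b.c0 || a.c1 < b.c1;
}

// True if a < b for every x >= 0.
inline bool toy_must_lt(toy_poly a, toy_poly b) {
  return a.c0 < b.c0 && a.c1 <= b.c1;
}

enum class toy_alias { no, yes, unknown };

// Compare the segments [start_a, start_a + len_a) and
// [start_b, start_b + len_b).  Two non-empty ranges overlap exactly
// when each one starts before the other ends.
inline toy_alias toy_compile_time_alias(toy_poly start_a, toy_poly len_a,
                                        toy_poly start_b, toy_poly len_b) {
  toy_poly end_a = start_a + len_a;
  toy_poly end_b = start_b + len_b;
  if (!toy_may_lt(start_a, end_b) || !toy_may_lt(start_b, end_a))
    return toy_alias::no;       // disjoint for every x
  if (toy_must_lt(start_a, end_b) && toy_must_lt(start_b, end_a))
    return toy_alias::yes;      // overlapping for every x
  return toy_alias::unknown;    // depends on x: needs a runtime check
}
```

With constant lengths only the first two answers are possible, which
is why a two-state predicate was enough before.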
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* tree-vect-data-refs.c (vect_no_alias_p): Replace with...
(vect_compile_time_alias): ...this new function. Do the calculation
on poly_ints rather than trees.
(vect_prune_runtime_alias_test_list): Update call accordingly.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256142
This patch makes two-operation SLP cope with the possibility of
variable-length vectors by rejecting them. Adding support for this
is a post-GCC8 thing.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* tree-vect-slp.c (vect_build_slp_tree_1): Handle polynomial
numbers of units.
(vect_schedule_slp_instance): Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256141
For now, vect_get_constant_vectors can only cope with constant-length
vectors, although a patch after the main SVE submission relaxes this.
This patch adds an appropriate guard for variable-length vectors.
The TYPE_VECTOR_SUBPARTS use in vect_get_constant_vectors will then
have a to_constant call when TYPE_VECTOR_SUBPARTS becomes a poly_int.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* tree-vect-slp.c (vect_get_and_check_slp_defs): Reject
constant and extern definitions for variable-length vectors.
(vect_get_constant_vectors): Note that the number of units
is known to be constant.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256140
This patch makes vectorizable_conversion cope with variable-length
vectors. We already require the number of elements in one vector
to be a multiple of the number of elements in the other vector,
so the patch uses that to choose between widening and narrowing.
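
The selection can be sketched as below on toy element counts of the
form c0 + c1 * x. The multiple test is deliberately conservative and
all names are hypothetical; this is an illustration of the idea
rather than the code in vectorizable_conversion.

```cpp
#include <cassert>
#include <cstdint>

// Toy polynomial element count: c0 + c1 * x for an unknown integer x >= 0.
struct toy_poly {
  uint64_t c0;
  uint64_t c1;
};

inline bool toy_must_eq(toy_poly a, toy_poly b) {
  return a.c0 == b.c0 && a.c1 == b.c1;
}

// Conservative test: true only if a == factor * b coefficient by
// coefficient for a single integer factor.  Assumes b.c0 != 0, which
// holds for element counts.
inline bool toy_multiple_p(toy_poly a, toy_poly b) {
  if (b.c0 == 0 || a.c0 % b.c0 != 0)
    return false;
  uint64_t factor = a.c0 / b.c0;
  return a.c1 == factor * b.c1;
}

enum class toy_modifier { none, narrow, widen, unsupported };

// More output elements per vector means narrower output elements.
inline toy_modifier toy_choose_modifier(toy_poly nunits_in,
                                        toy_poly nunits_out) {
  if (toy_must_eq(nunits_out, nunits_in))
    return toy_modifier::none;
  if (toy_multiple_p(nunits_out, nunits_in))
    return toy_modifier::narrow;
  if (toy_multiple_p(nunits_in, nunits_out))
    return toy_modifier::widen;
  return toy_modifier::unsupported;
}
```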
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* tree-vect-stmts.c (vectorizable_conversion): Treat the number
of units as polynomial. Choose between WIDE and NARROW based
on multiple_p.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256139
This patch makes vectorizable_simd_clone_call cope with variable-length
vectors. For now we don't support SIMD clones for variable-length
vectors; this will be post GCC 8 material.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* tree-vect-stmts.c (simd_clone_subparts): New function.
(vectorizable_simd_clone_call): Use it instead of TYPE_VECTOR_SUBPARTS.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256138
This patch makes vectorizable_call handle variable-length vectors.
The only substantial change is to use build_index_vector for
IFN_GOMP_SIMD_LANE; this makes no functional difference for
fixed-length vectors.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* tree-vect-stmts.c (vectorizable_call): Treat the number of
vectors as polynomial. Use build_index_vector for
IFN_GOMP_SIMD_LANE.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256137
This patch makes vectorizable_load and vectorizable_store cope with
variable-length vectors. The reverse and permute cases will be
excluded by the code that checks the permutation mask (although a
patch after the main SVE submission adds support for the reversed
case). Here we also need to exclude VMAT_ELEMENTWISE and
VMAT_STRIDED_SLP, which split the operation up into a constant
number of constant-sized operations. We also don't try to extend
the current widening gather/scatter support to variable-length
vectors, since SVE uses a different approach.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* tree-vect-stmts.c (get_load_store_type): Treat the number of
units as polynomial. Reject VMAT_ELEMENTWISE and VMAT_STRIDED_SLP
for variable-length vectors.
(vectorizable_mask_load_store): Treat the number of units as
polynomial, asserting that it is constant if the condition has
already been enforced.
(vectorizable_store, vectorizable_load): Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256136
This patch makes vectorizable_live_operation cope with variable-length
vectors. For now we just handle cases in which we can tell at compile
time which vector contains the final result.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* tree-vect-loop.c (vectorizable_live_operation): Treat the number
of units as polynomial. Punt if we can't tell at compile time
which vector contains the final result.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256135
This patch makes vectorizable_induction cope with variable-length
vectors. For now we punt on SLP inductions, but patches after
the main SVE submission add support for those too.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* tree-vect-loop.c (vectorizable_induction): Treat the number
of units as polynomial. Punt on SLP inductions. Use an integer
VEC_SERIES_EXPR for variable-length integer reductions. Use a
cast of such a series for variable-length floating-point
reductions.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256134
This patch makes vectorizable_reduction cope with variable-length vectors.
We can handle the simple case of an inner loop reduction for which
the target has native support for the epilogue operation. For now we
punt on other cases, but patches after the main SVE submission allow
SLP and double reductions too.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* tree.h (build_index_vector): Declare.
* tree.c (build_index_vector): New function.
* tree-vect-loop.c (get_initial_defs_for_reduction): Treat the number
of units as polynomial, forcibly converting it to a constant if
vectorizable_reduction has already enforced the condition.
(vect_create_epilog_for_reduction): Likewise. Use build_index_vector
to create a {1,2,3,...} vector.
(vectorizable_reduction): Treat the number of units as polynomial.
Choose vectype_in based on the largest scalar element size rather
than the smallest number of units. Enforce the restrictions
relied on above.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256133
This patch makes vector_alignment_reachable_p cope with variable-length
vectors.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* tree-vect-data-refs.c (vector_alignment_reachable_p): Treat the
number of units as polynomial.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256132
This patch changes the type of current_vector_size to poly_uint64.
It also changes TARGET_AUTOVECTORIZE_VECTOR_SIZES so that it fills
in a vector of possible sizes (as poly_uint64s) instead of returning
a bitmask. The documentation claimed that the hook didn't need to
include the default vector size (returned by preferred_simd_mode),
but that wasn't consistent with the omp-low.c usage.
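
To illustrate the interface change, the sketch below converts an
old-style bitmask of power-of-two byte sizes into an explicit list,
largest first. The helper name is hypothetical, and plain uint64_t
values stand in for poly_uint64s.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Expand a bitmask such as (32 | 16) into the list {32, 16}.  Each set
// bit is one candidate vector size in bytes; larger sizes come first.
inline std::vector<uint64_t> toy_sizes_from_bitmask(unsigned int mask) {
  std::vector<uint64_t> sizes;
  for (unsigned int bit = 1u << 31; bit != 0; bit >>= 1)
    if (mask & bit)
      sizes.push_back(bit);
  return sizes;
}
```

A bitmask can only describe constant power-of-two sizes, whereas a
list of polynomial values can also hold a size such as 16 + 16x bytes.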
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* target.h (vector_sizes, auto_vector_sizes): New typedefs.
* target.def (autovectorize_vector_sizes): Return the vector sizes
by pointer, using vector_sizes rather than a bitmask.
* targhooks.h (default_autovectorize_vector_sizes): Update accordingly.
* targhooks.c (default_autovectorize_vector_sizes): Likewise.
* config/aarch64/aarch64.c (aarch64_autovectorize_vector_sizes):
Likewise.
* config/arc/arc.c (arc_autovectorize_vector_sizes): Likewise.
* config/arm/arm.c (arm_autovectorize_vector_sizes): Likewise.
* config/i386/i386.c (ix86_autovectorize_vector_sizes): Likewise.
* config/mips/mips.c (mips_autovectorize_vector_sizes): Likewise.
* omp-general.c (omp_max_vf): Likewise.
* omp-low.c (omp_clause_aligned_alignment): Likewise.
* optabs-query.c (can_vec_mask_load_store_p): Likewise.
* tree-vect-loop.c (vect_analyze_loop): Likewise.
* tree-vect-slp.c (vect_slp_bb): Likewise.
* doc/tm.texi: Regenerate.
* tree-vectorizer.h (current_vector_size): Change from an unsigned int
to a poly_uint64.
* tree-vect-stmts.c (get_vectype_for_scalar_type_and_size): Take
the vector size as a poly_uint64 rather than an unsigned int.
(current_vector_size): Change from an unsigned int to a poly_uint64.
(get_vectype_for_scalar_type): Update accordingly.
* tree.h (build_truth_vector_type): Take the size and number of
units as a poly_uint64 rather than an unsigned int.
(build_vector_type): Add a temporary overload that takes
the number of units as a poly_uint64 rather than an unsigned int.
* tree.c (make_vector_type): Likewise.
(build_truth_vector_type): Take the number of units as a poly_uint64
rather than an unsigned int.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256131
This patch makes TARGET_GET_MASK_MODE take polynomial nunits and
vector_size arguments. The gcc_assert in default_get_mask_mode
is now handled by the exact_div call in vector_element_size.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* target.def (get_mask_mode): Take the number of units and length
as poly_uint64s rather than unsigned ints.
* targhooks.h (default_get_mask_mode): Update accordingly.
* targhooks.c (default_get_mask_mode): Likewise.
* config/i386/i386.c (ix86_get_mask_mode): Likewise.
* doc/tm.texi: Regenerate.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256130
This patch makes omp_max_vf return a polynomial vectorization factor.
We then need to be able to stash a polynomial value in
OMP_CLAUSE_SAFELEN_EXPR too:
/* If max_vf is non-zero, then we can use only a vectorization factor
up to the max_vf we chose. So stick it into the safelen clause. */
For now the cfgloop safelen is still constant though.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* omp-general.h (omp_max_vf): Return a poly_uint64 instead of an int.
* omp-general.c (omp_max_vf): Likewise.
* omp-expand.c (omp_adjust_chunk_size): Update call to omp_max_vf.
(expand_omp_simd): Handle polynomial safelen.
* omp-low.c (omplow_simd_context): Add a default constructor.
(omplow_simd_context::max_vf): Change from int to poly_uint64.
(lower_rec_simd_input_clauses): Update accordingly.
(lower_rec_input_clauses): Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256129
This patch adds a function for getting the number of elements in
a vector for cost purposes, which is always constant. It makes
it possible for a later patch to change GET_MODE_NUNITS and
TYPE_VECTOR_SUBPARTS to a poly_int.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* tree-vectorizer.h (vect_nunits_for_cost): New function.
* tree-vect-loop.c (vect_model_reduction_cost): Use it.
* tree-vect-slp.c (vect_analyze_slp_cost_1): Likewise.
(vect_analyze_slp_cost): Likewise.
* tree-vect-stmts.c (vect_model_store_cost): Likewise.
(vect_model_load_cost): Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256128
This patch makes tree-vect-slp.c track the maximum number of vector
units as a poly_uint64 rather than an unsigned int.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* tree-vect-slp.c (vect_record_max_nunits, vect_build_slp_tree_1)
(vect_build_slp_tree_2, vect_build_slp_tree): Change max_nunits
from an unsigned int * to a poly_uint64_pod *.
(calculate_unrolling_factor): New function.
(vect_analyze_slp_instance): Use it. Track polynomial max_nunits.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256127
This patch changes the type of the vectorisation factor and SLP
unrolling factor to poly_uint64. This in turn required some knock-on
changes in signedness elsewhere.
Cost decisions are generally based on estimated_poly_value,
which for VF is wrapped up as vect_vf_for_cost.
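
A minimal sketch of the idea, assuming a toy VF of the form
c0 + c1 * x and a made-up tuning constant for the most likely x.
The names and the way the estimate is formed are illustrative
assumptions, not the behaviour of estimated_poly_value on any target.

```cpp
#include <cassert>
#include <cstdint>

// Toy polynomial vectorization factor: c0 + c1 * x, x >= 0 unknown.
struct toy_poly {
  uint64_t c0;
  uint64_t c1;
};

// Hypothetical tuning knob: the value of x assumed for cost purposes.
constexpr uint64_t toy_likely_x = 1;

// Collapse a polynomial to a single constant guess.
inline uint64_t toy_estimated_value(toy_poly p) {
  return p.c0 + p.c1 * toy_likely_x;
}

// Cost code wants a plain number, so wrap the estimate for the VF.
inline uint64_t toy_vf_for_cost(toy_poly vf) {
  return toy_estimated_value(vf);
}

// Example cost query: how many vector iterations do we expect?
inline uint64_t toy_expected_vector_iters(uint64_t scalar_iters,
                                          toy_poly vf) {
  return scalar_iters / toy_vf_for_cost(vf);
}
```

The estimate is only used to compare costs; correctness decisions
still have to hold for every possible x.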
The patch doesn't on its own enable variable-length vectorisation.
It just makes the minimum changes necessary for the code to build
with the new VF and UF types. Later patches also make the
vectoriser cope with variable TYPE_VECTOR_SUBPARTS and variable
GET_MODE_NUNITS, at which point the code really does handle
variable-length vectors.
The patch also changes MAX_VECTORIZATION_FACTOR to INT_MAX,
to avoid hard-coding a particular architectural limit.
The patch includes a new test because a development version of the patch
accidentally used file print routines instead of dump_*, which would
fail with -fopt-info.
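As an illustration of why cost decisions need an estimate, here is a
minimal sketch of a two-coefficient polynomial value. PolyU64, always_le,
estimated_value and estimated_vector_iters are hypothetical names invented
for this example; they are not GCC's poly_uint64, estimated_poly_value or
vect_vf_for_cost, and the "likely_x" parameter is an assumption about how
a target might supply a typical runtime value.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a polynomial integer: the value is
// c0 + c1 * x, where x >= 0 is only known at run time.
struct PolyU64 {
  std::uint64_t c0;
  std::uint64_t c1;

  // True if the value does not depend on the runtime x.
  bool is_constant() const { return c1 == 0; }

  // The concrete value for one particular runtime x.
  std::uint64_t at(std::uint64_t x) const { return c0 + c1 * x; }
};

// True only if a <= b holds for every x >= 0.  This needs both
// coefficients to compare the right way round.
inline bool always_le(PolyU64 a, PolyU64 b) {
  return a.c0 <= b.c0 && a.c1 <= b.c1;
}

// Cost-model estimate: substitute an assumed "likely" x.  The result is
// only a guess, so it must not be used for correctness decisions.
inline std::uint64_t estimated_value(PolyU64 v, std::uint64_t likely_x) {
  return v.at(likely_x);
}

// Estimated number of vector iterations for costing purposes, rounding up.
inline std::uint64_t estimated_vector_iters(std::uint64_t scalar_iters,
                                            PolyU64 vf,
                                            std::uint64_t likely_x) {
  std::uint64_t est = estimated_value(vf, likely_x);
  assert(est != 0);
  return (scalar_iters + est - 1) / est;
}
```

In this sketch a VF of 4 + 4x is not "always <=" the constant 8, even
though its estimate for x = 1 equals 8, which is why the estimate is kept
for cost decisions only.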
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* tree-vectorizer.h (_slp_instance::unrolling_factor): Change
from an unsigned int to a poly_uint64.
(_loop_vec_info::slp_unrolling_factor): Likewise.
(_loop_vec_info::vectorization_factor): Change from an int
to a poly_uint64.
(MAX_VECTORIZATION_FACTOR): Bump from 64 to INT_MAX.
(vect_get_num_vectors): New function.
(vect_update_max_nunits, vect_vf_for_cost): Likewise.
(vect_get_num_copies): Use vect_get_num_vectors.
(vect_analyze_data_ref_dependences): Change max_vf from an int *
to an unsigned int *.
(vect_analyze_data_refs): Change min_vf from an int * to a
poly_uint64 *.
(vect_transform_slp_perm_load): Take the vf as a poly_uint64 rather
than an unsigned HOST_WIDE_INT.
* tree-vect-data-refs.c (vect_analyze_possibly_independent_ddr)
(vect_analyze_data_ref_dependence): Change max_vf from an int *
to an unsigned int *.
(vect_analyze_data_ref_dependences): Likewise.
(vect_compute_data_ref_alignment): Handle polynomial vf.
(vect_enhance_data_refs_alignment): Likewise.
(vect_prune_runtime_alias_test_list): Likewise.
(vect_shift_permute_load_chain): Likewise.
(vect_supportable_dr_alignment): Likewise.
(dependence_distance_ge_vf): Take the vectorization factor as a
poly_uint64 rather than an unsigned HOST_WIDE_INT.
(vect_analyze_data_refs): Change min_vf from an int * to a
poly_uint64 *.
* tree-vect-loop-manip.c (vect_gen_scalar_loop_niters): Take
vfm1 as a poly_uint64 rather than an int. Make the same change
for the returned bound_scalar.
(vect_gen_vector_loop_niters): Handle polynomial vf.
(vect_do_peeling): Likewise. Update call to
vect_gen_scalar_loop_niters and handle polynomial bound_scalars.
(vect_gen_vector_loop_niters_mult_vf): Assert that the vf must
be constant.
* tree-vect-loop.c (vect_determine_vectorization_factor)
(vect_update_vf_for_slp, vect_analyze_loop_2): Handle polynomial vf.
(vect_get_known_peeling_cost): Likewise.
(vect_estimate_min_profitable_iters, vectorizable_reduction): Likewise.
(vect_worthwhile_without_simd_p, vectorizable_induction): Likewise.
(vect_transform_loop): Likewise. Use the lowest possible VF when
updating the upper bounds of the loop.
(vect_min_worthwhile_factor): Make static. Return an unsigned int
rather than an int.
* tree-vect-slp.c (vect_attempt_slp_rearrange_stmts): Cope with
polynomial unroll factors.
(vect_analyze_slp_cost_1, vect_analyze_slp_instance): Likewise.
(vect_make_slp_decision): Likewise.
(vect_supported_load_permutation_p): Likewise, and polynomial
vf too.
(vect_analyze_slp_cost): Handle polynomial vf.
(vect_slp_analyze_node_operations): Likewise.
(vect_slp_analyze_bb_1): Likewise.
(vect_transform_slp_perm_load): Take the vf as a poly_uint64 rather
than an unsigned HOST_WIDE_INT.
* tree-vect-stmts.c (vectorizable_simd_clone_call, vectorizable_store)
(vectorizable_load): Handle polynomial vf.
* tree-vectorizer.c (simduid_to_vf::vf): Change from an int to
a poly_uint64.
(adjust_simduid_builtins, shrink_simd_arrays): Update accordingly.
gcc/testsuite/
* gcc.dg/vect-opt-info-1.c: New test.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256126
match.pd tries to reassociate two bit operations if both of them have
constant operands. However, with the polynomial integers added later,
there's no guarantee that a bit operation on two integers can be folded
at compile time. This means that the pattern can trigger for operations
on three constants, and as things stood could endlessly oscillate
between the two associations.
This patch keeps the existing pattern for the normal case of a
non-constant first operand. When all three operands are constant it
tries to find a pair of constants that do fold. If none do, it keeps
the original expression as-was.
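The three-constant case can be sketched as follows. This is not the
match.pd pattern itself: Constant, try_fold_and, Simplified and
reassociate_and are hypothetical names, and a constant that cannot be
folded is modelled simply as an empty optional.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical constant operand: either a plain integer, or a constant
// (such as a polynomial one) whose bits are not known at compile time.
struct Constant {
  std::optional<std::uint64_t> value;  // empty: cannot be folded
};

// Try to fold a & b.  Succeeds only when both operands are plain integers.
inline std::optional<Constant> try_fold_and(const Constant &a,
                                            const Constant &b) {
  if (a.value && b.value)
    return Constant{*a.value & *b.value};
  return std::nullopt;
}

// Describes the rewritten expression "remaining & folded".
struct Simplified {
  Constant folded;
  Constant remaining;
};

// Simplify (a & b) & c where all three operands are constants.
// Assumption: the inner a & b is present because it did not fold.
// Returns nullopt when no pair folds, meaning "keep the original".
inline std::optional<Simplified> reassociate_and(const Constant &a,
                                                 const Constant &b,
                                                 const Constant &c) {
  assert(!try_fold_and(a, b));
  if (auto bc = try_fold_and(b, c))
    return Simplified{*bc, a};
  if (auto ac = try_fold_and(a, c))
    return Simplified{*ac, b};
  return std::nullopt;
}
```

Because the expression is rewritten only when a pair actually folds,
every rewrite strictly reduces the number of operations, so the sketch
cannot oscillate between the two associations.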
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* match.pd: Handle bit operations involving three constants
and try to fold one pair.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256125
Normally we adjust the vector loop so that it iterates:
(original number of scalar iterations - number of peels) / VF
times, enforcing this using an IV that starts at zero and increments
by one each iteration. However, dividing by VF would be expensive
for variable VF, so this patch adds an alternative in which the IV
increments by VF each iteration instead. We then need to take care
to handle possible overflow in the IV.
The new mechanism isn't used yet; a later patch replaces the
"if (1)" with a check for variable VF.
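The two ways of counting vector iterations can be compared with a small
sketch. It is not the code generated by slpeel_make_loop_iterate_ntimes:
iterations_by_division and iterations_by_step are hypothetical names, the
overflow-safe exit test is one possible choice rather than the one the
patch uses, and the counter type is deliberately narrow so that values
near the top of its range are cheap to try.

```cpp
#include <cassert>
#include <cstdint>

// Deliberately narrow IV type so that wrap-around cases are easy to test.
using Counter = std::uint8_t;

// Baseline: divide once, then let the IV count 0, 1, 2, ...
inline unsigned iterations_by_division(Counter niters, Counter vf) {
  assert(vf != 0);
  Counter limit = niters / vf;
  unsigned count = 0;
  for (Counter iv = 0; iv < limit; ++iv)
    ++count;
  return count;
}

// Alternative: no division; the IV advances by vf each iteration.
inline unsigned iterations_by_step(Counter niters, Counter vf) {
  assert(vf != 0);
  unsigned count = 0;
  Counter iv = 0;
  // Test the remaining work instead of "iv + vf <= niters": the sum could
  // wrap in the IV type, whereas niters - iv cannot, because iv never
  // exceeds niters.
  while (niters - iv >= vf) {
    iv += vf;
    ++count;
  }
  return count;
}
```

Both functions return the same count for any non-zero vf; the second
trades the up-front division for a comparison that must be written with
wrap-around in mind.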
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* tree-vect-loop-manip.c: Include gimple-fold.h.
(slpeel_make_loop_iterate_ntimes): Add step, final_iv and
niters_maybe_zero parameters. Handle other cases besides a step of 1.
(vect_gen_vector_loop_niters): Add a step_vector_ptr parameter.
Add a path that uses a step of VF instead of 1, but disable it
for now.
(vect_do_peeling): Add step_vector, niters_vector_mult_vf_var
and niters_no_overflow parameters. Update calls to
slpeel_make_loop_iterate_ntimes and vect_gen_vector_loop_niters.
Create a new SSA name if the latter chooses to use a step other
than zero, and return it via niters_vector_mult_vf_var.
* tree-vect-loop.c (vect_transform_loop): Update calls to
vect_do_peeling, vect_gen_vector_loop_niters and
slpeel_make_loop_iterate_ntimes.
* tree-vectorizer.h (slpeel_make_loop_iterate_ntimes, vect_do_peeling)
(vect_gen_vector_loop_niters): Update declarations after above changes.
From-SVN: r256124
2018-01-02 Aaron Sawdey <acsawdey@linux.vnet.ibm.com>
* config/rs6000/rs6000-string.c (expand_block_move): Allow the use of
unaligned VSX load/store on P8/P9.
(expand_block_clear): Allow the use of unaligned VSX
load/store on P8/P9.
From-SVN: r256112
2018-01-02 Bill Schmidt <wschmidt@linux.vnet.ibm.com>
* config/rs6000/rs6000-p8swap.c (swap_feeds_both_load_and_store):
New function.
(rs6000_analyze_swaps): Mark a web unoptimizable if it contains a
swap associated with both a load and a store.
From-SVN: r256111