In order to handle large character lengths on (L)LP64 targets, switch
the GFortran character length from an int to a size_t.
This is an ABI change, as procedures with character arguments take
hidden arguments with the character length.
I also changed the _size member in vtables from int to size_t, as
there were some cases where character lengths and sizes were
apparently mixed up and caused regressions otherwise. Although I
haven't tested, this might enable very large derived types as well.
Also, as there are some places in the frontend were negative character
lengths are used as special flag values, in the frontend the character
length is handled as a signed variable of the same size as a size_t,
although in the runtime library it really is size_t.
I haven't changed the character length variables for the co-array
intrinsics, as this is something that may need to be synchronized with
OpenCoarrays.
This is v5 of the patch. v4 was applied but caused breakage on big
endian targets. These have been fixed and tested, thanks to access to
the GCC compile farm.
Overview of v4 of the patch: v3 was applied but had to reverted due to
breaking bootstrap. The fix is in resolve.c:resolve_charlen, where
it's necessary to check that an expression is constant before using
mpz_sgn.
Overview of v3 of the patch: All the issues pointed out by FX's review
of v2 have been fixed. In particular, there are now new functions
gfc_mpz_get_hwi and gfc_mpz_set_hwi, similar to the GMP functions
mpz_get_si and mpz_set_si, except that they get/set a HOST_WIDE_INT
instead of a long value. Similarly, gfc_get_int_expr now takes a
HOST_WIDE_INT instead of a long, gfc_extract_long is replaced by
gfc_extract_hwi. Also, the preliminary work to handle
gfc_charlen_type_node being unsigned has been removed.
Regtested on x86_64-pc-linux-gnu, i686-pc-linux-gnu and
powerpc64-unknown-linux-gnu. Also regtested all three targets by
modifying gfortran-dg.exp to also test with "-g -flto", no new
failures observed.
frontend:
2018-01-05 Janne Blomqvist <jb@gcc.gnu.org>
PR fortran/78534
PR fortran/66310
* array.c (got_charlen): Use gfc_charlen_int_kind.
* class.c (gfc_find_derived_vtab): Use gfc_size_kind instead of
hardcoded kind.
(find_intrinsic_vtab): Likewise.
* decl.c (match_char_length): Use gfc_charlen_int_kind.
(add_init_expr_to_sym): Use gfc_charlen_t and gfc_charlen_int_kind.
(gfc_match_implicit): Use gfc_charlen_int_kind.
* dump-parse-tree.c (show_char_const): Use gfc_charlen_t and size_t.
(show_expr): Use HOST_WIDE_INT_PRINT_DEC.
* expr.c (gfc_get_character_expr): Length parameter of type
gfc_charlen_t.
(gfc_get_int_expr): Value argument of type HOST_WIDE_INT.
(gfc_extract_hwi): New function.
(simplify_const_ref): Make string_len of type gfc_charlen_t.
(gfc_simplify_expr): Use HOST_WIDE_INT for substring refs.
* frontend-passes.c (optimize_trim): Use gfc_charlen_int_kind.
* gfortran.h (gfc_mpz_get_hwi): New prototype.
(gfc_mpz_set_hwi): Likewise.
(gfc_charlen_t): New typedef.
(gfc_expr): Use gfc_charlen_t for character lengths.
(gfc_size_kind): New extern variable.
(gfc_extract_hwi): New prototype.
(gfc_get_character_expr): Use gfc_charlen_t for character length.
(gfc_get_int_expr): Use HOST_WIDE_INT type for value argument.
* gfortran.texi: Update description of hidden string length argument.
* iresolve.c (check_charlen_present): Use gfc_charlen_int_kind.
(gfc_resolve_char_achar): Likewise.
(gfc_resolve_repeat): Pass string length directly without
temporary, use gfc_charlen_int_kind.
(gfc_resolve_transfer): Use gfc_charlen_int_kind.
* match.c (select_intrinsic_set_tmp): Use HOST_WIDE_INT for charlen.
* misc.c (gfc_mpz_get_hwi): New function.
(gfc_mpz_set_hwi): New function.
* module.c (atom_int): Change type from int to HOST_WIDE_INT.
(parse_integer): Don't complain about large integers.
(write_atom): Use HOST_WIDE_INT for integers.
(mio_integer): Handle integer type mismatch.
(mio_hwi): New function.
(mio_intrinsic_op): Use HOST_WIDE_INT.
(mio_array_ref): Likewise.
(mio_expr): Likewise.
* primary.c (match_substring): Use gfc_charlen_int_kind.
* resolve.c (resolve_substring_charlen): Use gfc_charlen_int_kind.
(resolve_character_operator): Likewise.
(resolve_assoc_var): Likewise.
(resolve_select_type): Use HOST_WIDE_INT for charlen, use snprintf.
(resolve_charlen): Use mpz_sgn to determine sign.
* simplify.c (gfc_simplify_repeat): Use HOST_WIDE_INT/gfc_charlen_t
instead of long.
* symbol.c (generate_isocbinding_symbol): Use gfc_charlen_int_kind.
* target-memory.c (size_character): Length argument of type
gfc_charlen_t.
(gfc_encode_character): Likewise.
(gfc_interpret_character): Use gfc_charlen_t.
* target-memory.h (gfc_encode_character): Modify prototype.
* trans-array.c (gfc_trans_array_ctor_element): Use existing type.
(get_array_ctor_var_strlen): Use gfc_conv_mpz_to_tree_type.
(trans_array_constructor): Use existing type.
(get_array_charlen): Likewise.
* trans-const.c (gfc_conv_mpz_to_tree_type): New function.
* trans-const.h (gfc_conv_mpz_to_tree_type): New prototype.
* trans-decl.c (gfc_trans_deferred_vars): Use existing type.
(add_argument_checking): Likewise.
* trans-expr.c (gfc_class_len_or_zero_get): Build const of type
gfc_charlen_type_node.
(gfc_conv_intrinsic_to_class): Use gfc_charlen_int_kind instead of
4, fold_convert to correct type.
(gfc_conv_class_to_class): Build const of type size_type_node for
size.
(gfc_copy_class_to_class): Likewise.
(gfc_conv_string_length): Use same type in expression.
(gfc_conv_substring): Likewise, use HOST_WIDE_INT for charlen.
(gfc_conv_string_tmp): Make sure len is of the right type.
(gfc_conv_concat_op): Use same type in expression.
(gfc_conv_procedure_call): Likewise.
(fill_with_spaces): Comment out memset() block due to spurious
-Wstringop-overflow warnings.
(gfc_trans_string_copy): Use gfc_charlen_type_node.
(alloc_scalar_allocatable_for_subcomponent_assignment):
fold_convert to right type.
(gfc_trans_subcomponent_assign): Likewise.
(trans_class_vptr_len_assignment): Build const of correct type.
(gfc_trans_pointer_assignment): Likewise.
(alloc_scalar_allocatable_for_assignment): fold_convert to right
type in expr.
(trans_class_assignment): Build const of correct type.
* trans-intrinsic.c (gfc_conv_associated): Likewise.
(gfc_conv_intrinsic_repeat): Do calculation in sizetype.
* trans-io.c (gfc_build_io_library_fndecls): Use
gfc_charlen_type_node for character lengths.
(set_string): Convert to right type in assignment.
* trans-stmt.c (gfc_trans_label_assign): Build const of
gfc_charlen_type_node.
(trans_associate_var): Likewise.
(gfc_trans_character_select): Likewise.
(gfc_trans_allocate): Likewise, don't typecast strlen result.
(gfc_trans_deallocate): Don't typecast strlen result.
* trans-types.c (gfc_size_kind): New variable.
(gfc_init_types): Determine gfc_charlen_int_kind and gfc_size_kind
from size_type_node.
* trans-types.h: Fix comment.
testsuite:
2018-01-05 Janne Blomqvist <jb@gcc.gnu.org>
PR fortran/78534
PR fortran/66310
* gfortran.dg/char_cast_1.f90: Update scan pattern.
* gfortran.dg/dependency_49.f90: Likewise.
* gfortran.dg/repeat_4.f90: Use integers of kind C_SIZE_T.
* gfortran.dg/repeat_7.f90: New test for PR 66310.
* gfortran.dg/scan_2.f90: Handle potential cast in assignment.
* gfortran.dg/string_1.f90: Limit to ilp32 targets.
* gfortran.dg/string_1_lp64.f90: New test.
* gfortran.dg/string_3.f90: Limit to ilp32 targets.
* gfortran.dg/string_3_lp64.f90: New test.
libgfortran:
2019-01-05 Janne Blomqvist <jb@gcc.gnu.org>
PR fortran/78534
* intrinsics/args.c (getarg_i4): Use gfc_charlen_type.
(get_command_argument_i4): Likewise.
(get_command_i4): Likewise.
* intrinsics/chmod.c (chmod_internal): Likewise.
* intrinsics/env.c (get_environment_variable_i4): Likewise.
* intrinsics/extends_type_of.c (struct vtype): Use size_t for size
member.
* intrinsics/gerror.c (gerror): Use gfc_charlen_type.
* intrinsics/getlog.c (getlog): Likewise.
* intrinsics/hostnm.c (hostnm_0): Likewise.
* intrinsics/string_intrinsics_inc.c (string_len_trim): Rework to
work if gfc_charlen_type is unsigned.
(string_scan): Likewise.
* io/transfer.c (transfer_character): Modify prototype.
(transfer_character_write): Likewise.
(transfer_character_wide): Likewise.
(transfer_character_wide_write): Likewise.
(transfer_array): Typecast to avoid signed-unsigned comparison.
* io/unit.c (is_trim_ok): Use gfc_charlen_type.
* io/write.c (namelist_write): Likewise.
* libgfortran.h (gfc_charlen_type): Change typedef to size_t.
From-SVN: r256284
PR libstdc++/83626
* src/filesystem/ops.cc (remove(const path&, error_code&)): Do not
report an error for ENOENT.
(remove_all(const path&)): Fix type of result variable.
(remove_all(const path&, error_code&)): Use non-throwing increment
for directory iterator. Call POSIX remove directly to avoid redundant
calls to symlink_status. Do not report errors for ENOENT.
* src/filesystem/std-ops.cc: Likewise.
* testsuite/27_io/filesystem/operations/remove_all.cc: Test throwing
overload.
* testsuite/experimental/filesystem/operations/remove_all.cc:
Likewise.
From-SVN: r256283
PR target/83604
* config/i386/i386-builtin.def
(__builtin_ia32_vgf2p8affineinvqb_v64qi,
__builtin_ia32_vgf2p8affineqb_v64qi, __builtin_ia32_vgf2p8mulb_v64qi):
Require also OPTION_MASK_ISA_AVX512F in addition to
OPTION_MASK_ISA_GFNI.
(__builtin_ia32_vgf2p8affineinvqb_v16qi_mask,
__builtin_ia32_vgf2p8affineqb_v16qi_mask): Require
OPTION_MASK_ISA_AVX512VL instead of OPTION_MASK_ISA_SSE in addition
to OPTION_MASK_ISA_GFNI.
(__builtin_ia32_vgf2p8mulb_v32qi_mask): Require
OPTION_MASK_ISA_AVX512VL in addition to OPTION_MASK_ISA_GFNI and
OPTION_MASK_ISA_AVX512BW.
(__builtin_ia32_vgf2p8mulb_v16qi_mask): Require
OPTION_MASK_ISA_AVX512VL instead of OPTION_MASK_ISA_AVX512BW in
addition to OPTION_MASK_ISA_GFNI.
(__builtin_ia32_vgf2p8affineinvqb_v16qi,
__builtin_ia32_vgf2p8affineqb_v16qi, __builtin_ia32_vgf2p8mulb_v16qi):
Require OPTION_MASK_ISA_SSE2 instead of OPTION_MASK_ISA_SSE in addition
to OPTION_MASK_ISA_GFNI.
* config/i386/i386.c (def_builtin): Change to builtin isa/isa2 being
a requirement for all ISAs rather than any of them with a few
exceptions.
(ix86_add_new_builtins): Clear OPTION_MASK_ISA_64BIT from isa before
processing.
(ix86_expand_builtin): Require all ISAs from builtin's isa and isa2
bitmasks to be enabled with 3 exceptions, instead of requiring any
enabled ISA with lots of exceptions.
* config/i386/sse.md (vgf2p8affineinvqb_<mode><mask_name>,
vgf2p8affineqb_<mode><mask_name>, vgf2p8mulb_<mode><mask_name>):
Change avx512bw in isa attribute to avx512f.
* config/i386/sgxintrin.h: Add license boilerplate.
* config/i386/vaesintrin.h: Likewise. Fix macro spelling __AVX512F
to __AVX512F__ and __AVX512VL to __AVX512VL__.
(_mm256_aesdec_epi128, _mm256_aesdeclast_epi128, _mm256_aesenc_epi128,
_mm256_aesenclast_epi128): Enable temporarily avx if __AVX__ is not
defined.
* config/i386/gfniintrin.h (_mm_gf2p8mul_epi8,
_mm_gf2p8affineinv_epi64_epi8, _mm_gf2p8affine_epi64_epi8): Enable
temporarily sse2 rather than sse if not enabled already.
* gcc.target/i386/sse-26.c: New test.
From-SVN: r256281
r241959 included code to stop the vectoriser increasing the alignment of
a "user-aligned" variable. This wasn't the main purpose of the patch,
but was done for consistency with pass_increase_alignment, and was
needed to make the testcase work.
The documentation for the aligned attribute says:
This attribute specifies a minimum alignment for the variable or
structure field, measured in bytes.
so I think it's reasonable for the vectoriser to increase the
alignment further, if that helps us to vectorise code. It's also
useful if the "user" alignment actually came from an earlier pass
rather than the source code.
A possible counterexample came up when this was discussed on the lists.
Users who are trying to collate things from several translation units
into a single section can use:
__attribute__((section ("whatever"), aligned(N)))
and would not want extra padding. It turns out that the supported way
of doing that is to add a "used" attribute, which works even when no
"aligned" attribute is given.
2018-01-05 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* tree-vect-data-refs.c (vect_compute_data_ref_alignment): Don't
punt for user-aligned variables.
gcc/testsuite/
* gcc.dg/vect/vect-align-4.c: New test.
* gcc.dg/vect/vect-nb-iter-ub-2.c (cc): Remove alignment attribute
and redefine as a structure with an unaligned member "b".
(foo): Update accordingly.
From-SVN: r256277
This patch add support for the missing transformation of
(x | y) == x -> (y & ~x) == 0. The transformation for (x & y) == x case
already exists in simplify-rtx.c since 2014 as of r218503 and this patch
only adds a couple of extra patterns for the IOR case. This benefits
targets that have the BICS instruction to generate better code. For
targets that do not have the BICS instructions, it still results in
no worse code generation and gives out 2 instructions.
ChangeLog Entries:
*** gcc/ChangeLog ***
2018-01-05 Sudakshina Das <sudi.das@arm.com>
PR target/82439
* simplify-rtx.c (simplify_relational_operation_1): Add simplifications
of (x|y) == x for BICS pattern.
*** gcc/testsuite/ChangeLog ***
2018-01-05 Sudakshina Das <sudi.das@arm.com>
PR target/82439
* gcc.target/aarch64/bics_5.c: New test.
* gcc.target/arm/bics_5.c: Likewise.
From-SVN: r256275
PR tree-optimization/83605
* gimple-ssa-strength-reduction.c: Include tree-eh.h.
(find_candidates_dom_walker::before_dom_children): Ignore stmts that
can throw.
* gcc.dg/pr83605.c: New test.
From-SVN: r256274
Following on from:
* tree-vrp.c (extract_range_from_multiplicative_op_1): Assert
for VR_RANGE only; don't allow VR_ANTI_RANGE.
(extract_range_from_binary_expr_1): Don't call
extract_range_from_multiplicative_op_1 if !range_int_cst_p.
there was a later call to extract_range_from_multiplicative_op_1 too,
that used a negative test for a symbolic (!is_gimple_min_invariant)
range rather than a positive test for an integer range.
2017-11-04 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* tree-vrp.c (extract_range_from_binary_expr_1): Check
range_int_cst_p rather than !symbolic_range_p before calling
extract_range_from_multiplicative_op_1.
From-SVN: r256262
The first BIT_FIELD_REF folding pattern assumed without checking that
operands satisfy tree_fits_uhwi_p. The second pattern does check this:
/* On constants we can use native encode/interpret to constant
fold (nearly) all BIT_FIELD_REFs. */
if (CONSTANT_CLASS_P (arg0)
&& can_native_interpret_type_p (type)
&& BITS_PER_UNIT == 8
&& tree_fits_uhwi_p (op1)
&& tree_fits_uhwi_p (op2))
so this patch adds the checks to the first pattern too. This is needed
for POLY_INT_CST bit positions.
2018-01-04 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* fold-const.c (fold_ternary_loc): Check tree_fits_uhwi_p before
using tree_to_uhwi.
From-SVN: r256258
tree-ssa-forwprop.c was asserting that a VEC_PERM_EXPR fold on three
VECTOR_CSTs would always succeed, but it's possible for it to fail
with variable-length vectors.
2017-12-22 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* tree-ssa-forwprop.c (is_combined_permutation_identity): Allow
the VEC_PERM_EXPR fold to fail.
From-SVN: r256257
PR debug/83585
* bb-reorder.c (insert_section_boundary_note): Set has_bb_partition
to switched_sections.
* gcc.dg/pr83585.c: New test.
From-SVN: r256256
PR target/83387
* config/rs6000/rs6000.c (rs6000_discover_homogeneous_aggregate): Do not
allow arguments in FP registers if TARGET_HARD_FLOAT is false.
From-SVN: r256250
PR debug/83666
* cfgexpand.c (expand_dbeug_expr) <case BIT_FIELD_REF>: Punt if mode
is BLKmode and bitpos not zero or mode change is needed.
* gcc.dg/pr83666.c: New test.
From-SVN: r256232
2018-01-04 Martin Liska <mliska@suse.cz>
PR gcov-profile/83669
* gcov.c (output_intermediate_file): Add version to intermediate
gcov file.
* doc/gcov.texi: Document new field 'version' in intermediate
file format. Fix location of '-k' option of gcov command.
From-SVN: r256227
* gcc.dg/vect-opt-info-1.c: Moved to ...
* gcc.dg/vect/nodump-vect-opt-info-1.c: ... here. Only run on
vect_int targets, use dg-additional-options instead of dg-options and
use relative line numbers instead of absolute.
From-SVN: r256225
gcc/ChangeLog:
PR tree-optimization/83603
* calls.c (maybe_warn_nonstring_arg): Avoid accessing function
arguments past the endof the argument list in functions declared
without a prototype.
* gimple-ssa-warn-restrict.c (wrestrict_dom_walker::check_call):
Avoid checking when arguments are null.
gcc/testsuite/ChangeLog:
PR tree-optimization/83603
* gcc.dg/Wrestrict-4.c: New test.
From-SVN: r256217
After the previous patches, it's easier to see that the remaining
inlined transform code in vectorizable_mask_load_store is just a
cut-down version of the VMAT_CONTIGUOUS handling in vectorizable_load
and vectorizable_store. This patch therefore makes those functions
handle masked loads and stores instead.
This makes it easier to handle more forms of masked load and store
without duplicating logic from the unmasked forms. It also helps with
support for fully-masked loops.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* tree-vect-stmts.c (vect_get_store_rhs): New function.
(vectorizable_mask_load_store): Delete.
(vectorizable_call): Return false for masked loads and stores.
(vectorizable_store): Handle IFN_MASK_STORE. Use vect_get_store_rhs
instead of gimple_assign_rhs1.
(vectorizable_load): Handle IFN_MASK_LOAD.
(vect_transform_stmt): Don't set is_store for call_vec_info_type.
From-SVN: r256216
vectorizable_mask_load_store and vectorizable_load used the same
code to build a gather load call, except that the former also
vectorised a mask argument and used it for both the merge and mask
inputs. The latter instead used a merge input of zero and a mask
input of all-ones. This patch splits it out into a subroutine.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* tree-vect-stmts.c (vect_build_gather_load_calls): New function,
split out from..,
(vectorizable_mask_load_store): ...here.
(vectorizable_load): ...and here.
From-SVN: r256215
This patch splits out the code to build an all-bits-one or all-bits-zero
input to a gather load. The catch is that both masks can have
floating-point type, in which case they are implicitly treated in
the same way as an integer bitmask.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* tree-vect-stmts.c (vect_build_all_ones_mask)
(vect_build_zero_merge_argument): New functions, split out from...
(vectorizable_load): ...here.
From-SVN: r256214
This patch splits out the rhs checking code that's common to both
vectorizable_mask_load_store and vectorizable_store.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* tree-vect-stmts.c (vect_check_store_rhs): New function,
split out from...
(vectorizable_mask_load_store): ...here.
(vectorizable_store): ...and here.
From-SVN: r256213
This patch splits the mask argument checking out of
vectorizable_mask_load_store, so that a later patch can use it in both
vectorizable_load and vectorizable_store. It also adds dump messages
for false returns. This is mostly useful for the TYPE_VECTOR_SUBPARTS
check, which can fail if pattern recognition didn't convert the mask
properly.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* tree-vect-stmts.c (vect_check_load_store_mask): New function,
split out from...
(vectorizable_mask_load_store): ...here.
From-SVN: r256212
This patch makes vect_model_store_cost take a vec_load_store_type
instead of a vect_def_type. It's a wash on its own, but it helps
with later patches.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* tree-vectorizer.h (vec_load_store_type): Moved from tree-vec-stmts.c
(vect_model_store_cost): Take a vec_load_store_type instead of a
vect_def_type.
* tree-vect-stmts.c (vec_load_store_type): Move to tree-vectorizer.h.
(vect_model_store_cost): Take a vec_load_store_type instead of a
vect_def_type.
(vectorizable_mask_load_store): Update accordingly.
(vectorizable_store): Likewise.
* tree-vect-slp.c (vect_analyze_slp_cost_1): Update accordingly.
From-SVN: r256211
vectorizable_mask_load_store replaces scalar IFN_MASK_LOAD calls with
dummy assignments, so that they never survive vectorisation. This patch
moves the code to vect_transform_loop instead, so that we only change
the scalar statements once all of them have been vectorised.
This makes it easier to handle other types of functions that need
stubbing out, and also makes it easier to handle groups and patterns.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* tree-vect-loop.c (vect_transform_loop): Stub out scalar
IFN_MASK_LOAD calls here rather than...
* tree-vect-stmts.c (vectorizable_mask_load_store): ...here.
From-SVN: r256210
extract_bit_field_1 tries to use vec_extract to extract part of a
vector. However, if that pattern isn't defined or if the operands
aren't suitable, another good approach is to try a direct subreg
reference. This is particularly useful for multi-vector modes on
SVE (e.g. when extracting one vector from an LD2 result).
The function would go on to try the same thing anyway, but only
if there is an integer mode with the same size as the vector mode,
which isn't true for SVE modes (and doesn't seem a good thing to
require in general). Even when there is an integer mode, doing the
operation on the original modes avoids some unnecessary bitcasting.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* expmed.c (extract_bit_field_1): For vector extracts,
fall back to extract_bit_field_as_subreg if vec_extract
isn't available.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256209
Once SVE is enabled, a general AArch64 spill slot offset will be
A + B * VL
where A is a constant and B is a multiple of the SVE vector length.
The offsets in SVE load and store instructions are a multiple of VL
(and so can encode some values of B), while offsets for standard AArch64
load and store instructions aren't (and encode some values of A).
We therefore get better spill code if variable-sized slots are grouped
together separately from constant-sized slots, and if variable-sized
slots are not reused for constant-sized data. Then, spills to the
constant-sized slots can add B * VL to the offset first, creating a
common anchor point for spills with the same B component but different
A components. Similarly, spills to variable-sized slots can add A to
the offset first, creating a common anchor point for spills with the same
A component but different B components.
This patch implements the sorting and grouping side of the optimisation.
A later patch creates the anchor points.
The patch is a no-op on other targets.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* lra-spills.c (pseudo_reg_slot_compare): Sort slots by whether
they are variable or constant sized.
(assign_stack_slot_num_and_sort_pseudos): Don't reuse variable-sized
slots for constant-sized data.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256208
This patch allows us to recognise:
... = bool1 != bool2 ? x : y
as equivalent to:
bool tmp = bool1 ^ bool2;
... = tmp ? x : y
For the latter we were already able to find the natural number
of vector units for tmp based on the types that feed bool1 and
bool2, whereas with the former we would simply treat bool1 and
bool2 as vectorised 8-bit values, possibly requiring them to
be packed and unpacked from their natural width.
This is used by a later SVE patch.
2018-01-03 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* tree-vect-patterns.c (vect_recog_mask_conversion_pattern): When
handling COND_EXPRs with boolean comparisons, try to find a better
basis for the mask type than the boolean itself.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r256207