Commit Graph

172866 Commits

Author SHA1 Message Date
Szabolcs Nagy 8d7be8d614 musl: use correct long double abi by default
On powerpc and s390x the musl ABI requires 64 bit and 128 bit long
double respectively, so adjust the default.
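
A rough way to see which long double format a toolchain ends up with is to
look at LDBL_MANT_DIG from <float.h>.  The sketch below is only an
illustration: the helper names are made up here, and the mapping covers the
common formats only.

```c
#include <float.h>

/* Mantissa width (in bits) of long double for the current target. */
int long_double_mantissa_bits(void)
{
    return LDBL_MANT_DIG;
}

/* Maps a mantissa width to the usual name of the format (common cases only). */
const char *long_double_format_name(int mant_bits)
{
    switch (mant_bits) {
    case 53:  return "IEEE binary64 (64-bit long double)";
    case 64:  return "x87 extended (80-bit)";
    case 106: return "IBM double-double (128-bit)";
    case 113: return "IEEE binary128 (128-bit long double)";
    default:  return "unknown";
    }
}
```

With the defaults described above, a powerpc musl target would presumably
report 53 and an s390x musl target 113.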

gcc/ChangeLog:

2019-11-18  Szabolcs Nagy  <szabolcs.nagy@arm.com>

	* configure.ac (gcc_cv_target_ldbl128): Set for powerpc*-*-linux-musl*
	and s390*-*-linux-musl* targets.
	* configure: Regenerate.

From-SVN: r278398
2019-11-18 12:03:18 +00:00
Szabolcs Nagy 3d6d8099b6 s390: add musl support
Add the musl dynamic linker names.

gcc/ChangeLog:

2019-11-18  Szabolcs Nagy  <szabolcs.nagy@arm.com>

	* config/s390/linux.h (MUSL_DYNAMIC_LINKER32): Define.
	(MUSL_DYNAMIC_LINKER64): Define.

From-SVN: r278397
2019-11-18 12:00:45 +00:00
Martin Liska 342ae9ad55 Improve -dbg-cnt error message and support :0.
2019-11-18  Martin Liska  <mliska@suse.cz>

	* dbgcnt.c (dbg_cnt_set_limit_by_name): Provide error
	message for an unknown counter.
	(dbg_cnt_process_single_pair): Support 0 as minimum value.
	(dbg_cnt_process_opt): Remove unreachable code.

From-SVN: r278396
2019-11-18 11:51:20 +00:00
Martin Liska 446096148c Verify NOP_EXPR LHS type in IPA ICF.
2019-11-18  Martin Liska  <mliska@suse.cz>

	PR ipa/92529
	* ipa-icf-gimple.c (func_checker::compare_gimple_assign):
	Compare LHS types of NOP_EXPR.
2019-11-18  Martin Liska  <mliska@suse.cz>

	PR ipa/92529
	* gcc.dg/ipa/pr92529.c: New test.

From-SVN: r278395
2019-11-18 11:51:05 +00:00
Matthew Malcomson 20a380171f [mid-end][__RTL] Clean state despite unspecified __RTL startwith passes
Hi there,

When compiling an __RTL function that has an unspecified "startwith"
pass we currently don't run the cleanup pass; this means that we ICE on
the next function (if it's a basic function).

This change ensures that the clean_state pass is run even if the
startwith pass is unspecified.

We also ensure the name of the startwith pass is always freed correctly.

As an example, before this change the following code would ICE when compiling
the function `foo_a`.

When compiled with
./aarch64-none-linux-gnu-gcc -O0 -S unspecified-pass-error.c -o test.s

```
int __RTL () badfoo ()
{
(function "badfoo"
  (insn-chain
    (block 2
      (edge-from entry (flags "FALLTHRU"))
      (cnote 3 [bb 2] NOTE_INSN_BASIC_BLOCK)
      (cinsn 101 (set (reg:DI x19) (reg:DI x0)))
      (cinsn 10 (use (reg/i:SI x19)))
      (edge-to exit (flags "FALLTHRU"))
    ) ;; block 2
  ) ;; insn-chain
) ;; function "badfoo"
}

int
foo_a ()
{
  return 200;
}
```

Now it silently ignores the __RTL function and successfully compiles foo_a.
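
The control flow described above can be modelled with a toy sketch.  Nothing
here is GCC code: struct toy_state and toy_run_rtl_passes are hypothetical
stand-ins, and the only point is that the cleanup step runs, and the pass
name is freed, whether or not a startwith pass was given.

```c
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for per-function compiler state; not a GCC type. */
struct toy_state {
    int passes_run;   /* set when passes were run from a named start pass */
    int cleaned;      /* set when the cleanup step has run */
};

/* Hypothetical model of the fixed behaviour: cleanup is unconditional and
   the (possibly NULL) pass name is always released. */
void toy_run_rtl_passes(struct toy_state *st, char *initial_pass_name)
{
    if (initial_pass_name != NULL && initial_pass_name[0] != '\0')
        st->passes_run = 1;   /* would run passes starting at the named one */

    st->cleaned = 1;          /* stands in for the "*clean_state" pass */
    free(initial_pass_name);  /* free(NULL) is a no-op */
}
```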

regtest done on aarch64
regtest done on x86_64

OK for trunk?

gcc/ChangeLog:

2019-11-18  Matthew Malcomson  <matthew.malcomson@arm.com>

	* run-rtl-passes.c (run_rtl_passes): Accept and handle empty
	"initial_pass_name" argument -- by running "*clean_state" pass.
	Also free the "initial_pass_name" when done.

gcc/c/ChangeLog:

2019-11-18  Matthew Malcomson  <matthew.malcomson@arm.com>

	* c-parser.c (c_parser_parse_rtl_body): Always call
	run_rtl_passes, even if startwith pass is not provided.

gcc/testsuite/ChangeLog:

2019-11-18  Matthew Malcomson  <matthew.malcomson@arm.com>

	* gcc.dg/rtl/aarch64/unspecified-pass-error.c: New test.

From-SVN: r278393
2019-11-18 11:16:46 +00:00
Richard Biener ef50b972e1 re PR target/92462 ([arm32] -ftree-pre makes a variable to be wrongly hoisted out)
2019-11-18  Richard Biener  <rguenther@suse.de>

	PR rtl-optimization/92462
	* alias.c (find_base_term): Restrict the look through ANDs.
	(find_base_value): Likewise.

From-SVN: r278391
2019-11-18 09:44:52 +00:00
Christophe Lyon 762ff5b304 [testsuite][ARM] check_effective_target_arm_vfp_ok_nocache: Fix typo in option name
2019-11-18  Christophe Lyon  <christophe.lyon@linaro.org>

	* lib/target-supports.exp
	(check_effective_target_arm_vfp_ok_nocache): Fix typo in option
	name.

From-SVN: r278390
2019-11-18 10:20:18 +01:00
Georg-Johann Lay 1ce51d9a8d re PR target/92545 (avr: support ATmega devices from the 0-series)
PR target/92545
	* config/avr/gen-avr-mmcu-specs.c (print_mcu)
	[link_pm_base_address]: Symbol name is __RODATA_PM_OFFSET__.

From-SVN: r278389
2019-11-18 08:19:08 +00:00
Georg-Johann Lay 9c5de632fd re PR target/92545 (avr: support ATmega devices from the 0-series)
PR target/92545
	* doc/avr-mmcu.texi: Regenerate.

From-SVN: r278388
2019-11-18 07:54:30 +00:00
Georg-Johann Lay 80b38f83f1 Add support for AVR devices from the 0-series.
PR target/92545
	* config/avr/avr-arch.h (avr_mcu_t) <flash_pm_offset>: New field.
	* config/avr/avr-devices.c (avr_mcu_types): Adjust initializers.
	* config/avr/avr-mcus.def (AVR_MCU): Add respective field.
	* config/avr/specs.h (LINK_SPEC) <%(link_pm_base_address)>: Add.
	* config/avr/gen-avr-mmcu-specs.c (print_mcu)
	<*cpp, *cpp_mcu, *cpp_avrlibc, *link_pm_base_address>: Emit code
	for spec definitions.
	* doc/avr-mmcu.texi: Regenerate.

From-SVN: r278387
2019-11-18 07:52:55 +00:00
Hongtao Liu 586bbef191 Split X86_TUNE_AVX128_OPTIMAL into X86_TUNE_AVX256_SPLIT_REGS
and X86_TUNE_AVX128_OPTIMAL.

Changelog
gcc/
	PR target/92448
	* config/i386/i386-expand.c (ix86_expand_set_or_cpymem):
	Replace TARGET_AVX128_OPTIMAL with TARGET_AVX256_SPLIT_REGS.
	* config/i386/i386-option.c (ix86_vec_cost): Ditto.
	(ix86_reassociation_width): Ditto.
	* config/i386/i386-options.c (ix86_option_override_internal):
	Replace TARGET_AVX128_OPTIMAL with
	ix86_tune_features[X86_TUNE_AVX128_OPTIMAL].
	* config/i386/i386.h (TARGET_AVX256_SPLIT_REGS): New macro.
	(TARGET_AVX128_OPTIMAL): Deleted.
	* config/i386/x86-tune.def (X86_TUNE_AVX256_SPLIT_REGS): New
	DEF_TUNE.

From-SVN: r278385
2019-11-18 02:22:55 +00:00
Maciej W. Rozycki a128988785 libgomp: Regenerate `testsuite/Makefile.in' for GCC_HEADER_STDINT removal
Commit r276389 ("configure.ac: Remove GCC_HEADER_STDINT(gstdint.h)") has
not regenerated `testsuite/Makefile.in'.  Fix it.

	libgomp/
	* testsuite/Makefile.in: Regenerate.

From-SVN: r278384
2019-11-18 00:33:37 +00:00
Maciej W. Rozycki 38397aa621 libgfortran: Regenerate `Makefile.in' for `runstatedir' removal
A change made with r271340 ("libfortran/90038: Use posix_spawn instead
of fork") accidentally brought the obsolete `runstatedir' setting back
in.  Fix it.

	libgfortran/
	* Makefile.in: Regenerate.

From-SVN: r278383
2019-11-18 00:21:45 +00:00
GCC Administrator 8b5c3af777 Daily bump.
From-SVN: r278382
2019-11-18 00:16:13 +00:00
John David Anglin 632b5e3da7 linux-atomic.c (__kernel_cmpxchg): Change argument 1 to volatile void *.
* config/pa/linux-atomic.c (__kernel_cmpxchg): Change argument 1 to
	volatile void *.  Remove trap check.
	(__kernel_cmpxchg2): Likewise.
	(FETCH_AND_OP_2): Adjust operand types.
	(OP_AND_FETCH_2): Likewise.
	(FETCH_AND_OP_WORD): Likewise.
	(OP_AND_FETCH_WORD): Likewise.
	(COMPARE_AND_SWAP_2): Likewise.
	(__sync_val_compare_and_swap_4): Likewise.
	(__sync_bool_compare_and_swap_4): Likewise.
	(SYNC_LOCK_TEST_AND_SET_2): Likewise.
	(__sync_lock_test_and_set_4): Likewise.
	(SYNC_LOCK_RELEASE_1): Likewise.  Use __kernel_cmpxchg2 for release.
	(__sync_lock_release_4): Adjust operand types.  Use __kernel_cmpxchg
	for release.
	(__sync_lock_release_8): Remove.

From-SVN: r278377
2019-11-17 23:11:52 +00:00
Jeff Law b906729f81 * gcc.dg/complex-6.c: Do not run dump scan tests for rx target.
From-SVN: r278376
2019-11-17 09:31:32 -07:00
Jakub Jelinek cfe871e3ec method.c (lookup_comparison_result): Use %qD instead of %<%T::%D%> to print the decl.
* method.c (lookup_comparison_result): Use %qD instead of %<%T::%D%>
	to print the decl.
	(lookup_comparison_category): Use %qD instead of %<std::%D%> to print
	the decl.

	* g++.dg/cpp2a/spaceship-err3.C: New test.

From-SVN: r278375
2019-11-17 07:12:01 +01:00
Edward Smith-Rowland f6e86b3303 Repair the <tuple> part of C++20 p1032 Misc constexpr bits.
2019-11-16  Edward Smith-Rowland  <3dw4rd@verizon.net>

	Repair the <tuple> part of C++20 p1032 Misc constexpr bits.
	* include/bits/uses_allocator.h (__uses_alloc0::_Sink::operator=)
	(__use_alloc(const _Alloc&)): Constexpr.

From-SVN: r278373
2019-11-17 03:31:15 +00:00
Jonathan Wakely 8857080c81 libstdc++: add range constructor for std::string_view (P1391R4)
* include/std/string_view (basic_string_view(It, End)): Add range
	constructor and deduction guide from P1391R4.
	* testsuite/21_strings/basic_string_view/cons/char/range.cc: New test.

From-SVN: r278371
2019-11-17 01:32:55 +00:00
Jonathan Wakely 37f33df706 libstdc++: Define C++20 range utilities and range factories
This adds another chunk of the <ranges> header.

The changes from P1456R1 (Move-only views) and P1862R1 (Range adaptors
for non-copyable iterators) are included, but not the changes from
P1870R1 (forwarding-range<T> is too subtle).

The tests for subrange and iota_view are poor and should be improved.

	* include/bits/regex.h (match_results): Specialize __enable_view_impl.
	* include/bits/stl_set.h (set): Likewise.
	* include/bits/unordered_set.h (unordered_set, unordered_multiset):
	Likewise.
	* include/debug/multiset.h (__debug::multiset): Likewise.
	* include/debug/set.h (__debug::set): Likewise.
	* include/debug/unordered_set (__debug::unordered_set)
	(__debug::unordered_multiset): Likewise.
	* include/std/ranges (ranges::view, ranges::enable_view)
	(ranges::view_interface, ranges::subrange, ranges::empty_view)
	(ranges::single_view, ranges::views::single, ranges::iota_view)
	(ranges::views::iota): Define for C++20.
	* testsuite/std/ranges/empty_view.cc: New test.
	* testsuite/std/ranges/iota_view.cc: New test.
	* testsuite/std/ranges/single_view.cc: New test.
	* testsuite/std/ranges/view.cc: New test.

From-SVN: r278370
2019-11-17 01:07:54 +00:00
GCC Administrator efbd2539e1 Daily bump.
From-SVN: r278369
2019-11-17 00:16:14 +00:00
Segher Boessenkool a20a1a75be rs6000: Allow mode GPR in cceq_{ior,rev}_compare
Also make it a parameterized name: @cceq_{ior,rev}_compare_<mode>.


	* config/rs6000/rs6000.md (cceq_ior_compare): Rename to...
	(@cceq_ior_compare_<mode> for GPR): ... this.  Allow GPR instead of
	just SI.
	(cceq_rev_compare): Rename to...
	(@cceq_rev_compare_<mode> for GPR): ... this.  Allow GPR instead of
	just SI.
	(define_split for <bd>tf_<mode>): Add SImode first argument to
	gen_cceq_ior_compare.

From-SVN: r278366
2019-11-17 00:31:19 +01:00
Jonathan Wakely bac6632921 Revert r278363 "Start work on <ranges> header"
This was not meant to be on the branch I committed r278364 from, as it
is not ready to commit yet.

	* include/std/ranges: Revert accidentally committed changes.

From-SVN: r278365
2019-11-16 22:00:23 +00:00
Jonathan Wakely 7453376403 libstdc++: Optimize std::jthread construction
This change avoids storing a copy of a stop_token object that isn't
needed and won't be passed to the callable object. This slightly reduces
memory usage when the callable doesn't use a stop_token. It also removes
indirection in the invocation of the callable in the new thread, as
there is no lambda and no additional calls to std::invoke.

It also adds some missing [[nodiscard]] attributes, and the non-member
swap overload for std::jthread.

	* include/std/thread (jthread::jthread()): Use nostopstate constant.
	(jthread::jthread(Callable&&, Args&&...)): Use helper function to
	create std::thread instead of indirection through a lambda. Use
	remove_cvref_t instead of decay_t.
	(jthread::joinable(), jthread::get_id(), jthread::native_handle())
	(jthread::hardware_concurrency()): Add nodiscard attribute.
	(swap(jthread&, jthread&)): Define hidden friend.
	(jthread::_S_create): New helper function for constructor.

From-SVN: r278364
2019-11-16 21:47:28 +00:00
Jonathan Wakely 970a9bfaad Start work on <ranges> header
From-SVN: r278363
2019-11-16 21:47:22 +00:00
Segher Boessenkool 0e2d00114b Delete common/config/powerpcspe
I missed this part in r266961.  Various people have been editing it
since; I finally noticed.


	* common/config/powerpcspe: Delete.

From-SVN: r278361
2019-11-16 20:32:12 +01:00
Jeff Law 513e0aa0c4 [PATCH] Fix slowness in demangler
* cp-demangle.c (d_print_init): Remove const from 4th param.
	(cplus_demangle_fill_name): Initialize d->d_counting.
	(cplus_demangle_fill_extended_operator): Likewise.
	(cplus_demangle_fill_ctor): Likewise.
	(cplus_demangle_fill_dtor): Likewise.
	(d_make_empty): Likewise.
	(d_count_templates_scopes): Remove const from 3rd param.
	Return on dc->d_counting > 1.
	Increment dc->d_counting.
	* cp-demint.c (cplus_demangle_fill_component): Initialize d->d_counting.
	(cplus_demangle_fill_builtin_type): Likewise.
	(cplus_demangle_fill_operator): Likewise.

	* demangle.h (struct demangle_component): Add member
	d_counting.
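
The d_counting guard listed above can be sketched on a toy tree.  The types
and function below are hypothetical, not the demangler's own; they only show
how a per-node visit counter bounds the work when subtrees are shared and the
number of paths would otherwise grow exponentially.

```c
#include <stddef.h>

/* Toy component; only the d_counting field mirrors the ChangeLog entry. */
struct toy_component {
    int is_template;
    int d_counting;                 /* number of times this node was counted */
    struct toy_component *left;
    struct toy_component *right;
};

/* Counts template nodes, but stops descending into a node once it has
   already been counted more than once. */
void toy_count_templates(unsigned long *num, struct toy_component *dc)
{
    if (dc == NULL)
        return;
    if (dc->d_counting > 1)
        return;
    dc->d_counting++;

    if (dc->is_template)
        ++*num;
    toy_count_templates(num, dc->left);
    toy_count_templates(num, dc->right);
}
```

In a chain where both children of each node point at the same next node,
an unguarded walk would visit the last node once per path; with the guard
each node is counted at most twice.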

From-SVN: r278359
2019-11-16 10:14:14 -07:00
Eduard-Mihai Burtescu 32fc3719e0 [PATCH] Refactor rust-demangle to be independent of C++ demangling.
* demangle.h (rust_demangle_callback): Add.

	* cplus-dem.c (cplus_demangle): Use rust_demangle directly.
	(rust_demangle): Remove.
	* rust-demangle.c (is_prefixed_hash): Rename to is_legacy_prefixed_hash.
	(parse_lower_hex_nibble): Rename to decode_lower_hex_nibble.
	(parse_legacy_escape): Rename to decode_legacy_escape.
	(rust_is_mangled): Remove.
	(struct rust_demangler): Add.
	(peek): Add.
	(next): Add.
	(struct rust_mangled_ident): Add.
	(parse_ident): Add.
	(rust_demangle_sym): Remove.
	(print_str): Add.
	(PRINT): Add.
	(print_ident): Add.
	(rust_demangle_callback): Add.
	(struct str_buf): Add.
	(str_buf_reserve): Add.
	(str_buf_append): Add.
	(str_buf_demangle_callback): Add.
	(rust_demangle): Add.
	* rust-demangle.h: Remove.

From-SVN: r278358
2019-11-16 08:32:50 -07:00
Miguel Saldivar f73cb38f65 * testsuite/demangle-expected: Fix test.
From-SVN: r278357
2019-11-16 07:45:30 -07:00
Richard Sandiford 4ec943d630 [AArch64] Robustify aarch64_wrffr
This patch uses distinct values for the FFR and FFRT outputs of
aarch64_wrffr, so that a following aarch64_copy_ffr_to_ffrt has
an effect.  This is needed to avoid regressions with later patches.

The block comment at the head of the file already described
the pattern this way, and there was already an unspec for it.
Not sure what made me change it...

2019-11-16  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* config/aarch64/aarch64-sve.md (aarch64_wrffr): Wrap the FFRT
	output in UNSPEC_WRFFR.

From-SVN: r278356
2019-11-16 13:31:28 +00:00
Richard Sandiford f9d6338bd1 Use a single comparison for index-based alias checks
This patch rewrites the index-based alias checks to use conditions
of the form:

  (unsigned T) (a - b + bias) <= limit

E.g. before the patch:

  struct s { int x[100]; };

  void
  f1 (struct s *s1, int a, int b)
  {
    for (int i = 0; i < 32; ++i)
      s1->x[i + a] += s1->x[i + b];
  }

used:

        add     w3, w1, 3
        cmp     w3, w2
        add     w3, w2, 3
        ccmp    w1, w3, 0, ge
        ble     .L2

whereas after the patch it uses:

        sub     w3, w1, w2
        add     w3, w3, 3
        cmp     w3, 6
        bls     .L2
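
The equivalence between the two forms can be sketched in plain C.  The
function names are made up for this illustration; bias corresponds to the 3
in the assembly above and the limit to 6, i.e. twice the bias.

```c
#include <stdbool.h>

/* Two-comparison form: the accesses may overlap unless one index range lies
   entirely past the other.  Assumes a + bias and b + bias do not overflow. */
bool overlap_two_compares(int a, int b, int bias)
{
    return !(a + bias < b || b + bias < a);
}

/* Single-comparison form: (unsigned T) (a - b + bias) <= limit, with
   limit = 2 * bias.  Computed in unsigned arithmetic, so a negative
   difference below -bias wraps to a large value and fails the test. */
bool overlap_one_compare(int a, int b, int bias)
{
    unsigned int diff = (unsigned int) a - (unsigned int) b
                        + (unsigned int) bias;
    return diff <= 2u * (unsigned int) bias;
}
```

Both functions answer "is |a - b| <= bias", but the second needs only one
subtraction, one addition and one unsigned comparison.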

The patch also fixes the seg_len1 and seg_len2 negation for cases in
which seg_len is a "negative unsigned" value narrower than 64 bits,
like it is for 32-bit targets.  Previously we'd end up with values
like 0xffffffff00000001 instead of 1.
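
The negation problem can be reproduced in a few lines of C.  This is only a
sketch with hypothetical helper names; it contrasts negating a 32-bit
"negative unsigned" value after zero-extension with negating it after
sign-extension.

```c
#include <stdint.h>

/* Negating after zero-extension: 0xffffffff (i.e. -1 in 32 bits) becomes
   0xffffffff00000001 in 64 bits. */
uint64_t negate_zero_extended(uint32_t seg_len)
{
    return -(uint64_t) seg_len;
}

/* Negating after sign-extension gives the intended magnitude.  The
   uint32_t -> int32_t conversion is implementation-defined for values above
   INT32_MAX; this sketch assumes the usual two's-complement wraparound. */
uint64_t negate_sign_extended(uint32_t seg_len)
{
    return (uint64_t) -(int64_t) (int32_t) seg_len;
}
```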

2019-11-16  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* tree-data-ref.c (create_intersect_range_checks_index): Rewrite
	the index tests to have the form (unsigned T) (B - A + bias) <= limit.

gcc/testsuite/
	* gcc.dg/vect/vect-alias-check-18.c: New test.
	* gcc.dg/vect/vect-alias-check-19.c: Likewise.
	* gcc.dg/vect/vect-alias-check-20.c: Likewise.

From-SVN: r278354
2019-11-16 11:43:31 +00:00
Richard Sandiford b4d1b63573 Print the type of alias check in a dump message
This patch prints a message to say how an alias check is being
implemented.

2019-11-16  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* tree-data-ref.c (create_intersect_range_checks_index)
	(create_intersect_range_checks): Print dump messages.

gcc/testsuite/
	* gcc.dg/vect/vect-alias-check-1.c: Test for the type of alias check.
	* gcc.dg/vect/vect-alias-check-8.c: Likewise.
	* gcc.dg/vect/vect-alias-check-9.c: Likewise.
	* gcc.dg/vect/vect-alias-check-10.c: Likewise.
	* gcc.dg/vect/vect-alias-check-11.c: Likewise.
	* gcc.dg/vect/vect-alias-check-12.c: Likewise.
	* gcc.dg/vect/vect-alias-check-13.c: Likewise.
	* gcc.dg/vect/vect-alias-check-14.c: Likewise.
	* gcc.dg/vect/vect-alias-check-15.c: Likewise.
	* gcc.dg/vect/vect-alias-check-16.c: Likewise.
	* gcc.dg/vect/vect-alias-check-17.c: Likewise.

From-SVN: r278353
2019-11-16 11:42:53 +00:00
Richard Sandiford cad984b289 Dump the list of merged alias pairs
This patch dumps the final (merged) list of alias pairs.  It also adds:

- WAW and RAW versions of vect-alias-check-8.c
- a "well-ordered" version of vect-alias-check-9.c (i.e. all reads
  before any writes)
- a test with mixed steps in the same alias pair

I also tweaked the test value in vect-alias-check-9.c so that the
result was less likely to be accidentally correct if the alias
isn't honoured.

2019-11-16  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* tree-data-ref.c (dump_alias_pair): New function.
	(prune_runtime_alias_test_list): Use it to dump each merged alias pair.

gcc/testsuite/
	* gcc.dg/vect/vect-alias-check-8.c: Test for the RAW flag.
	* gcc.dg/vect/vect-alias-check-9.c: Test for the ARBITRARY flag.
	(TEST_VALUE): Use a higher value for early iterations.
	* gcc.dg/vect/vect-alias-check-14.c: New test.
	* gcc.dg/vect/vect-alias-check-15.c: Likewise.
	* gcc.dg/vect/vect-alias-check-16.c: Likewise.
	* gcc.dg/vect/vect-alias-check-17.c: Likewise.

From-SVN: r278352
2019-11-16 11:42:02 +00:00
Richard Sandiford 52c2990525 Record whether a dr_with_seg_len contains mixed steps
prune_runtime_alias_test_list can merge dr_with_seg_len_pair_ts that
have different steps for the first reference or different steps for the
second reference.  This patch adds a flag to record that.

I don't know whether the change to create_intersect_range_checks_index
fixes anything in practice.  It would have to be a corner case if so,
since at present we only merge two alias pairs if either the first or
the second references are identical and only the other references differ.
And the vectoriser uses VF-based segment lengths only if both references
in a pair have the same step.  Either way, it still seems wrong to use
DR_STEP when it doesn't represent all checks that have been merged into
the pair.

2019-11-16  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* tree-data-ref.h (DR_ALIAS_MIXED_STEPS): New flag.
	* tree-data-ref.c (prune_runtime_alias_test_list): Set it when
	merging data references with different steps.
	(create_intersect_range_checks_index): Take a
	dr_with_seg_len_pair_t instead of two dr_with_seg_lens.
	Bail out if DR_ALIAS_MIXED_STEPS is set.
	(create_intersect_range_checks): Take a dr_with_seg_len_pair_t
	instead of two dr_with_seg_lens.  Update call to
	create_intersect_range_checks_index.
	(create_runtime_alias_checks): Update call accordingly.

From-SVN: r278351
2019-11-16 11:41:16 +00:00
Richard Sandiford e9acf80c96 Add flags to dr_with_seg_len_pair_t
This patch adds a bunch of flags to dr_with_seg_len_pair_t,
for use by later patches.  The update to tree-loop-distribution.c
is conservatively correct, but might be tweakable later.

2019-11-16  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* tree-data-ref.h (DR_ALIAS_RAW, DR_ALIAS_WAR, DR_ALIAS_WAW)
	(DR_ALIAS_ARBITRARY, DR_ALIAS_SWAPPED, DR_ALIAS_UNSWAPPED): New flags.
	(dr_with_seg_len_pair_t::sequencing): New enum.
	(dr_with_seg_len_pair_t::flags): New member variable.
	(dr_with_seg_len_pair_t::dr_with_seg_len_pair_t): Take a sequencing
	parameter and initialize the flags member variable.
	* tree-loop-distribution.c (compute_alias_check_pairs): Update
	call accordingly.
	* tree-vect-data-refs.c (vect_prune_runtime_alias_test_list): Likewise.
	Ensure the two data references in an alias pair are in statement
	order, if there is a defined order.
	* tree-data-ref.c (prune_runtime_alias_test_list): Use
	DR_ALIAS_SWAPPED and DR_ALIAS_UNSWAPPED to record whether we've
	swapped the references in a dr_with_seg_len_pair_t.  OR together
	the flags when merging two dr_with_seg_len_pair_ts.  After merging,
	try to restore the original dr_with_seg_len order, updating the
	flags if that fails.
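
A minimal sketch of the "OR together the flags when merging" step follows.
The TOY_ names echo the flags listed above, but the bit values, the struct
and the function are placeholders invented for this example and are not
taken from tree-data-ref.h.

```c
/* Placeholder bit values; only the names mirror the ChangeLog. */
enum {
    TOY_DR_ALIAS_RAW       = 1 << 0,
    TOY_DR_ALIAS_WAR       = 1 << 1,
    TOY_DR_ALIAS_WAW       = 1 << 2,
    TOY_DR_ALIAS_ARBITRARY = 1 << 3,
    TOY_DR_ALIAS_SWAPPED   = 1 << 4,
    TOY_DR_ALIAS_UNSWAPPED = 1 << 5
};

/* Toy stand-in for dr_with_seg_len_pair_t, reduced to its flags. */
struct toy_alias_pair {
    unsigned int flags;
};

/* Merging two pairs keeps every property that either pair had. */
void toy_merge_flags(struct toy_alias_pair *dst,
                     const struct toy_alias_pair *src)
{
    dst->flags |= src->flags;
}
```

After such a merge a pair can carry both the SWAPPED and the UNSWAPPED bit,
which is one way to record that no single original order applies any more.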

From-SVN: r278350
2019-11-16 11:40:22 +00:00
Richard Sandiford 97602450b0 Delay swapping data refs in prune_runtime_alias_test_list
prune_runtime_alias_test_list swapped dr_as between two dr_with_seg_len
pairs before finally deciding whether to merge them.  Bailing out later
would therefore leave the pairs in an incorrect state.

IMO a better fix would be to split this out into a subroutine that
produces a temporary dr_with_seg_len on success, rather than changing
an existing one in-place.  It would then be easy to merge both the dr_as
and dr_bs if we wanted to, rather than requiring one of them to be equal.
But here I tried to do something that could be backported if necessary.

2019-11-16  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* tree-data-ref.c (prune_runtime_alias_test_list): Delay
	swapping the dr_as based on init values until we've decided
	whether to merge them.

From-SVN: r278349
2019-11-16 11:35:56 +00:00
Richard Sandiford 1fb2b0f69e Move canonicalisation of dr_with_seg_len_pair_ts
The two users of tree-data-ref's runtime alias checks both canonicalise
the order of the dr_with_seg_lens in a pair before passing them to
prune_runtime_alias_test_list.  It's more convenient for later patches
if prune_runtime_alias_test_list does that itself.

2019-11-16  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* tree-data-ref.c (prune_runtime_alias_test_list): Sort the
	two accesses in each dr_with_seg_len_pair_t before trying to
	combine separate dr_with_seg_len_pair_ts.
	* tree-loop-distribution.c (compute_alias_check_pairs): Don't do
	that here.
	* tree-vect-data-refs.c (vect_prune_runtime_alias_test_list): Likewise.

From-SVN: r278348
2019-11-16 11:35:08 +00:00
Richard Sandiford 37a3662f76 [AArch64] Add scatter stores for partial SVE modes
This patch adds support for scatter stores of partial vectors,
where the vector base or offset elements can be wider than the
elements being stored.
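
For illustration, a scalar loop of roughly the following shape stores
elements that are narrower than the offsets used to address them.  This is a
hand-written sketch, not one of the tests listed below, and whether a given
compiler vectorizes it with a scatter store is not claimed here.

```c
#include <stddef.h>
#include <stdint.h>

/* Scatter store of 8-bit elements through 32-bit indices: the offsets are
   wider than the data being stored. */
void scatter_store_u8(uint8_t *restrict dst, const uint8_t *restrict src,
                      const uint32_t *restrict index, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[index[i]] = src[i];
}
```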

2019-11-16  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* config/aarch64/aarch64-sve.md
	(scatter_store<SVE_FULL_SD:mode><v_int_equiv>): Extend to...
	(scatter_store<SVE_24:mode><v_int_container>): ...this.
	(mask_scatter_store<SVE_FULL_S:mode><v_int_equiv>): Extend to...
	(mask_scatter_store<SVE_4:mode><v_int_equiv>): ...this.
	(mask_scatter_store<SVE_FULL_D:mode><v_int_equiv>): Extend to...
	(mask_scatter_store<SVE_2:mode><v_int_equiv>): ...this.
	(*mask_scatter_store<mode><v_int_container>_<su>xtw_unpacked): New
	pattern.
	(*mask_scatter_store<SVE_FULL_D:mode><v_int_equiv>_sxtw): Extend to...
	(*mask_scatter_store<SVE_2:mode><v_int_equiv>_sxtw): ...this.
	(*mask_scatter_store<SVE_FULL_D:mode><v_int_equiv>_uxtw): Extend to...
	(*mask_scatter_store<SVE_2:mode><v_int_equiv>_uxtw): ...this.

gcc/testsuite/
	* gcc.target/aarch64/sve/scatter_store_1.c (TEST_LOOP): Start at 0.
	(TEST_ALL): Add tests for 8-bit and 16-bit elements.
	* gcc.target/aarch64/sve/scatter_store_2.c: Update accordingly.
	* gcc.target/aarch64/sve/scatter_store_3.c (TEST_LOOP): Start at 0.
	(TEST_ALL): Add tests for 8-bit and 16-bit elements.
	* gcc.target/aarch64/sve/scatter_store_4.c: Update accordingly.
	* gcc.target/aarch64/sve/scatter_store_5.c (TEST_LOOP): Start at 0.
	(TEST_ALL): Add tests for 8-bit, 16-bit and 32-bit elements.
	* gcc.target/aarch64/sve/scatter_store_8.c: New test.
	* gcc.target/aarch64/sve/scatter_store_9.c: Likewise.

From-SVN: r278347
2019-11-16 11:30:46 +00:00
Richard Sandiford 87a80d2721 [AArch64] Pattern-match SVE extending gather loads
This patch pattern-matches a partial gather load followed by a sign or
zero extension into an extending gather load.  (The partial gather load
is already an extending load; we just don't rely on the upper bits of
the elements.)
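
In scalar terms, the combination being matched is an indexed load of a
narrow element followed by a widening conversion.  The loop below is a
hand-written sketch with a made-up name, not one of the new tests.

```c
#include <stddef.h>
#include <stdint.h>

/* Gather load of signed 8-bit elements, each sign-extended to 32 bits. */
void gather_load_sext(int32_t *restrict dst, const int8_t *restrict src,
                      const uint32_t *restrict index, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = src[index[i]];   /* implicit int8_t -> int32_t extension */
}
```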

2019-11-16  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* config/aarch64/iterators.md (SVE_2BHSI, SVE_2HSDI, SVE_4BHI)
	(SVE_4HSI): New mode iterators.
	(ANY_EXTEND2): New code iterator.
	* config/aarch64/aarch64-sve.md
	(@aarch64_gather_load_<ANY_EXTEND:optab><VNx4_WIDE:mode><VNx4_NARROW:mode>):
	Extend to...
	(@aarch64_gather_load_<ANY_EXTEND:optab><SVE_4HSI:mode><SVE_4BHI:mode>):
	...this, handling extension to partial modes as well as full modes.
	Describe the extension as a predicated rather than unpredicated
	extension.
	(@aarch64_gather_load_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>):
	Likewise extend to...
	(@aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode><SVE_2BHSI:mode>):
	...this, making the same adjustments.
	(*aarch64_gather_load_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>_sxtw):
	Likewise extend to...
	(*aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode><SVE_2BHSI:mode>_sxtw)
	...this, making the same adjustments.
	(*aarch64_gather_load_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>_uxtw):
	Likewise extend to...
	(*aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode><SVE_2BHSI:mode>_uxtw)
	...this, making the same adjustments.
	(*aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode><SVE_2BHSI:mode>_<ANY_EXTEND2:su>xtw_unpacked):
	New pattern.
	(*aarch64_ldff1_gather<mode>_sxtw): Canonicalize to a constant
	extension predicate.
	(@aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx4_WIDE:mode><VNx4_NARROW:mode>)
	(@aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>)
	(*aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>_uxtw):
	Describe the extension as a predicated rather than unpredicated
	extension.
	(*aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>_sxtw):
	Likewise.  Canonicalize to a constant extension predicate.
	* config/aarch64/aarch64-sve-builtins-base.cc
	(svld1_gather_extend_impl::expand): Add an extra predicate for
	the extension.
	(svldff1_gather_extend_impl::expand): Likewise.

gcc/testsuite/
	* gcc.target/aarch64/sve/gather_load_extend_1.c: New test.
	* gcc.target/aarch64/sve/gather_load_extend_2.c: Likewise.
	* gcc.target/aarch64/sve/gather_load_extend_3.c: Likewise.
	* gcc.target/aarch64/sve/gather_load_extend_4.c: Likewise.
	* gcc.target/aarch64/sve/gather_load_extend_5.c: Likewise.
	* gcc.target/aarch64/sve/gather_load_extend_6.c: Likewise.
	* gcc.target/aarch64/sve/gather_load_extend_7.c: Likewise.
	* gcc.target/aarch64/sve/gather_load_extend_8.c: Likewise.
	* gcc.target/aarch64/sve/gather_load_extend_9.c: Likewise.
	* gcc.target/aarch64/sve/gather_load_extend_10.c: Likewise.
	* gcc.target/aarch64/sve/gather_load_extend_11.c: Likewise.
	* gcc.target/aarch64/sve/gather_load_extend_12.c: Likewise.

From-SVN: r278346
2019-11-16 11:26:11 +00:00
Richard Sandiford f8186eeaf3 [AArch64] Add gather loads for partial SVE modes
This patch adds support for gather loads of partial vectors,
where the vector base or offset elements can be wider than the
elements being loaded.

2019-11-16  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* config/aarch64/iterators.md (SVE_24, SVE_2, SVE_4): New mode
	iterators.
	* config/aarch64/aarch64-sve.md
	(gather_load<SVE_FULL_SD:mode><v_int_equiv>): Extend to...
	(gather_load<SVE_24:mode><v_int_container>): ...this.
	(mask_gather_load<SVE_FULL_S:mode><v_int_equiv>): Extend to...
	(mask_gather_load<SVE_4:mode><v_int_container>): ...this.
	(mask_gather_load<SVE_FULL_D:mode><v_int_equiv>): Extend to...
	(mask_gather_load<SVE_2:mode><v_int_container>): ...this.
	(*mask_gather_load<SVE_2:mode><v_int_container>_<su>xtw_unpacked):
	New pattern.
	(*mask_gather_load<SVE_FULL_D:mode><v_int_equiv>_sxtw): Extend to...
	(*mask_gather_load<SVE_2:mode><v_int_equiv>_sxtw): ...this.
	Allow the nominal extension predicate to be different from the
	load predicate.
	(*mask_gather_load<SVE_FULL_D:mode><v_int_equiv>_uxtw): Extend to...
	(*mask_gather_load<SVE_2:mode><v_int_equiv>_uxtw): ...this.

gcc/testsuite/
	* gcc.target/aarch64/sve/gather_load_1.c (TEST_LOOP): Start at 0.
	(TEST_ALL): Add tests for 8-bit and 16-bit elements.
	* gcc.target/aarch64/sve/gather_load_2.c: Update accordingly.
	* gcc.target/aarch64/sve/gather_load_3.c (TEST_LOOP): Start at 0.
	(TEST_ALL): Add tests for 8-bit and 16-bit elements.
	* gcc.target/aarch64/sve/gather_load_4.c: Update accordingly.
	* gcc.target/aarch64/sve/gather_load_5.c (TEST_LOOP): Start at 0.
	(TEST_ALL): Add tests for 8-bit, 16-bit and 32-bit elements.
	* gcc.target/aarch64/sve/gather_load_6.c: Add
	--param aarch64-sve-compare-costs=0.
	(TEST_LOOP): Start at 0.
	* gcc.target/aarch64/sve/gather_load_7.c: Add
	--param aarch64-sve-compare-costs=0.
	* gcc.target/aarch64/sve/gather_load_8.c: New test.
	* gcc.target/aarch64/sve/gather_load_9.c: Likewise.
	* gcc.target/aarch64/sve/mask_gather_load_6.c: Add
	--param aarch64-sve-compare-costs=0.

From-SVN: r278345
2019-11-16 11:20:30 +00:00
Richard Sandiford 2d56600c8d [AArch64] Add truncation for partial SVE modes
This patch adds support for "truncating" to a partial SVE vector from
either a full SVE vector or a wider partial vector.  This truncation is
actually a no-op and so should have zero cost in the vector cost model.
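
Why the truncation can be a no-op is easiest to see in a scalar model.  The
sketch assumes, for illustration, that a partial vector keeps each narrow
element in the low bits of a wider container and that the upper bits are
simply not relied upon; the helper names are hypothetical.

```c
#include <stdint.h>

/* "Truncating" a 32-bit lane to an 8-bit element held in a 32-bit
   container: in this model the container is left untouched. */
uint32_t truncate_in_container(uint32_t full_lane)
{
    return full_lane;
}

/* Reading the 8-bit element looks only at the low 8 bits of the container. */
uint8_t element_u8(uint32_t container)
{
    return (uint8_t) container;
}
```

Reading the element after the no-op truncation gives the same result as an
explicit narrowing conversion of the original lane.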

2019-11-16  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* config/aarch64/aarch64-sve.md
	(trunc<SVE_HSDI:mode><SVE_PARTIAL_I:mode>2): New pattern.
	* config/aarch64/aarch64.c (aarch64_integer_truncation_p): New
	function.
	(aarch64_sve_adjust_stmt_cost): Call it.

gcc/testsuite/
	* gcc.target/aarch64/sve/mask_struct_load_1.c: Add
	--param aarch64-sve-compare-costs=0.
	* gcc.target/aarch64/sve/mask_struct_load_2.c: Likewise.
	* gcc.target/aarch64/sve/mask_struct_load_3.c: Likewise.
	* gcc.target/aarch64/sve/mask_struct_load_4.c: Likewise.
	* gcc.target/aarch64/sve/mask_struct_load_5.c: Likewise.
	* gcc.target/aarch64/sve/pack_1.c: Likewise.
	* gcc.target/aarch64/sve/truncate_1.c: New test.

From-SVN: r278344
2019-11-16 11:14:51 +00:00
Richard Sandiford 217ccab8f4 [AArch64] Pattern-match SVE extending loads
This patch pattern-matches a partial SVE load followed by a sign or zero
extension into an extending load.  (The partial load is already an
extending load; we just don't rely on the upper bits of the elements.)

Nothing yet uses the extra LDFF1 and LDNF1 combinations, but it seemed
more consistent to provide them, since I needed to update the pattern
to use a predicated extension anyway.

2019-11-16  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* config/aarch64/aarch64-sve.md
	(@aarch64_load_<ANY_EXTEND:optab><VNx8_WIDE:mode><VNx8_NARROW:mode>):
	(@aarch64_load_<ANY_EXTEND:optab><VNx4_WIDE:mode><VNx4_NARROW:mode>)
	(@aarch64_load_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>):
	Combine into...
	(@aarch64_load_<ANY_EXTEND:optab><SVE_HSDI:mode><SVE_PARTIAL_I:mode>):
	...this new pattern, handling extension to partial modes as well
	as full modes.  Describe the extension as a predicated rather than
	unpredicated extension.
	(@aarch64_ld<fn>f1_<ANY_EXTEND:optab><VNx8_WIDE:mode><VNx8_NARROW:mode>)
	(@aarch64_ld<fn>f1_<ANY_EXTEND:optab><VNx4_WIDE:mode><VNx4_NARROW:mode>)
	(@aarch64_ld<fn>f1_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>):
	Combine into...
	(@aarch64_ld<fn>f1_<ANY_EXTEND:optab><SVE_HSDI:mode><SVE_PARTIAL_I:mode>):
	...this new pattern, handling extension to partial modes as well
	as full modes.  Describe the extension as a predicated rather than
	unpredicated extension.
	* config/aarch64/aarch64-sve-builtins.cc
	(function_expander::use_contiguous_load_insn): Add an extra
	predicate for extending loads.
	* config/aarch64/aarch64.c (aarch64_extending_load_p): New function.
	(aarch64_sve_adjust_stmt_cost): Likewise.
	(aarch64_add_stmt_cost): Use aarch64_sve_adjust_stmt_cost to adjust
	the cost of SVE vector stmts.

gcc/testsuite/
	* gcc.target/aarch64/sve/load_extend_1.c: New test.
	* gcc.target/aarch64/sve/load_extend_2.c: Likewise.
	* gcc.target/aarch64/sve/load_extend_3.c: Likewise.
	* gcc.target/aarch64/sve/load_extend_4.c: Likewise.
	* gcc.target/aarch64/sve/load_extend_5.c: Likewise.
	* gcc.target/aarch64/sve/load_extend_6.c: Likewise.
	* gcc.target/aarch64/sve/load_extend_7.c: Likewise.
	* gcc.target/aarch64/sve/load_extend_8.c: Likewise.
	* gcc.target/aarch64/sve/load_extend_9.c: Likewise.
	* gcc.target/aarch64/sve/load_extend_10.c: Likewise.
	* gcc.target/aarch64/sve/reduc_4.c: Add
	--param aarch64-sve-compare-costs=0.

From-SVN: r278343
2019-11-16 11:11:47 +00:00
Richard Sandiford e58703e2c1 [AArch64] Add sign and zero extension for partial SVE modes
This patch adds support for extending from partial SVE modes
to both full vector modes and wider partial modes.

Some tests now need --param aarch64-sve-compare-costs=0 to force
the original full-vector code.

2019-11-16  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* config/aarch64/iterators.md (SVE_HSDI): New mode iterator.
	(narrower_mask): Handle VNx4HI, VNx2HI and VNx2SI.
	* config/aarch64/aarch64-sve.md
	(<ANY_EXTEND:optab><SVE_PARTIAL_I:mode><SVE_HSDI:mode>2): New pattern.
	(*<ANY_EXTEND:optab><SVE_PARTIAL_I:mode><SVE_HSDI:mode>2): Likewise.
	(@aarch64_pred_sxt<SVE_FULL_HSDI:mode><SVE_PARTIAL_I:mode>): Update
	comment.  Avoid new narrower_mask ambiguity.
	(@aarch64_cond_sxt<SVE_FULL_HSDI:mode><SVE_PARTIAL_I:mode>): Likewise.
	(*cond_uxt<mode>_2): Update comment.
	(*cond_uxt<mode>_any): Likewise.

gcc/testsuite/
	* gcc.target/aarch64/sve/cost_model_1.c: Expect the loop to be
	vectorized with bytes stored in 32-bit containers.
	* gcc.target/aarch64/sve/extend_1.c: New test.
	* gcc.target/aarch64/sve/extend_2.c: New test.
	* gcc.target/aarch64/sve/extend_3.c: New test.
	* gcc.target/aarch64/sve/extend_4.c: New test.
	* gcc.target/aarch64/sve/load_const_offset_3.c: Add
	--param aarch64-sve-compare-costs=0.
	* gcc.target/aarch64/sve/mask_struct_store_1.c: Likewise.
	* gcc.target/aarch64/sve/mask_struct_store_1_run.c: Likewise.
	* gcc.target/aarch64/sve/mask_struct_store_2.c: Likewise.
	* gcc.target/aarch64/sve/mask_struct_store_2_run.c: Likewise.
	* gcc.target/aarch64/sve/unpack_unsigned_1.c: Likewise.
	* gcc.target/aarch64/sve/unpack_unsigned_1_run.c: Likewise.

From-SVN: r278342
2019-11-16 11:07:23 +00:00
Richard Sandiford cc68f7c2da [AArch64] Add autovec support for partial SVE vectors
This patch adds the bare minimum needed to support autovectorisation of
partial SVE vectors, namely moves and integer addition.  Later patches
add more interesting cases.
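
A rough sketch of the kind of loop this is aimed at (illustrative only,
not copied from the new mixed_size_*.c tests; the function name is
hypothetical).  The loop mixes 32-bit and 8-bit elements and uses only
moves and integer addition.  With partial vectors, the bytes can be
kept in wider containers so that both statements process the same
number of elements per vector:

```c
#include <stdint.h>

/* Hypothetical mixed-size loop: one 32-bit addition and one 8-bit
   addition per iteration.  */
void
add_mixed (uint32_t *restrict wide, uint8_t *restrict narrow, int n)
{
  for (int i = 0; i < n; ++i)
    {
      wide[i] += 1;
      narrow[i] += 2;
    }
}
```

Whether the vectorizer chooses partial vectors for such a loop is a
cost-model decision and is not guaranteed by this sketch.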

2019-11-16  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* config/aarch64/aarch64-modes.def: Define partial SVE vector
	float modes.
	* config/aarch64/aarch64-protos.h (aarch64_sve_pred_mode): New
	function.
	* config/aarch64/aarch64.c (aarch64_classify_vector_mode): Handle the
	new vector float modes.
	(aarch64_sve_container_bits): New function.
	(aarch64_sve_pred_mode): Likewise.
	(aarch64_get_mask_mode): Use it.
	(aarch64_sve_element_int_mode): Handle structure modes and partial
	modes.
	(aarch64_sve_container_int_mode): New function.
	(aarch64_vectorize_related_mode): Return SVE modes when given
	SVE modes.  Handle partial modes, taking the preferred number
	of units from the size of the given mode.
	(aarch64_hard_regno_mode_ok): Allow partial modes to be stored
	in registers.
	(aarch64_expand_sve_ld1rq): Use the mode form of aarch64_sve_pred_mode.
	(aarch64_expand_sve_const_vector): Handle partial SVE vectors.
	(aarch64_split_sve_subreg_move): Use the mode form of
	aarch64_sve_pred_mode.
	(aarch64_secondary_reload): Handle partial modes in the same way
	as full big-endian vectors.
	(aarch64_vector_mode_supported_p): Allow partial SVE vectors.
	(aarch64_autovectorize_vector_modes): Try unpacked SVE vectors,
	merging with the Advanced SIMD modes.  If two modes have the
	same size, try the Advanced SIMD mode first.
	(aarch64_simd_valid_immediate): Use the container rather than
	the element mode for INDEX constants.
	(aarch64_simd_vector_alignment): Make the alignment of partial
	SVE vector modes the same as their minimum size.
	(aarch64_evpc_sel): Use the mode form of aarch64_sve_pred_mode.
	* config/aarch64/aarch64-sve.md (mov<SVE_FULL:mode>): Extend to...
	(mov<SVE_ALL:mode>): ...this.
	(movmisalign<SVE_FULL:mode>): Extend to...
	(movmisalign<SVE_ALL:mode>): ...this.
	(*aarch64_sve_mov<mode>_le): Rename to...
	(*aarch64_sve_mov<mode>_ldr_str): ...this.
	(*aarch64_sve_mov<SVE_FULL:mode>_be): Rename and extend to...
	(*aarch64_sve_mov<SVE_ALL:mode>_no_ldr_str): ...this.  Handle
	partial modes regardless of endianness.
	(aarch64_sve_reload_be): Rename to...
	(aarch64_sve_reload_mem): ...this and enable for little-endian.
	Use aarch64_sve_pred_mode to get the appropriate predicate mode.
	(@aarch64_pred_mov<SVE_FULL:mode>): Extend to...
	(@aarch64_pred_mov<SVE_ALL:mode>): ...this.
	(*aarch64_sve_mov<SVE_FULL:mode>_subreg_be): Extend to...
	(*aarch64_sve_mov<SVE_ALL:mode>_subreg_be): ...this.
	(@aarch64_sve_reinterpret<SVE_FULL:mode>): Extend to...
	(@aarch64_sve_reinterpret<SVE_ALL:mode>): ...this.
	(*aarch64_sve_reinterpret<SVE_FULL:mode>): Extend to...
	(*aarch64_sve_reinterpret<SVE_ALL:mode>): ...this.
	(maskload<SVE_FULL:mode><vpred>): Extend to...
	(maskload<SVE_ALL:mode><vpred>): ...this.
	(maskstore<SVE_FULL:mode><vpred>): Extend to...
	(maskstore<SVE_ALL:mode><vpred>): ...this.
	(vec_duplicate<SVE_FULL:mode>): Extend to...
	(vec_duplicate<SVE_ALL:mode>): ...this.
	(*vec_duplicate<SVE_FULL:mode>_reg): Extend to...
	(*vec_duplicate<SVE_ALL:mode>_reg): ...this.
	(sve_ld1r<SVE_FULL:mode>): Extend to...
	(sve_ld1r<SVE_ALL:mode>): ...this.
	(vec_series<SVE_FULL_I:mode>): Extend to...
	(vec_series<SVE_I:mode>): ...this.
	(*vec_series<SVE_FULL_I:mode>_plus): Extend to...
	(*vec_series<SVE_I:mode>_plus): ...this.
	(@aarch64_pred_sxt<SVE_FULL_HSDI:mode><SVE_PARTIAL_I:mode>): Avoid
	new VPRED ambiguity.
	(@aarch64_cond_sxt<SVE_FULL_HSDI:mode><SVE_PARTIAL_I:mode>): Likewise.
	(add<SVE_FULL_I:mode>3): Extend to...
	(add<SVE_I:mode>3): ...this.
	* config/aarch64/iterators.md (SVE_ALL, SVE_I): New mode iterators.
	(Vetype, Vesize, VEL, Vel, vwcore): Handle partial SVE vector modes.
	(VPRED, vpred): Likewise.
	(Vctype): New iterator.
	(vw): Remove SVE modes.

gcc/testsuite/
	* gcc.target/aarch64/sve/mixed_size_1.c: New test.
	* gcc.target/aarch64/sve/mixed_size_2.c: Likewise.
	* gcc.target/aarch64/sve/mixed_size_3.c: Likewise.
	* gcc.target/aarch64/sve/mixed_size_4.c: Likewise.
	* gcc.target/aarch64/sve/mixed_size_5.c: Likewise.

From-SVN: r278341
2019-11-16 11:02:09 +00:00
Richard Sandiford 7f33359984 [AArch64] Tweak gcc.target/aarch64/sve/clastb_8.c
clastb_8.c was using scan-tree-dump-times to check for fully-masked
loops, which made it sensitive to the number of times we try to
vectorize.

2019-11-16  Richard Sandiford  <richard.sandiford@arm.com>

gcc/testsuite/
	* gcc.target/aarch64/sve/clastb_8.c: Use assembly tests to
	check for fully-masked loops.

From-SVN: r278340
2019-11-16 10:57:55 +00:00
Richard Sandiford 6544cb5289 [AArch64] Replace SVE_PARTIAL with SVE_PARTIAL_I
Another renaming, this time to make way for partial/unpacked
float modes.

2019-11-16  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* config/aarch64/iterators.md (SVE_PARTIAL): Rename to...
	(SVE_PARTIAL_I): ...this.
	* config/aarch64/aarch64-sve.md: Apply the above renaming throughout.

From-SVN: r278339
2019-11-16 10:55:40 +00:00
Richard Sandiford f75cdd2c4e [AArch64] Add "FULL" to SVE mode iterator names
An upcoming patch will make more use of partial/unpacked SVE vectors.
We then need a distinction between mode iterators that include partial
modes and those that only include "full" modes.  This patch prepares
for that by adding "FULL" to the names of iterators that only select
full modes.  There should be no change in behaviour.

2019-11-16  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* config/aarch64/iterators.md (SVE_ALL): Rename to...
	(SVE_FULL): ...this.
	(SVE_I): Rename to...
	(SVE_FULL_I): ...this.
	(SVE_F): Rename to...
	(SVE_FULL_F): ...this.
	(SVE_BHSI): Rename to...
	(SVE_FULL_BHSI): ...this.
	(SVE_HSD): Rename to...
	(SVE_FULL_HSD): ...this.
	(SVE_HSDI): Rename to...
	(SVE_FULL_HSDI): ...this.
	(SVE_HSF): Rename to...
	(SVE_FULL_HSF): ...this.
	(SVE_SD): Rename to...
	(SVE_FULL_SD): ...this.
	(SVE_SDI): Rename to...
	(SVE_FULL_SDI): ...this.
	(SVE_SDF): Rename to...
	(SVE_FULL_SDF): ...this.
	(SVE_S): Rename to...
	(SVE_FULL_S): ...this.
	(SVE_D): Rename to...
	(SVE_FULL_D): ...this.
	* config/aarch64/aarch64-sve.md: Apply the above renaming throughout.
	* config/aarch64/aarch64-sve2.md: Likewise.

From-SVN: r278338
2019-11-16 10:50:42 +00:00
Richard Sandiford eb23241ba8 [AArch64] Enable VECT_COMPARE_COSTS by default for SVE
This patch enables VECT_COMPARE_COSTS by default for SVE, both so
that we can compare SVE against Advanced SIMD and so that (with future
patches) we can compare multiple SVE vectorisation approaches against
each other.  It also adds a target-specific --param to control this.
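
For reference, a test that wants the previous behaviour can pass the
new --param with a value of 0.  The file below is a hypothetical
sketch, not one of the tests touched by this patch.  It assumes that
SVE itself is enabled by the test harness or on the command line:

```c
/* { dg-do compile } */
/* { dg-options "-O2 -ftree-vectorize --param aarch64-sve-compare-costs=0" } */

#include <stdint.h>

/* Hypothetical reduction loop.  With cost comparison disabled, the
   vectorizer does not weigh the candidate vector modes against each
   other by cost.  */
uint32_t
sum_bytes (const uint8_t *x, int n)
{
  uint32_t sum = 0;
  for (int i = 0; i < n; ++i)
    sum += x[i];
  return sum;
}
```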

2019-11-16  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* config/aarch64/aarch64.opt (--param=aarch64-sve-compare-costs):
	New option.
	* doc/invoke.texi: Document it.
	* config/aarch64/aarch64.c (aarch64_autovectorize_vector_modes):
	By default, return VECT_COMPARE_COSTS for SVE.

gcc/testsuite/
	* gcc.target/aarch64/sve/reduc_3.c: Split multi-vector cases out
	into...
	* gcc.target/aarch64/sve/reduc_3_costly.c: ...this new test,
	passing -fno-vect-cost-model for them.
	* gcc.target/aarch64/sve/slp_6.c: Add -fno-vect-cost-model.
	* gcc.target/aarch64/sve/slp_7.c,
	* gcc.target/aarch64/sve/slp_7_run.c: Split multi-vector cases out
	into...
	* gcc.target/aarch64/sve/slp_7_costly.c,
	* gcc.target/aarch64/sve/slp_7_costly_run.c: ...these new tests,
	passing -fno-vect-cost-model for them.
	* gcc.target/aarch64/sve/while_7.c: Add -fno-vect-cost-model.
	* gcc.target/aarch64/sve/while_9.c: Likewise.

From-SVN: r278337
2019-11-16 10:43:52 +00:00
Richard Sandiford bcc7e346bf Optionally pick the cheapest loop_vec_info
This patch adds a mode in which the vectoriser tries each available
base vector mode and picks the one with the lowest cost.  The new
behaviour is selected by autovectorize_vector_modes.

The patch keeps the current behaviour of preferring a VF of
loop->simdlen over any larger or smaller VF, regardless of costs
or target preferences.
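
The selection can be pictured with the following simplified sketch.
It is not the code in tree-vect-loop.c: the struct, its fields and
both function names are hypothetical, and the loop->simdlen preference
described above is deliberately left out.  Candidates are compared by
the cost of the vector loop body per scalar iteration, with the
one-off cost outside the loop as a tie-breaker:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of one vectorization attempt.  */
struct candidate
{
  unsigned vf;            /* Scalar iterations per vector iteration.  */
  unsigned inside_cost;   /* Cost of one vector loop iteration.  */
  unsigned outside_cost;  /* One-off cost outside the loop.  */
};

/* Return true if NEW_C looks cheaper than OLD_C.  The per-iteration
   costs inside_cost / vf are compared by cross-multiplying, which
   avoids division and rounding.  */
bool
candidate_better_p (const struct candidate *new_c,
                    const struct candidate *old_c)
{
  unsigned long long rel_new
    = (unsigned long long) new_c->inside_cost * old_c->vf;
  unsigned long long rel_old
    = (unsigned long long) old_c->inside_cost * new_c->vf;
  if (rel_new != rel_old)
    return rel_new < rel_old;
  return new_c->outside_cost < old_c->outside_cost;
}

/* Return the cheapest of the N candidates, or NULL if N is zero.
   Earlier candidates win ties.  */
const struct candidate *
pick_cheapest (const struct candidate *cands, size_t n)
{
  const struct candidate *best = NULL;
  for (size_t i = 0; i < n; ++i)
    if (!best || candidate_better_p (&cands[i], best))
      best = &cands[i];
  return best;
}
```

Cross-multiplying keeps the comparison exact for integer costs, at the
price of needing a wider type for the products.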

2019-11-16  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* target.h (VECT_COMPARE_COSTS): New constant.
	* target.def (autovectorize_vector_modes): Return a bitmask of flags.
	* doc/tm.texi: Regenerate.
	* targhooks.h (default_autovectorize_vector_modes): Update accordingly.
	* targhooks.c (default_autovectorize_vector_modes): Likewise.
	* config/aarch64/aarch64.c (aarch64_autovectorize_vector_modes):
	Likewise.
	* config/arc/arc.c (arc_autovectorize_vector_modes): Likewise.
	* config/arm/arm.c (arm_autovectorize_vector_modes): Likewise.
	* config/i386/i386.c (ix86_autovectorize_vector_modes): Likewise.
	* config/mips/mips.c (mips_autovectorize_vector_modes): Likewise.
	* tree-vectorizer.h (_loop_vec_info::vec_outside_cost)
	(_loop_vec_info::vec_inside_cost): New member variables.
	* tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Initialize them.
	(vect_better_loop_vinfo_p, vect_joust_loop_vinfos): New functions.
	(vect_analyze_loop): When autovectorize_vector_modes returns
	VECT_COMPARE_COSTS, try vectorizing the loop with each available
	vector mode and pick the one with the lowest cost.
	(vect_estimate_min_profitable_iters): Record the computed costs
	in the loop_vec_info.

From-SVN: r278336
2019-11-16 10:40:23 +00:00
Richard Sandiford f884cd2fea Extend can_duplicate_and_interleave_p to mixed-size vectors
This patch makes can_duplicate_and_interleave_p cope with mixtures of
vector sizes, by using queries based on get_vectype_for_scalar_type
instead of directly querying GET_MODE_SIZE (vinfo->vector_mode).

int_mode_for_size is now the first check we do for a candidate mode,
so it seemed better to restrict it to MAX_FIXED_MODE_SIZE.  This avoids
unnecessary work and avoids trying to create scalar types that the
target might not support.

2019-11-16  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* tree-vectorizer.h (can_duplicate_and_interleave_p): Take an
	element type rather than an element mode.
	* tree-vect-slp.c (can_duplicate_and_interleave_p): Likewise.
	Use get_vectype_for_scalar_type to query the natural types
	for a given element type rather than basing everything on
	GET_MODE_SIZE (vinfo->vector_mode).  Limit int_mode_for_size
	query to MAX_FIXED_MODE_SIZE.
	(duplicate_and_interleave): Update call accordingly.
	* tree-vect-loop.c (vectorizable_reduction): Likewise.

From-SVN: r278335
2019-11-16 10:36:20 +00:00