It is tricky to get right. Aarch64 does it by adding the appropriate lane-swapping
operations during expansion.
I'd like to do the same on arm eventually, but we'd need to port and validate the VTBL-generating
code and add it to all the right places and I'm not comfortable enough doing it for GCC 8, but I am keen
in getting the wrong-code fixed.
As I say in the PR, vectorisation on armeb is already severely restricted (we disable many patterns on BYTES_BIG_ENDIAN)
and the load/store_lanes patterns really were not working properly at all, so disabling them is not
a radical approach.
The way to do that is to return false in ARRAY_MODE_SUPPORTED_P for BYTES_BIG_ENDIAN.
Bootstrapped and tested on arm-none-linux-gnueabihf.
Also tested on armeb-none-eabi.
PR target/82518
* config/arm/arm.c (arm_array_mode_supported_p): Return false for
BYTES_BIG_ENDIAN.
* lib/target-supports.exp (check_effective_target_vect_load_lanes):
Disable for armeb targets.
* gcc.target/arm/pr82518.c: New test.
From-SVN: r258687
2018-03-20 Richard Biener <rguenther@suse.de>
PR target/84986
* config/i386/i386.c (ix86_add_stmt_cost): Only cost
sign-conversions as zero, fall back to standard scalar_stmt
cost for the rest.
* gcc.dg/vect/costmodel/x86_64/costmodel-pr84986.c: New testcase.
From-SVN: r258684
2018-03-20 Martin Liska <mliska@suse.cz>
PR ipa/84825
* predict.c (rebuild_frequencies): Handle case when we have
PROFILE_ABSENT, but flag_guess_branch_prob is false.
2018-03-20 Martin Liska <mliska@suse.cz>
PR ipa/84825
* g++.dg/ipa/pr84825.C: New test.
From-SVN: r258683
PR target/84845
* config/aarch64/aarch64.md (*aarch64_reg_<mode>3_neg_mask2): Rename
to ...
(*aarch64_<optab>_reg_<mode>3_neg_mask2): ... this. If pseudos can't
be created, use lowpart_subreg of operands[0] rather than operands[0]
itself.
(*aarch64_reg_<mode>3_minus_mask): Rename to ...
(*aarch64_ashl_reg_<mode>3_minus_mask): ... this.
(*aarch64_<optab>_reg_di3_mask2): Use const_int_operand predicate
and n constraint instead of aarch64_shift_imm_di and Usd.
(*aarch64_reg_<optab>_minus<mode>3): Rename to ...
(*aarch64_<optab>_reg_minus<mode>3): ... this.
* gcc.c-torture/compile/pr84845.c: New test.
From-SVN: r258678
This patch fixes PR82989 so that we avoid NEON instructions when
-mneon-for-64bits is not enabled. This is more of a short term fix
for the real deeper problem of making an early decision of choosing
or rejecting NEON instructions. There is now a new ticket PR84467 to
deal with the longer term solution.
(Please refer to the discussion in the bug report for more details).
Sudi
*** gcc/ChangeLog ***
2018-03-20 Sudakshina Das <sudi.das@arm.com>
PR target/82989
* config/arm/neon.md (ashldi3_neon): Update ?s for constraints
to favor GPR over NEON registers.
(<shift>di3_neon): Likewise.
*** gcc/testsuite/ChangeLog ***
2018-03-20 Sudakshina Das <sudi.das@arm.com>
PR target/82989
* gcc.target/arm/pr82989.c: New test.
From-SVN: r258677
2018-03-20 Tom de Vries <tom@codesourcery.com>
PR target/84954
* config/nvptx/nvptx.c (prevent_branch_around_nothing): Also update
seen_label if seen_label is already set.
From-SVN: r258674
PR target/84945
* config/i386/i386.c (fold_builtin_cpu): For features above 31
use __cpu_features2 variable instead of __cpu_model.__cpu_features[0].
Use 1U instead of 1. Formatting fixes.
* gcc.target/i386/pr84945.c: New test.
* config/i386/cpuinfo.h (__cpu_features2): Declare.
* config/i386/cpuinfo.c (__cpu_features2): New variable for
ifndef SHARED only.
(set_feature): Define.
(get_available_features): Use set_feature macro. Set __cpu_features2
to the second word of features ifndef SHARED.
From-SVN: r258673
PR c/84953
* builtins.c (fold_builtin_strpbrk): For strpbrk(x, "") use type
instead of TREE_TYPE (s1) for the return value.
* gcc.dg/pr84953.c: New test.
From-SVN: r258671
PR sanitizer/78651
* dwarf2asm.c: Include fold-const.c.
(dw2_output_indirect_constant_1): Set DECL_INITIAL (decl) to ADDR_EXPR
of decl rather than decl itself.
From-SVN: r258664
PR sanitizer/84761
* sanitizer_common/sanitizer_linux_libcdep.cc (__GLIBC_PREREQ):
Define if not defined.
(DL_INTERNAL_FUNCTION): Don't define.
(InitTlsSize): For __i386__ if not compiled against glibc 2.27+
determine at runtime whether to use regparm(3), stdcall calling
convention for older glibcs or normal calling convention for
newer glibcs for call to _dl_get_tls_static_info.
From-SVN: r258663
This patch fixes the inconsistent behavior observed at -O3 for the unordered
comparisons. According to the online docs (https://gcc.gnu.org/onlinedocs
/gcc-7.2.0/gccint/Unary-and-Binary-Expressions.html), all of the following
should not raise an FP exception:
- UNGE_EXPR
- UNGT_EXPR
- UNLE_EXPR
- UNLT_EXPR
- UNEQ_EXPR
Also ORDERED_EXPR and UNORDERED_EXPR should only return zero or one.
The aarch64-simd.md handling of these were generating exception raising
instructions such as fcmgt. This patch changes the instructions that are
emitted in order to not give out the exceptions. We first check each
operand for NaNs and force any elements containing NaN to zero before using
them in the compare.
Example: UN<cc> (a, b) -> UNORDERED (a, b)
| (cm<cc> (isnan (a) ? 0.0 : a, isnan (b) ? 0.0 : b))
The ORDERED_EXPR is now handled as (cmeq (a, a) & cmeq (b, b)) and
UNORDERED_EXPR as ~ORDERED_EXPR and UNEQ as (~ORDERED_EXPR | cmeq (a,b)).
ChangeLog Entries:
*** gcc/ChangeLog ***
2018-03-19 Sudakshina Das <sudi.das@arm.com>
PR target/81647
* config/aarch64/aarch64-simd.md (vec_cmp<mode><v_int_equiv>): Modify
instructions for UNLT, UNLE, UNGT, UNGE, UNEQ, UNORDERED and ORDERED.
*** gcc/testsuite/ChangeLog ***
2018-03-19 Sudakshina Das <sudi.das@arm.com>
PR target/81647
* gcc.target/aarch64/pr81647.c: New.
From-SVN: r258653
gcc/
PR bootstrap/84856
* config/riscv/riscv.c (riscv_function_arg_boundary): Use
PREFERRED_STACK_BOUNDARY instead of STACK_BOUNDARY.
(riscv_first_stack_step): Likewise.
(riscv_option_override): Use STACK_BOUNDARY instead of
MIN_STACK_BOUNDARY.
* config/riscv/riscv.h (STACK_BOUNDARY): Renamed from
MIN_STACK_BOUNDARY.
(BIGGEST_ALIGNMENT): Set to 128.
(PREFERRED_STACK_BOUNDARY): Renamed from STACK_BOUNDARY.
(RISCV_STACK_ALIGN): Use PREFERRED_STACK_BOUNDARY instead of
STACK_BOUNDARY.
From-SVN: r258650
2018-03-19 Richard Biener <rguenther@suse.de>
PR tree-optimization/84933
* tree-vrp.c (set_and_canonicalize_value_range): Treat out-of-bound
values as -INF/INF when canonicalizing an ANTI_RANGE to a RANGE.
* g++.dg/pr84933.C: New testcase.
From-SVN: r258646
2018-03-19 Richard Biener <rguenther@suse.de>
PR tree-optimization/84859
* tree-ssa-phiopt.c (single_trailing_store_in_bb): New function.
(cond_if_else_store_replacement): Perform sinking operation on
single-store BBs regardless of MAX_STORES_TO_SINK setting.
Generalize what a BB with a single eligible store is.
* gcc.dg/tree-ssa/pr84859.c: New testcase.
* gcc.dg/tree-ssa/pr35286.c: Disable cselim.
* gcc.dg/tree-ssa/split-path-6.c: Likewise.
* gcc.dg/tree-ssa/split-path-7.c: Likewise.
From-SVN: r258645
2018-03-18 Martin Liska <mliska@suse.cz>
PR rtl-optimization/84635
* regrename.c (build_def_use): Use matches_mode only when
matches >= 0.
From-SVN: r258634
2018-03-18 Steven G. Kargl <kargl@gcc.gnu.org>
PR fortran/77414
* decl.c (get_proc_name): Check for a subroutine re-defined in
the contain portion of a subroutine. Change language of existing
error message to better describe the issue. While here fix whitespace
issues.
2018-03-18 Steven G. Kargl <kargl@gcc.gnu.org>
PR fortran/77414
* gfortran.dg/pr77414.f90: New test.
* gfortran.dg/internal_references_1.f90: Adjust error message.
From-SVN: r258633
2018-03-18 Steven G. Kargl <kargl@gcc.gnu.org>
PR fortran/65453
* decl.c (get_proc_name): Catch clash between a procedure statement
and a contained subprogram
2018-03-18 Steven G. Kargl <kargl@gcc.gnu.org>
PR fortran/65453
* gfortran.dg/pr65453.f90: New test.
From-SVN: r258632
The testcase ICEd for both SVE and AVX512 because we were trying
to vectorise a chain of COND_EXPRs as a reduction and getting
confused by reduc_index == -1.
2018-03-18 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
PR tree-optimization/84913
* tree-vect-loop.c (vectorizable_reduction): Don't try to
vectorize chains of COND_EXPRs.
gcc/testsuite/
PR tree-optimization/84913
* gfortran.dg/vect/pr84913.f90: New test.
From-SVN: r258631