tlsdesc calls are guaranteed to preserve all Advanced SIMD registers,
but are not guaranteed to preserve the SVE extension of them.
The calls also don't preserve the SVE predicate registers.
The long-term plan for handling the SVE vector registers is CLOBBER_HIGH,
which adds a clobber equivalent of TARGET_HARD_REGNO_CALL_PART_CLOBBERED.
The pattern can then directly model the fact that the low 128 bits are
preserved and the upper bits are clobbered.
However, it's too late now for that to be included in GCC 8, so this
patch conservatively treats the whole vector register as being clobbered.
This has the obvious disadvantage that compiling for SVE can make NEON
code worse, but I don't think there's much we can do about that until
CLOBBER_HIGH is in.
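As a rough illustration of the cost described above, consider a function that keeps a 128-bit vector argument live across a thread-local access. This is a minimal sketch, not the tls_1.c test from this patch: the names v4si, tls_vec and add_tls are hypothetical, and it assumes the GNU vector_size extension and a build where the TLS access goes through a descriptor call (for example -fpic on AArch64).

```c
#include <stdint.h>

/* Hypothetical 128-bit vector type (GNU vector extension). */
typedef int32_t v4si __attribute__ ((vector_size (16)));

/* Hypothetical thread-local vector; in a position-independent AArch64
   build its address may be obtained through a tlsdesc call.  */
__thread v4si tls_vec;

/* The argument A arrives in a vector register and is still needed
   after the TLS address has been computed.  If the whole register is
   treated as clobbered by the tlsdesc call, the compiler may have to
   save and reload A around it.  */
v4si
add_tls (v4si a)
{
  return a + tls_vec;
}
```

Whether a spill actually appears depends on the target options and the compiler version, so the generated assembly is the thing to check.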
2018-03-13 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* config/aarch64/aarch64.md (V4_REGNUM, V8_REGNUM, V12_REGNUM)
(V20_REGNUM, V24_REGNUM, V28_REGNUM, P1_REGNUM, P2_REGNUM, P3_REGNUM)
(P4_REGNUM, P5_REGNUM, P6_REGNUM, P8_REGNUM, P9_REGNUM, P10_REGNUM)
(P11_REGNUM, P12_REGNUM, P13_REGNUM, P14_REGNUM): New define_constants.
(tlsdesc_small_<mode>): Turn into a define_expand and use
tlsdesc_small_sve_<mode> for SVE. Rename original define_insn to...
(tlsdesc_small_advsimd_<mode>): ...this.
(tlsdesc_small_sve_<mode>): New pattern.
gcc/testsuite/
* gcc.target/aarch64/sve/tls_1.c: New test.
* gcc.target/aarch64/sve/tls_2.C: Likewise.
From-SVN: r258488
One advantage of the new permute handling compared to the old way is
that we can now easily take advantage of the vectoriser's divmod patterns
for SVE.
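For readers unfamiliar with why a high-part multiply helps with division: dividing by a constant can be rewritten as a multiply that keeps only the upper half of the double-width product, followed by a shift. The scalar sketch below shows the well-known unsigned divide-by-3 case. The function names are hypothetical and this is not the code the vectoriser emits, only the arithmetic identity it can build on.

```c
#include <stdint.h>

/* Upper 32 bits of the 64-bit product A * B, i.e. an unsigned
   high-part multiply.  */
uint32_t
umul_highpart (uint32_t a, uint32_t b)
{
  return (uint32_t) (((uint64_t) a * b) >> 32);
}

/* X / 3 for any 32-bit unsigned X.  0xAAAAAAAB is (2^33 + 1) / 3, so
   the full computation is (X * 0xAAAAAAAB) >> 33.  */
uint32_t
udiv3 (uint32_t x)
{
  return umul_highpart (x, 0xAAAAAAABu) >> 1;
}
```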
2018-03-13 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* config/aarch64/iterators.md (UNSPEC_SMUL_HIGHPART)
(UNSPEC_UMUL_HIGHPART): New constants.
(MUL_HIGHPART): New int iterator.
(su): Handle UNSPEC_SMUL_HIGHPART and UNSPEC_UMUL_HIGHPART.
* config/aarch64/aarch64-sve.md (<su>mul<mode>3_highpart): New
define_expand.
(*<su>mul<mode>3_highpart): New define_insn.
gcc/testsuite/
* gcc.target/aarch64/sve/mul_highpart_1.c: New test.
* gcc.target/aarch64/sve/mul_highpart_1_run.c: Likewise.
From-SVN: r258487
PR middle-end/84834
* match.pd ((A & C) != 0 ? D : 0): Use INTEGER_CST@2 instead of
integer_pow2p@2 and test integer_pow2p in condition.
(A < 0 ? C : 0): Similarly for @1.
* gcc.dg/pr84834.c: New test.
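As I understand the first pattern above, it applies when both C and D are powers of two, in which case the select can be replaced by a shift of the masked value. The sketch below shows the two equivalent source forms for C = 4 and D = 16. The function names are hypothetical and the code only illustrates the identity, not the match.pd implementation.

```c
#include <stdint.h>

/* Source form: test one bit and yield another single-bit constant.  */
uint32_t
select_form (uint32_t a)
{
  return (a & 4u) != 0 ? 16u : 0u;
}

/* Equivalent branch-free form: move bit 2 up to bit 4.  */
uint32_t
shift_form (uint32_t a)
{
  return (a & 4u) << 2;
}
```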
From-SVN: r258479
PR target/84828
* reg-stack.c (change_stack): Change update_end var from int to
rtx_insn *, if non-NULL don't update just BB_END (current_block), but
also call set_block_for_insn on the newly added insns and rescan.
* g++.dg/ext/pr84828.C: New test.
From-SVN: r258476
PR target/84786
* config/i386/sse.md (sse2_loadhpd): Use Yv constraint rather than v
on the last operand.
* gcc.target/i386/avx512f-pr84786-1.c: New test.
* gcc.target/i386/avx512f-pr84786-2.c: New test.
From-SVN: r258475
PR c++/84704
* tree.c (stabilize_reference_1): Return save_expr (e) for
STATEMENT_LIST even if it doesn't have side-effects.
* g++.dg/debug/pr84704.C: New test.
From-SVN: r258470
This makes the float32-basic.c testcase work on sysv (32-bit Linux).
"float" is promoted to "double" for varargs. The ABI also only defines
the use of double precision in varargs. But _Float32 is not promoted.
Since there is no way of passing single-precision float in FPRs we
should pass SFmode in GPRs (or memory) instead. This is similar to
the 64-bit ABI.
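The promotion rule mentioned above can be seen in plain C: a float passed through "..." arrives as a double, so the callee must read it with va_arg (ap, double). The sketch below uses a hypothetical helper, sum_floats, and deliberately sticks to float, since _Float32 support varies between compilers.

```c
#include <stdarg.h>

/* Sum COUNT variadic arguments.  Callers may pass float values, but
   the default argument promotions turn them into double, so that is
   the type that must be read back.  */
double
sum_floats (int count, ...)
{
  va_list ap;
  double total = 0.0;

  va_start (ap, count);
  for (int i = 0; i < count; ++i)
    total += va_arg (ap, double);
  va_end (ap);
  return total;
}
```

A _Float32 argument gets no such promotion, which is why the ABI needs a separate way to pass it.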
From-SVN: r258454
There still are situations where we have stale LOG_LINKS. This causes
combine to try two-insn combinations I2->I3 where the register set by
I2 is used before I3 as well. Not good.
This patch fixes it by checking for this situation in can_combine_p
(similar to what we already do for three and four insn combinations).
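The condition being checked can be pictured with a toy model. The sketch below is not GCC's can_combine_p and does not use RTL: struct toy_insn and both functions are hypothetical, and each instruction simply sets one register and reads up to two (-1 meaning none).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy instruction: sets one register and reads up to two.  */
struct toy_insn
{
  int set;
  int use[2];
};

/* Return true if REG is read by an instruction strictly between
   positions I2 and I3.  */
bool
toy_reg_used_between (const struct toy_insn *insns, size_t i2, size_t i3,
                      int reg)
{
  assert (i2 < i3);
  for (size_t i = i2 + 1; i < i3; ++i)
    if (insns[i].use[0] == reg || insns[i].use[1] == reg)
      return true;
  return false;
}

/* Folding I2 into I3 is only safe in this model if nothing in between
   still needs the register that I2 sets.  */
bool
toy_can_combine (const struct toy_insn *insns, size_t i2, size_t i3)
{
  return !toy_reg_used_between (insns, i2, i3, insns[i2].set);
}
```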
From-SVN: r258452
2018-03-12 Richard Biener <rguenther@suse.de>
PR tree-optimization/84803
* tree-if-conv.c (ifcvt_memrefs_wont_trap): Don't do anything
for refs DR analysis didn't process.
* gcc.dg/torture/pr84803.c: New testcase.
From-SVN: r258446
2018-03-12 Richard Biener <rguenther@suse.de>
PR tree-optimization/84777
* tree-ssa-loop-ch.c (should_duplicate_loop_header_p): For
force-vectorize loops ignore whether we are optimizing for size.
From-SVN: r258444
2018-03-11 Paul Thomas <pault@gcc.gnu.org>
PR fortran/84546
* trans-array.c (structure_alloc_comps): Make sure that the
vptr is copied and that the unlimited polymorphic _len is used
to compute the size to be allocated.
* trans-expr.c (gfc_get_class_array_ref): If unlimited, use the
unlimited polymorphic _len for the offset to the element.
(gfc_copy_class_to_class): Set the new 'unlimited' argument.
* trans.h: Add the boolean 'unlimited' to the prototype.
2018-03-11 Paul Thomas <pault@gcc.gnu.org>
PR fortran/84546
* gfortran.dg/unlimited_polymorphic_29.f90: New test.
From-SVN: r258438
2018-03-11 Steven G. Kargl <kargls@gcc.gnu.org>
* check.c (gfc_check_kill): Check pid and sig are scalar.
(gfc_check_kill_sub): Restrict kind to 4 and 8.
* intrinsic.c (add_function): Sort keyword list. Add pid and sig
keywords for KILL. Remove redundant *back="back" in favor of the
original *bck="back".
(add_subroutines): Sort keyword list. Add pid and sig keywords
for KILL.
* intrinsic.texi: Fix documentation to consistently use pid and sig.
* iresolve.c (gfc_resolve_kill): Kind can only be 4 or 8. Choose the
correct function.
(gfc_resolve_rename_sub): Add comment.
From-SVN: r258436
PR debug/58150
* dwarf2out.c (gen_enumeration_type_die): Don't guard adding
DW_AT_declaration for ENUM_IS_OPAQUE on -gdwarf-4 or -gno-strict-dwarf,
but on TYPE_SIZE. Don't do anything for ENUM_IS_OPAQUE if not creating
a new die. Don't set TREE_ASM_WRITTEN if ENUM_IS_OPAQUE. Guard
addition of most attributes on !orig_type_die or the attribute not
being present already. Assert TYPE_VALUES is NULL for ENUM_IS_OPAQUE.
* g++.dg/debug/dwarf2/enum2.C: New test.
From-SVN: r258434
2018-03-09 Steven G. Kargl <kargl@gcc.gnu.org>
PR fortran/84734
* arith.c (check_result, eval_intrinsic): If result overflows, pass
the expression up the chain instead of a NULL pointer.
2018-03-09 Steven G. Kargl <kargl@gcc.gnu.org>
PR fortran/84734
* gfortran.dg/pr84734.f90: New test.
From-SVN: r258416
2018-03-10 Vladimir Makarov <vmakarov@redhat.com>
Reverting patch:
2018-03-09 Vladimir Makarov <vmakarov@redhat.com>
PR target/83712
* lra-assigns.c (assign_by_spills): Return a flag of reload
assignment failure. Do not process the reload assignment
failures. Do not spill other reload pseudos if they have the same
reg class.
(lra_assign): Add a return arg. Set up from the result of
assign_by_spills call.
(find_reload_regno_insns, lra_split_hard_reg_for): New functions.
* lra-constraints.c (split_reg): Add a new arg. Use it instead of
usage_insns if it is not NULL.
(spill_hard_reg_in_range): New function.
(split_if_necessary, inherit_in_ebb): Pass a new arg to split_reg.
* lra-int.h (spill_hard_reg_in_range, lra_split_hard_reg_for): New
function prototypes.
(lra_assign): Change prototype.
* lra.c (lra): Add code to deal with fails by splitting hard reg
live ranges.
From-SVN: r258415
When outputting entry views in symbolic mode, we used to use a lbl_id,
but that outputs the view as an addr, perhaps even an indirect one,
which is excessive and undesirable for a small assembler-computed
constant.
Introduce a new value class for symbolic views, so that we can output
the labels as constant data, using as narrow forms as possible, but
wide enough for any symbolic views output in the compilation. We
don't know exactly where the assembler will reset views, but we count
the symbolic views since known reset points and use that as an upper
bound for view numbers.
Ideally, we'd use uleb128, but then the compiler would have to defer
.debug_info offset computation to the assembler. I'm not going there
for now, so a symbolic uleb128 assembler constant in an attribute is
not something GCC can deal with ATM.
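The "as narrow as possible, but wide enough" choice can be sketched as picking a fixed-size data width from the upper bound. The helper below is hypothetical and is not the dwarf2out.c code. It only assumes that the candidate widths are 1, 2, 4 and 8 bytes.

```c
#include <stdint.h>

/* Return the smallest fixed width, in bytes, that can hold every view
   number up to and including UPPER_BOUND.  */
int
toy_symview_bytes (uint64_t upper_bound)
{
  if (upper_bound <= 0xff)
    return 1;
  if (upper_bound <= 0xffff)
    return 2;
  if (upper_bound <= 0xffffffffu)
    return 4;
  return 8;
}
```

Because the width is fixed once the bound is known, offsets in .debug_info can still be computed by the compiler, which is the constraint that rules out uleb128 here.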
for gcc/ChangeLog
PR debug/84620
* dwarf2out.h (dw_val_class): Add dw_val_class_symview.
(dw_val_node): Add val_symbolic_view.
* dwarf2out.c (dw_line_info_table): Add symviews_since_reset.
(symview_upper_bound): New.
(new_line_info_table): Initialize symviews_since_reset.
(dwarf2out_source_line): Count symviews_since_reset and set
symview_upper_bound.
(dw_val_equal_p): Handle symview.
(add_AT_symview): New.
(print_dw_val): Handle symview.
(attr_checksum, attr_checksum_ordered): Likewise.
(same_dw_val_p, size_of_die): Likewise.
(value_format, output_die): Likewise.
(add_high_low_attributes): Use add_AT_symview for entry_view.
(dwarf2out_finish): Reset symview_upper_bound, clear
zero_view_p.
From-SVN: r258411