2019-02-09 Harald Anlauf <anlauf@gmx.de>
PR fortran/89077
* resolve.c (gfc_resolve_substring_charlen): Check substring
length for constantness prior to general calculation of length.
PR fortran/89077
* gfortran.dg/substr_simplify.f90: New test.
From-SVN: r268726
2019-02-09 Paul Thomas <pault@gcc.gnu.org>
PR fortran/89200
* trans-array.c (gfc_trans_create_temp_array): Set the 'span'
field for derived types.
2019-02-09 Paul Thomas <pault@gcc.gnu.org>
PR fortran/89200
* gfortran.dg/array_reference_2.f90 : New test.
From-SVN: r268721
PR middle-end/89246
* config/i386/i386.c (ix86_simd_clone_compute_vecsize_and_simdlen):
If !node->definition and TYPE_ARG_TYPES is non-NULL, use
TYPE_ARG_TYPES instead of DECL_ARGUMENTS.
* gcc.dg/gomp/pr89246-1.c: New test.
* gcc.dg/gomp/pr89246-2.c: New test.
From-SVN: r268718
In the standard these member functions are specified in terms of the
potentially-throwing path decompositions functions, but we implement
them without constructing any new paths or doing anything else that can
throw.
PR libstdc++/71044
* include/bits/fs_path.h (path::has_root_name)
(path::has_root_directory, path::has_root_path)
(path::has_relative_path, path::has_parent_path)
(path::has_filename, path::has_stem, path::has_extension)
(path::is_absolute, path::is_relative, path::_M_find_extension): Add
noexcept.
* src/c++17/fs_path.cc (path::has_root_name)
(path::has_root_directory, path::has_root_path)
(path::has_relative_path, path::has_parent_path)
(path::has_filename, path::_M_find_extension): Add noexcept.
From-SVN: r268713
Fixes lack of r30 save/restore on
// -m32 -fpic -ftls-model=initial-exec
__thread char* p;
char** f1 (void) { return &p; }
and
// -m32 -fpic -msecure-plt
extern int foo (int);
int f1 (int x) { return foo (x); }
These are both caused by save_reg_p returning false when the pic
offset table reg (r30 for ABI_V4) was used, due to the logic not
exactly matching that in rs6000_emit_prologue to set up r30.
I also noticed that save_reg_p isn't following the comment regarding
calls_eh_return (since svn 267049, git 0edf78b1b2a0), and the comment
needs tweaking too. For why the revised comment is correct, grep for
saves_all_registers in lra.c, and yes, we do want to save the pic
offset table reg for eh_return.
PR target/88343
* config/rs6000/rs6000.c (save_reg_p): Correct calls_eh_return
case. Match logic in rs6000_emit_prologue emitting pic_offset_table
setup.
From-SVN: r268708
This patch implements the vector copysign operation using vector select and a
signbit mask.
gcc/ChangeLog:
2019-02-08 Robin Dapp <rdapp@linux.ibm.com>
* config/s390/vector.md: Implement vector copysign.
gcc/testsuite/ChangeLog:
2019-02-08 Robin Dapp <rdapp@linux.ibm.com>
* gcc.target/s390/vector/vec-copysign-execute.c: New test.
* gcc.target/s390/vector/vec-copysign.c: New test.
From-SVN: r268697
2019-02-08 Richard Biener <rguenther@suse.de>
PR tree-optimization/89247
* tree-if-conv.c: Include tree-cfgcleanup.h.
(version_loop_for_if_conversion): Record LOOP_VECTORIZED call.
(tree_if_conversion): Pass through predicate vector.
(pass_if_conversion::execute): Do CFG cleanup and SSA update
inline, see if any if-converted loops we refrece in
LOOP_VECTORIZED calls vanished and fixup.
* tree-if-conv.h (tree_if_conversion): Adjust prototype.
* gcc.dg/torture/pr89247.c: New testcase.
From-SVN: r268689
Implementation of section anchors in S/390 back-end added in r266741
broke jump labels in S/390 Linux kernel [1]. Currently jump labels
pass global variable addresses to .quad directive in inline assembly
using "X" constraint. In the past this used to produce regular symbol
references, however, after r266741 we sometimes get values like
(plus (reg) (const_int)), where (reg) points to a section anchor.
Strictly speaking, this is still correct, since "X" accepts anything.
Thus, now we need another way to support jump labels.
The existing "i" constraint cannot be used, since with -fPIC it must
not accept non-local symbols, however, jump labels do require that,
e.g. __tracepoint_xdp_exception from kernel proper might be referenced
from kernel modules.
The existing "ZL" constraint cannot be used for the same reason.
The existing "b" constraint cannot be used because of the way
expand_asm_stmt works. It deduces whether the constraint allows
regs, subregs or mems, and processes asm operands differently based on
that. "b" is supposed to accept values like (mem (symbol_ref)), and
there appears to be no way to explain to expand_asm_stmt that for "b"
mem's address must not be in a register.
This patch introduces the new machine-specific constraint named "jdd" -
"j" prefix is already used for constants, and "d" stands for "data".
It accepts anything that fits into the data section, whether or not
this might require a relocation, that is, anything that passes
CONSTANT_P check.
[1] https://lkml.org/lkml/2019/1/23/346
gcc/ChangeLog:
2019-02-08 Ilya Leoshkevich <iii@linux.ibm.com>
* config/s390/constraints.md (jdd): New constraint.
gcc/testsuite/ChangeLog:
2019-02-08 Ilya Leoshkevich <iii@linux.ibm.com>
* gcc.target/s390/jump-label.c: New test.
From-SVN: r268688
* gcc-interface/trans.c (gnat_to_gnu) <N_Aggregate>: Minor tweak.
* gcc-interface/utils.c (convert): Do not pad when doing an unchecked
conversion here. Use TREE_CONSTANT throughout the function.
(unchecked_convert): Also pad if the source is a CONSTRUCTOR and the
destination is a more aligned array type or a larger aggregate type,
but not between original and packable versions of a type.
From-SVN: r268679
OImode and TImode moves must be done in XImode to access upper 16
vector registers without AVX512VL. With AVX512VL, we can access
upper 16 vector registers in OImode and TImode.
PR target/89229
* config/i386/i386.md (*movoi_internal_avx): Set mode to XI for
upper 16 vector registers without TARGET_AVX512VL.
(*movti_internal): Likewise.
From-SVN: r268678
* gcc-interface/trans.c (Regular_Loop_to_gnu): Replace tests on
individual flag_unswitch_loops and flag_tree_loop_vectorize switches
with test on global optimize switch.
(Raise_Error_to_gnu): Likewise.
From-SVN: r268671
PR rtl-optimization/89234
* except.c (copy_reg_eh_region_note_forward): Return if note_or_insn
is a NOTE, CODE_LABEL etc. - rtx_insn * other than INSN_P.
(copy_reg_eh_region_note_backward): Likewise.
* g++.dg/ubsan/pr89234.C: New test.
From-SVN: r268669
The backtrace functions backtrace_full, backtrace_print and backtrace_simple
walk the call stack, but make sure to skip the first entry, in order to skip
over the functions themselves, and start the backtrace at the caller of the
functions.
When compiling with -flto, the functions may be inlined, causing them to skip
over the caller instead.
Fix this by declaring the functions with __attribute__((noinline)).
2019-02-08 Tom de Vries <tdevries@suse.de>
* backtrace.c (backtrace_full): Declare with __attribute__((noinline)).
* print.c (backtrace_print): Same.
* simple.c (backtrace_simple): Same.
From-SVN: r268668
2019-02-08 Richard Biener <rguenther@suse.de>
PR middle-end/89223
* tree-data-ref.c (initialize_matrix_A): Fail if constant
doesn't fit in HWI.
(analyze_subscript_affine_affine): Handle failure from
initialize_matrix_A.
* gcc.dg/torture/pr89223.c: New testcase.
From-SVN: r268666
Add handling of the DW_FORM_ref_addr encoding to libbacktrace.
2019-02-08 Tom de Vries <tdevries@suse.de>
PR libbacktrace/78063
* dwarf.c (build_address_map): Keep all parsed units.
(read_referenced_name_from_attr): Handle DW_FORM_ref_addr.
From-SVN: r268663
PR tree-optimization/89235 reports an ICE inside -fsave-optimization-record
whilst reporting the inlining chain of of the location_t in the
vect_location global.
This is very similar to PR tree-optimization/86637, fixed in r266821.
The issue is that the inlining chains are read from the location_t's
ad-hoc data, referencing GC-managed tree blocks, but the former are
not GC roots; it's simply assumed that old locations referencing dead
blocks never get used again.
The fix is to reset the "vect_location" global in more places. Given
that is a somewhat subtle detail, the patch adds a sentinel class to
reset vect_location at the end of a scope. Doing it as a class
simplifies the task of ensuring that the global is reset on every
exit path from a function, and also gives a good place to signpost
the above subtlety (in the documentation for the class).
The patch also adds test cases for both of the PRs mentioned above.
gcc/testsuite/ChangeLog:
PR tree-optimization/86637
PR tree-optimization/89235
* gcc.c-torture/compile/pr86637-1.c: New test.
* gcc.c-torture/compile/pr86637-2.c: New test.
* gcc.c-torture/compile/pr86637-3.c: New test.
* gcc.c-torture/compile/pr89235.c: New test.
gcc/ChangeLog:
PR tree-optimization/86637
PR tree-optimization/89235
* tree-vect-loop.c (optimize_mask_stores): Add an
auto_purge_vect_location sentinel to ensure that vect_location is
purged on exit.
* tree-vectorizer.c
(auto_purge_vect_location::~auto_purge_vect_location): New dtor.
(try_vectorize_loop_1): Add an auto_purge_vect_location sentinel
to ensure that vect_location is purged on exit.
(pass_slp_vectorize::execute): Likewise, replacing the manual
reset.
* tree-vectorizer.h (class auto_purge_vect_location): New class.
From-SVN: r268659
Richard raised a concern about the RTL we use to represent the AdvSIMD SABD
(vector signed absolute difference) instruction.
We currently represent it as ABS (MINUS op1 op2).
This isn't exactly what SABD does. ABS treats its input as a signed value
and returns the absolute of that.
For example:
(sabd:QI 64 -128) == 192 (unsigned) aka -64 (signed)
whereas
(minus:QI 64 -128) == 192 (unsigned) aka -64 (signed), (abs ...) of that is 64.
A better way to describe the instruction is with MINUS (SMAX (op1 op2) SMIN (op1 op2)).
This patch implements that, and also implements similar semantics for the UABD instruction
that uses UMAX and UMIN.
That way for the example above we'll have:
(minus:QI (smax:QI (64 -128)) (smin:QI (64 -128))) == (minus:QI 64 -128) == 192 (or -64 signed) which matches
what SABD does.
* config/aarch64/iterators.md (max_opp): New code_attr.
(USMAX): New code iterator.
* config/aarch64/predicates.md (aarch64_smin): New predicate.
(aarch64_smax): Likewise.
* config/aarch64/aarch64-simd.md (abd<mode>_3): Rename to...
(*aarch64_<su>abd<mode>_3): ... Change RTL representation to
MINUS (MAX MIN).
* gcc.target/aarch64/abd_1.c: New test.
* gcc.dg/sabd_1.c: Likewise.
From-SVN: r268658
PR target/89229
* config/i386/i386.md (*movoi_internal_avx): Set mode to OI
for TARGET_AVX512VL.
(*movti_internal): Set mode to TI for TARGET_AVX512VL.
From-SVN: r268657
My previous patch failed to only run an arm test on arm architecture.
This adds that condition to the test.
gcc/testsuite/ChangeLog:
2019-02-07 Matthew Malcomson <matthew.malcomson@arm.com>
* gcc.dg/rtl/arm/ldrd-peepholes.c: Only run on arm
From-SVN: r268655
This patch fixes several problems with the vec_xl/vec_xst builtins:
- vec_xl/vec_xst needs to use the alignment of the scalar memory
operand for the vector type reference. This is required to emit the
proper vl/vst alignment hints.
- vec_xl / vec_xld2 / vec_xlw4 should accept const pointer source operands
- vec_xlw4 / vec_xstw4 needs to accept float memory operands
gcc/ChangeLog:
2019-02-07 Andreas Krebbel <krebbel@linux.ibm.com>
* config/s390/s390-builtin-types.def: Add new types.
* config/s390/s390-builtins.def: (s390_vec_xl, s390_vec_xld2)
(s390_vec_xlw4): Make the memory operand into a const pointer.
(s390_vec_xld2, s390_vec_xlw4): Add a variant for single precision
float.
* config/s390/s390-c.c (s390_expand_overloaded_builtin): Generate
a new vector type with the alignment of the scalar memory operand.
gcc/testsuite/ChangeLog:
2019-02-07 Andreas Krebbel <krebbel@linux.ibm.com>
* gcc.target/s390/zvector/xl-xst-align-1.c: New test.
* gcc.target/s390/zvector/xl-xst-align-2.c: New test.
From-SVN: r268651
These peepholes match a pair of SImode loads or stores that can be
implemented with a single LDRD or STRD instruction.
When compiling for TARGET_ARM, these peepholes originally created a set
pattern in DI mode to be caught by movdi patterns.
This approach failed to take into account the possibility that the two
matched insns operated on memory with different aliasing information.
The peepholes lost the aliasing information on one of the insns, which
could then cause the scheduler to make an invalid transformation.
This patch changes the peepholes so they generate a PARALLEL expression
of the two relevant loads or stores, which means the aliasing
information of both is kept. Such a PARALLEL pattern is what the
peepholes currently produce for TARGET_THUMB2.
In order to match these new insn patterns, we add two new define_insn's. These
define_insn's use the same checks as the peepholes to find valid insns.
Note that the patterns now created by the peepholes for LDRD and STRD
are very similar to those created by the peepholes for LDM and STM.
Many patterns could be matched by the LDM and STM define_insns, which
means we rely on the order the define_insn patterns are defined in the
machine description, with those for LDRD/STRD defined before those for
LDM/STM.
The difference between the peepholes for LDRD/STRD and those for LDM/STM
are mainly that those for LDRD/STRD have some logic to ensure that the
two registers are consecutive and the first one is even.
Bootstrapped and regtested on arm-none-linux-gnu.
Demonstrated fix of bug 88714 by bootstrapping on armv7l.
gcc/ChangeLog:
2019-02-07 Matthew Malcomson <matthew.malcomson@arm.com>
Jakub Jelinek <jakub@redhat.com>
PR bootstrap/88714
* config/arm/arm-protos.h (valid_operands_ldrd_strd,
arm_count_ldrdstrd_insns): New declarations.
* config/arm/arm.c (mem_ok_for_ldrd_strd): Remove broken handling of
MINUS.
(valid_operands_ldrd_strd): New function.
(arm_count_ldrdstrd_insns): New function.
* config/arm/ldrdstrd.md: Change peepholes to generate PARALLEL SImode
sets instead of single DImode set and define new insns to match this.
gcc/testsuite/ChangeLog:
2019-02-07 Matthew Malcomson <matthew.malcomson@arm.com>
Jakub Jelinek <jakub@redhat.com>
PR bootstrap/88714
* gcc.c-torture/execute/pr88714.c: New test.
* gcc.dg/rtl/arm/ldrd-peepholes.c: New test.
Co-Authored-By: Jakub Jelinek <jakub@redhat.com>
From-SVN: r268644
This fixes a missing = that would cause the array initializer to be a C++
initializer instead of a C one, causing a warning when building with pre-C++11
standards compiler.
Committed under the GCC obvious rules.
gcc/ChangeLog:
* config/aarch64/aarch64-builtins.c (aarch64_fcmla_lane_builtin_data):
Make it a C initializer.
From-SVN: r268614
We currently return cost 2 for NEON REG to REG moves, which would be incorrect
for 64 bit moves. We currently don't have a pattern for this in the neon_move
alternatives because this is a bit of a special case. We would almost never
want it to use this r -> r pattern unless it really has no choice.
As such we add a new neon r -> r move pattern but also hide it from being used
to determine register preferences and also disparage it during LRA.
gcc/ChangeLog:
PR/target 88850
* config/arm/neon.md (*neon_mov<mode>): Add r -> r case.
gcc/testsuite/ChangeLog:
PR/target 88850
* gcc.target/arm/pr88850.c: New test.
From-SVN: r268612
For the Dot Product instructions we have the scheduling types neon_dot and neon_dot_q for the 128-bit versions.
It seems that we're only using the former though, not assigning the neon_dot_q type anywhere.
This patch fixes that by adding the <q> mode attribute suffix to the type, similar to how we do it for other
types in neon.md.
* config/arm/neon.md (neon_<sup>dot<vsi2qi>):
Use neon_dot<q> for type.
(neon_<sup>dot_lane<vsi2qi>): Likewise.
From-SVN: r268611
For the Dot Product instructions we have the scheduling types neon_dot and neon_dot_q for the 128-bit versions.
It seems that we're only using the former though, not assigning the neon_dot_q type anywhere.
This patch fixes that by adding the <q> mode attribute suffix to the type, similar to how we do it for other
types in aarch64-simd.md.
* config/aarch64/aarch64-simd.md (aarch64_<sur>dot<vsi2qi>):
Use neon_dot<q> for type.
(aarch64_<sur>dot_lane<vsi2qi>): Likewise.
(aarch64_<sur>dot_laneq<vsi2qi>): Likewise.
From-SVN: r268610
Because of rank compares, and checks for ck_list, we know that if we
see user_conv_p or ck_list in ics1, we'll also see it in ics2. This
reasoning does not extend to ck_aggr, however, so we might have
ck_aggr conversions starting both ics1 and ics2, which we handle
correctly, or either, which we likely handle by crashing on whatever
path we take depending on whether ck_aggr is in ics1 or ics2.
We crash because, as we search the conversion sequences, we may very
well fail to find what we are looking for, and reach the end of the
sequence, which is unexpected in all paths.
This patch arranges for us to take the same path when ck_aggr is in
ics2 only that we would if it was in ics1 (regardless of ics2), and it
deals with not finding the kind of conversion we look for there.
I've changed the type of the literal constant in the testcase, so as
to hopefully make it well-formed. We'd fail to reject the narrowing
conversion in the original testcase, but that's a separate bug.
for gcc/cp/ChangeLog
PR c++/86218
* call.c (compare_ics): Deal with ck_aggr in either cs.
for gcc/testsuite/ChangeLog
PR c++/86218
* g++.dg/cpp0x/pr86218.C: New.
From-SVN: r268606