The __niter_base function is declared unconditionally but was only
defined for C++11 and later, leading to linker errors when the
testsuite was run with -std=gnu++98 -D_GLIBCXX_DEBUG added to the flags.
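As a minimal sketch of the failure mode (illustrative code, not the
libstdc++ sources): an unconditional declaration whose definition is
guarded on __cplusplus leaves C++98 callers with an unresolved symbol
at link time.

  int helper ();                    // declared for every standard mode
  #if __cplusplus >= 201103L
  int helper () { return 0; }       // defined only for C++11 and later
  #endif
  int use () { return helper (); }  // -std=gnu++98: undefined reference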
* include/debug/vector (__niter_base): Define for C++98.
From-SVN: r263816
* testsuite/25_algorithms/partial_sort_copy/debug/irreflexive_neg.cc:
Fix C++98 test to not use C++11 features.
* testsuite/25_algorithms/fill_n/2.cc: Likewise.
From-SVN: r263815
arm.md has some attributes "arch" and "arch_enabled" to aid enabling
and disabling insn alternatives based on the architecture being
targeted. This patch introduces a similar attribute in the aarch64
backend. The new attribute will be used to enable a new alternative
for the atomic_store insn in a future patch, but is a self-contained
change in itself.
The new attribute has values "any", "fp", "fp16", "simd", and "sve".
These attribute values have been taken from the pre-existing
attributes "fp", "fp16", "simd", and "sve".
The standalone "fp" attribute has been reintroduced in terms of the
"arch" attribute as it's needed for the xgene1.md scheduling file --
the use in this file can't be changed to check for `(eq_attr "arch"
"fp")` as the file is reused by the arm.md machine description whose
'arch' attribute doesn't have an 'fp' value.
2018-08-23 Matthew Malcomson <matthew.malcomson@arm.com>
* config/aarch64/aarch64.md (arches): New enum.
(arch): New enum attr.
(arch_enabled): New attr.
(enabled): Now uses arch_enabled only.
(simd, sve, fp16): Remove attributes.
(fp): Attr now defined in terms of 'arch'.
(*mov<mode>_aarch64, *movsi_aarch64, *movdi_aarch64, *movti_aarch64,
*movhf_aarch64, <optab><fcvt_target><GPF:mode>2,
<FCVT_F2FIXED:fcvt_fixed_insn><GPF:mode>3,
<FCVT_FIXED2F:fcvt_fixed_insn><GPI:mode>3): Merge 'fp' and 'simd'
attributes into 'arch'.
(*movsf_aarch64, *movdf_aarch64, *movtf_aarch64, *add<mode>3_aarch64,
subdi3, neg<mode>2, <optab><mode>3, one_cmpl<mode>2,
*<NLOGICAL:optab>_one_cmpl<mode>3, *xor_one_cmpl<mode>3,
*aarch64_ashl_sisd_or_int_<mode>3, *aarch64_lshr_sisd_or_int_<mode>3,
*aarch64_ashr_sisd_or_int_<mode>3, *aarch64_sisd_ushl): Convert use of
'simd' attribute into 'arch'.
(load_pair_sw_<SX:mode><SX2:mode>, load_pair_dw_<DX:mode><DX2:mode>,
store_pair_sw_<SX:mode><SX2:mode>, store_pair_dw_<DX:mode><DX2:mode>):
Convert use of 'fp' attribute to 'arch'.
* config/aarch64/aarch64-simd.md (move_lo_quad_internal_<mode>,
move_lo_quad_internal_<mode>): (different modes) Merge 'fp' and 'simd'
into 'arch'.
(move_lo_quad_internal_be_<mode>, move_lo_quad_internal_be_<mode>):
(different modes) Merge 'fp' and 'simd' into 'arch'.
(*aarch64_combinez<mode>, *aarch64_combinez_be<mode>): Merge 'fp' and
'simd' into 'arch'.
From-SVN: r263811
The new code that tests which way a comparison is best expressed creates
a pseudo register (by hand) and builds some insns with it. Such
insns will no longer recog() once pseudo registers are no longer
allowed (after reload), but we have an ifcvt pass after reload (ce3).
This patch simply returns early if we cannot create pseudos.
PR rtl-optimization/87026
* expmed.c (canonicalize_comparison): If we can no longer create
pseudoregisters, don't.
From-SVN: r263810
* include/debug/string (insert(__const_iterator, _InIter, _InIter)):
[!_GLIBCXX_USE_CXX11_ABI]: Replace use of C++11-only cbegin() with
begin(), for C++98 compatibility.
From-SVN: r263809
The __gnu_debug string (mostly) implements the C++11 API, but when it
wraps the old COW string many of the member functions in the base class
have the wrong parameter types or return types. This patch makes the
__gnu_debug::string type adapt itself to the base class API, which
actually makes the debug string slightly more conforming than the
underlying string type when using the old ABI.
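A rough sketch of the adaptation (illustrative names and shape only, not
the actual libstdc++ code): the wrapper takes its __const_iterator
typedef from whatever the base class really uses, so its insert/replace
signatures match the old COW string as well as the C++11 ABI string.

  #include <string>

  template<typename _Base>
  struct _Debug_string_sketch : _Base
  {
  #if _GLIBCXX_USE_CXX11_ABI
    // C++11 ABI: base members take const_iterator.
    typedef typename _Base::const_iterator __const_iterator;
  #else
    // Old COW ABI: base members take plain iterator.
    typedef typename _Base::iterator       __const_iterator;
  #endif

    // Mutating members are declared with __const_iterator, so they forward
    // to whichever overload the base actually provides.
    using _Base::_Base;
  };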
* include/bits/basic_string.h [_GLIBCXX_USE_CXX11_ABI]
(basic_string::__const_iterator): Change access to protected.
[!_GLIBCXX_USE_CXX11_ABI] (basic_string::__const_iterator): Define
as typedef for iterator.
* include/debug/string (__const_iterator): Use typedef from base.
(insert(const_iterator, _CharT))
(replace(const_iterator, const_iterator, const basic_string&))
(replace(const_iterator, const_iterator, const _CharT*, size_type))
(replace(const_iterator, const_iterator, const _CharT*))
(replace(const_iterator, const_iterator, size_type, _CharT))
(replace(const_iterator, const_iterator, _InputIter, _InputIter))
(replace(const_iterator, const_iterator, initializer_list<_CharT>)):
Change const_iterator parameters to __const_iterator.
(insert(iterator, size_type, _CharT)): Add C++98 overload.
(insert(const_iterator, _InputIterator, _InputIterator)): Change
const_iterator parameter to __const_iterator.
[!_GLIBCXX_USE_CXX11_ABI]: Add workaround for incorrect return type
of base's member function.
(insert(const_iterator, size_type, _CharT)) [!_GLIBCXX_USE_CXX11_ABI]:
Likewise.
(insert(const_iterator, initializer_list<_CharT>))
[!_GLIBCXX_USE_CXX11_ABI]: Likewise.
* testsuite/21_strings/basic_string/init-list.cc: Remove effective
target directive.
From-SVN: r263808
The AArch32 instruction sets prior to Armv7 do not define the ISB and
DSB instructions that are needed to form a speculation barrier. While
I do not know of any instances of cores based on those instruction
sets being vulnerable to speculative side-channel attacks, it is
possible to run code built for those ISAs on more recent hardware,
where it would become vulnerable.
This patch works around this by using a library call added to libgcc.
That code can then take any platform-specific actions necessary to
ensure safety.
For the moment I've only handled two cases: the library code being
built for Armv7 or later anyway, and running on Linux.
On Linux we can handle this by calling the kernel function that
flushes a small amount of cache; that code ends with an ISB+DSB
sequence when running on an Armv7 or later CPU.
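A minimal sketch of the two strategies (my own illustration; the helper
name is hypothetical, not the libgcc symbol): emit the barrier inline
when the architecture guarantees ISB/DSB, otherwise call out to a
library routine that can take whatever platform-specific action is
needed.

  static inline void
  speculation_barrier (void)
  {
  #if defined (__ARM_ARCH) && __ARM_ARCH >= 7
    /* ISB and DSB are architecturally defined here, so emit them inline.  */
    __asm__ volatile ("isb\n\tdsb sy" ::: "memory");
  #else
    /* Pre-Armv7: defer to a helper (hypothetical name) that can use a
       platform-specific mechanism, e.g. a kernel-provided sequence.  */
    extern void __arm_speculation_barrier_helper (void);
    __arm_speculation_barrier_helper ();
  #endif
  }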
gcc:
PR target/86951
* config/arm/arm-protos.h (arm_emit_speculation_barrier): New
prototype.
* config/arm/arm.c (speculation_barrier_libfunc): New static
variable.
(arm_init_libfuncs): Initialize it.
(arm_emit_speculation_barrier): New function.
* config/arm/arm.md (speculation_barrier): Call
arm_emit_speculation_barrier for architectures that do not have
DSB or ISB.
(speculation_barrier_insn): Only match on Armv7 or later.
libgcc:
PR target/86951
* config/arm/lib1funcs.asm (speculation_barrier): New function.
* config/arm/t-arm (LIB1ASMFUNCS): Add it to list of functions
to build.
From-SVN: r263806
aarch64_vectorize_vec_perm_const was failing to set one_vector_p
if the permute had only a single input. This in turn was hiding
a problem in the SVE TBL handling: it accepted single-vector
variable-length permutes, but sent them through the general
two-vector aarch64_expand_sve_vec_perm, which is only set up
to handle constant-length permutes.
2018-08-23 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* config/aarch64/aarch64.c (aarch64_evpc_sve_tbl): Fix handling
of single-vector TBLs.
(aarch64_vectorize_vec_perm_const): Set one_vector_p when only
one input is given.
gcc/testsuite/
* gcc.dg/vect/no-vfa-vect-depend-2.c: Remove XFAIL.
* gcc.dg/vect/no-vfa-vect-depend-3.c: Likewise.
* gcc.dg/vect/pr65947-13.c: Update for vect_fold_extract_last.
* gcc.dg/vect/pr80631-2.c: Likewise.
From-SVN: r263804
This patch fixes a typo in aarch64_expand_vec_perm_const_1 that I
introduced as part of the SVE changes. I don't know of any cases in
which it has any practical effect, since we'll eventually try to use
TBL as a variable permute instead. Having the code is still an
important part of defining the interface properly and so we shouldn't
simply drop it.
2018-08-23 Richard Sandiford <richard.sandiford@arm.com>
gcc/
PR target/85910
* config/aarch64/aarch64.c (aarch64_expand_vec_perm_const_1): Fix
aarch64_evpc_tbl guard.
From-SVN: r263803
The Fortran standard specifies (e.g. F2018 7.4.3.2) that intrinsic
procedures shall treat positive and negative real zero as equivalent,
unless it is explicitly specified otherwise. For {max,min}val there
is no such explicit mention. Thus, remove code to handle signed
zeros.
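A tiny illustration (mine, not part of the patch) of why the two zeros
can be treated as equivalent: they compare equal, so a plain
MAX_EXPR/MIN_EXPR-style comparison may legitimately return either one.

  #include <cassert>

  int main ()
  {
    double pz = +0.0, nz = -0.0;
    assert (pz == nz);                // positive and negative zero compare equal
    double mx = (pz > nz) ? pz : nz;  // a MAX_EXPR-style pick: either zero is acceptable
    return mx == 0.0 ? 0 : 1;
  }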
2018-08-23 Janne Blomqvist <blomqvist.janne@gmail.com>
* trans-intrinsic.c (gfc_conv_intrinsic_minmaxval): Delete
HONOR_SIGNED_ZEROS checks.
From-SVN: r263802
* testsuite/20_util/reference_wrapper/lwg2993.cc: Fix C++11 test to
not use C++14 feature.
* testsuite/23_containers/list/68222_neg.cc: Likewise.
From-SVN: r263801
2018-08-23 Paul Thomas <pault@gcc.gnu.org>
PR fortran/86863
* resolve.c (resolve_typebound_call): If the TBP is not marked
as a subroutine, check the specific symbol.
2018-08-23 Paul Thomas <pault@gcc.gnu.org>
PR fortran/86863
* gfortran.dg/submodule_32.f08: New test.
From-SVN: r263799
The container requirements imply that max_size() can't exceed the
maximum value of the container's difference_type. Enforce this for
std::vector and std::deque, and add checks to ensure the container
doesn't grow larger than that.
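A minimal sketch of the intended cap (illustrative helpers in the spirit
of the new _S_max_size/_S_check_init_len, not the actual
implementation): the usable maximum is the smaller of what the allocator
reports and what the difference_type (ptrdiff_t here) can represent, and
requested lengths are checked against it up front.

  #include <cstddef>
  #include <limits>
  #include <memory>
  #include <stdexcept>

  template<typename _Tp, typename _Alloc = std::allocator<_Tp>>
  std::size_t
  sketch_max_size (const _Alloc& __a = _Alloc ())
  {
    const std::size_t __alloc_max =
      std::allocator_traits<_Alloc>::max_size (__a);
    const std::size_t __diff_max =
      static_cast<std::size_t> (std::numeric_limits<std::ptrdiff_t>::max ());
    return __alloc_max < __diff_max ? __alloc_max : __diff_max;
  }

  template<typename _Tp>
  std::size_t
  sketch_check_init_len (std::size_t __n)
  {
    // Throw before attempting an allocation that exceeds the cap.
    if (__n > sketch_max_size<_Tp> ())
      throw std::length_error ("requested length exceeds max_size()");
    return __n;
  }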
PR libstdc++/78448
* include/bits/deque.tcc (deque::_M_range_initialize): Use
_S_check_init_len to check size.
(deque::_M_push_back_aux, deque::_M_push_front_aux): Throw length
error if size would exceed max_size().
* include/bits/stl_deque.h (_Deque_base::size_type): Remove typedef.
(_Deque_base(_Deque_base&&, const allocator_type&, size_t)): Use
size_t instead of size_type.
(deque(size_type, const allocator_type&))
(deque(size_type, const value_type&, const allocator_type&))
(deque::_M_initialize_dispatch): Use _S_check_init_len to check size.
(deque::max_size): Call _S_max_size.
(deque::_S_check_init_len, deque::_S_max_size): New functions.
* include/bits/stl_vector.h (vector(size_type, const allocator_type&))
(vector(size_type, const value_type&, const allocator_type&))
(vector::_M_initialize_dispatch, vector::_M_range_initialize): Use
_S_check_init_len to check size.
(vector::max_size): Call _S_max_size.
(vector::_M_check_len): Prevent max from being expanded as a
function-like macro.
(vector::_S_check_init_len, vector::_S_max_size): New functions.
* include/bits/vector.tcc (vector::_M_assign_aux): Use
_S_check_init_len to check size.
* testsuite/23_containers/deque/capacity/max_size.cc: New test.
* testsuite/23_containers/vector/capacity/max_size.cc: New test.
From-SVN: r263789
2018-08-22 Thomas Koenig <tkoenig@gcc.gnu.org>
* gfortran.texi: Mention that asynchronous I/O does
not work on systems which lack condition variables, such
as AIX.
2018-08-22 Thomas Koenig <tkoenig@gcc.gnu.org>
* async.h: Set ASYNC_IO to zero if _AIX is defined.
(struct adv_cond): If ASYNC_IO is zero, the struct has no members.
(async_unit): If ASYNC_IO is zero, remove unneeded members.
From-SVN: r263788
2018-08-22 Andrew Benson <abensonca@gmail.com>
* module.c (load_generic_interfaces): Move call to find_symbol()
so that it only occurs if actually needed.
From-SVN: r263784
2018-08-22 Segher Boessenkool <segher@kernel.crashing.org>
PR rtl-optimization/86771
* combine.c (try_combine): Do not allow splitting a resulting PARALLEL
of two SETs into those two SETs, one to be placed at i2, if that SET's
destination is modified between i2 and i3.
From-SVN: r263780
gfortran now always uses MAX_EXPR/MIN_EXPR for MAX/MIN intrinsics, so the
AArch64-specific FMAX/FMIN tests are no longer valid.
2018-08-22 Szabolcs Nagy <szabolcs.nagy@arm.com>
* gfortran.dg/max_fmax_aarch64.f90: Rename to...
* gfortran.dg/max_expr.f90: ...this.
* gfortran.dg/min_fmin_aarch64.f90: Rename to...
* gfortran.dg/min_expr.f90: ...this.
From-SVN: r263778
When combine splits a resulting parallel into its two SETs, it has to
place one at i2, and the other stays at i3. This does not work if the
destination of the SET that will be placed at i2 is modified between
i2 and i3. This patch fixes it.
* combine.c (try_combine): Do not allow splitting a resulting PARALLEL
of two SETs into those two SETs, one to be placed at i2, if that SET's
destination is modified between i2 and i3.
From-SVN: r263776
This patch is the second part of the fix for PR 86725. The problem
in the original test is that for:
  outer1:
    x_1 = PHI <x_4(outer2), ...>;
    ...

  inner:
    x_2 = PHI <x_1(outer1), x_3(...)>;
    ...
    x_3 = ...;
    ...

  outer2:
    x_4 = PHI <x_3(inner)>;
    ...
there are corner cases in which it is possible to classify the
inner phi as an induction but not the outer phi. The -4.c test
is a more direct example.
After failing to classify x_1 as an induction, we go on to
classify it as a double reduction (which is basically true).
But we still classified the inner phi as an induction rather
than as part of a reduction, leading to an ICE when trying
to vectorise the outer phi.
We analyse the phis for outer loops first, so the simplest
fix is not to classify the phi as an induction if outer loop
analysis said that it should be a reduction.
The -2.c test is from the original PR. The -3.c test is a
version in which "wo" really is used as a reduction; this was
already correctly rejected, but for the wrong reason ("inner-loop
induction only used outside of the outer vectorized loop").
The -4.c test is another way of tickling the original problem
without relying on the undefinedness of signed overflow.
The -5.c test shows an (uninteresting) example in which the
patch prevents a spurious failure to vectorise the outer loop.
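For reference, a self-contained loop nest with the double-reduction
shape discussed above (my own illustration, not one of the new tests):
"wo" is accumulated in the inner loop and carried around the outer loop,
so the inner phi for it belongs to a double reduction rather than being
an induction.

  int
  double_reduction (const int *a, int n, int m)
  {
    int wo = 0;
    for (int i = 0; i < n; ++i)
      for (int j = 0; j < m; ++j)
        wo += a[i * m + j];   /* inner phi feeds the outer-loop cycle */
    return wo;
  }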
2018-08-22 Richard Sandiford <richard.sandiford@arm.com>
gcc/
PR tree-optimization/86725
* tree-vect-loop.c (vect_inner_phi_in_double_reduction_p): New
function.
(vect_analyze_scalar_cycles_1): Check it.
gcc/testsuite/
PR tree-optimization/86725
* gcc.dg/vect/no-scevccp-pr86725-2.c: New test.
* gcc.dg/vect/no-scevccp-pr86725-3.c: Likewise.
* gcc.dg/vect/no-scevccp-pr86725-4.c: Likewise.
* gcc.dg/vect/no-scevccp-pr86725-5.c: Likewise.
From-SVN: r263774
This patch is the first part of the fix for PR 86725. We would
treat x_1 in:
  outer1:
    x_1 = PHI <x_4(outer2), ...>;
    ...

  inner:
    x_2 = ...x_1...;
    ...
    x_3 = ...;
    ...

  outer2:
    x_4 = PHI <x_3(inner)>;
    ...
as a double reduction without checking what kind of statement x_2 is.
In practice it has to be a phi, since for other x_2, x_1 would simply
be a loop invariant that gets used for every inner loop iteration.
The idea with doing this patch first is that, by checking x_2 really
is a phi, we can hand off the validation of the rest of the reduction
to the phi analysis in the inner loop.
The test case is a variant of the one in the PR.
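As a contrast, a sketch (mine, not the new test) of the non-phi case the
check rules out: here the outer-loop value "last" is only read by a
non-phi statement in the inner loop, so it is simply invariant across
the inner loop rather than the start of a double reduction.

  int
  not_a_double_reduction (const int *a, int n, int m)
  {
    int last = 0;
    for (int i = 0; i < n; ++i)
      {
        int s = 0;
        for (int j = 0; j < m; ++j)
          s += a[i * m + j] * last;   /* "last" is invariant in the inner loop */
        last = s;
      }
    return last;
  }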
2018-08-22 Richard Sandiford <richard.sandiford@arm.com>
gcc/
PR tree-optimization/86725
* tree-vect-loop.c (vect_is_simple_reduction): When treating
an outer loop phi as a double reduction, make sure that the
single user of the phi result is an inner loop phi.
gcc/testsuite/
PR tree-optimization/86725
* gcc.dg/vect/no-scevccp-pr86725-1.c: New test.
From-SVN: r263773
We could vectorise:
  for (...)
    {
      a[0] = ...;
      a[1] = ...;
      a[2] = ...;
      a[3] = ...;
      a += stride;
    }
(including the case when stride == 8) but not:
  for (...)
    {
      a[0] = ...;
      a[1] = ...;
      a[2] = ...;
      a[3] = ...;
      a += 8;
    }
(where the stride is always 8). The former was treated as a "grouped
and strided" store, while the latter was treated as a grouped store
with gaps, which we don't support.
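Fleshed out into a compilable form (my own version, not the new test),
the previously unsupported case looks like this: each iteration writes
elements 0-3 of an 8-element group, so the group has a gap at the end
and the step is always 8.

  void
  grouped_store_with_gap (int *a, const int *b, int n)
  {
    for (int i = 0; i < n; ++i)
      {
        a[0] = b[i] + 1;
        a[1] = b[i] + 2;
        a[2] = b[i] + 3;
        a[3] = b[i] + 4;
        a += 8;             /* constant step larger than the group size */
      }
  }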
This patch makes us treat groups of stores with gaps at the end as
strided groups too. I tried to go through all uses of STMT_VINFO_STRIDED_P
and all vector uses of DR_STEP to see whether there were any hard-baked
assumptions, but couldn't see any. I wondered whether we should relax:
  /* We do not have to consider dependences between accesses that belong
     to the same group, unless the stride could be smaller than the
     group size.  */
  if (DR_GROUP_FIRST_ELEMENT (stmtinfo_a)
      && (DR_GROUP_FIRST_ELEMENT (stmtinfo_a)
          == DR_GROUP_FIRST_ELEMENT (stmtinfo_b))
      && !STMT_VINFO_STRIDED_P (stmtinfo_a))
    return false;
for cases in which the step is constant and the absolute step is known
to be greater than the group size, but data dependence analysis should
already return chrec_known for those cases.
The new test is a version of vect-avg-15.c with the variable step
replaced by a constant one.
A natural follow-on would be to do the same for groups with gaps in
the middle:
  /* Check that the distance between two accesses is equal to the type
     size. Otherwise, we have gaps.  */
  diff = (TREE_INT_CST_LOW (DR_INIT (data_ref))
          - TREE_INT_CST_LOW (prev_init)) / type_size;
  if (diff != 1)
    {
      [...]
      if (DR_IS_WRITE (data_ref))
        {
          if (dump_enabled_p ())
            dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
                             "interleaved store with gaps\n");
          return false;
        }
But I think we should do that separately and see what the fallout
from this change is first.
2018-08-22 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vect-data-refs.c (vect_analyze_group_access_1): Convert
grouped stores with gaps to a strided group.
gcc/testsuite/
* gcc.dg/vect/vect-avg-16.c: New test.
* gcc.dg/vect/slp-37.c: Expect the loop to be vectorized.
* gcc.dg/vect/vect-strided-u8-i8-gap4.c,
* gcc.dg/vect/vect-strided-u8-i8-gap4-big-array.c: Likewise for
the second loop in main1.
From-SVN: r263772
get_load_store_type & co were testing STMT_VINFO_STRIDED_P on individual
statements in a group instead of on the first statement. Fixing this has
no effect on its own, but is needed by a later patch.
2018-08-22 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vect-stmts.c (get_group_load_store_type)
(get_load_store_type): Only test STMT_VINFO_STRIDED_P for the
first statement in a group.
From-SVN: r263771
gcc/
PR bootstrap/81033
PR target/81733
PR target/52795
* gcc/dwarf2out.c (FUNC_SECOND_SECT_LABEL): New.
(dwarf2out_switch_text_section): Generate a local label for the second
function sub-section and apply it as the second FDE start label.
* gcc/final.c (final_scan_insn_1): Emit second FDE label after the second
sub-section start.
From-SVN: r263763
2018-08-22 Richard Biener <rguenther@suse.de>
PR tree-optimization/86988
* tree-vrp.c (vrp_prop::check_mem_ref): Bail out on VLAs.
* g++.dg/pr86988.C: New testcase.
From-SVN: r263762