The AArch32 instruction sets prior to Armv7 do not define the ISB and
DSB instructions that are needed to form a speculation barrier. While
I do not know of any instances of cores based on those instruction
sets being vulnerable to speculative side channel attacks it is
possible to run code built for those ISAs on more recent hardware
where they would become vulnerable.
This patch works around this by using a library call added to libgcc.
That code can then take any platform-specific actions necessary to
ensure safety.
For the moment I've only handled two cases: the library code being
built for armv7 or later anyway and running on Linux.
On Linux we can handle this by calling the kernel function that will
flush a small amount of cache. Such a sequence ends with a ISB+DSB
sequence if running on an Armv7 or later CPU.
gcc:
PR target/86951
* config/arm/arm-protos.h (arm_emit_speculation_barrier): New
prototype.
* config/arm/arm.c (speculation_barrier_libfunc): New static
variable.
(arm_init_libfuncs): Initialize it.
(arm_emit_speculation_barrier): New function.
* config/arm/arm.md (speculation_barrier): Call
arm_emit_speculation_barrier for architectures that do not have
DSB or ISB.
(speculation_barrier_insn): Only match on Armv7 or later.
libgcc:
PR target/86951
* config/arm/lib1funcs.asm (speculation_barrier): New function.
* config/arm/t-arm (LIB1ASMFUNCS): Add it to list of functions
to build.
From-SVN: r263806
aarch64_vectorize_vec_perm_const was failing to set one_vector_p
if the permute had only a single input. This in turn was hiding
a problem in the SVE TBL handling: it accepted single-vector
variable-length permutes, but sent them through the general
two-vector aarch64_expand_sve_vec_perm, which is only set up
to handle constant-length permutes.
2018-08-23 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* config/aarch64/aarch64.c (aarch64_evpc_sve_tbl): Fix handling
of single-vector TBLs.
(aarch64_vectorize_vec_perm_const): Set one_vector_p when only
one input is given.
gcc/testsuite/
* gcc.dg/vect/no-vfa-vect-depend-2.c: Remove XFAIL.
* gcc.dg/vect/no-vfa-vect-depend-3.c: Likewise.
* gcc.dg/vect/pr65947-13.c: Update for vect_fold_extract_last.
* gcc.dg/vect/pr80631-2.c: Likewise.
From-SVN: r263804
This patch fixes a typo in aarch64_expand_vec_perm_const_1 that I
introduced as part of the SVE changes. I don't know of any cases in
which it has any practical effect, since we'll eventually try to use
TBL as a variable permute instead. Having the code is still an
important part of defining the interface properly and so we shouldn't
simply drop it.
2018-08-23 Richard Sandiford <richard.sandiford@arm.com>
gcc/
PR target/85910
* config/aarch64/aarch64.c (aarch64_expand_vec_perm_const_1): Fix
aarch64_evpc_tbl guard.
From-SVN: r263803
The Fortran standard specifies (e.g. F2018 7.4.3.2) that intrinsic
procedures shall treat positive and negative real zero as equivalent,
unless it is explicitly specified otherwise. For {max,min}val there
is no such explicit mention. Thus, remove code to handle signed
zeros.
2018-08-23 Janne Blomqvist <blomqvist.janne@gmail.com>
* trans-intrinsic.c (gfc_conv_intrinsic_minmaxval): Delete
HONOR_SIGNED_ZEROS checks.
From-SVN: r263802
* testsuite/20_util/reference_wrapper/lwg2993.cc: Fix C++11 test to
not use C++14 feature.
* testsuite/23_containers/list/68222_neg.cc: Likewise.
From-SVN: r263801
2017-08-23 Paul Thomas <pault@gcc.gnu.org>
PR fortran/86863
* resolve.c (resolve_typebound_call): If the TBP is not marked
as a subroutine, check the specific symbol.
2017-08-23 Paul Thomas <pault@gcc.gnu.org>
PR fortran/86863
* gfortran.dg/submodule_32.f08: New test.
From-SVN: r263799
The container requirements imply that max_size() can't exceed the
maximum value of the container's difference_type. Enforce this for
std::vector and std::deque, and add checks to ensure the container
doesn't grow larger than that.
PR libstdc++/78448
* include/bits/deque.tcc (deque::_M_range_initialize): Use
_S_check_init_len to check size.
(deque::_M_push_back_aux, deque::_M_push_front_aux): Throw length
error if size would exceed max_size().
* include/bits/stl_deque.h (_Deque_base::size_type): Remove typedef.
(_Deque_base(_Deque_base&&, const allocator_type&, size_t)): Use
size_t instead of size_type.
(deq(size_type, const allocator_type&)
(deq(size_type, const value_type&, const allocator_type&)
(deque::_M_initialize_dispatch): Use _S_check_init_len to check size.
(deque::max_size): Call _S_max_size.
(deque::_S_check_init_len, deque::_S_max_size): New functions.
* include/bits/stl_vector.h (vector(size_type, const allocator_type&))
(vector(size_type, const value_type&, const allocator_type&))
(vector::_M_initialize_dispatch, vector::_M_range_initialize): Use
_S_check_init_len to check size.
(vector::max_size): Call _S_max_size.
(vector::_M_check_len): Prevent max from being expanded as a
function-like macro.
(vector::_S_check_init_len, vector::_S_max_size): New functions.
* include/bits/vector.tcc (vector::_M_assign_aux): Use
_S_check_init_len to check size.
* testsuite/23_containers/deque/capacity/max_size.cc: New test.
* testsuite/23_containers/vector/capacity/max_size.cc: New test.
From-SVN: r263789
2018-08-22 Thomas Koenig <tkoenig@gcc.gnu.org>
* gfortran.texi: Mention that asynchronous I/O does
not work on systems which lack condition variables, such
as AIX.
2018-08-22 Thomas Koenig <tkoenig@gcc.gnu.org>
* async.h: Set ASYNC_IO to zero if _AIX is defined.
(struct adv_cond): If ASYNC_IO is zero, the struct has no members.
(async_unit): If ASYNC_IO is zero, remove unneeded members.
From-SVN: r263788
2018-08-22 Andrew Benson <abensonca@gmail.com>
* module.c (load_generic_interfaces): Move call to find_symbol()
so that only occurs if actually needed.
From-SVN: r263784
2018-08-22 Segher Boessenkool <segher@kernel.crashing.org>
PR rtl-optimization/86771
* combine.c (try_combine): Do not allow splitting a resulting PARALLEL
of two SETs into those two SETs, one to be placed at i2, if that SETs
destination is modified between i2 and i3.
From-SVN: r263780
gfortran now always uses MAX_EXPR/MIN_EXPR for MAX/MIN intrinsics, so the
AArch64 specific FMAX/FMIN tests are no longer valid.
2018-08-22 Szabolcs Nagy <szabolcs.nagy@arm.com>
* gfortran.dg/max_fmax_aarch64.f90: Rename to...
* gfortran.dg/max_expr.f90: ...this.
* gfortran.dg/min_fmin_aarch64.f90: Rename to...
* gfortran.dg/min_expr.f90: ...this.
From-SVN: r263778
When combine splits a resulting parallel into its two SETs, it has to
place one at i2, and the other stays at i3. This does not work if the
destination of the SET that will be placed at i2 is modified between
i2 and i3. This patch fixes it.
* combine.c (try_combine): Do not allow splitting a resulting PARALLEL
of two SETs into those two SETs, one to be placed at i2, if that SETs
destination is modified between i2 and i3.
From-SVN: r263776
This patch is the second part of the fix for PR 86725. The problem
in the original test is that for:
outer1:
x_1 = PHI <x_4(outer2), ...>;
...
inner:
x_2 = PHI <x_1(outer1), x_3(...)>;
...
x_3 = ...;
...
outer2:
x_4 = PHI <x_3(inner)>;
...
there are corner cases in which it is possible to classify the
inner phi as an induction but not the outer phi. The -4.c test
is a more direct example.
After failing to classify x_1 as an induction, we go on to
classify it as a double reduction (which is basically true).
But we still classified the inner phi as an induction rather
than as part of a reduction, leading to an ICE when trying
to vectorise the outer phi.
We analyse the phis for outer loops first, so the simplest
fix is not to classify the phi as an induction if outer loop
analysis said that it should be a reduction.
The -2.c test is from the original PR. The -3.c test is a
version in which "wo" really is used a reduction; this was
already correctly rejected, but for the wrong reason ("inner-loop
induction only used outside of the outer vectorized loop").
The -4.c test is another way of tickling the original problem
without relying on the undefinedness of signed overflow.
The -5.c test shows an (uninteresting) example in which the
patch prevents a spurious failure to vectorise the outer loop.
2018-08-22 Richard Sandiford <richard.sandiford@arm.com>
gcc/
PR tree-optimization/86725
* tree-vect-loop.c (vect_inner_phi_in_double_reduction_p): New
function.
(vect_analyze_scalar_cycles_1): Check it.
gcc/testsuite/
PR tree-optimization/86725
* gcc.dg/vect/no-scevccp-pr86725-2.c: New test.
* gcc.dg/vect/no-scevccp-pr86725-3.c: Likewise.
* gcc.dg/vect/no-scevccp-pr86725-4.c: Likewise.
* gcc.dg/vect/no-scevccp-pr86725-5.c: Likewise.
From-SVN: r263774
This patch is the first part of the fix for PR 86725. We would
treat x_1 in:
outer1:
x_1 = PHI <x_4(outer2), ...>;
...
inner:
x_2 = ...x_1...;
...
x_3 = ...;
...
outer2:
x_4 = PHI <x_3(inner)>;
...
as a double reduction without checking what kind of statement x_2 is.
In practice it has to be a phi, since for other x_2, x_1 would simply
be a loop invariant that gets used for every inner loop iteration.
The idea with doing this patch first is that, by checking x_2 really
is a phi, we can hand off the validation of the rest of the reduction
to the phi analysis in the inner loop.
The test case is a variant of the one in the PR.
2018-08-22 Richard Sandiford <richard.sandiford@arm.com>
gcc/
PR tree-optimization/86725
* tree-vect-loop.c (vect_is_simple_reduction): When treating
an outer loop phi as a double reduction, make sure that the
single user of the phi result is an inner loop phi.
gcc/testsuite/
PR tree-optimization/86725
* gcc.dg/vect/no-scevccp-pr86725-1.c: New test.
From-SVN: r263773
We could vectorise:
for (...)
{
a[0] = ...;
a[1] = ...;
a[2] = ...;
a[3] = ...;
a += stride;
}
(including the case when stride == 8) but not:
for (...)
{
a[0] = ...;
a[1] = ...;
a[2] = ...;
a[3] = ...;
a += 8;
}
(where the stride is always 8). The former was treated as a "grouped
and strided" store, while the latter was treated as a grouped store
with gaps, which we don't support.
This patch makes us treat groups of stores with gaps at the end as
strided groups too. I tried to go through all uses of STMT_VINFO_STRIDED_P
and all vector uses of DR_STEP to see whether there were any hard-baked
assumptions, but couldn't see any. I wondered whether we should relax:
/* We do not have to consider dependences between accesses that belong
to the same group, unless the stride could be smaller than the
group size. */
if (DR_GROUP_FIRST_ELEMENT (stmtinfo_a)
&& (DR_GROUP_FIRST_ELEMENT (stmtinfo_a)
== DR_GROUP_FIRST_ELEMENT (stmtinfo_b))
&& !STMT_VINFO_STRIDED_P (stmtinfo_a))
return false;
for cases in which the step is constant and the absolute step is known
to be greater than the group size, but data dependence analysis should
already return chrec_known for those cases.
The new test is a version of vect-avg-15.c with the variable step
replaced by a constant one.
A natural follow-on would be to do the same for groups with gaps in
the middle:
/* Check that the distance between two accesses is equal to the type
size. Otherwise, we have gaps. */
diff = (TREE_INT_CST_LOW (DR_INIT (data_ref))
- TREE_INT_CST_LOW (prev_init)) / type_size;
if (diff != 1)
{
[...]
if (DR_IS_WRITE (data_ref))
{
if (dump_enabled_p ())
dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
"interleaved store with gaps\n");
return false;
}
But I think we should do that separately and see what the fallout
from this change is first.
2018-08-22 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vect-data-refs.c (vect_analyze_group_access_1): Convert
grouped stores with gaps to a strided group.
gcc/testsuite/
* gcc.dg/vect/vect-avg-16.c: New test.
* gcc.dg/vect/slp-37.c: Expect the loop to be vectorized.
* gcc.dg/vect/vect-strided-u8-i8-gap4.c,
* gcc.dg/vect/vect-strided-u8-i8-gap4-big-array.c: Likewise for
the second loop in main1.
From-SVN: r263772
get_load_store_type & co were testing STMT_VINFO_STRIDED_P on individual
statements in a group instead of the first. This has no effect on
its own, but is needed by a later patch.
2018-08-22 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vect-stmts.c (get_group_load_store_type)
(get_load_store_type): Only test STMT_VINFO_STRIDED_P for the
first statement in a group.
From-SVN: r263771
gcc/
PR bootstrap/81033
PR target/81733
PR target/52795
* gcc/dwarf2out.c (FUNC_SECOND_SECT_LABEL): New.
(dwarf2out_switch_text_section): Generate a local label for the second
function sub-section and apply it as the second FDE start label.
* gcc/final.c (final_scan_insn_1): Emit second FDE label after the second
sub-section start.
From-SVN: r263763
2018-08-22 Richard Biener <rguenther@suse.de>
PR tree-optimization/86988
* tree-vrp.c (vrp_prop::check_mem_ref): Bail out on VLAs.
* g++.dg/pr86988.C: New testcase.
From-SVN: r263762
for gcc/ChangeLog
* config/rs6000/rs6000.c (SMALL_DATA_RELOC, SMALL_DATA_REG): Add
a comment about how uses of r2 for .sdata2 come about.
From-SVN: r263760
2018-08-21 François Dumont <fdumont@gcc.gnu.org>
P0646R1 Improving the Return Value of Erase-Like Algorithms I
* include/debug/forward_list (forward_list::__remove_return_type):
Define typedef as size_type or void, according to __cplusplus value.
(_GLIBCXX_FWDLIST_REMOVE_RETURN_TYPE_TAG): Define macro as abi-tag or
empty, according to __cplusplus value.
(_GLIBCXX20_ONLY): Define macro.
(forward_list::remove, forward_list::unique): Use typedef and macro
to change return type and add abi-tag for C++2a. Return number of
removed elements for C++2a.
(forward_list::remove_if<Pred>, forward_list::unique<BinPred>): Use
typedef to change return type for C++2a. Return number of removed
elements for C++2a.
* include/debug/list (list::__remove_return_type): Define typedef as
size_type or void, according to __cplusplus value.
(_GLIBCXX_LIST_REMOVE_RETURN_TYPE_TAG): Define macro as abi-tag or
empty, according to __cplusplus value.
(_GLIBCXX20_ONLY): Define macro.
(list::remove, list::unique): Use typedef and macro to change return
type and add abi-tag for C++2a. Return number of removed elements for
C++2a.
(list::remove_if<Predicate>, list::unique<BinaryPredicate>): Use typedef
to change return type for C++2a. Return number of removed elements for
C++2a.
From-SVN: r263752
For floating point types, the question is what MAX(a, NaN) or MIN(a,
NaN) should return (where "a" is a normal number). There are valid
usecases for returning either one, but the Fortran standard doesn't
specify which one should be chosen. Also, there is no consensus among
other tested compilers. In short, it's a mess. So lets just do
whatever is fastest, which is using MAX_EXPR/MIN_EXPR which are not
defined to do anything in particular if one of the operands is a NaN.
gcc/fortran/ChangeLog:
2018-08-21 Janne Blomqvist <jb@gcc.gnu.org>
* trans-intrinsic.c (gfc_conv_intrinsic_minmax): Use
MAX_EXPR/MIN_EXPR unconditionally for real arguments.
* gfortran.texi (Compiler Characteristics): Document MAX/MIN
behavior wrt NaN.
gcc/testsuite/ChangeLog:
2018-08-21 Janne Blomqvist <jb@gcc.gnu.org>
* gfortran.dg/nan_1.f90: Remove tests that test MAX/MIN with NaNs.
From-SVN: r263751
2018-08-21 Nicolas Koenig <koenigni@gcc.gnu.org>
Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/25829
* gfortran.texi: Add description of asynchronous I/O.
* trans-decl.c (gfc_finish_var_decl): Treat asynchronous variables
as volatile.
* trans-io.c (gfc_build_io_library_fndecls): Rename st_wait to
st_wait_async and change argument spec from ".X" to ".w".
(gfc_trans_wait): Pass ID argument via reference.
2018-08-21 Nicolas Koenig <koenigni@gcc.gnu.org>
Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/25829
* gfortran.dg/f2003_inquire_1.f03: Add write statement.
* gfortran.dg/f2003_io_1.f03: Add wait statement.
2018-08-21 Nicolas Koenig <koenigni@gcc.gnu.org>
Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/25829
* Makefile.am: Add async.c to gfor_io_src.
Add async.h to gfor_io_headers.
* Makefile.in: Regenerated.
* gfortran.map: Add _gfortran_st_wait_async.
* io/async.c: New file.
* io/async.h: New file.
* io/close.c: Include async.h.
(st_close): Call async_wait for an asynchronous unit.
* io/file_pos.c (st_backspace): Likewise.
(st_endfile): Likewise.
(st_rewind): Likewise.
(st_flush): Likewise.
* io/inquire.c: Add handling for asynchronous PENDING
and ID arguments.
* io/io.h (st_parameter_dt): Add async bit.
(st_parameter_wait): Correct.
(gfc_unit): Add au pointer.
(st_wait_async): Add prototype.
(transfer_array_inner): Likewise.
(st_write_done_worker): Likewise.
* io/open.c: Include async.h.
(new_unit): Initialize asynchronous unit.
* io/transfer.c (async_opt): New struct.
(wrap_scalar_transfer): New function.
(transfer_integer): Call wrap_scalar_transfer to do the work.
(transfer_real): Likewise.
(transfer_real_write): Likewise.
(transfer_character): Likewise.
(transfer_character_wide): Likewise.
(transfer_complex): Likewise.
(transfer_array_inner): New function.
(transfer_array): Call transfer_array_inner.
(transfer_derived): Call wrap_scalar_transfer.
(data_transfer_init): Check for asynchronous I/O.
Perform a wait operation on any pending asynchronous I/O
if the data transfer is synchronous. Copy PDT and enqueue
thread for data transfer.
(st_read_done_worker): New function.
(st_read_done): Enqueue transfer or call st_read_done_worker.
(st_write_done_worker): New function.
(st_write_done): Enqueue transfer or call st_read_done_worker.
(st_wait): Document as no-op for compatibility reasons.
(st_wait_async): New function.
* io/unit.c (insert_unit): Use macros LOCK, UNLOCK and TRYLOCK;
add NOTE where necessary.
(get_gfc_unit): Likewise.
(init_units): Likewise.
(close_unit_1): Likewise. Call async_close if asynchronous.
(close_unit): Use macros LOCK and UNLOCK.
(finish_last_advance_record): Likewise.
(newunit_alloc): Likewise.
* io/unix.c (find_file): Likewise.
(flush_all_units_1): Likewise.
(flush_all_units): Likewise.
* libgfortran.h (generate_error_common): Add prototype.
* runtime/error.c: Include io.h and async.h.
(generate_error_common): New function.
2018-08-21 Nicolas Koenig <koenigni@gcc.gnu.org>
Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/25829
* testsuite/libgomp.fortran/async_io_1.f90: New test.
* testsuite/libgomp.fortran/async_io_2.f90: New test.
* testsuite/libgomp.fortran/async_io_3.f90: New test.
* testsuite/libgomp.fortran/async_io_4.f90: New test.
* testsuite/libgomp.fortran/async_io_5.f90: New test.
* testsuite/libgomp.fortran/async_io_6.f90: New test.
* testsuite/libgomp.fortran/async_io_7.f90: New test.
Co-Authored-By: Thomas Koenig <tkoenig@gcc.gnu.org>
From-SVN: r263750
PR c++/86499
* parser.c (cp_parser_lambda_introducer): Give error if a non-local
lambda has a capture-default.
* g++.dg/cpp0x/lambda/lambda-non-local.C: New test.
* g++.dg/cpp0x/lambda/lambda-this10.C: Adjust dg-error.
From-SVN: r263749
PR c++/65043
* call.c (standard_conversion): Set check_narrowing.
* typeck2.c (check_narrowing): Use CP_INTEGRAL_TYPE_P rather
than comparing with INTEGER_TYPE.
* g++.dg/concepts/pr67595.C: Add dg-warning.
* g++.dg/cpp0x/Wnarrowing11.C: New test.
* g++.dg/cpp0x/Wnarrowing12.C: New test.
* g++.dg/cpp0x/rv-cast5.C: Add static_cast.
From-SVN: r263739
VxLink is a helper tool used as a wrapper around g++/gcc to build
VxWorks DKM (Downloadable Kernel Modules).
Such DKM is a partially linked object that includes entry points for
constructors and destructors.
This tool thus uses g++ to generate an intermediate partially linked
object, retrieves the list of constructors and destructors in it and
produces a C file that lists those ctors/dtors in a way that is
understood be VxWorks kernel. It then links this file with the
intermediate object to produce a valid DKM.
2018-08-21 Jerome Lambourg <lambourg@adacore.com>
gcc/ada/
* vxlink-bind.adb, vxlink-bind.ads, vxlink-link.adb,
vxlink-link.ads, vxlink-main.adb, vxlink.adb, vxlink.ads: Add a
new tool vxlink to handle VxWorks constructors in DKMs.
* gcc-interface/Makefile.in: add rules to build vxlink
From-SVN: r263736