HSA assumes that all program-scope HSAIL symbols can be queried from
the host runtime API, so they cannot be removed by IPA.
Getting some inlining happening in the finalized binary required:
* explicitly marking the 'prog' scope functions and the launcher
function "externally_visible" to keep the inliner from removing them
* also marking the host_def ptr externally visible; otherwise
IPA assumes it is never set
* adding the 'inline' keyword to functions to enable inlining;
otherwise GCC defaults to replaceable functions (one can link
over the previous one), which cannot be inlined
* replacing all calls to declarations with calls to definitions so
that the inliner can find the definition
* fixing missing hidden argument types in the generated functions.
These were silently ignored until GCC became able to
inline calls to such functions.
* not gimplifying before fixing the call targets. Otherwise the
calls get detached and the definitions are not found. The reason
for this is not clear, but gimplifying only after the call
target decl->def conversion fixes it.
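A rough source-level analogue of the first and third points, as a sketch
only: the commit presumably sets the equivalent flags on the compiler's
internal declarations rather than using source attributes, and the names
`scale` and `launcher` are hypothetical.

```cpp
// Hypothetical helper. 'inline' tells the compiler that this definition
// is the one to use, so its body may be inlined into callers.
inline int scale(int x) { return 2 * x; }

// Hypothetical entry point that an external consumer looks up by name.
// The GCC attribute 'externally_visible' keeps the symbol visible outside
// the translation unit even under whole-program optimisation.
extern "C" __attribute__((externally_visible)) int launcher(int x) {
    return scale(x) + 1;  // scale() can be inlined here
}
```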
From-SVN: r259943
We didn't reserve additional space for the alloca frame pointers that
need to be saved in the alloca space.
Fixes libgomp.c++/target-6.C execution test.
From-SVN: r259942
The ELFv1 ABI says: "Single precision floating point values are mapped
to the second word in a single doubleword" and also "Floating point
registers f1 through f13 are used consecutively to pass up to 13
floating point values, one member aggregates passed by value
containing a floating point value, and to pass complex floating point
values".
libffi wasn't expecting float args in the second word, and wasn't
passing one member aggregates in fp registers. This patch fixes those
problems, making use of the existing ELFv2 homogeneous aggregate
support, since a one-element fp struct is a special case of a
homogeneous aggregate.
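A minimal sketch of the slot layout described above, not libffi's actual
code. `store_float_arg` and `WrappedDouble` are hypothetical names, and
endianness is passed as a parameter only so that both layouts can be
exercised on any host.

```cpp
#include <cstring>

static_assert(sizeof(float) == 4, "sketch assumes a 4-byte float");

// Copy a float argument into an 8-byte parameter slot. With the
// big-endian layout the value occupies the second 4-byte word
// (offset 4); otherwise it is placed at offset 0.
void store_float_arg(unsigned char slot[8], float value, bool big_endian) {
    std::memset(slot, 0, 8);
    std::memcpy(slot + (big_endian ? 4 : 0), &value, sizeof value);
}

// A one-member aggregate: per the quoted ABI text it is passed like the
// bare floating-point value it wraps.
struct WrappedDouble { double d; };
static_assert(sizeof(WrappedDouble) == sizeof(double),
              "wrapper adds no padding in this sketch");
```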
I've also set a flag when returning pointers that might be used one
day. This is just a tidy-up, since the ppc64 assembly support code
currently doesn't test FLAG_RETURNS_64BITS for integer types.
* src/powerpc/ffi_linux64.c (discover_homogeneous_aggregate):
Compile for ELFv1 too, handling single element aggregates.
(ffi_prep_cif_linux64_core): Call discover_homogeneous_aggregate
for ELFv1. Set FLAG_RETURNS_64BITS for FFI_TYPE_POINTER return.
(ffi_prep_args64): Call discover_homogeneous_aggregate for ELFv1,
and handle single element structs containing float or double
as if the element wasn't wrapped in a struct. Store floats in
second word of doubleword slot when big-endian.
(ffi_closure_helper_LINUX64): Similarly.
From-SVN: r259934
2018-05-04 Richard Biener <rguenther@suse.de>
PR middle-end/85627
* tree-complex.c (update_complex_assignment): We are always in SSA form.
(expand_complex_div_wide): Likewise.
(expand_complex_operations_1): Likewise.
(expand_complex_libcall): Preserve EH info of the original stmt.
(tree_lower_complex): Handle removed blocks.
* tree.c (build_common_builtin_nodes): Do not set ECF_NOTHROW
on complex multiplication and division libcall builtins.
* g++.dg/torture/pr85627.C: New testcase.
From-SVN: r259923
2018-05-04 Richard Biener <rguenther@suse.de>
PR middle-end/85574
* fold-const.c (negate_expr_p): Restrict negation of operand
zero of a division to when we know that can happen without
overflow.
(fold_negate_expr_1): Likewise.
* gcc.dg/torture/pr85574.c: New testcase.
* gcc.dg/torture/pr57656.c: Use dg-additional-options.
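To illustrate at the source level why negating operand zero of a division
needs an overflow check, here is a sketch with hypothetical helper names.
It is not the fold-const.c logic itself.

```cpp
#include <limits>

// True when -x is representable in int, i.e. negation cannot overflow.
constexpr bool negation_is_safe(int x) {
    return x != std::numeric_limits<int>::min();
}

// Rewrite -(a / b) as (-a) / b only when negating 'a' is known to be
// safe; otherwise keep the original form. Callers must still avoid
// b == 0 and the INT_MIN / 1 and INT_MIN / -1 cases, which overflow
// in either form.
constexpr int negated_quotient(int a, int b) {
    return negation_is_safe(a) ? (-a) / b : -(a / b);
}
```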
From-SVN: r259922
PR libstdc++/85466
* real.h (real_nextafter): Declare.
* real.c (real_nextafter): New function.
* fold-const-call.c (fold_const_nextafter): New function.
(fold_const_call_sss): Call it for CASE_CFN_NEXTAFTER and
CASE_CFN_NEXTTOWARD.
(fold_const_call_1): For CASE_CFN_NEXTTOWARD call fold_const_call_sss
even when arg1_mode is different from arg0_mode.
* gcc.dg/nextafter-1.c: New test.
* gcc.dg/nextafter-2.c: New test.
* gcc.dg/nextafter-3.c: New test.
* gcc.dg/nextafter-4.c: New test.
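Calls with constant arguments such as the ones below are the kind that
constant folding of nextafter could evaluate at compile time. This is a
usage sketch only; `next_up` is a hypothetical helper and IEEE double
arithmetic is assumed.

```cpp
#include <cmath>
#include <limits>

// Smallest representable double strictly greater than x (for finite x).
double next_up(double x) {
    return std::nextafter(x, std::numeric_limits<double>::infinity());
}
```

Under the IEEE assumption, `next_up(1.0)` is 1.0 plus the machine epsilon
and `next_up(0.0)` is the smallest positive subnormal.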
From-SVN: r259921
In https://golang.org/cl/111097 the gc version of cmd/go was updated
to include some gofrontend-specific changes. The gofrontend code
already has different versions of those changes; this CL makes the
gofrontend match the upstream code.
Reviewed-on: https://go-review.googlesource.com/111099
From-SVN: r259918
Following a recent change for PR 82644 the non-standard hypergeometric
functions are not defined by <cmath> when __STRICT_ANSI__ is defined
(e.g. for -std=c++17, or -std=c++14 -D__STDCPP_WANT_MATH_SPEC_FUNCS__).
That caused errors in <tr1/cmath> because the using-declarations for
tr1::hyperg et al are invalid in strict modes.
The solution is to define the TR1 hypergeometric functions inline in
<tr1/cmath> if __STRICT_ANSI__ is defined.
PR libstdc++/82644
* include/tr1/cmath [__STRICT_ANSI__] (hypergf, hypergl, hyperg): Use
inline definitions instead of using-declarations.
[__STRICT_ANSI__] (conf_hypergf, conf_hypergl, conf_hyperg): Likewise.
* testsuite/tr1/5_numerical_facilities/special_functions/
07_conf_hyperg/compile_cxx17.cc: New.
* testsuite/tr1/5_numerical_facilities/special_functions/
17_hyperg/compile_cxx17.cc: New.
From-SVN: r259912
On 32-bit targets any values over 4GB would wrap and produce the wrong
result.
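The wrap can be reproduced with a small sketch. The function names and
the block counts are hypothetical, not taken from the library source.

```cpp
#include <cstdint>

// Buggy shape: the product is reduced to 32 bits, so it wraps modulo 2^32.
std::uint32_t bytes_narrow(std::uint32_t blocks, std::uint32_t block_size) {
    return static_cast<std::uint32_t>(blocks * block_size);
}

// Fixed shape: widen one operand first so the multiplication is carried
// out in the result type.
std::uintmax_t bytes_wide(std::uint32_t blocks, std::uint32_t block_size) {
    return static_cast<std::uintmax_t>(blocks) * block_size;
}
```

With 2000000 blocks of 4096 bytes the true size is 8192000000 bytes, but
the narrow version yields 3897032704.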
PR libstdc++/85632 use uintmax_t for arithmetic
* src/filesystem/ops.cc (experimental::filesystem::space): Perform
arithmetic in result type.
* src/filesystem/std-ops.cc (filesystem::space): Likewise.
* testsuite/27_io/filesystem/operations/space.cc: Check total capacity
is greater than free space.
* testsuite/experimental/filesystem/operations/space.cc: New.
From-SVN: r259901
Tweak the array type checking code to avoid crashing on array types
whose length expressions are explicit non-integer types (for example,
"float64(10)"). If such constructs are seen, issue an "invalid array
bound" error.
Fixes golang/go#13486.
Reviewed-on: https://go-review.googlesource.com/91975
From-SVN: r259900
The standard requires that the std::thread constructor is constrained so
it can't be called with a first argument of type std::thread. The
current implementation only meets that requirement if the constructor is
called with one argument, by using deleted overloads. This uses an
enable_if constraint to enforce the requirement for any number of
arguments.
Also add a static assertion to give a more readable error for invalid
arguments that cannot be invoked. Also simplify _Invoker to reduce the
error cascade for ill-formed instantiations with non-invocable
arguments.
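The shape of such a constraint can be sketched on a toy class. `Task` and
`not_same` are hypothetical names, and the constructor simply invokes the
callable instead of starting a thread.

```cpp
#include <type_traits>
#include <utility>

class Task {
    // SFINAE helper: well-formed only when T, after decay, is not Task.
    template <typename T>
    using not_same = typename std::enable_if<
        !std::is_same<typename std::decay<T>::type, Task>::value>::type;

public:
    Task() = default;
    Task(Task&&) = default;
    Task(const Task&) = delete;

    // Constrained for any number of arguments: this overload drops out
    // of overload resolution whenever the first argument is a Task.
    template <typename Callable, typename... Args,
              typename = not_same<Callable>>
    explicit Task(Callable&& f, Args&&... args) {
        std::forward<Callable>(f)(std::forward<Args>(args)...);
    }
};
```

Because the constraint is on the template itself, `Task(some_task, 1)` is
rejected in the same way as `Task(some_task)`, which deleted one-argument
overloads alone cannot achieve.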
PR libstdc++/84535
* include/std/thread (thread::__not_same): New SFINAE helper.
(thread::thread(_Callable&&, _Args&&...)): Add SFINAE constraint that
first argument is not a std::thread. Add static assertion to check
INVOKE expression is valid.
(thread::thread(thread&), thread::thread(const thread&&)): Remove.
(thread::_Invoker::_M_invoke, thread::_Invoker::operator()): Use
__invoke_result for return types and remove exception specifications.
* testsuite/30_threads/thread/cons/84535.cc: New.
From-SVN: r259893
2018-05-03 Tom de Vries <tom@codesourcery.com>
PR testsuite/85106
* lib/scanoffloadtree.exp: New file.
* testsuite/lib/libgomp-dg.exp (libgomp-dg-test): Add save-temps to
extra_tool_flags if it contains an -foffload=-fdump-* flag.
* testsuite/lib/libgomp.exp: Include scanoffloadtree.exp.
* testsuite/libgomp.oacc-c/vec.c: Use scan-offload-tree-dump.
* doc/sourcebuild.texi (Commands for use in dg-final, Scan optimization
dump files): Add offload-tree.
From-SVN: r259892
2018-05-03 Richard Biener <rguenther@suse.de>
PR tree-optimization/85615
* tree-ssa-threadupdate.c (thread_block_1): Only allow exits
to loops not nested in BBs loop father to avoid creating multi-entry
loops.
* gcc.dg/torture/pr85615.c: New testcase.
From-SVN: r259891
We can improve the performance of complex floating-point multiplications by inlining the expansion a bit more aggressively.
We can inline complex x = a * b as:
x = (ar*br - ai*bi) + i(ar*bi + br*ai);
if (isunordered (__real__ x, __imag__ x))
x = __muldc3 (a, b); //Or __mulsc3 for single-precision
That way, in the common case where no NaNs are produced, we avoid the libgcc call and fall back to the
NaN handling in libgcc only if either component of the expansion is NaN.
The implementation is done in expand_complex_multiplication in tree-complex.c, and the above expansion
is performed when optimising at -O1 and above and when not optimising for size.
At -O0 and -Os the single call to libgcc will be emitted.
For the code:
__complex double
foo (__complex double a, __complex double b)
{
return a * b;
}
We will now emit at -O2 for aarch64:
foo:
fmul d16, d1, d3
fmul d6, d1, d2
fnmsub d5, d0, d2, d16
fmadd d4, d0, d3, d6
fcmp d5, d4
bvs .L8
fmov d1, d4
fmov d0, d5
ret
.L8:
stp x29, x30, [sp, -16]!
mov x29, sp
bl __muldc3
ldp x29, x30, [sp], 16
ret
Instead of just a branch to __muldc3.
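The same fast-path/slow-path structure can be written out by hand as a
sketch. `mul_inline` is a hypothetical name, and the fallback uses the
ordinary std::complex multiplication, which with GCC would typically end
up in the library routine.

```cpp
#include <cmath>
#include <complex>

std::complex<double> mul_inline(std::complex<double> a,
                                std::complex<double> b) {
    const double ar = a.real(), ai = a.imag();
    const double br = b.real(), bi = b.imag();
    // Fast path: the textbook expansion.
    const double re = ar * br - ai * bi;
    const double im = ar * bi + br * ai;
    // isunordered is true when either component is NaN; only then pay
    // for the full multiplication with its NaN/infinity handling.
    if (std::isunordered(re, im))
        return a * b;
    return {re, im};
}
```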
PR tree-optimization/70291
* tree-complex.c (expand_complex_libcall): Add type, inplace_p
arguments. Change return type to tree. Emit libcall as a new
statement rather than replacing existing one when inplace_p is true.
(expand_complex_multiplication_components): New function.
(expand_complex_multiplication): Expand floating-point complex
multiplication using the above.
(expand_complex_division): Rename inner_type parameter to type.
Update expand_complex_libcall call-site.
(expand_complex_operations_1): Update expand_complex_multiplication
and expand_complex_division call-sites.
* gcc.dg/complex-6.c: New test.
* gcc.dg/complex-7.c: Likewise.
From-SVN: r259889
PR other/85622
* gcc_release: For -f, verify contrib/gennews has the major version
pages listed and both index.html and changes.html have been updated
for the new release.
From-SVN: r259881
These tests used to be disabled in the gofrontend since the go tool
didn't support build IDs for the gofrontend. It does now, so enable
the tests again.
Reviewed-on: https://go-review.googlesource.com/111098
From-SVN: r259875
This patch adds explicit references to various types and constants
defined by the header files included by sysinfo.c (used to drive the
generation of gen-sysinfo.go as part of the libgo build via the GCC
"-fdump-go-spec" option).
The intent is to enable clients to gather the same info generated by
"-fdump-go-spec" by instead reading the generated DWARF from a
sysinfo.o object file compiled with "-g". Some compilers (notably
clang) try to omit DWARF records for a given type unless there is an
explicit use of it in the translation unit; the additional references
are to ensure that everything we want to see in the DWARF shows up.
Reviewed-on: https://go-review.googlesource.com/99063
From-SVN: r259868
We're never going to use stack.go for gccgo. Although a build tag
keeps it from being built, even having it around can be confusing.
Remove it.
Reviewed-on: https://go-review.googlesource.com/40774
From-SVN: r259865
Move the list of libgo, gotool, and check-target packages into
separate files, then read the file contents as part of the build
process on the fly. This is intended to enable other build tooling to
share the canonical list of target packages (avoiding duplication).
Reviewed-on: https://go-review.googlesource.com/89515
libgo: revise rules for runtime.inc generation
Refactor code for generating runtime.inc: extract out the relevant
commands and place them in a separate shell script ("mkruntimeinc.sh").
Update rules to avoid generating macros whose names begin with "$",
such as "#define $sinkconst0 0".
Reviewed-on: https://go-review.googlesource.com/85955
From-SVN: r259863
PR target/85582
* config/i386/i386.md (*ashl<dwi>3_doubleword_mask,
*ashl<dwi>3_doubleword_mask_1, *<shift_insn><dwi>3_doubleword_mask,
*<shift_insn><dwi>3_doubleword_mask_1): In condition require that
the highest significant bit of the shift count mask is clear. In
check whether and[sq]i3 is needed verify that all significant bits
of the shift count other than the highest are set.
* gcc.c-torture/execute/pr85582-3.c: New test.
From-SVN: r259862
The C portion of the Go runtime includes the header "unwind-pe.h" from
libgcc, which contains some constants and a few small routines for
decoding pointer values within unwind info. This patch gets rid of
that include and instead adds a re-implementation of that
functionality in the single file that uses it. The intent is to allow
the C runtime portion of libgo to be built without a companion GCC
installation.
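As an example of the kind of small routine involved, a ULEB128 decoder
can be sketched as follows. The name `decode_uleb128` is hypothetical
and this is not the code added to libgo.

```cpp
#include <cstdint>

// Decode one unsigned LEB128 value starting at p. Each byte contributes
// its low 7 bits, least significant group first; a set high bit means
// another byte follows. Returns the address just past the encoded value.
const unsigned char* decode_uleb128(const unsigned char* p,
                                    std::uint64_t* value) {
    std::uint64_t result = 0;
    unsigned shift = 0;
    unsigned char byte;
    do {
        byte = *p++;
        if (shift < 64)  // ignore bits that do not fit in 64 bits
            result |= static_cast<std::uint64_t>(byte & 0x7f) << shift;
        shift += 7;
    } while (byte & 0x80);
    *value = result;
    return p;
}
```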
Reviewed-on: https://go-review.googlesource.com/90235
From-SVN: r259861
The suggested resolution of LWG 3083 is to make invalid indices
undefined, but we can fairly easily check for them and treat them as
errors in the same way as allocation failure. This avoids a segfault or
worse, setting an error flag on the stream instead.
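The idea of casting the index to unsigned so that one comparison rejects
both negative and oversized values can be sketched on a toy container.
`WordStore` and its `max_words` cap are hypothetical and do not reflect
the ios_base implementation.

```cpp
#include <cstddef>
#include <vector>

class WordStore {
    std::vector<long> words_;
    long dummy_ = 0;
    bool failed_ = false;
    static constexpr unsigned max_words = 1024;  // hypothetical cap

public:
    // Returns a reference to the word at 'index', growing storage on
    // demand. An invalid index sets the failure flag and yields a
    // zeroed dummy word instead of touching memory out of bounds.
    long& at(int index) {
        // A negative index becomes a very large unsigned value.
        const unsigned ix = static_cast<unsigned>(index);
        if (ix >= max_words) {
            failed_ = true;
            dummy_ = 0;
            return dummy_;
        }
        if (ix >= words_.size())
            words_.resize(ix + 1);
        return words_[ix];
    }

    bool failed() const { return failed_; }
};
```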
PR libstdc++/68197
* include/bits/ios_base.h (ios_base::iword, ios_base::pword): Cast
indices to unsigned.
* src/c++11/ios.cc (ios_base::_M_grow_words): Treat negative indices
as failure. Refactor error handling.
* testsuite/27_io/ios_base/storage/68197.cc: New.
From-SVN: r259854