Currently, an allocation expression that can be allocated on the
stack is implemented with __builtin_alloca, which turns into
__morestack_allocate_stack_space, which may call C malloc. This
may be slow. Also, if this happens during certain runtime
functions (e.g. the write barrier), bad things might happen (when
escape analysis is enabled for the runtime). Make a temporary
variable on the stack for the allocation instead.
Also remove the write barrier from the assignment emitted when
building a heap expression that is stack allocated.
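For illustration, a minimal Go sketch of the two kinds of allocation expression involved. The type and function names are hypothetical and not taken from the patch. Whether a given compiler actually places the first allocation on the stack depends on escape analysis being enabled.

```go
package main

import "fmt"

type point struct{ x, y int }

// sumLocal allocates a point with new, but the pointer never leaves the
// function, so escape analysis is free to place the object on the stack.
func sumLocal(a, b int) int {
	p := new(point)
	p.x, p.y = a, b
	return p.x + p.y
}

// newPoint returns the pointer, so the allocation escapes and has to
// outlive the call.
func newPoint(a, b int) *point {
	p := new(point)
	p.x, p.y = a, b
	return p
}

func main() {
	fmt.Println(sumLocal(1, 2), newPoint(3, 4).x)
}
```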
Reviewed-on: https://go-review.googlesource.com/86242
From-SVN: r256412
The escape analysis models closures by flowing the address of each
captured variable to the closure node. However, the escape state for
these address expressions remained unset, as ESCAPE_UNKNOWN. This
caused later passes to conclude that the address escapes. Fix this by
setting its escape state to ESCAPE_NONE first. If it escapes
(because the closure escapes), the flood phase will set its
escape state properly.
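A small Go sketch of the two situations described above, with hypothetical function names. In the first function the closure stays local, so the captured address does not need to escape. In the second the closure is returned, so the captured variable escapes with it.

```go
package main

import "fmt"

// sumTo uses a closure that captures total by reference. The closure is
// only called locally, so neither it nor the address of total escapes.
func sumTo(n int) int {
	total := 0
	add := func(i int) { total += i }
	for i := 1; i <= n; i++ {
		add(i)
	}
	return total
}

// counter returns its closure, so the closure escapes and the captured
// variable count must escape along with it.
func counter() func() int {
	count := 0
	return func() int {
		count++
		return count
	}
}

func main() {
	next := counter()
	next()
	fmt.Println(sumTo(4), next())
}
```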
Reviewed-on: https://go-review.googlesource.com/86240
From-SVN: r256411
A defer statement may need to allocate a thunk. When the statement
is not inside a loop, the thunk can be stack allocated, as it runs
before the function finishes.
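A hedged Go sketch of the distinction, using hypothetical names. The first function has one defer outside any loop, which is the case this change covers. The second defers once per iteration, so the number of pending deferred calls is not known at compile time; that is presumably why the loop case is excluded.

```go
package main

import "fmt"

// record appends msg to the log; it stands in for real cleanup work.
func record(log *[]string, msg string) {
	*log = append(*log, msg)
}

// deferOnce has a single defer outside any loop, so at most one
// deferred call from this statement is pending at a time.
func deferOnce(log *[]string) {
	defer record(log, "cleanup")
	record(log, "work")
}

// deferInLoop defers once per iteration, so several deferred calls from
// the same statement are pending when the function returns.
func deferInLoop(log *[]string, n int) {
	for i := 0; i < n; i++ {
		defer record(log, fmt.Sprintf("iter %d", i))
	}
}

func main() {
	var log []string
	deferOnce(&log)
	deferInLoop(&log, 2)
	fmt.Println(log)
}
```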
Reviewed-on: https://go-review.googlesource.com/85639
From-SVN: r256410
Bound_method_expression needs a closure. Stack allocate the
closure when it does not escape.
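In Go source terms, a bound method expression is a method value such as c.inc below. The sketch uses hypothetical names: the first function only calls the method value locally, while the second returns it, so its closure escapes.

```go
package main

import "fmt"

type counter struct{ n int }

func (c *counter) inc() { c.n++ }

// bumpLocal binds c.inc to a local variable and calls it in place.
// The closure behind the method value never leaves the function.
func bumpLocal(c *counter, times int) {
	inc := c.inc
	for i := 0; i < times; i++ {
		inc()
	}
}

// incFunc returns the method value, so its closure escapes.
func incFunc(c *counter) func() {
	return c.inc
}

func main() {
	c := &counter{}
	bumpLocal(c, 3)
	incFunc(c)()
	fmt.Println(c.n)
}
```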
Reviewed-on: https://go-review.googlesource.com/85638
From-SVN: r256407
Move some checks of escape state earlier, from get_backend to
Mark_address_taken, so that we can reclaim escape analysis Nodes
before kicking off the backend (not done in this CL). This also
makes it easier to check that variables and closures do not escape
when the escape analysis is run for the runtime package (also
not done in this CL).
Reviewed-on: https://go-review.googlesource.com/85735
From-SVN: r256406
CL 83876 added support for the go:noescape pragma, but it only works
for functions called from the same package. The pragma did not
take effect for exported functions that are not called from
the same package. The reason is that top-level function
declarations are not traversed; they are only reached through calls
from other functions. This CL adds this support. The Traverse
class is extended with a mode to traverse function declarations.
Reviewed-on: https://go-review.googlesource.com/85637
From-SVN: r256405
If we're making a slice of constant size that does not need to
escape, allocate it on the stack.
In lower, do not create temporaries for constant-size makeslice,
so that it is easier to detect later that the slice has constant size.
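A minimal Go sketch of the case this targets, with hypothetical function names. The first make has a constant size and the slice stays local. The second has a run-time size and is returned, so it does not qualify.

```go
package main

import "fmt"

// sumSquares makes a slice of constant size that is only used locally,
// so its backing store is a candidate for stack allocation.
func sumSquares() int {
	buf := make([]int, 8)
	for i := range buf {
		buf[i] = i * i
	}
	sum := 0
	for _, v := range buf {
		sum += v
	}
	return sum
}

// squares makes a slice whose size is only known at run time and
// returns it, so the backing store escapes.
func squares(n int) []int {
	buf := make([]int, n)
	for i := range buf {
		buf[i] = i * i
	}
	return buf
}

func main() {
	fmt.Println(sumSquares(), squares(3))
}
```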
Reviewed-on: https://go-review.googlesource.com/85636
From-SVN: r256404
Arrays that are sliced are set to escape in type checking, very
early in compilation. The escape analysis runs later but cannot
undo it. This CL changes it to not escape in the early stage.
Later the escape analysis will make it escape when needed.
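As an illustration with hypothetical names: slicing an array takes its address, but that alone does not mean the array outlives the function. In the first function below the slice stays local. In the second it is returned, which is the case where the escape analysis would still have to make the array escape.

```go
package main

import "fmt"

// sumLocal slices a local array, but the slice never leaves the
// function, so the array itself does not need to escape.
func sumLocal() int {
	var arr [4]int
	s := arr[:]
	for i := range s {
		s[i] = i + 1
	}
	return arr[0] + arr[1] + arr[2] + arr[3]
}

// leak returns a slice of the local array, so the array must escape.
func leak() []int {
	var arr [4]int
	arr[0] = 9
	return arr[:]
}

func main() {
	fmt.Println(sumLocal(), leak())
}
```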
Reviewed-on: https://go-review.googlesource.com/85635
From-SVN: r256403
PR target/78585 has been fixed for GCC 7 by
commit 7ed04d053eead43d87dff40fb4e2904219afc4d5
Author: jakub <jakub@138bc75d-0d04-0410-961f-82ee72b054a4>
Date: Wed Nov 30 13:02:07 2016 +0000
* config/i386/i386.c (dimode_scalar_chain::convert_op): Avoid
sharing the SUBREG rtx between move and following insn.
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@243018 138bc75d-0d04-0410-961f-82ee72b054a4
PR target/78585:
* gcc.target/i386/pr78585.c: New test.
From-SVN: r256402
PR libstdc++/80276
* python/libstdcxx/v6/printers.py (SharedPointerPrinter)
(UniquePointerPrinter): Print correct template argument, not type of
the pointer.
(TemplateTypePrinter._recognizer.recognize): Handle failure to lookup
a type.
* testsuite/libstdc++-prettyprinters/cxx11.cc: Test unique_ptr of
array type.
* testsuite/libstdc++-prettyprinters/cxx17.cc: Test shared_ptr and
weak_ptr of array types.
From-SVN: r256400
If a local variable's address is taken and passed out of its
lexical scope, the GCC backend may reuse the stack slot for the
variable, not knowing it is still live through a pointer. In
this case, we create a top-level temporary variable and let the
user-defined variable refer to the temporary variable as its
storage location. As the temporary variable is declared at the
top level, its stack slot will remain live throughout the
function.
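A short Go sketch of the pattern in question, using a hypothetical function name. The address of x leaves the block in which x is declared, and a second block then declares another variable that could be given the same stack slot if x were treated as dead at the end of its block.

```go
package main

import "fmt"

// readThroughPointer keeps using x through p after x's block has ended.
func readThroughPointer() int {
	var p *int
	{
		x := 42
		p = &x // the address of x is passed out of its lexical scope
	}
	{
		y := 7 // must not share a stack slot with x, which is still live
		y++
		_ = y
	}
	return *p
}

func main() {
	fmt.Println(readThroughPointer())
}
```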
Reviewed-on: https://go-review.googlesource.com/84675
* go-gcc.cc (local_variable): Add decl_var parameter.
From-SVN: r256398
gcc/ChangeLog:
2018-01-09 Carl Love <cel@us.ibm.com>
* config/rs6000/altivec.md (p8_vmrgow): Add support for V2DI, V2DF,
V4SI, V4SF types.
(p8_vmrgew): Add support for V2DI, V2DF, V4SF types.
* config/rs6000/rs6000-builtin.def: Add definitions for FLOAT2_V2DF,
VMRGEW_V2DI, VMRGEW_V2DF, VMRGEW_V4SF, VMRGOW_V4SI, VMRGOW_V4SF,
VMRGOW_V2DI, VMRGOW_V2DF. Remove definition for VMRGOW.
* config/rs6000/rs6000-c.c (VSX_BUILTIN_VEC_FLOAT2,
P8V_BUILTIN_VEC_VMRGEW, P8V_BUILTIN_VEC_VMRGOW): Add definitions.
* config/rs6000/rs6000-protos.h: Add extern definition for
rs6000_generate_float2_double_code.
* config/rs6000/rs6000.c (rs6000_generate_float2_double_code): Add
function.
* config/rs6000/vsx.md (vsx_xvcdpsp): Add define_insn.
(float2_v2df): Add define_expand.
gcc/testsuite/ChangeLog:
2018-01-09 Carl Love <cel@us.ibm.com>
* gcc.target/powerpc/builtins-1.c (main): Add tests for vec_mergee and
vec_mergeo builtins with float, double, long long, unsigned long long,
bool long long arguments.
* gcc.target/powerpc/builtins-3-runnable.c (main): Add test for
vec_float2 with double arguments.
* gcc.target/powerpc/builtins-mergew-mergow.c: New runnable test for the
vec_mergew and vec_mergow builtins.
From-SVN: r256395
Add a flag -fgo-debug-escape-hash for debugging escape analysis.
It takes a binary string, optionally preceded by a "-", as argument.
When specified, the escape analysis runs only on functions whose
name hashes to a value with a matching suffix. The "-" sign
negates the match, i.e. the analysis runs only on functions with
a non-matching hash.
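The matching rule can be sketched in Go as follows. This is only an illustration of the suffix test: the function names are hypothetical, and the use of FNV-1a over 32 bits is an assumption, since the text above does not say which hash the compiler uses.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"strings"
)

// hashBits returns a hash of name as a 32-character binary string.
// FNV-1a is an assumption made for this sketch only.
func hashBits(name string) string {
	h := fnv.New32a()
	h.Write([]byte(name))
	return fmt.Sprintf("%032b", h.Sum32())
}

// shouldAnalyze reports whether a function named name is selected by
// the flag value: a binary string, optionally preceded by "-".
func shouldAnalyze(name, flag string) bool {
	negate := strings.HasPrefix(flag, "-")
	pattern := strings.TrimPrefix(flag, "-")
	match := strings.HasSuffix(hashBits(name), pattern)
	return match != negate // "-" inverts the selection
}

func main() {
	for _, name := range []string{"main.f", "main.g", "runtime.mallocgc"} {
		fmt.Println(name, hashBits(name), shouldAnalyze(name, "01"), shouldAnalyze(name, "-01"))
	}
}
```

Lengthening the suffix one bit at a time halves the selected set, which allows a bisection over functions when tracking down a miscompilation.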
Reviewed-on: https://go-review.googlesource.com/83878
* lang.opt (fgo-debug-escape-hash): New option.
* go-c.h (struct go_create_gogo_args): Add debug_escape_hash
field.
* go-lang.c (go_langhook_init): Set debug_escape_hash field.
* gccgo.texi (Invoking gccgo): Document -fgo-debug-escape-hash.
From-SVN: r256393
2018-01-09 Steven G. Kargl <kargl@gcc.gnu.org>
PR fortran/83742
* expr.c (gfc_is_simply_contiguous): Check for NULL pointer.
2018-01-09 Steven G. Kargl <kargl@gcc.gnu.org>
PR fortran/83742
* gfortran.dg/contiguous_6.f90: New test.
From-SVN: r256391
2018-01-08 Aaron Sawdey <acsawdey@linux.vnet.ibm.com>
* config/rs6000/rs6000-string.c (do_load_for_compare_from_addr): New
function.
(do_ifelse): New function.
(do_isel): New function.
(do_sub3): New function.
(do_add3): New function.
(do_load_mask_compare): New function.
(do_overlap_load_compare): New function.
(expand_compare_loop): New function.
(expand_block_compare): Call expand_compare_loop() when appropriate.
* config/rs6000/rs6000.opt (-mblock-compare-inline-limit): Change
option description.
(-mblock-compare-inline-loop-limit): New option.
From-SVN: r256388
This patch makes the AArch64 vec_perm_const code use the new
vec_perm_indices routines, instead of checking each element individually.
This means that they extend naturally to variable-length vectors.
Also, aarch64_evpc_dup was the only function that generated rtl when
testing_p is true, and that looked accidental. The patch adds the
missing check and then replaces the gen_rtx_REG/start_sequence/
end_sequence stuff with an assert that no rtl is generated.
2018-01-09 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* config/aarch64/aarch64.c (aarch64_evpc_trn): Use d.perm.series_p
instead of checking each element individually.
(aarch64_evpc_uzp): Likewise.
(aarch64_evpc_zip): Likewise.
(aarch64_evpc_ext): Likewise.
(aarch64_evpc_rev): Likewise.
(aarch64_evpc_dup): Test the encoding for a single duplicated element,
instead of checking each element individually. Return true without
generating rtl if
(aarch64_vectorize_vec_perm_const): Use all_from_input_p to test
whether all selected elements come from the same input, instead of
checking each element individually. Remove calls to gen_rtx_REG,
start_sequence and end_sequence and instead assert that no rtl is
generated.
From-SVN: r256385
The aarch64_legitimate_constant_p tests for HIGH and CONST seem
to be the wrong way round: (high (const ...)) is valid rtl that
could be passed in, but (const (high ...)) isn't. As it stands,
we disallow anchor+offset but allow (high anchor+offset).
2018-01-09 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* config/aarch64/aarch64.c (aarch64_legitimate_constant_p): Fix
order of HIGH and CONST checks.
From-SVN: r256384
As mentioned in https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01575.html,
the scatter handling in vectorizable_store seems to be dead code at the
moment. Enabling it with the vect_analyze_data_ref_access part of
that patch triggered an ICE in the avx512f-scatter-*.c tests (which
previously didn't use scatters). The problem was that the NARROW
and WIDEN handling uses permute_vec_elements to marshal the inputs,
and permute_vec_elements expected the lhs of the stmt to be an SSA_NAME,
which of course it isn't for stores.
This patch makes permute_vec_elements create a fresh variable in this case.
2018-01-09 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* tree-vect-stmts.c (permute_vec_elements): Create a fresh variable
if the destination isn't an SSA_NAME.
From-SVN: r256383
2018-01-09 Richard Biener <rguenther@suse.de>
PR tree-optimization/83668
* graphite.c (canonicalize_loop_closed_ssa): Add edge argument,
move prologue...
(canonicalize_loop_form): ... here, renamed from ...
(canonicalize_loop_closed_ssa_form): ... this and amended to
swap successor edges for loop exit blocks to make us use
the RPO order we need for initial schedule generation.
* gcc.dg/graphite/pr83668.c: New testcase.
From-SVN: r256381
The folding of comparisons against Inf (to constants or comparisons
with the maximum finite value) has various cases where it introduces
or loses "invalid" exceptions for comparisons with NaNs.
Folding x > +Inf to 0 should not be about HONOR_SNANS - ordered
comparisons of both quiet and signaling NaNs should raise invalid.
x <= +Inf is not the same as x == x, because again that loses an
exception (equality comparisons don't raise exceptions except for
signaling NaNs).
x == +Inf is not the same as x > DBL_MAX, and a similar issue applies
with the x != +Inf case - that transformation causes a spurious
exception.
This patch fixes the conditionals on the folding to avoid
introducing or losing such exceptions.
Bootstrapped with no regressions on x86_64-pc-linux-gnu (where the
cases involving spurious exceptions wouldn't have failed anyway before
GCC 8, because the back end formerly always wrongly used unordered
comparisons). Also tested for powerpc-linux-gnu soft-float: this fixes
many glibc math/ test failures that arose in that configuration because
this folding affected the IBM long double support in libgcc. No such
failures appeared for hard-float because of the bug of powerpc
hard-float always using unordered comparisons. Some failures remain,
but I believe them to be unrelated.
PR tree-optimization/64811
gcc:
* match.pd: When optimizing comparisons with Inf, avoid
introducing or losing exceptions from comparisons with NaN.
gcc/testsuite:
* gcc.dg/torture/inf-compare-1.c, gcc.dg/torture/inf-compare-2.c,
gcc.dg/torture/inf-compare-3.c, gcc.dg/torture/inf-compare-4.c,
gcc.dg/torture/inf-compare-5.c, gcc.dg/torture/inf-compare-6.c,
gcc.dg/torture/inf-compare-7.c, gcc.dg/torture/inf-compare-8.c:
New tests.
* gcc.c-torture/execute/ieee/fp-cmp-7.x: New file.
From-SVN: r256380
2018-01-09 Tamar Christina <tamar.christina@arm.com>
PR target/82641
* gcc.target/arm/pragma_fpu_attribute.c: Rewrite to use
no NEON and require softfp or hard float-abi.
* gcc.target/arm/pragma_fpu_attribute_2.c: Likewise.
From-SVN: r256375
gcc/
Don't save registers in main().
PR target/83737
* doc/invoke.texi (AVR Options) [-mmain-is-OS_task]: Document it.
* config/avr/avr.opt (-mmain-is-OS_task): New target option.
* config/avr/avr.c (avr_set_current_function): Don't error if
naked, OS_task or OS_main are specified at the same time.
(avr_function_ok_for_sibcall): Don't disable sibcalls for OS_task,
OS_main.
(avr_insert_attributes) [-mmain-is-OS_task] <main>: Add OS_task
attribute.
* common/config/avr/avr-common.c (avr_option_optimization_table):
Switch on -mmain-is-OS_task for optimizing compilations.
From-SVN: r256373
2018-01-09 Richard Biener <rguenther@suse.de>
PR tree-optimization/83572
* graphite.c: Include cfganal.h.
(graphite_transform_loops): Connect infinite loops to exit
and remove fake edges at the end.
* gcc.dg/graphite/pr83572.c: New testcase.
From-SVN: r256372
PR target/83507
* modulo-sched.c (schedule_reg_moves): Punt if we'd need to move
hard registers. Formatting fixes.
* gcc.dg/sms-13.c: New test.
From-SVN: r256368
PR preprocessor/83722
* gcc.c (try_generate_repro): Pass
&temp_stderr_files[RETRY_ICE_ATTEMPTS - 1] rather than
&temp_stdout_files[RETRY_ICE_ATTEMPTS - 1] as last argument to
do_report_bug.
From-SVN: r256367
Update the Go library to the 1.10beta1 release.
Requires a few changes to the compiler for modifications to the map
runtime code, and to handle some nowritebarrier cases in the runtime.
Reviewed-on: https://go-review.googlesource.com/86455
gotools/:
* Makefile.am (go_cmd_vet_files): New variable.
(go_cmd_buildid_files, go_cmd_test2json_files): New variables.
(s-zdefaultcc): Change from constants to functions.
(noinst_PROGRAMS): Add vet, buildid, and test2json.
(cgo$(EXEEXT)): Link against $(LIBGOTOOL).
(vet$(EXEEXT)): New target.
(buildid$(EXEEXT)): New target.
(test2json$(EXEEXT)): New target.
(install-exec-local): Install all $(noinst_PROGRAMS).
(uninstall-local): Uninstall all $(noinst_PROGRAMS).
(check-go-tool): Depend on $(noinst_PROGRAMS). Copy down
objabi.go.
(check-runtime): Depend on $(noinst_PROGRAMS).
(check-cgo-test, check-carchive-test): Likewise.
(check-vet): New target.
(check): Depend on check-vet. Look at cmd_vet-testlog.
(.PHONY): Add check-vet.
* Makefile.in: Rebuild.
From-SVN: r256365
[gcc]
2018-01-08 Bill Schmidt <wschmidt@linux.vnet.ibm.com>
PR target/83677
* config/rs6000/altivec.md (*altivec_vpermr_<mode>_internal):
Reverse order of second and third operands in first alternative.
* config/rs6000/rs6000.c (rs6000_expand_vector_set): Reverse order
of first and second elements in UNSPEC_VPERMR vector.
(altivec_expand_vec_perm_le): Likewise.
[gcc/testsuite]
2018-01-08 Bill Schmidt <wschmidt@linux.vnet.ibm.com>
PR target/83677
* gcc.target/powerpc/pr83677.c: New file.
From-SVN: r256358
2018-01-08 Michael Meissner <meissner@linux.vnet.ibm.com>
* config/rs6000/quad-float128.h (IBM128_TYPE): Explicitly use
__ibm128, instead of trying to use long double.
(CVT_FLOAT128_TO_IBM128): Use TFtype instead of __float128 to
accommodate -mabi=ieeelongdouble multilibs.
(CVT_IBM128_TO_FLOAT128): Likewise.
* config/rs6000/ibm-ldouble.c (IBM128_TYPE): New macro to define
the appropriate IBM extended double type.
(__gcc_qadd): Change all occurrences of long double to IBM128_TYPE.
(__gcc_qsub): Likewise.
(__gcc_qmul): Likewise.
(__gcc_qdiv): Likewise.
(pack_ldouble): Likewise.
(__gcc_qneg): Likewise.
(__gcc_qeq): Likewise.
(__gcc_qne): Likewise.
(__gcc_qge): Likewise.
(__gcc_qle): Likewise.
(__gcc_stoq): Likewise.
(__gcc_dtoq): Likewise.
(__gcc_itoq): Likewise.
(__gcc_utoq): Likewise.
(__gcc_qunord): Likewise.
* config/rs6000/_mulkc3.c (toplevel): Include soft-fp.h and
quad-float128.h for the definitions.
(COPYSIGN): Use the f128 version instead of the q version.
(INFINITY): Likewise.
(__mulkc3): Use TFmode/TCmode for float128 scalar/complex types.
* config/rs6000/_divkc3.c (toplevel): Include soft-fp.h and
quad-float128.h for the definitions.
(COPYSIGN): Use the f128 version instead of the q version.
(INFINITY): Likewise.
(FABS): Likewise.
(__divkc3): Use TFmode/TCmode for float128 scalar/complex types.
* config/rs6000/extendkftf2-sw.c (__extendkftf2_sw): Likewise.
* config/rs6000/trunctfkf2-sw.c (__trunctfkf2_sw): Likewise.
From-SVN: r256354
2018-01-08 Michael Meissner <meissner@linux.vnet.ibm.com>
* config/rs6000/quad-float128.h (IBM128_TYPE): Explicitly use
__ibm128, instead of trying to use long double.
(CVT_FLOAT128_TO_IBM128): Use TFtype instead of __float128 to
accommodate -mabi=ieeelongdouble multilibs.
(CVT_IBM128_TO_FLOAT128): Likewise.
* config/rs6000/ibm-ldouble.c (IBM128_TYPE): New macro to define
the appropriate IBM extended double type.
(__gcc_qadd): Change all occurrences of long double to IBM128_TYPE.
(__gcc_qsub): Likewise.
(__gcc_qmul): Likewise.
(__gcc_qdiv): Likewise.
(pack_ldouble): Likewise.
(__gcc_qneg): Likewise.
(__gcc_qeq): Likewise.
(__gcc_qne): Likewise.
(__gcc_qge): Likewise.
(__gcc_qle): Likewise.
(__gcc_stoq): Likewise.
(__gcc_dtoq): Likewise.
(__gcc_itoq): Likewise.
(__gcc_utoq): Likewise.
(__gcc_qunord): Likewise.
* config/rs6000/_mulkc3.c (toplevel): Include soft-fp.h and
quad-float128.h for the definitions.
(COPYSIGN): Use the f128 version instead of the q version.
(INFINITY): Likewise.
(__mulkc3): Use TFmode/TCmode for float128 scalar/complex types.
* config/rs6000/_divkc3.c (toplevel): Include soft-fp.h and
quad-float128.h for the definitions.
(COPYSIGN): Use the f128 version instead of the q version.
(INFINITY): Likewise.
(FABS): Likewise.
(__divkc3): Use TFmode/TCmode for float128 scalar/complex types.
* config/rs6000/extendkftf2-sw.c (__extendkftf2_sw): Likewise.
* config/rs6000/trunctfkf2-sw.c (__trunctfkf2_sw): Likewise.
From-SVN: r256353
2018-01-08 Aaron Sawdey <acsawdey@linux.vnet.ibm.com>
* config/rs6000/rs6000-string.c (do_load_for_compare_from_addr): New
function.
(do_ifelse): New function.
(do_isel): New function.
(do_sub3): New function.
(do_add3): New function.
(do_load_mask_compare): New function.
(do_overlap_load_compare): New function.
(expand_compare_loop): New function.
(expand_block_compare): Call expand_compare_loop() when appropriate.
* config/rs6000/rs6000.opt (-mblock-compare-inline-limit): Change
option description.
(-mblock-compare-inline-loop-limit): New option.
From-SVN: r256351