In various 32-bit load_toc patterns we take the difference of two
immediates (labels) as a term of a larger expression; but this isn't
canonical RTL: the difference needs to be wrapped in CONST.
PR target/83629
* config/rs6000/rs6000.md (load_toc_v4_PIC_2, load_toc_v4_PIC_3b,
load_toc_v4_PIC_3c): Wrap const term in CONST RTL.
testsuite/
PR target/83629
* gcc.target/powerpc/pr83629.c: New testcase.
From-SVN: r256432
The signature of makemap changed with the update to Go 1.10beta1,
but I forgot to update the call from C code.
Reviewed-on: https://go-review.googlesource.com/87135
From-SVN: r256431
2018-01-10 Richard Biener <rguenther@suse.de>
PR debug/83765
* dwarf2out.c (gen_subprogram_die): Hoist old_die && declaration
early out so it also covers the case where we have a non-NULL
origin.
From-SVN: r256428
After cunrolling the inner loop, the remaining loop in the testcase
has a single 32-bit access and a group of 64-bit accesses. We first
try to vectorise at 128 bits (VF 4), but decide not to for cost reasons.
We then try with 64 bits (VF 2) instead. This means that the group
of 64-bit accesses uses a single-element vector, which is deliberately
supported as of r251538. We then try to create "permutes" for these
single-element vectors and fall foul of:
for (i = 0; i < 6; i++)
  sel[i] += exact_div (nelt, 2);
in vect_grouped_store_supported, since nelt==1.
Maybe we shouldn't even be trying to vectorise statements in the
single-element case, and instead just copy the scalar statement
for each member of the group. But until then, this patch treats
non-strided grouped accesses as VMAT_CONTIGUOUS if no permutation
is necessary.
2018-01-10 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
PR tree-optimization/83753
* tree-vect-stmts.c (get_group_load_store_type): Use VMAT_CONTIGUOUS
for non-strided grouped accesses if the number of elements is 1.
gcc/testsuite/
PR tree-optimization/83753
* gcc.dg/torture/pr83753.c: New test.
From-SVN: r256427
This patch fixes up the formatting and corrects the PR number in the
ChangeLog for r256425.
gcc/fortran/ChangeLog:
2018-01-10 Janne Blomqvist <jb@gcc.gnu.org>
PR fortran/83740
* trans-array.c (gfc_trans_array_ctor_element): Fix formatting.
From-SVN: r256426
Need to convert the RHS to the type of the LHS when assigning.
Regtested on x86_64-pc-linux-gnu, committed as obvious.
gcc/fortran/ChangeLog:
2018-01-10 Janne Blomqvist <jb@gcc.gnu.org>
PR fortran/84740
* trans-array.c (gfc_trans_array_ctor_element): Convert RHS to the
LHS type when assigning.
From-SVN: r256425
2018-01-10 Martin Liska <mliska@suse.cz>
PR bootstrap/82831
* basic-block.h (CLEANUP_NO_PARTITIONING): New define.
* bb-reorder.c (pass_reorder_blocks::execute): Do not clean up
partitioning.
* cfgcleanup.c (try_optimize_cfg): Fix up partitioning if
CLEANUP_NO_PARTITIONING is not set.
From-SVN: r256422
r254296 added support for (const ...) wrappers around vectors,
but in the end the agreement was to use a variable-length
encoding of CONST_VECTOR (and VECTOR_CST) instead. This patch
therefore reverts the bits that are no longer needed.
The rtl.texi part isn't a full revert, since r254296 also updated the
documentation to mention unspecs in address calculations, and to relax
the requirement that the mode had to be Pmode.
2018-01-10 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* doc/rtl.texi: Remove documentation of (const ...) wrappers
for vectors, as a partial revert of r254296.
* rtl.h (const_vec_p): Delete.
(const_vec_duplicate_p): Don't test for vector CONSTs.
(unwrap_const_vec_duplicate, const_vec_series_p): Likewise.
* expmed.c (make_tree): Likewise.
Revert:
* common.md (E, F): Use CONSTANT_P instead of checking for
CONST_VECTOR.
* emit-rtl.c (gen_lowpart_common): Use const_vec_p instead of
checking for CONST_VECTOR.
From-SVN: r256421
When compiling the runtime, local variables and closures are not
allowed to be heap allocated. One test contains a go statement
with a closure. The gc compiler distinguishes between capturing a
variable by value and by address, and rewrites the closure to pass
the captured values as arguments. We don't have that yet, so the
escape analysis decides to heap allocate the closure and also the
captured variables, which is not allowed. Work around it by
passing the variables explicitly.
This is in preparation of turning on escape analysis for the
runtime.
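A minimal sketch of the kind of rewrite this involves (the names and
the surrounding code are illustrative, not taken from the actual
runtime change):

package main

import (
    "fmt"
    "sync"
)

func main() {
    var wg sync.WaitGroup
    msg := "hello"
    wg.Add(1)

    // Capturing form: the closure references msg and wg from the
    // enclosing scope, so it needs an environment that the escape
    // analysis may decide to heap allocate:
    //
    //   go func() { fmt.Println(msg); wg.Done() }()
    //
    // Workaround: pass everything explicitly so the closure has no
    // captured environment at all.
    go func(s string, wg *sync.WaitGroup) {
        fmt.Println(s)
        wg.Done()
    }(msg, &wg)

    wg.Wait()
}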
Reviewed-on: https://go-review.googlesource.com/86245
From-SVN: r256419
This is in preparation of turning on escape analysis for the
runtime.
- In gccgo, systemstack is implemented with mcall, which is not
marked go:noescape. Wrap the closure in noescape so the escape
analysis does not think it escapes.
- Mark some C functions go:noescape. They do not leak their
arguments.
- Use the noescape function to keep a few local variables'
addresses from escaping. The escape analysis cannot figure this
out on its own because the addresses are assigned to pointer
indirections.
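For reference, the noescape helper launders a pointer through a
uintptr so the escape analysis loses track of it. The sketch below
shows the same trick in a self-contained form (the surrounding program
is illustrative, not from this change):

package main

import (
    "fmt"
    "unsafe"
)

// noescape hides a pointer from escape analysis. The pointer is
// converted to a uintptr and XORed with 0 (an identity operation the
// analysis cannot see through), then converted back. This mirrors the
// helper used in the Go runtime; it is only safe when the pointer does
// not actually outlive the object it points to.
func noescape(p unsafe.Pointer) unsafe.Pointer {
    x := uintptr(p)
    return unsafe.Pointer(x ^ 0)
}

func main() {
    v := 42
    p := (*int)(noescape(unsafe.Pointer(&v)))
    fmt.Println(*p)
}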
Reviewed-on: https://go-review.googlesource.com/86244
From-SVN: r256418
Currently, an allocation expression that can be allocated on the
stack is implemented with __builtin_alloca, which turns into
__morestack_allocate_stack_space, which may call C malloc. This
may be slow. Also, if this happens inside certain runtime
functions (e.g. the write barrier), bad things might happen once
the escape analysis is enabled for the runtime. Use a temporary
variable on the stack for the allocation instead.
Also remove the write barrier from the assignment used to build a
heap expression when the result is stack allocated.
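An illustration of the construct affected (hypothetical code, not from
the patch): when the result of new does not escape, the frontend can
back the allocation with an ordinary stack temporary instead of alloca.

package main

import "fmt"

// localNew allocates an int that never leaves the function, so the
// escape analysis lets it live on the stack. With this change the
// frontend backs such an allocation with a plain stack temporary
// rather than __builtin_alloca.
func localNew() int {
    p := new(int)
    *p = 7
    return *p
}

func main() {
    fmt.Println(localNew())
}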
Reviewed-on: https://go-review.googlesource.com/86242
From-SVN: r256412
The escape analysis models closures by flowing the addresses of
captured variables to the closure node. However, the escape state
of these address expressions was left at ESCAPE_UNKNOWN, which
caused later passes to conclude that the addresses escape. Fix
this by setting their escape state to ESCAPE_NONE first. If an
address does escape (because the closure escapes), the flood phase
will set its escape state properly.
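A small example of the situation this affects (illustrative only): the
closure below captures the address of total, but since the closure
itself never escapes, the captured address must not be treated as
escaping either.

package main

import "fmt"

func sum(xs []int) int {
    total := 0
    // add captures &total. The closure is only called locally and
    // never escapes, so the captured address should end up as
    // ESCAPE_NONE rather than defaulting to ESCAPE_UNKNOWN.
    add := func(x int) { total += x }
    for _, x := range xs {
        add(x)
    }
    return total
}

func main() {
    fmt.Println(sum([]int{1, 2, 3}))
}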
Reviewed-on: https://go-review.googlesource.com/86240
From-SVN: r256411
A defer statement may need to allocate a thunk. When the defer is
not inside a loop, the thunk can be stack allocated, since it runs
before the function finishes.
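An illustrative case (hypothetical code, not from the change): the
deferred call below sits outside any loop, so its thunk is guaranteed
to run before the function returns and can live on the stack.

package main

import (
    "fmt"
    "os"
)

func writeGreeting(name string) error {
    f, err := os.Create(name)
    if err != nil {
        return err
    }
    // Not inside a loop: the thunk carrying this deferred call runs
    // before writeGreeting returns, so it can be stack allocated.
    defer f.Close()

    _, err = fmt.Fprintln(f, "hello")
    return err
}

func main() {
    if err := writeGreeting(os.TempDir() + "/greeting.txt"); err != nil {
        fmt.Println(err)
    }
}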
Reviewed-on: https://go-review.googlesource.com/85639
From-SVN: r256410
A Bound_method_expression needs a closure. Stack allocate the
closure when it does not escape.
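In Go source terms this corresponds to a method value, such as c.inc
below (the example is illustrative):

package main

import "fmt"

type counter struct{ n int }

func (c *counter) inc() { c.n++ }

func main() {
    var c counter
    // c.inc, used as a value rather than called directly, needs a
    // small closure pairing the method with its receiver. Since f
    // never escapes main, that closure can be allocated on the stack.
    f := c.inc
    f()
    f()
    fmt.Println(c.n)
}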
Reviewed-on: https://go-review.googlesource.com/85638
From-SVN: r256407
Move some checks of escape state earlier, from get_backend to
Mark_address_taken, so that we can reclaim escape analysis Nodes
before kicking off the backend (not done in this CL). It also
makes it easier to check that variables and closures do not escape
when the escape analysis is run for the runtime package (also not
done in this CL).
Reviewed-on: https://go-review.googlesource.com/85735
From-SVN: r256406
CL 83876 added support for the go:noescape pragma, but it only
worked for functions called from the same package. The pragma did
not take effect for exported functions that are not called from
within the package, because top-level function declarations were
not traversed and were only reached through calls from other
functions. This CL adds that support by extending the Traverse
class with a mode that traverses function declarations.
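A sketch of the kind of declaration this affects. The function name
and package are hypothetical, and a non-Go definition of the function
is assumed to exist elsewhere, so this file alone will not link:

package sysutil

import "unsafe"

// Memclr is defined outside Go (for example in C or assembly). The
// go:noescape pragma promises that the implementation does not retain
// ptr, so callers' arguments do not escape. With this change the
// pragma also applies when Memclr is exported and only called from
// other packages.
//
//go:noescape
func Memclr(ptr unsafe.Pointer, n uintptr)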
Reviewed-on: https://go-review.googlesource.com/85637
From-SVN: r256405
If we're making a slice of constant size that does not need to
escape, allocate it on the stack.
In the lowering pass, do not create temporaries for a constant-size
makeslice, so that it is easier to detect later that the slice has
constant size.
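An illustrative example (not from the patch): the backing array of buf
has a constant length and never leaves the function, so it can now be
placed on the stack.

package main

import "fmt"

func sumSquares() int {
    // Constant-size make whose result does not escape: the backing
    // array can be stack allocated.
    buf := make([]int, 8)
    for i := range buf {
        buf[i] = i * i
    }
    total := 0
    for _, v := range buf {
        total += v
    }
    return total
}

func main() {
    fmt.Println(sumSquares())
}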
Reviewed-on: https://go-review.googlesource.com/85636
From-SVN: r256404
Arrays that are sliced are marked as escaping during type
checking, very early in compilation. The escape analysis runs
later and cannot undo that. This CL stops marking them as escaping
at that early stage; the escape analysis will later make them
escape when needed.
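For example (illustrative code): slicing a below previously forced the
array onto the heap during type checking, even though the slice never
leaves the function.

package main

import "fmt"

func average() float64 {
    var a [4]float64
    // Slicing a no longer forces it to escape up front; since s stays
    // within this function, the escape analysis can keep a on the
    // stack.
    s := a[:]
    for i := range s {
        s[i] = float64(i + 1)
    }
    total := 0.0
    for _, v := range s {
        total += v
    }
    return total / float64(len(s))
}

func main() {
    fmt.Println(average())
}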
Reviewed-on: https://go-review.googlesource.com/85635
From-SVN: r256403
PR target/78585 has been fixed for GCC 7 by
commit 7ed04d053eead43d87dff40fb4e2904219afc4d5
Author: jakub <jakub@138bc75d-0d04-0410-961f-82ee72b054a4>
Date: Wed Nov 30 13:02:07 2016 +0000
* config/i386/i386.c (dimode_scalar_chain::convert_op): Avoid
sharing the SUBREG rtx between move and following insn.
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@243018 138bc75d-0d04-0410-961f-82ee72b054a4
PR target/78585:
* gcc.target/i386/pr78585.c: New test.
From-SVN: r256402
PR libstdc++/80276
* python/libstdcxx/v6/printers.py (SharedPointerPrinter)
(UniquePointerPrinter): Print correct template argument, not type of
the pointer.
(TemplateTypePrinter._recognizer.recognize): Handle failure to lookup
a type.
* testsuite/libstdc++-prettyprinters/cxx11.cc: Test unique_ptr of
array type.
* testsuite/libstdc++-prettyprinters/cxx17.cc: Test shared_ptr and
weak_ptr of array types.
From-SVN: r256400
If a local variable's address is taken and passed out of its
lexical scope, the GCC backend may reuse the variable's stack
slot, not knowing the variable is still live through a pointer. In
this case, we create a top-level temporary variable and let the
user-defined variable refer to the temporary variable as its
storage location. As the temporary variable is declared at the
top level, its stack slot will remain live throughout the
function.
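A minimal illustration of the scenario (hypothetical code):

package main

import "fmt"

func lastOfBlock() int {
    var p *int
    {
        x := 10
        p = &x // x's address leaves the block...
    }
    // ...and is still used here, after x's lexical scope has ended.
    // Backing x with a function-scope temporary keeps its stack slot
    // alive for the whole function, so the backend cannot reuse it
    // while *p still refers to it.
    return *p
}

func main() {
    fmt.Println(lastOfBlock())
}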
Reviewed-on: https://go-review.googlesource.com/84675
* go-gcc.cc (local_variable): Add decl_var parameter.
From-SVN: r256398
gcc/ChangeLog:
2018-01-09 Carl Love <cel@us.ibm.com>
* config/rs6000/altivec.md (p8_vmrgow): Add support for V2DI, V2DF,
V4SI, V4SF types.
(p8_vmrgew): Add support for V2DI, V2DF, V4SF types.
* config/rs6000/rs6000-builtin.def: Add definitions for FLOAT2_V2DF,
VMRGEW_V2DI, VMRGEW_V2DF, VMRGEW_V4SF, VMRGOW_V4SI, VMRGOW_V4SF,
VMRGOW_V2DI, VMRGOW_V2DF. Remove definition for VMRGOW.
* config/rs6000/rs6000-c.c (VSX_BUILTIN_VEC_FLOAT2,
P8V_BUILTIN_VEC_VMRGEW, P8V_BUILTIN_VEC_VMRGOW): Add definitions.
* config/rs6000/rs6000-protos.h: Add extern definition for
rs6000_generate_float2_double_code.
* config/rs6000/rs6000.c (rs6000_generate_float2_double_code): Add
function.
* config/rs6000/vsx.md (vsx_xvcdpsp): Add define_insn.
(float2_v2df): Add define_expand.
gcc/testsuite/ChangeLog:
2018-01-09 Carl Love <cel@us.ibm.com>
* gcc.target/powerpc/builtins-1.c (main): Add tests for vec_mergee and
vec_mergeo builtins with float, double, long long, unsigned long long,
bool long long arguments.
* gcc.target/powerpc/builtins-3-runnable.c (main): Add test for
vec_float2 with double arguments.
* gcc.target/powerpc/builtins-mergew-mergow.c: New runnable test for the
vec_mergew and vec_mergow builtins.
From-SVN: r256395
Add a flag -fgo-debug-escape-hash for debugging the escape
analysis. It takes a binary string, optionally preceded by a "-",
as its argument. When specified, the escape analysis runs only on
functions whose name hashes to a value with a matching suffix. The
"-" sign negates the match, i.e. the analysis runs only on
functions with a non-matching hash.
Reviewed-on: https://go-review.googlesource.com/83878
* lang.opt (fgo-debug-escape-hash): New option.
* go-c.h (struct go_create_gogo_args): Add debug_escape_hash
field.
* go-lang.c (go_langhook_init): Set debug_escape_hash field.
* gccgo.texi (Invoking gccgo): Document -fgo-debug-escape-hash.
From-SVN: r256393
2018-01-09 Steven G. Kargl <kargl@gcc.gnu.org>
PR fortran/83742
* expr.c (gfc_is_simply_contiguous): Check for NULL pointer.
2018-01-09 Steven G. Kargl <kargl@gcc.gnu.org>
PR fortran/83742
* gfortran.dg/contiguous_6.f90: New test.
From-SVN: r256391
2018-01-08 Aaron Sawdey <acsawdey@linux.vnet.ibm.com>
* config/rs6000/rs6000-string.c (do_load_for_compare_from_addr): New
function.
(do_ifelse): New function.
(do_isel): New function.
(do_sub3): New function.
(do_add3): New function.
(do_load_mask_compare): New function.
(do_overlap_load_compare): New function.
(expand_compare_loop): New function.
(expand_block_compare): Call expand_compare_loop() when appropriate.
* config/rs6000/rs6000.opt (-mblock-compare-inline-limit): Change
option description.
(-mblock-compare-inline-loop-limit): New option.
From-SVN: r256388
This patch makes the AArch64 vec_perm_const code use the new
vec_perm_indices routines, instead of checking each element individually.
This means that they extend naturally to variable-length vectors.
Also, aarch64_evpc_dup was the only function that generated rtl when
testing_p is true, and that looked accidental. The patch adds the
missing check and then replaces the gen_rtx_REG/start_sequence/
end_sequence stuff with an assert that no rtl is generated.
2018-01-09 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* config/aarch64/aarch64.c (aarch64_evpc_trn): Use d.perm.series_p
instead of checking each element individually.
(aarch64_evpc_uzp): Likewise.
(aarch64_evpc_zip): Likewise.
(aarch64_evpc_ext): Likewise.
(aarch64_evpc_rev): Likewise.
(aarch64_evpc_dup): Test the encoding for a single duplicated element,
instead of checking each element individually. Return true without
generating rtl if testing_p is true.
(aarch64_vectorize_vec_perm_const): Use all_from_input_p to test
whether all selected elements come from the same input, instead of
checking each element individually. Remove calls to gen_rtx_REG,
start_sequence and end_sequence and instead assert that no rtl is
generated.
From-SVN: r256385
The aarch64_legitimate_constant_p tests for HIGH and CONST seem
to be the wrong way round: (high (const ...)) is valid rtl that
could be passed in, but (const (high ...)) isn't. As it stands,
we disallow anchor+offset but allow (high anchor+offset).
2018-01-09 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* config/aarch64/aarch64.c (aarch64_legitimate_constant_p): Fix
order of HIGH and CONST checks.
From-SVN: r256384
As mentioned in https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01575.html ,
the scatter handling in vectorizable_store seems to be dead code at the
moment. Enabling it with the vect_analyze_data_ref_access part of
that patch triggered an ICE in the avx512f-scatter-*.c tests (which
previously didn't use scatters). The problem was that the NARROW
and WIDEN handling uses permute_vec_elements to marshal the inputs,
and permute_vec_elements expected the lhs of the stmt to be an SSA_NAME,
which of course it isn't for stores.
This patch makes permute_vec_elements create a fresh variable in this case.
2018-01-09 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* tree-vect-stmts.c (permute_vec_elements): Create a fresh variable
if the destination isn't an SSA_NAME.
From-SVN: r256383
2018-01-09 Richard Biener <rguenther@suse.de>
PR tree-optimization/83668
* graphite.c (canonicalize_loop_closed_ssa): Add edge argument,
move prologue...
(canonicalize_loop_form): ... here, renamed from ...
(canonicalize_loop_closed_ssa_form): ... this and amended to
swap successor edges for loop exit blocks to make us use
the RPO order we need for initial schedule generation.
* gcc.dg/graphite/pr83668.c: New testcase.
From-SVN: r256381
The folding of comparisons against Inf (to constants or comparisons
with the maximum finite value) has various cases where it introduces
or loses "invalid" exceptions for comparisons with NaNs.
Folding x > +Inf to 0 should not be about HONOR_SNANS - ordered
comparisons of both quiet and signaling NaNs should raise invalid.
x <= +Inf is not the same as x == x, because again that loses an
exception (equality comparisons don't raise exceptions except for
signaling NaNs).
x == +Inf is not the same as x > DBL_MAX, and a similar issue applies
with the x != +Inf case - that transformation causes a spurious
exception.
This patch fixes the conditionals on the folding to avoid
introducing or losing exceptions in this way.
Bootstrapped with no regressions on x86_64-pc-linux-gnu (where the
cases involving spurious exceptions wouldn't have failed anyway before
GCC 8 because of unordered comparisons wrongly always having formerly
been used by the back end). Also tested for powerpc-linux-gnu
soft-float that this fixes many glibc math/ test failures that arose
in that configuration because this folding affected the IBM long
double support in libgcc (no such failures appeared for hard-float
because of the bug of powerpc hard-float always using unordered
comparisons) - some failures remain, but I believe them to be
unrelated.
PR tree-optimization/64811
gcc:
* match.pd: When optimizing comparisons with Inf, avoid
introducing or losing exceptions from comparisons with NaN.
gcc/testsuite:
* gcc.dg/torture/inf-compare-1.c, gcc.dg/torture/inf-compare-2.c,
gcc.dg/torture/inf-compare-3.c, gcc.dg/torture/inf-compare-4.c,
gcc.dg/torture/inf-compare-5.c, gcc.dg/torture/inf-compare-6.c,
gcc.dg/torture/inf-compare-7.c, gcc.dg/torture/inf-compare-8.c:
New tests.
* gcc.c-torture/execute/ieee/fp-cmp-7.x: New file.
From-SVN: r256380
2018-01-09 Tamar Christina <tamar.christina@arm.com>
PR target/82641
* gcc.target/arm/pragma_fpu_attribute.c: Rewrite to use
no NEON and require softfp or hard float-abi.
* gcc.target/arm/pragma_fpu_attribute_2.c: Likewise.
From-SVN: r256375