Restrict the number of SGPRs and VGPRs available to non-kernel functions
to improve compute-unit occupancy with multiple threads.
2019-11-15 Kwok Cheung Yeung <kcy@codesourcery.com>
gcc/
* config/gcn/gcn.c (default_requested_args): New.
(gcn_parse_amdgpu_hsa_kernel_attribute): Initialize requested args
set with default_requested_args.
(gcn_conditional_register_usage): Limit register usage of non-kernel
functions. Reassign fixed registers if a non-standard set of args is
requested.
* config/gcn/gcn.h (FIXED_REGISTERS): Fix registers according to ABI.
From-SVN: r278301
gcn_conditional_register_usage needs to be called for every function
to set the fixed registers depending on the kernel args currently
requested.
2019-11-15 Kwok Cheung Yeung <kcy@codesourcery.com>
gcc/
* config/gcn/gcn.c (gcn_init_cumulative_args): Call reinit_regs.
From-SVN: r278299
Rather than reimplement brace elision here, we call reshape_init and then
discard the result. We needed to set CLASSTYPE_NON_AGGREGATE a bit more in
this patch, since outside a template it's set in check_bases_and_members.
* pt.c (maybe_aggr_guide, collect_ctor_idx_types): New.
(is_spec_or_derived): Split out from do_class_deduction.
(build_deduction_guide): Handle aggregate guide.
* class.c (finish_struct): Set CLASSTYPE_NON_AGGREGATE in a
template.
* cp-tree.h (CP_AGGREGATE_TYPE_P): An incomplete class is not an
aggregate.
From-SVN: r278298
Use v1 instead of v0 when a zero-valued VGPR is needed. This frees up
v0 for other purposes.
2019-11-15 Kwok Cheung Yeung <kcy@codesourcery.com>
gcc/
* config/gcn/gcn.c (gcn_expand_prologue): Remove initialization and
prologue use of v0.
(print_operand_address): Use v1 for zero vector offset.
From-SVN: r278297
r278235 broke conversions of vector/scalar shifts into vector/vector
shifts on targets that only provide the latter. We need to record
whether a conversion is required in that case too.
Also, the old useless_type_conversion_p condition seemed unnecessarily
strong, since the shift amount can have a different signedness from
the shifted value and its vector type is never assumed to be identical
to vectype. The patch therefore uses tree_nop_conversion_p instead.
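As an illustration only (this is not the PR testcase), a loop of the following shape is the kind of input that reaches vectorizable_shift as a vector/scalar shift: the shifted elements are unsigned, the shift amount is a signed loop-invariant scalar. The function name shift_all is hypothetical, and whether the loop is actually vectorised depends on the target and options.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Shift every element left by one loop-invariant amount.  The elements are
   unsigned while the amount is a signed int, so the two operands of the
   shift differ in signedness.  */
void
shift_all (uint32_t *restrict dst, const uint32_t *restrict src,
           size_t n, int amount)
{
  assert (amount >= 0 && amount < 32);  /* keep the shift well defined */
  for (size_t i = 0; i < n; i++)
    dst[i] = src[i] << amount;
}
```

On a target that only provides vector/vector shifts, the scalar amount has to be broadcast into a vector first, which is the conversion the message refers to.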
2019-11-15 Richard Sandiford <richard.sandiford@arm.com>
gcc/
PR tree-optimization/92515
* tree-vect-stmts.c (vectorizable_shift): Record incompatible op1
types when converting a vector/scalar shift into a vector/vector one,
using tree_nop_conversion_p instead of useless_type_conversion_p.
Move the conversion code to the transform block.
From-SVN: r278295
The documentation for __RTL tests (see "(gccint) RTL Tests" info node) has the
following snippet.
```
The parser expects the RTL body to be in the format emitted by this
dumping function:
DEBUG_FUNCTION void
print_rtx_function (FILE *outfile, function *fn, bool compact);
when "compact" is true. So you can capture RTL in the correct format
from the debugger using:
(gdb) print_rtx_function (stderr, cfun, true);
and copy and paste the output into the body of the C function.
```
Since r264944 print_rtx_function prints column number information, which the
__RTL function parsing does not handle.
This patch handles column number information optionally, so pre-existing __RTL
functions still work, and the above documentation quote still holds.
Note: If people would prefer to require column information, I could make the
code slightly neater and update the existing tests.
I guess this would be OK since the intended use for __RTL functions is in these
testcases so there is no worry about other existing code.
bootstrapped and regtested on aarch64
bootstrapped and regtested on x86_64
Ok for trunk?
Cheers,
Matthew
gcc/ChangeLog:
2019-11-15 Matthew Malcomson <matthew.malcomson@arm.com>
* read-rtl-function.c
(function_reader::add_fixup_source_location): Take additional
parameter of a column.
(function_reader::maybe_read_location): Optionally parse column
information and pass to add_fixup_source_location.
gcc/testsuite/ChangeLog:
2019-11-15 Matthew Malcomson <matthew.malcomson@arm.com>
* gcc.dg/rtl/aarch64/rtl-handle-column-numbers.c: New test.
From-SVN: r278294
2019-11-15 Richard Biener <rguenther@suse.de>
PR tree-optimization/92512
* tree-vect-loop.c (check_reduction_path): Fix operand index
computability check. Add check for second use in COND_EXPRs.
* gcc.dg/torture/pr92512.c: New testcase.
From-SVN: r278293
The new tree-cfg.c checking in r278245 tripped on folds of
ALTIVEC_BUILTIN_VPERM_*, which were using gimple_convert
rather than VIEW_CONVERT_EXPR to reinterpret the contents
of a vector as a different type.
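The difference between converting a value and reinterpreting its contents can be sketched with scalars. This is only an analogy for the vector case above, not the rs6000 code; the function names are hypothetical and the bit pattern assumes IEEE 754 binary32 floats.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert (sizeof (float) == sizeof (uint32_t),
               "sketch assumes a 32-bit float");

/* Value conversion: the numeric value is converted, so 1.0f becomes 1.  */
uint32_t
convert_value (float f)
{
  return (uint32_t) f;
}

/* Reinterpretation: the bytes are kept and only the type changes, which is
   what a VIEW_CONVERT_EXPR expresses.  */
uint32_t
reinterpret_bits (float f)
{
  uint32_t bits;
  memcpy (&bits, &f, sizeof bits);
  return bits;
}
```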
2019-11-15 Richard Sandiford <richard.sandiford@arm.com>
gcc/
PR target/92515
* config/rs6000/rs6000-call.c (rs6000_gimple_fold_builtin): Use
VIEW_CONVERT_EXPR to reinterpret vectors as different types.
From-SVN: r278292
Classify vcc_lo and vcc_hi into the VCC_CONDITIONAL_REG class,
and spill them into SGPRs if necessary.
2019-11-15 Kwok Cheung Yeung <kcy@codesourcery.com>
gcc/
* config/gcn/gcn.c (gcn_regno_reg_class): Return VCC_CONDITIONAL_REG
register class for VCC_LO and VCC_HI.
(gcn_spill_class): Use SGPR_REGS to spill registers in
VCC_CONDITIONAL_REG.
From-SVN: r278290
2019-11-15 Richard Biener <rguenther@suse.de>
PR tree-optimization/92324
* tree-vect-loop.c (vect_create_epilog_for_reduction): Fix
signedness of SLP reduction epilogue operations. Also reduce
the vector width for SLP reductions before doing elementwise
operations if possible.
* gcc.dg/vect/pr92324-4.c: New testcase.
From-SVN: r278289
2019-11-15 Paul Thomas <pault@gcc.gnu.org>
PR fortran/69654
* trans-expr.c (gfc_trans_structure_assign): Move assignment to
'cm' after treatment of C pointer types and test that the type
has been completely built before it. Add an assert that the
backend_decl for each component exists.
2019-11-15 Paul Thomas <pault@gcc.gnu.org>
PR fortran/69654
* gfortran.dg/derived_init_6.f90: New test.
From-SVN: r278287
Set global epilogue_completed when skipping pro_and_epilogue pass
When compiling RTL functions marked to start at a pass after the reload
pass, `skip_pass` is used to mark the reload pass as having completed
since many patterns use the `reload_completed` variable to determine
whether to run or not.
Here we do the same for the `epilogue_completed` variable and the
pro_and_epilogue pass.
Also include a testcase that relies on the availability of a
define_split in the aarch64 backend that is conditioned on this
`epilogue_completed` variable.
regtest done on native aarch64
regtest done on native x86_64
gcc/ChangeLog:
2019-11-15 Matthew Malcomson <matthew.malcomson@arm.com>
* passes.c (skip_pass): Set epilogue_completed if skipping the
pro_and_epilogue pass.
gcc/testsuite/ChangeLog:
2019-11-15 Matthew Malcomson <matthew.malcomson@arm.com>
* gcc.dg/rtl/aarch64/test-epilogue-set.c: New test.
From-SVN: r278285
2019-11-15 Andrew Stubbs <ams@codesourcery.com>
libgomp/
* testsuite/libgomp.c/target-print-1.c: New file.
* testsuite/libgomp.fortran/target-print-1.f90: New file.
* testsuite/libgomp.oacc-c/print-1.c: New file.
* testsuite/libgomp.oacc-fortran/print-1.f90: New file.
From-SVN: r278284
Hi there,
When compiling an __RTL function that has an invalid "startwith" pass we
currently don't run the dfinish cleanup pass. This means we ICE on the next
function.
This change ensures that all state is cleaned up for the next function
to run correctly.
As an example, before this change the following code would ICE when compiling
the function `foo2` because the "peephole2" pass is not run at optimisation
level -O0.
When compiled with
./aarch64-none-linux-gnu-gcc -O0 -S missed-pass-error.c -o test.s
```
int __RTL (startwith ("peephole2")) badfoo ()
{
(function "badfoo"
(insn-chain
(block 2
(edge-from entry (flags "FALLTHRU"))
(cnote 3 [bb 2] NOTE_INSN_BASIC_BLOCK)
(cinsn 101 (set (reg:DI x19) (reg:DI x0)))
(cinsn 10 (use (reg/i:SI x19)))
(edge-to exit (flags "FALLTHRU"))
) ;; block 2
) ;; insn-chain
) ;; function "badfoo"
}
int __RTL (startwith ("final")) foo2 ()
{
(function "foo2"
(insn-chain
(block 2
(edge-from entry (flags "FALLTHRU"))
(cnote 3 [bb 2] NOTE_INSN_BASIC_BLOCK)
(cinsn 101 (set (reg:DI x19) (reg:DI x0)))
(cinsn 10 (use (reg/i:SI x19)))
(edge-to exit (flags "FALLTHRU"))
) ;; block 2
) ;; insn-chain
) ;; function "foo2"
}
```
Now it silently ignores the __RTL function and successfully compiles foo2.
regtest done on aarch64
regtest done on x86_64
OK for trunk?
gcc/ChangeLog:
2019-11-15 Matthew Malcomson <matthew.malcomson@arm.com>
* passes.c (should_skip_pass_p): Always run "dfinish".
gcc/testsuite/ChangeLog:
2019-11-15 Matthew Malcomson <matthew.malcomson@arm.com>
* gcc.dg/rtl/aarch64/missed-pass-error.c: New test.
From-SVN: r278283
2019-11-15 Richard Biener <rguenther@suse.de>
* ipa-inline.c (inline_small_functions): Move assignment
to next before call destroying edge.
From-SVN: r278282
2019-11-15 Richard Biener <rguenther@suse.de>
PR tree-optimization/92039
PR tree-optimization/91975
* tree-ssa-loop-ivcanon.c (constant_after_peeling): Revert
previous change, treat invariants consistently as non-constant.
(tree_estimate_loop_size): Ternary ops with just the first op
constant are not optimized away.
* gcc.dg/tree-ssa/cunroll-2.c: Revert to state previous to
unroller adjustment.
* g++.dg/tree-ssa/ivopts-3.C: Likewise.
From-SVN: r278281
* gimplify.c (gimplify_call_expr): Don't call
omp_resolve_declare_variant after gimplification.
* omp-general.c (omp_context_selector_matches): For isa that might
match in some other function, defer if in declare simd function.
(omp_context_compute_score): Don't look for " score" in construct
trait set. Set *score to -1 if it can't ever match.
(omp_resolve_declare_variant): If any variants need to be deferred,
don't punt immediately, but compute scores of all variants and if
there is a score winner that doesn't need to be deferred, return that.
* c-c++-common/gomp/declare-variant-13.c: New test.
From-SVN: r278280
next is initialized only in the preceding loop; it is never updated
in its own loop.
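The general pattern can be sketched with a plain linked list. This is not the ipa-inline.c code; the types and the function drop_negative are hypothetical. The point is that next is read from the current element on every iteration, and before that element may be destroyed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct node
{
  int value;
  struct node *next;
};

/* Free every node holding a negative value and return the new head.  */
struct node *
drop_negative (struct node *head)
{
  struct node **link = &head;
  struct node *next;

  for (struct node *n = head; n != NULL; n = next)
    {
      assert (*link == n);  /* link always refers to the pointer to n */
      next = n->next;       /* updated each iteration, before n can die */
      if (n->value < 0)
        {
          *link = next;
          free (n);
        }
      else
        link = &n->next;
    }
  return head;
}
```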
gcc/ChangeLog:
2019-11-15 Xiong Hu Luo <luoxhu@linux.ibm.com>
* ipa-inline.c (inline_small_functions): Update iterator of next.
From-SVN: r278277
When the compiler writes an inlinable function to the export data,
parameter names are written out (in Export::write_name) using the
Gogo::message_name as opposed to a raw/encoded name. This means that
sink parameters (those named "_") get created with the name "_"
instead of "._" (the name created by the lexer/parser). This confuses
Gogo::is_sink_name, which looks for the latter sequence and not just
"_". This can cause issues later on if an inlinable function is
imported and fed through the rest of the compiler (things that are
sinks are not recognized as such). To fix these issues, change
Gogo::is_sink_name to return true for either variant ("_" or "._").
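A minimal sketch of the relaxed check, written in C for illustration. It is a simplified stand-in rather than the gofrontend implementation, and it assumes the name is passed as a bare string without any package qualification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for Gogo::is_sink_name: accept both the spelling
   produced by the lexer/parser ("._") and the one that comes back from
   export data ("_").  */
bool
is_sink_name (const char *name)
{
  assert (name != NULL);
  return strcmp (name, "_") == 0 || strcmp (name, "._") == 0;
}
```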
Fixes golang/go#35586.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/207259
From-SVN: r278275
* include/Makefile.am: Add <stop_token> header.
* include/Makefile.in: Regenerate.
* include/std/condition_variable: Add overloads for stop_token support
to condition_variable_any.
* include/std/stop_token: New file.
* include/std/thread: Add jthread type.
* include/std/version (__cpp_lib_jthread): New value.
* testsuite/30_threads/condition_variable_any/stop_token/1.cc: New test.
* testsuite/30_threads/condition_variable_any/stop_token/2.cc: New test.
* testsuite/30_threads/condition_variable_any/stop_token/wait_on.cc: New test.
* testsuite/30_threads/jthread/1.cc: New test.
* testsuite/30_threads/jthread/2.cc: New test.
* testsuite/30_threads/jthread/jthread.cc: New test.
* testsuite/30_threads/stop_token/1.cc: New test.
* testsuite/30_threads/stop_token/2.cc: New test.
* testsuite/30_threads/stop_token/stop_token.cc: New test.
From-SVN: r278274
When adding C2x attribute support, some [[fallthrough]] support
appeared as a side-effect because of code for that attribute going
through separate paths from the normal attribute handling.
However, going through those paths without the normal attribute
handlers meant that certain checks, such as for the invalid usage
[[fallthrough()]], did not operate. This patch improves checks by
adding this attribute to the standard attribute table, so that the
parser knows it expects no arguments, along with adding an explicit
check for "[[fallthrough]];" attribute-declarations at top level. As
with other attributes, there are still cases where warnings should be
pedwarns because C2x constraints are violated, but this patch improves
the attribute handling.
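For reference, a small sketch of the valid statement form. The FALLTHROUGH macro and the function count_from are hypothetical; the macro only expands to the standard attribute when the compiler reports C2x support, so the sketch still compiles in older modes.

```c
/* Use [[fallthrough]] only when it is known to be available; otherwise
   fall back to a no-op statement.  */
#if defined (__STDC_VERSION__) && __STDC_VERSION__ > 201710L \
    && defined (__has_c_attribute)
# if __has_c_attribute (fallthrough)
#  define FALLTHROUGH [[fallthrough]]
# endif
#endif
#ifndef FALLTHROUGH
# define FALLTHROUGH ((void) 0)
#endif

/* Count how many case labels are passed through starting at START.
   Per the message above, [[fallthrough()]] and a top-level
   "[[fallthrough]];" are the forms that now get diagnosed.  */
int
count_from (int start)
{
  int steps = 0;
  switch (start)
    {
    case 0:
      steps++;
      FALLTHROUGH;
    case 1:
      steps++;
      FALLTHROUGH;
    case 2:
      steps++;
      break;
    default:
      break;
    }
  return steps;
}
```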
Bootstrapped with no regressions on x86_64-pc-linux-gnu.
gcc/c:
* c-decl.c (std_attribute_table): Add fallthrough.
* c-parser.c (c_parser_declaration_or_fndef): Diagnose fallthrough
attribute at top level.
gcc/c-family:
* c-attribs.c (handle_fallthrough_attribute): Remove static.
* c-common.h (handle_fallthrough_attribute): Declare.
gcc/testsuite:
* gcc.dg/c2x-attr-fallthrough-2.c,
gcc.dg/c2x-attr-fallthrough-3.c: New tests.
From-SVN: r278273
This patch adds support for the C2x [[deprecated]] attribute. All the
actual logic for generating warnings can be identical to the GNU
__attribute__ ((deprecated)), as can the attribute handler, so this is
just a matter of wiring things up appropriately and adding the checks
specified in the standard. Unlike for C++, this patch gives
"deprecated" an entry in a table of standard attributes rather than
remapping it internally to the GNU attribute, as that seems a cleaner
approach to me.
Specifically, the only form of arguments to the attribute permitted in
the standard is (string-literal); empty parentheses are not permitted
in the case of no arguments, and a string literal (which includes
concatenated adjacent string literals, because concatenation is an
earlier phase of translation) cannot have further redundant
parentheses around it. For the case of empty parentheses, this patch
makes the C parser disallow them for all known attributes using the
[[]] syntax, as done for C++. For string literals (where the C++
front end is missing the check to avoid redundant parentheses, 92521
filed for that issue), a special case is inserted in the C parser.
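The accepted and rejected argument forms described above can be summarised in a short sketch. The macro and function names are hypothetical, and the attribute is only used when the compiler reports C2x support; in older modes the macros expand to nothing.

```c
#if defined (__STDC_VERSION__) && __STDC_VERSION__ > 201710L \
    && defined (__has_c_attribute)
# if __has_c_attribute (deprecated)
#  define STD_DEPRECATED [[deprecated]]
#  define STD_DEPRECATED_MSG(msg) [[deprecated (msg)]]
# endif
#endif
#ifndef STD_DEPRECATED
# define STD_DEPRECATED
# define STD_DEPRECATED_MSG(msg)
#endif

/* No argument clause at all: accepted.  */
STD_DEPRECATED int
old_add (int a, int b)
{
  return a + b;
}

/* One string literal: accepted.  Adjacent literals are concatenated in an
   earlier translation phase, so this still counts as a single literal.  */
STD_DEPRECATED_MSG ("use " "add3" " instead") int
old_add3 (int a, int b, int c)
{
  return a + b + c;
}

/* Rejected forms, left as comments so the sketch compiles:
     [[deprecated ()]]       empty parentheses
     [[deprecated (("x"))]]  redundant parentheses around the literal  */

/* Replacement that carries no attribute.  */
int
add3 (int a, int b, int c)
{
  return a + b + c;
}
```

Calling old_add or old_add3 in C2x mode would be expected to produce the usual deprecation warning, as with the GNU attribute.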
A known issue that I think can be addressed later as a bug fix is that
the warnings for the attribute being ignored in certain cases
(attribute declarations, statements, most uses on types) ought to be
pedwarns, as those usages are constraint violations.
Bad handling of wide string literals with this attribute is also a
pre-existing bug (91182 - although that's filed as a C++ bug, the code
in question is language-independent, in tree.c).
Bootstrapped with no regressions on x86_64-pc-linux-gnu.
gcc/c:
* c-decl.c (std_attribute_table): New.
(c_init_decl_processing): Register attributes from
std_attribute_table.
* c-parser.c (c_parser_attribute_arguments): Add arguments
require_string and allow_empty_args. All callers changed.
(c_parser_std_attribute): Set require_string argument for
"deprecated" attribute.
gcc/c-family:
* c-attribs.c (handle_deprecated_attribute): Remove static.
* c-common.h (handle_deprecated_attribute): Declare.
gcc/testsuite:
* gcc.dg/c2x-attr-deprecated-1.c, gcc.dg/c2x-attr-deprecated-2.c,
gcc.dg/c2x-attr-deprecated-3.c: New tests.
From-SVN: r278268
2019-11-14 Kwok Cheung Yeung <kcy@codesourcery.com>
gcc/
* lra-spills.c (assign_spill_hard_regs): Check that the spill
register is suitable for the mode.
From-SVN: r278267
C2x adds u8'' character constants to C. This patch adds the
corresponding GCC support.
Most of the support was already present for C++ and just needed
enabling for C2x. However, in C2x these constants have type unsigned
char, which required corresponding adjustments in the compiler and the
preprocessor to give them that type for C.
For C, it seems clear to me that having type unsigned char means the
constants are unsigned in the preprocessor (and thus treated as having
type uintmax_t in #if conditionals), so this patch implements that. I
included a conditional in the libcpp change to avoid affecting
signedness for C++, but I'm not sure if in fact these constants should
also be unsigned in the preprocessor for C++ in which case that
!CPP_OPTION (pfile, cplusplus) conditional would not be needed.
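The type change can be observed with _Generic, as in the sketch below. The macro LETTER_A and the function names are hypothetical; outside C2x mode the macro falls back to an explicit unsigned char cast so that the example still compiles.

```c
/* In C2x mode use a real u8 character constant; otherwise emulate its
   type and value with a cast.  */
#if defined (__STDC_VERSION__) && __STDC_VERSION__ > 201710L
# define LETTER_A u8'a'
#else
# define LETTER_A ((unsigned char) 0x61)
#endif

/* Return 1 if LETTER_A has type unsigned char, 0 otherwise.  _Generic
   does not apply the integer promotions to its controlling expression.  */
int
letter_a_is_unsigned_char (void)
{
  return _Generic (LETTER_A, unsigned char: 1, default: 0);
}

/* The value is the UTF-8 code unit for 'a'.  */
int
letter_a_value (void)
{
  return LETTER_A;
}
```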
Bootstrapped with no regressions on x86_64-pc-linux-gnu.
gcc/c:
* c-parser.c (c_parser_postfix_expression)
(c_parser_check_literal_zero): Handle CPP_UTF8CHAR.
* gimple-parser.c (c_parser_gimple_postfix_expression): Likewise.
gcc/c-family:
* c-lex.c (lex_charconst): Make CPP_UTF8CHAR constants unsigned
char for C.
gcc/testsuite:
* gcc.dg/c11-utf8char-1.c, gcc.dg/c2x-utf8char-1.c,
gcc.dg/c2x-utf8char-2.c, gcc.dg/c2x-utf8char-3.c,
gcc.dg/gnu2x-utf8char-1.c: New tests.
libcpp:
* charset.c (narrow_str_to_charconst): Make CPP_UTF8CHAR constants
unsigned for C.
* init.c (lang_defaults): Set utf8_char_literals for GNUC2X and
STDC2X.
From-SVN: r278265
gcc.dg/vect/bb-slp-40.c was failing on some targets because the
explicit dg-options overrode things like -maltivec. This patch
uses dg-additional-options instead.
Also, it seems safer not to require exactly 1 instance of each message,
since that depends on the target vector length.
gcc.dg/vect/bb-slp-41.c contained invariant constructors that are
vectorised on AArch64 (foo) and constructors that aren't (bar).
This meant that the number of times we print "Found vectorizable
constructor" depended on how many vector sizes we try, since we'd
print it for each failed attempt.
In foo, we create invariant { b[0], ... } and { b[1], ... },
and the test is making sure that the two separate invariant vectors
can be fed from the same vector load at b. This is a different case
from bb-slp-40.c, where the constructors are naturally separate.
(The expected count is 4 rather than 2 because we can vectorise the
epilogue too.)
However, due to limitations in the loop vectoriser, we still do the
addition of { b[0], ... } and { b[1], ... } in the loop. Hopefully
that'll be fixed at some point, so this patch adds an alternative test
that directly needs 4 separate invariant constructors. E.g. with Joel's
SLP optimisation, the new test generates:
ldr q4, [x1]
dup v7.4s, v4.s[0]
dup v6.4s, v4.s[1]
dup v5.4s, v4.s[2]
dup v4.4s, v4.s[3]
instead of the somewhat bizarre:
ldp s6, s5, [x1, 4]
ldr s4, [x1, 12]
ld1r {v7.4s}, [x1]
dup v6.4s, v6.s[0]
dup v5.4s, v5.s[0]
dup v4.4s, v4.s[0]
The patch then disables vectorisation of the original foo in
bb-slp-41.c, so that we get the same correctness testing
for bar but don't need to test for specific counts.
2019-11-14 Richard Sandiford <richard.sandiford@arm.com>
gcc/testsuite/
PR testsuite/92366
* gcc.dg/vect/bb-slp-40.c: Use dg-additional-options instead
of dg-options. Remove expected counts.
* gcc.dg/vect/bb-slp-41.c: Remove dg-options and explicit
dg-do run. Suppress vectorization of foo.
* gcc.dg/vect/bb-slp-42.c: New test.
From-SVN: r278262
2019-11-14 Andrew MacLeod <amacleod@redhat.com>
PR tree-optimization/92506
* range-op.cc (range_operator::fold_range): Start with range undefined.
(operator_abs::wi_fold): Fix wrong line copy... With wrapv, abs with
overflow is varying.
From-SVN: r278259
This is a follow-up to
https://gcc.gnu.org/ml/gcc-patches/2019-11/msg00919.html (r278095).
Dominance info is deleted even if we don't perform jump threading.
Since the whole point of this pass is to perform jump threading (other
cleanups are not valuable at this point), skip it completely when
flag_thread_jumps is not set.
gcc/ChangeLog:
2019-11-14 Ilya Leoshkevich <iii@linux.ibm.com>
PR rtl-optimization/92430
* cfgcleanup.c (pass_jump_after_combine::gate): New function.
(pass_jump_after_combine::execute): Perform jump threading
unconditionally.
From-SVN: r278254
2019-11-13 Jerome Lambourg <lambourg@adacore.com>
Doug Rupp <rupp@adacore.com>
Olivier Hainque <hainque@adacore.com>
gcc/
* config.gcc: Collapse the arm-vxworks entries into
a single arm-wrs-vxworks7* one, bpabi based. Update
the default cpu from arm8 to armv7-a.
* config/arm/vxworks.h (CC1_SPEC): Simplify, knowing that
we always use ARM_UNWIND_INFO.
(DWARF2_UNWIND_INFO): Remove redefinition.
(ARM_TARGET2_DWARF_FORMAT): Likewise.
(VXWORKS_PERSONALITY): Define, to "llvm".
(VXWORKS_EXTRA_LIBS_RTP): Define, to "-lllvm".
libgcc/
* config.host: Collapse the arm-vxworks entries into
a single arm-wrs-vxworks7* one.
* config/arm/unwind-arm-vxworks.c: Update comments. Provide
__gnu_Unwind_Find_exidx and a weak dummy __cxa_type_match for
kernel modules, to be overridden by libstdc++ when we link with
it. Rely on externally provided __exidx_start/end.
Co-Authored-By: Doug Rupp <rupp@adacore.com>
Co-Authored-By: Olivier Hainque <hainque@adacore.com>
From-SVN: r278253
2019-11-14 Jerome Lambourg <lambourg@adacore.com>
* config/arm/vxworks.h (TARGET_OS_CPP_BUILTINS): Use
_VX_CPU instead of CPU and handle arm_arch8.
From-SVN: r278252
2019-11-12 Olivier Hainque <hainque@adacore.com>
libgcc/
* config/t-gthr-vxworksae: New file, add all the gthr-vxworks
sources except the cxx0x support to LIB2ADDEH. We don't support
cxx0x on AE/653.
* config/t-vxworksae: New file.
* config.host: Handle *-*-vxworksae: Add the two aforementioned
Makefile fragment files at their expected position in the tmake_file
list, in accordance with what is done for other VxWorks variants.
From-SVN: r278250
2019-11-12 Corentin Gay <gay@adacore.com>
Jerome Lambourg <lambourg@adacore.com>
Olivier Hainque <hainque@adacore.com>
libgcc/
* config/t-gthr-vxworks: New file, add all the gthr-vxworks
sources to LIB2ADDEH.
* config/t-vxworks: Remove adjustments to LIB2ADDEH.
* config/t-vxworks7: Likewise.
* config.host: Append a block at the end of the file to add the
t-gthr files to the tmake_file list for VxWorks after everything
else.
* config/vxlib.c: Rename as gthr-vxworks.c.
* config/vxlib-tls.c: Rename as gthr-vxworks-tls.c.
* config/gthr-vxworks.h: Simplify a few comments. Expose a TAS
API and a basic error checking API, both internal. Simplify the
__gthread_once_t type definition and initializers. Add sections
for condition variables support and for the C++0x thread support,
conditioned against Vx653 for the latter.
* config/gthr-vxworks.c (__gthread_once): Simplify comments and
implementation, leveraging the TAS internal API.
* config/gthr-vxworks-tls.c: Introduce an internal TLS data access
API, leveraging the general availability of TLS services in VxWorks7
post SR6xxx.
(__gthread_getspecific, __gthread_setspecific): Use it.
(tls_delete_hook): Likewise, and simplify the enter/leave dtor logic.
* config/gthr-vxworks-cond.c: New file. GTHREAD_COND variable
support based on VxWorks primitives.
* config/gthr-vxworks-thread.c: New file. GTHREAD_CXX0X support
based on VxWorks primitives.
Co-Authored-By: Jerome Lambourg <lambourg@adacore.com>
Co-Authored-By: Olivier Hainque <hainque@adacore.com>
From-SVN: r278249
2019-11-06 Jerome Lambourg <lambourg@adacore.com>
Olivier Hainque <hainque@adacore.com>
libgcc/
* config/vxcrtstuff.c: New file.
* config/t-vxcrtstuff: New Makefile fragment.
* config.host: Append t-vxcrtstuff to the tmake_file list
on all VxWorks ports using dwarf for table based EH.
gcc/
* config/vx-common.h (USE_TM_CLONE_REGISTRY): Remove
definition, pointless with a VxWorks specific version
of crtstuff.
(DWARF2_UNWIND_INFO): Conditionalize on !ARM_UNWIND_INFO.
* config/vxworks.h (VX_CRTBEGIN_SPEC, VX_CRTEND_SPEC):
New local macros, controlling the addition of vxworks specific
crtstuff objects depending on the EH mechanism and kind of
module being linked.
(VXWORKS_STARTFILE_SPEC, VXWORKS_ENDFILE_SPEC): Use them.
Co-Authored-By: Olivier Hainque <hainque@adacore.com>
From-SVN: r278248
2019-11-06 Pat Bernardi <bernardi@adacore.com>
Jerome Lambourg <lambourg@adacore.com>
Olivier Hainque <hainque@adacore.com>
gcc/
* config.gcc: Add comment to introduce the TARGET_VXWORKS
common macro definitions, conveying VXWORKS7 or 64bit general
variations. Add a block to set gcc_cv_initfini_array
unconditionally to "yes" for VxWorks7.
* config/vx-common.h (VXWORKS_CC1_SPEC): New macro, empty string
by default. Update some comments.
* config/vxworks.h (VXWORKS_EXTRA_LIBS_RTP): New macro, empty by
default, to be added at the end of VXWORKS_LIBS_RTP.
(VXWORKS_LIBS_RTP): Replace hardcoded part by VXWORKS_BASE_LIBS_RTP
and append VXWORKS_EXTRA_LIBS_RTP, both of which specific ports may
redefine.
(VXWORKS_NET_LIBS_RTP): Account for VxWorks7 specificities.
(VXWORKS_CC1_SPEC): Common base definition, with VxWorks7 variation
to account for the now available TLS abilities.
(TARGET_LIBC_HAS_FUNCTION): Account for VxWorks7 abilities.
(VXWORKS_HAVE_TLS): Likewise.
Co-Authored-By: Jerome Lambourg <lambourg@adacore.com>
Co-Authored-By: Olivier Hainque <hainque@adacore.com>
From-SVN: r278247
If the statements in an SLP node aren't similar enough to be vectorised,
or aren't something the vectoriser has code to handle, the BB vectoriser
tries building the vector from scalars instead. This patch does the
same thing if we're able to build a viable-looking tree but fail later
during the analysis phase, e.g. because the target doesn't support a
particular vector operation.
This is needed to avoid regressions with a later patch.
2019-11-14 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vect-slp.c (vect_contains_pattern_stmt_p): New function.
(vect_slp_convert_to_external): Likewise.
(vect_slp_analyze_node_operations): If analysis fails, try building
the node from scalars instead.
gcc/testsuite/
* gcc.dg/vect/bb-slp-div-2.c: New test.
From-SVN: r278246
This patch adds AArch64 patterns for converting between 64-bit and
128-bit integer vectors, and makes the vectoriser and expand pass
use them.
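Loops of the following shape are the sort of code that needs such conversions: two 32-bit elements fill a 64-bit vector and two 64-bit elements fill a 128-bit vector. The functions are hypothetical illustrations, not the new tests, and whether these exact patterns are used depends on the target and options.

```c
#include <stddef.h>
#include <stdint.h>

/* Widen each 32-bit element to 64 bits (sign extension).  */
void
widen_elements (int64_t *restrict dst, const int32_t *restrict src, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = src[i];
}

/* Narrow each 64-bit element to 32 bits (truncation; unsigned types keep
   the result well defined).  */
void
narrow_elements (uint32_t *restrict dst, const uint64_t *restrict src,
                 size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = (uint32_t) src[i];
}
```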
2019-11-14 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-cfg.c (verify_gimple_assign_unary): Handle conversions
between vector types.
* tree-vect-stmts.c (vectorizable_conversion): Extend the
non-widening and non-narrowing path to handle standard
conversion codes, if the target supports them.
* expr.c (convert_move): Try using the extend and truncate optabs
for vectors.
* optabs-tree.c (supportable_convert_operation): Likewise.
* config/aarch64/iterators.md (Vnarrowq): New iterator.
* config/aarch64/aarch64-simd.md (<optab><Vnarrowq><mode>2)
(trunc<mode><Vnarrowq>2): New patterns.
gcc/testsuite/
* gcc.dg/vect/bb-slp-pr69907.c: Do not expect BB vectorization
to fail for aarch64 targets.
* gcc.dg/vect/no-scevccp-outer-12.c: Expect the test to pass
on aarch64 targets.
* gcc.dg/vect/vect-double-reduc-5.c: Likewise.
* gcc.dg/vect/vect-outer-4e.c: Likewise.
* gcc.target/aarch64/vect_mixed_sizes_5.c: New test.
* gcc.target/aarch64/vect_mixed_sizes_6.c: Likewise.
* gcc.target/aarch64/vect_mixed_sizes_7.c: Likewise.
* gcc.target/aarch64/vect_mixed_sizes_8.c: Likewise.
* gcc.target/aarch64/vect_mixed_sizes_9.c: Likewise.
* gcc.target/aarch64/vect_mixed_sizes_10.c: Likewise.
* gcc.target/aarch64/vect_mixed_sizes_11.c: Likewise.
* gcc.target/aarch64/vect_mixed_sizes_12.c: Likewise.
* gcc.target/aarch64/vect_mixed_sizes_13.c: Likewise.
From-SVN: r278245
Although a previous patch allowed mixed vector sizes within a vector
region, we generally still required equal vector sizes within a vector
stmt. Specifically, vect_get_vector_types_for_stmt computes two vector
types: the vector type corresponding to STMT_VINFO_VECTYPE and the
vector type that determines the minimum vectorisation factor for the
stmt ("nunits_vectype"). It then required these two types to be
the same size.
There doesn't seem to be any need for that restriction though. AFAICT,
all vectorizable_* functions either do their own compatibility checks
or don't need to do them (because gimple guarantees that the scalar
types are compatible).
It should always be the case that nunits_vectype has at least as many
elements as the other vectype, but that's something we can assert for.
I couldn't resist a couple of other tweaks while there:
- there's no need to compute nunits_vectype if its element type is
the same as STMT_VINFO_VECTYPE's.
- it's useful to distinguish the nunits_vectype from the main vectype
in dump messages
- when reusing the existing STMT_VINFO_VECTYPE, it's useful to say so
in the dump, and say what the type is
2019-11-14 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vect-stmts.c (vect_get_vector_types_for_stmt): Don't
require vectype and nunits_vectype to have the same size;
instead assert that nunits_vectype has at least as many
elements as vectype. Don't compute a separate nunits_vectype
if the scalar type is obviously the same as vectype's.
Tweak dump messages.
From-SVN: r278244
This patch makes the vectoriser try mixtures of 64-bit and 128-bit
vector modes on AArch64. It fixes some existing XFAILs and allows
kernel 24 from the Livermore Loops test to be vectorised (by using
a mixture of V2DF and V2SI).
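As a rough illustration of why mixing sizes helps, consider a loop that combines double and 32-bit int data: two doubles fill a 128-bit V2DF vector, while two ints fill a 64-bit V2SI vector, so both operands have the same number of lanes. The function add_mixed is hypothetical and is not kernel 24 itself.

```c
#include <stddef.h>
#include <stdint.h>

/* Add a 32-bit integer array to a double array element by element.  */
void
add_mixed (double *restrict dst, const double *restrict a,
           const int32_t *restrict b, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = a[i] + (double) b[i];
}
```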
2019-11-14 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* config/aarch64/aarch64.c (aarch64_vectorize_related_mode): New
function.
(aarch64_autovectorize_vector_modes): Also add V4HImode and V2SImode.
(TARGET_VECTORIZE_RELATED_MODE): Define.
gcc/testsuite/
* gcc.dg/vect/vect-outer-4f.c: Expect the test to pass on aarch64
targets.
* gcc.dg/vect/vect-outer-4g.c: Likewise.
* gcc.dg/vect/vect-outer-4k.c: Likewise.
* gcc.dg/vect/vect-outer-4l.c: Likewise.
* gfortran.dg/vect/vect-8.f90: Expect kernel 24 to be vectorized
for aarch64.
* gcc.target/aarch64/vect_mixed_sizes_1.c: New test.
* gcc.target/aarch64/vect_mixed_sizes_2.c: Likewise.
* gcc.target/aarch64/vect_mixed_sizes_3.c: Likewise.
* gcc.target/aarch64/vect_mixed_sizes_4.c: Likewise.
From-SVN: r278243