2017-12-08 Martin Jambor <mjambor@suse.cz>
PR tree-optimization/83141
* tree-sra.c (contains_vce_or_bfcref_p): Move up in the file, also
test for MEM_REFs implicitely changing types with padding. Remove
inline keyword.
(build_accesses_from_assign): Added contains_vce_or_bfcref_p checks.
testsuite/
* gcc.dg/tree-ssa/pr83141.c: New test.
* gcc.dg/guality/pr54970.c: XFAIL tests querying a[0].
From-SVN: r255510
In PR83304 two insns are combined, where the I2 uses a register that
has a REG_DEAD note on an insn after I2 but before I3. In such a case
move_deaths should move that death note. But move_deaths only looks
at the reg_stat[regno].last_death insn, and that field can be zeroed
out (previously, use_crosses_set_p would prevent the combination in
this case).
If the last_death field is zero it means "unknown", not "no death", so
we have to find if there is a REG_DEAD note.
PR rtl-optimization/83304
* combine.c (move_deaths): If we do not know where a register died,
search for it.
From-SVN: r255506
2017-12-08 Richard Biener <rguenther@suse.de>
* gimple-loop-interchange.cc (tree_loop_interchange::interchange):
Provide -fopt-info-loop feedback when we interchange in a nest.
From-SVN: r255505
A quirk in the historical naming of some ARMv6 products means that the
main CPU name implies the presence or otherwise of the floating point unit.
This causes problems when using -mfpu=auto with -mcpu=native: the driver is
picking a CPU that does not support a floating-point unit, even though
one may well exist.
This patch addresses this by selecting the FP-capable names so that FP
instructions will be generated if the other options suggest this is
permitted.
Note that a more complete fix is really needed here to look up the
FP/simd capabilities and append the appropriate capability extensions.
This will be the subject of some follow-up patches.
* config/arm/driver-arm.c (arm_cpu_table): Use fp-capable product names
for armv6 ARM CPU IDs.
From-SVN: r255504
When GCC invokes the assembler it generates a sanitized version of the
user-specified -march option to pass through, since the assembler does
not understand all the new FPU-related architectural options.
Unfortunately it goes too far and strips off all the architectural
extensions, including some that are unrelated to the -mfpu variant
selected.
Again, this doesn't really matter when compiling C code because the
compiler will override the command-line specified architecture with
directives in the assembly file itself, but when using the compiler
driver to invoke the assembler the only indiciation of the desired
architecture might come from the command line.
We fix this by adjusting the canonicalization pass to remove any
option that only specifies features that can be expressed by -mfpu
(any that go beyond that are already supported by the assembler). We
do have to take care to re-order the options, though as the assembler
expects feature options to be in a canonical order (unlike the
compiler, where ordering is handled left-to-right: there's only a
difference if there are negation options, but a canonicalized
architecture string shouldn't have any of those). We do this by
recording which options we need and then sorting the final list
alphabetically.
* common/config/arm/arm-common.c: Include <algorithm>.
(INCLUDE_VECTOR): Define.
(compare_opt_names): New function.
(arm_rewrite_selected_arch): Only strip out extensions that can be
expressed through -mfpu. Sort the remaining extensions
alphabetically.
From-SVN: r255503
When gcc runs with the new -mfpu=auto option (either explicitly or
when that's the default behaviour) then this option is not passed
through to the assembler as we cannot rely on the assembler
understanding it (currently it doesn't understand it at all, but in
future that might change). That means that the assembler falls back
to its builtin default, which may not correspond to what the user
expected based on the command-line options they passed.
Normally that wouldn't matter because assembler files generated by the
compiler will contain explicit directives that set the FPU type
directly and override any internal defaults; but when the compiler
driver is used to invoke the assembler directly (because the source
file ends in .s or .S) then this might cause a problem if that assumes
the FPU matches the compiler.
To address this, this patch makes the driver construct a -mfpu= option
for the assembler in the same way as the compiler generates an
internal .fpu directive. As mentioned, this makes no difference if
the assembler file explicitly overrides the command line options, but
helps in the case where this is implicit.
* config/arm/arm.h (arm_asm_auto_mfpu): Declare.
(ASM_CPU_SPEC_FUNCTIONS): Add new rule asm_auto_mfpu.
(ASM_CPU_SPEC): Use it if -mfpu is set to auto.
* common/config/arm/arm-common.c (arm_asm_auto_mfpu): New function.
-- This line, and those below, will be ignored--
M gcc/ChangeLog
M gcc/common/config/arm/arm-common.c
M gcc/config/arm/arm.h
From-SVN: r255502
2017-06-08 Tristan Gingold <gindold@adacore.com>
PR ada/81470
* dwarf2out.c (dwarf2out_do_cfi_startproc): Only emit
.cfi_personality or .cfi_lsda if the eh data format is dwarf2.
From-SVN: r255501
2017-12-08 Richard Biener <rguenther@suse.de>
PR tree-optimization/81303
* tree-vect-stmts.c (vect_is_simple_cond): For invariant
conditions try to create a comparison vector type matching
the data vector type.
(vectorizable_condition): Adjust.
* tree-vect-patterns.c (vect_recog_mask_conversion_pattern):
Leave invariant conditions alone in case we can vectorize those.
* gcc.target/i386/vectorize9.c: New testcase.
* gcc.target/i386/vectorize10.c: New testcase.
From-SVN: r255497
2017-12-07 Andrew Waterman <andrew@sifive.com>
gcc/
* config/riscv/riscv.c (TARGET_ASM_SELECT_SECTION): New define.
(TARGET_HAVE_SRODATA_SECTION): New define.
(riscv_select_section): New function.
From-SVN: r255491
PR tree-optimization/83075
* gcc.dg/tree-ssa/strncpy-2.c: Use size_t instead of unsigned, add
separate function with noipa attribute to also verify behavior when
optimizers don't know the sizes and aliasing, verify resulting sizes
and array content. Add -Wstringop-overflow to dg-options.
* gcc.dg/tree-ssa/strncat.c: Likewise.
From-SVN: r255485
Effective target fstack_protector fails to return an error for
newlib-based target (such as arm-none-eabi targets) which does not
support stack protector. This is due to the test being too simplist for
stack protection code to be generated by GCC: it does not contain a
local buffer and does not read unknown input.
This commit adds a small local buffer with a copy of the filename to
trigger stack protector code to be generated. The filename is used
instead of the full path so as to ensure the size will fit in the local
buffer.
2017-12-07 Thomas Preud'homme <thomas.preudhomme@arm.com>
gcc/testsuite/
* lib/target-supports.exp (check_effective_target_fstack_protector):
Copy filename in local buffer to trigger stack protection.
From-SVN: r255484
C11 DR#444 dealt with how C11 intended to allow alignment specifiers
on struct and union members, but failed to include that in the syntax.
The final resolution of that DR also allows alignment specifiers in
type names in compound literals (in order to apply an increased
alignment to the unnamed object created by the compound literal), but
not other cases of type names.
This patch implements allowing alignment specifiers in compound
literals and adds tests for the resolution of the DR (including that
they are allowed on struct and union members, which GCC already
implemented). Because the parser has to parse the parenthesized type
name of a compound literal before it can tell that it's a compound
literal (rather than, depending on the context, a cast expression or
sizeof (type-name) or _Alignof (type-name)), this means _Alignas
specifiers are allowed syntactically in those contexts and then an
error is given once it's known to be an invalid use (whereas _Alignas
specifiers are disallowed syntactically in other contexts where type
names can occur and a compound literal is not possible).
Bootstrapped with no regressions on x86_64-pc-linux-gnu.
gcc/c:
* c-decl.c (build_compound_literal): Add parameter alignas_align
and set alignment of decl if nonzero.
* c-parser.c (c_keyword_starts_typename): Allow RID_ALIGNAS.
(c_parser_declspecs): Allow RID_ALIGNAS to follow a type, like a
qualifier.
(c_parser_struct_declaration): Update syntax comment.
(c_parser_type_name): Add alignas_ok argument and pass it to
c_parser_declspecs.
(c_parser_cast_expression): Pass true to c_parser_type_name and
give error if a cast used an _Alignas specifier.
(c_parser_sizeof_expression): Pass true to c_parser_type_name and
give error if sizeof (type-name) used an _Alignas specifier.
(c_parser_alignof_expression): Pass true to c_parser_type_name and
give error if _Alignof (type-name) used an _Alignas specifier.
(c_parser_postfix_expression_after_paren_type): Check specified
alignment for a compound literal and pass it to
build_compound_literal.
* c-parser.h (c_parser_type_name): Update prototype.
* c-tree.h (build_compound_literal): Update prototype.
gcc/testsuite:
* gcc.dg/c11-align-7.c, gcc.dg/c11-align-8.c,
gcc.dg/c11-align-9.c, gcc.dg/gnu11-align-1.c: New tests.
* gcc.dg/c11-align-5.c (test): Update expected error for sizeof
case.
From-SVN: r255482
Three related regression fixes:
- We can't use asserts like:
gcc_assert (GET_MODE_SIZE (mode) == 16);
in aarch64_print_operand because it could trigger for invalid user input.
- The output_operand_lossage in aarch64_print_address_internal:
output_operand_lossage ("invalid operand for '%%%c'", op);
wasn't right because "op" is an rtx_code enum rather than the
prefix character.
- aarch64_print_operand_address shouldn't call output_operand_lossage
(because it doesn't have a prefix code) but instead fall back to
output_addr_const.
2017-12-05 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* config/aarch64/aarch64.c (aarch64_print_address_internal): Return
a bool success value. Don't call output_operand_lossage here.
(aarch64_print_ldpstp_address): Return a bool success value.
(aarch64_print_operand_address): Call output_addr_const if
aarch64_print_address_internal fails.
(aarch64_print_operand): Don't assert that the mode is 16 bytes for
'y'; call output_operand_lossage instead. Call output_operand_lossage
if aarch64_print_ldpstp_address fails.
gcc/testsuite/
* gcc.target/aarch64/asm-2.c: New test.
* gcc.target/aarch64/asm-3.c: Likewise.
From-SVN: r255481
This patch makes various bits of code operate directly on the new
VECTOR_CST encoding, instead of using VECTOR_CST_ELT on all elements
of the vector.
Previous patches handled operations that produce a new VECTOR_CST,
while this patch handles things like predicates. It also makes
print_node dump the encoding instead of the full vector that
the encoding represents.
2017-12-07 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* tree-vector-builder.h
(tree_vector_builder::binary_encoded_nelts): Declare.
* tree-vector-builder.c
(tree_vector_builder::binary_encoded_nelts): New function.
* fold-const.c (negate_expr_p): Likewise.
(operand_equal_p, fold_checksum_tree): Likewise.
* tree-loop-distribution.c (const_with_all_bytes_same): Likewise.
* tree.c (integer_zerop, integer_onep, integer_all_onesp, real_zerop)
(real_onep, real_minus_onep, add_expr, initializer_zerop): Likewise.
(uniform_vector_p): Likewise.
* varasm.c (const_hash_1, compare_constant): Likewise.
* tree-ssa-ccp.c: Include tree-vector-builder.h.
(valid_lattice_transition): Operate directly on the VECTOR_CST
encoding.
* ipa-icf.c: Include tree-vector-builder.h.
(sem_variable::equals): Operate directly on the VECTOR_CST encoding.
* print-tree.c (print_node): Print encoding of VECTOR_CSTs.
From-SVN: r255480
The only remaining uses of build_vector are in the selftests in tree.c.
This patch makes it static and moves it to the selftest part of the file.
2017-12-07 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* tree.c (build_vector): Delete.
* tree.h (build_vector): Make static and move into the self-testing
block.
From-SVN: r255479
This patch changes gimple_build_vector so that it takes a
tree_vector_builder instead of a size and a vector of trees.
2017-12-07 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* vector-builder.h (vector_builder::derived): New const overload.
(vector_builder::elt): New function.
* tree-vector-builder.h (tree_vector_builder::type): New function.
(tree_vector_builder::apply_step): Declare.
* tree-vector-builder.c (tree_vector_builder::apply_step): New
function.
* gimple-fold.h (tree_vector_builder): Declare.
(gimple_build_vector): Take a tree_vector_builder instead of a
type and vector of elements.
* gimple-fold.c (gimple_build_vector): Likewise.
* tree-vect-loop.c (get_initial_def_for_reduction): Update call
accordingly.
(get_initial_defs_for_reduction): Likewise.
(vectorizable_induction): Likewise.
From-SVN: r255478
This patch makes fold-const.c operate directly on the VECTOR_CST
encoding when folding an operation that has two VECTOR_CST inputs.
2017-12-07 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* tree-vector-builder.h
(tree_vector_builder::new_binary_operation): Declare.
* tree-vector-builder.c
(tree_vector_builder::new_binary_operation): New function.
* fold-const.c (fold_relational_const): Use it.
(const_binop): Likewise. Check that both input vectors have
the same number of elements, thus excluding things like WIDEN_SUM.
Check whether it is possible to operate directly on the encodings
of stepped inputs.
From-SVN: r255477
This patch makes fold-const.c operate directly on the VECTOR_CST
encoding when folding an operation that has a single VECTOR_CST input.
2017-12-07 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* fold-const.c (fold_negate_expr_1): Use tree_vector_builder and
new_unary_operation, operating only on the encoded elements.
(const_unop): Likewise.
(exact_inverse): Likewise.
(distributes_over_addition_p): New function.
(const_binop): Use tree_vector_builder and new_unary_operation
for combinations of VECTOR_CST and INTEGER_CST. Operate only
on the encoded elements unless the encoding is strided and the
operation does not distribute over addition.
(fold_convert_const): Use tree_vector_builder and
new_unary_operation. Operate only on the encoded elements
for truncating integer conversions, or for non-stepped encodings.
From-SVN: r255476
This patch switches most build_vector calls over to tree_vector_builder,
using explicit encodings where appropriate. Later patches handle
the remaining uses of build_vector.
2017-12-07 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* config/sparc/sparc.c: Include tree-vector-builder.h.
(sparc_fold_builtin): Use tree_vector_builder instead of build_vector.
* expmed.c: Include tree-vector-builder.h.
(make_tree): Use tree_vector_builder instead of build_vector.
* fold-const.c: Include tree-vector-builder.h.
(const_binop): Use tree_vector_builder instead of build_vector.
(const_unop): Likewise.
(native_interpret_vector): Likewise.
(fold_vec_perm): Likewise.
(fold_ternary_loc): Likewise.
* gimple-fold.c: Include tree-vector-builder.h.
(gimple_fold_stmt_to_constant_1): Use tree_vector_builder instead
of build_vector.
* tree-ssa-forwprop.c: Include tree-vector-builder.h.
(simplify_vector_constructor): Use tree_vector_builder instead
of build_vector.
* tree-vect-generic.c: Include tree-vector-builder.h.
(add_rshift): Use tree_vector_builder instead of build_vector.
(expand_vector_divmod): Likewise.
(optimize_vector_constructor): Likewise.
* tree-vect-loop.c: Include tree-vector-builder.h.
(vect_create_epilog_for_reduction): Use tree_vector_builder instead
of build_vector. Explicitly use a stepped encoding for
{ 1, 2, 3, ... }.
* tree-vect-slp.c: Include tree-vector-builder.h.
(vect_get_constant_vectors): Use tree_vector_builder instead
of build_vector.
(vect_transform_slp_perm_load): Likewise.
(vect_schedule_slp_instance): Likewise.
* tree-vect-stmts.c: Include tree-vector-builder.h.
(vectorizable_bswap): Use tree_vector_builder instead of build_vector.
(vect_gen_perm_mask_any): Likewise.
(vectorizable_call): Likewise. Explicitly use a stepped encoding.
* tree.c: (build_vector_from_ctor): Use tree_vector_builder instead
of build_vector.
(build_vector_from_val): Likewise. Explicitly use a duplicate
encoding.
From-SVN: r255475
This patch uses a simple compression scheme to represent the contents
of a VECTOR_CST using its leading elements. There are three formats:
1) a repeating sequence of N values. This is encoded using the first
N elements.
2) a "foreground" sequence of N values inserted at the beginning of
a "background" repeating sequence of N values, such as:
{ 1, 2, 0, 0, 0, 0, ... }. This is encoded using the first 2*N
elements.
2) a "foreground" sequence of N values inserted at the beginning of
a "background" repeating sequence of N interleaved linear series,
such as: { 0, 0, 8, 10, 9, 11, 10, 12, ... }. This is encoded
using the first 3*N elements. In practice the foreground values
are often part of the same series as the background values,
such as: { 1, 11, 2, 12, 3, 13, ... }.
This reduces the amount of work involved in processing simple
vector constants and means that the encoding extends naturally
to variable-length vectors.
2017-12-07 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* doc/generic.texi (VECTOR_CST): Describe new representation of
vector constants.
* vector-builder.h: New file.
* tree-vector-builder.h: Likewise.
* tree-vector-builder.c: Likewise.
* Makefile.in (OBJS): Add tree-vector-builder.o.
* tree.def (VECTOR_CST): Update comment to refer to generic.texi.
* tree-core.h (tree_base): Add a vector_cst field to the u union.
(tree_vector): Change the number of elements to
vector_cst_encoded_nelts.
* tree.h (VECTOR_CST_NELTS): Redefine using TYPE_VECTOR_SUBPARTS.
(VECTOR_CST_ELTS): Delete.
(VECTOR_CST_ELT): Redefine using vector_cst_elt.
(VECTOR_CST_LOG2_NPATTERNS, VECTOR_CST_NPATTERNS): New macros.
(VECTOR_CST_NELTS_PER_PATTERN, VECTOR_CST_DUPLICATE_P): Likewise.
(VECTOR_CST_STEPPED_P, VECTOR_CST_ENCODED_ELTS): Likewise.
(VECTOR_CST_ENCODED_ELT): Likewise.
(vector_cst_encoded_nelts): New function.
(make_vector): Take the values of VECTOR_CST_LOG2_NPATTERNS and
VECTOR_CST_NELTS_PER_PATTERN as arguments.
(vector_cst_int_elt, vector_cst_elt): Declare.
* tree.c: Include tree-vector-builder.h.
(tree_code_size): Abort if passed VECTOR_CST.
(tree_size): Update for new VECTOR_CST layout.
(make_vector): Take the values of VECTOR_CST_LOG2_NPATTERNS and
VECTOR_CST_NELTS_PER_PATTERN as arguments.
(build_vector): Use tree_vector_builder.
(vector_cst_int_elt, vector_cst_elt): New functions.
(drop_tree_overflow): For VECTOR_CST, drop the TREE_OVERFLOW from the
encoded elements and then create the vector in the canonical form.
(check_vector_cst, check_vector_cst_duplicate, check_vector_cst_fill)
(check_vector_cst_stepped, test_vector_cst_patterns): New functions.
(tree_c_tests): Call test_vector_cst_patterns.
* lto-streamer-out.c (DFS::DFS_write_tree_body): Handle the new
VECTOR_CST fields.
(hash_tree): Likewise.
* tree-streamer-out.c (write_ts_vector_tree_pointers): Likewise.
(streamer_write_tree_header): Likewise.
* tree-streamer-in.c (lto_input_ts_vector_tree_pointers): Likewise.
(streamer_alloc_tree): Likewise. Update call to make_vector.
* fold-const.c (fold_ternary_loc): Avoid using VECTOR_CST_ELTS.
gcc/lto/
* lto.c (compare_tree_sccs_1): Compare the new VECTOR_CST flags.
From-SVN: r255474
PR tree-optimization/81303
* Makefile.in (gimple-loop-interchange.o): New object file.
* common.opt (floop-interchange): Reuse the option from graphite.
* doc/invoke.texi (-floop-interchange): Ditto. New document for
-floop-interchange and mention it for -O3.
* opts.c (default_options_table): Enable -floop-interchange at -O3.
* gimple-loop-interchange.cc: New file.
* params.def (PARAM_LOOP_INTERCHANGE_MAX_NUM_STMTS): New parameter.
(PARAM_LOOP_INTERCHANGE_STRIDE_RATIO): New parameter.
* passes.def (pass_linterchange): New pass.
* timevar.def (TV_LINTERCHANGE): New time var.
* tree-pass.h (make_pass_linterchange): New declaration.
* tree-ssa-loop-ivcanon.c (create_canonical_iv): Change to external
interchange. Record IV before/after increment in new parameters.
* tree-ssa-loop-ivopts.h (create_canonical_iv): New declaration.
* tree-vect-loop.c (vect_is_simple_reduction): Factor out reduction
path check into...
(check_reduction_path): ...New function here.
* tree-vectorizer.h (check_reduction_path): New declaration.
gcc/testsuite
* gcc.dg/tree-ssa/loop-interchange-1.c: New test.
* gcc.dg/tree-ssa/loop-interchange-1b.c: New test.
* gcc.dg/tree-ssa/loop-interchange-2.c: New test.
* gcc.dg/tree-ssa/loop-interchange-3.c: New test.
* gcc.dg/tree-ssa/loop-interchange-4.c: New test.
* gcc.dg/tree-ssa/loop-interchange-5.c: New test.
* gcc.dg/tree-ssa/loop-interchange-6.c: New test.
* gcc.dg/tree-ssa/loop-interchange-7.c: New test.
* gcc.dg/tree-ssa/loop-interchange-8.c: New test.
* gcc.dg/tree-ssa/loop-interchange-9.c: New test.
* gcc.dg/tree-ssa/loop-interchange-10.c: New test.
* gcc.dg/tree-ssa/loop-interchange-11.c: New test.
* gcc.dg/tree-ssa/loop-interchange-12.c: New test.
* gcc.dg/tree-ssa/loop-interchange-13.c: New test.
Co-Authored-By: Richard Biener <rguenther@suse.de>
From-SVN: r255472
2017-12-07 Vladimir Makarov <vmakarov@redhat.com>
PR target/83252
PR rtl-optimization/80818
* lra.c (add_regs_to_insn_regno_info): Make a hard reg in CLOBBER
always early clobbered.
* lra-lives.c (process_bb_lives): Check input hard regs for early
clobbered non-operand hard reg.
From-SVN: r255471
PR middle-end/83164
* tree-cfg.c (verify_gimple_assign_binary): Don't require
types_compatible_p, just that TYPE_MODE is the same.
* gcc.c-torture/compile/pr83164.c: New test.
From-SVN: r255470