When adding the call to gen_vec_duplicate, I failed to notice that
code further down modified the VEC_DUPLICATE in place. That isn't
safe if gen_vec_duplicate returned a const_vector.
2017-11-02 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
PR target/82809
* config/i386/i386.c (ix86_vector_duplicate_value): Use
gen_vec_duplicate after forcing the scalar into a register.
gcc/testsuite/
* gcc.dg/pr82809.c: New test.
From-SVN: r254366
This adds some extra debug info to the dump file for combine: print
the insns that are input to try_combine. I was worried printing more
will make the dump file only harder to read, but especially the info
from the REG_DEAD notes is invaluable.
* combine (try_combine): Print the insns input to try_combine to the
dump file.
From-SVN: r254365
gcc/ChangeLog:
* diagnostic.c: Include "selftest-diagnostic.h".
(selftest::assert_location_text): New function.
(selftest::test_diagnostic_get_location_text): New function.
(selftest::diagnostic_c_tests): Call it.
From-SVN: r254355
It's useful to not rely on global_dc in selftests, so this patch
moves class selftest::test_diagnostic_context from
diagnostic-show-locus.c to a new header and source file.
gcc/ChangeLog:
* Makefile.in (OBJS-libcommon): Add selftest-diagnostic.o.
* diagnostic-show-locus.c: Include "selftest-diagnostic.h".
(class selftest::test_diagnostic_context): Move to...
* selftest-diagnostic.c: New file.
* selftest-diagnostic.h: New file.
From-SVN: r254354
FT32B is a new FT32 architecture type. FT32B has a code compression
scheme which uses linker relaxations. It also has a security option to
prevent reads from program memory.
gcc/
* config/ft32/ft32.c (ft32_addr_space_legitimate_address_p): increase
offset range for FT32B.
* config/ft32/ft32.h: option "mcompress" enables relaxation.
* config/ft32/ft32.md: Add TARGET_NOPM.
* config/ft32/ft32.opt: Add mft32b, mcompress, mnopm.
* gcc/doc/invoke.texi: Add mft32b, mcompress, mnopm.
From-SVN: r254351
https://gcc.gnu.org/ml/gcc-patches/2017-11/msg00119.html
PR c++/82710
* decl.c (grokdeclarator): Don't warn when parens protect a return
type from a qualified name.
PR c++/82710
* g++.dg/warn/pr82710.C: New.
From-SVN: r254349
The AArch64 backend currently doesn't set MALLOC_ABI_ALIGNMENT, so
add this to enable alignment optimizations on malloc pointers.
Use the same value as STACK_BOUNDARY and BIGGEST_ALIGNMENT.
gcc/
* config/aarch64/aarch64.h (MALLOC_ABI_ALIGNMENT): New define.
From-SVN: r254348
2017-11-02 Tom de Vries <tom@codesourcery.com>
PR testsuite/82415
* gcc.target/i386/naked-1.c: Make scan patterns more precise.
* gcc.target/i386/naked-2.c: Same.
From-SVN: r254338
2017-11-02 Richard Biener <rguenther@suse.de>
PR middle-end/82765
* varasm.c (decode_addr_const): Make offset HOST_WIDE_INT.
Truncate ARRAY_REF index and element size.
* gcc.dg/pr82765.c: New testcase.
From-SVN: r254337
* tree-ssa-ccp.c (ccp_folder): New class derived from
substitute_and_fold_engine.
(ccp_folder::get_value): New member function.
(ccp_folder::fold_stmt): Renamed from ccp_fold_stmt.
(ccp_fold_stmt): Remove prototype.
(ccp_finalize): Call substitute_and_fold from the ccp_class.
* tree-ssa-copy.c (copy_folder): New class derived from
substitute_and_fold_engine.
(copy_folder::get_value): Renamed from get_value.
(fini_copy_prop): Call substitute_and_fold from copy_folder class.
* tree-vrp.c (vrp_folder): New class derived from
substitute_and_fold_engine.
(vrp_folder::fold_stmt): Renamed from vrp_fold_stmt.
(vrp_folder::get_value): New member function.
(vrp_finalize): Call substitute_and_fold from vrp_folder class.
(evrp_dom_walker::before_dom_children): Similarly for replace_uses_in.
* tree-ssa-propagate.h (substitute_and_fold_engine): New class to
provide a class interface to folder/substitute routines.
(ssa_prop_fold_stmt_fn): Remove typedef.
(ssa_prop_get_value_fn): Likewise.
(subsitute_and_fold): Remove prototype.
(replace_uses_in): Likewise.
* tree-ssa-propagate.c (substitute_and_fold_engine::replace_uses_in):
Renamed from replace_uses_in. Call the virtual member function
(substitute_and_fold_engine::replace_phi_args_in): Similarly.
(substitute_and_fold_dom_walker): Remove initialization of
data member entries for calbacks. Add substitute_and_fold_engine
member and initialize it.
(substitute_and_fold_dom_walker::before_dom_children0: Use the
member functions for get_value, replace_phi_args_in c
replace_uses_in, and fold_stmt calls.
(substitute_and_fold_engine::substitute_and_fold): Renamed from
substitute_and_fold. Remove assert. Update ctor call.
From-SVN: r254330
* tree-ssa-propagate.h (ssa_prop_visit_stmt_fn): Remove typedef.
(ssa_prop_visit_phi_fn): Likewise.
(class ssa_propagation_engine): New class to provide an interface
into ssa_propagate.
* tree-ssa-propagate.c (ssa_prop_visit_stmt): Remove file scoped
variable.
(ssa_prop_visit_phi): Likewise.
(ssa_propagation_engine::simulate_stmt): Moved into class.
Call visit_phi/visit_stmt from the class rather than via
file scoped static variables.
(ssa_propagation_engine::simulate_block): Moved into class.
(ssa_propagation_engine::process_ssa_edge_worklist): Similarly.
(ssa_propagation_engine::ssa_propagate): Similarly. No longer
set file scoped statics for the visit_stmt/visit_phi callbacks.
* tree-complex.c (complex_propagate): New class derived from
ssa_propagation_engine.
(complex_propagate::visit_stmt): Renamed from complex_visit_stmt.
(complex_propagate::visit_phi): Renamed from complex_visit_phi.
(tree_lower_complex): Call ssa_propagate via the complex_propagate
class.
* tree-ssa-ccp.c: (ccp_propagate): New class derived from
ssa_propagation_engine.
(ccp_propagate::visit_phi): Renamed from ccp_visit_phi_node.
(ccp_propagate::visit_stmt): Renamed from ccp_visit_stmt.
(do_ssa_ccp): Call ssa_propagate from the ccp_propagate class.
* tree-ssa-copy.c (copy_prop): New class derived from
ssa_propagation_engine.
(copy_prop::visit_stmt): Renamed from copy_prop_visit_stmt.
(copy_prop::visit_phi): Renamed from copy_prop_visit_phi_node.
(execute_copy_prop): Call ssa_propagate from the copy_prop class.
* tree-vrp.c (vrp_prop): New class derived from ssa_propagation_engine.
(vrp_prop::visit_stmt): Renamed from vrp_visit_stmt.
(vrp_prop::visit_phi): Renamed from vrp_visit_phi_node.
(execute_vrp): Call ssa_propagate from the vrp_prop class.
From-SVN: r254329
PR rtl-optimization/82778
PR rtl-optimization/82597
* compare-elim.c (struct comparison): Add in_a_setter field.
(find_comparison_dom_walker::before_dom_children): Remove killed
bitmap and df_simulate_find_defs call, instead walk the defs.
Compute last_setter and initialize in_a_setter. Merge definitions
with first initialization for a few variables.
(try_validate_parallel): Use insn_invalid_p instead of
recog_memoized. Return insn rather than just the pattern.
(try_merge_compare): Fix up comment. Don't uselessly test if
in_a is a REG_P. Use cmp->in_a_setter instead of walking UD
chains.
(execute_compare_elim_after_reload): Remove df_chain_add_problem
call.
* g++.dg/opt/pr82778.C: New test.
2017-11-01 Michael Collison <michael.collison@arm.com>
PR rtl-optimization/82597
* gcc.dg/pr82597.c: New test.
From-SVN: r254328
aarch64_rtx_costs uses the number of registers in a mode as the basis
of SET costs. This patch makes it get the number of registers from
aarch64_hard_regno_nregs rather than repeating the calcalation inline.
Handling SVE modes in aarch64_hard_regno_nregs is then enough to get
the correct SET cost as well.
2017-11-01 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* config/aarch64/aarch64.c (aarch64_rtx_costs): Use
aarch64_hard_regno_nregs to get the number of registers
in a mode.
Reviewed-By: James Greenhalgh <james.greenhalgh@arm.com>
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r254327
The SVE port uses the public constraints "Upl" and "Upa" to mean
"low predicate register" and "any predicate register" respectively.
"Upl" was already used as an internal-only constraint by the
addition patterns, so this patch renames it to "Uaa" ("two adds
needed").
2017-11-01 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* config/aarch64/constraints.md (Upl): Rename to...
(Uaa): ...this.
* config/aarch64/aarch64.md
(*zero_extend<SHORT:mode><GPI:mode>2_aarch64, *addsi3_aarch64_uxtw):
Update accordingly.
Reviewed-By: James Greenhalgh <james.greenhalgh@arm.com>
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r254326
This patch simply moves code around, in order to make the later
patches easier to read, and to avoid forward declarations.
It doesn't add the missing function comments because the interfaces
will change in a later patch.
2017-11-01 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* config/aarch64/aarch64.c (aarch64_add_constant_internal)
(aarch64_add_constant, aarch64_add_sp, aarch64_sub_sp): Move
earlier in file.
Reviewed-by: James Greenhalgh <james.greenhalgh@arm.com>
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r254325
This patch replaces switch statements that call specific generator
functions with code that constructs the rtl pattern directly.
This seemed to scale better to SVE and also seems less error-prone.
As a side-effect, the patch fixes the REV handling for diff==1,
vmode==E_V4HFmode and adds missing support for diff==3,
vmode==E_V4HFmode.
To compensate for the lack of switches that check for specific modes,
the patch makes aarch64_expand_vec_perm_const_1 reject permutes on
single-element vectors (specifically V1DImode).
2017-11-01 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* config/aarch64/aarch64.c (aarch64_evpc_trn, aarch64_evpc_uzp)
(aarch64_evpc_zip, aarch64_evpc_ext, aarch64_evpc_rev)
(aarch64_evpc_dup): Generate rtl direcly, rather than using
named expanders.
(aarch64_expand_vec_perm_const_1): Explicitly check for permutes
of a single element.
* config/aarch64/iterators.md: Add a comment above the permute
unspecs to say that they are generated directly by
aarch64_expand_vec_perm_const.
* config/aarch64/aarch64-simd.md: Likewise the permute instructions.
Reviewed-by: James Greenhalgh <james.greenhalgh@arm.com>
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r254324
This documentation is patterned off the aarch64 -mcmodel documentation.
gcc/ChangeLog:
2017-11-01 Palmer Dabbelt <palmer@dabbelt.com>
* doc/invoke.texi (RISC-V Options): Explicitly name the medlow
and medany code models, and describe what they do.
From-SVN: r254321
2017-11-01 François Dumont <fdumont@gcc.gnu.org>
* python/libstdcxx/v6/printers.py (StdExpAnyPrinter.__init__): Strip
typename versioned namespace before the substitution.
(StdExpOptionalPrinter.__init__): Likewise.
(StdVariantPrinter.__init__): Likewise.
(Printer.add_version): Inject versioned namespace after std or
__gnu_cxx.
(build_libstdcxx_dictionary): Adapt add_version usages, always pass
namespace first and symbol second.
From-SVN: r254320
When we have a REG_DEAD note for a reg that is set in the new I2, we
drop the note on the floor (we cannot find whether to place it on I2
or on I3). But the code I added to do this has a bug and does not
always actually drop it. This patch fixes it.
But that on its own is too pessimistic, it turns out, and we generate
worse code. One case where we do know where to place the note is if
it came from I3 (it should go to I3 again). Doing this fixes all of
the regressions.
PR rtl-optimization/64682
PR rtl-optimization/69567
PR rtl-optimization/69737
PR rtl-optimization/82683
* combine.c (distribute_notes) <REG_DEAD>: If the new I2 sets the same
register mentioned in the note, drop the note, unless it came from I3,
in which case it should go to I3 again.
From-SVN: r254315
This patch moves the check for an overlapping byte to normalize_ref
from its callers, so that it's easier to convert to poly_ints later.
It's not really worth it on its own.
2017-11-01 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* tree-ssa-dse.c (normalize_ref): Check whether the ranges overlap
and return false if not.
(clear_bytes_written_by, live_bytes_read): Update accordingly.
From-SVN: r254313
Most GCC ranges seem to be represented as an offset and a size (rather
than a start and inclusive end or start and exclusive end). The usual
test for whether X is in a range is of course:
x >= start && x < start + size
or:
x >= start && x - start < size
which means that an empty range of size 0 contains nothing. But other
range tests aren't as obvious.
The usual test for whether one range is contained within another
range is:
start1 >= start2 && start1 + size1 <= start2 + size2
while the test for whether two ranges overlap (from ranges_overlap_p) is:
(start1 >= start2 && start1 < start2 + size2)
|| (start2 >= start1 && start2 < start1 + size1)
i.e. the ranges overlap if one range contains the start of the other
range. This leads to strange results like:
(start X, size 0) is a subrange of (start X, size 0) but
(start X, size 0) does not overlap (start X, size 0)
Similarly:
(start 4, size 0) is a subrange of (start 2, size 2) but
(start 4, size 0) does not overlap (start 2, size 2)
It seems like "X is a subrange of Y" should imply "X overlaps Y".
This becomes harder to ignore with the runtime sizes and offsets
added for SVE. The most obvious fix seemed to be to say that
an empty range does not overlap anything, and is therefore not
a subrange of anything.
Using the new definition of subranges didn't seem to cause any
codegen differences in the testsuite. But there was one change
with the new definition of overlapping ranges. strncpy-chk.c has:
memset (dst, 0, sizeof (dst));
if (strncpy (dst, src, 0) != dst || strcmp (dst, ""))
abort();
The strncpy is detected as a zero-size write, and so with the new
definition of overlapping ranges, we treat the strncpy as having
no effect on the strcmp (which is true). The reaching definition
is the memset instead.
This patch makes ranges_overlap_p return false for zero-sized
ranges, even if the other range has an unknown size.
2017-11-01 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* tree-ssa-alias.h (ranges_overlap_p): Return false if either
range is known to be empty.
From-SVN: r254312
This patch avoids some calculations of the form:
GET_MODE_SIZE (vector_mode) / GET_MODE_SIZE (element_mode)
in simplify-rtx.c. If we're dealing with CONST_VECTORs, it's better
to use CONST_VECTOR_NUNITS, since that remains constant even after the
SVE patches. In other cases we can get the number from GET_MODE_NUNITS.
2017-11-01 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* simplify-rtx.c (simplify_const_unary_operation): Use GET_MODE_NUNITS
and CONST_VECTOR_NUNITS instead of computing the number of units from
the byte sizes of the vector and element.
(simplify_binary_operation_1): Likewise.
(simplify_const_binary_operation): Likewise.
(simplify_ternary_operation): Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r254311
This avoids the double evaluation mentioned in the comments and
simplifies the change to make MEM_OFFSET variable.
2017-11-01 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* var-tracking.c (INT_MEM_OFFSET): Replace with...
(int_mem_offset): ...this new function.
(var_mem_set, var_mem_delete_and_set, var_mem_delete)
(find_mem_expr_in_1pdv, dataflow_set_preserve_mem_locs)
(same_variable_part_p, use_type, add_stores, vt_get_decl_and_offset):
Update accordingly.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r254309
This patch adds a helper routine (interesting_mode_p) to lower-subreg.c,
to make the decision about whether a mode can be split and, if so,
calculate the number of bytes and words in the mode. At present this
function always returns true; a later patch will add cases in which it
can return false.
2017-11-01 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* lower-subreg.c (interesting_mode_p): New function.
(compute_costs, find_decomposable_subregs, decompose_register)
(simplify_subreg_concatn, can_decompose_p, resolve_simple_move)
(resolve_clobber, dump_choices): Use it.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r254308
Avoid using add_object when we have more specific routines available.
2017-11-01 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* rtlhash.c (add_rtx): Use add_hwi for 'w' and add_int for 'i'.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r254307
alias.c:find_base_term and find_base_value checked:
if (GET_MODE_SIZE (GET_MODE (src)) < GET_MODE_SIZE (Pmode))
but (a) comparing the precision seems more correct, since it's possible
for modes to have the same memory size as Pmode but fewer bits and
(b) the functions are called on arbitrary rtl, so there's no guarantee
that we're handling an integer truncation.
Since there's no point processing truncations of anything other than an
integer, this patch checks that first.
2017-11-01 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* alias.c (find_base_value, find_base_term): Only process integer
truncations. Check the precision rather than the size.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r254306
This patch adds a function for testing whether an arbitrary mode X
is an integer mode that is narrower than integer mode Y. This is
useful for code like expand_float and expand_fix that could in
principle handle vectors as well as scalars.
2017-11-01 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* machmode.h (is_narrower_int_mode): New function
* optabs.c (expand_float, expand_fix): Use it.
* dwarf2out.c (rotate_loc_descriptor): Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r254305
This patch adds a narrowing equivalent of wider_subreg_mode. At present
there is only one user.
2017-11-01 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* rtl.h (narrower_subreg_mode): New function.
* ira-color.c (update_costs_from_allocno): Use it.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r254304