gcc/ChangeLog:
* ipa-cp.c (ipcp_driver): Set edge_clone_summaries to NULL after
deleting it.
* ipa-reference.c (ipa_reference_c_finalize): Delete
ipa_ref_opt_sum_summaries and set it to NULL.
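The delete-then-reset idiom named in these entries can be sketched as follows. This is only an illustration, not the GCC code: summary_table, example_summaries, example_init and example_finalize are hypothetical names.

```cpp
#include <map>

// Hypothetical stand-in for a per-pass summary container.
struct summary_table {
  std::map<int, int> data;
};

// Hypothetical global, playing the role of a pass-wide summary pointer.
static summary_table *example_summaries = nullptr;

// Allocate the summaries on first use.
void example_init() {
  if (example_summaries == nullptr)
    example_summaries = new summary_table();
}

// Release the summaries and reset the pointer, so that a later run in the
// same process sees "not allocated" instead of a dangling pointer.
void example_finalize() {
  delete example_summaries;  // deleting a null pointer is a no-op
  example_summaries = nullptr;
}
```

Resetting the pointer also makes a second call to the finalizer harmless.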
From-SVN: r261846
At the moment, this test in pr45882.c fails:
...
int d = a[i]; /* { dg-final { gdb-test 16 "d" "112" } } */
...
as follows:
...
FAIL: gcc.dg/guality/pr45882.c -O2 -flto -fuse-linker-plugin \
-fno-fat-lto-objects line 16 d == 112
...
In more detail, gdb fails to print the value of d:
...
Breakpoint 1, foo (i=i@entry=7, j=j@entry=7) at pr45882.c:16
16 ++v;
$1 = <optimized out>
$2 = 112
<optimized out> != 112
...
Variable d is a local variable in function foo, initialized from global array a.
When compiling, first cddce1 removes the initialization of d in foo, given
that d is not used afterwards. Then ipa marks array a as write-only, and
removes the stores to array a in main. This invalidates the location
expression for d, which points to a[i], so it is removed, which is why gdb
ends up printing <optimized out> for d.
This patch fixes the failure by adding attribute used to array a, preventing
array a from being marked as write-only.
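A minimal sketch of the idea, assuming a reduced test shape. This is a hypothetical reduction, not the actual contents of pr45882.c; the array size, the helper store_values and the exact function body are assumptions.

```cpp
// Hypothetical reduction in the spirit of the test, not the real file.
// The "used" attribute tells GCC that the variable must be emitted even if
// it appears to be unreferenced.
int a[16] __attribute__((used));
volatile int v;

__attribute__((noinline)) int foo(int i, int j) {
  int d = a[i];  // value a debugger would want to display
  (void)d;       // d has no further use in the program itself
  ++v;           // stand-in for the line carrying the breakpoint
  return ++j;
}

// Stores to the array, in the role of the stores done in main.
void store_values() {
  a[7] = 112;
}
```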
Tested on x86_64.
2018-06-21 Tom de Vries <tdevries@suse.de>
* gcc.dg/guality/pr45882.c (a): Add used attribute.
From-SVN: r261845
2018-06-21 Tom de Vries <tdevries@suse.de>
PR tree-optimization/85859
* tree-ssa-tail-merge.c (stmt_local_def): Copy gimple_is_call
test with comment from bb_no_side_effects_p.
* gcc.dg/pr85859.c: New test.
From-SVN: r261844
2018-06-21 Richard Biener <rguenther@suse.de>
PR tree-optimization/86232
* tree-ssa-loop-niter.c (number_of_iterations_popcount): Adjust
max for constant niter.
* gcc.dg/torture/pr86232.c: New testcase.
From-SVN: r261843
gcc
2018-06-21 Andre Vieira <andre.simoesdiasvieira@arm.com>
* config/aarch64/aarch64-simd.md (aarch64_crypto_aes<aes_op>v16qi):
Make operands of the unspec commutative.
gcc/testsuite
2018-06-21 Andre Vieira <andre.simoesdiasvieira@arm.com>
* gcc/gcc.target/aarch64/aes_2.c: New test.
From-SVN: r261835
2018-06-21 Richard Biener <rguenther@suse.de>
* tree-data-ref.c (dr_step_indicator): Handle NULL DR_STEP.
* tree-vect-data-refs.c (vect_analyze_possibly_independent_ddr):
Avoid calling vect_mark_for_runtime_alias_test with gathers or scatters.
(vect_analyze_data_ref_dependence): Re-order checks to deal with
NULL DR_STEP.
(vect_record_base_alignments): Do not record base alignment
for gathers or scatters.
(vect_compute_data_ref_alignment): Drop return value that is always
true. Bail out early for gathers or scatters.
(vect_enhance_data_refs_alignment): Bail out early for gathers
or scatters.
(vect_find_same_alignment_drs): Likewise.
(vect_analyze_data_refs_alignment): Remove dead code.
(vect_slp_analyze_and_verify_node_alignment): Likewise.
(vect_analyze_data_refs): For possible gathers or scatters do
not create an alternate DR, just check their possible validity
and mark them. Adjust DECL_NONALIASED handling to not rely
on DR_BASE_ADDRESS.
* tree-vect-loop-manip.c (vect_update_inits_of_drs): Do not
update inits of gathers or scatters.
* tree-vect-patterns.c (vect_recog_mask_conversion_pattern):
Also copy gather/scatter flag to pattern vinfo.
From-SVN: r261834
2018-06-21 François Dumont <fdumont@gcc.gnu.org>
* include/debug/debug.h
(_Safe_iterator<>(const _Safe_iterator<_MutableIterator,>& __x)):
Compare __x base iterator with a default initialized iterator of the
same type.
From-SVN: r261831
libgcc/:
PR libgcc/86213
* generic-morestack.c (allocate_segment): Move calls to getenv and
getpagesize to __morestack_load_mmap.
(__morestack_load_mmap): Initialize static_pagesize and
use_guard_page here so as to avoid clobbering SSE regs during a
__morestack call.
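The approach of doing library calls once at load time and keeping the hot path free of them can be sketched roughly as below. All names here (cached_pagesize, example_load_time_init, example_segment_size) are hypothetical, and the 4096 fallback is an assumption made only for the illustration; this is not the libgcc code.

```cpp
#include <cstddef>
#include <unistd.h>

// Hypothetical cached value, filled in once at start-up.
static std::size_t cached_pagesize = 0;

// Start-up hook: the only place in this sketch that calls into libc.
void example_load_time_init() {
  long ps = sysconf(_SC_PAGESIZE);
  // Fallback value is an assumption for the sketch only.
  cached_pagesize = ps > 0 ? static_cast<std::size_t>(ps) : 4096;
}

// Hot path: rounds a request up to whole pages using only the cached value,
// so no library call (and no register clobber from one) happens here.
std::size_t example_segment_size(std::size_t request) {
  return (request + cached_pagesize - 1) / cached_pagesize * cached_pagesize;
}
```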
gcc/testsuite/:
* gcc.dg/split-8.c: New.
From-SVN: r261823
gcc/ChangeLog:
2018-06-20 Kelvin Nilsen <kelvin@gcc.gnu.org>
* config/rs6000/rs6000-c.c (altivec_overloaded_builtins): Change
behavior of vec_packsu (vector unsigned long long, vector unsigned
long long) to match behavior of vec_packs with same signature.
gcc/testsuite/ChangeLog:
2018-06-20 Kelvin Nilsen <kelvin@gcc.gnu.org>
* gcc.target/powerpc/builtins-1.c: Adjust dg directives to scan
for vpkudus in place of vpksdus.
* gcc.target/powerpc/builtins-3-p8.c: Likewise.
From-SVN: r261819
Construct the program-wide resource objects using placement new. This
means they have dynamic storage duration and won't be destroyed during
termination.
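A minimal sketch of the technique, assuming a hypothetical resource type. The names example_resource and example_default_resource are illustrative and are not the identifiers used in the library.

```cpp
#include <new>

// Hypothetical resource type; running its destructor would be a problem if
// other code still used the resource during program termination.
struct example_resource {
  int live = 1;
  ~example_resource() { live = 0; }
};

// Returns the address of a program-wide object. The storage is a static
// buffer, but the object itself is created by placement new, so it has
// dynamic storage duration and its destructor is never run at exit.
example_resource *example_default_resource() {
  alignas(example_resource) static unsigned char buf[sizeof(example_resource)];
  static example_resource *res =
      ::new (static_cast<void *>(buf)) example_resource();
  return res;
}
```

Every call returns the same address, and the object stays usable for as long as the program runs.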
PR libstdc++/70966
* include/experimental/memory_resource (__resource_adaptor_imp): Add
static assertions to enforce requirements on pointer types.
(__resource_adaptor_imp::get_allocator()): Add noexcept.
(new_delete_resource, null_memory_resource): Return address of an
object with dynamic storage duration.
(__null_memory_resource): Remove.
* testsuite/experimental/memory_resource/70966.cc: New.
From-SVN: r261818
PR debug/86194
* var-tracking.c (use_narrower_mode_test): Check if shift amount can
be narrowed.
* gcc.target/i386/pr86194.c: New test.
From-SVN: r261807
PR tree-optimization/86231
* tree-vrp.c (union_ranges): For ( [ ) ] or ( )[ ] range and
anti-range don't overwrite *vr0min before using it to compute *vr0max.
* gcc.dg/tree-ssa/vrp119.c: New test.
* gcc.c-torture/execute/pr86231.c: New test.
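The class of bug described for union_ranges, overwriting one bound before it has been used to compute the other, can be illustrated as follows. The exact expressions in tree-vrp.c are not reproduced here; int_range and both helper functions are hypothetical.

```cpp
// Hypothetical integer range with inclusive bounds; not GCC's value_range.
struct int_range {
  int min;
  int max;
};

// Buggy order: the lower bound is overwritten first, so the upper bound is
// computed from the new value instead of the old one.
void update_wrong_order(int_range *r, int new_min) {
  r->min = new_min;
  r->max = r->min - 1;  // uses the already-overwritten min
}

// Fixed order: derive the upper bound from the old lower bound, then store
// the new lower bound.
void update_right_order(int_range *r, int new_min) {
  r->max = r->min - 1;  // uses the original min
  r->min = new_min;
}
```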
From-SVN: r261805
2018-06-20 Tom de Vries <tdevries@suse.de>
PR tree-optimization/86097
* tree-ssa-loop-manip.c (canonicalize_loop_ivs): Also convert *nit to
iv type if signedness of iv type is not the same as that of *nit.
* gcc.dg/autopar/pr86097.c: New test.
From-SVN: r261804
This patch adds support for generating LDPs and STPs of Q-registers.
This allows for more compact code generation and makes better use of the ISA.
It's implemented in a straightforward way by allowing 16-byte modes in the
sched-fusion machinery and adding appropriate peepholes in aarch64-ldpstp.md
as well as the patterns themselves in aarch64-simd.md.
It adds a new no_ldp_stp_qregs tuning flag.
I use it to restrict the peepholes in aarch64-ldpstp.md from merging the
operations together into PARALLELs. I also use it to restrict the sched fusion
check that brings such loads and stores together. This is enough to avoid
forming the pairs when the tuning flag is set.
I didn't see any non-noise performance effect on SPEC2017 on Cortex-A72 and Cortex-A53.
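As an illustration of the kind of source that can benefit, here is a small sketch using the GCC vector extension. The function name copy_pair is hypothetical and this is not one of the new tests; whether LDP/STP of Q-registers is actually emitted depends on the target, optimization level and tuning flags.

```cpp
// 128-bit vector type using the GCC/Clang vector extension.
typedef int v4si __attribute__((vector_size(16)));

// Loads two adjacent 16-byte values and stores them to adjacent slots.
// On AArch64, a compiler with Q-register pair support may combine each
// pair of accesses into a single LDP or STP; this is not guaranteed.
void copy_pair(v4si *dst, const v4si *src) {
  v4si lo = src[0];
  v4si hi = src[1];
  dst[0] = lo;
  dst[1] = hi;
}
```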
* config/aarch64/aarch64-tuning-flags.def (no_ldp_stp_qregs): New.
* config/aarch64/aarch64.c (xgene1_tunings): Add
AARCH64_EXTRA_TUNE_NO_LDP_STP_QREGS to tune_flags.
(aarch64_mode_valid_for_sched_fusion_p):
Allow 16-byte modes.
(aarch64_classify_address): Allow 16-byte modes for load_store_pair_p.
* config/aarch64/aarch64-ldpstp.md: Add peepholes for LDP STP of
128-bit modes.
* config/aarch64/aarch64-simd.md (load_pair<VQ:mode><VQ2:mode>):
New pattern.
(vec_store_pair<VQ:mode><VQ2:mode>): Likewise.
* config/aarch64/iterators.md (VQ2): New mode iterator.
* gcc.target/aarch64/ldp_stp_q.c: New test.
* gcc.target/aarch64/stp_vec_128_1.c: Likewise.
* gcc.target/aarch64/ldp_stp_q_disable.c: Likewise.
From-SVN: r261796
2018-06-20 Martin Liska <mliska@suse.cz>
* tree-switch-conversion.c (jump_table_cluster::can_be_handled):
Change default ratio from 10 to 8.
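A density check of this kind can be sketched as below. The formula is an assumption for illustration (span of case values compared against the ratio times the number of comparisons); the function name and parameters are hypothetical and the real can_be_handled may differ in detail.

```cpp
// Hypothetical density check: a jump table is considered only if the span
// of case values is at most max_ratio times the number of comparisons a
// decision tree would need.
bool dense_enough_for_jump_table(unsigned long range,
                                 unsigned long comparison_count,
                                 unsigned long max_ratio = 8) {
  return range <= max_ratio * comparison_count;
}
```

Under this assumed formula, lowering the ratio from 10 to 8 means a cluster of 10 comparisons spanning 100 values no longer qualifies, while one spanning 80 still does.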
From-SVN: r261795
This patch makes pattern recognisers do their own checking for vector
types and target support. Previously some recognisers did this
themselves and some left it to vect_pattern_recog_1.
Doing this means we can get rid of the type_in argument, which was
ignored if the recogniser did its own checking. It also means
we create fewer junk statements.
2018-06-20 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vectorizer.h (NUM_PATTERNS, vect_recog_func_ptr): Move to
tree-vect-patterns.c.
* tree-vect-patterns.c (vect_supportable_direct_optab_p): New function.
(vect_recog_dot_prod_pattern): Use it. Remove the type_in argument.
(vect_recog_sad_pattern): Likewise.
(vect_recog_widen_sum_pattern): Likewise.
(vect_recog_pow_pattern): Likewise. Check for a null vectype.
(vect_recog_widen_shift_pattern): Remove the type_in argument.
(vect_recog_rotate_pattern): Likewise.
(vect_recog_mult_pattern): Likewise.
(vect_recog_vector_vector_shift_pattern): Likewise.
(vect_recog_divmod_pattern): Likewise.
(vect_recog_mixed_size_cond_pattern): Likewise.
(vect_recog_bool_pattern): Likewise.
(vect_recog_mask_conversion_pattern): Likewise.
(vect_try_gather_scatter_pattern): Likewise.
(vect_recog_widen_mult_pattern): Likewise. Check for a null vectype.
(vect_recog_over_widening_pattern): Likewise.
(vect_recog_gather_scatter_pattern): Likewise.
(vect_recog_func_ptr): Move from tree-vectorizer.h.
(vect_vect_recog_func_ptrs): Move further down the file.
(vect_recog_func): Likewise. Remove the third argument.
(NUM_PATTERNS): Define based on vect_vect_recog_func_ptrs.
(vect_pattern_recog_1): Expect the pattern function to do any
necessary target tests. Also expect it to provide a vector type.
Remove the type_in handling.
From-SVN: r261791
This message is a long write-up for a patch that simply adds a common
routine for printing the "vector_foo_pattern: detected:" messages.
The reason for doing this is that some routines check for target support
themselves and some leave it to vect_pattern_recog_1. Those that leave
it to vect_pattern_recog_1 currently print these "detected:" messages if
the statements have the right form, even if the pattern is eventually
discarded. IMO that's useful, and a lot of existing scan tests rely on it.
However, a later patch makes patterns do their own testing, and stops
them creating pattern statements until the tests have passed. This means
(a) they need to print the "detected:" message earlier and (b) the pattern
statement won't be around to print.
The patch therefore makes all routines print the original statement
rather than the pattern one. That information isn't obvious otherwise,
whereas vect_pattern_recog_1 already prints the pattern statement
in the case of a successful match. This also avoids the previous
situation in which a routine could print "detected:" and then
silently bail out before saying what had been detected.
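The shape of such a common routine can be sketched as follows. This is a hypothetical stand-alone version: example_stmt and example_pattern_detected are illustrative names, and the real helper writes to the vectorizer dump file rather than returning a string.

```cpp
#include <string>

// Hypothetical stand-in for a statement, holding only its printed form.
struct example_stmt {
  std::string text;
};

// Builds the "<name>: detected: <statement>" message for the original
// statement, so that every recogniser reports matches the same way.
std::string example_pattern_detected(const char *name,
                                     const example_stmt &original) {
  return std::string(name) + ": detected: " + original.text;
}
```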
2018-06-20 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vect-patterns.c (vect_pattern_detected): New function.
(vect_recog_dot_prod_pattern, vect_recog_sad_pattern)
(vect_recog_widen_mult_pattern, vect_recog_widen_sum_pattern)
(vect_recog_over_widening_pattern, vect_recog_widen_shift_pattern)
(vect_recog_rotate_pattern, vect_recog_vector_vector_shift_pattern)
(vect_recog_mult_pattern, vect_recog_divmod_pattern)
(vect_recog_mixed_size_cond_pattern, vect_recog_bool_pattern)
(vect_recog_mask_conversion_pattern)
(vect_try_gather_scatter_pattern): Use it.
From-SVN: r261790
This patch adds a helper for pattern code that wants to find an
internal (vectorisable) definition of an SSA name.
A later patch will make more use of this, and alter the definition.
2018-06-20 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vect-patterns.c (vect_get_internal_def): New function.
(vect_recog_dot_prod_pattern, vect_recog_sad_pattern)
(vect_recog_vector_vector_shift_pattern, check_bool_pattern)
(search_type_for_mask_1): Use it.
From-SVN: r261789
vect_recog_dot_prod_pattern and vect_recog_sad_pattern both checked
whether the statement passed in had already been recognised as a
WIDEN_SUM_EXPR pattern. That isn't possible (any more?), since the
first recognised pattern wins, and since vect_recog_widen_sum_pattern
never matches a later statement than the one it's given.
2018-06-20 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vect-patterns.c (vect_recog_dot_prod_pattern): Remove
redundant WIDEN_SUM_EXPR handling.
(vect_recog_sad_pattern): Likewise.
From-SVN: r261788
tree-vect-patterns.c checked that operands to primitive arithmetic ops
are compatible with each other and with the result. The checks date
back years and have long been redundant with verify_gimple_stmt.
2018-06-20 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vect-patterns.c (vect_recog_dot_prod_pattern): Remove
redundant check that the types of a PLUS_EXPR or MULT_EXPR agree.
(vect_recog_sad_pattern): Likewise PLUS_EXPR, ABS_EXPR and MINUS_EXPR.
(vect_recog_widen_mult_pattern): Likewise MULT_EXPR.
(vect_recog_widen_sum_pattern): Likewise PLUS_EXPR.
From-SVN: r261787
vectorizable_call stubs out the original scalar statement with
a dummy assignment to the same lhs, so that we don't leave any bogus
scalar calls around. If the call is actually a pattern statement,
the code rightly took the lhs of the original bb statement:
  if (is_pattern_stmt_p (stmt_info))
    lhs = gimple_call_lhs (STMT_VINFO_RELATED_STMT (stmt_info));
  else
    lhs = gimple_call_lhs (stmt);
But it then associated the new statement with the stmt_vec_info of the
pattern statement rather than the bb statement, which meant we had two
stmt_vec_infos assigning to the same lhs. This seems to be latent at
the moment but caused problems further into the series.
2018-06-20 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vect-stmts.c (vectorizable_call): Make sure that we
use the stmt_vec_info of the original bb statement for the
new zero assignment, even if the call is part of a pattern.
From-SVN: r261786
A pattern's PATTERN_DEF_SEQ was attached to both the original statement
and the main pattern statement, which made it harder to update later.
This patch attaches it to just the original statement. In practice,
anything that cared had ready access to both.
2018-06-20 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vectorizer.h (_stmt_vec_info): Note above pattern_def_seq
that the sequence is attached to the original statement rather
than the pattern statement.
* tree-vect-loop.c (vect_determine_vf_for_stmt): Take the
PATTERN_DEF_SEQ from the original statement rather than
the main pattern statement.
* tree-vect-stmts.c (free_stmt_vec_info): Likewise.
* tree-vect-patterns.c (vect_recog_dot_prod_pattern): Likewise.
(vect_mark_pattern_stmts): Don't copy the PATTERN_DEF_SEQ.
From-SVN: r261785
This patch is the first part of a series to fix PR85694.
Later patches can make the pattern for a statement S2 reuse the
results of a PATTERN_DEF_SEQ statement attached to an earlier
statement S1. Although vect_mark_stmts_to_be_vectorized handled
this fine, vect_analyze_stmt and vect_transform_loop both skipped the
PATTERN_DEF_SEQ for S1 if S1's main pattern wasn't live or relevant.
I couldn't wrap my head around the flow in vect_transform_loop,
so ended up moving the per-statement handling into a subroutine.
That makes the patch look bigger than it actually is.
2018-06-20 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vect-stmts.c (vect_analyze_stmt): Move the handling of pattern
definition statements before the early exit for statements that aren't
live or relevant.
* tree-vect-loop.c (vect_transform_loop_stmt): New function,
split out from...
(vect_transform_loop): ...here. Process pattern definition
statements without first checking whether the main pattern
statement is live or relevant.
From-SVN: r261784
* tree-cfgcleanup.c (tree_forwarder_block_p): Do not return false at
-O0 if the locations represent UNKNOWN_LOCATION but have different values.
From-SVN: r261770
The issue is caused by the stack pointer update that follows stack space
allocation being reordered with instructions that write to the allocated
stack space. In the windowed ABI, the register spill area for the previous
call frame is located just below the stack pointer and may be reloaded back
into the register file on movsp.
Implement the allocate_stack pattern for the windowed ABI configuration and
insert an instruction that prevents reordering of frame memory accesses
and the stack pointer update.
gcc/
2018-06-19 Max Filippov <jcmvbkbc@gmail.com>
* config/xtensa/xtensa.md (UNSPEC_FRAME_BLOCKAGE): New unspec
constant.
(allocate_stack, frame_blockage, *frame_blockage): New patterns.
From-SVN: r261755