This patch checks whether two data references x and y cannot
partially overlap and so are independent whenever &x != &y.
We can then use this in the vectoriser to optimise alias checks.
gcc/
2016-08-04 Richard Sandiford <richard.sandiford@linaro.org>
* hash-traits.h (pair_hash): New struct.
* tree-data-ref.h (data_dependence_relation): Add object_a and
object_b fields.
(DDR_OBJECT_A, DDR_OBJECT_B): New macros.
* tree-data-ref.c (initialize_data_dependence_relation): Initialize
DDR_OBJECT_A and DDR_OBJECT_B.
* tree-vectorizer.h (vec_object_pair): New type.
(_loop_vec_info): Add a check_unequal_addrs field.
(LOOP_VINFO_CHECK_UNEQUAL_ADDRS): New macro.
(LOOP_REQUIRES_VERSIONING_FOR_ALIAS): Return true if there is an
entry in check_unequal_addrs. Check comp_alias_ddrs instead of
may_alias_ddrs.
* tree-vect-loop.c (destroy_loop_vec_info): Release
LOOP_VINFO_CHECK_UNEQUAL_ADDRS.
(vect_analyze_loop_2): Likewise, when restarting.
(vect_estimate_min_profitable_iters): Estimate the cost of
LOOP_VINFO_CHECK_UNEQUAL_ADDRS.
* tree-vect-data-refs.c: Include tree-hash-traits.h.
(vect_prune_runtime_alias_test_list): Try to handle conflicts
using LOOP_VINFO_CHECK_UNEQUAL_ADDRS, if the data dependence allows.
Count such tests in the final summary.
* tree-vect-loop-manip.c (chain_cond_expr): New function.
(vect_create_cond_for_align_checks): Use it.
(vect_create_cond_for_unequal_addrs): New function.
(vect_loop_versioning): Call it.
gcc/testsuite/
* gcc.dg/vect/vect-alias-check-6.c: New test.
From-SVN: r250868
This patch tries to calculate conservatively-correct distance
vectors for two references whose base addresses are not the same.
It sets a new flag DDR_COULD_BE_INDEPENDENT_P if the dependence
isn't guaranteed to occur.
The motivating example is:
struct s { int x[8]; };
void
f (struct s *a, struct s *b)
{
for (int i = 0; i < 8; ++i)
a->x[i] += b->x[i];
}
in which the "a" and "b" accesses are either independent or have a
dependence distance of 0 (assuming -fstrict-aliasing). Neither case
prevents vectorisation, so we can vectorise without an alias check.
I'd originally wanted to do the same thing for arrays as well, e.g.:
void
f (int a[][8], struct b[][8])
{
for (int i = 0; i < 8; ++i)
a[0][i] += b[0][i];
}
I think this is valid because C11 6.7.6.2/6 says:
For two array types to be compatible, both shall have compatible
element types, and if both size specifiers are present, and are
integer constant expressions, then both size specifiers shall have
the same constant value.
So if we access an array through an int (*)[8], it must have type X[8]
or X[], where X is compatible with int. It doesn't seem possible in
either case for "a[0]" and "b[0]" to overlap when "a != b".
However, as the comment above "if (same_base_p)" explains, GCC is more
forgiving: it supports arbitrary overlap of arrays and allows arrays to
be accessed with different dimensionality. There are examples of this
in PR50067. The patch therefore only handles references that end in a
structure field access.
There are two ways of handling these dependences in the vectoriser:
use them to limit VF, or check at runtime as before. I've gone for
the approach of checking at runtime if we can, to avoid limiting VF
unnecessarily, but falling back to a VF cap when runtime checks aren't
allowed.
The patch tests whether we queued an alias check with a dependence
distance of X and then picked a VF <= X, in which case it's safe to
drop the alias check. Since vect_prune_runtime_alias_check_list
can be called twice with different VF for the same loop, it's no
longer safe to clear may_alias_ddrs on exit. Instead we should use
comp_alias_ddrs to check whether versioning is necessary.
2017-08-04 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* tree-data-ref.h (subscript): Add access_fn field.
(data_dependence_relation): Add could_be_independent_p.
(SUB_ACCESS_FN, DDR_COULD_BE_INDEPENDENT_P): New macros.
(same_access_functions): Move to tree-data-ref.c.
* tree-data-ref.c (ref_contains_union_access_p): New function.
(access_fn_component_p): Likewise.
(access_fn_components_comparable_p): Likewise.
(dr_analyze_indices): Add a reference to access_fn_component_p.
(dump_data_dependence_relation): Use SUB_ACCESS_FN instead of
DR_ACCESS_FN.
(constant_access_functions): Likewise.
(add_other_self_distances): Likewise.
(same_access_functions): Likewise. (Moved from tree-data-ref.h.)
(initialize_data_dependence_relation): Use XCNEW and remove
explicit zeroing of DDR_REVERSED_P. Look for a subsequence
of access functions that have the same type. Allow the
subsequence to end with different bases in some circumstances.
Record the chosen access functions in SUB_ACCESS_FN.
(build_classic_dist_vector_1): Replace ddr_a and ddr_b with
a_index and b_index. Use SUB_ACCESS_FN instead of DR_ACCESS_FN.
(subscript_dependence_tester_1): Likewise dra and drb.
(build_classic_dist_vector): Update calls accordingly.
(subscript_dependence_tester): Likewise.
* tree-ssa-loop-prefetch.c (determine_loop_nest_reuse): Check
DDR_COULD_BE_INDEPENDENT_P.
* tree-vectorizer.h (LOOP_REQUIRES_VERSIONING_FOR_ALIAS): Test
comp_alias_ddrs instead of may_alias_ddrs.
* tree-vect-data-refs.c (vect_analyze_possibly_independent_ddr):
New function.
(vect_analyze_data_ref_dependence): Use it if
DDR_COULD_BE_INDEPENDENT_P, but fall back to using the recorded
distance vectors if that fails.
(dependence_distance_ge_vf): New function.
(vect_prune_runtime_alias_test_list): Use it. Don't clear
LOOP_VINFO_MAY_ALIAS_DDRS.
gcc/testsuite/
* gcc.dg/vect/vect-alias-check-3.c: New test.
* gcc.dg/vect/vect-alias-check-4.c: Likewise.
* gcc.dg/vect/vect-alias-check-5.c: Likewise.
From-SVN: r250867
Currently we generate an if with probability set on only one of the two edges:
<bb 5> [0.00%] [count: INV]:
_5 = mask.3[iter.6_3];
if (_5 == 0)
goto <bb 6>; [INV] [count: INV]
else
goto <bb 2>; [100.00%] [count: INV]
Add the missing edge probability, and set the split to unlikely/likely:
if (_5 == 0)
goto <bb 6>; [19.99%] [count: INV]
else
goto <bb 2>; [80.01%] [count: INV]
2017-08-04 Tom de Vries <tom@codesourcery.com>
* omp-simd-clone.c (simd_clone_adjust): Add missing edge probability.
From-SVN: r250865
2017-08-03 Richard Biener <rguenther@suse.de>
* lto-symtab.h (lto_symtab_prevail_decl): Do not use
DECL_ABSTRACT_ORIGIN as flag we can end up using that. Instead
use DECL_LANG_FLAG_0.
(lto_symtab_prevail_decl): Likewise.
From-SVN: r250856
2017-08-03 Richard Biener <rguenther@suse.de>
* tree-ssa-reassoc.c (should_break_up_subtract): Also break
up if the use is in USE - X.
* gcc.dg/tree-ssa/reassoc-23.c: Adjust to fool early folding
and CSE.
From-SVN: r250855
* toplev.c (dumpfile.h): New include.
(internal_error_reentered): New static function. Use it...
(internal_error_function): ...here to handle reentered internal_error.
From-SVN: r250854
2017-08-03 Richard Biener <rguenther@suse.de>
PR middle-end/81148
* fold-const.c (split_tree): Add minus_var and minus_con
arguments, remove unused loc arg. Never generate NEGATE_EXPRs
here but always use minus_*.
(associate_trees): Assert we never associate with MINUS_EXPR
and NULL first operand. Do not recurse for PLUS_EXPR operands
when associating as MINUS_EXPR either.
(fold_binary_loc): Track minus_var and minus_con.
* c-c++-common/ubsan/pr81148.c: New testcase.
From-SVN: r250853
2017-08-03 Tom de Vries <tom@codesourcery.com>
PR lto/81430
* tree-streamer-in.c (lto_input_ts_function_decl_tree_pointers): If
ACCEL_COMPILER, apply finish_options on
DECL_FUNCTION_SPECIFIC_OPTIMIZATION.
From-SVN: r250852
PR driver/81650
* calls.c (alloc_max_size): Use HOST_WIDE_INT_UC (10??)
instead of 10??LU, perform unit multiplication in wide_int,
don't change alloc_object_size_limit if the limit is larger
than SSIZE_MAX.
* gcc.dg/pr81650.c: New test.
From-SVN: r250850
PR tree-optimization/81655
PR tree-optimization/81588
* tree-ssa-reassoc.c (optimize_range_tests_var_bound): Handle also
the case when ranges[i].low and high are 1 for unsigned type with
precision 1.
From-SVN: r250849
PR middle-end/81052
* omp-low.c (diagnose_sb_0): Handle flag_openmp_simd like flag_openmp.
(pass_diagnose_omp_blocks::gate): Enable also for flag_openmp_simd.
* c-c++-common/pr81052.c: New test.
From-SVN: r250847
When finalizing the methods of a named struct type, we used to
finalize all the field types first. That can fail if the field types
refer indirectly to the named type. Change it to just finalize the
embedded field types first, and the rest of the fields later.
Fixesgolang/go#21253
Reviewed-on: https://go-review.googlesource.com/52570
From-SVN: r250832
PR c++/81640
* call.c (build_user_type_conversion_1): Only call
lookup_fnfields_slot if totype is CLASS_TYPE_P.
* g++.dg/warn/Wshadow-compatible-local-2.C: New test.
From-SVN: r250816
PR middle-end/79499
* function.c (thread_prologue_and_epilogue_insns): Determine blocks
for find_many_sub_basic_blocks bitmap by looking up BLOCK_FOR_INSN
of first NONDEBUG_INSN_P in each of the split_prologue_seq and
prologue_seq sequences - if any.
* gcc.dg/pr79499.c: New test.
From-SVN: r250814
2017-08-02 Richard Biener <rguenther@suse.de>
* tree-vect-stmts.c (vectorizable_store): Perform vector extracts
via vectors if supported, integer extracts via punning if supported
or otherwise vector extracts.
From-SVN: r250813
* c-ada-spec.c (has_static_fields): Look only into fields.
(dump_generic_ada_node): Small tweak.
(dump_nested_types): Look only into fields.
(print_ada_declaration): Look only into methods. Small tweak.
(print_ada_struct_decl): Look only into fields. Use DECL_VIRTUAL_P.
From-SVN: r250802
Add some tests for implementing interrupt handlers with naked attribute
and without asm statements.
* gcc.dg/guality/pr25967-3.c: New test.
* gcc.dg/guality/pr25967-4.c: Likewise.
* gcc.dg/torture/pr25967-3.c: Likewise.
* gcc.dg/torture/pr25967-4.c: Likewise.
From-SVN: r250800