Gold plugins may wish to further process an input file added by a plugin. For
example, the plugin may need to assign a unique segment for sections in a
plugin-generated input file. This patch adds a plugin callback that the linker
will call when reading symbols from a new input file added after the
all_symbols_read event (i.e. an input file added by a plugin).
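The control flow can be sketched as below. This is a minimal mock, not the
real API: every type and function name here (demo_input_file,
demo_register_new_input, demo_linker_add_input and so on) is a hypothetical
stand-in for the declarations in plugin-api.h, and the handler body only
counts files where a real plugin would assign sections to a segment.

```c
#include <stddef.h>

/* Hypothetical stand-ins; the real declarations live in plugin-api.h.  */
enum demo_status { DEMO_OK = 0, DEMO_ERR = 1 };

struct demo_input_file
{
  const char *name;
  int plugin_added;   /* Nonzero if a plugin added this file.  */
};

typedef enum demo_status (*demo_new_input_handler)
  (const struct demo_input_file *file);

/* Handler slot owned by the mock linker.  */
static demo_new_input_handler demo_handler;

/* Number of files the plugin handler has processed so far.  */
static int demo_files_seen;

/* Mock of the registration entry point the linker hands to the plugin.  */
enum demo_status
demo_register_new_input (demo_new_input_handler handler)
{
  if (handler == NULL)
    return DEMO_ERR;
  demo_handler = handler;
  return DEMO_OK;
}

/* Plugin side: a real plugin would further process FILE here, e.g.
   assign its sections to a unique segment.  */
enum demo_status
demo_on_new_input (const struct demo_input_file *file)
{
  (void) file;
  demo_files_seen++;
  return DEMO_OK;
}

/* Linker side: called while reading symbols from FILE.  Only files
   added by a plugin trigger the callback.  */
enum demo_status
demo_linker_add_input (const struct demo_input_file *file)
{
  if (file->plugin_added && demo_handler != NULL)
    return demo_handler (file);
  return DEMO_OK;
}
```

In this mock, files supplied by the user pass through untouched and only
plugin-added files reach the handler.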
2017-11-10 Stephen Crane <sjc@immunant.com>
* plugin-api.h: Add plugin API for processing plugin-added
input files.
From-SVN: r254640
* auto-profile.c (afdo_indirect_call): Drop frequency.
* cgraph.c (symbol_table::create_edge): Drop frequency argument.
(cgraph_node::create_edge): Drop frequency argument.
(cgraph_node::create_indirect_edge): Drop frequency argument.
(cgraph_edge::make_speculative): Drop frequency arguments.
(cgraph_edge::resolve_speculation): Do not update frequencies.
(cgraph_edge::dump_edge_flags): Do not dump frequency.
(cgraph_node::dump): Check consistency in IPA mode.
(cgraph_edge::maybe_hot_p): Use IPA counter.
(cgraph_edge::verify_count_and_frequency): Rename to ...
(cgraph_edge::verify_count): ... this one; drop frequency checking.
(cgraph_node::verify_node): Update.
* cgraph.h (struct cgraph_edge): Drop frequency.
(cgraph_edge::frequency): New function.
* cgraphbuild.c (pass_build_cgraph_edges::execute): Do not pass
frequencies.
(cgraph_edge::rebuild_edges): Likewise.
* cgraphclones.c (cgraph_edge::clone): Scale only counts.
(duplicate_thunk_for_node): Do not pass frequency.
(cgraph_node::create_clone): Scale only counts.
(cgraph_node::create_virtual_clone): Do not pass frequency.
(cgraph_node::create_edge_including_clones): Do not pass frequency.
(cgraph_node::create_version_clone): Do not pass frequency.
* cgraphunit.c (cgraph_node::analyze): Do not pass frequency.
(cgraph_node::expand_thunk): Do not pass frequency.
(cgraph_node::create_wrapper): Do not pass frequency.
* gimple-iterator.c (update_call_edge_frequencies): Do not pass
frequency.
* gimple-streamer-in.c (input_bb): Scale only IPA counts.
* ipa-chkp.c (chkp_produce_thunks): Do not pass frequency.
* ipa-cp.c (ipcp_lattice::print): Use frequency function.
(gather_caller_stats): Use frequency function.
(ipcp_cloning_candidate_p): Use frequency function.
(ipcp_propagate_stage): Use frequency function.
(get_info_about_necessary_edges): Use frequency function.
(update_profiling_info): Update only IPA profile.
(update_specialized_profile): Use frequency function.
(perhaps_add_new_callers): Update only IPA profile.
* ipa-devirt.c (ipa_devirt): Use IPA profile.
* ipa-fnsummary.c (redirect_to_unreachable): Do not set frequency.
(dump_ipa_call_summary): Use frequency function.
(estimate_edge_size_and_time): Use frequency function.
(ipa_merge_fn_summary_after_inlining): Use frequency function.
* ipa-inline-analysis.c (do_estimate_edge_time): Use IPA profile.
* ipa-inline-transform.c (update_noncloned_frequencies): Rename to ...
(update_noncloned_counts): ... this one; scale counts only.
(clone_inlined_nodes): Do not scale frequency.
(inline_call): Do not pass frequency.
* ipa-inline.c (compute_uninlined_call_time): Use IPA profile.
(compute_inlined_call_time): Use IPA profile.
(want_inline_small_function_p): Use IPA profile.
(want_inline_self_recursive_call_p): Use IPA profile.
(edge_badness): Use IPA profile.
(lookup_recursive_calls): Use IPA profile.
(recursive_inlining): Do not pass frequency.
(resolve_noninline_speculation): Do not update frequency.
(inline_small_functions): Collect max of IPA profile.
(dump_overall_stats): Dump IPA profile.
(dump_inline_stats): Dump IPA profile.
(ipa_inline): Collect IPA stats.
* ipa-inline.h (clone_inlined_nodes): Update prototype.
* ipa-profile.c (ipa_propagate_frequency_1): Use frequency function.
(ipa_propagate_frequency): Use frequency function.
(ipa_profile): Cleanup.
* ipa-prop.c (ipa_make_edge_direct_to_target): Do not pass frequency.
* ipa-utils.c (ipa_merge_profiles): Merge all profiles.
* lto-cgraph.c (lto_output_edge): Do not stream frequency.
(input_node): Do not stream frequency.
(input_edge): Do not stream frequency.
(merge_profile_summaries): Scale only IPA profiles.
* omp-simd-clone.c (simd_clone_adjust): Do not pass frequency.
* predict.c (drop_profile): Do not recompute frequency.
* trans-mem.c (ipa_tm_insert_irr_call): Do not pass frequency.
(ipa_tm_insert_gettmclone_call): Do not pass frequency.
* tree-cfg.c (execute_fixup_cfg): Drop profile to global0 if needed.
* tree-chkp.c (chkp_copy_bounds_for_assign): Do not pass frequency.
* tree-emutls.c (gen_emutls_addr): Do not pass frequency.
* tree-inline.c (copy_bb): Do not scale frequency.
(expand_call_inline): Do not scale frequency.
(tree_function_versioning): Do not scale frequency.
* ubsan.c (ubsan_create_edge): Do not pass frequency.
lto/ChangeLog:
2017-11-10 Jan Hubicka <hubicka@ucw.cz>
* lto-partition.c (lto_balanced_map): Use frequency accessor.
From-SVN: r254636
For the most part, testcases under gcc.target/arm/cmse/baseline and
gcc.target/arm/cmse/mainline are duplicate copies that differ only in
their dejagnu directives. Although there is no requirement for them to
be similar, keeping them identical makes it possible to compare the code
generated for each architecture and makes it easier to update the
testcases when code generation changes for both architectures (if one
needs updating, so does the other).
Similarly, all the tests in gcc.target/arm/cmse/mainline/<floatabi> have
the same source but are duplicate copies.
This patch moves all the code in the tests to a parent directory:
gcc.target/arm/cmse for tests shared by Armv8-M Baseline and Mainline
and gcc.target/arm/cmse/mainline for tests shared *only* by the various
float ABIs of Armv8-M Mainline. C includes are then used where the code
used to sit.
Note that the cmse-13.c test used to differ slightly between the
architectures and float ABIs tested, in the first floating-point constant
passed to bar: sometimes 1.0 and sometimes 3.0. This patch settles on
3.0 to avoid confusion with the 1.0 constant used to clear VFP registers
in some of the configurations.
2017-11-10 Thomas Preud'homme <thomas.preudhomme@arm.com>
gcc/testsuite/
* gcc.target/arm/cmse/bitfield-4.x: New file.
* gcc.target/arm/cmse/baseline/bitfield-4.c: Remove code and include
above file.
* gcc.target/arm/cmse/mainline/bitfield-4.c: Likewise.
* gcc.target/arm/cmse/bitfield-5.x: New file.
* gcc.target/arm/cmse/baseline/bitfield-5.c: Remove code and include
above file.
* gcc.target/arm/cmse/mainline/bitfield-5.c: Likewise.
* gcc.target/arm/cmse/bitfield-6.x: New file.
* gcc.target/arm/cmse/baseline/bitfield-6.c: Remove code and include
above file.
* gcc.target/arm/cmse/mainline/bitfield-6.c: Likewise.
* gcc.target/arm/cmse/bitfield-7.x: New file.
* gcc.target/arm/cmse/baseline/bitfield-7.c: Remove code and include
above file.
* gcc.target/arm/cmse/mainline/bitfield-7.c: Likewise.
* gcc.target/arm/cmse/bitfield-8.x: New file.
* gcc.target/arm/cmse/baseline/bitfield-8.c: Remove code and include
above file.
* gcc.target/arm/cmse/mainline/bitfield-8.c: Likewise.
* gcc.target/arm/cmse/bitfield-9.x: New file.
* gcc.target/arm/cmse/baseline/bitfield-9.c: Remove code and include
above file.
* gcc.target/arm/cmse/mainline/bitfield-9.c: Likewise.
* gcc.target/arm/cmse/bitfield-and-union.x: New file.
* gcc.target/arm/cmse/baseline/bitfield-and-union-1.c: Rename into ...
* gcc.target/arm/cmse/baseline/bitfield-and-union.c: this. Remove code
and include above bitfield-and-union.x file.
* gcc.target/arm/cmse/mainline/bitfield-and-union-1.c: Rename into ...
* gcc.target/arm/cmse/mainline/bitfield-and-union.c: this. Remove code
and include above bitfield-and-union.x file.
* gcc.target/arm/cmse/cmse-13.x: New file.
* gcc.target/arm/cmse/baseline/cmse-13.c: Remove code and include above
file.
* gcc.target/arm/cmse/mainline/hard-sp/cmse-13.c: Likewise.
* gcc.target/arm/cmse/mainline/hard/cmse-13.c: Likewise.
* gcc.target/arm/cmse/mainline/soft/cmse-13.c: Likewise.
* gcc.target/arm/cmse/mainline/softfp/cmse-13.c: Likewise.
* gcc.target/arm/cmse/cmse-5.x: New file.
* gcc.target/arm/cmse/mainline/hard-sp/cmse-5.c: Remove code and
include above file.
* gcc.target/arm/cmse/mainline/hard/cmse-5.c: Likewise.
* gcc.target/arm/cmse/mainline/soft/cmse-5.c: Likewise.
* gcc.target/arm/cmse/mainline/softfp-sp/cmse-5.c: Likewise.
* gcc.target/arm/cmse/mainline/softfp/cmse-5.c: Likewise.
* gcc.target/arm/cmse/cmse-7.x: New file.
* gcc.target/arm/cmse/mainline/hard-sp/cmse-7.c: Remove code and
include above file.
* gcc.target/arm/cmse/mainline/hard/cmse-7.c: Likewise.
* gcc.target/arm/cmse/mainline/soft/cmse-7.c: Likewise.
* gcc.target/arm/cmse/mainline/softfp-sp/cmse-7.c: Likewise.
* gcc.target/arm/cmse/mainline/softfp/cmse-7.c: Likewise.
* gcc.target/arm/cmse/cmse-8.x: New file.
* gcc.target/arm/cmse/mainline/hard-sp/cmse-8.c: Remove code and
include above file.
* gcc.target/arm/cmse/mainline/hard/cmse-8.c: Likewise.
* gcc.target/arm/cmse/mainline/soft/cmse-8.c: Likewise.
* gcc.target/arm/cmse/mainline/softfp-sp/cmse-8.c: Likewise.
* gcc.target/arm/cmse/mainline/softfp/cmse-8.c: Likewise.
* gcc.target/arm/cmse/union-1.x: New file.
* gcc.target/arm/cmse/baseline/union-1.c: Remove code and include above
file.
* gcc.target/arm/cmse/mainline/union-1.c: Likewise.
* gcc.target/arm/cmse/union-2.x: New file.
* gcc.target/arm/cmse/baseline/union-2.c: Remove code and include above
file.
* gcc.target/arm/cmse/mainline/union-2.c: Likewise.
From-SVN: r254633
2017-11-10 Martin Liska <mliska@suse.cz>
PR gcov-profile/82702
* gcov.c (main): Handle intermediate files in a different
way.
(get_gcov_intermediate_filename): New function.
(output_gcov_file): Remove support of intermediate files.
(generate_results): Allocate intermediate file.
(release_structures): Clean-up properly fn_end.
(output_intermediate_file): Start iterating with line 1.
From-SVN: r254629
2017-11-10 Martin Liska <mliska@suse.cz>
* coverage.c (coverage_init): Stream information about
support of has_unexecuted_blocks.
* doc/gcov.texi: Document that.
* gcov-dump.c (dump_gcov_file): Support it in gcov_dump tool.
* gcov.c (read_graph_file): Likewise.
(output_line_beginning): Fix a small issue with
color output.
From-SVN: r254627
2017-11-10 Paul Thomas <pault@gcc.gnu.org>
PR fortran/82934
* trans-stmt.c (gfc_trans_allocate): Remove the gcc_assert on
null string length for assumed length typespec and set
expr3_esize to NULL_TREE.
2017-11-10 Paul Thomas <pault@gcc.gnu.org>
PR fortran/82934
* gfortran.dg/allocate_assumed_charlen_1.f90: New test.
From-SVN: r254624
When gcc-dg-runtest is used to run a test, the test is run several times
with different options. For clarity of the log, the test infrastructure
then appends the options to the test name. This means that all the code
that deals with the testcase itself (e.g. removing the output files
after the test has run) needs to strip the options from the name.
There is already a pattern (see below) for this in several places in the
testsuite framework, but it is missing in many others. This patch
fixes all of these places. The pattern is as follows:
    set testcase [testname-for-summary]
    ; The name might include a list of options; extract the file name.
    set testcase [lindex $testcase 0]
2017-11-10 Thomas Preud'homme <thomas.preudhomme@arm.com>
gcc/testsuite/
* lib/scanasm.exp (scan-assembler): Extract filename from testname used
in summary.
(scan-assembler-not): Likewise.
(scan-hidden): Likewise.
(scan-not-hidden): Likewise.
(scan-stack-usage): Likewise.
(scan-stack-usage-not): Likewise.
(scan-assembler-times): Likewise.
(scan-assembler-dem): Likewise.
(scan-assembler-dem-not): Likewise.
(object-size): Likewise.
(scan-lto-assembler): Likewise.
* lib/scandump.exp (scan-dump): Likewise.
(scan-dump-times): Likewise.
(scan-dump-not): Likewise.
(scan-dump-dem): Likewise.
(scan-dump-dem-not): Likewise.
From-SVN: r254622
* gcc-interface/utils.c (convert) <RECORD_TYPE>: Add comment and do
not fall through to the next case.
<ARRAY_TYPE>: Deal specially with a dereference from another array
type with the same element type.
From-SVN: r254618
PR rtl-optimization/82913
* compare-elim.c (try_merge_compare): Punt if def_insn is not
single set.
* gcc.c-torture/compile/pr82913.c: New test.
From-SVN: r254614
* vr-values.h: New file with vr_values class.
* tree-vrp.c: Include vr-values.h.
(vrp_value_range_pool, vrp_equiv_obstack, num_vr_values): Move static
data objects into the vr_values class.
(vr_value, values_propagated, vr_phi_edge_counts): Likewise.
(get_value_range): Make it a member function within vr_values class.
(set_defs_to_varying, update_value_range, add_equivalence): Likewise.
(vrp_stmt_computes_nonzero_p, op_with_boolean_value_range_p): Likewise.
(op_with_constant_singleton_value_range): Likewise.
(extract_range_for_var_from_comparison_expr): Likewise.
(extract_range_from_assert, extract_range_from_ssa_name): Likewise.
(extract_range_from_binary_expr): Likewise.
(extract_range_from_unary_expr): Likewise.
(extract_range_from_cond_expr, extract_range_from_comparison): Likewise.
(check_for_binary_op_overflow, extract_range_basic): Likewise.
(extract_range_from_assignment, adjust_range_with_scev): Likewise.
(dump_all_value_ranges, get_vr_for_comparison): Likewise.
(compare_name_with_value, compare_names): Likewise.
(vrp_evaluate_conditional_warnv_with_ops_using_ranges): Likewise.
(vrp_evaluate_conditional_warnv_with_ops): Likewise. Remove prototype.
(vrp_evaluate_conditional, vrp_visit_cond_stmt): Likewise.
(vrp_visit_switch_stmt, extract_range_from_stmt): Likewise.
(extract_range_from_phi_node): Likewise.
(simplify_truth_ops_using_ranges): Likewise.
(simplify_div_or_mod_using_ranges): Likewise.
(simplify_min_or_max_using_ranges, simplify_abs_using_ranges): Likewise.
(simplify_bit_ops_using_ranges, simplify_cond_using_ranges_1): Likewise.
(simplify_cond_using_ranges_2, simplify_switch_using_ranges): Likewise.
(simplify_float_conversion_using_ranges): Likewise.
(simplify_internal_call_using_ranges): Likewise.
(two_valued_val_range_p, simplify_stmt_using_ranges): Likewise.
(vrp_visit_assignment_or_call): Likewise. Smuggle class instance
pointer via x_vr_values for calls into gimple folder.
(vrp_initialize_lattice): Make this the vr_values ctor.
(vrp_free_lattice): Make this the vr_values dtor.
(set_vr_value): New function.
(class vrp_prop): Add vr_values data member. Add various member
functions as well as member functions that delegate to vr_values.
(check_array_ref): Make a member function within vrp_prop class.
(search_for_addr_array, vrp_initialize): Likewise.
(vrp_finalize): Likewise. Revamp to avoid direct access to
vr_value, values_propagated, etc.
(check_array_bounds): Extract vrp_prop class instance pointer from
walk info structure. Use it to call member functions.
(check_all_array_refs): Make a member function within vrp_prop class.
Smuggle class instance pointer via walk info structure.
(x_vr_values): New local static.
(vrp_valueize): Use x_vr_values to get class instance.
(vr_valueize_1): Likewise.
(class vrp_folder): Add vr_values data member. Add various member
functions as well as member functions that delegate to vr_values.
(fold_predicate_in): Make a member function within vrp_folder class.
(simplify_stmt_for_jump_threading): Extract smuggled vr_values
class instance from vr_values. Use it to call member functions.
(vrp_dom_walker): Add vr_values data member.
(vrp_dom_walker::after_dom_children): Smuggle vr_values class
instance via x_vr_values.
(identify_jump_threads): Accept vr_values as argument. Store
it into the walker structure.
(evrp_dom_walker): Add vr_values class data member. Add various
delegators.
(evrp_dom_walker::try_find_new_range): Use vr_values data
member to access the memory allocator.
(evrp_dom_walker::before_dom_children): Store vr_values class
instance into the vrp_folder class.
(evrp_dom_walker::push_value_range): Rework to avoid direct
access to num_vr_values and vr_value.
(evrp_dom_walker::pop_value_range): Likewise.
(execute_early_vrp): Remove call to vrp_initialize_lattice.
Use vr_values to get to dump_all_value_ranges member function.
Remove call to vrp_free_lattice. Call vrp_initialize, vrp_finalize,
and simplify_cond_using_ranges_2 via vrp_prop class instance.
Pass vr_values class instance down to identify_jump_threads.
Remove call to vrp_free_lattice.
(debug_all_value_ranges): Remove.
From-SVN: r254613
* tree-vrp.c (set_value_range): Do not reference vrp_equiv_obstack.
Get it from the existing bitmap instead.
(vrp_intersect_ranges_1): Likewise.
From-SVN: r254611
For a misaligned address force a panic rather than assuming that reading
from the address 0 will cause one.
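The idea can be sketched in C as follows. This is only an illustration and
not the library source: the helper names are hypothetical, the 8-byte
boundary is an assumption, and abort () stands in for the runtime's panic
routine.

```c
#include <stdint.h>
#include <stdlib.h>

/* Return nonzero if ADDR is not aligned to an 8-byte boundary.
   The 8-byte requirement is an assumption made for this sketch.  */
int
demo_is_misaligned64 (const void *addr)
{
  return ((uintptr_t) addr & 7) != 0;
}

/* Fail explicitly on a misaligned address instead of relying on a
   read from address 0 to trap.  abort () is a stand-in for the
   runtime's panic routine.  */
void
demo_check_aligned64 (const void *addr)
{
  if (demo_is_misaligned64 (addr))
    abort ();
}
```

The explicit check does not depend on page 0 being unmapped, which is the
assumption the old approach relied on.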
Reviewed-on: https://go-review.googlesource.com/69850
From-SVN: r254610
* gimple-ssa-store-merging.c (struct store_immediate_info): Add
bit_not_p field.
(store_immediate_info::store_immediate_info): Add bitnotp argument,
set bit_not_p to it.
(imm_store_chain_info::coalesce_immediate_stores): Break group
if bit_not_p is different.
(count_multiple_uses, split_group,
imm_store_chain_info::output_merged_store): Handle info->bit_not_p.
(handled_load): Avoid multiple chained BIT_NOT_EXPRs.
(pass_store_merging::process_store): Handle BIT_{AND,IOR,XOR}_EXPR
result inverted using BIT_NOT_EXPR, compute bit_not_p, pass it
to store_immediate_info ctor.
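As an illustration of the kind of source these changes appear to target,
the sketch below stores the bitwise NOT of a BIT_XOR_EXPR result into four
adjacent byte-sized fields. The struct and function names are hypothetical,
and whether the pass actually merges these stores depends on the target and
the optimization level.

```c
/* Hypothetical example type: four adjacent byte-sized fields.  */
struct demo_bytes
{
  unsigned char a, b, c, d;
};

/* Each store writes the inverted result of a bitwise XOR of two
   loads.  Adjacent stores of this shape are candidates for being
   merged into a single wider store.  */
void
demo_xnor (struct demo_bytes *p, const struct demo_bytes *q)
{
  p->a = ~(p->a ^ q->a);
  p->b = ~(p->b ^ q->b);
  p->c = ~(p->c ^ q->c);
  p->d = ~(p->d ^ q->d);
}
```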
From-SVN: r254606
2017-11-09 Paul Thomas <pault@gcc.gnu.org>
PR fortran/78619
* check.c (same_type_check): Introduce a new argument 'assoc'
with default value false. If this is true, use the symbol type
spec of BT_PROCEDURE expressions.
(gfc_check_associated): Set 'assoc' true in the call to
'same_type_check'.
2017-11-09 Paul Thomas <pault@gcc.gnu.org>
PR fortran/78619
* gfortran.dg/pr78619.f90: New test.
From-SVN: r254605
2017-11-09 Steven G. Kargl <kargl@gcc.gnu.org>
PR fortran/78814
* interface.c (symbol_rank): Check for NULL pointer.
2017-11-09 Steven G. Kargl <kargl@gcc.gnu.org>
PR fortran/78814
* gfortran.dg/interface_40.f90: New testcase.
From-SVN: r254604
A number of instructions are output in assembler form by
output_return_instruction () when compiling a function with the
cmse_nonsecure_entry attribute for Armv8-M Mainline with the hardfloat
float ABI. However, the corresponding thumb2_cmse_entry_return insn
pattern does not account for all of these instructions when computing
its length.
This may lead GCC to use the wrong branch instruction due to an
incorrect computation of the offset between the branch instruction's
address and the target address.
This commit fixes the mismatch between what output_return_instruction ()
does and what the pattern thinks it does, and adds a note warning about
the mismatch to the affected functions' heading comments so that the
code does not get out of sync again.
Note: no test is provided because the C testcase is fragile (only works
on GCC 6) and the extracted RTL test fails to compile due to bugs in the
RTL frontend (PR82815 and PR82817).
2017-11-09 Thomas Preud'homme <thomas.preudhomme@arm.com>
gcc/
* config/arm/arm.c (output_return_instruction): Add comments to
indicate requirement for cmse_nonsecure_entry return to account
for the size of clearing instruction output here.
(thumb_exit): Likewise.
* config/arm/thumb2.md (thumb2_cmse_entry_return): Fix length for
return in hardfloat mode.
From-SVN: r254601
This makes the TOC register save a separate shrink-wrapping component.
If -msave-toc-indirect is not explicitly disabled, the patch enables it
and then moves the prologue code generated for it to a better place. So
far this only matters for indirect calls (for direct calls the save is
done in the PLT stub). The restore is always done directly after the bl
insn (the compiler generates a nop there, and the linker replaces it
with a load).
* config/rs6000/rs6000.c (machine_function): Add a bool,
"toc_is_wrapped_separately".
(rs6000_option_override_internal): Enable OPTION_MASK_SAVE_TOC_INDIRECT
if it wasn't explicitly set or unset, we are optimizing for speed, and
doing separate shrink-wrapping.
(rs6000_get_separate_components): Enable the TOC component if
saving the TOC register in the prologue.
(rs6000_components_for_bb): Handle the TOC component.
(rs6000_emit_prologue_components): Store the TOC register where needed.
(rs6000_set_handled_components): Mark TOC as handled, if handled.
(rs6000_emit_prologue): Don't save the TOC if that is already done.
From-SVN: r254599
This patch adds a target selector that says whether the target
supports IFN_MASK_STORE.
2017-11-09 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* doc/sourcebuild.texi (vect_masked_store): Document.
gcc/testsuite/
* lib/target-supports.exp (check_effective_target_vect_masked_store):
New proc.
* gcc.dg/vect/vect-cselim-1.c (foo): Mention that the second loop
is vectorizable with masked stores. Update scan-tree-dump-times
accordingly.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r254597
This patch adds a target selector to say whether it's possible to
align a local variable to the target's preferred vector alignment.
This can be false for large vectors if the alignment is only
a preference and not a hard requirement (and thus if there is no
need to support a stack realignment mechanism).
2017-11-09 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* doc/sourcebuild.texi (vect_align_stack_vars): Document.
gcc/testsuite/
* lib/target-supports.exp
(check_effective_target_vect_align_stack_vars): New proc.
* gcc.dg/vect/vect-23.c: Only expect the array to be aligned if
vect_align_stack_vars.
* gcc.dg/vect/vect-24.c: Likewise.
* gcc.dg/vect/vect-25.c: Likewise.
* gcc.dg/vect/vect-26.c: Likewise.
* gcc.dg/vect/vect-32-big-array.c: Likewise.
* gcc.dg/vect/vect-32.c: Likewise.
* gcc.dg/vect/vect-40.c: Likewise.
* gcc.dg/vect/vect-42.c: Likewise.
* gcc.dg/vect/vect-46.c: Likewise.
* gcc.dg/vect/vect-48.c: Likewise.
* gcc.dg/vect/vect-52.c: Likewise.
* gcc.dg/vect/vect-54.c: Likewise.
* gcc.dg/vect/vect-62.c: Likewise.
* gcc.dg/vect/vect-67.c: Likewise.
* gcc.dg/vect/vect-75-big-array.c: Likewise.
* gcc.dg/vect/vect-75.c: Likewise.
* gcc.dg/vect/vect-77-alignchecks.c: Likewise.
* gcc.dg/vect/vect-78-alignchecks.c: Likewise.
* gcc.dg/vect/vect-89-big-array.c: Likewise.
* gcc.dg/vect/vect-89.c: Likewise.
* gcc.dg/vect/vect-96.c: Likewise.
* gcc.dg/vect/vect-multitypes-3.c: Likewise.
* gcc.dg/vect/vect-multitypes-6.c: Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r254596
This patch adds a target selector for variable-length vectors.
Initially it's always false, but the SVE patch provides a case
in which it's true.
2017-11-09 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* doc/sourcebuild.texi (vect_variable_length): Document.
gcc/testsuite/
* lib/target-supports.exp
(check_effective_target_vect_variable_length): New proc.
* gcc.dg/vect/pr60482.c: XFAIL test for no epilog loop if
vect_variable_length.
* gcc.dg/vect/slp-reduc-6.c: XFAIL two-operation SLP if
vect_variable_length.
* gcc.dg/vect/vect-alias-check-5.c: XFAIL alias optimization if
vect_variable_length.
* gfortran.dg/vect/fast-math-mgrid-resid.f: XFAIL predictive
commoning optimization if vect_variable_length.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r254595
This patch adds a target selector that says whether we can ever
generate "unaligned" accesses, where "unaligned" is relative
to the target's preferred vector alignment. This is already true if:
vect_no_align && { ! vect_hw_misalign }
i.e. if the target doesn't have any alignment mechanism and also
doesn't allow unaligned accesses. It is also true (for the things
tested by gcc.dg/vect) if the target only wants things to be aligned
to an element; in that case every normal scalar access is "vector aligned".
2017-11-09 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* doc/sourcebuild.texi (vect_unaligned_possible): Document.
gcc/testsuite/
* lib/target-supports.exp
(check_effective_target_vect_unaligned_possible): New proc.
* gcc.dg/vect/slp-25.c: Extend XFAIL of peeling for alignment from
vect_no_align && { ! vect_hw_misalign } to ! vect_unaligned_possible.
* gcc.dg/vect/vect-multitypes-1.c: Likewise.
* gcc.dg/vect/vect-109.c: XFAIL vectorisation of an unaligned
access to ! vect_unaligned_possible.
* gcc.dg/vect/vect-33.c: Likewise.
* gcc.dg/vect/vect-42.c: Likewise.
* gcc.dg/vect/vect-56.c: Likewise.
* gcc.dg/vect/vect-60.c: Likewise.
* gcc.dg/vect/vect-96.c: Likewise.
* gcc.dg/vect/vect-peel-1.c: Likewise.
* gcc.dg/vect/vect-27.c: Extend XFAIL of unaligned vectorization from
vect_no_align && { ! vect_hw_misalign } to ! vect_unaligned_possible.
* gcc.dg/vect/vect-29.c: Likewise.
* gcc.dg/vect/vect-44.c: Likewise.
* gcc.dg/vect/vect-48.c: Likewise.
* gcc.dg/vect/vect-50.c: Likewise.
* gcc.dg/vect/vect-52.c: Likewise.
* gcc.dg/vect/vect-72.c: Likewise.
* gcc.dg/vect/vect-75-big-array.c: Likewise.
* gcc.dg/vect/vect-75.c: Likewise.
* gcc.dg/vect/vect-77-alignchecks.c: Likewise.
* gcc.dg/vect/vect-77-global.c: Likewise.
* gcc.dg/vect/vect-78-alignchecks.c: Likewise.
* gcc.dg/vect/vect-78-global.c: Likewise.
* gcc.dg/vect/vect-multitypes-3.c: Likewise.
* gcc.dg/vect/vect-multitypes-4.c: Likewise.
* gcc.dg/vect/vect-multitypes-6.c: Likewise.
* gcc.dg/vect/vect-peel-4.c: Likewise.
* gcc.dg/vect/vect-peel-3.c: Likewise, and also for peeling
for alignment.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r254594
This patch adds a target selector for targets whose
preferred_vector_alignment is the alignment of one element. We'll never
peel in that case, and the step of a loop that operates on normal (as
opposed to packed) elements will always divide the preferred alignment.
2017-11-09 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* doc/sourcebuild.texi (vect_element_align_preferred): Document.
gcc/testsuite/
* lib/target-supports.exp
(check_effective_target_vect_element_align_preferred): New proc.
(check_effective_target_vect_peeling_profitable): Test it.
* gcc.dg/vect/no-section-anchors-vect-31.c: Don't expect peeling
if vect_element_align_preferred.
* gcc.dg/vect/no-section-anchors-vect-64.c: Likewise.
* gcc.dg/vect/pr65310.c: Likewise.
* gcc.dg/vect/vect-26.c: Likewise.
* gcc.dg/vect/vect-54.c: Likewise.
* gcc.dg/vect/vect-56.c: Likewise.
* gcc.dg/vect/vect-58.c: Likewise.
* gcc.dg/vect/vect-60.c: Likewise.
* gcc.dg/vect/vect-89-big-array.c: Likewise.
* gcc.dg/vect/vect-89.c: Likewise.
* gcc.dg/vect/vect-92.c: Likewise.
* gcc.dg/vect/vect-peel-1.c: Likewise.
* gcc.dg/vect/vect-outer-3a-big-array.c: Expect the step to
divide the alignment if vect_element_align_preferred.
* gcc.dg/vect/vect-outer-3a.c: Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r254593
SLP load permutation fails if any individual permutation requires more
than two vector inputs. For 128-bit vectors, it's possible to permute
3 contiguous loads of 32-bit and 8-bit elements, but not 16-bit elements
or 64-bit elements. The results are reversed for 256-bit vectors,
and so on for wider vectors.
This patch adds a routine that tests whether a permute will require
three vectors for a given vector count and element size, then adds
vect_perm3_* target selectors for the cases that we currently use.
2017-11-09 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* doc/sourcebuild.texi (vect_perm_short, vect_perm_byte): Document
previously undocumented selectors.
(vect_perm3_byte, vect_perm3_short, vect_perm3_int): Document.
gcc/testsuite/
* lib/target-supports.exp (vect_perm_supported): New proc.
(check_effective_target_vect_perm3_int): Likewise.
(check_effective_target_vect_perm3_short): Likewise.
(check_effective_target_vect_perm3_byte): Likewise.
* gcc.dg/vect/slp-perm-1.c: Expect SLP load permutation to
succeed if vect_perm3_int.
* gcc.dg/vect/slp-perm-5.c: Likewise.
* gcc.dg/vect/slp-perm-6.c: Likewise.
* gcc.dg/vect/slp-perm-7.c: Likewise.
* gcc.dg/vect/slp-perm-8.c: Likewise vect_perm3_byte.
* gcc.dg/vect/slp-perm-9.c: Likewise vect_perm3_short.
Use vect_perm_short instead of vect_perm. Add a scan-tree-dump-not
test for vect_perm3_short targets.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r254592
Some tests assumed that there would only be 2 vector sizes if
vect_multiple_sizes, whereas for SVE there are three (SVE, 128-bit
and 64-bit). This patch replaces scan-tree-dump-times with
scan-tree-dump for vect_multiple_sizes but keeps it for
!vect_multiple_sizes.
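In a test file the change might look roughly like the fragment below. The
string "PATTERN" is a placeholder and is not taken from any of the listed
tests; only the use of the two selectors is the point.

```c
/* Exact count where only one vector size is tried.  */
/* { dg-final { scan-tree-dump-times "PATTERN" 1 "vect" { target { ! vect_multiple_sizes } } } } */
/* Presence only where the count depends on how many sizes are tried.  */
/* { dg-final { scan-tree-dump "PATTERN" "vect" { target vect_multiple_sizes } } } */
```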
2017-11-09 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/testsuite/
* gcc.dg/vect/no-vfa-vect-101.c: Use scan-tree-dump rather than
scan-tree-dump-times for vect_multiple_sizes.
* gcc.dg/vect/no-vfa-vect-102.c: Likewise.
* gcc.dg/vect/no-vfa-vect-102a.c: Likewise.
* gcc.dg/vect/no-vfa-vect-37.c: Likewise.
* gcc.dg/vect/no-vfa-vect-79.c: Likewise.
* gcc.dg/vect/vect-104.c: Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r254591
This patch adds a routine that lists the available vector sizes
for a target and uses it for some existing target conditions.
Later patches add more uses.
The cases are taken from multiple_sizes.
2017-11-09 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/testsuite/
* lib/target-supports.exp (available_vector_sizes): New proc.
(check_effective_target_vect_multiple_sizes): Use it.
(check_effective_target_vect64): Likewise.
(check_effective_target_vect_sizes_32B_16B): Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r254590
Several vector tests are sensitive to the vector size. This patch adds
a VECTOR_BITS macro to tree-vect.h to select the expected vector size
and uses it to influence iteration counts and array sizes. The tests
keep the original values if the vector size is small enough.
For now VECTOR_BITS is always 128, but the SVE patches add other values.
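A sketch of the pattern, assuming a test whose loop was originally written
for 16 elements. The scaling expression and the default of 16 are
illustrative assumptions rather than values taken from a specific test, and
the demo_ names are hypothetical. VECTOR_BITS would normally come from
tree-vect.h; it is defined locally here only so that the sketch stands alone.

```c
/* Normally provided by tree-vect.h; defined here so the sketch
   stands alone.  */
#ifndef VECTOR_BITS
#define VECTOR_BITS 128
#endif

/* Keep the original size when it is large enough; otherwise scale
   with the vector size (illustrative formula).  */
#if VECTOR_BITS > 128
#define N (VECTOR_BITS / 8)
#else
#define N 16
#endif

int demo_a[N];

/* Fill demo_a and return the sum of its elements, so that both the
   array size and the trip count follow N.  */
int
demo_sum (void)
{
  int sum = 0;
  for (int i = 0; i < N; i++)
    {
      demo_a[i] = i;
      sum += demo_a[i];
    }
  return sum;
}
```

With the default of 128 bits the test keeps its original size, so existing
expected results stay valid.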
2017-11-09 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/testsuite/
* gcc.dg/vect/tree-vect.h (VECTOR_BITS): Define.
* gcc.dg/vect/bb-slp-pr69907.c: Include tree-vect.h.
(N): New macro.
(foo): Use it instead of hard-coded 320.
* gcc.dg/vect/no-scevccp-outer-7.c (N): Redefine if the default
value is too small for VECTOR_BITS.
* gcc.dg/vect/no-scevccp-vect-iv-3.c (N): Likewise.
* gcc.dg/vect/no-section-anchors-vect-31.c (N): Likewise.
* gcc.dg/vect/no-section-anchors-vect-36.c (N): Likewise.
* gcc.dg/vect/slp-perm-9.c (N): Likewise.
* gcc.dg/vect/vect-32.c (N): Likewise.
* gcc.dg/vect/vect-75.c (N, OFF): Likewise.
* gcc.dg/vect/vect-77-alignchecks.c (N, OFF): Likewise.
* gcc.dg/vect/vect-78-alignchecks.c (N, OFF): Likewise.
* gcc.dg/vect/vect-89.c (N): Likewise.
* gcc.dg/vect/vect-96.c (N): Likewise.
* gcc.dg/vect/vect-multitypes-3.c (N): Likewise.
* gcc.dg/vect/vect-multitypes-6.c (N): Likewise.
* gcc.dg/vect/vect-over-widen-1.c (N): Likewise.
* gcc.dg/vect/vect-over-widen-4.c (N): Likewise.
* gcc.dg/vect/vect-reduc-pattern-1a.c (N): Likewise.
* gcc.dg/vect/vect-reduc-pattern-1b.c (N): Likewise.
* gcc.dg/vect/vect-reduc-pattern-2a.c (N): Likewise.
* gcc.dg/vect/no-section-anchors-vect-64.c (NINTS): New macro.
(N): Redefine in terms of NINTS.
(ia, ib, ic): Use NINTS instead of hard-coded constants in the
array bounds.
* gcc.dg/vect/no-section-anchors-vect-69.c (NINTS): New macro.
(N): Redefine in terms of NINTS.
(test1): Replace a and b fields with NINTS - 2 ints of padding.
(main1): Use NINTS instead of hard-coded constants.
* gcc.dg/vect/section-anchors-vect-69.c (NINTS): New macro.
(N): Redefine in terms of NINTS.
(test1): Replace a and b fields with NINTS - 2 ints of padding.
(test2): Remove incorrect comments about alignment.
(main1): Use NINTS instead of hard-coded constants.
* gcc.dg/vect/pr45752.c (N): Redefine if the default value is
too small for VECTOR_BITS.
(main): Continue to use canned results for the default value of N,
but compute the expected results from scratch for other values.
* gcc.dg/vect/slp-perm-1.c (N, main): As for pr45752.c.
* gcc.dg/vect/slp-perm-4.c (N, main): Likewise.
* gcc.dg/vect/slp-perm-5.c (N, main): Likewise.
* gcc.dg/vect/slp-perm-6.c (N, main): Likewise.
* gcc.dg/vect/slp-perm-7.c (N, main): Likewise.
* gcc.dg/vect/pr65518.c (NINTS, N, RESULT): New macros.
(giga): Use NINTS as the array bound.
(main): Use NINTS, N and RESULT.
* gcc.dg/vect/pr65947-5.c (N): Redefine if the default value is
too small for VECTOR_BITS.
(main): Fill in any remaining elements of A programmatically.
* gcc.dg/vect/pr81136.c: Include tree-vect.h.
(a): Use VECTOR_BITS to set the alignment of the target structure.
* gcc.dg/vect/slp-19c.c (N): Redefine if the default value is
too small for VECTOR_BITS.
(main1): Continue to use the canned input for the default value of N,
but compute the input from scratch for other values.
* gcc.dg/vect/slp-28.c (N): Redefine if the default value is
too small for VECTOR_BITS.
(in1, in2, in3): Remove initialization.
(check1, check2): Delete.
(main1): Initialize in1, in2 and in3 here. Check every element
of the vectors and compute the expected values directly instead
of using an array.
* gcc.dg/vect/slp-perm-8.c (N): Redefine if the default value is
too small for VECTOR_BITS.
(foo, main): Change type of "i" to int.
* gcc.dg/vect/vect-103.c (NINTS): New macro.
(N): Redefine in terms of NINTS.
(c): Delete.
(main1): Use NINTS. Check the result from a and b directly.
* gcc.dg/vect/vect-67.c (NINTS): New macro.
(N): Redefine in terms of NINTS.
(main1): Use NINTS for the inner array bounds.
* gcc.dg/vect/vect-70.c (NINTS, OUTERN): New macros.
(N): Redefine in terms of NINTS.
(s): Keep the outer dimensions as 4 even if N is larger than 24.
(tmp1): New variable.
(main1): Only define a local tmp1 if NINTS is relatively small.
Use OUTERN for the outer loops and NINTS for the inner loops.
* gcc.dg/vect/vect-91.c (OFF): New macro.
(a, main3): Use it.
* gcc.dg/vect/vect-92.c (NITER): New macro.
(main1, main2): Use it.
* gcc.dg/vect/vect-93.c (N): Rename to...
(N1): ...this.
(main): Update accordingly.
(N2): New macro.
(main1): Use N1 instead of 3001 and N2 instead of 10.
* gcc.dg/vect/vect-multitypes-1.c (NSHORTS, NINTS): New macros.
(N): Redefine in terms of NSHORTS.
(main1): Use NINTS - 1 instead of 3 and NSHORTS - 1 instead of 7.
(main): Likewise.
* gcc.dg/vect/vect-over-widen-3-big-array.c (N): Define to VECTOR_BITS.
(foo): Truncate the expected value to the type of *d.
* gcc.dg/vect/vect-peel-3.c (NINTS, EXTRA): New macros.
(ia, ib, ic, main): Use EXTRA.
(main): Use NINTS.
(RES_A, RES_B, RES_C): New macros.
(RES): Redefine as their sum.
* gcc.dg/vect/vect-reduc-or_1.c (N): New macro.
(in): Change number of elements to N.
(main): Update accordingly. Calculate the expected result.
* gcc.dg/vect/vect-reduc-or_2.c (N, in, main): As for
vect-reduc-or_1.c.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r254589
The recent gen_vec_duplicate patches used CONST_VECTOR for all
constants, but the documentation says:
@findex const_vector
@item (const_vector:@var{m} [@var{x0} @var{x1} @dots{}])
Represents a vector constant. The square brackets stand for the vector
containing the constant elements. @var{x0}, @var{x1} and so on are
the @code{const_int}, @code{const_double} or @code{const_fixed} elements.
Both the AArch32 and AArch64 ports relied on the elements having
this form and would ICE if the element was something like a CONST
instead. This showed up as a failure in vect-126.c for both arm-eabi
and aarch64-elf (but not aarch64-linux-gnu, which is what the series
was tested on).
The two obvious options were to redefine CONST_VECTOR to accept all
constants or make gen_vec_duplicate honour the existing documentation.
It looks like other code also assumes that integer CONST_VECTORs contain
CONST_INTs, so the patch does the latter.
I deliberately didn't add an assert to gen_const_vec_duplicate
because it looks like the SPU port *does* expect to be able to create
CONST_VECTORs of symbolic constants.
Also, I think the list above should include const_wide_int for vectors
of TImode and wider.
The new routine takes a mode for consistency with the generators,
and because I think it does make sense to accept all constants for
variable-length vectors, using:
(const (vec_duplicate ...))
rather than have some rtxes for which we instead use:
(vec_duplicate (const ...))
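The decision that gen_vec_duplicate now makes can be illustrated with a toy model. This is a sketch under stated assumptions rather than the code in emit-rtl.c: the toy_ enums and functions are hypothetical stand-ins for rtx codes and for the new predicate, and the mode argument is omitted because the toy predicate would ignore it.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for rtx codes; not GCC's real definitions.  */
enum toy_code
{
  TOY_CONST_INT,
  TOY_CONST_WIDE_INT,
  TOY_CONST_DOUBLE,
  TOY_CONST_FIXED,
  TOY_CONST,		/* e.g. a symbolic constant expression */
  TOY_SYMBOL_REF,
  TOY_REG
};

/* The two shapes a duplicated element can take in this model.  */
enum toy_result
{
  TOY_CONST_VECTOR,
  TOY_VEC_DUPLICATE
};

/* Toy version of the predicate: accept only the element kinds that
   the const_vector documentation lists, plus const_wide_int.  */
bool
toy_valid_for_const_vec_duplicate_p (enum toy_code code)
{
  switch (code)
    {
    case TOY_CONST_INT:
    case TOY_CONST_WIDE_INT:
    case TOY_CONST_DOUBLE:
    case TOY_CONST_FIXED:
      return true;
    default:
      return false;
    }
}

/* Toy version of the generator: fold to a constant vector only when
   the element is valid for one, otherwise keep a vec_duplicate.  */
enum toy_result
toy_gen_vec_duplicate (enum toy_code code)
{
  return (toy_valid_for_const_vec_duplicate_p (code)
	  ? TOY_CONST_VECTOR
	  : TOY_VEC_DUPLICATE);
}
```

In this model a CONST element, which CONSTANT_P would have accepted, stays as a vec_duplicate instead of becoming a const_vector element.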
2017-11-09 Richard Sandiford <richard.sandiford@linaro.org>
gcc/
* doc/rtl.texi (const_vector): Say that elements can be
const_wide_ints too.
* emit-rtl.h (valid_for_const_vec_duplicate_p): Declare.
* emit-rtl.c (valid_for_const_vec_duplicate_p): New function.
(gen_vec_duplicate): Use it instead of CONSTANT_P.
* optabs.c (expand_vector_broadcast): Likewise.
From-SVN: r254586
This patch improves the ivopts address cost calculation for modes
in which an index must be scaled rather than unscaled. Previously
we would only try the scaled form if the unscaled form was valid.
Many of the SVE tests rely on this when matching scaled indices.
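The change in behaviour can be modelled roughly as follows. This is a toy sketch, not the logic in get_address_cost: the toy_ names, the two boolean inputs and the complexity values are assumptions made for illustration.

```c
#include <stdbool.h>

/* Hypothetical outcomes for an address whose index needs a scale.  */
enum toy_form
{
  TOY_FALLBACK,			/* neither addressing form is valid */
  TOY_UNSCALED_PLUS_MULT,	/* unscaled index, separate multiply */
  TOY_SCALED			/* scaled index in the address itself */
};

struct toy_choice
{
  enum toy_form form;
  int complexity;
};

/* UNSCALED_OK and SCALED_OK say whether the target accepts the
   unscaled and scaled forms for the mode in question.  */
struct toy_choice
toy_get_address_form (bool unscaled_ok, bool scaled_ok)
{
  struct toy_choice c = { TOY_FALLBACK, 0 };
  if (scaled_ok)
    {
      /* Try the scaled form even if the unscaled one is invalid.  */
      c.form = TOY_SCALED;
      /* Count the scale as extra complexity only when an unscaled
	 index was a valid alternative.  */
      c.complexity = unscaled_ok ? 1 : 0;
    }
  else if (unscaled_ok)
    c.form = TOY_UNSCALED_PLUS_MULT;
  return c;
}
```

The case of interest is unscaled_ok false and scaled_ok true: under the old behaviour the scaled form would not have been tried, whereas the model selects it without adding complexity.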
2017-11-09 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
gcc/
* tree-ssa-loop-ivopts.c (get_address_cost): Try using a
scaled index even if the unscaled address was invalid.
Don't increase the complexity of using a scale in that case.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r254585