This fixes a mistake in the previous change in this area so that it
does what was desired: compute the largest power-of-two group size
fitting in the matching area.
2020-10-27 Richard Biener <rguenther@suse.de>
* tree-vect-slp.c (vect_build_slp_instance): Use ceil_log2
to compute maximum group-size.
* gcc.dg/vect/bb-slp-67.c: New testcase.
PR ipa/97586
* ipa-modref-tree.h (modref_tree::remap_params): New member function.
* ipa-modref.c (modref_summaries_lto::duplicate): Check that
optimization summaries are not duplicated.
(remap_arguments): Remove.
(modref_transform): Rename to ...
(update_signature): ... this one; handle also lto summary.
(pass_ipa_modref::execute): Update signatures here rather
than in transform hook.
This adjusts the condition for splitting at control altering stmts
so that we split only when there's a definition. It also removes the
only use of --param slp-max-insns-in-bb, which a previous change left
doing nothing (other than repeatedly printing a message for each
successive instruction...).
2020-10-27 Richard Biener <rguenther@suse.de>
* tree-vect-slp.c (vect_slp_bbs): Remove no-op
slp-max-insns-in-bb check.
(vect_slp_function): Dump when splitting the function.
Adjust the split condition for control altering stmts.
* params.opt (-param=slp-max-insns-in-bb): Remove.
* doc/invoke.texi (-param=slp-max-insns-in-bb): Likewise.
gcc/analyzer/ChangeLog:
PR analyzer/97568
* region-model.cc (region_model::get_initial_value_for_global):
Move check that !DECL_EXTERNAL from here to...
* region.cc (decl_region::get_svalue_for_initializer): ...here,
using it to reject zero initialization.
gcc/testsuite/ChangeLog:
PR analyzer/97568
* gcc.dg/analyzer/pr97568.c: New test.
Casting to intptr_t states the intent of a pointer to integer cast
more clearly and ensures that the cast causes no loss of precision on
any platform. LLP64 platforms, for example, have 4-byte long values
and 8-byte pointer values, which may even cause compiler errors.
gcc/analyzer/ChangeLog:
PR analyzer/96608
* store.h (hash): Cast to intptr_t instead of long
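To illustrate the point about LLP64, here is a minimal sketch of hashing
a pointer through intptr_t. The helper names hash_pointer and
round_trips_p are hypothetical and are not the analyzer's code; only the
choice of intptr_t over long comes from the change described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not the analyzer's actual hash: turn a pointer
   into an integer suitable for mixing into a hash value.  */
static size_t
hash_pointer (const void *ptr)
{
  /* intptr_t is defined to hold any object pointer, so no bits are
     dropped here even where long is narrower than a pointer.  */
  assert (sizeof (intptr_t) >= sizeof (void *));
  intptr_t bits = (intptr_t) ptr;
  return (size_t) bits;
}

/* Return nonzero if converting PTR to intptr_t and back yields PTR.  */
static int
round_trips_p (const void *ptr)
{
  return (const void *) (intptr_t) ptr == ptr;
}
```

With a cast to long instead, the same code would truncate the pointer on
a platform where long is 4 bytes and pointers are 8 bytes.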
This patch is a followup to the previous one, eliminating
non-determinism in the behavior of the analyzer (rather than just in
the logs), by sorting whenever the result previously depended on
pointer values. Tested as per the previous patch.
gcc/analyzer/ChangeLog:
* constraint-manager.cc (svalue_cmp_by_ptr): Delete.
(equiv_class::canonicalize): Use svalue::cmp_ptr_ptr instead.
(equiv_class_cmp): Eliminate pointer comparison.
* diagnostic-manager.cc (dedupe_key::comparator): If they are at
the same location, also compare epath length and pending_diagnostic
kind.
* engine.cc (readability_comparator): If two path_vars have the
same readability, then impose an arbitrary ordering on them.
(worklist::key_t::cmp): If two points have the same plan ordering,
continue the comparison. Call sm_state_map::cmp rather than
comparing hash values.
* program-state.cc (sm_state_map::entry_t::cmp): New.
(sm_state_map::cmp): New.
* program-state.h (sm_state_map::entry_t::cmp): New decl.
(sm_state_map::elements): New.
(sm_state_map::cmp): New.
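The general technique can be sketched as a qsort callback that orders
objects by stable fields rather than by their addresses. Everything in
this sketch (struct item, its kind and uid fields, item_cmp_ptr_ptr,
sort_items) is a hypothetical stand-in and not the analyzer's own types;
it only mirrors the cmp_ptr_ptr naming used above.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an analyzer object with a stable identity.  */
struct item
{
  int kind;  /* primary sort key */
  int uid;   /* stable tie-breaker assigned at creation time */
};

/* qsort callback over an array of item pointers.  It never compares
   the addresses themselves, so the resulting order is the same on
   every run regardless of address space randomization.  */
static int
item_cmp_ptr_ptr (const void *p1, const void *p2)
{
  const struct item *a = *(const struct item *const *) p1;
  const struct item *b = *(const struct item *const *) p2;
  if (a->kind != b->kind)
    return a->kind < b->kind ? -1 : 1;
  if (a->uid != b->uid)
    return a->uid < b->uid ? -1 : 1;
  return 0;
}

/* Sort N item pointers into a run-independent order.  */
static void
sort_items (const struct item **items, size_t n)
{
  assert (items != NULL || n == 0);
  if (n > 1)
    qsort (items, n, sizeof items[0], item_cmp_ptr_ptr);
}
```

The tie-breaker matters: if two objects compare equal on every field,
qsort is free to leave them in either order, which reintroduces the
dependence on the input order.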
This patch and the followup eliminate various forms of non-determinism
in the analyzer due to changing pointer values.
This patch fixes churn seen when diffing analyzer logs. The patch
avoids embedding pointers in various places, and adds sorting when
dumping hash_set and hash_map for various analyzer types. Doing so
requires implementing a way to sort svalue instances, and assigning UIDs
to gimple statements.
Tested both patches together via a script that runs a testcase 100 times,
and then using diff and md5sum to verify that the results are consistent
in the face of address space randomization:
  FILENAME=$1
  rm $FILENAME.*
  for i in `seq 1 100`; do
      echo "iteration: $i"
      ./xgcc -B. -fanalyzer -c ../../src/gcc/testsuite/gcc.dg/analyzer/$FILENAME \
         -Wanalyzer-too-complex \
         -fdump-analyzer-supergraph \
         -fdump-analyzer-exploded-graph \
         -fdump-analyzer \
         -fdump-noaddr \
         -fdump-analyzer-exploded-nodes-2
      mv $FILENAME.supergraph.dot $FILENAME.$i.supergraph.dot
      mv $FILENAME.analyzer.txt $FILENAME.$i.analyzer.txt
      mv $FILENAME.supergraph-eg.dot $FILENAME.$i.supergraph-eg.dot
      mv $FILENAME.eg.txt $FILENAME.$i.eg.txt
      mv $FILENAME.eg.dot $FILENAME.$i.eg.dot
  done
gcc/analyzer/ChangeLog:
* engine.cc (setjmp_record::cmp): New.
(supernode_cluster::dump_dot): Avoid embedding pointer in cluster
name.
(supernode_cluster::cmp_ptr_ptr): New.
(function_call_string_cluster::dump_dot): Avoid embedding pointer
in cluster name. Sort m_map when dumping child clusters.
(function_call_string_cluster::cmp_ptr_ptr): New.
(root_cluster::dump_dot): Sort m_map when dumping child clusters.
* program-point.cc (function_point::cmp): New.
(function_point::cmp_ptr): New.
* program-point.h (function_point::cmp): New decl.
(function_point::cmp_ptr): New decl.
* program-state.cc (sm_state_map::print): Sort the values. Guard
the printing of pointers with !flag_dump_noaddr.
(program_state::prune_for_point): Sort the regions.
(log_set_of_svalues): Sort the values. Guard the printing of
pointers with !flag_dump_noaddr.
* region-model-manager.cc (log_uniq_map): Sort the values.
* region-model-reachability.cc (dump_set): New function template.
(reachable_regions::dump_to_pp): Use it.
* region-model.h (svalue::cmp_ptr): New decl.
(svalue::cmp_ptr_ptr): New decl.
(setjmp_record::cmp): New decl.
(placeholder_svalue::get_name): New accessor.
(widening_svalue::get_point): New accessor.
(compound_svalue::get_map): New accessor.
(conjured_svalue::get_stmt): New accessor.
(conjured_svalue::get_id_region): New accessor.
(region::cmp_ptrs): Rename to...
(region::cmp_ptr_ptr): ...this.
* region.cc (region::cmp_ptrs): Rename to...
(region::cmp_ptr_ptr): ...this.
* state-purge.cc
(state_purge_per_ssa_name::state_purge_per_ssa_name): Sort
m_points_needing_name when dumping.
* store.cc (concrete_binding::cmp_ptr_ptr): New.
(symbolic_binding::cmp_ptr_ptr): New.
(binding_map::cmp): New.
(get_sorted_parent_regions): Update for renaming of
region::cmp_ptrs to region::cmp_ptr_ptr.
(store::dump_to_pp): Likewise.
(store::to_json): Likewise.
(store::can_merge_p): Sort the base regions before considering
them.
* store.h (concrete_binding::cmp_ptr_ptr): New decl.
(symbolic_binding::cmp_ptr_ptr): New decl.
(binding_map::cmp): New decl.
* supergraph.cc (supergraph::supergraph): Assign UIDs to the
gimple stmts.
* svalue.cc (cmp_cst): New.
(svalue::cmp_ptr): New.
(svalue::cmp_ptr_ptr): New.
This was effectively checking for one beyond the limit, rather than
the limit itself.
Seen when fixing PR analyzer/97514.
gcc/analyzer/ChangeLog:
* engine.cc (exploded_graph::get_or_create_node): Fix off-by-one
when imposing param_analyzer_max_enodes_per_program_point limit.
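As an illustration of this class of bug, the sketch below contrasts a
check that trips one past the limit with one that trips at the limit.
The function names and the exact comparison are assumptions made for the
example; they are not copied from exploded_graph::get_or_create_node.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical limit checks, not the analyzer's code.  COUNT is the
   number of nodes already created at a program point.  */

/* Buggy form: only refuses once COUNT is already past LIMIT, so
   LIMIT + 1 nodes can be created.  */
static bool
may_add_node_buggy (int count, int limit)
{
  return !(count > limit);
}

/* Fixed form: refuses as soon as COUNT has reached LIMIT.  */
static bool
may_add_node (int count, int limit)
{
  return count < limit;
}

/* Count how many nodes the predicate MAY_ADD lets through.  */
static int
nodes_created (bool (*may_add) (int, int), int limit)
{
  assert (limit >= 0);
  int count = 0;
  while (may_add (count, limit))
    count++;
  return count;
}
```

With a limit of 8, the buggy predicate admits 9 nodes and the fixed one
admits 8.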
This fixes an ICE seen e.g. with gcc.dg/analyzer/data-model-16.c when
enabling -fdump-analyzer.
gcc/analyzer/ChangeLog:
* region-model.cc (region_model::get_representative_path_var):
Implement case RK_LABEL.
* region-model.h (label_region::get_label): New accessor.
This refactors the array descriptor component access tree building
to commonize code into new helpers to provide a single place to
fix correctness issues with respect to TBAA.
The only interesting part is the gfc_conv_descriptor_data_get change
to drop broken special-casing of REFERENCE_TYPE desc which, when hit,
would build invalid GENERIC trees, missing an INDIRECT_REF before
subsetting the descriptor with a COMPONENT_REF.
2020-10-16 Richard Biener <rguenther@suse.de>
gcc/fortran/ChangeLog:
* trans-array.c (gfc_get_descriptor_field): New helper.
(gfc_conv_descriptor_data_get): Use it - drop strange
REFERENCE_TYPE handling and make sure we don't trigger it.
(gfc_conv_descriptor_data_addr): Use gfc_get_descriptor_field.
(gfc_conv_descriptor_data_set): Likewise.
(gfc_conv_descriptor_offset): Likewise.
(gfc_conv_descriptor_dtype): Likewise.
(gfc_conv_descriptor_span): Likewise.
(gfc_get_descriptor_dimension): Likewise.
(gfc_conv_descriptor_token): Likewise.
(gfc_conv_descriptor_subfield): New helper.
(gfc_conv_descriptor_stride): Use it.
(gfc_conv_descriptor_lbound): Likewise.
(gfc_conv_descriptor_ubound): Likewise.
This makes SLP discovery detect backedges by seeding the bst_map with
the node to be analyzed so it can be picked up from recursive calls.
This removes the need to discover backedges in a separate walk.
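The idea of seeding the map before recursing can be sketched on a toy
graph as follows. The types and names here (struct vertex, struct node,
struct builder, build) are hypothetical simplifications and not the SLP
data structures; the map array merely plays the role that bst_map plays
in the description above.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 8

/* Hypothetical input graph: each vertex has up to two successors,
   -1 meaning "none".  */
struct vertex { int succ[2]; };

/* Hypothetical built node, a much simplified analogue of an SLP node.  */
struct node
{
  struct node *child[2];
  int done;        /* set once all children have been built */
  int backedges;   /* number of children reached via a backedge */
};

struct builder
{
  const struct vertex *graph;
  struct node nodes[MAX_NODES];
  struct node *map[MAX_NODES];   /* vertex -> node, seeded early */
};

/* Build the node for vertex V.  B must be zero-initialized apart from
   its graph pointer.  */
static struct node *
build (struct builder *b, int v)
{
  assert (v >= 0 && v < MAX_NODES);
  if (b->map[v])
    return b->map[v];

  /* Seed the map with the stub before recursing, so that a recursive
     call arriving back at V finds it instead of looping forever.  */
  struct node *n = &b->nodes[v];
  b->map[v] = n;

  for (int i = 0; i < 2; i++)
    {
      int s = b->graph[v].succ[i];
      if (s < 0)
        continue;
      struct node *c = build (b, s);
      /* A child that is in the map but not finished is still on the
         recursion stack, so this edge is a backedge.  */
      if (!c->done)
        n->backedges++;
      n->child[i] = c;
    }
  n->done = 1;
  return n;
}
```

Because the stub is visible during the recursion, cycles are found in
the same walk that builds the nodes and no second pass is needed.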
This enables SLP build to handle PHI nodes in full, continuing
the SLP build to non-backedges. For loop vectorization this
enables outer loop vectorization of nested SLP cycles and for
BB vectorization this enables vectorization of PHIs at CFG merges.
It also turns code generation into a SCC discovery walk to handle
irreducible regions and nodes only reachable via backedges where
we now also fill in vectorized backedge defs.
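For readers unfamiliar with SCC discovery walks, here is a compact
sketch of Tarjan's algorithm on a small adjacency-list graph. It is a
generic textbook formulation under assumed names (struct graph,
struct scc_state, scc_visit, find_sccs) and is not claimed to match
vect_schedule_scc.

```c
#include <assert.h>

#define MAX_V 8

/* Hypothetical graph representation, not GCC's SLP graph.  */
struct graph
{
  int n;                    /* number of vertices */
  int nsucc[MAX_V];         /* successor count per vertex */
  int succ[MAX_V][MAX_V];   /* successor lists */
};

struct scc_state
{
  const struct graph *g;
  int index[MAX_V];      /* DFS number, 0 = not yet visited */
  int lowlink[MAX_V];    /* smallest DFS number reachable */
  int on_stack[MAX_V];
  int stack[MAX_V];
  int sp;
  int next_index;
  int scc_of[MAX_V];     /* component id assigned to each vertex */
  int nscc;
};

static void
scc_visit (struct scc_state *s, int v)
{
  s->index[v] = s->lowlink[v] = ++s->next_index;
  s->stack[s->sp++] = v;
  s->on_stack[v] = 1;

  for (int i = 0; i < s->g->nsucc[v]; i++)
    {
      int w = s->g->succ[v][i];
      if (s->index[w] == 0)
        {
          scc_visit (s, w);
          if (s->lowlink[w] < s->lowlink[v])
            s->lowlink[v] = s->lowlink[w];
        }
      else if (s->on_stack[w] && s->index[w] < s->lowlink[v])
        /* Edge back into the current DFS stack: part of a cycle.  */
        s->lowlink[v] = s->index[w];
    }

  /* V is the root of a component: pop the component off as one unit.  */
  if (s->lowlink[v] == s->index[v])
    {
      int w;
      do
        {
          w = s->stack[--s->sp];
          s->on_stack[w] = 0;
          s->scc_of[w] = s->nscc;
        }
      while (w != v);
      s->nscc++;
    }
}

/* Fill S with the components of G and return how many there are.  */
static int
find_sccs (struct scc_state *s, const struct graph *g)
{
  assert (g->n <= MAX_V);
  *s = (struct scc_state) { .g = g };
  for (int v = 0; v < g->n; v++)
    if (s->index[v] == 0)
      scc_visit (s, v);
  return s->nscc;
}
```

Components are completed successors first, so a consumer that processes
them in order of component id sees every component after the ones it
points to, with each cycle handled as a single unit.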
This requires sanitizing the SLP tree for SLP reduction chains even
more, manually filling the backedge SLP def.
This also exposes the fact that CFG copying (and edge splitting
until I fixed that) ends up with different edge order in the
copy which doesn't play well with the desired 1:1 mapping of
SLP PHI node children and edges for epilogue vectorization.
I've tried to fixup CFG copying here but this really looks
like a dead (or expensive) end there so I've done fixup in
slpeel_tree_duplicate_loop_to_edge_cfg instead for the cases
we can run into.
There are still NULLs in the SLP_TREE_CHILDREN vectors and I'm
not sure it's possible to eliminate them all during this stage1, so
the patch has quite a few checks for this case all over the place.
Bootstrapped and tested on x86_64-unknown-linux-gnu. SPEC CPU 2017
and SPEC CPU 2006 successfully built and tested.
2020-10-27 Richard Biener <rguenther@suse.de>
* gimple.h (gimple_expr_type): For PHIs return the type
of the result.
* tree-vect-loop-manip.c (slpeel_tree_duplicate_loop_to_edge_cfg):
Make sure edge order into copied loop headers line up with the
originals.
* tree-vect-loop.c (vect_transform_cycle_phi): Handle nested
loops with SLP.
(vectorizable_phi): New function.
(vectorizable_live_operation): For BB vectorization compute insert
location here.
* tree-vect-slp.c (vect_free_slp_tree): Deal with NULL
SLP_TREE_CHILDREN entries.
(vect_create_new_slp_node): Add overloads with pre-existing node
argument.
(vect_print_slp_graph): Likewise.
(vect_mark_slp_stmts): Likewise.
(vect_mark_slp_stmts_relevant): Likewise.
(vect_gather_slp_loads): Likewise.
(vect_optimize_slp): Likewise.
(vect_slp_analyze_node_operations): Likewise.
(vect_bb_slp_scalar_cost): Likewise.
(vect_remove_slp_scalar_calls): Likewise.
(vect_get_and_check_slp_defs): Handle PHIs.
(vect_build_slp_tree_1): Handle PHIs.
(vect_build_slp_tree_2): Continue SLP build, following PHI
arguments. Fix memory leak.
(vect_build_slp_tree): Put stub node into the hash-map so
we can discover cycles directly.
(vect_build_slp_instance): Set the backedge SLP def for
reduction chains.
(vect_analyze_slp_backedges): Remove.
(vect_analyze_slp): Do not call it.
(vect_slp_convert_to_external): Release SLP_TREE_LOAD_PERMUTATION.
(vect_slp_analyze_node_operations): Handle stray failed
backedge defs by failing.
(vect_slp_build_vertices): Adjust leaf condition.
(vect_bb_slp_mark_live_stmts): Handle PHIs, use visited
hash-set to handle cycles.
(vect_slp_analyze_operations): Adjust.
(vect_bb_partition_graph_r): Likewise.
(vect_slp_function): Adjust split condition to allow CFG
merges.
(vect_schedule_slp_instance): Rename to ...
(vect_schedule_slp_node): ... this. Move DFS walk to ...
(vect_schedule_scc): ... this new function.
(vect_schedule_slp): Call it. Remove ad-hoc vectorized
backedge fill code.
* tree-vect-stmts.c (vect_analyze_stmt): Call
vectorizable_phi.
(vect_transform_stmt): Likewise.
(vect_is_simple_use): Handle vect_backedge_def.
* tree-vectorizer.c (vec_info::new_stmt_vec_info): Only
set loop header PHIs to vect_unknown_def_type for loop
vectorization.
* tree-vectorizer.h (enum vect_def_type): Add vect_backedge_def.
(enum stmt_vec_info_type): Add phi_info_type.
(vectorizable_phi): Declare.
* gcc.dg/vect/bb-slp-54.c: New test.
* gcc.dg/vect/bb-slp-55.c: Likewise.
* gcc.dg/vect/bb-slp-56.c: Likewise.
* gcc.dg/vect/bb-slp-57.c: Likewise.
* gcc.dg/vect/bb-slp-58.c: Likewise.
* gcc.dg/vect/bb-slp-59.c: Likewise.
* gcc.dg/vect/bb-slp-60.c: Likewise.
* gcc.dg/vect/bb-slp-61.c: Likewise.
* gcc.dg/vect/bb-slp-62.c: Likewise.
* gcc.dg/vect/bb-slp-63.c: Likewise.
* gcc.dg/vect/bb-slp-64.c: Likewise.
* gcc.dg/vect/bb-slp-65.c: Likewise.
* gcc.dg/vect/bb-slp-66.c: Likewise.
* gcc.dg/vect/vect-outer-slp-1.c: Likewise.
* gfortran.dg/vect/O3-bb-slp-1.f: Likewise.
* gfortran.dg/vect/O3-bb-slp-2.f: Likewise.
* g++.dg/vect/simd-11.cc: Likewise.
This makes sure to use splats early when facing uniform internal
operands in BB SLP discovery rather than relying on the late
heuristics re-building nodes from scratch.
2020-10-27 Richard Biener <rguenther@suse.de>
* tree-vect-slp.c (vect_build_slp_tree_2): When vectorizing
BBs splat uniform operands and stop SLP discovery.
* gcc.target/i386/pr95866-1.c: Adjust.
FAIL: gcc.target/powerpc/swaps-p8-22.c (test for excess errors)
Excess errors:
cc1: error: '-mcmodel' not supported in this configuration
* gcc.target/powerpc/swaps-p8-22.c: Enable only for aix and
-m64 linux.
The allocation of mutex objects for synchronized statements has been
moved to the library as of merging druntime 58560d51. All support code
in the compiler for getting the OS critical section size has been
removed along with it.
Reviewed-on: https://github.com/dlang/dmd/pull/11902
             https://github.com/dlang/druntime/pull/3248
gcc/ChangeLog:
* config/aarch64/aarch64-linux.h (GNU_USER_TARGET_D_CRITSEC_SIZE):
Remove.
* config/glibc-d.c (glibc_d_critsec_size): Likewise.
(TARGET_D_CRITSEC_SIZE): Likewise.
* config/i386/linux-common.h (GNU_USER_TARGET_D_CRITSEC_SIZE):
Likewise.
* config/sol2-d.c (solaris_d_critsec_size): Likewise.
(TARGET_D_CRITSEC_SIZE): Likewise.
* doc/tm.texi.in (TARGET_D_CRITSEC_SIZE): Likewise.
* doc/tm.texi: Regenerate.
gcc/d/ChangeLog:
* dmd/MERGE: Merge upstream dmd bec5973b0.
* d-target.cc (Target::critsecsize): Remove.
* d-target.def: Remove d_critsec_size.
libphobos/ChangeLog:
* libdruntime/MERGE: Merge upstream druntime 58560d51.
Fixes a bug where there were undefined template references when compiling
upstream dmd mainline.
In `TemplateInstance::semantic`, there exists special handling of
matching template instances for the same template declaration to ensure
that only at most one instance gets codegen'd.
If the primary instance `inst` originated from a non-root module, the
`minst` field will be updated so it is now coming from a root module,
however all Dsymbol `inst->members` of the instance still have their
`_scope->minst` pointing at the original non-root module. We must now
propagate `minst` to all members so that forward referenced dependencies
that get instantiated will also be appended to the root module,
otherwise there will be undefined references at link-time.
This doesn't affect compilations where all modules are compiled
together, as every module is a root module in that situation. What this
primarily affects are cases where there is a mix of root and non-root
modules, and a template was first instantiated in a non-root context,
then later instantiated again in a root context.
Reviewed-on: https://github.com/dlang/dmd/pull/11867
gcc/d/ChangeLog:
* dmd/MERGE: Merge upstream dmd 0fcdaab32
gcc/ChangeLog:
PR gcov-profile/97461
* gcov-io.h (GCOV_PREALLOCATED_KVP): Pre-allocate 64
static counters.
libgcc/ChangeLog:
PR gcov-profile/97461
* libgcov.h (gcov_counter_add): Use first static counters
as it should help to have malloc wrappers set up.
gcc/testsuite/ChangeLog:
PR gcov-profile/97461
* gcc.dg/tree-prof/pr97461.c: New test.
gcc/ada/
* exp_spark.adb (Expand_SPARK_Array_Aggregate): Dedicated
routine for array aggregates; mostly reuses existing code, but
calls itself recursively for multi-dimensional array aggregates.
(Expand_SPARK_N_Aggregate): Call Expand_SPARK_Array_Aggregate to
do the actual expansion, starting from the first index of the
array type.
gcc/ada/
* sem_aggr.adb (Resolve_Iterated_Component_Association): New
internal subprogram Remove_References, to reset semantic
information on each reference to the index variable of the
association, so that Collect_Aggregate_Bounds can work properly
on multidimensional arrays with nested associations, and
subsequent expansion into loops can verify that dimensions of
each subaggregate are compatible.
* builtin-attrs.def (STRERRNOC): New macro.
(STRERRNOP): New macro.
(ATTR_ERRNOCONST_NOTHROW_LEAF_LIST): New attr list.
(ATTR_ERRNOPURE_NOTHROW_LEAF_LIST): New attr list.
* builtins.def (ATTR_MATHFN_ERRNO): Use
ATTR_ERRNOCONST_NOTHROW_LEAF_LIST.
(ATTR_MATHFN_FPROUNDING_ERRNO): Use ATTR_ERRNOCONST_NOTHROW_LEAF_LIST
or ATTR_ERRNOPURE_NOTHROW_LEAF_LIST.
- Generalize logic for translating arch to internal flags. This patch
is infrastructure for supporting sub-extension parsing.
gcc/ChangeLog
* common/config/riscv/riscv-common.c (opt_var_ref_t): New.
(riscv_ext_flag_table_t): New.
(riscv_ext_flag_table): New.
(riscv_parse_arch_string): Pass gcc_options* instead of
&opts->x_target_flags only, and use riscv_arch_option_table to
set up flags.
(riscv_handle_option): Update argument for riscv_parse_arch_string.
(riscv_expand_arch): Ditto.
(riscv_expand_arch_from_cpu): Ditto.
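A table-driven mapping of this kind can be sketched as follows. All
names in the sketch (struct options, the FLAG_* values, ext_flag_table,
set_flag_for_ext) are hypothetical, and offsetof is used here only as a
C stand-in for a reference to an option variable; none of it is taken
from riscv-common.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical option storage with more than one flag word, which is
   why the whole structure is passed rather than a single word.  */
struct options
{
  int target_flags;
  int ext_flags;
};

/* Hypothetical flag bits.  */
enum { FLAG_MUL = 1 << 0, FLAG_ATOMIC = 1 << 1, FLAG_ZICSR = 1 << 0 };

/* One row per extension: which variable to touch and which bit to set.  */
struct ext_flag_entry
{
  const char *ext;
  size_t var_offset;   /* offset of the int field in struct options */
  int mask;
};

static const struct ext_flag_entry ext_flag_table[] = {
  { "m", offsetof (struct options, target_flags), FLAG_MUL },
  { "a", offsetof (struct options, target_flags), FLAG_ATOMIC },
  { "zicsr", offsetof (struct options, ext_flags), FLAG_ZICSR },
};

/* Set the flag for extension EXT in OPTS.  Return 1 if EXT is in the
   table and 0 otherwise.  */
static int
set_flag_for_ext (struct options *opts, const char *ext)
{
  assert (opts != NULL && ext != NULL);
  size_t n = sizeof ext_flag_table / sizeof ext_flag_table[0];
  for (size_t i = 0; i < n; i++)
    if (strcmp (ext_flag_table[i].ext, ext) == 0)
      {
        int *var = (int *) ((char *) opts + ext_flag_table[i].var_offset);
        *var |= ext_flag_table[i].mask;
        return 1;
      }
  return 0;
}
```

Adding a new extension then means adding a table row rather than another
branch in the parser.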
* tree.c (set_call_expr_flags): Fix string for ECF_RET1.
(build_common_builtin_nodes): Do not set ECF_RET1 for memcpy, memmove,
and memset. They are handled by builtin_fnspec.
This introduces a global alloc-pool for SLP nodes to reduce overhead
on SLP allocation churn which will get worse and to eventually release
SLP cycles which will retain a refcount of one and thus are never
freed at the moment.
2020-10-26 Richard Biener <rguenther@suse.de>
* tree-vectorizer.h (slp_tree_pool): Declare.
(_slp_tree::operator new): Likewise.
(_slp_tree::operator delete): Likewise.
* tree-vectorizer.c (vectorize_loops): Allocate and free the
slp_tree_pool.
(pass_slp_vectorize::execute): Likewise.
* tree-vect-slp.c (slp_tree_pool): Define.
(_slp_tree::operator new): Likewise.
(_slp_tree::operator delete): Likewise.
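The benefit of a pool for nodes that may never be freed individually
can be sketched with a small fixed-capacity free-list allocator. The
names (struct tree_node, struct node_pool, pool_init, pool_alloc,
pool_free, pool_destroy) are hypothetical and the fixed capacity is a
simplification for the example; this is not the allocator GCC uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical node type standing in for an SLP node.  */
struct tree_node
{
  struct tree_node *next_free;   /* free-list link while unused */
  int refcnt;
};

struct node_pool
{
  struct tree_node *storage;     /* one block holding every node */
  struct tree_node *free_list;
  size_t capacity;
};

/* Set up POOL with room for CAPACITY nodes.  Return 0 on success.  */
static int
pool_init (struct node_pool *pool, size_t capacity)
{
  assert (capacity > 0);
  pool->storage = calloc (capacity, sizeof *pool->storage);
  if (!pool->storage)
    return -1;
  pool->capacity = capacity;
  pool->free_list = NULL;
  /* Thread every slot onto the free list, lowest address first.  */
  for (size_t i = capacity; i-- > 0; )
    {
      pool->storage[i].next_free = pool->free_list;
      pool->free_list = &pool->storage[i];
    }
  return 0;
}

/* Take a node from POOL, or return NULL if it is exhausted.  */
static struct tree_node *
pool_alloc (struct node_pool *pool)
{
  struct tree_node *n = pool->free_list;
  if (n)
    {
      pool->free_list = n->next_free;
      n->next_free = NULL;
      n->refcnt = 1;
    }
  return n;
}

/* Hand N back to POOL for reuse.  */
static void
pool_free (struct node_pool *pool, struct tree_node *n)
{
  n->next_free = pool->free_list;
  pool->free_list = n;
}

/* Release everything at once, including nodes that were never handed
   back, such as members of a reference cycle.  */
static void
pool_destroy (struct node_pool *pool)
{
  free (pool->storage);
  pool->storage = pool->free_list = NULL;
  pool->capacity = 0;
}
```

Allocation and release become list operations, and destroying the pool
reclaims nodes whose refcount never dropped to zero.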
We now correctly detect that a job server is not active for
an LTO link:
lto-wrapper: warning: jobserver is not available: '--jobserver-auth=' is not present in 'MAKEFLAGS'
In that situation we should not call make -f abc.mk as it can lead
to N^2 LTRANS units.
gcc/ChangeLog:
* lto-wrapper.c (run_gcc): Do not use sub-make when jobserver is
not detected properly.
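The check named in the warning above can be sketched as a substring
test on the MAKEFLAGS value. The function names are hypothetical, and
the real detection in lto-wrapper.c may do more than this; the sketch
only covers the '--jobserver-auth=' condition quoted in the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Return true if MAKEFLAGS (which may be NULL when the variable is
   unset) advertises a jobserver via --jobserver-auth=.  */
static bool
jobserver_auth_present_p (const char *makeflags)
{
  return makeflags != NULL
         && strstr (makeflags, "--jobserver-auth=") != NULL;
}

/* Hypothetical policy helper: only hand work to a sub-make when
   parallelism was requested through the jobserver and one was found.  */
static bool
use_sub_make_p (bool jobserver_requested, const char *makeflags)
{
  assert (makeflags == NULL || makeflags[0] != '\0' || true);
  return jobserver_requested && jobserver_auth_present_p (makeflags);
}
```

A caller would typically pass the result of getenv ("MAKEFLAGS") and
fall back to running the LTRANS units without make when the helper
returns false.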
When running with -m32
FAIL: gcc.target/powerpc/pr94740.c (test for excess errors)
Excess errors:
cc1: error: '-mpcrel' requires '-mcmodel=medium'
The others don't run for -m32, but remove the unnecessary -mpcrel
anyway.
* gcc.target/powerpc/localentry-1.c: Remove -mpcrel from options.
* gcc.target/powerpc/notoc-direct-1.c: Likewise.
* gcc.target/powerpc/pr94740.c: Likewise.
All these tests fail with -m32 due to lack of int128 support, in some
cases with what I thought was not the best error message. For example
vsx_mask-move-runnable.c:34:3: error: unknown type name 'vector'
is misleading. The problem isn't "vector" but "vector __uint128_t".
* gcc.target/powerpc/vsx-load-element-extend-char.c: Require int128.
* gcc.target/powerpc/vsx-load-element-extend-int.c: Likewise.
* gcc.target/powerpc/vsx-load-element-extend-longlong.c: Likewise.
* gcc.target/powerpc/vsx-load-element-extend-short.c: Likewise.
* gcc.target/powerpc/vsx-store-element-truncate-char.c: Likewise.
* gcc.target/powerpc/vsx-store-element-truncate-int.c: Likewise.
* gcc.target/powerpc/vsx-store-element-truncate-longlong.c: Likewise.
* gcc.target/powerpc/vsx-store-element-truncate-short.c: Likewise.
* gcc.target/powerpc/vsx_mask-count-runnable.c: Likewise.
* gcc.target/powerpc/vsx_mask-expand-runnable.c: Likewise.
* gcc.target/powerpc/vsx_mask-extract-runnable.c: Likewise.
* gcc.target/powerpc/vsx_mask-move-runnable.c: Likewise.