One of the things we want to do on AArch64 is compare vector loops
side-by-side and pick the best one. For some targets, we want this
to be based on issue rates as well as the usual latency-based costs
(at least for loops with relatively high iteration counts).
The current approach to doing this is: when costing vectorisation
candidate A, try to guess what the other main candidate B will look
like and adjust A's latency-based cost up or down based on the likely
difference between A and B's issue rates. This effectively means
that we try to cost parts of B at the same time as A, without actually
being able to see B.
This is needlessly indirect and complex. It was a compromise due
to the code being added (too) late in the GCC 11 cycle, so that
target-independent changes weren't possible.
The target-independent code already compares two candidate loop_vec_infos
side-by-side, so that information about A and B above is available
directly. This patch creates a way for targets to hook into this
comparison.
The AArch64 code can therefore hook into better_main_loop_than_p to
compare issue rates. If the issue rate comparison isn't decisive,
the code can fall back to the normal latency-based comparison instead.
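
As a rough illustration of the comparison order described above, the
following C sketch models a target hook that looks at issue rates first
and only falls back to latency when the issue rates tie. The struct, its
fields and the demo_* function names are hypothetical; the real hook is a
C++ member function of vector_costs and works on loop_vec_infos.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-candidate summary; not GCC's vector_costs layout.  */
struct demo_loop_cost
{
  unsigned int cycles_per_iter; /* issue-rate estimate per vector iteration */
  unsigned int vf;              /* scalar iterations covered per vector iteration */
  unsigned int latency_cost;    /* conventional latency-based cost */
};

/* Return -1 if A issues faster per scalar iteration, 1 if B does,
   and 0 if the issue rates are not decisive.  */
static int
demo_compare_issue_rates (const struct demo_loop_cost *a,
			  const struct demo_loop_cost *b)
{
  /* Compare a->cycles / a->vf with b->cycles / b->vf by cross-multiplying.  */
  unsigned long long lhs = (unsigned long long) a->cycles_per_iter * b->vf;
  unsigned long long rhs = (unsigned long long) b->cycles_per_iter * a->vf;
  if (lhs != rhs)
    return lhs < rhs ? -1 : 1;
  return 0;
}

/* Return true if candidate A should be preferred over candidate B.  */
bool
demo_better_main_loop_than_p (const struct demo_loop_cost *a,
			      const struct demo_loop_cost *b)
{
  assert (a->vf != 0 && b->vf != 0);

  int diff = demo_compare_issue_rates (a, b);
  if (diff != 0)
    return diff < 0;

  /* Issue rates tie: fall back to the latency-based comparison,
     again normalised per scalar iteration.  */
  return ((unsigned long long) a->latency_cost * b->vf
	  < (unsigned long long) b->latency_cost * a->vf);
}
```

Both candidates are visible to the comparison, so nothing about B has to
be guessed while costing A.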
gcc/
* tree-vectorizer.h (vector_costs::better_main_loop_than_p)
(vector_costs::better_epilogue_loop_than_p)
(vector_costs::compare_inside_loop_cost)
(vector_costs::compare_outside_loop_cost): Declare.
* tree-vectorizer.c (vector_costs::better_main_loop_than_p)
(vector_costs::better_epilogue_loop_than_p)
(vector_costs::compare_inside_loop_cost)
(vector_costs::compare_outside_loop_cost): New functions,
containing code moved from...
* tree-vect-loop.c (vect_better_loop_vinfo_p): ...here.
The vector costs now use a common base class instead of being
completely abstract. This means that there's no longer a
need to record the inside and outside costs separately.
gcc/
* tree-vectorizer.h (_loop_vec_info): Remove vec_outside_cost
and vec_inside_cost.
(vector_costs::outside_cost): New function.
* tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Update
after above.
(vect_estimate_min_profitable_iters): Likewise.
(vect_better_loop_vinfo_p): Get the inside and outside costs
from the loop_vec_infos' vector_costs.
target_cost_data is in vec_info but is really specific to
loop_vec_info. This patch moves it there and renames it to
vector_costs, to distinguish it from scalar target costs.
gcc/
* tree-vectorizer.h (vec_info::target_cost_data): Replace with...
(_loop_vec_info::vector_costs): ...this.
(LOOP_VINFO_TARGET_COST_DATA): Delete.
* tree-vectorizer.c (vec_info::vec_info): Remove target_cost_data
initialization.
(vec_info::~vec_info): Remove corresponding delete.
* tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Initialize
vector_costs to null.
(_loop_vec_info::~_loop_vec_info): Delete vector_costs.
(vect_analyze_loop_operations): Update after above changes.
(vect_analyze_loop_2): Likewise.
(vect_estimate_min_profitable_iters): Likewise.
* tree-vect-slp.c (vect_slp_analyze_operations): Likewise.
I had hoped that I was done with EAF flag related changes, but while looking into
the Fortran testcases I noticed that I had designed them in an unnecessarily
restrictive way. I followed the scheme of NOESCAPE and NODIRECTESCAPE, which is,
however, the only property that is naturally transitive.
This patch replaces the existing flags by 9 flags:
EAF_UNUSED
EAF_NO_DIRECT_CLOBBER and EAF_NO_INDIRECT_CLOBBER
EAF_NO_DIRECT_READ and EAF_NO_INDIRECT_READ
EAF_NO_DIRECT_ESCAPE and EAF_NO_INDIRECT_ESCAPE
EAF_NOT_RETURNED_DIRECTLY and EAF_NOT_RETURNED_INDIRECTLY
So I have removed the unified EAF_DIRECT flag and made each of the flags come
in a direct and an indirect variant. The indirect variant is no longer implied by
the direct one (well, except for escape, but that is not special-cased in the code).
Consequently we can analyse e.g. the case where a function reads directly and clobbers
indirectly, as in the following testcase:
struct wrap {
  void **array;
};
__attribute__ ((noinline))
void
write_array (struct wrap *ptr)
{
  ptr->array[0]=0;
}
int
test ()
{
  void *arrayval;
  struct wrap w = {&arrayval};
  write_array (&w);
  return w.array == &arrayval;
}
This is pretty common in array descriptors and also C++ pointer wrappers or structures
containing pointers to arrays.
Another advantage is that for !binds_to_current_def_p functions we can still track the
fact that the value is not clobbered indirectly, while previously we implied EAF_DIRECT
for all three cases.
Finally the propagation becomes more regular and I hope easier to understand
because the flags are handled in a symmetric way.
In tree-ssa-structalias I now produce a "callarg" var_info as before and, if necessary,
also an "indircallarg" for the indirect accesses. I added some logic to optimize the
common case where we cannot distinguish between direct and indirect accesses.
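
To make the direct/indirect split concrete, here is a small C sketch of how
independent flag bits let a caller reason about the write_array testcase.
The DEMO_* names and bit values are purely illustrative (the real EAF_*
constants are defined in tree-core.h and are not reproduced here), and the
flag set chosen for the parameter is my reading of "reads directly and
clobbers indirectly", not output taken from the analysis.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only; not the real EAF_* constants.  */
enum demo_eaf
{
  DEMO_NO_DIRECT_CLOBBER = 1 << 0,
  DEMO_NO_INDIRECT_CLOBBER = 1 << 1,
  DEMO_NO_DIRECT_READ = 1 << 2,
  DEMO_NO_INDIRECT_READ = 1 << 3
};

/* Flags one might assign to the PTR parameter of write_array:
   ptr->array is loaded (a direct read) and ptr->array[0] is stored to
   (an indirect clobber), so only the other two properties hold.  */
unsigned int
demo_write_array_flags (void)
{
  return DEMO_NO_DIRECT_CLOBBER | DEMO_NO_INDIRECT_READ;
}

/* May the call modify the object the argument points to?  */
bool
demo_may_clobber_directly (unsigned int flags)
{
  return !(flags & DEMO_NO_DIRECT_CLOBBER);
}

/* May the call modify memory reached through pointers stored in it?  */
bool
demo_may_clobber_indirectly (unsigned int flags)
{
  return !(flags & DEMO_NO_INDIRECT_CLOBBER);
}
```

With the direct and indirect bits tracked separately, a caller can conclude
that w itself is unchanged by the call (so w.array still equals &arrayval)
even though arrayval may have been written.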
gcc/ChangeLog:
2021-11-09 Jan Hubicka <hubicka@ucw.cz>
* tree-core.h (EAF_DIRECT): Remove.
(EAF_NOCLOBBER): Remove.
(EAF_UNUSED): Remove.
(EAF_NOESCAPE): Remove.
(EAF_NO_DIRECT_CLOBBER): New.
(EAF_NO_INDIRECT_CLOBBER): New.
(EAF_NODIRECTESCAPE): Remove.
(EAF_NO_DIRECT_ESCAPE): New.
(EAF_NO_INDIRECT_ESCAPE): New.
(EAF_NOT_RETURNED): Remove.
(EAF_NOT_RETURNED_INDIRECTLY): New.
(EAF_NOREAD): Remove.
(EAF_NO_DIRECT_READ): New.
(EAF_NO_INDIRECT_READ): New.
* gimple.c (gimple_call_arg_flags): Update for new flags.
(gimple_call_retslot_flags): Update for new flags.
* ipa-modref.c (dump_eaf_flags): Likewise.
(remove_useless_eaf_flags): Likewise.
(deref_flags): Likewise.
(modref_lattice::init): Likewise.
(modref_lattice::merge): Likewise.
(modref_lattice::merge_direct_load): Likewise.
(modref_lattice::merge_direct_store): Likewise.
(modref_eaf_analysis::merge_call_lhs_flags): Likewise.
(callee_to_caller_flags): Likewise.
(modref_eaf_analysis::analyze_ssa_name): Likewise.
(modref_eaf_analysis::propagate): Likewise.
(modref_merge_call_site_flags): Likewise.
* ipa-modref.h (interposable_eaf_flags): Likewise.
* tree-ssa-alias.c (ref_maybe_used_by_call_p_1): Likewise.
* tree-ssa-structalias.c (handle_call_arg): Likewise.
(handle_rhs_call): Likewise.
* tree-ssa-uninit.c (maybe_warn_pass_by_reference): Likewise.
gcc/testsuite/ChangeLog:
* g++.dg/ipa/modref-1.C: Update template.
* gcc.dg/ipa/modref-3.c: Update template.
* gcc.dg/lto/modref-3_0.c: Update template.
* gcc.dg/lto/modref-4_0.c: Update template.
* gcc.dg/tree-ssa/modref-10.c: Update template.
* gcc.dg/tree-ssa/modref-11.c: Update template.
* gcc.dg/tree-ssa/modref-5.c: Update template.
* gcc.dg/tree-ssa/modref-6.c: Update template.
* gcc.dg/tree-ssa/modref-13.c: New test.
These tests are still failing on SPARC and it looks like this is because I need
to use vect_long_long instead of vect_long.
gcc/testsuite/ChangeLog:
PR testsuite/103042
* gcc.dg/vect/complex/bb-slp-complex-add-pattern-long.c: Use
vect_long_long instead of vect_long.
* gcc.dg/vect/complex/bb-slp-complex-add-pattern-unsigned-long.c:
Likewise.
* gcc.dg/vect/complex/vect-complex-add-pattern-long.c: Likewise.
* gcc.dg/vect/complex/vect-complex-add-pattern-unsigned-long.c:
Likewise.
These tests don't work on vector ISAs where the truth
type doesn't match the vector mode of the operation.
However, I still want the tests to run on these
architectures, just with the ISA modes that enable
masks turned off.
This thus turns off SVE if it's on and turns off
AVX512 if it's on.
gcc/testsuite/ChangeLog:
* gcc.dg/signbit-2.c: Turn off masks.
* gcc.dg/signbit-5.c: Likewise.
This removes an unused variable that Clang catches when
compiling GCC.
gcc/ChangeLog:
* tree-vect-slp-patterns.c (complex_mul_pattern::matches): Remove l1node.
The <cxxx> headers for the C library are not under our control, so we
can't prevent them from including <unistd.h>. Change the PR 49745 test
to only include the C++ library headers, not the <cxxx> ones.
To ensure <bits/stdc++.h> isn't included automatically we need to use
no_pch to disable PCH.
libstdc++-v3/ChangeLog:
PR libstdc++/100117
* testsuite/17_intro/headers/c++1998/49745.cc: Explicitly list
all C++ headers instead of including <bits/stdc++.h>
Since Glibc 2.34 all pthreads symbols are defined directly in libc not
libpthread, and since Glibc 2.32 we have used __libc_single_threaded to
avoid unnecessary locking in single-threaded programs. This means there
is no reason to avoid linking to libpthread now, and so no reason to use
weak symbols defined in gthr-posix.h for all the pthread_xxx functions.
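
For readers unfamiliar with __libc_single_threaded, the C sketch below shows
the general idea of skipping a lock while the process has only one thread.
It is not the libstdc++ code: the demo_* names are made up, and the
DEMO_SINGLE_THREADED fallback to 0 (always lock) is my own guard for systems
whose C library lacks <sys/single_threaded.h>.

```c
#include <assert.h>
#include <pthread.h>

/* Use glibc's flag when the header exists; otherwise always lock.  */
#if defined (__has_include)
# if __has_include (<sys/single_threaded.h>)
#  include <sys/single_threaded.h>
#  define DEMO_SINGLE_THREADED __libc_single_threaded
# endif
#endif
#ifndef DEMO_SINGLE_THREADED
# define DEMO_SINGLE_THREADED 0
#endif

static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static long demo_counter;

/* Increment a shared counter, taking the mutex only when other
   threads might exist.  The pthread functions are called directly,
   with no weak-symbol indirection.  */
long
demo_increment (void)
{
  if (DEMO_SINGLE_THREADED)
    return ++demo_counter;

  int rc = pthread_mutex_lock (&demo_lock);
  assert (rc == 0);
  long value = ++demo_counter;
  rc = pthread_mutex_unlock (&demo_lock);
  assert (rc == 0);
  return value;
}
```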
libstdc++-v3/ChangeLog:
PR libstdc++/100748
PR libstdc++/103133
* config/os/gnu-linux/os_defines.h (_GLIBCXX_GTHREAD_USE_WEAK):
Define for glibc 2.34 and later.
This XFAILs the bogus diagnostic test and rectifies the expectation
on the optimization.
2021-11-10 Richard Biener <rguenther@suse.de>
PR testsuite/102690
* g++.dg/warn/Warray-bounds-16.C: XFAIL diagnostic part
and optimization.
This patch fixes the wrong TBAA information when lowering NEON loads and stores
to gimple that showed up when bootstrapping with UBSAN.
gcc/ChangeLog:
* config/aarch64/aarch64-builtins.c
(aarch64_general_gimple_fold_builtin): Change pointer alignment and
alias.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/simd/lowering_tbaa.c: New test.
This patch reverts the tests for big-endian after the NEON gimple lowering
patch. The earlier patch only lowers NEON loads and stores for little-endian,
meaning the codegen now differs between endiannesses, so we need target-specific
testing.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/fmla_intrinsic_1.c: Fix big-endian testism.
* gcc.target/aarch64/fmls_intrinsic_1.c: Likewise.
* gcc.target/aarch64/fmul_intrinsic_1.c: Likewise.
Jonathan reported and I've verified a
In file included from ../../../libgcc/unwind-dw2.c:412:
./md-unwind-support.h:398:6: warning: no previous prototype for ‘ppc_backchain_fallback’ [-Wmissing-prototypes]
398 | void ppc_backchain_fallback (struct _Unwind_Context *context, void *a)
| ^~~~~~~~~~~~~~~~~~~~~~
warning on powerpc*-linux* libgcc build.
All the other MD_* macro functions are static, so I think the following
is the right thing rather than adding a previous prototype for
ppc_backchain_fallback.
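
The difference between the two possible fixes can be seen in a small
standalone C sketch (the demo_* names are hypothetical and unrelated to the
libgcc sources). Built with -Wmissing-prototypes, a static helper needs no
previous declaration, while an externally visible function does.

```c
#include <assert.h>

struct demo_context
{
  int fallback_used;
};

/* Used only within this translation unit, so it is static and
   -Wmissing-prototypes has nothing to complain about.  */
static void
demo_backchain_fallback (struct demo_context *context)
{
  context->fallback_used = 1;
}

/* An external function, by contrast, needs a previous prototype to
   keep -Wmissing-prototypes quiet.  */
int demo_unwind (struct demo_context *context);

int
demo_unwind (struct demo_context *context)
{
  assert (context != 0);
  demo_backchain_fallback (context);
  return context->fallback_used;
}
```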
2021-11-10 Jakub Jelinek <jakub@redhat.com>
* config/rs6000/linux-unwind.h (ppc_backchain_fallback): Make it static,
formatting fix.
gcc/ada/
* gcc-interface/ada-tree.h (DECL_STUBBED_P): Delete.
* gcc-interface/decl.c (gnat_to_gnu_entity): Do not set it.
* gcc-interface/trans.c (Call_to_gnu): Use GNAT_NAME local variable
and adjust accordingly. Replace test on DECL_STUBBED_P with direct
test on Convention and move it down in the processing.
gcc/ada/
* scng.adb (Check_Bidi): New procedure to give warning. Note
that this is called only for non-ASCII characters, so should not
be an efficiency issue.
(Slit): Call Check_Bidi for wide characters in string_literals.
(Minus_Case): Call Check_Bidi for wide characters in comments.
(Char_Literal_Case): Call Check_Bidi for wide characters in
character_literals. Move Accumulate_Checksum down, because
otherwise, if Err is True, the Code is uninitialized.
* errout.ads: Make the obsolete nature of "Insertion character
?" more prominent; one should not have to read several
paragraphs before finding out that it's obsolete.
gcc/ada/
* exp_ch4.adb (Expand_Array_Equality): Fix inconsistent casing
in comment about the template for expansion of array equality;
now we use lower case for true/false/boolean.
(Handle_One_Dimension): Fix comment about the template for
expansion of array equality.
gcc/ada/
* aspects.adb, aspects.ads (Is_Aspect_Id): New function.
* namet-sp.ads, namet-sp.adb (Aspect_Spell_Check,
Attribute_Spell_Check): New Functions.
* par-ch13.adb (Possible_Misspelled_Aspect): Removed.
(With_Present): Use Aspect_Spell_Check, use Is_Aspect_Id.
(Get_Aspect_Specifications): Use Aspect_Spell_Check,
Is_Aspect_Id, Bad_Aspect.
* par-sync.adb (Resync_Past_Malformed_Aspect): Use Is_Aspect_Id.
* sem_ch13.adb (Check_One_Attr): Use Is_Aspect_Id.
* sem_prag.adb (Process_Restrictions_Or_Restriction_Warnings):
Introduce the Process_No_Specification_Of_Aspect, emit a warning
instead of an error on unknown aspect, hint for typos.
Introduce Process_No_Use_Of_Attribute to add spell check for
attributes too.
(Set_Error_Msg_To_Profile_Name): Use Is_Aspect_Id.
* sem_util.adb (Bad_Attribute): Use Attribute_Spell_Check.
(Bad_Aspect): New function.
* sem_util.ads (Bad_Aspect): New function.
gcc/ada/
* libgnarl/s-taskin.adb (Initialize_ATCB): Initialize
T.Common.Current_Priority to Priority'First.
* libgnarl/s-taskin.ads (Unspecified_Priority): Redefined as -1.
* libgnat/system-rtems.ads: Start priority range from 1, as 0 is
reserved by the operating system.
gcc/ada/
* libgnat/a-nbnbig.ads: Mark the unit as Pure.
* libgnat/s-aridou.adb: Add contracts and ghost code for proof.
(Scaled_Divide): Reorder operations and use of temporaries in
two places to facilitate proof.
* libgnat/s-aridou.ads: Add full functional contracts.
* libgnat/s-arit64.adb: Mark in SPARK.
* libgnat/s-arit64.ads: Add contracts similar to those from
s-aridou.ads.
* rtsfind.ads: Document the limitation that runtime units
loading does not work for private with-clauses.
gcc/ada/
* exp_ch4.adb (Expand_Composite_Equality): Handle arrays inside
records just like scalars; only records inside records need
dedicated handling.
gcc/ada/
* sem_type.ads (Has_Compatible_Type): Add For_Comparison parameter.
* sem_type.adb (Has_Compatible_Type): Put back the reversed calls
to Covers guarded with For_Comparison.
* sem_ch4.adb (Analyze_Membership_Op) <Try_One_Interp>: Remove new
reversed call to Covers and set For_Comparison to true instead.
(Find_Comparison_Types) <Try_One_Interp>: Likewise.
(Find_Equality_Types) <Try_One_Interp>: Likewise.
gcc/ada/
* Makefile.rtl: Add unit.
* libgnat/a-nbnbin__ghost.adb: Move...
* libgnat/a-nbnbig.adb: ... here. Mark ghost as ignored.
* libgnat/a-nbnbin__ghost.ads: Move...
* libgnat/a-nbnbig.ads: ... here. Add comment for purpose of
this unit. Mark ghost as ignored.
* libgnat/s-widthu.adb: Use new unit.
* sem_aux.adb (First_Subtype): Adapt to the case of a ghost type
whose freeze node is rewritten to a null statement.
gcc/ada/
* exp_ch4.adb (Expand_Array_Equality): Remove check of the array
bound being an N_Range node; use Type_High_Bound/Type_Low_Bound,
which handle all kinds of array bounds.
gcc/ada/
* sem_ch3.adb (Derived_Type_Declaration): Introduce a subprogram
for tree transformation. If a tree transformation is performed,
then warn that it would be better to reorder the interfaces.
I forgot to include the path dump when failing a path in resolve_phi.
To do so I abstracted dump_path into its own function, which made me
realize we had another copy with slightly different output.
I've merged everything and cleaned it up.
gcc/ChangeLog:
* tree-ssa-threadbackward.c
(back_threader::maybe_register_path_dump): Abstract path dumping...
(dump_path): ...here.
(back_threader::resolve_phi): Call dump_path.
(debug): Same.
This patch adds support for folding _mm512_fmadd_pch (a, _mm512_set1_pch(*(b)), c)
into a single instruction: vfmaddcph (%rsp){1to16}, %zmm1, %zmm2.
gcc/ChangeLog:
* config/i386/sse.md (fma_<complexpairopname>_<mode>_pair):
Add new define_insn.
(fma_<mode>_fmaddc_bcst): Add new define_insn_and_split.
(fma_<mode>_fcmaddc_bcst): Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx512fp16vl-complex-broadcast-1.c: New test.
a and b have the same type as the trunc type and have less precision than
the extend type.
gcc/ChangeLog:
PR target/102464
* match.pd: Simplify (trunc)fmax/fmin((extend)a, (extend)b) to
MAX/MIN(a,b)
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr102464-maxmin.c: New test.
The function aarch64_evpc_ins would reuse the target even though
it might be the same register as the two inputs.
Instead of checking to see if we can reuse the target, just use the
original input directly.
Committed as approved after bootstrapping and testing on
aarch64-linux-gnu with no regressions.
PR target/101529
gcc/ChangeLog:
* config/aarch64/aarch64.c (aarch64_evpc_ins): Don't use target
as an input, use original one.
gcc/testsuite/ChangeLog:
* c-c++-common/torture/builtin-convertvector-2.c: New test.
* c-c++-common/torture/builtin-shufflevector-2.c: New test.
gcc/c-family/ChangeLog:
* c-pragma.c (GCC_BAD_AT): New macro.
(GCC_BAD2_AT): New macro.
(handle_pragma_pack): Use the location of the pertinent token when
issuing diagnostics about invalid constants/actions, and trailing
junk.
(handle_pragma_target): Likewise for non-string "GCC option".
(handle_pragma_message): Likewise for trailing junk.
gcc/testsuite/ChangeLog:
* gcc.dg/bad-pragma-locations.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2021-11-09 Bill Schmidt <wschmidt@linux.ibm.com>
gcc/
* config/rs6000/rs6000-call.c (rs6000_gimple_fold_new_builtin):
Disable gimple fold for RS6000_BIF_{XVMINDP,XVMINSP,VMINFP} and
RS6000_BIF_{XVMAXDP,XVMAXSP,VMAXFP} when fast-math is not set.
(lxvrse_expand_builtin): Modify the expansion for sign extension.
All extensions are done within VSX registers.
gcc/testsuite/
* gcc.target/powerpc/p10_vec_xl_sext.c: Fix long long case.
If a finalization is not required we created a namespace containing
formal arguments for an internal interface definition but never used
any of these. So the whole sub_ns namespace was not wired up to the
program and consequently was never freed. The fix is to simply not
generate any finalization wrappers if we know that it will be unused.
Note that this reverts back to the original r190869
(8a96d64282ac534cb597f446f02ac5d0b13249cc) handling for this case
by reverting this specific part of r194075
(f1ee56b4be7cc3892e6ccc75d73033c129098e87) for PR fortran/37336.
valgrind summary for e.g.
gfortran.dg/abstract_type_3.f03 and gfortran.dg/abstract_type_4.f03
where ".orig" is pristine trunk and ".mine" contains this fix:
at3.orig.vg:LEAK SUMMARY:
at3.orig.vg- definitely lost: 8,460 bytes in 11 blocks
at3.orig.vg- indirectly lost: 13,288 bytes in 55 blocks
at3.orig.vg- possibly lost: 0 bytes in 0 blocks
at3.orig.vg- still reachable: 572,278 bytes in 2,142 blocks
at3.orig.vg- suppressed: 0 bytes in 0 blocks
at3.orig.vg-
at3.orig.vg-Use --track-origins=yes to see where uninitialised values come from
at3.orig.vg-ERROR SUMMARY: 38 errors from 33 contexts (suppressed: 0 from 0)
--
at3.mine.vg:LEAK SUMMARY:
at3.mine.vg- definitely lost: 344 bytes in 1 blocks
at3.mine.vg- indirectly lost: 7,192 bytes in 18 blocks
at3.mine.vg- possibly lost: 0 bytes in 0 blocks
at3.mine.vg- still reachable: 572,278 bytes in 2,142 blocks
at3.mine.vg- suppressed: 0 bytes in 0 blocks
at3.mine.vg-
at3.mine.vg-ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
at3.mine.vg-ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
at4.orig.vg:LEAK SUMMARY:
at4.orig.vg- definitely lost: 13,751 bytes in 12 blocks
at4.orig.vg- indirectly lost: 11,976 bytes in 60 blocks
at4.orig.vg- possibly lost: 0 bytes in 0 blocks
at4.orig.vg- still reachable: 572,278 bytes in 2,142 blocks
at4.orig.vg- suppressed: 0 bytes in 0 blocks
at4.orig.vg-
at4.orig.vg-Use --track-origins=yes to see where uninitialised values come from
at4.orig.vg-ERROR SUMMARY: 18 errors from 16 contexts (suppressed: 0 from 0)
--
at4.mine.vg:LEAK SUMMARY:
at4.mine.vg- definitely lost: 3,008 bytes in 3 blocks
at4.mine.vg- indirectly lost: 4,056 bytes in 11 blocks
at4.mine.vg- possibly lost: 0 bytes in 0 blocks
at4.mine.vg- still reachable: 572,278 bytes in 2,142 blocks
at4.mine.vg- suppressed: 0 bytes in 0 blocks
at4.mine.vg-
at4.mine.vg-ERROR SUMMARY: 3 errors from 3 contexts (suppressed: 0 from 0)
at4.mine.vg-ERROR SUMMARY: 3 errors from 3 contexts (suppressed: 0 from 0)
gcc/fortran/ChangeLog:
2018-10-12 Bernhard Reutner-Fischer <aldot@gcc.gnu.org>
PR fortran/68800
* class.c (generate_finalization_wrapper): Do not leak
finalization wrappers if they will not be used.
* expr.c (gfc_free_actual_arglist): Formatting fix.
* gfortran.h (gfc_free_symbol): Pass argument by reference.
(gfc_release_symbol): Likewise.
(gfc_free_namespace): Likewise.
* symbol.c (gfc_release_symbol): Adjust accordingly.
(free_components): Set procedure pointer components
of derived types to NULL after freeing.
(free_tb_tree): Likewise.
(gfc_free_symbol): Set sym to NULL after freeing.
(gfc_free_namespace): Set namespace to NULL after freeing.
When I fixed PR 102622, I accidentally left behind a TYPE_PRECISION
check which I had added for checking beforehand. This check
is not needed as the code will handle it correctly anyway.
Committed as obvious after a bootstrap/test on x86_64-linux-gnu.
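
For context, the a?0:pow2 case refers to a conditional that selects between
zero and a power of two. The commit message does not show the form the
simplification produces, so the C sketch below only demonstrates the
arithmetic identity such a rewrite can rely on, using the hypothetical
constant 8 and made-up demo_* names.

```c
#include <assert.h>
#include <stdbool.h>

/* The conditional form: zero when A is true, else a power of two.  */
unsigned int
demo_select (bool a)
{
  return a ? 0u : 8u;
}

/* A branch-free equivalent: invert A and shift by log2 (8) == 3.  */
unsigned int
demo_shift (bool a)
{
  return (unsigned int) !a << 3;
}
```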
PR tree-optimization/10352
gcc/ChangeLog:
* match.pd: Remove check of TYPE_PRECISION for
the a?0:pow2 case.
gcc/testsuite/ChangeLog:
* gcc.c-torture/execute/pr10352-1.c: New test.
Instead of x_range_query always pointing to an object, have it default to
NULL and return a pointer to the global query in that case.
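
The lookup described above can be modelled in plain C as follows. This is a
simplified sketch with hypothetical demo_* types and functions, not the GCC
code, which is C++ and keeps the field in struct function.

```c
#include <assert.h>
#include <stddef.h>

struct demo_range_query
{
  const char *name;
};

struct demo_function
{
  /* NULL means "no function-specific query is active".  */
  struct demo_range_query *x_range_query;
};

static struct demo_range_query demo_global_query = { "global" };

struct demo_range_query *
demo_get_global_range_query (void)
{
  return &demo_global_query;
}

/* Return the active query for FUN, defaulting to the global one.  */
struct demo_range_query *
demo_get_range_query (const struct demo_function *fun)
{
  return fun->x_range_query ? fun->x_range_query
			    : demo_get_global_range_query ();
}

/* Install QUERY; there must be no query installed already.  */
void
demo_enable_ranger (struct demo_function *fun, struct demo_range_query *query)
{
  assert (fun->x_range_query == NULL);
  fun->x_range_query = query;
}

/* Clear the field so that lookups return the global query again.  */
void
demo_disable_ranger (struct demo_function *fun)
{
  fun->x_range_query = NULL;
}
```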
* function.c (allocate_struct_function): Don't set x_range_query.
* function.h (get_range_query): Move to value-query.h.
* gimple-range.cc (enable_ranger): Check that query is currently NULL.
(disable_ranger): Clear function current query field.
* value-query.cc (get_global_range_query): Relocate to:
* value-query.h (get_global_range_query): Here and inline.
(get_range_query): Relocate here from function.h.
The goal with these sets of patches is to improve the detailed dumps for
the threader, as I hope we eventually reach the point when I'm not
the only one looking at these dumps ;-).
This patch adds candidate paths to the detailed threading dumps to make it
easier to see the decisions the threader makes. With it we can now
grep for the discovery logic in action:
$ grep ^path: a.ii.*thread*
a.ii.034t.ethread:path: 4->5->xx REJECTED
a.ii.034t.ethread:path: 3->5->8 SUCCESS
a.ii.034t.ethread:path: 4->5->6 SUCCESS
a.ii.034t.ethread:path: 0->2->xx REJECTED
a.ii.034t.ethread:path: 0->2->xx REJECTED
...
...
a.ii.111t.threadfull1:path: 14->22->23->xx REJECTED (unreachable)
a.ii.111t.threadfull1:path: 15->22->23->xx REJECTED (unreachable)
a.ii.111t.threadfull1:path: 16->22->23->xx REJECTED (unreachable)
In addition to this, if --param=threader-debug=all is used, one can see
the entire chain of events leading up to the ultimate threading
decision:
==============================================
path_range_query: compute_ranges for path: 2->5
Registering killing_def (path_oracle) _3
Registering killing_def (path_oracle) _1
range_defined_in_block (BB2) for _1 is _Bool VARYING
Registering killing_def (path_oracle) _2
range_defined_in_block (BB2) for _2 is _Bool VARYING
range_defined_in_block (BB2) for _3 is _Bool VARYING
outgoing_edge_range_p for b_10(D) on edge 2->5 is int VARYING
...
... [BBs and gimple along path]
...
path: 2->5->xx REJECTED
Tested on x86-64 Linux.
gcc/ChangeLog:
* tree-ssa-threadbackward.c
(back_threader::maybe_register_path_dump): New.
(back_threader::maybe_register_path): Call maybe_register_path_dump.
This is a minor cleanup for maybe_register_path to return NULL when
the path is unprofitable. It is needed for a follow-up patch to
generate better dumps from the threader.
There is no change in behavior, since the only call to this function
bails on !profitable_path_p.
Tested on x86-64 Linux.
gcc/ChangeLog:
* tree-ssa-threadbackward.c (back_threader::maybe_register_path):
Return NULL when unprofitable.
This patch introduces a helper function build_debug_expr_decl to build
DEBUG_EXPR_DECL tree nodes in the most common way and uses it to replace
all code that builds such a DECL itself and sets its mode to the
TYPE_MODE of its type.
There still remain 11 instances of open-coded creation of a
DEBUG_EXPR_DECL which set the mode of the DECL to something else. It
would probably be a good idea to figure out whether that has any effect
and, if not, convert them to calls of build_debug_expr_decl too. But this
patch deliberately does not introduce any functional changes.
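
The shape of the refactoring is roughly the following, shown here as a toy
C model rather than the GCC implementation. The demo_* types stand in for
tree nodes and machine modes and are entirely hypothetical; only the idea
that the helper takes a type and derives the mode from it comes from the
description above.

```c
#include <assert.h>
#include <stdlib.h>

enum demo_mode
{
  DEMO_VOIDmode,
  DEMO_SImode,
  DEMO_DImode
};

struct demo_type
{
  enum demo_mode mode;
};

struct demo_decl
{
  const struct demo_type *type;
  enum demo_mode mode;
};

/* Build a debug decl of TYPE whose mode is the mode of TYPE, so that
   callers no longer repeat the type and mode assignments themselves.
   The caller owns the result and releases it with free.  */
struct demo_decl *
demo_build_debug_expr_decl (const struct demo_type *type)
{
  struct demo_decl *decl = calloc (1, sizeof *decl);
  assert (decl != NULL);
  decl->type = type;
  decl->mode = type->mode;
  return decl;
}
```

Call sites that want a different mode cannot use such a helper unchanged,
which matches the remaining open-coded instances mentioned above.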
gcc/ChangeLog:
2021-11-08 Martin Jambor <mjambor@suse.cz>
* tree.h (build_debug_expr_decl): Declare.
* tree.c (build_debug_expr_decl): New function.
* cfgexpand.c (avoid_deep_ter_for_debug): Use build_debug_expr_decl
instead of building a DEBUG_EXPR_DECL.
* ipa-param-manipulation.c
(ipa_param_body_adjustments::prepare_debug_expressions): Likewise.
* omp-simd-clone.c (ipa_simd_modify_stmt_ops): Likewise.
* tree-ssa-ccp.c (optimize_atomic_bit_test_and): Likewise.
* tree-ssa-phiopt.c (spaceship_replacement): Likewise.
* tree-ssa-reassoc.c (make_new_ssa_for_def): Likewise.