/* Data structure for the modref pass.
   Copyright (C) 2020-2022 Free Software Foundation, Inc.
   Contributed by David Cepelik and Jan Hubicka

This file is part of GCC.

GCC is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 3, or (at your option) any later
version.

GCC is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
for more details.

You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING3.  If not see
<http://www.gnu.org/licenses/>.  */

#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "backend.h"
#include "tree.h"
#include "ipa-modref-tree.h"
#include "selftest.h"
#include "tree-ssa-alias.h"
#include "gimple.h"
#include "cgraph.h"
#include "tree-streamer.h"

/* Return true if both accesses are the same.  */

bool
modref_access_node::operator == (modref_access_node &a) const
{
  if (parm_index != a.parm_index)
    return false;
  if (parm_index != MODREF_UNKNOWN_PARM
      && parm_index != MODREF_GLOBAL_MEMORY_PARM)
    {
      if (parm_offset_known != a.parm_offset_known)
        return false;
      if (parm_offset_known
          && !known_eq (parm_offset, a.parm_offset))
        return false;
    }
  if (range_info_useful_p () != a.range_info_useful_p ())
    return false;
  if (range_info_useful_p ()
      && (!known_eq (a.offset, offset)
          || !known_eq (a.size, size)
          || !known_eq (a.max_size, max_size)))
    return false;
  return true;
}

/* Return true if A is a subaccess of THIS.  */

bool
modref_access_node::contains (const modref_access_node &a) const
{
  poly_int64 aoffset_adj = 0;
  if (parm_index != MODREF_UNKNOWN_PARM)
    {
      if (parm_index != a.parm_index)
        return false;
      if (parm_offset_known)
        {
          if (!a.parm_offset_known)
            return false;
          /* Accesses are never below parm_offset, so look
             for smaller offset.
             If access ranges are known still allow merging
             when bit offsets comparison passes.  */
          if (!known_le (parm_offset, a.parm_offset)
              && !range_info_useful_p ())
            return false;
          /* We allow negative aoffset_adj here in case
             there is a useful range.  This is because adding
             a.offset may result in non-negative offset again.
             Ubsan fails on val << LOG_BITS_PER_UNIT where val
             is negative.  */
          aoffset_adj = (a.parm_offset - parm_offset)
                        * BITS_PER_UNIT;
        }
    }
  if (range_info_useful_p ())
    {
      if (!a.range_info_useful_p ())
        return false;
      /* Sizes of stores are used to check that object is big enough
         to fit the store, so a smaller or unknown store is more general
         than a large store.  */
      if (known_size_p (size)
          && (!known_size_p (a.size)
              || !known_le (size, a.size)))
        return false;
      if (known_size_p (max_size))
        return known_subrange_p (a.offset + aoffset_adj,
                                 a.max_size, offset, max_size);
      else
        return known_le (offset, a.offset + aoffset_adj);
    }
  return true;
}

/* Update access range to new parameters.
   If RECORD_ADJUSTMENTS is true, record number of changes in the access
   and if threshold is exceeded start dropping precision
   so only constantly many updates are possible.  This makes the dataflow
   converge.  */
void
modref_access_node::update (poly_int64 parm_offset1,
                            poly_int64 offset1, poly_int64 size1,
                            poly_int64 max_size1, bool record_adjustments)
{
  if (known_eq (parm_offset, parm_offset1)
      && known_eq (offset, offset1)
      && known_eq (size, size1)
      && known_eq (max_size, max_size1))
    return;
  if (!record_adjustments
      || (++adjustments) < param_modref_max_adjustments)
    {
      parm_offset = parm_offset1;
      offset = offset1;
      size = size1;
      max_size = max_size1;
    }
  else
    {
      if (dump_file)
        fprintf (dump_file,
                 "--param param=modref-max-adjustments limit reached:");
      if (!known_eq (parm_offset, parm_offset1))
        {
          if (dump_file)
            fprintf (dump_file, " parm_offset cleared");
          parm_offset_known = false;
        }
      if (!known_eq (size, size1))
        {
          size = -1;
          if (dump_file)
            fprintf (dump_file, " size cleared");
        }
      if (!known_eq (max_size, max_size1))
        {
          max_size = -1;
          if (dump_file)
            fprintf (dump_file, " max_size cleared");
        }
      if (!known_eq (offset, offset1))
        {
          offset = 0;
          if (dump_file)
            fprintf (dump_file, " offset cleared");
        }
      if (dump_file)
        fprintf (dump_file, "\n");
    }
}

/* Merge in access A if it is possible to do without losing
   precision.  Return true if successful.
   If RECORD_ADJUSTMENTS is true, remember how many intervals
   were prolonged and punt when there are too many.  */
bool
modref_access_node::merge (const modref_access_node &a,
                           bool record_adjustments)
{
  poly_int64 offset1 = 0;
  poly_int64 aoffset1 = 0;
  poly_int64 new_parm_offset = 0;

  /* We assume that containment was tested earlier.  */
  gcc_checking_assert (!contains (a) && !a.contains (*this));
  if (parm_index != MODREF_UNKNOWN_PARM)
    {
      if (parm_index != a.parm_index)
        return false;
      if (parm_offset_known)
        {
          if (!a.parm_offset_known)
            return false;
          if (!combined_offsets (a, &new_parm_offset, &offset1, &aoffset1))
            return false;
        }
    }
  /* See if we can merge ranges.  */
  if (range_info_useful_p ())
    {
      /* In this case we have containment that should be
         handled earlier.  */
      gcc_checking_assert (a.range_info_useful_p ());

      /* If a.size is less specified than size, merge only
         if intervals are otherwise equivalent.  */
      if (known_size_p (size)
          && (!known_size_p (a.size) || known_lt (a.size, size)))
        {
          if (((known_size_p (max_size) || known_size_p (a.max_size))
               && !known_eq (max_size, a.max_size))
              || !known_eq (offset1, aoffset1))
            return false;
          update (new_parm_offset, offset1, a.size, max_size,
                  record_adjustments);
          return true;
        }
      /* If sizes are the same, we can extend the interval.  */
      if ((known_size_p (size) || known_size_p (a.size))
          && !known_eq (size, a.size))
        return false;
      if (known_le (offset1, aoffset1))
        {
          if (!known_size_p (max_size)
              || known_ge (offset1 + max_size, aoffset1))
            {
              update2 (new_parm_offset, offset1, size, max_size,
                       aoffset1, a.size, a.max_size,
                       record_adjustments);
              return true;
            }
        }
      else if (known_le (aoffset1, offset1))
        {
          if (!known_size_p (a.max_size)
              || known_ge (aoffset1 + a.max_size, offset1))
            {
              update2 (new_parm_offset, offset1, size, max_size,
                       aoffset1, a.size, a.max_size,
                       record_adjustments);
              return true;
            }
        }
      return false;
    }
  update (new_parm_offset, offset1,
          size, max_size, record_adjustments);
  return true;
}

/* Return true if merging A1 and B1 loses less information
   than merging A2 and B2.
   Assume that no containment or lossless merging is possible.  */
bool
modref_access_node::closer_pair_p (const modref_access_node &a1,
                                   const modref_access_node &b1,
                                   const modref_access_node &a2,
                                   const modref_access_node &b2)
{
  /* Merging different parm indexes comes to complete loss
     of range info.  */
  if (a1.parm_index != b1.parm_index)
    return false;
  if (a2.parm_index != b2.parm_index)
    return true;
  /* If parm is known and parm indexes are the same we should
     already have containment.  */
  gcc_checking_assert (a1.parm_offset_known && b1.parm_offset_known);
  gcc_checking_assert (a2.parm_offset_known && b2.parm_offset_known);

  /* First normalize offsets for parm offsets.  */
  poly_int64 new_parm_offset, offseta1, offsetb1, offseta2, offsetb2;
  if (!a1.combined_offsets (b1, &new_parm_offset, &offseta1, &offsetb1)
      || !a2.combined_offsets (b2, &new_parm_offset, &offseta2, &offsetb2))
    gcc_unreachable ();

  /* Now compute distance of the intervals.  */
  poly_int64 dist1, dist2;
  if (known_le (offseta1, offsetb1))
    {
      if (!known_size_p (a1.max_size))
        dist1 = 0;
      else
        dist1 = offsetb1 - offseta1 - a1.max_size;
    }
  else
    {
      if (!known_size_p (b1.max_size))
        dist1 = 0;
      else
        dist1 = offseta1 - offsetb1 - b1.max_size;
    }
  if (known_le (offseta2, offsetb2))
    {
      if (!known_size_p (a2.max_size))
        dist2 = 0;
      else
        dist2 = offsetb2 - offseta2 - a2.max_size;
    }
  else
    {
      if (!known_size_p (b2.max_size))
        dist2 = 0;
      else
        dist2 = offseta2 - offsetb2 - b2.max_size;
    }
  /* It may happen that intervals overlap in case sizes
     differ.  Prefer the overlap to non-overlap.  */
  if (known_lt (dist1, 0) && known_ge (dist2, 0))
    return true;
  if (known_lt (dist2, 0) && known_ge (dist1, 0))
    return false;
  if (known_lt (dist1, 0))
    /* If both overlap, minimize the overlap.  */
    return known_le (dist2, dist1);
  else
    /* If both are disjoint, look for the smaller distance.  */
    return known_le (dist1, dist2);
}

/* Merge in access A while losing precision.  */
void
modref_access_node::forced_merge (const modref_access_node &a,
                                  bool record_adjustments)
{
  if (parm_index != a.parm_index)
    {
      gcc_checking_assert (parm_index != MODREF_UNKNOWN_PARM);
      parm_index = MODREF_UNKNOWN_PARM;
      return;
    }

  /* We assume that containment and lossless merging
     were tested earlier.  */
  gcc_checking_assert (!contains (a) && !a.contains (*this)
                       && !merge (a, record_adjustments));
  gcc_checking_assert (parm_offset_known && a.parm_offset_known);

  poly_int64 new_parm_offset, offset1, aoffset1;
  if (!combined_offsets (a, &new_parm_offset, &offset1, &aoffset1))
    {
      parm_offset_known = false;
      return;
    }
  gcc_checking_assert (range_info_useful_p ()
                       && a.range_info_useful_p ());
  if (record_adjustments)
    adjustments += a.adjustments;
  update2 (new_parm_offset,
           offset1, size, max_size,
           aoffset1, a.size, a.max_size,
           record_adjustments);
}

/* Merge two ranges both starting at parm_offset1 and update THIS
   with the result.  */
void
modref_access_node::update2 (poly_int64 parm_offset1,
                             poly_int64 offset1, poly_int64 size1,
                             poly_int64 max_size1,
                             poly_int64 offset2, poly_int64 size2,
                             poly_int64 max_size2,
                             bool record_adjustments)
{
  poly_int64 new_size = size1;

  if (!known_size_p (size2)
      || known_le (size2, size1))
    new_size = size2;
  else
    gcc_checking_assert (known_le (size1, size2));

  if (known_le (offset1, offset2))
    ;
  else if (known_le (offset2, offset1))
    {
      std::swap (offset1, offset2);
      std::swap (max_size1, max_size2);
    }
  else
    gcc_unreachable ();

  poly_int64 new_max_size;

  if (!known_size_p (max_size1))
    new_max_size = max_size1;
  else if (!known_size_p (max_size2))
    new_max_size = max_size2;
  else
    {
      new_max_size = max_size2 + offset2 - offset1;
      if (known_le (new_max_size, max_size1))
        new_max_size = max_size1;
    }

  update (parm_offset1, offset1,
          new_size, new_max_size, record_adjustments);
}

/* Given access nodes THIS and A, return true if they
   can be expressed using a common parm_offset.  In this case
   return the parm offset in NEW_PARM_OFFSET, NEW_OFFSET
   which is the start of range in THIS and NEW_AOFFSET that
   is the start of range in A.  */
bool
modref_access_node::combined_offsets (const modref_access_node &a,
                                      poly_int64 *new_parm_offset,
                                      poly_int64 *new_offset,
                                      poly_int64 *new_aoffset) const
{
  gcc_checking_assert (parm_offset_known && a.parm_offset_known);
  if (known_le (a.parm_offset, parm_offset))
    {
      *new_offset = offset
                    + ((parm_offset - a.parm_offset)
                       << LOG2_BITS_PER_UNIT);
      *new_aoffset = a.offset;
      *new_parm_offset = a.parm_offset;
      return true;
    }
  else if (known_le (parm_offset, a.parm_offset))
    {
      *new_aoffset = a.offset
                     + ((a.parm_offset - parm_offset)
                        << LOG2_BITS_PER_UNIT);
      *new_offset = offset;
      *new_parm_offset = parm_offset;
      return true;
    }
  else
    return false;
}

/* Try to optimize the access ACCESSES list after entry INDEX was modified.  */
void
modref_access_node::try_merge_with (vec <modref_access_node, va_gc> *&accesses,
                                    size_t index)
{
  size_t i;

  for (i = 0; i < accesses->length ();)
    if (i != index)
      {
        bool found = false, restart = false;
        modref_access_node *a = &(*accesses)[i];
        modref_access_node *n = &(*accesses)[index];

        if (n->contains (*a))
          found = true;
        if (!found && n->merge (*a, false))
          found = restart = true;
        gcc_checking_assert (found || !a->merge (*n, false));
        if (found)
          {
            accesses->unordered_remove (i);
            if (index == accesses->length ())
              {
                index = i;
                i++;
              }
            if (restart)
              i = 0;
          }
        else
          i++;
      }
    else
      i++;
}

/* Stream out to OB.  */

void
modref_access_node::stream_out (struct output_block *ob) const
{
  streamer_write_hwi (ob, parm_index);
  if (parm_index != MODREF_UNKNOWN_PARM)
    {
      streamer_write_uhwi (ob, parm_offset_known);
      if (parm_offset_known)
        {
          streamer_write_poly_int64 (ob, parm_offset);
          streamer_write_poly_int64 (ob, offset);
          streamer_write_poly_int64 (ob, size);
          streamer_write_poly_int64 (ob, max_size);
        }
    }
}

/* Stream in from IB.  */

modref_access_node
modref_access_node::stream_in (struct lto_input_block *ib)
{
  int parm_index = streamer_read_hwi (ib);
  bool parm_offset_known = false;
  poly_int64 parm_offset = 0;
  poly_int64 offset = 0;
  poly_int64 size = -1;
  poly_int64 max_size = -1;

  if (parm_index != MODREF_UNKNOWN_PARM)
    {
      parm_offset_known = streamer_read_uhwi (ib);
      if (parm_offset_known)
        {
          parm_offset = streamer_read_poly_int64 (ib);
          offset = streamer_read_poly_int64 (ib);
          size = streamer_read_poly_int64 (ib);
          max_size = streamer_read_poly_int64 (ib);
        }
    }
  return {offset, size, max_size, parm_offset, parm_index,
          parm_offset_known, false};
}

/* Insert access with OFFSET and SIZE.
   Collapse tree if it has more than MAX_ACCESSES entries.
   If RECORD_ADJUSTMENTS is true avoid too many interval extensions.

   Return 0 if nothing changed, 1 if insert was successful and -1
   if entries should be collapsed.  */
int
modref_access_node::insert (vec <modref_access_node, va_gc> *&accesses,
                            modref_access_node a, size_t max_accesses,
                            bool record_adjustments)
{
  size_t i, j;
  modref_access_node *a2;

  /* Verify that list does not contain redundant accesses.  */
  if (flag_checking)
    {
      size_t i, i2;
      modref_access_node *a, *a2;

      FOR_EACH_VEC_SAFE_ELT (accesses, i, a)
        {
          FOR_EACH_VEC_SAFE_ELT (accesses, i2, a2)
            if (i != i2)
              gcc_assert (!a->contains (*a2));
        }
    }

  FOR_EACH_VEC_SAFE_ELT (accesses, i, a2)
    {
      if (a2->contains (a))
        return 0;
      if (a.contains (*a2))
        {
          a.adjustments = 0;
          a2->parm_index = a.parm_index;
          a2->parm_offset_known = a.parm_offset_known;
          a2->update (a.parm_offset, a.offset, a.size, a.max_size,
                      record_adjustments);
          modref_access_node::try_merge_with (accesses, i);
          return 1;
        }
      if (a2->merge (a, record_adjustments))
        {
          modref_access_node::try_merge_with (accesses, i);
          return 1;
        }
      gcc_checking_assert (!(a == *a2));
    }

  /* If this base->ref pair has too many accesses stored, we will clear
     all accesses and bail out.  */
  if (accesses && accesses->length () >= max_accesses)
    {
      if (max_accesses < 2)
        return -1;
      /* Find least harmful merge and perform it.  */
      int best1 = -1, best2 = -1;
      FOR_EACH_VEC_SAFE_ELT (accesses, i, a2)
        {
          for (j = i + 1; j < accesses->length (); j++)
            if (best1 < 0
                || modref_access_node::closer_pair_p
                     (*a2, (*accesses)[j],
                      (*accesses)[best1],
                      best2 < 0 ? a : (*accesses)[best2]))
              {
                best1 = i;
                best2 = j;
              }
          if (modref_access_node::closer_pair_p
                (*a2, a,
                 (*accesses)[best1],
                 best2 < 0 ? a : (*accesses)[best2]))
            {
              best1 = i;
              best2 = -1;
            }
        }
      (*accesses)[best1].forced_merge (best2 < 0 ? a : (*accesses)[best2],
                                       record_adjustments);
      /* Check that merging indeed merged ranges.  */
      gcc_checking_assert ((*accesses)[best1].contains
                             (best2 < 0 ? a : (*accesses)[best2]));
      if (!(*accesses)[best1].useful_p ())
        return -1;
      if (dump_file && best2 >= 0)
        fprintf (dump_file,
                 "--param param=modref-max-accesses limit reached;"
                 " merging %i and %i\n", best1, best2);
      else if (dump_file)
        fprintf (dump_file,
                 "--param param=modref-max-accesses limit reached;"
                 " merging with %i\n", best1);
      modref_access_node::try_merge_with (accesses, best1);
      if (best2 >= 0)
        insert (accesses, a, max_accesses, record_adjustments);
      return 1;
    }
  a.adjustments = 0;
  vec_safe_push (accesses, a);
  return 1;
}

/* Return true if range info is useful.  */
bool
modref_access_node::range_info_useful_p () const
{
  return parm_index != MODREF_UNKNOWN_PARM
         && parm_index != MODREF_GLOBAL_MEMORY_PARM
         && parm_offset_known
         && (known_size_p (size)
             || known_size_p (max_size)
             || known_ge (offset, 0));
}

/* Dump range to debug OUT.  */
void
modref_access_node::dump (FILE *out)
{
  if (parm_index != MODREF_UNKNOWN_PARM)
    {
      if (parm_index == MODREF_GLOBAL_MEMORY_PARM)
        fprintf (out, " Base in global memory");
      else if (parm_index >= 0)
        fprintf (out, " Parm %i", parm_index);
      else if (parm_index == MODREF_STATIC_CHAIN_PARM)
        fprintf (out, " Static chain");
      else
        gcc_unreachable ();
      if (parm_offset_known)
        {
          fprintf (out, " param offset:");
          print_dec ((poly_int64_pod)parm_offset, out, SIGNED);
        }
    }
  if (range_info_useful_p ())
    {
      fprintf (out, " offset:");
      print_dec ((poly_int64_pod)offset, out, SIGNED);
      fprintf (out, " size:");
      print_dec ((poly_int64_pod)size, out, SIGNED);
      fprintf (out, " max_size:");
      print_dec ((poly_int64_pod)max_size, out, SIGNED);
      if (adjustments)
        fprintf (out, " adjusted %i times", adjustments);
    }
  fprintf (out, "\n");
}

/* Return tree corresponding to parameter of the range in STMT.  */
tree
modref_access_node::get_call_arg (const gcall *stmt) const
{
(modref_access_node::get_call_arg): Likewise.
* ipa-modref-tree.h (enum modref_special_parms): Add
MODREF_GLOBAL_MEMORY_PARM.
(modref_access_node::useful_for_kill): Handle
MODREF_GLOBAL_MEMORY_PARM.
(modref:tree::merge): Add promote_unknown_to_global.
* ipa-modref.c (verify_arg):New function.
(may_access_nonescaping_parm_p): New function.
(modref_access_analysis::record_global_memory_load): New member
function.
(modref_access_analysis::record_global_memory_store): Likewise.
(modref_access_analysis::process_fnspec): Distingush global and local
memory.
(modref_access_analysis::analyze_call): Likewise.
* tree-ssa-alias.c (ref_may_access_global_memory_p): New function.
(modref_may_conflict): Use it.
gcc/testsuite/ChangeLog:
2021-12-12 Jan Hubicka <hubicka@ucw.cz>
* gcc.dg/analyzer/data-model-1.c: Disable ipa-modref.
* gcc.dg/uninit-38.c: Likewise.
* gcc.dg/uninit-pr98578.c: Liewise.
2021-12-14 16:50:27 +01:00
|
|
|
if (parm_index == MODREF_UNKNOWN_PARM
|
|
|
|
|| parm_index == MODREF_GLOBAL_MEMORY_PARM)
|
2021-11-14 12:01:41 +01:00
|
|
|
return NULL;
|
|
|
|
if (parm_index == MODREF_STATIC_CHAIN_PARM)
|
|
|
|
return gimple_call_chain (stmt);
|
|
|
|
/* MODREF_RETSLOT_PARM should not happen in access trees since the store
|
|
|
|
is seen explicitly in the caller. */
|
|
|
|
gcc_checking_assert (parm_index >= 0);
|
|
|
|
if (parm_index >= (int)gimple_call_num_args (stmt))
|
|
|
|
return NULL;
|
|
|
|
return gimple_call_arg (stmt, parm_index);
|
|
|
|
}

/* Initialize REF to the memory range accessed in call STMT; return true
   on success.  */

bool
modref_access_node::get_ao_ref (const gcall *stmt, ao_ref *ref) const
{
  tree arg;

  if (!parm_offset_known || !(arg = get_call_arg (stmt)))
    return false;
  poly_offset_int off = (poly_offset_int)offset
	+ ((poly_offset_int)parm_offset << LOG2_BITS_PER_UNIT);
  poly_int64 off2;
  if (!off.to_shwi (&off2))
    return false;
  ao_ref_init_from_ptr_and_range (ref, arg, true, off2, size, max_size);
  return true;
}

/* Return true if A is a subkill.  */

bool
modref_access_node::contains_for_kills (const modref_access_node &a) const
{
  poly_int64 aoffset_adj = 0;

  gcc_checking_assert (parm_index != MODREF_UNKNOWN_PARM
		       && a.parm_index != MODREF_UNKNOWN_PARM);
  if (parm_index != a.parm_index)
    return false;
  gcc_checking_assert (parm_offset_known && a.parm_offset_known);
  aoffset_adj = (a.parm_offset - parm_offset)
		* BITS_PER_UNIT;
  gcc_checking_assert (range_info_useful_p () && a.range_info_useful_p ());
  return known_subrange_p (a.offset + aoffset_adj,
			   a.max_size, offset, max_size);
}

/* Merge two ranges both starting at parm_offset1 and update THIS
   with result.  */

bool
modref_access_node::update_for_kills (poly_int64 parm_offset1,
				      poly_int64 offset1,
				      poly_int64 max_size1,
				      poly_int64 offset2,
				      poly_int64 max_size2,
				      bool record_adjustments)
{
  if (known_le (offset1, offset2))
    ;
  else if (known_le (offset2, offset1))
    {
      std::swap (offset1, offset2);
      std::swap (max_size1, max_size2);
    }
  else
    gcc_unreachable ();

  poly_int64 new_max_size = max_size2 + offset2 - offset1;
  if (known_le (new_max_size, max_size1))
    new_max_size = max_size1;
  if (known_eq (parm_offset, parm_offset1)
      && known_eq (offset, offset1)
      && known_eq (size, new_max_size)
      && known_eq (max_size, new_max_size))
    return false;

  if (!record_adjustments
      || (++adjustments) < param_modref_max_adjustments)
    {
      parm_offset = parm_offset1;
      offset = offset1;
      max_size = new_max_size;
      size = new_max_size;
      gcc_checking_assert (useful_for_kill_p ());
      return true;
    }
  return false;
}

/* Merge in access A if it is possible to do without losing
   precision.  Return true if successful.
   Unlike merge assume that both accesses are always executed
   and merge size the same way as max_size.  */

bool
modref_access_node::merge_for_kills (const modref_access_node &a,
				     bool record_adjustments)
{
  poly_int64 offset1 = 0;
  poly_int64 aoffset1 = 0;
  poly_int64 new_parm_offset = 0;

  /* We assume that containment was tested earlier.  */
  gcc_checking_assert (!contains_for_kills (a) && !a.contains_for_kills (*this)
		       && useful_for_kill_p () && a.useful_for_kill_p ());

  if (parm_index != a.parm_index
      || !combined_offsets (a, &new_parm_offset, &offset1, &aoffset1))
    return false;

  if (known_le (offset1, aoffset1))
    {
      if (!known_size_p (max_size)
	  || known_ge (offset1 + max_size, aoffset1))
	return update_for_kills (new_parm_offset, offset1, max_size,
				 aoffset1, a.max_size, record_adjustments);
    }
  else if (known_le (aoffset1, offset1))
    {
      if (!known_size_p (a.max_size)
	  || known_ge (aoffset1 + a.max_size, offset1))
	return update_for_kills (new_parm_offset, offset1, max_size,
				 aoffset1, a.max_size, record_adjustments);
    }
  return false;
}

/* Insert new kill A into KILLS.  If RECORD_ADJUSTMENTS is true limit number
   of changes to each entry.  Return true if something changed.  */

bool
modref_access_node::insert_kill (vec<modref_access_node> &kills,
				 modref_access_node &a, bool record_adjustments)
{
  size_t index;
  modref_access_node *a2;
  bool merge = false;

  gcc_checking_assert (a.useful_for_kill_p ());

  /* See if we have corresponding entry already or we can merge with
     neighbouring entry.  */
  FOR_EACH_VEC_ELT (kills, index, a2)
    {
      if (a2->contains_for_kills (a))
	return false;
      if (a.contains_for_kills (*a2))
	{
	  a.adjustments = 0;
	  *a2 = a;
	  merge = true;
	  break;
	}
      if (a2->merge_for_kills (a, record_adjustments))
	{
	  merge = true;
	  break;
	}
    }
  /* If entry was not found, insert it.  */
  if (!merge)
    {
      if ((int)kills.length () >= param_modref_max_accesses)
	{
	  if (dump_file)
	    fprintf (dump_file,
		     "--param modref-max-accesses limit reached:");
	  return false;
	}
      a.adjustments = 0;
      kills.safe_push (a);
      return true;
    }
  /* Extending range in an entry may make it possible to merge it with
     other entries.  */
  size_t i;

  for (i = 0; i < kills.length ();)
    if (i != index)
      {
	bool found = false, restart = false;
	modref_access_node *a = &kills[i];
	modref_access_node *n = &kills[index];

	if (n->contains_for_kills (*a))
	  found = true;
	if (!found && n->merge_for_kills (*a, false))
	  found = restart = true;
	gcc_checking_assert (found || !a->merge_for_kills (*n, false));
	if (found)
	  {
	    kills.unordered_remove (i);
	    if (index == kills.length ())
	      {
		index = i;
		i++;
	      }
	    if (restart)
	      i = 0;
	  }
	else
	  i++;
      }
    else
      i++;
  return true;
}

#if CHECKING_P

namespace selftest {

static void
test_insert_search_collapse ()
{
  modref_base_node<alias_set_type> *base_node;
  modref_ref_node<alias_set_type> *ref_node;
  modref_access_node a = unspecified_modref_access_node;

  modref_tree<alias_set_type> *t = new modref_tree<alias_set_type>();
  ASSERT_FALSE (t->every_base);

  /* Insert into an empty tree.  */
  t->insert (1, 2, 2, 1, 2, a, false);
  ASSERT_NE (t->bases, NULL);
  ASSERT_EQ (t->bases->length (), 1);
  ASSERT_FALSE (t->every_base);
  ASSERT_EQ (t->search (2), NULL);

  base_node = t->search (1);
  ASSERT_NE (base_node, NULL);
  ASSERT_EQ (base_node->base, 1);
  ASSERT_NE (base_node->refs, NULL);
  ASSERT_EQ (base_node->refs->length (), 1);
  ASSERT_EQ (base_node->search (1), NULL);

  ref_node = base_node->search (2);
  ASSERT_NE (ref_node, NULL);
  ASSERT_EQ (ref_node->ref, 2);

  /* Insert when base exists but ref does not.  */
  t->insert (1, 2, 2, 1, 3, a, false);
  ASSERT_NE (t->bases, NULL);
  ASSERT_EQ (t->bases->length (), 1);
  ASSERT_EQ (t->search (1), base_node);
  ASSERT_EQ (t->search (2), NULL);
  ASSERT_NE (base_node->refs, NULL);
  ASSERT_EQ (base_node->refs->length (), 2);

  ref_node = base_node->search (3);
  ASSERT_NE (ref_node, NULL);

  /* Insert when base and ref exist, but access is not dominated by nor
     dominates other accesses.  */
  t->insert (1, 2, 2, 1, 2, a, false);
  ASSERT_EQ (t->bases->length (), 1);
  ASSERT_EQ (t->search (1), base_node);

  ref_node = base_node->search (2);
  ASSERT_NE (ref_node, NULL);

  /* Insert when base and ref exist and access is dominated.  */
  t->insert (1, 2, 2, 1, 2, a, false);
  ASSERT_EQ (t->search (1), base_node);
  ASSERT_EQ (base_node->search (2), ref_node);

  /* Insert ref to trigger ref list collapse for base 1.  */
  t->insert (1, 2, 2, 1, 4, a, false);
  ASSERT_EQ (t->search (1), base_node);
  ASSERT_EQ (base_node->refs, NULL);
  ASSERT_EQ (base_node->search (2), NULL);
  ASSERT_EQ (base_node->search (3), NULL);
  ASSERT_TRUE (base_node->every_ref);

  /* Further inserts to collapsed ref list are ignored.  */
  t->insert (1, 2, 2, 1, 5, a, false);
  ASSERT_EQ (t->search (1), base_node);
  ASSERT_EQ (base_node->refs, NULL);
  ASSERT_EQ (base_node->search (2), NULL);
  ASSERT_EQ (base_node->search (3), NULL);
  ASSERT_TRUE (base_node->every_ref);

  /* Insert base to trigger base list collapse.  */
  t->insert (1, 2, 2, 5, 0, a, false);
  ASSERT_TRUE (t->every_base);
  ASSERT_EQ (t->bases, NULL);
  ASSERT_EQ (t->search (1), NULL);

  /* Further inserts to collapsed base list are ignored.  */
  t->insert (1, 2, 2, 7, 8, a, false);
  ASSERT_TRUE (t->every_base);
  ASSERT_EQ (t->bases, NULL);
  ASSERT_EQ (t->search (1), NULL);

  delete t;
}

static void
test_merge ()
{
  modref_tree<alias_set_type> *t1, *t2;
  modref_base_node<alias_set_type> *base_node;
  modref_access_node a = unspecified_modref_access_node;

  t1 = new modref_tree<alias_set_type>();
  t1->insert (3, 4, 1, 1, 1, a, false);
  t1->insert (3, 4, 1, 1, 2, a, false);
  t1->insert (3, 4, 1, 1, 3, a, false);
  t1->insert (3, 4, 1, 2, 1, a, false);
  t1->insert (3, 4, 1, 3, 1, a, false);

  t2 = new modref_tree<alias_set_type>();
  t2->insert (10, 10, 10, 1, 2, a, false);
  t2->insert (10, 10, 10, 1, 3, a, false);
  t2->insert (10, 10, 10, 1, 4, a, false);
  t2->insert (10, 10, 10, 3, 2, a, false);
  t2->insert (10, 10, 10, 3, 3, a, false);
  t2->insert (10, 10, 10, 3, 4, a, false);
  t2->insert (10, 10, 10, 3, 5, a, false);

  t1->merge (3, 4, 1, t2, NULL, NULL, false);

  ASSERT_FALSE (t1->every_base);
  ASSERT_NE (t1->bases, NULL);
  ASSERT_EQ (t1->bases->length (), 3);

  base_node = t1->search (1);
  ASSERT_NE (base_node->refs, NULL);
  ASSERT_FALSE (base_node->every_ref);
  ASSERT_EQ (base_node->refs->length (), 4);

  base_node = t1->search (2);
  ASSERT_NE (base_node->refs, NULL);
  ASSERT_FALSE (base_node->every_ref);
  ASSERT_EQ (base_node->refs->length (), 1);

  base_node = t1->search (3);
  ASSERT_EQ (base_node->refs, NULL);
  ASSERT_TRUE (base_node->every_ref);

  delete t1;
  delete t2;
}

void
ipa_modref_tree_c_tests ()
{
  test_insert_search_collapse ();
  test_merge ();
}

} // namespace selftest

#endif

void
gt_ggc_mx (modref_tree < int >*const &tt)
{
  if (tt->bases)
    {
      ggc_test_and_set_mark (tt->bases);
      gt_ggc_mx (tt->bases);
    }
}

void
gt_ggc_mx (modref_tree < tree_node * >*const &tt)
{
  if (tt->bases)
    {
      ggc_test_and_set_mark (tt->bases);
      gt_ggc_mx (tt->bases);
    }
}

void gt_pch_nx (modref_tree<int>* const&) {}
void gt_pch_nx (modref_tree<tree_node*>* const&) {}
void gt_pch_nx (modref_tree<int>* const&, gt_pointer_operator, void *) {}
void gt_pch_nx (modref_tree<tree_node*>* const&, gt_pointer_operator, void *) {}

void gt_ggc_mx (modref_base_node<int>* &b)
{
  ggc_test_and_set_mark (b);
  if (b->refs)
    {
      ggc_test_and_set_mark (b->refs);
      gt_ggc_mx (b->refs);
    }
}

void gt_ggc_mx (modref_base_node<tree_node*>* &b)
{
  ggc_test_and_set_mark (b);
  if (b->refs)
    {
      ggc_test_and_set_mark (b->refs);
      gt_ggc_mx (b->refs);
    }
  if (b->base)
    gt_ggc_mx (b->base);
}

void gt_pch_nx (modref_base_node<int>*) {}
void gt_pch_nx (modref_base_node<tree_node*>*) {}
void gt_pch_nx (modref_base_node<int>*, gt_pointer_operator, void *) {}
void gt_pch_nx (modref_base_node<tree_node*>*, gt_pointer_operator, void *) {}

void gt_ggc_mx (modref_ref_node<int>* &r)
{
  ggc_test_and_set_mark (r);
Add access through parameter derference tracking to modref
re-add tracking of accesses which was unfinished in David's patch.
At the moment I only implemented tracking of the fact that access is based on
derefernece of the parameter (so we track THIS pointers).
Patch does not implement IPA propagation since it needs bit more work which
I will post shortly: ipa-fnsummary needs to track when parameter points to
local memory, summaries needs to be merged when function is inlined (because
jump functions are) and propagation needs to be turned into iterative dataflow
on SCC components.
Patch also adds documentation of -fipa-modref and params that was left uncommited
in my branch :(.
Even without this change it does lead to nice increase of disambiguations
for cc1plus build.
Alias oracle query stats:
refs_may_alias_p: 62758323 disambiguations, 72935683 queries
ref_maybe_used_by_call_p: 139511 disambiguations, 63654045 queries
call_may_clobber_ref_p: 23502 disambiguations, 29242 queries
nonoverlapping_component_refs_p: 0 disambiguations, 37654 queries
nonoverlapping_refs_since_match_p: 19417 disambiguations, 55555 must overlaps, 75721 queries
aliasing_component_refs_p: 54665 disambiguations, 752449 queries
TBAA oracle: 21917926 disambiguations 53054678 queries
15763411 are in alias set 0
10162238 queries asked about the same object
124 queries asked about the same alias set
0 access volatile
3681593 are dependent in the DAG
1529386 are aritificially in conflict with void *
Modref stats:
modref use: 8311 disambiguations, 32527 queries
modref clobber: 742126 disambiguations, 1036986 queries
1987054 tbaa queries (1.916182 per modref query)
125479 base compares (0.121004 per modref query)
PTA query stats:
pt_solution_includes: 968314 disambiguations, 13609584 queries
pt_solutions_intersect: 1019136 disambiguations, 13147139 queries
So compared to
https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554605.html
we get 41% more use disambiguations (with similar number of queries) and 8% more
clobber disambiguations.
For tramp3d:
Alias oracle query stats:
refs_may_alias_p: 2052256 disambiguations, 2312703 queries
ref_maybe_used_by_call_p: 7122 disambiguations, 2089118 queries
call_may_clobber_ref_p: 234 disambiguations, 234 queries
nonoverlapping_component_refs_p: 0 disambiguations, 4299 queries
nonoverlapping_refs_since_match_p: 329 disambiguations, 10200 must overlaps, 10616 queries
aliasing_component_refs_p: 857 disambiguations, 34555 queries
TBAA oracle: 885546 disambiguations 1677080 queries
132105 are in alias set 0
469030 queries asked about the same object
0 queries asked about the same alias set
0 access volatile
190084 are dependent in the DAG
315 are aritificially in conflict with void *
Modref stats:
modref use: 426 disambiguations, 1881 queries
modref clobber: 10042 disambiguations, 16202 queries
19405 tbaa queries (1.197692 per modref query)
2775 base compares (0.171275 per modref query)
PTA query stats:
pt_solution_includes: 313908 disambiguations, 526183 queries
pt_solutions_intersect: 130510 disambiguations, 416084 queries
Here uses decrease by 4 disambiguations and clobbers improve by 3.5%.  I think
the difference is caused by the fact that GCC has many more alias set 0 accesses
originating from gimple and tree unions, as I mentioned in the original mail.
After pushing out the IPA propagation I will re-add code to track offsets and
sizes that further improve disambiguation.  On tramp3d it enables a lot of DSE
for structure fields not accessed by the uninlined function.
gcc/
* doc/invoke.texi: Document -fipa-modref, ipa-modref-max-bases,
ipa-modref-max-refs, ipa-modref-max-accesses, ipa-modref-max-tests.
* ipa-modref-tree.c (test_insert_search_collapse): Update.
(test_merge): Update.
(gt_ggc_mx): New function.
* ipa-modref-tree.h (struct modref_access_node): New structure.
(struct modref_ref_node): Add every_access and accesses array.
(modref_ref_node::modref_ref_node): Update ctor.
(modref_ref_node::search): New member function.
(modref_ref_node::collapse): New member function.
(modref_ref_node::insert_access): New member function.
(modref_base_node::insert_ref): Do not collapse base if ref is 0.
(modref_base_node::collapse): Collapse also refs.
(modref_tree): Add accesses.
(modref_tree::modref_tree): Initialize max_accesses.
(modref_tree::insert): Add access parameter.
(modref_tree::cleanup): New member function.
(modref_tree::merge): Add parm_map; merge accesses.
(modref_tree::copy_from): New member function.
(modref_tree::create_ggc): Add max_accesses.
* ipa-modref.c (dump_access): New function.
(dump_records): Dump accesses.
(dump_lto_records): Dump accesses.
(get_access): New function.
(record_access): Record access.
(record_access_lto): Record access.
(analyze_call): Compute parm_map.
(analyze_function): Update construction of modref records.
(modref_summaries::duplicate): Likewise; use copy_from.
(write_modref_records): Stream accesses.
(read_modref_records): Stream accesses.
(pass_ipa_modref::execute): Update call of merge.
* params.opt (-param=modref-max-accesses): New.
* tree-ssa-alias.c (alias_stats): Add modref_baseptr_tests.
(dump_alias_stats): Update.
(base_may_alias_with_dereference_p): New function.
(modref_may_conflict): Check accesses.
(ref_maybe_used_by_call_p_1): Update call to modref_may_conflict.
(call_may_clobber_ref_p_1): Update call to modref_may_conflict.
  if (r->accesses)
    {
      ggc_test_and_set_mark (r->accesses);
      gt_ggc_mx (r->accesses);
    }
}

void gt_ggc_mx (modref_ref_node<tree_node*>* &r)
{
  ggc_test_and_set_mark (r);
Add access through parameter dereference tracking to modref
This re-adds tracking of accesses which was unfinished in David's patch.
At the moment I only implemented tracking of the fact that an access is based on
a dereference of the parameter (so we track THIS pointers).
The patch does not implement IPA propagation since it needs a bit more work which
I will post shortly: ipa-fnsummary needs to track when a parameter points to
local memory, summaries need to be merged when a function is inlined (because
jump functions are), and propagation needs to be turned into iterative dataflow
on SCC components.
The patch also adds documentation of -fipa-modref and params that was left
uncommitted in my branch :(.
Even without this change it leads to a nice increase of disambiguations
for the cc1plus build.
Alias oracle query stats:
refs_may_alias_p: 62758323 disambiguations, 72935683 queries
ref_maybe_used_by_call_p: 139511 disambiguations, 63654045 queries
  if (r->accesses)
    {
      ggc_test_and_set_mark (r->accesses);
      gt_ggc_mx (r->accesses);
    }
  if (r->ref)
    gt_ggc_mx (r->ref);
}

void gt_pch_nx (modref_ref_node<int> *) {}
void gt_pch_nx (modref_ref_node<tree_node*> *) {}
void gt_pch_nx (modref_ref_node<int> *, gt_pointer_operator, void *) {}
void gt_pch_nx (modref_ref_node<tree_node*> *, gt_pointer_operator, void *) {}
void gt_ggc_mx (modref_access_node &)
{
}