Also apply the insn to prefetch ratio heuristic to loops with known trip count.

2010-05-17  Changpeng Fang  <changpeng.fang@amd.com>

	* doc/invoke.texi: Update documentation for min-insn-to-prefetch-ratio.
	* tree-ssa-loop-prefetch.c (is_loop_prefetching_profitable): Also apply
	the insn to prefetch ratio heuristic to loops with known trip count.

From-SVN: r159515
This commit is contained in:
Changpeng Fang 2010-05-17 22:34:03 +00:00 committed by Sebastian Pop
parent 8532678c05
commit d3a9b459b7
3 changed files with 22 additions and 16 deletions

View File

@ -1,3 +1,9 @@
2010-05-17 Changpeng Fang <changpeng.fang@amd.com>
* doc/invoke.texi: Update documentation for min-insn-to-prefetch-ratio.
* tree-ssa-loop-prefetch.c (is_loop_prefetching_profitable): Also apply
the insn to prefetch ratio heuristic to loops with known trip count.
2010-05-17 Changpeng Fang <changpeng.fang@amd.com>
* tree-ssa-loop-prefetch.c (PREFETCH_MOD_TO_UNROLL_FACTOR_RATIO): New.

View File

@ -8518,8 +8518,7 @@ The size of L2 cache, in kilobytes.
@item min-insn-to-prefetch-ratio
The minimum ratio between the number of instructions and the
number of prefetches to enable prefetching in a loop with an
unknown trip count.
number of prefetches to enable prefetching in a loop.
@item prefetch-min-insn-to-mem-ratio
The minimum ratio between the number of instructions and the

View File

@ -1603,17 +1603,9 @@ is_loop_prefetching_profitable (unsigned ahead, HOST_WIDE_INT est_niter,
return false;
}
/* Profitability of prefetching is highly dependent on the trip count.
For a given AHEAD distance, the first AHEAD iterations do not benefit
from prefetching, and the last AHEAD iterations execute useless
prefetches. So, if the trip count is not large enough relative to AHEAD,
prefetching may cause serious performance degradation. To avoid this
problem when the trip count is not known at compile time, we
conservatively skip loops with high prefetching costs. For now, only
the I-cache cost is considered. The relative I-cache cost is estimated
by taking the ratio between the number of prefetches and the total
number of instructions. Since we are using integer arithmetic, we
compute the reciprocal of this ratio.
/* Prefetching most likely causes performance degradation when the instruction
to prefetch ratio is too small. Too many prefetch instructions in a loop
may reduce the I-cache performance.
(unroll_factor * ninsns) is used to estimate the number of instructions in
the unrolled loop. This implementation is a bit simplistic -- the number
of issued prefetch instructions is also affected by unrolling. So,
@ -1623,12 +1615,21 @@ is_loop_prefetching_profitable (unsigned ahead, HOST_WIDE_INT est_niter,
original loop * unroll_factor (at least the induction variable increases
and the exit branches will get eliminated), so it might be better to use
tree_estimate_loop_size + estimated_unrolled_size. */
if (est_niter < 0)
insn_to_prefetch_ratio = (unroll_factor * ninsns) / prefetch_count;
if (insn_to_prefetch_ratio < MIN_INSN_TO_PREFETCH_RATIO)
{
insn_to_prefetch_ratio = (unroll_factor * ninsns) / prefetch_count;
return insn_to_prefetch_ratio >= MIN_INSN_TO_PREFETCH_RATIO;
if (dump_file && (dump_flags & TDF_DETAILS))
fprintf (dump_file,
"Not prefetching -- instruction to prefetch ratio (%d) too small\n",
insn_to_prefetch_ratio);
return false;
}
/* Could not do further estimation if the trip count is unknown. Just assume
prefetching is profitable. Too aggressive??? */
if (est_niter < 0)
return true;
if (est_niter < (HOST_WIDE_INT) (TRIP_COUNT_TO_AHEAD_RATIO * ahead))
{
if (dump_file && (dump_flags & TDF_DETAILS))