d61ce6ab04
This issue was observed in rs6000 specific PR102658 as well. I've looked into it a bit, it's caused by the "conditional store replacement" which is originally disabled without vectorization as below code. /* If either vectorization or if-conversion is disabled then do not sink any stores. */ if (param_max_stores_to_sink == 0 || (!flag_tree_loop_vectorize && !flag_tree_slp_vectorize) || !flag_tree_loop_if_convert) return false; The new change makes the innermost loop look like for (int c1 = 0; c1 <= 1499; c1 += 1) { if (c1 <= 500) { S_10(c0, c1); } else { S_9(c0, c1); } S_11(c0, c1); } and can not be splitted as: for (int c1 = 0; c1 <= 500; c1 += 1) S_10(c0, c1); for (int c1 = 501; c1 <= 1499; c1 += 1) S_9(c0, c1); So instead of disabling vectorization, could we just disable this cs replacement with parameter "--param max-stores-to-sink=0"? I tested this proposal on ppc64le, it should work as well. 2021-10-11 Kewen Lin <linkw@linux.ibm.com> libgomp/ChangeLog: * testsuite/libgomp.graphite/force-parallel-8.c: Add --param max-stores-to-sink=0. |
||
---|---|---|
.. | ||
config | ||
plugin | ||
testsuite | ||
.gitattributes | ||
acc_prof.h | ||
acinclude.m4 | ||
aclocal.m4 | ||
affinity-fmt.c | ||
affinity.c | ||
alloc.c | ||
allocator.c | ||
atomic.c | ||
barrier.c | ||
ChangeLog | ||
ChangeLog.graphite | ||
config.h.in | ||
configure | ||
configure.ac | ||
configure.tgt | ||
critical.c | ||
env.c | ||
error.c | ||
fortran.c | ||
hashtab.h | ||
icv-device.c | ||
icv.c | ||
iter_ull.c | ||
iter.c | ||
libgomp_f.h.in | ||
libgomp_g.h | ||
libgomp-plugin.c | ||
libgomp-plugin.h | ||
libgomp.h | ||
libgomp.map | ||
libgomp.spec.in | ||
libgomp.texi | ||
lock.c | ||
loop_ull.c | ||
loop.c | ||
Makefile.am | ||
Makefile.in | ||
oacc-async.c | ||
oacc-cuda.c | ||
oacc-host.c | ||
oacc-init.c | ||
oacc-int.h | ||
oacc-mem.c | ||
oacc-parallel.c | ||
oacc-plugin.c | ||
oacc-plugin.h | ||
oacc-profiling.c | ||
oacc-target.c | ||
omp_lib.f90.in | ||
omp_lib.h.in | ||
omp.h.in | ||
openacc_lib.h | ||
openacc.f90 | ||
openacc.h | ||
ordered.c | ||
parallel.c | ||
priority_queue.c | ||
priority_queue.h | ||
scope.c | ||
sections.c | ||
secure_getenv.h | ||
single.c | ||
splay-tree.c | ||
splay-tree.h | ||
target.c | ||
task.c | ||
taskloop.c | ||
team.c | ||
teams.c | ||
work.c |