gcc/libgomp
Tom de Vries fc14ff6111 [omp, simt] Handle alternative IV
Consider the test-case libgomp.c/pr81778.c added in this commit, with
this core loop (note: CANARY_SIZE set to 0 for simplicity):
...
  int s = 1;
  #pragma omp target simd
  for (int i = N - 1; i > -1; i -= s)
    a[i] = 1;
...
which, given that N is 32, sets a[0..31] to 1.

After omp-expand, this looks like:
...
  <bb 5> :
  simduid.7 = .GOMP_SIMT_ENTER (simduid.7);
  .omp_simt.8 = .GOMP_SIMT_ENTER_ALLOC (simduid.7);
  D.3193 = -s;
  s.9 = s;
  D.3204 = .GOMP_SIMT_LANE ();
  D.3205 = -s.9;
  D.3206 = (int) D.3204;
  D.3207 = D.3205 * D.3206;
  i = D.3207 + 31;
  D.3209 = 0;
  D.3210 = -s.9;
  D.3211 = D.3210 - i;
  D.3210 = -s.9;
  D.3212 = D.3211 / D.3210;
  D.3213 = (unsigned int) D.3212;
  D.3213 = i >= 0 ? D.3213 : 0;

  <bb 19> :
  if (D.3209 < D.3213)
    goto <bb 6>; [87.50%]
  else
    goto <bb 7>; [12.50%]

  <bb 6> :
  a[i] = 1;
  D.3215 = -s.9;
  D.3219 = .GOMP_SIMT_VF ();
  D.3216 = (int) D.3219;
  D.3220 = D.3215 * D.3216;
  i = D.3220 + i;
  D.3209 = D.3209 + 1;
  goto <bb 19>; [100.00%]
...

On nvptx, the first time bb6 is executed, i is in the 0..31 range (depending
on the lane that is executing) at bb entry.

So we have the following sequence:
- a[0..31] is set to 1
- i is updated to -32..-1
- D.3209 is updated to 1 (being 0 initially)
- bb19 is executed, and the condition (D.3209 < D.3213) == (1 < 32)
  evaluates to true (for lane 0, where i started at 31)
- bb6 is once more executed, which should not happen because all the elements
  that needed to be handled were already handled.
- consequently, elements that should not be written are written
- with CANARY_SIZE == 0, we may run into a libgomp error:
  ...
  libgomp: cuCtxSynchronize error: an illegal memory access was encountered
  ...
  and with CANARY_SIZE unmodified, we run into:
  ...
  Expected 0, got 1 at base[-961]
  Aborted (core dumped)
  ...

The cause of this is as follows:
- because the step s is a variable rather than a constant, an alternative
  IV (D.3209 in our example) is generated in expand_omp_simd, and the
  loop condition is tested in terms of the alternative IV rather than
  the original IV (i in our example).
- the SIMT code in expand_omp_simd works by modifying step and initial value.
- The initial value fd->loop.n1 is loaded into a variable n1, which is
  modified by the SIMT code and then used thereafter.
- The step fd->loop.step is loaded into a variable step, which is modified
  by the SIMT code, but afterwards there are uses of both step and
  fd->loop.step.
- There are uses of fd->loop.step in the alternative IV handling code,
  which should use step instead.

Fix this by introducing an additional variable orig_step, which is not
modified by the SIMT code, and by replacing all remaining uses of
fd->loop.step by either step or orig_step.

Built on x86_64-linux with nvptx accelerator, tested libgomp.

This fixes the for-5.c and for-6.c FAILs I'm currently seeing on a Quadro
M1200 with driver 450.66.

gcc/ChangeLog:

2020-10-02  Tom de Vries  <tdevries@suse.de>

	* omp-expand.c (expand_omp_simd): Add orig_step, and replace uses of
	fd->loop.step by either step or orig_step.

libgomp/ChangeLog:

2020-10-02  Tom de Vries  <tdevries@suse.de>

	* testsuite/libgomp.c/pr81778.c: New test.
2021-04-29 14:37:32 +02:00
config libgomp/i386: Revert the type of syscall wrappers output back to long. 2021-02-12 00:07:56 +01:00
plugin libgomp HSA/GCN plugins: don't prepend the 'HSA_RUNTIME_LIB' path to 'libhsa-runtime64.so' 2021-03-25 14:11:50 +01:00
testsuite [omp, simt] Handle alternative IV 2021-04-29 14:37:32 +02:00
.gitattributes
acc_prof.h Update copyright years. 2021-01-04 10:26:59 +01:00
acinclude.m4
aclocal.m4
affinity-fmt.c Update copyright years. 2021-01-04 10:26:59 +01:00
affinity.c Update copyright years. 2021-01-04 10:26:59 +01:00
alloc.c Update copyright years. 2021-01-04 10:26:59 +01:00
allocator.c Update copyright years. 2021-01-04 10:26:59 +01:00
atomic.c Update copyright years. 2021-01-04 10:26:59 +01:00
barrier.c Update copyright years. 2021-01-04 10:26:59 +01:00
ChangeLog Daily bump. 2021-04-29 00:17:01 +00:00
ChangeLog.graphite
config.h.in offload-defaulted: Config option to silently ignore uninstalled offload compilers 2021-04-28 18:46:47 +02:00
configure offload-defaulted: Config option to silently ignore uninstalled offload compilers 2021-04-28 18:46:47 +02:00
configure.ac offload-defaulted: Config option to silently ignore uninstalled offload compilers 2021-04-28 18:46:47 +02:00
configure.tgt libgomp: enable linux-futex on riscv64 2021-01-18 15:13:48 +01:00
critical.c Update copyright years. 2021-01-04 10:26:59 +01:00
env.c Update copyright years. 2021-01-04 10:26:59 +01:00
error.c Update copyright years. 2021-01-04 10:26:59 +01:00
fortran.c openmp: Add support for the OpenMP 5.0 task detach clause 2021-01-16 12:58:13 -08:00
hashtab.h Update copyright years. 2021-01-04 10:26:59 +01:00
icv-device.c Update copyright years. 2021-01-04 10:26:59 +01:00
icv.c Update copyright years. 2021-01-04 10:26:59 +01:00
iter_ull.c Update copyright years. 2021-01-04 10:26:59 +01:00
iter.c Update copyright years. 2021-01-04 10:26:59 +01:00
libgomp_f.h.in Update copyright years. 2021-01-04 10:26:59 +01:00
libgomp_g.h openmp: Add support for the OpenMP 5.0 task detach clause 2021-01-16 12:58:13 -08:00
libgomp-plugin.c Update copyright years. 2021-01-04 10:26:59 +01:00
libgomp-plugin.h Update copyright years. 2021-01-04 10:26:59 +01:00
libgomp.h openmp: Fix intermittent hanging of task-detach-6 libgomp tests [PR98738] 2021-02-25 14:47:11 -08:00
libgomp.map openmp: Add support for the OpenMP 5.0 task detach clause 2021-01-16 12:58:13 -08:00
libgomp.spec.in
libgomp.texi libgomp: Add documentation for omp_fulfill_event runtime function 2021-01-25 09:58:51 -08:00
lock.c Update copyright years. 2021-01-04 10:26:59 +01:00
loop_ull.c Update copyright years. 2021-01-04 10:26:59 +01:00
loop.c Update copyright years. 2021-01-04 10:26:59 +01:00
Makefile.am
Makefile.in offload-defaulted: Config option to silently ignore uninstalled offload compilers 2021-04-28 18:46:47 +02:00
oacc-async.c Update copyright years. 2021-01-04 10:26:59 +01:00
oacc-cuda.c Update copyright years. 2021-01-04 10:26:59 +01:00
oacc-host.c Update copyright years. 2021-01-04 10:26:59 +01:00
oacc-init.c Update copyright years. 2021-01-04 10:26:59 +01:00
oacc-int.h Update copyright years. 2021-01-04 10:26:59 +01:00
oacc-mem.c Update copyright years. 2021-01-04 10:26:59 +01:00
oacc-parallel.c Update copyright years. 2021-01-04 10:26:59 +01:00
oacc-plugin.c Update copyright years. 2021-01-04 10:26:59 +01:00
oacc-plugin.h Update copyright years. 2021-01-04 10:26:59 +01:00
oacc-profiling.c Update copyright years. 2021-01-04 10:26:59 +01:00
oacc-target.c
omp_lib.f90.in openmp: Add support for the OpenMP 5.0 task detach clause 2021-01-16 12:58:13 -08:00
omp_lib.h.in openmp: Add support for the OpenMP 5.0 task detach clause 2021-01-16 12:58:13 -08:00
omp.h.in openmp: Add support for the OpenMP 5.0 task detach clause 2021-01-16 12:58:13 -08:00
openacc_lib.h Update copyright years. 2021-01-04 10:26:59 +01:00
openacc.f90 Update copyright years. 2021-01-04 10:26:59 +01:00
openacc.h Update copyright years. 2021-01-04 10:26:59 +01:00
ordered.c Update copyright years. 2021-01-04 10:26:59 +01:00
parallel.c Update copyright years. 2021-01-04 10:26:59 +01:00
priority_queue.c openmp: Add support for the OpenMP 5.0 task detach clause 2021-01-16 12:58:13 -08:00
priority_queue.h openmp: Add support for the OpenMP 5.0 task detach clause 2021-01-16 12:58:13 -08:00
sections.c Update copyright years. 2021-01-04 10:26:59 +01:00
secure_getenv.h Update copyright years. 2021-01-04 10:26:59 +01:00
single.c Update copyright years. 2021-01-04 10:26:59 +01:00
splay-tree.c Update copyright years. 2021-01-04 10:26:59 +01:00
splay-tree.h Update copyright years. 2021-01-04 10:26:59 +01:00
target.c offload-defaulted: Config option to silently ignore uninstalled offload compilers 2021-04-28 18:46:47 +02:00
task.c openmp: Fix intermittent hanging of task-detach-6 libgomp tests [PR98738] 2021-02-25 14:47:11 -08:00
taskloop.c Update copyright years. 2021-01-04 10:26:59 +01:00
team.c libgomp: Silence false positive -Wmaybe-uninitialized warning [PR99984] 2021-04-09 10:18:47 +02:00
teams.c Update copyright years. 2021-01-04 10:26:59 +01:00
work.c Update copyright years. 2021-01-04 10:26:59 +01:00