gcc/libgomp
Tom de Vries 052aaaceed [nvptx] Don't allow vector_length 64 with num_workers 16
When using a compiler built with:
...
+#define PTX_DEFAULT_VECTOR_LENGTH PTX_CTA_SIZE
...

consider a test-case:
...
int
main (void)
{
  #pragma acc parallel vector_length (64)
    #pragma acc loop worker
    for (unsigned int i = 0; i < 32; i++)
      #pragma acc loop vector
      for (unsigned int j = 0; j < 64; j++)
        ;

  return 0;
}
...

If num_workers is 16, either because:
- we add a "num_workers (16)" clause on the parallel directive, or
- we set "GOMP_OPENACC_DIM=:16:", or
- the libgomp plugin chooses 16 num_workers
we run into an illegal instruction at runtime, because a bar.sync instruction
tries to use barrier 16.  The instruction is illegal because PTX supports
only 16 barriers per CTA, so the valid range is 0..15.

The problem is that with a warp-multiple vector length, we use a code generation
scheme with a per-worker barrier.  Because barrier zero is reserved for the
per-CTA barrier, only the remaining 15 barriers can be used as per-worker
barriers, and consequently we can't use a num_workers larger than 15.
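
The barrier arithmetic above can be sketched in C.  This is an illustration
only, not the code in nvptx.c: the macro values are inferred from the
description (16 barriers, barrier 0 reserved), and the helper functions
per_worker_barrier and num_workers_fits_barriers are hypothetical, assuming
worker w (0-based) uses barrier w + 1.

```c
#include <stdbool.h>

/* Values inferred from the commit description, not copied from nvptx.c.  */
#define PTX_CTA_NUM_BARRIERS 16        /* bar.sync accepts barriers 0..15.  */
#define PTX_PER_CTA_BARRIER 0          /* Reserved for the per-CTA barrier.  */
#define PTX_FIRST_PER_WORKER_BARRIER 1 /* First barrier usable per worker.  */

/* Hypothetical helper: barrier used by worker WORKER (0-based), assuming
   per-worker barriers are handed out consecutively after barrier 0.  */
static int
per_worker_barrier (int worker)
{
  return PTX_FIRST_PER_WORKER_BARRIER + worker;
}

/* Hypothetical helper: true if the last of NUM_WORKERS workers still gets
   a barrier inside the valid range.  */
static bool
num_workers_fits_barriers (int num_workers)
{
  return per_worker_barrier (num_workers - 1) < PTX_CTA_NUM_BARRIERS;
}
```

Under these assumptions, the last of 16 workers would need barrier 16, which
is outside 0..15, while 15 workers stay within range.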

This problem occurs only for vector_length 64.  For vector_length 32, we use a
different code generation scheme, and for vector_length >= 96, the maximum
num_workers is not big enough to trigger this problem.

Also, this problem only occurs for num_workers 16.  As explained above,
num_workers 15 is safe to use, and 16 is already the maximum num_workers for
vector_length 64.
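
To see why only the 64/16 combination is affected, the limits can be worked
through in a small C sketch.  It assumes PTX_CTA_SIZE is 1024 threads (which
matches the stated maximum of 16 workers at vector_length 64) and a warp size
of 32; the helpers max_num_workers and triggers_problem are hypothetical names
used for illustration.

```c
/* Assumed values; only the names PTX_CTA_SIZE and
   PTX_NUM_PER_WORKER_BARRIERS appear in the commit itself.  */
#define PTX_CTA_SIZE 1024              /* Assumed max threads per CTA.  */
#define PTX_WARP_SIZE 32               /* Assumed warp size.  */
#define PTX_NUM_PER_WORKER_BARRIERS 15 /* Barriers 1..15.  */

/* Hypothetical helper: largest num_workers that fits in one CTA for a
   given vector_length.  */
static int
max_num_workers (int vector_length)
{
  return PTX_CTA_SIZE / vector_length;
}

/* Hypothetical helper: nonzero if some legal num_workers for this
   vector_length needs more per-worker barriers than are available.  */
static int
triggers_problem (int vector_length)
{
  /* A single-warp vector length uses a different code generation scheme,
     without per-worker barriers.  */
  if (vector_length <= PTX_WARP_SIZE)
    return 0;
  return max_num_workers (vector_length) > PTX_NUM_PER_WORKER_BARRIERS;
}
```

With these assumed values, vector_length 64 allows up to 16 workers, one more
than the 15 per-worker barriers, whereas vector_length 96 allows at most 10.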

This patch fixes the problem both in the compiler (handling "num_workers (16)")
and in the libgomp nvptx plugin (with and without "GOMP_OPENACC_DIM=:16:").
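
One possible shape for such a guard is sketched below in C.  This is not the
patch itself: the commit text does not say which dimension is adjusted, so the
sketch assumes num_workers is lowered to the number of per-worker barriers.
The function clamp_num_workers is a hypothetical name, and the macro values
are assumptions inferred from the description.

```c
/* Assumed values, inferred from the commit description.  */
#define PTX_WARP_SIZE 32               /* Assumed warp size.  */
#define PTX_NUM_PER_WORKER_BARRIERS 15 /* Barriers 1..15.  */

/* Hypothetical guard: when the vector length spans more than one warp
   (so the per-worker barrier scheme is used), do not request more
   workers than there are per-worker barriers.  */
static int
clamp_num_workers (int num_workers, int vector_length)
{
  if (vector_length > PTX_WARP_SIZE
      && num_workers > PTX_NUM_PER_WORKER_BARRIERS)
    return PTX_NUM_PER_WORKER_BARRIERS;
  return num_workers;
}
```

In this sketch, a request for 16 workers with vector_length 64 is reduced to
15, while 16 workers with vector_length 32 is left unchanged.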

2019-01-11  Tom de Vries  <tdevries@suse.de>

	* config/nvptx/nvptx.c (PTX_CTA_NUM_BARRIERS, PTX_PER_CTA_BARRIER)
	(PTX_NUM_PER_CTA_BARRIER, PTX_FIRST_PER_WORKER_BARRIER)
	(PTX_NUM_PER_WORKER_BARRIERS): Define.
	(nvptx_apply_dim_limits): Prevent vector_length 64 and
	num_workers 16.

	* plugin/plugin-nvptx.c (nvptx_exec): Prevent vector_length 64 and
	num_workers 16.

From-SVN: r267838
2019-01-11 11:46:43 +00:00
config libgomp: Reduce copy and paste for RTEMS 2019-01-09 06:16:05 +00:00
plugin [nvptx] Don't allow vector_length 64 with num_workers 16 2019-01-11 11:46:43 +00:00
testsuite [libgomp, testsuite, openacc] Remove -foffload=-w in reduction-[1-5].c 2019-01-11 11:46:06 +00:00
acinclude.m4 Enable building libgomp with Intel CET 2017-11-17 22:22:09 +01:00
aclocal.m4 Update GCC to autoconf 2.69, automake 1.15.1 (PR bootstrap/82856). 2018-10-31 17:03:16 +00:00
affinity-fmt.c Update copyright years. 2019-01-01 13:31:55 +01:00
affinity.c Update copyright years. 2019-01-01 13:31:55 +01:00
alloc.c Update copyright years. 2019-01-01 13:31:55 +01:00
atomic.c Update copyright years. 2019-01-01 13:31:55 +01:00
barrier.c Update copyright years. 2019-01-01 13:31:55 +01:00
ChangeLog [nvptx] Don't allow vector_length 64 with num_workers 16 2019-01-11 11:46:43 +00:00
ChangeLog.graphite
config.h.in builtin-types.def (BT_FN_VOID_BOOL, [...]): New. 2018-11-08 18:13:04 +01:00
configure builtin-types.def (BT_FN_VOID_BOOL, [...]): New. 2018-11-08 18:13:04 +01:00
configure.ac builtin-types.def (BT_FN_VOID_BOOL, [...]): New. 2018-11-08 18:13:04 +01:00
configure.tgt builtin-types.def (BT_FN_VOID_BOOL, [...]): New. 2018-11-08 18:13:04 +01:00
critical.c Update copyright years. 2019-01-01 13:31:55 +01:00
env.c Update copyright years. 2019-01-01 13:31:55 +01:00
error.c Update copyright years. 2019-01-01 13:31:55 +01:00
fortran.c Update copyright years. 2019-01-01 13:31:55 +01:00
hashtab.h Update copyright years. 2019-01-01 13:31:55 +01:00
icv-device.c Update copyright years. 2019-01-01 13:31:55 +01:00
icv.c Update copyright years. 2019-01-01 13:31:55 +01:00
iter_ull.c Update copyright years. 2019-01-01 13:31:55 +01:00
iter.c Update copyright years. 2019-01-01 13:31:55 +01:00
libgomp_f.h.in Update copyright years. 2019-01-01 13:31:55 +01:00
libgomp_g.h Update copyright years. 2019-01-01 13:31:55 +01:00
libgomp-plugin.c Update copyright years. 2019-01-01 13:31:55 +01:00
libgomp-plugin.h Update copyright years. 2019-01-01 13:31:55 +01:00
libgomp.h Update copyright years. 2019-01-01 13:31:55 +01:00
libgomp.map builtin-types.def (BT_FN_VOID_BOOL, [...]): New. 2018-11-08 18:13:04 +01:00
libgomp.spec.in
libgomp.texi gcc.c (process_command): Update copyright notice dates. 2019-01-01 12:34:49 +01:00
lock.c Update copyright years. 2019-01-01 13:31:55 +01:00
loop_ull.c Update copyright years. 2019-01-01 13:31:55 +01:00
loop.c Update copyright years. 2019-01-01 13:31:55 +01:00
Makefile.am builtin-types.def (BT_FN_VOID_BOOL, [...]): New. 2018-11-08 18:13:04 +01:00
Makefile.in Makefile.am (AUTOMAKE_OPTIONS): Drop dejagnu. 2018-11-26 22:03:41 +01:00
oacc-async.c Update copyright years. 2019-01-01 13:31:55 +01:00
oacc-cuda.c Update copyright years. 2019-01-01 13:31:55 +01:00
oacc-host.c Update copyright years. 2019-01-01 13:31:55 +01:00
oacc-init.c Update copyright years. 2019-01-01 13:31:55 +01:00
oacc-int.h Update copyright years. 2019-01-01 13:31:55 +01:00
oacc-mem.c Update copyright years. 2019-01-01 13:31:55 +01:00
oacc-parallel.c Update copyright years. 2019-01-01 13:31:55 +01:00
oacc-plugin.c Update copyright years. 2019-01-01 13:31:55 +01:00
oacc-plugin.h Update copyright years. 2019-01-01 13:31:55 +01:00
omp_lib.f90.in Update copyright years. 2019-01-01 13:31:55 +01:00
omp_lib.h.in Update copyright years. 2019-01-01 13:31:55 +01:00
omp.h.in Update copyright years. 2019-01-01 13:31:55 +01:00
openacc_lib.h Update copyright years. 2019-01-01 13:31:55 +01:00
openacc.f90 Update copyright years. 2019-01-01 13:31:55 +01:00
openacc.h Update copyright years. 2019-01-01 13:31:55 +01:00
ordered.c Update copyright years. 2019-01-01 13:31:55 +01:00
parallel.c Update copyright years. 2019-01-01 13:31:55 +01:00
priority_queue.c Update copyright years. 2019-01-01 13:31:55 +01:00
priority_queue.h Update copyright years. 2019-01-01 13:31:55 +01:00
sections.c Update copyright years. 2019-01-01 13:31:55 +01:00
secure_getenv.h Update copyright years. 2019-01-01 13:31:55 +01:00
single.c Update copyright years. 2019-01-01 13:31:55 +01:00
splay-tree.c Update copyright years. 2019-01-01 13:31:55 +01:00
splay-tree.h Update copyright years. 2019-01-01 13:31:55 +01:00
target.c Update copyright years. 2019-01-01 13:31:55 +01:00
task.c Update copyright years. 2019-01-01 13:31:55 +01:00
taskloop.c Update copyright years. 2019-01-01 13:31:55 +01:00
team.c Update copyright years. 2019-01-01 13:31:55 +01:00
teams.c Update copyright years. 2019-01-01 13:31:55 +01:00
work.c Update copyright years. 2019-01-01 13:31:55 +01:00