From 2c372e81a996e105571e71108f6427c38ec2a71a Mon Sep 17 00:00:00 2001
From: Tom de Vries
Date: Wed, 9 Jan 2019 00:07:45 +0000
Subject: [PATCH] [nvptx, libgomp] Don't launch with num_workers == 0

When using a compiler build with:
...
+#define PTX_DEFAULT_VECTOR_LENGTH PTX_CTA_SIZE
+#define PTX_MAX_VECTOR_LENGTH PTX_CTA_SIZE
...
and running the libgomp testsuite, we run into an execution failure in
parallel-loop-1.c, due to a cuda launch failure:
...
  nvptx_exec: kernel f6_none_none$_omp_fn$0: launch gangs=480, workers=0, \
    vectors=1024
  libgomp: cuLaunchKernel error: invalid argument
...
because workers == 0.

The workers variable is set to 0 here in nvptx_exec:
...
          workers = blocks / actual_vectors;
...
because actual_vectors is 1024, and blocks is 768:
...
  cuOccupancyMaxPotentialBlockSize: grid = 10, block = 768
...

Fix this by ensuring that workers is at least one.

2019-01-09  Tom de Vries

        * plugin/plugin-nvptx.c (nvptx_exec): Make sure to launch with at least
        one worker.

From-SVN: r267746
---
 libgomp/ChangeLog             | 5 +++++
 libgomp/plugin/plugin-nvptx.c | 1 +
 2 files changed, 6 insertions(+)

diff --git a/libgomp/ChangeLog b/libgomp/ChangeLog
index 120f0874b27..fba0ba0562a 100644
--- a/libgomp/ChangeLog
+++ b/libgomp/ChangeLog
@@ -1,3 +1,8 @@
+2019-01-09  Tom de Vries
+
+        * plugin/plugin-nvptx.c (nvptx_exec): Make sure to launch with at least
+        one worker.
+
 2019-01-07  Tom de Vries
 
         * testsuite/libgomp.oacc-c-c++-common/vector-length-128-3.c: Fix
diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index 572d9ef8d5c..60553bdf3bd 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -1272,6 +1272,7 @@ nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs, void **devaddrs,
                                  ? vectors : dims[GOMP_DIM_VECTOR]);
 
           workers = blocks / actual_vectors;
+          workers = MAX (workers, 1);
         }
 
       for (i = 0; i != GOMP_DIM_MAX; i++)
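
Illustration of the arithmetic behind the fix. This is a minimal standalone sketch, not code from libgomp: the helper name clamp_workers is hypothetical and only mirrors the two statements in the hunk above. With blocks = 768 and actual_vectors = 1024, C integer division truncates to 0, and the clamp raises the result to 1.

```c
/* Hypothetical helper for illustration only; not part of plugin-nvptx.c.
   It reproduces the worker computation from the patched hunk.  */
int
clamp_workers (int blocks, int actual_vectors)
{
  /* Integer division truncates toward zero, so any blocks value smaller
     than actual_vectors yields 0 workers.  */
  int workers = blocks / actual_vectors;

  /* Equivalent to MAX (workers, 1) in the patch: never return fewer
     than one worker.  */
  return workers > 1 ? workers : 1;
}
```

For the values quoted in the commit message, clamp_workers (768, 1024) evaluates to 1 instead of 0. When blocks is at least actual_vectors, the quotient passes through unchanged.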