gcc/libgomp/testsuite/libgomp.c-c++-common
Tom de Vries 5ed77fb3ed [libgomp, nvptx] Fix hang in gomp_team_barrier_wait_end
Consider the following omp fragment.
...
  #pragma omp target
  #pragma omp parallel num_threads (2)
  #pragma omp task
    ;
...

This hangs at -O0 for nvptx.

Investigating the behaviour gives us the following trace of events:
- both threads execute GOMP_task, where they:
  - deposit a task, and
  - execute gomp_team_barrier_wake
- thread 1 executes gomp_team_barrier_wait_end and, not being the last thread,
  proceeds to wait at the team barrier
- thread 0 executes gomp_team_barrier_wait_end and, being the last thread, it
  calls gomp_barrier_handle_tasks, where it:
  - executes both tasks and marks the team barrier done
  - executes a gomp_team_barrier_wake which wakes up thread 1
- thread 1 exits the team barrier
- thread 0 returns from gomp_barrier_handle_tasks and goes to wait at
  the team barrier.
- thread 0 hangs.

To understand why there is a hang here, it's good to understand how things
are set up for nvptx.  The libgomp/config/nvptx/bar.c implementation is
a copy of the libgomp/config/linux/bar.c implementation, with uses of both
futex_wake and do_wait replaced with uses of ptx insn bar.sync:
...
  if (bar->total > 1)
    asm ("bar.sync 1, %0;" : : "r" (32 * bar->total));
...

The point where thread 0 goes to wait at the team barrier, corresponds in
the linux implementation with a do_wait.  In the linux case, the call to
do_wait doesn't hang, because it's waiting for bar->generation to become
a certain value, and if bar->generation already has that value, it just
proceeds, without any need for coordination with other threads.

In the nvptx case, the bar.sync waits until thread 1 joins it in the same
logical barrier, which never happens: thread 1 is lingering in the
thread pool at the thread pool barrier (using a different logical barrier),
waiting to join a new team.
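
The contrast between the two waits can be sketched with a small host-side
model.  This is only an illustration under stated assumptions, not libgomp
code: the names generation_wait_blocks, rendezvous_arrive_blocks and
late_arrival_outcome are hypothetical, and the generation value 4 is
arbitrary.  It replays the last step of the trace, where thread 0 arrives
after thread 1 has already left.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model, not libgomp code.  */

/* Generation-style wait (the linux do_wait case as described above):
   the caller blocks only while the generation has not yet reached the
   value it is waiting for.  */
static bool
generation_wait_blocks (unsigned generation, unsigned awaited)
{
  return generation != awaited;
}

/* Counting rendezvous (the bar.sync case): the caller is released only
   once EXPECTED threads have arrived at the same logical barrier.  */
struct rendezvous
{
  int expected;
  int arrived;
};

static bool
rendezvous_arrive_blocks (struct rendezvous *r)
{
  r->arrived++;
  return r->arrived < r->expected;
}

/* Replay the end of the trace: thread 1 has already left, the barrier is
   marked done, and thread 0 arrives late.  Returns a bitmask: bit 0 is
   set if the generation-style wait would block, bit 1 is set if the
   counting rendezvous would block.  */
static int
late_arrival_outcome (void)
{
  unsigned generation = 4, awaited = 4;       /* already at the awaited value */
  struct rendezvous team_barrier = { 2, 0 };  /* thread 1 never arrives */
  int outcome = 0;

  if (generation_wait_blocks (generation, awaited))
    outcome |= 1;
  if (rendezvous_arrive_blocks (&team_barrier))
    outcome |= 2;
  return outcome;
}
```

In this model only the counting rendezvous reports that the late thread
would block, which corresponds to the hang of thread 0 described above.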

The easiest way to fix this is to revert to the posix implementation for
bar.{c,h}.  That however falls back on a busy-waiting approach, and
does not take advantage of the ptx bar.sync insn.

Instead, we revert to the linux implementation for bar.c,
and implement bar.c local functions futex_wait and futex_wake using the
bar.sync insn.

The bar.sync insn takes an argument specifying how many threads are
participating, and that doesn't play well with the futex semantics, where it's
not clear in advance how many threads will be woken up.

This is solved by waking up all waiting threads each time a futex_wait or
futex_wake happens, and possibly going back to sleep with an updated thread
count.
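
The wake-everyone-and-recheck idea can be illustrated with a single-threaded
model.  This is a sketch under assumptions, not the code in
config/nvptx/bar.c: struct sleeper, wake_all_and_recheck and model_demo are
hypothetical names, and real threads and the bar.sync insn are replaced by a
plain array walk.

```c
#include <assert.h>

/* Hypothetical single-threaded model, not the code in
   config/nvptx/bar.c.  */
struct sleeper
{
  int val;      /* value the sleeper saw when it went to sleep */
  int asleep;   /* nonzero while the sleeper is still waiting */
};

/* Wake every sleeper.  Each one re-reads WORD: if it still equals the
   value it went to sleep on, it goes back to sleep; otherwise it leaves.
   Returns the number still asleep, i.e. the updated thread count for
   the next round.  */
static int
wake_all_and_recheck (struct sleeper *s, int n, int word)
{
  int still_asleep = 0;

  for (int i = 0; i < n; i++)
    {
      if (!s[i].asleep)
        continue;
      if (word != s[i].val)
        s[i].asleep = 0;
      else
        still_asleep++;
    }
  return still_asleep;
}

/* Two threads sleep on value 0.  A third joins (a futex_wait event), then
   the word changes to 1 and a futex_wake event happens.  Returns the
   sleeper count after each event, packed as tens and units.  */
static int
model_demo (void)
{
  struct sleeper s[3] = { { 0, 1 }, { 0, 1 }, { 0, 0 } };
  int after_wait, after_wake;

  s[2].asleep = 1;                                /* third thread waits */
  after_wait = wake_all_and_recheck (s, 3, 0);    /* word unchanged */
  after_wake = wake_all_and_recheck (s, 3, 1);    /* word is now 1 */
  return after_wait * 10 + after_wake;
}
```

After the futex_wait event all three model threads go back to sleep with
the count updated to three; after the futex_wake event none remain asleep.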

Tested libgomp on x86_64 with nvptx accelerator.

libgomp/ChangeLog:

2021-04-20  Tom de Vries  <tdevries@suse.de>

	PR target/99555
	* config/nvptx/bar.c (generation_to_barrier): New function, copied
	from config/rtems/bar.c.
	(futex_wait, futex_wake): New functions.
	(do_spin, do_wait): New functions, copied from config/linux/wait.h.
	(gomp_barrier_wait_end, gomp_barrier_wait_last)
	(gomp_team_barrier_wake, gomp_team_barrier_wait_end)
	(gomp_team_barrier_wait_cancel_end, gomp_team_barrier_cancel): Remove
	and replace with include of config/linux/bar.c.
	* config/nvptx/bar.h (gomp_barrier_t): Add fields waiters and lock.
	(gomp_barrier_init): Init new fields.
	* testsuite/libgomp.c-c++-common/task-detach-6.c: Remove nvptx-specific
	workarounds.
	* testsuite/libgomp.c/pr99555-1.c: Same.
	* testsuite/libgomp.fortran/task-detach-6.f90: Same.
2022-02-22 15:48:03 +01:00
alloc-1.c openmp: Add basic library allocator support. 2020-05-19 10:11:01 +02:00
alloc-2.c libgomp: Add Fortran routine support for allocators 2020-07-15 08:33:20 +02:00
alloc-3.c openmp: Add basic library allocator support. 2020-05-19 10:11:01 +02:00
alloc-4.c openmp: Add omp_aligned_{,c}alloc and omp_{c,re}alloc 2021-09-30 09:30:18 +02:00
alloc-5.c openmp: Add omp_aligned_{,c}alloc and omp_{c,re}alloc 2021-09-30 09:30:18 +02:00
alloc-6.c openmp: Add omp_aligned_{,c}alloc and omp_{c,re}alloc 2021-09-30 09:30:18 +02:00
alloc-7.c openmp: Add omp_aligned_{,c}alloc and omp_{c,re}alloc 2021-09-30 09:30:18 +02:00
alloc-8.c openmp: Add omp_aligned_{,c}alloc and omp_{c,re}alloc 2021-09-30 09:30:18 +02:00
alloc-9.c libgomp: alloc* test fixes [PR102628, PR102668] 2021-10-12 09:30:41 +02:00
alloc-10.c libgomp: Add tests for omp_atv_serialized and deprecate omp_atv_sequential. 2021-10-11 04:34:51 -07:00
allocate-1.c openmp: Add support for non-VLA {,first}private allocate on omp task 2020-11-14 01:46:16 +01:00
allocate-2.c openmp: Add support for allocator and align modifiers on allocate clauses 2021-09-22 09:29:13 +02:00
allocate-3.c openmp: Add support for allocator and align modifiers on allocate clauses 2021-09-22 09:29:13 +02:00
atomic-18.c
atomic-19.c openmp: Add support for OpenMP 5.1 atomics for C++ 2021-09-17 11:28:31 +02:00
atomic-20.c openmp: Add support for OpenMP 5.1 atomics for C++ 2021-09-17 11:28:31 +02:00
atomic-21.c openmp: Add support for OpenMP 5.1 atomics for C++ 2021-09-17 11:28:31 +02:00
cancel-parallel-1.c builtin-types.def (BT_FN_VOID_BOOL, [...]): New. 2018-11-08 18:13:04 +01:00
cancel-taskgroup-1.c
cancel-taskgroup-2.c
cancel-taskgroup-3.c re PR libgomp/87995 (libgomp.c/../libgomp.c-c++-common/cancel-taskgroup-3.c fails consistently after r265930) 2018-12-08 09:58:24 +01:00
cancel-taskgroup-4.c omp-low.c (check_omp_nesting_restrictions): Allow cancel or cancellation point with taskgroup clause inside of taskloop. 2018-12-02 13:48:42 +01:00
critical-hint-1.c critical-hint-*.{c,f90}: Move from gcc/testsuite to libgomp/testsuite 2020-07-22 12:14:22 +02:00
critical-hint-2.c critical-hint-*.{c,f90}: Move from gcc/testsuite to libgomp/testsuite 2020-07-22 12:14:22 +02:00
declare_target-1.c OpenMP: Fix 'omp declare target' handling for vars [PR99509] 2021-03-15 10:12:58 +01:00
default-1.c openmp: Allow private or firstprivate arguments to default clause even for C/C++ 2021-09-18 09:47:25 +02:00
depend-iterator-1.c builtin-types.def (BT_FN_VOID_BOOL, [...]): New. 2018-11-08 18:13:04 +01:00
depend-iterator-2.c builtin-types.def (BT_FN_VOID_BOOL, [...]): New. 2018-11-08 18:13:04 +01:00
depend-mutexinout-1.c builtin-types.def (BT_FN_VOID_BOOL, [...]): New. 2018-11-08 18:13:04 +01:00
depend-mutexinout-2.c builtin-types.def (BT_FN_VOID_BOOL, [...]): New. 2018-11-08 18:13:04 +01:00
depobj-1.c builtin-types.def (BT_FN_VOID_BOOL, [...]): New. 2018-11-08 18:13:04 +01:00
display-affinity-1.c builtin-types.def (BT_FN_VOID_BOOL, [...]): New. 2018-11-08 18:13:04 +01:00
error-1.c libgomp.*/error-1.{c,f90}: Fix dg-output newline pattern 2021-09-03 15:27:00 +02:00
for-1.c
for-1.h
for-2.c
for-2.h openmp: Handle reduction clauses on host teams construct [PR96459] 2020-08-05 10:40:10 +02:00
for-3.c openmp: Fix up handling of target constructs in offloaded routines [PR100573] 2021-05-26 11:28:42 +02:00
for-4.c
for-5.c
for-6.c
for-7.c builtin-types.def (BT_FN_VOID_BOOL, [...]): New. 2018-11-08 18:13:04 +01:00
for-8.c builtin-types.def (BT_FN_VOID_BOOL, [...]): New. 2018-11-08 18:13:04 +01:00
for-9.c builtin-types.def (BT_FN_VOID_BOOL, [...]): New. 2018-11-08 18:13:04 +01:00
for-10.c builtin-types.def (BT_FN_VOID_BOOL, [...]): New. 2018-11-08 18:13:04 +01:00
for-11.c builtin-types.def (BT_FN_VOID_BOOL, [...]): New. 2018-11-08 18:13:04 +01:00
for-12.c builtin-types.def (BT_FN_VOID_BOOL, [...]): New. 2018-11-08 18:13:04 +01:00
for-13.c builtin-types.def (BT_FN_VOID_BOOL, [...]): New. 2018-11-08 18:13:04 +01:00
for-14.c openmp: Handle reduction clauses on host teams construct [PR96459] 2020-08-05 10:40:10 +02:00
for-15.c builtin-types.def (BT_FN_VOID_BOOL, [...]): New. 2018-11-08 18:13:04 +01:00
for-16.c omp-builtins.def (BUILT_IN_GOMP_LOOP_NONMONOTONIC_RUNTIME_START, [...]): Fix up function types - remove one argument. 2018-12-12 23:47:55 +01:00
function-not-offloaded-aux.c [offloading] Error on missing symbols 2018-12-14 13:48:56 +00:00
function-not-offloaded.c libgomp/testsuite: Fix checks for dg-excess-errors 2021-04-21 20:07:19 +02:00
icv-3.c OpenMP: Add strictly nested API call check [PR102972] 2021-10-30 23:45:32 +02:00
icv-4.c OpenMP: Add strictly nested API call check [PR102972] 2021-10-30 23:45:32 +02:00
lastprivate-conditional-1.c tree-core.h (enum omp_clause_code): Add OMP_CLAUSE__CONDTEMP_. 2019-05-24 23:31:59 +02:00
lastprivate-conditional-2.c tree-core.h (enum omp_clause_code): Add OMP_CLAUSE__CONDTEMP_. 2019-05-24 23:31:59 +02:00
lastprivate-conditional-3.c omp-low.c (lower_omp_1): Look through ordered... 2019-05-27 23:31:40 +02:00
lastprivate-conditional-4.c gimplify.c (struct gimplify_omp_ctx): Add clauses member. 2019-05-29 09:51:43 +02:00
lastprivate-conditional-5.c gimplify.c (struct gimplify_omp_ctx): Add clauses member. 2019-05-29 09:51:43 +02:00
lastprivate-conditional-6.c gimplify.c (struct gimplify_omp_ctx): Add clauses member. 2019-05-29 09:51:43 +02:00
lastprivate-conditional-7.c Adjust more testcases for O2 vectorization enabling. 2021-10-09 16:28:11 +08:00
lastprivate-conditional-8.c Adjust more testcases for O2 vectorization enabling. 2021-10-09 16:28:11 +08:00
lastprivate-conditional-9.c gimplify.c (gimplify_scan_omp_clauses): Don't sorry_at on lastprivate conditional on combined for simd. 2019-06-04 14:49:03 +02:00
lastprivate-conditional-10.c gimplify.c (gimplify_scan_omp_clauses): Don't sorry_at on lastprivate conditional on combined for simd. 2019-06-04 14:49:03 +02:00
loop-1.c tree.def (OMP_LOOP): New tree code. 2019-07-20 13:21:42 +02:00
loop-13.c
loop-14.c
loop-15.c
masked-1.c openmp: Add support for OpenMP 5.1 masked construct 2021-08-12 22:41:17 +02:00
master-combined-1.c builtin-types.def (BT_FN_VOID_BOOL, [...]): New. 2018-11-08 18:13:04 +01:00
monotonic-1.c
monotonic-2.c
nested-parallel-unbalanced.c libgomp: disable barriers in nested teams 2020-09-29 11:48:04 +01:00
nonmonotonic-1.c
nonmonotonic-2.c
nothing-1.c openmp: Add nothing directive support 2021-08-18 11:10:43 +02:00
on_device_arch.h Improve Intel MIC offloading XFAILing for 'omp_get_device_num' 2022-01-13 13:09:36 +01:00
order-reproducible-1.c openmp: Differentiate between order(concurrent) and order(reproducible:concurrent) 2021-10-01 10:45:48 +02:00
order-reproducible-2.c openmp: Differentiate between order(concurrent) and order(reproducible:concurrent) 2021-10-01 10:45:48 +02:00
ordered-4.c
pause-1.c builtin-types.def (BT_FN_VOID_BOOL, [...]): New. 2018-11-08 18:13:04 +01:00
pause-2.c builtin-types.def (BT_FN_VOID_BOOL, [...]): New. 2018-11-08 18:13:04 +01:00
pr45784.c
pr64824.c
pr64868.c
pr66199-1.c
pr66199-2.c
pr66199-3.c
pr66199-4.c
pr66199-5.c
pr66199-6.c
pr66199-7.c
pr66199-8.c
pr66199-9.c
pr66199-10.c builtin-types.def (BT_FN_VOID_BOOL, [...]): New. 2018-11-08 18:13:04 +01:00
pr66199-11.c builtin-types.def (BT_FN_VOID_BOOL, [...]): New. 2018-11-08 18:13:04 +01:00
pr66199-12.c builtin-types.def (BT_FN_VOID_BOOL, [...]): New. 2018-11-08 18:13:04 +01:00
pr66199-13.c builtin-types.def (BT_FN_VOID_BOOL, [...]): New. 2018-11-08 18:13:04 +01:00
pr66199-14.c builtin-types.def (BT_FN_VOID_BOOL, [...]): New. 2018-11-08 18:13:04 +01:00
pr69389.c
pr81875.c
pr83046.c
pr93515.c openmp: Fix handling of non-addressable shared scalars in parallel nested inside of target [PR93515] 2020-02-06 09:19:08 +01:00
pr94366.c openmp - Fix up && and || reductions [PR94366] 2021-07-01 08:55:49 +02:00
pr96390.c [libgomp, testsuite, nvptx] Fix pr96390.c without CUDA 2022-02-22 10:23:20 +01:00
ptr-attach-1.c openmp: Implement OpenMP 5.0 base-pointer attachement and clause ordering 2020-11-10 03:36:58 -08:00
reduction-1.c OpenMP: Support complex/float in && and || reduction 2021-05-04 14:42:26 +02:00
reduction-2.c OpenMP: Support complex/float in && and || reduction 2021-05-04 14:42:26 +02:00
reduction-3.c OpenMP: Support complex/float in && and || reduction 2021-05-04 14:42:26 +02:00
reduction-4.c OpenMP: Support complex/float in && and || reduction 2021-05-04 14:42:26 +02:00
reduction-5.c Add 'default' to -foffload=; document that flag [PR67300] 2021-06-29 16:00:04 +02:00
reduction-6.c Add 'default' to -foffload=; document that flag [PR67300] 2021-06-29 16:00:04 +02:00
reduction-16.c Add 'default' to -foffload=; document that flag [PR67300] 2021-06-29 16:00:04 +02:00
reduction-17.c openmp: Fix reduction clause handling on teams distribute simd [PR99928] 2021-05-25 11:07:01 +02:00
refcount-1.c libgomp: Structure element mapping for OpenMP 5.0 2021-06-17 21:34:59 +08:00
scope-1.c openmp: Implement OpenMP 5.1 scope construct 2021-08-17 09:30:09 +02:00
simd-1.c builtin-types.def (BT_FN_VOID_BOOL, [...]): New. 2018-11-08 18:13:04 +01:00
simd-14.c
simd-15.c
simd-16.c
simd-17.c
struct-elem-1.c libgomp: Structure element mapping for OpenMP 5.0 2021-06-17 21:34:59 +08:00
struct-elem-2.c libgomp: Structure element mapping for OpenMP 5.0 2021-06-17 21:34:59 +08:00
struct-elem-3.c libgomp: Structure element mapping for OpenMP 5.0 2021-06-17 21:34:59 +08:00
struct-elem-4.c libgomp: Structure element mapping for OpenMP 5.0 2021-06-17 21:34:59 +08:00
struct-elem-5.c testsuite/101114: Adjust libgomp.c-c++-common/struct-elem-5.c testcase 2021-06-26 00:46:11 +08:00
target-1.c
target-2.c
target-10.c
target-13.c
target-40.c openmp: Also implicitly mark as declare target to functions mentioned in target regions 2020-05-14 09:48:32 +02:00
target-41.c openmp: Fix up handling of target constructs in offloaded routines [PR100573] 2021-05-26 11:28:42 +02:00
target-42.c openmp: Fix up handling of target constructs in offloaded routines [PR100573] 2021-05-26 11:28:42 +02:00
target-45.c Improve Intel MIC offloading XFAILing for 'omp_get_device_num' 2022-01-13 13:09:36 +01:00
target-has-device-addr-1.c C, C++, Fortran, OpenMP: Add 'has_device_addr' clause to 'target' construct. 2022-02-09 23:47:12 -08:00
target-implicit-map-1.c openmp: Relax handling of implicit map vs. existing device mappings 2021-11-12 20:29:48 +08:00
target-implicit-map-2.c OpenMP 5.0: Remove array section base-pointer mapping semantics and other front-end adjustments 2021-12-09 00:01:10 +08:00
target-in-reduction-1.c openmp: in_reduction clause support on target construct 2021-06-24 11:35:08 +02:00
target-in-reduction-2.c openmp: in_reduction clause support on target construct 2021-06-24 11:35:08 +02:00
task-detach-1.c openmp: Fix intermittent hanging of task-detach-6 libgomp tests [PR98738] 2021-02-25 14:47:11 -08:00
task-detach-2.c openmp: Fix intermittent hanging of task-detach-6 libgomp tests [PR98738] 2021-02-25 14:47:11 -08:00
task-detach-3.c openmp: Fix intermittent hanging of task-detach-6 libgomp tests [PR98738] 2021-02-25 14:47:11 -08:00
task-detach-4.c openmp: Fix intermittent hanging of task-detach-6 libgomp tests [PR98738] 2021-02-25 14:47:11 -08:00
task-detach-5.c openmp: Fix intermittent hanging of task-detach-6 libgomp tests [PR98738] 2021-02-25 14:47:11 -08:00
task-detach-6.c [libgomp, nvptx] Fix hang in gomp_team_barrier_wait_end 2022-02-22 15:48:03 +01:00
task-detach-7.c openmp: Fix intermittent hanging of task-detach-6 libgomp tests [PR98738] 2021-02-25 14:47:11 -08:00
task-detach-8.c openmp: Fix intermittent hanging of task-detach-6 libgomp tests [PR98738] 2021-02-25 14:47:11 -08:00
task-detach-9.c openmp: Fix intermittent hanging of task-detach-6 libgomp tests [PR98738] 2021-02-25 14:47:11 -08:00
task-detach-10.c openmp: Fix intermittent hanging of task-detach-6 libgomp tests [PR98738] 2021-02-25 14:47:11 -08:00
task-detach-11.c openmp: Fix intermittent hanging of task-detach-6 libgomp tests [PR98738] 2021-02-25 14:47:11 -08:00
task-detach-12.c OpenMP: detach - fix firstprivate handling 2021-05-13 00:14:34 +02:00
task-detach-13.c openmp: Notify team barrier of pending tasks in omp_fulfill_event 2021-05-17 13:15:08 -07:00
task-reduction-1.c builtin-types.def (BT_FN_VOID_BOOL, [...]): New. 2018-11-08 18:13:04 +01:00
task-reduction-2.c builtin-types.def (BT_FN_VOID_BOOL, [...]): New. 2018-11-08 18:13:04 +01:00
task-reduction-3.c builtin-types.def (BT_FN_VOID_BOOL, [...]): New. 2018-11-08 18:13:04 +01:00
task-reduction-4.c builtin-types.def (BT_FN_VOID_BOOL, [...]): New. 2018-11-08 18:13:04 +01:00
task-reduction-5.c builtin-types.def (BT_FN_VOID_BOOL, [...]): New. 2018-11-08 18:13:04 +01:00
task-reduction-6.c builtin-types.def (BT_FN_VOID_BOOL, [...]): New. 2018-11-08 18:13:04 +01:00
task-reduction-7.c builtin-types.def (BT_FN_VOID_BOOL, [...]): New. 2018-11-08 18:13:04 +01:00
task-reduction-8.c task-reduction-8.c (bar): Add in_reduction clause for s[0]. 2018-11-08 20:38:21 +01:00
task-reduction-9.c builtin-types.def (BT_FN_VOID_BOOL, [...]): New. 2018-11-08 18:13:04 +01:00
task-reduction-11.c builtin-types.def (BT_FN_VOID_BOOL, [...]): New. 2018-11-08 18:13:04 +01:00
task-reduction-12.c builtin-types.def (BT_FN_VOID_BOOL, [...]): New. 2018-11-08 18:13:04 +01:00
task-reduction-13.c workshare-reduction-1.c: New test. 2018-11-09 14:02:50 +01:00
task-reduction-14.c workshare-reduction-1.c: New test. 2018-11-09 14:02:50 +01:00
task-reduction-15.c openmp: Fix up *_reduction clause handling with UDRs on PARM_DECLs [PR101167] 2021-06-23 10:03:28 +02:00
task-reduction-16.c openmp: Implement OpenMP 5.1 scope construct 2021-08-17 09:30:09 +02:00
taskgroup-1.c
taskloop-1.c
taskloop-2.c
taskloop-3.c
taskloop-4.c openmp: Add support for strict modifier on grainsize/num_tasks clauses 2021-08-23 10:16:24 +02:00
taskloop-5.c openmp: Add support for strict modifier on grainsize/num_tasks clauses 2021-08-23 10:16:24 +02:00
taskloop-reduction-1.c builtin-types.def (BT_FN_VOID_BOOL, [...]): New. 2018-11-08 18:13:04 +01:00
taskloop-reduction-2.c builtin-types.def (BT_FN_VOID_BOOL, [...]): New. 2018-11-08 18:13:04 +01:00
taskloop-reduction-3.c builtin-types.def (BT_FN_VOID_BOOL, [...]): New. 2018-11-08 18:13:04 +01:00
taskloop-reduction-4.c builtin-types.def (BT_FN_VOID_BOOL, [...]): New. 2018-11-08 18:13:04 +01:00
taskwait-depend-1.c builtin-types.def (BT_FN_VOID_BOOL, [...]): New. 2018-11-08 18:13:04 +01:00
teams-1.c openmp: Add support for 2 argument num_teams clause 2021-11-11 09:42:47 +01:00
teams-2.c openmp: Honor OpenMP 5.1 num_teams lower bound 2021-11-12 12:41:22 +01:00
thread-limit-1.c openmp: Add support for thread_limit clause on target 2021-11-15 13:20:53 +01:00
udr-1.c
unmap-infinity-2.c OpenACC reference count overhaul 2019-12-20 01:20:16 +00:00
variable-not-offloaded.c libgomp/testsuite: Fix checks for dg-excess-errors 2021-04-21 20:07:19 +02:00