gcc/libgomp/testsuite/libgomp.c-c++-common/nested-parallel-unbalanced.c
Andrew Stubbs 6f51395197 libgomp: disable barriers in nested teams
Both GCN and NVPTX allow nested parallel regions, but the barrier
implementation did not allow the nested teams to run independently of each
other (due to hardware limitations).  This patch fixes that, under the
assumption that each thread will create a new subteam of one thread, by
simply not using barriers when there's no other thread to synchronise.
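
As a rough illustration of the idea only (not the code in bar.c), the guard
can be sketched as below.  The type sketch_barrier_t and both functions are
hypothetical stand-ins invented for this example; the real changes are in
the functions listed in the ChangeLog.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a team barrier; this is NOT libgomp's
   gomp_barrier_t, only an illustration of the single-thread guard.  */
typedef struct
{
  unsigned total;     /* Number of threads in the team.  */
  unsigned hw_waits;  /* How often the simulated hardware barrier ran.  */
} sketch_barrier_t;

/* Stand-in for the device barrier, which cannot be used independently
   by several nested teams at once.  Here it only counts its calls.  */
static void
sketch_hardware_barrier (sketch_barrier_t *bar)
{
  bar->hw_waits++;
}

/* Wait on BAR.  Returns true if the hardware barrier was used and
   false if it was skipped because the team has a single thread.  */
static bool
sketch_barrier_wait (sketch_barrier_t *bar)
{
  if (bar->total < 2)
    return false;  /* Lone thread: nothing to synchronise with.  */

  sketch_hardware_barrier (bar);
  return true;
}
```

With total set to 1, as in a nested subteam of one thread, the wait returns
without touching the shared barrier; a team of 8 still synchronises.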

libgomp/ChangeLog:

	* config/gcn/bar.c (gomp_barrier_wait_end): Skip the barrier if the
	total number of threads is one.
	(gomp_team_barrier_wake): Likewise.
	(gomp_team_barrier_wait_end): Likewise.
	(gomp_team_barrier_wait_cancel_end): Likewise.
	* config/nvptx/bar.c (gomp_barrier_wait_end): Likewise.
	(gomp_team_barrier_wake): Likewise.
	(gomp_team_barrier_wait_end): Likewise.
	(gomp_team_barrier_wait_cancel_end): Likewise.
	* testsuite/libgomp.c-c++-common/nested-parallel-unbalanced.c: New test.
2020-09-29 11:48:04 +01:00

/* Ensure that nested parallel regions work even when the number of loop
   iterations is not divisible by the number of threads.  */

#include <stdlib.h>

int main() {
  int A[30][40], B[30][40];
  size_t n = 30;

  for (size_t i = 0; i < 30; ++i)
    for (size_t j = 0; j < 40; ++j)
      A[i][j] = 42;

#pragma omp target map(A[0:30][0:40], B[0:30][0:40])
  {
#pragma omp parallel for num_threads(8)
    for (size_t i = 0; i < n; ++i)
      {
#pragma omp parallel for
        for (size_t j = 0; j < n; ++j)
          {
            B[i][j] = A[i][j];
          }
      }
  }

  for (size_t i = 0; i < n; ++i)
    for (size_t j = 0; j < n; ++j)
      if (B[i][j] != 42)
        abort ();
}