gcc/libgomp/testsuite/libgomp.c/usm-4.c

/* { dg-do run } */
/* { dg-require-effective-target omp_usm } */

#include <omp.h>
#include <stdint.h>

int
main ()
{
  int *a = (int *) omp_alloc(sizeof(int)*2, ompx_unified_shared_mem_alloc);
  if (!a)
    __builtin_abort ();

  a[0] = 42;
  a[1] = 43;

  uintptr_t a_p = (uintptr_t)a;

#pragma omp target enter data map(to:a[0:2])

#pragma omp target
    {
      if (a[0] != 42 || a_p != (uintptr_t)a)
	__builtin_abort ();
    }

#pragma omp target
    {
      if (a[1] != 43 || a_p != (uintptr_t)a)
	__builtin_abort ();
    }

#pragma omp target exit data map(delete:a[0:2])

  omp_free(a, ompx_unified_shared_mem_alloc);
  return 0;
}
openmp, nvptx: ompx_unified_shared_mem_alloc This adds support for using Cuda Managed Memory with omp_alloc. It will be used as the underpinnings for "requires unified_shared_memory" in a later patch. There are two new predefined allocators, ompx_unified_shared_mem_alloc and ompx_host_mem_alloc, plus corresponding memory spaces, which can be used to allocate memory in the "managed" space and explicitly on the host (it is intended that "malloc" will be intercepted by the compiler). The nvptx plugin is modified to make the necessary Cuda calls, and libgomp is modified to switch to shared-memory mode for USM allocated mappings. Backport of the patch posted at https://gcc.gnu.org/pipermail/gcc-patches/2022-March/591352.html libgomp/ChangeLog: * allocator.c (omp_max_predefined_alloc): Update. (omp_aligned_alloc): Don't fallback ompx_host_mem_alloc. (omp_aligned_calloc): Likewise. (omp_realloc): Likewise. * config/linux/allocator.c (linux_memspace_alloc): Handle USM. (linux_memspace_calloc): Handle USM. (linux_memspace_free): Handle USM. (linux_memspace_realloc): Handle USM. * config/nvptx/allocator.c (nvptx_memspace_alloc): Reject ompx_host_mem_alloc. (nvptx_memspace_calloc): Likewise. (nvptx_memspace_realloc): Likewise. * libgomp-plugin.h (GOMP_OFFLOAD_usm_alloc): New prototype. (GOMP_OFFLOAD_usm_free): New prototype. (GOMP_OFFLOAD_is_usm_ptr): New prototype. * libgomp.h (gomp_usm_alloc): New prototype. (gomp_usm_free): New prototype. (gomp_is_usm_ptr): New prototype. (struct gomp_device_descr): Add USM functions. * omp.h.in (omp_memspace_handle_t): Add ompx_unified_shared_mem_space and ompx_host_mem_space. (omp_allocator_handle_t): Add ompx_unified_shared_mem_alloc and ompx_host_mem_alloc. * omp_lib.f90.in: Likewise. * plugin/plugin-nvptx.c (nvptx_alloc): Add "usm" parameter. Call cuMemAllocManaged as appropriate. (GOMP_OFFLOAD_alloc): Move internals to ... (GOMP_OFFLOAD_alloc_1): ... this, and add usm parameter. (GOMP_OFFLOAD_usm_alloc): New function. (GOMP_OFFLOAD_usm_free): New function. (GOMP_OFFLOAD_is_usm_ptr): New function. * target.c (gomp_map_vars_internal): Add USM support. (gomp_usm_alloc): New function. (gomp_usm_free): New function. (gomp_load_plugin_for_device): New function. * testsuite/libgomp.c/usm-1.c: New test. * testsuite/libgomp.c/usm-2.c: New test. * testsuite/libgomp.c/usm-3.c: New test. * testsuite/libgomp.c/usm-4.c: New test. * testsuite/libgomp.c/usm-5.c: New test. 2022-03-11 13:37:58 +01:00			`/* { dg-do run } */`
amdgcn: libgomp plugin USM implementation Implement the Unified Shared Memory API calls in the GCN plugin. The allocate and free are pretty straight-forward because all "target" memory allocations are compatible with USM, on the right hardware. However, there's no known way to check what memory region was used, after the fact, so we use a splay tree to record allocations so we can answer "is_usm_ptr" later. libgomp/ChangeLog: * plugin/plugin-gcn.c (struct usm_splay_tree_key_s): New. (usm_splay_compare): New. (splay_tree_prefix): New. (GOMP_OFFLOAD_usm_alloc): New. (GOMP_OFFLOAD_usm_free): New. (GOMP_OFFLOAD_is_usm_ptr): New. (GOMP_OFFLOAD_supported_features): Move into the OpenMP API fold. Add GOMP_REQUIRES_UNIFIED_ADDRESS and GOMP_REQUIRES_UNIFIED_SHARED_MEMORY. (gomp_fatal): New. (splay_tree_c): New. * testsuite/lib/libgomp.exp (check_effective_target_omp_usm): New. * testsuite/libgomp.c++/usm-1.C: Use dg-require-effective-target. * testsuite/libgomp.c-c++-common/requires-1.c: Likewise. * testsuite/libgomp.c/usm-1.c: Likewise. * testsuite/libgomp.c/usm-2.c: Likewise. * testsuite/libgomp.c/usm-3.c: Likewise. * testsuite/libgomp.c/usm-4.c: Likewise. * testsuite/libgomp.c/usm-5.c: Likewise. * testsuite/libgomp.c/usm-6.c: Likewise. 2022-06-20 16:51:15 +02:00			`/* { dg-require-effective-target omp_usm } */`
openmp, nvptx: ompx_unified_shared_mem_alloc This adds support for using Cuda Managed Memory with omp_alloc. It will be used as the underpinnings for "requires unified_shared_memory" in a later patch. There are two new predefined allocators, ompx_unified_shared_mem_alloc and ompx_host_mem_alloc, plus corresponding memory spaces, which can be used to allocate memory in the "managed" space and explicitly on the host (it is intended that "malloc" will be intercepted by the compiler). The nvptx plugin is modified to make the necessary Cuda calls, and libgomp is modified to switch to shared-memory mode for USM allocated mappings. Backport of the patch posted at https://gcc.gnu.org/pipermail/gcc-patches/2022-March/591352.html libgomp/ChangeLog: * allocator.c (omp_max_predefined_alloc): Update. (omp_aligned_alloc): Don't fallback ompx_host_mem_alloc. (omp_aligned_calloc): Likewise. (omp_realloc): Likewise. * config/linux/allocator.c (linux_memspace_alloc): Handle USM. (linux_memspace_calloc): Handle USM. (linux_memspace_free): Handle USM. (linux_memspace_realloc): Handle USM. * config/nvptx/allocator.c (nvptx_memspace_alloc): Reject ompx_host_mem_alloc. (nvptx_memspace_calloc): Likewise. (nvptx_memspace_realloc): Likewise. * libgomp-plugin.h (GOMP_OFFLOAD_usm_alloc): New prototype. (GOMP_OFFLOAD_usm_free): New prototype. (GOMP_OFFLOAD_is_usm_ptr): New prototype. * libgomp.h (gomp_usm_alloc): New prototype. (gomp_usm_free): New prototype. (gomp_is_usm_ptr): New prototype. (struct gomp_device_descr): Add USM functions. * omp.h.in (omp_memspace_handle_t): Add ompx_unified_shared_mem_space and ompx_host_mem_space. (omp_allocator_handle_t): Add ompx_unified_shared_mem_alloc and ompx_host_mem_alloc. * omp_lib.f90.in: Likewise. * plugin/plugin-nvptx.c (nvptx_alloc): Add "usm" parameter. Call cuMemAllocManaged as appropriate. (GOMP_OFFLOAD_alloc): Move internals to ... (GOMP_OFFLOAD_alloc_1): ... this, and add usm parameter. (GOMP_OFFLOAD_usm_alloc): New function. (GOMP_OFFLOAD_usm_free): New function. (GOMP_OFFLOAD_is_usm_ptr): New function. * target.c (gomp_map_vars_internal): Add USM support. (gomp_usm_alloc): New function. (gomp_usm_free): New function. (gomp_load_plugin_for_device): New function. * testsuite/libgomp.c/usm-1.c: New test. * testsuite/libgomp.c/usm-2.c: New test. * testsuite/libgomp.c/usm-3.c: New test. * testsuite/libgomp.c/usm-4.c: New test. * testsuite/libgomp.c/usm-5.c: New test. 2022-03-11 13:37:58 +01:00
			`#include <omp.h>`
			`#include <stdint.h>`

			`int`
			`main ()`
			`{`
			`int a = (int ) omp_alloc(sizeof(int)*2, ompx_unified_shared_mem_alloc);`
			`if (!a)`
			`__builtin_abort ();`

			`a[0] = 42;`
			`a[1] = 43;`

			`uintptr_t a_p = (uintptr_t)a;`

			`#pragma omp target enter data map(to:a[0:2])`

			`#pragma omp target`
			`{`
			`if (a[0] != 42 \|\| a_p != (uintptr_t)a)`
			`__builtin_abort ();`
			`}`

			`#pragma omp target`
			`{`
			`if (a[1] != 43 \|\| a_p != (uintptr_t)a)`
			`__builtin_abort ();`
			`}`

			`#pragma omp target exit data map(delete:a[0:2])`

			`omp_free(a, ompx_unified_shared_mem_alloc);`
			`return 0;`
			`}`