2018-08-01 15:20:22 +02:00
|
|
|
CUDA_ONE_CALL (cuCtxCreate)
|
|
|
|
CUDA_ONE_CALL (cuCtxDestroy)
|
|
|
|
CUDA_ONE_CALL (cuCtxGetCurrent)
|
|
|
|
CUDA_ONE_CALL (cuCtxGetDevice)
|
|
|
|
CUDA_ONE_CALL (cuCtxPopCurrent)
|
|
|
|
CUDA_ONE_CALL (cuCtxPushCurrent)
|
|
|
|
CUDA_ONE_CALL (cuCtxSynchronize)
|
|
|
|
CUDA_ONE_CALL (cuDeviceGet)
|
|
|
|
CUDA_ONE_CALL (cuDeviceGetAttribute)
|
|
|
|
CUDA_ONE_CALL (cuDeviceGetCount)
|
Add OpenACC 2.6 `acc_get_property' support
Add generic support for the OpenACC 2.6 `acc_get_property' and
`acc_get_property_string' routines, as well as full handlers for the
host and the NVPTX offload targets and minimal handlers for the HSA,
Intel MIC, and AMD GCN offload targets.
Included are C/C++ and Fortran tests that, in particular, print
the property values for acc_property_vendor, acc_property_memory,
acc_property_free_memory, acc_property_name, and acc_property_driver.
The output looks as follows:
Vendor: GNU
Name: GOMP
Total memory: 0
Free memory: 0
Driver: 1.0
with the host driver (where the memory related properties are not
supported for the host device and yield 0, conforming to the standard)
and output like:
Vendor: Nvidia
Total memory: 12651462656
Free memory: 12202737664
Name: TITAN V
Driver: CUDA Driver 9.1
with the NVPTX driver.
2019-12-22 Maciej W. Rozycki <macro@codesourcery.com>
Frederik Harwath <frederik@codesourcery.com>
Thomas Schwinge <tschwinge@codesourcery.com>
include/
* gomp-constants.h (gomp_device_property): New enum.
libgomp/
* libgomp.h (gomp_device_descr): Add `get_property_func' member.
* libgomp-plugin.h (gomp_device_property_value): New union.
(gomp_device_property_value): New prototype.
* openacc.h (acc_device_t): Add `acc_device_current' enumeration
constant.
(acc_device_property_t): New enum.
(acc_get_property, acc_get_property_string): New prototypes.
* oacc-init.c (acc_get_device_type): Also assert that result
is not `acc_device_current'.
(get_property_any, acc_get_property, acc_get_property_string):
New functions.
* openacc.f90 (openacc_kinds): Add `acc_device_current' and
`acc_property_memory', `acc_property_free_memory',
`acc_property_name', `acc_property_vendor' and
`acc_property_driver' constants. Add `acc_device_property' data
type.
(openacc_internal): Add `acc_get_property' and
`acc_get_property_string' interfaces. Add `acc_get_property_h',
`acc_get_property_string_h', `acc_get_property_l' and
`acc_get_property_string_l'.
* oacc-host.c (host_get_property): New function.
(host_dispatch): Wire it.
* target.c (gomp_load_plugin_for_device): Handle `get_property'.
* libgomp.map (OACC_2.6): Add `acc_get_property', `acc_get_property_h_',
`acc_get_property_string' and `acc_get_property_string_h_' symbols.
* libgomp.texi (OpenACC Runtime Library Routines): Add
`acc_get_property'.
(acc_get_property): New node.
* plugin/plugin-gcn.c (GOMP_OFFLOAD_get_property): New
function (stub).
* plugin/plugin-hsa.c (GOMP_OFFLOAD_get_property): New function.
* plugin/plugin-nvptx.c (CUDA_CALLS): Add `cuDeviceGetName',
`cuDeviceTotalMem', `cuDriverGetVersion' and `cuMemGetInfo'
calls.
(GOMP_OFFLOAD_get_property): New function.
(struct ptx_device): Add new field "name".
(cuda_driver_version_s): Add new static variable ...
(nvptx_init): ... and init from here.
* testsuite/libgomp.oacc-c-c++-common/acc_get_property.c: New test.
* testsuite/libgomp.oacc-c-c++-common/acc_get_property-2.c: New test.
* testsuite/libgomp.oacc-c-c++-common/acc_get_property-3.c: New test.
* testsuite/libgomp.oacc-c-c++-common/acc_get_property-aux.c: New file
with test helper functions.
* testsuite/libgomp.oacc-fortran/acc_get_property.f90: New test.
liboffloadmic/
* plugin/libgomp-plugin-intelmic.cpp (GOMP_OFFLOAD_get_property):
New function.
Reviewed-by: Thomas Schwinge <thomas@codesourcery.com>
Co-Authored-By: Frederik Harwath <frederik@codesourcery.com>
Co-Authored-By: Thomas Schwinge <tschwinge@codesourcery.com>
From-SVN: r279710
2019-12-22 20:54:09 +01:00
|
|
|
CUDA_ONE_CALL (cuDeviceGetName)
|
|
|
|
CUDA_ONE_CALL (cuDeviceTotalMem)
|
|
|
|
CUDA_ONE_CALL (cuDriverGetVersion)
|
2018-08-01 15:20:22 +02:00
|
|
|
CUDA_ONE_CALL (cuEventCreate)
|
|
|
|
CUDA_ONE_CALL (cuEventDestroy)
|
|
|
|
CUDA_ONE_CALL (cuEventElapsedTime)
|
|
|
|
CUDA_ONE_CALL (cuEventQuery)
|
|
|
|
CUDA_ONE_CALL (cuEventRecord)
|
|
|
|
CUDA_ONE_CALL (cuEventSynchronize)
|
|
|
|
CUDA_ONE_CALL (cuFuncGetAttribute)
|
2018-08-08 16:26:28 +02:00
|
|
|
CUDA_ONE_CALL_MAYBE_NULL (cuGetErrorString)
|
2018-08-01 15:20:22 +02:00
|
|
|
CUDA_ONE_CALL (cuInit)
|
|
|
|
CUDA_ONE_CALL (cuLaunchKernel)
|
|
|
|
CUDA_ONE_CALL (cuLinkAddData)
|
2018-08-08 16:26:37 +02:00
|
|
|
CUDA_ONE_CALL_MAYBE_NULL (cuLinkAddData_v2)
|
2018-08-01 15:20:22 +02:00
|
|
|
CUDA_ONE_CALL (cuLinkComplete)
|
|
|
|
CUDA_ONE_CALL (cuLinkCreate)
|
2018-08-08 16:26:37 +02:00
|
|
|
CUDA_ONE_CALL_MAYBE_NULL (cuLinkCreate_v2)
|
2018-08-01 15:20:22 +02:00
|
|
|
CUDA_ONE_CALL (cuLinkDestroy)
|
|
|
|
CUDA_ONE_CALL (cuMemAlloc)
|
|
|
|
CUDA_ONE_CALL (cuMemAllocHost)
|
|
|
|
CUDA_ONE_CALL (cuMemcpy)
|
|
|
|
CUDA_ONE_CALL (cuMemcpyDtoDAsync)
|
|
|
|
CUDA_ONE_CALL (cuMemcpyDtoH)
|
|
|
|
CUDA_ONE_CALL (cuMemcpyDtoHAsync)
|
|
|
|
CUDA_ONE_CALL (cuMemcpyHtoD)
|
|
|
|
CUDA_ONE_CALL (cuMemcpyHtoDAsync)
|
|
|
|
CUDA_ONE_CALL (cuMemFree)
|
|
|
|
CUDA_ONE_CALL (cuMemFreeHost)
|
|
|
|
CUDA_ONE_CALL (cuMemGetAddressRange)
|
Add OpenACC 2.6 `acc_get_property' support
Add generic support for the OpenACC 2.6 `acc_get_property' and
`acc_get_property_string' routines, as well as full handlers for the
host and the NVPTX offload targets and minimal handlers for the HSA,
Intel MIC, and AMD GCN offload targets.
Included are C/C++ and Fortran tests that, in particular, print
the property values for acc_property_vendor, acc_property_memory,
acc_property_free_memory, acc_property_name, and acc_property_driver.
The output looks as follows:
Vendor: GNU
Name: GOMP
Total memory: 0
Free memory: 0
Driver: 1.0
with the host driver (where the memory related properties are not
supported for the host device and yield 0, conforming to the standard)
and output like:
Vendor: Nvidia
Total memory: 12651462656
Free memory: 12202737664
Name: TITAN V
Driver: CUDA Driver 9.1
with the NVPTX driver.
2019-12-22 Maciej W. Rozycki <macro@codesourcery.com>
Frederik Harwath <frederik@codesourcery.com>
Thomas Schwinge <tschwinge@codesourcery.com>
include/
* gomp-constants.h (gomp_device_property): New enum.
libgomp/
* libgomp.h (gomp_device_descr): Add `get_property_func' member.
* libgomp-plugin.h (gomp_device_property_value): New union.
(gomp_device_property_value): New prototype.
* openacc.h (acc_device_t): Add `acc_device_current' enumeration
constant.
(acc_device_property_t): New enum.
(acc_get_property, acc_get_property_string): New prototypes.
* oacc-init.c (acc_get_device_type): Also assert that result
is not `acc_device_current'.
(get_property_any, acc_get_property, acc_get_property_string):
New functions.
* openacc.f90 (openacc_kinds): Add `acc_device_current' and
`acc_property_memory', `acc_property_free_memory',
`acc_property_name', `acc_property_vendor' and
`acc_property_driver' constants. Add `acc_device_property' data
type.
(openacc_internal): Add `acc_get_property' and
`acc_get_property_string' interfaces. Add `acc_get_property_h',
`acc_get_property_string_h', `acc_get_property_l' and
`acc_get_property_string_l'.
* oacc-host.c (host_get_property): New function.
(host_dispatch): Wire it.
* target.c (gomp_load_plugin_for_device): Handle `get_property'.
* libgomp.map (OACC_2.6): Add `acc_get_property', `acc_get_property_h_',
`acc_get_property_string' and `acc_get_property_string_h_' symbols.
* libgomp.texi (OpenACC Runtime Library Routines): Add
`acc_get_property'.
(acc_get_property): New node.
* plugin/plugin-gcn.c (GOMP_OFFLOAD_get_property): New
function (stub).
* plugin/plugin-hsa.c (GOMP_OFFLOAD_get_property): New function.
* plugin/plugin-nvptx.c (CUDA_CALLS): Add `cuDeviceGetName',
`cuDeviceTotalMem', `cuDriverGetVersion' and `cuMemGetInfo'
calls.
(GOMP_OFFLOAD_get_property): New function.
(struct ptx_device): Add new field "name".
(cuda_driver_version_s): Add new static variable ...
(nvptx_init): ... and init from here.
* testsuite/libgomp.oacc-c-c++-common/acc_get_property.c: New test.
* testsuite/libgomp.oacc-c-c++-common/acc_get_property-2.c: New test.
* testsuite/libgomp.oacc-c-c++-common/acc_get_property-3.c: New test.
* testsuite/libgomp.oacc-c-c++-common/acc_get_property-aux.c: New file
with test helper functions.
* testsuite/libgomp.oacc-fortran/acc_get_property.f90: New test.
liboffloadmic/
* plugin/libgomp-plugin-intelmic.cpp (GOMP_OFFLOAD_get_property):
New function.
Reviewed-by: Thomas Schwinge <thomas@codesourcery.com>
Co-Authored-By: Frederik Harwath <frederik@codesourcery.com>
Co-Authored-By: Thomas Schwinge <tschwinge@codesourcery.com>
From-SVN: r279710
2019-12-22 20:54:09 +01:00
|
|
|
CUDA_ONE_CALL (cuMemGetInfo)
|
2018-08-01 15:20:22 +02:00
|
|
|
CUDA_ONE_CALL (cuMemHostGetDevicePointer)
|
|
|
|
CUDA_ONE_CALL (cuModuleGetFunction)
|
|
|
|
CUDA_ONE_CALL (cuModuleGetGlobal)
|
|
|
|
CUDA_ONE_CALL (cuModuleLoad)
|
|
|
|
CUDA_ONE_CALL (cuModuleLoadData)
|
|
|
|
CUDA_ONE_CALL (cuModuleUnload)
|
2018-08-13 14:04:24 +02:00
|
|
|
CUDA_ONE_CALL_MAYBE_NULL (cuOccupancyMaxPotentialBlockSize)
|
2019-05-13 Chung-Lin Tang <cltang@codesourcery.com>
Reviewed-by: Thomas Schwinge <thomas@codesourcery.com>
libgomp/
* libgomp-plugin.h (struct goacc_asyncqueue): Declare.
(struct goacc_asyncqueue_list): Likewise.
(goacc_aq): Likewise.
(goacc_aq_list): Likewise.
(GOMP_OFFLOAD_openacc_register_async_cleanup): Remove.
(GOMP_OFFLOAD_openacc_async_test): Remove.
(GOMP_OFFLOAD_openacc_async_test_all): Remove.
(GOMP_OFFLOAD_openacc_async_wait): Remove.
(GOMP_OFFLOAD_openacc_async_wait_async): Remove.
(GOMP_OFFLOAD_openacc_async_wait_all): Remove.
(GOMP_OFFLOAD_openacc_async_wait_all_async): Remove.
(GOMP_OFFLOAD_openacc_async_set_async): Remove.
(GOMP_OFFLOAD_openacc_exec): Adjust declaration.
(GOMP_OFFLOAD_openacc_cuda_get_stream): Likewise.
(GOMP_OFFLOAD_openacc_cuda_set_stream): Likewise.
(GOMP_OFFLOAD_openacc_async_exec): Declare.
(GOMP_OFFLOAD_openacc_async_construct): Declare.
(GOMP_OFFLOAD_openacc_async_destruct): Declare.
(GOMP_OFFLOAD_openacc_async_test): Declare.
(GOMP_OFFLOAD_openacc_async_synchronize): Declare.
(GOMP_OFFLOAD_openacc_async_serialize): Declare.
(GOMP_OFFLOAD_openacc_async_queue_callback): Declare.
(GOMP_OFFLOAD_openacc_async_host2dev): Declare.
(GOMP_OFFLOAD_openacc_async_dev2host): Declare.
* libgomp.h (struct acc_dispatch_t): Define 'async' sub-struct.
(gomp_acc_insert_pointer): Adjust declaration.
(gomp_copy_host2dev): New declaration.
(gomp_copy_dev2host): Likewise.
(gomp_map_vars_async): Likewise.
(gomp_unmap_tgt): Likewise.
(gomp_unmap_vars_async): Likewise.
(gomp_fini_device): Likewise.
* oacc-async.c (get_goacc_thread): New function.
(get_goacc_thread_device): New function.
(lookup_goacc_asyncqueue): New function.
(get_goacc_asyncqueue): New function.
(acc_async_test): Adjust code to use new async design.
(acc_async_test_all): Likewise.
(acc_wait): Likewise.
(acc_wait_async): Likewise.
(acc_wait_all): Likewise.
(acc_wait_all_async): Likewise.
(goacc_async_free): New function.
(goacc_init_asyncqueues): Likewise.
(goacc_fini_asyncqueues): Likewise.
* oacc-cuda.c (acc_get_cuda_stream): Adjust code to use new async
design.
(acc_set_cuda_stream): Likewise.
* oacc-host.c (host_openacc_exec): Adjust parameters, remove 'async'.
(host_openacc_register_async_cleanup): Remove.
(host_openacc_async_exec): New function.
(host_openacc_async_test): Adjust parameters.
(host_openacc_async_test_all): Remove.
(host_openacc_async_wait): Remove.
(host_openacc_async_wait_async): Remove.
(host_openacc_async_wait_all): Remove.
(host_openacc_async_wait_all_async): Remove.
(host_openacc_async_set_async): Remove.
(host_openacc_async_synchronize): New function.
(host_openacc_async_serialize): New function.
(host_openacc_async_host2dev): New function.
(host_openacc_async_dev2host): New function.
(host_openacc_async_queue_callback): New function.
(host_openacc_async_construct): New function.
(host_openacc_async_destruct): New function.
(struct gomp_device_descr host_dispatch): Remove initialization of old
interface, add intialization of new async sub-struct.
* oacc-init.c (acc_shutdown_1): Adjust to use gomp_fini_device.
(goacc_attach_host_thread_to_device): Remove old async code usage.
* oacc-int.h (goacc_init_asyncqueues): New declaration.
(goacc_fini_asyncqueues): Likewise.
(goacc_async_copyout_unmap_vars): Likewise.
(goacc_async_free): Likewise.
(get_goacc_asyncqueue): Likewise.
(lookup_goacc_asyncqueue): Likewise.
* oacc-mem.c (memcpy_tofrom_device): Adjust code to use new async
design.
(present_create_copy): Adjust code to use new async design.
(delete_copyout): Likewise.
(update_dev_host): Likewise.
(gomp_acc_insert_pointer): Add async parameter, adjust code to use new
async design.
(gomp_acc_remove_pointer): Adjust code to use new async design.
* oacc-parallel.c (GOACC_parallel_keyed): Adjust code to use new async
design.
(GOACC_enter_exit_data): Likewise.
(goacc_wait): Likewise.
(GOACC_update): Likewise.
* oacc-plugin.c (GOMP_PLUGIN_async_unmap_vars): Change to assert fail
when called, warn as obsolete in comment.
* target.c (goacc_device_copy_async): New function.
(gomp_copy_host2dev): Remove 'static', add goacc_asyncqueue parameter,
add goacc_device_copy_async case.
(gomp_copy_dev2host): Likewise.
(gomp_map_vars_existing): Add goacc_asyncqueue parameter, adjust code.
(gomp_map_pointer): Likewise.
(gomp_map_fields_existing): Likewise.
(gomp_map_vars_internal): New always_inline function, renamed from
gomp_map_vars.
(gomp_map_vars): Implement by calling gomp_map_vars_internal.
(gomp_map_vars_async): Implement by calling gomp_map_vars_internal,
passing goacc_asyncqueue argument.
(gomp_unmap_tgt): Remove static, add attribute_hidden.
(gomp_unref_tgt): New function.
(gomp_unmap_vars_internal): New always_inline function, renamed from
gomp_unmap_vars.
(gomp_unmap_vars): Implement by calling gomp_unmap_vars_internal.
(gomp_unmap_vars_async): Implement by calling
gomp_unmap_vars_internal, passing goacc_asyncqueue argument.
(gomp_fini_device): New function.
(gomp_exit_data): Adjust gomp_copy_dev2host call.
(gomp_load_plugin_for_device): Remove old interface, adjust to load
new async interface.
(gomp_target_fini): Adjust code to call gomp_fini_device.
* plugin/plugin-nvptx.c (struct cuda_map): Remove.
(struct ptx_stream): Remove.
(struct nvptx_thread): Remove current_stream field.
(cuda_map_create): Remove.
(cuda_map_destroy): Remove.
(map_init): Remove.
(map_fini): Remove.
(map_pop): Remove.
(map_push): Remove.
(struct goacc_asyncqueue): Define.
(struct nvptx_callback): Define.
(struct ptx_free_block): Define.
(struct ptx_device): Remove null_stream, active_streams, async_streams,
stream_lock, and next fields.
(enum ptx_event_type): Remove.
(struct ptx_event): Remove.
(ptx_event_lock): Remove.
(ptx_events): Remove.
(init_streams_for_device): Remove.
(fini_streams_for_device): Remove.
(select_stream_for_async): Remove.
(nvptx_init): Remove ptx_events and ptx_event_lock references.
(nvptx_attach_host_thread_to_device): Remove CUDA_ERROR_NOT_PERMITTED
case.
(nvptx_open_device): Add free_blocks initialization, remove
init_streams_for_device call.
(nvptx_close_device): Remove fini_streams_for_device call, add
free_blocks destruct code.
(event_gc): Remove.
(event_add): Remove.
(nvptx_exec): Adjust parameters and code.
(nvptx_free): Likewise.
(nvptx_host2dev): Remove.
(nvptx_dev2host): Remove.
(nvptx_set_async): Remove.
(nvptx_async_test): Remove.
(nvptx_async_test_all): Remove.
(nvptx_wait): Remove.
(nvptx_wait_async): Remove.
(nvptx_wait_all): Remove.
(nvptx_wait_all_async): Remove.
(nvptx_get_cuda_stream): Remove.
(nvptx_set_cuda_stream): Remove.
(GOMP_OFFLOAD_alloc): Adjust code.
(GOMP_OFFLOAD_free): Likewise.
(GOMP_OFFLOAD_openacc_register_async_cleanup): Remove.
(GOMP_OFFLOAD_openacc_exec): Adjust parameters and code.
(GOMP_OFFLOAD_openacc_async_test_all): Remove.
(GOMP_OFFLOAD_openacc_async_wait): Remove.
(GOMP_OFFLOAD_openacc_async_wait_async): Remove.
(GOMP_OFFLOAD_openacc_async_wait_all): Remove.
(GOMP_OFFLOAD_openacc_async_wait_all_async): Remove.
(GOMP_OFFLOAD_openacc_async_set_async): Remove.
(cuda_free_argmem): New function.
(GOMP_OFFLOAD_openacc_async_exec): New plugin hook function.
(GOMP_OFFLOAD_openacc_create_thread_data): Adjust code.
(GOMP_OFFLOAD_openacc_cuda_get_stream): Adjust code.
(GOMP_OFFLOAD_openacc_cuda_set_stream): Adjust code.
(GOMP_OFFLOAD_openacc_async_construct): New plugin hook function.
(GOMP_OFFLOAD_openacc_async_destruct): New plugin hook function.
(GOMP_OFFLOAD_openacc_async_test): Remove and re-implement.
(GOMP_OFFLOAD_openacc_async_synchronize): New plugin hook function.
(GOMP_OFFLOAD_openacc_async_serialize): New plugin hook function.
(GOMP_OFFLOAD_openacc_async_queue_callback): New plugin hook function.
(cuda_callback_wrapper): New function.
(cuda_memcpy_sanity_check): New function.
(GOMP_OFFLOAD_host2dev): Remove and re-implement.
(GOMP_OFFLOAD_dev2host): Remove and re-implement.
(GOMP_OFFLOAD_openacc_async_host2dev): New plugin hook function.
(GOMP_OFFLOAD_openacc_async_dev2host): New plugin hook function.
From-SVN: r271128
2019-05-13 15:32:00 +02:00
|
|
|
CUDA_ONE_CALL (cuStreamAddCallback)
|
2018-08-01 15:20:22 +02:00
|
|
|
CUDA_ONE_CALL (cuStreamCreate)
|
|
|
|
CUDA_ONE_CALL (cuStreamDestroy)
|
|
|
|
CUDA_ONE_CALL (cuStreamQuery)
|
|
|
|
CUDA_ONE_CALL (cuStreamSynchronize)
|
|
|
|
CUDA_ONE_CALL (cuStreamWaitEvent)
|