OpenE2K/gcc - gcc - Expired Mentality Git

Author	SHA1	Message	Date
Cesar Philippidis	2049befdd0	[nvptx] Remove use of CUDA unified memory in libgomp libgomp/ * plugin/plugin-nvptx.c (struct cuda_map): New. (struct ptx_stream): Replace d, h, h_begin, h_end, h_next, h_prev, h_tail with (cuda_map *) map. (cuda_map_create): New function. (cuda_map_destroy): New function. (map_init): Update to use a linked list of cuda_map objects. (map_fini): Likewise. (map_pop): Likewise. (map_push): Likewise. Return CUdeviceptr instead of void. (init_streams_for_device): Remove stales references to ptx_stream members. (select_stream_for_async): Likewise. (nvptx_exec): Update call to map_init. From-SVN: r264397	2018-09-18 08:41:54 -07:00
Cesar Philippidis	bd9b3d3d1a	[nvptx] Use CUDA driver API to select default runtime launch geometry The CUDA driver API starting version 6.5 offers a set of runtime functions to calculate several occupancy-related measures, as a replacement for the occupancy calculator spreadsheet. This patch adds a heuristic for default runtime launch geometry, based on the new runtime function cuOccupancyMaxPotentialBlockSize. Build on x86_64 with nvptx accelerator and ran libgomp testsuite. 2018-08-13 Cesar Philippidis <cesar@codesourcery.com> Tom de Vries <tdevries@suse.de> PR target/85590 * plugin/cuda/cuda.h (CUoccupancyB2DSize): New typedef. (cuOccupancyMaxPotentialBlockSize): Declare. * plugin/cuda-lib.def (cuOccupancyMaxPotentialBlockSize): New CUDA_ONE_CALL_MAYBE_NULL. * plugin/plugin-nvptx.c (CUDA_VERSION < 6050): Define CUoccupancyB2DSize and declare cuOccupancyMaxPotentialBlockSize. (nvptx_exec): Use cuOccupancyMaxPotentialBlockSize to set the default num_gangs and num_workers when the driver supports it. Co-Authored-By: Tom de Vries <tdevries@suse.de> From-SVN: r263505	2018-08-13 12:04:24 +00:00
Tom de Vries	8e09a12f01	[libgomp, nvptx] Fall back to cuLinkAddData/cuLinkCreate if _v2 not found Cuda driver api functions cuLinkAddData and cuLinkCreate are available starting version 5.5. In version 6.5, they are remapped onto _v2 versions. The dlopen interface of the libgomp nvptx plugin uses the _v2 versions, so it won't work with a cuda driver with driver api version lower than 6.5. This patch fixes the problem by testing for the presence of the _v2 versions, and falling back to the original versions in case of absence of the _v2 versions. Build on x86_64 with nvptx accelerator and reg-tested libgomp, both with and without --without-cuda-driver. 2018-08-08 Tom de Vries <tdevries@suse.de> * plugin/cuda-lib.def (cuLinkAddData_v2, cuLinkCreate_v2): Declare using CUDA_ONE_CALL_MAYBE_NULL. * plugin/plugin-nvptx.c (cuLinkAddData, cuLinkCreate): Undef and declare. (cuLinkAddData_v2, cuLinkCreate_v2): Declare. (link_ptx): Fall back to cuLinkAddData/cuLinkCreate if the _v2 versions are not found. From-SVN: r263408	2018-08-08 14:26:37 +00:00
Tom de Vries	cedd9bd016	[libgomp, nvptx] Allow cuGetErrorString to be NULL Cuda driver api function cuGetErrorString is available in version 6.0 and higher. Currently, when the driver that is used does not contain this function, the libgomp nvptx plugin will not build (PLUGIN_NVPTX_DYNAMIC == 0) or run (PLUGIN_NVPTX_DYNAMIC == 1). This patch fixes this problem by testing for the presence of the function, and handling absence. Build on x86_64 with nvptx accelerator and reg-tested libgomp, both with and without --without-cuda-driver. 2018-08-08 Tom de Vries <tdevries@suse.de> * plugin/cuda-lib.def (cuGetErrorString): Use CUDA_ONE_CALL_MAYBE_NULL. * plugin/plugin-nvptx.c (cuda_error): Handle if cuGetErrorString is not present. From-SVN: r263407	2018-08-08 14:26:28 +00:00
Tom de Vries	b113af959c	[libgomp, nvptx] Remove hard-coded const in nvptx_open_device CU_DEVICE_ATTRIBUTE_MAX_REGISTERS_PER_MULTIPROCESSOR is defined in cuda driver api version 6.0 and higher. Currently nvptx_open_device uses a hard-coded constant instead. This patch fixes that by: - defining CU_DEVICE_ATTRIBUTE_MAX_REGISTERS_PER_MULTIPROCESSOR to the hardcoded constant at toplevel, if not present in cuda.h, and - using CU_DEVICE_ATTRIBUTE_MAX_REGISTERS_PER_MULTIPROCESSOR in nvptx_open_device Build on x86_64 with nvptx accelerator and reg-tested libgomp. 2018-08-08 Tom de Vries <tdevries@suse.de> * plugin/plugin-nvptx.c (CU_DEVICE_ATTRIBUTE_MAX_REGISTERS_PER_MULTIPROCESSOR): Define. (nvptx_open_device): Use CU_DEVICE_ATTRIBUTE_MAX_REGISTERS_PER_MULTIPROCESSOR. From-SVN: r263406	2018-08-08 14:26:19 +00:00
Tom de Vries	94767dacea	[libgomp, nvptx] Note that cuGetErrorString is in CUDA_VERSION >= 6000 Cuda driver api function cuGetErrorString is available in version 6.0 and higher. This patch: - removes a comment saying the declaration is not available in cuda.h 6.0 - fixes the presence test to use CUDA_VERSION < 6000 - moves the declaration to toplevel Build on x86_64 with nvptx accelerator and reg-tested libgomp. 2018-08-08 Tom de Vries <tdevries@suse.de> * plugin/plugin-nvptx.c (cuda_error): Move declaration of cuGetErrorString ... (cuGetErrorString): ... here. Guard with CUDA_VERSION < 6000. From-SVN: r263405	2018-08-08 14:26:10 +00:00
Tom de Vries	02150de863	[libgomp, nvptx] Handle CUDA_ONE_CALL_MAYBE_NULL This patch adds handling of functions that may not be present in the cuda driver. Such a function can be declared using CUDA_ONE_CALL_MAYBE_NULL in cuda-lib.def, it can be called with the usual convenience macros, but before calling its presence needs to be tested using new macro CUDA_CALL_EXISTS. When using the dlopen interface (PLUGIN_NVPTX_DYNAMIC == 1), we allow non-present functions by allowing dlsym to return NULL. Otherwise (PLUGIN_NVPTX_DYNAMIC == 0) we declare the non-present function to be weak. Build and reg-tested libgomp on x86_64 with nvidia accelerator, with and without --disable-cuda-driver, in combination with a trigger patch that adds a non-existing function foo to cuda-lib.def: ... CUDA_ONE_CALL_MAYBE_NULL (foo) ... and declares it in plugin-nvptx.c: ... CUresult foo (void); ... and then uses it in nvptx_init after the init_cuda_lib call: ... if (CUDA_CALL_EXISTS (foo)) CUDA_CALL (foo); ... Also build and reg-tested on x86_64 with nvidia accelerator, with and without --disable-cuda-driver, in combination with a trigger patch that replaces all CUDA_ONE_CALLs in cuda-lib.def with CUDA_ONE_CALL_MAYBE_NULL, and guards two CUDA_CALLs with CUDA_CALL_EXISTS, one for a regular fn, and one for a fn that is a define in cuda/cuda.h. 2018-08-07 Tom de Vries <tdevries@suse.de> * plugin/plugin-nvptx.c (DO_PRAGMA): Define. (struct cuda_lib_s): Add def/undef of CUDA_ONE_CALL_MAYBE_NULL. (init_cuda_lib): Add new param to CUDA_ONE_CALL_1. Add arg to corresponding call in CUDA_ONE_CALL. Add def/undef of CUDA_ONE_CALL_MAYBE_NULL. (CUDA_CALL_EXISTS): Define. From-SVN: r263346	2018-08-06 22:13:56 +00:00
Tom de Vries	9e28b10779	[libgomp, nvptx] Minimize lifetime of CUDA_ONE_CALL defines This patch makes sure that the lifetimes of the CUDA_ONE_CALL macro (which is defined twice in plugin-nvptx.c) are minimized, to make it obvious that the definitions are used only in the lib-cuda.def include. Build on x86_64 with nvptx accelerator and reg-tested libgomp. 2018-08-07 Tom de Vries <tdevries@suse.de> * plugin/plugin-nvptx.c (struct cuda_lib_s, init_cuda_lib): Put CUDA_ONE_CALL defines right before the cuda-lib.def include, and the corresponding undefs right after. From-SVN: r263345	2018-08-06 22:13:46 +00:00
Tom de Vries	099400909e	[libgomp, nvptx, --without-cuda-driver] Don't use system cuda driver Using libgomp configure option --with-cuda-driver=<dir> we can indicate what cuda driver to use to build the libgomp nvptx plugin. Without such an option, the system cuda driver is used, if available. If not availabe, a dlopen interface is used instead. However, when we use --without-cuda-driver (or the equivalent --with-cuda-driver=no) the system cuda driver is still used if available. This patch fixes that, making sure that --without-cuda-driver selects the dlopen interface. Build on x86_64 with nvptx accelerator and tested libgomp testsuite, with and without option --without-cuda-driver. 2018-08-04 Tom de Vries <tdevries@suse.de> * plugin/configfrag.ac: For --without-cuda-driver, set CUDA_DRIVER_INCLUDE and CUDA_DRIVER_LIB to no. Handle CUDA_DRIVER_INCLUDE == no and CUDA_DRIVER_LIB == no. * configure: Regenerate. From-SVN: r263310	2018-08-04 20:07:22 +00:00
Cesar Philippidis	094db6beb9	[PATCH] Remove use of 'struct map' from plugin (nvptx) libgomp/ * plugin/plugin-nvptx.c (struct map): Removed. (map_init, map_pop): Remove use of struct map. (map_push): Likewise and change argument list. * testsuite/libgomp.oacc-c-c++-common/mapping-1.c: New Co-Authored-By: James Norris <jnorris@codesourcery.com> From-SVN: r263212	2018-08-01 07:09:56 -07:00
Tom de Vries	8c6310a2c2	[libgomp, nvptx] Add cuda-lib.def 2018-08-01 Tom de Vries <tdevries@suse.de> * plugin/cuda-lib.def: New file. Factor out of ... * plugin/plugin-nvptx.c (CUDA_CALLS): ... here. (struct cuda_lib_s, init_cuda_lib): Include cuda-lib.def instead of using CUDA_CALLS. From-SVN: r263208	2018-08-01 13:20:22 +00:00
Tom de Vries	4cdfee3f20	[libgomp, nvptx] Handle per-function max-threads-per-block in default dims Currently parallel-loop-1.c fails at -O0 on a Quadro M1200, because one of the kernel launch configurations exceeds the resources available in the device, due to the default dimensions chosen by the runtime. This patch fixes that by taking the per-function max_threads_per_block into account when using the default dimensions. 2018-07-30 Tom de Vries <tdevries@suse.de> * plugin/plugin-nvptx.c (MIN, MAX): Redefine. (nvptx_exec): Ensure worker and vector default dims don't exceed targ_fn->max_threads_per_block. From-SVN: r263062	2018-07-30 08:17:26 +00:00
Tom de Vries	0b210c43bb	[libgomp, nvptx] Calculate default dims per device The default dimensions are calculated using per-device properties, but initialized once and used on all devices. This patch fixes this problem by introducing per-device default dimensions. 2018-07-30 Tom de Vries <tdevries@suse.de> * plugin/plugin-nvptx.c (struct ptx_device): Add default_dims field. (nvptx_open_device): Init default_dims for device. (nvptx_exec): Use default_dims from device. From-SVN: r263061	2018-07-30 08:17:16 +00:00
Cesar Philippidis	88a4654d03	[libgomp, nvptx] Add error with recompilation hint for launch failure Currently, when a kernel is lauched with too many workers, it results in a cuda launch failure. This is triggered f.i. for parallel-loop-1.c at -O0 on a Quadro M1200. This patch detects this situation, and errors out with a hint on how to fix it. Build and reg-tested on x86_64 with nvptx accelerator. 2018-07-26 Cesar Philippidis <cesar@codesourcery.com> Tom de Vries <tdevries@suse.de> * plugin/plugin-nvptx.c (nvptx_exec): Error if the hardware doesn't have sufficient resources to launch a kernel, and give a hint on how to fix it. Co-Authored-By: Tom de Vries <tdevries@suse.de> From-SVN: r262997	2018-07-26 11:42:29 +00:00
Cesar Philippidis	0c6c2f5fc2	[libgomp, nvptx] Move device property sampling from nvptx_exec to nvptx_open Move sampling of device properties from nvptx_exec to nvptx_open, and assume the sampling always succeeds. This simplifies the default dimension initialization code in nvptx_open. 2018-07-26 Cesar Philippidis <cesar@codesourcery.com> Tom de Vries <tdevries@suse.de> * plugin/plugin-nvptx.c (struct ptx_device): Add warp_size, max_threads_per_block and max_threads_per_multiprocessor fields. (nvptx_open_device): Initialize new fields. (nvptx_exec): Use num_sms, and new fields. Co-Authored-By: Tom de Vries <tdevries@suse.de> From-SVN: r262996	2018-07-26 11:42:19 +00:00
Tom de Vries	ec00d3faf4	[openacc] Move GOMP_OPENACC_DIM parsing out of nvptx plugin 2018-05-02 Tom de Vries <tom@codesourcery.com> PR libgomp/85411 * plugin/plugin-nvptx.c (nvptx_exec): Move parsing of GOMP_OPENACC_DIM ... * env.c (parse_gomp_openacc_dim): ... here. New function. (initialize_env): Call parse_gomp_openacc_dim. (goacc_default_dims): Define. * libgomp.h (goacc_default_dims): Declare. * oacc-plugin.c (GOMP_PLUGIN_acc_default_dim): New function. * oacc-plugin.h (GOMP_PLUGIN_acc_default_dim): Declare. * libgomp.map: New version "GOMP_PLUGIN_1.2". Add GOMP_PLUGIN_acc_default_dim. * testsuite/libgomp.oacc-c-c++-common/loop-default-runtime.c: New test. * testsuite/libgomp.oacc-c-c++-common/loop-default.h: New test. From-SVN: r259852	2018-05-02 17:53:56 +00:00
Tom de Vries	df36a3d3be	[nvptx, libgomp] Add GOMP_NVPTX_JIT=-O[0-4] in nvptx libgomp plugin 2018-04-26 Tom de Vries <tom@codesourcery.com> PR libgomp/84020 * plugin/cuda/cuda.h (CUjit_option): Add CU_JIT_OPTIMIZATION_LEVEL. * plugin/plugin-nvptx.c (_GNU_SOURCE): Define. (process_GOMP_NVPTX_JIT): New function. (link_ptx): Use process_GOMP_NVPTX_JIT. From-SVN: r259678	2018-04-26 13:27:04 +00:00
Jakub Jelinek	85ec4feb11	Update copyright years. From-SVN: r256169	2018-01-03 11:03:58 +01:00
Tom de Vries	12e9c8ce6c	Remove semicolon after do {} while (false) in HSA_LOG 2017-10-31 Tom de Vries <tom@codesourcery.com> * plugin/plugin-hsa.c (HSA_LOG): Remove semicolon after "do {} while (false)". (init_single_kernel, GOMP_OFFLOAD_async_run): Add missing semicolon after HSA_DEBUG call. From-SVN: r254264	2017-10-31 12:56:41 +00:00
Tom de Vries	9607b014b2	Fix secure_getenv.h include in plugin-hsa.c 2017-07-03 Tom de Vries <tom@codesourcery.com> * plugin/plugin-hsa.c: Fix secure_getenv.h include. From-SVN: r249918	2017-07-03 13:40:19 +00:00
Tom de Vries	dfb15f6bbb	Show value of GOMP_OPENACC_DIM in libgomp nvptx plugin 2017-06-27 Tom de Vries <tom@codesourcery.com> * plugin/plugin-nvptx.c (notify_var): New function. (nvptx_exec): Use notify_var for GOMP_OPENACC_DIM. From-SVN: r249695	2017-06-27 15:51:48 +00:00
Tom de Vries	22f1a03704	Use secure_getenv for GOMP_DEBUG 2017-06-27 Tom de Vries <tom@codesourcery.com> * env.c (parse_unsigned_long_1): Factor out of ... (parse_unsigned_long): ... here. (parse_int_1): Factor out of ... (parse_int): ... here. (parse_int_secure): New function. (initialize_env): Use parse_int_secure for GOMP_DEBUG. * secure_getenv.h: Factor out of ... * plugin/plugin-hsa.c: ... here. * testsuite/libgomp.oacc-c-c++-common/gomp-debug-env.c: New test. From-SVN: r249694	2017-06-27 15:51:37 +00:00
Thomas Schwinge	78672bd8fd	libgomp nvptx plugin: Debugging output when disabling nvptx offloading libgomp/ * plugin/plugin-nvptx.c (nvptx_get_num_devices): Debugging output when disabling nvptx offloading. From-SVN: r248400	2017-05-24 08:59:05 +02:00
Thomas Schwinge	0da2f96af0	libgomp hsa plugin: debug output for HSA runtime library loading failure libgomp/ * plugin/plugin-hsa.c (DLSYM_FN, init_hsa_runtime_functions): Debug output for failure. From-SVN: r248277	2017-05-19 15:32:04 +02:00
Jakub Jelinek	19929ba9c9	plugin-nvptx.c (cuda_lib_inited): Use signed char type instead of char. * plugin/plugin-nvptx.c (cuda_lib_inited): Use signed char type instead of char. From-SVN: r246918	2017-04-13 21:59:04 +02:00
Thomas Schwinge	e70ab10d5c	libgomp, nvptx plugin: Make "nvptx_exec" static libgomp/ * plugin/plugin-nvptx.c (nvptx_exec): Make it static. From-SVN: r245127	2017-02-02 15:35:30 +01:00
Thomas Schwinge	345a8c1712	libgomp: Normalize the names of a few functions of the libgomp plugin API libgomp/ * libgomp-plugin.h (GOMP_OFFLOAD_openacc_parallel): Rename to GOMP_OFFLOAD_openacc_exec. Adjust all users. (GOMP_OFFLOAD_openacc_get_current_cuda_device): Rename to GOMP_OFFLOAD_openacc_cuda_get_current_device. Adjust all users. (GOMP_OFFLOAD_openacc_get_current_cuda_context): Rename to GOMP_OFFLOAD_openacc_cuda_get_current_context. Adjust all users. (GOMP_OFFLOAD_openacc_get_cuda_stream): Rename to GOMP_OFFLOAD_openacc_cuda_get_stream. Adjust all users. (GOMP_OFFLOAD_openacc_set_cuda_stream): Rename to GOMP_OFFLOAD_openacc_cuda_set_stream. Adjust all users. From-SVN: r245125	2017-02-02 15:13:57 +01:00
Thomas Schwinge	dced339c8a	libgomp: Provide prototypes for functions implemented by libgomp plugins libgomp/ * libgomp-plugin.h: #include <stdbool.h>. (GOMP_OFFLOAD_get_name, GOMP_OFFLOAD_get_caps) (GOMP_OFFLOAD_get_type, GOMP_OFFLOAD_get_num_devices) (GOMP_OFFLOAD_init_device, GOMP_OFFLOAD_fini_device) (GOMP_OFFLOAD_version, GOMP_OFFLOAD_load_image) (GOMP_OFFLOAD_unload_image, GOMP_OFFLOAD_alloc, GOMP_OFFLOAD_free) (GOMP_OFFLOAD_dev2host, GOMP_OFFLOAD_host2dev) (GOMP_OFFLOAD_dev2dev, GOMP_OFFLOAD_can_run, GOMP_OFFLOAD_run) (GOMP_OFFLOAD_async_run, GOMP_OFFLOAD_openacc_parallel) (GOMP_OFFLOAD_openacc_register_async_cleanup) (GOMP_OFFLOAD_openacc_async_test) (GOMP_OFFLOAD_openacc_async_test_all) (GOMP_OFFLOAD_openacc_async_wait) (GOMP_OFFLOAD_openacc_async_wait_async) (GOMP_OFFLOAD_openacc_async_wait_all) (GOMP_OFFLOAD_openacc_async_wait_all_async) (GOMP_OFFLOAD_openacc_async_set_async) (GOMP_OFFLOAD_openacc_create_thread_data) (GOMP_OFFLOAD_openacc_destroy_thread_data) (GOMP_OFFLOAD_openacc_get_current_cuda_device) (GOMP_OFFLOAD_openacc_get_current_cuda_context) (GOMP_OFFLOAD_openacc_get_cuda_stream) (GOMP_OFFLOAD_openacc_set_cuda_stream): New prototypes. * libgomp.h (struct acc_dispatch_t, struct gomp_device_descr): Use these. * plugin/plugin-hsa.c (GOMP_OFFLOAD_load_image) (GOMP_OFFLOAD_unload_image): Fix argument types. liboffloadmic/ * plugin/libgomp-plugin-intelmic.cpp (GOMP_OFFLOAD_get_type): Fix return type. (GOMP_OFFLOAD_load_image): Fix argument types. From-SVN: r245062	2017-01-31 15:32:58 +01:00
Pekka Jääskeläinen	5fd1486ce5	Brig front-end 2017-01-24 Pekka Jääskeläinen <pekka@parmance.com> Martin Jambor <mjambor@suse.cz> * Makefile.def (target_modules): Added libhsail-rt. (languages): Added language brig. * Makefile.in: Regenerated. * configure.ac (TOPLEVEL_CONFIGURE_ARGUMENTS): Added tgarget-libhsail-rt. Make brig unsupported on untested architectures. * configure: Regenerated. gcc/ * brig-builtins.def: New file. * builtins.def (DEF_HSAIL_BUILTIN): New macro. (DEF_HSAIL_ATOMIC_BUILTIN): Likewise. (DEF_HSAIL_SAT_BUILTIN): Likewise. (DEF_HSAIL_INTR_BUILTIN): Likewise. (DEF_HSAIL_CVT_ZEROI_SAT_BUILTIN): Likewise. * builtin-types.def (BT_INT8): New. (BT_INT16): Likewise. (BT_UINT8): Likewise. (BT_UINT16): Likewise. (BT_FN_ULONG): Likewise. (BT_FN_UINT_INT): Likewise. (BT_FN_UINT_ULONG): Likewise. (BT_FN_UINT_LONG): Likewise. (BT_FN_UINT_PTR): Likewise. (BT_FN_ULONG_PTR): Likewise. (BT_FN_INT8_FLOAT): Likewise. (BT_FN_INT16_FLOAT): Likewise. (BT_FN_UINT32_FLOAT): Likewise. (BT_FN_UINT16_FLOAT): Likewise. (BT_FN_UINT8_FLOAT): Likewise. (BT_FN_UINT64_FLOAT): Likewise. (BT_FN_UINT16_UINT32): Likewise. (BT_FN_UINT32_UINT16): Likewise. (BT_FN_UINT16_UINT16_UINT16): Likewise. (BT_FN_INT_PTR_INT): Likewise. (BT_FN_UINT_PTR_UINT): Likewise. (BT_FN_LONG_PTR_LONG): Likewise. (BT_FN_ULONG_PTR_ULONG): Likewise. (BT_FN_VOID_UINT64_UINT64): Likewise. (BT_FN_UINT8_UINT8_UINT8): Likewise. (BT_FN_INT8_INT8_INT8): Likewise. (BT_FN_INT16_INT16_INT16): Likewise. (BT_FN_INT_INT_INT): Likewise. (BT_FN_UINT_FLOAT_UINT): Likewise. (BT_FN_FLOAT_UINT_UINT): Likewise. (BT_FN_ULONG_UINT_UINT): Likewise. (BT_FN_ULONG_UINT_PTR): Likewise. (BT_FN_ULONG_ULONG_ULONG): Likewise. (BT_FN_UINT_UINT_UINT): Likewise. (BT_FN_VOID_UINT_PTR): Likewise. (BT_FN_UINT_UINT_PTR: Likewise. (BT_FN_UINT32_UINT64_PTR): Likewise. (BT_FN_INT_INT_UINT_UINT): Likewise. (BT_FN_UINT_UINT_UINT_UINT): Likewise. (BT_FN_UINT_UINT_UINT_PTR): Likewise. (BT_FN_UINT_ULONG_ULONG_UINT): Likewise. (BT_FN_ULONG_ULONG_ULONG_ULONG): Likewise. (BT_FN_LONG_LONG_UINT_UINT): Likewise. (BT_FN_ULONG_ULONG_UINT_UINT): Likewise. (BT_FN_VOID_UINT32_UINT64_PTR): Likewise. (BT_FN_VOID_UINT32_UINT32_PTR): Likewise. (BT_FN_UINT_UINT_UINT_UINT_UINT): Likewise. (BT_FN_UINT_FLOAT_FLOAT_FLOAT_FLOAT): Likewise. (BT_FN_ULONG_ULONG_ULONG_UINT_UINT): Likewise. * doc/frontends.texi: List BRIG FE. * doc/install.texi (Testing): Add BRIG tesring requirements. * doc/invoke.texi (Overall Options): Mention BRIG. * doc/standards.texi (Standards): Doucment BRIG HSA version. gcc/brig/ * Make-lang.in: New file. * brig-builtins.h: Likewise. * brig-c.h: Likewise. * brig-lang.c: Likewise. * brigspec.c: Likewise. * config-lang.in: Likewise. * lang-specs.h: Likewise. * lang.opt: Likewise. * brigfrontend/brig-arg-block-handler.cc: Likewise. * brigfrontend/brig-atomic-inst-handler.cc: Likewise. * brigfrontend/brig-basic-inst-handler.cc: Likewise. * brigfrontend/brig-branch-inst-handler.cc: Likewise. * brigfrontend/brig-cmp-inst-handler.cc: Likewise. * brigfrontend/brig-code-entry-handler.cc: Likewise. * brigfrontend/brig-code-entry-handler.h: Likewise. * brigfrontend/brig-comment-handler.cc: Likewise. * brigfrontend/brig-control-handler.cc: Likewise. * brigfrontend/brig-copy-move-inst-handler.cc: Likewise. * brigfrontend/brig-cvt-inst-handler.cc: Likewise. * brigfrontend/brig-fbarrier-handler.cc: Likewise. * brigfrontend/brig-function-handler.cc: Likewise. * brigfrontend/brig-function.cc: Likewise. * brigfrontend/brig-function.h: Likewise. * brigfrontend/brig-inst-mod-handler.cc: Likewise. * brigfrontend/brig-label-handler.cc: Likewise. * brigfrontend/brig-lane-inst-handler.cc: Likewise. * brigfrontend/brig-machine.c: Likewise. * brigfrontend/brig-machine.h: Likewise. * brigfrontend/brig-mem-inst-handler.cc: Likewise. * brigfrontend/brig-module-handler.cc: Likewise. * brigfrontend/brig-queue-inst-handler.cc: Likewise. * brigfrontend/brig-seg-inst-handler.cc: Likewise. * brigfrontend/brig-signal-inst-handler.cc: Likewise. * brigfrontend/brig-to-generic.cc: Likewise. * brigfrontend/brig-to-generic.h: Likewise. * brigfrontend/brig-util.cc: Likewise. * brigfrontend/brig-util.h: Likewise. * brigfrontend/brig-variable-handler.cc: Likewise. * brigfrontend/phsa.h: Likewise. gcc/testsuite/ * lib/brig-dg.exp: New file. * lib/brig.exp: Likewise. * brig.dg/README: Likewise. * brig.dg/dg.exp: Likewise. * brig.dg/test/gimple/alloca.hsail: Likewise. * brig.dg/test/gimple/atomics.hsail: Likewise. * brig.dg/test/gimple/branches.hsail: Likewise. * brig.dg/test/gimple/fbarrier.hsail: Likewise. * brig.dg/test/gimple/function_calls.hsail: Likewise. * brig.dg/test/gimple/kernarg.hsail: Likewise. * brig.dg/test/gimple/mem.hsail: Likewise. * brig.dg/test/gimple/mulhi.hsail: Likewise. * brig.dg/test/gimple/packed.hsail: Likewise. * brig.dg/test/gimple/smoke_test.hsail: Likewise. * brig.dg/test/gimple/variables.hsail: Likewise. * brig.dg/test/gimple/vector.hsail: Likewise. include/ * hsa.h: Moved here from libgomp/plugin/hsa.h. libgomp/ * plugin/hsa.h: Moved to top level include. * plugin/plugin-hsa.c: Chanfgd include of hsa.h accordingly. libhsail-rt/ * Makefile.am: New file. * target-config.h.in: Likewise. * configure.ac: Likewise. * configure: Likewise. * config.h.in: Likewise. * aclocal.m4: Likewise. * README: Likewise. * Makefile.in: Likewise. * include/internal/fibers.h: Likewise. * include/internal/phsa-queue-interface.h: Likewise. * include/internal/phsa-rt.h: Likewise. * include/internal/workitems.h: Likewise. * rt/arithmetic.c: Likewise. * rt/atomics.c: Likewise. * rt/bitstring.c: Likewise. * rt/fbarrier.c: Likewise. * rt/fibers.c: Likewise. * rt/fp16.c: Likewise. * rt/misc.c: Likewise. * rt/multimedia.c: Likewise. * rt/queue.c: Likewise. * rt/sat_arithmetic.c: Likewise. * rt/segment.c: Likewise. * rt/workitems.c: Likewise. Co-Authored-By: Martin Jambor <mjambor@suse.cz> From-SVN: r244867	2017-01-24 13:45:56 +01:00
Jakub Jelinek	b32e85fa42	cuda.h (CUdeviceptr): Typedef to unsigned long long even for _WIN64. * plugin/cuda/cuda.h (CUdeviceptr): Typedef to unsigned long long even for _WIN64. From-SVN: r244638	2017-01-19 16:53:51 +01:00
Jakub Jelinek	d190d5c093	hsa.h: Add GCC runtime library exception. * plugin/hsa.h: Add GCC runtime library exception. * plugin/hsa_ext_finalize.h: Likewise. From-SVN: r244523	2017-01-17 10:45:23 +01:00
Jakub Jelinek	2393d337e7	configfrag.ac: For --without-cuda-driver don't initialize CUDA_DRIVER_INCLUDE nor CUDA_DRIVER_LIB. * plugin/configfrag.ac: For --without-cuda-driver don't initialize CUDA_DRIVER_INCLUDE nor CUDA_DRIVER_LIB. If both CUDA_DRIVER_INCLUDE and CUDA_DRIVER_LIB are empty and linking small cuda program fails, define PLUGIN_NVPTX_DYNAMIC to 1 and use plugin/include/cuda as include dir and -ldl instead of -lcuda as library to link ptx plugin against. * plugin/plugin-nvptx.c: Include dlfcn.h if PLUGIN_NVPTX_DYNAMIC. (CUDA_CALLS): Define. (cuda_lib, cuda_lib_inited): New variables. (init_cuda_lib): New function. (CUDA_CALL_PREFIX): Define. (CUDA_CALL_ERET, CUDA_CALL_ASSERT): Use CUDA_CALL_PREFIX. (CUDA_CALL): Use FN instead of (FN). (CUDA_CALL_NOCHECK): Define. (cuda_error, fini_streams_for_device, select_stream_for_async, nvptx_attach_host_thread_to_device, nvptx_open_device, link_ptx, event_gc, nvptx_exec, nvptx_async_test, nvptx_async_test_all, nvptx_wait_all, nvptx_set_clocktick, GOMP_OFFLOAD_unload_image, nvptx_stacks_alloc, nvptx_stacks_free, GOMP_OFFLOAD_run): Use CUDA_CALL_NOCHECK. (nvptx_init): Call init_cuda_lib, if it fails, return false. Use CUDA_CALL_NOCHECK. (nvptx_get_num_devices): Call init_cuda_lib, if it fails, return 0. Use CUDA_CALL_NOCHECK. * plugin/cuda/cuda.h: New file. * config.h.in: Regenerated. * configure: Regenerated. From-SVN: r244522	2017-01-17 10:44:17 +01:00
Jakub Jelinek	cbe34bb5ed	Update copyright years. From-SVN: r243994	2017-01-01 13:07:43 +01:00
Alexander Monakov	6103184e81	OpenMP offloading to NVPTX: libgomp changes * Makefile.am (libgomp_la_SOURCES): Add atomic.c, icv.c, icv-device.c. * Makefile.in. Regenerate. * configure.ac [nvptx--] (libgomp_use_pthreads): Set and use it... (LIBGOMP_USE_PTHREADS): ...here; new define. configure: Regenerate. * config.h.in: Likewise. * config/posix/affinity.c: Move to... * affinity.c: ...here (new file). Guard use of Pthreads-specific interface by LIBGOMP_USE_PTHREADS. * critical.c: Split out GOMP_atomic_{start,end} into... * atomic.c: ...here (new file). * env.c: Split out ICV definitions into... * icv.c: ...here (new file) and... * icv-device.c: ...here. New file. * config/linux/lock.c (gomp_init_lock_30): Move to generic lock.c. (gomp_destroy_lock_30): Ditto. (gomp_set_lock_30): Ditto. (gomp_unset_lock_30): Ditto. (gomp_test_lock_30): Ditto. (gomp_init_nest_lock_30): Ditto. (gomp_destroy_nest_lock_30): Ditto. (gomp_set_nest_lock_30): Ditto. (gomp_unset_nest_lock_30): Ditto. (gomp_test_nest_lock_30): Ditto. * lock.c: New. * config/nvptx/lock.c: New. * config/nvptx/bar.c: New. * config/nvptx/bar.h: New. * config/nvptx/doacross.h: New. * config/nvptx/error.c: New. * config/nvptx/icv-device.c: New. * config/nvptx/mutex.h: New. * config/nvptx/pool.h: New. * config/nvptx/proc.c: New. * config/nvptx/ptrlock.h: New. * config/nvptx/sem.h: New. * config/nvptx/simple-bar.h: New. * config/nvptx/target.c: New. * config/nvptx/task.c: New. * config/nvptx/team.c: New. * config/nvptx/time.c: New. * config/posix/simple-bar.h: New. * libgomp.h: Guard pthread.h inclusion. Include simple-bar.h. (gomp_num_teams_var): Declare. (struct gomp_thread_pool): Change threads_dock member to gomp_simple_barrier_t. [__nvptx__] (gomp_thread): New implementation. (gomp_thread_attr): Guard by LIBGOMP_USE_PTHREADS. (gomp_thread_destructor): Ditto. (gomp_init_thread_affinity): Ditto. * team.c: Guard uses of Pthreads-specific interfaces by LIBGOMP_USE_PTHREADS. Adjust all uses of threads_dock. (gomp_free_thread) [__nvptx__]: Do not call 'free'. * config/nvptx/alloc.c: Delete. * config/nvptx/barrier.c: Ditto. * config/nvptx/fortran.c: Ditto. * config/nvptx/iter.c: Ditto. * config/nvptx/iter_ull.c: Ditto. * config/nvptx/loop.c: Ditto. * config/nvptx/loop_ull.c: Ditto. * config/nvptx/ordered.c: Ditto. * config/nvptx/parallel.c: Ditto. * config/nvptx/priority_queue.c: Ditto. * config/nvptx/sections.c: Ditto. * config/nvptx/single.c: Ditto. * config/nvptx/splay-tree.c: Ditto. * config/nvptx/work.c: Ditto. * testsuite/libgomp.fortran/fortran.exp (lang_link_flags): Pass -foffload=-lgfortran in addition to -lgfortran. * testsuite/libgomp.oacc-fortran/fortran.exp (lang_link_flags): Ditto. * plugin/plugin-nvptx.c: Include <limits.h>. (struct targ_fn_descriptor): Add new fields. (struct ptx_device): Ditto. Set them... (nvptx_open_device): ...here. (nvptx_adjust_launch_bounds): New. (nvptx_host2dev): Allow NULL 'nvthd'. (nvptx_dev2host): Ditto. (GOMP_OFFLOAD_get_caps): Add GOMP_OFFLOAD_CAP_OPENMP_400. (link_ptx): Adjust log sizes. (nvptx_host2dev): Allow NULL 'nvthd'. (nvptx_dev2host): Ditto. (nvptx_set_clocktick): New. Use it... (GOMP_OFFLOAD_load_image): ...here. Set new targ_fn_descriptor fields. (GOMP_OFFLOAD_dev2dev): New. (nvptx_adjust_launch_bounds): New. (nvptx_stacks_size): New. (nvptx_stacks_alloc): New. (nvptx_stacks_free): New. (GOMP_OFFLOAD_run): New. (GOMP_OFFLOAD_async_run): New (stub). Co-Authored-By: Dmitry Melnik <dm@ispras.ru> Co-Authored-By: Jakub Jelinek <jakub@redhat.com> From-SVN: r242789	2016-11-23 21:36:41 +03:00
Martin Liska	b8d89b03db	Remove build dependence on HSA run-time 2016-11-23 Martin Liska <mliska@suse.cz> Martin Jambor <mjambor@suse.cz> gcc/ * doc/install.texi: Remove entry about --with-hsa-kmt-lib. libgomp/ * plugin/hsa.h: New file. * plugin/hsa_ext_finalize.h: New file. * plugin/configfrag.ac: Remove hsa-kmt-lib test. Added checks for header file unistd.h, and functions secure_getenv, __secure_getenv, getuid, geteuid, getgid and getegid. * plugin/Makefrag.am (libgomp_plugin_hsa_la_CPPFLAGS): Added -D_GNU_SOURCE. * plugin/plugin-hsa.c: Include config.h, inttypes.h and stdbool.h. Handle various cases of secure_getenv presence, add an implementation when we can test effective UID and GID. (struct hsa_runtime_fn_info): New structure. (hsa_runtime_fn_info hsa_fns): New variable. (hsa_runtime_lib): Likewise. (support_cpu_devices): Likewise. (init_enviroment_variables): Load newly introduced ENV variables. (hsa_warn): Call hsa run-time functions via hsa_fns structure. (hsa_fatal): Likewise. (DLSYM_FN): New macro. (init_hsa_runtime_functions): New function. (suitable_hsa_agent_p): Call hsa run-time functions via hsa_fns structure. Depending on environment, also allow CPU devices. (init_hsa_context): Call hsa run-time functions via hsa_fns structure. (get_kernarg_memory_region): Likewise. (GOMP_OFFLOAD_init_device): Likewise. (destroy_hsa_program): Likewise. (init_basic_kernel_info): New function. (GOMP_OFFLOAD_load_image): Use it. (create_and_finalize_hsa_program): Call hsa run-time functions via hsa_fns structure. (create_single_kernel_dispatch): Likewise. (release_kernel_dispatch): Likewise. (init_single_kernel): Likewise. (parse_target_attributes): Allow up multiple HSA grid dimensions. (get_group_size): New function. (run_kernel): Likewise. (GOMP_OFFLOAD_run): Outline most functionality to run_kernel. (GOMP_OFFLOAD_fini_device): Call hsa run-time functions via hsa_fns structure. * testsuite/lib/libgomp.exp: Remove hsa_kmt_lib support. * testsuite/libgomp-test-support.exp.in: Likewise. * Makefile.in: Regenerated. * aclocal.m4: Likewise. * config.h.in: Likewise. * configure: Likewise. * testsuite/Makefile.in: Likewise. Co-Authored-By: Martin Jambor <mjambor@suse.cz> From-SVN: r242749	2016-11-23 13:27:13 +01:00
Cesar Philippidis	6668eb4593	nvptx.c (PTX_GANG_DEFAULT): Set to zero. gcc/ * config/nvptx/nvptx.c (PTX_GANG_DEFAULT): Set to zero. libgomp/ * plugin/plugin-nvptx.c (nvptx_exec): Interrogate board attributes to determine default geometry. * testsuite/libgomp.oacc-c-c++-common/loop-auto-1.c: Set gang dimension. Co-Authored-By: Nathan Sidwell <nathan@acm.org> From-SVN: r241803	2016-11-02 15:10:02 -07:00
Chung-Lin Tang	b4557008c4	oacc-plugin.h (GOMP_PLUGIN_async_unmap_vars): Add int parameter. 2016-05-26 Chung-Lin Tang <cltang@codesourcery.com> libgomp/ * oacc-plugin.h (GOMP_PLUGIN_async_unmap_vars): Add int parameter. * oacc-plugin.c (GOMP_PLUGIN_async_unmap_vars): Add 'int async' parameter, use to set async stream around call to gomp_unmap_vars, call gomp_unmap_vars() with 'do_copyfrom' set to true. * plugin/plugin-nvptx.c (struct ptx_event): Add 'int val' field. (event_gc): Adjust event handling loop, collect PTX_EVT_ASYNC_CLEANUP events and call GOMP_PLUGIN_async_unmap_vars() for each of them. (event_add): Add int parameter, initialize 'val' field when adding new ptx_event struct. (nvptx_evec): Adjust event_add() call arguments. (nvptx_host2dev): Likewise. (nvptx_dev2host): Likewise. (nvptx_wait_async): Likewise. (nvptx_wait_all_async): Likewise. (GOMP_OFFLOAD_openacc_register_async_cleanup): Add async parameter, pass to event_add() call. * oacc-host.c (host_openacc_register_async_cleanup): Add 'int async' parameter. * oacc-mem.c (gomp_acc_remove_pointer): Adjust async case to call openacc.register_async_cleanup_func() hook. * oacc-parallel.c (GOACC_parallel_keyed): Likewise. * target.c (gomp_copy_from_async): Delete function. (gomp_map_vars): Remove async_refcount. (gomp_unmap_vars): Likewise. (gomp_load_image_to_device): Likewise. (omp_target_associate_ptr): Likewise. * libgomp.h (struct splay_tree_key_s): Remove async_refcount. (acc_dispatch_t.register_async_cleanup_func): Add int parameter. (gomp_copy_from_async): Remove. From-SVN: r236772	2016-05-26 13:28:25 +00:00
Chung-Lin Tang	6ce1307231	target.c (gomp_device_copy): New function. libgomp/ 2016-05-26 Chung-Lin Tang <cltang@codesourcery.com> * target.c (gomp_device_copy): New function. (gomp_copy_host2dev): Likewise. (gomp_copy_dev2host): Likewise. (gomp_free_device_memory): Likewise. (gomp_map_vars_existing): Adjust to call gomp_copy_host2dev. (gomp_map_pointer): Likewise. (gomp_map_vars): Adjust to call gomp_copy_host2dev, handle NULL value from alloc_func plugin hook. (gomp_unmap_tgt): Adjust to call gomp_free_device_memory. (gomp_copy_from_async): Adjust to call gomp_copy_dev2host. (gomp_unmap_vars): Likewise. (gomp_update): Adjust to call gomp_copy_dev2host and gomp_copy_host2dev functions. (gomp_unload_image_from_device): Handle false value from unload_image_func plugin hook. (gomp_init_device): Handle false value from init_device_func plugin hook. (gomp_exit_data): Adjust to call gomp_copy_dev2host. (omp_target_free): Adjust to call gomp_free_device_memory. (omp_target_memcpy): Handle return values from host2dev_func, dev2host_func, and dev2dev_func plugin hooks. (omp_target_memcpy_rect_worker): Likewise. (gomp_target_fini): Handle false value from fini_device_func plugin hook. * libgomp.h (struct gomp_device_descr): Adjust return type of init_device_func, fini_device_func, unload_image_func, free_func, dev2host_func,host2dev_func, and dev2dev_func plugin hooks to 'bool'. * oacc-init.c (acc_shutdown_1): Handle false value from fini_device_func plugin hook. * oacc-host.c (host_init_device): Change return type to bool. (host_fini_device): Likewise. (host_unload_image): Likewise. (host_free): Likewise. (host_dev2host): Likewise. (host_host2dev): Likewise. * oacc-mem.c (acc_free): Handle plugin hook fatal error case. (acc_memcpy_to_device): Likewise. (acc_memcpy_from_device): Likewise. (delete_copyout): Add libfnname parameter, handle free_func hook fatal error case. (acc_delete): Adjust delete_copyout call. (acc_copyout): Likewise. (update_dev_host): Move gomp_mutex_unlock to after host2dev/dev2host hook calls. * plugin/plugin-hsa.c (hsa_warn): Adjust 'hsa_error' local variable to 'hsa_error_msg', for clarity. (hsa_fatal): Likewise. (hsa_error): New function. (init_hsa_context): Change return type to bool, adjust to return false on error. (GOMP_OFFLOAD_get_num_devices): Adjust to handle init_hsa_context return value. (GOMP_OFFLOAD_init_device): Change return type to bool, adjust to return false on error. (get_agent_info): Adjust to return NULL on error. (destroy_hsa_program): Change return type to bool, adjust to return false on error. (GOMP_OFFLOAD_load_image): Adjust to return -1 on error. (destroy_module): Change return type to bool, adjust to return false on error. (GOMP_OFFLOAD_unload_image): Likewise. (GOMP_OFFLOAD_fini_device): Likewise. (GOMP_OFFLOAD_alloc): Change to return NULL when called. (GOMP_OFFLOAD_free): Change to return false when called. (GOMP_OFFLOAD_dev2host): Likewise. (GOMP_OFFLOAD_host2dev): Likewise. (GOMP_OFFLOAD_dev2dev): Likewise. * plugin/plugin-nvptx.c (CUDA_CALL_ERET): New convenience macro. (CUDA_CALL): Likewise. (CUDA_CALL_ASSERT): Likewise. (map_init): Change return type to bool, use CUDA_CALL* macros. (map_fini): Likewise. (init_streams_for_device): Change return type to bool, adjust call to map_init. (fini_streams_for_device): Change return type to bool, adjust call to map_fini. (select_stream_for_async): Release stream_lock before calls to GOMP_PLUGIN_fatal, adjust call to map_init. (nvptx_init): Use CUDA_CALL* macros. (nvptx_attach_host_thread_to_device): Change return type to bool, use CUDA_CALL* macros. (nvptx_open_device): Use CUDA_CALL* macros. (nvptx_close_device): Change return type to bool, use CUDA_CALL* macros. (nvptx_get_num_devices): Use CUDA_CALL* macros. (link_ptx): Change return type to bool, use CUDA_CALL* macros. (nvptx_exec): Use CUDA_CALL* macros. (nvptx_alloc): Use CUDA_CALL* macros. (nvptx_free): Change return type to bool, use CUDA_CALL* macros. (nvptx_host2dev): Likewise. (nvptx_dev2host): Likewise. (nvptx_wait): Use CUDA_CALL* macros. (nvptx_wait_async): Likewise. (nvptx_wait_all): Likewise. (nvptx_wait_all_async): Likewise. (nvptx_set_cuda_stream): Adjust order of stream_lock acquire, use CUDA_CALL* macros, adjust call to map_fini. (GOMP_OFFLOAD_init_device): Change return type to bool, adjust code accordingly. (GOMP_OFFLOAD_fini_device): Likewise. (GOMP_OFFLOAD_load_image): Adjust calls to nvptx_attach_host_thread_to_device/link_ptx to handle errors, use CUDA_CALL* macros. (GOMP_OFFLOAD_unload_image): Change return type to bool, adjust return code. (GOMP_OFFLOAD_alloc): Adjust calls to code to handle error return. (GOMP_OFFLOAD_free): Change return type to bool, adjust calls to handle error return. (GOMP_OFFLOAD_dev2host): Likewise. (GOMP_OFFLOAD_host2dev): Likewise. (GOMP_OFFLOAD_openacc_register_async_cleanup): Use CUDA_CALL* macros. (GOMP_OFFLOAD_openacc_create_thread_data): Likewise. liboffloadmic/ 2016-05-26 Chung-Lin Tang <cltang@codesourcery.com> * plugin/libgomp-plugin-intelmic.cpp (offload): Change return type to bool, adjust return code. (GOMP_OFFLOAD_init_device): Likewise. (GOMP_OFFLOAD_fini_device): Likewise. (get_target_table): Likewise. (offload_image): Likwise. (GOMP_OFFLOAD_load_image): Adjust call to offload_image(), change to return -1 on error. (GOMP_OFFLOAD_unload_image): Change return type to bool, adjust return code. (GOMP_OFFLOAD_alloc): Likewise. (GOMP_OFFLOAD_free): Likewise. (GOMP_OFFLOAD_host2dev): Likewise. (GOMP_OFFLOAD_dev2host): Likewise. (GOMP_OFFLOAD_dev2dev): Likewise. From-SVN: r236768	2016-05-26 09:58:56 +00:00
Alexander Monakov	c2bd3b6911	libgomp nvptx plugin: make cuMemFreeHost error non-fatal From-SVN: r235339	2016-04-21 16:11:47 +03:00
Martin Liska	f9c8babbab	Properly assign to packet header (PR hsa/70394) * plugin/plugin-hsa.c (packet_store_release): New function that is taken from the HSA runtime manual. (GOMP_OFFLOAD_run): Use the function. From-SVN: r234454	2016-03-24 13:04:12 +00:00
Martin Liska	7397fce2f7	Copy shadow argument conditionally (PR hsa/70337) PR hsa/70337 * plugin/plugin-hsa.c (GOMP_OFFLOAD_run): Copy shadow argument just in case a dispatched kernel uses that argument. From-SVN: r234418	2016-03-23 09:59:51 +00:00
Thomas Schwinge	f99c355797	Use plain -fopenacc to enable OpenACC kernels processing gcc/ * tree-parloops.c (create_parallel_loop, gen_parallel_loop) (parallelize_loops): In OpenACC kernels mode, set n_threads to zero. (pass_parallelize_loops::gate): In OpenACC kernels mode, gate on flag_openacc. * tree-ssa-loop.c (gate_oacc_kernels): Likewise. gcc/testsuite/ * c-c++-common/goacc/kernels-counter-vars-function-scope.c: Adjust to -ftree-parallelize-loops/-fopenacc changes. * c-c++-common/goacc/kernels-double-reduction-n.c: Likewise. * c-c++-common/goacc/kernels-double-reduction.c: Likewise. * c-c++-common/goacc/kernels-loop-2.c: Likewise. * c-c++-common/goacc/kernels-loop-3.c: Likewise. * c-c++-common/goacc/kernels-loop-g.c: Likewise. * c-c++-common/goacc/kernels-loop-mod-not-zero.c: Likewise. * c-c++-common/goacc/kernels-loop-n.c: Likewise. * c-c++-common/goacc/kernels-loop-nest.c: Likewise. * c-c++-common/goacc/kernels-loop.c: Likewise. * c-c++-common/goacc/kernels-one-counter-var.c: Likewise. * c-c++-common/goacc/kernels-reduction.c: Likewise. * gfortran.dg/goacc/kernels-loop-inner.f95: Likewise. * gfortran.dg/goacc/kernels-loops-adjacent.f95: Likewise. libgomp/ * oacc-parallel.c (GOACC_parallel_keyed): Initialize dims. * plugin/plugin-nvptx.c (nvptx_exec): Provide default values for dims. * testsuite/libgomp.oacc-c-c++-common/kernels-loop-2.c: Adjust to -ftree-parallelize-loops/-fopenacc changes. * testsuite/libgomp.oacc-c-c++-common/kernels-loop-3.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-3.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-4.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-5.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-6.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-loop-collapse.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-loop-g.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-loop-mod-not-zero.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-loop-n.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-loop-nest.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-loop.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-reduction.c: Likewise. From-SVN: r233634	2016-02-23 16:07:54 +01:00
Thomas Schwinge	033ff3d130	libgomp: Use HSA_RUNTIME_LIB, HSA_KMT_LIB in the testsuite libgomp/ * plugin/configfrag.ac (HSA_KMT_LIB, HSA_KMT_LDFLAGS): New variables. * testsuite/libgomp-test-support.exp.in (hsa_runtime_lib) (hsa_kmt_lib): Set variables. * testsuite/lib/libgomp.exp (libgomp_init): Use them to amend always_ld_library_path. * Makefile.in: Regenerate. * configure: Likewise. * testsuite/Makefile.in: Likewise. From-SVN: r233072	2016-02-02 13:48:31 +01:00
Thomas Schwinge	4a88d9b77a	libgomp: For hsa offloading, compilation is all handled by the target compiler libgomp/ * plugin/configfrag.ac (offload_additional_options) (offload_additional_lib_paths): Don't amend for hsa offloading. * configure: Regenerate. From-SVN: r233071	2016-02-02 13:48:21 +01:00
Thomas Schwinge	41d809d3c8	libgomp: Don't configure for offloading target if we don't build the corresponding plugin libgomp/ * plugin/configfrag.ac: Don't configure for offloading target if we don't build the corresponding plugin. * configure: Regenerate. From-SVN: r233070	2016-02-02 13:48:04 +01:00
Martin Jambor	b2b4005150	Merge of HSA 2016-01-19 Martin Jambor <mjambor@suse.cz> Martin Liska <mliska@suse.cz> Michael Matz <matz@suse.de> libgomp/ * plugin/Makefrag.am: Add HSA plugin requirements. * plugin/configfrag.ac (HSA_RUNTIME_INCLUDE): New variable. (HSA_RUNTIME_LIB): Likewise. (HSA_RUNTIME_CPPFLAGS): Likewise. (HSA_RUNTIME_INCLUDE): New substitution. (HSA_RUNTIME_LIB): Likewise. (HSA_RUNTIME_LDFLAGS): Likewise. (hsa-runtime): New configure option. (hsa-runtime-include): Likewise. (hsa-runtime-lib): Likewise. (PLUGIN_HSA): New substitution variable. Fill HSA_RUNTIME_INCLUDE and HSA_RUNTIME_LIB according to the new configure options. (PLUGIN_HSA_CPPFLAGS): Likewise. (PLUGIN_HSA_LDFLAGS): Likewise. (PLUGIN_HSA_LIBS): Likewise. Check that we have access to HSA run-time. * libgomp-plugin.h (offload_target_type): New element OFFLOAD_TARGET_TYPE_HSA. * libgomp.h (gomp_target_task): New fields firstprivate_copies and args. (bool gomp_create_target_task): Updated. (gomp_device_descr): Extra parameter of run_func and async_run_func, new field can_run_func. * libgomp_g.h (GOMP_target_ext): Update prototype. * oacc-host.c (host_run): Added a new parameter args. * target.c (calculate_firstprivate_requirements): New function. (copy_firstprivate_data): Likewise. (gomp_target_fallback_firstprivate): Use them. (gomp_target_unshare_firstprivate): New function. (gomp_get_target_fn_addr): Allow returning NULL for shared memory devices. (GOMP_target): Do host fallback for all shared memory devices. Do not pass any args to plugins. (GOMP_target_ext): Introduce device-specific argument parameter args. Allow host fallback if device shares memory. Do not remap data if device has shared memory. (gomp_target_task_fn): Likewise. Also treat shared memory devices like host fallback for mappings. (GOMP_target_data): Treat shared memory devices like host fallback. (GOMP_target_data_ext): Likewise. (GOMP_target_update): Likewise. (GOMP_target_update_ext): Likewise. Also pass NULL as args to gomp_create_target_task. (GOMP_target_enter_exit_data): Likewise. (omp_target_alloc): Treat shared memory devices like host fallback. (omp_target_free): Likewise. (omp_target_is_present): Likewise. (omp_target_memcpy): Likewise. (omp_target_memcpy_rect): Likewise. (omp_target_associate_ptr): Likewise. (gomp_load_plugin_for_device): Also load can_run. * task.c (GOMP_PLUGIN_target_task_completion): Free firstprivate_copies. (gomp_create_target_task): Accept new argument args and store it to ttask. * plugin/plugin-hsa.c: New file. gcc/ * Makefile.in (OBJS): Add new source files. (GTFILES): Add hsa.c. * common.opt (disable_hsa): New variable. (-Whsa): New warning. * config.in (ENABLE_HSA): New. * configure.ac: Treat hsa differently from other accelerators. (OFFLOAD_TARGETS): Define ENABLE_OFFLOADING according to $enable_offloading. (ENABLE_HSA): Define ENABLE_HSA according to $enable_hsa. * doc/install.texi (Configuration): Document --with-hsa-runtime, --with-hsa-runtime-include, --with-hsa-runtime-lib and --with-hsa-kmt-lib. * doc/invoke.texi (-Whsa): Document. (hsa-gen-debug-stores): Likewise. * lto-wrapper.c (compile_images_for_offload_targets): Do not attempt to invoke offload compiler for hsa acclerator. * opts.c (common_handle_option): Determine whether HSA offloading should be performed. * params.def (PARAM_HSA_GEN_DEBUG_STORES): New parameter. * builtin-types.def (BT_FN_VOID_UINT_PTR_INT_PTR): New. (BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR_INT_INT): Removed. (BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR_PTR): New. * gimple-low.c (lower_stmt): Also handle GIMPLE_OMP_GRID_BODY. * gimple-pretty-print.c (dump_gimple_omp_for): Also handle GF_OMP_FOR_KIND_GRID_LOOP. (dump_gimple_omp_block): Also handle GIMPLE_OMP_GRID_BODY. (pp_gimple_stmt_1): Likewise. * gimple-walk.c (walk_gimple_stmt): Likewise. * gimple.c (gimple_build_omp_grid_body): New function. (gimple_copy): Also handle GIMPLE_OMP_GRID_BODY. * gimple.def (GIMPLE_OMP_GRID_BODY): New. * gimple.h (enum gf_mask): Added GF_OMP_PARALLEL_GRID_PHONY, GF_OMP_FOR_KIND_GRID_LOOP, GF_OMP_FOR_GRID_PHONY and GF_OMP_TEAMS_GRID_PHONY. (gimple_statement_omp_single_layout): Updated comments. (gimple_build_omp_grid_body): New function. (gimple_has_substatements): Also handle GIMPLE_OMP_GRID_BODY. (gimple_omp_for_grid_phony): New function. (gimple_omp_for_set_grid_phony): Likewise. (gimple_omp_parallel_grid_phony): Likewise. (gimple_omp_parallel_set_grid_phony): Likewise. (gimple_omp_teams_grid_phony): Likewise. (gimple_omp_teams_set_grid_phony): Likewise. (gimple_return_set_retbnd): Also handle GIMPLE_OMP_GRID_BODY. * omp-builtins.def (BUILT_IN_GOMP_OFFLOAD_REGISTER): New. (BUILT_IN_GOMP_OFFLOAD_UNREGISTER): Likewise. (BUILT_IN_GOMP_TARGET): Updated type. * omp-low.c: Include symbol-summary.h, hsa.h and params.h. (adjust_for_condition): New function. (get_omp_for_step_from_incr): Likewise. (extract_omp_for_data): Moved parts to adjust_for_condition and get_omp_for_step_from_incr. (build_outer_var_ref): Handle GIMPLE_OMP_GRID_BODY. (fixup_child_record_type): Bail out if receiver_decl is NULL. (scan_sharing_clauses): Handle OMP_CLAUSE__GRIDDIM_. (scan_omp_parallel): Do not create child functions for phony constructs. (check_omp_nesting_restrictions): Handle GIMPLE_OMP_GRID_BODY. (scan_omp_1_op): Checking assert we are not remapping to ERROR_MARK. Also also handle GIMPLE_OMP_GRID_BODY. (parallel_needs_hsa_kernel_p): New function. (expand_parallel_call): Register apprpriate parallel child functions as HSA kernels. (grid_launch_attributes_trees): New type. (grid_attr_trees): New variable. (grid_create_kernel_launch_attr_types): New function. (grid_insert_store_range_dim): Likewise. (grid_get_kernel_launch_attributes): Likewise. (get_target_argument_identifier_1): Likewise. (get_target_argument_identifier): Likewise. (get_target_argument_value): Likewise. (push_target_argument_according_to_value): Likewise. (get_target_arguments): Likewise. (expand_omp_target): Call get_target_arguments instead of looking up for teams and thread limit. (grid_expand_omp_for_loop): New function. (grid_arg_decl_map): New type. (grid_remap_kernel_arg_accesses): New function. (grid_expand_target_kernel_body): New function. (expand_omp): Call it. (lower_omp_for): Do not emit phony constructs. (lower_omp_taskreg): Do not emit phony constructs but create for them a temporary variable receiver_decl. (lower_omp_taskreg): Do not emit phony constructs. (lower_omp_teams): Likewise. (lower_omp_grid_body): New function. (lower_omp_1): Call it. (grid_reg_assignment_to_local_var_p): New function. (grid_seq_only_contains_local_assignments): Likewise. (grid_find_single_omp_among_assignments_1): Likewise. (grid_find_single_omp_among_assignments): Likewise. (grid_find_ungridifiable_statement): Likewise. (grid_target_follows_gridifiable_pattern): Likewise. (grid_remap_prebody_decls): Likewise. (grid_copy_leading_local_assignments): Likewise. (grid_process_kernel_body_copy): Likewise. (grid_attempt_target_gridification): Likewise. (grid_gridify_all_targets_stmt): Likewise. (grid_gridify_all_targets): Likewise. (execute_lower_omp): Call grid_gridify_all_targets. (make_gimple_omp_edges): Handle GIMPLE_OMP_GRID_BODY. * tree-core.h (omp_clause_code): Added OMP_CLAUSE__GRIDDIM_. (tree_omp_clause): Added union field dimension. * tree-pretty-print.c (dump_omp_clause): Handle OMP_CLAUSE__GRIDDIM_. * tree.c (omp_clause_num_ops): Added number of arguments of OMP_CLAUSE__GRIDDIM_. (omp_clause_code_name): Added name of OMP_CLAUSE__GRIDDIM_. (walk_tree_1): Handle OMP_CLAUSE__GRIDDIM_. * tree.h (OMP_CLAUSE_GRIDDIM_DIMENSION): New. (OMP_CLAUSE_SET_GRIDDIM_DIMENSION): Likewise. (OMP_CLAUSE_GRIDDIM_SIZE): Likewise. (OMP_CLAUSE_GRIDDIM_GROUP): Likewise. * passes.def: Schedule pass_ipa_hsa and pass_gen_hsail. * tree-pass.h (make_pass_gen_hsail): Declare. (make_pass_ipa_hsa): Likewise. * ipa-hsa.c: New file. * lto-section-in.c (lto_section_name): Add hsa section name. * lto-streamer.h (lto_section_type): Add hsa section. * timevar.def (TV_IPA_HSA): New. * hsa-brig-format.h: New file. * hsa-brig.c: New file. * hsa-dump.c: Likewise. * hsa-gen.c: Likewise. * hsa.c: Likewise. * hsa.h: Likewise. * toplev.c (compile_file): Call hsa_output_brig. * hsa-regalloc.c: New file. gcc/fortran/ * types.def (BT_FN_VOID_UINT_PTR_INT_PTR): New. (BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR_INT_INT): Removed. (BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR_PTR): New. gcc/lto/ * lto-partition.c: Include "hsa.h" (add_symbol_to_partition_1): Put hsa implementations into the same partition as host implementations. liboffloadmic/ * plugin/libgomp-plugin-intelmic.cpp (GOMP_OFFLOAD_async_run): New unused parameter. (GOMP_OFFLOAD_run): Likewise. include/ * gomp-constants.h (GOMP_DEVICE_HSA): New macro. (GOMP_VERSION_HSA): Likewise. (GOMP_TARGET_ARG_DEVICE_MASK): Likewise. (GOMP_TARGET_ARG_DEVICE_ALL): Likewise. (GOMP_TARGET_ARG_SUBSEQUENT_PARAM): Likewise. (GOMP_TARGET_ARG_ID_MASK): Likewise. (GOMP_TARGET_ARG_NUM_TEAMS): Likewise. (GOMP_TARGET_ARG_THREAD_LIMIT): Likewise. (GOMP_TARGET_ARG_VALUE_SHIFT): Likewise. (GOMP_TARGET_ARG_HSA_KERNEL_ATTRIBUTES): Likewise. From-SVN: r232549	2016-01-19 11:35:10 +01:00
Alexander Monakov	0d58938ed7	nvptx plugin: do not force JIT target SM version When link_ptx runs, a CUDA device is already bound to current thread, so the driver library knows the target architecture. There isn't any benefit from forcing a specific target here; on the contrary, hardcoding sm_30 breaks offloading on later (Maxwell, sm_5x) devices. * plugin/plugin-nvptx.c (link_ptx): Do not set CU_JIT_TARGET. From-SVN: r232227	2016-01-11 15:55:31 +03:00
Jakub Jelinek	818ab71a41	Update copyright years. From-SVN: r232055	2016-01-04 15:30:50 +01:00
Nathan Sidwell	5c06742f6f	libgomp.h (struct acc_dispatch_t): Remove args from exec_func. * libgomp.h (struct acc_dispatch_t): Remove args from exec_func. * plugin/plugin-nvptx.c (nvptx_exec): Remove sizes & kinds arg. (GOMP_OFFLOAD_openacc_parallel): Likewise. * oacc-host.c (host_openacc_exec): Likewise. * oacc-parallel.c (GOACC_parallel_keyed): Adjust exec_func call. From-SVN: r229721	2015-11-03 20:18:33 +00:00
Nathan Sidwell	a1c1908bbd	plugin-nvptx.c (nvptx_exec): Remove check on compute dimensions. * plugin/plugin-nvptx.c (nvptx_exec): Remove check on compute dimensions. From-SVN: r229471	2015-10-28 03:00:10 +00:00

1 2

67 Commits