Commit Graph

14 Commits

Author SHA1 Message Date
Jakub Jelinek 7adcbafe45 Update copyright years. 2022-01-03 10:42:10 +01:00
Jakub Jelinek f49c7a4fb2 libgomp: Unbreak gcn offload build
My recent libgomp change apparently broke libgomp build for gcn offloading.
The problem is that gcn, unlike nvptx, doesn't override teams.c source file
and the patch I've committed assumed all the non-LIBGOMP_USE_PTHREADS targets
do not use it.  My understanding is that gcn included omp_get_num_teams
and omp_get_team_num definitions in both icv-device.o and teams.o,
with the definitions only in the former working correctly.

This patch brings gcn into sync with how nvptx does it, that teams.c
is overridden, provides a dummy GOMP_teams_reg and omp_get_{num_teams,team_num}
definitions and icv-device.c doesn't provide those.

2021-11-12  Jakub Jelinek  <jakub@redhat.com>

	PR target/103201
	* config/gcn/icv-device.c (omp_get_num_teams, omp_get_team_num): Move
	to ...
	* config/gcn/teams.c: ... here.  New file.
2021-11-12 16:11:02 +01:00
Jakub Jelinek 7d6da11fce openmp: Honor OpenMP 5.1 num_teams lower bound
The following patch implements what I've been talking about earlier,
honor that for explicit num_teams clause we create at least the
lower-bound (if not specified, upper-bound) teams in the league.
For host fallback, it still means we only have one thread doing all the
teams, sequentially one after another.
For PTX and GCN, I think the new teams-2.c test and maybe teams-4.c too
will or might fail.
For these offloads, I think it is ok to remove symbols no longer used
from libgomp.a.
If num_teams_lower is bigger than the provided num_blocks or num_workgroups,
we should arrange for gomp_num_teams_var to be num_teams_lower - 1,
stop using the %ctaid.x or __builtin_gcn_dim_pos (0) for omp_get_team_num ()
and instead use for it some .shared var that GOMP_teams4 initializes to
%ctaid.x or __builtin_gcn_dim_pos (0) when first and for !first
increment that by num_blocks or num_workgroups each time and only
return false when we are above num_teams_lower.
Any help with actually implementing this for the 2 architectures highly
appreciated.

2021-11-12  Jakub Jelinek  <jakub@redhat.com>

gcc/
	* omp-builtins.def (BUILT_IN_GOMP_TEAMS): Remove.
	(BUILT_IN_GOMP_TEAMS4): New.
	* builtin-types.def (BT_FN_VOID_UINT_UINT): Remove.
	(BT_FN_BOOL_UINT_UINT_UINT_BOOL): New.
	* omp-low.c (lower_omp_teams): Use GOMP_teams4 instead of
	GOMP_teams, pass to it also num_teams lower-bound expression
	or a dup of upper-bound if it is missing and a flag whether
	it is the first call or not.
gcc/fortran/
	* types.def (BT_FN_VOID_UINT_UINT): Remove.
	(BT_FN_BOOL_UINT_UINT_UINT_BOOL): New.
libgomp/
	* libgomp_g.h (GOMP_teams4): Declare.
	* libgomp.map (GOMP_5.1): Export GOMP_teams4.
	* target.c (GOMP_teams4): New function.
	* config/nvptx/target.c (GOMP_teams): Remove.
	(GOMP_teams4): New function.
	* config/gcn/target.c (GOMP_teams): Remove.
	(GOMP_teams4): New function.
	* testsuite/libgomp.c/teams-4.c (main): Expect exactly 2
	teams instead of <= 2.
	* testsuite/libgomp.c-c++-common/teams-2.c: New test.
2021-11-12 12:41:22 +01:00
Chung-Lin Tang 0bac793ed6 openmp: Implement omp_get_device_num routine
This patch implements the omp_get_device_num library routine, specified in
OpenMP 5.0.

GOMP_DEVICE_NUM_VAR is a macro symbol which defines name of a "device number"
variable, is defined on the device-side libgomp, has it's address returned to
host-side libgomp during device initialization, and the host libgomp then
sets its value to the designated device number.

libgomp/ChangeLog:

	* icv-device.c (omp_get_device_num): New API function, host side.
	* fortran.c (omp_get_device_num_): New interface function.
	* libgomp-plugin.h (GOMP_DEVICE_NUM_VAR): Define macro symbol.
	* libgomp.map (OMP_5.0.2): New version space with omp_get_device_num,
	omp_get_device_num_.
	* libgomp.texi (omp_get_device_num): Add documentation for new API
	function.
	* omp.h.in (omp_get_device_num): Add declaration.
	* omp_lib.f90.in (omp_get_device_num): Likewise.
	* omp_lib.h.in (omp_get_device_num): Likewise.
	* target.c (gomp_load_image_to_device): If additional entry for device
	number exists at end of returned entries from 'load_image_func' hook,
	copy the assigned device number over to the device variable.

	* config/gcn/icv-device.c (GOMP_DEVICE_NUM_VAR): Define static global.
	(omp_get_device_num): New API function, device side.
	* plugin/plugin-gcn.c ("symcat.h"): Add include.
	(GOMP_OFFLOAD_load_image): Add addresses of device GOMP_DEVICE_NUM_VAR
	at end of returned 'target_table' entries.

	* config/nvptx/icv-device.c (GOMP_DEVICE_NUM_VAR): Define static global.
	(omp_get_device_num): New API function, device side.
	* plugin/plugin-nvptx.c ("symcat.h"): Add include.
	(GOMP_OFFLOAD_load_image): Add addresses of device GOMP_DEVICE_NUM_VAR
	at end of returned 'target_table' entries.

	* testsuite/lib/libgomp.exp
	(check_effective_target_offload_target_intelmic): New function for
	testing for intelmic offloading.
	* testsuite/libgomp.c-c++-common/target-45.c: New test.
	* testsuite/libgomp.fortran/target10.f90: New test.
2021-08-05 23:29:03 +08:00
Thomas Schwinge 8168338684 [gcn] Work-around libgomp 'error: array subscript 0 is outside array bounds of ‘__lds struct gomp_thread * __lds[0]’ [-Werror=array-bounds]' some more [PR101484]
With yesterday's commit 9f2bc5077d "[gcn]
Work-around libgomp 'error: array subscript 0 is outside array bounds of
‘__lds struct gomp_thread * __lds[0]’ [-Werror=array-bounds]' [PR101484]",
I did defuse the "unexpected" '-Werror=array-bounds' diagnostics that we see
as of commit a110855667 "Correct handling of
variable offset minus constant in -Warray-bounds [PR100137]".  However, these
'#pragma GCC diagnostic [...]' directives cause some code generation changes
(that seems unexpected, problematic!), which results in a lot (ten thousands)
of 'GCN team arena exhausted' run-time diagnostics, also leading to a few
FAILs:

    PASS: libgomp.c/../libgomp.c-c++-common/for-11.c (test for excess errors)
    [-PASS:-]{+FAIL:+} libgomp.c/../libgomp.c-c++-common/for-11.c execution test

    PASS: libgomp.c/../libgomp.c-c++-common/for-12.c (test for excess errors)
    [-PASS:-]{+FAIL:+} libgomp.c/../libgomp.c-c++-common/for-12.c execution test

    PASS: libgomp.c/../libgomp.c-c++-common/for-3.c (test for excess errors)
    [-PASS:-]{+FAIL:+} libgomp.c/../libgomp.c-c++-common/for-3.c execution test

    PASS: libgomp.c/../libgomp.c-c++-common/for-5.c (test for excess errors)
    [-PASS:-]{+FAIL:+} libgomp.c/../libgomp.c-c++-common/for-5.c execution test

    PASS: libgomp.c/../libgomp.c-c++-common/for-6.c (test for excess errors)
    [-PASS:-]{+FAIL:+} libgomp.c/../libgomp.c-c++-common/for-6.c execution test

    PASS: libgomp.c/../libgomp.c-c++-common/for-9.c (test for excess errors)
    [-PASS:-]{+FAIL:+} libgomp.c/../libgomp.c-c++-common/for-9.c execution test

Same for 'libgomp.c++'.

It remains to be analyzed how '#pragma GCC diagnostic [...]' directives can
cause code generation changes; for now I'm working around the "unexpected"
'-Werror=array-bounds' diagnostics differently.

Overall, still awaiting a different solution, of course.

	libgomp/
	PR target/101484
	* configure.tgt [amdgcn*-*-*] (XCFLAGS): Add
	'-Wno-error=array-bounds'.
	* config/gcn/team.c: Remove '-Werror=array-bounds' work-around.
	* libgomp.h [__AMDGCN__]: Likewise.
2021-07-20 09:14:28 +02:00
Thomas Schwinge 9f2bc5077d [gcn] Work-around libgomp 'error: array subscript 0 is outside array bounds of ‘__lds struct gomp_thread * __lds[0]’ [-Werror=array-bounds]' [PR101484]
... seen as of commit a110855667 "Correct
handling of variable offset minus constant in -Warray-bounds [PR100137]".

Awaiting a different solution, of course.

	libgomp/
	PR target/101484
	* config/gcn/team.c: Apply '-Werror=array-bounds' work-around.
	* libgomp.h [__AMDGCN__]: Likewise.
2021-07-19 10:26:12 +02:00
Jakub Jelinek 95d6776217 openmp: Fix up handling of target constructs in offloaded routines [PR100573]
OpenMP Nesting of Regions restrictions say:
- If a target update, target data, target enter data, or target exit data
construct is encountered during execution of a target region, the behavior is unspecified.
- If a target construct is encountered during execution of a target region and a device
clause in which the ancestor device-modifier appears is not present on the construct, the
behavior is unspecified.
That wording is about the dynamic (runtime) behavior, not about lexical nesting,
so while it is UB if omp target * is encountered in the target region, we need to make
it compile and link (for lexical nesting of target * inside of target we actually
emit a warning).

To make this work, I had to do multiple changes.
One was to mark .omp_data_{sizes,kinds}.* variables when static as "omp declare target".
Another one was to add stub GOMP_target* entrypoints to nvptx and gcn libgomp.a.
The entrypoint functions shouldn't be called or passed in the offload regions,
otherwise
libgomp: cuLaunchKernel error: too many resources requested for launch
was reported; fixed by changing those arguments of calls to GOMP_target_ext
to NULL.
And we didn't mark the entrypoints "omp target entrypoint" when the caller
has been "omp declare target".

2021-05-26  Jakub Jelinek  <jakub@redhat.com>

	PR libgomp/100573
gcc/
	* omp-low.c: Include omp-offload.h.
	(create_omp_child_function): If current_function_decl has
	"omp declare target" attribute and is_gimple_omp_offloaded,
	remove that attribute from the copy of attribute list and
	add "omp target entrypoint" attribute instead.
	(lower_omp_target): Mark .omp_data_sizes.* and .omp_data_kinds.*
	variables for offloading if in omp_maybe_offloaded_ctx.
	* omp-offload.c (pass_omp_target_link::execute): Nullify second
	argument to GOMP_target_data_ext in offloaded code.
libgomp/
	* config/nvptx/target.c (GOMP_target_ext, GOMP_target_data_ext,
	GOMP_target_end_data, GOMP_target_update_ext,
	GOMP_target_enter_exit_data): New dummy entrypoints.
	* config/gcn/target.c (GOMP_target_ext, GOMP_target_data_ext,
	GOMP_target_end_data, GOMP_target_update_ext,
	GOMP_target_enter_exit_data): Likewise.
	* testsuite/libgomp.c-c++-common/for-3.c (DO_PRAGMA, OMPTEAMS,
	OMPFROM, OMPTO): Define.
	(main): Remove #pragma omp target teams around all the tests.
	* testsuite/libgomp.c-c++-common/target-41.c: New test.
	* testsuite/libgomp.c-c++-common/target-42.c: New test.
2021-05-26 11:28:42 +02:00
Jakub Jelinek 99dee82307 Update copyright years. 2021-01-04 10:26:59 +01:00
Jakub Jelinek 74c9882b80 openmp: Change omp_get_initial_device () to match OpenMP 5.1 requirements
> Therefore, I think until omp_get_initial_device () value is changed, we

The following so far untested patch implements that change.

OpenMP 4.5 said for omp_get_initial_device:
The value of the device number is implementation defined. If it is between 0 and one less than
omp_get_num_devices() then it is valid for use with all device constructs and routines; if it is
outside that range, then it is only valid for use with the device memory routines and not in the
device clause.
and OpenMP 5.0 similarly, but OpenMP 5.1 says:
The value of the device number is the value returned by the omp_get_num_devices routine.

As the new value is compatible with what has been required earlier, I think
we can change it already now.

2020-10-22  Jakub Jelinek  <jakub@redhat.com>

	* icv.c (omp_get_initial_device): Remove including corresponding
	ialias.
	* icv-device.c (omp_get_initial_device): New function.  Return
	gomp_get_num_devices ().  Add ialias.
	* target.c (resolve_device): Don't fail with
	OMP_TARGET_OFFLOAD=mandatory if device_id is equal to
	gomp_get_num_devices ().
	(omp_target_alloc, omp_target_free, omp_target_is_present,
	omp_target_memcpy, omp_target_memcpy_rect, omp_target_associate_ptr,
	omp_target_disassociate_ptr, omp_pause_resource): Use
	gomp_get_num_devices () instead of GOMP_DEVICE_HOST_FALLBACK on the
	first use in the functions, in uses dominated by the
	gomp_get_num_devices call use num_devices_openmp instead.
	* libgomp.texi (omp_get_initial_device): Document.
	* config/gcn/icv-device.c (omp_get_initial_device): New function.
	Add ialias.
	* config/nvptx/icv-device.c (omp_get_initial_device): Likewise.
	* testsuite/libgomp.c/target-40.c: New test.
2020-10-22 09:31:01 +02:00
Andrew Stubbs 6f51395197 libgomp: disable barriers in nested teams
Both GCN and NVPTX allow nested parallel regions, but the barrier
implementation did not allow the nested teams to run independently of each
other (due to hardware limitations).  This patch fixes that, under the
assumption that each thread will create a new subteam of one thread, by
simply not using barriers when there's no other thread to synchronise.

libgomp/ChangeLog:

	* config/gcn/bar.c (gomp_barrier_wait_end): Skip the barrier if the
	total number of threads is one.
	(gomp_team_barrier_wake): Likewise.
	(gomp_team_barrier_wait_end): Likewise.
	(gomp_team_barrier_wait_cancel_end): Likewise.
	* config/nvptx/bar.c (gomp_barrier_wait_end): Likewise.
	(gomp_team_barrier_wake): Likewise.
	(gomp_team_barrier_wait_end): Likewise.
	(gomp_team_barrier_wait_cancel_end): Likewise.
	* testsuite/libgomp.c-c++-common/nested-parallel-unbalanced.c: New test.
2020-09-29 11:48:04 +01:00
Jakub Jelinek 8d9254fc8a Update copyright years.
From-SVN: r279813
2020-01-01 12:51:42 +01:00
Tobias Burnus 93d9021987 libgomp – spelling fixes, incl. omp_lib.h.in
* omp_lib.h.in: Fix spelling of function declaration
        omp_get_cancell(l)ation.
        * libgomp.texi (acc_is_present, acc_async_test, acc_async_test_all):
        Fix typos.
        * env.c: Fix comment typos.
        * oacc-host.c: Likewise.
        * ordered.c: Likewise.
        * task.c: Likewise.
        * team.c: Likewise.
        * config/gcn/task.c: Likewise.
        * config/gcn/team.c: Likewise.
        * config/nvptx/task.c: Likewise.
        * config/nvptx/team.c: Likewise.
        * plugin/plugin-gcn.c: Likewise.
        * testsuite/libgomp.fortran/jacobi.f: Likewise.
        * testsuite/libgomp.hsa.c/tiling-2.c: Likewise.
        * testsuite/libgomp.oacc-c-c++-common/enter_exit-lib.c: Likewise.

From-SVN: r279218
2019-12-11 12:45:49 +01:00
Andrew Stubbs cee1645106 Optimize GCN OpenMP malloc performance
2019-11-13  Andrew Stubbs  <ams@codesourcery.com>

	libgomp/
	* config/gcn/team.c (gomp_gcn_enter_kernel): Set up the team arena
	and use team_malloc variants.
	(gomp_gcn_exit_kernel): Use team_free.
	* libgomp.h (TEAM_ARENA_SIZE): Define.
	(TEAM_ARENA_START): Define.
	(TEAM_ARENA_FREE): Define.
	(TEAM_ARENA_END): Define.
	(team_malloc): New function.
	(team_malloc_cleared): New function.
	(team_free): New function.
	* team.c (gomp_new_team): Initialize and use team_malloc.
	(free_team): Use team_free.
	(gomp_free_thread): Use team_free.
	(gomp_pause_host): Use team_free.
	* work.c (gomp_init_work_share): Use team_malloc.
	(gomp_fini_work_share): Use team_free.

From-SVN: r278136
2019-11-13 12:38:09 +00:00
Andrew Stubbs fa4999953d GCN libgomp port
2019-11-13  Andrew Stubbs  <ams@codesourcery.com>
	    Kwok Cheung Yeung  <kcy@codesourcery.com>
	    Julian Brown  <julian@codesourcery.com>
	    Tom de Vries  <tom@codesourcery.com>

	include/
	* gomp-constants.h (GOMP_DEVICE_GCN): Define.
	(GOMP_VERSION_GCN): Define.

	libgomp/
	* Makefile.am (libgomp_la_SOURCES): Add oacc-target.c.
	* Makefile.in: Regenerate.
	* config.h.in (PLUGIN_GCN): Add new undef.
	* config/accel/openacc.f90 (acc_device_gcn): New parameter.
	* config/gcn/affinity-fmt.c: New file.
	* config/gcn/bar.c: New file.
	* config/gcn/bar.h: New file.
	* config/gcn/doacross.h: New file.
	* config/gcn/icv-device.c: New file.
	* config/gcn/oacc-target.c: New file.
	* config/gcn/simple-bar.h: New file.
	* config/gcn/target.c: New file.
	* config/gcn/task.c: New file.
	* config/gcn/team.c: New file.
	* config/gcn/time.c: New file.
	* configure.ac: Add amdgcn*-*-*.
	* configure: Regenerate.
	* configure.tgt: Add amdgcn*-*-*.
	* libgomp-plugin.h (offload_target_type): Add OFFLOAD_TARGET_TYPE_GCN.
	* libgomp.h (gcn_thrs): Add amdgcn variant.
	(set_gcn_thrs): Likewise.
	(gomp_thread): Likewise.
	* oacc-int.h (goacc_thread): Likewise.
	* oacc-target.c: New file.
	* openacc.f90 (acc_device_gcn): New parameter.
	* openacc.h (acc_device_t): Add acc_device_gcn.
	* team.c (gomp_free_pool_helper): Add amdgcn support.

Co-Authored-By: Julian Brown <julian@codesourcery.com>
Co-Authored-By: Kwok Cheung Yeung <kcy@codesourcery.com>
Co-Authored-By: Tom de Vries <tom@codesourcery.com>

From-SVN: r278135
2019-11-13 12:38:04 +00:00