Commit Graph

885 Commits

Author SHA1 Message Date
Tom de Vries
d41d952c9b [nvptx] Handle assignment to gang-level reduction variable
2019-01-15  Tom de Vries  <tdevries@suse.de>

	PR target/80547
	* config/nvptx/nvptx.c (nvptx_goacc_reduction_init): Handle
	lhs == NULL_TREE for gang-level reduction.

	* testsuite/libgomp.oacc-c-c++-common/gang-reduction-var-assignment.c:
	New test.

From-SVN: r267934
2019-01-15 10:11:16 +00:00
Tom de Vries
efb56ae82b [nvptx] Enable setting vector length using -fopenacc-dim -- testcases
Add some test-cases that set vector length using -fopenacc-dim.

2019-01-12  Tom de Vries  <tdevries@suse.de>

	* testsuite/libgomp.oacc-c-c++-common/pr85486-2.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/vector-length-128-2.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/vector-length-128-5.c: New test.
	* testsuite/libgomp.oacc-fortran/gemm-2.f90: New test.

From-SVN: r267897
2019-01-12 22:19:31 +00:00
Tom de Vries
2c2ff1684d [nvptx] Enable setting vector length using -fopenacc-dim
Enable setting vector length using -fopenacc-dim, f.i. -fopenacc-dim=::128.

2019-01-12  Tom de Vries  <tdevries@suse.de>

	* config/nvptx/nvptx.c (nvptx_goacc_validate_dims_1): Alow setting
	vector length using -fopenacc-dim.

	* plugin/plugin-nvptx.c (nvptx_exec): Update error message.

From-SVN: r267896
2019-01-12 22:19:15 +00:00
Tom de Vries
a105775825 [nvptx] Add vector_length 64 test-cases
Add some test-cases using vector_length 64.

2019-01-12  Tom de Vries  <tdevries@suse.de>

	* testsuite/libgomp.oacc-c-c++-common/vector-length-64-1.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/vector-length-64-2.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/vector-length-64-3.c: New test.

From-SVN: r267895
2019-01-12 22:19:02 +00:00
Tom de Vries
56314b772f [nvptx] Force vl32 if calling vector-partitionable routines -- test-cases
Add test-cases for "[nvptx] Force vl32 if calling vector-partitionable
routines".

2019-01-12  Tom de Vries  <tdevries@suse.de>

	PR target/85486
	* testsuite/libgomp.oacc-c-c++-common/pr85486-3.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/pr85486.c: New test.

From-SVN: r267894
2019-01-12 22:18:50 +00:00
Tom de Vries
b39e4366a2 [nvptx] Don't emit barriers for empty loops -- test-cases
Add test-cases for PR85381.

2019-01-12  Tom de Vries  <tdevries@suse.de>

	PR target/85381
	* testsuite/libgomp.oacc-c-c++-common/pr85381-5.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/pr85381.c: New test.

From-SVN: r267893
2019-01-12 22:18:39 +00:00
Tom de Vries
2cb7a501ab [nvptx] Enable large vectors -- reduction testcases
Add various reduction test-cases with vector length 128.

2019-01-12  Tom de Vries  <tdevries@suse.de>

	* testsuite/libgomp.oacc-c-c++-common/vred2d-128.c: New test.
	* testsuite/libgomp.oacc-fortran/gemm.f90: New test.
	* testsuite/libgomp.oacc-c-c++-common/vector-length-128-10.c: New test.

From-SVN: r267892
2019-01-12 22:18:27 +00:00
Tom de Vries
8e77f71eda [nvptx] Enable large vectors -- test-cases
Add various test-cases with vector length 128.

2019-01-12  Tom de Vries  <tdevries@suse.de>

	* testsuite/libgomp.oacc-c-c++-common/vector-length-128-7.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/vector-length-128-4.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/vector-length-128-6.c: New test.

From-SVN: r267891
2019-01-12 22:18:11 +00:00
Tom de Vries
52d22ece49 [nvptx] Update insufficient launch message for variable vector_length
Update message in nvptx libgomp plugin about insufficient resources to launch
kernel, to accommodate for the fact the vector_length can now be variable.

2019-01-12  Tom de Vries  <tdevries@suse.de>

	* plugin/plugin-nvptx.c (nvptx_exec): Update insufficient hardware
	resources diagnostic.

From-SVN: r267890
2019-01-12 22:18:00 +00:00
Tom de Vries
2b9d9e3937 [nvptx] Enable large vectors
Allow vector_length clauses to accept values larger than warp size.  Note that
this does not enable setting vector_length to values larger than warp size using
-fopenacc-dim.

2019-01-12  Tom de Vries  <tdevries@suse.de>

	* config/nvptx/nvptx.c (nvptx_goacc_validate_dims): Take larger vector
	lengths into account.

	* testsuite/libgomp.oacc-c-c++-common/vector-length-128-1.c: Expect
	vector length to be 128.
	* testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Expect vector
	length 2097152 to be reduced to 1024 instead of 32.

From-SVN: r267889
2019-01-12 22:17:42 +00:00
Thomas Schwinge
4102bda64c Better distinguish OpenACC and OpenMP sections in libgomp.texi
2019-01-11  Thomas Schwinge  <thomas@codesourcery.com>
            James Norris  <jnorris@codesourcery.com>

	* libgomp.texi: Better distinguish OpenACC and OpenMP "Runtime
	Library Routines", and "Environment Variables".

Co-Authored-By: James Norris <jnorris@codesourcery.com>

From-SVN: r267841
2019-01-11 13:55:01 +00:00
Tom de Vries
052aaaceed [nvptx] Don't allow vector_length 64 with num_workers 16
When using a compiler build with:
...
+#define PTX_DEFAULT_VECTOR_LENGTH PTX_CTA_SIZE
...

consider a test-case:
...
int
main (void)
{
  #pragma acc parallel vector_length (64)
    #pragma acc loop worker
    for (unsigned int i = 0; i < 32; i++)
      #pragma acc loop vector
      for (unsigned int j = 0; j < 64; j++)
        ;

  return 0;
}
...

If num_workers is 16, either because:
- we add a "num_workers (16)" clause on the parallel directive, or
- we set "GOMP_OPENACC_DIM=:16:", or
- the libgomp plugin chooses 16 num_workers
we run into an illegal instruction at runtime, because a bar.sync instruction
tries to use a barrier 16.  The instruction is illegal, because ptx supports
only 16 barriers per CTA, and the valid range is 0..15.

The problem is that with a warp-multiple vector length, we use a code generation
scheme with a per-worker barrier.  And because barrier zero is reserved for
per-cta barrier, only the remaining 15 barriers can be used as per-worker
barrier, and consequently we can't use num_workers larger than 15.

This problem occurs only for vector_length 64.  For vector_length 32, we use a
different code generation scheme, and for vector_length >= 96, the maximum
num_workers is not big enough not to trigger this problem.

Also, this problem only occurs for num_workers 16.  As explained above,
num_workers 15 is safe to use, and 16 is already the maximum num_workers for
vector_length 64.

This patch fixes the problem in both the compiler (handling "num_workers (16)")
and in the libgomp nvptx plugin (with and without "GOMP_OPENACC_DIM=:16:").

2019-01-11  Tom de Vries  <tdevries@suse.de>

	* config/nvptx/nvptx.c (PTX_CTA_NUM_BARRIERS, PTX_PER_CTA_BARRIER)
	(PTX_NUM_PER_CTA_BARRIER, PTX_FIRST_PER_WORKER_BARRIER)
	(PTX_NUM_PER_WORKER_BARRIERS): Define.
	(nvptx_apply_dim_limits): Prevent vector_length 64 and
	num_workers 16.

	* plugin/plugin-nvptx.c (nvptx_exec): Prevent vector_length 64 and
	num_workers 16.

From-SVN: r267838
2019-01-11 11:46:43 +00:00
Tom de Vries
9390f91687 [libgomp, testsuite, openacc] Remove -foffload=-w in reduction-[1-5].c
Before the commit "[libgomp, testsuite, openacc] Don't use const int for
dimensions", the "const int" construct was used to set launch dimensions in
reductions-[1-5].c.  In the case of -xc -O0, the const int is implemented as a
variable by the C front-end.  Consequently, the nvptx back-end generated
warnings that vector_length was overridden to be hard-coded, rather than left to
be set at runtime.  The test-cases silenced these warnings by switching off all
warnings in the accelerator compiler using "-foffload=-w".

Given that no warnings occur anymore, remove the "-foffload=-w" setting.

2019-01-11  Tom de Vries  <tdevries@suse.de>

	* testsuite/libgomp.oacc-c-c++-common/reduction-1.c: Remove
	-foffload=-w.
	* testsuite/libgomp.oacc-c-c++-common/reduction-2.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/reduction-3.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/reduction-4.c: Same.
	* testsuite/libgomp.oacc-c-c++-common/reduction-5.c: Same.

From-SVN: r267836
2019-01-11 11:46:06 +00:00
Tom de Vries
2c3e7ad20b [nvptx, testsuite, openacc, libgomp] Add insufficient-resources.c
Add a test-case that tests the "insufficient resources" fatal in the nvptx
libgomp plugin.

2019-01-11  Tom de Vries  <tdevries@suse.de>

	* testsuite/libgomp.oacc-c-c++-common/insufficient-resources.c: New
	test.

From-SVN: r267835
2019-01-11 11:45:55 +00:00
Nathan Sidwell
e222497dcb Add testcase from PR71959
libgomp/

	PR lto/71959
	* testsuite/libgomp.oacc-c++/pr71959-aux.cc: New.
	* testsuite/libgomp.oacc-c++/pr71959.C: New.

Co-Authored-By: Julian Brown <julian@codesourcery.com>

From-SVN: r267806
2019-01-10 12:32:03 +00:00
Sebastian Huber
cb87fec331 libgomp: Reduce copy and paste for RTEMS
libgomp/

	* config/rtems/bar.c: Include "../linux/bar.c" and delete copy
	and paste code.

From-SVN: r267752
2019-01-09 06:16:05 +00:00
Sebastian Huber
30b4d0d0b9 libgomp: Avoid complex dependencies for RTEMS
libgomp/

	* config/rtems/affinity-fmt.c: New file.  Include affinity-fmt.c,
	undefining HAVE_GETPID and HAVE_GETHOSTNAME, and mapping fwrite to
	write.

From-SVN: r267751
2019-01-09 06:14:15 +00:00
Tom de Vries
5d0bc70ae4 [libgomp, testsuite, openacc] Don't use const int for dimensions
Const int is handled differently at -O0 for -xc and -xc++, which can cause noise
in testsuite/libgomp.oacc-c-c++-common test-cases (which are both run for c and
c++) if const int is used for launch dimensions.

Fix this by using #defines instead.

2019-01-09  Tom de Vries  <tdevries@suse.de>

	PR target/88756
	* testsuite/libgomp.oacc-c-c++-common/reduction-1.c (ng, nw, vl): Use
	#define instead of "const int".
	* testsuite/libgomp.oacc-c-c++-common/reduction-2.c (ng, nw, vl): Same.
	* testsuite/libgomp.oacc-c-c++-common/reduction-3.c (ng, nw, vl): Same.
	* testsuite/libgomp.oacc-c-c++-common/reduction-4.c (ng, nw, vl): Same.
	* testsuite/libgomp.oacc-c-c++-common/reduction-5.c (ng, nw, vl): Same.

From-SVN: r267747
2019-01-09 00:07:55 +00:00
Tom de Vries
2c372e81a9 [nvptx, libgomp] Don't launch with num_workers == 0
When using a compiler build with:
...
+#define PTX_DEFAULT_VECTOR_LENGTH PTX_CTA_SIZE
+#define PTX_MAX_VECTOR_LENGTH PTX_CTA_SIZE
...
and running the libgomp testsuite, we run into an execution failure in
parallel-loop-1.c, due to a cuda launch failure:
...
  nvptx_exec: kernel f6_none_none$_omp_fn$0: launch gangs=480, workers=0, \
    vectors=1024

libgomp: cuLaunchKernel error: invalid argument
...
because workers == 0.

The workers variable is set to 0 here in nvptx_exec:
...
                workers = blocks / actual_vectors;
...
because actual_vectors is 1024, and blocks is 768:
...
cuOccupancyMaxPotentialBlockSize: grid = 10, block = 768
...

Fix this by ensuring that workers is at least one.

2019-01-09  Tom de Vries  <tdevries@suse.de>

	* plugin/plugin-nvptx.c (nvptx_exec): Make sure to launch with at least
	one worker.

From-SVN: r267746
2019-01-09 00:07:45 +00:00
Tom de Vries
43493c97a6 [nvptx] Fix libgomp.oacc-c-c++-common/vector-length-128-3.c
The vector-length-128-3.c test-case uses GOMP_OPENACC_DIM=-:-:128, but '-' is
not yet supported on trunk.  Use GOMP_OPENACC_DIM=::128 instead.

2019-01-07  Tom de Vries  <tdevries@suse.de>

	* testsuite/libgomp.oacc-c-c++-common/vector-length-128-3.c: Fix
	GOMP_OPENACC_DIM argument.

From-SVN: r267624
2019-01-07 08:09:49 +00:00
Tom de Vries
5c571497e1 [nvptx] Add vector_length 128 testcases
Add a couple of test-cases using vector length 128, while checking that we
override to vector length 32.

2019-01-03  Tom de Vries  <tdevries@suse.de>

	* testsuite/libgomp.oacc-c-c++-common/vector-length-128-1.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/vector-length-128-3.c: New test.

From-SVN: r267559
2019-01-03 15:08:46 +00:00
Jakub Jelinek
a554497024 Update copyright years.
From-SVN: r267494
2019-01-01 13:31:55 +01:00
Jakub Jelinek
3f27508ce4 gcc.c (process_command): Update copyright notice dates.
* gcc.c (process_command): Update copyright notice dates.
	* gcov-dump.c (print_version): Ditto.
	* gcov.c (print_version): Ditto.
	* gcov-tool.c (print_version): Ditto.
	* gengtype.c (create_file): Ditto.
	* doc/cpp.texi: Bump @copying's copyright year.
	* doc/cppinternals.texi: Ditto.
	* doc/gcc.texi: Ditto.
	* doc/gccint.texi: Ditto.
	* doc/gcov.texi: Ditto.
	* doc/install.texi: Ditto.
	* doc/invoke.texi: Ditto.
gcc/fortran/
	* gfortranspec.c (lang_specific_driver): Update copyright notice
	dates.
	* gfc-internals.texi: Bump @copying's copyright year.
	* gfortran.texi: Ditto.
	* intrinsic.texi: Ditto.
	* invoke.texi: Ditto.
gcc/go/
	* gccgo.texi: Bump @copyrights-go year.
gcc/ada/
 	* gnat_ugn.texi: Bump @copying's copyright year.
 	* gnat_rm.texi: Likewise.
gcc/d/
	* gdc.texi: Bump @copyrights-d year.
libitm/
	* libitm.texi: Bump @copying's copyright year.
libgomp/
	* libgomp.texi: Bump @copying's copyright year.
libquadmath/
	* libquadmath.texi: Bump @copying's copyright year.

From-SVN: r267492
2019-01-01 12:34:49 +01:00
Thomas Schwinge
59d5960cdb For libgomp OpenACC entry points, redefine the "device" argument to "flags"
... so that we're then able to use this for other flags in addition to
"GOACC_FLAG_HOST_FALLBACK".

	gcc/
	* omp-expand.c (expand_omp_target): Restructure OpenACC vs. OpenMP
	code paths.  Update for libgomp OpenACC entry points change.
	include/
	* gomp-constants.h (GOACC_FLAG_HOST_FALLBACK)
	(GOACC_FLAGS_MARSHAL_OP, GOACC_FLAGS_UNMARSHAL): Define.
	libgomp/
	* oacc-parallel.c (GOACC_parallel_keyed, GOACC_parallel)
	(GOACC_data_start, GOACC_enter_exit_data, GOACC_update)
	(GOACC_declare): Redefine the "device" argument to "flags".

From-SVN: r267448
2018-12-28 12:34:14 +01:00
Thomas Schwinge
a44c17909a Cleanup libgomp's coalesce chunk data structures
libgomp/
	* target.c (struct gomp_coalesce_chunk): New structure.
	(struct gomp_coalesce_buf): Update the chunks member to use that
	type.  Adjust all users.

Co-Authored-By: Cesar Philippidis <cesar@codesourcery.com>

From-SVN: r267446
2018-12-28 12:33:54 +01:00
Steven G. Kargl
b3c64ed0b0 aligned1.f03: Fix invalid code that now causes an error after r267415.
2018-12-27  Steven G. Kargl  <kargl@gcc.gnu.org>

	* libgomp.fortran/aligned1.f03: Fix invalid code that now causes
	an error after r267415.

From-SVN: r267436
2018-12-27 20:57:12 +00:00
Tom de Vries
a152954ea4 [nvptx] Commit passing pr85381-*.c test-cases
Add pr85381*.c test-cases that are already passing without the fix for PR85381.

Build and reg-tested on x86_64 with nvptx accelerator.

2018-12-19  Tom de Vries  <tdevries@suse.de>

	* testsuite/libgomp.oacc-c-c++-common/pr85381-2.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/pr85381-3.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/pr85381-4.c: New test.

From-SVN: r267268
2018-12-19 14:20:54 +00:00
Tom de Vries
49188cd1f2 [nvptx, libgomp] Move rtl-dump test-cases to libgomp
The goacc.exp test-cases nvptx-merged-loop.c and nvptx-sese-1.c are failing
during linking due to missing libgomp.spec.

Move them to the libgomp testsuite.

Build and reg-tested on x86_64 with nvptx accelerator.

2018-12-19  Tom de Vries  <tdevries@suse.de>

	* gcc.dg/goacc/nvptx-merged-loop.c: Move to
	libgomp/testsuite/libgomp.oacc-c-c++-common.
	* gcc.dg/goacc/nvptx-sese-1.c: Same.

	* testsuite/lib/libgomp.exp: Add load_lib of scanoffloadrtl.exp.
	* testsuite/libgomp.oacc-c-c++-common/nvptx-merged-loop.c: Move from
	gcc/testsuite/gcc.dg/goacc.
	* testsuite/libgomp.oacc-c-c++-common/nvptx-sese-1.c: Same.

From-SVN: r267267
2018-12-19 14:20:44 +00:00
Thomas Schwinge
c759830b29 Missing changes from "Adjust copy/copyin/copyout/create for OpenACC 2.5"
Most of that patch's changes were already committed as part of r261813 "Update
OpenACC data clause semantics to the 2.5 behavior", but not all of them.

	libgomp/
	* oacc-mem.c (acc_present_or_create): Remove definition and change
	to alias of acc_create.
	(acc_present_or_copyin): Remove definition and change to alias of
	acc_copyin.
	* oacc-parallel.c (GOACC_enter_exit_data): Call acc_create instead
	of acc_present_or_create.
	* testsuite/libgomp.oacc-c-c++-common/data-already-1.c: Remove.
	* testsuite/libgomp.oacc-c-c++-common/data-already-2.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/data-already-3.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/data-already-4.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/data-already-5.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/data-already-6.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/data-already-7.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/data-already-8.c: Likewise.
	* testsuite/libgomp.oacc-fortran/data-already-1.f: Likewise.
	* testsuite/libgomp.oacc-fortran/data-already-2.f: Likewise.
	* testsuite/libgomp.oacc-fortran/data-already-3.f: Likewise.
	* testsuite/libgomp.oacc-fortran/data-already-4.f: Likewise.
	* testsuite/libgomp.oacc-fortran/data-already-5.f: Likewise.
	* testsuite/libgomp.oacc-fortran/data-already-6.f: Likewise.
	* testsuite/libgomp.oacc-fortran/data-already-7.f: Likewise.
	* testsuite/libgomp.oacc-fortran/data-already-8.f: Likewise.

Co-Authored-By: Chung-Lin Tang <cltang@codesourcery.com>

From-SVN: r267153
2018-12-14 21:43:12 +01:00
Thomas Schwinge
f847198ec3 [PR88495] An OpenACC async queue is always synchronized with itself
An OpenACC async queue is always synchronized with itself, so invocations like
"#pragma acc wait(0) async(0)", or "acc_wait_async (0, 0)" don't make a lot of
sense, but are still valid.

	libgomp/
	PR libgomp/88495
	* plugin/plugin-nvptx.c (nvptx_wait_async): Don't refuse
	"identical parameters".
	* testsuite/libgomp.oacc-c-c++-common/asyncwait-nop-1.c: Update.
	* testsuite/libgomp.oacc-c-c++-common/lib-80.c: Remove.

From-SVN: r267152
2018-12-14 21:43:02 +01:00
Thomas Schwinge
c8ab8aab9f [PR88484] OpenACC wait directive without wait argument but with async clause
We don't correctly handle "#pragma acc wait async (a)" for "a >= 0", handling
as a no-op whereas it should enqueue the appropriate wait operations on
"async (a)".

	libgomp/
	PR libgomp/88484
	* oacc-parallel.c (GOACC_wait): Correct handling for "async >= 0".
	* testsuite/libgomp.oacc-c-c++-common/asyncwait-nop-1.c: New file.

From-SVN: r267151
2018-12-14 21:42:50 +01:00
Thomas Schwinge
1404af62dc [PR88407] [OpenACC] Correctly handle unseen async-arguments
... which turn the operation into a no-op.

	libgomp/
	PR libgomp/88407
	* plugin/plugin-nvptx.c (nvptx_async_test, nvptx_wait)
	(nvptx_wait_async): Unseen async-argument is a no-op.
	* testsuite/libgomp.oacc-c-c++-common/async_queue-1.c: Update.
	* testsuite/libgomp.oacc-c-c++-common/data-2-lib.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/data-2.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-79.c: Likewise.
	* testsuite/libgomp.oacc-fortran/lib-12.f90: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-71.c: Merge into...
	* testsuite/libgomp.oacc-c-c++-common/lib-69.c: ... this.  Update.
	* testsuite/libgomp.oacc-c-c++-common/lib-77.c: Merge into...
	* testsuite/libgomp.oacc-c-c++-common/lib-74.c: ... this.  Update

From-SVN: r267150
2018-12-14 21:42:40 +01:00
Thomas Schwinge
7de562eec2 Revise libgomp.oacc-c-c++-common/data-2-lib.c, libgomp.oacc-c-c++-common/data-2.c
These are meant to be functionally equivalent (but no longer are), just using
different means.  Also, use the OpenACC "*_async" functions recently added.

	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/data-2-lib.c: Revise.
	* testsuite/libgomp.oacc-c-c++-common/data-2.c: Likewise.

From-SVN: r267149
2018-12-14 21:42:29 +01:00
Chung-Lin Tang
17469af75b Correctly describe OpenACC async/wait dependencies
libgomp/
	* testsuite/libgomp.oacc-c-c++-common/data-2-lib.c: Adjust.
	* testsuite/libgomp.oacc-c-c++-common/data-2.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/data-3.c: Likewise.

Reviewed-by: Thomas Schwinge <thomas@codesourcery.com>

From-SVN: r267148
2018-12-14 21:42:18 +01:00
Thomas Schwinge
18c247cc0b [PR88370] acc_get_cuda_stream/acc_set_cuda_stream: acc_async_sync, acc_async_noval
Per my reading of the OpenACC specification (and as supported by secondary
documentation, such as code examples, or presentations), it's valid to call
"acc_get_cuda_stream"/"acc_set_cuda_stream" also with "acc_async_sync",
"acc_async_noval" arguments, not just with the nonnegative values as currently
implemented.

	libgomp/
	PR libgomp/88370
	* libgomp.texi (acc_get_current_cuda_context, acc_get_cuda_stream)
	(acc_set_cuda_stream): Clarify.
	* oacc-cuda.c (acc_get_cuda_stream, acc_set_cuda_stream): Use
	"async_valid_p".
	* plugin/plugin-nvptx.c (nvptx_set_cuda_stream): Refuse "async ==
	acc_async_sync".
	* testsuite/libgomp.oacc-c-c++-common/acc_set_cuda_stream-1.c: New file.
	* testsuite/libgomp.oacc-c-c++-common/async_queue-1.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-84.c: Update.
	* testsuite/libgomp.oacc-c-c++-common/lib-85.c: Likewise.

From-SVN: r267147
2018-12-14 21:42:08 +01:00
Tom de Vries
b0aba46ca6 [offloading] Error on missing symbols
When compiling an OpenMP or OpenACC program containing a reference in the
offloaded code to a symbol that has not been included in the offloaded code,
the offloading compiler may ICE in lto1.

Fix this by erroring out instead, mentioning the problematic symbol:
...
error: variable 'var' has been referenced in offloaded code but hasn't
  been marked to be included in the offloaded code
lto1: fatal error: errors during merging of translation units
compilation terminated.
...

Build x86_64 with nvptx accelerator and reg-tested libgomp.

Build x86_64 and reg-tested libgomp.

2018-12-14  Tom de Vries  <tdevries@suse.de>

	* lto-cgraph.c (verify_node_partition): New function.
	(input_overwrite_node, input_varpool_node): Use verify_node_partition.

	* testsuite/libgomp.c-c++-common/function-not-offloaded-aux.c: New test.
	* testsuite/libgomp.c-c++-common/function-not-offloaded.c: New test.
	* testsuite/libgomp.c-c++-common/variable-not-offloaded.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/function-not-offloaded.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/variable-not-offloaded.c: New test.

From-SVN: r267134
2018-12-14 13:48:56 +00:00
Tom de Vries
fe0827eed0 [libgomp, nvptx] Fix libgomp.c/target-5.c compilation
Libgomp test-case libgomp.c/target-5.c is failing to compile when building for
x86_64 with nvptx accelerator due to missing:
- getpid
- gethostname
- isatty (pulled in by fwrite)
in the nvptx newlib.

This patch fixes the build failure by:
- adding a function gomp_print_string which limits the use of fwrite to a single
  location (in affinity-fmt.c), and
- creating an nvptx version of affinity-fmt.c, which:
  - overrides the configure test results HAVE_GETPID and HAVE_GETHOSTNAME, and
  - implements fwrite using write.

Build and reg-tested on x86_64 with nvptx accelerator.

2018-12-13  Tom de Vries  <tdevries@suse.de>

	* affinity-fmt.c (gomp_print_string): New function, factored out of ...
	(omp_display_affinity, gomp_display_affinity_thread): ... here, and ...
	* fortran.c (omp_display_affinity_): ... here.
	* libgomp.h (gomp_print_string): Declare.
	* config/nvptx/affinity-fmt.c: New file.  Include affinity-fmt.c,
	undefining HAVE_GETPID and HAVE_GETHOSTNAME, and mapping fwrite to
	write.

From-SVN: r267100
2018-12-13 18:04:05 +00:00
Jakub Jelinek
a51f8c92e2 re PR libgomp/88460 ([nvptx] FAIL: libgomp.c++/for-24.C (internal compiler error))
PR libgomp/88460
	* testsuite/libgomp.c++/for-24.C (results): Include it in
	omp declare target region.
	(main): Use map (always, tofrom: results) instead of
	map (tofrom: results).

From-SVN: r267093
2018-12-13 13:53:19 +01:00
Jakub Jelinek
7a289b7d0a re PR fortran/88463 (Rejects conforming source, OpenMP Parallel region Default(None) reference to module parameter array, separate source)
PR fortran/88463
	* trans-openmp.c (gfc_omp_predetermined_sharing): Handle TREE_READONLY
	VAR_DECLs with DECL_EXTERNAL like those with TREE_STATIC.

	* testsuite/libgomp.fortran/pr88463-1.f90: New test.
	* testsuite/libgomp.fortran/pr88463-2.f90: New test.

From-SVN: r267069
2018-12-12 23:49:35 +01:00
Jakub Jelinek
a6ef2ac9d9 omp-builtins.def (BUILT_IN_GOMP_LOOP_NONMONOTONIC_RUNTIME_START, [...]): Fix up function types - remove one argument.
* omp-builtins.def (BUILT_IN_GOMP_LOOP_NONMONOTONIC_RUNTIME_START,
	BUILT_IN_GOMP_LOOP_MAYBE_NONMONOTONIC_RUNTIME_START,
	BUILT_IN_GOMP_LOOP_ULL_NONMONOTONIC_RUNTIME_START,
	BUILT_IN_GOMP_LOOP_ULL_MAYBE_NONMONOTONIC_RUNTIME_START,
	BUILT_IN_GOMP_PARALLEL_LOOP_NONMONOTONIC_RUNTIME,
	BUILT_IN_GOMP_PARALLEL_LOOP_MAYBE_NONMONOTONIC_RUNTIME): Fix up
	function types - remove one argument.

	* testsuite/libgomp.c-c++-common/for-16.c: New test.

From-SVN: r267067
2018-12-12 23:47:55 +01:00
Andreas Schwab
9c47a01994 Don't mark stack pointer as clobbered in asm
* config/linux/ia64/futex.h (sys_futex0): Don't mark r12 as
clobbered.

From-SVN: r267052
2018-12-12 11:22:42 +00:00
Thomas Koenig
d0cbb206e2 re PR libfortran/88411 (Random crashes for ASYNCHRONOUS writes (bad locking?))
2018-12-09  Thomas Koenig  <tkoenig@gcc.gnu.org>

	PR fortran/88411
	* io/transfer.c (dta_transfer_init): Do not treat as an
	asynchronous statement unless the statement has
	ASYNCHRONOUS="YES".
	(st_write_done): Likewise.
	(st_read_done): Do not perform async_wait for synchronous I/O
	on an async unit.
	(st_read_done): Likewise.

2018-12-09  Thomas Koenig  <tkoenig@gcc.gnu.org>

	PR fortran/88411
	* testsuite/libgomp.fortran/async_io_8.f90: New test.

From-SVN: r266929
2018-12-09 18:54:47 +00:00
Thomas Schwinge
84ca08d2d5 Coalesce host to device transfers in libgomp: not for link pointer
libgomp/
	* target.c (gomp_map_vars): Call gomp_copy_host2dev instead of
	devicep->host2dev_func.

Co-Authored-By: Jakub Jelinek <jakub@redhat.com>

From-SVN: r266919
2018-12-09 13:47:23 +01:00
Jakub Jelinek
6997628d35 re PR libgomp/87995 (libgomp.c/../libgomp.c-c++-common/cancel-taskgroup-3.c fails consistently after r265930)
PR libgomp/87995
	* testsuite/libgomp.c-c++-common/cancel-taskgroup-3.c: Require
	tls_runtime effective target.
	(t): New threadprivate variable.
	(main): Set t in threads which execute iterations of the worksharing
	loop.  Propagate that to the task after the loop and don't abort
	if the current taskgroup hasn't been cancelled.

From-SVN: r266904
2018-12-08 09:58:24 +01:00
Jakub Jelinek
4a82df9a38 tree-nested.c (convert_nonlocal_omp_clauses, [...]): Handle OMP_CLAUSE_IN_REDUCTION...
* tree-nested.c (convert_nonlocal_omp_clauses,
	convert_local_omp_clauses): Handle OMP_CLAUSE_IN_REDUCTION,
	OMP_CLAUSE_TASK_REDUCTION and OMP_CLAUSE__SIMT_ clauses.
	(convert_nonlocal_reference_stmt, convert_local_reference_stmt):
	Convert clauses for GIMPLE_OMP_TASKGROUP.

	* testsuite/libgomp.c/task-reduction-3.c: New test.

From-SVN: r266723
2018-12-02 13:50:50 +01:00
Jakub Jelinek
daa8c1d763 omp-low.c (check_omp_nesting_restrictions): Allow cancel or cancellation point with taskgroup clause inside of taskloop.
* omp-low.c (check_omp_nesting_restrictions): Allow cancel or
	cancellation point with taskgroup clause inside of taskloop.  Consider
	a taskloop construct without nogroup clause as implicit taskgroup for
	diagnostics if cancel/cancellation point with taskgroup clause is
	closely nested inside of taskgroup region.

	* c-c++-common/gomp/cancel-1.c (f2): Add various taskloop related
	tests.

	* testsuite/libgomp.c-c++-common/cancel-taskgroup-4.c: New test.

From-SVN: r266722
2018-12-02 13:48:42 +01:00
Cesar Philippidis
fe570ff8d4 [PR88288, OpenACC, libgomp] Adjust offsets for present data clauses
Make libgomp respect the on device offset of subarrays which may arise in
present data clauses.

	libgomp/
	PR libgomp/88288
	* oacc-parallel.c (GOACC_parallel_keyed): Add offset to devaddrs.
	* testsuite/libgomp.oacc-c-c++-common/pr88288.c: New test.

Reviewed-by: Thomas Schwinge <thomas@codesourcery.com>

From-SVN: r266688
2018-11-30 21:39:49 +01:00
Thomas Schwinge
c223608f54 Add libgomp.oacc-fortran/lib-16-2.f90
This is a copy of libgomp.oacc-fortran/lib-16.f90, but does 'include
"openacc_lib.h"' instead of 'use openacc'.

	libgomp/
	* testsuite/libgomp.oacc-fortran/lib-16-2.f90: New file.

From-SVN: r266683
2018-11-30 21:38:57 +01:00
Richard Biener
fc60283c5c re PR tree-optimization/88182 (ICE in vectorizable_reduction, at tree-vect-loop.c:6465)
2018-10-19  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/88182
	* g++.dg/gomp/pr88182.C: Move from libgomp and use -fopenmp-simd.

	* testsuite/libgomp.c++/pr88182.C: Move to g++.dg/gomp.

From-SVN: r266545
2018-11-28 08:29:16 +00:00
Jakub Jelinek
ffcf3b79c2 Makefile.am (AUTOMAKE_OPTIONS): Drop dejagnu.
* testsuite/Makefile.am (AUTOMAKE_OPTIONS): Drop dejagnu.
	(RUNTEST): Don't define.
	(RUNTESTDEFAULTFLAGS): Add.
	(check-DEJAGNU, site.exp, distclean-DEJAGNU): New goals.
	(distclean-am): Depend on distclean-DEJAGNU.
	(check-am): If -j% option is present in MFLAGS and if
	`getconf _NPROCESSORS_ONLN` is more than 8, export OMP_NUM_THREADS=8.
	(.PHONY): Add check-DEJAGNU and distclean-DEJAGNU.
	* testsuite/Makefile.in: Regenerated.

From-SVN: r266482
2018-11-26 22:03:41 +01:00