Commit Graph

756 Commits

Author SHA1 Message Date
Tom de Vries
1644d7f4c1 [openacc, libgomp, testsuite] Xfail declare-5.f90
We're currently running into:
...
FAIL: libgomp.oacc-fortran/declare-5.f90 -DACC_DEVICE_TYPE_nvidia=1 \
  -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0  execution test
FAIL: libgomp.oacc-fortran/declare-5.f90 -DACC_DEVICE_TYPE_nvidia=1 \
  -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O1  execution test
FAIL: libgomp.oacc-fortran/declare-5.f90 -DACC_DEVICE_TYPE_nvidia=1 \
  -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2  execution test
FAIL: libgomp.oacc-fortran/declare-5.f90 -DACC_DEVICE_TYPE_nvidia=1 \
  -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O3 -fomit-frame-pointer \
  -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
FAIL: libgomp.oacc-fortran/declare-5.f90 -DACC_DEVICE_TYPE_nvidia=1 \
  -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O3 -g  execution test
FAIL: libgomp.oacc-fortran/declare-5.f90 -DACC_DEVICE_TYPE_nvidia=1 \
  -DACC_MEM_SHARED=0 -foffload=nvptx-none  -Os  execution test
...

A PR was filed for this: PR92790 - "[OpenACC] declare device_resident -
Fortran common blocks not handled / libgomp.oacc-fortran/declare-5.f90 fails"

Xfail the fails.

Tested on x86_64-linux with nvptx accelerator.

libgomp/ChangeLog:

2020-10-06  Tom de Vries  <tdevries@suse.de>

	* testsuite/libgomp.oacc-fortran/declare-5.f90: Add xfail for PR92790.
2020-10-06 18:43:24 +02:00
Tom de Vries
3f2e15c2e6 [openacc] Fix acc declare for VLAs
Consider test-case test.c, with VLA A:
...
int main (void) {
  int N = 1000;
  int A[N];
  #pragma acc declare copy(A)
  return 0;
}
...
compiled using:
...
$ gcc test.c -fopenacc -S -fdump-tree-all
...

At original, we have:
...
  #pragma acc declare map(tofrom:A);
...
but at gimple, we have a map (to:A.1), but not a map (from:A.1):
...
  int[0:D.2074] * A.1;

  {
    int A[0:D.2074] [value-expr: *A.1];

    saved_stack.2 = __builtin_stack_save ();
    try
      {
        A.1 = __builtin_alloca_with_align (D.2078, 32);
        #pragma omp target oacc_declare map(to:(*A.1) [len: D.2076])
      }
    finally
      {
        __builtin_stack_restore (saved_stack.2);
      }
  }
...

This is caused by the following incompatibility.  When storing the desired
from clause in oacc_declare_returns, we use 'A.1' as the key:
...
10898                 oacc_declare_returns->put (decl, c);
(gdb) call debug_generic_expr (decl)
A.1
(gdb) call debug_generic_expr (c)
map(from:(*A.1))
...
but when looking it up, we use 'A' as the key:
...
(gdb)
1471                  tree *c = oacc_declare_returns->get (t);
(gdb) call debug_generic_expr (t)
A
...

Fix this by extracting the 'A.1' lookup key from 'A' using the decl-expr.

In addition, unshare the looked up value, to avoid running into
an "incorrect sharing of tree nodes" error.

Using these two fixes, we get our desired:
...
     finally
       {
+        #pragma omp target oacc_declare map(from:(*A.1))
         __builtin_stack_restore (saved_stack.2);
       }
...

Built on x86_64-linux with nvptx accelerator, tested libgomp.

gcc/ChangeLog:

2020-10-06  Tom de Vries  <tdevries@suse.de>

	PR middle-end/90861
	* gimplify.c (gimplify_bind_expr): Handle lookup in
	oacc_declare_returns using key with decl-expr.

libgomp/ChangeLog:

2020-10-06  Tom de Vries  <tdevries@suse.de>

	PR middle-end/90861
	* testsuite/libgomp.oacc-c-c++-common/declare-vla.c: Remove xfail.
2020-10-06 16:50:22 +02:00
Tom de Vries
ab3f4b27ab [omp, ftracer] Don't duplicate blocks in SIMT region
When running the libgomp testsuite on x86_64-linux with nvptx accelerator on
the test-case included in this patch, we run into:
...
FAIL: libgomp.fortran/pr95654.f90 -O3 -fomit-frame-pointer -funroll-loops \
  -fpeel-loops -ftracer -finline-functions  execution test
...

The test-case is a minimal version of this FAIL:
...
FAIL: libgomp.fortran/pr66199-5.f90 -O3 -fomit-frame-pointer -funroll-loops \
  -fpeel-loops -ftracer -finline-functions  execution test
...
but that one has stopped failing at commit c2ebf4f10d "openmp: Add support
for non-rect simd and improve collapsed simd support".

The problem is that ftracer duplicates a block containing GOMP_SIMT_VOTE_ANY.

That is, before ftracer we have (dropping the GOMP_SIMT_ prefix):
...
bb4(ENTER_ALLOC)
*----------+
|           \
|            \
|             v
|             *
v             bb8
*<------------*
bb5(VOTE_ANY)
*-------------+
|             |
|             |
|             |
|             |
|             v
|             *
v             bb7(XCHG_IDX)
*<------------*
bb6(EXIT)
...

The XCHG_IDX internal-fn does inter-SIMT-lane communication, which for nvptx
maps onto shfl, an operator which has the requirement that the warp executing
the operator is convergent.  The warp diverges at bb4, and
reconverges at bb5, and does not diverge by going to bb7, so the shfl is
indeed executed by a convergent warp.

After ftracer, we have:
...
bb4(ENTER_ALLOC)
*----------+
|           \
|            \
|             \
|              \
v               v
*               *
bb5(VOTE_ANY)   bb8(VOTE_ANY)
*               *
|\             /|
| \  +--------+ |
|  \/           |
|  /\           |
| /  +----------v
|/              *
v               bb7(XCHG_IDX)
*<--------------*
bb6(EXIT)
...

The warp diverges again at bb5, but does not reconverge again before bb6, so
the shfl is executed by a divergent warp, which causes the FAIL.

Fix this by making ftracer ignore blocks containing ENTER_ALLOC, VOTE_ANY and
EXIT, effectively treating the SIMT region conservatively.

An argument can be made that the test needs to be added in a more
generic place, like gimple_can_duplicate_bb_p or some such, and that ftracer
then needs to use the generic test.  But that's a discussion with a much
broader scope, so I'm leaving that for another patch.

Bootstrapped and reg-tested on x86_64-linux.

Built on x86_64-linux with nvptx accelerator, tested with libgomp.

gcc/ChangeLog:

	PR fortran/95654
	* tracer.c (ignore_bb_p): Ignore GOMP_SIMT_ENTER_ALLOC,
	GOMP_SIMT_VOTE_ANY and GOMP_SIMT_EXIT.

libgomp/ChangeLog:

2020-10-05  Tom de Vries  <tdevries@suse.de>

	PR fortran/95654
	* testsuite/libgomp.fortran/pr95654.f90: New test.
2020-10-05 08:53:11 +02:00
Tobias Burnus
2fe5a545e0 libgomp: Regenerate configure files with automake 1.15.1
libgomp/ChangeLog:
	* Makefile.in: Regenerate with automake 1.15.1.
	* aclocal.m4: Likewise.
	* configure: Likewise.
	* testsuite/Makefile.in: Likewise.
2020-10-02 12:08:47 +02:00
Tobias Burnus
8b0a63e47c OpenMP: Add implicit declare target for nested procedures
gcc/ChangeLog:

	* omp-offload.c (omp_discover_implicit_declare_target): Also
	handle nested functions.

libgomp/ChangeLog:

	* testsuite/libgomp.fortran/declare-target-3.f90: New test.
2020-09-30 14:59:27 +02:00
Andrew Stubbs
6f51395197 libgomp: disable barriers in nested teams
Both GCN and NVPTX allow nested parallel regions, but the barrier
implementation did not allow the nested teams to run independently of each
other (due to hardware limitations).  This patch fixes that, under the
assumption that each thread will create a new subteam of one thread, by
simply not using barriers when there's no other thread to synchronise.
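
For illustration only (not part of the commit): the idea of skipping the
barrier for a one-thread team can be sketched as below.  This is a toy model
and not libgomp's code.  The names toy_barrier_t, toy_barrier_init and
toy_barrier_wait are hypothetical, and the multi-thread path is a plain C11
spin barrier rather than the hardware barrier used on GCN or NVPTX.

```c
#include <stdatomic.h>

/* Hypothetical toy barrier; this is not libgomp's gomp_barrier_t.  */
typedef struct
{
  unsigned total;          /* Number of threads in the team.  */
  atomic_uint arrived;     /* Threads that have reached the barrier.  */
  atomic_uint generation;  /* Bumped each time the barrier releases.  */
} toy_barrier_t;

void
toy_barrier_init (toy_barrier_t *bar, unsigned total)
{
  bar->total = total;
  atomic_init (&bar->arrived, 0);
  atomic_init (&bar->generation, 0);
}

/* Return 1 if the caller actually synchronised, 0 if the barrier was
   skipped because the team consists of a single thread.  */
int
toy_barrier_wait (toy_barrier_t *bar)
{
  /* A team of one has nobody to wait for, so return immediately.  */
  if (bar->total == 1)
    return 0;

  unsigned gen = atomic_load (&bar->generation);
  if (atomic_fetch_add (&bar->arrived, 1) + 1 == bar->total)
    {
      /* Last arrival: reset the count, then release the waiters.  */
      atomic_store (&bar->arrived, 0);
      atomic_fetch_add (&bar->generation, 1);
    }
  else
    while (atomic_load (&bar->generation) == gen)
      ;  /* Spin until the last thread arrives.  */
  return 1;
}
```

With a team size of one, toy_barrier_wait returns without touching the shared
counters, which is the property that lets nested one-thread teams run
independently of each other in this model.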

libgomp/ChangeLog:

	* config/gcn/bar.c (gomp_barrier_wait_end): Skip the barrier if the
	total number of threads is one.
	(gomp_team_barrier_wake): Likewise.
	(gomp_team_barrier_wait_end): Likewise.
	(gomp_team_barrier_wait_cancel_end): Likewise.
	* config/nvptx/bar.c (gomp_barrier_wait_end): Likewise.
	(gomp_team_barrier_wake): Likewise.
	(gomp_team_barrier_wait_end): Likewise.
	(gomp_team_barrier_wait_cancel_end): Likewise.
	* testsuite/libgomp.c-c++-common/nested-parallel-unbalanced.c: New test.
2020-09-29 11:48:04 +01:00
Tobias Burnus
2a10a2c068 OpenMP: Handle cpp_implicit_alias in declare-target discovery (PR96390)
gcc/ChangeLog:

	PR middle-end/96390
	* omp-offload.c (omp_discover_declare_target_tgt_fn_r): Handle
	alias nodes.

libgomp/ChangeLog:

	PR middle-end/96390
	* testsuite/libgomp.c++/pr96390.C: New test.
	* testsuite/libgomp.c-c++-common/pr96390.c: New test.
2020-09-28 18:08:05 +02:00
Jakub Jelinek
c2ebf4f10d openmp: Add support for non-rect simd and improve collapsed simd support
The following change adds support for non-rectangular simd loops.
While working on that, I've noticed we actually don't vectorize collapsed
simd loops at all, because the code that I thought would be vectorizable
actually is not vectorized.  While for the constant lower/upper
bounds and constant step of all but the outermost loop we could in theory
vectorize by computing the separate iterators using vectorized division
and modulo for each of them from the single iterator that increments
by 1 from 0 to total iteration count in the loop nest, I think that would
be fairly expensive and the chances of the loop body being vectorizable
would be low e.g. because of array indices unlikely to be linear and would
need scatters/gathers.
This patch changes the generated code to vectorize only the innermost
loop which has higher chance of being vectorized.  Below is the list of
tests and function names in which the patch resulted in vectorizing something
that hasn't been vectorized before (ok, the first line is a new test).
I've also found that the vectorizer will not vectorize loops with non-constant
steps, I plan to do something about those incrementally on the omp-expand.c
side (basically, compute number of iterations before the loop and use a 0 to
number_of_iterations step 1 IV as the main one).
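
For illustration only (not part of the commit): the division-and-modulo
scheme mentioned above, which the patch deliberately avoids, can be sketched
in scalar form as below.  The names loop_desc, loop_count and delinearize are
hypothetical, and the sketch assumes a rectangular nest with positive steps
and at least one iteration per loop.

```c
/* Hypothetical description of one loop of the nest:
   for (v = lower; v < upper; v += step), with step > 0.  */
typedef struct
{
  long lower;
  long upper;
  long step;
} loop_desc;

/* Number of iterations of a single loop.  */
long
loop_count (const loop_desc *l)
{
  if (l->upper <= l->lower)
    return 0;
  return (l->upper - l->lower + l->step - 1) / l->step;
}

/* Recover the individual iterator values of a DEPTH-deep rectangular nest
   from one logical iteration number, with the innermost loop varying
   fastest.  Assumes every loop has at least one iteration.  */
void
delinearize (const loop_desc *loops, int depth, long logical, long *iters)
{
  for (int d = depth - 1; d >= 0; d--)
    {
      long cnt = loop_count (&loops[d]);
      /* The remainder selects the position inside loop D...  */
      iters[d] = loops[d].lower + (logical % cnt) * loops[d].step;
      /* ...and the quotient is passed on to the enclosing loop.  */
      logical /= cnt;
    }
}
```

Each logical iteration costs one division and one modulo per loop level, and
the resulting array indices are no longer linear in the logical iterator.
That is the expense the paragraph above refers to.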

I have a problem with the composite simd vectorization though.
The point is that each thread (or task etc.) is given only a range of
consecutive iterations, so somewhere earlier it computes total number of iterations
and splits the work between the workers and then the intent is to try to vectorize it.
So, each thread is then given a begin ... end-1 range that it would handle.
This means that from the single begin value I need to compute the individual iteration
vars I should start at and then goto into the loop nest to begin iterating there
(and actually compute how many iterations the innermost loop should do each time
so that it stops before end).
Very roughly the IL I emit is something like:
int t[100][100][100];

void
foo (int a, int b, int c, int d, int e, int f, int g, int h, int u, int v, int w, int x)
{
  int i, j, k;
  int cnt;
  if (x)
    {
      i = u; j = v; k = w; goto doit;
    }
  for (i = a; i < b; i += c)
    for (j = d; j < e; j += f)
      {
        k = g;
        doit:
        for (; k < h; k++)
          t[i][j][k] += i + j + k;
      }
}
Unfortunately, some pass then turns the innermost loop to have more than 2 basic blocks
and it isn't vectorized because of that.

Also, I have disabled (for now) SIMTization of collapsed simd loops, because for SIMT
it would be using a single thread anyway and I didn't want to bother with checking
SIMT on all places I've been changing.  If SIMT support is added for some or all
collapsed loops, that omp-low.c change needs to be reverted.

Here is that list of what hasn't been vectorized before and is now:

gcc/testsuite/gcc.dg/vect/vect-simd-17.c doit
gcc/testsuite/gfortran.dg/gomp/openmp-simd-6.f90 bar
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-10.c f28_taskloop_simd_normal._omp_fn.0
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-10.c _Z24f28_taskloop_simd_normalv._omp_fn.0
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-11.c f25_t_simd_normal._omp_fn.0
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-11.c f26_t_simd_normal._omp_fn.0
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-11.c f27_t_simd_normal._omp_fn.0
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-11.c f28_tpf_simd_guided32._omp_fn.1
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-11.c f28_tpf_simd_runtime._omp_fn.1
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-11.c _Z17f25_t_simd_normaliiiiiii._omp_fn.0
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-11.c _Z17f26_t_simd_normaliiiixxi._omp_fn.0
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-11.c _Z17f27_t_simd_normalv._omp_fn.0
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-11.c _Z20f28_tpf_simd_runtimev._omp_fn.1
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-11.c _Z21f28_tpf_simd_guided32v._omp_fn.1
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-2.c f7_simd_normal
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-2.c f7_simd_normal
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-2.c f8_f_simd_guided32
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-2.c f8_f_simd_guided32
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-2.c f8_f_simd_runtime
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-2.c f8_f_simd_runtime
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-2.c f8_pf_simd_guided32._omp_fn.0
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-2.c f8_pf_simd_runtime._omp_fn.0
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-2.c _Z18f8_pf_simd_runtimev._omp_fn.0
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-2.c _Z19f8_pf_simd_guided32v._omp_fn.0
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-4.c f8_taskloop_simd_normal._omp_fn.0
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-4.c _Z23f8_taskloop_simd_normalv._omp_fn.0
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-5.c f7_t_simd_normal._omp_fn.0
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-5.c f8_tpf_simd_guided32._omp_fn.1
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-5.c f8_tpf_simd_runtime._omp_fn.1
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-5.c _Z16f7_t_simd_normalv._omp_fn.0
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-5.c _Z19f8_tpf_simd_runtimev._omp_fn.1
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-5.c _Z20f8_tpf_simd_guided32v._omp_fn.1
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-8.c f25_simd_normal
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-8.c f25_simd_normal
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-8.c f26_simd_normal
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-8.c f26_simd_normal
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-8.c f27_simd_normal
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-8.c f27_simd_normal
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-8.c f28_f_simd_guided32
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-8.c f28_f_simd_guided32
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-8.c f28_f_simd_runtime
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-8.c f28_f_simd_runtime
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-8.c f28_pf_simd_guided32._omp_fn.0
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-8.c f28_pf_simd_runtime._omp_fn.0
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-8.c _Z19f28_pf_simd_runtimev._omp_fn.0
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-8.c _Z20f28_pf_simd_guided32v._omp_fn.0
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/master-combined-1.c main._omp_fn.9
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/master-combined-1.c main._omp_fn.9
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/simd-1.c f2
libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/simd-1.c f2
libgomp/testsuite/libgomp.c/pr70680-2.c f1._omp_fn.0
libgomp/testsuite/libgomp.c/pr70680-2.c f2._omp_fn.0
libgomp/testsuite/libgomp.c/pr70680-2.c f3._omp_fn.0
libgomp/testsuite/libgomp.c/pr70680-2.c f4._omp_fn.0
libgomp/testsuite/libgomp.c/simd-8.c foo
libgomp/testsuite/libgomp.c/simd-9.c bar
libgomp/testsuite/libgomp.c/simd-9.c foo

2020-09-25  Jakub Jelinek  <jakub@redhat.com>

gcc/
	* omp-low.c (scan_omp_1_stmt): Don't call scan_omp_simd for
	collapse > 1 loops as simt doesn't support collapsed loops yet.
	* omp-expand.c (expand_omp_for_init_counts, expand_omp_for_init_vars):
	Small tweaks to function comment.
	(expand_omp_simd): Rewritten collapse > 1 support to only attempt
	to vectorize the innermost loop and emit set of outer loops around it.
	For non-composite simd with collapse > 1 without broken loop don't
	even try to compute number of iterations first.  Add support for
	non-rectangular simd loops.
	(expand_omp_for): Don't sorry_at on non-rectangular simd loops.
gcc/testsuite/
	* gcc.dg/vect/vect-simd-17.c: New test.
libgomp/
	* testsuite/libgomp.c/loop-25.c: New test.
2020-09-25 10:43:37 +02:00
Tobias Burnus
f74c87f85f libgomp.fortran/pr66199-5.f90: Make stop codes unique
libgomp/ChangeLog:

	PR fortran/95654
	* testsuite/libgomp.fortran/pr66199-5.f90: Make stop codes unique.
2020-09-22 19:16:34 +02:00
Nathan Sidwell
8155316c6f c++: local-scope OMP UDR reductions have no template head
This corrects the earlier problems with removing the template header
from local omp reductions.  And it uncovered a latent bug.  When we
tsubst such a decl, we immediately tsubst its body.
cp_check_omp_declare_reduction gets a success return value to gate
that instantiation.

udr-2.C got a further error, as the omp checking machinery doesn't
appear to turn the reduction into an error mark when failing.  I
didn't dig into that further.  udr-3.C appears to have been invalid
and accidentally worked.

	gcc/cp/
	* cp-tree.h (cp_check_omp_declare_reduction): Return bool.
	* semantics.c (cp_check_omp_declare_reduction): Return true on
	success.
	* pt.c (push_template_decl_real): OMP reductions do not get a
	template header.
	(tsubst_function_decl): Remove special casing for local decl omp
	reductions.
	(tsubst_expr): Call instantiate_body for a local omp reduction.
	(instantiate_body): Add nested_p parm, and deal with such
	instantiations.
	(instantiate_decl): Reject FUNCTION_SCOPE entities, adjust
	instantiate_body call.
	gcc/testsuite/
	* g++.dg/gomp/udr-2.C: Add additional expected error.
	libgomp/
	* testsuite/libgomp.c++/udr-3.C: Add missing ctor.
2020-09-16 12:16:11 -07:00
Tobias Burnus
972da55746 OpenMP/Fortran: Fix (re)mapping of allocatable/pointer arrays [PR96668]
gcc/cp/ChangeLog:

	PR fortran/96668
	* cp-gimplify.c (cxx_omp_finish_clause): Add bool openacc arg.
	* cp-tree.h (cxx_omp_finish_clause): Likewise.
	* semantics.c (handle_omp_for_class_iterator): Update call.

gcc/fortran/ChangeLog:

	PR fortran/96668
	* trans.h (gfc_omp_finish_clause): Add bool openacc arg.
	* trans-openmp.c (gfc_omp_finish_clause): Ditto. Use
	GOMP_MAP_ALWAYS_POINTER with PSET for pointers.
	(gfc_trans_omp_clauses): Like the latter and also if the always
	modifier is used.

gcc/ChangeLog:

	PR fortran/96668
	* gimplify.c (gimplify_omp_for): Add 'bool openacc' argument;
	update omp_finish_clause calls.
	(gimplify_adjust_omp_clauses_1, gimplify_adjust_omp_clauses,
	gimplify_expr, gimplify_omp_loop): Update omp_finish_clause
	and/or gimplify_for calls.
	* langhooks-def.h (lhd_omp_finish_clause): Add bool openacc arg.
	* langhooks.c (lhd_omp_finish_clause): Likewise.
	* langhooks.h (lhd_omp_finish_clause): Likewise.
	* omp-low.c (scan_sharing_clauses): Keep GOMP_MAP_TO_PSET clause for
	'declare target' vars.

include/ChangeLog:

	PR fortran/96668
	* gomp-constants.h (GOMP_MAP_ALWAYS_POINTER_P): Define.

libgomp/ChangeLog:

	PR fortran/96668
	* libgomp.h (struct target_var_desc): Add has_null_ptr_assoc member.
	* target.c (gomp_map_vars_existing): Add always_to_flag flag.
	(gomp_map_vars_existing): Update call to it.
	(gomp_map_fields_existing): Likewise.
	(gomp_map_vars_internal): Update PSET handling such that if a nullptr is
	now allocated or if GOMP_MAP_POINTER is used PSET is updated and pointer
	remapped.
	(GOMP_target_enter_exit_data): Handle GOMP_MAP_ALWAYS_POINTER like
	GOMP_MAP_POINTER.
	* testsuite/libgomp.fortran/map-alloc-ptr-1.f90: New test.
	* testsuite/libgomp.fortran/map-alloc-ptr-2.f90: New test.
2020-09-15 09:24:47 +02:00
Tom de Vries
4ac7b66958 [libgomp, nvptx] Add __sync_compare_and_swap_16
As reported here
( https://gcc.gnu.org/pipermail/gcc-patches/2020-September/553070.html  ),
when running test-case libgomp.c-c++-common/reduction-16.c for powerpc host
with nvptx accelerator, we run into:
...
unresolved symbol __sync_val_compare_and_swap_16
...

I can reproduce the problem on x86_64 with a trigger patch that:
- initializes ix86_isa_flags2 to TARGET_ISA2_CX16
- enables define_expand "atomic_load<mode>" in gcc/config/i386/sync.md
  for TImode

The problem is that omp-expand.c generates atomic builtin calls based on
checks whether those are supported on the host, which forces the target to
support these, even though those checks fail for the accelerator target.

Fix this by:
- adding a __sync_val_compare_and_swap_16 in libgomp for nvptx,
  which falls back onto libatomic's __atomic_compare_and_swap_16
- adding -foffload=-latomic in the test-case

Tested libgomp on x86_64-linux with nvptx accelerator.

Tested libgomp with trigger patch on x86_64-linux with nvptx accelerator.

libgomp/ChangeLog:

	* config/nvptx/atomic.c: New file.  Add
	__sync_val_compare_and_swap_16.
	* testsuite/libgomp.c-c++-common/reduction-16.c: Add -latomic for
	target offload_target_nvptx.
2020-09-14 08:28:56 +02:00
Julian Brown
8183ebcdc1 openacc: Fix atomic_capture-2.c iteration-ordering issues
The test case was written with assumptions about loop iteration ordering
that are not guaranteed by OpenACC and do not apply on all targets,
in particular AMD GCN. This patch removes those assumptions.

2020-09-08  Julian Brown  <julian@codesourcery.com>

libgomp/
	* testsuite/libgomp.oacc-c-c++-common/atomic_capture-2.c: Remove
	iteration-ordering assumptions.
2020-09-08 13:26:42 -07:00
Julian Brown
d6d9be7c6b openacc: Fix race condition in Fortran loop collapse tests
The gangs participating in a gang-partitioned loop are not all guaranteed
to complete before some given gang continues to execute beyond that loop.
This means that two existing test cases contain a race condition,
because a loop that may be gang-partitioned is followed immediately by
another loop.  The fix is to place the loops in separate parallel regions.
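
For illustration only (not part of the commit, whose tests are Fortran): a C
analogue of the fixed pattern might look like the sketch below.  The function
name two_phase_mismatches and the array size are hypothetical.  The second
loop reads elements written by other iterations of the first loop, so it must
not start before the first loop has finished.  Placing each loop in its own
parallel region provides that ordering.

```c
#define N 1000

/* Hypothetical example: return the number of wrong elements in B.  */
int
two_phase_mismatches (void)
{
  static int a[N], b[N];
  int mismatches = 0;

  /* First region: every gang finishes before the region ends.  */
  #pragma acc parallel copyout(a)
  {
    #pragma acc loop gang
    for (int i = 0; i < N; i++)
      a[i] = i;
  }

  /* Second region: safe to read any element of A written above.  */
  #pragma acc parallel copyin(a) copyout(b)
  {
    #pragma acc loop gang
    for (int i = 0; i < N; i++)
      b[i] = a[N - 1 - i];
  }

  /* Host-side check of the result.  */
  for (int i = 0; i < N; i++)
    if (b[i] != N - 1 - i)
      mismatches++;
  return mismatches;
}
```

Had both loops been placed in one parallel region, a gang could begin the
second loop while another gang was still writing A in the first one.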

2020-09-08  Julian Brown  <julian@codesourcery.com>

libgomp/
	* testsuite/libgomp.oacc-fortran/collapse-1.f90: Fix race condition.
	* testsuite/libgomp.oacc-fortran/collapse-2.f90: Likewise.
2020-09-08 13:26:42 -07:00
Tobias Burnus
656218ab98 Fortran: Fix OpenMP's 'if(simd:' etc. conditions
gcc/fortran/ChangeLog:

	* openmp.c (gfc_match_omp_clauses): Re-order 'if' clause parsing
	to avoid creating spurious symbols.

libgomp/ChangeLog:

	* testsuite/libgomp.fortran/lastprivate-conditional-10.f90: New test.
2020-08-20 13:33:40 +02:00
Kwok Cheung Yeung
17dc08edc2 nvptx: Add support for subword compare-and-swap
This adds support for __sync_val_compare_and_swap and
__sync_bool_compare_and_swap for 1-byte and 2-byte long
values, which are not natively supported on nvptx.
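
For illustration only (not part of the commit, and not the code in libgcc's
config/nvptx/atomic.c): a common way to emulate a 1-byte compare-and-swap is
to loop on a 4-byte compare-and-swap of the aligned word containing the byte.
The name toy_cas_u8 is hypothetical.  The sketch relies on the GCC
__sync_val_compare_and_swap builtin and the __BYTE_ORDER__ macros, and it
assumes that accessing the containing word through a uint32_t pointer is
acceptable on the target.

```c
#include <stdint.h>

/* Hypothetical helper: compare-and-swap one byte by operating on the
   aligned 32-bit word that contains it.  Returns the byte's old value,
   like __sync_val_compare_and_swap.  */
uint8_t
toy_cas_u8 (uint8_t *ptr, uint8_t expected, uint8_t desired)
{
  uintptr_t addr = (uintptr_t) ptr;
  uint32_t *word = (uint32_t *) (addr & ~(uintptr_t) 3);
  unsigned offset = (unsigned) (addr & 3);

  /* Bit position of the byte inside the word depends on endianness.  */
#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
  unsigned shift = (3 - offset) * 8;
#else
  unsigned shift = offset * 8;
#endif
  uint32_t mask = (uint32_t) 0xff << shift;

  for (;;)
    {
      uint32_t old_word = *(volatile uint32_t *) word;
      uint8_t old_byte = (uint8_t) ((old_word & mask) >> shift);

      /* The byte does not match: fail and report what was there.  */
      if (old_byte != expected)
        return old_byte;

      /* Splice the new byte into the word, keeping the neighbours.  */
      uint32_t new_word = (old_word & ~mask) | ((uint32_t) desired << shift);
      if (__sync_val_compare_and_swap (word, old_word, new_word) == old_word)
        return expected;

      /* The word changed under us (target or neighbouring byte): retry.  */
    }
}
```

The retry loop is needed because the word-sized operation can fail when a
neighbouring byte changes, even though the target byte still holds the
expected value.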

Built and reg-tested on nvptx.
Built and reg-tested libgomp on x86_64 with nvptx accelerator.

2020-07-16  Kwok Cheung Yeung  <kcy@codesourcery.com>

	libgcc/
	* config/nvptx/atomic.c: New.
	* config/nvptx/t-nvptx (LIB2ADD): Add atomic.c.

	gcc/testsuite/
	* gcc.target/nvptx/ia64-sync-5.c: New.

	libgomp/
	* testsuite/libgomp.c-c++-common/reduction-16.c: New.
2020-08-13 11:11:55 +02:00
Jakub Jelinek
2e47c8c6ea openmp: Add support for non-rectangular loops in taskloop construct
2020-08-13  Jakub Jelinek  <jakub@redhat.com>

	* gimplify.c (gimplify_omp_taskloop_expr): New function.
	(gimplify_omp_for): Use it.  For OMP_FOR_NON_RECTANGULAR
	loops adjust in outer taskloop the var-outer decls.
	* omp-expand.c (expand_omp_taskloop_for_inner): Handle non-rectangular
	loops.
	(expand_omp_for): Don't reject non-rectangular taskloop.
	* omp-general.c (omp_extract_for_data): Don't assert that
	non-rectangular loops have static schedule, instead treat loop->m1
	or loop->m2 as if loop->n1 or loop->n2 is non-constant.

	* testsuite/libgomp.c/loop-22.c (main): Add some further tests.
	* testsuite/libgomp.c/loop-23.c (main): Likewise.
	* testsuite/libgomp.c/loop-24.c: New test.
2020-08-13 09:06:05 +02:00
Jakub Jelinek
676b5525e8 openmp: Handle clauses with gimple sequences in convert_nonlocal_omp_clauses properly
If the walk_body on the various sequences of reduction, lastprivate and/or linear
clauses needs to create a temporary variable, we should declare that variable
in that sequence rather than outside, where it would need to be privatized inside of
the construct.

2020-08-08  Jakub Jelinek  <jakub@redhat.com>

	PR fortran/93553
	* tree-nested.c (convert_nonlocal_omp_clauses): For
	OMP_CLAUSE_REDUCTION, OMP_CLAUSE_LASTPRIVATE and OMP_CLAUSE_LINEAR
	save info->new_local_var_chain around walks of the clause gimple
	sequences and declare_vars if needed into the sequence.

2020-08-08  Tobias Burnus  <tobias@codesourcery.com>

	PR fortran/93553
	* testsuite/libgomp.fortran/pr93553.f90: New test.
2020-08-08 11:10:30 +02:00
Jakub Jelinek
9f3abfb84e openmp: Handle even some combined non-rectangular loops
The number of loops computation and logical iteration -> actual iterator values
computations can now be done separately even on composite constructs (though
for triangular loops it would still be more efficient to propagate a few values
through, will handle that incrementally).
simd and taskloop are still unhandled.

2020-08-05  Jakub Jelinek  <jakub@redhat.com>

	* omp-expand.c (expand_omp_for): Don't disallow combined non-rectangular
	loops.

	* testsuite/libgomp.c/loop-22.c: New test.
	* testsuite/libgomp.c/loop-23.c: New test.
2020-08-05 10:45:16 +02:00
Jakub Jelinek
916c7a201a openmp: Handle reduction clauses on host teams construct [PR96459]
As the new testcase shows, we weren't actually performing reductions on
host teams construct.  And fixing that revealed a flaw in the for-14.c testcase.
The problem is that the tests perform also initialization and checking around the
calls to the functions with the OpenMP constructs.  In that testcase, all the
tests have been spawned from a teams construct but only the tested loops were
distribute, which means the initialization and checking has been performed
redundantly and racily in each team.  Fixed by performing the initialization
and checking outside of host teams and only do the calls to functions with
the tested constructs inside of host teams.
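
For illustration only (not part of the commit, and not the new teams-3.c
test): a minimal host teams reduction might look like the sketch below.  The
function name teams_or_reduction is hypothetical.  Each team ORs a bit into
its private copy of err, and the copies are combined into the original
variable at the end of the construct.  If the reduction were not performed,
the original err would stay 0.

```c
/* Hypothetical example: return the value of ERR after a host teams
   construct with an OR reduction.  */
int
teams_or_reduction (void)
{
  int err = 0;

  #pragma omp teams reduction(|:err)
  {
    /* Every team sets bit 0 in its private copy of ERR.  */
    err |= 1;
  }

  /* The private copies are ORed into ERR when the construct ends.  */
  return err;
}
```

The result is 1 regardless of how many teams are created.  It is also 1 when
the pragma is ignored because the file is compiled without OpenMP support.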

2020-08-05  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/96459
	* omp-low.c (lower_omp_taskreg): Call lower_reduction_clauses even
	for host teams.

	* testsuite/libgomp.c/teams-3.c: New test.
	* testsuite/libgomp.c-c++-common/for-2.h (OMPTEAMS): Define to nothing
	if not defined yet.
	(N(test)): Use it before all N(f*) calls.
	* testsuite/libgomp.c-c++-common/for-14.c (DO_PRAGMA, OMPTEAMS): Define.
	(main): Don't call all test_* functions from within
	#pragma omp teams reduction(|:err), call them directly.
2020-08-05 10:40:10 +02:00
Tom de Vries
344f09a756 [nvptx] Handle V2DI/V2SI mode in nvptx_gen_shuffle
With the pr96628-part1.f90 source and -ftree-slp-vectorize, we run into an
ICE due to the fact that V2DI mode is not handled in nvptx_gen_shuffle.

Fix this by adding handling of V2DI as well as V2SI mode in
nvptx_gen_shuffle.

Built and reg-tested on x86_64 with nvptx accelerator.

gcc/ChangeLog:

	PR target/96428
	* config/nvptx/nvptx.c (nvptx_gen_shuffle): Handle V2SI/V2DI.

libgomp/ChangeLog:

	PR target/96428
	* testsuite/libgomp.oacc-fortran/pr96628-part1.f90: New test.
	* testsuite/libgomp.oacc-fortran/pr96628-part2.f90: New test.
2020-08-04 11:59:08 +02:00
Julian Brown
f2f4212e20 openacc: No attach/detach present/release mappings for array descriptors
Standalone attach and detach clauses should not create present/release
mappings for Fortran array descriptors (e.g. used when we have a pointer
to an array), both because it is unnecessary and because those mappings
will be incorrectly subject to reference counting. Simply omitting the
mappings means we just use GOMP_MAP_TO_PSET and GOMP_MAP_{ATTACH,DETACH}
mappings for array descriptors.

That requires a tweak in gimplify.c, since we may now see GOMP_MAP_TO_PSET
without a preceding data-movement mapping.

2020-08-03  Julian Brown  <julian@codesourcery.com>
	    Thomas Schwinge  <thomas@codesourcery.com>

gcc/fortran/
	* trans-openmp.c (gfc_trans_omp_clauses): Don't create present/release
	mappings for array descriptors.

gcc/
	* gimplify.c (gimplify_omp_target_update): Allow GOMP_MAP_TO_PSET
	without a preceding data-movement mapping.

gcc/testsuite/
	* gfortran.dg/goacc/attach-descriptor.f90: Update pattern output. Add
	scanning of gimplify dump.

libgomp/
	* testsuite/libgomp.oacc-fortran/attach-descriptor-1.f90: Don't run for
	shared-memory devices.  Extend with further checking.

Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
2020-08-03 12:06:49 -07:00
Martin Jambor
c56684fd61 Removal of HSA offloading from gcc and libgomp
This patch removes the generation of HSAIL from the compiler, the HSA
offloading plugin from libgomp and the associated testsuite tests and
infrastructure bits from the respective testsuites.

Apart from removal of the obvious files, I removed bits that I found
by searching for HSA related terms and by re-tracing my steps and
looking at the patches that introduced HSA in the first place.  I did
not remove everything these patches brought in, for example:

  - the mechanism to pass offload-target specific info from the application to
    the offloading plugin - but the same mechanism is also used to
    communicate number of teams and the thread limit to all offload targets.

  - run_func hook in gomp_device_descr stays too, although now it is
    not used.  If some future offload target would like the ability to
    refuse to offload some functions, it can use it.  It is easy to
    remove as a follow-up if it is considered clutter, though.

  - configure options --with-hsa-runtime=PATH, --with-hsa-runtime-include=PATH
    and --with-hsa-runtime-lib=PATH remain because GCN uses them too.

  - Surprisingly, GOMP_TARGET_ARG_HSA_KERNEL_ATTRIBUTES (a constant
    from gomp-constants.h) appears in the source of the amdgcn libgomp
    plugin, although I tend to think that code path is not ever used
    and this patch certainly removes it from the compiler.
    Nevertheless, it seems it has potential value beyond HSAIL and so
    I've kept it; it can of course always be easily removed in the
    future if GCN folk abandon it too.

  - I assume constants OFFLOAD_TARGET_TYPE_HSA and GOMP_DEVICE_HSA
    need to stay indefinitely too just so that no future offload
    target picks that number.

  - I have kept the dg-require-effective-target
    offload_device_nonshared_as requirement of tests which have it.

It is quite probable I missed some small HSA artifacts but those
should be easy to remove later as we find them.

include/ChangeLog:

2020-07-24  Martin Jambor  <mjambor@suse.cz>

	* gomp-constants.h (GOMP_VERSION_HSA): Remove.

gcc/ChangeLog:

2020-07-24  Martin Jambor  <mjambor@suse.cz>

	* hsa-brig-format.h: Moved to brig/brigfrontend.
	* hsa-brig.c: Removed.
	* hsa-builtins.def: Likewise.
	* hsa-common.c: Likewise.
	* hsa-common.h: Likewise.
	* hsa-dump.c: Likewise.
	* hsa-gen.c: Likewise.
	* hsa-regalloc.c: Likewise.
	* ipa-hsa.c: Likewise.
	* omp-grid.c: Likewise.
	* omp-grid.h: Likewise.
	* Makefile.in (BUILTINS_DEF): Remove hsa-builtins.def.
	(OBJS): Remove hsa-common.o, hsa-gen.o, hsa-regalloc.o, hsa-brig.o,
	hsa-dump.o, ipa-hsa.c and omp-grid.o.
	(GTFILES): Removed hsa-common.c and omp-expand.c.
	* builtins.def: Remove processing of hsa-builtins.def.
	(DEF_HSA_BUILTIN): Remove.
	* common.opt (flag_disable_hsa): Remove.
	(-Whsa): Ignore.
	* config.in (ENABLE_HSA): Removed.
	* configure.ac: Removed handling configuration for hsa offloading.
	(ENABLE_HSA): Removed.
	* configure: Regenerated.
	* doc/install.texi (--enable-offload-targets): Remove hsa from the
	example.
	(--with-hsa-runtime): Reword to reference any HSA run-time, not
	specifically HSA offloading.
	* doc/invoke.texi (Option Summary): Remove -Whsa.
	(Warning Options): Likewise.
	(Optimize Options): Remove hsa-gen-debug-stores.
	* doc/passes.texi (Regular IPA passes): Remove section on IPA HSA
	pass.
	* gimple-low.c (lower_stmt): Remove GIMPLE_OMP_GRID_BODY case.
	* gimple-pretty-print.c (dump_gimple_omp_for): Likewise.
	(dump_gimple_omp_block): Likewise.
	(pp_gimple_stmt_1): Likewise.
	* gimple-walk.c (walk_gimple_stmt): Likewise.
	* gimple.c (gimple_build_omp_grid_body): Removed function.
	(gimple_copy): Remove GIMPLE_OMP_GRID_BODY case.
	* gimple.def (GIMPLE_OMP_GRID_BODY): Removed.
	* gimple.h (gf_mask): Removed GF_OMP_PARALLEL_GRID_PHONY,
	OMP_FOR_KIND_GRID_LOOP, GF_OMP_FOR_GRID_PHONY,
	GF_OMP_FOR_GRID_INTRA_GROUP, GF_OMP_FOR_GRID_GROUP_ITER and
	GF_OMP_TEAMS_GRID_PHONY.  Renumbered GF_OMP_FOR_KIND_SIMD and
	GF_OMP_TEAMS_HOST.
	(gimple_build_omp_grid_body): Removed declaration.
	(gimple_has_substatements): Remove GIMPLE_OMP_GRID_BODY case.
	(gimple_omp_for_grid_phony): Removed.
	(gimple_omp_for_set_grid_phony): Likewise.
	(gimple_omp_for_grid_intra_group): Likewise.
	(gimple_omp_for_set_grid_intra_group): Likewise.
	(gimple_omp_for_grid_group_iter): Likewise.
	(gimple_omp_for_set_grid_group_iter): Likewise.
	(gimple_omp_parallel_grid_phony): Likewise.
	(gimple_omp_parallel_set_grid_phony): Likewise.
	(gimple_omp_teams_grid_phony): Likewise.
	(gimple_omp_teams_set_grid_phony): Likewise.
	(CASE_GIMPLE_OMP): Remove GIMPLE_OMP_GRID_BODY case.
	* lto-section-in.c (lto_section_name): Removed hsa.
	* lto-streamer.h (lto_section_type): Removed LTO_section_ipa_hsa.
	* lto-wrapper.c (compile_images_for_offload_targets): Remove special
	handling of hsa.
	* omp-expand.c: Do not include hsa-common.h and gt-omp-expand.h.
	(parallel_needs_hsa_kernel_p): Removed.
	(grid_launch_attributes_trees): Likewise.
	(grid_launch_attributes_trees): Likewise.
	(grid_create_kernel_launch_attr_types): Likewise.
	(grid_insert_store_range_dim): Likewise.
	(grid_get_kernel_launch_attributes): Likewise.
	(get_target_arguments): Remove code passing HSA grid sizes.
	(grid_expand_omp_for_loop): Remove.
	(grid_arg_decl_map): Likewise.
	(grid_remap_kernel_arg_accesses): Likewise.
	(grid_expand_target_grid_body): Likewise.
	(expand_omp): Remove call to grid_expand_target_grid_body.
	(omp_make_gimple_edges): Remove GIMPLE_OMP_GRID_BODY case.
	* omp-general.c: Do not include hsa-common.h.
	(omp_maybe_offloaded): Do not check for HSA offloading.
	(omp_context_selector_matches): Likewise.
	* omp-low.c: Do not include hsa-common.h and omp-grid.h.
	(build_outer_var_ref): Remove handling of GIMPLE_OMP_GRID_BODY.
	(scan_sharing_clauses): Remove handling of OMP_CLAUSE__GRIDDIM_.
	(scan_omp_parallel): Remove handling of the phony variant.
	(check_omp_nesting_restrictions): Remove handling of
	GIMPLE_OMP_GRID_BODY and GF_OMP_FOR_KIND_GRID_LOOP.
	(scan_omp_1_stmt): Remove handling of GIMPLE_OMP_GRID_BODY.
	(lower_omp_for_lastprivate): Remove handling of gridified loops.
	(lower_omp_for): Remove phony loop handling.
	(lower_omp_taskreg): Remove phony construct handling.
	(lower_omp_teams): Likewise.
	(lower_omp_grid_body): Removed.
	(lower_omp_1): Remove GIMPLE_OMP_GRID_BODY case.
	(execute_lower_omp): Do not call omp_grid_gridify_all_targets.
	* opts.c (common_handle_option): Do not handle hsa when processing
	OPT_foffload_.
	* params.opt (hsa-gen-debug-stores): Remove.
	* passes.def: Remove pass_ipa_hsa and pass_gen_hsail.
	* timevar.def: Remove TV_IPA_HSA.
	* toplev.c: Do not include hsa-common.h.
	(compile_file): Do not call hsa_output_brig.
	* tree-core.h (enum omp_clause_code): Remove OMP_CLAUSE__GRIDDIM_.
	(tree_omp_clause): Remove union field dimension.
	* tree-nested.c (convert_nonlocal_omp_clauses): Remove the
	OMP_CLAUSE__GRIDDIM_ case.
	(convert_local_omp_clauses): Likewise.
	* tree-pass.h (make_pass_gen_hsail): Remove declaration.
	(make_pass_ipa_hsa): Likewise.
	* tree-pretty-print.c (dump_omp_clause): Remove GIMPLE_OMP_GRID_BODY
	case.
	* tree.c (omp_clause_num_ops): Remove the element corresponding to
	OMP_CLAUSE__GRIDDIM_.
	(omp_clause_code_name): Likewise.
	(walk_tree_1): Remove GIMPLE_OMP_GRID_BODY case.
	* tree.h (OMP_CLAUSE__GRIDDIM__DIMENSION): Remove.
	(OMP_CLAUSE__GRIDDIM__SIZE): Likewise.
	(OMP_CLAUSE__GRIDDIM__GROUP): Likewise.

gcc/fortran/ChangeLog:

2020-07-24  Martin Jambor  <mjambor@suse.cz>

	* f95-lang.c (gfc_init_builtin_functions): Remove processing of
	hsa-builtins.def.

gcc/brig/ChangeLog:

2020-07-24  Martin Jambor  <mjambor@suse.cz>

	* brigfrontend/brig-util.h (hsa_type_packed_p): Declared.
	* brigfrontend/brig-util.cc (hsa_type_packed_p): Moved here from
	removed gcc/hsa-common.c.

libgomp/ChangeLog:

2020-07-24  Martin Jambor  <mjambor@suse.cz>

	* plugin/Makefrag.am: Remove configuration of HSA plugin.
	* aclocal.m4: Regenerated.
	* Makefile.in: Regenerated.
	* config.h.in: Regenerated.
	* configure: Regenerated.
	* plugin/configfrag.ac: Likewise.
	* plugin/hsa_ext_finalize.h: Removed.
	* plugin/plugin-hsa.c: Likewise.
	* testsuite/Makefile.in: Regenerated.
	* testsuite/lib/libgomp.exp
	(offload_target_to_openacc_device_type): Remove hsa case.
	(check_effective_target_hsa_offloading_selected_nocache): Removed.
	(check_effective_target_hsa_offloading_selected): Likewise.
	(libgomp_init): Do not add -Wno-hsa to additional_flags.
	* testsuite/libgomp.hsa.c/alloca-1.c: Removed test.
	* testsuite/libgomp.hsa.c/bitfield-1.c: Likewise.
	* testsuite/libgomp.hsa.c/bits-insns.c: Likewise.
	* testsuite/libgomp.hsa.c/builtins-1.c: Likewise.
	* testsuite/libgomp.hsa.c/c.exp: Likewise.
	* testsuite/libgomp.hsa.c/complex-1.c: Likewise.
	* testsuite/libgomp.hsa.c/complex-align-2.c: Likewise.
	* testsuite/libgomp.hsa.c/formal-actual-args-1.c: Likewise.
	* testsuite/libgomp.hsa.c/function-call-1.c: Likewise.
	* testsuite/libgomp.hsa.c/get-level-1.c: Likewise.
	* testsuite/libgomp.hsa.c/gridify-1.c: Likewise.
	* testsuite/libgomp.hsa.c/gridify-2.c: Likewise.
	* testsuite/libgomp.hsa.c/gridify-3.c: Likewise.
	* testsuite/libgomp.hsa.c/gridify-4.c: Likewise.
	* testsuite/libgomp.hsa.c/memory-operations-1.c: Likewise.
	* testsuite/libgomp.hsa.c/pr69568.c: Likewise.
	* testsuite/libgomp.hsa.c/pr82416.c: Likewise.
	* testsuite/libgomp.hsa.c/rotate-1.c: Likewise.
	* testsuite/libgomp.hsa.c/staticvar.c: Likewise.
	* testsuite/libgomp.hsa.c/switch-1.c: Likewise.
	* testsuite/libgomp.hsa.c/switch-branch-1.c: Likewise.
	* testsuite/libgomp.hsa.c/switch-sbr-2.c: Likewise.
	* testsuite/libgomp.hsa.c/tiling-1.c: Likewise.
	* testsuite/libgomp.hsa.c/tiling-2.c: Likewise.

gcc/testsuite/ChangeLog:

2020-07-24  Martin Jambor  <mjambor@suse.cz>

	* lib/target-supports.exp (check_effective_target_offload_hsa):
	Removed.
	* c-c++-common/gomp/gridify-1.c: Removed test.
	* c-c++-common/gomp/gridify-2.c: Likewise.
	* c-c++-common/gomp/gridify-3.c: Likewise.
	* c-c++-common/gomp/hsa-indirect-call-1.c: Likewise.
	* gfortran.dg/gomp/gridify-1.f90: Likewise.
	* gcc.dg/gomp/gomp.exp: Do not pass -Wno-hsa to tests.
	* g++.dg/gomp/gomp.exp: Likewise.
	* gfortran.dg/gomp/gomp.exp: Likewise.
2020-08-03 18:13:00 +02:00
Julian Brown
bc4ed079dc openacc: Deep copy attach/detach should not affect reference counts
Attach and detach operations are not supposed to affect structural or
dynamic reference counts for OpenACC. Previously they did so, which led to
subtle problems in some circumstances. We can avoid reference-counting
attach/detach operations by extending and slightly repurposing the
do_detach field in target_var_desc. It is now called is_attach to better
reflect its new role.
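
As a rough illustration of the idea (a simplified model with hypothetical
names, not the actual libgomp structures or code), an unmap loop can skip
reference-count updates for entries that are flagged as attachments:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified models; not the real libgomp structures.  */
struct example_key
{
  int refcount;
};

struct example_var_desc
{
  struct example_key *key;
  bool is_attach;   /* True if this entry only records an attachment.  */
};

/* Model of unmapping N entries: every ordinary entry releases one
   reference, attach entries leave the count alone.  Returns the number
   of keys whose count dropped to zero.  */
size_t
example_unmap (struct example_var_desc *vars, size_t n)
{
  size_t freed = 0;
  for (size_t i = 0; i < n; i++)
    {
      if (vars[i].is_attach)
        continue;   /* The pointer would be detached here; no refcount change.  */
      assert (vars[i].key->refcount > 0);
      if (--vars[i].key->refcount == 0)
        freed++;
    }
  return freed;
}
```

In this model a block that maps a variable once and also attaches a pointer
to it holds exactly one reference, so unmapping it cannot underflow the
count.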

2020-07-27  Julian Brown  <julian@codesourcery.com>
	    Thomas Schwinge  <thomas@codesourcery.com>

libgomp/
	* libgomp.h (struct target_var_desc): Rename do_detach field to
	is_attach.
	* oacc-mem.c (goacc_exit_datum_1): Add assert.  Don't set finalize for
	GOMP_MAP_FORCE_DETACH. Update checking to use is_attach field.
	(goacc_enter_data_internal): Don't affect reference counts
	for attach mappings.
	(goacc_exit_data_internal): Don't affect reference counts for detach
	mappings.
	* target.c (gomp_map_vars_existing): Don't affect reference counts for
	attach mappings.
	(gomp_map_vars_internal): Set renamed is_attach flag unconditionally to
	mark attach mappings.
	(gomp_unmap_vars_internal): Use is_attach flag to prevent affecting
	reference count for attach mappings.
	* testsuite/libgomp.oacc-c-c++-common/mdc-refcount-1.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/mdc-refcount-2.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/mdc-refcount-3.c: New test.
	* testsuite/libgomp.oacc-fortran/deep-copy-6-no_finalize.F90: Mark
	test as shouldfail.
	* testsuite/libgomp.oacc-fortran/deep-copy-6.f90: Adjust to fail
	gracefully in no-finalize mode.

Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
2020-07-27 09:16:57 -07:00
Thomas Schwinge
fdc9db2539 [testsuite] Unset 'offload_target' after use
..., so that we don't leak this into '*.exp' files running later.

This is relevant after commit efc16503ca "handle
dumpbase in offloading, adjust testsuite" -- I was confused why in a
(simplified) testing sequence as follows:

  default 'libgomp.c/c.exp'
  default 'libgomp.oacc-c/c.exp'
  '-m32' 'libgomp.c/c.exp'
  '-m32' 'libgomp.oacc-c/c.exp'

..., the "'-m32' 'libgomp.c/c.exp'" variant would not execute any offloading
dump scanning.  The reason is that the "default 'libgomp.oacc-c/c.exp'" variant
ends with 'offload_target=disable' set, so that's what the "'-m32'
'libgomp.c/c.exp'" variant would then see, in particular
'gcc/testsuite/lib/scanoffload.exp:scoff'.

	libgomp/
	* testsuite/libgomp.oacc-c++/c++.exp: Unset 'offload_target' after
	use.
	* testsuite/libgomp.oacc-c/c.exp: Likewise.
	* testsuite/libgomp.oacc-fortran/fortran.exp: Likewise.
2020-07-24 14:00:43 +02:00
Julian Brown
25bce75c77 openacc: Remove unnecessary detach finalization
The call to gomp_detach_pointer in gomp_unmap_vars_internal does not
need to force finalization, and doing so may mask mismatched pointer
attachments/detachments. This patch removes the forcing.

2020-07-16  Julian Brown  <julian@codesourcery.com>
	    Thomas Schwinge  <thomas@codesourcery.com>

libgomp/
	* target.c (gomp_unmap_vars_internal): Remove unnecessary forcing of
	finalization for detach operation.
	* testsuite/libgomp.oacc-c-c++-common/structured-detach-underflow.c:
	New test.

Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
2020-07-23 12:50:07 -07:00
Tobias Burnus
2631d95ae2 libgomp: Add omp_depend_kind to omp_lib.{f90,h}
gcc/fortran/ChangeLog:

	* intrinsic.texi (OMP_LIB_KINDS): Add omp_depend_kind.

libgomp/ChangeLog:

	* configure.ac: Add OMP_DEPEND_KIND and OMP_INT128_SIZE.
	* libgomp_f.h.in (omp_check_defines): Check whether
	sizeof of determined Fortran kind and C typedef match.
	* omp_lib.f90.in: Add omp_depend_kind.
	* omp_lib.h.in: Likewise; fix omp_alloctrait_key_kind.
	* configure: Regenerate.
	* Makefile.in: Regenerate.
	* testsuite/Makefile.in: Regenerate.
2020-07-23 15:02:15 +02:00
Tobias Burnus
ade6e7204c critical-hint-*.{c,f90}: Move from gcc/testsuite to libgomp/testsuite
libgomp/ChangeLog:

	* testsuite/libgomp.c-c++-common/critical-hint-1.c: New; moved from
	gcc/testsuite/c-c++-common/gomp/.
	* testsuite/libgomp.c-c++-common/critical-hint-2.c: Likewise.
	* testsuite/libgomp.fortran/critical-hint-1.f90: New; moved
	from gcc/testsuite/gfortran.dg/gomp/.
	* testsuite/libgomp.fortran/critical-hint-2.f90: Likewise.

gcc/testsuite/ChangeLog:

	* c-c++-common/gomp/critical-hint-1.c: Moved to libgomp/.
	* c-c++-common/gomp/critical-hint-2.c: Moved to libgomp/.
	* gfortran.dg/gomp/critical-hint-1.f90: Moved to libgomp/.
	* gfortran.dg/gomp/critical-hint-2.f90: Moved to libgomp/.
2020-07-22 12:14:22 +02:00
H.J. Lu
7aa22a8f1a x86-64: Define ASM_OUTPUT_ALIGNED_DECL_LOCAL
Define ASM_OUTPUT_ALIGNED_DECL_LOCAL for large local common symbol.

gcc/

	PR target/95620
	* config/i386/x86-64.h (ASM_OUTPUT_ALIGNED_DECL_LOCAL): New.

libgomp/

	PR target/95620
	* testsuite/libgomp.c/pr95620.c: New test.
2020-07-18 08:51:54 -07:00
Julian Brown
39dda00208 openacc: Fix standalone attach for Fortran assumed-shape array pointers
This patch makes it so that an "attach" operation for a Fortran pointer
with an array descriptor copies that array descriptor to the target,
and similarly that detach operations release the array descriptor.

2020-07-16  Julian Brown  <julian@codesourcery.com>
	    Thomas Schwinge  <thomas@codesourcery.com>

gcc/fortran/
	* trans-openmp.c (gfc_trans_omp_clauses): Rework OpenACC
	attach/detach handling for arrays with descriptors.

gcc/testsuite/
	* gfortran.dg/goacc/attach-descriptor.f90: New test.

libgomp/
	* testsuite/libgomp.oacc-fortran/attach-descriptor-1.f90: New test.
	* testsuite/libgomp.oacc-fortran/attach-descriptor-2.f90: New test.

Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
2020-07-16 14:12:53 -07:00
Tobias Burnus
51542d9254 libgomp.fortran/alloc-1.F90: Fix testcase for 32bit size_t
libgomp/ChangeLog:

	* testsuite/libgomp.fortran/alloc-1.F90: Use c_size_t to
	avoid conversion on 32bit systems from 32bit to 64bit due
	to -fdefault-integer-8.
2020-07-15 17:23:04 +02:00
Tobias Burnus
e0685fadb6 libgomp.fortran/struct-elem-map-1.f90: Add char kind=4 tests
As the Fortran PR 95837 has been fixed, the test can now be added.

libgomp/ChangeLog:

	* testsuite/libgomp.fortran/struct-elem-map-1.f90: Remove unused
	variables; add character(kind=4) tests; update TODO comment.
2020-07-15 12:34:03 +02:00
Tobias Burnus
fff15bad1a libgomp: Add Fortran routine support for allocators
libgomp/ChangeLog:

	* allocator.c: Add ialias for omp_init_allocator and
	omp_destroy_allocator.
	* configure.ac: Set INTPTR_T_KIND.
	* configure: Regenerate.
	* Makefile.in: Regenerate.
	* testsuite/Makefile.in: Regenerate.
	* fortran.c (omp_init_allocator_, omp_destroy_allocator_,
	omp_set_default_allocator_, omp_get_default_allocator_): New
	functions and ialias_redirect.
	* icv.c: Add ialias for omp_set_default_allocator and
	omp_get_default_allocator.
	* libgomp.map (OMP_5.0.1): Add omp_init_allocator_,
	omp_destroy_allocator_, omp_set_default_allocator_ and
	omp_get_default_allocator_.
	* omp_lib.f90.in: Add allocator traits parameters, declare
	allocator routines and add related kind parameters.
	* omp_lib.h.in: Likewise.
	* testsuite/libgomp.c-c++-common/alloc-2.c: Fix sizeof.
	* testsuite/libgomp.fortran/alloc-1.F90: New test.
	* testsuite/libgomp.fortran/alloc-2.F90: New test.
	* testsuite/libgomp.fortran/alloc-3.F: New test.
	* testsuite/libgomp.fortran/alloc-4.f90: New test.
	* testsuite/libgomp.fortran/alloc-5.f90: New test.
2020-07-15 08:33:20 +02:00
Kwok Cheung Yeung
b52643ab90 libgomp: Fix hang when profiling OpenACC programs with CUDA 9.0 nvprof
The version of nvprof in CUDA 9.0 causes a hang when used to profile an
OpenACC program.  This is because it calls acc_get_device_type from
a callback called during device initialization, which then attempts
to acquire acc_device_lock while it is already taken, resulting in
deadlock.  This works around the issue by returning acc_device_none
from acc_get_device_type without attempting to acquire the lock when
initialization has not completed yet.
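
A minimal sketch of this kind of workaround, using plain pthreads.  All
names below are hypothetical stand-ins rather than the libgomp ones, and the
model is reduced to one lock, one state variable and one callback:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* All names below are hypothetical stand-ins, not libgomp's.  */
enum example_device { EXAMPLE_DEVICE_NONE, EXAMPLE_DEVICE_HOST };
enum example_state
{
  EXAMPLE_UNINITIALIZED,
  EXAMPLE_INITIALIZING,
  EXAMPLE_INITIALIZED
};

static pthread_mutex_t device_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
static enum example_state state = EXAMPLE_UNINITIALIZED;
static pthread_t init_thread;
static enum example_device current_device = EXAMPLE_DEVICE_NONE;

/* True if the calling thread is the one currently running initialization.  */
static bool
self_initializing_p (void)
{
  pthread_mutex_lock (&state_lock);
  bool res = (state == EXAMPLE_INITIALIZING
              && pthread_equal (init_thread, pthread_self ()));
  pthread_mutex_unlock (&state_lock);
  return res;
}

enum example_device
example_get_device_type (void)
{
  if (self_initializing_p ())
    return EXAMPLE_DEVICE_NONE;   /* Do not try to re-take device_lock.  */
  pthread_mutex_lock (&device_lock);
  enum example_device d = current_device;
  pthread_mutex_unlock (&device_lock);
  return d;
}

/* Record the new state together with the thread that set it.  */
static void
set_state (enum example_state s)
{
  pthread_mutex_lock (&state_lock);
  state = s;
  init_thread = pthread_self ();
  pthread_mutex_unlock (&state_lock);
}

void
example_init (void (*callback) (void))
{
  assert (!self_initializing_p ());   /* Initialization is not re-entered.  */
  pthread_mutex_lock (&device_lock);
  set_state (EXAMPLE_INITIALIZING);
  if (callback)
    callback ();   /* E.g. a profiler hook that queries the device type.  */
  current_device = EXAMPLE_DEVICE_HOST;
  set_state (EXAMPLE_INITIALIZED);
  pthread_mutex_unlock (&device_lock);
}

/* Example callback: records what it saw while initialization was running.  */
enum example_device seen_in_callback = EXAMPLE_DEVICE_HOST;

void
example_profiler_callback (void)
{
  seen_in_callback = example_get_device_type ();
}
```

Without the self_initializing_p check, the callback would block forever on
device_lock, which the initializing thread already holds.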

2020-07-14  Tom de Vries  <tom@codesourcery.com>
	    Cesar Philippidis  <cesar@codesourcery.com>
	    Thomas Schwinge  <thomas@codesourcery.com>
	    Kwok Cheung Yeung  <kcy@codesourcery.com>

	libgomp/
	* oacc-init.c (acc_init_state_lock, acc_init_state, acc_init_thread):
	New variables.
	(acc_init_1): Set acc_init_thread to pthread_self ().  Set
	acc_init_state to initializing at the start, and to initialized at the
	end.
	(self_initializing_p): New function.
	(acc_get_device_type): Return acc_device_none if called by thread that
	is currently executing acc_init_1.
	* libgomp.texi (acc_get_device_type): Update documentation.
	(Implementation Status and Implementation-Defined Behavior): Likewise.
	* testsuite/libgomp.oacc-c-c++-common/acc_prof-init-2.c: New.
2020-07-14 10:31:35 -07:00
Tobias Burnus
102502e32e [OpenMP, Fortran] Add structure/derived-type element mapping
gcc/fortran/ChangeLog:

	* openmp.c (gfc_match_omp_clauses): Match also derived-type
	component refs in OMP_CLAUSE_MAP.
	(resolve_omp_clauses): Resolve those.
	* trans-openmp.c (gfc_trans_omp_array_section, gfc_trans_omp_clauses):
	Handle OpenMP structure-element mapping.
	(gfc_trans_oacc_construct, gfc_trans_oacc_executable_directive,
	gfc_trans_oacc_combined_directive, gfc_trans_oacc_declare): Update to
	add openacc=true in gfc_trans_omp_clauses call.

gcc/testsuite/ChangeLog:

	* gfortran.dg/goacc/finalize-1.f: Update dump scan pattern.
	* gfortran.dg/gomp/map-1.f90: Update dg-error.
	* gfortran.dg/gomp/map-2.f90: New test.

libgomp/ChangeLog:

	* testsuite/libgomp.fortran/struct-elem-map-1.f90: New test.
2020-07-14 13:39:46 +02:00
Tobias Burnus
174e79bf73 [Fortran, OpenMP] Fix allocatable-components check (PR67311)
gcc/fortran/ChangeLog:

	PR fortran/67311
	* trans-openmp.c (gfc_has_alloc_comps): Return false also for
	pointers to arrays.

libgomp/ChangeLog:

	PR fortran/67311
	* testsuite/libgomp.fortran/target-map-1.f90: New test.
2020-07-14 12:55:53 +02:00
Jakub Jelinek
f418bd4b92 openmp: Adjust outer bounds of non-rect loops
In loops like:
  #pragma omp parallel for collapse(2)
  for (i = -4; i < 8; i++)
    for (j = 3 * i; j > 2 * i; j--)
for some outer loop iterations there are no inner loop iterations at all,
because the condition is false.  In order to use Summæ Potestate to count
the number of iterations or to transform the logical iteration number to
actual iterator values using quadratic non-equation root discovery, the
outer iterator range needs to be adjusted, such that the inner loop has at
least one iteration for each outer loop iterator value in the reduced
range.  Sometimes this adjustment is done at the start of the range,
at other times at the end.

This patch implements it during the compile-time computation of the number
of loop iterations (if all expressions are compile-time constants).
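
For the example loop above, a brute-force walk shows which outer iterations
actually run the inner body.  This is only an illustration of the iteration
space, not the code in omp_extract_for_data; the struct and function names
are made up for the sketch:

```c
#include <assert.h>

struct nonrect_summary
{
  long total;    /* Total number of inner-loop body executions.  */
  int first_i;   /* First outer value with a non-empty inner loop.  */
  int last_i;    /* Last outer value with a non-empty inner loop.  */
};

/* Brute-force walk of
     for (i = n1; i < n2; i++)
       for (j = 3 * i; j > 2 * i; j--)
   recording which outer iterations run the inner body at all.
   first_i and last_i are only meaningful if total is non-zero.  */
struct nonrect_summary
summarize_nonrect (int n1, int n2)
{
  struct nonrect_summary s = { 0, 0, 0 };
  int seen = 0;
  assert (n1 <= n2);
  for (int i = n1; i < n2; i++)
    {
      long inner = 0;
      for (int j = 3 * i; j > 2 * i; j--)
        inner++;
      if (inner > 0)
        {
          if (!seen)
            {
              s.first_i = i;
              seen = 1;
            }
          s.last_i = i;
          s.total += inner;
        }
    }
  return s;
}
```

For n1 = -4 and n2 = 8 the inner loop is empty for every i <= 0, so the
useful outer range is 1 to 7.  In that narrowed range the inner counts are
1, 2, ..., 7, an arithmetic sequence whose sum (28) a closed formula can
produce, which is not true of the original range.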

2020-07-14  Jakub Jelinek  <jakub@redhat.com>

	* omp-general.h (struct omp_for_data): Add adjn1 member.
	* omp-general.c (omp_extract_for_data): For non-rect loop, punt on
	count computing if n1, n2 or step are not INTEGER_CST earlier.
	Narrow the outer iterator range if needed so that non-rect loop
	has at least one iteration for each outer range iteration.  Compute
	adjn1.
	* omp-expand.c (expand_omp_for_init_vars): Use adjn1 if non-NULL
	instead of the outer loop's n1.

	* testsuite/libgomp.c/loop-21.c: New test.
2020-07-14 10:31:59 +02:00
Julian Brown
b20097c65d openacc: Don't strip TO_PSET/POINTER for enter/exit data
OpenACC 2.6 specifies that the array descriptor (when present) must be
copied to the target before attaching pointers in Fortran. This patch
reverses the stripping of GOMP_MAP_TO_PSET and GOMP_MAP_POINTER that
was introduced by the "OpenACC reference count overhaul" patch.

2020-07-10  Julian Brown  <julian@codesourcery.com>
	    Thomas Schwinge  <thomas@codesourcery.com>

gcc/
	* gimplify.c (gimplify_scan_omp_clauses): Do not strip
	GOMP_MAP_TO_PSET/GOMP_MAP_POINTER for OpenACC enter/exit data
	directives (see also PR92929).

gcc/testsuite/
	* gfortran.dg/goacc/finalize-1.f: Update expected dump output.

libgomp/
	* testsuite/libgomp.oacc-fortran/dynamic-pointer-1.f90: New test.

Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
2020-07-13 03:21:20 -07:00
Julian Brown
6f5b4b64d2 openacc: Adjust dynamic reference count semantics
This patch adjusts how dynamic reference counts work so that they match
the semantics of the source program more closely, instead of representing
"excess" reference counts beyond those that represent pointers in the
internal libgomp splay-tree data structure. This allows some corner
cases to be handled more gracefully.

2020-07-10  Julian Brown  <julian@codesourcery.com>
	    Thomas Schwinge  <thomas@codesourcery.com>

	libgomp/
	* libgomp.h (struct splay_tree_key_s): Change virtual_refcount to
	dynamic_refcount.
	(struct gomp_device_descr): Remove GOMP_MAP_VARS_OPENACC_ENTER_DATA.
	* oacc-mem.c (acc_map_data): Substitute virtual_refcount for
	dynamic_refcount.
	(acc_unmap_data): Update comment.
	(goacc_map_var_existing, goacc_enter_datum): Adjust for
	dynamic_refcount semantics.
	(goacc_exit_datum_1, goacc_exit_datum): Re-add some error checking.
	Adjust for dynamic_refcount semantics.
	(goacc_enter_data_internal): Implement "present" case of dynamic
	memory-map handling here.  Update "non-present" case for
	dynamic_refcount semantics.
	(goacc_exit_data_internal): Use goacc_exit_datum_1.
	* target.c (gomp_map_vars_internal): Remove
	GOMP_MAP_VARS_OPENACC_ENTER_DATA handling.  Update for dynamic_refcount
	handling.
	(gomp_unmap_vars_internal): Remove virtual_refcount handling.
	(gomp_load_image_to_device): Substitute dynamic_refcount for
	virtual_refcount.
	* testsuite/libgomp.oacc-c-c++-common/pr92843-1.c: Remove XFAILs.
	* testsuite/libgomp.oacc-c-c++-common/refcounting-1.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/refcounting-2.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/struct-3-1-1.c: New test.
	* testsuite/libgomp.oacc-fortran/deep-copy-6.f90: Remove XFAILs and
	trace output.
	* testsuite/libgomp.oacc-fortran/deep-copy-6-no_finalize.F90: Remove
	trace output.
	* testsuite/libgomp.oacc-fortran/dynamic-incr-structural-1.f90: New
	test.
	* testsuite/libgomp.oacc-c-c++-common/structured-dynamic-lifetimes-4.c:
	Remove stale comment.
	* testsuite/libgomp.oacc-fortran/mdc-refcount-1-1-1.f90: Remove XFAILs.
	* testsuite/libgomp.oacc-fortran/mdc-refcount-1-1-2.F90: Likewise.
	* testsuite/libgomp.oacc-fortran/mdc-refcount-1-2-1.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/mdc-refcount-1-2-2.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/mdc-refcount-1-3-1.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/mdc-refcount-1-4-1.f90: Adjust XFAIL.

Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
2020-07-10 08:07:12 -07:00
Julian Brown
0d00fe404c openacc: Set bias to zero for explicit attach/detach clauses in C and C++
This is a fix for the pointer (or array) size inadvertently being used
for the bias with attach and detach mapping kinds, for both C and C++.

2020-07-09  Julian Brown  <julian@codesourcery.com>
	    Thomas Schwinge  <thomas@codesourcery.com>

gcc/c/
	PR middle-end/95270
	* c-typeck.c (c_finish_omp_clauses): Set OMP_CLAUSE_SIZE (bias) to zero
	for standalone attach/detach clauses.

gcc/cp/
	PR middle-end/95270
	* semantics.c (finish_omp_clauses): Likewise.

include/
	PR middle-end/95270
	* gomp-constants.h (gomp_map_kind): Expand comment for attach/detach
	mapping kinds.

gcc/testsuite/
	PR middle-end/95270
	* c-c++-common/goacc/mdc-1.c: Update expected dump output for zero
	bias.

libgomp/
	PR middle-end/95270
	* testsuite/libgomp.oacc-c-c++-common/pr95270-1.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/pr95270-2.c: New test.
2020-07-09 15:17:52 -07:00
Jakub Jelinek
5acef69f9d openmp: Optimize triangular loop logical iterator to actual iterators computation using search for quadratic equation root(s)
This patch implements the optimized logical to actual iterators
computation for triangular loops.

I have a rough implementation using integers, but this one uses floating
point.  There is a small problem that -fopenmp programs aren't linked with
-lm, so it does it only if the hw has sqrt optab (and uses ifn rather than
__builtin_sqrt because it obviously doesn't need errno handling etc.).

Do you think it is ok this way, or should I use the integral computation
using inlined isqrt (we have inequation of the form
start >= x * t10 + t11 * (((x - 1) * x) / 2)
where t10 and t11 are signed long long values and start unsigned long long,
and the division by 2 actually is a problem for accuracy in some cases, so
if we do it in integral, we need to do actually
      long long t12 = 2 * t10 - t11;
      unsigned long long t13 = t12 * t12 + start * 8 * t11;
      unsigned long long isqrt_ = isqrtull (t13);
      long long x = (((long long) isqrt_ - t12) / t11) >> 1;
with careful overflow checking on all the computations before isqrtull
(and on overflows use the fallback implementation))?
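
The integral variant described above could look roughly like the following
sketch.  It assumes t10 > 0 and t11 > 0 and inputs small enough that nothing
overflows, so the overflow checking and fallback are left out; isqrt_u64 is
a hypothetical stand-in for the isqrtull helper, and the linear scan is only
there as a reference to compare against:

```c
#include <assert.h>

/* Integer square root by binary search: largest r with r * r <= n.
   Hypothetical stand-in for the isqrtull helper mentioned above.  */
static unsigned long long
isqrt_u64 (unsigned long long n)
{
  unsigned long long lo = 0, hi = 0xffffffffULL;   /* r fits in 32 bits.  */
  while (lo < hi)
    {
      unsigned long long mid = lo + (hi - lo + 1) / 2;
      if (mid * mid <= n)
        lo = mid;
      else
        hi = mid - 1;
    }
  return lo;
}

/* Logical iterations executed before outer iteration X, when the first
   outer iteration runs T10 inner iterations and each later one T11 more.  */
static unsigned long long
iterations_before (long long x, long long t10, long long t11)
{
  return (unsigned long long) (x * t10 + t11 * (((x - 1) * x) / 2));
}

/* Largest x with iterations_before (x) <= start, from the root of
   t11 * x^2 + (2 * t10 - t11) * x - 2 * start = 0.  */
long long
outer_index_quadratic (unsigned long long start, long long t10, long long t11)
{
  assert (t10 > 0 && t11 > 0);
  long long t12 = 2 * t10 - t11;
  unsigned long long t13
    = (unsigned long long) (t12 * t12) + start * 8 * (unsigned long long) t11;
  unsigned long long root = isqrt_u64 (t13);
  return (((long long) root - t12) / t11) >> 1;
}

/* Reference implementation: step through outer iterations one by one.  */
long long
outer_index_linear (unsigned long long start, long long t10, long long t11)
{
  long long x = 0;
  while (iterations_before (x + 1, t10, t11) <= start)
    x++;
  return x;
}
```

With t10 = t11 = 1 (inner counts 1, 2, 3, ...), logical iteration 3 is the
first iteration of outer index 2, and both functions return 2 for it.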

2020-07-09  Jakub Jelinek  <jakub@redhat.com>

	* omp-general.h (struct omp_for_data): Add min_inner_iterations
	and factor members.
	* omp-general.c (omp_extract_for_data): Initialize them and remember
	them in OMP_CLAUSE_COLLAPSE_COUNT if needed and restore from there.
	* omp-expand.c (expand_omp_for_init_counts): Fix up computation of
	counts[fd->last_nonrect] if fd->loop.n2 is INTEGER_CST.
	(expand_omp_for_init_vars): For
	fd->first_nonrect + 1 == fd->last_nonrect loops with for now
	INTEGER_CST fd->loop.n2 find quadratic equation roots instead of
	using fallback method when possible.

	* testsuite/libgomp.c/loop-19.c: New test.
	* testsuite/libgomp.c/loop-20.c: New test.
2020-07-09 12:07:17 +02:00
Thomas Schwinge
e7f3f7fe08 [OpenACC] Revert always-copyfrom behavior for 'GOMP_MAP_FORCE_FROM' in 'libgomp/oacc-mem.c:goacc_exit_data_internal'
As done for 'GOMP_MAP_FROM', also for 'GOMP_MAP_FORCE_FROM' we should only
'gomp_copy_dev2host' if 'n->refcount == 0'.
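
As a much-simplified model of that rule (hypothetical names, with memcpy
standing in for the device-to-host copy), the copy back happens only when
the last reference goes away:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, much-simplified stand-in for a libgomp mapping entry.  */
struct example_mapping
{
  int refcount;
  char *host;        /* Host copy of the data.  */
  const char *dev;   /* Stand-in for the device copy.  */
  size_t size;
};

/* Model of an exit-data "from" request: drop one reference and copy the
   device data back only once no references remain.  Returns 1 if a copy
   happened, 0 otherwise.  */
int
example_exit_data_from (struct example_mapping *n)
{
  assert (n->refcount > 0);
  n->refcount--;
  if (n->refcount == 0)
    {
      memcpy (n->host, n->dev, n->size);
      return 1;
    }
  return 0;
}
```

With two outstanding references, the first exit leaves the host data
untouched and only the second one copies.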

This had gotten altered in commit 378da98fcc
(r279621) "OpenACC reference count overhaul".

	libgomp/
	* oacc-mem.c (goacc_exit_data_internal): Revert always-copyfrom
	behavior for 'GOMP_MAP_FORCE_FROM'.
	* testsuite/libgomp.oacc-c-c++-common/pr92843-1.c: Adjust XFAIL.
2020-07-03 17:14:40 +02:00
Thomas Schwinge
8a8efad098 [testsuite] Replace fragile 'scan-assembler' with 'scan-offload-rtl' in 'libgomp.oacc-c-c++-common/pr85381*.c'
These test cases use directives similar to:

    /* { dg-additional-options "-save-temps" } */

    /* { dg-final { scan-assembler-times "bar.sync" 2 } } */

This expects to scan the PTX offloading compilation assembler code (not host
code!), expecting that nvptx offloading code assembly is produced after the
host code, and thus overwrites the latter file.  (Yes, that's certainly
ugly/fragile...)

..., and this broke with recent commit 1dedc12d18
"revamp dump and aux output names" plus fix-up commit
efc16503ca "handle dumpbase in offloading, adjust
testsuite" (short summary: file names changed), so let's finally make that
robust.

	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/pr85381-2.c: Replace fragile
	'scan-assembler' with 'scan-offload-rtl'.
	* testsuite/libgomp.oacc-c-c++-common/pr85381-3.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/pr85381-4.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/pr85381-5.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/pr85381.c: Likewise.
2020-06-30 17:48:36 +02:00
Jakub Jelinek
aed3ab253d openmp: Non-rectangular loop support for non-composite worksharing loops and distribute
This implements the fallback mentioned in
https://gcc.gnu.org/pipermail/gcc/2020-June/232874.html
Special cases for triangular loops etc. to follow later, also composite
constructs not supported yet (need to check the passing of temporaries around)
and lastprivate might not give the same answers as serial loop if the last
innermost body iteration isn't the last one for some of the outer loops
(that will need to be solved separately together with rectangular loops that have no
innermost body iterations, but some of the outer loops actually iterate).
Also, simd needs work.

2020-06-27  Jakub Jelinek  <jakub@redhat.com>

	* omp-general.h (struct omp_for_data_loop): Add non_rect_referenced
	member, move outer member.
	(struct omp_for_data): Add first_nonrect and last_nonrect members.
	* omp-general.c (omp_extract_for_data): Initialize first_nonrect,
	last_nonrect and non_rect_referenced members.
	* omp-expand.c (expand_omp_for_init_counts): Handle non-rectangular
	loops.
	(expand_omp_for_init_vars): Add nonrect_bounds parameter.  Handle
	non-rectangular loops.
	(extract_omp_for_update_vars): Likewise.
	(expand_omp_for_generic, expand_omp_for_static_nochunk,
	expand_omp_for_static_chunk, expand_omp_simd,
	expand_omp_taskloop_for_outer, expand_omp_taskloop_for_inner): Adjust
	expand_omp_for_init_vars and extract_omp_for_update_vars callers.
	(expand_omp_for): Don't sorry on non-composite worksharing-loop or
	distribute.

	* testsuite/libgomp.c/loop-17.c: New test.
	* testsuite/libgomp.c/loop-18.c: New test.
2020-06-27 12:43:36 +02:00
Marek Polacek
0801f41944 c++: Change the default dialect to C++17.
Since GCC 9, C++17 support is no longer experimental.  It was too late
to change the default C++ dialect to C++17 in GCC 10, but I think now
it's time to pull the trigger (C++14 was made the default in GCC 6.1).
We're still missing two C++17 library features, but that shouldn't stop
us.  See
<https://gcc.gnu.org/onlinedocs/libstdc++/manual/status.html#status.iso.2017>
and
<https://gcc.gnu.org/projects/cxx-status.html#cxx17>
for the C++17 status.

I won't list all C++17 features here, but just a few heads-up:

- trigraphs were removed (hardly anyone cares, unless your keyboard is
  missing the # key),
- operator++(bool) was removed (so some tests now run in C++14 and down
  only),
- the keyword register was removed (some legacy code might trip on
  this),
- noexcept specification is now part of the type system and C++17 does
  not allow dynamic exception specifications anymore (the empty throw
  specification is still available, but it is deprecated),
- the evaluation order rules are different in C++17,
- static constexpr data members are now implicitly inline (which makes
  them definitions),
- C++17 requires guaranteed copy elision, meaning that a copy/move
  constructor call might be elided completely.  That means that if
  something relied on a constructor being instantiated via e.g. copying
  a function parameter, it might now fail.

I'll post an update for cxx-status.html and add a new caveat to changes.html
once this is in.

gcc/ChangeLog:
	* doc/invoke.texi (C Dialect Options): Adjust -std default for C++.
	* doc/standards.texi (C Language): Correct the default dialect.
	(C++ Language): Update the default for C++ to gnu++17.

gcc/c-family/ChangeLog:
	* c-opts.c (c_common_init_options): Default to gnu++17.

gcc/testsuite/ChangeLog:
	* c-c++-common/torture/vector-subscript-3.c: In C++17, define away
	the keyword register.
	* g++.dg/cpp1z/attributes-enum-1a.C: Only run pre-C++17.
	* g++.dg/cpp1z/fold7a.C: Likewise.
	* g++.dg/cpp1z/nontype3a.C: Likewise.
	* g++.dg/cpp1z/utf8-2a.C: Likewise.
	* g++.dg/parse/error11.C: Update expected diagnostics for C++17.
	* g++.dg/torture/pr34850.C: Add -Wno-attribute-warning.
	* g++.dg/torture/pr49394.C: In C++17, use noexcept(false).
	* g++.dg/torture/pr82154.C: Use -std=c++14.
	* lib/target-supports.exp: Set to C++17.
	* obj-c++.dg/try-catch-9.mm: Use -Wno-register.

libgomp/ChangeLog:
	* testsuite/libgomp.c++/atomic-3.C: Use -std=gnu++14.
2020-06-26 15:29:07 -04:00
Alexandre Oliva
efc16503ca handle dumpbase in offloading, adjust testsuite
Pass dumpbase on to mkoffloads and their offload-target compiler runs,
using different suffixes for different offloading targets.
Obey -save-temps in naming temporary files while at that.

Adjust the testsuite offload dump scanning machinery to look for dump
files named under the new conventions, iterating internally over all
configured offload targets, or recognizing libgomp's testsuite's own
iteration.


for  gcc/ChangeLog

	* collect-utils.h (dumppfx): New.
	* collect-utils.c (dumppfx): Likewise.
	* lto-wrapper.c (run_gcc): Set global dumppfx.
	(compile_offload_image): Pass a -dumpbase on to mkoffload.
	* config/nvptx/mkoffload.c (ptx_dumpbase): New.
	(main): Handle incoming -dumpbase.  Set ptx_dumpbase.  Obey
	save_temps.
	(compile_native): Pass -dumpbase et al to compiler.
	* config/gcn/mkoffload.c (gcn_dumpbase): New.
	(main): Handle incoming -dumpbase.  Set gcn_dumpbase.  Obey
	save_temps.  Pass -dumpbase et al to offload target compiler.
	(compile_native): Pass -dumpbase et al to compiler.

for  gcc/testsuite/ChangeLog

	* lib/scanoffload.exp: New.
	* lib/scanoffloadrtl.exp: Load it.  Replace ".o" with ""
	globally, and use scanoffload's scoff wrapper to fill it in.
	* lib/scanoffloadtree.exp: Likewise.

for libgomp/ChangeLog

	* testsuite/lib/libgomp.exp: Load gcc lib scanoffload.exp.
	* testsuite/lib/libgomp-dg.exp: Drop now-obsolete -save-temps.
2020-06-23 06:31:18 -03:00
Thomas Schwinge
5864930754 Add 'dg-do run' to 'libgomp.fortran/use_device_ptr-optional-3.f90' [PR94848]
Fix-up for r279858/commit f760c0c77f "Fortran]
OpenMP/OpenACC – fix more issues with OPTIONAL".

With offloading enabled, we then saw:

    PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O0  (test for excess errors)
    PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O0  execution test
    PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O1  (test for excess errors)
    PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O1  execution test
    FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -O2  (test for excess errors)
    UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90   -O2  compilation failed to produce executable
    FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  (test for excess errors)
    UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  compilation failed to produce executable
    FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 -g  (test for excess errors)
    UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 -g  compilation failed to produce executable
    FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -Os  (test for excess errors)
    UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90   -Os  compilation failed to produce executable

 ... due to:

    /tmp/cciVc43I.o:(.gnu.offload_vars+0x10): undefined reference to `A.12.4064'
    [...]

..., but after the recent PR94848, PR95551 changes, that problem is now gone.

	libgomp/
	PR lto/94848
	* testsuite/libgomp.fortran/use_device_ptr-optional-3.f90: Add
	'dg-do run'.
2020-06-18 00:14:46 +02:00
Tobias Burnus
12df77ab6d OpenACC/Fortran: permit 'routine' inside PURE
gcc/fortran/ChangeLog

	* parse.c (decode_oacc_directive): Permit 'acc routine' also
	inside pure procedures.
	* openmp.c (gfc_match_oacc_routine): Inside pure procedures
	do not permit gang, worker or vector clauses.

libgomp/ChangeLog:

	* testsuite/libgomp.oacc-fortran/routine-10.f90: New test.

gcc/testsuite/ChangeLog:

	* gfortran.dg/goacc/pure-elemental-procedures-2.f90: New test.

Reviewed-by: Thomas Schwinge <thomas@codesourcery.com>
2020-06-16 20:23:58 +02:00
Tobias Burnus
1c0fdaf79e openmp: ensure variables in offload table are streamed out (PRs 94848 + 95551)
gcc/ChangeLog:

	PR lto/94848
	PR middle-end/95551
	* omp-offload.c (add_decls_addresses_to_decl_constructor,
	omp_finish_file): Skip removed items.
	* lto-cgraph.c (output_offload_tables): Likewise; set force_output
	to this node for variables and functions.

libgomp/ChangeLog:

	PR lto/94848
	PR middle-end/95551
	* testsuite/libgomp.fortran/target-var.f90: New test.
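
The new test is written in Fortran. As a rough C++ analogue of the situation it covers, the sketch below declares a 'declare target' variable and reads it inside a target region. This assumes that such a variable is what gets an entry in the offload variable table. The names scale and scaled are hypothetical, and this is not the content of target-var.f90.

```cpp
#include <cassert>

// Assumption: a 'declare target' variable is recorded in the offload
// variable table so the host and device copies can be associated.
#pragma omp declare target
int scale = 3;
#pragma omp end declare target

// Hypothetical function: multiplies x by 'scale' inside a target region.
// Without -fopenmp the pragmas are ignored and the code runs on the host.
int scaled(int x) {
  int result = 0;
  #pragma omp target map(tofrom: result)
  {
    result = x * scale;
  }
  return result;
}
```

Built with -fopenmp and an offload target configured, the region may run on the device. Otherwise it falls back to the host and computes the same value.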
2020-06-08 23:24:57 +02:00
Julian Brown
9643f5bbe2 Add 'libgomp.oacc-c-c++-common/struct-copyout-{1,2}.c'
libgomp/
	* testsuite/libgomp.oacc-c-c++-common/struct-copyout-1.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/struct-copyout-2.c: New test.

Reviewed-by: Thomas Schwinge <thomas@codesourcery.com>
2020-06-05 18:04:12 +02:00