Commit Graph

896 Commits

Author SHA1 Message Date
Thomas Schwinge
a43ae03a05 Further test case adjustment re "Fortran: Fix assumed-size to assumed-rank passing"
Fix-up for recent commit 00f6de9c69
"Fortran: Fix assumed-size to assumed-rank passing [PR94070]",
and commit da1f6391b7
"libgomp.oacc-fortran/privatized-ref-2.f90: Fix dg-note".

Due to use of '#if !ACC_MEM_SHARED' conditionals in
'libgomp.oacc-fortran/if-1.f90', 'target { !  openacc_host_selected }'
needs some special care (ignoring the pre-existing mismatch of
'ACC_MEM_SHARED' vs. 'openacc_host_selected').

As seen with GCN offloading, we need to revert to another bit of the
original code in 'libgomp.oacc-fortran/privatized-ref-2.f90'.

	libgomp/
	* testsuite/libgomp.oacc-fortran/if-1.f90: Adjust.
	* testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: Likewise.
2021-09-28 14:18:21 +02:00
Aldy Hernandez
0288527f47 Replace VRP threader with a hybrid forward threader.
This patch implements the new hybrid forward threader and replaces the
embedded VRP threader with it.

With all the pieces that have gone in, the implementation of the hybrid
threader is straightforward: convert the current state into
SSA imports that the solver will understand, and let the path solver
precompute ranges and relations for the path.  After this setup is done,
we can use the range_query API to solve gimple statements in the threader.
The forward threader is now engine agnostic so there are no changes to
the threader per se.

I have put the hybrid bits in tree-ssa-threadedge.*, instead of VRP,
because they will also be used in the evrp removal of the DOM/threader,
which is my next task.

Most of the patch, is actually test changes.  I have gone through every
single one and verified that we're correct.  Most were trivial dump
file name changes, but others required going through the IL an
certifying that the different IL was expected.

For example, in pr59597.c, we have one less thread because the
ASSERT_EXPR was getting in the way, and making it seem like things were
not crossing loops.  The hybrid threader sees the correct representation
of the IL, and avoids threading this one case.

The final numbers are a 12.16% improvement in jump threads immediately
after VRP, and a 0.82% improvement in overall jump threads.  The
performance drop is 0.6% (plus the 1.43% hit from moving the embedded
threader into its own pass).  As I've said, I'd prefer to keep the
threader in its own pass, but if this is an issue, we can address this
with a shared ranger when VRP is replaced with an evrp instance
(upcoming).

Note, that these numbers are slightly different than what I originally
posted.  A few correctness tweaks, plus restricting loop threads, made
the difference.  That being said, I was aiming for par.  A 12% gain is
just gravy ;-).  When we merge the threaders, we should see even better
numbers-- and we'll have the benefit of an entire release stress testing
the solver.

As I mentioned in my introductory note, paths ending in MEM_REF
conditional are missing.  In reality, this didn't make a difference, as
it was so rare.  However, as a follow-up, I will distill a test and add
a suitable PR to keep us honest.

There is a one-line change to libgomp/team.c silencing a new used
uninitialized warning.  As my previous work with the threaders has
shown, warnings flare up after each improvement to jump threading.  I
expect this to be no different.  I've promised Jakub to investigate
fully, so I will analyze and add the appropriate PR for the warning
experts.

Oh yeah, the new pass dump is called vrp-threader[12] to match each
VRP[12] pass.  However, there's no reason for it to either be named
vrp-threader, or for it to live in tree-vrp.c.

Tested on x86-64 Linux.

OK?

p.s. "Did I say 5 weeks?  My bad, I meant 5 months."

gcc/ChangeLog:

	* passes.def (pass_vrp_threader): New.
	* tree-pass.h (make_pass_vrp_threader): Add make_pass_vrp_threader.
	* tree-ssa-threadedge.c (hybrid_jt_state::register_equivs_stmt): New.
	(hybrid_jt_simplifier::hybrid_jt_simplifier): New.
	(hybrid_jt_simplifier::simplify): New.
	(hybrid_jt_simplifier::compute_ranges_from_state): New.
	* tree-ssa-threadedge.h (class hybrid_jt_state): New.
	(class hybrid_jt_simplifier): New.
	* tree-vrp.c (execute_vrp): Remove ASSERT_EXPR based jump
	threader.
	(class hybrid_threader): New.
	(hybrid_threader::hybrid_threader): New.
	(hybrid_threader::~hybrid_threader): New.
	(hybrid_threader::before_dom_children): New.
	(hybrid_threader::after_dom_children): New.
	(execute_vrp_threader): New.
	(class pass_vrp_threader): New.
	(make_pass_vrp_threader): New.

libgomp/ChangeLog:

	* team.c: Initialize start_data.
	* testsuite/libgomp.graphite/force-parallel-4.c: Adjust.
	* testsuite/libgomp.graphite/force-parallel-8.c: Adjust.

gcc/testsuite/ChangeLog:

	* gcc.dg/torture/pr55107.c: Adjust.
	* gcc.dg/tree-ssa/phi_on_compare-1.c: Adjust.
	* gcc.dg/tree-ssa/phi_on_compare-2.c: Adjust.
	* gcc.dg/tree-ssa/phi_on_compare-3.c: Adjust.
	* gcc.dg/tree-ssa/phi_on_compare-4.c: Adjust.
	* gcc.dg/tree-ssa/pr21559.c: Adjust.
	* gcc.dg/tree-ssa/pr59597.c: Adjust.
	* gcc.dg/tree-ssa/pr61839_1.c: Adjust.
	* gcc.dg/tree-ssa/pr61839_3.c: Adjust.
	* gcc.dg/tree-ssa/pr71437.c: Adjust.
	* gcc.dg/tree-ssa/ssa-dom-thread-11.c: Adjust.
	* gcc.dg/tree-ssa/ssa-dom-thread-16.c: Adjust.
	* gcc.dg/tree-ssa/ssa-dom-thread-18.c: Adjust.
	* gcc.dg/tree-ssa/ssa-dom-thread-2a.c: Adjust.
	* gcc.dg/tree-ssa/ssa-dom-thread-4.c: Adjust.
	* gcc.dg/tree-ssa/ssa-thread-14.c: Adjust.
	* gcc.dg/tree-ssa/ssa-vrp-thread-1.c: Adjust.
	* gcc.dg/tree-ssa/vrp106.c: Adjust.
	* gcc.dg/tree-ssa/vrp55.c: Adjust.
2021-09-27 17:39:51 +02:00
Tobias Burnus
da1f6391b7 libgomp.oacc-fortran/privatized-ref-2.f90: Fix dg-note
In my last commit, r12-3897-g00f6de9c69119594f7dad3bd525937c94c8200d0,
which inlined array-size code, I had to update the expected output.  However,
in doing so, I accidentally (copy'n'paste) changed dg-note into dg-message.

libgomp/
	* testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: Change
	dg-message back to dg-note.
2021-09-27 14:33:39 +02:00
Tobias Burnus
00f6de9c69 Fortran: Fix assumed-size to assumed-rank passing [PR94070]
This code inlines the size0 and size1 libgfortran calls, the former is still
used by libgfortan itself (and by old code). Besides permitting more
optimizations, it also permits to handle assumed-rank dummies better: If the
dummy argument is a nonpointer/nonallocatable, an assumed-size actual arg is
repesented by having ubound == -1 for the last dimension. However, for
allocatable/pointers, this value can also exist. Hence, the dummy arg attr
has to be honored.

For that reason, when calling an assumed-rank procedure with nonpointer,
nonallocatable dummy arguments, the bounds have to be updated to avoid
the case ubound == -1 for the last dimension.

	PR fortran/94070

gcc/fortran/ChangeLog:

	* trans-array.c (gfc_tree_array_size): New function to
	find size inline (whole array or one dimension).
	(array_parameter_size): Use it, take stmt_block as arg.
	(gfc_conv_array_parameter): Update call.
	* trans-array.h (gfc_tree_array_size): Add prototype.
	* trans-decl.c (gfor_fndecl_size0, gfor_fndecl_size1): Remove
	these global vars.
	(gfc_build_intrinsic_function_decls): Remove their initialization.
	* trans-expr.c (gfc_conv_procedure_call): Update
	bounds of pointer/allocatable actual args to nonallocatable/nonpointer
	dummies to be one based.
	* trans-intrinsic.c (gfc_conv_intrinsic_shape): Fix case for
	assumed rank with allocatable/pointer dummy.
	(gfc_conv_intrinsic_size): Update to use inline function.
	* trans.h (gfor_fndecl_size0, gfor_fndecl_size1): Remove var decl.

libgfortran/ChangeLog:

	* intrinsics/size.c (size0, size1): Comment that now not
	used by newer compiler code.

libgomp/ChangeLog:

	* testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: Update
	expected dg-note output.

gcc/testsuite/ChangeLog:

	* gfortran.dg/c-interop/cf-out-descriptor-6.f90: Remove xfail.
	* gfortran.dg/c-interop/size.f90: Remove xfail.
	* gfortran.dg/intrinsic_size_3.f90: Update scan-tree-dump-times.
	* gfortran.dg/transpose_optimization_2.f90: Likewise.
	* gfortran.dg/size_optional_dim_1.f90: Add scan-tree-dump-not.
	* gfortran.dg/assumed_rank_22.f90: New test.
	* gfortran.dg/assumed_rank_22_aux.c: New test.
2021-09-27 14:04:54 +02:00
Tobias Burnus
83aac69883 Fortran: Improve -Wmissing-include-dirs warnings [PR55534]
It turned out that enabling the -Wmissing-include-dirs for libcpp did output
too many warnings – at least as run with -B and similar options during the
GCC build and warning for internal include dirs like finclude, unlikely of
relevance to for a real-world user.
This patch now only warns for -I and -J by default but permits to get the
full warnings including libcpp ones with -Wmissing-include-dirs. It
additionally documents this in the manual.

With that change, the -Wno-missing-include-dirs could be removed
from libgfortran's configure and libgomp's testsuite always cflags.
This reverts those bits of the previous
commit r12-3722-g417ea5c02cef7f000e66d1af22b066c2c1cda047

Additionally, it turned out that all call to load_file called exit
explicitly - except for the main file via gfc_init -> gfc_new_file. The
latter also output a file not existing fatal error, such that two errors
where printed. Now exit is called in line with the other users of
load_file.

Finally, when compileing with "nonexisting/file.f90", first a warning that
"nonexisting" does not exist as include path was printed before the file
not found error was printed. Now the directory in which the physical file
is located is added silently, relying on the file-not-found diagnostic for
those.

	PR fortran/55534
gcc/ChangeLog:

	* doc/invoke.texi (-Wno-missing-include-dirs.): Document Fortran
	behavior.

gcc/fortran/ChangeLog:

	* cpp.c (gfc_cpp_register_include_paths, gfc_cpp_post_options):
	Add new bool verbose_missing_dir_warn argument.
	* cpp.h (gfc_cpp_post_options): Update prototype.
	* f95-lang.c (gfc_init): Remove duplicated file-not found diag.
	* gfortran.h (gfc_check_include_dirs): Takes bool
	verbose_missing_dir_warn arg.
	(gfc_new_file): Returns now void.
	* options.c (gfc_post_options): Update to warn for -I and -J,
	only, by default but for all when user requested.
	* scanner.c (gfc_do_check_include_dir):
	(gfc_do_check_include_dirs, gfc_check_include_dirs): Take bool
	verbose warn arg and update to avoid printing the same message
	twice or never.
	(load_file): Fix indent.
	(gfc_new_file): Return void and exit when load_file failed
	as all other load_file users do.

libgfortran/ChangeLog:

	* configure.ac (AM_FCFLAGS): Revert r12-3722 by removing
	-Wno-missing-include-dirs.
	* configure: Regenerate.

libgomp/ChangeLog:

	* testsuite/libgomp.fortran/fortran.exp (ALWAYS_CFLAGS): Revert
	r12-3722 by removing -Wno-missing-include-dirs.
	* testsuite/libgomp.oacc-fortran/fortran.exp (ALWAYS_CFLAGS): Likewise.

gcc/testsuite/ChangeLog:

	* gfortran.dg/include_14.f90: Add -J testcase and update dg-output.
	* gfortran.dg/include_15.f90: Likewise.
	* gfortran.dg/include_16.f90: Likewise.
	* gfortran.dg/include_17.f90: Likewise.
	* gfortran.dg/include_18.f90: Likewise.
	* gfortran.dg/include_19.f90: Likewise.
2021-09-22 20:58:35 +02:00
Jakub Jelinek
059b819e3c openmp: Add support for allocator and align modifiers on allocate clauses
As the allocate-2.c testcase shows, this change isn't 100% backwards compatible,
one could have allocate and/or align functions that return an OpenMP allocator
handle and previously it would call those functions and now would use those
names as keywords for the modifiers.  But it allows specify extra alignment
requirements for the allocations.

2021-09-22  Jakub Jelinek  <jakub@redhat.com>

gcc/
	* tree.h (OMP_CLAUSE_ALLOCATE_ALIGN): Define.
	* tree.c (omp_clause_num_ops): Change number of OMP_CLAUSE_ALLOCATE
	arguments from 2 to 3.
	* tree-pretty-print.c (dump_omp_clause): Print allocator() around
	allocate clause allocator and print align if present.
	* omp-low.c (scan_sharing_clauses): Force allocate_map entry even
	for omp_default_mem_alloc if align modifier is present.  If align
	modifier is present, use TREE_LIST to encode both allocator and
	align.
	(lower_private_allocate, lower_rec_input_clauses, create_task_copyfn):
	Handle align modifier on allocator clause if present.
gcc/c-family/
	* c-omp.c (c_omp_split_clauses): Copy over OMP_CLAUSE_ALLOCATE_ALIGN.
gcc/c/
	* c-parser.c (c_parser_omp_clause_allocate): Parse allocate clause
	modifiers.
gcc/cp/
	* parser.c (cp_parser_omp_clause_allocate): Parse allocate clause
	modifiers.
	* semantics.c (finish_omp_clauses) <OMP_CLAUSE_ALLOCATE>: Perform
	semantic analysis of OMP_CLAUSE_ALLOCATE_ALIGN.
	* pt.c (tsubst_omp_clauses) <case OMP_CLAUSE_ALLOCATE>: Handle
	also OMP_CLAUSE_ALLOCATE_ALIGN.
gcc/testsuite/
	* c-c++-common/gomp/allocate-6.c: New test.
	* c-c++-common/gomp/allocate-7.c: New test.
	* g++.dg/gomp/allocate-4.C: New test.
libgomp/
	* testsuite/libgomp.c-c++-common/allocate-2.c: New test.
	* testsuite/libgomp.c-c++-common/allocate-3.c: New test.
2021-09-22 09:29:13 +02:00
Tobias Burnus
417ea5c02c Fortran: Fix -Wno-missing-include-dirs handling [PR55534]
gcc/fortran/ChangeLog:

	PR fortran/55534
	* cpp.c: Define GCC_C_COMMON_C for #include "options.h" to make
	cpp_reason_option_codes available.
	(gfc_cpp_register_include_paths): Make static, set pfile's
	warn_missing_include_dirs and move before caller.
	(gfc_cpp_init_cb): New, cb code moved from ...
	(gfc_cpp_init_0): ... here.
	(gfc_cpp_post_options): Call gfc_cpp_init_cb.
	(cb_cpp_diagnostic_cpp_option): New. As implemented in c-family
	to match CppReason flags to -W... names.
	(cb_cpp_diagnostic): Use it to replace single special case.
	* cpp.h (gfc_cpp_register_include_paths): Remove as now static.
	* gfortran.h (gfc_check_include_dirs): New prototype.
	(gfc_add_include_path): Add new bool arg.
	* options.c (gfc_init_options): Don't set -Wmissing-include-dirs.
	(gfc_post_options): Set it here after commandline processing. Call
	gfc_add_include_path with defer_warn=false.
	(gfc_handle_option): Call it with defer_warn=true.
	* scanner.c (gfc_do_check_include_dir, gfc_do_check_include_dirs,
	gfc_check_include_dirs): New. Diagnostic moved from ...
	(add_path_to_list): ... here, which came before cmdline processing.
	Take additional bool defer_warn argument.
	(gfc_add_include_path): Take additional defer_warn arg.
	* scanner.h (struct gfc_directorylist): Reorder for alignment issues,
	add new 'bool warn'.

libgfortran/ChangeLog:
	PR fortran/55534
	* configure.ac (AM_FCFLAGS): Add -Wno-missing-include-dirs.
	* configure: Regenerate.

libgomp/ChangeLog:
	PR fortran/55534
	* testsuite/libgomp.fortran/fortran.exp: Add -Wno-missing-include-dirs
	to ALWAYS_CFLAGS.
	* testsuite/libgomp.oacc-fortran/fortran.exp: Likewise.

gcc/testsuite/ChangeLog:
	* gfortran.dg/include_6.f90: Change dg-error to
	dg-warning and update pattern.
	* gfortran.dg/include_14.f90: New test.
	* gfortran.dg/include_15.f90: New test.
	* gfortran.dg/include_16.f90: New test.
	* gfortran.dg/include_17.f90: New test.
	* gfortran.dg/include_18.f90: New test.
	* gfortran.dg/include_19.f90: New test.
	* gfortran.dg/include_20.f90: New test.
	* gfortran.dg/include_21.f90: New test.
2021-09-21 08:28:30 +02:00
Jakub Jelinek
e5597f2ad5 openmp: Allow private or firstprivate arguments to default clause even for C/C++
OpenMP 5.1 allows default(private) or default(firstprivate) even in C/C++,
but it behaves the same way as in Fortran only for variables not declared at
namespace or file scope.  For the namespace/file scope variables it instead
behaves as default(none).

2021-09-18  Jakub Jelinek  <jakub@redhat.com>

gcc/
	* gimplify.c (omp_default_clause): For C/C++ default({,first}private),
	if file/namespace scope variable doesn't have predetermined sharing,
	treat it as if there was default(none).
gcc/c/
	* c-parser.c (c_parser_omp_clause_default): Handle private and
	firstprivate arguments, adjust diagnostics on unknown argument.
gcc/cp/
	* parser.c (cp_parser_omp_clause_default): Handle private and
	firstprivate arguments, adjust diagnostics on unknown argument.
	* cp-gimplify.c (cxx_omp_finish_clause): Handle OMP_CLAUSE_PRIVATE.
gcc/testsuite/
	* c-c++-common/gomp/default-2.c: New test.
	* c-c++-common/gomp/default-3.c: New test.
	* g++.dg/gomp/default-1.C: New test.
libgomp/
	* testsuite/libgomp.c++/default-1.C: New test.
	* testsuite/libgomp.c-c++-common/default-1.c: New test.
	* libgomp.texi (OpenMP 5.1): Mark "private and firstprivate argument
	to default clause in C and C++" as implemented.
2021-09-18 09:47:25 +02:00
Julian Brown
2a3f9f6532 openacc: Shared memory layout optimisation
This patch implements an algorithm to lay out local data-share (LDS)
space.  It currently works for AMD GCN.  At the moment, LDS is used for
three things:

  1. Gang-private variables
  2. Reduction temporaries (accumulators)
  3. Broadcasting for worker partitioning

After the patch is applied, (2) and (3) are placed at preallocated
locations in LDS, and (1) continues to be handled by the backend (as it
is at present prior to this patch being applied). LDS now looks like this:

  +--------------+ (gang-private size + 1024, = 1536)
  | free space   |
  |    ...       |
  | - - - - - - -|
  | worker bcast |
  +--------------+
  | reductions   |
  +--------------+ <<< -mgang-private-size=<number> (def. 512)
  | gang-private |
  |    vars      |
  +--------------+ (32)
  | low LDS vars |
  +--------------+ LDS base

So, gang-private space is fixed at a constant amount at compile time
(which can be increased with a command-line switch if necessary
for some given code). The layout algorithm takes out a slice of the
remainder of usable space for reduction vars, and uses the rest for
worker partitioning.

The partitioning algorithm works as follows.

 1. An "adjacency" set is built up for each basic block that might
    do a broadcast. This is calculated by starting at each such block,
    and doing a recursive DFS walk over successors to find the next
    block (or blocks) that *also* does a broadcast
    (dfs_broadcast_reachable_1).

 2. The adjacency set is inverted to get adjacent predecessor blocks also.

 3. Blocks that will perform a broadcast are sorted by size of that
    broadcast: the biggest blocks are handled first.

 4. A splay tree structure is used to calculate the spans of LDS memory
    that are already allocated by the blocks adjacent to this one
    (merge_ranges{,_1}.

 5. The current block's broadcast space is allocated from the first free
    span not allocated in the splay tree structure calculated above
    (first_fit_range). This seems to work quite nicely and efficiently
    with the splay tree structure.

 6. Continue with the next-biggest broadcast block until we're done.

In this way, "adjacent" broadcasts will not use the same piece of
LDS memory.

PR96334 "openacc: Unshare reduction temporaries for GCN" got merged in:

The GCN backend uses tree nodes like MEM((__lds TYPE *) <constant>)
for reduction temporaries. Unlike e.g. var decls and SSA names, these
nodes cannot be shared during gimplification, but are so in some
circumstances. This is detected when appropriate --enable-checking
options are used. This patch unshares such nodes when they are reused
more than once.

gcc/
	* config/gcn/gcn-protos.h
	(gcn_goacc_create_worker_broadcast_record): Update prototype.
	* config/gcn/gcn-tree.c (gcn_goacc_get_worker_red_decl): Use
	preallocated block of LDS memory.  Do not cache/share decls for
	reduction temporaries between invocations.
	(gcn_goacc_reduction_teardown): Unshare VAR on second use.
	(gcn_goacc_create_worker_broadcast_record): Add OFFSET parameter
	and return temporary LDS space at that offset.  Return pointer in
	"sender" case.
	* config/gcn/gcn.c (acc_lds_size, gang_private_hwm, lds_allocs):
	New global vars.
	(ACC_LDS_SIZE): Define as acc_lds_size.
	(gcn_init_machine_status): Don't initialise lds_allocated,
	lds_allocs, reduc_decls fields of machine function struct.
	(gcn_option_override): Handle default size for gang-private
	variables and -mgang-private-size option.
	(gcn_expand_prologue): Use LDS_SIZE instead of LDS_SIZE-1 when
	initialising M0_REG.
	(gcn_shared_mem_layout): New function.
	(gcn_print_lds_decl): Update comment. Use global lds_allocs map and
	gang_private_hwm variable.
	(TARGET_GOACC_SHARED_MEM_LAYOUT): Define target hook.
	* config/gcn/gcn.h (machine_function): Remove lds_allocated,
	lds_allocs, reduc_decls. Add reduction_base, reduction_limit.
	* config/gcn/gcn.opt (gang_private_size_opt): New global.
	(mgang-private-size=): New option.
	* doc/tm.texi.in (TARGET_GOACC_SHARED_MEM_LAYOUT): Place
	documentation hook.
	* doc/tm.texi: Regenerate.
	* omp-oacc-neuter-broadcast.cc (targhooks.h, diagnostic-core.h):
	Add includes.
	(build_sender_ref): Handle sender_decl being pointer.
	(worker_single_copy): Add PLACEMENT and ISOLATE_BROADCASTS
	parameters.  Pass placement argument to
	create_worker_broadcast_record hook invocations.  Handle
	sender_decl being pointer and isolate_broadcasts inserting extra
	barriers.
	(blk_offset_map_t): Add typedef.
	(neuter_worker_single): Add BLK_OFFSET_MAP parameter.  Pass
	preallocated range to worker_single_copy call.
	(dfs_broadcast_reachable_1): New function.
	(idx_decl_pair_t, used_range_vec_t): New typedefs.
	(sort_size_descending): New function.
	(addr_range): New class.
	(splay_tree_compare_addr_range, splay_tree_free_key)
	(first_fit_range, merge_ranges_1, merge_ranges): New functions.
	(execute_omp_oacc_neuter_broadcast): Rename to...
	(oacc_do_neutering): ... this.  Add BOUNDS_LO, BOUNDS_HI
	parameters.  Arrange layout of shared memory for broadcast
	operations.
	(execute_omp_oacc_neuter_broadcast): New function.
	(pass_omp_oacc_neuter_broadcast::gate): Remove num_workers==1
	handling from here.  Enable pass for all OpenACC routines in order
	to call shared memory-layout hook.
	* target.def (create_worker_broadcast_record): Add OFFSET
	parameter.
	(shared_mem_layout): New hook.
libgomp/
	* testsuite/libgomp.oacc-c-c++-common/broadcast-many.c: Update.
2021-09-17 21:04:30 +02:00
Julian Brown
8251f90e87 Add 'libgomp.oacc-c-c++-common/broadcast-many.c'
libgomp/
	* testsuite/libgomp.oacc-c-c++-common/broadcast-many.c: New test.
2021-09-17 21:04:29 +02:00
Jakub Jelinek
3a2bcffac6 openmp: Add support for OpenMP 5.1 atomics for C++
Besides the C++ FE changes, I've noticed that the C FE didn't reject
  #pragma omp atomic capture compare
  { v = x; x = y; }
and other forms of atomic swap, this patch fixes that too.  And the
c-family/ routine needed quite a few changes so that the new code
in it works fine with both FEs.

2021-09-17  Jakub Jelinek  <jakub@redhat.com>

gcc/c-family/
	* c-omp.c (c_finish_omp_atomic): Avoid creating
	TARGET_EXPR if test is true, use create_tmp_var_raw instead of
	create_tmp_var and add a zero initializer to TARGET_EXPRs that
	had NULL initializer.  When omitting operands after v = x,
	use type of v rather than type of x.  Fix type of vtmp
	TARGET_EXPR.
gcc/c/
	* c-parser.c (c_parser_omp_atomic): Reject atomic swap if capture
	is true.
gcc/cp/
	* cp-tree.h (finish_omp_atomic): Add r and weak arguments.
	* parser.c (cp_parser_omp_atomic): Update function comment for
	OpenMP 5.1 atomics, parse OpenMP 5.1 atomics and fail, compare and
	weak clauses.
	* semantics.c (finish_omp_atomic): Add r and weak arguments, handle
	them, handle COND_EXPRs.
	* pt.c (tsubst_expr): Adjust for COND_EXPR forms that
	finish_omp_atomic can now produce.
gcc/testsuite/
	* c-c++-common/gomp/atomic-18.c: Expect same diagnostics in C++ as in
	C.
	* c-c++-common/gomp/atomic-25.c: Drop c effective target.
	* c-c++-common/gomp/atomic-26.c: Likewise.
	* c-c++-common/gomp/atomic-27.c: Likewise.
	* c-c++-common/gomp/atomic-28.c: Likewise.
	* c-c++-common/gomp/atomic-29.c: Likewise.
	* c-c++-common/gomp/atomic-30.c: Likewise.  Adjust expected diagnostics
	for C++ when it differs from C.
	(foo): Change return type from double to void.
	* g++.dg/gomp/atomic-5.C: Adjust expected diagnostics wording.
	* g++.dg/gomp/atomic-20.C: New test.
libgomp/
	* testsuite/libgomp.c-c++-common/atomic-19.c: Drop c effective target.
	Use /* */ comments instead of //.
	* testsuite/libgomp.c-c++-common/atomic-20.c: Likewise.
	* testsuite/libgomp.c-c++-common/atomic-21.c: Likewise.
	* testsuite/libgomp.c++/atomic-16.C: New test.
	* testsuite/libgomp.c++/atomic-17.C: New test.
2021-09-17 11:28:31 +02:00
Jakub Jelinek
8122fbff77 openmp: Implement OpenMP 5.1 atomics, so far for C only
This patch implements OpenMP 5.1 atomics (with clarifications from upcoming 5.2).
The most important changes are that it is now possible to write (for C/C++,
for Fortran it was possible before already) min/max atomics and more importantly
compare and exchange in various forms.
Also, acq_rel is now allowed on read/write and acq_rel/acquire are allowed on
update, and there are new compare, weak and fail clauses.

2021-09-10  Jakub Jelinek  <jakub@redhat.com>

gcc/
	* tree-core.h (enum omp_memory_order): Add OMP_MEMORY_ORDER_MASK,
	OMP_FAIL_MEMORY_ORDER_UNSPECIFIED, OMP_FAIL_MEMORY_ORDER_RELAXED,
	OMP_FAIL_MEMORY_ORDER_ACQUIRE, OMP_FAIL_MEMORY_ORDER_RELEASE,
	OMP_FAIL_MEMORY_ORDER_ACQ_REL, OMP_FAIL_MEMORY_ORDER_SEQ_CST and
	OMP_FAIL_MEMORY_ORDER_MASK enumerators.
	(OMP_FAIL_MEMORY_ORDER_SHIFT): Define.
	* gimple-pretty-print.c (dump_gimple_omp_atomic_load,
	dump_gimple_omp_atomic_store): Print [weak] for weak atomic
	load/store.
	* gimple.h (enum gf_mask): Change GF_OMP_ATOMIC_MEMORY_ORDER
	to 6-bit mask, adjust GF_OMP_ATOMIC_NEED_VALUE value and add
	GF_OMP_ATOMIC_WEAK.
	(gimple_omp_atomic_weak_p, gimple_omp_atomic_set_weak): New inline
	functions.
	* tree.h (OMP_ATOMIC_WEAK): Define.
	* tree-pretty-print.c (dump_omp_atomic_memory_order): Adjust for
	fail memory order being encoded in the same enum and also print
	fail clause if present.
	(dump_generic_node): Print weak clause if OMP_ATOMIC_WEAK.
	* gimplify.c (goa_stabilize_expr): Add target_expr and rhs arguments,
	handle pre_p == NULL case as a test mode that only returns value
	but doesn't change gimplify nor change anything otherwise, adjust
	recursive calls, add MODIFY_EXPR, ADDR_EXPR, COND_EXPR, TARGET_EXPR
	and CALL_EXPR handling, adjust COMPOUND_EXPR handling for
	__builtin_clear_padding calls, for !rhs gimplify as lvalue rather
	than rvalue.
	(gimplify_omp_atomic): Adjust goa_stabilize_expr caller.  Handle
	COND_EXPR rhs.  Set weak flag on gimple load/store for
	OMP_ATOMIC_WEAK.
	* omp-expand.c (omp_memory_order_to_fail_memmodel): New function.
	(omp_memory_order_to_memmodel): Adjust for fail clause encoded
	in the same enum.
	(expand_omp_atomic_cas): New function.
	(expand_omp_atomic_pipeline): Use omp_memory_order_to_fail_memmodel
	function.
	(expand_omp_atomic): Attempt to optimize atomic compare and exchange
	using expand_omp_atomic_cas.
gcc/c-family/
	* c-common.h (c_finish_omp_atomic): Add r and weak arguments.
	* c-omp.c: Include gimple-fold.h.
	(c_finish_omp_atomic): Add r and weak arguments.  Add support for
	OpenMP 5.1 atomics.
gcc/c/
	* c-parser.c (c_parser_conditional_expression): If omp_atomic_lhs and
	cond.value is >, < or == with omp_atomic_lhs as one of the operands,
	don't call build_conditional_expr, instead build a COND_EXPR directly.
	(c_parser_binary_expression): Avoid calling parser_build_binary_op
	if omp_atomic_lhs even in more cases for >, < or ==.
	(c_parser_omp_atomic): Update function comment for OpenMP 5.1 atomics,
	parse OpenMP 5.1 atomics and fail, compare and weak clauses, allow
	acq_rel on atomic read/write and acq_rel/acquire clauses on update.
	* c-typeck.c (build_binary_op): For flag_openmp only handle
	MIN_EXPR/MAX_EXPR.
gcc/cp/
	* parser.c (cp_parser_omp_atomic): Allow acq_rel on atomic read/write
	and acq_rel/acquire clauses on update.
	* semantics.c (finish_omp_atomic): Adjust c_finish_omp_atomic caller.
gcc/testsuite/
	* c-c++-common/gomp/atomic-17.c (foo): Add tests for atomic read,
	write or update with acq_rel clause and atomic update with acquire clause.
	* c-c++-common/gomp/atomic-18.c (foo): Adjust expected diagnostics
	wording, remove tests moved to atomic-17.c.
	* c-c++-common/gomp/atomic-21.c: Expect only 2 omp atomic release and
	2 omp atomic acq_rel directives instead of 4 omp atomic release.
	* c-c++-common/gomp/atomic-25.c: New test.
	* c-c++-common/gomp/atomic-26.c: New test.
	* c-c++-common/gomp/atomic-27.c: New test.
	* c-c++-common/gomp/atomic-28.c: New test.
	* c-c++-common/gomp/atomic-29.c: New test.
	* c-c++-common/gomp/atomic-30.c: New test.
	* c-c++-common/goacc-gomp/atomic.c: Expect 1 omp atomic release and
	1 omp atomic_acq_rel instead of 2 omp atomic release directives.
	* gcc.dg/gomp/atomic-5.c: Adjust expected error diagnostic wording.
	* g++.dg/gomp/atomic-18.C:Expect 4 omp atomic release and
	1 omp atomic_acq_rel instead of 5 omp atomic release directives.
libgomp/
	* testsuite/libgomp.c-c++-common/atomic-19.c: New test.
	* testsuite/libgomp.c-c++-common/atomic-20.c: New test.
	* testsuite/libgomp.c-c++-common/atomic-21.c: New test.
2021-09-10 20:41:33 +02:00
Thomas Schwinge
086bb917d6 'libgomp.c/target-43.c': '-latomic' for nvptx offloading
... to avoid a regression with recent
commit 090f0d78f1
"openmp: Improve expand_omp_atomic_pipeline":

    unresolved symbol __atomic_compare_exchange_1
    collect2: error: ld returned 1 exit status
    mkoffload: fatal error: [...]/gcc/x86_64-pc-linux-gnu-accel-nvptx-none-gcc returned 1 exit status

	libgomp/
	* testsuite/libgomp.c/target-43.c: '-latomic' for nvptx offloading.
2021-09-06 11:51:13 +02:00
Tobias Burnus
4ce90454c2 libgomp.*/error-1.{c,f90}: Fix dg-output newline pattern
libgomp/ChangeLog:

	* testsuite/libgomp.c-c++-common/error-1.c: Use \r\n not \n\r in
	dg-output.
	* testsuite/libgomp.fortran/error-1.f90: Likewise.
2021-09-03 15:27:00 +02:00
Thomas Schwinge
29c355f76c Add 'libgomp.c/address-space-1.c'
Intel MIC (emulated) offloading execution failure remains to be analyzed.

	libgomp/
	* testsuite/libgomp.c/address-space-1.c: New file.

Co-authored-by: Jakub Jelinek <jakub@redhat.com>
2021-08-23 17:46:08 +02:00
Thomas Schwinge
bb75b22aba Allow matching Intel MIC in OpenMP 'declare variant'
..., and use that to improve XFAILing for Intel MIC offloading execution
instead of compilation in 'libgomp.c-c++-common/target-45.c',
'libgomp.fortran/target10.f90'.

	gcc/
	* config/i386/i386-options.c (ix86_omp_device_kind_arch_isa)
	<omp_device_arch> [ACCEL_COMPILER]: Match "intel_mic".
	* config/i386/t-omp-device (omp-device-properties-i386) <arch>:
	Add "intel_mic".
	libgomp/
	* testsuite/lib/libgomp.exp
	(check_effective_target_offload_target_intelmic): Remove 'proc'.
	(check_effective_target_offload_device_intel_mic): New 'proc'.
	* testsuite/libgomp.c-c++-common/on_device_arch.h
	(device_arch_intel_mic, on_device_arch_intel_mic): New.
	* testsuite/libgomp.c-c++-common/target-45.c: Use that for
	'dg-xfail-run-if'.
	* testsuite/libgomp.fortran/target10.f90: Likewise.
2021-08-23 17:45:40 +02:00
Tobias Burnus
d4de7e32ef Fortran/OpenMP: strict modifier on grainsize/num_tasks
This patch adds support for the 'strict' modifier on grainsize/num_tasks
clauses, an OpenMP 5.1 feature supported in C/C++ since commit
r12-3066-g3bc75533d1f87f0617be6c1af98804f9127ec637

gcc/fortran/ChangeLog:

	* dump-parse-tree.c (show_omp_clauses): Handle 'strict' modifier
	on grainsize/num_tasks
	* gfortran.h (gfc_omp_clauses): Add grainsize_strict
	and num_tasks_strict.
	* trans-openmp.c (gfc_trans_omp_clauses, gfc_split_omp_clauses):
	Handle 'strict' modifier on grainsize/num_tasks.
	* openmp.c (gfc_match_omp_clauses): Likewise.

libgomp/ChangeLog:

	* testsuite/libgomp.fortran/taskloop-4-a.f90: New test.
	* testsuite/libgomp.fortran/taskloop-4.f90: New test.
	* testsuite/libgomp.fortran/taskloop-5-a.f90: New test.
	* testsuite/libgomp.fortran/taskloop-5.f90: New test.
2021-08-23 15:15:30 +02:00
Jakub Jelinek
3bc75533d1 openmp: Add support for strict modifier on grainsize/num_tasks clauses
With strict: modifier on these clauses, the standard is explicit about
how many iterations (and which) each generated task of taskloop directive
should contain.  For num_tasks it actually matches what we were already
implementing, but for grainsize it does not (and even violates the old
rule - without strict it requires that the number of iterations (unspecified
which exactly) handled by each generated task is >= grainsize argument and
< 2 * grainsize argument, with strict: it requires that each generated
task handles exactly == grainsize argument iterations, except for the
generated task handling the last iteration which can handles <= grainsize
iterations).

The following patch implements it for C and C++.

2021-08-23  Jakub Jelinek  <jakub@redhat.com>

gcc/
	* tree.h (OMP_CLAUSE_GRAINSIZE_STRICT): Define.
	(OMP_CLAUSE_NUM_TASKS_STRICT): Define.
	* tree-pretty-print.c (dump_omp_clause) <case OMP_CLAUSE_GRAINSIZE,
	case OMP_CLAUSE_NUM_TASKS>: Print strict: modifier.
	* omp-expand.c (expand_task_call): Use GOMP_TASK_FLAG_STRICT in iflags
	if either grainsize or num_tasks clause has the strict modifier.
gcc/c/
	* c-parser.c (c_parser_omp_clause_num_tasks,
	c_parser_omp_clause_grainsize): Parse the optional strict: modifier.
gcc/cp/
	* parser.c (cp_parser_omp_clause_num_tasks,
	cp_parser_omp_clause_grainsize): Parse the optional strict: modifier.
include/
	* gomp-constants.h (GOMP_TASK_FLAG_STRICT): Define.
libgomp/
	* taskloop.c (GOMP_taskloop): Handle GOMP_TASK_FLAG_STRICT.
	* testsuite/libgomp.c-c++-common/taskloop-4.c (main): Fix up comment.
	* testsuite/libgomp.c-c++-common/taskloop-5.c: New test.
2021-08-23 10:16:24 +02:00
Thomas Schwinge
a5416bf369 Make the OpenMP 'error' directive work for nvptx offloading
... and add a minimum amount of offloading testing.

(Leaving aside that 'fwrite' to 'stderr' probably wouldn't work anyway) the
'fwrite' calls in 'libgomp/error.c:GOMP_warning', 'libgomp/error.c:GOMP_error'
drag in 'isatty', which isn't provided by my nvptx newlib build at present, so
we get, for example:

    [...]
    FAIL: libgomp.c/../libgomp.c-c++-common/declare_target-1.c (test for excess errors)
    Excess errors:
    unresolved symbol isatty
    mkoffload: fatal error: [...]/build-gcc/./gcc/x86_64-pc-linux-gnu-accel-nvptx-none-gcc returned 1 exit status
    [...]

..., and many more.

Fix up for recent commit 0d973c0a0d
"openmp: Implement the error directive".

	libgomp/
	* config/nvptx/error.c (fwrite, exit): Override, too.
	* testsuite/libgomp.c-c++-common/error-1.c: Add a minimum amount
	of offloading testing.
	* testsuite/libgomp.fortran/error-1.f90: Likewise.
2021-08-22 11:08:26 +02:00
Tobias Burnus
77167196fe Fortran: Add OpenMP's error directive
Fortran part to the C/C++ implementation of
commit r12-3040-g0d973c0a0d90a0a302e7eda1a4d9709be3c5b102

gcc/fortran/ChangeLog:

	* dump-parse-tree.c (show_omp_clauses): Handle 'at', 'severity'
	and 'message' clauses.
	(show_omp_node, show_code_node): Handle EXEC_OMP_ERROR.
	* gfortran.h (gfc_statement): Add ST_OMP_ERROR.
	(gfc_omp_severity_type, gfc_omp_at_type): New.
	(gfc_omp_clauses): Add 'at', 'severity' and 'message' clause;
	use more bitfields + ENUM_BITFIELD.
	(gfc_exec_op): Add EXEC_OMP_ERROR.
	* match.h (gfc_match_omp_error): New.
	* openmp.c (enum omp_mask1): Add OMP_CLAUSE_(AT,SEVERITY,MESSAGE).
	(gfc_match_omp_clauses): Handle new clauses.
	(OMP_ERROR_CLAUSES, gfc_match_omp_error): New.
	(resolve_omp_clauses): Resolve new clauses.
	(omp_code_to_statement, gfc_resolve_omp_directive): Handle
	EXEC_OMP_ERROR.
	* parse.c (decode_omp_directive, next_statement,
	gfc_ascii_statement): Handle 'omp error'.
	* resolve.c (gfc_resolve_blocks): Likewise.
	* st.c (gfc_free_statement): Likewise.
	* trans-openmp.c (gfc_trans_omp_error): Likewise.
	(gfc_trans_omp_directive): Likewise.
	* trans.c (trans_code): Likewise.

libgomp/ChangeLog:

	* testsuite/libgomp.fortran/error-1.f90: New test.

gcc/testsuite/ChangeLog:

	* gfortran.dg/gomp/error-1.f90: New test.
	* gfortran.dg/gomp/error-2.f90: New test.
	* gfortran.dg/gomp/error-3.f90: New test.
2021-08-20 12:12:51 +02:00
Jakub Jelinek
0d973c0a0d openmp: Implement the error directive
This patch implements the error directive.  Depending on clauses it is either
a compile time diagnostics (in that case diagnosed right away) or runtime
diagnostics (libgomp API call that diagnoses at runtime), and either fatal
or warning (error or warning at compile time or fatal error vs. error at
runtime) and either has no message or user supplied message (this kind of
e.g. deprecated attribute).  The directive is also stand-alone directive
when at runtime while utility (thus disappears from the IL as if it wasn't
there for parsing like nothing directive) at compile time.

There are some clarifications in the works ATM, so this patch doesn't yet
require that for compile time diagnostics the user message must be a constant
string literal, there are uncertainities on what exactly is valid argument
of message clause (whether just const char * type, convertible to const char *,
qualified/unqualified const char * or char * or what else) and what to do
in templates.  Currently even in templates it is diagnosed right away for
compile time diagnostics, if we'll need to substitute it, we'd need to queue
something into the IL, have pt.c handle it and diagnose only later.

2021-08-20  Jakub Jelinek  <jakub@redhat.com>

gcc/
	* omp-builtins.def (BUILT_IN_GOMP_WARNING, BUILT_IN_GOMP_ERROR): New
	builtins.
gcc/c-family/
	* c-pragma.h (enum pragma_kind): Add PRAGMA_OMP_ERROR.
	* c-pragma.c (omp_pragmas): Add error directive.
	* c-omp.c (omp_directives): Uncomment error directive entry.
gcc/c/
	* c-parser.c (c_parser_omp_error): New function.
	(c_parser_pragma): Handle PRAGMA_OMP_ERROR.
gcc/cp/
	* parser.c (cp_parser_handle_statement_omp_attributes): Determine if
	PRAGMA_OMP_ERROR directive is C_OMP_DIR_STANDALONE.
	(cp_parser_omp_error): New function.
	(cp_parser_pragma): Handle PRAGMA_OMP_ERROR.
gcc/fortran/
	* types.def (BT_FN_VOID_CONST_PTR_SIZE): New DEF_FUNCTION_TYPE_2.
	* f95-lang.c (ATTR_COLD_NORETURN_NOTHROW_LEAF_LIST): Define.
gcc/testsuite/
	* c-c++-common/gomp/error-1.c: New test.
	* c-c++-common/gomp/error-2.c: New test.
	* c-c++-common/gomp/error-3.c: New test.
	* g++.dg/gomp/attrs-1.C (bar): Add error directive test.
	* g++.dg/gomp/attrs-2.C (bar): Add error directive test.
	* g++.dg/gomp/attrs-13.C: New test.
	* g++.dg/gomp/error-1.C: New test.
libgomp/
	* libgomp.map (GOMP_5.1): Add GOMP_error and GOMP_warning.
	* libgomp_g.h (GOMP_warning, GOMP_error): Declare.
	* error.c (GOMP_warning, GOMP_error): New functions.
	* testsuite/libgomp.c-c++-common/error-1.c: New test.
2021-08-20 11:36:52 +02:00
Tobias Burnus
76bb3c50dd Fortran/OpenMP: Add memory routines existing for C/C++
This patch adds the Fortran interface for omp_alloc/omp_free
and the omp_target_* memory routines, which were added in
OpenMP 5.0 for C/C++ but only OpenMP 5.1 added them for Fortran.

Those functions use BIND(C), i.e. on the libgomp side, the same
interface as for C/C++ is used.

Note: By using BIND(C) in omp_lib.h, files including this file
no longer compiler with -std=f95 but require at least -std=f2003.

libgomp/ChangeLog:

	* omp_lib.f90.in (omp_alloc, omp_free, omp_target_alloc,
	omp_target_free. omp_target_is_present, omp_target_memcpy,
	omp_target_memcpy_rect, omp_target_associate_ptr,
	omp_target_disassociate_ptr): Add interface.
	* omp_lib.h.in (omp_alloc, omp_free, omp_target_alloc,
	omp_target_free. omp_target_is_present, omp_target_memcpy,
	omp_target_memcpy_rect, omp_target_associate_ptr,
	omp_target_disassociate_ptr): Add interface.
	* testsuite/libgomp.fortran/alloc-1.F90: Remove local
	interface block for omp_alloc + omp_free.
	* testsuite/libgomp.fortran/alloc-4.f90: Likewise.
	* testsuite/libgomp.fortran/refcount-1.f90: New test.
	* testsuite/libgomp.fortran/target-12.f90: New test.
2021-08-18 11:15:47 +02:00
Jakub Jelinek
5079b7781a openmp: Add nothing directive support
As has been clarified, it is intentional that nothing directive is accepted
in substatements of selection and looping statements and after labels and
is handled as if the directive just isn't there, so that
void
foo (int x)
{
  if (x)
    #pragma omp metadirective when (...:nothing) when (...:parallel)
    bar ();
}
behaves consistently; declarative and stand-alone directives aren't allowed
at that point, but constructs are parsed with the following statement as
the construct body and nothing or missing default on metadirective therefore
should handle the following statement as part of the if substatement instead
of having nothing as the substatement and bar done unconditionally after the
if.

2021-08-18  Jakub Jelinek  <jakub@redhat.com>

gcc/c-family/
	* c-pragma.h (enum pragma_kind): Add PRAGMA_OMP_NOTHING.
	* c-pragma.c (omp_pragmas): Add nothing directive.
	* c-omp.c (omp_directives): Uncomment nothing directive entry.
gcc/c/
	* c-parser.c (c_parser_omp_nothing): New function.
	(c_parser_pragma): Handle PRAGMA_OMP_NOTHING.
gcc/cp/
	* parser.c (cp_parser_omp_nothing): New function.
	(cp_parser_pragma): Handle PRAGMA_OMP_NOTHING.
gcc/testsuite/
	* c-c++-common/gomp/nothing-1.c: New test.
	* g++.dg/gomp/attrs-1.C (bar): Add nothing directive test.
	* g++.dg/gomp/attrs-2.C (bar): Likewise.
	* g++.dg/gomp/attrs-9.C: Likewise.
libgomp/
	* testsuite/libgomp.c-c++-common/nothing-1.c: New test.
2021-08-18 11:10:43 +02:00
Tobias Burnus
f8d535f3fe Fortran: Implement OpenMP 5.1 scope construct
Fortran version to commit e45483c7c4,
which implemented OpenMP's scope construct for C and C++.
Most testcases are based on the C testcases; it also contains some
testcases which existed previously but had no Fortran equivalent.

gcc/fortran/ChangeLog:

	* dump-parse-tree.c (show_omp_node, show_code_node): Handle
	EXEC_OMP_SCOPE.
	* gfortran.h (enum gfc_statement): Add ST_OMP_(END_)SCOPE.
	(enum gfc_exec_op): Add EXEC_OMP_SCOPE.
	* match.h (gfc_match_omp_scope): New.
	* openmp.c (OMP_SCOPE_CLAUSES): Define
	(gfc_match_omp_scope): New.
	(gfc_match_omp_cancellation_point, gfc_match_omp_end_nowait):
	Improve error diagnostic.
	(omp_code_to_statement): Handle ST_OMP_SCOPE.
	(gfc_resolve_omp_directive): Handle EXEC_OMP_SCOPE.
	* parse.c (decode_omp_directive, next_statement,
	gfc_ascii_statement, parse_omp_structured_block,
	parse_executable): Handle OpenMP's scope construct.
	* resolve.c (gfc_resolve_blocks): Likewise
	* st.c (gfc_free_statement): Likewise
	* trans-openmp.c (gfc_trans_omp_scope): New.
	(gfc_trans_omp_directive): Call it.
	* trans.c (trans_code): handle EXEC_OMP_SCOPE.

libgomp/ChangeLog:

	* testsuite/libgomp.fortran/scope-1.f90: New test.
	* testsuite/libgomp.fortran/task-reduction-16.f90: New test.

gcc/testsuite/ChangeLog:

	* gfortran.dg/gomp/scan-1.f90:
	* gfortran.dg/gomp/cancel-1.f90: New test.
	* gfortran.dg/gomp/cancel-4.f90: New test.
	* gfortran.dg/gomp/loop-4.f90: New test.
	* gfortran.dg/gomp/nesting-1.f90: New test.
	* gfortran.dg/gomp/nesting-2.f90: New test.
	* gfortran.dg/gomp/nesting-3.f90: New test.
	* gfortran.dg/gomp/nowait-1.f90: New test.
	* gfortran.dg/gomp/reduction-task-1.f90: New test.
	* gfortran.dg/gomp/reduction-task-2.f90: New test.
	* gfortran.dg/gomp/reduction-task-2a.f90: New test.
	* gfortran.dg/gomp/reduction-task-3.f90: New test.
	* gfortran.dg/gomp/scope-1.f90: New test.
	* gfortran.dg/gomp/scope-2.f90: New test.
2021-08-17 15:51:03 +02:00
Jakub Jelinek
e45483c7c4 openmp: Implement OpenMP 5.1 scope construct
This patch implements the OpenMP 5.1 scope construct, which is similar
to worksharing constructs in many regards, but isn't one of them.
The body of the construct is encountered by all threads though, it can
be nested in itself or intermixed with taskgroup and worksharing etc.
constructs can appear inside of it (but it can't be nested in
worksharing etc. constructs).  The main purpose of the construct
is to allow reductions (normal and task ones) without the need to
close the parallel and reopen another one.

If it doesn't have task reductions, it can be implemented without
any new library support, with nowait it just does the privatizations
at the start if any and reductions before the end of the body, with
without nowait emits a normal GOMP_barrier{,_cancel} at the end too.

For task reductions, we need to ensure only one thread initializes
the task reduction library data structures and other threads copy from that,
so a new GOMP_scope_start routine is added to the library for that.
It acts as if the start of the scope construct is a nowait worksharing
construct (that is ok, it can't be nested in other worksharing
constructs and all threads need to encounter the start in the same
order) which does the task reduction initialization, but as the body
can have other scope constructs and/or worksharing constructs, that is
all where we use this dummy worksharing construct.  With task reductions,
the construct must not have nowait and ends with a GOMP_barrier{,_cancel},
followed by task reductions followed by GOMP_workshare_task_reduction_unregister.

Only C/C++ FE support is done.

2021-08-17  Jakub Jelinek  <jakub@redhat.com>

gcc/
	* tree.def (OMP_SCOPE): New tree code.
	* tree.h (OMP_SCOPE_BODY, OMP_SCOPE_CLAUSES): Define.
	* tree-nested.c (convert_nonlocal_reference_stmt,
	convert_local_reference_stmt, convert_gimple_call): Handle
	GIMPLE_OMP_SCOPE.
	* tree-pretty-print.c (dump_generic_node): Handle OMP_SCOPE.
	* gimple.def (GIMPLE_OMP_SCOPE): New gimple code.
	* gimple.c (gimple_build_omp_scope): New function.
	(gimple_copy): Handle GIMPLE_OMP_SCOPE.
	* gimple.h (gimple_build_omp_scope): Declare.
	(gimple_has_substatements): Handle GIMPLE_OMP_SCOPE.
	(gimple_omp_scope_clauses, gimple_omp_scope_clauses_ptr,
	gimple_omp_scope_set_clauses): New inline functions.
	(CASE_GIMPLE_OMP): Add GIMPLE_OMP_SCOPE.
	* gimple-pretty-print.c (dump_gimple_omp_scope): New function.
	(pp_gimple_stmt_1): Handle GIMPLE_OMP_SCOPE.
	* gimple-walk.c (walk_gimple_stmt): Likewise.
	* gimple-low.c (lower_stmt): Likewise.
	* gimplify.c (is_gimple_stmt): Handle OMP_MASTER.
	(gimplify_scan_omp_clauses): For task reductions, handle OMP_SCOPE
	like ORT_WORKSHARE constructs.  Adjust diagnostics for %<scope%>
	allowing task reductions.  Reject inscan reductions on scope.
	(omp_find_stores_stmt): Handle GIMPLE_OMP_SCOPE.
	(gimplify_omp_workshare, gimplify_expr): Handle OMP_SCOPE.
	* tree-inline.c (remap_gimple_stmt): Handle GIMPLE_OMP_SCOPE.
	(estimate_num_insns): Likewise.
	* omp-low.c (build_outer_var_ref): Look through GIMPLE_OMP_SCOPE
	contexts if var isn't privatized there.
	(check_omp_nesting_restrictions): Handle GIMPLE_OMP_SCOPE.
	(scan_omp_1_stmt): Likewise.
	(maybe_add_implicit_barrier_cancel): Look through outer
	scope constructs.
	(lower_omp_scope): New function.
	(lower_omp_task_reductions): Handle OMP_SCOPE.
	(lower_omp_1): Handle GIMPLE_OMP_SCOPE.
	(diagnose_sb_1, diagnose_sb_2): Likewise.
	* omp-expand.c (expand_omp_single): Support also GIMPLE_OMP_SCOPE.
	(expand_omp): Handle GIMPLE_OMP_SCOPE.
	(omp_make_gimple_edges): Likewise.
	* omp-builtins.def (BUILT_IN_GOMP_SCOPE_START): New built-in.
gcc/c-family/
	* c-pragma.h (enum pragma_kind): Add PRAGMA_OMP_SCOPE.
	* c-pragma.c (omp_pragmas): Add scope construct.
	* c-omp.c (omp_directives): Uncomment scope directive entry.
gcc/c/
	* c-parser.c (OMP_SCOPE_CLAUSE_MASK): Define.
	(c_parser_omp_scope): New function.
	(c_parser_omp_construct): Handle PRAGMA_OMP_SCOPE.
gcc/cp/
	* parser.c (OMP_SCOPE_CLAUSE_MASK): Define.
	(cp_parser_omp_scope): New function.
	(cp_parser_omp_construct, cp_parser_pragma): Handle PRAGMA_OMP_SCOPE.
	* pt.c (tsubst_expr): Handle OMP_SCOPE.
gcc/testsuite/
	* c-c++-common/gomp/nesting-2.c (foo): Add scope and masked
	construct tests.
	* c-c++-common/gomp/scan-1.c (f3): Add scope construct test..
	* c-c++-common/gomp/cancel-1.c (f2): Add scope and masked
	construct tests.
	* c-c++-common/gomp/reduction-task-2.c (bar): Add scope construct
	test.  Adjust diagnostics for the addition of scope.
	* c-c++-common/gomp/loop-1.c (f5): Add master, masked and scope
	construct tests.
	* c-c++-common/gomp/clause-dups-1.c (f1): Add scope construct test.
	* gcc.dg/gomp/nesting-1.c (f1, f2, f3): Add scope construct tests.
	* c-c++-common/gomp/scope-1.c: New test.
	* c-c++-common/gomp/scope-2.c: New test.
	* g++.dg/gomp/attrs-1.C (bar): Add scope construct tests.
	* g++.dg/gomp/attrs-2.C (bar): Likewise.
	* gfortran.dg/gomp/reduction4.f90: Adjust expected diagnostics.
	* gfortran.dg/gomp/reduction7.f90: Likewise.
libgomp/
	* Makefile.am (libgomp_la_SOURCES): Add scope.c
	* Makefile.in: Regenerated.
	* libgomp_g.h (GOMP_scope_start): Declare.
	* libgomp.map: Add GOMP_scope_start@@GOMP_5.1.
	* scope.c: New file.
	* testsuite/libgomp.c-c++-common/scope-1.c: New test.
	* testsuite/libgomp.c-c++-common/task-reduction-16.c: New test.
2021-08-17 09:30:09 +02:00
Thomas Schwinge
a2ab2f0dfb Address '?:' issues in 'libgomp.oacc-c-c++-common/mode-transitions.c'
[...]/libgomp.oacc-c-c++-common/mode-transitions.c: In function ‘t3’:
    [...]/libgomp.oacc-c-c++-common/mode-transitions.c:127:43: warning: ‘?:’ using integer constants in boolean context, the expression will always evaluate to ‘true’ [-Wint-in-bool-context]
      127 |     assert (arr[i] == ((i % 64) < 32) ? 1 : -1);
          |                                           ^

    [...]/libgomp.oacc-c-c++-common/mode-transitions.c: In function ‘t9’:
    [...]/libgomp.oacc-c-c++-common/mode-transitions.c:359:46: warning: ‘?:’ using integer constants in boolean context, the expression will always evaluate to ‘true’ [-Wint-in-bool-context]
      359 |         assert (arr[i] == ((i % 3) == 0) ? 1 : 2);
          |                                              ^

..., and PR101862 "[C, C++] Potential '?:' diagnostic for always-true
expressions in boolean context".

	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/mode-transitions.c: Address
	'?:' issues.
2021-08-16 12:12:09 +02:00
Tobias Burnus
53d5b59cb3 Fortran/OpenMP: Add support for OpenMP 5.1 masked construct
Commit r12-2891-gd0befed793b94f3f407be44e6f69f81a02f5f073 added C/C++
support for the masked construct. This patch extends it to
Fortran.

gcc/fortran/ChangeLog:

	* dump-parse-tree.c (show_omp_clauses): Handle 'filter' clause.
	(show_omp_node, show_code_node): Handle (combined) omp masked construct.
	* frontend-passes.c (gfc_code_walker): Likewise.
	* gfortran.h (enum gfc_statement): Add ST_OMP_*_MASKED*.
	(enum gfc_exec_op): Add EXEC_OMP_*_MASKED*.
	* match.h (gfc_match_omp_masked, gfc_match_omp_masked_taskloop,
	gfc_match_omp_masked_taskloop_simd, gfc_match_omp_parallel_masked,
	gfc_match_omp_parallel_masked_taskloop,
	gfc_match_omp_parallel_masked_taskloop_simd): New prototypes.
	* openmp.c (enum omp_mask1): Add OMP_CLAUSE_FILTER.
	(gfc_match_omp_clauses): Match it.
	(OMP_MASKED_CLAUSES, gfc_match_omp_parallel_masked,
	gfc_match_omp_parallel_masked_taskloop,
	gfc_match_omp_parallel_masked_taskloop_simd,
	gfc_match_omp_masked, gfc_match_omp_masked_taskloop,
	gfc_match_omp_masked_taskloop_simd): New.
	(resolve_omp_clauses): Resolve filter clause.
	(gfc_resolve_omp_parallel_blocks, resolve_omp_do,
	omp_code_to_statement, gfc_resolve_omp_directive): Handle
	omp masked constructs.
	* parse.c (decode_omp_directive, case_exec_markers,
	gfc_ascii_statement, parse_omp_do, parse_omp_structured_block,
	parse_executable): Likewise.
	* resolve.c (gfc_resolve_blocks, gfc_resolve_code): Likewise.
	* st.c (gfc_free_statement): Likewise.
	* trans-openmp.c (gfc_trans_omp_clauses): Handle filter clause.
	(GFC_OMP_SPLIT_MASKED, GFC_OMP_MASK_MASKED): New enum values.
	(gfc_trans_omp_masked): New.
	(gfc_split_omp_clauses): Handle combined masked directives.
	(gfc_trans_omp_master_taskloop): Rename to ...
	(gfc_trans_omp_master_masked_taskloop): ... this; handle also
	combined masked directives.
	(gfc_trans_omp_parallel_master): Rename to ...
	(gfc_trans_omp_parallel_master_masked): ... this; handle
	combined masked directives.
	(gfc_trans_omp_directive): Handle EXEC_OMP_*_MASKED*.
	* trans.c (trans_code): Likewise.

libgomp/ChangeLog:

	* testsuite/libgomp.fortran/masked-1.f90: New test.

gcc/testsuite/ChangeLog:

	* gfortran.dg/gomp/masked-1.f90: New test.
	* gfortran.dg/gomp/masked-2.f90: New test.
	* gfortran.dg/gomp/masked-3.f90: New test.
	* gfortran.dg/gomp/masked-combined-1.f90: New test.
	* gfortran.dg/gomp/masked-combined-2.f90: New test.
2021-08-16 09:26:26 +02:00
Thomas Schwinge
2cc65fcbd4 Adjust 'libgomp.oacc-c-c++-common/static-variable-1.c'
... for 'gcc/gimplify.c:gimplify_scan_omp_clauses' changes in recent
commit d0befed793 "openmp: Add support
for OpenMP 5.1 masked construct".

	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/static-variable-1.c: Adjust.
2021-08-13 22:53:58 +02:00
Jakub Jelinek
d0befed793 openmp: Add support for OpenMP 5.1 masked construct
This construct has been introduced as a replacement for master
construct, but unlike that construct is slightly more general,
has an optional clause which allows to choose which thread
will be the one running the region, it can be some other thread
than the master (primary) thread with number 0, or it could be no
threads or multiple threads (then of course one needs to be careful
about data races).

It is way too early to deprecate the master construct though, we don't
even have OpenMP 5.0 fully implemented, it has been deprecated in 5.1,
will be also in 5.2 and removed in 6.0.  But even then it will likely
be a good idea to just -Wdeprecated warn about it and still accept it.

The patch also contains something I should have done much earlier,
for clauses that accept some integral expression where we only care
about the value, forces during gimplification that value into
either a min invariant (as before), SSA_NAME or a fresh temporary,
but never e.g. a user VAR_DECL, so that for those clauses we don't
need to worry about adjusting it.

2021-08-12  Jakub Jelinek  <jakub@redhat.com>

gcc/
	* tree.def (OMP_MASKED): New tree code.
	* tree-core.h (enum omp_clause_code): Add OMP_CLAUSE_FILTER.
	* tree.h (OMP_MASKED_BODY, OMP_MASKED_CLAUSES, OMP_MASKED_COMBINED,
	OMP_CLAUSE_FILTER_EXPR): Define.
	* tree.c (omp_clause_num_ops): Add OMP_CLAUSE_FILTER entry.
	(omp_clause_code_name): Likewise.
	(walk_tree_1): Handle OMP_CLAUSE_FILTER.
	* tree-nested.c (convert_nonlocal_omp_clauses,
	convert_local_omp_clauses): Handle OMP_CLAUSE_FILTER.
	(convert_nonlocal_reference_stmt, convert_local_reference_stmt,
	convert_gimple_call): Handle GIMPLE_OMP_MASTER.
	* tree-pretty-print.c (dump_omp_clause): Handle OMP_CLAUSE_FILTER.
	(dump_generic_node): Handle OMP_MASTER.
	* gimple.def (GIMPLE_OMP_MASKED): New gimple code.
	* gimple.c (gimple_build_omp_masked): New function.
	(gimple_copy): Handle GIMPLE_OMP_MASKED.
	* gimple.h (gimple_build_omp_masked): Declare.
	(gimple_has_substatements): Handle GIMPLE_OMP_MASKED.
	(gimple_omp_masked_clauses, gimple_omp_masked_clauses_ptr,
	gimple_omp_masked_set_clauses): New inline functions.
	(CASE_GIMPLE_OMP): Add GIMPLE_OMP_MASKED.
	* gimple-pretty-print.c (dump_gimple_omp_masked): New function.
	(pp_gimple_stmt_1): Handle GIMPLE_OMP_MASKED.
	* gimple-walk.c (walk_gimple_stmt): Likewise.
	* gimple-low.c (lower_stmt): Likewise.
	* gimplify.c (is_gimple_stmt): Handle OMP_MASTER.
	(gimplify_scan_omp_clauses): Handle OMP_CLAUSE_FILTER.  For clauses
	that take one expression rather than decl or constant, force
	gimplification of that into a SSA_NAME or temporary unless min
	invariant.
	(gimplify_adjust_omp_clauses): Handle OMP_CLAUSE_FILTER.
	(gimplify_expr): Handle OMP_MASKED.
	* tree-inline.c (remap_gimple_stmt): Handle GIMPLE_OMP_MASKED.
	(estimate_num_insns): Likewise.
	* omp-low.c (scan_sharing_clauses): Handle OMP_CLAUSE_FILTER.
	(check_omp_nesting_restrictions): Handle GIMPLE_OMP_MASKED.  Adjust
	diagnostics for existence of masked construct.
	(scan_omp_1_stmt, lower_omp_master, lower_omp_1, diagnose_sb_1,
	diagnose_sb_2): Handle GIMPLE_OMP_MASKED.
	* omp-expand.c (expand_omp_synch, expand_omp, omp_make_gimple_edges):
	Likewise.
gcc/c-family/
	* c-pragma.h (enum pragma_kind): Add PRAGMA_OMP_MASKED.
	(enum pragma_omp_clause): Add PRAGMA_OMP_CLAUSE_FILTER.
	* c-pragma.c (omp_pragmas_simd): Add masked construct.
	* c-common.h (enum c_omp_clause_split): Add C_OMP_CLAUSE_SPLIT_MASKED
	enumerator.
	(c_finish_omp_masked): Declare.
	* c-omp.c (c_finish_omp_masked): New function.
	(c_omp_split_clauses): Handle combined masked constructs.
gcc/c/
	* c-parser.c (c_parser_omp_clause_name): Parse filter clause name.
	(c_parser_omp_clause_filter): New function.
	(c_parser_omp_all_clauses): Handle PRAGMA_OMP_CLAUSE_FILTER.
	(OMP_MASKED_CLAUSE_MASK): Define.
	(c_parser_omp_masked): New function.
	(c_parser_omp_parallel): Handle parallel masked.
	(c_parser_omp_construct): Handle PRAGMA_OMP_MASKED.
	* c-typeck.c (c_finish_omp_clauses): Handle OMP_CLAUSE_FILTER.
gcc/cp/
	* parser.c (cp_parser_omp_clause_name): Parse filter clause name.
	(cp_parser_omp_clause_filter): New function.
	(cp_parser_omp_all_clauses): Handle PRAGMA_OMP_CLAUSE_FILTER.
	(OMP_MASKED_CLAUSE_MASK): Define.
	(cp_parser_omp_masked): New function.
	(cp_parser_omp_parallel): Handle parallel masked.
	(cp_parser_omp_construct, cp_parser_pragma): Handle PRAGMA_OMP_MASKED.
	* semantics.c (finish_omp_clauses): Handle OMP_CLAUSE_FILTER.
	* pt.c (tsubst_omp_clauses): Likewise.
	(tsubst_expr): Handle OMP_MASKED.
gcc/testsuite/
	* c-c++-common/gomp/clauses-1.c (bar): Add tests for combined masked
	constructs with clauses.
	* c-c++-common/gomp/clauses-5.c (foo): Add testcase for filter clause.
	* c-c++-common/gomp/clause-dups-1.c (f1): Likewise.
	* c-c++-common/gomp/masked-1.c: New test.
	* c-c++-common/gomp/masked-2.c: New test.
	* c-c++-common/gomp/masked-combined-1.c: New test.
	* c-c++-common/gomp/masked-combined-2.c: New test.
	* c-c++-common/goacc/uninit-if-clause.c: Remove xfails.
	* g++.dg/gomp/block-11.C: New test.
	* g++.dg/gomp/tpl-masked-1.C: New test.
	* g++.dg/gomp/attrs-1.C (bar): Add tests for masked construct and
	combined masked constructs with clauses in attribute syntax.
	* g++.dg/gomp/attrs-2.C (bar): Likewise.
	* gcc.dg/gomp/nesting-1.c (f1, f2): Add tests for masked construct
	nesting.
	* gfortran.dg/goacc/host_data-tree.f95: Allow also SSA_NAMEs in if
	clause.
	* gfortran.dg/goacc/kernels-tree.f95: Likewise.
libgomp/
	* testsuite/libgomp.c-c++-common/masked-1.c: New test.
2021-08-12 22:41:17 +02:00
Tobias Burnus
432de08498 OpenMP 5.1: Add proc-bind 'primary' support
In OpenMP 5.1 "master thread" was changed to "primary thread" and
the proc_bind clause and the OMP_PROC_BIND environment variable
now take 'primary' as argument as alias for 'master', while the
latter is deprecated.
This commit accepts 'primary' and adds the named constant
omp_proc_bind_primary and changes 'master thread' in the
documentation; however, given that not even OpenMP 5.0 is
fully supported, omp_display_env and the dumps currently
still output 'master' and there is no deprecation warning
when using the 'master' in the proc_bind clause.

gcc/c/ChangeLog:

	* c-parser.c (c_parser_omp_clause_proc_bind): Accept
	'primary' as alias for 'master'.

gcc/cp/ChangeLog:

	* parser.c (cp_parser_omp_clause_proc_bind): Accept
	'primary' as alias for 'master'.

gcc/fortran/ChangeLog:

	* gfortran.h (gfc_omp_proc_bind_kind): Add OMP_PROC_BIND_PRIMARY.
	* dump-parse-tree.c (show_omp_clauses): Add TODO comment to
	change 'master' to 'primary' in proc_bind for OpenMP 5.1.
	* intrinsic.texi (OMP_LIB): Mention OpenMP 5.1; add
	omp_proc_bind_primary.
	* openmp.c (gfc_match_omp_clauses): Accept
	'primary' as alias for 'master'.
	* trans-openmp.c (gfc_trans_omp_clauses): Handle
	OMP_PROC_BIND_PRIMARY.

gcc/ChangeLog:

	* tree-core.h (omp_clause_proc_bind_kind): Add
	OMP_CLAUSE_PROC_BIND_PRIMARY.
	* tree-pretty-print.c (dump_omp_clause): Add TODO comment to
	change 'master' to 'primary' in proc_bind for OpenMP 5.1.

libgomp/ChangeLog:

	* env.c (parse_bind_var): Accept 'primary' as alias for
	'master'.
	(omp_display_env): Add TODO comment to
	change 'master' to 'primary' in proc_bind for OpenMP 5.1.
	* libgomp.texi: Change 'master thread' to 'primary thread'
	in line with OpenMP 5.1.
	(omp_get_proc_bind): Add omp_proc_bind_primary and note that
	omp_proc_bind_master is an alias of it.
	(OMP_PROC_BIND): Mention 'PRIMARY'.
	* omp.h.in (__GOMP_DEPRECATED_5_1): Define.
	(omp_proc_bind_primary): Add.
	(omp_proc_bind_master): Deprecate for OpenMP 5.1.
	* omp_lib.f90.in (omp_proc_bind_primary): Add.
	(omp_proc_bind_master): Deprecate for OpenMP 5.1.
	* omp_lib.h.in (omp_proc_bind_primary): Add.
	* testsuite/libgomp.c/affinity-1.c: Check that
	'primary' works and is identical to 'master'.

gcc/testsuite/ChangeLog:

	* c-c++-common/gomp/pr61486-2.c: Duplicate one proc_bind(master)
	testcase and test proc_bind(primary) instead.
	* gfortran.dg/gomp/affinity-1.f90: Likewise.
2021-08-12 15:49:49 +02:00
Julian Brown
c408512e1f amdgcn: Enable OpenACC worker partitioning for AMD GCN
gcc/
	* config/gcn/gcn.c (gcn_init_builtins): Override decls for
	BUILT_IN_GOACC_SINGLE_START, BUILT_IN_GOACC_SINGLE_COPY_START,
	BUILT_IN_GOACC_SINGLE_COPY_END and BUILT_IN_GOACC_BARRIER.
	(gcn_goacc_validate_dims): Turn on worker partitioning unconditionally.
	(gcn_fork_join): Update comment.
	* config/gcn/gcn.opt (flag_worker_partitioning): Remove.
	(macc_experimental_workers): Remove unused option.
	libgomp/
	* plugin/plugin-gcn.c (gcn_exec): Change default number of workers to
	16.
	* testsuite/libgomp.oacc-c-c++-common/acc_prof-kernels-1.c
	[acc_device_radeon]: Update.
	* testsuite/libgomp.oacc-c-c++-common/loop-dim-default.c
	[ACC_DEVICE_TYPE_radeon]: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/parallel-dims.c
	[acc_device_radeon]: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/routine-wv-2.c
	[ACC_DEVICE_TYPE_radeon]: Likewise.
	* testsuite/libgomp.oacc-fortran/optional-reduction.f90: XFAIL for
	'openacc_radeon_accel_selected' and '-O0'.
	* testsuite/libgomp.oacc-fortran/reduction-7.f90: Likewise.

Co-Authored-By: Kwok Cheung Yeung <kcy@codesourcery.com>
Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
2021-08-09 15:08:44 +02:00
Chung-Lin Tang
0bac793ed6 openmp: Implement omp_get_device_num routine
This patch implements the omp_get_device_num library routine, specified in
OpenMP 5.0.

GOMP_DEVICE_NUM_VAR is a macro symbol which defines name of a "device number"
variable, is defined on the device-side libgomp, has it's address returned to
host-side libgomp during device initialization, and the host libgomp then
sets its value to the designated device number.

libgomp/ChangeLog:

	* icv-device.c (omp_get_device_num): New API function, host side.
	* fortran.c (omp_get_device_num_): New interface function.
	* libgomp-plugin.h (GOMP_DEVICE_NUM_VAR): Define macro symbol.
	* libgomp.map (OMP_5.0.2): New version space with omp_get_device_num,
	omp_get_device_num_.
	* libgomp.texi (omp_get_device_num): Add documentation for new API
	function.
	* omp.h.in (omp_get_device_num): Add declaration.
	* omp_lib.f90.in (omp_get_device_num): Likewise.
	* omp_lib.h.in (omp_get_device_num): Likewise.
	* target.c (gomp_load_image_to_device): If additional entry for device
	number exists at end of returned entries from 'load_image_func' hook,
	copy the assigned device number over to the device variable.

	* config/gcn/icv-device.c (GOMP_DEVICE_NUM_VAR): Define static global.
	(omp_get_device_num): New API function, device side.
	* plugin/plugin-gcn.c ("symcat.h"): Add include.
	(GOMP_OFFLOAD_load_image): Add addresses of device GOMP_DEVICE_NUM_VAR
	at end of returned 'target_table' entries.

	* config/nvptx/icv-device.c (GOMP_DEVICE_NUM_VAR): Define static global.
	(omp_get_device_num): New API function, device side.
	* plugin/plugin-nvptx.c ("symcat.h"): Add include.
	(GOMP_OFFLOAD_load_image): Add addresses of device GOMP_DEVICE_NUM_VAR
	at end of returned 'target_table' entries.

	* testsuite/lib/libgomp.exp
	(check_effective_target_offload_target_intelmic): New function for
	testing for intelmic offloading.
	* testsuite/libgomp.c-c++-common/target-45.c: New test.
	* testsuite/libgomp.fortran/target10.f90: New test.
2021-08-05 23:29:03 +08:00
Thomas Schwinge
0829ab79d3 [OpenACC] Extract 'pass_oacc_loop_designation' out of 'pass_oacc_device_lower'
This really is a separate step -- and another pass to be added between the two,
later on.

	gcc/
	* omp-offload.c (oacc_loop_xform_head_tail, oacc_loop_process):
	'update_stmt' after modification.
	(pass_oacc_loop_designation): New function, extracted out of...
	(pass_oacc_device_lower): ... this.
	(pass_data_oacc_loop_designation, pass_oacc_loop_designation)
	(make_pass_oacc_loop_designation): New
	* passes.def: Add it.
	* tree-parloops.c (create_parallel_loop): Adjust.
	* tree-pass.h (make_pass_oacc_loop_designation): New.
	gcc/testsuite/
	* c-c++-common/goacc/classify-kernels-unparallelized.c:
	's%oaccdevlow%oaccloops%g'.
	* c-c++-common/goacc/classify-kernels.c: Likewise.
	* c-c++-common/goacc/classify-parallel.c: Likewise.
	* c-c++-common/goacc/classify-routine-nohost.c: Likewise.
	* c-c++-common/goacc/classify-routine.c: Likewise.
	* c-c++-common/goacc/classify-serial.c: Likewise.
	* c-c++-common/goacc/routine-nohost-1.c: Likewise.
	* g++.dg/goacc/template.C: Likewise.
	* gcc.dg/goacc/loop-processing-1.c: Likewise.
	* gfortran.dg/goacc/classify-kernels-unparallelized.f95: Likewise.
	* gfortran.dg/goacc/classify-kernels.f95: Likewise.
	* gfortran.dg/goacc/classify-parallel.f95: Likewise.
	* gfortran.dg/goacc/classify-routine-nohost.f95: Likewise.
	* gfortran.dg/goacc/classify-routine.f95: Likewise.
	* gfortran.dg/goacc/classify-serial.f95: Likewise.
	* gfortran.dg/goacc/routine-multiple-directives-1.f90: Likewise.
	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/pr85486-2.c:
	's%oaccdevlow%oaccloops%g'.
	* testsuite/libgomp.oacc-c-c++-common/pr85486-3.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/pr85486.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/routine-nohost-1.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/vector-length-128-1.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/vector-length-128-2.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/vector-length-128-3.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/vector-length-128-4.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/vector-length-128-5.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/vector-length-128-6.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/vector-length-128-7.c:
	Likewise.
	* testsuite/libgomp.oacc-fortran/routine-nohost-1.f90: Likewise.

Co-Authored-By: Julian Brown <julian@codesourcery.com>
Co-Authored-By: Kwok Cheung Yeung <kcy@codesourcery.com>
2021-07-29 09:19:44 +02:00
Aldy Hernandez
2e96b5f14e Backwards jump threader rewrite with ranger.
This is a rewrite of the backwards threader with a ranger based solver.

The code is divided into two parts: the path solver in
gimple-range-path.*, and the path discovery bits in
tree-ssa-threadbackward.c.

The legacy code is still available with --param=threader-mode=legacy,
but will be removed shortly after.

gcc/ChangeLog:

	* Makefile.in (tree-ssa-loop-im.o-warn): New.
	* flag-types.h (enum threader_mode): New.
	* params.opt: Add entry for --param=threader-mode.
	* tree-ssa-threadbackward.c (THREADER_ITERATIVE_MODE): New.
	(class back_threader): New.
	(back_threader::back_threader): New.
	(back_threader::~back_threader): New.
	(back_threader::maybe_register_path): New.
	(back_threader::find_taken_edge): New.
	(back_threader::find_taken_edge_switch): New.
	(back_threader::find_taken_edge_cond): New.
	(back_threader::resolve_def): New.
	(back_threader::resolve_phi): New.
	(back_threader::find_paths_to_names): New.
	(back_threader::find_paths): New.
	(dump_path): New.
	(debug): New.
	(thread_jumps::find_jump_threads_backwards): Call ranger threader.
	(thread_jumps::find_jump_threads_backwards_with_ranger): New.
	(pass_thread_jumps::execute): Abstract out code...
	(try_thread_blocks): ...here.
	* tree-ssa-threadedge.c (jump_threader::thread_outgoing_edges):
	Abstract out threading candidate code to...
	(single_succ_to_potentially_threadable_block): ...here.
	* tree-ssa-threadedge.h (single_succ_to_potentially_threadable_block):
	New.
	* tree-ssa-threadupdate.c (register_jump_thread): Return boolean.
	* tree-ssa-threadupdate.h (class jump_thread_path_registry):
	Return bool from register_jump_thread.

libgomp/ChangeLog:

	* testsuite/libgomp.graphite/force-parallel-4.c: Adjust for
	threader.
	* testsuite/libgomp.graphite/force-parallel-8.c: Same.

gcc/testsuite/ChangeLog:

	* g++.dg/debug/dwarf2/deallocator.C: Adjust for threader.
	* gcc.c-torture/compile/pr83510.c: Same.
	* dg.dg/analyzer/pr94851-2.c: Same.
	* gcc.dg/loop-unswitch-2.c: Same.
	* gcc.dg/old-style-asm-1.c: Same.
	* gcc.dg/pr68317.c: Same.
	* gcc.dg/pr97567-2.c: Same.
	* gcc.dg/predict-9.c: Same.
	* gcc.dg/shrink-wrap-loop.c: Same.
	* gcc.dg/sibcall-1.c: Same.
	* gcc.dg/tree-ssa/builtin-sprintf-3.c: Same.
	* gcc.dg/tree-ssa/pr21001.c: Same.
	* gcc.dg/tree-ssa/pr21294.c: Same.
	* gcc.dg/tree-ssa/pr21417.c: Same.
	* gcc.dg/tree-ssa/pr21458-2.c: Same.
	* gcc.dg/tree-ssa/pr21563.c: Same.
	* gcc.dg/tree-ssa/pr49039.c: Same.
	* gcc.dg/tree-ssa/pr61839_1.c: Same.
	* gcc.dg/tree-ssa/pr61839_3.c: Same.
	* gcc.dg/tree-ssa/pr77445-2.c: Same.
	* gcc.dg/tree-ssa/split-path-4.c: Same.
	* gcc.dg/tree-ssa/ssa-dom-thread-11.c: Same.
	* gcc.dg/tree-ssa/ssa-dom-thread-12.c: Same.
	* gcc.dg/tree-ssa/ssa-dom-thread-14.c: Same.
	* gcc.dg/tree-ssa/ssa-dom-thread-18.c: Same.
	* gcc.dg/tree-ssa/ssa-dom-thread-6.c: Same.
	* gcc.dg/tree-ssa/ssa-dom-thread-7.c: Same.
	* gcc.dg/tree-ssa/ssa-fre-48.c: Same.
	* gcc.dg/tree-ssa/ssa-thread-11.c: Same.
	* gcc.dg/tree-ssa/ssa-thread-12.c: Same.
	* gcc.dg/tree-ssa/ssa-thread-14.c: Same.
	* gcc.dg/tree-ssa/vrp02.c: Same.
	* gcc.dg/tree-ssa/vrp03.c: Same.
	* gcc.dg/tree-ssa/vrp05.c: Same.
	* gcc.dg/tree-ssa/vrp06.c: Same.
	* gcc.dg/tree-ssa/vrp07.c: Same.
	* gcc.dg/tree-ssa/vrp09.c: Same.
	* gcc.dg/tree-ssa/vrp19.c: Same.
	* gcc.dg/tree-ssa/vrp20.c: Same.
	* gcc.dg/tree-ssa/vrp33.c: Same.
	* gcc.dg/uninit-pred-9_b.c: Same.
	* gcc.dg/uninit-pr61112.c: Same.
	* gcc.dg/vect/bb-slp-16.c: Same.
	* gcc.target/i386/avx2-vect-aggressive.c: Same.
	* gcc.dg/tree-ssa/ranger-threader-1.c: New test.
	* gcc.dg/tree-ssa/ranger-threader-2.c: New test.
	* gcc.dg/tree-ssa/ranger-threader-3.c: New test.
	* gcc.dg/tree-ssa/ranger-threader-4.c: New test.
	* gcc.dg/tree-ssa/ranger-threader-5.c: New test.
2021-07-29 08:24:50 +02:00
Thomas Schwinge
d88a695158 Don't use libgomp 'cbuf' buffering with OpenACC 'async'
The host data might not be computed yet (by an earlier asynchronous compute
region, for example.

	libgomp/
	* target.c (gomp_coalesce_buf_add): Update comment.
	(gomp_copy_host2dev, gomp_map_vars_internal): Don't expect to see
	'aq && cbuf'.
	(gomp_map_vars_internal): Only 'if (!aq)', do
	'gomp_coalesce_buf_add'.
	* testsuite/libgomp.oacc-c-c++-common/async-data-1-2.c: Remove
	XFAIL.

Co-Authored-By: Julian Brown <julian@codesourcery.com>
2021-07-27 11:16:37 +02:00
Julian Brown
9c41f5b9cd Fix OpenACC "ephemeral" asynchronous host-to-device copies
This patch fixes several places in libgomp/target.c where "ephemeral" data
(on the stack or in temporary heap locations) may be used as the source of
an asynchronous host-to-device copy that may not complete before the host
data disappears.

An existing, but flawed, workaround for this problem in the AMD GCN
libgomp offloading plugin is currently present on mainline, and was
posted for the og9 branch here:

  https://gcc.gnu.org/legacy-ml/gcc-patches/2019-08/msg00901.html

and previous versions of this patch were posted here (for mainline/og9):

  https://gcc.gnu.org/legacy-ml/gcc-patches/2019-11/msg01482.html
  https://gcc.gnu.org/legacy-ml/gcc-patches/2019-09/msg01026.html

libgomp/
	* libgomp.h (gomp_copy_host2dev): Update prototype.
	* oacc-mem.c (memcpy_tofrom_device, update_dev_host): Add new
	argument to gomp_copy_host2dev (false).
	* plugin/plugin-gcn.c (struct copy_data): Remove free_src field.
	(copy_data): Don't free src.
	(queue_push_copy): Remove free_src handling.
	(GOMP_OFFLOAD_dev2dev): Update call to queue_push_copy.
	(GOMP_OFFLOAD_openacc_async_host2dev): Remove source-data
	snapshotting.
	(GOMP_OFFLOAD_openacc_async_dev2host): Update call to
	queue_push_copy.
	* target.c (goacc_device_copy_async): Add SRCADDR_ORIG parameter.
	(gomp_copy_host2dev): Add EPHEMERAL parameter.  Snapshot source
	data when true, and set up deferred freeing of temporary buffer.
	(gomp_copy_dev2host): Update call to goacc_device_copy_async.
	(gomp_map_vars_existing, gomp_map_pointer, gomp_attach_pointer)
	(gomp_detach_pointer, gomp_map_vars_internal, gomp_update): Update
	calls to gomp_copy_host2dev with appropriate ephemeral argument.
	* testsuite/libgomp.oacc-c-c++-common/async-data-1-1.c: Remove
	XFAIL.

Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
2021-07-27 11:16:27 +02:00
Thomas Schwinge
88c40c36db Add 'libgomp.oacc-c-c++-common/async-data-1-{1,2}.c'
libgomp/
	* testsuite/libgomp.oacc-c-c++-common/async-data-1-1.c: New file.
	* testsuite/libgomp.oacc-c-c++-common/async-data-1-2.c: Likewise.

Co-Authored-By: Tom de Vries <tom@codesourcery.com>
2021-07-27 11:16:26 +02:00
Thomas Schwinge
29ddaf43f7 [OpenACC] Clarify sequencing of 'async' data copying vs. profiling events in 'libgomp.oacc-c-c++-common/acc_prof-{init,parallel}-1.c'
... as noticed with GCN offloading.

Fix-up for r271346 (commit 5fae049dc2)
"OpenACC Profiling Interface (incomplete)".

	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/acc_prof-init-1.c: Clarify
	sequencing of 'async' data copying vs. profiling events.
	* testsuite/libgomp.oacc-c-c++-common/acc_prof-parallel-1.c:
	Likewise.
2021-07-27 11:16:25 +02:00
Thomas Schwinge
599e275d7e Fix OpenACC 'async'/'wait' issues in 'libgomp.oacc-c-c++-common/lib-{94,95}.c', 'libgomp.oacc-fortran/lib-16{,-2}.f90'
Fix-up for r265842 (commit 58168bbf6f)
"[OpenACC 2.5, libgomp] Add *_async versions of runtime library API functions".

	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/lib-94.c: Fix OpenACC
	'async'/'wait' issue.
	* testsuite/libgomp.oacc-c-c++-common/lib-95.c: Likewise.
	* testsuite/libgomp.oacc-fortran/lib-16-2.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/lib-16.f90: Likewise.

Co-Authored-By: Julian Brown <julian@codesourcery.com>
2021-07-27 11:16:24 +02:00
Thomas Schwinge
a61f6afbee OpenACC 'nohost' clause
Do not "compile a version of this procedure for the host".

	gcc/
	* tree-core.h (omp_clause_code): Add 'OMP_CLAUSE_NOHOST'.
	* tree.c (omp_clause_num_ops, omp_clause_code_name, walk_tree_1):
	Handle it.
	* tree-pretty-print.c (dump_omp_clause): Likewise.
	* omp-general.c (oacc_verify_routine_clauses): Likewise.
	* gimplify.c (gimplify_scan_omp_clauses)
	(gimplify_adjust_omp_clauses): Likewise.
	* tree-nested.c (convert_nonlocal_omp_clauses)
	(convert_local_omp_clauses): Likewise.
	* omp-low.c (scan_sharing_clauses): Likewise.
	* omp-offload.c (execute_oacc_device_lower): Update.
	gcc/c-family/
	* c-pragma.h (pragma_omp_clause): Add 'PRAGMA_OACC_CLAUSE_NOHOST'.
	gcc/c/
	* c-parser.c (c_parser_omp_clause_name): Handle 'nohost'.
	(c_parser_oacc_all_clauses): Handle 'PRAGMA_OACC_CLAUSE_NOHOST'.
	(OACC_ROUTINE_CLAUSE_MASK): Add 'PRAGMA_OACC_CLAUSE_NOHOST'.
	* c-typeck.c (c_finish_omp_clauses): Handle 'OMP_CLAUSE_NOHOST'.
	gcc/cp/
	* parser.c (cp_parser_omp_clause_name): Handle 'nohost'.
	(cp_parser_oacc_all_clauses): Handle 'PRAGMA_OACC_CLAUSE_NOHOST'.
	(OACC_ROUTINE_CLAUSE_MASK): Add 'PRAGMA_OACC_CLAUSE_NOHOST'.
	* pt.c (tsubst_omp_clauses): Handle 'OMP_CLAUSE_NOHOST'.
	* semantics.c (finish_omp_clauses): Likewise.
	gcc/fortran/
	* dump-parse-tree.c (show_attr): Update.
	* gfortran.h (symbol_attribute): Add 'oacc_routine_nohost' member.
	(gfc_omp_clauses): Add 'nohost' member.
	* module.c (ab_attribute): Add 'AB_OACC_ROUTINE_NOHOST'.
	(attr_bits, mio_symbol_attribute): Update.
	* openmp.c (omp_mask2): Add 'OMP_CLAUSE_NOHOST'.
	(gfc_match_omp_clauses): Handle 'OMP_CLAUSE_NOHOST'.
	(OACC_ROUTINE_CLAUSES): Add 'OMP_CLAUSE_NOHOST'.
	(gfc_match_oacc_routine): Update.
	* trans-decl.c (add_attributes_to_decl): Update.
	* trans-openmp.c (gfc_trans_omp_clauses): Likewise.
	gcc/testsuite/
	* c-c++-common/goacc/classify-routine-nohost.c: New file.
	* c-c++-common/goacc/classify-routine.c: Update.
	* c-c++-common/goacc/routine-2.c: Likewise.
	* c-c++-common/goacc/routine-nohost-1.c: New file.
	* c-c++-common/goacc/routine-nohost-2.c: Likewise.
	* g++.dg/goacc/template.C: Update.
	* gfortran.dg/goacc/classify-routine-nohost.f95: New file.
	* gfortran.dg/goacc/classify-routine.f95: Update.
	* gfortran.dg/goacc/pure-elemental-procedures-2.f90: Likewise.
	* gfortran.dg/goacc/routine-6.f90: Likewise.
	* gfortran.dg/goacc/routine-intrinsic-2.f: Likewise.
	* gfortran.dg/goacc/routine-module-1.f90: Likewise.
	* gfortran.dg/goacc/routine-module-2.f90: Likewise.
	* gfortran.dg/goacc/routine-module-3.f90: Likewise.
	* gfortran.dg/goacc/routine-module-mod-1.f90: Likewise.
	* gfortran.dg/goacc/routine-multiple-directives-1.f90: Likewise.
	* gfortran.dg/goacc/routine-multiple-directives-2.f90: Likewise.
	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/routine-nohost-1.c: New
	file.
	* testsuite/libgomp.oacc-c-c++-common/routine-nohost-2.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/routine-nohost-2_2.c:
	Likewise.
	* testsuite/libgomp.oacc-fortran/routine-nohost-1.f90: Likewise.

Co-Authored-By: Joseph Myers <joseph@codesourcery.com>
Co-Authored-By: Cesar Philippidis <cesar@codesourcery.com>
2021-07-21 23:58:11 +02:00
Jakub Jelinek
91c771ec8a openmp - Fix up && and || reductions [PR94366]
As the testcase shows, the special treatment of && and || reduction combiners
where we expand them as omp_out = (omp_out != 0) && (omp_in != 0) (or with ||)
is not needed just for &&/|| on floating point or complex types, but for all
&&/|| reductions - when expanded as omp_out = omp_out && omp_in (not in C but
GENERIC) it is actually gimplified into NOP_EXPRs to bool from both operands,
which turns non-zero values multiple of 2 into 0 rather than 1.

This patch just treats all &&/|| the same and furthermore uses bool type
instead of int for the comparisons.

2021-07-01  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/94366
gcc/
	* omp-low.c (lower_rec_input_clauses): Rename is_fp_and_or to
	is_truth_op, set it for TRUTH_*IF_EXPR regardless of new_var's type,
	use boolean_type_node instead of integer_type_node as NE_EXPR type.
	(lower_reduction_clauses): Likewise.
libgomp/
	* testsuite/libgomp.c-c++-common/pr94366.c: New test.
2021-07-01 08:55:49 +02:00
Tobias Burnus
33c4e46624 Add 'default' to -foffload=; document that flag [PR67300]
As -foffload={options,targets,targets=options} is very convoluted,
it has been split into -foffload=targets (supporting the old syntax
for backward compatibilty) and -foffload-options={options,target=options}.

Only the new syntax is documented.

Additionally, -foffload=default is supported, which can reset the
devices after -foffload=disable / -foffload=targets to the default,
if needed.

gcc/ChangeLog:

	PR other/67300
	* common.opt (-foffload=): Update description.
	(-foffload-options=): New.
	* doc/invoke.texi (C Language Options): Document
	-foffload and -foffload-options.
	* gcc.c (check_offload_target_name): New, split off from
	handle_foffload_option.
	(check_foffload_target_names): New.
	(handle_foffload_option): Handle -foffload=default.
	(driver_handle_option): Update for -foffload-options.
	* lto-opts.c (lto_write_options): Use -foffload-options
	instead of -foffload.
	* lto-wrapper.c (merge_and_complain, append_offload_options):
	Likewise.
	* opts.c (common_handle_option): Likewise.

libgomp/ChangeLog:

	PR other/67300
	* testsuite/libgomp.c-c++-common/reduction-16.c: Replace
	-foffload=nvptx-none= by -foffload-options=nvptx-none= to
	avoid disabling other offload targets.
	* testsuite/libgomp.c-c++-common/reduction-5.c: Likewise.
	* testsuite/libgomp.c-c++-common/reduction-6.c: Likewise.
	* testsuite/libgomp.c/target-44.c: Likewise.
2021-06-29 16:00:04 +02:00
Tobias Burnus
489c5dcf7b libgomp.fortran/defaultmap-8.f90: Fix non-shared memory handling
Disable some more parts of the test as firstprivate does not work yet
due to PR fortran/90742.

libgomp/
	* testsuite/libgomp.fortran/defaultmap-8.f90 (bar): Determine whether
	target has shared memory and disable some scalar pointer/allocatable
	checks if not as firstprivate does not work.
2021-06-29 15:50:23 +02:00
Chung-Lin Tang
e067201737 testsuite/101114: Adjust libgomp.c-c++-common/struct-elem-5.c testcase
The dg-shouldfail testcase libgomp.c-c++-common/struct-elem-5.c does not
properly fail for non-shared address space offloading. Adjust testcase
to limit testing only for "target offload_device_nonshared_as".

libgomp/ChangeLog:

	PR testsuite/101114
	* testsuite/libgomp.c-c++-common/struct-elem-5.c:
	Add "target offload_device_nonshared_as" condition for enabling test.
2021-06-26 00:46:11 +08:00
Jakub Jelinek
7619d33471 openmp: in_reduction clause support on target construct
This patch adds support for in_reduction clause on target construct, though
for now only for synchronous targets (without nowait clause).
The encountering thread in that case runs the target task and blocks until
the target region ends, so it is implemented by remapping it before entering
the target, initializing the private copy if not yet initialized for the
current thread and then using the remapped addresses for the mapping
addresses.
For nowait combined with in_reduction the patch contains a hack where the
nowait clause is ignored.  To implement it correctly, I think we would need
to create a new private variable for the in_reduction and initialize it before
doing the async target and adjust the map addresses to that private variable
and then pass a function pointer to the library routine with code where the callback
would remap the address to the current threads private variable and use in_reduction
combiner to combine the private variable we've created into the thread's copy.
The library would then need to make sure that the routine is called in some thread
participating in the parallel (and not in an unshackeled thread).

2021-06-24  Jakub Jelinek  <jakub@redhat.com>

gcc/
	* tree.h (OMP_CLAUSE_MAP_IN_REDUCTION): Document meaning for OpenMP.
	* gimplify.c (gimplify_scan_omp_clauses): For OpenMP map clauses
	with OMP_CLAUSE_MAP_IN_REDUCTION flag partially defer gimplification
	of non-decl OMP_CLAUSE_DECL.  For OMP_CLAUSE_IN_REDUCTION on
	OMP_TARGET user outer_ctx instead of ctx for placeholders and
	initializer/combiner gimplification.
	* omp-low.c (scan_sharing_clauses): Handle OMP_CLAUSE_MAP_IN_REDUCTION
	on target constructs.
	(lower_rec_input_clauses): Likewise.
	(lower_omp_target): Likewise.
	* omp-expand.c (expand_omp_target): Temporarily ignore nowait clause
	on target if in_reduction is present.
gcc/c-family/
	* c-common.h (enum c_omp_region_type): Add C_ORT_TARGET and
	C_ORT_OMP_TARGET.
	* c-omp.c (c_omp_split_clauses): For OMP_CLAUSE_IN_REDUCTION on
	combined target constructs also add map (always, tofrom:) clause.
gcc/c/
	* c-parser.c (omp_split_clauses): Pass C_ORT_OMP_TARGET instead of
	C_ORT_OMP for clauses on target construct.
	(OMP_TARGET_CLAUSE_MASK): Add in_reduction clause.
	(c_parser_omp_target): For non-combined target add
	map (always, tofrom:) clauses for OMP_CLAUSE_IN_REDUCTION.  Pass
	C_ORT_OMP_TARGET to c_finish_omp_clauses.
	* c-typeck.c (handle_omp_array_sections): Adjust ort handling
	for addition of C_ORT_OMP_TARGET and simplify, mapping clauses are
	never present on C_ORT_*DECLARE_SIMD.
	(c_finish_omp_clauses): Likewise.  Handle OMP_CLAUSE_IN_REDUCTION
	on C_ORT_OMP_TARGET, set OMP_CLAUSE_MAP_IN_REDUCTION on
	corresponding map clauses.
gcc/cp/
	* parser.c (cp_omp_split_clauses): Pass C_ORT_OMP_TARGET instead of
	C_ORT_OMP for clauses on target construct.
	(OMP_TARGET_CLAUSE_MASK): Add in_reduction clause.
	(cp_parser_omp_target): For non-combined target add
	map (always, tofrom:) clauses for OMP_CLAUSE_IN_REDUCTION.  Pass
	C_ORT_OMP_TARGET to finish_omp_clauses.
	* semantics.c (handle_omp_array_sections_1): Adjust ort handling
	for addition of C_ORT_OMP_TARGET and simplify, mapping clauses are
	never present on C_ORT_*DECLARE_SIMD.
	(handle_omp_array_sections): Likewise.
	(finish_omp_clauses): Likewise.  Handle OMP_CLAUSE_IN_REDUCTION
	on C_ORT_OMP_TARGET, set OMP_CLAUSE_MAP_IN_REDUCTION on
	corresponding map clauses.
	* pt.c (tsubst_expr): Pass C_ORT_OMP_TARGET instead of C_ORT_OMP for
	clauses on target construct.
gcc/testsuite/
	* c-c++-common/gomp/target-in-reduction-1.c: New test.
	* c-c++-common/gomp/clauses-1.c: Add in_reduction clauses on
	target or combined target constructs.
libgomp/
	* testsuite/libgomp.c-c++-common/target-in-reduction-1.c: New test.
	* testsuite/libgomp.c-c++-common/target-in-reduction-2.c: New test.
	* testsuite/libgomp.c++/target-in-reduction-1.C: New test.
	* testsuite/libgomp.c++/target-in-reduction-2.C: New test.
2021-06-24 11:35:08 +02:00
Jakub Jelinek
679506c383 openmp: Fix up *_reduction clause handling with UDRs on PARM_DECLs [PR101167]
The following testcase FAILs, because the UDR combiner is invoked incorrectly.
lower_omp_rec_clauses expects that when it sets
DECL_VALUE_EXPR/DECL_HAS_VALUE_EXPR_P
for both the placeholder and the var that everything will be properly
regimplified, but as the variable in question is a PARM_DECL rather than
VAR_DECL, lower_omp_regimplify_p doesn't say that it should be regimplified
and so it is not.

2021-06-23  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/101167
	* omp-low.c (lower_omp_regimplify_p): Regimplify also PARM_DECLs
	and RESULT_DECLs that have DECL_HAS_VALUE_EXPR_P set.

	* testsuite/libgomp.c-c++-common/task-reduction-15.c: New test.
2021-06-23 10:03:28 +02:00
Chung-Lin Tang
275c736e73 libgomp: Structure element mapping for OpenMP 5.0
This patch implement OpenMP 5.0 requirements of incrementing/decrementing
the reference count of a mapped structure at most once (across all elements)
on a construct.

This is implemented by pulling in libgomp/hashtab.h and using htab_t as a
pointer set. Structure element list siblings also have pointers-to-refcounts
linked together, to naturally achieve uniform increment/decrement without
repeating.

There are still some questions on whether using such a htab_t based set is
faster/slower than using a sorted pointer array based implementation. This
is to be researched on later.

libgomp/ChangeLog:

	* hashtab.h (htab_clear): New function with initialization code
	factored out from...
	(htab_create): ...here, adjust to use htab_clear function.

	* libgomp.h (REFCOUNT_SPECIAL): New symbol to denote range of
	special refcount values, add comments.
	(REFCOUNT_INFINITY): Adjust definition to use REFCOUNT_SPECIAL.
	(REFCOUNT_LINK): Likewise.
	(REFCOUNT_STRUCTELEM): New special refcount range for structure
	element siblings.
	(REFCOUNT_STRUCTELEM_P): Macro for testing for structure element
	sibling maps.
	(REFCOUNT_STRUCTELEM_FLAG_FIRST): Flag to indicate first sibling.
	(REFCOUNT_STRUCTELEM_FLAG_LAST):  Flag to indicate last sibling.
	(REFCOUNT_STRUCTELEM_FIRST_P): Macro to test _FIRST flag.
	(REFCOUNT_STRUCTELEM_LAST_P): Macro to test _LAST flag.
	(struct splay_tree_key_s): Add structelem_refcount and
	structelem_refcount_ptr fields into a union with dynamic_refcount.
	Add comments.
	(gomp_map_vars): Delete declaration.
	(gomp_map_vars_async): Likewise.
	(gomp_unmap_vars): Likewise.
	(gomp_unmap_vars_async): Likewise.
	(goacc_map_vars): New declaration.
	(goacc_unmap_vars): Likewise.

	* oacc-mem.c (acc_map_data): Adjust to use goacc_map_vars.
	(goacc_enter_datum): Likewise.
	(goacc_enter_data_internal): Likewise.
	* oacc-parallel.c (GOACC_parallel_keyed): Adjust to use goacc_map_vars
	and goacc_unmap_vars.
	(GOACC_data_start): Adjust to use goacc_map_vars.
	(GOACC_data_end): Adjust to use goacc_unmap_vars.

	* target.c (hash_entry_type): New typedef.
	(htab_alloc): New function hook for hashtab.h.
	(htab_free): Likewise.
	(htab_hash): Likewise.
	(htab_eq): Likewise.
	(hashtab.h): Add file include.
	(gomp_increment_refcount): New function.
	(gomp_decrement_refcount): Likewise.
	(gomp_map_vars_existing): Add refcount_set parameter, adjust to use
	gomp_increment_refcount.
	(gomp_map_fields_existing): Add refcount_set parameter, adjust calls
	to gomp_map_vars_existing.

	(gomp_map_vars_internal): Add refcount_set parameter, add local openmp_p
	variable to guard OpenMP specific paths, adjust calls to
	gomp_map_vars_existing, add structure element sibling splay_tree_key
	sequence creation code, adjust Fortran map case to avoid increment
	under OpenMP.
	(gomp_map_vars): Adjust to static, add refcount_set parameter, manage
	local refcount_set if caller passed in NULL, adjust call to
	gomp_map_vars_internal.
	(gomp_map_vars_async): Adjust and rename into...
	(goacc_map_vars): ...this new function, adjust call to
	gomp_map_vars_internal.

	(gomp_remove_splay_tree_key): New function with code factored out from
	gomp_remove_var_internal.
	(gomp_remove_var_internal): Add code to handle removing multiple
	splay_tree_key sequence for structure elements, adjust code to use
	gomp_remove_splay_tree_key for splay-tree key removal.
	(gomp_unmap_vars_internal): Add refcount_set parameter, adjust to use
	gomp_decrement_refcount.
	(gomp_unmap_vars): Adjust to static, add refcount_set parameter, manage
	local refcount_set if caller passed in NULL, adjust call to
	gomp_unmap_vars_internal.
	(gomp_unmap_vars_async): Adjust and rename into...
	(goacc_unmap_vars): ...this new function, adjust call to
	gomp_unmap_vars_internal.
	(GOMP_target): Manage refcount_set and adjust calls to gomp_map_vars and
	gomp_unmap_vars.
	(GOMP_target_ext): Likewise.
	(gomp_target_data_fallback): Adjust call to gomp_map_vars.
	(GOMP_target_data): Likewise.
	(GOMP_target_data_ext): Likewise.
	(GOMP_target_end_data): Adjust call to gomp_unmap_vars.
	(gomp_exit_data): Add refcount_set parameter, adjust to use
	gomp_decrement_refcount, adjust to queue splay-tree keys for removal
	after main loop.
	(GOMP_target_enter_exit_data): Manage refcount_set and adjust calls to
	gomp_map_vars and gomp_exit_data.
	(gomp_target_task_fn): Likewise.

	* testsuite/libgomp.c-c++-common/refcount-1.c: New testcase.
	* testsuite/libgomp.c-c++-common/struct-elem-1.c: New testcase.
	* testsuite/libgomp.c-c++-common/struct-elem-2.c: New testcase.
	* testsuite/libgomp.c-c++-common/struct-elem-3.c: New testcase.
	* testsuite/libgomp.c-c++-common/struct-elem-4.c: New testcase.
	* testsuite/libgomp.c-c++-common/struct-elem-5.c: New testcase.
2021-06-17 21:34:59 +08:00
Tobias Burnus
1de31913d2 Fortran/OpenMP: Extend defaultmap clause for OpenMP 5 [PR92568]
PR fortran/92568

gcc/fortran/ChangeLog:

	* dump-parse-tree.c (show_omp_clauses): Update for defaultmap.
	* f95-lang.c (LANG_HOOKS_OMP_ALLOCATABLE_P,
	LANG_HOOKS_OMP_SCALAR_TARGET_P): New.
	* gfortran.h (enum gfc_omp_defaultmap,
	enum gfc_omp_defaultmap_category): New.
	* openmp.c (gfc_match_omp_clauses): Update defaultmap matching.
	* trans-decl.c (gfc_finish_decl_attrs): Set GFC_DECL_SCALAR_TARGET.
	* trans-openmp.c (gfc_omp_allocatable_p, gfc_omp_scalar_target_p): New.
	(gfc_omp_scalar_p): Take 'ptr_alloc_ok' argument.
	(gfc_trans_omp_clauses, gfc_split_omp_clauses): Update for
	defaultmap changes.
	* trans.h (gfc_omp_scalar_p): Update prototype.
	(gfc_omp_allocatable_p, gfc_omp_scalar_target_p): New.
	(struct lang_decl): Add scalar_target.
	(GFC_DECL_SCALAR_TARGET, GFC_DECL_GET_SCALAR_TARGET): New.

gcc/ChangeLog:

	* gimplify.c (enum gimplify_defaultmap_kind): Add GDMK_SCALAR_TARGET.
	(struct gimplify_omp_ctx): Extend defaultmap array by one.
	(new_omp_context): Init defaultmap[GDMK_SCALAR_TARGET].
	(omp_notice_variable): Update type classification for Fortran.
	(gimplify_scan_omp_clauses): Update calls for new argument; handle
	GDMK_SCALAR_TARGET; for Fortran, GDMK_POINTER avoid GOVD_MAP_0LEN_ARRAY.
	* langhooks-def.h (lhd_omp_scalar_p): Add 'ptr_ok' argument.
	* langhooks.c (lhd_omp_scalar_p): Likewise.
	(LANG_HOOKS_OMP_ALLOCATABLE_P, LANG_HOOKS_OMP_SCALAR_TARGET_P): New.
	(LANG_HOOKS_DECLS): Add them.
	* langhooks.h (struct lang_hooks_for_decls): Add new hooks, update
	omp_scalar_p pointer type to include the new bool argument.

libgomp/ChangeLog:

	* testsuite/libgomp.fortran/defaultmap-8.f90: New test.

gcc/testsuite/ChangeLog:

	* gfortran.dg/gomp/pr99928-1.f90: Uncomment 'defaultmap(none)'.
	* gfortran.dg/gomp/pr99928-2.f90: Uncomment 'defaultmap(none)'.
	* gfortran.dg/gomp/pr99928-3.f90: Uncomment 'defaultmap(none)'.
	* gfortran.dg/gomp/pr99928-4.f90: Uncomment 'defaultmap(none)'.
	* gfortran.dg/gomp/pr99928-5.f90: Uncomment 'defaultmap(none)'.
	* gfortran.dg/gomp/pr99928-6.f90: Uncomment 'defaultmap(none)'.
	* gfortran.dg/gomp/pr99928-8.f90: Uncomment 'defaultmap(none)'.
	* gfortran.dg/gomp/defaultmap-1.f90: New test.
	* gfortran.dg/gomp/defaultmap-2.f90: New test.
	* gfortran.dg/gomp/defaultmap-3.f90: New test.
	* gfortran.dg/gomp/defaultmap-4.f90: New test.
	* gfortran.dg/gomp/defaultmap-5.f90: New test.
	* gfortran.dg/gomp/defaultmap-6.f90: New test.
	* gfortran.dg/gomp/defaultmap-7.f90: New test.
2021-06-15 16:07:11 +02:00
Jakub Jelinek
7d19a50ea1 testsuite: Fix up libgomp.fortran/pr100981-2.f90 testcase [PR100981]
The dsdotr and dsdoti variables uninitialized and the testcase fails e.g.
on i686-linux.  Fixed by zero initialization.

2021-06-10  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/100981
	* testsuite/libgomp.fortran/pr100981-2.f90 (cdcdot): Initialize
	dsdotr and dsdoti to 0.
2021-06-10 09:31:06 +02:00
H.J. Lu
c8d581bdf7 libgomp: Compile tests with -march=i486 only if needed
Don't add -march=i486 if atomic compare-and-swap is supported on 'int'.
This fixes libgomp tests with "-march=x86-64 -m32 -fcf-protection".

	* testsuite/lib/libgomp.exp (libgomp_init): Don't add -march=i486
	if atomic compare-and-swap is supported on 'int'.
2021-06-09 10:05:40 -07:00