Change test for CUDA callback context in nvptx_free() from using
GOMP_PLUGIN_acc_thread () into checking for CUDA_ERROR_NOT_PERMITTED,
for the former only works for OpenACC, but not OpenMP offloading.
2020-08-20 Chung-Lin Tang <cltang@codesourcery.com>
libgomp/
* plugin/plugin-nvptx.c (nvptx_free):
Change "GOMP_PLUGIN_acc_thread () == NULL" test into check of
CUDA_ERROR_NOT_PERMITTED status for cuMemGetAddressRange. Adjust
comments.
This adds support for __sync_val_compare_and_swap and
__sync_bool_compare_and_swap for 1-byte and 2-byte long
values, which are not natively supported on nvptx.
Build and reg-tested on nvptx.
Build and reg-tested libgomp on x86_64 with nvptx accelerator.
2020-07-16 Kwok Cheung Yeung <kcy@codesourcery.com>
libgcc/
* config/nvptx/atomic.c: New.
* config/nvptx/t-nvptx (LIB2ADD): Add atomic.c.
gcc/testsuite/
* gcc.target/nvptx/ia64-sync-5.c: New.
libgomp/
* testsuite/libgomp.c-c++-common/reduction-16.c: New.
2020-08-13 Jakub Jelinek <jakub@redhat.com>
* gimplify.c (gimplify_omp_taskloop_expr): New function.
(gimplify_omp_for): Use it. For OMP_FOR_NON_RECTANGULAR
loops adjust in outer taskloop the var-outer decls.
* omp-expand.c (expand_omp_taskloop_for_inner): Handle non-rectangular
loops.
(expand_omp_for): Don't reject non-rectangular taskloop.
* omp-general.c (omp_extract_for_data): Don't assert that
non-rectangular loops have static schedule, instead treat loop->m1
or loop->m2 as if loop->n1 or loop->n2 is non-constant.
* testsuite/libgomp.c/loop-22.c (main): Add some further tests.
* testsuite/libgomp.c/loop-23.c (main): Likewise.
* testsuite/libgomp.c/loop-24.c: New test.
If the walk_body on the various sequences of reduction, lastprivate and/or linear
clauses needs to create a temporary variable, we should declare that variable
in that sequence rather than outside, where it would need to be privatized inside of
the construct.
2020-08-08 Jakub Jelinek <jakub@redhat.com>
PR fortran/93553
* tree-nested.c (convert_nonlocal_omp_clauses): For
OMP_CLAUSE_REDUCTION, OMP_CLAUSE_LASTPRIVATE and OMP_CLAUSE_LINEAR
save info->new_local_var_chain around walks of the clause gimple
sequences and declare_vars if needed into the sequence.
2020-08-08 Tobias Burnus <tobias@codesourcery.com>
PR fortran/93553
* testsuite/libgomp.fortran/pr93553.f90: New test.
The number of loops computation and logical iteration -> actual iterator values
computations can now be done separately even on composite constructs (though
for triangular loops it would still be more efficient to propagate a few values
through, will handle that incrementally).
simd and taskloop are still unhandled.
2020-08-05 Jakub Jelinek <jakub@redhat.com>
* omp-expand.c (expand_omp_for): Don't disallow combined non-rectangular
loops.
* testsuite/libgomp.c/loop-22.c: New test.
* testsuite/libgomp.c/loop-23.c: New test.
As the new testcase shows, we weren't actually performing reductions on
host teams construct. And fixing that revealed a flaw in the for-14.c testcase.
The problem is that the tests perform also initialization and checking around the
calls to the functions with the OpenMP constructs. In that testcase, all the
tests have been spawned from a teams construct but only the tested loops were
distribute, which means the initialization and checking has been performed
redundantly and racily in each team. Fixed by performing the initialization
and checking outside of host teams and only do the calls to functions with
the tested constructs inside of host teams.
2020-08-05 Jakub Jelinek <jakub@redhat.com>
PR middle-end/96459
* omp-low.c (lower_omp_taskreg): Call lower_reduction_clauses even in
for host teams.
* testsuite/libgomp.c/teams-3.c: New test.
* testsuite/libgomp.c-c++-common/for-2.h (OMPTEAMS): Define to nothing
if not defined yet.
(N(test)): Use it before all N(f*) calls.
* testsuite/libgomp.c-c++-common/for-14.c (DO_PRAGMA, OMPTEAMS): Define.
(main): Don't call all test_* functions from within
#pragma omp teams reduction(|:err), call them directly.
With the pr96628-part1.f90 source and -ftree-slp-vectorize, we run into an
ICE due to the fact that V2DI mode is not handled in nvptx_gen_shuffle.
Fix this by adding handling of V2DI as well as V2SI mode in
nvptx_gen_shuffle.
Build and reg-tested on x86_64 with nvptx accelerator.
gcc/ChangeLog:
PR target/96428
* config/nvptx/nvptx.c (nvptx_gen_shuffle): Handle V2SI/V2DI.
libgomp/ChangeLog:
PR target/96428
* testsuite/libgomp.oacc-fortran/pr96628-part1.f90: New test.
* testsuite/libgomp.oacc-fortran/pr96628-part2.f90: New test.
Standalone attach and detach clauses should not create present/release
mappings for Fortran array descriptors (e.g. used when we have a pointer
to an array), both because it is unnecessary and because those mappings
will be incorrectly subject to reference counting. Simply omitting the
mappings means we just use GOMP_MAP_TO_PSET and GOMP_MAP_{ATTACH,DETACH}
mappings for array descriptors.
That requires a tweak in gimplify.c, since we may now see GOMP_MAP_TO_PSET
without a preceding data-movement mapping.
2020-08-03 Julian Brown <julian@codesourcery.com>
Thomas Schwinge <thomas@codesourcery.com>
gcc/fortran/
* trans-openmp.c (gfc_trans_omp_clauses): Don't create present/release
mappings for array descriptors.
gcc/
* gimplify.c (gimplify_omp_target_update): Allow GOMP_MAP_TO_PSET
without a preceding data-movement mapping.
gcc/testsuite/
* gfortran.dg/goacc/attach-descriptor.f90: Update pattern output. Add
scanning of gimplify dump.
libgomp/
* testsuite/libgomp.oacc-fortran/attach-descriptor-1.f90: Don't run for
shared-memory devices. Extend with further checking.
Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
This patch removes the generation of HSAIL from the compiler, the HSA
offloading plugin from libgomp and the associated testsuite tests and
infrastructure bits from the respective testsuites.
Apart from removal of the obvious files, I removed bits that I found
by searching for HSA related terms and by re-tracing my steps and
looking at the patches that introduced HSA in the first place. I did
not remove everything these patches brought in, for example:
- the mechanism to pass offload-target specific info from the application to
the offloading plugin - but the same mechanism is also used to
communicate number of teams and the thread limit to all offload targets.
- run_func hook in gomp_device_descr stays too, although now it is
not used. If some future offload target would like the ability to
refuse to offload some functions, it can use it. It is easy to
remove as a follow-up if it is considered clutter, though.
- configure options --with-hsa-runtime=PATH, -with-hsa-runtime-include=PATH
and --with-hsa-runtime-lib=PATH rmeain because GCN uses them too.
- Surprisingly, GOMP_TARGET_ARG_HSA_KERNEL_ATTRIBUTES (a constant
from gomp-constants.h) appears in the source of the amdgcn libgomp
plugin, although I tend to think that code path is not ever used
and this patch certainly removes it from the compiler.
Nevertheless, it seems it has potential value beyond HSAIL and so
I've kept it, it can of course always be easily removed in the
future of GCN folk abandon it too.
- I assume constants OFFLOAD_TARGET_TYPE_HSA and GOMP_DEVICE_HSA
need to stay indefinitely too just so that no future offload
target picks that number.
- I have kept dg-require-effective-target
offload_device_nonshared_as requirement of thests which have it.
It is quite probable I missed some small HSA artifacts but those
should be easy to remove later as we find them.
include/ChangeLog:
2020-07-24 Martin Jambor <mjambor@suse.cz>
* gomp-constants.h (GOMP_VERSION_HSA): Remove.
gcc/ChangeLog:
2020-07-24 Martin Jambor <mjambor@suse.cz>
* hsa-brig-format.h: Moved to brig/brigfrontend.
* hsa-brig.c: Removed.
* hsa-builtins.def: Likewise.
* hsa-common.c: Likewise.
* hsa-common.h: Likewise.
* hsa-dump.c: Likewise.
* hsa-gen.c: Likewise.
* hsa-regalloc.c: Likewise.
* ipa-hsa.c: Likewise.
* omp-grid.c: Likewise.
* omp-grid.h: Likewise.
* Makefile.in (BUILTINS_DEF): Remove hsa-builtins.def.
(OBJS): Remove hsa-common.o, hsa-gen.o, hsa-regalloc.o, hsa-brig.o,
hsa-dump.o, ipa-hsa.c and omp-grid.o.
(GTFILES): Removed hsa-common.c and omp-expand.c.
* builtins.def: Remove processing of hsa-builtins.def.
(DEF_HSA_BUILTIN): Remove.
* common.opt (flag_disable_hsa): Remove.
(-Whsa): Ignore.
* config.in (ENABLE_HSA): Removed.
* configure.ac: Removed handling configuration for hsa offloading.
(ENABLE_HSA): Removed.
* configure: Regenerated.
* doc/install.texi (--enable-offload-targets): Remove hsa from the
example.
(--with-hsa-runtime): Reword to reference any HSA run-time, not
specifically HSA offloading.
* doc/invoke.texi (Option Summary): Remove -Whsa.
(Warning Options): Likewise.
(Optimize Options): Remove hsa-gen-debug-stores.
* doc/passes.texi (Regular IPA passes): Remove section on IPA HSA
pass.
* gimple-low.c (lower_stmt): Remove GIMPLE_OMP_GRID_BODY case.
* gimple-pretty-print.c (dump_gimple_omp_for): Likewise.
(dump_gimple_omp_block): Likewise.
(pp_gimple_stmt_1): Likewise.
* gimple-walk.c (walk_gimple_stmt): Likewise.
* gimple.c (gimple_build_omp_grid_body): Removed function.
(gimple_copy): Remove GIMPLE_OMP_GRID_BODY case.
* gimple.def (GIMPLE_OMP_GRID_BODY): Removed.
* gimple.h (gf_mask): Removed GF_OMP_PARALLEL_GRID_PHONY,
OMP_FOR_KIND_GRID_LOOP, GF_OMP_FOR_GRID_PHONY,
GF_OMP_FOR_GRID_INTRA_GROUP, GF_OMP_FOR_GRID_GROUP_ITER and
GF_OMP_TEAMS_GRID_PHONY. Renumbered GF_OMP_FOR_KIND_SIMD and
GF_OMP_TEAMS_HOST.
(gimple_build_omp_grid_body): Removed declaration.
(gimple_has_substatements): Remove GIMPLE_OMP_GRID_BODY case.
(gimple_omp_for_grid_phony): Removed.
(gimple_omp_for_set_grid_phony): Likewise.
(gimple_omp_for_grid_intra_group): Likewise.
(gimple_omp_for_grid_intra_group): Likewise.
(gimple_omp_for_grid_group_iter): Likewise.
(gimple_omp_for_set_grid_group_iter): Likewise.
(gimple_omp_parallel_grid_phony): Likewise.
(gimple_omp_parallel_set_grid_phony): Likewise.
(gimple_omp_teams_grid_phony): Likewise.
(gimple_omp_teams_set_grid_phony): Likewise.
(CASE_GIMPLE_OMP): Remove GIMPLE_OMP_GRID_BODY case.
* lto-section-in.c (lto_section_name): Removed hsa.
* lto-streamer.h (lto_section_type): Removed LTO_section_ipa_hsa.
* lto-wrapper.c (compile_images_for_offload_targets): Remove special
handling of hsa.
* omp-expand.c: Do not include hsa-common.h and gt-omp-expand.h.
(parallel_needs_hsa_kernel_p): Removed.
(grid_launch_attributes_trees): Likewise.
(grid_launch_attributes_trees): Likewise.
(grid_create_kernel_launch_attr_types): Likewise.
(grid_insert_store_range_dim): Likewise.
(grid_get_kernel_launch_attributes): Likewise.
(get_target_arguments): Remove code passing HSA grid sizes.
(grid_expand_omp_for_loop): Remove.
(grid_arg_decl_map): Likewise.
(grid_remap_kernel_arg_accesses): Likewise.
(grid_expand_target_grid_body): Likewise.
(expand_omp): Remove call to grid_expand_target_grid_body.
(omp_make_gimple_edges): Remove GIMPLE_OMP_GRID_BODY case.
* omp-general.c: Do not include hsa-common.h.
(omp_maybe_offloaded): Do not check for HSA offloading.
(omp_context_selector_matches): Likewise.
* omp-low.c: Do not include hsa-common.h and omp-grid.h.
(build_outer_var_ref): Remove handling of GIMPLE_OMP_GRID_BODY.
(scan_sharing_clauses): Remove handling of OMP_CLAUSE__GRIDDIM_.
(scan_omp_parallel): Remove handling of the phoney variant.
(check_omp_nesting_restrictions): Remove handling of
GIMPLE_OMP_GRID_BODY and GF_OMP_FOR_KIND_GRID_LOOP.
(scan_omp_1_stmt): Remove handling of GIMPLE_OMP_GRID_BODY.
(lower_omp_for_lastprivate): Remove handling of gridified loops.
(lower_omp_for): Remove phony loop handling.
(lower_omp_taskreg): Remove phony construct handling.
(lower_omp_teams): Likewise.
(lower_omp_grid_body): Removed.
(lower_omp_1): Remove GIMPLE_OMP_GRID_BODY case.
(execute_lower_omp): Do not call omp_grid_gridify_all_targets.
* opts.c (common_handle_option): Do not handle hsa when processing
OPT_foffload_.
* params.opt (hsa-gen-debug-stores): Remove.
* passes.def: Remove pass_ipa_hsa and pass_gen_hsail.
* timevar.def: Remove TV_IPA_HSA.
* toplev.c: Do not include hsa-common.h.
(compile_file): Do not call hsa_output_brig.
* tree-core.h (enum omp_clause_code): Remove OMP_CLAUSE__GRIDDIM_.
(tree_omp_clause): Remove union field dimension.
* tree-nested.c (convert_nonlocal_omp_clauses): Remove the
OMP_CLAUSE__GRIDDIM_ case.
(convert_local_omp_clauses): Likewise.
* tree-pass.h (make_pass_gen_hsail): Remove declaration.
(make_pass_ipa_hsa): Likewise.
* tree-pretty-print.c (dump_omp_clause): Remove GIMPLE_OMP_GRID_BODY
case.
* tree.c (omp_clause_num_ops): Remove the element corresponding to
OMP_CLAUSE__GRIDDIM_.
(omp_clause_code_name): Likewise.
(walk_tree_1): Remove GIMPLE_OMP_GRID_BODY case.
* tree.h (OMP_CLAUSE__GRIDDIM__DIMENSION): Remove.
(OMP_CLAUSE__GRIDDIM__SIZE): Likewise.
(OMP_CLAUSE__GRIDDIM__GROUP): Likewise.
gcc/fortran/ChangeLog:
2020-07-24 Martin Jambor <mjambor@suse.cz>
* f95-lang.c (gfc_init_builtin_functions): Remove processing of
hsa-builtins.def.
gcc/brig/ChangeLog:
2020-07-24 Martin Jambor <mjambor@suse.cz>
* brigfrontend/brig-util.h (hsa_type_packed_p): Declared.
* brigfrontend/brig-util.cc (hsa_type_packed_p): Moved here from
removed gcc/hsa-common.c.
libgomp/ChangeLog:
2020-07-24 Martin Jambor <mjambor@suse.cz>
* plugin/Makefrag.am: Remove configuration of HSA plugin.
* aclocal.m4: Regenerated.
* Makefile.in: Regenerated.
* config.h.in: Regenerated.
* configure: Regenerated.
* plugin/configfrag.ac: Likewise.
* plugin/hsa_ext_finalize.h: Removed.
* plugin/plugin-hsa.c: Likewise.
* testsuite/Makefile.in: Regenerated.
* testsuite/lib/libgomp.exp
(offload_target_to_openacc_device_type): Remove hsa case.
(check_effective_target_hsa_offloading_selected_nocache): Removed
(check_effective_target_hsa_offloading_selected): Likewise.
(libgomp_init): Do not add -Wno-hsa to additional_flags.
* testsuite/libgomp.hsa.c/alloca-1.c: Removed test.
* testsuite/libgomp.hsa.c/bitfield-1.c: Likewise.
* testsuite/libgomp.hsa.c/bits-insns.c: Likewise.
* testsuite/libgomp.hsa.c/builtins-1.c: Likewise.
* testsuite/libgomp.hsa.c/c.exp: Likewise.
* testsuite/libgomp.hsa.c/complex-1.c: Likewise.
* testsuite/libgomp.hsa.c/complex-align-2.c: Likewise.
* testsuite/libgomp.hsa.c/formal-actual-args-1.c: Likewise.
* testsuite/libgomp.hsa.c/function-call-1.c: Likewise.
* testsuite/libgomp.hsa.c/get-level-1.c: Likewise.
* testsuite/libgomp.hsa.c/gridify-1.c: Likewise.
* testsuite/libgomp.hsa.c/gridify-2.c: Likewise.
* testsuite/libgomp.hsa.c/gridify-3.c: Likewise.
* testsuite/libgomp.hsa.c/gridify-4.c: Likewise.
* testsuite/libgomp.hsa.c/memory-operations-1.c: Likewise.
* testsuite/libgomp.hsa.c/pr69568.c: Likewise.
* testsuite/libgomp.hsa.c/pr82416.c: Likewise.
* testsuite/libgomp.hsa.c/rotate-1.c: Likewise.
* testsuite/libgomp.hsa.c/staticvar.c: Likewise.
* testsuite/libgomp.hsa.c/switch-1.c: Likewise.
* testsuite/libgomp.hsa.c/switch-branch-1.c: Likewise.
* testsuite/libgomp.hsa.c/switch-sbr-2.c: Likewise.
* testsuite/libgomp.hsa.c/tiling-1.c: Likewise.
* testsuite/libgomp.hsa.c/tiling-2.c: Likewise.
gcc/testsuite/ChangeLog:
2020-07-24 Martin Jambor <mjambor@suse.cz>
* lib/target-supports.exp (check_effective_target_offload_hsa):
Removed.
* c-c++-common/gomp/gridify-1.c: Removed test.
* c-c++-common/gomp/gridify-2.c: Likewise.
* c-c++-common/gomp/gridify-3.c: Likewise.
* c-c++-common/gomp/hsa-indirect-call-1.c: Likewise.
* gfortran.dg/gomp/gridify-1.f90: Likewise.
* gcc.dg/gomp/gomp.exp: Do not pass -Wno-hsa to tests.
* g++.dg/gomp/gomp.exp: Likewise.
* gfortran.dg/gomp/gomp.exp: Likewise.
Attach and detach operations are not supposed to affect structural or
dynamic reference counts for OpenACC. Previously they did so, which led to
subtle problems in some circumstances. We can avoid reference-counting
attach/detach operations by extending and slightly repurposing the
do_detach field in target_var_desc. It is now called is_attach to better
reflect its new role.
2020-07-27 Julian Brown <julian@codesourcery.com>
Thomas Schwinge <thomas@codesourcery.com>
libgomp/
* libgomp.h (struct target_var_desc): Rename do_detach field to
is_attach.
* oacc-mem.c (goacc_exit_datum_1): Add assert. Don't set finalize for
GOMP_MAP_FORCE_DETACH. Update checking to use is_attach field.
(goacc_enter_data_internal): Don't affect reference counts
for attach mappings.
(goacc_exit_data_internal): Don't affect reference counts for detach
mappings.
* target.c (gomp_map_vars_existing): Don't affect reference counts for
attach mappings.
(gomp_map_vars_internal): Set renamed is_attach flag unconditionally to
mark attach mappings.
(gomp_unmap_vars_internal): Use is_attach flag to prevent affecting
reference count for attach mappings.
* testsuite/libgomp.oacc-c-c++-common/mdc-refcount-1.c: New test.
* testsuite/libgomp.oacc-c-c++-common/mdc-refcount-2.c: New test.
* testsuite/libgomp.oacc-c-c++-common/mdc-refcount-2.c: New test.
* testsuite/libgomp.oacc-fortran/deep-copy-6-no_finalize.F90: Mark
test as shouldfail.
* testsuite/libgomp.oacc-fortran/deep-copy-6.f90: Adjust to fail
gracefully in no-finalize mode.
Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
..., so that we don't leak this into '*.exp' files running later.
This is relevant after commit efc16503ca "handle
dumpbase in offloading, adjust testsuite" -- I was confused why in a
(simplified) testing sequence as follows:
default 'libgomp.c/c.exp'
default 'libgomp.oacc-c/c.exp'
'-m32' 'libgomp.c/c.exp'
'-m32' 'libgomp.oacc-c/c.exp'
..., the "'-m32' 'libgomp.c/c.exp'" variant would not execute any offloading
dump scanning. The reason is that the "default 'libgomp.oacc-c/c.exp'" variant
ends with 'offload_target=disable' set, so that's what the "'-m32'
'libgomp.c/c.exp'" variant would then see, in particular
'gcc/testsuite/lib/scanoffload.exp:scoff'.
libgomp/
* testsuite/libgomp.oacc-c++/c++.exp: Unset 'offload_target' after
use.
* testsuite/libgomp.oacc-c/c.exp: Likewise.
* testsuite/libgomp.oacc-fortran/fortran.exp: Likewise.
The call to gomp_detach_pointer in gomp_unmap_vars_internal does not
need to force finalization, and doing so may mask mismatched pointer
attachments/detachments. This patch removes the forcing.
2020-07-16 Julian Brown <julian@codesourcery.com>
Thomas Schwinge <thomas@codesourcery.com>
libgomp/
* target.c (gomp_unmap_vars_internal): Remove unnecessary forcing of
finalization for detach operation.
* testsuite/libgomp.oacc-c-c++-common/structured-detach-underflow.c:
New test.
Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
libgomp/ChangeLog:
* testsuite/libgomp.c-c++-common/critical-hint-1.c: New; moved from
gcc/testsuite/c-c++-common/gomp/.
* testsuite/libgomp.c-c++-common/critical-hint-2.c: Likewise.
* testsuite/libgomp.fortran/critical-hint-1.f90: New; moved
from gcc/testsuite/gfortran.dg/gomp/.
* testsuite/libgomp.fortran/critical-hint-2.f90: Likewise.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/critical-hint-1.c: Moved to libgomp/.
* c-c++-common/gomp/critical-hint-2.c: Moved to libgomp/.
* gfortran.dg/gomp/critical-hint-1.f90: Moved to libgomp/.
* gfortran.dg/gomp/critical-hint-2.f90: Moved to libgomp/.
gcc/c-family/ChangeLog:
* c-omp.c (c_finish_omp_critical): Check for no name but
nonzero hint provided.
gcc/c/ChangeLog:
* c-parser.c (c_parser_omp_clause_hint): Require nonnegative hint clause.
(c_parser_omp_critical): Permit hint(0) clause without named critical.
(c_parser_omp_construct): Don't assert if error_mark_node is returned.
gcc/cp/ChangeLog:
* parser.c (cp_parser_omp_clause_hint): Require nonnegative hint.
(cp_parser_omp_critical): Permit hint(0) clause without named critical.
* pt.c (tsubst_expr): Re-check the latter for templates.
gcc/fortran/ChangeLog:
* openmp.c (gfc_match_omp_critical): Fix handling hints; permit
hint clause without named critical.
(resolve_omp_clauses): Require nonnegative constant integer
for the hint clause.
(gfc_resolve_omp_directive): Check for no name but
nonzero value for hint clause.
* parse.c (parse_omp_structured_block): Fix same-name check
for critical.
* trans-openmp.c (gfc_trans_omp_critical): Handle hint clause properly.
libgomp/ChangeLog:
* omp_lib.f90.in: Add omp_sync_hint_* and omp_sync_hint_kind.
* omp_lib.h.in: Likewise.
gcc/testsuite/ChangeLog:
* g++.dg/gomp/critical-3.C: Add nameless critical with hint testcase.
* c-c++-common/gomp/critical-hint-1.c: New test.
* c-c++-common/gomp/critical-hint-2.c: New test.
* gfortran.dg/gomp/critical-hint-1.f90: New test.
* gfortran.dg/gomp/critical-hint-2.f90: New test.
Define ASM_OUTPUT_ALIGNED_DECL_LOCAL for large local common symbol.
gcc/
PR target/95620
* config/i386/x86-64.h (ASM_OUTPUT_ALIGNED_DECL_LOCAL): New.
libgomp/
PR target/95620
* testsuite/libgomp.c/pr95620.c: New test.
This patch makes it so that an "attach" operation for a Fortran pointer
with an array descriptor copies that array descriptor to the target,
and similarly that detach operations release the array descriptor.
2020-07-16 Julian Brown <julian@codesourcery.com>
Thomas Schwinge <thomas@codesourcery.com>
gcc/fortran/
* trans-openmp.c (gfc_trans_omp_clauses): Rework OpenACC
attach/detach handling for arrays with descriptors.
gcc/testsuite/
* gfortran.dg/goacc/attach-descriptor.f90: New test.
libgomp/
* testsuite/libgomp.oacc-fortran/attach-descriptor-1.f90: New test.
* testsuite/libgomp.oacc-fortran/attach-descriptor-2.f90: New test.
Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
libgomp/ChangeLog:
* testsuite/libgomp.fortran/alloc-1.F90: Use c_size_t to
avoid conversion on 32bit systems from 32bit to 64bit due
to -fdefault-integer-8.
As the Fortran PR 95837 has been fixed, the test could be be added.
libgomp/ChangeLog:
* testsuite/libgomp.fortran/struct-elem-map-1.f90: Remove unused
variables; add character(kind=4) tests; update TODO comment.
The version of nvprof in CUDA 9.0 causes a hang when used to profile an
OpenACC program. This is because it calls acc_get_device_type from
a callback called during device initialization, which then attempts
to acquire acc_device_lock while it is already taken, resulting in
deadlock. This works around the issue by returning acc_device_none
from acc_get_device_type without attempting to acquire the lock when
initialization has not completed yet.
2020-07-14 Tom de Vries <tom@codesourcery.com>
Cesar Philippidis <cesar@codesourcery.com>
Thomas Schwinge <thomas@codesourcery.com>
Kwok Cheung Yeung <kcy@codesourcery.com>
libgomp/
* oacc-init.c (acc_init_state_lock, acc_init_state, acc_init_thread):
New variable.
(acc_init_1): Set acc_init_thread to pthread_self (). Set
acc_init_state to initializing at the start, and to initialized at the
end.
(self_initializing_p): New function.
(acc_get_device_type): Return acc_device_none if called by thread that
is currently executing acc_init_1.
* libgomp.texi (acc_get_device_type): Update documentation.
(Implementation Status and Implementation-Defined Behavior): Likewise.
* testsuite/libgomp.oacc-c-c++-common/acc_prof-init-2.c: New.
The FAT libraries config fragments need to know which library is native
and which is a multilib to choose the correct multilib from which to
append the additional object file or shared object file. Testing the
top-level archive is fragile because it will fail if rebuilding. This
patch tests the compiler preprocessing macros for the 64 bit AIX specific
__64BIT__ to determine the native mode of the compiler in MULTILIBTOP.
2020-07-14 David Edelsohn <dje.gcc@gmail.com>
libatomic/ChangeLog
* config/t-aix: Set BITS from compiler cpp macro.
libgcc/ChangeLog
* config/rs6000/t-slibgcc-aix: Set BITS from compiler cpp macro.
libgfortran/ChangeLog
* config/t-aix: Set BITS from compiler cpp macro.
libgomp/ChangeLog
* config/t-aix: Set BITS from compiler cpp macro.
libstdc++-v3/ChangeLog
* config/os/aix/t-aix: Set BITS from compiler cpp macro.
gcc/fortran/ChangeLog:
PR fortran/67311
* trans-openmp.c (gfc_has_alloc_comps): Return false also for
pointers to arrays.
libgomp/ChangeLog:
PR fortran/67311
* testsuite/libgomp.fortran/target-map-1.f90: New test.
In loops like:
#pragma omp parallel for collapse(2)
for (i = -4; i < 8; i++)
for (j = 3 * i; j > 2 * i; j--)
for some outer loop iterations there are no inner loop iterations at all,
the condition is false. In order to use Summæ Potestate to count number
of iterations or to transform the logical iteration number to actual
iterator values using quadratic non-equation root discovery the outer
iterator range needs to be adjusted, such that the inner loop has at least
one iteration for each of the outer loop iterator value in the reduced
range. Sometimes this adjustment is done at the start of the range,
at other times at the end.
This patch implements it during the compile time number of loop computation
(if all expressions are compile time constants).
2020-07-14 Jakub Jelinek <jakub@redhat.com>
* omp-general.h (struct omp_for_data): Add adjn1 member.
* omp-general.c (omp_extract_for_data): For non-rect loop, punt on
count computing if n1, n2 or step are not INTEGER_CST earlier.
Narrow the outer iterator range if needed so that non-rect loop
has at least one iteration for each outer range iteration. Compute
adjn1.
* omp-expand.c (expand_omp_for_init_vars): Use adjn1 if non-NULL
instead of the outer loop's n1.
* testsuite/libgomp.c/loop-21.c: New test.
OpenACC 2.6 specifies that the array descriptor (when present) must be
copied to the target before attaching pointers in Fortran. This patch
reverses the stripping of GOMP_MAP_TO_PSET and GOMP_MAP_POINTER that
was introduced by the "OpenACC reference count overhaul" patch.
2020-07-10 Julian Brown <julian@codesourcery.com>
Thomas Schwinge <thomas@codesourcery.com>
gcc/
* gimplify.c (gimplify_scan_omp_clauses): Do not strip
GOMP_MAP_TO_PSET/GOMP_MAP_POINTER for OpenACC enter/exit data
directives (see also PR92929).
gcc/testsuite/
* gfortran.dg/goacc/finalize-1.f: Update expected dump output.
libgomp/
* testsuite/libgomp.oacc-fortran/dynamic-pointer-1.f90: New test.
Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
This patch adjusts how dynamic reference counts work so that they match
the semantics of the source program more closely, instead of representing
"excess" reference counts beyond those that represent pointers in the
internal libgomp splay-tree data structure. This allows some corner
cases to be handled more gracefully.
2020-07-10 Julian Brown <julian@codesourcery.com>
Thomas Schwinge <thomas@codesourcery.com>
libgomp/
* libgomp.h (struct splay_tree_key_s): Change virtual_refcount to
dynamic_refcount.
(struct gomp_device_descr): Remove GOMP_MAP_VARS_OPENACC_ENTER_DATA.
* oacc-mem.c (acc_map_data): Substitute virtual_refcount for
dynamic_refcount.
(acc_unmap_data): Update comment.
(goacc_map_var_existing, goacc_enter_datum): Adjust for
dynamic_refcount semantics.
(goacc_exit_datum_1, goacc_exit_datum): Re-add some error checking.
Adjust for dynamic_refcount semantics.
(goacc_enter_data_internal): Implement "present" case of dynamic
memory-map handling here. Update "non-present" case for
dynamic_refcount semantics.
(goacc_exit_data_internal): Use goacc_exit_datum_1.
* target.c (gomp_map_vars_internal): Remove
GOMP_MAP_VARS_OPENACC_ENTER_DATA handling. Update for dynamic_refcount
handling.
(gomp_unmap_vars_internal): Remove virtual_refcount handling.
(gomp_load_image_to_device): Substitute dynamic_refcount for
virtual_refcount.
* testsuite/libgomp.oacc-c-c++-common/pr92843-1.c: Remove XFAILs.
* testsuite/libgomp.oacc-c-c++-common/refcounting-1.c: New test.
* testsuite/libgomp.oacc-c-c++-common/refcounting-2.c: New test.
* testsuite/libgomp.oacc-c-c++-common/struct-3-1-1.c: New test.
* testsuite/libgomp.oacc-fortran/deep-copy-6.f90: Remove XFAILs and
trace output.
* testsuite/libgomp.oacc-fortran/deep-copy-6-no_finalize.F90: Remove
trace output.
* testsuite/libgomp.oacc-fortran/dynamic-incr-structural-1.f90: New
test.
* testsuite/libgomp.oacc-c-c++-common/structured-dynamic-lifetimes-4.c:
Remove stale comment.
* testsuite/libgomp.oacc-fortran/mdc-refcount-1-1-1.f90: Remove XFAILs.
* testsuite/libgomp.oacc-fortran/mdc-refcount-1-1-2.F90: Likewise.
* testsuite/libgomp.oacc-fortran/mdc-refcount-1-2-1.f90: Likewise.
* testsuite/libgomp.oacc-fortran/mdc-refcount-1-2-2.f90: Likewise.
* testsuite/libgomp.oacc-fortran/mdc-refcount-1-3-1.f90: Likewise.
* testsuite/libgomp.oacc-fortran/mdc-refcount-1-4-1.f90: Adjust XFAIL.
Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
This patch factors out the parts of goacc_enter_datum and
goacc_exit_datum that can be shared with goacc_enter_data_internal
and goacc_exit_data_internal respectively (in the next patch),
without overloading function return values or complicating code paths
unnecessarily.
2020-07-10 Julian Brown <julian@codesourcery.com>
Thomas Schwinge <thomas@codesourcery.com>
libgomp/
* oacc-mem.c (goacc_map_var_existing): New function.
(goacc_enter_datum): Use above function.
(goacc_exit_datum_1): New function.
(goacc_exit_datum): Use above function.
Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
This is a fix for the pointer (or array) size inadvertently being used
for the bias with attach and detach mapping kinds, for both C and C++.
2020-07-09 Julian Brown <julian@codesourcery.com>
Thomas Schwinge <thomas@codesourcery.com>
gcc/c/
PR middle-end/95270
* c-typeck.c (c_finish_omp_clauses): Set OMP_CLAUSE_SIZE (bias) to zero
for standalone attach/detach clauses.
gcc/cp/
PR middle-end/95270
* semantics.c (finish_omp_clauses): Likewise.
include/
PR middle-end/95270
* gomp-constants.h (gomp_map_kind): Expand comment for attach/detach
mapping kinds.
gcc/testsuite/
PR middle-end/95270
* c-c++-common/goacc/mdc-1.c: Update expected dump output for zero
bias.
libgomp/
PR middle-end/95270
* testsuite/libgomp.oacc-c-c++-common/pr95270-1.c: New test.
* testsuite/libgomp.oacc-c-c++-common/pr95270-2.c: New test.
Arrange for GOMP_MAP_ATTACH to be grouped together with a preceding
GOMP_MAP_TO_PSET or other "to" data movement clause, except in cases
where an explicit "attach" clause is used.
2020-07-09 Julian Brown <julian@codesourcery.com>
include/
* gomp-constants.h (gomp_map_kind): Update comment for GOMP_MAP_TO_PSET.
libgomp/
* oacc-mem.c (find_group_last): Group data-movement clauses
(GOMP_MAP_TO_PSET, GOMP_MAP_TO, etc.) together with a subsequent
GOMP_MAP_ATTACH. Allow standalone GOMP_MAP_ATTACH also.
This patch implements the optimized logical to actual iterators
computation for triangular loops.
I have a rough implementation using integers, but this one uses floating
point. There is a small problem that -fopenmp programs aren't linked with
-lm, so it does it only if the hw has sqrt optab (and uses ifn rather than
__builtin_sqrt because it obviously doesn't need errno handling etc.).
Do you think it is ok this way, or should I use the integral computation
using inlined isqrt (we have inequation of the form
start >= x * t10 + t11 * (((x - 1) * x) / 2)
where t10 and t11 are signed long long values and start unsigned long long,
and the division by 2 actually is a problem for accuracy in some cases, so
if we do it in integral, we need to do actually
long long t12 = 2 * t10 - t11;
unsigned long long t13 = t12 * t12 + start * 8 * t11;
unsigned long long isqrt_ = isqrtull (t13);
long long x = (((long long) isqrt_ - t12) / t11) >> 1;
with careful overflow checking on all the computations before isqrtull
(and on overflows use the fallback implementation).
2020-07-09 Jakub Jelinek <jakub@redhat.com>
* omp-general.h (struct omp_for_data): Add min_inner_iterations
and factor members.
* omp-general.c (omp_extract_for_data): Initialize them and remember
them in OMP_CLAUSE_COLLAPSE_COUNT if needed and restore from there.
* omp-expand.c (expand_omp_for_init_counts): Fix up computation of
counts[fd->last_nonrect] if fd->loop.n2 is INTEGER_CST.
(expand_omp_for_init_vars): For
fd->first_nonrect + 1 == fd->last_nonrect loops with for now
INTEGER_CST fd->loop.n2 find quadratic equation roots instead of
using fallback method when possible.
* testsuite/libgomp.c/loop-19.c: New test.
* testsuite/libgomp.c/loop-20.c: New test.
While this is an OpenMP 5.1 change, it is undesirable to let people use different
values and then deal with ABI backwards compatibility in a year or two.
2020-07-09 Jakub Jelinek <jakub@redhat.com>
* omp.h.in (omp_alloctrait_value_t): Change omp_atv_default from
2 to -1. Add omp_atv_serialized and define omp_atv_sequential using
it. Remove __omp_alloctrait_value_max__.
* allocator.c (omp_init_allocator): Handle omp_atv_default for
omp_atk_alignment and omp_atk_pool_size.