2018-08-01 Tom de Vries <tdevries@suse.de>
* plugin/cuda-lib.def: New file. Factor out of ...
* plugin/plugin-nvptx.c (CUDA_CALLS): ... here.
(struct cuda_lib_s, init_cuda_lib): Include cuda-lib.def instead of
using CUDA_CALLS.
From-SVN: r263208
2018-07-31 Andre Vieira <andre.simoesdiasvieira@arm.com>
Revert 'AsyncI/O patch committed'
2018-07-25 Nicolas Koenig <koenigni@gcc.gnu.org>
Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/25829
* gfortran.texi: Add description of asynchronous I/O.
* trans-decl.c (gfc_finish_var_decl): Treat asynchronous variables
as volatile.
* trans-io.c (gfc_build_io_library_fndecls): Rename st_wait to
st_wait_async and change argument spec from ".X" to ".w".
(gfc_trans_wait): Pass ID argument via reference.
2018-07-31 Andre Vieira <andre.simoesdiasvieira@arm.com>
Revert 'AsyncI/O patch committed'
2018-07-25 Nicolas Koenig <koenigni@gcc.gnu.org>
Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/25829
* gfortran.dg/f2003_inquire_1.f03: Add write statement.
* gfortran.dg/f2003_io_1.f03: Add wait statement.
2018-07-31 Andre Vieira <andre.simoesdiasvieira@arm.com>
Revert 'AsyncI/O patch committed'
2018-07-25 Nicolas Koenig <koenigni@gcc.gnu.org>
Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/25829
* Makefile.am: Add async.c to gfor_io_src.
Add async.h to gfor_io_headers.
* Makefile.in: Regenerated.
* gfortran.map: Add _gfortran_st_wait_async.
* io/async.c: New file.
* io/async.h: New file.
* io/close.c: Include async.h.
(st_close): Call async_wait for an asynchronous unit.
* io/file_pos.c (st_backspace): Likewise.
(st_endfile): Likewise.
(st_rewind): Likewise.
(st_flush): Likewise.
* io/inquire.c: Add handling for asynchronous PENDING
and ID arguments.
* io/io.h (st_parameter_dt): Add async bit.
(st_parameter_wait): Correct.
(gfc_unit): Add au pointer.
(st_wait_async): Add prototype.
(transfer_array_inner): Likewise.
(st_write_done_worker): Likewise.
* io/open.c: Include async.h.
(new_unit): Initialize asynchronous unit.
* io/transfer.c (async_opt): New struct.
(wrap_scalar_transfer): New function.
(transfer_integer): Call wrap_scalar_transfer to do the work.
(transfer_real): Likewise.
(transfer_real_write): Likewise.
(transfer_character): Likewise.
(transfer_character_wide): Likewise.
(transfer_complex): Likewise.
(transfer_array_inner): New function.
(transfer_array): Call transfer_array_inner.
(transfer_derived): Call wrap_scalar_transfer.
(data_transfer_init): Check for asynchronous I/O.
Perform a wait operation on any pending asynchronous I/O
if the data transfer is synchronous. Copy PDT and enqueue
thread for data transfer.
(st_read_done_worker): New function.
(st_read_done): Enqueue transfer or call st_read_done_worker.
(st_write_done_worker): New function.
(st_write_done): Enqueue transfer or call st_read_done_worker.
(st_wait): Document as no-op for compatibility reasons.
(st_wait_async): New function.
* io/unit.c (insert_unit): Use macros LOCK, UNLOCK and TRYLOCK;
add NOTE where necessary.
(get_gfc_unit): Likewise.
(init_units): Likewise.
(close_unit_1): Likewise. Call async_close if asynchronous.
(close_unit): Use macros LOCK and UNLOCK.
(finish_last_advance_record): Likewise.
(newunit_alloc): Likewise.
* io/unix.c (find_file): Likewise.
(flush_all_units_1): Likewise.
(flush_all_units): Likewise.
* libgfortran.h (generate_error_common): Add prototype.
* runtime/error.c: Include io.h and async.h.
(generate_error_common): New function.
2018-07-31 Andre Vieira <andre.simoesdiasvieira@arm.com>
Revert 'AsyncI/O patch committed'.
2018-07-25 Nicolas Koenig <koenigni@gcc.gnu.org>
Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/25829
* testsuite/libgomp.fortran/async_io_1.f90: New test.
* testsuite/libgomp.fortran/async_io_2.f90: New test.
* testsuite/libgomp.fortran/async_io_3.f90: New test.
* testsuite/libgomp.fortran/async_io_4.f90: New test.
* testsuite/libgomp.fortran/async_io_5.f90: New test.
* testsuite/libgomp.fortran/async_io_6.f90: New test.
* testsuite/libgomp.fortran/async_io_7.f90: New test.
From-SVN: r263082
Currently parallel-loop-1.c fails at -O0 on a Quadro M1200, because one of the
kernel launch configurations exceeds the resources available in the device, due
to the default dimensions chosen by the runtime.
This patch fixes that by taking the per-function max_threads_per_block into
account when using the default dimensions.
2018-07-30 Tom de Vries <tdevries@suse.de>
* plugin/plugin-nvptx.c (MIN, MAX): Redefine.
(nvptx_exec): Ensure worker and vector default dims don't exceed
targ_fn->max_threads_per_block.
From-SVN: r263062
The default dimensions are calculated using per-device properties, but
initialized once and used on all devices.
This patch fixes this problem by introducing per-device default dimensions.
2018-07-30 Tom de Vries <tdevries@suse.de>
* plugin/plugin-nvptx.c (struct ptx_device): Add default_dims field.
(nvptx_open_device): Init default_dims for device.
(nvptx_exec): Use default_dims from device.
From-SVN: r263061
PR testsuite/86660
* testsuite/libgomp.c++/for-15.C (results): Include it in
omp declare target region.
(main): Use map (always, tofrom: results) instead of
map (tofrom: results).
From-SVN: r263011
PR middle-end/86660
* omp-low.c (scan_sharing_clauses): Don't ignore map clauses for
declare target to variables if they have always,{to,from,tofrom} map
kinds.
* testsuite/libgomp.c/pr86660.c: New test.
From-SVN: r263010
Currently, when a kernel is lauched with too many workers, it results in a cuda
launch failure. This is triggered f.i. for parallel-loop-1.c at -O0 on a Quadro
M1200.
This patch detects this situation, and errors out with a hint on how to fix it.
Build and reg-tested on x86_64 with nvptx accelerator.
2018-07-26 Cesar Philippidis <cesar@codesourcery.com>
Tom de Vries <tdevries@suse.de>
* plugin/plugin-nvptx.c (nvptx_exec): Error if the hardware doesn't have
sufficient resources to launch a kernel, and give a hint on how to fix
it.
Co-Authored-By: Tom de Vries <tdevries@suse.de>
From-SVN: r262997
Move sampling of device properties from nvptx_exec to nvptx_open, and assume
the sampling always succeeds. This simplifies the default dimension
initialization code in nvptx_open.
2018-07-26 Cesar Philippidis <cesar@codesourcery.com>
Tom de Vries <tdevries@suse.de>
* plugin/plugin-nvptx.c (struct ptx_device): Add warp_size,
max_threads_per_block and max_threads_per_multiprocessor fields.
(nvptx_open_device): Initialize new fields.
(nvptx_exec): Use num_sms, and new fields.
Co-Authored-By: Tom de Vries <tdevries@suse.de>
From-SVN: r262996
In testcase lib-12.f90, all acc_async_test calls are placed in a location
where they are not guaranteed to succeed, which explains why there's an xfail
for the lower optimization levels.
This patch fixes the problem by moving the acc_async_test calls to the correct
locations.
Reg-tested on x86_64 with nvptx accelerator.
2018-07-26 Tom de Vries <tdevries@suse.de>
* testsuite/libgomp.oacc-fortran/lib-12.f90: Move acc_async_test calls
to correct locations. Remove xfail.
From-SVN: r262990
The purpose of the lib-13.f90 test-case is to test acc_wait_all_async. The
test indeed calls acc_wait_all_async, but then subsequentlys calls
acc_wait_all, so the acc_wait_all_async functionality is not tested.
Furthermore, all acc_async_test calls are placed in a location where they are
not guaranteed to succeed, which explains why there's an xfail for the lower
optimization levels.
This patch fixes the problems by replacing acc_wait_all with an acc_wait on
the async id used for the acc_wait_all_async call, and moving the
acc_async_test calls to the correct locations.
Reg-tested on x86_64 with nvptx accelerator.
2018-07-26 Tom de Vries <tdevries@suse.de>
* testsuite/libgomp.oacc-fortran/lib-13.f90: Replace acc_wait_all with
acc_wait. Move acc_async_test calls to correct locations. Remove
xfail.
From-SVN: r262989
2018-07-25 Nicolas Koenig <koenigni@gcc.gnu.org>
Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/25829
* gfortran.texi: Add description of asynchronous I/O.
* trans-decl.c (gfc_finish_var_decl): Treat asynchronous variables
as volatile.
* trans-io.c (gfc_build_io_library_fndecls): Rename st_wait to
st_wait_async and change argument spec from ".X" to ".w".
(gfc_trans_wait): Pass ID argument via reference.
2018-07-25 Nicolas Koenig <koenigni@gcc.gnu.org>
Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/25829
* gfortran.dg/f2003_inquire_1.f03: Add write statement.
* gfortran.dg/f2003_io_1.f03: Add wait statement.
2018-07-25 Nicolas Koenig <koenigni@gcc.gnu.org>
Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/25829
* Makefile.am: Add async.c to gfor_io_src.
Add async.h to gfor_io_headers.
* Makefile.in: Regenerated.
* gfortran.map: Add _gfortran_st_wait_async.
* io/async.c: New file.
* io/async.h: New file.
* io/close.c: Include async.h.
(st_close): Call async_wait for an asynchronous unit.
* io/file_pos.c (st_backspace): Likewise.
(st_endfile): Likewise.
(st_rewind): Likewise.
(st_flush): Likewise.
* io/inquire.c: Add handling for asynchronous PENDING
and ID arguments.
* io/io.h (st_parameter_dt): Add async bit.
(st_parameter_wait): Correct.
(gfc_unit): Add au pointer.
(st_wait_async): Add prototype.
(transfer_array_inner): Likewise.
(st_write_done_worker): Likewise.
* io/open.c: Include async.h.
(new_unit): Initialize asynchronous unit.
* io/transfer.c (async_opt): New struct.
(wrap_scalar_transfer): New function.
(transfer_integer): Call wrap_scalar_transfer to do the work.
(transfer_real): Likewise.
(transfer_real_write): Likewise.
(transfer_character): Likewise.
(transfer_character_wide): Likewise.
(transfer_complex): Likewise.
(transfer_array_inner): New function.
(transfer_array): Call transfer_array_inner.
(transfer_derived): Call wrap_scalar_transfer.
(data_transfer_init): Check for asynchronous I/O.
Perform a wait operation on any pending asynchronous I/O
if the data transfer is synchronous. Copy PDT and enqueue
thread for data transfer.
(st_read_done_worker): New function.
(st_read_done): Enqueue transfer or call st_read_done_worker.
(st_write_done_worker): New function.
(st_write_done): Enqueue transfer or call st_read_done_worker.
(st_wait): Document as no-op for compatibility reasons.
(st_wait_async): New function.
* io/unit.c (insert_unit): Use macros LOCK, UNLOCK and TRYLOCK;
add NOTE where necessary.
(get_gfc_unit): Likewise.
(init_units): Likewise.
(close_unit_1): Likewise. Call async_close if asynchronous.
(close_unit): Use macros LOCK and UNLOCK.
(finish_last_advance_record): Likewise.
(newunit_alloc): Likewise.
* io/unix.c (find_file): Likewise.
(flush_all_units_1): Likewise.
(flush_all_units): Likewise.
* libgfortran.h (generate_error_common): Add prototype.
* runtime/error.c: Include io.h and async.h.
(generate_error_common): New function.
2018-07-25 Nicolas Koenig <koenigni@gcc.gnu.org>
Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/25829
* testsuite/libgomp.fortran/async_io_1.f90: New test.
* testsuite/libgomp.fortran/async_io_2.f90: New test.
* testsuite/libgomp.fortran/async_io_3.f90: New test.
* testsuite/libgomp.fortran/async_io_4.f90: New test.
* testsuite/libgomp.fortran/async_io_5.f90: New test.
* testsuite/libgomp.fortran/async_io_6.f90: New test.
* testsuite/libgomp.fortran/async_io_7.f90: New test.
Co-Authored-By: Thomas Koenig <tkoenig@gcc.gnu.org>
From-SVN: r262978
PR middle-end/86542
* omp-low.c (create_task_copyfn): Copy over also fields corresponding
to _looptemp_ clauses, other than the first two.
* testsuite/libgomp.c++/pr86542.C: New test.
From-SVN: r262815
PR middle-end/86539
* gimplify.c (gimplify_omp_for): Ensure taskloop firstprivatized init
and cond temporaries don't have reference type if iterator has
pointer type. For init use &for_pre_body instead of pre_p if
for_pre_body is non-empty.
* testsuite/libgomp.c++/pr86539.C: New test.
From-SVN: r262776
PR c++/86443
* gimplify.c (find_combined_omp_for): Add DATA argument, in addition
to finding the inner OMP_FOR/OMP_SIMD stmt find non-trivial wrappers,
BLOCKs with BLOCK_VARs, OMP_PARALLEL in between, OMP_FOR in between.
(gimplify_omp_for): For composite loops, move outer
OMP_{DISTRIBUTE,TASKLOOP,FOR,PARALLEL} right around innermost
OMP_FOR/OMP_SIMD if there are any non-trivial wrappers. For class
iterators add any needed clauses. Allow OMP_FOR_ORIG_DECLS to contain
TREE_LIST for both the original class iterator and the "last" helper
var. Gimplify OMP_FOR_PRE_BODY before the outermost composite
loop, remember has_decl_expr from outer composite loops for the
innermost OMP_SIMD in TREE_PRIVATE bit on OMP_FOR_INIT.
gcc/c-family/
* c-omp.c (c_omp_check_loop_iv_r, c_omp_check_loop_iv): Allow declv
to contain TREE_LIST for both the original class iterator and the
"last" helper var.
gcc/cp/
* semantics.c (handle_omp_for_class_iterator): Remove lastp argument,
instead of setting *lastp turn orig_declv elt into a TREE_LIST.
(finish_omp_for): Adjust handle_omp_for_class_iterator caller.
* pt.c (tsubst_omp_for_iterator): Allow OMP_FOR_ORIG_DECLS to contain
TREE_LIST for both the original class iterator and the "last" helper
var.
libgomp/
* testsuite/libgomp.c++/for-15.C: New test.
From-SVN: r262534
2018-05-09 Tom de Vries <tom@codesourcery.com>
PR libgomp/82901
* oacc-parallel.c (GOACC_declare): Use GOMP_ASYNC_SYNC as async argument
to GOACC_enter_exit_data.
From-SVN: r260085
2018-05-07 Tom de Vries <tom@codesourcery.com>
PR testsuite/85677
* testsuite/lib/libgomp.exp (libgomp_init): Move inclusion of top-level
include directory in ALWAYS_CFLAGS out of $blddir != "" condition.
From-SVN: r259992
2018-05-03 Tom de Vries <tom@codesourcery.com>
PR testsuite/85106
* lib/scanoffloadtree.exp: New file.
* testsuite/lib/libgomp-dg.exp (libgomp-dg-test): Add save-temps to
extra_tool_flags if it contains an -foffload=-fdump-* flag.
* testsuite/lib/libgomp.exp: Include scanoffloadtree.exp.
* testsuite/libgomp.oacc-c/vec.c: Use scan-offload-tree-dump.
* doc/sourcebuild.texi (Commands for use in dg-final, Scan optimization
dump files): Add offload-tree.
From-SVN: r259892
2018-05-02 Tom de Vries <tom@codesourcery.com>
PR testsuite/85106
* gcc.dg/ipa/ipa-icf-38.c: Use scan-ltrans-tree-dump.
* lib/scanltranstree.exp: New file.
* lib/target-supports.exp (scan-ltrans-tree-dump_required_options)
(scan-ltrans-tree-dump-times_required_options)
(scan-ltrans-tree-dump-not_required_options)
(scan-ltrans-tree-dump-dem_required_options)
(scan-ltrans-tree-dump-dem-not_required_options): New proc.
* lib/gcc-dg.exp: Include scanltranstree.exp.
* testsuite/lib/libatomic.exp: Include scanltranstree.exp.
* testsuite/lib/libgomp.exp: Include scanltranstree.exp.
* testsuite/lib/libitm.exp: Include scanltranstree.exp.
* testsuite/lib/libvtv.exp: Include scanltranstree.exp.
* doc/sourcebuild.texi (Commands for use in dg-final, Scan optimization
dump files): Add ltrans-tree.
From-SVN: r259838
2018-05-02 Tom de Vries <tom@codesourcery.com>
PR testsuite/85106
* gcc.dg/ipa/ipa-icf-38.c: New test.
* gcc.dg/ipa/ipa-icf-38a.c: New test.
* lib/scandump.exp (dump-base): New proc.
(scan-dump, scan-dump-times, scan-dump-not, scan-dump-dem)
(scan-dump-dem-not): Add and handle parameter for suffix of the dump
base.
* lib/scanipa.exp: Add "" argument to scan-dump calls.
* lib/scanlang.exp: Same.
* lib/scanrtl.exp: Same.
* lib/scantree.exp: Same.
* lib/scanwpaipa.exp: New file.
* lib/gcc-dg.exp: Include scanwpaipa.exp.
* testsuite/lib/libatomic.exp: Include scanwpaipa.exp.
* testsuite/lib/libgomp.exp: Include scanwpaipa.exp.
* testsuite/lib/libitm.exp: Include scanwpaipa.exp.
* testsuite/lib/libvtv.exp: Include scanwpaipa.exp.
* doc/sourcebuild.texi (Commands for use in dg-final, Scan optimization
dump files): Add wpa-ipa.
From-SVN: r259837
2018-04-29 Julian Brown <julian@codesourcery.com>
Tom de Vries <tom@codesourcery.com>
PR testsuite/85527
* testsuite/libgomp.oacc-c-c++-common/atomic_capture-1.c: Allow
arbitrary order for iterations of atomic subtract check.
Co-Authored-By: Tom de Vries <tom@codesourcery.com>
From-SVN: r259748
2018-04-28 Tom de Vries <tom@codesourcery.com>
PR testsuite/85527
* testsuite/libgomp.oacc-fortran/atomic_capture-1.f90 (main): Store
atomic capture results obtained in parallel loop to an array, instead of
to a scalar.
From-SVN: r259733
2018-04-26 Richard Biener <rguenther@suse.de>
Tom de Vries <tom@codesourcery.com>
PR lto/85422
* lto-streamer-out.c (output_function): Fixup loops if required to match
discovery done in the reader.
* testsuite/libgomp.oacc-c-c++-common/pr85422.c: New test.
Co-Authored-By: Tom de Vries <tom@codesourcery.com>
From-SVN: r259675
* config/cet.m4 (GCC_CET_FLAGS): Default to --disable-cet, replace
--enable-cet=default with --enable-cet=auto.
* doc/install.texi: Document --disable-cet being the default and
--enable-cet=auto.
* configure: Regenerated.
From-SVN: r259487
PR jit/85384
* acx.m4 (GCC_BASE_VER): Remove \$\$ from sed expression.
* configure.ac (gcc-driver-name.h): Honor --with-gcc-major-version
by using gcc_base_ver to generate a gcc_driver_version, and use
it when generating GCC_DRIVER_NAME.
* configure: Regenerate.
* configure: Regenerate.
From-SVN: r259462
2018-04-16 Cesar Philippidis <cesar@codesourcery.com>
Tom de Vries <tom@codesourcery.com>
PR middle-end/84955
* omp-expand.c (expand_oacc_for): Add dummy false branch for
tiled basic blocks without omp continue statements.
* testsuite/libgomp.oacc-c-c++-common/pr84955.c: New test.
* testsuite/libgomp.oacc-fortran/pr84955.f90: New test.
Co-Authored-By: Tom de Vries <tom@codesourcery.com>
From-SVN: r259406
2018-04-12 Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/83064
PR testsuite/85346
* trans-stmt.c (gfc_trans_forall_loop): Use annot_expr_ivdep_kind
for annotation and remove dependence on -ftree-parallelize-loops.
2018-04-12 Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/83064
PR testsuite/85346
* gfortran.dg/do_concurrent_5.f90: Dynamically allocate main work
array and move test to libgomp/testsuite/libgomp.fortran.
* gfortran.dg/do_concurrent_6.f90: New test.
2018-04-12 Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/83064
PR testsuite/85346
* testsuite/libgomp.fortran/do_concurrent_5.f90: Move modified
test from gfortran.dg to here.
From-SVN: r259359
2018-04-05 Tom de Vries <tom@codesourcery.com>
PR target/85204
* config/nvptx/nvptx.c (nvptx_single): Fix neutering of bb with only
cond jump.
* testsuite/libgomp.oacc-c-c++-common/broadcast-1.c: New test.
From-SVN: r259125
2018-03-26 Tom de Vries <tom@codesourcery.com>
PR tree-optimization/85063
* omp-general.c (offloading_function_p): New function. Factor out
of ...
* omp-offload.c (pass_omp_target_link::gate): ... here.
* omp-general.h (offloading_function_p): Declare.
* tree-switch-conversion.c (build_one_array): Mark CSWTCH.x variable
with attribute omp declare target for offloading functions.
* testsuite/libgomp.c/switch-conversion-2.c: New test.
* testsuite/libgomp.c/switch-conversion.c: New test.
* testsuite/libgomp.oacc-c-c++-common/switch-conversion-2.c: New test.
* testsuite/libgomp.oacc-c-c++-common/switch-conversion.c: New test.
From-SVN: r258852