OpenE2K/gcc - gcc - Expired Mentality Git

Author	SHA1	Message	Date
Tobias Burnus	33c4e46624	Add 'default' to -foffload=; document that flag [PR67300] As -foffload={options,targets,targets=options} is very convoluted, it has been split into -foffload=targets (supporting the old syntax for backward compatibilty) and -foffload-options={options,target=options}. Only the new syntax is documented. Additionally, -foffload=default is supported, which can reset the devices after -foffload=disable / -foffload=targets to the default, if needed. gcc/ChangeLog: PR other/67300 * common.opt (-foffload=): Update description. (-foffload-options=): New. * doc/invoke.texi (C Language Options): Document -foffload and -foffload-options. * gcc.c (check_offload_target_name): New, split off from handle_foffload_option. (check_foffload_target_names): New. (handle_foffload_option): Handle -foffload=default. (driver_handle_option): Update for -foffload-options. * lto-opts.c (lto_write_options): Use -foffload-options instead of -foffload. * lto-wrapper.c (merge_and_complain, append_offload_options): Likewise. * opts.c (common_handle_option): Likewise. libgomp/ChangeLog: PR other/67300 * testsuite/libgomp.c-c++-common/reduction-16.c: Replace -foffload=nvptx-none= by -foffload-options=nvptx-none= to avoid disabling other offload targets. * testsuite/libgomp.c-c++-common/reduction-5.c: Likewise. * testsuite/libgomp.c-c++-common/reduction-6.c: Likewise. * testsuite/libgomp.c/target-44.c: Likewise.	2021-06-29 16:00:04 +02:00
Thomas Schwinge	abf937ac00	'libgomp.c/target-44.c': Restrict '-latomic' to nvptx offloading compilation Fix-up for recent commit `f87990a2a8` "[openmp, simt] Disable SIMT for user-defined reduction"; see commit `d42088e453` "Avoid -latomic for amdgcn offloading". libgomp/ * testsuite/libgomp.c/target-44.c: Restrict '-latomic' to nvptx offloading compilation.	2021-05-18 12:57:35 +02:00
Martin Liska	810afb0b5f	testsuite: prune new LTO warning libgomp/ChangeLog: PR testsuite/100569 * testsuite/libgomp.c/omp-nested-3.c: Prune new LTO warning. * testsuite/libgomp.c/pr46032-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/data-clauses-kernels-ipa-pta.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/data-clauses-parallel-ipa-pta.c: Likewise. gcc/testsuite/ChangeLog: PR testsuite/100569 * gcc.dg/atomic/c11-atomic-exec-2.c: Prune new LTO warning. * gcc.dg/torture/pr94947-1.c: Likewise.	2021-05-13 09:24:23 +02:00
Jakub Jelinek	98acbb3111	openmp: Fix up taskloop reduction ICE if taskloop has no iterations [PR100471] When a taskloop doesn't have any iterations, GOMP_taskloop* takes an early return, doesn't create any tasks and more importantly, doesn't create a taskgroup and doesn't register task reductions. But, the code emitted in the callers assumes task reductions have been registered and performs the reduction handling and task reduction unregistration. The pointer to the task reduction private variables is reused, on input it is the alignment and only on output it is the pointer, so in the case taskloop with no iterations the caller attempts to dereference the alignment value as if it was a pointer and crashes. We could in the early returns register the task reductions only to have them looped over and unregistered in the caller, but I think it is better to tell the caller there is nothing to task reduce and bypass all that. 2021-05-11 Jakub Jelinek <jakub@redhat.com> PR middle-end/100471 * omp-low.c (lower_omp_task_reductions): For OMP_TASKLOOP, if data is 0, bypass the reduction loop including GOMP_taskgroup_reduction_unregister call. * taskloop.c (GOMP_taskloop): If GOMP_TASK_FLAG_REDUCTION and not GOMP_TASK_FLAG_NOGROUP, when doing early return clear the task reduction pointer. * testsuite/libgomp.c/task-reduction-4.c: New test.	2021-05-11 09:07:47 +02:00
Tom de Vries	f87990a2a8	[openmp, simt] Disable SIMT for user-defined reduction The test-case included in this patch contains this target region: ... for (int i0 = 0 ; i0 < N0 ; i0++ ) counter_N0.i += 1; ... When running with nvptx accelerator, the counter variable is expected to be N0 after the region, but instead is N0 / 32. The problem is that rather than getting the result for all warp lanes, we get it for just one lane. This is caused by the implementation of SIMT being incomplete. It handles regular reductions, but appearantly not user-defined reductions. For now, handle this by disabling SIMT in this case, specifically by setting sctx->max_vf to 1. Tested libgomp on x86_64-linux with nvptx accelerator. gcc/ChangeLog: 2021-05-03 Tom de Vries <tdevries@suse.de> PR target/100321 * omp-low.c (lower_rec_input_clauses): Disable SIMT for user-defined reduction. libgomp/ChangeLog: 2021-05-03 Tom de Vries <tdevries@suse.de> PR target/100321 * testsuite/libgomp.c/target-44.c: New test.	2021-05-03 23:13:59 +02:00
Tom de Vries	fc14ff6111	[omp, simt] Handle alternative IV Consider the test-case libgomp.c/pr81778.c added in this commit, with this core loop (note: CANARY_SIZE set to 0 for simplicity): ... int s = 1; #pragma omp target simd for (int i = N - 1; i > -1; i -= s) a[i] = 1; ... which, given that N is 32, sets a[0..31] to 1. After omp-expand, this looks like: ... <bb 5> : simduid.7 = .GOMP_SIMT_ENTER (simduid.7); .omp_simt.8 = .GOMP_SIMT_ENTER_ALLOC (simduid.7); D.3193 = -s; s.9 = s; D.3204 = .GOMP_SIMT_LANE (); D.3205 = -s.9; D.3206 = (int) D.3204; D.3207 = D.3205 * D.3206; i = D.3207 + 31; D.3209 = 0; D.3210 = -s.9; D.3211 = D.3210 - i; D.3210 = -s.9; D.3212 = D.3211 / D.3210; D.3213 = (unsigned int) D.3212; D.3213 = i >= 0 ? D.3213 : 0; <bb 19> : if (D.3209 < D.3213) goto <bb 6>; [87.50%] else goto <bb 7>; [12.50%] <bb 6> : a[i] = 1; D.3215 = -s.9; D.3219 = .GOMP_SIMT_VF (); D.3216 = (int) D.3219; D.3220 = D.3215 * D.3216; i = D.3220 + i; D.3209 = D.3209 + 1; goto <bb 19>; [100.00%] ... On nvptx, the first time bb6 is executed, i is in the 0..31 range (depending on the lane that is executing) at bb entry. So we have the following sequence: - a[0..31] is set to 1 - i is updated to -32..-1 - D.3209 is updated to 1 (being 0 initially) - bb19 is executed, and if condition (D.3209 < D.3213) == (1 < 32) evaluates to true - bb6 is once more executed, which should not happen because all the elements that needed to be handled were already handled. - consequently, elements that should not be written are written - with CANARY_SIZE == 0, we may run into a libgomp error: ... libgomp: cuCtxSynchronize error: an illegal memory access was encountered ... and with CANARY_SIZE unmodified, we run into: ... Expected 0, got 1 at base[-961] Aborted (core dumped) ... The cause of this is as follows: - because the step s is a variable rather than a constant, an alternative IV (D.3209 in our example) is generated in expand_omp_simd, and the loop condition is tested in terms of the alternative IV rather than the original IV (i in our example). - the SIMT code in expand_omp_simd works by modifying step and initial value. - The initial value fd->loop.n1 is loaded into a variable n1, which is modified by the SIMT code and then used there-after. - The step fd->loop.step is loaded into a variable step, which is modified by the SIMT code, but afterwards there are uses of both step and fd->loop.step. - There are uses of fd->loop.step in the alternative IV handling code, which should use step instead. Fix this by introducing an additional variable orig_step, which is not modified by the SIMT code and replacing all remaining uses of fd->loop.step by either step or orig_step. Build on x86_64-linux with nvptx accelerator, tested libgomp. This fixes for-5.c and for-6.c FAILs I'm currently seeing on a quadro m1200 with driver 450.66. gcc/ChangeLog: 2020-10-02 Tom de Vries <tdevries@suse.de> * omp-expand.c (expand_omp_simd): Add step_orig, and replace uses of fd->loop.step by either step or orig_step. libgomp/ChangeLog: 2020-10-02 Tom de Vries <tdevries@suse.de> * testsuite/libgomp.c/pr81778.c: New test.	2021-04-29 14:37:32 +02:00
Tom de Vries	4d7c874e2c	[omp, simt] Fix expand_GOMP_SIMT_* When running the test-case included in this patch using an nvptx accelerator, it fails in execution. The problem is that the expansion of GOMP_SIMT_XCHG_BFLY is optimized away during pass_jump as "trivially dead insns". This is caused by this code in expand_GOMP_SIMT_XCHG_BFLY: ... class expand_operand ops[3]; create_output_operand (&ops[0], target, mode); ... expand_insn (targetm.code_for_omp_simt_xchg_bfly, 3, ops); ... which doesn't guarantee that target is assigned to by the expanded insn. F.i., if target is: ... (gdb) call debug_rtx ( target ) (subreg/s/u:QI (reg:SI 40 [ _61 ]) 0) ... then after expand_insn, we have: ... (gdb) call debug_rtx ( ops[0].value ) (reg:QI 57) ... See commit `3af3bec2e4` "internal-fn: Avoid dropping the lhs of some calls [PR94941]" for a similar problem. Fix this in the same way, by adding: ... if (!rtx_equal_p (target, ops[0].value)) emit_move_insn (target, ops[0].value); ... where applicable in the expand_GOMP_SIMT_* functions. Tested libgomp on x86_64 with nvptx accelerator. gcc/ChangeLog: 2021-04-28 Tom de Vries <tdevries@suse.de> PR target/100232 * internal-fn.c (expand_GOMP_SIMT_ENTER_ALLOC) (expand_GOMP_SIMT_LAST_LANE, expand_GOMP_SIMT_ORDERED_PRED) (expand_GOMP_SIMT_VOTE_ANY, expand_GOMP_SIMT_XCHG_BFLY) (expand_GOMP_SIMT_XCHG_IDX): Ensure target is assigned to.	2021-04-29 09:55:15 +02:00
Tobias Burnus	95dfc3ac7b	libgomp/testsuite: Fix checks for dg-excess-errors For the tests modified below, the effective target line has to be effective when compiling for an offload target, except that variable-not-offloaded.c would compile with unified-share memory and pr86416-.c if long double/float128 is supported. The previous check used a run-time device ability check. This new variant now enables those dg- lines when _compiling_ for nvptx or gcn. libgomp/ChangeLog: testsuite/lib/libgomp.exp (offload_target_to_openacc_device_type): New, based on check_effective_target_offload_target_nvptx. (check_effective_target_offload_target_nvptx): Call it. (check_effective_target_offload_target_amdgcn): New. * testsuite/libgomp.c-c++-common/function-not-offloaded.c: Require target offload_target_nvptx \|\| offload_target_amdgcn. * testsuite/libgomp.c-c++-common/variable-not-offloaded.c: Likewise. * testsuite/libgomp.c/pr86416-1.c: Likewise. * testsuite/libgomp.c/pr86416-2.c: Likewise.	2021-04-21 20:07:19 +02:00
Thomas Schwinge	4dd9e1c541	XFAIL OpenMP/nvptx execution-time hangs for simple nested OpenMP 'target'/'parallel'/'task' constructs [PR99555] ... still awaiting proper resolution, of course. libgomp/ PR target/99555 * testsuite/lib/libgomp.exp (check_effective_target_offload_device_nvptx): New. * testsuite/libgomp.c/pr99555-1.c <nvptx offload device>: Until resolved, make sure that we exit quickly, with error status, XFAILed. * testsuite/libgomp.c-c++-common/task-detach-6.c: Likewise. * testsuite/libgomp.fortran/task-detach-6.f90: Likewise.	2021-04-15 11:13:27 +02:00
Tobias Burnus	d579e2e76f	libgomp: Fix on_device_arch.c aux-file handling [PR99555] libgomp/ChangeLog: PR target/99555 * testsuite/lib/on_device_arch.c: Move to ... * testsuite/libgomp.c-c++-common/on_device_arch.h: ... here. * testsuite/libgomp.fortran/on_device_arch.c: New file; #include on_device_arch.h. * testsuite/libgomp.c-c++-common/task-detach-6.c: #include on_device_arch.h instead of using dg-additional-source. * testsuite/libgomp.c/pr99555-1.c: Likewise. * testsuite/libgomp.fortran/task-detach-6.f90: Update to use on_device_arch.c without relative paths.	2021-03-29 10:40:38 +02:00
Thomas Schwinge	d99111fd8e	Avoid OpenMP/nvptx execution-time hangs for simple nested OpenMP 'target'/'parallel'/'task' constructs [PR99555] ... awaiting proper resolution, of course. libgomp/ PR target/99555 * testsuite/lib/on_device_arch.c: New file. * testsuite/libgomp.c/pr99555-1.c: Likewise. * testsuite/libgomp.c-c++-common/task-detach-6.c: Until resolved, skip for nvptx offloading, with error status. * testsuite/libgomp.fortran/task-detach-6.f90: Likewise.	2021-03-25 13:00:11 +01:00
Jakub Jelinek	99dee82307	Update copyright years.	2021-01-04 10:26:59 +01:00
Jakub Jelinek	8b60459465	openmp: Don't optimize shared to firstprivate on task with depend clause The attached testcase is miscompiled, because we optimize shared clauses to firstprivate when task body can't modify the variable even when the task has depend clause. That is wrong, because firstprivate means the variable will be copied immediately when the task is created, while with depend clause some other task might change it later before the dependencies are satisfied and the task should observe the value only after the change. 2020-12-18 Jakub Jelinek <jakub@redhat.com> * gimplify.c (struct gimplify_omp_ctx): Add has_depend member. (gimplify_scan_omp_clauses): Set it to true if OMP_CLAUSE_DEPEND appears on OMP_TASK. (gimplify_adjust_omp_clauses_1, gimplify_adjust_omp_clauses): Force GOVD_WRITTEN on shared variables if task construct has depend clause. * testsuite/libgomp.c/task-6.c: New test.	2020-12-18 21:43:20 +01:00
Tobias Burnus	cb1a4876a0	testsuite/libgomp.c/usleep.h: Use sleep-loop also for GCN As typically configured, newlib's libc.a does not build 'posix' and, hence, usleep is not available. Thus, use the same fallback as for nvptx. libgomp/ * testsuite/libgomp.c/usleep.h (fallback_usleep): Renamed from nvptx_usleep; use also for device={arch(gcn)}.	2020-11-18 14:11:27 +01:00
Kwok Cheung Yeung	10508db867	openmp: Mark deprecated symbols in OpenMP 5.0 2020-11-05 Ulrich Drepper <drepper@redhat.com> Kwok Cheung Yeung <kcy@codesourcery.com> libgomp/ * Makefile.am (%.mod): Add -cpp and -fopenmp to compile flags. * Makefile.in: Regenerate. * fortran.c: Wrap uses of omp_set_nested and omp_get_nested with pragmas to ignore -Wdeprecated-declarations warnings. * icv.c: Likewise. * omp.h.in (__GOMP_DEPRECATED_5_0): Define. Mark omp_lock_hint_* enum values, omp_lock_hint_t, omp_set_nested, and omp_get_nested with __GOMP_DEPRECATED_5_0. * omp_lib.f90.in: Mark omp_get_nested and omp_set_nested as deprecated. * testsuite/libgomp.c++/affinity-1.C: Add -Wno-deprecated-declarations to test options. * testsuite/libgomp.c/affinity-1.c: Likewise. * testsuite/libgomp.c/affinity-2.c: Likewise. * testsuite/libgomp.c/appendix-a/a.15.1.c: Likewise. * testsuite/libgomp.c/lib-1.c: Likewise. * testsuite/libgomp.c/nested-1.c: Likewise. * testsuite/libgomp.c/nested-2.c: Likewise. * testsuite/libgomp.c/nested-3.c: Likewise. * testsuite/libgomp.c/pr32362-1.c: Likewise. * testsuite/libgomp.c/pr32362-2.c: Likewise. * testsuite/libgomp.c/pr32362-3.c: Likewise. * testsuite/libgomp.c/pr35549.c: Likewise. * testsuite/libgomp.c/pr42942.c: Likewise. * testsuite/libgomp.c/pr61200.c: Likewise. * testsuite/libgomp.c/sort-1.c: Likewise. * testsuite/libgomp.c/target-5.c: Likewise. * testsuite/libgomp.c/target-6.c: Likewise. * testsuite/libgomp.c/teams-1.c: Likewise. * testsuite/libgomp.c/thread-limit-1.c: Likewise. * testsuite/libgomp.c/thread-limit-2.c: Likewise. * testsuite/libgomp.c/thread-limit-4.c: Likewise. * testsuite/libgomp.fortran/affinity1.f90: Likewise. * testsuite/libgomp.fortran/lib1.f90: Likewise. * testsuite/libgomp.fortran/lib2.f: Likewise. * testsuite/libgomp.fortran/nested1.f90: Likewise. * testsuite/libgomp.fortran/teams1.f90: Likewise.	2020-11-05 10:32:56 -08:00
Jakub Jelinek	2298ca2d3e	openmp: Implicitly discover declare target for variants of declare variant calls This marks all variants of declare variant also declare target if the base functions are called directly in target regions or declare target functions. 2020-10-28 Jakub Jelinek <jakub@redhat.com> gcc/ * omp-offload.c (omp_declare_target_tgt_fn_r): Handle direct calls to declare variant base functions. libgomp/ * testsuite/libgomp.c/target-42.c: New test.	2020-10-28 10:36:31 +01:00
Jakub Jelinek	3f39b64e57	xfail and improve some failing libgomp tests [PR81690] With the patch I've posted today to fix up declare variant LTO handling, Tobias reported the patch still doesn't work, and there are two reasons for that. One is that when the base function is marked implicitly as declare target, we don't mark also implicitly the variants. I'll need to ask on omp-lang about details for that, but generally the compiler should do it some way. The other one is that the way base_delay is written, it will always call the usleep function, which is undesirable for nvptx. While the compiler will replace all direct calls to base_delay to nvptx_delay, the base_delay definition which calls usleep stays. 2020-10-28 Jakub Jelinek <jakub@redhat.com> Tom de Vries <tdevries@suse.de> PR testsuite/81690 * testsuite/libgomp.c/usleep.h: New file. * testsuite/libgomp.c/target-32.c: Include usleep.h. (main): Use tgt_usleep instead of usleep. * testsuite/libgomp.c/thread-limit-2.c: Include usleep.h. (main): Use tgt_usleep instead of usleep.	2020-10-28 10:30:41 +01:00
Jakub Jelinek	f165ef89c0	lto: LTO cgraph support for late declare variant resolution [PR96680] > I've tried to add the saving/restoring next to ipa refs saving/restoring, as > the declare variant alt stuff is kind of extension of those, unfortunately > following doesn't compile, because I need to also write or read a tree there > (ctx is a portion of DECL_ATTRIBUTES of the base function), but the ipa refs > write/read back functions don't have arguments that can be used for that. This patch adds the streaming out and in of those omp_declare_variant_alt hash table on the side data for the declare_variant_alt cgraph_nodes and treats for LTO purposes the declare_variant_alt nodes (which have no body) as if they contained a body that calls all the possible variants. After IPA all the calls to these magic declare_variant_alt calls are replaced with call to one of the variant depending on which one has the highest score in the context. 2020-10-28 Jakub Jelinek <jakub@redhat.com> PR lto/96680 gcc/ * lto-streamer.h (omp_lto_output_declare_variant_alt, omp_lto_input_declare_variant_alt): Declare variant. * symtab.c (symtab_node::get_partitioning_class): Return SYMBOL_DUPLICATE for declare_variant_alt nodes. * passes.c (ipa_write_summaries): Add declare_variant_alt to partition. * lto-cgraph.c (output_refs): Call omp_lto_output_declare_variant_alt on declare_variant_alt nodes. (input_refs): Call omp_lto_input_declare_variant_alt on declare_variant_alt nodes. * lto-streamer-out.c (output_function): Don't call collect_block_tree_leafs if DECL_INITIAL is error_mark_node. (lto_output): Call output_function even for declare_variant_alt nodes. * omp-general.c (omp_lto_output_declare_variant_alt, omp_lto_input_declare_variant_alt): New functions. gcc/lto/ * lto-common.c (lto_fixup_prevailing_decls): Don't use LTO_NO_PREVAIL on TREE_LIST's TREE_PURPOSE. * lto-partition.c (lto_balanced_map): Treat declare_variant_alt nodes like definitions. libgomp/ * testsuite/libgomp.c/declare-variant-1.c: New test.	2020-10-28 10:29:09 +01:00
Jakub Jelinek	17c5b7e1dc	openmp: Add test for OMP_TARGET_OFFLOAD=mandatory for cases where it must not fail 2020-10-22 Jakub Jelinek <jakub@redhat.com> * testsuite/libgomp.c/target-41.c: New test.	2020-10-22 09:36:18 +02:00
Jakub Jelinek	74c9882b80	openmp: Change omp_get_initial_device () to match OpenMP 5.1 requirements > Therefore, I think until omp_get_initial_device () value is changed, we The following so far untested patch implements that change. OpenMP 4.5 said for omp_get_initial_device: The value of the device number is implementation defined. If it is between 0 and one less than omp_get_num_devices() then it is valid for use with all device constructs and routines; if it is outside that range, then it is only valid for use with the device memory routines and not in the device clause. and OpenMP 5.0 similarly, but OpenMP 5.1 says: The value of the device number is the value returned by the omp_get_num_devices routine. As the new value is compatible with what has been required earlier, I think we can change it already now. 2020-10-22 Jakub Jelinek <jakub@redhat.com> * icv.c (omp_get_initial_device): Remove including corresponding ialias. * icv-device.c (omp_get_initial_device): New function. Return gomp_get_num_devices (). Add ialias. * target.c (resolve_device): Don't fail with OMP_TARGET_OFFLOAD=mandatory if device_id is equal to gomp_get_num_devices (). (omp_target_alloc, omp_target_free, omp_target_is_present, omp_target_memcpy, omp_target_memcpy_rect, omp_target_associate_ptr, omp_target_disassociate_ptr, omp_pause_resource): Use gomp_get_num_devices () instead of GOMP_DEVICE_HOST_FALLBACK on the first use in the functions, in uses dominated by the gomp_get_num_devices call use num_devices_openmp instead. * libgomp.texi (omp_get_initial_device): Document. * config/gcn/icv-device.c (omp_get_initial_device): New function. Add ialias. * config/nvptx/icv-device.c (omp_get_initial_device): Likewise. * testsuite/libgomp.c/target-40.c: New test.	2020-10-22 09:31:01 +02:00
Kwok Cheung Yeung	8949b985db	openmp: Add support for the omp_get_supported_active_levels runtime library routine This patch implements the omp_get_supported_active_levels runtime routine from the OpenMP 5.0 specification, which returns the maximum number of active nested parallel regions supported by this implementation. The current maximum (set using the omp_set_max_active_levels routine or the OMP_MAX_ACTIVE_LEVELS environment variable) cannot exceed this number. 2020-10-13 Kwok Cheung Yeung <kcy@codesourcery.com> libgomp/ * env.c (gomp_max_active_levels_var): Initialize to gomp_supported_active_levels. (initialize_env): Limit gomp_max_active_levels_var to be at most equal to gomp_supported_active_levels. * fortran.c (omp_get_supported_active_levels): Add ialias_redirect. (omp_get_supported_active_levels_): New. * icv.c (omp_set_max_active_levels): Limit gomp_max_active_levels_var to at most equal to gomp_supported_active_levels. (omp_get_supported_active_levels): New. * libgomp.h (gomp_supported_active_levels): New. * libgomp.map (OMP_5.0.1): Add omp_get_supported_active_levels and omp_get_supported_active_levels_. * libgomp.texi (omp_get_supported_active_levels): New. (omp_set_max_active_levels): Update. Add reference to omp_get_supported_active_levels. * omp.h.in (omp_get_supported_active_levels): New. * omp_lib.f90.in (omp_get_supported_active_levels): New. * omp_lib.h.in (omp_get_supported_active_levels): New. * testsuite/libgomp.c/lib-2.c (main): Check omp_get_max_active_levels against omp_get_supported_active_levels. * testsuite/libgomp.fortran/lib4.f90 (lib4): Likewise.	2020-10-13 13:21:02 -07:00
Jakub Jelinek	c2ebf4f10d	openmp: Add support for non-rect simd and improve collapsed simd support The following change adds support for non-rectangular simd loops. While working on that, I've noticed we actually don't vectorize collapsed simd loops at all, because the code that I thought would be vectorizable actually is not vectorized. While in theory for the constant lower/upper bounds and constant step of all but the outermost loop we could in theory vectorize by computing the seprate iterators using vectorized division and modulo for each of them from the single iterator that increments by 1 from 0 to total iteration count in the loop nest, I think that would be fairly expensive and the chances of the loop body being vectorizable would be low e.g. because of array indices unlikely to be linear and would need scatters/gathers. This patch changes the generated code to vectorize only the innermost loop which has higher chance of being vectorized. Below is the list of tests and function names in which the patch resulted in vectorizing something that hasn't been vectorized before (ok, the first line is a new test). I've also found that the vectorizer will not vectorize loops with non-constant steps, I plan to do something about those incrementally on the omp-expand.c side (basically, compute number of iterations before the loop and use a 0 to number_of_iterations step 1 IV as the main one). I have problem with the composite simd vectorization though. The point is that each thread (or task etc.) is given only a range of consecutive iterations, so somewhere earlier it computes total number of iterations and splits the work between the workers and then the intent is to try to vectorize it. So, each thread is then given a begin ... end-1 range that it would handle. This means that from the single begin value I need to compute the individual iteration vars I should start at and then goto into the loop nest to begin iterating there (and actually compute how many iterations the innermost loop should do each time so that it stops before end). Very roughly the IL I emit is something like: int t[100][100][100]; void foo (int a, int b, int c, int d, int e, int f, int g, int h, int u, int v, int w, int x) { int i, j, k; int cnt; if (x) { i = u; j = v; k = w; goto doit; } for (i = a; i < b; i += c) for (j = d; j < e; j += f) { k = g; doit: for (; k < h; k++) t[i][j][k] += i + j + k; } } Unfortunately, some pass then turns the innermost loop to have more than 2 basic blocks and it isn't vectorized because of that. Also, I have disabled (for now) SIMTization of collapsed simd loops, because for SIMT it would be using a single thread anyway and I didn't want to bother with checking SIMT on all places I've been changing. If SIMT support is added for some or all collapsed loops, that omp-low.c change needs to be reverted. Here is that list of what hasn't been vectorized before and is now: gcc/testsuite/gcc.dg/vect/vect-simd-17.c doit gcc/testsuite/gfortran.dg/gomp/openmp-simd-6.f90 bar libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-10.c f28_taskloop_simd_normal._omp_fn.0 libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-10.c _Z24f28_taskloop_simd_normalv._omp_fn.0 libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-11.c f25_t_simd_normal._omp_fn.0 libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-11.c f26_t_simd_normal._omp_fn.0 libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-11.c f27_t_simd_normal._omp_fn.0 libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-11.c f28_tpf_simd_guided32._omp_fn.1 libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-11.c f28_tpf_simd_runtime._omp_fn.1 libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-11.c _Z17f25_t_simd_normaliiiiiii._omp_fn.0 libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-11.c _Z17f26_t_simd_normaliiiixxi._omp_fn.0 libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-11.c _Z17f27_t_simd_normalv._omp_fn.0 libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-11.c _Z20f28_tpf_simd_runtimev._omp_fn.1 libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-11.c _Z21f28_tpf_simd_guided32v._omp_fn.1 libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-2.c f7_simd_normal libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-2.c f7_simd_normal libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-2.c f8_f_simd_guided32 libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-2.c f8_f_simd_guided32 libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-2.c f8_f_simd_runtime libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-2.c f8_f_simd_runtime libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-2.c f8_pf_simd_guided32._omp_fn.0 libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-2.c f8_pf_simd_runtime._omp_fn.0 libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-2.c _Z18f8_pf_simd_runtimev._omp_fn.0 libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-2.c _Z19f8_pf_simd_guided32v._omp_fn.0 libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-4.c f8_taskloop_simd_normal._omp_fn.0 libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-4.c _Z23f8_taskloop_simd_normalv._omp_fn.0 libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-5.c f7_t_simd_normal._omp_fn.0 libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-5.c f8_tpf_simd_guided32._omp_fn.1 libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-5.c f8_tpf_simd_runtime._omp_fn.1 libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-5.c _Z16f7_t_simd_normalv._omp_fn.0 libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-5.c _Z19f8_tpf_simd_runtimev._omp_fn.1 libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-5.c _Z20f8_tpf_simd_guided32v._omp_fn.1 libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-8.c f25_simd_normal libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-8.c f25_simd_normal libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-8.c f26_simd_normal libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-8.c f26_simd_normal libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-8.c f27_simd_normal libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-8.c f27_simd_normal libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-8.c f28_f_simd_guided32 libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-8.c f28_f_simd_guided32 libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-8.c f28_f_simd_runtime libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-8.c f28_f_simd_runtime libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-8.c f28_pf_simd_guided32._omp_fn.0 libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-8.c f28_pf_simd_runtime._omp_fn.0 libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-8.c _Z19f28_pf_simd_runtimev._omp_fn.0 libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-8.c _Z20f28_pf_simd_guided32v._omp_fn.0 libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/master-combined-1.c main._omp_fn.9 libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/master-combined-1.c main._omp_fn.9 libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/simd-1.c f2 libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/simd-1.c f2 libgomp/testsuite/libgomp.c/pr70680-2.c f1._omp_fn.0 libgomp/testsuite/libgomp.c/pr70680-2.c f2._omp_fn.0 libgomp/testsuite/libgomp.c/pr70680-2.c f3._omp_fn.0 libgomp/testsuite/libgomp.c/pr70680-2.c f4._omp_fn.0 libgomp/testsuite/libgomp.c/simd-8.c foo libgomp/testsuite/libgomp.c/simd-9.c bar libgomp/testsuite/libgomp.c/simd-9.c foo 2020-09-25 Jakub Jelinek <jakub@redhat.com> gcc/ * omp-low.c (scan_omp_1_stmt): Don't call scan_omp_simd for collapse > 1 loops as simt doesn't support collapsed loops yet. * omp-expand.c (expand_omp_for_init_counts, expand_omp_for_init_vars): Small tweaks to function comment. (expand_omp_simd): Rewritten collapse > 1 support to only attempt to vectorize the innermost loop and emit set of outer loops around it. For non-composite simd with collapse > 1 without broken loop don't even try to compute number of iterations first. Add support for non-rectangular simd loops. (expand_omp_for): Don't sorry_at on non-rectangular simd loops. gcc/testsuite/ * gcc.dg/vect/vect-simd-17.c: New test. libgomp/ * testsuite/libgomp.c/loop-25.c: New test.	2020-09-25 10:43:37 +02:00
Jakub Jelinek	2e47c8c6ea	openmp: Add support for non-rectangular loops in taskloop construct 2020-08-13 Jakub Jelinek <jakub@redhat.com> * gimplify.c (gimplify_omp_taskloop_expr): New function. (gimplify_omp_for): Use it. For OMP_FOR_NON_RECTANGULAR loops adjust in outer taskloop the var-outer decls. * omp-expand.c (expand_omp_taskloop_for_inner): Handle non-rectangular loops. (expand_omp_for): Don't reject non-rectangular taskloop. * omp-general.c (omp_extract_for_data): Don't assert that non-rectangular loops have static schedule, instead treat loop->m1 or loop->m2 as if loop->n1 or loop->n2 is non-constant. * testsuite/libgomp.c/loop-22.c (main): Add some further tests. * testsuite/libgomp.c/loop-23.c (main): Likewise. * testsuite/libgomp.c/loop-24.c: New test.	2020-08-13 09:06:05 +02:00
Jakub Jelinek	9f3abfb84e	openmp: Handle even some combined non-rectangular loops The number of loops computation and logical iteration -> actual iterator values computations can now be done separately even on composite constructs (though for triangular loops it would still be more efficient to propagate a few values through, will handle that incrementally). simd and taskloop are still unhandled. 2020-08-05 Jakub Jelinek <jakub@redhat.com> * omp-expand.c (expand_omp_for): Don't disallow combined non-rectangular loops. * testsuite/libgomp.c/loop-22.c: New test. * testsuite/libgomp.c/loop-23.c: New test.	2020-08-05 10:45:16 +02:00
Jakub Jelinek	916c7a201a	openmp: Handle reduction clauses on host teams construct [PR96459] As the new testcase shows, we weren't actually performing reductions on host teams construct. And fixing that revealed a flaw in the for-14.c testcase. The problem is that the tests perform also initialization and checking around the calls to the functions with the OpenMP constructs. In that testcase, all the tests have been spawned from a teams construct but only the tested loops were distribute, which means the initialization and checking has been performed redundantly and racily in each team. Fixed by performing the initialization and checking outside of host teams and only do the calls to functions with the tested constructs inside of host teams. 2020-08-05 Jakub Jelinek <jakub@redhat.com> PR middle-end/96459 * omp-low.c (lower_omp_taskreg): Call lower_reduction_clauses even in for host teams. * testsuite/libgomp.c/teams-3.c: New test. * testsuite/libgomp.c-c++-common/for-2.h (OMPTEAMS): Define to nothing if not defined yet. (N(test)): Use it before all N(f) calls. testsuite/libgomp.c-c++-common/for-14.c (DO_PRAGMA, OMPTEAMS): Define. (main): Don't call all test_* functions from within #pragma omp teams reduction(\|:err), call them directly.	2020-08-05 10:40:10 +02:00
H.J. Lu	7aa22a8f1a	x86-64: Define ASM_OUTPUT_ALIGNED_DECL_LOCAL Define ASM_OUTPUT_ALIGNED_DECL_LOCAL for large local common symbol. gcc/ PR target/95620 * config/i386/x86-64.h (ASM_OUTPUT_ALIGNED_DECL_LOCAL): New. libgomp/ PR target/95620 * testsuite/libgomp.c/pr95620.c: New test.	2020-07-18 08:51:54 -07:00
Jakub Jelinek	f418bd4b92	openmp: Adjust outer bounds of non-rect loops In loops like: #pragma omp parallel for collapse(2) for (i = -4; i < 8; i++) for (j = 3 * i; j > 2 * i; j--) for some outer loop iterations there are no inner loop iterations at all, the condition is false. In order to use Summæ Potestate to count number of iterations or to transform the logical iteration number to actual iterator values using quadratic non-equation root discovery the outer iterator range needs to be adjusted, such that the inner loop has at least one iteration for each of the outer loop iterator value in the reduced range. Sometimes this adjustment is done at the start of the range, at other times at the end. This patch implements it during the compile time number of loop computation (if all expressions are compile time constants). 2020-07-14 Jakub Jelinek <jakub@redhat.com> * omp-general.h (struct omp_for_data): Add adjn1 member. * omp-general.c (omp_extract_for_data): For non-rect loop, punt on count computing if n1, n2 or step are not INTEGER_CST earlier. Narrow the outer iterator range if needed so that non-rect loop has at least one iteration for each outer range iteration. Compute adjn1. * omp-expand.c (expand_omp_for_init_vars): Use adjn1 if non-NULL instead of the outer loop's n1. * testsuite/libgomp.c/loop-21.c: New test.	2020-07-14 10:31:59 +02:00
Jakub Jelinek	5acef69f9d	openmp: Optimize triangular loop logical iterator to actual iterators computation using search for quadratic equation root(s) This patch implements the optimized logical to actual iterators computation for triangular loops. I have a rough implementation using integers, but this one uses floating point. There is a small problem that -fopenmp programs aren't linked with -lm, so it does it only if the hw has sqrt optab (and uses ifn rather than __builtin_sqrt because it obviously doesn't need errno handling etc.). Do you think it is ok this way, or should I use the integral computation using inlined isqrt (we have inequation of the form start >= x * t10 + t11 * (((x - 1) * x) / 2) where t10 and t11 are signed long long values and start unsigned long long, and the division by 2 actually is a problem for accuracy in some cases, so if we do it in integral, we need to do actually long long t12 = 2 * t10 - t11; unsigned long long t13 = t12 * t12 + start * 8 * t11; unsigned long long isqrt_ = isqrtull (t13); long long x = (((long long) isqrt_ - t12) / t11) >> 1; with careful overflow checking on all the computations before isqrtull (and on overflows use the fallback implementation). 2020-07-09 Jakub Jelinek <jakub@redhat.com> * omp-general.h (struct omp_for_data): Add min_inner_iterations and factor members. * omp-general.c (omp_extract_for_data): Initialize them and remember them in OMP_CLAUSE_COLLAPSE_COUNT if needed and restore from there. * omp-expand.c (expand_omp_for_init_counts): Fix up computation of counts[fd->last_nonrect] if fd->loop.n2 is INTEGER_CST. (expand_omp_for_init_vars): For fd->first_nonrect + 1 == fd->last_nonrect loops with for now INTEGER_CST fd->loop.n2 find quadratic equation roots instead of using fallback method when possible. * testsuite/libgomp.c/loop-19.c: New test. * testsuite/libgomp.c/loop-20.c: New test.	2020-07-09 12:07:17 +02:00
Jakub Jelinek	aed3ab253d	openmp: Non-rectangular loop support for non-composite worksharing loops and distribute This implements the fallback mentioned in https://gcc.gnu.org/pipermail/gcc/2020-June/232874.html Special cases for triangular loops etc. to follow later, also composite constructs not supported yet (need to check the passing of temporaries around) and lastprivate might not give the same answers as serial loop if the last innermost body iteration isn't the last one for some of the outer loops (that will need to be solved separately together with rectangular loops that have no innermost body iterations, but some of the outer loops actually iterate). Also, simd needs work. 2020-06-27 Jakub Jelinek <jakub@redhat.com> * omp-general.h (struct omp_for_data_loop): Add non_rect_referenced member, move outer member. (struct omp_for_data): Add first_nonrect and last_nonrect members. * omp-general.c (omp_extract_for_data): Initialize first_nonrect, last_nonrect and non_rect_referenced members. * omp-expand.c (expand_omp_for_init_counts): Handle non-rectangular loops. (expand_omp_for_init_vars): Add nonrect_bounds parameter. Handle non-rectangular loops. (extract_omp_for_update_vars): Likewise. (expand_omp_for_generic, expand_omp_for_static_nochunk, expand_omp_for_static_chunk, expand_omp_simd, expand_omp_taskloop_for_outer, expand_omp_taskloop_for_inner): Adjust expand_omp_for_init_vars and extract_omp_for_update_vars callers. (expand_omp_for): Don't sorry on non-composite worksharing-loop or distribute. * testsuite/libgomp.c/loop-17.c: New test. * testsuite/libgomp.c/loop-18.c: New test.	2020-06-27 12:43:36 +02:00
Jakub Jelinek	dc703151d4	openmp: Implement discovery of implicit declare target to clauses This attempts to implement what the OpenMP 5.0 spec in declare target section says as ammended by the 5.1 changes so far (related to device_type(host)), except that it doesn't have the device(ancestor: ...) handling yet because we do not support it yet, and I've left so far out the except lambda note, because I need that clarified. 2020-05-12 Jakub Jelinek <jakub@redhat.com> * omp-offload.h (omp_discover_implicit_declare_target): Declare. * omp-offload.c: Include context.h. (omp_declare_target_fn_p, omp_declare_target_var_p, omp_discover_declare_target_fn_r, omp_discover_declare_target_var_r, omp_discover_implicit_declare_target): New functions. * cgraphunit.c (analyze_functions): Call omp_discover_implicit_declare_target. * testsuite/libgomp.c/target-39.c: New test.	2020-05-12 09:17:09 +02:00
Tobias Burnus	c2211a60ff	Fix OpenMP offload handling for target-link variables for nvptx (PR81689) PR libgomp/81689 * lto.c (offload_handle_link_vars): Propagate TREE_PUBLIC state. PR libgomp/81689 * omp-offload.c (omp_finish_file): Fix target-link handling if targetm_common.have_named_sections is false. PR libgomp/81689 * testsuite/libgomp.c/target-link-1.c: Remove xfail.	2020-03-24 15:13:56 +01:00
Jakub Jelinek	9c3cdb43c2	tree-nested: Fix handling of reduction clauses with C array sections [PR93566] tree-nested.c didn't handle C array sections in {,task_,in_}reduction clauses. 2020-03-14 Jakub Jelinek <jakub@redhat.com> PR middle-end/93566 tree-nested.c (convert_nonlocal_omp_clauses, convert_local_omp_clauses): Handle {,in_,task_}reduction clauses with C/C++ array sections. * testsuite/libgomp.c/pr93566.c: New test.	2020-03-15 01:27:40 +01:00
Frederik Harwath	001ab12e62	openmp: ignore nowait if async execution is unsupported [PR93481] An OpenMP "nowait" clause on a target construct currently leads to a call to GOMP_OFFLOAD_async_run in the plugin that is used for offloading at execution time. The nvptx plugin contains only a stub of this function that always produces a fatal error if called. This commit changes the "nowait" implementation to ignore the clause if the executing device's plugin does not implement GOMP_OFFLOAD_async_run. The stub in the nvptx plugin is removed which effectively means that programs containing "nowait" can now be executed with nvptx offloading as if the clause had not been used. This behavior is consistent with the OpenMP specification which says that "[...] execution of the target task may be deferred" (emphasis added), cf. OpenMP 5.0, page 172. libgomp/ * plugin/plugin-nvptx.c: Remove GOMP_OFFLOAD_async_run stub. * target.c (gomp_load_plugin_for_device): Make "async_run" loading optional. (gomp_target_task_fn): Assert "devicep->async_run_func". (clear_unsupported_flags): New function to remove unsupported flags (right now only GOMP_TARGET_FLAG_NOWAIT) that can be be ignored. (GOMP_target_ext): Apply clear_unsupported_flags to flags. * testsuite/libgomp.c/target-33.c: Remove xfail for offload_target_nvptx. * testsuite/libgomp.c/target-34.c: Likewise.	2020-02-13 10:18:31 +01:00
Frederik Harwath	fd789c816b	Add xfails to libgomp tests target-{33,34}.c, target-link-1.c Add xfails for nvptx offloading because "no GOMP_OFFLOAD_async_run implemented in plugin-nvptx.c" (https://gcc.gnu.org/PR81688) and because "omp target link not implemented for nvptx" (https://gcc.gnu.org/PR81689). libgomp/ * testsuite/libgomp.c/target-33.c: Add xfail for execution on offload_target_nvptx, cf. https://gcc.gnu.org/PR81688. * testsuite/libgomp.c/target-34.c: Likewise. * testsuite/libgomp.c/target-link-1.c: Add xfail for offload_target_nvptx, cf. https://gcc.gnu.org/PR81689.	2020-02-10 09:16:46 +01:00
Jakub Jelinek	9bc3b95dfe	openmp: Optimize DECL_IN_CONSTANT_POOL vars in target regions DECL_IN_CONSTANT_POOL are shared and thus don't really get emitted in the BLOCK where they are used, so for OpenMP target regions that have initializers gimplified into copying from them we actually map them at runtime from host to offload devices. This patch instead marks them as "omp declare target", so that they are on the target device from the beginning and don't need to be copied there. 2020-02-09 Jakub Jelinek <jakub@redhat.com> * gimplify.c (gimplify_adjust_omp_clauses_1): Promote DECL_IN_CONSTANT_POOL variables into "omp declare target" to avoid copying them around between host and target. * testsuite/libgomp.c/target-38.c: New test.	2020-02-09 08:17:10 +01:00
Jakub Jelinek	8d9254fc8a	Update copyright years. From-SVN: r279813	2020-01-01 12:51:42 +01:00
Jakub Jelinek	601399c0df	re PR middle-end/86416 ([OpenMP] Offloading - better lto1 error message if mode not supported on offloading target) PR middle-end/86416 * testsuite/libgomp.c/pr86416-1.c (main): Use L suffixes rather than q or none. * testsuite/libgomp.c/pr86416-2.c (main): Use Q suffixes rather than L or none. From-SVN: r279552	2019-12-19 00:27:28 +01:00
Tobias Burnus	c80c9e26de	PR 86416 – improve lto1 diagnostic if a mode does not exist PR middle-end/86416 * Makefile.in (CFLAGS-lto-streamer-in.o): Pass target_noncanonical on. * lto-streamer-in.c (lto_input_mode_table): Improve unsupported-mode diagnostic. PR middle-end/86416 * testsuite/libgomp.c/pr86416-1.c: New. * testsuite/libgomp.c/pr86416-2.c: New. From-SVN: r279528	2019-12-18 17:51:08 +01:00
Rainer Orth	b8e724465b	Fix failures on Solaris with -fno-common default gcc/testsuite: * gcc.c-torture/execute/20030913-1.c: Rename glob to g. * gcc.c-torture/execute/960218-1.c: Rename glob to gl. * gcc.c-torture/execute/complex-6.c: Rename err to e. * gcc.dg/torture/ssa-pta-fn-1.c: Rename glob to g. libgomp: * testsuite/libgomp.c/pr39591-1.c: Rename err to e. * testsuite/libgomp.c/pr39591-2.c: Likewise. * testsuite/libgomp.c/pr39591-3.c: Likewise. * testsuite/libgomp.c/private-1.c: Likewise. * testsuite/libgomp.c/task-1.c: Likewise. * testsuite/libgomp.c/task-5.c: Renamed err to serr. From-SVN: r278571	2019-11-21 16:14:21 +00:00
Andrew Stubbs	8916ba874d	Add tests for print from offload target. 2019-11-15 Andrew Stubbs <ams@codesourcery.com> libgomp/ * testsuite/libgomp.c/target-print-1.c: New file. * testsuite/libgomp.fortran/target-print-1.f90: New file. * testsuite/libgomp.oacc-c/print-1.c: New file. * testsuite/libgomp.oacc-fortran/print-1.f90: New file. From-SVN: r278284	2019-11-15 10:49:10 +00:00
Jakub Jelinek	e62506f362	re PR libgomp/91530 (Several libgomp./scan- tests FAIL without avx_runtime) PR libgomp/91530 * config/i386/sse.md (vec_shl_<mode>, vec_shr_<mode>): Use V_128 iterator instead of VI_128. * testsuite/libgomp.c/scan-21.c: New test. * testsuite/libgomp.c/scan-22.c: New test. From-SVN: r274985	2019-08-28 12:13:21 +02:00
Jakub Jelinek	5cb72d83bb	re PR libgomp/91530 (Several libgomp./scan- tests FAIL without avx_runtime) PR libgomp/91530 * config/i386/sse.md (vec_shl_<mode>, vec_shr_<mode>): Use V_128 iterator instead of VI_128. * testsuite/libgomp.c/scan-21.c: New test. * testsuite/libgomp.c/scan-22.c: New test. From-SVN: r274984	2019-08-28 12:12:11 +02:00
Jakub Jelinek	0ad7981cb4	re PR libgomp/91530 (Several libgomp./scan- tests FAIL without avx_runtime) PR libgomp/91530 * testsuite/libgomp.c/scan-11.c: Add -msse2 option for sse2_runtime targets. * testsuite/libgomp.c/scan-12.c: Likewise. * testsuite/libgomp.c/scan-13.c: Likewise. * testsuite/libgomp.c/scan-14.c: Likewise. * testsuite/libgomp.c/scan-15.c: Likewise. * testsuite/libgomp.c/scan-16.c: Likewise. * testsuite/libgomp.c/scan-17.c: Likewise. * testsuite/libgomp.c/scan-18.c: Likewise. * testsuite/libgomp.c/scan-19.c: Likewise. * testsuite/libgomp.c/scan-20.c: Likewise. * testsuite/libgomp.c++/scan-9.C: Likewise. * testsuite/libgomp.c++/scan-10.C: Likewise. * testsuite/libgomp.c++/scan-11.C: Likewise. * testsuite/libgomp.c++/scan-12.C: Likewise. * testsuite/libgomp.c++/scan-14.C: Likewise. * testsuite/libgomp.c++/scan-15.C: Likewise. * testsuite/libgomp.c++/scan-13.C: Likewise. Use sse2_runtime instead of i?86-- x86_64-- as target for scan-tree-dump-times. * testsuite/libgomp.c++/scan-16.C: Likewise. From-SVN: r274947	2019-08-27 12:45:55 +02:00
Jakub Jelinek	8860d2706d	gimplify.c (omp_add_variable): Use GOVD_PRIVATE \| GOVD_EXPLICIT for VLA helper variables on target data even if... * gimplify.c (omp_add_variable): Use GOVD_PRIVATE \| GOVD_EXPLICIT for VLA helper variables on target data even if not GOVD_FIRSTPRIVATE. (gimplify_scan_omp_clauses): For OMP_CLAUSE_USE_DEVICE_* use just GOVD_EXPLICIT flags. (gimplify_omp_workshare): For OMP_TARGET_DATA move all OMP_CLAUSE_USE_DEVICE_* clauses to the end of clauses chain. * omp-low.c (scan_sharing_clauses): For OMP_CLAUSE_USE_DEVICE_* call install_var_field with mask 11 instead of 3. (lower_omp_target): For OMP_CLAUSE_USE_DEVICE_* use pass (splay_tree_key) &DECL_UID (var) to build_sender_ref instead of var. gcc/c/ * c-typeck.c (c_finish_omp_clauses): For C_ORT_OMP OMP_CLAUSE_USE_DEVICE_* clauses use oacc_reduction_head bitmap instead of generic_head to track duplicates. gcc/cp/ * semantics.c (finish_omp_clauses): For C_ORT_OMP OMP_CLAUSE_USE_DEVICE_* clauses use oacc_reduction_head bitmap instead of generic_head to track duplicates. libgomp/ * target.c (gomp_map_vars_internal): For GOMP_MAP_USE_DEVICE_PTR perform the lookup in the first loop only if !not_found_cnt, otherwise perform lookups for it in the second loop guarded with if (not_found_cnt \|\| has_firstprivate). * testsuite/libgomp.c/target-37.c: New test. * testsuite/libgomp.c++/target-22.C: New test. From-SVN: r274206	2019-08-08 08:39:02 +02:00
Jakub Jelinek	398e3feb8a	tree-core.h (enum omp_clause_code): Adjust OMP_CLAUSE_USE_DEVICE_PTR OpenMP description. * tree-core.h (enum omp_clause_code): Adjust OMP_CLAUSE_USE_DEVICE_PTR OpenMP description. Add OMP_CLAUSE_USE_DEVICE_ADDR clause. * tree.c (omp_clause_num_ops, omp_clause_code_name): Add entries for OMP_CLAUSE_USE_DEVICE_ADDR clause. (walk_tree_1): Handle OMP_CLAUSE_USE_DEVICE_ADDR. * tree-pretty-print.c (dump_omp_clause): Likewise. * tree-nested.c (convert_nonlocal_omp_clauses, convert_local_omp_clauses): Likewise. * gimplify.c (gimplify_scan_omp_clauses, gimplify_adjust_omp_clauses): Likewise. * omp-low.c (scan_sharing_clauses, lower_omp_target): Likewise. Treat OMP_CLAUSE_USE_DEVICE_ADDR like OMP_CLAUSE_USE_DEVICE_PTR clause with array or reference to array types, no matter what type except for reference it has. gcc/c-family/ * c-pragma.h (enum pragma_omp_clause): Add PRAGMA_OMP_CLAUSE_USE_DEVICE_ADDR. Set PRAGMA_OACC_CLAUSE_USE_DEVICE equal to PRAGMA_OMP_CLAUSE_USE_DEVICE_PTR instead of being a separate enumeration value. gcc/c/ * c-parser.c (c_parser_omp_clause_name): Parse use_device_addr clause. (c_parser_omp_clause_use_device_addr): New function. (c_parser_omp_all_clauses): Handle PRAGMA_OMP_CLAUSE_USE_DEVICE_ADDR. (OMP_TARGET_DATA_CLAUSE_MASK): Add PRAGMA_OMP_CLAUSE_USE_DEVICE_ADDR. (c_parser_omp_target_data): Handle PRAGMA_OMP_CLAUSE_USE_DEVICE_ADDR like PRAGMA_OMP_CLAUSE_USE_DEVICE_PTR, adjust diagnostics about no map or use_device_* clauses. * c-typeck.c (c_finish_omp_clauses): For OMP_CLAUSE_USE_DEVICE_PTR in OpenMP, require pointer type rather than pointer or array type. Handle OMP_CLAUSE_USE_DEVICE_ADDR. gcc/cp/ * parser.c (cp_parser_omp_clause_name): Parse use_device_addr clause. (cp_parser_omp_all_clauses): Handle PRAGMA_OMP_CLAUSE_USE_DEVICE_ADDR. (OMP_TARGET_DATA_CLAUSE_MASK): Add PRAGMA_OMP_CLAUSE_USE_DEVICE_ADDR. (cp_parser_omp_target_data): Handle PRAGMA_OMP_CLAUSE_USE_DEVICE_ADDR like PRAGMA_OMP_CLAUSE_USE_DEVICE_PTR, adjust diagnostics about no map or use_device_* clauses. * semantics.c (finish_omp_clauses): For OMP_CLAUSE_USE_DEVICE_PTR in OpenMP, require pointer or reference to pointer type rather than pointer or array or reference to pointer or array type. Handle OMP_CLAUSE_USE_DEVICE_ADDR. * pt.c (tsubst_omp_clauses): Handle OMP_CLAUSE_USE_DEVICE_ADDR. gcc/testsuite/ * c-c++-common/gomp/target-data-1.c (foo): Use use_device_addr clause instead of use_device_ptr clause where required by OpenMP 5.0, add further tests for both use_device_ptr and use_device_addr clauses. libgomp/ * testsuite/libgomp.c/target-18.c (struct S): New type. (foo): Use use_device_addr clause instead of use_device_ptr clause where required by OpenMP 5.0, add further tests for both use_device_ptr and use_device_addr clauses. * testsuite/libgomp.c++/target-9.C (struct S): New type. (foo): Use use_device_addr clause instead of use_device_ptr clause where required by OpenMP 5.0, add further tests for both use_device_ptr and use_device_addr clauses. Add t and u arguments. (main): Adjust caller. From-SVN: r274159	2019-08-07 09:27:10 +02:00
Jakub Jelinek	6f67abcdb0	omp-low.c (lower_rec_input_clauses): For lastprivate clauses in ctx->for_simd_scan_phase simd copy the outer var to... * omp-low.c (lower_rec_input_clauses): For lastprivate clauses in ctx->for_simd_scan_phase simd copy the outer var to the privatized variable(s). For conditional lastprivate look through outer GIMPLE_OMP_SCAN context. (lower_omp_1): For conditional lastprivate look through outer GIMPLE_OMP_SCAN context. * testsuite/libgomp.c/scan-19.c: New test. * testsuite/libgomp.c/scan-20.c: New test. From-SVN: r273169	2019-07-06 23:58:01 +02:00
Jakub Jelinek	1f52d1a8b5	omp-low.c (struct omp_context): Add for_simd_scan_phase member. * omp-low.c (struct omp_context): Add for_simd_scan_phase member. (maybe_lookup_ctx): Add forward declaration. (omp_find_scan): Likewise. Walk into body of simd if composited with worksharing loop. (scan_omp_simd_scan): New function. (scan_omp_1_stmt): Call it. (lower_rec_simd_input_clauses): Don't create rvar nor rvar2 if ctx->for_simd_scan_phase. (lower_rec_input_clauses): Do much less work for inscan reductions in ctx->for_simd_scan_phase is_simd regions. (lower_omp_scan): Set is_simd also on simd constructs composited with worksharing loop, unless ctx->for_simd_scan_phase. Never emit a sorry message. Don't change GIMPLE_OMP_SCAN stmts into nops and emit their body after in simd constructs composited with worksharing loop. (lower_omp_for_scan): Handle worksharing loop composited with simd. * c-c++-common/gomp/scan-4.c: Don't expect sorry message. * testsuite/libgomp.c/scan-11.c: New test. * testsuite/libgomp.c/scan-12.c: New test. * testsuite/libgomp.c/scan-13.c: New test. * testsuite/libgomp.c/scan-14.c: New test. * testsuite/libgomp.c/scan-15.c: New test. * testsuite/libgomp.c/scan-16.c: New test. * testsuite/libgomp.c/scan-17.c: New test. * testsuite/libgomp.c/scan-18.c: New test. * testsuite/libgomp.c++/scan-9.C: New test. * testsuite/libgomp.c++/scan-10.C: New test. * testsuite/libgomp.c++/scan-11.C: New test. * testsuite/libgomp.c++/scan-12.C: New test. * testsuite/libgomp.c++/scan-13.C: New test. * testsuite/libgomp.c++/scan-14.C: New test. * testsuite/libgomp.c++/scan-15.C: New test. * testsuite/libgomp.c++/scan-16.C: New test. From-SVN: r273157	2019-07-06 09:53:48 +02:00
Jakub Jelinek	2f03073ff2	omp-expand.c (expand_omp_for_static_nochunk): Don't emit GOMP_loop_start at the start of second worksharing loop in a scan. * omp-expand.c (expand_omp_for_static_nochunk): Don't emit GOMP_loop_start at the start of second worksharing loop in a scan. For nowait, don't emit GOMP_loop_end_nowait at the end of first worksharing loop in a scan even if there are conditional lastprivates, and do emit GOMP_loop_end_nowait at the end of second worksharing loop. * testsuite/libgomp.c/scan-9.c: New test. * testsuite/libgomp.c/scan-10.c: New test. From-SVN: r273095	2019-07-04 23:40:56 +02:00
Jakub Jelinek	2f6bb511d1	tree-core.h (enum omp_clause_code): Add OMP_CLAUSE__SCANTEMP_ clause. * tree-core.h (enum omp_clause_code): Add OMP_CLAUSE__SCANTEMP_ clause. * tree.h (OMP_CLAUSE_DECL): Use OMP_CLAUSE__SCANTEMP_ instead of OMP_CLAUSE__CONDTEMP_ as range's upper bound. (OMP_CLAUSE__SCANTEMP__ALLOC, OMP_CLAUSE__SCANTEMP__CONTROL): Define. * tree.c (omp_clause_num_ops, omp_clause_code_name): Add OMP_CLAUSE__SCANTEMP_ entry. (walk_tree_1): Handle OMP_CLAUSE__SCANTEMP_. * tree-pretty-print.c (dump_omp_clause): Likewise. * tree-nested.c (convert_nonlocal_omp_clauses, convert_local_omp_clauses): Likewise. * omp-general.h (struct omp_for_data): Add have_scantemp and have_nonctrl_scantemp members. * omp-general.c (omp_extract_for_data): Initialize them. * omp-low.c (struct omp_context): Add scan_exclusive member. (scan_omp_1_stmt): Don't unnecessarily mask gimple_omp_for_kind result again with GF_OMP_FOR_KIND_MASK. Initialize also ctx->scan_exclusive. (lower_rec_simd_input_clauses): Use ctx->scan_exclusive instead of !ctx->scan_inclusive. (lower_rec_input_clauses): Simplify gimplification of dtors using gimplify_and_add. For non-is_simd test OMP_CLAUSE_REDUCTION_INSCAN rather than rvarp. Handle OMP_CLAUSE_REDUCTION_INSCAN in worksharing loops. Don't add barrier for reduction_omp_orig_ref if ctx->scan_??xclusive. (lower_reduction_clauses): Don't do anything for ctx->scan_??xclusive. (lower_omp_scan): Use ctx->scan_exclusive instead of !ctx->scan_inclusive. Handle worksharing loops with inscan reductions. Use new_vard != new_var instead of repeated omp_is_reference calls. (omp_find_scan, lower_omp_for_scan): New functions. (lower_omp_for): Call lower_omp_for_scan for worksharing loops with inscan reductions. * omp-expand.c (expand_omp_scantemp_alloc): New function. (expand_omp_for_static_nochunk): Handle fd->have_nonctrl_scantemp and fd->have_scantemp. * c-c++-common/gomp/scan-3.c (f1): Don't expect a sorry message. * c-c++-common/gomp/scan-5.c (foo): Likewise. * testsuite/libgomp.c++/scan-1.C: New test. * testsuite/libgomp.c++/scan-2.C: New test. * testsuite/libgomp.c++/scan-3.C: New test. * testsuite/libgomp.c++/scan-4.C: New test. * testsuite/libgomp.c++/scan-5.C: New test. * testsuite/libgomp.c++/scan-6.C: New test. * testsuite/libgomp.c++/scan-7.C: New test. * testsuite/libgomp.c++/scan-8.C: New test. * testsuite/libgomp.c/scan-1.c: New test. * testsuite/libgomp.c/scan-2.c: New test. * testsuite/libgomp.c/scan-3.c: New test. * testsuite/libgomp.c/scan-4.c: New test. * testsuite/libgomp.c/scan-5.c: New test. * testsuite/libgomp.c/scan-6.c: New test. * testsuite/libgomp.c/scan-7.c: New test. * testsuite/libgomp.c/scan-8.c: New test. From-SVN: r272958	2019-07-03 07:03:58 +02:00
Jakub Jelinek	211b7533bf	re PR middle-end/90779 (Fortran array initialization in offload regions) PR middle-end/90779 * gimplify.c: Include omp-offload.h and context.h. (gimplify_bind_expr): Add "omp declare target" attributes to static block scope variables inside of target region or target functions. * c-c++-common/goacc/routine-5.c (func2): Don't expect error for static block scope variable in #pragma acc routine. * testsuite/libgomp.c/pr90779.c: New test. * testsuite/libgomp.fortran/pr90779.f90: New test. From-SVN: r272322	2019-06-15 09:09:04 +02:00

1 2 3 4 5

203 Commits