OpenE2K/gcc - gcc - Expired Mentality Git

Author	SHA1	Message	Date
Jakub Jelinek	91c771ec8a	openmp - Fix up && and || reductions [PR94366] As the testcase shows, the special treatment of && and || reduction combiners where we expand them as omp_out = (omp_out != 0) && (omp_in != 0) (or with ||) is not needed just for &&/|| on floating point or complex types, but for all &&/|| reductions - when expanded as omp_out = omp_out && omp_in (not in C but GENERIC) it is actually gimplified into NOP_EXPRs to bool from both operands, which turns non-zero values that are multiples of 2 into 0 rather than 1. This patch just treats all &&/|| the same and furthermore uses bool type instead of int for the comparisons. 2021-07-01 Jakub Jelinek <jakub@redhat.com> PR middle-end/94366 gcc/ * omp-low.c (lower_rec_input_clauses): Rename is_fp_and_or to is_truth_op, set it for TRUTH_IF_EXPR regardless of new_var's type, use boolean_type_node instead of integer_type_node as NE_EXPR type. (lower_reduction_clauses): Likewise. libgomp/ * testsuite/libgomp.c-c++-common/pr94366.c: New test.	2021-07-01 08:55:49 +02:00
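A minimal sketch of the difference described in the commit above, assuming the faulty conversion can be modelled as keeping only the lowest bit of each operand. The helper names `combine_and`, `combine_and_truncating` and `reduce_and` are hypothetical and do not appear in GCC; the first combiner uses the `(omp_out != 0) && (omp_in != 0)` form quoted in the message.

```c
#include <stddef.h>

/* Combiner in the form the commit message describes:
   omp_out = (omp_out != 0) && (omp_in != 0).  */
static int combine_and(int omp_out, int omp_in)
{
  return (omp_out != 0) && (omp_in != 0);
}

/* Hypothetical model of the bug: converting each operand to a 1-bit
   boolean by truncation keeps only the lowest bit, so even non-zero
   values become 0.  */
static int combine_and_truncating(int omp_out, int omp_in)
{
  return (omp_out & 1) && (omp_in & 1);
}

/* Fold VALUES with COMBINE, starting from 1, the identity for &&.  */
static int reduce_and(const int *values, size_t n, int (*combine)(int, int))
{
  int acc = 1;
  for (size_t i = 0; i < n; i++)
    acc = combine(acc, values[i]);
  return acc;
}
```

With the inputs 2, 4, 6 the comparing combiner yields 1, while the truncating model yields 0, which is the kind of wrong result the PR reports for values that are multiples of 2.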
Tobias Burnus	33c4e46624	Add 'default' to -foffload=; document that flag [PR67300] As -foffload={options,targets,targets=options} is very convoluted, it has been split into -foffload=targets (supporting the old syntax for backward compatibility) and -foffload-options={options,target=options}. Only the new syntax is documented. Additionally, -foffload=default is supported, which can reset the devices after -foffload=disable / -foffload=targets to the default, if needed. gcc/ChangeLog: PR other/67300 * common.opt (-foffload=): Update description. (-foffload-options=): New. * doc/invoke.texi (C Language Options): Document -foffload and -foffload-options. * gcc.c (check_offload_target_name): New, split off from handle_foffload_option. (check_foffload_target_names): New. (handle_foffload_option): Handle -foffload=default. (driver_handle_option): Update for -foffload-options. * lto-opts.c (lto_write_options): Use -foffload-options instead of -foffload. * lto-wrapper.c (merge_and_complain, append_offload_options): Likewise. * opts.c (common_handle_option): Likewise. libgomp/ChangeLog: PR other/67300 * testsuite/libgomp.c-c++-common/reduction-16.c: Replace -foffload=nvptx-none= by -foffload-options=nvptx-none= to avoid disabling other offload targets. * testsuite/libgomp.c-c++-common/reduction-5.c: Likewise. * testsuite/libgomp.c-c++-common/reduction-6.c: Likewise. * testsuite/libgomp.c/target-44.c: Likewise.	2021-06-29 16:00:04 +02:00
Tobias Burnus	489c5dcf7b	libgomp.fortran/defaultmap-8.f90: Fix non-shared memory handling Disable some more parts of the test as firstprivate does not work yet due to PR fortran/90742. libgomp/ * testsuite/libgomp.fortran/defaultmap-8.f90 (bar): Determine whether target has shared memory and disable some scalar pointer/allocatable checks if not as firstprivate does not work.	2021-06-29 15:50:23 +02:00
Chung-Lin Tang	e067201737	testsuite/101114: Adjust libgomp.c-c++-common/struct-elem-5.c testcase The dg-shouldfail testcase libgomp.c-c++-common/struct-elem-5.c does not properly fail for non-shared address space offloading. Adjust testcase to limit testing only for "target offload_device_nonshared_as". libgomp/ChangeLog: PR testsuite/101114 * testsuite/libgomp.c-c++-common/struct-elem-5.c: Add "target offload_device_nonshared_as" condition for enabling test.	2021-06-26 00:46:11 +08:00
Jakub Jelinek	7619d33471	openmp: in_reduction clause support on target construct This patch adds support for in_reduction clause on target construct, though for now only for synchronous targets (without nowait clause). The encountering thread in that case runs the target task and blocks until the target region ends, so it is implemented by remapping it before entering the target, initializing the private copy if not yet initialized for the current thread and then using the remapped addresses for the mapping addresses. For nowait combined with in_reduction the patch contains a hack where the nowait clause is ignored. To implement it correctly, I think we would need to create a new private variable for the in_reduction and initialize it before doing the async target and adjust the map addresses to that private variable and then pass a function pointer to the library routine with code where the callback would remap the address to the current thread's private variable and use in_reduction combiner to combine the private variable we've created into the thread's copy. The library would then need to make sure that the routine is called in some thread participating in the parallel (and not in an unshackled thread). 2021-06-24 Jakub Jelinek <jakub@redhat.com> gcc/ * tree.h (OMP_CLAUSE_MAP_IN_REDUCTION): Document meaning for OpenMP. * gimplify.c (gimplify_scan_omp_clauses): For OpenMP map clauses with OMP_CLAUSE_MAP_IN_REDUCTION flag partially defer gimplification of non-decl OMP_CLAUSE_DECL. For OMP_CLAUSE_IN_REDUCTION on OMP_TARGET use outer_ctx instead of ctx for placeholders and initializer/combiner gimplification. * omp-low.c (scan_sharing_clauses): Handle OMP_CLAUSE_MAP_IN_REDUCTION on target constructs. (lower_rec_input_clauses): Likewise. (lower_omp_target): Likewise. * omp-expand.c (expand_omp_target): Temporarily ignore nowait clause on target if in_reduction is present. gcc/c-family/ * c-common.h (enum c_omp_region_type): Add C_ORT_TARGET and C_ORT_OMP_TARGET.
* c-omp.c (c_omp_split_clauses): For OMP_CLAUSE_IN_REDUCTION on combined target constructs also add map (always, tofrom:) clause. gcc/c/ * c-parser.c (omp_split_clauses): Pass C_ORT_OMP_TARGET instead of C_ORT_OMP for clauses on target construct. (OMP_TARGET_CLAUSE_MASK): Add in_reduction clause. (c_parser_omp_target): For non-combined target add map (always, tofrom:) clauses for OMP_CLAUSE_IN_REDUCTION. Pass C_ORT_OMP_TARGET to c_finish_omp_clauses. * c-typeck.c (handle_omp_array_sections): Adjust ort handling for addition of C_ORT_OMP_TARGET and simplify, mapping clauses are never present on C_ORT_DECLARE_SIMD. (c_finish_omp_clauses): Likewise. Handle OMP_CLAUSE_IN_REDUCTION on C_ORT_OMP_TARGET, set OMP_CLAUSE_MAP_IN_REDUCTION on corresponding map clauses. gcc/cp/ * parser.c (cp_omp_split_clauses): Pass C_ORT_OMP_TARGET instead of C_ORT_OMP for clauses on target construct. (OMP_TARGET_CLAUSE_MASK): Add in_reduction clause. (cp_parser_omp_target): For non-combined target add map (always, tofrom:) clauses for OMP_CLAUSE_IN_REDUCTION. Pass C_ORT_OMP_TARGET to finish_omp_clauses. * semantics.c (handle_omp_array_sections_1): Adjust ort handling for addition of C_ORT_OMP_TARGET and simplify, mapping clauses are never present on C_ORT_DECLARE_SIMD. (handle_omp_array_sections): Likewise. (finish_omp_clauses): Likewise. Handle OMP_CLAUSE_IN_REDUCTION on C_ORT_OMP_TARGET, set OMP_CLAUSE_MAP_IN_REDUCTION on corresponding map clauses. * pt.c (tsubst_expr): Pass C_ORT_OMP_TARGET instead of C_ORT_OMP for clauses on target construct. gcc/testsuite/ * c-c++-common/gomp/target-in-reduction-1.c: New test. * c-c++-common/gomp/clauses-1.c: Add in_reduction clauses on target or combined target constructs. libgomp/ * testsuite/libgomp.c-c++-common/target-in-reduction-1.c: New test. * testsuite/libgomp.c-c++-common/target-in-reduction-2.c: New test. * testsuite/libgomp.c++/target-in-reduction-1.C: New test. * testsuite/libgomp.c++/target-in-reduction-2.C: New test.	
2021-06-24 11:35:08 +02:00
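A short sketch of what the in_reduction support on a synchronous target construct allows, assuming a compiler that already contains this patch and is invoked with -fopenmp (without -fopenmp the pragmas are ignored and the function runs serially with the same result). The function name `sum_with_target_in_reduction` is hypothetical and is not taken from the new tests.

```c
/* Sums 0..n-1 inside a target region and adds BONUS in a sibling task;
   both take part in the same task reduction on SUM.  */
int sum_with_target_in_reduction(int n, int bonus)
{
  int sum = 0;

  #pragma omp parallel
  #pragma omp single
  #pragma omp taskgroup task_reduction(+: sum)
  {
    /* Synchronous target (no nowait clause), the case the patch supports.  */
    #pragma omp target in_reduction(+: sum)
    for (int i = 0; i < n; i++)
      sum += i;

    /* An ordinary task participating in the same reduction.  */
    #pragma omp task in_reduction(+: sum)
    sum += bonus;
  }

  return sum;
}
```

The example deliberately avoids nowait on the target construct, since the commit message states that nowait is currently ignored when in_reduction is present.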
Jakub Jelinek	679506c383	openmp: Fix up *_reduction clause handling with UDRs on PARM_DECLs [PR101167] The following testcase FAILs, because the UDR combiner is invoked incorrectly. lower_omp_rec_clauses expects that when it sets DECL_VALUE_EXPR/DECL_HAS_VALUE_EXPR_P for both the placeholder and the var that everything will be properly regimplified, but as the variable in question is a PARM_DECL rather than VAR_DECL, lower_omp_regimplify_p doesn't say that it should be regimplified and so it is not. 2021-06-23 Jakub Jelinek <jakub@redhat.com> PR middle-end/101167 * omp-low.c (lower_omp_regimplify_p): Regimplify also PARM_DECLs and RESULT_DECLs that have DECL_HAS_VALUE_EXPR_P set. * testsuite/libgomp.c-c++-common/task-reduction-15.c: New test.	2021-06-23 10:03:28 +02:00
Chung-Lin Tang	275c736e73	libgomp: Structure element mapping for OpenMP 5.0 This patch implements the OpenMP 5.0 requirement of incrementing/decrementing the reference count of a mapped structure at most once (across all elements) on a construct. This is implemented by pulling in libgomp/hashtab.h and using htab_t as a pointer set. Structure element list siblings also have pointers-to-refcounts linked together, to naturally achieve uniform increment/decrement without repeating. There are still some questions on whether using such a htab_t based set is faster/slower than using a sorted pointer array based implementation. This is to be researched later. libgomp/ChangeLog: * hashtab.h (htab_clear): New function with initialization code factored out from... (htab_create): ...here, adjust to use htab_clear function. * libgomp.h (REFCOUNT_SPECIAL): New symbol to denote range of special refcount values, add comments. (REFCOUNT_INFINITY): Adjust definition to use REFCOUNT_SPECIAL. (REFCOUNT_LINK): Likewise. (REFCOUNT_STRUCTELEM): New special refcount range for structure element siblings. (REFCOUNT_STRUCTELEM_P): Macro for testing for structure element sibling maps. (REFCOUNT_STRUCTELEM_FLAG_FIRST): Flag to indicate first sibling. (REFCOUNT_STRUCTELEM_FLAG_LAST): Flag to indicate last sibling. (REFCOUNT_STRUCTELEM_FIRST_P): Macro to test _FIRST flag. (REFCOUNT_STRUCTELEM_LAST_P): Macro to test _LAST flag. (struct splay_tree_key_s): Add structelem_refcount and structelem_refcount_ptr fields into a union with dynamic_refcount. Add comments. (gomp_map_vars): Delete declaration. (gomp_map_vars_async): Likewise. (gomp_unmap_vars): Likewise. (gomp_unmap_vars_async): Likewise. (goacc_map_vars): New declaration. (goacc_unmap_vars): Likewise. * oacc-mem.c (acc_map_data): Adjust to use goacc_map_vars. (goacc_enter_datum): Likewise. (goacc_enter_data_internal): Likewise. * oacc-parallel.c (GOACC_parallel_keyed): Adjust to use goacc_map_vars and goacc_unmap_vars.
(GOACC_data_start): Adjust to use goacc_map_vars. (GOACC_data_end): Adjust to use goacc_unmap_vars. * target.c (hash_entry_type): New typedef. (htab_alloc): New function hook for hashtab.h. (htab_free): Likewise. (htab_hash): Likewise. (htab_eq): Likewise. (hashtab.h): Add file include. (gomp_increment_refcount): New function. (gomp_decrement_refcount): Likewise. (gomp_map_vars_existing): Add refcount_set parameter, adjust to use gomp_increment_refcount. (gomp_map_fields_existing): Add refcount_set parameter, adjust calls to gomp_map_vars_existing. (gomp_map_vars_internal): Add refcount_set parameter, add local openmp_p variable to guard OpenMP specific paths, adjust calls to gomp_map_vars_existing, add structure element sibling splay_tree_key sequence creation code, adjust Fortran map case to avoid increment under OpenMP. (gomp_map_vars): Adjust to static, add refcount_set parameter, manage local refcount_set if caller passed in NULL, adjust call to gomp_map_vars_internal. (gomp_map_vars_async): Adjust and rename into... (goacc_map_vars): ...this new function, adjust call to gomp_map_vars_internal. (gomp_remove_splay_tree_key): New function with code factored out from gomp_remove_var_internal. (gomp_remove_var_internal): Add code to handle removing multiple splay_tree_key sequence for structure elements, adjust code to use gomp_remove_splay_tree_key for splay-tree key removal. (gomp_unmap_vars_internal): Add refcount_set parameter, adjust to use gomp_decrement_refcount. (gomp_unmap_vars): Adjust to static, add refcount_set parameter, manage local refcount_set if caller passed in NULL, adjust call to gomp_unmap_vars_internal. (gomp_unmap_vars_async): Adjust and rename into... (goacc_unmap_vars): ...this new function, adjust call to gomp_unmap_vars_internal. (GOMP_target): Manage refcount_set and adjust calls to gomp_map_vars and gomp_unmap_vars. (GOMP_target_ext): Likewise. (gomp_target_data_fallback): Adjust call to gomp_map_vars. (GOMP_target_data): Likewise. 
(GOMP_target_data_ext): Likewise. (GOMP_target_end_data): Adjust call to gomp_unmap_vars. (gomp_exit_data): Add refcount_set parameter, adjust to use gomp_decrement_refcount, adjust to queue splay-tree keys for removal after main loop. (GOMP_target_enter_exit_data): Manage refcount_set and adjust calls to gomp_map_vars and gomp_exit_data. (gomp_target_task_fn): Likewise. * testsuite/libgomp.c-c++-common/refcount-1.c: New testcase. * testsuite/libgomp.c-c++-common/struct-elem-1.c: New testcase. * testsuite/libgomp.c-c++-common/struct-elem-2.c: New testcase. * testsuite/libgomp.c-c++-common/struct-elem-3.c: New testcase. * testsuite/libgomp.c-c++-common/struct-elem-4.c: New testcase. * testsuite/libgomp.c-c++-common/struct-elem-5.c: New testcase.	2021-06-17 21:34:59 +08:00
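The "at most once per construct" rule from the commit above can be sketched as follows. This is only an illustration under stated assumptions: the fixed-size array set, `struct refcount_set`, `set_add`, `increment_refcount_once` and `map_struct_elements` are hypothetical stand-ins and are not the htab_t-based code in libgomp/target.c.

```c
#include <stddef.h>

#define MAX_SEEN 16

/* Hypothetical stand-in for the per-construct pointer set; libgomp uses
   an htab_t from hashtab.h instead of a linear array.  */
struct refcount_set
{
  size_t *seen[MAX_SEEN];
  size_t count;
};

/* Return 1 if PTR was newly recorded, 0 if it was already present
   (or if the fixed-size set is full).  */
static int set_add(struct refcount_set *set, size_t *ptr)
{
  for (size_t i = 0; i < set->count; i++)
    if (set->seen[i] == ptr)
      return 0;
  if (set->count == MAX_SEEN)
    return 0;
  set->seen[set->count++] = ptr;
  return 1;
}

/* Increment *REFCOUNT only the first time it is seen on this construct.  */
static void increment_refcount_once(size_t *refcount, struct refcount_set *set)
{
  if (set_add(set, refcount))
    ++*refcount;
}

/* Model one construct that maps N_ELEMS sibling elements of a structure,
   all of which share SHARED_REFCOUNT.  Returns the resulting count.  */
static size_t map_struct_elements(size_t *shared_refcount, int n_elems)
{
  struct refcount_set set = { .count = 0 };
  for (int i = 0; i < n_elems; i++)
    increment_refcount_once(shared_refcount, &set);
  return *shared_refcount;
}
```

Mapping three sibling elements on one construct raises the shared count by one, and a second construct (with a fresh set) raises it by one more, rather than by three each time.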
Tobias Burnus	1de31913d2	Fortran/OpenMP: Extend defaultmap clause for OpenMP 5 [PR92568] PR fortran/92568 gcc/fortran/ChangeLog: * dump-parse-tree.c (show_omp_clauses): Update for defaultmap. * f95-lang.c (LANG_HOOKS_OMP_ALLOCATABLE_P, LANG_HOOKS_OMP_SCALAR_TARGET_P): New. * gfortran.h (enum gfc_omp_defaultmap, enum gfc_omp_defaultmap_category): New. * openmp.c (gfc_match_omp_clauses): Update defaultmap matching. * trans-decl.c (gfc_finish_decl_attrs): Set GFC_DECL_SCALAR_TARGET. * trans-openmp.c (gfc_omp_allocatable_p, gfc_omp_scalar_target_p): New. (gfc_omp_scalar_p): Take 'ptr_alloc_ok' argument. (gfc_trans_omp_clauses, gfc_split_omp_clauses): Update for defaultmap changes. * trans.h (gfc_omp_scalar_p): Update prototype. (gfc_omp_allocatable_p, gfc_omp_scalar_target_p): New. (struct lang_decl): Add scalar_target. (GFC_DECL_SCALAR_TARGET, GFC_DECL_GET_SCALAR_TARGET): New. gcc/ChangeLog: * gimplify.c (enum gimplify_defaultmap_kind): Add GDMK_SCALAR_TARGET. (struct gimplify_omp_ctx): Extend defaultmap array by one. (new_omp_context): Init defaultmap[GDMK_SCALAR_TARGET]. (omp_notice_variable): Update type classification for Fortran. (gimplify_scan_omp_clauses): Update calls for new argument; handle GDMK_SCALAR_TARGET; for Fortran, GDMK_POINTER avoid GOVD_MAP_0LEN_ARRAY. * langhooks-def.h (lhd_omp_scalar_p): Add 'ptr_ok' argument. * langhooks.c (lhd_omp_scalar_p): Likewise. (LANG_HOOKS_OMP_ALLOCATABLE_P, LANG_HOOKS_OMP_SCALAR_TARGET_P): New. (LANG_HOOKS_DECLS): Add them. * langhooks.h (struct lang_hooks_for_decls): Add new hooks, update omp_scalar_p pointer type to include the new bool argument. libgomp/ChangeLog: * testsuite/libgomp.fortran/defaultmap-8.f90: New test. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/pr99928-1.f90: Uncomment 'defaultmap(none)'. * gfortran.dg/gomp/pr99928-2.f90: Uncomment 'defaultmap(none)'. * gfortran.dg/gomp/pr99928-3.f90: Uncomment 'defaultmap(none)'. * gfortran.dg/gomp/pr99928-4.f90: Uncomment 'defaultmap(none)'. 
* gfortran.dg/gomp/pr99928-5.f90: Uncomment 'defaultmap(none)'. * gfortran.dg/gomp/pr99928-6.f90: Uncomment 'defaultmap(none)'. * gfortran.dg/gomp/pr99928-8.f90: Uncomment 'defaultmap(none)'. * gfortran.dg/gomp/defaultmap-1.f90: New test. * gfortran.dg/gomp/defaultmap-2.f90: New test. * gfortran.dg/gomp/defaultmap-3.f90: New test. * gfortran.dg/gomp/defaultmap-4.f90: New test. * gfortran.dg/gomp/defaultmap-5.f90: New test. * gfortran.dg/gomp/defaultmap-6.f90: New test. * gfortran.dg/gomp/defaultmap-7.f90: New test.	2021-06-15 16:07:11 +02:00
Jakub Jelinek	7d19a50ea1	testsuite: Fix up libgomp.fortran/pr100981-2.f90 testcase [PR100981] The dsdotr and dsdoti variables are uninitialized and the testcase fails e.g. on i686-linux. Fixed by zero initialization. 2021-06-10 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/100981 * testsuite/libgomp.fortran/pr100981-2.f90 (cdcdot): Initialize dsdotr and dsdoti to 0.	2021-06-10 09:31:06 +02:00
H.J. Lu	c8d581bdf7	libgomp: Compile tests with -march=i486 only if needed Don't add -march=i486 if atomic compare-and-swap is supported on 'int'. This fixes libgomp tests with "-march=x86-64 -m32 -fcf-protection". * testsuite/lib/libgomp.exp (libgomp_init): Don't add -march=i486 if atomic compare-and-swap is supported on 'int'.	2021-06-09 10:05:40 -07:00
Richard Biener	374f93da97	tree-optimization/100981 - fix SLP patterns involving reductions The following fixes the SLP FMA patterns to preserve reduction info and the reduction vectorization to consider internal function call defs for the reduction stmt. 2021-06-09 Richard Biener <rguenther@suse.de> PR tree-optimization/100981 gcc/ * tree-vect-loop.c (vect_create_epilog_for_reduction): Use gimple_get_lhs to also handle calls. * tree-vect-slp-patterns.c (complex_pattern::build): Transfer reduction info. gcc/testsuite/ * gfortran.dg/vect/pr100981-1.f90: New testcase. libgomp/ * testsuite/libgomp.fortran/pr100981-2.f90: New testcase.	2021-06-09 16:33:18 +02:00
Thomas Schwinge	30656822b3	[GCN] Fix run-time variable 'num_workers' ... which currently has not been forced to 'num_workers (1)'. In addition to the testcases modified here, this also fixes: FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/mode-transitions.c -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa -O0 execution test [Etc.] mode-transitions.exe: [...]/libgomp.oacc-c-c++-common/mode-transitions.c:702: t17: Assertion `arr_b[i] == (i ^ 31) * 8' failed. libgomp/ * plugin/plugin-gcn.c (gcn_exec): Force 'num_workers (1)' unconditionally. * testsuite/libgomp.oacc-c-c++-common/acc_prof-kernels-1.c: Update. * testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/routine-wv-2.c: Likewise.	2021-06-08 12:00:15 +02:00
Thomas Schwinge	c68ddd5e2a	Enable more 'libgomp.oacc-*/lib-*' testcases for non-'openacc_nvidia_accel_selected' libgomp/ * testsuite/libgomp.oacc-c-c++-common/lib-11.c: Enable for all but '-DACC_MEM_SHARED=0'. * testsuite/libgomp.oacc-c-c++-common/lib-13.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-14.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-15.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-20.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-23.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-24.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-34.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-42.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-44.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-48.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-88.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-89.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-92.c: Likewise. * testsuite/libgomp.oacc-fortran/lib-14.f90: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-5.c: Add 'acc_device_radeon' testing. * testsuite/libgomp.oacc-c-c++-common/lib-6.c: Likewise. * testsuite/libgomp.oacc-fortran/lib-5.f90: Likewise. * testsuite/libgomp.oacc-fortran/lib-7.f90: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-52.c: Enable for all. * testsuite/libgomp.oacc-c-c++-common/lib-53.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-54.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-86.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-87.c: Likewise. * testsuite/libgomp.oacc-fortran/lib-10.f90: Likewise. * testsuite/libgomp.oacc-fortran/lib-8.f90: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-57.c: Improve checking for non-'openacc_nvidia_accel_selected'. * testsuite/libgomp.oacc-c-c++-common/lib-58.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-62.c: Clarify that "Not all implement this checking". * testsuite/libgomp.oacc-c-c++-common/lib-63.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-64.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-65.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-67.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-68.c: Likewise.	2021-06-08 11:51:45 +02:00
Thomas Schwinge	32099c0d24	Fix 'libgomp.oacc-fortran/parallel-dims.f90' for 'acc_device_radeon' ..., by simplifying 'libgomp.oacc-c-c++-common/parallel-dims.c', and updating the former correspondingly. '__builtin_goacc_parlevel_id' does the right thing for all 'acc_device_*'. Follow-up to commit `09e0ad6253` "Update OpenACC tests for amdgcn". libgomp/ * testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Simplify. * testsuite/libgomp.oacc-fortran/parallel-dims-aux.c: Update.	2021-06-08 11:41:52 +02:00
Thomas Schwinge	984df1e163	Fix 'libgomp.oacc-c-c++-common/acc_prof-kernels-1.c' for 'acc_device_radeon' ... on top of r279378 (commit `26b74ed022`) "Update OpenACC tests for amdgcn". libgomp/ * testsuite/libgomp.oacc-c-c++-common/acc_prof-kernels-1.c: Fix for 'acc_device_radeon'.	2021-06-08 11:33:41 +02:00
Thomas Schwinge	292fb10beb	Enhance 'libgomp.oacc-c-c++-common/firstprivate-1.c' for non-'acc_device_nvidia' libgomp/ * testsuite/libgomp.oacc-c-c++-common/firstprivate-1.c: Enhance for non-'acc_device_nvidia'.	2021-06-08 11:31:49 +02:00
Thomas Schwinge	97a040e987	Add 'acc_device_radeon' testing to 'libgomp.oacc-*/acc_on_device-*' libgomp/ * testsuite/libgomp.oacc-c-c++-common/acc_on_device-1.c: Add 'acc_device_radeon' testing. * testsuite/libgomp.oacc-fortran/acc_on_device-1-1.f90: Likewise. * testsuite/libgomp.oacc-fortran/acc_on_device-1-2.f: Likewise. * testsuite/libgomp.oacc-fortran/acc_on_device-1-3.f: Likewise.	2021-06-08 11:28:53 +02:00
Thomas Schwinge	89c1a427a1	Don't require 'openacc_nvidia_accel_selected' in 'libgomp.oacc-c-c++-common/async_queue-1.c' That is, re-enable it for host-fallback, and enable it for GCN offloading. Fix-up for r279378 (commit `26b74ed022`) "Update OpenACC tests for amdgcn". libgomp/ * testsuite/libgomp.oacc-c-c++-common/async_queue-1.c: Don't require 'openacc_nvidia_accel_selected'. Fix up for 'ACC_DEVICE_TYPE_radeon'.	2021-06-08 11:23:31 +02:00
Thomas Schwinge	77f41a5c4e	Don't require 'openacc_nvidia_accel_selected' in additional 'libgomp.oacc-*/declare-*' Like r253779 (commit `92d5d01ac6`) "Enable libgomp.oacc-*/declare-*.{c,f90} for non-nvidia devices". libgomp/ * testsuite/libgomp.oacc-c++/declare-1.C: Don't require 'openacc_nvidia_accel_selected'. * testsuite/libgomp.oacc-c-c++-common/declare-3.c: Likewise.	2021-06-08 11:21:47 +02:00
Thomas Schwinge	f9da798ba6	[GCN] Streamline 'libgomp/testsuite/lib/libgomp.exp:check_effective_target_openacc_radeon_accel_selected' The GCN support that got added in r278935 (commit `83caa34e2a`) "Enable OpenACC GCN testing" was forked before my r269107 (commit `ee332b4a9a`) "[libgomp] Clarify difference between offload target, offload plugin, and OpenACC device type", and didn't later pick up these changes. No functional change. libgomp/ * testsuite/lib/libgomp.exp (check_effective_target_openacc_radeon_accel_selected): Streamline.	2021-06-08 11:16:21 +02:00
Thomas Schwinge	0886426f5f	Revert PR80547 workaround in 'libgomp.oacc-c-c++-common/parallel-dims.c' This problem has been fixed long ago, in r267934 (commit `d41d952c9b`) "[nvptx] Handle assignment to gang-level reduction variable". libgomp/ * testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Revert PR80547 workaround.	2021-06-08 11:10:55 +02:00
Thomas Schwinge	e64d62c700	[nvptx] Update comment in 'libgomp.oacc-c-c++-common/parallel-dims.c' Small fix-up for r267889 (commit `2b9d9e3937`) "[nvptx] Enable large vectors": > * testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Expect vector > length 2097152 to be reduced to 1024 instead of 32. libgomp/ * testsuite/libgomp.oacc-c-c++-common/parallel-dims.c <acc_device_nvidia>: Update comment.	2021-06-08 11:06:30 +02:00
Tobias Burnus	9a5de4d5af	OpenMP: Add iterator support to Fortran's depend; add affinity clause gcc/c-family/ChangeLog: * c-pragma.h (enum pragma_omp_clause): Add PRAGMA_OMP_CLAUSE_AFFINITY. gcc/c/ChangeLog: * c-parser.c (c_parser_omp_clause_affinity): New. (c_parser_omp_clause_name, c_parser_omp_variable_list, c_parser_omp_all_clauses, OMP_TASK_CLAUSE_MASK): Handle affinity clause. * c-typeck.c (handle_omp_array_sections_1, handle_omp_array_sections, c_finish_omp_clauses): Likewise. gcc/cp/ChangeLog: * parser.c (cp_parser_omp_clause_affinity): New. (cp_parser_omp_clause_name, cp_parser_omp_var_list_no_open, cp_parser_omp_all_clauses, OMP_TASK_CLAUSE_MASK): Handle affinity clause. * semantics.c (handle_omp_array_sections_1, handle_omp_array_sections, finish_omp_clauses): Likewise. gcc/fortran/ChangeLog: * dump-parse-tree.c (show_iterator): New. (show_omp_namelist): Handle iterators. (show_omp_clauses): Handle affinity. * gfortran.h (gfc_free_omp_namelist): New union with 'udr' and new 'ns'. * match.c (gfc_free_omp_namelist): Add arg to choose union element. * openmp.c (gfc_free_omp_clauses, gfc_match_omp_detach, gfc_match_omp_clause_reduction, gfc_match_omp_flush): Update call to gfc_free_omp_namelist. (gfc_match_omp_variable_list): Likewise; permit preceding whitespace. (enum omp_mask1): Add OMP_CLAUSE_AFFINITY. (gfc_match_iterator): New. (gfc_match_omp_clauses): Use it; update call to gfc_free_omp_namelist. (OMP_TASK_CLAUSES): Add OMP_CLAUSE_AFFINITY. (gfc_match_omp_taskwait): Match depend clause. (resolve_omp_clauses): Handle affinity; update for udr/union change. (gfc_resolve_omp_directive): Resolve clauses of taskwait. * st.c (gfc_free_statement): Update gfc_free_omp_namelist call. * trans-openmp.c (gfc_trans_omp_array_reduction_or_udr): Likewise. (handle_iterator): New. (gfc_trans_omp_clauses): Handle iterators for depend/affinity clause. (gfc_trans_omp_taskwait): Handle depend clause. (gfc_trans_omp_directive): Update call.
gcc/ChangeLog: * gimplify.c (gimplify_omp_affinity): New. (gimplify_scan_omp_clauses): Call it; remove affinity clause afterwards. * tree-core.h (enum omp_clause_code): Add OMP_CLAUSE_AFFINITY. * tree-pretty-print.c (dump_omp_clause): Handle OMP_CLAUSE_AFFINITY. * tree.c (omp_clause_num_ops, omp_clause_code_name): Add clause. (walk_tree_1): Handle OMP_CLAUSE_AFFINITY. libgomp/ChangeLog: * testsuite/libgomp.fortran/depend-iterator-2.f90: New test. gcc/testsuite/ChangeLog: * c-c++-common/gomp/affinity-1.c: New test. * c-c++-common/gomp/affinity-2.c: New test. * c-c++-common/gomp/affinity-3.c: New test. * c-c++-common/gomp/affinity-4.c: New test. * c-c++-common/gomp/affinity-5.c: New test. * c-c++-common/gomp/affinity-6.c: New test. * c-c++-common/gomp/affinity-7.c: New test. * gfortran.dg/gomp/affinity-clause-1.f90: New test. * gfortran.dg/gomp/affinity-clause-2.f90: New test. * gfortran.dg/gomp/affinity-clause-3.f90: New test. * gfortran.dg/gomp/affinity-clause-4.f90: New test. * gfortran.dg/gomp/affinity-clause-5.f90: New test. * gfortran.dg/gomp/affinity-clause-6.f90: New test. * gfortran.dg/gomp/depend-iterator-1.f90: New test. * gfortran.dg/gomp/depend-iterator-2.f90: New test. * gfortran.dg/gomp/depend-iterator-3.f90: New test. * gfortran.dg/gomp/taskwait.f90: New test.	2021-05-28 10:46:23 +02:00
Jakub Jelinek	79e3f7d54b	libgomp: Add openacc_{cuda,cublas,cudart} effective targets and use them in openacc testsuite When gcc is configured for nvptx offloading with --without-cuda-driver and full CUDA isn't installed, many libgomp.oacc-*/* tests fail, some of them because the cuda.h header can't be found, others because the tests can't be linked against -lcuda, -lcudart or -lcublas. I usually only have akmod-nvidia and xorg-x11-drv-nvidia-cuda rpms installed, so libcuda.so.1 can be dlopened and the offloading works, but linking against those libraries isn't possible nor are the headers around (for the plugin itself there is the fallback libgomp/plugin/cuda/cuda.h). The following patch adds 3 new effective targets and uses them in tests that need them. 2021-05-27 Jakub Jelinek <jakub@redhat.com> * testsuite/lib/libgomp.exp (check_effective_target_openacc_cuda, check_effective_target_openacc_cublas, check_effective_target_openacc_cudart): New. * testsuite/libgomp.oacc-fortran/host_data-4.f90: Require effective target openacc_cublas. * testsuite/libgomp.oacc-fortran/host_data-2.f90: Likewise. * testsuite/libgomp.oacc-fortran/host_data-3.f: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-91.c: Require effective target openacc_cuda. * testsuite/libgomp.oacc-c-c++-common/lib-70.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-90.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-75.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-69.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-74.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-81.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-72.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-85.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/pr87835.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-82.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-73.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-83.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-78.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-76.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-84.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-79.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/host_data-1.c: Require effective targets openacc_cublas and openacc_cudart. * testsuite/libgomp.oacc-c-c++-common/context-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/context-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/context-3.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/context-4.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/acc_get_property-nvptx.c: Require effective target openacc_cudart. * testsuite/libgomp.oacc-c-c++-common/asyncwait-1.c: Add -DUSE_CUDA_H for effective target openacc_cuda and add && defined USE_CUDA_H to preprocessor conditionals. Guard -lcuda also on openacc_cuda effective target.	2021-05-27 22:44:36 +02:00
Jakub Jelinek	95d6776217	openmp: Fix up handling of target constructs in offloaded routines [PR100573] OpenMP Nesting of Regions restrictions say: - If a target update, target data, target enter data, or target exit data construct is encountered during execution of a target region, the behavior is unspecified. - If a target construct is encountered during execution of a target region and a device clause in which the ancestor device-modifier appears is not present on the construct, the behavior is unspecified. That wording is about the dynamic (runtime) behavior, not about lexical nesting, so while it is UB if omp target * is encountered in the target region, we need to make it compile and link (for lexical nesting of target * inside of target we actually emit a warning). To make this work, I had to do multiple changes. One was to mark .omp_data_{sizes,kinds}.* variables when static as "omp declare target". Another one was to add stub GOMP_target* entrypoints to nvptx and gcn libgomp.a. The entrypoint functions shouldn't be called or passed in the offload regions, otherwise libgomp: cuLaunchKernel error: too many resources requested for launch was reported; fixed by changing those arguments of calls to GOMP_target_ext to NULL. And we didn't mark the entrypoints "omp target entrypoint" when the caller has been "omp declare target". 2021-05-26 Jakub Jelinek <jakub@redhat.com> PR libgomp/100573 gcc/ * omp-low.c: Include omp-offload.h. (create_omp_child_function): If current_function_decl has "omp declare target" attribute and is_gimple_omp_offloaded, remove that attribute from the copy of attribute list and add "omp target entrypoint" attribute instead. (lower_omp_target): Mark .omp_data_sizes.* and .omp_data_kinds.* variables for offloading if in omp_maybe_offloaded_ctx. * omp-offload.c (pass_omp_target_link::execute): Nullify second argument to GOMP_target_data_ext in offloaded code. 
libgomp/ * config/nvptx/target.c (GOMP_target_ext, GOMP_target_data_ext, GOMP_target_end_data, GOMP_target_update_ext, GOMP_target_enter_exit_data): New dummy entrypoints. * config/gcn/target.c (GOMP_target_ext, GOMP_target_data_ext, GOMP_target_end_data, GOMP_target_update_ext, GOMP_target_enter_exit_data): Likewise. * testsuite/libgomp.c-c++-common/for-3.c (DO_PRAGMA, OMPTEAMS, OMPFROM, OMPTO): Define. (main): Remove #pragma omp target teams around all the tests. * testsuite/libgomp.c-c++-common/target-41.c: New test. * testsuite/libgomp.c-c++-common/target-42.c: New test.	2021-05-26 11:28:42 +02:00
Jakub Jelinek	3a81735c1c	openmp: Fix reduction clause handling on teams distribute simd [PR99928] When a directive isn't combined with worksharing-loop, it takes much simpler clause splitting path for reduction, and that one was missing handling of teams when combined with simd. 2021-05-25 Jakub Jelinek <jakub@redhat.com> PR middle-end/99928 gcc/c-family/ * c-omp.c (c_omp_split_clauses): Copy reduction to teams when teams is combined with simd and not with taskloop or for. gcc/testsuite/ * c-c++-common/gomp/pr99928-8.c: Remove xfails from omp teams r21 and r28 checks. * c-c++-common/gomp/pr99928-9.c: Likewise. * c-c++-common/gomp/pr99928-10.c: Likewise. libgomp/ * testsuite/libgomp.c-c++-common/reduction-17.c: New test.	2021-05-25 11:07:01 +02:00
Tobias Burnus	0e3b3b77e1	OpenMP/Fortran: Handle polymorphic scalars in data-sharing FIRSTPRIVATE [PR86470] gcc/fortran/ChangeLog: PR fortran/86470 * trans-expr.c (gfc_copy_class_to_class): Add unshare_expr. * trans-openmp.c (gfc_is_polymorphic_nonptr, gfc_is_unlimited_polymorphic_nonptr): New. (gfc_omp_clause_copy_ctor, gfc_omp_clause_dtor): Handle polymorphic scalars. libgomp/ChangeLog: PR fortran/86470 * testsuite/libgomp.fortran/class-firstprivate-1.f90: New test. * testsuite/libgomp.fortran/class-firstprivate-2.f90: New test. * testsuite/libgomp.fortran/class-firstprivate-3.f90: New test. gcc/testsuite/ChangeLog: PR fortran/86470 * gfortran.dg/gomp/class-firstprivate-1.f90: New test. * gfortran.dg/gomp/class-firstprivate-2.f90: New test. * gfortran.dg/gomp/class-firstprivate-3.f90: New test. * gfortran.dg/gomp/class-firstprivate-4.f90: New test.	2021-05-24 16:50:51 +02:00
Thomas Schwinge	3050a1a182	[OpenACC privatization] Prune uninteresting/varying diagnostics in 'libgomp.oacc-fortran/privatized-ref-2.f90' Minor fix-up for my recent commit `11b8286a83` "[OpenACC privatization] Largely extend diagnostics and corresponding testsuite coverage [PR90115]". libgomp/ PR testsuite/90115 * testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: Prune uninteresting/varying diagnostics. Reported-by: Sunil K Pandey <skpandey@sc.intel.com>	2021-05-22 10:37:17 +02:00
Thomas Schwinge	325aa13996	[OpenACC privatization] Reject 'static', 'external' in blocks [PR90115] gcc/ PR middle-end/90115 * omp-low.c (oacc_privatization_candidate_p): Reject 'static', 'external' in blocks. gcc/testsuite/ PR middle-end/90115 * c-c++-common/goacc/privatization-1-compute-loop.c: Update. * c-c++-common/goacc/privatization-1-compute.c: Likewise. * c-c++-common/goacc/privatization-1-routine_gang-loop.c: Likewise. * c-c++-common/goacc/privatization-1-routine_gang.c: Likewise. libgomp/ PR middle-end/90115 * testsuite/libgomp.oacc-c-c++-common/static-variable-1.c: Update. * testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: Likewise.	2021-05-21 20:23:34 +02:00
Thomas Schwinge	11b8286a83	[OpenACC privatization] Largely extend diagnostics and corresponding testsuite coverage [PR90115] gcc/ PR middle-end/90115 * flag-types.h (enum openacc_privatization): New. * params.opt (-param=openacc-privatization): New. * doc/invoke.texi (openacc-privatization): Document it. * omp-general.h (get_openacc_privatization_dump_flags): New function. * omp-low.c (oacc_privatization_candidate_p): Add diagnostics. * omp-offload.c (execute_oacc_device_lower) <IFN_UNIQUE_OACC_PRIVATE>: Re-work diagnostics. * target.def (goacc.adjust_private_decl): Add 'location_t' parameter. * doc/tm.texi: Regenerate. * config/gcn/gcn-protos.h (gcn_goacc_adjust_private_decl): Adjust. * config/gcn/gcn-tree.c (gcn_goacc_adjust_private_decl): Likewise. * config/nvptx/nvptx.c (nvptx_goacc_adjust_private_decl): Likewise. Preserve it for... (nvptx_goacc_expand_var_decl): ... use here. gcc/testsuite/ PR middle-end/90115 * c-c++-common/goacc/privatization-1-compute-loop.c: New file. * c-c++-common/goacc/privatization-1-compute.c: Likewise. * c-c++-common/goacc/privatization-1-routine_gang-loop.c: Likewise. * c-c++-common/goacc/privatization-1-routine_gang.c: Likewise. * gfortran.dg/goacc/privatization-1-compute-loop.f90: Likewise. * gfortran.dg/goacc/privatization-1-compute.f90: Likewise. * gfortran.dg/goacc/privatization-1-routine_gang-loop.f90: Likewise. * gfortran.dg/goacc/privatization-1-routine_gang.f90: Likewise. * c-c++-common/goacc-gomp/nesting-1.c: Update. * c-c++-common/goacc/private-reduction-1.c: Likewise. * gfortran.dg/goacc/private-3.f95: Likewise. libgomp/ PR middle-end/90115 * testsuite/libgomp.oacc-fortran/private-atomic-1-vector.f90: New file. * testsuite/libgomp.oacc-c-c++-common/firstprivate-1.c: Update. * testsuite/libgomp.oacc-c-c++-common/host_data-7.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-1.c: Likewise. 
* testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-3.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-4.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-local-worker-5.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-3.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-4.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-5.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-gang-6.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-vector-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-vector-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-3.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-4.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-5.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-6.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/kernels-private-vars-loop-worker-7.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-g-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-g-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-gwv-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-gwv-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-red-g-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-red-gwv-1.c: Likewise. 
* testsuite/libgomp.oacc-c-c++-common/loop-red-v-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-red-v-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-red-w-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-red-w-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-red-wv-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-v-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-w-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/loop-wv-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/parallel-reduction.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/private-atomic-1-gang.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/private-atomic-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/private-variables.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/routine-4.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/static-variable-1.c: Likewise. * testsuite/libgomp.oacc-fortran/acc_on_device-1-1.f90: Likewise. * testsuite/libgomp.oacc-fortran/acc_on_device-1-2.f: Likewise. * testsuite/libgomp.oacc-fortran/acc_on_device-1-3.f: Likewise. * testsuite/libgomp.oacc-fortran/declare-1.f90: Likewise. * testsuite/libgomp.oacc-fortran/host_data-5.F90: Likewise. * testsuite/libgomp.oacc-fortran/if-1.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-1.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-2.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-3.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-gang-6.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-vector-1.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-vector-2.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-1.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-2.f90: Likewise. 
* testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-3.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-4.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-5.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-6.f90: Likewise. * testsuite/libgomp.oacc-fortran/kernels-private-vars-loop-worker-7.f90: Likewise. * testsuite/libgomp.oacc-fortran/optional-private.f90: Likewise. * testsuite/libgomp.oacc-fortran/parallel-dims.f90: Likewise. * testsuite/libgomp.oacc-fortran/private-atomic-1-gang.f90: Likewise. * testsuite/libgomp.oacc-fortran/private-atomic-1-worker.f90: Likewise. * testsuite/libgomp.oacc-fortran/private-variables.f90: Likewise. * testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: Likewise. * testsuite/libgomp.oacc-fortran/routine-7.f90: Likewise.	2021-05-21 20:09:59 +02:00
Julian Brown	29a2f51806	openacc: Add support for gang local storage allocation in shared memory [PR90115] This patch implements a method to track the "private-ness" of OpenACC variables declared in offload regions in gang-partitioned, worker-partitioned or vector-partitioned modes. Variables declared implicitly in scoped blocks and those declared "private" on enclosing directives (e.g. "acc parallel") are both handled. Variables that are e.g. gang-private can then be adjusted so they reside in GPU shared memory. The reason for doing this is twofold: correct implementation of OpenACC semantics, and optimisation, since shared memory might be faster than the main memory on a GPU. Handling of private variables is intimately tied to the execution model for gangs/workers/vectors implemented by a particular target: for current targets, we use (or on mainline, will soon use) a broadcasting/neutering scheme. That is sufficient for code that e.g. sets a variable in worker-single mode and expects to use the value in worker-partitioned mode. The difficulty (semantics-wise) comes when the user wants to do something like an atomic operation in worker-partitioned mode and expects a worker-single (gang private) variable to be shared across each partitioned worker. Forcing use of shared memory for such variables makes that work properly. In terms of implementation, the parallelism level of a given loop is not fixed until the oaccdevlow pass in the offload compiler, so the patch delays fixing the parallelism level of variables declared on or within such loops until the same point. This is done by adding a new internal UNIQUE function (OACC_PRIVATE) that lists (the address of) each private variable as an argument, and other arguments set so as to be able to determine the correct parallelism level to use for the listed variables. This new internal function fits into the existing scheme for demarcating OpenACC loops, as described in comments in the patch. 
Two new target hooks are introduced: TARGET_GOACC_ADJUST_PRIVATE_DECL and TARGET_GOACC_EXPAND_VAR_DECL. The first can tweak a variable declaration at oaccdevlow time, and the second at expand time. The first or both of these target hooks can be used by a given offload target, depending on its strategy for implementing private variables. This patch updates the TARGET_GOACC_ADJUST_PRIVATE_DECL target hook in the AMD GCN backend to the current name and prototype. (An earlier version of the hook was already present, but dormant.) gcc/ PR middle-end/90115 * doc/tm.texi.in (TARGET_GOACC_EXPAND_VAR_DECL) (TARGET_GOACC_ADJUST_PRIVATE_DECL): Add documentation hooks. * doc/tm.texi: Regenerate. * expr.c (expand_expr_real_1): Expand decls using the expand_var_decl OpenACC hook if defined. * internal-fn.c (expand_UNIQUE): Handle IFN_UNIQUE_OACC_PRIVATE. * internal-fn.h (IFN_UNIQUE_CODES): Add OACC_PRIVATE. * omp-low.c (omp_context): Add oacc_privatization_candidates field. (lower_oacc_reductions): Add PRIVATE_MARKER parameter. Insert before fork. (lower_oacc_head_tail): Add PRIVATE_MARKER parameter. Modify private marker's gimple call arguments, and pass it to lower_oacc_reductions. (oacc_privatization_scan_clause_chain) (oacc_privatization_scan_decl_chain, lower_oacc_private_marker): New functions. (lower_omp_for, lower_omp_target, lower_omp_1): Use these. * omp-offload.c (convert.h): Include. (oacc_loop_xform_head_tail): Treat private-variable markers like fork/join when transforming head/tail sequences. (struct var_decl_rewrite_info): Add struct. (oacc_rewrite_var_decl, is_sync_builtin_call): New functions. (execute_oacc_device_lower): Support rewriting gang-private variables using target hook, and fix up addr_expr and var_decl nodes afterwards. * target.def (adjust_private_decl, expand_var_decl): New hooks. * config/gcn/gcn-protos.h (gcn_goacc_adjust_gangprivate_decl): Rename to... (gcn_goacc_adjust_private_decl): ...this. 
* config/gcn/gcn-tree.c (gcn_goacc_adjust_gangprivate_decl): Rename to... (gcn_goacc_adjust_private_decl): ...this. Add LEVEL parameter. * config/gcn/gcn.c (TARGET_GOACC_ADJUST_GANGPRIVATE_DECL): Rename definition using gcn_goacc_adjust_gangprivate_decl... (TARGET_GOACC_ADJUST_PRIVATE_DECL): ...to this, using gcn_goacc_adjust_private_decl. * config/nvptx/nvptx.c (tree-pretty-print.h): Include. (gang_private_shared_size): New global variable. (gang_private_shared_align): Likewise. (gang_private_shared_sym): Likewise. (gang_private_shared_hmap): Likewise. (nvptx_option_override): Initialize these. (nvptx_file_end): Output gang_private_shared_sym. (nvptx_goacc_adjust_private_decl, nvptx_goacc_expand_var_decl): New functions. (nvptx_set_current_function): Clear gang_private_shared_hmap. (TARGET_GOACC_ADJUST_PRIVATE_DECL): Define hook. (TARGET_GOACC_EXPAND_VAR_DECL): Likewise. libgomp/ PR middle-end/90115 * testsuite/libgomp.oacc-c-c++-common/private-atomic-1-gang.c: New test. * testsuite/libgomp.oacc-fortran/private-atomic-1-gang.f90: Likewise. * testsuite/libgomp.oacc-fortran/private-atomic-1-worker.f90: Likewise. Co-Authored-By: Chung-Lin Tang <cltang@codesourcery.com> Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>	2021-05-21 18:58:07 +02:00
Thomas Schwinge	5d42db5333	Don't skip 'libgomp.oacc-fortran/privatized-ref-2.f90' for nvptx offloading libgomp/ * testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: Don't skip for nvptx offloading.	2021-05-21 16:22:46 +02:00
Tobias Burnus	61796dc03b	Add 'libgomp.oacc-fortran/privatized-ref-2.f90' libgomp/ * testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: New.	2021-05-21 16:08:40 +02:00
Thomas Schwinge	1467100fc7	Add 'libgomp.oacc-c-c++-common/private-atomic-1.c' [PR83812] ... to at least document/test/XFAIL nvptx offloading: PR83812 "operation not supported on global/shared address space". libgomp/ PR target/83812 * testsuite/libgomp.oacc-c-c++-common/private-atomic-1.c: New.	2021-05-19 14:23:29 +02:00
Julian Brown	5a16fb19e7	Add 'libgomp.oacc-c-c++-common/loop-gwv-2.c' libgomp/ * testsuite/libgomp.oacc-c-c++-common/loop-gwv-2.c: New.	2021-05-19 13:58:38 +02:00
Thomas Schwinge	b5c3145ad9	[libgomp, testsuite] Don't shadow global 'offload_targets' variable See local 'offload_targets' variable in 'libgomp/testsuite/lib/libgomp.exp:libgomp_check_effective_target_offload_target' vs. global 'libgomp/testsuite/libgomp-test-support.exp.in:offload_targets' variable. libgomp/ * testsuite/lib/libgomp.exp (check_effective_target_offload_target_nvptx): Don't shadow global 'offload_targets' variable.	2021-05-18 13:18:43 +02:00
Thomas Schwinge	937fa5fb78	'libgomp.c-c++-common/reduction-{5,6}.c': Restrict '-latomic' to nvptx offloading compilation Fix-up for recent commit `33b647956c` "OpenMP: Fix SIMT for complex/float reduction with && and \|\|"; see commit `d42088e453` "Avoid -latomic for amdgcn offloading". libgomp/ * testsuite/libgomp.c-c++-common/reduction-5.c: Restrict '-latomic' to nvptx offloading compilation. * testsuite/libgomp.c-c++-common/reduction-6.c: Likewise.	2021-05-18 12:57:45 +02:00
Thomas Schwinge	abf937ac00	'libgomp.c/target-44.c': Restrict '-latomic' to nvptx offloading compilation Fix-up for recent commit `f87990a2a8` "[openmp, simt] Disable SIMT for user-defined reduction"; see commit `d42088e453` "Avoid -latomic for amdgcn offloading". libgomp/ * testsuite/libgomp.c/target-44.c: Restrict '-latomic' to nvptx offloading compilation.	2021-05-18 12:57:35 +02:00
Kwok Cheung Yeung	ba886d0c48	openmp: Notify team barrier of pending tasks in omp_fulfill_event The team barrier should be notified of any new tasks that become runnable as the result of a completing task, otherwise the barrier threads might not resume processing available tasks, resulting in a hang. 2021-05-17 Kwok Cheung Yeung <kcy@codesourcery.com> libgomp/ * task.c (omp_fulfill_event): Call gomp_team_barrier_set_task_pending if new tasks generated. * testsuite/libgomp.c-c++-common/task-detach-13.c: New.	2021-05-17 13:15:08 -07:00
Tobias Burnus	0e3702f8da	Fortran/OpenMP: Support 'omp parallel master' gcc/fortran/ChangeLog: * dump-parse-tree.c (show_omp_node, show_code_node): Handle EXEC_OMP_PARALLEL_MASTER. * frontend-passes.c (gfc_code_walker): Likewise. * gfortran.h (enum gfc_statement): Add ST_OMP_PARALLEL_MASTER and ST_OMP_END_PARALLEL_MASTER. (enum gfc_exec_op): Add EXEC_OMP_PARALLEL_MASTER. * match.h (gfc_match_omp_parallel_master): Handle it. * openmp.c (gfc_match_omp_parallel_master, resolve_omp_clauses, omp_code_to_statement, gfc_resolve_omp_directive): Likewise. * parse.c (decode_omp_directive, case_exec_markers, gfc_ascii_statement, parse_omp_structured_block, parse_executable): Likewise. * resolve.c (gfc_resolve_blocks, gfc_resolve_code): Likewise. * st.c (gfc_free_statement): Likewise. * trans-openmp.c (gfc_trans_omp_parallel_master, gfc_trans_omp_workshare, gfc_trans_omp_directive): Likewise. * trans.c (trans_code): Likewise. libgomp/ChangeLog: * testsuite/libgomp.fortran/parallel-master.f90: New test. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/parallel-master-1.f90: New test. * gfortran.dg/gomp/parallel-master-2.f90: New test.	2021-05-14 19:21:47 +02:00
Martin Liska	810afb0b5f	testsuite: prune new LTO warning libgomp/ChangeLog: PR testsuite/100569 * testsuite/libgomp.c/omp-nested-3.c: Prune new LTO warning. * testsuite/libgomp.c/pr46032-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/data-clauses-kernels-ipa-pta.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/data-clauses-parallel-ipa-pta.c: Likewise. gcc/testsuite/ChangeLog: PR testsuite/100569 * gcc.dg/atomic/c11-atomic-exec-2.c: Prune new LTO warning. * gcc.dg/torture/pr94947-1.c: Likewise.	2021-05-13 09:24:23 +02:00
Tobias Burnus	d21963ce7a	OpenMP: detach - fix firstprivate handling gcc/ChangeLog: * omp-low.c (finish_taskreg_scan): Use the proper detach decl. libgomp/ChangeLog: * testsuite/libgomp.c-c++-common/task-detach-12.c: New test. * testsuite/libgomp.fortran/task-detach-12.f90: New test.	2021-05-13 00:14:34 +02:00
Jakub Jelinek	98acbb3111	openmp: Fix up taskloop reduction ICE if taskloop has no iterations [PR100471] When a taskloop doesn't have any iterations, GOMP_taskloop* takes an early return, doesn't create any tasks and more importantly, doesn't create a taskgroup and doesn't register task reductions. But, the code emitted in the callers assumes task reductions have been registered and performs the reduction handling and task reduction unregistration. The pointer to the task reduction private variables is reused, on input it is the alignment and only on output it is the pointer, so in the case of a taskloop with no iterations the caller attempts to dereference the alignment value as if it was a pointer and crashes. We could in the early returns register the task reductions only to have them looped over and unregistered in the caller, but I think it is better to tell the caller there is nothing to task reduce and bypass all that. 2021-05-11 Jakub Jelinek <jakub@redhat.com> PR middle-end/100471 * omp-low.c (lower_omp_task_reductions): For OMP_TASKLOOP, if data is 0, bypass the reduction loop including GOMP_taskgroup_reduction_unregister call. * taskloop.c (GOMP_taskloop): If GOMP_TASK_FLAG_REDUCTION and not GOMP_TASK_FLAG_NOGROUP, when doing early return clear the task reduction pointer. * testsuite/libgomp.c/task-reduction-4.c: New test.	2021-05-11 09:07:47 +02:00
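For illustration only, the situation described in 98acbb3111 can be sketched in user code as a taskloop reduction whose loop may have zero iterations. This is not the `task-reduction-4.c` test from the commit; the function name `sum_taskloop` is hypothetical. Without `-fopenmp` the pragmas are ignored and the loop runs serially with the same result.

```c
/* Hypothetical helper (not from the commit): sums a[0..n) using a
   taskloop reduction.  With n == 0 the taskloop has no iterations,
   which is the case the commit message describes.  */
long sum_taskloop(const int *a, int n)
{
    long sum = 0;   /* shared in the parallel region, reduced by the taskloop */

    #pragma omp parallel
    #pragma omp single
    #pragma omp taskloop reduction(+: sum)
    for (int i = 0; i < n; i++)
        sum += a[i];

    return sum;
}
```

Calling `sum_taskloop(a, 0)` is expected to return 0 without crashing; a non-empty call sums the elements as usual.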
Tobias Burnus	33b647956c	OpenMP: Fix SIMT for complex/float reduction with && and \|\| 2021-05-07 Tobias Burnus <tobias@codesourcery.com> Tom de Vries <tdevries@suse.de> gcc/ChangeLog: * omp-low.c (lower_rec_simd_input_clauses): Set max_vf = 1 if a truth_value_p reduction variable is nonintegral. libgomp/ChangeLog: * testsuite/libgomp.c-c++-common/reduction-5.c: New test, testing complex/floating-point \|\| + && reduction with 'omp target'. * testsuite/libgomp.c-c++-common/reduction-6.c: Likewise.	2021-05-07 12:11:51 +02:00
Tobias Burnus	1580fc7644	OpenMP: Support complex/float in && and \|\| reduction C/C++ permit logical AND and logical OR also with floating-point or complex arguments by comparing them against zero for inequality; the result is an 'int' with value one or zero. Hence, those are also permitted as reduction variables, even though it is not the most sensible thing to do. gcc/c/ChangeLog: * c-typeck.c (c_finish_omp_clauses): Accept float + complex for \|\| and && reductions. gcc/cp/ChangeLog: * semantics.c (finish_omp_reduction_clause): Accept float + complex for \|\| and && reductions. gcc/ChangeLog: * omp-low.c (lower_rec_input_clauses, lower_reduction_clauses): Handle && and \|\| with floating-point and complex arguments. gcc/testsuite/ChangeLog: * gcc.dg/gomp/clause-1.c: Use 'reduction(&:..)' instead of '...(&&:..)'. libgomp/ChangeLog: * testsuite/libgomp.c-c++-common/reduction-1.c: New test. * testsuite/libgomp.c-c++-common/reduction-2.c: New test. * testsuite/libgomp.c-c++-common/reduction-3.c: New test.	2021-05-04 14:42:26 +02:00
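A minimal sketch of what 1580fc7644 permits, assuming a compiler that includes this change: a `&&` reduction whose reduction variable has floating-point type. The function name `all_nonzero` is made up for this example and does not come from the commit's tests. Without `-fopenmp` the pragma is ignored and the loop runs serially.

```c
/* Hypothetical helper: returns 1.0 if every element of a[0..n) is
   non-zero and 0.0 otherwise.  The reduction variable is a double;
   each operand of && is compared against zero, so the result is
   always exactly 0 or 1.  */
double all_nonzero(const double *a, int n)
{
    double all = 1.0;   /* matches the && reduction identity (1) */

    #pragma omp parallel for reduction(&&: all)
    for (int i = 0; i < n; i++)
        all = all && a[i];

    return all;
}
```

As the commit message notes, this is rarely the most sensible way to write such a check; an integer flag is usually clearer.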
Tobias Burnus	08fff201c9	OpenMP/Fortran - fix pasto + testcase in depobj [PR100397] gcc/fortran/ChangeLog: PR testsuite/100397 * trans-openmp.c (gfc_trans_omp_depobj): Fix pasto in enum values. libgomp/ChangeLog: PR testsuite/100397 * testsuite/libgomp.fortran/depobj-1.f90 (dep2, dep3): Move var declaration to scope of non-'depend'-guarded assignment to avoid races.	2021-05-04 09:22:36 +02:00
Tom de Vries	f87990a2a8	[openmp, simt] Disable SIMT for user-defined reduction The test-case included in this patch contains this target region: ... for (int i0 = 0 ; i0 < N0 ; i0++ ) counter_N0.i += 1; ... When running with nvptx accelerator, the counter variable is expected to be N0 after the region, but instead is N0 / 32. The problem is that rather than getting the result for all warp lanes, we get it for just one lane. This is caused by the implementation of SIMT being incomplete. It handles regular reductions, but apparently not user-defined reductions. For now, handle this by disabling SIMT in this case, specifically by setting sctx->max_vf to 1. Tested libgomp on x86_64-linux with nvptx accelerator. gcc/ChangeLog: 2021-05-03 Tom de Vries <tdevries@suse.de> PR target/100321 * omp-low.c (lower_rec_input_clauses): Disable SIMT for user-defined reduction. libgomp/ChangeLog: 2021-05-03 Tom de Vries <tdevries@suse.de> PR target/100321 * testsuite/libgomp.c/target-44.c: New test.	2021-05-03 23:13:59 +02:00
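The kind of code affected by f87990a2a8 can be sketched as follows. This is an assumption-laden reconstruction, not the `target-44.c` test itself: the struct, the `declare reduction` line, the directive and the function name `count_iterations` are all illustrative. Without `-fopenmp`, or without an offload device, the loop runs on the host.

```c
/* Illustrative struct with a single counter field.  */
struct counter { int i; };

/* User-defined '+' reduction for struct counter: combine partial
   results by adding the counters.  With no initializer clause the
   private copies start out zeroed.  */
#pragma omp declare reduction(+: struct counter: omp_out.i += omp_in.i)

/* Hypothetical helper: counts loop iterations through the
   user-defined reduction; the expected result is n.  */
int count_iterations(int n)
{
    struct counter c = { 0 };

    #pragma omp target parallel for reduction(+: c) map(tofrom: c)
    for (int i = 0; i < n; i++)
        c.i += 1;

    return c.i;
}
```

According to the commit message, the symptom on nvptx was a result of N0 / 32 instead of N0, because only one warp lane's contribution survived.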
Roman Zhuykov	4cf3b10f27	modulo-sched: skip loops with strange register defs [PR100225] PR84878 fix adds an assertion which can fail, e.g. when the stack pointer is adjusted inside the loop. We have to prevent it and search earlier for any 'strange' instruction. The solution is to skip the whole loop if using 'note_stores' we found that one of the hard registers is in the 'df->regular_block_artificial_uses' set. The patch also properly prohibits non-single-set instructions in the loop body. gcc/ChangeLog: PR rtl-optimization/100225 PR rtl-optimization/84878 * modulo-sched.c (sms_schedule): Use note_stores to skip loops where we have an instruction which touches (writes) any hard register from df->regular_block_artificial_uses set. Allow not-single-set instruction only right before basic block tail. gcc/testsuite/ChangeLog: PR rtl-optimization/100225 PR rtl-optimization/84878 * gcc.dg/pr100225.c: New test. libgomp/ChangeLog: * testsuite/libgomp.oacc-c-c++-common/atomic_capture-3.c: New test.	2021-04-30 11:08:03 +03:00
Tom de Vries	fc14ff6111	[omp, simt] Handle alternative IV Consider the test-case libgomp.c/pr81778.c added in this commit, with this core loop (note: CANARY_SIZE set to 0 for simplicity): ... int s = 1; #pragma omp target simd for (int i = N - 1; i > -1; i -= s) a[i] = 1; ... which, given that N is 32, sets a[0..31] to 1. After omp-expand, this looks like: ... <bb 5> : simduid.7 = .GOMP_SIMT_ENTER (simduid.7); .omp_simt.8 = .GOMP_SIMT_ENTER_ALLOC (simduid.7); D.3193 = -s; s.9 = s; D.3204 = .GOMP_SIMT_LANE (); D.3205 = -s.9; D.3206 = (int) D.3204; D.3207 = D.3205 * D.3206; i = D.3207 + 31; D.3209 = 0; D.3210 = -s.9; D.3211 = D.3210 - i; D.3210 = -s.9; D.3212 = D.3211 / D.3210; D.3213 = (unsigned int) D.3212; D.3213 = i >= 0 ? D.3213 : 0; <bb 19> : if (D.3209 < D.3213) goto <bb 6>; [87.50%] else goto <bb 7>; [12.50%] <bb 6> : a[i] = 1; D.3215 = -s.9; D.3219 = .GOMP_SIMT_VF (); D.3216 = (int) D.3219; D.3220 = D.3215 * D.3216; i = D.3220 + i; D.3209 = D.3209 + 1; goto <bb 19>; [100.00%] ... On nvptx, the first time bb6 is executed, i is in the 0..31 range (depending on the lane that is executing) at bb entry. So we have the following sequence: - a[0..31] is set to 1 - i is updated to -32..-1 - D.3209 is updated to 1 (being 0 initially) - bb19 is executed, and if condition (D.3209 < D.3213) == (1 < 32) evaluates to true - bb6 is once more executed, which should not happen because all the elements that needed to be handled were already handled. - consequently, elements that should not be written are written - with CANARY_SIZE == 0, we may run into a libgomp error: ... libgomp: cuCtxSynchronize error: an illegal memory access was encountered ... and with CANARY_SIZE unmodified, we run into: ... Expected 0, got 1 at base[-961] Aborted (core dumped) ... 
The cause of this is as follows: - because the step s is a variable rather than a constant, an alternative IV (D.3209 in our example) is generated in expand_omp_simd, and the loop condition is tested in terms of the alternative IV rather than the original IV (i in our example). - the SIMT code in expand_omp_simd works by modifying step and initial value. - The initial value fd->loop.n1 is loaded into a variable n1, which is modified by the SIMT code and then used thereafter. - The step fd->loop.step is loaded into a variable step, which is modified by the SIMT code, but afterwards there are uses of both step and fd->loop.step. - There are uses of fd->loop.step in the alternative IV handling code, which should use step instead. Fix this by introducing an additional variable orig_step, which is not modified by the SIMT code and replacing all remaining uses of fd->loop.step by either step or orig_step. Built on x86_64-linux with nvptx accelerator, tested libgomp. This fixes for-5.c and for-6.c FAILs I'm currently seeing on a quadro m1200 with driver 450.66. gcc/ChangeLog: 2020-10-02 Tom de Vries <tdevries@suse.de> * omp-expand.c (expand_omp_simd): Add step_orig, and replace uses of fd->loop.step by either step or orig_step. libgomp/ChangeLog: 2020-10-02 Tom de Vries <tdevries@suse.de> * testsuite/libgomp.c/pr81778.c: New test.	2021-04-29 14:37:32 +02:00
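The arithmetic in fc14ff6111 can be checked with a small host-side model. This is a simplified sketch, not GCC code: the function name `lowest_index_written` and its parameters are invented here. It assumes each lane starts at `(n - 1) - s * lane`, advances by `-s * vf`, and takes its trip count from either the original step or the lane-scaled step.

```c
/* Simplified model of the SIMT loop from the commit message.
   Returns the lowest array index any lane would write for
   "for (i = n - 1; i > -1; i -= s)" split across vf lanes.
   If scaled_count is 0, the trip count is derived from the original
   step (the behaviour described as broken); otherwise it is derived
   from the lane-scaled step.  */
int lowest_index_written(int n, int s, int vf, int scaled_count)
{
    int lowest = n;                       /* above every valid index */

    for (int lane = 0; lane < vf; lane++) {
        int i = (n - 1) - s * lane;       /* lane-adjusted initial value */
        int step = -s * vf;               /* lane-scaled step */
        int orig_step = -s;               /* step as written by the user */
        int cs = scaled_count ? step : orig_step;
        /* Trip count as in the dump: (step - i) / step, or 0 if i < 0.  */
        int count = i >= 0 ? (cs - i) / cs : 0;

        for (int k = 0; k < count; k++) {
            if (i < lowest)
                lowest = i;               /* record the element "written" */
            i += step;
        }
    }
    return lowest;
}
```

With `n = 32`, `s = 1` and `vf = 32`, the model using the original step reaches index -961, which agrees with the `base[-961]` failure quoted in the commit message; with the scaled step every lane runs once and the lowest index is 0.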
Tom de Vries	4d7c874e2c	[omp, simt] Fix expand_GOMP_SIMT_* When running the test-case included in this patch using an nvptx accelerator, it fails in execution. The problem is that the expansion of GOMP_SIMT_XCHG_BFLY is optimized away during pass_jump as "trivially dead insns". This is caused by this code in expand_GOMP_SIMT_XCHG_BFLY: ... class expand_operand ops[3]; create_output_operand (&ops[0], target, mode); ... expand_insn (targetm.code_for_omp_simt_xchg_bfly, 3, ops); ... which doesn't guarantee that target is assigned to by the expanded insn. F.i., if target is: ... (gdb) call debug_rtx ( target ) (subreg/s/u:QI (reg:SI 40 [ _61 ]) 0) ... then after expand_insn, we have: ... (gdb) call debug_rtx ( ops[0].value ) (reg:QI 57) ... See commit `3af3bec2e4` "internal-fn: Avoid dropping the lhs of some calls [PR94941]" for a similar problem. Fix this in the same way, by adding: ... if (!rtx_equal_p (target, ops[0].value)) emit_move_insn (target, ops[0].value); ... where applicable in the expand_GOMP_SIMT_* functions. Tested libgomp on x86_64 with nvptx accelerator. gcc/ChangeLog: 2021-04-28 Tom de Vries <tdevries@suse.de> PR target/100232 * internal-fn.c (expand_GOMP_SIMT_ENTER_ALLOC) (expand_GOMP_SIMT_LAST_LANE, expand_GOMP_SIMT_ORDERED_PRED) (expand_GOMP_SIMT_VOTE_ANY, expand_GOMP_SIMT_XCHG_BFLY) (expand_GOMP_SIMT_XCHG_IDX): Ensure target is assigned to.	2021-04-29 09:55:15 +02:00

856 Commits