OpenE2K/gcc - gcc - Expired Mentality Git

Author	SHA1	Message	Date
GCC Administrator	e93aae4a49	Daily bump.	2020-10-29 00:16:50 +00:00
Jakub Jelinek	2298ca2d3e	openmp: Implicitly discover declare target for variants of declare variant calls This marks all variants of declare variant also declare target if the base functions are called directly in target regions or declare target functions. 2020-10-28 Jakub Jelinek <jakub@redhat.com> gcc/ * omp-offload.c (omp_declare_target_tgt_fn_r): Handle direct calls to declare variant base functions. libgomp/ * testsuite/libgomp.c/target-42.c: New test.	2020-10-28 10:36:31 +01:00
Jakub Jelinek	3f39b64e57	xfail and improve some failing libgomp tests [PR81690] With the patch I've posted today to fix up declare variant LTO handling, Tobias reported the patch still doesn't work, and there are two reasons for that. One is that when the base function is marked implicitly as declare target, we don't mark also implicitly the variants. I'll need to ask on omp-lang about details for that, but generally the compiler should do it some way. The other one is that the way base_delay is written, it will always call the usleep function, which is undesirable for nvptx. While the compiler will replace all direct calls to base_delay to nvptx_delay, the base_delay definition which calls usleep stays. 2020-10-28 Jakub Jelinek <jakub@redhat.com> Tom de Vries <tdevries@suse.de> PR testsuite/81690 * testsuite/libgomp.c/usleep.h: New file. * testsuite/libgomp.c/target-32.c: Include usleep.h. (main): Use tgt_usleep instead of usleep. * testsuite/libgomp.c/thread-limit-2.c: Include usleep.h. (main): Use tgt_usleep instead of usleep.	2020-10-28 10:30:41 +01:00
Jakub Jelinek	f165ef89c0	lto: LTO cgraph support for late declare variant resolution [PR96680] > I've tried to add the saving/restoring next to ipa refs saving/restoring, as > the declare variant alt stuff is kind of extension of those, unfortunately > following doesn't compile, because I need to also write or read a tree there > (ctx is a portion of DECL_ATTRIBUTES of the base function), but the ipa refs > write/read back functions don't have arguments that can be used for that. This patch adds the streaming out and in of those omp_declare_variant_alt hash table on the side data for the declare_variant_alt cgraph_nodes and treats for LTO purposes the declare_variant_alt nodes (which have no body) as if they contained a body that calls all the possible variants. After IPA all the calls to these magic declare_variant_alt calls are replaced with a call to one of the variants depending on which one has the highest score in the context. 2020-10-28 Jakub Jelinek <jakub@redhat.com> PR lto/96680 gcc/ * lto-streamer.h (omp_lto_output_declare_variant_alt, omp_lto_input_declare_variant_alt): Declare variant. * symtab.c (symtab_node::get_partitioning_class): Return SYMBOL_DUPLICATE for declare_variant_alt nodes. * passes.c (ipa_write_summaries): Add declare_variant_alt to partition. * lto-cgraph.c (output_refs): Call omp_lto_output_declare_variant_alt on declare_variant_alt nodes. (input_refs): Call omp_lto_input_declare_variant_alt on declare_variant_alt nodes. * lto-streamer-out.c (output_function): Don't call collect_block_tree_leafs if DECL_INITIAL is error_mark_node. (lto_output): Call output_function even for declare_variant_alt nodes. * omp-general.c (omp_lto_output_declare_variant_alt, omp_lto_input_declare_variant_alt): New functions. gcc/lto/ * lto-common.c (lto_fixup_prevailing_decls): Don't use LTO_NO_PREVAIL on TREE_LIST's TREE_PURPOSE. * lto-partition.c (lto_balanced_map): Treat declare_variant_alt nodes like definitions. libgomp/ * testsuite/libgomp.c/declare-variant-1.c: New test.	2020-10-28 10:29:09 +01:00
GCC Administrator	56ddd5e23a	Daily bump.	2020-10-22 08:28:22 +00:00
Jakub Jelinek	17c5b7e1dc	openmp: Add test for OMP_TARGET_OFFLOAD=mandatory for cases where it must not fail 2020-10-22 Jakub Jelinek <jakub@redhat.com> * testsuite/libgomp.c/target-41.c: New test.	2020-10-22 09:36:18 +02:00
Jakub Jelinek	74c9882b80	openmp: Change omp_get_initial_device () to match OpenMP 5.1 requirements > Therefore, I think until omp_get_initial_device () value is changed, we The following so far untested patch implements that change. OpenMP 4.5 said for omp_get_initial_device: The value of the device number is implementation defined. If it is between 0 and one less than omp_get_num_devices() then it is valid for use with all device constructs and routines; if it is outside that range, then it is only valid for use with the device memory routines and not in the device clause. and OpenMP 5.0 similarly, but OpenMP 5.1 says: The value of the device number is the value returned by the omp_get_num_devices routine. As the new value is compatible with what has been required earlier, I think we can change it already now. 2020-10-22 Jakub Jelinek <jakub@redhat.com> * icv.c (omp_get_initial_device): Remove including corresponding ialias. * icv-device.c (omp_get_initial_device): New function. Return gomp_get_num_devices (). Add ialias. * target.c (resolve_device): Don't fail with OMP_TARGET_OFFLOAD=mandatory if device_id is equal to gomp_get_num_devices (). (omp_target_alloc, omp_target_free, omp_target_is_present, omp_target_memcpy, omp_target_memcpy_rect, omp_target_associate_ptr, omp_target_disassociate_ptr, omp_pause_resource): Use gomp_get_num_devices () instead of GOMP_DEVICE_HOST_FALLBACK on the first use in the functions, in uses dominated by the gomp_get_num_devices call use num_devices_openmp instead. * libgomp.texi (omp_get_initial_device): Document. * config/gcn/icv-device.c (omp_get_initial_device): New function. Add ialias. * config/nvptx/icv-device.c (omp_get_initial_device): Likewise. * testsuite/libgomp.c/target-40.c: New test.	2020-10-22 09:31:01 +02:00
Jakub Jelinek	121a8812c4	libgomp: Hopefully avoid false positive warnings in env.c on solaris > the patch also breaks bootstrap on both i386-pc-solaris2.11 and > sparc-sun-solaris2.11: > > /vol/gcc/src/hg/master/local/libgomp/env.c: In function 'initialize_env': > /vol/gcc/src/hg/master/local/libgomp/env.c:414:16: error: 'new_offload' may be used uninitialized in this function [-Werror=maybe-uninitialized] > 414 | offload = new_offload; > | ~~~~~~~~~^~~~~~~~~~~~~ > /vol/gcc/src/hg/master/local/libgomp/env.c:384:30: note: 'new_offload' was declared here > 384 | enum gomp_target_offload_t new_offload; > | ^~~~~~~~~~~ I can't reproduce that, but I fail to see why we need two separate variables, one with actual value and one tracking if the value is valid. So, I'm going with: 2020-10-21 Jakub Jelinek <jakub@redhat.com> * env.c (parse_target_offload): Change new_offload var type to int, preinitialize to -1, remove found var and test new_offload != -1 instead of found.	2020-10-21 10:21:52 +02:00
GCC Administrator	e2e0428854	Daily bump.	2020-10-21 00:16:36 +00:00
Jakub Jelinek	35f258f4bb	libgomp: Fix up bootstrap in libgomp/target.c due to false positive warning > On 10/20/20 2:11 PM, Tobias Burnus wrote: > > > Unfortunately, the committed patch > > (r11-4121-g1bfc07d150790fae93184a79a7cce897655cb37b) > > causes build errors. > > > > The error seems to be provoked by function cloning – as the code > > itself looks fine: > > ... > > struct gomp_device_descr *devices_s > > = malloc (num_devices * sizeof (struct gomp_device_descr)); > > ... > > for (i = 0; i < num_devices; i++) > > if (!(devices[i].capabilities & GOMP_OFFLOAD_CAP_OPENMP_400)) > > devices_s[num_devices_after_openmp++] = devices[i]; > > gomp_target_init.part.0 () > { > ... > <bb 2> > devices_s_1 = malloc (0); > ... > num_devices.16_67 = num_devices; > ... > if (num_devices.16_67 > 0) > goto <bb 3>; [89.00%] > else > goto <bb 18>; [11.00%] > > Which seems to have an ordering problem. This patch fixes the warning that breaks the bootstrap. 2020-10-20 Jakub Jelinek <jakub@redhat.com> * target.c (gomp_target_init): Inside of the function, use automatic variables corresponding to num_devices, num_devices_openmp and devices global variables and update the globals only at the end of the function.	2020-10-20 16:38:24 +02:00
Kwok Cheung Yeung	1bfc07d150	openmp: Implement support for OMP_TARGET_OFFLOAD environment variable This implements support for the OMP_TARGET_OFFLOAD environment variable introduced in the OpenMP 5.0 standard, which controls how offloading is handled. It may be set to MANDATORY (abort if offloading cannot be performed), DISABLED (no offloading to devices) or DEFAULT (offload to device if possible, fall back to host if not). 2020-10-20 Kwok Cheung Yeung <kcy@codesourcery.com> libgomp/ * env.c (gomp_target_offload_var): New. (parse_target_offload): New. (handle_omp_display_env): Print value of OMP_TARGET_OFFLOAD. (initialize_env): Parse OMP_TARGET_OFFLOAD. * libgomp.h (gomp_target_offload_t): New. (gomp_target_offload_var): New. * libgomp.texi (OMP_TARGET_OFFLOAD): New section. * target.c (resolve_device): Generate error if device not found and offloading is mandatory. (gomp_target_fallback): Generate error if offloading is mandatory. (GOMP_target): Add argument in call to gomp_target_fallback. (GOMP_target_ext): Likewise. (gomp_target_data_fallback): Generate error if offloading is mandatory. (GOMP_target_data): Add argument in call to gomp_target_data_fallback. (GOMP_target_data_ext): Likewise. (gomp_target_task_fn): Add argument in call to gomp_target_fallback. (gomp_target_init): Return early if offloading is disabled.	2020-10-20 04:16:26 -07:00
GCC Administrator	b85d5dc583	Daily bump.	2020-10-16 00:16:29 +00:00
Kwok Cheung Yeung	445567b22a	libgomp: Amend documentation for omp_get_max_active_levels and omp_get_supported_active_levels 2020-10-15 Kwok Cheung Yeung <kcy@codesourcery.com> libgomp/ * libgomp.texi (omp_get_max_active_levels): Modify description. (omp_get_supported_active_levels): Make descriptions consistent.	2020-10-15 03:02:57 -07:00
GCC Administrator	b2698c21f2	Daily bump.	2020-10-15 00:16:34 +00:00
Jakub Jelinek	2fa5f5c42b	libgomp: Fix a typo in documentation 2020-10-14 Jakub Jelinek <jakub@redhat.com> * libgomp.texi (omp_get_supported_active_levels): Fix a typo.	2020-10-14 10:17:11 +02:00
GCC Administrator	bdd74cc20c	Daily bump.	2020-10-14 00:16:24 +00:00
Kwok Cheung Yeung	8949b985db	openmp: Add support for the omp_get_supported_active_levels runtime library routine This patch implements the omp_get_supported_active_levels runtime routine from the OpenMP 5.0 specification, which returns the maximum number of active nested parallel regions supported by this implementation. The current maximum (set using the omp_set_max_active_levels routine or the OMP_MAX_ACTIVE_LEVELS environment variable) cannot exceed this number. 2020-10-13 Kwok Cheung Yeung <kcy@codesourcery.com> libgomp/ * env.c (gomp_max_active_levels_var): Initialize to gomp_supported_active_levels. (initialize_env): Limit gomp_max_active_levels_var to be at most equal to gomp_supported_active_levels. * fortran.c (omp_get_supported_active_levels): Add ialias_redirect. (omp_get_supported_active_levels_): New. * icv.c (omp_set_max_active_levels): Limit gomp_max_active_levels_var to at most equal to gomp_supported_active_levels. (omp_get_supported_active_levels): New. * libgomp.h (gomp_supported_active_levels): New. * libgomp.map (OMP_5.0.1): Add omp_get_supported_active_levels and omp_get_supported_active_levels_. * libgomp.texi (omp_get_supported_active_levels): New. (omp_set_max_active_levels): Update. Add reference to omp_get_supported_active_levels. * omp.h.in (omp_get_supported_active_levels): New. * omp_lib.f90.in (omp_get_supported_active_levels): New. * omp_lib.h.in (omp_get_supported_active_levels): New. * testsuite/libgomp.c/lib-2.c (main): Check omp_get_max_active_levels against omp_get_supported_active_levels. * testsuite/libgomp.fortran/lib4.f90 (lib4): Likewise.	2020-10-13 13:21:02 -07:00
GCC Administrator	2baa36d491	Daily bump.	2020-10-12 00:16:25 +00:00
Clément Chigot	4eaf96c56c	aix: remove libgomp and libatomic archives before creating FAT archives AIX caches shared objects in archives with read-other permission. libgomp and libatomic might be in use during the build or testing, which may cause archiver operations on them to fail. This patch adjusts the Makefile fragments to delete the library archives before creating fresh archives containing both the 32 bit and 64 bit shared objects. libatomic/ChangeLog: 2020-10-11 Clement Chigot <clement.chigot@atos.net> * config/t-aix: Delete and recreate libatomic before creating FAT library. libgomp/ChangeLog: 2020-10-11 Clement Chigot <clement.chigot@atos.net> * config/t-aix: Delete and recreate libgomp before creating FAT library.	2020-10-11 17:30:24 -04:00
GCC Administrator	da9df69975	Daily bump.	2020-10-09 00:16:27 +00:00
Tom de Vries	7345ef6c2a	[libgomp, nvptx] Report launch dimensions in GOMP_OFFLOAD_run Using this patch, when using GOMP_DEBUG=1 and launching a kernel in GOMP_OFFLOAD_run (used by the omp implementation), we see the kernel launch dimensions: ... GOMP_OFFLOAD_run: kernel main$_omp_fn$0: \ launch [(teams: 1), 1, 1] [(lanes: 32), (threads: 1), 1] ... Build on x86_64-linux with nvptx accelerator, tested libgomp. libgomp/ChangeLog: 2020-10-08 Tom de Vries <tdevries@suse.de> PR libgomp/81802 * plugin/plugin-nvptx.c (GOMP_OFFLOAD_run): Report launch dimensions.	2020-10-08 11:03:29 +02:00
GCC Administrator	8e97b9052d	Daily bump.	2020-10-07 00:16:35 +00:00
Tom de Vries	1644d7f4c1	[openacc, libgomp, testsuite] Xfail declare-5.f90 We're currently running into: ... FAIL: libgomp.oacc-fortran/declare-5.f90 -DACC_DEVICE_TYPE_nvidia=1 \ -DACC_MEM_SHARED=0 -foffload=nvptx-none -O0 execution test FAIL: libgomp.oacc-fortran/declare-5.f90 -DACC_DEVICE_TYPE_nvidia=1 \ -DACC_MEM_SHARED=0 -foffload=nvptx-none -O1 execution test FAIL: libgomp.oacc-fortran/declare-5.f90 -DACC_DEVICE_TYPE_nvidia=1 \ -DACC_MEM_SHARED=0 -foffload=nvptx-none -O2 execution test FAIL: libgomp.oacc-fortran/declare-5.f90 -DACC_DEVICE_TYPE_nvidia=1 \ -DACC_MEM_SHARED=0 -foffload=nvptx-none -O3 -fomit-frame-pointer \ -funroll-loops -fpeel-loops -ftracer -finline-functions execution test FAIL: libgomp.oacc-fortran/declare-5.f90 -DACC_DEVICE_TYPE_nvidia=1 \ -DACC_MEM_SHARED=0 -foffload=nvptx-none -O3 -g execution test FAIL: libgomp.oacc-fortran/declare-5.f90 -DACC_DEVICE_TYPE_nvidia=1 \ -DACC_MEM_SHARED=0 -foffload=nvptx-none -Os execution test ... A PR was filed for this: PR92790 - "[OpenACC] declare device_resident - Fortran common blocks not handled / libgomp.oacc-fortran/declare-5.f90 fails" Xfail the fails. Tested on x86_64-linux with nvptx accelerator. libgomp/ChangeLog: 2020-10-06 Tom de Vries <tdevries@suse.de> * testsuite/libgomp.oacc-fortran/declare-5.f90: Add xfail for PR92790.	2020-10-06 18:43:24 +02:00
Tom de Vries	3f2e15c2e6	[openacc] Fix acc declare for VLAs Consider test-case test.c, with VLA A: ... int main (void) { int N = 1000; int A[N]; #pragma acc declare copy(A) return 0; } ... compiled using: ... $ gcc test.c -fopenacc -S -fdump-tree-all ... At original, we have: ... #pragma acc declare map(tofrom:A); ... but at gimple, we have a map (to:A.1), but not a map (from:A.1): ... int[0:D.2074] * A.1; { int A[0:D.2074] [value-expr: A.1]; saved_stack.2 = __builtin_stack_save (); try { A.1 = __builtin_alloca_with_align (D.2078, 32); #pragma omp target oacc_declare map(to:(A.1) [len: D.2076]) } finally { __builtin_stack_restore (saved_stack.2); } } ... This is caused by the following incompatibility. When storing the desired from clause in oacc_declare_returns, we use 'A.1' as the key: ... 10898 oacc_declare_returns->put (decl, c); (gdb) call debug_generic_expr (decl) A.1 (gdb) call debug_generic_expr (c) map(from:(A.1)) ... but when looking it up, we use 'A' as the key: ... (gdb) 1471 tree c = oacc_declare_returns->get (t); (gdb) call debug_generic_expr (t) A ... Fix this by extracting the 'A.1' lookup key from 'A' using the decl-expr. In addition, unshare the looked up value, to avoid running into an "incorrect sharing of tree nodes" error. Using these two fixes, we get our desired: ... finally { + #pragma omp target oacc_declare map(from:(A.1)) __builtin_stack_restore (saved_stack.2); } ... Build on x86_64-linux with nvptx accelerator, tested libgomp. gcc/ChangeLog: 2020-10-06 Tom de Vries <tdevries@suse.de> PR middle-end/90861 * gimplify.c (gimplify_bind_expr): Handle lookup in oacc_declare_returns using key with decl-expr. libgomp/ChangeLog: 2020-10-06 Tom de Vries <tdevries@suse.de> PR middle-end/90861 * testsuite/libgomp.oacc-c-c++-common/declare-vla.c: Remove xfail.	2020-10-06 16:50:22 +02:00
GCC Administrator	7e9282ae62	Daily bump.	2020-10-06 00:16:25 +00:00
Tom de Vries	ab3f4b27ab	[omp, ftracer] Don't duplicate blocks in SIMT region When running the libgomp testsuite on x86_64-linux with nvptx accelerator on the test-case included in this patch, we run into: ... FAIL: libgomp.fortran/pr95654.f90 -O3 -fomit-frame-pointer -funroll-loops \ -fpeel-loops -ftracer -finline-functions execution test ... The test-case is a minimal version of this FAIL: ... FAIL: libgomp.fortran/pr66199-5.f90 -O3 -fomit-frame-pointer -funroll-loops \ -fpeel-loops -ftracer -finline-functions execution test ... but that one has stopped failing at commit `c2ebf4f10d` "openmp: Add support for non-rect simd and improve collapsed simd support". The problem is that ftracer duplicates a block containing GOMP_SIMT_VOTE_ANY. That is, before ftracer we have (dropping the GOMP_SIMT_ prefix): ... [control-flow graph of bb4(ENTER_ALLOC), bb8, bb5(VOTE_ANY), bb7(XCHG_IDX) and bb6(EXIT)] ... The XCHG_IDX internal-fn does inter-SIMT-lane communication, which for nvptx maps onto shfl, an operator which has the requirement that the warp executing the operator is convergent. The warp diverges at bb4, and reconverges at bb5, and does not diverge by going to bb7, so the shfl is indeed executed by a convergent warp. After ftracer, we have: ... [control-flow graph of bb4(ENTER_ALLOC), bb5(VOTE_ANY), bb8(VOTE_ANY), bb7(XCHG_IDX) and bb6(EXIT)] ... The warp diverges again at bb5, but does not reconverge again before bb6, so the shfl is executed by a divergent warp, which causes the FAIL. Fix this by making ftracer ignore blocks containing ENTER_ALLOC, VOTE_ANY and EXIT, effectively treating the SIMT region conservatively. An argument can be made that the test needs to be added in a more generic place, like gimple_can_duplicate_bb_p or some such, and that ftracer then needs to use the generic test. But that's a discussion with a much broader scope, so I'm leaving that for another patch. Bootstrapped and reg-tested on x86_64-linux. Build on x86_64-linux with nvptx accelerator, tested with libgomp. gcc/ChangeLog: PR fortran/95654 * tracer.c (ignore_bb_p): Ignore GOMP_SIMT_ENTER_ALLOC, GOMP_SIMT_VOTE_ANY and GOMP_SIMT_EXIT. libgomp/ChangeLog: 2020-10-05 Tom de Vries <tdevries@suse.de> PR fortran/95654 * testsuite/libgomp.fortran/pr95654.f90: New test.	2020-10-05 08:53:11 +02:00
GCC Administrator	b0b9b8f02a	Daily bump.	2020-10-03 00:16:25 +00:00
Tobias Burnus	2fe5a545e0	libgomp: Regenerate configure files with automake 1.15.1 libgomp/ChangeLog: * Makefile.in: Regenerate with automake 1.15.1. * aclocal.m4: Likewise. * configure: Likewise. * testsuite/Makefile.in: Likewise.	2020-10-02 12:08:47 +02:00
GCC Administrator	660bfe61d4	Daily bump.	2020-10-01 00:16:30 +00:00
Andrew Stubbs	091ddcc1b2	libgomp: Enforce 1-thread limit in subteams Accelerators with fixed thread-counts will break if nested teams are expected to have multiple threads each. libgomp/ChangeLog: 2020-09-29 Andrew Stubbs <ams@codesourcery.com> * parallel.c (gomp_resolve_num_threads): Ignore nest_var on nvptx and amdgcn targets.	2020-09-30 17:37:31 +01:00
Tobias Burnus	8b0a63e47c	OpenMP: Add implicit declare target for nested procedures gcc/ChangeLog: * omp-offload.c (omp_discover_implicit_declare_target): Also handle nested functions. libgomp/ChangeLog: * testsuite/libgomp.fortran/declare-target-3.f90: New test.	2020-09-30 14:59:27 +02:00
GCC Administrator	93bca37c0a	Daily bump.	2020-09-30 00:16:29 +00:00
Andrew Stubbs	6f51395197	libgomp: disable barriers in nested teams Both GCN and NVPTX allow nested parallel regions, but the barrier implementation did not allow the nested teams to run independently of each other (due to hardware limitations). This patch fixes that, under the assumption that each thread will create a new subteam of one thread, by simply not using barriers when there's no other thread to synchronise. libgomp/ChangeLog: * config/gcn/bar.c (gomp_barrier_wait_end): Skip the barrier if the total number of threads is one. (gomp_team_barrier_wake): Likewise. (gomp_team_barrier_wait_end): Likewise. (gomp_team_barrier_wait_cancel_end): Likewise. * config/nvptx/bar.c (gomp_barrier_wait_end): Likewise. (gomp_team_barrier_wake): Likewise. (gomp_team_barrier_wait_end): Likewise. (gomp_team_barrier_wait_cancel_end): Likewise. * testsuite/libgomp.c-c++-common/nested-parallel-unbalanced.c: New test.	2020-09-29 11:48:04 +01:00
GCC Administrator	e84761c6f3	Daily bump.	2020-09-29 00:16:30 +00:00
Tobias Burnus	2a10a2c068	OpenMP: Handle cpp_implicit_alias in declare-target discovery (PR96390) gcc/ChangeLog: PR middle-end/96390 * omp-offload.c (omp_discover_declare_target_tgt_fn_r): Handle alias nodes. libgomp/ChangeLog: PR middle-end/96390 * testsuite/libgomp.c++/pr96390.C: New test. * testsuite/libgomp.c-c++-common/pr96390.c: New test.	2020-09-28 18:08:05 +02:00
GCC Administrator	4383c595ce	Daily bump.	2020-09-28 00:16:21 +00:00
Clément Chigot	3c11f25fb8	aix: Use $(AR) without -X32_64 to build FAT libraries. AIX FAT libraries should be built with the version of AR chosen by configure. The GNU Make $(AR) variable includes the AIX -X32_64 option needed by the default Makefile rules to accept both 32 bit and 64 bit object files. The -X32_64 option conflicts with ar archiving objects of the same name used to build FAT libraries. This patch changes the Makefile fragments for AIX FAT libraries to use $(AR), but strips the -X32_64 option from the Make variable. libgcc/ChangeLog: 2020-09-27 Clement Chigot <clement.chigot@atos.net> * config/rs6000/t-slibgcc-aix: Use $(AR) without -X32_64. libatomic/ChangeLog: 2020-09-27 Clement Chigot <clement.chigot@atos.net> * config/t-aix: Use $(AR) without -X32_64. libgomp/ChangeLog: 2020-09-27 Clement Chigot <clement.chigot@atos.net> * config/t-aix: Use $(AR) without -X32_64. libstdc++-v3/ChangeLog: 2020-09-27 Clement Chigot <clement.chigot@atos.net> * config/os/aix/t-aix: Use $(AR) without -X32_64. libgfortran/ChangeLog: 2020-09-27 Clement Chigot <clement.chigot@atos.net> * config/t-aix: Use $(AR) without -X32_64.	2020-09-27 12:43:29 -04:00
GCC Administrator	cdd8f031c7	Daily bump.	2020-09-26 00:16:25 +00:00
Jakub Jelinek	c2ebf4f10d	openmp: Add support for non-rect simd and improve collapsed simd support The following change adds support for non-rectangular simd loops. While working on that, I've noticed we actually don't vectorize collapsed simd loops at all, because the code that I thought would be vectorizable actually is not vectorized. While for the constant lower/upper bounds and constant step of all but the outermost loop we could in theory vectorize by computing the separate iterators using vectorized division and modulo for each of them from the single iterator that increments by 1 from 0 to total iteration count in the loop nest, I think that would be fairly expensive and the chances of the loop body being vectorizable would be low e.g. because of array indices unlikely to be linear and would need scatters/gathers. This patch changes the generated code to vectorize only the innermost loop which has a higher chance of being vectorized. Below is the list of tests and function names in which the patch resulted in vectorizing something that hasn't been vectorized before (ok, the first line is a new test). I've also found that the vectorizer will not vectorize loops with non-constant steps, I plan to do something about those incrementally on the omp-expand.c side (basically, compute number of iterations before the loop and use a 0 to number_of_iterations step 1 IV as the main one). I have a problem with the composite simd vectorization though. The point is that each thread (or task etc.) is given only a range of consecutive iterations, so somewhere earlier it computes total number of iterations and splits the work between the workers and then the intent is to try to vectorize it. So, each thread is then given a begin ... end-1 range that it would handle. 
This means that from the single begin value I need to compute the individual iteration vars I should start at and then goto into the loop nest to begin iterating there (and actually compute how many iterations the innermost loop should do each time so that it stops before end). Very roughly the IL I emit is something like: int t[100][100][100]; void foo (int a, int b, int c, int d, int e, int f, int g, int h, int u, int v, int w, int x) { int i, j, k; int cnt; if (x) { i = u; j = v; k = w; goto doit; } for (i = a; i < b; i += c) for (j = d; j < e; j += f) { k = g; doit: for (; k < h; k++) t[i][j][k] += i + j + k; } } Unfortunately, some pass then turns the innermost loop to have more than 2 basic blocks and it isn't vectorized because of that. Also, I have disabled (for now) SIMTization of collapsed simd loops, because for SIMT it would be using a single thread anyway and I didn't want to bother with checking SIMT on all places I've been changing. If SIMT support is added for some or all collapsed loops, that omp-low.c change needs to be reverted. 
Here is that list of what hasn't been vectorized before and is now: gcc/testsuite/gcc.dg/vect/vect-simd-17.c doit gcc/testsuite/gfortran.dg/gomp/openmp-simd-6.f90 bar libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-10.c f28_taskloop_simd_normal._omp_fn.0 libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-10.c _Z24f28_taskloop_simd_normalv._omp_fn.0 libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-11.c f25_t_simd_normal._omp_fn.0 libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-11.c f26_t_simd_normal._omp_fn.0 libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-11.c f27_t_simd_normal._omp_fn.0 libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-11.c f28_tpf_simd_guided32._omp_fn.1 libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-11.c f28_tpf_simd_runtime._omp_fn.1 libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-11.c _Z17f25_t_simd_normaliiiiiii._omp_fn.0 libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-11.c _Z17f26_t_simd_normaliiiixxi._omp_fn.0 libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-11.c _Z17f27_t_simd_normalv._omp_fn.0 libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-11.c _Z20f28_tpf_simd_runtimev._omp_fn.1 libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-11.c _Z21f28_tpf_simd_guided32v._omp_fn.1 libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-2.c f7_simd_normal libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-2.c f7_simd_normal libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-2.c f8_f_simd_guided32 libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-2.c f8_f_simd_guided32 libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-2.c f8_f_simd_runtime libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-2.c f8_f_simd_runtime libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-2.c f8_pf_simd_guided32._omp_fn.0 libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-2.c f8_pf_simd_runtime._omp_fn.0 
libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-2.c _Z18f8_pf_simd_runtimev._omp_fn.0 libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-2.c _Z19f8_pf_simd_guided32v._omp_fn.0 libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-4.c f8_taskloop_simd_normal._omp_fn.0 libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-4.c _Z23f8_taskloop_simd_normalv._omp_fn.0 libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-5.c f7_t_simd_normal._omp_fn.0 libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-5.c f8_tpf_simd_guided32._omp_fn.1 libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-5.c f8_tpf_simd_runtime._omp_fn.1 libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-5.c _Z16f7_t_simd_normalv._omp_fn.0 libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-5.c _Z19f8_tpf_simd_runtimev._omp_fn.1 libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-5.c _Z20f8_tpf_simd_guided32v._omp_fn.1 libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-8.c f25_simd_normal libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-8.c f25_simd_normal libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-8.c f26_simd_normal libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-8.c f26_simd_normal libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-8.c f27_simd_normal libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-8.c f27_simd_normal libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-8.c f28_f_simd_guided32 libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-8.c f28_f_simd_guided32 libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-8.c f28_f_simd_runtime libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-8.c f28_f_simd_runtime libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-8.c f28_pf_simd_guided32._omp_fn.0 libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/for-8.c f28_pf_simd_runtime._omp_fn.0 libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-8.c 
_Z19f28_pf_simd_runtimev._omp_fn.0 libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/for-8.c _Z20f28_pf_simd_guided32v._omp_fn.0 libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/master-combined-1.c main._omp_fn.9 libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/master-combined-1.c main._omp_fn.9 libgomp/testsuite/libgomp.c++/../libgomp.c-c++-common/simd-1.c f2 libgomp/testsuite/libgomp.c/../libgomp.c-c++-common/simd-1.c f2 libgomp/testsuite/libgomp.c/pr70680-2.c f1._omp_fn.0 libgomp/testsuite/libgomp.c/pr70680-2.c f2._omp_fn.0 libgomp/testsuite/libgomp.c/pr70680-2.c f3._omp_fn.0 libgomp/testsuite/libgomp.c/pr70680-2.c f4._omp_fn.0 libgomp/testsuite/libgomp.c/simd-8.c foo libgomp/testsuite/libgomp.c/simd-9.c bar libgomp/testsuite/libgomp.c/simd-9.c foo 2020-09-25 Jakub Jelinek <jakub@redhat.com> gcc/ * omp-low.c (scan_omp_1_stmt): Don't call scan_omp_simd for collapse > 1 loops as simt doesn't support collapsed loops yet. * omp-expand.c (expand_omp_for_init_counts, expand_omp_for_init_vars): Small tweaks to function comment. (expand_omp_simd): Rewritten collapse > 1 support to only attempt to vectorize the innermost loop and emit set of outer loops around it. For non-composite simd with collapse > 1 without broken loop don't even try to compute number of iterations first. Add support for non-rectangular simd loops. (expand_omp_for): Don't sorry_at on non-rectangular simd loops. gcc/testsuite/ * gcc.dg/vect/vect-simd-17.c: New test. libgomp/ * testsuite/libgomp.c/loop-25.c: New test.	2020-09-25 10:43:37 +02:00
GCC Administrator	521d271140	Daily bump.	2020-09-23 00:16:27 +00:00
Tobias Burnus	f74c87f85f	libgomp.fortran/pr66199-5.f90: Make stop codes unique libgomp/ChangeLog: PR fortran/95654 * testsuite/libgomp.fortran/pr66199-5.f90: Make stop codes unique.	2020-09-22 19:16:34 +02:00
Tom de Vries	c0e9cee285	[libgomp, nvptx] Print error log for link error By running libgomp test-case libgomp.c/target-28.c with GOMP_NVPTX_PTXRW=w (using a maintenance patch that adds support for this env var), we dump the ptx in target-28.exe to file. By editing one ptx file to rename gomp_nvptx_main to gomp_nvptx_main2 in both declaration and call, and running with GOMP_NVPTX_PTXRW=r, we trigger a link error: ... $ GOMP_NVPTX_PTXRW=r ./target-28.exe libgomp: cuLinkComplete error: unknown error ... The error is somewhat uninformative. Fix this by dumping the error log returned by the failing cuda call, such that we have instead: ... $ GOMP_NVPTX_PTXRW=r ./target-28.exe libgomp: Link error log error : \ Undefined reference to 'gomp_nvptx_main2' in '' libgomp: cuLinkComplete error: unknown error ... Build on x86_64 with nvptx accelerator, tested libgomp. libgomp/ChangeLog: * plugin/plugin-nvptx.c (link_ptx): Print elog if cuLinkComplete call fails.	2020-09-22 13:38:00 +02:00
GCC Administrator	ecde1b0a46	Daily bump.	2020-09-17 00:16:31 +00:00
Nathan Sidwell	8155316c6f	c++: local-scope OMP UDR reductions have no template head This corrects the earlier problems with removing the template header from local omp reductions. And it uncovered a latent bug. When we tsubst such a decl, we immediately tsubst its body. cp_check_omp_declare_reduction gets a success return value to gate that instantiation. udr-2.C got a further error, as the omp checking machinery doesn't appear to turn the reduction into an error mark when failing. I didn't dig into that further. udr-3.C appears to have been invalid and accidentally worked. gcc/cp/ * cp-tree.h (cp_check_omp_declare_reduction): Return bool. * semantics.c (cp_check_omp_declare_reduction): Return true on success. * pt.c (push_template_decl_real): OMP reductions do not get a template header. (tsubst_function_decl): Remove special casing for local decl omp reductions. (tsubst_expr): Call instantiate_body for a local omp reduction. (instantiate_body): Add nested_p parm, and deal with such instantiations. (instantiate_decl): Reject FUNCTION_SCOPE entities, adjust instantiate_body call. gcc/testsuite/ * g++.dg/gomp/udr-2.C: Add additional expected error. libgomp/ * testsuite/libgomp.c++/udr-3.C: Add missing ctor.	2020-09-16 12:16:11 -07:00
GCC Administrator	9f7ab8c561	Daily bump.	2020-09-16 00:16:37 +00:00
Tobias Burnus	1b9bdd5203	libgomp/target.c: Silence -Wuninitialized warning libgomp/ChangeLog: PR fortran/96668 * target.c (gomp_map_vars_internal): Initialize has_nullptr.	2020-09-15 21:28:40 +02:00
Tobias Burnus	972da55746	OpenMP/Fortran: Fix (re)mapping of allocatable/pointer arrays [PR96668] gcc/cp/ChangeLog: PR fortran/96668 * cp-gimplify.c (cxx_omp_finish_clause): Add bool openacc arg. * cp-tree.h (cxx_omp_finish_clause): Likewise. * semantics.c (handle_omp_for_class_iterator): Update call. gcc/fortran/ChangeLog: PR fortran/96668 * trans.h (gfc_omp_finish_clause): Add bool openacc arg. * trans-openmp.c (gfc_omp_finish_clause): Ditto. Use GOMP_MAP_ALWAYS_POINTER with PSET for pointers. (gfc_trans_omp_clauses): Like the latter and also if the always modifier is used. gcc/ChangeLog: PR fortran/96668 * gimplify.c (gimplify_omp_for): Add 'bool openacc' argument; update omp_finish_clause calls. (gimplify_adjust_omp_clauses_1, gimplify_adjust_omp_clauses, gimplify_expr, gimplify_omp_loop): Update omp_finish_clause and/or gimplify_for calls. * langhooks-def.h (lhd_omp_finish_clause): Add bool openacc arg. * langhooks.c (lhd_omp_finish_clause): Likewise. * langhooks.h (lhd_omp_finish_clause): Likewise. * omp-low.c (scan_sharing_clauses): Keep GOMP_MAP_TO_PSET cause for 'declare target' vars. include/ChangeLog: PR fortran/96668 * gomp-constants.h (GOMP_MAP_ALWAYS_POINTER_P): Define. libgomp/ChangeLog: PR fortran/96668 * libgomp.h (struct target_var_desc): Add has_null_ptr_assoc member. * target.c (gomp_map_vars_existing): Add always_to_flag flag. (gomp_map_vars_existing): Update call to it. (gomp_map_fields_existing): Likewise. (gomp_map_vars_internal): Update PSET handling such that if a nullptr is now allocated or if GOMP_MAP_POINTER is used PSET is updated and pointer remapped. (GOMP_target_enter_exit_data): Handle GOMP_MAP_ALWAYS_POINTER like GOMP_MAP_POINTER. * testsuite/libgomp.fortran/map-alloc-ptr-1.f90: New test. * testsuite/libgomp.fortran/map-alloc-ptr-2.f90: New test.	2020-09-15 09:24:47 +02:00
GCC Administrator	50a71cd018	Daily bump.	2020-09-15 00:16:37 +00:00
Tom de Vries	4ac7b66958	[libgomp, nvptx] Add __sync_compare_and_swap_16 As reported here ( https://gcc.gnu.org/pipermail/gcc-patches/2020-September/553070.html ), when running test-case libgomp.c-c++-common/reduction-16.c for powerpc host with nvptx accelerator, we run into: ... unresolved symbol __sync_val_compare_and_swap_16 ... I can reproduce the problem on x86_64 with a trigger patch that: - initializes ix86_isa_flags2 to TARGET_ISA2_CX16 - enables define_expand "atomic_load<mode>" in gcc/config/i386/sync.md for TImode The problem is that omp-expand.c generates atomic builtin calls based on checks whether those are supported on the host, which forces the target to support these, even though those checks fail for the accelerator target. Fix this by: - adding a __sync_val_compare_and_swap_16 in libgomp for nvptx, which falls back onto libatomic's __atomic_compare_and_swap_16 - adding -foffload=-latomic in the test-case Tested libgomp on x86_64-linux with nvptx accelerator. Tested libgomp with trigger patch on x86_64-linux with nvptx accelerator. libgomp/ChangeLog: * config/nvptx/atomic.c: New file. Add __sync_val_compare_and_swap_16. * testsuite/libgomp.c-c++-common/reduction-16.c: Add -latomic for target offload_target_nvptx.	2020-09-14 08:28:56 +02:00
GCC Administrator	31a0504624	Daily bump.	2020-09-09 00:16:29 +00:00

1254 Commits