This patch adds some XFAILs for PR98979 until the patch to fix them has
been approved. See:
https://gcc.gnu.org/pipermail/gcc-patches/2021-February/564711.html
gcc/testsuite/
PR fortran/98979
* gfortran.dg/goacc/array-with-dt-2.f90: Add expected errors.
* gfortran.dg/goacc/derived-chartypes-1.f90: Skip ICEing test.
* gfortran.dg/goacc/derived-chartypes-2.f90: Likewise.
libgomp/
PR fortran/98979
* testsuite/libgomp.oacc-fortran/array-stride-dt-1.f90: Add expected
errors.
OpenACC 3.0 ("2.14.4. Update Directive") states:
Noncontiguous subarrays may appear. It is implementation-specific
whether noncontiguous regions are updated by using one transfer for
each contiguous subregion, or whether the non-contiguous data is
packed, transferred once, and unpacked, or whether one or more larger
subarrays (no larger than the smallest contiguous region that contains
the specified subarray) are updated.
This patch relaxes some conditions in the Fortran front-end so that
strided accesses are permitted for update directives.
gcc/fortran/
* openmp.c (resolve_omp_clauses): Omit OpenACC update in
contiguity check and stride-specified error.
gcc/testsuite/
* gfortran.dg/goacc/array-with-dt-2.f90: New test.
libgomp/
* testsuite/libgomp.oacc-fortran/array-stride-dt-1.f90: New test.
On Wed, Jan 20, 2021 at 05:04:39PM +0100, Florian Weimer wrote:
> Sorry, this appears to cause OpenMP task state corruption in RPM. We
> have only seen this on s390x.
Haven't actually verified it, but my suspection is that this is a caller
stack corruption.
We play with fire with the GOMP_task API/ABI extensions, the GOMP_task
function used to be:
void
GOMP_task (void (*fn) (void *), void *data, void (*cpyfn) (void *, void *),
long arg_size, long arg_align, bool if_clause, unsigned flags);
and later:
void
GOMP_task (void (*fn) (void *), void *data, void (*cpyfn) (void *, void *),
long arg_size, long arg_align, bool if_clause, unsigned flags,
void **depend);
and later:
void
GOMP_task (void (*fn) (void *), void *data, void (*cpyfn) (void *, void *),
long arg_size, long arg_align, bool if_clause, unsigned flags,
void **depend, int priority);
and now:
void
GOMP_task (void (*fn) (void *), void *data, void (*cpyfn) (void *, void *),
long arg_size, long arg_align, bool if_clause, unsigned flags,
void **depend, int priority, void *detach)
and which of those depend, priority and detach argument is present depends
on the bits in flags.
I'm afraid the compiler just decided to spill the detach = NULL store in
if ((flags & GOMP_TASK_FLAG_DETACH) == 0)
detach = NULL;
on s390x into the argument stack slot. Not a problem if the caller passes
all those 10 arguments, but if not, can clobber random stack location.
This hack should fix it up. Priority doesn't need changing, but I've
changed it anyway just to be safe. With the patch none of the 3 arguments
are ever modified, so I'd hope gcc doesn't decide to spill something
unrelated there.
2021-01-20 Jakub Jelinek <jakub@redhat.com>
* task.c (GOMP_task): Rename priority argument to priority_arg,
add priority automatic variable and modify that variable. Instead of
clearing detach argument when GOMP_TASK_FLAG_DETACH bit is not set,
check flags for that bit.
This patch introduces gomp_sem_getcount wrapper, which uses sem_getvalue
for POSIX and atomic loads for linux futex and accel. rtems for now
remains broken.
2021-01-18 Jakub Jelinek <jakub@redhat.com>
* config/linux/sem.h (gomp_sem_getcount): New function.
* config/posix/sem.h (gomp_sem_getcount): New function.
* config/posix/sem.c (gomp_sem_getcount): New function.
* config/accel/sem.h (gomp_sem_getcount): New function.
* task.c (task_fulfilled_p): Use gomp_sem_getcount.
(omp_fulfill_event): Likewise.
The recent changes to error on mixing -march=i386 and -fcf-protection broke
bootstrap. This patch changes lib{atomic,gomp,itm} configury, so that it
only adds -march=i486 to flags if really needed (i.e. when 486 or later isn't
on by default already). Similarly, it will not use ifuncs if -mcx16
(or -march=i686 for 32-bit) is on by default.
2021-01-15 Jakub Jelinek <jakub@redhat.com>
PR target/70454
libatomic/
* configure.tgt: For i?86 and x86_64 determine if -march=i486 needs to
be added through preprocessor check on
__GCC_HAVE_SYNC_COMPARE_AND_SWAP_4. Determine if try_ifunc is needed
based on preprocessor check on __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16
or __GCC_HAVE_SYNC_COMPARE_AND_SWAP_8.
libgomp/
* configure.tgt: For i?86 and x86_64 determine if -march=i486 needs to
be added through preprocessor check on
__GCC_HAVE_SYNC_COMPARE_AND_SWAP_4.
libitm/
* configure.tgt: For i?86 and x86_64 determine if -march=i486 needs to
be added through preprocessor check on
__GCC_HAVE_SYNC_COMPARE_AND_SWAP_4.
As recently again discussed in <https://gcc.gnu.org/PR97436> "[nvptx] -m32
support", nvptx offloading other than for 64-bit host has never been
implemented, tested, supported. So we simply should buildn't the nvptx libgomp
plugin in this case.
This avoids build problems if, for example, in a (standard) bi-arch
x86_64-pc-linux-gnu '-m64'/'-m32' build, libcuda is available only in a 64-bit
variant but not in a 32-bit one, which, for example, is the case if you build
GCC against the CUDA toolkit's 'stubs/libcuda.so' (see
<https://stackoverflow.com/a/52784819>).
This amends PR65099 commit a92defdab7 (r225560)
"[nvptx offloading] Only 64-bit configurations are currently supported" to
match the way we're doing this for the HSA/GCN plugins.
libgomp/
PR libgomp/65099
* plugin/configfrag.ac (PLUGIN_NVPTX): Restrict to supported
configurations.
* configure: Regenerate.
* plugin/plugin-nvptx.c (nvptx_get_num_devices): Remove 64-bit
check.
The libgomp texinfo docs lead to an invalid "up" link on the Top node,
which we can avoid similarly to the Top link in the main GCC manual.
2020-12-28 Sandra Loosemore <sandra@codesourcery.com>
libgomp/
* libgomp.texi (Top): Avoid bad "up" link.
The attached testcase is miscompiled, because we optimize shared clauses
to firstprivate when task body can't modify the variable even when the
task has depend clause. That is wrong, because firstprivate means the
variable will be copied immediately when the task is created, while with
depend clause some other task might change it later before the dependencies
are satisfied and the task should observe the value only after the change.
2020-12-18 Jakub Jelinek <jakub@redhat.com>
* gimplify.c (struct gimplify_omp_ctx): Add has_depend member.
(gimplify_scan_omp_clauses): Set it to true if OMP_CLAUSE_DEPEND
appears on OMP_TASK.
(gimplify_adjust_omp_clauses_1, gimplify_adjust_omp_clauses): Force
GOVD_WRITTEN on shared variables if task construct has depend clause.
* testsuite/libgomp.c/task-6.c: New test.
These are the same header files that exist in the Radeon Open Compute Runtime
project (as of October 2020), but they have been specially relicensed by AMD
for use in GCC.
The header files retain AMD copyright.
include/ChangeLog:
* hsa.h: Replace whole file.
* hsa_ext_amd.h: New file.
* hsa_ext_image.h: New file.
libgomp/ChangeLog:
* plugin/plugin-gcn.c: Include hsa_ext_amd.h.
(HSA_AMD_AGENT_INFO_COMPUTE_UNIT_COUNT): Delete redundant definition.
The change in major version (and the increment from Darwin19 to 20)
caused libtool tests to fail which resulted in incorrect build settings
for shared libraries.
We take this opportunity to sort out the shared undefined symbols state
rather than propagating the current unsound behaviour into a new rev.
This change means that we default to the case that missing symbols are
considered an error, and if one wants to allow this intentionally, the
confiuration for that case should be set appropriately.
Three existing cases need undefined dynamic lookup:
libitm, where there is already a configuration mechanism to add the
flags.
libcc1, where we add simple configuration to add the flags for Darwin.
libsanitizer, where we can add to the existing extra flags.
libcc1/ChangeLog:
PR target/97865
* Makefile.am: Add dynamic_lookup to LD flags for Darwin.
* configure.ac: Test for Darwin host and set a flag.
* Makefile.in: Regenerate.
* configure: Regenerate.
libitm/ChangeLog:
PR target/97865
* configure.tgt: Add dynamic_lookup to XLDFLAGS for Darwin.
* configure: Regenerate.
libsanitizer/ChangeLog:
PR target/97865
* configure.tgt: Add dynamic_lookup to EXTRA_CXXFLAGS for
Darwin.
* configure: Regenerate.
ChangeLog:
PR target/97865
* libtool.m4: Update handling of Darwin platform link flags
for Darwin20.
gcc/ChangeLog:
PR target/97865
* configure: Regenerate.
libatomic/ChangeLog:
PR target/97865
* configure: Regenerate.
libbacktrace/ChangeLog:
PR target/97865
* configure: Regenerate.
libffi/ChangeLog:
PR target/97865
* configure: Regenerate.
libgfortran/ChangeLog:
PR target/97865
* configure: Regenerate.
libgomp/ChangeLog:
PR target/97865
* configure: Regenerate.
libhsail-rt/ChangeLog:
PR target/97865
* configure: Regenerate.
libobjc/ChangeLog:
PR target/97865
* configure: Regenerate.
libphobos/ChangeLog:
PR target/97865
* configure: Regenerate.
libquadmath/ChangeLog:
PR target/97865
* configure: Regenerate.
libssp/ChangeLog:
PR target/97865
* configure: Regenerate.
libstdc++-v3/ChangeLog:
PR target/97865
* configure: Regenerate.
libvtv/ChangeLog:
PR target/97865
* configure: Regenerate.
zlib/ChangeLog:
PR target/97865
* configure: Regenerate.
The testcase had invalid assumptions about which loop iterations would run
first and last.
libgomp/ChangeLog
* testsuite/libgomp.oacc-fortran/atomic_capture-1.f90 (main): Adjust
expected results.
Ensure the code will continue to compile when elf.h gets these definitions.
libgomp/ChangeLog:
* plugin/plugin-gcn.c: Don't redefine relocations if elf.h has them.
(reserved): Delete unused define.
This removes the nest-var ICV, expressing nesting in terms of the
max-active-levels-var ICV instead. The max-active-levels-var ICV
is now per data environment rather than per device.
2020-11-18 Kwok Cheung Yeung <kcy@codesourcery.com>
libgomp/
* env.c (gomp_global_icv): Remove nest_var field. Add
max_active_levels_var field.
(gomp_max_active_levels_var): Remove.
(parse_boolean): Return true on success.
(handle_omp_display_env): Express OMP_NESTED in terms of
max_active_levels_var. Change format specifier for
max_active_levels_var.
(initialize_env): Set max_active_levels_var from
OMP_MAX_ACTIVE_LEVELS, OMP_NESTED, OMP_NUM_THREADS and
OMP_PROC_BIND.
* icv.c (omp_set_nested): Express in terms of
max_active_levels_var.
(omp_get_nested): Likewise.
(omp_set_max_active_levels): Use max_active_levels_var field instead
of gomp_max_active_levels_var.
(omp_get_max_active_levels): Likewise.
* libgomp.h (struct gomp_task_icv): Remove nest_var field. Add
max_active_levels_var field.
(gomp_supported_active_levels): Set to UCHAR_MAX.
(gomp_max_active_levels_var): Delete.
* libgomp.texi (omp_get_nested): Update documentation.
(omp_set_nested): Likewise.
(OMP_MAX_ACTIVE_LEVELS): Likewise.
(OMP_NESTED): Likewise.
(OMP_NUM_THREADS): Likewise.
(OMP_PROC_BIND): Likewise.
* parallel.c (gomp_resolve_num_threads): Replace reference
to nest_var with max_active_levels_var. Use max_active_levels_var
field instead of gomp_max_active_levels_var.
As typically configured, newlib's libc.a does not build 'posix' and,
hence, usleep is not available. Thus, use the same fallback as for nvptx.
libgomp/
* testsuite/libgomp.c/usleep.h (fallback_usleep): Renamed from
nvptx_usleep; use also for device={arch(gcn)}.