gcc/libgomp
Jakub Jelinek c7abdf46fb openmp: Fix up struct gomp_work_share handling [PR102838]
If GOMP_HAVE_EFFICIENT_ALIGNED_ALLOC is not defined, the split of the
structure into a first cacheline (64 bytes) that is mostly write-once and
read afterwards, and a second cacheline that is read-write, was intended
purely as an optimization.  But as has been reported, with vectorization now
enabled at -O2, the compiler can rely on the declared alignment and emit
aligned vector stores of 16 bytes or more.
When posix_memalign, aligned_alloc, memalign or a similar API is not
available, alloc.c emulates it, but the emulation has to allocate extra
memory for the dynamic realignment.
So, for the case where GOMP_HAVE_EFFICIENT_ALIGNED_ALLOC is not defined, this
patch stops using the aligned (64) attribute in the middle of the structure
and instead inserts padding that puts the second half of the structure at
offset 64 bytes.
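
The padding approach described above can be sketched roughly as follows.
This is only an illustration under assumptions: the type and field names
(ws_first_cacheline_sketch, ws_sketch, lock as a plain int) are hypothetical
stand-ins, not the actual libgomp.h definitions, and the real structure has
more members.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the mostly write-once fields that are meant
   to occupy the first 64 bytes.  */
struct ws_first_cacheline_sketch
{
  int sched;
  int mode;
  long chunk_size;
  long end;
  long incr;
};

/* Hypothetical stand-in for the whole work-share structure.  No aligned
   attribute is used; explicit padding pushes the read-write part to
   offset 64 instead.  */
struct ws_sketch
{
  struct ws_first_cacheline_sketch first;
  /* Fill the remainder of the first 64 bytes.  */
  char pad[64 - sizeof (struct ws_first_cacheline_sketch)];
  /* Start of the read-write half.  */
  int lock;
  unsigned threads_completed;
};

/* Poor man's static assertion: the array size is -1, and the translation
   unit fails to compile, if lock is not at offset 64.  */
extern char ws_sketch_check[offsetof (struct ws_sketch, lock) == 64 ? 1 : -1];

/* Runtime counterpart of the compile-time check above.  */
size_t
ws_sketch_lock_offset (void)
{
  assert (offsetof (struct ws_sketch, lock) == 64);
  return offsetof (struct ws_sketch, lock);
}
```

Because the padding only fixes the offset inside the structure, the type
itself keeps its natural alignment, so the compiler has no grounds to assume
that an object obtained from a plain allocator is 64-byte aligned.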

When GOMP_HAVE_EFFICIENT_ALIGNED_ALLOC is defined, the structure was usually
allocated with the required alignment, but in the orphaned case it could
still be allocated with plain gomp_malloc, which does not guarantee that
alignment.
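
To illustrate why a plain allocation is not sufficient for an object that is
declared 64-byte aligned, and why emulating an aligned allocation costs extra
memory, here is a minimal sketch of the over-allocate-and-round-up technique.
The names sketch_aligned_alloc and sketch_aligned_free are hypothetical; this
is not the code in alloc.c, only an assumed illustration of the general idea.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: return SIZE bytes aligned to ALIGN, which must be a
   power of two.  The block is over-allocated so that an aligned address can
   always be found inside it, and the pointer returned by malloc is stored
   just before that address so it can be freed later.  */
void *
sketch_aligned_alloc (size_t align, size_t size)
{
  if (align < sizeof (void *))
    align = sizeof (void *);

  /* Extra memory: room for the saved pointer plus worst-case slack.  */
  void *raw = malloc (size + align + sizeof (void *));
  if (raw == NULL)
    return NULL;

  /* Skip the slot for the saved pointer, then round up to ALIGN.  */
  uintptr_t addr = ((uintptr_t) raw + sizeof (void *) + align - 1)
		   & ~(uintptr_t) (align - 1);
  ((void **) addr)[-1] = raw;
  return (void *) addr;
}

/* Hypothetical counterpart that releases the original malloc block.  */
void
sketch_aligned_free (void *ptr)
{
  if (ptr != NULL)
    free (((void **) ptr)[-1]);
}
```

A caller that needs a 64-byte-aligned object would request it as
sketch_aligned_alloc (64, size) and release it with sketch_aligned_free.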

2021-10-20  Jakub Jelinek  <jakub@redhat.com>

	PR libgomp/102838
	* libgomp.h (struct gomp_work_share_1st_cacheline): New type.
	(struct gomp_work_share): Only use aligned(64) attribute if
	GOMP_HAVE_EFFICIENT_ALIGNED_ALLOC is defined, otherwise just
	add padding before lock to ensure lock is at offset 64 bytes
	into the structure.
	(gomp_workshare_struct_check1, gomp_workshare_struct_check2):
	New poor man's static assertions.
	* work.c (gomp_work_share_start): Use gomp_aligned_alloc instead of
	gomp_malloc if GOMP_HAVE_EFFICIENT_ALIGNED_ALLOC.
2021-10-20 09:34:51 +02:00
config openmp: Fix handling of numa_domains(1) 2021-10-18 15:00:46 +02:00
plugin
testsuite Disallow loop rotation and loop header crossing in jump threaders. 2021-10-20 07:07:35 +02:00
.gitattributes
acc_prof.h
acinclude.m4
aclocal.m4
affinity-fmt.c openmp: Avoid PLT relocations for omp_* symbols in libgomp 2021-10-01 10:42:07 +02:00
affinity.c
alloc.c
allocator.c libgomp: Add tests for omp_atv_serialized and deprecate omp_atv_sequential. 2021-10-11 04:34:51 -07:00
atomic.c
barrier.c
ChangeLog Daily bump. 2021-10-19 00:16:23 +00:00
ChangeLog.graphite
config.h.in
configure libgomp: Only check for 2*sizeof(void*) int type with Fortran [PR96661] 2021-09-28 15:15:47 +02:00
configure.ac libgomp: Only check for 2*sizeof(void*) int type with Fortran [PR96661] 2021-09-28 15:15:47 +02:00
configure.tgt
critical.c
env.c openmp: Handle OpenMP 5.1 simplified OMP_PLACES syntax 2021-10-15 16:35:57 +02:00
error.c
fortran.c openmp: Add omp_set_num_teams, omp_get_max_teams, omp_[gs]et_teams_thread_limit 2021-10-11 12:20:22 +02:00
hashtab.h
icv-device.c openmp: Avoid PLT relocations for omp_* symbols in libgomp 2021-10-01 10:42:07 +02:00
icv.c openmp: Add omp_set_num_teams, omp_get_max_teams, omp_[gs]et_teams_thread_limit 2021-10-11 12:20:22 +02:00
iter_ull.c
iter.c
libgomp_f.h.in
libgomp_g.h
libgomp-plugin.c
libgomp-plugin.h
libgomp.h openmp: Fix up struct gomp_work_share handling [PR102838] 2021-10-20 09:34:51 +02:00
libgomp.map openmp: Add omp_set_num_teams, omp_get_max_teams, omp_[gs]et_teams_thread_limit 2021-10-11 12:20:22 +02:00
libgomp.spec.in
libgomp.texi openmp: Handle OpenMP 5.1 simplified OMP_PLACES syntax 2021-10-15 16:35:57 +02:00
lock.c
loop_ull.c
loop.c
Makefile.am
Makefile.in
oacc-async.c
oacc-cuda.c
oacc-host.c
oacc-init.c
oacc-int.h
oacc-mem.c
oacc-parallel.c
oacc-plugin.c
oacc-plugin.h
oacc-profiling.c
oacc-target.c
omp_lib.f90.in libgomp: Add tests for omp_atv_serialized and deprecate omp_atv_sequential. 2021-10-11 04:34:51 -07:00
omp_lib.h.in openmp: Add omp_set_num_teams, omp_get_max_teams, omp_[gs]et_teams_thread_limit 2021-10-11 12:20:22 +02:00
omp.h.in libgomp: Add tests for omp_atv_serialized and deprecate omp_atv_sequential. 2021-10-11 04:34:51 -07:00
openacc_lib.h
openacc.f90
openacc.h
ordered.c
parallel.c
priority_queue.c
priority_queue.h
scope.c
sections.c
secure_getenv.h
single.c
splay-tree.c
splay-tree.h
target.c libgomp: Release device lock on cbuf error path 2021-10-12 06:50:26 -07:00
task.c
taskloop.c
team.c Replace VRP threader with a hybrid forward threader. 2021-09-27 17:39:51 +02:00
teams.c openmp: Add omp_set_num_teams, omp_get_max_teams, omp_[gs]et_teams_thread_limit 2021-10-11 12:20:22 +02:00
work.c openmp: Fix up struct gomp_work_share handling [PR102838] 2021-10-20 09:34:51 +02:00