As mentioned in the PR, the latest Intel SDM has added:
"Processors that enumerate support for Intel® AVX (by setting the feature flag CPUID.01H:ECX.AVX[bit 28])
guarantee that the 16-byte memory operations performed by the following instructions will always be
carried out atomically:
• MOVAPD, MOVAPS, and MOVDQA.
• VMOVAPD, VMOVAPS, and VMOVDQA when encoded with VEX.128.
• VMOVAPD, VMOVAPS, VMOVDQA32, and VMOVDQA64 when encoded with EVEX.128 and k0 (masking disabled).
(Note that these instructions require the linear addresses of their memory operands to be 16-byte
aligned.)"
The following patch deals with it just on the libatomic library side so far,
currently (since ~ 2017) we emit all the __atomic_* 16-byte builtins as
library calls since and this is something that we can hopefully backport.
The patch simply introduces yet another ifunc variant that takes priority
over the pure CMPXCHG16B one, one that checks AVX and CMPXCHG16B bits and
on non-Intel clears the AVX bit during detection for now (if AMD comes
with the same guarantee, we could revert the config/x86/init.c hunk),
which implements 16-byte atomic load as vmovdqa and 16-byte atomic store
as vmovdqa followed by mfence.
2022-03-17 Jakub Jelinek <jakub@redhat.com>
PR target/104688
* Makefile.am (IFUNC_OPTIONS): Change on x86_64 to -mcx16 -mcx16.
(libatomic_la_LIBADD): Add $(addsuffix _16_2_.lo,$(SIZEOBJS)) for
x86_64.
* Makefile.in: Regenerated.
* config/x86/host-config.h (IFUNC_COND_1): For x86_64 define to
both AVX and CMPXCHG16B bits.
(IFUNC_COND_2): Define.
(IFUNC_NCOND): For x86_64 define to 2 * (N == 16).
(MAYBE_HAVE_ATOMIC_CAS_16, MAYBE_HAVE_ATOMIC_EXCHANGE_16,
MAYBE_HAVE_ATOMIC_LDST_16): Define to IFUNC_COND_2 rather than
IFUNC_COND_1.
(HAVE_ATOMIC_CAS_16): Redefine to 1 whenever IFUNC_ALT != 0.
(HAVE_ATOMIC_LDST_16): Redefine to 1 whenever IFUNC_ALT == 1.
(atomic_compare_exchange_n): Define whenever IFUNC_ALT != 0
on x86_64 for N == 16.
(__atomic_load_n, __atomic_store_n): Redefine whenever IFUNC_ALT == 1
on x86_64 for N == 16.
(atomic_load_n, atomic_store_n): New functions.
* config/x86/init.c (__libat_feat1_init): On x86_64 clear bit_AVX
if CPU vendor is not Intel.
Resolves:
PR bootstrap/101379 - libatomic arm build failure after r12-2132 due to -Warray-bounds on a constant address
libatomic/ChangeLog:
PR bootstrap/101379
* config/linux/arm/host-config.h (__kernel_helper_version): New
function. Adjust shadow macro.
AIX caches shared objects in archives with read-other permission.
libgomp and libatomic might be in use during the build or testing, which
may cause archiver operations on them to fail. This patch adjusts the
Makefile fragments to delete the library archives before creating fresh
archives containing both the 32 bit and 64 bit shared objects.
libatomic/ChangeLog:
2020-10-11 Clement Chigot <clement.chigot@atos.net>
* config/t-aix: Delete and recreate libatomic before creating
FAT library.
libgomp/ChangeLog:
2020-10-11 Clement Chigot <clement.chigot@atos.net>
* config/t-aix: Delete and recreate libgomp before creating
FAT library.
AIX FAT libraries should be built with the version of AR chosen by configure.
The GNU Make $(AR) variable includes the AIX -X32_64 option needed
by the default Makefile rules to accept both 32 bit and 64 bit object files.
The -X32_64 option conflicts with ar archiving objects of the same name
used to build FAT libraries.
This patch changes the Makefile fragments for AIX FAT libraries to use $(AR),
but strips the -X32_64 option from the Make variable.
libgcc/ChangeLog:
2020-09-27 Clement Chigot <clement.chigot@atos.net>
* config/rs6000/t-slibgcc-aix: Use $(AR) without -X32_64.
libatomic/ChangeLog:
2020-09-27 Clement Chigot <clement.chigot@atos.net>
* config/t-aix: Use $(AR) without -X32_64.
libgomp/ChangeLog:
2020-09-27 Clement Chigot <clement.chigot@atos.net>
* config/t-aix: Use $(AR) without -X32_64.
libstdc++-v3/ChangeLog:
2020-09-27 Clement Chigot <clement.chigot@atos.net>
* config/os/aix/t-aix: Use $(AR) without -X32_64.
libgfortran/ChangeLog:
2020-09-27 Clement Chigot <clement.chigot@atos.net>
* config/t-aix: Use $(AR) without -X32_64.
Add nvptx support to libatomic.
Given that atomic_test_and_set is not implemented for nvptx (PR96964), the
compiler translates __atomic_test_and_set falling back onto the "Failing all
else, assume a single threaded environment and simply perform the operation"
case in expand_atomic_test_and_set, so it doesn't map onto an actual atomic
operation.
Still, that counts as supported for the configure test of libatomic, so we
end up with HAVE_ATOMIC_TAS_1/2/4/8/16 == 1, and the corresponding
__atomic_test_and_set_1/2/4/8/16 in libatomic all using that non-atomic
implementation.
Fix this by adding an atomic_test_and_set expansion for nvptx, that uses
libatomics __atomic_test_and_set_1.
This again makes the configure tests for HAVE_ATOMIC_TAS_1/2/4/8/16 fail, so
instead we use this case in tas_n.c:
...
/* If this type is smaller than word-sized, fall back to a word-sized
compare-and-swap loop. */
bool
SIZE(libat_test_and_set) (UTYPE *mptr, int smodel)
...
which for __atomic_test_and_set_8 uses INVERT_MASK_8.
Add INVERT_MASK_8 in libatomic_i.h, as well as MASK_8.
Tested libatomic testsuite on nvptx.
gcc/ChangeLog:
PR target/96964
* config/nvptx/nvptx.md (define_expand "atomic_test_and_set"): New
expansion.
libatomic/ChangeLog:
PR target/96898
* configure.tgt: Add nvptx.
* libatomic_i.h (MASK_8, INVERT_MASK_8): New macro definition.
* config/nvptx/host-config.h: New file.
* config/nvptx/lock.c: New file.
The FAT libraries config fragments need to know which library is native
and which is a multilib to choose the correct multilib from which to
append the additional object file or shared object file. Testing the
top-level archive is fragile because it will fail if rebuilding. This
patch tests the compiler preprocessing macros for the 64 bit AIX specific
__64BIT__ to determine the native mode of the compiler in MULTILIBTOP.
2020-07-14 David Edelsohn <dje.gcc@gmail.com>
libatomic/ChangeLog
* config/t-aix: Set BITS from compiler cpp macro.
libgcc/ChangeLog
* config/rs6000/t-slibgcc-aix: Set BITS from compiler cpp macro.
libgfortran/ChangeLog
* config/t-aix: Set BITS from compiler cpp macro.
libgomp/ChangeLog
* config/t-aix: Set BITS from compiler cpp macro.
libstdc++-v3/ChangeLog
* config/os/aix/t-aix: Set BITS from compiler cpp macro.
This patch adds the ability to configure GCC on AIX to build as a
64 bit application and to build target libraries "FAT" libraries in both
32 bit and 64 bit mode.
The patch adds makefile fragment hooks to target libraries that allows
them to include target-specific rules. The target specific rules for
AIX place both 32 bit and 64 bit objects and shared objects
in archives at the top-level, not multilib subdirectories. The
multilibs are built in subdirectories, but must be combined during the
last parts of the target library build process. Because of the way
that GCC bootstrap works, the libraries must be combined during the
multiple stages of GCC bootstrap, not solely when installed in the
final destination, so the libraries are correct at the end of
each target library build stage, not solely an install recipe.
gcc/ChangeLog
2020-06-21 David Edelsohn <dje.gcc@gmail.com>
* config.gcc: Use t-aix64, biarch64 and default64 for cpu_is_64bit.
* config/rs6000/aix72.h (ASM_SPEC): Remove aix64 option.
(ASM_SPEC32): New.
(ASM_SPEC64): New.
(ASM_CPU_SPEC): Remove vsx and altivec options.
(CPP_SPEC_COMMON): Rename from CPP_SPEC.
(CPP_SPEC32): New.
(CPP_SPEC64): New.
(CPLUSPLUS_CPP_SPEC): Rename to CPLUSPLUS_CPP_SPEC_COMMON..
(TARGET_DEFAULT): Only define if not BIARCH.
(LIB_SPEC_COMMON): Rename from LIB_SPEC.
(LIB_SPEC32): New.
(LIB_SPEC64): New.
(LINK_SPEC_COMMON): Rename from LINK_SPEC.
(LINK_SPEC32): New.
(LINK_SPEC64): New.
(STARTFILE_SPEC): Add 64 bit version of crtcxa and crtdbase.
(ASM_SPEC): Define 32 and 64 bit alternatives using DEFAULT_ARCH64_P.
(CPP_SPEC): Same.
(CPLUSPLUS_CPP_SPEC): Same.
(LIB_SPEC): Same.
(LINK_SPEC): Same.
(SUBTARGET_EXTRA_SPECS): Add new 32/64 specs.
* config/rs6000/defaultaix64.h: New file.
* config/rs6000/t-aix64: New file.
libgcc/ChangeLog
2020-06-21 David Edelsohn <dje.gcc@gmail.com>
* config.host (extra_parts): Add crtcxa_64 and crtdbase_64.
* config/rs6000/t-aix-cxa: Explicitly compile 32 bit with -maix32
and 64 bit with -maix64.
* config/rs6000/t-slibgcc-aix: Remove extra @multilib_dir@ level.
Build and install AIX-style FAT libraries.
libgomp/ChangeLog
2020-06-21 David Edelsohn <dje.gcc@gmail.com>
* Makefile.am (tmake_file): Build and install AIX-style FAT libraries.
* Makefile.in: Regenerate
* configure.ac (tmake_file): Substitute.
* configure: Regenerate.
* configure.tgt (powerpc-ibm-aix*): Define tmake_file.
* config/t-aix: New file.
libstdc++-v3/ChangeLog
2020-06-21 David Edelsohn <dje.gcc@gmail.com>
* Makefile.am (tmake_file): Build and install AIX-style FAT libraries.
* Makefile.in: Regenerate.
* configure.ac (tmake_file): Substitute.
* configure: Regenerate.
* configure.host (aix*): Define tmake_file.
* config/os/aix/t-aix: New file.
libatomic/ChangeLog
2020-06-21 David Edelsohn <dje.gcc@gmail.com>
* Makefile.am (tmake_file): Build and install AIX-style FAT libraries.
* Makefile.in: Regenerate.
* configure.ac (tmake_file): Substitute.
* configure: Regenerate.
* configure.tgt (powerpc-ibm-aix*): Define tmake_file.
* config/t-aix: New file.
libgfortran/ChangeLog
2020-06-21 David Edelsohn <dje.gcc@gmail.com>
* Makefile.am (tmake_file): Build and install AIX-style FAT libraries.
* Makefile.in: Regenerate.
* configure.ac (tmake_file): Substitute.
* configure: Regenerate.
* configure.host: Add system configury stanza. Define tmake_file.
* config/t-aix: New file.
Windows ABI (MinGW) is different than Linux ABI when bitfileds are involved.
The following patch adds __attribute__ ((gcc_struct)) to struct fenv in order
to match the layout of x87 state image in memory.
2020-06-01 Uroš Bizjak <ubizjak@gmail.com>
libatomic/ChangeLog:
* config/x86/fenv.c (struct fenv): Add __attribute__ ((gcc_struct)).
libgcc/ChangeLog:
* config/i386/sfp-exceptions.c (struct fenv):
Add __attribute__ ((gcc_struct)).
libgfortran/ChangeLog:
PR libfortran/95418
* config/fpu-387.h (struct fenv): Add __attribute__ ((gcc_struct)).
Introduce math_force_eval_div to use generic division to generate
INEXACT as well as INVALID and DIVZERO exceptions.
libgcc/ChangeLog:
* config/i386/sfp-exceptions.c (__math_force_eval): Remove.
(__math_force_eval_div): New define.
(__sfp_handle_exceptions): Use __math_force_eval_div to use
generic division to generate INVALID, DIVZERO and INEXACT
exceptions.
libatomic/ChangeLog:
* config/x86/fenv.c (__math_force_eval): Remove.
(__math_force_eval_div): New define.
(__atomic_deraiseexcept): Use __math_force_eval_div to use
generic division to generate INVALID, DIVZERO and INEXACT
exceptions.
libgfortran/ChangeLog:
* config/fpu-387.h (__math_force_eval): Remove.
(__math_force_eval_div): New define.
(local_feraiseexcept): Use __math_force_eval_div to use
generic division to generate INVALID, DIVZERO and INEXACT
exceptions.
(struct fenv): Define named struct instead of typedef.
Introduce math_force_eval to evaluate generic division to generate
INVALID and DIVZERO exceptions.
libgcc/ChangeLog:
* config/i386/sfp-exceptions.c (__math_force_eval): New define.
(__sfp_handle_exceptions): Use __math_force_eval to evaluete
generic division to generate INVALID and DIVZERO exceptions.
libatomic/ChangeLog:
* config/x86/fenv.c (__math_force_eval): New define.
(__atomic_feraiseexcept): Use __math_force_eval to evaluete
generic division to generate INVALID and DIVZERO exceptions.
libgfortran/ChangeLog:
* config/fpu-387.h (__math_force_eval): New define.
(local_feraiseexcept): Use __math_force_eval to evaluete
generic division to generate INVALID and DIVZERO exceptions.
According to "Intel 64 and IA32 Arch SDM, Vol. 3:
"Because SIMD floating-point exceptions are precise and occur immediately,
the situation does not arise where an x87 FPU instruction, a WAIT/FWAIT
instruction, or another SSE/SSE2/SSE3 instruction will catch a pending
unmasked SIMD floating-point exception."
Remove unneeded assignments to volatile memory.
libgcc/ChangeLog:
* config/i386/sfp-exceptions.c (__sfp_handle_exceptions) [__SSE_MATH__]:
Remove unneeded assignments to volatile memory.
libatomic/ChangeLog:
* config/x86/fenv.c (__atomic_feraiseexcept) [__SSE_MATH__]:
Remove unneeded assignments to volatile memory.
libgfortran/ChangeLog:
* config/fpu-387.h (local_feraiseexcept) [__SSE_MATH__]:
Remove unneeded assignments to volatile memory.
The compiler builtin will use the hardware instruction cdsg if the
memory operand is properly aligned and will fall back to the
library call otherwise.
In case the compiler for one part is able to detect that the
location is aligned and fails to do so for another usage of the hw
instruction and the sw fall back would be mixed on the same memory
location. To avoid this the library fall back also has to use the
hardware instruction if possible.
libatomic/ChangeLog:
2018-03-09 Andreas Krebbel <krebbel@linux.vnet.ibm.com>
* config/s390/exch_n.c: New file.
* configure.tgt: Add the config directory for s390.
From-SVN: r258384
gcc/
* builtins.c (fold_builtin_atomic_always_lock_free): Make "lock-free"
conditional on existance of a fast atomic load.
* optabs-query.c (can_atomic_load_p): New function.
* optabs-query.h (can_atomic_load_p): Declare it.
* optabs.c (expand_atomic_exchange): Always delegate to libatomic if
no fast atomic load is available for the particular size of access.
(expand_atomic_compare_and_swap): Likewise.
(expand_atomic_load): Likewise.
(expand_atomic_store): Likewise.
(expand_atomic_fetch_op): Likewise.
* testsuite/lib/target-supports.exp
(check_effective_target_sync_int_128): Remove x86 because it provides
no fast atomic load.
(check_effective_target_sync_int_128_runtime): Likewise.
libatomic/
* acinclude.m4: Add #define FAST_ATOMIC_LDST_*.
* auto-config.h.in: Regenerate.
* config/x86/host-config.h (FAST_ATOMIC_LDST_16): Define to 0.
(atomic_compare_exchange_n): New.
* glfree.c (EXACT, LARGER): Change condition and add comments.
From-SVN: r245098
ARM libatomic inline asm uses sel, uadd8, uadd16 instructions
which are only available if __ARM_FEATURE_SIMD32 is defined.
libatomic/
2017-01-30 Szabolcs Nagy <szabolcs.nagy@arm.com>
PR target/78945
* config/arm/exch_n.c (libat_exchange): Check __ARM_FEATURE_SIMD32.
From-SVN: r245023