f3cd0904a4
362 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
Joseph Myers
|
2ce7ba7d15 |
Move SNAN_TESTS_* out of math-tests.h.
Continuing moving macros out of math-tests.h to smaller headers following typo-proof conventions instead of using #ifndef, this patch moves the SNAN_TESTS_* macros for individual types out to their own sysdeps header (while the type-generic SNAN_TESTS wrapper for those macros remains in math-tests.h). Tested for x86_64 and x86, and with build-many-glibcs.py. * sysdeps/generic/math-tests-snan.h: New file. * sysdeps/generic/math-tests.h: Include <math-tests-snan.h>. (SNAN_TESTS_float): Do not define here. (SNAN_TESTS_double): Likewise. (SNAN_TESTS_long_double): Likewise. (SNAN_TESTS_float128): Likewise. * sysdeps/i386/fpu/math-tests-snan.h: New file. * sysdeps/i386/fpu/math-tests.h: Remove file. * sysdeps/ia64/math-tests-snan.h: New file. * sysdeps/ia64/math-tests.h: Remove file. * sysdeps/x86/math-tests.h: Likewise. * sysdeps/x86_64/fpu/math-tests-snan.h: New file. |
||
Wilco Dijkstra
|
ea5c662c62 |
Improve performance of sincosf
This patch is a complete rewrite of sincosf. The new version is significantly faster, as well as simple and accurate. The worst-case ULP is 0.5607, maximum relative error is 0.5303 * 2^-23 over all 4 billion inputs. In non-nearest rounding modes the error is 1ULP. The algorithm uses 3 main cases: small inputs which don't need argument reduction, small inputs which need a simple range reduction and large inputs requiring complex range reduction. The code uses approximate integer comparisons to quickly decide between these cases. The small range reducer uses a single reduction step to handle values up to 120.0. It is fastest on targets which support inlined round instructions. The large range reducer uses integer arithmetic for simplicity. It does a 32x96 bit multiply to compute a 64-bit modulo result. This is more than accurate enough to handle the worst-case cancellation for values close to an integer multiple of PI/4. It could be further optimized, however it is already much faster than necessary. sincosf throughput gains on Cortex-A72: * |x| < 0x1p-12 : 1.6x * |x| < M_PI_4 : 1.7x * |x| < 2 * M_PI: 1.5x * |x| < 120.0 : 1.8x * |x| < Inf : 2.3x * math/Makefile: Add s_sincosf_data.c. * sysdeps/ia64/fpu/s_sincosf_data.c: New file. * sysdeps/ieee754/flt-32/s_sincosf.h (abstop12): Add new function. (sincosf_poly): Likewise. (reduce_small): Likewise. (reduce_large): Likewise. * sysdeps/ieee754/flt-32/s_sincosf.c (sincosf): Rewrite. * sysdeps/ieee754/flt-32/s_sincosf_data.c: New file with sincosf data. * sysdeps/m68k/m680x0/fpu/s_sincosf_data.c: New file. * sysdeps/x86_64/fpu/s_sincosf_data.c: New file. |
||
H.J. Lu
|
67c0579669 |
Mark _init and _fini as hidden [BZ #23145]
_init and _fini are special functions provided by glibc for linker to define DT_INIT and DT_FINI in executable and shared library. They should never be put in dynamic symbol table. This patch marks them as hidden to remove them from dynamic symbol table. Tested with build-many-glibcs.py. [BZ #23145] * elf/Makefile (tests-special): Add $(objpfx)check-initfini.out. ($(all-built-dso:=.dynsym): New target. (common-generated): Add $(all-built-dso:$(common-objpfx)%=%.dynsym). ($(objpfx)check-initfini.out): New target. (generated): Add check-initfini.out. * scripts/check-initfini.awk: New file. * sysdeps/aarch64/crti.S (_init): Mark as hidden. (_fini): Likewise. * sysdeps/alpha/crti.S (_init): Mark as hidden. (_fini): Likewise. * sysdeps/arm/crti.S (_init): Mark as hidden. (_fini): Likewise. * sysdeps/hppa/crti.S (_init): Mark as hidden. (_fini): Likewise. * sysdeps/i386/crti.S (_init): Mark as hidden. (_fini): Likewise. * sysdeps/ia64/crti.S (_init): Mark as hidden. (_fini): Likewise. * sysdeps/m68k/crti.S (_init): Mark as hidden. (_fini): Likewise. * sysdeps/microblaze/crti.S (_init): Mark as hidden. (_fini): Likewise. * sysdeps/mips/mips32/crti.S (_init): Mark as hidden. (_fini): Likewise. * sysdeps/mips/mips64/n32/crti.S (_init): Mark as hidden. (_fini): Likewise. * sysdeps/mips/mips64/n64/crti.S (_init): Mark as hidden. (_fini): Likewise. * sysdeps/nios2/crti.S (_init): Mark as hidden. (_fini): Likewise. * sysdeps/powerpc/powerpc32/crti.S (_init): Mark as hidden. (_fini): Likewise. * sysdeps/powerpc/powerpc64/crti.S (_init): Mark as hidden. (_fini): Likewise. * sysdeps/s390/s390-32/crti.S (_init): Mark as hidden. (_fini): Likewise. * sysdeps/s390/s390-64/crti.S (_init): Mark as hidden. (_fini): Likewise. * sysdeps/sh/crti.S (_init): Mark as hidden. (_fini): Likewise. * sysdeps/sparc/crti.S (_init): Mark as hidden. (_fini): Likewise. * sysdeps/x86_64/crti.S (_init): Mark as hidden. (_fini): Likewise. |
||
Maciej W. Rozycki
|
10a446ddcc |
elf: Unify symbol address run-time calculation [BZ #19818]
Wrap symbol address run-time calculation into a macro and use it throughout, replacing inline calculations. There are a couple of variants, most of them different in a functionally insignificant way. Most calculations are right following RESOLVE_MAP, at which point either the map or the symbol returned can be checked for validity as the macro sets either both or neither. In some places both the symbol and the map has to be checked however. My initial implementation therefore always checked both, however that resulted in code larger by as much as 0.3%, as many places know from elsewhere that no check is needed. I have decided the size growth was unacceptable. Having looked closer I realized that it's the map that is the culprit. Therefore I have modified LOOKUP_VALUE_ADDRESS to accept an additional boolean argument telling it to access the map without checking it for validity. This in turn has brought quite nice results, with new code actually being smaller for i686, and MIPS o32, n32 and little-endian n64 targets, unchanged in size for x86-64 and, unusually, marginally larger for big-endian MIPS n64, as follows: i686: text data bss dec hex filename 152255 4052 192 156499 26353 ld-2.27.9000-base.so 152159 4052 192 156403 262f3 ld-2.27.9000-elf-symbol-value.so MIPS/o32/el: text data bss dec hex filename 142906 4396 260 147562 2406a ld-2.27.9000-base.so 142890 4396 260 147546 2405a ld-2.27.9000-elf-symbol-value.so MIPS/n32/el: text data bss dec hex filename 142267 4404 260 146931 23df3 ld-2.27.9000-base.so 142171 4404 260 146835 23d93 ld-2.27.9000-elf-symbol-value.so MIPS/n64/el: text data bss dec hex filename 149835 7376 408 157619 267b3 ld-2.27.9000-base.so 149787 7376 408 157571 26783 ld-2.27.9000-elf-symbol-value.so MIPS/o32/eb: text data bss dec hex filename 142870 4396 260 147526 24046 ld-2.27.9000-base.so 142854 4396 260 147510 24036 ld-2.27.9000-elf-symbol-value.so MIPS/n32/eb: text data bss dec hex filename 142019 4404 260 146683 23cfb ld-2.27.9000-base.so 141923 4404 260 146587 23c9b ld-2.27.9000-elf-symbol-value.so MIPS/n64/eb: text data bss dec hex filename 149763 7376 408 157547 2676b ld-2.27.9000-base.so 149779 7376 408 157563 2677b ld-2.27.9000-elf-symbol-value.so x86-64: text data bss dec hex filename 148462 6452 400 155314 25eb2 ld-2.27.9000-base.so 148462 6452 400 155314 25eb2 ld-2.27.9000-elf-symbol-value.so [BZ #19818] * sysdeps/generic/ldsodefs.h (LOOKUP_VALUE_ADDRESS): Add `set' parameter. (SYMBOL_ADDRESS): New macro. [!ELF_FUNCTION_PTR_IS_SPECIAL] (DL_SYMBOL_ADDRESS): Use SYMBOL_ADDRESS for symbol address calculation. * elf/dl-runtime.c (_dl_fixup): Likewise. (_dl_profile_fixup): Likewise. * elf/dl-symaddr.c (_dl_symbol_address): Likewise. * elf/rtld.c (dl_main): Likewise. * sysdeps/aarch64/dl-machine.h (elf_machine_rela): Likewise. * sysdeps/alpha/dl-machine.h (elf_machine_rela): Likewise. * sysdeps/arm/dl-machine.h (elf_machine_rel): Likewise. (elf_machine_rela): Likewise. * sysdeps/hppa/dl-machine.h (elf_machine_rela): Likewise. * sysdeps/hppa/dl-symaddr.c (_dl_symbol_address): Likewise. * sysdeps/i386/dl-machine.h (elf_machine_rel): Likewise. (elf_machine_rela): Likewise. * sysdeps/ia64/dl-machine.h (elf_machine_rela): Likewise. * sysdeps/m68k/dl-machine.h (elf_machine_rela): Likewise. * sysdeps/microblaze/dl-machine.h (elf_machine_rela): Likewise. * sysdeps/mips/dl-machine.h (ELF_MACHINE_BEFORE_RTLD_RELOC): Likewise. (elf_machine_reloc): Likewise. (elf_machine_got_rel): Likewise. * sysdeps/mips/dl-trampoline.c (__dl_runtime_resolve): Likewise. * sysdeps/nios2/dl-machine.h (elf_machine_rela): Likewise. * sysdeps/powerpc/powerpc32/dl-machine.h (elf_machine_rela): Likewise. * sysdeps/powerpc/powerpc64/dl-machine.h (elf_machine_rela): Likewise. * sysdeps/riscv/dl-machine.h (elf_machine_rela): Likewise. * sysdeps/s390/s390-32/dl-machine.h (elf_machine_rela): Likewise. * sysdeps/s390/s390-64/dl-machine.h (elf_machine_rela): Likewise. * sysdeps/sh/dl-machine.h (elf_machine_rela): Likewise. * sysdeps/sparc/sparc32/dl-machine.h (elf_machine_rela): Likewise. * sysdeps/sparc/sparc64/dl-machine.h (elf_machine_rela): Likewise. * sysdeps/tile/dl-machine.h (elf_machine_rela): Likewise. * sysdeps/x86_64/dl-machine.h (elf_machine_rela): Likewise. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> |
||
Joseph Myers
|
ffec7b2740 |
Use x86_64 backtrace as generic version.
No glibc configuration uses the present debug/backtrace.c, whereas several #include the x86_64 version. The x86_64 version is effectively a generic one (using _Unwind_Backtrace from libgcc, which works much more reliably than the built-in functions used by debug/backtrace.c). This patch moves it to debug/backtrace.c and removes all the #includes of the x86_64 version from other architectures which are no longer required. I do not know whether all the other architecture-specific backtrace implementations that are based on _Unwind_Backtrace are required, or whether, where their differences from the generic version do something useful, suitable hooks could be added to the generic version to reduce the duplication involved. Tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by this patch. * sysdeps/x86_64/backtrace.c: Move to .... * debug/backtrace.c: ... here. * sysdeps/aarch64/backtrace.c: Remove file. * sysdeps/alpha/backtrace.c: Likewise. * sysdeps/hppa/backtrace.c: Likewise. * sysdeps/ia64/backtrace.c: Likewise. * sysdeps/mips/backtrace.c: Likewise. * sysdeps/nios2/backtrace.c: Likewise. * sysdeps/riscv/backtrace.c: Likewise. * sysdeps/sh/backtrace.c: Likewise. * sysdeps/tile/backtrace.c: Likewise. |
||
Samuel Thibault
|
a5df0318ef |
hurd: add gscope support
* elf/dl-support.c [!THREAD_GSCOPE_IN_TCB] (_dl_thread_gscope_count): Define variable. * sysdeps/generic/ldsodefs.h [!THREAD_GSCOPE_IN_TCB] (struct rtld_global): Add _dl_thread_gscope_count member. * sysdeps/mach/hurd/tls.h: Include <atomic.h>. [!defined __ASSEMBLER__] (THREAD_GSCOPE_GLOBAL, THREAD_GSCOPE_SET_FLAG, THREAD_GSCOPE_RESET_FLAG, THREAD_GSCOPE_WAIT): Define macros. * sysdeps/generic/tls.h: Document THREAD_GSCOPE_IN_TCB. * sysdeps/aarch64/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1. * sysdeps/alpha/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1. * sysdeps/arm/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1. * sysdeps/hppa/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1. * sysdeps/i386/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1. * sysdeps/ia64/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1. * sysdeps/m68k/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1. * sysdeps/microblaze/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1. * sysdeps/mips/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1. * sysdeps/nios2/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1. * sysdeps/powerpc/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1. * sysdeps/riscv/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1. * sysdeps/s390/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1. * sysdeps/sh/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1. * sysdeps/sparc/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1. * sysdeps/tile/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1. * sysdeps/x86_64/nptl/tls.h: Define THREAD_GSCOPE_IN_TCB to 1. |
||
Wilco Dijkstra
|
610ee1fc93 |
Remove mplog and mpexp
Remove the now unused mplog and mpexp files. * math/Makefile: Remove mpexp.c and mplog.c * sysdeps/i386/fpu/mpexp.c: Delete file. * sysdeps/i386/fpu/mplog.c: Likewise. * sysdeps/ia64/fpu/mpexp.c: Likewise. * sysdeps/ia64/fpu/mplog.c: Likewise. * sysdeps/ieee754/dbl-64/e_exp.c: Remove mention of mpexp and mplog. * sysdeps/ieee754/dbl-64/mpa.h (__pow_mp): Remove unused function. * sysdeps/ieee754/dbl-64/mpexp.c: Delete file. * sysdeps/ieee754/dbl-64/mplog.c: Likewise. * sysdeps/m68k/m680x0/fpu/mpexp.c: Likewise. * sysdeps/m68k/m680x0/fpu/mplog.c: Likewise. * sysdeps/x86_64/fpu/multiarch/Makefile: Remove mpexp* and mplog*. * sysdeps/x86_64/fpu/multiarch/e_log-avx.c: Remove unused defines. * sysdeps/x86_64/fpu/multiarch/e_log-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/e_log-fma4.c: Likewise. * sysdeps/x86_64/fpu/multiarch/mpexp-avx.c: Delete file. * sysdeps/x86_64/fpu/multiarch/mpexp-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/mpexp-fma4.c: Likewise. * sysdeps/x86_64/fpu/multiarch/mplog-avx.c: Likewise. * sysdeps/x86_64/fpu/multiarch/mplog-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/mplog-fma4.c: Likewise. |
||
Szabolcs Nagy
|
de800d8305 |
Remove slow paths from exp
Remove the __slowexp code, so exp is no longer correctly rounded. The result is computed to about 70 bits precision so the worst case ulp error is about 0.500007 in nearest rounding mode. * manual/probes.texi: Remove slowexp probes. * math/Makefile: Remove slowexp. * sysdeps/generic/math_private.h (__slowexp): Remove. * sysdeps/ieee754/dbl-64/e_exp.c (__ieee754_exp): Remove __slowexp and document error bounds. * sysdeps/i386/fpu/slowexp.c: Remove. * sysdeps/ia64/fpu/slowexp.c: Remove. * sysdeps/ieee754/dbl-64/slowexp.c: Remove. * sysdeps/ieee754/dbl-64/uexp.h (err_0): Remove. * sysdeps/m68k/m680x0/fpu/slowexp.c: Remove. * sysdeps/powerpc/power4/fpu/Makefile (CPPFLAGS-slowexp.c): Remove. * sysdeps/x86_64/fpu/multiarch/Makefile: Remove slowexp-fma. * sysdeps/x86_64/fpu/multiarch/e_exp-avx.c (__slowexp): Remove. * sysdeps/x86_64/fpu/multiarch/e_exp-fma.c (__slowexp): Remove. * sysdeps/x86_64/fpu/multiarch/e_exp-fma4.c (__slowexp): Remove. * sysdeps/x86_64/fpu/multiarch/slowexp-avx.c: Remove. * sysdeps/x86_64/fpu/multiarch/slowexp-fma.c: Remove. * sysdeps/x86_64/fpu/multiarch/slowexp-fma4.c: Remove. |
||
Wilco Dijkstra
|
c3d466cba1 |
Remove slow paths from pow
Remove the slow paths from pow. Like several other double precision math functions, pow is exactly rounded. This is not required from math functions and causes major overheads as it requires multiple fallbacks using higher precision arithmetic if a result is close to 0.5ULP. Ridiculous slowdowns of up to 100000x have been reported when the highest precision path triggers. All GLIBC math tests pass on AArch64 and x64 (with ULP of pow set to 1). The worst case error is ~0.506ULP. A simple test over a few hundred million values shows pow is 10% faster on average. This fixes BZ #13932. [BZ #13932] * sysdeps/ieee754/dbl-64/uexp.h (err_1): Remove. * benchtests/pow-inputs: Update comment for slow path cases. * manual/probes.texi (slowpow_p10): Delete removed probe. (slowpow_p10): Likewise. * math/Makefile: Remove halfulp.c and slowpow.c. * sysdeps/aarch64/libm-test-ulps: Set ULP of pow to 1. * sysdeps/generic/math_private.h (__exp1): Remove error argument. (__halfulp): Remove. (__slowpow): Remove. * sysdeps/i386/fpu/halfulp.c: Delete file. * sysdeps/i386/fpu/slowpow.c: Likewise. * sysdeps/ia64/fpu/halfulp.c: Likewise. * sysdeps/ia64/fpu/slowpow.c: Likewise. * sysdeps/ieee754/dbl-64/e_exp.c (__exp1): Remove error argument, improve comments and add error analysis. * sysdeps/ieee754/dbl-64/e_pow.c (__ieee754_pow): Add error analysis. (power1): Remove function: (log1): Remove error argument, add error analysis. (my_log2): Remove function. * sysdeps/ieee754/dbl-64/halfulp.c: Delete file. * sysdeps/ieee754/dbl-64/slowpow.c: Likewise. * sysdeps/m68k/m680x0/fpu/halfulp.c: Likewise. * sysdeps/m68k/m680x0/fpu/slowpow.c: Likewise. * sysdeps/powerpc/power4/fpu/Makefile: Remove CPPFLAGS-slowpow.c. * sysdeps/x86_64/fpu/libm-test-ulps: Set ULP of pow to 1. * sysdeps/x86_64/fpu/multiarch/Makefile: Remove slowpow-fma.c, slowpow-fma4.c, halfulp-fma.c, halfulp-fma4.c. * sysdeps/x86_64/fpu/multiarch/e_pow-fma.c (__slowpow): Remove define. * sysdeps/x86_64/fpu/multiarch/e_pow-fma4.c (__slowpow): Likewise. * sysdeps/x86_64/fpu/multiarch/halfulp-fma.c: Delete file. * sysdeps/x86_64/fpu/multiarch/halfulp-fma4.c: Likewise. * sysdeps/x86_64/fpu/multiarch/slowpow-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/slowpow-fma4.c: Likewise. |
||
Joseph Myers
|
0d40d0ecba |
Unify and simplify bits/byteswap.h, bits/byteswap-16.h headers (bug 14508, bug 15512, bug 17082, bug 20530).
We have a general principle of preferring optimizations for library
facilities to use compiler built-in functions rather than being
located in library headers, where the compiler can reasonably optimize
code without needing to know glibc implementation details.
This patch applies this principle to bits/byteswap.h, eliminating all
the architecture-specific variants and bits/byteswap-16.h. The
__bswap_16, __bswap_32 and __bswap_64 interfaces all become inline
functions, never macros, using the GCC built-in functions where
available and otherwise a single architecture-independent definition
using shifts and masking (which compilers may well be able to detect
and optimize; GCC has detection of various byte-swapping idioms).
The __bswap_constant_32 macro needs to stay around because of uses in
static initializers within glibc and its tests, and so for consistency
all __bswap_constant_* are kept rather than just being inlined into
the old-GCC-or-non-GCC parts of the __bswap_* inline function
definitions.
Various open bugs are addressed by this cleanup, with caveats about
exactly what is covered by those bugs and when the bugs applied at
all.
Bug 14508 reports -Wformat warnings building glibc because __bswap_*
sometimes returned the wrong types. Obviously we already don't have
such warnings any more or the build would be failing, given -Werror,
and I suspect that bug was originally for wrong types for x86_64, as
fixed by commit
|
||
Joseph Myers
|
688903eb3e |
Update copyright dates with scripts/update-copyrights.
* All files with FSF copyright notices: Update copyright dates using scripts/update-copyrights. * locale/programs/charmap-kw.h: Regenerated. * locale/programs/locfile-kw.h: Likewise. |
||
Joseph Myers
|
f1e005022e |
Revert exp reimplementation (causes test failures).
Revert: 2017-12-19 Joseph Myers <joseph@codesourcery.com> * sysdeps/x86_64/fpu/libm-test-ulps: Update. 2017-12-19 Patrick McGehearty <patrick.mcgehearty@oracle.com> * sysdeps/ieee754/dbl-64/e_exp.c: Include <math-svid-compat.h> and <errno.h>. Include "eexp.tbl". (half): New constant. (one): Likewise. (__ieee754_exp): Rewrite. (__slowexp): Remove prototype. * sysdeps/ieee754/dbl-64/eexp.tbl: New file. * sysdeps/ieee754/dbl-64/slowexp.c: Remove file. * sysdeps/i386/fpu/slowexp.c: Likewise. * sysdeps/ia64/fpu/slowexp.c: Likewise. * sysdeps/m68k/m680x0/fpu/slowexp.c: Likewise. * sysdeps/x86_64/fpu/multiarch/slowexp-avx.c: Likewise. * sysdeps/x86_64/fpu/multiarch/slowexp-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/slowexp-fma4.c: Likewise. * sysdeps/generic/math_private.h (__slowexp): Remove prototype. * sysdeps/ieee754/dbl-64/e_pow.c: Remove mention of slowexp.c in comment. * sysdeps/powerpc/power4/fpu/Makefile [$(subdir) = math] (CPPFLAGS-slowexp.c): Remove variable. * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines): Remove slowexp-fma, slowexp-fma4 and slowexp-avx. (CFLAGS-slowexp-fma.c): Remove variable. (CFLAGS-slowexp-fma4.c): Likewise. (CFLAGS-slowexp-avx.c): Likewise. * sysdeps/x86_64/fpu/multiarch/e_exp-avx.c (__slowexp): Do not define as macro. * sysdeps/x86_64/fpu/multiarch/e_exp-fma.c (__slowexp): Likewise. * sysdeps/x86_64/fpu/multiarch/e_exp-fma4.c (__slowexp): Likewise. * math/Makefile (type-double-routines): Remove slowexp. * manual/probes.texi (slowexp_p6): Remove. (slowexp_p32): Likewise. |
||
Patrick McGehearty
|
6fd0a3c6a8 |
Improve __ieee754_exp() performance by greater than 5x on sparc/x86.
These changes will be active for all platforms that don't provide their own exp() routines. They will also be active for ieee754 versions of ccos, ccosh, cosh, csin, csinh, sinh, exp10, gamma, and erf. Typical performance gains is typically around 5x when measured on Sparc s7 for common values between exp(1) and exp(40). Using the glibc perf tests on sparc, sparc (nsec) x86 (nsec) old new old new max 17629 395 5173 144 min 399 54 15 13 mean 5317 200 1349 23 The extreme max times for the old (ieee754) exp are due to the multiprecision computation in the old algorithm when the true value is very near 0.5 ulp away from an value representable in double precision. The new algorithm does not take special measures for those cases. The current glibc exp perf tests overrepresent those values. Informal testing suggests approximately one in 200 cases might invoke the high cost computation. The performance advantage of the new algorithm for other values is still large but not as large as indicated by the chart above. Glibc correctness tests for exp() and expf() were run. Within the test suite 3 input values were found to cause 1 bit differences (ulp) when "FE_TONEAREST" rounding mode is set. No differences in exp() were seen for the tested values for the other rounding modes. Typical example: exp(-0x1.760cd2p+0) (-1.46113312244415283203125) new code: 2.31973271630014299393707e-01 0x1.db14cd799387ap-3 old code: 2.31973271630014271638132e-01 0x1.db14cd7993879p-3 exp = 2.31973271630014285508337 (high precision) Old delta: off by 0.49 ulp New delta: off by 0.51 ulp In addition, because ieee754_exp() is used by other routines, cexp() showed test results with very small imaginary input values where the imaginary portion of the result was off by 3 ulp when in upward rounding mode, but not in the other rounding modes. For x86, tgamma showed a few values where the ulp increased to 6 (max ulp for tgamma is 5). Sparc tgamma did not show these failures. I presume the tgamma differences are due to compiler optimization differences within the gamma function.The gamma function is known to be difficult to compute accurately. * sysdeps/ieee754/dbl-64/e_exp.c: Include <math-svid-compat.h> and <errno.h>. Include "eexp.tbl". (half): New constant. (one): Likewise. (__ieee754_exp): Rewrite. (__slowexp): Remove prototype. * sysdeps/ieee754/dbl-64/eexp.tbl: New file. * sysdeps/ieee754/dbl-64/slowexp.c: Remove file. * sysdeps/i386/fpu/slowexp.c: Likewise. * sysdeps/ia64/fpu/slowexp.c: Likewise. * sysdeps/m68k/m680x0/fpu/slowexp.c: Likewise. * sysdeps/x86_64/fpu/multiarch/slowexp-avx.c: Likewise. * sysdeps/x86_64/fpu/multiarch/slowexp-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/slowexp-fma4.c: Likewise. * sysdeps/generic/math_private.h (__slowexp): Remove prototype. * sysdeps/ieee754/dbl-64/e_pow.c: Remove mention of slowexp.c in comment. * sysdeps/powerpc/power4/fpu/Makefile [$(subdir) = math] (CPPFLAGS-slowexp.c): Remove variable. * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines): Remove slowexp-fma, slowexp-fma4 and slowexp-avx. (CFLAGS-slowexp-fma.c): Remove variable. (CFLAGS-slowexp-fma4.c): Likewise. (CFLAGS-slowexp-avx.c): Likewise. * sysdeps/x86_64/fpu/multiarch/e_exp-avx.c (__slowexp): Do not define as macro. * sysdeps/x86_64/fpu/multiarch/e_exp-fma.c (__slowexp): Likewise. * sysdeps/x86_64/fpu/multiarch/e_exp-fma4.c (__slowexp): Likewise. * math/Makefile (type-double-routines): Remove slowexp. * manual/probes.texi (slowexp_p6): Remove. (slowexp_p32): Likewise. |
||
Adhemerval Zanella
|
3bb1ef58b9 |
ia64: Fix memchr for large input sizes (BZ #22603)
Current optimized ia64 memchr uses a strategy to check for last address
by adding the input one with expected size. However it does not take
care for possible overflow.
It was triggered by
|
||
Adhemerval Zanella
|
c80acdc325 |
Update IA64 libm-test-ulps
Ran on Itanium Processor 9020, GCC 7.2.1. * sysdeps/ia64/fpu/libm-test-ulps: Update. Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> |
||
Joseph Myers
|
c191f64cd5 |
Correct some ia64 libm_alias_float_other calls.
This patch corrects three ia64 libm_alias_float_other calls so they generate the intended _Float32 aliases when such aliases are enabled. Tested with build-many-glibcs.py for ia64-linux-gnu (that installed stripped shared libraries are unchanged when applied to current sources, and that this enables compilation tests to pass when used in conjunction with other _Float32 patches). * sysdeps/ia64/fpu/e_exp2f.S (__exp2f): Use exp2 not __exp2 as second argument to libm_alias_float_other. * sysdeps/ia64/fpu/e_log2f.S (__log2f): Use log2 not __log2 as second argument to libm_alias_float_other. * sysdeps/ia64/fpu/e_powf.S (__powf): Use pow not __pow as second argument to libm_alias_float_other. |
||
Joseph Myers
|
aa1142c593 |
Use libm_alias_float for ia64.
Continuing the preparation for additional _FloatN / _FloatNx function aliases, this patch makes ia64 libm function implementations use libm_alias_float to define function aliases. The same approach is followed as with the corresponding long double and double patches: the ia64-specific macros are left unchanged, with calls to libm_alias_float_other being added in most cases and libm_alias_float itself being used in only a few places. Tested with build-many-glibcs.py for ia64-linux-gnu that installed stripped shared libraries are unchanged by the patch. * sysdeps/ia64/fpu/libm-symbols.h: Include <libm-alias-float.h>. * sysdeps/ia64/fpu/e_acosf.S (acosf): Use libm_alias_float_other. * sysdeps/ia64/fpu/e_acoshf.S (acoshf): Likewise. * sysdeps/ia64/fpu/e_asinf.S (asinf): Likewise. * sysdeps/ia64/fpu/e_atan2f.S (atan2f): Likewise. * sysdeps/ia64/fpu/e_atanhf.S (atanhf): Likewise. * sysdeps/ia64/fpu/e_coshf.S (coshf): Likewise. * sysdeps/ia64/fpu/e_exp10f.S (exp10f): Likewise. * sysdeps/ia64/fpu/e_exp2f.S (exp2f): Likewise. * sysdeps/ia64/fpu/e_expf.S (expf): Likewise. * sysdeps/ia64/fpu/e_fmodf.S (fmodf): Likewise. * sysdeps/ia64/fpu/e_hypotf.S (hypotf): Likewise. * sysdeps/ia64/fpu/e_lgammaf_r.c (lgammaf_r): Define using libm_alias_float_r. * sysdeps/ia64/fpu/e_log2f.S (log2f): Use libm_alias_float_other. * sysdeps/ia64/fpu/e_logf.S (log10f): Likewise. (logf): Likewise. * sysdeps/ia64/fpu/e_powf.S (powf): Likewise. * sysdeps/ia64/fpu/e_remainderf.S (remainderf): Likewise. * sysdeps/ia64/fpu/e_sinhf.S (sinhf): Likewise. * sysdeps/ia64/fpu/e_sqrtf.S (sqrtf): Likewise. * sysdeps/ia64/fpu/libm_sincosf.S (sincosf): Likewise. * sysdeps/ia64/fpu/s_asinhf.S (asinhf): Likewise. * sysdeps/ia64/fpu/s_atanf.S (atanf): Likewise. * sysdeps/ia64/fpu/s_cbrtf.S (cbrtf): Likewise. * sysdeps/ia64/fpu/s_ceilf.S (ceilf): Likewise. * sysdeps/ia64/fpu/s_copysign.S (copysignf): Define using libm_alias_float. * sysdeps/ia64/fpu/s_cosf.S (sinf): Use libm_alias_float_other. (cosf): Likewise. * sysdeps/ia64/fpu/s_erfcf.S (erfcf): Likewise. * sysdeps/ia64/fpu/s_erff.S (erff): Likewise. * sysdeps/ia64/fpu/s_expm1f.S (expm1f): Likewise. * sysdeps/ia64/fpu/s_fabsf.S (fabsf): Likewise. * sysdeps/ia64/fpu/s_fdimf.S (fdimf): Likewise. * sysdeps/ia64/fpu/s_floorf.S (floorf): Likewise. * sysdeps/ia64/fpu/s_fmaf.S (fmaf): Likewise. * sysdeps/ia64/fpu/s_fmaxf.S (fmaxf): Likewise. * sysdeps/ia64/fpu/s_frexpf.c (frexpf): Likewise. * sysdeps/ia64/fpu/s_ldexpf.c (ldexpf): Likewise. * sysdeps/ia64/fpu/s_log1pf.S (log1pf): Likewise. * sysdeps/ia64/fpu/s_logbf.S (logbf): Likewise. * sysdeps/ia64/fpu/s_modff.S (modff): Likewise. * sysdeps/ia64/fpu/s_nearbyintf.S (nearbyintf): Likewise. * sysdeps/ia64/fpu/s_nextafterf.S (nextafterf): Likewise. * sysdeps/ia64/fpu/s_rintf.S (rintf): Likewise. * sysdeps/ia64/fpu/s_roundf.S (roundf): Likewise. * sysdeps/ia64/fpu/s_scalblnf.c (scalblnf): Likewise. * sysdeps/ia64/fpu/s_scalbnf.c (scalbnf): Define using libm_alias_float. * sysdeps/ia64/fpu/s_tanf.S (tanf): Use libm_alias_float_other. * sysdeps/ia64/fpu/s_tanhf.S (tanhf): Likewise. * sysdeps/ia64/fpu/s_truncf.S (truncf): Likewise. * sysdeps/ia64/fpu/w_lgammaf_main.c [BUILD_LGAMMA && !USE_AS_COMPAT] (lgammaf): Likewise. * sysdeps/ia64/fpu/w_tgammaf_compat.S (tgammaf): Likewise. |
||
Joseph Myers
|
0609ec0a74 |
Use libm_alias_double for ia64.
Continuing the preparation for additional _FloatN / _FloatNx function aliases, this patch makes ia64 libm function implementations use libm_alias_double to define function aliases. The same approach is followed as with the corresponding long double patch: the ia64-specific macros are left unchanged, with calls to libm_alias_double_other being added in most cases and libm_alias_double itself being used in only a few places. Tested with build-many-glibcs.py for ia64-linux-gnu that installed stripped shared libraries are unchanged by the patch. * sysdeps/ia64/fpu/libm-symbols.h: Include <libm-alias-double.h>. * sysdeps/ia64/fpu/e_acos.S (acos): Use libm_alias_double_other. * sysdeps/ia64/fpu/e_acosh.S (acosh): Likewise. * sysdeps/ia64/fpu/e_asin.S (asin): Likewise. * sysdeps/ia64/fpu/e_atan2.S (atan2): Likewise. * sysdeps/ia64/fpu/e_atanh.S (atanh): Likewise. * sysdeps/ia64/fpu/e_cosh.S (cosh): Likewise. * sysdeps/ia64/fpu/e_exp.S (exp): Likewise. * sysdeps/ia64/fpu/e_exp10.S (exp10): Likewise. * sysdeps/ia64/fpu/e_exp2.S (exp2): Likewise. * sysdeps/ia64/fpu/e_fmod.S (fmod): Likewise. * sysdeps/ia64/fpu/e_hypot.S (hypot): Likewise. * sysdeps/ia64/fpu/e_lgamma_r.c (lgamma_r): Define using libm_alias_double_r. * sysdeps/ia64/fpu/e_log.S (log10): Use libm_alias_double_other. (log): Likewise. * sysdeps/ia64/fpu/e_log2.S (log2): Likewise. * sysdeps/ia64/fpu/e_pow.S (pow): Likewise. * sysdeps/ia64/fpu/e_remainder.S (remainder): Likewise. * sysdeps/ia64/fpu/e_sinh.S (sinh): Likewise. * sysdeps/ia64/fpu/e_sqrt.S (sqrt): Likewise. * sysdeps/ia64/fpu/libm_sincos.S (sincos): Likewise. * sysdeps/ia64/fpu/s_asinh.S (asinh): Likewise. * sysdeps/ia64/fpu/s_atan.S (atan): Likewise. * sysdeps/ia64/fpu/s_cbrt.S (cbrt): Likewise. * sysdeps/ia64/fpu/s_ceil.S (ceil): Likewise. * sysdeps/ia64/fpu/s_copysign.S (copysign): Define using libm_alias_double. * sysdeps/ia64/fpu/s_cos.S (sin): Use libm_alias_double_other. (cos): Likewise. * sysdeps/ia64/fpu/s_erf.S (erf): Likewise. * sysdeps/ia64/fpu/s_erfc.S (erfc): Likewise. * sysdeps/ia64/fpu/s_expm1.S (expm1): Likewise. * sysdeps/ia64/fpu/s_fabs.S (fabs): Likewise. * sysdeps/ia64/fpu/s_fdim.S (fdim): Likewise. * sysdeps/ia64/fpu/s_floor.S (floor): Likewise. * sysdeps/ia64/fpu/s_fma.S (fma): Likewise. * sysdeps/ia64/fpu/s_fmax.S (fmax): Likewise. * sysdeps/ia64/fpu/s_frexp.c (frexp): Likewise. * sysdeps/ia64/fpu/s_ldexp.c (ldexp): Likewise. * sysdeps/ia64/fpu/s_log1p.S (log1p): Likewise. * sysdeps/ia64/fpu/s_logb.S (logb): Likewise. * sysdeps/ia64/fpu/s_modf.S (modf): Likewise. * sysdeps/ia64/fpu/s_nearbyint.S (nearbyint): Likewise. * sysdeps/ia64/fpu/s_nextafter.S (nextafter): Likewise. * sysdeps/ia64/fpu/s_rint.S (rint): Likewise. * sysdeps/ia64/fpu/s_round.S (round): Likewise. * sysdeps/ia64/fpu/s_scalbn.c (scalbn): Define using libm_alias_double. * sysdeps/ia64/fpu/s_tan.S (tan): Use libm_alias_double_other. * sysdeps/ia64/fpu/s_tanh.S (tanh): Likewise. * sysdeps/ia64/fpu/s_trunc.S (trunc): Likewise. * sysdeps/ia64/fpu/w_lgamma_main.c [BUILD_LGAMMA && !USE_AS_COMPAT] (lgamma): Likewise. * sysdeps/ia64/fpu/w_tgamma_compat.S (tgamma): Likewise. |
||
Joseph Myers
|
a23aa5b727 |
Add _Float64x function aliases.
This patch continues filling out TS 18661-3 support by adding *f64x function aliases on platforms with _Float64x support. (It so happens the set of such platforms is exactly the same as the set of platforms with _Float128 support, although on x86_64, x86 and ia32 the _Float64x format is Intel extended rather than binary128.) The API provided corresponds exactly to that provided for _Float128, mostly coming from TS 18661-3. As these functions always alias those for another type (long double, _Float128 or both), __* function names are not provided, as in other cases of alias types. Given the preparation done in previous patches, this one just enables the feature via Makeconfig and bits/floatn.h, adds symbol versions, and updates documentation and ABI baselines. The symbol versions are present unconditionally as GLIBC_2.27 in the relevant Versions files, as it's OK for those to specify versions for functions that may not be present in some configurations; no additional complexity is needed unless in future some configuration gains support for this type that didn't have such support in 2.27. The Makeconfig additions for ia64 and x86 aren't strictly needed, as those configurations also get float64x-alias-fcts definitions from sysdeps/ieee754/float128/Makeconfig, but still seem appropriate given that _Float64x is not _Float128 for those configurations. A libm-test-ulps update for x86 is included. This is because bits/mathinline.h does not have _Float64x support added and for two functions the use of out-of-line functions results in increased ulps (ifloat64x shares ulps with ildouble / ifloat128 as appropriate). Given that we'd like generally to eliminate bits/mathinline.h optimizations, preferring to have such optimizations in GCC instead, it seems reasonable not to add such support there for new types. GCC support for _FloatN / _FloatNx built-in functions is limited, but has been improved in GCC 8, and at some point I hope the full set of libm built-in functions in GCC, and other optimizations with per-floating-type aspects, will be enabled for all _FloatN / _FloatNx types. Tested for x86_64 and x86, and with build-many-glibcs.py, with both GCC 6 and GCC 7. * sysdeps/ia64/Makeconfig (float64x-alias-fcts): New variable. * sysdeps/ieee754/float128/Makeconfig (float64x-alias-fcts): Likewise. * sysdeps/ieee754/ldbl-128/Makeconfig (float64x-alias-fcts): Likewise. * sysdeps/x86/Makeconfig: New file. * bits/floatn-common.h (__HAVE_FLOAT64X): Remove macro. (__HAVE_FLOAT64X_LONG_DOUBLE): Likewise. * bits/floatn.h (__HAVE_FLOAT64X): New macro. (__HAVE_FLOAT64X_LONG_DOUBLE): Likewise. * sysdeps/ia64/bits/floatn.h (__HAVE_FLOAT64X): Likewise. (__HAVE_FLOAT64X_LONG_DOUBLE): Likewise. * sysdeps/ieee754/ldbl-128/bits/floatn.h (__HAVE_FLOAT64X): Likewise. (__HAVE_FLOAT64X_LONG_DOUBLE): Likewise. * sysdeps/mips/ieee754/bits/floatn.h (__HAVE_FLOAT64X): Likewise. (__HAVE_FLOAT64X_LONG_DOUBLE): Likewise. * sysdeps/powerpc/bits/floatn.h (__HAVE_FLOAT64X): Likewise. (__HAVE_FLOAT64X_LONG_DOUBLE): Likewise. * sysdeps/x86/bits/floatn.h (__HAVE_FLOAT64X): Likewise. (__HAVE_FLOAT64X_LONG_DOUBLE): Likewise. * manual/math.texi (Mathematics): Document support for _Float64x. * math/Versions (GLIBC_2.27): Add _Float64x functions. * stdlib/Versions (GLIBC_2.27): Likewise. * wcsmbs/Versions (GLIBC_2.27): Likewise. * sysdeps/unix/sysv/linux/aarch64/libc.abilist: Update. * sysdeps/unix/sysv/linux/aarch64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/alpha/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/i386/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/ia64/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libc-le.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/64/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise. * sysdeps/i386/fpu/libm-test-ulps: Likewise. * sysdeps/i386/i686/fpu/multiarch/libm-test-ulps: Likewise. |
||
Joseph Myers
|
3a327316ad |
Use libm_alias_ldouble macros in sysdeps/ia64/fpu.
Continuing the preparation for additional _FloatN / _FloatNx aliases, this patch makes long double functions in sysdeps/ia64/fpu use libm_alias_ldouble macros, so that they can have _Float64x aliases added in future. Most ia64 libm functions are defined using ia64-specific macros in libm-symbols.h. These are left unchanged, with libm-alias-ldouble.h included from libm-symbols.h (and the expectation that other libm-alias-*.h headers will be included from there as well in future), and libm_alias_ldouble_other then being used in most cases to define aliases for any additional types (currently the empty set). Functions that used weak_alias are converted to use libm_alias_ldouble. Tested (compilation only) with build-many-glibcs.py for ia64, including that installed stripped shared libraries are unchanged by the patch. * sysdeps/ia64/fpu/libm-symbols.h: Include <libm-alias-ldouble.h>. * sysdeps/ia64/fpu/e_acoshl.S (acoshl): Use libm_alias_ldouble_other. * sysdeps/ia64/fpu/e_acosl.S (acosl): Likewise. * sysdeps/ia64/fpu/e_asinl.S (asinl): Likewise. * sysdeps/ia64/fpu/e_atanhl.S (atanhl): Likewise. * sysdeps/ia64/fpu/e_coshl.S (coshl): Likewise. * sysdeps/ia64/fpu/e_exp10l.S (exp10l): Likewise. * sysdeps/ia64/fpu/e_exp2l.S (exp2l): Likewise. * sysdeps/ia64/fpu/e_fmodl.S (fmodl): Likewise. * sysdeps/ia64/fpu/e_hypotl.S (hypotl): Likewise. * sysdeps/ia64/fpu/e_lgammal_r.c (lgammal_r): Define using libm_alias_ldouble_r. * sysdeps/ia64/fpu/e_log2l.S (log2l): Use libm_alias_ldouble_other. * sysdeps/ia64/fpu/e_logl.S (logl): Likewise. (log10l): Likewise. * sysdeps/ia64/fpu/e_powl.S (powl): Likewise. * sysdeps/ia64/fpu/e_remainderl.S (remainderl): Likewise. * sysdeps/ia64/fpu/e_sinhl.S (sinhl): Likewise. * sysdeps/ia64/fpu/e_sqrtl.S (sqrtl): Likewise. * sysdeps/ia64/fpu/libm_sincosl.S (sincosl): Likewise. * sysdeps/ia64/fpu/s_asinhl.S (asinhl): Likewise. * sysdeps/ia64/fpu/s_atanl.S (atanl): Likewise. (atan2l): Likewise. * sysdeps/ia64/fpu/s_cbrtl.S (cbrtl): Likewise. * sysdeps/ia64/fpu/s_ceill.S (ceill): Likewise. * sysdeps/ia64/fpu/s_copysign.S (copysignl): Define using libm_alias_ldouble. * sysdeps/ia64/fpu/s_cosl.S (sinl): Use libm_alias_ldouble_other. (cosl): Likewise. * sysdeps/ia64/fpu/s_erfcl.S (erfcl): Likewise. * sysdeps/ia64/fpu/s_erfl.S (erfl): Likewise. * sysdeps/ia64/fpu/s_expm1l.S (expm1l): Likewise. (expl): Likewise. * sysdeps/ia64/fpu/s_fabsl.S (fabsl): Likewise. * sysdeps/ia64/fpu/s_fdiml.S (fdiml): Likewise. * sysdeps/ia64/fpu/s_floorl.S (floorl): Likewise. * sysdeps/ia64/fpu/s_fmal.S (fmal): Likewise. * sysdeps/ia64/fpu/s_fmaxl.S (fmaxl): Likewise. * sysdeps/ia64/fpu/s_frexpl.c (frexpl): Likewise. * sysdeps/ia64/fpu/s_ldexpl.c (ldexpl): Likewise. * sysdeps/ia64/fpu/s_log1pl.S (log1pl): Likewise. * sysdeps/ia64/fpu/s_logbl.S (logbl): Likewise. * sysdeps/ia64/fpu/s_modfl.S (modfl): Likewise. * sysdeps/ia64/fpu/s_nearbyintl.S (nearbyintl): Define using libm_alias_ldouble. * sysdeps/ia64/fpu/s_nextafterl.S (nextafterl): Use libm_alias_ldouble_other. * sysdeps/ia64/fpu/s_rintl.S (rintl): Likewise. * sysdeps/ia64/fpu/s_roundl.S (roundl): Likewise. * sysdeps/ia64/fpu/s_scalbnl.c (scalbnl): Define using libm_alias_ldouble. * sysdeps/ia64/fpu/s_tanhl.S (tanhl): Use libm_alias_ldouble_other. * sysdeps/ia64/fpu/s_tanl.S (tanl): Likewise. * sysdeps/ia64/fpu/s_truncl.S (truncl): Likewise. * sysdeps/ia64/fpu/w_lgammal_main.c [BUILD_LGAMMA && !USE_AS_COMPAT] (lgammal): Likewise. * sysdeps/ia64/fpu/w_tgammal_compat.S (tgammal): Likewise. |
||
Joseph Myers
|
015c6dc288 |
Support bits/floatn.h inclusion from .S files.
Further _FloatN / _FloatNx type alias support will involve making architecture-specific .S files use the common macros for libm function aliases. Making them use those macros will also serve to simplify existing code for aliases / symbol versions in various cases, similar to such simplifications for ldbl-opt code. The libm-alias-*.h files sometimes need to include <bits/floatn.h> to determine which aliases they should define. At present, this does not work for inclusion from .S files because <bits/floatn.h> can define typedefs for old compilers. This patch changes all the <bits/floatn.h> and <bits/floatn-common.h> headers to include __ASSEMBLER__ conditionals. Those conditionals disable everything related to C syntax in the __ASSEMBLER__ case, not just the problem typedefs, as that seemed cleanest. The __HAVE_* definitions remain in the __ASSEMBLER__ case, as those provide information that is required to define the correct set of aliases. Tested with build-many-glibcs.py for a representative set of configurations (x86_64-linux-gnu i686-linux-gnu ia64-linux-gnu powerpc64le-linux-gnu mips64-linux-gnu-n64 sparc64-linux-gnu) with GCC 6. Also tested with GCC 6 for i686-linux-gnu in conjunction with changes to use alias macros in .S files. * bits/floatn-common.h [!__ASSEMBLER]: Disable everything related to C syntax instead of availability and properties of types. * bits/floatn.h [!__ASSEMBLER]: Likewise. * sysdeps/ia64/bits/floatn.h [!__ASSEMBLER]: Likewise. * sysdeps/ieee754/ldbl-128/bits/floatn.h [!__ASSEMBLER]: Likewise. * sysdeps/mips/ieee754/bits/floatn.h [!__ASSEMBLER]: Likewise. * sysdeps/powerpc/bits/floatn.h [!__ASSEMBLER]: Likewise. * sysdeps/x86/bits/floatn.h [!__ASSEMBLER]: Likewise. |
||
Adhemerval Zanella
|
06be6368da |
nptl: Define __PTHREAD_MUTEX_{NUSERS_AFTER_KIND,USE_UNION}
This patch adds two new internal defines to set the internal pthread_mutex_t layout required by the supported ABIS: 1. __PTHREAD_MUTEX_NUSERS_AFTER_KIND which control whether to define __nusers fields before or after __kind. The preferred value for is 0 for new ports and it sets __nusers before __kind. 2. __PTHREAD_MUTEX_USE_UNION which control whether internal __spins and __list members will be place inside an union for linuxthreads compatibility. The preferred value is 0 for ports and it sets to not use an union to define both fields. It fixes the wrong offsets value for __kind value on x86_64-linux-gnu-x32. Checked with a make check run-built-tests=no on all afected ABIs. [BZ #22298] * nptl/allocatestack.c (allocate_stack): Check if __PTHREAD_MUTEX_HAVE_PREV is non-zero, instead if __PTHREAD_MUTEX_HAVE_PREV is defined. * nptl/descr.h (pthread): Likewise. * nptl/nptl-init.c (__pthread_initialize_minimal_internal): Likewise. * nptl/pthread_create.c (START_THREAD_DEFN): Likewise. * sysdeps/nptl/fork.c (__libc_fork): Likewise. * sysdeps/nptl/pthread.h (PTHREAD_MUTEX_INITIALIZER): Likewise. * sysdeps/nptl/bits/thread-shared-types.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): New defines. (__pthread_internal_list): Check __PTHREAD_MUTEX_USE_UNION instead of __WORDSIZE for internal layout. (__pthread_mutex_s): Check __PTHREAD_MUTEX_NUSERS_AFTER_KIND instead of __WORDSIZE for internal __nusers layout and __PTHREAD_MUTEX_USE_UNION instead of __WORDSIZE whether to use an union for __spins and __list fields. (__PTHREAD_MUTEX_HAVE_PREV): Define also for __PTHREAD_MUTEX_USE_UNION case. * sysdeps/aarch64/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): New defines. * sysdeps/alpha/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/arm/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/hppa/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/ia64/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/m68k/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/microblaze/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/mips/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/nios2/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/powerpc/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/s390/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/sh/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/sparc/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/tile/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. * sysdeps/x86/nptl/bits/pthreadtypes-arch.h (__PTHREAD_MUTEX_NUSERS_AFTER_KIND, __PTHREAD_MUTEX_USE_UNION): Likewise. Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> |
||
Adhemerval Zanella
|
dff91cd45e |
nptl: Add tests for internal pthread_mutex_t offsets
This patch adds a new build test to check for internal fields offsets for user visible internal field. Although currently the only field which is statically initialized to a non zero value is pthread_mutex_t.__data.__kind value, the tests also check the offset of __kind, __spins, __elision (if supported), and __list internal member. A internal header (pthread-offset.h) is added to each major ABI with the reference value. Checked on x86_64-linux-gnu and with a build check for all affected ABIs (aarch64-linux-gnu, alpha-linux-gnu, arm-linux-gnueabihf, hppa-linux-gnu, i686-linux-gnu, ia64-linux-gnu, m68k-linux-gnu, microblaze-linux-gnu, mips64-linux-gnu, mips64-n32-linux-gnu, mips-linux-gnu, powerpc64le-linux-gnu, powerpc-linux-gnu, s390-linux-gnu, s390x-linux-gnu, sh4-linux-gnu, sparc64-linux-gnu, sparcv9-linux-gnu, tilegx-linux-gnu, tilegx-linux-gnu-x32, tilepro-linux-gnu, x86_64-linux-gnu, and x86_64-linux-x32). * nptl/pthreadP.h (ASSERT_PTHREAD_STRING, ASSERT_PTHREAD_INTERNAL_OFFSET): New macro. * nptl/pthread_mutex_init.c (__pthread_mutex_init): Add build time checks for internal pthread_mutex_t offsets. * sysdeps/aarch64/nptl/pthread-offsets.h (__PTHREAD_MUTEX_NUSERS_OFFSET, __PTHREAD_MUTEX_KIND_OFFSET, __PTHREAD_MUTEX_SPINS_OFFSET, __PTHREAD_MUTEX_ELISION_OFFSET, __PTHREAD_MUTEX_LIST_OFFSET): New macro. * sysdeps/alpha/nptl/pthread-offsets.h: Likewise. * sysdeps/arm/nptl/pthread-offsets.h: Likewise. * sysdeps/hppa/nptl/pthread-offsets.h: Likewise. * sysdeps/i386/nptl/pthread-offsets.h: Likewise. * sysdeps/ia64/nptl/pthread-offsets.h: Likewise. * sysdeps/m68k/nptl/pthread-offsets.h: Likewise. * sysdeps/microblaze/nptl/pthread-offsets.h: Likewise. * sysdeps/mips/nptl/pthread-offsets.h: Likewise. * sysdeps/nios2/nptl/pthread-offsets.h: Likewise. * sysdeps/powerpc/nptl/pthread-offsets.h: Likewise. * sysdeps/s390/nptl/pthread-offsets.h: Likewise. * sysdeps/sh/nptl/pthread-offsets.h: Likewise. * sysdeps/sparc/nptl/pthread-offsets.h: Likewise. * sysdeps/tile/nptl/pthread-offsets.h: Likewise. * sysdeps/x86_64/nptl/pthread-offsets.h: Likewise. Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> |
||
Joseph Myers
|
797ba44ba2 |
Add bits/floatn.h defines for more _FloatN / _FloatNx types.
The bits/floatn.h header currently only has defines relating to _Float128. This patch adds defines relating to other _FloatN / _FloatNx types. The approach taken is to add defines for all _FloatN / _FloatNx types known to GCC, and to put them in a common bits/floatn-common.h header included at the end of all the individual bits/floatn.h headers. If in future some defines become different for different glibc configurations, they will move out into the separate bits/floatn.h headers. Some defines are expected always to be the same across glibc ports. Corresponding defines are nevertheless put in this header. The intent is that where there are conditionals (in headers or in non-installed files) that can just repeat the same or nearly the same logic for each floating-point type, they should do so, even if in fact the cases for some types could be unconditionally present or absent because the same conditionals are true or false for all glibc configurations. This should make the glibc code with such conditionals easier to read, because the reader can just see that the same conditionals are repeated for each type, rather than seeing different conditionals for different types and needing to reason, at each location with such differences, why those differences are indeed correct there. (Cases involving per-format rather than per-type logic are more likely still to need differences in how they handle different types.) Having such defines and conditionals also helps in incremental preparation for adding _Float32 / _Float64 / _Float32x / _Float64x function aliases. I intend subsequent patches to add such conditionals corresponding to those already present for _Float128, as well as making more architecture-specific function implementations use common macros to define aliases in preparation for adding such _FloatN / _FloatNx aliases. Tested for x86_64. * bits/floatn-common.h: New file. * math/Makefile (headers): Add bits/floatn-common.h. * bits/floatn.h: Include <bits/floatn-common.h>. * sysdeps/ia64/bits/floatn.h: Likewise. * sysdeps/ieee754/ldbl-128/bits/floatn.h: Likewise. * sysdeps/mips/ieee754/bits/floatn.h: Likewise. * sysdeps/powerpc/bits/floatn.h: Likewise. * sysdeps/x86/bits/floatn.h: Likewise. |
||
H.J. Lu
|
b8818ab592 |
ld.so: Replace (&bootstrap_map) with BOOTSTRAP_MAP
(&_dl_main_map) is used instead of (&bootstrap_map) to bootstrap static PIE. Define BOOTSTRAP_MAP with (&_dl_main_map) to avoid hardcode to (&bootstrap_map). * elf/rtld.c (BOOTSTRAP_MAP): New. (RESOLVE_MAP): Replace (&bootstrap_map) with BOOTSTRAP_MAP. * sysdeps/hppa/dl-machine.h (ELF_MACHINE_BEFORE_RTLD_RELOC): Likewise. * sysdeps/ia64/dl-machine.h (ELF_MACHINE_BEFORE_RTLD_RELOC): Likewise. * sysdeps/mips/dl-machine.h (ELF_MACHINE_BEFORE_RTLD_RELOC): Likewise. |
||
Szabolcs Nagy
|
72d3d28108 |
New symbol version for logf, log2f and powf without SVID compat
This patch changes the logf, log2f and powf error handling semantics to only set errno accoring to POSIX rules. New symbol version is introduced at GLIBC_2.27. The old wrappers are kept for compat symbols. ia64 needed assembly change to have the new and compat versioned symbol map to the same function. All linux libm abilists are updated. * math/Versions (logf): New libm symbol at GLIBC_2.27. (log2f): Likewise. (powf): Likewise. * math/w_log2f.c: New file. * math/w_logf.c: New file. * math/w_powf.c: New file. * math/w_log2f_compat.c (__log2f_compat): For compat symbol only. * math/w_logf_compat.c (__logf_compat): Likewise. * math/w_powf_compat.c (__powf_compat): Likewise. * sysdeps/ia64/fpu/e_log2f.S: Add versioned symbols. * sysdeps/ia64/fpu/e_logf.S: Likewise. * sysdeps/ia64/fpu/e_powf.S: Likewise. * sysdeps/unix/sysv/linux/aarch64/libm.abilist: Update. * sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/arm/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/hppa/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/microblaze/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/nios2/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sh/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/tile/tilegx/tilegx32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/tile/tilepro/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise. |
||
Szabolcs Nagy
|
4ea49f4c08 |
New generic powf
without wrapper on aarch64: powf reciprocal-throughput: 4.2x faster powf latency: 2.6x faster old worst-case error: 1.11 ulp new worst-case error: 0.82 ulp aarch64 .text size: -780 bytes aarch64 .rodata size: +144 bytes powf(x,y) is implemented as exp2(y*log2(x)) with the same algorithms that are used in exp2f and log2f, except that the log2f polynomial is larger for extra precision and its output (and exp2f input) may be scaled by a power of 2 (POWF_SCALE) to simplify the argument reduction step of exp2 (possible when efficient round and convert toint operation is available). The special case handling tries to minimize the checks in the hot path. When the input of exp2_inline is checked, int arithmetics is used as that was faster on the tested aarch64 cores. * math/Makefile (type-float-routines): Add e_powf_log2_data. * sysdeps/ieee754/flt-32/e_powf.c: New implementation. * sysdeps/ieee754/flt-32/e_powf_log2_data.c: New file. * sysdeps/ieee754/flt-32/math_config.h (__powf_log2_data): Define. (issignalingf_inline): Likewise. (POWF_LOG2_TABLE_BITS): Likewise. (POWF_LOG2_POLY_ORDER): Likewise. (POWF_SCALE_BITS): Likewise. (POWF_SCALE): Likewise. * sysdeps/i386/fpu/e_powf_log2_data.c: New file. * sysdeps/ia64/fpu/e_powf_log2_data.c: New file. * sysdeps/m68k/m680x0/fpu/e_powf_log2_data.c: New file. |
||
Szabolcs Nagy
|
875c76c704 |
New generic log2f
Similar to the new logf: double precision arithmetics and a small lookup table is used. The argument reduction step is the same as in the new logf. without wrapper on aarch64: log2f reciprocal-throughput: 2.3x faster log2f latency: 2.1x faster old worst case error: 1.72 ulp new worst case error: 0.75 ulp aarch64 .text size: -252 bytes aarch64 .rodata size: +244 bytes * math/Makefile (type-float-routines): Add e_log2f_data. * sysdeps/ieee754/flt-32/e_log2f.c: New implementation. * sysdeps/ieee754/flt-32/e_log2f_data.c: New file. * sysdeps/ieee754/flt-32/math_config.h (__log2f_data): Define. (LOG2F_TABLE_BITS, LOG2F_POLY_ORDER): Define. * sysdeps/i386/fpu/e_log2f_data.c: New file. * sysdeps/ia64/fpu/e_log2f_data.c: New file. * sysdeps/m68k/m680x0/fpu/e_log2f_data.c: New file. |
||
Szabolcs Nagy
|
bf27d3973d |
New generic logf
without wrapper on aarch64: logf reciprocal-throughput: 2.2x faster logf latency: 1.9x faster old worst case error: 0.89 ulp new worst case error: 0.82 ulp aarch64 .text size: -356 bytes aarch64 .rodata size: +240 bytes Uses double precision arithmetics and a lookup table to allow smaller polynomial and avoid the use of division. Data is in a separate translation unit with fixed layout to prevent the compiler generating suboptimal literal access. Errors are handled inline according to POSIX rules, but this patch keeps the wrapper with SVID compatible error handling. Needs libm-test-ulps adjustment for clogf in non-nearest rounding mode. * math/Makefile (type-float-routines): Add e_logf_data. * sysdeps/ieee754/flt-32/e_logf.c: New implementation. * sysdeps/ieee754/flt-32/e_logf_data.c: New file. * sysdeps/ieee754/flt-32/math_config.h (__logf_data): Define. (LOGF_TABLE_BITS, LOGF_POLY_ORDER): Define. * sysdeps/i386/fpu/e_logf_data.c: New file. * sysdeps/ia64/fpu/e_logf_data.c: New file. * sysdeps/m68k/m680x0/fpu/e_logf_data.c: New file. |
||
Wilco Dijkstra
|
4d3693ec1c |
Remove ancient __signbit inlines
Remove __signbit inlines from mathinline.h. Math.h already uses the builtin when supported, so additional inlines are only used on pre 4.0 GCCs. Similarly remove ancient copysign and fabs inlines. * sysdeps/alpha/fpu/bits/mathinline.h: Delete file. * sysdeps/ia64/fpu/bits/mathinline.h: Delete file. * sysdeps/m68k/coldfire/fpu/bits/mathinline.h: Delete file. * sysdeps/m68k/m680x0/fpu/bits/mathinline.h: (__signbitf): Remove. (__signbit): Remove. (__signbitl): Remove. * sysdeps/powerpc/bits/mathinline.h (__signbitf): Remove. (__signbit): Remove. (__signbitl): Remove. * sysdeps/s390/fpu/bits/mathinline.h: (__signbitf): Remove. (__signbit): Remove. (__signbitl): Remove * sysdeps/sparc/fpu/bits/mathinline.h (__signbitf): Remove. (__signbit): Remove. (__signbitl): Remove. * sysdeps/tile/bits/mathinline.h: Delete file. * sysdeps/x86/fpu/bits/mathinline.h (__signbitf): Remove. (__signbit): Remove. (__signbitl): Remove. |
||
Joseph Myers
|
12ef66c411 |
Fix ia64 executable stack default (bug 22156).
As per https://gcc.gnu.org/ml/gcc-patches/2017-09/msg01220.html ia64 defaults to non-executable stacks in the Linux kernel (furthermore, the use of function descriptors means that trampolines for nested function pointers never need an executable stack). glibc however defines DEFAULT_STACK_PERMS to include PF_X for that architecture, meaning (a) elf/check-execstack fails and (b) (from code inspection, not tested, but this is why I think this is a user-visible bug) thread stacks are unnecessarily mapped with execute permission. This patch fixes the DEFAULT_STACK_PERMS definition in question. Tested (compilation only) with build-many-glibcs.py for ia64. This fixes the check-execstack failure. [BZ #22156] * sysdeps/ia64/stackinfo.h (DEFAULT_STACK_PERMS): Likewise. |
||
Szabolcs Nagy
|
f5f0f52651 |
New expf and exp2f version without SVID compat wrapper
This patch changes the expf and exp2f error handling semantics to only set errno accoring to POSIX rules. New symbol version is introduced at GLIBC_2.27. The old wrappers are kept for compat symbols. Internal calls to __expf now get the new error semantics, this seems to only affect sysdeps/i386/fpu/s_expm1f.S where the errno-only behaviour should be correct. ia64 needed assembly change to have the new and compat versioned symbol map to the same function. All linux libm abilists are updated. * math/Versions (expf): New libm symbol at GLIBC_2.27. (exp2f): Likewise. * math/w_exp2f.c: New file. * math/w_expf.c: New file. * math/w_exp2f_compat.c (__exp2f_compat): For compat symbol only. * math/w_expf_compat.c (__expf_compat): Likewise. * sysdeps/ia64/fpu/e_exp2f.S: Add versioned symbols. * sysdeps/ia64/fpu/e_expf.S: Likewise. * sysdeps/unix/sysv/linux/aarch64/libm.abilist: Update. * sysdeps/unix/sysv/linux/alpha/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/arm/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/hppa/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/i386/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/microblaze/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/nios2/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sh/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/tile/tilegx/tilegx32/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/tile/tilepro/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Likewise. * sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Likewise. |
||
Szabolcs Nagy
|
72aa623345 |
Optimized generic expf and exp2f with wrappers
Based on new expf and exp2f code from https://github.com/ARM-software/optimized-routines/ with wrapper on aarch64: expf reciprocal-throughput: 2.3x faster expf latency: 1.7x faster without wrapper on aarch64: expf reciprocal-throughput: 3.3x faster expf latency: 1.7x faster without wrapper on aarch64: exp2f reciprocal-throughput: 2.8x faster exp2f latency: 1.3x faster libm.so size on aarch64: .text size: -152 bytes .rodata size: -1740 bytes expf/exp2f worst case nearest rounding error: 0.502 ulp worst case non-nearest rounding error: 1 ulp Error checks are inline and errno setting is in separate tail called functions, but the wrappers are kept in this patch to handle the _LIB_VERSION==_SVID_ case. (So e.g. errno is set twice for expf calls and once for __expf_finite calls on targets where the new code is used.) Double precision arithmetics is used which is expected to be faster on most targets (including soft-float) than using single precision and it is easier to get good precision result with it. Const data is kept in a separate translation unit which complicates maintenance a bit, but is expected to give good code for literal loads on most targets and allows sharing data across expf, exp2f and powf. (This data is disabled on i386, m68k and ia64 which have their own expf, exp2f and powf code.) Some details may need target specific tweaks: - best convert and round to int operation in the arg reduction may be different across targets. - code was optimized on fma target, optimal polynomial eval may be different without fma. - gcc does not always generate good code for fp bit representation access via unions or it may be inherently slow on some targets. The libm-test-ulps will need adjustment because.. - The argument reduction ideally uses nearest rounded rint, but that is not efficient on most targets, so the polynomial can get evaluated on a wider interval in non-nearest rounding mode making 1 ulp errors common in that case. - The polynomial is evaluated such that it may have 1 ulp error on negative tiny inputs with upward rounding. * math/Makefile (type-float-routines): Add math_errf and e_exp2f_data. * sysdeps/aarch64/fpu/math_private.h (TOINT_INTRINSICS): Define. (roundtoint, converttoint): Likewise. * sysdeps/ieee754/flt-32/e_expf.c: New implementation. * sysdeps/ieee754/flt-32/e_exp2f.c: New implementation. * sysdeps/ieee754/flt-32/e_exp2f_data.c: New file. * sysdeps/ieee754/flt-32/math_config.h: New file. * sysdeps/ieee754/flt-32/math_errf.c: New file. * sysdeps/ieee754/flt-32/t_exp2f.h: Remove. * sysdeps/i386/fpu/e_exp2f_data.c: New file. * sysdeps/i386/fpu/math_errf.c: New file. * sysdeps/ia64/fpu/e_exp2f_data.c: New file. * sysdeps/ia64/fpu/math_errf.c: New file. * sysdeps/m68k/m680x0/fpu/e_exp2f_data.c: New file. * sysdeps/m68k/m680x0/fpu/math_errf.c: New file. |
||
Joseph Myers
|
4f3647e46e |
Prefer new libm function wrappers for !LIBM_SVID_COMPAT.
The initial obsoletion of SVID libm error handling left the old wrappers and __kernel_standard still being used for new ports and static linking, just with macro definitions of _LIB_VERSION and matherr that meant symbols with those names were never actually used and the code for different error handling variants could be optimized out. This patch cleans things up further by eliminating the __kernel_standard use for new ports and static linking. Now, the old wrappers no longer generate any code in the !LIBM_SVID_COMPAT case, while the new errno-only wrappers that were added for float128 support are now also used for float, double and long double in that case. The changes are generally straightforward. The w_scalb*_compat wrappers continue to be used (scalb is obsolescent in the sense of not being supported for float128, but is present in supported standards - the 2001 edition of POSIX and earlier XSI versions - so remains supported for static linking and new ports, as do the float and long double variants that are existing GNU extensions). Those wrappers would only call __kernel_standard in the _LIB_VERSION == _SVID_ case. Since we would like to be able to compile most of glibc without optimization, relying on a static function whose only use is under an if (0) condition being optimized away to avoid an undefined __kernel_standard reference may not be a good idea. Thus, the relevant code in the scalb wrappers has LIBM_SVID_COMPAT conditionals added to guarantee it's not built at all in the case where __kernel_standard does not exist. Just as i386 has its own w_sqrt_compat.c, so w_sqrt.c is also added. ia64 gets dummy w_*.c to prevent those files being built where they would conflict with the ia64 libm, as with its existing w_*_compat.c. Conditions disabling code for !LIBM_SVID_COMPAT are needed in both the math/ wrappers and in the long double wrappers in ldbl-opt (to avoid them setting up aliases and symbol versions for undefined symbols). I hope that future cleanups to how libm function aliases and symbol versioning are done will eliminate the need for most of the ldbl-opt wrappers. Tested for x86_64 and x86, and with build-many-glibcs.py. * sysdeps/generic/math-type-macros-double.h: Include <math-svid-compat.h>. (__USE_WRAPPER_TEMPLATE): Define to !LIBM_SVID_COMPAT. * sysdeps/generic/math-type-macros-float.h: Include <math-svid-compat.h>. (__USE_WRAPPER_TEMPLATE): Define to !LIBM_SVID_COMPAT. * sysdeps/generic/math-type-macros-ldouble.h: Include <math-svid-compat.h>. (__USE_WRAPPER_TEMPLATE): Define to !LIBM_SVID_COMPAT. * math/lgamma-compat.h (BUILD_LGAMMA): Include LIBM_SVID_COMPAT condition. * math/w_acos_compat.c: Condition contents on [LIBM_SVID_COMPAT]. * math/w_acosf_compat.c: Likewise. * math/w_acosh_compat.c: Likewise. * math/w_acoshf_compat.c: Likewise. * math/w_acoshl_compat.c: Likewise. * math/w_acosl_compat.c: Likewise. * math/w_asin_compat.c: Likewise. * math/w_asinf_compat.c: Likewise. * math/w_asinl_compat.c: Likewise. * math/w_atan2_compat.c: Likewise. * math/w_atan2f_compat.c: Likewise. * math/w_atan2l_compat.c: Likewise. * math/w_atanh_compat.c: Likewise. * math/w_atanhf_compat.c: Likewise. * math/w_atanhl_compat.c: Likewise. * math/w_cosh_compat.c: Likewise. * math/w_coshf_compat.c: Likewise. * math/w_coshl_compat.c: Likewise. * math/w_exp10_compat.c: Likewise. * math/w_exp10f_compat.c: Likewise. * math/w_exp10l_compat.c: Likewise. * math/w_exp2_compat.c: Likewise. * math/w_exp2f_compat.c: Likewise. * math/w_exp2l_compat.c: Likewise. * math/w_fmod_compat.c: Likewise. * math/w_fmodf_compat.c: Likewise. * math/w_fmodl_compat.c: Likewise. * math/w_hypot_compat.c: Likewise. * math/w_hypotf_compat.c: Likewise. * math/w_hypotl_compat.c: Likewise. * math/w_j0_compat.c: Likewise. * math/w_j0f_compat.c: Likewise. * math/w_j0l_compat.c: Likewise. * math/w_j1_compat.c: Likewise. * math/w_j1f_compat.c: Likewise. * math/w_j1l_compat.c: Likewise. * math/w_jn_compat.c: Likewise. * math/w_jnf_compat.c: Likewise. * math/w_jnl_compat.c: Likewise. * math/w_lgamma_r_compat.c: Likewise. * math/w_lgammaf_r_compat.c: Likewise. * math/w_lgammal_r_compat.c: Likewise. * math/w_log10_compat.c: Likewise. * math/w_log10f_compat.c: Likewise. * math/w_log10l_compat.c: Likewise. * math/w_log2_compat.c: Likewise. * math/w_log2f_compat.c: Likewise. * math/w_log2l_compat.c: Likewise. * math/w_log_compat.c: Likewise. * math/w_logf_compat.c: Likewise. * math/w_logl_compat.c: Likewise. * math/w_pow_compat.c: Likewise. * math/w_powf_compat.c: Likewise. * math/w_powl_compat.c: Likewise. * math/w_remainder_compat.c: Likewise. * math/w_remainderf_compat.c: Likewise. * math/w_remainderl_compat.c: Likewise. * math/w_sinh_compat.c: Likewise. * math/w_sinhf_compat.c: Likewise. * math/w_sinhl_compat.c: Likewise. * math/w_sqrt_compat.c: Likewise. * math/w_sqrtf_compat.c: Likewise. * math/w_sqrtl_compat.c: Likewise. * math/w_tgamma_compat.c: Likewise. * math/w_tgammaf_compat.c: Likewise. * math/w_tgammal_compat.c: Likewise. * math/w_scalb_compat.c (sysv_scalb): Condition definition on [LIBM_SVID_COMPAT]. (__scalb): Condition call to sysv_scalb on [LIBM_SVID_COMPAT]. * math/w_scalbf_compat.c (sysv_scalbf): Condition definition on [LIBM_SVID_COMPAT]. (__scalbf): Condition call to sysv_scalbf on [LIBM_SVID_COMPAT]. * math/w_scalbl_compat.c (sysv_scalbl): Condition definition on [LIBM_SVID_COMPAT]. (__scalbl): Condition call to sysv_scalbl on [LIBM_SVID_COMPAT]. * sysdeps/i386/fpu/w_sqrt.c: New file. * sysdeps/ia64/fpu/w_acos.c: Likewise. * sysdeps/ia64/fpu/w_acosf.c: Likewise. * sysdeps/ia64/fpu/w_acosh.c: Likewise. * sysdeps/ia64/fpu/w_acoshf.c: Likewise. * sysdeps/ia64/fpu/w_acoshl.c: Likewise. * sysdeps/ia64/fpu/w_acosl.c: Likewise. * sysdeps/ia64/fpu/w_asin.c: Likewise. * sysdeps/ia64/fpu/w_asinf.c: Likewise. * sysdeps/ia64/fpu/w_asinl.c: Likewise. * sysdeps/ia64/fpu/w_atan2.c: Likewise. * sysdeps/ia64/fpu/w_atan2f.c: Likewise. * sysdeps/ia64/fpu/w_atan2l.c: Likewise. * sysdeps/ia64/fpu/w_atanh.c: Likewise. * sysdeps/ia64/fpu/w_atanhf.c: Likewise. * sysdeps/ia64/fpu/w_atanhl.c: Likewise. * sysdeps/ia64/fpu/w_cosh.c: Likewise. * sysdeps/ia64/fpu/w_coshf.c: Likewise. * sysdeps/ia64/fpu/w_coshl.c: Likewise. * sysdeps/ia64/fpu/w_exp.c: Likewise. * sysdeps/ia64/fpu/w_exp10.c: Likewise. * sysdeps/ia64/fpu/w_exp10f.c: Likewise. * sysdeps/ia64/fpu/w_exp10l.c: Likewise. * sysdeps/ia64/fpu/w_exp2.c: Likewise. * sysdeps/ia64/fpu/w_exp2f.c: Likewise. * sysdeps/ia64/fpu/w_exp2l.c: Likewise. * sysdeps/ia64/fpu/w_expf.c: Likewise. * sysdeps/ia64/fpu/w_expl.c: Likewise. * sysdeps/ia64/fpu/w_fmod.c: Likewise. * sysdeps/ia64/fpu/w_fmodf.c: Likewise. * sysdeps/ia64/fpu/w_fmodl.c: Likewise. * sysdeps/ia64/fpu/w_hypot.c: Likewise. * sysdeps/ia64/fpu/w_hypotf.c: Likewise. * sysdeps/ia64/fpu/w_hypotl.c: Likewise. * sysdeps/ia64/fpu/w_lgamma_r.c: Likewise. * sysdeps/ia64/fpu/w_lgammaf_r.c: Likewise. * sysdeps/ia64/fpu/w_lgammal_r.c: Likewise. * sysdeps/ia64/fpu/w_log.c: Likewise. * sysdeps/ia64/fpu/w_log10.c: Likewise. * sysdeps/ia64/fpu/w_log10f.c: Likewise. * sysdeps/ia64/fpu/w_log10l.c: Likewise. * sysdeps/ia64/fpu/w_log2.c: Likewise. * sysdeps/ia64/fpu/w_log2f.c: Likewise. * sysdeps/ia64/fpu/w_log2l.c: Likewise. * sysdeps/ia64/fpu/w_logf.c: Likewise. * sysdeps/ia64/fpu/w_logl.c: Likewise. * sysdeps/ia64/fpu/w_pow.c: Likewise. * sysdeps/ia64/fpu/w_powf.c: Likewise. * sysdeps/ia64/fpu/w_powl.c: Likewise. * sysdeps/ia64/fpu/w_remainder.c: Likewise. * sysdeps/ia64/fpu/w_remainderf.c: Likewise. * sysdeps/ia64/fpu/w_remainderl.c: Likewise. * sysdeps/ia64/fpu/w_sinh.c: Likewise. * sysdeps/ia64/fpu/w_sinhf.c: Likewise. * sysdeps/ia64/fpu/w_sinhl.c: Likewise. * sysdeps/ia64/fpu/w_sqrt.c: Likewise. * sysdeps/ia64/fpu/w_sqrtf.c: Likewise. * sysdeps/ia64/fpu/w_sqrtl.c: Likewise. * sysdeps/ia64/fpu/w_tgamma.c: Likewise. * sysdeps/ia64/fpu/w_tgammaf.c: Likewise. * sysdeps/ia64/fpu/w_tgammal.c: Likewise. * sysdeps/ieee754/dbl-64/w_exp_compat.c: Condition contents on [LIBM_SVID_COMPAT]. * sysdeps/ieee754/flt-32/w_expf_compat.c: Likewise. * sysdeps/ieee754/k_standard.c: Likewise. * sysdeps/ieee754/k_standardf.c: Likewise. * sysdeps/ieee754/k_standardl.c: Likewise. * sysdeps/ieee754/ldbl-128/w_expl_compat.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/w_expl_compat.c: Likewise. * sysdeps/ieee754/ldbl-96/w_expl_compat.c: Likewise. * sysdeps/ieee754/ldbl-64-128/w_expl_compat.c: Condition long_double_symbol call on [LIBM_SVID_COMPAT]. * sysdeps/ieee754/ldbl-opt/w_acoshl_compat.c: Likewise. * sysdeps/ieee754/ldbl-opt/w_acosl_compat.c: Likewise. * sysdeps/ieee754/ldbl-opt/w_asinl_compat.c: Likewise. * sysdeps/ieee754/ldbl-opt/w_atan2l_compat.c: Likewise. * sysdeps/ieee754/ldbl-opt/w_atanhl_compat.c: Likewise. * sysdeps/ieee754/ldbl-opt/w_coshl_compat.c: Likewise. * sysdeps/ieee754/ldbl-opt/w_fmodl_compat.c: Likewise. * sysdeps/ieee754/ldbl-opt/w_hypotl_compat.c: Likewise. * sysdeps/ieee754/ldbl-opt/w_j0l_compat.c: Likewise. * sysdeps/ieee754/ldbl-opt/w_j1l_compat.c: Likewise. * sysdeps/ieee754/ldbl-opt/w_jnl_compat.c: Likewise. * sysdeps/ieee754/ldbl-opt/w_lgammal_r_compat.c: Likewise. * sysdeps/ieee754/ldbl-opt/w_log10l_compat.c: Likewise. * sysdeps/ieee754/ldbl-opt/w_log2l_compat.c: Likewise. * sysdeps/ieee754/ldbl-opt/w_logl_compat.c: Likewise. * sysdeps/ieee754/ldbl-opt/w_powl_compat.c: Likewise. * sysdeps/ieee754/ldbl-opt/w_remainderl_compat.c: Likewise. * sysdeps/ieee754/ldbl-opt/w_sinhl_compat.c: Likewise. * sysdeps/ieee754/ldbl-opt/w_sqrtl_compat.c: Likewise. * sysdeps/ieee754/ldbl-opt/w_tgammal_compat.c: Likewise. * sysdeps/ieee754/ldbl-opt/w_exp10l_compat.c: Condition long_double_symbol and compat_symbol calls on [LIBM_SVID_COMPAT]. |
||
Joseph Myers
|
5a80d39d0d |
Obsolete pow10 functions.
This patch obsoletes the pow10, pow10f and pow10l functions (makes them into compat symbols, not available for new ports or static linking). The exp10 names for these functions are standardized (in TS 18661-4) and were added in the same glibc version (2.1) as pow10 so source code can change to use them without any loss of portability. Since pow10 is deliberately not provided for _Float128, only exp10, this slightly simplifies moving to the new wrapper templates in the !LIBM_SVID_COMPAT case, by avoiding needing to arrange for pow10, pow10f and pow10l to be defined by those templates. Tested for x86_64, and with build-many-glibcs.py. * manual/math.texi (pow10): Do not document. (pow10f): Likewise. (pow10l): Likewise. * math/bits/mathcalls.h [__USE_GNU] (pow10): Do not declare. * math/bits/math-finite.h [__USE_GNU] (pow10): Likewise. * math/libm-test-exp10.inc (pow10_test): Remove. (do_test): Do not call pow10. * math/w_exp10_compat.c (pow10): Make into compat symbol. [NO_LONG_DOUBLE] (pow10l): Likewise. * math/w_exp10f_compat.c (pow10f): Likewise. * math/w_exp10l_compat.c (pow10l): Likewise. * sysdeps/ia64/fpu/e_exp10.S: Include <shlib-compat.h>. (pow10): Make into compat symbol. * sysdeps/ia64/fpu/e_exp10f.S: Include <shlib-compat.h>. (pow10f): Make into compat symbol. * sysdeps/ia64/fpu/e_exp10l.S: Include <shlib-compat.h>. (pow10l): Make into compat symbol. * sysdeps/ieee754/ldbl-opt/Makefile (libnldbl-calls): Remove pow10. (CFLAGS-nldbl-pow10.c): Remove variable.. * sysdeps/ieee754/ldbl-opt/nldbl-pow10.c: Remove file. * sysdeps/ieee754/ldbl-opt/w_exp10_compat.c (pow10l): Condition on [SHLIB_COMPAT (libm, GLIBC_2_1, GLIBC_2_27)]. * sysdeps/ieee754/ldbl-opt/w_exp10l_compat.c (compat_symbol): Undefine and redefine. (pow10l): Make into compat symbol. * sysdeps/aarch64/libm-test-ulps: Remove pow10 ulps. * sysdeps/alpha/fpu/libm-test-ulps: Likewise. * sysdeps/arm/libm-test-ulps: Likewise. * sysdeps/hppa/fpu/libm-test-ulps: Likewise. * sysdeps/i386/fpu/libm-test-ulps: Likewise. * sysdeps/i386/i686/fpu/multiarch/libm-test-ulps: Likewise. * sysdeps/microblaze/libm-test-ulps: Likewise. * sysdeps/mips/mips32/libm-test-ulps: Likewise. * sysdeps/mips/mips64/libm-test-ulps: Likewise. * sysdeps/nios2/libm-test-ulps: Likewise. * sysdeps/powerpc/fpu/libm-test-ulps: Likewise. * sysdeps/powerpc/nofpu/libm-test-ulps: Likewise. * sysdeps/s390/fpu/libm-test-ulps: Likewise. * sysdeps/sh/libm-test-ulps: Likewise. * sysdeps/sparc/fpu/libm-test-ulps: Likewise. * sysdeps/tile/libm-test-ulps: Likewise. * sysdeps/x86_64/fpu/libm-test-ulps: Likewise. |
||
Joseph Myers
|
a60eca2e55 |
Simplify HUGE_VAL definitions.
There are various bits/huge_val*.h headers to define HUGE_VAL and related macros. All of them use __builtin_huge_val etc. for GCC 3.3 and later. Then there are various fallbacks, such as using a large hex float constant for GCC 2.96 and later, or using unions (with or without compound literals) to construct the bytes of an infinity, with this last being the reason for having architecture-specific files. Supporting TS 18661-3 _FloatN / _FloatNx types that have the same format as other supported types will mean adding more such macros; needing to add more headers for them doesn't seem very desirable. The fallbacks based on bytes of the representation of an infinity do not meet the standard requirements for a constant expression. At least one of them is also wrong: sysdeps/sh/bits/huge_val.h is producing a mixed-endian representation which does not match what GCC does. This patch eliminates all those headers, defining the macros directly in math.h. For GCC 3.3 and later, the built-in functions are used as now. For other compilers, a large constant 1e10000 (with appropriate suffix) is used. This is like the fallback for GCC 2.96 and later, but without using hex floats (which have no apparent advantage here). It is unambiguously valid standard C for all floating-point formats with infinities, which covers all formats supported by glibc or likely to be supported by glibc in future (C90 DR#025 said that if a floating-point format represents infinities, all real values lie within the range of representable values, so the constraints for constant expressions are not violated), but may generate compiler warnings and wouldn't handle the TS 18661-1 FENV_ROUND pragma correctly. If someone is actually using a compiler with glibc that does not claim to be GCC 3.3 or later, but which has a better way to define the HUGE_VAL macros, we can always add compiler conditionals in with alternative definitions. I intend to make similar changes for INF and NAN. The SNAN macros already just use __builtin_nans etc. with no fallback for compilers not claiming to be GCC 3.3 or later. Tested for x86_64. * math/math.h: Do not include bits/huge_val.h, bits/huge_valf.h, bits/huge_vall.h or bits/huge_val_flt128.h. (HUGE_VAL): Define directly here. [__USE_ISOC99] (HUGE_VALF): Likewise. [__USE_ISOC99] (HUGE_VALL): Likewise. [__HAVE_FLOAT128 && __GLIBC_USE (IEC_60559_TYPES_EXT)] (HUGE_VAL_F128): Likewise. * math/Makefile (headers): Remove bits/huge_val.h, bits/huge_valf.h, bits/huge_vall.h and bits/huge_val_flt128.h. * bits/huge_val.h: Remove. * bits/huge_val_flt128.h: Likewise. * bits/huge_valf.h: Likewise. * bits/huge_vall.h: Likewise. * sysdeps/ia64/bits/huge_vall.h: Likewise. * sysdeps/ieee754/bits/huge_val.h: Likewise. * sysdeps/ieee754/bits/huge_valf.h: Likewise. * sysdeps/m68k/m680x0/bits/huge_vall.h: Likewise. * sysdeps/sh/bits/huge_val.h: Likewise. * sysdeps/sparc/bits/huge_vall.h: Likewise. * sysdeps/x86/bits/huge_vall.h: Likewise. |
||
Joseph Myers
|
813378e9fe |
Obsolete matherr, _LIB_VERSION, libieee.a.
This patch obsoletes support for SVID libm error handling (the system where a user-defined function matherr is called on a libm function error; only enabled if you also set _LIB_VERSION = _SVID_ or _LIB_VERSION = _XOPEN_) and the use of the _LIB_VERSION global variable to control libm error handling. matherr and _LIB_VERSION are made into compat symbols, not supported for new ports or for static linking. The libieee.a object file (which sets _LIB_VERSION = _IEEE_, so disabling errno setting for some functions) is also removed, and all the related definitions are removed from math.h. The manual already recommends against using matherr, and it's already not supported for _Float128 functions (those use new wrappers that don't support matherr, only errno) - this patch means that it becomes possible to e.g. add sinf32 as an alias to sinf without that resulting in undesired matherr support in sinf32 for existing glibc ports. matherr support is not part of any standard supported by glibc (it was removed in XPG4). Because matherr is a function to be defined by the user, of course user programs defining such a function will still continue to link; it just quietly won't be used. If they try to write to the library's copy of _LIB_VERSION to enable SVID error handling, however, they will get a link error (but if they define their own _LIB_VERSION variable, they won't). I expect the most likely case of build failures from this patch to be programs with unconditional cargo-culted uses of -lieee (based on a notion of "I want IEEE floating point", not any actual requirement for that library). Ideally, the new-port-or-static-linking case would use the new wrappers used for _Float128. This is not implemented in this patch, because of the complication of architecture-specific (powerpc32 and sparc) sqrt wrappers that use _LIB_VERSION and __kernel_standard directly. Thus, the old wrappers and __kernel_standard are still built unconditionally, and _LIB_VERSION still exists in static libm. But when the old wrappers and __kernel_standard are built in the non-compat case, _LIB_VERSION and matherr are defined as macros so code to support those features isn't actually built into static libm or new ports' shared libm after this patch. I intend to move to the new wrappers for static libm and new ports in followup patches. I believe the sqrt wrappers for powerpc32 and sparc can reasonably be removed. GCC already optimizes the normal case of sqrt by generating code that uses a hardware instruction and only calls the sqrt function if the argument was negative (if -fno-math-errno, of course, it just uses the hardware instruction without any check for negative argument being needed). Thus those wrappers will only actually get called in the case of negative arguments, which is not a case it makes sense to optimize for. But even without removing the powerpc32 and sparc wrappers it should still be possible to move to the new wrappers for static libm and new ports, just without having those dubious architecture-specific optimizations in static libm. Everything said about matherr equally applies to matherrf and matherrl (IA64-specific, undocumented), except that the structure of IA64 libm means it won't be converted to using the new wrappers (it doesn't use the old ones either, but its own error-handling code instead). As with other tests of compat symbols, I expect test-matherr and test-matherr-2 to need to become appropriately conditional once we have a system for disabling such tests for ports too new to have the relevant symbols. Tested for x86_64 and x86, and with build-many-glibcs.py. * math/math.h [__USE_MISC] (_LIB_VERSION_TYPE): Remove. [__USE_MISC] (_LIB_VERSION): Likewise. [__USE_MISC] (struct exception): Likewise. [__USE_MISC] (matherr): Likewise. [__USE_MISC] (DOMAIN): Likewise. [__USE_MISC] (SING): Likewise. [__USE_MISC] (OVERFLOW): Likewise. [__USE_MISC] (UNDERFLOW): Likewise. [__USE_MISC] (TLOSS): Likewise. [__USE_MISC] (PLOSS): Likewise. [__USE_MISC] (HUGE): Likewise. [__USE_XOPEN] (MAXFLOAT): Define even if [__USE_MISC]. * math/math-svid-compat.h: New file. * conform/linknamespace.pl (@whitelist): Remove matherr, matherrf and matherrl. * include/math.h [!_ISOMAC] (__matherr): Remove. * manual/arith.texi (FP Exceptions): Do not document matherr. * math/Makefile (tests): Change test-matherr to test-matherr-3. (tests-internal): New variable. (install-lib): Do not add libieee.a. (non-lib.a): Likewise. (extra-objs): Do not add libieee.a and ieee-math.o. (CPPFLAGS-s_lib_version.c): Remove variable. ($(objpfx)libieee.a): Remove rule. ($(addprefix $(objpfx), $(tests-internal)): Depend on $(libm). * math/ieee-math.c: Remove. * math/libm-test-support.c (matherr): Remove. * math/test-matherr.c: Use <support/test-driver.c>. Add copyright and license notices. Include <math-svid-compat.h> and <shlib-compat.h>. (matherr): Undefine as macro. Use compat_symbol_reference. (_LIB_VERSION): Likewise. * math/test-matherr-2.c: New file. * math/test-matherr-3.c: Likewise. * sysdeps/generic/math_private.h (__kernel_standard): Remove declaration. (__kernel_standard_f): Likewise. (__kernel_standard_l): Likewise. * sysdeps/ieee754/s_lib_version.c: Do not include <math.h> or <math_private.h>. Include <math-svid-compat.h>. (_LIB_VERSION): Undefine as macro. (_LIB_VERSION_INTERNAL): Always initialize to _POSIX_. Define only if [LIBM_SVID_COMPAT || !defined SHARED]. If [LIBM_SVID_COMPAT], use compat_symbol. * sysdeps/ieee754/s_matherr.c: Do not include <math.h> or <math_private.h>. Include <math-svid-compat.h>. (matherr): Undefine as macro. (__matherr): Define only if [LIBM_SVID_COMPAT]. Use compat_symbol. * sysdeps/ia64/fpu/libm_error.c: Include <math-svid-compat.h>. [_LIBC && LIBM_SVID_COMPAT] (matherrf): Use compat_symbol_reference. [_LIBC && LIBM_SVID_COMPAT] (matherrl): Likewise. [_LIBC && !LIBM_SVID_COMPAT] (matherrf): Define as macro. [_LIBC && !LIBM_SVID_COMPAT] (matherrl): Likewise. * sysdeps/ia64/fpu/libm_support.h: Include <math-svid-compat.h>. (MATHERR_D): Remove declaration. [!_LIBC] (_LIB_VERSION_TYPE): Likewise [!LIBM_BUILD] (_LIB_VERSIONIMF): Likewise. [LIBM_BUILD] (pmatherrf): Likewise. [LIBM_BUILD] (pmatherr): Likewise. [LIBM_BUILD] (pmatherrl): Likewise. (DOMAIN): Likewise. (SING): Likewise. (OVERFLOW): Likewise. (UNDERFLOW): Likewise. (TLOSS): Likewise. (PLOSS): Likewise. * sysdeps/ia64/fpu/s_matherrf.c: Include <math-svid-compat.h>. (__matherrf): Define only if [LIBM_SVID_COMPAT]. Use compat_symbol. * sysdeps/ia64/fpu/s_matherrl.c: Include <math-svid-compat.h>. (__matherrl): Define only if [LIBM_SVID_COMPAT]. Use compat_symbol. * math/lgamma-compat.h: Include <math-svid-compat.h>. * math/w_acos_compat.c: Likewise. * math/w_acosf_compat.c: Likewise. * math/w_acosh_compat.c: Likewise. * math/w_acoshf_compat.c: Likewise. * math/w_acoshl_compat.c: Likewise. * math/w_acosl_compat.c: Likewise. * math/w_asin_compat.c: Likewise. * math/w_asinf_compat.c: Likewise. * math/w_asinl_compat.c: Likewise. * math/w_atan2_compat.c: Likewise. * math/w_atan2f_compat.c: Likewise. * math/w_atan2l_compat.c: Likewise. * math/w_atanh_compat.c: Likewise. * math/w_atanhf_compat.c: Likewise. * math/w_atanhl_compat.c: Likewise. * math/w_cosh_compat.c: Likewise. * math/w_coshf_compat.c: Likewise. * math/w_coshl_compat.c: Likewise. * math/w_exp10_compat.c: Likewise. * math/w_exp10f_compat.c: Likewise. * math/w_exp10l_compat.c: Likewise. * math/w_exp2_compat.c: Likewise. * math/w_exp2f_compat.c: Likewise. * math/w_exp2l_compat.c: Likewise. * math/w_fmod_compat.c: Likewise. * math/w_fmodf_compat.c: Likewise. * math/w_fmodl_compat.c: Likewise. * math/w_hypot_compat.c: Likewise. * math/w_hypotf_compat.c: Likewise. * math/w_hypotl_compat.c: Likewise. * math/w_j0_compat.c: Likewise. * math/w_j0f_compat.c: Likewise. * math/w_j0l_compat.c: Likewise. * math/w_j1_compat.c: Likewise. * math/w_j1f_compat.c: Likewise. * math/w_j1l_compat.c: Likewise. * math/w_jn_compat.c: Likewise. * math/w_jnf_compat.c: Likewise. * math/w_jnl_compat.c: Likewise. * math/w_lgamma_main.c: Likewise. * math/w_lgamma_r_compat.c: Likewise. * math/w_lgammaf_main.c: Likewise. * math/w_lgammaf_r_compat.c: Likewise. * math/w_lgammal_main.c: Likewise. * math/w_lgammal_r_compat.c: Likewise. * math/w_log10_compat.c: Likewise. * math/w_log10f_compat.c: Likewise. * math/w_log10l_compat.c: Likewise. * math/w_log2_compat.c: Likewise. * math/w_log2f_compat.c: Likewise. * math/w_log2l_compat.c: Likewise. * math/w_log_compat.c: Likewise. * math/w_logf_compat.c: Likewise. * math/w_logl_compat.c: Likewise. * math/w_pow_compat.c: Likewise. * math/w_powf_compat.c: Likewise. * math/w_powl_compat.c: Likewise. * math/w_remainder_compat.c: Likewise. * math/w_remainderf_compat.c: Likewise. * math/w_remainderl_compat.c: Likewise. * math/w_scalb_compat.c: Likewise. * math/w_scalbf_compat.c: Likewise. * math/w_scalbl_compat.c: Likewise. * math/w_sinh_compat.c: Likewise. * math/w_sinhf_compat.c: Likewise. * math/w_sinhl_compat.c: Likewise. * math/w_sqrt_compat.c: Likewise. * math/w_sqrtf_compat.c: Likewise. * math/w_sqrtl_compat.c: Likewise. * math/w_tgamma_compat.c: Likewise. * math/w_tgammaf_compat.c: Likewise. * math/w_tgammal_compat.c: Likewise. * sysdeps/ieee754/dbl-64/w_exp_compat.c: Likewise. * sysdeps/ieee754/flt-32/w_expf_compat.c: Likewise. * sysdeps/ieee754/k_standard.c: Likewise. * sysdeps/ieee754/k_standardf.c: Likewise. * sysdeps/ieee754/k_standardl.c: Likewise. * sysdeps/ieee754/ldbl-128/w_expl_compat.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/w_expl_compat.c: Likewise. * sysdeps/ieee754/ldbl-96/w_expl_compat.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/w_sqrt_compat.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/w_sqrtf_compat.S: Likewise. * sysdeps/powerpc/powerpc32/power5/fpu/w_sqrt_compat.S: Likewise. * sysdeps/powerpc/powerpc32/power5/fpu/w_sqrtf_compat.S: Likewise. * sysdeps/sparc/sparc32/fpu/w_sqrt_compat.S: Likewise. * sysdeps/sparc/sparc32/fpu/w_sqrtf_compat.S: Likewise. * sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/w_sqrt_compat-vis3.S: Likewise. * sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/w_sqrtf_compat-vis3.S: Likewise. * sysdeps/sparc/sparc32/sparcv9/fpu/w_sqrt_compat.S: Likewise. * sysdeps/sparc/sparc32/sparcv9/fpu/w_sqrtf_compat.S: Likewise. * sysdeps/sparc/sparc64/fpu/w_sqrt_compat.S: Likewise. * sysdeps/sparc/sparc64/fpu/w_sqrtf_compat.S: Likewise. |
||
Gabriel F. T. Gomes
|
8466ee1cb7 |
float128: Add signbit alternative for old compilers
In math/math.h, __MATH_TG will expand signbit to __builtin_signbit*, e.g.: __builtin_signbitf128, before GCC 6. However, there has never been a __builtin_signbitf128 in GCC and the type-generic builtin is only available since GCC 6. For older GCC, this patch defines __builtin_signbitf128 to __signbitf128, so that the internal function is used instead of the non-existent builtin. This patch also changes the implementation of __signbitf128, because it was reusing the implementation of __signbitl from ldbl-128, which calls __builtin_signbitl. Using the long double version of the builtin is not correct on machines where _Float128 is ABI-distinct from long double (i.e.: ia64, powerpc64le, x86, x86_84). The new implementation does not rely on builtins when being built with GCC versions older than 6.0. The new code does not currently affect powerpc64le builds, because only GCC 6.2 fulfills the requirements from configure. It might affect powerpc64le builds if those requirements are backported to older versions of the compiler. The new code affects x86_64 builds, since glibc is supposed to build correctly with older versions of GCC. Tested for powerpc64le and x86_64. * include/math.h (__signbitf128): Define as hidden. * sysdeps/ieee754/float128/s_signbitf128.c (__signbitf128): Reimplement without builtins. * sysdeps/ia64/bits/floatn.h [!__GNUC_PREREQ (6, 0)] (__builtin_signbitf128): Define to __signbitf128. * sysdeps/powerpc/bits/floatn.h: Likewise. * sysdeps/x86/bits/floatn.h: Likewise. |
||
Joseph Myers
|
034e738021 |
Add float128 support for ia64.
This patch enables float128 support for ia64, so that all the configurations where GCC supports _Float128 / __float128 as an ABI-distinct type now have glibc support as well. bits/floatn.h declares the support to be available for GCC 4.4 and later, which is when the libgcc support was added. The removal of sysdeps/ia64/fpu/k_rem_pio2.c is because the generic k_rem_pio2.c defines a function required by the float128 code. Tested (compilation only) with build-many-glibcs.py for ia64 (GCC 6 and GCC 7). Given how long it is since libm-test-ulps has been updated for ia64, I think truncating the file and regenerating it from scratch would be a good idea when doing a regeneration to add float128 ulps. I expect various ia64 libm issues (at least some already filed in Bugzilla) to result in test failures even after ulps regeneration, but hopefully the float128 code will pass tests as it's the same as used on other architectures. * sysdeps/ia64/Implies: Add ieee754/float128. * sysdeps/ia64/bits/floatn.h: New file. * sysdeps/ia64/float128-abi.h: Likewise. * manual/math.texi (Mathematics): Document support for _Float128 on ia64. * sysdeps/ia64/Makefile [$(subdir) = math] (CPPFLAGS): Append to Makefile variable. * sysdeps/ia64/fpu/e_sqrtf128.c: New file. * sysdeps/ia64/fpu/k_rem_pio2.c: Remove file. * sysdeps/ia64/fpu/sfp-machine.h: New file. Based on libgcc. * sysdeps/ia64/math-tests.h: New file. * math/libm-test-support.h (XFAIL_FLOAT128_PAYLOAD): Also define based on TEST_COND_binary128 for [__ia64__]. * sysdeps/unix/sysv/linux/ia64/libc.abilist: Update. * sysdeps/unix/sysv/linux/ia64/libm.abilist: Likewise, |
||
Alan Modra
|
0572433b5b |
PowerPC64 ELFv2 PPC64_OPT_LOCALENTRY
ELFv2 functions with localentry:0 are those with a single entry point, ie. global entry == local entry, that have no requirement on r2 or r12 and guarantee r2 is unchanged on return. Such an external function can be called via the PLT without saving r2 or restoring it on return, avoiding a common load-hit-store for small functions. This patch implements the ld.so changes necessary for this optimization. ld.so needs to check that an optimized plt call sequence is in fact calling a function implemented with localentry:0, end emit a fatal error otherwise. The elf/testobj6.c change is to stop "error while loading shared libraries: expected localentry:0 `preload'" when running elf/preloadtest, which we'd get otherwise. * elf/elf.h (PPC64_OPT_LOCALENTRY): Define. * sysdeps/alpha/dl-machine.h (elf_machine_fixup_plt): Add refsym and sym parameters. Adjust callers. * sysdeps/aarch64/dl-machine.h (elf_machine_fixup_plt): Likewise. * sysdeps/arm/dl-machine.h (elf_machine_fixup_plt): Likewise. * sysdeps/generic/dl-machine.h (elf_machine_fixup_plt): Likewise. * sysdeps/hppa/dl-machine.h (elf_machine_fixup_plt): Likewise. * sysdeps/i386/dl-machine.h (elf_machine_fixup_plt): Likewise. * sysdeps/ia64/dl-machine.h (elf_machine_fixup_plt): Likewise. * sysdeps/m68k/dl-machine.h (elf_machine_fixup_plt): Likewise. * sysdeps/microblaze/dl-machine.h (elf_machine_fixup_plt): Likewise. * sysdeps/mips/dl-machine.h (elf_machine_fixup_plt): Likewise. * sysdeps/nios2/dl-machine.h (elf_machine_fixup_plt): Likewise. * sysdeps/powerpc/powerpc32/dl-machine.h (elf_machine_fixup_plt): Likewise. * sysdeps/s390/s390-32/dl-machine.h (elf_machine_fixup_plt): Likewise. * sysdeps/s390/s390-64/dl-machine.h (elf_machine_fixup_plt): Likewise. * sysdeps/sh/dl-machine.h (elf_machine_fixup_plt): Likewise. * sysdeps/sparc/sparc32/dl-machine.h (elf_machine_fixup_plt): Likewise. * sysdeps/sparc/sparc64/dl-machine.h (elf_machine_fixup_plt): Likewise. * sysdeps/tile/dl-machine.h (elf_machine_fixup_plt): Likewise. * sysdeps/x86_64/dl-machine.h (elf_machine_fixup_plt): Likewise. * sysdeps/powerpc/powerpc64/dl-machine.c (_dl_error_localentry): New. (_dl_reloc_overflow): Increase buffser size. Formatting. * sysdeps/powerpc/powerpc64/dl-machine.h (ppc64_local_entry_offset): Delete reloc param, add refsym and sym. Check optimized plt call stubs for localentry:0 functions. Adjust callers. (elf_machine_fixup_plt, elf_machine_plt_conflict): Add refsym and sym parameters. Adjust callers. (_dl_reloc_overflow): Move attribute. (_dl_error_localentry): Declare. * elf/dl-runtime.c (_dl_fixup): Save original sym. Pass refsym and sym to elf_machine_fixup_plt. * elf/testobj6.c (preload): Call printf. |
||
Stefan Liebler
|
12d2dd7060 |
Optimize generic spinlock code and use C11 like atomic macros.
This patch optimizes the generic spinlock code. The type pthread_spinlock_t is a typedef to volatile int on all archs. Passing a volatile pointer to the atomic macros which are not mapped to the C11 atomic builtins can lead to extra stores and loads to stack if such a macro creates a temporary variable by using "__typeof (*(mem)) tmp;". Thus, those macros which are used by spinlock code - atomic_exchange_acquire, atomic_load_relaxed, atomic_compare_exchange_weak - have to be adjusted. According to the comment from Szabolcs Nagy, the type of a cast expression is unqualified (see http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_423.htm): __typeof ((__typeof (*(mem)) *(mem)) tmp; Thus from spinlock perspective the variable tmp is of type int instead of type volatile int. This patch adjusts those macros in include/atomic.h. With this construct GCC >= 5 omits the extra stores and loads. The atomic macros are replaced by the C11 like atomic macros and thus the code is aligned to it. The pthread_spin_unlock implementation is now using release memory order instead of sequentially consistent memory order. The issue with passed volatile int pointers applies to the C11 like atomic macros as well as the ones used before. I've added a glibc_likely hint to the first atomic exchange in pthread_spin_lock in order to return immediately to the caller if the lock is free. Without the hint, there is an additional jump if the lock is free. I've added the atomic_spin_nop macro within the loop of plain reads. The plain reads are also realized by C11 like atomic_load_relaxed macro. The new define ATOMIC_EXCHANGE_USES_CAS determines if the first try to acquire the spinlock in pthread_spin_lock or pthread_spin_trylock is an exchange or a CAS. This is defined in atomic-machine.h for all architectures. The define SPIN_LOCK_READS_BETWEEN_CMPXCHG is now removed. There is no technical reason for throwing in a CAS every now and then, and so far we have no evidence that it can improve performance. If that would be the case, we have to adjust other spin-waiting loops elsewhere, too! Using a CAS loop without plain reads is not a good idea on many targets and wasn't used by one. Thus there is now no option to do so. Architectures are now using the generic spinlock automatically if they do not provide an own implementation. Thus the pthread_spin_lock.c files in sysdeps folder are deleted. ChangeLog: * NEWS: Mention new spinlock implementation. * include/atomic.h: (__atomic_val_bysize): Cast type to omit volatile qualifier. (atomic_exchange_acq): Likewise. (atomic_load_relaxed): Likewise. (ATOMIC_EXCHANGE_USES_CAS): Check definition. * nptl/pthread_spin_init.c (pthread_spin_init): Use atomic_store_relaxed. * nptl/pthread_spin_lock.c (pthread_spin_lock): Use C11-like atomic macros. * nptl/pthread_spin_trylock.c (pthread_spin_trylock): Likewise. * nptl/pthread_spin_unlock.c (pthread_spin_unlock): Use atomic_store_release. * sysdeps/aarch64/nptl/pthread_spin_lock.c: Delete File. * sysdeps/arm/nptl/pthread_spin_lock.c: Likewise. * sysdeps/hppa/nptl/pthread_spin_lock.c: Likewise. * sysdeps/m68k/nptl/pthread_spin_lock.c: Likewise. * sysdeps/microblaze/nptl/pthread_spin_lock.c: Likewise. * sysdeps/mips/nptl/pthread_spin_lock.c: Likewise. * sysdeps/nios2/nptl/pthread_spin_lock.c: Likewise. * sysdeps/aarch64/atomic-machine.h (ATOMIC_EXCHANGE_USES_CAS): Define. * sysdeps/alpha/atomic-machine.h: Likewise. * sysdeps/arm/atomic-machine.h: Likewise. * sysdeps/i386/atomic-machine.h: Likewise. * sysdeps/ia64/atomic-machine.h: Likewise. * sysdeps/m68k/coldfire/atomic-machine.h: Likewise. * sysdeps/m68k/m680x0/m68020/atomic-machine.h: Likewise. * sysdeps/microblaze/atomic-machine.h: Likewise. * sysdeps/mips/atomic-machine.h: Likewise. * sysdeps/powerpc/powerpc32/atomic-machine.h: Likewise. * sysdeps/powerpc/powerpc64/atomic-machine.h: Likewise. * sysdeps/s390/atomic-machine.h: Likewise. * sysdeps/sparc/sparc32/atomic-machine.h: Likewise. * sysdeps/sparc/sparc32/sparcv9/atomic-machine.h: Likewise. * sysdeps/sparc/sparc64/atomic-machine.h: Likewise. * sysdeps/tile/tilegx/atomic-machine.h: Likewise. * sysdeps/tile/tilepro/atomic-machine.h: Likewise. * sysdeps/unix/sysv/linux/hppa/atomic-machine.h: Likewise. * sysdeps/unix/sysv/linux/m68k/coldfire/atomic-machine.h: Likewise. * sysdeps/unix/sysv/linux/nios2/atomic-machine.h: Likewise. * sysdeps/unix/sysv/linux/sh/atomic-machine.h: Likewise. * sysdeps/x86_64/atomic-machine.h: Likewise. |
||
Zack Weinberg
|
7c3018f9e4 |
Suppress internal declarations for most of the testsuite.
This patch adds a new build module called 'testsuite'. IS_IN (testsuite) implies _ISOMAC, as do IS_IN_build and __cplusplus (which means several ad-hoc tests for __cplusplus can go away). libc-symbols.h now suppresses almost all of *itself* when _ISOMAC is defined; in particular, _ISOMAC mode does not get config.h automatically anymore. There are still quite a few tests that need to see internal gunk of one variety or another. For them, we now have 'tests-internal' and 'test-internal-extras'; files in this category will still be compiled with MODULE_NAME=nonlib, and everything proceeds as it always has. The bulk of this patch is moving tests from 'tests' to 'tests-internal'. There is also 'tests-static-internal', which has the same effect on files in 'tests-static', and 'modules-names-tests', which has the *inverse* effect on files in 'modules-names' (it's inverted because most of the things in modules-names are *not* tests). For both of these, the file must appear in *both* the new variable and the old one. There is also now a special case for when libc-symbols.h is included without MODULE_NAME being defined at all. (This happens during the creation of libc-modules.h, and also when preprocessing Versions files.) When this happens, IS_IN is set to be always false and _ISOMAC is *not* defined, which was the status quo, but now it's explicit. The remaining changes to C source files in this patch seemed likely to cause problems in the absence of the main change. They should be relatively self-explanatory. In a few cases I duplicated a definition from an internal header rather than move the test to tests-internal; this was a judgement call each time and I'm happy to change those however reviewers feel is more appropriate. * Makerules: New subdir configuration variables 'tests-internal' and 'test-internal-extras'. Test files in these categories will still be compiled with MODULE_NAME=nonlib. Test files in the existing categories (tests, xtests, test-srcs, test-extras) are now compiled with MODULE_NAME=testsuite. New subdir configuration variable 'modules-names-tests'. Files which are in both 'modules-names' and 'modules-names-tests' will be compiled with MODULE_NAME=testsuite instead of MODULE_NAME=extramodules. (gen-as-const-headers): Move to tests-internal. (do-tests-clean, common-mostlyclean): Support tests-internal. * Makeconfig (built-modules): Add testsuite. * Makefile: Change libof-check-installed-headers-c and libof-check-installed-headers-cxx to 'testsuite'. * Rules: Likewise. Support tests-internal. * benchtests/strcoll-inputs/filelist#en_US.UTF-8: Remove extra-modules.mk. * config.h.in: Don't check for __OPTIMIZE__ or __FAST_MATH__ here. * include/libc-symbols.h: Move definitions of _GNU_SOURCE, PASTE_NAME, PASTE_NAME1, IN_MODULE, IS_IN, and IS_IN_LIB to the very top of the file and rationalize their order. If MODULE_NAME is not defined at all, define IS_IN to always be false, and don't define _ISOMAC. If any of IS_IN (testsuite), IS_IN_build, or __cplusplus are true, define _ISOMAC and suppress everything else in this file, starting with the inclusion of config.h. Do check for inappropriate definitions of __OPTIMIZE__ and __FAST_MATH__ here, but only if _ISOMAC is not defined. Correct some out-of-date commentary. * include/math.h: If _ISOMAC is defined, undefine NO_LONG_DOUBLE and _Mlong_double_ before including math.h. * include/string.h: If _ISOMAC is defined, don't expose _STRING_ARCH_unaligned. Move a comment to a more appropriate location. * include/errno.h, include/stdio.h, include/stdlib.h, include/string.h * include/time.h, include/unistd.h, include/wchar.h: No need to check __cplusplus nor use __BEGIN_DECLS/__END_DECLS. * misc/sys/cdefs.h (__NTHNL): New macro. * sysdeps/m68k/m680x0/fpu/bits/mathinline.h (__m81_defun): Use __NTHNL to avoid errors with GCC 6. * elf/tst-env-setuid-tunables.c: Include config.h with _LIBC defined, for HAVE_TUNABLES. * inet/tst-checks-posix.c: No need to define _ISOMAC. * intl/tst-gettext2.c: Provide own definition of N_. * math/test-signgam-finite-c99.c: No need to define _ISOMAC. * math/test-signgam-main.c: No need to define _ISOMAC. * stdlib/tst-strtod.c: Convert to test-driver. Split locale_test to... * stdlib/tst-strtod1i.c: ...this new file. * stdlib/tst-strtod5.c: Convert to test-driver and add copyright notice. Split tests of __strtod_internal to... * stdlib/tst-strtod5i.c: ...this new file. * string/test-string.h: Include stdint.h. Duplicate definition of inhibit_loop_to_libcall here (from libc-symbols.h). * string/test-strstr.c: Provide dummy definition of libc_hidden_builtin_def when including strstr.c. * sysdeps/ia64/fpu/libm-symbols.h: Suppress entire file in _ISOMAC mode; no need to test __STRICT_ANSI__ nor __cplusplus as well. * sysdeps/x86_64/fpu/math-tests-arch.h: Include cpu-features.h. Don't include init-arch.h. * sysdeps/x86_64/multiarch/test-multiarch.h: Include cpu-features.h. Don't include init-arch.h. * elf/Makefile: Move tst-ptrguard1-static, tst-stackguard1-static, tst-tls1-static, tst-tls2-static, tst-tls3-static, loadtest, unload, unload2, circleload1, neededtest, neededtest2, neededtest3, neededtest4, tst-tls1, tst-tls2, tst-tls3, tst-tls6, tst-tls7, tst-tls8, tst-dlmopen2, tst-ptrguard1, tst-stackguard1, tst-_dl_addr_inside_object, and all of the ifunc tests to tests-internal. Don't add $(modules-names) to test-extras. * inet/Makefile: Move tst-inet6_scopeid_pton to tests-internal. Add tst-deadline to tests-static-internal. * malloc/Makefile: Move tst-mallocstate and tst-scratch_buffer to tests-internal. * misc/Makefile: Move tst-atomic and tst-atomic-long to tests-internal. * nptl/Makefile: Move tst-typesizes, tst-rwlock19, tst-sem11, tst-sem12, tst-sem13, tst-barrier5, tst-signal7, tst-tls3, tst-tls3-malloc, tst-tls5, tst-stackguard1, tst-sem11-static, tst-sem12-static, and tst-stackguard1-static to tests-internal. Link tests-internal with libpthread also. Don't add $(modules-names) to test-extras. * nss/Makefile: Move tst-field to tests-internal. * posix/Makefile: Move bug-regex5, bug-regex20, bug-regex33, tst-rfc3484, tst-rfc3484-2, and tst-rfc3484-3 to tests-internal. * stdlib/Makefile: Move tst-strtod1i, tst-strtod3, tst-strtod4, tst-strtod5i, tst-tls-atexit, and tst-tls-atexit-nodelete to tests-internal. * sunrpc/Makefile: Move tst-svc_register to tests-internal. * sysdeps/powerpc/Makefile: Move test-get_hwcap and test-get_hwcap-static to tests-internal. * sysdeps/unix/sysv/linux/Makefile: Move tst-setgetname to tests-internal. * sysdeps/x86_64/fpu/Makefile: Add all libmvec test modules to modules-names-tests. |
||
Adhemerval Zanella
|
eab380d8ec |
Move shared pthread definitions to common headers
This patch removes all the replicated pthread definition accross the architectures and consolidates it on shared headers. The new organization is as follow: * Architecture specific definition (such as pthread types sizes) are place in the new pthreadtypes-arch.h header in arch specific path. * All shared structure definition are moved to a common NPTL header at sysdeps/nptl/bits/pthreadtypes.h (with now includes the arch specific one for internal definitions). * Also, for C11 future thread support, both mutex and condition definition are placed in a common header at sysdeps/nptl/bits/thread-shared-types.h. It is also a refactor patch without expected functional changes. Checked with a build for all major ABI (aarch64-linux-gnu, alpha-linux-gnu, arm-linux-gnueabi, i386-linux-gnu, ia64-linux-gnu, m68k-linux-gnu, microblaze-linux-gnu, mips{64}-linux-gnu, nios2-linux-gnu, powerpc{64le}-linux-gnu, s390{x}-linux-gnu, sparc{64}-linux-gnu, tile{pro,gx}-linux-gnu, and x86_64-linux-gnu). * posix/Makefile (headers): Add pthreadtypes-arch.h and thread-shared-types.h. * sysdeps/aarch64/nptl/bits/pthreadtypes-arch.h: New file: arch specific thread definition. * sysdeps/alpha/nptl/bits/pthreadtypes-arch.h: Likewise. * sysdeps/arm/nptl/bits/pthreadtypes-arch.h: Likewise. * sysdeps/hppa/nptl/bits/pthreadtypes-arch.h: Likewise. * sysdeps/ia64/nptl/bits/pthreadtypes-arch.h: Likewise. * sysdeps/m68k/nptl/bits/pthreadtypes-arch.h: Likewise. * sysdeps/microblaze/nptl/bits/pthreadtypes-arch.h: Likewise. * sysdeps/mips/nptl/bits/pthreadtypes-arch.h: Likewise. * sysdeps/nios2/nptl/bits/pthreadtypes-arch.h: Likewise. * sysdeps/powerpc/nptl/bits/pthreadtypes-arch.h: Likewise. * sysdeps/s390/nptl/bits/pthreadtypes-arch.h: Likewise. * sysdeps/sh/nptl/bits/pthreadtypes-arch.h: Likewise. * sysdeps/sparc/nptl/bits/pthreadtypes-arch.h: Likewise. * sysdeps/tile/nptl/bits/pthreadtypes-arch.h: Likewise. * sysdeps/x86/nptl/bits/pthreadtypes-arch.h: Likewise. * sysdeps/nptl/bits/thread-shared-types.h: New file: shared thread definition between POSIX and C11. * sysdeps/aarch64/nptl/bits/pthreadtypes.h.: Remove file. * sysdeps/alpha/nptl/bits/pthreadtypes.h: Likewise. * sysdeps/arm/nptl/bits/pthreadtypes.h: Likewise. * sysdeps/hppa/nptl/bits/pthreadtypes.h: Likewise. * sysdeps/m68k/nptl/bits/pthreadtypes.h: Likewise. * sysdeps/microblaze/nptl/bits/pthreadtypes.h: Likewise. * sysdeps/mips/nptl/bits/pthreadtypes.h: Likewise. * sysdeps/nios2/nptl/bits/pthreadtypes.h: Likewise. * sysdeps/ia64/nptl/bits/pthreadtypes.h: Likewise. * sysdeps/powerpc/nptl/bits/pthreadtypes.h: Likewise. * sysdeps/s390/nptl/bits/pthreadtypes.h: Likewise. * sysdeps/sh/nptl/bits/pthreadtypes.h: Likewise. * sysdeps/sparc/nptl/bits/pthreadtypes.h: Likewise. * sysdeps/tile/nptl/bits/pthreadtypes.h: Likewise. * sysdeps/x86/nptl/bits/pthreadtypes.h: Likewise. * sysdeps/nptl/bits/pthreadtypes.h: New file: common thread definitions shared across all architectures. |
||
Zack Weinberg
|
963394a22b |
Allow direct use of math_ldbl.h in testsuite.
A few 'long double'-related tests include math_private.h just for their variety of math_ldbl.h, which contains macros for assembling and disassembling the binary representation of 'long double'. math_ldbl.h insists on being included from math_private.h, but if we relax this restriction (and fix some portability sloppiness) we can use it directly and not have to expose all of math_private.h to the testsuite. * sysdeps/generic/math_private.h: Use __BIG_ENDIAN and __LITTLE_ENDIAN, not BIG_ENDIAN and LITTLE_ENDIAN. * sysdeps/generic/math_ldbl.h * sysdeps/ia64/fpu/math_ldbl.h * sysdeps/ieee754/ldbl-128/math_ldbl.h * sysdeps/ieee754/ldbl-128ibm/math_ldbl.h * sysdeps/ieee754/ldbl-96/math_ldbl.h * sysdeps/powerpc/fpu/math_ldbl.h * sysdeps/x86_64/fpu/math_ldbl.h: Allow direct inclusion. Use uintNN_t instead of u_intNN_t. Use __BIG_ENDIAN and __LITTLE_ENDIAN, not BIG_ENDIAN and LITTLE_ENDIAN. Include endian.h and/or stdint.h if necessary. Add copyright notices. * sysdeps/ieee754/ldbl-128ibm/math_ldbl.h (ldbl_canonicalize_int): Don't use EXTRACT_WORDS64. * sysdeps/ieee754/ldbl-96/test-canonical-ldbl-96.c * sysdeps/ieee754/ldbl-96/test-totalorderl-ldbl-96.c * sysdeps/ieee754/ldbl-128ibm/test-canonical-ldbl-128ibm.c * sysdeps/ieee754/ldbl-128ibm/test-totalorderl-ldbl-128ibm.c: Include math_ldbl.h, not math_private.h. |
||
Gabriel F. T. Gomes
|
5ab621c347 |
Move w_exp to libm-compat-call-auto
This patch adds the "_compat" suffix to the wrappers of the function exp, which use _LIB_VERSION / matherr / __kernel_standard functionality. Tested for powerpc64le, s390, and x86_64. * math/Makefile (libm-calls): Move w_exp... (libm-compat-calls-auto): Here. * math/w_expl.c: Add suffix "_compat" to filename. * sysdeps/ia64/fpu/w_expl.c: Likewise. * sysdeps/ia64/fpu/w_expf.c: Likewise. * sysdeps/ia64/fpu/w_exp.c: Likewise. * sysdeps/ieee754/dbl-64/w_exp.c: Likewise. * sysdeps/ieee754/flt-32/w_expf.c: Likewise. * sysdeps/ieee754/ldbl-128/w_expl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/w_expl.c: Likewise. * sysdeps/ieee754/ldbl-96/w_expl.c: Likewise. * math/w_expl_compat.c: New file, copied from above. * sysdeps/ia64/fpu/w_exp_compat.c: Likewise. * sysdeps/ia64/fpu/w_expf_compat.c: Likewise. * sysdeps/ia64/fpu/w_expl_compat.c: Likewise. * sysdeps/ieee754/dbl-64/w_exp_compat.c: Likewise. * sysdeps/ieee754/flt-32/w_expf_compat.c: Likewise. * sysdeps/ieee754/ldbl-128/w_expl_compat.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/w_expl_compat.c: Likewise. * sysdeps/ieee754/ldbl-96/w_expl_compat.c: Likewise. * sysdeps/ieee754/ldbl-64-128/w_expl.c: Add suffix "_compat" to filename. * sysdeps/ieee754/ldbl-opt/w_exp.c: Likewise. * sysdeps/ieee754/ldbl-64-128/w_expl_compat.c: New file, copied from above and adjusted for the new filenames. * sysdeps/ieee754/ldbl-opt/w_exp_compat.c: Likewise. |
||
Gabriel F. T. Gomes
|
ea814db27a |
Move w_lgamma_r to libm-compat-calls-auto
This patch adds the suffix "_compat" to lgamma_r wrappers and make some adjustments to #includes and Makefiles. This is a step towards deprecation of wrappers that use _LIB_VERSION / matherr / __kernel_standard functionality. Tested for powerpc64le, s390, and x86_64. * math/Makefile (libm-calls): Move w_lgammaF_r... (libm-compat-calls-auto): Here. * math/w_lgamma_r.c: Add suffix "_compat" to filename. * math/w_lgammaf_r.c: Likewise. * math/w_lgammal_r.c: Likewise. * sysdeps/ia64/fpu/w_lgammal_r.c: Likewise. * sysdeps/ia64/fpu/w_lgammaf_r.c: Likewise. * sysdeps/ia64/fpu/w_lgamma_r.c: Likewise. * math/w_lgamma_r_compat.c: New file, copied from above. * math/w_lgammaf_r_compat.c: Likewise. * math/w_lgammal_r_compat.c: Likewise. * sysdeps/ia64/fpu/w_lgamma_r_compat.c: Likewise. * sysdeps/ia64/fpu/w_lgammaf_r_compat.c: Likewise. * sysdeps/ia64/fpu/w_lgammal_r_compat.c: Likewise. * sysdeps/ieee754/ldbl-opt/w_lgamma_r.c: Add suffix "_compat" to filename. * sysdeps/ieee754/ldbl-opt/w_lgammal_r.c: Likewise. * sysdeps/ieee754/ldbl-opt/w_lgamma_r_compat.c: New file copied from above and adjusted for the new filenames. * sysdeps/ieee754/ldbl-opt/w_lgammal_r_compat.c: Likewise. |
||
Joseph Myers
|
aee47c934e |
Remove very old libm-test-ulps entries.
I noticed that some libm-test-ulps files still had long-obsolete entries for *_tonearest functions, which will no longer be used since functions with FE_TONEAREST explicitly set aren't tested separately from those functions with it as the default rounding mode any more. This patch removes those obsolete entries. However, as they are a sign of libm-test-ulps not having been regenerated from scratch for a long time, I strongly advise people testing on those platforms to remove / truncate the libm-test-ulps file, run "make regen-ulps" and commit the regenerated-from-scratch file. (Ideally any failures of libm tests still present after regeneration would be investigated / fixed - there are several open "math" bugs spread across these platforms - but simply regenerating from scratch improves things.) * sysdeps/hppa/fpu/libm-test-ulps: Remove *_tonearest entries. * sysdeps/ia64/fpu/libm-test-ulps: Likewise. * sysdeps/m68k/m680x0/fpu/libm-test-ulps: Likewise. * sysdeps/microblaze/libm-test-ulps: Likewise. * sysdeps/sh/libm-test-ulps: Likewise. |
||
Torvald Riegel
|
cc25c8b4c1 |
New pthread rwlock that is more scalable.
This replaces the pthread rwlock with a new implementation that uses a more scalable algorithm (primarily through not using a critical section anymore to make state changes). The fast path for rdlock acquisition and release is now basically a single atomic read-modify write or CAS and a few branches. See nptl/pthread_rwlock_common.c for details. * nptl/DESIGN-rwlock.txt: Remove. * nptl/lowlevelrwlock.sym: Remove. * nptl/Makefile: Add new tests. * nptl/pthread_rwlock_common.c: New file. Contains the new rwlock. * nptl/pthreadP.h (PTHREAD_RWLOCK_PREFER_READER_P): Remove. (PTHREAD_RWLOCK_WRPHASE, PTHREAD_RWLOCK_WRLOCKED, PTHREAD_RWLOCK_RWAITING, PTHREAD_RWLOCK_READER_SHIFT, PTHREAD_RWLOCK_READER_OVERFLOW, PTHREAD_RWLOCK_WRHANDOVER, PTHREAD_RWLOCK_FUTEX_USED): New. * nptl/pthread_rwlock_init.c (__pthread_rwlock_init): Adapt to new implementation. * nptl/pthread_rwlock_rdlock.c (__pthread_rwlock_rdlock_slow): Remove. (__pthread_rwlock_rdlock): Adapt. * nptl/pthread_rwlock_timedrdlock.c (pthread_rwlock_timedrdlock): Adapt. * nptl/pthread_rwlock_timedwrlock.c (pthread_rwlock_timedwrlock): Adapt. * nptl/pthread_rwlock_trywrlock.c (pthread_rwlock_trywrlock): Adapt. * nptl/pthread_rwlock_tryrdlock.c (pthread_rwlock_tryrdlock): Adapt. * nptl/pthread_rwlock_unlock.c (pthread_rwlock_unlock): Adapt. * nptl/pthread_rwlock_wrlock.c (__pthread_rwlock_wrlock_slow): Remove. (__pthread_rwlock_wrlock): Adapt. * nptl/tst-rwlock10.c: Adapt. * nptl/tst-rwlock11.c: Adapt. * nptl/tst-rwlock17.c: New file. * nptl/tst-rwlock18.c: New file. * nptl/tst-rwlock19.c: New file. * nptl/tst-rwlock2b.c: New file. * nptl/tst-rwlock8.c: Adapt. * nptl/tst-rwlock9.c: Adapt. * sysdeps/aarch64/nptl/bits/pthreadtypes.h (pthread_rwlock_t): Adapt. * sysdeps/arm/nptl/bits/pthreadtypes.h (pthread_rwlock_t): Adapt. * sysdeps/hppa/nptl/bits/pthreadtypes.h (pthread_rwlock_t): Adapt. * sysdeps/ia64/nptl/bits/pthreadtypes.h (pthread_rwlock_t): Adapt. * sysdeps/m68k/nptl/bits/pthreadtypes.h (pthread_rwlock_t): Adapt. * sysdeps/microblaze/nptl/bits/pthreadtypes.h (pthread_rwlock_t): Adapt. * sysdeps/mips/nptl/bits/pthreadtypes.h (pthread_rwlock_t): Adapt. * sysdeps/nios2/nptl/bits/pthreadtypes.h (pthread_rwlock_t): Adapt. * sysdeps/s390/nptl/bits/pthreadtypes.h (pthread_rwlock_t): Adapt. * sysdeps/sh/nptl/bits/pthreadtypes.h (pthread_rwlock_t): Adapt. * sysdeps/sparc/nptl/bits/pthreadtypes.h (pthread_rwlock_t): Adapt. * sysdeps/tile/nptl/bits/pthreadtypes.h (pthread_rwlock_t): Adapt. * sysdeps/unix/sysv/linux/alpha/bits/pthreadtypes.h (pthread_rwlock_t): Adapt. * sysdeps/unix/sysv/linux/powerpc/bits/pthreadtypes.h (pthread_rwlock_t): Adapt. * sysdeps/x86/bits/pthreadtypes.h (pthread_rwlock_t): Adapt. * nptl/nptl-printers.py (): Adapt. * nptl/nptl_lock_constants.pysym: Adapt. * nptl/test-rwlock-printers.py: Adapt. * nptl/test-rwlockattr-printers.c: Adapt. * nptl/test-rwlockattr-printers.py: Adapt. |
||
Gabriel F. T. Gomes
|
f67d78192c |
Move wrappers to libm-compat-calls-auto
This commit moves one step towards the deprecation of wrappers that use _LIB_VERSION / matherr / __kernel_standard functionality, by adding the suffix '_compat' to their filenames and adjusting Makefiles and #includes accordingly. New template wrappers that do not use such functionality will be added by future patches and will be first used by the float128 wrappers. |
||
Adhemerval Zanella
|
640e44c5d0 |
Remove duplicate strcat implementations
Since commit
|