glibc

Commit Graph

Author	SHA1	Message	Date
Nobody	790298dd8d	glibc with MCST patches (25.014.1)	2022-08-11 21:25:08 +03:00
Florian Weimer	8e5d591b10	math/test-sinl-pseudo: Use stack protector only if available This fixes commit `9333498794` ("Avoid ldbl-96 stack corruption from range reduction of pseudo-zero (bug 25487)."). (cherry picked from commit `c10acd4026`)	2020-03-15 17:33:57 -04:00
Joseph Myers	0474cd5de6	Avoid ldbl-96 stack corruption from range reduction of pseudo-zero (bug 25487). Bug 25487 reports stack corruption in ldbl-96 sinl on a pseudo-zero argument (an representation where all the significand bits, including the explicit high bit, are zero, but the exponent is not zero, which is not a valid representation for the long double type). Although this is not a valid long double representation, existing practice in this area (see bug 4586, originally marked invalid but subsequently fixed) is that we still seek to avoid invalid memory accesses as a result, in case of programs that treat arbitrary binary data as long double representations, although the invalid representations of the ldbl-96 format do not need to be consistently handled the same as any particular valid representation. This patch makes the range reduction detect pseudo-zero and unnormal representations that would otherwise go to __kernel_rem_pio2, and returns a NaN for them instead of continuing with the range reduction process. (Pseudo-zero and unnormal representations whose unbiased exponent is less than -1 have already been safely returned from the function before this point without going through the rest of range reduction.) Pseudo-zero representations would previously result in the value passed to __kernel_rem_pio2 being all-zero, which is definitely unsafe; unnormal representations would previously result in a value passed whose high bit is zero, which might well be unsafe since that is not a form of input expected by __kernel_rem_pio2. Tested for x86_64. (cherry picked from commit `9333498794`)	2020-03-15 17:33:26 -04:00
Dmitry V. Levin	a1b02ae763	Fix a few typos in comments Apply the following spelling fixes: $ git grep -F -l 'relevent' \| xargs sed -i 's/relevent/relevant/g' $ git grep -F -l 'checked fot' \| xargs sed -i 's/checked fot/checked for/g' $ git grep -F -l "could't" \| xargs sed -i "s/could't/couldn't/g" $ git grep -F -l 'wheter' \| grep -Fv ChangeLog.old \| xargs sed -i 's/wheter/whether/g' $ git grep -F -l 'neccessary' \| grep -Fv ChangeLog.old \| xargs sed -i 's/neccessary/necessary/g' $ git grep -F -l 'ouput' \| xargs sed -i 's/ouput/output/g' $ git grep -F -w -l 'iput' \| xargs sed -i 's/iput/input/g' This is inspired by a gnulib bug report at https://lists.gnu.org/archive/html/bug-gnulib/2019-01/msg00081.html * argp/argp-help.c: Fix typo in comment. * misc/sys/cdefs.h: Likewise. * posix/regexec.c (sift_states_iter_mb): Likewise. * socket/sockatmark.c: Likewise. * socket/sys/socket.h: Likewise. * sysdeps/ia64/fpu/libm_sincos_large.S: Likewise. * sysdeps/ia64/fpu/libm_sincosl.S: Likewise. * sysdeps/ia64/fpu/s_cosl.S: Likewise. * sysdeps/ieee754/dbl-64/k_rem_pio2.c: Likewise. * sysdeps/unix/sockatmark.c: Likewise. * time/strptime_l.c: Likewise.	2019-01-12 13:44:51 +00:00
H.J. Lu	69da3c9e87	soft-fp: Properly check _FP_W_TYPE_SIZE [BZ #24066 ] quad.h have #if _FP_W_TYPE_SIZE < 64 union _FP_UNION_Q { Use 4 _FP_W_TYPEs } #else union _FP_UNION_Q { Use 2 _FP_W_TYPEs } #endif Replace #if (2 * _FP_W_TYPE_SIZE) < _FP_FRACBITS_Q with #if _FP_W_TYPE_SIZE < 64 to check whether 4 or 2 _FP_W_TYPEs are used for IEEE quad precision. Tested with build-many-glibcs.py. [BZ #24066] * soft-fp/extenddftf2.c: Use "_FP_W_TYPE_SIZE < 64" to check if 4_FP_W_TYPEs are used for IEEE quad precision. * soft-fp/extendhftf2.c: Likewise. * soft-fp/extendsftf2.c: Likewise. * soft-fp/extendxftf2.c: Likewise. * soft-fp/trunctfdf2.c: Likewise. * soft-fp/trunctfhf2.c: Likewise. * soft-fp/trunctfsf2.c: Likewise. * soft-fp/trunctfxf2.c: Likewise. * sysdeps/alpha/ots_cvttx.c: Likewise. * sysdeps/alpha/ots_cvtxt.c: Likewise. * sysdeps/ieee754/soft-fp/s_daddl.c: Likewise. * sysdeps/ieee754/soft-fp/s_ddivl.c: Likewise. * sysdeps/ieee754/soft-fp/s_dmull.c: Likewise. * sysdeps/ieee754/soft-fp/s_dsubl.c: Likewise. * sysdeps/ieee754/soft-fp/s_faddl.c: Likewise. * sysdeps/ieee754/soft-fp/s_fdivl.c: Likewise. * sysdeps/ieee754/soft-fp/s_fmull.c: Likewise. * sysdeps/ieee754/soft-fp/s_fsubl.c: Likewise. * sysdeps/sparc/sparc32/q_dtoq.c: Likewise. * sysdeps/sparc/sparc32/q_qtod.c: Likewise. * sysdeps/sparc/sparc32/q_qtos.c: Likewise. * sysdeps/sparc/sparc32/q_stoq.c: Likewise. * sysdeps/sparc/sparc64/qp_dtoq.c: Likewise. * sysdeps/sparc/sparc64/qp_qtod.c: Likewise. * sysdeps/sparc/sparc64/qp_qtos.c: Likewise. * sysdeps/sparc/sparc64/qp_stoq.c: Likewise.	2019-01-07 09:04:39 -08:00
Martin Jansa	27c5e756a2	sysdeps/ieee754: prevent maybe-uninitialized errors with -O [BZ #19444 ] With -O included in CFLAGS it fails to build with: ../sysdeps/ieee754/ldbl-96/e_jnl.c: In function '__ieee754_jnl': ../sysdeps/ieee754/ldbl-96/e_jnl.c:146:20: error: 'temp' may be used uninitialized in this function [-Werror=maybe-uninitialized] b = invsqrtpi * temp / sqrtl (x); ~~~~~~~~~~^~~~~~ ../sysdeps/ieee754/ldbl-96/e_jnl.c: In function '__ieee754_ynl': ../sysdeps/ieee754/ldbl-96/e_jnl.c:375:16: error: 'temp' may be used uninitialized in this function [-Werror=maybe-uninitialized] b = invsqrtpi * temp / sqrtl (x); ~~~~~~~~~~^~~~~~ ../sysdeps/ieee754/dbl-64/e_jn.c: In function '__ieee754_jn': ../sysdeps/ieee754/dbl-64/e_jn.c:113:20: error: 'temp' may be used uninitialized in this function [-Werror=maybe-uninitialized] b = invsqrtpi * temp / sqrt (x); ~~~~~~~~~~^~~~~~ ../sysdeps/ieee754/dbl-64/e_jn.c: In function '__ieee754_yn': ../sysdeps/ieee754/dbl-64/e_jn.c:320:16: error: 'temp' may be used uninitialized in this function [-Werror=maybe-uninitialized] b = invsqrtpi * temp / sqrt (x); ~~~~~~~~~~^~~~~~ Build tested with Yocto for ARM, AARCH64, X86, X86_64, PPC, MIPS, MIPS64 with -O, -O1, -Os. For AARCH64 it needs one more fix in locale for -Os: https://sourceware.org/ml/libc-alpha/2018-09/msg00539.html [BZ #19444] * sysdeps/ieee754/dbl-64/e_jn.c (__ieee754_jn): Use __builtin_unreachable for default case in switch. (__ieee754_yn): Likewise. * sysdeps/ieee754/ldbl-96/e_jnl.c (__ieee754_jnl): Likewise. (__ieee754_ynl): Likewise. * sysdeps/ieee754/ldbl-128/e_jnl.c (__ieee754_jnl): Likewise. (__ieee754_ynl): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_jnl.c (__ieee754_jnl): Likewise. (__ieee754_ynl): Likewise.	2019-01-04 16:17:48 +00:00
Zack Weinberg	03992356e6	Use C99-compliant scanf under _GNU_SOURCE with modern compilers. The only difference between noncompliant and C99-compliant scanf is that the former accepts the archaic GNU extension '%as' (also %aS and %a[...]) meaning to allocate space for the input string with malloc. This extension conflicts with C99's use of %a as a format _type_ meaning to read a floating-point number; POSIX.1-2008 standardized equivalent functionality using the modifier letter 'm' instead (%ms, %mS, %m[...]). The extension was already disabled in most conformance modes: specifically, any mode that doesn't involve _GNU_SOURCE and _does_ involve either strict conformance to C99 or loose conformance to both C99 and POSIX.1-2001 would get the C99-compliant scanf. With compilers new enough to use -std=gnu11 instead of -std=gnu89, or equivalent, that includes the default mode. With this patch, we now provide C99-compliant scanf in all configurations except when _GNU_SOURCE is defined and __STDC_VERSION__ or __cplusplus (whichever is relevant) indicates C89/C++98. This leaves the old scanf available under e.g. -std=c89 -D_GNU_SOURCE, but removes it from e.g. -std=gnu11 -D_GNU_SOURCE (it was already not present under -std=gnu11 without -D_GNU_SOURCE) and from -std=gnu89 without -D_GNU_SOURCE. There needs to be an internal override so we can compile the noncompliant scanf itself. This is the same problem we had when we removed 'gets' from _GNU_SOURCE and it's dealt with the same way: there's a new __GLIBC_USE symbol, DEPRECATED_SCANF, which defaults to off under the appropriate conditions for external code, but can be overridden by individual files within stdio. We also run into problems with PLT bypass for internal uses of sscanf, because libc_hidden_proto uses __REDIRECT and so does the logic in stdio.h for choosing which implementation of scanf to use; __REDIRECT isn't transitive, so include/stdio.h needs to bridge the gap with a macro. As far as I can tell, sscanf is the only function in this family that's internally called by unrelated code. Finally, there are several tests in stdio-common that use the extension. bug21.c is a regression test for a crash; it still exercises the relevant code when changed to use %ms instead of %as. scanf14.c through scanf17.c are more complicated since they are actually testing the subtleties of the extension - under what circumstances is 'a' treated as a modifier letter, etc. I changed all of them to use %ms instead of %as as well, but duplicated scanf14.c and scanf16.c as scanf14a.c and scanf16a.c. These still use %as and are compiled with -std=gnu89 to access the old extension. A bunch of diagnostic overrides and manual workarounds for the old stdio.h behavior become unnecessary. Yay! * include/features.h (__GLIBC_USE_DEPRECATED_SCANF): New __GLIBC_USE parameter. Only use deprecated scanf when __USE_GNU is defined and __STDC_VERSION__ is less than 199901L or __cplusplus is less than 201103L, whichever is relevant for the language being compiled. * libio/stdio.h, libio/bits/stdio-ldbl.h: Decide whether to redirect scanf, fscanf, sscanf, vscanf, vfscanf, and vsscanf to their __isoc99_ variants based only on __GLIBC_USE (DEPRECATED_SCANF). * wcsmbs/wchar.h: wcsmbs/bits/wchar-ldbl.h: Likewise for wscanf, fwscanf, swscanf, vwscanf, vfwscanf, and vswscanf. * libio/iovsscanf.c * libio/fwscanf.c * libio/iovswscanf.c * libio/swscanf.c * libio/vscanf.c * libio/vwscanf.c * libio/wscanf.c * stdio-common/fscanf.c * stdio-common/scanf.c * stdio-common/vfscanf.c * stdio-common/vfwscanf.c * sysdeps/ieee754/ldbl-opt/nldbl-compat.c * sysdeps/ieee754/ldbl-opt/nldbl-fscanf.c * sysdeps/ieee754/ldbl-opt/nldbl-fwscanf.c * sysdeps/ieee754/ldbl-opt/nldbl-iovfscanf.c * sysdeps/ieee754/ldbl-opt/nldbl-scanf.c * sysdeps/ieee754/ldbl-opt/nldbl-sscanf.c * sysdeps/ieee754/ldbl-opt/nldbl-swscanf.c * sysdeps/ieee754/ldbl-opt/nldbl-vfscanf.c * sysdeps/ieee754/ldbl-opt/nldbl-vfwscanf.c * sysdeps/ieee754/ldbl-opt/nldbl-vscanf.c * sysdeps/ieee754/ldbl-opt/nldbl-vsscanf.c * sysdeps/ieee754/ldbl-opt/nldbl-vswscanf.c * sysdeps/ieee754/ldbl-opt/nldbl-vwscanf.c * sysdeps/ieee754/ldbl-opt/nldbl-wscanf.c: Override __GLIBC_USE_DEPRECATED_SCANF to 1. * stdio-common/sscanf.c: Likewise. Remove ldbl_hidden_def for __sscanf. * stdio-common/isoc99_sscanf.c: Add libc_hidden_def for __isoc99_sscanf. * include/stdio.h: Provide libc_hidden_proto for __isoc99_sscanf, not sscanf. [!__GLIBC_USE (DEPRECATED_SCANF)]: Define sscanf as __isoc99_scanf with a preprocessor macro. * stdio-common/bug21.c, stdio-common/scanf14.c: Use %ms instead of %as, %mS instead of %aS, %m[] instead of %a[]; remove DIAG_IGNORE_NEEDS_COMMENT for -Wformat. * stdio-common/scanf16.c: Likewise. Add __attribute__ ((format (scanf))) to xscanf, xfscanf, xsscanf. * stdio-common/scanf14a.c: New copy of scanf14.c which still uses %as, %aS, %a[]. Remove DIAG_IGNORE_NEEDS_COMMENT for -Wformat. * stdio-common/scanf16a.c: New copy of scanf16.c which still uses %as, %aS, %a[]. Add __attribute__ ((format (scanf))) to xscanf, xfscanf, xsscanf. * stdio-common/scanf15.c, stdio-common/scanf17.c: No need to override feature selection macros or provide definitions of u_char etc. * stdio-common/Makefile (tests): Add scanf14a and scanf16a. (CFLAGS-scanf15.c, CFLAGS-scanf17.c): Remove. (CFLAGS-scanf14a.c, CFLAGS-scanf16a.c): New. Compile these files with -std=gnu89.	2019-01-03 11:12:39 -05:00
Joseph Myers	04277e02d7	Update copyright dates with scripts/update-copyrights. * All files with FSF copyright notices: Update copyright dates using scripts/update-copyrights. * locale/programs/charmap-kw.h: Regenerated. * locale/programs/locfile-kw.h: Likewise.	2019-01-01 00:11:28 +00:00
H.J. Lu	8700a7851b	x86-64: Vectorize sincosf_poly and update s_sincosf-fma.c Add <sincosf_poly.h> and include it in s_sincosf.h to allow vectorized sincosf_poly. Add x86 sincosf_poly.h to vectorize sincosf_poly. On Broadwell, bench-sincosf shows: Before After Improvement max 160.273 114.198 40% min 6.25 5.625 11% mean 13.0325 10.6462 22% Vectorized sincosf_poly shows Before After Improvement max 138.653 114.198 21% min 5.004 5.625 -11% mean 11.5934 10.6462 9% Tested on x86-64 and i686 as well as with build-many-glibcs.py. * sysdeps/ieee754/flt-32/s_sincosf.h: Include <sincosf_poly.h>. (sincos_t, sincosf_poly, sinf_poly): Moved to ... * sysdeps/ieee754/flt-32/sincosf_poly.h: Here. New file. * sysdeps/x86/fpu/s_sincosf_data.c: New file. * sysdeps/x86/fpu/sincosf_poly.h: Likewise. * sysdeps/x86_64/fpu/multiarch/s_sincosf-fma.c: Just include <sysdeps/ieee754/flt-32/s_sincosf.c>.	2018-12-26 06:56:13 -08:00
Szabolcs Nagy	505b5b2922	Fix powf overflow handling in non-nearest rounding mode [BZ #23961 ] The threshold value at which powf overflows depends on the rounding mode and the current check did not take this into account. So when the result was rounded away from zero it could become infinity without setting errno to ERANGE. Example: pow(0x1.7ac7cp+5, 23) is 0x1.fffffep+127 + 0.1633ulp If the result goes above 0x1.fffffep+127 + 0.5ulp then errno is set, which is fine in nearest rounding mode, but powf(0x1.7ac7cp+5, 23) is inf in upward rounding mode powf(-0x1.7ac7cp+5, 23) is -inf in downward rounding mode and the previous implementation did not set errno in these cases. The fix tries to avoid affecting the common code path or calling a function that may introduce a stack frame, so float arithmetics is used to check the rounding mode and the threshold is selected accordingly. [BZ #23961] * math/auto-libm-test-in: Add new test case. * math/auto-libm-test-out-pow: Regenerated. * sysdeps/ieee754/flt-32/e_powf.c (__powf): Fix overflow check.	2018-12-11 10:01:43 +00:00
Zack Weinberg	35caceb145	Use PRINTF_LDBL_IS_DBL instead of __ldbl_is_dbl. After all that prep work, nldbl-compat.c can now use PRINTF_LDBL_IS_DBL instead of __no_long_double to control the behavior of printf-like functions; this is the last thing we needed __no_long_double for, so it can go away entirely. Tested for powerpc and powerpc64le.	2018-12-05 18:15:43 -02:00
Zack Weinberg	4e2f43f842	Use PRINTF_FORTIFY instead of _IO_FLAGS2_FORTIFY (bug 11319) The _chk variants of all of the printf functions become much simpler. This is the last thing that we needed _IO_acquire_lock_clear_flags2 for, so it can go as well. I took the opportunity to make the headers included and the names of all local variables consistent across all the affected files. Since we ultimately want to get rid of __no_long_double as well, it must be possible to get all of the nontrivial effects of the _chk functions by calling the _internal functions with appropriate flags. For most of the __(v)xprintf_chk functions, this is covered by PRINTF_FORTIFY plus some up-front argument checks that can be duplicated. However, __(v)sprintf_chk installs a custom jump table so that it can crash instead of overflowing the output buffer. This functionality is moved to __vsprintf_internal, which now has a 'maxlen' argument like __vsnprintf_internal; to get the unsafe behavior of ordinary (v)sprintf, pass -1 for that argument. obstack_printf_chk and obstack_vprintf_chk are no longer in the same file. As a side-effect of the unification of both fortified and non-fortified vdprintf initialization, this patch fixes bug 11319 for __dprintf_chk and __vdprintf_chk, which was previously fixed only for dprintf and vdprintf by the commit commit `7ca890b88e` Author: Ulrich Drepper <drepper@redhat.com> Date: Wed Feb 24 16:07:57 2010 -0800 Fix reporting of I/O errors in *dprintf functions. This patch adds a test case to avoid regressions. Tested for powerpc and powerpc64le.	2018-12-05 18:15:43 -02:00
Zack Weinberg	124fc732c1	Add __vsyslog_internal, with same flags as __vprintf_internal. __nldbl___vsyslog_chk will ultimately want to pass PRINTF_LDBL_IS_DBL down to __vfprintf_internal as well as* possibly setting PRINTF_FORTIFY. To make that possible, we need a __vsyslog_internal that takes the same flags as printf. The code in misc/syslog.c does also get a little simpler. Tested for powerpc and powerpc64le.	2018-12-05 18:15:43 -02:00
Zack Weinberg	698fb75b9f	Add __vprintf_internal with flags arguments There are a lot more printf variants than there are scanf variants, and the code for setting up and tearing down their custom FILE variants around the call to __vf(w)printf is more complicated and variable. Therefore, I have added _internal versions of all the vprintf variants, rather than introducing helper routines so that they can all directly call __vf(w)printf_internal, as was done with scanf. As with the scanf changes, in this patch the _internal functions still look at the environmental mode bits and all callers pass 0 for the flags parameter. Several of the affected public functions had _IO_ name aliases that were not exported (but, in one case, appeared in libio.h anyway); I was originally planning to leave them as aliases to avoid having to touch internal callers, but it turns out ldbl__alias only work for exported symbols, so they've all been removed instead. It also turns out there were hardly any internal callers. _IO_vsprintf and _IO_vfprintf are* exported, so those two stick around. Summary for the changes to each of the affected symbols: _IO_vfprintf, _IO_vsprintf: All internal calls removed, thus the internal declarations, as well as uses of libc_hidden_proto and libc_hidden_def, were also removed. The external symbol is now exposed via uses of ldbl_strong_alias to __vfprintf_internal and __vsprintf_internal, respectively. _IO_vasprintf, _IO_vdprintf, _IO_vsnprintf, _IO_vfwprintf, _IO_vswprintf, _IO_obstack_vprintf, _IO_obstack_printf: All internal calls removed, thus declaration in internal headers were also removed. They were never exported, so there are no aliases tying them to the internal functions. I.e.: entirely gone. __vsnprintf: Internal calls were always preceded by macros such as #define __vsnprintf _IO_vsnprintf, and #define __vsnprintf vsnprintf The macros were removed and their uses replaced with calls to the new internal function __vsnprintf_internal. Since there were no internal calls, the internal declaration was also removed. The external symbol is preserved with ldbl_weak_alias to ___vsnprintf. __vfwprintf: All internal calls converted into calls to __vfwprintf_internal, thus the internal declaration was removed. The function is now a wrapper that calls __vfwprintf_internal. The external symbol is preserved. __vswprintf: Similarly, but no external symbol. __vasprintf, __vdprintf, __vfprintf, __vsprintf: New internal wrappers. Not exported. vasprintf, vdprintf, vfprintf, vsprintf, vsnprintf, vfwprintf, vswprintf, obstack_vprintf, obstack_printf: These functions used to be aliases to the respective _IO_* function, they are now aliases to their respective __* functions. Tested for powerpc and powerpc64le.	2018-12-05 18:15:42 -02:00
Zack Weinberg	d91798b31a	Use SCANF_LDBL_IS_DBL instead of __ldbl_is_dbl. Change the callers of __vfscanf_internal and __vfwscanf_internal that want to treat 'long double' as another name for 'double' (all of which happen to be in sysdeps/ieee754/ldbl-opt/nldbl-compat.c) to communicate this via the new flags argument, instead of the per-thread variable __no_long_double and its __ldbl_is_dbl wrapper macro. Tested for powerpc and powerpc64le.	2018-12-05 18:15:42 -02:00
Zack Weinberg	349718d4d7	Add __vfscanf_internal and __vfwscanf_internal with flags arguments. There are two flags currently defined: SCANF_LDBL_IS_DBL is the mode used by __nldbl_ scanf variants, and SCANF_ISOC99_A is the mode used by __isoc99_ scanf variants. In this patch, the new functions honor these flag bits if they're set, but they still also look at the corresponding bits of environmental state, and callers all pass zero. The new functions do not have the "errp" argument possessed by _IO_vfscanf and _IO_vfwscanf. All internal callers passed NULL for that argument. External callers could theoretically exist, so I preserved wrappers, but they are flagged as compat symbols and they don't preserve the three-way distinction among types of errors that was formerly exposed. These functions probably should have been in the list of deprecated _IO_ symbols in 2.27 NEWS -- they're not just aliases for vfscanf and vfwscanf. (It was necessary to introduce ldbl_compat_symbol for _IO_vfscanf. Please check that part of the patch very carefully, I am still not confident I understand all of the details of ldbl-opt.) This patch also introduces helper inlines in libio/strfile.h that encapsulate the process of initializing an _IO_strfile object for reading. This allows us to call __vfscanf_internal directly from sscanf, and __vfwscanf_internal directly from swscanf, without duplicating the initialization code. (Previously, they called their v-counterparts, but that won't work if we want to control both C99 mode and ldbl-is-dbl mode using the flags argument to__vfscanf_internal.) It's still a little awkward, especially for wide strfiles, but it's much better than what we had. Tested for powerpc and powerpc64le.	2018-12-05 18:15:42 -02:00
Szabolcs Nagy	a502c5294b	Remove the error handling wrapper from pow Introduce new pow symbol version that doesn't do SVID compatible error handling. The standard errno and fp exception based error handling is inline in the new code and does not have significant overhead. The wrapper is disabled for sysdeps/ieee754/dbl-64 by using empty w_pow.c and enabled for targets with their own pow implementation or ifunc dispatch on __ieee754_pow by including math/w_pow.c. The compatibility symbol version still uses the wrapper with SVID error handling around the new code. There is no new symbol version nor compatibility code on !LIBM_SVID_COMPAT targets (e.g. riscv). On targets where previously powl was an alias of pow, now it points to the compatibility symbol with the wrapper, because it still need the SVID compatible error handling. This affects NO_LONG_DOUBLE (e.g. arm) and LONG_DOUBLE_COMPAT (e.g. alpha) targets as well. The __pow_finite symbol is now an alias of pow. Both __pow_finite and pow set errno and thus not const functions. The ia64 asm is changed so the compat and new symbol versions map to the same address. On x86_64 #include <math.h> was added before macro definitions that may affect that header. Tested with build-many-glibcs.py. * math/Versions (GLIBC_2.29): Add pow. * math/w_pow_compat.c (__pow_compat): Change to versioned compat symbol. * math/w_pow.c: New file. * sysdeps/i386/fpu/w_pow.c: New file. * sysdeps/ia64/fpu/e_pow.S: Add versioned symbols. * sysdeps/ieee754/dbl-64/e_pow.c (__ieee754_pow): Rename to __pow and add necessary aliases. * sysdeps/ieee754/dbl-64/w_pow.c: New file. * sysdeps/m68k/m680x0/fpu/w_pow.c: New file. * sysdeps/mach/hurd/i386/libm.abilist: Update. * sysdeps/unix/sysv/linux/aarch64/libm.abilist: Update. * sysdeps/unix/sysv/linux/alpha/libm.abilist: Update. * sysdeps/unix/sysv/linux/arm/libm.abilist: Update. * sysdeps/unix/sysv/linux/hppa/libm.abilist: Update. * sysdeps/unix/sysv/linux/i386/libm.abilist: Update. * sysdeps/unix/sysv/linux/ia64/libm.abilist: Update. * sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Update. * sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Update. * sysdeps/unix/sysv/linux/microblaze/libm.abilist: Update. * sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Update. * sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Update. * sysdeps/unix/sysv/linux/nios2/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Update. * sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Update. * sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Update. * sysdeps/unix/sysv/linux/sh/libm.abilist: Update. * sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Update. * sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Update. * sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Update. * sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Update. * sysdeps/x86_64/fpu/multiarch/e_pow-fma.c (__ieee754_pow): Rename to __pow. * sysdeps/x86_64/fpu/multiarch/e_pow-fma4.c (__ieee754_pow): Likewise. * sysdeps/x86_64/fpu/multiarch/e_pow.c (__ieee754_pow): Likewise. * sysdeps/x86_64/fpu/multiarch/w_pow.c: New file.	2018-11-21 09:58:36 +00:00
Szabolcs Nagy	718d6542f2	Remove the error handling wrapper from log2 Introduce new log2 symbol version that doesn't do SVID compatible error handling. The standard errno and fp exception based error handling is inline in the new code and does not have significant overhead. The wrapper is disabled for sysdeps/ieee754/dbl-64 by using empty w_log2.c and enabled for targets with their own log2 implementation by including math/w_log2.c. The compatibility symbol version still uses the wrapper with SVID error handling around the new code. There is no new symbol version nor compatibility code on !LIBM_SVID_COMPAT targets (e.g. riscv). On targets where previously log2l was an alias of log2, now it points to the compatibility symbol with the wrapper, because it still need the SVID compatible error handling. This affects NO_LONG_DOUBLE (e.g. arm) and LONG_DOUBLE_COMPAT (e.g. alpha) targets as well. The __log2_finite symbol is now an alias of log2. Both __log2_finite and log2 set errno and thus not const functions. The ia64 asm is changed so the compat and new symbol versions map to the same address. Tested with build-many-glibcs.py. * math/Versions (GLIBC_2.29): Add log2. * math/w_log2_compat.c (__log2_compat): Change to versioned compat symbol. * math/w_log2.c: New file. * sysdeps/i386/fpu/w_log2.c: New file. * sysdeps/ia64/fpu/e_log2.S: Add versioned symbols. * sysdeps/ieee754/dbl-64/e_log2.c (__ieee754_log2): Rename to __log2 and add necessary aliases. * sysdeps/ieee754/dbl-64/w_log2.c: New file. * sysdeps/m68k/m680x0/fpu/w_log2.c: New file. * sysdeps/mach/hurd/i386/libm.abilist: Update. * sysdeps/unix/sysv/linux/aarch64/libm.abilist: Update. * sysdeps/unix/sysv/linux/alpha/libm.abilist: Update. * sysdeps/unix/sysv/linux/arm/libm.abilist: Update. * sysdeps/unix/sysv/linux/hppa/libm.abilist: Update. * sysdeps/unix/sysv/linux/i386/libm.abilist: Update. * sysdeps/unix/sysv/linux/ia64/libm.abilist: Update. * sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Update. * sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Update. * sysdeps/unix/sysv/linux/microblaze/libm.abilist: Update. * sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Update. * sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Update. * sysdeps/unix/sysv/linux/nios2/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Update. * sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Update. * sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Update. * sysdeps/unix/sysv/linux/sh/libm.abilist: Update. * sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Update. * sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Update. * sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Update. * sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Update.	2018-11-21 09:57:21 +00:00
Szabolcs Nagy	f29b7c492d	Remove the error handling wrapper from log Introduce new log symbol version that doesn't do SVID compatible error handling. The standard errno and fp exception based error handling is inline in the new code and does not have significant overhead. The wrapper is disabled for sysdeps/ieee754/dbl-64 by using empty w_log.c and enabled for targets with their own log implementation by including math/w_log.c. The compatibility symbol version still uses the wrapper with SVID error handling around the new code. There is no new symbol version nor compatibility code on !LIBM_SVID_COMPAT targets (e.g. riscv). On targets where previously logl was an alias of log, now it points to the compatibility symbol with the wrapper, because it still need the SVID compatible error handling. This affects NO_LONG_DOUBLE (e.g. arm) and LONG_DOUBLE_COMPAT (e.g. alpha) targets as well. The __log_finite symbol is now an alias of log. Both __log_finite and log set errno and thus not const functions. The ia64 asm is changed so the compat and new symbol versions map to the same address. On x86_64 #include <math.h> was added before macro definitions that may affect that header. Tested with build-many-glibcs.py. * math/Versions (GLIBC_2.29): Add log. * math/w_log_compat.c (__log_compat): Change to versioned compat symbol. * math/w_log.c: New file. * sysdeps/i386/fpu/w_log.c: New file. * sysdeps/ia64/fpu/e_log.S: Update. * sysdeps/ieee754/dbl-64/e_log.c (__ieee754_log): Rename to __log and add necessary aliases. * sysdeps/ieee754/dbl-64/w_log.c: New file. * sysdeps/m68k/m680x0/fpu/w_log.c: New file. * sysdeps/mach/hurd/i386/libm.abilist: Update. * sysdeps/unix/sysv/linux/aarch64/libm.abilist: Update. * sysdeps/unix/sysv/linux/alpha/libm.abilist: Update. * sysdeps/unix/sysv/linux/arm/libm.abilist: Update. * sysdeps/unix/sysv/linux/hppa/libm.abilist: Update. * sysdeps/unix/sysv/linux/i386/libm.abilist: Update. * sysdeps/unix/sysv/linux/ia64/libm.abilist: Update. * sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Update. * sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Update. * sysdeps/unix/sysv/linux/microblaze/libm.abilist: Update. * sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Update. * sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Update. * sysdeps/unix/sysv/linux/nios2/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Update. * sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Update. * sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Update. * sysdeps/unix/sysv/linux/sh/libm.abilist: Update. * sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Update. * sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Update. * sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Update. * sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Update. * sysdeps/x86_64/fpu/multiarch/e_log-avx.c (__ieee754_log): Rename to __log. * sysdeps/x86_64/fpu/multiarch/e_log-fma.c (__ieee754_log): Likewise. * sysdeps/x86_64/fpu/multiarch/e_log-fma4.c (__ieee754_log): Likewise. * sysdeps/x86_64/fpu/multiarch/e_log.c (__ieee754_log): Likewise. * sysdeps/x86_64/fpu/multiarch/w_log.c: New file.	2018-11-21 09:56:27 +00:00
Szabolcs Nagy	c20a10561a	Remove the error handling wrapper from exp and exp2 Introduce new exp and exp2 symbol version that don't do SVID compatible error handling. The standard errno and fp exception based error handling is inline in the new code and does not have significant overhead. The double precision wrappers are disabled for sysdeps/ieee754/dbl-64 by using empty w_exp.c and w_exp2.c files, the math/w_exp.c and math/w_exp2.c files use the wrapper template and can be included by targets that have their own exp and exp2 implementations or use ifunc on the glibc internal __ieee754_exp symbol. The compatibility symbol versions still use the wrapper with SVID error handling around the new code. There is no new symbol version nor compatibility code on !LIBM_SVID_COMPAT targets (e.g. riscv). On targets where previously expl and exp2l were aliases of exp and exp2, now they point to the compatibility symbols with the wrapper, because they still need the SVID compatible error handling. This affects NO_LONG_DOUBLE (e.g arm) and LONG_DOUBLE_COMPAT (e.g. alpha) targets as well. The _finite symbols are now aliases of the standard symbols (they have no performance advantage anymore). Both the standard symbols and _finite symbols set errno and thus not const functions. The ia64 asm is changed so the compat and new symbol versions map to the same address. On x86_64 #include <math.h> was added before macro definitions that may affect that header (the new macro name is __exp instead of __ieee754_exp which breaks some math.h macros). Tested with build-many-glibcs.py. * math/Versions (GLIBC_2.29): Add exp and exp2. * math/w_exp2_compat.c (__exp2_compat): Change to versioned compat symbol, handle NO_LONG_DOUBLE and LONG_DOUBLE_COMPAT explicitly. * math/w_exp_compat.c (__exp_compat): Likewise. * math/w_exp.c: New file. * math/w_exp2.c: New file. * sysdeps/i386/fpu/w_exp.c: New file. * sysdeps/i386/fpu/w_exp2.c: New file. * sysdeps/ia64/fpu/e_exp.S: Add versioned symbols. * sysdeps/ia64/fpu/e_exp2.S: Likewise. * sysdeps/ieee754/dbl-64/e_exp.c (__ieee754_exp): Rename to __exp and add necessary aliases. * sysdeps/ieee754/dbl-64/e_exp2.c (__ieee754_exp2): Rename to __exp2 and add necessary aliases. * sysdeps/ieee754/dbl-64/w_exp.c: New file. * sysdeps/ieee754/dbl-64/w_exp2.c: New file. * sysdeps/m68k/m680x0/fpu/w_exp.c: New file. * sysdeps/m68k/m680x0/fpu/w_exp2.c: New file. * sysdeps/mach/hurd/i386/libm.abilist: Update. * sysdeps/unix/sysv/linux/aarch64/libm.abilist: Update. * sysdeps/unix/sysv/linux/alpha/libm.abilist: Update. * sysdeps/unix/sysv/linux/arm/libm.abilist: Update. * sysdeps/unix/sysv/linux/hppa/libm.abilist: Update. * sysdeps/unix/sysv/linux/i386/libm.abilist: Update. * sysdeps/unix/sysv/linux/ia64/libm.abilist: Update. * sysdeps/unix/sysv/linux/m68k/coldfire/libm.abilist: Update. * sysdeps/unix/sysv/linux/m68k/m680x0/libm.abilist: Update. * sysdeps/unix/sysv/linux/microblaze/libm.abilist: Update. * sysdeps/unix/sysv/linux/mips/mips32/libm.abilist: Update. * sysdeps/unix/sysv/linux/mips/mips64/libm.abilist: Update. * sysdeps/unix/sysv/linux/nios2/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libm.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm-le.abilist: Update. * sysdeps/unix/sysv/linux/powerpc/powerpc64/libm.abilist: Update. * sysdeps/unix/sysv/linux/s390/s390-32/libm.abilist: Update. * sysdeps/unix/sysv/linux/s390/s390-64/libm.abilist: Update. * sysdeps/unix/sysv/linux/sh/libm.abilist: Update. * sysdeps/unix/sysv/linux/sparc/sparc32/libm.abilist: Update. * sysdeps/unix/sysv/linux/sparc/sparc64/libm.abilist: Update. * sysdeps/unix/sysv/linux/x86_64/64/libm.abilist: Update. * sysdeps/unix/sysv/linux/x86_64/x32/libm.abilist: Update. * sysdeps/x86_64/fpu/multiarch/e_exp-avx.c (__exp1): Remove. (__ieee754_exp): Rename to __exp. * sysdeps/x86_64/fpu/multiarch/e_exp-fma.c (__exp1): Remove. (__ieee754_exp): Rename to __exp. * sysdeps/x86_64/fpu/multiarch/e_exp-fma4.c (__exp1): Remove. (__ieee754_exp): Rename to __exp. * sysdeps/x86_64/fpu/multiarch/e_exp.c (__ieee754_exp): Rename to __exp. * sysdeps/x86_64/fpu/multiarch/w_exp.c: New file.	2018-11-21 09:55:02 +00:00
Zack Weinberg	c75772e3f0	Use STRFMON_LDBL_IS_DBL instead of __ldbl_is_dbl. On platforms where long double used to have the same format as double, but later switched to a different format (alpha, s390, sparc, and powerpc), accessing the older behavior is possible and it happens via __nldbl_* functions (not on the API, but accessible from header redirection and from compat symbols). These functions write to the global flag __ldbl_is_dbl, which tells other functions that long double variables should be handled as double. This patch takes the first step towards removing this global flag and creates __vstrfmon_l_internal, which takes an explicit flags parameter. This change arguably makes the generated code slightly worse on architectures where __ldbl_is_dbl is never true; right now, on those architectures, it's a compile-time constant; after this change, the compiler could theoretically prove that __vstrfmon_l_internal was never called with a nonzero flags argument, but it would probably need LTO to do it. This is not performance critical code and I tend to think that the maintainability benefits of removing action at a distance are worth it. However, we _could_ wrap the runtime flag check with a macro that was defined to ignore its argument and always return false on architectures where __ldbl_is_dbl is never true, if people think the codegen benefits are important. Tested for powerpc and powerpc64le.	2018-11-16 09:21:14 -02:00
Joseph Myers	a19876214a	Fix libnldbl_nonshared.a references to internal libm symbols (bug 23735). The redirection of built-in functions such as sqrt in include/math.h applies when the wrappers for those functions in libnldbl_nonshared.a are built, resulting in references to internal names such as __ieee754_sqrt that aren't actually exported from the shared libm. (This applies for sqrt in 2.28, also for the round-to-integer functions in current master because of my changes there.) This patch arranges for NO_MATH_REDIRECT to be used for all the affected functions, and adds a test for those functions in libnldbl_nonshared.a. (We could of course choose to obsolete libnldbl_nonshared.a and require that people building with -mlong-double-64 either include the relevant headers and have a compiler supporting asm redirection, or have some other means of achieving that redirection at compile time if not including those headers. But while we have libnldbl_nonshared.a, it seems appropriate to fix such bugs in it.) Tested for powerpc, and with build-many-glibcs.py. [BZ #23735] * sysdeps/ieee754/ldbl-opt/nldbl-compat.h (NO_MATH_REDIRECT): Define. * sysdeps/ieee754/ldbl-opt/test-nldbl-redirect.c: New file. * sysdeps/ieee754/ldbl-opt/Makefile [$(subdir) = math] (tests): Add test-nldbl-redirect. [$(subdir) = math] (CFLAGS-test-nldbl-redirect.c): New variable. [$(subdir) = math] ($(objpfx)test-nldbl-redirect): Depend on $(objpfx)libnldbl_nonshared.a.	2018-10-04 12:16:05 +00:00
Martin Jansa	4a06ceea33	sysdeps/ieee754/soft-fp: ignore maybe-uninitialized with -O [BZ #19444 ] * with -O, -O1, -Os it fails with: In file included from ../soft-fp/soft-fp.h:318, from ../sysdeps/ieee754/soft-fp/s_fdiv.c:28: ../sysdeps/ieee754/soft-fp/s_fdiv.c: In function '__fdiv': ../soft-fp/op-2.h:98:25: error: 'R_f1' may be used uninitialized in this function [-Werror=maybe-uninitialized] X##_f0 = (X##_f1 << (_FP_W_TYPE_SIZE - (N)) \| X##_f0 >> (N) \ ^~ ../sysdeps/ieee754/soft-fp/s_fdiv.c:38:14: note: 'R_f1' was declared here FP_DECL_D (R); ^ ../soft-fp/op-2.h:37:36: note: in definition of macro '_FP_FRAC_DECL_2' _FP_W_TYPE X##_f0 _FP_ZERO_INIT, X##_f1 _FP_ZERO_INIT ^ ../soft-fp/double.h:95:24: note: in expansion of macro '_FP_DECL' # define FP_DECL_D(X) _FP_DECL (2, X) ^~~~~~~~ ../sysdeps/ieee754/soft-fp/s_fdiv.c:38:3: note: in expansion of macro 'FP_DECL_D' FP_DECL_D (R); ^~~~~~~~~ ../soft-fp/op-2.h:101:17: error: 'R_f0' may be used uninitialized in this function [-Werror=maybe-uninitialized] : (X##_f0 << (_FP_W_TYPE_SIZE - (N))) != 0)); \ ^~ ../sysdeps/ieee754/soft-fp/s_fdiv.c:38:14: note: 'R_f0' was declared here FP_DECL_D (R); ^ ../soft-fp/op-2.h:37:14: note: in definition of macro '_FP_FRAC_DECL_2' _FP_W_TYPE X##_f0 _FP_ZERO_INIT, X##_f1 _FP_ZERO_INIT ^ ../soft-fp/double.h:95:24: note: in expansion of macro '_FP_DECL' # define FP_DECL_D(X) _FP_DECL (2, X) ^~~~~~~~ ../sysdeps/ieee754/soft-fp/s_fdiv.c:38:3: note: in expansion of macro 'FP_DECL_D' FP_DECL_D (R); ^~~~~~~~~ Build tested with Yocto for ARM, AARCH64, X86, X86_64, PPC, MIPS, MIPS64 with -O, -O1, -Os. For AARCH64 it needs one more fix in locale for -Os. [BZ #19444] * sysdeps/ieee754/soft-fp/s_fdiv.c: Include <libc-diag.h> and use DIAG_PUSH_NEEDS_COMMENT, DIAG_IGNORE_NEEDS_COMMENT and DIAG_POP_NEEDS_COMMENT to disable -Wmaybe-uninitialized.	2018-10-02 15:40:57 +00:00
Joseph Myers	c52944e8cc	Remove unnecessary math_private.h includes. After my changes to move various macros, inlines and other content from math_private.h to more specific headers, many files including math_private.h no longer need to do so. Furthermore, since the optimized inlines of various functions have been moved to include/fenv.h or replaced by use of function names GCC inlines automatically, a missing math_private.h include where one is appropriate will reliably cause a build failure rather than possibly causing code to be less well optimized while still building successfully. Thus, this patch removes includes of math_private.h that are now unnecessary. In the case of two RISC-V files, the include is replaced by one of stdbool.h because the files in question were relying on math_private.h to get a definition of bool. Tested for x86_64 and x86, and with build-many-glibcs.py. * math/fromfp.h: Do not include <math_private.h>. * math/s_cacosh_template.c: Likewise. * math/s_casin_template.c: Likewise. * math/s_casinh_template.c: Likewise. * math/s_ccos_template.c: Likewise. * math/s_cproj_template.c: Likewise. * math/s_fdim_template.c: Likewise. * math/s_fmaxmag_template.c: Likewise. * math/s_fminmag_template.c: Likewise. * math/s_iseqsig_template.c: Likewise. * math/s_ldexp_template.c: Likewise. * math/s_nextdown_template.c: Likewise. * math/w_log1p_template.c: Likewise. * math/w_scalbln_template.c: Likewise. * sysdeps/aarch64/fpu/feholdexcpt.c: Likewise. * sysdeps/aarch64/fpu/fesetround.c: Likewise. * sysdeps/aarch64/fpu/fgetexcptflg.c: Likewise. * sysdeps/aarch64/fpu/ftestexcept.c: Likewise. * sysdeps/aarch64/fpu/s_llrint.c: Likewise. * sysdeps/aarch64/fpu/s_llrintf.c: Likewise. * sysdeps/aarch64/fpu/s_lrint.c: Likewise. * sysdeps/aarch64/fpu/s_lrintf.c: Likewise. * sysdeps/i386/fpu/s_atanl.c: Likewise. * sysdeps/i386/fpu/s_f32xaddf64.c: Likewise. * sysdeps/i386/fpu/s_f32xsubf64.c: Likewise. * sysdeps/i386/fpu/s_fdim.c: Likewise. * sysdeps/i386/fpu/s_logbl.c: Likewise. * sysdeps/i386/fpu/s_rintl.c: Likewise. * sysdeps/i386/fpu/s_significandl.c: Likewise. * sysdeps/ia64/fpu/s_matherrf.c: Likewise. * sysdeps/ia64/fpu/s_matherrl.c: Likewise. * sysdeps/ieee754/dbl-64/s_atan.c: Likewise. * sysdeps/ieee754/dbl-64/s_cbrt.c: Likewise. * sysdeps/ieee754/dbl-64/s_fma.c: Likewise. * sysdeps/ieee754/dbl-64/s_fmaf.c: Likewise. * sysdeps/ieee754/flt-32/s_cbrtf.c: Likewise. * sysdeps/ieee754/k_standardf.c: Likewise. * sysdeps/ieee754/k_standardl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_copysignl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_finitel.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_fpclassifyl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_isinfl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_isnanl.c: Likewise. * sysdeps/ieee754/ldbl-64-128/s_signbitl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_cbrtl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_fma.c: Likewise. * sysdeps/ieee754/ldbl-96/s_fmal.c: Likewise. * sysdeps/ieee754/s_signgam.c: Likewise. * sysdeps/powerpc/power5+/fpu/s_modf.c: Likewise. * sysdeps/powerpc/power5+/fpu/s_modff.c: Likewise. * sysdeps/powerpc/power7/fpu/s_logbf.c: Likewise. * sysdeps/riscv/rv64/rvd/s_ceil.c: Likewise. * sysdeps/riscv/rv64/rvd/s_floor.c: Likewise. * sysdeps/riscv/rv64/rvd/s_nearbyint.c: Likewise. * sysdeps/riscv/rv64/rvd/s_round.c: Likewise. * sysdeps/riscv/rv64/rvd/s_roundeven.c: Likewise. * sysdeps/riscv/rv64/rvd/s_trunc.c: Likewise. * sysdeps/riscv/rvd/s_finite.c: Likewise. * sysdeps/riscv/rvd/s_fmax.c: Likewise. * sysdeps/riscv/rvd/s_fmin.c: Likewise. * sysdeps/riscv/rvd/s_fpclassify.c: Likewise. * sysdeps/riscv/rvd/s_isinf.c: Likewise. * sysdeps/riscv/rvd/s_isnan.c: Likewise. * sysdeps/riscv/rvd/s_issignaling.c: Likewise. * sysdeps/riscv/rvf/fegetround.c: Likewise. * sysdeps/riscv/rvf/feholdexcpt.c: Likewise. * sysdeps/riscv/rvf/fesetenv.c: Likewise. * sysdeps/riscv/rvf/fesetround.c: Likewise. * sysdeps/riscv/rvf/feupdateenv.c: Likewise. * sysdeps/riscv/rvf/fgetexcptflg.c: Likewise. * sysdeps/riscv/rvf/ftestexcept.c: Likewise. * sysdeps/riscv/rvf/s_ceilf.c: Likewise. * sysdeps/riscv/rvf/s_finitef.c: Likewise. * sysdeps/riscv/rvf/s_floorf.c: Likewise. * sysdeps/riscv/rvf/s_fmaxf.c: Likewise. * sysdeps/riscv/rvf/s_fminf.c: Likewise. * sysdeps/riscv/rvf/s_fpclassifyf.c: Likewise. * sysdeps/riscv/rvf/s_isinff.c: Likewise. * sysdeps/riscv/rvf/s_isnanf.c: Likewise. * sysdeps/riscv/rvf/s_issignalingf.c: Likewise. * sysdeps/riscv/rvf/s_nearbyintf.c: Likewise. * sysdeps/riscv/rvf/s_roundevenf.c: Likewise. * sysdeps/riscv/rvf/s_roundf.c: Likewise. * sysdeps/riscv/rvf/s_truncf.c: Likewise. * sysdeps/riscv/rv64/rvd/s_rint.c: Include <stdbool.h> instead of <math_private.h>. * sysdeps/riscv/rvf/s_rintf.c: Likewise.	2018-09-28 21:53:33 +00:00
Joseph Myers	81dca813cc	Use copysign functions not __copysign functions in glibc libm. Continuing the move to use, within libm, public names for libm functions that can be inlined as built-in functions on many architectures, this patch moves calls to __copysign functions to call the corresponding copysign names instead, with asm redirection to __copysign when the calls are not inlined (all cases are inlined except for IBM long double for powerpc soft-float / e500v1). This eliminates the need for an inline function defining __copysign in terms of __builtin_copysign. Tested for x86_64, and with build-many-glibcs.py. * include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (MATH_REDIRECT_BINARY_ARGS): New macro. [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (copysign): Redirect using MATH_REDIRECT. * sysdeps/alpha/fpu/s_copysign.c: Define NO_MATH_REDIRECT before header inclusion. * sysdeps/alpha/fpu/s_copysignf.c: Likewise. * sysdeps/ieee754/dbl-64/s_copysign.c: Likewise. * sysdeps/ieee754/float128/s_copysignf128.c: Likewise. * sysdeps/ieee754/flt-32/s_copysignf.c: Likewise. * sysdeps/ieee754/ldbl-128/s_copysignl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_copysignl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_copysignl.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_copysign.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_copysignf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysign.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysignf.c: Likewise. * sysdeps/riscv/rvd/s_copysign.c: Likewise. * sysdeps/riscv/rvf/s_copysignf.c: Likewise. * sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_copysign.c: Likewise. * sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_copysignf.c: Likewise. * sysdeps/generic/math_private_calls.h [!__MATH_DECLARING_LONG_DOUBLE \|\| !NO_LONG_DOUBLE] (__copysign): Do not declare and define as an inline function. * math/divtc3.c (__divtc3): Use copysign functions instead of __copysign variants. * math/multc3.c (__multc3): Likewise. * sysdeps/generic/math-type-macros.h (M_COPYSIGN): Likewise. * sysdeps/ieee754/dbl-64/e_atan2.c (signArctan2): Likewise. * sysdeps/ieee754/dbl-64/e_atanh.c (__ieee754_atanh): Likewise. * sysdeps/ieee754/dbl-64/e_gamma_r.c (__ieee754_gamma_r): Likewise. * sysdeps/ieee754/dbl-64/e_jn.c (__ieee754_jn): Likewise. (__ieee754_yn): Likewise. * sysdeps/ieee754/dbl-64/s_asinh.c (__asinh): Likewise. * sysdeps/ieee754/dbl-64/s_atan.c (__signArctan): Likewise. * sysdeps/ieee754/dbl-64/s_scalbln.c (__scalbln): Likewise. * sysdeps/ieee754/dbl-64/s_scalbn.c (__scalbn): Likewise. * sysdeps/ieee754/dbl-64/s_sin.c (do_sin): Likewise. (__sin): Likewise. * sysdeps/ieee754/dbl-64/s_sincos.c (__sincos): Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_nearbyint.c (__nearbyint): Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_scalbln.c (__scalbln): Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_scalbn.c (__scalbn): Likewise. * sysdeps/ieee754/flt-32/e_atanhf.c (__ieee754_atanhf): Likewise. * sysdeps/ieee754/flt-32/e_gammaf_r.c (__ieee754_gammaf_r): Likewise. * sysdeps/ieee754/flt-32/e_jnf.c (__ieee754_jnf): Likewise. (__ieee754_ynf): Likewise. * sysdeps/ieee754/flt-32/s_asinhf.c (__asinhf): Likewise. * sysdeps/ieee754/flt-32/s_scalbnf.c (__scalbnf): Likewise. * sysdeps/ieee754/k_standard.c (__kernel_standard): Likewise. * sysdeps/ieee754/ldbl-128/e_gammal_r.c (__ieee754_gammal_r): Likewise. * sysdeps/ieee754/ldbl-128/e_jnl.c (__ieee754_jnl): Likewise. (__ieee754_ynl): Likewise. * sysdeps/ieee754/ldbl-128/s_scalblnl.c (__scalblnl): Likewise. * sysdeps/ieee754/ldbl-128/s_scalbnl.c (__scalbnl): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (__ieee754_gammal_r): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_jnl.c (__ieee754_jnl): Likewise. (__ieee754_ynl): Likewise. * sysdeps/ieee754/ldbl-128ibm/s_fmal.c (__fmal): Likewise. * sysdeps/ieee754/ldbl-128ibm/s_scalblnl.c (__scalblnl): Likewise. * sysdeps/ieee754/ldbl-128ibm/s_scalbnl.c (__scalbnl): Likewise. * sysdeps/ieee754/ldbl-96/e_gammal_r.c (__ieee754_gammal_r): Likewise. * sysdeps/ieee754/ldbl-96/e_jnl.c (__ieee754_jnl): Likewise. (__ieee754_ynl) * sysdeps/ieee754/ldbl-96/s_asinhl.c (__asinhl): Likewise. * sysdeps/ieee754/ldbl-96/s_scalblnl.c (__scalblnl): Likewise. * sysdeps/ieee754/ldbl-opt/nldbl-copysign.c (copysignl): Likewise. * sysdeps/powerpc/power5+/fpu/s_modf.c (__modf): Likewise. * sysdeps/powerpc/power5+/fpu/s_modff.c (__modff): Likewise.	2018-09-27 20:04:48 +00:00
Joseph Myers	9755bc4686	Use round functions not __round functions in glibc libm. Continuing the move to use, within libm, public names for libm functions that can be inlined as built-in functions on many architectures, this patch moves calls to __round functions to call the corresponding round names instead, with asm redirection to __round when the calls are not inlined. An additional complication arises in sysdeps/ieee754/ldbl-128ibm/e_expl.c, where a call to roundl, with the result converted to int, gets converted by the compiler to call lroundl in the case of 32-bit long, so resulting in localplt test failures. It's logically correct to let the compiler make such an optimization; an appropriate asm redirection of lroundl to __lroundl is thus added to that file (it's not needed anywhere else). Tested for x86_64, and with build-many-glibcs.py. * include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (round): Redirect using MATH_REDIRECT. * sysdeps/aarch64/fpu/s_round.c: Define NO_MATH_REDIRECT before header inclusion. * sysdeps/aarch64/fpu/s_roundf.c: Likewise. * sysdeps/ieee754/dbl-64/s_round.c: Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_round.c: Likewise. * sysdeps/ieee754/float128/s_roundf128.c: Likewise. * sysdeps/ieee754/flt-32/s_roundf.c: Likewise. * sysdeps/ieee754/ldbl-128/s_roundl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_roundl.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_round.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_roundf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_round.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_roundf.c: Likewise. * sysdeps/riscv/rv64/rvd/s_round.c: Likewise. * sysdeps/riscv/rvf/s_roundf.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_roundl.c: Likewise. (round): Redirect to __round. (__roundl): Call round instead of __round. * sysdeps/powerpc/fpu/math_private.h [_ARCH_PWR5X] (__round): Remove macro. [_ARCH_PWR5X] (__roundf): Likewise. * sysdeps/ieee754/dbl-64/e_gamma_r.c (gamma_positive): Use round functions instead of __round variants. * sysdeps/ieee754/flt-32/e_gammaf_r.c (gammaf_positive): Likewise. * sysdeps/ieee754/ldbl-128/e_gammal_r.c (gammal_positive): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (gammal_positive): Likewise. * sysdeps/ieee754/ldbl-96/e_gammal_r.c (gammal_positive): Likewise. * sysdeps/x86/fpu/powl_helper.c (__powl_helper): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_expl.c (lroundl): Redirect to __lroundl. (__ieee754_expl): Call roundl instead of __roundl.	2018-09-27 12:35:23 +00:00
Joseph Myers	7abf97bed9	Use trunc functions not __trunc functions in glibc libm. Continuing the move to use, within libm, public names for libm functions that can be inlined as built-in functions on many architectures, this patch moves calls to __trunc functions to call the corresponding trunc names instead, with asm redirection to __trunc when the calls are not inlined. Tested for x86_64, and with build-many-glibcs.py. * include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (trunc): Redirect using MATH_REDIRECT. * sysdeps/aarch64/fpu/s_trunc.c: Define NO_MATH_REDIRECT before header inclusion. * sysdeps/aarch64/fpu/s_truncf.c: Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_trunc.c: Likewise. * sysdeps/ieee754/float128/s_truncf128.c: Likewise. * sysdeps/ieee754/dbl-64/s_trunc.c: Likewise. * sysdeps/ieee754/flt-32/s_truncf.c: Likewise. * sysdeps/ieee754/ldbl-128/s_truncl.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_trunc.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_truncf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_trunc.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_truncf.c: Likewise. * sysdeps/riscv/rv64/rvd/s_trunc.c: Likewise. * sysdeps/riscv/rvf/s_truncf.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_trunc.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_truncf.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_trunc.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_truncf.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_trunc_template.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_truncl.c: Likewise. (ceil): Redirect to __ceil. (floor): Redirect to __floor. (trunc): Redirect to __trunc. (__truncl): Call trunc instead of __trunc. * sysdeps/powerpc/fpu/math_private.h [_ARCH_PWR5X] (__trunc): Remove macro. [_ARCH_PWR5X] (__truncf): Likewise. * sysdeps/ieee754/dbl-64/e_gamma_r.c (__ieee754_gamma_r): Use trunc functions instead of __trunc variants. * sysdeps/ieee754/flt-32/e_gammaf_r.c (__ieee754_gammaf_r): Likewise. * sysdeps/ieee754/ldbl-128/e_gammal_r.c (__ieee754_gammal_r): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (__ieee754_gammal_r): Likewise. * sysdeps/ieee754/ldbl-96/e_gammal_r.c (__ieee754_gammal_r): Likewise.	2018-09-20 21:11:10 +00:00
Szabolcs Nagy	d734727837	Fix the documentation comment of checkint in powf checkint in powf is not supposed to be used with 0, inf or nan inputs. * sysdeps/ieee754/flt-32/e_powf.c (checkint): Fix documentation.	2018-09-19 10:13:20 +01:00
Szabolcs Nagy	424c4f60ed	Add new pow implementation The algorithm is exp(y * log(x)), where log(x) is computed with about 1.32^-68 relative error (1.52^-68 without fma), returning the result in two doubles, and the exp part uses the same algorithm (and lookup tables) as exp, but takes the input as two doubles and a sign (to handle negative bases with odd integer exponent). The __exp1 internal symbol is no longer necessary. There is separate code path when fma is not available but the worst case error is about 0.54 ULP in both cases. The lookup table and consts for log are 4168 bytes. The .rodata+.text is decreased by 37908 bytes on aarch64. The non-nearest rounding error is less than 1 ULP. Improvements on Cortex-A72 compared to current glibc master: pow thruput: 2.40x in [0.01 11.1]x[0.01 11.1] pow latency: 1.84x in [0.01 11.1]x[0.01 11.1] Tested on aarch64-linux-gnu (defined __FP_FAST_FMA, TOINT_INTRINSICS) and arm-linux-gnueabihf (!defined __FP_FAST_FMA, !TOINT_INTRINSICS) and x86_64-linux-gnu (!defined __FP_FAST_FMA, !TOINT_INTRINSICS) and powerpc64le-linux-gnu (defined __FP_FAST_FMA, !TOINT_INTRINSICS) targets. * NEWS: Mention pow improvements. * math/Makefile (type-double-routines): Add e_pow_log_data. * sysdeps/generic/math_private.h (__exp1): Remove. * sysdeps/i386/fpu/e_pow_log_data.c: New file. * sysdeps/ia64/fpu/e_pow_log_data.c: New file. * sysdeps/ieee754/dbl-64/Makefile (CFLAGS-e_pow.c): Allow fma contraction. * sysdeps/ieee754/dbl-64/e_exp.c (__exp1): Remove. (exp_inline): Remove. (__ieee754_exp): Only single double input is handled. * sysdeps/ieee754/dbl-64/e_pow.c: Rewrite. * sysdeps/ieee754/dbl-64/e_pow_log_data.c: New file. * sysdeps/ieee754/dbl-64/math_config.h (issignaling_inline): Define. (__pow_log_data): Define. * sysdeps/ieee754/dbl-64/upow.h: Remove. * sysdeps/ieee754/dbl-64/upow.tbl: Remove. * sysdeps/m68k/m680x0/fpu/e_pow_log_data.c: New file. * sysdeps/x86_64/fpu/multiarch/Makefile (CFLAGS-e_pow-fma.c): Allow fma contraction. (CFLAGS-e_pow-fma4.c): Likewise.	2018-09-19 10:04:51 +01:00
Joseph Myers	50bc59ca4d	Fix ldbl-128ibm ceill, floorl inlining of ceil, floor. The ldbl-128ibm implementations of ceill and floorl call the corresponding double functions. This patch fixes those implementations to call those functions as ceil and floor rather than as __ceil and __floor, so that the proper inlining takes place when possible, while including local asm redirections for when the functions are not inlined since NO_MATH_REDIRECT applies to the double functions as well as to the long double ones. Tested with build-many-glibcs.py for all its powerpc configurations. * sysdeps/ieee754/ldbl-128ibm/s_ceill.c (ceil): Redirect to __ceil. (__ceill): Call ceil instead of __ceil. * sysdeps/ieee754/ldbl-128ibm/s_floorl.c (floor): Redirect to __floor. (__floorl): Call floor instead of __floor.	2018-09-18 13:24:14 +00:00
Joseph Myers	71223ef909	Use ceil functions not __ceil functions in glibc libm. Continuing the move to use, within libm, public names for libm functions that can be inlined as built-in functions on many architectures, this patch moves calls to __ceil functions to call the corresponding ceil names instead, with asm redirection to __ceil when the calls are not inlined. Tested for x86_64, and with build-many-glibcs.py. * include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (ceil): Redirect using MATH_REDIRECT. * sysdeps/aarch64/fpu/s_ceil.c: Define NO_MATH_REDIRECT before header inclusion. * sysdeps/aarch64/fpu/s_ceilf.c: Likewise. * sysdeps/ieee754/dbl-64/s_ceil.c: Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_ceil.c: Likewise. * sysdeps/ieee754/float128/s_ceilf128.c: Likewise. * sysdeps/ieee754/flt-32/s_ceilf.c: Likewise. * sysdeps/ieee754/ldbl-128/s_ceill.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_ceill.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_ceil_template.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceil.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceilf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceil.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceilf.c: Likewise. * sysdeps/riscv/rv64/rvd/s_ceil.c: Likewise. * sysdeps/riscv/rvf/s_ceilf.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_ceil.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_ceilf.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_ceil.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_ceilf.c: Likewise. * sysdeps/powerpc/fpu/math_private.h [_ARCH_PWR5X] (__ceil): Remove macro. * sysdeps/ieee754/dbl-64/e_gamma_r.c (gamma_positive): Use ceil functions instead of __ceil variants. * sysdeps/ieee754/flt-32/e_gammaf_r.c (gammaf_positive): Likewise. * sysdeps/ieee754/ldbl-128/e_gammal_r.c (gammal_positive): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (gammal_positive): Likewise. * sysdeps/ieee754/ldbl-128ibm/s_truncl.c (__truncl): Likewise. * sysdeps/ieee754/ldbl-96/e_gammal_r.c (gammal_positive): Likewise. * sysdeps/powerpc/power5+/fpu/s_modf.c (__modf): Likewise. * sysdeps/powerpc/power5+/fpu/s_modff.c (__modff): Likewise.	2018-09-17 20:42:06 +00:00
Joseph Myers	f29b6f17e4	Use rint functions not __rint functions in glibc libm. Continuing the move to use, within libm, public names for libm functions that can be inlined as built-in functions on many architectures, this patch moves calls to __rint functions to call the corresponding rint names instead, with asm redirection to __rint when the calls are not inlined. The x86_64 math_private.h is removed as no longer useful after this patch. This patch is relative to a tree with my floor patch <https://sourceware.org/ml/libc-alpha/2018-09/msg00148.html> applied, and much the same considerations arise regarding possibly replacing an IFUNC call with a direct inline expansion. Tested for x86_64, and with build-many-glibcs.py. * include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (rint): Redirect using MATH_REDIRECT. * sysdeps/aarch64/fpu/s_rint.c: Define NO_MATH_REDIRECT before header inclusion. * sysdeps/aarch64/fpu/s_rintf.c: Likewise. * sysdeps/alpha/fpu/s_rint.c: Likewise. * sysdeps/alpha/fpu/s_rintf.c: Likewise. * sysdeps/i386/fpu/s_rintl.c: Likewise. * sysdeps/ieee754/dbl-64/s_rint.c: Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_rint.c: Likewise. * sysdeps/ieee754/float128/s_rintf128.c: Likewise. * sysdeps/ieee754/flt-32/s_rintf.c: Likewise. * sysdeps/ieee754/ldbl-128/s_rintl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_rintl.c: Likewise. * sysdeps/m68k/coldfire/fpu/s_rint.c: Likewise. * sysdeps/m68k/coldfire/fpu/s_rintf.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_rint.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_rintf.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_rintl.c: Likewise. * sysdeps/powerpc/fpu/s_rint.c: Likewise. * sysdeps/powerpc/fpu/s_rintf.c: Likewise. * sysdeps/riscv/rv64/rvd/s_rint.c: Likewise. * sysdeps/riscv/rvf/s_rintf.c: Likewise. * sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_rint.c: Likewise. * sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_rintf.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_rint.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_rintf.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_rint.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_rintf.c: Likewise. * sysdeps/x86_64/fpu/math_private.h: Remove file. * math/e_scalb.c (invalid_fn): Use rint functions instead of __rint variants. * math/e_scalbf.c (invalid_fn): Likewise. * math/e_scalbl.c (invalid_fn): Likewise. * sysdeps/ieee754/dbl-64/e_gamma_r.c (__ieee754_gamma_r): Likewise. * sysdeps/ieee754/flt-32/e_gammaf_r.c (__ieee754_gammaf_r): Likewise. * sysdeps/ieee754/k_standard.c (__kernel_standard): Likewise. * sysdeps/ieee754/k_standardl.c (__kernel_standard_l): Likewise. * sysdeps/ieee754/ldbl-128/e_gammal_r.c (__ieee754_gammal_r): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (__ieee754_gammal_r): Likewise. * sysdeps/ieee754/ldbl-96/e_gammal_r.c (__ieee754_gammal_r): Likewise. * sysdeps/powerpc/powerpc32/fpu/s_llrint.c (__llrint): Likewise. * sysdeps/powerpc/powerpc32/fpu/s_llrintf.c (__llrintf): Likewise.	2018-09-14 13:10:39 +00:00
Joseph Myers	e44acb2063	Use floor functions not __floor functions in glibc libm. Similar to the changes that were made to call sqrt functions directly in glibc, instead of __ieee754_sqrt variants, so that the compiler could inline them automatically without needing special inline definitions in lots of math_private.h headers, this patch makes libm code call floor functions directly instead of __floor variants, removing the inlines / macros for x86_64 (SSE4.1) and powerpc (POWER5). The redirection used to ensure that __ieee754_sqrt does still get called when the compiler doesn't inline a built-in function expansion is refactored so it can be applied to other functions; the refactoring is arranged so it's not limited to unary functions either (it would be reasonable to use this mechanism for copysign - removing the inline in math_private_calls.h but also eliminating unnecessary local PLT entry use in the cases (powerpc soft-float and e500v1, for IBM long double) where copysign calls don't get inlined). The point of this change is that more architectures can get floor calls inlined where they weren't previously (AArch64, for example), without needing special inline definitions in their math_private.h, and existing such definitions in math_private.h headers can be removed. Note that it's possible that in some cases an inline may be used where an IFUNC call was previously used - this is the case on x86_64, for example. I think the direct calls to floor are still appropriate; if there's any significant performance cost from inline SSE2 floor instead of an IFUNC call ending up with SSE4.1 floor, that indicates that either the function should be doing something else that's faster than using floor at all, or it should itself have IFUNC variants, or that the compiler choice of inlining for generic tuning should change to allow for the possibility that, by not inlining, an SSE4.1 IFUNC might be called at runtime - but not that glibc should avoid calling floor internally. (After all, all the same considerations would apply to any user program calling floor, where it might either be inlined or left as an out-of-line call allowing for a possible IFUNC.) Tested for x86_64, and with build-many-glibcs.py. * include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (MATH_REDIRECT): New macro. [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (MATH_REDIRECT_LDBL): Likewise. [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (MATH_REDIRECT_F128): Likewise. [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (MATH_REDIRECT_UNARY_ARGS): Likewise. [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (sqrt): Redirect using MATH_REDIRECT. [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (floor): Likewise. * sysdeps/aarch64/fpu/s_floor.c: Define NO_MATH_REDIRECT before header inclusion. * sysdeps/aarch64/fpu/s_floorf.c: Likewise. * sysdeps/ieee754/dbl-64/s_floor.c: Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_floor.c: Likewise. * sysdeps/ieee754/float128/s_floorf128.c: Likewise. * sysdeps/ieee754/flt-32/s_floorf.c: Likewise. * sysdeps/ieee754/ldbl-128/s_floorl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_floorl.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_floor_template.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floor.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floorf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_floor.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_floorf.c: Likewise. * sysdeps/riscv/rv64/rvd/s_floor.c: Likewise. * sysdeps/riscv/rvf/s_floorf.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_floor.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_floorf.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_floor.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_floorf.c: Likewise. * sysdeps/powerpc/fpu/math_private.h [_ARCH_PWR5X] (__floor): Remove macro. [_ARCH_PWR5X] (__floorf): Likewise. * sysdeps/x86_64/fpu/math_private.h [__SSE4_1__] (__floor): Remove inline function. [__SSE4_1__] (__floorf): Likewise. * math/w_lgamma_main.c (LGFUNC (__lgamma)): Use floor functions instead of __floor variants. * math/w_lgamma_r_compat.c (__lgamma_r): Likewise. * math/w_lgammaf_main.c (LGFUNC (__lgammaf)): Likewise. * math/w_lgammaf_r_compat.c (__lgammaf_r): Likewise. * math/w_lgammal_main.c (LGFUNC (__lgammal)): Likewise. * math/w_lgammal_r_compat.c (__lgammal_r): Likewise. * math/w_tgamma_compat.c (__tgamma): Likewise. * math/w_tgamma_template.c (M_DECL_FUNC (__tgamma)): Likewise. * math/w_tgammaf_compat.c (__tgammaf): Likewise. * math/w_tgammal_compat.c (__tgammal): Likewise. * sysdeps/ieee754/dbl-64/e_lgamma_r.c (sin_pi): Likewise. * sysdeps/ieee754/dbl-64/k_rem_pio2.c (__kernel_rem_pio2): Likewise. * sysdeps/ieee754/dbl-64/lgamma_neg.c (__lgamma_neg): Likewise. * sysdeps/ieee754/flt-32/e_lgammaf_r.c (sin_pif): Likewise. * sysdeps/ieee754/flt-32/lgamma_negf.c (__lgamma_negf): Likewise. * sysdeps/ieee754/ldbl-128/e_lgammal_r.c (__ieee754_lgammal_r): Likewise. * sysdeps/ieee754/ldbl-128/e_powl.c (__ieee754_powl): Likewise. * sysdeps/ieee754/ldbl-128/lgamma_negl.c (__lgamma_negl): Likewise. * sysdeps/ieee754/ldbl-128/s_expm1l.c (__expm1l): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_lgammal_r.c (__ieee754_lgammal_r): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_powl.c (__ieee754_powl): Likewise. * sysdeps/ieee754/ldbl-128ibm/lgamma_negl.c (__lgamma_negl): Likewise. * sysdeps/ieee754/ldbl-128ibm/s_expm1l.c (__expm1l): Likewise. * sysdeps/ieee754/ldbl-128ibm/s_truncl.c (__truncl): Likewise. * sysdeps/ieee754/ldbl-96/e_lgammal_r.c (sin_pi): Likewise. * sysdeps/ieee754/ldbl-96/lgamma_negl.c (__lgamma_negl): Likewise. * sysdeps/powerpc/power5+/fpu/s_modf.c (__modf): Likewise. * sysdeps/powerpc/power5+/fpu/s_modff.c (__modff): Likewise.	2018-09-14 13:09:01 +00:00
Szabolcs Nagy	3e08ff544b	Add new log2 implementation Similar algorithm is used as in log: log2(2^k x) = k + log2(c) + log2(x/c) where the last term is approximated by a polynomial of x/c - 1, the first order coefficient is about 1/ln2 in this case. There is separate code path when fma instruction is not available for computing x/c - 1 precisely, for which the table size is doubled. The worst case error is 0.547 ULP (0.55 without fma), the read only global data size is 1168 bytes (2192 without fma) on aarch64. The non-nearest rounding error is less than 1 ULP. Improvements on Cortex-A72 compared to current glibc master: log2 thruput: 2.00x in [0.01 11.1] log2 latency: 2.04x in [0.01 11.1] log2 thruput: 2.17x in [0.999 1.001] log2 latency: 2.88x in [0.999 1.001] Tested on aarch64-linux-gnu (defined __FP_FAST_FMA) arm-linux-gnueabihf (!defined __FP_FAST_FMA) x86_64-linux-gnu (!defined __FP_FAST_FMA) powerpc64le-linxu-gnu (defined __FP_FAST_FMA) targets. * NEWS: Mention log2 improvements. * math/Makefile (type-double-routines): Add e_log2_data. * sysdeps/i386/fpu/e_log2_data.c: New file. * sysdeps/ia64/fpu/e_log2_data.c: New file. * sysdeps/ieee754/dbl-64/e_log2.c: Rewrite. * sysdeps/ieee754/dbl-64/e_log2_data.c: New file. * sysdeps/ieee754/dbl-64/math_config.h (__log2_data): Add. * sysdeps/ieee754/dbl-64/wordsize-64/e_log2.c: Remove. * sysdeps/m68k/m680x0/fpu/e_log2_data.c: New file.	2018-09-12 17:36:33 +01:00
Szabolcs Nagy	f41b0a43e4	Add new log implementation Optimized log using carefully generated lookup table with 1/c and log(c) values for small intervalls around 1. The log(c) is very near a double precision value, it has about 62 bits precision. The algorithm is log(2^k x) = k log(2) + log(c) + log(x/c), where the last term is approximated by a polynomial of x/c - 1. Near 1 a single polynomial of x - 1 is used. There is separate code path when fma instruction is not available for computing x/c - 1 precisely, in which case the table size is doubled. The code uses __builtin_fma under __FP_FAST_FMA to ensure it is inlined as an instruction. With the default configuration settings the worst case error is 0.519 ULP (and 0.520 without fma), the rodata size is 2192 bytes (4240 without fma). The non-nearest rounding error is less than 1 ULP. Improvements on Cortex-A72 compared to current glibc master: log thruput: 3.28x in [0.01 11.1] log latency: 2.23x in [0.01 11.1] log thruput: 1.56x in [0.999 1.001] log latency: 1.57x in [0.999 1.001] Tested on aarch64-linux-gnu (defined __FP_FAST_FMA) arm-linux-gnueabihf (!defined __FP_FAST_FMA) x86_64-linux-gnu (!defined __FP_FAST_FMA) powerpc64le-linux-gnu (defined __FP_FAST_FMA) targets. * NEWS: Mention log improvement. * math/Makefile (type-double-routines): Add e_log_data. * sysdeps/i386/fpu/e_log_data.c: New file. * sysdeps/ia64/fpu/e_log_data.c: New file. * sysdeps/ieee754/dbl-64/e_log.c: Rewrite. * sysdeps/ieee754/dbl-64/e_log_data.c: New file. * sysdeps/ieee754/dbl-64/math_config.h (__log_data): Add. * sysdeps/ieee754/dbl-64/ulog.h: Remove. * sysdeps/ieee754/dbl-64/ulog.tbl: Remove. * sysdeps/m68k/m680x0/fpu/e_log_data.c: New file.	2018-09-12 17:33:30 +01:00
Szabolcs Nagy	e70c176825	Add new exp and exp2 implementations Optimized exp and exp2 implementations using a lookup table for fractional powers of 2. There are several variants, see e_exp_data.c, they can be selected by modifying math_config.h allowing different tradeoffs. The default selection should be acceptable as generic libm code. Worst case error is 0.509 ULP for exp and 0.507 ULP for exp2, on aarch64 the rodata size is 2160 bytes, shared between exp and exp2. On aarch64 .text + .rodata size decreased by 24912 bytes. The non-nearest rounding error is less than 1 ULP even on targets without efficient round implementation (although the error rate is higher in that case). Targets with single instruction, rounding mode independent, to nearest integer rounding and conversion can use them by setting TOINT_INTRINSICS and adding the necessary code to their math_private.h. The __exp1 code uses the same algorithm, so the error bound of pow increased a bit. New double precision error handling code was added following the style of the single precision error handling code. Improvements on Cortex-A72 compared to current glibc master: exp thruput: 1.61x in [-9.9 9.9] exp latency: 1.53x in [-9.9 9.9] exp thruput: 1.13x in [0.5 1] exp latency: 1.30x in [0.5 1] exp2 thruput: 2.03x in [-9.9 9.9] exp2 latency: 1.64x in [-9.9 9.9] For small (< 1) inputs the current exp code uses a separate algorithm so the speed up there is less. Was tested on aarch64-linux-gnu (TOINT_INTRINSICS, fma contraction) and arm-linux-gnueabihf (!TOINT_INTRINSICS, no fma contraction) and x86_64-linux-gnu (!TOINT_INTRINSICS, no fma contraction) and powerpc64le-linux-gnu (!TOINT_INTRINSICS, fma contraction) targets, only non-nearest rounding ulp errors increase and they are within acceptable bounds (ulp updates are in separate patches). * NEWS: Mention exp and exp2 improvements. * math/Makefile (libm-support): Remove t_exp. (type-double-routines): Add math_err and e_exp_data. * sysdeps/aarch64/libm-test-ulps: Update. * sysdeps/arm/libm-test-ulps: Update. * sysdeps/i386/fpu/e_exp_data.c: New file. * sysdeps/i386/fpu/math_err.c: New file. * sysdeps/i386/fpu/t_exp.c: Remove. * sysdeps/ia64/fpu/e_exp_data.c: New file. * sysdeps/ia64/fpu/math_err.c: New file. * sysdeps/ia64/fpu/t_exp.c: Remove. * sysdeps/ieee754/dbl-64/e_exp.c: Rewrite. * sysdeps/ieee754/dbl-64/e_exp2.c: Rewrite. * sysdeps/ieee754/dbl-64/e_exp_data.c: New file. * sysdeps/ieee754/dbl-64/e_pow.c (__ieee754_pow): Update error bound. * sysdeps/ieee754/dbl-64/eexp.tbl: Remove. * sysdeps/ieee754/dbl-64/math_config.h: New file. * sysdeps/ieee754/dbl-64/math_err.c: New file. * sysdeps/ieee754/dbl-64/t_exp.c: Remove. * sysdeps/ieee754/dbl-64/t_exp2.h: Remove. * sysdeps/ieee754/dbl-64/uexp.h: Remove. * sysdeps/ieee754/dbl-64/uexp.tbl: Remove. * sysdeps/m68k/m680x0/fpu/e_exp_data.c: New file. * sysdeps/m68k/m680x0/fpu/math_err.c: New file. * sysdeps/m68k/m680x0/fpu/t_exp.c: Remove. * sysdeps/powerpc/fpu/libm-test-ulps: Update. * sysdeps/x86_64/fpu/libm-test-ulps: Update.	2018-09-05 16:22:00 +01:00
Joseph Myers	418d99e622	Move fenv.h soft-float inlines from fenv_private.h to include/fenv.h. <fenv_private.h> has inline versions of various <fenv.h> functions, and their __fe* variants, for systems (generally soft-float) without support for floating-point exceptions, rounding modes or both. Having these inlines in a separate header introduces a risk of a source file including <fenv.h> and compiling OK on x86_64, but failing to compile (because the feraiseexcept inline is actually a macro that discards its argument, to avoid the need for #ifdef FE_INVALID conditionals), or not being properly optimized, on systems without the exceptions and rounding modes support (when these inlines were in math_private.h, we had a few cases where this broke the build because there was no obvious reason for a file to need math_private.h and it didn't need that header on x86_64). By moving those inlines to include/fenv.h, this risk can be avoided, and fenv_private.h becomes more clearly defined as specifically the header for the internal libc_fe* and SET_RESTORE_ROUND* interfaces. This patch makes that move, removing fenv_private.h includes that are no longer needed (or replacing them by fenv.h includes in a few cases that didn't already have such an include). Tested for x86_64 and x86, and tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by the patch. * sysdeps/generic/fenv_private.h [FE_ALL_EXCEPT == 0]: Move this code .... [!FE_HAVE_ROUNDING_MODES]: And this code .... * include/fenv.h [!_ISOMAC]: ... to here. * math/fraiseexcpt.c (__feraiseexcept): Undefine as macro. (feraiseexcept): Likewise. * math/fromfp.h: Do not include <fenv_private.h>. * math/s_cexp_template.c: Likewise. * math/s_csin_template.c: Likewise. * math/s_csinh_template.c: Likewise. * math/s_ctan_template.c: Likewise. * math/s_ctanh_template.c: Likewise. * math/s_iseqsig_template.c: Likewise. * math/w_acos_compat.c: Likewise. * math/w_acosf_compat.c: Likewise. * math/w_acosl_compat.c: Likewise. * math/w_asin_compat.c: Likewise. * math/w_asinf_compat.c: Likewise. * math/w_asinl_compat.c: Likewise. * math/w_j0_compat.c: Likewise. * math/w_j0f_compat.c: Likewise. * math/w_j0l_compat.c: Likewise. * math/w_j1_compat.c: Likewise. * math/w_j1f_compat.c: Likewise. * math/w_j1l_compat.c: Likewise. * math/w_jn_compat.c: Likewise. * math/w_jnf_compat.c: Likewise. * math/w_log10_compat.c: Likewise. * math/w_log10f_compat.c: Likewise. * math/w_log10l_compat.c: Likewise. * math/w_log2_compat.c: Likewise. * math/w_log2f_compat.c: Likewise. * math/w_log2l_compat.c: Likewise. * math/w_log_compat.c: Likewise. * math/w_logf_compat.c: Likewise. * math/w_logl_compat.c: Likewise. * sysdeps/ieee754/dbl-64/s_llrint.c: Likewise. * sysdeps/ieee754/dbl-64/s_llround.c: Likewise. * sysdeps/ieee754/dbl-64/s_lrint.c: Likewise. * sysdeps/ieee754/dbl-64/s_lround.c: Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_lround.c: Likewise. * sysdeps/ieee754/flt-32/s_llrintf.c: Likewise. * sysdeps/ieee754/flt-32/s_llroundf.c: Likewise. * sysdeps/ieee754/flt-32/s_lrintf.c: Likewise. * sysdeps/ieee754/flt-32/s_lroundf.c: Likewise. * sysdeps/ieee754/k_standardl.c: Likewise. * sysdeps/ieee754/ldbl-128/e_expl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_fmal.c: Likewise. * sysdeps/ieee754/ldbl-128/s_llrintl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_llroundl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_lrintl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_lroundl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_nearbyintl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_llrintl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_llroundl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_lrintl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_lroundl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_fma.c: Likewise. * sysdeps/ieee754/ldbl-96/s_fmal.c: Likewise. * sysdeps/ieee754/ldbl-96/s_llrintl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_llroundl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_lrintl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_lroundl.c: Likewise. * math/w_ilogb_template.c: Include <fenv.h> instead of <fenv_private.h>. * math/w_llogb_template.c: Likewise. * sysdeps/powerpc/fpu/e_sqrt.c: Likewise. * sysdeps/powerpc/fpu/e_sqrtf.c: Likewise.	2018-09-04 19:52:06 +00:00
Joseph Myers	70e2ba332f	Do not include fenv_private.h in math_private.h. Continuing the clean-up related to the catch-all math_private.h header, this patch stops math_private.h from including fenv_private.h. Instead, fenv_private.h is included directly from those users of math_private.h that also used interfaces from fenv_private.h. No attempt is made to remove unused includes of math_private.h, but that is a natural followup. (However, since math_private.h sometimes defines optimized versions of math.h interfaces or __* variants thereof, as well as defining its own interfaces, I think it might make sense to get all those optimized versions included from include/math.h, not requiring a separate header at all, before eliminating unused math_private.h includes - that avoids a file quietly becoming less-optimized if someone adds a call to one of those interfaces without restoring a math_private.h include to that file.) There is still a pitfall that if code uses plain fe* and __fe* interfaces, but only includes fenv.h and not fenv_private.h or (before this patch) math_private.h, it will compile on platforms with exceptions and rounding modes but not get the optimized versions (and possibly not compile) on platforms without exception and rounding mode support, so making it easy to break the build for such platforms accidentally. I think it would be most natural to move the inlines / macros for fe* and __fe* in the case of no exceptions and rounding modes into include/fenv.h, so that all code including fenv.h with _ISOMAC not defined automatically gets them. Then fenv_private.h would be purely the header for the libc_fe, SET_RESTORE_ROUND etc. internal interfaces and the risk of breaking the build on other platforms than the one you tested on because of a missing fenv_private.h include would be much reduced (and there would be some unused fenv_private.h includes to remove along with unused math_private.h includes). Tested for x86_64 and x86, and tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by this patch. sysdeps/generic/math_private.h: Do not include <fenv_private.h>. * math/fromfp.h: Include <fenv_private.h>. * math/math-narrow.h: Likewise. * math/s_cexp_template.c: Likewise. * math/s_csin_template.c: Likewise. * math/s_csinh_template.c: Likewise. * math/s_ctan_template.c: Likewise. * math/s_ctanh_template.c: Likewise. * math/s_iseqsig_template.c: Likewise. * math/w_acos_compat.c: Likewise. * math/w_acosf_compat.c: Likewise. * math/w_acosl_compat.c: Likewise. * math/w_asin_compat.c: Likewise. * math/w_asinf_compat.c: Likewise. * math/w_asinl_compat.c: Likewise. * math/w_ilogb_template.c: Likewise. * math/w_j0_compat.c: Likewise. * math/w_j0f_compat.c: Likewise. * math/w_j0l_compat.c: Likewise. * math/w_j1_compat.c: Likewise. * math/w_j1f_compat.c: Likewise. * math/w_j1l_compat.c: Likewise. * math/w_jn_compat.c: Likewise. * math/w_jnf_compat.c: Likewise. * math/w_llogb_template.c: Likewise. * math/w_log10_compat.c: Likewise. * math/w_log10f_compat.c: Likewise. * math/w_log10l_compat.c: Likewise. * math/w_log2_compat.c: Likewise. * math/w_log2f_compat.c: Likewise. * math/w_log2l_compat.c: Likewise. * math/w_log_compat.c: Likewise. * math/w_logf_compat.c: Likewise. * math/w_logl_compat.c: Likewise. * sysdeps/aarch64/fpu/feholdexcpt.c: Likewise. * sysdeps/aarch64/fpu/fesetround.c: Likewise. * sysdeps/aarch64/fpu/fgetexcptflg.c: Likewise. * sysdeps/aarch64/fpu/ftestexcept.c: Likewise. * sysdeps/ieee754/dbl-64/e_atan2.c: Likewise. * sysdeps/ieee754/dbl-64/e_exp.c: Likewise. * sysdeps/ieee754/dbl-64/e_exp2.c: Likewise. * sysdeps/ieee754/dbl-64/e_gamma_r.c: Likewise. * sysdeps/ieee754/dbl-64/e_jn.c: Likewise. * sysdeps/ieee754/dbl-64/e_pow.c: Likewise. * sysdeps/ieee754/dbl-64/e_remainder.c: Likewise. * sysdeps/ieee754/dbl-64/e_sqrt.c: Likewise. * sysdeps/ieee754/dbl-64/gamma_product.c: Likewise. * sysdeps/ieee754/dbl-64/lgamma_neg.c: Likewise. * sysdeps/ieee754/dbl-64/s_atan.c: Likewise. * sysdeps/ieee754/dbl-64/s_fma.c: Likewise. * sysdeps/ieee754/dbl-64/s_fmaf.c: Likewise. * sysdeps/ieee754/dbl-64/s_llrint.c: Likewise. * sysdeps/ieee754/dbl-64/s_llround.c: Likewise. * sysdeps/ieee754/dbl-64/s_lrint.c: Likewise. * sysdeps/ieee754/dbl-64/s_lround.c: Likewise. * sysdeps/ieee754/dbl-64/s_nearbyint.c: Likewise. * sysdeps/ieee754/dbl-64/s_sin.c: Likewise. * sysdeps/ieee754/dbl-64/s_sincos.c: Likewise. * sysdeps/ieee754/dbl-64/s_tan.c: Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_lround.c: Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_nearbyint.c: Likewise. * sysdeps/ieee754/dbl-64/x2y2m1.c: Likewise. * sysdeps/ieee754/float128/float128_private.h: Likewise. * sysdeps/ieee754/flt-32/e_gammaf_r.c: Likewise. * sysdeps/ieee754/flt-32/e_j1f.c: Likewise. * sysdeps/ieee754/flt-32/e_jnf.c: Likewise. * sysdeps/ieee754/flt-32/lgamma_negf.c: Likewise. * sysdeps/ieee754/flt-32/s_llrintf.c: Likewise. * sysdeps/ieee754/flt-32/s_llroundf.c: Likewise. * sysdeps/ieee754/flt-32/s_lrintf.c: Likewise. * sysdeps/ieee754/flt-32/s_lroundf.c: Likewise. * sysdeps/ieee754/flt-32/s_nearbyintf.c: Likewise. * sysdeps/ieee754/k_standardl.c: Likewise. * sysdeps/ieee754/ldbl-128/e_expl.c: Likewise. * sysdeps/ieee754/ldbl-128/e_gammal_r.c: Likewise. * sysdeps/ieee754/ldbl-128/e_j1l.c: Likewise. * sysdeps/ieee754/ldbl-128/e_jnl.c: Likewise. * sysdeps/ieee754/ldbl-128/gamma_productl.c: Likewise. * sysdeps/ieee754/ldbl-128/lgamma_negl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_fmal.c: Likewise. * sysdeps/ieee754/ldbl-128/s_llrintl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_llroundl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_lrintl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_lroundl.c: Likewise. * sysdeps/ieee754/ldbl-128/s_nearbyintl.c: Likewise. * sysdeps/ieee754/ldbl-128/x2y2m1l.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/e_expl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/e_j1l.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/e_jnl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/lgamma_negl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_fmal.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_llrintl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_llroundl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_lrintl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_lroundl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_rintl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/x2y2m1l.c: Likewise. * sysdeps/ieee754/ldbl-96/e_gammal_r.c: Likewise. * sysdeps/ieee754/ldbl-96/e_jnl.c: Likewise. * sysdeps/ieee754/ldbl-96/gamma_productl.c: Likewise. * sysdeps/ieee754/ldbl-96/lgamma_negl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_fma.c: Likewise. * sysdeps/ieee754/ldbl-96/s_fmal.c: Likewise. * sysdeps/ieee754/ldbl-96/s_llrintl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_llroundl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_lrintl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_lroundl.c: Likewise. * sysdeps/ieee754/ldbl-96/x2y2m1l.c: Likewise. * sysdeps/powerpc/fpu/e_sqrt.c: Likewise. * sysdeps/powerpc/fpu/e_sqrtf.c: Likewise. * sysdeps/riscv/rv64/rvd/s_ceil.c: Likewise. * sysdeps/riscv/rv64/rvd/s_floor.c: Likewise. * sysdeps/riscv/rv64/rvd/s_nearbyint.c: Likewise. * sysdeps/riscv/rv64/rvd/s_round.c: Likewise. * sysdeps/riscv/rv64/rvd/s_roundeven.c: Likewise. * sysdeps/riscv/rv64/rvd/s_trunc.c: Likewise. * sysdeps/riscv/rvd/s_finite.c: Likewise. * sysdeps/riscv/rvd/s_fmax.c: Likewise. * sysdeps/riscv/rvd/s_fmin.c: Likewise. * sysdeps/riscv/rvd/s_fpclassify.c: Likewise. * sysdeps/riscv/rvd/s_isinf.c: Likewise. * sysdeps/riscv/rvd/s_isnan.c: Likewise. * sysdeps/riscv/rvd/s_issignaling.c: Likewise. * sysdeps/riscv/rvf/fegetround.c: Likewise. * sysdeps/riscv/rvf/feholdexcpt.c: Likewise. * sysdeps/riscv/rvf/fesetenv.c: Likewise. * sysdeps/riscv/rvf/fesetround.c: Likewise. * sysdeps/riscv/rvf/feupdateenv.c: Likewise. * sysdeps/riscv/rvf/fgetexcptflg.c: Likewise. * sysdeps/riscv/rvf/ftestexcept.c: Likewise. * sysdeps/riscv/rvf/s_ceilf.c: Likewise. * sysdeps/riscv/rvf/s_finitef.c: Likewise. * sysdeps/riscv/rvf/s_floorf.c: Likewise. * sysdeps/riscv/rvf/s_fmaxf.c: Likewise. * sysdeps/riscv/rvf/s_fminf.c: Likewise. * sysdeps/riscv/rvf/s_fpclassifyf.c: Likewise. * sysdeps/riscv/rvf/s_isinff.c: Likewise. * sysdeps/riscv/rvf/s_isnanf.c: Likewise. * sysdeps/riscv/rvf/s_issignalingf.c: Likewise. * sysdeps/riscv/rvf/s_nearbyintf.c: Likewise. * sysdeps/riscv/rvf/s_roundevenf.c: Likewise. * sysdeps/riscv/rvf/s_roundf.c: Likewise. * sysdeps/riscv/rvf/s_truncf.c: Likewise.	2018-09-03 21:09:04 +00:00
Wilco Dijkstra	ca3aac57ef	Remove unused math files Remove empty files due to the sin/cos improvements: k_sinf.c, k_cosf.c, k_cos.c, k_sin.c. After the tanf change s_rem_pio2f.c and k_rem_pio2f.c (and the ia64, m68k and powerpc equivalents) are no longer used, so remove them. All e_rem_pio2.c files were already empty or commented out, so remove them too. Passes build-many-glibcs. * math/Makefile: Remove empty files k_sin(f).c, k_cos(f).c. Remove unused files e_rem_pio2(f).c, k_rem_pio2f.c. * sysdeps/i386/fpu/e_rem_pio2.c: Delete file. * sysdeps/ia64/fpu/e_rem_pio2.c: Likewise. * sysdeps/ia64/fpu/e_rem_pio2f.c: Likewise. * sysdeps/ia64/fpu/k_rem_pio2f.c: Likewise. * sysdeps/ieee754/dbl-64/e_rem_pio2.c: Likewise. * sysdeps/ieee754/dbl-64/k_cos.c: Likewise. * sysdeps/ieee754/dbl-64/k_sin.c: Likewise. * sysdeps/ieee754/flt-32/e_rem_pio2f.c: Likewise. * sysdeps/ieee754/flt-32/k_cosf.c: Likewise. * sysdeps/ieee754/flt-32/k_rem_pio2f.c: Likewise. * sysdeps/ieee754/flt-32/k_sinf.c: Likewise. * sysdeps/m68k/m680x0/fpu/e_rem_pio2.c: Likewise * sysdeps/m68k/m680x0/fpu/e_rem_pio2f.c: Likewise * sysdeps/m68k/m680x0/fpu/k_rem_pio2f.c: Likewise * sysdeps/powerpc/fpu/e_rem_pio2f.c: Likewise. * sysdeps/powerpc/fpu/k_rem_pio2f.c: Likewise.	2018-08-24 15:34:54 +01:00
Wilco Dijkstra	900fb446eb	Speedup tanf range reduction Speedup tanf range reduction by using the new sincosf range reduction algorithm. Overall code quality is improved due to inlining, so there is a speedup even if no range reduction is required. tanf throughput gains on Cortex-A72: * \|x\| < M_PI_4 : 1.1x * \|x\| < M_PI_2 : 1.2x * \|x\| < 2 * M_PI: 1.5x * \|x\| < 120.0 : 1.6x * \|x\| < Inf : 12.1x * sysdeps/ieee754/flt-32/s_tanf.c (__tanf): Use fast range reduction.	2018-08-23 12:38:16 +01:00
Wilco Dijkstra	126c4e3f80	Use generic sinf/cosf in lgammaf_r The internal functions __kernel_sinf and __kernel_cosf are used only by lgammaf_r. Removing the internal functions and using the generic sinf and cosf is better overall. Benchmarking on Cortex-A72 shows the generic sinf and cosf are 1.4x and 2.3x faster in the range \|x\| < PI/4, and 0.66x and 1.1x for \|x\| < PI/2, so it should make lgammaf_r faster on average. GLIBC regression tests pass on AArch64. * sysdeps/ieee754/flt-32/e_lgammaf_r.c (sin_pif): Use __sinf/__cosf. * sysdeps/ieee754/flt-32/k_cosf.c (__kernel_cosf): Remove all code. * sysdeps/ieee754/flt-32/k_sinf.c (__kernel_sinf): Likewise.	2018-08-15 16:01:21 +01:00
Wilco Dijkstra	599cf39766	Improve performance of sinf and cosf The second patch improves performance of sinf and cosf using the same algorithms and polynomials. The returned values are identical to sincosf for the same input. ULP definitions for AArch64 and x64 are updated. sinf/cosf througput gains on Cortex-A72: * \|x\| < 0x1p-12 : 1.2x * \|x\| < M_PI_4 : 1.8x * \|x\| < 2 * M_PI: 1.7x * \|x\| < 120.0 : 2.3x * \|x\| < Inf : 3.0x * NEWS: Mention sinf, cosf, sincosf. * sysdeps/aarch64/libm-test-ulps: Update ULP for sinf, cosf, sincosf. * sysdeps/x86_64/fpu/libm-test-ulps: Update ULP for sinf and cosf. * sysdeps/x86_64/fpu/multiarch/s_sincosf-fma.c: Add definitions of constants rather than including generic sincosf.h. * sysdeps/x86_64/fpu/s_sincosf_data.c: Remove. * sysdeps/ieee754/flt-32/s_cosf.c (cosf): Rewrite. * sysdeps/ieee754/flt-32/s_sincosf.h (reduced_sin): Remove. (reduced_cos): Remove. (sinf_poly): New function. * sysdeps/ieee754/flt-32/s_sinf.c (sinf): Rewrite.	2018-08-14 10:45:59 +01:00
Wilco Dijkstra	ea5c662c62	Improve performance of sincosf This patch is a complete rewrite of sincosf. The new version is significantly faster, as well as simple and accurate. The worst-case ULP is 0.5607, maximum relative error is 0.5303 * 2^-23 over all 4 billion inputs. In non-nearest rounding modes the error is 1ULP. The algorithm uses 3 main cases: small inputs which don't need argument reduction, small inputs which need a simple range reduction and large inputs requiring complex range reduction. The code uses approximate integer comparisons to quickly decide between these cases. The small range reducer uses a single reduction step to handle values up to 120.0. It is fastest on targets which support inlined round instructions. The large range reducer uses integer arithmetic for simplicity. It does a 32x96 bit multiply to compute a 64-bit modulo result. This is more than accurate enough to handle the worst-case cancellation for values close to an integer multiple of PI/4. It could be further optimized, however it is already much faster than necessary. sincosf throughput gains on Cortex-A72: * \|x\| < 0x1p-12 : 1.6x * \|x\| < M_PI_4 : 1.7x * \|x\| < 2 * M_PI: 1.5x * \|x\| < 120.0 : 1.8x * \|x\| < Inf : 2.3x * math/Makefile: Add s_sincosf_data.c. * sysdeps/ia64/fpu/s_sincosf_data.c: New file. * sysdeps/ieee754/flt-32/s_sincosf.h (abstop12): Add new function. (sincosf_poly): Likewise. (reduce_small): Likewise. (reduce_large): Likewise. * sysdeps/ieee754/flt-32/s_sincosf.c (sincosf): Rewrite. * sysdeps/ieee754/flt-32/s_sincosf_data.c: New file with sincosf data. * sysdeps/m68k/m680x0/fpu/s_sincosf_data.c: New file. * sysdeps/x86_64/fpu/s_sincosf_data.c: New file.	2018-08-10 17:34:39 +01:00
Szabolcs Nagy	43cfdf8f48	Clean up converttoint handling and document the semantics This patch currently only affects aarch64. The roundtoint and converttoint internal functions are only called with small values, so 32 bit result is enough for converttoint and it is a signed int conversion so the return type is changed to int32_t. The original idea was to help the compiler keeping the result in uint64_t, then it's clear that no sign extension is needed and there is no accidental undefined or implementation defined signed int arithmetics. But it turns out gcc does a good job with inlining so changing the type has no overhead and the semantics of the conversion is less surprising this way. Since we want to allow the asuint64 (x + 0x1.8p52) style conversion, the top bits were never usable and the existing code ensures that only the bottom 32 bits of the conversion result are used. On aarch64 the neon intrinsics (which round ties to even) are changed to round and lround (which round ties away from zero) this does not affect the results in a significant way, but more portable (relies on round and lround being inlined which works with -fno-math-errno). The TOINT_SHIFT and TOINT_RINT macros were removed, only keep separate code paths for TOINT_INTRINSICS and !TOINT_INTRINSICS. * sysdeps/aarch64/fpu/math_private.h (roundtoint): Use round. (converttoint): Use lround. * sysdeps/ieee754/flt-32/math_config.h (roundtoint): Declare and document the semantics when TOINT_INTRINSICS is set. (converttoint): Likewise. (TOINT_RINT): Remove. (TOINT_SHIFT): Remove. * sysdeps/ieee754/flt-32/e_expf.c (__expf): Remove the TOINT_RINT code path.	2018-08-10 17:23:16 +01:00
Gabriel F. T. Gomes	b7b88cea41	ldbl-128ibm-compat: Add printf_size Since the addition of the _Float128 API, strfromf128 and printf_size use __printf_fp to print _Float128 values. This is achieved by setting the 'is_binary128' member of the 'printf_info' structure to one. Now that the format of long double on powerpc64le is getting a third option, this mechanism is reused for long double values that have binary128 format (i.e.: when -mabi=ieeelongdouble). This patch adds __printf_sizeieee128 as an exported symbol, but doesn't provide redirections from printf_size, yet. All redirections will be installed in a future commit, once all other functions that print or read long double values with binary128 format are ready. In __printf_fp, when 'is_binary128' is one, the floating-point argument is treated as if it was of _Float128 type, regardless of the value of 'is_long_double', thus __printf_sizeieee128 sets 'is_binary128' to the same value of 'is_long_double'. Otherwise, double values would not be printed correctly. Tested for powerpc64le.	2018-07-02 10:51:01 -03:00
Szabolcs Nagy	2b445206a1	Use uint32_t sign in single precision math error handling functions Ideally sign should be bool, but sometimes (e.g. in powf) it's more efficient to pass a non-zero value than 1 to indicate that the sign should be set. The fixed size int is less ambigous than unsigned long. * sysdeps/ieee754/flt-32/e_powf.c (__powf): Use uint32_t. (exp2f_inline): Likewise. * sysdeps/ieee754/flt-32/math_config.h (__math_oflowf): Likewise. (__math_uflowf): Likewise. (__math_may_uflowf): Likewise. (__math_divzerof): Likewise. (__math_invalidf): Likewise. * sysdeps/ieee754/flt-32/math_errf.c (xflowf): Likewise. (__math_oflowf): Likewise. (__math_uflowf): Likewise. (__math_may_uflowf): Likewise. (__math_divzerof): Likewise. (__math_invalidf): Likewise.	2018-07-02 09:29:04 +01:00
Rajalakshmi Srinivasaraghavan	86a0f56158	ldbl-128ibm-compat: Introduce ieee128 symbols This patch adds __ieee128 symbols for strfrom, strtold, strtold_l, wcstold and wcstold_l functions. Redirection from l to ieee128 will be handled in separate patch once we start building these new files. 2018-06-28 Rajalakshmi Srinivasaraghavan <raji@linux.vnet.ibm.com> sysdeps/ieee754/ldbl-128ibm-compat/Versions: Add __strfromieee128, __strtoieee128, __strtoieee128_l,__wcstoieee128 and __wcstoieee128_l. * sysdeps/ieee754/ldbl-128ibm-compat/strfromf128.c: New file. * sysdeps/ieee754/ldbl-128ibm-compat/strtof128.c: New file. * sysdeps/ieee754/ldbl-128ibm-compat/strtof128_l.c: New file. * sysdeps/ieee754/ldbl-128ibm-compat/wcstof128.c: New file. * sysdeps/ieee754/ldbl-128ibm-compat/wcstof128_l.c: New file.	2018-06-28 13:57:50 +05:30
Tulio Magno Quites Machado Filho	209ae17c60	ldbl-128ibm-compat: Create libm-alias-float128.h Add a new libm-alias-float128.h in order to provide the __ieee128 aliases for the existing f128 that do not have a globally exported symbol. * sysdeps/ieee754/ldbl-128ibm-compat/Versions: New file. * sysdeps/ieee754/ldbl-128ibm-compat/libm-alias-float128.h: New file.	2018-06-20 19:02:07 -03:00
Tulio Magno Quites Machado Filho	5e79e0292b	Add a generic significand implementation Create a template for significand. * math/Makefile (libm-calls): Move s_significandF to... (gen-libm-calls): ... here. * math/s_significand_template.c: New file. * math/s_significand.c: Removed. * math/s_significandf.c: Removed. * math/s_significandl.c: Removed. * sysdeps/ieee754/ldbl-opt/s_significand.c: Removed. * sysdeps/ieee754/ldbl-opt/s_significandl.c: Removed. Signed-off-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>	2018-06-20 18:15:06 -03:00
Joseph Myers	ca121b117f	Fix ldbl-96 fma (Inf, Inf, finite) (bug 23272). As reported in bug 23272, the ldbl-96 implementation of fma (fma for double, in terms of ldbl-96 as the internal arithmetic type, as used on 32-bit x86) is missing some of the special-case handling for non-finite arguments, resulting in incorrect NaN results when the first two arguments are infinities, the third is finite and so the infinities go through the logic for finite arguments. This patch fixes it by handling all cases of non-finite arguments up front, with additional fma tests for the problem cases being added to the testsuite. Tested for x86_64 and x86. [BZ #23272] * sysdeps/ieee754/ldbl-96/s_fma.c (__fma): Start by handling all cases of non-finite arguments. * math/libm-test-fma.inc (fma_test_data): Add more tests.	2018-06-11 16:33:42 +00:00

1 2 3 4 5 ...

1022 Commits