glibc/sysdeps
Joseph Myers f280fa6d17 Use __builtin_fma more in dbl-64 code.
sysdeps/ieee754/dbl-64/dla.h can use a macro DLA_FMS for more
efficient double-width operations when fused multiply-subtract is
supported.  However, this macro is only defined for x86_64,
conditional on architecture-specific __FMA4__.  This patch makes the
code use __builtin_fma conditional on __FP_FAST_FMA, as used elsewhere
in glibc.
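
For illustration only, the following is a minimal sketch of the kind of
double-width multiplication that benefits from a fused multiply-subtract.
It is not the text of dla.h: SKETCH_FMS, SKETCH_CN, two_product and
two_product_check are hypothetical names, and the fallback is the usual
Dekker splitting, which assumes double arithmetic is evaluated without
excess precision.

```c
#include <assert.h>

/* Hypothetical stand-in for DLA_FMS: x * y - z with a single rounding.
   Only defined when the compiler advertises a fast hardware fma.  */
#ifdef __FP_FAST_FMA
# define SKETCH_FMS(x, y, z) __builtin_fma ((x), (y), -(z))
#endif

/* Splitting constant 2^27 + 1 for Dekker's algorithm on IEEE double.  */
#define SKETCH_CN 134217729.0

/* Store the rounded product in *hi and its rounding error in *lo, so
   that hi + lo equals x * y exactly (barring overflow and underflow).  */
static void
two_product (double x, double y, double *hi, double *lo)
{
  double z = x * y;
#ifdef SKETCH_FMS
  /* One fused operation recovers the rounding error of x * y.  */
  *lo = SKETCH_FMS (x, y, z);
#else
  /* Without fma: split each operand into high and low halves whose
     partial products are exact, then sum the error terms.  */
  double p = SKETCH_CN * x;
  double hx = (x - p) + p;
  double tx = x - hx;
  p = SKETCH_CN * y;
  double hy = (y - p) + p;
  double ty = y - hy;
  *lo = (((hx * hy - z) + hx * ty) + tx * hy) + tx * ty;
#endif
  *hi = z;
}

/* Worked example: (1 + 2^-30)^2 = 1 + 2^-29 + 2^-60, and the last term
   does not fit in a double next to the first two.  */
static void
two_product_check (void)
{
  double hi, lo;
  two_product (1.0 + 0x1p-30, 1.0 + 0x1p-30, &hi, &lo);
  assert (hi == 1.0 + 0x1p-29);
  assert (lo == 0x1p-60);
}
```

Both branches should produce the same hi and lo for this input; the fused
branch simply replaces roughly a dozen floating-point operations with one.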

Tested for x86_64, x86 and powerpc.  On powerpc (where this is causing
fused operations to be used where they weren't previously) I see an
increase from 1ulp to 2ulp in the imaginary part of clog10:

testing double (without inline functions)
Failure: Test: Imaginary part of: clog10 (0x1.7a858p+0 - 0x6.d940dp-4 i)
Result:
 is:         -1.2237865208199886e-01  -0x1.f5435146bb61ap-4
 should be:  -1.2237865208199888e-01  -0x1.f5435146bb61cp-4
 difference:  2.7755575615628914e-17   0x1.0000000000000p-55
 ulp       :  2.0000
 max.ulp   :  1.0000
Maximal error of real part of: clog10
 is      : 3 ulp
 accepted: 3 ulp
Maximal error of imaginary part of: clog10
 is      : 2 ulp
 accepted: 1 ulp

This actually results from atan2 becoming *more* accurate: atan2
(-0x6.d940dp-4, 0x1.7a858p+0) should ideally be -0x1.208cd6e841554p-2,
but was -0x1.208cd6e841555p-2 from a powerpc libm built before this
change, and is -0x1.208cd6e841554p-2 from a powerpc libm built after
this change.  Since glibc's accuracy goals do not expect these
functions to be correctly rounding, neither result is a problem.
However, this does imply that some of this code, although designed to
be correctly rounding, is not in fact correctly rounding (possibly
because GCC creates fused operations where the code does not expect
them, something we've only disabled for specific functions where it
was found to cause large errors).  As previously discussed, I think
we should remove the slow cases where an error analysis shows this
wouldn't increase the errors much above 0.5ulp; only functions such
as cratan2 are expected to be correctly rounding, not atan2.
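
As a rough cross-check of the ulp figures quoted above, the distance
between two doubles of the same sign can be read off their bit patterns.
This is only a sketch: magnitude_bits and ulp_distance are hypothetical
helpers, not the method used by glibc's libm test harness, and the result
matches an ulp count only when both values are finite and lie in the same
binade, as they do here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return the bit pattern of a double with the sign bit cleared.  */
static uint64_t
magnitude_bits (double d)
{
  uint64_t u;
  memcpy (&u, &d, sizeof u);
  return u & UINT64_C (0x7fffffffffffffff);
}

/* Count how many representable doubles separate a from b.
   Assumes both are finite and have the same sign.  */
static uint64_t
ulp_distance (double a, double b)
{
  assert ((a < 0) == (b < 0));
  uint64_t ua = magnitude_bits (a);
  uint64_t ub = magnitude_bits (b);
  return ua > ub ? ua - ub : ub - ua;
}
```

With the values from the report, ulp_distance (-0x1.f5435146bb61ap-4,
-0x1.f5435146bb61cp-4) gives 2 for the imaginary part of clog10, and
ulp_distance (-0x1.208cd6e841555p-2, -0x1.208cd6e841554p-2) gives 1 for the
two atan2 results.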

	* sysdeps/ieee754/dbl-64/dla.h [__FP_FAST_FMA] (DLA_FMS): Define
	macro to use __builtin_fma.
	* sysdeps/x86_64/fpu/dla.h: Remove file.
2016-09-30 15:49:51 +00:00
aarch64 Add femode_t functions: aarch64. 2016-09-07 16:41:20 +00:00
alpha Add femode_t functions: alpha. 2016-09-07 16:42:19 +00:00
arm Installed header hygiene (BZ#20366): Test of installed headers. 2016-09-23 08:43:56 -04:00
generic Installed-header hygiene (BZ#20366): stack_t. 2016-09-23 08:43:56 -04:00
gnu Installed-header hygiene (BZ#20366): obsolete BSD u_* types. 2016-09-23 08:43:56 -04:00
hppa Add femode_t functions: hppa. 2016-09-07 16:43:43 +00:00
i386 Installed-header hygiene (BZ#20366): stack_t. 2016-09-23 08:43:56 -04:00
ia64 Remove the ptw-% patterns 2016-09-14 16:02:06 +02:00
ieee754 Use __builtin_fma more in dbl-64 code. 2016-09-30 15:49:51 +00:00
init_array Update copyright dates with scripts/update-copyrights. 2016-01-04 16:05:18 +00:00
m68k Installed-header hygiene (BZ#20366): stack_t. 2016-09-23 08:43:56 -04:00
mach Installed-header hygiene (BZ#20366): stack_t. 2016-09-23 08:43:56 -04:00
microblaze Add femode_t functions. 2016-09-07 16:40:09 +00:00
mips Installed header hygiene (BZ#20366): Test of installed headers. 2016-09-23 08:43:56 -04:00
nacl Remove the ptw-% patterns 2016-09-14 16:02:06 +02:00
nios2 Add femode_t functions. 2016-09-07 16:40:09 +00:00
nptl Installed-header hygiene (BZ#20366): time.h types. 2016-09-23 08:43:56 -04:00
posix hurd: fix fcntl visibility 2016-09-18 23:48:55 +02:00
powerpc powerpc: Fix POWER9 implies 2016-09-19 09:35:38 -03:00
pthread Installed-header hygiene (BZ#20366): time.h types. 2016-09-23 08:43:56 -04:00
s390 Remove the ptw-% patterns 2016-09-14 16:02:06 +02:00
sh Add femode_t functions: sh. 2016-09-07 16:48:08 +00:00
sparc Remove remnants of .og patterns 2016-09-20 12:18:13 +02:00
tile Add femode_t functions. 2016-09-07 16:40:09 +00:00
unix Add iscanonical. 2016-09-30 00:27:50 +00:00
wordsize-32 Update copyright dates with scripts/update-copyrights. 2016-01-04 16:05:18 +00:00
wordsize-64 Update copyright dates with scripts/update-copyrights. 2016-01-04 16:05:18 +00:00
x86 Installed header hygiene (BZ#20366): Test of installed headers. 2016-09-23 08:43:56 -04:00
x86_64 Use __builtin_fma more in dbl-64 code. 2016-09-30 15:49:51 +00:00