5eebc49d2d
The x87 fyl2xp1 emulation is currently based around conversion to double. This is inherently unsuitable for a good emulation of any floatx80 operation, even before considering that it is a particularly naive implementation using double (adding 1 then using log rather than attempting a better emulation using log1p).

Reimplement using the soft-float operations, as was done for f2xm1; as in that case, m68k has related operations but not exactly this one and it seemed safest to implement directly rather than reusing the m68k code to avoid accumulation of errors.

A test is included with many randomly generated inputs. The assumption of the test is that the result in round-to-nearest mode should always be one of the two closest floating-point numbers to the mathematical value of y * log2(x + 1); the implementation aims to do somewhat better than that (about 70 correct bits before rounding). I haven't investigated how accurate hardware is.

Intel manuals describe a narrower range of valid arguments to this instruction than AMD manuals. The implementation accepts the wider range (it's needed anyway for the core code to be reusable in a subsequent patch reimplementing fyl2x), but the test only has inputs in the narrower range so that it's valid on hardware that may reject or produce poor results for inputs outside that range.

Code in the previous implementation that sets C2 for some out-of-range arguments is not carried forward to the new implementation; C2 is undefined for this instruction and I suspect that code was just cut-and-pasted from the trigonometric instructions (fcos, fptan, fsin, fsincos) where C2 *is* defined to be set for out-of-range arguments.

Signed-off-by: Joseph Myers <joseph@codesourcery.com>
Message-Id: <alpine.DEB.2.21.2006172320190.20587@digraph.polyomino.org.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
||
---|---|---|
.. | ||
alpha | ||
arm | ||
cris | ||
hppa | ||
i386 | ||
lm32 | ||
m68k | ||
microblaze | ||
mips | ||
moxie | ||
nios2 | ||
openrisc | ||
ppc | ||
riscv | ||
rx | ||
s390x | ||
sh4 | ||
sparc | ||
tilegx | ||
tricore | ||
unicore32 | ||
xtensa |