Motorola treats denormals with explicit integer bit set as
having unbiased exponent 0, unlike Intel which treats it as
having unbiased exponent 1 (more like all other IEEE formats
that have no explicit integer bit).
Add a flag on FloatFmt to differentiate the behaviour.
Reported-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We missed these functions when upstreaming the bfloat16 support.
Signed-off-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Message-Id: <20230531065458.2082-1-zhiwei_liu@linux.alibaba.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Add versions of float64_to_int* which do not saturate the result.
Reviewed-by: Christoph Muellner <christoph.muellner@vrull.eu>
Tested-by: Christoph Muellner <christoph.muellner@vrull.eu>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230527141910.1885950-2-richard.henderson@linaro.org>
The float32_exp2 function is computing wrong exponent of 2.
For example, with the following set of values {0.1, 2.0, 2.0, -1.0},
the expected output would be {1.071773, 4.000000, 4.000000, 0.500000}.
Instead, the function is computing {1.119102, 3.382044, 3.382044, -0.191022}
Looking at the code, the float32_exp2() attempts to do this
2 3 4 5 n
x x x x x x x
e = 1 + --- + --- + --- + --- + --- + ... + --- + ...
1! 2! 3! 4! 5! n!
But because of the typo it ends up doing
x x x x x x x
e = 1 + --- + --- + --- + --- + --- + ... + --- + ...
1! 2! 3! 4! 5! n!
This is because instead of the xnp which holds the numerator, parts_muladd
is using the xp which is just 'x'. Commit '572c4d862ff2' refactored this
function, and mistakenly used xp instead of xnp.
Cc: qemu-stable@nongnu.org
Fixes: 572c4d862ff2 "softfloat: Convert float32_exp2 to FloatParts"
Partially-Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1623
Reported-By: Luca Barbato (https://gitlab.com/lu-zero)
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Message-Id: <168304110865.537992.13059030916325018670.stgit@localhost.localdomain>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Added the possibility of recalculating a result if it overflows or
underflows, if the result overflow and the rebias bool is true then the
intermediate result should have 3/4 of the total range subtracted from
the exponent. The same for underflow but it should be added to the
exponent of the intermediate number instead.
Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220805141522.412864-2-lucas.araujo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Since the caller, partsN_compare, is now exclusively
using FloatRelation, it's clearer to use it here too.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20220401132240.79730-4-richard.henderson@linaro.org>
The declaration used 'int', while the definition used 'FloatRelation'.
This should have resulted in a compiler error, but mysteriously didn't.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20220401132240.79730-2-richard.henderson@linaro.org>
Based on parts_sint_to_float, implements int128_to_float128 to convert a
signed 128-bit value received through an Int128 argument.
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Message-Id: <20220330175932.6995-5-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Based on parts_uint_to_float, implements uint128_to_float128 to convert
an unsigned 128-bit value received through an Int128 argument.
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220330175932.6995-4-matheus.ferst@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
These variants take a float64 as input, compute the result to
infinite precision (as we do with FloatParts), round the result
to the precision and dynamic range of float32, and then return
the result in the format of float64.
This is the operation PowerPC requires for its float32 operations.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211119160502.17432-28-richard.henderson@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
PowerPC has this flag, and it's easier to compute it here
than after the fact.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211119160502.17432-8-richard.henderson@linaro.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
For "fmax/fmin ft0, ft1, ft2" and if one of the inputs is sNaN,
The original logic:
Return NaN and set invalid flag if ft1 == sNaN || ft2 == sNan.
The alternative path:
Set invalid flag if ft1 == sNaN || ft2 == sNaN.
Return NaN only if ft1 == NaN && ft2 == NaN.
The IEEE 754 spec allows both implementation and some architecture such
as riscv choose different defintions in two spec versions.
(riscv-spec-v2.2 use original version, riscv-spec-20191213 changes to
alternative)
Signed-off-by: Chih-Min Chao <chihmin.chao@sifive.com>
Signed-off-by: Frank Chang <frank.chang@sifive.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211021160847.2748577-2-frank.chang@sifive.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210614233143.1221879-3-richard.henderson@linaro.org>
Typo in the conversion to FloatParts64.
Fixes: 572c4d862ff2
Fixes: Coverity CID 1457457
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210607223812.110596-1-richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
For the normal case of no additional scaling, this reduces the
profile contribution of int64_to_float64 to the testcase in the
linked issue from 0.81% to 0.04%.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/134
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Rename to parts$N_modrem. This was the last use of a lot
of the legacy infrastructure, so remove it as required.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Rename to parts$N_log2. Though this is partly a ruse, since I do not
believe the code will succeed for float128 without work. Which is ok
for now, because we do not need this for more than float32 and float64.
Since berkeley-testfloat-3 doesn't support log2, compare float64_log2
vs the system log2. Fix the errors for inputs near 1.0:
test: 3ff00000000000b0 +0x1.00000000000b0p+0
sf: 3d2fa00000000000 +0x1.fa00000000000p-45
libm: 3d2fbd422b1bd36f +0x1.fbd422b1bd36fp-45
Error in fraction: 32170028290927 ulp
test: 3feec24f6770b100 +0x1.ec24f6770b100p-1
sf: bfad3740d13c9ec0 -0x1.d3740d13c9ec0p-5
libm: bfad3740d13c9e98 -0x1.d3740d13c9e98p-5
Error in fraction: 40 ulp
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Keep the intermediate results in FloatParts instead of
converting back and forth between float64. Use muladd
instead of separate mul+add.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This is the last use of commonNaNT and all of the routines
that use it, so remove all of them for Werror.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Since this is the first such, this includes all of the
packing and unpacking routines as well.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use an enumeration instead of raw 32/64/80 values.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Remove frac_lsb, frac_lsbm1, roundeven_mask. Compute
these from round_mask in parts$N_uncanon_normal.
With floatx80, round_mask will not be tied to frac_shift.
Everything else is easily computable.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We will need to treat the non-normal cases of floatx80 specially,
so split out the normal case that we can reuse.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Rename to parts$N_sqrt.
Reimplement float128_sqrt with FloatParts128.
Reimplement with the inverse sqrt newton-raphson algorithm from musl.
This is significantly faster than even the berkeley sqrt n-r algorithm,
because it does not use division instructions, only multiplication.
Ordinarily, changing algorithms at the same time as migrating code is
a bad idea, but this is the only way I found that didn't break one of
the routines at the same time.
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Rename to parts$N_scalbn.
Reimplement float128_scalbn with FloatParts128.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Rename to parts$N_compare. Rename all of the intermediate
functions to ftype_do_compare. Rename the hard-float functions
to ftype_hs_compare. Convert float128 to FloatParts128.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The float128 implementation is straight-forward.
Unfortuantely, we don't have any tests we can simply adjust/unlock.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210517142739.38597-24-david@redhat.com>
[rth: Update for changed parts_minmax return value]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Rename to parts$N_minmax. Combine 3 bool arguments to a bitmask.
Introduce ftype_minmax functions as a common optimization point.
Fold bfloat16 expansions into the same macro as the other types.
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Rename to parts$N_uint_to_float.
Reimplement uint64_to_float128 with FloatParts128.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Rename to parts$N_sint_to_float.
Reimplement int{32,64}_to_float128 with FloatParts128.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Rename to parts$N_float_to_uint. Reimplement
float128_to_uint{32,64}{_round_to_zero} with FloatParts128.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Rename to parts$N_float_to_sint. Reimplement
float128_to_int{32,64}{_round_to_zero} with FloatParts128.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
At the same time, convert to pointers, split out
parts$N_round_to_int_normal, define a macro for
parts_round_to_int using QEMU_GENERIC.
This necessarily meant some rearrangement to the
rount_to_{,u}int_and_pack routines, so go ahead and
convert to parts_round_to_int_normal, which in turn
allows cleaning up of the raised exception handling.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Introduce parts_float_to_float_widen and parts_float_to_float_narrow.
Use them for float128_to_float{32,64} and float{32,64}_to_float128.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Split out parts_float_to_ahp and parts_float_to_float.
Convert to pointers.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Rename to parts$N_div.
Implement float128_div with FloatParts128.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Have x86_64 assembly for them, with a fallback.
This avoids shuffling values through %cl in the x86 case.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Rename to parts$N_muladd.
Implement float128_muladd with FloatParts128.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>