qemu-e2k

Author	SHA1	Message	Date
Paolo Bonzini	2872b0f390	target/i386: implement FMA instructions The only issue with FMA instructions is that there are _a lot_ of them (30 opcodes, each of which comes in up to 4 versions depending on VEX.W and VEX.L; a total of 96 possibilities). However, they can be implement with only 6 helpers, two for scalar operations and four for packed operations. (Scalar versions do not do any merging; they only affect the bottom 32 or 64 bits of the output operand. Therefore, there is no separate XMM and YMM of the scalar helpers). First, we can reduce the number of helpers to one third by passing four operands (one output and three inputs); the reordering of which operands go to the multiply and which go to the add is done in emit.c. Second, the different instructions also dispatch to the same softfloat function, so the flags for float32_muladd and float64_muladd are passed in the helper as int arguments, with a little extra complication to handle FMADDSUB and FMSUBADD. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-10-22 09:05:54 +02:00
Paolo Bonzini	cf5ec6641e	target/i386: implement F16C instructions F16C only consists of two instructions, which are a bit peculiar nevertheless. First, they access only the low half of an YMM or XMM register for the packed-half operand; the exact size still depends on the VEX.L flag. This is similar to the existing avx_movx flag, but not exactly because avx_movx is hardcoded to affect operand 2. To this end I added a "ph" format name; it's possible to reuse this approach for the VPMOVSX and VPMOVZX instructions, though that would also require adding two more formats for the low-quarter and low-eighth of an operand. Second, VCVTPS2PH is somewhat weird because it stores the result of the instruction into memory rather than loading it. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-10-20 15:16:18 +02:00
Paolo Bonzini	0339ddfa75	tests/tcg: extend SSE tests to AVX Extracted from a patch by Paul Brook <paul@nowt.org>. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-10-18 13:58:05 +02:00
Paolo Bonzini	e02907cc12	tests/tcg: refine MMX support in SSE tests Extend the support to memory operands, and skip MMX instructions that were introduced in SSE times, because they are now covered in test-mmx. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-19 15:15:59 +02:00
Paolo Bonzini	fa7ce0b028	tests/tcg: i386: add MMX and 3DNow! tests Adjust the test-avx.py generator to produce tests specifically for MMX and 3DNow. Using a separate generator introduces some code duplication, but is a simpler approach because of test-avx's extra complexity to support 3- and 4-operand AVX instructions. If needed, a common library can be introduced later. While at it, for consistency move all the -cpu max rules to the same place. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-19 15:14:40 +02:00
Paul Brook	91117bc546	tests/tcg: i386: add SSE tests Tests for correct operation of most x86-64 SSE instructions. It should cover all combinations of overlapping register and memory operands on a set of random-ish data. Results are bit-identical to an Intel i5-8500, with the exception of the RCPSS and RSQRT approximations where the real CPU gives less accurate results (the Intel spec allows relative errors up to 1.5 * 2^-12) Signed-off-by: Paul Brook <paul@nowt.org> Acked-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20220424220204.2493824-42-paul@nowt.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-01 20:16:33 +02:00

6 Commits