qemu-e2k

Commit Graph

Author	SHA1	Message	Date
Richard Henderson	3e114acc91	target/ppc: Use tcg_gen_gvec_rotlv Acked-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-06-02 08:42:37 -07:00
Richard Henderson	36af59d062	target/ppc: Use tcg_gen_gvec_dup_imm We can now unify the implementation of the 3 VSPLTI instructions. Acked-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-05-06 09:25:01 -07:00
BALATON Zoltan	92eeb004e8	target/ppc: Fix typo in comments "Deferred" was misspelled as "differed" in some comments; correct this typo. Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu> Message-Id: <20200214155748.0896B745953@zero.eik.bme.hu> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-02-21 09:15:04 +11:00
Stefan Brankovic	8d745875c2	target/ppc: Fix for optimized vsl/vsr instructions In previous implementation, invocation of TCG shift function could request shift of TCG variable by 64 bits when variable 'sh' is 0, which is not supported in TCG (values can be shifted by 0 to 63 bits). This patch fixes this by using two separate invocations of TCG shift functions, with maximum shift amount of 32. Name of variable 'shifted' is changed to 'carry' so variable naming is similar to old helper implementation. Variables 'avrA' and 'avrB' are replaced with variable 'avr'. Fixes: `4e6d0920e7` Reported-by: "Paul A. Clark" <pc@us.ibm.com> Reported-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Suggested-by: Aleksandar Markovic <aleksandar.markovic@rt-rk.com> Signed-off-by: Stefan Brankovic <stefan.brankovic@rt-rk.com> Message-Id: <1570196639-7025-2-git-send-email-stefan.brankovic@rt-rk.com> Tested-by: Paul A. Clarke <pc@us.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-10-24 09:36:55 +11:00
Paul A. Clarke	bc7a45ab88	ppc: Add support for 'mffsce' instruction ISA 3.0B added a set of Floating-Point Status and Control Register (FPSCR) instructions: mffsce, mffscdrn, mffscdrni, mffscrn, mffscrni, mffsl. This patch adds support for 'mffsce' instruction. 'mffsce' is identical to 'mffs', except that it also clears the exception enable bits in the FPSCR. On CPUs without support for 'mffsce' (below ISA 3.0), the instruction will execute identically to 'mffs'. Signed-off-by: Paul A. Clarke <pc@us.ibm.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <1568817082-1384-1-git-send-email-pc@us.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-10-04 10:25:23 +10:00
Paul A. Clarke	a2735cf483	ppc: Add support for 'mffscrn','mffscrni' instructions ISA 3.0B added a set of Floating-Point Status and Control Register (FPSCR) instructions: mffsce, mffscdrn, mffscdrni, mffscrn, mffscrni, mffsl. This patch adds support for 'mffscrn' and 'mffscrni' instructions. 'mffscrn' and 'mffscrni' are similar to 'mffsl', except they do not return the status bits (FI, FR, FPRF) and they also set the rounding mode in the FPSCR. On CPUs without support for 'mffscrn'/'mffscrni' (below ISA 3.0), the instructions will execute identically to 'mffs'. Signed-off-by: Paul A. Clarke <pc@us.ibm.com> Message-Id: <1568817081-1345-1-git-send-email-pc@us.ibm.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-10-04 10:25:23 +10:00
Stefan Brankovic	897b639789	target/ppc: Refactor emulation of vmrgew and vmrgow instructions Since I found these two instructions implemented with tcg, I refactored them so they are consistent with other similar implementations that I introduced in this patch. Also, a new dual macro GEN_VXFORM_TRANS_DUAL is added. This macro is used if one instruction is realized with direct translation, and the second one with a helper. Signed-off-by: Stefan Brankovic <stefan.brankovic@rt-rk.com> Message-Id: <1566898663-25858-4-git-send-email-stefan.brankovic@rt-rk.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-08-29 09:46:07 +10:00
Paul A. Clarke	256be7d07a	ppc: Fix xsmaddmdp and friends A class of instructions of the form: op Target,A,B which operate like: Target = Target * A + B have a bit set which distinguishes them from instructions that operate as: Target = Target * B + A This bit is not being checked properly (using PPC_BIT macro), so all instructions in this class are operating incorrectly as the second form above. The bit was being checked as if it were part of a 64-bit instruction opcode, rather than a proper 32-bit opcode. Fix by using the macro (PPC_BIT32) which treats the opcode as a 32-bit quantity. Fixes: `c9f4e4d8b6` ("target/ppc: improve VSX_FMADD with new GEN_VSX_HELPER_VSX_MADD macro") Signed-off-by: Paul A. Clarke <pc@us.ibm.com> Message-Id: <1566401321-22419-1-git-send-email-pc@us.ibm.com> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Tested-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-08-29 09:46:07 +10:00
Paul A. Clarke	31eb7dddac	ppc: Add support for 'mffsl' instruction ISA 3.0B added a set of Floating-Point Status and Control Register (FPSCR) instructions: mffsce, mffscdrn, mffscdrni, mffscrn, mffscrni, mffsl. This patch adds support for 'mffsl'. 'mffsl' is identical to 'mffs', except it only returns mode, status, and enable bits from the FPSCR. On CPUs without support for 'mffsl' (below ISA 3.0), the 'mffsl' instruction will execute identically to 'mffs'. Note: I renamed FPSCR_RN to FPSCR_RN0 so I could create an FPSCR_RN mask which is both bits of the FPSCR rounding mode, as defined in the ISA. I also fixed a typo in the definition of FPSCR_FR. Signed-off-by: Paul A. Clarke <pc@us.ibm.com> v4: - nit: added some braces to resolve a checkpatch complaint. v3: - Changed tcg_gen_and_i64 to tcg_gen_andi_i64, eliminating the need for a temporary, per review from Richard Henderson. v2: - I found that I copied too much of the 'mffs' implementation. The 'Rc' condition code bits are not needed for 'mffsl'. Removed. - I now free the (renamed) 'tmask' temporary. - I now bail early for older ISA to the original 'mffs' implementation. Message-Id: <1565982203-11048-1-git-send-email-pc@us.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-08-21 17:17:39 +10:00
Stefan Brankovic	1872588ede	target/ppc: Optimize emulation of vclzw instruction Optimize Altivec instruction vclzw (Vector Count Leading Zeros Word). This instruction counts the number of leading zeros of each word element in source register and places result in the appropriate word element of destination register. Counting is to be performed in four iterations of for loop (one for each word element of source register vB). Every iteration consists of loading appropriate word element from source register, counting leading zeros with tcg_gen_clzi_i32, and saving the result in appropriate word element of destination register. Signed-off-by: Stefan Brankovic <stefan.brankovic@rt-rk.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <1563200574-11098-7-git-send-email-stefan.brankovic@rt-rk.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-08-21 17:17:11 +10:00
Stefan Brankovic	b8313f0d91	target/ppc: Optimize emulation of vclzd instruction Optimize Altivec instruction vclzd (Vector Count Leading Zeros Doubleword). This instruction counts the number of leading zeros of each doubleword element in source register and places result in the appropriate doubleword element of destination register. Using tcg-s count leading zeros instruction two times(once for each doubleword element of source register vB) and placing result in appropriate doubleword element of destination register vD. Signed-off-by: Stefan Brankovic <stefan.brankovic@rt-rk.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <1563200574-11098-6-git-send-email-stefan.brankovic@rt-rk.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-08-21 17:17:11 +10:00
Stefan Brankovic	083b3f012f	target/ppc: Optimize emulation of vgbbd instruction Optimize altivec instruction vgbbd (Vector Gather Bits by Bytes by Doubleword). All ith bits (i in range 1 to 8) of each byte of doubleword element in source register are concatenated and placed into ith byte of appropriate doubleword element in destination register. Following solution is done for both doubleword elements of source register in parallel, in order to reduce the number of instructions needed (that's why arrays are used): First, both doubleword elements of source register vB are placed in appropriate element of array avr. Bits are gathered in 2x8 iterations (2 for loops). In first iteration bit 1 of byte 1, bit 2 of byte 2,... bit 8 of byte 8 are in their final spots so avr[i], i={0,1} can be and-ed with tcg_mask. For every following iteration, both avr[i] and tcg_mask variables have to be shifted right for 7 and 8 places, respectively, in order to get bit 1 of byte 2, bit 2 of byte 3.. bit 7 of byte 8 in their final spots so shifted avr values (saved in tmp) can be and-ed with new value of tcg_mask... After the first 8 iterations (first loop), all the first bits are in their final places, all second bits but the second bit from the eighth byte are in their places... only one eighth bit, the one from the eighth byte, is in its place. In second loop we do all operations symmetrically, in order to get other half of bits in their final spots. Results for first and second doubleword elements are saved in result[0] and result[1] respectively. In the end those results are saved in appropriate doubleword element of destination register vD. Signed-off-by: Stefan Brankovic <stefan.brankovic@rt-rk.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <1563200574-11098-5-git-send-email-stefan.brankovic@rt-rk.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-08-21 17:17:11 +10:00
Stefan Brankovic	4e6d0920e7	target/ppc: Optimize emulation of vsl and vsr instructions Optimization of altivec instructions vsl and vsr (Vector Shift Left/Right). Perform shift operation (left and right respectively) on 128 bit value of register vA by value specified in bits 125-127 of register vB. Lowest 3 bits in each byte element of register vB must be identical or result is undefined. For vsl instruction, the first step is bits 125-127 of register vB have to be saved in variable sh. Then, the highest sh bits of the lower doubleword element of register vA are saved in variable shifted, in order not to lose those bits when shift operation is performed on the lower doubleword element of register vA, which is the next step. After shifting the lower doubleword element shift operation is performed on higher doubleword element of vA, with replacement of the lowest sh bits (that are now 0) with bits saved in shifted. For vsr instruction, firstly, the bits 125-127 of register vB have to be saved in variable sh. Then, the lowest sh bits of the higher doubleword element of register vA are saved in variable shifted, in order not to lose those bits when the shift operation is performed on the higher doubleword element of register vA, which is the next step. After shifting higher doubleword element, shift operation is performed on lower doubleword element of vA, with replacement of highest sh bits (that are now 0) with bits saved in shifted. Signed-off-by: Stefan Brankovic <stefan.brankovic@rt-rk.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <1563200574-11098-3-git-send-email-stefan.brankovic@rt-rk.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-08-21 17:17:11 +10:00
Stefan Brankovic	1cc792698e	target/ppc: Optimize emulation of lvsl and lvsr instructions Adding simple macro that is calling tcg implementation of appropriate instruction if altivec support is active. Optimization of altivec instruction lvsl (Load Vector for Shift Left). Place bytes sh:sh+15 of value 0x00 || 0x01 || 0x02 || ... || 0x1E || 0x1F in destination register. Sh is calculated by adding 2 source registers and getting bits 60-63 of result. First, the bits [28-31] are placed from EA to variable sh. After that, the bytes are created in the following way: sh:(sh+7) of X(from description) by multiplying sh with 0x0101010101010101 followed by addition of the result with 0x0001020304050607. Value obtained is placed in higher doubleword element of vD. (sh+8):(sh+15) by adding the result of previous multiplication with 0x08090a0b0c0d0e0f. Value obtained is placed in lower doubleword element of vD. Optimization of altivec instruction lvsr (Load Vector for Shift Right). Place bytes 16-sh:31-sh of value 0x00 || 0x01 || 0x02 || ... || 0x1E || 0x1F in destination register. Sh is calculated by adding 2 source registers and getting bits 60-63 of result. First, the bits [28-31] are placed from EA to variable sh. After that, the bytes are created in the following way: sh:(sh+7) of X(from description) by multiplying sh with 0x0101010101010101 followed by subtraction of the result from 0x1011121314151617. Value obtained is placed in higher doubleword element of vD. (sh+8):(sh+15) by subtracting the result of previous multiplication from 0x18191a1b1c1d1e1f. Value obtained is placed in lower doubleword element of vD. Signed-off-by: Stefan Brankovic <stefan.brankovic@rt-rk.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <1563200574-11098-2-git-send-email-stefan.brankovic@rt-rk.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-08-21 17:17:11 +10:00
Mark Cave-Ayland	c9f4e4d8b6	target/ppc: improve VSX_FMADD with new GEN_VSX_HELPER_VSX_MADD macro Introduce a new GEN_VSX_HELPER_VSX_MADD macro for the generator function which enables the source and destination registers to be decoded at translation time. This enables the determination of a or m form to be made at translation time so that a single helper function can now be used for both variants. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20190616123751.781-16-mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-07-02 09:43:58 +10:00
Mark Cave-Ayland	5ba5335d93	target/ppc: decode target register in VSX_EXTRACT_INSERT at translation time Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20190616123751.781-15-mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-07-02 09:43:58 +10:00
Mark Cave-Ayland	2aba168e50	target/ppc: decode target register in VSX_VECTOR_LOAD_STORE_LENGTH at translation time Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20190616123751.781-14-mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-07-02 09:43:58 +10:00
Mark Cave-Ayland	6ae4a57ab0	target/ppc: introduce GEN_VSX_HELPER_R2_AB macro to fpu_helper.c Rather than perform the VSR register decoding within the helper itself, introduce a new GEN_VSX_HELPER_R2_AB macro which performs the decode based upon rA and rB at translation time. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20190616123751.781-13-mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-07-02 09:43:58 +10:00
Mark Cave-Ayland	9922962011	target/ppc: introduce GEN_VSX_HELPER_R2 macro to fpu_helper.c Rather than perform the VSR register decoding within the helper itself, introduce a new GEN_VSX_HELPER_R2 macro which performs the decode based upon rD and rB at translation time. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20190616123751.781-12-mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-07-02 09:43:58 +10:00
Mark Cave-Ayland	23d0766bd9	target/ppc: introduce GEN_VSX_HELPER_R3 macro to fpu_helper.c Rather than perform the VSR register decoding within the helper itself, introduce a new GEN_VSX_HELPER_R3 macro which performs the decode based upon rD, rA and rB at translation time. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20190616123751.781-11-mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-07-02 09:43:58 +10:00
Mark Cave-Ayland	8d830485fc	target/ppc: introduce GEN_VSX_HELPER_X1 macro to fpu_helper.c Rather than perform the VSR register decoding within the helper itself, introduce a new GEN_VSX_HELPER_X1 macro which performs the decode based upon xB at translation time. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20190616123751.781-10-mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-07-02 09:43:58 +10:00
Mark Cave-Ayland	033e1fcd97	target/ppc: introduce GEN_VSX_HELPER_X2_AB macro to fpu_helper.c Rather than perform the VSR register decoding within the helper itself, introduce a new GEN_VSX_HELPER_X2_AB macro which performs the decode based upon xA and xB at translation time. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20190616123751.781-9-mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-07-02 09:43:58 +10:00
Mark Cave-Ayland	75cf84cbee	target/ppc: introduce GEN_VSX_HELPER_X2 macro to fpu_helper.c Rather than perform the VSR register decoding within the helper itself, introduce a new GEN_VSX_HELPER_X2 macro which performs the decode based upon xT and xB at translation time. With the previous change to the xscvqpdp generator and helper functions the opcode parameter is no longer required in the common case and can be removed. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20190616123751.781-8-mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-07-02 09:43:58 +10:00
Mark Cave-Ayland	e0d6a362be	target/ppc: introduce separate generator and helper for xscvqpdp Rather than perform the VSR register decoding within the helper itself, introduce a new generator and helper function which perform the decode based upon xT and xB at translation time. The xscvqpdp helper is the only 2 parameter xT/xB implementation that requires the opcode to be passed as an additional parameter, so handling this separately allows us to optimise the conversion in the next commit. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20190616123751.781-7-mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-07-02 09:43:58 +10:00
Mark Cave-Ayland	99125c7499	target/ppc: introduce GEN_VSX_HELPER_X3 macro to fpu_helper.c Rather than perform the VSR register decoding within the helper itself, introduce a new GEN_VSX_HELPER_X3 macro which performs the decode based upon xT, xA and xB at translation time. With the previous changes to the VSX_CMP generator and helper macros the opcode parameter is no longer required in the common case and can be removed. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20190616123751.781-6-mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-07-02 09:43:58 +10:00
Mark Cave-Ayland	00084a25ad	target/ppc: introduce separate VSX_CMP macro for xvcmp* instructions Rather than perform the VSR register decoding within the helper itself, introduce a new VSX_CMP macro which performs the decode based upon xT, xA and xB at translation time. Subsequent commits will make the same changes for other instructions however the xvcmp* instructions are different in that they return a set of flags to be optionally written back to the crf[6] register. Move this logic from the helper function to the generator function, along with the float_status update. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20190616123751.781-5-mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-07-02 09:43:58 +10:00
Richard Henderson	fe2d169614	target/ppc: Use tcg_gen_gvec_bitsel Replace the target-specific implementation of XXSEL. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20190603164927.8336-1-richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-06-12 10:41:50 +10:00
Anton Blanchard	2a12243590	target/ppc: Fix lxvw4x, lxvh8x and lxvb16x During the conversion these instructions were incorrectly treated as stores. We need to use set_cpu_vsr* and not get_cpu_vsr*. Fixes: `8b3b2d75c7` ("introduce get_cpu_vsr{l,h}() and set_cpu_vsr{l,h}() helpers for VSR register access") Signed-off-by: Anton Blanchard <anton@ozlabs.org> Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Tested-by: Greg Kurz <groug@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Message-Id: <20190524065345.25591-1-mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-06-12 10:41:49 +10:00
Richard Henderson	571fbe6ccd	target/ppc: Use vector variable shifts for VSL, VSR, VSRA The gvec expanders take care of masking the shift amount against the element width. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20190518191430.21686-2-richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-05-29 11:39:45 +10:00
Anton Blanchard	77bd8937c0	target/ppc: Fix xvabs[sd]p, xvnabs[sd]p, xvneg[sd]p, xvcpsgn[sd]p We were using set_cpu_vsr() when we should have used get_cpu_vsr(). Fixes: `8b3b2d75c7` ("introduce get_cpu_vsr{l,h}() and set_cpu_vsr{l,h}() helpers for VSR register access") Signed-off-by: Anton Blanchard <anton@ozlabs.org> Message-Id: <20190509104912.6b754dff@kryten> Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-05-29 11:39:45 +10:00
Anton Blanchard	e04c5dd139	target/ppc: Optimise VSX_LOAD_SCALAR_DS and VSX_VECTOR_LOAD_STORE A few small optimisations: In VSX_LOAD_SCALAR_DS() we don't need to read the VSR via get_cpu_vsrh(). Split VSX_VECTOR_LOAD_STORE() into two functions. Loads only need to write the VSRs (set_cpu_vsr()) and stores only need to read the VSRs (get_cpu_vsr()). Thanks to Mark Cave-Ayland for the suggestions. Signed-off-by: Anton Blanchard <anton@ozlabs.org> Message-Id: <20190509103545.4a7fa71a@kryten> Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-05-29 11:39:45 +10:00
Anton Blanchard	4c406ca734	target/ppc: Fix xxspltib xxspltib raises a VMX or a VSX exception depending on the register set it is operating on. We had a check, but it was backwards. Fixes: `f113283525` ("target-ppc: add xxspltib instruction") Signed-off-by: Anton Blanchard <anton@ozlabs.org> Message-Id: <20190509061713.69490488@kryten> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-05-29 11:39:44 +10:00
Anton Blanchard	d47a751ada	target/ppc: Fix xxbrq, xxbrw Fix a typo in xxbrq and xxbrw where we put both results into the lower doubleword. Fixes: `8b3b2d75c7` ("introduce get_cpu_vsr{l,h}() and set_cpu_vsr{l,h}() helpers for VSR register access") Signed-off-by: Anton Blanchard <anton@ozlabs.org> Message-Id: <20190507004811.29968-3-anton@ozlabs.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-05-29 11:39:44 +10:00
Anton Blanchard	cf4e9363f7	target/ppc: Fix xvxsigdp Fix a typo in xvxsigdp where we put both results into the lower doubleword. Fixes: `dd977e4f45` ("target/ppc: Optimize x[sv]xsigdp using deposit_i64()") Signed-off-by: Anton Blanchard <anton@ozlabs.org> Message-Id: <20190507004811.29968-1-anton@ozlabs.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-05-29 11:39:44 +10:00
Philippe Mathieu-Daudé	d577dbaac7	target/ppc: Use tcg_gen_abs_i32 Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20190423102145.14812-2-f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-13 22:52:08 +00:00
Richard Henderson	53229a7703	tcg: Specify optional vector requirements with a list Replace the single opcode in .opc with a null-terminated array in .opt_opc. We still require that all opcodes be used with the same .vece. Validate the contents of this list with CONFIG_DEBUG_TCG. All tcg_gen_*_vec functions will check any list active during .fniv expansion. Swap the active list in and out as we expand other opcodes, or take control away from the front-end function. Convert all existing vector aware front ends. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-13 14:44:03 -07:00
David Gibson	eb512d15a0	target/ppc: Style fixes for translate/spe-impl.inc.c Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org>	2019-04-26 11:37:57 +10:00
David Gibson	3255386633	target/ppc: Style fixes for translate/vmx-impl.inc.c Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org>	2019-04-26 11:37:57 +10:00
David Gibson	34b2300cbb	target/ppc: Style fixes for translate/vsx-impl.inc.c Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org>	2019-04-26 11:37:57 +10:00
David Gibson	f895d2c820	target/ppc: Style fixes for translate/fp-impl.inc.c Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org>	2019-04-26 11:37:57 +10:00
Greg Kurz	3e5365b7aa	target/ppc: Fix QEMU crash with stxsdx I've been hitting several QEMU crashes while running a fedora29 ppc64le guest under TCG. Each time, this would occur several minutes after the guest reached login: Fedora 29 (Twenty Nine) Kernel 4.20.6-200.fc29.ppc64le on an ppc64le (hvc0) Web console: https://localhost:9090/ localhost login: tcg/tcg.c:3211: tcg fatal error This happens because a bug crept up in the gen_stxsdx() helper when it was converted to use VSR register accessors by commit `8b3b2d75c7` "target/ppc: introduce get_cpu_vsr{l,h}() and set_cpu_vsr{l,h}() helpers for VSR register access". The code creates a temporary, passes it directly to gen_qemu_st64_i64() and then to set_cpu_vrsh()... which looks like this was mistakenly coded as a load instead of a store. Reverse the logic: read the VSR to the temporary first and then store it to memory. Fixes: `8b3b2d75c7` Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <155371035249.2038502.12364252604337688538.stgit@bahia.lan> Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-03-29 10:22:22 +11:00
Philippe Mathieu-Daudé	dd977e4f45	target/ppc: Optimize x[sv]xsigdp using deposit_i64() Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20190309214255.9952-3-f4bug@amsat.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-03-12 14:33:05 +11:00
Philippe Mathieu-Daudé	cde0a41c12	target/ppc: Optimize xviexpdp() using deposit_i64() The t0 tcg_temp register is now unused, remove it. Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20190309214255.9952-2-f4bug@amsat.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-03-12 14:33:05 +11:00
Mark Cave-Ayland	d59d1182b1	target/ppc: introduce vsr64_offset() to simplify get_cpu_vsr{l,h}() and set_cpu_vsr{l,h}() Now that all VSX registers are stored in host endian order, there is no need to go via different accessors depending upon the register number. Instead we introduce vsr64_offset() and use it directly from within get_cpu_vsr{l,h}() and set_cpu_vsr{l,h}(). This also allows us to rewrite avr64_offset() and fpr_offset() in terms of the new vsr64_offset() function to more clearly express the relationship between the VSX, FPR and VMX registers, and also remove vsrl_offset() which is no longer required. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Message-Id: <20190307180520.13868-8-mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-03-12 14:33:04 +11:00
Mark Cave-Ayland	37da91f163	target/ppc: improve avr64_offset() and use it to simplify get_avr64()/set_avr64() By using the VsrD macro in avr64_offset() the same offset calculation can be used regardless of the host endian. This allows get_avr64() and set_avr64() to be simplified accordingly. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Message-Id: <20190307180520.13868-6-mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-03-12 14:33:04 +11:00
Mark Cave-Ayland	c82a8a8542	target/ppc: introduce avr_full_offset() function All TCG vector operations require pointers to the base address of the vector rather than separate access to the top and bottom 64-bits. Convert the VMX TCG instructions to use a new avr_full_offset() function instead of avr64_offset() which can then itself be written as a simple wrapper onto vsr_full_offset(). This same function can also be reused in cpu_avr_ptr() to avoid having more than one copy of the offset calculation logic. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Message-Id: <20190307180520.13868-5-mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-03-12 14:33:04 +11:00
Mark Cave-Ayland	45141dfd23	target/ppc: introduce single vsrl_offset() function Instead of having multiple copies of the offset calculation logic, move it to a single vsrl_offset() function. This commit also renames the existing get_vsr()/set_vsr() functions to get_vsrl()/set_vsrl() which better describes their purpose. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20190307180520.13868-3-mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-03-12 14:33:04 +11:00
Richard Henderson	73e14c6a9c	target/ppc: convert vmin* and vmax* to vector operations Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Acked-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20190215100058.20015-18-mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-02-18 11:00:44 +11:00
Richard Henderson	fb11ae7daa	target/ppc: convert vadds and vsubs to vector operations Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Acked-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20190215100058.20015-17-mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-02-18 11:00:44 +11:00
Richard Henderson	cc2b90d725	target/ppc: Add helper_mfvscr This is required before changing the representation of the register. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Acked-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20190215100058.20015-13-mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-02-18 11:00:44 +11:00
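
The lvsl/lvsr commit (1cc792698e) in the list above describes building the permute-control bytes with one multiplication and two additions (or subtractions) instead of a table lookup. The following is a minimal Python sketch of that arithmetic only, not the TCG code from the commit; the function names `lvsl_bytes`, `lvsr_bytes` and `reference` are hypothetical and exist just for this illustration. It models each doubleword as a plain integer with the first byte in the most significant position.

```python
# Sketch of the byte arithmetic described in the lvsl/lvsr commit message.
# All names here are illustrative; this is not QEMU code.

REPLICATE = 0x0101010101010101  # multiplying by this copies sh into every byte


def lvsl_bytes(sh):
    """Return (high, low) doublewords holding bytes sh..sh+15 of 0x00..0x1F."""
    sh &= 0xF                           # only the low four bits of EA are used
    rep = sh * REPLICATE                # sh replicated into all eight bytes
    high = rep + 0x0001020304050607     # bytes sh+0 .. sh+7
    low = rep + 0x08090A0B0C0D0E0F      # bytes sh+8 .. sh+15
    return high, low


def lvsr_bytes(sh):
    """Return (high, low) doublewords holding bytes 16-sh..31-sh of 0x00..0x1F."""
    sh &= 0xF
    rep = sh * REPLICATE
    high = 0x1011121314151617 - rep     # bytes 16-sh .. 23-sh
    low = 0x18191A1B1C1D1E1F - rep      # bytes 24-sh .. 31-sh
    return high, low


def reference(start):
    """Slice 16 bytes out of the 0x00..0x1F table, for comparison."""
    chunk = bytes(range(32))[start:start + 16]
    return int.from_bytes(chunk[:8], "big"), int.from_bytes(chunk[8:], "big")


if __name__ == "__main__":
    for sh in range(16):
        assert lvsl_bytes(sh) == reference(sh)
        assert lvsr_bytes(sh) == reference(16 - sh)
    print("%016x %016x" % lvsl_bytes(3))
```

Because sh is at most 15, no per-byte sum exceeds 0xFF and no per-byte difference goes below zero, so carries and borrows never cross byte boundaries; that is what lets a single 64-bit add or subtract stand in for eight independent byte operations.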

127 Commits