qemu-e2k

Commit Graph

Author	SHA1	Message	Date
Daniel Henrique Barboza	a6f91249e0	target/ppc: PMU: update counters on MMCR1 write MMCR1 determines the events to be sampled by the PMU. Updating the counters at every MMCR1 write ensures that we're not sampling more or less events by looking only at MMCR0 and the PMCs. It is worth noticing that both the Book3S PowerPC PMU, and this IBM Power8+ PMU that we're modeling, also uses MMCRA, MMCR2 and MMCR3 to control the PMU. These three registers aren't being handled in this initial implementation, so for now we're controlling all the PMU aspects using MMCR0, MMCR1 and the PMCs. Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20211201151734.654994-5-danielhb413@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>	2021-12-17 17:57:18 +01:00
Daniel Henrique Barboza	308b9fad2a	target/ppc: PMU: update counters on PMCs r/w Calling pmu_update_cycles() on every PMC read/write operation ensures that the values being fetched are up to date with the current PMU state. In theory we can get away by just trapping PMCs reads, but we're going to trap PMC writes to deal with counter overflow logic later on. Let's put the required wiring for that and make our lives a bit easier in the next patches. Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20211201151734.654994-4-danielhb413@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>	2021-12-17 17:57:18 +01:00
Daniel Henrique Barboza	c2eff582a3	target/ppc: PMU basic cycle count for pseries TCG This patch adds the barebones of the PMU logic by enabling cycle counting. The overall logic goes as follows: - MMCR0 reg initial value is set to 0x80000000 (MMCR0_FC set) to avoid having to spin the PMU right at system init; - to retrieve the events that are being profiled, pmc_get_event() will check the current MMCR0 and MMCR1 value and return the appropriate PMUEventType. For PMCs 1-4, event 0x2 is the implementation dependent value of PMU_EVENT_INSTRUCTIONS and event 0x1E is the implementation dependent value of PMU_EVENT_CYCLES. These events are supported by IBM Power chips since Power8, at least, and the Linux Perf driver makes use of these events until kernel v5.15. For PMC1, event 0xF0 is the architected PowerISA event for cycles. Event 0xFE is the architected PowerISA event for instructions; - if the counter is frozen, either via the global MMCR0_FC bit or its individual frozen counter bits, PMU_EVENT_INACTIVE is returned; - pmu_update_cycles() will go through each counter and update the values of all PMCs that are counting cycles. This function will be called every time a MMCR0 update is done to keep counters values up to date. Upcoming patches will use this function to allow the counters to be properly updated during read/write of the PMCs and MMCR1 writes. Given that the base CPU frequency is fixed at 1Ghz for both powernv and pseries clock, cycle calculation assumes that 1 nanosecond equals 1 CPU cycle. Cycle value is then calculated by adding the elapsed time, in nanoseconds, of the last cycle update done via pmu_update_cycles(). Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20211201151734.654994-3-danielhb413@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>	2021-12-17 17:57:18 +01:00
Matheus Ferst	caf6f9b568	target/ppc: move xscvqpdp to decodetree Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20211213120958.24443-5-victor.colombo@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>	2021-12-17 17:57:18 +01:00
Victor Colombo	201fc774e0	target/ppc: Fix xs{max, min}[cj]dp to use VSX registers PPC instruction xsmaxcdp, xsmincdp, xsmaxjdp, and xsminjdp are using vector registers when they should be using VSX ones. This happens because the instructions are using GEN_VSX_HELPER_R3, which adds 32 to the register numbers, effectively making them vector registers. This patch fixes it by changing these instructions to use GEN_VSX_HELPER_X3. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Victor Colombo <victor.colombo@eldorado.org.br> Message-Id: <20211213120958.24443-2-victor.colombo@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>	2021-12-17 17:57:18 +01:00
Fabiano Rosas	a09410ed1f	target/ppc: Remove the software TLB model of 7450 CPUs (Applies to 7441, 7445, 7450, 7451, 7455, 7457, 7447, 7447a and 7448) The QEMU-side software TLB implementation for the 7450 family of CPUs is being removed due to lack of known users in the real world. The last users in the code were removed by the two previous commits. A brief history: The feature was added in QEMU by commit `7dbe11acd8` ("Handle all MMU models in switches...") with the mention that Linux was not able to handle the TLB miss interrupts and the MMU model would be kept disabled. At some point later, commit `8ca3f6c382` ("Allow selection of all defined PowerPC 74xx (aka G4) CPUs.") enabled the model for the 7450 family without further justification. We have since the year 2011 [1] been unable to run OpenBIOS in the 7450s and have not heard of any other software that is used with those CPUs in QEMU. Attempts were made to find a guest OS that implemented the TLB miss handlers and none were found among Linux 5.15, FreeBSD 13, MacOS9, MacOSX and MorphOS 3.15. All CPUs that registered this feature were moved to an MMU model that replaces the software TLB with a QEMU hardware TLB implementation. They can now run the same software as the 7400 CPUs, including the OSes mentioned above. References: - https://bugs.launchpad.net/qemu/+bug/812398 https://gitlab.com/qemu-project/qemu/-/issues/86 - https://lists.nongnu.org/archive/html/qemu-ppc/2021-11/msg00289.html message id: 20211119134431.406753-1-farosas@linux.ibm.com Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20211130230123.781844-4-farosas@linux.ibm.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>	2021-12-17 17:57:16 +01:00
Richard Henderson	dedbfda765	target/ppc: Add helper for frsqrtes There is no double-rounding bug here, because the result is merely an estimate to within 1 part in 32, but perform the operation with float64r32_div for consistency. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20211119160502.17432-33-richard.henderson@linaro.org> Signed-off-by: Cédric Le Goater <clg@kaod.org>	2021-12-17 17:57:16 +01:00
Richard Henderson	7f87214e3b	target/ppc: Add helper for fmuls Use float64r32_mul. Fixes a double-rounding issue with performing the compuation in float64 and then rounding afterward. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20211119160502.17432-32-richard.henderson@linaro.org> Signed-off-by: Cédric Le Goater <clg@kaod.org>	2021-12-17 17:57:16 +01:00
Richard Henderson	d9e792a1c1	target/ppc: Add helpers for fadds, fsubs, fdivs Use float64r32_{add,sub,div}. Fixes a double-rounding issue with performing the compuation in float64 and then rounding afterward. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20211119160502.17432-31-richard.henderson@linaro.org> Signed-off-by: Cédric Le Goater <clg@kaod.org>	2021-12-17 17:57:16 +01:00
Richard Henderson	41ae890d08	target/ppc: Add helper for fsqrts Use float64r32_sqrt. Fixes a double-rounding issue with performing the compuation in float64 and then rounding afterward. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20211119160502.17432-30-richard.henderson@linaro.org> Signed-off-by: Cédric Le Goater <clg@kaod.org>	2021-12-17 17:57:16 +01:00
Richard Henderson	d04ca895dc	target/ppc: Add helpers for fmadds et al Use float64r32_muladd. Fixes a double-rounding issue with performing the compuation in float64 and then rounding afterward. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20211119160502.17432-29-richard.henderson@linaro.org> Signed-off-by: Cédric Le Goater <clg@kaod.org>	2021-12-17 17:57:16 +01:00
Lucas Mateus Castro (alqotel)	c3a824b0cf	target/ppc: Fixed call to deferred exception mtfsf, mtfsfi and mtfsb1 instructions call helper_float_check_status after updating the value of FPSCR, but helper_float_check_status checks fp_status and fp_status isn't updated based on FPSCR and since the value of fp_status is reset earlier in the instruction, it's always 0. Because of this helper_float_check_status would change the FI bit to 0 as this bit checks if the last operation was inexact and float_flag_inexact is always 0. These instructions also don't throw exceptions correctly since helper_float_check_status throw exceptions based on fp_status. This commit created a new helper, helper_fpscr_check_status that checks FPSCR value instead of fp_status and checks for a larger variety of exceptions than do_float_check_status. Since fp_status isn't used, gen_reset_fpstatus() was removed. The hardware used to compare QEMU's behavior to was a Power9. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br> Message-Id: <20211201163808.440385-2-lucas.araujo@eldorado.org.br> Signed-off-by: Cédric Le Goater <clg@kaod.org>	2021-12-17 17:57:12 +01:00
Matheus Ferst	788c63998c	target/ppc: Implement xxblendvb/xxblendvh/xxblendvw/xxblendvd instructions Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Bruno Larsen (billionai) <bruno.larsen@eldorado.org.br> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20211104123719.323713-24-matheus.ferst@eldorado.org.br> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-11-09 10:32:53 +11:00
Matheus Ferst	28110b72a8	target/ppc: Implement Vector Extract Double to VSR using GPR index insns Implement the following PowerISA v3.1 instructions: vextdubvlx: Vector Extract Double Unsigned Byte to VSR using GPR-specified Left-Index vextduhvlx: Vector Extract Double Unsigned Halfword to VSR using GPR-specified Left-Index vextduwvlx: Vector Extract Double Unsigned Word to VSR using GPR-specified Left-Index vextddvlx: Vector Extract Double Doubleword to VSR using GPR-specified Left-Index vextdubvrx: Vector Extract Double Unsigned Byte to VSR using GPR-specified Right-Index vextduhvrx: Vector Extract Double Unsigned Halfword to VSR using GPR-specified Right-Index vextduwvrx: Vector Extract Double Unsigned Word to VSR using GPR-specified Right-Index vextddvrx: Vector Extract Double Doubleword to VSR using GPR-specified Right-Index Suggested-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Luis Pires <luis.pires@eldorado.org.br> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20211104123719.323713-10-matheus.ferst@eldorado.org.br> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-11-09 10:32:52 +11:00
Matheus Ferst	b422c2cb52	target/ppc: Move vinsertb/vinserth/vinsertw/vinsertd to decodetree Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20211104123719.323713-9-matheus.ferst@eldorado.org.br> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-11-09 10:32:52 +11:00
Matheus Ferst	2cc12af399	target/ppc: Implement Vector Insert from GPR using GPR index insns Implements the following PowerISA v3.1 instructions: vinsblx: Vector Insert Byte from GPR using GPR-specified Left-Index vinshlx: Vector Insert Halfword from GPR using GPR-specified Left-Index vinswlx: Vector Insert Word from GPR using GPR-specified Left-Index vinsdlx: Vector Insert Doubleword from GPR using GPR-specified Left-Index vinsbrx: Vector Insert Byte from GPR using GPR-specified Right-Index vinshrx: Vector Insert Halfword from GPR using GPR-specified Right-Index vinswrx: Vector Insert Word from GPR using GPR-specified Right-Index vinsdrx: Vector Insert Doubleword from GPR using GPR-specified Right-Index The helpers and do_vinsx receive i64 to allow code sharing with the future implementation of Vector Insert from VSR using GPR Index. Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20211104123719.323713-6-matheus.ferst@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-11-09 10:32:52 +11:00
Matheus Ferst	00a16569eb	target/ppc: Implement vpdepd/vpextd instruction pdepd and pextd helpers are moved out of #ifdef (TARGET_PPC64) to allow them to be reused as GVecGen3.fni8. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Luis Pires <luis.pires@eldorado.org.br> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20211104123719.323713-4-matheus.ferst@eldorado.org.br> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-11-09 10:32:52 +11:00
Matheus Ferst	6e0bbc4048	target/ppc: Move vcfuged to vmx-impl.c.inc There's no reason to keep vector-impl.c.inc separate from vmx-impl.c.inc. Additionally, let GVec handle the multiple calls to helper_cfuged for us. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20211104123719.323713-2-matheus.ferst@eldorado.org.br> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-11-09 10:32:52 +11:00
Luis Pires	a23297479c	target/ppc: Move ddedpd[q],denbcd[q],dscli[q],dscri[q] to decodetree Move the following instructions to decodetree: ddedpd: DFP Decode DPD To BCD ddedpdq: DFP Decode DPD To BCD Quad denbcd: DFP Encode BCD To DPD denbcdq: DFP Encode BCD To DPD Quad dscli: DFP Shift Significand Left Immediate dscliq: DFP Shift Significand Left Immediate Quad dscri: DFP Shift Significand Right Immediate dscriq: DFP Shift Significand Right Immediate Quad Also deleted dfp-ops.c.inc, now that all PPC DFP instructions were moved to decodetree. Signed-off-by: Luis Pires <luis.pires@eldorado.org.br> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20211029192417.400707-16-luis.pires@eldorado.org.br> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-11-09 10:32:52 +11:00
Luis Pires	c8ef4d1ec0	target/ppc: Move dct{dp,qpq},dr{sp,dpq},dc{f,t}fix[q],dxex[q] to decodetree Move the following instructions to decodetree: dctdp: DFP Convert To DFP Long dctqpq: DFP Convert To DFP Extended drsp: DFP Round To DFP Short drdpq: DFP Round To DFP Long dcffix: DFP Convert From Fixed dcffixq: DFP Convert From Fixed Quad dctfix: DFP Convert To Fixed dctfixq: DFP Convert To Fixed Quad dxex: DFP Extract Biased Exponent dxexq: DFP Extract Biased Exponent Quad Signed-off-by: Luis Pires <luis.pires@eldorado.org.br> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20211029192417.400707-15-luis.pires@eldorado.org.br> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-11-09 10:32:52 +11:00
Luis Pires	a8f4bce6f8	target/ppc: Move dqua[q], drrnd[q] to decodetree Move the following instructions to decodetree: dqua: DFP Quantize dquaq: DFP Quantize Quad drrnd: DFP Reround drrndq: DFP Reround Quad Signed-off-by: Luis Pires <luis.pires@eldorado.org.br> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20211029192417.400707-14-luis.pires@eldorado.org.br> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-11-09 10:32:52 +11:00
Luis Pires	78464edb8f	target/ppc: Move dquai[q], drint{x,n}[q] to decodetree Move the following instructions to decodetree: dquai: DFP Quantize Immediate dquaiq: DFP Quantize Immediate Quad drintx: DFP Round to FP Integer With Inexact drintxq: DFP Round to FP Integer With Inexact Quad drintn: DFP Round to FP Integer Without Inexact drintnq: DFP Round to FP Integer Without Inexact Quad Signed-off-by: Luis Pires <luis.pires@eldorado.org.br> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20211029192417.400707-13-luis.pires@eldorado.org.br> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-11-09 10:32:52 +11:00
Luis Pires	85c38a460c	target/ppc: Move dcmp{u,o}[q],dts{tex,tsf,tsfi}[q] to decodetree Move the following instructions to decodetree: dcmpu: DFP Compare Unordered dcmpuq: DFP Compare Unordered Quad dcmpo: DFP Compare Ordered dcmpoq: DFP Compare Ordered Quad dtstex: DFP Test Exponent dtstexq: DFP Test Exponent Quad dtstsf: DFP Test Significance dtstsfq: DFP Test Significance Quad dtstsfi: DFP Test Significance Immediate dtstsfiq: DFP Test Significance Immediate Quad Signed-off-by: Luis Pires <luis.pires@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20211029192417.400707-12-luis.pires@eldorado.org.br> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-11-09 10:32:52 +11:00
Luis Pires	afdc931013	target/ppc: Move d{add,sub,mul,div,iex}[q] to decodetree Move the following instructions to decodetree: dadd: DFP Add daddq: DFP Add Quad dsub: DFP Subtract dsubq: DFP Subtract Quad dmul: DFP Multiply dmulq: DFP Multiply Quad ddiv: DFP Divide ddivq: DFP Divide Quad diex: DFP Insert Biased Exponent diexq: DFP Insert Biased Exponent Quad Signed-off-by: Luis Pires <luis.pires@eldorado.org.br> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20211029192417.400707-11-luis.pires@eldorado.org.br> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-11-09 10:32:52 +11:00
Luis Pires	87bc8e52b1	target/ppc: Move dtstdc[q]/dtstdg[q] to decodetree Move the following instructions to decodetree: dtstdc: DFP Test Data Class dtstdcq: DFP Test Data Class Quad dtstdg: DFP Test Data Group dtstdgq: DFP Test Data Group Quad Signed-off-by: Luis Pires <luis.pires@eldorado.org.br> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20211029192417.400707-10-luis.pires@eldorado.org.br> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-11-09 10:32:52 +11:00
Luis Pires	328747f32f	target/ppc: Implement DCTFIXQQ Implement the following PowerISA v3.1 instruction: dctfixqq: DFP Convert To Fixed Quadword Quad Signed-off-by: Luis Pires <luis.pires@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20211029192417.400707-8-luis.pires@eldorado.org.br> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-11-09 10:32:52 +11:00
Luis Pires	d39b2cc7d0	target/ppc: Implement DCFFIXQQ Implement the following PowerISA v3.1 instruction: dcffixqq: DFP Convert From Fixed Quadword Quad Signed-off-by: Luis Pires <luis.pires@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20211029192417.400707-5-luis.pires@eldorado.org.br> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-11-09 10:32:52 +11:00
Matheus Ferst	8bdb760606	target/ppc: Implement pextd instruction Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20211029202424.175401-11-matheus.ferst@eldorado.org.br> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-11-09 10:32:52 +11:00
Matheus Ferst	21ba6e5873	target/ppc: Implement pdepd instruction Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20211029202424.175401-10-matheus.ferst@eldorado.org.br> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-11-09 10:32:52 +11:00
Richard Henderson	7319d83a73	tcg: Combine dh_is_64bit and dh_is_signed to dh_typecode We will shortly be interested in distinguishing pointers from integers in the helper's declaration, as well as a true void return. We currently have two parallel 1 bit fields; merge them and expand to a 3 bit field. Our current maximum is 7 helper arguments, plus the return makes 8 * 3 = 24 bits used within the uint32_t typemask. Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-06-19 08:51:11 -07:00
Matheus Ferst	89ccd7dc3f	target/ppc: Implement cfuged instruction Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20210601193528.2533031-12-matheus.ferst@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-06-03 18:10:31 +10:00
Richard Henderson	46a0add975	target/ppc: Mark helper_raise_exception* as noreturn Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Message-Id: <20210517205025.3777947-8-matheus.ferst@eldorado.org.br> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-05-19 10:30:29 +10:00
Richard Henderson	f43520e5b2	target/ppc: Create helper_scv Perform the test against FSCR_SCV at runtime, in the helper. This means we can remove the incorrect set against SCV in ppc_tr_init_disas_context and do not need to add an HFLAGS bit. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20210323184340.619757-6-richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2021-05-04 11:41:24 +10:00
Lijun Pan	c4b8b49d68	target/ppc: add vmulh{su}d instructions vmulhsd: Vector Multiply High Signed Doubleword vmulhud: Vector Multiply High Unsigned Doubleword Signed-off-by: Lijun Pan <ljp@linux.ibm.com> Message-Id: <20200724045845.89976-5-ljp@linux.ibm.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-08-12 13:16:27 +10:00
Lijun Pan	f3e0d864ab	target/ppc: add vmulh{su}w instructions vmulhsw: Vector Multiply High Signed Word vmulhuw: Vector Multiply High Unsigned Word Signed-off-by: Lijun Pan <ljp@linux.ibm.com> Message-Id: <20200724045845.89976-4-ljp@linux.ibm.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-08-12 13:16:27 +10:00
Lijun Pan	a285ffa680	target/ppc: convert vmuluwm to tcg_gen_gvec_mul Convert the original implementation of vmuluwm to the more generic tcg_gen_gvec_mul. Signed-off-by: Lijun Pan <ljp@linux.ibm.com> Message-Id: <20200701234344.91843-5-ljp@linux.ibm.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-08-12 13:16:27 +10:00
Richard Henderson	3e114acc91	target/ppc: Use tcg_gen_gvec_rotlv Acked-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-06-02 08:42:37 -07:00
Nicholas Piggin	3c89b8d6ac	target/ppc: Add support for scv and rfscv instructions POWER9 adds scv and rfscv instructions and the system call vectored interrupt. Linux does not support this instruction yet but it has been tested with a modified kernel that runs on real hardware. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Message-Id: <20200507115328.789175-1-npiggin@gmail.com> [dwg: Corrected an overlong line] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-05-27 15:29:24 +10:00
Nicholas Piggin	0418bf78fe	target/ppc: Fix ISA v3.0 (POWER9) slbia implementation The new ISA v3.0 slbia variants have not been implemented for TCG, which can lead to crashing when a POWER9 machine boots Linux using the hash MMU, for example ("disable_radix" kernel command line). Add them. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Message-Id: <20200319064439.1020571-1-npiggin@gmail.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> [dwg: Fixed compile error for USER_ONLY builds] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-03-24 11:56:14 +11:00
Cédric Le Goater	5ba7ba1da0	target/ppc: Add privileged message send facilities The Processor Control facility for POWER8 processors and later provides a mechanism for the hypervisor to send messages to other threads in the system (msgsnd instruction) and cause hypervisor-level exceptions. Privileged non-hypervisor programs can also send messages (msgsndp instruction) but are restricted to the threads of the same subprocessor and cause privileged-level exceptions. The Directed Privileged Doorbell Exception State (DPDES) register reflects the state of pending privileged doorbell exceptions and can be used to modify that state. The register can be used to read and modify the state of privileged doorbell exceptions for all threads of a subprocessor and thus is a shared facility for that subprocessor. The register can be read/written by the hypervisor and read by the supervisor if enabled in the HFSCR, otherwise a hypervisor facility unavailable exception is generated. The privileged message send and clear instructions (msgsndp & msgclrp) are used to generate and clear the presence of a directed privileged doorbell exception, respectively. The msgsndp instruction can be used to target any thread of the current subprocessor, msgclrp acts on the thread issuing the instruction. These instructions are privileged, but will generate a hypervisor facility unavailable exception if not enabled in the HFSCR and executed in privileged non-hypervisor state. The HV facility unavailable exception will be addressed in other patch. Add and implement this register and instructions by reading or modifying the pending interrupt state of the cpu. Note that TCG only supports one thread per core and so we only need to worry about the cpu making the access. Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20200120104935.24449-2-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2020-02-02 14:07:57 +11:00
Suraj Jitindar Singh	f0ec31b1e2	target/ppc: Add SPR TBU40 The spr TBU40 is used to set the upper 40 bits of the timebase register, present on POWER5+ and later processors. This register can only be written by the hypervisor, and cannot be read. Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20191128134700.16091-5-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-12-17 10:39:48 +11:00
Suraj Jitindar Singh	5cc7e69f6d	target/ppc: Work [S]PURR implementation and add HV support The Processor Utilisation of Resources Register (PURR) and Scaled Processor Utilisation of Resources Register (SPURR) provide an estimate of the resources used by the thread, present on POWER7 and later processors. Currently the [S]PURR registers simply count at the rate of the timebase. Preserve this behaviour but rework the implementation to store an offset like the timebase rather than doing the calculation manually. Also allow hypervisor write access to the register along with the currently available read access. Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> [ clg: rebased on current ppc tree ] Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20191128134700.16091-3-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-12-17 10:39:48 +11:00
Suraj Jitindar Singh	5d62725b2f	target/ppc: Implement the VTB for HV access The virtual timebase register (VTB) is a 64-bit register which increments at the same rate as the timebase register, present on POWER8 and later processors. The register is able to be read/written by the hypervisor and read by the supervisor. All other accesses are illegal. Currently the VTB is just an alias for the timebase (TB) register. Implement the VTB so that is can be read/written independent of the TB. Make use of the existing method for accessing timebase facilities where by the compensation is stored and used to compute the value on reads/is updated on writes. Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> [ clg: rebased on current ppc tree ] Signed-off-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20191128134700.16091-2-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-12-17 10:39:48 +11:00
Mark Cave-Ayland	d9acba3130	target/ppc: update {get,set}_dfp{64,128}() helper functions to read/write DFP numbers correctly Since commit `ef96e3ae96` "target/ppc: move FP and VMX registers into aligned vsr register array" FP registers are no longer stored consecutively in memory and so the current method of combining FP register pairs into DFP numbers is incorrect. Firstly update the definition of the dh_*_fprp defines in helper.h to reflect that FP registers are now stored as part of an array of ppc_vsr_t elements rather than plain uint64_t elements, and then introduce a new ppc_fprp_t type which conceptually represents a DFP even-odd register pair to be consumed by the DFP helper functions. Finally update the new DFP {get,set}_dfp{64,128}() helper functions to convert between DFP numbers and DFP even-odd register pairs correctly, making use of the existing VsrD() macro to access the correct elements regardless of host endian. Fixes: `ef96e3ae96` "target/ppc: move FP and VMX registers into aligned vsr register array" Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20190926185801.11176-4-mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-10-04 19:08:21 +10:00
Stefan Brankovic	1872588ede	target/ppc: Optimize emulation of vclzw instruction Optimize Altivec instruction vclzw (Vector Count Leading Zeros Word). This instruction counts the number of leading zeros of each word element in source register and places result in the appropriate word element of destination register. Counting is to be performed in four iterations of for loop(one for each word elemnt of source register vB). Every iteration consists of loading appropriate word element from source register, counting leading zeros with tcg_gen_clzi_i32, and saving the result in appropriate word element of destination register. Signed-off-by: Stefan Brankovic <stefan.brankovic@rt-rk.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <1563200574-11098-7-git-send-email-stefan.brankovic@rt-rk.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-08-21 17:17:11 +10:00
Stefan Brankovic	b8313f0d91	target/ppc: Optimize emulation of vclzd instruction Optimize Altivec instruction vclzd (Vector Count Leading Zeros Doubleword). This instruction counts the number of leading zeros of each doubleword element in source register and places result in the appropriate doubleword element of destination register. Using tcg-s count leading zeros instruction two times(once for each doubleword element of source register vB) and placing result in appropriate doubleword element of destination register vD. Signed-off-by: Stefan Brankovic <stefan.brankovic@rt-rk.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <1563200574-11098-6-git-send-email-stefan.brankovic@rt-rk.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-08-21 17:17:11 +10:00
Stefan Brankovic	083b3f012f	target/ppc: Optimize emulation of vgbbd instruction Optimize altivec instruction vgbbd (Vector Gather Bits by Bytes by Doubleword) All ith bits (i in range 1 to 8) of each byte of doubleword element in source register are concatenated and placed into ith byte of appropriate doubleword element in destination register. Following solution is done for both doubleword elements of source register in parallel, in order to reduce the number of instructions needed(that's why arrays are used): First, both doubleword elements of source register vB are placed in appropriate element of array avr. Bits are gathered in 2x8 iterations(2 for loops). In first iteration bit 1 of byte 1, bit 2 of byte 2,... bit 8 of byte 8 are in their final spots so avr[i], i={0,1} can be and-ed with tcg_mask. For every following iteration, both avr[i] and tcg_mask variables have to be shifted right for 7 and 8 places, respectively, in order to get bit 1 of byte 2, bit 2 of byte 3.. bit 7 of byte 8 in their final spots so shifted avr values(saved in tmp) can be and-ed with new value of tcg_mask... After first 8 iteration(first loop), all the first bits are in their final places, all second bits but second bit from eight byte are in their places... only 1 eight bit from eight byte is in it's place). In second loop we do all operations symmetrically, in order to get other half of bits in their final spots. Results for first and second doubleword elements are saved in result[0] and result[1] respectively. In the end those results are saved in appropriate doubleword element of destination register vD. Signed-off-by: Stefan Brankovic <stefan.brankovic@rt-rk.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <1563200574-11098-5-git-send-email-stefan.brankovic@rt-rk.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-08-21 17:17:11 +10:00
Stefan Brankovic	4e6d0920e7	target/ppc: Optimize emulation of vsl and vsr instructions Optimization of altivec instructions vsl and vsr(Vector Shift Left/Rigt). Perform shift operation (left and right respectively) on 128 bit value of register vA by value specified in bits 125-127 of register vB. Lowest 3 bits in each byte element of register vB must be identical or result is undefined. For vsl instruction, the first step is bits 125-127 of register vB have to be saved in variable sh. Then, the highest sh bits of the lower doubleword element of register vA are saved in variable shifted, in order not to lose those bits when shift operation is performed on the lower doubleword element of register vA, which is the next step. After shifting the lower doubleword element shift operation is performed on higher doubleword element of vA, with replacement of the lowest sh bits(that are now 0) with bits saved in shifted. For vsr instruction, firstly, the bits 125-127 of register vB have to be saved in variable sh. Then, the lowest sh bits of the higher doubleword element of register vA are saved in variable shifted, in odred not to lose those bits when the shift operation is performed on the higher doubleword element of register vA, which is the next step. After shifting higher doubleword element, shift operation is performed on lower doubleword element of vA, with replacement of highest sh bits(that are now 0) with bits saved in shifted. Signed-off-by: Stefan Brankovic <stefan.brankovic@rt-rk.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <1563200574-11098-3-git-send-email-stefan.brankovic@rt-rk.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-08-21 17:17:11 +10:00
Stefan Brankovic	1cc792698e	target/ppc: Optimize emulation of lvsl and lvsr instructions Adding simple macro that is calling tcg implementation of appropriate instruction if altivec support is active. Optimization of altivec instruction lvsl (Load Vector for Shift Left). Place bytes sh:sh+15 of value 0x00 \|\| 0x01 \|\| 0x02 \|\| ... \|\| 0x1E \|\| 0x1F in destination register. Sh is calculated by adding 2 source registers and getting bits 60-63 of result. First, the bits [28-31] are placed from EA to variable sh. After that, the bytes are created in the following way: sh:(sh+7) of X(from description) by multiplying sh with 0x0101010101010101 followed by addition of the result with 0x0001020304050607. Value obtained is placed in higher doubleword element of vD. (sh+8):(sh+15) by adding the result of previous multiplication with 0x08090a0b0c0d0e0f. Value obtained is placed in lower doubleword element of vD. Optimization of altivec instruction lvsr (Load Vector for Shift Right). Place bytes 16-sh:31-sh of value 0x00 \|\| 0x01 \|\| 0x02 \|\| ... \|\| 0x1E \|\| 0x1F in destination register. Sh is calculated by adding 2 source registers and getting bits 60-63 of result. First, the bits [28-31] are placed from EA to variable sh. After that, the bytes are created in the following way: sh:(sh+7) of X(from description) by multiplying sh with 0x0101010101010101 followed by substraction of the result from 0x1011121314151617. Value obtained is placed in higher doubleword element of vD. (sh+8):(sh+15) by substracting the result of previous multiplication from 0x18191a1b1c1d1e1f. Value obtained is placed in lower doubleword element of vD. Signed-off-by: Stefan Brankovic <stefan.brankovic@rt-rk.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <1563200574-11098-2-git-send-email-stefan.brankovic@rt-rk.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-08-21 17:17:11 +10:00
Mark Cave-Ayland	c9f4e4d8b6	target/ppc: improve VSX_FMADD with new GEN_VSX_HELPER_VSX_MADD macro Introduce a new GEN_VSX_HELPER_VSX_MADD macro for the generator function which enables the source and destination registers to be decoded at translation time. This enables the determination of a or m form to be made at translation time so that a single helper function can now be used for both variants. Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20190616123751.781-16-mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2019-07-02 09:43:58 +10:00

1 2 3

127 Commits