The number of actual invocations of ctpop itself does not warrent
an opcode, but it is very helpful for POWER7 to use in generating
an expansion for ctz.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
This will let us choose how to interpret a given constraint
depending on whether the opcode is 32- or 64-bit. Which will
let us share more constraint combinations between opcodes.
At the same time, change the interface to return the advanced
pointer instead of passing it in/out by reference.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
This will allow the target to tailor the constraints to the
auto-detected ISA extensions.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Adds tcg_gen_extract_* and tcg_gen_sextract_* for extraction of
fixed position bitfields, much like we already have for deposit.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Take stack frame parameters out from the function body.
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Tested-by: James Hogan <james.hogan@imgtec.com>
Tested-by: YunQiang Su <wzssyqa@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Jin Guojie <jinguojie@loongson.cn>
Message-Id: <1483592275-4496-8-git-send-email-jinguojie@loongson.cn>
tcg_out_ldst: using a generic ALIAS_PADD to avoid ifdefs
tcg_out_ld: generates LD or LW
tcg_out_st: generates SD or SW
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Tested-by: James Hogan <james.hogan@imgtec.com>
Tested-by: YunQiang Su <wzssyqa@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Jin Guojie <jinguojie@loongson.cn>
Message-Id: <1483592275-4496-7-git-send-email-jinguojie@loongson.cn>
tcg_out_mov: using OPC_OR as most mips assemblers do;
tcg_out_movi: extended to 64-bit immediate.
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Tested-by: James Hogan <james.hogan@imgtec.com>
Tested-by: YunQiang Su <wzssyqa@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Jin Guojie <jinguojie@loongson.cn>
Message-Id: <1483592275-4496-6-git-send-email-jinguojie@loongson.cn>
Without the mips32r2 instructions to perform swapping, bswap is quite large,
dominating the size of each reverse-endian qemu_ld/qemu_st operation.
Create two subroutines in the prologue block. The subroutines require extra
reserved registers (TCG_TMP[2, 3]). Using these within qemu_ld means that
we need not place additional restrictions on the qemu_ld outputs.
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Tested-by: James Hogan <james.hogan@imgtec.com>
Tested-by: YunQiang Su <wzssyqa@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Jin Guojie <jinguojie@loongson.cn>
Message-Id: <1483592275-4496-5-git-send-email-jinguojie@loongson.cn>
Bulk patch adding 64-bit opcodes into tcg_out_op. Note that
mips64 is as yet neither complete nor enabled.
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Tested-by: James Hogan <james.hogan@imgtec.com>
Tested-by: YunQiang Su <wzssyqa@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Jin Guojie <jinguojie@loongson.cn>
Message-Id: <1483592275-4496-4-git-send-email-jinguojie@loongson.cn>
Since the mips manual tables are in octal, reorg all of the opcodes
into that format for clarity. Note that the 64-bit opcodes are as
yet unused.
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Tested-by: James Hogan <james.hogan@imgtec.com>
Tested-by: YunQiang Su <wzssyqa@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Jin Guojie <jinguojie@loongson.cn>
Message-Id: <1483592275-4496-3-git-send-email-jinguojie@loongson.cn>
Without the mips32r2 instructions to perform swapping, bswap is quite large,
dominating the size of each reverse-endian qemu_ld/qemu_st operation.
Create a subroutine in the prologue block. The subroutine requires extra
reserved registers (TCG_TMP[2, 3]). Using these within qemu_ld means that
we need not place additional restrictions on the qemu_ld outputs.
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Tested-by: James Hogan <james.hogan@imgtec.com>
Tested-by: YunQiang Su <wzssyqa@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Jin Guojie <jinguojie@loongson.cn>
Message-Id: <1483592275-4496-2-git-send-email-jinguojie@loongson.cn>
Previously we allowed fully unaligned operations, but not operations
that are aligned but with less alignment than the operation size.
In addition, arm32, ia64, mips, and sparc had been omitted from the
previous overalignment patch, which would have led to that alignment
being enforced.
Signed-off-by: Richard Henderson <rth@twiddle.net>
These use guard symbols like TCG_TARGET_$target.
scripts/clean-header-guards.pl doesn't like them because they don't
match their file name (they should, to make guard collisions less
likely).
Clean them up: use guard symbol $target_TCG_TARGET_H for
tcg/$target/tcg-target.h.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
While we can store constants via constrants on INDEX_op_st_i32 et al,
we weren't able to spill constants to backing store.
Add a new backend interface, tcg_out_sti, which may store the constant
(and is allowed to fail). Rearrange the temp_* helpers so that we only
attempt to directly store a constant when the temp is becoming dead/free.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Briefly describe in a comment how direct block chaining is done. It
should help in understanding of the following data fields.
Rename some fields in TranslationBlock and TCGContext structures to
better reflect their purpose (dropping excessive 'tb_' prefix in
TranslationBlock but keeping it in TCGContext):
tb_next_offset => jmp_reset_offset
tb_jmp_offset => jmp_insn_offset
tb_next => jmp_target_addr
jmp_next => jmp_list_next
jmp_first => jmp_list_first
Avoid using a magic constant as an invalid offset which is used to
indicate that there's no n-th jump generated.
Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Ensure direct jump patching in MIPS is atomic by using
atomic_read()/atomic_set() for code patching.
Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Message-Id: <1461341333-19646-11-git-send-email-sergey.fedorov@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
[rth: Merged the deposit32 followup.]
[rth: Merged the following followup.]
Message-Id: <1462210518-26522-1-git-send-email-sergey.fedorov@linaro.org>
Check for CONFIG_DEBUG_TCG instead of NDEBUG, drop now useless code.
Cc: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: 1461228530-14852-2-git-send-email-aurelien@aurel32.net
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The TCG code is quite performance sensitive, but at the same time can
also be quite tricky. That is why asserts that can be enabled with the
--enable-debug-tcg configure option.
This used to work the following way:
| #include "config.h"
|
| ...
|
| #if !defined(CONFIG_DEBUG_TCG) && !defined(NDEBUG)
| /* define it to suppress various consistency checks (faster) */
| #define NDEBUG
| #endif
|
| ...
|
| #include <assert.h>
Since commit 757e725b (tcg: Clean up includes) "config.h" as been
replaced by "qemu/osdep.h" which itself includes <assert.h>. As a
consequence the assertions are always enabled, even when using
--disable-debug-tcg, causing a performance regression, especially on
targets with many registers. For instance on qemu-system-ppc the
speed difference is about 15%.
tcg_debug_assert is controlled directly by CONFIG_DEBUG_TCG and already
uses in some places. This patch replaces all the calls to assert into
calss to tcg_debug_assert.
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: 1461228530-14852-1-git-send-email-aurelien@aurel32.net
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The MIPS TCG backend is the only one to have
tcg_target_reg_alloc_order[] elements of type TCGReg rather than int.
This resulted in commit 91478cefaa ("tcg: Allocate indirect_base
temporaries in a different order") breaking the build on MIPS since the
type differed from indirect_reg_alloc_order[]:
tcg/tcg.c:1725:44: error: pointer type mismatch in conditional expression [-Werror]
order = rev ? indirect_reg_alloc_order : tcg_target_reg_alloc_order;
^
Make it an array of ints to fix the build and match other architectures.
Fixes: 91478cefaa ("tcg: Allocate indirect_base temporaries in a different order")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Aurelien Jarno <aurelien@aurel32.net>
Message-Id: <1459522179-6584-1-git-send-email-james.hogan@imgtec.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Commit 757e725b58 added a number of #include "qemu/osdep.h"
files to the tcg-target.c files (as they were named at the time).
These are unnecessary because these files are not standalone C
files, and the tcg/tcg.c file which includes them will have
already included osdep.h on their behalf. Remove the unneeded
include directives.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <1456238983-10160-4-git-send-email-peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Rename the per-architecture tcg-target.c files to tcg-target.inc.c.
This makes it clearer that they are not intended to be standalone
C files, but are instead #included into another source file.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <1456238983-10160-2-git-send-email-peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.
This commit was created with scripts/clean-includes.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1453832250-766-16-git-send-email-peter.maydell@linaro.org
Extend MIPS movcond implementation to support the SELNEZ/SELEQZ
instructions introduced in MIPS r6 (where MOVN/MOVZ have been removed).
Whereas the "MOVN/MOVZ rd, rs, rt" instructions have the following
semantics:
rd = [!]rt ? rs : rd
The "SELNEZ/SELEQZ rd, rs, rt" instructions are slightly different:
rd = [!]rt ? rs : 0
First we ensure that if one of the movcond input values is zero that it
comes last (we can swap the input arguments if we invert the condition).
This is so that it can exactly match one of the SELNEZ/SELEQZ
instructions and avoid the need to emit the other one.
Otherwise we emit the opposite instruction first into a temporary
register, and OR that into the result:
SELNEZ/SELEQZ TMP1, v2, c1
SELEQZ/SELNEZ ret, v1, c1
OR ret, ret, TMP1
Which does the following:
ret = cond ? v1 : v2
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1443788657-14537-7-git-send-email-james.hogan@imgtec.com>
MIPSr6 adds several new integer multiply, divide, and modulo
instructions, and removes several pre-r6 encodings, along with the HI/LO
registers which were the implicit operands of some of those
instructions. Update TCG to use the new instructions when built for r6.
The new instructions actually map much more directly to the TCG ops, as
they only provide a single 32-bit half of the result and in a normal
general purpose register instead of HI or LO.
The mulu2_i32 and muls2_i32 operations are no longer appropriate for r6,
so they are removed from the TCG opcode table. This is because they
would need to emit two separate host instructions anyway (for the high
and low half of the result), which TCG can arrange automatically for us
in the absense of mulu2_i32/muls2_i32 by splitting it into mul_i32 and
mul*h_i32 TCG ops.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1443788657-14537-6-git-send-email-james.hogan@imgtec.com>
MIPSr6 encodes JR as JALR with zero as the link register, and the pre-r6
JR encoding is removed. Update TCG to use the new encoding when built
for r6.
We still use the old encoding for pre-r6, so as not to confuse return
prediction stack hardware which may detect only particular encodings of
the return instruction.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1443788657-14537-5-git-send-email-james.hogan@imgtec.com>
Add definition use_mips32r6_instructions to the MIPS TCG backend which
is constant 1 when built for MIPS release 6. This will be used to decide
between pre-R6 and R6 instruction encodings.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1443788657-14537-4-git-send-email-james.hogan@imgtec.com>
Instead of computing mem_index and s_bits in both tcg_out_qemu_ld and
tcg_out_qemu_st function and passing them to tcg_out_tlb_load, directly
pass oi to the tcg_out_tlb_load function and compute mem_index and
s_bits there.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Somehow the tcg_out_addsub2 function ended-up in the middle of the
qemu_ld/st related functions. Move it with other arithmetics related
functions.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
The MIPS TCG backend implements qemu_ld with 64-bit targets using the v0
register (base) as a temporary to load the upper half of the QEMU TLB
comparator (see line 5 below), however this happens before the input
address is used (line 8 to mask off the low bits for the TLB
comparison, and line 12 to add the host-guest offset). If the input
address (addrl) also happens to have been placed in v0 (as in the second
column below), it gets clobbered before it is used.
addrl in t2 addrl in v0
1 srl a0,t2,0x7 srl a0,v0,0x7
2 andi a0,a0,0x1fe0 andi a0,a0,0x1fe0
3 addu a0,a0,s0 addu a0,a0,s0
4 lw at,9136(a0) lw at,9136(a0) set TCG_TMP0 (at)
5 lw v0,9140(a0) lw v0,9140(a0) set base (v0)
6 li t9,-4093 li t9,-4093
7 lw a0,9160(a0) lw a0,9160(a0) set addend (a0)
8 and t9,t9,t2 and t9,t9,v0 use addrl
9 bne at,t9,0x836d8c8 bne at,t9,0x836d838 use TCG_TMP0
10 nop nop
11 bne v0,t8,0x836d8c8 bne v0,a1,0x836d838 use base
12 addu v0,a0,t2 addu v0,a0,v0 use addrl, addend
13 lw t0,0(v0) lw t0,0(v0)
Fix by using TCG_TMP0 (at) as the temporary instead of v0 (base),
pushing the load on line 5 forward into the delay slot of the low
comparison (line 10). The early load of the addend on line 7 also needs
pushing even further for 64-bit targets, or it will clobber a0 before
we're done with it. The output for 32-bit targets is unaffected.
srl a0,v0,0x7
andi a0,a0,0x1fe0
addu a0,a0,s0
lw at,9136(a0)
-lw v0,9140(a0) load high comparator
li t9,-4093
-lw a0,9160(a0) load addend
and t9,t9,v0
bne at,t9,0x836d838
- nop
+ lw at,9140(a0) load high comparator
+lw a0,9160(a0) load addend
-bne v0,a1,0x836d838
+bne at,a1,0x836d838
addu v0,a0,v0
lw t0,0(v0)
Cc: qemu-stable@nongnu.org
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
As we have removed CONFIG_USE_GUEST_BASE, we always use a guest base
and the macros GUEST_BASE and RESERVED_VA become useless: replace
them by their values.
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <1440420834-8388-1-git-send-email-laurent@vivier.eu>
Signed-off-by: Richard Henderson <rth@twiddle.net>
The add2 code in the tcg_out_addsub2 function doesn't take into account
the case where rl == al == bl. In that case we can't compute the carry
after the addition. As it corresponds to a multiplication by 2, the
carry bit is the bit 31.
While this is a corner case, this prevents x86-64 guests to boot on a
MIPS host.
Cc: qemu-stable@nongnu.org
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Commit 2b7ec66f fixed TCGMemOp masking following the MO_AMASK addition,
but two cases were forgotten in the TCG MIPS backend.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
For 32-bit guest, we load a 32-bit address from the TLB, so there is no
need to compensate for the low or high part. This fixes 32-bit guests on
big-endian hosts.
Cc: qemu-stable@nongnu.org
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Commit 3972ef6f83 ("tcg: Push merged memop+mmu_idx parameter to
softmmu routines") caused the following build errors when building TCG
for MIPS:
In file included from tcg/tcg.c:258:0:
tcg/mips/tcg-target.c In function ‘tcg_out_qemu_ld_slow_path’:
tcg/mips/tcg-target.c:1015:22: error: ‘lb’ undeclared (first use in this function)
tcg/mips/tcg-target.c In function ‘tcg_out_qemu_st_slow_path’:
tcg/mips/tcg-target.c:1058:22: error: ‘lb’ undeclared (first use in this function)
It looks like lb was meant to refer to the TCGLabelQemuLdst *l
parameter, so fix both references to lb to refer to just l.
Fixes: 3972ef6f83 ("tcg: Push merged memop+mmu_idx parameter to softmmu routines")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Reviewed-by: Leon Alrae <leon.alrae@imgtec.com>
Message-id: 1436433435-24898-2-git-send-email-james.hogan@imgtec.com
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Leon Alrae <leon.alrae@imgtec.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The addition of MO_AMASK means that places that used inverted masks
need to be changed to use positive masks, and places that failed to
mask the intended bits need updating.
Reviewed-by: Yongbok Kim <yongbok.kim@imgtec.com>
Tested-by: Yongbok Kim <yongbok.kim@imgtec.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
This will be used to size the TLB when more than 8 MMU modes are
used by the target. Limitations come from the limited size of
the immediate fields (which sometimes, as in the case of Aarch64,
extend to instructions that shift the immediate).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1424436345-37924-2-git-send-email-pbonzini@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
The extra information is not yet used but it is now available.
This requires minor changes through all of the tcg backends.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
At the tcg opcode level, not at the tcg-op.h generator level.
This requires minor changes through all of the tcg backends,
but none of the cpu translators.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
This is less about improved type checking than enabling a
subsequent change to the representation of labels.
Acked-by: Claudio Fontana <claudio.fontana@huawei.com>
Tested-by: Claudio Fontana <claudio.fontana@huawei.com>
Cc: Andrzej Zaborowski <balrogg@gmail.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Blue Swirl <blauwirbel@gmail.com>
Cc: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Commit 9d8bf2d1 moved the softmmu slow path out of line and introduce a
regression at the same time by always calling tcg_out_tlb_load with
is_load=1. This makes impossible to run any significant code under
qemu-system-mips*.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Since all backends have been converted, remove the compatibility code.
Acked-by: Claudio Fontana <claudio.fontana@huawei.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Now that the code_gen_buffer is constrained to not cross 256mb
regions, we are assured that we can use J to reach another TB.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>