Instead require either mulu2_i32 or muluh_i32. The code in tcg-op.h
already supports looking for both. Previous incomplete conversion?
Signed-off-by: Richard Henderson <rth@twiddle.net>
Most 64-bit targets need to be able to ignore the high bits
of a TCG_TYPE_I32 value.
Suggested-by: Stuart Brady <sdb@zubnet.me.uk>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Static code analyzers complain about signed bitfields with only a single
bit. is_ld is used as a boolean value, so make it bool.
ppc64 already used bool for the 2nd argument is_ld of the local function
add_qemu_ldst_label. Modify all other TCG targets to do follow this
example.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Richard Henderson <rth@twiddle.net>
The arm ldrd/strd insns must cause alignment traps, whereas
at least for armv7 ldr/str must handle unaligned operations.
While this is hardly the only problem facing user-only emu,
this solves one problem for i386 on armv7 emulation.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reported-by: Huw Davies <huw@codeweavers.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
It's this that should be subtracted from 0x20 when converting to a right rotate.
Cc: qemu-stable@nongnu.org
Signed-off-by: Huw Davies <huw@codeweavers.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Allow host detection on linux systems without glibc 2.16 or later.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
If we pull the code to emit the actual load/store into a subroutine,
we can share the reg+reg addressing mode code between softmmu and
usermode. This lets us load GUEST_BASE into a temporary register
rather than attempting to add it piece-wise to the address.
Which lets us use movw+movt for armv7, rather than (up to) 4 adds.
Code size for pre-armv7 stays the same.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Step two in the transition, adding the new ldst opcodes. Keep the old
opcodes around until all backends support the new opcodes.
Signed-off-by: Richard Henderson <rth@twiddle.net>
There are free scheduling slots between the sequence of
comparison instructions. This requires changing the
register in use to avoid conflict with those compares.
Signed-off-by: Richard Henderson <rth@twiddle.net>
The main intent of the patch is to allow the tlb addend register
to be changed, without tying that change to the constraint. But
the most common side-effect seems to be to enable usage of ldrd
with the r0,r1 pair.
Signed-off-by: Richard Henderson <rth@twiddle.net>
This allows us to make more intelligent decisions about the relative
offsets of the tlb comparator and the addend, avoiding any need of
writeback addressing.
Signed-off-by: Richard Henderson <rth@twiddle.net>
One of the two constraints we already checked via #if, but
the tlb offset distance was only checked at runtime.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Use the new helper_ret_*_mmu routines. Use a conditional call
to arrange for a tail-call from the store path, and to load the
return address for the helper for the load path.
Signed-off-by: Richard Henderson <rth@twiddle.net>
The _cmmu helpers can be moved to exec-all.h. The helpers that are
used from TCG will shortly need access to tcg_target_long so move
their declarations into tcg.h.
This requires minor include adjustments to all TCG backends.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Use them in places where mulu2 and muls2 are used.
Optimize mulx2 with dead low part to mulxh.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
With this we can generate armv7 insns even when the OS compiles for a
lower common denominator. The macros are arranged so that when we do
compile for a given ISA, all of the runtime checks for that ISA are
optimized away.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
GCC 4.8 defines a handy __ARM_ARCH symbol that we can use, which
will make us nicely forward compatible with ARMv8 AArch32.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
As it really controls the availability of a thumb interworking
instruction on armv5t.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
We can now detect and use divide instructions at runtime, rather than
having to restrict their availability to compile-time.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
There are several hosts with only a "div" insn. Remainder is computed
manually from the quotient and inputs. We can do this generically.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
We've got a compile-time check for the condition in exec/cpu-defs.h.
Reviewed-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: liguang <lig.fnst@cn.fujitsu.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Avoid the mini constant pool for armv7, and avoid replicating
the test for pre-v7.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Found by inspection, since the effect of the bug was simply to
send all memory ops through the slow path.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Move the slow path out of line, as the TODO's mention.
This allows the fast path to be unconditional, which can
speed up the fast path as well, depending on the core.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Work better with branch predition when we have movw+movt,
as the size of the code is the same. Perhaps re-evaluate
when we have a proper constant pool.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
The schedule was fully serial, with no possibility for dual issue.
The old schedule had a minimal issue of 7 cycles; the new schedule
has a minimal issue of 5 cycles.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Share code between qemu_ld and qemu_st to process the tlb.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Use even more primitive helper functions to avoid lots of duplicated code.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Make the code more readable by only having one copy of the magic
numbers, swapping registers as needed prior to that. Speed the
compiler by not applying the rd == rn avoidance for v6 or later.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
R12 is call clobbered, while R8 is call saved. This change
gives tcg one more call saved register for real data.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>