Commit Graph

1015 Commits

Author SHA1 Message Date
Peter Maydell 774d566cdb tcg/i386: Fix build for systems without working cpuid.h (MacOSX, Win32)
Win32 doesn't have a cpuid.h, and MacOSX may have one but without
the __cpuid() function we use, which means that commit 9d2eec20
broke the build for those platforms. Fix this by tightening up
our configure cpuid.h check to test that the functions we need
are present, and adding some missing #ifdef guards in
tcg/i386/tcg-target.c.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2014-02-21 10:39:10 +00:00
Richard Henderson 6399ab3325 tcg/i386: Use SHLX/SHRX/SARX instructions
These three-operand shift instructions do not require the shift count
to be placed into ECX.  This reduces the number of mov insns required,
with the mere addition of a new register constraint.

Don't attempt to get rid of the matching constraint, as that's impossible
to manipulate with just a new constraint.  In addition, constant shifts
still need the matching constraint.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-02-17 10:12:29 -06:00
Richard Henderson 9d2eec202f tcg/i386: Use ANDN instruction
Note that the optimizer cannot simplify ANDC X,Y,C to AND X,Y,~C
so we must handle constants in the implementation of andc.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-02-17 10:12:29 -06:00
Richard Henderson ecc7e84327 tcg/i386: Add tcg_out_vex_modrm
Prepare for emitting BMI insns which require VEX encoding.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-02-17 10:12:29 -06:00
Richard Henderson a1b29c9ae0 tcg/i386: Move TCG_CT_CONST_* to tcg-target.c
These are not needed by users of tcg-target.h.  No need to recompile
when we adjust them.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-02-17 10:12:29 -06:00
Richard Henderson 464a1441c1 tcg/optimize: Add more identity simplifications
Recognize 0 operand to andc, and -1 operands to and, orc, eqv.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-02-17 10:12:29 -06:00
Richard Henderson e64e958e20 tcg/optimize: Optmize ANDC X,Y,Y to MOV X,0
Like we already do for SUB and XOR.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-02-17 10:12:29 -06:00
Richard Henderson e201b56418 tcg/optimize: Simply some logical ops to NOT
Given, of course, an appropriate constant.  These could be generated
from the "canonical" operation for inversion on the guest, or via
other optimizations.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-02-17 10:12:29 -06:00
Richard Henderson 23ec69ed37 tcg/optimize: Handle known-zeros masks for ANDC
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-02-17 10:12:29 -06:00
Aurelien Jarno c8d7027253 tcg/optimize: add known-zero bits compute for load ops
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-02-17 10:12:28 -06:00
Aurelien Jarno f096dc9618 tcg/optimize: improve known-zero bits for 32-bit ops
The shl_i32 op might set some bits of the unused 32 high bits of the
mask. Fix that by clearing the unused 32 high bits for all 32-bit ops
except load/store which operate on tl values.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-02-17 10:12:28 -06:00
Aurelien Jarno 3031244b01 tcg/optimize: fix known-zero bits optimization
Known-zero bits optimization is a great idea that helps to generate more
optimized code. However the current implementation only works in very few
cases as the computed mask is not saved.

Fix this to make it really working.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-02-17 10:12:28 -06:00
Aurelien Jarno e46b225a31 tcg/optimize: fix known-zero bits for right shift ops
32-bit versions of sar and shr ops should not propagate known-zero bits
from the unused 32 high bits. For sar it could even lead to wrong code
being generated.

Cc: qemu-stable@nongnu.org
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-02-17 10:12:28 -06:00
Huw Davies 7a3a00979d tcg-arm: The shift count of op_rotl_i32 is in args[2] not args[1].
It's this that should be subtracted from 0x20 when converting to a right rotate.

Cc: qemu-stable@nongnu.org
Signed-off-by: Huw Davies <huw@codeweavers.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-02-17 10:12:08 -06:00
Richard Henderson f6aa2f7dee TCG: Fix 32-bit host allocation typo
The second half register of a 64-bit temp on a 32-bit host
was allocated with the wrong base_type.

The base_type of the second half register is never checked,
but for consistency it should be the same as the first half.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-02-15 15:20:17 -08:00
Peter Maydell c1de788ab9 tcg: Add TCGV_UNUSED_PTR, TCGV_IS_UNUSED_PTR, TCGV_EQUAL_PTR
We have macros for marking TCGv values as unused, checking if they
are unused and comparing them to each other. However these only exist
for TCGv_i32 and TCGv_i64; add them for TCGv_ptr as well.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2014-02-08 14:46:55 +00:00
Richard Henderson c6830cdb2c tcg/s390: Remove sigill_handler
Commit c9baa30f42 failed to
delete all of the relevant code, leading to Werrors about
unused symbols.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-02-01 13:45:20 +04:00
Peter Maydell dc08f85188 Merge remote-tracking branch 'rth/tcg-movbe' into staging
* rth/tcg-movbe:
  tcg/i386: cleanup useless #ifdef
  tcg/i386: use movbe instruction in qemu_ldst routines
  tcg/i386: add support for three-byte opcodes
  tcg/i386: remove hardcoded P_REXW value
  disas/i386.c: disassemble movbe instruction

Message-id: 1390692772-15282-1-git-send-email-rth@twiddle.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-01-30 19:02:16 +00:00
Alexander Graf 18d13fa293 TCG: Fix I64-on-32bit-host temporaries
We have cache pools of temporaries that we can reuse later when they've
already been allocated before.

These cache pools differenciate between the target TCG variable type they
contain. So we have one pool for I32 and one pool for I64 variables.

On a 32bit system, we can't work with 64bit registers though. So instead we
spawn two I32 temporaries for every I64 temporary we create. All caching
works the same way as on a real 64-bit system though: We create a cache entry
in the 64bit array for the first i32 index.

However, when we free such a temporary we free it to the pool of its type
(which is always i32 on 32bit systems) rather than its base_type (which is
i64 or i32 depending on the variable). This means we put a temporary that
is of base_type == i64 into the i32 preallocated temporary pool.

Eventually, this results in failures like this on 32bit hosts:

  qemu-system-ppc64: tcg/tcg.c:515: tcg_temp_new_internal: Assertion `ts->base_type == type' failed.

This patch makes the free routine use the base_type instead for the free case,
so it's consistent with the temporary allocation. It fixes the above failure
for me.

Signed-off-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 1390146811-59936-1-git-send-email-agraf@suse.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-01-30 13:25:28 +00:00
Aurelien Jarno 2d23d5edb5 tcg/i386: cleanup useless #ifdef
TCG_TARGET_HAS_movcond_i32 is always defined to 1 in tcg-target.h, so
remove the corresponding #ifdef #endif sequence, left from a previous
refactoring.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-01-25 15:21:33 -08:00
Aurelien Jarno 085bb5bb64 tcg/i386: use movbe instruction in qemu_ldst routines
The movbe instruction has been added on some Intel Atom CPUs and on
recent Intel Haswell CPUs. It allows to load/store a value and at the
same time bswap it.

This patch detects the avaibility of this instruction and when available
use it in the qemu load/store routines in replacement of load/store +
bswap. Note that for 16-bit unsigned loads, movbe + movzw is basically the
same as movzw + bswap, so the patch doesn't touch this case.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
[RTH: Reduced the number of conditionals using "movop".]
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-01-25 15:19:19 -08:00
Aurelien Jarno 2a1137753f tcg/i386: add support for three-byte opcodes
Add support for three-byte opcodes, starting with the 0x0f 0x38 prefix.
Use P_EXT38 as the new constant, and shift all other constants so that
P_EXT and P_EXT38 have neighbouring values.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
[RTH: Changed the name from P_EXT2 to P_EXT38.]
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-01-25 14:12:45 -08:00
Aurelien Jarno c9d78213b8 tcg/i386: remove hardcoded P_REXW value
P_REXW is defined has a constant at the beginning of i386/tcg-target.c,
but the corresponding bit is later used in a harcoded way, which defeat
the purpose of a constant.

Fix that by using a conditional expression operator instead of a shift.
On x86 this actually makes the code slightly smaller as GCC does in
practice (opc >> 8) & 8 instead of (opc & 0x800) >> 8 so the constants
are smaller to load.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2014-01-25 14:12:38 -08:00
Aurelien Jarno 8589467f94 tcg/i386: fix a comment
The comments apply to 8-bit stores, not 8-byte stores.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2013-12-21 16:41:56 +01:00
Richard Henderson 0ec9eabc7f tcg: Use bitmaps for free temporaries
We previously allocated 32-bits per temp for the next_free_temp entry.
We now allocate 4 bits per temp across the 4 bitmaps.

Using a linked list meant that if a translator is tweeked, resulting in
temps being freed in a different order, that would have follow-on effects
throughout the TB.  Always allocating the lowest free temp means that
follow-on effects are minimized, which can make it easier to diff output
when debugging the translators.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-12-10 09:23:45 -08:00
Richard Henderson c9baa30f42 tcg-s390: Use qemu_getauxval in query_facilities
No need to set up a SIGILL signal handler for detection anymore.

Remove a ton of sanity checks that must be true, given that we're
requiring a 64-bit build (the note about 31-bit KVM is satisfied
by configuring with TCI).

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-11-30 07:45:30 +13:00
Richard Henderson 41d9ea80ac tcg-arm: Use qemu_getauxval
Allow host detection on linux systems without glibc 2.16 or later.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-11-30 07:45:14 +13:00
Richard Henderson cd629de1cf tcg-ppc64: Use qemu_getauxval
Allow host detection on linux systems without glibc 2.16 or later.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-11-30 07:45:13 +13:00
Richard Henderson 463230d85e tcg-ia64: Introduce tcg_opc_bswap64_i
Acked-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-11-18 15:57:59 +10:00
Richard Henderson db008a8de2 tcg-ia64: Introduce tcg_opc_ext_i
Being able to "extend" from 64-bits (with a mov) simplifies
a few places where the conditional breaks the train of thought.

Acked-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-11-18 15:57:54 +10:00
Richard Henderson fa0cdb6c2a tcg-ia64: Introduce tcg_opc_movi_a
Acked-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-11-18 15:57:50 +10:00
Richard Henderson 3b9ccdcc74 tcg-ia64: Introduce tcg_opc_mov_a
Acked-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-11-18 15:57:46 +10:00
Richard Henderson 25c9c73bdc tcg-ia64: Use A3 form of logical operations
We can and/or/xor/andcm small constants, saving one cycle.

Acked-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-11-18 15:57:40 +10:00
Richard Henderson f940fb086c tcg-ia64: Use SUB_A3 and ADDS_A4 for subtraction
We can subtract from more small constants that just 0 with one insn,
and we can add the negative for most small constants.

Acked-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-11-18 15:57:33 +10:00
Richard Henderson 8642088a3d tcg-ia64: Use ADDS for small addition
Avoids a wasted cycle loading up small constants.

Simplify the code assuming the tcg optimizer is going to work
and don't expect the first operand of the add to be constant.

Acked-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-11-18 15:57:23 +10:00
Richard Henderson 3c289cba9b tcg-ia64: Avoid unnecessary stop bit in tcg_out_alu
When performing an operation with two input registers, we'd leave
the stop bit (and thus an extra cycle) that's only needed when one
or the other input is a constant.

Acked-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-11-18 15:57:16 +10:00
Richard Henderson d15de15ca0 tcg-ia64: Move AREG0 to R32
Since the move away from the global areg0, we're no longer globally
reserving areg0.  Which means our use of R7 clobbers a call-saved
register.  Shift areg0 into the windowed registers.  Indeed, choose
the incoming parameter register that it comes to us by.

This requires moving the register holding the return address elsewhere.
Choose R33 for tidiness.

Acked-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-11-18 15:57:08 +10:00
Richard Henderson 6d264b38fc tcg-ia64: Simplify brcond
There was a misconception that a stop bit is required between a compare
and the branch that uses the predicate set by the compare.  This lead to
the usage of an extra bundle in which to perform the compare.  The extra
bundle left room for constants to be loaded for use with the compare insn.

If we pack the compare and the branch together in the same bundle, then
there's no longer any room for non-zero constants.  At which point we
can eliminate half the function by not handling them.

Acked-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-11-18 15:56:42 +10:00
Richard Henderson 6f65c780b9 tcg-ia64: Handle constant calls
Using only indirect calls results in 3 bundles (one to load the
descriptor address), and 4 stop bits.  By looking through the
descriptor to the constants, we can perform the call with 2
bundles and only 1 stop bit.

Acked-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-11-18 15:56:30 +10:00
Richard Henderson 5f7b16877a tcg-ia64: Use shortcuts for nop insns
There's no need to go through the full opcode-to-insn function call
to generate nops.  This makes the source a bit more readable.

Acked-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-11-18 15:56:25 +10:00
Richard Henderson e3afa1c4ad tcg-ia64: Use TCGMemOp within qemu_ldst routines
Acked-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-11-18 15:56:12 +10:00
Richard Henderson 1768ec0623 tcg-ppc64: Support new ldst opcodes
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-12 16:19:20 -07:00
Richard Henderson 5dd391604f tcg-ppc: Support new ldst opcodes
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-12 16:19:20 -07:00
Richard Henderson e349a8d4ff tcg-ppc64: Convert to le/be ldst helpers
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-12 16:19:20 -07:00
Richard Henderson 92d0acda27 tcg-ppc: Convert to le/be ldst helpers
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-12 16:19:20 -07:00
Richard Henderson a058557381 tcg-ppc64: Use TCGMemOp within qemu_ldst routines
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-12 16:19:20 -07:00
Richard Henderson f1a16dcdd5 tcg-ppc: Use TCGMemOp within qemu_ldst routines
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-12 16:19:20 -07:00
Richard Henderson 091d567771 tcg-arm: Improve GUEST_BASE qemu_ld/st
If we pull the code to emit the actual load/store into a subroutine,
we can share the reg+reg addressing mode code between softmmu and
usermode.  This lets us load GUEST_BASE into a temporary register
rather than attempting to add it piece-wise to the address.

Which lets us use movw+movt for armv7, rather than (up to) 4 adds.
Code size for pre-armv7 stays the same.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-12 16:19:20 -07:00
Richard Henderson 15ecf6e394 tcg-arm: Convert to new ldst opcodes
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-12 16:19:20 -07:00
Richard Henderson a485cff09c tcg-arm: Tidy variable naming convention in qemu_ld/st
s/addr_reg2/addrhi/
s/addr_reg/addrlo/
s/data_reg2/datahi/
s/data_reg/datalo/

Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-10-12 16:19:20 -07:00