The CONFIG_TCG_PASS_AREG0 code for calling ld/st helpers
was not respecting the ABI requirement for 64-bit values
being aligned in registers.
Mirror the ARM port in use of helper functions to marshal
arguments into the correct registers.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Neither of these functions were performing double-word
compares properly.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Commit 6375e09e changed the type of TranslationBlock.tb_next,
but failed to change the type of TCGContext.tb_next.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
While swapping constants to the second operand, swap
sources matching destinations to the first operand.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Implemented with setcond if the target does not provide
the optional opcode.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Commit e31b0a7c05 fixed copy propagation on
32-bit host by restricting the copy between different types. This was the
wrong fix.
The real problem is that the all temps states should be reset at the end
of a basic block. This was done by adding such operations in the switch,
but brcond2 was forgotten (that's why the crash was only observed on 32-bit
hosts).
Fix that by looking at the TCG_OPF_BB_END instead. We need to keep the case
for op_set_label as temps might be modified through another path.
Cc: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Given the copy propagation breakage on 32-bit hosts has been fixed
commit e31b0a7c05 can be reverted.
Cc: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
set_label is effectively the end of a basic block, as no optimization
can be made accross it. It was treated as such in the liveness analysis
code, but as a special case.
Mark it with TCG_OPF_BB_END flag so that this information can be used
by other parts of the TCG code, and remove the special case in the liveness
analysis code.
Cc: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
On x86, it is possible to move a constant value to memory. Add code to
handle a constant argument to load/store ops.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Now that CONFIG_TCG_PASS_AREG0 is enabled for all targets,
remove dead code and support for !CONFIG_TCG_PASS_AREG0 case.
Remove dyngen-exec.h and all references to it. Although included by
hw/spapr_hcall.c, it does not seem to use it.
Remove unused HELPER_CFLAGS.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
optimizer.c contains some cases were the break is appearing in both the
if and the else parts. Fix that by moving it to the outer part. Also
move some common code there.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
brcond and setcond ops are not commutative, but it's easy to compute the
new condition after swapping the arguments. Try to always put the constant
argument in second position like for commutative ops, to help backends to
generate better code.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Split expression simplification in multiple parts so that a given op
can appear multiple times. This patch should not change anything.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Now that there are two passes of optimization (optimize.c, liveness)
there is no point of outputing the statistics of the liveness part
only. Update the code to take into account both optimizations.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
The load/store slow path has been broken in e141ab52d:
- We need to move 4 registers for store functions and 3 registers for
load functions and not the reverse.
- According to the s390x calling convention the arguments of a function
should be zero extended. This means that the register shift should be
done with TCG_TYPE_I64 to ensure the higher word is correctly zero
extended when needed.
I am aware that CONFIG_TCG_PASS_AREG0 is being removed and thus that
this patch can be improved, but doing so means it can also be applied to
the 1.1 and 1.2 stable branches.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
The CONFIG_TCG_PASS_AREG0 code for calling ld/st helpers was
broken in that it did not respect the ABI requirement that 64
bit values were passed in even-odd register pairs. The simplest
way to fix this is to implement some new utility functions
for marshalling function arguments into the correct registers
and stack, so that the code which sets up the address and
data arguments does not need to care whether there has been
a preceding env argument.
Based on commit 9716ef3b for ARM by Peter Maydell.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Store slow path has been broken in e141ab52d:
- the arguments are shifted before the last one (mem_index) is written.
- the shift is done for both slow and fast paths.
Fix that. Also optimize a bit by bundling the move together. This still
can be optimized, but it's better to wait for a decision to be taken on
the arguments order.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
The CONFIG_TCG_PASS_AREG0 code for calling ld/st helpers was
broken in that it did not respect the ABI requirement that 64
bit values were passed in even-odd register pairs. The simplest
way to fix this is to implement some new utility functions
for marshalling function arguments into the correct registers
and stack, so that the code which sets up the address and
data arguments does not need to care whether there has been
a preceding env argument.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
If tci_out_label is called in the context of tcg_gen_code_search_pc, we
could be overwriting an already patched relocation with zero -- and not
repatch it because the set_label is past search_pc, causing a QEMU crash
when it tries to branch to a zero label.
Not writing anything to the relocation area seems to be in line with what
other backends do from the couple I looked at (x86, ppc).
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Commit eeacee4d86 changed the syntax of tcg_dump_ops, but didn't convert
all users (notably missing the ppc ones) to it. Fix them to the new syntax.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: malc <av1474@comtv.ru>
Don't use global variables directly but via accessor functions. Rename globals.
Convert macros to functions, add GCC format attributes.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
powerpc-apple-darwin9-gcc-4.2.1 (GCC) 4.2.1 (Apple Inc. build 5577)
does not define _CALL_DARWIN, leading to unexpected behavior w.r.t.
register clobbering and stack frame layout.
Since _CALL_DARWIN is a reserved identifier, define a custom
TCG_TARGET_CALL_DARWIN based on either _CALL_DARWIN or __APPLE__.
Signed-off-by: Andreas F?rber <andreas.faerber@web.de>
Signed-off-by: malc <av1474@comtv.ru>
In qemu_ld/st load the registers for the helper calls directly rather
than rotating them around afterwards for AREG0.
Also clobber the additional register.
Signed-off-by: Andreas F?rber <afaerber@suse.de>
Signed-off-by: malc <av1474@comtv.ru>
Adjust the tcg_out_qemu_{ld,st}() slow paths to pass AREG0 in r3,
based on patches by malc.
Also adjust the registers clobbered, based on patch by Alex.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Acked-by: Alexander Graf <agraf@suse.de>
[AF: Do not hardcode r3 for AREG0, requested by Alex]
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This accounts for the additional addr_reg2 register.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Also assure i64 alignment where necessary.
Alignment code optimization suggested by malc.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
For targets where TARGET_LONG_BITS != 32, i.e. 64-bit guests,
addr_reg is moved to r4. For hosts without TCG_TARGET_CALL_ALIGN_ARGS
either data_reg2 or data_reg or a masked version thereof would overwrite
r4. Place it in r5 instead, matching TCG_TARGET_CALL_ALIGN_ARGS hosts.
This fixes immediate crashes of 64-bit guests observed on Darwin/ppc but
not on Darwin/ppc64.
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
Acked-by: malc <av1474@comtv.ru>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
w64 uses the registers rcx, rdx, r8 and r9 for function arguments,
so it needs a different declaration of tcg_target_call_iarg_regs.
rax, rcx, rdx, r8, r9, r10 and r11 may be changed by function calls.
rbx, rbp, rdi, rsi, r12, r13, r14 and r15 remain unchanged by function calls.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Not all i386 / x86_64 hosts use ELF.
Ask the compiler whether ELF is used.
On w64, gdb crashes when ELF_HOST_MACHINE is defined.
Cc: Blue Swirl <blauwirbel@gmail.com>
Acked-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
There two entries of INDEX_op_ld_i64 in the ppc_op_defs. That causes an
assertion failure in tcg_add_target_add_op_defs() when --enable-debug is
used on a ppc64 backend (that's ppc64 host, not target).
Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: malc <av1474@comtv.ru>
This allows us to actually supply a function name in softmmu builds;
gdb doesn't pick up the minimal symbol table otherwise. Also add a
bit of documentation and statically generate more of the ELF image.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
This allows us to generate unwind info for the dynamicly generated
code in the code_gen_buffer. Only i386 is converted at this point.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Optionally, make memory access helpers take a parameter for CPUState
instead of relying on global env.
On most targets, perform simple moves to reorder registers. On i386,
switch from regparm(3) calling convention to standard stack-based
version.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Use stack based calling convention (GCC default) for interfacing with
generated code instead of register based convention (regparm(3)).
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
next_tb is the numeric value of a tcg target (= QEMU host) address.
Using tcg_target_ulong instead of unsigned long shows this and makes
the code portable for hosts with an unusual size of long (w64).
The type cast '(long)(next_tb & ~3)' was not needed (casting
unsigned long to long does not change the bits, and nor does
casting long to pointer for most (= all non w64) hosts.
It is removed here.
Macro or function tcg_qemu_tb_exec is used to set next_tb.
The function also returns next_tb. Therefore tcg_qemu_tb_exec
must return a tcg_target_ulong.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
An attempt to allocate a large memory chunk after a small one resulted in
circular links in list of pools. It caused the same memory being
allocated twice for different arrays.
Now pools for large memory chunks are kept in separate list and are
freed during pool reset because current allocator can not reuse them.
Signed-off-by: Kirill Batuzov <batuzovk@ispras.ru>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Scripted conversion:
for file in *.[hc] hw/*.[hc] hw/kvm/*.[hc] linux-user/*.[hc] linux-user/m68k/*.[hc] bsd-user/*.[hc] darwin-user/*.[hc] tcg/*/*.[hc] target-*/cpu.h; do
sed -i "s/CPUState/CPUArchState/g" $file
done
All occurrences of CPUArchState are expected to be replaced by QOM CPUState,
once all targets are QOM'ified and common fields have been extracted.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
tcg_out_label is always called with a third argument of pointer type
which was casted to tcg_target_long.
These casts can be avoided by changing the prototype of tcg_out_label.
There was also a cast to long. For most hosts with
sizeof(long) == sizeof(tcg_target_long) == sizeof(void *) this did not
matter, but for w64 it was wrong. This is fixed now.
Cc: Blue Swirl <blauwirbel@gmail.com>
Cc: Richard Henderson <rth@twiddle.net>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
The TCG targets i386 and tci needed a change of the function
prototype for w64.
This change is currently not needed for the other TCG targets,
but it can be applied to avoid code differences.
Cc: Blue Swirl <blauwirbel@gmail.com>
Cc: Andrzej Zaborowski <balrogg@gmail.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Alexander Graf <agraf@suse.de>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
flush_icache_range takes two address parameters which must be large
enough to address any address of the host.
For hosts with sizeof(unsigned long) == sizeof(void *), this patch
changes nothing. All currently supported hosts fall into this category.
For w64 hosts, sizeof(unsigned long) is 4 while sizeof(void *) is 8,
so the use of tcg_target_ulong is needed for i386 and tci (the tcg
targets which work with w64).
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
This change makes tcg_target_ulong available in tcg-target.h.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
The standard include files are already included in qemu-common.h.
malloc.h and alloca.h were needed for alloca() which was removed
from TCG code some years ago when switching from dyngen to TCG
(see commit 49516bc0d6).
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
ARM still doesn't support 16GB buffers in 32-bit modes, replace the
16GB by 16MB in the comment.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
On ARM, in Thumb mode r7 is used for the framepointer; this meant
that we would fail to compile in debug mode because we were using r7
for TCG_AREG0. Shift to r6 instead to avoid this clash.
(Bug reported as LP:870990.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andrzej Zaborowski <andrew.zaborowski@intel.com>
On ARM, don't map the code buffer at a fixed location, and fix up the
call/goto tcg routines to let it do long jumps.
Mapping the code buffer at a fixed address could sometimes result in it being
mapped over the top of the heap with pretty random results.
Signed-off-by: Dr. David Alan Gilbert <david.gilbert@linaro.org>
Signed-off-by: Andrzej Zaborowski <andrew.zaborowski@intel.com>
Make tcg_const_ptr() include a cast so that you can pass it a
pointer. This allows us to drop the casts we had in all the places
that use this macro.
Acked-by: Andreas Färber <andreas.faerber@web.de>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
TCG_TARGET_REG_BITS is declared in tcg.h for all TCG targets.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
This is standard for other tcg targets and improves tci, too.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
In both cases, val is computed, but then not used in the
subsequent line, which then re-computes the quantity in
a different type (int32_t vs unsigned long).
Keep the computation type that's been working so far.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
* 's390-1.0' of git://repo.or.cz/qemu/agraf:
s390x: initialize virtio dev region
tcg: Use TCGReg for standard tcg-target entry points.
tcg: Standardize on TCGReg as the enum for hard registers
s390x: Add shutdown for TCG s390-virtio machine
s390: Fix cpu shutdown for KVM
s390: fix short kernel command lines
s390: fix reset hypercall to reset the status
s390x: implement SIGP restart and shutdown
s390x: implement rrbe instruction properly
s390x: update R and C bits in storage key
s390x: make ipte 31-bit aware
s390x: add ldeb instruction
Including tcg_out_ld, tcg_out_st, tcg_out_mov, tcg_out_movi.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
Most targets did not name the enum; tci used TCGRegister.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
tcg/ppc64/tcg-target.c has a couple of places where variables are set
unconditionally, but otherwise used only for softmmu builds, not
userspace only builds. This causes compiler warnings (which are fatal
by default) when compiling for a ppc64 host with gcc 4.6. This patch
fixes the problem by moving the code which defines and sets the
variables into the CONFIG_SOFTMMU guarded regions.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
* 'tci' of git://qemu.weilnetz.de/qemu:
tcg: Add tcg interpreter to configure / make
tcg: Add tci disassembler
tcg: Add interpreter for bytecode
tcg: Add bytecode generator for tcg interpreter
tcg: Make ARRAY_SIZE(tcg_op_defs) globally available
tcg: TCG targets may define tcg_qemu_tb_exec
The error being caused by the failure to copy the other half of
the input to the output after having narrowed the deposit operation.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: malc <av1474@comtv.ru>
Unlike other tcg target code generators, this one does not generate
machine code for some cpu. It generates machine independent bytecode
which is interpreted later.
This allows running QEMU on any host.
Interpreted bytecode is slower than direct execution of generated
machine code.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
tcg_op_defs was already a global array.
The tci disassembler also needs ARRAY_SIZE(tcg_op_defs),
so add a new global constant with this value.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Targets may use a non standard definition of tcg_tb_exec
by defining this macro in their tcg_target.h.
This is used here by ppc. It will be used by the TCG interpreter, too.
Cc: malc <av1474@comtv.ru>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
If the deposit replaces the entire word, optimize to a move.
If we're inserting to the top of the word, avoid the mask of arg2
as we'll be shifting out all of the garbage and shifting in zeros.
If the host is 32-bit, reduce a 64-bit deposit to a 32-bit deposit
when possible.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Remove the unused function tcg_out_addi() from the s390 TCG backend;
this brings it into line with other backends.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Remove the unused function tcg_out_addi() from the ia64 TCG backend;
this brings it into line with other backends.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
x86 cannot provide an optimized generic deposit implementation. But at
least for a few special cases, namely for writing bits 0..7, 8..15, and
0..15, versions using only a single instruction are feasible.
Introducing such limited support improves emulating 16-bit x86 code on
x86, but also rarer cases where 32-bit or 64-bit code accesses bytes or
words.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Remove the unused function tcg_out_addi() from the ARM TCG backend;
this fixes a compilation failure on ARM hosts with newer gcc.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
These functions are defined in the tcg target specific file
tcg-target.c.
The forward declarations assert that every tcg target uses
the same function prototype.
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
It is now declared for all tcg targets in tcg.h,
so the tcg target specific declarations are redundant.
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
TCG_TARGET_REG_BITS can be determined by the compiler,
so there is no need to declare it for each individual tcg target.
This is especially important for new tcg targets
which will be supported by the tcg interpreter.
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
The second register is only needed for 32 bit hosts.
Cc: Vassili Karpov <av1474@comtv.ru>
Fine-with-me'd-by: Vassili Karpov <av1474@comtv.ru>
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
The second register is only needed for 32 bit hosts.
Cc: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
The second register is only needed for 32 bit hosts.
Cc: Alexander Graf <agraf@suse.de>
Acked-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
The second register is never used for ia64 hosts.
Cc: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
The second register is only needed for 32 bit hosts.
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
The ppc64 code generation backend uses an rldicr (Rotate Left Double
Immediate and Clear Right) instruction to implement zero extension of
a 32 bit quantity to a 64 bit quantity (INDEX_op_ext32u_i64). However
this is wrong - this instruction clears specified low bits of the
value, instead of high bits as we require for a zero extension. It
should instead use an rldicl (Rotate Left Double Immediate and Clear
Left) instruction.
Presumably amongst other things, this causes the SLOF firmware image
used with -M pseries to not boot on a ppc64 host.
It appears this bug was exposed by commit
0bf1dbdcc9 (tcg/ppc64: fix 16/32 mixup)
which enabled the use of the op_ext32u_i64 operation on the ppc64
backend.
Signed-off-by: Thomas Huth <thuth@de.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: malc <av1474@comtv.ru>
Move the declaration and initialisation of some variables in
tcg_out_qemu_ld and tcg_out_qemu_st inside CONFIG_SOFTMMU, to
avoid the "variable set but not used" warning of gcc 4.6.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: malc <av1474@comtv.ru>
Use enum TCGOpcode instead of plain old int so that the name of
current op can be seen in GDB. Add a default case to switch
so that GCC does not complain about unhandled enum cases.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
By always defining these symbols, we can eliminate a lot of ifdefs.
To allow this to be checked reliably, the semantics of the
TCG_TARGET_HAS_* macros must be changed from def/undef to true/false.
This allows even more ifdefs to be removed, converting them into
C if statements.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
This allows the simplification of the op_bits function from
tcg/optimize.c.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Copy propagation introduced in 22613af4a6
considered only global registers. However, register temps and stack
allocated locals must be handled differently because register temps
don't survive across brcond.
Fix by propagating only within same class of temps.
Tested-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Fix breakage by a640f03178
and 55c0975c5b.
Some TCG targets don't implement all TCG ops, so make
optimizing those conditional.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Perform constant folding for NOT and EXT{8,16,32}{S,U} operations.
Signed-off-by: Kirill Batuzov <batuzovk@ispras.ru>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Perform actual constant folding for ADD, SUB and MUL operations.
Signed-off-by: Kirill Batuzov <batuzovk@ispras.ru>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Make tcg_constant_folding do copy and constant propagation. It is a
preparational work before actual constant folding.
Signed-off-by: Kirill Batuzov <batuzovk@ispras.ru>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Added file tcg/optimize.c to hold TCG optimizations. Function tcg_optimize
is called from tcg_gen_code_common. It calls other functions performing
specific optimizations. Stub for constant folding was added.
Signed-off-by: Kirill Batuzov <batuzovk@ispras.ru>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
cppcheck reports an error:
qemu/tcg/mips/tcg-target.c:1487: error: Invalid number of character (()
The unpatched code won't compile on mips hosts starting with commit
cea5f9a28f.
Cc: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Expand the note on the number of TCG ops generated per target insn,
to be clearer about the range of applicability of the 20 op rule
of thumb. Also add a note about the hard MAX_OP_PER_INSTR limit.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Use stack instead of temp_buf array in CPUState for TCG temps.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Use TCG_REG_CALL_STACK instead of TCG_REG_SP for consistency.
Acked-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Use stack instead of temp_buf array in CPUState for TCG temps.
On Sparc64, stack pointer is not aligned but there is a fixed bias of 2047,
so don't try to enforce alignment.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Except for specific cases where the use of %esp changes the encoding of
the instruction, it's cleaner to use TCG_REG_CALL_STACK instead of
TCG_REG_ESP.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
The code for stack allocation for call arguments is way too simplistic
to actually work on targets with non-trivial stack allocation policies,
e.g. ppc64. We've also already allocated TCG_STATIC_CALL_ARGS_SIZE worth
of stack for calls which should be well more than any helper needs.
Remove broken dynamic stack allocation code and replace it with an assert.
Should dynamic stack allocation ever be needed again, target specific
functions should be added.
Thanks to Richard Henderson for the analysis.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Make functions take a parameter for CPUState instead of relying
on global env. Pass CPUState pointer to TCG prologue, which moves
it to AREG0.
Thanks to Peter Maydell and Laurent Desnogues for the ARM prologue
change.
Revert the hacks to avoid AREG0 use on Sparc hosts.
Move cpu_has_work() and cpu_pc_from_tb() from exec.h to cpu.h.
Compile the file without HELPER_CFLAGS.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Based on a patch from Hans de Goede <hdegoede@redhat.com>
This warning is new in gcc 4.6.
Acked-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Christophe Fergeau <cfergeau@redhat.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
When compiling with DEBUG_TCGV enabled, make the TCGv_ptr type distinct
from TCGv_i32/TCGv_i64. This means that using an i32 or i64 TCG op to
manipulate a TCGv_ptr will always be detected at compile time, rather
than only if compiling on a host system with the other word size.
NB: the tcg_add_ptr and tcg_sub_ptr macros have been removed as they
were not used anywhere.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
The prototypes for the ld/st functions on a 64 bit host declared
the address parameter as a TCGv_i64 rather than a TCGv_ptr. This
worked OK (since the two are aliases), but needs to be fixed to
allow extension of TCG type debugging to i64/i32/ptr mismatches.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Use the correct header in the TCG MIPS code to find cacheflush() on OpenBSD
to fix compilation of the MIPS host support for OpenBSD/mips64 based architecures.
Signed-off-by: Brad Smith <brad@comstyle.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
If an op with dead outputs is not removed, because it has side effects
or has multiple output and only one dead, mark the registers as dead
instead of saving them. This avoid a few register spills on TCG targets
with low register count, especially with div2 and mul2 ops, or when a
qemu_ld* result is not used (prefetch emulation for example).
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
If an op is not removed and has dead output arguments, mark it
in op_dead_args similarly to what is done for input arguments.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Allow all args to be dead by replacing the input specific op_dead_iargs
variable by op_dead_args. Note this is a purely mechanical change.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Although the TCG generated code is always in ARM mode, it is possible
that the host code was compiled by gcc in Thumb mode (this is often the
default for Linux distributions targeting ARM v7 only). Handle this
by using BLX imm when doing a call from ARM into Thumb mode.
Since BLX imm is not a conditionalisable instruction, we make
tcg_out_call() no longer take a condition code; we were only ever
using it with COND_AL anyway.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andrzej Zaborowski <andrew.zaborowski@intel.com>
Add support (if CONFIG_DEBUG_TCG is defined) for debugging leakage
of temporary variables. Generally any temporaries created by
a target while it is translating an instruction should be freed
by the end of that instruction; otherwise carefully crafted
guest code could cause TCG to run out of temporaries and assert.
By calling tcg_check_temp_count() after each instruction we can
check that we are not leaking temporaries in this way.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Add a comment about cache coherency and retranslation, so that people
developping new targets based on existing ones are warned of the issue.
Acked-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Improve constant loading in two ways:
- On all ARM versions, it's possible to load 0xffffff00 = -0x100 using
the mvn rd, #0. Fix the conditions.
- On <= ARMv6 versions, where movw and movt are not available, load the
constants using mov and orr with rotations depending on the constant
to load. This is very useful for example to load constants where the
low byte is 0. This reduce the generated code size by about 7%.
Also fix the coding style at the same time.
Cc: Andrzej Zaborowski <balrog@zabor.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
TCG on MIPS was trying to avoid changing the branch offset, but didn't
due to a stupid typo. Fix it.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Due to a typo, qemu_st64 doesn't properly byteswap the 32-bit low word of
a 64 bit word before saving it. This patch fixes that.
Acked-by: Andrzej Zaborowski <balrogg@gmail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
QEMU uses code retranslation to restore the CPU state when an exception
happens. For it to work the retranslation must not modify the generated
code. This is what is currently implemented in ARM TCG.
However on CPU that don't have icache/dcache/memory synchronised like
ARM, this requirement is stronger and code retranslation must not modify
the generated code "atomically", as the cache line might be flushed
at any moment (interrupt, exception, task switching), even if not
triggered by QEMU. The probability for this to happen is very low, and
depends on cache size and associativiy, machine load, interrupts, so the
symptoms are might happen randomly.
This requirement is currently not followed in tcg/arm, for the
load/store code, which basically has the following structure:
1) tlb access code is written
2) conditional fast path code is written
3) branch is written with a temporary target
4) slow path code is written
5) branch target is updated
The cache lines corresponding to the retranslated code is not flushed
after code retranslation as the generated code is supposed to be the
same. However if the cache line corresponding to the branch instruction
is flushed between step 3 and 5, and is not flushed again before the
code is executed again, the branch target is wrong. In the guest, the
symptoms are MMU page fault at a random addresses, which leads to
kernel page fault or segmentation faults.
The patch fixes this issue by avoiding writing the branch target until
it is known, that is by writing only the branch instruction first, and
later only the offset.
This fixes booting linux guests on ARM hosts (tested: arm, i386, mips,
mipsel, sh4, sparc).
Acked-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
The usermode version of qemu_ld doesn't used mem_index,
leading to set-but-not-used warnings.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Edgar E. Iglesias <edgar@axis.com>
A typo in the usermode address calculation path; R3 used where R2 needed.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Edgar E. Iglesias <edgar@axis.com>
Use ld4 not ld8 for reading the tlb of 32-bit targets.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Edgar E. Iglesias <edgar@axis.com>
The port was not properly merged following
86feb1c860
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Edgar E. Iglesias <edgar@axis.com>
Fix compilation error when GUEST_BASE is not defined.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Edgar E. Iglesias <edgar@axis.com>
The arguments to tcg_gen_helper32 for these functions were not
updated correctly in rev 2bece2c883.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Edgar E. Iglesias <edgar@axis.com>
fprintf_function uses format checking with GCC_FMT_ATTR.
Cc: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
When qemu is configured with --enable-debug-tcg,
gcc throws this warning (or error with -Werror):
tcg/tcg.c:1030: error: comparison of unsigned expression >= 0 is always true
Fix it by removing the >= 0 part.
The type cast to 'unsigned' catches negative values of op
(which should never happen).
This is a modification of Hollis Blanchard's patch.
Cc: Hollis Blanchard <hollis@penguinppc.org>
Cc: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
3b6dac3416 apparently broke the ppc64 TCG target
compilation in the code path without guest base.
Reverting this line fixes the build.
Signed-off-by: Andreas F?rber <andreas.faerber@web.de>
Cc: malc <av1474@comtv.ru>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Richard Henderson <rth@twiddle.net>
Signed-off-by: malc <av1474@comtv.ru>
5da79c86a3 broke compilation on Mac OS X v10.5 ppc.
Apple's GCC 4.0.1 does not define _CALL_DARWIN. Recognize __APPLE__ again as well.
Signed-off-by: Andreas F?rber <andreas.faerber@web.de>
Cc: malc <av1474@comtv.ru>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: J?rgen Lock <nox@jelal.kn-bremen.de>
Cc: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: malc <av1474@comtv.ru>
Original patch from Ulrich Hecht, further work from Alexander Graf
and Richard Henderson.
Cc: Ulrich Hecht <uli@suse.de>
Cc: Alexander Graf <agraf@suse.de>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
We need not reserve the register unless we're going to use it.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: malc <av1474@comtv.ru>
Some hosts (amd64, ia64) have an ABI that ignores the high bits
of the 64-bit register when passing 32-bit arguments. Others
require the value to be properly sign-extended for the type.
I.e. "int32_t" must be sign-extended and "uint32_t" must be
zero-extended to 64-bits.
To effect this, extend the "sizemask" parameter to tcg_gen_callN
to include the signedness of the type of each parameter. If the
tcg target requires it, extend each 32-bit argument into a 64-bit
temp and pass that to the function call.
This ABI feature is required by sparc64, ppc64 and s390x.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Before gcc 4.2, __builtin___clear_cache doesn't exist, and
afterward the gcc s390 backend implements it as nothing.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Both tcg_target_init and tcg_target_qemu_prologue
are unused outside of tcg.c.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Mirror tcg_out_movi in having a TYPE parameter. This allows x86_64
to perform the move at the proper width, which may elide a REX prefix.
Introduce a TCG_TYPE_REG enumerator to represent the "native width"
of the host register, and to distinguish the usage from "pointer data"
as represented by the existing TCG_TYPE_PTR.
Update all targets to match.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Make fallthru be TLB hit and branch be TLB miss. Doing this
both improves branch prediction and will allow further cleanup.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Splitting out these functions will allow further cleanups.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Include it in the opcode as an extension, as with P_EXT
or the REX bits in the x86-64 port.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
(1) The output registers were not marked call-clobbered, even though
they can be modified by called functions.
(2) The thread pointer was not marked reserved.
(3) R4-R6 are call-saved, but not saved by the prologue. Rather than
save them, mark them reserved so that we don't use them.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Now that the prologue is generated after GUEST_BASE is fixed,
we can load it as an immediate, and also avoid reserving the
register if it isn't necessary.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
This will allow backends to make intelligent choices about how
to implement GUEST_BASE.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
The result is shorter than the mov+add that TCG would
otherwise generate for us.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Implement full modrm+sib addressing mode processing.
Use that in qemu_ld/st to output the LEA.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Define OPC_GRP3 and EXT3_FOO to match. Use them instead of
bare constants.
Define OPC_GRP5 and rename the existing EXT_BAR to EXT5_BAR to
make it clear which extension should be used with which opcode.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Define OPC_CALL_Jz, generated by tcg_out_calli; use the later
throughout. Unify the calls within qemu_st; adjust the stack
with a single pop if applicable.
Define and use EXT_CALLN_Ev for indirect calls.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Move tcg_out_push/pop up in the file so that they can be used
by qemu_ld/st. Define a tcg_out_pushi to be used as well.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Add more OPC values, and tgen_arithr. Use the later throughout.
Note that normal reg/reg arithmetic now uses the Gv,Ev opcode form
instead of the Ev,Gv opcode form used previously. Both forms
disassemble properly, and so there's no visible change when diffing
log files before and after the change. This change makes the operand
ordering within the output routines more natural, and avoids the need
to define an OPC_ARITH_EvGv since a read-modify-write with memory is
not needed within TCG.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Define OPC_ARITH_EvI[bz]; use throughout. Use tcg_out_ext8u
directly in setcond. Use tgen_arithi in qemu_ld/st.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Define OPC_MOVSBL and OPC_MOVSWL. Factor opcode emission to
separate functions.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Define OPC_MOVZBL and OPC_MOVZWL. Factor opcode emission to
separate functions.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Define OPC_JCC*, OC_JMP*, and EXT_JMPN_Ev. Use them throughout.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
If the address register overlaps one of the output registers
simply issue the clobbering load last, rather than emitting
an extra move of the address register.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Define OPC_MOVB* and OPC_MOVL*; use them throughout.
Use tcg_out_ld/st instead of bare tcg_out_modrm_offset
when it makes sense.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Setting the registers one by one is easier to read, and gets
optimized by the compiler just the same.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
tcg_out_reloc is only used locally (in */target.c which is
included in tcg.c).
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Branch offsets should only be overwritten during relocation,
to support partial retranslation.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Issue the tlb load as early as possible and perform the address
masking while the load is completing.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Along the tlb hit path, we were modifying the variables holding the input
register numbers, which lead to incorrect expansion of the tlb miss path.
Fix this by extracting the tlb hit path to separate functions with their
own local variables. This also makes the difference between softmmu and
user-only easier to read.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Load from the guest_base variable rather than embed a constant.
Always reserve TCG_GUEST_BASE_REG if guest base support enabled.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Define "M" constraint for and_mask_p and "O" constraint for or_mask_p.
Assume that inputs are correct in tcg_out_ori and tcg_out_andi.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
The beginning of the register allocation order list on the TCG arm
target matches the list of clobbered registers. This means that when an
helper is called, there is almost always clobbered registers that have
to be spilled.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
64-bit arguments should be aligned on an even register as specified
by the "Procedure Call Standard for the ARM Architecture".
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
addr_reg, data_reg and data_reg2 can't be register r0 or r1 du to the
constraints. Don't check if they equals these registers.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
On big endian targets, data arguments of qemu_ld/st ops have to be
byte swapped. Two temporary registers are needed for qemu_st to do
the bswap. r0 and r1 are used in system mode, do the same in user
mode, which implies reworking the constraints.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
While it make sense to pass a conditional argument to tcg_out_*()
functions as the ARM architecture allows that, it doesn't make sense
for qemu_ld/st functions. These functions use comparison instructions
and conditional execution already, so it is not possible to use a
second level of conditional execution.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Add an bswap16 and bswap32 ops, either using the rev and rev16
instructions on ARMv6+ or shifts and logical operations on previous
ARM versions. In both cases the result use less instructions than
the pure TCG version.
These ops are also needed by the qemu_ld/st functions.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Add an ext16u op, either using the uxth instruction on ARMv6+ or two
shifts on previous ARM versions. In both cases the result use the same
number or less instructions than the pure TCG version.
Also move all sign extension code to separate functions, so that they
can be reused in other parts of the code.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Use a set of variables to define the allowed ARM instructions, depending
on the __ARM_ARCH_*__ GCC defines.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
The TCG ARM backends uses integer values to refer to both immediate
values and register number. This makes the code difficult to read.
The patch below replaces all (if I haven't miss any ;-) integer values
representing register number by TCG_REG_* enum values.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Instead of writing very compact code, declare all registers that are
clobbered or reserved one by one. This makes the code easier to read.
Also declare all the 16 registers to TCG, and mark pc as reserved.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
There is no need to save the LR register (r14) before a call to a
subroutine. According to the "Procedure Call Standard for the ARM
Architecture", it is the job of the callee to save this register.
Moreover, this register is already saved in the prologue/epilogue.
This patch removes the disabled SAVE_LR code, as there is no need to
reenable later.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
There's a header file inclusion ordering problem between cpu-all.h
and qemu-timer.h, such that cpu_get_real_ticks is not defined when
we attempt to use it in profile_getclock.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Register fp (frame pointer) is a bad choice for compilations
without optimisation, because the compiler makes heavy use
of this register (so the resulting code crashes).
Register s0 had been used for TCG_AREG1 in earlier releases,
but was no longer used and is now free for TCG_AREG0.
The resulting code works for compilations without
optimisation (tested with qemu mips in qemu mips
on x86 host).
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
PA-RISC uses procedure descriptors. We'd need to emit a call to
the millicode routine $$dyncall. However, this situation doesn't
actually arise, since we always have the descriptor available at
TCG code generation time.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Handle the output log part overlapping the input high parts.
Also, improve sub2 to handle some constants the second input low part.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Delete inline functions from tcg-target.h that don't need to be there,
move the others to tcg-target.c. Add 'Z', 'I', 'J' constraints for
0, signed 11-bit, and signed 5-bit respectively. Add GUEST_BASE support
similar to ppc64, with the value stored in a register. Add missing
registers to reg_alloc_order. Add support for 12-bit branch relocations.
Add functions for synthetic operations: addi, mtctl, dep, shd, vshd, ori,
andi, shifts, rotates, multiply, branches, setcond. Split out TLB reads
from qemu_ld and qemu_st; fix argument loading for tlb external calls.
Generate the prologue.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Historically the qemu tlb "addend" field was used for both RAM and IO accesses,
so needed to be able to hold both host addresses (unsigned long) and guest
physical addresses (target_phys_addr_t). However since the introduction of
the iotlb field it has only been used for RAM accesses.
This means we can change the type of addend to unsigned long, and remove
associated hacks in the big-endian TCG backends.
We can also remove the host dependence from target_phys_addr_t.
Signed-off-by: Paul Brook <paul@codesourcery.com>
A few words about design choices:
* On IA64, instructions should be grouped by bundle, and dependencies
between instructions declared. A first version of this code tried to
schedule instructions automatically, but was very complex and too
invasive for the current common TCG code (ops not ending at
instruction boundaries, code retranslation breaking already generated
code, etc.) It was also not very efficient, as dependencies between
TCG ops is not available.
Instead the option taken by the current implementation does not try
to fill the bundle by scheduling instructions, but by providing ops
not available as an ia64 instruction, and by offering 22-bit constant
loading for most of the instructions. With both options the bundle are
filled at approximately the same level.
* Up to 128 registers can be affected to a function on IA64, but TCG
limits this number to 64, which is actually more than enough. The
register affectation is the following:
- r0: used to map a constant argument with value 0
- r1: global pointer
- r2, r3: internal use
- r4 to r6: not used to avoid saving them
- r7: env structure
- r8 to r11: free for TCG (call clobbered)
- r12: stack pointer
- r13: thread pointer
- r14 to r31: free for TCG (call clobbered)
- r32: reserved (return address)
- r33: reserved (PFS)
- r33 to r63: free for TCG
* The IA64 architecture has only 64-bit registers and no 32-bit
instructions (the only exception being cmp4). Therefore 64-bit
registers and instructions are used for 32-bit ops. The adopted
strategy is the same as the ABI, that is the higher 32 bits are
undefined. Most ops (and, or, add, shl, etc.) can directly use
the 64-bit registers, while some others have to sign-extend (sar,
div, etc.) or zero-extend (shr, divu, etc.) the register first.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Commit 86feb1c860
did not change all occurrences of INDEX_op_qemu_ld32u
for tcg/arm.
Please note that I could not test this patch
(I have currently no arm system available).
Cc: Richard Henderson <rth@twiddle.net>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Some targets (e.g. Alpha and MIPS64) need to keep 32-bit operands
sign-extended in 64-bit registers (regardless of the "real" sign
of the operand). For that, we need to be able to distinguish
between a 32-bit load with a 32-bit result and a 32-bit load with
a given extension to a 64-bit result. This distinction already
exists for the ld* loads, but not the qemu_ld* loads.
Reserve qemu_ld32u for 64-bit outputs and introduce qemu_ld32 for
32-bit outputs. Adjust all code generators to match.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
The TCGType name was already used consistently. Changing it
to an enumeration instead of a set of defines aids debugging.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Use the TCGCond enumeration type in the brcond and setcond
related prototypes in tcg-op.h and each code generator.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Give the enumeration formed from tcg-opc.h a name: TCGOpcode.
Use that enumeration type instead of "int" whereever appropriate.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
There is no need to save r7, it is used to store the address
of the env structure and is not modified by GCC.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
TCG internal helpers only access to the values passed in arguments, and
do not modify the CPU internal state. Thus they can be declared as
const and pure.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Some targets like ARM would benefit to use 32-bit helpers for
div/rem/divu/remu.
Create a #define for div2 so that targets can select between
div, div2 and helper implementation. Use the helper version if none
of the #define are present.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Since commit 6113d6d316 QEMU crashes
on ARM hosts. This is not a bug of this commit, but a latent bug
revealed by this commit.
The TCG code is called through a procedure call using the prologue
and epilogue code. This code does not save and restore enough registers.
The "Procedure Call Standard for the ARM Architecture" says:
A subroutine must preserve the contents of the registers r4-r8, r10,
r11 and SP (and r9 in PCS variants that designate r9 as v6).
The current code only saves and restores r9 to r11, and misses r4 to
r8. The patch fixes that by saving r4 to r12. Theoretically there is
no need to save and restore r12, but an even number of registers have
to be saved as per EABI.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Fix error:
CC sparc-bsd-user/op_helper.o
In file included from /src/qemu/tcg/tcg.c:158:
/src/qemu/tcg/sparc/tcg-target.c:728:5: "TARGET_PHYS_ADDR_BITS" is not defined
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
When restoring register values, increase the stack register for skipped
values.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Andrzej Zaborowski <andrew.zaborowski@intel.com>
On 32-bit hosts op_qemu_ld32s is unused. Remove it to fix the
following assertion failure:
qemu-alpha: tcg/tcg.c:1055:
tcg_add_target_add_op_defs: Assertion `tcg_op_defs[op].used' failed.
Signed-off-by: Jay Foad <jay.foad@gmail.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Previously ORC was always implemented by tcg-op.h with
an explicit NOT opcode. Allow a target implementation.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Previously ANDC was always implemented by tcg-op.h with
an explicit NOT opcode. Allow a target implementation.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>