If we execute linux-user code that does the following:
* A = mmap()
* execute code in A
* munmap(A)
* B = mmap(), but mmap returns the same address as A
* execute code in B
we end up executing a stale cached tb that contains translated code
from A, while we want new code from B.
This patch adds a TB flush for mmap'ed regions, before we return them,
avoiding the whole issue. It also adds a flush for munmap, so that we
don't execute stale TBs instead of getting a segfault.
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Riku Voipio <riku.voipio@linaro.org>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Change the data type of tci_tb_ptr, so GETPC() returns an
uintptr_t now (like for all other TCG targets).
This completes commit 2050396801
and fixes builds with TCI.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Allow TB invalidation by its physical address, extract implementation
from the breakpoint_invalidate function.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Use uintptr_t instead of void * or unsigned long in
several op related functions, env->mem_io_pc and
GETPC() macro.
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
cpu_io_recompile terminates by calling either cpu_abort or
cpu_resume_from_signal which both never return.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
QEMU host addresses must use uintptr_t to be portable for hosts with
an unusual size of long (w64).
tb_jmp_offset is an uint16_t value, therefore the local variable offset
in function tb_set_jmp_target was changed from unsigned long to uint16_t.
The type cast to long in function tb_add_jump now also uses uintptr_t.
For the bit operation used here, the signedness of the type cast does
not matter.
Some remaining unsigned long values are either only used for ARM assembler
code or will be fixed in a later patch for PPC.
v2:
Fix signature of tb_find_pc in exec.c, too (hint from Blue Swirl, thanks).
There remain lots of other long / unsigned long in exec.c which must be
replaced by uintptr_t. This will be done in a separate patch. Here
only one of these type casts is fixed.
v3:
Also fix signature of page_unprotect.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Optionally, make memory access helpers take a parameter for CPUState
instead of relying on global env.
On most targets, perform simple moves to reorder registers. On i386,
switch from regparm(3) calling convention to standard stack-based
version.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Scripted conversion:
for file in *.[hc] hw/*.[hc] hw/kvm/*.[hc] linux-user/*.[hc] linux-user/m68k/*.[hc] bsd-user/*.[hc] darwin-user/*.[hc] tcg/*/*.[hc] target-*/cpu.h; do
sed -i "s/CPUState/CPUArchState/g" $file
done
All occurrences of CPUArchState are expected to be replaced by QOM CPUState,
once all targets are QOM'ified and common fields have been extracted.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
The return value of cpu_register_io_memory() is no longer used anywhere, so
we can remove it and all associated data and code.
Signed-off-by: Avi Kivity <avi@redhat.com>
Instead of indirecting via io_mem_region, dispatch directly
through the MemoryRegion obtained from the iotlb or phys_page_find().
Signed-off-by: Avi Kivity <avi@redhat.com>
Now that all mmio goes through MemoryRegions, we can convert
io_mem_opaque to be a MemoryRegion pointer, and remove the thunks
that convert from old-style CPU{Read,Write}MemoryFunc to MemoryRegionOps.
Signed-off-by: Avi Kivity <avi@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Its use of IO_MEM_ROM and friends will later cause #include loops; and it
is too large to merit inlining.
Signed-off-by: Avi Kivity <avi@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
The code sometimes uses range comparisons on io indexes (e.g.
index =< IO_MEM_ROM). Avoid these as they make moving to objects harder.
Signed-off-by: Avi Kivity <avi@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Currently mmio access goes directly to the io_mem_{read,write} arrays.
In preparation for eliminating them, add indirection via a function.
Signed-off-by: Avi Kivity <avi@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Unlike other tcg target code generators, this one does not generate
machine code for some cpu. It generates machine independent bytecode
which is interpreted later.
This allows running QEMU on any host.
Interpreted bytecode is slower than direct execution of generated
machine code.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Adding an offset to a void pointer works with gcc but is not allowed
by the current C standards. With -pedantic, gcc complains:
exec-all.h:344: error: pointer of type ‘void *’ used in arithmetic
Fix this, and also replace (unsigned long) by (uintptr_t) in the same
statement.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
GETPC() can be used even from outside of helper code. Move the macro to
a more accessible location. Avoid a compile warning from redefining it in exec.c.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
cea5f9a28f exposed bugs in unassigned memory
access handling. Fix them by always passing CPUState to the handlers.
Reported-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
The target-arm frontend's worst-case TCG ops per instr is 194 (and in
general many of the "load multiple registers" ARM instructions generate
more than 100 TCG ops). Raise MAX_OP_PER_INSTR accordingly to avoid
possible buffer overruns.
Since it doesn't make any sense for the "64 bit guest on 32 bit host"
case to have a smaller limit than the normal case, we collapse the
two cases back into each other again.
(This increase costs us about 14K in extra static buffer space and
21K of extra margin at the end of a 32MB codegen buffer.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Move functions cpu_has_work() and cpu_pc_from_tb() from exec.h to cpu.h. This is
needed by later patches.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
* 's390-next' of git://repo.or.cz/qemu/agraf:
s390x: complain when allocating ram fails
s390x: fix memory detection for guests > 64GB
s390x: change mapping base to allow guests > 2GB
s390x: Fix debugging for unknown sigp order codes
s390x: build s390x by default
s390x: remove compatibility cc field
s390x: Adjust GDB stub
s390x: translate engine for s390x CPU
s390x: Adjust internal kvm code
s390x: Implement opcode helpers
s390x: helper functions for system emulation
s390x: Shift variables in CPUState for memset(0)
s390x: keep hint on virtio managing size
s390x: make kvm exported functions conditional on kvm
s390x: s390x-linux-user support
tcg: extend max tcg opcodes when using 64-on-32bit
s390x: fix smp support for kvm
tb_invalidate_page_range() was intended to be used to invalidate an
area of a TB which the guest explicitly flushes from i-cache. However,
QEMU detects writes to code areas where TBs have been generated, so
his has never been useful.
Delete the function, adjust callers.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
When running a 64 bit guest on a 32 bit host, we tend to use more TCG ops
than on a 64 bit host. Reflect that in the reserved opcode amount constant.
Signed-off-by: Alexander Graf <agraf@suse.de>
The previous patch removed the need for parameter puc.
Is is now unused, so remove it.
Cc: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Function gen_pc_load was introduced in commit
d2856f1ad4.
The only reason for parameter searched_pc was
a debug statement in target-i386/translate.c.
Parameter puc was needed by target-sparc until
commit d7da2a1040.
Remove searched_pc from the debug statement and remove both
parameters from the parameter list of gen_pc_load.
As the function name gen_pc_load was also misleading,
it is now called restore_state_to_opc. This new name
was suggested by Peter Maydell, thanks.
v2: Remove last parameter, too, and rename the function.
v3: Fix [] typo in target-arm/translate.c.
Fix wrong SHA1 object name in commit message (copy+paste error).
Cc: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
This function is only used within exec.c, so no need to make it public.
Signed-off-by: Tristan Gingold <gingold@adacore.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Most of emulated CPU have instructions aligned on 16 or 32 bits, while
on others GCC tries to align the target jump location. This means that
1/2 or 3/4 of tb_phys_hash entries are never used.
Update the hash function tb_phys_hash_func() to ignore the two lowest
bits of the address. This brings a 6% speed-up when booting a MIPS
image.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Use __builtin___clear_cache() instead of __clear_cache() to avoid having
to define the function as extern. Fix the following warning:
| In file included from qemu/cpus.c:34:
| qemu/exec-all.h: In function 'tb_set_jmp_target1':
| qemu/exec-all.h:208: error: nested extern declaration of '__clear_cache'
| make[1]: *** [cpus.o] Error 1
| make: *** [subdir-i386-softmmu] Error 2
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
this patch removes unused function cpu_restore_state_copy().
Signed-off-by: Jun Koi <junkoi2004@gmail.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Both values are only used in exec.c, so there is no need
to make them globally available.
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Store tcg loop exit request on a global variable, and transfer it to
per-CPUState exit_request after assignment of cpu_single_env.
This makes exit request signal from robust. Drop the timedlock hack.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
MAX_OPC_PARAM is intended to refer to the maximum number of entries used
in gen_opparam_buf[] for any single helper call. It is currently defined
as 10, but for 32-bit archs, the correct value (with a maximum for four
helper arguments) is 14, and for 64-bit archs, only 9 entries are needed.
tcg_gen_callN() fills four entries with the function address, flags,
number of args, etc. and on 32-bit archs uses a further two entries per
argument (with a maximum of four helper arguments), plus two more for the
return value. On 64-bit archs, only half as many entries are used for the
args and the return value.
In reality, TBs tend not to consist purely of helper calls exceeding the
stated 10 gen_opparam_buf[] entries, so this would never actually be a
problem on 32-bit archs, but the definition is still rather confusing.
Signed-off-by: Stuart Brady <sdb@zubnet.me.uk>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Arrange various declarations so that also non-CPU code can access
them, adjust users.
Move CPU specific code to cpus.c.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
QEMU uses a fixed page size for the CPU TLB. If the guest uses large
pages then we effectively split these into multiple smaller pages, and
populate the corresponding TLB entries on demand.
When the guest invalidates the TLB by virtual address we must invalidate
all entries covered by the large page. However the address used to
invalidate the entry may not be present in the QEMU TLB, so we do not
know which regions to clear.
Implementing a full vaiable size TLB is hard and slow, so just keep a
simple address/mask pair to record which addresses may have been mapped by
large pages. If the guest invalidates this region then flush the
whole TLB.
Signed-off-by: Paul Brook <paul@codesourcery.com>
The page tracking code in exec.c is used by both userspace and system
emulation. Userspace emulation uses it to track virtual pages, and
system emulation to track ram pages. Introduce a new type to hold this
kind of address.
Signed-off-by: Paul Brook <paul@codesourcery.com>