Commit Graph

744 Commits

Author SHA1 Message Date
Paolo Bonzini
b7e95164d1 exec: simplify destruction of the phys map
Do not bother visiting the radix tree when an address space is destroyed.
After the previous patch, this has become a pointless exercise.  When
called from address_space_destroy_dispatch, all you're doing is zeroing
out a structure that will be freed as soon as you come back.  When called
from mem_begin, when phys_page_set_level will call phys_map_node_alloc the
radix tree's array will be zeroed too.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-04 17:42:45 +02:00
Paolo Bonzini
058bc4b57f memory: destroy phys_sections one by one
phys_sections_clear is invoked after the dispatch tree has been
destroyed.  This leaves a window where phys_sections_nb > 0 but the
subpages are not valid anymore, which is a recipe for use-after-free
bugs.

Move the destruction of subpages in phys_sections_clear.  We will
still destroy the subpages when an address space is cleaned up,
because address_space_destroy will clear as->root and commit the
change before it calls address_space_destroy_dispatch.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-04 17:42:44 +02:00
Paolo Bonzini
2c9b15cab1 memory: add owner argument to initialization functions
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-04 17:42:44 +02:00
Jan Kiszka
b40acf99be ioport: Switch dispatching to memory core layer
The current ioport dispatcher is a complex beast, mostly due to the
need to deal with old portio interface users. But we can overcome it
without converting all portio users by embedding the required base
address of a MemoryRegionPortio access into that data structure. That
removes the need to have the additional MemoryRegionIORange structure
in the loop on every access.

To handle old portio memory ops, we simply install dispatching handlers
for portio memory regions when registering them with the memory core.
This removes the need for the old_portio field.

We can drop the additional aliasing of ioport regions and also the
special address space listener. cpu_in and cpu_out now simply call
address_space_read/write. And we can concentrate portio handling in a
single source file.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-04 17:42:44 +02:00
Andreas Färber
878096eeb2 cpu: Turn cpu_dump_{state,statistics}() into CPUState hooks
Make cpustats monitor command available unconditionally.

Prepares for changing kvm_handle_internal_error() and kvm_cpu_exec()
arguments to CPUState.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-06-28 13:25:12 +02:00
Andreas Färber
60a3e17a46 cpu: Change cpu_exit() argument to CPUState
It no longer depends on CPUArchState, so move it to qom/cpu.c.

Prepares for changing GDBState::c_cpu to CPUState.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-06-28 13:25:12 +02:00
Andreas Färber
1a1562f5ea cpu: Introduce VMSTATE_CPU() macro for CPUState
To be used to embed common CPU state into CPU subclasses.

Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-06-28 13:25:11 +02:00
Peter Maydell
ec3f8c9913 linux-user: Fix compilation failure
Fix compilation failures for linux-user targets following recent
migration related commits bd2fa51fcd and 43487c67.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1372362818-4740-1-git-send-email-peter.maydell@linaro.org
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-06-27 15:38:35 -05:00
Michael R. Hines
bd2fa51fcd rdma: introduce qemu_ram_foreach_block()
This is used during RDMA initialization in order to
transmit a description of all the RAM blocks to the
peer for later dynamic chunk registration purposes.

Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Chegu Vinod <chegu_vinod@hp.com>
Tested-by: Chegu Vinod <chegu_vinod@hp.com>
Tested-by: Michael R. Hines <mrhines@us.ibm.com>
Signed-off-by: Michael R. Hines <mrhines@us.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-06-27 02:38:36 +02:00
Alexey Kardashevskiy
7dca8043f3 memory: give name to every AddressSpace
The "info mtree" command in QEMU console prints only "memory" and "I/O"
address spaces while there are actually a lot more other AddressSpace
structs created by PCI and VIO devices. Those devices do not normally
have names and therefore not present in "info mtree" output.

The patch fixes this.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-20 16:39:52 +02:00
Paolo Bonzini
df32fd1c9f dma: eliminate DMAContext
The DMAContext is a simple pointer to an AddressSpace that is now always
already available.  Make everyone hold the address space directly,
and clean up the DMA API to use the AddressSpace directly.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-20 16:39:52 +02:00
Paolo Bonzini
24addbc76d dma: eliminate old-style IOMMU support
The translate function in the DMAContext is now always NULL.
Remove every reference to it.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-20 16:32:47 +02:00
Avi Kivity
3095115744 memory: iommu support
Add a new memory region type that translates addresses it is given,
then forwards them to a target address space.  This is similar to
an alias, except that the mapping is more flexible than a linear
translation and trucation, and also less efficient since the
translation happens at runtime.

The implementation uses an AddressSpace mapping the target region to
avoid hierarchical dispatch all the way to the resolved region; only
iommu regions are looked up dynamically.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Avi Kivity <avi.kivity@gmail.com>
[Modified to put translation in address_space_translate; assume
 IOMMUs are not reachable from TCG. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-20 16:32:47 +02:00
Paolo Bonzini
052e87b073 memory: make section size a 128-bit integer
So far, the size of all regions passed to listeners could fit in 64 bits,
because artificial regions (containers and aliases) are eliminated by
the memory core, leaving only device regions which have reasonable sizes

An IOMMU however cannot be eliminated by the memory core, and may have
an artificial size, hence we may need 65 bits to represent its size.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-20 16:32:47 +02:00
Paolo Bonzini
733d5ef527 exec: reorganize mem_add to match Int128 version
When adding support for 2^64-byte sections, we will have to change
the structure of mem_add to avoid failures in int128_get64.
Reorganize the code now before introducing Int128.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-20 16:32:47 +02:00
Paolo Bonzini
99b9cc0679 Revert "memory: limit sections in the radix tree to the actual address space size"
This reverts commit 86a8623692.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-20 16:32:46 +02:00
Paolo Bonzini
5c8a00ce18 exec: return MemoryRegion from address_space_translate
Only address_space_translate_for_iotlb needs to return the section.
Every caller of address_space_translate now uses only section->mr,
return it directly.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-20 16:32:46 +02:00
Jan Kiszka
acc9d80b26 exec: Implement subpage_read/write via address_space_rw
This will allow to add support for unaligned memory regions: the subpage
container region can activate unaligned support unconditionally because
the read/write handler will now ensure that accesses are split as
required by calling address_space_rw. We can furthermore drop the
special handling of RAM subpages, address_space_rw takes care of this
already.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-20 16:32:46 +02:00
Jan Kiszka
90260c6c09 exec: Resolve subpages in one step except for IOTLB fills
Except for the case of setting the IOTLB entry in TCG mode, we can avoid
the subpage dispatching handlers and do the resolution directly on
address_space_lookup_region. An IOTLB entry describes a full page, not
only the region that the first access to a sub-divided page may return.

This patch therefore introduces a special translation function,
address_space_translate_for_iotlb, that avoids the subpage resolutions.
In contrast, callers of the existing address_space_translate service
will now always receive the terminal memory region section. This will be
important for breaking the BQL and for enabling unaligned memory region.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-20 16:32:46 +02:00
Jan Kiszka
f52cc46742 exec: Allow unaligned address_space_rw
This will be needed for some corner cases with para-virtual I/O ports.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-20 16:32:46 +02:00
Paolo Bonzini
1db8abb102 memory: move private types to exec.c
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-20 16:32:46 +02:00
Jan Kiszka
9f029603ab memory: Introduce address_space_lookup_region
This introduces a wrapper for phys_page_find (before we complicate
address_space_translate with IOMMU translation).  This function will
also encapsulate locking and reference counting when we introduce
BQL-free dispatching.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-20 16:32:46 +02:00
Peter Maydell
3752a03648 exec.c: address_space_translate: handle access to addr 0 of 2^64 sized region
The memory API allows a MemoryRegion's size to be 2^64, as a special
case (otherwise the size always fits in a 64 bit integer). This meant
that attempts to access address zero in a 2^64 sized region would
assert in address_space_translate():

  #3  0x00007ffff3e4d192 in __GI___assert_fail#(assertion=0x555555a43f32
    "!a.hi", file=0x555555a43ef0 "include/qemu/int128.h", line=18,
    function=0x555555a4439f "int128_get64") at assert.c:103
  #4  0x0000555555877642 in int128_get64 (a=...)
    at include/qemu/int128.h:18
  #5  0x00005555558782f2 in address_space_translate (as=0x55555668d140,
   /addr=0, xlat=0x7fffafac9918, plen=0x7fffafac9920, is_write=false)
    at exec.c:221

Fix this by doing the 'min' operation in 128 bit arithmetic
rather than 64 bit arithmetic (we know the result of the 'min'
definitely fits in 64 bits because one of the inputs did).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-20 16:32:46 +02:00
Paolo Bonzini
fd8aaa767a memory: add return value to address_space_rw/read/write
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-05-29 16:27:34 +02:00
Paolo Bonzini
791af8c861 memory: propagate errors on I/O dispatch
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-05-29 16:27:32 +02:00
Paolo Bonzini
a649b9168c exec: just use io_mem_read/io_mem_write for 8-byte I/O accesses
The memory API is able to split it in two 4-byte accesses.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-05-29 16:27:29 +02:00
Paolo Bonzini
968a5627c8 memory: correctly handle endian-swapped 64-bit accesses
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-05-29 16:27:26 +02:00
Paolo Bonzini
51644ab70b memory: add address_space_access_valid
The old-style IOMMU lets you check whether an access is valid in a
given DMAContext.  There is no equivalent for AddressSpace in the
memory API, implement it with a lookup of the dispatch tree.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-05-29 16:27:16 +02:00
Paolo Bonzini
c353e4cc08 exec: implement .valid.accepts for subpages
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-05-29 16:27:14 +02:00
Paolo Bonzini
82f2563fc8 exec: introduce memory_access_size
This will be used by address_space_access_valid too.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-05-29 16:27:08 +02:00
Paolo Bonzini
2bbfa05d20 exec: introduce memory_access_is_direct
After the previous patches, this is a common test for all read/write
functions.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-05-29 16:27:04 +02:00
Paolo Bonzini
d17d45e95f exec: expect mr->ops to be initialized for ROM
There is no need to use the special phys_section_rom section.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-05-29 16:27:01 +02:00
Paolo Bonzini
d197063fcf memory: move unassigned_mem_ops to memory.c
reservation_ops is already doing the same thing.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-05-29 16:26:56 +02:00
Paolo Bonzini
149f54b53b memory: add address_space_translate
Using phys_page_find to translate an AddressSpace to a MemoryRegionSection
is unwieldy.  It requires to pass the page index rather than the address,
and later memory_region_section_addr has to be called.  Replace
memory_region_section_addr with a function that does all of it: call
phys_page_find, compute the offset within the region, and check how
big the current mapping is.  This way, a large flat region can be written
with a single lookup rather than a page at a time.

address_space_translate will also provide a single point where IOMMU
forwarding is implemented.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-05-29 16:26:50 +02:00
Paolo Bonzini
b018ddf633 memory: dispatch unassigned accesses based on .valid.accepts
This provides the basics for detecting accesses to unassigned memory
as soon as they happen, and also for a simple implementation of
address_space_access_valid.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-05-29 16:26:47 +02:00
Paolo Bonzini
bf8d516639 exec: do not use error_mem_read
We will soon reach this case when doing (unaligned) accesses that
span partly past the end of memory.  We do not want to crash in
that case.

unassigned_mem_ops and rom_mem_ops are now the same.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-05-29 16:26:44 +02:00
Paolo Bonzini
0844e00762 exec: make io_mem_unassigned private
There is no reason to avoid a recompile before accessing unassigned
memory.  In the end it will be treated as MMIO anyway.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-05-29 16:26:41 +02:00
Paolo Bonzini
ae4e43e80f exec: drop useless #if
This code is only compiled for softmmu targets.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-05-29 16:26:34 +02:00
Paolo Bonzini
2a8e749909 exec: eliminate io_mem_ram
It is never used, the IOTLB always goes through io_mem_notdirty.

In fact in softmmu_template.h, if it were, QEMU would crash just
below the tests, as soon as io_mem_read/write dispatches to
error_mem_read/write.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-05-29 16:26:21 +02:00
Paolo Bonzini
fd2989341e memory: clean up phys_page_find
Remove the goto.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-05-24 18:43:54 +02:00
Avi Kivity
86a8623692 memory: limit sections in the radix tree to the actual address space size
The radix tree is statically sized to fit TARGET_PHYS_ADDR_SPACE_BITS.
If a larger memory region is registered, it will overflow.

Fix by limiting any section in the radix tree to the supported size.

This problem was not observed earlier since artificial regions (containers
and aliases) are eliminated by the memory core, leaving only device regions
which have reasonable sizes.  An IOMMU however cannot be eliminated by the
memory core, and may have an artificial size.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Avi Kivity <avi.kivity@gmail.com>
[ Fail the build if TARGET_PHYS_ADDR_SPACE_BITS is too large - Paolo ]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-05-24 18:43:35 +02:00
Paolo Bonzini
68f3f65b09 memory: assert that PhysPageEntry's ptr does not overflow
While sized to 15 bits in PhysPageEntry, the ptr field is ORed into the
iotlb entries together with a page-aligned pointer.  The ptr field must
not overflow into this page-aligned value, assert that it is smaller than
the page size.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-05-24 18:42:30 +02:00
Paolo Bonzini
8b0d6711a2 exec: eliminate stq_phys_notdirty
It is not used anywhere.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-05-24 18:42:27 +02:00
Paolo Bonzini
4f39178b3a exec: eliminate qemu_put_ram_ptr
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-05-24 18:42:19 +02:00
Paolo Bonzini
bbcfd2913c exec: remove obsolete comment
See how we call memory_region_section_addr two lines below to
convert a physical address to a base address in the region.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-05-24 18:42:07 +02:00
Paolo Bonzini
e7a09b92b7 osdep: introduce qemu_anon_ram_free to free qemu_anon_ram_alloc-ed memory
We switched from qemu_memalign to mmap() but then we don't modify
qemu_vfree() to do a munmap() over free().  Which we cannot do
because qemu_vfree() frees memory allocated by qemu_{mem,block}align.

Introduce a new function that does the munmap(), luckily the size is
available in the RAMBlock.

Reported-by: Amos Kong <akong@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Amos Kong <akong@redhat.com>
Message-id: 1368454796-14989-3-git-send-email-pbonzini@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-05-14 08:53:31 -05:00
Paolo Bonzini
6eebf958ab osdep, kvm: rename low-level RAM allocation functions
This is preparatory to the introduction of a separate freeing API.

Reported-by: Amos Kong <akong@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Amos Kong <akong@redhat.com>
Message-id: 1368454796-14989-2-git-send-email-pbonzini@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-05-14 08:53:31 -05:00
Michael S. Tsirkin
d6b9e0d60c cpu: Add qemu_for_each_cpu()
Wrapper to avoid open-coded loops and to make CPUState iteration
independent of CPUArchState.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-05-01 13:04:18 +02:00
Paolo Bonzini
0d09e41a51 hw: move headers to include/
Many of these should be cleaned up with proper qdev-/QOM-ification.
Right now there are many catch-all headers in include/hw/ARCH depending
on cpu.h, and this makes it necessary to compile these files per-target.
However, fixing this does not belong in these patches.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-04-08 18:13:10 +02:00
Stefan Hajnoczi
49cd9ac6a1 exec: assert that RAMBlock size is non-zero
find_ram_offset() does not handle size=0 gracefully.  It hands out the
same RAMBlock offset multiple times, leading to obscure failures later
on.

Add an assert to warn early if something is incorrectly allocating a
zero size RAMBlock.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-03-26 21:02:17 +02:00
Anthony Liguori
3d34a4110c Merge remote-tracking branch 'afaerber/qom-cpu' into staging
# By Andreas Färber (16) and Igor Mammedov (1)
# Via Andreas Färber
* afaerber/qom-cpu:
  target-lm32: Update VMStateDescription to LM32CPU
  target-arm: Override do_interrupt for ARMv7-M profile
  cpu: Replace do_interrupt() by CPUClass::do_interrupt method
  cpu: Pass CPUState to cpu_interrupt()
  exec: Pass CPUState to cpu_reset_interrupt()
  cpu: Move halted and interrupt_request fields to CPUState
  target-cris/helper.c: Update Coding Style
  target-i386: Update VMStateDescription to X86CPU
  cpu: Introduce cpu_class_set_vmsd()
  cpu: Register VMStateDescription through CPUState
  stubs: Add a vmstate_dummy struct for CONFIG_USER_ONLY
  vmstate: Make vmstate_register() static inline
  target-sh4: Move PVR/PRR/CVR into SuperHCPUClass
  target-sh4: Introduce SuperHCPU subclasses
  cpus: Replace open-coded CPU loop in qmp_memsave() with qemu_get_cpu()
  monitor: Use qemu_get_cpu() in monitor_set_cpu()
  cpu: Fix qemu_get_cpu() to return NULL if CPU not found
2013-03-14 14:50:58 -05:00
Peter Feiner
8ca761f661 exec: make -mem-path filenames deterministic
Adds ramblocks' names to their backing files when using -mem-path.  Eases
introspection and debugging.

Signed-off-by: Peter Feiner <peter@gridcentric.ca>
Message-id: 1362423265-15855-1-git-send-email-peter@gridcentric.ca
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-03-12 13:42:52 -05:00
Andreas Färber
c3affe5670 cpu: Pass CPUState to cpu_interrupt()
Move it to qom/cpu.h to avoid issues with include order.

Change pc_acpi_smi_interrupt() opaque to X86CPU.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-03-12 10:35:55 +01:00
Andreas Färber
d8ed887bdc exec: Pass CPUState to cpu_reset_interrupt()
Move it to qom/cpu.c to avoid build failures depending on include order
of cpu-qom.h and exec/cpu-all.h.

Change opaques of various ..._irq_handler() functions to the
appropriate CPU type to facilitate using cpu_reset_interrupt().

Fix Coding Style issues while at it (missing braces, indentation).

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-03-12 10:35:55 +01:00
Andreas Färber
259186a7d2 cpu: Move halted and interrupt_request fields to CPUState
Both fields are used in VMState, thus need to be moved together.
Explicitly zero them on reset since they were located before
breakpoints.

Pass PowerPCCPU to kvmppc_handle_halt().

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-03-12 10:35:55 +01:00
Andreas Färber
b170fce3dd cpu: Register VMStateDescription through CPUState
In comparison to DeviceClass::vmsd, CPU VMState is split in two,
"cpu_common" and "cpu", and uses cpu_index as instance_id instead of -1.
Therefore add a CPU-specific CPUClass::vmsd field.

Unlike the legacy CPUArchState registration, rather register CPUState.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
2013-03-12 10:35:54 +01:00
Igor Mammedov
d76fddaeee cpu: Fix qemu_get_cpu() to return NULL if CPU not found
Commit 55e5c2850 breaks CPU not found return value, and returns
CPU corresponding to the last non NULL env.
Fix it by returning CPU only if env is not NULL, otherwise CPU is
not found and function should return NULL.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-03-12 10:35:53 +01:00
Peter Maydell
378df4b237 Handle CPU interrupts by inline checking of a flag
Fix some of the nasty TCG race conditions and crashes by implementing
cpu_exit() as setting a flag which is checked at the start of each TB.
This avoids crashes if a thread or signal handler calls cpu_exit()
while the execution thread is itself modifying the TB graph (which
may happen in system emulation mode as well as in linux-user mode
with a multithreaded guest binary).

This fixes the crashes seen in LP:668799; however there are another
class of crashes described in LP:1098729 which stem from the fact
that in linux-user with a multithreaded guest all threads will
use and modify the same global TCG date structures (including the
generated code buffer) without any kind of locking. This means that
multithreaded guest binaries are still in the "unsupported"
category.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2013-03-03 14:28:47 +00:00
Andreas Färber
907a5e32f2 cputlb: Pass CPUState to cpu_unlink_tb()
CPUArchState is no longer needed.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-02-16 14:51:00 +01:00
Andreas Färber
fcd7d0034b cpu: Move exit_request field to CPUState
Since it was located before breakpoints field, it needs to be reset.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-02-16 14:51:00 +01:00
Stefan Weil
e4ada48242 Replace non-portable asprintf by g_strdup_printf
g_strdup_printf already handles OOM errors, so some error handling in
QEMU code can be removed.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2013-01-19 10:24:43 +00:00
Andreas Färber
38d8f5c84e exec: Return CPUState from qemu_get_cpu()
Move the declaration to qemu/cpu.h and add documentation.
The implementation still depends on CPUArchState for CPU iteration.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-01-15 04:09:14 +01:00
Andreas Färber
55e5c28502 cpu: Move cpu_index field to CPUState
Note that target-alpha accesses this field from TCG, now using a
negative offset. Therefore the field is placed last in CPUState.

Pass PowerPCCPU to [kvm]ppc_fixup_cpu() to facilitate this change.

Move common parts of mips cpu_state_reset() to mips_cpu_reset().

Acked-by: Richard Henderson <rth@twiddle.net> (for alpha)
[AF: Rebased onto ppc CPU subclasses and openpic changes]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-01-15 04:09:13 +01:00
Andreas Färber
1b1ed8dc40 cpu: Move numa_node field to CPUState
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-01-15 04:09:13 +01:00
Paolo Bonzini
5708fc6655 stubs: fully replace qemu-tool.c and qemu-user.c
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-01-12 17:19:08 +01:00
Blue Swirl
8e4a424b30 Revert "virtio-pci: replace byte swap hack"
This reverts commit 9807caccd6.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2013-01-06 18:30:17 +00:00
Blue Swirl
9807caccd6 virtio-pci: replace byte swap hack
Remove byte swaps by declaring the config space
as native endian.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2013-01-06 08:24:26 +00:00
Umesh Deshpande
b2a8658ef5 protect the ramlist with a separate mutex
Add the new mutex that protects shared state between ram_save_live
and the iothread.  If the iothread mutex has to be taken together
with the ramlist mutex, the iothread shall always be _outside_.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Umesh Deshpande <udeshpan@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>

Reviewed-by: Orit Wasserman <owasserm@redhat.com>
2012-12-20 23:08:47 +01:00
Umesh Deshpande
f798b07f51 add a version number to ram_list
This will be used to detect if last_block might have become invalid
across different calls to ram_save_live.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Umesh Deshpande <udeshpan@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>

Reviewed-by: Orit Wasserman <owasserm@redhat.com>
2012-12-20 23:08:47 +01:00
Paolo Bonzini
abb26d63e7 exec: sort the memory from biggest to smallest
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2012-12-20 23:08:47 +01:00
Paolo Bonzini
a3161038a1 exec: change RAM list to a TAILQ
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2012-12-20 23:08:47 +01:00
Paolo Bonzini
0d6d3c87a2 exec: change ramlist from MRU order to a 1-item cache
Most of the time, only 2 items will be active (from/to for a string operation,
or code/data).  But TCG guests likely won't have gigabytes of memory, so
this actually goes down to 1 item.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2012-12-20 23:08:40 +01:00
Paolo Bonzini
9c17d615a6 softmmu: move include files to include/sysemu/
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19 08:32:45 +01:00
Paolo Bonzini
1de7afc984 misc: move include files to include/qemu/
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19 08:32:39 +01:00
Paolo Bonzini
022c62cbbc exec: move include files to include/exec/
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19 08:31:31 +01:00
Paolo Bonzini
077805fa92 janitor: do not rely on indirect inclusions of or from qemu-char.h
Various header files rely on qemu-char.h including qemu-config.h or
main-loop.h, but they really do not need qemu-char.h at all (particularly
interesting is the case of the block layer!).  Clean this up, and also
add missing inclusions of qemu-char.h itself.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19 08:29:52 +01:00
Blue Swirl
5b6dd8683d exec: move TB handling to translate-all.c
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-12-16 08:28:41 +00:00
Blue Swirl
5a3165263a exec: extract TB watchpoint check
Will be moved by the next patch.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-12-16 08:28:29 +00:00
Blue Swirl
44209fc4ed exec: fix coding style
Fix coding style in areas to be moved by later patches.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-12-16 08:28:16 +00:00
Richard Henderson
0be4835b49 exec: Advise huge pages for the TCG code gen buffer
After allocating 32MB or more contiguous memory, huge pages
would seem to be ideal.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-12-08 14:18:37 +00:00
Peter Maydell
9e11908f12 dma: Define dma_context_memory and use in sysbus-ohci
Define a new global dma_context_memory which is a DMAContext corresponding
to the global address_space_memory AddressSpace. This can be used by
sysbus peripherals like sysbus-ohci which need to do DMA.

In particular, use it in the sysbus-ohci device, which fixes a
segfault when attempting to use that device.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
2012-11-12 16:44:57 +01:00
Blue Swirl
ef84755ebb Merge branch 'trivial-patches' of git://github.com/stefanha/qemu
* 'trivial-patches' of git://github.com/stefanha/qemu:
  pc: Drop redundant test for ROM memory region
  exec: make some functions static
  target-ppc: make some functions static
  ppc: add missing static
  vnc: add missing static
  vl.c: add missing static
  target-sparc: make do_unaligned_access static
  m68k: Return semihosting errno values correctly
  cadence_uart: More debug information

Conflicts:
	target-m68k/m68k-semi.c
2012-11-03 12:55:05 +00:00
Yeongkyoon Lee
fdbb84d133 tcg: Add extended GETPC mechanism for MMU helpers with ldst optimization
Add GETPC_EXT which is used by MMU helpers to selectively calculate the code
address of accessing guest memory when called from a qemu_ld/st optimized code
or a C function. Currently, it supports only i386 and x86-64 hosts.

Signed-off-by: Yeongkyoon Lee <yeongkyoon.lee@samsung.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-11-03 09:44:20 +00:00
Blue Swirl
8b9c99d9dc exec: make some functions static
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2012-11-01 19:49:45 +01:00
Andreas Färber
9f09e18a6d cpu: Move thread_id to CPUState
Signed-off-by: Andreas Färber <afaerber@suse.de>
2012-10-31 04:12:23 +01:00
Andreas Färber
c08d7424d6 cpus: Pass CPUState to qemu_cpu_kick()
CPUArchState is no longer needed there.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2012-10-31 01:02:45 +01:00
Andreas Färber
60e82579c7 cpus: Pass CPUState to qemu_cpu_is_self()
Change return type to bool, move to include/qemu/cpu.h and
add documentation.

Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
[AF: Updated new caller qemu_in_vcpu_thread()]
2012-10-31 01:02:39 +01:00
Avi Kivity
a8170e5e97 Rename target_phys_addr_t to hwaddr
target_phys_addr_t is unwieldly, violates the C standard (_t suffixes are
reserved) and its purpose doesn't match the name (most target_phys_addr_t
addresses are not target specific).  Replace it with a finger-friendly,
standards conformant hwaddr.

Outstanding patchsets can be fixed up with the command

  git rebase -i --exec 'find -name "*.[ch]"
                        | xargs s/target_phys_addr_t/hwaddr/g' origin

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-10-23 08:58:25 -05:00
Luiz Capitulino
ad0b5321f1 Call MADV_HUGEPAGE for guest RAM allocations
This makes it possible for QEMU to use transparent huge pages (THP)
when transparent_hugepage/enabled=madvise. Otherwise THP is only
used when it's enabled system wide.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-10-22 13:26:34 -05:00
Anthony Liguori
f526f3c315 Merge remote-tracking branch 'quintela/migration-next-20121017' into staging
* quintela/migration-next-20121017: (41 commits)
  cpus: create qemu_in_vcpu_thread()
  savevm: make qemu_file_put_notify() return errors
  savevm: un-export qemu_file_set_error()
  block-migration: handle errors with the return codes correctly
  block-migration:  Switch meaning of return value
  block-migration: make flush_blks() return errors
  buffered_file: buffered_put_buffer() don't need to set last_error
  savevm: Only qemu_fflush() can generate errors
  savevm: make qemu_fill_buffer() be consistent
  savevm: unexport qemu_ftell()
  savevm: unfold qemu_fclose_internal()
  savevm: make qemu_fflush() return an error code
  savevm: Remove qemu_fseek()
  virtio-net: use qemu_get_buffer() in a temp buffer
  savevm: unexport qemu_fflush
  migration: make migrate_fd_wait_for_unfreeze() return errors
  buffered_file: make buffered_flush return the error code
  buffered_file: callers of buffered_flush() already check for errors
  buffered_file: We can access directly to bandwidth_limit
  buffered_file: unfold migrate_fd_close
  ...
2012-10-22 13:26:23 -05:00
Anthony Liguori
d3e2efc5b5 Merge remote-tracking branch 'qemu-kvm/memory/dma' into staging
* qemu-kvm/memory/dma: (23 commits)
  pci: honor PCI_COMMAND_MASTER
  pci: give each device its own address space
  memory: add address_space_destroy()
  dma: make dma access its own address space
  memory: per-AddressSpace dispatch
  s390: avoid reaching into memory core internals
  memory: use AddressSpace for MemoryListener filtering
  memory: move tcg flush into a tcg memory listener
  memory: move address_space_memory and address_space_io out of memory core
  memory: manage coalesced mmio via a MemoryListener
  xen: drop no-op MemoryListener callbacks
  kvm: drop no-op MemoryListener callbacks
  xen_pt: drop no-op MemoryListener callbacks
  vfio: drop no-op MemoryListener callbacks
  memory: drop no-op MemoryListener callbacks
  memory: provide defaults for MemoryListener operations
  memory: maintain a list of address spaces
  memory: export AddressSpace
  memory: prepare AddressSpace for exporting
  xen_pt: use separate MemoryListeners for memory and I/O
  ...
2012-10-22 13:26:07 -05:00
Avi Kivity
83f3c25142 memory: add address_space_destroy()
Since address spaces can be created dynamically by device hotplug, they
can also be destroyed dynamically.

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-10-22 14:50:08 +02:00
Avi Kivity
ac1970fbe8 memory: per-AddressSpace dispatch
Currently we use a global radix tree to dispatch memory access.  This only
works with a single address space; to support multiple address spaces we
make the radix tree a member of AddressSpace (via an intermediate structure
AddressSpaceDispatch to avoid exposing too many internals).

A side effect is that address_space_io also gains a dispatch table.  When
we remove all the pre-memory-API I/O registrations, we can use that for
dispatching I/O and get rid of the original I/O dispatch.

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-10-22 14:50:08 +02:00
Avi Kivity
f6790af6bc memory: use AddressSpace for MemoryListener filtering
Using the AddressSpace type reduces confusion, as you can't accidentally
supply the MemoryRegion you're interested in.

Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-10-22 14:50:07 +02:00
Avi Kivity
1d71148eac memory: move tcg flush into a tcg memory listener
We plan to make the core listener listen to all address spaces; this
will cause many more flushes than necessary.  Prepare for that by
moving the flush into a tcg-specific listener.

Later we can avoid registering the listener if tcg is disabled.

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-10-22 14:50:07 +02:00
Avi Kivity
2673a5da25 memory: move address_space_memory and address_space_io out of memory core
With this change, memory.c no longer knows anything about special address
spaces, so it is prepared for AddressSpace based DMA.

Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-10-22 14:50:07 +02:00
Avi Kivity
95d2994a2f memory: manage coalesced mmio via a MemoryListener
Instead of calling a global function on coalesced mmio changes, which
routes the call to kvm if enabled, add coalesced mmio hooks to
MemoryListener and make kvm use that instead.

The motivation is support for multiple address spaces (which means we
we need to filter the call on the right address space) but the result
is cleaner as well.

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-10-22 14:50:00 +02:00
Richard Henderson
74d590c8e9 exec: Make MIN_CODE_GEN_BUFFER_SIZE private to exec.c
It is used nowhere else, and the corresponding MAX_CODE_GEN_BUFFER_SIZE
also lives there.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-10-20 07:54:04 +00:00
Richard Henderson
4438c8a946 exec: Allocate code_gen_prologue from code_gen_buffer
We had a hack for arm and sparc, allocating code_gen_prologue to a
special section.  Which, honestly does no good under certain cases.
We've already got limits on code_gen_buffer_size to ensure that all
TBs can use direct branches between themselves; reuse this limit to
ensure the prologue is also reachable.

As a bonus, we get to avoid marking a page of the main executable's
data segment as executable.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-10-20 07:54:04 +00:00
Richard Henderson
405def1846 exec: Do not use absolute address hints for code_gen_buffer with -fpie
The hard-coded addresses inside alloc_code_gen_buffer only make sense
if we're building an executable that will actually run at the address
we've put into the linker scripts.

When we're building with -fpie, the executable will run at some
random location chosen by the kernel.  We get better placement for
the code_gen_buffer if we allow the kernel to place the memory,
as it will tend to to place it near the executable, based on the
PROT_EXEC bit.

Since code_gen_prologue is always inside the executable, this effect
is easily seen at the end of most TB, with the exit_tb opcode, and
with any calls to helper functions.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-10-20 07:54:04 +00:00
Richard Henderson
3d85a72fd8 exec: Don't make DEFAULT_CODE_GEN_BUFFER_SIZE too large
For ARM we cap the buffer size to 16MB.  Do not allocate 32MB in that case.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-10-20 07:54:04 +00:00
Richard Henderson
f1bc0bcc9d exec: Split up and tidy code_gen_buffer
It now consists of:

A macro definition of MAX_CODE_GEN_BUFFER_SIZE with host-specific values,

A function size_code_gen_buffer that applies most of the reasoning for
choosing a buffer size,

Three variations of a function alloc_code_gen_buffer that contain all
of the logic for allocating executable memory via a given allocation
mechanism.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-10-20 07:54:04 +00:00
Juan Quintela
652d7ec291 ram: Export last_ram_offset()
Is the only way of knowing the RAM size.

Signed-off-by: Juan Quintela <quintela@redhat.com>

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2012-10-17 18:34:58 +02:00
Avi Kivity
9a2c913b77 memory: drop no-op MemoryListener callbacks
Removes quite a bit of useless code.

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-10-15 11:43:07 +02:00
Avi Kivity
7762c2c1e0 memory: rename 'exec-obsolete.h'
exec-obsolete.h used to hold pre-memory-API functions that were used from
device code prior to the transition to the memory API.  Now that the
transition is complete, the name no longer describes the file.  The
functions still need to be merged better into the memory core, but there's
no danger of anyone using them.

Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-10-15 11:43:05 +02:00
Peter Maydell
6fd2a026fb cpu_dump_state: move DUMP_FPU and DUMP_CCOP flags from x86-only to generic
Move the DUMP_FPU and DUMP_CCOP flags for cpu_dump_state() from being
x86-specific flags to being generic ones. This allows us to drop some
TARGET_I386 ifdefs in various places, and means that we can (potentially)
be more consistent across architectures about which monitor commands or
debug abort printouts include FPU register contents and info about
QEMU's condition-code optimisations.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2012-10-05 15:04:43 +01:00
Anthony PERARD
e226939de5 exec, memory: Call to xen_modified_memory.
This patch add some calls to xen_modified_memory to notify Xen about dirtybits
during migration.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Avi Kivity <avi@redhat.com>
2012-10-03 13:49:22 +00:00
Anthony PERARD
51d7a9eb2b exec: Introduce helper to set dirty flags.
This new helper/hook is used in the next patch to add an extra call in a single
place.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Avi Kivity <avi@redhat.com>
2012-10-03 13:49:05 +00:00
Richard Henderson
9b9c37c364 tcg-sparc: Assume v9 cpu always, i.e. force v8plus in 32-bit mode.
Current code doesn't actually work in 32-bit mode at all.  Since
no one really noticed, drop the complication of v7 and v8 cpus.
Eliminate the --sparc_cpu configure option and standardize macro
testing on TCG_TARGET_REG_BITS / HOST_LONG_BITS

Signed-off-by: Richard Henderson <rth@twiddle.net>
2012-09-21 22:02:16 +02:00
Richard Henderson
d5dd696fe3 tcg-sparc: Don't MAP_FIXED on top of the program
The address we pick in sparc64.ld is also 0x60000000, so doing a fixed map
on top of that is guaranteed to blow up.  Choosing 0x40000000 is exactly
right for the max of code_gen_buffer_size set below.

No need to ever use MAP_FIXED.  While getting our desired address helps
optimize the generated code, we won't fail if we don't get it.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2012-09-21 22:02:16 +02:00
David Gibson
0b57e28713 cpu_physical_memory_write_rom() needs to do TB invalidates
cpu_physical_memory_write_rom(), despite the name, can also be used to
write images into RAM - and will often be used that way if the machine
uses load_image_targphys() into RAM addresses.

However, cpu_physical_memory_write_rom(), unlike cpu_physical_memory_rw()
doesn't invalidate any cached TBs which might be affected by the region
written.

This was breaking reset (under full emu) on the pseries machine - we loaded
our firmware image into RAM, and while executing it rewrite the code at
the entry point (correctly causing a TB invalidate/refresh).  When we
reset the firmware image was reloaded, but the TB from the rewrite was
still active and caused us to get an illegal instruction trap.

This patch fixes the bug by duplicating the tb invalidate code from
cpu_physical_memory_rw() in cpu_physical_memory_write_rom().

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-09-17 10:18:48 -05:00
Luiz Capitulino
8490fc78e7 add -machine mem-merge=on|off option
It allows to disable memory merge support (KSM on Linux), which is
enabled by default otherwise.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-09-17 10:18:47 -05:00
Jason Baron
ddb97f1deb memory: add -machine dump-guest-core=on|off
Add a new '[,dump-guest-core=on|off]' option to the '-machine' option. When
'dump-guest-core=off' is specified, guest memory is omitted from the core dump.
The default behavior continues to be to include guest memory when a core dump is
triggered. In my testing, this brought the core dump size down from 384MB to 6MB
on a 2GB guest.

Is anything additional required to preserve this setting for migration or
savevm? I don't believe so.

Changelog:
v3:
    Eliminate globals as per Anthony's suggestion
    set no dump from qemu_ram_remap() as well
v2:
    move the option from -m to -machine, rename option dump -> dump-guest-core

Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-08-16 13:41:15 -05:00
Igor Mitsyanko
5fda043f9c exec.c: fix dirty bitmap reallocation
For each newly created RAM block, dirty bitmap is reallocated with g_realloc, which doesn't
make any promises on initial content of new extra data in returned buffer. In theory,
we initialize this new data with cpu_physical_memory_set_dirty_range() call. The
problem is, cpu_physical_memory_set_dirty_range() has a side effect of incrementing
ram_list.dirty_pages variable, but only for pages which are not already dirty. And
page "cleanliness" is determined using the same not yet uninitialized dirty bitmap
we've just reallocated. This results in inconsistency between real dirty page number
and value in ram_list.dirty_pages variable, which in turn could (and will) result
in errors during VM migration.
Zero initialize new dirty bitmap bytes to fix this problem.

Signed-off-by: Igor Mitsyanko <i.mitsyanko@samsung.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-08-11 12:23:46 +00:00
Peter Maydell
c308efe63a exec.c: Remove out of date comment
Remove an out of date comment: this comment used to be attached to
cpu_register_physical_memory_log(), before commit 0f0cb164 accidentally
inserted a couple of other functions between the comment and its function.
It is in any case obsolete since (a) the function arguments it refers
to have been replaced with a single MemoryRegionSection* argument and
(b) the inability to handle regions whose offset_within_address_space
and offset_within_region aren't equally aligned was fixed as part of
the rewrite of this code.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2012-08-03 14:25:22 +01:00
Tyler Hall
69b67646bc exec.c: Use subpages for large unaligned mappings
Registering a multi-page memory region that is non-page-aligned results
in a subpage from the start to the page boundary, some number of full
pages, and possibly another subpage from the last page boundary to the
end. The full pages will have a value for offset_within_region that is
not a multiple of TARGET_PAGE_SIZE. Accesses through softmmu are unable
to handle this and will segfault.

Handling full pages through subpages is not optimal, but only
non-page-aligned mappings take the penalty.

Signed-off-by: Tyler Hall <tylerwhall@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2012-08-03 14:25:22 +01:00
Tyler Hall
adb2a9b5d4 exec.c: Fix off-by-one error in register_subpage
subpage_register() expects "end" to be the last byte in the mapping.
Registering a non-page-aligned memory region that extends up to or
beyond a page boundary causes subpage_register() to silently fail
through the (end >= PAGE_SIZE) check.

This bug does not cause noticeable problems for mappings that do not
extend to a page boundary, though they do register an extra byte.

Signed-off-by: Tyler Hall <tylerwhall@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2012-08-03 14:25:22 +01:00
Anthony Liguori
09f06a6c60 Merge remote-tracking branch 'qemu-kvm/uq/master' into staging
* qemu-kvm/uq/master:
  virtio: move common irqfd handling out of virtio-pci
  virtio: move common ioeventfd handling out of virtio-pci
  event_notifier: add event_notifier_set_handler
  memory: pass EventNotifier, not eventfd
  ivshmem: wrap ivshmem_del_eventfd loops with transaction
  ivshmem: use EventNotifier and memory API
  event_notifier: add event_notifier_init_fd
  event_notifier: remove event_notifier_test
  event_notifier: add event_notifier_set
  apic: Defer interrupt updates to VCPU thread
  apic: Reevaluate pending interrupts on LVT_LINT0 changes
  apic: Resolve potential endless loop around apic_update_irq
  kvm: expose tsc deadline timer feature to guest
  kvm_pv_eoi: add flag support
  kvm: Don't abort on kvm_irqchip_add_msi_route()
2012-07-18 14:44:43 -05:00
Paolo Bonzini
753d5e14c4 memory: pass EventNotifier, not eventfd
Under Win32, EventNotifiers will not have event_notifier_get_fd, so we
cannot call it in common code such as hw/virtio-pci.c.  Pass a pointer to
the notifier, and only retrieve the file descriptor in kvm-specific code.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-07-12 14:08:10 +03:00
Christian Borntraeger
fdec991857 s390: autodetect map private
By default qemu will use MAP_PRIVATE for guest pages. This will write
protect pages and thus break on s390 systems that dont support this feature.
Therefore qemu has a hack to always use MAP_SHARED for s390. But MAP_SHARED
has other problems (no dirty pages tracking, a lot more swap overhead etc.)
Newer systems allow the distinction via KVM_CAP_S390_COW. With this feature
qemu can use the standard qemu alloc if available, otherwise it will use
the old s390 hack.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Acked-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2012-07-10 18:27:33 +02:00
Juan Quintela
1720aeee72 dirty bitmap: abstract its use
Always use accessors to read/set the dirty bitmap.

Signed-off-by: Juan Quintela <quintela@redhat.com>
2012-06-29 13:31:07 +02:00
Juan Quintela
d24981d37e Only TCG needs TLB handling
Refactor the code that is only needed for tcg to an static function.
Call that only when tcg is enabled.  We can't refactor to a dummy
function in the kvm case, as qemu can be compiled at the same time
with tcg and kvm.

Signed-off-by: Juan Quintela <quintela@redhat.com>
2012-06-29 13:27:28 +02:00
Blue Swirl
5726c27fa9 qemu-log: move logging to qemu-log.c
Move logging functions from exec.c to qemu-log.c,
compile it only once.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-06-21 18:45:16 +00:00
Anthony Liguori
09e5ab6360 qdev: Use wrapper for qdev_get_path
This makes it easier to remove it from BusInfo.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[AF: Drop now unnecessary NULL initialization in scsibus_get_dev_path()]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2012-06-18 15:14:38 +02:00
Anthony Liguori
3525c42fd3 Merge remote-tracking branch 'stefanha/trivial-patches' into staging
* stefanha/trivial-patches:
  configure: report missing libraries for virtfs
  trace/simple.c: fix deprecated glib2 interface
  Clarify comments of tb_invalidate_phys_[page_]range
2012-06-11 12:15:51 -05:00
Max Filippov
9d70c4b7b8 exec: fix TB invalidation after breakpoint insertion/deletion
tb_invalidate_phys_addr has to be called with the exact physical address of
the breakpoint we add/remove, not just the page's base address.
Otherwise we easily fail to flush the right TB.

This breakage was introduced by the commit f3705d5329 "memory: make
phys_page_find() return an unadjusted".

This appeared to work for some guest architectures because their
cpu_get_phys_page_debug implementation returns full translated physical
address, not just the base of the TARGET_PAGE_SIZE-sized page.

Reported-by: TeLeMan <geleman@gmail.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-06-09 10:49:19 +00:00
Jan Kiszka
8e0fdce32d Clarify comments of tb_invalidate_phys_[page_]range
They could suggest that all TBs of the page containing the range would
be invalidated.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2012-06-08 09:32:26 +01:00
Wen Congyang
76f3553883 Add API to check whether a physical address is I/O address
This API will be used in the following patch.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2012-06-04 13:49:33 -03:00
Alexander Graf
77a8f1a512 linux-user: Fix stale tbs after mmap
If we execute linux-user code that does the following:

  * A = mmap()
  * execute code in A
  * munmap(A)
  * B = mmap(), but mmap returns the same address as A
  * execute code in B

we end up executing a stale cached tb that contains translated code
from A, while we want new code from B.

This patch adds a TB flush for mmap'ed regions, before we return them,
avoiding the whole issue. It also adds a flush for munmap, so that we
don't execute stale TBs instead of getting a segfault.

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Riku Voipio <riku.voipio@linaro.org>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-05-19 15:49:40 +00:00
Blue Swirl
fd06257351 memory: move functions is_romd and section_addr to memory API
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-05-01 10:45:07 +00:00
Blue Swirl
cc5bea608d cputlb: prepare private memory API for public consumption
Fold is_ram_rom and is_ram_rom_romd() into callers.

Change is_romd() and section_addr() to take MemoryRegion
instead of MemoryRegionSection for consistency and
use memory_region_ prefix.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-05-01 10:45:05 +00:00
Blue Swirl
0cac1b66c8 cputlb: move TLB handling to a separate file
Move TLB handling and softmmu code load helpers to cputlb.c,
compile only for softmmu targets.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-05-01 10:45:04 +00:00
Blue Swirl
e554861766 exec: prepare for splitting
Make s_cputlb_empty_entry 'const'.

Rename tlb_flush_jmp_cache() to tb_flush_jmp_cache().

Refactor code to add cpu_tlb_reset_dirty_all(),
memory_region_section_get_iotlb() and
memory_region_is_unassigned().

Remove unused cpu_tlb_update_dirty().

Fix coding style in areas to be moved.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-05-01 10:45:02 +00:00
Stefan Weil
8efe0ca83e w64: Use uintptr_t in exec.c
Replace all type casts to 'long' or 'unsigned long' by 'intptr_t' or 'uintptr_t'.

For type casts which are only used to extract the lower bits of an address
or to modify those bits, signedness does not matter. There I always use 'uintptr_t'.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2012-04-15 21:25:17 +02:00
Stefan Weil
6840981dfb w64: Use larger alignment for section with generated code
The MinGW-w64 compiler allows __attribute__((aligned (32)).

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2012-04-15 21:25:16 +02:00
Stefan Weil
c6d506742f w64: Fix data types in cpu-all.h, exec.c
w64 needs uintptr_t instead of unsigned long.
For other hosts, nothing changes.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2012-04-15 21:25:16 +02:00
Max Filippov
1e7855a558 exec: provide tb_invalidate_phys_addr function
Allow TB invalidation by its physical address, extract implementation
from the breakpoint_invalidate function.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-04-14 15:25:36 +00:00
Blue Swirl
2050396801 Use uintptr_t for various op related functions
Use uintptr_t instead of void * or unsigned long in
several op related functions, env->mem_io_pc and
GETPC() macro.

Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-04-14 14:23:37 +00:00
Stefan Weil
6375e09e79 w64: Fix data type of tb_next and other variables used for host addresses
QEMU host addresses must use uintptr_t to be portable for hosts with
an unusual size of long (w64).

tb_jmp_offset is an uint16_t value, therefore the local variable offset
in function tb_set_jmp_target was changed from unsigned long to uint16_t.

The type cast to long in function tb_add_jump now also uses uintptr_t.
For the bit operation used here, the signedness of the type cast does
not matter.

Some remaining unsigned long values are either only used for ARM assembler
code or will be fixed in a later patch for PPC.

v2:
Fix signature of tb_find_pc in exec.c, too (hint from Blue Swirl, thanks).
There remain lots of other long / unsigned long in exec.c which must be
replaced by uintptr_t. This will be done in a separate patch. Here
only one of these type casts is fixed.

v3:
Also fix signature of page_unprotect.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-04-07 11:27:45 +00:00
Richard Henderson
813da6277c tcg: Use the GDB JIT debugging interface.
This allows us to generate unwind info for the dynamicly generated
code in the code_gen_buffer.  Only i386 is converted at this point.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-03-24 13:07:48 +00:00
Anthony PERARD
0a1b357f15 exec: fix guest memory access for Xen
In cpu_physical_memory_rw, a change has been introduced and qemu_get_ram_ptr is
no longuer called with the ram addr we want to access, but only with the
section address. This patch fixes this. (All other call to qemu_get_ram_ptr are
already called with the right address.)

This patch fixes Xen guest.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-03-19 19:13:30 +02:00
Avi Kivity
32b089808f memory: check for watchpoints when getting code ram_addr
The code to get the ram_addr from a (tlb entry, vaddr) pair
checks that the resulting memory is not MMIO, but neglects to
check whether the region is hidden by a watchpoint page.

Add the missing check.

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-03-19 11:15:01 +02:00
Avi Kivity
7859cc6e39 exec: fix write tlb entry misused as iotlb
A couple of code paths check the lower bits of CPUTLBEntry::addr_write
against io_mem_ram as a way of looking for a dirty RAM page.  This works
by accident since the value is zero, which matches all clear bits for
TLB_INVALID, TLB_MMIO, and TLB_NOTDIRTY (indicating dirty RAM).

Make it work by design by checking for the proper bits.

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-03-19 11:15:00 +02:00
Blue Swirl
e141ab52d2 softmmu templates: optionally pass CPUState to memory access functions
Optionally, make memory access helpers take a parameter for CPUState
instead of relying on global env.

On most targets, perform simple moves to reorder registers. On i386,
switch from regparm(3) calling convention to standard stack-based
version.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-03-18 12:21:52 +00:00
Andreas Färber
9349b4f9fd Rename CPUState -> CPUArchState
Scripted conversion:
  for file in *.[hc] hw/*.[hc] hw/kvm/*.[hc] linux-user/*.[hc] linux-user/m68k/*.[hc] bsd-user/*.[hc] darwin-user/*.[hc] tcg/*/*.[hc] target-*/cpu.h; do
    sed -i "s/CPUState/CPUArchState/g" $file
  done

All occurrences of CPUArchState are expected to be replaced by QOM CPUState,
once all targets are QOM'ified and common fields have been extracted.

Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
2012-03-14 22:20:27 +01:00
Avi Kivity
97161e177b memory: get rid of cpu_register_io_memory()
The return value of cpu_register_io_memory() is no longer used anywhere, so
we can remove it and all associated data and code.

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-03-08 19:16:39 +02:00
Avi Kivity
37ec01d433 memory: dispatch directly via MemoryRegion
Instead of indirecting via io_mem_region, dispatch directly
through the MemoryRegion obtained from the iotlb or phys_page_find().

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-03-08 19:06:11 +02:00
Avi Kivity
ce5d64c2d0 exec: fix code tlb entry misused as iotlb in get_page_addr_code()
get_page_addr_code() reads a code tlb entry, but interprets it as an
iotlb entry.  This works by accident since the low bits of a RAM code
tlb entry are clear, and match a RAM iotlb entry.  This accident is
about to unhappen, so fix the code to use an iotlb entry (using the
code entry with TLB_MMIO may fail if the page is a watchpoint).

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-03-08 18:54:20 +02:00
Avi Kivity
aa102231f0 memory: store section indices in iotlb instead of io indices
A step towards eliminating io indices.

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-03-08 17:06:55 +02:00
Avi Kivity
f3705d5329 memory: make phys_page_find() return an unadjusted section
We'd like to store the section index in the iotlb, so we can't
adjust it before returning.  Return an unadjusted section and
instead introduce section_addr(), which does the adjustment later.

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-03-08 16:16:34 +02:00
Avi Kivity
a2d335214a memory: fix I/O port aliases
Commit e58ac72b6a0 ("ioport: change portio_list not to use
memory_region_set_offset()") started using aliases of I/O memory
regions.  Since the IORange used for the I/O was contained in the
target region, the alias information (specifically, the offset
into the region) was lost.  This broke -vga std.

Fix by allocating an independent object to hold the IORange and
also the new offset.

Note that I/O memory regions were conceptually broken wrt aliases
in a different way: an alias can cause the same region to appear
twice in an address space, but we had just one IORange to service it.
This patch fixes that problem as well, since we can now have multiple
IORange/MemoryRegion associations.

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-03-05 17:40:12 +02:00
Blue Swirl
b3e54c689c Merge branch 'xtensa' of git://jcmvbkbc.spb.ru/dumb/qemu-xtensa
* 'xtensa' of git://jcmvbkbc.spb.ru/dumb/qemu-xtensa:
  target-xtensa: add breakpoint tests
  target-xtensa: add DEBUG_SECTION to overlay tool
  target-xtensa: add DBREAK data breakpoints
  exec: let cpu_watchpoint_insert accept larger watchpoints
  exec: fix check_watchpoint exiting cpu_loop
  exec: add missing breaks to the watch_mem_write
  target-xtensa: add ICOUNT SR and debug exception
  target-xtensa: implement instruction breakpoints
  target-xtensa: add DEBUGCAUSE SR and configuration
  target-xtensa: fetch 3rd opcode byte only when needed
  target-xtensa: implement info tlb monitor command
  target-xtensa: define TLB_TEMPLATE for MMU-less cores
2012-03-03 17:53:41 +00:00
Avi Kivity
07f07b31e5 memory: allow phys_map tree paths to terminate early
When storing large contiguous ranges in phys_map, all values tend to
be the same pointers to a single MemoryRegionSection.  Collapse them
by marking nodes with level > 0 as leaves.  This reduces tree memory
usage dramatically.

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-02-29 13:44:45 +02:00
Avi Kivity
c19e8800d4 memory: unify PhysPageEntry::node and ::leaf
They have the same type, unify them.

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-02-29 13:44:45 +02:00
Avi Kivity
2999097bf1 memory: change phys_page_set() to set multiple pages
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-02-29 13:44:45 +02:00
Avi Kivity
f7bf546118 memory: switch phys_page_set() to a recursive implementation
Setting multiple pages at once requires backtracking to previous
nodes; easiest to achieve via recursion.

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-02-29 13:44:45 +02:00
Avi Kivity
a391843286 memory: replace phys_page_find_alloc() with phys_page_set()
By giving the function the value we want to set, we make it
more flexible for the next patch.

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-02-29 13:44:45 +02:00
Avi Kivity
0f0cb164cc memory: simplify multipage/subpage registration
Instead of considering subpage on a per-page basis, split each section
into a subpage head, multipage body, and subpage tail, and register
each separately.  This simplifies the registration functions.

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-02-29 13:44:44 +02:00
Avi Kivity
31ab2b4a46 memory: give phys_page_find() its own tree search loop
We'll change phys_page_find_alloc() soon, but phys_page_find()
doesn't need to bear the consequences.

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-02-29 13:44:44 +02:00
Avi Kivity
06ef3525e1 memory: make phys_page_find() return a MemoryRegionSection
We no longer describe memory in terms of individual pages; use sections
throughout instead.

PhysPageDesc no longer used - remove.

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-02-29 13:44:44 +02:00
Avi Kivity
117712c3e4 memory: move tlb flush to MemoryListener commit callback
This way, if we have several changes in a single transaction, we flush just
once.

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-02-29 13:44:44 +02:00
Avi Kivity
717cb7b259 memory: unify the two branches of cpu_register_physical_memory_log()
Identical except that the second branch knows its not modifying an existing
subpage.

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-02-29 13:44:44 +02:00
Avi Kivity
8636b9295b memory: fix RAM subpages in newly initialized pages
If the first subpage installed in a page is RAM, then we install it as
a full page, instead of a subpage.  Fix by not special casing RAM.

The issue dates to commit db7b5426a4, which introduced subpages.

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-02-29 13:44:43 +02:00
Avi Kivity
d6f2ea22a0 memory: compress phys_map node pointers to 16 bits
Use an expanding vector to store nodes.  Allocation is baroque to g_renew()
potentially invalidating pointers; this will be addressed later.

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-02-29 13:44:43 +02:00
Avi Kivity
5312bd8b31 memory: store MemoryRegionSection pointers in phys_map
Instead of storing PhysPageDesc, store pointers to MemoryRegionSections.
The various offsets (phys_offset & ~TARGET_PAGE_MASK,
PHYS_OFFSET & TARGET_PAGE_MASK, region_offset) can all be synthesized
from the information in a MemoryRegionSection.  Adjust phys_page_find()
to synthesize a PhysPageDesc.

The upshot is that phys_map now contains uniform values, so it's easier
to generate and compress.

The end result is somewhat clumsy but this will be improved as we we
propagate MemoryRegionSections throughout the code instead of transforming
them to PhysPageDesc.

The MemoryRegionSection pointers are stored as uint16_t offsets in an
array.  This saves space (when we also compress node pointers) and is
more cache friendly.

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-02-29 13:44:43 +02:00
Avi Kivity
4346ae3e28 memory: unify phys_map last level with intermediate levels
This lays the groundwork for storing leaf data in intermediate levels,
saving space.

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-02-29 13:44:43 +02:00
Avi Kivity
3eef53df6b memory: remove first level of l1_phys_map
L1 and the lower levels in l1_phys_map are equivalent, except that L1 has
a different size, and is always allocated.  Simplify the code by removing
L1.  This leaves us with a tree composed solely of L2 tables, but that
problem can be renamed away later.

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-02-29 13:44:43 +02:00
Avi Kivity
54688b1ec1 memory: change memory registration to rebuild the memory map on each change
Instead of incrementally building the memory map, rebuild it every time.
This allows later simplification, since the code need not consider overlaying
a previous mapping.  It is also RCU friendly.

With large memory guests this can get expensive, since the operation is
O(mem size), but this will be optimized later.

As a side effect subpage and L2 leaks are fixed here.

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-02-29 13:44:43 +02:00
Avi Kivity
50c1e1491e memory: support stateless memory listeners
Current memory listeners are incremental; that is, they are expected to
maintain their own state, and receive callbacks for changes to that state.

This patch adds support for stateless listeners; these work by receiving
a ->begin() callback (which tells them that new state is coming), a
sequence of ->region_add() and ->region_nop() callbacks, and then a
->commit() callback which signifies the end of the new state.  They should
ignore ->region_del() callbacks.

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-02-29 13:44:42 +02:00
Avi Kivity
4855d41a61 memory: split memory listener for the two address spaces
The memory and I/O address spaces do different things, so split them into
two memory listeners.

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-02-29 13:44:42 +02:00
Avi Kivity
7376e5827a memory: allow MemoryListeners to observe a specific address space
Ignore any regions not belonging to a specified address space.

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-02-29 13:44:42 +02:00
Avi Kivity
9363274709 memory: use a MemoryListener for core memory map updates too
This transforms memory.c into a library which can then be unit tested
easily, by feeding it inputs and listening to its outputs.

Signed-off-by: Avi Kivity <avi@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2012-02-29 13:44:42 +02:00
Avi Kivity
d7ec83e6b5 memory: don't pass ->readable attribute to cpu_register_physical_memory_log
It can be derived from the MemoryRegion itself (which is why it is not
used there).

Signed-off-by: Avi Kivity <avi@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2012-02-29 13:44:42 +02:00
Max Filippov
0dc23828f1 exec: let cpu_watchpoint_insert accept larger watchpoints
Make cpu_watchpoint_insert accept watchpoints of any power-of-two size
up to the target page size.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2012-02-20 20:07:11 +04:00
Max Filippov
488d65772c exec: fix check_watchpoint exiting cpu_loop
In case of BP_STOP_BEFORE_ACCESS watchpoint check_watchpoint intends to
signal EXCP_DEBUG exception on exit from cpu loop, but later overwrites
exception code by the cpu_resume_from_signal call.

Use cpu_loop_exit with BP_STOP_BEFORE_ACCESS watchpoints.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2012-02-20 20:07:11 +04:00
Max Filippov
6736415047 exec: add missing breaks to the watch_mem_write
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Meador Inge <meadori@codesourcery.com>
2012-02-20 20:07:02 +04:00
Peter Maydell
771124e1a6 exec.c: Clarify comment about tlb_flush() flush_global parameter
Clarify the comment about tlb_flush()'s flush_global parameter,
so it is clearer what it does and why it is OK that the implementation
currently ignores it.

Reviewed-by: Andreas F=C3=A4rber <afaerber@suse.de>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-02-01 14:45:01 -06:00
Benjamin Herrenschmidt
82afa58641 virtio-pci: Fix endianness of virtio config
The virtio config area in PIO space is a bit special. The initial
header is little endian but the rest (device specific) is guest
native endian.

The PIO accessors for PCI on machines that don't have native IO ports
assume that all PIO is little endian, which works fine for everything
except the above.

A complicated way to fix it would be to split the BAR into two memory
regions with different endianess settings, but this isn't practical
to do, besides, the PIO code doesn't honor region endianness anyway
(I have a patch for that too but it isn't necessary at this stage).

So I decided to go for the quick fix instead which consists of
reverting the swap in virtio-pci in selected places, hoping that when
we eventually do a "v2" of the virtio protocols, we sort that out once
and for all using a fixed endian setting for everything.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
[agraf: keep virtio in libhw and determine endianness through a
        helper function in exec.c]
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
2012-01-21 05:17:01 +01:00
Aurelien Jarno
5c84bd904b tcg-arm: fix a typo in comments
ARM still doesn't support 16GB buffers in 32-bit modes, replace the
16GB by 16MB in the comment.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2012-01-13 10:36:59 +00:00
Avi Kivity
11c7ef0c73 Remove IO_MEM_SHIFT
We no longer use any of the lower bits of a ram_addr, so we might as well
use them for the io table index.  This increases the number of potential
I/O handlers by a factor of 8.

Signed-off-by: Avi Kivity <avi@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2012-01-04 13:34:50 +02:00
Avi Kivity
75c578dcaa Drop IO_MEM_ROMD
Unlike ->readonly, ->readable is not inherited from aliase, so we can simply
query the memory region.

Signed-off-by: Avi Kivity <avi@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2012-01-04 13:34:50 +02:00
Avi Kivity
b3b00c78d8 Remove IO_MEM_SUBPAGE
Replace with a MemoryRegion flag.

Signed-off-by: Avi Kivity <avi@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2012-01-04 13:34:50 +02:00
Avi Kivity
a621f38de8 Direct dispatch through MemoryRegion
Now that all mmio goes through MemoryRegions, we can convert
io_mem_opaque to be a MemoryRegion pointer, and remove the thunks
that convert from old-style CPU{Read,Write}MemoryFunc to MemoryRegionOps.

Signed-off-by: Avi Kivity <avi@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2012-01-04 13:34:50 +02:00
Avi Kivity
1ec9b909ff Convert io_mem_watch to be a MemoryRegion
Signed-off-by: Avi Kivity <avi@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2012-01-04 13:34:50 +02:00
Avi Kivity
de712f9469 Convert IO_MEM_SUBPAGE_RAM to be a MemoryRegion
Signed-off-by: Avi Kivity <avi@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2012-01-04 13:34:50 +02:00
Avi Kivity
70c68e44bc Convert the subpage wrapper to be a MemoryRegion
Signed-off-by: Avi Kivity <avi@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2012-01-04 13:34:50 +02:00
Avi Kivity
dd81124bf6 Switch cpu_register_physical_memory_log() to use MemoryRegions
Still internally using ram_addr.

Signed-off-by: Avi Kivity <avi@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2012-01-04 13:34:50 +02:00
Avi Kivity
0e0df1e24d Convert IO_MEM_{RAM,ROM,UNASSIGNED,NOTDIRTY} to MemoryRegions
Convert the fixed-address IO_MEM_RAM, IO_MEM_ROM, IO_MEM_UNASSIGNED,
and IO_MEM_NOTDIRTY io handlers to MemoryRegions.  These aren't real
regions, since they are never added to the memory hierarchy, but they
allow reuse of the dispatch functionality.

Signed-off-by: Avi Kivity <avi@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2012-01-04 13:34:50 +02:00
Avi Kivity
d39e822265 Uninline get_page_addr_code()
Its use of IO_MEM_ROM and friends will later cause #include loops; and it
is too large to merit inlining.

Signed-off-by: Avi Kivity <avi@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2012-01-04 13:34:49 +02:00
Avi Kivity
1d393fa2d1 Avoid range comparisons on io index types
The code sometimes uses range comparisons on io indexes (e.g.
index =< IO_MEM_ROM).  Avoid these as they make moving to objects harder.

Signed-off-by: Avi Kivity <avi@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2012-01-04 13:34:49 +02:00
Avi Kivity
2774c6d0ae Fix wrong region_offset when overlaying a page with another
cpu_register_physical_memory_log() does not update region_offset
if a page was previously registered for the same address.  This
could cause mmio accesses going to the wrong place, by using the
old region_offset.

Signed-off-by: Avi Kivity <avi@redhat.com>
Acked-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2012-01-04 13:34:49 +02:00
Avi Kivity
acbbec5d43 memory: move mmio access to functions
Currently mmio access goes directly to the io_mem_{read,write} arrays.
In preparation for eliminating them, add indirection via a function.

Signed-off-by: Avi Kivity <avi@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2012-01-04 13:34:49 +02:00
Avi Kivity
f1f6e3b86e exec: make phys_page_find() return a temporary
Instead of returning a PhysPageDesc pointer, return a temporary.
This lets us move away from actually storing PhysPageDesc's, and
instead sythesising them when needed.

Signed-off-by: Avi Kivity <avi@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2012-01-04 13:34:49 +02:00
Avi Kivity
be675c9720 memory: move endianness compensation to memory core
Instead of doing device endianness compensation in cpu_register_io_memory(),
do it in the memory core.

Signed-off-by: Avi Kivity <avi@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2012-01-04 13:34:49 +02:00
Avi Kivity
8f77558f22 memory: obsolete cpu_physical_memory_[gs]et_dirty_tracking()
The getter is no longer used, so it is completely removed.

Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-01-04 13:34:49 +02:00
Avi Kivity
7c63736603 Store MemoryRegion in RAMBlock
As a step in moving live migration from RAMBlocks to MemoryRegions,
store the MemoryRegion in a RAMBlock.

Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-01-04 13:34:48 +02:00
Avi Kivity
c5705a7728 vmstate, memory: decouple vmstate from memory API
Currently creating a memory region automatically registers it for
live migration.  This differs from other state (which is enumerated
in a VMStateDescription structure) and ties the live migration code
into the memory core.

Decouple the two by introducing a separate API, vmstate_register_ram(),
for registering a RAM block for migration.  Currently the same
implementation is reused, but later it can be moved into a separate list,
and registrations can be moved to VMStateDescription blocks.

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-01-04 13:34:48 +02:00
Avi Kivity
586c6230c0 Remove cpu_get_physical_page_desc()
No longer used.

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-01-03 19:19:28 +02:00
Avi Kivity
dcd97e33af memory: remove CPUPhysMemoryClient
No longer used.

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-01-03 19:19:27 +02:00
Avi Kivity
7664e80c84 memory: add API for observing updates to the physical memory map
Add an API that allows a client to observe changes in the global
memory map:
 - region added (possibly with logging enabled)
 - region removed (possibly with logging enabled)
 - logging started on a region
 - logging stopped on a region
 - global logging started
 - global logging removed

This API will eventually replace cpu_register_physical_memory_client().

Signed-off-by: Avi Kivity <avi@redhat.com>
2011-12-20 14:14:07 +02:00