* qemu-kvm/uq/master:
virtio: move common irqfd handling out of virtio-pci
virtio: move common ioeventfd handling out of virtio-pci
event_notifier: add event_notifier_set_handler
memory: pass EventNotifier, not eventfd
ivshmem: wrap ivshmem_del_eventfd loops with transaction
ivshmem: use EventNotifier and memory API
event_notifier: add event_notifier_init_fd
event_notifier: remove event_notifier_test
event_notifier: add event_notifier_set
apic: Defer interrupt updates to VCPU thread
apic: Reevaluate pending interrupts on LVT_LINT0 changes
apic: Resolve potential endless loop around apic_update_irq
kvm: expose tsc deadline timer feature to guest
kvm_pv_eoi: add flag support
kvm: Don't abort on kvm_irqchip_add_msi_route()
Under Win32, EventNotifiers will not have event_notifier_get_fd, so we
cannot call it in common code such as hw/virtio-pci.c. Pass a pointer to
the notifier, and only retrieve the file descriptor in kvm-specific code.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
By default qemu will use MAP_PRIVATE for guest pages. This will write
protect pages and thus break on s390 systems that dont support this feature.
Therefore qemu has a hack to always use MAP_SHARED for s390. But MAP_SHARED
has other problems (no dirty pages tracking, a lot more swap overhead etc.)
Newer systems allow the distinction via KVM_CAP_S390_COW. With this feature
qemu can use the standard qemu alloc if available, otherwise it will use
the old s390 hack.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Acked-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Refactor the code that is only needed for tcg to an static function.
Call that only when tcg is enabled. We can't refactor to a dummy
function in the kvm case, as qemu can be compiled at the same time
with tcg and kvm.
Signed-off-by: Juan Quintela <quintela@redhat.com>
This makes it easier to remove it from BusInfo.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[AF: Drop now unnecessary NULL initialization in scsibus_get_dev_path()]
Signed-off-by: Andreas Färber <afaerber@suse.de>
tb_invalidate_phys_addr has to be called with the exact physical address of
the breakpoint we add/remove, not just the page's base address.
Otherwise we easily fail to flush the right TB.
This breakage was introduced by the commit f3705d5329 "memory: make
phys_page_find() return an unadjusted".
This appeared to work for some guest architectures because their
cpu_get_phys_page_debug implementation returns full translated physical
address, not just the base of the TARGET_PAGE_SIZE-sized page.
Reported-by: TeLeMan <geleman@gmail.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
They could suggest that all TBs of the page containing the range would
be invalidated.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
This API will be used in the following patch.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
If we execute linux-user code that does the following:
* A = mmap()
* execute code in A
* munmap(A)
* B = mmap(), but mmap returns the same address as A
* execute code in B
we end up executing a stale cached tb that contains translated code
from A, while we want new code from B.
This patch adds a TB flush for mmap'ed regions, before we return them,
avoiding the whole issue. It also adds a flush for munmap, so that we
don't execute stale TBs instead of getting a segfault.
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Riku Voipio <riku.voipio@linaro.org>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Fold is_ram_rom and is_ram_rom_romd() into callers.
Change is_romd() and section_addr() to take MemoryRegion
instead of MemoryRegionSection for consistency and
use memory_region_ prefix.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Make s_cputlb_empty_entry 'const'.
Rename tlb_flush_jmp_cache() to tb_flush_jmp_cache().
Refactor code to add cpu_tlb_reset_dirty_all(),
memory_region_section_get_iotlb() and
memory_region_is_unassigned().
Remove unused cpu_tlb_update_dirty().
Fix coding style in areas to be moved.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Replace all type casts to 'long' or 'unsigned long' by 'intptr_t' or 'uintptr_t'.
For type casts which are only used to extract the lower bits of an address
or to modify those bits, signedness does not matter. There I always use 'uintptr_t'.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Allow TB invalidation by its physical address, extract implementation
from the breakpoint_invalidate function.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Use uintptr_t instead of void * or unsigned long in
several op related functions, env->mem_io_pc and
GETPC() macro.
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
QEMU host addresses must use uintptr_t to be portable for hosts with
an unusual size of long (w64).
tb_jmp_offset is an uint16_t value, therefore the local variable offset
in function tb_set_jmp_target was changed from unsigned long to uint16_t.
The type cast to long in function tb_add_jump now also uses uintptr_t.
For the bit operation used here, the signedness of the type cast does
not matter.
Some remaining unsigned long values are either only used for ARM assembler
code or will be fixed in a later patch for PPC.
v2:
Fix signature of tb_find_pc in exec.c, too (hint from Blue Swirl, thanks).
There remain lots of other long / unsigned long in exec.c which must be
replaced by uintptr_t. This will be done in a separate patch. Here
only one of these type casts is fixed.
v3:
Also fix signature of page_unprotect.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
This allows us to generate unwind info for the dynamicly generated
code in the code_gen_buffer. Only i386 is converted at this point.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
In cpu_physical_memory_rw, a change has been introduced and qemu_get_ram_ptr is
no longuer called with the ram addr we want to access, but only with the
section address. This patch fixes this. (All other call to qemu_get_ram_ptr are
already called with the right address.)
This patch fixes Xen guest.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
The code to get the ram_addr from a (tlb entry, vaddr) pair
checks that the resulting memory is not MMIO, but neglects to
check whether the region is hidden by a watchpoint page.
Add the missing check.
Signed-off-by: Avi Kivity <avi@redhat.com>
A couple of code paths check the lower bits of CPUTLBEntry::addr_write
against io_mem_ram as a way of looking for a dirty RAM page. This works
by accident since the value is zero, which matches all clear bits for
TLB_INVALID, TLB_MMIO, and TLB_NOTDIRTY (indicating dirty RAM).
Make it work by design by checking for the proper bits.
Signed-off-by: Avi Kivity <avi@redhat.com>
Optionally, make memory access helpers take a parameter for CPUState
instead of relying on global env.
On most targets, perform simple moves to reorder registers. On i386,
switch from regparm(3) calling convention to standard stack-based
version.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Scripted conversion:
for file in *.[hc] hw/*.[hc] hw/kvm/*.[hc] linux-user/*.[hc] linux-user/m68k/*.[hc] bsd-user/*.[hc] darwin-user/*.[hc] tcg/*/*.[hc] target-*/cpu.h; do
sed -i "s/CPUState/CPUArchState/g" $file
done
All occurrences of CPUArchState are expected to be replaced by QOM CPUState,
once all targets are QOM'ified and common fields have been extracted.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
The return value of cpu_register_io_memory() is no longer used anywhere, so
we can remove it and all associated data and code.
Signed-off-by: Avi Kivity <avi@redhat.com>
Instead of indirecting via io_mem_region, dispatch directly
through the MemoryRegion obtained from the iotlb or phys_page_find().
Signed-off-by: Avi Kivity <avi@redhat.com>
get_page_addr_code() reads a code tlb entry, but interprets it as an
iotlb entry. This works by accident since the low bits of a RAM code
tlb entry are clear, and match a RAM iotlb entry. This accident is
about to unhappen, so fix the code to use an iotlb entry (using the
code entry with TLB_MMIO may fail if the page is a watchpoint).
Signed-off-by: Avi Kivity <avi@redhat.com>
We'd like to store the section index in the iotlb, so we can't
adjust it before returning. Return an unadjusted section and
instead introduce section_addr(), which does the adjustment later.
Signed-off-by: Avi Kivity <avi@redhat.com>
Commit e58ac72b6a0 ("ioport: change portio_list not to use
memory_region_set_offset()") started using aliases of I/O memory
regions. Since the IORange used for the I/O was contained in the
target region, the alias information (specifically, the offset
into the region) was lost. This broke -vga std.
Fix by allocating an independent object to hold the IORange and
also the new offset.
Note that I/O memory regions were conceptually broken wrt aliases
in a different way: an alias can cause the same region to appear
twice in an address space, but we had just one IORange to service it.
This patch fixes that problem as well, since we can now have multiple
IORange/MemoryRegion associations.
Signed-off-by: Avi Kivity <avi@redhat.com>
When storing large contiguous ranges in phys_map, all values tend to
be the same pointers to a single MemoryRegionSection. Collapse them
by marking nodes with level > 0 as leaves. This reduces tree memory
usage dramatically.
Signed-off-by: Avi Kivity <avi@redhat.com>
Instead of considering subpage on a per-page basis, split each section
into a subpage head, multipage body, and subpage tail, and register
each separately. This simplifies the registration functions.
Signed-off-by: Avi Kivity <avi@redhat.com>
We no longer describe memory in terms of individual pages; use sections
throughout instead.
PhysPageDesc no longer used - remove.
Signed-off-by: Avi Kivity <avi@redhat.com>
If the first subpage installed in a page is RAM, then we install it as
a full page, instead of a subpage. Fix by not special casing RAM.
The issue dates to commit db7b5426a4, which introduced subpages.
Signed-off-by: Avi Kivity <avi@redhat.com>
Use an expanding vector to store nodes. Allocation is baroque to g_renew()
potentially invalidating pointers; this will be addressed later.
Signed-off-by: Avi Kivity <avi@redhat.com>
Instead of storing PhysPageDesc, store pointers to MemoryRegionSections.
The various offsets (phys_offset & ~TARGET_PAGE_MASK,
PHYS_OFFSET & TARGET_PAGE_MASK, region_offset) can all be synthesized
from the information in a MemoryRegionSection. Adjust phys_page_find()
to synthesize a PhysPageDesc.
The upshot is that phys_map now contains uniform values, so it's easier
to generate and compress.
The end result is somewhat clumsy but this will be improved as we we
propagate MemoryRegionSections throughout the code instead of transforming
them to PhysPageDesc.
The MemoryRegionSection pointers are stored as uint16_t offsets in an
array. This saves space (when we also compress node pointers) and is
more cache friendly.
Signed-off-by: Avi Kivity <avi@redhat.com>
L1 and the lower levels in l1_phys_map are equivalent, except that L1 has
a different size, and is always allocated. Simplify the code by removing
L1. This leaves us with a tree composed solely of L2 tables, but that
problem can be renamed away later.
Signed-off-by: Avi Kivity <avi@redhat.com>