Commit Graph

715 Commits

Author SHA1 Message Date
Juan Quintela
7e5609a85e exec: create function to get a single dirty bit
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
2014-01-13 14:04:53 +01:00
Juan Quintela
06567942e5 exec: use accessor function to know if memory is dirty
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-01-13 14:04:53 +01:00
Marcelo Tosatti
2ba8285289 mempath prefault: fix off-by-one error
Fix off-by-one error (noticed by Andrea Arcangeli).

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2013-12-30 19:05:11 -02:00
Alexander Graf
582b55a96a roms: Flush icache when writing roms to guest memory
We use the rom infrastructure to write firmware and/or initial kernel
blobs into guest address space. So we're basically emulating the cache
off phase on very early system bootup.

That phase is usually responsible for clearing the instruction cache for
anything it writes into cachable memory, to ensure that after reboot we
don't happen to execute stale bits from the instruction cache.

So we need to invalidate the icache every time we write a rom into guest
address space. We do not need to do this for every DMA since the guest
expects it has to flush the icache manually in that case.

This fixes random reboot issues on e5500 (booke ppc) for me.

Signed-off-by: Alexander Graf <agraf@suse.de>
2013-12-20 01:58:03 +01:00
Alexander Graf
a94b36ddd6 roms: Flush icache when writing roms to guest memory
We use the rom infrastructure to write firmware and/or initial kernel
blobs into guest address space. So we're basically emulating the cache
off phase on very early system bootup.

That phase is usually responsible for clearing the instruction cache for
anything it writes into cachable memory, to ensure that after reboot we
don't happen to execute stale bits from the instruction cache.

So we need to invalidate the icache every time we write a rom into guest
address space. We do not need to do this for every DMA since the guest
expects it has to flush the icache manually in that case.

This fixes random reboot issues on e5500 (booke ppc) for me.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-12-13 13:38:50 +01:00
Marcel Apfelbaum
53cb28cbfe exec: separate sections and nodes per address space
Every address space has its own nodes and sections, but
it uses the same global arrays of nodes/section.

This limits the number of devices that can be attached
to the guest to 20-30 devices. It happens because:
 - The sections array is limited to 2^12 entries.
 - The main memory has at least 100 sections.
 - Each device address space is actually an alias to
   main memory, multiplying its number of nodes/sections.

Remove the limitation by using separate arrays of
nodes and sections for each address space.

Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-12-11 20:11:09 +02:00
Michael S. Tsirkin
026736cebf exec: reduce L2_PAGE_SIZE
With the single exception of ppc with 16M pages,
we get the same number of levels
with L2_PAGE_SIZE = 10 as with L2_PAGE_SIZE = 9.

by doing this we reduce memory footprint of a single level
in the node memory map by 2x without runtime overhead.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-12-10 12:29:56 +02:00
Paolo Bonzini
57271d63c4 exec: make address spaces 64-bit wide
As an alternative to commit 818f86b (exec: limit system memory
size, 2013-11-04) let's just make all address spaces 64-bit wide.
This eliminates problems with phys_page_find ignoring bits above
TARGET_PHYS_ADDR_SPACE_BITS and address_space_translate_internal
consequently messing up the computations.

In Luiz's reported crash, at startup gdb attempts to read from address
0xffffffffffffffe6 to 0xffffffffffffffff inclusive.  The region it gets
is the newly introduced master abort region, which is as big as the PCI
address space (see pci_bus_init).  Due to a typo that's only 2^63-1,
not 2^64.  But we get it anyway because phys_page_find ignores the upper
bits of the physical address.  In address_space_translate_internal then

    diff = int128_sub(section->mr->size, int128_make64(addr));
    *plen = int128_get64(int128_min(diff, int128_make64(*plen)));

diff becomes negative, and int128_get64 booms.

The size of the PCI address space region should be fixed anyway.

Reported-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-12-10 12:29:56 +02:00
Michael S. Tsirkin
b35ba30f8f exec: memory radix tree page level compression
At the moment, memory radix tree is already variable width, but it can
only skip the low bits of address.

This is efficient if we have huge memory regions but inefficient if we
are only using a tiny portion of the address space.

After we have built up the map, detect
configurations where a single L2 entry is valid.

We then speed up the lookup by skipping one or more levels.
In case any levels were skipped, we might end up in a valid section
instead of erroring out. We handle this by checking that
the address is in range of the resulting section.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-12-10 12:29:56 +02:00
Michael S. Tsirkin
97115a8d45 exec: pass hw address to phys_page_find
callers always shift by target page bits so let's just do this
internally.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-12-10 12:29:56 +02:00
Michael S. Tsirkin
8b795765db exec: extend skip field to 6 bit, page entry to 32 bit
Extend skip to 6 bit. As page entry doesn't fit in 16 bit
any longer anyway, extend it to 32 bit.
This doubles node map memory requirements, but follow-up
patches will save this memory.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-12-10 12:29:56 +02:00
Michael S. Tsirkin
9736e55b78 exec: replace leaf with skip
In preparation for dynamic radix tree depth support, rename is_leaf
field to skip, telling us how many bits to skip to next level.
Set to 0 for leaf.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-12-10 12:29:56 +02:00
Paolo Bonzini
03f4995781 split definitions for exec.c and translate-all.c radix trees
The exec.c and translate-all.c radix trees are quite different, and
the exec.c one in particular is not limited to the CPU---it can be
used also by devices that do DMA, and in that case the address space
is not limited to TARGET_PHYS_ADDR_SPACE_BITS bits.

We want to make exec.c's radix trees 64-bit wide.  As a first step,
stop sharing the constants between exec.c and translate-all.c.
exec.c gets P_L2_* constants, translate-all.c gets V_L2_*, for
consistency with the existing V_L1_* symbols.  Though actually
in the softmmu case translate-all.c is also indexed by physical
addresses...

This patch has no semantic change.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-12-10 12:29:56 +02:00
Marcelo Tosatti
ef36fa1492 qemu: mempath: prefault pages manually (v4)
v4: s/fail/failed/  (Peter Maydell)

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2013-11-25 11:28:56 +01:00
Anthony Liguori
29c5b77d3d pci, pc, virtio bug fixes
This reverts PCI master abort support - we'll want it
 eventually but it exposes too many core bugs to be safe for 1.7.
 This also reverts a recent exec.c change that was an
 attempt to work-around some of these core bugs.
 
 Also included are small fixes in pc and virtio,
 and a core loader fix for PPC bamboo.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.15 (GNU/Linux)
 
 iQEcBAABAgAGBQJSf4ZyAAoJECgfDbjSjVRp9DIIAK7yEMa9ie5n3sInKH+xHT3R
 Sf4uErqx55WfT/54dnLJPrs7DTfXblW+Qjnq/7RuaoJ32Dfshgxz64mPF+Lm2s3+
 ghjdQrKo2YkdSbbxy+AnBNO4eHMSeUs/rM2yIfi7FZU0nwC7wNe1QpAN3UjM4yAF
 5vE18xZE0Rxz/prXgofLtPHa1czvGPFk1qbS7Vag6HCSkfEI4N1Jxf9otDRV6KZP
 9hX0kTvZyOKdbhccN05G4VCWwx5YUrpBsNSoph4Jx1aokEBoucr4sgE1FPDp0H9H
 bJqDaAM2G5HNrDtIiDov5WOzRNT/ly011Q4mcaQh3va0pqUXttKCHgE1KRgn76I=
 =iMNW
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'mst/tags/for_anthony' into staging

pci, pc, virtio bug fixes

This reverts PCI master abort support - we'll want it
eventually but it exposes too many core bugs to be safe for 1.7.
This also reverts a recent exec.c change that was an
attempt to work-around some of these core bugs.

Also included are small fixes in pc and virtio,
and a core loader fix for PPC bamboo.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Sun 10 Nov 2013 05:13:22 AM PST using RSA key ID D28D5469
# gpg: Can't check signature: public key not found

# By Michael S. Tsirkin (3) and others
# Via Michael S. Tsirkin
* mst/tags/for_anthony:
  Revert "exec: limit system memory size"
  Revert "hw/pci: partially handle pci master abort"
  loader: drop return value for rom_add_blob_fixed
  acpi-build: disable with -no-acpi
  virtio-net: only delete bh that existed
  Fix pc migration from qemu <= 1.5

Message-id: 1384159176-31662-1-git-send-email-mst@redhat.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-11-13 11:48:35 -08:00
Michael S. Tsirkin
ef9e455d64 Revert "exec: limit system memory size"
This reverts commit 818f86b883.

This was a work-around for bugs elsewhere in the system,
exposed by commit a53ae8e934:
    "hw/pci: partially handle pci master abort"
since that's reverted now, the work-around is not required for 1.7
anymore.
The proper fix is supporting full 64 bit addresses in the radix tree.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Marcel Apfelbaum <marcel.a@redhat.com>
2013-11-10 15:11:01 +02:00
Max Filippov
e8262a1b5b exec: fix breakpoint_invalidate when pc may not be translated
This fixes qemu abort with the following message:

    include/qemu/int128.h:22: int128_get64: Assertion `!a.hi' failed.

which happens due to attempt to invalidate breakpoint by virtual address
for which get_phys_page_debug couldn't find mapping.

For more details see
http://lists.nongnu.org/archive/html/qemu-devel/2013-09/msg04582.html

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2013-11-08 09:25:22 +04:00
Michael S. Tsirkin
818f86b883 exec: limit system memory size
The page table logic in exec.c assumes
that memory addresses are at most TARGET_PHYS_ADDR_SPACE_BITS.

But pci addresses are full 64 bit so if we try to render them ignoring
the extra bits, we get strange effects with sections overlapping each
other.

To fix, simply limit the system memory size to
 1 << TARGET_PHYS_ADDR_SPACE_BITS,
pci addresses will be rendered within that.

Cc: qemu-stable@nongnu.org
Reported-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-11-04 15:38:49 +02:00
Kevin Wolf
e85d9db5f6 exec: Fix bounce buffer allocation in address_space_map()
This fixes a regression introduced by commit e3127ae0c, which kept the
allocation size of the bounce buffer limited to one page in order to
avoid unbounded allocations (as explained in the commit message of
6d16c2f88), but broke the reporting of the shortened bounce buffer to
the caller. The caller therefore assumes that the full requested size
was provided and causes memory corruption when writing beyond the end of
the actually allocated buffer.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-28 17:34:42 +01:00
Paolo Bonzini
041603fe5d exec: remove qemu_safe_ram_ptr
This is not needed since the RAM list is not modified anymore by
qemu_get_ram_ptr.  Replace it with qemu_get_ram_block.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-17 17:31:00 +02:00
Stefan Weil
575ddeb459 exec: Fix prototype of phys_mem_set_alloc and related functions
phys_mem_alloc and its assigned values qemu_anon_ram_alloc and
legacy_s390_alloc must have identical argument lists.

legacy_s390_alloc uses the size parameter to call mmap, so size_t is
good enough for all of them.

This patch fixes compiler errors on i686 Linux hosts:

  CC    alpha-softmmu/exec.o
exec.c:752:51: error:
 initialization from incompatible pointer type [-Werror]
exec.c: In function 'qemu_ram_alloc_from_ptr':
exec.c:1139:32: error:
 comparison of distinct pointer types lacks a cast [-Werror]
exec.c: In function 'qemu_ram_remap':
exec.c:1283:21: error:
 comparison of distinct pointer types lacks a cast [-Werror]

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1380481005-32399-1-git-send-email-sw@weilnetz.de
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
2013-10-14 08:50:34 -07:00
Anthony Liguori
39c153b80f QOM CPUState refactorings / X86CPU
* Fix for X86CPU model field of qemu32/qemu64 CPU models
 * Bug fix for longjmp on FreeBSD
 * Removal of unused function
 * Confinement of clone syscall infrastructure to linux-user
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iQIcBAABAgAGBQJSVTKzAAoJEPou0S0+fgE/7tYP/i5dgm6q7jSnhJcwzgHlCHDE
 c0BTwnvFjdBdkuAARYb/soo0m9QWfsW/dgC4bG3rO5j3o84PLstMjiZSQch0pqM1
 YhA0hYSiFjHrMcRk9FOwIECPIe+QcHZ79iNML+9G4K13D7qg36aJWISbVOWy24Dp
 kj5D0wBBDNw032Oh/3z3EAK4U+vLc/+i4s8XjfwtbuBCCn7GMCE3mRnEqnf8ZX3o
 H3Il3h/o+I3XQSzIJKXXyJZ5ZVXTtlj0z/0ShQXe8o8u1hINXE2Nf9lB6WG/6sh0
 Y43d0uU/e9fWDer25j9yis9KfDNErgYyxlBMUA2X1+Rny5P0twjnnBr5GTAeKgSq
 Kcux8Ov7W8cbVoM/px03rnynF9rbFbgmGlx82L+QsNMKWhjnEsfs6unpccpGhHR5
 UuZX3ZPrmeHfjv0AZD/U2ya3jfrp0v+9gsTqy3QV1rCPbqPDcJ6jg8jzbPZYjEfa
 /Zy0e/0O3sytSyiaAfBg3MzVPBxdzPcn0JjExJQV9BHsUlkZIVCZVMfePw1oIaf+
 coyV4cT3hCe8LrSCzPZlRYP+1hIg41W4NicLbDxtS8lqgfRbcglvqw6NFdAM+NcB
 z3heQ7IFstQ+pEINXQNy6bS8orv8F1VVvCtZaV+2pzB4TZzjPYuGsrqygre4QkLU
 mtpN9BTfmSIjzyo6iYBv
 =hQfy
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'afaerber/tags/qom-cpu-for-anthony' into staging

QOM CPUState refactorings / X86CPU

* Fix for X86CPU model field of qemu32/qemu64 CPU models
* Bug fix for longjmp on FreeBSD
* Removal of unused function
* Confinement of clone syscall infrastructure to linux-user

# gpg: Signature made Wed 09 Oct 2013 03:40:51 AM PDT using RSA key ID 3E7E013F
# gpg: Can't check signature: public key not found

# By Andreas Färber (2) and others
# Via Andreas Färber
* afaerber/tags/qom-cpu-for-anthony:
  cpu: Drop cpu_model_str from CPU_COMMON
  cpu: Move cpu_copy() into linux-user
  cputlb: Remove dead function tlb_update_dirty()
  cpu-exec: Also reload CPUClass *cc after longjmp return in cpu_exec()
  target-i386: Set model=6 on qemu64 & qemu32 CPU models
2013-10-10 13:16:25 -07:00
Andreas Färber
30ba0ee52d cpu: Move cpu_copy() into linux-user
It is only used there and is deemed very fragile if not incorrect in its
current memcpy() form. Moving it into linux-user will allow to move
parts into target_cpu.h headers and only copy what the ABI mandates.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-10-07 11:48:39 +02:00
Amos Kong
016e9d62fe exec: cleanup DEBUG_SUBPAGE
Touched some error after enabling DEBUG_SUBPAGE.

Signed-off-by: Amos Kong <akong@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-10-02 22:55:28 +04:00
Anthony Liguori
2e6ae666c8 Merge remote-tracking branch 'mjt/trivial-patches' into staging
# By Stefan Weil (8) and others
# Via Michael Tokarev
* mjt/trivial-patches:
  tests/.gitignore: ignore test-throttle
  exec: Fix broken build for MinGW (regression)
  kvm: Fix compiler warning (clang)
  tcg-sparc: Fix parenthesis warning
  Makefile: Remove some more files when cleaning
  target-i386: Fix segment cache dump
  iov: avoid "orig_len may be used unitialized" warning
  vscclient: remove unnecessary use of uninitialized variable
  trace-events: Clean up with scripts/cleanup-trace-events.pl again
  tci: Fix qemu-alpha on 32 bit hosts (wrong assertions)
  *-user: Improve documentation for lock_user function
  MAINTAINERS: Add missing entry to filelist for TCI target
  translate-all: Fix formatting of dump output
  *-user: Fix typo in comment (ulocking -> unlocking)
  docs: Fix IO port number for CPU present bitmap.
  q35: Fix typo in constant DEFUALT -> DEFAULT.
  configure: Undefine _FORTIFY_SOURCE prior using it

Message-id: 1379696296-32105-1-git-send-email-mjt@msgid.tls.msk.ru
2013-09-23 11:52:55 -05:00
Anthony Liguori
3e4be9c297 Merge remote-tracking branch 'qemu-kvm/uq/master' into staging
# By Alexey Kardashevskiy (3) and others
# Via Paolo Bonzini
* qemu-kvm/uq/master:
  target-i386: add feature kvm_pv_unhalt
  linux-headers: update to 3.12-rc1
  target-i386: forward CPUID cache leaves when -cpu host is used
  linux-headers: update to 3.11
  kvm: fix traces to use %x instead of %d
  kvmvapic: Clear also physical ROM address when entering INACTIVE state
  kvmvapic: Enter inactive state on hardware reset
  kvmvapic: Catch invalid ROM size
  kvm irqfd: support direct msimessage to irq translation
  fix steal time MSR vmsd callback to proper opaque type
  kvm: warn if num cpus is greater than num recommended
  cpu: Move cpu state syncs up into cpu_dump_state()
  exec: always use MADV_DONTFORK

Message-id: 1379694292-1601-1-git-send-email-pbonzini@redhat.com
2013-09-23 11:52:49 -05:00
Stefan Weil
089f3f761e exec: Fix broken build for MinGW (regression)
Commit 3435f39513 reduced the ifdeffery with
this result for MinGW:

exec.c: In function ‘qemu_ram_free’:
exec.c:1239:17: warning:
 implicit declaration of function ‘munmap’ [-Wimplicit-function-declaration]
exec.c:1239:17: warning:
 nested extern declaration of ‘munmap’ [-Wnested-externs]
exec.c:1239: undefined reference to `munmap'

Add some ifdeffery again to fix this.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2013-09-20 20:13:09 +04:00
Andrea Arcangeli
3e469dbfe4 exec: always use MADV_DONTFORK
MADV_DONTFORK prevents fork to fail with -ENOMEM if the default
overcommit heuristics decides there's too much anonymous virtual
memory allocated. If the KVM secondary MMU is synchronized with MMU
notifiers or not, doesn't make a difference in that regard.

Secondly it's always more efficient to avoid copying the guest
physical address space in the fork child (so we avoid to mark all the
guest memory readonly in the parent and so we skip the establishment
and teardown of lots of pagetables in the child).

In the common case we can ignore the error if MADV_DONTFORK is not
available. Leave a second invocation that errors out in the KVM path
if MMU notifiers are missing and KVM is enabled, to abort in such
case.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Tested-By: Benoit Canet <benoit@irqsave.net>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-09-20 12:37:52 +02:00
Markus Armbruster
39228250ce exec: Don't abort when we can't allocate guest memory
We abort() on memory allocation failure.  abort() is appropriate for
programming errors.  Maybe most memory allocation failures are
programming errors, maybe not.  But guest memory allocation failure
isn't, and aborting when the user asks for more memory than we can
provide is not nice.  exit(1) instead, and do it in just one place, so
the error message is consistent.

Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-id: 1375276272-15988-8-git-send-email-armbru@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-09-12 11:45:32 -05:00
Markus Armbruster
e1e84ba050 exec: Clean up unnecessary S390 ifdeffery
Another issue missed in commit fdec991 is -mem-path: it needs to be
rejected only for old S390 KVM, not for any S390.  Not that I
personally care, but the ifdeffery in qemu_ram_alloc_from_ptr() annoys
me.

Note that this doesn't actually make -mem-path work, as the kernel
doesn't (yet?)  support large pages in the host for KVM guests.  Clean
it up anyway.

Thanks to Christian Borntraeger for pointing out the S390 kernel
limitations.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-id: 1375276272-15988-7-git-send-email-armbru@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-09-12 11:45:32 -05:00
Markus Armbruster
2eb9fbaab5 exec: Drop incorrect & dead S390 code in qemu_ram_remap()
Old S390 KVM wants guest RAM mapped in a peculiar way.  Commit 6b02494
implemented that.

When qemu_ram_remap() got added in commit cd19cfa, its code carefully
mimicked the allocation code: peculiar way if defined(TARGET_S390X) &&
defined(CONFIG_KVM), else normal way.

For new S390 KVM, we actually want the normal way.  Commit fdec991
changed qemu_ram_alloc_from_ptr() accordingly, but forgot to update
qemu_ram_remap().  If qemu_ram_alloc_from_ptr() maps RAM the normal
way, but qemu_ram_remap() remaps it the peculiar way, remapping
changes protection and flags, which it shouldn't.

Fortunately, this can't happen, as we never remap on S390.

Replace the incorrect code with an assertion.

Thanks to Christian Borntraeger for help with assessing the bug's
(non-)impact.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Message-id: 1375276272-15988-6-git-send-email-armbru@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-09-12 11:45:31 -05:00
Markus Armbruster
91138037cb exec: Simplify the guest physical memory allocation hook
Make it a generic hook rather than a KVM hook.  Less code and
ifdeffery.

Since the only user of the hook is old S390 KVM, there's hope we can
get rid of it some day.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Message-id: 1375276272-15988-5-git-send-email-armbru@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-09-12 11:45:31 -05:00
Markus Armbruster
3435f39513 exec: Reduce ifdeffery around -mem-path
Instead of spreading its ifdeffery everywhere, confine it to
qemu_ram_alloc_from_ptr().  Everywhere else, simply test block->fd,
which is non-negative exactly when block uses -mem-path.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Message-id: 1375276272-15988-4-git-send-email-armbru@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-09-12 11:45:31 -05:00
Markus Armbruster
0628c18267 exec: Clean up fall back when -mem-path allocation fails
With -mem-path, qemu_ram_alloc_from_ptr() first tries to allocate
accordingly, but when it fails, it falls back to normal allocation.

The fall back allocation code used to be effectively identical to the
"-mem-path not given" code, until it started to diverge in commit
432d268.  I believe the code still works, but clean it up anyway: drop
the special fall back allocation code, and fall back to the ordinary
"-mem-path not given" code instead.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Message-id: 1375276272-15988-3-git-send-email-armbru@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-09-12 11:45:31 -05:00
Markus Armbruster
dfeaf2abc7 exec: Fix Xen RAM allocation with unusual options
Issues:

* We try to obey -mem-path even though it can't work with Xen.

* To implement -machine mem-merge, we call
  memory_try_enable_merging(new_block->host, size).  But with Xen,
  new_block->host remains null.  Oops.

Fix by separating Xen allocation from normal allocation.

Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Message-id: 1375276272-15988-2-git-send-email-armbru@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
2013-09-12 11:45:31 -05:00
liguang
2641689a37 exec: do tcg_commit only when tcg_enabled
Signed-off-by: liguang <lig.fnst@cn.fujitsu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-05 18:11:52 +02:00
Jan Kiszka
3bb28b7208 memory: Provide separate handling of unassigned io ports accesses
Accesses to unassigned io ports shall return -1 on read and be ignored
on write. Ensure these properties via dedicated ops, decoupling us from
the memory core's handling of unassigned accesses.

Cc: qemu-stable@nongnu.org
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-05 18:11:43 +02:00
Hu Tao
8826624970 exec: check offset_within_address_space for register subpage
If offset_within_address_space falls in a page, then we register a
subpage. So check offset_within_address_space rather than
offset_within_region.

Cc: qemu-stable@nongnu.org
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: "Andreas Färber" <afaerber@suse.de>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-05 18:11:37 +02:00
Paolo Bonzini
098178f274 exec: fix writing to MMIO area with non-power-of-two length
The problem is introduced by commit 2332616 (exec: Support 64-bit
operations in address_space_rw, 2013-07-08).  Before that commit,
memory_access_size would only return 1/2/4.

Since alignment is already handled above, reduce l to the largest
power of two that is smaller than l.

Cc: qemu-stable@nongnu.org
Reported-by: Oleksii Shevchuk <alxchk@gmail.com>
Tested-by: Oleksii Shevchuk <alxchk@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-05 18:11:28 +02:00
Andreas Färber
38fcbd3f08 cpu: Replace qemu_for_each_cpu()
It was introduced to loop over CPUs from target-independent code, but
since commit 182735efaf target-independent
CPUState is used.

A loop can be considered more efficient than function calls in a loop,
and CPU_FOREACH() hides implementation details just as well, so use that
instead.

Suggested-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-09-03 12:25:55 +02:00
Andreas Färber
bdc44640cb cpu: Use QTAILQ for CPU list
Introduce CPU_FOREACH(), CPU_FOREACH_SAFE() and CPU_NEXT() shorthand
macros.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-09-03 12:25:55 +02:00
Andreas Färber
e0d4794458 cpu: Fix VMSTATE_CPU() semantics
Commit 1a1562f5ea prepared a VMSTATE_CPU()
macro for device-style VMStateDescription registration, but missed to
adapt cpu_exec_init(), so that the "cpu_common" VMStateDescription was
still registered for AlphaCPU (fe31e73742)
and OpenRISCCPU (da69721460). Fix this.

Cc: Richard Henderson <rth@twiddle.net>
Tested-by: Jia Liu <proljc@gmail.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-07-31 21:03:59 +02:00
Stefan Weil
38e478eccf kvm: Change prototype of kvm_update_guest_debug()
Passing a CPUState pointer instead of a CPUArchState pointer eliminates
the last target dependent data type in sysemu/kvm.h.

It also simplifies the code.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-07-26 23:05:31 +02:00
Anthony Liguori
874ec3c5b3 Merge remote-tracking branch 'riku/linux-user-for-upstream' into staging
* riku/linux-user-for-upstream: (21 commits)
  linux-user: Handle compressed ISA encodings when processing MIPS exceptions
  linux-user: Unlock mmap_lock when resuming guest from page_unprotect
  linux-user: Reset copied CPUs in cpu_copy() always
  linux-user: Fix epoll on ARM hosts
  linux-user: fix segmentation fault passing with h2g(x) != x
  linux-user: Fix pipe syscall return for SPARC
  linux-user: Fix target_stat and target_stat64 for OpenRISC
  linux-user: Avoid conditional cpu_reset()
  configure: Make NPTL non-optional
  linux-user: Enable NPTL for x86-64
  linux-user: Add i386 TLS setter
  linux-user: Clean up handling of clone() argument order
  linux-user: Add missing 'break' in i386 get_thread_area syscall
  linux-user: Enable NPTL for m68k
  linux-user: Enable NPTL for SPARC targets
  linux-user: Enable NPTL for OpenRISC
  linux-user: Move includes of target-specific headers to end of qemu.h
  configure: Enable threading for unicore32-linux-user
  configure: Enable threading on all ppc and mips linux-user targets
  configure: Don't say target_nptl="no" if there is no linux-user target
  ...

Conflicts:
	linux-user/main.c

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-07-25 15:56:06 -05:00
Alexander Graf
b24c882b94 linux-user: Reset copied CPUs in cpu_copy() always
When a new thread gets created, we need to reset non arch specific state to
get the new CPU into clean state.

However this reset should happen before the arch specific CPU contents get
copied over. Otherwise we end up having clean reset state in our newly created
thread.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2013-07-23 17:28:28 +03:00
Andreas Färber
f17ec444c3 exec: Change cpu_memory_rw_debug() argument to CPUState
Propagate X86CPU in kvmvapic for simplicity.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-07-23 02:41:33 +02:00
Andreas Färber
00b941e581 cpu: Turn cpu_get_phys_page_debug() into a CPUClass hook
Change breakpoint_invalidate() argument to CPUState alongside.

Since all targets now assign a softmmu-only field, we can drop helpers
cpu_class_set_{do_unassigned_access,vmsd}() and device_class_set_vmsd().

Prepares for changing cpu_memory_rw_debug() argument to CPUState.

Acked-by: Max Filippov <jcmvbkbc@gmail.com> (for xtensa)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-07-23 02:41:33 +02:00
Andreas Färber
3825b28ff1 cpu: Change cpu_single_step() argument to CPUState
Use CPUState::env_ptr for now.

Needed for GdbState::c_cpu.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-07-23 02:41:32 +02:00
Andreas Färber
ed2803da58 cpu: Move singlestep_enabled field from CPU_COMMON to CPUState
Prepares for changing cpu_single_step() argument to CPUState.

Acked-by: Michael Walle <michael@walle.cc> (for lm32)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-07-23 02:41:32 +02:00
Paolo Bonzini
e1622f4b15 exec: fix incorrect assumptions in memory_access_size
access_size_min can be 1 because erroneous accesses must not crash
QEMU, they should trigger exceptions in the guest or just return
garbage (depending on the CPU).  I am not sure I understand the
comment: placing a 4-byte field at the last byte of a region
makes no sense (unless impl.unaligned is true), and that is
why memory.c:access_with_adjusted_size does not bother with
minimums larger than the remaining length.

access_size_max can be mr->ops->valid.max_access_size because memory.c
can and will still break accesses bigger than
mr->ops->impl.max_access_size.

Reported-by: Markus Armbruster <armbru@redhat.com>
Tested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-18 06:03:25 +02:00
Peter Maydell
cb85f7ab04 exec.c: Pass correct pointer type to qemu_ram_ptr_length
Commit e3127ae0 introduced a problem where we're passing a
hwaddr* to qemu_ram_ptr_length() but it wants a ram_addr_t*;
this will cause problems on 32 bit hosts and in any case
provokes a clang warning on MacOSX:

  CC    arm-softmmu/exec.o
exec.c:2164:46: warning: incompatible pointer types passing 'hwaddr *'
(aka 'unsigned long long *') to parameter of type 'ram_addr_t *'
(aka 'unsigned long *')
[-Wincompatible-pointer-types]
    return qemu_ram_ptr_length(raddr + base, plen);
                                             ^~~~
exec.c:1392:63: note: passing argument to parameter 'size' here
static void *qemu_ram_ptr_length(ram_addr_t addr, ram_addr_t *size)
                                                              ^

Since this function is only used in one place, change its
prototype to pass a hwaddr* rather than a ram_addr_t*,
rather than contorting the calling code to get the type right.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Riku Voipio <riku.voipio@linaro.org>
Tested-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-18 06:03:25 +02:00
Richard Henderson
23326164ae exec: Support 64-bit operations in address_space_rw
Honor the implementation maximum access size, and at least check
the minimum access size.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2013-07-14 13:40:31 -07:00
Anthony Liguori
51455c59dd QOM CPUState refactorings
* Fix for OpenRISCCPU subclasses
 * Fix for gdbstub CPU selection
 * Move linux-user CPU functions into new header
 * CPUState part 10 refactoring: first_cpu, next_cpu, cpu_single_env et al.
 * Fix some targets to consistently inline TCG code generation
 * Centrally log CPU reset
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iQIcBAABAgAGBQJR3VkXAAoJEPou0S0+fgE/KFQP/3eUyCzZ6QmUG3gmrnfYRDMH
 uwMstD1JRUc5kTEC2bMtld8zZKwx2kxMJpe5fizig8GaLka0J5U2wyvwskkX27ag
 7ouNwFdD/dOmvaKfcqHYKbA3CTuIrbnMm7nzrXpLnWXCiMlW1XmXttQsb3hoAjjt
 asFxQIHONNIgqpcJBrz/C6XX2bEkLra4s2QlXPE5Bl3QkKTtK9+NYahHtgIk3Y7Y
 fqbAxebNGh9eZ9PKjPExhNBZ17Yi4ciM7UB7yrXFYOfwKSpmmTsJdu/m776b1oAK
 c/zWO0uea+sLsMnibnSD1foeeZJItDQDRid+PjC44zB5kS8pkPcT5+TVB04Zilap
 rhNF2Fox+fe8eIc/2WuY3ZGchVjrD/EPbFFCCRQ/qI3Nb98WfLCDu3pAP1hRdo+p
 P6qCH5JmWYcR+2gp8MHY0NtqcklL8A2HpQTRvX1mUliMJbE+unanT4nmKolOTYrm
 +6jvp72GkmqqaLQDQ0d8ig/GmcI9QeftSFD5Y8p5prPsMkQbOAbOUSBlPgwY+Syl
 QmP8xNNzbj00UF8GvRL/m9O75geis/I+op5E7hJqaO5U1yd+ww5Z1EFvDEkUOeYu
 BclqCg1jTnzBzE/FaRP0NWFAUDR+4Z0tumdRES1cDfaMJr3+pYT7y8tjVZn7PEvn
 Ljq+/pyyiunG3Mbvw2o8
 =lFBU
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'afaerber/tags/qom-cpu-for-anthony' into staging

QOM CPUState refactorings

* Fix for OpenRISCCPU subclasses
* Fix for gdbstub CPU selection
* Move linux-user CPU functions into new header
* CPUState part 10 refactoring: first_cpu, next_cpu, cpu_single_env et al.
* Fix some targets to consistently inline TCG code generation
* Centrally log CPU reset

# gpg: Signature made Wed 10 Jul 2013 07:52:39 AM CDT using RSA key ID 3E7E013F
# gpg: Can't check signature: public key not found

# By Andreas Färber (41) and others
# Via Andreas Färber
* afaerber/tags/qom-cpu-for-anthony: (43 commits)
  cpu: Move reset logging to CPUState
  target-ppc: Change LOG_MMU_STATE() argument to CPUState
  target-i386: Change LOG_PCALL_STATE() argument to CPUState
  log: Change log_cpu_state[_mask]() argument to CPUState
  target-i386: Change do_smm_enter() argument to X86CPU
  target-i386: Change do_interrupt_all() argument to X86CPU
  target-xtensa: Change gen_intermediate_code_internal() arg to XtensaCPU
  target-unicore32: Change gen_intermediate_code_internal() signature
  target-sparc: Change gen_intermediate_code_internal() argument to SPARCCPU
  target-sh4: Change gen_intermediate_code_internal() argument to SuperHCPU
  target-s390x: Change gen_intermediate_code_internal() argument to S390CPU
  target-ppc: Change gen_intermediate_code_internal() argument to PowerPCCPU
  target-mips: Change gen_intermediate_code_internal() argument to MIPSCPU
  target-microblaze: Change gen_intermediate_code_internal() argument types
  target-m68k: Change gen_intermediate_code_internal() argument to M68kCPU
  target-lm32: Change gen_intermediate_code_internal() argument to LM32CPU
  target-i386: Change gen_intermediate_code_internal() argument to X86CPU
  target-cris: Change gen_intermediate_code_internal() argument to CRISCPU
  target-arm: Change gen_intermediate_code_internal() argument to ARMCPU
  target-alpha: Change gen_intermediate_code_internal() argument to AlphaCPU
  ...
2013-07-10 10:54:16 -05:00
Andreas Färber
a0762859ae log: Change log_cpu_state[_mask]() argument to CPUState
Since commit 878096eeb2 (cpu: Turn
cpu_dump_{state,statistics}() into CPUState hooks) CPUArchState is no
longer needed.

Add documentation and make the functions available through qemu/log.h
outside NEED_CPU_H to allow use in qom/cpu.c. Moving them to qom/cpu.h
was not yet possible due to convoluted include paths, so that some
devices grow an implicit and unneeded dependency on qom/cpu.h for now.

Acked-by: Michael Walle <michael@walle.cc> (for lm32)
Reviewed-by: Richard Henderson <rth@twiddle.net>
[AF: Simplified mb_cpu_do_interrupt() and do_interrupt_all() changes]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-07-09 21:33:04 +02:00
Andreas Färber
182735efaf cpu: Make first_cpu and next_cpu CPUState
Move next_cpu from CPU_COMMON to CPUState.
Move first_cpu variable to qom/cpu.h.

gdbstub needs to use CPUState::env_ptr for now.
cpu_copy() no longer needs to save and restore cpu_next.

Acked-by: Paolo Bonzini <pbonzini@redhat.com>
[AF: Rebased, simplified cpu_copy()]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-07-09 21:32:54 +02:00
Andreas Färber
4917cf4432 cpu: Replace cpu_single_env with CPUState current_cpu
Move it to qom/cpu.h.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-07-09 21:20:28 +02:00
Markus Armbruster
2ff3de685a Simplify -machine option queries with qemu_get_machine_opts()
The previous two commits fixed bugs in -machine option queries.  I
can't find fault with the remaining queries, but let's use
qemu_get_machine_opts() everywhere, for consistency, simplicity and
robustness.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1372943363-24081-7-git-send-email-armbru@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-07-09 13:38:58 -05:00
Stefan Weil
154bb106dc exec: Remove unused global variable phys_ram_fd
It seems to be unused since several years (commit
be995c2764 in 2006).

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Message-id: 1373044036-14443-1-git-send-email-sw@weilnetz.de
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-07-09 13:38:56 -05:00
Paolo Bonzini
c7086b4a23 exec: change some APIs to take AddressSpaceDispatch
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-04 17:42:50 +02:00
Paolo Bonzini
6092666ebd exec: remove cur_map
cur_map is not used anymore; instead, each AddressSpaceDispatch
has its own nodes/sections pair.  The priorities of the
MemoryListeners, and in the future RCU, guarantee that the
nodes/sections are not freed while they are still in use.

(In fact, next_map itself is not needed except to free the data on the
next update).

To avoid incorrect use, replace cur_map with a temporary copy that
is only valid while the topology is being updated.  If you use it,
the name prev_map makes it clear that you're doing something weird.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-04 17:42:50 +02:00
Paolo Bonzini
0475d94fff exec: put memory map in AddressSpaceDispatch
After this patch, AddressSpaceDispatch holds a constistent tuple of
(phys_map, nodes, sections).  This will be important when updates
of the topology will run concurrently with reads.

cur_map is not used anymore except for freeing it at the end of the
topology update.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-04 17:42:49 +02:00
Paolo Bonzini
0075270317 exec: separate current radix tree from the one being built
This same treatment previously done to phys_node_map and phys_sections
is now applied to the dispatch field of AddressSpace.  Topology updates
use as->next_dispatch while accesses use as->dispatch.

Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-04 17:42:49 +02:00
Paolo Bonzini
89ae337acb exec: move listener from AddressSpaceDispatch to AddressSpace
This will help having two copies of AddressSpaceDispatch during the
recreation of the radix tree (one being built, and one that is complete
and will be protected by RCU).  We do not want to have to unregister and
re-register the listener.

Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-04 17:42:49 +02:00
Paolo Bonzini
9affd6fc0e exec: separate current memory map from the one being built
Currently, phys_node_map and phys_sections are shared by all
of the AddressSpaceDispatch.  When updating mem topology, all
AddressSpaceDispatch will rebuild dispatch tables sequentially
on them.  In order to prepare for RCU access, leave the old
memory map alive while the next one is being accessed.

When rebuilding, the new dispatch tables will build and lookup
next_map; after all dispatch tables are rebuilt, we can switch
to next_* and free the previous table.

Based on a patch from Liu Ping Fan.

Signed-off-by: Liu Ping Fan <qemulist@gmail.com>
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-04 17:42:49 +02:00
Liu Ping Fan
b41aac4f0d exec: change well-known physical sections to macros
Sections like phys_section_unassigned always have fixed address
in phys_sections.  Declared as macro, so we can use them
when having more than one phys_sections array.

Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
Signed-off-by: Liu Ping Fan <qemulist@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-04 17:42:49 +02:00
Paolo Bonzini
d3e71559a8 memory: ref/unref memory across address_space_map/unmap
The iothread mutex might be released between map and unmap, so the
mapped region might disappear.

Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-04 17:42:46 +02:00
Paolo Bonzini
e3127ae0cd exec: reorganize address_space_map
First of all, rename "todo" to "done".

Second, clearly separate the case of done == 0 with the case of done != 0.
This will help handling reference counting in the next patch.

Third, this test:

             if (memory_region_get_ram_addr(mr) + xlat != raddr + todo) {

does not guarantee that the memory region is the same across two iterations
of the while loop.  For example, you could have two blocks:

A) size 640 K, mapped at physical address 0, ram_addr_t 0
B) size 64 K, mapped at physical address 0xa0000, ram_addr_t 0xa0000

then mapping 1 M starting at physical address zero will erroneously treat
B as the continuation of block A.  qemu_ram_ptr_length ensures that no
invalid memory is accessed, but it is still a pointless complication of
the algorithm.  The patch makes the logic clearer with an explicit test
that the memory region is the same.

Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-04 17:42:46 +02:00
Paolo Bonzini
1b5ec23467 memory: return MemoryRegion from qemu_ram_addr_from_host
It will be needed in the next patch.

Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-04 17:42:46 +02:00
Paolo Bonzini
7443b43758 exec: move qemu_ram_addr_from_host_nofail to cputlb.c
After the next patch it would not be used elsewhere anyway.  Also,
the _nofail and the standard versions of this function return different
things, which is confusing.  Removing the function from the public headers
limits the confusion.

Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-04 17:42:45 +02:00
Paolo Bonzini
23887b79df exec: check MRU in qemu_ram_addr_from_host
This function is not used outside the iothread mutex, so it
can use ram_list.mru_block.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-04 17:42:45 +02:00
Paolo Bonzini
dfde4e6e1a memory: add ref/unref calls
Add ref/unref calls at the following places:

- places where memory regions are stashed by a listener and
  used outside the BQL (including in Xen or KVM).

- memory_region_find callsites

- creation of aliases and containers (only the aliased/contained
  region gets a reference to avoid loops)

- around calls to del_subregion/add_subregion, where the region
  could disappear after the first call

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-04 17:42:45 +02:00
Paolo Bonzini
b7e95164d1 exec: simplify destruction of the phys map
Do not bother visiting the radix tree when an address space is destroyed.
After the previous patch, this has become a pointless exercise.  When
called from address_space_destroy_dispatch, all you're doing is zeroing
out a structure that will be freed as soon as you come back.  When called
from mem_begin, when phys_page_set_level will call phys_map_node_alloc the
radix tree's array will be zeroed too.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-04 17:42:45 +02:00
Paolo Bonzini
058bc4b57f memory: destroy phys_sections one by one
phys_sections_clear is invoked after the dispatch tree has been
destroyed.  This leaves a window where phys_sections_nb > 0 but the
subpages are not valid anymore, which is a recipe for use-after-free
bugs.

Move the destruction of subpages in phys_sections_clear.  We will
still destroy the subpages when an address space is cleaned up,
because address_space_destroy will clear as->root and commit the
change before it calls address_space_destroy_dispatch.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-04 17:42:44 +02:00
Paolo Bonzini
2c9b15cab1 memory: add owner argument to initialization functions
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-04 17:42:44 +02:00
Jan Kiszka
b40acf99be ioport: Switch dispatching to memory core layer
The current ioport dispatcher is a complex beast, mostly due to the
need to deal with old portio interface users. But we can overcome it
without converting all portio users by embedding the required base
address of a MemoryRegionPortio access into that data structure. That
removes the need to have the additional MemoryRegionIORange structure
in the loop on every access.

To handle old portio memory ops, we simply install dispatching handlers
for portio memory regions when registering them with the memory core.
This removes the need for the old_portio field.

We can drop the additional aliasing of ioport regions and also the
special address space listener. cpu_in and cpu_out now simply call
address_space_read/write. And we can concentrate portio handling in a
single source file.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-04 17:42:44 +02:00
Andreas Färber
878096eeb2 cpu: Turn cpu_dump_{state,statistics}() into CPUState hooks
Make cpustats monitor command available unconditionally.

Prepares for changing kvm_handle_internal_error() and kvm_cpu_exec()
arguments to CPUState.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-06-28 13:25:12 +02:00
Andreas Färber
60a3e17a46 cpu: Change cpu_exit() argument to CPUState
It no longer depends on CPUArchState, so move it to qom/cpu.c.

Prepares for changing GDBState::c_cpu to CPUState.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-06-28 13:25:12 +02:00
Andreas Färber
1a1562f5ea cpu: Introduce VMSTATE_CPU() macro for CPUState
To be used to embed common CPU state into CPU subclasses.

Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-06-28 13:25:11 +02:00
Peter Maydell
ec3f8c9913 linux-user: Fix compilation failure
Fix compilation failures for linux-user targets following recent
migration related commits bd2fa51fcd and 43487c67.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1372362818-4740-1-git-send-email-peter.maydell@linaro.org
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-06-27 15:38:35 -05:00
Michael R. Hines
bd2fa51fcd rdma: introduce qemu_ram_foreach_block()
This is used during RDMA initialization in order to
transmit a description of all the RAM blocks to the
peer for later dynamic chunk registration purposes.

Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Chegu Vinod <chegu_vinod@hp.com>
Tested-by: Chegu Vinod <chegu_vinod@hp.com>
Tested-by: Michael R. Hines <mrhines@us.ibm.com>
Signed-off-by: Michael R. Hines <mrhines@us.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-06-27 02:38:36 +02:00
Alexey Kardashevskiy
7dca8043f3 memory: give name to every AddressSpace
The "info mtree" command in QEMU console prints only "memory" and "I/O"
address spaces while there are actually a lot more other AddressSpace
structs created by PCI and VIO devices. Those devices do not normally
have names and therefore not present in "info mtree" output.

The patch fixes this.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-20 16:39:52 +02:00
Paolo Bonzini
df32fd1c9f dma: eliminate DMAContext
The DMAContext is a simple pointer to an AddressSpace that is now always
already available.  Make everyone hold the address space directly,
and clean up the DMA API to use the AddressSpace directly.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-20 16:39:52 +02:00
Paolo Bonzini
24addbc76d dma: eliminate old-style IOMMU support
The translate function in the DMAContext is now always NULL.
Remove every reference to it.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-20 16:32:47 +02:00
Avi Kivity
3095115744 memory: iommu support
Add a new memory region type that translates addresses it is given,
then forwards them to a target address space.  This is similar to
an alias, except that the mapping is more flexible than a linear
translation and trucation, and also less efficient since the
translation happens at runtime.

The implementation uses an AddressSpace mapping the target region to
avoid hierarchical dispatch all the way to the resolved region; only
iommu regions are looked up dynamically.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Avi Kivity <avi.kivity@gmail.com>
[Modified to put translation in address_space_translate; assume
 IOMMUs are not reachable from TCG. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-20 16:32:47 +02:00
Paolo Bonzini
052e87b073 memory: make section size a 128-bit integer
So far, the size of all regions passed to listeners could fit in 64 bits,
because artificial regions (containers and aliases) are eliminated by
the memory core, leaving only device regions which have reasonable sizes

An IOMMU however cannot be eliminated by the memory core, and may have
an artificial size, hence we may need 65 bits to represent its size.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-20 16:32:47 +02:00
Paolo Bonzini
733d5ef527 exec: reorganize mem_add to match Int128 version
When adding support for 2^64-byte sections, we will have to change
the structure of mem_add to avoid failures in int128_get64.
Reorganize the code now before introducing Int128.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-20 16:32:47 +02:00
Paolo Bonzini
99b9cc0679 Revert "memory: limit sections in the radix tree to the actual address space size"
This reverts commit 86a8623692.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-20 16:32:46 +02:00
Paolo Bonzini
5c8a00ce18 exec: return MemoryRegion from address_space_translate
Only address_space_translate_for_iotlb needs to return the section.
Every caller of address_space_translate now uses only section->mr,
return it directly.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-20 16:32:46 +02:00
Jan Kiszka
acc9d80b26 exec: Implement subpage_read/write via address_space_rw
This will allow to add support for unaligned memory regions: the subpage
container region can activate unaligned support unconditionally because
the read/write handler will now ensure that accesses are split as
required by calling address_space_rw. We can furthermore drop the
special handling of RAM subpages, address_space_rw takes care of this
already.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-20 16:32:46 +02:00
Jan Kiszka
90260c6c09 exec: Resolve subpages in one step except for IOTLB fills
Except for the case of setting the IOTLB entry in TCG mode, we can avoid
the subpage dispatching handlers and do the resolution directly on
address_space_lookup_region. An IOTLB entry describes a full page, not
only the region that the first access to a sub-divided page may return.

This patch therefore introduces a special translation function,
address_space_translate_for_iotlb, that avoids the subpage resolutions.
In contrast, callers of the existing address_space_translate service
will now always receive the terminal memory region section. This will be
important for breaking the BQL and for enabling unaligned memory region.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-20 16:32:46 +02:00
Jan Kiszka
f52cc46742 exec: Allow unaligned address_space_rw
This will be needed for some corner cases with para-virtual I/O ports.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-20 16:32:46 +02:00
Paolo Bonzini
1db8abb102 memory: move private types to exec.c
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-20 16:32:46 +02:00
Jan Kiszka
9f029603ab memory: Introduce address_space_lookup_region
This introduces a wrapper for phys_page_find (before we complicate
address_space_translate with IOMMU translation).  This function will
also encapsulate locking and reference counting when we introduce
BQL-free dispatching.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-20 16:32:46 +02:00
Peter Maydell
3752a03648 exec.c: address_space_translate: handle access to addr 0 of 2^64 sized region
The memory API allows a MemoryRegion's size to be 2^64, as a special
case (otherwise the size always fits in a 64 bit integer). This meant
that attempts to access address zero in a 2^64 sized region would
assert in address_space_translate():

  #3  0x00007ffff3e4d192 in __GI___assert_fail#(assertion=0x555555a43f32
    "!a.hi", file=0x555555a43ef0 "include/qemu/int128.h", line=18,
    function=0x555555a4439f "int128_get64") at assert.c:103
  #4  0x0000555555877642 in int128_get64 (a=...)
    at include/qemu/int128.h:18
  #5  0x00005555558782f2 in address_space_translate (as=0x55555668d140,
   /addr=0, xlat=0x7fffafac9918, plen=0x7fffafac9920, is_write=false)
    at exec.c:221

Fix this by doing the 'min' operation in 128 bit arithmetic
rather than 64 bit arithmetic (we know the result of the 'min'
definitely fits in 64 bits because one of the inputs did).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-06-20 16:32:46 +02:00
Paolo Bonzini
fd8aaa767a memory: add return value to address_space_rw/read/write
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-05-29 16:27:34 +02:00
Paolo Bonzini
791af8c861 memory: propagate errors on I/O dispatch
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-05-29 16:27:32 +02:00
Paolo Bonzini
a649b9168c exec: just use io_mem_read/io_mem_write for 8-byte I/O accesses
The memory API is able to split it in two 4-byte accesses.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-05-29 16:27:29 +02:00
Paolo Bonzini
968a5627c8 memory: correctly handle endian-swapped 64-bit accesses
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-05-29 16:27:26 +02:00
Paolo Bonzini
51644ab70b memory: add address_space_access_valid
The old-style IOMMU lets you check whether an access is valid in a
given DMAContext.  There is no equivalent for AddressSpace in the
memory API, implement it with a lookup of the dispatch tree.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-05-29 16:27:16 +02:00
Paolo Bonzini
c353e4cc08 exec: implement .valid.accepts for subpages
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-05-29 16:27:14 +02:00