Commit ef701d7 screwed up handling of out-of-memory conditions.
Before the commit, we report the error and exit(1), in one place. The
commit lifts the error handling up the call chain some, to three
places. Fine. Except it uses &error_abort in these places, changing
the behavior from exit(1) to abort(), and thus undoing the work of
commit 3922825 "exec: Don't abort when we can't allocate guest
memory".
The previous two commits fixed one of the three places, another one
was fixed in commit 33e0eb5. This commit fixes the third one.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1441983105-26376-5-git-send-email-armbru@redhat.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Just specifying ops = NULL in some cases can be more convenient than having
two functions.
Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 78a379ab1b6b30ab497db7971ad336dad1dbee76.1438758065.git.p.fedin@samsung.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Very often the owner of the aliased region is the same as the owner of the alias
region itself. When this happens, the reference count can never go back to 0 and
the owner is leaked. This is for example breaking hot-unplug of virtio-pci
devices (the device cannot be plugged back again with the same id).
Another common use for alias is to transform the system I/O address space
into an MMIO regions; in this case the aliased region never dies, so there
is no problem. Otherwise the owner is always the same for aliasing
and aliased region.
I checked all calls to memory_region_init_alias introduced after commit
dfde4e6 (memory: add ref/unref calls, 2013-05-06) and they do not need the
reference in order to keep the owner of the aliased region alive.
Reported-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
For a board that has multiple framebuffer devices, both of them
might want to use DIRTY_MEMORY_VGA on the same memory region.
The lack of reference counting in memory_region_set_log makes
this very awkward to implement.
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
memory_region_present() leaks a reference to a MemoryRegion in the
case "mr == container". While fixing it, avoid reference counting
altogether for memory_region_present(), by using RCU only.
The return value could in principle be already invalid immediately
after memory_region_present returns, but presumably the caller knows
that and it's using memory_region_present to probe for devices that
are unpluggable, or something like that. The RCU critical section
is needed anyway, because it protects as->current_map.
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
As memory_region_read/write_accessor will now be run also without BQL held,
we need to move coalesced MMIO flushing earlier in the dispatch process.
Cc: Frederic Konrad <fred.konrad@greensocs.com>
Message-Id: <1434646046-27150-5-git-send-email-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This introduces the memory region property "global_locking". It is true
by default. By setting it to false, a device model can request BQL-free
dispatching of region accesses to its r/w handlers. The actual BQL
break-up will be provided in a separate patch.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Frederic Konrad <fred.konrad@greensocs.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1434646046-27150-4-git-send-email-pbonzini@redhat.com>
mr->terminates alone doesn't guarantee that we are looking at a RAM region.
mr->ram_addr also has to be checked, in order to distinguish RAM and I/O
regions.
So, do the following:
1) add a new define RAM_ADDR_INVALID, and test it in the assertions
instead of mr->terminates
2) IOMMU regions were not setting mr->ram_addr to a bogus value, initialize
it in the instance_init function so that the new assertions would fire
for IOMMU regions as well.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The cpu_physical_memory_reset_dirty() function is sometimes used
together with cpu_physical_memory_get_dirty(). This is not atomic since
two separate accesses to the dirty memory bitmap are made.
Turn cpu_physical_memory_reset_dirty() and
cpu_physical_memory_clear_dirty_range_type() into the atomic
cpu_physical_memory_test_and_clear_dirty().
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <1417519399-3166-6-git-send-email-stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This cuts in half the cost of bitmap operations (which will become more
expensive when made atomic) during migration on non-VRAM regions.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The separate handling of DIRTY_MEMORY_MIGRATION, which does not
call log_start/log_stop callbacks when it changes in a region's
dirty logging mask, has caused several bugs.
One recent example is commit 4cc856f (kvm-all: Sync dirty-bitmap from
kvm before kvm destroy the corresponding dirty_bitmap, 2015-04-02).
Another performance problem is that KVM keeps tracking dirty pages
after a failed live migration, which causes bad performance due to
disallowing huge page mapping.
This patch removes the root cause of the problem by reporting
DIRTY_MEMORY_MIGRATION changes via log_start and log_stop.
Note that we now have to rebuild the FlatView when global dirty
logging is enabled or disabled; this ensures that log_start and
log_stop callbacks are invoked.
This will also be used to make the setting of bitmaps conditional.
In general, this patch lets users of the memory API ignore the
global state of dirty logging if they handle dirty logging
generically per region.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
DIRTY_MEMORY_CODE is only needed for TCG. By adding it directly to
mr->dirty_log_mask, we avoid testing for TCG everywhere a region is
checked for the enabled/disabled state of dirty logging.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When the dirty log mask will also cover other bits than DIRTY_MEMORY_VGA,
some listeners may be interested in the overall zero/non-zero value of
the dirty log mask; others may be interested in the value of single bits.
For this reason, always call log_start/log_stop if bits have respectively
appeared or disappeared, and pass the old and new values of the dirty log
mask so that listeners can distinguish the kinds of change.
For example, KVM checks if dirty logging used to be completely disabled
(in log_start) or is now completely disabled (in log_stop). On the
other hand, Xen has to check manually if DIRTY_MEMORY_VGA changed,
since that is the only bit it cares about.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
For now memory regions only track DIRTY_MEMORY_VGA individually, but
this will change soon. To support this, split memory_region_is_logging
in two functions: one that returns a given bit from dirty_log_mask,
and one that returns the entire mask. memory_region_is_logging gets an
extra parameter so that the compiler flags misuse.
While VGA-specific users (including the Xen listener!) will want to keep
checking that bit, KVM and vhost check for "any bit except migration"
(because migration is handled via the global start/stop listener
callbacks).
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
DIRTY_MEMORY_MIGRATION is triggered by memory_global_dirty_log_start
and memory_global_dirty_log_stop, so it cannot be used with
memory_region_set_log.
Specify this in the documentation and assert it.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
- next part in the thread-safe address_space_* saga: atomic access
to the bounce buffer and the map_clients list, from Fam
- optional support for linking with tcmalloc, also from Fam
- reapplying Peter Crosthwaite's "Respect as_translate_internal
length clamp" after fixing the SPARC fallout.
- build system fix from Wei Liu
- small acpi-build and ioport cleanup by myself
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQEcBAABCAAGBQJVQJd4AAoJEL/70l94x66DYFYH/3ifhqWZsd4dfJri0CGAHI4i
SpPmNeouc8W+F/3lwf6Inrh5NnTgd5QzoUBMQaWVkQKwUiWls8g2mXkT3jo0iDqT
/B40YXnZjNm20MixNaZmk9AsOF6OqPM8EMufau874k5zTlx3tCGAW1QD+I1N7WK7
DfsFsIUD1svo2prn55fSoitMG1TIVPnpcklb4YGJRbAacQYUDhr5KAIhT1quDR2R
93BvToyQmPqRQ4YKqnJLp8HAkL4FaJumfFZVvyh2cZvyaYGN/RVdi2Dw985dJDPX
/z4enE4GCAs4RDw3lZ1RDbiZDqpT2ibFgASg/arX3SxzqHirOGvMdkOjO99r9j4=
=aLjh
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
- miscellaneous cleanups for TCG (Emilio) and NBD (Bogdan)
- next part in the thread-safe address_space_* saga: atomic access
to the bounce buffer and the map_clients list, from Fam
- optional support for linking with tcmalloc, also from Fam
- reapplying Peter Crosthwaite's "Respect as_translate_internal
length clamp" after fixing the SPARC fallout.
- build system fix from Wei Liu
- small acpi-build and ioport cleanup by myself
# gpg: Signature made Wed Apr 29 09:34:00 2015 BST using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini/tags/for-upstream: (22 commits)
nbd/trivial: fix type cast for ioctl
translate-all: use bitmap helpers for PageDesc's bitmap
target-i386: disable LINT0 after reset
Makefile.target: prepend $libs_softmmu to $LIBS
milkymist: do not modify libs-softmmu
configure: Add support for tcmalloc
exec: Respect as_translate_internal length clamp
ioport: reserve the whole range of an I/O port in the AddressSpace
ioport: loosen assertions on emulation of 16-bit ports
ioport: remove wrong comment
ide: there is only one data port
gus: clean up MemoryRegionPortio
sb16: remove useless mixer_write_indexw
sun4m: fix slavio sysctrl and led register sizes
acpi-build: remove dependency from ram_addr.h
memory: add memory_region_ram_resize
dma-helpers: Fix race condition of continue_after_map_failure and dma_aio_cancel
exec: Notify cpu_register_map_client caller if the bounce buffer is available
exec: Protect map_client_list with mutex
linux-user, bsd-user: Remove two calls to cpu_exec_init_all
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Rather than retaining io_mem_read/write as simple wrappers around
the memory_region_dispatch_read/write functions, make the latter
public and change all the callers to use them, since we need to
touch all the callsites anyway to add MemTxAttrs and MemTxResult
support. Delete io_mem_read and io_mem_write entirely.
(All the callers currently pass MEMTXATTRS_UNSPECIFIED
and convert the return value back to bool or ignore it.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Define an API so that devices can register MemoryRegionOps whose read
and write callback functions are passed an arbitrary pointer to some
transaction attributes and can return a success-or-failure status code.
This will allow us to model devices which:
* behave differently for ARM Secure/NonSecure memory accesses
* behave differently for privileged/unprivileged accesses
* may return a transaction failure (causing a guest exception)
for erroneous accesses
This patch defines the new API and plumbs the attributes parameter through
to the memory.c public level functions io_mem_read() and io_mem_write(),
where it is currently dummied out.
The success/failure response indication is also propagated out to
io_mem_read() and io_mem_write(), which retain the old-style
boolean true-for-error return.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
This cleans up the official /machine namespace. In particular
/machine/system[0] and /machine/io[0], as well as entries with
non-sanitized node names such as "/machine/qemu extended regs[0]".
The actual MemoryRegion names remain unchanged.
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Alistair Francis <alistair.francis@xilinx.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
This fixes a use-after-free if do_address_space_destroy is executed
too late.
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Tested-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
address_space_destroy_dispatch is called from an RCU callback and hence
outside the iothread mutex (BQL). However, after address_space_destroy
no new accesses can hit the destroyed AddressSpace so it is not necessary
to observe changes to the memory map. Move the memory_listener_unregister
call earlier, to make it thread-safe again.
Reported-by: Alex Williamson <alex.williamson@redhat.com>
Fixes: 374f2981d1
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Do the entire lookup under RCU, which avoids atomic operations
in flatview_ref and flatview_unref.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Replace the flat_view_mutex with RCU, avoiding futex contention for
dataplane on large systems and many iothreads.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Now that memory_region_destroy can be called from an RCU callback,
checking the BQL-protected global memory_region_transaction_depth
does not make much sense.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add API to allocate resizeable RAM MR.
This looks just like regular RAM generally, but
has a special property that only a portion of it
(used_length) is actually used, and migrated.
This used_length size can change across reboots.
Follow up patches will change used_length for such blocks at migration,
making it easier to extend devices using such RAM (notably ACPI,
but in the future thinkably other ROMs) without breaking migration
compatibility or wasting ROM (guest) memory.
Device is notified on resize, so it can adjust if necessary.
Note: nothing prevents making all RAM resizeable in this way.
However, reviewers felt that only enabling this selectively will
make some class of errors easier to detect.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Add API to change MR size.
Will be used internally for RAM resize.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
introduce memory_region_get_alignment() that returns
underlying memory block alignment or 0 if it's not
relevant/implemented for backend.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The PCI MMIO might be disabled or the device in the reset state.
Make sure we do not dump these memory regions.
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add parameter errp to memory_region_init_rom_device and update all call
sites to propagate the error.
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
[Propagate the error out of realize. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add parameter errp to memory_region_init_ram and update all call sites
to pass in &error_abort.
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add parameter errp to qemu_ram_alloc and qemu_ram_alloc_from_ptr so that
we can handle errors.
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
[Assert ptr != NULL in memory_region_init_ram_ptr. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Obsoleted by automatic object_property_add() arrayification.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
To support name retrieval of MemoryRegions that were created
dynamically (that is, not via memory_region_init and friends). We
cache the name in MemoryRegion's state as
object_get_canonical_path_component mallocs the returned value
so it's not suitable for direct return to callers. Memory already
frees the name field, so this will be garbage collected along with
the MR object.
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Memory changes for QOMification and automatic tracking of MR lifetime.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJT8et9AAoJEBvWZb6bTYbyIJAQAI3AlLSe27xWoUGfQUgWH30z
Rt/pShHz3BJMfQpD79JfTH8u6uBpkQmKtflerNT7FhXN9ULDzNq+b/jRtke8nkuy
ctCt05FhhK00rfWpUoRue4XiCuvbizBU7MK0DI3yCyNdXQyYnFvgnvsJtlqox8Zh
J5HZcBJEmdCiWBxq7UPk0qBitp4PqNoy7jlD/Ex3m7fJN5WK2cyspQIT9zmhehVn
B8Nwp+RitDDbXbwm0r18col5rFr/6Nj6+dW1gr+7sVJDLNsmJEqC2l3Kgk0wbPkG
Uqwbih29me9PC9/L1VLGHY0ApKDQ8JGE0GrYgEg162hbhoxEHkjjoHMhDUfV6Pj8
NkqcjjWl11UUhgkNqrGafayXbBVnOiEglxy8uXCeq14y9Xd/gjK9Fz6MQvRSOjms
PFmaKknhdmpxh0DuZmTix7WBmKim8zOiCE0/vrAPvwx5L+d1bn5xh6yQvtVjBMpU
Sru3Mhdm9bL9dUDBgOM/G6WCxSTVLBlExOblcYkQh03MfabD7bfplcrKYPXt5ull
Y8YLjqkoIfoy5t0ErvtlpdBJjeEz99JXU+wLQ6NYHnzwzTV+oUtSaEph14mAFOcY
XkFKdoPDI9PnyEfvy4193du8z/dSbhu7sWgHWbTCQyrcaNnSaVhlH43NUC+p23YN
8vfEsVLd1X7MFkDBUmWp
=M+/m
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
SCSI changes that enable sending vendor-specific commands via virtio-scsi.
Memory changes for QOMification and automatic tracking of MR lifetime.
# gpg: Signature made Mon 18 Aug 2014 13:03:09 BST using RSA key ID 9B4D86F2
# gpg: Good signature from "Paolo Bonzini <pbonzini@redhat.com>"
# gpg: aka "Paolo Bonzini <bonzini@gnu.org>"
* remotes/bonzini/tags/for-upstream:
mtree: remove write-only field
memory: Use canonical path component as the name
memory: Use memory_region_name for name access
memory: constify memory_region_name
exec: Abstract away ref to memory region names
loader: Abstract away ref to memory region names
tpm_tis: remove instance_finalize callback
memory: remove memory_region_destroy
memory: convert memory_region_destroy to object_unparent
ioport: split deletion and destruction
nic: do not destroy memory regions in cleanup functions
vga: do not dynamically allocate chain4_alias
sysbus: remove unused function sysbus_del_io
qom: object: move unparenting to the child property's release callback
qom: object: delete properties before calling instance_finalize
virtio-scsi: implement parse_cdb
scsi-block, scsi-generic: implement parse_cdb
scsi-block: extract scsi_block_is_passthrough
scsi-bus: introduce parse_cdb in SCSIDeviceClass and SCSIBusInfo
scsi-bus: prepare scsi_req_new for introduction of parse_cdb
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Rather than having the name as separate state. This prepares support
for creating a MemoryRegion dynamically (i.e. without
memory_region_init() and friends) and the MemoryRegion still getting
a usable name.
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Despite being local to memory.c, use the helper function. This prepares
support for fully QOMifiying the name field of MR (which will remove
this state from MR completely).
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It doesn't change the MR and some prospective call sites will have
const MRs at hand.
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The function is empty after the previous patch, so remove it.
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Explicitly call object_unparent in the few places where we
will re-create the memory region. If the memory region is
simply being destroyed as part of device teardown, let QOM
handle it.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We are not 64 bit any more since
08dafab4 memory: use 128-bit integers for sizes and intermediates
but the comment is forgotten to be updated.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
To allow devices to dynamically resize the device. The motivation is
to allow devices with variable size to init their memory_region
without size early and then correctly populate size at realize() time.
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
QOM propertyify the .may-overlap and .priority fields. The setters
will re-add the memory as a subregion if needed (i.e. the values change
when the memory region is already contained).
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
[Remove setters. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Expose the already existing .parent and .addr fields as QOM properties.
.parent (i.e. the field describing the memory region that contains this
one in Memory hierachy) is renamed "container". This is to avoid
confusion with the QOM parent.
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
[Remove setters. Do not unref parent on releasing the property. Clean
up error propagation. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
QOMify memory regions as an Object. The former init() and destroy()
routines become instance_init() and instance_finalize() resp.
memory_region_init() is re-implemented to be:
object_initialize() + set fields
memory_region_destroy() is re-implemented to call unparent().
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
[Add newly-created MR as child, unparent on destruction. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This will be added (after QOMification) as the QOM parent.
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A new "share" property can be used with the "memory-file" backend to
map memory with MAP_SHARED instead of MAP_PRIVATE.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
And allow preallocation of file-based memory even without -mem-prealloc.
Some care is necessary because -mem-prealloc does not allow disabling
preallocation for hostmem-file.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Right now, -mem-path will fall back to RAM-based allocation in some
cases. This should never happen with "-object memory-file", prepare
the code by adding correct error propagation.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
MST: drop \n at end of error messages
Like the previous patch did in exec.c, split memory_region_init_ram and
memory_region_init_ram_from_file, and push mem_path one step further up.
Other RAM regions than system memory will now be backed by regular RAM.
Also, boards that do not use memory_region_allocate_system_memory will
not support -mem-path anymore. This can be changed before the patches
are merged by migrating boards to use the function.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Split the internal interface in exec.c to a separate function, and
push the check on mem_path up to memory_region_init_ram.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
which allows to check if MemoryRegion is already mapped.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
With huge number of PCI devices in the system (for example, 200
virtio-blk-pci), this unconditional call can slow down emulation of
irrelevant PCI operations drastically, such as a BAR update on a device
that has no coalescing region. So avoid it.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
memory_region_set_address is mostly just a function that deletes and
re-adds a memory region. Factor this generic functionality out into a
re-usable function. This prepares support for further QOMification
of MemoryRegion.
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Split off the core looping code that actually adds subregions into
it's own fn. This prepares support for Memory Region qomification
where setting the MR address or parent via QOM will back onto this more
minimal function.
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
[Rename new function. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This if else is not needed. The previous call to memory_region_add
(whether _overlap or not) will always set priority and may_overlap
to desired values. And its not possible to get here without having
called memory_region_add_subregion due to the null guard on parent.
So we can just directly call memory_region_add_subregion_common.
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
memory mappings don't rely on ioeventfds, there is no need
to destroy and rebuild them when manipulating ioeventfds,
otherwise it scarifies performance.
according to testing result, each ioeventfd deleing needs
about 5ms, within which memory mapping rebuilding needs
about 4ms. With many Nics and vmchannel in a VM doing migrating,
there can be many ioeventfds deleting which increasing
downtime remarkably.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Herongguang <herongguang.he@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
At the moment, most AddressSpace objects last as long as the guest system
in practice, but that could well change in future. In addition, for VFIO
we will be introducing some private per-AdressSpace information, which must
be disposed of before the AddressSpace itself is destroyed.
To reduce the chances of subtle bugs in this area, this patch adds
asssertions to ensure that when an AddressSpace is destroyed, there are no
remaining MemoryListeners using that AS as a filter.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Windows XP shows COM2 port as non functional in
"Device Manager" although no COM2 port backing device
is present in QEMU.
This regression is really due to
3bb28b7208b349e7a1b326e3c6ef9efac1d462bf?
memory: Provide separate handling of unassigned io ports accesses
That is caused by the fact that QEMU reports to
OSPM that device is present by setting 5th bit in
PII4XPM.pci_conf[0x67] register when COM2 doesn't
exist.
It happens due to memory_region_present(io_as, 0x2f8)
returning false positive since 0x2f8 address eventually
translates into catchall io_as address space.
Fix memory_region_present(parent, addr) by returning
true only if addr maps into a MemoryRegion within
parent (excluding parent itself), to match its
doc comment.
While at it fix copy/paste error in
memory_region_present() doc comment.
Note: this is a temporary hack: we really need better handling for
unassigned regions, we should avoid fallback regions since they are bad
for performance (breaking radix tree assumption that the data structure
is sparsely populated); for memory we need to fix this to implement PCI
master abort properly, anyway.
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
All the functions that use ram_addr_t should be here.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
We have an end parameter in all the callers, and this make it coherent
with the rest of cpu_physical_memory_* functions, that also take a
length parameter.
Once here, move the start/end calculation to
tlb_reset_dirty_range_all() as we don't need it here anymore.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Document it
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
So remove the flag argument and do it directly. After this change,
there is nothing else using cpu_physical_memory_set_dirty_flags() so
remove it.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
'address_space_get_flatview' gets a reference to a FlatView.
If the flatview lookup fails, the code returns without
"unreferencing" the view.
Cc: qemu-stable@nongnu.org
Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This includes some pretty big changes:
- pci master abort support by Marcel
- pci IRQ API rework by Marcel
- acpi generation support by myself
Everything has gone through several revisions, latest versions have been on
list for a while without any more comments, tested by several
people.
Please pull for 1.7.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.15 (GNU/Linux)
iQEcBAABAgAGBQJSXNO8AAoJECgfDbjSjVRp7VAH/0B73mCOiyVACGx7fazK3SGK
X8TxZWVtG5A77ISqKyrtjLAhK9DCQjEzQTbMNhXHM3Ar6crwo7nJZnQvH2Gh1X2p
34BOQSVc4rtXz5pwDIr48dBLrxeslwXub79chUs+IK1/4RSn3h3nuS3k6JVkmLJN
rcHMj4ljJmi4Hd9vOpmS1jo/a61usi36hhU7CMgcrsXzStZycBBzCozOB3VW8p1X
/iwyf91YjmNPkn9gA3/aViGjszu8jE91dkA0C+ljwvcGbs2yEl3LCWEJfsMvoh5P
2M+k0XXbHwq/P9PFMa/2/lWOo4EO4Oxa+G/6QvovJrteYnktr+E9DqjU8pCT7yI=
=CVfs
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'mst/tags/for_anthony' into staging
pci, pc, acpi fixes, enhancements
This includes some pretty big changes:
- pci master abort support by Marcel
- pci IRQ API rework by Marcel
- acpi generation support by myself
Everything has gone through several revisions, latest versions have been on
list for a while without any more comments, tested by several
people.
Please pull for 1.7.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Tue 15 Oct 2013 07:33:48 AM CEST using RSA key ID D28D5469
# gpg: Can't check signature: public key not found
* mst/tags/for_anthony: (39 commits)
ssdt-proc: update generated file
ssdt: fix PBLK length
i386: ACPI table generation code from seabios
pc: use new api to add builtin tables
acpi: add interface to access user-installed tables
hpet: add API to find it
pvpanic: add API to access io port
ich9: APIs for pc guest info
piix: APIs for pc guest info
acpi/piix: add macros for acpi property names
i386: define pc guest info
loader: allow adding ROMs in done callbacks
i386: add bios linker/loader
loader: use file path size from fw_cfg.h
acpi: ssdt pcihp: updat generated file
acpi: pre-compiled ASL files
acpi: add rules to compile ASL source
i386: add ACPI table files from seabios
q35: expose mmcfg size as a property
q35: use macro for MCFG property name
...
Message-id: 1381818560-18367-1-git-send-email-mst@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
mtree_print_mr() calls int128_get64() in 3 places but only 2 places
handle 2^64 correctly.
This fixes the third call of int128_get64().
Cc: qemu-stable@nongnu.org
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When memory regions overlap, priority can be used to specify
which of them takes priority. By making the priority values signed
rather than unsigned, we make it more convenient to implement
a situation where one "background" region should appear only
where no other region exists: rather than having to explicitly
specify a high priority for all the other regions, we can let them take
the default (zero) priority and specify a negative priority for the
background region.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This reverts commit 9b8c692435.
The commit was wrong: We only return -1 on invalid accesses, not on
valid but unbacked ones. This broke various corner cases.
Cc: qemu-stable@nongnu.org
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
memory.c does not use any kvm specific interfaces,
don't include kvm.h
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This is quite handy to debug softmmu targets.
Reviewed-by: Andreas Faerber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1375016242-32651-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
When combining multiple accesses into a single value, we need to do so
in the device's desired endianness. The target endianness does not have
any influence.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Message-id: 1374501278-31549-28-git-send-email-pbonzini@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
The accessors all use a MemoryRegion opaque value. Avoid going
uselessly through void*.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Message-id: 1374501278-31549-27-git-send-email-pbonzini@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Prepare for next patch, no semantic change.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Message-id: 1374501278-31549-26-git-send-email-pbonzini@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
if mr->size == 0, then
int128_get64(int128_sub(mr->size, int128_make64(1))) => assert(!a.hi)
Also, use int128_one().
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20130719184124.15864.20803.stgit@bling.home
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This restore the behavior prior to b018ddf633 which accidentally changed
the return code to 0. Specifically guests probing for register existence
were affected by this.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Brown paper bag for me. Originally commit 803c0816 came before commit
2c9b15c. When the order was inverted, I left in the NULL initialization
of mr->owner.
Reviewed-by: Hu Tao <hutao@cn.fujitsu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
With this change, a FlatView can be used even after a concurrent
update has replaced it. Because we do not yet have RCU, we use a
mutex to protect the small critical sections that read/write the
as->current_map pointer. Accesses to the FlatView can be done
outside the mutex.
If a MemoryRegion will be used after the FlatView is unref-ed (or after
a MemoryListener callback is returned), a reference has to be added to
that MemoryRegion. memory_region_find already does it for the region
that it returns. The same will be done for address_space_translate
as soon as the dispatch tree is also converted to RCU-style.
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This is the first step towards converting as->current_map to
RCU-style updates, where the FlatView updates run concurrently
with uses of an old FlatView.
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We will soon require accesses to as->current_map to be placed under
a lock (with reference counting so as to keep the critical section
small). To simplify this change, always fetch as->current_map into
a local variable and access it through that variable.
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add ref/unref calls at the following places:
- places where memory regions are stashed by a listener and
used outside the BQL (including in Xen or KVM).
- memory_region_find callsites
- creation of aliases and containers (only the aliased/contained
region gets a reference to avoid loops)
- around calls to del_subregion/add_subregion, where the region
could disappear after the first call
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This new API will avoid having too many memory_region_ref/unref
in paths that currently use memory_region_find.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Whenever memory regions are accessed outside the BQL, they need to be
preserved against hot-unplug. MemoryRegions actually do not have their
own reference count; they piggyback on a QOM object, their "owner".
The owner is set at creation time, and there is a function to retrieve
the owner.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The current ioport dispatcher is a complex beast, mostly due to the
need to deal with old portio interface users. But we can overcome it
without converting all portio users by embedding the required base
address of a MemoryRegionPortio access into that data structure. That
removes the need to have the additional MemoryRegionIORange structure
in the loop on every access.
To handle old portio memory ops, we simply install dispatching handlers
for portio memory regions when registering them with the memory core.
This removes the need for the old_portio field.
We can drop the additional aliasing of ioport regions and also the
special address space listener. cpu_in and cpu_out now simply call
address_space_read/write. And we can concentrate portio handling in a
single source file.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use it for all targets, but be careful not to pass invalid CPUState.
cpu_single_env can be NULL, e.g. on Xen.
Signed-off-by: Andreas Färber <afaerber@suse.de>
These 4 replicated lines set properties of fr that are constant over
the course of the function. Factor out their repeated setting (and also
guards against them being set multiple times in the loop below).
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These comments were a little difficult to read. First one had
incorrect parenthesis. The part about attributes changing is
really applicable to the region being 'in both' rather than 'in
new'
Second comment has an obscure parenthetic about 'Logging may have
changed'. Made clearer, as this if is supposed to handle the case where
the memory region is unchanged (with the notable exception re logging).
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The "info mtree" command in QEMU console prints only "memory" and "I/O"
address spaces while there are actually a lot more other AddressSpace
structs created by PCI and VIO devices. Those devices do not normally
have names and therefore not present in "info mtree" output.
The patch fixes this.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch adds a NotifierList to MemoryRegions which represent IOMMUs
allowing other parts of the code to register interest in mappings or
unmappings from the IOMMU. All IOMMU implementations will need to call
memory_region_notify_iommu() to inform those waiting on the notifier list,
whenever an IOMMU mapping is made or removed.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add a new memory region type that translates addresses it is given,
then forwards them to a target address space. This is similar to
an alias, except that the mapping is more flexible than a linear
translation and trucation, and also less efficient since the
translation happens at runtime.
The implementation uses an AddressSpace mapping the target region to
avoid hierarchical dispatch all the way to the resolved region; only
iommu regions are looked up dynamically.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Avi Kivity <avi.kivity@gmail.com>
[Modified to put translation in address_space_translate; assume
IOMMUs are not reachable from TCG. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
So far, the size of all regions passed to listeners could fit in 64 bits,
because artificial regions (containers and aliases) are eliminated by
the memory core, leaving only device regions which have reasonable sizes
An IOMMU however cannot be eliminated by the memory core, and may have
an artificial size, hence we may need 65 bits to represent its size.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This will be used to split 8-byte access down to two four-byte accesses.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The memory API is able to use smaller/wider accesses than requested,
match that in memory_region_access_valid. Of course, the accepts
callback is still free to reject those accesses.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We'll use it to implement address_space_access_valid.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This allows to remove the checks on section->readonly. Simply,
write accesses to ROM will not be considered "direct" and will
go through mr->ops without any special intervention.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This provides the basics for detecting accesses to unassigned memory
as soon as they happen, and also for a simple implementation of
address_space_access_valid.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Even a new address space might have a non-empty FlatView. In order
to initialize it properly, address_space_init should (a) call
memory_region_transaction_commit after the address space is inserted
into the list; (b) force memory_region_transaction_commit to do something.
This bug was latent so far because all address spaces started empty, including
the PCI address space where the bus master region is initially disabled.
However, the target address space of an IOMMU is usually rooted at
get_system_memory(), which might not be empty at the time the IOMMU is created.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A couple of fields were left uninitialized. This was not observed earlier
because all address spaces were statically allocated. Also free allocation
for those fields.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Avi Kivity <avi.kivity@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Since this is a MemoryListener operation, it only makes sense
on an AddressSpace granularity.
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
"Readable" is a very unfortunate name for this flag because even a
rom_device region will always be readable from the guest POV. What
differs is the mapping, just like the comments had to explain already.
Also, readable could currently be understood as being a generic region
flag, but it only applies to rom_device regions.
So rename the flag and the function to modify it after the original term
"ROMD" which could also be interpreted as "ROM direct", i.e. ROM mode
with direct access. In any case, the scope of the flag is clearer now.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
memory_region_find() is similar to registering a MemoryListener and
checking for the MemoryRegionSections that come from a particular
region. There is no reason for this to be limited to a root memory
region.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A memory size of zero is invalid, and so that edge condition
does not occur.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
We had two copies of a ffs function for longs with subtly different
semantics and, for the one in bitops.h, a confusing name: the result
was off-by-one compared to the library function ffsl.
Unify the functions into one, and solve the name problem by calling
the 0-based functions "bitops_ctzl" and "bitops_ctol" respectively.
This also fixes the build on platforms with ffsl, including Mac OS X
and Windows.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Andreas Färber <afaerber@suse.de>
Tested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Cirrus is triggering this, e.g. during Win2k boot: Changes only on
disabled regions require no topology update when transaction depth drops
to 0 again.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
The memory core drops regions that are hidden by another region (for example,
during BAR sizing), but it doesn't do so correctly if the lower address of the
existing range is below the lower address of the new range.
Example (qemu-system-mips -M malta -kernel vmlinux-2.6.32-5-4kc-malta
-append "console=ttyS0" -nographic -vga cirrus):
Existing range: 10000000-107fffff
New range: 100a0000-100bffff
Correct behaviour: drop new range
Incorrect behaviour: add new range
Fix by taking this case into account (previously we only considered
equal lower boundaries).
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
target_phys_addr_t is unwieldly, violates the C standard (_t suffixes are
reserved) and its purpose doesn't match the name (most target_phys_addr_t
addresses are not target specific). Replace it with a finger-friendly,
standards conformant hwaddr.
Outstanding patchsets can be fixed up with the command
git rebase -i --exec 'find -name "*.[ch]"
| xargs s/target_phys_addr_t/hwaddr/g' origin
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
* qemu-kvm/memory/urgent:
memory: abort if a memory region is destroyed during a transaction
i440fx: avoid destroying memory regions within a transaction
memory: Make eventfd adhere to device endianness
Currently we use a global radix tree to dispatch memory access. This only
works with a single address space; to support multiple address spaces we
make the radix tree a member of AddressSpace (via an intermediate structure
AddressSpaceDispatch to avoid exposing too many internals).
A side effect is that address_space_io also gains a dispatch table. When
we remove all the pre-memory-API I/O registrations, we can use that for
dispatching I/O and get rid of the original I/O dispatch.
Signed-off-by: Avi Kivity <avi@redhat.com>
Using the AddressSpace type reduces confusion, as you can't accidentally
supply the MemoryRegion you're interested in.
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
With this change, memory.c no longer knows anything about special address
spaces, so it is prepared for AddressSpace based DMA.
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Instead of calling a global function on coalesced mmio changes, which
routes the call to kvm if enabled, add coalesced mmio hooks to
MemoryListener and make kvm use that instead.
The motivation is support for multiple address spaces (which means we
we need to filter the call on the right address space) but the result
is cleaner as well.
Signed-off-by: Avi Kivity <avi@redhat.com>
Destroying a memory region is illegal within a transaction, as until
the transaction is committed, the memory core may hold references to
the region. Add an assert to check for violations of this rule.
Signed-off-by: Avi Kivity <avi@redhat.com>
Our memory API MMIO regions know the concept of device endianness. This
is used to automatically swap endianness between devices and host CPU,
depending on whether buses in between would swizzle the bits.
The ioeventfd value comparison does not adhere to that semantic though.
Probably because nobody has been running ioeventfd on a BE platform and
the only device implementing ioeventfd right now is LE (PCI) based.
So add swizzling to ioeventfd registration / deletion to make the rest
of the code as consistent as possible.
Thanks a lot to Michael Tsirkin to point me towards the right direction.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
Many listeners don't need to respond to all MemoryListener callbacks;
provide suitable no-op defaults instead.
Signed-off-by: Avi Kivity <avi@redhat.com>
Instead of embedding knowledge of the memory and I/O address spaces in the
memory core, maintain a list of all address spaces. This list will later
be extended dynamically for other bus masters.
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
The DMA API will use an AddressSpace to differentiate among different
initiators.
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
AddressSpace contains a member, current_map, of type FlatView. Since we
want to limit the leakage of internal types to public headers, switch to
a pointer to a FlatView. There is no performance impact as this isn't used
during lookups, only address space reconfigurations.
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
exec-obsolete.h used to hold pre-memory-API functions that were used from
device code prior to the transition to the memory API. Now that the
transition is complete, the name no longer describes the file. The
functions still need to be merged better into the memory core, but there's
no danger of anyone using them.
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Flush pending coalesced MMIO before performing mapping or state changes
that could affect the event orderings or route the buffered requests to
a wrong region.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Simplify the code as we are using now only a subset of the original
features of memory_region_update_topology.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Wrap also simple operations consisting only of a single step with
memory_region_transaction_begin/commit. This allows to perform
additional steps like coalesced MMIO flushing from a single place.
This requires dropping some micro-optimizations: The skipping of
topology updates after updating disabled or unregistered regions.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Instead of flushing pending coalesced MMIO requests on every vmexit,
this provides a mechanism to selectively flush when memory regions
related to the coalesced one are accessed. This first of all includes
the coalesced region itself but can also applied to other regions, e.g.
of the same device, by calling memory_region_set_flush_coalesced.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
The last argument of find_portio is "write", so this must be true here.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Under Win32, EventNotifiers will not have event_notifier_get_fd, so we
cannot call it in common code such as hw/virtio-pci.c. Pass a pointer to
the notifier, and only retrieve the file descriptor in kvm-specific code.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
This patch resolves a bug in memory listener registration.
"range_add" callback was called on each section of the both
address space (IO and memory space) even if it doesn't match
the address space filter.
Signed-off-by: Julien Grall <julien.grall@citrix.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
The return value of cpu_register_io_memory() is no longer used anywhere, so
we can remove it and all associated data and code.
Signed-off-by: Avi Kivity <avi@redhat.com>
Instead of indirecting via io_mem_region, dispatch directly
through the MemoryRegion obtained from the iotlb or phys_page_find().
Signed-off-by: Avi Kivity <avi@redhat.com>
Commit e58ac72b6a0 ("ioport: change portio_list not to use
memory_region_set_offset()") started using aliases of I/O memory
regions. Since the IORange used for the I/O was contained in the
target region, the alias information (specifically, the offset
into the region) was lost. This broke -vga std.
Fix by allocating an independent object to hold the IORange and
also the new offset.
Note that I/O memory regions were conceptually broken wrt aliases
in a different way: an alias can cause the same region to appear
twice in an address space, but we had just one IORange to service it.
This patch fixes that problem as well, since we can now have multiple
IORange/MemoryRegion associations.
Signed-off-by: Avi Kivity <avi@redhat.com>
Current memory listeners are incremental; that is, they are expected to
maintain their own state, and receive callbacks for changes to that state.
This patch adds support for stateless listeners; these work by receiving
a ->begin() callback (which tells them that new state is coming), a
sequence of ->region_add() and ->region_nop() callbacks, and then a
->commit() callback which signifies the end of the new state. They should
ignore ->region_del() callbacks.
Signed-off-by: Avi Kivity <avi@redhat.com>
All functionality has been moved to various MemoryListeners.
Signed-off-by: Avi Kivity <avi@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
This transforms memory.c into a library which can then be unit tested
easily, by feeding it inputs and listening to its outputs.
Signed-off-by: Avi Kivity <avi@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
It can be derived from the MemoryRegion itself (which is why it is not
used there).
Signed-off-by: Avi Kivity <avi@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
.readonly cannot be obtained from the MemoryRegion, since it is
inherited from aliases (so you can have a MemoryRegion mapped RW
at one address and RO at another). Record it in a MemoryRegionSection
for listeners.
Signed-off-by: Avi Kivity <avi@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
This allows reverse iteration, which in turns allows consistent ordering
among multiple listeners:
l1->add
l2->add
l2->del
l1->del
Signed-off-by: Avi Kivity <avi@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
memory_region_set_offset() complicates the API, and has been deprecated
since its introduction. Now that it is no longer used, remove it.
Signed-off-by: Avi Kivity <avi@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Helpful to understand guest configurations of things like the i440FX's
PAM or the state of ROM devices.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Instead of each device knowing or guessing the guest page size,
just pass the desired size of dirtied memory area.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Instead of each target knowing or guessing the guest page size,
just pass the desired size of dirtied memory area.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Introduce a memory region type that can reserve I/O space. Such regions
are useful for modeling I/O that is only handled outside of QEMU, i.e.
in the context of an accelerator like KVM.
Any access to such a region from QEMU is a bug, but could theoretically
be triggered by guest code (DMA to reserved region). So only warning
about such events once, then ignore them.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
All files under GPLv2 will get GPLv2+ changes starting tomorrow.
event_notifier.c and exec-obsolete.h were only ever touched by Red Hat
employees and can be relicensed now.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Commit a621f38de8 (Direct dispatch
through MemoryRegion) moved byte swaps to a central function.
Add a missing break, so that long-sized byte swaps don't abort.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Avi Kivity <avi@redhat.com>
Since commit be675c9720 (memory: move
endianness compensation to memory core) it was checking for
TARGET_BIG_ENDIAN instead of TARGET_WORDS_BIGENDIAN, thereby not
swapping correctly for Big Endian targets.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Avi Kivity <avi@redhat.com>
Unlike ->readonly, ->readable is not inherited from aliase, so we can simply
query the memory region.
Signed-off-by: Avi Kivity <avi@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Now that all mmio goes through MemoryRegions, we can convert
io_mem_opaque to be a MemoryRegion pointer, and remove the thunks
that convert from old-style CPU{Read,Write}MemoryFunc to MemoryRegionOps.
Signed-off-by: Avi Kivity <avi@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Convert the fixed-address IO_MEM_RAM, IO_MEM_ROM, IO_MEM_UNASSIGNED,
and IO_MEM_NOTDIRTY io handlers to MemoryRegions. These aren't real
regions, since they are never added to the memory hierarchy, but they
allow reuse of the dispatch functionality.
Signed-off-by: Avi Kivity <avi@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
The code sometimes uses range comparisons on io indexes (e.g.
index =< IO_MEM_ROM). Avoid these as they make moving to objects harder.
Signed-off-by: Avi Kivity <avi@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
backend_registered was used to lazify the process of registering an
mmio region, since the it is different for the I/O address space and
the memory address space. However, it also makes registration dependent
on the region being visible in the address space. This is not the case
for "fake" regions, like watchpoints or IO_MEM_UNASSIGNED.
Remove backend_registered and always initialize the region. If it turns
out to be part of the I/O address space, we've wasted an I/O slot, but
that's not too bad. In any case this will be optimized later on.
Signed-off-by: Avi Kivity <avi@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Currently mmio access goes directly to the io_mem_{read,write} arrays.
In preparation for eliminating them, add indirection via a function.
Signed-off-by: Avi Kivity <avi@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Instead of doing device endianness compensation in cpu_register_io_memory(),
do it in the memory core.
Signed-off-by: Avi Kivity <avi@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
The getter is no longer used, so it is completely removed.
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Currently creating a memory region automatically registers it for
live migration. This differs from other state (which is enumerated
in a VMStateDescription structure) and ties the live migration code
into the memory core.
Decouple the two by introducing a separate API, vmstate_register_ram(),
for registering a RAM block for migration. Currently the same
implementation is reused, but later it can be moved into a separate list,
and registrations can be moved to VMStateDescription blocks.
Signed-off-by: Avi Kivity <avi@redhat.com>
This is a layering violation, but needed while the code contains
naked calls to qemu_get_ram_ptr() and the like.
Signed-off-by: Avi Kivity <avi@redhat.com>
Add an API that allows a client to observe changes in the global
memory map:
- region added (possibly with logging enabled)
- region removed (possibly with logging enabled)
- logging started on a region
- logging stopped on a region
- global logging started
- global logging removed
This API will eventually replace cpu_register_physical_memory_client().
Signed-off-by: Avi Kivity <avi@redhat.com>
Given an address space (represented by the top-level memory region),
returns the memory region that maps a given range. Useful for implementing
DMA.
The implementation is a simplistic binary search. Once we have a tree
representation this can be optimized.
Signed-off-by: Avi Kivity <avi@redhat.com>
Currently xen_ram_alloc() relies on ram_addr, which is going away.
Give it something else to use as a cookie.
Signed-off-by: Avi Kivity <avi@redhat.com>
The mutating memory APIs can easily cause empty transactions,
where the mutators don't actually change anything, or perhaps
only modify disabled regions. Detect these conditions and
avoid regenerating the memory topology.
Signed-off-by: Avi Kivity <avi@redhat.com>
Add an API to update an alias offset of an active alias. This can be
used to simplify implementation of dynamic memory banks.
Signed-off-by: Avi Kivity <avi@redhat.com>
This allows users to disable a memory region without removing
it from the hierarchy, simplifying the implementation of
memory routers.
Signed-off-by: Avi Kivity <avi@redhat.com>
MemoryRegionOps::valid tries to declaratively specify which transactions
are accepted by the device/bus, however it is not completely generic. Add
a callback for special cases.
Signed-off-by: Avi Kivity <avi@redhat.com>
'info mtree' accesses invalid memory in two cases, both due to incorrect
(and unsafe) usage of QTAILQ_FOREACH_SAFE().
Reported-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>