Remove direct uses of ram_addr_t and optimize memory_region_{get,set}_fd
now that a MemoryRegion knows its RAMBlock directly.
Reviewed-by: Marc-André Lureau <marcandre.lureau@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
On the one hand, we have already qemu_get_ram_block() whose function
is similar. On the other hand, we can directly use mr->ram_block but
searching RAMblock by ram_addr which is a kind of waste.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-Id: <1462845901-89716-2-git-send-email-arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The only caller now knows exactly which RAMBlock to free, so it's not
necessary to do the lookup.
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <1456813104-25902-6-git-send-email-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Previously we return RAMBlock.offset; now return the pointer to the
whole structure.
ram_block_add returns void now, error is completely passed with errp.
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <1456813104-25902-2-git-send-email-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The last two arguments to these functions are the last and first bit to
check relative to the base. The code was using incorrectly the first
bit and the number of bits. Fix this in cpu_physical_memory_get_dirty
and cpu_physical_memory_all_dirty. This requires a few changes in the
iteration; change the code in cpu_physical_memory_set_dirty_range to
match.
Fixes: 5b82b70
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Leon Alrae <leon.alrae@imgtec.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Message-id: 1455113505-11237-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Although accesses to ram_list.dirty_memory[] use atomics so multiple
threads can safely dirty the bitmap, the data structure is not fully
thread-safe yet.
This patch handles the RAM hotplug case where ram_list.dirty_memory[] is
grown. ram_list.dirty_memory[] is change from a regular bitmap to an
RCU array of pointers to fixed-size bitmap blocks. Threads can continue
accessing bitmap blocks while the array is being extended. See the
comments in the code for an in-depth explanation of struct
DirtyMemoryBlocks.
I have tested that live migration with virtio-blk dataplane works.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <1453728801-5398-2-git-send-email-stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This condition is true in the common case, so we can cut out the body of
the function. In addition, this makes it easier for the compiler to do
at least partial inlining, even if it decides that fully inlining the
function is unreasonable.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Split host_from_stream_offset() into two parts:
One is to get ram block, which the block idstr may be get from migration
stream, the other is to get hva (host) address from block and the offset.
Besides, we will do the check working in a new helper offset_in_ramblock().
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Message-Id: <1452829066-9764-2-git-send-email-zhang.zhanghailiang@huawei.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
If virtio-net driver allocates memory in ivshmem shared memory,
vhost-net will work correctly, but vhost-user will not work because
a fd of shared memory will not be sent to vhost-user backend.
This patch fixes ivshmem to store file descriptor of shared memory.
It will be used when vhost-user negotiates vhost-user backend.
Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
memcpy can take a large amount of time for small reads and writes.
Handle the common case of reading s/g descriptors from memory (there
is no corresponding "write" case that is as common, because writes
often use address_space_st* functions) by inlining the relevant
parts of address_space_read into the caller.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Replace qemu_ram_free_from_ptr() with qemu_ram_free().
The only difference between qemu_ram_free_from_ptr() and
qemu_ram_free() is that g_free_rcu() is used instead of
call_rcu(reclaim_ramblock). We can safely replace it because:
* RAM blocks allocated by qemu_ram_alloc_from_ptr() always have
RAM_PREALLOC set;
* reclaim_ramblock(block) will do nothing except g_free(block)
if RAM_PREALLOC is set at block->flags.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1446844805-14492-2-git-send-email-ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Postcopy sends RAMBlock names and offsets over the wire (since it can't
rely on the order of ramaddr being the same), and it starts out with
HVA fault addresses from the kernel.
qemu_ram_block_from_host translates a HVA into a RAMBlock, an offset
in the RAMBlock and the global ram_addr_t value.
Rewrite qemu_ram_addr_from_host to use qemu_ram_block_from_host.
Provide qemu_ram_get_idstr since its the actual name text sent on the
wire.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
The header is included from basically everywhere, thanks to cpu.h.
It should be moved to the (TCG only) files that actually need it.
As a start, remove non-TCG stuff.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <1439547914-18249-1-git-send-email-dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The fast path of cpu_physical_memory_sync_dirty_bitmap() directly
manipulates the dirty bitmap. Use atomic_xchg() to make the
test-and-clear atomic.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <1417519399-3166-7-git-send-email-stefanha@redhat.com>
[Only do xchg on nonzero words. - Paolo]
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The cpu_physical_memory_reset_dirty() function is sometimes used
together with cpu_physical_memory_get_dirty(). This is not atomic since
two separate accesses to the dirty memory bitmap are made.
Turn cpu_physical_memory_reset_dirty() and
cpu_physical_memory_clear_dirty_range_type() into the atomic
cpu_physical_memory_test_and_clear_dirty().
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <1417519399-3166-6-git-send-email-stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The dirty memory bitmap is managed by ram_addr.h and copied to
migration_bitmap[] periodically during live migration.
Move the code to sync the bitmap to ram_addr.h where related code lives.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <1417519399-3166-5-git-send-email-stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use set_bit_atomic() and bitmap_set_atomic() so that multiple threads
can dirty memory without race conditions.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <1417519399-3166-4-git-send-email-stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
cpu_physical_memory_set_dirty_lebitmap unconditionally syncs the
DIRTY_MEMORY_CODE bitmap. This however is unused unless TCG is
enabled.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Most of the time, not all bitmaps have to be marked as dirty;
do not do anything if the interesting ones are already dirty.
Previously, any clean bitmap would have cause all the bitmaps to be
marked dirty.
In fact, unless running TCG most of the time bitmap operations need
not be done at all, because memory_region_is_logging returns zero.
In this case, skip the call to cpu_physical_memory_range_includes_clean
altogether as well.
With this patch, cpu_physical_memory_set_dirty_range is called
unconditionally, so there need not be anymore a separate call to
xen_modified_memory.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
While it is obvious that cpu_physical_memory_get_dirty returns true even if
a single page is dirty, the same is not true for cpu_physical_memory_get_clean;
one would expect that it returns true only if all the pages are clean, but
it actually looks for even one clean page. (By contrast, the caller of that
function, cpu_physical_memory_range_includes_clean, has a good name).
To clarify, rename the function to cpu_physical_memory_all_dirty and return
true if _all_ the pages are dirty. This is the opposite of the previous
meaning, because "all are 1" is the same as "not (any is 0)", so we have to
modify cpu_physical_memory_range_includes_clean as well.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This cuts in half the cost of bitmap operations (which will become more
expensive when made atomic) during migration on non-VRAM regions.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Invoke xen_modified_memory from cpu_physical_memory_set_dirty_range_nocode;
it is akin to DIRTY_MEMORY_MIGRATION, so set it together with that bitmap.
The remaining call from invalidate_and_set_dirty's "else" branch will go
away soon.
Second, fix the second argument to the function in the
cpu_physical_memory_set_dirty_lebitmap call site. That function is only used
by KVM, but it is better to be clean anyway.
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add API to allocate "resizeable" RAM.
This looks just like regular RAM generally, but
has a special property that only a portion of it
(used_length) is actually used, and migrated.
This used_length size can change across reboots.
Follow up patches will change used_length for such blocks at migration,
making it easier to extend devices using such RAM (notably ACPI,
but in the future thinkably other ROMs) without breaking migration
compatibility or wasting ROM (guest) memory.
Device is notified on resize, so it can adjust if necessary.
qemu_ram_alloc_resizeable allocates this memory, qemu_ram_resize resizes
it.
Note: nothing prevents making all RAM resizeable in this way.
However, reviewers felt that only enabling this selectively will
make some class of errors easier to detect.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Make cpu_physical_memory_set/clear_dirty_range
behave symmetrically.
To clear range for a given client type only, add
cpu_physical_memory_clear_dirty_range_type.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
The code in invalidate_and_set_dirty() needs to handle addr/length
combinations which cross guest physical page boundaries. This can happen,
for example, when disk I/O reads large blocks into guest RAM which previously
held code that we have cached translations for. Unfortunately we were only
checking the clean/dirty status of the first page in the range, and then
were calling a tb_invalidate function which only handles ranges that don't
cross page boundaries. Fix the function to deal with multipage ranges.
The symptoms of this bug were that guest code would misbehave (eg segfault),
in particular after a guest reboot but potentially any time the guest
reused a page of its physical RAM for new code.
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1416167061-13203-1-git-send-email-peter.maydell@linaro.org
Add parameter errp to qemu_ram_alloc and qemu_ram_alloc_from_ptr so that
we can handle errors.
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
[Assert ptr != NULL in memory_region_init_ram_ptr. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Devices that use address_space_rw to write large areas to memory
(as opposed to address_space_map/unmap) were broken with respect
to migration since fe680d0 (exec: Limit translation limiting in
address_space_translate to xen, 2014-05-07). Such devices include
IDE CD-ROMs.
The reason is that invalidate_and_set_dirty (called by address_space_rw
but not address_space_map/unmap) was only setting the dirty bit for
the first page in the translation.
To fix this, introduce cpu_physical_memory_set_dirty_range_nocode that
is the same as cpu_physical_memory_set_dirty_range except it does not
muck with the DIRTY_MEMORY_CODE bitmap. This function can be used if
the caller invalidates translations with tb_invalidate_phys_page_range.
There is another difference between cpu_physical_memory_set_dirty_range
and cpu_physical_memory_set_dirty_flag; the former includes a call
to xen_modified_memory. This is handled separately in
invalidate_and_set_dirty, and is not needed in other callers of
cpu_physical_memory_set_dirty_range_nocode, so leave it alone.
Just one nit: now that invalidate_and_set_dirty takes care of handling
multiple pages, there is no need for address_space_unmap to wrap it
in a loop. In fact that loop would now be O(n^2).
Reported-by: Dave Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Old code was affected by memory gaps which resulted in buffer pointers
pointing to address outside of the mapped regions.
Here we are introducing following changes:
- new function qemu_get_ram_block_host_ptr() returns host pointer
to the ram block, it is needed to calculate offset of specific
region in the host memory
- new field mmap_offset is added to the VhostUserMemoryRegion. It
contains offset where specific region starts in the mapped memory.
As there is stil no wider adoption of vhost-user agreement was made
that we will not bump version number due to this change
- other fileds in VhostUserMemoryRegion struct are not changed, as
they are all needed for usermode app implementation
- region data is not taken from ram_list.blocks anymore, instead we
use region data which is alredy calculated for use in vhost-net
- Now multiple regions can have same FD and user applicaton can call
mmap() multiple times with the same FD but with different offset
(user needs to take care for offset page alignment)
Signed-off-by: Damjan Marion <damarion@cisco.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Damjan Marion <damarion@cisco.com>
A new "share" property can be used with the "memory-file" backend to
map memory with MAP_SHARED instead of MAP_PRIVATE.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
And allow preallocation of file-based memory even without -mem-prealloc.
Some care is necessary because -mem-prealloc does not allow disabling
preallocation for hostmem-file.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Right now, -mem-path will fall back to RAM-based allocation in some
cases. This should never happen with "-object memory-file", prepare
the code by adding correct error propagation.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
MST: drop \n at end of error messages
Split the internal interface in exec.c to a separate function, and
push the check on mem_path up to memory_region_init_ram.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
See commit fbeadf50 (bitops: unify bitops_ffsl with the one in
host-utils.h, call it bitops_ctzl) on why ctzl should be used instead
of ffsl.
This is also needed for musl libc which does not implement ffsl.
Signed-off-by: Natanael Copa <ncopa@alpinelinux.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The ae2810c4bb patch introduced
optimization for ram_list.dirty_memory update. However it can only
work correctly if hpratio is 1 as the @bitmap parameter stores 1 bits
per system page size (may vary, 4K or 64K on PPC64) and
ram_list.dirty_memory stores 1 bit per TARGET_PAGE_SIZE
(which is hardcoded to 4K).
This fixes hpratio!=1 case to fall back to the slow path.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Juan Quintela <quintela@redhat.com>
cpu_physical_memory_set_dirty_lebitmap calls getpageaddr and ffsl which are
unavailable for MinGW. As the function is unused for MinGW, it can simply
be excluded from compilation.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
If bitmaps are aligned properly, use bitmap operations. If they are
not, just use old bit at a time code.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
We want to have all the functions that handle directly the dirty
bitmap near. We will change it later.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
All the functions that use ram_addr_t should be here.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>