1203 Commits

Author SHA1 Message Date
Wei Yang
10da4a3689 migration: extract ram_load_precopy
After cleanup, it would be clear to audience there are two cases
ram_load:

  * precopy
  * postcopy

And it is not necessary to check postcopy_running on each iteration for
precopy.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

Message-Id: <20190725002023.2335-3-richardw.yang@linux.intel.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-08-14 17:33:14 +01:00
Wei Yang
be4a1a1b6f migration: return -EINVAL directly when version_id mismatch
It is not reasonable to continue when version_id mismatch.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <20190722075339.25121-2-richardw.yang@linux.intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-08-14 17:33:14 +01:00
Wei Yang
4695ce3fdc migration: equation is more proper than and to check LOADVM_QUIT
LOADVM_QUIT allows a command to quit all layers of nested loadvm loops,
while current return value check is not that proper even it works now.

Current return value check "ret & LOADVM_QUIT" would return true if
bit[0] is 1. This would be true when ret is -1 which is used to indicate
an error of handling a command.

Since there is only one place return LOADVM_QUIT and no other
combination of return value, use "ret == LOADVM_QUIT" would be more
proper.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <20190718064257.29218-1-richardw.yang@linux.intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-08-14 17:33:14 +01:00
Wei Yang
5d0980a459 migration: just pass RAMBlock is enough
RAMBlock->used_length is always passed to migration_bitmap_sync_range(),
which could be retrieved from RAMBlock.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <20190718012547.16373-1-richardw.yang@linux.intel.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-08-14 17:33:14 +01:00
Wei Yang
6a88eb2b08 migration: use migration_in_postcopy() to check POSTCOPY_ACTIVE
Use common helper function to check the state.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <20190719071129.11880-1-richardw.yang@linux.intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-08-14 17:33:14 +01:00
Wei Yang
52aec70923 migration/postcopy: start_postcopy could be true only when migrate_postcopy() return true
There is only one place to set start_postcopy to true,
qmp_migrate_start_postcopy(), which make sure start_postcopy could be
set to true when migrate_postcopy() return true.

So start_postcopy is true implies the other one.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <20190718083747.5859-1-richardw.yang@linux.intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-08-14 17:33:14 +01:00
Wei Yang
305b6f8431 migration/postcopy: PostcopyState is already set in loadvm_postcopy_handle_advise()
PostcopyState is already set to ADVISE at the beginning of
loadvm_postcopy_handle_advise().

Remove the redundant set.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <20190711080816.6405-1-richardw.yang@linux.intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-08-14 17:33:14 +01:00
Wei Yang
e326767b45 migration/savevm: move non SaveStateEntry condition check out of iteration
in_postcopy and iterable_only are not SaveStateEntry specific, it would
be more proper to check them out of iteration.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <20190709140924.13291-4-richardw.yang@linux.intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-08-14 17:33:14 +01:00
Wei Yang
622a80c955 migration/savevm: split qemu_savevm_state_complete_precopy() into two parts
This is a preparation patch for further cleanup.

No functional change, just wrap two major part of
qemu_savevm_state_complete_precopy() into function.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <20190709140924.13291-3-richardw.yang@linux.intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-08-14 17:33:14 +01:00
Wei Yang
4e455d51ef migration/savevm: flush file for iterable_only case
It would be proper to flush file even for iterable_only case.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <20190709140924.13291-2-richardw.yang@linux.intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-08-14 17:33:14 +01:00
Wei Yang
8996604fe6 migration/postcopy: do_fixup is true when host_offset is non-zero
This means it is not necessary to spare an extra variable to hold this
condition. Use host_offset directly is fine.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <20190710050814.31344-3-richardw.yang@linux.intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-08-14 17:33:14 +01:00
Wei Yang
e927a03317 migration/postcopy: reduce one operation to calculate fixup_start_addr
Use the same way for run_end to calculate run_start, which saves one
operation.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <20190710050814.31344-2-richardw.yang@linux.intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-08-14 17:33:14 +01:00
Wei Yang
a162b572e9 migration/postcopy: discard_length must not be 0
Since we break the loop when there is no more page to discard, we are
sure the following process would find some page to discard.

It is not necessary to check it again.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <20190627020822.15485-4-richardw.yang@linux.intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-08-14 17:33:14 +01:00
Wei Yang
33a5cb6202 migration/postcopy: break the loop when there is no more page to discard
When one is equal or bigger then end, it means there is no page to
discard. Just break the loop in this case instead of processing it.

No functional change, just refactor it a little.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <20190627020822.15485-3-richardw.yang@linux.intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-08-14 17:33:14 +01:00
Wei Yang
0abfff9ea7 migration/postcopy: the valid condition is one less then end
If one equals end, it means we have gone through the whole bitmap.

Use a more restrict check to skip a unnecessary condition.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <20190627020822.15485-2-richardw.yang@linux.intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-08-14 17:33:14 +01:00
Wei Yang
640dfb14db migration: consolidate time info into populate_time_info
Consolidate time information fill up into its function for better
readability.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <20190716005411.4156-1-richardw.yang@linux.intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-08-14 17:33:14 +01:00
Yury Kotov
3d661c8ab1 migration: Add error_desc for file channel errors
Currently, there is no information about error if outgoing migration was failed
because of file channel errors.
Example (QMP session):
-> { "execute": "migrate", "arguments": { "uri": "exec:head -c 1" }}
<- { "return": {} }
...
-> { "execute": "query-migrate" }
<- { "return": { "status": "failed" }} // There is not error's description

And even in the QEMU's output there is nothing.

This patch
1) Adds errp for the most of QEMUFileOps
2) Adds qemu_file_get_error_obj/qemu_file_set_error_obj
3) And finally using of qemu_file_get_error_obj in migration.c

And now, the status for the mentioned fail will be:
-> { "execute": "query-migrate" }
<- { "return": { "status": "failed",
                 "error-desc": "Unable to write to command: Broken pipe" }}

Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru>
Message-Id: <20190422103420.15686-1-yury-kotov@yandex-team.ru>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-08-14 17:33:14 +01:00
Ivan Ren
f193bc0c53 migration: fix migrate_cancel multifd migration leads destination hung forever
When migrate_cancel a multifd migration, if run sequence like this:

        [source]                              [destination]

multifd_send_sync_main[finish]
                                    multifd_recv_thread wait &p->sem_sync
shutdown to_dst_file
                                    detect error from_src_file
send  RAM_SAVE_FLAG_EOS[fail]       [no chance to run multifd_recv_sync_main]
                                    multifd_load_cleanup
                                    join multifd receive thread forever

will lead destination qemu hung at following stack:

pthread_join
qemu_thread_join
multifd_load_cleanup
process_incoming_migration_co
coroutine_trampoline

Signed-off-by: Ivan Ren <ivanren@tencent.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <1561468699-9819-4-git-send-email-ivanren@tencent.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2019-07-24 14:47:21 +02:00
Juan Quintela
3c3ca25d1f migration: Make explicit that we are quitting multifd
We add a bool to indicate that.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2019-07-24 14:47:12 +02:00
Ivan Ren
a3ec6b7d23 migration: fix migrate_cancel leads live_migration thread hung forever
When we 'migrate_cancel' a multifd migration, live_migration thread may
hung forever at some points, because of multifd_send_thread has already
exit for socket error:
1. multifd_send_pages may hung at qemu_sem_wait(&multifd_send_state->
   channels_ready)
2. multifd_send_sync_main my hung at qemu_sem_wait(&multifd_send_state->
   sem_sync)

Signed-off-by: Ivan Ren <ivanren@tencent.com>
Message-Id: <1561468699-9819-3-git-send-email-ivanren@tencent.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>

---

Remove spurious not needed bits
2019-07-24 14:47:02 +02:00
Ivan Ren
713f762a31 migration: fix migrate_cancel leads live_migration thread endless loop
When we 'migrate_cancel' a multifd migration, live_migration thread may
go into endless loop in multifd_send_pages functions.

Reproduce steps:

(qemu) migrate_set_capability multifd on
(qemu) migrate -d url
(qemu) [wait a while]
(qemu) migrate_cancel

Then may get live_migration 100% cpu usage in following stack:

pthread_mutex_lock
qemu_mutex_lock_impl
multifd_send_pages
multifd_queue_page
ram_save_multifd_page
ram_save_target_page
ram_save_host_page
ram_find_and_save_block
ram_find_and_save_block
ram_save_iterate
qemu_savevm_state_iterate
migration_iteration_run
migration_thread
qemu_thread_start
start_thread
clone

Signed-off-by: Ivan Ren <ivanren@tencent.com>
Message-Id: <1561468699-9819-2-git-send-email-ivanren@tencent.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2019-07-24 14:46:51 +02:00
Ivan Ren
40c4d4a835 migration: always initial RAMBlock.bmap to 1 for new migration
Reproduce the problem:
migrate
migrate_cancel
migrate

Error happen for memory migration

The reason as follows:
1. qemu start, ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION] all set to
   1 by a series of cpu_physical_memory_set_dirty_range
2. migration start:ram_init_bitmaps
   - memory_global_dirty_log_start: begin log diry
   - memory_global_dirty_log_sync: sync dirty bitmap to
     ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION]
   - migration_bitmap_sync_range: sync ram_list.
     dirty_memory[DIRTY_MEMORY_MIGRATION] to RAMBlock.bmap
     and ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION] is set to zero
3. migration data...
4. migrate_cancel, will stop log dirty
5. migration start:ram_init_bitmaps
   - memory_global_dirty_log_start: begin log diry
   - memory_global_dirty_log_sync: sync dirty bitmap to
     ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION]
   - migration_bitmap_sync_range: sync ram_list.
     dirty_memory[DIRTY_MEMORY_MIGRATION] to RAMBlock.bmap
     and ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION] is set to zero

   Here RAMBlock.bmap only have new logged dirty pages, don't contain
   the whole guest pages.

Signed-off-by: Ivan Ren <ivanren@tencent.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <1563115879-2715-1-git-send-email-ivanren@tencent.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2019-07-15 15:47:47 +02:00
Wei Yang
40277ca807 migration/postcopy: remove redundant cpu_synchronize_all_post_init
cpu_synchronize_all_post_init() is called twice in
loadvm_postcopy_handle_run_bh(), so remove one redundant call.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20190715080751.24304-1-richardw.yang@linux.intel.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2019-07-15 15:45:59 +02:00
Wei Yang
89dab31b27 migration/postcopy: fix document of postcopy_send_discard_bm_ram()
Commit 6b6712efccd3 ('ram: Split dirty bitmap by RAMBlock') changes the
parameter of postcopy_send_discard_bm_ram(), while left the document
part untouched.

This patch correct the document and fix two typo by hand.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190715020549.15018-1-richardw.yang@linux.intel.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2019-07-15 15:45:22 +02:00
Peng Tao
b17fbbe55c migration: allow private destination ram with x-ignore-shared
By removing the share ram check, qemu is able to migrate
to private destination ram when x-ignore-shared capability
is on. Then we can create multiple destination VMs based
on the same source VM.

This changes the x-ignore-shared migration capability to
work similar to Lai's original bypass-shared-memory
work(https://lists.gnu.org/archive/html/qemu-devel/2018-04/msg00003.html)
which enables kata containers (https://katacontainers.io)
to implement the VM templating feature.

An example usage in kata containers(https://katacontainers.io):
1. Start the source VM:
   qemu-system-x86 -m 2G \
     -object memory-backend-file,id=mem0,size=2G,share=on,mem-path=/tmpfs/template-memory \
     -numa node,memdev=mem0
2. Stop the template VM, set migration x-ignore-shared capability,
   migrate "exec:cat>/tmpfs/state", quit it
3. Start target VM:
   qemu-system-x86 -m 2G \
     -object memory-backend-file,id=mem0,size=2G,share=off,mem-path=/tmpfs/template-memory \
     -numa node,memdev=mem0 \
     -incoming defer
4. connect to target VM qmp, set migration x-ignore-shared capability,
migrate_incoming "exec:cat /tmpfs/state"
5. create more target VMs repeating 3 and 4

Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Yury Kotov <yury-kotov@yandex-team.ru>
Cc: Jiangshan Lai <laijs@hyper.sh>
Cc: Xu Wang <xu@hyper.sh>
Signed-off-by: Peng Tao <tao.peng@linux.alibaba.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <1560494113-1141-1-git-send-email-tao.peng@linux.alibaba.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2019-07-15 15:39:03 +02:00
Peter Xu
002cad6b16 migration: Split log_clear() into smaller chunks
Currently we are doing log_clear() right after log_sync() which mostly
keeps the old behavior when log_clear() was still part of log_sync().

This patch tries to further optimize the migration log_clear() code
path to split huge log_clear()s into smaller chunks.

We do this by spliting the whole guest memory region into memory
chunks, whose size is decided by MigrationState.clear_bitmap_shift (an
example will be given below).  With that, we don't do the dirty bitmap
clear operation on the remote node (e.g., KVM) when we fetch the dirty
bitmap, instead we explicitly clear the dirty bitmap for the memory
chunk for each of the first time we send a page in that chunk.

Here comes an example.

Assuming the guest has 64G memory, then before this patch the KVM
ioctl KVM_CLEAR_DIRTY_LOG will be a single one covering 64G memory.
If after the patch, let's assume when the clear bitmap shift is 18,
then the memory chunk size on x86_64 will be 1UL<<18 * 4K = 1GB.  Then
instead of sending a big 64G ioctl, we'll send 64 small ioctls, each
of the ioctl will cover 1G of the guest memory.  For each of the 64
small ioctls, we'll only send if any of the page in that small chunk
was going to be sent right away.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20190603065056.25211-12-peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2019-07-15 15:39:03 +02:00
Peter Xu
267691b65c migration: No need to take rcu during sync_dirty_bitmap
cpu_physical_memory_sync_dirty_bitmap() has one RAMBlock* as
parameter, which means that it must be with RCU read lock held
already.  Taking it again inside seems redundant.  Removing it.
Instead comment on the functions about the RCU read lock.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20190603065056.25211-2-peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2019-07-15 15:39:02 +02:00
Wei Yang
422314e751 migration/ram.c: reset complete_round when we gets a queued page
In case we gets a queued page, the order of block is interrupted. We may
not rely on the complete_round flag to say we have already searched the
whole blocks on the list.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20190605010828.6969-1-richardw.yang@linux.intel.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2019-07-15 15:39:02 +02:00
Wei Yang
77568ea7f8 migration/multifd: sync packet_num after all thread are done
Notification from recv thread is not ordered, which means we may be
notified by one MultiFDRecvParams but adjust packet_num for another.

Move the adjustment after we are sure each recv thread are sync-ed.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20190604023540.26532-1-richardw.yang@linux.intel.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2019-07-15 15:39:02 +02:00
Wei Yang
ca35380390 migration/xbzrle: update cache and current_data in one place
When we are not in the last_stage, we need to update the cache if page
is not the same.

Currently this procedure is scattered in two places and mixed with
encoding status check.

This patch extract this general step out to make the code a little bit
easy to read.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20190610004159.20966-1-richardw.yang@linux.intel.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2019-07-15 15:39:02 +02:00
Wei Yang
b6526c4b21 migration/multifd: call multifd_send_sync_main when sending RAM_SAVE_FLAG_EOS
On receiving RAM_SAVE_FLAG_EOS, multifd_recv_sync_main() is called to
synchronize receive threads. Current synchronization mechanism is to wait
for each channel's sem_sync semaphore. This semaphore is triggered by a
packet with MULTIFD_FLAG_SYNC flag. While in current implementation, we
don't do multifd_send_sync_main() to send such packet when
blk_mig_bulk_active() is true.

This will leads to the receive threads won't notify
multifd_recv_sync_main() by sem_sync. And multifd_recv_sync_main() will
always wait there.

[Note]: normal migration test works, while didn't test the
blk_mig_bulk_active() case. Since not sure how to produce this
situation.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20190612014337.11255-1-richardw.yang@linux.intel.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2019-07-15 15:39:02 +02:00
Juan Quintela
8ebad0f7a7 migration: fix multifd_recv event typo
It uses num in multifd_send().  Make it coherent.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Wei Yang <richardw.yang@linux.intel.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2019-07-15 15:39:01 +02:00
Like Xu
5cc8767d05 general: Replace global smp variables with smp machine properties
Basically, the context could get the MachineState reference via call
chains or unrecommended qdev_get_machine() in !CONFIG_USER_ONLY mode.

A local variable of the same name would be introduced in the declaration
phase out of less effort OR replace it on the spot if it's only used
once in the context. No semantic changes.

Signed-off-by: Like Xu <like.xu@linux.intel.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190518205428.90532-4-like.xu@linux.intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-07-05 17:07:36 -03:00
Alex Bennée
1f4abd81f7 migration: move port_attr inside CONFIG_LINUX
Otherwise the FreeBSD compiler complains about an unused variable.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2019-07-04 19:23:07 +01:00
Zhang Chen
0e8818f023 migration/colo.c: Add missed filter notify for Xen COLO.
We need to notify net filter to do checkpoint for Xen COLO, like KVM side.

Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2019-07-02 10:21:07 +08:00
Markus Armbruster
a8d2532645 Include qemu-common.h exactly where needed
No header includes qemu-common.h after this commit, as prescribed by
qemu-common.h's file comment.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190523143508.25387-5-armbru@redhat.com>
[Rebased with conflicts resolved automatically, except for
include/hw/arm/xlnx-zynqmp.h hw/arm/nrf51_soc.c hw/arm/msf2-soc.c
block/qcow2-refcount.c block/qcow2-cluster.c block/qcow2-cache.c
target/arm/cpu.h target/lm32/cpu.h target/m68k/cpu.h target/mips/cpu.h
target/moxie/cpu.h target/nios2/cpu.h target/openrisc/cpu.h
target/riscv/cpu.h target/tilegx/cpu.h target/tricore/cpu.h
target/unicore32/cpu.h target/xtensa/cpu.h; bsd-user/main.c and
net/tap-bsd.c fixed up]
2019-06-12 13:20:20 +02:00
Markus Armbruster
0b8fa32f55 Include qemu/module.h where needed, drop it from qemu-common.h
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190523143508.25387-4-armbru@redhat.com>
[Rebased with conflicts resolved automatically, except for
hw/usb/dev-hub.c hw/misc/exynos4210_rng.c hw/misc/bcm2835_rng.c
hw/misc/aspeed_scu.c hw/display/virtio-vga.c hw/arm/stm32f205_soc.c;
ui/cocoa.m fixed up]
2019-06-12 13:18:33 +02:00
Peter Maydell
0d74f3b427 Trivial fixes 06/06/2019
-----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEzS913cjjpNwuT1Fz8ww4vT8vvjwFAlz4844SHGxhdXJlbnRA
 dml2aWVyLmV1AAoJEPMMOL0/L748FtwQALCoNnMrEY4mnmHy0dnEQPRFcPMKa9pp
 3lqpmxLHAkSsWFKmmLKPteZhUroBmzXPa91984hhQiglcMMMsIPy+A+x1QBj7Yt2
 KeEKpIdSS6Qi4T72zVOtO4MR1pCeKUYHY8ICn/rqAkpkA/lt5DuX2xJSepgrSdAI
 /JgpawJ4Rz95x5rCLuy/t5egtKVYVhauv4EbQ9PeaFhSlwoKNYbc6qAZSvs8pr9n
 H4W8DgtI35wPj4zE3i9bbmnUUxCUMj6MjkIm/jTB5qewY/I+llb27CN2Uq1yvRKW
 ANGbGW3rVwQe8p6kbVcM7CDbawm4J0c59w/4mUTa3BRRuAj4KtHTeghXALHLn/gv
 aO90oZKGd2xGxpSMAapzgebNezUQxFFoRWhyI4o8N+SWEpoRbHkxDwrk2WlKXsCR
 xRYOensU17NOKMJ32AbUReC2/m7D71EH3723aVzd2O5nuIHlsEG2CYjlzjXFB4X8
 wPbaigcqpDEMwLTt3kYy4TrghFdcSaAYepmqXJ9D9UONMOrnRhhR9GkvEmCB4Eus
 BJanLE0xp59KTVDZ5c/v6+44P/RQ04aD2oFh0bMKlNb4+cfbQlq2odoiCPcUKimh
 XCCbYbwJFRRkgGTh9pFMgMqH9zX8HTgG/2Zp2VGFnYXnJzz/AuvgH9k5Pc4P3C/A
 lOCSBWpxR6bk
 =wmCY
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/vivier2/tags/trivial-branch-pull-request' into staging

Trivial fixes 06/06/2019

# gpg: Signature made Thu 06 Jun 2019 12:05:50 BST
# gpg:                using RSA key CD2F75DDC8E3A4DC2E4F5173F30C38BD3F2FBE3C
# gpg:                issuer "laurent@vivier.eu"
# gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>" [full]
# gpg:                 aka "Laurent Vivier <laurent@vivier.eu>" [full]
# gpg:                 aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>" [full]
# Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F  5173 F30C 38BD 3F2F BE3C

* remotes/vivier2/tags/trivial-branch-pull-request:
  hw/watchdog/wdt_i6300esb: Use DEVICE() macro to access DeviceState.qdev
  hw/scsi: Use the QOM BUS() macro to access BusState.qbus
  hw/sd: Use the QOM BUS() macro to access BusState.qbus
  hw/audio/ac97: Use the QOM DEVICE() macro to access DeviceState.qdev
  hw/vfio/pci: Use the QOM DEVICE() macro to access DeviceState.qdev
  hw/usb-storage: Use the QOM DEVICE() macro to access DeviceState.qdev
  hw/isa: Use the QOM DEVICE() macro to access DeviceState.qdev
  hw/s390x/event-facility: Use the QOM BUS() macro to access BusState.qbus
  hw/pci-bridge: Use the QOM BUS() macro to access BusState.qbus
  hw/scsi/vmw_pvscsi: Use qbus_reset_all() directly
  docs/devel/build-system: Update an example
  test: Fix make target check-report.tap
  util: Adjust qemu_guest_getrandom_nofail for Coverity
  vhost: fix incorrect print type
  migration: fix a typo
  hw/rdma: Delete unused headers inclusion

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-06-06 14:09:14 +01:00
Li Qiang
ff1543af22 migration: fix a typo
'postocpy' should be 'postcopy'.

CC: qemu-trivial@nongnu.org
Signed-off-by: Li Qiang <liq3ea@163.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20190525062832.18009-1-liq3ea@163.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2019-06-06 11:17:32 +02:00
Wei Yang
0315851938 migratioin/ram: leave RAMBlock->bmap blank on allocating
During migration, we would sync bitmap from ram_list.dirty_memory to
RAMBlock.bmap in cpu_physical_memory_sync_dirty_bitmap().

Since we set RAMBlock.bmap and ram_list.dirty_memory both to all 1, this
means at the first round this sync is meaningless and is a duplicated
work.

Leaving RAMBlock->bmap blank on allocating would have a side effect on
migration_dirty_pages, since it is calculated from the result of
cpu_physical_memory_sync_dirty_bitmap(). To keep it right, we need to
set migration_dirty_pages to 0 in ram_state_init().

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2019-06-05 12:44:03 +02:00
Yury Kotov
61053d4826 migration: Fix fd protocol for incoming defer
Currently, incoming migration through fd supports only command-line case:
E.g.
    fork();
    fd = open();
    exec("qemu ... -incoming fd:%d", fd);

It's possible to use add-fd commands to pass fd for migration, but it's
invalid case. add-fd works with fdset but not with particular fds.

To work with getfd in incoming defer it's enough to use monitor_fd_param
instead of strtol. monitor_fd_param supports both cases:
* fd:123
* fd:fd_name (added by getfd).

And also the use of monitor_fd_param improves error messages.

Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2019-06-05 12:43:55 +02:00
Wei Yang
f38d7fbc01 migration/ram.c: multifd_send_state->count is not really used
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2019-06-05 12:42:54 +02:00
Wei Yang
7d4eaace46 migration/ram.c: MultiFDSendParams.sem_sync is not really used
Besides init and destroy, MultiFDSendParams.sem_sync is not really used.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2019-06-05 12:42:39 +02:00
Kevin Wolf
d861ab3acf block: Add BlockBackend.ctx
This adds a new parameter to blk_new() which requires its callers to
declare from which AioContext this BlockBackend is going to be used (or
the locks of which AioContext need to be taken anyway).

The given context is only stored and kept up to date when changing
AioContexts. Actually applying the stored AioContext to the root node
is saved for another commit.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-06-04 15:22:22 +02:00
John Snow
592203e7cf migration/dirty-bitmaps: change bitmap enumeration method
Shift from looking at every root BDS to *every* BDS. This will migrate
bitmaps that are attached to blockdev created nodes instead of just ones
attached to emulated storage devices.

Note that this will not migrate anonymous or internal-use bitmaps, as
those are defined as having no name.

This will also fix the Coverity issues Peter Maydell has been asking
about for the past several releases, as well as fixing a real bug.

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Reported-by: Coverity 😅
Reported-by: aihua liang <aliang@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 20190514201926.10407-1-jsnow@redhat.com
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1652490
Fixes: Coverity CID 1390625
CC: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
2019-05-28 19:33:31 -04:00
Greg Kurz
b6eca81e1b migration: Fix typo in migrate_add_blocker() error message
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <155800428514.543845.17558475870097990036.stgit@bahia.lan>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2019-05-22 17:35:27 +02:00
Wei Yang
a5f7b1a63c migration/ram.c: fix typos in comments
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <20190510233729.15554-1-richardw.yang@linux.intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-05-14 19:00:04 +01:00
Yury Kotov
fd392cfa8e migration: Fix use-after-free during process exit
It fixes heap-use-after-free which was found by clang's ASAN.

Control flow of this use-after-free:
main_thread:
    * Got SIGTERM and completes main loop
    * Calls migration_shutdown
      - migrate_fd_cancel (so, migration_thread begins to complete)
      - object_unref(OBJECT(current_migration));

migration_thread:
    * migration_iteration_finish -> schedule cleanup bh
    * object_unref(OBJECT(s)); (Now, current_migration is freed)
    * exits

main_thread:
    * Calls vm_shutdown -> drain bdrvs -> main loop
      -> cleanup_bh -> use after free

If you want to reproduce, these couple of sleeps will help:
vl.c:4613:
     migration_shutdown();
+    sleep(2);
migration.c:3269:
+    sleep(1);
     trace_migration_thread_after_loop();
     migration_iteration_finish(s);

Original output:
qemu-system-x86_64: terminating on signal 15 from pid 31980 (<unknown process>)
=================================================================
==31958==ERROR: AddressSanitizer: heap-use-after-free on address 0x61900001d210
  at pc 0x555558a535ca bp 0x7fffffffb190 sp 0x7fffffffb188
READ of size 8 at 0x61900001d210 thread T0 (qemu-vm-0)
    #0 0x555558a535c9 in migrate_fd_cleanup migration/migration.c:1502:23
    #1 0x5555594fde0a in aio_bh_call util/async.c:90:5
    #2 0x5555594fe522 in aio_bh_poll util/async.c:118:13
    #3 0x555559524783 in aio_poll util/aio-posix.c:725:17
    #4 0x555559504fb3 in aio_wait_bh_oneshot util/aio-wait.c:71:5
    #5 0x5555573bddf6 in virtio_blk_data_plane_stop
      hw/block/dataplane/virtio-blk.c:282:5
    #6 0x5555589d5c09 in virtio_bus_stop_ioeventfd hw/virtio/virtio-bus.c:246:9
    #7 0x5555589e9917 in virtio_pci_stop_ioeventfd hw/virtio/virtio-pci.c:287:5
    #8 0x5555589e22bf in virtio_pci_vmstate_change hw/virtio/virtio-pci.c:1072:9
    #9 0x555557628931 in virtio_vmstate_change hw/virtio/virtio.c:2257:9
    #10 0x555557c36713 in vm_state_notify vl.c:1605:9
    #11 0x55555716ef53 in do_vm_stop cpus.c:1074:9
    #12 0x55555716eeff in vm_shutdown cpus.c:1092:12
    #13 0x555557c4283e in main vl.c:4617:5
    #14 0x7fffdfdb482f in __libc_start_main
      (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
    #15 0x555556ecb118 in _start (x86_64-softmmu/qemu-system-x86_64+0x1977118)

0x61900001d210 is located 144 bytes inside of 952-byte region
  [0x61900001d180,0x61900001d538)
freed by thread T6 (live_migration) here:
    #0 0x555556f76782 in __interceptor_free
      /tmp/final/llvm.src/projects/compiler-rt/lib/asan/asan_malloc_linux.cc:124:3
    #1 0x555558d5fa94 in object_finalize qom/object.c:618:9
    #2 0x555558d57651 in object_unref qom/object.c:1068:9
    #3 0x555558a55588 in migration_thread migration/migration.c:3272:5
    #4 0x5555595393f2 in qemu_thread_start util/qemu-thread-posix.c:502:9
    #5 0x7fffe057f6b9 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76b9)

previously allocated by thread T0 (qemu-vm-0) here:
    #0 0x555556f76b03 in __interceptor_malloc
      /tmp/final/llvm.src/projects/compiler-rt/lib/asan/asan_malloc_linux.cc:146:3
    #1 0x7ffff6ee37b8 in g_malloc (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x4f7b8)
    #2 0x555558d58031 in object_new qom/object.c:640:12
    #3 0x555558a31f21 in migration_object_init migration/migration.c:139:25
    #4 0x555557c41398 in main vl.c:4320:5
    #5 0x7fffdfdb482f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)

Thread T6 (live_migration) created by T0 (qemu-vm-0) here:
    #0 0x555556f5f0dd in pthread_create
      /tmp/final/llvm.src/projects/compiler-rt/lib/asan/asan_interceptors.cc:210:3
    #1 0x555559538cf9 in qemu_thread_create util/qemu-thread-posix.c:539:11
    #2 0x555558a53304 in migrate_fd_connect migration/migration.c:3332:5
    #3 0x555558a72bd8 in migration_channel_connect migration/channel.c:92:5
    #4 0x555558a6ef87 in exec_start_outgoing_migration migration/exec.c:42:5
    #5 0x555558a4f3c2 in qmp_migrate migration/migration.c:1922:9
    #6 0x555558bb4f6a in qmp_marshal_migrate qapi/qapi-commands-migration.c:607:5
    #7 0x555559363738 in do_qmp_dispatch qapi/qmp-dispatch.c:131:5
    #8 0x555559362a15 in qmp_dispatch qapi/qmp-dispatch.c:174:11
    #9 0x5555571bac15 in monitor_qmp_dispatch monitor.c:4124:11
    #10 0x55555719a22d in monitor_qmp_bh_dispatcher monitor.c:4207:9
    #11 0x5555594fde0a in aio_bh_call util/async.c:90:5
    #12 0x5555594fe522 in aio_bh_poll util/async.c:118:13
    #13 0x5555595201e0 in aio_dispatch util/aio-posix.c:460:5
    #14 0x555559503553 in aio_ctx_dispatch util/async.c:261:5
    #15 0x7ffff6ede196 in g_main_context_dispatch
      (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x4a196)

SUMMARY: AddressSanitizer: heap-use-after-free migration/migration.c:1502:23
  in migrate_fd_cleanup
Shadow bytes around the buggy address:
  0x0c327fffb9f0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c327fffba00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c327fffba10: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c327fffba20: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c327fffba30: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
=>0x0c327fffba40: fd fd[fd]fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c327fffba50: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c327fffba60: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c327fffba70: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c327fffba80: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c327fffba90: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable: 00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone: fa
  Freed heap region: fd
  Stack left redzone: f1
  Stack mid redzone: f2
  Stack right redzone: f3
  Stack after return: f5
  Stack use after scope: f8
  Global redzone: f9
  Global init order: f6
  Poisoned by user: f7
  Container overflow: fc
  Array cookie: ac
  Intra object redzone: bb
  ASan internal: fe
  Left alloca redzone: ca
  Right alloca redzone: cb
  Shadow gap: cc
==31958==ABORTING

Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru>
Message-Id: <20190408113343.2370-1-yury-kotov@yandex-team.ru>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
  Fixed up comment formatting
2019-05-14 18:59:54 +01:00
Wei Yang
16015d32e4 migration/savevm: wrap into qemu_loadvm_state_header()
On source side, we have qemu_savevm_state_header() to send related data,
while on the receiving side those steps are scattered in
qemu_loadvm_state().

This patch wrap those related steps into qemu_loadvm_state_header() to
make it friendly to read.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <20190424004700.12766-5-richardw.yang@linux.intel.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-05-14 17:33:35 +01:00
Wei Yang
9e14b84908 migration/savevm: load_header before load_setup
In migration_thread() and qemu_savevm_state(), we savevm_state in
following sequence:

    qemu_savevm_state_header(f);
    qemu_savevm_state_setup(f);

Then it would be more proper to loadvm_state in the save sequence.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <20190424004700.12766-4-richardw.yang@linux.intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2019-05-14 17:33:35 +01:00