qemu-e2k

Commit Graph

Author	SHA1	Message	Date
Markus Armbruster	735527e179	migration/colo: Fix qmp_xen_colo_do_checkpoint() error handling The Error ** argument must be NULL, &error_abort, &error_fatal, or a pointer to a variable containing NULL. Passing an argument of the latter kind twice without clearing it in between is wrong: if the first call sets an error, it no longer points to NULL for the second call. qmp_xen_colo_do_checkpoint() passes @errp first to replication_do_checkpoint_all(), and then to colo_notify_filters_event(). If both fail, this will trip the assertion in error_setv(). Similar code in secondary_vm_do_failover() calls colo_notify_filters_event() only after replication_do_checkpoint_all() succeeded. Do the same here. Fixes: `0e8818f023` Cc: Zhang Chen <chen.zhang@intel.com> Cc: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Zhang Chen <chen.zhang@intel.com> Message-Id: <20200422130719.28225-12-armbru@redhat.com>	2020-04-29 08:01:52 +02:00
Marc-André Lureau	9cbc36497c	migration: fix cleanup_bh leak on resume Since commit `8c6b0356b5` ("util/async: make bh_aio_poll() O(1)"), migration-test reveals a leak: QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64 tests/qtest/migration-test -p /x86_64/migration/postcopy/recovery tests/qtest/libqtest.c:140: kill_qemu() tried to terminate QEMU process but encountered exit status 1 (expected 0) ================================================================= ==2082571==ERROR: LeakSanitizer: detected memory leaks Direct leak of 40 byte(s) in 1 object(s) allocated from: #0 0x7f25971dfc58 in __interceptor_malloc (/lib64/libasan.so.5+0x10dc58) #1 0x7f2596d08358 in g_malloc (/lib64/libglib-2.0.so.0+0x57358) #2 0x560970d006f8 in qemu_bh_new /home/elmarco/src/qemu/util/main-loop.c:532 #3 0x5609704afa02 in migrate_fd_connect /home/elmarco/src/qemu/migration/migration.c:3407 #4 0x5609704b6b6f in migration_channel_connect /home/elmarco/src/qemu/migration/channel.c:92 #5 0x5609704b2bfb in socket_outgoing_migration /home/elmarco/src/qemu/migration/socket.c:108 #6 0x560970b9bd6c in qio_task_complete /home/elmarco/src/qemu/io/task.c:196 #7 0x560970b9aa97 in qio_task_thread_result /home/elmarco/src/qemu/io/task.c:111 #8 0x7f2596cfee3a (/lib64/libglib-2.0.so.0+0x4de3a) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20200325184723.2029630-2-marcandre.lureau@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-04-02 14:55:45 -04:00
Mao Zhongyi	7cd75cbdb8	migration: use "" instead of (null) for tls-authz run: (qemu) info migrate_parameters announce-initial: 50 ms ... announce-max: 550 ms multifd-compression: none xbzrle-cache-size: 4194304 max-postcopy-bandwidth: 0 tls-authz: '(null)' Migration parameter 'tls-authz' is used to provide the QOM ID of a QAuthZ subclass instance that provides the access control check, default is NULL. But the empty string is not a valid object ID, so use "" instead of the default. Although it will fail when lookup an object with ID "", it is harmless, just consistent with tls_creds. As a bonus, this patch also fixed the bad indentation on the last line and removed 'has_tls_authz' redundant check in 'hmp_info_migrate_parameters'. Signed-off-by: Mao Zhongyi <maozhongyi@cmss.chinamobile.com> Message-Id: <119f539a9f4d198bc3bcced46b8280520d60bc51.1585100802.git.maozhongyi@cmss.chinamobile.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-03-25 12:31:38 +00:00
Vladimir Sementsov-Ogievskiy	b4a1733c5e	migration/ram: fix use after free of local_err local_err is used again in migration_bitmap_sync_precopy() after precopy_notify(), so we must zero it. Otherwise try to set non-NULL local_err will crash. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200324153630.11882-6-vsementsov@virtuozzo.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-03-25 12:31:38 +00:00
Vladimir Sementsov-Ogievskiy	27d07fcfa7	migration/colo: fix use after free of local_err local_err is used again in secondary_vm_do_failover() after replication_stop_all(), so we must zero it. Otherwise try to set non-NULL local_err will crash. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200324153630.11882-5-vsementsov@virtuozzo.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-03-25 12:31:38 +00:00
Mao Zhongyi	06b1c6f8b7	xbzrle: update xbzrle doc Add new parameter description, also: 1. Remove unsociable space. 2. Nit picking: s/two/2 in report Signed-off-by: Mao Zhongyi <maozhongyi@cmss.chinamobile.com> Message-Id: <20200320143216.423374-1-maozhongyi@cmss.chinamobile.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-03-25 12:31:38 +00:00
zhanghailiang	19dd408a47	migration: recognize COLO as part of activating process We will migrate parts of the dirty pages live in the background during the gap between two checkpoints; without this modification, it will not work because ram_save_iterate() will check it before sending RAM_SAVE_FLAG_EOS at the end of it. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Message-Id: <20200224065414.36524-7-zhang.zhanghailiang@huawei.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-03-13 09:36:30 +00:00
zhanghailiang	8af66371ed	ram/colo: only record bitmap of dirty pages in COLO stage It is only need to record bitmap of dirty pages while goes into COLO stage. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Message-Id: <20200224065414.36524-6-zhang.zhanghailiang@huawei.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-03-13 09:36:30 +00:00
zhanghailiang	0393031a16	COLO: Optimize memory back-up process This patch will reduce the downtime of VM for the initial process, Previously, we copied all these memory in preparing stage of COLO while we need to stop VM, which is a time-consuming process. Here we optimize it by a trick, back-up every page while in migration process while COLO is enabled, though it affects the speed of the migration, but it obviously reduce the downtime of back-up all SVM'S memory in COLO preparing stage. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Message-Id: <20200224065414.36524-5-zhang.zhanghailiang@huawei.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> minor typo fixes	2020-03-13 09:36:30 +00:00
Keqian Zhu	dc14a47076	migration/throttle: Add throttle-trig-thres migration parameter Currently, if the bytes_dirty_period is more than the 50% of bytes_xfer_period, we start or increase throttling. If we make this percentage higher, then we can tolerate higher dirty rate during migration, which means less impact on guest. The side effect of higher percentage is longer migration time. We can make this parameter configurable to switch between migration time first or guest performance first. The default value is 50 and valid range is 1 to 100. Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com> Message-Id: <20200224023142.39360-1-zhukeqian1@huawei.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-03-13 09:36:30 +00:00
zhanghailiang	f51d0b4178	savevm: Don't call colo_init_ram_cache twice This helper has been called twice which is wrong. Left the one where called while get COLO enable message from source side. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-02-28 10:13:54 +01:00
zhanghailiang	6ad8ad38d0	migration/colo: wrap incoming checkpoint process into new helper Split checkpoint incoming process into a helper. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-02-28 10:13:54 +01:00
zhanghailiang	0306dae5ac	migration: fix COLO broken caused by a previous commit This commit "migration: Create migration_is_running()" broke COLO. Because there is a process broken by this commit. colo_process_checkpoint ->colo_do_checkpoint_transaction ->migrate_set_block_enabled ->qmp_migrate_set_capabilities It can be fixed by making the COLO process an exception. Maybe we need a better way to fix it. Cc: Juan Quintela <quintela@redhat.com> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-02-28 10:13:54 +01:00
Stefan Hajnoczi	a152bd0093	migration/block: rename BLOCK_SIZE macro Both <linux/fs.h> and <sys/mount.h> define BLOCK_SIZE macros. Avoiding using that name in block/migration.c. I noticed this when including <liburing.h> (Linux io_uring) from "block/aio.h" and compilation failed. Although patches adding that include haven't been sent yet, it makes sense to rename the macro now in case someone else stumbles on it in the meantime. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-02-28 09:25:49 +01:00
Pan Nengyuan	26daeba4d6	migration/savevm: release gslist after dump_vmstate_json 'list' forgot to free at the end of dump_vmstate_json_to_file(), although it's called only once, but seems like a clean code. Fix the leak as follow: Direct leak of 16 byte(s) in 1 object(s) allocated from: #0 0x7fb946abd768 in __interceptor_malloc (/lib64/libasan.so.5+0xef768) #1 0x7fb945eca445 in g_malloc (/lib64/libglib-2.0.so.0+0x52445) #2 0x7fb945ee2066 in g_slice_alloc (/lib64/libglib-2.0.so.0+0x6a066) #3 0x7fb945ee3139 in g_slist_prepend (/lib64/libglib-2.0.so.0+0x6b139) #4 0x5585db591581 in object_class_get_list_tramp /mnt/sdb/qemu-new/qemu/qom/object.c:1084 #5 0x5585db590f66 in object_class_foreach_tramp /mnt/sdb/qemu-new/qemu/qom/object.c:1028 #6 0x7fb945eb35f7 in g_hash_table_foreach (/lib64/libglib-2.0.so.0+0x3b5f7) #7 0x5585db59110c in object_class_foreach /mnt/sdb/qemu-new/qemu/qom/object.c:1038 #8 0x5585db5916b6 in object_class_get_list /mnt/sdb/qemu-new/qemu/qom/object.c:1092 #9 0x5585db335ca0 in dump_vmstate_json_to_file /mnt/sdb/qemu-new/qemu/migration/savevm.c:638 #10 0x5585daa5bcbf in main /mnt/sdb/qemu-new/qemu/vl.c:4420 #11 0x7fb941204812 in __libc_start_main ../csu/libc-start.c:308 #12 0x5585da29420d in _start (/mnt/sdb/qemu-new/qemu/build/x86_64-softmmu/qemu-system-x86_64+0x27f020d) Indirect leak of 7472 byte(s) in 467 object(s) allocated from: #0 0x7fb946abd768 in __interceptor_malloc (/lib64/libasan.so.5+0xef768) #1 0x7fb945eca445 in g_malloc (/lib64/libglib-2.0.so.0+0x52445) #2 0x7fb945ee2066 in g_slice_alloc (/lib64/libglib-2.0.so.0+0x6a066) #3 0x7fb945ee3139 in g_slist_prepend (/lib64/libglib-2.0.so.0+0x6b139) #4 0x5585db591581 in object_class_get_list_tramp /mnt/sdb/qemu-new/qemu/qom/object.c:1084 #5 0x5585db590f66 in object_class_foreach_tramp /mnt/sdb/qemu-new/qemu/qom/object.c:1028 #6 0x7fb945eb35f7 in g_hash_table_foreach (/lib64/libglib-2.0.so.0+0x3b5f7) #7 0x5585db59110c in object_class_foreach /mnt/sdb/qemu-new/qemu/qom/object.c:1038 #8 0x5585db5916b6 in object_class_get_list /mnt/sdb/qemu-new/qemu/qom/object.c:1092 #9 0x5585db335ca0 in dump_vmstate_json_to_file /mnt/sdb/qemu-new/qemu/migration/savevm.c:638 #10 0x5585daa5bcbf in main /mnt/sdb/qemu-new/qemu/vl.c:4420 #11 0x7fb941204812 in __libc_start_main ../csu/libc-start.c:308 #12 0x5585da29420d in _start (/mnt/sdb/qemu-new/qemu/build/x86_64-softmmu/qemu-system-x86_64+0x27f020d) Reported-by: Euler Robot <euler.robot@huawei.com> Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-02-28 09:25:49 +01:00
Chen Qun	600fe89d6e	migration/vmstate: Remove redundant statement in vmstate_save_state_v() The "ret" has been assigned in all branches. It didn't need to be assigned separately. Clang static code analyzer show warning: migration/vmstate.c:365:17: warning: Value stored to 'ret' is never read ret = 0; ^ ~ Reported-by: Euler Robot <euler.robot@huawei.com> Signed-off-by: Chen Qun <kuhn.chenqun@huawei.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-02-28 09:25:49 +01:00
Juan Quintela	87dc6f5f66	multifd: Add zstd compression multifd support Signed-off-by: Juan Quintela <quintela@redhat.com> Acked-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-02-28 09:25:49 +01:00
Juan Quintela	6a9ad15420	multifd: Add multifd-zstd-level parameter This parameter specifies the zstd compression level. The next patch will put it to use. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Acked-by: Markus Armbruster <armbru@redhat.com>	2020-02-28 09:25:28 +01:00
Juan Quintela	7ec2c2b3c1	multifd: Add zlib compression multifd support Signed-off-by: Juan Quintela <quintela@redhat.com> Acked-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-02-28 09:24:43 +01:00
Juan Quintela	9004db48c0	multifd: Add multifd-zlib-level parameter This parameter specifies the zlib compression level. The next patch will put it to use. Signed-off-by: Juan Quintela <quintela@redhat.com> Acked-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-02-28 09:24:43 +01:00
Juan Quintela	ab7cbb0b9a	multifd: Make no compression operations into its own structure It will be used later. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> --- No comp value needs to be zero.	2020-02-28 09:24:43 +01:00
Juan Quintela	96eef04238	multifd: Add multifd-compression parameter This will store the compression method to use. We start with none. Signed-off-by: Juan Quintela <quintela@redhat.com> Acked-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> --- Rename multifd-method to multifd-compression	2020-02-28 09:24:43 +01:00
Dr. David Alan Gilbert	2a1bc8bde7	migration/rdma: rdma_accept_incoming_migration fix error handling rdma_accept_incoming_migration is called from an fd handler and can't return an Error * anywhere. Currently it's leaking Error's in errp/local_err - there's no point putting them in there unless we can report them. Turn most into fprintf's, and the last into an error_reportf_err where it's coming up from another function. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-02-13 10:55:55 +01:00
Keqian Zhu	d05de9e39a	migration: Optimization about wait-unplug migration state qemu_savevm_nr_failover_devices() is originally designed to get the number of failover devices, but it actually returns the number of "unplug-pending" failover devices now. Moreover, what drives migration state to wait-unplug should be the number of "unplug-pending" failover devices, not all failover devices. We can also notice that qemu_savevm_state_guest_unplug_pending() and qemu_savevm_nr_failover_devices() is equivalent almost (from the code view). So the latter is incorrect semantically and useless, just delete it. In the qemu_savevm_state_guest_unplug_pending(), once hit a unplug-pending failover device, then it can return true right now to save cpu time. Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Tested-by: Jens Freimann <jfreimann@redhat.com> Reviewed-by: Jens Freimann <jfreimann@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-02-13 10:53:10 +01:00
Zhimin Feng	8958338b10	migration: Maybe VM is paused when migration is cancelled If the migration is cancelled when it is in the completion phase, the migration state is set to MIGRATION_STATUS_CANCELLING. The VM maybe wait for the 'pause_sem' semaphore in migration_maybe_pause function, so that VM always is paused. Reported-by: Euler Robot <euler.robot@huawei.com> Signed-off-by: Zhimin Feng <fengzhimin1@huawei.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-02-13 10:52:58 +01:00
Wei Yang	42d24611af	migration/compress: compress QEMUFile is not writable We open a file with empty_ops for compress QEMUFile, which means this is not writable. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-01-29 11:28:59 +01:00
Eric Auger	a085664f21	migration: Simplify get_qlist Instead of inserting read elements at the head and then reversing the list, it is simpler to add each element after the previous one. Introduce QLIST_RAW_INSERT_AFTER helper and use it in get_qlist(). Signed-off-by: Eric Auger <eric.auger@redhat.com> Suggested-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-01-29 11:28:59 +01:00
Juan Quintela	d32ca5ad79	multifd: Split multifd code into its own file Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-01-29 11:28:59 +01:00
Juan Quintela	b673eab4e2	multifd: Make multifd_load_setup() get an Error parameter We need to change the full chain to pass the Error parameter. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-01-29 11:28:59 +01:00
Juan Quintela	00f4b572e6	multifd: Make multifd_save_setup() get an Error parameter Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-01-29 11:28:59 +01:00
Juan Quintela	857a4bbb86	migration: Make checkpatch happy with comments Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-01-29 11:28:59 +01:00
Juan Quintela	a6703e4d33	multifd: Use qemu_target_page_size() We will make it cpu independent. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-01-29 11:28:59 +01:00
Juan Quintela	99f2c6fb46	multifd: multifd_send_sync_main only needs the qemufile Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-01-29 11:28:59 +01:00
Juan Quintela	67a4c8910c	multifd: multifd_queue_page only needs the qemufile Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-01-29 11:28:59 +01:00
Juan Quintela	df94d32bb1	multifd: multifd_send_pages only needs the qemufile Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-01-29 11:28:59 +01:00
Zhimin Feng	9c4d333c09	migration/multifd: fix nullptr access in multifd_send_terminate_threads If the multifd_send_threads is not created when migration is failed, multifd_save_cleanup would be called twice. In this scenario, the multifd_send_state is accessed after it has been released, the result is that the source VM is crashing down. Here is the coredump stack: Program received signal SIGSEGV, Segmentation fault. 0x00005629333a78ef in multifd_send_terminate_threads (err=err@entry=0x0) at migration/ram.c:1012 1012 MultiFDSendParams *p = &multifd_send_state->params[i]; #0 0x00005629333a78ef in multifd_send_terminate_threads (err=err@entry=0x0) at migration/ram.c:1012 #1 0x00005629333ab8a9 in multifd_save_cleanup () at migration/ram.c:1028 #2 0x00005629333abaea in multifd_new_send_channel_async (task=0x562935450e70, opaque=<optimized out>) at migration/ram.c:1202 #3 0x000056293373a562 in qio_task_complete (task=task@entry=0x562935450e70) at io/task.c:196 #4 0x000056293373a6e0 in qio_task_thread_result (opaque=0x562935450e70) at io/task.c:111 #5 0x00007f475d4d75a7 in g_idle_dispatch () from /usr/lib64/libglib-2.0.so.0 #6 0x00007f475d4da9a9 in g_main_context_dispatch () from /usr/lib64/libglib-2.0.so.0 #7 0x0000562933785b33 in glib_pollfds_poll () at util/main-loop.c:219 #8 os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:242 #9 main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:518 #10 0x00005629334c5acf in main_loop () at vl.c:1810 #11 0x000056293334d7bb in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4471 If the multifd_send_threads is not created when migration is failed. In this scenario, we don't call multifd_save_cleanup in multifd_new_send_channel_async. Signed-off-by: Zhimin Feng <fengzhimin1@huawei.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-01-29 11:28:59 +01:00
Juan Quintela	392d87e213	migration: Create migration_is_running() This function returns true if we are in the middle of a migration. It is like migration_is_setup_or_active() with CANCELLING and COLO. Adapt all callers that are needed. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-01-29 11:28:59 +01:00
Juan Quintela	b69a0227a8	migration: Don't send data if we have stopped If we do a cancel, we got out without one error, but we can't do the rest of the output as in a normal situation. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-01-29 11:28:59 +01:00
Juan Quintela	a555b8092a	qemu-file: Don't do IO after shutdown Be sure that we are not doing neither read/write after shutdown of the QEMUFile. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-01-29 11:28:59 +01:00
Juan Quintela	3d4095b222	multifd: Make sure that we don't do any IO after an error Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-01-29 11:28:59 +01:00
Marc-André Lureau	4f67d30b5e	qdev: set properties with device_class_set_props() The following patch will need to handle properties registration during class_init time. Let's use a device_class_set_props() setter. spatch --macro-file scripts/cocci-macro-file.h --sp-file ./scripts/coccinelle/qdev-set-props.cocci --keep-comments --in-place --dir . @@ typedef DeviceClass; DeviceClass *d; expression val; @@ - d->props = val + device_class_set_props(d, val) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20200110153039.1379601-20-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-01-24 20:59:15 +01:00
Juan Quintela	ddac5cb2d9	multifd: Be consistent about using uint64_t We transmit ram_addr_t always as uint64_t. Be consistent in its use (on 64bit system, it is always uint64_t problem is 32bits). Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-01-20 09:17:07 +01:00
Eric Auger	4746dbf8a9	migration: Support QLIST migration Support QLIST migration using the same principle as QTAILQ: `94869d5c52` ("migration: migrate QTAILQ"). The VMSTATE_QLIST_V macro has the same proto as VMSTATE_QTAILQ_V. The change mainly resides in QLIST RAW macros: QLIST_RAW_INSERT_HEAD and QLIST_RAW_REVERSE. Tests also are provided. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-01-20 09:10:23 +01:00
Peter Xu	93062e2361	migration: Change SaveStateEntry.instance_id into uint32_t It was always used as 32bit, so define it as used to be clear. Instead of using -1 as the auto-gen magic value, we switch to UINT32_MAX. We also make sure that we don't auto-gen this value to avoid overflowed instance IDs without being noticed. Suggested-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-01-20 09:10:23 +01:00
Peter Xu	1df2c9a26f	migration: Define VMSTATE_INSTANCE_ID_ANY Define the new macro VMSTATE_INSTANCE_ID_ANY for callers who wants to auto-generate the vmstate instance ID. Previously it was hard coded as -1 instead of this macro. It helps to change this default value in the follow up patches. No functional change. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-01-20 09:10:23 +01:00
Alexey Romko	8bba004cca	Bug #1829242 correction. Added type conversions to ram_addr_t before all left shifts of page indexes to TARGET_PAGE_BITS, to correct overflows when the page address was 4Gb and more. Signed-off-by: Alexey Romko <nevilad@yahoo.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-01-20 09:10:23 +01:00
Jiahui Cen	9560a48ecc	migration/multifd: fix destroyed mutex access in terminating multifd threads One multifd will lock all the other multifds' IOChannel mutex to inform them to quit by setting p->quit or shutting down p->c. In this scenario, if some multifds had already been terminated and multifd_load_cleanup/multifd_save_cleanup had destroyed their mutex, it could cause destroyed mutex access when trying lock their mutex. Here is the coredump stack: #0 0x00007f81a2794437 in raise () from /usr/lib64/libc.so.6 #1 0x00007f81a2795b28 in abort () from /usr/lib64/libc.so.6 #2 0x00007f81a278d1b6 in __assert_fail_base () from /usr/lib64/libc.so.6 #3 0x00007f81a278d262 in __assert_fail () from /usr/lib64/libc.so.6 #4 0x000055eb1bfadbd3 in qemu_mutex_lock_impl (mutex=0x55eb1e2d1988, file=<optimized out>, line=<optimized out>) at util/qemu-thread-posix.c:64 #5 0x000055eb1bb4564a in multifd_send_terminate_threads (err=<optimized out>) at migration/ram.c:1015 #6 0x000055eb1bb4bb7f in multifd_send_thread (opaque=0x55eb1e2d19f8) at migration/ram.c:1171 #7 0x000055eb1bfad628 in qemu_thread_start (args=0x55eb1e170450) at util/qemu-thread-posix.c:502 #8 0x00007f81a2b36df5 in start_thread () from /usr/lib64/libpthread.so.0 #9 0x00007f81a286048d in clone () from /usr/lib64/libc.so.6 To fix it up, let's destroy the mutex after all the other multifd threads had been terminated. Signed-off-by: Jiahui Cen <cenjiahui@huawei.com> Signed-off-by: Ying Fang <fangying1@huawei.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-01-20 09:10:23 +01:00
Jiahui Cen	f76e32eb05	migration/multifd: fix nullptr access in terminating multifd threads One multifd channel will shutdown all the other multifd's IOChannel when it fails to receive an IOChannel. In this scenario, if some multifds had not received its IOChannel yet, it would try to shutdown its IOChannel which could cause nullptr access at qio_channel_shutdown. Here is the coredump stack: #0 object_get_class (obj=obj@entry=0x0) at qom/object.c:908 #1 0x00005563fdbb8f4a in qio_channel_shutdown (ioc=0x0, how=QIO_CHANNEL_SHUTDOWN_BOTH, errp=0x0) at io/channel.c:355 #2 0x00005563fd7b4c5f in multifd_recv_terminate_threads (err=<optimized out>) at migration/ram.c:1280 #3 0x00005563fd7bc019 in multifd_recv_new_channel (ioc=ioc@entry=0x556400255610, errp=errp@entry=0x7ffec07dce00) at migration/ram.c:1478 #4 0x00005563fda82177 in migration_ioc_process_incoming (ioc=ioc@entry=0x556400255610, errp=errp@entry=0x7ffec07dce30) at migration/migration.c:605 #5 0x00005563fda8567d in migration_channel_process_incoming (ioc=0x556400255610) at migration/channel.c:44 #6 0x00005563fda83ee0 in socket_accept_incoming_migration (listener=0x5563fff6b920, cioc=0x556400255610, opaque=<optimized out>) at migration/socket.c:166 #7 0x00005563fdbc25cd in qio_net_listener_channel_func (ioc=<optimized out>, condition=<optimized out>, opaque=<optimized out>) at io/net-listener.c:54 #8 0x00007f895b6fe9a9 in g_main_context_dispatch () from /usr/lib64/libglib-2.0.so.0 #9 0x00005563fdc18136 in glib_pollfds_poll () at util/main-loop.c:218 #10 0x00005563fdc181b5 in os_host_main_loop_wait (timeout=1000000000) at util/main-loop.c:241 #11 0x00005563fdc183a2 in main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:517 #12 0x00005563fd8edb37 in main_loop () at vl.c:1791 #13 0x00005563fd74fd45 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4473 To fix it up, let's check p->c before calling qio_channel_shutdown. Signed-off-by: Jiahui Cen <cenjiahui@huawei.com> Signed-off-by: Ying Fang <fangying1@huawei.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-01-20 09:10:23 +01:00
Wei Yang	c6b3a2e0c4	migration/multifd: not use multifd during postcopy We don't support multifd during postcopy, but user still could enable both multifd and postcopy. This leads to migration failure. Skip multifd during postcopy. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-01-20 09:10:23 +01:00
Wei Yang	eab54aa78f	migration/multifd: clean pages after filling packet This is a preparation for the next patch: not use multifd during postcopy. Without enabling postcopy, everything looks good. While after enabling postcopy, migration may fail even not use multifd during postcopy. The reason is the pages is not properly cleared and old target page will continue to be transferred. After clean pages, migration succeeds. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-01-20 09:10:23 +01:00
Wei Yang	644acf99b8	migration/postcopy: enable compress during postcopy postcopy requires to place a whole host page, while migration thread migrate memory in target page size. This makes postcopy need to collect all target pages in one host page before placing via userfaultfd. To enable compress during postcopy, there are two problems to solve: 1. Random order for target page arrival 2. Target pages in one host page arrives without interrupt by target page from other host page The first one is handled by previous cleanup patch. This patch handles the second one by: 1. Flush compress thread for each host page 2. Wait for decompress thread for before placing host page Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-01-20 09:10:23 +01:00
Wei Yang	91ba442f5c	migration/postcopy: enable random order target page arrival After using number of target page received to track one host page, we could have the capability to handle random order target page arrival in one host page. This is a preparation for enabling compress during postcopy. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-01-20 09:10:23 +01:00
Wei Yang	e5e73b0f90	migration/postcopy: set all_zero to true on the first target page For the first target page, all_zero is set to true for this round check. After target_pages introduced, we could leverage this variable instead of checking the address offset. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-01-20 09:10:23 +01:00
Wei Yang	4cbb3c63c1	migration/postcopy: count target page number to decide the place_needed In postcopy, it requires to place whole host page instead of target page. Currently, it relies on the page offset to decide whether this is the last target page. We also can count the target page number during the iteration. When the number of target page equals (host page size / target page size), this means it is the last target page in the host page. This is a preparation for non-ordered target page transmission. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-01-20 09:10:23 +01:00
Wei Yang	ca1a6b708b	migration/postcopy: wait for decompress thread in precopy Compress is not supported with postcopy, it is safe to wait for decompress thread just in precopy. This is a preparation for later patch. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-01-20 09:10:23 +01:00
Wei Yang	2e36bc1b88	migration/postcopy: reduce memset when it is zero page and matches_target_page_size In this case, page_buffer content would not be used. Skip this to save some time. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-01-20 09:10:23 +01:00
Yury Kotov	e65cec5e5d	migration/ram: Yield periodically to the main loop Usually, incoming migration coroutine yields to the main loop while its IO-channel is waiting for data to receive. But there is a case when RAM migration and data receive have the same speed: VM with huge zeroed RAM. In this case, IO-channel won't read and thus the main loop is stuck and for instance, it doesn't respond to QMP commands. For this case, yield periodically, but not too often, so as not to affect the speed of migration. Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-01-20 09:10:22 +01:00
Scott Cheloha	174723ffe5	migration: savevm_state_handler_insert: constant-time element insertion savevm_state's SaveStateEntry TAILQ is a priority queue. Priority sorting is maintained by searching from head to tail for a suitable insertion spot. Insertion is thus an O(n) operation. If we instead keep track of the head of each priority's subqueue within that larger queue we can reduce this operation to O(1) time. savevm_state_handler_remove() becomes slightly more complex to accommodate these gains: we need to replace the head of a priority's subqueue when removing it. With O(1) insertion, booting VMs with many SaveStateEntry objects is more plausible. For example, a ppc64 VM with maxmem=8T has 40000 such objects to insert. Signed-off-by: Scott Cheloha <cheloha@linux.vnet.ibm.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-01-20 09:10:22 +01:00
Scott Cheloha	bd5de61e7b	migration: add savevm_state_handler_remove() Create a function to abstract common logic needed when removing a SaveStateEntry element from the savevm_state.handlers queue. For now we just remove the element. Soon it will involve additional cleanup. Signed-off-by: Scott Cheloha <cheloha@linux.vnet.ibm.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-01-20 09:10:22 +01:00
Yury Kotov	603d5a42d3	migration: Fix the re-run check of the migrate-incoming command The current check sets an error but doesn't fail the command. This may cause a problem if new connection attempt by the same URI affects the first connection. Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-01-20 09:10:22 +01:00
Fangrui Song	2667c98722	migration: Fix incorrect integer->float conversion caught by clang Clang does not like qmp_migrate_set_downtime()'s code to clamp double @value to 0..INT64_MAX: qemu/migration/migration.c:2038:24: error: implicit conversion from 'long' to 'double' changes value from 9223372036854775807 to 9223372036854775808 [-Werror,-Wimplicit-int-float-conversion] The warning will be enabled by default in clang 10. It is not available for clang <= 9. The clamp is actually useless; @value is checked to be within 0..MAX_MIGRATE_DOWNTIME_SECONDS immediately before. Delete it. While there, make the conversion from double to int64_t explicit. Signed-off-by: Fangrui Song <i@maskray.me> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> [Patch split, commit message improved] Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-01-20 09:10:22 +01:00
Dr. David Alan Gilbert	97e1e06780	migration: Rate limit inside host pages When using hugepages, rate limiting is necessary within each huge page, since a 1G huge page can take a significant time to send, so you end up with bursty behaviour. Fixes: `4c011c37ec` ("postcopy: Send whole huge pages") Reported-by: Lin Ma <LMa@suse.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-01-20 09:10:22 +01:00
Daniel Henrique Barboza	03acb4e94d	ram.c: remove unneeded labels ram_save_queue_pages() has an 'err' label that can be replaced by 'return -1' instead. Same thing with ram_discard_range(), and in this case we can also get rid of the 'ret' variable and return either '-1' on error or the result of ram_block_discard_range(). CC: Juan Quintela <quintela@redhat.com> CC: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-01-20 09:10:22 +01:00
Juan Quintela	4d65a6216b	migration: Make sure that we don't call write() in case of error If we are exiting due to an error/finish/.... Just don't try to even touch the channel with one IO operation. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2020-01-20 09:10:22 +01:00
Juan Quintela	d069bcca6c	multifd: Initialize local variable Fill everything with zero, so the padding fields are also initialized. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>	2020-01-20 09:08:53 +01:00
Marc-André Lureau	3cad405bab	vmstate: replace DeviceState with VMStateIf Replace DeviceState dependency with VMStateIf on vmstate API. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Acked-by: Halil Pasic <pasic@linux.ibm.com>	2020-01-06 18:41:32 +04:00
Paolo Bonzini	44901b5aff	colo: fix return without releasing RCU Use WITH_RCU_READ_LOCK_GUARD to avoid exiting colo_init_ram_cache without releasing RCU. Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-12-17 19:33:52 +01:00
Marc-André Lureau	e4f1bea2a8	migration: fix maybe-uninitialized warning ../migration/ram.c: In function ‘multifd_recv_thread’: /home/elmarco/src/qq/include/qapi/error.h:165:5: error: ‘block’ may be used uninitialized in this function [-Werror=maybe-uninitialized] 165 \| error_setg_internal((errp), __FILE__, __LINE__, __func__, \ \| ^~~~~~~~~~~~~~~~~~~ ../migration/ram.c:818:15: note: ‘block’ was declared here 818 \| RAMBlock *block; \| ^~~~~ Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-12-17 19:32:47 +01:00
Beata Michalska	bd108a44bc	migration: ram: Switch to ram block writeback Switch to ram block writeback for pmem migration. Signed-off-by: Beata Michalska <beata.michalska@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-id: 20191121000843.24844-4-beata.michalska@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-12-16 10:46:35 +00:00
Jens Freimann	284f42a520	net/virtio: fix dev_unplug_pending .dev_unplug_pending is set up by virtio-net code independent of whether failover support was set for the device or not. This gives a wrong result when we check for existing primary devices in migration code. Fix this by actually calling dev_unplug_pending() instead of just checking if the function pointer was set. When the feature was not negotiated dev_unplug_pending() will always return false. This prevents us from going into the wait-unplug state when there's no primary device present. Fixes: `9711cd0dfc` ("net/virtio: add failover support") Signed-off-by: Jens Freimann <jfreimann@redhat.com> Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2019-11-25 23:30:28 +08:00
Jens Freimann	c7e0acd5a3	migration: add new migration state wait-unplug This patch adds a new migration state called wait-unplug. It is entered after the SETUP state if failover devices are present. It will transition into ACTIVE once all devices were successfully unplugged from the guest. So if a guest doesn't respond or takes long to honor the unplug request the user will see the migration state 'wait-unplug'. In the migration thread we query failover devices to check whether they are still pending the guest unplug. When all are unplugged the migration continues. If one device won't unplug migration will stay in wait_unplug state. Signed-off-by: Jens Freimann <jfreimann@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20191029114905.6856-9-jfreimann@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2019-10-29 18:55:26 -04:00
Wei Yang	038adc2f58	core: replace getpagesize() with qemu_real_host_page_size There are three page size in qemu: real host page size host page size target page size All of them have dedicate variable to represent. For the last two, we use the same form in the whole qemu project, while for the first one we use two forms: qemu_real_host_page_size and getpagesize(). qemu_real_host_page_size is defined to be a replacement of getpagesize(), so let it serve the role. [Note] Not fully tested for some arch or device. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20191013021145.16011-3-richardw.yang@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-10-26 15:38:06 +02:00
Vladimir Sementsov-Ogievskiy	ef9041a7b8	block/dirty-bitmap: refactor bdrv_dirty_bitmap_next bdrv_dirty_bitmap_next is always used in same pattern. So, split it into _next and _first, instead of combining two functions into one and add FOR_EACH_DIRTY_BITMAP macro. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 20190916141911.5255-5-vsementsov@virtuozzo.com Signed-off-by: John Snow <jsnow@redhat.com>	2019-10-17 17:02:32 -04:00
Vladimir Sementsov-Ogievskiy	5deb6cbd1f	block/dirty-bitmap: add bs link Add bs field to BdrvDirtyBitmap structure. Drop BlockDriverState parameter from bitmap APIs where possible. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 20190916141911.5255-3-vsementsov@virtuozzo.com [Rebased on top of block-copy. --js] Signed-off-by: John Snow <jsnow@redhat.com>	2019-10-17 17:02:32 -04:00
Eric Auger	9a85e4b8f6	migration: Support gtree migration Introduce support for GTree migration. A custom save/restore is implemented. Each item is made of a key and a data. If the key is a pointer to an object, 2 VMSDs are passed into the GTree VMStateField. When putting the items, the tree is traversed in sorted order by g_tree_foreach. On the get() path, gtrees must be allocated using the proper key compare, key destroy and value destroy. This must be handled beforehand, for example in a pre_load method. Tests are added to test save/dump of structs containing gtrees including the virtio-iommu domain/mappings scenario. Signed-off-by: Eric Auger <eric.auger@redhat.com> Message-Id: <20191011121724.433-1-eric.auger@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> uintptr_t fixup for test on 32bit	2019-10-11 17:52:31 +01:00
Wei Yang	aff66d2ef0	migration/multifd: pages->used would be cleared when attach to multifd_send_state When we found an available channel in multifd_send_pages(), its pages->used is cleared and then attached to multifd_send_state. It is not necessary to do this twice. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20191011085050.17622-5-richardw.yang@linux.intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-10-11 15:02:06 +01:00
Wei Yang	9985e1f48d	migration/multifd: initialize packet->magic/version once at setup stage MultiFDPacket_t's magic and version field never changes during migration, so move these two fields in setup stage. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20191011085050.17622-4-richardw.yang@linux.intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-10-11 15:02:00 +01:00
Wei Yang	f2148c4c79	migration/multifd: use pages->allocated instead of the static max multifd_send_fill_packet() prepares meta data for following pages to transfer. It would be more proper to fill pages->allocated instead of static max value, especially we want to support flexible packet size. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20191011085050.17622-3-richardw.yang@linux.intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-10-11 15:01:54 +01:00
Wei Yang	d884e77bfe	migration/multifd: fix a typo in comment of multifd_recv_unfill_packet() Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20191011085050.17622-2-richardw.yang@linux.intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-10-11 15:01:44 +01:00
Wei Yang	0197d89025	migration/postcopy: check PostcopyState before setting to POSTCOPY_INCOMING_RUNNING Currently, we set PostcopyState blindly to RUNNING, even if we find the previous state is not LISTENING. This will lead to a corner case. First let's look at the code flow: qemu_loadvm_state_main() ret = loadvm_process_command() loadvm_postcopy_handle_run() return -1; if (ret < 0) { if (postcopy_state_get() == POSTCOPY_INCOMING_RUNNING) ... } From the above snippet, the corner case is that loadvm_postcopy_handle_run() always sets state to RUNNING. And then it checks the previous state. If the previous state is not LISTENING, it will return -1. But at this moment, PostcopyState has already been set to RUNNING. Then ret is checked in qemu_loadvm_state_main(); when it is -1, PostcopyState is checked. Current logic would pause postcopy and retry if PostcopyState is RUNNING. This is not what we expect, because postcopy is not active yet. This patch makes sure state is set to RUNNING only if the previous state is LISTENING by checking the state first. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Suggested by: Peter Xu <peterx@redhat.com> Message-Id: <20191010011316.31363-3-richardw.yang@linux.intel.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-10-11 15:00:16 +01:00
Wei Yang	2a7eb14844	migration/postcopy: rename postcopy_ram_enable_notify to postcopy_ram_incoming_setup Function postcopy_ram_incoming_setup and postcopy_ram_incoming_cleanup is a pair. Rename to make it clear for audience. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20191010011316.31363-2-richardw.yang@linux.intel.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-10-11 14:59:58 +01:00
Wei Yang	2d49bacda0	migration/postcopy: postpone setting PostcopyState to END There are two places to call function postcopy_ram_incoming_cleanup() postcopy_ram_listen_thread on migration success loadvm_postcopy_handle_listen on setup failure On success, the vm will never accept another migration. On failure, PostcopyState is transited from LISTENING to END and would be checked in qemu_loadvm_state_main(). If PostcopyState is RUNNING, migration would be paused and retried. Currently PostcopyState is set to END in function postcopy_ram_incoming_cleanup(). With above analysis, we can take this step out and postpone this till the end of listen thread to indicate the listen thread is done. This is a preparation patch for later cleanup. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20191006000249.29926-3-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Fixed up in merge to the 1 parameter postcopy_state_set	2019-10-11 14:57:22 +01:00
Wei Yang	2a461c2467	migration/postcopy: mis->have_listen_thread check will never be touched If mis->have_listen_thread is true, this means current PostcopyState must be LISTENING or RUNNING. While the check at the beginning of the function makes sure the state transaction happens when its previous PostcopyState is ADVISE or DISCARD. This means we would never touch this check. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20191006000249.29926-2-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-10-11 14:53:30 +01:00
Wei Yang	4991f3091e	migration: report SaveStateEntry id and name on failure This provides helpful information on which entry failed. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20191005220517.24029-5-richardw.yang@linux.intel.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-10-11 14:31:39 +01:00
Wei Yang	17d9351bf2	migration: pass in_postcopy instead of check state again Not necessary to do the check again. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20191005220517.24029-4-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-10-11 14:31:27 +01:00
Wei Yang	da1725d3f9	migration/postcopy: fix typo in mark_postcopy_blocktime_begin's comment Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20191005220517.24029-3-richardw.yang@linux.intel.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-10-11 14:31:08 +01:00
Wei Yang	6629890d55	migration/postcopy: map large zero page in postcopy_ram_incoming_setup() postcopy_ram_incoming_setup() and postcopy_ram_incoming_cleanup() are counterpart. It is reasonable to map/unmap large zero page in these two functions respectively. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20191005135021.21721-3-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-10-11 14:28:49 +01:00
Wei Yang	3414322a83	migration/postcopy: allocate tmp_page in setup stage During migration, a tmp page is allocated so that we could place a whole host page during postcopy. Currently the page is allocated during load stage, this is a little bit late. And more important, if we failed to allocate it, the error is not checked properly. Even it is NULL, we would still use it. This patch moves the allocation to setup stage and if failed error message would be printed and caller would notice it. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-10-11 14:28:19 +01:00
Dr. David Alan Gilbert	fb14a42ade	migration: Don't try and recover return path in non-postcopy In normal precopy we can't do reconnection recovery - but we also don't need to, since you can just rerun migration. At the moment if the 'return-path' capability is on, we use the return path in precopy to give a positive 'OK' to the end of migration; however if migration fails then we fall into the postcopy recovery path and hang. This fixes it by only running the return path in the postcopy case. Reported-by: Greg Kurz <groug@kaod.org> Tested-by: Greg Kurz <groug@kaod.org> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-10-11 14:25:26 +01:00
Dr. David Alan Gilbert	987ab2a549	migration: Use automatic rcu_read unlock in rdma.c Use the automatic read unlocker in migration/rdma.c. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20191007143642.301445-5-dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-10-11 14:20:01 +01:00
Dr. David Alan Gilbert	89ac5a1d2a	migration: Use automatic rcu_read unlock in ram.c Use the automatic read unlocker in migration/ram.c Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20191007143642.301445-4-dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-10-11 14:20:00 +01:00
Dr. David Alan Gilbert	0e6ebd4877	migration: Fix missing rcu_read_unlock Use the automatic rcu_read unlocker to fix a missing unlock. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20191007143642.301445-3-dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-10-11 14:19:59 +01:00
Wei Yang	8f8d528e73	migration: use migration_is_active to represent active state Wrap the check into a function to make it easy to read. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190717005341.14140-1-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-10-11 14:18:13 +01:00
Dr. David Alan Gilbert	3748fef9b9	migration/postcopy: Recognise the recovery states as 'in_postcopy' Various parts of the migration code do different things when they're in postcopy mode; prior to this patch this has been 'postcopy-active'. This patch extends 'in_postcopy' to include 'postcopy-paused' and 'postcopy-recover'. In particular, when you set the max-postcopy-bandwidth parameter, this only affects the current migration fd if we're 'in_postcopy'; this leads to a race in the postcopy recovery test where it increases the speed from 4k/sec to unlimited, but that increase can get ignored if the change is made between the point at which the reconnection happens and it transitions back to active. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20190923174942.12182-1-dgilbert@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Tested-by: Alex Bennée <alex.bennee@linaro.org> Tested-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-09-25 15:51:19 +01:00
Dr. David Alan Gilbert	d46a4847ca	migration/rdma.c: Swap synchronize_rcu for call_rcu This fixes a deadlock that can occur on the migration source after a failed RDMA migration; as the source tries to cleanup it clears a pair of pointers and uses synchronize_rcu to wait; this is happening on the main thread. With the CPUs running a CPU thread can be an rcu reader and attempt to grab the main lock (kvm_handle_io->address_space_write->flatview_write->flatview_write_continue-> prepare_mmio_access->qemu_mutex_lock_iothread_impl) Replace the synchronize_rcu with a call_rcu to postpone the freeing. Fixes: `74637e6f08` ("migration: implement bi-directional RDMA QIOChannel") ( https://bugzilla.redhat.com/show_bug.cgi?id=1746787 ) Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20190913163507.1403-3-dgilbert@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-09-25 15:51:19 +01:00
Dr. David Alan Gilbert	de8434a35a	migration/rdma: Don't moan about disconnects at the end If we've already finished the migration or something has already gone wrong, don't moan about the migration stream disconnecting. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20190913163507.1403-2-dgilbert@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-09-25 15:51:19 +01:00
Wei Yang	64737606e8	migration: remove sent parameter in get_queued_page_not_dirty This is a cleanup for previous removal of unsentmap. The sent parameter is not necessary now. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190819061843.28642-4-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-09-25 15:51:19 +01:00
Wei Yang	1e7cf8c323	migration/postcopy: unsentmap is not necessary for postcopy Commit `f3f491fcd6` ('Postcopy: Maintain unsentmap') introduced unsentmap to track not yet sent pages. This is not necessary since: * unsentmap is a sub-set of bmap before postcopy start * unsentmap is the summation of bmap and unsentmap after canonicalizing This patch just removes it. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190819061843.28642-3-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-09-25 15:51:19 +01:00
Wei Yang	8324ef86f0	migration/postcopy: not necessary to do discard when canonicalizing bitmap All pages, either partially sent or partially dirty, will be discarded in postcopy_send_discard_bm_ram(), since we update the unsentmap to be unsentmap = unsentmap \| dirty in ram_postcopy_send_discard_bitmap(). This is not necessary to do discard when canonicalizing bitmap. And by doing so, we separate the page discard into two individual steps: * canonicalize bitmap * discard page Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190819061843.28642-2-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-09-25 15:51:19 +01:00
Marc-André Lureau	91490583f3	migration: fix vmdesc leak on vmstate_save() error Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20190912122514.22504-2-marcandre.lureau@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-09-25 15:51:19 +01:00
Nir Soffer	8972571509	block: Remove unused masks Replace confusing usage: ~BDRV_SECTOR_MASK With more clear: (BDRV_SECTOR_SIZE - 1) Remove BDRV_SECTOR_MASK and the unused BDRV_BLOCK_OFFSET_MASK which was its last user. Signed-off-by: Nir Soffer <nsoffer@redhat.com> Message-id: 20190827185913.27427-3-nsoffer@redhat.com Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-09-16 14:48:30 +02:00
Wei Yang	268dcd46ae	migration: fix one typo in comment of function migration_total_bytes() Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190912024957.11780-1-richardw.yang@linux.intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-09-12 11:25:06 +01:00
Wei Yang	1bf57fb3df	migration/qemu-file: fix potential buf waste for extra buf_index adjustment In add_to_iovec(), qemu_fflush() will be called if iovec is full. If this happens, buf_index is reset. Currently, this is not checked and buf_index would always be adjusted by buf size. This is not harmful, but will waste some space in file buffer. This patch makes add_to_iovec() return 1 when it has flushed the file. Then the caller could check the return value to see whether it is necessary to adjust the buf_index any more. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20190911132839.23336-3-richard.weiyang@gmail.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-09-12 11:23:32 +01:00
Wei Yang	89fe04b458	migration/qemu-file: remove check on writev_buffer in qemu_put_compression_data The check of writev_buffer is in qemu_fflush, which means it is not harmful if it is NULL. And removing it will make the code consistent since all other add_to_iovec() is called without the check. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20190911132839.23336-2-richard.weiyang@gmail.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-09-12 11:23:26 +01:00
Peter Xu	8504ddeca0	migration: Fix postcopy bw for recovery We've got max-postcopy-bandwidth parameter but it's not applied correctly after a postcopy recovery so the recovered migration stream will still eat the whole net bandwidth. Fix that up. Reported-by: Xiaohui Li <xiaohli@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20190906130103.20961-1-peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-09-12 11:21:25 +01:00
Yury Kotov	b9d68df62a	migration: Add validate-uuid capability This capability realizes simple source validation by UUID. It's useful for live migration between hosts. Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru> Message-Id: <20190903162246.18524-2-yury-kotov@yandex-team.ru> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-09-12 11:19:23 +01:00
Dr. David Alan Gilbert	3b34870672	qemu-file: Rework old qemu_fflush comment Commit `11808bb` removed the non-iovec based write support, the comment hung on. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20190823103946.7388-1-dgilbert@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-09-12 11:16:10 +01:00
Dr. David Alan Gilbert	ce62df5378	migration: register_savevm_live doesn't need dev Commit `78dd48df3` removed the last caller of register_savevm_live for an instantiable device (rather than a single system wide device); so trim out the parameter. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20190822115433.12070-1-dgilbert@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-09-12 11:15:03 +01:00
Wei Yang	cea3b4c083	migration: cleanup check on ops in savevm.handlers iterations During migration, savevm.handlers is iterated over in several places. On each iteration, we need to check an entry's ops and the related callbacks before invoking them. Generally, ops is the first element to check, and it only needs to be checked once. This patch cleans up the related parts of savevm.c so that ops is checked only once in those iterations. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190819032804.8579-1-richardw.yang@linux.intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-09-12 10:55:23 +01:00
Ivan Ren	2f4aefd320	migration: multifd_send_thread always post p->sem_sync when error happen When encountering an error, multifd_send_thread should always notify whoever is waiting on it before exiting. Otherwise it may block migration_thread at multifd_send_sync_main forever. The error looks as follows: ------------------------------------------------------------------------------- (gdb) bt #0 0x00007f4d669dfa0b in do_futex_wait.constprop.1 () from /lib64/libpthread.so.0 #1 0x00007f4d669dfa9f in __new_sem_wait_slow.constprop.0 () from /lib64/libpthread.so.0 #2 0x00007f4d669dfb3b in sem_wait@@GLIBC_2.2.5 () from /lib64/libpthread.so.0 #3 0x0000562ccf0a5614 in qemu_sem_wait (sem=sem@entry=0x562cd1b698e8) at util/qemu-thread-posix.c:319 #4 0x0000562ccecb4752 in multifd_send_sync_main (rs=<optimized out>) at /qemu/migration/ram.c:1099 #5 0x0000562ccecb95f4 in ram_save_iterate (f=0x562cd0ecc000, opaque=<optimized out>) at /qemu/migration/ram.c:3550 #6 0x0000562ccef43c23 in qemu_savevm_state_iterate (f=0x562cd0ecc000, postcopy=false) at migration/savevm.c:1189 #7 0x0000562ccef3dcf3 in migration_iteration_run (s=0x562cd09fabf0) at migration/migration.c:3131 #8 migration_thread (opaque=opaque@entry=0x562cd09fabf0) at migration/migration.c:3258 #9 0x0000562ccf0a4c26 in qemu_thread_start (args=<optimized out>) at util/qemu-thread-posix.c:502 #10 0x00007f4d669d9e25 in start_thread () from /lib64/libpthread.so.0 #11 0x00007f4d6670635d in clone () from /lib64/libc.so.6 (gdb) f 4 #4 0x0000562ccecb4752 in multifd_send_sync_main (rs=<optimized out>) at /qemu/migration/ram.c:1099 1099 qemu_sem_wait(&p->sem_sync); (gdb) list 1094 } 1095 for (i = 0; i < migrate_multifd_channels(); i++) { 1096 MultiFDSendParams *p = &multifd_send_state->params[i]; 1097 1098 trace_multifd_send_sync_main_wait(p->id); 1099 qemu_sem_wait(&p->sem_sync); 1100 } 1101 trace_multifd_send_sync_main(multifd_send_state->packet_num); 1102 } 1103 (gdb) p i $1 = 0 (gdb) p multifd_send_state->params[0].pending_job $2 = 2 //It means the job before MULTIFD_FLAG_SYNC has already failed (gdb) p multifd_send_state->params[0].quit $3 = true Signed-off-by: Ivan Ren <ivanren@tencent.com> Message-Id: <1567044996-2362-1-git-send-email-ivanren@tencent.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-09-12 10:53:33 +01:00
Juan Quintela	0705e56496	multifd: Use number of channels as listen backlog Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-09-03 23:24:42 +02:00
Juan Quintela	fc8135c630	socket: Add num connections to qio_net_listener_open_sync() Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2019-09-03 23:24:42 +02:00
Paolo Bonzini	9458a9a1df	memory: fix race between TCG and accesses to dirty bitmap There is a race between TCG and accesses to the dirty log: vCPU thread reader thread ----------------------- ----------------------- TLB check -> slow path notdirty_mem_write write to RAM set dirty flag clear dirty flag TLB check -> fast path read memory write to RAM Fortunately, in order to fix it, no change is required to the vCPU thread. However, the reader thread must delay the read after the vCPU thread has finished the write. This can be approximated conservatively by run_on_cpu, which waits for the end of the current translation block. A similar technique is used by KVM, which has to do a synchronous TLB flush after doing a test-and-clear of the dirty-page flags. Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-08-20 17:26:20 +02:00
John Snow	c4e4b0fa59	qapi: implement block-dirty-bitmap-remove transaction action It is used to do transactional movement of the bitmap (which is possible in conjunction with merge command). Transactional bitmap movement is needed in scenarios with external snapshot, when we don't want to leave copy of the bitmap in the base image. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20190708220502.12977-3-jsnow@redhat.com [Edited "since" version to 4.2 --js] Signed-off-by: John Snow <jsnow@redhat.com>	2019-08-16 16:28:03 -04:00
John Snow	28636b8211	block/dirty-bitmap: add bdrv_dirty_bitmap_get Add a public interface for get. While we're at it, rename "bdrv_get_dirty_bitmap_locked" to "bdrv_dirty_bitmap_get_locked". (There are more functions to rename to the bdrv_dirty_bitmap_VERB form, but they will wait until the conclusion of this series.) Signed-off-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20190709232550.10724-11-jsnow@redhat.com Signed-off-by: John Snow <jsnow@redhat.com>	2019-08-16 16:28:02 -04:00
Peter Maydell	95a9457fd4	Header cleanup patches for 2019-08-13 Merge remote-tracking branch 'remotes/armbru/tags/pull-include-2019-08-13-v2' into staging Header cleanup patches for 2019-08-13 # gpg: Signature made Fri 16 Aug 2019 12:39:12 BST # gpg: using RSA key 354BC8B3D7EB2A6B68674E5F3870B400EB918653 # gpg: issuer "armbru@redhat.com" # gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [full] # gpg: aka "Markus Armbruster <armbru@pond.sub.org>" [full] # Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653 * remotes/armbru/tags/pull-include-2019-08-13-v2: (29 commits) sysemu: Split sysemu/runstate.h off sysemu/sysemu.h sysemu: Move the VMChangeStateEntry typedef to qemu/typedefs.h Include sysemu/sysemu.h a lot less Clean up inclusion of sysemu/sysemu.h numa: Move remaining NUMA declarations from sysemu.h to numa.h Include sysemu/hostmem.h less numa: Don't include hw/boards.h into sysemu/numa.h Include hw/boards.h a bit less Include hw/qdev-properties.h less Include qemu/main-loop.h less Include qemu/queue.h slightly less Include hw/hw.h exactly where needed Include qom/object.h slightly less Include exec/memory.h slightly less Include migration/vmstate.h less migration: Move the VMStateDescription typedef to typedefs.h Clean up inclusion of exec/cpu-common.h Include hw/irq.h a lot less typedefs: Separate incomplete types and function types ide: Include hw/ide/internal a bit less outside hw/ide/ ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-08-16 14:53:43 +01:00
Markus Armbruster	54d31236b9	sysemu: Split sysemu/runstate.h off sysemu/sysemu.h sysemu/sysemu.h is a rather unfocused dumping ground for stuff related to the system-emulator. Evidence: * It's included widely: in my "build everything" tree, changing sysemu/sysemu.h still triggers a recompile of some 1100 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h, down from 5400 due to the previous two commits). * It pulls in more than a dozen additional headers. Split stuff related to run state management into its own header sysemu/runstate.h. Touching sysemu/sysemu.h now recompiles some 850 objects. qemu/uuid.h also drops from 1100 to 850, and qapi/qapi-types-run-state.h from 4400 to 4200. Touching new sysemu/runstate.h recompiles some 500 objects. Since I'm touching MAINTAINERS to add sysemu/runstate.h anyway, also add qemu/main-loop.h. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190812052359.30071-30-armbru@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> [Unbreak OS-X build]	2019-08-16 13:37:36 +02:00
Markus Armbruster	46517dd497	Include sysemu/sysemu.h a lot less In my "build everything" tree, changing sysemu/sysemu.h triggers a recompile of some 5400 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). hw/qdev-core.h includes sysemu/sysemu.h since recent commit `e965ffa70a` "qdev: add qdev_add_vm_change_state_handler()". This is a bad idea: hw/qdev-core.h is widely included. Move the declaration of qdev_add_vm_change_state_handler() to sysemu/sysemu.h, and drop the problematic include from hw/qdev-core.h. Touching sysemu/sysemu.h now recompiles some 1800 objects. qemu/uuid.h also drops from 5400 to 1800. A few more headers show smaller improvement: qemu/notify.h drops from 5600 to 5200, qemu/timer.h from 5600 to 4500, and qapi/qapi-types-run-state.h from 5500 to 5000. Cc: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20190812052359.30071-28-armbru@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>	2019-08-16 13:31:53 +02:00
Markus Armbruster	a27bd6c779	Include hw/qdev-properties.h less In my "build everything" tree, changing hw/qdev-properties.h triggers a recompile of some 2700 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). Many places including hw/qdev-properties.h (directly or via hw/qdev.h) actually need only hw/qdev-core.h. Include hw/qdev-core.h there instead. hw/qdev.h is actually pointless: all it does is include hw/qdev-core.h and hw/qdev-properties.h, which in turn includes hw/qdev-core.h. Replace the remaining uses of hw/qdev.h by hw/qdev-properties.h. While there, delete a few superfluous inclusions of hw/qdev-core.h. Touching hw/qdev-properties.h now recompiles some 1200 objects. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Daniel P. Berrangé" <berrange@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20190812052359.30071-22-armbru@redhat.com>	2019-08-16 13:31:53 +02:00
Markus Armbruster	db72581598	Include qemu/main-loop.h less In my "build everything" tree, changing qemu/main-loop.h triggers a recompile of some 5600 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). It includes block/aio.h, which in turn includes qemu/event_notifier.h, qemu/notify.h, qemu/processor.h, qemu/qsp.h, qemu/queue.h, qemu/thread-posix.h, qemu/thread.h, qemu/timer.h, and a few more. Include qemu/main-loop.h only where it's needed. Touching it now recompiles only some 1700 objects. For block/aio.h and qemu/event_notifier.h, these numbers drop from 5600 to 2800. For the others, they shrink only slightly. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190812052359.30071-21-armbru@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>	2019-08-16 13:31:52 +02:00
Markus Armbruster	d484205210	Include exec/memory.h slightly less Drop unnecessary inclusions from headers. Downgrade a few more to exec/hwaddr.h. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20190812052359.30071-17-armbru@redhat.com>	2019-08-16 13:31:52 +02:00
Markus Armbruster	d645427057	Include migration/vmstate.h less In my "build everything" tree, changing migration/vmstate.h triggers a recompile of some 2700 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). hw/hw.h supposedly includes it for convenience. Several other headers include it just to get VMStateDescription. The previous commit made that unnecessary. Include migration/vmstate.h only where it's still needed. Touching it now recompiles only some 1600 objects. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Message-Id: <20190812052359.30071-16-armbru@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>	2019-08-16 13:31:52 +02:00
Markus Armbruster	6a0acfff99	Clean up inclusion of exec/cpu-common.h migration/qemu-file.h neglects to include it even though it needs ram_addr_t. Fix that. Drop a few superfluous inclusions elsewhere. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20190812052359.30071-14-armbru@redhat.com>	2019-08-16 13:31:52 +02:00
Juan Quintela	7dd59d01dd	migration: add some multifd traces Signed-off-by: Juan Quintela <quintela@redhat.com> Message-Id: <20190814020218.1868-6-quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Juan Quintela	18cdcea371	migration: Make global sem_sync semaphore by channel This makes things easier to debug: when you wait for all threads to arrive at that semaphore, you know which one you are waiting for. Signed-off-by: Juan Quintela <quintela@redhat.com> Message-Id: <20190814020218.1868-3-quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Juan Quintela	5558c91ae8	migration: Add traces for multifd terminate threads Signed-off-by: Juan Quintela <quintela@redhat.com> Message-Id: <20190814020218.1868-2-quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Marc-André Lureau	3170a6453b	qemu-file: move qemu_{get,put}_counted_string() declarations Move migration helpers for strings under include/, so they can be used outside of migration/ Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20190808150325.21939-2-marcandre.lureau@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Wei Yang	1ce542620a	migration/postcopy: use mis->bh instead of allocating a QEMUBH On the incoming side, migration quits in either precopy or postcopy. It is safe to use mis->bh for both instead of allocating a dedicated QEMUBH for postcopy. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20190805053146.32326-1-richardw.yang@linux.intel.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Wei Yang	7a3e957177	migration: rename migration_bitmap_sync_range to ramblock_sync_dirty_bitmap Rename for better understanding of the code. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190808033155.30162-1-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Ivan Ren	81507f6b7e	migration: update ram_counters for multifd sync packet Multifd sync will send MULTIFD_FLAG_SYNC flag info to destination, add these bytes to ram_counters record. Signed-off-by: Ivan Ren <ivanren@tencent.com> Suggested-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <1564464816-21804-4-git-send-email-ivanren@tencent.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Ivan Ren	1b81c974cc	migration: add speed limit for multifd migration Limit the speed of multifd migration through common speed limitation qemu file. Signed-off-by: Ivan Ren <ivanren@tencent.com> Message-Id: <1564464816-21804-3-git-send-email-ivanren@tencent.com> Reviewed-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Ivan Ren	5d7d255863	migration: add qemu_file_update_transfer interface Add qemu_file_update_transfer to just update bytes_xfer for speed limitation. This will be used by further migration features such as multifd migration. Signed-off-by: Ivan Ren <ivanren@tencent.com> Reviewed-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <1564464816-21804-2-git-send-email-ivanren@tencent.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Ivan Ren	87f3bd8717	migration: always initialise ram_counters for a new migration This patch fixes a multifd migration bug in the migration speed calculation. The problem can be reproduced as follows: 1. start a vm and apply heavy memory write stress so that the vm cannot be migrated to the destination successfully 2. begin a migration with multifd 3. migrate for a long time [actually, this can be measured by transferred bytes] 4. migrate cancel 5. begin a new migration with multifd; the migration will run directly into the migration_completion phase The reason is as follows: migration updates bandwidth and s->threshold_size in function migration_update_counters after BUFFER_DELAY time: current_bytes = migration_total_bytes(s); transferred = current_bytes - s->iteration_initial_bytes; time_spent = current_time - s->iteration_start_time; bandwidth = (double)transferred / time_spent; s->threshold_size = bandwidth * s->parameters.downtime_limit; In multifd migration, the migration_total_bytes function returns qemu_ftell(s->to_dst_file) + ram_counters.multifd_bytes. s->iteration_initial_bytes is initialized to 0 at every new migration, but ram_counters is a global variable, so data from previous migrations accumulates. So if ram_counters.multifd_bytes is big enough, it may cause pending_size >= s->threshold_size to become false in migration_iteration_run after the first migration_update_counters. Signed-off-by: Ivan Ren <ivanren@tencent.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Wei Yang <richardw.yang@linux.intel.com> Suggested-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <1564741121-1840-1-git-send-email-ivanren@tencent.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Wei Yang	14adf288d3	migration: remove unused field bytes_xfer MigrationState->bytes_xfer is only set to 0 in migrate_init(). Remove this unnecessary field. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190402003106.17614-1-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Wei Yang	9dec3cc3f4	migration/postcopy: use QEMU_IS_ALIGNED to replace host_offset Use QEMU_IS_ALIGNED for the check, it would be more consistent with other align calculations. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190806004648.8659-3-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Wei Yang	dad45ab2be	migration/postcopy: simplify calculation of run_start and fixup_start_addr The purpose of the calculation is to find a HostPage which is partially dirty. * fixup_start_addr points to the start of the HostPage to discard * run_start points to the next HostPage to check While in the middle stage, there would be two cases for run_start: * aligned with HostPage means this is not partially dirty * not aligned means this is partially dirty When it is aligned, no work and calculation is necessary. run_start already points to the start of next HostPage and is ready to continue. When it is not aligned, the calculation could be simplified with: * fixup_start_addr = QEMU_ALIGN_DOWN(run_start, host_ratio) * run_start = QEMU_ALIGN_UP(run_start, host_ratio) By doing so, run_start always points to the next HostPage to check. fixup_start_addr always points to the HostPage to discard. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190806004648.8659-2-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Wei Yang	810cf2bbd4	migration/postcopy: make PostcopyDiscardState a static variable In postcopy-ram.c, we provide three functions to discard certain RAMBlock range: * postcopy_discard_send_init() * postcopy_discard_send_range() * postcopy_discard_send_finish() Currently, we allocate/deallocate PostcopyDiscardState for each RAMBlock on sending discard information to destination. This is not necessary and the same data area could be reused for each RAMBlock. This patch defines PostcopyDiscardState a static variable. By doing so: 1) avoid memory allocation and deallocation to the system 2) avoid potential failure of memory allocation 3) hide some details for their users Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190724010721.2146-1-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Wei Yang	10da4a3689	migration: extract ram_load_precopy After this cleanup, it is clear to the reader that ram_load has two cases: * precopy * postcopy It is also no longer necessary to check postcopy_running on each iteration for precopy. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20190725002023.2335-3-richardw.yang@linux.intel.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Wei Yang	be4a1a1b6f	migration: return -EINVAL directly when version_id mismatch It is not reasonable to continue when the version_id does not match. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190722075339.25121-2-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Wei Yang	4695ce3fdc	migration: equation is more proper than and to check LOADVM_QUIT LOADVM_QUIT allows a command to quit all layers of nested loadvm loops, but the current return value check is not quite correct even though it works now. The current check "ret & LOADVM_QUIT" is true whenever bit[0] is 1. That includes ret == -1, which is used to indicate an error in handling a command. Since only one place returns LOADVM_QUIT and it is never combined with other return values, "ret == LOADVM_QUIT" is more appropriate. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190718064257.29218-1-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Wei Yang	5d0980a459	migration: just pass RAMBlock is enough RAMBlock->used_length is always passed to migration_bitmap_sync_range(), which could be retrieved from RAMBlock. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190718012547.16373-1-richardw.yang@linux.intel.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Wei Yang	6a88eb2b08	migration: use migration_in_postcopy() to check POSTCOPY_ACTIVE Use common helper function to check the state. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190719071129.11880-1-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Wei Yang	52aec70923	migration/postcopy: start_postcopy could be true only when migrate_postcopy() return true There is only one place that sets start_postcopy to true, qmp_migrate_start_postcopy(), which makes sure start_postcopy can be set to true only when migrate_postcopy() returns true. So start_postcopy being true implies the other condition. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190718083747.5859-1-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Wei Yang	305b6f8431	migration/postcopy: PostcopyState is already set in loadvm_postcopy_handle_advise() PostcopyState is already set to ADVISE at the beginning of loadvm_postcopy_handle_advise(). Remove the redundant set. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190711080816.6405-1-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Wei Yang	e326767b45	migration/savevm: move non SaveStateEntry condition check out of iteration in_postcopy and iterable_only are not SaveStateEntry specific, it would be more proper to check them out of iteration. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190709140924.13291-4-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Wei Yang	622a80c955	migration/savevm: split qemu_savevm_state_complete_precopy() into two parts This is a preparation patch for further cleanup. No functional change; just wrap the two major parts of qemu_savevm_state_complete_precopy() into functions. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190709140924.13291-3-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Wei Yang	4e455d51ef	migration/savevm: flush file for iterable_only case It would be proper to flush file even for iterable_only case. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190709140924.13291-2-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Wei Yang	8996604fe6	migration/postcopy: do_fixup is true when host_offset is non-zero This means it is not necessary to spare an extra variable to hold this condition. Using host_offset directly is fine. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190710050814.31344-3-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Wei Yang	e927a03317	migration/postcopy: reduce one operation to calculate fixup_start_addr Use the same way for run_end to calculate run_start, which saves one operation. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190710050814.31344-2-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00
Wei Yang	a162b572e9	migration/postcopy: discard_length must not be 0 Since we break the loop when there is no more page to discard, we are sure the following process would find some page to discard. It is not necessary to check it again. Signed-off-by: Wei Yang <richardw.yang@linux.intel.com> Message-Id: <20190627020822.15485-4-richardw.yang@linux.intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2019-08-14 17:33:14 +01:00

1290 Commits