qemu-e2k

Commit Graph

Author	SHA1	Message	Date
Eric Blake	5d39f44d7a	migration: Minor control flow simplification No need to declare a temporary variable. Suggested-by: Juan Quintela <quintela@redhat.com> Fixes: 1df36e8c6289 ("migration: Handle block device inactivation failures better") Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-04-24 15:01:46 +02:00
Juan Quintela	b02c7fc9ef	migration: Pass migrate_caps_check() the old and new caps We used to pass the old capabilities array and the new capabilities as a list. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>	2023-04-24 11:29:02 +02:00
Juan Quintela	0cec2056ff	migration: rename enabled_capabilities to capabilities It is clear from the context what that means, and such a long name with the extra long names of the capabilities make very difficilut to stay inside the 80 columns limit. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>	2023-04-24 11:29:01 +02:00
Eric Blake	403d18ae38	migration: Handle block device inactivation failures better Consider what happens when performing a migration between two host machines connected to an NFS server serving multiple block devices to the guest, when the NFS server becomes unavailable. The migration attempts to inactivate all block devices on the source (a necessary step before the destination can take over); but if the NFS server is non-responsive, the attempt to inactivate can itself fail. When that happens, the destination fails to get the migrated guest (good, because the source wasn't able to flush everything properly): (qemu) qemu-kvm: load of migration failed: Input/output error at which point, our only hope for the guest is for the source to take back control. With the current code base, the host outputs a message, but then appears to resume: (qemu) qemu-kvm: qemu_savevm_state_complete_precopy_non_iterable: bdrv_inactivate_all() failed (-1) (src qemu)info status VM status: running but a second migration attempt now asserts: (src qemu) qemu-kvm: ../block.c:6738: int bdrv_inactivate_recurse(BlockDriverState *): Assertion `!(bs->open_flags & BDRV_O_INACTIVE)' failed. Whether the guest is recoverable on the source after the first failure is debatable, but what we do not want is to have qemu itself fail due to an assertion. It looks like the problem is as follows: In migration.c:migration_completion(), the source sets 'inactivate' to true (since COLO is not enabled), then tries savevm.c:qemu_savevm_state_complete_precopy() with a request to inactivate block devices. In turn, this calls block.c:bdrv_inactivate_all(), which fails when flushing runs up against the non-responsive NFS server. With savevm failing, we are now left in a state where some, but not all, of the block devices have been inactivated; but migration_completion() then jumps to 'fail' rather than 'fail_invalidate' and skips an attempt to reclaim those those disks by calling bdrv_activate_all(). Even if we do attempt to reclaim disks, we aren't taking note of failure there, either. Thus, we have reached a state where the migration engine has forgotten all state about whether a block device is inactive, because we did not set s->block_inactive in enough places; so migration allows the source to reach vm_start() and resume execution, violating the block layer invariant that the guest CPUs should not be restarted while a device is inactive. Note that the code in migration.c:migrate_fd_cancel() will also try to reactivate all block devices if s->block_inactive was set, but because we failed to set that flag after the first failure, the source assumes it has reclaimed all devices, even though it still has remaining inactivated devices and does not try again. Normally, qmp_cont() will also try to reactivate all disks (or correctly fail if the disks are not reclaimable because NFS is not yet back up), but the auto-resumption of the source after a migration failure does not go through qmp_cont(). And because we have left the block layer in an inconsistent state with devices still inactivated, the later migration attempt is hitting the assertion failure. Since it is important to not resume the source with inactive disks, this patch marks s->block_inactive before attempting inactivation, rather than after succeeding, in order to prevent any vm_start() until it has successfully reactivated all devices. See also https://bugzilla.redhat.com/show_bug.cgi?id=2058982 Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Acked-by: Lukas Straub <lukasstraub2@web.de> Tested-by: Lukas Straub <lukasstraub2@web.de> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-04-24 11:29:00 +02:00
Juan Quintela	8c0cda8fa0	migration: Rename normal to normal_pages Rest of counters that refer to pages has a _pages suffix. And historically, this showed the number of full pages transferred. The name "normal" refered to the fact that they were sent without any optimization (compression, xbzrle, zero_page, ...). Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com>	2023-04-24 11:29:00 +02:00
Juan Quintela	1a386e8de5	migration: Rename duplicate to zero_pages Rest of counters that refer to pages has a _pages suffix. And historically, this showed the number of pages composed of the same character, here comes the name "duplicated". But since years ago, it refers to the number of zero_pages. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com>	2023-04-24 11:28:59 +02:00
Juan Quintela	3c764f9b2b	migration: Make postcopy_requests atomic Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com>	2023-04-24 11:28:59 +02:00
Juan Quintela	536b5a4e56	migration: Make dirty_sync_count atomic Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com>	2023-04-24 11:28:58 +02:00
Juan Quintela	296a4ac2aa	migration: Make downtime_bytes atomic Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com>	2023-04-24 11:28:58 +02:00
Juan Quintela	b013b5d1f3	migration: Make precopy_bytes atomic Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com>	2023-04-24 11:28:58 +02:00
Juan Quintela	4291823694	migration: Make dirty_sync_missed_zero_copy atomic Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com>	2023-04-24 11:28:57 +02:00
Juan Quintela	cf671116fa	migration: Make multifd_bytes atomic In the spirit of: commit 394d323bc3451e4d07f13341cb8817fac8dfbadd Author: Peter Xu <peterx@redhat.com> Date: Tue Oct 11 17:55:51 2022 -0400 migration: Use atomic ops properly for page accountings Reviewed-by: David Edmondson <david.edmondson@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-04-24 11:28:57 +02:00
Juan Quintela	abce5fa16d	migration: Merge ram_counters and ram_atomic_counters Using MgrationStats as type for ram_counters mean that we didn't have to re-declare each value in another struct. The need of atomic counters have make us to create MigrationAtomicStats for this atomic counters. Create RAMStats type which is a merge of MigrationStats and MigrationAtomicStats removing unused members. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> --- Fix typos found by David Edmondson	2023-04-24 11:28:56 +02:00
Peter Xu	06064a6715	migration: Recover behavior of preempt channel creation for pre-7.2 In 8.0 devel window we reworked preempt channel creation, so that there'll be no race condition when the migration channel and preempt channel got established in the wrong order in commit `5655aab079`. However no one noticed that the change will also be not compatible with older qemus, majorly 7.1/7.2 versions where preempt mode started to be supported. Leverage the same pre-7.2 flag introduced in the previous patch to recover the behavior hopefully before 8.0 releases, so we don't break migration when we migrate from 8.0 to older qemu binaries. Fixes: `5655aab079` ("migration: Postpone postcopy preempt channel to be after main") Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-04-12 21:44:56 +02:00
Peter Xu	6621883f93	migration: Fix potential race on postcopy_qemufile_src postcopy_qemufile_src object should be owned by one thread, either the main thread (e.g. when at the beginning, or at the end of migration), or by the return path thread (when during a preempt enabled postcopy migration). If that's not the case the access to the object might be racy. postcopy_preempt_shutdown_file() can be potentially racy, because it's called at the end phase of migration on the main thread, however during which the return path thread hasn't yet been recycled; the recycle happens in await_return_path_close_on_source() which is after this point. It means, logically it's posslbe the main thread and the return path thread are both operating on the same qemufile. While I don't think qemufile is thread safe at all. postcopy_preempt_shutdown_file() used to be needed because that's where we send EOS to dest so that dest can safely shutdown the preempt thread. To avoid the possible race, remove this only place that a race can happen. Instead we figure out another way to safely close the preempt thread on dest. The core idea during postcopy on deciding "when to stop" is that dest will send a postcopy SHUT message to src, telling src that all data is there. Hence to shut the dest preempt thread maybe better to do it directly on dest node. This patch proposed such a way that we change postcopy_prio_thread_created into PreemptThreadStatus, so that we kick the preempt thread on dest qemu by a sequence of: mis->preempt_thread_status = PREEMPT_THREAD_QUIT; qemu_file_shutdown(mis->postcopy_qemufile_dst); While here shutdown() is probably so far the easiest way to kick preempt thread from a blocked qemu_get_be64(). Then it reads preempt_thread_status to make sure it's not a network failure but a willingness to quit the thread. We could have avoided that extra status but just rely on migration status. The problem is postcopy_ram_incoming_cleanup() is just called early enough so we're still during POSTCOPY_ACTIVE no matter what.. So just make it simple to have the status introduced. One flag x-preempt-pre-7-2 is added to keep old pre-7.2 behaviors of postcopy preempt. Fixes: `9358982744` ("migration: Send requested page directly in rp-return thread") Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-04-12 21:44:38 +02:00
Juan Quintela	24beea4efe	migration: Rename res_{postcopy,precopy}_only Once that res_compatible is removed, they don't make sense anymore. We remove the _only preffix. And to make things clearer we rename them to must_precopy and can_postcopy. Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-02-15 20:04:30 +01:00
Juan Quintela	24f254ed79	migration: Remove unused res_compatible Nothing assigns to it after previous commit. Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-02-15 20:04:30 +01:00
Leonardo Bras	cfc3bcf373	migration/multifd: Move load_cleanup inside incoming_state_destroy Currently running migration_incoming_state_destroy() without first running multifd_load_cleanup() will cause a yank error: qemu-system-x86_64: ../util/yank.c:107: yank_unregister_instance: Assertion `QLIST_EMPTY(&entry->yankfns)' failed. (core dumped) The above error happens in the target host, when multifd is being used for precopy, and then postcopy is triggered and the migration finishes. This will crash the VM in the target host. To avoid that, move multifd_load_cleanup() inside migration_incoming_state_destroy(), so that the load cleanup becomes part of the incoming state destroying process. Running multifd_load_cleanup() twice can become an issue, though, but the only scenario it could be ran twice is on process_incoming_migration_bh(). So removing this extra call is necessary. On the other hand, this multifd_load_cleanup() call happens way before the migration_incoming_state_destroy() and having this happening before dirty_bitmap_mig_before_vm_start() and vm_start() may be a need. So introduce a new function multifd_load_shutdown() that will mainly stop all multifd threads and close their QIOChannels. Then use this function instead of multifd_load_cleanup() to make sure nothing else is received before dirty_bitmap_mig_before_vm_start(). Fixes: `b5eea99ec2` ("migration: Add yank feature") Reported-by: Li Xiaohui <xiaohli@redhat.com> Signed-off-by: Leonardo Bras <leobras@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-02-13 03:45:40 +01:00
Leonardo Bras	e5bac1f525	migration/multifd: Change multifd_load_cleanup() signature and usage Since it's introduction in commit `f986c3d256` ("migration: Create multifd migration threads"), multifd_load_cleanup() never returned any value different than 0, neither set up any error on errp. Even though, on process_incoming_migration_bh() an if clause uses it's return value to decide on setting autostart = false, which will never happen. In order to simplify the codebase, change multifd_load_cleanup() signature to 'void multifd_load_cleanup(void)', and for every usage remove error handling or decision made based on return value != 0. Fixes: `b5eea99ec2` ("migration: Add yank feature") Reported-by: Li Xiaohui <xiaohli@redhat.com> Signed-off-by: Leonardo Bras <leobras@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-02-13 03:44:44 +01:00
Peter Xu	5655aab079	migration: Postpone postcopy preempt channel to be after main Postcopy with preempt-mode enabled needs two channels to communicate. The order of channel establishment is not guaranteed. It can happen that the dest QEMU got the preempt channel connection request before the main channel is established, then the migration may make no progress even during precopy due to the wrong order. To fix it, create the preempt channel only if we know the main channel is established. For a general postcopy migration, we delay it until postcopy_start(), that's where we already went through some part of precopy on the main channel. To make sure dest QEMU has already established the channel, we wait until we got the first PONG received. That's something we do at the start of precopy when postcopy enabled so it's guaranteed to happen sooner or later. For a postcopy recovery, we delay it to qemu_savevm_state_resume_prepare() where we'll have round trips of data on bitmap synchronizations, which means the main channel must have been established. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-02-11 16:51:09 +01:00
Peter Xu	b28fb58227	migration: Add a semaphore to count PONGs This is mostly useless, but useful for us to know whether the main channel is correctly established without changing the migration protocol. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-02-11 16:51:09 +01:00
Peter Xu	fc063a7b8a	migration: Cleanup postcopy_preempt_setup() Since we just dropped the only case where postcopy_preempt_setup() can return an error, it doesn't need a retval anymore because it never fails. Move the preempt check to the caller, preparing it to be used elsewhere to do nothing but as simple as kicking the async connection. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-02-11 16:51:09 +01:00
Peter Xu	d6f74fd12e	migration: Rework multi-channel checks on URI The whole idea of multi-channel checks was not properly done, IMHO. Currently we check multi-channel in a lot of places, but actually that's not needed because we only need to check it right after we get the URI and that should be it. If the URI check succeeded, we should never need to check it again because we must have it. If it check fails, we should fail immediately on either the qmp_migrate or qmp_migrate_incoming, instead of failingg it later after the connection established. Neither should we fail any set capabiliities like what we used to do here: `5ad15e8614` ("migration: allow enabling mutilfd for specific protocol only", 2021-10-19) Because logically the URI will only be set later after the capability is set, so it doesn't make a lot of sense to check the URI type when setting the capability, because we're checking the cap with an old URI passed in, and that may not even be the URI we're going to use later. This patch mostly reverted all such checks for before, dropping the variable migrate_allow_multi_channels and helpers. Instead, add a common helper to check URI for multi-channels for either qmp_migrate and qmp_migrate_incoming and that should do all the proper checks. The failure will only trigger with the "migrate" or "migrate_incoming" command, or when user specified "-incoming xxx" where "xxx" is not "defer". Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-02-11 16:51:09 +01:00
Jiang Jiacheng	1b1f4ab69c	migration: save/delete migration thread info To support query migration thread infomation, save and delete thread(live_migration and multifdsend) information at thread creation and finish. Signed-off-by: Jiang Jiacheng <jiangjiacheng@huawei.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-02-06 19:22:57 +01:00
manish.mishra	6720c2b327	migration: check magic value for deciding the mapping of channels Current logic assumes that channel connections on the destination side are always established in the same order as the source and the first one will always be the main channel followed by the multifid or post-copy preemption channel. This may not be always true, as even if a channel has a connection established on the source side it can be in the pending state on the destination side and a newer connection can be established first. Basically causing out of order mapping of channels on the destination side. Currently, all channels except post-copy preempt send a magic number, this patch uses that magic number to decide the type of channel. This logic is applicable only for precopy(multifd) live migration, as mentioned, the post-copy preempt channel does not send any magic number. Also, tls live migrations already does tls handshake before creating other channels, so this issue is not possible with tls, hence this logic is avoided for tls live migrations. This patch uses read peek to check the magic number of channels so that current data/control stream management remains un-effected. Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Daniel P. Berrange <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Suggested-by: Daniel P. Berrange <berrange@redhat.com> Signed-off-by: manish.mishra <manish.mishra@nutanix.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-02-06 19:22:57 +01:00
Peter Xu	db18dee7d7	migration: Show downtime during postcopy phase The downtime should be displayed during postcopy phase because the switchover phase is done. OTOH it's weird to show "expected downtime" which can confuse what does that mean if the switchover has already happened anyway. This is a slight ABI change on QMP, but I assume it shouldn't affect anyone. Reviewed-by: Leonardo Bras <leobras@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-02-06 19:22:56 +01:00
David Hildenbrand	80fe315c38	migration/ram: Factor out check for advised postcopy Let's factor out this check, to be used in virtio-mem context next. While at it, fix a spelling error in a related comment. Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com>S Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-02-06 19:22:56 +01:00
David Hildenbrand	e3bf5e68e2	migration/savevm: Prepare vmdesc json writer in qemu_savevm_state_setup() ... and store it in the migration state. This is a preparation for storing selected vmds's already in qemu_savevm_state_setup(). Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-02-06 19:22:56 +01:00
Juan Quintela	d9df92925e	migration: simplify migration_iteration_run() Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2023-02-06 19:22:56 +01:00
Juan Quintela	fd70385d38	migration: Remove unused threshold_size parameter Until previous commit, save_live_pending() was used for ram. Now with the split into state_pending_estimate() and state_pending_exact() it is not needed anymore, so remove them. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2023-02-06 19:22:56 +01:00
Juan Quintela	c8df4a7aef	migration: Split save_live_pending() into state_pending_* We split the function into to: - state_pending_estimate: We estimate the remaining state size without stopping the machine. - state pending_exact: We calculate the exact amount of remaining state. The only "device" that implements different functions for _estimate() and _exact() is ram. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2023-02-06 19:22:56 +01:00
Juan Quintela	255dc7af7e	migration: No save_live_pending() method uses the QEMUFile parameter So remove it everywhere. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2023-02-06 19:22:56 +01:00
Markus Armbruster	27be86351e	migration: Move the QMP command from monitor/ to migration/ This moves the command from MAINTAINERS sections "Human Monitor (HMP)" and "QMP" to "Migration". Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20230124121946.1139465-19-armbru@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com>	2023-02-04 07:56:54 +01:00
Peter Maydell	928eac9539	Migration patches for 8.0 Hi This are the patches that I had to drop form the last PULL request because they werent fixes: - AVX2 is dropped, intel posted a fix, I have to redo it - Fix for out of order channels is out Daniel nacked it and I need to redo it -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEGJn/jt6/WMzuA0uC9IfvGFhy1yMFAmOa6xUACgkQ9IfvGFhy 1yP13BAAj4GdlWCqgvv98qIf9dY5WjvrbzL+8qdUvt7VIsDgh18amjlBmvvBngmd tssPHqLTqs6CXYxo4PBwKsvhA1qBCg9Fr+RtMTJG4FoumFdeO/l4tcXs99Ww5o9p OnrMAshTRHMRapvvX0vIiR0dGUPXs6KOz2JLNX1oF5ZY1yqskLxp9x3ydL7iw2oN GikRUfd4bG8drvhrKl6WPZOMKt0fVRH/2j0TqKPtl/hh/F4Ie6AUSI7McYMwOeXx xUhFcm2PKY5US6uYhZpKo7envCmuxreZSAH/eRrlu5uNCCOKaZ9uWYwACMJGpfrB SqY5dCTDpfFoaOloFEOYDfWOwoCJl5u9vNwRK1ArSVCfjczq50itswFTQ3A/hyd2 1noMv60XcR3An3mUydQ3j/C+hfE3KVXdFPImOKjPrn8zU6f2Dfug3ALXiHi1xyov ZdpcZjCEhdSruYxIdlIKfzlYLy8R1G4mSFrBV3NuMrywlM2fWQgyCUAYwzRwQrJw oBiedgpNP/MCM4NPQKLpvz/sci6nxkrGV8QX44zg0LdViXkpCU5ZiaoPXQcbiQCC Xkkah3GLbVt6788qKja2U9ccdofAe5yUbjo6XYxdbXC7y9mSyvBS9FCHvWr4HY/8 TUavGrcjKqQ31WxiyWw5CEi/hqNftFUNtWmEzZuAjRwM2cw89sU= =zGNB -----END PGP SIGNATURE----- Merge tag 'next-8.0-pull-request' of https://gitlab.com/juan.quintela/qemu into staging Migration patches for 8.0 Hi This are the patches that I had to drop form the last PULL request because they werent fixes: - AVX2 is dropped, intel posted a fix, I have to redo it - Fix for out of order channels is out Daniel nacked it and I need to redo it # gpg: Signature made Thu 15 Dec 2022 09:38:29 GMT # gpg: using RSA key 1899FF8EDEBF58CCEE034B82F487EF185872D723 # gpg: Good signature from "Juan Quintela <quintela@redhat.com>" [full] # gpg: aka "Juan Quintela <quintela@trasno.org>" [full] # Primary key fingerprint: 1899 FF8E DEBF 58CC EE03 4B82 F487 EF18 5872 D723 * tag 'next-8.0-pull-request' of https://gitlab.com/juan.quintela/qemu: migration: Drop rs->f migration: Remove old preempt code around state maintainance migration: Send requested page directly in rp-return thread migration: Move last_sent_block into PageSearchStatus migration: Make PageSearchStatus part of RAMState migration: Add pss_init() migration: Introduce pss_channel migration: Teach PSS about host page migration: Use atomic ops properly for page accountings migration: Yield bitmap_mutex properly when sending/sleeping migration: Remove RAMState.f references in compression code migration: Trivial cleanup save_page_header() on same block check migration: Cleanup xbzrle zero page cache update logic migration: Add postcopy_preempt_active() migration: Take bitmap mutex when completing ram migration migration: Export ram_release_page() migration: Export ram_transferred_ram() multifd: Create page_count fields into both MultiFD{Recv,Send}Params multifd: Create page_size fields into both MultiFD{Recv,Send}Params Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2022-12-15 14:52:13 +00:00
Peter Xu	b062106d3a	migration: Remove old preempt code around state maintainance With the new code to send pages in rp-return thread, there's little help to keep lots of the old code on maintaining the preempt state in migration thread, because the new way should always be faster.. Then if we'll always send pages in the rp-return thread anyway, we don't need those logic to maintain preempt state anymore because now we serialize things using the mutex directly instead of using those fields. It's very unfortunate to have those code for a short period, but that's still one intermediate step that we noticed the next bottleneck on the migration thread. Now what we can do best is to drop unnecessary code as long as the new code is stable to reduce the burden. It's actually a good thing because the new "sending page in rp-return thread" model is (IMHO) even cleaner and with better performance. Remove the old code that was responsible for maintaining preempt states, at the meantime also remove x-postcopy-preempt-break-huge parameter because with concurrent sender threads we don't really need to break-huge anymore. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2022-12-15 10:30:37 +01:00
Peter Xu	9358982744	migration: Send requested page directly in rp-return thread With all the facilities ready, send the requested page directly in the rp-return thread rather than queuing it in the request queue, if and only if postcopy preempt is enabled. It can achieve so because it uses separate channel for sending urgent pages. The only shared data is bitmap and it's protected by the bitmap_mutex. Note that since we're moving the ownership of the urgent channel from the migration thread to rp thread it also means the rp thread is responsible for managing the qemufile, e.g. properly close it when pausing migration happens. For this, let migration_release_from_dst_file to cover shutdown of the urgent channel too, renaming it as migration_release_dst_files() to better show what it does. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2022-12-15 10:30:37 +01:00
Peter Xu	23b7576d78	migration: Use atomic ops properly for page accountings To prepare for thread-safety on page accountings, at least below counters need to be accessed only atomically, they are: ram_counters.transferred ram_counters.duplicate ram_counters.normal ram_counters.postcopy_bytes There are a lot of other counters but they won't be accessed outside migration thread, then they're still safe to be accessed without atomic ops. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2022-12-15 10:30:37 +01:00
Markus Armbruster	720a252c26	qapi migration: Elide redundant has_FOO in generated C The has_FOO for pointer-valued FOO are redundant, except for arrays. They are also a nuisance to work with. Recent commit "qapi: Start to elide redundant has_FOO in generated C" provided the means to elide them step by step. This is the step for qapi/migration.json. Said commit explains the transformation in more detail. The invariant violations mentioned there do not occur here. Cc: Juan Quintela <quintela@redhat.com> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20221104160712.3005652-17-armbru@redhat.com>	2022-12-14 20:04:47 +01:00
Peter Xu	6f39c90b86	migration: Disable multifd explicitly with compression Multifd thread model does not work for compression, explicitly disable it. Note that previuosly even we can enable both of them, nothing will go wrong, because the compression code has higher priority so multifd feature will just be ignored. Now we'll fail even earlier at config time so the user should be aware of the consequence better. Note that there can be a slight chance of breaking existing users, but let's assume they're not majority and not serious users, or they should have found that multifd is not working already. With that, we can safely drop the check in ram_save_target_page() for using multifd, because when multifd=on then compression=off, then the removed check on save_page_use_compression() will also always return false too. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2022-11-21 11:58:10 +01:00
Peter Xu	afed4273b5	migration: Disallow postcopy preempt to be used with compress The preempt mode requires the capability to assign channel for each of the page, while the compression logic will currently assign pages to different compress thread/local-channel so potentially they're incompatible. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2022-11-21 11:58:10 +01:00
Marc-André Lureau	38e8f9af08	migration: add missing coroutine_fn annotations Callers of coroutine_fn must be coroutine_fn themselves, or the call must be within "if (qemu_in_coroutine())". Apply coroutine_fn to functions where this holds. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Alberto Faria <afaria@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20220922084924.201610-26-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2022-10-07 12:11:41 +02:00
Peter Maydell	ead34f64f9	migration: Assert that migrate_multifd_compression() returns an in-range value Coverity complains that when we use the return value from migrate_multifd_compression() as an array index: multifd_recv_state->ops = multifd_ops[migrate_multifd_compression()]; that this might overrun the array (which is declared to have size MULTIFD_COMPRESSION__MAX). This is because the function return type is MultiFDCompression, which is an autogenerated enum. The code generator includes the "one greater than the maximum possible value" MULTIFD_COMPRESSION__MAX in the enum, even though this is not actually a valid value for the enum, and this makes Coverity think that migrate_multifd_compression() could return that __MAX value and index off the end of the array. Suppress the Coverity error by asserting that the value we're going to return is within range. Resolves: Coverity CID 1487239, 1487254 Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <20220721115207.729615-2-peter.maydell@linaro.org> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2022-08-02 16:46:52 +01:00
Leonardo Bras	df67aa3e61	migration: add remaining params->has_* = true in migration_instance_init() Some of params->has_* = true are missing in migration_instance_init, this causes migrate_params_check() to skip some tests, allowing some unsupported scenarios. Fix this by adding all missing params->has_* = true in migration_instance_init(). Fixes: `69ef1f36b0` ("migration: define 'tls-creds' and 'tls-hostname' migration parameters") Fixes: `1d58872a91` ("migration: do not wait for free thread") Fixes: `d2f1d29b95` ("migration: add support for a "tls-authz" migration parameter") Signed-off-by: Leonardo Bras <leobras@redhat.com> Message-Id: <20220726010235.342927-1-leobras@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2022-08-02 16:46:52 +01:00
Leonardo Bras	90eb69e4f1	migration: Avoid false-positive on non-supported scenarios for zero-copy-send Migration with zero-copy-send currently has it's limitations, as it can't be used with TLS nor any kind of compression. In such scenarios, it should output errors during parameter / capability setting. But currently there are some ways of setting this not-supported scenarios without printing the error message: !) For 'compression' capability, it works by enabling it together with zero-copy-send. This happens because the validity test for zero-copy uses the helper unction migrate_use_compression(), which check for compression presence in s->enabled_capabilities[MIGRATION_CAPABILITY_COMPRESS]. The point here is: the validity test happens before the capability gets enabled. If all of them get enabled together, this test will not return error. In order to fix that, replace migrate_use_compression() by directly testing the cap_list parameter migrate_caps_check(). 2) For features enabled by parameters such as TLS & 'multifd_compression', there was also a possibility of setting non-supported scenarios: setting zero-copy-send first, then setting the unsupported parameter. In order to fix that, also add a check for parameters conflicting with zero-copy-send on migrate_params_check(). 3) XBZRLE is also a compression capability, so it makes sense to also add it to the list of capabilities which are not supported with zero-copy-send. Fixes: `1abaec9a1b` ("migration: Change zero_copy_send from migration parameter to migration capability") Signed-off-by: Leonardo Bras <leobras@redhat.com> Message-Id: <20220719122345.253713-1-leobras@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2022-07-20 12:15:09 +01:00
Leonardo Bras	cf20c89733	Add dirty-sync-missed-zero-copy migration stat Signed-off-by: Leonardo Bras <leobras@redhat.com> Acked-by: Markus Armbruster <armbru@redhat.com> Acked-by: Peter Xu <peterx@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20220711211112.18951-3-leobras@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2022-07-20 12:15:09 +01:00
Peter Xu	9a26662752	migration: Export tls-[creds\|hostname\|authz] params to cmdline too It's useful for specifying tls credentials all in the cmdline (along with the -object tls-creds-*), especially for debugging purpose. The trick here is we must remember to not free these fields again in the finalize() function of migration object, otherwise it'll cause double-free. The thing is when destroying an object, we'll first destroy the properties that bound to the object, then the object itself. To be explicit, when destroy the object in object_finalize() we have such sequence of operations: object_property_del_all(obj); object_deinit(obj, ti); So after this change the two fields are properly released already even before reaching the finalize() function but in object_property_del_all(), hence we don't need to free them anymore in finalize() or it's double-free. This also fixes a trivial memory leak for tls-authz as we forgot to free it before this patch. Reviewed-by: Daniel P. Berrange <berrange@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20220707185515.27475-1-peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2022-07-20 12:15:09 +01:00
Peter Xu	85a8578ea5	migration: Add helpers to detect TLS capability Add migrate_channel_requires_tls() to detect whether the specific channel requires TLS, leveraging the recently introduced migrate_use_tls(). No functional change intended. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20220707185513.27421-1-peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2022-07-20 12:15:08 +01:00
Peter Xu	c8750de118	migration: Add property x-postcopy-preempt-break-huge Add a property field that can conditionally disable the "break sending huge page" behavior in postcopy preemption. By default it's enabled. It should only be used for debugging purposes, and we should never remove the "x-" prefix. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Manish Mishra <manish.mishra@nutanix.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20220707185511.27366-1-peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2022-07-20 12:15:08 +01:00
Peter Xu	d0edb8a173	migration: Create the postcopy preempt channel asynchronously This patch allows the postcopy preempt channel to be created asynchronously. The benefit is that when the connection is slow, we won't take the BQL (and potentially block all things like QMP) for a long time without releasing. A function postcopy_preempt_wait_channel() is introduced, allowing the migration thread to be able to wait on the channel creation. The channel is always created by the main thread, in which we'll kick a new semaphore to tell the migration thread that the channel has created. We'll need to wait for the new channel in two places: (1) when there's a new postcopy migration that is starting, or (2) when there's a postcopy migration to resume. For the start of migration, we don't need to wait for this channel until when we want to start postcopy, aka, postcopy_start(). We'll fail the migration if we found that the channel creation failed (which should probably not happen at all in 99% of the cases, because the main channel is using the same network topology). For a postcopy recovery, we'll need to wait in postcopy_pause(). In that case if the channel creation failed, we can't fail the migration or we'll crash the VM, instead we keep in PAUSED state, waiting for yet another recovery. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Manish Mishra <manish.mishra@nutanix.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20220707185509.27311-1-peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2022-07-20 12:15:08 +01:00
Peter Xu	60bb3c5871	migration: Postcopy recover with preempt enabled To allow postcopy recovery, the ram fast load (preempt-only) dest QEMU thread needs similar handling on fault tolerance. When ram_load_postcopy() fails, instead of stopping the thread it halts with a semaphore, preparing to be kicked again when recovery is detected. A mutex is introduced to make sure there's no concurrent operation upon the socket. To make it simple, the fast ram load thread will take the mutex during its whole procedure, and only release it if it's paused. The fast-path socket will be properly released by the main loading thread safely when there's network failures during postcopy with that mutex held. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20220707185506.27257-1-peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2022-07-20 12:15:08 +01:00
Peter Xu	c01b16edf6	migration: Postcopy preemption enablement This patch enables postcopy-preempt feature. It contains two major changes to the migration logic: (1) Postcopy requests are now sent via a different socket from precopy background migration stream, so as to be isolated from very high page request delays. (2) For huge page enabled hosts: when there's postcopy requests, they can now intercept a partial sending of huge host pages on src QEMU. After this patch, we'll live migrate a VM with two channels for postcopy: (1) PRECOPY channel, which is the default channel that transfers background pages; and (2) POSTCOPY channel, which only transfers requested pages. There's no strict rule of which channel to use, e.g., if a requested page is already being transferred on precopy channel, then we will keep using the same precopy channel to transfer the page even if it's explicitly requested. In 99% of the cases we'll prioritize the channels so we send requested page via the postcopy channel as long as possible. On the source QEMU, when we found a postcopy request, we'll interrupt the PRECOPY channel sending process and quickly switch to the POSTCOPY channel. After we serviced all the high priority postcopy pages, we'll switch back to PRECOPY channel so that we'll continue to send the interrupted huge page again. There's no new thread introduced on src QEMU. On the destination QEMU, one new thread is introduced to receive page data from the postcopy specific socket (done in the preparation patch). This patch has a side effect: after sending postcopy pages, previously we'll assume the guest will access follow up pages so we'll keep sending from there. Now it's changed. Instead of going on with a postcopy requested page, we'll go back and continue sending the precopy huge page (which can be intercepted by a postcopy request so the huge page can be sent partially before). Whether that's a problem is debatable, because "assuming the guest will continue to access the next page" may not really suite when huge pages are used, especially if the huge page is large (e.g. 1GB pages). So that locality hint is much meaningless if huge pages are used. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20220707185504.27203-1-peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2022-07-20 12:15:08 +01:00
Peter Xu	36f62f11e4	migration: Postcopy preemption preparation on channel creation Create a new socket for postcopy to be prepared to send postcopy requested pages via this specific channel, so as to not get blocked by precopy pages. A new thread is also created on dest qemu to receive data from this new channel based on the ram_load_postcopy() routine. The ram_load_postcopy(POSTCOPY) branch and the thread has not started to function, and that'll be done in follow up patches. Cleanup the new sockets on both src/dst QEMUs, meanwhile look after the new thread too to make sure it'll be recycled properly. Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20220707185502.27149-1-peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> dgilbert: With Peter's fix to quieten compiler warning on start_migration	2022-07-20 12:15:08 +01:00
Peter Xu	ce5b0f4afc	migration: Add postcopy-preempt capability Firstly, postcopy already preempts precopy due to the fact that we do unqueue_page() first before looking into dirty bits. However that's not enough, e.g., when there're host huge page enabled, when sending a precopy huge page, a postcopy request needs to wait until the whole huge page that is sending to finish. That could introduce quite some delay, the bigger the huge page is the larger delay it'll bring. This patch adds a new capability to allow postcopy requests to preempt existing precopy page during sending a huge page, so that postcopy requests can be serviced even faster. Meanwhile to send it even faster, bypass the precopy stream by providing a standalone postcopy socket for sending requested pages. Since the new behavior will not be compatible with the old behavior, this will not be the default, it's enabled only when the new capability is set on both src/dst QEMUs. This patch only adds the capability itself, the logic will be added in follow up patches. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20220707185342.26794-2-peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2022-07-20 12:15:08 +01:00
Daniel P. Berrangé	77ef2dc1c8	migration: remove the QEMUFileOps abstraction Now that all QEMUFile callbacks are removed, the entire concept can be deleted. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2022-06-23 10:18:13 +01:00
Daniel P. Berrangé	fbfa6404e5	migration: rename qemu_ftell to qemu_file_total_transferred The name 'ftell' gives the misleading impression that the QEMUFile objects are seekable. This is not the case, as in general we just have an opaque stream. The users of this method are only interested in the total bytes processed. This switches to a new name that reflects the intended usage. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> dgilbert: Wrapped long line	2022-06-22 19:33:36 +01:00
Leonardo Bras	1abaec9a1b	migration: Change zero_copy_send from migration parameter to migration capability When originally implemented, zero_copy_send was designed as a Migration paramenter. But taking into account how is that supposed to work, and how the difference between a capability and a parameter, it only makes sense that zero-copy-send would work better as a capability. Taking into account how recently the change got merged, it was decided that it's still time to make it right, and convert zero_copy_send into a Migration capability. Signed-off-by: Leonardo Bras <leobras@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Acked-by: Markus Armbruster <armbru@redhat.com> Acked-by: Peter Xu <peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> dgilbert: always define the capability, even on non-Linux but error if set; avoids build problems with the capability	2022-06-22 18:11:17 +01:00
Leonardo Bras	5b1d9bab2d	multifd: Implement zero copy write in multifd migration (multifd-zero-copy) Implement zero copy send on nocomp_send_write(), by making use of QIOChannel writev + flags & flush interface. Change multifd_send_sync_main() so flush_zero_copy() can be called after each iteration in order to make sure all dirty pages are sent before a new iteration is started. It will also flush at the beginning and at the end of migration. Also make it return -1 if flush_zero_copy() fails, in order to cancel the migration process, and avoid resuming the guest in the target host without receiving all current RAM. This will work fine on RAM migration because the RAM pages are not usually freed, and there is no problem on changing the pages content between writev_zero_copy() and the actual sending of the buffer, because this change will dirty the page and cause it to be re-sent on a next iteration anyway. A lot of locked memory may be needed in order to use multifd migration with zero-copy enabled, so disabling the feature should be necessary for low-privileged users trying to perform multifd migrations. Signed-off-by: Leonardo Bras <leobras@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20220513062836.965425-9-leobras@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2022-05-16 13:56:24 +01:00
Leonardo Bras	d2fafb6a68	migration: Add migrate_use_tls() helper A lot of places check parameters.tls_creds in order to evaluate if TLS is in use, and sometimes call migrate_get_current() just for that test. Add new helper function migrate_use_tls() in order to simplify testing for TLS usage. Signed-off-by: Leonardo Bras <leobras@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20220513062836.965425-6-leobras@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2022-05-16 13:56:24 +01:00
Leonardo Bras	abb6295b3a	migration: Add zero-copy-send parameter for QMP/HMP for Linux Add property that allows zero-copy migration of memory pages on the sending side, and also includes a helper function migrate_use_zero_copy_send() to check if it's enabled. No code is introduced to actually do the migration, but it allow future implementations to enable/disable this feature. On non-Linux builds this parameter is compiled-out. Signed-off-by: Leonardo Bras <leobras@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Acked-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20220513062836.965425-5-leobras@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2022-05-16 13:56:24 +01:00
Dr. David Alan Gilbert	552de79bfd	migration: Read state once The 'status' field for the migration is updated normally using an atomic operation from the migration thread. Most readers of it aren't that careful, and in most cases it doesn't matter. In query_migrate->fill_source_migration_info the 'state' is read twice; the first time to decide which state fields to fill in, and then secondly to copy the state to the status field; that can end up with a status that's inconsistent; e.g. setting up the fields for 'setup' and then having an 'active' status. In that case libvirt gets upset by the lack of ram info. The symptom is: libvirt.libvirtError: internal error: migration was active, but no RAM info was set Read the state exactly once in fill_source_migration_info. This is a possible fix for: https://bugzilla.redhat.com/show_bug.cgi?id=2074205 Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20220413113329.103696-1-dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2022-04-21 19:36:46 +01:00
Peter Xu	08401c0426	migration: Allow migrate-recover to run multiple times Previously migration didn't have an easy way to cleanup the listening transport, migrate recovery only allows to execute once. That's done with a trick flag in postcopy_recover_triggered. Now the facility is already there. Drop postcopy_recover_triggered and instead allows a new migrate-recover to release the previous listener transport. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20220331150857.74406-8-peterx@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2022-04-21 19:36:46 +01:00
Peter Xu	a39e933962	migration: Move channel setup out of postcopy_try_recover() We used to use postcopy_try_recover() to replace migration_incoming_setup() to setup incoming channels. That's fine for the old world, but in the new world there can be more than one channels that need setup. Better move the channel setup out of it so that postcopy_try_recover() only handles the last phase of switching to the recovery phase. To do that in migration_fd_process_incoming(), move the postcopy_try_recover() call to be after migration_incoming_setup(), which will setup the channels. While in migration_ioc_process_incoming(), postpone the recover() routine right before we'll jump into migration_incoming_process(). A side benefit is we don't need to pass in QEMUFile* to postcopy_try_recover() anymore. Remove it. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20220331150857.74406-7-peterx@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2022-04-21 19:36:46 +01:00
Peter Xu	f444eeda71	migration: Move migrate_allow_multifd and helpers into migration.c This variable, along with its helpers, is used to detect whether multiple channel will be supported for migration. In follow up patches, there'll be other capability that requires multi-channels. Hence move it outside multifd specific code and make it public. Meanwhile rename it from "multifd" to "multi_channels" to show its real meaning. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20220331150857.74406-5-peterx@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2022-04-21 19:36:46 +01:00
Peter Xu	83174765da	migration: Postpone releasing MigrationState.hostname We used to release it right after migrate_fd_connect(). That's not good enough when there're more than one socket pair required, because it'll be needed to establish TLS connection for the rest channels. One example is multifd, where we copied over the hostname for each channel but that's actually not needed. Keeping the hostname until the cleanup phase of migration. Cc: Daniel P. Berrange <berrange@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20220331150857.74406-2-peterx@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> dgilbert: Fixup checkpatch error; don't need to check for NULL around g_free	2022-04-21 19:36:46 +01:00
Marc-André Lureau	8e3b0cbb72	Replace qemu_real_host_page variables with inlined functions Replace the global variables with inlined helper functions. getpagesize() is very likely annotated with a "const" function attribute (at least with glibc), and thus optimization should apply even better. This avoids the need for a constructor initialization too. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20220323155743.1585078-12-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-04-06 10:50:38 +02:00
Emanuele Giuseppe Esposito	3b71719462	block: rename bdrv_invalidate_cache_all, blk_invalidate_cache and test_sync_op_invalidate_cache Following the bdrv_activate renaming, change also the name of the respective callers. bdrv_invalidate_cache_all -> bdrv_activate_all blk_invalidate_cache -> blk_activate test_sync_op_invalidate_cache -> test_sync_op_activate No functional change intended. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Message-Id: <20220209105452.1694545-5-eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2022-03-04 18:14:40 +01:00
Peter Xu	e031149c78	migration: Add migration_incoming_transport_cleanup() Add a helper to cleanup the transport listener. When do it, we should also null-ify the cleanup hook and the data, then it's even safe to call it multiple times. Move the socket_address_list cleanup altogether, because that's a mirror of the listener channels and only for the purpose of query-migrate. Hence when someone wants to cleanup the listener transport, it should also want to cleanup the socket list too, always. No functional change intended. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20220301083925.33483-15-peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2022-03-02 18:20:45 +00:00
Peter Xu	d5c8f2afe8	migration: Enlarge postcopy recovery to capture !-EIO too We used to have quite a few places making sure -EIO happened and that's the only way to trigger postcopy recovery. That's based on the assumption that we'll only return -EIO for channel issues. It'll work in 99.99% cases but logically that won't cover some corner cases. One example is e.g. ram_block_from_stream() could fail with an interrupted network, then -EINVAL will be returned instead of -EIO. I remembered Dave Gilbert pointed that out before, but somehow this is overlooked. Neither did I encounter anything outside the -EIO error. However we'd better touch that up before it triggers a rare VM data loss during live migrating. To cover as much those cases as possible, remove the -EIO restriction on triggering the postcopy recovery, because even if it's not a channel failure, we can't do anything better than halting QEMU anyway - the corpse of the process may even be used by a good hand to dig out useful memory regions, or the admin could simply kill the process later on. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20220301083925.33483-11-peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2022-03-02 18:20:45 +00:00
David Edmondson	ae68066880	migration: Tally pre-copy, downtime and post-copy bytes independently Provide information on the number of bytes copied in the pre-copy, downtime and post-copy phases of migration. Signed-off-by: David Edmondson <david.edmondson@oracle.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2022-01-28 15:38:23 +01:00
Peter Xu	739fcc1b0e	migration: Drop return code for disgard ram process It will just never fail. Drop those return values where they're constantly zeros. A tiny touch-up on the tracepoint so trace_ram_postcopy_send_discard_bitmap() is called after the logic itself (which sounds more reasonable). Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2022-01-28 15:38:23 +01:00
Zhang Chen	01ee5e3556	migration/migration.c: Remove the MIGRATION_STATUS_ACTIVE when migration finished The MIGRATION_STATUS_ACTIVE indicates that migration is running. Remove it to be handled by the default operation, It should be part of the unknown ending states. Signed-off-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2022-01-28 15:38:23 +01:00
Zhang Chen	eeeb48ee33	migration/migration.c: Avoid COLO boot in postcopy migration COLO dose not support postcopy migration and remove the Fixme. Signed-off-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2022-01-28 15:38:23 +01:00
Zhang Chen	444252b96a	migration/migration.c: Add missed default error handler for migration state In the migration_completion() no other status is expected, for example MIGRATION_STATUS_CANCELLING, MIGRATION_STATUS_CANCELLED, etc. Signed-off-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2022-01-28 15:38:23 +01:00
Laurent Vivier	1b529d908d	failover: Silence warning messages during qtest virtio-net-failover test tries several device combinations that produces some expected warnings. These warning can be confusing, so we disable them during the qtest sequence. Reported-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Laurent Vivier <lvivier@redhat.com> Message-Id: <20211220145314.390697-1-lvivier@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> [thuth: Fix memory leak by using error_free()] Signed-off-by: Thomas Huth <thuth@redhat.com>	2021-12-22 08:12:45 +01:00
Juan Quintela	144fa06b34	migration: Never call twice qemu_target_page_size() Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-12-15 10:31:42 +01:00
Zhang Chen	751fe4c608	migration/colo: Optimize COLO primary node start code path Optimize COLO primary start path from: MIGRATION_STATUS_XXX --> MIGRATION_STATUS_ACTIVE --> MIGRATION_STATUS_COLO --> MIGRATION_STATUS_COMPLETED To: MIGRATION_STATUS_XXX --> MIGRATION_STATUS_COLO --> MIGRATION_STATUS_COMPLETED No need to start primary COLO through "MIGRATION_STATUS_ACTIVE". Signed-off-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2021-12-15 10:31:42 +01:00
Rao, Lei	795969ab1f	Fixed a QEMU hang when guest poweroff in COLO mode When the PVM guest poweroff, the COLO thread may wait a semaphore in colo_process_checkpoint().So, we should wake up the COLO thread before migration shutdown. Signed-off-by: Lei Rao <lei.rao@intel.com> Reviewed-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2021-12-15 10:31:42 +01:00
Rao, Lei	684bfd1820	Fixed SVM hang when do failover before PVM crash This patch fixed as follows: Thread 1 (Thread 0x7f34ee738d80 (LWP 11212)): #0 __pthread_clockjoin_ex (threadid=139847152957184, thread_return=0x7f30b1febf30, clockid=<optimized out>, abstime=<optimized out>, block=<optimized out>) at pthread_join_common.c:145 #1 0x0000563401998e36 in qemu_thread_join (thread=0x563402d66610) at util/qemu-thread-posix.c:587 #2 0x00005634017a79fa in process_incoming_migration_co (opaque=0x0) at migration/migration.c:502 #3 0x00005634019b59c9 in coroutine_trampoline (i0=63395504, i1=22068) at util/coroutine-ucontext.c:115 #4 0x00007f34ef860660 in ?? () at ../sysdeps/unix/sysv/linux/x86_64/__start_context.S:91 from /lib/x86_64-linux-gnu/libc.so.6 #5 0x00007f30b21ee730 in ?? () #6 0x0000000000000000 in ?? () Thread 13 (Thread 0x7f30b3dff700 (LWP 11747)): #0 __lll_lock_wait (futex=futex@entry=0x56340218ffa0 <qemu_global_mutex>, private=0) at lowlevellock.c:52 #1 0x00007f34efa000a3 in _GI__pthread_mutex_lock (mutex=0x56340218ffa0 <qemu_global_mutex>) at ../nptl/pthread_mutex_lock.c:80 #2 0x0000563401997f99 in qemu_mutex_lock_impl (mutex=0x56340218ffa0 <qemu_global_mutex>, file=0x563401b7a80e "migration/colo.c", line=806) at util/qemu-thread-posix.c:78 #3 0x0000563401407144 in qemu_mutex_lock_iothread_impl (file=0x563401b7a80e "migration/colo.c", line=806) at /home/workspace/colo-qemu/cpus.c:1899 #4 0x00005634017ba8e8 in colo_process_incoming_thread (opaque=0x563402d664c0) at migration/colo.c:806 #5 0x0000563401998b72 in qemu_thread_start (args=0x5634039f8370) at util/qemu-thread-posix.c:519 #6 0x00007f34ef9fd609 in start_thread (arg=<optimized out>) at pthread_create.c:477 #7 0x00007f34ef924293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95 The QEMU main thread is holding the lock: (gdb) p qemu_global_mutex $1 = {lock = {_data = {lock = 2, __count = 0, __owner = 11212, __nusers = 9, __kind = 0, __spins = 0, __elision = 0, __list = {_prev = 0x0, __next = 0x0}}, __size = "\002\000\000\000\000\000\000\000\314+\000\000\t", '\000' <repeats 26 times>, __align = 2}, file = 0x563401c07e4b "util/main-loop.c", line = 240, initialized = true} >From the call trace, we can see it is a deadlock bug. and the QEMU main thread holds the global mutex to wait until the COLO thread ends. and the colo thread wants to acquire the global mutex, which will cause a deadlock. So, we should release the qemu_global_mutex before waiting colo thread ends. Signed-off-by: Lei Rao <lei.rao@intel.com> Reviewed-by: Li Zhijian <lizhijian@cn.fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2021-11-03 09:38:53 +01:00
Rao, Lei	aa505f8e0e	Fixed qemu crash when guest power off in COLO mode This patch fixes the following: qemu-system-x86_64: invalid runstate transition: 'shutdown' -> 'running' Aborted (core dumped) The gdb bt as following: 0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50 1 0x00007faa3d613859 in __GI_abort () at abort.c:79 2 0x000055c5a21268fd in runstate_set (new_state=RUN_STATE_RUNNING) at vl.c:723 3 0x000055c5a1f8cae4 in vm_prepare_start () at /home/workspace/colo-qemu/cpus.c:2206 4 0x000055c5a1f8cb1b in vm_start () at /home/workspace/colo-qemu/cpus.c:2213 5 0x000055c5a2332bba in migration_iteration_finish (s=0x55c5a4658810) at migration/migration.c:3376 6 0x000055c5a2332f3b in migration_thread (opaque=0x55c5a4658810) at migration/migration.c:3527 7 0x000055c5a251d68a in qemu_thread_start (args=0x55c5a5491a70) at util/qemu-thread-posix.c:519 8 0x00007faa3d7e9609 in start_thread (arg=<optimized out>) at pthread_create.c:477 9 0x00007faa3d710293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95 Signed-off-by: Lei Rao <lei.rao@intel.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2021-11-03 09:38:53 +01:00
yuxiating	fa0b31d585	migration: initialise compression_counters for a new migration If the compression migration fails or is canceled, the query for the value of compression_counters during the next compression migration is wrong. Signed-off-by: yuxiating <yuxiating@huawei.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2021-11-03 09:38:53 +01:00
Laurent Vivier	458fecca80	migration: provide an error message to migration_cancel() This avoids to call migrate_get_current() in the caller function whereas migration_cancel() already needs the pointer to the current migration state. Signed-off-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2021-11-03 09:38:53 +01:00
David Hildenbrand	7648297d40	migration: Simplify alignment and alignment checks Let's use QEMU_ALIGN_DOWN() and friends to make the code a bit easier to read. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2021-11-01 22:56:44 +01:00
Peter Xu	60fd680193	migration: Add migrate_add_blocker_internal() An internal version that removes -only-migratable implications. It can be used for temporary migration blockers like dump-guest-memory. Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2021-11-01 22:56:44 +01:00
Peter Xu	4c170330aa	migration: Make migration blocker work for snapshots too save_snapshot() checks migration blocker, which looks sane. At the meantime we should also teach the blocker add helper to fail if during a snapshot, just like for migrations. Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2021-11-01 22:56:43 +01:00
Li Zhijian	5ad15e8614	migration: allow enabling mutilfd for specific protocol only To: <quintela@redhat.com>, <dgilbert@redhat.com>, <qemu-devel@nongnu.org> CC: Li Zhijian <lizhijian@cn.fujitsu.com> Date: Sat, 31 Jul 2021 22:05:52 +0800 (5 weeks, 4 days, 17 hours ago) And change the default to true so that in '-incoming defer' case, user is able to change multifd capability. Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2021-10-19 08:39:04 +02:00
Li Zhijian	b7acd65707	migration: allow multifd for socket protocol only To: <quintela@redhat.com>, <dgilbert@redhat.com>, <qemu-devel@nongnu.org> CC: Li Zhijian <lizhijian@cn.fujitsu.com> Date: Sat, 31 Jul 2021 22:05:51 +0800 (5 weeks, 4 days, 17 hours ago) multifd with unsupported protocol will cause a segment fault. (gdb) bt #0 0x0000563b4a93faf8 in socket_connect (addr=0x0, errp=0x7f7f02675410) at ../util/qemu-sockets.c:1190 #1 0x0000563b4a797a03 in qio_channel_socket_connect_sync (ioc=0x563b4d16e8c0, addr=0x0, errp=0x7f7f02675410) at ../io/channel-socket.c:145 #2 0x0000563b4a797abf in qio_channel_socket_connect_worker (task=0x563b4cd86c30, opaque=0x0) at ../io/channel-socket.c:168 #3 0x0000563b4a792631 in qio_task_thread_worker (opaque=0x563b4cd86c30) at ../io/task.c:124 #4 0x0000563b4a91da69 in qemu_thread_start (args=0x563b4c44bb80) at ../util/qemu-thread-posix.c:541 #5 0x00007f7fe9b5b3f9 in ?? () #6 0x0000000000000000 in ?? () It's enough to check migrate_multifd_is_allowed() in multifd cleanup() and multifd setup() though there are so many other places using migrate_use_multifd(). Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2021-10-19 08:39:04 +02:00
Emanuele Giuseppe Esposito	68b88468f6	migration: add missing qemu_mutex_lock_iothread in migration_completion qemu_savevm_state_complete_postcopy assumes the iothread lock (BQL) to be held, but instead it isn't. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20211005080751.3797161-3-eesposit@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-10-05 13:10:29 +02:00
Markus Armbruster	7d6f6933aa	migration: Handle migration_incoming_setup() errors consistently Commit `b673eab4e2` "multifd: Make multifd_load_setup() get an Error parameter" changed migration_incoming_setup() to take an Error ** argument, and adjusted the callers accordingly. It neglected to change adjust multifd_load_setup(): it still exit()s on error. Clean that up. The error now gets propagated up two call chains: via migration_fd_process_incoming() to rdma_accept_incoming_migration(), and via migration_ioc_process_incoming() to migration_channel_process_incoming(). Both chain ends report the error with error_report_err(), but otherwise ignore it. Behavioral change: we no longer exit() on this error. This is consistent with how we handle other errors here, e.g. from multifd_recv_new_channel() via migration_ioc_process_incoming() to migration_channel_process_incoming(). Whether it's consistently right or consistently wrong I can't tell. Also clean up the return value from the unusual 0 on success, 1 on error to the more common true on success, false on error. Cc: Juan Quintela <quintela@redhat.com> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20210720125408.387910-11-armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Acked-by: Michael S. Tsirkin <mst@redhat.com>	2021-08-26 17:15:28 +02:00
Markus Armbruster	f9734d5d40	error: Use error_fatal to simplify obvious fatal errors (again) We did this with scripts/coccinelle/use-error_fatal.cocci before, in commit `50beeb6809` and `007b06578a`. This commit cleans up rarer variations that don't seem worth matching with Coccinelle. Cc: Thomas Huth <thuth@redhat.com> Cc: Cornelia Huck <cornelia.huck@de.ibm.com> Cc: Peter Xu <peterx@redhat.com> Cc: Juan Quintela <quintela@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Marc-André Lureau <marcandre.lureau@redhat.com> Cc: Gerd Hoffmann <kraxel@redhat.com> Cc: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20210720125408.387910-2-armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>	2021-08-26 17:15:28 +02:00
Peter Xu	39675ffffb	migration: Move the yank unregister of channel_close out It's efficient, but hackish to call yank unregister calls in channel_close(), especially it'll be hard to debug when qemu crashed with some yank function leaked. Remove that hack, but instead explicitly unregister yank functions at the places where needed, they are: (on src) - migrate_fd_cleanup - postcopy_pause (on dst) - migration_incoming_state_destroy - postcopy_pause_incoming Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20210722175841.938739-6-peterx@redhat.com> Reviewed-by: Lukas Straub <lukasstraub2@web.de> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-07-26 12:45:03 +01:00
Peter Xu	43044ac0ee	migration: Make from_dst_file accesses thread-safe Accessing from_dst_file is potentially racy in current code base like below: if (s->from_dst_file) do_something(s->from_dst_file); Because from_dst_file can be reset right after the check in another thread (rp_thread). One example is migrate_fd_cancel(). Use the same qemu_file_lock to protect it too, just like to_dst_file. When it's safe to access without lock, comment it. There's one special reference in migration_thread() that can be replaced by the newly introduced rp_thread_created flag. Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Lukas Straub <lukasstraub2@web.de> Message-Id: <20210722175841.938739-3-peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> with Peter's fixup	2021-07-26 12:44:46 +01:00
Peter Xu	53021ea165	migration: Fix missing join() of rp_thread It's possible that the migration thread skip the join() of the rp_thread in below race and crash on src right at finishing migration: migration_thread rp_thread ---------------- --------- migration_completion() (before rp_thread quits) from_dst_file=NULL [thread got scheduled out] s->rp_state.from_dst_file==NULL (skip join() of rp_thread) migrate_fd_cleanup() qemu_fclose(s->to_dst_file) yank_unregister_instance() assert(yank_find_entry()) <------- crash It could mostly happen with postcopy, but that shouldn't be required, e.g., I think it could also trigger with MIGRATION_CAPABILITY_RETURN_PATH set. It's suspected that above race could be the root cause of a recent (but rare) migration-test break reported by either Dave or PMM: https://lore.kernel.org/qemu-devel/YPamXAHwan%2FPPXLf@work-vm/ The issue is: from_dst_file is reset in the rp_thread, so if the thread reset it to NULL fast enough then the migration thread will assume there's no rp_thread at all. This could potentially cause more severe issue (e.g. crash) after the yank code. Fix it by using a boolean to keep "whether we've created rp_thread". Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20210722175841.938739-2-peterx@redhat.com> Reviewed-by: Lukas Straub <lukasstraub2@web.de> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-07-26 12:44:26 +01:00
Peter Xu	ca7bd0821b	migration: Clear error at entry of migrate_fd_connect() For each "migrate" command, remember to clear the s->error before going on. For one reason, when there's a new error it'll be still remembered; see migrate_set_error() who only sets the error if error==NULL. Meanwhile if a failed migration completes (e.g., postcopy recovered and finished), we shouldn't dump an error when calling migrate_fd_cleanup() at last. Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20210708190653.252961-4-peterx@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-07-13 16:21:57 +01:00
Peter Xu	ca30f24d12	migration: Don't do migrate cleanup if during postcopy resume Below process could crash qemu with postcopy recovery: 1. (hmp) migrate -d .. 2. (hmp) migrate_start_postcopy 3. [network down, postcopy paused] 4. (hmp) migrate -r $WRONG_PORT when try the recover on an invalid $WRONG_PORT, cleanup_bh will be cleared 5. (hmp) migrate -r $RIGHT_PORT [qemu crash on assert(cleanup_bh)] The thing is we shouldn't cleanup if it's postcopy resume; the error is set mostly because the channel is wrong, so we return directly waiting for the user to retry. migrate_fd_cleanup() should only be called when migration is cancelled or completed. Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20210708190653.252961-3-peterx@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-07-13 16:21:57 +01:00
Peter Xu	2e3e3da3c2	migration: Release return path early for paused postcopy When postcopy pause triggered, we rely on the migration thread to cleanup the to_dst_file handle, and the return path thread to cleanup the from_dst_file handle (which is stored in the local variable "rp"). Within the process, from_dst_file cleanup (qemu_fclose) is postponed until it's setup again due to a postcopy recovery. It used to work before yank was born; after yank is introduced we rely on the refcount of IOC to correctly unregister yank function in channel_close(). If without the early and on-time release of from_dst_file handle the yank function will be leftover during paused postcopy. Without this patch, below steps (quoted from Xiaohui) could trigger qemu src crash: 1.Boot vm on src host 2.Boot vm on dst host 3.Enable postcopy on src&dst host 4.Load stressapptest in vm and set postcopy speed to 50M 5.Start migration from src to dst host, change into postcopy mode when migration is active. 6.When postcopy is active, down the network card(do migration via this network) on dst host. 7.Wait untill postcopy is paused on src&dst host. 8.Before up network card, recover migration on dst host, will get error like following. 9.Ignore the error of step 8, go on recovering migration on src host: After step 9, qemu on src host will core dump after some seconds: qemu-kvm: ../util/yank.c:107: yank_unregister_instance: Assertion `QLIST_EMPTY(&entry->yankfns)' failed. 1.sh: line 38: 44662 Aborted (core dumped) Reported-by: Li Xiaohui <xiaohli@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20210708190653.252961-2-peterx@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-07-13 16:21:57 +01:00
Laurent Vivier	a51dcef08b	migration: failover: emit a warning when the card is not fully unplugged When the migration fails or is canceled we wait the end of the unplug operation to be able to plug it back. But if the unplug operation is never finished we stop to wait and QEMU emits a warning to inform the user. Based-on: 20210629155007.629086-1-lvivier@redhat.com Signed-off-by: Laurent Vivier <lvivier@redhat.com> Message-Id: <20210701131458.112036-1-lvivier@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-07-13 16:21:57 +01:00
Laurent Vivier	944bc52842	migration: failover: continue to wait card unplug on error If the user cancels the migration in the unplug-wait state, QEMU will try to plug back the card and this fails because the card is partially unplugged. To avoid the problem, continue to wait the card unplug, but to allow the migration to be canceled if the card never finishes to unplug use a timeout. Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1976852 Signed-off-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20210629155007.629086-3-lvivier@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-07-05 10:51:26 +01:00
Laurent Vivier	fde93d99d9	migration: move wait-unplug loop to its own function The loop is used in migration_thread() and bg_migration_thread(), so we can move it to its own function and call it from these both places. Moreover, in migration_thread() we have a wrong state transition from SETUP to ACTIVE while state could be WAIT_UNPLUG. This is correctly managed in bg_migration_thread() so use this code instead. Signed-off-by: Laurent Vivier <lvivier@redhat.com> Message-Id: <20210629155007.629086-2-lvivier@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-07-05 10:51:26 +01:00
Peter Xu	b7f9afd48e	migration: Allow reset of postcopy_recover_triggered when failed It's possible qemu_start_incoming_migration() failed at any point, when it happens we should reset postcopy_recover_triggered to false so that the user can still retry with a saner incoming port. Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20210629181356.217312-3-peterx@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-07-05 10:51:26 +01:00
Peter Xu	cc48c587d2	migration: Move yank outside qemu_start_incoming_migration() Starting from commit `b5eea99ec2`, qmp_migrate_recover() calls unregister before calling qemu_start_incoming_migration(). I believe it wanted to mitigate the next call to yank_register_instance(), but I think that's wrong. Firstly, if during recover, we should keep the yank instance there, not "quickly removing and adding it back". Meanwhile, calling qmp_migrate_recover() twice with `b5eea99ec2` will directly crash the dest qemu (right now it can't; but it'll start to work right after the next patch) because the 1st call of qmp_migrate_recover() will unregister permanently when the channel failed to establish, then the 2nd call of qmp_migrate_recover() crashes at yank_unregister_instance(). This patch fixes it by moving yank ops out of qemu_start_incoming_migration() into qmp_migrate_incoming. For qmp_migrate_recover(), drop the unregister of yank instance too since we keep it there during the recovery phase. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20210629181356.217312-2-peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-07-05 10:51:26 +01:00

1 2 3 4 5 ...

618 Commits