qemu-e2k

Author	SHA1	Message	Date
Peter Xu	efd8c5439d	migration/multifd: Drop MultiFDSendParams.normal[] array This array is redundant when p->pages exists. Now we extended the life of p->pages to the whole period where pending_job is set, it should be safe to always use p->pages->offset[] rather than p->normal[]. Drop the array. Alongside, the normal_num is also redundant, which is the same to p->pages->num. This doesn't apply to recv side, because there's no extra buffering on recv side, so p->normal[] array is still needed. Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240202102857.110210-6-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-05 14:42:10 +08:00
Peter Xu	15f3f21d59	migration/multifd: Drop MultiFDSendParams.quit, cleanup error paths Multifd send side has two fields to indicate error quits: - MultiFDSendParams.quit - &multifd_send_state->exiting Merge them into the global one. The replacement is done by changing all p->quit checks into the global var check. The global check doesn't need any lock. A few more things done on top of this altogether: - multifd_send_terminate_threads() Moving the xchg() of &multifd_send_state->exiting upper, so as to cover the tracepoint, migrate_set_error() and migrate_set_state(). - multifd_send_sync_main() In the 2nd loop, add one more check over the global var to make sure we don't keep the looping if QEMU already decided to quit. - multifd_tls_outgoing_handshake() Use multifd_send_terminate_threads() to set the error state. That has a benefit of updating MigrationState.error to that error too, so we can persist that 1st error we hit in that specific channel. - multifd_new_send_channel_async() Take similar approach like above, drop the migrate_set_error() because multifd_send_terminate_threads() already covers that. Unwrap the helper multifd_new_send_channel_cleanup() along the way; not really needed. Reviewed-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240202102857.110210-4-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com>	2024-02-05 14:42:10 +08:00
Fabiano Rosas	9346fa1870	migration/multifd: Remove QEMUFile from where it is not needed Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/20240104142144.9680-3-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-01-16 11:16:09 +08:00
Fabiano Rosas	dca1bc7f24	migration/multifd: Remove MultiFDPages_t::packet_num This was introduced by commit `34c55a94b1` ("migration: Create multipage support") and never used. Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/20240104142144.9680-2-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>	2024-01-16 11:16:09 +08:00
Lukas Straub	5d1d1fcf43	multifd: Add the ramblock to MultiFDRecvParams This will be used in the next commits to add colo support to multifd. Signed-off-by: Lukas Straub <lukasstraub2@web.de> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <88135197411df1a71d7832962b39abf60faf0021.1683572883.git.lukasstraub2@web.de> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-05-10 18:48:11 +02:00
Leonardo Bras	cfc3bcf373	migration/multifd: Move load_cleanup inside incoming_state_destroy Currently running migration_incoming_state_destroy() without first running multifd_load_cleanup() will cause a yank error: qemu-system-x86_64: ../util/yank.c:107: yank_unregister_instance: Assertion `QLIST_EMPTY(&entry->yankfns)' failed. (core dumped) The above error happens in the target host, when multifd is being used for precopy, and then postcopy is triggered and the migration finishes. This will crash the VM in the target host. To avoid that, move multifd_load_cleanup() inside migration_incoming_state_destroy(), so that the load cleanup becomes part of the incoming state destroying process. Running multifd_load_cleanup() twice can become an issue, though, and the only scenario in which it could be run twice is in process_incoming_migration_bh(). So removing this extra call is necessary. On the other hand, this multifd_load_cleanup() call happens well before migration_incoming_state_destroy(), and having it happen before dirty_bitmap_mig_before_vm_start() and vm_start() may be needed. So introduce a new function multifd_load_shutdown() that will mainly stop all multifd threads and close their QIOChannels. Then use this function instead of multifd_load_cleanup() to make sure nothing else is received before dirty_bitmap_mig_before_vm_start(). Fixes: `b5eea99ec2` ("migration: Add yank feature") Reported-by: Li Xiaohui <xiaohli@redhat.com> Signed-off-by: Leonardo Bras <leobras@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-02-13 03:45:40 +01:00
Leonardo Bras	e5bac1f525	migration/multifd: Change multifd_load_cleanup() signature and usage Since its introduction in commit `f986c3d256` ("migration: Create multifd migration threads"), multifd_load_cleanup() never returned any value other than 0, nor set any error on errp. Even so, in process_incoming_migration_bh() an if clause uses its return value to decide on setting autostart = false, which will never happen. In order to simplify the codebase, change multifd_load_cleanup() signature to 'void multifd_load_cleanup(void)', and for every usage remove error handling or decision made based on return value != 0. Fixes: `b5eea99ec2` ("migration: Add yank feature") Reported-by: Li Xiaohui <xiaohli@redhat.com> Signed-off-by: Leonardo Bras <leobras@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-02-13 03:44:44 +01:00
manish.mishra	6720c2b327	migration: check magic value for deciding the mapping of channels Current logic assumes that channel connections on the destination side are always established in the same order as the source and the first one will always be the main channel followed by the multifd or post-copy preemption channel. This may not always be true, as even if a channel has a connection established on the source side it can be in the pending state on the destination side and a newer connection can be established first. This causes out-of-order mapping of channels on the destination side. Currently, all channels except post-copy preempt send a magic number; this patch uses that magic number to decide the type of channel. This logic is applicable only for precopy (multifd) live migration, as mentioned, the post-copy preempt channel does not send any magic number. Also, tls live migrations already do the tls handshake before creating other channels, so this issue is not possible with tls, hence this logic is avoided for tls live migrations. This patch uses read peek to check the magic number of channels so that current data/control stream management remains unaffected. Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Daniel P. Berrange <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Suggested-by: Daniel P. Berrange <berrange@redhat.com> Signed-off-by: manish.mishra <manish.mishra@nutanix.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-02-06 19:22:57 +01:00
Juan Quintela	d6f45eba2b	multifd: Create page_count fields into both MultiFD{Recv,Send}Params We were recalculating it left and right. We plan to change those values in upcoming patches. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Leonardo Bras <leobras@redhat.com>	2022-12-15 10:30:37 +01:00
Juan Quintela	ddec20f858	multifd: Create page_size fields into both MultiFD{Recv,Send}Params We were calling qemu_target_page_size() left and right. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Leonardo Bras <leobras@redhat.com>	2022-12-15 10:30:37 +01:00
Juan Quintela	4a8f19c95c	multifd: Document the locking of MultiFD{Send/Recv}Params Reorder the structures so we can know if the fields are: - Read only - Their own locking (i.e. sems) - Protected by 'mutex' - Only for the multifd channel Signed-off-by: Juan Quintela <quintela@redhat.com> Message-Id: <20220531104318.7494-2-quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> dgilbert: Typo fixes from Chen Zhang	2022-07-20 12:15:09 +01:00
Leonardo Bras	5b1d9bab2d	multifd: Implement zero copy write in multifd migration (multifd-zero-copy) Implement zero copy send on nocomp_send_write(), by making use of QIOChannel writev + flags & flush interface. Change multifd_send_sync_main() so flush_zero_copy() can be called after each iteration in order to make sure all dirty pages are sent before a new iteration is started. It will also flush at the beginning and at the end of migration. Also make it return -1 if flush_zero_copy() fails, in order to cancel the migration process, and avoid resuming the guest in the target host without receiving all current RAM. This will work fine on RAM migration because the RAM pages are not usually freed, and there is no problem on changing the pages content between writev_zero_copy() and the actual sending of the buffer, because this change will dirty the page and cause it to be re-sent on a next iteration anyway. A lot of locked memory may be needed in order to use multifd migration with zero-copy enabled, so disabling the feature should be necessary for low-privileged users trying to perform multifd migrations. Signed-off-by: Leonardo Bras <leobras@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20220513062836.965425-9-leobras@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2022-05-16 13:56:24 +01:00
Leonardo Bras	33d70973a3	multifd: multifd_send_sync_main now returns negative on error Even though multifd_send_sync_main() currently emits error_reports, its callers don't really check it before continuing. Change multifd_send_sync_main() to return -1 on error and 0 on success. Also change all its callers to make use of this change and possibly fail earlier. (This change is important to the next patch on multifd zero copy implementation, to make sure an error in zero-copy flush does not go unnoticed.) Signed-off-by: Leonardo Bras <leobras@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Message-Id: <20220513062836.965425-7-leobras@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2022-05-16 13:56:24 +01:00
Peter Xu	f444eeda71	migration: Move migrate_allow_multifd and helpers into migration.c This variable, along with its helpers, is used to detect whether multiple channel will be supported for migration. In follow up patches, there'll be other capability that requires multi-channels. Hence move it outside multifd specific code and make it public. Meanwhile rename it from "multifd" to "multi_channels" to show its real meaning. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20220331150857.74406-5-peterx@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2022-04-21 19:36:46 +01:00
Peter Xu	7f692ec79a	migration: Drop multifd tls_hostname cache The hostname is cached N times, N equals to the multifd channels. Drop that cache because after previous patch we've got s->hostname being alive for the whole lifecycle of migration procedure. Cc: Juan Quintela <quintela@redhat.com> Cc: Daniel P. Berrange <berrange@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20220331150857.74406-3-peterx@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2022-04-21 19:36:46 +01:00
Juan Quintela	8c0ec0b2b0	multifd: Rename pages_used to normal_pages Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2022-01-28 15:38:23 +01:00
Juan Quintela	faf60935df	multifd: recv side only needs the RAMBlock host address So we can remove the MultiFDPages. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2022-01-28 15:38:23 +01:00
Juan Quintela	cf2d4aa8a2	multifd: Use normal pages array on the recv side Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> --- Rename num_normal_pages to total_normal_pages (peter)	2022-01-28 15:38:23 +01:00
Juan Quintela	815956f039	multifd: Use normal pages array on the send side We are only sending normal pages through multifd channels. Later in this series, we are going to also send zero pages. We are going to detect if a page is zero or non zero in the multifd channel thread, not on the main thread. So we receive an array of pages pages->offset[N] and we will end with: p->normal[N - zero_pages] p->zero[zero_pages]. In this patch, we just copy all the pages in offset to normal. for (i = 0; i < pages->num; i++) { p->normal[p->normal_num] = pages->offset[i]; p->normal_num++; } Later in the series this becomes: for (i = 0; i < pages->num; i++) { if (buffer_is_zero(pages->offset[i])) { p->zero[p->zero_num] = pages->offset[i]; p->zero_num++; } else { p->normal[p->normal_num] = pages->offset[i]; p->normal_num++; } } Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> --- Improving comment (dave) Renaming num_normal_pages to total_normal_pages (peter)	2022-01-28 15:38:23 +01:00
Juan Quintela	468fcb5dd0	multifd: Remove send_write() method Everything uses iovs now. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2022-01-28 15:38:23 +01:00
Juan Quintela	226468ba3d	multifd: Move iov from pages to params This will allow us to reduce the number of system calls on the next patch. Signed-off-by: Juan Quintela <quintela@redhat.com>	2022-01-28 15:38:23 +01:00
Juan Quintela	40a4bfe9d3	multifd: remove used parameter from send_recv_pages() method It is already there as p->pages->num. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-12-15 10:31:42 +01:00
Juan Quintela	02fb81043e	multifd: remove used parameter from send_prepare() method It is already there as p->pages->num. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-12-15 10:31:42 +01:00
Juan Quintela	90a3d2f9d5	multifd: Rename used field to num We will need to split it later in zero_num (number of zero pages) and normal_num (number of normal pages). This name is better. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-12-15 10:31:42 +01:00
Li Zhijian	b7acd65707	migration: allow multifd for socket protocol only multifd with an unsupported protocol will cause a segmentation fault. (gdb) bt #0 0x0000563b4a93faf8 in socket_connect (addr=0x0, errp=0x7f7f02675410) at ../util/qemu-sockets.c:1190 #1 0x0000563b4a797a03 in qio_channel_socket_connect_sync (ioc=0x563b4d16e8c0, addr=0x0, errp=0x7f7f02675410) at ../io/channel-socket.c:145 #2 0x0000563b4a797abf in qio_channel_socket_connect_worker (task=0x563b4cd86c30, opaque=0x0) at ../io/channel-socket.c:168 #3 0x0000563b4a792631 in qio_task_thread_worker (opaque=0x563b4cd86c30) at ../io/task.c:124 #4 0x0000563b4a91da69 in qemu_thread_start (args=0x563b4c44bb80) at ../util/qemu-thread-posix.c:541 #5 0x00007f7fe9b5b3f9 in ?? () #6 0x0000000000000000 in ?? () It's enough to check migrate_multifd_is_allowed() in multifd cleanup() and multifd setup() though there are so many other places using migrate_use_multifd(). Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2021-10-19 08:39:04 +02:00
Lukas Straub	20171ea895	multifd: Implement yank for multifd send side When introducing yank functionality in the migration code I forgot to cover the multifd send side. Signed-off-by: Lukas Straub <lukasstraub2@web.de> Tested-by: Leonardo Bras <leobras@redhat.com> Reviewed-by: Leonardo Bras <leobras@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2021-10-19 08:39:04 +02:00
Chuan Zheng	8e5fa05932	migration/tls: add tls_hostname into MultiFDSendParams Since multifd creation is async with migration_channel_connect, we should pass the hostname from MigrationState to MultiFDSendParams. Signed-off-by: Chuan Zheng <zhengchuan@huawei.com> Signed-off-by: Yan Jin <jinyan12@huawei.com> Message-Id: <1600139042-104593-4-git-send-email-zhengchuan@huawei.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-09-25 12:45:58 +01:00
Juan Quintela	87dc6f5f66	multifd: Add zstd compression multifd support Signed-off-by: Juan Quintela <quintela@redhat.com> Acked-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-02-28 09:25:49 +01:00
Juan Quintela	7ec2c2b3c1	multifd: Add zlib compression multifd support Signed-off-by: Juan Quintela <quintela@redhat.com> Acked-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-02-28 09:24:43 +01:00
Juan Quintela	ab7cbb0b9a	multifd: Make no compression operations into its own structure It will be used later. Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> --- No comp value needs to be zero.	2020-02-28 09:24:43 +01:00
Juan Quintela	d32ca5ad79	multifd: Split multifd code into its own file Signed-off-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2020-01-29 11:28:59 +01:00

31 Commits