migration/multifd: Move multifd_send_setup into migration thread

We currently have an unfavorable situation around multifd channel
creation and migration thread execution.

We create the multifd channels with qio_channel_socket_connect_async
-> qio_task_run_in_thread, but they only become connected in the
multifd_new_send_channel_async callback, called from
qio_task_complete, which is registered as a GLib event.
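
For reference, the creation path looks roughly like this (a simplified
sketch of the flow, not the verbatim code in migration/socket.c and
migration/multifd.c; the helper name and the saddr parameter here are
illustrative):

    static void multifd_new_send_channel_async(QIOTask *task, gpointer opaque);

    /* called from multifd_send_setup() for each channel */
    static void multifd_socket_channel_create(MultiFDSendParams *p,
                                              SocketAddress *saddr)
    {
        QIOChannelSocket *sioc = qio_channel_socket_new();

        /* the connect runs in a worker thread (qio_task_run_in_thread
         * under the hood), so this returns immediately */
        qio_channel_socket_connect_async(sioc, saddr,
                                         multifd_new_send_channel_async,
                                         p, NULL, NULL);
    }

    /* the completion callback only runs once the main loop dispatches it */
    static void multifd_new_send_channel_async(QIOTask *task, gpointer opaque)
    {
        MultiFDSendParams *p = opaque;
        QIOChannel *ioc = QIO_CHANNEL(qio_task_get_source(task));

        /* only at this point does the channel become usable (or fail) */
    }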

So at multifd_send_setup() we create the channels, but they only
become usable after the whole multifd_send_setup() call stack returns
to the main loop. This means the migration thread is already up and
running without any possibility of the multifd channels being ready in
time.

We currently rely on the channels-ready semaphore to block
multifd_send_sync_main() until the channels start to come up and
release it. However, bugs have recently been found where a channel's
creation fails and multifd_send_cleanup() is allowed to run while
other channels are still being created.
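
The dependency is on a pattern along these lines (sketch only; the
actual waits live in the multifd send path, e.g.
multifd_send_sync_main(), and the posts in multifd_send_thread()):

    /* migration thread side: block until a channel thread signals */
    qemu_sem_wait(&multifd_send_state->channels_ready);

    /* channel thread side: signal once the channel is up and has
     * finished its current work */
    qemu_sem_post(&multifd_send_state->channels_ready);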

Let's start to organize this situation by moving the
multifd_send_setup() call into the migration thread. That way we
unblock the main loop to dispatch the completion callbacks and
actually have a chance of getting the multifd channels ready by the
time the migration thread needs them.

The next patches will deal with the synchronization aspects.

Note that this takes multifd_send_setup() out of the BQL.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240206215118.6171-5-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>

@@ -3327,6 +3327,10 @@ static void *migration_thread(void *opaque)
     object_ref(OBJECT(s));
     update_iteration_initial_status(s);
 
+    if (!multifd_send_setup()) {
+        goto out;
+    }
+
     bql_lock();
     qemu_savevm_state_header(s->to_dst_file);
     bql_unlock();
@@ -3398,6 +3402,7 @@ static void *migration_thread(void *opaque)
         urgent = migration_rate_limit();
     }
 
+out:
     trace_migration_thread_after_loop();
     migration_iteration_finish(s);
     object_unref(OBJECT(s));
@@ -3635,11 +3640,6 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
         return;
     }
 
-    if (!multifd_send_setup()) {
-        migrate_fd_cleanup(s);
-        return;
-    }
-
     if (migrate_background_snapshot()) {
         qemu_thread_create(&s->thread, "bg_snapshot",
                            bg_migration_thread, s, QEMU_THREAD_JOINABLE);