docs/migration: Organize "Postcopy" page
Reorganize the page, moving things around, and add a few
headlines ("Postcopy internals", "Postcopy features") to cover sub-areas.

Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/r/20240109064628.595453-9-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
parent 4c6f8a79ae
commit 21b17cd011
@@ -1,6 +1,9 @@
+========
 Postcopy
 ========
 
+.. contents::
+
 'Postcopy' migration is a way to deal with migrations that refuse to converge
 (or take too long to converge) its plus side is that there is an upper bound on
 the amount of migration traffic and time it takes, the down side is that during
@@ -14,7 +17,7 @@ Postcopy can be combined with precopy (i.e. normal migration) so that if precopy
 doesn't finish in a given time the switch is made to postcopy.
 
 Enabling postcopy
------------------
+=================
 
 To enable postcopy, issue this command on the monitor (both source and
 destination) prior to the start of migration:
@@ -49,8 +52,71 @@ time per vCPU.
 ``migrate_set_parameter`` is ignored (to avoid delaying requested pages that
 the destination is waiting for).
 
-Postcopy device transfer
-------------------------
+Postcopy internals
+==================
+
+State machine
+-------------
+
+Postcopy moves through a series of states (see postcopy_state) from
+ADVISE->DISCARD->LISTEN->RUNNING->END
+
+- Advise
+
+    Set at the start of migration if postcopy is enabled, even
+    if it hasn't had the start command; here the destination
+    checks that its OS has the support needed for postcopy, and performs
+    setup to ensure the RAM mappings are suitable for later postcopy.
+    The destination will fail early in migration at this point if the
+    required OS support is not present.
+    (Triggered by reception of POSTCOPY_ADVISE command)
+
+- Discard
+
+    Entered on receipt of the first 'discard' command; prior to
+    the first Discard being performed, hugepages are switched off
+    (using madvise) to ensure that no new huge pages are created
+    during the postcopy phase, and to cause any huge pages that
+    have discards on them to be broken.
+
+- Listen
+
+    The first command in the package, POSTCOPY_LISTEN, switches
+    the destination state to Listen, and starts a new thread
+    (the 'listen thread') which takes over the job of receiving
+    pages off the migration stream, while the main thread carries
+    on processing the blob. With this thread able to process page
+    reception, the destination now 'sensitises' the RAM to detect
+    any access to missing pages (on Linux using the 'userfault'
+    system).
+
+- Running
+
+    POSTCOPY_RUN causes the destination to synchronise all
+    state and start the CPUs and IO devices running. The main
+    thread now finishes processing the migration package and
+    now carries on as it would for normal precopy migration
+    (although it can't do the cleanup it would do as it
+    finishes a normal migration).
+
+- Paused
+
+    Postcopy can run into a paused state (normally on both sides when
+    happens), where all threads will be temporarily halted mostly due to
+    network errors. When reaching paused state, migration will make sure
+    the qemu binary on both sides maintain the data without corrupting
+    the VM. To continue the migration, the admin needs to fix the
+    migration channel using the QMP command 'migrate-recover' on the
+    destination node, then resume the migration using QMP command 'migrate'
+    again on source node, with resume=true flag set.
+
+- End
+
+    The listen thread can now quit, and perform the cleanup of migration
+    state, the migration is now complete.
+
+Device transfer
+---------------
 
 Loading of device data may cause the device emulation to access guest RAM
 that may trigger faults that have to be resolved by the source, as such
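Note on the hunk above: the "State machine" section it adds can be read as a small transition table. The following is a minimal sketch only, not QEMU's code. It assumes the strictly linear order ADVISE->DISCARD->LISTEN->RUNNING->END given in the text and models Paused as a detour entered from, and left back to, Running. The names `PostcopyState`, `ALLOWED` and `advance` are hypothetical and chosen for illustration.

```python
from enum import Enum


class PostcopyState(Enum):
    """Illustrative model of the states named in the 'State machine' text."""
    ADVISE = 1
    DISCARD = 2
    LISTEN = 3
    RUNNING = 4
    PAUSED = 5
    END = 6


# Assumption: transitions follow ADVISE->DISCARD->LISTEN->RUNNING->END,
# with PAUSED reachable only from RUNNING and returning to RUNNING
# once the migration channel has been recovered.
ALLOWED = {
    PostcopyState.ADVISE: {PostcopyState.DISCARD},
    PostcopyState.DISCARD: {PostcopyState.LISTEN},
    PostcopyState.LISTEN: {PostcopyState.RUNNING},
    PostcopyState.RUNNING: {PostcopyState.PAUSED, PostcopyState.END},
    PostcopyState.PAUSED: {PostcopyState.RUNNING},
    PostcopyState.END: set(),
}


def advance(current, target):
    """Return target if the transition is allowed, else raise ValueError."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.name} -> {target.name}")
    return target
```

Walking ADVISE, DISCARD, LISTEN, RUNNING, PAUSED, RUNNING, END through `advance` succeeds, while a jump such as ADVISE to RUNNING raises `ValueError`.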
@@ -130,7 +196,20 @@ processing.
 is no longer used by migration, while the listen thread carries on servicing
 page data until the end of migration.
 
-Postcopy Recovery
+Source side page bitmap
+-----------------------
+
+The 'migration bitmap' in postcopy is basically the same as in the precopy,
+where each of the bit to indicate that page is 'dirty' - i.e. needs
+sending. During the precopy phase this is updated as the CPU dirties
+pages, however during postcopy the CPUs are stopped and nothing should
+dirty anything any more. Instead, dirty bits are cleared when the relevant
+pages are sent during postcopy.
+
+Postcopy features
+=================
+
+Postcopy recovery
 -----------------
 
 Comparing to precopy, postcopy is special on error handlings. When any
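Note on the hunk above: the "Source side page bitmap" paragraph describes one bit per page that is set while the page still needs sending and cleared once it has been sent. The toy model below is a sketch of that bookkeeping under stated assumptions, not QEMU's data structure. The class name `MigrationBitmap` and its methods are hypothetical, and it assumes every page starts out dirty.

```python
class MigrationBitmap:
    """Toy per-page dirty tracking: True means the page still needs sending."""

    def __init__(self, num_pages):
        # Assumption: at the start of migration every page needs sending.
        self.dirty = [True] * num_pages

    def mark_dirty(self, page):
        """Precopy phase: a CPU wrote to the page, so it must be sent (again)."""
        self.dirty[page] = True

    def send_page(self, page):
        """Clear the dirty bit when the page is sent.

        Returns True if the page was dirty (and is now considered sent),
        False if it had already been sent and nothing needs doing.
        """
        was_dirty = self.dirty[page]
        self.dirty[page] = False
        return was_dirty

    def remaining(self):
        """Number of pages that still need sending."""
        return sum(self.dirty)
```

In the postcopy phase of this model only `send_page` is called, so `remaining()` can only decrease, which mirrors the statement that nothing dirties pages once the source CPUs are stopped.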
@@ -166,76 +245,6 @@ configurations of the guest. For example, when with async page fault
 enabled, logically the guest can proactively schedule out the threads
 accessing missing pages.
 
-Postcopy states
----------------
-
-Postcopy moves through a series of states (see postcopy_state) from
-ADVISE->DISCARD->LISTEN->RUNNING->END
-
-- Advise
-
-    Set at the start of migration if postcopy is enabled, even
-    if it hasn't had the start command; here the destination
-    checks that its OS has the support needed for postcopy, and performs
-    setup to ensure the RAM mappings are suitable for later postcopy.
-    The destination will fail early in migration at this point if the
-    required OS support is not present.
-    (Triggered by reception of POSTCOPY_ADVISE command)
-
-- Discard
-
-    Entered on receipt of the first 'discard' command; prior to
-    the first Discard being performed, hugepages are switched off
-    (using madvise) to ensure that no new huge pages are created
-    during the postcopy phase, and to cause any huge pages that
-    have discards on them to be broken.
-
-- Listen
-
-    The first command in the package, POSTCOPY_LISTEN, switches
-    the destination state to Listen, and starts a new thread
-    (the 'listen thread') which takes over the job of receiving
-    pages off the migration stream, while the main thread carries
-    on processing the blob. With this thread able to process page
-    reception, the destination now 'sensitises' the RAM to detect
-    any access to missing pages (on Linux using the 'userfault'
-    system).
-
-- Running
-
-    POSTCOPY_RUN causes the destination to synchronise all
-    state and start the CPUs and IO devices running. The main
-    thread now finishes processing the migration package and
-    now carries on as it would for normal precopy migration
-    (although it can't do the cleanup it would do as it
-    finishes a normal migration).
-
-- Paused
-
-    Postcopy can run into a paused state (normally on both sides when
-    happens), where all threads will be temporarily halted mostly due to
-    network errors. When reaching paused state, migration will make sure
-    the qemu binary on both sides maintain the data without corrupting
-    the VM. To continue the migration, the admin needs to fix the
-    migration channel using the QMP command 'migrate-recover' on the
-    destination node, then resume the migration using QMP command 'migrate'
-    again on source node, with resume=true flag set.
-
-- End
-
-    The listen thread can now quit, and perform the cleanup of migration
-    state, the migration is now complete.
-
-Source side page map
---------------------
-
-The 'migration bitmap' in postcopy is basically the same as in the precopy,
-where each of the bit to indicate that page is 'dirty' - i.e. needs
-sending. During the precopy phase this is updated as the CPU dirties
-pages, however during postcopy the CPUs are stopped and nothing should
-dirty anything any more. Instead, dirty bits are cleared when the relevant
-pages are sent during postcopy.
-
 Postcopy with hugepages
 -----------------------
 
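Note on the hunk above: the "Paused" text relocated by this commit names a two-step recovery, 'migrate-recover' on the destination followed by 'migrate' with resume=true on the source. The sketch below only builds the two QMP messages as JSON; it does not talk to a QEMU instance. The helper name `recovery_commands` and the example URI are hypothetical, and it assumes both commands take the channel address in a `uri` argument.

```python
import json


def recovery_commands(uri):
    """Build the two QMP messages for resuming a paused postcopy migration.

    Returns (destination_message, source_message) as JSON strings, in the
    order in which the text says they should be issued.
    """
    # Step 1, destination node: re-establish the migration channel.
    dest = {"execute": "migrate-recover", "arguments": {"uri": uri}}
    # Step 2, source node: issue 'migrate' again with the resume flag set.
    src = {"execute": "migrate", "arguments": {"uri": uri, "resume": True}}
    return json.dumps(dest), json.dumps(src)


if __name__ == "__main__":
    # Hypothetical address from a documentation-only range.
    for message in recovery_commands("tcp:192.0.2.10:4444"):
        print(message)
```

Keeping the two messages in one helper makes the required ordering explicit: the destination must be listening on the new channel before the source is told to resume.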
@@ -293,7 +302,7 @@ Retro-fitting postcopy to existing clients is possible:
 guest memory access is made while holding a lock then all other
 threads waiting for that lock will also be blocked.
 
-Postcopy Preemption Mode
+Postcopy preemption mode
 ------------------------
 
 Postcopy preempt is a new capability introduced in 8.0 QEMU release, it