Commit Graph

168 Commits

Author SHA1 Message Date
Markus Armbruster
5305a4eeb8 qapi: Correct documentation indentation and whitespace
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240322140910.328840-12-armbru@redhat.com>
[Add a previous patch's stray hunk]
2024-03-26 06:36:08 +01:00
Markus Armbruster
209e64d9ed qapi: Refill doc comments to conform to current conventions
For legibility, wrap text paragraphs so every line is at most 70
characters long.

To check the generated documentation does not change, I compared the
generated HTML before and after this commit with "wdiff -3".  Finds no
differences.  Comparing with diff is not useful, as the refilled
paragraphs are visible there.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240322140910.328840-11-armbru@redhat.com>
2024-03-26 06:36:08 +01:00
Markus Armbruster
73c67f3851 qapi: Start sentences with a capital letter, end them with a period
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240322140910.328840-9-armbru@redhat.com>
2024-03-26 06:36:08 +01:00
Markus Armbruster
7d08424cf7 qapi: Fix abbreviation punctuation in doc comments
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240322140910.328840-8-armbru@redhat.com>
2024-03-26 06:36:08 +01:00
Markus Armbruster
e6c60bf02d qapi: Fix bogus documentation of query-migrationthreads
The doc comment documents an argument that doesn't exist.  Would
fail compilation if it was marked up correctly.  Delete.

The Returns: section fails to refer to the data type, leaving the user
to guess.  Fix that.

The command name violates QAPI naming rules: it should be
query-migration-threads.  Too late to fix.

Reported-by: John Snow <jsnow@redhat.com>
Fixes: 671326201d (migration: Introduce interface query-migrationthreads)
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240322135117.195489-4-armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
2024-03-26 06:36:08 +01:00
Markus Armbruster
8eb0a257c5 qapi: Resync MigrationParameter and MigrateSetParameters
Enum MigrationParameter mirrors the members of struct
MigrateSetParameters.  Differences to MigrateSetParameters's member
documentation are pointless.  Clean them up:

* @compress-level, @compress-threads, @decompress-threads, and
  x-checkpoint-delay are more thoroughly documented for
  MigrationParameter, so use that version for both.

* @max-cpu-throttle is almost the same.  Use MigrationParameter's
  version for both.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240322135117.195489-3-armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
2024-03-26 06:36:08 +01:00
Markus Armbruster
e8c5503a5c qapi: Improve migration TLS documentation
MigrateSetParameters is about setting parameters, and
MigrationParameters is about querying them.  Their documentation of
@tls-creds and @tls-hostname has residual damage from a failed attempt
at de-duplicating them (see commit de63ab6124 "migrate: Share common
MigrationParameters struct" and commit 1bda8b3c69 "migration: Unshare
MigrationParameters struct for now").

MigrateSetParameters documentation issues:

* It claims plain text mode "was reported by omitting tls-creds"
  before 2.9.  MigrateSetParameters is not used for reporting, so this
  is misleading.  Delete.

* It similarly claims hostname defaulting to migration URI "was
  reported by omitting tls-hostname" before 2.9.  Delete as well.

Rephrase the remaining @tls-hostname contents for clarity.

Enum MigrationParameter mirrors the members of struct
MigrateSetParameters.  Differences to MigrateSetParameters's member
documentation are pointless.  Copy the new text to MigrationParameter.

MigrationParameters documentation issues:

* @tls-creds runs the two last sentences together without punctuation.
  Fix that.

* Much of the contents on @tls-hostname only applies to setting
  parameters, resulting in confusion.  Replace by a suitable abridged
  version of the new MigrateSetParameters text, and a note on
  @tls-hostname omission in 2.8.

Additional damage is due to flawed doc fix commit
66fcb9d651 (qapi/migration: Add missing tls-authz documentation):
since it copied the missing MigrateSetParameters text from
MigrationParameters instead of MigrationParameter, the part on
recreating @tls-authz on the fly is missing.  Copy that, too.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240322135117.195489-2-armbru@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
[Some typos corrected]
2024-03-26 06:36:08 +01:00
Hao Xiang
70c25c92e6 migration/multifd: Enable multifd zero page checking by default.
1. Set default "zero-page-detection" option to "multifd". Now
zero page checking can be done in the multifd threads and this
becomes the default configuration.
2. Handle migration QEMU9.0 -> QEMU8.2 compatibility. We provide
backward compatibility where zero page checking is done from the
migration main thread.

Signed-off-by: Hao Xiang <hao.xiang@bytedance.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/20240311180015.3359271-7-hao.xiang@linux.dev
Signed-off-by: Peter Xu <peterx@redhat.com>
2024-03-11 16:57:09 -04:00
Hao Xiang
303e6f54f9 migration/multifd: Implement zero page transmission on the multifd thread.
1. Add zero_pages field in MultiFDPacket_t.
2. Implements the zero page detection and handling on the multifd
threads for non-compression, zlib and zstd compression backends.
3. Added a new value 'multifd' in ZeroPageDetection enumeration.
4. Adds zero page counters and updates multifd send/receive tracing
format to track the newly added counters.

Signed-off-by: Hao Xiang <hao.xiang@bytedance.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240311180015.3359271-5-hao.xiang@linux.dev
Signed-off-by: Peter Xu <peterx@redhat.com>
2024-03-11 16:57:09 -04:00
Hao Xiang
5fdbb1dfcc migration/multifd: Add new migration option zero-page-detection.
This new parameter controls where the zero page checking is running.
1. If this parameter is set to 'legacy', zero page checking is
done in the migration main thread.
2. If this parameter is set to 'none', zero page checking is disabled.

Signed-off-by: Hao Xiang <hao.xiang@bytedance.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Link: https://lore.kernel.org/r/20240311180015.3359271-4-hao.xiang@linux.dev
Signed-off-by: Peter Xu <peterx@redhat.com>
2024-03-11 16:57:05 -04:00
Peter Maydell
7d4e29ef80 QAPI patches patches for 2024-03-04
-----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEENUvIs9frKmtoZ05fOHC0AOuRhlMFAmXlaSISHGFybWJydUBy
 ZWRoYXQuY29tAAoJEDhwtADrkYZTdZ8P/iMgqLoAFkCCjwfkUc/rqZUezK52Ynr7
 LYwOPI/xcYD7EnVogdRgFgjWFNoivQLP5yKsU/eRTk29pwdDzTscFm/0ztTQX/Gb
 ypWV+GBcu5J8mKbp1KF5w68aDD8Bat4WRfEgDQ1DV7v6CoMiUzTiF3CGXkYzqK5Y
 kYNq97vdEkBFvFdOl/7scs/XXN2jG27egDhMp68RTxnPHlXZiAO9/2Bul3uVe3x0
 fzQ2ViYv0qLnjE/PwENDqqE3Thv3Sxp5iEeQQ6GWi07EVh07UtHpOM3RYyrTU0Sb
 VrTApSrg0oxlkOuR0CBd9Fi+timtbokBL0DWyUpXNTfIEZfLtA9H+8riUg3EOcDp
 r7a4SI/27VdPxX6Kc6zA3bi+/j1o7CLTW2LGEwuZs52nmixoo1HTWPIFdyh13g/V
 QjNbun0fViHb0FVLiyDlXF/7Y+EWUWIyqwwGqbvve1DyUHQmo3CUQAKGOpkeKSBe
 4eGciVDgpBoKhtw9Kv6LCDj2cwZKC8DxBMibf7GHkOnAsX2mnyuHcey7HvYNCoF+
 yYz7oIEXdlL2eWqg7CfBZK7lniCDln50RI4Ll1v+J4r1v1kRZGMLesTYXCdNc4ku
 yb4kpU4t22/RODffLE7K+fc3Onwze3fcfxlZMN66F+wFtk4KdPR2aQBE66bB8J99
 vuSKlTbT4cGL
 =s9AR
 -----END PGP SIGNATURE-----

Merge tag 'pull-qapi-2024-03-04' of https://repo.or.cz/qemu/armbru into staging

QAPI patches patches for 2024-03-04

# -----BEGIN PGP SIGNATURE-----
#
# iQJGBAABCAAwFiEENUvIs9frKmtoZ05fOHC0AOuRhlMFAmXlaSISHGFybWJydUBy
# ZWRoYXQuY29tAAoJEDhwtADrkYZTdZ8P/iMgqLoAFkCCjwfkUc/rqZUezK52Ynr7
# LYwOPI/xcYD7EnVogdRgFgjWFNoivQLP5yKsU/eRTk29pwdDzTscFm/0ztTQX/Gb
# ypWV+GBcu5J8mKbp1KF5w68aDD8Bat4WRfEgDQ1DV7v6CoMiUzTiF3CGXkYzqK5Y
# kYNq97vdEkBFvFdOl/7scs/XXN2jG27egDhMp68RTxnPHlXZiAO9/2Bul3uVe3x0
# fzQ2ViYv0qLnjE/PwENDqqE3Thv3Sxp5iEeQQ6GWi07EVh07UtHpOM3RYyrTU0Sb
# VrTApSrg0oxlkOuR0CBd9Fi+timtbokBL0DWyUpXNTfIEZfLtA9H+8riUg3EOcDp
# r7a4SI/27VdPxX6Kc6zA3bi+/j1o7CLTW2LGEwuZs52nmixoo1HTWPIFdyh13g/V
# QjNbun0fViHb0FVLiyDlXF/7Y+EWUWIyqwwGqbvve1DyUHQmo3CUQAKGOpkeKSBe
# 4eGciVDgpBoKhtw9Kv6LCDj2cwZKC8DxBMibf7GHkOnAsX2mnyuHcey7HvYNCoF+
# yYz7oIEXdlL2eWqg7CfBZK7lniCDln50RI4Ll1v+J4r1v1kRZGMLesTYXCdNc4ku
# yb4kpU4t22/RODffLE7K+fc3Onwze3fcfxlZMN66F+wFtk4KdPR2aQBE66bB8J99
# vuSKlTbT4cGL
# =s9AR
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 04 Mar 2024 06:24:34 GMT
# gpg:                using RSA key 354BC8B3D7EB2A6B68674E5F3870B400EB918653
# gpg:                issuer "armbru@redhat.com"
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [full]
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>" [full]
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* tag 'pull-qapi-2024-03-04' of https://repo.or.cz/qemu/armbru:
  migration: simplify exec migration functions
  qapi: New strv_from_str_list()
  qapi: New QAPI_LIST_LENGTH()
  docs/devel/writing-monitor-commands: Minor improvements
  docs/devel/writing-monitor-commands: Repair a decade of rot
  qapi: Reject "Returns" section when command doesn't return anything
  qga/qapi-schema: Fix guest-set-memory-blocks documentation
  qga/qapi-schema: Tweak documentation of fsfreeze commands
  qga/qapi-schema: Clean up "Returns" sections
  qga/qapi-schema: Delete useless "Returns" sections
  qga/qapi-schema: Move error documentation to new "Errors" sections
  qapi/yank: Tweak @yank's error description for consistency
  qapi: Clean up "Returns" sections
  qapi: Delete useless "Returns" sections
  qapi: Move error documentation to new "Errors" sections
  qapi: New documentation section tag "Errors"
  qapi: Slightly clearer error message for invalid "Returns" section
  qapi: Memorize since & returns sections

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-03-05 11:20:15 +00:00
Markus Armbruster
53d5c36d8d qapi: Delete useless "Returns" sections
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240227113921.236097-6-armbru@redhat.com>
2024-03-04 07:12:40 +01:00
Fabiano Rosas
4ed49feb44 migration/ram: Introduce 'mapped-ram' migration capability
Add a new migration capability 'mapped-ram'.

The core of the feature is to ensure that RAM pages are mapped
directly to offsets in the resulting migration file instead of being
streamed at arbitrary points.

The reasons why we'd want such behavior are:

 - The resulting file will have a bounded size, since pages which are
   dirtied multiple times will always go to a fixed location in the
   file, rather than constantly being added to a sequential
   stream. This eliminates cases where a VM with, say, 1G of RAM can
   result in a migration file that's 10s of GBs, provided that the
   workload constantly redirties memory.

 - It paves the way to implement O_DIRECT-enabled save/restore of the
   migration stream as the pages are ensured to be written at aligned
   offsets.

 - It allows the usage of multifd so we can write RAM pages to the
   migration file in parallel.

For now, enabling the capability has no effect. The next couple of
patches implement the core functionality.

Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20240229153017.2221-8-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
2024-03-01 15:42:04 +08:00
Steve Sistare
87a2848715 migration: massage cpr-reboot documentation
Re-wrap the cpr-reboot documentation to 70 columns, use '@' for
cpr-reboot references, capitalize COLO and VFIO, and tweak the
wording.

Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Link: https://lore.kernel.org/r/1709218462-3640-1-git-send-email-steven.sistare@oracle.com
[peterx: s/qemu/QEMU per Markus's suggestion]
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
2024-03-01 14:14:14 +08:00
Steve Sistare
cbdafc1b34 migration: options incompatible with cpr
Fail the migration request if options are set that are incompatible
with cpr.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/1708622920-68779-15-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2024-02-28 11:31:28 +08:00
Steve Sistare
ce5db1cb49 migration: update cpr-reboot description
Clarify qapi for cpr-reboot migration mode, and add vfio support.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/1708622920-68779-14-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2024-02-28 11:31:28 +08:00
Markus Armbruster
d23055b8db qapi: Require descriptions and tagged sections to be indented
By convention, we indent the second and subsequent lines of
descriptions and tagged sections, except for examples.

Turn this into a hard rule, and apply it to examples, too.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240216145841.2099240-11-armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
[Straightforward conflicts in qapi/migration.json resolved]
2024-02-26 10:43:56 +01:00
Het Gala
3cee17e739 qapi: Misc cleanups to migrate QAPIs
Signed-off-by: Het Gala <het.gala@nutanix.com>
Message-ID: <20240216195659.189091-1-het.gala@nutanix.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-02-26 10:43:56 +01:00
Peter Xu
66fcb9d651 qapi/migration: Add missing tls-authz documentation
As reported in Markus's recent enforcement series on qapi doc [1], we
accidentally miss one entry for tls-authz.  Add it.

[1] https://lore.kernel.org/r/20240205074709.3613229-1-armbru@redhat.com

Cc: Daniel P. Berrangé <berrange@redhat.com>
Cc: Fabiano Rosas <farosas@suse.de>
Reported-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-ID: <20240207032836.268183-1-peterx@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Update of qapi/pragma.json squashed in, commit message adjusted]
2024-02-12 10:04:32 +01:00
Markus Armbruster
89a2273b9d qapi: Add missing union tag documentation
Low-hanging fruit, and except for StatsFilter, the only members of
these unions lacking documentation.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240205074709.3613229-16-armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-02-12 10:04:32 +01:00
Markus Armbruster
1ed1d4d608 qapi: Indent tagged doc comment sections properly
docs/devel/qapi-code-gen demands that the "second and subsequent lines
of sections other than "Example"/"Examples" should be indented".
Commit a937b6aa739q (qapi: Reformat doc comments to conform to current
conventions) missed a few instances, and messed up a few others.
Clean that up.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240205074709.3613229-5-armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-02-12 10:04:31 +01:00
Han Han
dee70f51bf qapi/migration.json: Fix the member name for MigrationCapability
s/@compression/@compress/

Fixes: 864128df46

Signed-off-by: Han Han <hhan@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-01-30 21:20:20 +03:00
Het Gala
57fd4b4e10 Make 'uri' optional for migrate QAPI
'uri' argument should be optional, as 'uri' and 'channels'
arguments are mutally exclusive in nature.

Fixes: 074dbce5fc (migration: New migrate and migrate-incoming argument 'channels')
Signed-off-by: Het Gala <het.gala@nutanix.com>
Link: https://lore.kernel.org/r/20240123064219.40514-1-het.gala@nutanix.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2024-01-29 11:02:12 +08:00
Markus Armbruster
37507c14a6 qapi: Fix malformed "Since:" section tags (again)
"Since X.Y" is not recognized as a tagged section, and therefore not
formatted as such in generated documentation.  Fix by adding the
required colon.

Previously fixed in commit 433a4fdc42 (qapi: Fix malformed "Since:"
section tags)

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240120095327.666239-8-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
2024-01-26 07:04:54 +01:00
Michael Tokarev
4061c3346e qapi/migration.json: spelling: transfering
Fixes: 074dbce5fc "migration: New migrate and migrate-incoming argument 'channels'"
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-11-15 11:09:17 +03:00
Het Gala
074dbce5fc migration: New migrate and migrate-incoming argument 'channels'
MigrateChannelList allows to connect accross multiple interfaces.
Add MigrateChannelList struct as argument to migration QAPIs.

We plan to include multiple channels in future, to connnect
multiple interfaces. Hence, we choose 'MigrateChannelList'
as the new argument over 'MigrateChannel' to make migration
QAPIs future proof.

Suggested-by: Aravind Retnakaran <aravind.retnakaran@nutanix.com>
Signed-off-by: Het Gala <het.gala@nutanix.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231023182053.8711-10-farosas@suse.de>
2023-11-02 11:35:04 +01:00
Het Gala
e034f88364 migration: New QAPI type 'MigrateAddress'
This patch introduces well defined MigrateAddress struct
and its related child objects.

The existing argument of 'migrate' and 'migrate-incoming' QAPI
- 'uri' is of type string. The current implementation follows
double encoding scheme for fetching migration parameters like
'uri' and this is not an ideal design.

Motive for intoducing struct level design is to prevent double
encoding of QAPI arguments, as Qemu should be able to directly
use the QAPI arguments without any level of encoding.

Note: this commit only adds the type, and actual uses comes
in later commits.

Fabiano fixed for "file" transport.

Suggested-by: Aravind Retnakaran <aravind.retnakaran@nutanix.com>
Signed-off-by: Het Gala <het.gala@nutanix.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231023182053.8711-2-farosas@suse.de>
Message-Id: <20231023182053.8711-3-farosas@suse.de>
2023-11-02 11:35:03 +01:00
Steve Sistare
a87e64519b cpr: reboot mode
Add the cpr-reboot migration mode.  Usage:

$ qemu-system-$arch -monitor stdio ...
QEMU 8.1.50 monitor - type 'help' for more information
(qemu) migrate_set_capability x-ignore-shared on
(qemu) migrate_set_parameter mode cpr-reboot
(qemu) migrate -d file:vm.state
(qemu) info status
VM status: paused (postmigrate)
(qemu) quit

$ qemu-system-$arch -monitor stdio -incoming defer ...
QEMU 8.1.50 monitor - type 'help' for more information
(qemu) migrate_set_capability x-ignore-shared on
(qemu) migrate_set_parameter mode cpr-reboot
(qemu) migrate_incoming file:vm.state
(qemu) info status
VM status: running

In this mode, the migrate command saves state to a file, allowing one
to quit qemu, reboot to an updated kernel, and restart an updated version
of qemu.  The caller must specify a migration URI that writes to and reads
from a file.  Unlike normal mode, the use of certain local storage options
does not block the migration, but the caller must not modify guest block
devices between the quit and restart.  To avoid saving guest RAM to the
file, the memory backend must be shared, and the @x-ignore-shared migration
capability must be set.  Guest RAM must be non-volatile across reboot, such
as by backing it with a dax device, but this is not enforced.  The restarted
qemu arguments must match those used to initially start qemu, plus the
-incoming option.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <1698263069-406971-6-git-send-email-steven.sistare@oracle.com>
2023-11-01 16:13:59 +01:00
Steve Sistare
eea1e5c9d6 migration: mode parameter
Create a mode migration parameter that can be used to select alternate
migration algorithms.  The default mode is normal, representing the
current migration algorithm, and does not need to be explicitly set.

No functional change until a new mode is added, except that the mode is
shown by the 'info migrate' command.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <1698263069-406971-2-git-send-email-steven.sistare@oracle.com>
2023-11-01 16:13:58 +01:00
Juan Quintela
864128df46 migration: Deprecate old compression method
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231018115513.2163-6-quintela@redhat.com>
2023-10-31 08:44:33 +01:00
Juan Quintela
66db46ca83 migration: Deprecate block migration
It is obsolete.  It is better to use driver-mirror with NBD instead.

CC: Kevin Wolf <kwolf@redhat.com>
CC: Eric Blake <eblake@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Hanna Czenczek <hreitz@redhat.com>

Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231018115513.2163-5-quintela@redhat.com>
2023-10-31 08:44:33 +01:00
Juan Quintela
8846b5bfca migration: migrate 'blk' command option is deprecated.
Use blocked-mirror with NBD instead.

Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231018115513.2163-4-quintela@redhat.com>
2023-10-31 08:44:33 +01:00
Juan Quintela
40101f320d migration: migrate 'inc' command option is deprecated.
Use blockdev-mirror with NBD instead.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231018115513.2163-3-quintela@redhat.com>
2023-10-31 08:44:33 +01:00
Juan Quintela
e4ceec292f migration: Improve json and formatting
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231013104736.31722-2-quintela@redhat.com>
2023-10-17 09:25:13 +02:00
Peter Xu
8b2395970a migration: Allow user to specify available switchover bandwidth
Migration bandwidth is a very important value to live migration.  It's
because it's one of the major factors that we'll make decision on when to
switchover to destination in a precopy process.

This value is currently estimated by QEMU during the whole live migration
process by monitoring how fast we were sending the data.  This can be the
most accurate bandwidth if in the ideal world, where we're always feeding
unlimited data to the migration channel, and then it'll be limited to the
bandwidth that is available.

However in reality it may be very different, e.g., over a 10Gbps network we
can see query-migrate showing migration bandwidth of only a few tens of
MB/s just because there are plenty of other things the migration thread
might be doing.  For example, the migration thread can be busy scanning
zero pages, or it can be fetching dirty bitmap from other external dirty
sources (like vhost or KVM).  It means we may not be pushing data as much
as possible to migration channel, so the bandwidth estimated from "how many
data we sent in the channel" can be dramatically inaccurate sometimes.

With that, the decision to switchover will be affected, by assuming that we
may not be able to switchover at all with such a low bandwidth, but in
reality we can.

The migration may not even converge at all with the downtime specified,
with that wrong estimation of bandwidth, keeping iterations forever with a
low estimation of bandwidth.

The issue is QEMU itself may not be able to avoid those uncertainties on
measuing the real "available migration bandwidth".  At least not something
I can think of so far.

One way to fix this is when the user is fully aware of the available
bandwidth, then we can allow the user to help providing an accurate value.

For example, if the user has a dedicated channel of 10Gbps for migration
for this specific VM, the user can specify this bandwidth so QEMU can
always do the calculation based on this fact, trusting the user as long as
specified.  It may not be the exact bandwidth when switching over (in which
case qemu will push migration data as fast as possible), but much better
than QEMU trying to wildly guess, especially when very wrong.

A new parameter "avail-switchover-bandwidth" is introduced just for this.
So when the user specified this parameter, instead of trusting the
estimated value from QEMU itself (based on the QEMUFile send speed), it
trusts the user more by using this value to decide when to switchover,
assuming that we'll have such bandwidth available then.

Note that specifying this value will not throttle the bandwidth for
switchover yet, so QEMU will always use the full bandwidth possible for
sending switchover data, assuming that should always be the most important
way to use the network at that time.

This can resolve issues like "unconvergence migration" which is caused by
hilarious low "migration bandwidth" detected for whatever reason.

Reported-by: Zhiyi Guo <zhguo@redhat.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231010221922.40638-1-peterx@redhat.com>
2023-10-17 09:14:32 +02:00
Peter Xu
c94143e587 migration: Display error in query-migrate irrelevant of status
Display it as long as being set, irrelevant of FAILED status.  E.g., it may
also be applicable to PAUSED stage of postcopy, to provide hint on what has
gone wrong.

The error_mutex seems to be overlooked when referencing the error, add it
to be very safe.

This will change QAPI behavior by showing up error message outside !FAILED
status, but it's intended and doesn't expect to break anyone.

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2018404
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231004220240.167175-2-peterx@redhat.com>
2023-10-11 11:17:04 +02:00
Andrei Gudkov
320a6ccc76 migration/dirtyrate: use QEMU_CLOCK_HOST to report start-time
Currently query-dirty-rate uses QEMU_CLOCK_REALTIME as
the source for start-time field. This translates to
clock_gettime(CLOCK_MONOTONIC), i.e. number of seconds
since host boot. This is not very useful. The only
reasonable use case of start-time I can imagine is to
check whether previously completed measurements are
too old or not. But this makes sense only if start-time
is reported as host wall-clock time.

This patch replaces source of start-time from
QEMU_CLOCK_REALTIME to QEMU_CLOCK_HOST.

Signed-off-by: Andrei Gudkov <gudkov.andrei@huawei.com>
Reviewed-by: Hyman Huang <yong.huang@smartx.com>
Message-Id: <399861531e3b24a1ecea2ba453fb2c3d129fb03a.1693905328.git.gudkov.andrei@huawei.com>
Signed-off-by: Hyman Huang <yong.huang@smartx.com>
2023-10-10 08:04:12 +08:00
Andrei Gudkov
34a68001f1 migration/calc-dirty-rate: millisecond-granularity period
This patch allows to measure dirty page rate for
sub-second intervals of time. An optional argument is
introduced -- calc-time-unit. For example:
{"execute": "calc-dirty-rate", "arguments":
  {"calc-time": 500, "calc-time-unit": "millisecond"} }

Millisecond granularity allows to make predictions whether
migration will succeed or not. To do this, calculate dirty
rate with calc-time set to max allowed downtime (e.g. 300ms),
convert measured rate into volume of dirtied memory,
and divide by network throughput. If the value is lower
than max allowed downtime, then migration will converge.

Measurement results for single thread randomly writing to
a 1/4/24GiB memory region:

+----------------+-----------------------------------------------+
| calc-time      |                dirty rate MiB/s               |
| (milliseconds) +----------------+---------------+--------------+
|                | theoretical    | page-sampling | dirty-bitmap |
|                | (at 3M wr/sec) |               |              |
+----------------+----------------+---------------+--------------+
|                               1GiB                             |
+----------------+----------------+---------------+--------------+
|            100 |           6996 |          7100 |         3192 |
|            200 |           4606 |          4660 |         2655 |
|            300 |           3305 |          3280 |         2371 |
|            400 |           2534 |          2525 |         2154 |
|            500 |           2041 |          2044 |         1871 |
|            750 |           1365 |          1341 |         1358 |
|           1000 |           1024 |          1052 |         1025 |
|           1500 |            683 |           678 |          684 |
|           2000 |            512 |           507 |          513 |
+----------------+----------------+---------------+--------------+
|                               4GiB                             |
+----------------+----------------+---------------+--------------+
|            100 |          10232 |          8880 |         4070 |
|            200 |           8954 |          8049 |         3195 |
|            300 |           7889 |          7193 |         2881 |
|            400 |           6996 |          6530 |         2700 |
|            500 |           6245 |          5772 |         2312 |
|            750 |           4829 |          4586 |         2465 |
|           1000 |           3865 |          3780 |         2178 |
|           1500 |           2694 |          2633 |         2004 |
|           2000 |           2041 |          2031 |         1789 |
+----------------+----------------+---------------+--------------+
|                               24GiB                            |
+----------------+----------------+---------------+--------------+
|            100 |          11495 |          8640 |         5597 |
|            200 |          11226 |          8616 |         3527 |
|            300 |          10965 |          8386 |         2355 |
|            400 |          10713 |          8370 |         2179 |
|            500 |          10469 |          8196 |         2098 |
|            750 |           9890 |          7885 |         2556 |
|           1000 |           9354 |          7506 |         2084 |
|           1500 |           8397 |          6944 |         2075 |
|           2000 |           7574 |          6402 |         2062 |
+----------------+----------------+---------------+--------------+

Theoretical values are computed according to the following formula:
size * (1 - (1-(4096/size))^(time*wps)) / (time * 2^20),
where size is in bytes, time is in seconds, and wps is number of
writes per second.

Signed-off-by: Andrei Gudkov <gudkov.andrei@huawei.com>
Reviewed-by: Hyman Huang <yong.huang@smartx.com>
Message-Id: <d802e6b8053eb60fbec1a784cf86f67d9528e0a8.1693895970.git.gudkov.andrei@huawei.com>
Signed-off-by: Hyman Huang <yong.huang@smartx.com>
2023-10-10 08:03:50 +08:00
Hyman Huang(黄勇)
ef96537732 qapi: Craft the dirty-limit capability comment
Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com>
Message-ID: <169073570563.19893.2928364761104733482-2@git.sr.ht>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2023-08-02 09:33:38 +02:00
Hyman Huang(黄勇)
8abc81150f qapi: Reformat the dirty-limit migration doc comments
Reformat the dirty-limit migration doc comments to conform
to current conventions as commit a937b6aa73 (qapi: Reformat
doc comments to conform to current conventions).

Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com>
Message-ID: <169073570563.19893.2928364761104733482-1@git.sr.ht>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Whitespace tidied up]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2023-08-02 09:33:34 +02:00
Richard Henderson
ccdd312676 QAPI patches patches for 2023-07-26
-----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEENUvIs9frKmtoZ05fOHC0AOuRhlMFAmTBFvUSHGFybWJydUBy
 ZWRoYXQuY29tAAoJEDhwtADrkYZTML4QAKhHciLnEudtZ6SFSqpOgt80IJnw8a+r
 z1AowVYtgPhlZ8TtQJFXpBtAZtKu8xb/QdFxomm4bdNQnWX6CXCoheF5ZJ9V3Rrz
 A3pA1wt5KTnRif6R9/Rs1dYXEr4cWagg1UNT3g2eOV3fvdDHvJMPOsqK/jWeXuC1
 T94yFMv1bZSLyiLgB7QQNYDZhIWQ06RGU6tZdWaZQReA8N8maXiZN5NnUISK32Rq
 L2X0FtgzyJQ+dLHtbXOw6kIwZdOLNauOM78skZoiZUyFVaH2aDUIg3mnfRw36hN6
 feXGtw68PkTQGexKmonPDljIacfMDApmNBelLwsvB9MTrwVV+hKZPy1ZEwPIFDJ9
 yid63pp2CtQ1TZ3dSjZ1cGbRR+g2NI5X4g1DlcFPAxydMkv9/m5NwQx8OYqVIzqg
 VXeS0++O2BM5+ORjlJxMx3RsyH2O1I8DCfwmifzYSo+3Xg/4nCV3f38czbavjCfJ
 4T3ooZx0+PRtjlOlfZTkgxV14TMV+XzQr3bsN4wbPdnjnueSE1tyoVGy8MwQ5aXi
 2oAsjrR8g7iqU6f+6PyRNn5F6D0ge+AYQ7bYS51i3Hyih/y2QUJECpL3XAgOxREb
 /68SEtr4m/GJvmQNdwwwu6e1JFo8LknwMfkfzQAOCK1npAJGsWPmJ6iY7KtWgS8F
 oDwqng/WOhvV
 =mNMX
 -----END PGP SIGNATURE-----

Merge tag 'pull-qapi-2023-07-26-v2' of https://repo.or.cz/qemu/armbru into staging

QAPI patches patches for 2023-07-26

# -----BEGIN PGP SIGNATURE-----
#
# iQJGBAABCAAwFiEENUvIs9frKmtoZ05fOHC0AOuRhlMFAmTBFvUSHGFybWJydUBy
# ZWRoYXQuY29tAAoJEDhwtADrkYZTML4QAKhHciLnEudtZ6SFSqpOgt80IJnw8a+r
# z1AowVYtgPhlZ8TtQJFXpBtAZtKu8xb/QdFxomm4bdNQnWX6CXCoheF5ZJ9V3Rrz
# A3pA1wt5KTnRif6R9/Rs1dYXEr4cWagg1UNT3g2eOV3fvdDHvJMPOsqK/jWeXuC1
# T94yFMv1bZSLyiLgB7QQNYDZhIWQ06RGU6tZdWaZQReA8N8maXiZN5NnUISK32Rq
# L2X0FtgzyJQ+dLHtbXOw6kIwZdOLNauOM78skZoiZUyFVaH2aDUIg3mnfRw36hN6
# feXGtw68PkTQGexKmonPDljIacfMDApmNBelLwsvB9MTrwVV+hKZPy1ZEwPIFDJ9
# yid63pp2CtQ1TZ3dSjZ1cGbRR+g2NI5X4g1DlcFPAxydMkv9/m5NwQx8OYqVIzqg
# VXeS0++O2BM5+ORjlJxMx3RsyH2O1I8DCfwmifzYSo+3Xg/4nCV3f38czbavjCfJ
# 4T3ooZx0+PRtjlOlfZTkgxV14TMV+XzQr3bsN4wbPdnjnueSE1tyoVGy8MwQ5aXi
# 2oAsjrR8g7iqU6f+6PyRNn5F6D0ge+AYQ7bYS51i3Hyih/y2QUJECpL3XAgOxREb
# /68SEtr4m/GJvmQNdwwwu6e1JFo8LknwMfkfzQAOCK1npAJGsWPmJ6iY7KtWgS8F
# oDwqng/WOhvV
# =mNMX
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 26 Jul 2023 05:52:05 AM PDT
# gpg:                using RSA key 354BC8B3D7EB2A6B68674E5F3870B400EB918653
# gpg:                issuer "armbru@redhat.com"
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [undefined]
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* tag 'pull-qapi-2023-07-26-v2' of https://repo.or.cz/qemu/armbru:
  qapi: Reformat recent doc comments to conform to current conventions
  qapi/trace: Tidy up trace-event-get-state, -set-state documentation
  qapi/qdev: Tidy up device_add documentation
  qapi/block: Tidy up block-latency-histogram-set documentation
  qapi/block-core: Tidy up BlockLatencyHistogramInfo documentation

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-07-26 07:16:19 -07:00
Markus Armbruster
9e272073e1 qapi: Reformat recent doc comments to conform to current conventions
Since commit a937b6aa73 (qapi: Reformat doc comments to conform to
current conventions), a number of comments not conforming to the
current formatting conventions were added.  No problem, just sweep
the entire documentation once more.

To check the generated documentation does not change, I compared the
generated HTML before and after this commit with "wdiff -3".  Finds no
differences.  Comparing with diff is not useful, as the reflown
paragraphs are visible there.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20230720071610.1096458-7-armbru@redhat.com>
2023-07-26 14:51:36 +02:00
Juan Quintela
7b24d32634 migration: skipped field is really obsolete.
Has return zero for more than 10 years.

Specifically we introduced the field in 1.5.0

commit f1c72795af
Author: Peter Lieven <pl@kamp.de>
Date:   Tue Mar 26 10:58:37 2013 +0100

    migration: do not sent zero pages in bulk stage

    during bulk stage of ram migration if a page is a
    zero page do not send it at all.
    the memory at the destination reads as zero anyway.

    even if there is an madvise with QEMU_MADV_DONTNEED
    at the target upon receipt of a zero page I have observed
    that the target starts swapping if the memory is overcommitted.
    it seems that the pages are dropped asynchronously.

    this patch also updates QMP to return the number of
    skipped pages in MigrationStats.

but removed its usage in 1.5.3

commit 9ef051e553
Author: Peter Lieven <pl@kamp.de>
Date:   Mon Jun 10 12:14:19 2013 +0200

    Revert "migration: do not sent zero pages in bulk stage"

    Not sending zero pages breaks migration if a page is zero
    at the source but not at the destination. This can e.g. happen
    if different BIOS versions are used at source and destination.
    It has also been reported that migration on pseries is completely
    broken with this patch.

    This effectively reverts commit f1c72795af.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20230612193344.3796-2-quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-07-26 10:55:56 +02:00
Hyman Huang(黄勇)
15699cf542 migration: Extend query-migrate to provide dirty page limit info
Extend query-migrate to provide throttle time and estimated
ring full time with dirty-limit capability enabled, through which
we can observe if dirty limit take effect during live migration.

Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-ID: <168733225273.5845.15871826788879741674-8@git.sr.ht>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-07-26 10:55:56 +02:00
Hyman Huang(黄勇)
dc62395557 migration: Introduce dirty-limit capability
Introduce migration dirty-limit capability, which can
be turned on before live migration and limit dirty
page rate durty live migration.

Introduce migrate_dirty_limit function to help check
if dirty-limit capability enabled during live migration.

Meanwhile, refactor vcpu_dirty_rate_stat_collect
so that period can be configured instead of hardcoded.

dirty-limit capability is kind of like auto-converge
but using dirty limit instead of traditional cpu-throttle
to throttle guest down. To enable this feature, turn on
the dirty-limit capability before live migration using
migrate-set-capabilities, and set the parameters
"x-vcpu-dirty-limit-period", "vcpu-dirty-limit" suitably
to speed up convergence.

Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <168618975839.6361.17407633874747688653-4@git.sr.ht>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-07-26 10:55:56 +02:00
Hyman Huang(黄勇)
09f9ec9913 qapi/migration: Introduce vcpu-dirty-limit parameters
Introduce "vcpu-dirty-limit" migration parameter used
to limit dirty page rate during live migration.

"vcpu-dirty-limit" and "x-vcpu-dirty-limit-period" are
two dirty-limit-related migration parameters, which can
be set before and during live migration by qmp
migrate-set-parameters.

This two parameters are used to help implement the dirty
page rate limit algo of migration.

Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <168618975839.6361.17407633874747688653-3@git.sr.ht>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-07-26 10:55:56 +02:00
Hyman Huang(黄勇)
4d80785719 qapi/migration: Introduce x-vcpu-dirty-limit-period parameter
Introduce "x-vcpu-dirty-limit-period" migration experimental
parameter, which is in the range of 1 to 1000ms and used to
make dirtyrate calculation period configurable.

Currently with the "x-vcpu-dirty-limit-period" varies, the
total time of live migration changes, test results show the
optimal value of "x-vcpu-dirty-limit-period" ranges from
500ms to 1000 ms. "x-vcpu-dirty-limit-period" should be made
stable once it proves best value can not be determined with
developer's experiments.

Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <168618975839.6361.17407633874747688653-2@git.sr.ht>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-07-26 10:55:56 +02:00
Juan Quintela
fd658a7b8c migration.json: Don't use space before colon
So all the file is consistent.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-Id: <20230612191604.2219-1-quintela@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2023-07-10 07:47:36 +02:00
Andrei Gudkov
5034e3d4e8 qapi: better docs for calc-dirty-rate and friends
Rewrote calc-dirty-rate documentation. Briefly described
different modes of dirty page rate measurement. Added some
examples. Fixed obvious grammar errors.

Signed-off-by: Andrei Gudkov <gudkov.andrei@huawei.com>
Message-Id: <fe7d32a621ebd69ef6974beb2499c0b5dccb9e19.1684854849.git.gudkov.andrei@huawei.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
[Prose tweaked and spacing corrected, as per review]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2023-07-10 07:47:36 +02:00
Avihai Horon
6574232fff migration: Add switchover ack capability
Migration downtime estimation is calculated based on bandwidth and
remaining migration data. This assumes that loading of migration data in
the destination takes a negligible amount of time and that downtime
depends only on network speed.

While this may be true for RAM, it's not necessarily true for other
migrated devices. For example, loading the data of a VFIO device in the
destination might require from the device to allocate resources, prepare
internal data structures and so on. These operations can take a
significant amount of time which can increase migration downtime.

This patch adds a new capability "switchover ack" that prevents the
source from stopping the VM and completing the migration until an ACK
is received from the destination that it's OK to do so.

This can be used by migrated devices in various ways to reduce downtime.
For example, a device can send initial precopy metadata to pre-allocate
resources in the destination and use this capability to make sure that
the pre-allocation is completed before the source VM is stopped, so it
will have full effect.

This new capability relies on the return path capability to communicate
from the destination back to the source.

The actual implementation of the capability will be added in the
following patches.

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Tested-by: YangHang Liu <yanghliu@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-06-30 06:02:51 +02:00