Commit Graph

1693 Commits

Author SHA1 Message Date
Stefan Hajnoczi
1d6e13c1c7 dump queue
Hi
 
 The "dump" queue, with:
 - [PATCH v3 qemu 0/3] Allow dump-guest-memory to output standard kdump format
 - [PATCH v2 0/5] dump: Minor fixes & improvements
 -----BEGIN PGP SIGNATURE-----
 
 iQJQBAABCAA6FiEEh6m9kz+HxgbSdvYt2ujhCXWWnOUFAmVEmsEcHG1hcmNhbmRy
 ZS5sdXJlYXVAcmVkaGF0LmNvbQAKCRDa6OEJdZac5YNkD/sFnz+I75mn6+DIdC3x
 aSVUU87JxAvWkt+G3KYGS+de2+g2YkRkPwwrIsIceiX7mlL4Es350AVcTl7+fXpu
 Jl9k9I32QI+U3pNXo9BStIqjKUMBxmmKs4aLCh9OHJ6oliTCG+aJTUmSl/dABIuw
 fAcW9vjhyR4ogAp8x7WhR6PKEAAb6OE/9k0w/z0GV2K09N/R0pPAvObQ36VQJ/Cl
 6DN8tRRytl0IQmC/mZZ+MQPQ5cvamK78X3DmnYCGtyN9HTQERfUFMSSgD/sHLvNi
 rMKuwhXiGQfDs/xQ9Z6Vh2AL7JfAwbIQwUstepb78M/5GBLaZfwFYG4+eCohJE82
 s0GOQ45Yks+AOTGj6lNyOfJ8PIf0SocCTbnLWZicpdHIfoEkSmmL0VZ5w+w0EpDO
 WOZJRpANJGTLhKNb//X3A3OJ05LoavN3/criokhC19DW/yE/VEGd3dXlP6yvFOku
 vGUINGivg1bw7yO0S/rzXNw4+cHCPgBCXbKCNuMI6B+dxL5pUR5Zr4OqcYgwejqE
 RWMdqsHA4ohpzc3AfbuHLFilXJNAgLR3jAEiVUXyrz9U1FiYEiq/8RNuupe9Uveq
 pO1PDZ9fher0Zda4y28bHl/e5M9hVeCFqElcVk0FQGt97T5olVvSaL/hFUPf65ls
 8A3lN6WaAT9dvM33pkeswZvGxg==
 =eSbp
 -----END PGP SIGNATURE-----

Merge tag 'dump-pull-request' of https://gitlab.com/marcandre.lureau/qemu into staging

dump queue

Hi

The "dump" queue, with:
- [PATCH v3 qemu 0/3] Allow dump-guest-memory to output standard kdump format
- [PATCH v2 0/5] dump: Minor fixes & improvements

# -----BEGIN PGP SIGNATURE-----
#
# iQJQBAABCAA6FiEEh6m9kz+HxgbSdvYt2ujhCXWWnOUFAmVEmsEcHG1hcmNhbmRy
# ZS5sdXJlYXVAcmVkaGF0LmNvbQAKCRDa6OEJdZac5YNkD/sFnz+I75mn6+DIdC3x
# aSVUU87JxAvWkt+G3KYGS+de2+g2YkRkPwwrIsIceiX7mlL4Es350AVcTl7+fXpu
# Jl9k9I32QI+U3pNXo9BStIqjKUMBxmmKs4aLCh9OHJ6oliTCG+aJTUmSl/dABIuw
# fAcW9vjhyR4ogAp8x7WhR6PKEAAb6OE/9k0w/z0GV2K09N/R0pPAvObQ36VQJ/Cl
# 6DN8tRRytl0IQmC/mZZ+MQPQ5cvamK78X3DmnYCGtyN9HTQERfUFMSSgD/sHLvNi
# rMKuwhXiGQfDs/xQ9Z6Vh2AL7JfAwbIQwUstepb78M/5GBLaZfwFYG4+eCohJE82
# s0GOQ45Yks+AOTGj6lNyOfJ8PIf0SocCTbnLWZicpdHIfoEkSmmL0VZ5w+w0EpDO
# WOZJRpANJGTLhKNb//X3A3OJ05LoavN3/criokhC19DW/yE/VEGd3dXlP6yvFOku
# vGUINGivg1bw7yO0S/rzXNw4+cHCPgBCXbKCNuMI6B+dxL5pUR5Zr4OqcYgwejqE
# RWMdqsHA4ohpzc3AfbuHLFilXJNAgLR3jAEiVUXyrz9U1FiYEiq/8RNuupe9Uveq
# pO1PDZ9fher0Zda4y28bHl/e5M9hVeCFqElcVk0FQGt97T5olVvSaL/hFUPf65ls
# 8A3lN6WaAT9dvM33pkeswZvGxg==
# =eSbp
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 03 Nov 2023 15:01:21 HKT
# gpg:                using RSA key 87A9BD933F87C606D276F62DDAE8E10975969CE5
# gpg:                issuer "marcandre.lureau@redhat.com"
# gpg: Good signature from "Marc-André Lureau <marcandre.lureau@redhat.com>" [full]
# gpg:                 aka "Marc-André Lureau <marcandre.lureau@gmail.com>" [full]
# Primary key fingerprint: 87A9 BD93 3F87 C606 D276  F62D DAE8 E109 7596 9CE5

* tag 'dump-pull-request' of https://gitlab.com/marcandre.lureau/qemu:
  dump: Drop redundant check for empty dump
  dump: Improve some dump-guest-memory error messages
  dump: Recognize "fd:" protocols on Windows hosts
  dump: Fix g_array_unref(NULL) in dump-guest-memory
  dump: Rename qmp_dump_guest_memory() parameter to match QAPI schema
  dump: Add command interface for kdump-raw formats
  dump: Allow directly outputting raw kdump format
  dump: Pass DumpState to write_ functions

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-11-06 08:36:47 +08:00
Song Gao
31f694b911 target/loongarch: Implement query-cpu-model-expansion
Add support for the query-cpu-model-expansion QMP command to LoongArch.
We support query the cpu features.

  e.g
    la464 and max cpu support LSX/LASX, default enable,
    la132 not support LSX/LASX.

    1. start with '-cpu max,lasx=off'

    (QEMU) query-cpu-model-expansion type=static  model={"name":"max"}
    {"return": {"model": {"name": "max", "props": {"lasx": false, "lsx": true}}}}

    2. start with '-cpu la464,lasx=off'
    (QEMU) query-cpu-model-expansion type=static  model={"name":"la464"}
    {"return": {"model": {"name": "max", "props": {"lasx": false, "lsx": true}}}

    3. start with '-cpu la132,lasx=off'
    qemu-system-loongarch64: can't apply global la132-loongarch-cpu.lasx=off: Property 'la132-loongarch-cpu.lasx' not found

    4. start with '-cpu max,lasx=off' or start with '-cpu la464,lasx=off' query cpu model la132
    (QEMU) query-cpu-model-expansion type=static  model={"name":"la132"}
    {"return": {"model": {"name": "la132"}}}

Acked-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Song Gao <gaosong@loongson.cn>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20231020084925.3457084-4-gaosong@loongson.cn>
2023-11-03 14:13:07 +08:00
Stephen Brennan
e6549197f7 dump: Add command interface for kdump-raw formats
The QMP dump API represents the dump format as an enumeration. Add three
new enumerators, one for each supported kdump compression, each named
"kdump-raw-*".

For the HMP command line, rather than adding a new flag corresponding to
each format, it seems more human-friendly to add a single flag "-R" to
switch the kdump formats to "raw" mode. The choice of "-R" also
correlates nicely to the "makedumpfile -R" option, which would serve to
reassemble a flattened vmcore.

Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
[ Marc-André: replace loff_t with off_t, indent fixes ]
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20230918233233.1431858-4-stephen.s.brennan@oracle.com>
2023-11-02 18:40:37 +04:00
Het Gala
074dbce5fc migration: New migrate and migrate-incoming argument 'channels'
MigrateChannelList allows to connect accross multiple interfaces.
Add MigrateChannelList struct as argument to migration QAPIs.

We plan to include multiple channels in future, to connnect
multiple interfaces. Hence, we choose 'MigrateChannelList'
as the new argument over 'MigrateChannel' to make migration
QAPIs future proof.

Suggested-by: Aravind Retnakaran <aravind.retnakaran@nutanix.com>
Signed-off-by: Het Gala <het.gala@nutanix.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231023182053.8711-10-farosas@suse.de>
2023-11-02 11:35:04 +01:00
Het Gala
e034f88364 migration: New QAPI type 'MigrateAddress'
This patch introduces well defined MigrateAddress struct
and its related child objects.

The existing argument of 'migrate' and 'migrate-incoming' QAPI
- 'uri' is of type string. The current implementation follows
double encoding scheme for fetching migration parameters like
'uri' and this is not an ideal design.

Motive for intoducing struct level design is to prevent double
encoding of QAPI arguments, as Qemu should be able to directly
use the QAPI arguments without any level of encoding.

Note: this commit only adds the type, and actual uses comes
in later commits.

Fabiano fixed for "file" transport.

Suggested-by: Aravind Retnakaran <aravind.retnakaran@nutanix.com>
Signed-off-by: Het Gala <het.gala@nutanix.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231023182053.8711-2-farosas@suse.de>
Message-Id: <20231023182053.8711-3-farosas@suse.de>
2023-11-02 11:35:03 +01:00
Steve Sistare
a87e64519b cpr: reboot mode
Add the cpr-reboot migration mode.  Usage:

$ qemu-system-$arch -monitor stdio ...
QEMU 8.1.50 monitor - type 'help' for more information
(qemu) migrate_set_capability x-ignore-shared on
(qemu) migrate_set_parameter mode cpr-reboot
(qemu) migrate -d file:vm.state
(qemu) info status
VM status: paused (postmigrate)
(qemu) quit

$ qemu-system-$arch -monitor stdio -incoming defer ...
QEMU 8.1.50 monitor - type 'help' for more information
(qemu) migrate_set_capability x-ignore-shared on
(qemu) migrate_set_parameter mode cpr-reboot
(qemu) migrate_incoming file:vm.state
(qemu) info status
VM status: running

In this mode, the migrate command saves state to a file, allowing one
to quit qemu, reboot to an updated kernel, and restart an updated version
of qemu.  The caller must specify a migration URI that writes to and reads
from a file.  Unlike normal mode, the use of certain local storage options
does not block the migration, but the caller must not modify guest block
devices between the quit and restart.  To avoid saving guest RAM to the
file, the memory backend must be shared, and the @x-ignore-shared migration
capability must be set.  Guest RAM must be non-volatile across reboot, such
as by backing it with a dax device, but this is not enforced.  The restarted
qemu arguments must match those used to initially start qemu, plus the
-incoming option.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <1698263069-406971-6-git-send-email-steven.sistare@oracle.com>
2023-11-01 16:13:59 +01:00
Steve Sistare
eea1e5c9d6 migration: mode parameter
Create a mode migration parameter that can be used to select alternate
migration algorithms.  The default mode is normal, representing the
current migration algorithm, and does not need to be explicitly set.

No functional change until a new mode is added, except that the mode is
shown by the 'info migrate' command.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <1698263069-406971-2-git-send-email-steven.sistare@oracle.com>
2023-11-01 16:13:58 +01:00
Stefan Hajnoczi
6c9ae1ce82 Block layer patches
- virtio-blk: use blk_io_plug_call() instead of notification BH
 - mirror: allow switching from background to active mode
 - qemu-img rebase: add compression support
 - Fix locking in media change monitor commands
 - Fix a few blockjob-related deadlocks when using iothread
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAmVBTkERHGt3b2xmQHJl
 ZGhhdC5jb20ACgkQfwmycsiPL9ZiqRAAqvsWbblmEGJ7TBKYQK3f8QshJ66RxzbC
 4eSjKHrciWNTeeIeU8r8OvFcPPoTcPXxpcmasD2gsAxG5W5N8vkPbBkW+YT4YdDJ
 pWJXrbJ15nILC4DmnR1ARVtvxKgv9zy5LSm5bjss1K+OSYJl/nx+ILjmfVZnYDF7
 z1dP/G0JxKKm4JzAIdBE3uZS+6Q5kx/wGYlJv8EQmlH3DYfsJfy6Lthe9jfw8ijg
 lSqLoQ+D0lEd6Bk4XbkUqqBxFcYBWTfU6qPZoyIO94zCTwTG9yIjmoivxmmfwQZq
 cJUTGGZjcxpJYnvcC6P13WgcWBtcD9L2kYFVH0JyjpwcSg9cCGHMF66n9pSlyEGq
 DUikwVzbTwOotwzYQyM88v4ET+2+Qdcwn8pRbv9PllEczh0kAsUAEuxSgtz4NEcN
 bZrap/16xHFybNOKkMZcmpqxspT5NXKbDODUP0IvbSYMOYpWS983nBTxwMRpyHog
 2TFDZu4DjNiPkI2BcYM5VOKk6diNowZFShcEKvoaOLX/n9EBhP0tjoH9VUn1800F
 myHrhF2jpIf9GhErMWB7N2W3/0aK0pqdQgbpVnd1ARDdIdYkr7G/S+50D9K80b6n
 0q2E7br4S5bcsY0HQzBL9YARSayY+lVOssLoolCWEsYzijdBQmAvs5THajFKcism
 /idI6nlp2Vs=
 =RdxS
 -----END PGP SIGNATURE-----

Merge tag 'for-upstream' of https://repo.or.cz/qemu/kevin into staging

Block layer patches

- virtio-blk: use blk_io_plug_call() instead of notification BH
- mirror: allow switching from background to active mode
- qemu-img rebase: add compression support
- Fix locking in media change monitor commands
- Fix a few blockjob-related deadlocks when using iothread

# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAmVBTkERHGt3b2xmQHJl
# ZGhhdC5jb20ACgkQfwmycsiPL9ZiqRAAqvsWbblmEGJ7TBKYQK3f8QshJ66RxzbC
# 4eSjKHrciWNTeeIeU8r8OvFcPPoTcPXxpcmasD2gsAxG5W5N8vkPbBkW+YT4YdDJ
# pWJXrbJ15nILC4DmnR1ARVtvxKgv9zy5LSm5bjss1K+OSYJl/nx+ILjmfVZnYDF7
# z1dP/G0JxKKm4JzAIdBE3uZS+6Q5kx/wGYlJv8EQmlH3DYfsJfy6Lthe9jfw8ijg
# lSqLoQ+D0lEd6Bk4XbkUqqBxFcYBWTfU6qPZoyIO94zCTwTG9yIjmoivxmmfwQZq
# cJUTGGZjcxpJYnvcC6P13WgcWBtcD9L2kYFVH0JyjpwcSg9cCGHMF66n9pSlyEGq
# DUikwVzbTwOotwzYQyM88v4ET+2+Qdcwn8pRbv9PllEczh0kAsUAEuxSgtz4NEcN
# bZrap/16xHFybNOKkMZcmpqxspT5NXKbDODUP0IvbSYMOYpWS983nBTxwMRpyHog
# 2TFDZu4DjNiPkI2BcYM5VOKk6diNowZFShcEKvoaOLX/n9EBhP0tjoH9VUn1800F
# myHrhF2jpIf9GhErMWB7N2W3/0aK0pqdQgbpVnd1ARDdIdYkr7G/S+50D9K80b6n
# 0q2E7br4S5bcsY0HQzBL9YARSayY+lVOssLoolCWEsYzijdBQmAvs5THajFKcism
# /idI6nlp2Vs=
# =RdxS
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 01 Nov 2023 03:58:09 JST
# gpg:                using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6
# gpg:                issuer "kwolf@redhat.com"
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* tag 'for-upstream' of https://repo.or.cz/qemu/kevin: (27 commits)
  iotests: add test for changing mirror's copy_mode
  mirror: return mirror-specific information upon query
  blockjob: query driver-specific info via a new 'query' driver method
  qapi/block-core: turn BlockJobInfo into a union
  qapi/block-core: use JobType for BlockJobInfo's type
  mirror: implement mirror_change method
  block/mirror: determine copy_to_target only once
  block/mirror: move dirty bitmap to filter
  block/mirror: set actively_synced even after the job is ready
  blockjob: introduce block-job-change QMP command
  virtio-blk: remove batch notification BH
  virtio: use defer_call() in virtio_irqfd_notify()
  util/defer-call: move defer_call() to util/
  block: rename blk_io_plug_call() API to defer_call()
  blockdev: mirror: avoid potential deadlock when using iothread
  block: avoid potential deadlock during bdrv_graph_wrlock() in bdrv_close()
  blockjob: drop AioContext lock before calling bdrv_graph_wrlock()
  iotests: Test media change with iothreads
  block: Fix locking in media change monitor commands
  iotests: add tests for "qemu-img rebase" with compression
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-11-01 06:58:11 +09:00
Fiona Ebner
76cb2f2491 mirror: return mirror-specific information upon query
To start out, only actively-synced is returned.

For example, this is useful for jobs that started out in background
mode and switched to active mode. Once actively-synced is true, it's
clear that the mode switch has been completed. Note that completion of
the switch might happen much earlier, e.g. if the switch happens
before the job is ready, once all background operations have finished.
It's assumed that whether the disks are actively-synced or not is more
interesting than whether the mode switch completed. That information
can still be added if required in the future.

In presence of an iothread, the actively_synced member is now shared
between the iothread and the main thread, so turn accesses to it
atomic.

Requires to adapt the output for iotest 109.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Message-ID: <20231031135431.393137-10-f.ebner@proxmox.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-31 18:20:29 +01:00
Fiona Ebner
701efc9f2d qapi/block-core: turn BlockJobInfo into a union
In preparation to additionally return job-type-specific information.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-ID: <20231031135431.393137-8-f.ebner@proxmox.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-31 18:20:29 +01:00
Fiona Ebner
d67c54d05f qapi/block-core: use JobType for BlockJobInfo's type
In preparation to turn BlockJobInfo into a union with @type as the
discriminator. That requires it to be an enum. Even without that
requirement, it's nicer to have an enum instead of a str here.

No functional change is intended.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20231031135431.393137-7-f.ebner@proxmox.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-31 18:20:29 +01:00
Fiona Ebner
2d400d15a0 mirror: implement mirror_change method
which allows switching the @copy-mode from 'background' to
'write-blocking'.

This is useful for management applications, so they can start out in
background mode to avoid limiting guest write speed and switch to
active mode when certain criteria are fulfilled.

In presence of an iothread, the copy_mode member is now shared between
the iothread and the main thread, so turn accesses to it atomic.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Message-ID: <20231031135431.393137-6-f.ebner@proxmox.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-31 18:20:29 +01:00
Fiona Ebner
61a3a5a76a blockjob: introduce block-job-change QMP command
which will allow changing job-type-specific options after job
creation.

In the JobVerbTable, the same allow bits as for set-speed are used,
because set-speed can be considered an existing change command.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-ID: <20231031135431.393137-2-f.ebner@proxmox.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-31 18:20:25 +01:00
Juan Quintela
864128df46 migration: Deprecate old compression method
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231018115513.2163-6-quintela@redhat.com>
2023-10-31 08:44:33 +01:00
Juan Quintela
66db46ca83 migration: Deprecate block migration
It is obsolete.  It is better to use driver-mirror with NBD instead.

CC: Kevin Wolf <kwolf@redhat.com>
CC: Eric Blake <eblake@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Hanna Czenczek <hreitz@redhat.com>

Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231018115513.2163-5-quintela@redhat.com>
2023-10-31 08:44:33 +01:00
Juan Quintela
8846b5bfca migration: migrate 'blk' command option is deprecated.
Use blocked-mirror with NBD instead.

Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231018115513.2163-4-quintela@redhat.com>
2023-10-31 08:44:33 +01:00
Juan Quintela
40101f320d migration: migrate 'inc' command option is deprecated.
Use blockdev-mirror with NBD instead.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231018115513.2163-3-quintela@redhat.com>
2023-10-31 08:44:33 +01:00
Stefan Hajnoczi
ebdf417220 * s390x CPU topology support
* Simplify the KVM register synchronization code
 * Disable the analyze-migration.py test on s390x
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmUyDYMRHHRodXRoQHJl
 ZGhhdC5jb20ACgkQLtnXdP5wLbUlgBAAkF3dvW0vMcb653sCI5vt2GHIvQQtc2Rw
 ghRRcTBZ7wyVxKHtqohCh7/byzDW5YEuCWUyLsc2oIz/84pc00VR/5Ng1EAxLAfe
 mvzzjr4jX96SmoO0DbJpqJQXaUPNYdmoshbRL0I3wkIfGtkvGRM8zHZuYINOg0hw
 bH6gWZ2QL/NFjXh0uAOaJB1+hRtPWvHD2rnVt0g9U9W5QhRxGJqti5YEaLBH7hh5
 RydsquRZ/E6uFw4pMjjvCxDaswPwejddrP2YeR5Fd5Zo+Kzp53r9Hf/eJwlZ8yFL
 5f1dRb19NZYpW1hZuJVOP8tkPydYxAM85vkUunI7Qg4gez5KI0Nz6hQozw6ufMlQ
 r8L17fwQMsCrwcRypImYNXyyrtHlNH5Y8FjqTct8aK64Bw3e7Qqi7d3ybFAuYZ+D
 k2EJ8Rlwhbg69h+Q+ucHx4NkYu9+2MFS6G7w5EcM6xl3WHSwUxh9orlEMsIkyHS3
 OMFMTr1jjfFdEN6EafhPwFE/xKglFF2Fe3u6NoR+5pkv3UA5Z87giitxoekYecpH
 J96P3anORpWW75qvOF+nccqrd7OrUL1/yYdOyJh5Tkm0oCIeQ9E5extVf3Gne3E/
 yWzr00GJRiHFO2qbGStgKHTQLItgQpccwNpSzEdgHCqwLbXl6e3Hoq42VIFOlbN/
 ZtgpyUkuYyQ=
 =xDb+
 -----END PGP SIGNATURE-----

Merge tag 'pull-request-2023-10-20' of https://gitlab.com/thuth/qemu into staging

* s390x CPU topology support
* Simplify the KVM register synchronization code
* Disable the analyze-migration.py test on s390x

# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmUyDYMRHHRodXRoQHJl
# ZGhhdC5jb20ACgkQLtnXdP5wLbUlgBAAkF3dvW0vMcb653sCI5vt2GHIvQQtc2Rw
# ghRRcTBZ7wyVxKHtqohCh7/byzDW5YEuCWUyLsc2oIz/84pc00VR/5Ng1EAxLAfe
# mvzzjr4jX96SmoO0DbJpqJQXaUPNYdmoshbRL0I3wkIfGtkvGRM8zHZuYINOg0hw
# bH6gWZ2QL/NFjXh0uAOaJB1+hRtPWvHD2rnVt0g9U9W5QhRxGJqti5YEaLBH7hh5
# RydsquRZ/E6uFw4pMjjvCxDaswPwejddrP2YeR5Fd5Zo+Kzp53r9Hf/eJwlZ8yFL
# 5f1dRb19NZYpW1hZuJVOP8tkPydYxAM85vkUunI7Qg4gez5KI0Nz6hQozw6ufMlQ
# r8L17fwQMsCrwcRypImYNXyyrtHlNH5Y8FjqTct8aK64Bw3e7Qqi7d3ybFAuYZ+D
# k2EJ8Rlwhbg69h+Q+ucHx4NkYu9+2MFS6G7w5EcM6xl3WHSwUxh9orlEMsIkyHS3
# OMFMTr1jjfFdEN6EafhPwFE/xKglFF2Fe3u6NoR+5pkv3UA5Z87giitxoekYecpH
# J96P3anORpWW75qvOF+nccqrd7OrUL1/yYdOyJh5Tkm0oCIeQ9E5extVf3Gne3E/
# yWzr00GJRiHFO2qbGStgKHTQLItgQpccwNpSzEdgHCqwLbXl6e3Hoq42VIFOlbN/
# ZtgpyUkuYyQ=
# =xDb+
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 19 Oct 2023 22:17:55 PDT
# gpg:                using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg:                issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg:                 aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg:                 aka "Thomas Huth <huth@tuxfamily.org>" [full]
# gpg:                 aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3  EAB9 2ED9 D774 FE70 2DB5

* tag 'pull-request-2023-10-20' of https://gitlab.com/thuth/qemu: (24 commits)
  tests/qtest/migration-test: Disable the analyze-migration.py test on s390x
  target/s390x/kvm: Simplify the GPRs, ACRs, CRs and prefix synchronization code
  target/s390x/kvm: Turn KVM_CAP_SYNC_REGS into a hard requirement
  tests/avocado: s390x cpu topology bad move
  tests/avocado: s390x cpu topology dedicated errors
  tests/avocado: s390x cpu topology test socket full
  tests/avocado: s390x cpu topology test dedicated CPU
  tests/avocado: s390x cpu topology entitlement tests
  tests/avocado: s390x cpu topology polarization
  tests/avocado: s390x cpu topology core
  docs/s390x/cpu topology: document s390x cpu topology
  qapi/s390x/cpu topology: add query-s390x-cpu-polarization command
  qapi/s390x/cpu topology: CPU_POLARIZATION_CHANGE QAPI event
  machine: adding s390 topology to info hotpluggable-cpus
  machine: adding s390 topology to query-cpu-fast
  qapi/s390x/cpu topology: set-cpu-topology qmp command
  target/s390x/cpu topology: activate CPU topology
  s390x/cpu topology: interception of PTF instruction
  s390x/cpu topology: resetting the Topology-Change-Report
  s390x/sclp: reporting the maximum nested topology entries
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-10-20 06:46:41 -07:00
Pierre Morel
0d177cdd2b docs/s390x/cpu topology: document s390x cpu topology
Add some basic examples for the definition of cpu topology
in s390x.

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
Co-developed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Message-ID: <20231016183925.2384704-15-nsg@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-10-20 07:16:53 +02:00
Pierre Morel
154893a784 qapi/s390x/cpu topology: add query-s390x-cpu-polarization command
The query-s390x-cpu-polarization qmp command returns the current
CPU polarization of the machine.

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Co-developed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Message-ID: <20231016183925.2384704-14-nsg@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-10-20 07:16:53 +02:00
Pierre Morel
1cfe52b782 qapi/s390x/cpu topology: CPU_POLARIZATION_CHANGE QAPI event
When the guest asks to change the polarization this change
is forwarded to the upper layer using QAPI.
The upper layer is supposed to take according decisions concerning
CPU provisioning.

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Co-developed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Message-ID: <20231016183925.2384704-13-nsg@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-10-20 07:16:53 +02:00
Pierre Morel
ad2d1afc1d machine: adding s390 topology to query-cpu-fast
S390x provides two more topology attributes, entitlement and dedication.

Let's add these CPU attributes to the QAPI command query-cpu-fast.

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Co-developed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Message-ID: <20231016183925.2384704-11-nsg@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-10-20 07:16:53 +02:00
Pierre Morel
a457c2ab5a qapi/s390x/cpu topology: set-cpu-topology qmp command
The modification of the CPU attributes are done through a monitor
command.

It allows to move the core inside the topology tree to optimize
the cache usage in the case the host's hypervisor previously
moved the CPU.

The same command allows to modify the CPU attributes modifiers
like polarization entitlement and the dedicated attribute to notify
the guest if the host admin modified scheduling or dedication of a vCPU.

With this knowledge the guest has the possibility to optimize the
usage of the vCPUs.

The command has a feature unstable for the moment.

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Co-developed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20231016183925.2384704-10-nsg@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-10-20 07:16:53 +02:00
Pierre Morel
f4f54b582f target/s390x/cpu topology: handle STSI(15) and build the SYSIB
On interception of STSI(15.1.x) the System Information Block
(SYSIB) is built from the list of pre-ordered topology entries.

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Co-developed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Message-ID: <20231016183925.2384704-5-nsg@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-10-20 07:16:53 +02:00
Pierre Morel
5de1aff255 CPU topology: extend with s390 specifics
S390 adds two new SMP levels, drawers and books to the CPU
topology.
S390 CPUs have specific topology features like dedication and
entitlement. These indicate to the guest information on host
vCPU scheduling and help the guest make better scheduling decisions.

Add the new levels to the relevant QAPI structs.
Add all the supported topology levels, dedication and entitlement
as properties to S390 CPUs.
Create machine-common.json so we can later include it in
machine-target.json also.

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Co-developed-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Message-ID: <20231016183925.2384704-3-nsg@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-10-20 07:16:53 +02:00
Nina Schoetterl-Glausch
3da4aef81c qapi: machine.json: change docs regarding CPU topology
Clarify roles of different architectures.
Also change things a bit in anticipation of additional members being
added.

Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com>
Message-ID: <20231016183925.2384704-2-nsg@linux.ibm.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
[thuth: Updated some comments according to suggestions from Markus]
Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-10-20 07:16:53 +02:00
Markus Armbruster
0a59c02b0c qapi: Belatedly update CompatPolicy documentation for unstable
Commit 57df0dff1a (qapi: Extend -compat to set policy for unstable
interfaces) neglected to update the "Limitation" paragraph to mention
feature 'unstable' in addition to feature 'deprecated'.  Do that now.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20231009110449.4015601-1-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2023-10-19 07:02:29 +02:00
Juan Quintela
e4ceec292f migration: Improve json and formatting
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231013104736.31722-2-quintela@redhat.com>
2023-10-17 09:25:13 +02:00
Peter Xu
8b2395970a migration: Allow user to specify available switchover bandwidth
Migration bandwidth is a very important value to live migration.  It's
because it's one of the major factors that we'll make decision on when to
switchover to destination in a precopy process.

This value is currently estimated by QEMU during the whole live migration
process by monitoring how fast we were sending the data.  This can be the
most accurate bandwidth if in the ideal world, where we're always feeding
unlimited data to the migration channel, and then it'll be limited to the
bandwidth that is available.

However in reality it may be very different, e.g., over a 10Gbps network we
can see query-migrate showing migration bandwidth of only a few tens of
MB/s just because there are plenty of other things the migration thread
might be doing.  For example, the migration thread can be busy scanning
zero pages, or it can be fetching dirty bitmap from other external dirty
sources (like vhost or KVM).  It means we may not be pushing data as much
as possible to migration channel, so the bandwidth estimated from "how many
data we sent in the channel" can be dramatically inaccurate sometimes.

With that, the decision to switchover will be affected, by assuming that we
may not be able to switchover at all with such a low bandwidth, but in
reality we can.

The migration may not even converge at all with the downtime specified,
with that wrong estimation of bandwidth, keeping iterations forever with a
low estimation of bandwidth.

The issue is QEMU itself may not be able to avoid those uncertainties on
measuing the real "available migration bandwidth".  At least not something
I can think of so far.

One way to fix this is when the user is fully aware of the available
bandwidth, then we can allow the user to help providing an accurate value.

For example, if the user has a dedicated channel of 10Gbps for migration
for this specific VM, the user can specify this bandwidth so QEMU can
always do the calculation based on this fact, trusting the user as long as
specified.  It may not be the exact bandwidth when switching over (in which
case qemu will push migration data as fast as possible), but much better
than QEMU trying to wildly guess, especially when very wrong.

A new parameter "avail-switchover-bandwidth" is introduced just for this.
So when the user specified this parameter, instead of trusting the
estimated value from QEMU itself (based on the QEMUFile send speed), it
trusts the user more by using this value to decide when to switchover,
assuming that we'll have such bandwidth available then.

Note that specifying this value will not throttle the bandwidth for
switchover yet, so QEMU will always use the full bandwidth possible for
sending switchover data, assuming that should always be the most important
way to use the network at that time.

This can resolve issues like "unconvergence migration" which is caused by
hilarious low "migration bandwidth" detected for whatever reason.

Reported-by: Zhiyi Guo <zhguo@redhat.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231010221922.40638-1-peterx@redhat.com>
2023-10-17 09:14:32 +02:00
Peter Xu
c94143e587 migration: Display error in query-migrate irrelevant of status
Display it as long as being set, irrelevant of FAILED status.  E.g., it may
also be applicable to PAUSED stage of postcopy, to provide hint on what has
gone wrong.

The error_mutex seems to be overlooked when referencing the error, add it
to be very safe.

This will change QAPI behavior by showing up error message outside !FAILED
status, but it's intended and doesn't expect to break anyone.

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2018404
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <20231004220240.167175-2-peterx@redhat.com>
2023-10-11 11:17:04 +02:00
Andrei Gudkov
320a6ccc76 migration/dirtyrate: use QEMU_CLOCK_HOST to report start-time
Currently query-dirty-rate uses QEMU_CLOCK_REALTIME as
the source for start-time field. This translates to
clock_gettime(CLOCK_MONOTONIC), i.e. number of seconds
since host boot. This is not very useful. The only
reasonable use case of start-time I can imagine is to
check whether previously completed measurements are
too old or not. But this makes sense only if start-time
is reported as host wall-clock time.

This patch replaces source of start-time from
QEMU_CLOCK_REALTIME to QEMU_CLOCK_HOST.

Signed-off-by: Andrei Gudkov <gudkov.andrei@huawei.com>
Reviewed-by: Hyman Huang <yong.huang@smartx.com>
Message-Id: <399861531e3b24a1ecea2ba453fb2c3d129fb03a.1693905328.git.gudkov.andrei@huawei.com>
Signed-off-by: Hyman Huang <yong.huang@smartx.com>
2023-10-10 08:04:12 +08:00
Andrei Gudkov
34a68001f1 migration/calc-dirty-rate: millisecond-granularity period
This patch allows to measure dirty page rate for
sub-second intervals of time. An optional argument is
introduced -- calc-time-unit. For example:
{"execute": "calc-dirty-rate", "arguments":
  {"calc-time": 500, "calc-time-unit": "millisecond"} }

Millisecond granularity allows to make predictions whether
migration will succeed or not. To do this, calculate dirty
rate with calc-time set to max allowed downtime (e.g. 300ms),
convert measured rate into volume of dirtied memory,
and divide by network throughput. If the value is lower
than max allowed downtime, then migration will converge.

Measurement results for single thread randomly writing to
a 1/4/24GiB memory region:

+----------------+-----------------------------------------------+
| calc-time      |                dirty rate MiB/s               |
| (milliseconds) +----------------+---------------+--------------+
|                | theoretical    | page-sampling | dirty-bitmap |
|                | (at 3M wr/sec) |               |              |
+----------------+----------------+---------------+--------------+
|                               1GiB                             |
+----------------+----------------+---------------+--------------+
|            100 |           6996 |          7100 |         3192 |
|            200 |           4606 |          4660 |         2655 |
|            300 |           3305 |          3280 |         2371 |
|            400 |           2534 |          2525 |         2154 |
|            500 |           2041 |          2044 |         1871 |
|            750 |           1365 |          1341 |         1358 |
|           1000 |           1024 |          1052 |         1025 |
|           1500 |            683 |           678 |          684 |
|           2000 |            512 |           507 |          513 |
+----------------+----------------+---------------+--------------+
|                               4GiB                             |
+----------------+----------------+---------------+--------------+
|            100 |          10232 |          8880 |         4070 |
|            200 |           8954 |          8049 |         3195 |
|            300 |           7889 |          7193 |         2881 |
|            400 |           6996 |          6530 |         2700 |
|            500 |           6245 |          5772 |         2312 |
|            750 |           4829 |          4586 |         2465 |
|           1000 |           3865 |          3780 |         2178 |
|           1500 |           2694 |          2633 |         2004 |
|           2000 |           2041 |          2031 |         1789 |
+----------------+----------------+---------------+--------------+
|                               24GiB                            |
+----------------+----------------+---------------+--------------+
|            100 |          11495 |          8640 |         5597 |
|            200 |          11226 |          8616 |         3527 |
|            300 |          10965 |          8386 |         2355 |
|            400 |          10713 |          8370 |         2179 |
|            500 |          10469 |          8196 |         2098 |
|            750 |           9890 |          7885 |         2556 |
|           1000 |           9354 |          7506 |         2084 |
|           1500 |           8397 |          6944 |         2075 |
|           2000 |           7574 |          6402 |         2062 |
+----------------+----------------+---------------+--------------+

Theoretical values are computed according to the following formula:
size * (1 - (1-(4096/size))^(time*wps)) / (time * 2^20),
where size is in bytes, time is in seconds, and wps is number of
writes per second.

Signed-off-by: Andrei Gudkov <gudkov.andrei@huawei.com>
Reviewed-by: Hyman Huang <yong.huang@smartx.com>
Message-Id: <d802e6b8053eb60fbec1a784cf86f67d9528e0a8.1693895970.git.gudkov.andrei@huawei.com>
Signed-off-by: Hyman Huang <yong.huang@smartx.com>
2023-10-10 08:03:50 +08:00
Andrey Drobyshev via
52b10c9c0c qemu-img: map: report compressed data blocks
Right now "qemu-img map" reports compressed blocks as containing data
but having no host offset.  This is not very informative.  Instead,
let's add another boolean field named "compressed" in case JSON output
mode is specified.  This is achieved by utilizing new allocation status
flag BDRV_BLOCK_COMPRESSED for bdrv_block_status().

Also update the expected qemu-iotests outputs to contain the new field.

Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Message-ID: <20230907210226.953821-3-andrey.drobyshev@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-09-20 17:46:01 +02:00
Stefan Hajnoczi
4907644841 Hi,
"Host Memory Backends" and "Memory devices" queue ("mem"):
 - Support and document VM templating with R/O files using a new "rom"
   parameter for memory-backend-file
 - Some cleanups and fixes around NVDIMMs and R/O file handling for guest
   RAM
 - Optimize ioeventfd updates by skipping address spaces that are not
   applicable
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEEG9nKrXNcTDpGDfzKTd4Q9wD/g1oFAmUJdykRHGRhdmlkQHJl
 ZGhhdC5jb20ACgkQTd4Q9wD/g1pf2w//akOUoYMuamySGjXtKLVyMKZkjIys+Ama
 k2C0xzsWAHBP572ezwHi8uxf5j9kzAjsw6GxDZ7FAamD9MhiohkEvkecloBx6f/c
 q3fVHblBNkG7v2urtf4+6PJtJvhzOST2SFXfWeYhO/vaA04AYCDgexv82JN3gA6B
 OS8WyOX62b8wILPSY2GLZ8IqpE9XnOYZwzVBn6YB1yo7ZkYEfXO6cA8nykNuNcOE
 vppqDo7uVIX6317FWj8ygxmzFfOaj0WT2MT2XFzEIDfg8BInQN8HC4mTn0hcVKMa
 N1y+eZH733CQKT+uNBRZ5YOeljOi4d6gEEyvkkA/L7e5D3Qg9hIdvHb4uryCFSWX
 Vt07OP1XLBwCZFobOC6sg+2gtTZJxxYK89e6ZzEd0454S24w5bnEteRAaCGOP0XL
 ww9xYULqhtZs55UC4rvZHJwdUAk1fIY4VqynwkeQXegvz6BxedNeEkJiiEU0Tizx
 N2VpsxAJ7H/LLSFeZoCRESo4azrH6U4n7S/eS1tkCniFqibfe2yIQCDoJVfb42ec
 gfg/vThCrDwHkIHzkMmoV8NndA7Q7SIkyMfYeEEBeZMeg8JzYll4DJEw/jQCacxh
 KRUa+AZvGlTJUq0mkvyOVfLki+iaehoIUuY1yvMrmdWijPO8n3YybmP9Ljhr8VdR
 9MSYZe+I2v8=
 =iraT
 -----END PGP SIGNATURE-----

Merge tag 'mem-2023-09-19' of https://github.com/davidhildenbrand/qemu into staging

Hi,

"Host Memory Backends" and "Memory devices" queue ("mem"):
- Support and document VM templating with R/O files using a new "rom"
  parameter for memory-backend-file
- Some cleanups and fixes around NVDIMMs and R/O file handling for guest
  RAM
- Optimize ioeventfd updates by skipping address spaces that are not
  applicable

# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEG9nKrXNcTDpGDfzKTd4Q9wD/g1oFAmUJdykRHGRhdmlkQHJl
# ZGhhdC5jb20ACgkQTd4Q9wD/g1pf2w//akOUoYMuamySGjXtKLVyMKZkjIys+Ama
# k2C0xzsWAHBP572ezwHi8uxf5j9kzAjsw6GxDZ7FAamD9MhiohkEvkecloBx6f/c
# q3fVHblBNkG7v2urtf4+6PJtJvhzOST2SFXfWeYhO/vaA04AYCDgexv82JN3gA6B
# OS8WyOX62b8wILPSY2GLZ8IqpE9XnOYZwzVBn6YB1yo7ZkYEfXO6cA8nykNuNcOE
# vppqDo7uVIX6317FWj8ygxmzFfOaj0WT2MT2XFzEIDfg8BInQN8HC4mTn0hcVKMa
# N1y+eZH733CQKT+uNBRZ5YOeljOi4d6gEEyvkkA/L7e5D3Qg9hIdvHb4uryCFSWX
# Vt07OP1XLBwCZFobOC6sg+2gtTZJxxYK89e6ZzEd0454S24w5bnEteRAaCGOP0XL
# ww9xYULqhtZs55UC4rvZHJwdUAk1fIY4VqynwkeQXegvz6BxedNeEkJiiEU0Tizx
# N2VpsxAJ7H/LLSFeZoCRESo4azrH6U4n7S/eS1tkCniFqibfe2yIQCDoJVfb42ec
# gfg/vThCrDwHkIHzkMmoV8NndA7Q7SIkyMfYeEEBeZMeg8JzYll4DJEw/jQCacxh
# KRUa+AZvGlTJUq0mkvyOVfLki+iaehoIUuY1yvMrmdWijPO8n3YybmP9Ljhr8VdR
# 9MSYZe+I2v8=
# =iraT
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 19 Sep 2023 06:25:45 EDT
# gpg:                using RSA key 1BD9CAAD735C4C3A460DFCCA4DDE10F700FF835A
# gpg:                issuer "david@redhat.com"
# gpg: Good signature from "David Hildenbrand <david@redhat.com>" [unknown]
# gpg:                 aka "David Hildenbrand <davidhildenbrand@gmail.com>" [full]
# gpg:                 aka "David Hildenbrand <hildenbr@in.tum.de>" [unknown]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 1BD9 CAAD 735C 4C3A 460D  FCCA 4DDE 10F7 00FF 835A

* tag 'mem-2023-09-19' of https://github.com/davidhildenbrand/qemu:
  memory: avoid updating ioeventfds for some address_space
  machine: Improve error message when using default RAM backend id
  softmmu/physmem: Hint that "readonly=on,rom=off" exists when opening file R/W for private mapping fails
  docs: Start documenting VM templating
  docs: Don't mention "-mem-path" in multi-process.rst
  softmmu/physmem: Never return directories from file_ram_open()
  softmmu/physmem: Fail creation of new files in file_ram_open() with readonly=true
  softmmu/physmem: Bail out early in ram_block_discard_range() with readonly files
  softmmu/physmem: Remap with proper protection in qemu_ram_remap()
  backends/hostmem-file: Add "rom" property to support VM templating with R/O files
  softmmu/physmem: Distinguish between file access mode and mmap protection
  nvdimm: Reject writing label data to ROM instead of crashing QEMU

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-09-19 13:22:19 -04:00
David Hildenbrand
e92666b0ba backends/hostmem-file: Add "rom" property to support VM templating with R/O files
For now, "share=off,readonly=on" would always result in us opening the
file R/O and mmap'ing the opened file MAP_PRIVATE R/O -- effectively
turning it into ROM.

Especially for VM templating, "share=off" is a common use case. However,
that use case is impossible with files that lack write permissions,
because "share=off,readonly=on" will not give us writable RAM.

The sole user of ROM via memory-backend-file are R/O NVDIMMs, but as we
have users (Kata Containers) that rely on the existing behavior --
malicious VMs should not be able to consume COW memory for R/O NVDIMMs --
we cannot change the semantics of "share=off,readonly=on"

So let's add a new "rom" property with on/off/auto values. "auto" is
the default and what most people will use: for historical reasons, to not
change the old semantics, it defaults to the value of the "readonly"
property.

For VM templating, one can now use:
    -object memory-backend-file,share=off,readonly=on,rom=off,...

But we'll disallow:
    -object memory-backend-file,share=on,readonly=on,rom=off,...
because we would otherwise get an error when trying to mmap the R/O file
shared and writable. An explicit error message is cleaner.

We will also disallow for now:
    -object memory-backend-file,share=off,readonly=off,rom=on,...
    -object memory-backend-file,share=on,readonly=off,rom=on,...
It's not harmful, but also not really required for now.

Alternatives that were abandoned:
* Make "unarmed=on" for the NVDIMM set the memory region container
  readonly. We would still see a change of ROM->RAM and possibly run
  into memslot limits with vhost-user. Further, there might be use cases
  for "unarmed=on" that should still allow writing to that memory
  (temporary files, system RAM, ...).
* Add a new "readonly=on/off/auto" parameter for NVDIMMs. Similar issues
  as with "unarmed=on".
* Make "readonly" consume "on/off/file" instead of being a 'bool' type.
  This would slightly changes the behavior of the "readonly" parameter:
  values like true/false (as accepted by a 'bool'type) would no longer be
  accepted.

Message-ID: <20230906120503.359863-4-david@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2023-09-19 10:23:21 +02:00
Ilya Maximets
cb039ef3d9 net: add initial support for AF_XDP network backend
AF_XDP is a network socket family that allows communication directly
with the network device driver in the kernel, bypassing most or all
of the kernel networking stack.  In the essence, the technology is
pretty similar to netmap.  But, unlike netmap, AF_XDP is Linux-native
and works with any network interfaces without driver modifications.
Unlike vhost-based backends (kernel, user, vdpa), AF_XDP doesn't
require access to character devices or unix sockets.  Only access to
the network interface itself is necessary.

This patch implements a network backend that communicates with the
kernel by creating an AF_XDP socket.  A chunk of userspace memory
is shared between QEMU and the host kernel.  4 ring buffers (Tx, Rx,
Fill and Completion) are placed in that memory along with a pool of
memory buffers for the packet data.  Data transmission is done by
allocating one of the buffers, copying packet data into it and
placing the pointer into Tx ring.  After transmission, device will
return the buffer via Completion ring.  On Rx, device will take
a buffer form a pre-populated Fill ring, write the packet data into
it and place the buffer into Rx ring.

AF_XDP network backend takes on the communication with the host
kernel and the network interface and forwards packets to/from the
peer device in QEMU.

Usage example:

  -device virtio-net-pci,netdev=guest1,mac=00:16:35:AF:AA:5C
  -netdev af-xdp,ifname=ens6f1np1,id=guest1,mode=native,queues=1

XDP program bridges the socket with a network interface.  It can be
attached to the interface in 2 different modes:

1. skb - this mode should work for any interface and doesn't require
         driver support.  With a caveat of lower performance.

2. native - this does require support from the driver and allows to
            bypass skb allocation in the kernel and potentially use
            zero-copy while getting packets in/out userspace.

By default, QEMU will try to use native mode and fall back to skb.
Mode can be forced via 'mode' option.  To force 'copy' even in native
mode, use 'force-copy=on' option.  This might be useful if there is
some issue with the driver.

Option 'queues=N' allows to specify how many device queues should
be open.  Note that all the queues that are not open are still
functional and can receive traffic, but it will not be delivered to
QEMU.  So, the number of device queues should generally match the
QEMU configuration, unless the device is shared with something
else and the traffic re-direction to appropriate queues is correctly
configured on a device level (e.g. with ethtool -N).
'start-queue=M' option can be used to specify from which queue id
QEMU should start configuring 'N' queues.  It might also be necessary
to use this option with certain NICs, e.g. MLX5 NICs.  See the docs
for examples.

In a general case QEMU will need CAP_NET_ADMIN and CAP_SYS_ADMIN
or CAP_BPF capabilities in order to load default XSK/XDP programs to
the network interface and configure BPF maps.  It is possible, however,
to run with no capabilities.  For that to work, an external process
with enough capabilities will need to pre-load default XSK program,
create AF_XDP sockets and pass their file descriptors to QEMU process
on startup via 'sock-fds' option.  Network backend will need to be
configured with 'inhibit=on' to avoid loading of the program.
QEMU will need 32 MB of locked memory (RLIMIT_MEMLOCK) per queue
or CAP_IPC_LOCK.

There are few performance challenges with the current network backends.

First is that they do not support IO threads.  This means that data
path is handled by the main thread in QEMU and may slow down other
work or may be slowed down by some other work.  This also means that
taking advantage of multi-queue is generally not possible today.

Another thing is that data path is going through the device emulation
code, which is not really optimized for performance.  The fastest
"frontend" device is virtio-net.  But it's not optimized for heavy
traffic either, because it expects such use-cases to be handled via
some implementation of vhost (user, kernel, vdpa).  In practice, we
have virtio notifications and rcu lock/unlock on a per-packet basis
and not very efficient accesses to the guest memory.  Communication
channels between backend and frontend devices do not allow passing
more than one packet at a time as well.

Some of these challenges can be avoided in the future by adding better
batching into device emulation or by implementing vhost-af-xdp variant.

There are also a few kernel limitations.  AF_XDP sockets do not
support any kinds of checksum or segmentation offloading.  Buffers
are limited to a page size (4K), i.e. MTU is limited.  Multi-buffer
support implementation for AF_XDP is in progress, but not ready yet.
Also, transmission in all non-zero-copy modes is synchronous, i.e.
done in a syscall.  That doesn't allow high packet rates on virtual
interfaces.

However, keeping in mind all of these challenges, current implementation
of the AF_XDP backend shows a decent performance while running on top
of a physical NIC with zero-copy support.

Test setup:

2 VMs running on 2 physical hosts connected via ConnectX6-Dx card.
Network backend is configured to open the NIC directly in native mode.
The driver supports zero-copy.  NIC is configured to use 1 queue.

Inside a VM - iperf3 for basic TCP performance testing and dpdk-testpmd
for PPS testing.

iperf3 result:
 TCP stream      : 19.1 Gbps

dpdk-testpmd (single queue, single CPU core, 64 B packets) results:
 Tx only         : 3.4 Mpps
 Rx only         : 2.0 Mpps
 L2 FWD Loopback : 1.5 Mpps

In skb mode the same setup shows much lower performance, similar to
the setup where pair of physical NICs is replaced with veth pair:

iperf3 result:
  TCP stream      : 9 Gbps

dpdk-testpmd (single queue, single CPU core, 64 B packets) results:
  Tx only         : 1.2 Mpps
  Rx only         : 1.0 Mpps
  L2 FWD Loopback : 0.7 Mpps

Results in skb mode or over the veth are close to results of a tap
backend with vhost=on and disabled segmentation offloading bridged
with a NIC.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> (docker/lcitool)
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-09-18 14:36:13 +08:00
Marc-André Lureau
32aa1f8dee ui/vc: do not parse VC-specific options in Spice and GTK
In commit 6f974c843c ("gtk: overwrite the console.c char driver"), I
shared the VC console parse handler with GTK. And later on in commit
d8aec9d9 ("display: add -display spice-app launching a Spice client"),
I also used it to handle spice-app VC.

This is not necessary, the VC console options (width/height/cols/rows)
are specific, and unused by tty-level GTK/Spice VC.

This is not a breaking change, as those options are still being parsed
by QAPI ChardevVC. Adjust the documentation about it.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20230830093843.3531473-44-marcandre.lureau@redhat.com>
2023-09-04 14:57:37 +04:00
Hyman Huang(黄勇)
ef96537732 qapi: Craft the dirty-limit capability comment
Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com>
Message-ID: <169073570563.19893.2928364761104733482-2@git.sr.ht>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2023-08-02 09:33:38 +02:00
Hyman Huang(黄勇)
8abc81150f qapi: Reformat the dirty-limit migration doc comments
Reformat the dirty-limit migration doc comments to conform
to current conventions as commit a937b6aa73 (qapi: Reformat
doc comments to conform to current conventions).

Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com>
Message-ID: <169073570563.19893.2928364761104733482-1@git.sr.ht>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Whitespace tidied up]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2023-08-02 09:33:34 +02:00
Richard Henderson
ccdd312676 QAPI patches patches for 2023-07-26
-----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEENUvIs9frKmtoZ05fOHC0AOuRhlMFAmTBFvUSHGFybWJydUBy
 ZWRoYXQuY29tAAoJEDhwtADrkYZTML4QAKhHciLnEudtZ6SFSqpOgt80IJnw8a+r
 z1AowVYtgPhlZ8TtQJFXpBtAZtKu8xb/QdFxomm4bdNQnWX6CXCoheF5ZJ9V3Rrz
 A3pA1wt5KTnRif6R9/Rs1dYXEr4cWagg1UNT3g2eOV3fvdDHvJMPOsqK/jWeXuC1
 T94yFMv1bZSLyiLgB7QQNYDZhIWQ06RGU6tZdWaZQReA8N8maXiZN5NnUISK32Rq
 L2X0FtgzyJQ+dLHtbXOw6kIwZdOLNauOM78skZoiZUyFVaH2aDUIg3mnfRw36hN6
 feXGtw68PkTQGexKmonPDljIacfMDApmNBelLwsvB9MTrwVV+hKZPy1ZEwPIFDJ9
 yid63pp2CtQ1TZ3dSjZ1cGbRR+g2NI5X4g1DlcFPAxydMkv9/m5NwQx8OYqVIzqg
 VXeS0++O2BM5+ORjlJxMx3RsyH2O1I8DCfwmifzYSo+3Xg/4nCV3f38czbavjCfJ
 4T3ooZx0+PRtjlOlfZTkgxV14TMV+XzQr3bsN4wbPdnjnueSE1tyoVGy8MwQ5aXi
 2oAsjrR8g7iqU6f+6PyRNn5F6D0ge+AYQ7bYS51i3Hyih/y2QUJECpL3XAgOxREb
 /68SEtr4m/GJvmQNdwwwu6e1JFo8LknwMfkfzQAOCK1npAJGsWPmJ6iY7KtWgS8F
 oDwqng/WOhvV
 =mNMX
 -----END PGP SIGNATURE-----

Merge tag 'pull-qapi-2023-07-26-v2' of https://repo.or.cz/qemu/armbru into staging

QAPI patches patches for 2023-07-26

# -----BEGIN PGP SIGNATURE-----
#
# iQJGBAABCAAwFiEENUvIs9frKmtoZ05fOHC0AOuRhlMFAmTBFvUSHGFybWJydUBy
# ZWRoYXQuY29tAAoJEDhwtADrkYZTML4QAKhHciLnEudtZ6SFSqpOgt80IJnw8a+r
# z1AowVYtgPhlZ8TtQJFXpBtAZtKu8xb/QdFxomm4bdNQnWX6CXCoheF5ZJ9V3Rrz
# A3pA1wt5KTnRif6R9/Rs1dYXEr4cWagg1UNT3g2eOV3fvdDHvJMPOsqK/jWeXuC1
# T94yFMv1bZSLyiLgB7QQNYDZhIWQ06RGU6tZdWaZQReA8N8maXiZN5NnUISK32Rq
# L2X0FtgzyJQ+dLHtbXOw6kIwZdOLNauOM78skZoiZUyFVaH2aDUIg3mnfRw36hN6
# feXGtw68PkTQGexKmonPDljIacfMDApmNBelLwsvB9MTrwVV+hKZPy1ZEwPIFDJ9
# yid63pp2CtQ1TZ3dSjZ1cGbRR+g2NI5X4g1DlcFPAxydMkv9/m5NwQx8OYqVIzqg
# VXeS0++O2BM5+ORjlJxMx3RsyH2O1I8DCfwmifzYSo+3Xg/4nCV3f38czbavjCfJ
# 4T3ooZx0+PRtjlOlfZTkgxV14TMV+XzQr3bsN4wbPdnjnueSE1tyoVGy8MwQ5aXi
# 2oAsjrR8g7iqU6f+6PyRNn5F6D0ge+AYQ7bYS51i3Hyih/y2QUJECpL3XAgOxREb
# /68SEtr4m/GJvmQNdwwwu6e1JFo8LknwMfkfzQAOCK1npAJGsWPmJ6iY7KtWgS8F
# oDwqng/WOhvV
# =mNMX
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 26 Jul 2023 05:52:05 AM PDT
# gpg:                using RSA key 354BC8B3D7EB2A6B68674E5F3870B400EB918653
# gpg:                issuer "armbru@redhat.com"
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [undefined]
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* tag 'pull-qapi-2023-07-26-v2' of https://repo.or.cz/qemu/armbru:
  qapi: Reformat recent doc comments to conform to current conventions
  qapi/trace: Tidy up trace-event-get-state, -set-state documentation
  qapi/qdev: Tidy up device_add documentation
  qapi/block: Tidy up block-latency-histogram-set documentation
  qapi/block-core: Tidy up BlockLatencyHistogramInfo documentation

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-07-26 07:16:19 -07:00
Markus Armbruster
9e272073e1 qapi: Reformat recent doc comments to conform to current conventions
Since commit a937b6aa73 (qapi: Reformat doc comments to conform to
current conventions), a number of comments not conforming to the
current formatting conventions were added.  No problem, just sweep
the entire documentation once more.

To check the generated documentation does not change, I compared the
generated HTML before and after this commit with "wdiff -3".  Finds no
differences.  Comparing with diff is not useful, as the reflown
paragraphs are visible there.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20230720071610.1096458-7-armbru@redhat.com>
2023-07-26 14:51:36 +02:00
Markus Armbruster
e27a9d628d qapi/trace: Tidy up trace-event-get-state, -set-state documentation
trace-event-set-state's explanation of how events are selected is
under "Features".  Doesn't belong there.  Simply delete it, as it
feels redundant with documentation of member @name.

trace-event-get-state's explanation is under "Returns".  Tolerable,
but similarly redundant.  Delete it, too.

Cc: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20230720071610.1096458-5-armbru@redhat.com>
2023-07-26 14:51:36 +02:00
Markus Armbruster
a9c72efd6d qapi/qdev: Tidy up device_add documentation
The notes section comes out like this:

    Notes

    Additional arguments depend on the type.

    1. For detailed information about this command, please refer to the
       ‘docs/qdev-device-use.txt’ file.

    2. It’s possible to list device properties by running QEMU with the
       “-device DEVICE,help” command-line argument, where DEVICE is the
       device’s name

The first item isn't numbered.  Fix that:

    1. Additional arguments depend on the type.

    2. For detailed information about this command, please refer to the
       ‘docs/qdev-device-use.txt’ file.

    3. It’s possible to list device properties by running QEMU with the
       “-device DEVICE,help” command-line argument, where DEVICE is the
       device’s name

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20230720071610.1096458-4-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2023-07-26 14:51:36 +02:00
Markus Armbruster
e893b9e3b3 qapi/block: Tidy up block-latency-histogram-set documentation
Examples come out like

    Example

       set new histograms for all io types with intervals [0, 10), [10,
       50), [50, 100), [100, +inf):

The sentence "set new histograms ..." starts with a lower case letter.
Capitalize it.  Same for the other examples.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20230720071610.1096458-3-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2023-07-26 14:51:36 +02:00
Markus Armbruster
dad3c9565d qapi/block-core: Tidy up BlockLatencyHistogramInfo documentation
Documentation for member @bin comes out like

    list of io request counts corresponding to histogram intervals.
    len("bins") = len("boundaries") + 1 For the example above, "bins"
    may be something like [3, 1, 5, 2], and corresponding histogram
    looks like:

Note how the equation and the sentence following it run together.
Replace the equation:

    list of io request counts corresponding to histogram intervals,
    one more element than "boundaries" has.  For the example above,
    "bins" may be something like [3, 1, 5, 2], and corresponding
    histogram looks like:

Cc: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20230720071610.1096458-2-armbru@redhat.com>
[Off by one fixed]
2023-07-26 14:50:16 +02:00
Juan Quintela
7b24d32634 migration: skipped field is really obsolete.
Has return zero for more than 10 years.

Specifically we introduced the field in 1.5.0

commit f1c72795af
Author: Peter Lieven <pl@kamp.de>
Date:   Tue Mar 26 10:58:37 2013 +0100

    migration: do not sent zero pages in bulk stage

    during bulk stage of ram migration if a page is a
    zero page do not send it at all.
    the memory at the destination reads as zero anyway.

    even if there is an madvise with QEMU_MADV_DONTNEED
    at the target upon receipt of a zero page I have observed
    that the target starts swapping if the memory is overcommitted.
    it seems that the pages are dropped asynchronously.

    this patch also updates QMP to return the number of
    skipped pages in MigrationStats.

but removed its usage in 1.5.3

commit 9ef051e553
Author: Peter Lieven <pl@kamp.de>
Date:   Mon Jun 10 12:14:19 2013 +0200

    Revert "migration: do not sent zero pages in bulk stage"

    Not sending zero pages breaks migration if a page is zero
    at the source but not at the destination. This can e.g. happen
    if different BIOS versions are used at source and destination.
    It has also been reported that migration on pseries is completely
    broken with this patch.

    This effectively reverts commit f1c72795af.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20230612193344.3796-2-quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-07-26 10:55:56 +02:00
Hyman Huang(黄勇)
15699cf542 migration: Extend query-migrate to provide dirty page limit info
Extend query-migrate to provide throttle time and estimated
ring full time with dirty-limit capability enabled, through which
we can observe if dirty limit take effect during live migration.

Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-ID: <168733225273.5845.15871826788879741674-8@git.sr.ht>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-07-26 10:55:56 +02:00
Hyman Huang(黄勇)
dc62395557 migration: Introduce dirty-limit capability
Introduce migration dirty-limit capability, which can
be turned on before live migration and limit dirty
page rate durty live migration.

Introduce migrate_dirty_limit function to help check
if dirty-limit capability enabled during live migration.

Meanwhile, refactor vcpu_dirty_rate_stat_collect
so that period can be configured instead of hardcoded.

dirty-limit capability is kind of like auto-converge
but using dirty limit instead of traditional cpu-throttle
to throttle guest down. To enable this feature, turn on
the dirty-limit capability before live migration using
migrate-set-capabilities, and set the parameters
"x-vcpu-dirty-limit-period", "vcpu-dirty-limit" suitably
to speed up convergence.

Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <168618975839.6361.17407633874747688653-4@git.sr.ht>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-07-26 10:55:56 +02:00
Hyman Huang(黄勇)
09f9ec9913 qapi/migration: Introduce vcpu-dirty-limit parameters
Introduce "vcpu-dirty-limit" migration parameter used
to limit dirty page rate during live migration.

"vcpu-dirty-limit" and "x-vcpu-dirty-limit-period" are
two dirty-limit-related migration parameters, which can
be set before and during live migration by qmp
migrate-set-parameters.

This two parameters are used to help implement the dirty
page rate limit algo of migration.

Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <168618975839.6361.17407633874747688653-3@git.sr.ht>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-07-26 10:55:56 +02:00
Hyman Huang(黄勇)
4d80785719 qapi/migration: Introduce x-vcpu-dirty-limit-period parameter
Introduce "x-vcpu-dirty-limit-period" migration experimental
parameter, which is in the range of 1 to 1000ms and used to
make dirtyrate calculation period configurable.

Currently with the "x-vcpu-dirty-limit-period" varies, the
total time of live migration changes, test results show the
optimal value of "x-vcpu-dirty-limit-period" ranges from
500ms to 1000 ms. "x-vcpu-dirty-limit-period" should be made
stable once it proves best value can not be determined with
developer's experiments.

Signed-off-by: Hyman Huang(黄勇) <yong.huang@smartx.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <168618975839.6361.17407633874747688653-2@git.sr.ht>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-07-26 10:55:56 +02:00