migration_channel_read_peek() calls qio_channel_readv_full() and handles
both cases of return value == 0 and return value < 0 the same way, by
calling error_setg() with errp. However, if return value < 0, errp is
already set, so calling error_setg() with errp will lead to an assert.
Fix it by handling these cases separately, calling error_setg() with
errp only in return value == 0 case.
Fixes: 6720c2b327 ("migration: check magic value for deciding the mapping of channels")
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Link: https://lore.kernel.org/r/20231231093016.14204-10-avihaih@nvidia.com
Signed-off-by: Peter Xu <peterx@redhat.com>
If multifd_load_setup() fails in migration_ioc_process_incoming(),
error_setg() is called with errp. This will lead to an assert because in
that case errp already contains an error.
Fix it by removing the redundant error_setg().
Fixes: 6720c2b327 ("migration: check magic value for deciding the mapping of channels")
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20231231093016.14204-9-avihaih@nvidia.com
Signed-off-by: Peter Xu <peterx@redhat.com>
If there is an error in multifd TLS handshake task,
multifd_tls_outgoing_handshake() retrieves the error with
qio_task_propagate_error() but never frees it.
Fix it by freeing the obtained Error.
In addition, the error is not reported at all, so report it with
migrate_set_error().
Fixes: 2964714015 ("migration/tls: add support for multifd tls-handshake")
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20231231093016.14204-8-avihaih@nvidia.com
Signed-off-by: Peter Xu <peterx@redhat.com>
The else branch in multifd_channel_connect() is redundant because when
the if branch is taken the function returns.
Simplify the code by removing the else branch.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Link: https://lore.kernel.org/r/20231231093016.14204-7-avihaih@nvidia.com
Signed-off-by: Peter Xu <peterx@redhat.com>
In multifd_recv_initial_packet(), if MultiFDInit_t->id is greater than
the configured number of multifd channels, an irrelevant error message
about multifd version is printed.
Change the error message to a relevant one about the channel id.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20231231093016.14204-6-avihaih@nvidia.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Commit 6720c2b327 ("migration: check magic value for deciding the
mapping of channels") extracted the only code that could fail in
migration_incoming_setup().
Now migration_incoming_setup() can't fail, so refactor it to return void
and remove errp parameter.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Link: https://lore.kernel.org/r/20231231093016.14204-4-avihaih@nvidia.com
Signed-off-by: Peter Xu <peterx@redhat.com>
MigrationState->hostname is set to NULL in migrate_init(). This is
redundant because it is already freed and set to NULL in
migrade_fd_cleanup(). Remove it.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20231231093016.14204-3-avihaih@nvidia.com
Signed-off-by: Peter Xu <peterx@redhat.com>
migrate_max_downtime() has been removed long ago, but its declaration
was mistakenly left. Remove it.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20231231093016.14204-2-avihaih@nvidia.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Add a test case to verify that the suspended state is handled correctly by
live migration postcopy. The test suspends the src, migrates, then wakes
the dest.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/1704312341-66640-13-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Add a test case to verify that the suspended state is handled correctly
during live migration precopy. The test suspends the src, migrates, then
wakes the dest.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/1704312341-66640-12-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Add an option to suspend the src in a-b-bootblock.S, which puts the guest
in S3 state after one round of writing to memory. The option is enabled by
poking a 1 into the suspend_me word in the boot block prior to starting the
src vm. Generate symbol offsets in a-b-bootblock.h so that the suspend_me
offset is known. Generate the bootblock for each test, because suspend_me
may differ for each.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/1704312341-66640-11-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Define a state object to capture events seen by migration tests, to allow
more events to be captured in a subsequent patch, and simplify event
checking in wait_for_migration_pass. No functional change.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: "Daniel P. Berrangé" <berrange@redhat.com>
Link: https://lore.kernel.org/r/1704312341-66640-10-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Do not wake a suspended guest during bg_migration, and restore the prior
state at finish rather than unconditionally running. Allow the additional
state transitions that occur.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/1704312341-66640-9-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Restoring a snapshot can break a suspended guest. Snapshots suffer from
the same suspended-state issues that affect live migration, plus they must
handle an additional problematic scenario, which is that a running vm must
remain running if it loads a suspended snapshot.
To save, the existing vm_stop call now completely stops the suspended
state. Finish with vm_resume to leave the vm in the state it had prior
to the save, correctly restoring the suspended state.
To load, if the snapshot is not suspended, then vm_stop + vm_resume
correctly handles all states, and leaves the vm in the state it had prior
to the load. However, if the snapshot is suspended, restoration is
trickier. First, call vm_resume to restore the state to suspended so the
current state matches the saved state. Then, if the pre-load state is
running, call wakeup to resume running.
Prior to these changes, the vm_stop to RUN_STATE_SAVE_VM and
RUN_STATE_RESTORE_VM did not change runstate if the current state was
suspended, but now it does, so allow these transitions.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/1704312341-66640-8-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
A guest that is migrated in the suspended state automaticaly wakes and
continues execution. This is wrong; the guest should end migration in
the same state it started. The root cause is that the outgoing migration
code automatically wakes the guest, then saves the RUNNING runstate in
global_state_store(), hence the incoming migration code thinks the guest is
running and continues the guest if autostart is true.
On the outgoing side, delete the call to qemu_system_wakeup_request().
Now that vm_stop completely stops a vm in the suspended state (from the
preceding patches), the existing call to vm_stop_force_state is sufficient
to correctly migrate all vmstate.
On the incoming side, call vm_start if the pre-migration state was running
or suspended. For the latter, vm_start correctly restores the suspended
state, and a future system_wakeup monitor request will cause the vm to
resume running.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/1704312341-66640-7-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
If the outgoing machine was previously suspended, propagate that to the
incoming side via global_state, so a subsequent vm_start restores the
suspended state. To maintain backward and forward compatibility, reclaim
some space from the runstate member.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/1704312341-66640-6-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
When a vm transitions from running to suspended, runstate notifiers are
not called, so the notifiers still think the vm is running. Hence, when
we call vm_start to restore the suspended state, we call vm_state_notify
with running=1. However, some notifiers check for RUN_STATE_RUNNING.
They must check the running boolean instead.
No functional change.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/1704312341-66640-4-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Currently, a vm in the suspended state is not completely stopped. The VCPUs
have been paused, but the cpu clock still runs, and runstate notifiers for
the transition to stopped have not been called. This causes problems for
live migration. Stale cpu timers_state is saved to the migration stream,
causing time errors in the guest when it wakes from suspend, and state that
would have been modified by runstate notifiers is wrong.
Modify vm_stop to completely stop the vm if the current state is suspended,
transition to RUN_STATE_PAUSED, and remember that the machine was suspended.
Modify vm_start to restore the suspended state.
This affects all callers of vm_stop and vm_start, notably, the qapi stop and
cont commands:
old behavior:
RUN_STATE_SUSPENDED --> stop --> RUN_STATE_SUSPENDED
new behavior:
RUN_STATE_SUSPENDED --> stop --> RUN_STATE_PAUSED
RUN_STATE_PAUSED --> cont --> RUN_STATE_SUSPENDED
For example:
(qemu) info status
VM status: paused (suspended)
(qemu) stop
(qemu) info status
VM status: paused
(qemu) system_wakeup
Error: Unable to wake up: guest is not in suspended state
(qemu) cont
(qemu) info status
VM status: paused (suspended)
(qemu) system_wakeup
(qemu) info status
VM status: running
Suggested-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Link: https://lore.kernel.org/r/1704312341-66640-3-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
I am currently focusing in kernel development, so I will probably not be
of much help in reviewing general Live Migration changes.
For above reason I am removing my Reviewer status from Migration and RDMA
Migration.
Signed-off-by: Leonardo Bras <leobras@redhat.com>
Link: https://lore.kernel.org/r/20231221170739.332378-1-leobras@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
I am leaving Red Hat, and as part of that I am leaving Migration
maintenarship.
You are left in good hands with Peter and Fabiano.
Thanks for all the fish.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Link: https://lore.kernel.org/r/20240102201908.1987-2-quintela@redhat.com
[peterx: prefix the subject]
Signed-off-by: Peter Xu <peterx@redhat.com>
It is not useful when configuring with --enable-trace-backends=nop.
Signed-off-by: Carlos Santos <casantos@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230408010410.281263-1-casantos@redhat.com>
vhost-scsi support for worker ioctls
fixes, cleanups all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmWKohIPHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRpG2YH/1rJGV8TQm4V8kcGP9wOknPAMFADnEFdFmrB
V+JEDnyKrdcEZLPRh0b846peWRJhC13iL7Ks3VNjeVsfE9TyzNyNDpUzCJPfYFjR
3m8ChLDvE9tKBA5/hXMIcgDXaYcPIrPvHyl4HG8EQn7oaeMpS2uecKqDpDDvNXGq
oNamNvqimFSqA+3ChzA+0Qt07Ts7xFEw4OEXSwfRXlsam/dhQG0SI+crRheHuvFb
HR8EwmNydA1D/M51AuBNuvX36u3SnPWm7Anp5711SZ1b59unshI0ztIqIJnGkvYe
qpUJSmxR6ulwWe4nQfb+GhBsuJ2j2ORC7YfXyAT7mw8rds8loaI=
=cNy2
-----END PGP SIGNATURE-----
Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging
virtio,pc,pci: features, cleanups, fixes
vhost-scsi support for worker ioctls
fixes, cleanups all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmWKohIPHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRpG2YH/1rJGV8TQm4V8kcGP9wOknPAMFADnEFdFmrB
# V+JEDnyKrdcEZLPRh0b846peWRJhC13iL7Ks3VNjeVsfE9TyzNyNDpUzCJPfYFjR
# 3m8ChLDvE9tKBA5/hXMIcgDXaYcPIrPvHyl4HG8EQn7oaeMpS2uecKqDpDDvNXGq
# oNamNvqimFSqA+3ChzA+0Qt07Ts7xFEw4OEXSwfRXlsam/dhQG0SI+crRheHuvFb
# HR8EwmNydA1D/M51AuBNuvX36u3SnPWm7Anp5711SZ1b59unshI0ztIqIJnGkvYe
# qpUJSmxR6ulwWe4nQfb+GhBsuJ2j2ORC7YfXyAT7mw8rds8loaI=
# =cNy2
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 26 Dec 2023 04:51:14 EST
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (21 commits)
vdpa: move memory listener to vhost_vdpa_shared
vdpa: use dev_shared in vdpa_iommu
vdpa: use VhostVDPAShared in vdpa_dma_map and unmap
vdpa: move iommu_list to vhost_vdpa_shared
vdpa: remove msg type of vhost_vdpa
vdpa: move backend_cap to vhost_vdpa_shared
vdpa: move iotlb_batch_begin_sent to vhost_vdpa_shared
vdpa: move file descriptor to vhost_vdpa_shared
vdpa: use vdpa shared for tracing
vdpa: move shadow_data to vhost_vdpa_shared
vdpa: move iova_range to vhost_vdpa_shared
vdpa: move iova tree to the shared struct
vdpa: add VhostVDPAShared
vdpa: do not set virtio status bits if unneeded
Fix bugs when VM shutdown with virtio-gpu unplugged
vhost-scsi: fix usage of error_reportf_err()
hw/acpi: propagate vcpu hotplug after switch to modern interface
vhost-scsi: Add support for a worker thread per virtqueue
vhost: Add worker backend callouts
tests: bios-tables-test: Rename smbios type 4 related test functions
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Next patches will register the vhost_vdpa memory listener while the VM
is migrating at the destination, so we can map the memory to the device
before stopping the VM at the source. The main goal is to reduce the
downtime.
However, the destination QEMU is unaware of which vhost_vdpa device will
register its memory_listener. If the source guest has CVQ enabled, it
will be the CVQ device. Otherwise, it will be the first one.
Move the memory listener to a common place rather than always in the
first / last vhost_vdpa.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20231221174322.3130442-14-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The memory listener functions can call these too. Make vdpa_iommu work
with VhostVDPAShared.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20231221174322.3130442-13-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The callers only have the shared information by the end of this series.
Start converting this functions.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20231221174322.3130442-12-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Next patches will register the vhost_vdpa memory listener while the VM
is migrating at the destination, so we can map the memory to the device
before stopping the VM at the source. The main goal is to reduce the
downtime.
However, the destination QEMU is unaware of which vhost_vdpa device will
register its memory_listener. If the source guest has CVQ enabled, it
will be the CVQ device. Otherwise, it will be the first one.
Move the iommu_list member to VhostVDPAShared so all vhost_vdpa can use
it, rather than always in the first / last vhost_vdpa.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20231221174322.3130442-11-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
It is always VHOST_IOTLB_MSG_V2. We can always make it back per
vhost_dev if needed.
This change makes easier for vhost_vdpa_map and unmap not to depend on
vhost_vdpa but only in VhostVDPAShared.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20231221174322.3130442-10-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Next patches will register the vhost_vdpa memory listener while the VM
is migrating at the destination, so we can map the memory to the device
before stopping the VM at the source. The main goal is to reduce the
downtime.
However, the destination QEMU is unaware of which vhost_vdpa device will
register its memory_listener. If the source guest has CVQ enabled, it
will be the CVQ device. Otherwise, it will be the first one.
Move the backend_cap member to VhostVDPAShared so all vhost_vdpa can use
it, rather than always in the first / last vhost_vdpa.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20231221174322.3130442-9-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Next patches will register the vhost_vdpa memory listener while the VM
is migrating at the destination, so we can map the memory to the device
before stopping the VM at the source. The main goal is to reduce the
downtime.
However, the destination QEMU is unaware of which vhost_vdpa device will
register its memory_listener. If the source guest has CVQ enabled, it
will be the CVQ device. Otherwise, it will be the first one.
Move the iotlb_batch_begin_sent member to VhostVDPAShared so all
vhost_vdpa can use it, rather than always in the first / last
vhost_vdpa.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20231221174322.3130442-8-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Next patches will register the vhost_vdpa memory listener while the VM
is migrating at the destination, so we can map the memory to the device
before stopping the VM at the source. The main goal is to reduce the
downtime.
However, the destination QEMU is unaware of which vhost_vdpa device will
register its memory_listener. If the source guest has CVQ enabled, it
will be the CVQ device. Otherwise, it will be the first one.
Move the file descriptor to VhostVDPAShared so all vhost_vdpa can use
it, rather than always in the first / last vhost_vdpa.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20231221174322.3130442-7-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
By the end of this series dma_map and dma_unmap functions don't have the
vdpa device for tracing. Movinge trace function to shared member one.
Print it also in the vdpa initialization so log reader can relate them.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20231221174322.3130442-6-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Next patches will register the vhost_vdpa memory listener while the VM
is migrating at the destination, so we can map the memory to the device
before stopping the VM at the source. The main goal is to reduce the
downtime.
However, the destination QEMU is unaware of which vhost_vdpa device will
register its memory_listener. If the source guest has CVQ enabled, it
will be the CVQ device. Otherwise, it will be the first one.
Move the shadow_data member to VhostVDPAShared so all vhost_vdpa can use
it, rather than always in the first or last vhost_vdpa.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20231221174322.3130442-5-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Next patches will register the vhost_vdpa memory listener while the VM
is migrating at the destination, so we can map the memory to the device
before stopping the VM at the source. The main goal is to reduce the
downtime.
However, the destination QEMU is unaware of which vhost_vdpa device will
register its memory_listener. If the source guest has CVQ enabled, it
will be the CVQ device. Otherwise, it will be the first one.
Move the iova range to VhostVDPAShared so all vhost_vdpa can use it,
rather than always in the first or last vhost_vdpa.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20231221174322.3130442-4-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Next patches will register the vhost_vdpa memory listener while the VM
is migrating at the destination, so we can map the memory to the device
before stopping the VM at the source. The main goal is to reduce the
downtime.
However, the destination QEMU is unaware of which vhost_vdpa device will
register its memory_listener. If the source guest has CVQ enabled, it
will be the CVQ device. Otherwise, it will be the first one.
Move the iova tree to VhostVDPAShared so all vhost_vdpa can use it,
rather than always in the first or last vhost_vdpa.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20231221174322.3130442-3-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
It will hold properties shared among all vhost_vdpa instances associated
with of the same device. For example, we just need one iova_tree or one
memory listener for the entire device.
Next patches will register the vhost_vdpa memory listener at the
beginning of the VM migration at the destination. This enables QEMU to
map the memory to the device before stopping the VM at the source,
instead of doing while both source and destination are stopped, thus
minimizing the downtime.
However, the destination QEMU is unaware of which vhost_vdpa struct will
register its memory_listener. If the source guest has CVQ enabled, it
will be the one associated with the CVQ. Otherwise, it will be the
first one.
Save the memory operations related members in a common place rather than
always in the first / last vhost_vdpa.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20231221174322.3130442-2-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Next commits will set DRIVER and ACKNOWLEDGE flags repeatedly in the
case of a migration destination. Let's save ioctls with this.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20231215172830.2540987-2-eperezma@redhat.com>
Virtio-gpu malloc memory for the queue when it realized, but the queues was not
released when it unrealized, which resulting in a memory leak. In addition,
vm_change_state_handler is not cleaned up, which is related to vdev and will
lead to segmentation fault when VM shutdown.
Signed-off-by: wangmeiling <wangmeiling21@huawei.com>
Signed-off-by: Binfeng Wu <wubinfeng@huawei.com>
Message-Id: <7bbbc0f3-2ad9-83ca-b39b-f976d0837daf@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
It is required to use error_report() instead of error_reportf_err(), if the
prior function does not take local_err as the argument. As a result, the
local_err is always NULL and segment fault may happen.
vhost_scsi_start()
-> vhost_scsi_set_endpoint(s) --> does not allocate local_err
-> error_reportf_err()
-> error_vprepend()
-> g_string_append(newmsg, (*errp)->msg) --> (*errp) is NULL
In addition, add ": " at the end of other error_reportf_err() logs.
Fixes: 7962e432b4 ("vhost-user-scsi: support reconnect to backend")
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Message-Id: <20231214003117.43960-1-dongli.zhang@oracle.com>
Reviewed-by: Feng Li <fengli@smartx.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
If a vcpu with an apic-id that is not supported by the legacy
interface (>255) is hot-plugged, the legacy code will dynamically switch
to the modern interface. However, the hotplug event is not forwarded to
the new interface resulting in the vcpu not being fully/properly added
to the machine config. This BUG is evidenced by OVMF when it
it attempts to count the vcpus and reports an inconsistent vcpu count
reported by the fw_cfg interface and the modern hotpug interface.
Fix is to propagate the hotplug event after making the switch from
the legacy interface to the modern interface.
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Aaron Young <aaron.young@oracle.com>
Message-Id: <0e8a9baebbb29f2a6c87fd08e43dc2ac4019759a.1702398644.git.Aaron.Young@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This adds support for vhost-scsi to be able to create a worker thread
per virtqueue. Right now for vhost-net we get a worker thread per
tx/rx virtqueue pair which scales nicely as we add more virtqueues and
CPUs, but for scsi we get the single worker thread that's shared by all
virtqueues. When trying to send IO to more than 2 virtqueues the single
thread becomes a bottlneck.
This patch adds a new setting, worker_per_virtqueue, which can be set
to:
false: Existing behavior where we get the single worker thread.
true: Create a worker per IO virtqueue.
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20231204231618.21962-3-michael.christie@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
This adds the vhost backend callouts for the worker ioctls added in the
6.4 linux kernel commit:
c1ecd8e95007 ("vhost: allow userspace to create workers")
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20231204231618.21962-2-michael.christie@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In fact, type4-count, core-count, core-count2, thread-count and
thread-count2 are tested with KVM not TCG.
Rename these test functions to reflect KVM base instead of TCG.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <20231127160202.1037290-1-zhao1.liu@linux.intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Since the driver doesn't support interrupts, we must return early when
index is set to VIRTIO_CONFIG_IRQ_IDX. Basically the same thing Viresh
did for "91208dd297f2 virtio: i2c: Check notifier helpers for
VIRTIO_CONFIG_IRQ_IDX".
Fixes: 544f0278af ("virtio: introduce macro VIRTIO_CONFIG_IRQ_IDX")
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Message-Id: <20231025171841.3379663-1-mathieu.poirier@linaro.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>