vhost blocks migration if logging is not supported to track dirty
memory, and vhost-user blocks it if the log cannot be saved to a shm fd.
vhost-vdpa blocks migration if both hosts do not support all the device's
features using a shadow VQ, for tracking requests and dirty memory.
vhost-scsi blocks migration if storage cannot be shared across hosts,
or if state cannot be migrated.
None of these conditions apply if the old and new qemu processes do
not run concurrently, and if new qemu starts on the same host as old,
which is the case for cpr.
Narrow the scope of these blockers so they only apply to normal mode.
They will not block cpr modes when they are added in subsequent patches.
No functional change until a new mode is added.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <1698263069-406971-5-git-send-email-steven.sistare@oracle.com>
The networking subsystem may wish to use defer_call(), so move the code
to util/ where it can be reused.
As a reminder of what defer_call() does:
This API defers a function call within a defer_call_begin()/defer_call_end()
section, allowing multiple calls to batch up. This is a performance
optimization that is used in the block layer to submit several I/O requests
at once instead of individually:
defer_call_begin(); <-- start of section
...
defer_call(my_func, my_obj); <-- deferred my_func(my_obj) call
defer_call(my_func, my_obj); <-- another
defer_call(my_func, my_obj); <-- another
...
defer_call_end(); <-- end of section, my_func(my_obj) is called once
Suggested-by: Ilya Maximets <i.maximets@ovn.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230913200045.1024233-3-stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Prepare to move the blk_io_plug_call() API out of the block layer so
that other subsystems call use this deferred call mechanism. Rename it
to defer_call() but leave the code in block/plug.c.
The next commit will move the code out of the block layer.
Suggested-by: Ilya Maximets <i.maximets@ovn.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Paul Durrant <paul@xen.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230913200045.1024233-2-stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
infrastructure for vhost-vdpa shadow work
piix south bridge rework
reconnect for vhost-user-scsi
dummy ACPI QTG DSM for cxl
tests, cleanups, fixes all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmU06PMPHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRpNIsH/0DlKti86VZLJ6PbNqsnKxoK2gg05TbEhPZU
pQ+RPDaCHpFBsLC5qsoMJwvaEQFe0e49ZFemw7bXRzBxgmbbNnZ9ArCIPqT+rvQd
7UBmyC+kacVyybZatq69aK2BHKFtiIRlT78d9Izgtjmp8V7oyKoz14Esh8wkE+FT
ypHUa70Addi6alNm6BVkm7bxZxi0Wrmf3THqF8ViYvufzHKl7JR5e17fKWEG0BqV
9W7AeHMnzJ7jkTvBGUw7g5EbzFn7hPLTbO4G/VW97k0puS4WRX5aIMkVhUazsRIa
zDOuXCCskUWuRapiCwY0E4g7cCaT8/JR6JjjBaTgkjJgvo5Y8Eg=
=ILek
-----END PGP SIGNATURE-----
Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging
virtio,pc,pci: features, cleanups
infrastructure for vhost-vdpa shadow work
piix south bridge rework
reconnect for vhost-user-scsi
dummy ACPI QTG DSM for cxl
tests, cleanups, fixes all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmU06PMPHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRpNIsH/0DlKti86VZLJ6PbNqsnKxoK2gg05TbEhPZU
# pQ+RPDaCHpFBsLC5qsoMJwvaEQFe0e49ZFemw7bXRzBxgmbbNnZ9ArCIPqT+rvQd
# 7UBmyC+kacVyybZatq69aK2BHKFtiIRlT78d9Izgtjmp8V7oyKoz14Esh8wkE+FT
# ypHUa70Addi6alNm6BVkm7bxZxi0Wrmf3THqF8ViYvufzHKl7JR5e17fKWEG0BqV
# 9W7AeHMnzJ7jkTvBGUw7g5EbzFn7hPLTbO4G/VW97k0puS4WRX5aIMkVhUazsRIa
# zDOuXCCskUWuRapiCwY0E4g7cCaT8/JR6JjjBaTgkjJgvo5Y8Eg=
# =ILek
# -----END PGP SIGNATURE-----
# gpg: Signature made Sun 22 Oct 2023 02:18:43 PDT
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (62 commits)
intel-iommu: Report interrupt remapping faults, fix return value
MAINTAINERS: Add include/hw/intc/i8259.h to the PC chip section
vhost-user: Fix protocol feature bit conflict
tests/acpi: Update DSDT.cxl with QTG DSM
hw/cxl: Add QTG _DSM support for ACPI0017 device
tests/acpi: Allow update of DSDT.cxl
hw/i386/cxl: ensure maxram is greater than ram size for calculating cxl range
vhost-user: fix lost reconnect
vhost-user-scsi: start vhost when guest kicks
vhost-user-scsi: support reconnect to backend
vhost: move and rename the conn retry times
vhost-user-common: send get_inflight_fd once
hw/i386/pc_piix: Make PIIX4 south bridge usable in PC machine
hw/isa/piix: Implement multi-process QEMU support also for PIIX4
hw/isa/piix: Resolve duplicate code regarding PCI interrupt wiring
hw/isa/piix: Reuse PIIX3's PCI interrupt triggering in PIIX4
hw/isa/piix: Rename functions to be shared for PCI interrupt triggering
hw/isa/piix: Reuse PIIX3 base class' realize method in PIIX4
hw/isa/piix: Share PIIX3's base class with PIIX4
hw/isa/piix: Harmonize names of reset control memory regions
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
When the vhost-user is reconnecting to the backend, and if the vhost-user fails
at the get_features in vhost_dev_init(), then the reconnect will fail
and it will not be retriggered forever.
The reason is:
When the vhost-user fails at get_features, the vhost_dev_cleanup will be called
immediately.
vhost_dev_cleanup calls 'memset(hdev, 0, sizeof(struct vhost_dev))'.
The reconnect path is:
vhost_user_blk_event
vhost_user_async_close(.. vhost_user_blk_disconnect ..)
qemu_chr_fe_set_handlers <----- clear the notifier callback
schedule vhost_user_async_close_bh
The vhost->vdev is null, so the vhost_user_blk_disconnect will not be
called, then the event fd callback will not be reinstalled.
All vhost-user devices have this issue, including vhost-user-blk/scsi.
With this patch, if the vdev->vdev is null, the fd callback will still
be reinstalled.
Fixes: 71e076a07d ("hw/virtio: generalise CHR_EVENT_CLOSED handling")
Signed-off-by: Li Feng <fengli@smartx.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Message-Id: <20231009044735.941655-6-fengli@smartx.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Let's keep the same behavior as vhost-user-blk.
Some old guests kick virtqueue before setting VIRTIO_CONFIG_S_DRIVER_OK.
Signed-off-by: Li Feng <fengli@smartx.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Message-Id: <20231009044735.941655-5-fengli@smartx.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
If the backend crashes and restarts, the device is broken.
This patch adds reconnect for vhost-user-scsi.
This patch also improves the error messages, and reports some silent errors.
Tested with spdk backend.
Signed-off-by: Li Feng <fengli@smartx.com>
Message-Id: <20231009044735.941655-4-fengli@smartx.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Currently the get_inflight_fd will be sent every time the device is started, and
the backend will allocate shared memory to save the inflight state. If the
backend finds that it receives the second get_inflight_fd, it will release the
previous shared memory, which breaks inflight working logic.
This patch is a preparation for the following patches.
Signed-off-by: Li Feng <fengli@smartx.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Message-Id: <20231009044735.941655-2-fengli@smartx.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
vhost-user-scsi has a VirtioDeviceClass->reset() function that calls
->vhost_reset_device(). The other vhost devices don't notify the vhost
device upon reset.
Stateful vhost devices may need to handle device reset in order to free
resources or prevent stale device state from interfering after reset.
Call ->vhost_device_reset() from virtio_reset() so that that vhost
devices are notified of device reset.
This patch affects behavior as follows:
- vhost-kernel: No change in behavior since ->vhost_reset_device() is
not implemented.
- vhost-user: back-ends that negotiate
VHOST_USER_PROTOCOL_F_RESET_DEVICE now receive a
VHOST_USER_DEVICE_RESET message upon device reset. Otherwise there is
no change in behavior. DPDK, SPDK, libvhost-user, and the
vhost-user-backend crate do not negotiate
VHOST_USER_PROTOCOL_F_RESET_DEVICE automatically.
- vhost-vdpa: an extra SET_STATUS 0 call is made during device reset.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20231004014532.1228637-4-stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
The VHOST_USER_RESET_OWNER message is deprecated in the spec:
This is no longer used. Used to be sent to request disabling all
rings, but some back-ends interpreted it to also discard connection
state (this interpretation would lead to bugs). It is recommended
that back-ends either ignore this message, or use it to disable all
rings.
The only caller of vhost_user_reset_device() is vhost_user_scsi_reset().
It checks that F_RESET_DEVICE was negotiated before calling it:
static void vhost_user_scsi_reset(VirtIODevice *vdev)
{
VHostSCSICommon *vsc = VHOST_SCSI_COMMON(vdev);
struct vhost_dev *dev = &vsc->dev;
/*
* Historically, reset was not implemented so only reset devices
* that are expecting it.
*/
if (!virtio_has_feature(dev->protocol_features,
VHOST_USER_PROTOCOL_F_RESET_DEVICE)) {
return;
}
if (dev->vhost_ops->vhost_reset_device) {
dev->vhost_ops->vhost_reset_device(dev);
}
}
Therefore VHOST_USER_RESET_OWNER is actually never sent by
vhost_user_reset_device(). Remove the dead code. This effectively moves
the vhost-user protocol specific code from vhost-user-scsi.c into
vhost-user.c where it belongs.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20231004014532.1228637-2-stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Modify migrate_add_blocker and migrate_del_blocker to take an Error **
reason. This allows migration to own the Error object, so that if
an error occurs in migrate_add_blocker, migration code can free the Error
and clear the client handle, simplifying client code. It also simplifies
the migrate_del_blocker call site.
In addition, this is a pre-requisite for a proposed future patch that would
add a mode argument to migration requests to support live update, and
maintain a list of blockers for each mode. A blocker may apply to a single
mode or to multiple modes, and passing Error** will allow one Error object
to be registered for multiple modes.
No functional change.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Tested-by: Michael Galaxy <mgalaxy@akamai.com>
Reviewed-by: Michael Galaxy <mgalaxy@akamai.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <1697634216-84215-1-git-send-email-steven.sistare@oracle.com>
Access QOM parent with the proper QOM VIRTIO_SCSI_COMMON() macro.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20231017140150.44995-5-philmd@linaro.org>
vdpa:
shadow vq vlan support
net migration with cvq
cxl:
support emulating 4 HDM decoders
serial number extended capability
virtio:
hared dma-buf
Fixes, cleanups all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmUd4/YPHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRpyM8H/02cRbJcQOjYt7j68zPW6GaDXxBI/UmdWDyG
15LZZbGNOPjyjNd3Vz1M7stQ5rhoKcgo/RdI+0E60a78svgW5JvpXoXR3pksc3Dx
v28B/akXwHUErYFSZQ+2VHNc8OhCd0v2ehxZxbwPEAYIOAj3hcCIVoPGXTnKJmAJ
imr5hjH0wZUc0+xdsmn8Vfdv5NTzpwfVObbGiMZejeJsaoh0y6Rt8RANBMY67KQD
S7/HPlVuDYf/y43t4ZEHNYuV9RaCdZZYlLWwV1scdKaYcofgmtJOKbOdCjHRXgj+
004Afb3rggIoCfnCzOFzhGx+MLDtLjvEn2N4oLEWCLi+k/3huaA=
=GAvH
-----END PGP SIGNATURE-----
Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging
virtio,pci: features, cleanups
vdpa:
shadow vq vlan support
net migration with cvq
cxl:
support emulating 4 HDM decoders
serial number extended capability
virtio:
hared dma-buf
Fixes, cleanups all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (53 commits)
libvhost-user: handle shared_object msg
vhost-user: add shared_object msg
hw/display: introduce virtio-dmabuf
util/uuid: add a hash function
virtio: remove unused next argument from virtqueue_split_read_next_desc()
virtio: remove unnecessary thread fence while reading next descriptor
virtio: use shadow_avail_idx while checking number of heads
libvhost-user.c: add assertion to vu_message_read_default
pcie_sriov: unregister_vfs(): fix error path
hw/i386/pc: improve physical address space bound check for 32-bit x86 systems
amd_iommu: Fix APIC address check
vdpa net: follow VirtIO initialization properly at cvq isolation probing
vdpa net: stop probing if cannot set features
vdpa net: fix error message setting virtio status
hw/pci-bridge/cxl-upstream: Add serial number extended capability support
hw/cxl: Support 4 HDM decoders at all levels of topology
hw/cxl: Fix and use same calculation for HDM decoder block size everywhere
hw/cxl: Add utility functions decoder interleave ways and target count.
hw/cxl: Push cxl_decoder_count_enc() and cxl_decode_ig() into .c
vdpa net: zero vhost_vdpa iova_tree pointer at cleanup
...
Conflicts:
hw/core/machine.c
Context conflict with commit 314e0a84cd ("hw/core: remove needless
includes") because it removed an adjacent #include.
Move the definition of VhostUserProtocolFeature to
include/hw/virtio/vhost-user.h.
Remove previous definitions in hw/scsi/vhost-user-scsi.c,
hw/virtio/vhost-user.c, and hw/virtio/virtio-qmp.c.
Previously there were 3 separate definitions of this over 3 different
files. Now only 1 definition of this will be present for these 3 files.
Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Reviewed-by: Emmanouil Pitsidianakis <manos.pitsidianakis@linaro.org>
Message-Id: <20230926224107.2951144-4-jonah.palmer@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Otherwise when a FORMAT UNIT command is issued, the SCSI layer can become
confused because it can find itself in the situation where it thinks there
is still data to be transferred which can cause the next emulated SCSI
command to fail.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Fixes: 6ab71761 ("scsi-disk: add FORMAT UNIT command")
Tested-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20230913204410.65650-4-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In the case where a SCSI layer transfer is incorrectly terminated, it is
possible for a TI command to cause a SCSI buffer overflow due to the
expected transfer data length being less than the available data in the
FIFO. When this occurs the unsigned async_len variable underflows and
becomes a large offset which writes past the end of the allocated SCSI
buffer.
Restrict the non-DMA transfer length to be the smallest of the expected
transfer length and the available FIFO data to ensure that it is no longer
possible for the SCSI buffer overflow to occur.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1810
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20230913204410.65650-3-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The call to esp_dma_enable() was being made with the SYSBUS_ESP type instead of
the ESP type. This meant that when GPIO 1 was being used to trigger a DMA
request from an external DMA controller, the setting of ESPState's dma_enabled
field would clobber unknown memory whilst the dma_cb callback pointer would
typically return NULL so the DMA request would never start.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20230913204410.65650-2-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We are doing things like
nb_sectors /= (s->qdev.blocksize / BDRV_SECTOR_SIZE);
in the code here (e.g. in scsi_disk_emulate_mode_sense()), so if
the blocksize is smaller than BDRV_SECTOR_SIZE (=512), this crashes
with a division by 0 exception. Thus disallow block sizes of 256
bytes to avoid this situation.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1813
CVE: 2023-42467
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20230925091854.49198-1-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
scsi_clear_unit_attention() now only handles REPORTED LUNS DATA HAS
CHANGED.
This only happens when we handle REPORT LUNS commands, so let's rename
the function in scsi_clear_reported_luns_changed() and call it only in
scsi_target_emulate_report_luns().
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-ID: <20230712134352.118655-4-sgarzare@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The previous commit moved the unit attention clearing when we create
the request. So now we can clean scsi_clear_unit_attention() to handle
only the case of the REPORT LUNS command: this is the only case in
which a UNIT ATTENTION is cleared without having been reported.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-ID: <20230712134352.118655-3-sgarzare@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit 1880ad4f4e ("virtio-scsi: Batched prepare for cmd reqs") split
calls to scsi_req_new() and scsi_req_enqueue() in the virtio-scsi device.
No ill effects were observed until commit 8cc5583abe ("virtio-scsi: Send
"REPORTED LUNS CHANGED" sense data upon disk hotplug events") added a
unit attention that was easy to trigger with device hotplug and
hot-unplug.
Because the two calls were separated, all requests in the batch were
prepared calling scsi_req_new() to report a sense. The first one
submitted would report the right sense and reset it to NO_SENSE, while
the others reported CHECK_CONDITION with no sense data. This caused
SCSI errors in Linux.
To solve this issue, let's fetch the unit attention as early as possible
when we prepare the request, so that only the first request in the batch
will use the unit attention SCSIReqOps and the others will not report
CHECK CONDITION.
Fixes: 1880ad4f4e ("virtio-scsi: Batched prepare for cmd reqs")
Fixes: 8cc5583abe ("virtio-scsi: Send "REPORTED LUNS CHANGED" sense data upon disk hotplug events")
Reported-by: Thomas Huth <thuth@redhat.com>
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2176702
Co-developed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-ID: <20230712134352.118655-2-sgarzare@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
virtio_scsi_dataplane_stop() calls blk_drain_all(), which invokes
->drained_begin()/->drained_end() after we've already detached the host
notifier. virtio_scsi_drained_end() currently attaches the host notifier
again and leaves it dangling after dataplane has stopped.
This results in the following assertion failure because
virtio_scsi_defer_to_dataplane() is called from the IOThread instead of
the main loop thread:
qemu-system-x86_64: ../softmmu/memory.c:1111: memory_region_transaction_commit: Assertion `qemu_mutex_iothread_locked()' failed.
Buglink: https://gitlab.com/qemu-project/qemu/-/issues/1680
Reported-by: Jean-Louis Dupond <jean-louis@dupond.be>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230611193924.2444914-1-stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The previous commit remove the unnecessary "virtio-access.h"
header. These files no longer have target-specific dependency.
Move them to the generic 'softmmu_ss' source set.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230524093744.88442-11-philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
None of these files use the VirtIO Load/Store API declared
by "hw/virtio/virtio-access.h". This header probably crept
in via copy/pasting, remove it.
Note, "virtio-access.h" is target-specific, so any file
including it also become tainted as target-specific.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230524093744.88442-10-philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Following the SCSI variable named '[specific_]scsi_ss', rename the
target-specific VirtIO/SCSI set prefixed with 'specific_'. This will
help when adding target-agnostic VirtIO/SCSI set in few commits.
No logical change.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230524093744.88442-5-philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
We will modify this file shortly. Re-arrange it slightly first,
declaring source sets first.
No logical change.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230524093744.88442-4-philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Instead of adding 'vhost-scsi-common.c' twice (for VHOST_SCSI and
VHOST_USER_SCSI), have it depend on VHOST_SCSI_COMMON, selected by
both symbols.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230524093744.88442-3-philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
We use the user_ss[] array to hold the user emulation sources,
and the softmmu_ss[] array to hold the system emulation ones.
Hold the latter in the 'system_ss[]' array for parity with user
emulation.
Mechanical change doing:
$ sed -i -e s/softmmu_ss/system_ss/g $(git grep -l softmmu_ss)
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230613133347.82210-10-philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
GCC9 is confused when building with CFLAG -O3:
hw/scsi/megasas.c: In function ‘megasas_scsi_realize’:
hw/scsi/megasas.c:2387:26: error: duplicated ‘if’ condition [-Werror=duplicated-cond]
2387 | } else if (s->fw_sge >= 128 - MFI_PASS_FRAME_SIZE) {
hw/scsi/megasas.c:2385:19: note: previously used here
2385 | if (s->fw_sge >= MEGASAS_MAX_SGE - MFI_PASS_FRAME_SIZE) {
cc1: all warnings being treated as errors
When this device was introduced in commit e8f943c3bc, the author
cared about modularity, using a definition for the firmware limit.
However if the firmware limit isn't changed (MEGASAS_MAX_SGE = 128),
the code ends doing the same check twice.
Per the maintainer [*]:
> The original code assumed that one could change MFI_PASS_FRAME_SIZE,
> but it turned out not to be possible as it's being hardcoded in the
> drivers themselves (even though the interface provides mechanisms to
> query it). So we can remove the duplicate lines.
Add the 'MEGASAS_MIN_SGE' definition for the '64' magic value,
slightly rewrite the condition check to simplify a bit the logic
and remove the unnecessary / duplicated check.
[*] https://lore.kernel.org/qemu-devel/e0029fc5-882f-1d63-15e3-1c3dbe9b6a2c@suse.de/
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Message-Id: <20230328210126.16282-1-philmd@linaro.org>
Mechanical change running Coccinelle spatch with content
generated from the qom-cast-macro-clean-cocci-gen.py added
in the previous commit.
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230601093452.38972-3-philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Introduce a new API for thread-local blk_io_plug() that does not
traverse the block graph. The goal is to make blk_io_plug() multi-queue
friendly.
Instead of having block drivers track whether or not we're in a plugged
section, provide an API that allows them to defer a function call until
we're unplugged: blk_io_plug_call(fn, opaque). If blk_io_plug_call() is
called multiple times with the same fn/opaque pair, then fn() is only
called once at the end of the function - resulting in batching.
This patch introduces the API and changes blk_io_plug()/blk_io_unplug().
blk_io_plug()/blk_io_unplug() no longer require a BlockBackend argument
because the plug state is now thread-local.
Later patches convert block drivers to blk_io_plug_call() and then we
can finally remove .bdrv_co_io_plug() once all block drivers have been
converted.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 20230530180959.1108766-2-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The virtio-scsi Host Bus Adapter provides access to devices on a SCSI
bus. Those SCSI devices typically have a BlockBackend. When the
BlockBackend enters a drained section, the SCSI device must temporarily
stop submitting new I/O requests.
Implement this behavior by temporarily stopping virtio-scsi virtqueue
processing when one of the SCSI devices enters a drained section. The
new scsi_device_drained_begin() API allows scsi-disk to message the
virtio-scsi HBA.
scsi_device_drained_begin() uses a drain counter so that multiple SCSI
devices can have overlapping drained sections. The HBA only sees one
pair of .drained_begin/end() calls.
After this commit, virtio-scsi no longer depends on hw/virtio's
ioeventfd aio_set_event_notifier(is_external=true). This commit is a
step towards removing the aio_disable_external() API.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230516190238.8401-19-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
virtio_queue_aio_detach_host_notifier() does two things:
1. It removes the fd handler from the event loop.
2. It processes the virtqueue one last time.
The first step can be peformed by any thread and without taking the
AioContext lock.
The second step may need the AioContext lock (depending on the device
implementation) and runs in the thread where request processing takes
place. virtio-blk and virtio-scsi therefore call
virtio_queue_aio_detach_host_notifier() from a BH that is scheduled in
AioContext.
The next patch will introduce a .drained_begin() function that needs to
call virtio_queue_aio_detach_host_notifier(). .drained_begin() functions
cannot call aio_poll() to wait synchronously for the BH. It is possible
for a .drained_poll() callback to asynchronously wait for the BH, but
that is more complex than necessary here.
Move the virtqueue processing out to the callers of
virtio_queue_aio_detach_host_notifier() so that the function can be
called from any thread. This is in preparation for the next patch.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230516190238.8401-17-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This patch is part of an effort to remove the aio_disable_external()
API because it does not fit in a multi-queue block layer world where
many AioContexts may be submitting requests to the same disk.
The SCSI emulation code is already in good shape to stop using
aio_disable_external(). It was only used by commit 9c5aad84da
("virtio-scsi: fixed virtio_scsi_ctx_check failed when detaching scsi
disk") to ensure that virtio_scsi_hotunplug() works while the guest
driver is submitting I/O.
Ensure virtio_scsi_hotunplug() is safe as follows:
1. qdev_simple_device_unplug_cb() -> qdev_unrealize() ->
device_set_realized() calls qatomic_set(&dev->realized, false) so
that future scsi_device_get() calls return NULL because they exclude
SCSIDevices with realized=false.
That means virtio-scsi will reject new I/O requests to this
SCSIDevice with VIRTIO_SCSI_S_BAD_TARGET even while
virtio_scsi_hotunplug() is still executing. We are protected against
new requests!
2. scsi_qdev_unrealize() already contains a call to
scsi_device_purge_requests() so that in-flight requests are cancelled
synchronously. This ensures that no in-flight requests remain once
qdev_simple_device_unplug_cb() returns.
Thanks to these two conditions we don't need aio_disable_external()
anymore.
Cc: Zhengui Li <lizhengui@huawei.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230516190238.8401-5-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Only report a transport reset event to the guest after the SCSIDevice
has been unrealized by qdev_simple_device_unplug_cb().
qdev_simple_device_unplug_cb() sets the SCSIDevice's qdev.realized field
to false so that scsi_device_find/get() no longer see it.
scsi_target_emulate_report_luns() also needs to be updated to filter out
SCSIDevices that are unrealized.
Change virtio_scsi_push_event() to take event information as an argument
instead of the SCSIDevice. This allows virtio_scsi_hotunplug() to emit a
VIRTIO_SCSI_T_TRANSPORT_RESET event after the SCSIDevice has already
been unrealized.
These changes ensure that the guest driver does not see the SCSIDevice
that's being unplugged if it responds very quickly to the transport
reset event.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230516190238.8401-4-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add a helper function to check whether the device is realized without
requiring the Big QEMU Lock. The next patch adds a second caller. The
goal is to avoid spreading DeviceState field accesses throughout the
code.
Suggested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230516190238.8401-3-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We cannot use the generic reentrancy guard in the LSI code, so
we have to manually prevent endless reentrancy here. The problematic
lsi_execute_script() function has already a way to detect whether
too many instructions have been executed - we just have to slightly
change the logic here that it also takes into account if the function
has been called too often in a reentrant way.
The code in fuzz-lsi53c895a-test.c has been taken from an earlier
patch by Mauro Matteo Cascella.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1563
Message-Id: <20230522091011.1082574-1-thuth@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Thomas Huth <thuth@redhat.com>
While trying to use a SCSI disk on the LSI controller with an
older version of Fedora (25), I'm getting:
qemu: warning: Blocked re-entrant IO on MemoryRegion: lsi-mmio at addr: 0x34
and the SCSI controller is not usable. Seems like we have to
disable the reentrancy checker for the MMIO region, too, to
get this working again.
The problem could be reproduced it like this:
./qemu-system-x86_64 -accel kvm -m 2G -machine q35 \
-device lsi53c810,id=lsi1 -device scsi-hd,drive=d0 \
-drive if=none,id=d0,file=.../somedisk.qcow2 \
-cdrom Fedora-Everything-netinst-i386-25-1.3.iso
Where somedisk.qcow2 is an image that contains already some partitions
and file systems.
In the boot menu of Fedora, go to
"Troubleshooting" -> "Rescue a Fedora system" -> "3) Skip to shell"
Then check "dmesg | grep -i 53c" for failure messages, and try to mount
a partition from somedisk.qcow2.
Message-Id: <20230516090556.553813-1-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Using linux 6.x guest, at boot time, an inquiry on a scsi-generic
device makes qemu crash. This is caused by a buffer overflow when
scsi-generic patches the block limits VPD page.
Do the operations on a temporary on-stack buffer that is guaranteed
to be large enough.
Reported-by: Théo Maillart <tmaillart@freebox.fr>
Analyzed-by: Théo Maillart <tmaillart@freebox.fr>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There is no need for the AioContext lock in aio_wait_bh_oneshot().
It's easy to remove the lock from existing callers and then switch from
AIO_WAIT_WHILE() to AIO_WAIT_WHILE_UNLOCKED() in aio_wait_bh_oneshot().
Document that the AioContext lock should not be held across
aio_wait_bh_oneshot(). Holding a lock across aio_poll() can cause
deadlock so we don't want callers to do that.
This is a step towards getting rid of the AioContext lock.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230404153307.458883-1-stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
As the code is designed to use the memory APIs to access the script ram,
disable reentrancy checks for the pseudo-RAM ram_io MemoryRegion.
In the future, ram_io may be converted from an IO to a proper RAM MemoryRegion.
Reported-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-Id: <20230427211013.2994127-6-alxndr@bu.edu>
Signed-off-by: Thomas Huth <thuth@redhat.com>
This protects devices from bh->mmio reentrancy issues.
Thanks: Thomas Huth <thuth@redhat.com> for diagnosing OS X test failure.
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230427211013.2994127-5-alxndr@bu.edu>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Bring the files in line with the QEMU coding style, with spaces
for indentation.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/378
Signed-off-by: Yeqi Fu <fufuyqqqqqq@gmail.com>
Message-Id: <20230315032649.57568-1-fufuyqqqqqq@gmail.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The Free Software Foundation moved to a new address and some
sources in QEMU referred to their old location.
The address should be updated and replaced by a pointer to
<https://www.gnu.org/licenses/>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/379
Signed-off-by: Khadija Kamran <kkamran.bese16seecs@seecs.edu.pk>
Message-Id: <576ee9203fdac99d7251a98faa66b9ce1e7febc5.1675941486.git.kkamran.bese16seecs@seecs.edu.pk>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Thomas Huth <thuth@redhat.com>
When an IOThread is configured, the ctrl virtqueue is processed in the
IOThread. TMFs that reset SCSI devices are currently called directly
from the IOThread and trigger an assertion failure in blk_drain() from
the following call stack:
virtio_scsi_handle_ctrl_req -> virtio_scsi_do_tmf -> device_code_reset
-> scsi_disk_reset -> scsi_device_purge_requests -> blk_drain
../block/block-backend.c:1780: void blk_drain(BlockBackend *): Assertion `qemu_in_main_thread()' failed.
The blk_drain() function is not designed to be called from an IOThread
because it needs the Big QEMU Lock (BQL).
This patch defers TMFs that reset SCSI devices to a Bottom Half (BH)
that runs in the main loop thread under the BQL. This way it's safe to
call blk_drain() and the assertion failure is avoided.
Introduce s->tmf_bh_list for tracking TMF requests that have been
deferred to the BH. When the BH runs it will grab the entire list and
process all requests. Care must be taken to clear the list when the
virtio-scsi device is reset or unrealized. Otherwise deferred TMF
requests could execute later and lead to use-after-free or other
undefined behavior.
The s->resetting counter that's used by TMFs that reset SCSI devices is
accessed from multiple threads. This patch makes that explicit by using
atomic accessor functions. With this patch applied the counter is only
modified by the main loop thread under the BQL but can be read by any
thread.
Reported-by: Qing Wang <qinwang@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230221212218.1378734-4-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
dma_blk_cb() only takes the AioContext lock around ->io_func(). That
means the rest of dma_blk_cb() is not protected. In particular, the
DMAAIOCB field accesses happen outside the lock.
There is a race when the main loop thread holds the AioContext lock and
invokes scsi_device_purge_requests() -> bdrv_aio_cancel() ->
dma_aio_cancel() while an IOThread executes dma_blk_cb(). The dbs->acb
field determines how cancellation proceeds. If dma_aio_cancel() sees
dbs->acb == NULL while dma_blk_cb() is still running, the request can be
completed twice (-ECANCELED and the actual return value).
The following assertion can occur with virtio-scsi when an IOThread is
used:
../hw/scsi/scsi-disk.c:368: scsi_dma_complete: Assertion `r->req.aiocb != NULL' failed.
Fix the race by holding the AioContext across dma_blk_cb(). Now
dma_aio_cancel() under the AioContext lock will not see
inconsistent/intermediate states.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230221212218.1378734-3-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If requests are being processed in the IOThread when a SCSIDevice is
unplugged, scsi_device_purge_requests() -> scsi_req_cancel_async() races
with I/O completion callbacks. Both threads load and store req->aiocb.
This can lead to assert(r->req.aiocb == NULL) failures and undefined
behavior.
Protect r->req.aiocb with the AioContext lock to prevent the race.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230221212218.1378734-2-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>