Commit Graph

1265 Commits

Author SHA1 Message Date
Eugenio Pérez 4725a4181b vhost: Add vhost_svq_valid_features to shadow vq
This allows SVQ to negotiate features with the guest and the device. For
the device, SVQ is a driver. While this function bypasses all
non-transport features, it needs to disable the features that SVQ does
not support when forwarding buffers. This includes packed vq layout,
indirect descriptors or event idx.

Future changes can add support to offer more features to the guest,
since the use of VirtQueue gives this for free. This is left out at the
moment for simplicity.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-03-15 13:57:44 +08:00
Eugenio Pérez a8ac88585d vhost: Add Shadow VirtQueue call forwarding capabilities
This will make qemu aware of the device used buffers, allowing it to
write the guest memory with its contents if needed.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-03-15 13:57:44 +08:00
Eugenio Pérez dff4426fa6 vhost: Add Shadow VirtQueue kick forwarding capabilities
At this mode no buffer forwarding will be performed in SVQ mode: Qemu
will just forward the guest's kicks to the device.

Host memory notifiers regions are left out for simplicity, and they will
not be addressed in this series.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-03-15 13:57:44 +08:00
Eugenio Pérez 10857ec0ad vhost: Add VhostShadowVirtqueue
Vhost shadow virtqueue (SVQ) is an intermediate jump for virtqueue
notifications and buffers, allowing qemu to track them. While qemu is
forwarding the buffers and virtqueue changes, it is able to commit the
memory it's being dirtied, the same way regular qemu's VirtIO devices
do.

This commit only exposes basic SVQ allocation and free. Next patches of
the series add functionality like notifications and buffers forwarding.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-03-15 13:57:44 +08:00
Sergio Lopez ff5eb77b8a vhost: use wfd on functions setting vring call fd
When ioeventfd is emulated using qemu_pipe(), only EventNotifier's wfd
can be used for writing.

Use the recently introduced event_notifier_get_wfd() function to
obtain the fd that our peer must use to signal the vring.

Signed-off-by: Sergio Lopez <slp@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20220304100854.14829-3-slp@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-03-06 06:19:47 -05:00
Stefano Garzarella 8d1b247f37 vhost-vsock: detach the virqueue element in case of error
In vhost_vsock_common_send_transport_reset(), if an element popped from
the virtqueue is invalid, we should call virtqueue_detach_element() to
detach it from the virtqueue before freeing its memory.

Fixes: fc0b9b0e1c ("vhost-vsock: add virtio sockets device")
Fixes: CVE-2022-26354
Cc: qemu-stable@nongnu.org
Reported-by: VictorV <vv474172261@gmail.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20220228095058.27899-1-sgarzare@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-03-06 05:08:23 -05:00
Jean-Philippe Brucker d9c96f2425 virtio-iommu: Support bypass domain
The driver can create a bypass domain by passing the
VIRTIO_IOMMU_ATTACH_F_BYPASS flag on the ATTACH request. Bypass domains
perform slightly better than domains with identity mappings since they
skip translation.

Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Message-Id: <20220214124356.872985-4-jean-philippe@linaro.org>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-03-06 05:08:23 -05:00
Jean-Philippe Brucker 448179e33e virtio-iommu: Default to bypass during boot
Currently the virtio-iommu device must be programmed before it allows
DMA from any PCI device. This can make the VM entirely unusable when a
virtio-iommu driver isn't present, for example in a bootloader that
loads the OS from storage.

Similarly to the other vIOMMU implementations, default to DMA bypassing
the IOMMU during boot. Add a "boot-bypass" property, defaulting to true,
that lets users change this behavior.

Replace the VIRTIO_IOMMU_F_BYPASS feature, which didn't support bypass
before feature negotiation, with VIRTIO_IOMMU_F_BYPASS_CONFIG.

We add the bypass field to the migration stream without introducing
subsections, based on the assumption that this virtio-iommu device isn't
being used in production enough to require cross-version migration at
the moment (all previous version required workarounds since they didn't
support ACPI and boot-bypass).

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Message-Id: <20220214124356.872985-3-jean-philippe@linaro.org>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-03-06 05:08:23 -05:00
Laurent Vivier b1f030a0a2 vhost-vdpa: make notifiers _init()/_uninit() symmetric
vhost_vdpa_host_notifiers_init() initializes queue notifiers
for queues "dev->vq_index" to queue "dev->vq_index + dev->nvqs",
whereas vhost_vdpa_host_notifiers_uninit() uninitializes the
same notifiers for queue "0" to queue "dev->nvqs".

This asymmetry seems buggy, fix that by using dev->vq_index
as the base for both.

Fixes: d0416d487b ("vhost-vdpa: map virtqueue notification area if possible")
Cc: jasowang@redhat.com
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20220211161309.1385839-1-lvivier@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-03-06 05:08:23 -05:00
Laurent Vivier 98f7607ecd hw/virtio: vdpa: Fix leak of host-notifier memory-region
If call virtio_queue_set_host_notifier_mr fails, should free
host-notifier memory-region.

This problem can trigger a coredump with some vDPA drivers (mlx5,
but not with the vdpasim), if we unplug the virtio-net card from
the guest after a stop/start.

The same fix has been done for vhost-user:
  1f89d3b91e ("hw/virtio: Fix leak of host-notifier memory-region")

Fixes: d0416d487b ("vhost-vdpa: map virtqueue notification area if possible")
Cc: jasowang@redhat.com
Resolves: https://bugzilla.redhat.com/2027208
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20220211170259.1388734-1-lvivier@redhat.com>
Cc: qemu-stable@nongnu.org
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-03-06 05:08:23 -05:00
Viresh Kumar 0a24dd1fd5 hw/vhost-user-i2c: Add support for VIRTIO_I2C_F_ZERO_LENGTH_REQUEST
VIRTIO_I2C_F_ZERO_LENGTH_REQUEST is a mandatory feature, that must be
implemented by everyone. Add its support.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Message-Id: <fc47ab63b1cd414319c9201e8d6c7705b5ec3bd9.1644490993.git.viresh.kumar@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-03-04 08:30:52 -05:00
Halil Pasic e65902a913 virtio: fix the condition for iommu_platform not supported
The commit 04ceb61a40 ("virtio: Fail if iommu_platform is requested, but
unsupported") claims to fail the device hotplug when iommu_platform
is requested, but not supported by the (vhost) device. On the first
glance the condition for detecting that situation looks perfect, but
because a certain peculiarity of virtio_platform it ain't.

In fact the aforementioned commit introduces a regression. It breaks
virtio-fs support for Secure Execution, and most likely also for AMD SEV
or any other confidential guest scenario that relies encrypted guest
memory.  The same also applies to any other vhost device that does not
support _F_ACCESS_PLATFORM.

The peculiarity is that iommu_platform and _F_ACCESS_PLATFORM collates
"device can not access all of the guest RAM" and "iova != gpa, thus
device needs to translate iova".

Confidential guest technologies currently rely on the device/hypervisor
offering _F_ACCESS_PLATFORM, so that, after the feature has been
negotiated, the guest  grants access to the portions of memory the
device needs to see. So in for confidential guests, generally,
_F_ACCESS_PLATFORM is about the restricted access to memory, but not
about the addresses used being something else than guest physical
addresses.

This is the very reason for which commit f7ef7e6e3b ("vhost: correctly
turn on VIRTIO_F_IOMMU_PLATFORM") fences _F_ACCESS_PLATFORM from the
vhost device that does not need it, because on the vhost interface it
only means "I/O address translation is needed".

This patch takes inspiration from f7ef7e6e3b ("vhost: correctly turn on
VIRTIO_F_IOMMU_PLATFORM"), and uses the same condition for detecting the
situation when _F_ACCESS_PLATFORM is requested, but no I/O translation
by the device, and thus no device capability is needed. In this
situation claiming that the device does not support iommu_plattform=on
is counter-productive. So let us stop doing that!

Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Reported-by: Jakob Naucke <Jakob.Naucke@ibm.com>
Fixes: 04ceb61a40 ("virtio: Fail if iommu_platform is requested, but
unsupported")
Acked-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Tested-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: qemu-stable@nongnu.org

Message-Id: <20220207112857.607829-1-pasic@linux.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2022-03-04 08:30:52 -05:00
Xueming Li 0b0af4d62f vhost-user: fix VirtQ notifier cleanup
When vhost-user device cleanup, remove notifier MR and munmaps notifier
address in the event-handling thread, VM CPU thread writing the notifier
in concurrent fails with an error of accessing invalid address. It
happens because MR is still being referenced and accessed in another
thread while the underlying notifier mmap address is being freed and
becomes invalid.

This patch calls RCU and munmap notifiers in the callback after the
memory flatview update finish.

Fixes: 44866521bd ("vhost-user: support registering external host notifiers")
Cc: qemu-stable@nongnu.org
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Message-Id: <20220207071929.527149-3-xuemingl@nvidia.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-03-04 08:30:52 -05:00
Xueming Li e867144b73 vhost-user: remove VirtQ notifier restore
Notifier set when vhost-user backend asks qemu to mmap an FD and
offset. When vhost-user backend restart or getting killed, VQ notifier
FD and mmap addresses become invalid. After backend restart, MR contains
the invalid address will be restored and fail on notifier access.

On the other hand, qemu should munmap the notifier, release underlying
hardware resources to enable backend restart and allocate hardware
notifier resources correctly.

Qemu shouldn't reference and use resources of disconnected backend.

This patch removes VQ notifier restore, uses the default vhost-user
notifier to avoid invalid address access.

After backend restart, the backend should ask qemu to install a hardware
notifier if needed.

Fixes: 44866521bd ("vhost-user: support registering external host notifiers")
Cc: qemu-stable@nongnu.org
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Message-Id: <20220207071929.527149-2-xuemingl@nvidia.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-03-04 08:30:52 -05:00
Peter Maydell b85ea5fa2f include: Move qemu_madvise() and related #defines to new qemu/madvise.h
The function qemu_madvise() and the QEMU_MADV_* constants associated
with it are used in only 10 files.  Move them out of osdep.h to a new
qemu/madvise.h header that is included where it is needed.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220208200856.3558249-2-peter.maydell@linaro.org
2022-02-21 13:30:20 +00:00
Bernhard Beschow 5e78c98b7c Mark remaining global TypeInfo instances as const
More than 1k of TypeInfo instances are already marked as const. Mark the
remaining ones, too.

This commit was created with:
  git grep -z -l 'static TypeInfo' -- '*.c' | \
  xargs -0 sed -i 's/static TypeInfo/static const TypeInfo/'

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Corey Minyard <cminyard@mvista.com>
Message-id: 20220117145805.173070-2-shentey@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-02-21 13:30:20 +00:00
Peter Maydell 17e3134061 Remove unnecessary minimum_version_id_old fields
The migration code will not look at a VMStateDescription's
minimum_version_id_old field unless that VMSD has set the
load_state_old field to something non-NULL.  (The purpose of
minimum_version_id_old is to specify what migration version is needed
for the code in the function pointed to by load_state_old to be able
to handle it on incoming migration.)

We have exactly one VMSD which still has a load_state_old,
in the PPC CPU; every other VMSD which sets minimum_version_id_old
is doing so unnecessarily. Delete all the unnecessary ones.

Commit created with:
  sed -i '/\.minimum_version_id_old/d' $(git grep -l '\.minimum_version_id_old')
with the one legitimate use then hand-edited back in.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>

Signed-off-by: Juan Quintela <quintela@redhat.com>

---

It missed vmstate_ppc_cpu.
2022-01-28 15:38:23 +01:00
Gavin Shan b1b87327a9 hw/arm/virt: Support for virtio-mem-pci
This supports virtio-mem-pci device on "virt" platform, by simply
following the implementation on x86.

   * This implements the hotplug handlers to support virtio-mem-pci
     device hot-add, while the hot-remove isn't supported as we have
     on x86.

   * The block size is 512MB on ARM64 instead of 128MB on x86.

   * It has been passing the tests with various combinations like 64KB
     and 4KB page sizes on host and guest, different memory device
     backends like normal, transparent huge page and HugeTLB, plus
     migration.

Co-developed-by: David Hildenbrand <david@redhat.com>
Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-id: 20220111063329.74447-3-gshan@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-01-20 11:47:52 +00:00
Gavin Shan 1263615efe virtio-mem: Correct default THP size for ARM64
The default block size is same as to the THP size, which is either
retrieved from "/sys/kernel/mm/transparent_hugepage/hpage_pmd_size"
or hardcoded to 2MB. There are flaws in both mechanisms and this
intends to fix them up.

  * When "/sys/kernel/mm/transparent_hugepage/hpage_pmd_size" is
    used to getting the THP size, 32MB and 512MB are valid values
    when we have 16KB and 64KB page size on ARM64.

  * When the hardcoded THP size is used, 2MB, 32MB and 512MB are
    valid values when we have 4KB, 16KB and 64KB page sizes on
    ARM64.

Co-developed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-id: 20220111063329.74447-2-gshan@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-01-20 11:47:52 +00:00
Stefan Hajnoczi db608fb784 virtio: unify dataplane and non-dataplane ->handle_output()
Now that virtio-blk and virtio-scsi are ready, get rid of
the handle_aio_output() callback. It's no longer needed.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20211207132336.36627-7-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-01-12 17:09:39 +00:00
Stefan Hajnoczi d6fbfe2b83 virtio: use ->handle_output() instead of ->handle_aio_output()
The difference between ->handle_output() and ->handle_aio_output() was
that ->handle_aio_output() returned a bool return value indicating
progress. This was needed by the old polling API but now that the bool
return value is gone, the two functions can be unified.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20211207132336.36627-6-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-01-12 17:09:39 +00:00
Stefan Hajnoczi d93d16c045 virtio: get rid of VirtIOHandleAIOOutput
The virtqueue host notifier API
virtio_queue_aio_set_host_notifier_handler() polls the virtqueue for new
buffers. AioContext previously required a bool progress return value
indicating whether an event was handled or not. This is no longer
necessary because the AioContext polling API has been split into a poll
check function and an event handler function. The event handler is only
run when we know there is work to do, so it doesn't return bool.

The VirtIOHandleAIOOutput function signature is now the same as
VirtIOHandleOutput. Get rid of the bool return value.

Further simplifications will be made for virtio-blk and virtio-scsi in
the next patch.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20211207132336.36627-3-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-01-12 17:09:39 +00:00
Stefan Hajnoczi 826cc32423 aio-posix: split poll check from ready handler
Adaptive polling measures the execution time of the polling check plus
handlers called when a polled event becomes ready. Handlers can take a
significant amount of time, making it look like polling was running for
a long time when in fact the event handler was running for a long time.

For example, on Linux the io_submit(2) syscall invoked when a virtio-blk
device's virtqueue becomes ready can take 10s of microseconds. This
can exceed the default polling interval (32 microseconds) and cause
adaptive polling to stop polling.

By excluding the handler's execution time from the polling check we make
the adaptive polling calculation more accurate. As a result, the event
loop now stays in polling mode where previously it would have fallen
back to file descriptor monitoring.

The following data was collected with virtio-blk num-queues=2
event_idx=off using an IOThread. Before:

168k IOPS, IOThread syscalls:

  9837.115 ( 0.020 ms): IO iothread1/620155 io_submit(ctx_id: 140512552468480, nr: 16, iocbpp: 0x7fcb9f937db0)    = 16
  9837.158 ( 0.002 ms): IO iothread1/620155 write(fd: 103, buf: 0x556a2ef71b88, count: 8)                         = 8
  9837.161 ( 0.001 ms): IO iothread1/620155 write(fd: 104, buf: 0x556a2ef71b88, count: 8)                         = 8
  9837.163 ( 0.001 ms): IO iothread1/620155 ppoll(ufds: 0x7fcb90002800, nfds: 4, tsp: 0x7fcb9f1342d0, sigsetsize: 8) = 3
  9837.164 ( 0.001 ms): IO iothread1/620155 read(fd: 107, buf: 0x7fcb9f939cc0, count: 512)                        = 8
  9837.174 ( 0.001 ms): IO iothread1/620155 read(fd: 105, buf: 0x7fcb9f939cc0, count: 512)                        = 8
  9837.176 ( 0.001 ms): IO iothread1/620155 read(fd: 106, buf: 0x7fcb9f939cc0, count: 512)                        = 8
  9837.209 ( 0.035 ms): IO iothread1/620155 io_submit(ctx_id: 140512552468480, nr: 32, iocbpp: 0x7fca7d0cebe0)    = 32

174k IOPS (+3.6%), IOThread syscalls:

  9809.566 ( 0.036 ms): IO iothread1/623061 io_submit(ctx_id: 140539805028352, nr: 32, iocbpp: 0x7fd0cdd62be0)    = 32
  9809.625 ( 0.001 ms): IO iothread1/623061 write(fd: 103, buf: 0x5647cfba5f58, count: 8)                         = 8
  9809.627 ( 0.002 ms): IO iothread1/623061 write(fd: 104, buf: 0x5647cfba5f58, count: 8)                         = 8
  9809.663 ( 0.036 ms): IO iothread1/623061 io_submit(ctx_id: 140539805028352, nr: 32, iocbpp: 0x7fd0d0388b50)    = 32

Notice that ppoll(2) and eventfd read(2) syscalls are eliminated because
the IOThread stays in polling mode instead of falling back to file
descriptor monitoring.

As usual, polling is not implemented on Windows so this patch ignores
the new io_poll_read() callback in aio-win32.c.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20211207132336.36627-2-stefanha@redhat.com

[Fixed up aio_set_event_notifier() calls in
tests/unit/test-fdmon-epoll.c added after this series was queued.
--Stefan]

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-01-12 17:09:39 +00:00
Michael S. Tsirkin a882b57123 Revert "virtio: introduce macro IRTIO_CONFIG_IRQ_IDX"
This reverts commit bf1d85c166.

Fixes: bf1d85c166 ("virtio: introduce macro IRTIO_CONFIG_IRQ_IDX")
Cc: "Cindy Lu" <lulu@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-10 16:02:54 -05:00
Michael S. Tsirkin a20fa00ce1 Revert "virtio-pci: decouple notifier from interrupt process"
This reverts commit e3480ef81f.

Fixes: e3480ef81f ("virtio-pci: decouple notifier from interrupt process")
Cc: "Cindy Lu" <lulu@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-10 16:02:36 -05:00
Michael S. Tsirkin 38ce405198 Revert "virtio-pci: decouple the single vector from the interrupt process"
This reverts commit 316011b8a7.

Fixes: 316011b8a7 ("virtio-pci: decouple the single vector from the interrupt process")
Cc: "Cindy Lu" <lulu@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-10 16:02:16 -05:00
Michael S. Tsirkin 73bd56abe1 Revert "vhost-vdpa: add support for config interrupt"
This reverts commit 634f7c89fb.

Fixes: 634f7c89fb ("vhost-vdpa: add support for config interrupt")
Cc: "Cindy Lu" <lulu@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-10 16:01:44 -05:00
Michael S. Tsirkin 81c3ebc32f Revert "virtio: add support for configure interrupt"
This reverts commit 081f864f56.

Fixes: 081f864f56 ("virtio: add support for configure interrupt")
Cc: "Cindy Lu" <lulu@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-10 16:01:28 -05:00
Michael S. Tsirkin a86d1a0a93 Revert "vhost: add support for configure interrupt"
This reverts commit f7220a7ce2.

Fixes: f7220a7ce2 ("vhost: add support for configure interrupt")
Cc: "Cindy Lu" <lulu@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-10 16:01:11 -05:00
Michael S. Tsirkin 99478e5941 Revert "virtio-mmio: add support for configure interrupt"
This reverts commit d48185f1a4.

Fixes: d48185f1a4 ("virtio-mmio: add support for configure interrupt")
Cc: "Cindy Lu" <lulu@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-10 16:00:38 -05:00
Michael S. Tsirkin 847e9bc974 Revert "virtio-pci: add support for configure interrupt"
This reverts commit d5d24d859c.

Fixes: d5d24d859c ("virtio-pci: add support for configure interrupt")
Cc: "Cindy Lu" <lulu@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-10 16:00:02 -05:00
Daniil Tatianin d731ab3119 virtio/vhost-vsock: don't double close vhostfd, remove redundant cleanup
In case of an error during initialization in vhost_dev_init, vhostfd is
closed in vhost_dev_cleanup. Remove close from err_virtio as it's both
redundant and causes a double close on vhostfd.

Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
Message-Id: <20211129125204.1108088-1-d-tatianin@yandex-team.ru>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 19:30:13 -05:00
David Hildenbrand 60f1f77cab virtio-mem: Set "unplugged-inaccessible=auto" for the 7.0 machine on x86
Set the new default to "auto", keeping it set to "off" for compat
machines. This property is only available for x86 targets.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20211217134039.29670-4-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 19:30:13 -05:00
David Hildenbrand 23ad8dec8d virtio-mem: Support VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE
With VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE, we signal the VM that reading
unplugged memory is not supported. We have to fail feature negotiation
in case the guest does not support VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE.

First, VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE is required to properly handle
memory backends (or architectures) without support for the shared zeropage
in the hypervisor cleanly. Without the shared zeropage, even reading an
unpopulated virtual memory location can populate real memory and
consequently consume memory in the hypervisor. We have a guaranteed shared
zeropage only on MAP_PRIVATE anonymous memory.

Second, we want VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE to be the default
long-term as even populating the shared zeropage can be problematic: for
example, without THP support (possible) or without support for the shared
huge zeropage with THP (unlikely), the PTE page tables to hold the shared
zeropage entries can consume quite some memory that cannot be reclaimed
easily.

Third, there are other optimizations+features (e.g., protection of
unplugged memory, reducing the total memory slot size and bitmap sizes)
that will require VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE.

We really only support x86 targets with virtio-mem for now (and
Linux similarly only support x86), but that might change soon, so prepare
for different targets already.

Add a new "unplugged-inaccessible" tristate property for x86 targets:
- "off" will keep VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE unset and legacy
  guests working.
- "on" will set VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE and stop legacy guests
  from using the device.
- "auto" selects the default based on support for the shared zeropage.

Warn in case the property is set to "off" and we don't have support for the
shared zeropage.

For existing compat machines, the property will default to "off", to
not change the behavior but eventually warn about a problematic setup.
Short-term, we'll set the property default to "auto" for new QEMU machines.
Mid-term, we'll set the property default to "on" for new QEMU machines.
Long-term, we'll deprecate the parameter and disallow legacy
guests completely.

The property has to match on the migration source and destination. "auto"
will result in the same VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE setting as long
as the qemu command line (esp. memdev) match -- so "auto" is good enough
for migration purposes and the parameter doesn't have to be migrated
explicitly.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20211217134039.29670-3-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 19:30:13 -05:00
Stefan Hajnoczi 750539c4c4 virtio: signal after wrapping packed used_idx
Packed Virtqueues wrap used_idx instead of letting it run freely like
Split Virtqueues do. If the used ring wraps more than once there is no
way to compare vq->signalled_used and vq->used_idx in
virtio_packed_should_notify() since they are modulo vq->vring.num.

This causes the device to stop sending used buffer notifications when
when virtio_packed_should_notify() is called less than once each time
around the used ring.

It is possible to trigger this with virtio-blk's dataplane
notify_guest_bh() irq coalescing optimization. The call to
virtio_notify_irqfd() (and virtio_packed_should_notify()) is deferred to
a BH. If the guest driver is polling it can complete and submit more
requests before the BH executes, causing the used ring to wrap more than
once. The result is that the virtio-blk device ceases to raise
interrupts and I/O hangs.

Cc: Tiwei Bie <tiwei.bie@intel.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20211130134510.267382-1-stefanha@redhat.com>
Fixes: 86044b24e8 ("virtio: basic packed virtqueue support")
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 19:30:13 -05:00
David Hildenbrand 09b3b7e092 virtio-mem: Support "prealloc=on" option
For scarce memory resources, such as hugetlb, we want to be able to
prealloc such memory resources in order to not crash later on access. On
simple user errors we could otherwise easily run out of memory resources
an crash the VM -- pretty much undesired.

For ordinary memory devices, such as DIMMs, we preallocate memory via the
memory backend for such use cases; however, with virtio-mem we're dealing
with sparse memory backends; preallocating the whole memory backend
destroys the whole purpose of virtio-mem.

Instead, we want to preallocate memory when actually exposing memory to the
VM dynamically, and fail plugging memory gracefully + warn the user in case
preallocation fails.

A common use case for hugetlb will be using "reserve=off,prealloc=off" for
the memory backend and "prealloc=on" for the virtio-mem device. This
way, no huge pages will be reserved for the process, but we can recover
if there are no actual huge pages when plugging memory. Libvirt is
already prepared for this.

Note that preallocation cannot protect from the OOM killer -- which
holds true for any kind of preallocation in QEMU. It's primarily useful
only for scarce memory resources such as hugetlb, or shared file-backed
memory. It's of little use for ordinary anonymous memory that can be
swapped, KSM merged, ... but we won't forbid it.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20211217134611.31172-9-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 19:30:13 -05:00
Roman Kagan 5d33ae4b7a vhost: stick to -errno error return convention
The generic vhost code expects that many of the VhostOps methods in the
respective backends set errno on errors.  However, none of the existing
backends actually bothers to do so.  In a number of those methods errno
from the failed call is clobbered by successful later calls to some
library functions; on a few code paths the generic vhost code then
negates and returns that errno, thus making failures look as successes
to the caller.

As a result, in certain scenarios (e.g. live migration) the device
doesn't notice the first failure and goes on through its state
transitions as if everything is ok, instead of taking recovery actions
(break and reestablish the vhost-user connection, cancel migration, etc)
before it's too late.

To fix this, consolidate on the convention to return negated errno on
failures throughout generic vhost, and use it for error propagation.

Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru>
Message-Id: <20211111153354.18807-10-rvkagan@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 05:19:55 -05:00
Roman Kagan 025faa872b vhost-user: stick to -errno error return convention
VhostOps methods in user_ops are not very consistent in their error
returns: some return negated errno while others just -1.

Make sure all of them consistently return negated errno.  This also
helps error propagation from the functions being called inside.
Besides, this synchronizes the error return convention with the other
two vhost backends, kernel and vdpa, and will therefore allow for
consistent error propagation in the generic vhost code (in a followup
patch).

Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru>
Message-Id: <20211111153354.18807-9-rvkagan@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 05:19:55 -05:00
Roman Kagan 3631151b3e vhost-vdpa: stick to -errno error return convention
Almost all VhostOps methods in vdpa_ops follow the convention of
returning negated errno on error.

Adjust the few that don't.  To that end, rework vhost_vdpa_add_status to
check if setting of the requested status bits has succeeded and return
the respective error code it hasn't, and propagate the error codes
wherever it's appropriate.

Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru>
Message-Id: <20211111153354.18807-8-rvkagan@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 05:19:55 -05:00
Roman Kagan 2d88d9c65c vhost-backend: stick to -errno error return convention
Almost all VhostOps methods in kernel_ops follow the convention of
returning negated errno on error.

Adjust the only one that doesn't.

Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru>
Message-Id: <20211111153354.18807-7-rvkagan@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2022-01-07 05:19:55 -05:00
Roman Kagan 6dcae534e8 vhost-backend: avoid overflow on memslots_limit
Fix the (hypothetical) potential problem when the value parsed out of
the vhost module parameter in sysfs overflows the return value from
vhost_kernel_memslots_limit.

Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru>
Message-Id: <20211111153354.18807-6-rvkagan@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 05:19:55 -05:00
Cindy Lu d5d24d859c virtio-pci: add support for configure interrupt
Add support for configure interrupt, The process is used kvm_irqfd_assign
to set the gsi to kernel. When the configure notifier was signal by
host, qemu will inject a msix interrupt to guest

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20211104164827.21911-11-lulu@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 05:19:55 -05:00
Cindy Lu d48185f1a4 virtio-mmio: add support for configure interrupt
Add configure interrupt support for virtio-mmio bus. This
interrupt will be working while the backend is vhost-vdpa

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20211104164827.21911-10-lulu@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 05:19:55 -05:00
Cindy Lu f7220a7ce2 vhost: add support for configure interrupt
Add functions to support configure interrupt.
The configure interrupt process will start in vhost_dev_start
and stop in vhost_dev_stop.

Also add the functions to support vhost_config_pending and
vhost_config_mask, for masked_config_notifier, we only
use the notifier saved in vq 0.

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20211104164827.21911-8-lulu@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-06 06:11:39 -05:00
Cindy Lu 081f864f56 virtio: add support for configure interrupt
Add the functions to support the configure interrupt in virtio
The function virtio_config_guest_notifier_read will notify the
guest if there is an configure interrupt.
The function virtio_config_set_guest_notifier_fd_handler is
to set the fd hander for the notifier

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20211104164827.21911-7-lulu@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-06 06:11:39 -05:00
Cindy Lu 634f7c89fb vhost-vdpa: add support for config interrupt
Add new call back function in vhost-vdpa, this function will
set the event fd to kernel. This function will be called
in the vhost_dev_start and vhost_dev_stop

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20211104164827.21911-6-lulu@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-06 06:11:39 -05:00
Cindy Lu 316011b8a7 virtio-pci: decouple the single vector from the interrupt process
To reuse the interrupt process in configure interrupt
Need to decouple the single vector from the interrupt process. Add new function
kvm_virtio_pci_vector_use_one and _release_one. These functions are use
for the single vector, the whole process will finish in a loop for the vq number.

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20211104164827.21911-4-lulu@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-06 06:11:39 -05:00
Cindy Lu e3480ef81f virtio-pci: decouple notifier from interrupt process
To reuse the notifier process in configure interrupt.
Use the virtio_pci_get_notifier function to get the notifier.
the INPUT of this function is the IDX, the OUTPUT is notifier and
the vector

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20211104164827.21911-3-lulu@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-06 06:11:39 -05:00
Cindy Lu bf1d85c166 virtio: introduce macro IRTIO_CONFIG_IRQ_IDX
To support configure interrupt for vhost-vdpa
Introduce VIRTIO_CONFIG_IRQ_IDX -1 as configure interrupt's queue index,
Then we can reuse the functions guest_notifier_mask and guest_notifier_pending.
Add the check of queue index in these drivers, if the driver does not support
configure interrupt, the function will just return

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20211104164827.21911-2-lulu@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-06 06:11:39 -05:00
David Hildenbrand 7656d9ce09 virtio-mem: Don't skip alignment checks when warning about block size
If we warn about the block size being smaller than the default, we skip
some alignment checks.

This can currently only fail on x86-64, when specifying a block size of
1 MiB, however, we detect the THP size of 2 MiB.

Fixes: 228957fea3 ("virtio-mem: Probe THP size to determine default block size")
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20211011173305.13778-1-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-06 04:16:58 -05:00
Philippe Mathieu-Daudé a1d4b0a305 dma: Let dma_memory_map() take MemTxAttrs argument
Let devices specify transaction attributes when calling
dma_memory_map().

Patch created mechanically using spatch with this script:

  @@
  expression E1, E2, E3, E4;
  @@
  - dma_memory_map(E1, E2, E3, E4)
  + dma_memory_map(E1, E2, E3, E4, MEMTXATTRS_UNSPECIFIED)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20211223115554.3155328-7-philmd@redhat.com>
2021-12-30 17:16:32 +01:00
Leonardo Garcia f71d31fa81 hw/virtio/vhost: Fix typo in comment.
Signed-off-by: Leonardo Garcia <lagarcia@br.ibm.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <a10a0ddab65b474ebea1e1141abe0f4aa463909b.1637668012.git.lagarcia@br.ibm.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2021-12-17 10:46:08 +01:00
Richard Henderson aab8cfd4c3 target-arm queue:
* ITS: error reporting cleanup
  * aspeed: improve documentation
  * Fix STM32F2XX USART data register readout
  * allow emulated GICv3 to be disabled in non-TCG builds
  * fix exception priority for singlestep, misaligned PC, bp, etc
  * Correct calculation of tlb range invalidate length
  * npcm7xx_emc: fix missing queue_flush
  * virt: Add VIOT ACPI table for virtio-iommu
  * target/i386: Use assert() to sanity-check b1 in SSE decode
  * Don't include qemu-common unnecessarily
 -----BEGIN PGP SIGNATURE-----
 
 iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAmG5xekZHHBldGVyLm1h
 eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3ramD/0WL8YV70sW5B/tHdb+/em1
 xTBuABUUj5QDvKnxNoPIBwJI0vgmzwhAonYzcKKEUvlbL97crkgt6xSPvVxv2nf5
 wnSYMKTDEC11AuYVdEyIMm5KLc88mq1w78pTYkFSUJmujCpfqLAsyXdEastIPHfN
 MdrwkpQ3wVmMeMcNBTq2yCxiGlz7x/myeJtDU9ihgPTcsgXa8BzziK6qCZHAOGCL
 0/ljXDbVTJtLYUki9IqptPs8QUtlqOBt3rLplxHfKRKpmjiuD+xFlQ4GuIOBX+AL
 tQWgEyyiR9FnYpY1t3fWVtuKgjYXzlbh1A6cwdsK3Q68+qfi7Yr+lPryjwrmOkx7
 /Yupq+QB/xgK4nxF4ydDXLvqI3h6GjaF2U9qujK3H9DyMOEYJDpaX1TZMphtWI89
 9u7kLO6DNE00oUoiX+6Aty0qQtXv12SSaNpJmFON87/WLJJamHuiS6NiZp/r4ORU
 51ds2LPGJAKAy9duqmZJ/81WlNjmHmurq1v+FIl29XInc4a2SpwEUM0rsTrrQTaD
 16Qh2OZCnlYEg9nh6B54FQe8xP+pp69Gn/BRFhcwW9fPq4/pHSrwKEkI6lE+Yuiq
 +Fe8r0DbZczfhjcGdoUlIgMj+WSVY9Q8Opztsmv/kjZqxt0VvfdmAVp0odl5KdB4
 cKAeYciNSgq2bGd+N4kuHA==
 =KuTi
 -----END PGP SIGNATURE-----

Merge tag 'pull-target-arm-20211215' of https://git.linaro.org/people/pmaydell/qemu-arm into staging

target-arm queue:
 * ITS: error reporting cleanup
 * aspeed: improve documentation
 * Fix STM32F2XX USART data register readout
 * allow emulated GICv3 to be disabled in non-TCG builds
 * fix exception priority for singlestep, misaligned PC, bp, etc
 * Correct calculation of tlb range invalidate length
 * npcm7xx_emc: fix missing queue_flush
 * virt: Add VIOT ACPI table for virtio-iommu
 * target/i386: Use assert() to sanity-check b1 in SSE decode
 * Don't include qemu-common unnecessarily

# gpg: Signature made Wed 15 Dec 2021 02:39:37 AM PST
# gpg:                using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE
# gpg:                issuer "peter.maydell@linaro.org"
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [full]
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>" [full]
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [full]

* tag 'pull-target-arm-20211215' of https://git.linaro.org/people/pmaydell/qemu-arm: (33 commits)
  tests/acpi: add expected blob for VIOT test on virt machine
  tests/acpi: add expected blobs for VIOT test on q35 machine
  tests/acpi: add test case for VIOT
  tests/acpi: allow updates of VIOT expected data files
  hw/arm/virt: Use object_property_set instead of qdev_prop_set
  hw/arm/virt: Reject instantiation of multiple IOMMUs
  hw/arm/virt: Remove device tree restriction for virtio-iommu
  hw/arm/virt-acpi-build: Add VIOT table for virtio-iommu
  hw/net: npcm7xx_emc fix missing queue_flush
  target/arm: Correct calculation of tlb range invalidate length
  hw/arm: Don't include qemu-common.h unnecessarily
  target/rx/cpu.h: Don't include qemu-common.h
  target/hexagon/cpu.h: don't include qemu-common.h
  include/hw/i386: Don't include qemu-common.h in .h files
  target/i386: Use assert() to sanity-check b1 in SSE decode
  tests/tcg: Add arm and aarch64 pc alignment tests
  target/arm: Suppress bp for exceptions with more priority
  target/arm: Assert thumb pc is aligned
  target/arm: Take an exception if PC is misaligned
  target/arm: Split compute_fsr_fsc out of arm_deliver_fault
  ...

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-12-15 10:33:45 -08:00
Jean-Philippe Brucker 092cba0350 hw/arm/virt: Remove device tree restriction for virtio-iommu
virtio-iommu is now supported with ACPI VIOT as well as device tree.
Remove the restriction that prevents from instantiating a virtio-iommu
device under ACPI.

Acked-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Message-id: 20211210170415.583179-3-jean-philippe@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-12-15 10:35:26 +00:00
Eric Auger 6b77ae0531 virtio-iommu: Fix the domain_range end
in old times the domain range was defined by a domain_bits le32.
This was then converted into a domain_range struct. During the
upgrade the original value of '32' (bits) has been kept while
the end field now is the max value of the domain id (UINT32_MAX).
Fix that and also use UINT64_MAX for the input_range.end.

Reported-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Message-Id: <20211127072910.1261824-4-eric.auger@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2021-12-15 08:08:59 +01:00
Eric Auger 3a411b2d96 virtio-iommu: Fix endianness in get_config
Endianess is not properly handled when populating
the returned config. Use the cpu_to_le* primitives
for each separate field. Also, while at it, trace
the domain range start.

Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Message-Id: <20211127072910.1261824-3-eric.auger@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2021-12-15 08:08:59 +01:00
Eric Auger 7b140d2359 virtio-iommu: Remove set_config callback
The spec says "the driver must not write to device configuration
fields". So remove the set_config() callback which anyway did
not do anything.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Message-Id: <20211127072910.1261824-2-eric.auger@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2021-12-15 08:08:59 +01:00
Jason Wang d3f1f940eb virtio-balloon: correct used length
Spec said:

"and len the total of bytes written into the buffer."

For inflateq, deflateq and statsq, we don't process in_sg so the used
length should be zero. For free_page_vq, tough the pages could be
changed by the device (in the destination), spec said:

"Note: len is particularly useful for drivers using untrusted buffers:
if a driver does not know exactly how much has been written by the
device, the driver would have to zero the buffer in advance to ensure
no data leakage occurs."

So 0 should be used as well here.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20211129030841.3611-2-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
2021-11-29 08:49:36 -05:00
Jason Wang 0fe7245d8b virtio-balloon: process all in sgs for free_page_vq
We only process the first in sg which may lead to the bitmap of the
pages belongs to following sgs were not cleared. This may result more
pages to be migrated. Fixing this by process all in sgs for
free_page_vq.

Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20211129030841.3611-1-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-11-29 08:49:36 -05:00
Cindy Lu 7abba7c638 virtio-mmio : fix the crash in the vm shutdown
The root cause for this crash is the ioeventfd not stopped while the VM stop.
The callback for vmstate_change was not implement in virtio-mmio bus

Reproduce step
load the vm with
 -M microvm \
  -netdev tap,id=net0,vhostforce,script=no,downscript=no  \
  -device virtio-net-device,netdev=net0\

After the VM boot, login the vm and then shutdown the vm

System will crash
[Current thread is 1 (Thread 0x7ffff6edde00 (LWP 374378))]
(gdb) bt
0  0x00005555558f18b4 in qemu_flush_or_purge_queued_packets (purge=false, nc=0x55500252e850) at ../net/net.c:636
1  qemu_flush_queued_packets (nc=0x55500252e850) at ../net/net.c:656
2  0x0000555555b6c363 in virtio_queue_notify_vq (vq=0x7fffe7e2b010) at ../hw/virtio/virtio.c:2339
3  virtio_queue_host_notifier_read (n=0x7fffe7e2b08c) at ../hw/virtio/virtio.c:3583
4  0x0000555555de7b5a in aio_dispatch_handler (ctx=ctx@entry=0x5555567c5780, node=0x555556b83fd0) at ../util/aio-posix.c:329
5  0x0000555555de8454 in aio_dispatch_ready_handlers (ready_list=<optimized out>, ctx=<optimized out>) at ../util/aio-posix.c:359
6  aio_poll (ctx=0x5555567c5780, blocking=blocking@entry=false) at ../util/aio-posix.c:662
7  0x0000555555cce0cc in monitor_cleanup () at ../monitor/monitor.c:645
8  0x0000555555b06bd2 in qemu_cleanup () at ../softmmu/runstate.c:822
9  0x000055555586e693 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at ../softmmu/main.c:51

Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20211109023744.22387-1-lulu@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-11-28 17:03:52 -05:00
Jason Wang d152cdd6f6 virtio: use virtio accessor to access packed event
We used to access packed descriptor event and off_wrap via
address_space_{write|read}_cached(). When we hit the cache, memcpy()
is used which is not atomic which may lead a wrong value to be read or
wrote.

This patch fixes this by switching to use
virito_{stw|lduw}_phys_cached() to make sure the access is atomic.

Fixes: 683f766567 ("virtio: event suppression support for packed ring")
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20211111063854.29060-2-jasowang@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-11-15 09:44:46 -05:00
Jason Wang f463e761a4 virtio: use virtio accessor to access packed descriptor flags
We used to access packed descriptor flags via
address_space_{write|read}_cached(). When we hit the cache, memcpy()
is used which is not an atomic operation which may lead a wrong value
is read or wrote.

So this patch switches to use virito_{stw|lduw}_phys_cached() to make
sure the aceess is atomic.

Fixes: 86044b24e8 ("virtio: basic packed virtqueue support")
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20211111063854.29060-1-jasowang@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-11-15 09:44:46 -05:00
Eugenio Pérez 245cf2c24e vhost: Rename last_index to vq_index_end
The doc of this field pointed out that last_index is the last vq index.
This is misleading, since it's actually one past the end of the vqs.

Renaming and modifying comment.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20211104085625.2054959-2-eperezma@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-11-11 03:13:05 -05:00
Richard Henderson 7fa736595e pc,pci,virtio: features, fixes
virtio-iommu support for x86/ACPI.
 Fixes, cleanups all over the place.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmGAefYPHG1zdEByZWRo
 YXQuY29tAAoJECgfDbjSjVRpiCUH/2pIs3FmOGIasEqn4BnqXr4dHMReUO5Ghg0v
 cXle4ZUrbg7Qpnxh07CwMuUpJV3Qv+xtVK7hzbD13nnxrkTZuKzBRV1AthkA1Hly
 zIKOxnEgV497LaXoaSOtqAx48fuznk5XOHju91usgu4mehJ0qe2gcwb4H8uWGkQi
 hrsR7a9woP0M4H/jvb3+aQRCJKMscj8ReabM1ulOugNpPdNI/jIKtBvZBtTxAqtQ
 CH9/DJLfVmzDRYdeBpnF06A+tXm4uU1Q5BmpmF9qaymk/PzthN54gdnDd6zH405Z
 Tmjp9UA2xfEYDmKzuTCBdPmoUe6OI7mU9o0WbB5MGYx5RRRBETw=
 =R7DD
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging

pc,pci,virtio: features, fixes

virtio-iommu support for x86/ACPI.
Fixes, cleanups all over the place.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Mon 01 Nov 2021 07:36:22 PM EDT
# gpg:                using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg:                issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>" [full]

* remotes/mst/tags/for_upstream:
  hw/i386: fix vmmouse registration
  pci: Export pci_for_each_device_under_bus*()
  pci: Define pci_bus_dev_fn/pci_bus_fn/pci_bus_ret_fn
  hw/i386/pc: Allow instantiating a virtio-iommu device
  hw/i386/pc: Move IOMMU singleton into PCMachineState
  hw/i386/pc: Remove x86_iommu_get_type()
  hw/acpi: Add VIOT table
  vhost-vdpa: Set discarding of RAM broken when initializing the backend
  qtest: fix 'expression is always false' build failure in qtest_has_accel()

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-11-02 19:24:17 -04:00
David Hildenbrand e1c1915bef vhost-vdpa: Set discarding of RAM broken when initializing the backend
Similar to VFIO, vDPA will go ahead an map+pin all guest memory. Memory
that used to be discarded will get re-populated and if we
discard+re-access memory after mapping+pinning, the pages mapped into the
vDPA IOMMU will go out of sync with the actual pages mapped into the user
space page tables.

Set discarding of RAM broken such that:
- virtio-mem and vhost-vdpa run mutually exclusive
- virtio-balloon is inhibited and no memory discards will get issued

In the future, we might be able to support coordinated discarding of RAM
as used by virtio-mem and already supported by vfio via the
RamDiscardManager.

Acked-by: Jason Wang <jasowang@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Cindy Lu <lulu@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20211027130324.59791-1-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
2021-11-01 18:49:10 -04:00
David Hildenbrand f4578df399 virtio-mem: Drop precopy notifier
Migration code now properly handles RAMBlocks which are indirectly managed
by a RamDiscardManager. No need for manual handling via the free page
optimization interface, let's get rid of it.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-11-01 22:56:44 +01:00
David Hildenbrand 372aa6fd73 virtio-mem: Implement replay_discarded RamDiscardManager callback
Implement it similar to the replay_populated callback.

Acked-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2021-11-01 22:56:44 +01:00
Jason Wang 402378407d vhost-vdpa: multiqueue support
This patch implements the multiqueue support for vhost-vdpa. This is
done simply by reading the number of queue pairs from the config space
and initialize the datapath and control path net client.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20211020045600.16082-11-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-10-20 04:44:05 -04:00
Jason Wang 353244d8b9 vhost-vdpa: prepare for the multiqueue support
Unlike vhost-kernel, vhost-vdpa adapts a single device multiqueue
model. So we need to simply use virtqueue index as the vhost virtqueue
index. This is a must for multiqueue to work for vhost-vdpa.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20211020045600.16082-4-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-10-20 04:44:05 -04:00
Jason Wang 4d191cfdc7 vhost-vdpa: classify one time request
Vhost-vdpa uses one device multiqueue queue (pairs) model. So we need
to classify the one time request (e.g SET_OWNER) and make sure those
request were only called once per device.

This is used for multiqueue support.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20211020045600.16082-3-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-10-20 04:44:05 -04:00
Xueming Li a1ed9ef1de vhost-user: fix duplicated notifier MR init
In case of device resume after suspend, VQ notifier MR still valid.
Duplicated registrations explode memory block list and slow down device
resume.

Fixes: 44866521bd ("vhost-user: support registering external host notifiers")
Cc: tiwei.bie@intel.com
Cc: qemu-stable@nongnu.org
Cc: Yuwei Zhang <zhangyuwei.9149@bytedance.com>

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Message-Id: <20211008080215.590292-1-xuemingl@nvidia.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-10-20 04:37:55 -04:00
Mathieu Poirier c7160fff7d vhost-user-rng-pci: Add vhost-user-rng-pci implementation
This patch provides a PCI bus interface to the vhost-user-rng backend.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Message-Id: <20211012205904.4106769-3-mathieu.poirier@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-10-20 04:37:55 -04:00
Mathieu Poirier 821d28b88f vhost-user-rng: Add vhost-user-rng implementation
Introduce a random number generator (RNG) backend that communicates
with a vhost-user server to retrieve entropy.  That way other VMM
that comply with the vhost user protocl can use the same vhost-user
daemon without having to write yet another RNG driver.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Message-Id: <20211012205904.4106769-2-mathieu.poirier@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-10-20 04:37:55 -04:00
Eric Auger 19d20e910a virtio-iommu: Drop base_name and change generic_name
Drop base_name and turn generic_name into
"virtio-iommu-pci". This is more in line with
other modern-only devices.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Suggested-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20211013191755.767468-3-eric.auger@redhat.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-10-20 04:37:55 -04:00
Eric Auger 819bbda81f virtio-iommu: Remove the non transitional name
Remove the non transitional name for virtio iommu. Like other
devices introduced after 1.0 spec, the virtio-iommu does
not need it.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reported-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20211013191755.767468-2-eric.auger@redhat.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-10-20 04:37:55 -04:00
Eugenio Pérez 013108b6e5 vdpa: Check for iova range at mappings changes
Check vdpa device range before updating memory regions so we don't add
any outside of it, and report the invalid change if any.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20211014141236.923287-4-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
2021-10-20 04:37:55 -04:00
Eugenio Pérez 032e4d686e vdpa: Add vhost_vdpa_section_end
Abstract this operation, that will be reused when validating the region
against the iova range that the device supports.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20211014141236.923287-3-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
2021-10-20 04:37:55 -04:00
Eugenio Pérez c64038c93e vdpa: Skip protected ram IOMMU mappings
Following the logic of commit 56918a126a ("memory: Add RAM_PROTECTED
flag to skip IOMMU mappings") with VFIO, skip memory sections
inaccessible via normal mechanisms, including DMA.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20211014141236.923287-2-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
2021-10-20 04:37:55 -04:00
Dr. David Alan Gilbert 243a9284a9 virtio-balloon: Fix page-poison subsection name
The subsection name for page-poison was typo'd as:

  vitio-balloon-device/page-poison

Note the missing 'r' in virtio.

When we have a machine type that enables page poison, and the guest
enables it (which needs a new kernel), things fail rather unpredictably.

The fallout from this is that most of the other subsections fail to
load, including things like the feature bits in the device, one
possible fallout is that the physical addresses of the queues
then get aligned differently and we fail with an error about
last_avail_idx being wrong.
It's not obvious to me why this doesn't produce a more obvious failure,
but virtio's vmstate loading is a bit open-coded.

Fixes: 7483cbbaf8 ("virtio-balloon: Implement support for page poison reporting feature")
bz: https://bugzilla.redhat.com/show_bug.cgi?id=1984401
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20210914131716.102851-1-dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
2021-10-05 17:30:57 -04:00
Stefano Garzarella 46ce017167 vhost-vsock: handle common features in vhost-vsock-common
virtio-vsock features, like VIRTIO_VSOCK_F_SEQPACKET, can be handled
by vhost-vsock-common parent class. In this way, we can reuse the
same code for all virtio-vsock backends (i.e. vhost-vsock,
vhost-user-vsock).

Let's move `seqpacket` property to vhost-vsock-common class, add
vhost_vsock_common_get_features() used by children, and disable
`seqpacket` for vhost-user-vsock device for machine types < 6.2.

The behavior of vhost-vsock device doesn't change; vhost-user-vsock
device now supports `seqpacket` property.

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20210921161642.206461-3-sgarzare@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-10-05 17:30:57 -04:00
Stefano Garzarella d6a9378f47 vhost-vsock: fix migration issue when seqpacket is supported
Commit 1e08fd0a46 ("vhost-vsock: SOCK_SEQPACKET feature bit support")
enabled the SEQPACKET feature bit.
This commit is released with QEMU 6.1, so if we try to migrate a VM where
the host kernel supports SEQPACKET but machine type version is less than
6.1, we get the following errors:

    Features 0x130000002 unsupported. Allowed features: 0x179000000
    Failed to load virtio-vhost_vsock:virtio
    error while loading state for instance 0x0 of device '0000:00:05.0/virtio-vhost_vsock'
    load of migration failed: Operation not permitted

Let's disable the feature bit for machine types < 6.1.
We add a new OnOffAuto property for this, called `seqpacket`.
When it is `auto` (default), QEMU behaves as before, trying to enable the
feature, when it is `on` QEMU will fail if the backend (vhost-vsock
kernel module) doesn't support it.

Fixes: 1e08fd0a46 ("vhost-vsock: SOCK_SEQPACKET feature bit support")
Cc: qemu-stable@nongnu.org
Reported-by: Jiang Wang <jiang.wang@bytedance.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20210921161642.206461-2-sgarzare@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-10-05 17:30:57 -04:00
Philippe Mathieu-Daudé d6ed27bae7 hw/virtio: Have virtqueue_get_avail_bytes() pass caches arg to callees
Both virtqueue_packed_get_avail_bytes() and
virtqueue_split_get_avail_bytes() access the region cache, but
their caller also does. Simplify by having virtqueue_get_avail_bytes
calling both with RCU lock held, and passing the caches as argument.

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210906104318.1569967-4-philmd@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
2021-10-05 11:19:40 -04:00
Philippe Mathieu-Daudé ab4dd2746c hw/virtio: Acquire RCU read lock in virtqueue_packed_drop_all()
vring_get_region_caches() must be called with the RCU read lock
acquired. virtqueue_packed_drop_all() does not, and uses the
'caches' pointer. Fix that by using the RCU_READ_LOCK_GUARD()
macro.

Reported-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210906104318.1569967-3-philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
2021-10-05 11:19:40 -04:00
David Hildenbrand d89dd28f0e qapi: Include qom-path in MEMORY_DEVICE_SIZE_CHANGE qapi events
As we might not always have a device id, it is impossible to always
match MEMORY_DEVICE_SIZE_CHANGE events to an actual device. Let's
include the qom-path in the event, which allows for reliable mapping of
events to devices.

Fixes: 722a3c783e ("virtio-pci: Send qapi events when the virtio-mem size changes")
Suggested-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210929162445.64060-3-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-02 08:43:21 +02:00
David Hildenbrand 75b98cb9f6 virtio-mem-pci: Fix memory leak when creating MEMORY_DEVICE_SIZE_CHANGE event
Apparently, we don't have to duplicate the string.

Fixes: 722a3c783e ("virtio-pci: Send qapi events when the virtio-mem size changes")
Cc: qemu-stable@nongnu.org
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20210929162445.64060-2-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-10-02 08:43:21 +02:00
Peter Maydell bb4aa8f59e target-arm queue:
* allwinner-h3: Switch to SMC as PSCI conduit
  * arm: tcg: Adhere to SMCCC 1.3 section 5.2
  * xlnx-zcu102, xlnx-versal-virt: Support BBRAM and eFUSE devices
  * gdbstub related code cleanups
  * Don't put FPEXC and FPSID in org.gnu.gdb.arm.vfp XML
  * Use _init vs _new convention in bus creation function names
  * sabrelite: Connect SPI flash CS line to GPIO3_19
 -----BEGIN PGP SIGNATURE-----
 
 iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAmFV05gZHHBldGVyLm1h
 eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3kVoD/9rlpi81v6U2zPmW5s/lFB8
 m7eqtVpP2T1UwwGPw5jXZ4qAyDyCDXJxtW8B2ePxjXfrFT5f59hy9myBFqDebjNC
 Xwdwafc17lkUm0CrIEwhMhGYiXs6yak1YcGqEPZ3ceWt67kVByXGj89mLepogCHn
 LvcjQGC3PuDvDHWnOKdOBhxTu+rvQSDRXpVCuBAd3eJBn9jxG10cdaCr3/Z7VFA/
 bnc9bSU8qJ0hYCswHHld9R2Rk9zYDQmrtMpygN6pviCd5qWGEOh8b5vszmrSHYo9
 tn0bSp9d9k2wBXrPR5Ux3L0IMRBp7N88tSDw2QyatDltltjsCKw+ZMxjKHh0mxnr
 N1QF1FteIFliu5GQeMiEWPP87rVZ31quWZUIln6XYo9+aXus8jd88vxdpND1v767
 np/q6BW0g+Tuu2T+QRe5V8VBQJzgEAKT7AwCVHC+5Flyq8fWFcFdPp1dygWXdzW3
 Yhmq2JwwMq/3MjZY10aymohrvFPAQSx2bGGDS9yi8m5seaJvHjJW5fZQUVapy0vw
 andiIFNC9KxeQ23AZM0oKkW/d5EckKIkagfiq5+71QhvtbJrXbz+fs7UxHN0IVeX
 7px+ih0xJcz3uVxZtZ/kvpBMMe3WEMd9r2tZOhbJ9K8RlCcB11y1AnZaBs/2fess
 +DzTJOkZGu1oDP4IAAqGBg==
 =eAN3
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20210930' into staging

target-arm queue:
 * allwinner-h3: Switch to SMC as PSCI conduit
 * arm: tcg: Adhere to SMCCC 1.3 section 5.2
 * xlnx-zcu102, xlnx-versal-virt: Support BBRAM and eFUSE devices
 * gdbstub related code cleanups
 * Don't put FPEXC and FPSID in org.gnu.gdb.arm.vfp XML
 * Use _init vs _new convention in bus creation function names
 * sabrelite: Connect SPI flash CS line to GPIO3_19

# gpg: Signature made Thu 30 Sep 2021 16:11:20 BST
# gpg:                using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE
# gpg:                issuer "peter.maydell@linaro.org"
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [ultimate]
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>" [ultimate]
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [ultimate]
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83  15CF 3C25 25ED 1436 0CDE

* remotes/pmaydell/tags/pull-target-arm-20210930: (22 commits)
  hw/arm: sabrelite: Connect SPI flash CS line to GPIO3_19
  ide: Rename ide_bus_new() to ide_bus_init()
  qbus: Rename qbus_create() to qbus_new()
  qbus: Rename qbus_create_inplace() to qbus_init()
  pci: Rename pci_root_bus_new_inplace() to pci_root_bus_init()
  ipack: Rename ipack_bus_new_inplace() to ipack_bus_init()
  scsi: Replace scsi_bus_new() with scsi_bus_init(), scsi_bus_init_named()
  target/arm: Don't put FPEXC and FPSID in org.gnu.gdb.arm.vfp XML
  target/arm: Move gdbstub related code out of helper.c
  target/arm: Fix coding style issues in gdbstub code in helper.c
  configs: Don't include 32-bit-only GDB XML in aarch64 linux configs
  docs/system/arm: xlnx-versal-virt: BBRAM and eFUSE Usage
  hw/arm: xlnx-zcu102: Add Xilinx eFUSE device
  hw/arm: xlnx-zcu102: Add Xilinx BBRAM device
  hw/arm: xlnx-versal-virt: Add Xilinx eFUSE device
  hw/arm: xlnx-versal-virt: Add Xilinx BBRAM device
  hw/nvram: Introduce Xilinx battery-backed ram
  hw/nvram: Introduce Xilinx ZynqMP eFuse device
  hw/nvram: Introduce Xilinx Versal eFuse device
  hw/nvram: Introduce Xilinx eFuse QOM
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-09-30 21:16:54 +01:00
Peter Xu 142518bda5 memory: Name all the memory listeners
Provide a name field for all the memory listeners.  It can be used to identify
which memory listener is which.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210817013553.30584-2-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-30 15:30:24 +02:00
Peter Maydell d637e1dc6d qbus: Rename qbus_create_inplace() to qbus_init()
Rename qbus_create_inplace() to qbus_init(); this is more in line
with our usual naming convention for functions that in-place
initialize objects.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 20210923121153.23754-5-peter.maydell@linaro.org
2021-09-30 13:42:10 +01:00
Jason Wang 2a83e97ee8 vhost-vdpa: correctly return err in vhost_vdpa_set_backend_cap()
We should return error code instead of zero, otherwise there's no way
for the caller to detect the failure.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20210903091031.47303-3-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-09-04 17:34:05 -04:00
Yuwei Zhang c6effa9cf5 hw/virtio: Add flatview update in vhost_user_cleanup()
Qemu will crash on vhost backend unexpected exit and re-connect                                                                          │
in some case due to access released memory.

Signed-off-by: Yuwei Zhang <zhangyuwei.9149@bytedance.com>
Message-Id: <20210830123433.45727-1-zhangyuwei.9149@bytedance.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-09-04 17:34:05 -04:00
Philippe Mathieu-Daudé b116d6c319 hw/virtio: Remove NULL check in virtio_free_region_cache()
virtio_free_region_cache() is called within call_rcu(),
always with a non-NULL argument. Ensure new code keep it
that way by replacing the NULL check by an assertion.
Add a comment this function is called within call_rcu().

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210826172658.2116840-3-philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-09-04 17:34:05 -04:00
Philippe Mathieu-Daudé 7f51beddad hw/virtio: Document virtio_queue_packed_empty_rcu is called within RCU
While virtio_queue_packed_empty_rcu() uses the '_rcu' suffix,
it is not obvious it is called within rcu_read_lock(). All other
functions from this file called with the RCU locked have a comment
describing it. Document this one similarly for consistency.

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210826172658.2116840-2-philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-09-04 17:34:05 -04:00
Jason Wang ae4003738f vhost: correctly detect the enabling IOMMU
Vhost used to compare the dma_as against the address_space_memory to
detect whether the IOMMU is enabled or not. This might not work well
since the virito-bus may call get_dma_as if VIRTIO_F_IOMMU_PLATFORM is
set without an actual IOMMU enabled when device is plugged. In the
case of PCI where pci_get_address_space() is used, the bus master as
is returned. So vhost actually tries to enable device IOTLB even if
the IOMMU is not enabled. This will lead a lots of unnecessary
transactions between vhost and Qemu and will introduce a huge drop of
the performance.

For PCI, an ideal approach is to use pci_device_iommu_address_space()
just for get_dma_as. But Qemu may choose to initialize the IOMMU after
the virtio-pci which lead a wrong address space is returned during
device plugged. So this patch switch to use transport specific way via
iommu_enabled() to detect the IOMMU during vhost start. In this case,
we are fine since we know the IOMMU is initialized correctly.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20210804034803.1644-4-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-09-04 16:35:17 -04:00
Jason Wang 3d1e5d86fe virtio-pci: implement iommu_enabled()
This patch implements the PCI transport version of iommu_enabled. This
is done by comparing the address space returned by
pci_device_iommu_address_space() against address_space_memory.

Note that an ideal approach is to use pci_device_iommu_address_space()
in get_dma_as(), but it might not work well since the IOMMU could be
initialized after the virtio-pci device is initialized.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20210804034803.1644-3-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-09-04 16:35:17 -04:00
Jason Wang dd014b4f49 virtio-bus: introduce iommu_enabled()
This patch introduce a new method for the virtio-bus for the transport
to report whether or not the IOMMU is enabled for the device.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20210804034803.1644-2-jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-09-04 16:35:17 -04:00
David Hildenbrand 2d050ed07c virtio-balloon: free page hinting cleanups
Let's compress the code a bit to improve readability. We can drop the
vm_running check in virtio_balloon_free_page_start() as it's already
properly checked in the single caller.

Cc: Wei Wang <wei.w.wang@intel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
Cc: Alexander Duyck <alexander.duyck@gmail.com>
Cc: Juan Quintela <quintela@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210708095339.20274-3-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-09-04 16:35:17 -04:00
David Hildenbrand fd51e54fa1 virtio-balloon: don't start free page hinting if postcopy is possible
Postcopy never worked properly with 'free-page-hint=on', as there are
at least two issues:

1) With postcopy, the guest will never receive a VIRTIO_BALLOON_CMD_ID_DONE
   and consequently won't release free pages back to the OS once
   migration finishes.

   The issue is that for postcopy, we won't do a final bitmap sync while
   the guest is stopped on the source and
   virtio_balloon_free_page_hint_notify() will only call
   virtio_balloon_free_page_done() on the source during
   PRECOPY_NOTIFY_CLEANUP, after the VM state was already migrated to
   the destination.

2) Once the VM touches a page on the destination that has been excluded
   from migration on the source via qemu_guest_free_page_hint() while
   postcopy is active, that thread will stall until postcopy finishes
   and all threads are woken up. (with older Linux kernels that won't
   retry faults when woken up via userfaultfd, we might actually get a
   SEGFAULT)

   The issue is that the source will refuse to migrate any pages that
   are not marked as dirty in the dirty bmap -- for example, because the
   page might just have been sent. Consequently, the faulting thread will
   stall, waiting for the page to be migrated -- which could take quite
   a while and result in guest OS issues.

While we could fix 1) comparatively easily, 2) is harder to get right and
might require more involved RAM migration changes on source and destination
[1].

As it never worked properly, let's not start free page hinting in the
precopy notifier if the postcopy migration capability was enabled to fix
it easily. Capabilities cannot be enabled once migration is already
running.

Note 1: in the future we might either adjust migration code on the source
        to track pages that have actually been sent or adjust
        migration code on source and destination  to eventually send
        pages multiple times from the source and and deal with pages
        that are sent multiple times on the destination.

Note 2: virtio-mem has similar issues, however, access to "unplugged"
        memory by the guest is very rare and we would have to be very
        lucky for it to happen during migration. The spec states
        "The driver SHOULD NOT read from unplugged memory blocks ..."
        and "The driver MUST NOT write to unplugged memory blocks".
        virtio-mem will move away from virtio_balloon_free_page_done()
        soon and handle this case explicitly on the destination.

[1] https://lkml.kernel.org/r/e79fd18c-aa62-c1d8-c7f3-ba3fc2c25fc8@redhat.com

Fixes: c13c4153f7 ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT")
Cc: qemu-stable@nongnu.org
Cc: Wei Wang <wei.w.wang@intel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
Cc: Alexander Duyck <alexander.duyck@gmail.com>
Cc: Juan Quintela <quintela@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210708095339.20274-2-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2021-09-04 16:35:17 -04:00
Alyssa Ross edb40732bf vhost-user: add missing space in error message
This would previously give error messages like

> Received unexpected msg type.Expected 0 received 1

Signed-off-by: Alyssa Ross <hi@alyssa.is>
Message-Id: <20210806143926.315725-1-hi@alyssa.is>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-09-04 09:07:46 -04:00
Tiberiu Georgescu 9b1d929adb hw/virtio: move vhost_set_backend_type() to vhost.c
Just a small refactor patch.

vhost_set_backend_type() gets called only in vhost.c, so we can move the
function there and make it static. We can then extern the visibility of
kernel_ops, to match the other VhostOps in vhost-backend.h.
The VhostOps constants now make more sense in vhost.h

Suggested-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Signed-off-by: Tiberiu Georgescu <tiberiu.georgescu@nutanix.com>
Message-Id: <20210809134015.67941-1-tiberiu.georgescu@nutanix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-09-04 09:07:46 -04:00
Denis Plotnikov 699f2e535d vhost: make SET_VRING_ADDR, SET_FEATURES send replies
On vhost-user-blk migration, qemu normally sends a number of commands
to enable logging if VHOST_USER_PROTOCOL_F_LOG_SHMFD is negotiated.
Qemu sends VHOST_USER_SET_FEATURES to enable buffers logging and
VHOST_USER_SET_VRING_ADDR per each started ring to enable "used ring"
data logging.
The issue is that qemu doesn't wait for reply from the vhost daemon
for these commands which may result in races between qemu expectation
of logging starting and actual login starting in vhost daemon.

The race can appear as follows: on migration setup, qemu enables dirty page
logging by sending VHOST_USER_SET_FEATURES. The command doesn't arrive to a
vhost-user-blk daemon immediately and the daemon needs some time to turn the
logging on internally. If qemu doesn't wait for reply, after sending the
command, qemu may start migrateing memory pages to a destination. At this time,
the logging may not be actually turned on in the daemon but some guest pages,
which the daemon is about to write to, may have already been transferred
without logging to the destination. Since the logging wasn't turned on,
those pages won't be transferred again as dirty. So we may end up with
corrupted data on the destination.
The same scenario is applicable for "used ring" data logging, which is
turned on with VHOST_USER_SET_VRING_ADDR command.

To resolve this issue, this patch makes qemu wait for the command result
explicitly if VHOST_USER_PROTOCOL_F_REPLY_ACK is negotiated and logging enabled.

Signed-off-by: Denis Plotnikov <den-plotnikov@yandex-team.ru>

Message-Id: <20210809104824.78830-1-den-plotnikov@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-09-04 09:07:45 -04:00