Currently we will try to traverse all virtqueues to find a subset that
using a specific vector. This is sub optimal when we will support
hundreds or even thousands of virtqueues. So this patch introduces a
method which could be used by transport to get all virtqueues that
using a same vector. This is done through QLISTs and the number of
QLISTs was queried through a transport specific method. When guest
setting vectors, the virtqueue will be linked and helpers for traverse
the list was also introduced.
The first user will be virtio pci which will use this to speed up
MSI-X masking and unmasking handling.
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We don't validate the existence of handle_output which may let a buggy
guest to trigger a SIGSEV easily. E.g:
1) write 10 to queue_sel to a virtio net device with only 1 queue
2) setup an arbitrary pfn
3) then notify queue 10
Fixing this by validating the existence of handle_output before.
Cc: qemu-stable@nongnu.org
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Don Koch <dkoch@verizon.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Add a helper function for checking whether a bit is set in the guest
features for a vdev as well as one that works on a feature bit set.
Convert code that open-coded this: It cleans up the code and makes it
easier to extend the guest feature bits.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Drop a bunch of code duplicated from virtio_config.h and virtio_ring.h.
This makes us rename event index accessors which conflict,
as reusing the ones from virtio_ring.h isn't trivial.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
One of the annoyances of the current migration format is the fact that
it's not self-describing. In fact, it's not properly describing at all.
Some code randomly scattered throughout QEMU elaborates roughly how to
read and write a stream of bytes.
We discussed an idea during KVM Forum 2013 to add a JSON description of
the migration protocol itself to the migration stream. This patch
adds a section after the VM_END migration end marker that contains
description data on what the device sections of the stream are composed of.
This approach is backwards compatible with any QEMU version reading the
stream, because QEMU just stops reading after the VM_END marker and ignores
any data following it.
With an additional external program this allows us to decipher the
contents of any migration stream and hopefully make migration bugs easier
to track down.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
The event idx in virtio is an effective way to reduce the number of
interrupts and exits of the guest. When the guest puts an request
into the virtio ring, it doesn't exit immediately to inform the
backend. Instead, the guest checks the "avail" event idx to determine
the notification.
In virtqueue_pop, when a request is poped, the current avail event
idx should be set to the number of vq->last_avail_idx.
Signed-off-by: Bin Wu <wu.wubin@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
For better code sharing, add a helper function that handles
reference counting of the virtio backend for virtio proxy devices.
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This reverts commit a1bc7b827e422e1ff065640d8ec5347c4aadfcd8.
virtio: don't call device on !vm_running
It turns out that virtio net assumes that vm_running
is updated before device status callback in many places,
so this change leads to asserts.
Previous commit fixes the root issue that motivated
a1bc7b827e422e1ff065640d8ec5347c4aadfcd8 differently,
so there's no longer a need for this change.
In the future, we might be able to drop checking vm_running
completely, and check vm state directly.
Reported-by: Dietmar Maurer <dietmar@proxmox.com>
Cc: qemu-stable@nongnu.org
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
On vm stop, virtio changes vm_running state
too soon, so callbacks can get envoked with
vm_running = false;
Cc: qemu-stable@nongnu.org
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Quoting original text from Rusty: "This is based on a simpler patch by Anthony
Liguouri".
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
[ add VirtIODevice * argument to most helpers,
Greg Kurz <gkurz@linux.vnet.ibm.com> ]
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Some CPU families can dynamically change their endianness. This means we
can have little endian ppc or big endian arm guests for example. This has
an impact on legacy virtio data structures since they are target endian.
We hence introduce a new property to track the endianness of each virtio
device. It is reasonnably assumed that endianness won't change while the
device is in use : we hence capture the device endianness when it gets
reset.
We migrate this property in a subsection, after the device descriptor. This
means the load code must not rely on it until it is restored. As a consequence,
the vring sanity checks had to be moved after the call to vmstate_load_state().
We enforce paranoia by poisoning the property at the begining of virtio_load().
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
There is a need to add some more fields to VirtIODevice that should be
migrated (broken status, endianness). The problem is that we do not
want to break compatibility while adding a new feature... This issue has
been addressed in the generic VMState code with the use of optional
subsections. As a *temporary* alternative to port the whole virtio
migration code to VMState, this patch mimics a similar subsectionning
ability for virtio, using the VMState code.
Since each virtio device is streamed in its own section, the idea is to
stream subsections between the end of the device section and the start
of the next sections. This allows an older QEMU to complain and exit
when fed with subsections:
Unknown savevm section type 5
load of migration failed
Suggested-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In order to migrate virtio subsections, they should be streamed after
the device itself. We need the device specific code to be called from
the common migration code to achieve this. This patch introduces load
and save methods for this purpose.
Suggested-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Commit 'virtio: validate config_len on load' restricted config_len
loaded from the wire to match the config_len that the device had.
Unfortunately, there are cases where this isn't true, the one
we found it on was the wce addition in virtio-blk.
Allow mismatched config-lengths:
*) If the version on the wire is shorter then fine
*) If the version on the wire is longer, load what we have space
for and skip the rest.
(This is mst@redhat.com's rework of what I originally posted)
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
It's a loop from i < num_sg and the array is VIRTQUEUE_MAX_SIZE - so
it's OK if the value read is VIRTQUEUE_MAX_SIZE.
Not a big problem in practice as people don't use
such big queues, but it's inelegant.
Reported-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Malformed input can have config_len in migration stream
exceed the array size allocated on destination, the
result will be heap overflow.
To fix, that config_len matches on both sides.
CVE-2014-0182
Reported-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
--
v2: use %ix and %zx to print config_len values
Signed-off-by: Juan Quintela <quintela@redhat.com>
CVE-2013-4535
CVE-2013-4536
Both virtio-block and virtio-serial read,
VirtQueueElements are read in as buffers, and passed to
virtqueue_map_sg(), where num_sg is taken from the wire and can force
writes to indicies beyond VIRTQUEUE_MAX_SIZE.
To fix, validate num_sg.
Reported-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
CVE-2013-6399
vdev->queue_sel is read from the wire, and later used in the
emulation code as an index into vdev->vq[]. If the value of
vdev->queue_sel exceeds the length of vdev->vq[], currently
allocated to be VIRTIO_PCI_QUEUE_MAX elements, subsequent PIO
operations such as VIRTIO_PCI_QUEUE_PFN can be used to overrun
the buffer with arbitrary data originating from the source.
Fix this by failing migration if the value from the wire exceeds
VIRTIO_PCI_QUEUE_MAX.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Juan Quintela <quintela@redhat.com>
CVE-2013-4151 QEMU 1.0 out-of-bounds buffer write in
virtio_load@hw/virtio/virtio.c
So we have this code since way back when:
num = qemu_get_be32(f);
for (i = 0; i < num; i++) {
vdev->vq[i].vring.num = qemu_get_be32(f);
array of vqs has size VIRTIO_PCI_QUEUE_MAX, so
on invalid input this will write beyond end of buffer.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
This got lost in a rebase.
Reported-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Temporarily allow either VirtioDeviceClass::init or
VirtioDeviceClass::realize.
Introduce VirtioDeviceClass::unrealize for symmetry.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Right now we have these pairs:
- virtio_bus_plug_device/virtio_bus_destroy_device. The first
takes a VirtIODevice, the second takes a VirtioBusState
- device_plugged/device_unplug callbacks in the VirtioBusClass
(here it's just the naming that is inconsistent)
- virtio_bus_destroy_device is not called by anyone (and since
it calls qdev_free, it would be called by the proxies---but
then the callback is useless since the proxies can do whatever
they want before calling virtio_bus_destroy_device)
And there is a k->init but no k->exit, hence virtio_device_exit is
overwritten by subclasses (except virtio-9p). This cleans it up by:
- renaming the device_unplug callback to device_unplugged
- renaming virtio_bus_plug_device to virtio_bus_device_plugged,
matching the callback name
- renaming virtio_bus_destroy_device to virtio_bus_device_unplugged,
removing the qdev_free, making it take a VirtIODevice and calling it
from virtio_device_exit
- adding a k->exit callback
virtio_device_exit is still overwritten, the next patches will fix that.
Cc: qemu-stable@nongnu.org
Acked-by: Andreas Faerber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
virtqueue_get_avail_bytes: when found a indirect desc, we need loop over it.
/* loop over the indirect descriptor table */
indirect = 1;
max = vring_desc_len(desc_pa, i) / sizeof(VRingDesc);
num_bufs = i = 0;
desc_pa = vring_desc_addr(desc_pa, i);
But, It init i to 0, then use i to update desc_pa. so we will always get:
desc_pa = vring_desc_addr(desc_pa, 0);
the last two line should swap.
Cc: qemu-stable@nongnu.org
Signed-off-by: Yin Yin <yin.yin@cs2c.com.cn>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This includes some last-minute bugfixes for 1.6.
All very small patches that also look very safe to me.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.13 (GNU/Linux)
iQEcBAABAgAGBQJSCKrZAAoJECgfDbjSjVRpzWsH/2vJswTENyE1ws/fgs3QIxM/
YNGpOkxXGtfLB8EgkSchdEFytFidDE7VJZA/maRS3jY1/vZbd54qjlfBSaoWa27l
eaLMqjr5vdFQXJMn4WS1Fhv2HEiTRame8RxvCkLvv3SU87QzDxbwdvgTNUsDSREJ
OUBZLqEpyK5mf7e/qdFxxFUWuOGAfbQhMw3A8jYYxNbmczbSvawA/qthTgsXiyW4
t5Kak2GzQ5W5yLhhe3PhdoD/9XnG0qFKP2ZGha/PcrQjAi+7oCZl2qJ55V5MTHl8
mh8Q1Qpp/5SDeo6kKNVBQ5ysF9iUbrPxog44LnkVgX4F8/282/birt6VfeyKZbg=
=U+hn
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'mst/tags/for_anthony' into staging
pci,virtio fixes for 1.6
This includes some last-minute bugfixes for 1.6.
All very small patches that also look very safe to me.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Mon 12 Aug 2013 04:28:57 AM CDT using RSA key ID D28D5469
# gpg: Can't check signature: public key not found
# By Michael S. Tsirkin (2) and others
# Via Michael S. Tsirkin
* mst/tags/for_anthony:
vhost: clear signalled_used_valid on vhost stop
virtio: clear signalled_used_valid when switching from dataplane
i82801b11: Fix i82801b11 PCI host bridge config space
pc: disable pci-info for 1.6
Message-id: 1376308831-19978-1-git-send-email-mst@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
When the dataplane thread stops, its vring.c implementation synchronizes
vring state back to virtio.c so we can continue emulating the virtio
device.
This patch ensures that virtio.c's signalled_used_valid flag is reset so
that we do not suppress guest notifications due to stale signalled_used
values.
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
A queue size of 0 is used to indicate a nonexistent queue, so
don't allow the guest to flip a queue between zero-size and
non-zero-size. Don't permit setting of negative queue sizes
either.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1374853288-9912-2-git-send-email-peter.maydell@linaro.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Support virtio transports which can specify the vring alignment
(ie where the guest communicates this to the host) by providing
a new virtio_queue_set_align() function. (The default alignment
remains as before.)
Transports which wish to make use of this must set the
has_variable_vring_alignment field in their VirtioBusClass
struct to true; they can then change the alignment via
virtio_queue_set_align().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1373977512-28932-5-git-send-email-peter.maydell@linaro.org
The MMIO virtio transport spec allows the guest to tell the host how
large the queue size is. Add virtio_queue_set_num() function which
implements this in the QEMU common virtio support code.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1373977512-28932-4-git-send-email-peter.maydell@linaro.org
There are several several issues in the current checking:
- The check was based on the minus of unsigned values which can overflow
- It was done after .{set|get}_config() which can lead crash when config_len
is zero since vdev->config is NULL
Fix this by:
- Validate the address in virtio_pci_config_{read|write}() before
.{set|get}_config
- Use addition instead minus to do the validation
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Petr Matousek <pmatouse@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Petr Matousek <pmatouse@redhat.com>
Message-id: 1367905369-10765-1-git-send-email-jasowang@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Add virtio_device_set_child_bus_name function.
It will be used with virtio-serial-x and virtio-scsi-x to set the
child bus name before calling virtio-x-device's init.
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Tested-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Message-id: 1367330931-12994-3-git-send-email-fred.konrad@greensocs.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This clean the init and the exit functions and rename virtio_common_cleanup
to virtio_cleanup.
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Message-id: 1366791683-5350-7-git-send-email-fred.konrad@greensocs.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This remove virtio-bindings, and use class instead.
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Message-id: 1366791683-5350-6-git-send-email-fred.konrad@greensocs.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This remove the function pointer in VirtIODevice, and use only
VirtioDeviceClass function pointer.
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Message-id: 1366791683-5350-5-git-send-email-fred.konrad@greensocs.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>