When HASH_REPORT is negotiated, the guest_hdr_len might be larger than
the size of the mergeable rx buffer header. Using
virtio_net_hdr_mrg_rxbuf during the header swap might lead a stack
overflow in this case. Fixing this by using virtio_net_hdr_v1_hash
instead.
Reported-by: Xiao Lei <leixiao.nop@zju.edu.cn>
Cc: Yuri Benditovich <yuri.benditovich@daynix.com>
Cc: qemu-stable@nongnu.org
Cc: Mauro Matteo Cascella <mcascell@redhat.com>
Fixes: CVE-2023-6693
Fixes: e22f0603fb ("virtio-net: reference implementation of hash report")
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Recently MemReentrancyGuard was added to DeviceState to record that the
device is engaging in I/O. The network device backend needs to update it
when delivering a packet to a device.
In preparation for such a change, add MemReentrancyGuard * as a
parameter of qemu_new_nic().
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Pass the callback function to add_migration_state_change_notifier so
that migration can initialize the notifier on add and clear it on
delete, which simplifies the call sites. Shorten the function names
so the extra arg can be added more legibly. Hide the global notifier
list in a new function migration_call_notifiers, and make it externally
visible so future live update code can call it.
No functional change.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Tested-by: Michael Galaxy <mgalaxy@akamai.com>
Reviewed-by: Michael Galaxy <mgalaxy@akamai.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Message-ID: <1686148954-250144-1-git-send-email-steven.sistare@oracle.com>
vhost-vdpa shadowed CVQ needs to know the maximum number of
vlans supported by the virtio-net device, so QEMU can restore
the VLAN state in a migration.
Co-developed-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
Message-Id: <ca03403319c6405ea7c400836a572255bbc9ceba.1690106284.git.yin31149@gmail.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This function is called after virtio_load, so all vlan configuration is
lost in migration case.
Just allow all the vlan-tagged packets if vlan is not configured, and
trust device reset to clear all filtered vlans.
Fixes: 0b1eaa8803 ("virtio-net: Do not filter VLANs without F_CTRL_VLAN")
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Hawkins Jiawei <yin31149@gmail.com>
Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
Message-Id: <95af0d013281282f48ad3f47f6ad1ac4ca9e52eb.1690106284.git.yin31149@gmail.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
USO features of virtio-net device depend on kernel ability
to support them, for backward compatibility by default the
features are disabled on 8.0 and earlier.
Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com>
Signed-off-by: Andrew Melnychecnko <andrew@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Passing additional parameters (USOv4 and USOv6 offloads) when
setting TAP offloads
Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com>
Signed-off-by: Andrew Melnychenko <andrew@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Fixes: 544f0278af (virtio: introduce macro VIRTIO_CONFIG_IRQ_IDX)
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230710153522.3469097-4-alex.bennee@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
If vhost is enabled for virtio-net, Device-TLB enable/disable events
must be passed to vhost for proper IOMMU unmap flag selection.
Signed-off-by: Viktor Prutyanov <viktor@daynix.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20230626091258.24453-3-viktor@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Maximum value for tx_queue_size depends on the backend type.
1024 for vDPA/vhost-user, 256 for all the others.
The value is returned by virtio_net_max_tx_queue_size() to set the
parameter:
n->net_conf.tx_queue_size = MIN(virtio_net_max_tx_queue_size(n),
n->net_conf.tx_queue_size);
But the parameter checking uses VIRTQUEUE_MAX_SIZE (1024).
So the parameter is silently ignored and ethtool reports a different
value than the one provided by the user.
... -netdev tap,... -device virtio-net,tx_queue_size=1024
# ethtool -g enp0s2
Ring parameters for enp0s2:
Pre-set maximums:
RX: 256
RX Mini: n/a
RX Jumbo: n/a
TX: 256
Current hardware settings:
RX: 256
RX Mini: n/a
RX Jumbo: n/a
TX: 256
... -netdev vhost-user,... -device virtio-net,tx_queue_size=2048
Invalid tx_queue_size (= 2048), must be a power of 2 between 256 and 1024
With this patch the correct maximum value is checked and displayed.
For vDPA/vhost-user:
Invalid tx_queue_size (= 2048), must be a power of 2 between 256 and 1024
For all the others:
Invalid tx_queue_size (= 512), must be a power of 2 between 256 and 256
Fixes: 2eef278b9e ("virtio-net: fix tx queue size for !vhost-user")
Cc: mst@redhat.com
Cc: qemu-stable@nongnu.org
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Slave/master nomenclature was replaced with backend/frontend in commit
1fc19b6527 ("vhost-user: Adopt new backend naming")
This patch replaces all remaining uses of master and slave in the
codebase.
Signed-off-by: Emmanouil Pitsidianakis <manos.pitsidianakis@linaro.org>
Message-Id: <20230613080849.2115347-1-manos.pitsidianakis@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
To support restoring offloads state in vdpa, it is necessary to
expose the function virtio_net_supported_guest_offloads().
According to VirtIO standard, "Upon feature negotiation
corresponding offload gets enabled to preserve backward compatibility.".
Therefore, QEMU uses this function to get the device supported offloads.
This allows QEMU to know the device's defaults and skip the control
message sending if these defaults align with the driver's configuration.
Note that the device's defaults can mismatch the driver's configuration
only at live migration.
Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
Message-Id: <43679506f3f039a7aa2bdd5b49785107b5dfd7d4.1685704856.git.yin31149@gmail.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Tested-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The VirtIONet structure is not modified in
virtio_net_supported_guest_offloads().
Therefore, make it const to allow this function to
accept const variables.
Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <489b09c3998ac09b9135e57a7dd8c56a4be8cdf9.1685704856.git.yin31149@gmail.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Tested-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
igb does not properly ensure the buffer passed to
net_rx_pkt_set_protocols() is contiguous for the entire L2/L3/L4 header.
Allow it to pass scattered data to net_rx_pkt_set_protocols().
Fixes: 3a977deebe ("Intrdocue igb device emulation")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The commit 93a97dc520 ("virtio-net: enable vq reset feature") enables
unconditionally vq reset feature as long as the device is emulated.
This makes impossible to actually disable the feature, and it causes
migration problems from qemu version previous than 7.2.
The entire final commit is unneeded as device system already enable or
disable the feature properly.
This reverts commit 93a97dc520.
Fixes: 93a97dc520 ("virtio-net: enable vq reset feature")
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230504101447.389398-1-eperezma@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This protects devices from bh->mmio reentrancy issues.
Thanks: Thomas Huth <thuth@redhat.com> for diagnosing OS X test failure.
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230427211013.2994127-5-alxndr@bu.edu>
Signed-off-by: Thomas Huth <thuth@redhat.com>
igb, a new network device emulation, will need SCTP checksum offloading.
Currently eth_get_protocols() has a bool parameter for each protocol
currently it supports, but there will be a bit too many parameters if
we add yet another protocol.
Introduce an enum type, EthL4HdrProto to represent all L4 protocols
eth_get_protocols() support with one parameter.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The values returned by eth_get_protocols() are used to perform RSS,
checksumming and segmentation. Even when a packet signals the use of the
protocols which these operations can be applied to, the headers for them
may not be present because of too short packet or fragmentation, for
example. In such a case, the operations cannot be applied safely.
Report the presence of headers instead of whether the use of the
protocols are indicated with eth_get_protocols(). This also makes
corresponding changes to the callers of eth_get_protocols() to match
with its new signature and to remove redundant checks for fragmentation.
Fixes: 75020a7021 ("Common definitions for VMWARE devices")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
When virtio-net header is not set, net_rx_pkt_get_vhdr() returns
zero-filled virtio_net_hdr, which is actually valid. In fact, tap device
uses zero-filled virtio_net_hdr when virtio-net header is not provided
by the peer. Therefore, we can just remove net_rx_pkt_has_virt_hdr() and
always assume NetTxPkt has a valid virtio-net header.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Since GUEST_ANNOUNCE is emulated the feature bit could be set without
backend support. This happens in the vDPA case.
However, backend vDPA parent may not have CVQ support. This causes an
incoherent feature set, and the driver may refuse to start. This
happens in virtio-net Linux driver.
This may be solved differently in the future. Qemu is able to emulate a
CVQ just for guest_announce purposes, helping guest to notify the new
location with vDPA devices that does not support it. However, this is
left as a TODO as it is way more complex to backport.
Tested with vdpa_net_sim, toggling manually VIRTIO_NET_F_CTRL_VQ in the
driver and migrating it with x-svq=on.
Fixes: 980003debd ("vdpa: do not handle VIRTIO_NET_F_GUEST_ANNOUNCE in vhost-vdpa")
Reported-by: Dawar, Gautam <gautam.dawar@amd.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230124161159.2182117-1-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: David Edmondson <david.edmondson@oracle.com>
Reviewed-by: Gautam Dawar <gautam.dawar@amd.com>
Tested-by: Gautam Dawar <gautam.dawar@amd.com>
Tested-by: Lei Yang <leiyang@redhat.com>
PCIDeviceClass and PCIDevice are defined in pci.h. Many users of the
header don't actually need them. Similar structs live in their own
headers: PCIBusClass and PCIBus in pci_bus.h, PCIBridge in
pci_bridge.h, PCIHostBridgeClass and PCIHostState in pci_host.h,
PCIExpressHost in pcie_host.h, and PCIERootPortClass, PCIEPort, and
PCIESlot in pcie_port.h.
Move PCIDeviceClass and PCIDeviceClass to new pci_device.h, along with
the code that needs them. Adjust include directives.
This also enables the next commit.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20221222100330.380143-6-armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add functions to support configure interrupt in virtio_net
Add the functions to support vhost_net_config_pending
and vhost_net_config_mask.
Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20221222070451.936503-9-lulu@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
To support configure interrupt for vhost-vdpa
Introduce VIRTIO_CONFIG_IRQ_IDX -1 as configure interrupt's queue index,
Then we can reuse the functions guest_notifier_mask and guest_notifier_pending.
Add the check of queue index in these drivers, if the driver does not support
configure interrupt, the function will just return
Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20221222070451.936503-2-lulu@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch aims to fix unexpected negotiation features for
vhost-user netdev interface.
When openvswitch reconnect Qemu after an unexpected disconnection
and Qemu therefore start the vhost_dev, acked_features field in
vhost_dev is initialized with value fetched from acked_features
field in NetVhostUserState, which should be up-to-date at that
moment but Qemu could not make it actually during the time window
of virtio features negotiation.
So we save the acked_features right after being configured by
guest virtio driver so it can be used to restore acked_features
field in vhost_dev correctly.
Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
Signed-off-by: Guoyi Tu <tugy@chinatelecom.cn>
Signed-off-by: Liuxiangdong <liuxiangdong5@huawei.com>
Message-Id: <b9f8cf5561a79ea65ea38960e5a5e6d3707eef0a.1671627406.git.huangy81@chinatelecom.cn>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Status part of the emulated feature. It will follow device model, so we
must copy it as long as NIC device model has it set.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20221221115015.1400889-3-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Next patches introduce more code on vhost-vdpa branch, with already have
too much indentation.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20221221115015.1400889-2-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEENUvIs9frKmtoZ05fOHC0AOuRhlMFAmOZ6lYSHGFybWJydUBy
ZWRoYXQuY29tAAoJEDhwtADrkYZT6VEQAKynjWh3AIZ4/qOgrVqsP0oRspevLmfH
BbuGoldjYpEE7RbwuCaZalZ7iy7TcSySxnPfUDVsFHd7NWffJVjwKHifGC0D/Ez0
+Ggyb1CBebN+mS7t+BNFUHdMM+wxFIlHwg4f4aTFbn2o0HKgj2a8tcNzNRonZbfa
xURnvbD4G4u0VZEc3Jak+x193xbOJFsuuWq0BZnDuNk+XqjyW2RwfpXLPJVk+82a
4uy/YgYuqXUqBeULwcJj+shBL4SXR9GyajTFMS64przSUle0ADUmXkPtaS2agV7e
Pym/UQuAcxvNyw34fJsiMZxx6rZI9YU30jQUMRLoYcPRR/Q/aiPeiiHtiD6Kaid7
IfOeH/EArXaQRFpD89xj4YcaTnRLQOEj0NXgXvAbQf6eD8JYyao/S/0lCsPZEoA2
nibLqEQ25ncDNXoSomuwtfjVff3w68lODFbhwqfA0gf3cPtCgVZ6xQ8P/McNY6K6
wqFHXMWTDHk1LOCTucjYz1z2TGzTnSG4iWi5Yt6FSxAc958AO+v5ALn/1pcYun+E
azM/MF0AInKj2aJCT530zT0tpCs/Jo07YKC8k6ubi77S0ZdmGS1XLeXkRXfk1+yI
OhuUgiVlSTHxD69DagT2vbnx1mDMM9X+OBIMvEi5nwvD9A/ghaCgkDeGFvbA1ud0
t0mxPBZJ+tiZ
=JJjG
-----END PGP SIGNATURE-----
Merge tag 'pull-misc-2022-12-14' of https://repo.or.cz/qemu/armbru into staging
Miscellaneous patches for 2022-12-14
# gpg: Signature made Wed 14 Dec 2022 15:23:02 GMT
# gpg: using RSA key 354BC8B3D7EB2A6B68674E5F3870B400EB918653
# gpg: issuer "armbru@redhat.com"
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [full]
# gpg: aka "Markus Armbruster <armbru@pond.sub.org>" [full]
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653
* tag 'pull-misc-2022-12-14' of https://repo.or.cz/qemu/armbru:
ppc4xx_sdram: Simplify sdram_ddr_size() to return
block/vmdk: Simplify vmdk_co_create() to return directly
cleanup: Tweak and re-run return_directly.cocci
io: Tidy up fat-fingered parameter name
qapi: Use returned bool to check for failure (again)
sockets: Use ERRP_GUARD() where obviously appropriate
qemu-config: Use ERRP_GUARD() where obviously appropriate
qemu-config: Make config_parse_qdict() return bool
monitor: Use ERRP_GUARD() in monitor_init()
monitor: Simplify monitor_fd_param()'s error handling
error: Move ERRP_GUARD() to the beginning of the function
error: Drop a few superfluous ERRP_GUARD()
error: Drop some obviously superfluous error_propagate()
Drop more useless casts from void * to pointer
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The has_FOO for pointer-valued FOO are redundant, except for arrays.
They are also a nuisance to work with. Recent commit "qapi: Start to
elide redundant has_FOO in generated C" provided the means to elide
them step by step. This is the step for qapi/net.json.
Said commit explains the transformation in more detail. The invariant
violations mentioned there do not occur here.
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20221104160712.3005652-19-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
[Fixes for MacOS squashed in]
Run shell script:
cat << EOF | valgrind qemu-system-i386 -display none -machine accel=qtest, -m \
512M -M q35 -nodefaults -device virtio-net,netdev=net0 -netdev \
user,id=net0 -qtest stdio
outl 0xcf8 0x80000810
outl 0xcfc 0xc000
outl 0xcf8 0x80000804
outl 0xcfc 0x01
outl 0xc00d 0x0200
outl 0xcf8 0x80000890
outb 0xcfc 0x4
outl 0xcf8 0x80000889
outl 0xcfc 0x1c000000
outl 0xcf8 0x80000893
outw 0xcfc 0x100
EOF
Got:
==68666== Invalid read of size 8
==68666== at 0x688536: virtio_net_queue_enable (virtio-net.c:575)
==68666== by 0x6E31AE: memory_region_write_accessor (memory.c:492)
==68666== by 0x6E098D: access_with_adjusted_size (memory.c:554)
==68666== by 0x6E4DB3: memory_region_dispatch_write (memory.c:1521)
==68666== by 0x6E31AE: memory_region_write_accessor (memory.c:492)
==68666== by 0x6E098D: access_with_adjusted_size (memory.c:554)
==68666== by 0x6E4DB3: memory_region_dispatch_write (memory.c:1521)
==68666== by 0x6EBCD3: flatview_write_continue (physmem.c:2820)
==68666== by 0x6EBFBF: flatview_write (physmem.c:2862)
==68666== by 0x6EF5E7: address_space_write (physmem.c:2958)
==68666== by 0x6DFDEC: cpu_outw (ioport.c:70)
==68666== by 0x6F6DF0: qtest_process_command (qtest.c:480)
==68666== Address 0x29087fe8 is 24 bytes after a block of size 416 in arena "client"
That is reported by Alexander Bulekov. https://gitlab.com/qemu-project/qemu/-/issues/1309
Here, the queue_index is the index of the cvq, but in some cases cvq
does not have the corresponding NetClientState, so overflow appears.
I add a check here, ignore illegal queue_index and cvq queue_index.
Note the queue_index is below the VIRTIO_QUEUE_MAX but greater or equal
than cvq index could hit this. Other devices are similar.
Fixes: 7f863302 ("virtio-net: support queue_enable")
Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1309
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Message-Id: <20221110095739.130393-1-xuanzhuo@linux.alibaba.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add virtqueue reset feature for virtio-net
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Message-Id: <20221017092558.111082-16-xuanzhuo@linux.alibaba.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Support queue_enable in vhost-kernel scenario. It can be called when
a vq reset operation has been performed and the vq is restared.
It should be noted that we can restart the vq when the vhost has
already started. When launching a new vhost device, the vhost is not
started and all vqs are not initalized until VIRTIO_PCI_COMMON_STATUS
is written. Thus, we should use vhost_started to differentiate the
two cases: vq reset and device start.
Currently it only supports vhost-kernel.
Signed-off-by: Kangjie Xu <kangjie.xu@linux.alibaba.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20221017092558.111082-14-xuanzhuo@linux.alibaba.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
virtio-net and vhost-kernel implement queue reset.
Queued packets in the corresponding queue pair are flushed
or purged.
For virtio-net, userspace datapath will be disabled later in
__virtio_queue_reset(). It will set addr of vring to 0 and idx to 0.
Thus, virtio_net_receive() and virtio_net_flush_tx() will not receive
or send packets.
For vhost-net, the datapath will be disabled in vhost_net_virtqueue_reset().
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Kangjie Xu <kangjie.xu@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20221017092558.111082-13-xuanzhuo@linux.alibaba.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Introduce the fucntion flush_or_purge_queued_packets(), it will be
used in device reset and virtqueue reset. Therefore, we extract the
common logic as a new function.
Signed-off-by: Kangjie Xu <kangjie.xu@linux.alibaba.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20221017092558.111082-12-xuanzhuo@linux.alibaba.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
When virtio_net_flush_tx() reaches the tx_burst value all
the queue is not flushed and nothing restart the timer.
Fix that by doing for TX timer as we do for bottom half TX:
rearming the timer if we find any packet to send during the
virtio_net_flush_tx() call.
Fixes: e3f30488e5 ("virtio-net: Limit number of packets sent per TX flush")
Cc: alex.williamson@redhat.com
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
When virtio-net is used with the socket netdev backend, the backend
can be busy and not able to collect new packets.
In this case, net_socket_receive() returns 0 and registers a poll function
to detect when the socket is ready again.
In virtio_net_tx_bh(), virtio_net_flush_tx() forwards the 0, the virtio
notifications are disabled and the function is not re-scheduled, waiting
for the backend to be ready.
When the socket netdev backend is again able to send packets, the poll
function re-starts to flush remaining packets. This is done by
calling virtio_net_tx_complete(). It re-enables notifications and calls
again virtio_net_flush_tx().
But it seems if virtio_net_flush_tx() reaches the tx_burst value all
the queue is not flushed and no new notification is sent to re-schedule
virtio_net_tx_bh(). Nothing re-start to flush the queue and remaining
packets are stuck in the queue.
To fix that, detect in virtio_net_tx_complete() if virtio_net_flush_tx()
has been stopped by tx_burst and if yes re-schedule the bottom half
function virtio_net_tx_bh() to flush the remaining packets.
This is what is done in virtio_net_tx_bh() when the virtio_net_flush_tx()
is synchronous, and completly by-passed when the operation needs to be
asynchronous.
Fixes: a697a334b3 ("virtio-net: Introduce a new bottom half packet TX")
Cc: alex.williamson@redhat.com
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
This is the first step towards moving all device config size calculation
logic into the virtio core code. In particular, this adds a struct that
contains all the necessary information for common virtio code to be able
to calculate the final config size for a device. This is expected to be
used with the new virtio_get_config_size helper, which calculates the
final length based on the provided host features.
This builds on top of already existing code like VirtIOFeature and
virtio_feature_get_config_size(), but adds additional fields, as well as
sanity checking so that device-specifc code doesn't have to duplicate it.
An example usage would be:
static const VirtIOFeature dev_features[] = {
{.flags = 1ULL << FEATURE_1_BIT,
.end = endof(struct virtio_dev_config, feature_1)},
{.flags = 1ULL << FEATURE_2_BIT,
.end = endof(struct virtio_dev_config, feature_2)},
{}
};
static const VirtIOConfigSizeParams dev_cfg_size_params = {
.min_size = DEV_BASE_CONFIG_SIZE,
.max_size = sizeof(struct virtio_dev_config),
.feature_sizes = dev_features
};
// code inside my_dev_device_realize()
size_t config_size = virtio_get_config_size(&dev_cfg_size_params,
host_features);
virtio_init(vdev, VIRTIO_ID_MYDEV, config_size);
Currently every device is expected to write its own boilerplate from the
example above in device_realize(), however, the next step of this
transition is moving VirtIOConfigSizeParams into VirtioDeviceClass,
so that it can be done automatically by the virtio initialization code.
All of the users of virtio_feature_get_config_size have been converted
to use virtio_get_config_size so it's no longer needed and is removed
with this commit.
Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
Message-Id: <20220906073111.353245-2-d-tatianin@yandex-team.ru>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
It was returned as error before. Instead of it, simply update the
corresponding field so qemu can send it in the migration data.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Si-Wei Liu <si-wei.liu@oracle.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
This allows external vhost-net devices to modify the state of the
VirtIO device model once the vhost-vdpa device has acknowledged the
control commands.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
vhost-vdpa control virtqueue needs to know the maximum entries supported
by the virtio-net device, so we know if it is possible to apply the
filter.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
virtio_queue_host_notifier_read() tends to read pending event
left behind on ioeventfd in the vhost_net_stop() path, and
attempts to handle outstanding kicks from userspace vq handler.
However, in the ctrl_vq handler, virtio_net_handle_mq() has a
recursive call into virtio_net_set_status(), which may lead to
segmentation fault as shown in below stack trace:
0 0x000055f800df1780 in qdev_get_parent_bus (dev=0x0) at ../hw/core/qdev.c:376
1 0x000055f800c68ad8 in virtio_bus_device_iommu_enabled (vdev=vdev@entry=0x0) at ../hw/virtio/virtio-bus.c:331
2 0x000055f800d70d7f in vhost_memory_unmap (dev=<optimized out>) at ../hw/virtio/vhost.c:318
3 0x000055f800d70d7f in vhost_memory_unmap (dev=<optimized out>, buffer=0x7fc19bec5240, len=2052, is_write=1, access_len=2052) at ../hw/virtio/vhost.c:336
4 0x000055f800d71867 in vhost_virtqueue_stop (dev=dev@entry=0x55f8037ccc30, vdev=vdev@entry=0x55f8044ec590, vq=0x55f8037cceb0, idx=0) at ../hw/virtio/vhost.c:1241
5 0x000055f800d7406c in vhost_dev_stop (hdev=hdev@entry=0x55f8037ccc30, vdev=vdev@entry=0x55f8044ec590) at ../hw/virtio/vhost.c:1839
6 0x000055f800bf00a7 in vhost_net_stop_one (net=0x55f8037ccc30, dev=0x55f8044ec590) at ../hw/net/vhost_net.c:315
7 0x000055f800bf0678 in vhost_net_stop (dev=dev@entry=0x55f8044ec590, ncs=0x55f80452bae0, data_queue_pairs=data_queue_pairs@entry=7, cvq=cvq@entry=1)
at ../hw/net/vhost_net.c:423
8 0x000055f800d4e628 in virtio_net_set_status (status=<optimized out>, n=0x55f8044ec590) at ../hw/net/virtio-net.c:296
9 0x000055f800d4e628 in virtio_net_set_status (vdev=vdev@entry=0x55f8044ec590, status=15 '\017') at ../hw/net/virtio-net.c:370
10 0x000055f800d534d8 in virtio_net_handle_ctrl (iov_cnt=<optimized out>, iov=<optimized out>, cmd=0 '\000', n=0x55f8044ec590) at ../hw/net/virtio-net.c:1408
11 0x000055f800d534d8 in virtio_net_handle_ctrl (vdev=0x55f8044ec590, vq=0x7fc1a7e888d0) at ../hw/net/virtio-net.c:1452
12 0x000055f800d69f37 in virtio_queue_host_notifier_read (vq=0x7fc1a7e888d0) at ../hw/virtio/virtio.c:2331
13 0x000055f800d69f37 in virtio_queue_host_notifier_read (n=n@entry=0x7fc1a7e8894c) at ../hw/virtio/virtio.c:3575
14 0x000055f800c688e6 in virtio_bus_cleanup_host_notifier (bus=<optimized out>, n=n@entry=14) at ../hw/virtio/virtio-bus.c:312
15 0x000055f800d73106 in vhost_dev_disable_notifiers (hdev=hdev@entry=0x55f8035b51b0, vdev=vdev@entry=0x55f8044ec590)
at ../../../include/hw/virtio/virtio-bus.h:35
16 0x000055f800bf00b2 in vhost_net_stop_one (net=0x55f8035b51b0, dev=0x55f8044ec590) at ../hw/net/vhost_net.c:316
17 0x000055f800bf0678 in vhost_net_stop (dev=dev@entry=0x55f8044ec590, ncs=0x55f80452bae0, data_queue_pairs=data_queue_pairs@entry=7, cvq=cvq@entry=1)
at ../hw/net/vhost_net.c:423
18 0x000055f800d4e628 in virtio_net_set_status (status=<optimized out>, n=0x55f8044ec590) at ../hw/net/virtio-net.c:296
19 0x000055f800d4e628 in virtio_net_set_status (vdev=0x55f8044ec590, status=15 '\017') at ../hw/net/virtio-net.c:370
20 0x000055f800d6c4b2 in virtio_set_status (vdev=0x55f8044ec590, val=<optimized out>) at ../hw/virtio/virtio.c:1945
21 0x000055f800d11d9d in vm_state_notify (running=running@entry=false, state=state@entry=RUN_STATE_SHUTDOWN) at ../softmmu/runstate.c:333
22 0x000055f800d04e7a in do_vm_stop (state=state@entry=RUN_STATE_SHUTDOWN, send_stop=send_stop@entry=false) at ../softmmu/cpus.c:262
23 0x000055f800d04e99 in vm_shutdown () at ../softmmu/cpus.c:280
24 0x000055f800d126af in qemu_cleanup () at ../softmmu/runstate.c:812
25 0x000055f800ad5b13 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at ../softmmu/main.c:51
For now, temporarily disable handling MQ request from the ctrl_vq
userspace hanlder to avoid the recursive virtio_net_set_status()
call. Some rework is needed to allow changing the number of
queues without going through a full virtio_net_set_status cycle,
particularly for vhost-vdpa backend.
This patch will need to be reverted as soon as future patches of
having the change of #queues handled in userspace is merged.
Fixes: 402378407d ("vhost-vdpa: multiqueue support")
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <1651890498-24478-8-git-send-email-si-wei.liu@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
With MQ enabled vdpa device and non-MQ supporting guest e.g.
booting vdpa with mq=on over OVMF of single vqp, below assert
failure is seen:
../hw/virtio/vhost-vdpa.c:560: vhost_vdpa_get_vq_index: Assertion `idx >= dev->vq_index && idx < dev->vq_index + dev->nvqs' failed.
0 0x00007f8ce3ff3387 in raise () at /lib64/libc.so.6
1 0x00007f8ce3ff4a78 in abort () at /lib64/libc.so.6
2 0x00007f8ce3fec1a6 in __assert_fail_base () at /lib64/libc.so.6
3 0x00007f8ce3fec252 in () at /lib64/libc.so.6
4 0x0000558f52d79421 in vhost_vdpa_get_vq_index (dev=<optimized out>, idx=<optimized out>) at ../hw/virtio/vhost-vdpa.c:563
5 0x0000558f52d79421 in vhost_vdpa_get_vq_index (dev=<optimized out>, idx=<optimized out>) at ../hw/virtio/vhost-vdpa.c:558
6 0x0000558f52d7329a in vhost_virtqueue_mask (hdev=0x558f55c01800, vdev=0x558f568f91f0, n=2, mask=<optimized out>) at ../hw/virtio/vhost.c:1557
7 0x0000558f52c6b89a in virtio_pci_set_guest_notifier (d=d@entry=0x558f568f0f60, n=n@entry=2, assign=assign@entry=true, with_irqfd=with_irqfd@entry=false)
at ../hw/virtio/virtio-pci.c:974
8 0x0000558f52c6c0d8 in virtio_pci_set_guest_notifiers (d=0x558f568f0f60, nvqs=3, assign=true) at ../hw/virtio/virtio-pci.c:1019
9 0x0000558f52bf091d in vhost_net_start (dev=dev@entry=0x558f568f91f0, ncs=0x558f56937cd0, data_queue_pairs=data_queue_pairs@entry=1, cvq=cvq@entry=1)
at ../hw/net/vhost_net.c:361
10 0x0000558f52d4e5e7 in virtio_net_set_status (status=<optimized out>, n=0x558f568f91f0) at ../hw/net/virtio-net.c:289
11 0x0000558f52d4e5e7 in virtio_net_set_status (vdev=0x558f568f91f0, status=15 '\017') at ../hw/net/virtio-net.c:370
12 0x0000558f52d6c4b2 in virtio_set_status (vdev=vdev@entry=0x558f568f91f0, val=val@entry=15 '\017') at ../hw/virtio/virtio.c:1945
13 0x0000558f52c69eff in virtio_pci_common_write (opaque=0x558f568f0f60, addr=<optimized out>, val=<optimized out>, size=<optimized out>) at ../hw/virtio/virtio-pci.c:1292
14 0x0000558f52d15d6e in memory_region_write_accessor (mr=0x558f568f19d0, addr=20, value=<optimized out>, size=1, shift=<optimized out>, mask=<optimized out>, attrs=...)
at ../softmmu/memory.c:492
15 0x0000558f52d127de in access_with_adjusted_size (addr=addr@entry=20, value=value@entry=0x7f8cdbffe748, size=size@entry=1, access_size_min=<optimized out>, access_size_max=<optimized out>, access_fn=0x558f52d15cf0 <memory_region_write_accessor>, mr=0x558f568f19d0, attrs=...) at ../softmmu/memory.c:554
16 0x0000558f52d157ef in memory_region_dispatch_write (mr=mr@entry=0x558f568f19d0, addr=20, data=<optimized out>, op=<optimized out>, attrs=attrs@entry=...)
at ../softmmu/memory.c:1504
17 0x0000558f52d078e7 in flatview_write_continue (fv=fv@entry=0x7f8accbc3b90, addr=addr@entry=103079215124, attrs=..., ptr=ptr@entry=0x7f8ce6300028, len=len@entry=1, addr1=<optimized out>, l=<optimized out>, mr=0x558f568f19d0) at /home/opc/qemu-upstream/include/qemu/host-utils.h:165
18 0x0000558f52d07b06 in flatview_write (fv=0x7f8accbc3b90, addr=103079215124, attrs=..., buf=0x7f8ce6300028, len=1) at ../softmmu/physmem.c:2822
19 0x0000558f52d0b36b in address_space_write (as=<optimized out>, addr=<optimized out>, attrs=..., buf=buf@entry=0x7f8ce6300028, len=<optimized out>)
at ../softmmu/physmem.c:2914
20 0x0000558f52d0b3da in address_space_rw (as=<optimized out>, addr=<optimized out>, attrs=...,
attrs@entry=..., buf=buf@entry=0x7f8ce6300028, len=<optimized out>, is_write=<optimized out>) at ../softmmu/physmem.c:2924
21 0x0000558f52dced09 in kvm_cpu_exec (cpu=cpu@entry=0x558f55c2da60) at ../accel/kvm/kvm-all.c:2903
22 0x0000558f52dcfabd in kvm_vcpu_thread_fn (arg=arg@entry=0x558f55c2da60) at ../accel/kvm/kvm-accel-ops.c:49
23 0x0000558f52f9f04a in qemu_thread_start (args=<optimized out>) at ../util/qemu-thread-posix.c:556
24 0x00007f8ce4392ea5 in start_thread () at /lib64/libpthread.so.0
25 0x00007f8ce40bb9fd in clone () at /lib64/libc.so.6
The cause for the assert failure is due to that the vhost_dev index
for the ctrl vq was not aligned with actual one in use by the guest.
Upon multiqueue feature negotiation in virtio_net_set_multiqueue(),
if guest doesn't support multiqueue, the guest vq layout would shrink
to a single queue pair, consisting of 3 vqs in total (rx, tx and ctrl).
This results in ctrl_vq taking a different vhost_dev group index than
the default. We can map vq to the correct vhost_dev group by checking
if MQ is supported by guest and successfully negotiated. Since the
MQ feature is only present along with CTRL_VQ, we ensure the index
2 is only meant for the control vq while MQ is not supported by guest.
Fixes: 22288fe ("virtio-net: vhost control virtqueue support")
Suggested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <1651890498-24478-3-git-send-email-si-wei.liu@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
When the control virtqueue feature is absent or not negotiated,
vhost_net_start() still tries to set up vhost_dev and install
vhost notifiers for the control virtqueue, which results in
erroneous ioctl calls with incorrect queue index sending down
to driver. Do that only when needed.
Fixes: 22288fe ("virtio-net: vhost control virtqueue support")
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <1651890498-24478-2-git-send-email-si-wei.liu@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch adds a get_vhost() callback function for VirtIODevices that
returns the device's corresponding vhost_dev structure, if the vhost
device is running. This patch also adds a vhost_started flag for
VirtIODevices.
Previously, a VirtIODevice wouldn't be able to tell if its corresponding
vhost device was active or not.
Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Message-Id: <1648819405-25696-3-git-send-email-jonah.palmer@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch drops the name parameter for the virtio_init function.
The pair between the numeric device ID and the string device ID
(name) of a virtio device already exists, but not in a way that
lets us map between them.
This patch lets us do this and removes the need for the name
parameter in the virtio_init function.
Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Message-Id: <1648819405-25696-2-git-send-email-jonah.palmer@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Per https://discourse.gnome.org/t/port-your-module-from-g-memdup-to-g-memdup2-now/5538
The old API took the size of the memory to duplicate as a guint,
whereas most memory functions take memory sizes as a gsize. This
made it easy to accidentally pass a gsize to g_memdup(). For large
values, that would lead to a silent truncation of the size from 64
to 32 bits, and result in a heap area being returned which is
significantly smaller than what the caller expects. This can likely
be exploited in various modules to cause a heap buffer overflow.
Replace g_memdup() by the safer g_memdup2() wrapper.
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20220512175747.142058-6-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
g_new(T, n) is neater than g_malloc(sizeof(T) * n). It's also safer,
for two reasons. One, it catches multiplication overflowing size_t.
Two, it returns T * rather than void *, which lets the compiler catch
more type errors.
This commit only touches allocations with size arguments of the form
sizeof(T).
Patch created mechanically with:
$ spatch --in-place --sp-file scripts/coccinelle/use-g_new-etc.cocci \
--macro-file scripts/cocci-macro-file.h FILES...
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20220315144156.1595462-4-armbru@redhat.com>
Reviewed-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>
Commit bedd7e93d0 ("virtio-net: fix use after unmap/free for sg")
tries to fix the use after free of the sg by caching the virtqueue
elements in an array and unmap them at once after receiving the
packets, But it forgot to unmap the cached elements on error which
will lead to leaking of mapping and other unexpected results.
Fixing this by detaching the cached elements on error. This addresses
CVE-2022-26353.
Reported-by: Victor Tom <vv474172261@gmail.com>
Cc: qemu-stable@nongnu.org
Fixes: CVE-2022-26353
Fixes: bedd7e93d0 ("virtio-net: fix use after unmap/free for sg")
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>