Commit Graph

1019 Commits

Author SHA1 Message Date
Ani Sinha a0d7215e33 vhost-vdpa: do not cleanup the vdpa/vhost-net structures if peer nic is present
When a peer nic is still attached to the vdpa backend, it is too early to free
up the vhost-net and vdpa structures. If these structures are freed here, then
QEMU crashes when the guest is being shut down. The following call chain
would result in an assertion failure since the pointer returned from
vhost_vdpa_get_vhost_net() would be NULL:

do_vm_stop() -> vm_state_notify() -> virtio_set_status() ->
virtio_net_vhost_status() -> get_vhost_net().

Therefore, we defer freeing up the structures until at guest shutdown
time when qemu_cleanup() calls net_cleanup() which then calls
qemu_del_net_client() which would eventually call vhost_vdpa_cleanup()
again to free up the structures. This time, the loop in net_cleanup()
ensures that vhost_vdpa_cleanup() will be called one last time when
all the peer nics are detached and freed.

All unit tests pass with this change.

CC: imammedo@redhat.com
CC: jusual@redhat.com
CC: mst@redhat.com
Fixes: CVE-2023-3301
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2128929
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Message-Id: <20230619065209.442185-1-anisinha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-06-26 09:50:00 -04:00
Eugenio Pérez d45243bcfc vdpa: fix not using CVQ buffer in case of error
Bug introducing when refactoring.  Otherway, the guest never received
the used buffer.

Fixes: be4278b65f ("vdpa: extract vhost_vdpa_net_cvq_add from vhost_vdpa_net_handle_ctrl_avail")
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230602173451.1917999-1-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
2023-06-26 09:50:00 -04:00
Eugenio Pérez 51e84244a7 vdpa: mask _F_CTRL_GUEST_OFFLOADS for vhost vdpa devices
QEMU does not emulate it so it must be disabled as long as the backend
does not support it.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230602173328.1917385-1-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
2023-06-26 09:50:00 -04:00
Hawkins Jiawei 4b4a1378b9 vdpa: Allow VIRTIO_NET_F_CTRL_GUEST_OFFLOADS in SVQ
Enable SVQ with VIRTIO_NET_F_CTRL_GUEST_OFFLOADS feature.

Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <778d642ecae6deed8a218b0e6232e4d7bb96b439.1685704856.git.yin31149@gmail.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Tested-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-06-26 09:50:00 -04:00
Hawkins Jiawei 0b58d3686a vdpa: Add vhost_vdpa_net_load_offloads()
This patch introduces vhost_vdpa_net_load_offloads() to
restore offloads state at device's startup.

Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
Message-Id: <7e2b5cad9c48c917df53d80dec27dbfeb513e1a3.1685704856.git.yin31149@gmail.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Tested-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-06-26 09:50:00 -04:00
Hawkins Jiawei 02d3bf099b vdpa: reuse virtio_vdev_has_feature()
We can use virtio_vdev_has_feature() instead of manually
accessing the features.

Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <ff838d30206209fd865511b16ffb34cc0d5e8d8f.1685704856.git.yin31149@gmail.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Tested-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-06-26 09:50:00 -04:00
Eugenio Pérez babf8b8712 vdpa: map shadow vrings with MAP_SHARED
The vdpa devices that use va addresses neeeds these maps shared.
Otherwise, vhost_vdpa checks will refuse to accept the maps.

The mmap call will always return a page aligned address, so removing the
qemu_memalign call.  Keeping the ROUND_UP for the size as we still need
to DMA-map them in full.

Not applying fixes tag as it never worked with va devices.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230602143854.1879091-4-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-06-26 09:50:00 -04:00
Eugenio Pérez 915bf6ccd7 vdpa: reorder vhost_vdpa_net_cvq_cmd_page_len function
We need to call it from resource cleanup context, as munmap needs the
size of the mappings.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230602143854.1879091-3-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-06-26 09:50:00 -04:00
Eugenio Pérez 8bc0049ead vdpa: do not block migration if device has cvq and x-svq=on
It was a mistake to forbid in all cases, as SVQ is already able to send
all the CVQ messages before start forwarding data vqs.  It actually
caused a regression, making impossible to migrate device previously
migratable.

Fixes: 36e4647247 ("vdpa: add vhost_vdpa_net_valid_svq_features")
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230602143854.1879091-2-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
2023-06-26 09:50:00 -04:00
Eugenio Pérez 152128d646 vdpa: move CVQ isolation check to net_init_vhost_vdpa
Evaluating it at start time instead of initialization time may make the
guest capable of dynamically adding or removing migration blockers.

Also, moving to initialization reduces the number of ioctls in the
migration, reducing failure possibilities.

As a drawback we need to check for CVQ isolation twice: one time with no
MQ negotiated and another one acking it, as long as the device supports
it.  This is because Vring ASID / group management is based on vq
indexes, but we don't know the index of CVQ before negotiating MQ.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230526153143.470745-3-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2023-06-23 03:09:45 -04:00
Eugenio Pérez 0f2bb0bf38 vdpa: return errno in vhost_vdpa_get_vring_group error
We need to tell in the caller, as some errors are expected in a normal
workflow.  In particular, parent drivers in recent kernels with
VHOST_BACKEND_F_IOTLB_ASID may not support vring groups.  In that case,
-ENOTSUP is returned.

This is the case of vp_vdpa in Linux 6.2.

Next patches in this series will use that information to know if it must
abort or not.  Also, next patches return properly an errp instead of
printing with error_report.

Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230526153143.470745-2-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-06-23 02:54:44 -04:00
Philippe Mathieu-Daudé de6cd7599b meson: Replace softmmu_ss -> system_ss
We use the user_ss[] array to hold the user emulation sources,
and the softmmu_ss[] array to hold the system emulation ones.
Hold the latter in the 'system_ss[]' array for parity with user
emulation.

Mechanical change doing:

  $ sed -i -e s/softmmu_ss/system_ss/g $(git grep -l softmmu_ss)

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230613133347.82210-10-philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-20 10:01:30 +02:00
Philippe Mathieu-Daudé f975033d56 cocoa: Fix warnings about invalid prototype declarations
Fix the following Cocoa trivial warnings:

  C compiler for the host machine: cc (clang 14.0.0 "Apple clang version 14.0.0 (clang-1400.0.29.202)")
  Objective-C compiler for the host machine: clang (clang 14.0.0)

  [100/334] Compiling Objective-C object libcommon.fa.p/net_vmnet-bridged.m.o
  net/vmnet-bridged.m:40:31: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
  static char* get_valid_ifnames()
                                ^
                                 void
  [742/1436] Compiling Objective-C object libcommon.fa.p/ui_cocoa.m.o
  ui/cocoa.m:1937:22: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
  static int cocoa_main()
                       ^
                        void

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20230425192820.34063-1-philmd@linaro.org>
2023-06-13 11:28:58 +02:00
Akihiko Odaki 7e64a9cabb igb: Strip the second VLAN tag for extended VLAN
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23 15:20:15 +08:00
Akihiko Odaki 907209e311 igb: Implement Rx SCTP CSO
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23 15:20:15 +08:00
Akihiko Odaki aaa8a15c96 net/eth: Always add VLAN tag
It is possible to have another VLAN tag even if the packet is already
tagged.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23 15:20:15 +08:00
Akihiko Odaki 85427bf388 net/eth: Use void pointers
The uses of uint8_t pointers were misleading as they are never accessed
as an array of octets and it even require more strict alignment to
access as struct eth_header.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23 15:20:15 +08:00
Akihiko Odaki 0b11783014 net/eth: Rename eth_setup_vlan_headers_ex
The old eth_setup_vlan_headers has no user so remove it and rename
eth_setup_vlan_headers_ex.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23 15:20:15 +08:00
Akihiko Odaki 2f0fa232b8 net/net_rx_pkt: Use iovec for net_rx_pkt_set_protocols()
igb does not properly ensure the buffer passed to
net_rx_pkt_set_protocols() is contiguous for the entire L2/L3/L4 header.
Allow it to pass scattered data to net_rx_pkt_set_protocols().

Fixes: 3a977deebe ("Intrdocue igb device emulation")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23 15:20:15 +08:00
Vladimir Sementsov-Ogievskiy 6c1e3906ce configure: add --disable-colo-proxy option
Add option to not build filter-rewriter and colo-compare when
they are not needed.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Zhang Chen <chen.zhang@intel.com>
Message-Id: <20230515130640.46035-2-vsementsov@yandex-team.ru>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-05-18 18:40:50 +02:00
Eugenio Pérez 0d74e2b785 vdpa: accept VIRTIO_NET_F_SPEED_DUPLEX in SVQ
There is no reason to block it as it has nothing to do with the vrings.
All the support of the feature comes via config space.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Suggested-by: Alvaro Karsz <alvaro.karsz@solid-run.com>
Message-Id: <20230307170018.260557-1-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-04-21 03:08:21 -04:00
Marc-André Lureau 25657fc6c1 win32: replace closesocket() with close() wrapper
Use a close() wrapper instead, so that we don't need to worry about
closesocket() vs close() anymore, let's hope.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Message-Id: <20230221124802.4103554-17-marcandre.lureau@redhat.com>
2023-03-13 15:39:31 +04:00
Marc-André Lureau fd3c333315 slirp: open-code qemu_socket_(un)select()
We are about to make the QEMU socket API use file-descriptor space only,
but libslirp gives us SOCKET as fd, still.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Message-Id: <20230221124802.4103554-14-marcandre.lureau@redhat.com>
2023-03-13 15:39:31 +04:00
Marc-André Lureau 21ac728498 slirp: unregister the win32 SOCKET
Presumably, this is what should happen when the SOCKET is to be removed.
(it probably worked until now because closesocket() does it implicitly,
but we never now how the slirp library could use the SOCKET later)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Message-Id: <20230221124802.4103554-13-marcandre.lureau@redhat.com>
2023-03-13 15:39:31 +04:00
Marc-André Lureau faa4ec1641 main-loop: remove qemu_fd_register(), win32/slirp/socket specific
Open-code the socket registration where it's needed, to avoid
artificially used or unclear generic interface.

Furthermore, the following patches are going to make socket handling use
FD-only inside QEMU, but we need to handle win32 SOCKET from libslirp.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Message-Id: <20230221124802.4103554-12-marcandre.lureau@redhat.com>
2023-03-13 15:39:31 +04:00
Peter Maydell 7284d53f6f -----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
 
 iQEcBAABAgAGBQJkCvgFAAoJEO8Ells5jWIRHiUH/jhydpJHIqnAPxHQAwGtmyhb
 9Z52UOzW5V6KxfZJ+bQ4RPFkS2UwcxmeadPHY4zvvJTVBLAgG3QVgP4igj8CXKCI
 xRnwMgTNeu655kZQ5P/elTwdBTCJFODk7Egg/bH3H1ZiUhXBhVRhK7q/wMgtlZkZ
 Kexo6txCK4d941RNzEh45ZaGhdELE+B+D7cRuQgBs/DXZtJpsyEzBbP8KYSMHuER
 AXfWo0YIBYj7X3ek9D6j0pbOkB61vqtYd7W6xV4iDrJCcFBIOspJbbBb1tGCHola
 AXo5/OhRmiQnp/c/HTbJIDbrj0sq/r7LxYK4zY1x7UPbewHS9R+wz+FfqSmoBF0=
 =056y
 -----END PGP SIGNATURE-----

Merge tag 'net-pull-request' of https://github.com/jasowang/qemu into staging

# -----BEGIN PGP SIGNATURE-----
# Version: GnuPG v1
#
# iQEcBAABAgAGBQJkCvgFAAoJEO8Ells5jWIRHiUH/jhydpJHIqnAPxHQAwGtmyhb
# 9Z52UOzW5V6KxfZJ+bQ4RPFkS2UwcxmeadPHY4zvvJTVBLAgG3QVgP4igj8CXKCI
# xRnwMgTNeu655kZQ5P/elTwdBTCJFODk7Egg/bH3H1ZiUhXBhVRhK7q/wMgtlZkZ
# Kexo6txCK4d941RNzEh45ZaGhdELE+B+D7cRuQgBs/DXZtJpsyEzBbP8KYSMHuER
# AXfWo0YIBYj7X3ek9D6j0pbOkB61vqtYd7W6xV4iDrJCcFBIOspJbbBb1tGCHola
# AXo5/OhRmiQnp/c/HTbJIDbrj0sq/r7LxYK4zY1x7UPbewHS9R+wz+FfqSmoBF0=
# =056y
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 10 Mar 2023 09:27:33 GMT
# gpg:                using RSA key EF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F  3562 EF04 965B 398D 6211

* tag 'net-pull-request' of https://github.com/jasowang/qemu: (44 commits)
  ebpf: fix compatibility with libbpf 1.0+
  docs/system/devices/igb: Add igb documentation
  tests/avocado: Add igb test
  igb: Introduce qtest for igb device
  tests/qtest/libqos/e1000e: Export macreg functions
  tests/qtest/e1000e-test: Fabricate ethernet header
  Intrdocue igb device emulation
  e1000: Split header files
  pcie: Introduce pcie_sriov_num_vfs
  net/eth: Introduce EthL4HdrProto
  e1000e: Implement system clock
  net/eth: Report if headers are actually present
  e1000e: Count CRC in Tx statistics
  e1000: Count CRC in Tx statistics
  e1000e: Combine rx traces
  MAINTAINERS: Add e1000e test files
  MAINTAINERS: Add Akihiko Odaki as a e1000e reviewer
  e1000e: Do not assert when MSI-X is disabled later
  hw/net/net_tx_pkt: Check the payload length
  hw/net/net_tx_pkt: Implement TCP segmentation
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2023-03-11 17:17:18 +00:00
Akihiko Odaki 65f474bbae net/eth: Introduce EthL4HdrProto
igb, a new network device emulation, will need SCTP checksum offloading.
Currently eth_get_protocols() has a bool parameter for each protocol
currently it supports, but there will be a bit too many parameters if
we add yet another protocol.

Introduce an enum type, EthL4HdrProto to represent all L4 protocols
eth_get_protocols() support with one parameter.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10 15:35:38 +08:00
Akihiko Odaki 69ff5ef847 net/eth: Report if headers are actually present
The values returned by eth_get_protocols() are used to perform RSS,
checksumming and segmentation. Even when a packet signals the use of the
protocols which these operations can be applied to, the headers for them
may not be present because of too short packet or fragmentation, for
example. In such a case, the operations cannot be applied safely.

Report the presence of headers instead of whether the use of the
protocols are indicated with eth_get_protocols(). This also makes
corresponding changes to the callers of eth_get_protocols() to match
with its new signature and to remove redundant checks for fragmentation.

Fixes: 75020a7021 ("Common definitions for VMWARE devices")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10 15:35:38 +08:00
Akihiko Odaki 02ef5fdc09 hw/net/net_tx_pkt: Implement TCP segmentation
There was no proper implementation of TCP segmentation before this
change, and net_tx_pkt relied solely on IPv4 fragmentation. Not only
this is not aligned with the specification, but it also resulted in
corrupted IPv6 packets.

This is particularly problematic for the igb, a new proposed device
implementation; igb provides loopback feature for VMDq and the feature
relies on software segmentation.

Implement proper TCP segmentation in net_tx_pkt to fix such a scenario.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10 15:35:38 +08:00
Akihiko Odaki 481c52320a net: Strip virtio-net header when dumping
filter-dump specifiees Ethernet as PCAP LinkType, which does not expect
virtio-net header. Having virtio-net header in such PCAP file breaks
PCAP unconsumable. Unfortunately currently there is no LinkType for
virtio-net so for now strip virtio-net header to convert the output to
Ethernet.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10 15:35:38 +08:00
Eugenio Pérez 609ab4c3ed vdpa net: allow VHOST_F_LOG_ALL
Since some actions move to the start function instead of init, the
device features may not be the parent vdpa device's, but the one
returned by vhost backend.  If transition to SVQ is supported, the vhost
backend will return _F_LOG_ALL to signal the device is migratable.

Add VHOST_F_LOG_ALL.  HW dirty page tracking can be added on top of this
change if the device supports it in the future.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20230303172445.1089785-14-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Eugenio Pérez 5c1ebd4c43 vdpa: block migration if device has unsupported features
A vdpa net device must initialize with SVQ in order to be migratable at
this moment, and initialization code verifies some conditions.  If the
device is not initialized with the x-svq parameter, it will not expose
_F_LOG so the vhost subsystem will block VM migration from its
initialization.

Next patches change this, so we need to verify migration conditions
differently.

QEMU only supports a subset of net features in SVQ, and it cannot
migrate state that cannot track or restore in the destination.  Add a
migration blocker if the device offers an unsupported feature.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230303172445.1089785-12-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Eugenio Pérez 9c363cf6d5 vdpa net: block migration if the device has CVQ
Devices with CVQ need to migrate state beyond vq state.  Leaving this to
future series.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230303172445.1089785-11-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Eugenio Pérez 6949843046 vdpa: add vdpa net migration state notifier
This allows net to restart the device backend to configure SVQ on it.

Ideally, these changes should not be net specific and they could be done
in:
* vhost_vdpa_set_features (with VHOST_F_LOG_ALL)
* vhost_vdpa_set_vring_addr (with .enable_log)
* vhost_vdpa_set_log_base.

However, the vdpa net backend is the one with enough knowledge to
configure everything because of some reasons:
* Queues might need to be shadowed or not depending on its kind (control
  vs data).
* Queues need to share the same map translations (iova tree).

Also, there are other problems that may have solutions but complicates
the implementation at this stage:
* We're basically duplicating vhost_dev_start and vhost_dev_stop, and
  they could go out of sync.  If we want to reuse them, we need a way to
  skip some function calls to avoid recursiveness (either vhost_ops ->
  vhost_set_features, vhost_set_vring_addr, ...).
* We need to traverse all vhost_dev of a given net device twice: one to
  stop and get the vq state and another one after the reset to
  configure properties like address, fd, etc.

Because of that it is cleaner to restart the whole net backend and
configure again as expected, similar to how vhost-kernel moves between
userspace and passthrough.

If more kinds of devices need dynamic switching to SVQ we can:
* Create a callback struct like VhostOps and move most of the code
  there.  VhostOps cannot be reused since all vdpa backend share them,
  and to personalize just for networking would be too heavy.
* Add a parent struct or link all the vhost_vdpa or vhost_dev structs so
  we can traverse them.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230303172445.1089785-9-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Eugenio Pérez 00ef422e9f vdpa net: move iova tree creation from init to start
Only create iova_tree if and when it is needed.

The cleanup keeps being responsible for the last VQ but this change
allows it to merge both cleanup functions.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20230303172445.1089785-2-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Eugenio Pérez 525ae11522 vdpa: fix VHOST_BACKEND_F_IOTLB_ASID flag check
VHOST_BACKEND_F_IOTLB_ASID is the feature bit, not the bitmask. Since
the device under test also provided VHOST_BACKEND_F_IOTLB_MSG_V2 and
VHOST_BACKEND_F_IOTLB_BATCH, this went unnoticed.

Fixes: c1a1008685 ("vdpa: always start CVQ in SVQ mode if possible")
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-02-17 13:31:33 +08:00
Laurent Vivier 148fbf0d58 net: stream: add a new option to automatically reconnect
In stream mode, if the server shuts down there is currently
no way to reconnect the client to a new server without removing
the NIC device and the netdev backend (or to reboot).

This patch introduces a reconnect option that specifies a delay
to try to reconnect with the same parameters.

Add a new test in qtest to test the reconnect option and the
connect/disconnect events.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-02-17 13:31:33 +08:00
Joelle van Dyne 993f71ee33 vmnet: stop recieving events when VM is stopped
When the VM is stopped using the HMP command "stop", soon the handler will
stop reading from the vmnet interface. This causes a flood of
`VMNET_INTERFACE_PACKETS_AVAILABLE` events to arrive and puts the host CPU
at 100%. We fix this by removing the event handler from vmnet when the VM
is no longer in a running state and restore it when we return to a running
state.

Signed-off-by: Joelle van Dyne <j@getutm.app>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-02-17 13:31:33 +08:00
Christian Svensson 0c65ef4fbb net: Increase L2TPv3 buffer to fit jumboframes
Increase the allocated buffer size to fit larger packets.
Given that jumboframes can commonly be up to 9000 bytes the closest suitable
value seems to be 16 KiB.

Tested by running qemu towards a Linux L2TPv3 endpoint and pushing
jumboframe traffic through the interfaces.

Signed-off-by: Christian Svensson <blue@cmd.nu>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-02-17 13:31:33 +08:00
Thomas Huth 3b0cca8e4e net: Replace "Supported NIC models" with "Available NIC models"
Just because a NIC model is compiled into the QEMU binary does not
necessary mean that it can be used with each and every machine.
So let's rather talk about "available" models instead of "supported"
models, just to avoid confusion.

Reviewed-by: Claudio Fontana <cfontana@suse.de>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-02-17 13:31:33 +08:00
Thomas Huth 27c819244b net: Restore printing of the help text with "-nic help"
Running QEMU with "-nic help" used to work in QEMU 5.2 and earlier versions
(it showed the available netdev backends), but this feature got broken during
some refactoring in version 6.0. Let's restore the old behavior, and while
we're at it, let's also print the available NIC models here now since this
option can be used to configure both, netdev backend and model in one go.

Fixes: ad6f932fe8 ("net: do not exit on "netdev_add help" monitor command")
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-02-17 13:31:33 +08:00
Thomas Huth c6941b3b9b net: Move the code to collect available NIC models to a separate function
The code that collects the available NIC models is not really specific
to PCI anymore and will be required in the next patch, too, so let's
move this into a new separate function in net.c instead.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-02-17 13:31:33 +08:00
Markus Armbruster e02e085c8b net: Clean up includes
This commit was created with scripts/clean-includes.

All .c should include qemu/osdep.h first.  The script performs three
related cleanups:

* Ensure .c files include qemu/osdep.h first.
* Including it in a .h is redundant, since the .c  already includes
  it.  Drop such inclusions.
* Likewise, including headers qemu/osdep.h includes is redundant.
  Drop these, too.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20230202133830.2152150-13-armbru@redhat.com>
2023-02-08 07:28:05 +01:00
Markus Armbruster ae71d13d4e net: Move hmp_info_network() to net-hmp-cmds.c
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20230124121946.1139465-17-armbru@redhat.com>
2023-02-04 07:56:54 +01:00
Markus Armbruster 2030ca36bf net: Move HMP commands from monitor to net/
This moves these commands from MAINTAINERS sections "Human
Monitor (HMP)" and "QMP" to "Network device backends".

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20230124121946.1139465-16-armbru@redhat.com>
2023-02-04 07:56:54 +01:00
Peter Maydell aa96ab7c9d * s390x header clean-ups from Philippe
* Rework and improvements of the EINTR handling by Nikita
 * Deprecate the -no-hpet command line option
 * Disable the qtests in the 32-bit Windows CI job again
 * Some other misc fixes here and there
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmO8It8RHHRodXRoQHJl
 ZGhhdC5jb20ACgkQLtnXdP5wLbUwbA//dXgfHy95C1r2nTMDekk09+KkmNB1f6M8
 3HK4ROmmrMT/aP9FwfqMBT7JHM/m4bwOGw0Sula8vfjg9NYGPWuSYjdObWKnrIq/
 YORoTxqak9c98Co06EQbAfWn3Pj0ifQkX+FIyzcNGhu4856FWdBsMuyq52VLi36q
 Z8ruSOmclzluoIB3mVYY/s5J7ED2A3K0h39frKLE9FGsKObX10KWj+MZyDHi9oGZ
 ucTHai12OXgNghjlrwI0BqJziih4NxfIWs0JovSo3cN0at7m57G5JChjR38zTMNT
 2Q46tDKoIXesY1GUmVuIgJ5F1Uoshc8Pz5qBSQ5mUbZUQMpivhFrEB666wsYmPd1
 M/YwnZ+PFhWjem7p28fKmnmkeATvE0S+vMDifTVZ880nmAbyUm1vFKfqV6r2mBrT
 p4iXfh/9easFfJWHueU4fBwyMndDGRaCRJnP8KQ5I9yb0WZbt+/0k/y8CQD8Oxr7
 dNFFFoY3KnIO9DCRO5Wr+3OqUgtSAQyhBDf5V2wSMCFrwPHKsvWKSbdiWR3Qe4ck
 41InWgawB3xx57+vXraDUA10+nBZ1VrM92ObqfLPTFqjLCom6Fm85cG4YFRLIvRt
 rdlOC+ScpeVpec7MwcHrScGL0HmUgPnShDAo07pRy4oKK+c89sXzdAFf2nYJTAWS
 WCuChrn7VFM=
 =D+Yw
 -----END PGP SIGNATURE-----

Merge tag 'pull-request-2023-01-09' of https://gitlab.com/thuth/qemu into staging

* s390x header clean-ups from Philippe
* Rework and improvements of the EINTR handling by Nikita
* Deprecate the -no-hpet command line option
* Disable the qtests in the 32-bit Windows CI job again
* Some other misc fixes here and there

# gpg: Signature made Mon 09 Jan 2023 14:21:19 GMT
# gpg:                using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg:                issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg:                 aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg:                 aka "Thomas Huth <huth@tuxfamily.org>" [full]
# gpg:                 aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3  EAB9 2ED9 D774 FE70 2DB5

* tag 'pull-request-2023-01-09' of https://gitlab.com/thuth/qemu:
  .gitlab-ci.d/windows: Do not run the qtests in the msys2-32bit job
  error handling: Use RETRY_ON_EINTR() macro where applicable
  Refactoring: refactor TFR() macro to RETRY_ON_EINTR()
  docs/interop: Change the vnc-ledstate-Pseudo-encoding doc into .rst
  i386: Deprecate the -no-hpet QEMU command line option
  tests/qtest/bios-tables-test: Replace -no-hpet with hpet=off machine parameter
  tests/readconfig: spice doesn't support unix socket on windows yet
  target/s390x: Restrict sysemu/reset.h to system emulation
  target/s390x/tcg/excp_helper: Restrict system headers to sysemu
  target/s390x/tcg/misc_helper: Remove unused "memory.h" include
  hw/s390x/pv: Restrict Protected Virtualization to sysemu
  exec/memory: Expose memory_region_access_valid()
  MAINTAINERS: Add MIPS-related docs and configs to the MIPS architecture section
  tests/vm: Update get_default_jobs() to work on non-x86_64 non-KVM hosts
  qemu-iotests/stream-under-throttle: do not shutdown QEMU

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2023-01-09 15:54:31 +00:00
Nikita Ivanov 37b0b24e93 error handling: Use RETRY_ON_EINTR() macro where applicable
There is a defined RETRY_ON_EINTR() macro in qemu/osdep.h
which handles the same while loop.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/415
Signed-off-by: Nikita Ivanov <nivanov@cloudlinux.com>
Message-Id: <20221023090422.242617-3-nivanov@cloudlinux.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
[thuth: Dropped the hunk that changed socket_accept() in libqtest.c]
Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-01-09 13:50:47 +01:00
Nikita Ivanov 8b6aa69365 Refactoring: refactor TFR() macro to RETRY_ON_EINTR()
Rename macro name to more transparent one and refactor
it to expression.

Signed-off-by: Nikita Ivanov <nivanov@cloudlinux.com>
Message-Id: <20221023090422.242617-2-nivanov@cloudlinux.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-01-09 13:50:47 +01:00
Longpeng bf7a2ad8b6 vdpa: harden the error path if get_iova_range failed
We should stop if the GET_IOVA_RANGE ioctl failed.

Signed-off-by: Longpeng <longpeng2@huawei.com>
Message-Id: <20221224114848.3062-3-longpeng2@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2023-01-08 01:54:22 -05:00
Longpeng c672f348cb vdpa-dev: get iova range explicitly
In commit a585fad26b ("vdpa: request iova_range only once") we remove
GET_IOVA_RANGE form vhost_vdpa_init, the generic vdpa device will start
without iova_range populated, so the device won't work. Let's call
GET_IOVA_RANGE ioctl explicitly.

Fixes: a585fad26b ("vdpa: request iova_range only once")
Signed-off-by: Longpeng <longpeng2@huawei.com>
Message-Id: <20221224114848.3062-2-longpeng2@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2023-01-08 01:54:22 -05:00