qemu-e2k

Commit Graph

Author	SHA1	Message	Date
Ben Widawsky	4f8db8711c	hw/pxb: Allow creation of a CXL PXB (host bridge) This works like adding a typical pxb device, except the name is 'pxb-cxl' instead of 'pxb-pcie'. An example command line would be as follows: -device pxb-cxl,id=cxl.0,bus="pcie.0",bus_nr=1 A CXL PXB is backward compatible with PCIe. What this means in practice is that an operating system that is unaware of CXL should still be able to enumerate this topology as if it were PCIe. One can create multiple CXL PXB host bridges, but a host bridge can only be connected to the main root bus. Host bridges cannot appear elsewhere in the topology. Note that as of this patch, the ACPI tables needed for the host bridge (specifically, an ACPI object in _SB named ACPI0016 and the CEDT) aren't created. So while this patch internally creates it, it cannot be properly used by an operating system or other system software. Also necessary is to add an exception to scripts/device-crash-test similar to that for exiting pxb as both must created on a PCIexpress host bus. Signed-off-by: Ben Widawsky <ben.widawsky@intel.com> Signed-off-by: Jonathan.Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20220429144110.25167-15-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-05-13 06:13:36 -04:00
Jonathan Cameron	abb3009baf	cxl: Machine level control on whether CXL support is enabled There are going to be some potential overheads to CXL enablement, for example the host bridge region reserved in memory maps. Add a machine level control so that CXL is disabled by default. Signed-off-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20220429144110.25167-14-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-05-13 06:13:36 -04:00
Ben Widawsky	9dccb1216b	hw/pci/cxl: Create a CXL bus type The easiest way to differentiate a CXL bus, and a PCIE bus is using a flag. A CXL bus, in hardware, is backward compatible with PCIE, and therefore the code tries pretty hard to keep them in sync as much as possible. The other way to implement this would be to try to cast the bus to the correct type. This is less code and useful for debugging via simply looking at the flags. Signed-off-by: Ben Widawsky <ben.widawsky@intel.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20220429144110.25167-13-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-05-13 06:13:36 -04:00
Ben Widawsky	25a2e524e3	hw/pxb: Use a type for realizing expanders This opens up the possibility for more types of expanders (other than PCI and PCIe). We'll need this to create a CXL expander. Signed-off-by: Ben Widawsky <ben.widawsky@intel.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20220429144110.25167-12-Jonathan.Cameron@huawei.com>	2022-05-13 06:13:36 -04:00
Ben Widawsky	056172691b	hw/cxl/device: Add log commands (8.2.9.4) + CEL CXL specification provides for the ability to obtain logs from the device. Logs are either spec defined, like the "Command Effects Log" (CEL), or vendor specific. UUIDs are defined for all log types. The CEL is a mechanism to provide information to the host about which commands are supported. It is useful both to determine which spec'd optional commands are supported, as well as provide a list of vendor specified commands that might be used. The CEL is already created as part of mailbox initialization, but here it is now exported to hosts that use these log commands. Signed-off-by: Ben Widawsky <ben.widawsky@intel.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20220429144110.25167-11-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-05-13 06:13:36 -04:00
Ben Widawsky	557a79c83e	hw/cxl/device: Timestamp implementation (8.2.9.3) Errata F4 to CXL 2.0 clarified the meaning of the timer as the sum of the value set with the timestamp set command and the number of nano seconds since it was last set. Signed-off-by: Ben Widawsky <ben.widawsky@intel.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20220429144110.25167-10-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-05-13 06:13:36 -04:00
Ben Widawsky	57c02b355f	hw/cxl/device: Add cheap EVENTS implementation (8.2.9.1) Using the previously implemented stubbed helpers, it is now possible to easily add the missing, required commands to the implementation. Signed-off-by: Ben Widawsky <ben.widawsky@intel.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20220429144110.25167-9-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-05-13 06:13:36 -04:00
Ben Widawsky	ce3b4e5c15	hw/cxl/device: Add memory device utilities Memory devices implement extra capabilities on top of CXL devices. This adds support for that. A large part of memory devices is the mailbox/command interface. All of the mailbox handling is done in the mailbox-utils library. Longer term, new CXL devices that are being emulated may want to handle commands differently, and therefore would need a mechanism to opt in/out of the specific generic handlers. As such, this is considered sufficient for now, but may need more depth in the future. Signed-off-by: Ben Widawsky <ben.widawsky@intel.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20220429144110.25167-8-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-05-13 06:13:36 -04:00
Ben Widawsky	464e14ac43	hw/cxl/device: Implement basic mailbox (8.2.8.4) This is the beginning of implementing mailbox support for CXL 2.0 devices. The implementation recognizes when the doorbell is rung, handles the command/payload, clears the doorbell while returning error codes and data. Generally the mailbox mechanism is designed to permit communication between the host OS and the firmware running on the device. For our purposes, we emulate both the firmware, implemented primarily in cxl-mailbox-utils.c, and the hardware. No commands are implemented yet. Signed-off-by: Ben Widawsky <ben.widawsky@intel.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20220429144110.25167-7-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-05-13 06:13:36 -04:00
Ben Widawsky	6364adacdf	hw/cxl/device: Implement the CAP array (8.2.8.1-2) This implements all device MMIO up to the first capability. That includes the CXL Device Capabilities Array Register, as well as all of the CXL Device Capability Header Registers. The latter are filled in as they are implemented in the following patches. Endianness and alignment are managed by softmmu memory core. Signed-off-by: Ben Widawsky <ben.widawsky@intel.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20220429144110.25167-6-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-05-13 06:13:36 -04:00
Ben Widawsky	cd90126b4c	hw/cxl/device: Introduce a CXL device (8.2.8) A CXL device is a type of CXL component. Conceptually, a CXL device would be a leaf node in a CXL topology. From an emulation perspective, CXL devices are the most complex and so the actual implementation is reserved for discrete commits. This new device type is specifically catered towards the eventual implementation of a Type3 CXL.mem device, 8.2.8.5 in the CXL 2.0 specification. Signed-off-by: Ben Widawsky <ben.widawsky@intel.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Adam Manzanares <a.manzanares@samsung.com> Message-Id: <20220429144110.25167-5-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-05-13 06:13:36 -04:00
Jonathan Cameron	502730ee3c	MAINTAINERS: Add entry for Compute Express Link Emulation The CXL emulation will be jointly maintained by Ben Widawsky and Jonathan Cameron. Broken out as a separate patch to improve visibility. Signed-off-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20220429144110.25167-4-Jonathan.Cameron@huawei.com>	2022-05-13 06:13:36 -04:00
Ben Widawsky	9e58f52d3f	hw/cxl/component: Introduce CXL components (8.1.x, 8.2.5) A CXL 2.0 component is any entity in the CXL topology. All components have a analogous function in PCIe. Except for the CXL host bridge, all have a PCIe config space that is accessible via the common PCIe mechanisms. CXL components are enumerated via DVSEC fields in the extended PCIe header space. CXL components will minimally implement some subset of CXL.mem and CXL.cache registers defined in 8.2.5 of the CXL 2.0 specification. Two headers and a utility library are introduced to support the minimum functionality needed to enumerate components. The cxl_pci header manages bits associated with PCI, specifically the DVSEC and related fields. The cxl_component.h variant has data structures and APIs that are useful for drivers implementing any of the CXL 2.0 components. The library takes care of making use of the DVSEC bits and the CXL.[mem\|cache] registers. Per spec, the registers are little endian. None of the mechanisms required to enumerate a CXL capable hostbridge are introduced at this point. Note that the CXL.mem and CXL.cache registers used are always 4B wide. It's possible in the future that this constraint will not hold. Signed-off-by: Ben Widawsky <ben.widawsky@intel.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Adam Manzanares <a.manzanares@samsung.com> Message-Id: <20220429144110.25167-3-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-05-13 06:13:35 -04:00
Ben Widawsky	cf04aba2a9	hw/pci/cxl: Add a CXL component type (interface) A CXL component is a hardware entity that implements CXL component registers from the CXL 2.0 spec (8.2.3). Currently these represent 3 general types. 1. Host Bridge 2. Ports (root, upstream, downstream) 3. Devices (memory, other) A CXL component can be conceptually thought of as a PCIe device with extra functionality when enumerated and enabled. For this reason, CXL does here, and will continue to add on to existing PCI code paths. Host bridges will typically need to be handled specially and so they can implement this newly introduced interface or not. All other components should implement this interface. Implementing this interface allows the core PCI code to treat these devices as special where appropriate. Signed-off-by: Ben Widawsky <ben.widawsky@intel.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Adam Manzanares <a.manzanares@samsung.com> Message-Id: <20220429144110.25167-2-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-05-13 06:13:35 -04:00
Jason Wang	250227f4fa	intel-iommu: correct the value used for error_setg_errno() error_setg_errno() expects a normal errno value, not a negated one, so we should use ENOTSUP instead of -ENOSUP. Fixes: Coverity CID 1487174 Fixes: ("intel_iommu: support snoop control") Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20220401022824.9337-1-jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Peter Xu <peterx@redhat.com>	2022-05-13 05:22:31 -04:00
Halil Pasic	06134e2bc3	virtio: fix feature negotiation for ACCESS_PLATFORM Unlike most virtio features ACCESS_PLATFORM is considered mandatory by QEMU, i.e. the driver must accept it if offered by the device. The virtio specification says that the driver SHOULD accept the ACCESS_PLATFORM feature if offered, and that the device MAY fail to operate if ACCESS_PLATFORM was offered but not negotiated. While a SHOULD ain't exactly a MUST, we are certainly allowed to fail the device when the driver fences ACCESS_PLATFORM. With commit `2943b53f68` ("virtio: force VIRTIO_F_IOMMU_PLATFORM") we already made the decision to do so whenever the get_dma_as() callback is implemented (by the bus), which in practice means for the entirety of virtio-pci. That means, if the device needs to translate I/O addresses, then ACCESS_PLATFORM is mandatory. The aforementioned commit tells us in the commit message that this is for security reasons. More precisely if we were to allow a less then trusted driver (e.g. an user-space driver, or a nested guest) to make the device bypass the IOMMU by not negotiating ACCESS_PLATFORM, then the guest kernel would have no ability to control/police (by programming the IOMMU) what pieces of guest memory the driver may manipulate using the device. Which would break security assumptions within the guest. If ACCESS_PLATFORM is offered not because we want the device to utilize an IOMMU and do address translation, but because the device does not have access to the entire guest RAM, and needs the driver to grant access to the bits it needs access to (e.g. confidential guest support), we still require the guest to have the corresponding logic and to accept ACCESS_PLATFORM. If the driver does not accept ACCESS_PLATFORM, then things are bound to go wrong, and we may see failures much less graceful than failing the device because the driver didn't negotiate ACCESS_PLATFORM. So let us make ACCESS_PLATFORM mandatory for the driver regardless of whether the get_dma_as() callback is implemented or not. Signed-off-by: Halil Pasic <pasic@linux.ibm.com> Fixes: `2943b53f68` ("virtio: force VIRTIO_F_IOMMU_PLATFORM") Message-Id: <20220307112939.2780117-1-pasic@linux.ibm.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com>	2022-05-13 05:22:31 -04:00
Richard Henderson	9de5f2b408	* small cleanups for pc-bios/optionrom Makefiles * checkpatch: fix g_malloc check * fix mremap() and RDMA detection * confine igd-passthrough-isa-bridge to Xen-enabled builds * cover PCI in arm-virt machine qtests * add -M boot and -M mem compound properties * bump SLIRP submodule * support CFI with system libslirp (>= 4.7) * clean up CoQueue wakeup functions * fix vhost-vsock regression * fix --disable-vnc compilation * other minor bugfixes -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmJ8/KMUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroNTTAf9Et1C8iZn+OlZi99wMEeMy8a4mIE5 CpkBpFphhkBvt3AH7XNsCyL4Gea4QgsI7nOIEVUwvW7gPf85PiBUX8mjrIVg3x1k bmMEwMKSTYPmDieAnYBP9zCqZQXNYP8L8WxVs2jFY2GXZ2ZogODYFbvCY4yEEB72 UR6uIvQRdpiB6BEj8UZ+5i+sDtb0zxqrjzUz8T/PJC9/2JSNgi+sAWWQoQT3PPU7 R7z2nmEa1VeVLPP6mUHvJKhBltVXF+LyIjQHvo+Tp9tSqp9JwXfFBNQ5W/MFes2D skF47N7PdgKRH9Dp4r0j+MqBwoAq86+ao+MKsbQ1Gb91HhoCWt/MrVrVyg== =1E6P -----END PGP SIGNATURE----- Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging * small cleanups for pc-bios/optionrom Makefiles * checkpatch: fix g_malloc check * fix mremap() and RDMA detection * confine igd-passthrough-isa-bridge to Xen-enabled builds * cover PCI in arm-virt machine qtests * add -M boot and -M mem compound properties * bump SLIRP submodule * support CFI with system libslirp (>= 4.7) * clean up CoQueue wakeup functions * fix vhost-vsock regression * fix --disable-vnc compilation * other minor bugfixes # -----BEGIN PGP SIGNATURE----- # # iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmJ8/KMUHHBib256aW5p # QHJlZGhhdC5jb20ACgkQv/vSX3jHroNTTAf9Et1C8iZn+OlZi99wMEeMy8a4mIE5 # CpkBpFphhkBvt3AH7XNsCyL4Gea4QgsI7nOIEVUwvW7gPf85PiBUX8mjrIVg3x1k # bmMEwMKSTYPmDieAnYBP9zCqZQXNYP8L8WxVs2jFY2GXZ2ZogODYFbvCY4yEEB72 # UR6uIvQRdpiB6BEj8UZ+5i+sDtb0zxqrjzUz8T/PJC9/2JSNgi+sAWWQoQT3PPU7 # R7z2nmEa1VeVLPP6mUHvJKhBltVXF+LyIjQHvo+Tp9tSqp9JwXfFBNQ5W/MFes2D # skF47N7PdgKRH9Dp4r0j+MqBwoAq86+ao+MKsbQ1Gb91HhoCWt/MrVrVyg== # =1E6P # -----END PGP SIGNATURE----- # gpg: Signature made Thu 12 May 2022 05:25:07 AM PDT # gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83 # gpg: issuer "pbonzini@redhat.com" # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [undefined] # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [undefined] # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (27 commits) vmxcap: add tertiary execution controls vl: make machine type deprecation a warning meson: link libpng independent of vnc vhost-backend: do not depend on CONFIG_VHOST_VSOCK coroutine-lock: qemu_co_queue_restart_all is a coroutine-only qemu_co_enter_all coroutine-lock: introduce qemu_co_queue_enter_all coroutine-lock: qemu_co_queue_next is a coroutine-only qemu_co_enter_next net: slirp: allow CFI with libslirp >= 4.7 net: slirp: add support for CFI-friendly timer API net: slirp: switch to slirp_new net: slirp: introduce a wrapper struct for QemuTimer slirp: bump submodule past 4.7 release machine: move more memory validation to Machine object machine: make memory-backend a link property machine: add mem compound property machine: add boot compound property machine: use QAPI struct for boot configuration tests/qtest/libqos: Add generic pci host bridge in arm-virt machine tests/qtest/libqos: Skip hotplug tests if pci root bus is not hotpluggable tests/qtest/libqos/pci: Introduce pio_limit ... Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2022-05-12 10:52:15 -07:00
Richard Henderson	b32b3897f8	Block layer patches - coroutine: Fix crashes due to too large pool batch size - fdc: Prevent end-of-track overrun - nbd: MULTI_CONN for shared writable exports - iotests test runner improvements -----BEGIN PGP SIGNATURE----- iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAmJ9KCkRHGt3b2xmQHJl ZGhhdC5jb20ACgkQfwmycsiPL9ZtSRAAmYDFBPqxfutpFXM7kIKwL6COXJC12MOx Tmu8cDiGB/jNChdi3kl6I5h5njzo3U0ZlL/Ign6EzHoeoXLAPSeUWmuRsARwsZ+A rL61gf6yrMjAo45FZuIS0GlMDk8BauRwPl9qPWeqQcrtOMYpxwZfyFGmcMpQgAOI MSC1I8p3FA7oJhGpKIHDPOjaZA97Lm2rLnDIwZ4f0YgssbybFBcFCXOQbhpsVhLy Tjp/L+qRUtna9xBsPHQvHZW0kITQbCQPdX+oVqqUmwzSvuHqfXKe1YppyPjBt/S0 H7nxtx4HOgP0lP5Kea+wbIRAk9Da5uaOW8hlMWRLShEKv1iTUenQSKteBB6CD03t GD9ze1kGoR9b6szw795BXxZxcWii0cn359lIVHeKR/U8zDuz5w3zhyl0klK8xeJy nj+JErLwQ7BD8kNR+7WAfXTF3tk2dQao1AvsBjn087KjMiJ/Mg8HY4K2zrjBUrHL DLTyAIjzct3BWJDZ02fb5jb8pHmIP3JO6m9Zvjm7ibP65BqJOwIXUTFpbgnrOg45 oFLDV4JgC4Hh4GEtdm+UhQE51A0VVW5pDaqWTdWkCcuk3QgxUdM3Wm3SW6pw1Gvb T0X0j5RgF/k3YrW576R/VIy6z4YPbzAtiG4O/zSlsujHoDcVNWnxApgSB/unaDh8 LNkFPGEMeSs= =JmTm -----END PGP SIGNATURE----- Merge tag 'for-upstream' of git://repo.or.cz/qemu/kevin into staging Block layer patches - coroutine: Fix crashes due to too large pool batch size - fdc: Prevent end-of-track overrun - nbd: MULTI_CONN for shared writable exports - iotests test runner improvements # -----BEGIN PGP SIGNATURE----- # # iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAmJ9KCkRHGt3b2xmQHJl # ZGhhdC5jb20ACgkQfwmycsiPL9ZtSRAAmYDFBPqxfutpFXM7kIKwL6COXJC12MOx # Tmu8cDiGB/jNChdi3kl6I5h5njzo3U0ZlL/Ign6EzHoeoXLAPSeUWmuRsARwsZ+A # rL61gf6yrMjAo45FZuIS0GlMDk8BauRwPl9qPWeqQcrtOMYpxwZfyFGmcMpQgAOI # MSC1I8p3FA7oJhGpKIHDPOjaZA97Lm2rLnDIwZ4f0YgssbybFBcFCXOQbhpsVhLy # Tjp/L+qRUtna9xBsPHQvHZW0kITQbCQPdX+oVqqUmwzSvuHqfXKe1YppyPjBt/S0 # H7nxtx4HOgP0lP5Kea+wbIRAk9Da5uaOW8hlMWRLShEKv1iTUenQSKteBB6CD03t # GD9ze1kGoR9b6szw795BXxZxcWii0cn359lIVHeKR/U8zDuz5w3zhyl0klK8xeJy # nj+JErLwQ7BD8kNR+7WAfXTF3tk2dQao1AvsBjn087KjMiJ/Mg8HY4K2zrjBUrHL # DLTyAIjzct3BWJDZ02fb5jb8pHmIP3JO6m9Zvjm7ibP65BqJOwIXUTFpbgnrOg45 # oFLDV4JgC4Hh4GEtdm+UhQE51A0VVW5pDaqWTdWkCcuk3QgxUdM3Wm3SW6pw1Gvb # T0X0j5RgF/k3YrW576R/VIy6z4YPbzAtiG4O/zSlsujHoDcVNWnxApgSB/unaDh8 # LNkFPGEMeSs= # =JmTm # -----END PGP SIGNATURE----- # gpg: Signature made Thu 12 May 2022 08:30:49 AM PDT # gpg: using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6 # gpg: issuer "kwolf@redhat.com" # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full] * tag 'for-upstream' of git://repo.or.cz/qemu/kevin: qemu-iotests: inline common.config into common.rc nbd/server: Allow MULTI_CONN for shared writable exports qemu-nbd: Pass max connections to blockdev layer tests/qtest/fdc-test: Add a regression test for CVE-2021-3507 hw/block/fdc: Prevent end-of-track overrun (CVE-2021-3507) .gitlab-ci.d: export meson testlog.txt as an artifact tests/qemu-iotests: print intent to run a test in TAP mode iotests/testrunner: Flush after run_test() coroutine: Revert to constant batch size coroutine: Rename qemu_coroutine_inc/dec_pool_size() Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2022-05-12 08:37:28 -07:00
Paolo Bonzini	f70625299e	qemu-iotests: inline common.config into common.rc common.rc has some complicated logic to find the common.config that dates back to xfstests and is completely unnecessary now. Just include the contents of the file. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20220505094723.732116-1-pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2022-05-12 15:42:49 +02:00
Paolo Bonzini	333dbac358	vmxcap: add tertiary execution controls Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-12 14:23:19 +02:00
Paolo Bonzini	7adb75d6be	vl: make machine type deprecation a warning error_report should generally be followed by a failure; if we can proceed anyway, that is just a warning and should be communicated properly to the user with warn_report. Reviewed-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20220511175043.27327-1-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-12 14:23:17 +02:00
Eric Blake	58a6fdcc9e	nbd/server: Allow MULTI_CONN for shared writable exports According to the NBD spec, a server that advertises NBD_FLAG_CAN_MULTI_CONN promises that multiple client connections will not see any cache inconsistencies: when properly separated by a single flush, actions performed by one client will be visible to another client, regardless of which client did the flush. We always satisfy these conditions in qemu - even when we support multiple clients, ALL clients go through a single point of reference into the block layer, with no local caching. The effect of one client is instantly visible to the next client. Even if our backend were a network device, we argue that any multi-path caching effects that would cause inconsistencies in back-to-back actions not seeing the effect of previous actions would be a bug in that backend, and not the fault of caching in qemu. As such, it is safe to unconditionally advertise CAN_MULTI_CONN for any qemu NBD server situation that supports parallel clients. Note, however, that we don't want to advertise CAN_MULTI_CONN when we know that a second client cannot connect (for historical reasons, qemu-nbd defaults to a single connection while nbd-server-add and QMP commands default to unlimited connections; but we already have existing means to let either style of NBD server creation alter those defaults). This is visible by no longer advertising MULTI_CONN for 'qemu-nbd -r' without -e, as in the iotest nbd-qemu-allocation. The harder part of this patch is setting up an iotest to demonstrate behavior of multiple NBD clients to a single server. It might be possible with parallel qemu-io processes, but I found it easier to do in python with the help of libnbd, and help from Nir and Vladimir in writing the test. Signed-off-by: Eric Blake <eblake@redhat.com> Suggested-by: Nir Soffer <nsoffer@redhat.com> Suggested-by: Vladimir Sementsov-Ogievskiy <v.sementsov-og@mail.ru> Message-Id: <20220512004924.417153-3-eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2022-05-12 13:10:52 +02:00
Eric Blake	a5fced4021	qemu-nbd: Pass max connections to blockdev layer The next patch wants to adjust whether the NBD server code advertises MULTI_CONN based on whether it is known if the server limits to exactly one client. For a server started by QMP, this information is obtained through nbd_server_start (which can support more than one export); but for qemu-nbd (which supports exactly one export), it is controlled only by the command-line option -e/--shared. Since we already have a hook function used by qemu-nbd, it's easiest to just alter its signature to fit our needs. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20220512004924.417153-2-eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2022-05-12 13:10:52 +02:00
Philippe Mathieu-Daudé	46609b90d9	tests/qtest/fdc-test: Add a regression test for CVE-2021-3507 Add the reproducer from https://gitlab.com/qemu-project/qemu/-/issues/339 Without the previous commit, when running 'make check-qtest-i386' with QEMU configured with '--enable-sanitizers' we get: ==4028352==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x619000062a00 at pc 0x5626d03c491a bp 0x7ffdb4199410 sp 0x7ffdb4198bc0 READ of size 786432 at 0x619000062a00 thread T0 #0 0x5626d03c4919 in __asan_memcpy (qemu-system-i386+0x1e65919) #1 0x5626d1c023cc in flatview_write_continue softmmu/physmem.c:2787:13 #2 0x5626d1bf0c0f in flatview_write softmmu/physmem.c:2822:14 #3 0x5626d1bf0798 in address_space_write softmmu/physmem.c:2914:18 #4 0x5626d1bf0f37 in address_space_rw softmmu/physmem.c:2924:16 #5 0x5626d1bf14c8 in cpu_physical_memory_rw softmmu/physmem.c:2933:5 #6 0x5626d0bd5649 in cpu_physical_memory_write include/exec/cpu-common.h:82:5 #7 0x5626d0bd0a07 in i8257_dma_write_memory hw/dma/i8257.c:452:9 #8 0x5626d09f825d in fdctrl_transfer_handler hw/block/fdc.c:1616:13 #9 0x5626d0a048b4 in fdctrl_start_transfer hw/block/fdc.c:1539:13 #10 0x5626d09f4c3e in fdctrl_write_data hw/block/fdc.c:2266:13 #11 0x5626d09f22f7 in fdctrl_write hw/block/fdc.c:829:9 #12 0x5626d1c20bc5 in portio_write softmmu/ioport.c:207:17 0x619000062a00 is located 0 bytes to the right of 512-byte region [0x619000062800,0x619000062a00) allocated by thread T0 here: #0 0x5626d03c66ec in posix_memalign (qemu-system-i386+0x1e676ec) #1 0x5626d2b988d4 in qemu_try_memalign util/oslib-posix.c:210:11 #2 0x5626d2b98b0c in qemu_memalign util/oslib-posix.c:226:27 #3 0x5626d09fbaf0 in fdctrl_realize_common hw/block/fdc.c:2341:20 #4 0x5626d0a150ed in isabus_fdc_realize hw/block/fdc-isa.c:113:5 #5 0x5626d2367935 in device_set_realized hw/core/qdev.c:531:13 SUMMARY: AddressSanitizer: heap-buffer-overflow (qemu-system-i386+0x1e65919) in __asan_memcpy Shadow bytes around the buggy address: 0x0c32800044f0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0c3280004500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0c3280004510: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0c3280004520: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0c3280004530: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 =>0x0c3280004540:[fa]fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0c3280004550: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0c3280004560: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0c3280004570: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0c3280004580: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0c3280004590: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Heap left redzone: fa Freed heap region: fd ==4028352==ABORTING [ kwolf: Added snapshot=on to prevent write file lock failure ] Reported-by: Alexander Bulekov <alxndr@bu.edu> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Alexander Bulekov <alxndr@bu.edu> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2022-05-12 13:03:25 +02:00
Philippe Mathieu-Daudé	defac5e2fb	hw/block/fdc: Prevent end-of-track overrun (CVE-2021-3507) Per the 82078 datasheet, if the end-of-track (EOT byte in the FIFO) is more than the number of sectors per side, the command is terminated unsuccessfully: * 5.2.5 DATA TRANSFER TERMINATION The 82078 supports terminal count explicitly through the TC pin and implicitly through the underrun/over- run and end-of-track (EOT) functions. For full sector transfers, the EOT parameter can define the last sector to be transferred in a single or multisector transfer. If the last sector to be transferred is a par- tial sector, the host can stop transferring the data in mid-sector, and the 82078 will continue to complete the sector as if a hardware TC was received. The only difference between these implicit functions and TC is that they return "abnormal termination" result status. Such status indications can be ignored if they were expected. * 6.1.3 READ TRACK This command terminates when the EOT specified number of sectors have been read. If the 82078 does not find an I D Address Mark on the diskette after the second· occurrence of a pulse on the INDX# pin, then it sets the IC code in Status Regis- ter 0 to "01" (Abnormal termination), sets the MA bit in Status Register 1 to "1", and terminates the com- mand. * 6.1.6 VERIFY Refer to Table 6-6 and Table 6-7 for information concerning the values of MT and EC versus SC and EOT value. * Table 6·6. Result Phase Table * Table 6-7. Verify Command Result Phase Table Fix by aborting the transfer when EOT > # Sectors Per Side. Cc: qemu-stable@nongnu.org Cc: Hervé Poussineau <hpoussin@reactos.org> Fixes: `baca51faff` ("floppy driver: disk geometry auto detect") Reported-by: Alexander Bulekov <alxndr@bu.edu> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/339 Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20211118115733.4038610-2-philmd@redhat.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2022-05-12 12:31:08 +02:00
Kshitij Suri	e23a13c042	meson: link libpng independent of vnc Currently png support is dependent on vnc for linking object file to libpng. This commit makes the parameter independent of vnc as it breaks system emulator with --disable-vnc unless --disable-png is added. Fixes: `9a0a119a38` ("Added parameter to take screenshot with screendump as PNG", 2022-04-27) Signed-off-by: Kshitij Suri <kshitij.suri@nutanix.com> Message-Id: <20220510161932.228481-1-kshitij.suri@nutanix.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-12 12:29:44 +02:00
Paolo Bonzini	d93e839ccd	vhost-backend: do not depend on CONFIG_VHOST_VSOCK The vsock callbacks .vhost_vsock_set_guest_cid and .vhost_vsock_set_running are the only ones to be conditional on #ifdef CONFIG_VHOST_VSOCK. This is different from any other device-dependent callbacks like .vhost_scsi_set_endpoint, and it also broke when CONFIG_VHOST_VSOCK was changed to a per-target symbol. It would be possible to also use the CONFIG_DEVICES include, but really there is no reason for most virtio files to be per-target so just remove the #ifdef to fix the issue. Reported-by: Dov Murik <dovmurik@linux.ibm.com> Fixes: `9972ae314f` ("build: move vhost-vsock configuration to Kconfig") Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-12 12:29:44 +02:00
Paolo Bonzini	f0d43b1ece	coroutine-lock: qemu_co_queue_restart_all is a coroutine-only qemu_co_enter_all qemu_co_queue_restart_all is basically the same as qemu_co_enter_all but without a QemuLockable argument. That's perfectly fine, but only as long as the function is marked coroutine_fn. If used outside coroutine context, qemu_co_queue_wait will attempt to take the lock and that is just broken: if you are calling qemu_co_queue_restart_all outside coroutine context, the lock is going to be a QemuMutex which cannot be taken twice by the same thread. The patch adds the marker to qemu_co_queue_restart_all and to its sole non-coroutine_fn caller; it then reimplements the function in terms of qemu_co_enter_all_impl, to remove duplicated code and to clarify that the latter also works in coroutine context. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20220427130830.150180-4-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-12 12:29:44 +02:00
Paolo Bonzini	d6ee15adec	coroutine-lock: introduce qemu_co_queue_enter_all Because qemu_co_queue_restart_all does not release the lock, it should be used only in coroutine context. Introduce a new function that, like qemu_co_enter_next, does release the lock, and use it whenever qemu_co_queue_restart_all was used outside coroutine context. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20220427130830.150180-3-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-12 12:29:44 +02:00
Paolo Bonzini	248af9e80a	coroutine-lock: qemu_co_queue_next is a coroutine-only qemu_co_enter_next qemu_co_queue_next is basically the same as qemu_co_enter_next but without a QemuLockable argument. That's perfectly fine, but only as long as the function is marked coroutine_fn. If used outside coroutine context, qemu_co_queue_wait will attempt to take the lock and that is just broken: if you are calling qemu_co_queue_next outside coroutine context, the lock is going to be a QemuMutex which cannot be taken twice by the same thread. The patch adds the marker and reimplements qemu_co_queue_next in terms of qemu_co_enter_next_impl, to remove duplicated code and to clarify that the latter also works in coroutine context. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20220427130830.150180-2-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-12 12:29:44 +02:00
Paolo Bonzini	bf2f69d08b	net: slirp: allow CFI with libslirp >= 4.7 slirp 4.7 introduces a new CFI-friendly timer callback that does not pass function pointers within libslirp as callbacks for timers. Check the version number and, if it is new enough, allow using CFI even with a system libslirp. Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org> Reviewed-by: Marc-André Lureau <malureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-12 12:29:44 +02:00
Paolo Bonzini	6222e55d13	net: slirp: add support for CFI-friendly timer API libslirp 4.7 introduces a CFI-friendly version of the .timer_new callback. The new callback replaces the function pointer with an enum; invoking the callback is done with a new function slirp_handle_timer. Support the new API so that CFI can be made compatible with using a system libslirp. Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org> Reviewed-by: Marc-André Lureau <malureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-12 12:29:44 +02:00
Paolo Bonzini	bce63ded20	net: slirp: switch to slirp_new Replace slirp_init with slirp_new, so that a more recent cfg.version can be specified. The function appeared in version 4.1.0. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-12 12:29:44 +02:00
Paolo Bonzini	ad2e5b87d7	net: slirp: introduce a wrapper struct for QemuTimer This struct will be extended in the next few patches to support the new slirp_handle_timer() call. For that we need to store an additional "int" for each SLIRP timer, in addition to the cb_opaque. Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org> Reviewed-by: Marc-André Lureau <malureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-12 12:29:44 +02:00
Paolo Bonzini	0c1450e204	slirp: bump submodule past 4.7 release Version 4.7 of slirp provides a new timer API that works better with CFI, together with several other improvements: * Allow disabling the internal DHCP server !22 * Support Unix sockets in hostfwd !103 * IPv6 DNS proxying support !110 * bootp: add support for UEFI HTTP boot !111 and bugfixes. The submodule update also includes 2 commits to fix warnings in the Win32 build. Reviewed-by: Marc-André Lureau <malureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-12 12:29:44 +02:00
Paolo Bonzini	fb56b7a052	machine: move more memory validation to Machine object This allows setting memory properties without going through vl.c, and have them validated just the same. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20220414165300.555321-6-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-12 12:29:44 +02:00
Paolo Bonzini	26f88d84da	machine: make memory-backend a link property Handle HostMemoryBackend creation and setting of ms->ram entirely in machine_run_board_init. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20220414165300.555321-5-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-12 12:29:44 +02:00
Paolo Bonzini	ce9d03fb3f	machine: add mem compound property Make -m syntactic sugar for a compound property "-machine mem.{size,max-size,slots}". The new property does not have the magic conversion to megabytes of unsuffixed arguments, and also does not understand that "0" means the default size (you have to leave it out to get the default). This means that we need to convert the QemuOpts by hand to a QDict. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20220414165300.555321-4-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-12 12:29:44 +02:00
Paolo Bonzini	8c4da4b521	machine: add boot compound property Make -boot syntactic sugar for a compound property "-machine boot.{order,menu,...}". machine_boot_parse is replaced by the setter for the property. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20220414165300.555321-3-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-12 12:29:43 +02:00
Paolo Bonzini	97ec4d21e0	machine: use QAPI struct for boot configuration As part of converting -boot to a property with a QAPI type, define the struct and use it throughout QEMU to access boot configuration. machine_boot_parse takes care of doing the QemuOpts->QAPI conversion by hand, for now. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20220414165300.555321-2-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-12 12:29:43 +02:00
Daniel P. Berrangé	29a493765e	.gitlab-ci.d: export meson testlog.txt as an artifact When running 'make check' we only get a summary of progress on the console. Fortunately meson/ninja have saved the raw test output to a logfile. Exposing this log will make it easier to debug failures that happen in CI. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20220509124134.867431-3-berrange@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2022-05-12 12:27:32 +02:00
Daniel P. Berrangé	5e781c700a	tests/qemu-iotests: print intent to run a test in TAP mode When running I/O tests using TAP output mode, we get a single TAP test with a sub-test reported for each I/O test that is run. The output looks something like this: 1..123 ok qcow2 011 ok qcow2 012 ok qcow2 013 ok qcow2 217 ... If everything runs or fails normally this is fine, but periodically we have been seeing the test harness abort early before all 123 tests have been run, just leaving a fairly useless message like TAP parsing error: Too few tests run (expected 123, got 107) we have no idea which tests were running at the time the test harness abruptly exited. This change causes us to print a message about our intent to run each test, so we have a record of what is active at the time the harness exits abnormally. 1..123 # running qcow2 011 ok qcow2 011 # running qcow2 012 ok qcow2 012 # running qcow2 013 ok qcow2 013 # running qcow2 217 ok qcow2 217 ... Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20220509124134.867431-2-berrange@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2022-05-12 12:27:20 +02:00
Hanna Reitz	22d92e71c7	iotests/testrunner: Flush after run_test() When stdout is not a terminal, the buffer may not be flushed at each end of line, so we should flush after each test is done. This is especially apparent when run by check-block, in two ways: First, when running make check-block -jX with X > 1, progress indication was missing, even though testrunner.py does theoretically print each test's status once it has been run, even in multi-processing mode. Flushing after each test restores this progress indication. Second, sometimes make check-block failed altogether, with an error message that "too few tests [were] run". I presume that's because one worker process in the job pool did not get to flush its stdout before the main process exited, and so meson did not get to see that worker's test results. In any case, by flushing at the end of run_test(), the problem has disappeared for me. Signed-off-by: Hanna Reitz <hreitz@redhat.com> Message-Id: <20220506134215.10086-1-hreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2022-05-12 12:25:18 +02:00
Kevin Wolf	9ec7a59b5a	coroutine: Revert to constant batch size Commit `4c41c69e` changed the way the coroutine pool is sized because for virtio-blk devices with a large queue size and heavy I/O, it was just too small and caused coroutines to be deleted and reallocated soon afterwards. The change made the size dynamic based on the number of queues and the queue size of virtio-blk devices. There are two important numbers here: Slightly simplified, when a coroutine terminates, it is generally stored in the global release pool up to a certain pool size, and if the pool is full, it is freed. Conversely, when allocating a new coroutine, the coroutines in the release pool are reused if the pool already has reached a certain minimum size (the batch size), otherwise we allocate new coroutines. The problem after commit `4c41c69e` is that it not only increases the maximum pool size (which is the intended effect), but also the batch size for reusing coroutines (which is a bug). It means that in cases with many devices and/or a large queue size (which defaults to the number of vcpus for virtio-blk-pci), many thousand coroutines could be sitting in the release pool without being reused. This is not only a waste of memory and allocations, but it actually makes the QEMU process likely to hit the vm.max_map_count limit on Linux because each coroutine requires two mappings (its stack and the guard page for the stack), causing it to abort() in qemu_alloc_stack() because when the limit is hit, mprotect() starts to fail with ENOMEM. In order to fix the problem, change the batch size back to 64 to avoid uselessly accumulating coroutines in the release pool, but keep the dynamic maximum pool size so that coroutines aren't freed too early in heavy I/O scenarios. Note that this fix doesn't strictly make it impossible to hit the limit, but this would only happen if most of the coroutines are actually in use at the same time, not just sitting in a pool. This is the same behaviour as we already had before commit `4c41c69e`. Fully preventing this would require allowing qemu_coroutine_create() to return an error, but it doesn't seem to be a scenario that people hit in practice. Cc: qemu-stable@nongnu.org Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2079938 Fixes: `4c41c69e05` Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20220510151020.105528-3-kwolf@redhat.com> Tested-by: Hiroki Narukawa <hnarukaw@yahoo-corp.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2022-05-12 12:21:30 +02:00
Kevin Wolf	98e3ab3505	coroutine: Rename qemu_coroutine_inc/dec_pool_size() It's true that these functions currently affect the batch size in which coroutines are reused (i.e. moved from the global release pool to the allocation pool of a specific thread), but this is a bug and will be fixed in a separate patch. In fact, the comment in the header file already just promises that it influences the pool size, so reflect this in the name of the functions. As a nice side effect, the shorter function name makes some line wrapping unnecessary. Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20220510151020.105528-2-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2022-05-12 12:20:45 +02:00
Eric Auger	70be1d93f9	tests/qtest/libqos: Add generic pci host bridge in arm-virt machine Up to now the virt-machine node contains a virtio-mmio node. However no driver produces any PCI interface node. Hence, PCI tests cannot be run with aarch64 binary. Add a GPEX driver node that produces a pci interface node. This latter then can be consumed by all the pci tests. One of the first motivation was to be able to run the virtio-iommu-pci tests. We still face an issue with pci hotplug tests as hotplug cannot happen on the pcie root bus and require a generic root port. This will be addressed later on. We force cpu=max along with aarch64/virt machine as some PCI tests require high MMIO regions to be available. Signed-off-by: Eric Auger <eric.auger@redhat.com> Message-Id: <20220504152025.1785704-4-eric.auger@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-12 12:07:06 +02:00
Eric Auger	02ee7a8a97	tests/qtest/libqos: Skip hotplug tests if pci root bus is not hotpluggable ARM does not not support hotplug on pcie.0. Add a flag on the bus which tells if devices can be hotplugged and skip hotplug tests if the bus cannot be hotplugged. This is a temporary solution to enable the other pci tests on aarch64. Signed-off-by: Eric Auger <eric.auger@redhat.com> Acked-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20220504152025.1785704-3-eric.auger@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-12 12:07:06 +02:00
Eric Auger	3df72d1c55	tests/qtest/libqos/pci: Introduce pio_limit At the moment the IO space limit is hardcoded to QPCI_PIO_LIMIT = 0x10000. When accesses are performed to a bar, the base address of this latter is compared against the limit to decide whether we perform an IO or a memory access. On ARM, we cannot keep this PIO limit as the arm-virt machine uses [0x3eff0000, 0x3f000000 ] for the IO space map and we are mandated to allocate at 0x0. Add a new flag in QPCIBar indicating whether it is an IO bar or a memory bar. This flag is set on QPCIBar allocation and provisionned based on the BAR configuration. Then the new flag is used in access functions and in iomap() function. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20220504152025.1785704-2-eric.auger@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-12 12:07:06 +02:00
Bernhard Beschow	76acef2b73	hw/xen/xen_pt: Resolve igd_passthrough_isa_bridge_create() indirection Now that igd_passthrough_isa_bridge_create() is implemented within the xen context it may use Xen* data types directly and become xen_igd_passthrough_isa_bridge_create(). This resolves an indirection. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Acked-by: Anthony PERARD <anthony.perard@citrix.com> Message-Id: <20220326165825.30794-3-shentey@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-12 12:07:06 +02:00
Bernhard Beschow	4a8027363e	hw/xen/xen_pt: Confine igd-passthrough-isa-bridge to XEN igd-passthrough-isa-bridge is only requested in xen_pt but was implemented in pc_piix.c. This caused xen_pt to dependend on i386/pc which is hereby resolved. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Acked-by: Anthony PERARD <anthony.perard@citrix.com> Message-Id: <20220326165825.30794-2-shentey@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-12 12:07:06 +02:00

1 2 3 4 5 ...

95781 Commits All Branches Search

95781 Commits

All Branches