qemu-e2k

Author	SHA1	Message	Date
Kevin Wolf	5661a00d40	block: Call transaction callbacks with lock held In previous patches, we changed some transactionable functions to be marked as GRAPH_WRLOCK, but required that tran_finalize() is still called without the lock. This was because all callbacks that can be in the same transaction need to follow the same convention. Now that we don't have conflicting requirements any more, we can switch all of the transaction callbacks to be declared GRAPH_WRLOCK, too, and call tran_finalize() with the lock held. Document for each of these transactionable functions that the lock needs to be held when completing the transaction, and make sure that all callers down to the place where the transaction is finalised actually have the writer lock. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230911094620.45040-12-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-09-20 17:46:01 +02:00
Kevin Wolf	7d4ca9d25b	block: Mark bdrv_attach_child_common() GRAPH_WRLOCK Instead of taking the writer lock internally, require callers to already hold it when calling bdrv_attach_child_common(). These callers will typically already hold the graph lock once the locking work is completed, which means that they can't call functions that take it internally. Note that the transaction callbacks still take the lock internally, so tran_finalize() must be called without the lock held. This is because bdrv_append() also calls bdrv_replace_node_noperm(), which currently requires the transaction callbacks to be called unlocked. In the next step, both of them can be switched to locked tran_finalize() calls together. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230911094620.45040-11-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-09-20 17:46:01 +02:00
Kevin Wolf	2f64e1fc57	block: Mark bdrv_replace_child_tran() GRAPH_WRLOCK Instead of taking the writer lock internally, require callers to already hold it when calling bdrv_replace_child_tran(). These callers will typically already hold the graph lock once the locking work is completed, which means that they can't call functions that take it internally. While a graph lock is held, polling is not allowed. Therefore draining the necessary nodes can no longer be done in bdrv_remove_child() and bdrv_replace_node_noperm(), but the callers must already make sure that they are drained. Note that the transaction callbacks still take the lock internally, so tran_finalize() must be called without the lock held. This is because bdrv_append() also calls bdrv_attach_child_noperm(), which currently requires to be called unlocked. Once it changes, the transaction callbacks can be changed, too. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230911094620.45040-10-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-09-20 17:46:01 +02:00
Kevin Wolf	ad29eb3ddc	block: Mark bdrv_replace_child_noperm() GRAPH_WRLOCK Instead of taking the writer lock internally, require callers to already hold it when calling bdrv_replace_child_noperm(). These callers will typically already hold the graph lock once the locking work is completed, which means that they can't call functions that take it internally. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230911094620.45040-9-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-09-20 17:46:01 +02:00
Kevin Wolf	d21843491a	block-coroutine-wrapper: Allow arbitrary parameter names Don't assume specific parameter names like 'bs' or 'blk' in the generated code, but use the actual name. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230911094620.45040-8-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-09-20 17:46:01 +02:00
Kevin Wolf	de90329889	block-coroutine-wrapper: Add no_co_wrapper_bdrv_wrlock functions Add a new wrapper type for GRAPH_WRLOCK functions that should be called from coroutine context. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230911094620.45040-7-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-09-20 17:46:01 +02:00
Kevin Wolf	ac2ae233a0	block: Introduce bdrv_schedule_unref() bdrv_unref() is called by a lot of places that need to hold the graph lock (it naturally happens in the context of operations that change the graph). However, bdrv_unref() takes the graph writer lock internally, so it can't actually be called while already holding a graph lock without causing a deadlock. bdrv_unref() also can't just become GRAPH_WRLOCK because it drains the node before closing it, and draining requires that the graph is unlocked. The solution is to defer deleting the node until we don't hold the lock any more and draining is possible again. Note that keeping images open for longer than necessary can create problems, too: You can't open an image again before it is really closed (if image locking didn't prevent it, it would cause corruption). Reopening an image immediately happens at least during bdrv_open() and bdrv_co_create(). In order to solve this problem, make sure to run the deferred unref in bdrv_graph_wrunlock(), i.e. the first possible place where we can drain again. This is also why bdrv_schedule_unref() is marked GRAPH_WRLOCK. The output of iotest 051 is updated because the additional polling changes the order of HMP output, resulting in a new "(qemu)" prompt in the test output that was previously on a separate line and filtered out. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20230911094620.45040-6-kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-09-20 17:46:01 +02:00
Kevin Wolf	487b91870f	block: Take AioContext lock for bdrv_append() more consistently The documentation for bdrv_append() says that the caller must hold the AioContext lock for bs_top. Change all callers to actually adhere to the contract. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230911094620.45040-5-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-09-20 17:46:00 +02:00
Kevin Wolf	edcce17b15	preallocate: Don't poll during permission updates When the permission related BlockDriver callbacks are called, we are in the middle of an operation traversing the block graph. Polling in such a place is a very bad idea because the graph could change in unexpected ways. In the future, callers will also hold the graph lock, which is likely to turn polling into a deadlock. So we need to get rid of calls to functions like bdrv_getlength() or bdrv_truncate() there as these functions poll internally. They are currently used so that when no parent has write/resize permissions on the image any more, the preallocate filter drops the extra preallocated area in the image file and gives up write/resize permissions itself. In order to achieve this without polling in .bdrv_check_perm, don't immediately truncate the image, but only schedule a BH to do so. The filter keeps the write/resize permissions a bit longer now until the BH has executed. There is one case in which delaying doesn't work: Reopening the image read-only. In this case, bs->file will likely be reopened read-only, too, so keeping write permissions a bit longer on it doesn't work. But we can already cover this case in preallocate_reopen_prepare() and not rely on the permission updates for it. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230911094620.45040-4-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-09-20 17:46:00 +02:00
Kevin Wolf	01e28f6006	preallocate: Factor out preallocate_truncate_to_real_size() It's essentially the same code in preallocate_check_perm() and preallocate_close(), except that the latter ignores errors. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230911094620.45040-3-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-09-20 17:46:00 +02:00
Kevin Wolf	7210448edc	block: Remove unused BlockReopenQueueEntry.perms_checked This field has been unused since commit `72373e40fb` ('block: bdrv_reopen_multiple: refresh permissions on updated graph'). Remove it. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230911094620.45040-2-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-09-20 17:46:00 +02:00
Stefan Hajnoczi	4907644841	Hi, "Host Memory Backends" and "Memory devices" queue ("mem"): - Support and document VM templating with R/O files using a new "rom" parameter for memory-backend-file - Some cleanups and fixes around NVDIMMs and R/O file handling for guest RAM - Optimize ioeventfd updates by skipping address spaces that are not applicable -----BEGIN PGP SIGNATURE----- iQJFBAABCAAvFiEEG9nKrXNcTDpGDfzKTd4Q9wD/g1oFAmUJdykRHGRhdmlkQHJl ZGhhdC5jb20ACgkQTd4Q9wD/g1pf2w//akOUoYMuamySGjXtKLVyMKZkjIys+Ama k2C0xzsWAHBP572ezwHi8uxf5j9kzAjsw6GxDZ7FAamD9MhiohkEvkecloBx6f/c q3fVHblBNkG7v2urtf4+6PJtJvhzOST2SFXfWeYhO/vaA04AYCDgexv82JN3gA6B OS8WyOX62b8wILPSY2GLZ8IqpE9XnOYZwzVBn6YB1yo7ZkYEfXO6cA8nykNuNcOE vppqDo7uVIX6317FWj8ygxmzFfOaj0WT2MT2XFzEIDfg8BInQN8HC4mTn0hcVKMa N1y+eZH733CQKT+uNBRZ5YOeljOi4d6gEEyvkkA/L7e5D3Qg9hIdvHb4uryCFSWX Vt07OP1XLBwCZFobOC6sg+2gtTZJxxYK89e6ZzEd0454S24w5bnEteRAaCGOP0XL ww9xYULqhtZs55UC4rvZHJwdUAk1fIY4VqynwkeQXegvz6BxedNeEkJiiEU0Tizx N2VpsxAJ7H/LLSFeZoCRESo4azrH6U4n7S/eS1tkCniFqibfe2yIQCDoJVfb42ec gfg/vThCrDwHkIHzkMmoV8NndA7Q7SIkyMfYeEEBeZMeg8JzYll4DJEw/jQCacxh KRUa+AZvGlTJUq0mkvyOVfLki+iaehoIUuY1yvMrmdWijPO8n3YybmP9Ljhr8VdR 9MSYZe+I2v8= =iraT -----END PGP SIGNATURE----- Merge tag 'mem-2023-09-19' of https://github.com/davidhildenbrand/qemu into staging Hi, "Host Memory Backends" and "Memory devices" queue ("mem"): - Support and document VM templating with R/O files using a new "rom" parameter for memory-backend-file - Some cleanups and fixes around NVDIMMs and R/O file handling for guest RAM - Optimize ioeventfd updates by skipping address spaces that are not applicable # -----BEGIN PGP SIGNATURE----- # # iQJFBAABCAAvFiEEG9nKrXNcTDpGDfzKTd4Q9wD/g1oFAmUJdykRHGRhdmlkQHJl # ZGhhdC5jb20ACgkQTd4Q9wD/g1pf2w//akOUoYMuamySGjXtKLVyMKZkjIys+Ama # k2C0xzsWAHBP572ezwHi8uxf5j9kzAjsw6GxDZ7FAamD9MhiohkEvkecloBx6f/c # q3fVHblBNkG7v2urtf4+6PJtJvhzOST2SFXfWeYhO/vaA04AYCDgexv82JN3gA6B # OS8WyOX62b8wILPSY2GLZ8IqpE9XnOYZwzVBn6YB1yo7ZkYEfXO6cA8nykNuNcOE # vppqDo7uVIX6317FWj8ygxmzFfOaj0WT2MT2XFzEIDfg8BInQN8HC4mTn0hcVKMa # N1y+eZH733CQKT+uNBRZ5YOeljOi4d6gEEyvkkA/L7e5D3Qg9hIdvHb4uryCFSWX # Vt07OP1XLBwCZFobOC6sg+2gtTZJxxYK89e6ZzEd0454S24w5bnEteRAaCGOP0XL # ww9xYULqhtZs55UC4rvZHJwdUAk1fIY4VqynwkeQXegvz6BxedNeEkJiiEU0Tizx # N2VpsxAJ7H/LLSFeZoCRESo4azrH6U4n7S/eS1tkCniFqibfe2yIQCDoJVfb42ec # gfg/vThCrDwHkIHzkMmoV8NndA7Q7SIkyMfYeEEBeZMeg8JzYll4DJEw/jQCacxh # KRUa+AZvGlTJUq0mkvyOVfLki+iaehoIUuY1yvMrmdWijPO8n3YybmP9Ljhr8VdR # 9MSYZe+I2v8= # =iraT # -----END PGP SIGNATURE----- # gpg: Signature made Tue 19 Sep 2023 06:25:45 EDT # gpg: using RSA key 1BD9CAAD735C4C3A460DFCCA4DDE10F700FF835A # gpg: issuer "david@redhat.com" # gpg: Good signature from "David Hildenbrand <david@redhat.com>" [unknown] # gpg: aka "David Hildenbrand <davidhildenbrand@gmail.com>" [full] # gpg: aka "David Hildenbrand <hildenbr@in.tum.de>" [unknown] # gpg: WARNING: The key's User ID is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: 1BD9 CAAD 735C 4C3A 460D FCCA 4DDE 10F7 00FF 835A * tag 'mem-2023-09-19' of https://github.com/davidhildenbrand/qemu: memory: avoid updating ioeventfds for some address_space machine: Improve error message when using default RAM backend id softmmu/physmem: Hint that "readonly=on,rom=off" exists when opening file R/W for private mapping fails docs: Start documenting VM templating docs: Don't mention "-mem-path" in multi-process.rst softmmu/physmem: Never return directories from file_ram_open() softmmu/physmem: Fail creation of new files in file_ram_open() with readonly=true softmmu/physmem: Bail out early in ram_block_discard_range() with readonly files softmmu/physmem: Remap with proper protection in qemu_ram_remap() backends/hostmem-file: Add "rom" property to support VM templating with R/O files softmmu/physmem: Distinguish between file access mode and mmap protection nvdimm: Reject writing label data to ROM instead of crashing QEMU Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2023-09-19 13:22:19 -04:00
Stefan Hajnoczi	1361bba536	edk2: update to edk2-stable202308 v2: include acpi test data updates -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEoDKM/7k6F6eZAf59TLbY7tPocTgFAmUIUYUACgkQTLbY7tPo cTiPgQ/9Hfn4ooawA2k7i4KB5mAdNMhG1TYmR05hjIPur8S+UBhfHx3Qdv/lojzr 9hRkXsi3CpV8E/t7sA/ZUVbc17ukBrJvL2VbW1nGqPZytiNqmU/2HOZEd88WByyg O1UYg9FZ1JbrqVbFkrE7Y0CHJmrr4EDWRxEGd7ITPDbR4UEuiQUm7+TeHIbQFCll T5vNxkCBP6smY9n/OEMZHX964D7906pBflHSjzpLPV/mXBrlM/rDNtPXA6dcIquh cCOndACPpenM8ngtgbW2gvDkkflXv4gtLozJR8XE8O434HmCviUjcxGW6L7nelcZ +madon48CZ/5AJUvC09R3xuzWHOBuLOn21O3ooprnCBFWAgCtaMEDWwNbgf1Pig3 PgwOd1HeiQTKRuNCFDwNX1GJRN7Cyq6tY+ALQal3glDmWEMiyihUHViSsqux3c01 RAkyyOJAMOZ6+MbZ4HMWNVI9pKRTYY7IDxg3NWSvlCD3KmDuDt8YBuQftZMN+T8X yMSa1wQda7ATlrsjUZL5LsEYO3qkho4ybffiFFDVz8QR/sO0TQg9uw6mggIghLAh GsSUE9SpVZmu+1lZYV/+/KomGeyNlhfchgIVPApMLQS3j0kDgVeNsrsjfbDgCqsn q3Ame+Roul54cv437F02ugt6JoxP76gNXXn8KdZPIDqOHWxMeS0= =Grjx -----END PGP SIGNATURE----- Merge tag 'firmware/edk2-20230918-pull-request' of https://gitlab.com/kraxel/qemu into staging edk2: update to edk2-stable202308 v2: include acpi test data updates # -----BEGIN PGP SIGNATURE----- # # iQIzBAABCgAdFiEEoDKM/7k6F6eZAf59TLbY7tPocTgFAmUIUYUACgkQTLbY7tPo # cTiPgQ/9Hfn4ooawA2k7i4KB5mAdNMhG1TYmR05hjIPur8S+UBhfHx3Qdv/lojzr # 9hRkXsi3CpV8E/t7sA/ZUVbc17ukBrJvL2VbW1nGqPZytiNqmU/2HOZEd88WByyg # O1UYg9FZ1JbrqVbFkrE7Y0CHJmrr4EDWRxEGd7ITPDbR4UEuiQUm7+TeHIbQFCll # T5vNxkCBP6smY9n/OEMZHX964D7906pBflHSjzpLPV/mXBrlM/rDNtPXA6dcIquh # cCOndACPpenM8ngtgbW2gvDkkflXv4gtLozJR8XE8O434HmCviUjcxGW6L7nelcZ # +madon48CZ/5AJUvC09R3xuzWHOBuLOn21O3ooprnCBFWAgCtaMEDWwNbgf1Pig3 # PgwOd1HeiQTKRuNCFDwNX1GJRN7Cyq6tY+ALQal3glDmWEMiyihUHViSsqux3c01 # RAkyyOJAMOZ6+MbZ4HMWNVI9pKRTYY7IDxg3NWSvlCD3KmDuDt8YBuQftZMN+T8X # yMSa1wQda7ATlrsjUZL5LsEYO3qkho4ybffiFFDVz8QR/sO0TQg9uw6mggIghLAh # GsSUE9SpVZmu+1lZYV/+/KomGeyNlhfchgIVPApMLQS3j0kDgVeNsrsjfbDgCqsn # q3Ame+Roul54cv437F02ugt6JoxP76gNXXn8KdZPIDqOHWxMeS0= # =Grjx # -----END PGP SIGNATURE----- # gpg: Signature made Mon 18 Sep 2023 09:32:53 EDT # gpg: using RSA key A0328CFFB93A17A79901FE7D4CB6D8EED3E87138 # gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>" [full] # gpg: aka "Gerd Hoffmann <gerd@kraxel.org>" [full] # gpg: aka "Gerd Hoffmann (private) <kraxel@gmail.com>" [full] # Primary key fingerprint: A032 8CFF B93A 17A7 9901 FE7D 4CB6 D8EE D3E8 7138 * tag 'firmware/edk2-20230918-pull-request' of https://gitlab.com/kraxel/qemu: tests/acpi: disallow virt/SSDT.memhp updates tests/acpi: update virt/SSDT.memhp edk2: update binaries to edk2-stable202308 edk2: update submodule to edk2-stable202308 edk2: workaround edk-stable202308 bug edk2: update build config edk2: update build script tests/acpi: allow virt/SSDT.memhp updates Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2023-09-19 13:22:10 -04:00
Stefan Hajnoczi	6a0eddb34a	ppc patch queue for 2023-09-18: In this short queue we're making two important changes: - Nicholas Piggin is now the qemu-ppc maintainer. Cédric Le Goater and Daniel Barboza will act as backup during Nick's transition to this new role. - Support for NVIDIA V100 GPU with NVLink2 is dropped from qemu-ppc. Linux removed the same support back in 5.13, we're following suit now. A xive Coverity fix is also included. -----BEGIN PGP SIGNATURE----- iIwEABYKADQWIQQX6/+ZI9AYAK8oOBk82cqW3gMxZAUCZQhPnBYcZGFuaWVsaGI0 MTNAZ21haWwuY29tAAoJEDzZypbeAzFk5QUBAJJNnCtv/SPP6bQVNGMgtfI9sz2z MEttDa7SINyLCiVxAP0Y9z8ZHEj6vhztTX0AAv2QubCKWIVbJZbPV5RWrHCEBQ== =y3nh -----END PGP SIGNATURE----- Merge tag 'pull-ppc-20230918' of https://gitlab.com/danielhb/qemu into staging ppc patch queue for 2023-09-18: In this short queue we're making two important changes: - Nicholas Piggin is now the qemu-ppc maintainer. Cédric Le Goater and Daniel Barboza will act as backup during Nick's transition to this new role. - Support for NVIDIA V100 GPU with NVLink2 is dropped from qemu-ppc. Linux removed the same support back in 5.13, we're following suit now. A xive Coverity fix is also included. # -----BEGIN PGP SIGNATURE----- # # iIwEABYKADQWIQQX6/+ZI9AYAK8oOBk82cqW3gMxZAUCZQhPnBYcZGFuaWVsaGI0 # MTNAZ21haWwuY29tAAoJEDzZypbeAzFk5QUBAJJNnCtv/SPP6bQVNGMgtfI9sz2z # MEttDa7SINyLCiVxAP0Y9z8ZHEj6vhztTX0AAv2QubCKWIVbJZbPV5RWrHCEBQ== # =y3nh # -----END PGP SIGNATURE----- # gpg: Signature made Mon 18 Sep 2023 09:24:44 EDT # gpg: using EDDSA key 17EBFF9923D01800AF2838193CD9CA96DE033164 # gpg: issuer "danielhb413@gmail.com" # gpg: Good signature from "Daniel Henrique Barboza <danielhb413@gmail.com>" [unknown] # gpg: WARNING: The key's User ID is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: 17EB FF99 23D0 1800 AF28 3819 3CD9 CA96 DE03 3164 * tag 'pull-ppc-20230918' of https://gitlab.com/danielhb/qemu: spapr: Remove support for NVIDIA V100 GPU with NVLink2 ppc/xive: Fix uint32_t overflow MAINTAINERS: Nick Piggin PPC maintainer, other PPC changes Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2023-09-19 13:22:02 -04:00
Stefan Hajnoczi	dd0c84983d	-----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJlB/SLAAoJEO8Ells5jWIR7EQH/1kAbxHcSGJXDOgQAXJ/rOZi UKn3ugJzD0Hxd4Xz8cvdVLM+9/JoEEOK1uB+NIG7Ask/gA5D7eUYzaLtp1OJ8VNO mamfKmn3EIBWJoLSHH19TKzfW2tGMJHQ0Nj+sbDQRkK5f2c7hwLTRXa1EmlJd4dB VoVzX4OiJtrQyv4OVmpP/PSETXJDvYYX/DNcRl9/3ccKtQW/wVDI3YzrMzXrsgyc w9ItJi8k+19mVH6RgQwciqRvTbVMdzkOxqvU//LY0TxnjsHfbyHr+KlNAa2WTY2N QgpAlMZhHqUG6/XXAs0o2VEtA66zmw932Xfy/CZUEcdGWfkG/9CEVfbuT4CKGY4= =tF7K -----END PGP SIGNATURE----- Merge tag 'net-pull-request' of https://github.com/jasowang/qemu into staging # -----BEGIN PGP SIGNATURE----- # Version: GnuPG v1 # # iQEcBAABAgAGBQJlB/SLAAoJEO8Ells5jWIR7EQH/1kAbxHcSGJXDOgQAXJ/rOZi # UKn3ugJzD0Hxd4Xz8cvdVLM+9/JoEEOK1uB+NIG7Ask/gA5D7eUYzaLtp1OJ8VNO # mamfKmn3EIBWJoLSHH19TKzfW2tGMJHQ0Nj+sbDQRkK5f2c7hwLTRXa1EmlJd4dB # VoVzX4OiJtrQyv4OVmpP/PSETXJDvYYX/DNcRl9/3ccKtQW/wVDI3YzrMzXrsgyc # w9ItJi8k+19mVH6RgQwciqRvTbVMdzkOxqvU//LY0TxnjsHfbyHr+KlNAa2WTY2N # QgpAlMZhHqUG6/XXAs0o2VEtA66zmw932Xfy/CZUEcdGWfkG/9CEVfbuT4CKGY4= # =tF7K # -----END PGP SIGNATURE----- # gpg: Signature made Mon 18 Sep 2023 02:56:11 EDT # gpg: using RSA key EF04965B398D6211 # gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>" [full] # Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211 * tag 'net-pull-request' of https://github.com/jasowang/qemu: net/tap: Avoid variable-length array net/dump: Avoid variable length array hw/net/rocker: Avoid variable length array hw/net/fsl_etsec/rings.c: Avoid variable length array net: add initial support for AF_XDP network backend tests: bump libvirt-ci for libasan and libxdp e1000e: rename e1000e_ba_state and e1000e_write_hdr_to_rx_buffers igb: packet-split descriptors support igb: add IPv6 extended headers traffic detection igb: RX payload guest writting refactoring igb: RX descriptors guest writting refactoring igb: rename E1000E_RingInfo_st igb: remove TCP ACK detection virtio-net: Add support for USO features virtio-net: Add USO flags to vhost support. tap: Add check for USO features tap: Add USO support to tap device. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2023-09-19 13:21:49 -04:00
Stefan Hajnoczi	d7754940d7	: Delete checks for old host definitions tcg/loongarch64: Generate LSX instructions fpu: Add conversions between bfloat16 and [u]int8 fpu: Handle m68k extended precision denormals properly accel/tcg: Improve cputlb i/o organization accel/tcg: Simplify tlb_plugin_lookup accel/tcg: Remove false-negative halted assertion tcg: Add gvec compare with immediate and scalar operand tcg/aarch64: Emit BTI insns at jump landing pads -----BEGIN PGP SIGNATURE----- iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmUF4VIdHHJpY2hhcmQu aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV8wOwf+I9qNus2kV3yQxpuU 2hqYuLXvH96l9vbqaoyx7hyyJTtrqytLGCMPmQKUdtBGtO6z7PnLNDiooGcbO+gw 2gdfw3Q//JZUTdx+ZSujUksV0F96Tqu0zi4TdJUPNIwhCrh0K8VjiftfPfbynRtz KhQ1lNeO/QzcAgzKiun2NyqdPiYDmNuEIS/jYedQwQweRp/xQJ4/x8DmhGf/OiD4 rGAcdslN+RenqgFACcJ2A1vxUGMeQv5g/Cn82FgTk0cmgcfAODMnC+WnOm8ruQdT snluvnh/2/r8jIhx3frKDKGtaKHCPhoCS7GNK48qejxaybvv3CJQ4qsjRIBKVrVM cIrsSw== =cTgD -----END PGP SIGNATURE----- Merge tag 'pull-tcg-20230915-2' of https://gitlab.com/rth7680/qemu into staging : Delete checks for old host definitions tcg/loongarch64: Generate LSX instructions fpu: Add conversions between bfloat16 and [u]int8 fpu: Handle m68k extended precision denormals properly accel/tcg: Improve cputlb i/o organization accel/tcg: Simplify tlb_plugin_lookup accel/tcg: Remove false-negative halted assertion tcg: Add gvec compare with immediate and scalar operand tcg/aarch64: Emit BTI insns at jump landing pads [Resolved conflict between CPUINFO_PMULL and CPUINFO_BTI. --Stefan] * tag 'pull-tcg-20230915-2' of https://gitlab.com/rth7680/qemu: (39 commits) tcg: Map code_gen_buffer with PROT_BTI tcg/aarch64: Emit BTI insns at jump landing pads util/cpuinfo-aarch64: Add CPUINFO_BTI tcg: Add tcg_out_tb_start backend hook fpu: Handle m68k extended precision denormals properly fpu: Add conversions between bfloat16 and [u]int8 accel/tcg: Introduce do_st16_mmio_leN accel/tcg: Introduce do_ld16_mmio_beN accel/tcg: Merge io_writex into do_st_mmio_leN accel/tcg: Merge io_readx into do_ld_mmio_beN accel/tcg: Replace direct use of io_readx/io_writex in do_{ld,st}_1 accel/tcg: Merge cpu_transaction_failed into io_failed plugin: Simplify struct qemu_plugin_hwaddr accel/tcg: Use CPUTLBEntryFull.phys_addr in io_failed accel/tcg: Split out io_prepare and io_failed accel/tcg: Simplify tlb_plugin_lookup target/arm: Use tcg_gen_gvec_cmpi for compare vs 0 tcg: Add gvec compare with immediate and scalar operand tcg/loongarch64: Implement 128-bit load & store tcg/loongarch64: Lower rotli_vec to vrotri ... Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2023-09-19 13:20:54 -04:00
hongmianquan	544cff46c0	memory: avoid updating ioeventfds for some address_space When updating ioeventfds, we need to iterate all address spaces, but some address spaces do not register eventfd_add\|del call when memory_listener_register() and they do nothing when updating ioeventfds. So we can skip these AS in address_space_update_ioeventfds(). The overhead of memory_region_transaction_commit() can be significantly reduced. For example, a VM with 8 vhost net devices and each one has 64 vectors, can reduce the time spent on memory_region_transaction_commit by 20%. Message-ID: <20230830032906.12488-1-hongmianquan@bytedance.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: hongmianquan <hongmianquan@bytedance.com> Signed-off-by: David Hildenbrand <david@redhat.com>	2023-09-19 10:44:44 +02:00
David Hildenbrand	41ddcd2308	machine: Improve error message when using default RAM backend id For migration purposes, users might want to reuse the default RAM backend id, but specify a different memory backend. For example, to reuse "pc.ram" on q35, one has to set -machine q35,memory-backend=pc.ram Only then, can a memory backend with the id "pc.ram" be created manually. Let's improve the error message by improving the hint. Use error_append_hint() -- which in turn requires ERRP_GUARD(). Message-ID: <20230906120503.359863-12-david@redhat.com> Suggested-by: ThinerLogoer <logoerthiner1@163.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Mario Casquero <mcasquer@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>	2023-09-19 10:44:36 +02:00
David Hildenbrand	6da4b1c25d	softmmu/physmem: Hint that "readonly=on,rom=off" exists when opening file R/W for private mapping fails It's easy to miss that memory-backend-file with "share=off" (default) will always try opening the file R/W as default, and fail if we don't have write permissions to the file. In that case, the user has to explicit specify "readonly=on,rom=off" to get usable RAM, for example, for VM templating. Let's hint that '-object memory-backend-file,readonly=on,rom=off,...' exists to consume R/O files in a private mapping to create writable RAM, but only if we have permissions to open the file read-only. Message-ID: <20230906120503.359863-11-david@redhat.com> Suggested-by: ThinerLogoer <logoerthiner1@163.com> Signed-off-by: David Hildenbrand <david@redhat.com>	2023-09-19 10:23:21 +02:00
David Hildenbrand	9cd9313fc3	docs: Start documenting VM templating Let's add some details about VM templating, focusing on the VM memory configuration only. There is much more to VM templating (VM state? block devices?), but I leave that as future work. Message-ID: <20230906120503.359863-10-david@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: David Hildenbrand <david@redhat.com>	2023-09-19 10:23:21 +02:00
David Hildenbrand	9e6180d22c	docs: Don't mention "-mem-path" in multi-process.rst "-mem-path" corresponds to "memory-backend-file,share=off" and, therefore, creates a private COW mapping of the file. For multi-proces QEMU, we need proper shared file-backed memory. Let's make that clearer. Message-ID: <20230906120503.359863-9-david@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>	2023-09-19 10:23:21 +02:00
David Hildenbrand	ca01f1b89b	softmmu/physmem: Never return directories from file_ram_open() open() does not fail on directories when opening them readonly (O_RDONLY). Currently, we succeed opening such directories and fail later during mmap(), resulting in a misleading error message. $ ./qemu-system-x86_64 \ -object memory-backend-file,id=ram0,mem-path=tmp,readonly=true,size=1g qemu-system-x86_64: unable to map backing store for guest RAM: No such device To identify directories and handle them accordingly in file_ram_open() also when readonly=true was specified, detect if we just opened a directory using fstat() instead. Then, fail file_ram_open() right away, similarly to how we now fail if the file does not exist and we want to open the file readonly. With this change, we get a nicer error message: qemu-system-x86_64: can't open backing store tmp for guest RAM: Is a directory Note that the only memory-backend-file will end up calling memory_region_init_ram_from_file() -> qemu_ram_alloc_from_file() -> file_ram_open(). Message-ID: <20230906120503.359863-8-david@redhat.com> Reported-by: Thiner Logoer <logoerthiner1@163.com> Reviewed-by: Peter Xu <peterx@redhat.com> Tested-by: Mario Casquero <mcasquer@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>	2023-09-19 10:23:21 +02:00
David Hildenbrand	4d6b23f7e2	softmmu/physmem: Fail creation of new files in file_ram_open() with readonly=true Currently, if a file does not exist yet, file_ram_open() will create new empty file and open it writable. However, it even does that when readonly=true was specified. Specifying O_RDONLY instead to create a new readonly file would theoretically work, however, ftruncate() will refuse to resize the new empty file and we'll get a warning: ftruncate: Invalid argument And later eventually more problems when actually mmap'ing that file and accessing it. If someone intends to let QEMU open+mmap a file read-only, better create+resize+fill that file ahead of time outside of QEMU context. We'll now fail with: ./qemu-system-x86_64 \ -object memory-backend-file,id=ram0,mem-path=tmp,readonly=true,size=1g qemu-system-x86_64: can't open backing store tmp for guest RAM: No such file or directory All use cases of readonly files (R/O NVDIMMs, VM templating) work on existing files, so silently creating new files might just hide user errors when accidentally specifying a non-existent file. Note that the only memory-backend-file will end up calling memory_region_init_ram_from_file() -> qemu_ram_alloc_from_file() -> file_ram_open(). Move error reporting to the single caller. Message-ID: <20230906120503.359863-7-david@redhat.com> Acked-by: Peter Xu <peterx@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>	2023-09-19 10:23:21 +02:00
David Hildenbrand	b2cccb52bd	softmmu/physmem: Bail out early in ram_block_discard_range() with readonly files fallocate() will fail, let's print a nicer error message. Message-ID: <20230906120503.359863-6-david@redhat.com> Suggested-by: Peter Xu <peterx@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>	2023-09-19 10:23:21 +02:00
David Hildenbrand	9e6b9f3791	softmmu/physmem: Remap with proper protection in qemu_ram_remap() Let's remap with the proper protection that we can derive from RAM_READONLY. Message-ID: <20230906120503.359863-5-david@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: David Hildenbrand <david@redhat.com>	2023-09-19 10:23:21 +02:00
David Hildenbrand	e92666b0ba	backends/hostmem-file: Add "rom" property to support VM templating with R/O files For now, "share=off,readonly=on" would always result in us opening the file R/O and mmap'ing the opened file MAP_PRIVATE R/O -- effectively turning it into ROM. Especially for VM templating, "share=off" is a common use case. However, that use case is impossible with files that lack write permissions, because "share=off,readonly=on" will not give us writable RAM. The sole user of ROM via memory-backend-file are R/O NVDIMMs, but as we have users (Kata Containers) that rely on the existing behavior -- malicious VMs should not be able to consume COW memory for R/O NVDIMMs -- we cannot change the semantics of "share=off,readonly=on" So let's add a new "rom" property with on/off/auto values. "auto" is the default and what most people will use: for historical reasons, to not change the old semantics, it defaults to the value of the "readonly" property. For VM templating, one can now use: -object memory-backend-file,share=off,readonly=on,rom=off,... But we'll disallow: -object memory-backend-file,share=on,readonly=on,rom=off,... because we would otherwise get an error when trying to mmap the R/O file shared and writable. An explicit error message is cleaner. We will also disallow for now: -object memory-backend-file,share=off,readonly=off,rom=on,... -object memory-backend-file,share=on,readonly=off,rom=on,... It's not harmful, but also not really required for now. Alternatives that were abandoned: * Make "unarmed=on" for the NVDIMM set the memory region container readonly. We would still see a change of ROM->RAM and possibly run into memslot limits with vhost-user. Further, there might be use cases for "unarmed=on" that should still allow writing to that memory (temporary files, system RAM, ...). * Add a new "readonly=on/off/auto" parameter for NVDIMMs. Similar issues as with "unarmed=on". * Make "readonly" consume "on/off/file" instead of being a 'bool' type. This would slightly changes the behavior of the "readonly" parameter: values like true/false (as accepted by a 'bool'type) would no longer be accepted. Message-ID: <20230906120503.359863-4-david@redhat.com> Acked-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>	2023-09-19 10:23:21 +02:00
David Hildenbrand	5c52a219bb	softmmu/physmem: Distinguish between file access mode and mmap protection There is a difference between how we open a file and how we mmap it, and we want to support writable private mappings of readonly files. Let's define RAM_READONLY and RAM_READONLY_FD flags, to replace the single "readonly" parameter for file-related functions. In memory_region_init_ram_from_fd() and memory_region_init_ram_from_file(), initialize mr->readonly based on the new RAM_READONLY flag. While at it, add some RAM_* flags we missed to add to the list of accepted flags in the documentation of some functions. No change in functionality intended. We'll make use of both flags next and start setting them independently for memory-backend-file. Message-ID: <20230906120503.359863-3-david@redhat.com> Acked-by: Peter Xu <peterx@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>	2023-09-19 10:23:21 +02:00
David Hildenbrand	3a1258399b	nvdimm: Reject writing label data to ROM instead of crashing QEMU Currently, when using a true R/O NVDIMM (ROM memory backend) with a label area, the VM can easily crash QEMU by trying to write to the label area, because the ROM memory is mmap'ed without PROT_WRITE. [root@vm-0 ~]# ndctl disable-region region0 disabled 1 region [root@vm-0 ~]# ndctl zero-labels nmem0 -> QEMU segfaults Let's remember whether we have a ROM memory backend and properly reject the write request: [root@vm-0 ~]# ndctl disable-region region0 disabled 1 region [root@vm-0 ~]# ndctl zero-labels nmem0 zeroed 0 nmem In comparison, on a system with a R/W NVDIMM: [root@vm-0 ~]# ndctl disable-region region0 disabled 1 region [root@vm-0 ~]# ndctl zero-labels nmem0 zeroed 1 nmem For ACPI, just return "unsupported", like if no label exists. For spapr, return "H_P2", similar to when no label area exists. Could we rely on the "unarmed" property? Maybe, but it looks cleaner to only disallow what certainly cannot work. After all "unarmed=on" primarily means: cannot accept persistent writes. In theory, there might be setups where devices with "unarmed=on" set could be used to host non-persistent data (temporary files, system RAM, ...); for example, in Linux, admins can overwrite the "readonly" setting and still write to the device -- which will work as long as we're not using ROM. Allowing writing label data in such configurations can make sense. Message-ID: <20230906120503.359863-2-david@redhat.com> Fixes: `dbd730e859` ("nvdimm: check -object memory-backend-file, readonly=on option") Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>	2023-09-19 10:23:21 +02:00
Stefan Hajnoczi	13d6b16081	Unify implementation of carry-less multiply. Accelerate carry-less multiply for 64x64->128. -----BEGIN PGP SIGNATURE----- iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmUEiPodHHJpY2hhcmQu aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV/akgf/XkiIeErWJr1YXSbS YPQtCsDAfIrqn3RiyQ2uwSn2eeuwVqTFFPGER04YegRDK8dyO874JBfvOwmBT70J I/aU8Z4BbRyNu9nfaCtFMlXQH9KArAKcAds1PnshfcnI5T2yBloZ1sAU97IuJFZk Uuz96H60+ohc4wzaUiPqPhXQStgZeSYwwAJB0s25DhCckdea0udRCAJ1tQTVpxkM wIFef1SHPoM6DtMzFKHLLUH6VivSlHjqx8GqFusa7pVqfQyDzNBfwvDl1F/bkE07 yTocQEkV3QnZvIplhqUxAaZXIFZr9BNk7bDimMjHW6z3pNPN3T8zRn4trNjxbgPV jqzAtg== =8nnk -----END PGP SIGNATURE----- Merge tag 'pull-crypto-20230915' of https://gitlab.com/rth7680/qemu into staging Unify implementation of carry-less multiply. Accelerate carry-less multiply for 64x64->128. # -----BEGIN PGP SIGNATURE----- # # iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmUEiPodHHJpY2hhcmQu # aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV/akgf/XkiIeErWJr1YXSbS # YPQtCsDAfIrqn3RiyQ2uwSn2eeuwVqTFFPGER04YegRDK8dyO874JBfvOwmBT70J # I/aU8Z4BbRyNu9nfaCtFMlXQH9KArAKcAds1PnshfcnI5T2yBloZ1sAU97IuJFZk # Uuz96H60+ohc4wzaUiPqPhXQStgZeSYwwAJB0s25DhCckdea0udRCAJ1tQTVpxkM # wIFef1SHPoM6DtMzFKHLLUH6VivSlHjqx8GqFusa7pVqfQyDzNBfwvDl1F/bkE07 # yTocQEkV3QnZvIplhqUxAaZXIFZr9BNk7bDimMjHW6z3pNPN3T8zRn4trNjxbgPV # jqzAtg== # =8nnk # -----END PGP SIGNATURE----- # gpg: Signature made Fri 15 Sep 2023 12:40:26 EDT # gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F # gpg: issuer "richard.henderson@linaro.org" # gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [full] # Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A 05C0 64DF 38E8 AF7E 215F * tag 'pull-crypto-20230915' of https://gitlab.com/rth7680/qemu: host/include/aarch64: Implement clmul.h host/include/i386: Implement clmul.h target/ppc: Use clmul_64 target/s390x: Use clmul_64 target/i386: Use clmul_64 target/arm: Use clmul_64 crypto: Add generic 64-bit carry-less multiply routine target/ppc: Use clmul_32* routines target/s390x: Use clmul_32* routines target/arm: Use clmul_32* routines crypto: Add generic 32-bit carry-less multiply routines target/ppc: Use clmul_16* routines target/s390x: Use clmul_16* routines target/arm: Use clmul_16* routines crypto: Add generic 16-bit carry-less multiply routines target/ppc: Use clmul_8* routines target/s390x: Use clmul_8* routines target/arm: Use clmul_8* routines crypto: Add generic 8-bit carry-less multiply routines Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2023-09-18 11:04:21 -04:00
Gerd Hoffmann	0ec0767e59	tests/acpi: disallow virt/SSDT.memhp updates Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>	2023-09-18 15:27:27 +02:00
Gerd Hoffmann	5f88dd43d0	tests/acpi: update virt/SSDT.memhp The edk2 update caused an address change: DefinitionBlock ("", "SSDT", 1, "BOCHS ", "NVDIMM", 0x00000001) { Scope (\_SB) { Device (NVDR) { Name (_HID, "ACPI0012" /* NVDIMM Root Device */) // _HID: Hardware ID [ ... ] } } - Name (MEMA, 0x43D10000) + Name (MEMA, 0x43C90000) } Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>	2023-09-18 15:27:27 +02:00
Gerd Hoffmann	91e0127087	edk2: update binaries to edk2-stable202308 Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>	2023-09-18 15:27:27 +02:00
Gerd Hoffmann	241f99399f	edk2: update submodule to edk2-stable202308 New stable release was tagged in August 2023, update the edk2 submodule to it. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>	2023-09-18 15:27:27 +02:00
Gerd Hoffmann	3bb5051009	edk2: workaround edk-stable202308 bug Set PCD to workaround two fixes missing the release. `8b66f9df1b` `020cc9e2e7` Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>	2023-09-18 15:27:27 +02:00
Gerd Hoffmann	b0494f131e	edk2: update build config risc-v switched to use split code/vars images like the other archs. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>	2023-09-18 15:27:27 +02:00
Gerd Hoffmann	c28a2891f3	edk2: update build script Sync with latest version from gitlab.com/kraxel/edk2-build-config Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>	2023-09-18 15:27:27 +02:00
Gerd Hoffmann	3808a058fc	tests/acpi: allow virt/SSDT.memhp updates Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>	2023-09-18 15:27:27 +02:00
Cédric Le Goater	44fa20c928	spapr: Remove support for NVIDIA V100 GPU with NVLink2 NVLink2 support was removed from the PPC PowerNV platform and VFIO in Linux 5.13 with commits : 562d1e207d32 ("powerpc/powernv: remove the nvlink support") b392a1989170 ("vfio/pci: remove vfio_pci_nvlink2") This was 2.5 years ago. Do the same in QEMU with a revert of commit `ec132efaa8` ("spapr: Support NVIDIA V100 GPU with NVLink2"). Some adjustements are required on the NUMA part. Cc: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Acked-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com> Message-ID: <20230918091717.149950-1-clg@kaod.org> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>	2023-09-18 07:25:28 -03:00
Cédric Le Goater	527b238329	ppc/xive: Fix uint32_t overflow As reported by Coverity, "idx << xive->pc_shift" is evaluated using 32-bit arithmetic, and then used in a context expecting a "uint64_t". Add a uint64_t cast. Fixes: Coverity CID 1519049 Fixes: `b68147b7a5` ("ppc/xive: Add support for the PC MMIOs") Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com> Message-ID: <20230914154650.222111-1-clg@kaod.org> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>	2023-09-18 07:25:24 -03:00
Daniel Henrique Barboza	0cbc34dc8e	MAINTAINERS: Nick Piggin PPC maintainer, other PPC changes Update all relevant PowerPC entries as follows: - Nick Piggin is promoted to Maintainer in all qemu-ppc subsystems. Nick has been a solid contributor for the last couple of years and has the required knowledge and motivation to drive the boat. - Greg Kurz is being removed from all qemu-ppc entries. Greg has moved to other areas of interest and will retire from qemu-ppc. Thanks Mr Kurz for all the years of service. - David Gibson was removed as 'Reviewer' from PowerPC TCG CPUs and PPC KVM CPUs. Change done per his request. - Daniel Barboza downgraded from 'Maintainer' to 'Reviewer' in sPAPR and PPC KVM CPUs. It has been a long since I last touched those areas and it's not justified to be kept as maintainer in them. - Cedric Le Goater and Daniel Barboza removed as 'Reviewer' in VOF. We don't have the required knowledge to justify it. - VOF support downgraded from 'Maintained' to 'Odd Fixes' since it better reflects the current state of the subsystem. Acked-by: Cédric Le Goater <clg@kaod.org> Acked-by: Greg Kurz <groug@kaod.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Acked-by: Nicholas Piggin <npiggin@gmail.com> Message-ID: <20230915110507.194762-1-danielhb413@gmail.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>	2023-09-18 07:25:01 -03:00
Peter Maydell	6d7a53e9f1	net/tap: Avoid variable-length array Use a heap allocation instead of a variable length array in tap_receive_iov(). The codebase has very few VLAs, and if we can get rid of them all we can make the compiler error on new additions. This is a defensive measure against security bugs where an on-stack dynamic allocation isn't correctly size-checked (e.g. CVE-2021-3527). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2023-09-18 14:36:13 +08:00
Peter Maydell	c4cf68198e	net/dump: Avoid variable length array Use a g_autofree heap allocation instead of a variable length array in dump_receive_iov(). The codebase has very few VLAs, and if we can get rid of them all we can make the compiler error on new additions. This is a defensive measure against security bugs where an on-stack dynamic allocation isn't correctly size-checked (e.g. CVE-2021-3527). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2023-09-18 14:36:13 +08:00
Peter Maydell	1257065783	hw/net/rocker: Avoid variable length array Replace an on-stack variable length array in of_dpa_ig() with a g_autofree heap allocation. The codebase has very few VLAs, and if we can get rid of them all we can make the compiler error on new additions. This is a defensive measure against security bugs where an on-stack dynamic allocation isn't correctly size-checked (e.g. CVE-2021-3527). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2023-09-18 14:36:13 +08:00
Peter Maydell	2a6cb383e2	hw/net/fsl_etsec/rings.c: Avoid variable length array In fill_rx_bd() we create a variable length array of size etsec->rx_padding. In fact we know that this will never be larger than 64 bytes, because rx_padding is set in rx_init_frame() in a way that ensures it is only that large. Use a fixed sized array and assert that it is big enough. Since padd[] is now potentially rather larger than the actual padding required, adjust the memset() we do on it to match the size that we write with cpu_physical_memory_write(), rather than clearing the entire array. The codebase has very few VLAs, and if we can get rid of them all we can make the compiler error on new additions. This is a defensive measure against security bugs where an on-stack dynamic allocation isn't correctly size-checked (e.g. CVE-2021-3527). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Jason Wang <jasowang@redhat.com>	2023-09-18 14:36:13 +08:00
Ilya Maximets	cb039ef3d9	net: add initial support for AF_XDP network backend AF_XDP is a network socket family that allows communication directly with the network device driver in the kernel, bypassing most or all of the kernel networking stack. In the essence, the technology is pretty similar to netmap. But, unlike netmap, AF_XDP is Linux-native and works with any network interfaces without driver modifications. Unlike vhost-based backends (kernel, user, vdpa), AF_XDP doesn't require access to character devices or unix sockets. Only access to the network interface itself is necessary. This patch implements a network backend that communicates with the kernel by creating an AF_XDP socket. A chunk of userspace memory is shared between QEMU and the host kernel. 4 ring buffers (Tx, Rx, Fill and Completion) are placed in that memory along with a pool of memory buffers for the packet data. Data transmission is done by allocating one of the buffers, copying packet data into it and placing the pointer into Tx ring. After transmission, device will return the buffer via Completion ring. On Rx, device will take a buffer form a pre-populated Fill ring, write the packet data into it and place the buffer into Rx ring. AF_XDP network backend takes on the communication with the host kernel and the network interface and forwards packets to/from the peer device in QEMU. Usage example: -device virtio-net-pci,netdev=guest1,mac=00:16:35:AF:AA:5C -netdev af-xdp,ifname=ens6f1np1,id=guest1,mode=native,queues=1 XDP program bridges the socket with a network interface. It can be attached to the interface in 2 different modes: 1. skb - this mode should work for any interface and doesn't require driver support. With a caveat of lower performance. 2. native - this does require support from the driver and allows to bypass skb allocation in the kernel and potentially use zero-copy while getting packets in/out userspace. By default, QEMU will try to use native mode and fall back to skb. Mode can be forced via 'mode' option. To force 'copy' even in native mode, use 'force-copy=on' option. This might be useful if there is some issue with the driver. Option 'queues=N' allows to specify how many device queues should be open. Note that all the queues that are not open are still functional and can receive traffic, but it will not be delivered to QEMU. So, the number of device queues should generally match the QEMU configuration, unless the device is shared with something else and the traffic re-direction to appropriate queues is correctly configured on a device level (e.g. with ethtool -N). 'start-queue=M' option can be used to specify from which queue id QEMU should start configuring 'N' queues. It might also be necessary to use this option with certain NICs, e.g. MLX5 NICs. See the docs for examples. In a general case QEMU will need CAP_NET_ADMIN and CAP_SYS_ADMIN or CAP_BPF capabilities in order to load default XSK/XDP programs to the network interface and configure BPF maps. It is possible, however, to run with no capabilities. For that to work, an external process with enough capabilities will need to pre-load default XSK program, create AF_XDP sockets and pass their file descriptors to QEMU process on startup via 'sock-fds' option. Network backend will need to be configured with 'inhibit=on' to avoid loading of the program. QEMU will need 32 MB of locked memory (RLIMIT_MEMLOCK) per queue or CAP_IPC_LOCK. There are few performance challenges with the current network backends. First is that they do not support IO threads. This means that data path is handled by the main thread in QEMU and may slow down other work or may be slowed down by some other work. This also means that taking advantage of multi-queue is generally not possible today. Another thing is that data path is going through the device emulation code, which is not really optimized for performance. The fastest "frontend" device is virtio-net. But it's not optimized for heavy traffic either, because it expects such use-cases to be handled via some implementation of vhost (user, kernel, vdpa). In practice, we have virtio notifications and rcu lock/unlock on a per-packet basis and not very efficient accesses to the guest memory. Communication channels between backend and frontend devices do not allow passing more than one packet at a time as well. Some of these challenges can be avoided in the future by adding better batching into device emulation or by implementing vhost-af-xdp variant. There are also a few kernel limitations. AF_XDP sockets do not support any kinds of checksum or segmentation offloading. Buffers are limited to a page size (4K), i.e. MTU is limited. Multi-buffer support implementation for AF_XDP is in progress, but not ready yet. Also, transmission in all non-zero-copy modes is synchronous, i.e. done in a syscall. That doesn't allow high packet rates on virtual interfaces. However, keeping in mind all of these challenges, current implementation of the AF_XDP backend shows a decent performance while running on top of a physical NIC with zero-copy support. Test setup: 2 VMs running on 2 physical hosts connected via ConnectX6-Dx card. Network backend is configured to open the NIC directly in native mode. The driver supports zero-copy. NIC is configured to use 1 queue. Inside a VM - iperf3 for basic TCP performance testing and dpdk-testpmd for PPS testing. iperf3 result: TCP stream : 19.1 Gbps dpdk-testpmd (single queue, single CPU core, 64 B packets) results: Tx only : 3.4 Mpps Rx only : 2.0 Mpps L2 FWD Loopback : 1.5 Mpps In skb mode the same setup shows much lower performance, similar to the setup where pair of physical NICs is replaced with veth pair: iperf3 result: TCP stream : 9 Gbps dpdk-testpmd (single queue, single CPU core, 64 B packets) results: Tx only : 1.2 Mpps Rx only : 1.0 Mpps L2 FWD Loopback : 0.7 Mpps Results in skb mode or over the veth are close to results of a tap backend with vhost=on and disabled segmentation offloading bridged with a NIC. Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> (docker/lcitool) Signed-off-by: Jason Wang <jasowang@redhat.com>	2023-09-18 14:36:13 +08:00
Ilya Maximets	a6f376e9ba	tests: bump libvirt-ci for libasan and libxdp This pulls in the fixes for libasan version as well as support for libxdp that will be used for af-xdp netdev in the next commits. Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2023-09-18 14:36:13 +08:00
Tomasz Dzieciol	e710f9c470	e1000e: rename e1000e_ba_state and e1000e_write_hdr_to_rx_buffers Rename e1000e_ba_state according and e1000e_write_hdr_to_rx_buffers for consistency with IGB. Signed-off-by: Tomasz Dzieciol <t.dzieciol@partner.samsung.com> Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com> Tested-by: Akihiko Odaki <akihiko.odaki@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2023-09-18 14:36:13 +08:00
Tomasz Dzieciol	560cf339b2	igb: packet-split descriptors support Packet-split descriptors are used by Linux VF driver for MTU values from 2048 Signed-off-by: Tomasz Dzieciol <t.dzieciol@partner.samsung.com> Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com> Tested-by: Akihiko Odaki <akihiko.odaki@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2023-09-18 14:36:13 +08:00
Tomasz Dzieciol	1c4e67a5be	igb: add IPv6 extended headers traffic detection Signed-off-by: Tomasz Dzieciol <t.dzieciol@partner.samsung.com> Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com> Tested-by: Akihiko Odaki <akihiko.odaki@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2023-09-18 14:36:13 +08:00
Tomasz Dzieciol	17ccd01647	igb: RX payload guest writting refactoring Refactoring is done in preparation for support of multiple advanced descriptors RX modes, especially packet-split modes. Signed-off-by: Tomasz Dzieciol <t.dzieciol@partner.samsung.com> Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com> Tested-by: Akihiko Odaki <akihiko.odaki@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2023-09-18 14:36:13 +08:00

1 2 3 4 5 ...

106749 Commits